Commit Graph

78 Commits

Author SHA1 Message Date
Jeff Moyer b218c43f0f io_uring/rsrc: fix incorrect assignment of iter->nr_segs in io_import_fixed
JIRA: https://issues.redhat.com/browse/RHEL-64867

commit a23800f08a60787dfbf2b87b2e6ed411cb629859
Author: Chenliang Li <cliang01.li@samsung.com>
Date:   Wed Jun 19 14:38:19 2024 +0800

    io_uring/rsrc: fix incorrect assignment of iter->nr_segs in io_import_fixed
    
    In io_import_fixed when advancing the iter within the first bvec, the
    iter->nr_segs is set to bvec->bv_len. nr_segs should be the number of
    bvecs, plus we don't need to adjust it here, so just remove it.
    
    Fixes: b000ae0ec2d7 ("io_uring/rsrc: optimise single entry advance")
    Signed-off-by: Chenliang Li <cliang01.li@samsung.com>
    Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/20240619063819.2445-1-cliang01.li@samsung.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-12-02 11:12:48 -05:00
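
For illustration, the nr_segs fix above can be modelled in userspace C. This is a minimal sketch under stated assumptions, not the kernel code: `struct seg`, `struct seg_iter`, and `advance_in_first_seg()` are hypothetical stand-ins for `struct bio_vec`, `struct iov_iter`, and the "offset falls inside the first bvec" branch of io_import_fixed().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct bio_vec. */
struct seg {
    size_t len;              /* bytes in this segment */
};

/* Hypothetical, simplified stand-in for the kernel's struct iov_iter. */
struct seg_iter {
    const struct seg *segs;  /* first segment still in play */
    size_t nr_segs;          /* number of segments, NOT a byte count */
    size_t offset;           /* byte offset into segs[0] */
    size_t count;            /* total bytes remaining */
};

/* Advance by `off` bytes when `off` lands inside the first segment. */
static void advance_in_first_seg(struct seg_iter *it, size_t off)
{
    assert(off < it->segs[0].len);

    it->offset = off;
    it->count -= off;
    /*
     * nr_segs is deliberately left untouched: no segment was consumed.
     * The bug described above amounted to storing the first segment's
     * byte length here, which mixes up bytes and segment counts.
     */
}
```

With three 4096-byte segments, advancing by 100 bytes should leave `nr_segs` at 3 while `offset` becomes 100 and `count` shrinks by 100.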
Jeff Moyer d45024afd9 io_uring: move mapping/allocation helpers to a separate file
JIRA: https://issues.redhat.com/browse/RHEL-64867
Conflicts: RHEL does not have commit 5e0a760b4441 ("mm, treewide:
rename MAX_ORDER to MAX_PAGE_ORDER").

commit f15ed8b4d0ce2c0831232ff85117418740f0c529
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed Mar 27 14:59:09 2024 -0600

    io_uring: move mapping/allocation helpers to a separate file
    
    Move the related code from io_uring.c into memmap.c. No functional
    changes in this patch, just cleaning it up a bit now that the full
    transition is done.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-11-28 17:09:44 -05:00
Jeff Moyer cb57a5f470 io_uring: unify io_pin_pages()
JIRA: https://issues.redhat.com/browse/RHEL-64867

commit 1943f96b3816e0f0d3d6686374d6e1d617c8b42c
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed Mar 13 14:58:14 2024 -0600

    io_uring: unify io_pin_pages()
    
    Move it into io_uring.c where it belongs, and use it in there as well
    rather than have two implementations of this.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-11-28 17:05:44 -05:00
Jeff Moyer 4368261722 io_uring/rsrc: cleanup io_pin_pages()
JIRA: https://issues.redhat.com/browse/RHEL-64867

commit 922a2c78f13611e2c08fc48f615c0cd367dcb6da
Author: Jens Axboe <axboe@kernel.dk>
Date:   Mon Oct 2 18:25:23 2023 -0600

    io_uring/rsrc: cleanup io_pin_pages()
    
    This function is overly convoluted with a goto error path, and checks
    under the mmap_read_lock() that don't need to be at all. Rearrange it
    a bit so the checks and errors fall out naturally, rather than needing
    to jump around for it.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-11-28 17:04:44 -05:00
Jeff Moyer f1ebf01f03 io_uring/alloc_cache: switch to array based caching
JIRA: https://issues.redhat.com/browse/RHEL-64867

commit 414d0f45c316221acbf066658afdbae5b354a5cc
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed Mar 20 15:19:44 2024 -0600

    io_uring/alloc_cache: switch to array based caching
    
    Currently lists are being used to manage this, but best practice is
    usually to have these in an array instead as that is cheaper to manage.
    
    Outside of that detail, games are also played with KASAN as the list
    is inside the cached entry itself.
    
    Finally, all users of this need a struct io_cache_entry embedded in
    their struct, which is union'ized with something else in there that
    isn't used across the free -> realloc cycle.
    
    Get rid of all of that, and simply have it be an array. This will not
    change the memory used, as we're just trading an 8-byte member entry
    for the per-elem array size.
    
    This reduces the overhead of the recycled allocations, and it reduces
    the amount of code needed to support recycling to about half of
    what it currently is.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-11-28 16:56:44 -05:00
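
The array-based caching described above can be sketched in userspace C roughly as follows. This is an illustrative analogue only: `struct obj_cache` and the `obj_cache_*()` helpers are hypothetical names, and `malloc()`/`free()` stand in for the kernel allocators. The point it shows is that cached objects need no embedded list entry; the cache only keeps an array of pointers and a fill count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical array-backed object cache (userspace analogue). */
struct obj_cache {
    void **entries;           /* array of recycled object pointers */
    unsigned int nr_cached;   /* how many slots are currently filled */
    unsigned int max_cached;  /* capacity of the entries array */
    size_t elem_size;         /* size of each object */
};

/* Returns 0 on success, -1 if the pointer array cannot be allocated. */
static int obj_cache_init(struct obj_cache *c, unsigned int max, size_t size)
{
    assert(max > 0 && size > 0);
    c->entries = calloc(max, sizeof(*c->entries));
    if (!c->entries)
        return -1;
    c->nr_cached = 0;
    c->max_cached = max;
    c->elem_size = size;
    return 0;
}

/* Returns 1 if the object was cached, 0 if the cache is full
 * (in which case the caller still owns the object and frees it). */
static int obj_cache_put(struct obj_cache *c, void *obj)
{
    if (c->nr_cached < c->max_cached) {
        c->entries[c->nr_cached++] = obj;
        return 1;
    }
    return 0;
}

/* Pops a recycled object if one is available, else allocates a new one. */
static void *obj_cache_get(struct obj_cache *c)
{
    if (c->nr_cached)
        return c->entries[--c->nr_cached];
    return malloc(c->elem_size);
}

/* Frees every cached object and the pointer array itself. */
static void obj_cache_destroy(struct obj_cache *c)
{
    while (c->nr_cached)
        free(c->entries[--c->nr_cached]);
    free(c->entries);
    c->entries = NULL;
}
```

Because the bookkeeping lives in the array, a recycled object's memory is not touched while it sits in the cache, which is the property the message above contrasts with keeping the list node inside the cached entry.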
Rafael Aquini 83191dde1b mm/gup: remove vmas parameter from pin_user_pages()
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 4c630f307455c06f99bdeca7f7a1ab5318604fe0
Author: Lorenzo Stoakes <lstoakes@gmail.com>
Date:   Wed May 17 20:25:45 2023 +0100

    mm/gup: remove vmas parameter from pin_user_pages()

    We are now in a position where no caller of pin_user_pages() requires the
    vmas parameter at all, so eliminate this parameter from the function and
    all callers.

    This clears the way to removing the vmas parameter from GUP altogether.

    Link: https://lkml.kernel.org/r/195a99ae949c9f5cb589d2222b736ced96ec199a.1684350871.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>  [qib]
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>   [drivers/media]
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Janosch Frank <frankja@linux.ibm.com>
    Cc: Jarkko Sakkinen <jarkko@kernel.org>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:37 -04:00
Rafael Aquini 22e1da15fd io_uring: rsrc: delegate VMA file-backed check to GUP
JIRA: https://issues.redhat.com/browse/RHEL-27742

This patch is a backport of the following upstream commit:
commit 34ed8d0dcd692378f1155fe27648f54f99adbfbf
Author: Lorenzo Stoakes <lstoakes@gmail.com>
Date:   Wed May 17 20:25:42 2023 +0100

    io_uring: rsrc: delegate VMA file-backed check to GUP

    Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
    broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
    writing to file-backed mappings", there is no need to explicitly check VMAs
    for this condition, so simply remove this logic from io_uring altogether.

    Link: https://lkml.kernel.org/r/e4a4efbda9cd12df71e0ed81796dc630231a1ef2.1684350871.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Janosch Frank <frankja@linux.ibm.com>
    Cc: Jarkko Sakkinen <jarkko@kernel.org>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
    Cc: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-09-05 20:35:36 -04:00
Jeff Moyer ab68f1ab6a io_uring/rsrc: don't lock while !TASK_RUNNING
JIRA: https://issues.redhat.com/browse/RHEL-47830
CVE: CVE-2024-40922

commit 54559642b96116b45e4b5ca7fd9f7835b8561272
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Wed Jun 12 13:56:38 2024 +0100

    io_uring/rsrc: don't lock while !TASK_RUNNING
    
    There is a report of io_rsrc_ref_quiesce() locking a mutex while not
    TASK_RUNNING, which is due to forgetting to restore the state after
    io_run_task_work_sig() and attempts to break out of the waiting loop.
    
    do not call blocking ops when !TASK_RUNNING; state=1 set at
    [<ffffffff815d2494>] prepare_to_wait+0xa4/0x380
    kernel/sched/wait.c:237
    WARNING: CPU: 2 PID: 397056 at kernel/sched/core.c:10099
    __might_sleep+0x114/0x160 kernel/sched/core.c:10099
    RIP: 0010:__might_sleep+0x114/0x160 kernel/sched/core.c:10099
    Call Trace:
     <TASK>
     __mutex_lock_common kernel/locking/mutex.c:585 [inline]
     __mutex_lock+0xb4/0x940 kernel/locking/mutex.c:752
     io_rsrc_ref_quiesce+0x590/0x940 io_uring/rsrc.c:253
     io_sqe_buffers_unregister+0xa2/0x340 io_uring/rsrc.c:799
     __io_uring_register io_uring/register.c:424 [inline]
     __do_sys_io_uring_register+0x5b9/0x2400 io_uring/register.c:613
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xd8/0x270 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x6f/0x77
    
    Reported-by: Li Shi <sl1589472800@gmail.com>
    Fixes: 4ea15b56f0810 ("io_uring/rsrc: use wq for quiescing")
    Cc: stable@vger.kernel.org
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/77966bc104e25b0534995d5dbb152332bc8f31c0.1718196953.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-07-26 17:07:04 -04:00
Jeff Moyer bc8154d90e io_uring: drop any code related to SCM_RIGHTS
JIRA: https://issues.redhat.com/browse/RHEL-36366
CVE: CVE-2023-52656
Conflicts: We backported commit 4f0b9194bc11 ("fs: Rename
  anon_inode_getfile_secure() and anon_inode_getfd_secure()"), which
  obviously causes a conflict in io_uring_get_file().

commit 6e5e6d274956305f1fc0340522b38f5f5be74bdb
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue Dec 19 12:36:34 2023 -0700

    io_uring: drop any code related to SCM_RIGHTS
    
    This is dead code after we dropped support for passing io_uring fds
    over SCM_RIGHTS, get rid of it.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-05-15 13:58:17 -04:00
Ming Lei 66e4fb2214 io_uring: fix off-by one bvec index
JIRA: https://issues.redhat.com/browse/RHEL-19874

commit d6fef34ee4d102be448146f24caf96d7b4a05401
Author: Keith Busch <kbusch@kernel.org>
Date:   Mon Nov 20 14:18:31 2023 -0800

    io_uring: fix off-by one bvec index

    If the offset equals the bv_len of the first registered bvec, then the
    request does not include any of that first bvec. Skip it so that drivers
    don't have to deal with a zero length bvec, which was observed to break
    NVMe's PRP list creation.

    Cc: stable@vger.kernel.org
    Fixes: bd11b3a391 ("io_uring: don't use iov_iter_advance() for fixed buffers")
    Signed-off-by: Keith Busch <kbusch@kernel.org>
    Link: https://lore.kernel.org/r/20231120221831.2646460-1-kbusch@meta.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Ming Lei <ming.lei@redhat.com>
2024-01-03 10:02:11 +08:00
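
The boundary condition behind the off-by-one fix above can be shown with a small userspace sketch. It is not the kernel implementation: `struct seg` and `seg_index_for_offset()` are hypothetical, and a plain loop replaces the kernel's page-shift arithmetic. What it illustrates is that an offset equal to a segment's length consumes that whole segment, so the segment is skipped rather than handed on with zero bytes left in it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct bio_vec. */
struct seg {
    size_t len;  /* bytes in this segment */
};

/*
 * Find the first segment that still holds data at byte offset `off`.
 * Returns its index and stores the remaining in-segment offset in
 * *in_seg_off. Returns `nr` if `off` is at or past the end of all segments.
 */
static size_t seg_index_for_offset(const struct seg *segs, size_t nr,
                                   size_t off, size_t *in_seg_off)
{
    size_t i = 0;

    assert(segs && in_seg_off);

    /*
     * `>=` rather than `>`: when off == segs[i].len the segment is fully
     * consumed, so move on instead of exposing a zero-length segment.
     */
    while (i < nr && off >= segs[i].len) {
        off -= segs[i].len;
        i++;
    }
    *in_seg_off = off;
    return i;
}
```

With two 4096-byte segments, an offset of 4096 resolves to segment 1 at in-segment offset 0, while 4095 stays in segment 0.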
Jeff Moyer 3c20906e66 io_uring/rsrc: keep one global dummy_ubuf
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 19a63c4021702e389a559726b16fcbf07a8a05f9
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Fri Aug 11 13:53:46 2023 +0100

    io_uring/rsrc: keep one global dummy_ubuf
    
    We set empty registered buffers to dummy_ubuf as an optimisation.
    Currently, we allocate the dummy entry for each ring, whereas we can
    simply have one global instance.
    
    We're casting away const on assignment; that's fine as we're not going
    to change the content of the dummy, and the constness gives us an extra
    layer of protection if something ever goes wrong.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/e4a96dda35ab755914bc43f6781bba0df97ac489.1691757663.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 17:26:25 -04:00
Jeff Moyer a8b58ec6be io_uring: add helpers to decode the fixed file file_ptr
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 4bfb0c9af832a182a54e549123a634e0070c8d4f
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Jun 20 13:32:35 2023 +0200

    io_uring: add helpers to decode the fixed file file_ptr
    
    Remove all the open coded magic on slot->file_ptr by introducing two
    helpers that return the file pointer and the flags instead.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20230620113235.920399-9-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:55 -04:00
Jeff Moyer 1ca4caabd2 io_uring/rsrc: check for nonconsecutive pages
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 776617db78c6d208780e7c69d4d68d1fa82913de
Author: Tobias Holl <tobias@tholl.xyz>
Date:   Wed May 3 08:59:50 2023 -0600

    io_uring/rsrc: check for nonconsecutive pages
    
    Pages that are from the same folio do not necessarily need to be
    consecutive. In that case, we cannot consolidate them into a single bvec
    entry. Before applying the huge page optimization from commit 57bebf807e2a
    ("io_uring/rsrc: optimise registered huge pages"), check that the memory
    is actually consecutive.
    
    Cc: stable@vger.kernel.org
    Fixes: 57bebf807e2a ("io_uring/rsrc: optimise registered huge pages")
    Signed-off-by: Tobias Holl <tobias@tholl.xyz>
    [axboe: formatting]
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
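
The contiguity check described above can be sketched as follows. This is an assumption-laden model, not the kernel code: pages are represented by page frame numbers in a plain array, and `pfns_are_consecutive()` is a hypothetical helper. It shows the extra condition that has to hold before a run of pages is merged into one segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true only if every page frame number follows its predecessor
 * directly. Belonging to the same large allocation is not sufficient on
 * its own: the caller may have passed the pages in a different order.
 */
static bool pfns_are_consecutive(const unsigned long *pfns, size_t n)
{
    assert(pfns || n == 0);

    for (size_t i = 1; i < n; i++) {
        if (pfns[i] != pfns[i - 1] + 1)
            return false;
    }
    return true;
}
```

A caller would collapse the range into a single entry only when this returns true, and otherwise fall back to one entry per page.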
Jeff Moyer 32af48a0f7 io_uring/rsrc: disassociate nodes and rsrc_data
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 2236b3905b4d4e9cd4d149ab35767858c02bb79b
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:41 2023 +0100

    io_uring/rsrc: disassociate nodes and rsrc_data
    
    Make rsrc nodes independent from rsrc_data; for that we keep ctx and
    rsrc type in nodes.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/4f259abe9cd4eea6a3b4ed83508635218acd3c3f.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
Jeff Moyer a02d5bc14b io_uring/rsrc: devirtualise rsrc put callbacks
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit fc7f3a8d3a78503c4f3e108155fb9a233dc307a4
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:40 2023 +0100

    io_uring/rsrc: devirtualise rsrc put callbacks
    
    We only have two rsrc types, buffers and files, so replace virtual
    callbacks for putting resources down with a switch..case.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/02ca727bf8e5f7f820c2f404e95ae88c8f472930.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
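
As a rough illustration of the devirtualisation described above, the sketch below dispatches on a type tag stored in the node instead of calling through a function pointer. All names (`enum rsrc_type`, `struct rsrc_node`, `struct rsrc_stats`, `rsrc_node_put()`) are hypothetical, and the put helpers only bump counters so the behaviour can be observed.

```c
#include <assert.h>

/* Hypothetical type tag kept in each node. */
enum rsrc_type {
    RSRC_FILE,
    RSRC_BUFFER,
};

/* Counters standing in for the real "put" side effects. */
struct rsrc_stats {
    int files_put;
    int buffers_put;
};

struct rsrc_node {
    enum rsrc_type type;
};

static void file_put(struct rsrc_stats *s)   { s->files_put++; }
static void buffer_put(struct rsrc_stats *s) { s->buffers_put++; }

/* Direct dispatch: with only two types, a switch replaces the callback. */
static void rsrc_node_put(struct rsrc_stats *s, const struct rsrc_node *node)
{
    switch (node->type) {
    case RSRC_FILE:
        file_put(s);
        break;
    case RSRC_BUFFER:
        buffer_put(s);
        break;
    default:
        assert(0 && "unknown rsrc type");
    }
}
```

With a closed set of two types, the switch keeps every call target visible to the compiler and avoids an indirect call.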
Jeff Moyer e199b640e4 io_uring/rsrc: pass node to io_rsrc_put_work()
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 29b26c556e7439b1370ac6a59fce83a9d1521de1
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:39 2023 +0100

    io_uring/rsrc: pass node to io_rsrc_put_work()
    
    Instead of passing rsrc_data and a resource to io_rsrc_put_work() just
    forward node, that's all the function needs.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/791e8edd28d78797240b74d34e99facbaad62f3b.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
Jeff Moyer aacba14952 io_uring/rsrc: inline io_rsrc_put_work()
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 4130b49991d6b8ca0ea44cb256e710c4e48d7f01
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:38 2023 +0100

    io_uring/rsrc: inline io_rsrc_put_work()
    
    io_rsrc_put_work() is simple enough to be open coded into its only
    caller.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/1b36dd46766ced39a9b160767babfa2fce07b8f8.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
Jeff Moyer 252fa76607 io_uring/rsrc: add empty flag in rsrc_node
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 26147da37f3e52041d9deba189d39f27ce78a84f
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:37 2023 +0100

    io_uring/rsrc: add empty flag in rsrc_node
    
    Unless a node was flushed by io_rsrc_ref_quiesce(), it'll carry a
    resource. Replace ->inline_items with an empty flag, which is
    initialised to false and only raised in io_rsrc_ref_quiesce().
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/75d384c9d2252e12af73b9cf8a44e1699106aeb1.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
Jeff Moyer b99aa89db1 io_uring/rsrc: merge nodes and io_rsrc_put
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit c376644fb915fbdea8c4a04f859d032a8be352fd
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:36 2023 +0100

    io_uring/rsrc: merge nodes and io_rsrc_put
    
    struct io_rsrc_node carries a number of resources represented by struct
    io_rsrc_put. That was handy before for sync overhead amortisation, but
    all complexity is gone and nodes are simple and lightweight. Let's
    allocate a separate node for each resource.
    
    Nodes and io_rsrc_put are not much different in size, and the former are
    cached, so node allocation should work better. That also removes some
    overhead for nested iteration in io_rsrc_node_ref_zero() /
    __io_rsrc_put_work().
    
    Another reason for the patch is that it greatly reduces complexity
    by moving io_rsrc_node_switch[_start]() inside io_queue_rsrc_removal(),
    so users don't have to care about it.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/c7d3a45b30cc14cd93700a710dd112edc703db98.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
Jeff Moyer c5de9ff392 io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 63fea89027ff4fd4f350b471ad5b9220d373eec5
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:35 2023 +0100

    io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal
    
    For io_queue_rsrc_removal() we should always use the current active rsrc
    node, don't pass it directly but let the function grab it from the
    context.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/d15939b4afea730978b4925685c2577538b823bb.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 1cfbbea4bb io_uring/rsrc: refactor io_queue_rsrc_removal
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit c899a5d7d0eca054546b63e95c94b1e609516f84
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:14 2023 +0100

    io_uring/rsrc: refactor io_queue_rsrc_removal
    
    We can queue up a rsrc into a list in io_queue_rsrc_removal() while
    allocating io_rsrc_put and so simplify the function.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/36bd708ee25c0e2e7992dc19b17db166eea9ac40.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 449bd51ba8 io_uring/rsrc: clean up __io_sqe_buffers_update()
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 9a57fffedc0ee078418a7793ab29cd3864205340
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:12 2023 +0100

    io_uring/rsrc: clean up __io_sqe_buffers_update()
    
    Inline offset variable, so we don't use it without subjecting it to
    array_index_nospec() first.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/77936d9ed23755588810c5eafcea7e1c3b90e3cd.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 7f28d9822b io_uring/rsrc: inline switch_start fast path
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 2f2af35f8e5a1ed552ed02e47277d50092a2b9f6
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:11 2023 +0100

    io_uring/rsrc: inline switch_start fast path
    
    Inline the part of io_rsrc_node_switch_start() that checks whether the
    cache is empty or not, as most of the time it will have some number of
    entries in there.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/9619c1717a0e01f22c5fce2f1ba2735f804da0f2.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 5904c89c9a io_uring/rsrc: remove rsrc_data refs
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 0b222eeb6514ba6c3457b667fa4f3645032e1fc9
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:10 2023 +0100

    io_uring/rsrc: remove rsrc_data refs
    
    Instead of waiting for rsrc_data->refs to be downed to zero, check
    whether there are rsrc nodes queued for completion, that's easier than
    maintaining references.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/8e33fd143d83e11af3e386aea28eb6d6c6a1be10.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer a4e29f0b79 io_uring/rsrc: fix DEFER_TASKRUN rsrc quiesce
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 7d481e0356334eb2de254414769b4bed4b2a8827
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:09 2023 +0100

    io_uring/rsrc: fix DEFER_TASKRUN rsrc quiesce
    
    For io_rsrc_ref_quiesce() to progress it should execute all task_work
    items, including deferred ones. However, currently nobody would wake us,
    and so let's set ctx->cq_wait_nr, so io_req_local_work_add() would wake
    us up.
    
    Fixes: c0e0d6ba25f18 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/f1a90d1bc5ebf096475b018fed52e54f3b89d4af.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer e282fef7b1 io_uring/rsrc: use wq for quiescing
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 4ea15b56f0810f0d8795d475db1bb74b3a7c1b2f
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:08 2023 +0100

    io_uring/rsrc: use wq for quiescing
    
    Replace completions with waitqueues for rsrc data quiesce, the main
    wakeup condition is when data refs hit zero. Note that data refs are
    only changed under ->uring_lock, so we prepare before mutex_unlock()
    reacquire it after taking the lock back. This change will be needed
    in the next patch.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/1d0dbc74b3b4fd67c8f01819e680c5e0da252956.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 99349c9fac io_uring/rsrc: refactor io_rsrc_ref_quiesce
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit eef81fcaa61e1bc6b7735be65f41bbf1a8efd133
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:07 2023 +0100

    io_uring/rsrc: refactor io_rsrc_ref_quiesce
    
    Refactor io_rsrc_ref_quiesce() by moving the first mutex_unlock(),
    so we don't have to have a second mutex_unlock() further in the loop.
    It prepares us for the next patch.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/65bc876271fb16bf550a53a4c76c91aacd94e52e.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 7b8d16c640 io_uring/rsrc: remove io_rsrc_node::done
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit c732ea242d565c8281c4b017929fc62a246d81b9
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:06 2023 +0100

    io_uring/rsrc: remove io_rsrc_node::done
    
    Kill io_rsrc_node::done and check refs instead, it's set when the node's
    refcount hits zero, and it won't change afterwards.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/bbde361f4010f7e8bf196f1ecca27a763b79926f.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:32 -04:00
Jeff Moyer 33ef9d0d06 io_uring/rsrc: extract SCM file put helper
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit d581076b6a85c6f8308a4ba2bdcd82651f5183df
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 11 12:06:08 2023 +0100

    io_uring/rsrc: extract SCM file put helper
    
    SCM file accounting is a slow path and is only used for UNIX files.
    Extract a helper out of io_rsrc_file_put() that does the SCM
    unaccounting.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/58cc7bffc2ee96bec8c2b89274a51febcbfa5556.1681210788.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:32 -04:00
Jeff Moyer ce9bdf5c09 io_uring/rsrc: refactor io_rsrc_node_switch
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 2933ae6eaa05e8db6ad33a3ca12af18d2a25358c
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 11 12:06:07 2023 +0100

    io_uring/rsrc: refactor io_rsrc_node_switch
    
    We use io_rsrc_node_switch() coupled with io_rsrc_node_switch_start()
    for a bunch of cases including initialising ctx->rsrc_node, i.e. by
    passing NULL instead of rsrc_data. Leave it to only deal with actual
    node changing.
    
    For that, first remove it from io_uring_create() and add a function
    allocating the first node. Then also remove all calls to
    io_rsrc_node_switch() from files/buffers register as we already have a
    node installed and it does essentially nothing.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/d146fe306ff98b1a5a60c997c252534f03d423d7.1681210788.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:32 -04:00
Jeff Moyer 5028c843e9 io_uring/rsrc: zero node's rsrc data on alloc
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 13c223962eac16f161cf9b6355209774c609af28
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 11 12:06:06 2023 +0100

    io_uring/rsrc: zero node's rsrc data on alloc
    
    struct io_rsrc_node::rsrc_data field is initialised on rsrc removal and
    shouldn't be used before that, still let's play safe and zero the field
    on alloc.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/09bd03cedc8da8a7974c5e6e4bf0489fd16593ab.1681210788.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:32 -04:00
Jeff Moyer 29dd0a1af8 io_uring/rsrc: consolidate node caching
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 528407b1e0ea51260fff2cc8b669c632a65d7a09
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 11 12:06:05 2023 +0100

    io_uring/rsrc: consolidate node caching
    
    We store one pre-allocated rsrc node in ->rsrc_backup_node, merge it
    with ->rsrc_node_cache.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/6d5410e51ccd29be7a716be045b51d6b371baef6.1681210788.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:32 -04:00
Jeff Moyer 13c43a91fc io_uring/rsrc: add lockdep checks
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 786788a8cfe03056e9c7b1c6e418c1db92a0ce80
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 11 12:06:04 2023 +0100

    io_uring/rsrc: add lockdep checks
    
    Add a lockdep check to make sure that file and buffer updates hold
    ->uring_lock.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/961bbe6e433ec9bc0375127f23468b37b729df99.1681210788.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:32 -04:00
Jeff Moyer 3cb59f3336 io_uring/rsrc: optimise io_rsrc_data refcounting
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 757ef4682b6aa29fdf752ad47f0d63eb48b261cf
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:56 2023 +0100

    io_uring/rsrc: optimise io_rsrc_data refcounting
    
    Every struct io_rsrc_node takes a struct io_rsrc_data reference, which
    means all rsrc updates do 2 extra atomics. Replace atomic refcounting
    with an int as it's all done under ->uring_lock.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/e73c3d6820cf679532696d790b5b8fae23537213.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
Jeff Moyer 067bb80c4a io_uring/rsrc: add lockdep sanity checks
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 1f2c8f610aa6c6a3dc3523f93eaf28c25051df6f
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:55 2023 +0100

    io_uring/rsrc: add lockdep sanity checks
    
    We should hold ->uring_lock while putting nodes with io_put_rsrc_node(),
    add a lockdep check for that.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/b50d5f156ac41450029796738c1dfd22a521df7a.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
Jeff Moyer d00f0ad4e6 io_uring/rsrc: cache struct io_rsrc_node
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 9eae8655f9cd2eeed99fb7a0d2bb22816c17e497
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:54 2023 +0100

    io_uring/rsrc: cache struct io_rsrc_node
    
    Add allocation cache for struct io_rsrc_node, it's always allocated and
    put under ->uring_lock, so it doesn't need any extra synchronisation
    around caches.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/252a9d9ef9654e6467af30fdc02f57c0118fb76e.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
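Illustrative note for d00f0ad4e6 (not part of the commit): the idea of a lock-protected allocation cache can be sketched in userspace C as below. This is a minimal sketch under stated assumptions, not the kernel implementation: `struct node`, `struct node_cache`, `cache_alloc()`, `cache_free()` and `cache_drain()` are hypothetical names, and the caller is assumed to hold the owning lock around every call, which is why the free list needs no atomics or locking of its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct io_rsrc_node. */
struct node {
    struct node *next_free;  /* links nodes while they sit in the cache */
    int refs;
};

/* Free-list cache; the caller must hold the owning lock for every call. */
struct node_cache {
    struct node *free;
    unsigned int nr;   /* nodes currently cached */
    unsigned int max;  /* upper bound on cached nodes */
};

/* Reuse a cached node if one is available, otherwise fall back to malloc(). */
static struct node *cache_alloc(struct node_cache *c)
{
    struct node *n = c->free;

    if (n) {
        c->free = n->next_free;
        c->nr--;
    } else {
        n = malloc(sizeof(*n));
        if (!n)
            return NULL;
    }
    n->next_free = NULL;
    n->refs = 1;
    return n;
}

/* Return a node to the cache, or free it if the cache is already full. */
static void cache_free(struct node_cache *c, struct node *n)
{
    assert(n != NULL);
    if (c->nr < c->max) {
        n->next_free = c->free;
        c->free = n;
        c->nr++;
    } else {
        free(n);
    }
}

/* Release everything still held by the cache (e.g. at teardown). */
static void cache_drain(struct node_cache *c)
{
    while (c->free) {
        struct node *n = c->free;

        c->free = n->next_free;
        free(n);
    }
    c->nr = 0;
}
```

Because allocation and release both happen under the same lock, a node freed into the cache is simply handed back on the next allocation.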
Jeff Moyer e49bee429b io_uring/rsrc: don't offload node free
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 36b9818a5a84cb7c977fb723babca1c8d74f288f
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:53 2023 +0100

    io_uring/rsrc: don't offload node free
    
    struct delayed_work rsrc_put_work was previously used to offload node
    freeing because io_rsrc_node_ref_zero() was previously called by RCU in
    the IRQ context. Now, as percpu refcounting is gone, we can do it
    eagerly at the spot without pushing it to a worker.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/13fb1aac1e8d068ad8fd4a0c6d0d157ab61b90c0.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
Jeff Moyer 5e922f26a3 io_uring/rsrc: optimise io_rsrc_put allocation
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit ff7c75ecaa9e6b251f76c24e289d4bfe413ffe31
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:52 2023 +0100

    io_uring/rsrc: optimise io_rsrc_put allocation
    
    Every io_rsrc_node keeps a list of items to put, and all entries are
    kmalloc()'ed. However, it's quite common to queue up only one entry per
    node, so let's add an inline entry there to avoid extra allocations.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/c482c1c652c45c85ac52e67c974bc758a50fed5f.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
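Illustrative note for 5e922f26a3 (not part of the commit): the inline-entry idea can be sketched in userspace C as follows. This is a sketch under assumptions rather than the kernel code: `struct put_item`, `struct node`, `inline_item`, `inline_used`, `node_queue_put()` and `node_release()` are hypothetical names, and `malloc()` stands in for `kmalloc()`. Only `item_list` is taken from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One queued "put" request; rsrc is an opaque file or buffer handle. */
struct put_item {
    struct put_item *next;
    void *rsrc;
};

/* Hypothetical node: the first entry lives inline, the rest on a list. */
struct node {
    struct put_item *item_list;   /* heap-allocated overflow entries */
    struct put_item inline_item;  /* storage for the first queued entry */
    bool inline_used;
};

/* Queue rsrc on the node; returns 0 on success, -1 if allocation fails. */
static int node_queue_put(struct node *n, void *rsrc)
{
    struct put_item *it;

    assert(n != NULL);
    if (!n->inline_used) {
        /* Common case: a single entry per node, no allocation needed. */
        n->inline_item.rsrc = rsrc;
        n->inline_used = true;
        return 0;
    }
    it = malloc(sizeof(*it));
    if (!it)
        return -1;
    it->rsrc = rsrc;
    it->next = n->item_list;
    n->item_list = it;
    return 0;
}

/* Free the overflow entries and mark the inline slot as unused. */
static void node_release(struct node *n)
{
    while (n->item_list) {
        struct put_item *it = n->item_list;

        n->item_list = it->next;
        free(it);
    }
    n->inline_used = false;
}
```

With this layout the first queued entry costs no allocation, and only the second and later entries fall back to the heap.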
Jeff Moyer ac730dc64c io_uring/rsrc: rename rsrc_list
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit c824986c113f15e2ef2c00da9a226c09ecaac74c
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:51 2023 +0100

    io_uring/rsrc: rename rsrc_list
    
    We have too many "rsrc" around which makes the name of struct
    io_rsrc_node::rsrc_list confusing. The field is responsible for keeping
    a list of files or buffers, so call it item_list and add comments
    around.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/3e34d4dfc1fdbb6b520f904ee6187c2ccf680efe.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
Jeff Moyer 2b79b32388 io_uring/rsrc: kill rsrc_ref_lock
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 0a4813b1abdf06e44ce60cdebfd374cfd27c46bf
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:50 2023 +0100

    io_uring/rsrc: kill rsrc_ref_lock
    
    We use ->rsrc_ref_lock spinlock to protect ->rsrc_ref_list in
    io_rsrc_node_ref_zero(). Now we removed pcpu refcounting, which means
    io_rsrc_node_ref_zero() is not executed from the irq context as an RCU
    callback anymore, and we also put it under ->uring_lock.
    io_rsrc_node_switch(), which queues up nodes into the list, is also
    protected by ->uring_lock, so we can safely get rid of ->rsrc_ref_lock.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/6b60af883c263551190b526a55ff2c9d5ae07141.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
Jeff Moyer 1031294661 io_uring/rsrc: protect node refs with uring_lock
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit ef8ae64ffa9578c12e44de42604004c2cc3e9c27
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:49 2023 +0100

    io_uring/rsrc: protect node refs with uring_lock
    
    Currently, for nodes we have an atomic counter and some cached
    (non-atomic) refs protected by uring_lock. Let's put all ref
    manipulations under uring_lock and get rid of the atomic part.
    It's free as in all cases we care about we already hold the lock.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/25b142feed7d831008257d90c8b17c0115d4fc15.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:30 -04:00
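Illustrative note for 1031294661 (not part of the commit): a plain integer reference count that relies on an outer lock can be sketched as below. This is a toy model under assumptions, not the kernel code: `struct ctx`, `struct node`, `ctx_lock()`, `ctx_unlock()`, `node_get()` and `node_put()` are hypothetical names, the `lock_held` flag merely stands in for ->uring_lock, and the `assert()` calls play the role that a lockdep check would play in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy context: lock_held models whether the outer lock is currently held. */
struct ctx {
    bool lock_held;
};

/* Hypothetical node with a plain (non-atomic) reference count. */
struct node {
    struct ctx *ctx;
    int refs;
};

static void ctx_lock(struct ctx *c)
{
    assert(!c->lock_held);
    c->lock_held = true;
}

static void ctx_unlock(struct ctx *c)
{
    assert(c->lock_held);
    c->lock_held = false;
}

/* Take a reference; only valid while the outer lock is held. */
static void node_get(struct node *n)
{
    assert(n->ctx->lock_held);
    n->refs++;
}

/* Drop a reference; returns true when the last reference is gone. */
static bool node_put(struct node *n)
{
    assert(n->ctx->lock_held);
    assert(n->refs > 0);
    return --n->refs == 0;
}
```

Since every get and put already runs with the lock held, the counter needs no atomic operations; the assertions catch callers that break that rule.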
Jeff Moyer 4622b46d9b io_uring/rsrc: keep cached refs per node
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 8e15c0e71b8ae64fb7163532860f8d608165281f
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:46 2023 +0100

    io_uring/rsrc: keep cached refs per node
    
    We cache refs of the current node (i.e. ctx->rsrc_node) in
    ctx->rsrc_cached_refs. We'll be moving away from atomics, so move the
    cached refs in struct io_rsrc_node for now. It's a prep patch and
    shouldn't change anything in practice.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/9edc3669c1d71b06c2dca78b2b2b8bb9292738b9.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:29 -04:00
Jeff Moyer 847c95f93a io_uring/rsrc: use non-pcpu refcounts for nodes
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit b8fb5b4fdd67f9d18109c5d21d44a8bd4ddb608b
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 4 13:39:45 2023 +0100

    io_uring/rsrc: use non-pcpu refcounts for nodes
    
    One problem with the current rsrc infra is that updates will often
    generate lots of rsrc nodes, each carrying pcpu refs. That takes quite a
    lot of memory, especially if there is a stall, and takes lots of CPU
    cycles. The pcpu allocations alone take >50% of CPU with a naive benchmark
    updating files in a loop.
    
    Replace pcpu refs with normal refcounting. There is already a hot path
    avoiding atomics / refs, but following patches will further improve it.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/e9ed8a9457b331a26555ff9443afc64cdaab7247.1680576071.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:29 -04:00
Jeff Moyer 3939e342f7 io_uring/rsrc: fix folio accounting
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit d2acf789088bb562cea342b6a24e646df4d47839
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Mar 16 15:26:05 2023 +0000

    io_uring/rsrc: fix folio accounting
    
    | BUG: Bad page state in process kworker/u8:0  pfn:5c001
    | page:00000000bfda61c8 refcount:0 mapcount:0 mapping:0000000000000000 index:0x20001 pfn:0x5c001
    | head:0000000011409842 order:9 entire_mapcount:0 nr_pages_mapped:0 pincount:1
    | anon flags: 0x3fffc00000b0004(uptodate|head|mappedtodisk|swapbacked|node=0|zone=0|lastcpupid=0xffff)
    | raw: 03fffc0000000000 fffffc0000700001 ffffffff00700903 0000000100000000
    | raw: 0000000000000200 0000000000000000 00000000ffffffff 0000000000000000
    | head: 03fffc00000b0004 dead000000000100 dead000000000122 ffff00000a809dc1
    | head: 0000000000020000 0000000000000000 00000000ffffffff 0000000000000000
    | page dumped because: nonzero pincount
    | CPU: 3 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc2-00001-gc6811bf0cd87 #1
    | Hardware name: linux,dummy-virt (DT)
    | Workqueue: events_unbound io_ring_exit_work
    | Call trace:
    |  dump_backtrace+0x13c/0x208
    |  show_stack+0x34/0x58
    |  dump_stack_lvl+0x150/0x1a8
    |  dump_stack+0x20/0x30
    |  bad_page+0xec/0x238
    |  free_tail_pages_check+0x280/0x350
    |  free_pcp_prepare+0x60c/0x830
    |  free_unref_page+0x50/0x498
    |  free_compound_page+0xcc/0x100
    |  free_transhuge_page+0x1f0/0x2b8
    |  destroy_large_folio+0x80/0xc8
    |  __folio_put+0xc4/0xf8
    |  gup_put_folio+0xd0/0x250
    |  unpin_user_page+0xcc/0x128
    |  io_buffer_unmap+0xec/0x2c0
    |  __io_sqe_buffers_unregister+0xa4/0x1e0
    |  io_ring_exit_work+0x68c/0x1188
    |  process_one_work+0x91c/0x1a58
    |  worker_thread+0x48c/0xe30
    |  kthread+0x278/0x2f0
    |  ret_from_fork+0x10/0x20
    
    Mark reports an issue with the recent patches coalescing compound pages
    while registering them in io_uring. The reason is that we try to drop
    excessive references with folio_put_refs(), but pages were acquired
    with pin_user_pages(), which has extra accounting and so should be put
    down with matching unpin_user_pages() or at least gup_put_folio().
    
    As a fix, unpin_user_pages() all but the first page instead, and let's figure
    out a better API after.
    
    Fixes: 57bebf807e2abcf8 ("io_uring/rsrc: optimise registered huge pages")
    Reported-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Tested-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/10efd5507d6d1f05ea0f3c601830e08767e189bd.1678980230.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:12 -04:00
Jeff Moyer 428ba2219a io_uring: rsrc: Optimize return value variable 'ret'
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 6acd352dfee558194643adbed7e849fe80fd1b93
Author: Li zeming <zeming@nfschina.com>
Date:   Sat Mar 18 02:25:38 2023 +0800

    io_uring: rsrc: Optimize return value variable 'ret'
    
    The initialization assignment of the variable ret is changed to 0, only
    in 'goto fail;' Use the ret variable as the function return value.
    
    Signed-off-by: Li zeming <zeming@nfschina.com>
    Link: https://lore.kernel.org/r/20230317182538.3027-1-zeming@nfschina.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:12 -04:00
Jeff Moyer 091f941ef1 io_uring/rsrc: always initialize 'folio' to NULL
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 977bc87356107fb946fb4ff24f1e4c241b5043ec
Author: Jens Axboe <axboe@kernel.dk>
Date:   Fri Feb 24 09:54:57 2023 -0700

    io_uring/rsrc: always initialize 'folio' to NULL
    
    Smatch complains that:
    
    smatch warnings:
    io_uring/rsrc.c:1262 io_sqe_buffer_register() error: uninitialized symbol 'folio'.
    
    'folio' may be used uninitialized, which can happen if we end up with a
    single page mapped. Ensure that we clear folio to NULL at the top so
    it's always set.
    
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <error27@gmail.com>
    Link: https://lore.kernel.org/r/202302241432.YML1CD5C-lkp@intel.com/
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:11 -04:00
Jeff Moyer 60a0f7712b io_uring/rsrc: optimise registered huge pages
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 57bebf807e2abcf87d96b9de1266104ee2d8fc2f
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Wed Feb 22 14:36:51 2023 +0000

    io_uring/rsrc: optimise registered huge pages
    
    When registering huge pages, internally io_uring will split them into
    many PAGE_SIZE bvec entries. That's bad for performance as drivers need
    to eventually dma-map the data and will do it individually for each bvec
    entry. Coalesce huge pages into one large bvec.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:11 -04:00
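Illustrative note for 60a0f7712b (not part of the commit): the effect of coalescing can be sketched in userspace C as below. This is a simplified model under assumptions, not the kernel logic: it merges any run of physically contiguous page frame numbers into one segment, whereas the commit describes coalescing the pages of a huge page into one large bvec. `PAGE_SZ`, `struct seg` and `coalesce()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u  /* illustrative page size */

/* Simplified stand-in for a bvec entry: start frame and byte length. */
struct seg {
    unsigned long pfn;
    size_t len;
};

/*
 * Merge physically contiguous pages into as few segments as possible.
 * out must have room for n entries; returns the number of segments written.
 */
static size_t coalesce(const unsigned long *pfns, size_t n, struct seg *out)
{
    size_t nsegs = 0;

    assert(n == 0 || (pfns != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        struct seg *last = nsegs ? &out[nsegs - 1] : NULL;

        if (last && last->pfn + last->len / PAGE_SZ == pfns[i]) {
            /* Next page directly follows the current segment: extend it. */
            last->len += PAGE_SZ;
        } else {
            out[nsegs].pfn = pfns[i];
            out[nsegs].len = PAGE_SZ;
            nsegs++;
        }
    }
    return nsegs;
}
```

A consumer that maps each segment separately then does one mapping per contiguous run instead of one per page.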
Jeff Moyer 6c86e987d3 io_uring/rsrc: optimise single entry advance
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit b000ae0ec2d709046ac1a3c5722fea417f8a067e
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Wed Feb 22 14:36:50 2023 +0000

    io_uring/rsrc: optimise single entry advance
    
    Iterating within the first bvec entry should be essentially free, but we
    use iov_iter_advance() for that, which shows up in benchmark profiles
    taking up to 0.5% of CPU. Replace it with a hand coded version.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:11 -04:00
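Illustrative note for 6c86e987d3 (not part of the commit): the difference between a generic advance and a hand-coded advance within the first entry can be sketched as below. This is a userspace model under assumptions, not the kernel code: `struct seg` and `struct iter` are simplified stand-ins for the kernel's bvec and iov_iter types, and `iter_advance()` and `iter_advance_fast()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a bvec entry. */
struct seg {
    size_t len;
};

/* Simplified stand-in for an iterator over an array of segments. */
struct iter {
    const struct seg *segs;  /* first remaining segment */
    size_t nr_segs;          /* number of remaining segments */
    size_t seg_off;          /* byte offset into segs[0] */
    size_t count;            /* total bytes remaining */
};

/* Generic advance: walk segments until off bytes have been consumed. */
static void iter_advance(struct iter *it, size_t off)
{
    assert(off <= it->count);
    it->count -= off;
    off += it->seg_off;
    while (it->nr_segs && off >= it->segs->len) {
        off -= it->segs->len;
        it->segs++;
        it->nr_segs--;
    }
    it->seg_off = off;
}

/* Fast path: if the target stays inside the first segment, skip the walk. */
static void iter_advance_fast(struct iter *it, size_t off)
{
    assert(it->nr_segs > 0 && off <= it->count);
    if (it->seg_off + off < it->segs->len) {
        /* Only the offset and byte count change; nr_segs counts
         * segments, so it is left untouched here. */
        it->seg_off += off;
        it->count -= off;
    } else {
        iter_advance(it, off);
    }
}
```

Both functions leave the iterator in the same state; the fast path just avoids the loop when the offset lands inside the first segment.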
Jeff Moyer 9bf3263631 io_uring/rsrc: fix a comment in io_import_fixed()
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 6bf65a1b3668b04bb6c8126494d00303104eb9e5
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Mon Feb 20 14:13:52 2023 +0000

    io_uring/rsrc: fix a comment in io_import_fixed()
    
    io_import_fixed() supports offsets, but "may not" means the opposite.
    Replace it with "might not" so the comment rather speaks about
    possible cases.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Link: https://lore.kernel.org/r/5b5f79958456caa6dc532f6205f75f224b232c81.1676902343.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:11 -04:00
Jeff Moyer 07f24770d1 io_uring: use bvec_set_page to initialize a bvec
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit cc342a21930f0e3862c5fd0871cd5a65c5b59e27
Author: Christoph Hellwig <hch@lst.de>
Date:   Fri Feb 3 16:06:29 2023 +0100

    io_uring: use bvec_set_page to initialize a bvec
    
    Use the bvec_set_page helper to initialize a bvec.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
    Link: https://lore.kernel.org/r/20230203150634.3199647-19-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:11 -04:00