13 Commits

Author SHA1 Message Date
Jeff Moyer b7fd7f39e1 io_uring/filetable: don't unnecessarily clear/reset bitmap
JIRA: https://issues.redhat.com/browse/RHEL-64867

commit 340f634aa43d4172771a784da31e5d4c7c7d3126
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue May 7 15:09:02 2024 -0600

    io_uring/filetable: don't unnecessarily clear/reset bitmap
    
    If we're updating an existing slot, we clear the slot bitmap only to
    set it again right after. Just leave the bit set rather than toggle
    it off and on, and move the unused slot setting into the branch of
    not already having a file occupy this slot.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-11-28 17:46:44 -05:00
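
The idea in "don't unnecessarily clear/reset bitmap" can be illustrated with a small userspace sketch. This is not the kernel code: struct toy_table, toy_install() and the bitmap_writes counter are hypothetical names made up for the example. It only shows that replacing a file in an occupied slot needs no bitmap store, because the bit is already set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 8

/* Toy stand-in for a fixed file table: one pointer per slot plus a
 * one-bit-per-slot "in use" bitmap. All names here are hypothetical. */
struct toy_table {
    const void *files[NR_SLOTS];
    unsigned char bitmap;      /* bit i set => slot i is occupied */
    unsigned bitmap_writes;    /* counts bitmap stores, for illustration */
};

static void toy_bitmap_set(struct toy_table *t, unsigned slot)
{
    t->bitmap |= (unsigned char)(1u << slot);
    t->bitmap_writes++;
}

/* Install a file into a slot. If the slot is already occupied its bit
 * is already set, so only the pointer is swapped; the bitmap is touched
 * only on the empty-slot branch. */
static int toy_install(struct toy_table *t, unsigned slot, const void *file)
{
    if (slot >= NR_SLOTS || file == NULL)
        return -1;
    if (t->files[slot] == NULL)
        toy_bitmap_set(t, slot);   /* slot was free: mark it used */
    t->files[slot] = file;         /* fill or replace the slot */
    return 0;
}

static bool toy_slot_used(const struct toy_table *t, unsigned slot)
{
    return (t->bitmap >> slot) & 1u;
}
```

Installing twice into the same slot leaves bitmap_writes at 1, which is the saving the commit message describes.
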
Jeff Moyer bc8154d90e io_uring: drop any code related to SCM_RIGHTS
JIRA: https://issues.redhat.com/browse/RHEL-36366
CVE: CVE-2023-52656
Conflicts: We backported commit 4f0b9194bc11 ("fs: Rename
  anon_inode_getfile_secure() and anon_inode_getfd_secure()"), which
  obviously causes a conflict in io_uring_get_file().

commit 6e5e6d274956305f1fc0340522b38f5f5be74bdb
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue Dec 19 12:36:34 2023 -0700

    io_uring: drop any code related to SCM_RIGHTS
    
    This is dead code now that support for passing io_uring fds over
    SCM_RIGHTS has been dropped, so get rid of it.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2024-05-15 13:58:17 -04:00
Jeff Moyer a8b58ec6be io_uring: add helpers to decode the fixed file file_ptr
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 4bfb0c9af832a182a54e549123a634e0070c8d4f
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue Jun 20 13:32:35 2023 +0200

    io_uring: add helpers to decode the fixed file file_ptr
    
    Remove all the open coded magic on slot->file_ptr by introducing two
    helpers that return the file pointer and the flags instead.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20230620113235.920399-9-hch@lst.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:55 -04:00
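
As a rough illustration of what "helpers to decode the fixed file file_ptr" means, the sketch below packs flag bits into the low bits of an aligned pointer and reads them back through two accessors. It assumes the flags live in the two low bits; the flag names, struct toy_slot, and the helper names are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits kept in the low bits of an aligned pointer. */
#define TOY_FLAG_NOWAIT ((uintptr_t)0x1)
#define TOY_FLAG_ISREG  ((uintptr_t)0x2)
#define TOY_FLAG_MASK   (TOY_FLAG_NOWAIT | TOY_FLAG_ISREG)

struct toy_file { int fd; };

/* The trick only works if the two low pointer bits are always zero. */
_Static_assert(_Alignof(struct toy_file) >= 4, "need two spare low bits");

struct toy_slot { uintptr_t file_ptr; };

/* Store pointer and flags together in one word. */
static void toy_slot_set(struct toy_slot *slot, struct toy_file *file,
                         uintptr_t flags)
{
    slot->file_ptr = (uintptr_t)file | (flags & TOY_FLAG_MASK);
}

/* Helper 1: return the file pointer with the flag bits masked off. */
static struct toy_file *toy_slot_file(const struct toy_slot *slot)
{
    return (struct toy_file *)(slot->file_ptr & ~TOY_FLAG_MASK);
}

/* Helper 2: return only the flag bits. */
static uintptr_t toy_slot_flags(const struct toy_slot *slot)
{
    return slot->file_ptr & TOY_FLAG_MASK;
}
```

Callers then never mask slot->file_ptr by hand, which is the "open coded magic" the commit message wants removed.
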
Jeff Moyer b99aa89db1 io_uring/rsrc: merge nodes and io_rsrc_put
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit c376644fb915fbdea8c4a04f859d032a8be352fd
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:36 2023 +0100

    io_uring/rsrc: merge nodes and io_rsrc_put
    
    struct io_rsrc_node carries a number of resources represented by struct
    io_rsrc_put. That was handy before for sync overhead amortisation, but
    all complexity is gone and nodes are simple and lightweight. Let's
    allocate a separate node for each resource.
    
    Nodes and io_rsrc_put are not much different in size, and the former are
    cached, so node allocation should work better. That also removes some
    overhead for nested iteration in io_rsrc_node_ref_zero() /
    __io_rsrc_put_work().
    
    Another reason for the patch is that it greatly reduces complexity
    by moving io_rsrc_node_switch[_start]() inside io_queue_rsrc_removal(),
    so users don't have to care about it.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/c7d3a45b30cc14cd93700a710dd112edc703db98.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:34 -04:00
Jeff Moyer c5de9ff392 io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit 63fea89027ff4fd4f350b471ad5b9220d373eec5
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Apr 18 14:06:35 2023 +0100

    io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal
    
    For io_queue_rsrc_removal() we should always use the current active rsrc
    node. Don't pass it directly; let the function grab it from the
    context.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/d15939b4afea730978b4925685c2577538b823bb.1681822823.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 4ca541034e io_uring/rsrc: simplify single file node switching
JIRA: https://issues.redhat.com/browse/RHEL-12076

commit c87fd583f3b5ef770af33893394ea37c7a10b5b8
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Apr 13 15:28:13 2023 +0100

    io_uring/rsrc: simplify single file node switching
    
    io_install_fixed_file() removes at most one file, so there is no need to
    keep the needs_switch state; we can call io_rsrc_node_switch() right
    after removal.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/37cfb46f46160f81dced79f646e97db608994574.1681395792.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-11-02 15:31:33 -04:00
Jeff Moyer 022f90161a io_uring/rsrc: fix null-ptr-deref in io_file_bitmap_get()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit 02a4d923e4400a36d340ea12d8058f69ebf3a383
Author: Savino Dicanosa <sd7.dev@pm.me>
Date:   Tue Mar 21 19:44:02 2023 +0000

    io_uring/rsrc: fix null-ptr-deref in io_file_bitmap_get()
    
    When fixed files are unregistered, file_alloc_end and alloc_hint
    are not cleared. This can later cause a NULL pointer dereference in
    io_file_bitmap_get() if auto index selection is enabled via
    IORING_FILE_INDEX_ALLOC:
    
    [    6.519129] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [...]
    [    6.541468] RIP: 0010:_find_next_zero_bit+0x1a/0x70
    [...]
    [    6.560906] Call Trace:
    [    6.561322]  <TASK>
    [    6.561672]  io_file_bitmap_get+0x38/0x60
    [    6.562281]  io_fixed_fd_install+0x63/0xb0
    [    6.562851]  ? __pfx_io_socket+0x10/0x10
    [    6.563396]  io_socket+0x93/0xf0
    [    6.563855]  ? __pfx_io_socket+0x10/0x10
    [    6.564411]  io_issue_sqe+0x5b/0x3d0
    [    6.564914]  io_submit_sqes+0x1de/0x650
    [    6.565452]  __do_sys_io_uring_enter+0x4fc/0xb20
    [    6.566083]  ? __do_sys_io_uring_register+0x11e/0xd80
    [    6.566779]  do_syscall_64+0x3c/0x90
    [    6.567247]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
    [...]
    
    To fix the issue, set file alloc range and alloc_hint to zero after
    file tables are freed.
    
    Cc: stable@vger.kernel.org
    Fixes: 4278a0deb1f6 ("io_uring: defer alloc_hint update to io_file_bitmap_set()")
    Signed-off-by: Savino Dicanosa <sd7.dev@pm.me>
    [axboe: add explicit bitmap == NULL check as well]
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-05-05 15:26:31 -04:00
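
The fix described in "fix null-ptr-deref in io_file_bitmap_get()" has two parts: reset the allocation range and hint when the table is freed, and check for a missing bitmap before searching it. A minimal userspace sketch of both follows. struct toy_table and the toy_* functions are hypothetical, the bitmap is simplified to one byte per slot, and -1 stands in for whatever error code the kernel returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy table: one byte per slot (0 = free). Names are hypothetical. */
struct toy_table {
    unsigned char *bitmap;   /* NULL while no files are registered */
    unsigned alloc_start;    /* allocation range is [start, end) */
    unsigned alloc_end;
    unsigned alloc_hint;     /* where the next search begins */
};

static int toy_register(struct toy_table *t, unsigned nr)
{
    t->bitmap = calloc(nr, 1);
    if (!t->bitmap)
        return -1;
    t->alloc_start = 0;
    t->alloc_end = nr;
    t->alloc_hint = 0;
    return 0;
}

static void toy_unregister(struct toy_table *t)
{
    free(t->bitmap);
    t->bitmap = NULL;
    /* Part 1 of the fix: do not leave a stale range or hint behind. */
    t->alloc_start = 0;
    t->alloc_end = 0;
    t->alloc_hint = 0;
}

/* Find a free slot, or return a negative value if there is none. */
static int toy_bitmap_get(const struct toy_table *t)
{
    unsigned i;

    /* Part 2 of the fix: explicit check before touching the bitmap. */
    if (!t->bitmap)
        return -1;
    for (i = t->alloc_hint; i < t->alloc_end; i++)
        if (!t->bitmap[i])
            return (int)i;
    /* Wrap around and search the part of the range before the hint. */
    for (i = t->alloc_start; i < t->alloc_hint && i < t->alloc_end; i++)
        if (!t->bitmap[i])
            return (int)i;
    return -1;
}
```

Without the reset in toy_unregister(), a later toy_bitmap_get() would walk a range that still looks valid over a freed bitmap, which corresponds to the crash in the trace above.
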
Jeff Moyer 7675a0e1e0 io_uring/filetable: fix file reference underflow
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit 9d94c04c0db024922e886c9fd429659f22f48ea4
Author: Lin Ma <linma@zju.edu.cn>
Date:   Wed Nov 23 02:40:15 2022 +0800

    io_uring/filetable: fix file reference underflow
    
    There is an interesting reference bug when -ENOMEM occurs in a call to
    io_install_fixed_file(). The KASAN report is shown below:
    
    [   14.057131] ==================================================================
    [   14.059161] BUG: KASAN: use-after-free in unix_get_socket+0x10/0x90
    [   14.060975] Read of size 8 at addr ffff88800b09cf20 by task kworker/u8:2/45
    [   14.062684]
    [   14.062768] CPU: 2 PID: 45 Comm: kworker/u8:2 Not tainted 6.1.0-rc4 #1
    [   14.063099] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    [   14.063666] Workqueue: events_unbound io_ring_exit_work
    [   14.063936] Call Trace:
    [   14.064065]  <TASK>
    [   14.064175]  dump_stack_lvl+0x34/0x48
    [   14.064360]  print_report+0x172/0x475
    [   14.064547]  ? _raw_spin_lock_irq+0x83/0xe0
    [   14.064758]  ? __virt_addr_valid+0xef/0x170
    [   14.064975]  ? unix_get_socket+0x10/0x90
    [   14.065167]  kasan_report+0xad/0x130
    [   14.065353]  ? unix_get_socket+0x10/0x90
    [   14.065553]  unix_get_socket+0x10/0x90
    [   14.065744]  __io_sqe_files_unregister+0x87/0x1e0
    [   14.065989]  ? io_rsrc_refs_drop+0x1c/0xd0
    [   14.066199]  io_ring_exit_work+0x388/0x6a5
    [   14.066410]  ? io_uring_try_cancel_requests+0x5bf/0x5bf
    [   14.066674]  ? try_to_wake_up+0xdb/0x910
    [   14.066873]  ? virt_to_head_page+0xbe/0xbe
    [   14.067080]  ? __schedule+0x574/0xd20
    [   14.067273]  ? read_word_at_a_time+0xe/0x20
    [   14.067492]  ? strscpy+0xb5/0x190
    [   14.067665]  process_one_work+0x423/0x710
    [   14.067879]  worker_thread+0x2a2/0x6f0
    [   14.068073]  ? process_one_work+0x710/0x710
    [   14.068284]  kthread+0x163/0x1a0
    [   14.068454]  ? kthread_complete_and_exit+0x20/0x20
    [   14.068697]  ret_from_fork+0x22/0x30
    [   14.068886]  </TASK>
    [   14.069000]
    [   14.069088] Allocated by task 289:
    [   14.069269]  kasan_save_stack+0x1e/0x40
    [   14.069463]  kasan_set_track+0x21/0x30
    [   14.069652]  __kasan_slab_alloc+0x58/0x70
    [   14.069899]  kmem_cache_alloc+0xc5/0x200
    [   14.070100]  __alloc_file+0x20/0x160
    [   14.070283]  alloc_empty_file+0x3b/0xc0
    [   14.070479]  path_openat+0xc3/0x1770
    [   14.070689]  do_filp_open+0x150/0x270
    [   14.070888]  do_sys_openat2+0x113/0x270
    [   14.071081]  __x64_sys_openat+0xc8/0x140
    [   14.071283]  do_syscall_64+0x3b/0x90
    [   14.071466]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
    [   14.071791]
    [   14.071874] Freed by task 0:
    [   14.072027]  kasan_save_stack+0x1e/0x40
    [   14.072224]  kasan_set_track+0x21/0x30
    [   14.072415]  kasan_save_free_info+0x2a/0x50
    [   14.072627]  __kasan_slab_free+0x106/0x190
    [   14.072858]  kmem_cache_free+0x98/0x340
    [   14.073075]  rcu_core+0x427/0xe50
    [   14.073249]  __do_softirq+0x110/0x3cd
    [   14.073440]
    [   14.073523] Last potentially related work creation:
    [   14.073801]  kasan_save_stack+0x1e/0x40
    [   14.074017]  __kasan_record_aux_stack+0x97/0xb0
    [   14.074264]  call_rcu+0x41/0x550
    [   14.074436]  task_work_run+0xf4/0x170
    [   14.074619]  exit_to_user_mode_prepare+0x113/0x120
    [   14.074858]  syscall_exit_to_user_mode+0x1d/0x40
    [   14.075092]  do_syscall_64+0x48/0x90
    [   14.075272]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
    [   14.075529]
    [   14.075612] Second to last potentially related work creation:
    [   14.075900]  kasan_save_stack+0x1e/0x40
    [   14.076098]  __kasan_record_aux_stack+0x97/0xb0
    [   14.076325]  task_work_add+0x72/0x1b0
    [   14.076512]  fput+0x65/0xc0
    [   14.076657]  filp_close+0x8e/0xa0
    [   14.076825]  __x64_sys_close+0x15/0x50
    [   14.077019]  do_syscall_64+0x3b/0x90
    [   14.077199]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
    [   14.077448]
    [   14.077530] The buggy address belongs to the object at ffff88800b09cf00
    [   14.077530]  which belongs to the cache filp of size 232
    [   14.078105] The buggy address is located 32 bytes inside of
    [   14.078105]  232-byte region [ffff88800b09cf00, ffff88800b09cfe8)
    [   14.078685]
    [   14.078771] The buggy address belongs to the physical page:
    [   14.079046] page:000000001bd520e7 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800b09de00 pfn:0xb09c
    [   14.079575] head:000000001bd520e7 order:1 compound_mapcount:0 compound_pincount:0
    [   14.079946] flags: 0x100000000010200(slab|head|node=0|zone=1)
    [   14.080244] raw: 0100000000010200 0000000000000000 dead000000000001 ffff88800493cc80
    [   14.080629] raw: ffff88800b09de00 0000000080190018 00000001ffffffff 0000000000000000
    [   14.081016] page dumped because: kasan: bad access detected
    [   14.081293]
    [   14.081376] Memory state around the buggy address:
    [   14.081618]  ffff88800b09ce00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   14.081974]  ffff88800b09ce80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
    [   14.082336] >ffff88800b09cf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [   14.082690]                                ^
    [   14.082909]  ffff88800b09cf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
    [   14.083266]  ffff88800b09d000: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
    [   14.083622] ==================================================================
    
    The bug can be traced as follows:
    
    commit 8c71fe750215 ("io_uring: ensure fput() called correspondingly
    when direct install fails") adds an additional fput() in
    io_fixed_fd_install() when io_file_bitmap_get() returns error values. In
    that case, the routine will never make it to io_install_fixed_file() due
    to an early return.
    
    static int io_fixed_fd_install(...)
    {
      if (alloc_slot) {
        ...
        ret = io_file_bitmap_get(ctx);
        if (unlikely(ret < 0)) {
          io_ring_submit_unlock(ctx, issue_flags);
          fput(file);
          return ret;
        }
        ...
      }
      ...
      ret = io_install_fixed_file(req, file, issue_flags, file_slot);
      ...
    }
    
    In the above scenario, the reference is okay as io_fixed_fd_install()
    ensures the fput() is called when something bad happens, either via
    bitmap or via inner io_install_fixed_file().
    
    However, commit 61c1b44a21d7 ("io_uring: fix deadlock on iowq file
    slot alloc") breaks the balance because it places fput() into the common
    path for both io_file_bitmap_get() and io_install_fixed_file(). Since
    io_install_fixed_file() handles the fput() itself, a reference
    underflow then occurs.
    
    Some later commits changed the call chain to
    io_fixed_fd_install() -> __io_fixed_fd_install() ->
    io_install_fixed_file()
    
    However, the extra fput() is still there: it is called whenever
    io_install_fixed_file() calls fput(). Traversing through the code, I
    find that the two existing callers of __io_fixed_fd_install(),
    io_fixed_fd_install() and io_msg_send_fd(), both call fput() when
    handling an error return, so this patch simply removes the fput() in
    io_install_fixed_file() to fix the bug.
    
    Fixes: 61c1b44a21d7 ("io_uring: fix deadlock on iowq file slot alloc")
    Signed-off-by: Lin Ma <linma@zju.edu.cn>
    Link: https://lore.kernel.org/r/be4ba4b.5d44.184a0a406a4.Coremail.linma@zju.edu.cn
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-05-05 15:24:19 -04:00
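
The ownership rule behind "fix file reference underflow" is that exactly one layer drops the reference on the error path. The sketch below models that rule in userspace. The toy_* functions, the refs counter, and the fail parameter used to force an error are all hypothetical; none of this is the kernel implementation.

```c
#include <assert.h>
#include <errno.h>

/* Toy file object with a plain reference count (hypothetical). */
struct toy_file { int refs; };

static void toy_fput(struct toy_file *f)
{
    f->refs--;
}

/* Inner helper, modelled on the fixed behaviour: on failure it only
 * reports the error and leaves the reference alone. The 'fail' flag
 * simulates an allocation failure. */
static int toy_install_fixed_file(struct toy_file *f, int fail)
{
    (void)f;
    return fail ? -ENOMEM : 0;
}

/* Outer caller: the single place that drops the reference on error.
 * On success the reference is kept, now owned by the table. */
static int toy_fixed_fd_install(struct toy_file *f, int fail)
{
    int ret = toy_install_fixed_file(f, fail);

    if (ret < 0)
        toy_fput(f);
    return ret;
}
```

If toy_install_fixed_file() also called toy_fput() on failure, the count would drop twice for one reference, which is the underflow the commit removes.
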
Jeff Moyer 4b74f8a7f1 io_uring: let to set a range for file slot allocation
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit 6e73dffbb93cb8797cd4e42e98d837edf0f1a967
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Sat Jun 25 11:55:38 2022 +0100

    io_uring: let to set a range for file slot allocation
    
    io_uring recently gained an option to allocate a file index for
    operations registering fixed files. However, it's utterly unusable with
    mixed approaches, where userspace knows better where to place some of
    the files: allocation may race, and users don't have any sane way to
    pick a slot and be sure it will not be taken.
    
    Let userspace register a range of fixed file slots in which the
    auto-allocation happens. The use case is splitting the fixed table in
    two parts, where one of them is used for auto-allocation and the other for
    slot-specified operations.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/66ab0394e436f38437cf7c44676e1920d09687ad.1656154403.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-04-29 07:08:02 -04:00
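
The table split described in "let to set a range for file slot allocation" can be sketched as a toy allocator: auto-allocation only looks inside a registered range, while explicit slot installs use the rest of the table. struct toy_table and the toy_* functions are hypothetical and exist only for this example; the real interface is the io_uring registration call, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 8

/* Toy fixed file table with an auto-allocation window (hypothetical). */
struct toy_table {
    bool used[NR_SLOTS];
    unsigned alloc_start;   /* auto-allocation range is [start, end) */
    unsigned alloc_end;
};

/* Restrict auto-allocation to slots [off, off + len). */
static int toy_set_alloc_range(struct toy_table *t, unsigned off, unsigned len)
{
    if (off > NR_SLOTS || len > NR_SLOTS - off)
        return -1;
    t->alloc_start = off;
    t->alloc_end = off + len;
    return 0;
}

/* Auto-allocate: pick the first free slot inside the range only. */
static int toy_alloc_slot(struct toy_table *t)
{
    unsigned i;

    for (i = t->alloc_start; i < t->alloc_end; i++) {
        if (!t->used[i]) {
            t->used[i] = true;
            return (int)i;
        }
    }
    return -1;
}

/* Slot-specified install: the caller names the slot. */
static int toy_install_at(struct toy_table *t, unsigned slot)
{
    if (slot >= NR_SLOTS || t->used[slot])
        return -1;
    t->used[slot] = true;
    return 0;
}
```

With the range set to slots 4 and 5, explicit installs into slots 0 to 3 can never collide with an auto-allocated index, which is the race the commit message is concerned with.
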
Jeff Moyer 3f0281207a io_uring: split out fixed file installation and removal
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit f110ed8498afa6ff8e9a8c08fb26880e02117616
Author: Jens Axboe <axboe@kernel.dk>
Date:   Mon Jun 13 04:42:56 2022 -0600

    io_uring: split out fixed file installation and removal
    
    Put it with the filetable code, which is where it belongs. While doing
    so, have the helpers take a ctx rather than an io_kiocb. It doesn't make
    sense to use a request, as it's not an operation on the request itself.
    It applies to the ring itself.
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-04-29 07:06:02 -04:00
Jeff Moyer 8034d812f4 io_uring: kill extra io_uring_types.h includes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit 27a9d66fec77cff0e32d2ecd5d0ac7ef878a7bb0
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Thu Jun 16 13:57:18 2022 +0100

    io_uring: kill extra io_uring_types.h includes
    
    io_uring/io_uring.h already includes io_uring_types.h, no need to
    include it every time. Kill it in a bunch of places; this prepares us for
    the following patches.
    
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/94d8c943fbe0ef949981c508ddcee7fc1c18850f.1655384063.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-04-29 06:19:02 -04:00
Jeff Moyer 796fa3b9c5 io_uring: move remaining file table manipulation to filetable.c
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit c98817e6cd4471a6f6283813dd6efea162f5ac5f
Author: Jens Axboe <axboe@kernel.dk>
Date:   Thu May 26 09:44:31 2022 -0600

    io_uring: move remaining file table manipulation to filetable.c
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-04-29 05:40:02 -04:00
Jeff Moyer b082d23ed2 io_uring: separate out file table handling code
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237

commit 453b329be5eacfc48dd43035af82bc7f28ecfedf
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue May 24 21:43:10 2022 -0600

    io_uring: separate out file table handling code
    
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
2023-04-29 05:20:02 -04:00