JIRA: https://issues.redhat.com/browse/RHEL-64867
commit 340f634aa43d4172771a784da31e5d4c7c7d3126
Author: Jens Axboe <axboe@kernel.dk>
Date: Tue May 7 15:09:02 2024 -0600
io_uring/filetable: don't unnecessarily clear/reset bitmap
If we're updating an existing slot, we clear the slot bitmap only to
set it again right after. Just leave the bit set rather than toggle
it off and on, and move the unused slot setting into the branch of
not already having a file occupy this slot.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-36366
CVE: CVE-2023-52656
Conflicts: We backported commit 4f0b9194bc11 ("fs: Rename
anon_inode_getfile_secure() and anon_inode_getfd_secure()"), which
obviously causes a conflict in io_uring_get_file().
commit 6e5e6d274956305f1fc0340522b38f5f5be74bdb
Author: Jens Axboe <axboe@kernel.dk>
Date: Tue Dec 19 12:36:34 2023 -0700
io_uring: drop any code related to SCM_RIGHTS
This is dead code after we dropped support for passing io_uring fds
over SCM_RIGHTS, get rid of it.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-12076
commit 4bfb0c9af832a182a54e549123a634e0070c8d4f
Author: Christoph Hellwig <hch@lst.de>
Date: Tue Jun 20 13:32:35 2023 +0200
io_uring: add helpers to decode the fixed file file_ptr
Remove all the open coded magic on slot->file_ptr by introducing two
helpers that return the file pointer and the flags instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230620113235.920399-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-12076
commit c376644fb915fbdea8c4a04f859d032a8be352fd
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Tue Apr 18 14:06:36 2023 +0100
io_uring/rsrc: merge nodes and io_rsrc_put
struct io_rsrc_node carries a number of resources represented by struct
io_rsrc_put. That was handy before for sync overhead ammortisation, but
all complexity is gone and nodes are simple and lightweight. Let's
allocate a separate node for each resource.
Nodes and io_rsrc_put and not much different in size, and former are
cached, so node allocation should work better. That also removes some
overhead for nested iteration in io_rsrc_node_ref_zero() /
__io_rsrc_put_work().
Another reason for the patch is that it greatly reduces complexity
by moving io_rsrc_node_switch[_start]() inside io_queue_rsrc_removal(),
so users don't have to care about it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c7d3a45b30cc14cd93700a710dd112edc703db98.1681822823.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-12076
commit 63fea89027ff4fd4f350b471ad5b9220d373eec5
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Tue Apr 18 14:06:35 2023 +0100
io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal
For io_queue_rsrc_removal() we should always use the current active rsrc
node, don't pass it directly but let the function grab it from the
context.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d15939b4afea730978b4925685c2577538b823bb.1681822823.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-12076
commit c87fd583f3b5ef770af33893394ea37c7a10b5b8
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Thu Apr 13 15:28:13 2023 +0100
io_uring/rsrc: simplify single file node switching
At maximum io_install_fixed_file() removes only one file, so no need to
keep needs_switch state and we can call io_rsrc_node_switch() right after
removal.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/37cfb46f46160f81dced79f646e97db608994574.1681395792.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237
commit 9d94c04c0db024922e886c9fd429659f22f48ea4
Author: Lin Ma <linma@zju.edu.cn>
Date: Wed Nov 23 02:40:15 2022 +0800
io_uring/filetable: fix file reference underflow
There is an interesting reference bug when -ENOMEM occurs in calling of
io_install_fixed_file(). KASan report like below:
[ 14.057131] ==================================================================
[ 14.059161] BUG: KASAN: use-after-free in unix_get_socket+0x10/0x90
[ 14.060975] Read of size 8 at addr ffff88800b09cf20 by task kworker/u8:2/45
[ 14.062684]
[ 14.062768] CPU: 2 PID: 45 Comm: kworker/u8:2 Not tainted 6.1.0-rc4 #1
[ 14.063099] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 14.063666] Workqueue: events_unbound io_ring_exit_work
[ 14.063936] Call Trace:
[ 14.064065] <TASK>
[ 14.064175] dump_stack_lvl+0x34/0x48
[ 14.064360] print_report+0x172/0x475
[ 14.064547] ? _raw_spin_lock_irq+0x83/0xe0
[ 14.064758] ? __virt_addr_valid+0xef/0x170
[ 14.064975] ? unix_get_socket+0x10/0x90
[ 14.065167] kasan_report+0xad/0x130
[ 14.065353] ? unix_get_socket+0x10/0x90
[ 14.065553] unix_get_socket+0x10/0x90
[ 14.065744] __io_sqe_files_unregister+0x87/0x1e0
[ 14.065989] ? io_rsrc_refs_drop+0x1c/0xd0
[ 14.066199] io_ring_exit_work+0x388/0x6a5
[ 14.066410] ? io_uring_try_cancel_requests+0x5bf/0x5bf
[ 14.066674] ? try_to_wake_up+0xdb/0x910
[ 14.066873] ? virt_to_head_page+0xbe/0xbe
[ 14.067080] ? __schedule+0x574/0xd20
[ 14.067273] ? read_word_at_a_time+0xe/0x20
[ 14.067492] ? strscpy+0xb5/0x190
[ 14.067665] process_one_work+0x423/0x710
[ 14.067879] worker_thread+0x2a2/0x6f0
[ 14.068073] ? process_one_work+0x710/0x710
[ 14.068284] kthread+0x163/0x1a0
[ 14.068454] ? kthread_complete_and_exit+0x20/0x20
[ 14.068697] ret_from_fork+0x22/0x30
[ 14.068886] </TASK>
[ 14.069000]
[ 14.069088] Allocated by task 289:
[ 14.069269] kasan_save_stack+0x1e/0x40
[ 14.069463] kasan_set_track+0x21/0x30
[ 14.069652] __kasan_slab_alloc+0x58/0x70
[ 14.069899] kmem_cache_alloc+0xc5/0x200
[ 14.070100] __alloc_file+0x20/0x160
[ 14.070283] alloc_empty_file+0x3b/0xc0
[ 14.070479] path_openat+0xc3/0x1770
[ 14.070689] do_filp_open+0x150/0x270
[ 14.070888] do_sys_openat2+0x113/0x270
[ 14.071081] __x64_sys_openat+0xc8/0x140
[ 14.071283] do_syscall_64+0x3b/0x90
[ 14.071466] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 14.071791]
[ 14.071874] Freed by task 0:
[ 14.072027] kasan_save_stack+0x1e/0x40
[ 14.072224] kasan_set_track+0x21/0x30
[ 14.072415] kasan_save_free_info+0x2a/0x50
[ 14.072627] __kasan_slab_free+0x106/0x190
[ 14.072858] kmem_cache_free+0x98/0x340
[ 14.073075] rcu_core+0x427/0xe50
[ 14.073249] __do_softirq+0x110/0x3cd
[ 14.073440]
[ 14.073523] Last potentially related work creation:
[ 14.073801] kasan_save_stack+0x1e/0x40
[ 14.074017] __kasan_record_aux_stack+0x97/0xb0
[ 14.074264] call_rcu+0x41/0x550
[ 14.074436] task_work_run+0xf4/0x170
[ 14.074619] exit_to_user_mode_prepare+0x113/0x120
[ 14.074858] syscall_exit_to_user_mode+0x1d/0x40
[ 14.075092] do_syscall_64+0x48/0x90
[ 14.075272] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 14.075529]
[ 14.075612] Second to last potentially related work creation:
[ 14.075900] kasan_save_stack+0x1e/0x40
[ 14.076098] __kasan_record_aux_stack+0x97/0xb0
[ 14.076325] task_work_add+0x72/0x1b0
[ 14.076512] fput+0x65/0xc0
[ 14.076657] filp_close+0x8e/0xa0
[ 14.076825] __x64_sys_close+0x15/0x50
[ 14.077019] do_syscall_64+0x3b/0x90
[ 14.077199] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 14.077448]
[ 14.077530] The buggy address belongs to the object at ffff88800b09cf00
[ 14.077530] which belongs to the cache filp of size 232
[ 14.078105] The buggy address is located 32 bytes inside of
[ 14.078105] 232-byte region [ffff88800b09cf00, ffff88800b09cfe8)
[ 14.078685]
[ 14.078771] The buggy address belongs to the physical page:
[ 14.079046] page:000000001bd520e7 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800b09de00 pfn:0xb09c
[ 14.079575] head:000000001bd520e7 order:1 compound_mapcount:0 compound_pincount:0
[ 14.079946] flags: 0x100000000010200(slab|head|node=0|zone=1)
[ 14.080244] raw: 0100000000010200 0000000000000000 dead000000000001 ffff88800493cc80
[ 14.080629] raw: ffff88800b09de00 0000000080190018 00000001ffffffff 0000000000000000
[ 14.081016] page dumped because: kasan: bad access detected
[ 14.081293]
[ 14.081376] Memory state around the buggy address:
[ 14.081618] ffff88800b09ce00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 14.081974] ffff88800b09ce80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
[ 14.082336] >ffff88800b09cf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 14.082690] ^
[ 14.082909] ffff88800b09cf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
[ 14.083266] ffff88800b09d000: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[ 14.083622] ==================================================================
The actual tracing of this bug is shown below:
commit 8c71fe750215 ("io_uring: ensure fput() called correspondingly
when direct install fails") adds an additional fput() in
io_fixed_fd_install() when io_file_bitmap_get() returns error values. In
that case, the routine will never make it to io_install_fixed_file() due
to an early return.
static int io_fixed_fd_install(...)
{
if (alloc_slot) {
...
ret = io_file_bitmap_get(ctx);
if (unlikely(ret < 0)) {
io_ring_submit_unlock(ctx, issue_flags);
fput(file);
return ret;
}
...
}
...
ret = io_install_fixed_file(req, file, issue_flags, file_slot);
...
}
In the above scenario, the reference is okay as io_fixed_fd_install()
ensures the fput() is called when something bad happens, either via
bitmap or via inner io_install_fixed_file().
However, the commit 61c1b44a21d7 ("io_uring: fix deadlock on iowq file
slot alloc") breaks the balance because it places fput() into the common
path for both io_file_bitmap_get() and io_install_fixed_file(). Since
io_install_fixed_file() handles the fput() itself, the reference
underflow come across then.
There are some extra commits make the current code into
io_fixed_fd_install() -> __io_fixed_fd_install() ->
io_install_fixed_file()
However, the fact that there is an extra fput() is called if
io_install_fixed_file() calls fput(). Traversing through the code, I
find that the existing two callers to __io_fixed_fd_install():
io_fixed_fd_install() and io_msg_send_fd() have fput() when handling
error return, this patch simply removes the fput() in
io_install_fixed_file() to fix the bug.
Fixes: 61c1b44a21d7 ("io_uring: fix deadlock on iowq file slot alloc")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Link: https://lore.kernel.org/r/be4ba4b.5d44.184a0a406a4.Coremail.linma@zju.edu.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237
commit 6e73dffbb93cb8797cd4e42e98d837edf0f1a967
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Sat Jun 25 11:55:38 2022 +0100
io_uring: let to set a range for file slot allocation
From recently io_uring provides an option to allocate a file index for
operation registering fixed files. However, it's utterly unusable with
mixed approaches when for a part of files the userspace knows better
where to place it, as it may race and users don't have any sane way to
pick a slot and hoping it will not be taken.
Let the userspace to register a range of fixed file slots in which the
auto-allocation happens. The use case is splittting the fixed table in
two parts, where on of them is used for auto-allocation and another for
slot-specified operations.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/66ab0394e436f38437cf7c44676e1920d09687ad.1656154403.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237
commit f110ed8498afa6ff8e9a8c08fb26880e02117616
Author: Jens Axboe <axboe@kernel.dk>
Date: Mon Jun 13 04:42:56 2022 -0600
io_uring: split out fixed file installation and removal
Put it with the filetable code, which is where it belongs. While doing
so, have the helpers take a ctx rather than an io_kiocb. It doesn't make
sense to use a request, as it's not an operation on the request itself.
It applies to the ring itself.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2068237
commit 27a9d66fec77cff0e32d2ecd5d0ac7ef878a7bb0
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Thu Jun 16 13:57:18 2022 +0100
io_uring: kill extra io_uring_types.h includes
io_uring/io_uring.h already includes io_uring_types.h, no need to
include it every time. Kill it in a bunch of places, it prepares us for
following patches.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/94d8c943fbe0ef949981c508ddcee7fc1c18850f.1655384063.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>