Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
Artem Savkov	1e9cbbe0f6	bpf: Add __bpf_kfunc_{start,end}_defs macros JIRA: https://issues.redhat.com/browse/RHEL-23643 Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Conflicts: missing xdp commits, missing vma_task iterator commit 391145ba2accc48b596f3d438af1a6255b62a555 Author: Dave Marchevsky <davemarchevsky@fb.com> Date: Tue Oct 31 14:56:24 2023 -0700 bpf: Add __bpf_kfunc_{start,end}_defs macros BPF kfuncs are meant to be called from BPF programs. Accordingly, most kfuncs are not called from anywhere in the kernel, which the -Wmissing-prototypes warning is unhappy about. We've peppered __diag_ignore_all("-Wmissing-prototypes", ... everywhere kfuncs are defined in the codebase to suppress this warning. This patch adds two macros meant to bound one or many kfunc definitions. All existing kfunc definitions which use these __diag calls to suppress -Wmissing-prototypes are migrated to use the newly-introduced macros. A new __diag_ignore_all - for "-Wmissing-declarations" - is added to the __bpf_kfunc_start_defs macro based on feedback from Andrii on an earlier version of this patch [0] and another recent mailing list thread [1]. In the future we might need to ignore different warnings or do other kfunc-specific things. This change will make it easier to make such modifications for all kfunc defs. [0]: https://lore.kernel.org/bpf/CAEf4BzaE5dRWtK6RPLnjTW-MW9sx9K3Fn6uwqCTChK2Dcb1Xig@mail.gmail.com/ [1]: https://lore.kernel.org/bpf/ZT+2qCc%2FaXep0%2FLf@krava/ Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Cc: Jiri Olsa <olsajiri@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: David Vernet <void@manifault.com> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20231031215625.2343848-1-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2024-03-27 11:23:42 +01:00
Artem Savkov	534a34437e	bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg JIRA: https://issues.redhat.com/browse/RHEL-23643 commit 0de4f50de25af79c2a46db55d70cdbd8f985c6d1 Author: Chuyi Zhou <zhouchuyi@bytedance.com> Date: Tue Nov 7 21:22:03 2023 +0800 bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg BTF_TYPE_SAFE_TRUSTED(struct bpf_iter__task) in verifier.c wanted to teach BPF verifier that bpf_iter__task -> task is a trusted ptr. But it doesn't work well. The reason is, bpf_iter__task -> task would go through btf_ctx_access() which enforces the reg_type of 'task' is ctx_arg_info->reg_type, and in task_iter.c, we actually explicitly declare that the ctx_arg_info->reg_type is PTR_TO_BTF_ID_OR_NULL. Actually we have a previous case like this[1] where PTR_TRUSTED is added to the arg flag for map_iter. This patch sets ctx_arg_info->reg_type is PTR_TO_BTF_ID_OR_NULL | PTR_TRUSTED in task_reg_info. Similarly, bpf_cgroup_reg_info -> cgroup is also PTR_TRUSTED since we are under the protection of cgroup_mutex and we would check cgroup_is_dead() in __cgroup_iter_seq_show(). This patch is to improve the user experience of the newly introduced bpf_iter_css_task kfunc before hitting the mainline. The Fixes tag is pointing to the commit introduced the bpf_iter_css_task kfunc. Link[1]:https://lore.kernel.org/all/20230706133932.45883-3-aspsk@isovalent.com/ Fixes: 9c66dc94b62a ("bpf: Introduce css_task open-coded iterator kfuncs") Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231107132204.912120-2-zhouchuyi@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2024-03-27 10:27:58 +01:00
Artem Savkov	3fd10bb078	bpf: Introduce css open-coded iterator kfuncs JIRA: https://issues.redhat.com/browse/RHEL-23643 commit 7251d0905e7518bcb990c8e9a3615b1bb23c78f2 Author: Chuyi Zhou <zhouchuyi@bytedance.com> Date: Wed Oct 18 14:17:42 2023 +0800 bpf: Introduce css open-coded iterator kfuncs This patch adds kfuncs bpf_iter_css_{new,next,destroy} which allow creation and manipulation of struct bpf_iter_css in open-coded iterator style. These kfuncs actually wrap css_next_descendant_{pre, post}. css_iter can be used to: 1) iterate a specific cgroup tree in pre/post/up order 2) iterate a cgroup_subsystem in a BPF prog, like for_each_mem_cgroup_tree/cpuset_for_each_descendant_pre in the kernel. The API design is consistent with cgroup_iter. bpf_iter_css_new accepts parameters defining iteration order and starting css. Here we also reuse BPF_CGROUP_ITER_DESCENDANTS_PRE, BPF_CGROUP_ITER_DESCENDANTS_POST, BPF_CGROUP_ITER_ANCESTORS_UP enums. Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20231018061746.111364-5-zhouchuyi@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2024-03-27 10:27:54 +01:00
Artem Savkov	f6cd44c258	cgroup: bpf: use cgroup_lock()/cgroup_unlock() wrappers Bugzilla: https://bugzilla.redhat.com/2221599 Conflicts: missing 0083d27b21dd2 "cgroup: Improve cftype add/rm error handling" commit 4cdb91b0dea7d7f59fa84a13c7753cd434fdedcf Author: Kamalesh Babulal <kamalesh.babulal@oracle.com> Date: Fri Mar 3 15:23:10 2023 +0530 cgroup: bpf: use cgroup_lock()/cgroup_unlock() wrappers Replace mutex_[un]lock() with cgroup_[un]lock() wrappers to stay consistent across cgroup core and other subsystem code, while operating on the cgroup_mutex. Signed-off-by: Kamalesh Babulal <kamalesh.babulal@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2023-09-22 09:12:20 +02:00
Jerome Marchand	a846961ea9	bpf: Make struct cgroup btf id global Bugzilla: https://bugzilla.redhat.com/2177177 commit 5e67b8ef125bb6e83bf0f0442ad7ffc09e7956f9 Author: Yonghong Song <yhs@fb.com> Date: Tue Oct 25 21:28:40 2022 -0700 bpf: Make struct cgroup btf id global Make struct cgroup btf id global so later patch can reuse the same btf id. Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042840.672602-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jerome Marchand <jmarchan@redhat.com>	2023-04-28 11:42:58 +02:00
Artem Savkov	a038314072	bpf: Remove useless else if Bugzilla: https://bugzilla.redhat.com/2166911 commit ccf365eac0c7705591dee0158ae5c198d9e8f858 Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Date: Wed Aug 31 10:16:18 2022 +0800 bpf: Remove useless else if The assignment of the else and else if branches is the same, so the else if here is redundant, so we remove it and add a comment to make the code here readable. ./kernel/bpf/cgroup_iter.c:81:6-8: WARNING: possible condition with no effect (if == else). Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2016 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20220831021618.86770-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2023-03-06 14:54:26 +01:00
Artem Savkov	453fd2596d	bpf: Add CGROUP prefix to cgroup_iter_order Bugzilla: https://bugzilla.redhat.com/2166911 commit d4ffb6f39f1a1b260966b43a4ffdb64779c650dd Author: Hao Luo <haoluo@google.com> Date: Thu Aug 25 15:39:36 2022 -0700 bpf: Add CGROUP prefix to cgroup_iter_order bpf_cgroup_iter_order is globally visible but the entries do not have CGROUP prefix. As requested by Andrii, put a CGROUP in the names in bpf_cgroup_iter_order. This patch fixes two previous commits: one introduced the API and the other uses the API in bpf selftest (that is, the selftest cgroup_hierarchical_stats). I tested this patch via the following command: test_progs -t cgroup,iter,btf_dump Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Fixes: 88886309d2e8 ("selftests/bpf: add a selftest for cgroup hierarchical stats collection") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220825223936.1865810-1-haoluo@google.com Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2023-03-06 14:54:25 +01:00
Artem Savkov	a4b272755d	bpf: Pin the start cgroup in cgroup_iter_seq_init() Bugzilla: https://bugzilla.redhat.com/2166911 commit 1a5160d4d8fe63ba4964cfff4a85831b6af75f2d Author: Hou Tao <houtao1@huawei.com> Date: Mon Nov 21 15:34:38 2022 +0800 bpf: Pin the start cgroup in cgroup_iter_seq_init() bpf_iter_attach_cgroup() has already acquired an extra reference for the start cgroup, but the reference may be released if the iterator link fd is closed after the creation of iterator fd, and it may lead to a use-after-free problem when reading the iterator fd. An alternative fix is pinning the iterator link when opening the iterator, but that would leave the iterator link visible after the iterator link fd is closed, which differs from the behavior of other link types, so just fix it by acquiring another reference for the start cgroup. Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221121073440.1828292-2-houtao@huaweicloud.com Signed-off-by: Artem Savkov <asavkov@redhat.com>	2023-03-06 14:54:24 +01:00
Artem Savkov	4f54b76fde	bpf: cgroup_iter: support cgroup1 using cgroup fd Bugzilla: https://bugzilla.redhat.com/2166911 commit 35256d673a9cf723d9e2edb5d51e1b1b6b197ba3 Author: Yosry Ahmed <yosryahmed@google.com> Date: Tue Oct 11 00:33:59 2022 +0000 bpf: cgroup_iter: support cgroup1 using cgroup fd Use cgroup_v1v2_get_from_fd() in cgroup_iter to support attaching to both cgroup v1 and v2 using fds. Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2023-03-06 14:54:21 +01:00
Artem Savkov	08b66ec3e9	bpf: Introduce cgroup iter Bugzilla: https://bugzilla.redhat.com/2166911 commit d4ccaf58a8472123ac97e6db03932c375b5c45ba Author: Hao Luo <haoluo@google.com> Date: Wed Aug 24 16:31:13 2022 -0700 bpf: Introduce cgroup iter Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes: - walking a cgroup's descendants in pre-order. - walking a cgroup's descendants in post-order. - walking a cgroup's ancestors. - process only the given cgroup. When attaching cgroup_iter, one can set a cgroup to the iter_link created from attaching. This cgroup is passed as a file descriptor or cgroup id and serves as the starting point of the walk. If no cgroup is specified, the starting point will be the root cgroup v2. For walking descendants, one can specify the order: either pre-order or post-order. For walking ancestors, the walk starts at the specified cgroup and ends at the root. One can also terminate the walk early by returning 1 from the iter program. Note that because walking cgroup hierarchy holds cgroup_mutex, the iter program is called with cgroup_mutex held. Currently only one session is supported, which means, depending on the volume of data bpf program intends to send to user space, the number of cgroups that can be walked is limited. For example, given the current buffer size is 8 * PAGE_SIZE, if the program sends 64B data for each cgroup, assuming PAGE_SIZE is 4kb, the total number of cgroups that can be walked is 512. This is a limitation of cgroup_iter. If the output data is larger than the kernel buffer size, after all data in the kernel buffer is consumed by user space, the subsequent read() syscall will signal EOPNOTSUPP. In order to work around, the user may have to update their program to reduce the volume of data sent to output. For example, skip some uninteresting cgroups. In future, we may extend bpf_iter flags to allow customizing buffer size. Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-2-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Artem Savkov <asavkov@redhat.com>	2023-03-06 14:54:03 +01:00
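
The four walk modes listed in the "bpf: Introduce cgroup iter" message, together with the "return 1 to stop early" rule, can be modelled in plain userspace C. This is only an illustrative sketch, not kernel code: `struct node`, `enum walk_order`, `walk()` and `record()` are hypothetical names invented here, and the real iterator uses the kernel's css descendant helpers under `cgroup_mutex`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cgroup; NOT the kernel's struct cgroup. */
struct node {
    const char *name;
    struct node *parent;
    struct node *children[4];
    size_t nr_children;
};

/* Hypothetical mirror of the four modes named in the commit message. */
enum walk_order {
    WALK_SELF_ONLY,
    WALK_DESCENDANTS_PRE,
    WALK_DESCENDANTS_POST,
    WALK_ANCESTORS_UP,
};

/* A visitor returns 1 to terminate the walk early, 0 to continue. */
typedef int (*visit_fn)(const struct node *n, void *ctx);

/* Pre-order: visit the node first, then each subtree. */
static int walk_pre(const struct node *n, visit_fn fn, void *ctx)
{
    if (fn(n, ctx))
        return 1;
    for (size_t i = 0; i < n->nr_children; i++)
        if (walk_pre(n->children[i], fn, ctx))
            return 1;
    return 0;
}

/* Post-order: visit every subtree first, then the node itself. */
static int walk_post(const struct node *n, visit_fn fn, void *ctx)
{
    for (size_t i = 0; i < n->nr_children; i++)
        if (walk_post(n->children[i], fn, ctx))
            return 1;
    return fn(n, ctx);
}

/* Dispatch on the requested order; returns 1 if the visitor stopped the walk. */
static int walk(const struct node *start, enum walk_order order,
                visit_fn fn, void *ctx)
{
    assert(start != NULL);
    switch (order) {
    case WALK_DESCENDANTS_PRE:
        return walk_pre(start, fn, ctx);
    case WALK_DESCENDANTS_POST:
        return walk_post(start, fn, ctx);
    case WALK_ANCESTORS_UP:
        /* Starts at the given node and ends at the root. */
        for (const struct node *n = start; n; n = n->parent)
            if (fn(n, ctx))
                return 1;
        return 0;
    case WALK_SELF_ONLY:
    default:
        return fn(start, ctx);
    }
}

/* Example visitor state: visited names, plus an optional visit limit. */
struct trace {
    char buf[64];
    size_t visited;
    size_t limit;   /* 0 means "no limit" */
};

/* Appends the node name to the trace; stops the walk once limit is reached. */
static int record(const struct node *n, void *ctx)
{
    struct trace *t = ctx;

    strncat(t->buf, n->name, sizeof(t->buf) - strlen(t->buf) - 1);
    t->visited++;
    return t->limit && t->visited >= t->limit;
}
```

For a tree with root `r`, children `a` and `b`, and `c` under `a`, this model visits `r a c b` in pre-order, `c a b r` in post-order, and `c a r` when walking ancestors from `c`.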

10 Commits
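
The single-session limit quoted in the "bpf: Introduce cgroup iter" message (8 * PAGE_SIZE of buffer, 64 bytes per cgroup, 4 KiB pages, hence 512 cgroups) is simple integer arithmetic. The helper below reproduces it as a rough sketch. `ITER_BUF_PAGES` and `max_cgroups_per_session()` are hypothetical names chosen here, and the 8-page figure is taken from the commit message rather than checked against any particular kernel tree.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption from the commit message: the output buffer is 8 pages. */
#define ITER_BUF_PAGES 8

/*
 * Upper bound on how many cgroups fit into one read session when the
 * iter program emits a fixed number of bytes per cgroup.
 */
static size_t max_cgroups_per_session(size_t page_size, size_t bytes_per_cgroup)
{
    assert(bytes_per_cgroup > 0);
    return (ITER_BUF_PAGES * page_size) / bytes_per_cgroup;
}
```

With `page_size = 4096` and `bytes_per_cgroup = 64` this gives 32768 / 64 = 512, matching the figure in the message. Emitting less data per cgroup, or skipping uninteresting cgroups as the message suggests, raises the bound proportionally.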