Commit Graph

241 Commits

Author SHA1 Message Date
Jerome Marchand 4275e2f620 bpf: Add MEM_WRITE attribute
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 6fad274f06f038c29660aa53fbad14241c9fd976
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Mon Oct 21 17:28:05 2024 +0200

    bpf: Add MEM_WRITE attribute

    Add a MEM_WRITE attribute for BPF helper functions which can be used in
    bpf_func_proto to annotate an argument type in order to let the verifier
    know that the helper writes into the memory passed as an argument. In
    the past, MEM_UNINIT has been (ab)used for this purpose, but the latter
    merely tells the verifier that the passed memory can be uninitialized.

    Overloading the latter has led to bugs, and aside from that there are
    also cases where the passed memory is both read and written, which
    currently cannot be expressed; see also 4b3786a6c539 ("bpf: Zero former
    ARG_PTR_TO_{LONG,INT} args in case of error").

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20241021152809.33343-1-daniel@iogearbox.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:08 +01:00
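
For illustration, a minimal sketch of the resulting annotation style in a bpf_func_proto; the helper name here is hypothetical, only the flag combination follows the commit:

    static const struct bpf_func_proto bpf_example_fill_buf_proto = {
            .func           = bpf_example_fill_buf,
            .gpl_only       = false,
            .ret_type       = RET_INTEGER,
            /* arg1 may be passed uninitialized (MEM_UNINIT) and is
             * written into by the helper (MEM_WRITE), not merely read */
            .arg1_type      = ARG_PTR_TO_MEM | MEM_UNINIT | MEM_WRITE,
            .arg2_type      = ARG_CONST_SIZE,
    };
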
Jerome Marchand 2419f41261 bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 4b3786a6c5397dc220b1483d8e2f4867743e966f
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Sep 13 21:17:50 2024 +0200

    bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error

    For all non-tracing helpers which formerly had ARG_PTR_TO_{LONG,INT} as input
    arguments, zero the value in the case of an error, as otherwise it could leak
    memory. For tracing, this is not needed given CAP_PERFMON can already read all
    kernel memory anyway, hence bpf_get_func_arg() and bpf_get_func_ret() are
    skipped here.

    Also, the MTU helpers' mtu_len pointer value is being written but also read.
    Technically, the MEM_UNINIT should not be there in order to always force init.
    Removing MEM_UNINIT needs more verifier rework though: MEM_UNINIT right now
    implies two things: i) the memory is written into, ii) the memory does not
    have to be initialized. If we lift MEM_UNINIT, it then becomes: i) the memory
    is read, ii) the memory must be initialized. This means that for
    bpf_*_check_mtu() we would be re-adding the issue we are trying to fix, that
    is, the helper would again be able to write back into things like .rodata BPF
    maps. Follow-up work will rework the MEM_UNINIT semantics such that the
    intent can be better expressed. For now just clear *mtu_len on the error
    path, which can be lifted again later.

    Fixes: 8a67f2de9b1d ("bpf: expose bpf_strtol and bpf_strtoul to all program types")
    Fixes: d7a4cb9b67 ("bpf: Introduce bpf_strtol and bpf_strtoul helpers")
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/e5edd241-59e7-5e39-0ee5-a51e31b6840a@iogearbox.net
    Link: https://lore.kernel.org/r/20240913191754.13290-5-daniel@iogearbox.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:06 +01:00
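
A minimal sketch of the resulting error-path pattern, simplified from the strtol-style helpers described above (not the verbatim kernel code):

    BPF_CALL_4(bpf_strtol, const char *, buf, size_t, buf_len, u64, flags,
               s64 *, res)
    {
            long long _res;
            int err;

            *res = 0;       /* zero the output first so an error cannot
                             * leak previous memory contents to the caller */
            err = __bpf_strtoll(buf, buf_len, flags, &_res);
            if (err < 0)
                    return err;
            *res = _res;
            return err;
    }
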
Jerome Marchand e12894e8b8 bpf: Fix helper writes to read-only maps
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 32556ce93bc45c730829083cb60f95a2728ea48b
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Sep 13 21:17:48 2024 +0200

    bpf: Fix helper writes to read-only maps

    Lonial found an issue that despite user- and BPF-side frozen BPF map
    (like in case of .rodata), it was still possible to write into it from
    a BPF program side through specific helpers having ARG_PTR_TO_{LONG,INT}
    as arguments.

    In check_func_arg(), when the argument is of the mentioned type,
    meta->raw_mode is never set. Later, in check_helper_mem_access(), under the
    case of PTR_TO_MAP_VALUE as register base type, BPF_READ is assumed for the
    subsequent call to check_map_access_type(), and given the BPF map is
    read-only, the check succeeds.

    The helpers really need to be annotated as ARG_PTR_TO_{LONG,INT} | MEM_UNINIT
    when results are written into them as opposed to read out of them. The
    latter indicates that it's okay to pass a pointer to uninitialized memory
    as the memory is written to anyway.

    However, ARG_PTR_TO_{LONG,INT} is a special case of ARG_PTR_TO_FIXED_SIZE_MEM
    just with additional alignment requirement. So it is better to just get
    rid of the ARG_PTR_TO_{LONG,INT} special cases altogether and reuse the
    fixed size memory types. For this, add MEM_ALIGNED to additionally ensure
    alignment given these helpers write directly into the args via *<ptr> = val.
    The .arg*_size has been initialized to reflect the actual sizeof(*<ptr>).

    MEM_ALIGNED can only be used in combination with MEM_FIXED_SIZE annotated
    argument types, since in !MEM_FIXED_SIZE cases the verifier does not know
    the buffer size a priori and therefore cannot blindly write *<ptr> = val.

    Fixes: 57c3bb725a ("bpf: Introduce ARG_PTR_TO_{INT,LONG} arg types")
    Reported-by: Lonial Con <kongln9170@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
    Link: https://lore.kernel.org/r/20240913191754.13290-3-daniel@iogearbox.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:06 +01:00
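
A before/after sketch of one affected argument annotation, following the pattern the message describes (exact protos vary per helper):

    /* before: special-cased arg type; meta->raw_mode never gets set */
    .arg4_type      = ARG_PTR_TO_LONG,

    /* after: fixed-size memory that is written by the helper and must be
     * aligned, since the helper stores via *<ptr> = val */
    .arg4_type      = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED,
    .arg4_size      = sizeof(s64),
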
Jerome Marchand d2adc04eeb bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 7d71f59e028028f1160602121f40f45e89b3664e
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Sep 13 21:17:47 2024 +0200

    bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers

    Both the bpf_strtol() and bpf_strtoul() helpers passed a temporary
    "long long" (respectively "unsigned long long") to __bpf_strtoll() /
    __bpf_strtoull().

    Later, the result was checked for truncation via _res != ({unsigned,} long)_res,
    as the destination buffer for the BPF helpers was of type {unsigned,} long,
    which is 32-bit on 32-bit architectures.

    Given the latter was a bug in the helper signatures, and the destination
    buffer has since been adjusted to {s,u}64, the truncation check can now be
    removed.

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240913191754.13290-2-daniel@iogearbox.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:06 +01:00
Jerome Marchand d61b804e33 bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit cfe69c50b05510b24e26ccb427c7cc70beafd6c1
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Sep 13 21:17:46 2024 +0200

    bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit

    The bpf_strtol() and bpf_strtoul() helpers are currently broken on 32-bit:

    The argument type ARG_PTR_TO_LONG is BPF-side "long", not kernel-side "long",
    and is therefore always considered a fixed 64-bit type no matter whether the
    underlying architecture is 64- or 32-bit.

    This contract breaks for the two mentioned helpers since their BPF_CALL
    definitions were added with {unsigned,}long *res. Meaning, the transition
    from BPF-side "long" (BPF program) to kernel-side "long" (BPF helper) breaks
    here.

    Both helpers call __bpf_strtoll() with "long long" correctly, but later
    assign the result into a 32-bit "*(long *)" on 32-bit architectures. From a
    BPF program's point of view, this means the upper bits will be seen as
    uninitialised.

    Therefore, fix both BPF_CALL signatures to {s,u}64 types to resolve this
    situation.

    Now, also changing the uapi/bpf.h helper documentation, from which
    bpf_helper_defs.h is generated for BPF programs, is tricky: changing the
    signatures there to __{s,u}64 would trigger compiler warnings (incompatible
    pointer types passing 'long *' to parameter of type '__s64 *' (aka
    'long long *')) for existing BPF programs.

    Leaving the signatures as-is is fine, as from a BPF program's point of view
    they are still BPF-side "long" and thus equivalent to __{s,u}64 on both 64-
    and 32-bit underlying architectures.

    Note that bpf_strtol() and bpf_strtoul() are the only helpers with this issue.

    Fixes: d7a4cb9b67 ("bpf: Introduce bpf_strtol and bpf_strtoul helpers")
    Reported-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/481fcec8-c12c-9abb-8ecb-76c71c009959@iogearbox.net
    Link: https://lore.kernel.org/r/20240913191754.13290-1-daniel@iogearbox.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:06 +01:00
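
The gist of the signature change, shown as a sketch (only the second form is valid after the fix):

    /* before: kernel-side "long" is 32-bit on 32-bit arches, leaving the
     * upper half of the BPF-side 64-bit result uninitialised */
    BPF_CALL_4(bpf_strtol, const char *, buf, size_t, buf_len, u64, flags,
               long *, res)

    /* after: a fixed-width 64-bit type matches BPF-side "long" */
    BPF_CALL_4(bpf_strtol, const char *, buf, size_t, buf_len, u64, flags,
               s64 *, res)
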
Jerome Marchand 58244853b3 bpf: Export bpf_base_func_proto
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 866d571e6201cb8ccb18cb8407ab3ad3adb474b8
Author: Martin KaFai Lau <martin.lau@kernel.org>
Date:   Thu Aug 29 14:08:26 2024 -0700

    bpf: Export bpf_base_func_proto

    The bpf_testmod needs to use the bpf_tail_call helper in
    a later selftest patch. This patch is to EXPORT_SYMBOL_GPL
    the bpf_base_func_proto.

    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Link: https://lore.kernel.org/r/20240829210833.388152-5-martin.lau@linux.dev
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:01 +01:00
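
The change itself amounts to a one-line export in kernel/bpf/helpers.c (sketch):

    /* let modules such as bpf_testmod fetch base helper protos */
    EXPORT_SYMBOL_GPL(bpf_base_func_proto);
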
Jerome Marchand ff3de4085b bpf: Add bpf_copy_from_user_str kfunc
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 65ab5ac4df012388481d0414fcac1d5ac1721fb3
Author: Jordan Rome <linux@jordanrome.com>
Date:   Fri Aug 23 12:51:00 2024 -0700

    bpf: Add bpf_copy_from_user_str kfunc

    This adds a kfunc wrapper around strncpy_from_user,
    which can be called from sleepable BPF programs.

    This matches the non-sleepable 'bpf_probe_read_user_str'
    helper except it includes an additional 'flags'
    param, which allows consumers to clear the entire
    destination buffer on success or failure.

    Signed-off-by: Jordan Rome <linux@jordanrome.com>
    Link: https://lore.kernel.org/r/20240823195101.3621028-1-linux@jordanrome.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:00 +01:00
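
A hedged usage sketch from a sleepable BPF program; the extern declaration mirrors how such a kfunc is typically pulled in, and flags = 0 requests a plain copy:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    extern int bpf_copy_from_user_str(void *dst, u32 dst__sz,
                                      const void *unsafe_ptr__ign,
                                      u64 flags) __weak __ksym;

    char buf[256];

    SEC("fentry.s/do_sys_openat2")  /* sleepable: may fault in user pages */
    int BPF_PROG(on_open, int dfd, const char *filename)
    {
            /* copies a NUL-terminated string from user memory; a flag can
             * request zeroing the rest of the buffer on success/failure */
            bpf_copy_from_user_str(buf, sizeof(buf), filename, 0);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
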
Jerome Marchand 0c6ef49fe5 bpf: Support bpf_kptr_xchg into local kptr
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit b0966c724584a5a9fd7fb529de19807c31f27a45
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Tue Aug 13 21:24:23 2024 +0000

    bpf: Support bpf_kptr_xchg into local kptr

    Currently, users can only stash kptr into map values with bpf_kptr_xchg().
    This patch further supports stashing kptr into local kptr by adding local
    kptr as a valid destination type.

    When stashing into a local kptr, the btf_record in the program BTF is used
    instead of the btf_record in the map to search for the btf_field of the
    local kptr.

    The local kptr specific checks in check_reg_type() only apply when the
    source argument of bpf_kptr_xchg() is a local kptr. Therefore, make the
    scope of the check explicit, as the destination can now also be a local kptr.

    Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Signed-off-by: Amery Hung <amery.hung@bytedance.com>
    Link: https://lore.kernel.org/r/20240813212424.2871455-5-amery.hung@bytedance.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:00 +01:00
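
An illustrative sketch of what this enables; the type and program section are assumptions, and the usual kfunc declarations (bpf_obj_new, bpf_task_acquire, etc.) are presumed available:

    struct node {
            struct task_struct __kptr *task;  /* kptr field in a local kptr */
    };

    SEC("tc")
    int stash(struct __sk_buff *skb)
    {
            struct node *n = bpf_obj_new(typeof(*n));
            struct task_struct *t, *old;

            if (!n)
                    return 0;
            t = bpf_task_acquire(bpf_get_current_task_btf());
            if (t) {
                    /* the destination may now be a local kptr,
                     * not only a map value */
                    old = bpf_kptr_xchg(&n->task, t);
                    if (old)
                            bpf_task_release(old);
            }
            bpf_obj_drop(n);  /* releases any kptr still stashed in n */
            return 0;
    }
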
Jerome Marchand 9b5535c577 bpf: Rename ARG_PTR_TO_KPTR -> ARG_KPTR_XCHG_DEST
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit d59232afb0344e33e9399f308d9b4a03876e7676
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Tue Aug 13 21:24:22 2024 +0000

    bpf: Rename ARG_PTR_TO_KPTR -> ARG_KPTR_XCHG_DEST

    ARG_PTR_TO_KPTR is currently only used by the bpf_kptr_xchg helper.
    Although it limits reg types for that helper's first arg to
    PTR_TO_MAP_VALUE, any arbitrary mapval won't do: further custom
    verification logic ensures that the mapval reg being xchgd-into is
    pointing to a kptr field. If this is not the case, it's not safe to xchg
    into that reg's pointee.

    Let's rename the bpf_arg_type to more accurately describe the fairly
    specific expectations that this arg type encodes.

    This is a nonfunctional change.

    Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Signed-off-by: Amery Hung <amery.hung@bytedance.com>
    Link: https://lore.kernel.org/r/20240813212424.2871455-4-amery.hung@bytedance.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:27:00 +01:00
Jerome Marchand 1ae3972eba bpf: rename nocsr -> bpf_fastcall in verifier
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit ae010757a55b57c8b82628e8ea9b7da2269131d9
Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Thu Aug 22 01:41:07 2024 -0700

    bpf: rename nocsr -> bpf_fastcall in verifier

    The attribute used by the LLVM implementation of the feature has been
    changed from no_caller_saved_registers to bpf_fastcall (see [1]).
    This commit replaces references to nocsr with references to bpf_fastcall
    to keep the LLVM and kernel parts in sync.

    [1] https://github.com/llvm/llvm-project/pull/105417

    Acked-by: Yonghong Song <yonghong.song@linux.dev>
    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20240822084112.3257995-2-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:24:25 +01:00
Jerome Marchand a11edfe6ee bpf: Fix percpu address space issues
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 6d641ca50d7ec7d5e4e889c3f8ea22afebc2a403
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Sun Aug 11 18:13:33 2024 +0200

    bpf: Fix percpu address space issues

    In arraymap.c:

    In bpf_array_map_seq_start() and bpf_array_map_seq_next()
    cast return values from the __percpu address space to
    the generic address space via uintptr_t [1].

    Correct the declaration of pptr pointer in __bpf_array_map_seq_show()
    to void __percpu * and cast the value from the generic address
    space to the __percpu address space via uintptr_t [1].

    In hashtab.c:

    Assign the return value from bpf_mem_cache_alloc() to void pointer
    and cast the value to void __percpu ** (void pointer to percpu void
    pointer) before dereferencing.

    In memalloc.c:

    Explicitly declare __percpu variables.

    Cast obj to void __percpu **.

    In helpers.c:

    Cast ptr in BPF_CALL_1 and BPF_CALL_2 from generic address space
    to __percpu address space via const uintptr_t [1].

    Found by GCC's named address space checks.

    There were no changes in the resulting object files.

    [1] https://sparse.docs.kernel.org/en/latest/annotations.html#address-space-name

    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Andrii Nakryiko <andrii@kernel.org>
    Cc: Martin KaFai Lau <martin.lau@linux.dev>
    Cc: Eduard Zingerman <eddyz87@gmail.com>
    Cc: Song Liu <song@kernel.org>
    Cc: Yonghong Song <yonghong.song@linux.dev>
    Cc: John Fastabend <john.fastabend@gmail.com>
    Cc: KP Singh <kpsingh@kernel.org>
    Cc: Stanislav Fomichev <sdf@fomichev.me>
    Cc: Hao Luo <haoluo@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20240811161414.56744-1-ubizjak@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:24:25 +01:00
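
The cast pattern, as a simplified sketch; the intermediate uintptr_t is what keeps the named-address-space checks quiet:

    void __percpu *pptr = alloc_percpu(int);

    /* __percpu -> generic address space (e.g. to store as a plain void *) */
    void *cookie = (void *)(uintptr_t)pptr;

    /* generic -> __percpu, before handing it to per_cpu_ptr() and friends */
    pptr = (void __percpu *)(uintptr_t)cookie;
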
Jerome Marchand b936aa2a1f bpf: Allow bpf_current_task_under_cgroup() with BPF_CGROUP_*
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 7f6287417baf57754f47687c6ea1a749a0686ab0
Author: Matteo Croce <teknoraver@meta.com>
Date:   Mon Aug 19 18:28:05 2024 +0200

    bpf: Allow bpf_current_task_under_cgroup() with BPF_CGROUP_*

    The helper bpf_current_task_under_cgroup() is currently only allowed for
    tracing programs; allow its usage also in the BPF_CGROUP_* program types.

    Move the code from kernel/trace/bpf_trace.c to kernel/bpf/helpers.c,
    so it compiles also without CONFIG_BPF_EVENTS.

    This will be used in systemd-networkd to monitor sysctl writes, and to
    filter its own writes from others:
    https://github.com/systemd/systemd/pull/32212

    Signed-off-by: Matteo Croce <teknoraver@meta.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20240819162805.78235-3-technoboy85@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:24:24 +01:00
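
A hedged sketch of the now-possible usage from a cgroup program; populating the cgroup array with a cgroup fd happens from user space and is omitted:

    struct {
            __uint(type, BPF_MAP_TYPE_CGROUP_ARRAY);
            __uint(max_entries, 1);
            __type(key, u32);
            __type(value, u32);
    } cgrp_map SEC(".maps");

    SEC("cgroup/sysctl")
    int monitor_sysctl(struct bpf_sysctl *ctx)
    {
            /* returns 1 if current runs in the cgroup stored at index 0 */
            if (bpf_current_task_under_cgroup(&cgrp_map, 0))
                    return 1;       /* our own write: allow, skip logging */
            /* ...record the foreign sysctl write here... */
            return 1;               /* allow the write either way */
    }
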
Jerome Marchand 7c6ab9c4b7 bpf: Enable generic kfuncs for BPF_CGROUP_* programs
JIRA: https://issues.redhat.com/browse/RHEL-63880

Conflicts: Context change due to missing commit 53e380d21441 ("bpf:
Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from
bpf")

commit 67666479edf1e2b732f4d0ac797885e859a78de4
Author: Matteo Croce <teknoraver@meta.com>
Date:   Mon Aug 19 18:28:04 2024 +0200

    bpf: Enable generic kfuncs for BPF_CGROUP_* programs

    These kfuncs are enabled even in BPF_PROG_TYPE_TRACING, so they
    should be safe also in BPF_CGROUP_* programs.
    Since all BPF_CGROUP_* programs share the same hook,
    call register_btf_kfunc_id_set() only once.

    In enum btf_kfunc_hook, rename BTF_KFUNC_HOOK_CGROUP_SKB to a more
    generic BTF_KFUNC_HOOK_CGROUP, since it's used for all the cgroup
    related program types.

    Signed-off-by: Matteo Croce <teknoraver@meta.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20240819162805.78235-2-technoboy85@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-21 11:24:24 +01:00
Jerome Marchand cc5bbb3c66 bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id()
JIRA: https://issues.redhat.com/browse/RHEL-63880

commit 91b7fbf3936f5c27d1673264dc24a713290e2165
Author: Eduard Zingerman <eddyz87@gmail.com>
Date:   Mon Jul 22 16:38:37 2024 -0700

    bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id()

    The function bpf_get_smp_processor_id() is processed in a different
    way, depending on the arch:
    - on x86 verifier replaces call to bpf_get_smp_processor_id() with a
      sequence of instructions that modify only r0;
    - on riscv64 jit replaces call to bpf_get_smp_processor_id() with a
      sequence of instructions that modify only r0;
    - on arm64 jit replaces call to bpf_get_smp_processor_id() with a
      sequence of instructions that modify only r0 and tmp registers.

    These rewrites satisfy attribute no_caller_saved_registers contract.
    Allow rewrite of no_caller_saved_registers patterns for
    bpf_get_smp_processor_id() in order to use this function as a canary
    for no_caller_saved_registers tests.

    Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20240722233844.1406874-4-eddyz87@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2025-01-13 17:36:34 +01:00
Viktor Malik e45874f81b
bpf: Use __u64 to save the bits in bits iterator
JIRA: https://issues.redhat.com/browse/RHEL-30774

commit e1339383675063ae4760d81ffe13a79981841b8d
Author: Hou Tao <houtao1@huawei.com>
Date:   Wed Oct 30 18:05:15 2024 +0800

    bpf: Use __u64 to save the bits in bits iterator

    On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to
    bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the
    content of the u64. However, bits_copy is only 4 bytes, leading to stack
    corruption.

    The straightforward solution would be to replace u64 with unsigned long
    in bpf_iter_bits_new(). However, this introduces confusion and problems
    for 32-bit hosts because the size of ulong in a bpf program is 8 bytes,
    but it is treated as 4 bytes after being passed to bpf_iter_bits_new().

    Fix it by changing the type of both bits and bit_count from unsigned
    long to u64. However, the change is not enough. The main reason is that
    bpf_iter_bits_next() uses find_next_bit() to find the next bit and the
    pointer passed to find_next_bit() is an unsigned long pointer instead
    of a u64 pointer. For 32-bit little-endian host, it is fine but it is
    not the case for 32-bit big-endian host. Because under 32-bit big-endian
    host, the first iterated unsigned long will be the bits 32-63 of the u64
    instead of the expected bits 0-31. Therefore, in addition to changing
    the type, swap the two unsigned longs within the u64 for 32-bit
    big-endian host.

    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20241030100516.3633640-5-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-26 15:55:22 +01:00
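
A simplified sketch of the word swap described above (the kernel's actual helper may differ in detail):

    /* find_next_bit() walks "unsigned long"s; on a 32-bit big-endian host
     * the first ulong of a u64 holds bits 32-63, so swap the halves */
    static void swap_ulong_in_u64(u64 *bits, unsigned int nr)
    {
    #if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
            unsigned int i;

            for (i = 0; i < nr; i++)
                    bits[i] = (bits[i] >> 32) | (bits[i] << 32);
    #endif
    }
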
Viktor Malik d96494cede
bpf: Check the validity of nr_words in bpf_iter_bits_new()
JIRA: https://issues.redhat.com/browse/RHEL-30774

commit 393397fbdcad7396639d7077c33f86169184ba99
Author: Hou Tao <houtao1@huawei.com>
Date:   Wed Oct 30 18:05:14 2024 +0800

    bpf: Check the validity of nr_words in bpf_iter_bits_new()
    
    Check the validity of nr_words in bpf_iter_bits_new(). Without this
    check, when multiplication overflow occurs for nr_bits (e.g., when
    nr_words = 0x0400-0001, nr_bits becomes 64), stack corruption may occur
    due to bpf_probe_read_kernel_common(..., nr_bytes = 0x2000-0008).
    
    Fix it by limiting the maximum value of nr_words to 511. The value is
    derived from the current implementation of BPF memory allocator. To
    ensure compatibility if the BPF memory allocator's size limitation
    changes in the future, use the helper bpf_mem_alloc_check_size() to
    check whether nr_bytes is too large, and return -E2BIG instead of
    -ENOMEM for oversized nr_bytes.
    
    Fixes: 4665415975b0 ("bpf: Add bits iterator")
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20241030100516.3633640-4-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-26 15:55:21 +01:00
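
A sketch of the validation flow; the 511 limit and the helper follow the message, while the exact errno of the range check is illustrative:

    #define BITS_MAX_WORDS  511     /* from the BPF memalloc size limit */

            if (!nr_words || nr_words > BITS_MAX_WORDS)
                    return -E2BIG;
            nr_bytes = nr_words * sizeof(u64);      /* cannot overflow now */
            err = bpf_mem_alloc_check_size(false, nr_bytes);
            if (err)
                    return err;     /* -E2BIG rather than -ENOMEM */
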
Viktor Malik ae29dfba30
bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()
JIRA: https://issues.redhat.com/browse/RHEL-30774

commit 101ccfbabf4738041273ce64e2b116cf440dea13
Author: Hou Tao <houtao1@huawei.com>
Date:   Wed Oct 30 18:05:12 2024 +0800

    bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()
    
    bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
    bits are dynamically allocated. However, the check is incorrect and may
    cause a kmemleak as shown below:
    
    unreferenced object 0xffff88812628c8c0 (size 32):
      comm "swapper/0", pid 1, jiffies 4294727320
      hex dump (first 32 bytes):
    	b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
    	f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
      backtrace (crc 781e32cc):
    	[<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
    	[<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
    	[<00000000597124d6>] __alloc.isra.0+0x89/0xb0
    	[<000000004ebfffcd>] alloc_bulk+0x2af/0x720
    	[<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
    	[<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
    	[<000000008b616eac>] bpf_global_ma_init+0x19/0x30
    	[<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
    	[<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
    	[<00000000b119f72f>] kernel_init+0x20/0x160
    	[<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
    	[<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
    
    That is because nr_bits will be set to zero in bpf_iter_bits_next()
    after all bits have been iterated.
    
    Fix the issue by setting kit->bit to kit->nr_bits instead of setting
    kit->nr_bits to zero when the iteration completes in
    bpf_iter_bits_next(). In addition, use "!nr_bits || bits >= nr_bits" to
    check whether the iteration is complete and still use "nr_bits > 64" to
    indicate whether bits are dynamically allocated. The "!nr_bits" check is
    necessary because bpf_iter_bits_new() may fail before setting
    kit->nr_bits, and this condition will stop the iteration early instead
    of accessing the zeroed or freed kit->bits.
    
    Considering the initial value of kit->bits is -1 and the type of
    kit->nr_bits is unsigned int, change the type of kit->nr_bits to int.
    The potential overflow problem will be handled in the following patch.
    
    Fixes: 4665415975b0 ("bpf: Add bits iterator")
    Acked-by: Yafang Shao <laoar.shao@gmail.com>
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20241030100516.3633640-2-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-26 15:55:21 +01:00
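
The corrected checks as a simplified sketch, per the description above:

    /* bpf_iter_bits_next(), simplified: bail out first, so a failed
     * bpf_iter_bits_new() (nr_bits == 0) never dereferences kit->bits */
    int bit = kit->bit, nr_bits = kit->nr_bits;

    if (!nr_bits || bit >= nr_bits)
            return NULL;

    bit = find_next_bit(kit->bits, nr_bits, bit + 1);
    kit->bit = bit;                 /* park at nr_bits on exhaustion... */
    if (bit >= nr_bits)
            return NULL;            /* ...instead of zeroing nr_bits */
    return &kit->bit;

    /* bpf_iter_bits_destroy(): "> 64" again reliably means the bits
     * were dynamically allocated */
    if (kit->nr_bits > 64)
            bpf_mem_free(&bpf_global_ma, kit->bits);
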
Viktor Malik a7cc5b6671
bpf: Fix bpf_dynptr documentation comments
JIRA: https://issues.redhat.com/browse/RHEL-30774

commit 78746f93e903d022c692b9bb3a3e2570167b2dc2
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Thu Jun 13 10:19:25 2024 -0600

    bpf: Fix bpf_dynptr documentation comments
    
    The function argument names were changed but the doc comment was not.
    Fix htmldocs build warning by updating doc comments.
    
    Fixes: cce4c40b9606 ("bpf: treewide: Align kfunc signatures to prog point-of-view")
    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Link: https://lore.kernel.org/r/d0b0eb05f91e12e5795966153b11998d3fc1d433.1718295425.git.dxu@dxuuu.xyz
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-26 14:40:07 +01:00
Viktor Malik b41752a1a7
bpf: treewide: Align kfunc signatures to prog point-of-view
JIRA: https://issues.redhat.com/browse/RHEL-30774

Conflicts: Omitting bits for kfuncs and tests missing in RHEL due to
           unbackported upstream commits:
           67814c00de316 ("bpf, fsverity: Add kfunc bpf_get_fsverity_digest")
           3e1c6f35409f9 ("bpf: make common crypto API for TC/XDP programs")
           ac9c05e0e453c ("bpf: Add kfunc bpf_get_file_xattr")
           53e380d214419 ("bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf")
           e472f88891abb ("bpf: tcp: Support arbitrary SYN Cookie.")
           c313eae739b9a ("bpf: selftests: Add defrag selftests")

commit cce4c40b960673f9e020835def310f1e89d3a940
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Wed Jun 12 09:58:33 2024 -0600

    bpf: treewide: Align kfunc signatures to prog point-of-view

    Previously, kfunc declarations in bpf_kfuncs.h (and others) used "user
    facing" types for kfuncs prototypes while the actual kfunc definitions
    used "kernel facing" types. More specifically: bpf_dynptr vs
    bpf_dynptr_kern, __sk_buff vs sk_buff, and xdp_md vs xdp_buff.

    It wasn't an issue before, as the verifier allows aliased types.
    However, since we are now generating kfunc prototypes in vmlinux.h (in
    addition to keeping bpf_kfuncs.h around), this conflict creates
    compilation errors.

    Fix this conflict by using "user facing" types in kfunc definitions.
    This results in more casts, but otherwise has no additional runtime
    cost.

    Note, similar to 5b268d1ebcdc ("bpf: Have bpf_rdonly_cast() take a const
    pointer"), we also make kfuncs take const arguments where appropriate in
    order to make the kfunc more permissive.

    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Link: https://lore.kernel.org/r/b58346a63a0e66bc9b7504da751b526b0b189a67.1718207789.git.dxu@dxuuu.xyz
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-26 14:40:07 +01:00
Viktor Malik 185f48f422
bpf: Add bits iterator
JIRA: https://issues.redhat.com/browse/RHEL-30774

commit 4665415975b0827e9646cab91c61d02a6b364d59
Author: Yafang Shao <laoar.shao@gmail.com>
Date:   Fri May 17 10:30:33 2024 +0800

    bpf: Add bits iterator
    
    Add three new kfuncs for the bits iterator:
    - bpf_iter_bits_new
      Initialize a new bits iterator for a given memory area. Due to the
      limitation of bpf memalloc, the max number of words (8-byte units) that
      can be iterated over is limited to (4096 / 8).
    - bpf_iter_bits_next
      Get the next bit in a bpf_iter_bits
    - bpf_iter_bits_destroy
      Destroy a bpf_iter_bits
    
    The bits iterator facilitates the iteration of the bits of a memory area,
    such as cpumask. It can be used in any context and on any address.
    
    Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20240517023034.48138-2-laoar.shao@gmail.com

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-26 14:40:00 +01:00
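
A usage sketch from the BPF side, assuming the open-coded iterator kfuncs above plus the bpf_for_each() convenience macro from the selftests' bpf_misc.h:

    u64 mask = 0x00f0;      /* one 8-byte word to iterate over */
    int bit, cnt = 0;

    bpf_for_each(bits, bit, &mask, 1)
            cnt++;          /* visits bits 4..7, so cnt ends up as 4 */
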
Viktor Malik ad58a0e351
bpf: Defer work in bpf_timer_cancel_and_free
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit a6fcd19d7eac1335eb76bc16b6a66b7f574d1d69
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Jul 9 18:54:39 2024 +0000

    bpf: Defer work in bpf_timer_cancel_and_free
    
    Currently, the same case as the previous patch (two timer callbacks trying
    to cancel each other) can be invoked through bpf_map_update_elem as
    well, or more precisely, freeing map elements containing timers. Since
    this relies on hrtimer_cancel as well, it is prone to the same deadlock
    situation as the previous patch.
    
    It would be sufficient to use hrtimer_try_to_cancel to fix this problem,
    as the timer cannot be enqueued after async_cancel_and_free. Once
    async_cancel_and_free has been done, the timer must be reinitialized
    before it can be armed again. The callback running in parallel trying to
    arm the timer will fail, and freeing bpf_hrtimer without waiting is
    sufficient (given kfree_rcu), and bpf_timer_cb will return
    HRTIMER_NORESTART, preventing the timer from being rearmed again.
    
    However, there exists a UAF scenario where the callback arms the timer
    before entering this function and cancellation fails (due to the timer
    callback invoking this routine, or the target timer callback running
    concurrently). In such a case, if the timer expiration is significantly
    far in the future, the RCU grace period may expire before it, freeing the
    bpf_hrtimer state and along with it the struct hrtimer that is enqueued.
    
    Hence, it is clear cancellation needs to occur after
    async_cancel_and_free, and yet it cannot be done inline due to deadlock
    issues. We thus modify bpf_timer_cancel_and_free to defer work to the
    global workqueue, adding a work_struct alongside rcu_head (both used at
    _different_ points of time, so can share space).
    
    Update existing code comments to reflect the new state of affairs.
    
    Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20240709185440.1104957-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-19 07:40:49 +01:00
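
A simplified sketch of the shape of the fix; field and function names here are illustrative, not the kernel's:

    struct hrtimer_holder {
            struct hrtimer timer;
            /* used at different points of the object's lifetime,
             * so the two can share space */
            union {
                    struct rcu_head rcu;
                    struct work_struct cancel_work;
            };
    };

    static void deferred_cancel(struct work_struct *work)
    {
            struct hrtimer_holder *t =
                    container_of(work, struct hrtimer_holder, cancel_work);

            hrtimer_cancel(&t->timer);      /* safe: runs in wq context,
                                             * never inside a timer cb */
            kfree_rcu(t, rcu);              /* work_struct no longer used */
    }

    /* the free path then only defers, instead of cancelling inline: */
    INIT_WORK(&t->cancel_work, deferred_cancel);
    queue_work(system_unbound_wq, &t->cancel_work);
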
Viktor Malik 1690ef8821
bpf: Fail bpf_timer_cancel when callback is being cancelled
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit d4523831f07a267a943f0dde844bf8ead7495f13
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Jul 9 18:54:38 2024 +0000

    bpf: Fail bpf_timer_cancel when callback is being cancelled
    
    Given a schedule:
    
    timer1 cb			timer2 cb
    
    bpf_timer_cancel(timer2);	bpf_timer_cancel(timer1);
    
    Both bpf_timer_cancel calls would wait for the other callback to finish
    executing, introducing a lockup.
    
    Add an atomic_t count named 'cancelling' in bpf_hrtimer. This keeps
    track of all in-flight cancellation requests for a given BPF timer.
    Whenever cancelling a BPF timer, we must check if we have outstanding
    cancellation requests, and if so, we must fail the operation with an
    error (-EDEADLK) since cancellation is synchronous and waits for the
    callback to finish executing. This implies that we can enter a deadlock
    situation involving two or more timer callbacks executing in parallel
    and attempting to cancel one another.
    
    Note that we avoid incrementing the cancelling counter for the target
    timer (the one being cancelled) if bpf_timer_cancel is not invoked from
    a callback, to avoid spurious errors. The whole point of detecting
    cur->cancelling and returning -EDEADLK is to not enter a busy wait loop
    (which may or may not lead to a lockup). This does not apply in case the
    caller is in a non-callback context, the other side can continue to
    cancel as it sees fit without running into errors.
    
    Background on prior attempts:
    
    Earlier versions of this patch used a bool 'cancelling' bit and used the
    following pattern under timer->lock to publish cancellation status.
    
    lock(t->lock);
    t->cancelling = true;
    mb();
    if (cur->cancelling)
    	return -EDEADLK;
    unlock(t->lock);
    hrtimer_cancel(t->timer);
    t->cancelling = false;
    
    The store outside the critical section could overwrite a parallel
    request's t->cancelling assignment to true, which was meant to ensure
    that the concurrently executing callback observes its cancellation status.
    
    It would be necessary to clear this cancelling bit once hrtimer_cancel
    is done, but lack of serialization introduced races. Another option was
    explored where bpf_timer_start would clear the bit when (re)starting the
    timer under timer->lock. This would ensure serialized access to the
    cancelling bit, but may allow it to be cleared before in-flight
    hrtimer_cancel has finished executing, such that lockups can occur
    again.
    
    Thus, we choose an atomic counter to keep track of all outstanding
    cancellation requests and use it to prevent lockups in case callbacks
    attempt to cancel each other while executing in parallel.
    
    Reported-by: Dohyun Kim <dohyunkim@google.com>
    Reported-by: Neel Natu <neelnatu@google.com>
    Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20240709185440.1104957-2-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-19 07:40:48 +01:00
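
A simplified sketch of the counting scheme; not the kernel's exact code, and "cur" stands for the timer whose callback is currently executing, if any:

    static int timer_cancel(struct bpf_hrtimer *t, struct bpf_hrtimer *cur)
    {
            int ret;

            if (cur) {                      /* called from a timer callback */
                    atomic_inc(&t->cancelling);   /* announce: cancelling t */
                    if (atomic_read(&cur->cancelling)) {
                            /* someone is cancelling us; blocking below
                             * could (busy-)wait forever, so fail */
                            atomic_dec(&t->cancelling);
                            return -EDEADLK;
                    }
            }
            ret = hrtimer_cancel(&t->timer);  /* waits for t's callback */
            if (cur)
                    atomic_dec(&t->cancelling);
            return ret;
    }
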
Viktor Malik e1bbb496ab
bpf: helpers: fix bpf_wq_set_callback_impl signature
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit f56f4d541eab1ae060a46b56dd6ec9130d6e3a98
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Mon Jul 8 11:52:57 2024 +0200

    bpf: helpers: fix bpf_wq_set_callback_impl signature
    
    I realized this while having a map containing both a struct bpf_timer and
    a struct bpf_wq: the third argument provided to the bpf_wq callback is
    not the struct bpf_wq pointer itself, but the pointer to the value in
    the map.
    
    Which means that the users need to double cast the provided "value" as
    this is not a struct bpf_wq *.
    
    This is a change of API, but there don't seem to be many users of bpf_wq
    yet, so we should be able to go with this now.
    
    Fixes: 81f1d7a583fa ("bpf: wq: add bpf_wq_set_callback_impl")
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240708-fix-wq-v2-1-667e5c9fbd99@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-19 07:40:48 +01:00
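
A sketch of the API shape after this fix; the callback's third argument is the map value that embeds the struct bpf_wq, and bpf_wq_set_callback() stands for the bpf_experimental.h wrapper around the _impl kfunc:

    struct elem {
            struct bpf_wq w;
    };

    struct {
            __uint(type, BPF_MAP_TYPE_HASH);
            __uint(max_entries, 1);
            __type(key, int);
            __type(value, struct elem);
    } wq_map SEC(".maps");

    static int wq_cb(void *map, int *key, void *value)
    {
            /* "value" is the map value (a struct elem *), not a bpf_wq * */
            return 0;
    }

    SEC("tc")
    int schedule_wq(struct __sk_buff *skb)
    {
            int key = 0;
            struct elem *e = bpf_map_lookup_elem(&wq_map, &key);

            if (!e)
                    return 0;
            if (bpf_wq_init(&e->w, &wq_map, 0))
                    return 0;
            if (bpf_wq_set_callback(&e->w, wq_cb, 0))
                    return 0;
            bpf_wq_start(&e->w, 0);
            return 0;
    }
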
Viktor Malik f4776cf0c8
bpf: Introduce bpf_preempt_[disable,enable] kfuncs
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit fc7566ad0a826cdc8886c5dbbb39ce72a0dc6333
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Wed Apr 24 03:13:14 2024 +0000

    bpf: Introduce bpf_preempt_[disable,enable] kfuncs
    
    Introduce two new BPF kfuncs, bpf_preempt_disable and
    bpf_preempt_enable. These kfuncs allow disabling preemption in BPF
    programs. Nesting is allowed, since the intended use cases includes
    building native BPF spin locks without kernel helper involvement. Apart
    from that, this can be used to protect per-CPU data structures for cases where
    programs (or userspace) may preempt one or the other. Currently, while
    per-CPU access is stable, whether it will be consistent is not
    guaranteed, as only migration is disabled for BPF programs.
    
    Global functions are disallowed from being called, but support for them
    will be added as a follow-up, not just for preempt kfuncs but for
    rcu_read_lock kfuncs as well. Static subprog calls are permitted. Sleepable helpers
    and kfuncs are disallowed in non-preemptible regions.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20240424031315.2757363-2-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:52 +01:00
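
A usage sketch pairing the new kfuncs; the extern declarations show how they would typically be made visible to a program:

    extern void bpf_preempt_disable(void) __weak __ksym;
    extern void bpf_preempt_enable(void) __weak __ksym;

    SEC("tc")
    int pcpu_update(struct __sk_buff *skb)
    {
            bpf_preempt_disable();
            /* per-CPU state touched here cannot be preempted by another
             * program on this CPU; sleepable helpers/kfuncs are rejected
             * by the verifier inside this region */
            bpf_preempt_enable();   /* must balance before returning */
            return 0;
    }
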
Viktor Malik f186f437f9
bpf: Don't check for recursion in bpf_wq_work.
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit dc92febf7b93da5049fe177804e6b1961fcc6bd7
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Apr 24 09:00:23 2024 -0700

    bpf: Don't check for recursion in bpf_wq_work.
    
    __bpf_prog_enter_sleepable_recur does a recursion check which is not
    applicable to the wq callback. The callback function is part of a bpf
    program and the bpf prog might be running on the same cpu, so the recursion
    check would incorrectly prevent the callback from running. The code could
    call __bpf_prog_enter_sleepable(), but run_ctx would be fake; hence use an
    explicit rcu_read_lock_trace(); migrate_disable(); pair to address this
    problem. Another reason to open-code this is that __bpf_prog_enter* is not
    available in !JIT configs.
    
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/oe-kbuild-all/202404241719.IIGdpAku-lkp@intel.com/
    Closes: https://lore.kernel.org/oe-kbuild-all/202404241811.FFV4Bku3-lkp@intel.com/
    Fixes: eb48f6cd41a0 ("bpf: wq: add bpf_wq_init")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:51 +01:00
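
A sketch of the open-coded sequence (simplified):

    rcu_read_lock_trace();          /* what sleepable-prog entry provides */
    migrate_disable();
    ret = cb(map, key, value);      /* run the wq callback directly,
                                     * without the recursion check */
    migrate_enable();
    rcu_read_unlock_trace();
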
Viktor Malik d4a565ac3f
bpf: add bpf_wq_start
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit 8e83da9732d91c60fdc651b2486c8e5935eb0ca2
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:15 2024 +0200

    bpf: add bpf_wq_start
    
    again, copy/paste from bpf_timer_start().
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-15-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:51 +01:00
Viktor Malik 137412600f
bpf: wq: add bpf_wq_set_callback_impl
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit 81f1d7a583fa1fa14f0c4e6140d34b5e3d08d227
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:13 2024 +0200

    bpf: wq: add bpf_wq_set_callback_impl
    
    To support sleepable async callbacks, we need to tell push_async_cb()
    whether the cb is sleepable or not.
    
    The verifier now detects that we are in bpf_wq_set_callback_impl and
    can allow a sleepable callback to happen.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-13-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:51 +01:00
Viktor Malik 4e3796d78d
bpf: wq: add bpf_wq_init
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit eb48f6cd41a0f7803770a76bbffb6bd5b1b2ae2f
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:11 2024 +0200

    bpf: wq: add bpf_wq_init
    
    We need to teach the verifier about the second argument which is declared
    as void * but which is of type KF_ARG_PTR_TO_MAP. We could have dropped
    this extra case if we declared the second argument as struct bpf_map *,
    but that means users will have to do extra casting to have their program
    compile.
    
    We also need to duplicate the timer code that checks whether the map
    argument matches the provided workqueue.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-11-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:50 +01:00
Viktor Malik 1969d68a5e
bpf: allow struct bpf_wq to be embedded in arraymaps and hashmaps
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit 246331e3f1eac905170a923f0ec76725c2558232
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:09 2024 +0200

    bpf: allow struct bpf_wq to be embedded in arraymaps and hashmaps
    
    Currently bpf_wq_cancel_and_free() is just a placeholder as there is
    no memory allocation for bpf_wq just yet.
    
    Again, this duplicates the bpf_timer approach.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-9-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:50 +01:00
Viktor Malik 4c29aac18f
bpf: replace bpf_timer_cancel_and_free with a generic helper
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit fc22d9495f0b32d75b5d25a17b300b7aad05c55d
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:04 2024 +0200

    bpf: replace bpf_timer_cancel_and_free with a generic helper
    
    For the same reason as most bpf_timer* functions, we need almost the same for
    workqueues.
    So extract the generic part out of it so bpf_wq_cancel_and_free can reuse
    it.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-4-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:49 +01:00
Viktor Malik 17f047b4b2
bpf: replace bpf_timer_set_callback with a generic helper
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit 073f11b0264310b85754b6a0946afee753790c66
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:03 2024 +0200

    bpf: replace bpf_timer_set_callback with a generic helper
    
    In the same way we have a generic __bpf_async_init(), we also need
    to share code between timer and workqueue for the set_callback call.
    
    We just add an unused flags parameter, as it will be used for workqueues.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-3-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:49 +01:00
Viktor Malik 8bc4b8defe
bpf: replace bpf_timer_init with a generic helper
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit 56b4a177ae6322173360a93ea828ad18570a5a14
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:02 2024 +0200

    bpf: replace bpf_timer_init with a generic helper
    
    No code change except for the new flags argument being stored in the
    local data struct.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-2-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:49 +01:00
Viktor Malik 5786963723
bpf: make timer data struct more generic
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit be2749beff62e0d63cf97fe63cabc79a68443139
Author: Benjamin Tissoires <bentiss@kernel.org>
Date:   Sat Apr 20 11:09:01 2024 +0200

    bpf: make timer data struct more generic
    
    To be able to add workqueues and reuse most of the timer code, we need
    to make bpf_hrtimer more generic.
    
    There is no code change except that the new struct gets a new u64 flags
    attribute. We are still below 2 cache lines, so this shouldn't impact
    the currently running code.
    
    The ordering is also changed. Everything related to async callback
    is now on top of bpf_hrtimer.
    
    Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-1-6c986a5a741f@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:49 +01:00
Viktor Malik 013281ada1
bpf: Fix typos in comments
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit a7de265cb2d849f8986a197499ad58dca0a4f209
Author: Rafael Passos <rafael@rcpassos.me>
Date:   Wed Apr 17 15:49:14 2024 -0300

    bpf: Fix typos in comments
    
    Found the following typos in comments, and fixed them:
    
    s/unpriviledged/unprivileged/
    s/reponsible/responsible/
    s/possiblities/possibilities/
    s/Divison/Division/
    s/precsion/precision/
    s/havea/have a/
    s/reponsible/responsible/
    s/responsibile/responsible/
    s/tigher/tighter/
    s/respecitve/respective/
    
    Signed-off-by: Rafael Passos <rafael@rcpassos.me>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/6af7deb4-bb24-49e8-b3f1-8dd410597337@smtp-relay.sendinblue.com

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-11 07:44:49 +01:00
Viktor Malik 49ac3b5c59
bpf: Allow invoking kfuncs from BPF_PROG_TYPE_SYSCALL progs
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit a8e03b6bbb2cc7cf387d1ce335e4ce4c3bdfef9b
Author: David Vernet <void@manifault.com>
Date:   Fri Apr 5 09:30:40 2024 -0500

    bpf: Allow invoking kfuncs from BPF_PROG_TYPE_SYSCALL progs
    
    Currently, a set of core BPF kfuncs (e.g. bpf_task_*, bpf_cgroup_*,
    bpf_cpumask_*, etc) cannot be invoked from BPF_PROG_TYPE_SYSCALL
    programs. The whitelist approach taken for enabling kfuncs makes sense:
    it not safe to call these kfuncs from every program type. For example,
    it may not be safe to call bpf_task_acquire() in an fentry to
    free_task().
    
    BPF_PROG_TYPE_SYSCALL, on the other hand, is a perfectly safe program
    type from which to invoke these kfuncs, as it's a very controlled
    environment, and we should never be able to run into any of the typical
    problems such as recursive invocations, acquiring references on freeing
    kptrs, etc. Being able to invoke these kfuncs would be useful, as
    BPF_PROG_TYPE_SYSCALL can be invoked with BPF_PROG_RUN, and would
    therefore enable user space programs to synchronously call into BPF to
    manipulate these kptrs.
    
    This patch therefore enables invoking the aforementioned core kfuncs
    from BPF_PROG_TYPE_SYSCALL progs.
    
    Signed-off-by: David Vernet <void@manifault.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Yonghong Song <yonghong.song@linux.dev>
    Link: https://lore.kernel.org/bpf/20240405143041.632519-2-void@manifault.com

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-07 13:58:46 +01:00
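
A hedged sketch of what this permits: a BPF_PROG_TYPE_SYSCALL program, run synchronously via BPF_PROG_RUN, calling one of the core kfuncs (declarations as they would appear via vmlinux.h/__ksym):

    extern struct task_struct *bpf_task_from_pid(s32 pid) __weak __ksym;
    extern void bpf_task_release(struct task_struct *p) __weak __ksym;

    struct args {
            int pid;
    };

    SEC("syscall")
    int touch_task(struct args *ctx)
    {
            struct task_struct *t = bpf_task_from_pid(ctx->pid);

            if (t)
                    bpf_task_release(t);    /* drop the acquired reference */
            return 0;
    }
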
Viktor Malik 57c548a158
bpf: add bpf_modify_return_test_tp() kfunc triggering tracepoint
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit 3124591f686115aca25d772c2ccb7b1e202c3197
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Tue Mar 26 09:21:50 2024 -0700

    bpf: add bpf_modify_return_test_tp() kfunc triggering tracepoint
    
    Add a simple bpf_modify_return_test_tp() kfunc, available to all program
    types, that is useful for various testing and benchmarking scenarios, as
    it allows triggering most tracing BPF program types from the BPF side,
    enabling complex testing and benchmarking scenarios.
    
    fmod_ret programs can also attach to it, making it a good and simple
    way to trigger an fmod_ret program under test/benchmark.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240326162151.3981687-6-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-07 13:58:36 +01:00
Viktor Malik bfe07a51a2
bpf: Allow helper bpf_get_[ns_]current_pid_tgid() for all prog types
JIRA: https://issues.redhat.com/browse/RHEL-30773

commit eb166e522c77699fc19bfa705652327a1e51a117
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Fri Mar 15 11:48:54 2024 -0700

    bpf: Allow helper bpf_get_[ns_]current_pid_tgid() for all prog types
    
    Currently bpf_get_current_pid_tgid() is allowed in tracing, cgroup
    and sk_msg progs while bpf_get_ns_current_pid_tgid() is only allowed
    in tracing progs.
    
    We have an internal use case where for an application running
    in a container (with pid namespace), user wants to get
    the pid associated with the pid namespace in a cgroup bpf
    program. Currently, cgroup bpf progs already allow
    bpf_get_current_pid_tgid(). Let us allow bpf_get_ns_current_pid_tgid()
    as well.
    
    Auditing the code shows that bpf_get_current_pid_tgid() is also used
    by the sk_msg prog. But there is no side effect in exposing these two
    helpers to all prog types since they do not reveal any kernel-specific
    data. The detailed discussion is in [1].
    
    So with this patch, both bpf_get_current_pid_tgid() and bpf_get_ns_current_pid_tgid()
    are put in bpf_base_func_proto(), making them available to all
    program types.
    
      [1] https://lore.kernel.org/bpf/20240307232659.1115872-1-yonghong.song@linux.dev/
    
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/20240315184854.2975190-1-yonghong.song@linux.dev

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-11-07 13:58:30 +01:00
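
A hedged usage sketch from a cgroup program; dev/ino identify the target pid namespace and would be filled in by user space (e.g. from stat() on /proc/self/ns/pid):

    const volatile __u64 pidns_dev, pidns_ino;      /* set before load */

    SEC("cgroup/skb")
    int egress(struct __sk_buff *skb)
    {
            struct bpf_pidns_info ns = {};

            if (!bpf_get_ns_current_pid_tgid(pidns_dev, pidns_ino,
                                             &ns, sizeof(ns)))
                    bpf_printk("pid in ns: %u", ns.pid);
            return 1;
    }
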
Jerome Marchand 63431baf9b bpf: fix warning for crash_kexec
JIRA: https://issues.redhat.com/browse/RHEL-23649

commit 96b98a6552a90690d7bc18dd71b66312c9ded1fb
Author: Hari Bathini <hbathini@linux.ibm.com>
Date:   Tue Mar 19 13:31:52 2024 +0530

    bpf: fix warning for crash_kexec

    With [1], crash dump specific code is moved out of CONFIG_KEXEC_CORE
    and placed under CONFIG_CRASH_DUMP, where it is more appropriate.
    And since the CONFIG_KEXEC && !CONFIG_CRASH_DUMP build combination is
    supported with that, it led to the below warning:

      "WARN: resolve_btfids: unresolved symbol crash_kexec"

    Fix it by using the appropriate #ifdef.

    [1] https://lore.kernel.org/all/20240124051254.67105-1-bhe@redhat.com/

    Acked-by: Baoquan He <bhe@redhat.com>
    Fixes: 02aff8480533 ("crash: split crash dumping code out from kexec_core.c")
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Acked-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
    Link: https://lore.kernel.org/r/20240319080152.36987-1-hbathini@linux.ibm.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:17 +02:00
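
The fix amounts to guarding the BTF ID with the config option that now owns the symbol (sketch):

    #ifdef CONFIG_CRASH_DUMP        /* was: CONFIG_KEXEC_CORE */
    BTF_ID_FLAGS(func, crash_kexec, KF_DESTRUCTIVE)
    #endif
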
Jerome Marchand bc428f1615 bpf: Mark bpf_spin_{lock,unlock}() helpers with notrace correctly
JIRA: https://issues.redhat.com/browse/RHEL-23649

commit 178c54666f9c4d2f49f2ea661d0c11b52f0ed190
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Tue Feb 6 23:01:02 2024 -0800

    bpf: Mark bpf_spin_{lock,unlock}() helpers with notrace correctly

    Currently tracing is supposed not to allow for bpf_spin_{lock,unlock}()
    helper calls. This is to prevent deadlock for the following cases:
      - there is a prog (prog-A) calling bpf_spin_{lock,unlock}().
      - there is a tracing program (prog-B), e.g., fentry, attached
        to bpf_spin_lock() and/or bpf_spin_unlock().
      - prog-B calls bpf_spin_{lock,unlock}().
    For such a case, when prog-A calls bpf_spin_{lock,unlock}(),
    a deadlock will happen.

    The related source codes are below in kernel/bpf/helpers.c:
      notrace BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock)
      notrace BPF_CALL_1(bpf_spin_unlock, struct bpf_spin_lock *, lock)
    notrace is supposed to prevent fentry prog from attaching to
    bpf_spin_{lock,unlock}().

    But actually this is not the case and an fentry prog can successfully
    attach to bpf_spin_lock(). Siddharth Chintamaneni reported
    the issue in [1]. The following is the macro definition for
    above BPF_CALL_1:
      #define BPF_CALL_x(x, name, ...)                                               \
            static __always_inline                                                 \
            u64 ____##name(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__));   \
            typedef u64 (*btf_##name)(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__)); \
            u64 name(__BPF_REG(x, __BPF_DECL_REGS, __BPF_N, __VA_ARGS__));         \
            u64 name(__BPF_REG(x, __BPF_DECL_REGS, __BPF_N, __VA_ARGS__))          \
            {                                                                      \
                    return ((btf_##name)____##name)(__BPF_MAP(x,__BPF_CAST,__BPF_N,__VA_ARGS__));\
            }                                                                      \
            static __always_inline                                                 \
            u64 ____##name(__BPF_MAP(x, __BPF_DECL_ARGS, __BPF_V, __VA_ARGS__))

      #define BPF_CALL_1(name, ...)   BPF_CALL_x(1, name, __VA_ARGS__)

    The notrace attribute is actually applied to the static always_inline function
    ____bpf_spin_{lock,unlock}(). The actual callback function
    bpf_spin_{lock,unlock}() is not marked with notrace, hence
    allowing fentry progs to attach to the two helpers, and this
    may cause the above-mentioned deadlock. Siddharth Chintamaneni
    actually has a reproducer in [2].

    To fix the issue, a new macro NOTRACE_BPF_CALL_1 is introduced which
    adds the notrace attribute to the original function instead of the
    hidden always_inline function, and this fixes the problem.

      [1] https://lore.kernel.org/bpf/CAE5sdEigPnoGrzN8WU7Tx-h-iFuMZgW06qp0KHWtpvoXxf1OAQ@mail.gmail.com/
      [2] https://lore.kernel.org/bpf/CAE5sdEg6yUc_Jz50AnUXEEUh6O73yQ1Z6NV2srJnef0ZrQkZew@mail.gmail.com/

    Fixes: d83525ca62 ("bpf: introduce bpf_spin_lock")
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/20240207070102.335167-1-yonghong.song@linux.dev

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:09 +02:00
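
The shape of the fix as a sketch: BPF_CALL_x gains an attribute slot so that notrace lands on the outer callback rather than on the already-inlined inner function:

    #define NOTRACE_BPF_CALL_1(name, ...) BPF_CALL_x(1, notrace, name, __VA_ARGS__)

    NOTRACE_BPF_CALL_1(bpf_spin_lock, struct bpf_spin_lock *, lock)
    {
            __bpf_spin_lock_irqsave(lock);
            return 0;
    }
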
Jerome Marchand 134fbcf4cd bpf: Have bpf_rdonly_cast() take a const pointer
JIRA: https://issues.redhat.com/browse/RHEL-23649

commit 5b268d1ebcdceacf992dfda8f9031d56005a274e
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Sun Feb 4 14:06:34 2024 -0700

    bpf: Have bpf_rdonly_cast() take a const pointer

    Since 20d59ee55172 ("libbpf: add bpf_core_cast() macro"), libbpf is now
    exporting a const arg version of bpf_rdonly_cast(). This causes the
    following conflicting type error when generating kfunc prototypes from
    BTF:

    In file included from skeleton/pid_iter.bpf.c:5:
    /home/dxu/dev/linux/tools/bpf/bpftool/bootstrap/libbpf/include/bpf/bpf_core_read.h:297:14: error: conflicting types for 'bpf_rdonly_cast'
    extern void *bpf_rdonly_cast(const void *obj__ign, __u32 btf_id__k) __ksym __weak;
                 ^
    ./vmlinux.h:135625:14: note: previous declaration is here
    extern void *bpf_rdonly_cast(void *obj__ign, u32 btf_id__k) __weak __ksym;

    This is because the kernel defines bpf_rdonly_cast() with a non-const arg.
    Since a const arg is more permissive and thus backwards compatible, we
    change the kernel definition as well to avoid conflicting type errors.
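
    A sketch of the kernel-side change, assuming only the parameter gains
    const and the body adds a cast to keep returning a non-const pointer:

      __bpf_kfunc void *bpf_rdonly_cast(const void *obj__ign, u32 btf_id__k)
      {
              return (void *)obj__ign;
      }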

    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Yonghong Song <yonghong.song@linux.dev>
    Link: https://lore.kernel.org/bpf/dfd3823f11ffd2d4c838e961d61ec9ae8a646773.1707080349.git.dxu@dxuuu.xyz

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:09 +02:00
Jerome Marchand 563e3eb7e7 bpf: treewide: Annotate BPF kfuncs in BTF
JIRA: https://issues.redhat.com/browse/RHEL-23649

Conflicts: Multiple conflicts due to missing kfuncs. All sections were
switched to use the new macro except bpf_mptcp_fmodret_ids, which still
uses BTF_SET8_* upstream. I don't know why; that might be an upstream
oversight.

commit 6f3189f38a3e995232e028a4c341164c4aca1b20
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Sun Jan 28 18:24:08 2024 -0700

    bpf: treewide: Annotate BPF kfuncs in BTF

    This commit marks kfuncs as such inside the .BTF_ids section. The upshot
    of these annotations is that we'll be able to automatically generate
    kfunc prototypes for downstream users. The process is as follows:

    1. In source, use BTF_KFUNCS_START/END macro pair to mark kfuncs
    2. During build, pahole injects into BTF a "bpf_kfunc" BTF_DECL_TAG for
       each function inside BTF_KFUNCS sets
    3. At runtime, vmlinux or module BTF is made available in sysfs
    4. At runtime, bpftool (or similar) can look at provided BTF and
       generate appropriate prototypes for functions with "bpf_kfunc" tag
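
    For step 1, an annotation site then looks roughly like this (a sketch;
    the set name, kfuncs and flags are illustrative):

      BTF_KFUNCS_START(task_kfunc_btf_ids)
      BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
      BTF_ID_FLAGS(func, bpf_task_release, KF_RELEASE)
      BTF_KFUNCS_END(task_kfunc_btf_ids)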

    To ensure future kfuncs are similarly tagged, we now also return an error
    from kfunc registration for untagged kfuncs. For vmlinux kfuncs,
    we also WARN(), as the initcall machinery does not handle errors.

    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Acked-by: Benjamin Tissoires <bentiss@kernel.org>
    Link: https://lore.kernel.org/r/e55150ceecbf0a5d961e608941165c0bee7bc943.1706491398.git.dxu@dxuuu.xyz
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:07 +02:00
Jerome Marchand d1c16d1138 bpf: Take into account BPF token when fetching helper protos
JIRA: https://issues.redhat.com/browse/RHEL-23649

Conflicts: Context change due to missing commit 9a675ba55a96 ("net,
bpf: Add a warning if NAPI cb missed xdp_do_flush().")

commit bbc1d24724e110b86a1a7c3c1724ce0d62cc1e2e
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Tue Jan 23 18:21:04 2024 -0800

    bpf: Take into account BPF token when fetching helper protos

    Instead of performing unconditional system-wide bpf_capable() and
    perfmon_capable() calls inside bpf_base_func_proto() function (and other
    similar ones) to determine eligibility of a given BPF helper for a given
    program, use previously recorded BPF token during BPF_PROG_LOAD command
    handling to inform the decision.
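
    A sketch of the shape of the change, assuming the proto lookup now takes
    the loading prog and consults its recorded token (helper cases elided):

      const struct bpf_func_proto *
      bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
      {
              /* ... unprivileged helpers handled above ... */
              if (!bpf_token_capable(prog->aux->token, CAP_BPF))
                      return NULL;
              /* ... CAP_BPF-gated helpers handled below ... */
      }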

    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20240124022127.2379740-8-andrii@kernel.org

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:49:03 +02:00
Jerome Marchand f0f8ba4f5d bpf: Support inlining bpf_kptr_xchg() helper
JIRA: https://issues.redhat.com/browse/RHEL-23649

Conflicts:
Context change. The bpf_arch_poke_desc_update() function had been
placed before bpf_jit_supports_exceptions() by a previous backport
(commit fb0a7b0e48 "bpf: Fix prog_array_map_poke_run map poke update").
Reorder the function as upstream to help with future backports.

commit 7c05e7f3e74e7e550534d524e04d7e6f78d6fa24
Author: Hou Tao <houtao1@huawei.com>
Date:   Fri Jan 5 18:48:17 2024 +0800

    bpf: Support inlining bpf_kptr_xchg() helper

    The motivation for inlining bpf_kptr_xchg() comes from performance
    profiling of the bpf memory allocator benchmark. The benchmark uses
    bpf_kptr_xchg() to stash the allocated objects and to pop the stashed
    objects for free. After inlining bpf_kptr_xchg(), the performance for
    object free on an 8-CPU VM increases by about 2%~10%. The inlining
    also has a downside: both the kasan and kcsan checks on the pointer
    become unavailable.

    bpf_kptr_xchg() can be inlined by converting the call to
    bpf_kptr_xchg() into an atomic_xchg() instruction. But the conversion
    depends on two conditions:
    1) The JIT backend supports atomic_xchg() on pointer-sized words
    2) For the specific arch, the implementation of xchg is the same as
       atomic_xchg() on pointer-sized words.

    It seems most 64-bit JIT backends satisfy these two conditions. But
    as a precaution, a weak function bpf_jit_supports_ptr_xchg() is defined
    to state whether such conversion is safe, and inlining is only
    supported for 64-bit hosts.

    For x86-64, it supports the BPF_XCHG atomic operation, and both xchg()
    and atomic_xchg() use arch_xchg() to implement the exchange, so the
    inlining of bpf_kptr_xchg() is enabled on x86-64 first.
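
    A sketch of the verifier-side rewrite, assuming the helper call is
    patched into a BPF_XCHG atomic during the instruction fixup pass
    (abridged; the patching boilerplate is elided):

      /* Implement bpf_kptr_xchg inline */
      if (prog->jit_requested && bpf_jit_supports_ptr_xchg() &&
          insn->imm == BPF_FUNC_kptr_xchg) {
              insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_2);
              insn_buf[1] = BPF_ATOMIC_OP(BPF_DW, BPF_XCHG, BPF_REG_1, BPF_REG_0, 0);
              cnt = 2;
              /* the two insns then replace the original helper call */
      }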

    Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20240105104819.3916743-2-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:48:58 +02:00
Viktor Malik 7dc0f5d37a bpf: Fix racing between bpf_timer_cancel_and_free and bpf_timer_cancel
JIRA: https://issues.redhat.com/browse/RHEL-23644

JIRA: https://issues.redhat.com/browse/RHEL-31726
CVE: CVE-2024-26737

commit 0281b919e175bb9c3128bd3872ac2903e9436e3f
Author: Martin KaFai Lau <martin.lau@kernel.org>
Date:   Thu Feb 15 13:12:17 2024 -0800

    bpf: Fix racing between bpf_timer_cancel_and_free and bpf_timer_cancel

    The following race is possible between bpf_timer_cancel_and_free
    and bpf_timer_cancel. It will lead to a UAF on timer->timer.

    bpf_timer_cancel();
    	spin_lock();
    	t = timer->timer;
    	spin_unlock();

    					bpf_timer_cancel_and_free();
    						spin_lock();
    						t = timer->timer;
    						timer->timer = NULL;
    						spin_unlock();
    						hrtimer_cancel(&t->timer);
    						kfree(t);

    	/* UAF on t */
    	hrtimer_cancel(&t->timer);

    In bpf_timer_cancel_and_free, this patch frees the timer->timer
    after an RCU grace period. This requires adding an rcu_head
    to "struct bpf_hrtimer". Another kfree(t) happens in bpf_timer_init;
    it does not need kfree_rcu because it is still under the
    spin_lock and timer->timer is not yet visible to others.

    In bpf_timer_cancel, rcu_read_lock() is added because this helper
    can be used in a non-RCU critical section context (e.g. from
    a sleepable bpf prog). Other timer->timer usages in helpers.c
    have been audited; bpf_timer_cancel() is the only place where
    timer->timer is used outside of the spin_lock.

    Another solution considered was to mark a t->flag in bpf_timer_cancel
    and clear it after hrtimer_cancel() is done. In bpf_timer_cancel_and_free,
    it would busy-wait for the flag to be cleared before kfree(t). This patch
    goes with the straightforward solution and frees timer->timer after
    an RCU grace period.
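
    A sketch of the fix, assuming only the struct gains an rcu_head and the
    free site switches to kfree_rcu:

      struct bpf_hrtimer {
              struct hrtimer timer;
              /* ... existing fields ... */
              struct rcu_head rcu;
      };

      /* in bpf_timer_cancel_and_free(), instead of kfree(t): */
      kfree_rcu(t, rcu);

      /* in bpf_timer_cancel(), the timer->timer dereference is wrapped in
       * rcu_read_lock()/rcu_read_unlock().
       */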

    Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/bpf/20240215211218.990808-1-martin.lau@linux.dev

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:44 +02:00
Viktor Malik 9680ef97a0 Revert BPF token-related functionality
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit d17aff807f845cf93926c28705216639c7279110
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Tue Dec 19 07:37:35 2023 -0800

    Revert BPF token-related functionality

    This patch includes the following reverts (one conflicting BPF FS
    patch and three token patch sets, represented by merge commits):
      - revert 0f5d5454c723 "Merge branch 'bpf-fs-mount-options-parsing-follow-ups'";
      - revert 750e785796bb "bpf: Support uid and gid when mounting bpffs";
      - revert 733763285acf "Merge branch 'bpf-token-support-in-libbpf-s-bpf-object'";
      - revert c35919dcce28 "Merge branch 'bpf-token-and-bpf-fs-based-delegation'".

    Link: https://lore.kernel.org/bpf/CAHk-=wg7JuFYwGy=GOMbRCtOL+jwSQsdUaBsRWkDVYbxipbM5A@mail.gmail.com
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:29 +02:00
Viktor Malik 935518dfd6 x86/cfi,bpf: Fix bpf_exception_cb() signature
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 852486b35f344887786d63250946dd921a05d7e8
Author: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date:   Fri Dec 15 10:12:23 2023 +0100

    x86/cfi,bpf: Fix bpf_exception_cb() signature
    
    As per the earlier patches, BPF sub-programs have the bpf_callback_t
    signature and CFI expects callers to have a matching signature. This is
    violated by bpf_prog_aux::bpf_exception_cb().
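
    A sketch of the fix, assuming the callback pointer is retyped from a
    bare void * to the bpf_callback_t-compatible shape (field name as in
    the changelog above):

      /* in struct bpf_prog_aux, roughly: */
      u64 (*bpf_exception_cb)(u64 cookie, u64 sp, u64 bp, u64, u64);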
    
    [peterz: Changelog]
    Reported-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/CAADnVQ+Z7UcXXBBhMubhcMM=R-dExk-uHtfOLtoLxQ1XxEpqEA@mail.gmail.com
    Link: https://lore.kernel.org/r/20231215092707.910319166@infradead.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:27 +02:00
Viktor Malik 5cc2464aa0 bpf: Fix dtor CFI
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit e4c00339891c074c76f626ac82981963cbba5332
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Dec 15 10:12:22 2023 +0100

    bpf: Fix dtor CFI

    Ensure the various dtor functions match their prototype and retain
    their CFI signatures; since they don't have their address taken, they
    are prone to not getting CFI, making them impossible to call
    indirectly.
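
    A sketch of the pattern, assuming a prototype-matching void * wrapper
    plus a CFI_NOSEAL() annotation so the signature is retained (the
    cpumask dtor is used as an illustrative example):

      __bpf_kfunc void bpf_cpumask_release_dtor(void *cpumask)
      {
              bpf_cpumask_release(cpumask);
      }
      CFI_NOSEAL(bpf_cpumask_release_dtor);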

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20231215092707.799451071@infradead.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:27 +02:00
Viktor Malik 3e424bf42b bpf: take into account BPF token when fetching helper protos
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 4cbb270e115bc197ff2046aeb54cc951666b16ec
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Nov 30 10:52:19 2023 -0800

    bpf: take into account BPF token when fetching helper protos
    
    Instead of performing unconditional system-wide bpf_capable() and
    perfmon_capable() calls inside bpf_base_func_proto() function (and other
    similar ones) to determine eligibility of a given BPF helper for a given
    program, use previously recorded BPF token during BPF_PROG_LOAD command
    handling to inform the decision.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231130185229.2688956-8-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 10:52:09 +02:00
Viktor Malik 2ef6b238c5 bpf: Check rcu_read_lock_trace_held() before calling bpf map helpers
JIRA: https://issues.redhat.com/browse/RHEL-23644

JIRA: https://issues.redhat.com/browse/RHEL-30513
CVE: CVE-2023-52621

commit 169410eba271afc9f0fb476d996795aa26770c6d
Author: Hou Tao <houtao1@huawei.com>
Date:   Mon Dec 4 22:04:19 2023 +0800

    bpf: Check rcu_read_lock_trace_held() before calling bpf map helpers

    These three bpf_map_{lookup,update,delete}_elem() helpers are also
    available to sleepable bpf programs, so add the corresponding lock
    assertion for sleepable bpf programs, otherwise the following warning
    will be reported when a sleepable bpf program manipulates a bpf map in
    interpreter mode (aka bpf_jit_enable=0):

      WARNING: CPU: 3 PID: 4985 at kernel/bpf/helpers.c:40 ......
      CPU: 3 PID: 4985 Comm: test_progs Not tainted 6.6.0+ #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
      RIP: 0010:bpf_map_lookup_elem+0x54/0x60
      ......
      Call Trace:
       <TASK>
       ? __warn+0xa5/0x240
       ? bpf_map_lookup_elem+0x54/0x60
       ? report_bug+0x1ba/0x1f0
       ? handle_bug+0x40/0x80
       ? exc_invalid_op+0x18/0x50
       ? asm_exc_invalid_op+0x1b/0x20
       ? __pfx_bpf_map_lookup_elem+0x10/0x10
       ? rcu_lockdep_current_cpu_online+0x65/0xb0
       ? rcu_is_watching+0x23/0x50
       ? bpf_map_lookup_elem+0x54/0x60
       ? __pfx_bpf_map_lookup_elem+0x10/0x10
       ___bpf_prog_run+0x513/0x3b70
       __bpf_prog_run32+0x9d/0xd0
       ? __bpf_prog_enter_sleepable_recur+0xad/0x120
       ? __bpf_prog_enter_sleepable_recur+0x3e/0x120
       bpf_trampoline_6442580665+0x4d/0x1000
       __x64_sys_getpgid+0x5/0x30
       ? do_syscall_64+0x36/0xb0
       entry_SYSCALL_64_after_hwframe+0x6e/0x76
       </TASK>
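
    A sketch of the added assertion, assuming it simply extends the
    existing WARN_ON_ONCE in the helpers to also accept the RCU-trace
    read lock held by sleepable progs:

      BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key)
      {
              WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() &&
                           !rcu_read_lock_bh_held());
              return (unsigned long) map->ops->map_lookup_elem(map, key);
      }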

    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20231204140425.1480317-2-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 10:52:03 +02:00
Viktor Malik 027c56f751 bpf: Add a new kfunc for cgroup1 hierarchy
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit fe977716b40cb98cf9c91a66454adf3dc2f8c59a
Author: Yafang Shao <laoar.shao@gmail.com>
Date:   Sat Nov 11 09:00:29 2023 +0000

    bpf: Add a new kfunc for cgroup1 hierarchy
    
    A new kfunc is added to acquire cgroup1 of a task:
    
    - bpf_task_get_cgroup1
      Acquires the associated cgroup of a task within a specific cgroup1
      hierarchy. The cgroup1 hierarchy is identified by its hierarchy ID.
    
    This new kfunc enables the tracing of tasks within a designated
    container or cgroup directory in BPF programs.
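
    A sketch of intended usage from a tracing prog (the attach point,
    hierarchy ID and filtering are illustrative):

      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>
      #include <bpf/bpf_tracing.h>

      struct cgroup *bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) __ksym;
      void bpf_cgroup_release(struct cgroup *cgrp) __ksym;

      SEC("tp_btf/sched_process_fork")
      int BPF_PROG(on_fork, struct task_struct *parent, struct task_struct *child)
      {
              /* hierarchy ID 1 is illustrative; see /proc/cgroups on the host */
              struct cgroup *cgrp = bpf_task_get_cgroup1(child, 1);

              if (!cgrp)
                      return 0;
              /* e.g. match cgrp->kn->id against a target cgroup here */
              bpf_cgroup_release(cgrp);
              return 0;
      }

      char _license[] SEC("license") = "GPL";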
    
    Suggested-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Link: https://lore.kernel.org/r/20231111090034.4248-2-laoar.shao@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 10:51:45 +02:00