Commit Graph

364 Commits

Jerome Marchand 1736223058 bpf: prepare btf_prepare_func_args() for multiple tags per argument
JIRA: https://issues.redhat.com/browse/RHEL-23649

commit 54c11ec4935a61af32bb03fc52e7172c97bd7203
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Jan 4 16:09:04 2024 -0800

    bpf: prepare btf_prepare_func_args() for multiple tags per argument

    Add btf_arg_tag flags enum to be able to record multiple tags per
    argument. Also streamline pointer argument processing some more.

    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20240105000909.2818934-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2024-10-15 10:48:58 +02:00
Viktor Malik c597e4b470
bpf: don't emit warnings intended for global subprogs for static subprogs
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 1eb986746a67952df86eb2c50a36450ef103d01b
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Fri Feb 2 11:05:29 2024 -0800

    bpf: don't emit warnings intended for global subprogs for static subprogs

    When btf_prepare_func_args() was generalized to handle both static and
    global subprogs, a few warnings/errors that are meant only for global
    subprog cases started to be emitted for static subprogs, where they are
    sort of expected and irrelevant.

    Stop polluting verifier logs with irrelevant scary-looking messages.

    Fixes: e26080d0da87 ("bpf: prepare btf_prepare_func_args() for handling static subprogs")
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240202190529.2374377-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:46 +02:00
Viktor Malik 06951a71fd
bpf: make sure scalar args don't accept __arg_nonnull tag
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 18810ad3929ff6b5d8e67e3adc40d690bd780fd6
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Jan 4 16:09:03 2024 -0800

    bpf: make sure scalar args don't accept __arg_nonnull tag

    Move scalar arg processing in btf_prepare_func_args() after all pointer
    arg processing is done. This makes it easier to do validation. One
    example of unintended behavior right now is ability to specify
    __arg_nonnull for integer/enum arguments. This patch fixes this.

    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20240105000909.2818934-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:45 +02:00
Viktor Malik 8591442f2c
bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 0ba971511d16603599f947459e59b435cc465b0d
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Jan 17 19:31:41 2024 -0800

    bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
    
    Add enforcement of expected types for context arguments tagged with
    arg:ctx (__arg_ctx) tag.
    
    First, any program type will accept the generic `void *` context type
    when combined with the __arg_ctx tag.
    
    Besides accepting "canonical" struct names and `void *`, for a bunch of
    program types for which the program context is actually a named struct,
    we allow a number of pragmatic exceptions to match real-world and
    expected usage:
    
      - for both kprobes and perf_event we allow `bpf_user_pt_regs_t *` as
        canonical context argument type, where `bpf_user_pt_regs_t` is a
        *typedef*, not a struct;
      - for kprobes, we also always accept `struct pt_regs *`, as that's what
        actually is passed as a context to any kprobe program;
      - for perf_event, we resolve typedefs (unless it's `bpf_user_pt_regs_t`)
        down to the actual struct type and accept `struct pt_regs *`,
        `struct user_pt_regs *`, or `struct user_regs_struct *`, depending
        on which struct type the kernel architecture points the
        `bpf_user_pt_regs_t` typedef to; otherwise, the canonical
        `struct bpf_perf_event_data *` is expected;
      - for raw_tp/raw_tp.w programs, `u64/long *` are accepted, as that's
        what's expected with BPF_PROG() usage; otherwise, canonical
        `struct bpf_raw_tracepoint_args *` is expected;
      - tp_btf supports both `struct bpf_raw_tracepoint_args *` and `u64 *`
        formats; both are coded as exceptions since tp_btf is actually a
        TRACING program type, which has no canonical context type;
      - iterator programs accept `struct bpf_iter__xxx *` structs, currently
        with no further iterator-type specific enforcement;
      - fentry/fexit/fmod_ret/lsm/struct_ops all accept `u64 *`;
      - classic tracepoint programs, as well as syscall and freplace
        programs allow any user-provided type.
    
    In all other cases the kernel will enforce an exact match of the struct
    name to the expected canonical type. And if the user-provided type
    doesn't match that expectation, the verifier will emit a helpful message
    with the expected type name.
    
    Note the somewhat unnatural way the check is done after processing all
    the arguments. This is done to avoid a conflict between the bpf and
    bpf-next trees. Once the trees converge, a small follow-up patch will
    place a simple
    btf_validate_prog_ctx_type() check into a proper ARG_PTR_TO_CTX branch
    (which bpf-next tree patch refactored already), removing duplicated
    arg:ctx detection logic.
    
    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240118033143.3384355-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:42 +02:00
Viktor Malik c8534d4a51
bpf: extract bpf_ctx_convert_map logic and make it more reusable
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 66967a32d3b16ed447e76fed4d946bab52e43d86
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Jan 17 19:31:40 2024 -0800

    bpf: extract bpf_ctx_convert_map logic and make it more reusable
    
    Refactor btf_get_prog_ctx_type() a bit to allow reuse of
    bpf_ctx_convert_map logic in more than one place. Simplify the interface
    by returning btf_type instead of btf_member (field reference in BTF).
    
    To do the above we need to touch and start untangling
    btf_translate_to_vmlinux() implementation. We do the bare minimum to
    not regress anything for btf_translate_to_vmlinux(), but its
    implementation is very questionable for what it claims to be doing.
    Mapping kfunc argument types to kernel corresponding types conceptually
    is quite different from recognizing program context types. Fixing this
    is out of scope for this change though.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20240118033143.3384355-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:42 +02:00
Viktor Malik 08247bafc2
bpf: add support for passing dynptr pointer to global subprog
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit a64bfe618665ea9c722f922cba8c6e3234eac5ac
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Dec 14 17:13:31 2023 -0800

    bpf: add support for passing dynptr pointer to global subprog
    
    Add the ability to pass a pointer to a dynptr into global functions.
    This allows global subprogs that accept and work with generic
    dynptrs created by the caller. A dynptr argument is detected based on
    the name of the struct type: if it's "bpf_dynptr", it's assumed to be
    a proper dynptr pointer. Both actual struct and forward struct
    declaration types are supported.
    
    This is conceptually exactly the same semantics as
    bpf_user_ringbuf_drain()'s use of dynptr to pass a variable-sized
    pointer to ringbuf record. So we heavily rely on CONST_PTR_TO_DYNPTR
    bits of already existing logic in the verifier.
    
    During global subprog validation, we mark such CONST_PTR_TO_DYNPTR as
    having LOCAL type, as that's the most unassuming type of dynptr and it
    doesn't have any special helpers that can try to free or acquire extra
    references (unlike skb, xdp, or ringbuf dynptr). So that seems like a safe
    "choice" to make from correctness standpoint. It's still possible to
    pass any type of dynptr to such subprog, though, because generic dynptr
    helpers, like getting data/slice pointers, read/write memory copying
    routines, dynptr adjustment and getter routines all work correctly with
    any type of dynptr.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231215011334.2307144-8-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:32 +02:00
Viktor Malik 5d3ae6c758
bpf: support 'arg:xxx' btf_decl_tag-based hints for global subprog args
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 94e1c70a34523b5e1529e4ec508316acc6a26a2b
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Dec 14 17:13:30 2023 -0800

    bpf: support 'arg:xxx' btf_decl_tag-based hints for global subprog args
    
    Add support for annotating global BPF subprog arguments to provide more
    information about expected semantics of the argument. Currently,
    verifier relies purely on argument's BTF type information, and supports
    three general use cases: scalar, pointer-to-context, and
    pointer-to-fixed-size-memory.
    
    Scalar and pointer-to-fixed-mem work well in practice and are quite
    natural to use. But pointer-to-context is a bit problematic, as typical
    BPF users don't realize that they need to use a special type name to
    signal to the verifier that the argument is not just some pointer, but
    actually a PTR_TO_CTX. Further, even if users do know which type to use,
    it is limiting in situations where the same BPF program logic is used
    across a few different program types. A common case is kprobes,
    tracepoints, and perf_event programs having a helper to send some data
    over the BPF perf buffer. bpf_perf_event_output() requires a `ctx`
    argument, so it's quite cumbersome to share such a global subprog across
    a few BPF programs of different types, necessitating an extra static
    subprog that is context type-agnostic.
    
    Long story short, there is a need to go beyond types and allow users to
    add hints to global subprog arguments to define expectations.
    
    This patch adds such support for two initial special tags:
      - pointer to context;
      - non-null qualifier for generic pointer arguments.
    
    All of the above came up in practice already and seem generally useful
    additions. Non-null qualifier is an often requested feature, which
    currently has to be worked around by having unnecessary NULL checks
    inside subprogs even if we know that arguments are never NULL. Pointer
    to context was discussed earlier.
    
    As for implementation, we utilize btf_decl_tag attribute and set up an
    "arg:xxx" convention to specify argument hint. As such:
      - btf_decl_tag("arg:ctx") is a PTR_TO_CTX hint;
      - btf_decl_tag("arg:nonnull") marks pointer argument as not allowed to
        be NULL, making NULL check inside global subprog unnecessary.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231215011334.2307144-7-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:31 +02:00
Viktor Malik 393d19cf4f
bpf: move subprog call logic back to verifier.c
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit c5a7244759b1eeacc59d0426fb73859afa942d0d
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Dec 14 17:13:28 2023 -0800

    bpf: move subprog call logic back to verifier.c
    
    Subprog call logic in btf_check_subprog_call() currently has both a lot
    of BTF parsing logic (which is, presumably, what justified putting it
    into btf.c) and a bunch of register state checks, some of which utilize
    deep verifier logic helpers necessarily exported from
    verifier.c: check_ptr_off_reg(), check_func_arg_reg_off(),
    and check_mem_reg().
    
    Going forward, btf_check_subprog_call() will have a minimum of
    BTF-related logic, but will get more internal verifier logic related to
    register state manipulation. So move it into verifier.c to minimize
    amount of verifier-specific logic exposed to btf.c.
    
    We do this move before refactoring btf_check_func_arg_match() to
    preserve as much history post-refactoring as possible.
    
    No functional changes.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231215011334.2307144-5-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:31 +02:00
Viktor Malik c44e0c2e4a
bpf: prepare btf_prepare_func_args() for handling static subprogs
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit e26080d0da87f20222ca6712b65f95a856fadee0
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Dec 14 17:13:27 2023 -0800

    bpf: prepare btf_prepare_func_args() for handling static subprogs
    
    Generalize btf_prepare_func_args() to support both global and static
    subprogs. We are going to utilize this property in the next patch,
    reusing btf_prepare_func_args() for subprog call logic instead of
    reparsing BTF information in a completely separate implementation.
    
    btf_prepare_func_args() now detects whether a subprog is global or static
    and makes slight logic adjustments for static func cases, like not
    failing fatally (-EFAULT) for conditions that are allowable for static
    subprogs.
    
    A somewhat subtle (but major!) difference is the handling of pointer arguments.
    Both global and static functions need to handle special context
    arguments (which are pointers to predefined type names), but static
    subprogs give up on any other pointers, falling back to marking subprog
    as "unreliable", disabling the use of BTF type information altogether.
    
    For global functions, though, we are assuming that such pointers to
    unrecognized types are just pointers to fixed-sized memory region (or
    error out if size cannot be established, like for `void *` pointers).
    
    This patch accommodates these small differences and sets up a stage for
    refactoring in the next patch, eliminating a separate BTF-based parsing
    logic in btf_check_func_arg_match().
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231215011334.2307144-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:31 +02:00
Viktor Malik 3c17a31ed6
bpf: reuse btf_prepare_func_args() check for main program BTF validation
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 5eccd2db42d77e3570619c32d39e39bf486607cf
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Dec 14 17:13:26 2023 -0800

    bpf: reuse btf_prepare_func_args() check for main program BTF validation
    
    Instead of btf_check_subprog_arg_match(), use btf_prepare_func_args()
    logic to validate "trustworthiness" of main BPF program's BTF information,
    if it is present.
    
    We ignored the results of the original BTF check anyway, oftentimes
    producing a confusing and ominous-sounding "reg type unsupported for
    arg#0 function" message, which has no apparent effect on program
    correctness or the verification process.
    
    All the -EFAULT returning sanity checks are already performed in
    check_btf_info_early(), so there is zero reason to have this duplication
    of logic between btf_check_subprog_call() and btf_check_subprog_arg_match().
    Dropping btf_check_subprog_arg_match() further simplifies
    btf_check_func_arg_match(), removing the `bool processing_call` flag.
    
    One subtle bit that was done by btf_check_subprog_arg_match() was
    potentially marking main program's BTF as unreliable. We do this
    explicitly now with a dedicated simple check, preserving the original
    behavior, but now based on well factored btf_prepare_func_args() logic.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231215011334.2307144-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:30 +02:00
Viktor Malik 803edf9e28
bpf: abstract away global subprog arg preparation logic from reg state setup
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 4ba1d0f23414135e4f426dae4cb5cdc2ce246f89
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Dec 14 17:13:25 2023 -0800

    bpf: abstract away global subprog arg preparation logic from reg state setup
    
    btf_prepare_func_args() is used to understand expectations and
    restrictions on global subprog arguments. But current implementation is
    hard to extend, as it intermixes BTF-based func prototype parsing and
    interpretation logic with setting up register state at subprog entry.
    
    Worse still, those registers are not completely set up inside
    btf_prepare_func_args(), requiring some more logic later in
    do_check_common(). Like calling mark_reg_unknown() and similar
    initialization operations.
    
    This intermixing of BTF interpretation and register state setup is
    problematic. First, it causes duplication of BTF parsing logic for global
    subprog verification (to set up initial state of global subprog) and
    global subprog call sites analysis (when we need to check that whatever
    is being passed into global subprog matches expectations), performed in
    btf_check_subprog_call().
    
    Given we want to extend global func argument with tags later, this
    duplication is problematic. So refactor btf_prepare_func_args() to do
    only BTF-based func proto and args parsing, returning high-level
    argument "expectations" only, with no regard to specifics of register
    state. I.e., if it's a context argument, instead of setting register
    state to PTR_TO_CTX, we return ARG_PTR_TO_CTX enum for that argument as
    "an argument specification" for further processing inside
    do_check_common(). Similarly for SCALAR arguments, PTR_TO_MEM, etc.
    
    This allows btf_prepare_func_args() to be reused in following patches
    at global subprog call site analysis time. It also keeps register setup
    code consistently in one place, do_check_common().
    
    Besides all this, we cache this argument specs information inside
    env->subprog_info, eliminating the need to redo these potentially
    expensive BTF traversals, especially if BPF program's BTF is big and/or
    there are lots of global subprog calls.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231215011334.2307144-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 11:07:30 +02:00
Viktor Malik 97b64fd5e1
bpf: tidy up exception callback management a bit
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 1a1ad782dcbbacd9e8d4e2e7ff1bf14d1db80727
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Mon Dec 4 15:39:21 2023 -0800

    bpf: tidy up exception callback management a bit
    
    Use the fact that we are passing subprog index around and have
    a corresponding struct bpf_subprog_info in bpf_verifier_env for each
    subprogram. We don't need to separately pass around a flag whether
    subprog is exception callback or not, each relevant verifier function
    can determine this using provided subprog index if we maintain
    bpf_subprog_info properly.
    
    Also move exception callback-specific logic out of
    btf_prepare_func_args(), keeping it generic. We can enforce all these
    restrictions right before the exception callback verification pass. We
    add an out parameter, arg_cnt, for now, but it will become unnecessary
    with subsequent refactoring and will be removed.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/r/20231204233931.49758-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 10:52:21 +02:00
Viktor Malik 2fc6ffe976
bpf: Move GRAPH_{ROOT,NODE}_MASK macros into btf_field_type enum
JIRA: https://issues.redhat.com/browse/RHEL-23644

commit 790ce3cfefb1b768dccd4eee324ddef0f0ce3db4
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Tue Nov 7 00:56:37 2023 -0800

    bpf: Move GRAPH_{ROOT,NODE}_MASK macros into btf_field_type enum
    
    This refactoring patch removes the unused BPF_GRAPH_NODE_OR_ROOT
    btf_field_type and moves BPF_GRAPH_{NODE,ROOT} macros into the
    btf_field_type enum. Further patches in the series will use
    BPF_GRAPH_NODE, so let's move this useful definition out of btf.c.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20231107085639.3016113-5-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2024-06-25 10:51:42 +02:00
Artem Savkov a24569964d bpf: Add support for custom exception callbacks
JIRA: https://issues.redhat.com/browse/RHEL-23643

commit b9ae0c9dd0aca79bffc17be51c2dc148d1f72708
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Wed Sep 13 01:32:03 2023 +0200

    bpf: Add support for custom exception callbacks
    
    By default, the subprog generated by the verifier to handle a thrown
    exception hardcodes a return value of 0. To allow user-defined logic
    and modification of the return value when an exception is thrown,
    introduce the 'exception_callback:' declaration tag, which marks a
    callback as the default exception handler for the program.
    
    The format of the declaration tag is 'exception_callback:<value>', where
    <value> is the name of the exception callback. Each main program can be
    tagged using this BTF declaration tag to associate it with an exception
    callback. In case the tag is absent, the default callback is used.
    
    As such, the exception callback cannot be modified at runtime, only set
    during verification.
    
    Allowing modification of the callback for the current program execution
    at runtime leads to issues when the programs begin to nest, as any
    per-CPU state maintaining this information will have to be saved and
    restored. We don't want it to stay in bpf_prog_aux as this takes a
    global effect for all programs. An alternative solution is spilling
    the callback pointer at a known location on the program stack on entry,
    and then passing this location to bpf_throw as a parameter.
    
    However, since exceptions are geared more towards a use case where they
    are ideally never invoked, optimizing for this use case and adding to
    the complexity has diminishing returns.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20230912233214.1518551-7-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2024-03-27 10:27:47 +01:00
Artem Savkov d0231b546b bpf: Add BPF_KPTR_PERCPU as a field type
JIRA: https://issues.redhat.com/browse/RHEL-23643

commit 55db92f42fe4a4ef7b4c2b4960c6212c8512dd53
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Sun Aug 27 08:27:39 2023 -0700

    bpf: Add BPF_KPTR_PERCPU as a field type
    
    BPF_KPTR_PERCPU represents a percpu field type like below
    
      struct val_t {
        ... fields ...
      };
      struct t {
        ...
        struct val_t __percpu_kptr *percpu_data_ptr;
        ...
      };
    
    where
      #define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr")))
    
    While BPF_KPTR_REF points to a trusted kernel object or a trusted
    local object, BPF_KPTR_PERCPU points to a trusted local
    percpu object.
    
    This patch adds basic support for BPF_KPTR_PERCPU:
    percpu_kptr field parsing, recording, and free operations.
    BPF_KPTR_PERCPU also supports the same map types
    as BPF_KPTR_REF does.
    
    Note that unlike a local kptr, it is possible that
    a BPF_KPTR_PERCPU struct may not contain any
    special fields like other kptrs, bpf_spin_lock, bpf_list_head, etc.
    
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Link: https://lore.kernel.org/r/20230827152739.1996391-1-yonghong.song@linux.dev
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2024-03-27 10:27:45 +01:00
Jerome Marchand c18ae73848 bpf: Fix an erroneous check after snprintf()
JIRA: https://issues.redhat.com/browse/RHEL-10691

commit a8f12572860ad8ba659d96eee9cf09e181f6ebcc
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Fri Sep 8 18:33:35 2023 +0200

    bpf: Fix an erroneous check after snprintf()

    snprintf() does not return negative error code on error, it returns the
    number of characters which *would* be generated for the given input.

    Fix the error handling check.

    Fixes: 57539b1c0ac2 ("bpf: Enable annotating trusted nested pointers")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://lore.kernel.org/r/393bdebc87b22563c08ace094defa7160eb7a6c0.1694190795.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-15 09:29:04 +01:00
Jerome Marchand ae51d6151a bpf: Fix an error in verifying a field in a union
JIRA: https://issues.redhat.com/browse/RHEL-10691

commit 33937607efa050d9e237e0c4ac4ada02d961c466
Author: Yafang Shao <laoar.shao@gmail.com>
Date:   Thu Jul 13 02:56:41 2023 +0000

    bpf: Fix an error in verifying a field in a union

    We are utilizing BPF LSM to monitor BPF operations within our container
    environment. When we add support for raw_tracepoint, it hits below
    error.

    ; (const void *)attr->raw_tracepoint.name);
    27: (79) r3 = *(u64 *)(r2 +0)
    access beyond the end of member map_type (mend:4) in struct (anon) with off 0 size 8

    It can be reproduced with below BPF prog.

    SEC("lsm/bpf")
    int BPF_PROG(bpf_audit, int cmd, union bpf_attr *attr, unsigned int size)
    {
    	switch (cmd) {
    	case BPF_RAW_TRACEPOINT_OPEN:
    		bpf_printk("raw_tracepoint is %s", attr->raw_tracepoint.name);
    		break;
    	default:
    		break;
    	}
    	return 0;
    }

    The reason is that when accessing a field in a union, such as bpf_attr,
    if the field is located within a nested struct that is not the first
    member of the union, it can result in incorrect field verification.

      union bpf_attr {
          struct {
              __u32 map_type; <<<< Actually it will find that field.
              __u32 key_size;
              __u32 value_size;
             ...
          };
          ...
          struct {
              __u64 name;    <<<< We want to verify this field.
              __u32 prog_fd;
          } raw_tracepoint;
      };

    Considering the potential deep nesting levels, finding a perfect
    solution to address this issue has proven challenging. Therefore, I
    propose a solution where we simply skip the verification process if the
    field in question is located within a union.

    Fixes: 7e3617a72d ("bpf: Add array support to btf_struct_access")
    Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
    Link: https://lore.kernel.org/r/20230713025642.27477-4-laoar.shao@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-15 09:28:52 +01:00
Jerome Marchand c239c634d8 bpf: Resolve modifiers when walking structs
JIRA: https://issues.redhat.com/browse/RHEL-10691

Conflicts: Context change from already applied commit 7ce4dc3e4a9d
("bpf: Fix an error around PTR_UNTRUSTED")

commit 819d43428a8661abccf8b9ecad94c7e6f23a0024
Author: Stanislav Fomichev <sdf@google.com>
Date:   Mon Jun 26 14:25:21 2023 -0700

    bpf: Resolve modifiers when walking structs

    It is impossible to use skb_frag_t in a tracing program. Resolve typedefs
    when walking structs.

    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20230626212522.2414485-1-sdf@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-14 15:22:23 +01:00
Viktor Malik 7ea4304297
bpf, btf: Warn but return no error for NULL btf from __register_btf_kfunc_id_set()
JIRA: https://issues.redhat.com/browse/RHEL-9957

commit 3de4d22cc9ac7c9f38e10edcf54f9a8891a9c2aa
Author: SeongJae Park <sj@kernel.org>
Date:   Sat Jul 1 17:14:47 2023 +0000

    bpf, btf: Warn but return no error for NULL btf from __register_btf_kfunc_id_set()
    
    __register_btf_kfunc_id_set() assumes .BTF to be part of the module's .ko
    file if CONFIG_DEBUG_INFO_BTF is enabled. If that's not the case, the
    function prints an error message and returns an error. As a result, such
    modules cannot be loaded.
    
    However, the section could be stripped out during the build process. It
    would be better to let such modules load, because their basic
    functionality is unaffected [0], though the BTF functionality will not
    be supported. Make the function lower the level of the message from
    error to warn, and return no error.
    
      [0] https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion
    
    Fixes: c446fdacb10d ("bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF")
    Reported-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
    Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: SeongJae Park <sj@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/87y228q66f.fsf@oc8242746057.ibm.com
    Link: https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion
    Link: https://lore.kernel.org/bpf/20230701171447.56464-1-sj@kernel.org

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-10-26 17:06:21 +02:00
Viktor Malik 1b7193a839
bpf: Silence a warning in btf_type_id_size()
JIRA: https://issues.redhat.com/browse/RHEL-9957

commit e6c2f594ed961273479505b42040782820190305
Author: Yonghong Song <yhs@fb.com>
Date:   Tue May 30 13:50:29 2023 -0700

    bpf: Silence a warning in btf_type_id_size()
    
    syzbot reported a warning in [1] with the following stacktrace:
      WARNING: CPU: 0 PID: 5005 at kernel/bpf/btf.c:1988 btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988
      ...
      RIP: 0010:btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988
      ...
      Call Trace:
       <TASK>
       map_check_btf kernel/bpf/syscall.c:1024 [inline]
       map_create+0x1157/0x1860 kernel/bpf/syscall.c:1198
       __sys_bpf+0x127f/0x5420 kernel/bpf/syscall.c:5040
       __do_sys_bpf kernel/bpf/syscall.c:5162 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:5160 [inline]
       __x64_sys_bpf+0x79/0xc0 kernel/bpf/syscall.c:5160
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    With the following btf
      [1] DECL_TAG 'a' type_id=4 component_idx=-1
      [2] PTR '(anon)' type_id=0
      [3] TYPE_TAG 'a' type_id=2
      [4] VAR 'a' type_id=3, linkage=static
    and when the bpf_attr.btf_key_type_id = 1 (DECL_TAG),
    the following WARN_ON_ONCE in btf_type_id_size() is triggered:
      if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) &&
                       !btf_type_is_var(size_type)))
              return NULL;
    
    Note that 'return NULL' is the correct behavior, as we don't want
    a DECL_TAG type to be used as a btf_{key,value}_type_id even
    for a case like 'DECL_TAG -> STRUCT'. So there
    is no correctness issue here; we just want to silence the warning.
    
    To silence the warning, I added DECL_TAG as one of kinds in
    btf_type_nosize() which will cause btf_type_id_size() returning
    NULL earlier without the warning.
    
      [1] https://lore.kernel.org/bpf/000000000000e0df8d05fc75ba86@google.com/
    
    Reported-by: syzbot+958967f249155967d42a@syzkaller.appspotmail.com
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20230530205029.264910-1-yhs@fb.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-10-26 17:06:16 +02:00
Viktor Malik 6eb236e952
bpf: Add kfunc filter function to 'struct btf_kfunc_id_set'
JIRA: https://issues.redhat.com/browse/RHEL-9957

commit e924e80ee6a39bc28d2ef8f51e19d336a98e3be0
Author: Aditi Ghag <aditi.ghag@isovalent.com>
Date:   Fri May 19 22:51:54 2023 +0000

    bpf: Add kfunc filter function to 'struct btf_kfunc_id_set'
    
    This commit adds the ability to restrict kfuncs to certain BPF program
    types. This is required to limit the bpf_sock_destroy kfunc, implemented
    in follow-up commits, to programs with attach type 'BPF_TRACE_ITER'.
    
    The commit adds a callback filter to 'struct btf_kfunc_id_set'.  The
    filter has access to the `bpf_prog` construct including its properties
    such as `expected_attached_type`.
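The callback shape this commit describes can be sketched in standalone C. This is an illustrative mock, not the kernel's actual definitions: `bpf_prog`, the attach-type constants, and the `trace_iter_only` filter are simplified stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for kernel types (illustrative only). */
enum bpf_attach_type { BPF_TRACE_ITER = 1, BPF_XDP = 2 };

struct bpf_prog {
    enum bpf_attach_type expected_attach_type;
};

typedef int (*btf_kfunc_filter_t)(const struct bpf_prog *prog, unsigned kfunc_id);

struct btf_kfunc_id_set {
    const unsigned *ids;       /* kfunc BTF ids in the set */
    unsigned cnt;
    btf_kfunc_filter_t filter; /* NULL means "allowed for all programs" */
};

/* Example filter: restrict a kfunc to BPF_TRACE_ITER programs,
 * mirroring the bpf_sock_destroy use case described above. */
static int trace_iter_only(const struct bpf_prog *prog, unsigned kfunc_id)
{
    (void)kfunc_id;
    return prog->expected_attach_type == BPF_TRACE_ITER;
}

static bool kfunc_allowed(const struct btf_kfunc_id_set *set,
                          const struct bpf_prog *prog, unsigned id)
{
    for (unsigned i = 0; i < set->cnt; i++)
        if (set->ids[i] == id)
            return !set->filter || set->filter(prog, id);
    return false;
}
```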
    
    Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>
    Link: https://lore.kernel.org/r/20230519225157.760788-7-aditi.ghag@isovalent.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-10-26 17:06:13 +02:00
Artem Savkov e116e6fe5c bpf: Fix an error around PTR_UNTRUSTED
Bugzilla: https://bugzilla.redhat.com/2221599

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 7ce4dc3e4a9d954c8a1fb483c7a527e9b060b860
Author: Yafang Shao <laoar.shao@gmail.com>
Date:   Thu Jul 13 02:56:39 2023 +0000

    bpf: Fix an error around PTR_UNTRUSTED

    Per discussion with Alexei, the PTR_UNTRUSTED flag should not be
    cleared when we start to walk a new struct, because the struct in
    question may be a struct nested in a union. We should also check and set
    this flag before we walk each of its members, in case the struct itself
    is a union. We will clear this flag if the field is
    BTF_TYPE_SAFE_RCU_OR_NULL.

    Fixes: 6fcd486b3a0a ("bpf: Refactor RCU enforcement in the verifier.")
    Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
    Link: https://lore.kernel.org/r/20230713025642.27477-2-laoar.shao@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:37 +02:00
Artem Savkov 2bd1c41c27 bpf/btf: Accept function names that contain dots
Bugzilla: https://bugzilla.redhat.com/2221599

commit 9724160b3942b0a967b91a59f81da5593f28b8ba
Author: Florent Revest <revest@chromium.org>
Date:   Thu Jun 15 16:56:07 2023 +0200

    bpf/btf: Accept function names that contain dots
    
    When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM
    leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn,
    pahole creates BTF_KIND_FUNC entries for these and this makes the BTF
    metadata validation fail because they contain a dot.
    
    In a dramatic turn of events, this BTF verification failure can cause
    the netfilter_bpf initialization to fail, causing netfilter_core to
    free the netfilter_helper hashmap and netfilter_ftp to trigger a
    use-after-free. The risk of u-a-f in netfilter will be addressed
    separately but the existence of "asan.module_ctor" debug info under some
    build conditions sounds like a good enough reason to accept functions
    that contain dots in BTF.
    
    Although using only LLVM=1 is the recommended way to compile clang-based
    kernels, users can certainly do LLVM=1, LLVM_IAS=0 as well and we still
    try to support that combination according to Nick. To clarify:
    
      - > v5.10 kernel, LLVM=1 (LLVM_IAS=0 is not the default) is recommended,
        but user can still have LLVM=1, LLVM_IAS=0 to trigger the issue
    
      - <= 5.10 kernel, LLVM=1 (LLVM_IAS=0 is the default) is recommended in
        which case GNU as will be used
    
    Fixes: 1dc9285184 ("bpf: kernel side support for BTF Var and DataSec")
    Signed-off-by: Florent Revest <revest@chromium.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Cc: Yonghong Song <yhs@meta.com>
    Cc: Nick Desaulniers <ndesaulniers@google.com>
    Link: https://lore.kernel.org/bpf/20230615145607.3469985-1-revest@chromium.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:36 +02:00
Artem Savkov 1f42afa9bd bpf: minimal support for programs hooked into netfilter framework
Bugzilla: https://bugzilla.redhat.com/2221599

commit fd9c663b9ad67dedfc9a3fd3429ddd3e83782b4d
Author: Florian Westphal <fw@strlen.de>
Date:   Fri Apr 21 19:02:55 2023 +0200

    bpf: minimal support for programs hooked into netfilter framework
    
    This adds minimal support for BPF_PROG_TYPE_NETFILTER bpf programs
    that will be invoked via the NF_HOOK() points in the ip stack.
    
    Invocation incurs an indirect call. This is not a necessity: it's
    possible to add 'DEFINE_BPF_DISPATCHER(nf_progs)' and handle the
    program invocation with the same method already done for xdp progs.
    
    This isn't done here to keep the size of this chunk down.
    
    Verifier restricts verdicts to either DROP or ACCEPT.
    
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/20230421170300.24115-3-fw@strlen.de
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:33 +02:00
Artem Savkov b5fd38be39 bpf: Fix race between btf_put and btf_idr walk.
Bugzilla: https://bugzilla.redhat.com/2221599

commit acf1c3d68e9a31f10d92bc67ad4673cdae5e8d92
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Thu Apr 20 18:49:01 2023 -0700

    bpf: Fix race between btf_put and btf_idr walk.
    
    Florian and Eduard reported a hard deadlock:
    [   58.433327]  _raw_spin_lock_irqsave+0x40/0x50
    [   58.433334]  btf_put+0x43/0x90
    [   58.433338]  bpf_find_btf_id+0x157/0x240
    [   58.433353]  btf_parse_fields+0x921/0x11c0
    
    This happens since btf->refcount can be 1 at the time of btf_put() and
    btf_put() will call btf_free_id() which will try to grab btf_idr_lock
    and will deadlock.
    Avoid the issue by doing btf_put() without locking.
    
    Fixes: 3d78417b60 ("bpf: Add bpf_btf_find_by_name_kind() helper.")
    Fixes: 1e89106da253 ("bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().")
    Reported-by: Florian Westphal <fw@strlen.de>
    Reported-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: Eduard Zingerman <eddyz87@gmail.com>
    Link: https://lore.kernel.org/bpf/20230421014901.70908-1-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:32 +02:00
Artem Savkov b250ffb135 bpf: support access variable length array of integer type
Bugzilla: https://bugzilla.redhat.com/2221599

commit 2569c7b8726fc06d946a4f999fb1be15b68f3f3c
Author: Feng Zhou <zhoufeng.zf@bytedance.com>
Date:   Thu Apr 20 11:27:34 2023 +0800

    bpf: support access variable length array of integer type
    
    Since commit 9c5f8a1008 ("bpf: Support variable length array in
    tracing programs"), tracing programs can access variable-length arrays,
    but only of structure type. This patch adds support for integer types.
    
    Example:
    Hook load_balance
    struct sched_domain {
    	...
    	unsigned long span[];
    }
    
    The access: sd->span[0].
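The layout in question can be mocked in standalone userspace C. The struct name and fields below are illustrative stand-ins for the kernel's sched_domain, showing a flexible array of integer type as the struct's tail:

```c
#include <assert.h>
#include <stdlib.h>

/* Mock of a kernel struct ending in a flexible array of integers,
 * like struct sched_domain's 'unsigned long span[]'. */
struct sched_domain_mock {
    int flags;
    unsigned long span[];  /* variable-length tail, size known only at runtime */
};

static struct sched_domain_mock *alloc_domain(size_t nr_cpus)
{
    struct sched_domain_mock *sd =
        malloc(sizeof(*sd) + nr_cpus * sizeof(sd->span[0]));
    if (sd)
        for (size_t i = 0; i < nr_cpus; i++)
            sd->span[i] = 1UL << i;  /* fake per-CPU mask words */
    return sd;
}
```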
    
    Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
    Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
    Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
    Link: https://lore.kernel.org/r/20230420032735.27760-2-zhoufeng.zf@bytedance.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:32 +02:00
Artem Savkov d14ef31f8e bpf: Migrate bpf_rbtree_remove to possibly fail
Bugzilla: https://bugzilla.redhat.com/2221599

commit 404ad75a36fb1a1008e9fe803aa7d0212df9e240
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Sat Apr 15 13:18:09 2023 -0700

    bpf: Migrate bpf_rbtree_remove to possibly fail
    
    This patch modifies bpf_rbtree_remove to account for possible failure
    due to the input rb_node already not being in any collection.
    The function can now return NULL, and does when the aforementioned
    scenario occurs. As before, on successful removal an owning reference to
    the removed node is returned.
    
    Adding KF_RET_NULL to bpf_rbtree_remove's kfunc flags - now KF_RET_NULL |
    KF_ACQUIRE - provides the desired verifier semantics:
    
      * retval must be checked for NULL before use
      * if NULL, retval's ref_obj_id is released
      * retval is a "maybe acquired" owning ref, not a non-owning ref,
        so it will live past end of critical section (bpf_spin_unlock), and
        thus can be checked for NULL after the end of the CS
    
    BPF programs must add checks
    ============================
    
    This does change bpf_rbtree_remove's verifier behavior. BPF program
    writers will need to add NULL checks to their programs, but the
    resulting UX looks natural:
    
      bpf_spin_lock(&glock);
    
      n = bpf_rbtree_first(&ghead);
      if (!n) { /* ... */}
      res = bpf_rbtree_remove(&ghead, &n->node);
    
      bpf_spin_unlock(&glock);
    
      if (!res)  /* Newly-added check after this patch */
        return 1;
    
      n = container_of(res, /* ... */);
      /* Do something else with n */
      bpf_obj_drop(n);
      return 0;
    
    The "if (!res)" check above is the only addition necessary for the above
    program to pass verification after this patch.
    
    bpf_rbtree_remove no longer clobbers non-owning refs
    ====================================================
    
    An issue arises when bpf_rbtree_remove fails, though. Consider this
    example:
    
      struct node_data {
        long key;
        struct bpf_list_node l;
        struct bpf_rb_node r;
        struct bpf_refcount ref;
      };
    
      long failed_sum;
    
      void bpf_prog()
      {
        struct node_data *n = bpf_obj_new(/* ... */);
        struct bpf_rb_node *res;
        n->key = 10;
    
        bpf_spin_lock(&glock);
    
        bpf_list_push_back(&some_list, &n->l); /* n is now a non-owning ref */
        res = bpf_rbtree_remove(&some_tree, &n->r, /* ... */);
        if (!res)
          failed_sum += n->key;  /* not possible */
    
        bpf_spin_unlock(&glock);
        /* if (res) { do something useful and drop } ... */
      }
    
    The bpf_rbtree_remove in this example will always fail. Similarly to
    bpf_spin_unlock, bpf_rbtree_remove is a non-owning reference
    invalidation point. The verifier clobbers all non-owning refs after a
    bpf_rbtree_remove call, so the "failed_sum += n->key" line will fail
    verification, and in fact there's no good way to get information about
    the node which failed to add after the invalidation. This patch removes
    non-owning reference invalidation from bpf_rbtree_remove to allow the
    above usecase to pass verification. The logic for why this is now
    possible is as follows:
    
    Before this series, bpf_rbtree_add couldn't fail and thus assumed that
    its input, a non-owning reference, was in the tree. But it's easy to
    construct an example where two non-owning references pointing to the same
    underlying memory are acquired and passed to rbtree_remove one after
    another (see rbtree_api_release_aliasing in
    selftests/bpf/progs/rbtree_fail.c).
    
    So it was necessary to clobber non-owning refs to prevent this
    case and, more generally, to enforce "non-owning ref is definitely
    in some collection" invariant. This series removes that invariant and
    the failure / runtime checking added in this patch provide a clean way
    to deal with the aliasing issue - just fail to remove.
    
    Because the aliasing issue prevented by clobbering non-owning refs is no
    longer an issue, this patch removes the invalidate_non_owning_refs
    call from verifier handling of bpf_rbtree_remove. Note that
    bpf_spin_unlock - the other caller of invalidate_non_owning_refs -
    clobbers non-owning refs for a different reason, so its clobbering
    behavior remains unchanged.
    
    No BPF program changes are necessary for programs to remain valid as a
    result of this clobbering change. A valid program before this patch
    passed verification with its non-owning refs having shorter (or equal)
    lifetimes due to more aggressive clobbering.
    
    Also, update existing tests to check bpf_rbtree_remove retval for NULL
    where necessary, and move rbtree_api_release_aliasing from
    progs/rbtree_fail.c to progs/rbtree.c since it's now expected to pass
    verification.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230415201811.343116-8-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:30 +02:00
Artem Savkov 5c11bd9376 bpf: Introduce opaque bpf_refcount struct and add btf_record plumbing
Bugzilla: https://bugzilla.redhat.com/2221599

commit d54730b50bae1f3119bd686d551d66f0fcc387ca
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Sat Apr 15 13:18:04 2023 -0700

    bpf: Introduce opaque bpf_refcount struct and add btf_record plumbing
    
    A 'struct bpf_refcount' is added to the set of opaque uapi/bpf.h types
    meant for use in BPF programs. Similarly to other opaque types like
    bpf_spin_lock and bpf_rbtree_node, the verifier needs to know where in
    user-defined struct types a bpf_refcount can be located, so necessary
    btf_record plumbing is added to enable this. bpf_refcount is sized to
    hold a refcount_t.
    
    Similarly to bpf_spin_lock, the offset of a bpf_refcount is cached in
    btf_record as refcount_off in addition to being in the field array.
    Caching refcount_off makes sense for this field because further patches
    in the series will modify functions that take local kptrs (e.g.
    bpf_obj_drop) to change their behavior if the type they're operating on
    is refcounted. So enabling fast "is this type refcounted?" checks is
    desirable.
    
    No such verifier behavior changes are introduced in this patch, just
    logic to recognize 'struct bpf_refcount' in btf_record.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230415201811.343116-3-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:30 +02:00
Artem Savkov 201f7b639f bpf: Remove btf_field_offs, use btf_record's fields instead
Bugzilla: https://bugzilla.redhat.com/2221599

commit cd2a8079014aced27da9b2e669784f31680f1351
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Sat Apr 15 13:18:03 2023 -0700

    bpf: Remove btf_field_offs, use btf_record's fields instead
    
    The btf_field_offs struct contains (offset, size) for btf_record fields,
    sorted by offset. btf_field_offs is always used in conjunction with
    btf_record, which has btf_field 'fields' array with (offset, type), the
    latter of which btf_field_offs' size is derived from via
    btf_field_type_size.
    
    This patch adds a size field to struct btf_field and sorts btf_record's
    fields by offset, making it possible to get rid of btf_field_offs. Less
    data duplication and less code complexity results.
    
    Since btf_field_offs' lifetime closely followed the btf_record used to
    populate it, most complexity wins are from removal of initialization
    code like:
    
      if (btf_record_successfully_initialized) {
        foffs = btf_parse_field_offs(rec);
        if (IS_ERR_OR_NULL(foffs))
          // free the btf_record and return err
      }
    
    Other changes in this patch are pretty mechanical:
    
      * foffs->field_off[i] -> rec->fields[i].offset
      * foffs->field_sz[i] -> rec->fields[i].size
      * Sort rec->fields in btf_parse_fields before returning
        * It's possible that this is necessary independently of other
          changes in this patch. btf_record_find in syscall.c expects
          btf_record's fields to be sorted by offset, yet there's no
          explicit sorting of them before this patch, record's fields are
          populated in the order they're read from BTF struct definition.
          BTF docs don't say anything about the sortedness of struct fields.
      * All functions taking struct btf_field_offs * input now instead take
        struct btf_record *. All callsites of these functions already have
        access to the correct btf_record.
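The resulting invariant, each field carrying its own (offset, size) and the record's fields sorted by offset, can be sketched in plain C. The field layout here is a simplified mock, not the kernel's btf_field definition:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified btf_field: each field carries its own offset and size,
 * replacing the separate btf_field_offs arrays. */
struct btf_field_mock {
    unsigned offset;
    unsigned size;
};

static int cmp_by_offset(const void *a, const void *b)
{
    const struct btf_field_mock *fa = a, *fb = b;
    return (fa->offset > fb->offset) - (fa->offset < fb->offset);
}

/* Mirrors sorting rec->fields before returning from parsing, so that
 * lookups which expect offset-sorted fields (as btf_record_find does)
 * stay valid. */
static void sort_fields(struct btf_field_mock *fields, size_t cnt)
{
    qsort(fields, cnt, sizeof(fields[0]), cmp_by_offset);
}
```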
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230415201811.343116-2-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:30 +02:00
Artem Savkov 637973f018 bpf/btf: Fix is_int_ptr()
Bugzilla: https://bugzilla.redhat.com/2221599

commit 91f2dc6838c19342f7f2993627c622835cc24890
Author: Feng Zhou <zhoufeng.zf@bytedance.com>
Date:   Mon Apr 10 16:59:07 2023 +0800

    bpf/btf: Fix is_int_ptr()
    
    When tracing a kernel function whose arg type is u32 *, btf_ctx_access()
    would report an error: arg2 type INT is not a struct.
    
    Commit bb6728d75611 ("bpf: Allow access to int pointer arguments
    in tracing programs") added support for int pointers, but did not skip
    modifiers before checking the type. This patch fixes it.
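The fix boils down to resolving modifiers before the INT check. A standalone sketch of that chain walk, using a mock set of BTF kinds and a flat type table rather than the kernel's BTF API:

```c
#include <assert.h>
#include <stdbool.h>

/* Mock BTF kinds: CONST/VOLATILE/TYPEDEF are "modifiers" that wrap
 * another type and must be skipped to find the real pointee kind. */
enum kind { KIND_INT, KIND_PTR, KIND_STRUCT, KIND_CONST, KIND_VOLATILE, KIND_TYPEDEF };

struct btf_type_mock {
    enum kind kind;
    int next;  /* index of the pointed-to/modified type, -1 if none */
};

/* Is 'id' a pointer whose (modifier-stripped) pointee is an integer?
 * Mirrors the fixed is_int_ptr() behavior described above. */
static bool is_int_ptr_mock(const struct btf_type_mock *types, int id)
{
    if (types[id].kind != KIND_PTR)
        return false;
    id = types[id].next;
    while (types[id].kind == KIND_CONST ||
           types[id].kind == KIND_VOLATILE ||
           types[id].kind == KIND_TYPEDEF)
        id = types[id].next;  /* skip modifiers, as the fix does */
    return types[id].kind == KIND_INT;
}
```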
    
    Fixes: bb6728d75611 ("bpf: Allow access to int pointer arguments in tracing programs")
    Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
    Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
    Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/20230410085908.98493-2-zhoufeng.zf@bytedance.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:29 +02:00
Artem Savkov ccb9ef7406 bpf: Simplify internal verifier log interface
Bugzilla: https://bugzilla.redhat.com/2221599

commit bdcab4144f5da97cc0fa7e1dd63b8475e10c8f0a
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Apr 6 16:41:59 2023 -0700

    bpf: Simplify internal verifier log interface
    
    Simplify internal verifier log API down to bpf_vlog_init() and
    bpf_vlog_finalize(). The former handles input arguments validation in
    one place and makes it easier to change it. The latter subsumes -ENOSPC
    (truncation) and -EFAULT handling and simplifies both caller's code
    (bpf_check() and btf_parse()).
    
    For btf_parse(), this patch also makes sure that verifier log
    finalization happens even if there is some error condition during BTF
    verification process prior to normal finalization step.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Lorenz Bauer <lmb@isovalent.com>
    Link: https://lore.kernel.org/bpf/20230406234205.323208-14-andrii@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:28 +02:00
Artem Savkov 475e5e02ee bpf: Add log_true_size output field to return necessary log buffer size
Bugzilla: https://bugzilla.redhat.com/2221599

commit 47a71c1f9af0a334c9dfa97633c41de4feda4287
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Apr 6 16:41:58 2023 -0700

    bpf: Add log_true_size output field to return necessary log buffer size
    
    Add output-only log_true_size and btf_log_true_size field to
    BPF_PROG_LOAD and BPF_BTF_LOAD commands, respectively. It will return
    the size of log buffer necessary to fit in all the log contents at
    specified log_level. This is very useful for BPF loader libraries like
    libbpf to be able to size log buffer correctly, but could be used by
    users directly, if necessary, as well.
    
    This patch plumbs all this through the code, taking into account actual
    bpf_attr size provided by user to determine if these new fields are
    expected by users. And if they are, set them from kernel on return.
    
    We refactor the btf_parse() function to accommodate this, moving attr and
    uattr handling inside it. The rest is very straightforward code, which
    is split from the logging accounting changes in the previous patch to
    make it simpler to review logic vs UAPI changes.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Lorenz Bauer <lmb@isovalent.com>
    Link: https://lore.kernel.org/bpf/20230406234205.323208-13-andrii@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:28 +02:00
Artem Savkov 30d9b4314e bpf: Simplify logging-related error conditions handling
Bugzilla: https://bugzilla.redhat.com/2221599

commit 8a6ca6bc553e3c878fa53c506bc6ec281efdc039
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Apr 6 16:41:56 2023 -0700

    bpf: Simplify logging-related error conditions handling
    
    Move the log->level == 0 check into bpf_vlog_truncated() instead of doing it
    explicitly. Also remove unnecessary goto in kernel/bpf/verifier.c.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Lorenz Bauer <lmb@isovalent.com>
    Link: https://lore.kernel.org/bpf/20230406234205.323208-11-andrii@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:28 +02:00
Artem Savkov 38cf6148dd bpf: Fix missing -EFAULT return on user log buf error in btf_parse()
Bugzilla: https://bugzilla.redhat.com/2221599

commit 971fb5057d787d0a7e7c8cb910207c82e2db920e
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Apr 6 16:41:54 2023 -0700

    bpf: Fix missing -EFAULT return on user log buf error in btf_parse()
    
    btf_parse() is missing -EFAULT error return if log->ubuf was NULL-ed out
    due to error while copying data into user-provided buffer. Add it, but
    handle a special case of BPF_LOG_KERNEL in which log->ubuf is always NULL.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Lorenz Bauer <lmb@isovalent.com>
    Link: https://lore.kernel.org/bpf/20230406234205.323208-9-andrii@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:28 +02:00
Artem Savkov 77949afca9 bpf: Switch BPF verifier log to be a rotating log by default
Bugzilla: https://bugzilla.redhat.com/2221599

commit 1216640938035e63bdbd32438e91c9bcc1fd8ee1
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Apr 6 16:41:49 2023 -0700

    bpf: Switch BPF verifier log to be a rotating log by default
    
    Currently, if user-supplied log buffer to collect BPF verifier log turns
    out to be too small to contain full log, bpf() syscall returns -ENOSPC,
    fails BPF program verification/load, and preserves first N-1 bytes of
    the verifier log (where N is the size of user-supplied buffer).
    
    This is problematic in a bunch of common scenarios, especially when
    working with real-world BPF programs that tend to be pretty complex as
    far as verification goes and require big log buffers. Typically, it's
    when debugging tricky cases at log level 2 (verbose). Also, when BPF program
    is successfully validated, log level 2 is the only way to actually see
    verifier state progression and all the important details.
    
    Even with log level 1, it's possible to get -ENOSPC even if the final
    verifier log fits in log buffer, if there is a code path that's deep
    enough to fill up entire log, even if normally it would be reset later
    on (there is a logic to chop off successfully validated portions of BPF
    verifier log).
    
    In short, it's not always possible to pre-size log buffer. Also, what's
    worse, in practice, the end of the log most often is way more important
    than the beginning, but verifier stops emitting log as soon as initial
    log buffer is filled up.
    
    This patch switches BPF verifier log behavior to effectively behave as
    rotating log. That is, if user-supplied log buffer turns out to be too
    short, verifier will keep overwriting previously written log,
    effectively treating user's log buffer as a ring buffer. -ENOSPC is
    still going to be returned at the end, to notify user that log contents
    was truncated, but the important last N bytes of the log would be
    returned, which might be all that user really needs. This consistent
    -ENOSPC behavior, regardless of rotating or fixed log behavior, allows
    us to prevent backwards-compatibility breakage. The only user-visible
    change is which portion of verifier log user ends up seeing *if buffer
    is too small*. Given contents of verifier log itself is not an ABI,
    there is no breakage due to this behavior change. Specialized tools that
    rely on specific contents of verifier log in -ENOSPC scenario are
    expected to be easily adapted to accommodate old and new behaviors.
    
    Importantly, though, to preserve good user experience and not require
    every user-space application to adopt to this new behavior, before
    exiting to user-space verifier will rotate log (in place) to make it
    start at the very beginning of user buffer as a continuous
    zero-terminated string. The contents will be a chopped off N-1 last
    bytes of full verifier log, of course.
    
    Given beginning of log is sometimes important as well, we add
    BPF_LOG_FIXED (which equals 8) flag to force old behavior, which allows
    tools like veristat to request first part of verifier log, if necessary.
    BPF_LOG_FIXED flag is also a simple and straightforward way to check if
    BPF verifier supports rotating behavior.
    
    On the implementation side, conceptually, it's all simple. We maintain
    64-bit logical start and end positions. If we need to truncate the log,
    start position will be adjusted accordingly to lag end position by
    N bytes. We then use those logical positions to calculate their matching
    actual positions in user buffer and handle wrap around the end of the
    buffer properly. Finally, right before returning from bpf_check(), we
    rotate user log buffer contents in-place as necessary, to make log
    contents contiguous. See comments in relevant functions for details.
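The logical-position scheme can be sketched in userspace C. This is a toy byte-at-a-time model of the approach, not the kernel's implementation; names and the tiny buffer size are illustrative:

```c
#include <assert.h>
#include <string.h>

/* Toy rotating log: 64-bit logical start/end positions over a small
 * physical buffer, mirroring the scheme described above. */
struct vlog_mock {
    char buf[8];
    unsigned long long start, end;  /* logical positions */
};

static void vlog_put(struct vlog_mock *l, const char *s)
{
    for (; *s; s++) {
        l->buf[l->end % sizeof(l->buf)] = *s;
        l->end++;
        if (l->end - l->start > sizeof(l->buf))
            l->start = l->end - sizeof(l->buf);  /* overwrite oldest bytes */
    }
}

/* Rotate contents so the log reads as one contiguous NUL-terminated
 * string starting at out[0] (out must hold sizeof(buf) + 1 bytes). */
static void vlog_finalize(const struct vlog_mock *l, char *out)
{
    unsigned long long n = l->end - l->start, i;
    for (i = 0; i < n; i++)
        out[i] = l->buf[(l->start + i) % sizeof(l->buf)];
    out[n] = '\0';
}
```

With an 8-byte buffer, writing 12 bytes total leaves only the last 8 logical bytes, handed back contiguously, which is the "important last N bytes" behavior the commit describes.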
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Lorenz Bauer <lmb@isovalent.com>
    Link: https://lore.kernel.org/bpf/20230406234205.323208-4-andrii@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:27 +02:00
Artem Savkov b489f7394c bpf: Refactor btf_nested_type_is_trusted().
Bugzilla: https://bugzilla.redhat.com/2221599

commit 63260df1396578226ac3134cf7f764690002e70e
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Mon Apr 3 21:50:24 2023 -0700

    bpf: Refactor btf_nested_type_is_trusted().
    
    btf_nested_type_is_trusted() tries to find a struct member at the
    corresponding offset. It works for flat structures but falls apart for
    more complex structs with nested structs.
    The offset->member search is already performed by btf_struct_walk() including nested structs.
    Reuse this work and pass {field name, field btf id} into btf_nested_type_is_trusted()
    instead of offset to make BTF_TYPE_SAFE*() logic more robust.
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/bpf/20230404045029.82870-4-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:26 +02:00
Artem Savkov b5247b9d85 bpf: Disable migration when freeing stashed local kptr using obj drop
Bugzilla: https://bugzilla.redhat.com/2221599

commit 9e36a204bd43553a9cd4bd574612cd9a5df791ea
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Mon Mar 13 14:46:41 2023 -0700

    bpf: Disable migration when freeing stashed local kptr using obj drop
    
    When a local kptr is stashed in a map and freed when the map goes away,
    currently an error like the below appears:
    
    [   39.195695] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u32:15/2875
    [   39.196549] caller is bpf_mem_free+0x56/0xc0
    [   39.196958] CPU: 15 PID: 2875 Comm: kworker/u32:15 Tainted: G           O       6.2.0-13016-g22df776a9a86 #4477
    [   39.197897] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
    [   39.198949] Workqueue: events_unbound bpf_map_free_deferred
    [   39.199470] Call Trace:
    [   39.199703]  <TASK>
    [   39.199911]  dump_stack_lvl+0x60/0x70
    [   39.200267]  check_preemption_disabled+0xbf/0xe0
    [   39.200704]  bpf_mem_free+0x56/0xc0
    [   39.201032]  ? bpf_obj_new_impl+0xa0/0xa0
    [   39.201430]  bpf_obj_free_fields+0x1cd/0x200
    [   39.201838]  array_map_free+0xad/0x220
    [   39.202193]  ? finish_task_switch+0xe5/0x3c0
    [   39.202614]  bpf_map_free_deferred+0xea/0x210
    [   39.203006]  ? lockdep_hardirqs_on_prepare+0xe/0x220
    [   39.203460]  process_one_work+0x64f/0xbe0
    [   39.203822]  ? pwq_dec_nr_in_flight+0x110/0x110
    [   39.204264]  ? do_raw_spin_lock+0x107/0x1c0
    [   39.204662]  ? lockdep_hardirqs_on_prepare+0xe/0x220
    [   39.205107]  worker_thread+0x74/0x7a0
    [   39.205451]  ? process_one_work+0xbe0/0xbe0
    [   39.205818]  kthread+0x171/0x1a0
    [   39.206111]  ? kthread_complete_and_exit+0x20/0x20
    [   39.206552]  ret_from_fork+0x1f/0x30
    [   39.206886]  </TASK>
    
    This happens because the call to __bpf_obj_drop_impl I added in the patch
    adding support for stashing local kptrs doesn't disable migration. Prior
    to that patch, __bpf_obj_drop_impl logic only ran when called by a BPF
    program, whereas now it can be called from the map free path, so it's
    necessary to explicitly disable migration.
    
    Also, refactor a bit to just call __bpf_obj_drop_impl directly instead
    of bothering w/ dtor union and setting pointer-to-obj_drop.
    
    Fixes: c8e187540914 ("bpf: Support __kptr to local kptrs")
    Reported-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230313214641.3731908-1-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:16 +02:00
Artem Savkov 400701606d bpf: Support __kptr to local kptrs
Bugzilla: https://bugzilla.redhat.com/2221599

commit c8e18754091479fac3f5b6c053c6bc4be0b7fb11
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Fri Mar 10 15:07:41 2023 -0800

    bpf: Support __kptr to local kptrs
    
    If a PTR_TO_BTF_ID type comes from program BTF - not vmlinux or module
    BTF - it must have been allocated by bpf_obj_new and therefore must be
    free'd with bpf_obj_drop. Such a PTR_TO_BTF_ID is considered a "local
    kptr" and is tagged with MEM_ALLOC type tag by bpf_obj_new.
    
    This patch adds support for treating __kptr-tagged pointers to "local
    kptrs" as having an implicit bpf_obj_drop destructor for referenced kptr
    acquire / release semantics. Consider the following example:
    
      struct node_data {
              long key;
              long data;
              struct bpf_rb_node node;
      };
    
      struct map_value {
              struct node_data __kptr *node;
      };
    
      struct {
              __uint(type, BPF_MAP_TYPE_ARRAY);
              __type(key, int);
              __type(value, struct map_value);
              __uint(max_entries, 1);
      } some_nodes SEC(".maps");
    
    If struct node_data had a matching definition in kernel BTF, the verifier would
    expect a destructor for the type to be registered. Since struct node_data does
    not match any type in kernel BTF, the verifier knows that there is no kfunc
    that provides a PTR_TO_BTF_ID to this type, and that such a PTR_TO_BTF_ID can
    only come from bpf_obj_new. So instead of searching for a registered dtor,
    a bpf_obj_drop dtor can be assumed.
    
    This allows the runtime to properly destruct such kptrs in
    bpf_obj_free_fields, which enables maps to clean up map_vals w/ such
    kptrs when going away.
    
    Implementation notes:
      * "kernel_btf" variable is renamed to "kptr_btf" in btf_parse_kptr.
        Before this patch, the variable would only ever point to vmlinux or
        module BTFs, but now it can point to some program BTF for local kptr
        type. It's later used to populate the (btf, btf_id) pair in kptr btf
        field.
      * It's necessary to btf_get the program BTF when populating btf_field
        for local kptr. btf_record_free later does a btf_put.
      * Behavior for non-local referenced kptrs is not modified, as
        bpf_find_btf_id helper only searches vmlinux and module BTFs for
        matching BTF type. If such a type is found, btf_field_kptr's btf will
        pass btf_is_kernel check, and the associated release function is
        some one-argument dtor. If btf_is_kernel check fails, associated
        release function is two-arg bpf_obj_drop_impl. Before this patch
        only btf_field_kptr's w/ kernel or module BTFs were created.
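    The release-function choice described in the notes above can be
    sketched as a tiny userspace C model (illustrative only; the enum and
    helper names here are made up, not the kernel's):

```c
#include <stdbool.h>

/* Toy model of the dtor selection for kptr fields: kernel/module BTF
 * types get a registered one-argument dtor, while local kptrs (program
 * BTF) fall back to the two-argument bpf_obj_drop-style release. */
enum btf_origin { BTF_KERNEL, BTF_MODULE, BTF_PROG };

static bool btf_is_kernel_like(enum btf_origin origin)
{
	return origin == BTF_KERNEL || origin == BTF_MODULE;
}

static const char *pick_release_fn(enum btf_origin origin)
{
	return btf_is_kernel_like(origin) ? "one_arg_registered_dtor"
					  : "two_arg_bpf_obj_drop_impl";
}
```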
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230310230743.2320707-2-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:16 +02:00
Artem Savkov 3fbb5641e4 bpf: btf: Remove unused btf_field_info_type enum
Bugzilla: https://bugzilla.redhat.com/2221599

commit a4aa38897b6a6dad4318bed036edc7ed0c8a4578
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Thu Mar 9 10:01:07 2023 -0800

    bpf: btf: Remove unused btf_field_info_type enum
    
    This enum was added and used in commit aa3496accc41 ("bpf: Refactor kptr_off_tab
    into btf_record"). Later refactoring in commit db559117828d ("bpf: Consolidate
    spin_lock, timer management into btf_record") resulted in the enum
    values no longer being used anywhere.
    
    Let's remove them.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230309180111.1618459-3-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:16 +02:00
Artem Savkov 80b8dda2ea bpf: add iterator kfuncs registration and validation logic
Bugzilla: https://bugzilla.redhat.com/2221599

commit 215bf4962f6c9605710012fad222a5fec001b3ad
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Mar 8 10:41:15 2023 -0800

    bpf: add iterator kfuncs registration and validation logic
    
    Add ability to register kfuncs that implement BPF open-coded iterator
    contract and enforce naming and function proto convention. Enforcement
    happens at the time of kfunc registration and significantly simplifies
    the rest of iterators logic in the verifier.
    
    More details follow in subsequent patches, but we enforce the following
    conditions.
    
    All kfuncs (constructor, next, destructor) have to be named consistently
    as bpf_iter_<type>_{new,next,destroy}(), respectively. <type> represents
    iterator type, and iterator state should be represented as a matching
    `struct bpf_iter_<type>` state type. Also, all iter kfuncs should have
    a pointer to this `struct bpf_iter_<type>` as the very first argument.
    
    Additionally:
      - Constructor, i.e., bpf_iter_<type>_new(), can have arbitrary extra
      number of arguments. Return type is not enforced either.
      - Next method, i.e., bpf_iter_<type>_next(), has to return a pointer
      type and should have exactly one argument: `struct bpf_iter_<type> *`
      (const/volatile/restrict and typedefs are ignored).
      - Destructor, i.e., bpf_iter_<type>_destroy(), should return void and
      should have exactly one argument, similar to the next method.
      - struct bpf_iter_<type> size is enforced to be positive and
      a multiple of 8 bytes (to fit stack slots correctly).
    
    Such strictness and consistency allow us to build generic helpers
    abstracting important, but boilerplate, details to be able to use
    open-coded iterators effectively and ergonomically (see bpf_for_each()
    in subsequent patches). It also simplifies the verifier logic in some
    places. At the same time, this doesn't hurt generality of possible
    iterator implementations. Win-win.
    
    Constructor kfunc is marked with a new KF_ITER_NEW flag, the next method is
    marked with KF_ITER_NEXT (and should also have KF_RET_NULL, of course),
    while destructor kfunc is marked as KF_ITER_DESTROY.
    
    Additionally, we add a trivial kfunc name validation: it should be
    a valid non-NULL and non-empty string.
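    The naming and size rules above can be sketched as a small userspace
    C model (illustrative only; this is not the kernel's registration
    code, and the helper names here are made up):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Check that a kfunc name follows bpf_iter_<type>_<suffix> with a
 * non-empty <type>, where suffix is "new", "next" or "destroy". */
static bool is_iter_kfunc_name(const char *name, const char *suffix)
{
	const char *prefix = "bpf_iter_";
	size_t plen = strlen(prefix);
	size_t slen = strlen(suffix);
	size_t nlen;

	if (!name || !name[0])
		return false;			/* non-NULL, non-empty */
	if (strncmp(name, prefix, plen) != 0)
		return false;			/* must start with bpf_iter_ */
	nlen = strlen(name);
	if (nlen <= plen + 1 + slen)
		return false;			/* <type> part must be non-empty */
	return name[nlen - slen - 1] == '_' &&
	       strcmp(name + nlen - slen, suffix) == 0;
}

/* struct bpf_iter_<type> size: positive and a multiple of 8 bytes. */
static bool is_valid_iter_state_size(size_t size)
{
	return size > 0 && size % 8 == 0;
}
```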
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20230308184121.1165081-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:13 +02:00
Artem Savkov 1cbef93115 bpf: Refactor RCU enforcement in the verifier.
Bugzilla: https://bugzilla.redhat.com/2221599

commit 6fcd486b3a0a628c41f12b3a7329a18a2c74b351
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Thu Mar 2 20:14:46 2023 -0800

    bpf: Refactor RCU enforcement in the verifier.
    
    bpf_rcu_read_lock/unlock() are only available in clang compiled kernels. Lack
    of such a key mechanism makes it impossible for sleepable bpf programs to use RCU
    pointers.
    
    Allow bpf_rcu_read_lock/unlock() in GCC compiled kernels (though GCC doesn't
    support btf_type_tag yet) and allowlist certain field dereferences in important
    data structures like task_struct, cgroup, socket that are used by sleepable
    programs either as RCU pointer or full trusted pointer (which is valid outside
    of RCU CS). Use BTF_TYPE_SAFE_RCU and BTF_TYPE_SAFE_TRUSTED macros for such
    tagging. They will be removed once GCC supports btf_type_tag.
    
    With that refactor check_ptr_to_btf_access(). Make it strict in enforcing
    PTR_TRUSTED and PTR_UNTRUSTED while deprecating old PTR_TO_BTF_ID without
    modifier flags. There is a chance that this strict enforcement might break
    existing programs (especially on GCC compiled kernels), but this cleanup has to
    start sooner than later. Note PTR_TO_CTX access still yields old deprecated
    PTR_TO_BTF_ID. Once it's converted to strict PTR_TRUSTED or PTR_UNTRUSTED the
    kfuncs and helpers will be able to default to KF_TRUSTED_ARGS. KF_RCU will
    remain as a weaker version of KF_TRUSTED_ARGS where obj refcnt could be 0.
    
    Adjust rcu_read_lock selftest to run on gcc and clang compiled kernels.
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/bpf/20230303041446.3630-7-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:10 +02:00
Artem Savkov 987a6a9c19 bpf: Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted.
Bugzilla: https://bugzilla.redhat.com/2221599

commit 03b77e17aeb22a5935ea20d585ca6a1f2947e62b
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Thu Mar 2 20:14:41 2023 -0800

    bpf: Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted.
    
    __kptr was meant to store PTR_UNTRUSTED kernel pointers inside bpf maps.
    The concept felt useful, but didn't get much traction,
    since bpf_rdonly_cast() was added soon after and bpf programs received
    a simpler way to access PTR_UNTRUSTED kernel pointers
    without going through restrictive __kptr usage.
    
    Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted to indicate
    its intended usage.
    The main goal of __kptr_untrusted was to read/write such pointers
    directly while bpf_kptr_xchg was a mechanism to access refcnted
    kernel pointers. The next patch will allow RCU protected __kptr access
    with direct read. At that point __kptr_untrusted will be deprecated.
    
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/bpf/20230303041446.3630-2-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:10 +02:00
Artem Savkov 6735ca36be bpf: Add skb dynptrs
Bugzilla: https://bugzilla.redhat.com/2221599

commit b5964b968ac64c2ec2debee7518499113b27c34e
Author: Joanne Koong <joannelkoong@gmail.com>
Date:   Wed Mar 1 07:49:50 2023 -0800

    bpf: Add skb dynptrs
    
    Add skb dynptrs, which are dynptrs whose underlying pointer points
    to a skb. The dynptr acts on skb data. skb dynptrs have two main
    benefits. One is that they allow operations on sizes that are not
    statically known at compile-time (eg variable-sized accesses).
    Another is that parsing the packet data through dynptrs (instead of
    through direct access of skb->data and skb->data_end) can be more
    ergonomic and less brittle (eg does not need manual if checking for
    being within bounds of data_end).
    
    For bpf prog types that don't support writes on skb data, the dynptr is
    read-only (bpf_dynptr_write() will return an error).
    
    For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write()
    interfaces, reading and writing from/to data in the head as well as from/to
    non-linear paged buffers is supported. Data slices through the
    bpf_dynptr_data API are not supported; instead bpf_dynptr_slice() and
    bpf_dynptr_slice_rdwr() (added in subsequent commit) should be used.
    
    For examples of how skb dynptrs can be used, please see the attached
    selftests.
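    The safety property dynptrs provide over raw skb->data/data_end
    access can be modeled in a few lines of plain C (a toy userspace
    sketch under made-up names, not the kernel dynptr API):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy dynptr: every access is bounds-checked at runtime, so callers
 * never do manual data_end comparisons. */
struct toy_dynptr {
	unsigned char *data;
	size_t size;
	bool rdonly;		/* write protection for read-only progs */
};

static int toy_dynptr_read(void *dst, size_t len,
			   const struct toy_dynptr *p, size_t offset)
{
	if (offset > p->size || len > p->size - offset)
		return -1;	/* out of bounds -> error, not a crash */
	memcpy(dst, p->data + offset, len);
	return 0;
}

static int toy_dynptr_write(struct toy_dynptr *p, size_t offset,
			    const void *src, size_t len)
{
	if (p->rdonly)
		return -1;	/* mirrors bpf_dynptr_write() erroring on read-only */
	if (offset > p->size || len > p->size - offset)
		return -1;
	memcpy(p->data + offset, src, len);
	return 0;
}
```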
    
    Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
    Link: https://lore.kernel.org/r/20230301154953.641654-8-joannelkoong@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:08 +02:00
Artem Savkov a8df2ed96d bpf: Support "sk_buff" and "xdp_buff" as valid kfunc arg types
Bugzilla: https://bugzilla.redhat.com/2221599

commit 2f46439346700a2b41cf0fa9432f110f42fd8821
Author: Joanne Koong <joannelkoong@gmail.com>
Date:   Wed Mar 1 07:49:44 2023 -0800

    bpf: Support "sk_buff" and "xdp_buff" as valid kfunc arg types
    
    The bpf mirrors of the in-kernel sk_buff and xdp_buff data structures are
    __sk_buff and xdp_md. Currently, when we pass in the program ctx to a
    kfunc where the program ctx is a skb or xdp buffer, we reject the
    program if the in-kernel definition is sk_buff/xdp_buff instead of
    __sk_buff/xdp_md.
    
    This change allows "sk_buff <--> __sk_buff" and "xdp_buff <--> xdp_md"
    to be recognized as valid matches. The user program may pass in their
    program ctx as a __sk_buff or xdp_md, and the in-kernel definition
    of the kfunc may define this arg as a sk_buff or xdp_buff.
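    The equivalence added here can be sketched as a pair-matching check
    in plain C (an illustrative userspace model; the function name is
    made up):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A program ctx type matches a kfunc arg type if the names are equal
 * or they form one of the known user-facing/in-kernel mirror pairs. */
static bool ctx_types_match(const char *prog_ctx, const char *kernel_arg)
{
	static const char * const pairs[][2] = {
		{ "__sk_buff", "sk_buff"  },
		{ "xdp_md",    "xdp_buff" },
	};
	size_t i;

	for (i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (strcmp(prog_ctx, pairs[i][0]) == 0 &&
		    strcmp(kernel_arg, pairs[i][1]) == 0)
			return true;
	return strcmp(prog_ctx, kernel_arg) == 0;
}
```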
    
    Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
    Link: https://lore.kernel.org/r/20230301154953.641654-2-joannelkoong@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-09-22 09:12:08 +02:00
Viktor Malik 8930f46acc btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
Bugzilla: https://bugzilla.redhat.com/2178930

commit 9b459804ff9973e173fabafba2a1319f771e85fa
Author: Lorenz Bauer <lorenz.bauer@isovalent.com>
Date:   Mon Mar 6 11:21:37 2023 +0000

    btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
    
    btf_datasec_resolve contains a bug that causes the following BTF
    to fail loading:
    
        [1] DATASEC a size=2 vlen=2
            type_id=4 offset=0 size=1
            type_id=7 offset=1 size=1
        [2] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none)
        [3] PTR (anon) type_id=2
        [4] VAR a type_id=3 linkage=0
        [5] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none)
        [6] TYPEDEF td type_id=5
        [7] VAR b type_id=6 linkage=0
    
    This error message is printed during btf_check_all_types:
    
        [1] DATASEC a size=2 vlen=2
            type_id=7 offset=1 size=1 Invalid type
    
    By tracing btf_*_resolve we can pinpoint the problem:
    
        btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_TBD) = 0
            btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_TBD) = 0
                btf_ptr_resolve(depth: 3, type_id: 3, mode: RESOLVE_PTR) = 0
            btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_PTR) = 0
        btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_PTR) = -22
    
    The last invocation of btf_datasec_resolve should invoke btf_var_resolve
    by means of env_stack_push; instead it returns EINVAL. The reason is that
    env_stack_push is never executed for the second VAR.
    
        if (!env_type_is_resolve_sink(env, var_type) &&
            !env_type_is_resolved(env, var_type_id)) {
            env_stack_set_next_member(env, i + 1);
            return env_stack_push(env, var_type, var_type_id);
        }
    
    env_type_is_resolve_sink() changes its behaviour based on resolve_mode.
    For RESOLVE_PTR, we can simplify the if condition to the following:
    
        (btf_type_is_modifier() || btf_type_is_ptr) && !env_type_is_resolved()
    
    Since we're dealing with a VAR the clause evaluates to false. This is
    not sufficient to trigger the bug however. The log output and EINVAL
    are only generated if btf_type_id_size() fails.
    
        if (!btf_type_id_size(btf, &type_id, &type_size)) {
            btf_verifier_log_vsi(env, v->t, vsi, "Invalid type");
            return -EINVAL;
        }
    
    Most types are sized, so for example a VAR referring to an INT is not a
    problem. The bug is only triggered if a VAR points at a modifier. Since
    we skipped btf_var_resolve that modifier was also never resolved, which
    means that btf_resolved_type_id returns 0 aka VOID for the modifier.
    This in turn causes btf_type_id_size to return NULL, triggering EINVAL.
    
    To summarise, the following conditions are necessary:
    
    - VAR pointing at PTR, STRUCT, UNION or ARRAY
    - Followed by a VAR pointing at TYPEDEF, VOLATILE, CONST, RESTRICT or
      TYPE_TAG
    
    The fix is to reset resolve_mode to RESOLVE_TBD before attempting to
    resolve a VAR from a DATASEC.
    
    Fixes: 1dc9285184 ("bpf: kernel side support for BTF Var and DataSec")
    Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
    Link: https://lore.kernel.org/r/20230306112138.155352-2-lmb@isovalent.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:45:44 +02:00
Viktor Malik b9ffc5c79f bpf: Fix global subprog context argument resolution logic
Bugzilla: https://bugzilla.redhat.com/2178930

commit d384dce281ed1b504fae2e279507827638d56fa3
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Feb 15 20:59:52 2023 -0800

    bpf: Fix global subprog context argument resolution logic
    
    KPROBE program's user-facing context type is defined as typedef
    bpf_user_pt_regs_t. This leads to a problem when trying to passing
    kprobe/uprobe/usdt context argument into global subprog, as kernel
    always strip away mods and typedefs of user-supplied type, but takes
    expected type from bpf_ctx_convert as is, which causes mismatch.
    
    Current way to work around this is to define a fake struct with the same
    name as expected typedef:
    
      struct bpf_user_pt_regs_t {};
    
      __noinline my_global_subprog(struct bpf_user_pt_regs_t *ctx) { ... }
    
    This patch fixes the issue by resolving expected type, if it's not
    a struct. It still leaves the above work-around working for backwards
    compatibility.
    
    Fixes: 91cc1a9974 ("bpf: Annotate context types")
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/bpf/20230216045954.3002473-2-andrii@kernel.org

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:45:40 +02:00
Viktor Malik c9e4819d61 bpf: Special verifier handling for bpf_rbtree_{remove, first}
Bugzilla: https://bugzilla.redhat.com/2178930

commit a40d3632436b1677a94c16e77be8da798ee9e12b
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Mon Feb 13 16:40:14 2023 -0800

    bpf: Special verifier handling for bpf_rbtree_{remove, first}
    
    Newly-added bpf_rbtree_{remove,first} kfuncs have some special properties
    that require handling in the verifier:
    
      * both bpf_rbtree_remove and bpf_rbtree_first return the type containing
        the bpf_rb_node field, with the offset set to that field's offset,
        instead of a struct bpf_rb_node *
        * mark_reg_graph_node helper added in previous patch generalizes
          this logic, use it
    
      * bpf_rbtree_remove's node input is a node that's been inserted
        in the tree - a non-owning reference.
    
      * bpf_rbtree_remove must invalidate non-owning references in order to
        avoid aliasing issue. Use previously-added
        invalidate_non_owning_refs helper to mark this function as a
        non-owning ref invalidation point.
    
      * Unlike other functions, which convert one of their input arg regs to
        non-owning reference, bpf_rbtree_first takes no arguments and just
        returns a non-owning reference (possibly null)
        * For now verifier logic for this is special-cased instead of
          adding new kfunc flag.
    
    This patch, along with the previous one, complete special verifier
    handling for all rbtree API functions added in this series.
    
    With functional verifier handling of rbtree_remove, under the current
    non-owning reference scheme, a node type with both bpf_{list,rb}_node
    fields could cause the verifier to accept programs which remove such
    nodes from collections they haven't been added to.
    
    In order to prevent this, this patch adds a check to btf_parse_fields
    which rejects structs with both bpf_{list,rb}_node fields. This is a
    temporary measure that can be removed after "collection identity"
    followup. See comment added in btf_parse_fields. A linked_list BTF test
    exercising the new check is added in this patch as well.
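    The temporary restriction described above boils down to a flag check
    on the struct's collected fields, sketched here in plain C (names
    are illustrative, not the kernel's):

```c
#include <stdbool.h>

/* Toy field-type flags standing in for btf_record contents. */
#define FIELD_LIST_NODE (1u << 0)
#define FIELD_RB_NODE   (1u << 1)

/* A struct may carry a bpf_list_node or a bpf_rb_node, but not both,
 * until the "collection identity" followup lands. */
static bool record_fields_allowed(unsigned int field_mask)
{
	return (field_mask & (FIELD_LIST_NODE | FIELD_RB_NODE)) !=
	       (FIELD_LIST_NODE | FIELD_RB_NODE);
}
```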
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230214004017.2534011-6-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:45:30 +02:00
Viktor Malik 7e487d11fc bpf: Add basic bpf_rb_{root,node} support
Bugzilla: https://bugzilla.redhat.com/2178930

commit 9c395c1b99bd23f74bc628fa000480c49593d17f
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Mon Feb 13 16:40:10 2023 -0800

    bpf: Add basic bpf_rb_{root,node} support
    
    This patch adds special BPF_RB_{ROOT,NODE} btf_field_types similar to
    BPF_LIST_{HEAD,NODE}, adds the necessary plumbing to detect the new
    types, and adds bpf_rb_root_free function for freeing bpf_rb_root in
    map_values.
    
    structs bpf_rb_root and bpf_rb_node are opaque types meant to
    obscure structs rb_root_cached and rb_node, respectively.
    
    btf_struct_access will prevent BPF programs from touching these special
    fields automatically now that they're recognized.
    
    btf_check_and_fixup_fields now groups list_head and rb_root together as
    "graph root" fields and {list,rb}_node as "graph node", and does same
    ownership cycle checking as before. Note that this function does _not_
    prevent ownership type mixups (e.g. rb_root owning list_node) - that's
    handled by btf_parse_graph_root.
    
    After this patch, a bpf program can have a struct bpf_rb_root in a
    map_value, but not add anything to nor do anything useful with it.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20230214004017.2534011-2-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:45:29 +02:00
Viktor Malik dbb525e729 bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
Bugzilla: https://bugzilla.redhat.com/2178930

commit 49f67f393ff264e8d83f6fcec0728a6aa8eed102
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date:   Sat Jan 28 01:06:44 2023 +0100

    bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
    
    s390x eBPF JIT needs to know whether a function return value is signed
    and which function arguments are signed, in order to generate code
    compliant with the s390x ABI.
    
    Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Link: https://lore.kernel.org/r/20230128000650.1516334-26-iii@linux.ibm.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:45:02 +02:00
Viktor Malik 3588e87eaa bpf: Allow trusted args to walk struct when checking BTF IDs
Bugzilla: https://bugzilla.redhat.com/2178930

commit b613d335a743cf0e0ef0ccba9ad129904e2a26fb
Author: David Vernet <void@manifault.com>
Date:   Fri Jan 20 13:25:16 2023 -0600

    bpf: Allow trusted args to walk struct when checking BTF IDs
    
    When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
    currently enforces that the top-level type must match when calling
    the kfunc. In other words, the verifier does not allow the BPF program
    to pass a bitwise equivalent struct, despite it being allowed according
    to the C standard.
    
    For example, if you have the following type:
    
    struct  nf_conn___init {
    	struct nf_conn ct;
    };
    
    The C standard stipulates that it would be safe to pass a struct
    nf_conn___init to a kfunc expecting a struct nf_conn. The verifier
    currently disallows this, however, as semantically kfuncs may want to
    enforce that structs that have equivalent types according to the C
    standard, but have different BTF IDs, are not able to be passed to
    kfuncs expecting one or the other. For example, struct nf_conn___init
    may not be queried / looked up, as it is allocated but may not yet be
    fully initialized.
    
    On the other hand, being able to pass types that are equivalent
    according to the C standard will be useful for other types of kfunc /
    kptrs enabled by BPF.  For example, in a follow-on patch, a series of
    kfuncs will be added which allow programs to do bitwise queries on
    cpumasks that are either allocated by the program (in which case they'll
    be a 'struct bpf_cpumask' type that wraps a cpumask_t as its first
    element), or a cpumask that was allocated by the main kernel (in which
    case it will just be a straight cpumask_t, as in task->cpus_ptr).
    
    Having the two types of cpumasks allows us to distinguish between the
    two for when a cpumask is read-only vs. mutable. A struct bpf_cpumask
    can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
    cannot be. On the other hand, a struct bpf_cpumask can of course be
    queried in the exact same manner as a cpumask_t, with e.g.
    bpf_cpumask_test_cpu().
    
    If we were to enforce that top level types match, then a user that's
    passing a struct bpf_cpumask to a read-only cpumask_t argument would
    have to cast with something like bpf_cast_to_kern_ctx() (which itself
    would need to be updated to expect the alias, and currently it only
    accommodates a single alias per prog type). Additionally, not specifying
    KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
    struct bpf_cpumask *, and another as a struct cpumask *
    (i.e. cpumask_t).
    
    In order to enable this, this patch relaxes the constraint that a
    KF_TRUSTED_ARGS kfunc must have strict type matching, and instead only
    enforces strict type matching if a type is observed to be a "no-cast
    alias" (i.e., that the type names are equivalent, but one is suffixed
    with ___init).
    
    Additionally, in order to try and be conservative and match existing
    behavior / expectations, this patch also enforces strict type checking
    for acquire kfuncs. We were already enforcing it for release kfuncs, so
    this should also improve the consistency of the semantics for kfuncs.
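    The "no-cast alias" rule reduces to a suffix check on the two type
    names, sketched here as a standalone C helper (illustrative model,
    not the verifier's implementation):

```c
#include <stdbool.h>
#include <string.h>

/* Two type names form a no-cast alias pair when one is exactly the
 * other plus an "___init" suffix (e.g. nf_conn vs nf_conn___init);
 * only then is strict type matching kept. */
static bool is_nocast_alias(const char *name, const char *alias)
{
	const char *suffix = "___init";
	size_t nlen = strlen(name);
	size_t alen = strlen(alias);

	return alen == nlen + strlen(suffix) &&
	       strncmp(alias, name, nlen) == 0 &&
	       strcmp(alias + nlen, suffix) == 0;
}
```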
    
    Signed-off-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20230120192523.3650503-3-void@manifault.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:49 +02:00
Viktor Malik 726cfbc70b bpf: Enable annotating trusted nested pointers
Bugzilla: https://bugzilla.redhat.com/2178930

commit 57539b1c0ac2dcccbe64a7675ff466be009c040f
Author: David Vernet <void@manifault.com>
Date:   Fri Jan 20 13:25:15 2023 -0600

    bpf: Enable annotating trusted nested pointers
    
    In kfuncs, a "trusted" pointer is a pointer that the kfunc can assume is
    safe, and which the verifier will allow to be passed to a
    KF_TRUSTED_ARGS kfunc. Currently, a KF_TRUSTED_ARGS kfunc disallows any
    pointer to be passed at a nonzero offset, but sometimes this is in fact
    safe if the "nested" pointer's lifetime is inherited from its parent.
    For example, the const cpumask_t *cpus_ptr field in a struct task_struct
    will remain valid until the task itself is destroyed, and thus would
    also be safe to pass to a KF_TRUSTED_ARGS kfunc.
    
    While it would be conceptually simple to enable this by using BTF tags,
    gcc unfortunately does not yet support this. In the interim, this patch
    enables support for this by using a type-naming convention. A new
    BTF_TYPE_SAFE_NESTED macro is defined in verifier.c which allows a
    developer to specify the nested fields of a type which are considered
    trusted if its parent is also trusted. The verifier is also updated to
    account for this. A patch with selftests will be added in a follow-on
    change, along with documentation for this feature.
    
    Signed-off-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20230120192523.3650503-2-void@manifault.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:49 +02:00
Viktor Malik 99f1beb2d0 bpf: btf: limit logging of ignored BTF mismatches
Bugzilla: https://bugzilla.redhat.com/2178930

commit 9cb61e50bf6bf54db712bba6cf20badca4383f96
Author: Connor O'Brien <connoro@google.com>
Date:   Sat Jan 7 02:53:31 2023 +0000

    bpf: btf: limit logging of ignored BTF mismatches
    
    Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF
    mismatches are expected and module loading should proceed
    anyway. Logging with pr_warn() on every one of these "benign"
    mismatches creates unnecessary noise when many such modules are
    loaded. Instead, handle this case with a single log warning that BTF
    info may be unavailable.
    
    Mismatches also result in calls to __btf_verifier_log() via
    __btf_verifier_log_type() or btf_verifier_log_member(), adding several
    additional lines of logging per mismatched module. Add checks to these
    paths to skip logging for module BTF mismatches in the "allow
    mismatch" case.
    
    All existing logging behavior is preserved in the default
    CONFIG_MODULE_ALLOW_BTF_MISMATCH=n case.
    
    Signed-off-by: Connor O'Brien <connoro@google.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20230107025331.3240536-1-connoro@google.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:35 +02:00
Viktor Malik 8a9358eb85 bpf: rename list_head -> graph_root in field info types
Bugzilla: https://bugzilla.redhat.com/2178930

commit 30465003ad776a922c32b2dac58db14f120f037e
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Sat Dec 17 00:24:57 2022 -0800

    bpf: rename list_head -> graph_root in field info types
    
    Many of the structs recently added to track field info for linked-list
    head are useful as-is for rbtree root. So let's do a mechanical renaming
    of list_head-related types and fields:
    
    include/linux/bpf.h:
      struct btf_field_list_head -> struct btf_field_graph_root
      list_head -> graph_root in struct btf_field union
    kernel/bpf/btf.c:
      list_head -> graph_root in struct btf_field_info
    
    This is a nonfunctional change; functionality to actually use these
    fields for rbtree will be added in further patches.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20221217082506.1570898-5-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:30 +02:00
Jerome Marchand 95f498c7c9 bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
Bugzilla: https://bugzilla.redhat.com/2177177

commit 74bc3a5acc82f020d2e126f56c535d02d1e74e37
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Fri Jan 20 13:21:48 2023 +0100

    bpf: Add missing btf_put to register_btf_id_dtor_kfuncs

    We take the BTF reference before we register dtors and we need
    to put it back when it's done.

    We probably won't see a problem with kernel BTF, but module BTF
    would stay loaded (because of the extra ref) even when its module
    is removed.

    Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Fixes: 5ce937d613a4 ("bpf: Populate pairs of btf_id and destructor kfunc in btf")
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20230120122148.1522359-1-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:21 +02:00
Jerome Marchand 6af9035ad8 bpf: do not rely on ALLOW_ERROR_INJECTION for fmod_ret
Bugzilla: https://bugzilla.redhat.com/2177177

commit 5b481acab4ce017fda8166fa9428511da41109e5
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Dec 6 15:59:32 2022 +0100

    bpf: do not rely on ALLOW_ERROR_INJECTION for fmod_ret
    
    The current way of expressing that a non-bpf kernel component is willing
    to accept that bpf programs can be attached to it and that they can change
    the return value is to abuse ALLOW_ERROR_INJECTION.
    This is debated in the link below, and the result is that it is not a
    reasonable thing to do.
    
    Reuse the kfunc declaration structure to also tag the kernel functions
    we want to be fmodret. This way we can control from any subsystem which
    functions are being modified by bpf without touching the verifier.
    
    Link: https://lore.kernel.org/all/20221121104403.1545f9b5@gandalf.local.home/
    Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/r/20221206145936.922196-2-benjamin.tissoires@redhat.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:16 +02:00
Jerome Marchand 4b450e77be bpf: Do not mark certain LSM hook arguments as trusted
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Context change due to missing commit 401e64b3a4af
("bpf-lsm: Make bpf_lsm_userns_create() sleepable")

commit c0c852dd1876dc1db4600ce951a92aadd3073b1c
Author: Yonghong Song <yhs@fb.com>
Date:   Sat Dec 3 12:49:54 2022 -0800

    bpf: Do not mark certain LSM hook arguments as trusted

    Martin mentioned that the verifier cannot assume arguments from
    LSM hook sk_alloc_security being trusted since after the hook
    is called, the sk ref_count is set to 1. This will overwrite
    the ref_count changed by the bpf program and may cause ref_count
    underflow later on.

    I then further checked some other hooks. For example,
    for bpf_lsm_file_alloc() hook in fs/file_table.c,

            f->f_cred = get_cred(cred);
            error = security_file_alloc(f);
            if (unlikely(error)) {
                    file_free_rcu(&f->f_rcuhead);
                    return ERR_PTR(error);
            }

            atomic_long_set(&f->f_count, 1);

    The input parameter 'f' to security_file_alloc() cannot be trusted
    as well.

    Specifically, I investigated bpf_map/bpf_prog/file/sk/task alloc/free
    lsm hooks. Except bpf_map_alloc and task_alloc, arguments for all other
    hooks should not be considered as trusted. This may not be a complete
    list, but it covers common usage for sk and task.

    Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs")
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221203204954.2043348-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:14 +02:00
Jerome Marchand 0ec7796171 bpf: Add kfunc bpf_rcu_read_lock/unlock()
Bugzilla: https://bugzilla.redhat.com/2177177

commit 9bb00b2895cbfe0ad410457b605d0a72524168c1
Author: Yonghong Song <yhs@fb.com>
Date:   Wed Nov 23 21:32:17 2022 -0800

    bpf: Add kfunc bpf_rcu_read_lock/unlock()

    Add two kfuncs, bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These
    can be used for all program types. The following is an example of how
    rcu pointers are used w.r.t. bpf_rcu_read_lock()/bpf_rcu_read_unlock().

      struct task_struct {
        ...
        struct task_struct              *last_wakee;
        struct task_struct __rcu        *real_parent;
        ...
      };

    Let us say prog does 'task = bpf_get_current_task_btf()' to get a
    'task' pointer. The basic rules are:
      - 'real_parent = task->real_parent' should be inside bpf_rcu_read_lock
        region. This is to simulate rcu_dereference() operation. The
        'real_parent' is marked as MEM_RCU only if (1). task->real_parent is
        inside bpf_rcu_read_lock region, and (2). task is a trusted ptr. So
        MEM_RCU marked ptr can be 'trusted' inside the bpf_rcu_read_lock region.
      - 'last_wakee = real_parent->last_wakee' should be inside bpf_rcu_read_lock
        region since it tries to access rcu protected memory.
      - the ptr 'last_wakee' will be marked as PTR_UNTRUSTED since in general
        it is not clear whether the object pointed by 'last_wakee' is valid or
        not even inside bpf_rcu_read_lock region.

    The verifier will reset all rcu pointer register states to untrusted
    at bpf_rcu_read_unlock() kfunc call site, so any such rcu pointer
    won't be trusted any more outside the bpf_rcu_read_lock() region.

    The current implementation does not support nested rcu read lock
    region in the prog.

    Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221124053217.2373910-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:12 +02:00
Jerome Marchand 284ee39aa7 bpf: Don't mark arguments to fentry/fexit programs as trusted.
Bugzilla: https://bugzilla.redhat.com/2177177

commit c6b0337f01205decb31ed5e90e5aa760ac2d5b41
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Thu Nov 24 13:53:14 2022 -0800

    bpf: Don't mark arguments to fentry/fexit programs as trusted.

    The PTR_TRUSTED flag should only be applied to pointers where the verifier can
    guarantee that such pointers are valid.
    The fentry/fexit/fmod_ret programs are not in this category.
    Only arguments of SEC("tp_btf") and SEC("iter") programs are trusted
    (which have BPF_TRACE_RAW_TP and BPF_TRACE_ITER attach_type, respectively).

    This bug was masked because convert_ctx_accesses() was converting trusted
    loads into BPF_PROBE_MEM loads. Fix it as well.
    The loads from trusted pointers don't need exception handling.

    Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20221124215314.55890-1-alexei.starovoitov@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:11 +02:00
Jerome Marchand cb943794aa bpf: Unify and simplify btf_func_proto_check error handling
Bugzilla: https://bugzilla.redhat.com/2177177

commit 5bad3587b7a292148cea10185cd8770baaeb7445
Author: Stanislav Fomichev <sdf@google.com>
Date:   Wed Nov 23 16:28:38 2022 -0800

    bpf: Unify and simplify btf_func_proto_check error handling

    Replace 'err = x; break;' with 'return x;'.

    Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20221124002838.2700179-1-sdf@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:11 +02:00
Jerome Marchand 750e4d2c71 bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
Bugzilla: https://bugzilla.redhat.com/2177177

commit fd264ca020948a743e4c36731dfdecc4a812153c
Author: Yonghong Song <yhs@fb.com>
Date:   Sun Nov 20 11:54:32 2022 -0800

    bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx

    Implement bpf_cast_to_kern_ctx() kfunc which does a type cast
    of a uapi ctx object to the corresponding kernel ctx. Previously
    if users want to access some data available in kctx but not
    in uapi ctx, bpf_probe_read_kernel() helper is needed.
    The introduction of bpf_cast_to_kern_ctx() allows direct
    memory access which makes code simpler and easier to understand.

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221120195432.3113982-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:09 +02:00
Jerome Marchand 96be11db9f bpf: Add support for kfunc set with common btf_ids
Bugzilla: https://bugzilla.redhat.com/2177177

commit cfe1456440c8feaf6558577a400745d774418379
Author: Yonghong Song <yhs@fb.com>
Date:   Sun Nov 20 11:54:26 2022 -0800

    bpf: Add support for kfunc set with common btf_ids

    Later on, we will introduce kfuncs bpf_cast_to_kern_ctx() and
    bpf_rdonly_cast() which apply to all program types. Currently kfunc set
    only supports individual prog types. This patch added support for kfunc
    applying to all program types.

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221120195426.3113828-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:09 +02:00
Jerome Marchand a52cc75452 bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs
Bugzilla: https://bugzilla.redhat.com/2177177

commit 3f00c52393445ed49aadc1a567aa502c6333b1a1
Author: David Vernet <void@manifault.com>
Date:   Sat Nov 19 23:10:02 2022 -0600

    bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs

    Kfuncs currently support specifying the KF_TRUSTED_ARGS flag to signal
    to the verifier that it should enforce that a BPF program passes it a
    "safe", trusted pointer. Currently, "safe" means that the pointer is
    either PTR_TO_CTX, or is refcounted. There may be cases, however, where
    the kernel passes a BPF program a safe / trusted pointer to an object
    that the BPF program wishes to use as a kptr, but because the object
    does not yet have a ref_obj_id from the perspective of the verifier, the
    program would be unable to pass it to a KF_ACQUIRE | KF_TRUSTED_ARGS
    kfunc.

    The solution is to expand the set of pointers that are considered
    trusted according to KF_TRUSTED_ARGS, so that programs can invoke kfuncs
    with these pointers without getting rejected by the verifier.

    There is already a PTR_UNTRUSTED flag that is set in some scenarios,
    such as when a BPF program reads a kptr directly from a map
    without performing a bpf_kptr_xchg() call. These pointers of course can
    and should be rejected by the verifier. Unfortunately, however,
    PTR_UNTRUSTED does not cover all the cases for safety that need to
    be addressed to adequately protect kfuncs. Specifically, pointers
    obtained by a BPF program "walking" a struct are _not_ considered
    PTR_UNTRUSTED according to BPF. For example, say that we were to add a
    kfunc called bpf_task_acquire(), with KF_ACQUIRE | KF_TRUSTED_ARGS, to
    acquire a struct task_struct *. If we only used PTR_UNTRUSTED to signal
    that a task was unsafe to pass to a kfunc, the verifier would mistakenly
    allow the following unsafe BPF program to be loaded:

    SEC("tp_btf/task_newtask")
    int BPF_PROG(unsafe_acquire_task,
                 struct task_struct *task,
                 u64 clone_flags)
    {
            struct task_struct *acquired, *nested;

            nested = task->last_wakee;

            /* Would not be rejected by the verifier. */
            acquired = bpf_task_acquire(nested);
            if (!acquired)
                    return 0;

            bpf_task_release(acquired);
            return 0;
    }

    To address this, this patch defines a new type flag called PTR_TRUSTED
    which tracks whether a PTR_TO_BTF_ID pointer is safe to pass to a
    KF_TRUSTED_ARGS kfunc or a BPF helper function. PTR_TRUSTED pointers are
    passed directly from the kernel as a tracepoint or struct_ops callback
    argument. Any nested pointer that is obtained from walking a PTR_TRUSTED
    pointer is no longer PTR_TRUSTED. From the example above, the struct
    task_struct *task argument is PTR_TRUSTED, but the 'nested' pointer
    obtained from 'task->last_wakee' is not PTR_TRUSTED.

    A subsequent patch will add kfuncs for storing a task kfunc as a kptr,
    and then another patch will add selftests to validate.

    Signed-off-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20221120051004.3605026-3-void@manifault.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:08 +02:00
Jerome Marchand de6eb19233 bpf: Add comments for map BTF matching requirement for bpf_list_head
Bugzilla: https://bugzilla.redhat.com/2177177

commit c22dfdd21592c5d56b49d5fba8de300ad7bf293c
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:26:08 2022 +0530

    bpf: Add comments for map BTF matching requirement for bpf_list_head

    The old behavior of bpf_map_meta_equal was that it compared timer_off
    to be equal (but not spin_lock_off, because that was not allowed), and
    did memcmp of kptr_off_tab.

    Now, we memcmp the btf_record of two bpf_map structs, which has all
    fields.

    We preserve backwards compat as we kzalloc the array, so if only spin
    lock and timer exist in map, we only compare offset while the rest of
    unused members in the btf_field struct are zeroed out.

    In case of kptr, btf and everything else is of vmlinux or module, so as
    long type is same it will match, since kernel btf, module, dtor pointer
    will be same across maps.

    Now with list_head in the mix, things are a bit complicated. We
    implicitly add a requirement that both BTFs are same, because struct
    btf_field_list_head has btf and value_rec members.

    We obviously shouldn't force BTFs to be equal by default, as that breaks
    backwards compatibility.

    Currently it is only implicitly required due to list_head matching
    struct btf and value_rec member. value_rec points back into a btf_record
    stashed in the map BTF (btf member of btf_field_list_head). So that
    pointer and btf member has to match exactly.

    Document all these subtle details so that things don't break in the
    future when touching this code.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-19-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:07 +02:00
Jerome Marchand a976de70c4 bpf: Rewrite kfunc argument handling
Bugzilla: https://bugzilla.redhat.com/2177177

commit 00b85860feb809852af9a88cb4ca8766d7dff6a3
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:26:01 2022 +0530

    bpf: Rewrite kfunc argument handling

    As we continue to add more features, argument types, kfunc flags, and
    different extensions to kfuncs, the code to verify the correctness of
    the kfunc prototype wrt the passed in registers has become ad-hoc and
    ugly to read. To make life easier, and make a very clear split between
    different stages of argument processing, move all the code into
    verifier.c and refactor into easier to read helpers and functions.

    This also makes sharing code within the verifier easier with kfunc
    argument processing. This will be more and more useful in later patches
    as we are now moving to implement very core BPF helpers as kfuncs, to
    keep them experimental before baking into UAPI.

    Remove all kfunc related bits now from btf_check_func_arg_match, as
    users have been converted away to refactored kfunc argument handling.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-12-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:07 +02:00
Jerome Marchand 5fb8030979 bpf: Verify ownership relationships for user BTF types
Bugzilla: https://bugzilla.redhat.com/2177177

commit 865ce09a49d79d2b2c1d980f4c05ffc0b3517bdc
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:25:57 2022 +0530

    bpf: Verify ownership relationships for user BTF types

    Ensure that there can be no ownership cycles among different types by
    way of having owning objects that can hold some other type as their
    element. For instance, a map value can only hold allocated objects, but
    these are allowed to have another bpf_list_head. To prevent unbounded
    recursion while freeing resources, elements of bpf_list_head in local
    kptrs can never have a bpf_list_head which are part of list in a map
    value. Later patches will verify this by having dedicated BTF selftests.

    Also, to make runtime destruction easier, once btf_struct_metas is fully
    populated, we can stash the metadata of the value type directly in the
    metadata of the list_head fields, as that allows easier access to the
    value type's layout to destruct it at runtime from the btf_field entry
    of the list head itself.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-8-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:06 +02:00
Jerome Marchand ef745b384b bpf: Recognize lock and list fields in allocated objects
Bugzilla: https://bugzilla.redhat.com/2177177

commit 8ffa5cc142137a59d6a10eb5273fa2ba5dcd4947
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:25:56 2022 +0530

    bpf: Recognize lock and list fields in allocated objects

    Allow specifying bpf_spin_lock, bpf_list_head, bpf_list_node fields in a
    allocated object.

    Also update btf_struct_access to reject direct access to these special
    fields.

    A bpf_list_head allows implementing map-in-map style use cases, where an
    allocated object with bpf_list_head is linked into a list in a map
    value. This would require embedding a bpf_list_node, support for which
    is also included. The bpf_spin_lock is used to protect the bpf_list_head
    and other data.

    While we don't strictly require holding a bpf_spin_lock while touching
    the bpf_list_head in such objects, as when we have access to it, we have
    complete ownership of the object, the locking constraint is still kept
    and may be conditionally lifted in the future.

    Note that the specification of such types can be done just like map
    values, e.g.:

    struct bar {
    	struct bpf_list_node node;
    };

    struct foo {
    	struct bpf_spin_lock lock;
    	struct bpf_list_head head __contains(bar, node);
    	struct bpf_list_node node;
    };

    struct map_value {
    	struct bpf_spin_lock lock;
    	struct bpf_list_head head __contains(foo, node);
    };

    To recognize such types in user BTF, we build a btf_struct_metas array
    of metadata items corresponding to each BTF ID. This is done once during
    the btf_parse stage to avoid having to do it each time during the
    verification process's requirement to inspect the metadata.

    Moreover, the computed metadata needs to be passed to some helpers in
    future patches which requires allocating them and storing them in the
    BTF that is pinned by the program itself, so that valid access can be
    assumed to such data during program runtime.

    A key thing to note is that once a btf_struct_meta is available for a
    type, both the btf_record and btf_field_offs should be available. It is
    critical that btf_field_offs is available in case special fields are
    present, as we extensively rely on special fields being zeroed out in
    map values and allocated objects in later patches. The code ensures that
    by bailing out in case of errors and ensuring both are available
    together. If the record is not available, the special fields won't be
    recognized, so not having both is also fine (in terms of being a
    verification error and not a runtime bug).

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-7-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:06 +02:00
Jerome Marchand a714a43577 bpf: Introduce allocated objects support
Bugzilla: https://bugzilla.redhat.com/2177177

commit 282de143ead96a5d53331e946f31c977b4610a74
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:25:55 2022 +0530

    bpf: Introduce allocated objects support

    Introduce support for representing pointers to objects allocated by the
    BPF program, i.e. PTR_TO_BTF_ID that point to a type in program BTF.
    This is indicated by the presence of MEM_ALLOC type flag in reg->type to
    avoid having to check btf_is_kernel when trying to match argument types
    in helpers.

    Whenever walking such types, any pointers being walked will always yield
    a SCALAR instead of pointer. In the future we might permit kptr inside
    such allocated objects (either kernel or program allocated), and it will
    then form a PTR_TO_BTF_ID of the respective type.

    For now, such allocated objects will always be referenced in verifier
    context, hence ref_obj_id == 0 for them is a bug. It is allowed to write
    to such objects, as long as fields that are special are not touched
    (support for which will be added in subsequent patches). Note that once
    such a pointer is marked PTR_UNTRUSTED, it is no longer allowed to write
    to it.

    No PROBE_MEM handling is therefore done for loads into this type unless
    PTR_UNTRUSTED is part of the register type, since they can never be in
    an undefined state, and their lifetime will always be valid.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-6-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:06 +02:00
Jerome Marchand 39fc2fdfd6 bpf: Refactor btf_struct_access
Bugzilla: https://bugzilla.redhat.com/2177177

commit 6728aea7216c0c06c98e2e58d753a5e8b2ae1c6f
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Nov 15 00:45:28 2022 +0530

    bpf: Refactor btf_struct_access

    Instead of having to pass multiple arguments that describe the register,
    pass the bpf_reg_state into the btf_struct_access callback. Currently,
    all call sites simply reuse the btf and btf_id of the reg they want to
    check the access of. The only exception to this pattern is the callsite
    in check_ptr_to_map_access, hence for that case create a dummy reg to
    simulate PTR_TO_BTF_ID access.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221114191547.1694267-8-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:04 +02:00
Jerome Marchand d03c51f6bc bpf: Support bpf_list_head in map values
Bugzilla: https://bugzilla.redhat.com/2177177

commit f0c5941ff5b255413d31425bb327c2aec3625673
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Nov 15 00:45:25 2022 +0530

    bpf: Support bpf_list_head in map values

    Add the support on the map side to parse, recognize, verify, and build
    metadata table for a new special field of the type struct bpf_list_head.
    To parameterize the bpf_list_head for a certain value type and the
    list_node member it will accept in that value type, we use BTF
    declaration tags.

    The definition of bpf_list_head in a map value will be done as follows:

    struct foo {
    	struct bpf_list_node node;
    	int data;
    };

    struct map_value {
    	struct bpf_list_head head __contains(foo, node);
    };

    Then, the bpf_list_head only allows adding to the list 'head' using the
    bpf_list_node 'node' for the type struct foo.

    The 'contains' annotation is a BTF declaration tag composed of four
    parts, "contains:name:node" where the name is then used to look up the
    type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during
    the lookup. The node defines name of the member in this type that has
    the type struct bpf_list_node, which is actually used for linking into
    the linked list. For now, 'kind' part is hardcoded as struct.

    This allows building intrusive linked lists in BPF, using container_of
    to obtain pointer to entry, while being completely type safe from the
    perspective of the verifier. The verifier knows exactly the type of the
    nodes, and knows that list helpers return that type at some fixed offset
    where the bpf_list_node member used for this list exists. The verifier
    also uses this information to disallow adding types that are not
    accepted by a certain list.

    For now, no elements can be added to such lists. Support for that is
    coming in future patches, hence draining and freeing items is done with
    a TODO that will be resolved in a future patch.

    Note that the bpf_list_head_free function moves the list out to a local
    variable under the lock and releases it, doing the actual draining of
    the list items outside the lock. While this helps with not holding the
    lock for too long pessimizing other concurrent list operations, it is
    also necessary for deadlock prevention: unless every function called in
    the critical section would be notrace, a fentry/fexit program could
    attach and call bpf_map_update_elem again on the map, leading to the
    same lock being acquired if the key matches and lead to a deadlock.
    While this requires some special effort on part of the BPF programmer to
    trigger and is highly unlikely to occur in practice, it is always better
    if we can avoid such a condition.

    While notrace would prevent this, doing the draining outside the lock
    has advantages of its own, hence it is used to also fix the deadlock
    related problem.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:04 +02:00
Jerome Marchand a7aa687757 bpf: Remove BPF_MAP_OFF_ARR_MAX
Bugzilla: https://bugzilla.redhat.com/2177177

commit 2d577252579b3efb9e934b68948a2edfa9920110
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Nov 15 00:45:23 2022 +0530

    bpf: Remove BPF_MAP_OFF_ARR_MAX

    In f71b2f64177a ("bpf: Refactor map->off_arr handling"), map->off_arr
    was refactored to be btf_field_offs. The number of field offsets is
    equal to maximum possible fields limited by BTF_FIELDS_MAX. Hence, reuse
    BTF_FIELDS_MAX as spin_lock and timer no longer are to be handled
    specially for offset sorting, fix the comment, and remove incorrect
    WARN_ON as its rec->cnt can never exceed this value. The reason to keep
    a separate constant was that it was always 2 more than the total kptrs.
    This is no longer the case.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221114191547.1694267-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:04 +02:00
Jerome Marchand e9b5bda40b bpf: Refactor map->off_arr handling
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Minor changes from already backported commit 1f6e04a1c7b8
("bpf: Fix offset calculation error in __copy_map_value and
zero_map_value")

commit f71b2f64177a199d5b1d2047e155d45fd98f564a
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:57 2022 +0530

    bpf: Refactor map->off_arr handling

    Refactor map->off_arr handling into generic functions that can work on
    their own without hardcoding map specific code. The btf_fields_offs
    structure is now returned from btf_parse_field_offs, which can be reused
    later for types in program BTF.

    All functions like copy_map_value, zero_map_value call generic
    underlying functions so that they can also be reused later for copying
    to values allocated in programs which encode specific fields.

    Later, some helper functions will also require access to this
    btf_field_offs structure to be able to skip over special fields at
    runtime.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-9-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:01 +02:00
Jerome Marchand 2b8a340165 bpf: Consolidate spin_lock, timer management into btf_record
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Context change from already backported commit 997849c4b969
("bpf: Zeroing allocated object from slab in bpf memory allocator")

commit db559117828d2448fe81ada051c60bcf39f822e9
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:56 2022 +0530

    bpf: Consolidate spin_lock, timer management into btf_record

    Now that kptr_off_tab has been refactored into btf_record, and can hold
    more than one specific field type, accommodate bpf_spin_lock and
    bpf_timer as well.

    While they don't require any more metadata than offset, having all
    special fields in one place allows us to share the same code for
    allocated user defined types and handle both map values and these
    allocated objects in a similar fashion.

    As an optimization, we still keep spin_lock_off and timer_off offsets in
    the btf_record structure, just to avoid having to find the btf_field
    struct each time their offset is needed. This is mostly needed to
    manipulate such objects in a map value at runtime. It's ok to hardcode
    just one offset as more than one field is disallowed.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-8-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:01 +02:00
Jerome Marchand 40100e4a5a bpf: Refactor kptr_off_tab into btf_record
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts:
 - Context change from already backported commit 997849c4b969 ("bpf:
Zeroing allocated object from slab in bpf memory allocator")
 - Minor changes from already backported commit 1f6e04a1c7b8 ("bpf:
Fix offset calculation error in __copy_map_value and zero_map_value")

commit aa3496accc412b3d975e4ee5d06076d73394d8b5
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:55 2022 +0530

    bpf: Refactor kptr_off_tab into btf_record

    To prepare the BPF verifier to handle special fields in both map values
    and program allocated types coming from program BTF, we need to refactor
    the kptr_off_tab handling code into something more generic and reusable
    across both cases to avoid code duplication.

    Later patches also require passing this data to helpers at runtime, so
    that they can work on user defined types, initialize them, destruct
    them, etc.

    The main observation is that both map values and such allocated types
    point to a type in program BTF, hence they can be handled similarly. We
    can prepare a field metadata table for both cases and store them in
    struct bpf_map or struct btf depending on the use case.

    Hence, refactor the code into generic btf_record and btf_field member
    structs. The btf_record represents the fields of a specific btf_type in
    user BTF. The cnt indicates the number of special fields we successfully
    recognized, and field_mask is a bitmask of fields that were found, to
    enable quick determination of availability of a certain field.

    Subsequently, refactor the rest of the code to work with these generic
    types, remove assumptions about kptr and kptr_off_tab, rename variables
    to more meaningful names, etc.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-7-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:01 +02:00
Jerome Marchand f3c85b07ca bpf: Allow specifying volatile type modifier for kptrs
Bugzilla: https://bugzilla.redhat.com/2177177

commit 23da464dd6b8935b66f4ee306ad8947fd32ccd75
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:51 2022 +0530

    bpf: Allow specifying volatile type modifier for kptrs

    This is useful in particular to mark the pointer as volatile, so that
    compiler treats each load and store to the field as a volatile access.
    The alternative is having to define and use READ_ONCE and WRITE_ONCE in
    the BPF program.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Acked-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:42:59 +02:00
Artem Savkov f76f5ca9d3 bpf: Prevent decl_tag from being referenced in func_proto arg
Bugzilla: https://bugzilla.redhat.com/2166911

commit f17472d4599697d701aa239b4c475a506bccfd19
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Nov 22 19:54:22 2022 -0800

    bpf: Prevent decl_tag from being referenced in func_proto arg
    
    Syzkaller managed to hit another decl_tag issue:
    
      btf_func_proto_check kernel/bpf/btf.c:4506 [inline]
      btf_check_all_types kernel/bpf/btf.c:4734 [inline]
      btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763
      btf_parse kernel/bpf/btf.c:5042 [inline]
      btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709
      bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342
      __sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034
      __do_sys_bpf kernel/bpf/syscall.c:5093 [inline]
      __se_sys_bpf kernel/bpf/syscall.c:5091 [inline]
      __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091
      do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48
    
    This seems similar to commit ea68376c8bed ("bpf: prevent decl_tag from being
    referenced in func_proto") but for the argument.
    
    Reported-by: syzbot+8dd0551dda6020944c5d@syzkaller.appspotmail.com
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20221123035422.872531-2-sdf@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:24 +01:00
Artem Savkov fd35528978 bpf: prevent decl_tag from being referenced in func_proto
Bugzilla: https://bugzilla.redhat.com/2166911

commit ea68376c8bed5cd156900852aada20c3a0874d17
Author: Stanislav Fomichev <sdf@google.com>
Date:   Fri Oct 14 17:24:44 2022 -0700

    bpf: prevent decl_tag from being referenced in func_proto
    
    Syzkaller was able to hit the following issue:
    
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 3609 at kernel/bpf/btf.c:1946
    btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946
    Modules linked in:
    CPU: 0 PID: 3609 Comm: syz-executor361 Not tainted
    6.0.0-syzkaller-02734-g0326074ff465 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 09/22/2022
    RIP: 0010:btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946
    Code: ef e8 7f 8e e4 ff 41 83 ff 0b 77 28 f6 44 24 10 18 75 3f e8 6d 91
    e4 ff 44 89 fe bf 0e 00 00 00 e8 20 8e e4 ff e8 5b 91 e4 ff <0f> 0b 45
    31 f6 e9 98 02 00 00 41 83 ff 12 74 18 e8 46 91 e4 ff 44
    RSP: 0018:ffffc90003cefb40 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
    RDX: ffff8880259c0000 RSI: ffffffff81968415 RDI: 0000000000000005
    RBP: ffff88801270ca00 R08: 0000000000000005 R09: 000000000000000e
    R10: 0000000000000011 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000011 R14: ffff888026ee6424 R15: 0000000000000011
    FS:  000055555641b300(0000) GS:ffff8880b9a00000(0000)
    knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000f2e258 CR3: 000000007110e000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     btf_func_proto_check kernel/bpf/btf.c:4447 [inline]
     btf_check_all_types kernel/bpf/btf.c:4723 [inline]
     btf_parse_type_sec kernel/bpf/btf.c:4752 [inline]
     btf_parse kernel/bpf/btf.c:5026 [inline]
     btf_new_fd+0x1926/0x1e70 kernel/bpf/btf.c:6892
     bpf_btf_load kernel/bpf/syscall.c:4324 [inline]
     __sys_bpf+0xb7d/0x4cf0 kernel/bpf/syscall.c:5010
     __do_sys_bpf kernel/bpf/syscall.c:5069 [inline]
     __se_sys_bpf kernel/bpf/syscall.c:5067 [inline]
     __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:5067
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    RIP: 0033:0x7f0fbae41c69
    Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89
    f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
    f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007ffc8aeb6228 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fbae41c69
    RDX: 0000000000000020 RSI: 0000000020000140 RDI: 0000000000000012
    RBP: 00007f0fbae05e10 R08: 0000000000000000 R09: 0000000000000000
    R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0fbae05ea0
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     </TASK>
    
    Looks like it tries to create a func_proto which return type is
    decl_tag. For the details, see Martin's spot on analysis in [0].
    
    0: https://lore.kernel.org/bpf/CAKH8qBuQDLva_hHxxBuZzyAcYNO4ejhovz6TQeVSk8HY-2SO6g@mail.gmail.com/T/#mea6524b3fcd6298347432226e81b1e6155efc62c
    
    Cc: Yonghong Song <yhs@fb.com>
    Cc: Martin KaFai Lau <martin.lau@kernel.org>
    Fixes: bd16dee66ae4 ("bpf: Add BTF_KIND_DECL_TAG typedef support")
    Reported-by: syzbot+d8bd751aef7c6b39a344@syzkaller.appspotmail.com
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221015002444.2680969-2-sdf@google.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:21 +01:00
Artem Savkov db95e85722 bpf: Tweak definition of KF_TRUSTED_ARGS
Bugzilla: https://bugzilla.redhat.com/2166911

commit eed807f626101f6a4227bd53942892c5983b95a7
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Wed Sep 21 18:48:25 2022 +0200

    bpf: Tweak definition of KF_TRUSTED_ARGS
    
    Instead of forcing all arguments to be referenced pointers with non-zero
    reg->ref_obj_id, tweak the definition of KF_TRUSTED_ARGS to mean that
    only PTR_TO_BTF_ID (and socket types translated to PTR_TO_BTF_ID) have
    that constraint, and require their offset to be set to 0.
    
    The rest of the pointer types are also accommodated in this definition of
    trusted pointers, but with more relaxed rules regarding offsets.
    
    The inherent meaning of setting this flag is that all kfunc pointer
    arguments have a guaranteed lifetime, and kernel object pointers
    (PTR_TO_BTF_ID, PTR_TO_CTX) are passed in their unmodified form (with
    offset 0). In general, this is not true for PTR_TO_BTF_ID as it can be
    obtained using pointer walks.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
    Link: https://lore.kernel.org/r/cdede0043c47ed7a357f0a915d16f9ce06a1d589.1663778601.git.lorenzo@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:16 +01:00
Artem Savkov 11644af3ae btf: Allow dynamic pointer parameters in kfuncs
Bugzilla: https://bugzilla.redhat.com/2166911

commit b8d31762a0ae6861e1115302ee338560d853e317
Author: Roberto Sassu <roberto.sassu@huawei.com>
Date:   Tue Sep 20 09:59:42 2022 +0200

    btf: Allow dynamic pointer parameters in kfuncs
    
    Allow dynamic pointers (struct bpf_dynptr_kern *) to be specified as
    parameters in kfuncs. Also, ensure that dynamic pointers passed as argument
    are valid and initialized, are a pointer to the stack, and of the type
    local. More dynamic pointer types can be supported in the future.
    
    To properly detect whether a parameter is of the desired type, introduce
    the stringify_struct() macro to compare the returned structure name with
    the desired name. In addition, protect against structure renames, by
    halting the build with BUILD_BUG_ON(), so that developers have to revisit
    the code.
    
    To check if a dynamic pointer passed to the kfunc is valid and initialized,
    and if its type is local, export the existing functions
    is_dynptr_reg_valid_init() and is_dynptr_type_expected().
    
    Cc: Joanne Koong <joannelkoong@gmail.com>
    Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220920075951.929132-5-roberto.sassu@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:15 +01:00
Artem Savkov 4c0dd85157 bpf: Allow kfuncs to be used in LSM programs
Bugzilla: https://bugzilla.redhat.com/2166911

commit d15bf1501c7533826a616478002c601fcc7671f3
Author: KP Singh <kpsingh@kernel.org>
Date:   Tue Sep 20 09:59:39 2022 +0200

    bpf: Allow kfuncs to be used in LSM programs
    
    In preparation for the addition of new kfuncs, allow kfuncs defined in the
    tracing subsystem to be used in LSM programs by mapping the LSM program
    type to the TRACING hook.
    
    Signed-off-by: KP Singh <kpsingh@kernel.org>
    Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220920075951.929132-2-roberto.sassu@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:15 +01:00
Artem Savkov 426eb2fc0d bpf: simplify code in btf_parse_hdr
Bugzilla: https://bugzilla.redhat.com/2166911

commit 3a74904ceff3ecdb9d6cc0844ed67df417968eb6
Author: William Dean <williamsukatube@163.com>
Date:   Sat Sep 17 16:42:48 2022 +0800

    bpf: simplify code in btf_parse_hdr
    
    It could directly return the result of 'btf_check_sec_info' to
    simplify the code.
    
    Signed-off-by: William Dean <williamsukatube@163.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220917084248.3649-1-williamsukatube@163.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:14 +01:00
Artem Savkov e859178cb6 bpf/btf: Use btf_type_str() whenever possible
Bugzilla: https://bugzilla.redhat.com/2166911

commit 571f9738bfb3d4b42253c1d0ad26da9fede85f36
Author: Peilin Ye <peilin.ye@bytedance.com>
Date:   Fri Sep 16 13:28:00 2022 -0700

    bpf/btf: Use btf_type_str() whenever possible
    
    We have btf_type_str().  Use it whenever possible in btf.c, instead of
    "btf_kind_str[BTF_INFO_KIND(t->info)]".
    
    Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
    Link: https://lore.kernel.org/r/20220916202800.31421-1-yepeilin.cs@gmail.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:13 +01:00
Artem Savkov 294868a446 bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
Bugzilla: https://bugzilla.redhat.com/2166911

commit a37a32583e282d8d815e22add29bc1e91e19951a
Author: Lorenz Bauer <oss@lmb.io>
Date:   Sat Sep 10 11:01:20 2022 +0000

    bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
    
    When trying to finish resolving a struct member, btf_struct_resolve
    saves the member type id in a u16 temporary variable. This truncates
    the 32 bit type id value if it exceeds UINT16_MAX.
    
    As a result, structs that have members with type ids > UINT16_MAX and
    which need resolution will fail with a message like this:
    
        [67414] STRUCT ff_device size=120 vlen=12
            effect_owners type_id=67434 bits_offset=960 Member exceeds struct_size
    
    Fix this by changing the type of last_member_type_id to u32.
    
    Fixes: a0791f0df7 ("bpf: fix BTF limits")
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Lorenz Bauer <oss@lmb.io>
    Link: https://lore.kernel.org/r/20220910110120.339242-1-oss@lmb.io
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:12 +01:00
Artem Savkov 05f87e0648 bpf: Export btf_type_by_id() and bpf_log()
Bugzilla: https://bugzilla.redhat.com/2166911

commit 84c6ac417ceacd086efc330afece8922969610b7
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Wed Sep 7 10:40:39 2022 -0600

    bpf: Export btf_type_by_id() and bpf_log()
    
    These symbols will be used in nf_conntrack.ko to support direct writes
    to `nf_conn`.
    
    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Link: https://lore.kernel.org/r/3c98c19dc50d3b18ea5eca135b4fc3a5db036060.1662568410.git.dxu@dxuuu.xyz
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:12 +01:00
Artem Savkov 775481dcf1 bpf/verifier: allow kfunc to return an allocated mem
Bugzilla: https://bugzilla.redhat.com/2166911

commit eb1f7f71c126c8fd50ea81af98f97c4b581ea4ae
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Sep 6 17:13:02 2022 +0200

    bpf/verifier: allow kfunc to return an allocated mem
    
    For drivers (outside of network), the incoming data is not statically
    defined in a struct. Most of the time the data buffer is kzalloc-ed
    and thus we can not rely on eBPF and BTF to explore the data.
    
    This commit allows returning arbitrary memory previously allocated by
    the driver.
    An interesting extra point is that the kfunc can mark the exported
    memory region as read only or read/write.
    
    So, when a kfunc is not returning a pointer to a struct but to a plain
    type, we can consider it valid allocated memory, assuming that:
    - one of the arguments is either called rdonly_buf_size or
      rdwr_buf_size
    - and this argument is a const from the caller point of view
    
    We can then use this parameter as the size of the allocated memory.
    
    The memory is either read-only or read-write based on the name
    of the size parameter.
    
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Link: https://lore.kernel.org/r/20220906151303.2780789-7-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:10 +01:00
Artem Savkov 4a3d0f39ff bpf/btf: bump BTF_KFUNC_SET_MAX_CNT
Bugzilla: https://bugzilla.redhat.com/2166911

commit f9b348185f4d684cc19e6bd9b87904823d5aa5ed
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Sep 6 17:13:01 2022 +0200

    bpf/btf: bump BTF_KFUNC_SET_MAX_CNT
    
    net/bpf/test_run.c already registers 20 kfuncs.
    net/netfilter/nf_conntrack_bpf.c registers an extra 10 kfuncs.

    Given that all the kfuncs are grouped into one single set, having
    only 2 slots left prevents us from adding more selftests.
    
    Bump it to 256.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Link: https://lore.kernel.org/r/20220906151303.2780789-6-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:10 +01:00
Artem Savkov c903dd1ca2 bpf: split btf_check_subprog_arg_match in two
Bugzilla: https://bugzilla.redhat.com/2166911

commit 95f2f26f3cac06cfc046d2b29e60719d7848ea54
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Sep 6 17:12:58 2022 +0200

    bpf: split btf_check_subprog_arg_match in two
    
    btf_check_subprog_arg_match() was used twice in verifier.c:
    - when checking for the type mismatches between a (sub)prog declaration
      and BTF
    - when checking the call of a subprog to see if the provided arguments
      are correct and valid
    
    This is problematic when we check if the first argument of a program
    (pointer to ctx) is correctly accessed:
    To be able to ensure we access a valid memory in the ctx, the verifier
    assumes the pointer to context is not null.
    This has the side effect of marking the program as accessing the
    entire context, even if the context is never dereferenced.
    
    For example, by checking the context access with the current code, the
    following eBPF program would fail with -EINVAL if the ctx is set to null
    from the userspace:
    
    ```
    SEC("syscall")
    int prog(struct my_ctx *args) {
      return 0;
    }
    ```
    
    In that particular case, we do not want to actually check that the memory
    is correct while checking for the BTF validity, but we just want to
    ensure that the (sub)prog definition matches the BTF we have.
    
    So split btf_check_subprog_arg_match() in two so we can actually check
    for the memory used when in a call, and ignore that part when not.
    
    Note that a further patch is in preparation to disentangle
    btf_check_func_arg_match() from these two purposes, and so right now we
    just add a new hack around that by adding a boolean to this function.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220906151303.2780789-3-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:09 +01:00
Artem Savkov ca63c6e52e bpf: Allow struct argument in trampoline based programs
Bugzilla: https://bugzilla.redhat.com/2166911

commit 720e6a435194fb5237833a4a7ec6aa60a78964a8
Author: Yonghong Song <yhs@fb.com>
Date:   Wed Aug 31 08:26:46 2022 -0700

    bpf: Allow struct argument in trampoline based programs
    
    Allow struct argument in trampoline based programs where
    the struct size should be <= 16 bytes. In such cases, the argument
    will be put into up to 2 registers for bpf, x86_64 and arm64
    architectures.
    
    To support arch-specific trampoline manipulation,
    add arg_flags for additional struct information about arguments
    in btf_func_model. Such information will be used in arch specific
    function arch_prepare_bpf_trampoline() to prepare argument access
    properly in trampoline.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220831152646.2078089-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:08 +01:00
Artem Savkov a1f618c375 bpf: Always return corresponding btf_type in __get_type_size()
Bugzilla: https://bugzilla.redhat.com/2166911

commit a00ed8430199abbc9d9bf43ea31795bfe98998ca
Author: Yonghong Song <yhs@fb.com>
Date:   Sun Aug 7 10:51:16 2022 -0700

    bpf: Always return corresponding btf_type in __get_type_size()
    
    Currently, in function __get_type_size(), the corresponding
    btf_type is returned only in invalid cases. Let us always
    return btf_type regardless of valid or invalid cases.
    Such a new functionality will be used in subsequent patches.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220807175116.4179242-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:53:59 +01:00
Artem Savkov a6a7d34ceb btf: Add a new kfunc flag which allows to mark a function to be sleepable
Bugzilla: https://bugzilla.redhat.com/2166911

commit fa96b24204af42274ec13dfb2f2e6990d7510e55
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Fri Aug 5 14:48:14 2022 -0700

    btf: Add a new kfunc flag which allows to mark a function to be sleepable
    
    This allows declaring a kfunc as sleepable and prevents its use in
    a non-sleepable program.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Co-developed-by: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Hao Luo <haoluo@google.com>
    Link: https://lore.kernel.org/r/20220805214821.1058337-2-haoluo@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-03 15:25:56 +01:00
Artem Savkov cb6acbdffd bpf: btf: Fix vsnprintf return value check
Bugzilla: https://bugzilla.redhat.com/2137876

commit 58250ae350de8d28ce91ade4605d32c9e7f062a8
Author: Fedor Tokarev <ftokarev@gmail.com>
Date:   Mon Jul 11 23:13:17 2022 +0200

    bpf: btf: Fix vsnprintf return value check
    
    vsnprintf returns the number of characters which would have been written if
    enough space had been available, excluding the terminating null byte. Thus,
    a return value of 'len_left' means that the last character has been
    dropped.
    
    Signed-off-by: Fedor Tokarev <ftokarev@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Alan Maguire <alan.maguire@oracle.com>
    Link: https://lore.kernel.org/bpf/20220711211317.GA1143610@laptop

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:44 +01:00
Artem Savkov d581951eb0 bpf: Add support for forcing kfunc args to be trusted
Bugzilla: https://bugzilla.redhat.com/2137876

commit 56e948ffc098a780fefb6c1784a3a2c7b81100a1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Jul 21 15:42:36 2022 +0200

    bpf: Add support for forcing kfunc args to be trusted
    
    Teach the verifier to detect a new KF_TRUSTED_ARGS kfunc flag, which
    means each pointer argument must be trusted, which we define as a
    pointer that is referenced (has non-zero ref_obj_id) and also needs to
    have its offset unchanged, similar to how release functions expect their
    argument. This allows a kfunc to receive pointer arguments unchanged
    from the result of the acquire kfunc.
    
    This is required to ensure that kfunc that operate on some object only
    work on acquired pointers and not normal PTR_TO_BTF_ID with same type
    which can be obtained by pointer walking. The restrictions applied to
    release arguments also apply to trusted arguments. This implies that
    strict type matching (not deducing type by recursively following members
    at offset) and OBJ_RELEASE offset checks (ensuring they are zero) are
    used for trusted pointer arguments.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220721134245.2450-5-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:42 +01:00
Artem Savkov dbfa384357 bpf: Switch to new kfunc flags infrastructure
Bugzilla: https://bugzilla.redhat.com/2137876

commit a4703e3184320d6e15e2bc81d2ccf1c8c883f9d1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Jul 21 15:42:35 2022 +0200

    bpf: Switch to new kfunc flags infrastructure
    
    Instead of populating multiple sets to indicate some attribute and then
    researching the same BTF ID in them, prepare a single unified BTF set
    which indicates whether a kfunc is allowed to be called, and also its
    attributes if any at the same time. Now, only one call is needed to
    perform the lookup for both kfunc availability and its attributes.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220721134245.2450-4-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:42 +01:00
Artem Savkov fbd76e5346 bpf: Fix check against plain integer v 'NULL'
Bugzilla: https://bugzilla.redhat.com/2137876

commit a2a5580fcbf808e7c2310e4959b62f9d2157fdb6
Author: Ben Dooks <ben.dooks@sifive.com>
Date:   Thu Jul 14 11:03:22 2022 +0100

    bpf: Fix check against plain integer v 'NULL'
    
    When checking with sparse, btf_show_type_value() is causing a
    warning about checking integer vs NULL when the macro is passed
    a pointer, due to the 'value != 0' check. Stop sparse from complaining
    about any type-casting by adding a cast to the typeof(value).
    
    This fixes the following sparse warnings:
    
    kernel/bpf/btf.c:2579:17: warning: Using plain integer as NULL pointer
    kernel/bpf/btf.c:2581:17: warning: Using plain integer as NULL pointer
    kernel/bpf/btf.c:3407:17: warning: Using plain integer as NULL pointer
    kernel/bpf/btf.c:3758:9: warning: Using plain integer as NULL pointer
    
    Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220714100322.260467-1-ben.dooks@sifive.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:39 +01:00
Artem Savkov 930fe36094 bpf, libbpf: Add type match support
Bugzilla: https://bugzilla.redhat.com/2137876

commit ec6209c8d42f815bc3bef10934637ca92114cd1b
Author: Daniel Müller <deso@posteo.net>
Date:   Tue Jun 28 16:01:21 2022 +0000

    bpf, libbpf: Add type match support
    
    This patch adds support for the proposed type match relation to
    relo_core where it is shared between userspace and kernel. It plumbs
    through both kernel-side and libbpf-side support.
    
    The matching relation is defined as follows (copy from source):
    - modifiers and typedefs are stripped (and, hence, effectively ignored)
    - generally speaking types need to be of same kind (struct vs. struct, union
      vs. union, etc.)
      - exceptions are struct/union behind a pointer which could also match a
        forward declaration of a struct or union, respectively, and enum vs.
        enum64 (see below)
    Then, depending on type:
    - integers:
      - match if size and signedness match
    - arrays & pointers:
      - target types are recursively matched
    - structs & unions:
      - local members need to exist in target with the same name
      - for each member we recursively check match unless it is already behind a
        pointer, in which case we only check matching names and compatible kind
    - enums:
      - local variants have to have a match in target by symbolic name (but not
        numeric value)
      - size has to match (but enum may match enum64 and vice versa)
    - function pointers:
      - number and position of arguments in local type has to match target
      - for each argument and the return value we recursively check match
    
    Signed-off-by: Daniel Müller <deso@posteo.net>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:35 +01:00
Artem Savkov ee4f4249cd bpf: minimize number of allocated lsm slots per program
Bugzilla: https://bugzilla.redhat.com/2137876

commit c0e19f2c9a3edd38e4b1bdae98eb44555d02bc31
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Jun 28 10:43:07 2022 -0700

    bpf: minimize number of allocated lsm slots per program
    
    The previous patch adds a 1:1 mapping between all 211 LSM hooks
    and the bpf_cgroup program array. Instead of reserving a slot per
    possible hook, reserve 10 slots per cgroup for lsm programs.
    Those slots are dynamically allocated on demand and reclaimed.
    
    struct cgroup_bpf {
    	struct bpf_prog_array *    effective[33];        /*     0   264 */
    	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
    	struct hlist_head          progs[33];            /*   264   264 */
    	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
    	u8                         flags[33];            /*   528    33 */
    
    	/* XXX 7 bytes hole, try to pack */
    
    	struct list_head           storages;             /*   568    16 */
    	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
    	struct bpf_prog_array *    inactive;             /*   584     8 */
    	struct percpu_ref          refcnt;               /*   592    16 */
    	struct work_struct         release_work;         /*   608    72 */
    
    	/* size: 680, cachelines: 11, members: 7 */
    	/* sum members: 673, holes: 1, sum holes: 7 */
    	/* last cacheline: 40 bytes */
    };
    
    Reviewed-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20220628174314.1216643-5-sdf@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:33 +01:00
Artem Savkov 9a33161b25 bpf: per-cgroup lsm flavor
Bugzilla: https://bugzilla.redhat.com/2137876

Conflicts: already applied 65d9ecfe0ca73 "bpf: Fix ref_obj_id for dynptr
data slices in verifier"

commit 69fd337a975c7e690dfe49d9cb4fe5ba1e6db44e
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Jun 28 10:43:06 2022 -0700

    bpf: per-cgroup lsm flavor

    Allow attaching to lsm hooks in the cgroup context.

    Attaching to per-cgroup LSM works exactly like attaching
    to other per-cgroup hooks. New BPF_LSM_CGROUP is added
    to trigger new mode; the actual lsm hook we attach to is
    signaled via existing attach_btf_id.

    For the hooks that have 'struct socket' or 'struct sock' as its first
    argument, we use the cgroup associated with that socket. For the rest,
    we use 'current' cgroup (this is all on default hierarchy == v2 only).
    Note that for some hooks that work on 'struct sock' we still
    take the cgroup from 'current' because some of them work on the socket
    that hasn't been properly initialized yet.

    Behind the scenes, we allocate a shim program that is attached
    to the trampoline and runs cgroup effective BPF programs array.
    This shim has some rudimentary ref counting and can be shared
    between several programs attaching to the same lsm hook from
    different cgroups.

    Note that this patch bloats the cgroup size because we add 211
    cgroup_bpf_attach_type(s) for simplicity's sake. This will be
    addressed in the subsequent patch.

    Also note that we only add non-sleepable flavor for now. To enable
    sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
    shim programs have to be freed via trace rcu, cgroup_bpf.effective
    should be also trace-rcu-managed + maybe some other changes that
    I'm not aware of.

    Reviewed-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20220628174314.1216643-4-sdf@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:33 +01:00
Artem Savkov 0f68342144 bpf: Merge "types_are_compat" logic into relo_core.c
Bugzilla: https://bugzilla.redhat.com/2137876

commit fd75733da2f376c0c8c6513c3cb2ac227082ec5c
Author: Daniel Müller <deso@posteo.net>
Date:   Thu Jun 23 18:29:34 2022 +0000

    bpf: Merge "types_are_compat" logic into relo_core.c
    
    BPF type compatibility checks (bpf_core_types_are_compat()) are
    currently duplicated between the kernel and user space. That's a historical
    artifact more than an intentional design, and it can lead to subtle bugs where
    one implementation is adjusted but the other is forgotten.
    
    That happened with the enum64 work, for example, where the libbpf side
    was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation
    support")) to use the btf_kind_core_compat() helper function but the
    kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64
    support")).
    
    This patch addresses both the duplication issue, by merging both
    implementations and moving them into relo_core.c, and fixes the alluded
    to kind check (by giving preference to libbpf's already adjusted logic).
    
    For discussion of the topic, please refer to:
    https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665
    
    Changelog:
    v1 -> v2:
    - limited libbpf recursion limit to 32
    - changed name to __bpf_core_types_are_compat
    - included warning previously present in libbpf version
    - merged kernel and user space changes into a single patch
    
    Signed-off-by: Daniel Müller <deso@posteo.net>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:31 +01:00
Yauheni Kaliuta 274ba317fc bpf: Add btf enum64 support
Bugzilla: http://bugzilla.redhat.com/2120968

commit 6089fb325cf737eeb2c4d236c94697112ca860da
Author: Yonghong Song <yhs@fb.com>
Date:   Mon Jun 6 23:26:00 2022 -0700

    bpf: Add btf enum64 support
    
    Currently, BTF only supports up to 32-bit enum values with BTF_KIND_ENUM.
    But in the kernel, some enums indeed have 64-bit values, e.g.,
    in uapi bpf.h, we have
      enum {
            BPF_F_INDEX_MASK                = 0xffffffffULL,
            BPF_F_CURRENT_CPU               = BPF_F_INDEX_MASK,
            BPF_F_CTXLEN_MASK               = (0xfffffULL << 32),
      };
    In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK
    as 0, which certainly is incorrect.
    
    This patch added a new btf kind, BTF_KIND_ENUM64, which permits
    64bit value to cover the above use case. The BTF_KIND_ENUM64 has
    the following three fields followed by the common type:
      struct bpf_enum64 {
        __u32 name_off;
        __u32 val_lo32;
        __u32 val_hi32;
      };
    Currently, the btf type section has an alignment of 4 as all element types
    are u32. Representing the value with __u64 would introduce padding
    for bpf_enum64 and may also introduce misalignment for the 64-bit value.
    Hence, the two members val_hi32 and val_lo32 are chosen to avoid these issues.
    
    The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64
    to indicate whether the value is signed or unsigned. The kflag is intended
    to keep the BTF C format output consistent with the original
    source code. For example, suppose the original BTF_KIND_ENUM bit value is 0xffffffff.
    The C format has two choices, printing out 0xffffffff or -1, and current libbpf
    prints it out as an unsigned value. But if the signedness is preserved in BTF,
    the value can be printed the same way as in the original source code.
    The kflag value 0 means unsigned values, which is consistent with libbpf's
    default and should cover most cases as well.
    
    The new BTF_KIND_ENUM64 is intended to support enum values represented as
    64-bit values, but it can represent all BTF_KIND_ENUM values as well.
    The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value
    has to be represented with 64 bits.
    
    In addition, a static inline function btf_kind_core_compat() is introduced,
    which will be used later when libbpf's relo_core.c is changed. Here the kernel shares the
    same relo_core.c with libbpf.
    
      [1] https://reviews.llvm.org/D124641
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:11 +02:00
Yauheni Kaliuta 5e92a3254e bpf: Limit maximum modifier chain length in btf_check_type_tags
Bugzilla: https://bugzilla.redhat.com/2120968

commit d1a374a1aeb7e31191448e225ed2f9c5e894f280
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Wed Jun 15 09:51:51 2022 +0530

    bpf: Limit maximum modifier chain length in btf_check_type_tags
    
    On processing the module BTF of a module built for an older kernel, we might
    sometimes find that some type points to itself, forming a loop. If such a
    type is a modifier, btf_check_type_tags's while loop following the modifier
    chain will be caught in an infinite loop.
    
    Fix this by defining a maximum chain length and bailing out if we spin
    any longer than that.
    
    Fixes: eb596b090558 ("bpf: Ensure type tags precede modifiers in BTF")
    Reported-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220615042151.2266537-1-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:07 +02:00
Yauheni Kaliuta 639d908f91 bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
Bugzilla: https://bugzilla.redhat.com/2120968

commit f858c2b2ca04fc7ead291821a793638ae120c11d
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date:   Mon Jun 6 09:52:51 2022 +0200

    bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
    
    The verifier allows programs to call global functions as long as their
    argument types match, using BTF to check the function arguments. One of the
    allowed argument types to such global functions is PTR_TO_CTX; however the
    check for this fails on BPF_PROG_TYPE_EXT functions because the verifier
    uses the wrong type to fetch the vmlinux BTF ID for the program context
    type. This failure is seen when an XDP program is loaded using
    libxdp (which loads it as BPF_PROG_TYPE_EXT and attaches it to a global XDP
    type program).
    
    Fix the issue by passing in the target program type instead of the
    BPF_PROG_TYPE_EXT type to bpf_prog_get_ctx() when checking function
    argument compatibility.
    
    The first Fixes tag refers to the latest commit that touched the code in
    question, while the second one points to the code that first introduced
    the global function call verification.
    
    v2:
    - Use resolve_prog_type()
    
    Fixes: 3363bd0cfbb8 ("bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support")
    Fixes: 51c39bb1d5 ("bpf: Introduce function-by-function verification")
    Reported-by: Simon Sundberg <simon.sundberg@kau.se>
    Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/r/20220606075253.28422-1-toke@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:07 +02:00