Commit Graph

364 Commits

Viktor Malik 726cfbc70b bpf: Enable annotating trusted nested pointers
Bugzilla: https://bugzilla.redhat.com/2178930

commit 57539b1c0ac2dcccbe64a7675ff466be009c040f
Author: David Vernet <void@manifault.com>
Date:   Fri Jan 20 13:25:15 2023 -0600

    bpf: Enable annotating trusted nested pointers
    
    In kfuncs, a "trusted" pointer is a pointer that the kfunc can assume is
    safe, and which the verifier will allow to be passed to a
    KF_TRUSTED_ARGS kfunc. Currently, a KF_TRUSTED_ARGS kfunc disallows any
    pointer to be passed at a nonzero offset, but sometimes this is in fact
    safe if the "nested" pointer's lifetime is inherited from its parent.
    For example, the const cpumask_t *cpus_ptr field in a struct task_struct
    will remain valid until the task itself is destroyed, and thus would
    also be safe to pass to a KF_TRUSTED_ARGS kfunc.
    
    While it would be conceptually simple to enable this by using BTF tags,
    gcc unfortunately does not yet support this. In the interim, this patch
    enables support for this by using a type-naming convention. A new
    BTF_TYPE_SAFE_NESTED macro is defined in verifier.c which allows a
    developer to specify the nested fields of a type which are considered
    trusted if its parent is also trusted. The verifier is also updated to
    account for this. A patch with selftests will be added in a follow-on
    change, along with documentation for this feature.
    
    Signed-off-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20230120192523.3650503-2-void@manifault.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:49 +02:00
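
As a quick illustration of the naming convention described above, a sketch (the macro body and the task_struct example are assumptions drawn from the commit message, not copied from upstream verifier.c):

    /* __PASTE comes from linux/compiler_types.h; pasting "struct task_struct"
     * with "__safe_fields" produces a marker type, task_struct__safe_fields,
     * that the verifier can look up by name.
     */
    #define BTF_TYPE_SAFE_NESTED(__type)  __PASTE(__type, __safe_fields)

    /* Nested fields of task_struct that inherit the parent's lifetime. */
    BTF_TYPE_SAFE_NESTED(struct task_struct) {
            const cpumask_t *cpus_ptr;
    };
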
Viktor Malik 99f1beb2d0 bpf: btf: limit logging of ignored BTF mismatches
Bugzilla: https://bugzilla.redhat.com/2178930

commit 9cb61e50bf6bf54db712bba6cf20badca4383f96
Author: Connor O'Brien <connoro@google.com>
Date:   Sat Jan 7 02:53:31 2023 +0000

    bpf: btf: limit logging of ignored BTF mismatches
    
    Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF
    mismatches are expected and module loading should proceed
    anyway. Logging with pr_warn() on every one of these "benign"
    mismatches creates unnecessary noise when many such modules are
    loaded. Instead, handle this case with a single log warning that BTF
    info may be unavailable.
    
    Mismatches also result in calls to __btf_verifier_log() via
    __btf_verifier_log_type() or btf_verifier_log_member(), adding several
    additional lines of logging per mismatched module. Add checks to these
    paths to skip logging for module BTF mismatches in the "allow
    mismatch" case.
    
    All existing logging behavior is preserved in the default
    CONFIG_MODULE_ALLOW_BTF_MISMATCH=n case.
    
    Signed-off-by: Connor O'Brien <connoro@google.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20230107025331.3240536-1-connoro@google.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:35 +02:00
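
A minimal sketch of the logging behaviour described above (the helper shape and message text are assumptions, not the upstream diff):

    static void report_module_btf_mismatch(const char *mod_name, int err)
    {
            if (IS_ENABLED(CONFIG_MODULE_ALLOW_BTF_MISMATCH))
                    /* Benign case: one global notice instead of per-module noise. */
                    pr_warn_once("BTF mismatch detected, module BTF debug info may be unavailable\n");
            else
                    pr_warn("failed to validate module [%s] BTF: %d\n", mod_name, err);
    }
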
Viktor Malik 8a9358eb85 bpf: rename list_head -> graph_root in field info types
Bugzilla: https://bugzilla.redhat.com/2178930

commit 30465003ad776a922c32b2dac58db14f120f037e
Author: Dave Marchevsky <davemarchevsky@fb.com>
Date:   Sat Dec 17 00:24:57 2022 -0800

    bpf: rename list_head -> graph_root in field info types
    
    Many of the structs recently added to track field info for linked-list
    head are useful as-is for rbtree root. So let's do a mechanical renaming
    of list_head-related types and fields:
    
    include/linux/bpf.h:
      struct btf_field_list_head -> struct btf_field_graph_root
      list_head -> graph_root in struct btf_field union
    kernel/bpf/btf.c:
      list_head -> graph_root in struct btf_field_info
    
    This is a nonfunctional change, functionality to actually use these
    fields for rbtree will be added in further patches.
    
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Link: https://lore.kernel.org/r/20221217082506.1570898-5-davemarchevsky@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Viktor Malik <vmalik@redhat.com>
2023-06-13 22:44:30 +02:00
Jerome Marchand 95f498c7c9 bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
Bugzilla: https://bugzilla.redhat.com/2177177

commit 74bc3a5acc82f020d2e126f56c535d02d1e74e37
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Fri Jan 20 13:21:48 2023 +0100

    bpf: Add missing btf_put to register_btf_id_dtor_kfuncs

    We take the BTF reference before we register dtors and we need
    to put it back when it's done.

    We probably won't see a problem with kernel BTF, but module BTF
    would stay loaded (because of the extra ref) even when its module
    is removed.

    Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Fixes: 5ce937d613a4 ("bpf: Populate pairs of btf_id and destructor kfunc in btf")
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20230120122148.1522359-1-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:21 +02:00
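
The shape of the fix, sketched (the flow is modeled on the description above; the register_dtors_in_btf() body is hypothetical):

    static int register_dtors_sketch(struct module *owner)
    {
            struct btf *btf;
            int ret;

            btf = btf_get_module_btf(owner);     /* takes a reference on the BTF */
            if (!btf)
                    return -ENOENT;

            ret = register_dtors_in_btf(btf);    /* hypothetical body */

            btf_put(btf);   /* the put this fix adds: balance the reference on every exit */
            return ret;
    }
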
Jerome Marchand 6af9035ad8 bpf: do not rely on ALLOW_ERROR_INJECTION for fmod_ret
Bugzilla: https://bugzilla.redhat.com/2177177

commit 5b481acab4ce017fda8166fa9428511da41109e5
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Dec 6 15:59:32 2022 +0100

    bpf: do not rely on ALLOW_ERROR_INJECTION for fmod_ret
    
    The current way of expressing that a non-bpf kernel component is willing
    to accept that bpf programs can be attached to it and that they can change
    the return value is to abuse ALLOW_ERROR_INJECTION.
    This is debated in the link below, and the result is that it is not a
    reasonable thing to do.
    
    Reuse the kfunc declaration structure to also tag the kernel functions
    we want to be fmodret. This way we can control from any subsystem which
    functions are being modified by bpf without touching the verifier.
    
    Link: https://lore.kernel.org/all/20221121104403.1545f9b5@gandalf.local.home/
    Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/r/20221206145936.922196-2-benjamin.tissoires@redhat.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:16 +02:00
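
A sketch of the registration style this enables; the register_btf_fmodret_id_set() entry point is my reading of the series, and the subsystem, set, and function names are hypothetical:

    #include <linux/btf.h>
    #include <linux/btf_ids.h>

    BTF_SET8_START(subsys_fmodret_ids)
    BTF_ID_FLAGS(func, subsys_hook_to_override)   /* hypothetical fmod_ret target */
    BTF_SET8_END(subsys_fmodret_ids)

    static const struct btf_kfunc_id_set subsys_fmodret_set = {
            .owner = THIS_MODULE,
            .set   = &subsys_fmodret_ids,
    };

    static int __init subsys_bpf_init(void)
    {
            /* Declare which functions fmod_ret programs may modify the return of. */
            return register_btf_fmodret_id_set(&subsys_fmodret_set);
    }
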
Jerome Marchand 4b450e77be bpf: Do not mark certain LSM hook arguments as trusted
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Context change due to missing commit 401e64b3a4af
("bpf-lsm: Make bpf_lsm_userns_create() sleepable")

commit c0c852dd1876dc1db4600ce951a92aadd3073b1c
Author: Yonghong Song <yhs@fb.com>
Date:   Sat Dec 3 12:49:54 2022 -0800

    bpf: Do not mark certain LSM hook arguments as trusted

    Martin mentioned that the verifier cannot assume arguments from
    the LSM hook sk_alloc_security are trusted, since after the hook
    is called the sk ref_count is set to 1. This overwrites any
    ref_count change made by the bpf program and may cause a ref_count
    underflow later on.

    I then further checked some other hooks. For example,
    for bpf_lsm_file_alloc() hook in fs/file_table.c,

            f->f_cred = get_cred(cred);
            error = security_file_alloc(f);
            if (unlikely(error)) {
                    file_free_rcu(&f->f_rcuhead);
                    return ERR_PTR(error);
            }

            atomic_long_set(&f->f_count, 1);

    The input parameter 'f' to security_file_alloc() cannot be trusted
    as well.

    Specifically, I investigated bpf_map/bpf_prog/file/sk/task alloc/free
    lsm hooks. Except bpf_map_alloc and task_alloc, arguments for all other
    hooks should not be considered as trusted. This may not be a complete
    list, but it covers common usage for sk and task.

    Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs")
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221203204954.2043348-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:14 +02:00
Jerome Marchand 0ec7796171 bpf: Add kfunc bpf_rcu_read_lock/unlock()
Bugzilla: https://bugzilla.redhat.com/2177177

commit 9bb00b2895cbfe0ad410457b605d0a72524168c1
Author: Yonghong Song <yhs@fb.com>
Date:   Wed Nov 23 21:32:17 2022 -0800

    bpf: Add kfunc bpf_rcu_read_lock/unlock()

    Add two kfuncs, bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These two
    kfuncs can be used by all program types. The following is an example of how
    rcu pointers are used w.r.t. bpf_rcu_read_lock()/bpf_rcu_read_unlock().

      struct task_struct {
        ...
        struct task_struct              *last_wakee;
        struct task_struct __rcu        *real_parent;
        ...
      };

    Let us say prog does 'task = bpf_get_current_task_btf()' to get a
    'task' pointer. The basic rules are:
      - 'real_parent = task->real_parent' should be inside bpf_rcu_read_lock
        region. This is to simulate rcu_dereference() operation. The
        'real_parent' is marked as MEM_RCU only if (1). task->real_parent is
        inside bpf_rcu_read_lock region, and (2). task is a trusted ptr. So
        MEM_RCU marked ptr can be 'trusted' inside the bpf_rcu_read_lock region.
      - 'last_wakee = real_parent->last_wakee' should be inside bpf_rcu_read_lock
        region since it tries to access rcu protected memory.
      - the ptr 'last_wakee' will be marked as PTR_UNTRUSTED since in general
        it is not clear whether the object pointed by 'last_wakee' is valid or
        not even inside bpf_rcu_read_lock region.

    The verifier will reset all rcu pointer register states to untrusted
    at bpf_rcu_read_unlock() kfunc call site, so any such rcu pointer
    won't be trusted any more outside the bpf_rcu_read_lock() region.

    The current implementation does not support nested rcu read lock
    region in the prog.

    Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221124053217.2373910-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:12 +02:00
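
A BPF-side sketch of the pattern described above (assumes a libbpf-style build with vmlinux.h; the attach point and the printed field are only for illustration):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    extern void bpf_rcu_read_lock(void) __ksym;
    extern void bpf_rcu_read_unlock(void) __ksym;

    SEC("tp_btf/task_newtask")
    int BPF_PROG(rcu_walk, struct task_struct *task, u64 clone_flags)
    {
            struct task_struct *parent;

            bpf_rcu_read_lock();
            parent = task->real_parent;      /* marked MEM_RCU inside the region */
            if (parent)
                    bpf_printk("parent pid %d", parent->pid);
            bpf_rcu_read_unlock();           /* parent is no longer trusted here */
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
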
Jerome Marchand 284ee39aa7 bpf: Don't mark arguments to fentry/fexit programs as trusted.
Bugzilla: https://bugzilla.redhat.com/2177177

commit c6b0337f01205decb31ed5e90e5aa760ac2d5b41
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Thu Nov 24 13:53:14 2022 -0800

    bpf: Don't mark arguments to fentry/fexit programs as trusted.

    The PTR_TRUSTED flag should only be applied to pointers where the verifier can
    guarantee that such pointers are valid.
    The fentry/fexit/fmod_ret programs are not in this category.
    Only arguments of SEC("tp_btf") and SEC("iter") programs are trusted
    (which have the BPF_TRACE_RAW_TP and BPF_TRACE_ITER attach_type, respectively).

    This bug was masked because convert_ctx_accesses() was converting trusted
    loads into BPF_PROBE_MEM loads. Fix it as well.
    The loads from trusted pointers don't need exception handling.

    Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20221124215314.55890-1-alexei.starovoitov@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:11 +02:00
Jerome Marchand cb943794aa bpf: Unify and simplify btf_func_proto_check error handling
Bugzilla: https://bugzilla.redhat.com/2177177

commit 5bad3587b7a292148cea10185cd8770baaeb7445
Author: Stanislav Fomichev <sdf@google.com>
Date:   Wed Nov 23 16:28:38 2022 -0800

    bpf: Unify and simplify btf_func_proto_check error handling

    Replace 'err = x; break;' with 'return x;'.

    Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20221124002838.2700179-1-sdf@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:11 +02:00
Jerome Marchand 750e4d2c71 bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
Bugzilla: https://bugzilla.redhat.com/2177177

commit fd264ca020948a743e4c36731dfdecc4a812153c
Author: Yonghong Song <yhs@fb.com>
Date:   Sun Nov 20 11:54:32 2022 -0800

    bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx

    Implement the bpf_cast_to_kern_ctx() kfunc, which does a type cast
    of a uapi ctx object to the corresponding kernel ctx. Previously,
    if users wanted to access data available in the kernel ctx but not
    in the uapi ctx, the bpf_probe_read_kernel() helper was needed.
    The introduction of bpf_cast_to_kern_ctx() allows direct
    memory access, which makes the code simpler and easier to understand.

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221120195432.3113982-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:09 +02:00
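
Usage sketch (the kfunc prototype is declared by hand here as an assumption; the program section and printed fields are illustrative):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    extern void *bpf_cast_to_kern_ctx(void *obj) __ksym;

    SEC("tc")
    int read_kctx(struct __sk_buff *skb)
    {
            /* Cast the uapi ctx to the kernel sk_buff and read fields directly,
             * with no bpf_probe_read_kernel() needed.
             */
            struct sk_buff *kskb = bpf_cast_to_kern_ctx(skb);

            bpf_printk("skb len %u hash %u", kskb->len, kskb->hash);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
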
Jerome Marchand 96be11db9f bpf: Add support for kfunc set with common btf_ids
Bugzilla: https://bugzilla.redhat.com/2177177

commit cfe1456440c8feaf6558577a400745d774418379
Author: Yonghong Song <yhs@fb.com>
Date:   Sun Nov 20 11:54:26 2022 -0800

    bpf: Add support for kfunc set with common btf_ids

    Later on, we will introduce the kfuncs bpf_cast_to_kern_ctx() and
    bpf_rdonly_cast(), which apply to all program types. Currently a kfunc set
    only supports individual prog types. This patch adds support for kfuncs
    that apply to all program types.

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221120195426.3113828-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:09 +02:00
Jerome Marchand a52cc75452 bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs
Bugzilla: https://bugzilla.redhat.com/2177177

commit 3f00c52393445ed49aadc1a567aa502c6333b1a1
Author: David Vernet <void@manifault.com>
Date:   Sat Nov 19 23:10:02 2022 -0600

    bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs

    Kfuncs currently support specifying the KF_TRUSTED_ARGS flag to signal
    to the verifier that it should enforce that a BPF program passes it a
    "safe", trusted pointer. Currently, "safe" means that the pointer is
    either PTR_TO_CTX, or is refcounted. There may be cases, however, where
    the kernel passes a BPF program a safe / trusted pointer to an object
    that the BPF program wishes to use as a kptr, but because the object
    does not yet have a ref_obj_id from the perspective of the verifier, the
    program would be unable to pass it to a KF_ACQUIRE | KF_TRUSTED_ARGS
    kfunc.

    The solution is to expand the set of pointers that are considered
    trusted according to KF_TRUSTED_ARGS, so that programs can invoke kfuncs
    with these pointers without getting rejected by the verifier.

    There is already a PTR_UNTRUSTED flag that is set in some scenarios,
    such as when a BPF program reads a kptr directly from a map
    without performing a bpf_kptr_xchg() call. These pointers of course can
    and should be rejected by the verifier. Unfortunately, however,
    PTR_UNTRUSTED does not cover all the cases for safety that need to
    be addressed to adequately protect kfuncs. Specifically, pointers
    obtained by a BPF program "walking" a struct are _not_ considered
    PTR_UNTRUSTED according to BPF. For example, say that we were to add a
    kfunc called bpf_task_acquire(), with KF_ACQUIRE | KF_TRUSTED_ARGS, to
    acquire a struct task_struct *. If we only used PTR_UNTRUSTED to signal
    that a task was unsafe to pass to a kfunc, the verifier would mistakenly
    allow the following unsafe BPF program to be loaded:

    SEC("tp_btf/task_newtask")
    int BPF_PROG(unsafe_acquire_task,
                 struct task_struct *task,
                 u64 clone_flags)
    {
            struct task_struct *acquired, *nested;

            nested = task->last_wakee;

            /* Would not be rejected by the verifier. */
            acquired = bpf_task_acquire(nested);
            if (!acquired)
                    return 0;

            bpf_task_release(acquired);
            return 0;
    }

    To address this, this patch defines a new type flag called PTR_TRUSTED
    which tracks whether a PTR_TO_BTF_ID pointer is safe to pass to a
    KF_TRUSTED_ARGS kfunc or a BPF helper function. PTR_TRUSTED pointers are
    passed directly from the kernel as a tracepoint or struct_ops callback
    argument. Any nested pointer that is obtained from walking a PTR_TRUSTED
    pointer is no longer PTR_TRUSTED. From the example above, the struct
    task_struct *task argument is PTR_TRUSTED, but the 'nested' pointer
    obtained from 'task->last_wakee' is not PTR_TRUSTED.

    A subsequent patch will add kfuncs for storing a task kfunc as a kptr,
    and then another patch will add selftests to validate.

    Signed-off-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20221120051004.3605026-3-void@manifault.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:08 +02:00
Jerome Marchand de6eb19233 bpf: Add comments for map BTF matching requirement for bpf_list_head
Bugzilla: https://bugzilla.redhat.com/2177177

commit c22dfdd21592c5d56b49d5fba8de300ad7bf293c
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:26:08 2022 +0530

    bpf: Add comments for map BTF matching requirement for bpf_list_head

    The old behavior of bpf_map_meta_equal was that it compared timer_off
    to be equal (but not spin_lock_off, because that was not allowed), and
    did memcmp of kptr_off_tab.

    Now, we memcmp the btf_record of two bpf_map structs, which has all
    fields.

    We preserve backwards compat as we kzalloc the array, so if only spin
    lock and timer exist in the map, we only compare the offset while the
    rest of the unused members in the btf_field struct are zeroed out.

    In case of kptr, the btf and everything else comes from vmlinux or a
    module, so as long as the type is the same it will match, since the
    kernel btf, module, and dtor pointer will be the same across maps.

    Now with list_head in the mix, things are a bit complicated. We
    implicitly add a requirement that both BTFs are same, because struct
    btf_field_list_head has btf and value_rec members.

    We obviously shouldn't force BTFs to be equal by default, as that breaks
    backwards compatibility.

    Currently it is only implicitly required because matching the list_head
    field compares its struct btf and value_rec members. value_rec points back
    into a btf_record stashed in the map BTF (the btf member of
    btf_field_list_head), so that pointer and the btf member have to match
    exactly.

    Document all these subtle details so that things don't break in the
    future when touching this code.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-19-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:07 +02:00
Jerome Marchand a976de70c4 bpf: Rewrite kfunc argument handling
Bugzilla: https://bugzilla.redhat.com/2177177

commit 00b85860feb809852af9a88cb4ca8766d7dff6a3
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:26:01 2022 +0530

    bpf: Rewrite kfunc argument handling

    As we continue to add more features, argument types, kfunc flags, and
    different extensions to kfuncs, the code to verify the correctness of
    the kfunc prototype wrt the passed in registers has become ad-hoc and
    ugly to read. To make life easier, and make a very clear split between
    different stages of argument processing, move all the code into
    verifier.c and refactor into easier to read helpers and functions.

    This also makes sharing code within the verifier easier with kfunc
    argument processing. This will be more and more useful in later patches
    as we are now moving to implement very core BPF helpers as kfuncs, to
    keep them experimental before baking into UAPI.

    Remove all kfunc related bits from btf_check_func_arg_match, as all
    users have been converted to the refactored kfunc argument handling.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-12-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:07 +02:00
Jerome Marchand 5fb8030979 bpf: Verify ownership relationships for user BTF types
Bugzilla: https://bugzilla.redhat.com/2177177

commit 865ce09a49d79d2b2c1d980f4c05ffc0b3517bdc
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:25:57 2022 +0530

    bpf: Verify ownership relationships for user BTF types

    Ensure that there can be no ownership cycles among different types by
    way of having owning objects that can hold some other type as their
    element. For instance, a map value can only hold allocated objects, but
    these are allowed to have another bpf_list_head. To prevent unbounded
    recursion while freeing resources, elements of a bpf_list_head in local
    kptrs can never have a bpf_list_head which is part of a list in a map
    value. Later patches will verify this by having dedicated BTF selftests.

    Also, to make runtime destruction easier, once btf_struct_metas is fully
    populated, we can stash the metadata of the value type directly in the
    metadata of the list_head fields, as that allows easier access to the
    value type's layout to destruct it at runtime from the btf_field entry
    of the list head itself.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-8-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:06 +02:00
Jerome Marchand ef745b384b bpf: Recognize lock and list fields in allocated objects
Bugzilla: https://bugzilla.redhat.com/2177177

commit 8ffa5cc142137a59d6a10eb5273fa2ba5dcd4947
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:25:56 2022 +0530

    bpf: Recognize lock and list fields in allocated objects

    Allow specifying bpf_spin_lock, bpf_list_head, bpf_list_node fields in an
    allocated object.

    Also update btf_struct_access to reject direct access to these special
    fields.

    A bpf_list_head allows implementing map-in-map style use cases, where an
    allocated object with bpf_list_head is linked into a list in a map
    value. This would require embedding a bpf_list_node, support for which
    is also included. The bpf_spin_lock is used to protect the bpf_list_head
    and other data.

    While we don't strictly require holding a bpf_spin_lock while touching
    the bpf_list_head in such objects (once we have access to it, we have
    complete ownership of the object), the locking constraint is still kept
    and may be conditionally lifted in the future.

    Note that the specification of such types can be done just like map
    values, e.g.:

    struct bar {
    	struct bpf_list_node node;
    };

    struct foo {
    	struct bpf_spin_lock lock;
    	struct bpf_list_head head __contains(bar, node);
    	struct bpf_list_node node;
    };

    struct map_value {
    	struct bpf_spin_lock lock;
    	struct bpf_list_head head __contains(foo, node);
    };

    To recognize such types in user BTF, we build a btf_struct_metas array
    of metadata items corresponding to each BTF ID. This is done once during
    the btf_parse stage to avoid having to do it each time the verification
    process needs to inspect the metadata.

    Moreover, the computed metadata needs to be passed to some helpers in
    future patches which requires allocating them and storing them in the
    BTF that is pinned by the program itself, so that valid access can be
    assumed to such data during program runtime.

    A key thing to note is that once a btf_struct_meta is available for a
    type, both the btf_record and btf_field_offs should be available. It is
    critical that btf_field_offs is available in case special fields are
    present, as we extensively rely on special fields being zeroed out in
    map values and allocated objects in later patches. The code ensures that
    by bailing out in case of errors and ensuring both are available
    together. If the record is not available, the special fields won't be
    recognized, so not having both is also fine (in terms of being a
    verification error and not a runtime bug).

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-7-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:06 +02:00
Jerome Marchand a714a43577 bpf: Introduce allocated objects support
Bugzilla: https://bugzilla.redhat.com/2177177

commit 282de143ead96a5d53331e946f31c977b4610a74
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 18 07:25:55 2022 +0530

    bpf: Introduce allocated objects support

    Introduce support for representing pointers to objects allocated by the
    BPF program, i.e. PTR_TO_BTF_ID that point to a type in program BTF.
    This is indicated by the presence of MEM_ALLOC type flag in reg->type to
    avoid having to check btf_is_kernel when trying to match argument types
    in helpers.

    Whenever walking such types, any pointers being walked will always yield
    a SCALAR instead of a pointer. In the future we might permit kptrs inside
    such allocated objects (either kernel or program allocated), and it will
    then form a PTR_TO_BTF_ID of the respective type.

    For now, such allocated objects will always be referenced in verifier
    context, hence ref_obj_id == 0 for them is a bug. It is allowed to write
    to such objects, as long as fields that are special are not touched
    (support for which will be added in subsequent patches). Note that once
    such a pointer is marked PTR_UNTRUSTED, it is no longer allowed to write
    to it.

    No PROBE_MEM handling is therefore done for loads into this type unless
    PTR_UNTRUSTED is part of the register type, since they can never be in
    an undefined state, and their lifetime will always be valid.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221118015614.2013203-6-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:06 +02:00
Jerome Marchand 39fc2fdfd6 bpf: Refactor btf_struct_access
Bugzilla: https://bugzilla.redhat.com/2177177

commit 6728aea7216c0c06c98e2e58d753a5e8b2ae1c6f
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Nov 15 00:45:28 2022 +0530

    bpf: Refactor btf_struct_access

    Instead of having to pass multiple arguments that describe the register,
    pass the bpf_reg_state into the btf_struct_access callback. Currently,
    all call sites simply reuse the btf and btf_id of the reg they want to
    check the access of. The only exception to this pattern is the callsite
    in check_ptr_to_map_access, hence for that case create a dummy reg to
    simulate PTR_TO_BTF_ID access.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221114191547.1694267-8-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:04 +02:00
Jerome Marchand d03c51f6bc bpf: Support bpf_list_head in map values
Bugzilla: https://bugzilla.redhat.com/2177177

commit f0c5941ff5b255413d31425bb327c2aec3625673
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Nov 15 00:45:25 2022 +0530

    bpf: Support bpf_list_head in map values

    Add the support on the map side to parse, recognize, verify, and build
    metadata table for a new special field of the type struct bpf_list_head.
    To parameterize the bpf_list_head for a certain value type and the
    list_node member it will accept in that value type, we use BTF
    declaration tags.

    The definition of bpf_list_head in a map value will be done as follows:

    struct foo {
    	struct bpf_list_node node;
    	int data;
    };

    struct map_value {
    	struct bpf_list_head head __contains(foo, node);
    };

    Then, the bpf_list_head only allows adding to the list 'head' using the
    bpf_list_node 'node' for the type struct foo.

    The 'contains' annotation is a BTF declaration tag of the form
    "contains:name:node", where the name is used to look up the
    type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during
    the lookup. The node part names the member in this type that has
    the type struct bpf_list_node, which is actually used for linking into
    the linked list. For now, the 'kind' part is hardcoded as struct.

    This allows building intrusive linked lists in BPF, using container_of
    to obtain pointer to entry, while being completely type safe from the
    perspective of the verifier. The verifier knows exactly the type of the
    nodes, and knows that list helpers return that type at some fixed offset
    where the bpf_list_node member used for this list exists. The verifier
    also uses this information to disallow adding types that are not
    accepted by a certain list.

    For now, no elements can be added to such lists. Support for that is
    coming in future patches, hence draining and freeing items is done with
    a TODO that will be resolved in a future patch.

    Note that the bpf_list_head_free function moves the list out to a local
    variable under the lock and releases it, doing the actual draining of
    the list items outside the lock. While this helps with not holding the
    lock for too long pessimizing other concurrent list operations, it is
    also necessary for deadlock prevention: unless every function called in
    the critical section would be notrace, a fentry/fexit program could
    attach and call bpf_map_update_elem again on the map, leading to the
    same lock being acquired if the key matches and lead to a deadlock.
    While this requires some special effort on part of the BPF programmer to
    trigger and is highly unlikely to occur in practice, it is always better
    if we can avoid such a condition.

    While notrace would prevent this, doing the draining outside the lock
    has advantages of its own, hence it is used to also fix the deadlock
    related problem.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:04 +02:00
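
The __contains() annotation used in the example above expands to a BTF declaration tag of the "contains:name:node" form; a sketch of the macro (its exact home, bpf_experimental.h in the selftests, is stated here as an assumption):

    #define __contains(name, node) \
            __attribute__((btf_decl_tag("contains:" #name ":" #node)))
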
Jerome Marchand a7aa687757 bpf: Remove BPF_MAP_OFF_ARR_MAX
Bugzilla: https://bugzilla.redhat.com/2177177

commit 2d577252579b3efb9e934b68948a2edfa9920110
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Nov 15 00:45:23 2022 +0530

    bpf: Remove BPF_MAP_OFF_ARR_MAX

    In f71b2f64177a ("bpf: Refactor map->off_arr handling"), map->off_arr
    was refactored to be btf_field_offs. The number of field offsets is
    equal to the maximum possible fields, limited by BTF_FIELDS_MAX. Hence,
    reuse BTF_FIELDS_MAX, since spin_lock and timer no longer need to be
    handled specially for offset sorting; fix the comment, and remove the
    incorrect WARN_ON, as rec->cnt can never exceed this value. The reason
    for keeping a separate constant was that it was always 2 more than the
    total number of kptrs. This is no longer the case.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221114191547.1694267-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:04 +02:00
Jerome Marchand e9b5bda40b bpf: Refactor map->off_arr handling
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Minor changes from already backported commit 1f6e04a1c7b8
("bpf: Fix offset calculation error in __copy_map_value and
zero_map_value")

commit f71b2f64177a199d5b1d2047e155d45fd98f564a
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:57 2022 +0530

    bpf: Refactor map->off_arr handling

    Refactor map->off_arr handling into generic functions that can work on
    their own without hardcoding map specific code. The btf_field_offs
    structure is now returned from btf_parse_field_offs, which can be reused
    later for types in program BTF.

    All functions like copy_map_value, zero_map_value call generic
    underlying functions so that they can also be reused later for copying
    to values allocated in programs which encode specific fields.

    Later, some helper functions will also require access to this
    btf_field_offs structure to be able to skip over special fields at
    runtime.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-9-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:01 +02:00
Jerome Marchand 2b8a340165 bpf: Consolidate spin_lock, timer management into btf_record
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Context change from already backported commit 997849c4b969
("bpf: Zeroing allocated object from slab in bpf memory allocator")

commit db559117828d2448fe81ada051c60bcf39f822e9
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:56 2022 +0530

    bpf: Consolidate spin_lock, timer management into btf_record

    Now that kptr_off_tab has been refactored into btf_record, and can hold
    more than one specific field type, accommodate bpf_spin_lock and
    bpf_timer as well.

    While they don't require any more metadata than offset, having all
    special fields in one place allows us to share the same code for
    allocated user defined types and handle both map values and these
    allocated objects in a similar fashion.

    As an optimization, we still keep spin_lock_off and timer_off offsets in
    the btf_record structure, just to avoid having to find the btf_field
    struct each time their offset is needed. This is mostly needed to
    manipulate such objects in a map value at runtime. It's ok to hardcode
    just one offset as more than one field is disallowed.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-8-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:01 +02:00
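
For orientation, a sketch of the consolidated record described above (the field list is inferred from the commit message, not the upstream definition):

    struct btf_record {
            u32 cnt;              /* number of recognized special fields */
            u32 field_mask;       /* which special field kinds are present */
            int spin_lock_off;    /* cached offset: at most one bpf_spin_lock */
            int timer_off;        /* cached offset: at most one bpf_timer */
            struct btf_field fields[];
    };
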
Jerome Marchand 40100e4a5a bpf: Refactor kptr_off_tab into btf_record
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts:
 - Context change from already backported commit 997849c4b969 ("bpf:
Zeroing allocated object from slab in bpf memory allocator")
 - Minor changes from already backported commit 1f6e04a1c7b8 ("bpf:
Fix offset calculation error in __copy_map_value and zero_map_value")

commit aa3496accc412b3d975e4ee5d06076d73394d8b5
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:55 2022 +0530

    bpf: Refactor kptr_off_tab into btf_record

    To prepare the BPF verifier to handle special fields in both map values
    and program allocated types coming from program BTF, we need to refactor
    the kptr_off_tab handling code into something more generic and reusable
    across both cases to avoid code duplication.

    Later patches also require passing this data to helpers at runtime, so
    that they can work on user defined types, initialize them, destruct
    them, etc.

    The main observation is that both map values and such allocated types
    point to a type in program BTF, hence they can be handled similarly. We
    can prepare a field metadata table for both cases and store them in
    struct bpf_map or struct btf depending on the use case.

    Hence, refactor the code into generic btf_record and btf_field member
    structs. The btf_record represents the fields of a specific btf_type in
    user BTF. The cnt indicates the number of special fields we successfully
    recognized, and field_mask is a bitmask of fields that were found, to
    enable quick determination of availability of a certain field.

    Subsequently, refactor the rest of the code to work with these generic
    types, remove assumptions about kptr and kptr_off_tab, rename variables
    to more meaningful names, etc.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-7-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:43:01 +02:00
Jerome Marchand f3c85b07ca bpf: Allow specifying volatile type modifier for kptrs
Bugzilla: https://bugzilla.redhat.com/2177177

commit 23da464dd6b8935b66f4ee306ad8947fd32ccd75
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Nov 4 00:39:51 2022 +0530

    bpf: Allow specifying volatile type modifier for kptrs

    This is useful in particular to mark the pointer as volatile, so that
    the compiler treats each load and store to the field as a volatile access.
    The alternative is having to define and use READ_ONCE and WRITE_ONCE in
    the BPF program.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Acked-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20221103191013.1236066-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:42:59 +02:00
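
A sketch of the kind of declaration this permits (the value type and field name are hypothetical, and the exact placement of the qualifier reflects my reading of the intent, not the selftests):

    struct map_value {
            /* Each load/store of 'ptr' is a volatile access; no READ_ONCE()
             * or WRITE_ONCE() is needed in the BPF program.
             */
            struct task_struct __kptr * volatile ptr;
    };
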
Artem Savkov f76f5ca9d3 bpf: Prevent decl_tag from being referenced in func_proto arg
Bugzilla: https://bugzilla.redhat.com/2166911

commit f17472d4599697d701aa239b4c475a506bccfd19
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Nov 22 19:54:22 2022 -0800

    bpf: Prevent decl_tag from being referenced in func_proto arg
    
    Syzkaller managed to hit another decl_tag issue:
    
      btf_func_proto_check kernel/bpf/btf.c:4506 [inline]
      btf_check_all_types kernel/bpf/btf.c:4734 [inline]
      btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763
      btf_parse kernel/bpf/btf.c:5042 [inline]
      btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709
      bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342
      __sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034
      __do_sys_bpf kernel/bpf/syscall.c:5093 [inline]
      __se_sys_bpf kernel/bpf/syscall.c:5091 [inline]
      __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091
      do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48
    
    This seems similar to commit ea68376c8bed ("bpf: prevent decl_tag from being
    referenced in func_proto") but for the argument.
    
    Reported-by: syzbot+8dd0551dda6020944c5d@syzkaller.appspotmail.com
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20221123035422.872531-2-sdf@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:24 +01:00
Artem Savkov fd35528978 bpf: prevent decl_tag from being referenced in func_proto
Bugzilla: https://bugzilla.redhat.com/2166911

commit ea68376c8bed5cd156900852aada20c3a0874d17
Author: Stanislav Fomichev <sdf@google.com>
Date:   Fri Oct 14 17:24:44 2022 -0700

    bpf: prevent decl_tag from being referenced in func_proto
    
    Syzkaller was able to hit the following issue:
    
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 3609 at kernel/bpf/btf.c:1946
    btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946
    Modules linked in:
    CPU: 0 PID: 3609 Comm: syz-executor361 Not tainted
    6.0.0-syzkaller-02734-g0326074ff465 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 09/22/2022
    RIP: 0010:btf_type_id_size+0x2d5/0x9d0 kernel/bpf/btf.c:1946
    Code: ef e8 7f 8e e4 ff 41 83 ff 0b 77 28 f6 44 24 10 18 75 3f e8 6d 91
    e4 ff 44 89 fe bf 0e 00 00 00 e8 20 8e e4 ff e8 5b 91 e4 ff <0f> 0b 45
    31 f6 e9 98 02 00 00 41 83 ff 12 74 18 e8 46 91 e4 ff 44
    RSP: 0018:ffffc90003cefb40 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
    RDX: ffff8880259c0000 RSI: ffffffff81968415 RDI: 0000000000000005
    RBP: ffff88801270ca00 R08: 0000000000000005 R09: 000000000000000e
    R10: 0000000000000011 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000011 R14: ffff888026ee6424 R15: 0000000000000011
    FS:  000055555641b300(0000) GS:ffff8880b9a00000(0000)
    knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000f2e258 CR3: 000000007110e000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     btf_func_proto_check kernel/bpf/btf.c:4447 [inline]
     btf_check_all_types kernel/bpf/btf.c:4723 [inline]
     btf_parse_type_sec kernel/bpf/btf.c:4752 [inline]
     btf_parse kernel/bpf/btf.c:5026 [inline]
     btf_new_fd+0x1926/0x1e70 kernel/bpf/btf.c:6892
     bpf_btf_load kernel/bpf/syscall.c:4324 [inline]
     __sys_bpf+0xb7d/0x4cf0 kernel/bpf/syscall.c:5010
     __do_sys_bpf kernel/bpf/syscall.c:5069 [inline]
     __se_sys_bpf kernel/bpf/syscall.c:5067 [inline]
     __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:5067
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    RIP: 0033:0x7f0fbae41c69
    Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89
    f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
    f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007ffc8aeb6228 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fbae41c69
    RDX: 0000000000000020 RSI: 0000000020000140 RDI: 0000000000000012
    RBP: 00007f0fbae05e10 R08: 0000000000000000 R09: 0000000000000000
    R10: 00000000ffffffff R11: 0000000000000246 R12: 00007f0fbae05ea0
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     </TASK>
    
    Looks like it tries to create a func_proto which return type is
    decl_tag. For the details, see Martin's spot on analysis in [0].
    
    0: https://lore.kernel.org/bpf/CAKH8qBuQDLva_hHxxBuZzyAcYNO4ejhovz6TQeVSk8HY-2SO6g@mail.gmail.com/T/#mea6524b3fcd6298347432226e81b1e6155efc62c
    
    Cc: Yonghong Song <yhs@fb.com>
    Cc: Martin KaFai Lau <martin.lau@kernel.org>
    Fixes: bd16dee66ae4 ("bpf: Add BTF_KIND_DECL_TAG typedef support")
    Reported-by: syzbot+d8bd751aef7c6b39a344@syzkaller.appspotmail.com
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20221015002444.2680969-2-sdf@google.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:21 +01:00
Artem Savkov db95e85722 bpf: Tweak definition of KF_TRUSTED_ARGS
Bugzilla: https://bugzilla.redhat.com/2166911

commit eed807f626101f6a4227bd53942892c5983b95a7
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Wed Sep 21 18:48:25 2022 +0200

    bpf: Tweak definition of KF_TRUSTED_ARGS
    
    Instead of forcing all arguments to be referenced pointers with non-zero
    reg->ref_obj_id, tweak the definition of KF_TRUSTED_ARGS to mean that
    only PTR_TO_BTF_ID (and socket types translated to PTR_TO_BTF_ID) have
    that constraint, and require their offset to be set to 0.
    
    The rest of the pointer types are also accommodated in this definition of
    trusted pointers, but with more relaxed rules regarding offsets.
    
    The inherent meaning of setting this flag is that all kfunc pointer
    arguments have a guaranteed lifetime, and kernel object pointers
    (PTR_TO_BTF_ID, PTR_TO_CTX) are passed in their unmodified form (with
    offset 0). In general, this is not true for PTR_TO_BTF_ID as it can be
    obtained using pointer walks.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
    Link: https://lore.kernel.org/r/cdede0043c47ed7a357f0a915d16f9ce06a1d589.1663778601.git.lorenzo@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:16 +01:00
Artem Savkov 11644af3ae btf: Allow dynamic pointer parameters in kfuncs
Bugzilla: https://bugzilla.redhat.com/2166911

commit b8d31762a0ae6861e1115302ee338560d853e317
Author: Roberto Sassu <roberto.sassu@huawei.com>
Date:   Tue Sep 20 09:59:42 2022 +0200

    btf: Allow dynamic pointer parameters in kfuncs
    
    Allow dynamic pointers (struct bpf_dynptr_kern *) to be specified as
    parameters in kfuncs. Also, ensure that dynamic pointers passed as arguments
    are valid and initialized, are pointers to the stack, and are of the local
    type. More dynamic pointer types can be supported in the future.
    
    To properly detect whether a parameter is of the desired type, introduce
    the stringify_struct() macro to compare the returned structure name with
    the desired name. In addition, protect against structure renames, by
    halting the build with BUILD_BUG_ON(), so that developers have to revisit
    the code.
    
    To check if a dynamic pointer passed to the kfunc is valid and initialized,
    and if its type is local, export the existing functions
    is_dynptr_reg_valid_init() and is_dynptr_type_expected().
    
    Cc: Joanne Koong <joannelkoong@gmail.com>
    Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220920075951.929132-5-roberto.sassu@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:15 +01:00
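
A sketch of the name comparison described above (stringify_struct() follows the commit message; the surrounding variables are illustrative and would live inside the argument-checking code):

    #define str(x) #x
    #define stringify_struct(x) str(struct x)   /* -> "struct bpf_dynptr_kern" */

    /* ref_t / ref_tname stand for the argument's BTF struct type and its name. */
    bool arg_is_dynptr = btf_type_is_struct(ref_t) &&
                         !strcmp(ref_tname, stringify_struct(bpf_dynptr_kern));
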
Artem Savkov 4c0dd85157 bpf: Allow kfuncs to be used in LSM programs
Bugzilla: https://bugzilla.redhat.com/2166911

commit d15bf1501c7533826a616478002c601fcc7671f3
Author: KP Singh <kpsingh@kernel.org>
Date:   Tue Sep 20 09:59:39 2022 +0200

    bpf: Allow kfuncs to be used in LSM programs
    
    In preparation for the addition of new kfuncs, allow kfuncs defined in the
    tracing subsystem to be used in LSM programs by mapping the LSM program
    type to the TRACING hook.
    
    Signed-off-by: KP Singh <kpsingh@kernel.org>
    Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220920075951.929132-2-roberto.sassu@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:15 +01:00
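
Sketched, the mapping amounts to one extra case in the prog-type-to-kfunc-hook translation (the function and enum names below follow my understanding of the kfunc hook code and should be treated as assumptions):

    static enum btf_kfunc_hook bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
    {
            switch (prog_type) {
            case BPF_PROG_TYPE_TRACING:
            case BPF_PROG_TYPE_LSM:      /* new: LSM programs reuse TRACING kfuncs */
                    return BTF_KFUNC_HOOK_TRACING;
            /* ... other program types ... */
            default:
                    return BTF_KFUNC_HOOK_MAX;
            }
    }
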
Artem Savkov 426eb2fc0d bpf: simplify code in btf_parse_hdr
Bugzilla: https://bugzilla.redhat.com/2166911

commit 3a74904ceff3ecdb9d6cc0844ed67df417968eb6
Author: William Dean <williamsukatube@163.com>
Date:   Sat Sep 17 16:42:48 2022 +0800

    bpf: simplify code in btf_parse_hdr
    
    btf_parse_hdr() can directly return the result of 'btf_check_sec_info' to
    simplify the code.
    
    Signed-off-by: William Dean <williamsukatube@163.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220917084248.3649-1-williamsukatube@163.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:14 +01:00
Artem Savkov e859178cb6 bpf/btf: Use btf_type_str() whenever possible
Bugzilla: https://bugzilla.redhat.com/2166911

commit 571f9738bfb3d4b42253c1d0ad26da9fede85f36
Author: Peilin Ye <peilin.ye@bytedance.com>
Date:   Fri Sep 16 13:28:00 2022 -0700

    bpf/btf: Use btf_type_str() whenever possible
    
    We have btf_type_str().  Use it whenever possible in btf.c, instead of
    "btf_kind_str[BTF_INFO_KIND(t->info)]".
    
    Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
    Link: https://lore.kernel.org/r/20220916202800.31421-1-yepeilin.cs@gmail.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:13 +01:00
Artem Savkov 294868a446 bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
Bugzilla: https://bugzilla.redhat.com/2166911

commit a37a32583e282d8d815e22add29bc1e91e19951a
Author: Lorenz Bauer <oss@lmb.io>
Date:   Sat Sep 10 11:01:20 2022 +0000

    bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
    
    When trying to finish resolving a struct member, btf_struct_resolve
    saves the member type id in a u16 temporary variable. This truncates
    the 32 bit type id value if it exceeds UINT16_MAX.
    
    As a result, structs that have members with type ids > UINT16_MAX and
    which need resolution will fail with a message like this:
    
        [67414] STRUCT ff_device size=120 vlen=12
            effect_owners type_id=67434 bits_offset=960 Member exceeds struct_size
    
    Fix this by changing the type of last_member_type_id to u32.
    
    Fixes: a0791f0df7 ("bpf: fix BTF limits")
    Reviewed-by: Stanislav Fomichev <sdf@google.com>
    Signed-off-by: Lorenz Bauer <oss@lmb.io>
    Link: https://lore.kernel.org/r/20220910110120.339242-1-oss@lmb.io
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:12 +01:00
Artem Savkov 05f87e0648 bpf: Export btf_type_by_id() and bpf_log()
Bugzilla: https://bugzilla.redhat.com/2166911

commit 84c6ac417ceacd086efc330afece8922969610b7
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Wed Sep 7 10:40:39 2022 -0600

    bpf: Export btf_type_by_id() and bpf_log()
    
    These symbols will be used in nf_conntrack.ko to support direct writes
    to `nf_conn`.
    
    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Link: https://lore.kernel.org/r/3c98c19dc50d3b18ea5eca135b4fc3a5db036060.1662568410.git.dxu@dxuuu.xyz
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:12 +01:00
Artem Savkov 775481dcf1 bpf/verifier: allow kfunc to return an allocated mem
Bugzilla: https://bugzilla.redhat.com/2166911

commit eb1f7f71c126c8fd50ea81af98f97c4b581ea4ae
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Sep 6 17:13:02 2022 +0200

    bpf/verifier: allow kfunc to return an allocated mem
    
    For drivers (outside of network), the incoming data is not statically
    defined in a struct. Most of the time the data buffer is kzalloc-ed
    and thus we can not rely on eBPF and BTF to explore the data.
    
    This commit allows a kfunc to return arbitrary memory previously allocated
    by the driver.
    An interesting extra point is that the kfunc can mark the exported
    memory region as read only or read/write.
    
    So, when a kfunc is not returning a pointer to a struct but to a plain
    type, we can consider it valid allocated memory assuming that:
    - one of the arguments is either called rdonly_buf_size or
      rdwr_buf_size
    - and this argument is a const from the caller's point of view
    
    We can then use this parameter as the size of the allocated memory.
    
    The memory is either read-only or read-write based on the name
    of the size parameter.
    
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Link: https://lore.kernel.org/r/20220906151303.2780789-7-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:10 +01:00
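
A hypothetical kfunc prototype following the convention described above (the driver names are invented; only the rdwr_buf_size naming rule comes from the commit message):

    /* Returns rdwr_buf_size bytes of read/write memory owned by the driver;
     * the verifier derives the accessible size from the const size argument.
     */
    __u8 *driver_bpf_get_data(struct driver_ctx *ctx, unsigned int offset,
                              const size_t rdwr_buf_size);
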
Artem Savkov 4a3d0f39ff bpf/btf: bump BTF_KFUNC_SET_MAX_CNT
Bugzilla: https://bugzilla.redhat.com/2166911

commit f9b348185f4d684cc19e6bd9b87904823d5aa5ed
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Sep 6 17:13:01 2022 +0200

    bpf/btf: bump BTF_KFUNC_SET_MAX_CNT
    
    net/bpf/test_run.c already provides 20 kfuncs.
    net/netfilter/nf_conntrack_bpf.c also provides an extra 10 kfuncs.

    Given that all the kfuncs are regrouped into one single set, having
    only 2 slots left prevents us from adding more selftests.
    
    Bump it to 256.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Link: https://lore.kernel.org/r/20220906151303.2780789-6-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:10 +01:00
Artem Savkov c903dd1ca2 bpf: split btf_check_subprog_arg_match in two
Bugzilla: https://bugzilla.redhat.com/2166911

commit 95f2f26f3cac06cfc046d2b29e60719d7848ea54
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Tue Sep 6 17:12:58 2022 +0200

    bpf: split btf_check_subprog_arg_match in two
    
    btf_check_subprog_arg_match() was used twice in verifier.c:
    - when checking for the type mismatches between a (sub)prog declaration
      and BTF
    - when checking the call of a subprog to see if the provided arguments
      are correct and valid
    
    This is problematic when we check if the first argument of a program
    (pointer to ctx) is correctly accessed:
    To be able to ensure we access valid memory in the ctx, the verifier
    assumes the pointer to the context is not null.
    This has the side effect of marking the program as accessing the entire
    context, even if the context is never dereferenced.
    
    For example, by checking the context access with the current code, the
    following eBPF program would fail with -EINVAL if the ctx is set to null
    from the userspace:
    
    ```
    SEC("syscall")
    int prog(struct my_ctx *args) {
      return 0;
    }
    ```
    
    In that particular case, we do not want to actually check that the memory
    is correct while checking for the BTF validity, but we just want to
    ensure that the (sub)prog definition matches the BTF we have.
    
    So split btf_check_subprog_arg_match() in two so we can actually check
    for the memory used when in a call, and ignore that part when not.
    
    Note that a further patch is in preparation to disentangle
    btf_check_func_arg_match() from these two purposes, and so right now we
    just add a new hack around that by adding a boolean to this function.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220906151303.2780789-3-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:09 +01:00
Artem Savkov ca63c6e52e bpf: Allow struct argument in trampoline based programs
Bugzilla: https://bugzilla.redhat.com/2166911

commit 720e6a435194fb5237833a4a7ec6aa60a78964a8
Author: Yonghong Song <yhs@fb.com>
Date:   Wed Aug 31 08:26:46 2022 -0700

    bpf: Allow struct argument in trampoline based programs
    
    Allow struct arguments in trampoline based programs where
    the struct size is <= 16 bytes. In such cases, the argument
    will be put into up to 2 registers for the bpf, x86_64 and arm64
    architectures.
    
    To support arch-specific trampoline manipulation,
    add arg_flags for additional struct information about arguments
    in btf_func_model. Such information will be used in arch specific
    function arch_prepare_bpf_trampoline() to prepare argument access
    properly in trampoline.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220831152646.2078089-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:08 +01:00
Artem Savkov a1f618c375 bpf: Always return corresponding btf_type in __get_type_size()
Bugzilla: https://bugzilla.redhat.com/2166911

commit a00ed8430199abbc9d9bf43ea31795bfe98998ca
Author: Yonghong Song <yhs@fb.com>
Date:   Sun Aug 7 10:51:16 2022 -0700

    bpf: Always return corresponding btf_type in __get_type_size()
    
    Currently in function __get_type_size(), the corresponding
    btf_type is returned only in invalid cases. Let us always
    return the btf_type regardless of valid or invalid cases.
    Such a new functionality will be used in subsequent patches.
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220807175116.4179242-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:53:59 +01:00
Artem Savkov a6a7d34ceb btf: Add a new kfunc flag which allows to mark a function to be sleepable
Bugzilla: https://bugzilla.redhat.com/2166911

commit fa96b24204af42274ec13dfb2f2e6990d7510e55
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Fri Aug 5 14:48:14 2022 -0700

    btf: Add a new kfunc flag which allows to mark a function to be sleepable
    
    This allows declaring a kfunc as sleepable and prevents its use in
    a non-sleepable program.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Co-developed-by: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
    Signed-off-by: Hao Luo <haoluo@google.com>
    Link: https://lore.kernel.org/r/20220805214821.1058337-2-haoluo@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-03 15:25:56 +01:00
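
A minimal sketch of how the flag is used (not taken from the commit): registering a hypothetical module kfunc my_sleepable_kfunc() with KF_SLEEPABLE, using the BTF_SET8/BTF_ID_FLAGS infrastructure from the kfunc-flags rework.

    BTF_SET8_START(my_kfunc_ids)
    BTF_ID_FLAGS(func, my_sleepable_kfunc, KF_SLEEPABLE)
    BTF_SET8_END(my_kfunc_ids)

    static const struct btf_kfunc_id_set my_kfunc_set = {
            .owner = THIS_MODULE,
            .set   = &my_kfunc_ids,
    };

    static int __init my_mod_init(void)
    {
            /* The verifier will now refuse calls to my_sleepable_kfunc() from
             * programs that were not loaded as sleepable. */
            return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &my_kfunc_set);
    }
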
Artem Savkov cb6acbdffd bpf: btf: Fix vsnprintf return value check
Bugzilla: https://bugzilla.redhat.com/2137876

commit 58250ae350de8d28ce91ade4605d32c9e7f062a8
Author: Fedor Tokarev <ftokarev@gmail.com>
Date:   Mon Jul 11 23:13:17 2022 +0200

    bpf: btf: Fix vsnprintf return value check
    
    vsnprintf returns the number of characters which would have been written if
    enough space had been available, excluding the terminating null byte. Thus,
    a return value equal to 'len_left' means that the last character has been
    dropped.
    
    Signed-off-by: Fedor Tokarev <ftokarev@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Alan Maguire <alan.maguire@oracle.com>
    Link: https://lore.kernel.org/bpf/20220711211317.GA1143610@laptop

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:44 +01:00
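
The off-by-one being fixed can be illustrated with ordinary snprintf() semantics; a plain userspace sketch, not taken from the patch itself:

    #include <stdio.h>

    static int was_truncated(void)
    {
            char buf[8];
            int ret = snprintf(buf, sizeof(buf), "%s", "12345678");

            /* ret == 8: the output needed 8 characters, but only 7 plus the
             * terminating '\0' fit into buf.  Truncation therefore has to be
             * detected with ret >= sizeof(buf), not ret > sizeof(buf) -- the
             * same reasoning as the len_left check fixed above. */
            return ret >= (int)sizeof(buf);
    }
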
Artem Savkov d581951eb0 bpf: Add support for forcing kfunc args to be trusted
Bugzilla: https://bugzilla.redhat.com/2137876

commit 56e948ffc098a780fefb6c1784a3a2c7b81100a1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Jul 21 15:42:36 2022 +0200

    bpf: Add support for forcing kfunc args to be trusted
    
    Teach the verifier to detect a new KF_TRUSTED_ARGS kfunc flag, which
    means each pointer argument must be trusted, which we define as a
    pointer that is referenced (has non-zero ref_obj_id) and also has its
    offset unchanged, similar to how release functions expect their
    argument. This allows a kfunc to receive pointer arguments unchanged
    from the result of the acquire kfunc.
    
    This is required to ensure that kfuncs that operate on some object only
    work on acquired pointers and not on a normal PTR_TO_BTF_ID of the same
    type, which can be obtained by pointer walking. The restrictions applied to
    release arguments also apply to trusted arguments. This implies that
    strict type matching (not deducing type by recursively following members
    at offset) and OBJ_RELEASE offset checks (ensuring they are zero) are
    used for trusted pointer arguments.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220721134245.2450-5-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:42 +01:00
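
A sketch of what the flag means from a BPF program's point of view. The kfuncs my_obj_acquire(), my_obj_process() and my_obj_release() are hypothetical, with my_obj_process() assumed to be registered with KF_TRUSTED_ARGS (declarations omitted):

    SEC("tc")
    int use_trusted_arg(struct __sk_buff *skb)
    {
            struct my_obj *o = my_obj_acquire();    /* KF_ACQUIRE kfunc        */

            if (!o)
                    return 0;
            my_obj_process(o);       /* ok: referenced pointer at offset 0     */
            /* my_obj_process(&o->member); would be rejected: non-zero offset  */
            my_obj_release(o);       /* KF_RELEASE kfunc                       */
            return 0;
    }
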
Artem Savkov dbfa384357 bpf: Switch to new kfunc flags infrastructure
Bugzilla: https://bugzilla.redhat.com/2137876

commit a4703e3184320d6e15e2bc81d2ccf1c8c883f9d1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Jul 21 15:42:35 2022 +0200

    bpf: Switch to new kfunc flags infrastructure
    
    Instead of populating multiple sets to indicate some attribute and then
    searching for the same BTF ID in each of them, prepare a single unified BTF
    set which indicates whether a kfunc is allowed to be called, and also its
    attributes, if any, at the same time. Now, only one call is needed to
    perform the lookup for both kfunc availability and its attributes.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220721134245.2450-4-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:42 +01:00
Artem Savkov fbd76e5346 bpf: Fix check against plain integer v 'NULL'
Bugzilla: https://bugzilla.redhat.com/2137876

commit a2a5580fcbf808e7c2310e4959b62f9d2157fdb6
Author: Ben Dooks <ben.dooks@sifive.com>
Date:   Thu Jul 14 11:03:22 2022 +0100

    bpf: Fix check against plain integer v 'NULL'
    
    When checking with sparse, btf_show_type_value() is causing a
    warning about checking an integer vs NULL when the macro is passed
    a pointer, due to the 'value != 0' check. Stop sparse from complaining
    by adding a cast to typeof(value).
    
    This fixes the following sparse warnings:
    
    kernel/bpf/btf.c:2579:17: warning: Using plain integer as NULL pointer
    kernel/bpf/btf.c:2581:17: warning: Using plain integer as NULL pointer
    kernel/bpf/btf.c:3407:17: warning: Using plain integer as NULL pointer
    kernel/bpf/btf.c:3758:9: warning: Using plain integer as NULL pointer
    
    Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220714100322.260467-1-ben.dooks@sifive.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:39 +01:00
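
The technique is simply to compare against a zero constant cast to the value's own type, which keeps sparse quiet for both integer and pointer arguments; a generic sketch with a hypothetical macro name:

    /* True when 'value' is non-zero / non-NULL, without sparse warning about
     * comparing a pointer against a plain integer 0. */
    #define is_nonzero(value)  ((value) != (typeof(value))0)
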
Artem Savkov 930fe36094 bpf, libbpf: Add type match support
Bugzilla: https://bugzilla.redhat.com/2137876

commit ec6209c8d42f815bc3bef10934637ca92114cd1b
Author: Daniel Müller <deso@posteo.net>
Date:   Tue Jun 28 16:01:21 2022 +0000

    bpf, libbpf: Add type match support
    
    This patch adds support for the proposed type match relation to
    relo_core where it is shared between userspace and kernel. It plumbs
    through both kernel-side and libbpf-side support.
    
    The matching relation is defined as follows (copy from source):
    - modifiers and typedefs are stripped (and, hence, effectively ignored)
    - generally speaking types need to be of same kind (struct vs. struct, union
      vs. union, etc.)
      - exceptions are struct/union behind a pointer which could also match a
        forward declaration of a struct or union, respectively, and enum vs.
        enum64 (see below)
    Then, depending on type:
    - integers:
      - match if size and signedness match
    - arrays & pointers:
      - target types are recursively matched
    - structs & unions:
      - local members need to exist in target with the same name
      - for each member we recursively check match unless it is already behind a
        pointer, in which case we only check matching names and compatible kind
    - enums:
      - local variants have to have a match in target by symbolic name (but not
        numeric value)
      - size has to match (but enum may match enum64 and vice versa)
    - function pointers:
      - number and position of arguments in local type has to match target
      - for each argument and the return value we recursively check match
    
    Signed-off-by: Daniel Müller <deso@posteo.net>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:35 +01:00
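
A small BPF-side sketch of the relation, assuming libbpf's bpf_core_type_matches() helper macro and the usual BPF headers: the local flavour below matches the kernel's struct sk_buff because each local member exists in the target with the same name and a compatible type.

    /* The ___local "flavour" suffix is stripped before matching. */
    struct sk_buff___local {
            unsigned int len;
            unsigned char *data;
    };

    SEC("tc")
    int probe(struct __sk_buff *ctx)
    {
            if (bpf_core_type_matches(struct sk_buff___local))
                    bpf_printk("struct sk_buff matches the local definition");
            return 0;
    }
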
Artem Savkov ee4f4249cd bpf: minimize number of allocated lsm slots per program
Bugzilla: https://bugzilla.redhat.com/2137876

commit c0e19f2c9a3edd38e4b1bdae98eb44555d02bc31
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Jun 28 10:43:07 2022 -0700

    bpf: minimize number of allocated lsm slots per program
    
    Previous patch adds 1:1 mapping between all 211 LSM hooks
    and bpf_cgroup program array. Instead of reserving a slot per
    possible hook, reserve 10 slots per cgroup for lsm programs.
    Those slots are dynamically allocated on demand and reclaimed.
    
    struct cgroup_bpf {
    	struct bpf_prog_array *    effective[33];        /*     0   264 */
    	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
    	struct hlist_head          progs[33];            /*   264   264 */
    	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
    	u8                         flags[33];            /*   528    33 */
    
    	/* XXX 7 bytes hole, try to pack */
    
    	struct list_head           storages;             /*   568    16 */
    	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
    	struct bpf_prog_array *    inactive;             /*   584     8 */
    	struct percpu_ref          refcnt;               /*   592    16 */
    	struct work_struct         release_work;         /*   608    72 */
    
    	/* size: 680, cachelines: 11, members: 7 */
    	/* sum members: 673, holes: 1, sum holes: 7 */
    	/* last cacheline: 40 bytes */
    };
    
    Reviewed-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20220628174314.1216643-5-sdf@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:33 +01:00
Artem Savkov 9a33161b25 bpf: per-cgroup lsm flavor
Bugzilla: https://bugzilla.redhat.com/2137876

Conflicts: already applied 65d9ecfe0ca73 "bpf: Fix ref_obj_id for dynptr
data slices in verifier"

commit 69fd337a975c7e690dfe49d9cb4fe5ba1e6db44e
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Jun 28 10:43:06 2022 -0700

    bpf: per-cgroup lsm flavor

    Allow attaching to lsm hooks in the cgroup context.

    Attaching to per-cgroup LSM works exactly like attaching
    to other per-cgroup hooks. A new BPF_LSM_CGROUP attach type is added
    to trigger the new mode; the actual lsm hook we attach to is
    signaled via existing attach_btf_id.

    For the hooks that have 'struct socket' or 'struct sock' as its first
    argument, we use the cgroup associated with that socket. For the rest,
    we use 'current' cgroup (this is all on default hierarchy == v2 only).
    Note that for some hooks that work on 'struct sock' we still
    take the cgroup from 'current' because some of them work on the socket
    that hasn't been properly initialized yet.

    Behind the scenes, we allocate a shim program that is attached
    to the trampoline and runs cgroup effective BPF programs array.
    This shim has some rudimentary ref counting and can be shared
    between several programs attaching to the same lsm hook from
    different cgroups.

    Note that this patch bloats cgroup size because we add 211
    cgroup_bpf_attach_type(s) for simplicity's sake. This will be
    addressed in the subsequent patch.

    Also note that we only add non-sleepable flavor for now. To enable
    sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
    shim programs have to be freed via trace rcu, cgroup_bpf.effective
    should be also trace-rcu-managed + maybe some other changes that
    I'm not aware of.

    Reviewed-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20220628174314.1216643-4-sdf@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:33 +01:00
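
A BPF-side sketch of such a hook (not from the commit); the section name follows the "lsm_cgroup/" convention and the program is subsequently attached to a cgroup fd:

    SEC("lsm_cgroup/socket_bind")
    int BPF_PROG(bind_hook, struct socket *sock, struct sockaddr *address,
                 int addrlen)
    {
            /* Runs only for tasks in the cgroup(s) the program is attached to.
             * Returning 1 allows the operation and 0 denies it -- the cgroup
             * program convention assumed in this sketch. */
            return 1;
    }
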
Artem Savkov 0f68342144 bpf: Merge "types_are_compat" logic into relo_core.c
Bugzilla: https://bugzilla.redhat.com/2137876

commit fd75733da2f376c0c8c6513c3cb2ac227082ec5c
Author: Daniel Müller <deso@posteo.net>
Date:   Thu Jun 23 18:29:34 2022 +0000

    bpf: Merge "types_are_compat" logic into relo_core.c
    
    BPF type compatibility checks (bpf_core_types_are_compat()) are
    currently duplicated between kernel and user space. That's a historical
    artifact more than an intentional decision and can lead to subtle bugs where
    one implementation is adjusted but another is forgotten.
    
    That happened with the enum64 work, for example, where the libbpf side
    was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation
    support")) to use the btf_kind_core_compat() helper function but the
    kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64
    support")).
    
    This patch addresses both the duplication issue, by merging both
    implementations and moving them into relo_core.c, and fixes the alluded
    to kind check (by giving preference to libbpf's already adjusted logic).
    
    For discussion of the topic, please refer to:
    https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665
    
    Changelog:
    v1 -> v2:
    - limited libbpf recursion limit to 32
    - changed name to __bpf_core_types_are_compat
    - included warning previously present in libbpf version
    - merged kernel and user space changes into a single patch
    
    Signed-off-by: Daniel Müller <deso@posteo.net>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:31 +01:00
Yauheni Kaliuta 274ba317fc bpf: Add btf enum64 support
Bugzilla: http://bugzilla.redhat.com/2120968

commit 6089fb325cf737eeb2c4d236c94697112ca860da
Author: Yonghong Song <yhs@fb.com>
Date:   Mon Jun 6 23:26:00 2022 -0700

    bpf: Add btf enum64 support
    
    Currently, BTF only supports up to 32-bit enum values with BTF_KIND_ENUM.
    But in the kernel, some enums indeed have 64-bit values, e.g.,
    in uapi bpf.h, we have
      enum {
            BPF_F_INDEX_MASK                = 0xffffffffULL,
            BPF_F_CURRENT_CPU               = BPF_F_INDEX_MASK,
            BPF_F_CTXLEN_MASK               = (0xfffffULL << 32),
      };
    In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK
    as 0, which certainly is incorrect.
    
    This patch added a new btf kind, BTF_KIND_ENUM64, which permits
    64-bit values to cover the above use case. The BTF_KIND_ENUM64 has
    the following three fields following the common type:
      struct btf_enum64 {
        __u32 name_off;
        __u32 val_lo32;
        __u32 val_hi32;
      };
    Currently, btf type section has an alignment of 4 as all element types
    are u32. Representing the value with __u64 will introduce a pad
    for bpf_enum64 and may also introduce misalignment for the 64bit value.
    Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues.
    
    The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64
    to indicate whether the value is signed or unsigned. The kflag intends
    to provide consistent output of the BTF C format with the original
    source code. For example, say the original BTF_KIND_ENUM bit value is 0xffffffff.
    The C format has two choices, printing out 0xffffffff or -1, and current libbpf
    prints it out as an unsigned value. But if the signedness is preserved in btf,
    the value can be printed the same as in the original source code.
    The kflag value 0 means unsigned values, which is consistent with the default
    in libbpf and should also cover most cases.
    
    The new BTF_KIND_ENUM64 is intended to support enum values represented as
    64-bit values. But it can represent all BTF_KIND_ENUM values as well.
    The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has
    to be represented with 64 bits.
    
    In addition, a static inline function btf_kind_core_compat() is introduced which
    will be used later when libbpf relo_core.c changed. Here the kernel shares the
    same relo_core.c with libbpf.
    
      [1] https://reviews.llvm.org/D124641
    
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:11 +02:00
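
The split into two 32-bit halves preserves the 4-byte alignment of the type section; the full value is reconstructed with a helper along these lines:

    static inline __u64 btf_enum64_value(const struct btf_enum64 *e)
    {
            /* e.g. BPF_F_CTXLEN_MASK: val_hi32 = 0xfffff, val_lo32 = 0 */
            return ((__u64)e->val_hi32 << 32) | e->val_lo32;
    }
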
Yauheni Kaliuta 5e92a3254e bpf: Limit maximum modifier chain length in btf_check_type_tags
Bugzilla: https://bugzilla.redhat.com/2120968

commit d1a374a1aeb7e31191448e225ed2f9c5e894f280
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Wed Jun 15 09:51:51 2022 +0530

    bpf: Limit maximum modifier chain length in btf_check_type_tags
    
    On processing a module BTF of module built for an older kernel, we might
    sometimes find that some type points to itself forming a loop. If such a
    type is a modifier, btf_check_type_tags's while loop following the modifier
    chain will be caught in an infinite loop.
    
    Fix this by defining a maximum chain length and bailing out if we spin
    any longer than that.
    
    Fixes: eb596b090558 ("bpf: Ensure type tags precede modifiers in BTF")
    Reported-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220615042151.2266537-1-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:07 +02:00
Yauheni Kaliuta 639d908f91 bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
Bugzilla: https://bugzilla.redhat.com/2120968

commit f858c2b2ca04fc7ead291821a793638ae120c11d
Author: Toke Høiland-Jørgensen <toke@redhat.com>
Date:   Mon Jun 6 09:52:51 2022 +0200

    bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
    
    The verifier allows programs to call global functions as long as their
    argument types match, using BTF to check the function arguments. One of the
    allowed argument types to such global functions is PTR_TO_CTX; however the
    check for this fails on BPF_PROG_TYPE_EXT functions because the verifier
    uses the wrong type to fetch the vmlinux BTF ID for the program context
    type. This failure is seen when an XDP program is loaded using
    libxdp (which loads it as BPF_PROG_TYPE_EXT and attaches it to a global XDP
    type program).
    
    Fix the issue by passing in the target program type instead of the
    BPF_PROG_TYPE_EXT type to bpf_prog_get_ctx() when checking function
    argument compatibility.
    
    The first Fixes tag refers to the latest commit that touched the code in
    question, while the second one points to the code that first introduced
    the global function call verification.
    
    v2:
    - Use resolve_prog_type()
    
    Fixes: 3363bd0cfbb8 ("bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support")
    Fixes: 51c39bb1d5 ("bpf: Introduce function-by-function verification")
    Reported-by: Simon Sundberg <simon.sundberg@kau.se>
    Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/r/20220606075253.28422-1-toke@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:07 +02:00
Yauheni Kaliuta 5b373179cb bpf: Allow kfunc in tracing and syscall programs.
Bugzilla: https://bugzilla.redhat.com/2120968

commit 979497674e63666a99fd7d242dba53a5ca5d628b
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Wed May 18 22:59:08 2022 +0200

    bpf: Allow kfunc in tracing and syscall programs.
    
    Tracing and syscall BPF program types are very convenient for adding BPF
    capabilities to subsystems that are otherwise not BPF capable.
    When we add kfunc capabilities to those program types, we can add
    BPF features to subsystems without having to touch the BPF core.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Link: https://lore.kernel.org/r/20220518205924.399291-2-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:06 +02:00
Yauheni Kaliuta 11fec2f10e bpf: Compute map_btf_id during build time
Bugzilla: https://bugzilla.redhat.com/2120968

commit c317ab71facc2cd0a94145973318a4c914e11acc
Author: Menglong Dong <imagedong@tencent.com>
Date:   Mon Apr 25 21:32:47 2022 +0800

    bpf: Compute map_btf_id during build time
    
    For now, the field 'map_btf_id' in 'struct bpf_map_ops' for all map
    types is computed during vmlinux-btf init:
    
      btf_parse_vmlinux() -> btf_vmlinux_map_ids_init()
    
    It will lookup the btf_type according to the 'map_btf_name' field in
    'struct bpf_map_ops'. This process can be done during build time,
    thanks to Jiri's resolve_btfids.
    
    selftest of map_ptr has passed:
    
      $96 map_ptr:OK
      Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
    
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Menglong Dong <imagedong@tencent.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:00 +02:00
Yauheni Kaliuta 39845718d1 bpf: Make BTF type match stricter for release arguments
Bugzilla: https://bugzilla.redhat.com/2120968

commit 2ab3b3808eb17f729edfd69e061667ca0a427195
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:57 2022 +0530

    bpf: Make BTF type match stricter for release arguments
    
    The current behavior of btf_struct_ids_match for release arguments is
    that when the type match fails, it retries with the first member type again
    (recursively). Since the offset is already 0, this is akin to just
    casting the pointer in normal C, since if the type matches it was just
    embedded inside the parent struct as an object. However, we want to reject
    such cases for release function type matching, be it a kfunc or a BPF helper.
    
    An example is the following:
    
    struct foo {
    	struct bar b;
    };
    
    struct foo *v = acq_foo();
    rel_bar(&v->b); // btf_struct_ids_match fails btf_types_are_same, then
    		// retries with first member type and succeeds, while
    		// it should fail.
    
    Hence, don't walk the struct and only rely on btf_types_are_same for
    strict mode. All users of strict mode must be dealing with zero offset
    anyway, since otherwise they would want the struct to be walked.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-10-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
Yauheni Kaliuta 716c115570 bpf: Teach verifier about kptr_get kfunc helpers
Bugzilla: https://bugzilla.redhat.com/2120968

commit a1ef195996526da45bbc9710849254023df75aea
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:56 2022 +0530

    bpf: Teach verifier about kptr_get kfunc helpers
    
    We introduce a new style of kfunc helpers, namely *_kptr_get, where they
    take pointer to the map value which points to a referenced kernel
    pointer contained in the map. Since this is referenced, only
    bpf_kptr_xchg from BPF side and xchg from kernel side is allowed to
    change the current value, and each pointer that resides in that location
    would be referenced, and RCU protected (this must be kept in mind while
    adding kernel types embeddable as reference kptr in BPF maps).
    
    This means that if we do the load of the pointer value in an RCU read
    section and find a live pointer, then as long as we hold the RCU read lock,
    it won't be freed by a parallel xchg + release operation. This allows us
    to implement a safe refcount increment scheme. Hence, enforce that the first
    argument of all such kfuncs is a proper PTR_TO_MAP_VALUE pointing at the
    right offset to the referenced pointer.
    
    For the rest of the arguments, they are subjected to typical kfunc
    argument checks, hence allowing some flexibility in passing more intent
    into how the reference should be taken.
    
    For instance, in case of struct nf_conn, it is not freed until RCU grace
    period ends, but can still be reused for another tuple once refcount has
    dropped to zero. Hence, a bpf_ct_kptr_get helper not only needs to call
    refcount_inc_not_zero, but also do a tuple match after incrementing the
    reference, and when it fails to match it, put the reference again and
    return NULL.
    
    This can be implemented easily if we allow passing additional parameters
    to the bpf_ct_kptr_get kfunc, like a struct bpf_sock_tuple * and a
    tuple__sz pair.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-9-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
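
A condensed sketch of the scheme described above, for a hypothetical refcounted struct my_obj stored as a referenced kptr in a map value:

    struct my_obj {
            refcount_t ref;
            /* ... */
    };

    /* Kernel-side kfunc: takes a pointer to the map value slot holding the
     * kptr and returns either NULL or a new reference. */
    struct my_obj *my_obj_kptr_get(struct my_obj **pp)
    {
            struct my_obj *obj;

            rcu_read_lock();
            obj = READ_ONCE(*pp);                    /* load the current kptr */
            if (obj && !refcount_inc_not_zero(&obj->ref))
                    obj = NULL;                      /* being freed, give up  */
            rcu_read_unlock();

            return obj;
    }
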
Yauheni Kaliuta 12c4199b33 bpf: Wire up freeing of referenced kptr
Bugzilla: https://bugzilla.redhat.com/2120968

commit 14a324f6a67ef6a53e04362a70160a47eb8afffa
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:55 2022 +0530

    bpf: Wire up freeing of referenced kptr
    
    A destructor kfunc can be defined as void func(type *), where type may
    be void or any other pointer type as per convenience.
    
    In this patch, we ensure that the type is sane and capture the function
    pointer into off_desc of ptr_off_tab for the specific pointer offset,
    with the invariant that the dtor pointer is always set when 'kptr_ref'
    tag is applied to the pointer's pointee type, which is indicated by the
    flag BPF_MAP_VALUE_OFF_F_REF.
    
    Note that only BTF IDs whose destructor kfunc is registered become
    the allowed BTF IDs for embedding as a referenced kptr. Hence it serves
    the purpose of finding the dtor kfunc BTF ID, as well as acting as a check
    against the whitelist of allowed BTF IDs for this purpose.
    
    Finally, wire up the actual freeing of the referenced pointer, if any, at
    all available offsets, so that no references are leaked after the BPF
    map goes away and the BPF program previously moved the ownership of a
    referenced pointer into it.
    
    The behavior is similar to BPF timers, where bpf_map_{update,delete}_elem
    will free any existing referenced kptr. The same case is with LRU map's
    bpf_lru_push_free/htab_lru_push_free functions, which are extended to
    reset unreferenced and free referenced kptr.
    
    Note that unlike BPF timers, kptr is not reset or freed when map uref
    drops to zero.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-8-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
Yauheni Kaliuta 24419e5e2e bpf: Populate pairs of btf_id and destructor kfunc in btf
Bugzilla: https://bugzilla.redhat.com/2120968

commit 5ce937d613a423ca3102f53d9f3daf4210c1b6e2
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:54 2022 +0530

    bpf: Populate pairs of btf_id and destructor kfunc in btf
    
    To support storing referenced PTR_TO_BTF_ID in maps, we require
    associating a specific BTF ID with a 'destructor' kfunc. This is because
    we need to release a live referenced pointer at a certain offset in map
    value from the map destruction path, otherwise we end up leaking
    resources.
    
    Hence, introduce support for passing an array of btf_id, kfunc_btf_id
    pairs that denote a BTF ID and its associated release function. Then,
    add an accessor 'btf_find_dtor_kfunc' which can be used to look up the
    destructor kfunc of a certain BTF ID. If found, we can use it to free
    the object from the map free path.
    
    The registration of these pairs also serves as a whitelist of structures
    which are allowed as referenced PTR_TO_BTF_ID in a BPF map, because
    without finding the destructor kfunc, we will bail and return an error.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-7-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
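
A sketch of the registration described above, for a hypothetical struct my_obj released by the kfunc my_obj_release(); the BTF IDs are resolved at build time via resolve_btfids:

    BTF_ID_LIST(my_dtor_ids)
    BTF_ID(struct, my_obj)
    BTF_ID(func, my_obj_release)

    static int __init my_obj_init(void)
    {
            const struct btf_id_dtor_kfunc dtors[] = {
                    {
                            .btf_id       = my_dtor_ids[0],
                            .kfunc_btf_id = my_dtor_ids[1],
                    },
            };

            /* From now on struct my_obj may be embedded as a referenced kptr in
             * map values, and the map free path will call my_obj_release(). */
            return register_btf_id_dtor_kfuncs(dtors, ARRAY_SIZE(dtors), THIS_MODULE);
    }
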
Yauheni Kaliuta f008a624b3 bpf: Allow storing referenced kptr in map
Bugzilla: https://bugzilla.redhat.com/2120968

commit c0a5a21c25f37c9fd7b36072f9968cdff1e4aa13
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:51 2022 +0530

    bpf: Allow storing referenced kptr in map
    
    Extending the code in previous commits, introduce referenced kptr
    support, which needs to be tagged using 'kptr_ref' tag instead. Unlike
    unreferenced kptrs, referenced kptrs have a lot more restrictions. In
    addition to the type matching, only the newly introduced bpf_kptr_xchg
    helper is allowed to modify the map value at that offset. This transfers
    the referenced pointer being stored into the map, releasing the
    reference state for the program, returning the old value and
    creating new reference state for the returned pointer.

    Similar to the unreferenced pointer case, the return value will
    also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
    must either be eventually released by calling the corresponding release
    function, or it must be transferred into another map.
    
    It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
    the value, and obtain the old value if any.
    
    BPF_LDX, BPF_STX, and BPF_ST cannot access a referenced kptr. A future
    commit will permit using BPF_LDX for such pointers, and will attempt to
    make it safe, since the lifetime of the object won't be guaranteed.
    
    There are valid reasons to enforce the restriction of permitting only
    bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
    consistent in the face of concurrent modification, and any prior values
    contained in the map must also be released before a new one is moved
    into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
    returns the old value, which the verifier would require the user to
    either free or move into another map, and releases the reference held
    for the pointer being moved in.
    
    In the future, direct BPF_XCHG instruction may also be permitted to work
    like bpf_kptr_xchg helper.
    
    Note that process_kptr_func doesn't have to call
    check_helper_mem_access, since we already disallow rdonly/wronly flags
    for map, which is what check_map_access_type checks, and we already
    ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
    so check_map_access is also not required.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:11 +02:00
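
A BPF-side sketch of the exchange, spelling out the 'kptr_ref' type tag used by this series; my_obj_acquire()/my_obj_release() are hypothetical acquire/release kfuncs and 'v' is assumed to point at a looked-up map value:

    #define __kptr_ref __attribute__((btf_type_tag("kptr_ref")))

    struct map_value {
            struct my_obj __kptr_ref *obj;          /* referenced kptr field  */
    };

    /* Inside a BPF program: */
    struct my_obj *old, *new;

    new = my_obj_acquire();
    if (!new)
            return 0;
    old = bpf_kptr_xchg(&v->obj, new);  /* ownership of 'new' moves into the map */
    if (old)
            my_obj_release(old);        /* returned old value must be released   */
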
Yauheni Kaliuta 6e36eccfc4 bpf: Tag argument to be released in bpf_func_proto
Bugzilla: https://bugzilla.redhat.com/2120968

commit 8f14852e89113d738c99c375b4c8b8b7e1073df1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:50 2022 +0530

    bpf: Tag argument to be released in bpf_func_proto
    
    Add a new type flag for bpf_arg_type that when set tells verifier that
    for a release function, that argument's register will be the one for
    which meta.ref_obj_id will be set, and which will then be released
    using release_reference. To capture the regno, introduce a new field
    release_regno in bpf_call_arg_meta.
    
    This would be required in the next patch, where we may either pass NULL
    or a refcounted pointer as an argument to the release function
    bpf_kptr_xchg. Just releasing only when meta.ref_obj_id is set is not
    enough, as there is a case where the type of argument needed matches,
    but the ref_obj_id is set to 0. Hence, we must enforce that whenever
    meta.ref_obj_id is zero, the register that is to be released can only
    be NULL for a release function.
    
    Since we now indicate whether an argument is to be released in
    bpf_func_proto itself, the is_release_function helper has lost its utility;
    hence refactor the code to work without it, and just rely on
    meta.release_regno to know when to release state for a ref_obj_id.
    Still, the restriction of one release argument and only one ref_obj_id
    passed to BPF helper or kfunc remains. This may be lifted in the future.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-3-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:11 +02:00
Yauheni Kaliuta 9b0b6285f7 bpf: Allow storing unreferenced kptr in map
Bugzilla: https://bugzilla.redhat.com/2120968

commit 61df10c7799e27807ad5e459eec9d77cddf8bf45
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:49 2022 +0530

    bpf: Allow storing unreferenced kptr in map
    
    This commit introduces a new pointer type 'kptr' which can be embedded
    in a map value to hold a PTR_TO_BTF_ID stored by a BPF program during
    its invocation. When storing such a kptr, BPF program's PTR_TO_BTF_ID
    register must have the same type as in the map value's BTF, and loading
    a kptr marks the destination register as PTR_TO_BTF_ID with the correct
    kernel BTF and BTF ID.
    
    Such kptrs are unreferenced, i.e. by the time another invocation of the
    BPF program loads this pointer, the object which the pointer points to
    may no longer exist. Since PTR_TO_BTF_ID loads (using BPF_LDX) are
    patched to PROBE_MEM loads by the verifier, it would be safe to allow the
    user to still access such an invalid pointer, but passing such pointers into
    BPF helpers and kfuncs should not be permitted. A future patch in this
    series will close this gap.
    
    The flexibility offered by allowing programs to dereference such invalid
    pointers while being safe at runtime frees the verifier from doing
    complex lifetime tracking. As long as the user may ensure that the
    object remains valid, it can ensure data read by it from the kernel
    object is valid.
    
    The user indicates that a certain pointer must be treated as kptr
    capable of accepting stores of PTR_TO_BTF_ID of a certain type, by using
    a BTF type tag 'kptr' on the pointed to type of the pointer. Then, this
    information is recorded in the object BTF which will be passed into the
    kernel by way of map's BTF information. The name and kind from the map
    value BTF is used to look up the in-kernel type, and the actual BTF and
    BTF ID is recorded in the map struct in a new kptr_off_tab member. For
    now, only storing pointers to structs is permitted.
    
    An example of this specification is shown below:
    
    	#define __kptr __attribute__((btf_type_tag("kptr")))
    
    	struct map_value {
    		...
    		struct task_struct __kptr *task;
    		...
    	};
    
    Then, in a BPF program, user may store PTR_TO_BTF_ID with the type
    task_struct into the map, and then load it later.
    
    Note that the destination register is marked PTR_TO_BTF_ID_OR_NULL, as
    the verifier cannot know whether the value is NULL or not statically, it
    must treat all potential loads at that map value offset as loading a
    possibly NULL pointer.
    
    Only BPF_LDX, BPF_STX, and BPF_ST (with insn->imm = 0 to denote NULL)
    are allowed instructions that can access such a pointer. On BPF_LDX, the
    destination register is updated to be a PTR_TO_BTF_ID, and on BPF_STX,
    it is checked whether the source register type is a PTR_TO_BTF_ID with
    same BTF type as specified in the map BTF. The access size must always
    be BPF_DW.
    
    For the map in map support, the kptr_off_tab for outer map is copied
    from the inner map's kptr_off_tab. It was chosen to do a deep copy
    instead of introducing a refcount to kptr_off_tab, because the copy only
    needs to be done when parameterizing using inner_map_fd in the map in map
    case, hence it would be unnecessary for all other users.
    
    It is not permitted to use MAP_FREEZE command and mmap for BPF map
    having kptrs, similar to the bpf_timer case. A kptr also requires that
    BPF program has both read and write access to the map (hence both
    BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG are disallowed).
    
    Note that check_map_access must be called from both
    check_helper_mem_access and for the BPF instructions, hence the kptr
    check must distinguish between ACCESS_DIRECT and ACCESS_HELPER, and
    reject ACCESS_HELPER cases. We rename stack_access_src to bpf_access_src
    and reuse it for this purpose.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:11 +02:00
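
Continuing the map_value example from the commit message, a minimal BPF-side usage sketch; my_map and key are assumed to exist, and bpf_get_current_task_btf() is used merely as a convenient source of a PTR_TO_BTF_ID:

    struct map_value *v = bpf_map_lookup_elem(&my_map, &key);

    if (!v)
            return 0;

    v->task = bpf_get_current_task_btf();   /* BPF_STX: type must match map BTF */

    struct task_struct *t = v->task;        /* BPF_LDX: PTR_TO_BTF_ID_OR_NULL   */
    if (t)
            bpf_printk("stored task pid %d", t->pid);  /* PROBE_MEM-backed load */
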
Yauheni Kaliuta 7e5456e641 bpf: Make btf_find_field more generic
Bugzilla: https://bugzilla.redhat.com/2120968

commit 42ba1308074d9046386d58b56e793604be48ce22
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Apr 15 21:33:42 2022 +0530

    bpf: Make btf_find_field more generic
    
    Next commit introduces field type 'kptr' whose kind will not be struct,
    but pointer, and it will not be limited to one offset, but multiple
    ones. Make existing btf_find_struct_field and btf_find_datasec_var
    functions amenable to use for finding kptrs in map value, by moving
    spin_lock and timer specific checks into their own function.
    
    The alignment, and name are checked before the function is called, so it
    is the last point where we can skip field or return an error before the
    next loop iteration happens. Size of the field and type is meant to be
    checked inside the function.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20220415160354.1050687-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:10 +02:00
Yauheni Kaliuta b21e06ba5d bpf: Ensure type tags precede modifiers in BTF
Bugzilla: https://bugzilla.redhat.com/2120968

commit eb596b0905584a9389585b0f437cf8a2faeb14d0
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Apr 19 22:16:07 2022 +0530

    bpf: Ensure type tags precede modifiers in BTF
    
    It is guaranteed that for modifiers, clang always places type tags
    before other modifiers, and then the base type. We would like to rely on
    this guarantee inside the kernel to make it simple to parse type tags
    from BTF.
    
    However, a user would be allowed to construct a BTF without such
    guarantees. Hence, add a pass to check that in modifier chains, type
    tags only occur at the head of the chain, and then don't occur later in
    the chain.
    
    If we see a type tag, we can have one or more type tags preceding other
    modifiers that then never have another type tag. If we see other
    modifiers, all modifiers following them should never be a type tag.
    
    Instead of having to walk chains we verified previously, we can remember
    the last good modifier type ID which headed a good chain. At that point,
    we must have verified all other chains headed by type IDs less than it.
    This makes the verification process less costly, and it becomes a simple
    O(n) pass.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220419164608.1990559-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:10 +02:00
Jerome Marchand 45d1f7dabd bpf: Fix maximum permitted number of arguments check
Bugzilla: https://bugzilla.redhat.com/2120966

commit c29a4920dfcaa1433b09e2674f605f72767a385c
Author: Yuntao Wang <ytcoode@gmail.com>
Date:   Fri Mar 25 00:42:38 2022 +0800

    bpf: Fix maximum permitted number of arguments check

    Since the m->arg_size array can hold up to MAX_BPF_FUNC_ARGS argument
    sizes, it's ok that nargs is equal to MAX_BPF_FUNC_ARGS.

    Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20220324164238.1274915-1-ytcoode@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:08 +02:00
Jerome Marchand 571461ff91 bpf: Simplify check in btf_parse_hdr()
Bugzilla: https://bugzilla.redhat.com/2120966

commit 583669ab3aed29994e50bde6c66b52d44e1bdb73
Author: Yuntao Wang <ytcoode@gmail.com>
Date:   Sun Mar 20 15:52:40 2022 +0800

    bpf: Simplify check in btf_parse_hdr()

    Replace offsetof(hdr_len) + sizeof(hdr_len) with offsetofend(hdr_len) to
    simplify the check for correctness of btf_data_size in btf_parse_hdr()

    Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20220320075240.1001728-1-ytcoode@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
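
offsetofend() folds the offset + size computation into one macro, so the check reads roughly:

    /* offsetofend(TYPE, MEMBER) == offsetof(TYPE, MEMBER) + sizeof of that member */
    if (btf_data_size < offsetofend(struct btf_header, hdr_len))
            return -EINVAL;     /* not even the fixed part of the header is there */
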
Jerome Marchand 8415d528f2 bpf: Check for NULL return from bpf_get_btf_vmlinux
Bugzilla: https://bugzilla.redhat.com/2120966

commit 7ada3787e91c89b0aa7abf47682e8e587b855c13
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sun Mar 20 20:00:03 2022 +0530

    bpf: Check for NULL return from bpf_get_btf_vmlinux

    When CONFIG_DEBUG_INFO_BTF is disabled, bpf_get_btf_vmlinux can return a
    NULL pointer. Check for it in btf_get_module_btf to prevent a NULL pointer
    dereference.

    While kernel test robot only complained about this specific case, let's
    also check for NULL in other call sites of bpf_get_btf_vmlinux.

    Fixes: 9492450fd287 ("bpf: Always raise reference in btf_get_module_btf")
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220320143003.589540-1-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
Jerome Marchand 7211feaef4 bpf: Always raise reference in btf_get_module_btf
Bugzilla: https://bugzilla.redhat.com/2120966

commit 9492450fd28736262dea9143ebb3afc2c131ace1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Mar 17 17:29:51 2022 +0530

    bpf: Always raise reference in btf_get_module_btf

    Align it with helpers like bpf_find_btf_id, so all functions returning
    BTF in out parameter follow the same rule of raising reference
    consistently, regardless of module or vmlinux BTF.

    Adjust existing callers to handle the change accordingly.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220317115957.3193097-10-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
Jerome Marchand a6464a8f1f bpf: Factor out fd returning from bpf_btf_find_by_name_kind
Bugzilla: https://bugzilla.redhat.com/2120966

commit edc3ec09ab706c45e955f7a52f0904b4ed649ca9
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Mar 17 17:29:43 2022 +0530

    bpf: Factor out fd returning from bpf_btf_find_by_name_kind

    In the next few patches, we need a helper that searches all kernel BTFs
    (vmlinux and module BTFs), and finds the type denoted by 'name' and
    'kind'. Turns out bpf_btf_find_by_name_kind already does the same thing,
    but it instead returns a BTF ID and optionally fd (if module BTF). This
    is used for relocating ksyms in BPF loader code (bpftool gen skel -L).

    We extract the core code out into a new helper bpf_find_btf_id, which
    returns the BTF ID in the return value, and BTF pointer in an out
    parameter. The reference for the returned BTF pointer is always raised,
    hence user must either transfer it (e.g. to a fd), or release it after
    use.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220317115957.3193097-2-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:05 +02:00
Jerome Marchand 5018bf8095 bpf: Reject programs that try to load __percpu memory.
Bugzilla: https://bugzilla.redhat.com/2120966

commit 5844101a1be9b8636024cb31c865ef13c7cc6db3
Author: Hao Luo <haoluo@google.com>
Date:   Fri Mar 4 11:16:56 2022 -0800

    bpf: Reject programs that try to load __percpu memory.

    With the introduction of the btf_type_tag "percpu", we can add a
    MEM_PERCPU to identify those pointers that point to percpu memory.
    The ability to differentiate percpu pointers from regular memory
    pointers has two benefits:

     1. It forbids unexpected use of percpu pointers, such as direct loads.
        In kernel, there are special functions used for accessing percpu
        memory. Directly loading percpu memory is meaningless. We already
        have BPF helpers like bpf_per_cpu_ptr() and bpf_this_cpu_ptr() that
        wrap the kernel percpu functions. So we can now convert percpu
        pointers into regular pointers in a safe way.

     2. Previously, bpf_per_cpu_ptr() and bpf_this_cpu_ptr() only work on
        PTR_TO_PERCPU_BTF_ID, a special reg_type which describes static
        percpu variables in kernel (we rely on pahole to encode them into
        vmlinux BTF). Now, since we can identify __percpu tagged pointers,
        we can also identify dynamically allocated percpu memory as well.
        It means we can use bpf_xxx_cpu_ptr() on dynamic percpu memory.
        This would be very convenient when accessing fields like
        "cgroup->rstat_cpu".

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220304191657.981240-4-haoluo@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:56 +02:00
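
A sketch of benefit 2, assuming a vmlinux.h generated with __percpu type-tag support and a trusted struct cgroup pointer 'cgrp' plus a cpu number already in scope:

    struct cgroup_rstat_cpu *rcpu;

    /* cgrp->rstat_cpu is tagged __percpu, so dereferencing it directly is
     * rejected; converting it with the helper yields a regular pointer. */
    rcpu = bpf_per_cpu_ptr(cgrp->rstat_cpu, cpu);
    if (rcpu)
            bpf_printk("got per-cpu rstat for cpu %u", cpu);
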
Jerome Marchand 9bc38e28b6 bpf: Harden register offset checks for release helpers and kfuncs
Bugzilla: https://bugzilla.redhat.com/2120966

commit 24d5bb806c7e2c0b9972564fd493069f612d90dd
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Mar 5 04:16:41 2022 +0530

    bpf: Harden register offset checks for release helpers and kfuncs

    Let's ensure that the PTR_TO_BTF_ID reg being passed in to release BPF
    helpers and kfuncs always has its offset set to 0. While not a real
    problem now, there's a very real possibility this will become a problem
    when more and more kfuncs are exposed, and more BPF helpers are added
    which can release PTR_TO_BTF_ID.

    Previous commits already protected against non-zero var_off. One of the
    cases we are concerned about now is when we have a type that can be
    returned by e.g. an acquire kfunc:

    struct foo {
    	int a;
    	int b;
    	struct bar br;
    };

    ... and struct bar is also a type that can be returned by another
    acquire kfunc.

    Then, doing the following sequence:

    	struct foo *f = bpf_get_foo(); // acquire kfunc
    	if (!f)
    		return 0;
    	bpf_put_bar(&f->br); // release kfunc

    ... would work with the current code, since the btf_struct_ids_match
    takes reg->off into account for matching pointer type with release kfunc
    argument type, but would obviously be incorrect, and most likely lead to
    a kernel crash. A test has been included later to prevent regressions in
    this area.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220304224645.3677453-5-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:56 +02:00
Jerome Marchand 0925903186 bpf: Fix PTR_TO_BTF_ID var_off check
Bugzilla: https://bugzilla.redhat.com/2120966

commit 655efe5089f077485eec848272bd7e26b1a5a735
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Mar 5 04:16:39 2022 +0530

    bpf: Fix PTR_TO_BTF_ID var_off check

    When kfunc support was added, check_ctx_reg was called for PTR_TO_CTX
    register, but no offset checks were made for PTR_TO_BTF_ID. Only
    reg->off was taken into account by btf_struct_ids_match, which protected
    against type mismatch due to non-zero reg->off, but when reg->off was
    zero, a user could set the variable offset of the register and allow it
    to be passed to kfunc, leading to bad pointer being passed into the
    kernel.

    Fix this by reusing the extracted helper check_func_arg_reg_off from
    previous commit, and make one call before checking all supported
    register types. Since the list is maintained, any future changes will be
    taken into account by updating check_func_arg_reg_off. This function
    prevents non-zero var_off to be set for PTR_TO_BTF_ID, but still allows
    a fixed non-zero reg->off, which is needed for type matching to work
    correctly when using pointer arithmetic.

    ARG_DONTCARE is passed as arg_type, since kfuncs don't support
    accepting an ARG_PTR_TO_ALLOC_MEM without relying on the size of the parameter
    type from BTF (in case of a pointer), or using a mem, len pair. The
    forcing of offset check for ARG_PTR_TO_ALLOC_MEM is done because ringbuf
    helpers obtain the size from the header located at the beginning of the
    memory region, hence any changes to the original pointer shouldn't be
    allowed. In case of kfunc, size is always known, either at verification
    time, or using the length parameter, hence this forcing is not required.

    Since this check will happen once already for PTR_TO_CTX, remove the
    check_ptr_off_reg call inside its block.

    Fixes: e6ac2450d6 ("bpf: Support bpf program calling kernel function")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220304224645.3677453-3-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:56 +02:00
Jerome Marchand edbd96eb68 bpf: Add config to allow loading modules with BTF mismatches
Bugzilla: https://bugzilla.redhat.com/2120966

commit 5e214f2e43e453d862ebbbd2a4f7ee3fe650f209
Author: Connor O'Brien <connoro@google.com>
Date:   Wed Feb 23 01:28:14 2022 +0000

    bpf: Add config to allow loading modules with BTF mismatches

    BTF mismatch can occur for a separately-built module even when the ABI is
    otherwise compatible and nothing else would prevent it from loading successfully.

    Add a new Kconfig to control how mismatches are handled. By default, preserve
    the current behavior of refusing to load the module. If MODULE_ALLOW_BTF_MISMATCH
    is enabled, load the module but ignore its BTF information.

    Suggested-by: Yonghong Song <yhs@fb.com>
    Suggested-by: Michal Suchánek <msuchanek@suse.de>
    Signed-off-by: Connor O'Brien <connoro@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/CAADnVQJ+OVPnBz8z3vNu8gKXX42jCUqfuvhWAyCQDu8N_yqqwQ@mail.gmail.com
    Link: https://lore.kernel.org/bpf/20220223012814.1898677-1-connoro@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Jerome Marchand c5056baccd bpf: Cleanup comments
Bugzilla: https://bugzilla.redhat.com/2120966

commit c561d11063009323a0e57c528cb1d77b7d2c41e0
Author: Tom Rix <trix@redhat.com>
Date:   Sun Feb 20 10:40:55 2022 -0800

    bpf: Cleanup comments

    Add leading space to spdx tag
    Use // for spdx c file comment

    Replacements
    resereved to reserved
    inbetween to in between
    everytime to every time
    intutivie to intuitive
    currenct to current
    encontered to encountered
    referenceing to referencing
    upto to up to
    exectuted to executed

    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20220220184055.3608317-1-trix@redhat.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Jerome Marchand 5b0c453bf8 bpf: Initialize ret to 0 inside btf_populate_kfunc_set()
Bugzilla: https://bugzilla.redhat.com/2120966

commit d0b3822902b6af45f2c75706d7eb2a35aacab223
Author: Souptick Joarder (HPE) <jrdr.linux@gmail.com>
Date:   Sat Feb 19 22:09:15 2022 +0530

    bpf: Initialize ret to 0 inside btf_populate_kfunc_set()

    Kernel test robot reported below error ->

    kernel/bpf/btf.c:6718 btf_populate_kfunc_set()
    error: uninitialized symbol 'ret'.

    Initialize ret to 0.

    Fixes: dee872e124e8 ("bpf: Populate kfunc BTF ID sets in struct btf")
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Souptick Joarder (HPE) <jrdr.linux@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/bpf/20220219163915.125770-1-jrdr.linux@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Jerome Marchand a1698130ea libbpf: Split bpf_core_apply_relo()
Bugzilla: https://bugzilla.redhat.com/2120966

commit adb8fa195efdfaac5852aaac24810b456ce43b04
Author: Mauricio Vásquez <mauricio@kinvolk.io>
Date:   Tue Feb 15 17:58:50 2022 -0500

    libbpf: Split bpf_core_apply_relo()

    BTFGen needs to run the core relocation logic in order to understand
    what are the types involved in a given relocation.

    Currently bpf_core_apply_relo() calculates and **applies** a relocation
    to an instruction. Having both operations in the same function makes it
    difficult to only calculate the relocation without patching the
    instruction. This commit splits that logic in two different phases: (1)
    calculate the relocation and (2) patch the instruction.

    For the first phase bpf_core_apply_relo() is renamed to
    bpf_core_calc_relo_insn(), which is now only in charge of calculating the
    relocation; the second phase uses the already existing
    bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and
    the BTFGen will use only bpf_core_calc_relo_insn().

    Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
    Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
    Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
    Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220215225856.671072-2-mauricio@kinvolk.io

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:50 +02:00
Jerome Marchand b8affec23c bpf: Implement bpf_core_types_are_compat().
Bugzilla: https://bugzilla.redhat.com/2120966

commit e70e13e7d4ab8f932f49db1c9500b30a34a6d420
Author: Matteo Croce <mcroce@microsoft.com>
Date:   Fri Feb 4 01:55:18 2022 +0100

    bpf: Implement bpf_core_types_are_compat().

    Adopt libbpf's bpf_core_types_are_compat() for kernel duty by adding
    explicit recursion limit of 2 which is enough to handle 2 levels of
    function prototypes.

    Signed-off-by: Matteo Croce <mcroce@microsoft.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220204005519.60361-2-mcroce@linux.microsoft.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:47 +02:00
Jerome Marchand 7b7955f647 bpf: reject program if a __user tagged memory accessed in kernel way
Bugzilla: https://bugzilla.redhat.com/2120966

commit c6f1bfe89ac95dc829dcb4ed54780da134ac5fce
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Jan 27 07:46:06 2022 -0800

    bpf: reject program if a __user tagged memory accessed in kernel way

    BPF verifier supports direct memory access for BPF_PROG_TYPE_TRACING type
    of bpf programs, e.g., a->b. If "a" is a pointer
    pointing to kernel memory, the bpf verifier will allow the user to write
    code in C like a->b and the verifier will translate it to a kernel
    load properly. If "a" is a pointer to user memory, it is expected
    that the bpf developer should use the bpf_probe_read_user() helper to
    get the value a->b. Without utilizing BTF __user tagging information,
    the current verifier will assume that a->b is a kernel memory access
    and this may generate an incorrect result.

    Now that BTF contains __user information, the verifier can check whether
    the pointer points to user memory or not. If it does, the verifier
    can reject the program and force users to use the bpf_probe_read_user()
    helper explicitly.

    In the future, we can easily extend btf_add_space for other
    address space tagging, for example, rcu/percpu etc.

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220127154606.654961-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:44 +02:00
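
A minimal program-side sketch of the rule above, assuming a tracing program, a hypothetical attach point, and the __user tagged task->rseq field purely as an illustration:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("fentry/do_nanosleep")              /* hypothetical attach point */
    int BPF_PROG(read_user_field)
    {
            struct task_struct *task = bpf_get_current_task_btf();
            struct rseq *rs = task->rseq;   /* tagged __user in task_struct */
            __u32 cpu_id = 0;

            /* A direct rs->cpu_id load would now be rejected; the user
             * memory has to be fetched explicitly instead.
             */
            bpf_probe_read_user(&cpu_id, sizeof(cpu_id), &rs->cpu_id);
            return 0;
    }
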
Jerome Marchand c841220259 bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF
Bugzilla: https://bugzilla.redhat.com/2120966

commit c446fdacb10dcb3b9a9ed3b91d91e72d71d94b03
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Jan 25 16:13:40 2022 -0800

    bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF

    Commit dee872e124e8 ("bpf: Populate kfunc BTF ID sets in struct btf")
    breaks loading of some modules when CONFIG_DEBUG_INFO_BTF is not set.
    register_btf_kfunc_id_set returns -ENOENT to the callers when
    there is no module btf. Let's return 0 (success) instead to let
    those modules work in !CONFIG_DEBUG_INFO_BTF cases.

    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Fixes: dee872e124e8 ("bpf: Populate kfunc BTF ID sets in struct btf")
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20220126001340.1573649-1-sdf@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:44 +02:00
Jerome Marchand aa7f151c25 bpf: Add reference tracking support to kfunc
Bugzilla: https://bugzilla.redhat.com/2120966

Conflicts:
Simple context change due to already applied commit 45ce4b4f9009
("bpf: Fix crash due to out of bounds access into reg2btf_ids.")

commit 5c073f26f9dc78a6c8194b23eac7537c9692c7d7
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:48 2022 +0530

    bpf: Add reference tracking support to kfunc

    This patch adds verifier support for PTR_TO_BTF_ID return type of kfunc
    to be a reference, by reusing acquire_reference_state/release_reference
    support for existing in-kernel bpf helpers.

    We make use of the three kfunc types:

    - BTF_KFUNC_TYPE_ACQUIRE
      Return true if kfunc_btf_id is an acquire kfunc.  This will
      acquire_reference_state for the returned PTR_TO_BTF_ID (this is the
      only allowed return value). Note that an acquire kfunc must always
      return a PTR_TO_BTF_ID{_OR_NULL}, otherwise the program is rejected.

    - BTF_KFUNC_TYPE_RELEASE
      Return true if kfunc_btf_id is a release kfunc.  This will release the
      reference to the passed in PTR_TO_BTF_ID which has a reference state
      (from earlier acquire kfunc).
      The btf_check_func_arg_match returns the regno (of argument register,
      hence > 0) if the kfunc is a release kfunc, and a proper referenced
      PTR_TO_BTF_ID is being passed to it.
      This is similar to how helper call check uses bpf_call_arg_meta to
      store the ref_obj_id that is later used to release the reference.
      Similar to in-kernel helper, we only allow passing one referenced
      PTR_TO_BTF_ID as an argument. It can also be passed in to normal
      kfunc, but in case of release kfunc there must always be one
      PTR_TO_BTF_ID argument that is referenced.

    - BTF_KFUNC_TYPE_RET_NULL
      For kfunc returning PTR_TO_BTF_ID, tells if it can be NULL, hence
      forces the caller to mark the pointer not null (using a check) before
      accessing it. Note that taking into account the case fixed by commit
      93c230e3f5 ("bpf: Enforce id generation for all may-be-null register type")
      we assign a non-zero id for mark_ptr_or_null_reg logic. Later, if more
      return types are supported by kfunc, which have a _OR_NULL variant, it
      might be better to move this id generation under a common
      reg_type_may_be_null check, similar to the case in the commit.

    Referenced PTR_TO_BTF_ID is currently only limited to kfunc, but can be
    extended in the future to other BPF helpers as well.  For now, we can
    rely on the btf_struct_ids_match check to ensure we get the pointer to
    the expected struct type. In the future, care needs to be taken to avoid
    ambiguity for reference PTR_TO_BTF_ID passed to release function, in
    case multiple candidates can release same BTF ID.

    e.g. there might be two release kfuncs (or kfunc and helper):

    foo(struct abc *p);
    bar(struct abc *p);

    ... such that both release a PTR_TO_BTF_ID with btf_id of struct abc. In
    this case we would need to track the acquire function corresponding to
    the release function to avoid type confusion, and store this information
    in the register state so that an incorrect program can be rejected. This
    is not a problem right now, hence it is left as an exercise for the
    future patch introducing such a case in the kernel.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-6-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:40 +02:00
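
As a rough sketch of how acquire/release kfuncs look from the program side (the kfunc names are borrowed from the bpf selftest kfuncs added around this series; the tc section and control flow are illustrative assumptions):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    extern struct prog_test_ref_kfunc *
    bpf_kfunc_call_test_acquire(unsigned long *scalar_ptr) __ksym;
    extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;

    SEC("tc")
    int kfunc_ref_example(struct __sk_buff *skb)
    {
            struct prog_test_ref_kfunc *p;
            unsigned long sp = 0;

            p = bpf_kfunc_call_test_acquire(&sp);   /* acquire, may return NULL */
            if (!p)                                  /* ret_null forces this check */
                    return 0;
            /* ... use p ... */
            bpf_kfunc_call_test_release(p);          /* reference must be released */
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
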
Jerome Marchand dd8556d8d2 bpf: Introduce mem, size argument pair support for kfunc
Bugzilla: https://bugzilla.redhat.com/2120966

Conflicts:
Context change due to already applied commit be80a1d3f9db ("bpf:
Generalize check_ctx_reg for reuse with other types")

commit d583691c47dc0424ebe926000339a6d6cd590ff7
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:47 2022 +0530

    bpf: Introduce mem, size argument pair support for kfunc

    BPF helpers can associate two adjacent arguments together to pass memory
    of certain size, using ARG_PTR_TO_MEM and ARG_CONST_SIZE arguments.
    Since we don't use bpf_func_proto for kfunc, we need to leverage BTF to
    implement similar support.

    The ARG_CONST_SIZE processing for helpers is refactored into a common
    check_mem_size_reg helper that is shared with kfunc as well. kfunc
    ptr_to_mem support follows logic similar to global functions, where
    verification is done as if pointer is not null, even when it may be
    null.

    This leads to a simple-to-follow rule for writing kfuncs: always check
    the argument pointer for NULL, except when it is PTR_TO_CTX. Also, the
    PTR_TO_CTX case is only safe when the helper expecting a pointer to
    program ctx is not exposed to other programs where the same struct is
    not the ctx type. In that case, the type check will fall through to
    other cases and would permit passing other types of pointers, possibly
    NULL at runtime.

    Currently, we require the size argument to be suffixed with "__sz" in
    the parameter name. This information is then recorded in kernel BTF and
    verified during function argument checking. In the future we can use BTF
    tagging instead, and modify the kernel function definitions. This will
    be a purely kernel-side change.

    This allows us to have some form of backwards compatibility for
    structures that are passed in to the kernel function with their size,
    and allow variable length structures to be passed in if they are
    accompanied by a size parameter.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-5-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
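
A sketch of the naming convention on the kernel side; the kfunc below is hypothetical (it would still have to be added to a registered kfunc BTF ID set), but the __sz suffix is what pairs the two arguments:

    /* "buf" and "buf__sz" are treated as a mem + size pair because the
     * size parameter name carries the __sz suffix.
     */
    noinline int bpf_example_sum_buf(void *buf, int buf__sz)
    {
            char *p = buf;
            int i, sum = 0;

            for (i = 0; i < buf__sz; i++)
                    sum += p[i];
            return sum;
    }

On the BPF program side, a call such as bpf_example_sum_buf(data, sizeof(data)) then only verifies if the first argument really points to at least that much readable memory.
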
Jerome Marchand 1306b4efce bpf: Remove check_kfunc_call callback and old kfunc BTF ID API
Bugzilla: https://bugzilla.redhat.com/2120966

commit b202d84422223b7222cba5031d182f20b37e146e
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:46 2022 +0530

    bpf: Remove check_kfunc_call callback and old kfunc BTF ID API

    Completely remove the old code for check_kfunc_call to help it work
    with modules, and also the callback itself.

    The previous commit adds infrastructure to register all sets and put
    them in vmlinux or module BTF, and concatenates all related sets
    organized by the hook and the type. Once populated, these sets remain
    immutable for the lifetime of the struct btf.

    Also, since we don't need the 'owner' module anywhere when doing
    check_kfunc_call, drop the 'btf_modp' module parameter from
    find_kfunc_desc_btf.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-4-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
Jerome Marchand 25756e43b7 bpf: Populate kfunc BTF ID sets in struct btf
Bugzilla: https://bugzilla.redhat.com/2120966

commit dee872e124e8d5de22b68c58f6f6c3f5e8889160
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:45 2022 +0530

    bpf: Populate kfunc BTF ID sets in struct btf

    This patch prepares the kernel to support putting all kinds of kfunc BTF
    ID sets in the struct btf itself. The various kernel subsystems will
    make register_btf_kfunc_id_set call in the initcalls (for built-in code
    and modules).

    The 'hook' is one of the many program types, e.g. XDP and TC/SCHED_CLS,
    STRUCT_OPS, and 'types' are check (allowed or not), acquire, release,
    and ret_null (with PTR_TO_BTF_ID_OR_NULL return type).

    A maximum of BTF_KFUNC_SET_MAX_CNT (32) kfunc BTF IDs are permitted in a
    set of certain hook and type for vmlinux sets, since they are allocated
    on demand, and otherwise set as NULL. Module sets can only be registered
    once per hook and type, hence they are directly assigned.

    A new btf_kfunc_id_set_contains function is exposed for use in verifier,
    this new method is faster than the existing list searching method, and
    is also automatic. It also lets other code not care whether the set is
    unallocated or not.

    Note that module code can only do a single register_btf_kfunc_id_set call
    per hook. This is why sorting is only done for in-kernel vmlinux sets,
    because there might be multiple sets for the same hook and type that
    must be concatenated, hence sorting them is required to ensure bsearch
    in btf_id_set_contains continues to work correctly.

    Next commit will update the kernel users to make use of this
    infrastructure.

    Finally, add __maybe_unused annotation for BTF ID macros for the
    !CONFIG_DEBUG_INFO_BTF case, so that they don't produce warnings during
    build time.

    The previous patch is also needed to provide synchronization against
    initialization for module BTF's kfunc_set_tab introduced here, as
    described below:

      The kfunc_set_tab pointer in struct btf is write-once (if we consider
      the registration phase (comprised of multiple register_btf_kfunc_id_set
      calls) as a single operation). In this sense, once it has been fully
      prepared, it isn't modified, only used for lookup (from the verifier
      context).

      For btf_vmlinux, it is initialized fully during the do_initcalls phase,
      which happens fairly early in the boot process, before any processes are
      present. This also eliminates the possibility of bpf_check being called
      at that point, thus relieving us of ensuring any synchronization between
      the registration and lookup function (btf_kfunc_id_set_contains).

      However, the case for module BTF is a bit tricky. The BTF is parsed,
      prepared, and published from the MODULE_STATE_COMING notifier callback.
      After this, the module initcalls are invoked, where our registration
      function will be called to populate the kfunc_set_tab for module BTF.

      At this point, BTF may be available to userspace while its corresponding
      module is still initializing. A BTF fd can then be passed to the verifier
      using bpf syscall (e.g. for kfunc call insn).

      Hence, there is a race window where verifier may concurrently try to
      lookup the kfunc_set_tab. To prevent this race, we must ensure the
      operations are serialized, or waiting for the __init functions to
      complete.

      In the earlier registration API, this race was alleviated as verifier
      bpf_check_mod_kfunc_call didn't find the kfunc BTF ID until it was added
      by the registration function (called usually at the end of module __init
      function after all module resources have been initialized). If the
      verifier made the check_kfunc_call before kfunc BTF ID was added to the
      list, it would fail verification (saying call isn't allowed). The
      access to list was protected using a mutex.

      Now, it would still fail verification, but for a different reason
      (returning ENXIO due to the failed btf_try_get_module call in
      add_kfunc_call), because if the __init call is in progress the module
      will be in the middle of MODULE_STATE_COMING -> MODULE_STATE_LIVE
      transition, and the BTF_MODULE_LIVE flag for btf_module instance will
      not be set, so the btf_try_get_module call will fail.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
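
For orientation, a sketch of the registration flow this describes, assuming the btf_kfunc_id_set layout of this series (exact field names can differ between kernel versions), with a hypothetical kfunc:

    BTF_SET_START(example_check_kfunc_ids)
    BTF_ID(func, bpf_example_kfunc)                 /* hypothetical kfunc */
    BTF_SET_END(example_check_kfunc_ids)

    static const struct btf_kfunc_id_set example_kfunc_set = {
            .owner     = THIS_MODULE,
            .check_set = &example_check_kfunc_ids,
            /* .acquire_set, .release_set, .ret_null_set as needed */
    };

    static int __init example_kfunc_init(void)
    {
            /* called from an initcall for both built-in code and modules;
             * the program type acts as the 'hook'.
             */
            return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS,
                                             &example_kfunc_set);
    }
    late_initcall(example_kfunc_init);
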
Jerome Marchand ab96ac72cd bpf: Fix UAF due to race between btf_try_get_module and load_module
Bugzilla: https://bugzilla.redhat.com/2120966

commit 18688de203b47e5d8d9d0953385bf30b5949324f
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:44 2022 +0530

    bpf: Fix UAF due to race between btf_try_get_module and load_module

    While working on code to populate kfunc BTF ID sets for module BTF from
    its initcall, I noticed that by the time the initcall is invoked, the
    module BTF can already be seen by userspace (and the BPF verifier). The
    existing btf_try_get_module calls try_module_get which only fails if
    mod->state == MODULE_STATE_GOING, i.e. it can increment module reference
    when module initcall is happening in parallel.

    Currently, BTF parsing happens from MODULE_STATE_COMING notifier
    callback. At this point, the module initcalls have not been invoked.
    The notifier callback parses and prepares the module BTF, allocates an
    ID, which publishes it to userspace, and then adds it to the btf_modules
    list allowing the kernel to invoke btf_try_get_module for the BTF.

    However, at this point, the module has not been fully initialized (i.e.
    its initcalls have not finished). The code in module.c can still fail
    and free the module, without caring for other users. However, nothing
    stops btf_try_get_module from succeeding between the state transition
    from MODULE_STATE_COMING to MODULE_STATE_LIVE.

    This leads to a use-after-free issue when BPF program loads
    successfully in the state transition, load_module's do_init_module call
    fails and frees the module, and BPF program fd on close calls module_put
    for the freed module. Future patch has test case to verify we don't
    regress in this area in future.

    There are multiple points after prepare_coming_module (in load_module)
    where failure can occur and module loading can return error. We
    illustrate and test for the race using the last point where it can
    practically occur (in module __init function).

    An illustration of the race:

    CPU 0                           CPU 1
    			  load_module
    			    notifier_call(MODULE_STATE_COMING)
    			      btf_parse_module
    			      btf_alloc_id	// Published to userspace
    			      list_add(&btf_mod->list, btf_modules)
    			    mod->init(...)
    ...				^
    bpf_check		        |
    check_pseudo_btf_id             |
      btf_try_get_module            |
        returns true                |  ...
    ...                             |  module __init in progress
    return prog_fd                  |  ...
    ...                             V
    			    if (ret < 0)
    			      free_module(mod)
    			    ...
    close(prog_fd)
     ...
     bpf_prog_free_deferred
      module_put(used_btf.mod) // use-after-free

    We fix this issue by setting a flag BTF_MODULE_F_LIVE, from the notifier
    callback when MODULE_STATE_LIVE state is reached for the module, so that
    we return NULL from btf_try_get_module for modules that are not fully
    formed. Since try_module_get already checks that module is not in
    MODULE_STATE_GOING state, and that is the only transition a live module
    can make before being removed from btf_modules list, this is enough to
    close the race and prevent the bug.

    A later selftest patch crafts the race condition artificially to verify
    that it has been fixed, and that the verifier fails to load the program
    (with ENXIO).

    Lastly, a couple of comments:

     1. Even if this race didn't exist, it seems more appropriate to only
        access resources (ksyms and kfuncs) of a fully formed module which
        has been initialized completely.

     2. This patch was born out of need for synchronization against module
        initcall for the next patch, so it is needed for correctness even
        without the aforementioned race condition. The BTF resources
        initialized by module initcall are set up once and then only looked
        up, so just waiting until the initcall has finished ensures correct
        behavior.

    Fixes: 541c3bad8d ("bpf: Support BPF ksym variables in kernel modules")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-2-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
Artem Savkov 176866ac1f bpf: Fix crash due to out of bounds access into reg2btf_ids.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 45ce4b4f9009102cd9f581196d480a59208690c1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Feb 17 01:49:43 2022 +0530

    bpf: Fix crash due to out of bounds access into reg2btf_ids.

    When commit e6ac2450d6 ("bpf: Support bpf program calling kernel function") added
    kfunc support, it defined reg2btf_ids as a cheap way to translate the verifier
    reg type to the appropriate btf_vmlinux BTF ID, however
    commit c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
    moved the __BPF_REG_TYPE_MAX from the last member of bpf_reg_type enum to after
    the base register types, and defined other variants using type flag
    composition. However, now, the direct usage of reg->type to index into
    reg2btf_ids may no longer fall into __BPF_REG_TYPE_MAX range, and hence lead to
    out of bounds access and kernel crash on dereference of bad pointer.

    Fixes: c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220216201943.624869-1-memxor@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:55 +02:00
Artem Savkov c336e6fbf1 bpf: Generalize check_ctx_reg for reuse with other types
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit be80a1d3f9dbe5aee79a325964f7037fe2d92f30
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Mon Jan 10 14:05:49 2022 +0000

    bpf: Generalize check_ctx_reg for reuse with other types

    Generalize the check_ctx_reg() helper function into a more generic named one
    so that it can be reused for other register types as well to check whether
    their offset is non-zero. No functional change.

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Acked-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:54 +02:00
Artem Savkov 0adb10f4f1 bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Omitted-fix: f858c2b2ca04 bpf: Fix calling global functions from
             BPF_PROG_TYPE_EXT programs
    It has not-so-small dependencies, so given it is from 5.18, which is
    planned to be backported in 9.2 as well, I'm leaving it till later.

commit 3363bd0cfbb80dfcd25003cd3815b0ad8b68d0ff
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Dec 17 07:20:24 2021 +0530

    bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support

    Allow passing PTR_TO_CTX, if the kfunc expects a matching struct type,
    and punt to PTR_TO_MEM block if reg->type does not fall in one of
    PTR_TO_BTF_ID or PTR_TO_SOCK* types. This will be used by future commits
    to get access to XDP and TC PTR_TO_CTX, and pass various data (flags,
    l4proto, netns_id, etc.) encoded in opts struct passed as pointer to
    kfunc.

    For PTR_TO_MEM support, arguments are currently limited to pointer to
    scalar, or pointer to struct composed of scalars. This is done so that
    unsafe scenarios (like passing PTR_TO_MEM where PTR_TO_BTF_ID of
    in-kernel valid structure is expected, which may have pointers) are
    avoided. Since the argument checking happens based on the argument
    register type, it is not easy to ascertain what the expected type is. In
    the future, support for PTR_TO_MEM for kfunc can be extended to serve
    other use cases. The struct type whose pointer is passed in may have a
    maximum nesting depth of 4, all recursively composed of scalars or
    structs with scalars.

    Future commits will add negative tests that check whether these
    restrictions imposed for kfunc arguments are duly rejected by BPF
    verifier or not.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217015031.1278167-4-memxor@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
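
A sketch of the kind of kfunc signature this enables; the names are hypothetical, loosely modeled on the conntrack-lookup kfuncs mentioned as future users:

    /* xdp_ctx must match the program's ctx type to be accepted as
     * PTR_TO_CTX; opts points to a struct composed only of scalars, so it
     * can be passed as PTR_TO_MEM.
     */
    struct example_lookup_opts {
            __u32 netns_id;
            __u8  l4proto;
            __u8  reserved[3];
    };

    struct nf_conn *bpf_example_ct_lookup(struct xdp_md *xdp_ctx,
                                          struct example_lookup_opts *opts);
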
Artem Savkov 7f76bfc54f bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 216e3cd2f28dbbf1fe86848e0e29e6693b9f0a20
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:51 2021 -0800

    bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.

    Some helper functions may modify their arguments, for example,
    bpf_d_path, bpf_get_stack etc. Previously, their argument types
    were marked as ARG_PTR_TO_MEM, which is compatible with read-only
    mem types, such as PTR_TO_RDONLY_BUF. Therefore it's legitimate,
    but technically incorrect, to modify read-only memory by passing
    it into one of these helper functions.

    This patch tags the bpf_args compatible with immutable memory with
    MEM_RDONLY flag. The arguments that don't have this flag will be
    only compatible with mutable memory types, preventing the helper
    from modifying a read-only memory. The bpf_args that have
    MEM_RDONLY are compatible with both mutable memory and immutable
    memory.

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217003152.48334-9-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
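
A sketch of what the tagging looks like in a helper proto; the helper is hypothetical, the argument flags are the ones this series introduces:

    /* The helper only reads the buffer, so its pointer argument is also
     * compatible with read-only memory.
     */
    BPF_CALL_2(bpf_example_hash, const void *, buf, u32, size)
    {
            /* ... read-only use of buf ... */
            return 0;
    }

    static const struct bpf_func_proto bpf_example_hash_proto = {
            .func      = bpf_example_hash,
            .gpl_only  = false,
            .ret_type  = RET_INTEGER,
            .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY,
            .arg2_type = ARG_CONST_SIZE,
    };
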
Artem Savkov eb324cec1c bpf: Convert PTR_TO_MEM_OR_NULL to composable types.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit cf9f2f8d62eca810afbd1ee6cc0800202b000e57
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:49 2021 -0800

    bpf: Convert PTR_TO_MEM_OR_NULL to composable types.

    Remove PTR_TO_MEM_OR_NULL and replace it with PTR_TO_MEM combined with
    flag PTR_MAYBE_NULL.

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217003152.48334-7-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
Artem Savkov 86317abded bpf: Introduce MEM_RDONLY flag
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 20b2aff4bc15bda809f994761d5719827d66c0b4
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:48 2021 -0800

    bpf: Introduce MEM_RDONLY flag

    This patch introduces a flag MEM_RDONLY to tag a reg value
    pointing to read-only memory. It makes the following changes:

    1. PTR_TO_RDWR_BUF -> PTR_TO_BUF
    2. PTR_TO_RDONLY_BUF -> PTR_TO_BUF | MEM_RDONLY

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217003152.48334-6-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
Artem Savkov 2938af7b1d bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit c25b2ae136039ffa820c26138ed4a5e5f3ab3841
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:47 2021 -0800

    bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL

    We have introduced a new type to make bpf_reg composable, by
    allocating bits in the type to represent flags.

    One of the flags is PTR_MAYBE_NULL which indicates a pointer
    may be NULL. This patch switches the qualified reg_types to
    use this flag. The reg_types changed in this patch include:

    1. PTR_TO_MAP_VALUE_OR_NULL
    2. PTR_TO_SOCKET_OR_NULL
    3. PTR_TO_SOCK_COMMON_OR_NULL
    4. PTR_TO_TCP_SOCK_OR_NULL
    5. PTR_TO_BTF_ID_OR_NULL
    6. PTR_TO_MEM_OR_NULL
    7. PTR_TO_RDONLY_BUF_OR_NULL
    8. PTR_TO_RDWR_BUF_OR_NULL

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/r/20211217003152.48334-5-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:49 +02:00
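
A self-contained toy model of the composition idea (the bit values are illustrative only, not the kernel's actual layout):

    #include <stdio.h>

    enum base_type { PTR_TO_MEM = 1, PTR_TO_BTF_ID = 2 };
    #define BASE_TYPE_MASK  0xff
    #define PTR_MAYBE_NULL  (1 << 8)
    #define MEM_RDONLY      (1 << 9)

    static int base_type(int t) { return t & BASE_TYPE_MASK; }

    int main(void)
    {
            int reg_type = PTR_TO_MEM | PTR_MAYBE_NULL | MEM_RDONLY;

            /* checks look at the base type and the flag bits separately
             * instead of enumerating every _OR_NULL variant
             */
            if (base_type(reg_type) == PTR_TO_MEM && (reg_type & PTR_MAYBE_NULL))
                    printf("PTR_TO_MEM that may be NULL\n");
            if (reg_type & MEM_RDONLY)
                    printf("read-only memory\n");
            return 0;
    }
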
Artem Savkov 003660d21e bpf: Allow access to int pointer arguments in tracing programs
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit bb6728d756112596881a5fdf2040544031905840
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Wed Dec 8 20:32:41 2021 +0100

    bpf: Allow access to int pointer arguments in tracing programs

    Adding support to access int pointer arguments
    in tracing programs.

    Currently we allow tracing programs to access only pointers to
    string (char pointer), void pointers and pointers to structs.

    If we try to access an argument which is a pointer to int, the verifier
    will fail to load the program with:

      R1 type=ctx expected=fp
      ; int BPF_PROG(fmod_ret_test, int _a, __u64 _b, int _ret)
      0: (bf) r6 = r1
      ; int BPF_PROG(fmod_ret_test, int _a, __u64 _b, int _ret)
      1: (79) r9 = *(u64 *)(r6 +8)
      func 'bpf_modify_return_test' arg1 type INT is not a struct

    There is no harm in the program accessing an int pointer argument.
    We are already doing that for a string pointer, which is a pointer
    to an int with 1-byte size.

    Changing the is_string_ptr to generic integer check and renaming
    it to btf_type_is_int.

    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211208193245.172141-2-jolsa@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:47 +02:00
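
With the check relaxed, the argument can be declared as an int pointer and its target read through a probe read; a sketch mirroring the bpf_modify_return_test function quoted above (section name assumed):

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("fmod_ret/bpf_modify_return_test")
    int BPF_PROG(fmod_ret_example, int a, int *b, int ret)
    {
            int b_val = 0;

            /* reading the pointer argument itself is what this change
             * allows; dereferencing it still goes through a probe read
             */
            bpf_probe_read_kernel(&b_val, sizeof(b_val), b);
            return 0;
    }
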
Artem Savkov d475f1fc96 bpf: Silence coverity false positive warning.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit f18a499799dd0f0fdd98cf72d98d3866ce9ac60e
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Sat Dec 11 18:08:19 2021 -0800

    bpf: Silence coverity false positive warning.

    Coverity issued the following warning:
    6685            cands = bpf_core_add_cands(cands, main_btf, 1);
    6686            if (IS_ERR(cands))
    >>>     CID 1510300:    (RETURN_LOCAL)
    >>>     Returning pointer "cands" which points to local variable "local_cand".
    6687                    return cands;

    It's a false positive.
    Add ERR_CAST() to silence it.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:47 +02:00
Artem Savkov a1c11bfa7b bpf: Use kmemdup() to replace kmalloc + memcpy
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 4674f21071b935c237217ac02cb310522d6ad95d
Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Date:   Thu Dec 9 14:21:22 2021 +0800

    bpf: Use kmemdup() to replace kmalloc + memcpy

    Eliminate the following coccicheck warning:

    ./kernel/bpf/btf.c:6537:13-20: WARNING opportunity for kmemdup.

    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/1639030882-92383-1-git-send-email-jiapeng.chong@linux.alibaba.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:47 +02:00
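
The transformation in question, shown generically (buffer names and length are placeholders):

    /* before: open-coded duplication */
    new_ptr = kmalloc(len, GFP_KERNEL);
    if (!new_ptr)
            return -ENOMEM;
    memcpy(new_ptr, old_ptr, len);

    /* after: kmemdup() allocates and copies in one call */
    new_ptr = kmemdup(old_ptr, len, GFP_KERNEL);
    if (!new_ptr)
            return -ENOMEM;
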
Artem Savkov 67e859d1a6 bpf: Remove redundant assignment to pointer t
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 73b6eae583f44e278e19489a411f9c1e22d530fc
Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Tue Dec 7 22:47:18 2021 +0000

    bpf: Remove redundant assignment to pointer t

    The pointer t is being initialized with a value that is never read. The
    pointer is re-assigned a value a little later on, hence the initialization
    is redundant and can be removed.

    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211207224718.59593-1-colin.i.king@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:45 +02:00
Artem Savkov a611a702c6 bpf: Silence purge_cand_cache build warning.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 29f2e5bd9439445fe14ba8570b1c9a7ad682df84
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Mon Dec 6 17:48:39 2021 -0800

    bpf: Silence purge_cand_cache build warning.

    When CONFIG_DEBUG_INFO_BTF_MODULES is not set
    the following warning can be seen:
    kernel/bpf/btf.c:6588:13: warning: 'purge_cand_cache' defined but not used [-Wunused-function]
    Fix it.

    Fixes: 1e89106da253 ("bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211207014839.6976-1-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:45 +02:00
Artem Savkov 65a903edb8 bpf: Disallow BPF_LOG_KERNEL log level for bpf(BPF_BTF_LOAD)
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 866de407444398bc8140ea70de1dba5f91cc34ac
Author: Hou Tao <houtao1@huawei.com>
Date:   Fri Dec 3 13:30:01 2021 +0800

    bpf: Disallow BPF_LOG_KERNEL log level for bpf(BPF_BTF_LOAD)

    BPF_LOG_KERNEL is only used internally, so disallow bpf_btf_load()
    to set log level as BPF_LOG_KERNEL. The same checking has already
    been done in bpf_check(), so factor out a helper to check the
    validity of log attributes and use it in both places.

    Fixes: 8580ac9404 ("bpf: Process in-kernel BTF")
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20211203053001.740945-1-houtao1@huawei.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:44 +02:00
Artem Savkov bad77b29d8 libbpf: Reduce bpf_core_apply_relo_insn() stack usage.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 78c1f8d0634cc35da613d844eda7c849fc50f643
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Fri Dec 3 10:28:36 2021 -0800

    libbpf: Reduce bpf_core_apply_relo_insn() stack usage.

    Reduce bpf_core_apply_relo_insn() stack usage and bump
    BPF_CORE_SPEC_MAX_LEN limit back to 64.

    Fixes: 29db4bea1d10 ("bpf: Prepare relo_core.c for kernel duty.")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211203182836.16646-1-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:44 +02:00
Artem Savkov 1471e757ae bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 1e89106da25390826608ad6ac0edfb7c9952eff3
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:31 2021 -0800

    bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().

    Given BPF program's BTF root type name perform the following steps:
    . search in vmlinux candidate cache.
    . if (present in cache and candidate list >= 1) return candidate list.
    . do a linear search through kernel BTFs for possible candidates.
    . regardless of number of candidates found populate vmlinux cache.
    . if (candidate list >= 1) return candidate list.
    . search in module candidate cache.
    . if (present in cache) return candidate list (even if list is empty).
    . do a linear search through BTFs of all kernel modules
      collecting candidates from all of them.
    . regardless of number of candidates found populate module cache.
    . return candidate list.
    Then wire the result into bpf_core_apply_relo_insn().

    When a BPF program is trying to CO-RE relocate a type
    that doesn't exist in either vmlinux BTF or in module BTFs,
    these steps will perform 2 cache lookups when the cache is hit.

    Note the cache doesn't prevent the abuse by the program that might
    have lots of relocations that cannot be resolved. Hence cond_resched().

    CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-9-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:43 +02:00
Artem Savkov 1413cea45f bpf: Adjust BTF log size limit.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit c5a2d43e998a821701029f23e25b62f9188e93ff
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:29 2021 -0800

    bpf: Adjust BTF log size limit.

    Make the BTF log size limit the same as the verifier log size limit.
    Otherwise tools that progressively increase the log size and use the same log
    for BTF loading and program loading will be hitting a hard-to-debug EINVAL.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-7-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 77c4b3ac35 bpf: Pass a set of bpf_core_relo-s to prog_load command.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit fbd94c7afcf99c9f3b1ba1168657ecc428eb2c8d
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:28 2021 -0800

    bpf: Pass a set of bpf_core_relo-s to prog_load command.

    struct bpf_core_relo is generated by llvm and processed by libbpf.
    It's a de-facto uapi.
    With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure.
    Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command
    and let the kernel perform CO-RE relocations.

    Note the struct bpf_line_info and struct bpf_func_info have the same
    layout when passed from LLVM to libbpf and from libbpf to the kernel
    except "insn_off" fields means "byte offset" when LLVM generates it.
    Then libbpf converts it to "insn index" to pass to the kernel.
    The struct bpf_core_relo's "insn_off" field is always "byte offset".

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-6-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
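
For context, a sketch of the record and of how it reaches BPF_PROG_LOAD; field names follow the UAPI added by this series, but check the headers of the kernel at hand:

    /* one CO-RE relocation record, emitted by LLVM and forwarded by libbpf */
    struct bpf_core_relo {
            __u32 insn_off;         /* byte offset of the instruction to patch */
            __u32 type_id;          /* BTF type id of the local root type      */
            __u32 access_str_off;   /* offset of the "0:1:..." access string   */
            enum bpf_core_relo_kind kind;
    };

    /* In union bpf_attr for BPF_PROG_LOAD, the fields core_relos,
     * core_relo_cnt and core_relo_rec_size point at an array of such
     * records.
     */
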
Artem Savkov 540b7ebc57 bpf: Prepare relo_core.c for kernel duty.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: merge conflict upstream resolved in be3158290db8

commit 29db4bea1d10b73749d7992c1fc9ac13499e8871
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:26 2021 -0800

    bpf: Prepare relo_core.c for kernel duty.

    Make relo_core.c compile for the kernel and for user space libbpf.

    Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32.
    This is the maximum number of nested structs and arrays.
    For example:
     struct sample {
         int a;
         struct {
             int b[10];
         };
     };

     struct sample *s = ...;
     int *y = &s->b[5];
    This field access is encoded as "0:1:0:5" and spec len is 4.

    The follow up patch might bump it back to 64.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-4-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 92b13fc051 bpf: Rename btf_member accessors.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 8293eb995f349aed28006792cad4cb48091919dd
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:25 2021 -0800

    bpf: Rename btf_member accessors.

    Rename btf_member_bit_offset() and btf_member_bitfield_size() to
    avoid conflicts with similarly named helpers in libbpf's btf.h.
    Rename the kernel helpers, since libbpf helpers are part of uapi.

    Suggested-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-3-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 5cebd099b9 bpf: Introduce btf_tracing_ids
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit d19ddb476a539fd78ad1028ae13bb38506286931
Author: Song Liu <songliubraving@fb.com>
Date:   Fri Nov 12 07:02:43 2021 -0800

    bpf: Introduce btf_tracing_ids

    Similar to btf_sock_ids, btf_tracing_ids provides BTF IDs for task_struct,
    file, and vm_area_struct via an easy-to-understand format like
    btf_tracing_ids[BTF_TRACING_TYPE_[TASK|file|VMA]].

    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20211112150243.1270987-3-songliubraving@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:37 +02:00
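
Roughly, the list is defined once and then indexed through the enum; a condensed sketch (the explicit BTF_ID lines stand in for the macro-generated ones in btf.c):

    /* kernel/bpf/btf.c: resolve the three tracing types once */
    BTF_ID_LIST_GLOBAL(btf_tracing_ids, MAX_BTF_TRACING_TYPE)
    BTF_ID(struct, task_struct)
    BTF_ID(struct, file)
    BTF_ID(struct, vm_area_struct)

    /* users index the list by enum, e.g. in a helper proto:
     *   .arg1_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK],
     */
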
Artem Savkov f06641eab5 bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 9e2ad638ae3632ef916ceb39f70e3104bf8fdc97
Author: Song Liu <songliubraving@fb.com>
Date:   Fri Nov 12 07:02:42 2021 -0800

    bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs

    syzbot reported the following BUG w/o CONFIG_DEBUG_INFO_BTF

    BUG: KASAN: global-out-of-bounds in task_iter_init+0x212/0x2e7 kernel/bpf/task_iter.c:661
    Read of size 4 at addr ffffffff90297404 by task swapper/0/1

    CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.15.0-syzkaller #0
    Hardware name: ... Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    <TASK>
    __dump_stack lib/dump_stack.c:88 [inline]
    dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
    print_address_description.constprop.0.cold+0xf/0x309 mm/kasan/report.c:256
    __kasan_report mm/kasan/report.c:442 [inline]
    kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
    task_iter_init+0x212/0x2e7 kernel/bpf/task_iter.c:661
    do_one_initcall+0x103/0x650 init/main.c:1295
    do_initcall_level init/main.c:1368 [inline]
    do_initcalls init/main.c:1384 [inline]
    do_basic_setup init/main.c:1403 [inline]
    kernel_init_freeable+0x6b1/0x73a init/main.c:1606
    kernel_init+0x1a/0x1d0 init/main.c:1497
    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
    </TASK>

    This is caused by hard-coded name[1] in BTF_ID_LIST_GLOBAL (w/o
    CONFIG_DEBUG_INFO_BTF). Fix this by adding a parameter n to
    BTF_ID_LIST_GLOBAL. This avoids ifdef CONFIG_DEBUG_INFO_BTF in btf.c and
    filter.c.

    Fixes: 7c7e3d31e785 ("bpf: Introduce helper bpf_find_vma")
    Reported-by: syzbot+e0d81ec552a21d9071aa@syzkaller.appspotmail.com
    Reported-by: Eric Dumazet <edumazet@google.com>
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20211112150243.1270987-2-songliubraving@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:36 +02:00
Artem Savkov b90de8b07e bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 8c42d2fa4eeab6c37a0b1b1aa7a2715248ef4f34
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Nov 11 17:26:09 2021 -0800

    bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes

    LLVM patches ([1] for clang, [2] and [3] for BPF backend)
    added support for btf_type_tag attributes. This patch
    added support for the kernel.

    The main motivation for btf_type_tag is to bring kernel
    annotations __user, __rcu etc. to btf. With such information
    available in btf, the bpf verifier can detect misuse
    and reject the program. For example, for a __user tagged pointer,
    developers can then use a proper helper like bpf_probe_read_user()
    etc. to read the data.

    BTF_KIND_TYPE_TAG may also be useful for other tracing
    facilities where, instead of requiring the user to specify
    the kernel/user address type, the kernel can detect it
    by itself with btf.

      [1] https://reviews.llvm.org/D111199
      [2] https://reviews.llvm.org/D113222
      [3] https://reviews.llvm.org/D113496

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:36 +02:00
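
What the annotation looks like at the source level (the macro name is arbitrary; the attribute is the one added by the LLVM patches referenced above):

    /* encoded into BTF as a BTF_KIND_TYPE_TAG "user" on the pointee type */
    #define __tag_user __attribute__((btf_type_tag("user")))

    struct example_ctx {
            int __tag_user *user_counter;   /* pointer into user memory */
    };

    int example_read(int __tag_user *uptr);
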
Artem Savkov c083c778ce bpf: Introduce helper bpf_find_vma
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 7c7e3d31e7856a8260a254f8c71db416f7f9f5a1
Author: Song Liu <songliubraving@fb.com>
Date:   Fri Nov 5 16:23:29 2021 -0700

    bpf: Introduce helper bpf_find_vma

    In some profiler use cases, it is necessary to map an address to the
    backing file, e.g., a shared library. The bpf_find_vma helper provides a
    flexible way to achieve this. bpf_find_vma maps an address of a task to
    the vma (vm_area_struct) for this address, and feeds the vma to a callback
    BPF function. The callback function is necessary here, as we need to
    ensure mmap_sem is unlocked.

    It is necessary to lock mmap_sem for find_vma. To lock and unlock mmap_sem
    safely when irqs are disabled, we use the same mechanism as stackmap with
    build_id. Specifically, when irqs are disabled, the unlock is postponed
    in an irq_work. Refactor stackmap.c so that the irq_work is shared among
    bpf_find_vma and stackmap helpers.

    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20211105232330.1936330-2-songliubraving@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:34 +02:00
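
A sketch of the call shape from a tracing program; the attach point and the looked-up address are assumptions, while the callback signature follows the helper's contract described above:

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    char LICENSE[] SEC("license") = "GPL";

    const volatile __u64 user_addr;         /* filled in by the loader */

    /* invoked with the task's mmap lock held */
    static long check_vma(struct task_struct *task, struct vm_area_struct *vma,
                          void *data)
    {
            __u64 *inode = data;

            if (vma->vm_file)
                    *inode = vma->vm_file->f_inode->i_ino;
            return 0;
    }

    SEC("fentry/do_nanosleep")              /* hypothetical attach point */
    int BPF_PROG(find_vma_example)
    {
            struct task_struct *task = bpf_get_current_task_btf();
            __u64 inode = 0;

            bpf_find_vma(task, user_addr, check_vma, &inode, 0);
            return 0;
    }
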
Yauheni Kaliuta 4430625d2e bpf: Fix a btf decl_tag bug when tagging a function
Bugzilla: http://bugzilla.redhat.com/2069045

commit d7e7b42f4f956f2c68ad8cda87d750093dbba737
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Feb 3 11:17:27 2022 -0800

    bpf: Fix a btf decl_tag bug when tagging a function
    
    syzbot reported a btf decl_tag bug with stack trace below:
    
      general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 0 PID: 3592 Comm: syz-executor914 Not tainted 5.16.0-syzkaller-11424-gb7892f7d5cb2 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:btf_type_vlen include/linux/btf.h:231 [inline]
      RIP: 0010:btf_decl_tag_resolve+0x83e/0xaa0 kernel/bpf/btf.c:3910
      ...
      Call Trace:
       <TASK>
       btf_resolve+0x251/0x1020 kernel/bpf/btf.c:4198
       btf_check_all_types kernel/bpf/btf.c:4239 [inline]
       btf_parse_type_sec kernel/bpf/btf.c:4280 [inline]
       btf_parse kernel/bpf/btf.c:4513 [inline]
       btf_new_fd+0x19fe/0x2370 kernel/bpf/btf.c:6047
       bpf_btf_load kernel/bpf/syscall.c:4039 [inline]
       __sys_bpf+0x1cbb/0x5970 kernel/bpf/syscall.c:4679
       __do_sys_bpf kernel/bpf/syscall.c:4738 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:4736 [inline]
       __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:4736
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    The kasan error is triggered with an illegal BTF like below:
       type 0: void
       type 1: int
       type 2: decl_tag to func type 3
       type 3: func to func_proto type 8
    The total number of types is 4 and type 3 is illegal
    since its func_proto type is out of range.
    
    Currently, the target type of decl_tag can be struct/union, var or func.
    Both struct/union and var implemented their own 'resolve' callback functions
    and hence handled properly in kernel.
    But the func type doesn't have a 'resolve' callback function. When
    btf_decl_tag_resolve() tries to check the func type, it tries to get the
    vlen of its func_proto type, which triggered the above kasan error.
    
    To fix the issue, btf_decl_tag_resolve() needs to do btf_func_check()
    before trying to access the func_proto type.
    In the current implementation, the func type is checked with
    btf_func_check() in the main checking function btf_check_all_types().
    To fix the above kasan issue, let us implement the 'resolve' callback
    for the func type properly. The 'resolve' callback will also be called
    in btf_check_all_types() for func types.
    
    Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
    Reported-by: syzbot+53619be9444215e785ed@syzkaller.appspotmail.com
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20220203191727.741862-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:53 +03:00
Yauheni Kaliuta bf40542602 bpf: Fix bpf_check_mod_kfunc_call for built-in modules
Bugzilla: http://bugzilla.redhat.com/2069045

commit b12f031043247b80999bf5e03b8cded3b0b40f8d
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Nov 22 20:17:41 2021 +0530

    bpf: Fix bpf_check_mod_kfunc_call for built-in modules
    
    When the module registering its set is built-in, THIS_MODULE will be NULL,
    hence we cannot return early in case owner is NULL.
    
    Fixes: 14f267d95fe4 ("bpf: btf: Introduce helpers for dynamic BTF set registration")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20211122144742.477787-3-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:51 +03:00
Yauheni Kaliuta 3dbb603806 bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
Bugzilla: http://bugzilla.redhat.com/2069045

commit d9847eb8be3d895b2b5f514fdf3885d47a0b92a2
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Nov 22 20:17:40 2021 +0530

    bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
    
    Vinicius Costa Gomes reported [0] that the build fails when
    CONFIG_DEBUG_INFO_BTF is enabled and CONFIG_BPF_SYSCALL is disabled.
    This leads to btf.c not being compiled, and then no symbol being present
    in vmlinux for the declarations in btf.h. Since BTF is not useful
    without enabling the BPF subsystem, disallow this combination.
    
    However, theoretically disabling both now could still fail, as the
    symbol for kfunc_btf_id_list variables is not available. This isn't a
    problem as the compiler usually optimizes the whole register/unregister
    call, but at lower optimization levels it can fail the build in the
    linking stage.
    
    Fix that by adding dummy variables so that modules taking address of
    them still work, but the whole thing is a noop.
    
      [0]: https://lore.kernel.org/bpf/20211110205418.332403-1-vinicius.gomes@intel.com
    
    Fixes: 14f267d95fe4 ("bpf: btf: Introduce helpers for dynamic BTF set registration")
    Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20211122144742.477787-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:51 +03:00
Yauheni Kaliuta fe7b5bf2d3 bpf: Add BTF_KIND_DECL_TAG typedef support
Bugzilla: http://bugzilla.redhat.com/2069045

commit bd16dee66ae4de3f1726c69ac901d2b7a53b0c86
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Oct 21 12:56:28 2021 -0700

    bpf: Add BTF_KIND_DECL_TAG typedef support
    
    The llvm patches ([1], [2]) added support to attach btf_decl_tag
    attributes to typedef declarations. This patch adds the
    corresponding support in the kernel.
    
      [1] https://reviews.llvm.org/D110127
      [2] https://reviews.llvm.org/D112259
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211021195628.4018847-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:45 +03:00
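
At the source level this means a declaration tag can now also sit on a typedef; a small sketch (macro name arbitrary, placement per the LLVM support referenced above):

    #define __decl_tag_a __attribute__((btf_decl_tag("tag_a")))

    /* encoded as a BTF_KIND_DECL_TAG entry pointing at the typedef */
    typedef int tagged_int __decl_tag_a;
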
Yauheni Kaliuta fdbdd94ffe bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Bugzilla: http://bugzilla.redhat.com/2069045

commit 223f903e9c832699f4e5f422281a60756c1c6cfe
Author: Yonghong Song <yhs@fb.com>
Date:   Tue Oct 12 09:48:38 2021 -0700

    bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
    
    Patch set [1] introduced BTF_KIND_TAG to allow tagging
    declarations for struct/union, struct/union field, var, func
    and func arguments and these tags will be encoded into
    dwarf. They are also encoded to btf by llvm for the bpf target.
    
    After BTF_KIND_TAG is introduced, we intended to use it
    for kernel __user attributes. But kernel __user is actually
    a type attribute. Upstream and internal discussion showed
    it is not a good idea to mix declaration attribute and
    type attribute. So we proposed to introduce btf_type_tag
    as a type attribute and existing btf_tag renamed to
    btf_decl_tag ([2]).
    
    This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
    other declarations with *_tag to *_decl_tag to make it clear
    the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
    might be introduced per [3].
    
     [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
     [2] https://reviews.llvm.org/D111588
     [3] https://reviews.llvm.org/D111199
    
    Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
    Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG")
    Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG")
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:42 +03:00
Yauheni Kaliuta da9f7998d2 bpf: selftests: Add selftests for module kfunc support
Bugzilla: http://bugzilla.redhat.com/2069045

commit c48e51c8b07aba8a18125221cb67a40cb1256bf2
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Oct 2 06:47:57 2021 +0530

    bpf: selftests: Add selftests for module kfunc support
    
    This adds selftests that test the success and failure paths for module
    kfuncs (in the presence of invalid kfunc calls) for both libbpf and
    gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can
    add a module BTF ID set from bpf_testmod.
    
    This also introduces a couple of test cases to the verifier selftests for
    validating whether we get an error or not depending on whether an invalid
    kfunc call remains after elimination of unreachable instructions.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:38 +03:00
Yauheni Kaliuta 1633f5d5a1 bpf: Enable TCP congestion control kfunc from modules
Bugzilla: http://bugzilla.redhat.com/2069045

commit 0e32dfc80bae53b05e9eda7eaf259f30ab9ba43a
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Oct 2 06:47:53 2021 +0530

    bpf: Enable TCP congestion control kfunc from modules
    
    This commit moves BTF ID lookup into the newly added registration
    helper, so that the bbr, cubic, and dctcp implementations set up
    their sets in the bpf_tcp_ca kfunc_btf_set list, while the ones not
    dependent on modules are looked up from the wrapper function.
    
    This lifts the restriction that they be compiled as built-in objects;
    they can now be loaded as modules if required. Also modify Makefile.modfinal
    to call resolve_btfids for each module.
    
    Note that since kernel kfunc_ids never overlap with module kfunc_ids, we
    only match the owner for module btf id sets.
    
    See following commits for background on use of:
    
     CONFIG_X86 ifdef:
     569c484f99 (bpf: Limit static tcp-cc functions in the .BTF_ids list to x86)
    
     CONFIG_DYNAMIC_FTRACE ifdef:
     7aae231ac9 (bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE)
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211002011757.311265-6-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:38 +03:00
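
A hedged sketch of the module-side registration described above, using the interim helpers introduced earlier in this series (this API was reworked in later kernels); the set and list names follow the series description, and the dctcp symbols are illustrative.

    #include <linux/module.h>
    #include <linux/btf.h>
    #include <linux/btf_ids.h>

    /* BTF IDs of the module's kfuncs, resolved by resolve_btfids at build time. */
    BTF_SET_START(tcp_dctcp_check_kfunc_ids)
    BTF_ID(func, dctcp_init)
    BTF_ID(func, dctcp_cwnd_event)
    BTF_SET_END(tcp_dctcp_check_kfunc_ids)

    static DEFINE_KFUNC_BTF_ID_SET(&tcp_dctcp_check_kfunc_ids,
                                   tcp_dctcp_kfunc_btf_set);

    static int __init dctcp_mod_init(void)
    {
            /* Make the module's kfuncs visible to bpf_tcp_ca programs. */
            register_kfunc_btf_id_set(&bpf_tcp_ca_kfunc_list,
                                      &tcp_dctcp_kfunc_btf_set);
            return 0;
    }

    static void __exit dctcp_mod_exit(void)
    {
            unregister_kfunc_btf_id_set(&bpf_tcp_ca_kfunc_list,
                                        &tcp_dctcp_kfunc_btf_set);
    }
    module_init(dctcp_mod_init);
    module_exit(dctcp_mod_exit);
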
Yauheni Kaliuta 8dfb448993 bpf: btf: Introduce helpers for dynamic BTF set registration
Bugzilla: http://bugzilla.redhat.com/2069045

commit 14f267d95fe4b08831a022c8e15a2eb8991edbf6
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Oct 2 06:47:51 2021 +0530

    bpf: btf: Introduce helpers for dynamic BTF set registration
    
    This adds helpers for registering btf_id_set from modules and the
    bpf_check_mod_kfunc_call callback that can be used to look them up.
    
    With in-kernel sets, the way this is supposed to work is that the in-kernel
    callback first looks up the BTF id in the in-kernel kfunc whitelist, and then
    defers to the dynamic BTF set lookup if it doesn't find it there. If there is
    no in-kernel BTF id set, this callback can be used directly.
    
    Also fix includes for btf.h and bpfptr.h so that they can be included in
    isolation. This is in preparation for their usage in tcp_bbr, tcp_cubic
    and tcp_dctcp modules in the next patch.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211002011757.311265-4-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:38 +03:00
Yauheni Kaliuta df49efeb8a bpf: Support for new btf kind BTF_KIND_TAG
Bugzilla: http://bugzilla.redhat.com/2069045

commit b5ea834dde6b6e7f75e51d5f66dac8cd7c97b5ef
Author: Yonghong Song <yhs@fb.com>
Date:   Tue Sep 14 15:30:15 2021 -0700

    bpf: Support for new btf kind BTF_KIND_TAG
    
    LLVM14 added support for a new C attribute ([1])
      __attribute__((btf_tag("arbitrary_str")))
    This attribute will be emitted to dwarf ([2]) and pahole
    will convert it to BTF. Or for bpf target, this
    attribute will be emitted to BTF directly ([3], [4]).
    The attribute is intended to provide additional
    information for
      - struct/union type or struct/union member
      - static/global variables
      - static/global function or function parameter.
    
    For the Linux kernel, the btf_tag can be applied
    in various places to specify user pointers,
    function pre- or post-conditions, function
    allow/deny in certain contexts, etc. Such information
    will be encoded in vmlinux BTF and can be used
    by verifier.
    
    The btf_tag can also be applied to bpf programs
    to help with verification of global functions, e.g.,
    by specifying preconditions, etc.
    
    This patch adds basic parsing and checking support
    in the kernel for the new BTF_KIND_TAG kind.
    
     [1] https://reviews.llvm.org/D106614
     [2] https://reviews.llvm.org/D106621
     [3] https://reviews.llvm.org/D106622
     [4] https://reviews.llvm.org/D109560
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:16:12 +03:00
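
A brief illustration of where the attribute can sit, assuming an LLVM 14 toolchain; the tag strings are arbitrary examples, not kernel conventions.

    struct task_info {
            /* tag on a struct member */
            void *user_buf __attribute__((btf_tag("user")));
            int   pid;
    };

    /* tag on a global variable and on a function declaration */
    int dump_enabled __attribute__((btf_tag("config")));
    int process_task(struct task_info *ti) __attribute__((btf_tag("may_sleep")));
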
Jerome Marchand 70ca1bb90e bpf: Fix bpf-next builds without CONFIG_BPF_EVENTS
Bugzilla: http://bugzilla.redhat.com/2041365

commit eb529c5b10b9401a0f2d1f469e82c6a0ba98082c
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Wed Aug 25 18:48:31 2021 -0700

    bpf: Fix bpf-next builds without CONFIG_BPF_EVENTS

    This commit fixes linker errors along the lines of:

        s390-linux-ld: task_iter.c:(.init.text+0xa4): undefined reference to `btf_task_struct_ids'`

    Fix by defining btf_task_struct_ids unconditionally in kernel/bpf/btf.c
    since there exists code that unconditionally uses btf_task_struct_ids.

    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/05d94748d9f4b3eecedc4fddd6875418a396e23c.1629942444.git.dxu@dxuuu.xyz

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-04-29 18:17:12 +02:00
Jerome Marchand 5f52b0fe98 bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
Bugzilla: http://bugzilla.redhat.com/2041365

commit d36216429ff3e69db4f6ea5e0c86b80010f5f30b
Author: Yonghong Song <yhs@fb.com>
Date:   Wed Jul 28 11:30:25 2021 -0700

    bpf: Emit better log message if bpf_iter ctx arg btf_id == 0

    To avoid kernel build failures due to some missing .BTF-ids-referenced
    functions/types, the patch ([1]) tries to fill in btf_id 0 for
    these types.

    In the bpf verifier, for the percpu variable and helper-returning-btf_id
    cases, the verifier already emits a proper warning with something like
      verbose(env, "Helper has invalid btf_id in R%d\n", regno);
      verbose(env, "invalid return type %d of func %s#%d\n",
              fn->ret_type, func_id_name(func_id), func_id);

    But this is not the case for bpf_iter context arguments.
    I hacked resolve_btfids to encode btf_id 0 for struct task_struct.
    With `./test_progs -n 7/5`, I got,
      0: (79) r2 = *(u64 *)(r1 +0)
      func 'bpf_iter_task' arg0 has btf_id 29739 type STRUCT 'bpf_iter_meta'
      ; struct seq_file *seq = ctx->meta->seq;
      1: (79) r6 = *(u64 *)(r2 +0)
      ; struct task_struct *task = ctx->task;
      2: (79) r7 = *(u64 *)(r1 +8)
      ; if (task == (void *)0) {
      3: (55) if r7 != 0x0 goto pc+11
      ...
      ; BPF_SEQ_PRINTF(seq, "%8d %8d\n", task->tgid, task->pid);
      26: (61) r1 = *(u32 *)(r7 +1372)
      Type '(anon)' is not a struct

    Basically, the verifier will return btf_id 0 for task_struct.
    Later on, when the code tries to access task->tgid, the
    verifier correctly complains that the type is '(anon)' and is
    not a struct. Users still need to backtrace to find out
    what is going on.

    Let us catch the invalid btf_id 0 earlier
    and provide a better message indicating that the btf_id is wrong.
    The new error message looks like below:
      R1 type=ctx expected=fp
      ; struct seq_file *seq = ctx->meta->seq;
      0: (79) r2 = *(u64 *)(r1 +0)
      func 'bpf_iter_task' arg0 has btf_id 29739 type STRUCT 'bpf_iter_meta'
      ; struct seq_file *seq = ctx->meta->seq;
      1: (79) r6 = *(u64 *)(r2 +0)
      ; struct task_struct *task = ctx->task;
      2: (79) r7 = *(u64 *)(r1 +8)
      invalid btf_id for context argument offset 8
      invalid bpf_context access off=8 size=8

    [1] https://lore.kernel.org/bpf/20210727132532.2473636-1-hengqi.chen@gmail.com/

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20210728183025.1461750-1-yhs@fb.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-04-29 18:14:36 +02:00
Jerome Marchand 103c5a16ea bpf: Add map side support for bpf timers.
Bugzilla: http://bugzilla.redhat.com/2041365

commit 68134668c17f31f51930478f75495b552a411550
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Jul 14 17:54:10 2021 -0700

    bpf: Add map side support for bpf timers.

    Restrict bpf timers to array, hash (both preallocated and kmalloced), and
    lru map types. The per-cpu maps with timers don't make sense, since 'struct
    bpf_timer' is a part of the map value. bpf timers in per-cpu maps would mean that
    the number of timers depends on the number of possible cpus and timers would not
    be accessible from all cpus. lpm map support can be added in the future.
    The timers in inner maps are supported.

    The bpf_map_update/delete_elem() helpers and sys_bpf commands cancel and free
    bpf_timer in a given map element.

    Similar to 'struct bpf_spin_lock' BTF is required and it is used to validate
    that map element indeed contains 'struct bpf_timer'.

    Make check_and_init_map_value() init both bpf_spin_lock and bpf_timer when
    map element data is reused in preallocated htab and lru maps.

    Teach copy_map_value() to support both bpf_spin_lock and bpf_timer in a single
    map element. There could be one of each, but not more than one. Due to the 'one
    bpf_timer in one element' restriction, do not support timers in global data,
    since global data is a map with a single element, but from the bpf program side
    it is seen as many global variables, and a restriction of a single global timer
    would be odd. The sys_bpf map_freeze and sys_mmap syscalls are not allowed on
    maps with timers, since user space could have corrupted the mmapped element and
    crashed the kernel. Maps with timers cannot be readonly. Due to these
    restrictions, search for bpf_timer in the datasec BTF in case it was placed in
    global data, to report a clear error.

    The previous patch allowed 'struct bpf_timer' as a first field in a map
    element only. Relax this restriction.

    Refactor lru map to s/bpf_lru_push_free/htab_lru_push_free/ to cancel and free
    the timer when the lru map deletes an element as a part of its eviction algorithm.

    Make sure that a bpf program cannot access 'struct bpf_timer' via direct load/store.
    The timer operations are done through helpers only.
    This is similar to 'struct bpf_spin_lock'.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/bpf/20210715005417.78572-5-alexei.starovoitov@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-04-29 18:14:31 +02:00
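
For context, a minimal BPF-side sketch of the bpf_timer usage this enables, following the helper signatures from this series; the map, section, and callback names are illustrative.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct map_elem {
            int counter;
            struct bpf_timer timer;   /* found and validated via map BTF */
    };

    struct {
            __uint(type, BPF_MAP_TYPE_HASH);
            __uint(max_entries, 64);
            __type(key, int);
            __type(value, struct map_elem);
    } timer_map SEC(".maps");

    /* Callback receives the map, the key and a pointer to the map value. */
    static int timer_cb(void *map, int *key, struct map_elem *val)
    {
            val->counter++;
            return 0;
    }

    SEC("fentry/bpf_fentry_test1")
    int start_timer(void *ctx)
    {
            int key = 0;
            struct map_elem *val = bpf_map_lookup_elem(&timer_map, &key);

            if (!val)
                    return 0;
            bpf_timer_init(&val->timer, &timer_map, 1 /* CLOCK_MONOTONIC */);
            bpf_timer_set_callback(&val->timer, timer_cb);
            bpf_timer_start(&val->timer, 1000000 /* 1 ms */, 0);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
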
David S. Miller a52171ae7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-06-17

The following pull-request contains BPF updates for your *net-next* tree.

We've added 50 non-merge commits during the last 25 day(s) which contain
a total of 148 files changed, 4779 insertions(+), 1248 deletions(-).

The main changes are:

1) BPF infrastructure to migrate TCP child sockets from a listener to another
   in the same reuseport group/map, from Kuniyuki Iwashima.

2) Add a provably sound, faster and more precise algorithm for tnum_mul() as
   noted in https://arxiv.org/abs/2105.05398, from Harishankar Vishwanathan.

3) Streamline error reporting changes in libbpf as planned out in the
   'libbpf: the road to v1.0' effort, from Andrii Nakryiko.

4) Add broadcast support to xdp_redirect_map(), from Hangbin Liu.

5) Extends bpf_map_lookup_and_delete_elem() functionality to 4 more map
   types, that is, {LRU_,PERCPU_,LRU_PERCPU_,}HASH, from Denis Salopek.

6) Support new LLVM relocations in libbpf to make them more linker friendly,
   also add a doc to describe the BPF backend relocations, from Yonghong Song.

7) Silence long standing KUBSAN complaints on register-based shifts in
   interpreter, from Daniel Borkmann and Eric Biggers.

8) Add dummy PT_REGS macros in libbpf to fail BPF program compilation when
   target arch cannot be determined, from Lorenz Bauer.

9) Extend AF_XDP to support large umems with 1M+ pages, from Magnus Karlsson.

10) Fix two minor libbpf tc BPF API issues, from Kumar Kartikeya Dwivedi.

11) Move libbpf BPF_SEQ_PRINTF/BPF_SNPRINTF macros that can be used by BPF
    programs to bpf_helpers.h header, from Florent Revest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 11:54:56 -07:00
Jakub Kicinski 5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Zhen Lei 8fb33b6055 bpf: Fix spelling mistakes
Fix some spelling mistakes in comments:
aother ==> another
Netiher ==> Neither
desribe ==> describe
intializing ==> initializing
funciton ==> function
wont ==> won't and move the word 'the' at the end to the next line
accross ==> across
pathes ==> paths
triggerred ==> triggered
excute ==> execute
ether ==> either
conervative ==> conservative
convetion ==> convention
markes ==> marks
interpeter ==> interpreter

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210525025659.8898-2-thunder.leizhen@huawei.com
2021-05-24 21:13:05 -07:00
Alexei Starovoitov 3d78417b60 bpf: Add bpf_btf_find_by_name_kind() helper.
Add new helper:
long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags)
Description
	Find BTF type with given name and kind in vmlinux BTF or in module's BTFs.
Return
	Returns btf_id and btf_obj_fd in lower and upper 32 bits.

It will be used by loader program to find btf_id to attach the program to
and to find btf_ids of ksyms.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-10-alexei.starovoitov@gmail.com
2021-05-19 00:33:40 +02:00
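
A hedged sketch of how a BPF_PROG_TYPE_SYSCALL loader program might use the new helper, splitting the 64-bit return value as described; the looked-up function name is just an example.

    #include <linux/bpf.h>
    #include <linux/btf.h>
    #include <bpf/bpf_helpers.h>

    SEC("syscall")
    int find_attach_target(void *ctx)
    {
            char name[] = "bpf_fentry_test1";   /* example target function */
            long ret;

            ret = bpf_btf_find_by_name_kind(name, sizeof(name), BTF_KIND_FUNC, 0);
            if (ret < 0)
                    return ret;

            __u32 btf_id = (__u32)ret;               /* lower 32 bits */
            __u32 btf_obj_fd = (__u32)(ret >> 32);   /* upper 32 bits, 0 == vmlinux */

            /* ... use btf_id/btf_obj_fd to fill bpf_attr for the attach step ... */
            bpf_printk("btf_id %u obj_fd %u", btf_id, btf_obj_fd);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
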
Alexei Starovoitov c571bd752e bpf: Make btf_load command to be bpfptr_t compatible.
Similar to prog_load, make the btf_load command available to
bpf_prog_type_syscall programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-7-alexei.starovoitov@gmail.com
2021-05-19 00:33:40 +02:00
Jiri Olsa 31379397dc bpf: Forbid trampoline attach for functions with variable arguments
We currently can't allow attaching to functions with variable arguments.
The problem is that we would have to save all the argument registers,
which is probably doable, but if the caller uses more than 6 arguments,
we need stack data, which will be wrong because of the extra stack
frame we create in the bpf trampoline, so we could crash.

Also, malformed trampoline code is currently generated for such
functions, as described in:

  https://lore.kernel.org/bpf/20210429212834.82621-1-jolsa@kernel.org/

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210505132529.401047-1-jolsa@kernel.org
2021-05-07 01:28:28 +02:00
Colin Ian King 235fc0e36d bpf: Remove redundant assignment of variable id
The variable id is being assigned a value that is never read; the
assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210326194348.623782-1-colin.king@canonical.com
2021-03-30 22:58:53 +02:00
Martin KaFai Lau e6ac2450d6 bpf: Support bpf program calling kernel function
This patch adds support to BPF verifier to allow bpf program calling
kernel function directly.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation.  For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be whitelisted
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The whitelisted functions are not bound to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs which have to be adjusted accordingly.

This patch is to make the required changes in the bpf verifier.

First change is in btf.c, it adds a case in "btf_check_func_arg_match()".
When the passed in "btf->kernel_btf == true", it means matching the
verifier regs' states with a kernel function.  This will handle the
PTR_TO_BTF_ID reg.  It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET,
and PTR_TO_TCP_SOCK to its kernel's btf_id.

In the later libbpf patch, the insn calling a kernel function will
look like:

insn->code == (BPF_JMP | BPF_CALL)
insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
insn->imm == func_btf_id /* btf_id of the running kernel */

[ For the future calling function-in-kernel-module support, an array
  of module btf_fds can be passed at the load time and insn->off
  can be used to index into this array. ]

At the early stage of verifier, the verifier will collect all kernel
function calls into "struct bpf_kfunc_desc".  Those
descriptors are stored in "prog->aux->kfunc_tab" and will
be available to the JIT.  Since this "add" operation is similar
to the current "add_subprog()" and looking for the same insn->code,
they are done together in the new "add_subprog_and_kfunc()".

In the "do_check()" stage, the new "check_kfunc_call()" is added
to verify the kernel function call instruction:
1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
   A new bpf_verifier_ops "check_kfunc_call" is added to do that.
   The bpf-tcp-cc struct_ops program will implement this function in
   a later patch.
2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
   used as the args of a kernel function.
3. Mark the regs' type, subreg_def, and zext_dst.

At the later do_misc_fixups() stage, the new fixup_kfunc_call()
will replace the insn->imm with the function address (relative
to __bpf_call_base).  If needed, the jit can find the btf_func_model
by calling the new bpf_jit_find_kfunc_model(prog, insn).
With the imm set to the function address, "bpftool prog dump xlated"
will be able to display the kernel function calls the same way as
it displays other bpf helper calls.

A gpl_compatible program is required to call kernel functions.

This feature currently requires JIT.

The verifier selftests are adjusted because of the changes in
the verbose log in add_subprog_and_kfunc().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
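
From the BPF C side (with the libbpf support added later in this series), a kfunc call looks like an ordinary call through an extern __ksym declaration; a hedged sketch for the bpf-tcp-cc use case named above, with the prototype mirrored loosely from the kernel and the struct_ops map definition omitted.

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    /* Kernel function whitelisted for struct_ops bpf-tcp-cc programs.
     * The __ksym extern makes libbpf emit a BPF_PSEUDO_KFUNC_CALL insn
     * whose imm is resolved to the kernel function's BTF id.
     */
    extern void tcp_cong_avoid_ai(struct tcp_sock *tp, __u32 w, __u32 acked) __ksym;

    SEC("struct_ops/my_cong_avoid")
    void BPF_PROG(my_cong_avoid, struct sock *sk, __u32 ack, __u32 acked)
    {
            struct tcp_sock *tp = (struct tcp_sock *)sk;

            tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
    }

    char LICENSE[] SEC("license") = "GPL";
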
Martin KaFai Lau 34747c4120 bpf: Refactor btf_check_func_arg_match
This patch moved the subprog specific logic from
btf_check_func_arg_match() to the new btf_check_subprog_arg_match().
The core logic is left in btf_check_func_arg_match() which
will be reused later to check the kernel function call.

The "if (!btf_type_is_ptr(t))" is checked first to improve the
indentation which will be useful for a later patch.

Some of the "btf_kind_str[]" usages is replaced with the shortcut
"btf_type_str(t)".

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015136.1544504-1-kafai@fb.com
2021-03-26 20:41:50 -07:00
David S. Miller c1acda9807 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-03-09

The following pull-request contains BPF updates for your *net-next* tree.

We've added 90 non-merge commits during the last 17 day(s) which contain
a total of 114 files changed, 5158 insertions(+), 1288 deletions(-).

The main changes are:

1) Faster bpf_redirect_map(), from Björn.

2) skmsg cleanup, from Cong.

3) Support for floating point types in BTF, from Ilya.

4) Documentation for sys_bpf commands, from Joe.

5) Support for sk_lookup in bpf_prog_test_run, form Lorenz.

6) Enable task local storage for tracing programs, from Song.

7) bpf_for_each_map_elem() helper, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 18:07:05 -08:00
Ilya Leoshkevich b1828f0b04 bpf: Add BTF_KIND_FLOAT support
On the kernel side, introduce a new btf_kind_operations. It is
similar to that of BTF_KIND_INT, however, it does not need to
handle encodings and bit offsets. Do not implement printing, since
the kernel does not know how to format floating-point values.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210226202256.116518-7-iii@linux.ibm.com
2021-03-04 17:58:16 -08:00
Dmitrii Banshchikov 523a4cf491 bpf: Use MAX_BPF_FUNC_REG_ARGS macro
Instead of using integer literal here and there use macro name for
better context.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210225202629.585485-1-me@ubique.spb.ru
2021-02-26 11:59:53 -08:00
Dmitrii Banshchikov f4eda8b6e4 bpf: Drop imprecise log message
Now it is possible for a global function to have a pointer argument that
points to something other than a struct. Drop the irrelevant log
message and keep the logic same.

Fixes: e5069b9c23 ("bpf: Support pointers in global func args")
Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210223090416.333943-1-me@ubique.spb.ru
2021-02-24 16:43:39 +01:00
Dmitrii Banshchikov e5069b9c23 bpf: Support pointers in global func args
Add an ability to pass a pointer to a type with known size in arguments
of a global function. Such pointers may be used to overcome the limit on
the maximum number of arguments, avoid expensive and tricky workarounds
and to have multiple output arguments.

A referenced type may contain pointers but indirect access through them
isn't supported.

The implementation consists of two parts.  If a global function has an
argument that is a pointer to a type with known size then:

  1) In btf_check_func_arg_match(): check that the corresponding
register points to NULL or to a valid memory region that is large enough
to contain the expected argument's type.

  2) In btf_prepare_func_args(): set the corresponding register type to
PTR_TO_MEM_OR_NULL and its size to the size of the expected type.

Only global functions are supported because allowing pointers for
static functions might break validation. Consider the following
scenario. A static function has a pointer argument. A caller passes a
pointer to its stack memory. Because the callee can change the referenced
memory, the verifier can no longer assume any particular slot type for the
caller's stack memory, hence the slot type is changed to SLOT_MISC.  If
there is an operation that relies on a slot type other than SLOT_MISC, then
the verifier won't be able to infer the safety of the operation.

When the verifier sees a static function that has a pointer argument
other than PTR_TO_CTX, it skips the argument check and continues
with "inline" validation with more information available. The operation
that relies on the particular slot type now succeeds.

Because global functions were not allowed to have pointer arguments
other than PTR_TO_CTX, it's not possible to break existing and valid
code.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212205642.620788-4-me@ubique.spb.ru
2021-02-12 17:37:23 -08:00
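
A small sketch of what this enables on the BPF C side; the struct, function, and section names are invented for illustration. Inside the global function the argument is PTR_TO_MEM_OR_NULL, so the NULL check is mandatory.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct stats {
            __u64 packets;
            __u64 bytes;
    };

    /* Global (non-static) function: callers must pass NULL or a readable
     * region of at least sizeof(struct stats) bytes.
     */
    __noinline int account(struct stats *s, __u32 len)
    {
            if (!s)
                    return 0;
            s->packets += 1;
            s->bytes += len;
            return 1;
    }

    SEC("classifier")
    int classify(struct __sk_buff *skb)
    {
            struct stats st = {};

            return account(&st, skb->len) ? 0 : 1;
    }

    char LICENSE[] SEC("license") = "GPL";
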
Dmitrii Banshchikov feb4adfad5 bpf: Rename bpf_reg_state variables
Using "reg" for an array of bpf_reg_state and "reg[i + 1]" for an
individual bpf_reg_state is error-prone and verbose. Use "regs" for the
former and "reg" for the latter as other code nearby does.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212205642.620788-2-me@ubique.spb.ru
2021-02-12 17:37:23 -08:00
Yonghong Song 13ca51d5eb bpf: Permit size-0 datasec
llvm patch https://reviews.llvm.org/D84002 permitted
emitting an empty rodata datasec if the elf .rodata section
contains read-only data from local variables. These
local variables will not be emitted as BTF_KIND_VARs
since llvm converts these local variables into
static variables with private linkage and no debuginfo
types. Such an empty rodata datasec will make
skeleton code generation easy since for skeleton
a rodata struct will be generated if there is a
.rodata elf section. The existence of a rodata
btf datasec is also consistent with the existence
of a rodata map created by libbpf.

The btf with such an empty rodata datasec will fail
in the kernel, though, as the kernel rejects a datasec
with zero vlen and zero size. For example, for the below code,
    int sys_enter(void *ctx)
    {
       int fmt[6] = {1, 2, 3, 4, 5, 6};
       int dst[6];

       bpf_probe_read(dst, sizeof(dst), fmt);
       return 0;
    }
We got the below btf (bpftool btf dump ./test.o):
    [1] PTR '(anon)' type_id=0
    [2] FUNC_PROTO '(anon)' ret_type_id=3 vlen=1
            'ctx' type_id=1
    [3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [4] FUNC 'sys_enter' type_id=2 linkage=global
    [5] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
    [6] ARRAY '(anon)' type_id=5 index_type_id=7 nr_elems=4
    [7] INT '__ARRAY_SIZE_TYPE__' size=4 bits_offset=0 nr_bits=32 encoding=(none)
    [8] VAR '_license' type_id=6, linkage=global-alloc
    [9] DATASEC '.rodata' size=0 vlen=0
    [10] DATASEC 'license' size=0 vlen=1
            type_id=8 offset=0 size=4
When loading the ./test.o to the kernel with bpftool,
we see the following error:
    libbpf: Error loading BTF: Invalid argument(22)
    libbpf: magic: 0xeb9f
    ...
    [6] ARRAY (anon) type_id=5 index_type_id=7 nr_elems=4
    [7] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
    [8] VAR _license type_id=6 linkage=1
    [9] DATASEC .rodata size=24 vlen=0 vlen == 0
    libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.

Basically, libbpf changed .rodata datasec size to 24 since elf .rodata
section size is 24. The kernel then rejected the BTF since vlen = 0.
Note that the above kernel verifier failure can be worked around by
changing the local variable "fmt" to a static or global, optionally const, variable.

This patch permits a datasec with vlen = 0 in the kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210119153519.3901963-1-yhs@fb.com
2021-01-20 14:14:09 -08:00
Jakub Kicinski 0fe2f273ab Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/can/dev.c
  commit 03f16c5075 ("can: dev: can_restart: fix use after free bug")
  commit 3e77f70e73 ("can: dev: move driver related infrastructure into separate subdir")

  Code move.

drivers/net/dsa/b53/b53_common.c
 commit 8e4052c32d ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
 commit b7a9e0da2d ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")

 Field rename.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 12:16:11 -08:00
Andrii Nakryiko 541c3bad8d bpf: Support BPF ksym variables in kernel modules
Add support for directly accessing kernel module variables from BPF programs
using special ldimm64 instructions. This functionality builds upon vmlinux
ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
specifying kernel module BTF's FD in insn[1].imm field.

During BPF program load time, verifier will resolve FD to BTF object and will
take reference on BTF object itself and, for module BTFs, corresponding module
as well, to make sure it won't be unloaded from under running BPF program. The
mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

One interesting change is also in how a per-CPU variable is determined. The
logic is to find the .data..percpu data section in the provided BTF, but vmlinux
and each module have their own .data..percpu entries in BTF. So for the module's
case, the search for the DATASEC record needs to look at only the module's added
BTF types. This is implemented with a custom search function.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20210112075520.4103414-6-andrii@kernel.org
2021-01-12 17:24:30 -08:00
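
On the BPF program side, a module-provided ksym is declared the same way as a vmlinux one; a hedged sketch, where the per-CPU variable name is hypothetical and assumed to be exported with BTF by some loaded module.

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    /* Hypothetical per-CPU variable defined in a kernel module. libbpf
     * resolves it via the module's BTF FD placed in the ldimm64 insn pair,
     * as described above.
     */
    extern const int my_module_percpu_cnt __ksym;

    SEC("raw_tp/sys_enter")
    int read_module_ksym(void *ctx)
    {
            const int *p = bpf_this_cpu_ptr(&my_module_percpu_cnt);

            bpf_printk("module counter on this cpu: %d", *p);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
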
Andrii Nakryiko bcc5e6162d bpf: Allow empty module BTFs
Some modules don't declare any new types and end up with an empty BTF,
containing only a valid BTF header and no types or strings sections. This
currently causes a BTF validation error. There is nothing wrong with such BTF,
so fix the issue by allowing module BTFs with no types or strings.

Fixes: 36e68442d1 ("bpf: Load and verify kernel module BTFs")
Reported-by: Christopher William Snowhill <chris@kode54.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210110070341.1380086-1-andrii@kernel.org
2021-01-12 21:11:30 +01:00
Andrii Nakryiko 290248a5b7 bpf: Allow to specify kernel module BTFs when attaching BPF programs
Add the ability for user-space programs to specify non-vmlinux BTF when attaching
BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this,
attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD
of a module or vmlinux BTF object. For backwards compatibility reasons,
0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org
2020-12-03 17:38:21 -08:00
Andrii Nakryiko 22dc4a0f5e bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
Remove a permeating assumption throughout the BPF verifier of vmlinux BTF. Instead,
wherever BTF type IDs are involved, also track the instance of struct btf that
goes along with the type ID. This allows gradually adding support for kernel
module BTFs and using/tracking module types across BPF helper calls and
registers.

This patch also renames btf_id() function to btf_obj_id() to minimize naming
clash with using btf_id to denote BTF *type* ID, rather than BTF *object*'s ID.

Also, although btf_vmlinux can't get destructed and thus doesn't need
refcounting, module BTFs need that, so apply BTF refcounting universally when
a BPF program is using BTF-powered attachment (tp_btf, fentry/fexit, etc). This
makes for simpler clean up code.

Now that BTF type ID is not enough to uniquely identify a BTF type, extend BPF
trampoline key to include BTF object ID. To differentiate that from target
program BPF ID, set 31st bit of type ID. BTF type IDs (at least currently) are
not allowed to take full 32 bits, so there is no danger of confusing that bit
with a valid BTF type ID.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-10-andrii@kernel.org
2020-12-03 17:38:21 -08:00
Andrii Nakryiko 7112d12798 bpf: Compile out btf_parse_module() if module BTF is not enabled
Make sure btf_parse_module() is compiled out if module BTFs are not enabled.

Fixes: 36e68442d1 ("bpf: Load and verify kernel module BTFs")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201111040645.903494-1-andrii@kernel.org
2020-11-10 20:15:07 -08:00
Andrii Nakryiko 36e68442d1 bpf: Load and verify kernel module BTFs
Add kernel module listener that will load/validate and unload module BTF.
Module BTFs get an ID generated for them, which makes it possible to iterate
over them with the existing BTF iteration API. They are given their respective
module's names, which will get reported through the GET_OBJ_INFO API. They are also marked
as in-kernel BTFs for tooling to distinguish them from user-provided BTFs.

Also, similarly to vmlinux BTF, kernel module BTFs are exposed through
sysfs as /sys/kernel/btf/<module-name>. This is convenient for user-space
tools to inspect module BTF contents and dump their types with existing tools:

[vmuser@archvm bpf]$ ls -la /sys/kernel/btf
total 0
drwxr-xr-x  2 root root       0 Nov  4 19:46 .
drwxr-xr-x 13 root root       0 Nov  4 19:46 ..

...

-r--r--r--  1 root root     888 Nov  4 19:46 irqbypass
-r--r--r--  1 root root  100225 Nov  4 19:46 kvm
-r--r--r--  1 root root   35401 Nov  4 19:46 kvm_intel
-r--r--r--  1 root root     120 Nov  4 19:46 pcspkr
-r--r--r--  1 root root     399 Nov  4 19:46 serio_raw
-r--r--r--  1 root root 4094095 Nov  4 19:46 vmlinux

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org
2020-11-10 15:25:53 -08:00
Andrii Nakryiko 5329722057 bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO
Allocate an ID for vmlinux BTF. This makes it visible when iterating over all BTF
objects in the system. To allow distinguishing vmlinux BTF (and later kernel
module BTF) from user-provided BTFs, expose an extra kernel_btf flag, as well as
BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module
BTF).  We might want to later allow specifying BTF name for user-provided BTFs
as well, if that makes sense. But currently this is reserved only for
in-kernel BTFs.

Having IDs exposed for in-kernel BTFs will allow extending BPF APIs that require
an in-kernel BTF type with the ability to specify BTF types from kernel modules, not
just vmlinux BTF. This will be implemented in a follow-up patch set for
fentry/fexit/fmod_ret/lsm/etc.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org
2020-11-10 15:25:53 -08:00
Andrii Nakryiko 951bb64621 bpf: Add in-kernel split BTF support
Adjust in-kernel BTF implementation to support a split BTF mode of operation.
Changes are mostly mirroring libbpf split BTF changes, with the exception of
start_id being 0 for in-kernel implementation due to simpler read-only mode.

Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
where necessary, is encapsulated in few helper functions. Type numbering and
string offset in a split BTF are logically continuing where base BTF ends, so
most of the high-level logic is kept without changes.

Type verification and size resolution is only doing an added resolution of new
split BTF types and relies on already cached size and type resolution results
in the base BTF.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-2-andrii@kernel.org
2020-11-10 15:25:53 -08:00
Wang Qing 666475ccbf bpf, btf: Remove the duplicate btf_ids.h include
Remove duplicate btf_ids.h header which is included twice.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1604736650-11197-1-git-send-email-wangqing@vivo.com
2020-11-10 00:05:18 +01:00
Hao Luo eaa6bcb71e bpf: Introduce bpf_per_cpu_ptr()
Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
except that it may return NULL. This happens when the cpu parameter is
out of range. So the caller must check the returned value.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-5-haoluo@google.com
2020-10-02 15:00:49 -07:00
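
A minimal usage sketch, close in spirit to the selftests of that series: read a per-CPU vmlinux variable for a given CPU and handle the NULL case that an out-of-range cpu produces.

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>

    /* Typed per-CPU ksym backed by vmlinux BTF. */
    extern const struct rq runqueues __ksym;

    SEC("raw_tp/sys_enter")
    int dump_cpu0_rq(void *ctx)
    {
            const struct rq *rq = bpf_per_cpu_ptr(&runqueues, 0);

            if (!rq)        /* cpu argument may be out of range */
                    return 0;

            bpf_printk("cpu0 nr_running=%u", rq->nr_running);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
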
Hao Luo 4976b718c3 bpf: Introduce pseudo_btf_id
Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a
ksym so that further dereferences on the ksym can use the BTF info
to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
the verifier reads the btf_id stored in the insn[0]'s imm field and
marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND,
which is encoded in btf_vmlinux by pahole. If the VAR is not of a struct
type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
and the mem_size is resolved to the size of the VAR's type.

From the VAR btf_id, the verifier can also read the address of the
ksym's corresponding kernel var from kallsyms and use that to fill
dst_reg.

Therefore, the proper functionality of pseudo_btf_id depends on (1)
kallsyms and (2) the encoding of kernel global VARs in pahole, which
should be available since pahole v1.18.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-2-haoluo@google.com
2020-10-02 14:59:25 -07:00
Toke Høiland-Jørgensen 43bc2874e7 bpf: Fix context type resolving for extension programs
Eelco reported we can't properly access arguments if the tracing
program is attached to an extension program.

Having following program:

  SEC("classifier/test_pkt_md_access")
  int test_pkt_md_access(struct __sk_buff *skb)

with its extension:

  SEC("freplace/test_pkt_md_access")
  int test_pkt_md_access_new(struct __sk_buff *skb)

and tracing that extension with:

  SEC("fentry/test_pkt_md_access_new")
  int BPF_PROG(fentry, struct sk_buff *skb)

It's not possible to access the skb argument in the fentry program;
the verifier fails with the following error:

  ; int BPF_PROG(fentry, struct sk_buff *skb)
  0: (79) r1 = *(u64 *)(r1 +0)
  invalid bpf_context access off=0 size=8

The problem is that btf_ctx_access gets the context type for the
traced program, which is in this case the extension.

But when we trace extension program, we want to get the context
type of the program that the extension is attached to, so we can
access the argument properly in the trace program.

This version of the patch is tweaked slightly from Jiri's original one,
since the refactoring in the previous patches means we have to get the
target prog type from the new variable in prog->aux instead of directly
from the target prog.

Reported-by: Eelco Chaudron <echaudro@redhat.com>
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/160138355278.48470.17057040257274725638.stgit@toke.dk
2020-09-29 13:09:24 -07:00
Toke Høiland-Jørgensen 3aac1ead5e bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach
In preparation for allowing multiple attachments of freplace programs, move
the references to the target program and trampoline into the
bpf_tracing_link structure when that is created. To do this atomically,
introduce a new mutex in prog->aux to protect writing to the two pointers
to target prog and trampoline, and rename the members to make it clear that
they are related.

With this change, it is no longer possible to attach the same tracing
program multiple times (detaching in-between), since the reference from the
tracing program to the target disappears on the first attach. However,
since the next patch will let the caller supply an attach target, that will
also make it possible to attach to the same place multiple times.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/160138355059.48470.2503076992210324984.stgit@toke.dk
2020-09-29 13:09:23 -07:00
Alan Maguire eb411377ae bpf: Add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data
structures using vmlinux BTF.  Its signature is

long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr,
                        u32 btf_ptr_size, u64 flags);

Flags and struct btf_ptr definitions/use are identical to the
bpf_snprintf_btf helper, and the helper returns 0 on success
or a negative error value.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
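
A hedged sketch of the helper in a task iterator, assuming the struct btf_ptr layout (ptr, type_id, flags) from the companion bpf_snprintf_btf patch and CO-RE's bpf_core_type_id_kernel() to obtain the type id.

    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    SEC("iter/task")
    int dump_task_btf(struct bpf_iter__task *ctx)
    {
            struct seq_file *seq = ctx->meta->seq;
            struct task_struct *task = ctx->task;
            struct btf_ptr ptr = {};

            if (!task)
                    return 0;

            ptr.ptr = task;
            ptr.type_id = bpf_core_type_id_kernel(struct task_struct);

            /* Pretty-print the whole task_struct via vmlinux BTF; a flag such
             * as BTF_F_COMPACT could be passed instead of 0.
             */
            bpf_seq_printf_btf(seq, &ptr, sizeof(ptr), 0);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";
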
Alan Maguire 31d0bc8163 bpf: Move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support
a generic show callback of which we support two instances; the
current seq file show, and a show with snprintf() behaviour which
instead writes the type data to a supplied string.

Both classes of show function call btf_type_show() with different
targets; the seq file or the string to be written.  In the string
case we need to track additional data - length left in string to write
and length to return that we would have written (a la snprintf).

By default show will display type information, field members and
their types and values etc, and the information is indented
based upon structure depth. Zeroed fields are omitted.

Show however supports flags which modify its behaviour:

BTF_SHOW_COMPACT - suppress newline/indent.
BTF_SHOW_NONAME - suppress show of type and member names.
BTF_SHOW_PTR_RAW - do not obfuscate pointer values.
BTF_SHOW_UNSAFE - do not copy data to safe buffer before display.
BTF_SHOW_ZERO - show zeroed values (by default they are not shown).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-3-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Toke Høiland-Jørgensen efc68158c4 bpf: change logging calls from verbose() to bpf_log() and use log pointer
In preparation for moving code around, change a bunch of references to
env->log (and the verbose() logging helper) to use bpf_log() and a direct
pointer to struct bpf_verifier_log. While we're touching the function
signature, mark the 'prog' argument to bpf_check_type_match() as const.

Also enhance the bpf_verifier_log_needed() check to handle NULL pointers
for the log struct so we can re-use the code with logging disabled.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 17:09:59 -07:00
Lorenz Bauer 9436ef6e86 bpf: Allow specifying a BTF ID per argument in function protos
Function prototypes using ARG_PTR_TO_BTF_ID currently use two ways to signal
which BTF IDs are acceptable. First, bpf_func_proto.btf_id is an array of
IDs, one for each argument. This array is only accessed up to the highest
numbered argument that uses ARG_PTR_TO_BTF_ID and may therefore be less than
five arguments long. It usually points at a BTF_ID_LIST. Second, check_btf_id
is a function pointer that is called by the verifier if present. It gets the
actual BTF ID of the register, and the argument number we're currently checking.
It turns out that the only user, check_arg_btf_id, ignores the argument, and is
simply used to check whether the BTF ID has a struct sock_common at its start.

Replace both of these mechanisms with an explicit BTF ID for each argument
in a function proto. Thanks to btf_struct_ids_match this is very flexible:
check_arg_btf_id can be replaced by requiring struct sock_common.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200921121227.255763-5-lmb@cloudflare.com
2020-09-21 15:00:40 -07:00