Commit Graph

364 Commits

Author SHA1 Message Date
Yauheni Kaliuta 5b373179cb bpf: Allow kfunc in tracing and syscall programs.
Bugzilla: https://bugzilla.redhat.com/2120968

commit 979497674e63666a99fd7d242dba53a5ca5d628b
Author: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Date:   Wed May 18 22:59:08 2022 +0200

    bpf: Allow kfunc in tracing and syscall programs.
    
    Tracing and syscall BPF program types are very convenient to add BPF
    capabilities to subsystem otherwise not BPF capable.
    When we add kfuncs capabilities to those program types, we can add
    BPF features to subsystems without having to touch BPF core.
    
    Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
    Link: https://lore.kernel.org/r/20220518205924.399291-2-benjamin.tissoires@redhat.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:06 +02:00
Yauheni Kaliuta 11fec2f10e bpf: Compute map_btf_id during build time
Bugzilla: https://bugzilla.redhat.com/2120968

commit c317ab71facc2cd0a94145973318a4c914e11acc
Author: Menglong Dong <imagedong@tencent.com>
Date:   Mon Apr 25 21:32:47 2022 +0800

    bpf: Compute map_btf_id during build time
    
    For now, the field 'map_btf_id' in 'struct bpf_map_ops' for all map
    types are computed during vmlinux-btf init:
    
      btf_parse_vmlinux() -> btf_vmlinux_map_ids_init()
    
    It will lookup the btf_type according to the 'map_btf_name' field in
    'struct bpf_map_ops'. This process can be done during build time,
    thanks to Jiri's resolve_btfids.
    
    selftest of map_ptr has passed:
    
      $96 map_ptr:OK
      Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
    
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Menglong Dong <imagedong@tencent.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:00 +02:00
Yauheni Kaliuta 39845718d1 bpf: Make BTF type match stricter for release arguments
Bugzilla: https://bugzilla.redhat.com/2120968

commit 2ab3b3808eb17f729edfd69e061667ca0a427195
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:57 2022 +0530

    bpf: Make BTF type match stricter for release arguments
    
    The current of behavior of btf_struct_ids_match for release arguments is
    that when type match fails, it retries with first member type again
    (recursively). Since the offset is already 0, this is akin to just
    casting the pointer in normal C, since if type matches it was just
    embedded inside parent sturct as an object. However, we want to reject
    cases for release function type matching, be it kfunc or BPF helpers.
    
    An example is the following:
    
    struct foo {
    	struct bar b;
    };
    
    struct foo *v = acq_foo();
    rel_bar(&v->b); // btf_struct_ids_match fails btf_types_are_same, then
    		// retries with first member type and succeeds, while
    		// it should fail.
    
    Hence, don't walk the struct and only rely on btf_types_are_same for
    strict mode. All users of strict mode must be dealing with zero offset
    anyway, since otherwise they would want the struct to be walked.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-10-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
Yauheni Kaliuta 716c115570 bpf: Teach verifier about kptr_get kfunc helpers
Bugzilla: https://bugzilla.redhat.com/2120968

commit a1ef195996526da45bbc9710849254023df75aea
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:56 2022 +0530

    bpf: Teach verifier about kptr_get kfunc helpers
    
    We introduce a new style of kfunc helpers, namely *_kptr_get, where they
    take pointer to the map value which points to a referenced kernel
    pointer contained in the map. Since this is referenced, only
    bpf_kptr_xchg from BPF side and xchg from kernel side is allowed to
    change the current value, and each pointer that resides in that location
    would be referenced, and RCU protected (this must be kept in mind while
    adding kernel types embeddable as reference kptr in BPF maps).
    
    This means that if do the load of the pointer value in an RCU read
    section, and find a live pointer, then as long as we hold RCU read lock,
    it won't be freed by a parallel xchg + release operation. This allows us
    to implement a safe refcount increment scheme. Hence, enforce that first
    argument of all such kfunc is a proper PTR_TO_MAP_VALUE pointing at the
    right offset to referenced pointer.
    
    For the rest of the arguments, they are subjected to typical kfunc
    argument checks, hence allowing some flexibility in passing more intent
    into how the reference should be taken.
    
    For instance, in case of struct nf_conn, it is not freed until RCU grace
    period ends, but can still be reused for another tuple once refcount has
    dropped to zero. Hence, a bpf_ct_kptr_get helper not only needs to call
    refcount_inc_not_zero, but also do a tuple match after incrementing the
    reference, and when it fails to match it, put the reference again and
    return NULL.
    
    This can be implemented easily if we allow passing additional parameters
    to the bpf_ct_kptr_get kfunc, like a struct bpf_sock_tuple * and a
    tuple__sz pair.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-9-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
Yauheni Kaliuta 12c4199b33 bpf: Wire up freeing of referenced kptr
Bugzilla: https://bugzilla.redhat.com/2120968

commit 14a324f6a67ef6a53e04362a70160a47eb8afffa
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:55 2022 +0530

    bpf: Wire up freeing of referenced kptr
    
    A destructor kfunc can be defined as void func(type *), where type may
    be void or any other pointer type as per convenience.
    
    In this patch, we ensure that the type is sane and capture the function
    pointer into off_desc of ptr_off_tab for the specific pointer offset,
    with the invariant that the dtor pointer is always set when 'kptr_ref'
    tag is applied to the pointer's pointee type, which is indicated by the
    flag BPF_MAP_VALUE_OFF_F_REF.
    
    Note that only BTF IDs whose destructor kfunc is registered, thus become
    the allowed BTF IDs for embedding as referenced kptr. Hence it serves
    the purpose of finding dtor kfunc BTF ID, as well acting as a check
    against the whitelist of allowed BTF IDs for this purpose.
    
    Finally, wire up the actual freeing of the referenced pointer if any at
    all available offsets, so that no references are leaked after the BPF
    map goes away and the BPF program previously moved the ownership a
    referenced pointer into it.
    
    The behavior is similar to BPF timers, where bpf_map_{update,delete}_elem
    will free any existing referenced kptr. The same case is with LRU map's
    bpf_lru_push_free/htab_lru_push_free functions, which are extended to
    reset unreferenced and free referenced kptr.
    
    Note that unlike BPF timers, kptr is not reset or freed when map uref
    drops to zero.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-8-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
Yauheni Kaliuta 24419e5e2e bpf: Populate pairs of btf_id and destructor kfunc in btf
Bugzilla: https://bugzilla.redhat.com/2120968

commit 5ce937d613a423ca3102f53d9f3daf4210c1b6e2
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:54 2022 +0530

    bpf: Populate pairs of btf_id and destructor kfunc in btf
    
    To support storing referenced PTR_TO_BTF_ID in maps, we require
    associating a specific BTF ID with a 'destructor' kfunc. This is because
    we need to release a live referenced pointer at a certain offset in map
    value from the map destruction path, otherwise we end up leaking
    resources.
    
    Hence, introduce support for passing an array of btf_id, kfunc_btf_id
    pairs that denote a BTF ID and its associated release function. Then,
    add an accessor 'btf_find_dtor_kfunc' which can be used to look up the
    destructor kfunc of a certain BTF ID. If found, we can use it to free
    the object from the map free path.
    
    The registration of these pairs also serve as a whitelist of structures
    which are allowed as referenced PTR_TO_BTF_ID in a BPF map, because
    without finding the destructor kfunc, we will bail and return an error.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-7-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:46:59 +02:00
Yauheni Kaliuta f008a624b3 bpf: Allow storing referenced kptr in map
Bugzilla: https://bugzilla.redhat.com/2120968

commit c0a5a21c25f37c9fd7b36072f9968cdff1e4aa13
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:51 2022 +0530

    bpf: Allow storing referenced kptr in map
    
    Extending the code in previous commits, introduce referenced kptr
    support, which needs to be tagged using 'kptr_ref' tag instead. Unlike
    unreferenced kptr, referenced kptr have a lot more restrictions. In
    addition to the type matching, only a newly introduced bpf_kptr_xchg
    helper is allowed to modify the map value at that offset. This transfers
    the referenced pointer being stored into the map, releasing the
    references state for the program, and returning the old value and
    creating new reference state for the returned pointer.
    
    Similar to unreferenced pointer case, return value for this case will
    also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
    must either be eventually released by calling the corresponding release
    function, otherwise it must be transferred into another map.
    
    It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
    the value, and obtain the old value if any.
    
    BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future
    commit will permit using BPF_LDX for such pointers, but attempt at
    making it safe, since the lifetime of object won't be guaranteed.
    
    There are valid reasons to enforce the restriction of permitting only
    bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
    consistent in face of concurrent modification, and any prior values
    contained in the map must also be released before a new one is moved
    into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
    returns the old value, which the verifier would require the user to
    either free or move into another map, and releases the reference held
    for the pointer being moved in.
    
    In the future, direct BPF_XCHG instruction may also be permitted to work
    like bpf_kptr_xchg helper.
    
    Note that process_kptr_func doesn't have to call
    check_helper_mem_access, since we already disallow rdonly/wronly flags
    for map, which is what check_map_access_type checks, and we already
    ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
    so check_map_access is also not required.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:11 +02:00
Yauheni Kaliuta 6e36eccfc4 bpf: Tag argument to be released in bpf_func_proto
Bugzilla: https://bugzilla.redhat.com/2120968

commit 8f14852e89113d738c99c375b4c8b8b7e1073df1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:50 2022 +0530

    bpf: Tag argument to be released in bpf_func_proto
    
    Add a new type flag for bpf_arg_type that when set tells verifier that
    for a release function, that argument's register will be the one for
    which meta.ref_obj_id will be set, and which will then be released
    using release_reference. To capture the regno, introduce a new field
    release_regno in bpf_call_arg_meta.
    
    This would be required in the next patch, where we may either pass NULL
    or a refcounted pointer as an argument to the release function
    bpf_kptr_xchg. Just releasing only when meta.ref_obj_id is set is not
    enough, as there is a case where the type of argument needed matches,
    but the ref_obj_id is set to 0. Hence, we must enforce that whenever
    meta.ref_obj_id is zero, the register that is to be released can only
    be NULL for a release function.
    
    Since we now indicate whether an argument is to be released in
    bpf_func_proto itself, is_release_function helper has lost its utitlity,
    hence refactor code to work without it, and just rely on
    meta.release_regno to know when to release state for a ref_obj_id.
    Still, the restriction of one release argument and only one ref_obj_id
    passed to BPF helper or kfunc remains. This may be lifted in the future.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-3-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:11 +02:00
Yauheni Kaliuta 9b0b6285f7 bpf: Allow storing unreferenced kptr in map
Bugzilla: https://bugzilla.redhat.com/2120968

commit 61df10c7799e27807ad5e459eec9d77cddf8bf45
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Apr 25 03:18:49 2022 +0530

    bpf: Allow storing unreferenced kptr in map
    
    This commit introduces a new pointer type 'kptr' which can be embedded
    in a map value to hold a PTR_TO_BTF_ID stored by a BPF program during
    its invocation. When storing such a kptr, BPF program's PTR_TO_BTF_ID
    register must have the same type as in the map value's BTF, and loading
    a kptr marks the destination register as PTR_TO_BTF_ID with the correct
    kernel BTF and BTF ID.
    
    Such kptr are unreferenced, i.e. by the time another invocation of the
    BPF program loads this pointer, the object which the pointer points to
    may not longer exist. Since PTR_TO_BTF_ID loads (using BPF_LDX) are
    patched to PROBE_MEM loads by the verifier, it would safe to allow user
    to still access such invalid pointer, but passing such pointers into
    BPF helpers and kfuncs should not be permitted. A future patch in this
    series will close this gap.
    
    The flexibility offered by allowing programs to dereference such invalid
    pointers while being safe at runtime frees the verifier from doing
    complex lifetime tracking. As long as the user may ensure that the
    object remains valid, it can ensure data read by it from the kernel
    object is valid.
    
    The user indicates that a certain pointer must be treated as kptr
    capable of accepting stores of PTR_TO_BTF_ID of a certain type, by using
    a BTF type tag 'kptr' on the pointed to type of the pointer. Then, this
    information is recorded in the object BTF which will be passed into the
    kernel by way of map's BTF information. The name and kind from the map
    value BTF is used to look up the in-kernel type, and the actual BTF and
    BTF ID is recorded in the map struct in a new kptr_off_tab member. For
    now, only storing pointers to structs is permitted.
    
    An example of this specification is shown below:
    
    	#define __kptr __attribute__((btf_type_tag("kptr")))
    
    	struct map_value {
    		...
    		struct task_struct __kptr *task;
    		...
    	};
    
    Then, in a BPF program, user may store PTR_TO_BTF_ID with the type
    task_struct into the map, and then load it later.
    
    Note that the destination register is marked PTR_TO_BTF_ID_OR_NULL, as
    the verifier cannot know whether the value is NULL or not statically, it
    must treat all potential loads at that map value offset as loading a
    possibly NULL pointer.
    
    Only BPF_LDX, BPF_STX, and BPF_ST (with insn->imm = 0 to denote NULL)
    are allowed instructions that can access such a pointer. On BPF_LDX, the
    destination register is updated to be a PTR_TO_BTF_ID, and on BPF_STX,
    it is checked whether the source register type is a PTR_TO_BTF_ID with
    same BTF type as specified in the map BTF. The access size must always
    be BPF_DW.
    
    For the map in map support, the kptr_off_tab for outer map is copied
    from the inner map's kptr_off_tab. It was chosen to do a deep copy
    instead of introducing a refcount to kptr_off_tab, because the copy only
    needs to be done when paramterizing using inner_map_fd in the map in map
    case, hence would be unnecessary for all other users.
    
    It is not permitted to use MAP_FREEZE command and mmap for BPF map
    having kptrs, similar to the bpf_timer case. A kptr also requires that
    BPF program has both read and write access to the map (hence both
    BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG are disallowed).
    
    Note that check_map_access must be called from both
    check_helper_mem_access and for the BPF instructions, hence the kptr
    check must distinguish between ACCESS_DIRECT and ACCESS_HELPER, and
    reject ACCESS_HELPER cases. We rename stack_access_src to bpf_access_src
    and reuse it for this purpose.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220424214901.2743946-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:11 +02:00
Yauheni Kaliuta 7e5456e641 bpf: Make btf_find_field more generic
Bugzilla: https://bugzilla.redhat.com/2120968

commit 42ba1308074d9046386d58b56e793604be48ce22
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Apr 15 21:33:42 2022 +0530

    bpf: Make btf_find_field more generic
    
    Next commit introduces field type 'kptr' whose kind will not be struct,
    but pointer, and it will not be limited to one offset, but multiple
    ones. Make existing btf_find_struct_field and btf_find_datasec_var
    functions amenable to use for finding kptrs in map value, by moving
    spin_lock and timer specific checks into their own function.
    
    The alignment, and name are checked before the function is called, so it
    is the last point where we can skip field or return an error before the
    next loop iteration happens. Size of the field and type is meant to be
    checked inside the function.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20220415160354.1050687-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:10 +02:00
Yauheni Kaliuta b21e06ba5d bpf: Ensure type tags precede modifiers in BTF
Bugzilla: https://bugzilla.redhat.com/2120968

commit eb596b0905584a9389585b0f437cf8a2faeb14d0
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Tue Apr 19 22:16:07 2022 +0530

    bpf: Ensure type tags precede modifiers in BTF
    
    It is guaranteed that for modifiers, clang always places type tags
    before other modifiers, and then the base type. We would like to rely on
    this guarantee inside the kernel to make it simple to parse type tags
    from BTF.
    
    However, a user would be allowed to construct a BTF without such
    guarantees. Hence, add a pass to check that in modifier chains, type
    tags only occur at the head of the chain, and then don't occur later in
    the chain.
    
    If we see a type tag, we can have one or more type tags preceding other
    modifiers that then never have another type tag. If we see other
    modifiers, all modifiers following them should never be a type tag.
    
    Instead of having to walk chains we verified previously, we can remember
    the last good modifier type ID which headed a good chain. At that point,
    we must have verified all other chains headed by type IDs less than it.
    This makes the verification process less costly, and it becomes a simple
    O(n) pass.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220419164608.1990559-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-28 16:52:10 +02:00
Jerome Marchand 45d1f7dabd bpf: Fix maximum permitted number of arguments check
Bugzilla: https://bugzilla.redhat.com/2120966

commit c29a4920dfcaa1433b09e2674f605f72767a385c
Author: Yuntao Wang <ytcoode@gmail.com>
Date:   Fri Mar 25 00:42:38 2022 +0800

    bpf: Fix maximum permitted number of arguments check

    Since the m->arg_size array can hold up to MAX_BPF_FUNC_ARGS argument
    sizes, it's ok that nargs is equal to MAX_BPF_FUNC_ARGS.

    Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20220324164238.1274915-1-ytcoode@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:08 +02:00
Jerome Marchand 571461ff91 bpf: Simplify check in btf_parse_hdr()
Bugzilla: https://bugzilla.redhat.com/2120966

commit 583669ab3aed29994e50bde6c66b52d44e1bdb73
Author: Yuntao Wang <ytcoode@gmail.com>
Date:   Sun Mar 20 15:52:40 2022 +0800

    bpf: Simplify check in btf_parse_hdr()

    Replace offsetof(hdr_len) + sizeof(hdr_len) with offsetofend(hdr_len) to
    simplify the check for correctness of btf_data_size in btf_parse_hdr()

    Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20220320075240.1001728-1-ytcoode@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
Jerome Marchand 8415d528f2 bpf: Check for NULL return from bpf_get_btf_vmlinux
Bugzilla: https://bugzilla.redhat.com/2120966

commit 7ada3787e91c89b0aa7abf47682e8e587b855c13
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sun Mar 20 20:00:03 2022 +0530

    bpf: Check for NULL return from bpf_get_btf_vmlinux

    When CONFIG_DEBUG_INFO_BTF is disabled, bpf_get_btf_vmlinux can return a
    NULL pointer. Check for it in btf_get_module_btf to prevent a NULL pointer
    dereference.

    While kernel test robot only complained about this specific case, let's
    also check for NULL in other call sites of bpf_get_btf_vmlinux.

    Fixes: 9492450fd287 ("bpf: Always raise reference in btf_get_module_btf")
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220320143003.589540-1-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
Jerome Marchand 7211feaef4 bpf: Always raise reference in btf_get_module_btf
Bugzilla: https://bugzilla.redhat.com/2120966

commit 9492450fd28736262dea9143ebb3afc2c131ace1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Mar 17 17:29:51 2022 +0530

    bpf: Always raise reference in btf_get_module_btf

    Align it with helpers like bpf_find_btf_id, so all functions returning
    BTF in out parameter follow the same rule of raising reference
    consistently, regardless of module or vmlinux BTF.

    Adjust existing callers to handle the change accordinly.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220317115957.3193097-10-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
Jerome Marchand a6464a8f1f bpf: Factor out fd returning from bpf_btf_find_by_name_kind
Bugzilla: https://bugzilla.redhat.com/2120966

commit edc3ec09ab706c45e955f7a52f0904b4ed649ca9
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Mar 17 17:29:43 2022 +0530

    bpf: Factor out fd returning from bpf_btf_find_by_name_kind

    In next few patches, we need a helper that searches all kernel BTFs
    (vmlinux and module BTFs), and finds the type denoted by 'name' and
    'kind'. Turns out bpf_btf_find_by_name_kind already does the same thing,
    but it instead returns a BTF ID and optionally fd (if module BTF). This
    is used for relocating ksyms in BPF loader code (bpftool gen skel -L).

    We extract the core code out into a new helper bpf_find_btf_id, which
    returns the BTF ID in the return value, and BTF pointer in an out
    parameter. The reference for the returned BTF pointer is always raised,
    hence user must either transfer it (e.g. to a fd), or release it after
    use.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220317115957.3193097-2-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:05 +02:00
Jerome Marchand 5018bf8095 bpf: Reject programs that try to load __percpu memory.
Bugzilla: https://bugzilla.redhat.com/2120966

commit 5844101a1be9b8636024cb31c865ef13c7cc6db3
Author: Hao Luo <haoluo@google.com>
Date:   Fri Mar 4 11:16:56 2022 -0800

    bpf: Reject programs that try to load __percpu memory.

    With the introduction of the btf_type_tag "percpu", we can add a
    MEM_PERCPU to identify those pointers that point to percpu memory.
    The ability of differetiating percpu pointers from regular memory
    pointers have two benefits:

     1. It forbids unexpected use of percpu pointers, such as direct loads.
        In kernel, there are special functions used for accessing percpu
        memory. Directly loading percpu memory is meaningless. We already
        have BPF helpers like bpf_per_cpu_ptr() and bpf_this_cpu_ptr() that
        wrap the kernel percpu functions. So we can now convert percpu
        pointers into regular pointers in a safe way.

     2. Previously, bpf_per_cpu_ptr() and bpf_this_cpu_ptr() only work on
        PTR_TO_PERCPU_BTF_ID, a special reg_type which describes static
        percpu variables in kernel (we rely on pahole to encode them into
        vmlinux BTF). Now, since we can identify __percpu tagged pointers,
        we can also identify dynamically allocated percpu memory as well.
        It means we can use bpf_xxx_cpu_ptr() on dynamic percpu memory.
        This would be very convenient when accessing fields like
        "cgroup->rstat_cpu".

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20220304191657.981240-4-haoluo@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:56 +02:00
Jerome Marchand 9bc38e28b6 bpf: Harden register offset checks for release helpers and kfuncs
Bugzilla: https://bugzilla.redhat.com/2120966

commit 24d5bb806c7e2c0b9972564fd493069f612d90dd
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Mar 5 04:16:41 2022 +0530

    bpf: Harden register offset checks for release helpers and kfuncs

    Let's ensure that the PTR_TO_BTF_ID reg being passed in to release BPF
    helpers and kfuncs always has its offset set to 0. While not a real
    problem now, there's a very real possibility this will become a problem
    when more and more kfuncs are exposed, and more BPF helpers are added
    which can release PTR_TO_BTF_ID.

    Previous commits already protected against non-zero var_off. One of the
    case we are concerned about now is when we have a type that can be
    returned by e.g. an acquire kfunc:

    struct foo {
    	int a;
    	int b;
    	struct bar b;
    };

    ... and struct bar is also a type that can be returned by another
    acquire kfunc.

    Then, doing the following sequence:

    	struct foo *f = bpf_get_foo(); // acquire kfunc
    	if (!f)
    		return 0;
    	bpf_put_bar(&f->b); // release kfunc

    ... would work with the current code, since the btf_struct_ids_match
    takes reg->off into account for matching pointer type with release kfunc
    argument type, but would obviously be incorrect, and most likely lead to
    a kernel crash. A test has been included later to prevent regressions in
    this area.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220304224645.3677453-5-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:56 +02:00
Jerome Marchand 0925903186 bpf: Fix PTR_TO_BTF_ID var_off check
Bugzilla: https://bugzilla.redhat.com/2120966

commit 655efe5089f077485eec848272bd7e26b1a5a735
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Mar 5 04:16:39 2022 +0530

    bpf: Fix PTR_TO_BTF_ID var_off check

    When kfunc support was added, check_ctx_reg was called for PTR_TO_CTX
    register, but no offset checks were made for PTR_TO_BTF_ID. Only
    reg->off was taken into account by btf_struct_ids_match, which protected
    against type mismatch due to non-zero reg->off, but when reg->off was
    zero, a user could set the variable offset of the register and allow it
    to be passed to kfunc, leading to bad pointer being passed into the
    kernel.

    Fix this by reusing the extracted helper check_func_arg_reg_off from
    previous commit, and make one call before checking all supported
    register types. Since the list is maintained, any future changes will be
    taken into account by updating check_func_arg_reg_off. This function
    prevents non-zero var_off to be set for PTR_TO_BTF_ID, but still allows
    a fixed non-zero reg->off, which is needed for type matching to work
    correctly when using pointer arithmetic.

    ARG_DONTCARE is passed as arg_type, since kfunc doesn't support
    accepting a ARG_PTR_TO_ALLOC_MEM without relying on size of parameter
    type from BTF (in case of pointer), or using a mem, len pair. The
    forcing of offset check for ARG_PTR_TO_ALLOC_MEM is done because ringbuf
    helpers obtain the size from the header located at the beginning of the
    memory region, hence any changes to the original pointer shouldn't be
    allowed. In case of kfunc, size is always known, either at verification
    time, or using the length parameter, hence this forcing is not required.

    Since this check will happen once already for PTR_TO_CTX, remove the
    check_ptr_off_reg call inside its block.

    Fixes: e6ac2450d6 ("bpf: Support bpf program calling kernel function")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220304224645.3677453-3-memxor@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:56 +02:00
Jerome Marchand edbd96eb68 bpf: Add config to allow loading modules with BTF mismatches
Bugzilla: https://bugzilla.redhat.com/2120966

commit 5e214f2e43e453d862ebbbd2a4f7ee3fe650f209
Author: Connor O'Brien <connoro@google.com>
Date:   Wed Feb 23 01:28:14 2022 +0000

    bpf: Add config to allow loading modules with BTF mismatches

    BTF mismatch can occur for a separately-built module even when the ABI is
    otherwise compatible and nothing else would prevent successfully loading.

    Add a new Kconfig to control how mismatches are handled. By default, preserve
    the current behavior of refusing to load the module. If MODULE_ALLOW_BTF_MISMATCH
    is enabled, load the module but ignore its BTF information.

    Suggested-by: Yonghong Song <yhs@fb.com>
    Suggested-by: Michal Suchánek <msuchanek@suse.de>
    Signed-off-by: Connor O'Brien <connoro@google.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/CAADnVQJ+OVPnBz8z3vNu8gKXX42jCUqfuvhWAyCQDu8N_yqqwQ@mail.gmail.com
    Link: https://lore.kernel.org/bpf/20220223012814.1898677-1-connoro@google.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Jerome Marchand c5056baccd bpf: Cleanup comments
Bugzilla: https://bugzilla.redhat.com/2120966

commit c561d11063009323a0e57c528cb1d77b7d2c41e0
Author: Tom Rix <trix@redhat.com>
Date:   Sun Feb 20 10:40:55 2022 -0800

    bpf: Cleanup comments

    Add leading space to spdx tag
    Use // for spdx c file comment

    Replacements
    resereved to reserved
    inbetween to in between
    everytime to every time
    intutivie to intuitive
    currenct to current
    encontered to encountered
    referenceing to referencing
    upto to up to
    exectuted to executed

    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20220220184055.3608317-1-trix@redhat.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Jerome Marchand 5b0c453bf8 bpf: Initialize ret to 0 inside btf_populate_kfunc_set()
Bugzilla: https://bugzilla.redhat.com/2120966

commit d0b3822902b6af45f2c75706d7eb2a35aacab223
Author: Souptick Joarder (HPE) <jrdr.linux@gmail.com>
Date:   Sat Feb 19 22:09:15 2022 +0530

    bpf: Initialize ret to 0 inside btf_populate_kfunc_set()

    Kernel test robot reported below error ->

    kernel/bpf/btf.c:6718 btf_populate_kfunc_set()
    error: uninitialized symbol 'ret'.

    Initialize ret to 0.

    Fixes: dee872e124e8 ("bpf: Populate kfunc BTF ID sets in struct btf")
    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Souptick Joarder (HPE) <jrdr.linux@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/bpf/20220219163915.125770-1-jrdr.linux@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Jerome Marchand a1698130ea libbpf: Split bpf_core_apply_relo()
Bugzilla: https://bugzilla.redhat.com/2120966

commit adb8fa195efdfaac5852aaac24810b456ce43b04
Author: Mauricio Vásquez <mauricio@kinvolk.io>
Date:   Tue Feb 15 17:58:50 2022 -0500

    libbpf: Split bpf_core_apply_relo()

    BTFGen needs to run the core relocation logic in order to understand
    what are the types involved in a given relocation.

    Currently bpf_core_apply_relo() calculates and **applies** a relocation
    to an instruction. Having both operations in the same function makes it
    difficult to only calculate the relocation without patching the
    instruction. This commit splits that logic in two different phases: (1)
    calculate the relocation and (2) patch the instruction.

    For the first phase bpf_core_apply_relo() is renamed to
    bpf_core_calc_relo_insn() who is now only on charge of calculating the
    relocation, the second phase uses the already existing
    bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and
    the BTFGen will use only bpf_core_calc_relo_insn().

    Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
    Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
    Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
    Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220215225856.671072-2-mauricio@kinvolk.io

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:50 +02:00
Jerome Marchand b8affec23c bpf: Implement bpf_core_types_are_compat().
Bugzilla: https://bugzilla.redhat.com/2120966

commit e70e13e7d4ab8f932f49db1c9500b30a34a6d420
Author: Matteo Croce <mcroce@microsoft.com>
Date:   Fri Feb 4 01:55:18 2022 +0100

    bpf: Implement bpf_core_types_are_compat().

    Adopt libbpf's bpf_core_types_are_compat() for kernel duty by adding
    explicit recursion limit of 2 which is enough to handle 2 levels of
    function prototypes.

    Signed-off-by: Matteo Croce <mcroce@microsoft.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220204005519.60361-2-mcroce@linux.microsoft.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:47 +02:00
Jerome Marchand 7b7955f647 bpf: reject program if a __user tagged memory accessed in kernel way
Bugzilla: https://bugzilla.redhat.com/2120966

commit c6f1bfe89ac95dc829dcb4ed54780da134ac5fce
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Jan 27 07:46:06 2022 -0800

    bpf: reject program if a __user tagged memory accessed in kernel way

    BPF verifier supports direct memory access for BPF_PROG_TYPE_TRACING type
    of bpf programs, e.g., a->b. If "a" is a pointer
    pointing to kernel memory, bpf verifier will allow user to write
    code in C like a->b and the verifier will translate it to a kernel
    load properly. If "a" is a pointer to user memory, it is expected
    that bpf developer should be bpf_probe_read_user() helper to
    get the value a->b. Without utilizing BTF __user tagging information,
    current verifier will assume that a->b is a kernel memory access
    and this may generate incorrect result.

    Now BTF contains __user information, it can check whether the
    pointer points to a user memory or not. If it is, the verifier
    can reject the program and force users to use bpf_probe_read_user()
    helper explicitly.

    In the future, we can easily extend btf_add_space for other
    address space tagging, for example, rcu/percpu etc.

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20220127154606.654961-1-yhs@fb.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:44 +02:00
Jerome Marchand c841220259 bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF
Bugzilla: https://bugzilla.redhat.com/2120966

commit c446fdacb10dcb3b9a9ed3b91d91e72d71d94b03
Author: Stanislav Fomichev <sdf@google.com>
Date:   Tue Jan 25 16:13:40 2022 -0800

    bpf: fix register_btf_kfunc_id_set for !CONFIG_DEBUG_INFO_BTF

    Commit dee872e124e8 ("bpf: Populate kfunc BTF ID sets in struct btf")
    breaks loading of some modules when CONFIG_DEBUG_INFO_BTF is not set.
    register_btf_kfunc_id_set returns -ENOENT to the callers when
    there is no module btf. Let's return 0 (success) instead to let
    those modules work in !CONFIG_DEBUG_INFO_BTF cases.

    Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Fixes: dee872e124e8 ("bpf: Populate kfunc BTF ID sets in struct btf")
    Signed-off-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20220126001340.1573649-1-sdf@google.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:44 +02:00
Jerome Marchand aa7f151c25 bpf: Add reference tracking support to kfunc
Bugzilla: https://bugzilla.redhat.com/2120966

Conflicts:
Simple context change due to already applied commit 45ce4b4f9009
("bpf: Fix crash due to out of bounds access into reg2btf_ids.")

commit 5c073f26f9dc78a6c8194b23eac7537c9692c7d7
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:48 2022 +0530

    bpf: Add reference tracking support to kfunc

    This patch adds verifier support for PTR_TO_BTF_ID return type of kfunc
    to be a reference, by reusing acquire_reference_state/release_reference
    support for existing in-kernel bpf helpers.

    We make use of the three kfunc types:

    - BTF_KFUNC_TYPE_ACQUIRE
      Return true if kfunc_btf_id is an acquire kfunc.  This will
      acquire_reference_state for the returned PTR_TO_BTF_ID (this is the
      only allow return value). Note that acquire kfunc must always return a
      PTR_TO_BTF_ID{_OR_NULL}, otherwise the program is rejected.

    - BTF_KFUNC_TYPE_RELEASE
      Return true if kfunc_btf_id is a release kfunc.  This will release the
      reference to the passed in PTR_TO_BTF_ID which has a reference state
      (from earlier acquire kfunc).
      The btf_check_func_arg_match returns the regno (of argument register,
      hence > 0) if the kfunc is a release kfunc, and a proper referenced
      PTR_TO_BTF_ID is being passed to it.
      This is similar to how helper call check uses bpf_call_arg_meta to
      store the ref_obj_id that is later used to release the reference.
      Similar to in-kernel helper, we only allow passing one referenced
      PTR_TO_BTF_ID as an argument. It can also be passed in to normal
      kfunc, but in case of release kfunc there must always be one
      PTR_TO_BTF_ID argument that is referenced.

    - BTF_KFUNC_TYPE_RET_NULL
      For kfunc returning PTR_TO_BTF_ID, tells if it can be NULL, hence
      force caller to mark the pointer not null (using check) before
      accessing it. Note that taking into account the case fixed by commit
      93c230e3f5 ("bpf: Enforce id generation for all may-be-null register type")
      we assign a non-zero id for mark_ptr_or_null_reg logic. Later, if more
      return types are supported by kfunc, which have a _OR_NULL variant, it
      might be better to move this id generation under a common
      reg_type_may_be_null check, similar to the case in the commit.

    Referenced PTR_TO_BTF_ID is currently only limited to kfunc, but can be
    extended in the future to other BPF helpers as well.  For now, we can
    rely on the btf_struct_ids_match check to ensure we get the pointer to
    the expected struct type. In the future, care needs to be taken to avoid
    ambiguity for reference PTR_TO_BTF_ID passed to release function, in
    case multiple candidates can release same BTF ID.

    e.g. there might be two release kfuncs (or kfunc and helper):

    foo(struct abc *p);
    bar(struct abc *p);

    ... such that both release a PTR_TO_BTF_ID with btf_id of struct abc. In
    this case we would need to track the acquire function corresponding to
    the release function to avoid type confusion, and store this information
    in the register state so that an incorrect program can be rejected. This
    is not a problem right now, hence it is left as an exercise for the
    future patch introducing such a case in the kernel.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-6-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:40 +02:00
Jerome Marchand dd8556d8d2 bpf: Introduce mem, size argument pair support for kfunc
Bugzilla: https://bugzilla.redhat.com/2120966

Conflicts:
Context change due to already applied commit be80a1d3f9db ("bpf:
Generalize check_ctx_reg for reuse with other types")

commit d583691c47dc0424ebe926000339a6d6cd590ff7
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:47 2022 +0530

    bpf: Introduce mem, size argument pair support for kfunc

    BPF helpers can associate two adjacent arguments together to pass memory
    of certain size, using ARG_PTR_TO_MEM and ARG_CONST_SIZE arguments.
    Since we don't use bpf_func_proto for kfunc, we need to leverage BTF to
    implement similar support.

    The ARG_CONST_SIZE processing for helpers is refactored into a common
    check_mem_size_reg helper that is shared with kfunc as well. kfunc
    ptr_to_mem support follows logic similar to global functions, where
    verification is done as if pointer is not null, even when it may be
    null.

    This leads to a simple to follow rule for writing kfunc: always check
    the argument pointer for NULL, except when it is PTR_TO_CTX. Also, the
    PTR_TO_CTX case is also only safe when the helper expecting pointer to
    program ctx is not exposed to other programs where same struct is not
    ctx type. In that case, the type check will fall through to other cases
    and would permit passing other types of pointers, possibly NULL at
    runtime.

    Currently, we require the size argument to be suffixed with "__sz" in
    the parameter name. This information is then recorded in kernel BTF and
    verified during function argument checking. In the future we can use BTF
    tagging instead, and modify the kernel function definitions. This will
    be a purely kernel-side change.

    This allows us to have some form of backwards compatibility for
    structures that are passed in to the kernel function with their size,
    and allow variable length structures to be passed in if they are
    accompanied by a size parameter.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-5-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
Jerome Marchand 1306b4efce bpf: Remove check_kfunc_call callback and old kfunc BTF ID API
Bugzilla: https://bugzilla.redhat.com/2120966

commit b202d84422223b7222cba5031d182f20b37e146e
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:46 2022 +0530

    bpf: Remove check_kfunc_call callback and old kfunc BTF ID API

    Completely remove the old code for check_kfunc_call to help it work
    with modules, and also the callback itself.

    The previous commit adds infrastructure to register all sets and put
    them in vmlinux or module BTF, and concatenates all related sets
    organized by the hook and the type. Once populated, these sets remain
    immutable for the lifetime of the struct btf.

    Also, since we don't need the 'owner' module anywhere when doing
    check_kfunc_call, drop the 'btf_modp' module parameter from
    find_kfunc_desc_btf.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-4-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
Jerome Marchand 25756e43b7 bpf: Populate kfunc BTF ID sets in struct btf
Bugzilla: https://bugzilla.redhat.com/2120966

commit dee872e124e8d5de22b68c58f6f6c3f5e8889160
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:45 2022 +0530

    bpf: Populate kfunc BTF ID sets in struct btf

    This patch prepares the kernel to support putting all kinds of kfunc BTF
    ID sets in the struct btf itself. The various kernel subsystems will
    make register_btf_kfunc_id_set call in the initcalls (for built-in code
    and modules).

    The 'hook' is one of the many program types, e.g. XDP and TC/SCHED_CLS,
    STRUCT_OPS, and 'types' are check (allowed or not), acquire, release,
    and ret_null (with PTR_TO_BTF_ID_OR_NULL return type).

    A maximum of BTF_KFUNC_SET_MAX_CNT (32) kfunc BTF IDs are permitted in a
    set of certain hook and type for vmlinux sets, since they are allocated
    on demand, and otherwise set as NULL. Module sets can only be registered
    once per hook and type, hence they are directly assigned.

    A new btf_kfunc_id_set_contains function is exposed for use in verifier,
    this new method is faster than the existing list searching method, and
    is also automatic. It also lets other code not care whether the set is
    unallocated or not.

    Note that module code can only do single register_btf_kfunc_id_set call
    per hook. This is why sorting is only done for in-kernel vmlinux sets,
    because there might be multiple sets for the same hook and type that
    must be concatenated, hence sorting them is required to ensure bsearch
    in btf_id_set_contains continues to work correctly.

    Next commit will update the kernel users to make use of this
    infrastructure.

    Finally, add __maybe_unused annotation for BTF ID macros for the
    !CONFIG_DEBUG_INFO_BTF case, so that they don't produce warnings during
    build time.

    The previous patch is also needed to provide synchronization against
    initialization for module BTF's kfunc_set_tab introduced here, as
    described below:

      The kfunc_set_tab pointer in struct btf is write-once (if we consider
      the registration phase (comprised of multiple register_btf_kfunc_id_set
      calls) as a single operation). In this sense, once it has been fully
      prepared, it isn't modified, only used for lookup (from the verifier
      context).

      For btf_vmlinux, it is initialized fully during the do_initcalls phase,
      which happens fairly early in the boot process, before any processes are
      present. This also eliminates the possibility of bpf_check being called
      at that point, thus relieving us of ensuring any synchronization between
      the registration and lookup function (btf_kfunc_id_set_contains).

      However, the case for module BTF is a bit tricky. The BTF is parsed,
      prepared, and published from the MODULE_STATE_COMING notifier callback.
      After this, the module initcalls are invoked, where our registration
      function will be called to populate the kfunc_set_tab for module BTF.

      At this point, BTF may be available to userspace while its corresponding
      module is still intializing. A BTF fd can then be passed to verifier
      using bpf syscall (e.g. for kfunc call insn).

      Hence, there is a race window where verifier may concurrently try to
      lookup the kfunc_set_tab. To prevent this race, we must ensure the
      operations are serialized, or waiting for the __init functions to
      complete.

      In the earlier registration API, this race was alleviated as verifier
      bpf_check_mod_kfunc_call didn't find the kfunc BTF ID until it was added
      by the registration function (called usually at the end of module __init
      function after all module resources have been initialized). If the
      verifier made the check_kfunc_call before kfunc BTF ID was added to the
      list, it would fail verification (saying call isn't allowed). The
      access to list was protected using a mutex.

      Now, it would still fail verification, but for a different reason
      (returning ENXIO due to the failed btf_try_get_module call in
      add_kfunc_call), because if the __init call is in progress the module
      will be in the middle of MODULE_STATE_COMING -> MODULE_STATE_LIVE
      transition, and the BTF_MODULE_LIVE flag for btf_module instance will
      not be set, so the btf_try_get_module call will fail.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-3-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
Jerome Marchand ab96ac72cd bpf: Fix UAF due to race between btf_try_get_module and load_module
Bugzilla: https://bugzilla.redhat.com/2120966

commit 18688de203b47e5d8d9d0953385bf30b5949324f
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Jan 14 22:09:44 2022 +0530

    bpf: Fix UAF due to race between btf_try_get_module and load_module

    While working on code to populate kfunc BTF ID sets for module BTF from
    its initcall, I noticed that by the time the initcall is invoked, the
    module BTF can already be seen by userspace (and the BPF verifier). The
    existing btf_try_get_module calls try_module_get which only fails if
    mod->state == MODULE_STATE_GOING, i.e. it can increment module reference
    when module initcall is happening in parallel.

    Currently, BTF parsing happens from MODULE_STATE_COMING notifier
    callback. At this point, the module initcalls have not been invoked.
    The notifier callback parses and prepares the module BTF, allocates an
    ID, which publishes it to userspace, and then adds it to the btf_modules
    list allowing the kernel to invoke btf_try_get_module for the BTF.

    However, at this point, the module has not been fully initialized (i.e.
    its initcalls have not finished). The code in module.c can still fail
    and free the module, without caring for other users. However, nothing
    stops btf_try_get_module from succeeding between the state transition
    from MODULE_STATE_COMING to MODULE_STATE_LIVE.

    This leads to a use-after-free issue when BPF program loads
    successfully in the state transition, load_module's do_init_module call
    fails and frees the module, and BPF program fd on close calls module_put
    for the freed module. Future patch has test case to verify we don't
    regress in this area in future.

    There are multiple points after prepare_coming_module (in load_module)
    where failure can occur and module loading can return error. We
    illustrate and test for the race using the last point where it can
    practically occur (in module __init function).

    An illustration of the race:

    CPU 0                           CPU 1
    			  load_module
    			    notifier_call(MODULE_STATE_COMING)
    			      btf_parse_module
    			      btf_alloc_id	// Published to userspace
    			      list_add(&btf_mod->list, btf_modules)
    			    mod->init(...)
    ...				^
    bpf_check		        |
    check_pseudo_btf_id             |
      btf_try_get_module            |
        returns true                |  ...
    ...                             |  module __init in progress
    return prog_fd                  |  ...
    ...                             V
    			    if (ret < 0)
    			      free_module(mod)
    			    ...
    close(prog_fd)
     ...
     bpf_prog_free_deferred
      module_put(used_btf.mod) // use-after-free

    We fix this issue by setting a flag BTF_MODULE_F_LIVE, from the notifier
    callback when MODULE_STATE_LIVE state is reached for the module, so that
    we return NULL from btf_try_get_module for modules that are not fully
    formed. Since try_module_get already checks that module is not in
    MODULE_STATE_GOING state, and that is the only transition a live module
    can make before being removed from btf_modules list, this is enough to
    close the race and prevent the bug.

    A later selftest patch crafts the race condition artifically to verify
    that it has been fixed, and that verifier fails to load program (with
    ENXIO).

    Lastly, a couple of comments:

     1. Even if this race didn't exist, it seems more appropriate to only
        access resources (ksyms and kfuncs) of a fully formed module which
        has been initialized completely.

     2. This patch was born out of need for synchronization against module
        initcall for the next patch, so it is needed for correctness even
        without the aforementioned race condition. The BTF resources
        initialized by module initcall are set up once and then only looked
        up, so just waiting until the initcall has finished ensures correct
        behavior.

    Fixes: 541c3bad8d ("bpf: Support BPF ksym variables in kernel modules")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Link: https://lore.kernel.org/r/20220114163953.1455836-2-memxor@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:39 +02:00
Artem Savkov 176866ac1f bpf: Fix crash due to out of bounds access into reg2btf_ids.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 45ce4b4f9009102cd9f581196d480a59208690c1
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Thu Feb 17 01:49:43 2022 +0530

    bpf: Fix crash due to out of bounds access into reg2btf_ids.

    When commit e6ac2450d6 ("bpf: Support bpf program calling kernel function") added
    kfunc support, it defined reg2btf_ids as a cheap way to translate the verifier
    reg type to the appropriate btf_vmlinux BTF ID, however
    commit c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
    moved the __BPF_REG_TYPE_MAX from the last member of bpf_reg_type enum to after
    the base register types, and defined other variants using type flag
    composition. However, now, the direct usage of reg->type to index into
    reg2btf_ids may no longer fall into __BPF_REG_TYPE_MAX range, and hence lead to
    out of bounds access and kernel crash on dereference of bad pointer.

    Fixes: c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20220216201943.624869-1-memxor@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:55 +02:00
Artem Savkov c336e6fbf1 bpf: Generalize check_ctx_reg for reuse with other types
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit be80a1d3f9dbe5aee79a325964f7037fe2d92f30
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Mon Jan 10 14:05:49 2022 +0000

    bpf: Generalize check_ctx_reg for reuse with other types

    Generalize the check_ctx_reg() helper function into a more generic named one
    so that it can be reused for other register types as well to check whether
    their offset is non-zero. No functional change.

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Acked-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:54 +02:00
Artem Savkov 0adb10f4f1 bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Omitted-fix: f858c2b2ca04 bpf: Fix calling global functions from
             BPF_PROG_TYPE_EXT programs
    It has not so small dependencies so given it is from 5.18 which is
    planned to be backported in 9.2 as well I'm leaving it till later.

commit 3363bd0cfbb80dfcd25003cd3815b0ad8b68d0ff
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Fri Dec 17 07:20:24 2021 +0530

    bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support

    Allow passing PTR_TO_CTX, if the kfunc expects a matching struct type,
    and punt to PTR_TO_MEM block if reg->type does not fall in one of
    PTR_TO_BTF_ID or PTR_TO_SOCK* types. This will be used by future commits
    to get access to XDP and TC PTR_TO_CTX, and pass various data (flags,
    l4proto, netns_id, etc.) encoded in opts struct passed as pointer to
    kfunc.

    For PTR_TO_MEM support, arguments are currently limited to pointer to
    scalar, or pointer to struct composed of scalars. This is done so that
    unsafe scenarios (like passing PTR_TO_MEM where PTR_TO_BTF_ID of
    in-kernel valid structure is expected, which may have pointers) are
    avoided. Since the argument checking happens basd on argument register
    type, it is not easy to ascertain what the expected type is. In the
    future, support for PTR_TO_MEM for kfunc can be extended to serve other
    usecases. The struct type whose pointer is passed in may have maximum
    nesting depth of 4, all recursively composed of scalars or struct with
    scalars.

    Future commits will add negative tests that check whether these
    restrictions imposed for kfunc arguments are duly rejected by BPF
    verifier or not.

    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217015031.1278167-4-memxor@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
Artem Savkov 7f76bfc54f bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 216e3cd2f28dbbf1fe86848e0e29e6693b9f0a20
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:51 2021 -0800

    bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.

    Some helper functions may modify its arguments, for example,
    bpf_d_path, bpf_get_stack etc. Previously, their argument types
    were marked as ARG_PTR_TO_MEM, which is compatible with read-only
    mem types, such as PTR_TO_RDONLY_BUF. Therefore it's legitimate,
    but technically incorrect, to modify a read-only memory by passing
    it into one of such helper functions.

    This patch tags the bpf_args compatible with immutable memory with
    MEM_RDONLY flag. The arguments that don't have this flag will be
    only compatible with mutable memory types, preventing the helper
    from modifying a read-only memory. The bpf_args that have
    MEM_RDONLY are compatible with both mutable memory and immutable
    memory.

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217003152.48334-9-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
Artem Savkov eb324cec1c bpf: Convert PTR_TO_MEM_OR_NULL to composable types.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit cf9f2f8d62eca810afbd1ee6cc0800202b000e57
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:49 2021 -0800

    bpf: Convert PTR_TO_MEM_OR_NULL to composable types.

    Remove PTR_TO_MEM_OR_NULL and replace it with PTR_TO_MEM combined with
    flag PTR_MAYBE_NULL.

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217003152.48334-7-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
Artem Savkov 86317abded bpf: Introduce MEM_RDONLY flag
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 20b2aff4bc15bda809f994761d5719827d66c0b4
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:48 2021 -0800

    bpf: Introduce MEM_RDONLY flag

    This patch introduce a flag MEM_RDONLY to tag a reg value
    pointing to read-only memory. It makes the following changes:

    1. PTR_TO_RDWR_BUF -> PTR_TO_BUF
    2. PTR_TO_RDONLY_BUF -> PTR_TO_BUF | MEM_RDONLY

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211217003152.48334-6-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:50 +02:00
Artem Savkov 2938af7b1d bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit c25b2ae136039ffa820c26138ed4a5e5f3ab3841
Author: Hao Luo <haoluo@google.com>
Date:   Thu Dec 16 16:31:47 2021 -0800

    bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL

    We have introduced a new type to make bpf_reg composable, by
    allocating bits in the type to represent flags.

    One of the flags is PTR_MAYBE_NULL which indicates a pointer
    may be NULL. This patch switches the qualified reg_types to
    use this flag. The reg_types changed in this patch include:

    1. PTR_TO_MAP_VALUE_OR_NULL
    2. PTR_TO_SOCKET_OR_NULL
    3. PTR_TO_SOCK_COMMON_OR_NULL
    4. PTR_TO_TCP_SOCK_OR_NULL
    5. PTR_TO_BTF_ID_OR_NULL
    6. PTR_TO_MEM_OR_NULL
    7. PTR_TO_RDONLY_BUF_OR_NULL
    8. PTR_TO_RDWR_BUF_OR_NULL

    Signed-off-by: Hao Luo <haoluo@google.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/r/20211217003152.48334-5-haoluo@google.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:49 +02:00
Artem Savkov 003660d21e bpf: Allow access to int pointer arguments in tracing programs
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit bb6728d756112596881a5fdf2040544031905840
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Wed Dec 8 20:32:41 2021 +0100

    bpf: Allow access to int pointer arguments in tracing programs

    Adding support to access arguments with int pointer arguments
    in tracing programs.

    Currently we allow tracing programs to access only pointers to
    string (char pointer), void pointers and pointers to structs.

    If we try to access argument which is pointer to int, verifier
    will fail to load the program with;

      R1 type=ctx expected=fp
      ; int BPF_PROG(fmod_ret_test, int _a, __u64 _b, int _ret)
      0: (bf) r6 = r1
      ; int BPF_PROG(fmod_ret_test, int _a, __u64 _b, int _ret)
      1: (79) r9 = *(u64 *)(r6 +8)
      func 'bpf_modify_return_test' arg1 type INT is not a struct

    There is no harm for the program to access int pointer argument.
    We are already doing that for string pointer, which is pointer
    to int with 1 byte size.

    Changing the is_string_ptr to generic integer check and renaming
    it to btf_type_is_int.

    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211208193245.172141-2-jolsa@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:47 +02:00
Artem Savkov d475f1fc96 bpf: Silence coverity false positive warning.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit f18a499799dd0f0fdd98cf72d98d3866ce9ac60e
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Sat Dec 11 18:08:19 2021 -0800

    bpf: Silence coverity false positive warning.

    Coverity issued the following warning:
    6685            cands = bpf_core_add_cands(cands, main_btf, 1);
    6686            if (IS_ERR(cands))
    >>>     CID 1510300:    (RETURN_LOCAL)
    >>>     Returning pointer "cands" which points to local variable "local_cand".
    6687                    return cands;

    It's a false positive.
    Add ERR_CAST() to silence it.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:47 +02:00
Artem Savkov a1c11bfa7b bpf: Use kmemdup() to replace kmalloc + memcpy
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 4674f21071b935c237217ac02cb310522d6ad95d
Author: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Date:   Thu Dec 9 14:21:22 2021 +0800

    bpf: Use kmemdup() to replace kmalloc + memcpy

    Eliminate the follow coccicheck warning:

    ./kernel/bpf/btf.c:6537:13-20: WARNING opportunity for kmemdup.

    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/1639030882-92383-1-git-send-email-jiapeng.chong@linux.alibaba.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:47 +02:00
Artem Savkov 67e859d1a6 bpf: Remove redundant assignment to pointer t
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 73b6eae583f44e278e19489a411f9c1e22d530fc
Author: Colin Ian King <colin.i.king@gmail.com>
Date:   Tue Dec 7 22:47:18 2021 +0000

    bpf: Remove redundant assignment to pointer t

    The pointer t is being initialized with a value that is never read. The
    pointer is re-assigned a value a littler later on, hence the initialization
    is redundant and can be removed.

    Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211207224718.59593-1-colin.i.king@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:45 +02:00
Artem Savkov a611a702c6 bpf: Silence purge_cand_cache build warning.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 29f2e5bd9439445fe14ba8570b1c9a7ad682df84
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Mon Dec 6 17:48:39 2021 -0800

    bpf: Silence purge_cand_cache build warning.

    When CONFIG_DEBUG_INFO_BTF_MODULES is not set
    the following warning can be seen:
    kernel/bpf/btf.c:6588:13: warning: 'purge_cand_cache' defined but not used [-Wunused-function]
    Fix it.

    Fixes: 1e89106da253 ("bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211207014839.6976-1-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:45 +02:00
Artem Savkov 65a903edb8 bpf: Disallow BPF_LOG_KERNEL log level for bpf(BPF_BTF_LOAD)
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 866de407444398bc8140ea70de1dba5f91cc34ac
Author: Hou Tao <houtao1@huawei.com>
Date:   Fri Dec 3 13:30:01 2021 +0800

    bpf: Disallow BPF_LOG_KERNEL log level for bpf(BPF_BTF_LOAD)

    BPF_LOG_KERNEL is only used internally, so disallow bpf_btf_load()
    to set log level as BPF_LOG_KERNEL. The same checking has already
    been done in bpf_check(), so factor out a helper to check the
    validity of log attributes and use it in both places.

    Fixes: 8580ac9404 ("bpf: Process in-kernel BTF")
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20211203053001.740945-1-houtao1@huawei.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:44 +02:00
Artem Savkov bad77b29d8 libbpf: Reduce bpf_core_apply_relo_insn() stack usage.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 78c1f8d0634cc35da613d844eda7c849fc50f643
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Fri Dec 3 10:28:36 2021 -0800

    libbpf: Reduce bpf_core_apply_relo_insn() stack usage.

    Reduce bpf_core_apply_relo_insn() stack usage and bump
    BPF_CORE_SPEC_MAX_LEN limit back to 64.

    Fixes: 29db4bea1d10 ("bpf: Prepare relo_core.c for kernel duty.")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211203182836.16646-1-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:44 +02:00
Artem Savkov 1471e757ae bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 1e89106da25390826608ad6ac0edfb7c9952eff3
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:31 2021 -0800

    bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().

    Given BPF program's BTF root type name perform the following steps:
    . search in vmlinux candidate cache.
    . if (present in cache and candidate list >= 1) return candidate list.
    . do a linear search through kernel BTFs for possible candidates.
    . regardless of number of candidates found populate vmlinux cache.
    . if (candidate list >= 1) return candidate list.
    . search in module candidate cache.
    . if (present in cache) return candidate list (even if list is empty).
    . do a linear search through BTFs of all kernel modules
      collecting candidates from all of them.
    . regardless of number of candidates found populate module cache.
    . return candidate list.
    Then wire the result into bpf_core_apply_relo_insn().

    When BPF program is trying to CO-RE relocate a type
    that doesn't exist in either vmlinux BTF or in modules BTFs
    these steps will perform 2 cache lookups when cache is hit.

    Note the cache doesn't prevent the abuse by the program that might
    have lots of relocations that cannot be resolved. Hence cond_resched().

    CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-9-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:43 +02:00
Artem Savkov 1413cea45f bpf: Adjust BTF log size limit.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit c5a2d43e998a821701029f23e25b62f9188e93ff
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:29 2021 -0800

    bpf: Adjust BTF log size limit.

    Make BTF log size limit to be the same as the verifier log size limit.
    Otherwise tools that progressively increase log size and use the same log
    for BTF loading and program loading will be hitting hard to debug EINVAL.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-7-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 77c4b3ac35 bpf: Pass a set of bpf_core_relo-s to prog_load command.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit fbd94c7afcf99c9f3b1ba1168657ecc428eb2c8d
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:28 2021 -0800

    bpf: Pass a set of bpf_core_relo-s to prog_load command.

    struct bpf_core_relo is generated by llvm and processed by libbpf.
    It's a de-facto uapi.
    With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure.
    Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command
    and let the kernel perform CO-RE relocations.

    Note the struct bpf_line_info and struct bpf_func_info have the same
    layout when passed from LLVM to libbpf and from libbpf to the kernel
    except "insn_off" fields means "byte offset" when LLVM generates it.
    Then libbpf converts it to "insn index" to pass to the kernel.
    The struct bpf_core_relo's "insn_off" field is always "byte offset".

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-6-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 540b7ebc57 bpf: Prepare relo_core.c for kernel duty.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Conflicts: merge conflict upstream resolved in be3158290db8

commit 29db4bea1d10b73749d7992c1fc9ac13499e8871
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:26 2021 -0800

    bpf: Prepare relo_core.c for kernel duty.

    Make relo_core.c to be compiled for the kernel and for user space libbpf.

    Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32.
    This is the maximum number of nested structs and arrays.
    For example:
     struct sample {
         int a;
         struct {
             int b[10];
         };
     };

     struct sample *s = ...;
     int *y = &s->b[5];
    This field access is encoded as "0:1:0:5" and spec len is 4.

    The follow up patch might bump it back to 64.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-4-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 92b13fc051 bpf: Rename btf_member accessors.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 8293eb995f349aed28006792cad4cb48091919dd
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Dec 1 10:10:25 2021 -0800

    bpf: Rename btf_member accessors.

    Rename btf_member_bit_offset() and btf_member_bitfield_size() to
    avoid conflicts with similarly named helpers in libbpf's btf.h.
    Rename the kernel helpers, since libbpf helpers are part of uapi.

    Suggested-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211201181040.23337-3-alexei.starovoitov@gmail.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:42 +02:00
Artem Savkov 5cebd099b9 bpf: Introduce btf_tracing_ids
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit d19ddb476a539fd78ad1028ae13bb38506286931
Author: Song Liu <songliubraving@fb.com>
Date:   Fri Nov 12 07:02:43 2021 -0800

    bpf: Introduce btf_tracing_ids

    Similar to btf_sock_ids, btf_tracing_ids provides btf ID for task_struct,
    file, and vm_area_struct via easy to understand format like
    btf_tracing_ids[BTF_TRACING_TYPE_[TASK|file|VMA]].

    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20211112150243.1270987-3-songliubraving@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:37 +02:00
Artem Savkov f06641eab5 bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 9e2ad638ae3632ef916ceb39f70e3104bf8fdc97
Author: Song Liu <songliubraving@fb.com>
Date:   Fri Nov 12 07:02:42 2021 -0800

    bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs

    syzbot reported the following BUG w/o CONFIG_DEBUG_INFO_BTF

    BUG: KASAN: global-out-of-bounds in task_iter_init+0x212/0x2e7 kernel/bpf/task_iter.c:661
    Read of size 4 at addr ffffffff90297404 by task swapper/0/1

    CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.15.0-syzkaller #0
    Hardware name: ... Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    <TASK>
    __dump_stack lib/dump_stack.c:88 [inline]
    dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
    print_address_description.constprop.0.cold+0xf/0x309 mm/kasan/report.c:256
    __kasan_report mm/kasan/report.c:442 [inline]
    kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
    task_iter_init+0x212/0x2e7 kernel/bpf/task_iter.c:661
    do_one_initcall+0x103/0x650 init/main.c:1295
    do_initcall_level init/main.c:1368 [inline]
    do_initcalls init/main.c:1384 [inline]
    do_basic_setup init/main.c:1403 [inline]
    kernel_init_freeable+0x6b1/0x73a init/main.c:1606
    kernel_init+0x1a/0x1d0 init/main.c:1497
    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
    </TASK>

    This is caused by hard-coded name[1] in BTF_ID_LIST_GLOBAL (w/o
    CONFIG_DEBUG_INFO_BTF). Fix this by adding a parameter n to
    BTF_ID_LIST_GLOBAL. This avoids ifdef CONFIG_DEBUG_INFO_BTF in btf.c and
    filter.c.

    Fixes: 7c7e3d31e785 ("bpf: Introduce helper bpf_find_vma")
    Reported-by: syzbot+e0d81ec552a21d9071aa@syzkaller.appspotmail.com
    Reported-by: Eric Dumazet <edumazet@google.com>
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20211112150243.1270987-2-songliubraving@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:36 +02:00
Artem Savkov b90de8b07e bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 8c42d2fa4eeab6c37a0b1b1aa7a2715248ef4f34
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Nov 11 17:26:09 2021 -0800

    bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes

    LLVM patches ([1] for clang, [2] and [3] for BPF backend)
    added support for btf_type_tag attributes. This patch
    added support for the kernel.

    The main motivation for btf_type_tag is to bring kernel
    annotations __user, __rcu etc. to btf. With such information
    available in btf, bpf verifier can detect mis-usages
    and reject the program. For example, for __user tagged pointer,
    developers can then use proper helper like bpf_probe_read_user()
    etc. to read the data.

    BTF_KIND_TYPE_TAG may also useful for other tracing
    facility where instead of to require user to specify
    kernel/user address type, the kernel can detect it
    by itself with btf.

      [1] https://reviews.llvm.org/D111199
      [2] https://reviews.llvm.org/D113222
      [3] https://reviews.llvm.org/D113496

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:36 +02:00
Artem Savkov c083c778ce bpf: Introduce helper bpf_find_vma
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 7c7e3d31e7856a8260a254f8c71db416f7f9f5a1
Author: Song Liu <songliubraving@fb.com>
Date:   Fri Nov 5 16:23:29 2021 -0700

    bpf: Introduce helper bpf_find_vma

    In some profiler use cases, it is necessary to map an address to the
    backing file, e.g., a shared library. bpf_find_vma helper provides a
    flexible way to achieve this. bpf_find_vma maps an address of a task to
    the vma (vm_area_struct) for this address, and feed the vma to an callback
    BPF function. The callback function is necessary here, as we need to
    ensure mmap_sem is unlocked.

    It is necessary to lock mmap_sem for find_vma. To lock and unlock mmap_sem
    safely when irqs are disable, we use the same mechanism as stackmap with
    build_id. Specifically, when irqs are disabled, the unlocked is postponed
    in an irq_work. Refactor stackmap.c so that the irq_work is shared among
    bpf_find_vma and stackmap helpers.

    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/bpf/20211105232330.1936330-2-songliubraving@fb.com

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:34 +02:00
Yauheni Kaliuta 4430625d2e bpf: Fix a btf decl_tag bug when tagging a function
Bugzilla: http://bugzilla.redhat.com/2069045

commit d7e7b42f4f956f2c68ad8cda87d750093dbba737
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Feb 3 11:17:27 2022 -0800

    bpf: Fix a btf decl_tag bug when tagging a function
    
    syzbot reported a btf decl_tag bug with stack trace below:
    
      general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 0 PID: 3592 Comm: syz-executor914 Not tainted 5.16.0-syzkaller-11424-gb7892f7d5cb2 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:btf_type_vlen include/linux/btf.h:231 [inline]
      RIP: 0010:btf_decl_tag_resolve+0x83e/0xaa0 kernel/bpf/btf.c:3910
      ...
      Call Trace:
       <TASK>
       btf_resolve+0x251/0x1020 kernel/bpf/btf.c:4198
       btf_check_all_types kernel/bpf/btf.c:4239 [inline]
       btf_parse_type_sec kernel/bpf/btf.c:4280 [inline]
       btf_parse kernel/bpf/btf.c:4513 [inline]
       btf_new_fd+0x19fe/0x2370 kernel/bpf/btf.c:6047
       bpf_btf_load kernel/bpf/syscall.c:4039 [inline]
       __sys_bpf+0x1cbb/0x5970 kernel/bpf/syscall.c:4679
       __do_sys_bpf kernel/bpf/syscall.c:4738 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:4736 [inline]
       __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:4736
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    The kasan error is triggered with an illegal BTF like below:
       type 0: void
       type 1: int
       type 2: decl_tag to func type 3
       type 3: func to func_proto type 8
    The total number of types is 4 and the type 3 is illegal
    since its func_proto type is out of range.
    
    Currently, the target type of decl_tag can be struct/union, var or func.
    Both struct/union and var implemented their own 'resolve' callback functions
    and hence handled properly in kernel.
    But func type doesn't have 'resolve' callback function. When
    btf_decl_tag_resolve() tries to check func type, it tries to get
    vlen of its func_proto type, which triggered the above kasan error.
    
    To fix the issue, btf_decl_tag_resolve() needs to do btf_func_check()
    before trying to accessing func_proto type.
    In the current implementation, func type is checked with
    btf_func_check() in the main checking function btf_check_all_types().
    To fix the above kasan issue, let us implement 'resolve' callback
    func type properly. The 'resolve' callback will be also called
    in btf_check_all_types() for func types.
    
    Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
    Reported-by: syzbot+53619be9444215e785ed@syzkaller.appspotmail.com
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20220203191727.741862-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:53 +03:00
Yauheni Kaliuta bf40542602 bpf: Fix bpf_check_mod_kfunc_call for built-in modules
Bugzilla: http://bugzilla.redhat.com/2069045

commit b12f031043247b80999bf5e03b8cded3b0b40f8d
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Nov 22 20:17:41 2021 +0530

    bpf: Fix bpf_check_mod_kfunc_call for built-in modules
    
    When module registering its set is built-in, THIS_MODULE will be NULL,
    hence we cannot return early in case owner is NULL.
    
    Fixes: 14f267d95fe4 ("bpf: btf: Introduce helpers for dynamic BTF set registration")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20211122144742.477787-3-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:51 +03:00
Yauheni Kaliuta 3dbb603806 bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
Bugzilla: http://bugzilla.redhat.com/2069045

commit d9847eb8be3d895b2b5f514fdf3885d47a0b92a2
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Mon Nov 22 20:17:40 2021 +0530

    bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
    
    Vinicius Costa Gomes reported [0] that build fails when
    CONFIG_DEBUG_INFO_BTF is enabled and CONFIG_BPF_SYSCALL is disabled.
    This leads to btf.c not being compiled, and then no symbol being present
    in vmlinux for the declarations in btf.h. Since BTF is not useful
    without enabling BPF subsystem, disallow this combination.
    
    However, theoretically disabling both now could still fail, as the
    symbol for kfunc_btf_id_list variables is not available. This isn't a
    problem as the compiler usually optimizes the whole register/unregister
    call, but at lower optimization levels it can fail the build in linking
    stage.
    
    Fix that by adding dummy variables so that modules taking address of
    them still work, but the whole thing is a noop.
    
      [0]: https://lore.kernel.org/bpf/20211110205418.332403-1-vinicius.gomes@intel.com
    
    Fixes: 14f267d95fe4 ("bpf: btf: Introduce helpers for dynamic BTF set registration")
    Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20211122144742.477787-2-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:51 +03:00
Yauheni Kaliuta fe7b5bf2d3 bpf: Add BTF_KIND_DECL_TAG typedef support
Bugzilla: http://bugzilla.redhat.com/2069045

commit bd16dee66ae4de3f1726c69ac901d2b7a53b0c86
Author: Yonghong Song <yhs@fb.com>
Date:   Thu Oct 21 12:56:28 2021 -0700

    bpf: Add BTF_KIND_DECL_TAG typedef support
    
    The llvm patches ([1], [2]) added support to attach btf_decl_tag
    attributes to typedef declarations. This patch added
    support in kernel.
    
      [1] https://reviews.llvm.org/D110127
      [2] https://reviews.llvm.org/D112259
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211021195628.4018847-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:45 +03:00
Yauheni Kaliuta fdbdd94ffe bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Bugzilla: http://bugzilla.redhat.com/2069045

commit 223f903e9c832699f4e5f422281a60756c1c6cfe
Author: Yonghong Song <yhs@fb.com>
Date:   Tue Oct 12 09:48:38 2021 -0700

    bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
    
    Patch set [1] introduced BTF_KIND_TAG to allow tagging
    declarations for struct/union, struct/union field, var, func
    and func arguments and these tags will be encoded into
    dwarf. They are also encoded to btf by llvm for the bpf target.
    
    After BTF_KIND_TAG is introduced, we intended to use it
    for kernel __user attributes. But kernel __user is actually
    a type attribute. Upstream and internal discussion showed
    it is not a good idea to mix declaration attribute and
    type attribute. So we proposed to introduce btf_type_tag
    as a type attribute and existing btf_tag renamed to
    btf_decl_tag ([2]).
    
    This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
    other declarations with *_tag to *_decl_tag to make it clear
    the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
    might be introduced per [3].
    
     [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
     [2] https://reviews.llvm.org/D111588
     [3] https://reviews.llvm.org/D111199
    
    Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
    Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG")
    Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG")
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:42 +03:00
Yauheni Kaliuta da9f7998d2 bpf: selftests: Add selftests for module kfunc support
Bugzilla: http://bugzilla.redhat.com/2069045

commit c48e51c8b07aba8a18125221cb67a40cb1256bf2
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Oct 2 06:47:57 2021 +0530

    bpf: selftests: Add selftests for module kfunc support
    
    This adds selftests that tests the success and failure path for modules
    kfuncs (in presence of invalid kfunc calls) for both libbpf and
    gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can
    add module BTF ID set from bpf_testmod.
    
    This also introduces  a couple of test cases to verifier selftests for
    validating whether we get an error or not depending on if invalid kfunc
    call remains after elimination of unreachable instructions.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:38 +03:00
Yauheni Kaliuta 1633f5d5a1 bpf: Enable TCP congestion control kfunc from modules
Bugzilla: http://bugzilla.redhat.com/2069045

commit 0e32dfc80bae53b05e9eda7eaf259f30ab9ba43a
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Oct 2 06:47:53 2021 +0530

    bpf: Enable TCP congestion control kfunc from modules
    
    This commit moves BTF ID lookup into the newly added registration
    helper, in a way that the bbr, cubic, and dctcp implementation set up
    their sets in the bpf_tcp_ca kfunc_btf_set list, while the ones not
    dependent on modules are looked up from the wrapper function.
    
    This lifts the restriction for them to be compiled as built in objects,
    and can be loaded as modules if required. Also modify Makefile.modfinal
    to call resolve_btfids for each module.
    
    Note that since kernel kfunc_ids never overlap with module kfunc_ids, we
    only match the owner for module btf id sets.
    
    See following commits for background on use of:
    
     CONFIG_X86 ifdef:
     569c484f99 (bpf: Limit static tcp-cc functions in the .BTF_ids list to x86)
    
     CONFIG_DYNAMIC_FTRACE ifdef:
     7aae231ac9 (bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE)
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211002011757.311265-6-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:38 +03:00
Yauheni Kaliuta 8dfb448993 bpf: btf: Introduce helpers for dynamic BTF set registration
Bugzilla: http://bugzilla.redhat.com/2069045

commit 14f267d95fe4b08831a022c8e15a2eb8991edbf6
Author: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Date:   Sat Oct 2 06:47:51 2021 +0530

    bpf: btf: Introduce helpers for dynamic BTF set registration
    
    This adds helpers for registering btf_id_set from modules and the
    bpf_check_mod_kfunc_call callback that can be used to look them up.
    
    With in kernel sets, the way this is supposed to work is, in kernel
    callback looks up within the in-kernel kfunc whitelist, and then defers
    to the dynamic BTF set lookup if it doesn't find the BTF id. If there is
    no in-kernel BTF id set, this callback can be used directly.
    
    Also fix includes for btf.h and bpfptr.h so that they can included in
    isolation. This is in preparation for their usage in tcp_bbr, tcp_cubic
    and tcp_dctcp modules in the next patch.
    
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20211002011757.311265-4-memxor@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:23:38 +03:00
Yauheni Kaliuta df49efeb8a bpf: Support for new btf kind BTF_KIND_TAG
Bugzilla: http://bugzilla.redhat.com/2069045

commit b5ea834dde6b6e7f75e51d5f66dac8cd7c97b5ef
Author: Yonghong Song <yhs@fb.com>
Date:   Tue Sep 14 15:30:15 2021 -0700

    bpf: Support for new btf kind BTF_KIND_TAG
    
    LLVM14 added support for a new C attribute ([1])
      __attribute__((btf_tag("arbitrary_str")))
    This attribute will be emitted to dwarf ([2]) and pahole
    will convert it to BTF. Or for bpf target, this
    attribute will be emitted to BTF directly ([3], [4]).
    The attribute is intended to provide additional
    information for
      - struct/union type or struct/union member
      - static/global variables
      - static/global function or function parameter.
    
    For linux kernel, the btf_tag can be applied
    in various places to specify user pointer,
    function pre- or post- condition, function
    allow/deny in certain context, etc. Such information
    will be encoded in vmlinux BTF and can be used
    by verifier.
    
    The btf_tag can also be applied to bpf programs
    to help global verifiable functions, e.g.,
    specifying preconditions, etc.
    
    This patch added basic parsing and checking support
    in kernel for new BTF_KIND_TAG kind.
    
     [1] https://reviews.llvm.org/D106614
     [2] https://reviews.llvm.org/D106621
     [3] https://reviews.llvm.org/D106622
     [4] https://reviews.llvm.org/D109560
    
    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-06-03 17:16:12 +03:00
Jerome Marchand 70ca1bb90e bpf: Fix bpf-next builds without CONFIG_BPF_EVENTS
Bugzilla: http://bugzilla.redhat.com/2041365

commit eb529c5b10b9401a0f2d1f469e82c6a0ba98082c
Author: Daniel Xu <dxu@dxuuu.xyz>
Date:   Wed Aug 25 18:48:31 2021 -0700

    bpf: Fix bpf-next builds without CONFIG_BPF_EVENTS

    This commit fixes linker errors along the lines of:

        s390-linux-ld: task_iter.c:(.init.text+0xa4): undefined reference to `btf_task_struct_ids'`

    Fix by defining btf_task_struct_ids unconditionally in kernel/bpf/btf.c
    since there exists code that unconditionally uses btf_task_struct_ids.

    Reported-by: kernel test robot <lkp@intel.com>
    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/05d94748d9f4b3eecedc4fddd6875418a396e23c.1629942444.git.dxu@dxuuu.xyz

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-04-29 18:17:12 +02:00
Jerome Marchand 5f52b0fe98 bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
Bugzilla: http://bugzilla.redhat.com/2041365

commit d36216429ff3e69db4f6ea5e0c86b80010f5f30b
Author: Yonghong Song <yhs@fb.com>
Date:   Wed Jul 28 11:30:25 2021 -0700

    bpf: Emit better log message if bpf_iter ctx arg btf_id == 0

    To avoid kernel build failure due to some missing .BTF-ids referenced
    functions/types, the patch ([1]) tries to fill btf_id 0 for
    these types.

    In bpf verifier, for percpu variable and helper returning btf_id cases,
    verifier already emitted proper warning with something like
      verbose(env, "Helper has invalid btf_id in R%d\n", regno);
      verbose(env, "invalid return type %d of func %s#%d\n",
              fn->ret_type, func_id_name(func_id), func_id);

    But this is not the case for bpf_iter context arguments.
    I hacked resolve_btfids to encode btf_id 0 for struct task_struct.
    With `./test_progs -n 7/5`, I got,
      0: (79) r2 = *(u64 *)(r1 +0)
      func 'bpf_iter_task' arg0 has btf_id 29739 type STRUCT 'bpf_iter_meta'
      ; struct seq_file *seq = ctx->meta->seq;
      1: (79) r6 = *(u64 *)(r2 +0)
      ; struct task_struct *task = ctx->task;
      2: (79) r7 = *(u64 *)(r1 +8)
      ; if (task == (void *)0) {
      3: (55) if r7 != 0x0 goto pc+11
      ...
      ; BPF_SEQ_PRINTF(seq, "%8d %8d\n", task->tgid, task->pid);
      26: (61) r1 = *(u32 *)(r7 +1372)
      Type '(anon)' is not a struct

    Basically, verifier will return btf_id 0 for task_struct.
    Later on, when the code tries to access task->tgid, the
    verifier correctly complains the type is '(anon)' and it is
    not a struct. Users still need to backtrace to find out
    what is going on.

    Let us catch the invalid btf_id 0 earlier
    and provide better message indicating btf_id is wrong.
    The new error message looks like below:
      R1 type=ctx expected=fp
      ; struct seq_file *seq = ctx->meta->seq;
      0: (79) r2 = *(u64 *)(r1 +0)
      func 'bpf_iter_task' arg0 has btf_id 29739 type STRUCT 'bpf_iter_meta'
      ; struct seq_file *seq = ctx->meta->seq;
      1: (79) r6 = *(u64 *)(r2 +0)
      ; struct task_struct *task = ctx->task;
      2: (79) r7 = *(u64 *)(r1 +8)
      invalid btf_id for context argument offset 8
      invalid bpf_context access off=8 size=8

    [1] https://lore.kernel.org/bpf/20210727132532.2473636-1-hengqi.chen@gmail.com/

    Signed-off-by: Yonghong Song <yhs@fb.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20210728183025.1461750-1-yhs@fb.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-04-29 18:14:36 +02:00
Jerome Marchand 103c5a16ea bpf: Add map side support for bpf timers.
Bugzilla: http://bugzilla.redhat.com/2041365

commit 68134668c17f31f51930478f75495b552a411550
Author: Alexei Starovoitov <ast@kernel.org>
Date:   Wed Jul 14 17:54:10 2021 -0700

    bpf: Add map side support for bpf timers.

    Restrict bpf timers to array, hash (both preallocated and kmalloced), and
    lru map types. The per-cpu maps with timers don't make sense, since 'struct
    bpf_timer' is a part of map value. bpf timers in per-cpu maps would mean that
    the number of timers depends on number of possible cpus and timers would not be
    accessible from all cpus. lpm map support can be added in the future.
    The timers in inner maps are supported.

    The bpf_map_update/delete_elem() helpers and sys_bpf commands cancel and free
    bpf_timer in a given map element.

    Similar to 'struct bpf_spin_lock' BTF is required and it is used to validate
    that map element indeed contains 'struct bpf_timer'.

    Make check_and_init_map_value() init both bpf_spin_lock and bpf_timer when
    map element data is reused in preallocated htab and lru maps.

    Teach copy_map_value() to support both bpf_spin_lock and bpf_timer in a single
    map element. There could be one of each, but not more than one. Due to 'one
    bpf_timer in one element' restriction do not support timers in global data,
    since global data is a map of single element, but from bpf program side it's
    seen as many global variables and restriction of single global timer would be
    odd. The sys_bpf map_freeze and sys_mmap syscalls are not allowed on maps with
    timers, since user space could have corrupted mmap element and crashed the
    kernel. The maps with timers cannot be readonly. Due to these restrictions
    search for bpf_timer in datasec BTF in case it was placed in the global data to
    report clear error.

    The previous patch allowed 'struct bpf_timer' as a first field in a map
    element only. Relax this restriction.

    Refactor lru map to s/bpf_lru_push_free/htab_lru_push_free/ to cancel and free
    the timer when lru map deletes an element as a part of it eviction algorithm.

    Make sure that bpf program cannot access 'struct bpf_timer' via direct load/store.
    The timer operation are done through helpers only.
    This is similar to 'struct bpf_spin_lock'.

    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Yonghong Song <yhs@fb.com>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Link: https://lore.kernel.org/bpf/20210715005417.78572-5-alexei.starovoitov@gmail.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-04-29 18:14:31 +02:00
David S. Miller a52171ae7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-06-17

The following pull-request contains BPF updates for your *net-next* tree.

We've added 50 non-merge commits during the last 25 day(s) which contain
a total of 148 files changed, 4779 insertions(+), 1248 deletions(-).

The main changes are:

1) BPF infrastructure to migrate TCP child sockets from a listener to another
   in the same reuseport group/map, from Kuniyuki Iwashima.

2) Add a provably sound, faster and more precise algorithm for tnum_mul() as
   noted in https://arxiv.org/abs/2105.05398, from Harishankar Vishwanathan.

3) Streamline error reporting changes in libbpf as planned out in the
   'libbpf: the road to v1.0' effort, from Andrii Nakryiko.

4) Add broadcast support to xdp_redirect_map(), from Hangbin Liu.

5) Extends bpf_map_lookup_and_delete_elem() functionality to 4 more map
   types, that is, {LRU_,PERCPU_,LRU_PERCPU_,}HASH, from Denis Salopek.

6) Support new LLVM relocations in libbpf to make them more linker friendly,
   also add a doc to describe the BPF backend relocations, from Yonghong Song.

7) Silence long standing KUBSAN complaints on register-based shifts in
   interpreter, from Daniel Borkmann and Eric Biggers.

8) Add dummy PT_REGS macros in libbpf to fail BPF program compilation when
   target arch cannot be determined, from Lorenz Bauer.

9) Extend AF_XDP to support large umems with 1M+ pages, from Magnus Karlsson.

10) Fix two minor libbpf tc BPF API issues, from Kumar Kartikeya Dwivedi.

11) Move libbpf BPF_SEQ_PRINTF/BPF_SNPRINTF macros that can be used by BPF
    programs to bpf_helpers.h header, from Florent Revest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 11:54:56 -07:00
Jakub Kicinski 5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Zhen Lei 8fb33b6055 bpf: Fix spelling mistakes
Fix some spelling mistakes in comments:
aother ==> another
Netiher ==> Neither
desribe ==> describe
intializing ==> initializing
funciton ==> function
wont ==> won't and move the word 'the' at the end to the next line
accross ==> across
pathes ==> paths
triggerred ==> triggered
excute ==> execute
ether ==> either
conervative ==> conservative
convetion ==> convention
markes ==> marks
interpeter ==> interpreter

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210525025659.8898-2-thunder.leizhen@huawei.com
2021-05-24 21:13:05 -07:00
Alexei Starovoitov 3d78417b60 bpf: Add bpf_btf_find_by_name_kind() helper.
Add new helper:
long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags)
Description
	Find BTF type with given name and kind in vmlinux BTF or in module's BTFs.
Return
	Returns btf_id and btf_obj_fd in lower and upper 32 bits.

It will be used by loader program to find btf_id to attach the program to
and to find btf_ids of ksyms.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-10-alexei.starovoitov@gmail.com
2021-05-19 00:33:40 +02:00
Alexei Starovoitov c571bd752e bpf: Make btf_load command to be bpfptr_t compatible.
Similar to prog_load make btf_load command to be availble to
bpf_prog_type_syscall program.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-7-alexei.starovoitov@gmail.com
2021-05-19 00:33:40 +02:00
Jiri Olsa 31379397dc bpf: Forbid trampoline attach for functions with variable arguments
We can't currently allow to attach functions with variable arguments.
The problem is that we should save all the registers for arguments,
which is probably doable, but if caller uses more than 6 arguments,
we need stack data, which will be wrong, because of the extra stack
frame we do in bpf trampoline, so we could crash.

Also currently there's malformed trampoline code generated for such
functions at the moment as described in:

  https://lore.kernel.org/bpf/20210429212834.82621-1-jolsa@kernel.org/

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210505132529.401047-1-jolsa@kernel.org
2021-05-07 01:28:28 +02:00
Colin Ian King 235fc0e36d bpf: Remove redundant assignment of variable id
The variable id is being assigned a value that is never read, the
assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210326194348.623782-1-colin.king@canonical.com
2021-03-30 22:58:53 +02:00
Martin KaFai Lau e6ac2450d6 bpf: Support bpf program calling kernel function
This patch adds support to BPF verifier to allow bpf program calling
kernel function directly.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation,  For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be white listed
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The white listed functions are not bounded to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs which have to be adjusted accordingly.

This patch is to make the required changes in the bpf verifier.

First change is in btf.c, it adds a case in "btf_check_func_arg_match()".
When the passed in "btf->kernel_btf == true", it means matching the
verifier regs' states with a kernel function.  This will handle the
PTR_TO_BTF_ID reg.  It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET,
and PTR_TO_TCP_SOCK to its kernel's btf_id.

In the later libbpf patch, the insn calling a kernel function will
look like:

insn->code == (BPF_JMP | BPF_CALL)
insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
insn->imm == func_btf_id /* btf_id of the running kernel */

[ For the future calling function-in-kernel-module support, an array
  of module btf_fds can be passed at the load time and insn->off
  can be used to index into this array. ]

At the early stage of verifier, the verifier will collect all kernel
function calls into "struct bpf_kfunc_desc".  Those
descriptors are stored in "prog->aux->kfunc_tab" and will
be available to the JIT.  Since this "add" operation is similar
to the current "add_subprog()" and looking for the same insn->code,
they are done together in the new "add_subprog_and_kfunc()".

In the "do_check()" stage, the new "check_kfunc_call()" is added
to verify the kernel function call instruction:
1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
   A new bpf_verifier_ops "check_kfunc_call" is added to do that.
   The bpf-tcp-cc struct_ops program will implement this function in
   a later patch.
2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
   used as the args of a kernel function.
3. Mark the regs' type, subreg_def, and zext_dst.

At the later do_misc_fixups() stage, the new fixup_kfunc_call()
will replace the insn->imm with the function address (relative
to __bpf_call_base).  If needed, the jit can find the btf_func_model
by calling the new bpf_jit_find_kfunc_model(prog, insn).
With the imm set to the function address, "bpftool prog dump xlated"
will be able to display the kernel function calls the same way as
it displays other bpf helper calls.

gpl_compatible program is required to call kernel function.

This feature currently requires JIT.

The verifier selftests are adjusted because of the changes in
the verbose log in add_subprog_and_kfunc().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
2021-03-26 20:41:51 -07:00
Martin KaFai Lau 34747c4120 bpf: Refactor btf_check_func_arg_match
This patch moved the subprog specific logic from
btf_check_func_arg_match() to the new btf_check_subprog_arg_match().
The core logic is left in btf_check_func_arg_match() which
will be reused later to check the kernel function call.

The "if (!btf_type_is_ptr(t))" is checked first to improve the
indentation which will be useful for a later patch.

Some of the "btf_kind_str[]" usages is replaced with the shortcut
"btf_type_str(t)".

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015136.1544504-1-kafai@fb.com
2021-03-26 20:41:50 -07:00
David S. Miller c1acda9807 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-03-09

The following pull-request contains BPF updates for your *net-next* tree.

We've added 90 non-merge commits during the last 17 day(s) which contain
a total of 114 files changed, 5158 insertions(+), 1288 deletions(-).

The main changes are:

1) Faster bpf_redirect_map(), from Björn.

2) skmsg cleanup, from Cong.

3) Support for floating point types in BTF, from Ilya.

4) Documentation for sys_bpf commands, from Joe.

5) Support for sk_lookup in bpf_prog_test_run, form Lorenz.

6) Enable task local storage for tracing programs, from Song.

7) bpf_for_each_map_elem() helper, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 18:07:05 -08:00
Ilya Leoshkevich b1828f0b04 bpf: Add BTF_KIND_FLOAT support
On the kernel side, introduce a new btf_kind_operations. It is
similar to that of BTF_KIND_INT, however, it does not need to
handle encodings and bit offsets. Do not implement printing, since
the kernel does not know how to format floating-point values.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210226202256.116518-7-iii@linux.ibm.com
2021-03-04 17:58:16 -08:00
Dmitrii Banshchikov 523a4cf491 bpf: Use MAX_BPF_FUNC_REG_ARGS macro
Instead of using integer literal here and there use macro name for
better context.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210225202629.585485-1-me@ubique.spb.ru
2021-02-26 11:59:53 -08:00
Dmitrii Banshchikov f4eda8b6e4 bpf: Drop imprecise log message
Now it is possible for global function to have a pointer argument that
points to something different than struct. Drop the irrelevant log
message and keep the logic same.

Fixes: e5069b9c23 ("bpf: Support pointers in global func args")
Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210223090416.333943-1-me@ubique.spb.ru
2021-02-24 16:43:39 +01:00
Dmitrii Banshchikov e5069b9c23 bpf: Support pointers in global func args
Add an ability to pass a pointer to a type with known size in arguments
of a global function. Such pointers may be used to overcome the limit on
the maximum number of arguments, avoid expensive and tricky workarounds
and to have multiple output arguments.

A referenced type may contain pointers but indirect access through them
isn't supported.

The implementation consists of two parts.  If a global function has an
argument that is a pointer to a type with known size then:

  1) In btf_check_func_arg_match(): check that the corresponding
register points to NULL or to a valid memory region that is large enough
to contain the expected argument's type.

  2) In btf_prepare_func_args(): set the corresponding register type to
PTR_TO_MEM_OR_NULL and its size to the size of the expected type.

Only global functions are supported because allowance of pointers for
static functions might break validation. Consider the following
scenario. A static function has a pointer argument. A caller passes
pointer to its stack memory. Because the callee can change referenced
memory verifier cannot longer assume any particular slot type of the
caller's stack memory hence the slot type is changed to SLOT_MISC.  If
there is an operation that relies on slot type other than SLOT_MISC then
verifier won't be able to infer safety of the operation.

When verifier sees a static function that has a pointer argument
different from PTR_TO_CTX then it skips arguments check and continues
with "inline" validation with more information available. The operation
that relies on the particular slot type now succeeds.

Because global functions were not allowed to have pointer arguments
different from PTR_TO_CTX it's not possible to break existing and valid
code.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212205642.620788-4-me@ubique.spb.ru
2021-02-12 17:37:23 -08:00
Dmitrii Banshchikov feb4adfad5 bpf: Rename bpf_reg_state variables
Using "reg" for an array of bpf_reg_state and "reg[i + 1]" for an
individual bpf_reg_state is error-prone and verbose. Use "regs" for the
former and "reg" for the latter as other code nearby does.

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212205642.620788-2-me@ubique.spb.ru
2021-02-12 17:37:23 -08:00
Yonghong Song 13ca51d5eb bpf: Permit size-0 datasec
llvm patch https://reviews.llvm.org/D84002 permitted
to emit empty rodata datasec if the elf .rodata section
contains read-only data from local variables. These
local variables will be not emitted as BTF_KIND_VARs
since llvm converted these local variables as
static variables with private linkage without debuginfo
types. Such an empty rodata datasec will make
skeleton code generation easy since for skeleton
a rodata struct will be generated if there is a
.rodata elf section. The existence of a rodata
btf datasec is also consistent with the existence
of a rodata map created by libbpf.

The btf with such an empty rodata datasec will fail
in the kernel though as kernel will reject a datasec
with zero vlen and zero size. For example, for the below code,
    int sys_enter(void *ctx)
    {
       int fmt[6] = {1, 2, 3, 4, 5, 6};
       int dst[6];

       bpf_probe_read(dst, sizeof(dst), fmt);
       return 0;
    }
We got the below btf (bpftool btf dump ./test.o):
    [1] PTR '(anon)' type_id=0
    [2] FUNC_PROTO '(anon)' ret_type_id=3 vlen=1
            'ctx' type_id=1
    [3] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [4] FUNC 'sys_enter' type_id=2 linkage=global
    [5] INT 'char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED
    [6] ARRAY '(anon)' type_id=5 index_type_id=7 nr_elems=4
    [7] INT '__ARRAY_SIZE_TYPE__' size=4 bits_offset=0 nr_bits=32 encoding=(none)
    [8] VAR '_license' type_id=6, linkage=global-alloc
    [9] DATASEC '.rodata' size=0 vlen=0
    [10] DATASEC 'license' size=0 vlen=1
            type_id=8 offset=0 size=4
When loading the ./test.o to the kernel with bpftool,
we see the following error:
    libbpf: Error loading BTF: Invalid argument(22)
    libbpf: magic: 0xeb9f
    ...
    [6] ARRAY (anon) type_id=5 index_type_id=7 nr_elems=4
    [7] INT __ARRAY_SIZE_TYPE__ size=4 bits_offset=0 nr_bits=32 encoding=(none)
    [8] VAR _license type_id=6 linkage=1
    [9] DATASEC .rodata size=24 vlen=0 vlen == 0
    libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.

Basically, libbpf changed .rodata datasec size to 24 since elf .rodata
section size is 24. The kernel then rejected the BTF since vlen = 0.
Note that the above kernel verifier failure can be worked around with
changing local variable "fmt" to a static or global, optionally const, variable.

This patch permits a datasec with vlen = 0 in kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210119153519.3901963-1-yhs@fb.com
2021-01-20 14:14:09 -08:00
Jakub Kicinski 0fe2f273ab Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/can/dev.c
  commit 03f16c5075 ("can: dev: can_restart: fix use after free bug")
  commit 3e77f70e73 ("can: dev: move driver related infrastructure into separate subdir")

  Code move.

drivers/net/dsa/b53/b53_common.c
 commit 8e4052c32d ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
 commit b7a9e0da2d ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")

 Field rename.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20 12:16:11 -08:00
Andrii Nakryiko 541c3bad8d bpf: Support BPF ksym variables in kernel modules
Add support for directly accessing kernel module variables from BPF programs
using special ldimm64 instructions. This functionality builds upon vmlinux
ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
specifying kernel module BTF's FD in insn[1].imm field.

During BPF program load time, verifier will resolve FD to BTF object and will
take reference on BTF object itself and, for module BTFs, corresponding module
as well, to make sure it won't be unloaded from under running BPF program. The
mechanism used is similar to how bpf_prog keeps track of used bpf_maps.

One interesting change is also in how per-CPU variable is determined. The
logic is to find .data..percpu data section in provided BTF, but both vmlinux
and module each have their own .data..percpu entries in BTF. So for module's
case, the search for DATASEC record needs to look at only module's added BTF
types. This is implemented with custom search function.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20210112075520.4103414-6-andrii@kernel.org
2021-01-12 17:24:30 -08:00
Andrii Nakryiko bcc5e6162d bpf: Allow empty module BTFs
Some modules don't declare any new types and end up with an empty BTF,
containing only valid BTF header and no types or strings sections. This
currently causes BTF validation error. There is nothing wrong with such BTF,
so fix the issue by allowing module BTFs with no types or strings.

Fixes: 36e68442d1 ("bpf: Load and verify kernel module BTFs")
Reported-by: Christopher William Snowhill <chris@kode54.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210110070341.1380086-1-andrii@kernel.org
2021-01-12 21:11:30 +01:00
Andrii Nakryiko 290248a5b7 bpf: Allow to specify kernel module BTFs when attaching BPF programs
Add ability for user-space programs to specify non-vmlinux BTF when attaching
BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this,
attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD
of a module or vmlinux BTF object. For backwards compatibility reasons,
0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org
2020-12-03 17:38:21 -08:00
Andrii Nakryiko 22dc4a0f5e bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
Remove a permeating assumption thoughout BPF verifier of vmlinux BTF. Instead,
wherever BTF type IDs are involved, also track the instance of struct btf that
goes along with the type ID. This allows to gradually add support for kernel
module BTFs and using/tracking module types across BPF helper calls and
registers.

This patch also renames btf_id() function to btf_obj_id() to minimize naming
clash with using btf_id to denote BTF *type* ID, rather than BTF *object*'s ID.

Also, altough btf_vmlinux can't get destructed and thus doesn't need
refcounting, module BTFs need that, so apply BTF refcounting universally when
BPF program is using BTF-powered attachment (tp_btf, fentry/fexit, etc). This
makes for simpler clean up code.

Now that BTF type ID is not enough to uniquely identify a BTF type, extend BPF
trampoline key to include BTF object ID. To differentiate that from target
program BPF ID, set 31st bit of type ID. BTF type IDs (at least currently) are
not allowed to take full 32 bits, so there is no danger of confusing that bit
with a valid BTF type ID.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-10-andrii@kernel.org
2020-12-03 17:38:21 -08:00
Andrii Nakryiko 7112d12798 bpf: Compile out btf_parse_module() if module BTF is not enabled
Make sure btf_parse_module() is compiled out if module BTFs are not enabled.

Fixes: 36e68442d1 ("bpf: Load and verify kernel module BTFs")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201111040645.903494-1-andrii@kernel.org
2020-11-10 20:15:07 -08:00
Andrii Nakryiko 36e68442d1 bpf: Load and verify kernel module BTFs
Add kernel module listener that will load/validate and unload module BTF.
Module BTFs gets ID generated for them, which makes it possible to iterate
them with existing BTF iteration API. They are given their respective module's
names, which will get reported through GET_OBJ_INFO API. They are also marked
as in-kernel BTFs for tooling to distinguish them from user-provided BTFs.

Also, similarly to vmlinux BTF, kernel module BTFs are exposed through
sysfs as /sys/kernel/btf/<module-name>. This is convenient for user-space
tools to inspect module BTF contents and dump their types with existing tools:

[vmuser@archvm bpf]$ ls -la /sys/kernel/btf
total 0
drwxr-xr-x  2 root root       0 Nov  4 19:46 .
drwxr-xr-x 13 root root       0 Nov  4 19:46 ..

...

-r--r--r--  1 root root     888 Nov  4 19:46 irqbypass
-r--r--r--  1 root root  100225 Nov  4 19:46 kvm
-r--r--r--  1 root root   35401 Nov  4 19:46 kvm_intel
-r--r--r--  1 root root     120 Nov  4 19:46 pcspkr
-r--r--r--  1 root root     399 Nov  4 19:46 serio_raw
-r--r--r--  1 root root 4094095 Nov  4 19:46 vmlinux

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org
2020-11-10 15:25:53 -08:00
Andrii Nakryiko 5329722057 bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO
Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF
objects in the system. To allow distinguishing vmlinux BTF (and later kernel
module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as
BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module
BTF).  We might want to later allow specifying BTF name for user-provided BTFs
as well, if that makes sense. But currently this is reserved only for
in-kernel BTFs.

Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require
in-kernel BTF type with ability to specify BTF types from kernel modules, not
just vmlinux BTF. This will be implemented in a follow up patch set for
fentry/fexit/fmod_ret/lsm/etc.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org
2020-11-10 15:25:53 -08:00
Andrii Nakryiko 951bb64621 bpf: Add in-kernel split BTF support
Adjust in-kernel BTF implementation to support a split BTF mode of operation.
Changes are mostly mirroring libbpf split BTF changes, with the exception of
start_id being 0 for in-kernel implementation due to simpler read-only mode.

Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
where necessary, is encapsulated in few helper functions. Type numbering and
string offset in a split BTF are logically continuing where base BTF ends, so
most of the high-level logic is kept without changes.

Type verification and size resolution is only doing an added resolution of new
split BTF types and relies on already cached size and type resolution results
in the base BTF.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-2-andrii@kernel.org
2020-11-10 15:25:53 -08:00
Wang Qing 666475ccbf bpf, btf: Remove the duplicate btf_ids.h include
Remove duplicate btf_ids.h header which is included twice.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1604736650-11197-1-git-send-email-wangqing@vivo.com
2020-11-10 00:05:18 +01:00
Hao Luo eaa6bcb71e bpf: Introduce bpf_per_cpu_ptr()
Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
except that it may return NULL. This happens when the cpu parameter is
out of range. So the caller must check the returned value.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-5-haoluo@google.com
2020-10-02 15:00:49 -07:00
Hao Luo 4976b718c3 bpf: Introduce pseudo_btf_id
Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a
ksym so that further dereferences on the ksym can use the BTF info
to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
the verifier reads the btf_id stored in the insn[0]'s imm field and
marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND,
which is encoded in btf_vminux by pahole. If the VAR is not of a struct
type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
and the mem_size is resolved to the size of the VAR's type.

>From the VAR btf_id, the verifier can also read the address of the
ksym's corresponding kernel var from kallsyms and use that to fill
dst_reg.

Therefore, the proper functionality of pseudo_btf_id depends on (1)
kallsyms and (2) the encoding of kernel global VARs in pahole, which
should be available since pahole v1.18.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200929235049.2533242-2-haoluo@google.com
2020-10-02 14:59:25 -07:00
Toke Høiland-Jørgensen 43bc2874e7 bpf: Fix context type resolving for extension programs
Eelco reported we can't properly access arguments if the tracing
program is attached to extension program.

Having following program:

  SEC("classifier/test_pkt_md_access")
  int test_pkt_md_access(struct __sk_buff *skb)

with its extension:

  SEC("freplace/test_pkt_md_access")
  int test_pkt_md_access_new(struct __sk_buff *skb)

and tracing that extension with:

  SEC("fentry/test_pkt_md_access_new")
  int BPF_PROG(fentry, struct sk_buff *skb)

It's not possible to access skb argument in the fentry program,
with following error from verifier:

  ; int BPF_PROG(fentry, struct sk_buff *skb)
  0: (79) r1 = *(u64 *)(r1 +0)
  invalid bpf_context access off=0 size=8

The problem is that btf_ctx_access gets the context type for the
traced program, which is in this case the extension.

But when we trace extension program, we want to get the context
type of the program that the extension is attached to, so we can
access the argument properly in the trace program.

This version of the patch is tweaked slightly from Jiri's original one,
since the refactoring in the previous patches means we have to get the
target prog type from the new variable in prog->aux instead of directly
from the target prog.

Reported-by: Eelco Chaudron <echaudro@redhat.com>
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/160138355278.48470.17057040257274725638.stgit@toke.dk
2020-09-29 13:09:24 -07:00
Toke Høiland-Jørgensen 3aac1ead5e bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach
In preparation for allowing multiple attachments of freplace programs, move
the references to the target program and trampoline into the
bpf_tracing_link structure when that is created. To do this atomically,
introduce a new mutex in prog->aux to protect writing to the two pointers
to target prog and trampoline, and rename the members to make it clear that
they are related.

With this change, it is no longer possible to attach the same tracing
program multiple times (detaching in-between), since the reference from the
tracing program to the target disappears on the first attach. However,
since the next patch will let the caller supply an attach target, that will
also make it possible to attach to the same place multiple times.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/160138355059.48470.2503076992210324984.stgit@toke.dk
2020-09-29 13:09:23 -07:00
Alan Maguire eb411377ae bpf: Add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data
structures using vmlinux BTF.  Its signature is

long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr,
                        u32 btf_ptr_size, u64 flags);

Flags and struct btf_ptr definitions/use are identical to the
bpf_snprintf_btf helper, and the helper returns 0 on success
or a negative error value.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire 31d0bc8163 bpf: Move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support
a generic show callback of which we support two instances; the
current seq file show, and a show with snprintf() behaviour which
instead writes the type data to a supplied string.

Both classes of show function call btf_type_show() with different
targets; the seq file or the string to be written.  In the string
case we need to track additional data - length left in string to write
and length to return that we would have written (a la snprintf).

By default show will display type information, field members and
their types and values etc, and the information is indented
based upon structure depth. Zeroed fields are omitted.

Show however supports flags which modify its behaviour:

BTF_SHOW_COMPACT - suppress newline/indent.
BTF_SHOW_NONAME - suppress show of type and member names.
BTF_SHOW_PTR_RAW - do not obfuscate pointer values.
BTF_SHOW_UNSAFE - do not copy data to safe buffer before display.
BTF_SHOW_ZERO - show zeroed values (by default they are not shown).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-3-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Toke Høiland-Jørgensen efc68158c4 bpf: change logging calls from verbose() to bpf_log() and use log pointer
In preparation for moving code around, change a bunch of references to
env->log (and the verbose() logging helper) to use bpf_log() and a direct
pointer to struct bpf_verifier_log. While we're touching the function
signature, mark the 'prog' argument to bpf_check_type_match() as const.

Also enhance the bpf_verifier_log_needed() check to handle NULL pointers
for the log struct so we can re-use the code with logging disabled.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 17:09:59 -07:00
Lorenz Bauer 9436ef6e86 bpf: Allow specifying a BTF ID per argument in function protos
Function prototypes using ARG_PTR_TO_BTF_ID currently use two ways to signal
which BTF IDs are acceptable. First, bpf_func_proto.btf_id is an array of
IDs, one for each argument. This array is only accessed up to the highest
numbered argument that uses ARG_PTR_TO_BTF_ID and may therefore be less than
five arguments long. It usually points at a BTF_ID_LIST. Second, check_btf_id
is a function pointer that is called by the verifier if present. It gets the
actual BTF ID of the register, and the argument number we're currently checking.
It turns out that the only user check_arg_btf_id ignores the argument, and is
simply used to check whether the BTF ID has a struct sock_common at it's start.

Replace both of these mechanisms with an explicit BTF ID for each argument
in a function proto. Thanks to btf_struct_ids_match this is very flexible:
check_arg_btf_id can be replaced by requiring struct sock_common.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200921121227.255763-5-lmb@cloudflare.com
2020-09-21 15:00:40 -07:00
Lorenz Bauer 2af30f115d btf: Make btf_set_contains take a const pointer
bsearch doesn't modify the contents of the array, so we can take a const pointer.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200921121227.255763-2-lmb@cloudflare.com
2020-09-21 15:00:40 -07:00
Jiri Olsa eae2e83e62 bpf: Add BTF_SET_START/END macros
Adding support to define sorted set of BTF ID values.

Following defines sorted set of BTF ID values:

  BTF_SET_START(btf_allowlist_d_path)
  BTF_ID(func, vfs_truncate)
  BTF_ID(func, vfs_fallocate)
  BTF_ID(func, dentry_open)
  BTF_ID(func, vfs_getattr)
  BTF_ID(func, filp_close)
  BTF_SET_END(btf_allowlist_d_path)

It defines following 'struct btf_id_set' variable to access
values and count:

  struct btf_id_set btf_allowlist_d_path;

Adding 'allowed' callback to struct bpf_func_proto, to allow
verifier the check on allowed callers.

Adding btf_id_set_contains function, which will be used by
allowed callbacks to verify the caller's BTF ID value is
within allowed set.

Also removing extra '\' in __BTF_ID_LIST macro.

Added BTF_SET_START_GLOBAL macro for global sets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-10-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa faaf4a790d bpf: Add btf_struct_ids_match function
Adding btf_struct_ids_match function to check if given address provided
by BTF object + offset is also address of another nested BTF object.

This allows to pass an argument to helper, which is defined via parent
BTF object + offset, like for bpf_d_path (added in following changes):

  SEC("fentry/filp_close")
  int BPF_PROG(prog_close, struct file *file, void *id)
  {
    ...
    ret = bpf_d_path(&file->f_path, ...

The first bpf_d_path argument is hold by verifier as BTF file object
plus offset of f_path member.

The btf_struct_ids_match function will walk the struct file object and
check if there's nested struct path object on the given offset.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-9-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa 1c6d28a6ac bpf: Factor btf_struct_access function
Adding btf_struct_walk function that walks through the
struct type + given offset and returns following values:

  enum bpf_struct_walk_result {
       /* < 0 error */
       WALK_SCALAR = 0,
       WALK_PTR,
       WALK_STRUCT,
  };

WALK_SCALAR - when SCALAR_VALUE is found
WALK_PTR    - when pointer value is found, its ID is stored
              in 'next_btf_id' output param
WALK_STRUCT - when nested struct object is found, its ID is stored
              in 'next_btf_id' output param

It will be used in following patches to get all nested
struct objects for given type and offset.

The btf_struct_access now calls btf_struct_walk function,
as long as it gets nested structs as return value.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-8-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa dafe58fc19 bpf: Remove recursion call in btf_struct_access
Andrii suggested we can simply jump to again label
instead of making recursion call.

Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-7-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa 887c31a39c bpf: Add type_id pointer as argument to __btf_resolve_size
Adding type_id pointer as argument to __btf_resolve_size
to return also BTF ID of the resolved type. It will be
used in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-6-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa 69ff304792 bpf: Add elem_id pointer as argument to __btf_resolve_size
If the resolved type is array, make btf_resolve_size return also
ID of the elem type. It will be needed in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-5-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Jiri Olsa 6298399bfc bpf: Move btf_resolve_size into __btf_resolve_size
Moving btf_resolve_size into __btf_resolve_size and
keeping btf_resolve_size public with just first 3
arguments, because the rest of the arguments are not
used by outside callers.

Following changes are adding more arguments, which
are not useful to outside callers. They will be added
to the __btf_resolve_size function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-4-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
David S. Miller 2e7199bd77 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-08-04

The following pull-request contains BPF updates for your *net-next* tree.

We've added 73 non-merge commits during the last 9 day(s) which contain
a total of 135 files changed, 4603 insertions(+), 1013 deletions(-).

The main changes are:

1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF
   syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko.

2) Add BPF iterator for map elements and to iterate all BPF programs for efficient
   in-kernel inspection, from Yonghong Song and Alexei Starovoitov.

3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid
   unwinder errors, from Song Liu.

4) Allow cgroup local storage map to be shared between programs on the same
   cgroup. Also extend BPF selftests with coverage, from YiFei Zhu.

5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM
   load instructions, from Jean-Philippe Brucker.

6) Follow-up fixes on BPF socket lookup in combination with reuseport group
   handling. Also add related BPF selftests, from Jakub Sitnicki.

7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for
   socket create/release as well as bind functions, from Stanislav Fomichev.

8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct
   xdp_statistics, from Peilin Ye.

9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime.

10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6}
    fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin.

11) Fix a bpftool segfault due to missing program type name and make it more robust
    to prevent them in future gaps, from Quentin Monnet.

12) Consolidate cgroup helper functions across selftests and fix a v6 localhost
    resolver issue, from John Fastabend.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:27:40 -07:00
David S. Miller bd0b33b248 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-02 01:02:12 -07:00
Yonghong Song afbf21dce6 bpf: Support readonly/readwrite buffers in verifier
Readonly and readwrite buffer register states
are introduced. Totally four states,
PTR_TO_RDONLY_BUF[_OR_NULL] and PTR_TO_RDWR_BUF[_OR_NULL]
are supported. As suggested by their respective
names, PTR_TO_RDONLY_BUF[_OR_NULL] are for
readonly buffers and PTR_TO_RDWR_BUF[_OR_NULL]
for read/write buffers.

These new register states will be used
by later bpf map element iterator.

New register states share some similarity to
PTR_TO_TP_BUFFER as it will calculate accessed buffer
size during verification time. The accessed buffer
size will be later compared to other metrics during
later attach/link_create time.

Similar to reg_state PTR_TO_BTF_ID_OR_NULL in bpf
iterator programs, PTR_TO_RDONLY_BUF_OR_NULL or
PTR_TO_RDWR_BUF_OR_NULL reg_types can be set at
prog->aux->bpf_ctx_arg_aux, and bpf verifier will
retrieve the values during btf_ctx_access().
Later bpf map element iterator implementation
will show how such information will be assigned
during target registeration time.

The verifier is also enhanced such that PTR_TO_RDONLY_BUF
can be passed to ARG_PTR_TO_MEM[_OR_NULL] helper argument, and
PTR_TO_RDWR_BUF can be passed to ARG_PTR_TO_MEM[_OR_NULL] or
ARG_PTR_TO_UNINIT_MEM.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184111.590274-1-yhs@fb.com
2020-07-25 20:16:32 -07:00
Yonghong Song 951cf368bc bpf: net: Use precomputed btf_id for bpf iterators
One additional field btf_id is added to struct
bpf_ctx_arg_aux to store the precomputed btf_ids.
The btf_id is computed at build time with
BTF_ID_LIST or BTF_ID_LIST_GLOBAL macro definitions.
All existing bpf iterators are changed to used
pre-compute btf_ids.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163403.1393551-1-yhs@fb.com
2020-07-21 13:26:26 -07:00
Yonghong Song bc4f0548f6 bpf: Compute bpf_skc_to_*() helper socket btf ids at build time
Currently, socket types (struct tcp_sock, udp_sock, etc.)
used by bpf_skc_to_*() helpers are computed when vmlinux_btf
is first built in the kernel.

Commit 5a2798ab32
("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros")
implemented a mechanism to compute btf_ids at kernel build
time which can simplify kernel implementation and reduce
runtime overhead by removing in-kernel btf_id calculation.
This patch did exactly this, removing in-kernel btf_id
computation and utilizing build-time btf_id computation.

If CONFIG_DEBUG_INFO_BTF is not defined, BTF_ID_LIST will
define an array with size of 5, which is not enough for
btf_sock_ids. So define its own static array if
CONFIG_DEBUG_INFO_BTF is not defined.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163358.1393023-1-yhs@fb.com
2020-07-21 13:26:26 -07:00
Peilin Ye 5b801dfb7f bpf: Fix NULL pointer dereference in __btf_resolve_helper_id()
Prevent __btf_resolve_helper_id() from dereferencing `btf_vmlinux`
as NULL. This patch fixes the following syzbot bug:

    https://syzkaller.appspot.com/bug?id=f823224ada908fa5c207902a5a62065e53ca0fcc

Reported-by: syzbot+ee09bda7017345f1fbe6@syzkaller.appspotmail.com
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200714180904.277512-1-yepeilin.cs@gmail.com
2020-07-15 22:53:39 +02:00
David S. Miller 07dd1b7e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-13

The following pull-request contains BPF updates for your *net-next* tree.

We've added 36 non-merge commits during the last 7 day(s) which contain
a total of 62 files changed, 2242 insertions(+), 468 deletions(-).

The main changes are:

1) Avoid trace_printk warning banner by switching bpf_trace_printk to use
   its own tracing event, from Alan.

2) Better libbpf support on older kernels, from Andrii.

3) Additional AF_XDP stats, from Ciara.

4) build time resolution of BTF IDs, from Jiri.

5) BPF_CGROUP_INET_SOCK_RELEASE hook, from Stanislav.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 18:04:05 -07:00
Jiri Olsa 49f4e67207 bpf: Use BTF_ID to resolve bpf_ctx_convert struct
This way the ID is resolved during compile time,
and we can remove the runtime name search.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200711215329.41165-7-jolsa@kernel.org
2020-07-13 10:42:03 -07:00
Jiri Olsa 138b9a0511 bpf: Remove btf_id helpers resolving
Now when we moved the helpers btf_id arrays into .BTF_ids section,
we can remove the code that resolve those IDs in runtime.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200711215329.41165-6-jolsa@kernel.org
2020-07-13 10:42:02 -07:00
David S. Miller 71930d6102 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
All conflicts seemed rather trivial, with some guidance from
Saeed Mameed on the tc_ct.c one.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-11 00:46:00 -07:00
John Fastabend a9b59159d3 bpf: Do not allow btf_ctx_access with __int128 types
To ensure btf_ctx_access() is safe the verifier checks that the BTF
arg type is an int, enum, or pointer. When the function does the
BTF arg lookup it uses the calculation 'arg = off / 8'  using the
fact that registers are 8B. This requires that the first arg is
in the first reg, the second in the second, and so on. However,
for __int128 the arg will consume two registers by default LLVM
implementation. So this will cause the arg layout assumed by the
'arg = off / 8' calculation to be incorrect.

Because __int128 is uncommon this patch applies the easiest fix and
will force int types to be sizeof(u64) or smaller so that they will
fit in a single register.

v2: remove unneeded parens per Andrii's feedback

Fixes: 9e15db6613 ("bpf: Implement accurate raw_tp context access via BTF")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159303723962.11287.13309537171132420717.stgit@john-Precision-5820-Tower
2020-06-25 16:17:05 +02:00
Yonghong Song af7ec13833 bpf: Add bpf_skc_to_tcp6_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a tcp6_sock pointer.
The return value could be NULL if the casting is illegal.

A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added
so the verifier is able to deduce proper return types for the helper.

Different from the previous BTF_ID based helpers,
the bpf_skc_to_tcp6_sock() argument can be several possible
btf_ids. More specifically, all possible socket data structures
with sock_common appearing in the first in the memory layout.
This patch only added socket types related to tcp and udp.

All possible argument btf_id and return value btf_id
for helper bpf_skc_to_tcp6_sock() are pre-calculcated and
cached. In the future, it is even possible to precompute
these btf_id's at kernel build time.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
2020-06-24 18:37:59 -07:00
Andrey Ignatov 41c48f3a98 bpf: Support access to bpf map fields
There are multiple use-cases when it's convenient to have access to bpf
map fields, both `struct bpf_map` and map type specific struct-s such as
`struct bpf_array`, `struct bpf_htab`, etc.

For example while working with sock arrays it can be necessary to
calculate the key based on map->max_entries (some_hash % max_entries).
Currently this is solved by communicating max_entries via "out-of-band"
channel, e.g. via additional map with known key to get info about target
map. That works, but is not very convenient and error-prone while
working with many maps.

In other cases necessary data is dynamic (i.e. unknown at loading time)
and it's impossible to get it at all. For example while working with a
hash table it can be convenient to know how much capacity is already
used (bpf_htab.count.counter for BPF_F_NO_PREALLOC case).

At the same time kernel knows this info and can provide it to bpf
program.

Fill this gap by adding support to access bpf map fields from bpf
program for both `struct bpf_map` and map type specific fields.

Support is implemented via btf_struct_access() so that a user can define
their own `struct bpf_map` or map type specific struct in their program
with only necessary fields and preserve_access_index attribute, cast a
map to this struct and use a field.

For example:

	struct bpf_map {
		__u32 max_entries;
	} __attribute__((preserve_access_index));

	struct bpf_array {
		struct bpf_map map;
		__u32 elem_size;
	} __attribute__((preserve_access_index));

	struct {
		__uint(type, BPF_MAP_TYPE_ARRAY);
		__uint(max_entries, 4);
		__type(key, __u32);
		__type(value, __u32);
	} m_array SEC(".maps");

	SEC("cgroup_skb/egress")
	int cg_skb(void *ctx)
	{
		struct bpf_array *array = (struct bpf_array *)&m_array;
		struct bpf_map *map = (struct bpf_map *)&m_array;

		/* .. use map->max_entries or array->map.max_entries .. */
	}

Similarly to other btf_struct_access() use-cases (e.g. struct tcp_sock
in net/ipv4/bpf_tcp_ca.c) the patch allows access to any fields of
corresponding struct. Only reading from map fields is supported.

For btf_struct_access() to work there should be a way to know btf id of
a struct that corresponds to a map type. To get btf id there should be a
way to get a stringified name of map-specific struct, such as
"bpf_array", "bpf_htab", etc for a map type. Two new fields are added to
`struct bpf_map_ops` to handle it:
* .map_btf_name keeps a btf name of a struct returned by map_alloc();
* .map_btf_id is used to cache btf id of that struct.

To make btf ids calculation cheaper they're calculated once while
preparing btf_vmlinux and cached same way as it's done for btf_id field
of `struct bpf_func_proto`

While calculating btf ids, struct names are NOT checked for collision.
Collisions will be checked as a part of the work to prepare btf ids used
in verifier in compile time that should land soon. The only known
collision for `struct bpf_htab` (kernel/bpf/hashtab.c vs
net/core/sock_map.c) was fixed earlier.

Both new fields .map_btf_name and .map_btf_id must be set for a map type
for the feature to work. If neither is set for a map type, verifier will
return ENOTSUPP on a try to access map_ptr of corresponding type. If
just one of them set, it's verifier misconfiguration.

Only `struct bpf_array` for BPF_MAP_TYPE_ARRAY and `struct bpf_htab` for
BPF_MAP_TYPE_HASH are supported by this patch. Other map types will be
supported separately.

The feature is available only for CONFIG_DEBUG_INFO_BTF=y and gated by
perfmon_capable() so that unpriv programs won't have access to bpf map
fields.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/6479686a0cd1e9067993df57b4c3eef0e276fec9.1592600985.git.rdna@fb.com
2020-06-22 22:22:58 +02:00
Andrey Ignatov a2d0d62f4d bpf: Switch btf_parse_vmlinux to btf_find_by_name_kind
btf_parse_vmlinux() implements manual search for struct bpf_ctx_convert
since at the time of implementing btf_find_by_name_kind() was not
available.

Later btf_find_by_name_kind() was introduced in 27ae7997a6 ("bpf:
Introduce BPF_PROG_TYPE_STRUCT_OPS"). It provides similar search
functionality and can be leveraged in btf_parse_vmlinux(). Do it.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/6e12d5c3e8a3d552925913ef73a695dd1bb27800.1592600985.git.rdna@fb.com
2020-06-22 22:22:58 +02:00
Yonghong Song 3c32cc1bce bpf: Enable bpf_iter targets registering ctx argument types
Commit b121b341e5 ("bpf: Add PTR_TO_BTF_ID_OR_NULL
support") adds a field btf_id_or_null_non0_off to
bpf_prog->aux structure to indicate that the
first ctx argument is PTR_TO_BTF_ID reg_type and
all others are PTR_TO_BTF_ID_OR_NULL.
This approach does not really scale if we have
other different reg types in the future, e.g.,
a pointer to a buffer.

This patch enables bpf_iter targets registering ctx argument
reg types which may be different from the default one.
For example, for pointers to structures, the default reg_type
is PTR_TO_BTF_ID for tracing program. The target can register
a particular pointer type as PTR_TO_BTF_ID_OR_NULL which can
be used by the verifier to enforce accesses.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200513180221.2949882-1-yhs@fb.com
2020-05-13 12:30:50 -07:00
Yonghong Song 9c5f8a1008 bpf: Support variable length array in tracing programs
In /proc/net/ipv6_route, we have
  struct fib6_info {
    struct fib6_table *fib6_table;
    ...
    struct fib6_nh fib6_nh[0];
  }
  struct fib6_nh {
    struct fib_nh_common nh_common;
    struct rt6_info **rt6i_pcpu;
    struct rt6_exception_bucket *rt6i_exception_bucket;
  };
  struct fib_nh_common {
    ...
    u8 nhc_gw_family;
    ...
  }

The access:
  struct fib6_nh *fib6_nh = &rt->fib6_nh;
  ... fib6_nh->nh_common.nhc_gw_family ...

This patch ensures such an access is handled properly.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175916.2476853-1-yhs@fb.com
2020-05-09 17:05:27 -07:00
Yonghong Song b121b341e5 bpf: Add PTR_TO_BTF_ID_OR_NULL support
Add bpf_reg_type PTR_TO_BTF_ID_OR_NULL support.
For tracing/iter program, the bpf program context
definition, e.g., for previous bpf_map target, looks like
  struct bpf_iter__bpf_map {
    struct bpf_iter_meta *meta;
    struct bpf_map *map;
  };

The kernel guarantees that meta is not NULL, but
map pointer maybe NULL. The NULL map indicates that all
objects have been traversed, so bpf program can take
proper action, e.g., do final aggregation and/or send
final report to user space.

Add btf_id_or_null_non0_off to prog->aux structure, to
indicate that if the context access offset is not 0,
set to PTR_TO_BTF_ID_OR_NULL instead of PTR_TO_BTF_ID.
This bit is set for tracing/iter program.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200509175912.2476576-1-yhs@fb.com
2020-05-09 17:05:26 -07:00
Andrii Nakryiko f2e10bff16 bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
Add ability to fetch bpf_link details through BPF_OBJ_GET_INFO_BY_FD command.
Also enhance show_fdinfo to potentially include bpf_link type-specific
information (similarly to obj_info).

Also introduce enum bpf_link_type stored in bpf_link itself and expose it in
UAPI. bpf_link_tracing also now will store and return bpf_attach_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200429001614.1544-5-andriin@fb.com
2020-04-28 17:27:08 -07:00
David S. Miller ed52f2c608 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 19:52:37 -07:00
KP Singh f50b49a0bf bpf: btf: Fix arg verification in btf_ctx_access()
The bounds checking for the arguments accessed in the BPF program breaks
when the expected_attach_type is not BPF_TRACE_FEXIT, BPF_LSM_MAC or
BPF_MODIFY_RETURN resulting in no check being done for the default case
(the programs which do not receive the return value of the attached
function in its arguments) when the index of the argument being accessed
is equal to the number of arguments (nr_args).

This was a result of a misplaced "else if" block  introduced by the
Commit 6ba43b761c ("bpf: Attachment verification for
BPF_MODIFY_RETURN")

Fixes: 6ba43b761c ("bpf: Attachment verification for BPF_MODIFY_RETURN")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200330144246.338-1-kpsingh@chromium.org
2020-03-30 13:28:02 -07:00
David S. Miller f0b5989745 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor comment conflict in mac80211.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29 21:25:29 -07:00
KP Singh 9e4e01dfd3 bpf: lsm: Implement attach, detach and execution
JITed BPF programs are dynamically attached to the LSM hooks
using BPF trampolines. The trampoline prologue generates code to handle
conversion of the signature of the hook to the appropriate BPF context.

The allocated trampoline programs are attached to the nop functions
initialized as LSM hooks.

BPF_PROG_TYPE_LSM programs must have a GPL compatible license and
and need CAP_SYS_ADMIN (required for loading eBPF programs).

Upon attachment:

* A BPF fexit trampoline is used for LSM hooks with a void return type.
* A BPF fmod_ret trampoline is used for LSM hooks which return an
  int. The attached programs can override the return value of the
  bpf LSM hook to indicate a MAC Policy decision.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-5-kpsingh@chromium.org
2020-03-30 01:34:00 +02:00
David S. Miller 9fb16955fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Overlapping header include additions in macsec.c

A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c

Overlapping test additions in selftests Makefile

Overlapping PCI ID table adjustments in iwlwifi driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 18:58:11 -07:00
Greg Kroah-Hartman 5c6f258879 bpf: Explicitly memset some bpf info structures declared on the stack
Trying to initialize a structure with "= {};" will not always clean out
all padding locations in a structure. So be explicit and call memset to
initialize everything for a number of bpf information structures that
are then copied from userspace, sometimes from smaller memory locations
than the size of the structure.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200320162258.GA794295@kroah.com
2020-03-20 21:05:22 +01:00
Fangrui Song 90ceddcb49 bpf: Support llvm-objcopy for vmlinux BTF
Simplify gen_btf logic to make it work with llvm-objcopy. The existing
'file format' and 'architecture' parsing logic is brittle and does not
work with llvm-objcopy/llvm-objdump.

'file format' output of llvm-objdump>=11 will match GNU objdump, but
'architecture' (bfdarch) may not.

.BTF in .tmp_vmlinux.btf is non-SHF_ALLOC. Add the SHF_ALLOC flag
because it is part of vmlinux image used for introspection. C code
can reference the section via linker script defined __start_BTF and
__stop_BTF. This fixes a small problem that previous .BTF had the
SHF_WRITE flag (objcopy -I binary -O elf* synthesized .data).

Additionally, `objcopy -I binary` synthesized symbols
_binary__btf_vmlinux_bin_start and _binary__btf_vmlinux_bin_stop (not
used elsewhere) are replaced with more commonplace __start_BTF and
__stop_BTF.

Add 2>/dev/null because GNU objcopy (but not llvm-objcopy) warns
"empty loadable segment detected at vaddr=0xffffffff81000000, is this intentional?"

We use a dd command to change the e_type field in the ELF header from
ET_EXEC to ET_REL so that lld will accept .btf.vmlinux.bin.o.  Accepting
ET_EXEC as an input file is an extremely rare GNU ld feature that lld
does not intend to support, because this is error-prone.

The output section description .BTF in include/asm-generic/vmlinux.lds.h
avoids potential subtle orphan section placement issues and suppresses
--orphan-handling=warn warnings.

Fixes: df786c9b94 ("bpf: Force .BTF section start to zero when dumping from vmlinux")
Fixes: cb0cc635c7 ("powerpc: Include .BTF section")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Stanislav Fomichev <sdf@google.com>
Tested-by: Andrii Nakryiko <andriin@fb.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Link: https://github.com/ClangBuiltLinux/linux/issues/871
Link: https://lore.kernel.org/bpf/20200318222746.173648-1-maskray@google.com
2020-03-19 12:32:38 +01:00
Yoshiki Komachi da6c7faeb1 bpf/btf: Fix BTF verification of enum members in struct/union
btf_enum_check_member() was currently sure to recognize the size of
"enum" type members in struct/union as the size of "int" even if
its size was packed.

This patch fixes BTF enum verification to use the correct size
of member in BPF programs.

Fixes: 179cde8cef ("bpf: btf: Check members of struct/union")
Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1583825550-18606-2-git-send-email-komachi.yoshiki@gmail.com
2020-03-10 10:00:41 -07:00
KP Singh 6ba43b761c bpf: Attachment verification for BPF_MODIFY_RETURN
- Allow BPF_MODIFY_RETURN attachment only to functions that are:

    * Whitelisted for error injection by checking
      within_error_injection_list. Similar discussions happened for the
      bpf_override_return helper.

    * security hooks, this is expected to be cleaned up with the LSM
      changes after the KRSI patches introduce the LSM_HOOK macro:

        https://lore.kernel.org/bpf/20200220175250.10795-1-kpsingh@chromium.org/

- The attachment is currently limited to functions that return an int.
  This can be extended later other types (e.g. PTR).

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-5-kpsingh@chromium.org
2020-03-04 13:41:05 -08:00
KP Singh ae24082331 bpf: Introduce BPF_MODIFY_RETURN
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0 is not currently configurable but can be made so
where user-space can specify it at load time.

For example:

int func_to_be_attached(int a, int b)
{  <--- do_fentry

do_fmod_ret:
   <update ret by calling fmod_ret>
   if (ret != 0)
        goto do_fexit;

original_function:

    <side_effects_happen_here>

}  <--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
        // This will skip the original function logic.
        return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
2020-03-04 13:41:05 -08:00
Hongbo Yao 2bf0eb9b3b bpf: Make btf_check_func_type_match() static
Fix the following sparse warning:

kernel/bpf/btf.c:4131:5: warning: symbol 'btf_check_func_type_match' was
not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200210011441.147102-1-yaohongbo@huawei.com
2020-02-11 00:22:47 +01:00
Alexei Starovoitov 257af63d7f bpf: Fix modifier skipping logic
Fix the way modifiers are skipped while walking pointers. Otherwise second
level dereferences of 'const struct foo *' will be rejected by the verifier.

Fixes: 9e15db6613 ("bpf: Implement accurate raw_tp context access via BTF")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200201000314.261392-1-ast@kernel.org
2020-02-04 00:06:07 +01:00
Martin KaFai Lau d3e42bb0a3 bpf: Reuse log from btf_prase_vmlinux() in btf_struct_ops_init()
Instead of using a locally defined "struct bpf_verifier_log log = {}",
btf_struct_ops_init() should reuse the "log" from its calling
function "btf_parse_vmlinux()".  It should also resolve the
frame-size too large compiler warning in some ARCH.

Fixes: 27ae7997a6 ("bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200127175145.1154438-1-kafai@fb.com
2020-01-29 16:40:54 +01:00
Jiri Olsa 84ad7a7ab6 bpf: Allow BTF ctx access for string pointers
When accessing the context we allow access to arguments with
scalar type and pointer to struct. But we deny access for
pointer to scalar type, which is the case for many functions.

Alexei suggested to take conservative approach and allow
currently only string pointer access, which is the case
for most functions now:

Adding check if the pointer is to string type and allow access to it.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200123161508.915203-2-jolsa@kernel.org
2020-01-25 07:12:40 -08:00
Alexei Starovoitov be8704ff07 bpf: Introduce dynamic program extensions
Introduce dynamic program extensions. The users can load additional BPF
functions and replace global functions in previously loaded BPF programs while
these programs are executing.

Global functions are verified individually by the verifier based on their types only.
Hence the global function in the new program which types match older function can
safely replace that corresponding function.

This new function/program is called 'an extension' of old program. At load time
the verifier uses (attach_prog_fd, attach_btf_id) pair to identify the function
to be replaced. The BPF program type is derived from the target program into
extension program. Technically bpf_verifier_ops is copied from target program.
The BPF_PROG_TYPE_EXT program type is a placeholder. It has empty verifier_ops.
The extension program can call the same bpf helper functions as target program.
Single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB and all other program
types. The verifier allows only one level of replacement. Meaning that the
extension program cannot recursively extend an extension. That also means that
the maximum stack size is increasing from 512 to 1024 bytes and maximum
function nesting level from 8 to 16. The programs don't always consume that
much. The stack usage is determined by the number of on-stack variables used by
the program. The verifier could have enforced 512 limit for combined original
plus extension program, but it makes for difficult user experience. The main
use case for extensions is to provide generic mechanism to plug external
programs into policy program or function call chaining.

BPF trampoline is used to track both fentry/fexit and program extensions
because both are using the same nop slot at the beginning of every BPF
function. Attaching fentry/fexit to a function that was replaced is not
allowed. The opposite is true as well. Replacing a function that currently
being analyzed with fentry/fexit is not allowed. The executable page allocated
by BPF trampoline is not used by program extensions. This inefficiency will be
optimized in future patches.

Function by function verification of global function supports scalars and
pointer to context only. Hence program extensions are supported for such class
of global functions only. In the future the verifier will be extended with
support to pointers to structures, arrays with sizes, etc.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-2-ast@kernel.org
2020-01-22 23:04:52 +01:00
Alexei Starovoitov 51c39bb1d5 bpf: Introduce function-by-function verification
New llvm and old llvm with libbpf help produce BTF that distinguish global and
static functions. Unlike arguments of static function the arguments of global
functions cannot be removed or optimized away by llvm. The compiler has to use
exactly the arguments specified in a function prototype. The argument type
information allows the verifier validate each global function independently.
For now only supported argument types are pointer to context and scalars. In
the future pointers to structures, sizes, pointer to packet data can be
supported as well. Consider the following example:

static int f1(int ...)
{
  ...
}

int f3(int b);

int f2(int a)
{
  f1(a) + f3(a);
}

int f3(int b)
{
  ...
}

int main(...)
{
  f1(...) + f2(...) + f3(...);
}

The verifier will start its safety checks from the first global function f2().
It will recursively descend into f1() because it's static. Then it will check
that arguments match for the f3() invocation inside f2(). It will not descend
into f3(). It will finish f2() that has to be successfully verified for all
possible values of 'a'. Then it will proceed with f3(). That function also has
to be safe for all possible values of 'b'. Then it will start subprog 0 (which
is main() function). It will recursively descend into f1() and will skip full
check of f2() and f3(), since they are global. The order of processing global
functions doesn't affect safety, since all global functions must be proven safe
based on their arguments only.

Such function by function verification can drastically improve speed of the
verification and reduce complexity.

Note that the stack limit of 512 still applies to the call chain regardless whether
functions were static or global. The nested level of 8 also still applies. The
same recursion prevention checks are in place as well.

The type information and static/global kind is preserved after the verification
hence in the above example global function f2() and f3() can be replaced later
by equivalent functions with the same types that are loaded and verified later
without affecting safety of this main() program. Such replacement (re-linking)
of global functions is a subject of future patches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-3-ast@kernel.org
2020-01-10 17:20:07 +01:00
Martin KaFai Lau 85d33df357 bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS
The patch introduces BPF_MAP_TYPE_STRUCT_OPS.  The map value
is a kernel struct with its func ptr implemented in bpf prog.
This new map is the interface to register/unregister/introspect
a bpf implemented kernel struct.

The kernel struct is actually embedded inside another new struct
(or called the "value" struct in the code).  For example,
"struct tcp_congestion_ops" is embbeded in:
struct bpf_struct_ops_tcp_congestion_ops {
	refcount_t refcnt;
	enum bpf_struct_ops_state state;
	struct tcp_congestion_ops data;  /* <-- kernel subsystem struct here */
}
The map value is "struct bpf_struct_ops_tcp_congestion_ops".
The "bpftool map dump" will then be able to show the
state ("inuse"/"tobefree") and the number of subsystem's refcnt (e.g.
number of tcp_sock in the tcp_congestion_ops case).  This "value" struct
is created automatically by a macro.  Having a separate "value" struct
will also make extending "struct bpf_struct_ops_XYZ" easier (e.g. adding
"void (*init)(void)" to "struct bpf_struct_ops_XYZ" to do some
initialization works before registering the struct_ops to the kernel
subsystem).  The libbpf will take care of finding and populating the
"struct bpf_struct_ops_XYZ" from "struct XYZ".

Register a struct_ops to a kernel subsystem:
1. Load all needed BPF_PROG_TYPE_STRUCT_OPS prog(s)
2. Create a BPF_MAP_TYPE_STRUCT_OPS with attr->btf_vmlinux_value_type_id
   set to the btf id "struct bpf_struct_ops_tcp_congestion_ops" of the
   running kernel.
   Instead of reusing the attr->btf_value_type_id,
   btf_vmlinux_value_type_id s added such that attr->btf_fd can still be
   used as the "user" btf which could store other useful sysadmin/debug
   info that may be introduced in the furture,
   e.g. creation-date/compiler-details/map-creator...etc.
3. Create a "struct bpf_struct_ops_tcp_congestion_ops" object as described
   in the running kernel btf.  Populate the value of this object.
   The function ptr should be populated with the prog fds.
4. Call BPF_MAP_UPDATE with the object created in (3) as
   the map value.  The key is always "0".

During BPF_MAP_UPDATE, the code that saves the kernel-func-ptr's
args as an array of u64 is generated.  BPF_MAP_UPDATE also allows
the specific struct_ops to do some final checks in "st_ops->init_member()"
(e.g. ensure all mandatory func ptrs are implemented).
If everything looks good, it will register this kernel struct
to the kernel subsystem.  The map will not allow further update
from this point.

Unregister a struct_ops from the kernel subsystem:
BPF_MAP_DELETE with key "0".

Introspect a struct_ops:
BPF_MAP_LOOKUP_ELEM with key "0".  The map value returned will
have the prog _id_ populated as the func ptr.

The map value state (enum bpf_struct_ops_state) will transit from:
INIT (map created) =>
INUSE (map updated, i.e. reg) =>
TOBEFREE (map value deleted, i.e. unreg)

The kernel subsystem needs to call bpf_struct_ops_get() and
bpf_struct_ops_put() to manage the "refcnt" in the
"struct bpf_struct_ops_XYZ".  This patch uses a separate refcnt
for the purose of tracking the subsystem usage.  Another approach
is to reuse the map->refcnt and then "show" (i.e. during map_lookup)
the subsystem's usage by doing map->refcnt - map->usercnt to filter out
the map-fd/pinned-map usage.  However, that will also tie down the
future semantics of map->refcnt and map->usercnt.

The very first subsystem's refcnt (during reg()) holds one
count to map->refcnt.  When the very last subsystem's refcnt
is gone, it will also release the map->refcnt.  All bpf_prog will be
freed when the map->refcnt reaches 0 (i.e. during map_free()).

Here is how the bpftool map command will look like:
[root@arch-fb-vm1 bpf]# bpftool map show
6: struct_ops  name dctcp  flags 0x0
	key 4B  value 256B  max_entries 1  memlock 4096B
	btf_id 6
[root@arch-fb-vm1 bpf]# bpftool map dump id 6
[{
        "value": {
            "refcnt": {
                "refs": {
                    "counter": 1
                }
            },
            "state": 1,
            "data": {
                "list": {
                    "next": 0,
                    "prev": 0
                },
                "key": 0,
                "flags": 2,
                "init": 24,
                "release": 0,
                "ssthresh": 25,
                "cong_avoid": 30,
                "set_state": 27,
                "cwnd_event": 28,
                "in_ack_event": 26,
                "undo_cwnd": 29,
                "pkts_acked": 0,
                "min_tso_segs": 0,
                "sndbuf_expand": 0,
                "cong_control": 0,
                "get_info": 0,
                "name": [98,112,102,95,100,99,116,99,112,0,0,0,0,0,0,0
                ],
                "owner": 0
            }
        }
    }
]

Misc Notes:
* bpf_struct_ops_map_sys_lookup_elem() is added for syscall lookup.
  It does an inplace update on "*value" instead returning a pointer
  to syscall.c.  Otherwise, it needs a separate copy of "zero" value
  for the BPF_STRUCT_OPS_STATE_INIT to avoid races.

* The bpf_struct_ops_map_delete_elem() is also called without
  preempt_disable() from map_delete_elem().  It is because
  the "->unreg()" may requires sleepable context, e.g.
  the "tcp_unregister_congestion_control()".

* "const" is added to some of the existing "struct btf_func_model *"
  function arg to avoid a compiler warning caused by this patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003505.3855919-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
Martin KaFai Lau 27ae7997a6 bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS
This patch allows the kernel's struct ops (i.e. func ptr) to be
implemented in BPF.  The first use case in this series is the
"struct tcp_congestion_ops" which will be introduced in a
latter patch.

This patch introduces a new prog type BPF_PROG_TYPE_STRUCT_OPS.
The BPF_PROG_TYPE_STRUCT_OPS prog is verified against a particular
func ptr of a kernel struct.  The attr->attach_btf_id is the btf id
of a kernel struct.  The attr->expected_attach_type is the member
"index" of that kernel struct.  The first member of a struct starts
with member index 0.  That will avoid ambiguity when a kernel struct
has multiple func ptrs with the same func signature.

For example, a BPF_PROG_TYPE_STRUCT_OPS prog is written
to implement the "init" func ptr of the "struct tcp_congestion_ops".
The attr->attach_btf_id is the btf id of the "struct tcp_congestion_ops"
of the _running_ kernel.  The attr->expected_attach_type is 3.

The ctx of BPF_PROG_TYPE_STRUCT_OPS is an array of u64 args saved
by arch_prepare_bpf_trampoline that will be done in the next
patch when introducing BPF_MAP_TYPE_STRUCT_OPS.

"struct bpf_struct_ops" is introduced as a common interface for the kernel
struct that supports BPF_PROG_TYPE_STRUCT_OPS prog.  The supporting kernel
struct will need to implement an instance of the "struct bpf_struct_ops".

The supporting kernel struct also needs to implement a bpf_verifier_ops.
During BPF_PROG_LOAD, bpf_struct_ops_find() will find the right
bpf_verifier_ops by searching the attr->attach_btf_id.

A new "btf_struct_access" is also added to the bpf_verifier_ops such
that the supporting kernel struct can optionally provide its own specific
check on accessing the func arg (e.g. provide limited write access).

After btf_vmlinux is parsed, the new bpf_struct_ops_init() is called
to initialize some values (e.g. the btf id of the supporting kernel
struct) and it can only be done once the btf_vmlinux is available.

The R0 checks at BPF_EXIT is excluded for the BPF_PROG_TYPE_STRUCT_OPS prog
if the return type of the prog->aux->attach_func_proto is "void".

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003503.3855825-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
Martin KaFai Lau 976aba002f bpf: Support bitfield read access in btf_struct_access
This patch allows bitfield access as a scalar.

It checks "off + size > t->size" to avoid accessing bitfield
end up accessing beyond the struct.  This check is done
outside of the loop since it is applicable to all access.

It also takes this chance to break early on the "off < moff" case.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003501.3855427-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
Martin KaFai Lau 218b3f65f9 bpf: Add enum support to btf_ctx_access()
It allows bpf prog (e.g. tracing) to attach
to a kernel function that takes enum argument.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003459.3855366-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
Martin KaFai Lau 275517ff45 bpf: Avoid storing modifier to info->btf_id
info->btf_id expects the btf_id of a struct, so it should
store the final result after skipping modifiers (if any).

It also takes this chanace to add a missing newline in one of the
bpf_log() messages.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003456.3855176-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
Arnd Bergmann 4c80c7bc58 bpf: Fix build in minimal configurations, again
Building with -Werror showed another failure:

kernel/bpf/btf.c: In function 'btf_get_prog_ctx_type.isra.31':
kernel/bpf/btf.c:3508:63: error: array subscript 0 is above array bounds of 'u8[0]' {aka 'unsigned char[0]'} [-Werror=array-bounds]
  ctx_type = btf_type_member(conv_struct) + bpf_ctx_convert_map[prog_type] * 2;

I don't actually understand why the array is empty, but a similar
fix has addressed a related problem, so I suppose we can do the
same thing here.

Fixes: ce27709b81 ("bpf: Fix build in minimal configurations")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191210203553.2941035-1-arnd@arndb.de
2019-12-11 13:57:26 +01:00
Alexei Starovoitov ce27709b81 bpf: Fix build in minimal configurations
Some kconfigs can have BPF enabled without a single valid program type.
In such configurations the build will fail with:
./kernel/bpf/btf.c:3466:1: error: empty enum is invalid

Fix it by adding unused value to the enum.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/bpf/20191128043508.2346723-1-ast@kernel.org
2019-11-29 01:03:42 +01:00
Alexei Starovoitov d0f0104341 bpf: Fix static checker warning
kernel/bpf/btf.c:4023 btf_distill_func_proto()
        error: potentially dereferencing uninitialized 't'.

kernel/bpf/btf.c
  4012          nargs = btf_type_vlen(func);
  4013          if (nargs >= MAX_BPF_FUNC_ARGS) {
  4014                  bpf_log(log,
  4015                          "The function %s has %d arguments. Too many.\n",
  4016                          tname, nargs);
  4017                  return -EINVAL;
  4018          }
  4019          ret = __get_type_size(btf, func->type, &t);
                                                       ^^
t isn't initialized for the first -EINVAL return

This is unlikely path, since BTF should have been validated at this point.
Fix it by returning 'void' BTF.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191126230106.237179-1-ast@kernel.org
2019-11-27 01:04:47 +01:00
Alexei Starovoitov 5b92a28aae bpf: Support attaching tracing BPF program to other BPF programs
Allow FENTRY/FEXIT BPF programs to attach to other BPF programs of any type
including their subprograms. This feature allows snooping on input and output
packets in XDP, TC programs including their return values. In order to do that
the verifier needs to track types not only of vmlinux, but types of other BPF
programs as well. The verifier also needs to translate uapi/linux/bpf.h types
used by networking programs into kernel internal BTF types used by FENTRY/FEXIT
BPF programs. In some cases LLVM optimizations can remove arguments from BPF
subprograms without adjusting BTF info that LLVM backend knows. When BTF info
disagrees with actual types that the verifiers sees the BPF trampoline has to
fallback to conservative and treat all arguments as u64. The FENTRY/FEXIT
program can still attach to such subprograms, but it won't be able to recognize
pointer types like 'struct sk_buff *' and it won't be able to pass them to
bpf_skb_output() for dumping packets to user space. The FENTRY/FEXIT program
would need to use bpf_probe_read_kernel() instead.

The BPF_PROG_LOAD command is extended with attach_prog_fd field. When it's set
to zero the attach_btf_id is one vmlinux BTF type ids. When attach_prog_fd
points to previously loaded BPF program the attach_btf_id is BTF type id of
main function or one of its subprograms.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-18-ast@kernel.org
2019-11-15 23:45:24 +01:00
Alexei Starovoitov 8c1b6e69dc bpf: Compare BTF types of functions arguments with actual types
Make the verifier check that BTF types of function arguments match actual types
passed into top-level BPF program and into BPF-to-BPF calls. If types match
such BPF programs and sub-programs will have full support of BPF trampoline. If
types mismatch the trampoline has to be conservative. It has to save/restore
five program arguments and assume 64-bit scalars.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-17-ast@kernel.org
2019-11-15 23:45:02 +01:00
Alexei Starovoitov 91cc1a9974 bpf: Annotate context types
Annotate BPF program context types with program-side type and kernel-side type.
This type information is used by the verifier. btf_get_prog_ctx_type() is
used in the later patches to verify that BTF type of ctx in BPF program matches to
kernel expected ctx type. For example, the XDP program type is:
BPF_PROG_TYPE(BPF_PROG_TYPE_XDP, xdp, struct xdp_md, struct xdp_buff)
That means that XDP program should be written as:
int xdp_prog(struct xdp_md *ctx) { ... }

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-16-ast@kernel.org
2019-11-15 23:44:48 +01:00
Alexei Starovoitov 9cc31b3a09 bpf: Fix race in btf_resolve_helper_id()
btf_resolve_helper_id() caching logic is a bit racy, since under root the
verifier can verify several programs in parallel. Fix it with READ/WRITE_ONCE.
Fix the type as well, since error is also recorded.

Fixes: a7658e1a41 ("bpf: Check types of arguments passed into helpers")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-15-ast@kernel.org
2019-11-15 23:44:20 +01:00
Alexei Starovoitov fec56f5890 bpf: Introduce BPF trampoline
Introduce BPF trampoline concept to allow kernel code to call into BPF programs
with practically zero overhead.  The trampoline generation logic is
architecture dependent.  It's converting native calling convention into BPF
calling convention.  BPF ISA is 64-bit (even on 32-bit architectures). The
registers R1 to R5 are used to pass arguments into BPF functions. The main BPF
program accepts only single argument "ctx" in R1. Whereas CPU native calling
convention is different. x86-64 is passing first 6 arguments in registers
and the rest on the stack. x86-32 is passing first 3 arguments in registers.
sparc64 is passing first 6 in registers. And so on.

The trampolines between BPF and kernel already exist.  BPF_CALL_x macros in
include/linux/filter.h statically compile trampolines from BPF into kernel
helpers. They convert up to five u64 arguments into kernel C pointers and
integers. On 64-bit architectures this BPF_to_kernel trampolines are nops. On
32-bit architecture they're meaningful.

The opposite job kernel_to_BPF trampolines is done by CAST_TO_U64 macros and
__bpf_trace_##call() shim functions in include/trace/bpf_probe.h. They convert
kernel function arguments into array of u64s that BPF program consumes via
R1=ctx pointer.

This patch set is doing the same job as __bpf_trace_##call() static
trampolines, but dynamically for any kernel function. There are ~22k global
kernel functions that are attachable via nop at function entry. The function
arguments and types are described in BTF.  The job of btf_distill_func_proto()
function is to extract useful information from BTF into "function model" that
architecture dependent trampoline generators will use to generate assembly code
to cast kernel function arguments into array of u64s.  For example the kernel
function eth_type_trans has two pointers. They will be casted to u64 and stored
into stack of generated trampoline. The pointer to that stack space will be
passed into BPF program in R1. On x86-64 such generated trampoline will consume
16 bytes of stack and two stores of %rdi and %rsi into stack. The verifier will
make sure that only two u64 are accessed read-only by BPF program. The verifier
will also recognize the precise type of the pointers being accessed and will
not allow typecasting of the pointer to a different type within BPF program.

The tracing use case in the datacenter demonstrated that certain key kernel
functions have (like tcp_retransmit_skb) have 2 or more kprobes that are always
active.  Other functions have both kprobe and kretprobe.  So it is essential to
keep both kernel code and BPF programs executing at maximum speed. Hence
generated BPF trampoline is re-generated every time new program is attached or
detached to maintain maximum performance.

To avoid the high cost of retpoline the attached BPF programs are called
directly. __bpf_prog_enter/exit() are used to support per-program execution
stats.  In the future this logic will be optimized further by adding support
for bpf_stats_enabled_key inside generated assembly code. Introduction of
preemptible and sleepable BPF programs will completely remove the need to call
to __bpf_prog_enter/exit().

Detach of a BPF program from the trampoline should not fail. To avoid memory
allocation in detach path the half of the page is used as a reserve and flipped
after each attach/detach. 2k bytes is enough to call 40+ BPF programs directly
which is enough for BPF tracing use cases. This limit can be increased in the
future.

BPF_TRACE_FENTRY programs have access to raw kernel function arguments while
BPF_TRACE_FEXIT programs have access to kernel return value as well. Often
kprobe BPF program remembers function arguments in a map while kretprobe
fetches arguments from a map and analyzes them together with return value.
BPF_TRACE_FEXIT accelerates this typical use case.

Recursion prevention for kprobe BPF programs is done via per-cpu
bpf_prog_active counter. In practice that turned out to be a mistake. It
caused programs to randomly skip execution. The tracing tools missed results
they were looking for. Hence BPF trampoline doesn't provide builtin recursion
prevention. It's a job of BPF program itself and will be addressed in the
follow up patches.

BPF trampoline is intended to be used beyond tracing and fentry/fexit use cases
in the future. For example to remove retpoline cost from XDP programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191114185720.1641606-5-ast@kernel.org
2019-11-15 23:41:51 +01:00
Martin KaFai Lau 7e3617a72d bpf: Add array support to btf_struct_access
This patch adds array support to btf_struct_access().
It supports array of int, array of struct and multidimensional
array.

It also allows using u8[] as a scratch space.  For example,
it allows access the "char cb[48]" with size larger than
the array's element "char".  Another potential use case is
"u64 icsk_ca_priv[]" in the tcp congestion control.

btf_resolve_size() is added to resolve the size of any type.
It will follow the modifier if there is any.  Please
see the function comment for details.

This patch also adds the "off < moff" check at the beginning
of the for loop.  It is to reject cases when "off" is pointing
to a "hole" in a struct.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191107180903.4097702-1-kafai@fb.com
2019-11-07 10:59:08 -08:00
Martin KaFai Lau 3820729160 bpf: Prepare btf_ctx_access for non raw_tp use case
This patch makes a few changes to btf_ctx_access() to prepare
it for non raw_tp use case where the attach_btf_id is not
necessary a BTF_KIND_TYPEDEF.

It moves the "btf_trace_" prefix check and typedef-follow logic to a new
function "check_attach_btf_id()" which is called only once during
bpf_check().  btf_ctx_access() only operates on a BTF_KIND_FUNC_PROTO
type now. That should also be more efficient since it is done only
one instead of every-time check_ctx_access() is called.

"check_attach_btf_id()" needs to find the func_proto type from
the attach_btf_id.  It needs to store the result into the
newly added prog->aux->attach_func_proto.  func_proto
btf type has no name, so a proper name should be stored into
"attach_func_name" also.

v2:
- Move the "btf_trace_" check to an earlier verifier phase (Alexei)

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191025001811.1718491-1-kafai@fb.com
2019-10-24 18:41:08 -07:00
Alexei Starovoitov a7658e1a41 bpf: Check types of arguments passed into helpers
Introduce new helper that reuses existing skb perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct sk_buff *' as tracepoint argument or
can walk other kernel data structures to skb pointer.

In order to do that teach verifier to resolve true C types
of bpf helpers into in-kernel BTF ids.
The type of kernel pointer passed by raw tracepoint into bpf
program will be tracked by the verifier all the way until
it's passed into helper function.
For example:
kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
bpf programs receives that skb pointer and may eventually
pass it into bpf_skb_output() bpf helper which in-kernel is
implemented via bpf_skb_event_output() kernel function.
Its first argument in the kernel is 'struct sk_buff *'.
The verifier makes sure that types match all the way.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org
2019-10-17 16:44:36 +02:00
Alexei Starovoitov 9e15db6613 bpf: Implement accurate raw_tp context access via BTF
libbpf analyzes bpf C program, searches in-kernel BTF for given type name
and stores it into expected_attach_type.
The kernel verifier expects this btf_id to point to something like:
typedef void (*btf_trace_kfree_skb)(void *, struct sk_buff *skb, void *loc);
which represents signature of raw_tracepoint "kfree_skb".

Then btf_ctx_access() matches ctx+0 access in bpf program with 'skb'
and 'ctx+8' access with 'loc' arguments of "kfree_skb" tracepoint.
In first case it passes btf_id of 'struct sk_buff *' back to the verifier core
and 'void *' in second case.

Then the verifier tracks PTR_TO_BTF_ID as any other pointer type.
Like PTR_TO_SOCKET points to 'struct bpf_sock',
PTR_TO_TCP_SOCK points to 'struct bpf_tcp_sock', and so on.
PTR_TO_BTF_ID points to in-kernel structs.
If 1234 is btf_id of 'struct sk_buff' in vmlinux's BTF
then PTR_TO_BTF_ID#1234 points to one of in kernel skbs.

When PTR_TO_BTF_ID#1234 is dereferenced (like r2 = *(u64 *)r1 + 32)
the btf_struct_access() checks which field of 'struct sk_buff' is
at offset 32. Checks that size of access matches type definition
of the field and continues to track the dereferenced type.
If that field was a pointer to 'struct net_device' the r2's type
will be PTR_TO_BTF_ID#456. Where 456 is btf_id of 'struct net_device'
in vmlinux's BTF.

Such verifier analysis prevents "cheating" in BPF C program.
The program cannot cast arbitrary pointer to 'struct sk_buff *'
and access it. C compiler would allow type cast, of course,
but the verifier will notice type mismatch based on BPF assembly
and in-kernel BTF.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-7-ast@kernel.org
2019-10-17 16:44:35 +02:00
Alexei Starovoitov 8580ac9404 bpf: Process in-kernel BTF
If in-kernel BTF exists parse it and prepare 'struct btf *btf_vmlinux'
for further use by the verifier.
In-kernel BTF is trusted just like kallsyms and other build artifacts
embedded into vmlinux.
Yet run this BTF image through BTF verifier to make sure
that it is valid and it wasn't mangled during the build.
If in-kernel BTF is incorrect it means either gcc or pahole or kernel
are buggy. In such case disallow loading BPF programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191016032505.2089704-4-ast@kernel.org
2019-10-17 16:44:35 +02:00
Colin Ian King e3439af4a3 bpf: Clean up indentation issue in BTF kflag processing
There is a statement that is indented one level too deeply, remove
the extraneous tab.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20190925093835.19515-1-colin.king@canonical.com
2019-09-26 17:09:18 +02:00
Alexei Starovoitov 9eea984979 bpf: fix BTF verification of enums
vmlinux BTF has enums that are 8 byte and 1 byte in size.
2 byte enum is a valid construct as well.
Fix BTF enum verification to accept those sizes.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-19 14:22:44 +02:00
Quentin Monnet 1b9ed84ecf bpf: add new BPF_BTF_GET_NEXT_ID syscall command
Add a new command for the bpf() system call: BPF_BTF_GET_NEXT_ID is used
to cycle through all BTF objects loaded on the system.

The motivation is to be able to inspect (list) all BTF objects presents
on the system.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-08-20 09:51:06 -07:00
Quentin Monnet 3481e64bbe bpf: add BTF ids in procfs for file descriptors to BTF objects
Implement the show_fdinfo hook for BTF FDs file operations, and make it
print the id of the BTF object. This allows for a quick retrieval of the
BTF id from its FD; or it can help understanding what type of object
(BTF) the file descriptor points to.

v2:
- Do not expose data_size, only btf_id, in FD info.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-20 16:19:12 +02:00
Andrii Nakryiko 1acc5d5c58 bpf: fix BTF verifier size resolution logic
BTF verifier has a size resolution bug which in some circumstances leads to
invalid size resolution for, e.g., TYPEDEF modifier.  This happens if we have
[1] PTR -> [2] TYPEDEF -> [3] ARRAY, in which case due to being in pointer
context ARRAY size won't be resolved (because for pointer it doesn't matter, so
it's a sink in pointer context), but it will be permanently remembered as zero
for TYPEDEF and TYPEDEF will be marked as RESOLVED. Eventually ARRAY size will
be resolved correctly, but TYPEDEF resolved_size won't be updated anymore.
This, subsequently, will lead to erroneous map creation failure, if that
TYPEDEF is specified as either key or value, as key_size/value_size won't
correspond to resolved size of TYPEDEF (kernel will believe it's zero).

Note, that if BTF was ordered as [1] ARRAY <- [2] TYPEDEF <- [3] PTR, this
won't be a problem, as by the time we get to TYPEDEF, ARRAY's size is already
calculated and stored.

This bug manifests itself in rejecting BTF-defined maps that use array
typedef as a value type:

typedef int array_t[16];

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __type(value, array_t); /* i.e., array_t *value; */
} test_map SEC(".maps");

The fix consists on not relying on modifier's resolved_size and instead using
modifier's resolved_id (type ID for "concrete" type to which modifier
eventually resolves) and doing size determination for that resolved type. This
allow to preserve existing "early DFS termination" logic for PTR or
STRUCT_OR_ARRAY contexts, but still do correct size determination for modifier
types.

Fixes: eb3f595dab ("bpf: btf: Validate type reference")
Cc: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-15 23:02:17 +02:00
Stanislav Fomichev e4f0712021 bpf: fix NULL deref in btf_type_is_resolve_source_only
Commit 1dc9285184 ("bpf: kernel side support for BTF Var and DataSec")
added invocations of btf_type_is_resolve_source_only before
btf_type_nosize_or_null which checks for the NULL pointer.
Swap the order of btf_type_nosize_or_null and
btf_type_is_resolve_source_only to make sure the do the NULL pointer
check first.

Fixes: 1dc9285184 ("bpf: kernel side support for BTF Var and DataSec")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-24 15:53:19 +02:00
Daniel Borkmann 2824ecb701 bpf: allow for key-less BTF in array map
Given we'll be reusing BPF array maps for global data/bss/rodata
sections, we need a way to associate BTF DataSec type as its map
value type. In usual cases we have this ugly BPF_ANNOTATE_KV_PAIR()
macro hack e.g. via 38d5d3b3d5 ("bpf: Introduce BPF_ANNOTATE_KV_PAIR")
to get initial map to type association going. While more use cases
for it are discouraged, this also won't work for global data since
the use of array map is a BPF loader detail and therefore unknown
at compilation time. For array maps with just a single entry we make
an exception in terms of BTF in that key type is declared optional
if value type is of DataSec type. The latter LLVM is guaranteed to
emit and it also aligns with how we regard global data maps as just
a plain buffer area reusing existing map facilities for allowing
things like introspection with existing tools.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
Daniel Borkmann 1dc9285184 bpf: kernel side support for BTF Var and DataSec
This work adds kernel-side verification, logging and seq_show dumping
of BTF Var and DataSec kinds which are emitted with latest LLVM. The
following constraints apply:

BTF Var must have:

- Its kind_flag is 0
- Its vlen is 0
- Must point to a valid type
- Type must not resolve to a forward type
- Size of underlying type must be > 0
- Must have a valid name
- Can only be a source type, not sink or intermediate one
- Name may include dots (e.g. in case of static variables
  inside functions)
- Cannot be a member of a struct/union
- Linkage so far can either only be static or global/allocated

BTF DataSec must have:

- Its kind_flag is 0
- Its vlen cannot be 0
- Its size cannot be 0
- Must have a valid name
- Can only be a source type, not sink or intermediate one
- Name may include dots (e.g. to represent .bss, .data, .rodata etc)
- Cannot be a member of a struct/union
- Inner btf_var_secinfo array with {type,offset,size} triple
  must be sorted by offset in ascending order
- Type must always point to BTF Var
- BTF resolved size of Var must be <= size provided by triple
- DataSec size must be >= sum of triple sizes (thus holes
  are allowed)

btf_var_resolve(), btf_ptr_resolve() and btf_modifier_resolve()
are on a high level quite similar but each come with slight,
subtle differences. They could potentially be a bit refactored
in future which hasn't been done here to ease review.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:46 -07:00
David S. Miller a655fe9f19 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.

Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:00:17 -08:00
Alexei Starovoitov d83525ca62 bpf: introduce bpf_spin_lock
Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let
bpf program serialize access to other variables.

Example:
struct hash_elem {
    int cnt;
    struct bpf_spin_lock lock;
};
struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
if (val) {
    bpf_spin_lock(&val->lock);
    val->cnt++;
    bpf_spin_unlock(&val->lock);
}

Restrictions and safety checks:
- bpf_spin_lock is only allowed inside HASH and ARRAY maps.
- BTF description of the map is mandatory for safety analysis.
- bpf program can take one bpf_spin_lock at a time, since two or more can
  cause dead locks.
- only one 'struct bpf_spin_lock' is allowed per map element.
  It drastically simplifies implementation yet allows bpf program to use
  any number of bpf_spin_locks.
- when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not allowed.
- bpf program must bpf_spin_unlock() before return.
- bpf program can access 'struct bpf_spin_lock' only via
  bpf_spin_lock()/bpf_spin_unlock() helpers.
- load/store into 'struct bpf_spin_lock lock;' field is not allowed.
- to use bpf_spin_lock() helper the BTF description of map value must be
  a struct and have 'struct bpf_spin_lock anyname;' field at the top level.
  Nested lock inside another struct is not allowed.
- syscall map_lookup doesn't copy bpf_spin_lock field to user space.
- syscall map_update and program map_update do not update bpf_spin_lock field.
- bpf_spin_lock cannot be on the stack or inside networking packet.
  bpf_spin_lock can only be inside HASH or ARRAY map value.
- bpf_spin_lock is available to root only and to all program types.
- bpf_spin_lock is not allowed in inner maps of map-in-map.
- ld_abs is not allowed inside spin_lock-ed region.
- tracing progs and socket filter progs cannot use bpf_spin_lock due to
  insufficient preemption checks

Implementation details:
- cgroup-bpf class of programs can nest with xdp/tc programs.
  Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
  Other solutions to avoid nested bpf_spin_lock are possible.
  Like making sure that all networking progs run with softirq disabled.
  spin_lock_irqsave is the simplest and doesn't add overhead to the
  programs that don't use it.
- arch_spinlock_t is used when its implemented as queued_spin_lock
- archs can force their own arch_spinlock_t
- on architectures where queued_spin_lock is not available and
  sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used.
- presence of bpf_spin_lock inside map value could have been indicated via
  extra flag during map_create, but specifying it via BTF is cleaner.
  It provides introspection for map key/value and reduces user mistakes.

Next steps:
- allow bpf_spin_lock in other map types (like cgroup local storage)
- introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
  to request kernel to grab bpf_spin_lock before rewriting the value.
  That will serialize access to map elements.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:38 +01:00
Yonghong Song 81f5c6f5db bpf: btf: allow typedef func_proto
Current implementation does not allow typedef func_proto.
But it is actually allowed.
  -bash-4.4$ cat t.c
  typedef int (f) (int);
  f *g;
  -bash-4.4$ clang -O2 -g -c -target bpf t.c -Xclang -target-feature -Xclang +dwarfris
  -bash-4.4$ pahole -JV t.o
  File t.o:
  [1] PTR (anon) type_id=2
  [2] TYPEDEF f type_id=3
  [3] FUNC_PROTO (anon) return=4 args=(4 (anon))
  [4] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  -bash-4.4$

This patch related btf verifier to allow such (typedef func_proto)
patterns.

Fixes: 2667a2626f ("bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-29 19:15:32 -08:00
David S. Miller ec7146db15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
   removing conditional branches around dead code and to shrink the
   resulting image. Code store constrained architectures like nfp would
   have hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
   code generation for 32-bit sub-registers. Evaluation shows that this
   can result in code reduction of ~5-20% compared to 64 bit-only code
   generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
   vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
   parameters from the system or for a specific network device e.g. in
   terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
   space which provides information about the RX/TX/fill/completion
   rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
   existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
   in order better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
    failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
    handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 19:38:33 -08:00
Mathieu Malaterre 583c531853 bpf: Make function btf_name_offset_valid static
Initially in commit 69b693f0ae ("bpf: btf: Introduce BPF Type Format
(BTF)") the function 'btf_name_offset_valid' was introduced as static
function it was later on changed to a non-static one, and then finally
in commit 23127b33ec ("bpf: Create a new btf_name_by_offset() for
non type name use case") the function prototype was removed.

Revert back to original implementation and make the function static.
Remove warning triggered with W=1:

  kernel/bpf/btf.c:470:6: warning: no previous prototype for 'btf_name_offset_valid' [-Wmissing-prototypes]

Fixes: 23127b33ec ("bpf: Create a new btf_name_by_offset() for non type name use case")
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-17 16:47:05 +01:00
Yonghong Song b1e8818cab bpf: btf: support 128 bit integer type
Currently, btf only supports up to 64-bit integer.
On the other hand, 128bit support for gcc and clang
has existed for a long time. For example, both gcc 4.8
and llvm 3.7 supports types "__int128" and
"unsigned __int128" for virtually all 64bit architectures
including bpf.

The requirement for __int128 support comes from two areas:
  . bpf program may use __int128. For example, some bcc tools
    (https://github.com/iovisor/bcc/tree/master/tools),
    mostly tcp v6 related, tcpstates.py, tcpaccept.py, etc.,
    are using __int128 to represent the ipv6 addresses.
  . linux itself is using __int128 types. Hence supporting
    __int128 type in BTF is required for vmlinux BTF,
    which will be used by "compile once and run everywhere"
    and other projects.

For 128bit integer, instead of base-10, hex numbers are pretty
printed out as large decimal number is hard to decipher, e.g.,
for ipv6 addresses.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-16 22:53:44 +01:00
Yonghong Song 17e3ac8125 bpf: fix bpffs bitfield pretty print
Commit 9d5f9f701b ("bpf: btf: fix struct/union/fwd types
with kind_flag") introduced kind_flag and used bitfield_size
in the btf_member to directly pretty print member values.

The commit contained a bug where the incorrect parameters could be
passed to function btf_bitfield_seq_show(). The bits_offset
parameter in the function expects a value less than 8.
Instead, the member offset in the structure is passed.

The below is btf_bitfield_seq_show() func signature:
  void btf_bitfield_seq_show(void *data, u8 bits_offset,
                             u8 nr_bits, struct seq_file *m)
both bits_offset and nr_bits are u8 type. If the bitfield
member offset is greater than 256, incorrect value will
be printed.

This patch fixed the issue by calculating correct proper
data offset and bits_offset similar to non kind_flag case.

Fixes: 9d5f9f701b ("bpf: btf: fix struct/union/fwd types with kind_flag")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-11 10:40:54 +01:00
Yonghong Song 76c43ae84e bpf: log struct/union attribute for forward type
Current btf internal verbose logger logs fwd type as
  [2] FWD A type_id=0
where A is the type name.

Commit 9d5f9f701b ("bpf: btf: fix struct/union/fwd types
with kind_flag") introduced kind_flag which can be used
to distinguish whether a forward type is a struct or
union.

Also, "type_id=0" does not carry any meaningful
information for fwd type as btf_type.type = 0 is simply
enforced during btf verification and is not used
anywhere else.

This commit changed the log to
  [2] FWD A struct
if kind_flag = 0, or
  [2] FWD A union
if kind_flag = 1.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-19 00:47:56 +01:00
Yonghong Song ffa0c1cf59 bpf: enable cgroup local storage map pretty print with kind_flag
Commit 970289fc0a83 ("bpf: add bpffs pretty print for cgroup
local storage maps") added bpffs pretty print for cgroup
local storage maps. The commit worked for struct without kind_flag
set.

This patch refactored and made pretty print also work
with kind_flag set for the struct.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:11:59 +01:00
Yonghong Song 9d5f9f701b bpf: btf: fix struct/union/fwd types with kind_flag
This patch fixed two issues with BTF. One is related to
struct/union bitfield encoding and the other is related to
forward type.

Issue #1 and solution:

======================

Current btf encoding of bitfield follows what pahole generates.
For each bitfield, pahole will duplicate the type chain and
put the bitfield size at the final int or enum type.
Since the BTF enum type cannot encode bit size,
pahole workarounds the issue by generating
an int type whenever the enum bit size is not 32.

For example,
  -bash-4.4$ cat t.c
  typedef int ___int;
  enum A { A1, A2, A3 };
  struct t {
    int a[5];
    ___int b:4;
    volatile enum A c:4;
  } g;
  -bash-4.4$ gcc -c -O2 -g t.c
The current kernel supports the following BTF encoding:
  $ pahole -JV t.o
  [1] TYPEDEF ___int type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [4] STRUCT t size=24 vlen=3
        a type_id=5 bits_offset=0
        b type_id=9 bits_offset=160
        c type_id=11 bits_offset=164
  [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
  [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [7] VOLATILE (anon) type_id=3
  [8] INT int size=1 bit_offset=0 nr_bits=4 encoding=(none)
  [9] TYPEDEF ___int type_id=8
  [10] INT (anon) size=1 bit_offset=0 nr_bits=4 encoding=SIGNED
  [11] VOLATILE (anon) type_id=10

Two issues are in the above:
  . by changing enum type to int, we lost the original
    type information and this will not be ideal later
    when we try to convert BTF to a header file.
  . the type duplication for bitfields will cause
    BTF bloat. Duplicated types cannot be deduplicated
    later if the bitfield size is different.

To fix this issue, this patch implemented a compatible
change for BTF struct type encoding:
  . the bit 31 of struct_type->info, previously reserved,
    now is used to indicate whether bitfield_size is
    encoded in btf_member or not.
  . if bit 31 of struct_type->info is set,
    btf_member->offset will encode like:
      bit 0 - 23: bit offset
      bit 24 - 31: bitfield size
    if bit 31 is not set, the old behavior is preserved:
      bit 0 - 31: bit offset

So if the struct contains a bit field, the maximum bit offset
will be reduced to (2^24 - 1) instead of MAX_UINT. The maximum
bitfield size will be 256 which is enough for today as maximum
bitfield in compiler can be 128 where int128 type is supported.

This kernel patch intends to support the new BTF encoding:
  $ pahole -JV t.o
  [1] TYPEDEF ___int type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [4] STRUCT t kind_flag=1 size=24 vlen=3
        a type_id=5 bitfield_size=0 bits_offset=0
        b type_id=1 bitfield_size=4 bits_offset=160
        c type_id=7 bitfield_size=4 bits_offset=164
  [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
  [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [7] VOLATILE (anon) type_id=3

Issue #2 and solution:
======================

Current forward type in BTF does not specify whether the original
type is struct or union. This will not work for type pretty print
and BTF-to-header-file conversion as struct/union must be specified.
  $ cat tt.c
  struct t;
  union u;
  int foo(struct t *t, union u *u) { return 0; }
  $ gcc -c -g -O2 tt.c
  $ pahole -JV tt.o
  [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [2] FWD t type_id=0
  [3] PTR (anon) type_id=2
  [4] FWD u type_id=0
  [5] PTR (anon) type_id=4

To fix this issue, similar to issue #1, type->info bit 31
is used. If the bit is set, it is union type. Otherwise, it is
a struct type.

  $ pahole -JV tt.o
  [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [2] FWD t kind_flag=0 type_id=0
  [3] PTR (anon) kind_flag=0 type_id=2
  [4] FWD u kind_flag=1 type_id=0
  [5] PTR (anon) kind_flag=0 type_id=4

Pahole/LLVM change:
===================

The new kind_flag functionality has been implemented in pahole
and llvm:
  https://github.com/yonghong-song/pahole/tree/bitfield
  https://github.com/yonghong-song/llvm/tree/bitfield

Note that pahole hasn't implemented func/func_proto kind
and .BTF.ext. So to print function signature with bpftool,
the llvm compiler should be used.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:11:59 +01:00
Yonghong Song f97be3ab04 bpf: btf: refactor btf_int_bits_seq_show()
Refactor function btf_int_bits_seq_show() by creating
function btf_bitfield_seq_show() which has no dependence
on btf and btf_type. The function btf_bitfield_seq_show()
will be in later patch to directly dump bitfield member values.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:11:59 +01:00
Martin KaFai Lau 23127b33ec bpf: Create a new btf_name_by_offset() for non type name use case
The current btf_name_by_offset() is returning "(anon)" type name for
the offset == 0 case and "(invalid-name-offset)" for the out-of-bound
offset case.

It fits well for the internal BTF verbose log purpose which
is focusing on type.  For example,
offset == 0 => "(anon)" => anonymous type/name.
Returning non-NULL for the bad offset case is needed
during the BTF verification process because the BTF verifier may
complain about another field first before discovering the name_off
is invalid.

However, it may not be ideal for the newer use case which does not
necessary mean type name.  For example, when logging line_info
in the BPF verifier in the next patch, it is better to log an
empty src line instead of logging "(anon)".

The existing bpf_name_by_offset() is renamed to __bpf_name_by_offset()
and static to btf.c.

A new bpf_name_by_offset() is added for generic context usage.  It
returns "\0" for name_off == 0 (note that btf->strings[0] is "\0")
and NULL for invalid offset.  It allows the caller to decide
what is the best output in its context.

The new btf_name_by_offset() is overlapped with btf_name_offset_valid().
Hence, btf_name_offset_valid() is removed from btf.h to keep the btf.h API
minimal.  The existing btf_name_offset_valid() usage in btf.c could also be
replaced later.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-14 14:17:34 -08:00
Roman Gushchin 9a1126b631 bpf: add bpffs pretty print for cgroup local storage maps
Implement bpffs pretty printing for cgroup local storage maps
(both shared and per-cpu).
Output example (captured for tools/testing/selftests/bpf/netcnt_prog.c):

Shared:
  $ cat /sys/fs/bpf/map_2
  # WARNING!! The output is for debug purpose only
  # WARNING!! The output format will change
  {4294968594,1}: {9999,1039896}

Per-cpu:
  $ cat /sys/fs/bpf/map_1
  # WARNING!! The output is for debug purpose only
  # WARNING!! The output format will change
  {4294968594,1}: {
  	cpu0: {0,0,0,0,0}
  	cpu1: {0,0,0,0,0}
  	cpu2: {1,104,0,0,0}
  	cpu3: {0,0,0,0,0}
  }

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-12 15:33:38 -08:00
David S. Miller addb067983 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-11

The following pull-request contains BPF updates for your *net-next* tree.

It has three minor merge conflicts, resolutions:

1) tools/testing/selftests/bpf/test_verifier.c

 Take first chunk with alignment_prevented_execution.

2) net/core/filter.c

  [...]
  case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
  case bpf_ctx_range(struct __sk_buff, wire_len):
        return false;
  [...]

3) include/uapi/linux/bpf.h

  Take the second chunk for the two cases each.

The main changes are:

1) Add support for BPF line info via BTF and extend libbpf as well
   as bpftool's program dump to annotate output with BPF C code to
   facilitate debugging and introspection, from Martin.

2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter
   and all JIT backends, from Jiong.

3) Improve BPF test coverage on archs with no efficient unaligned
   access by adding an "any alignment" flag to the BPF program load
   to forcefully disable verifier alignment checks, from David.

4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for
   proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz.

5) Extend tc BPF programs to use a new __sk_buff field called wire_len
   for more accurate accounting of packets going to wire, from Petar.

6) Improve bpftool to allow dumping the trace pipe from it and add
   several improvements in bash completion and map/prog dump,
   from Quentin.

7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for
   kernel addresses and add a dedicated BPF JIT backend allocator,
   from Ard.

8) Add a BPF helper function for IR remotes to report mouse movements,
   from Sean.

9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info
   member naming consistent with existing conventions, from Yonghong
   and Song.

10) Misc cleanups and improvements in allowing to pass interface name
    via cmdline for xdp1 BPF example, from Matteo.

11) Fix a potential segfault in BPF sample loader's kprobes handling,
    from Daniel T.

12) Fix SPDX license in libbpf's README.rst, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10 18:00:43 -08:00
David S. Miller 4cc1feeb6f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several conflicts, seemingly all over the place.

I used Stephen Rothwell's sample resolutions for many of these, if not
just to double check my own work, so definitely the credit largely
goes to him.

The NFP conflict consisted of a bug fix (moving operations
past the rhashtable operation) while chaning the initial
argument in the function call in the moved code.

The net/dsa/master.c conflict had to do with a bug fix intermixing of
making dsa_master_set_mtu() static with the fixing of the tagging
attribute location.

cls_flower had a conflict because the dup reject fix from Or
overlapped with the addition of port range classifiction.

__set_phy_supported()'s conflict was relatively easy to resolve
because Andrew fixed it in both trees, so it was just a matter
of taking the net-next copy.  Or at least I think it was :-)

Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup()
intermixed with changes on how the sdif and caller_net are calculated
in these code paths in net-next.

The remaining BPF conflicts were largely about the addition of the
__bpf_md_ptr stuff in 'net' overlapping with adjustments and additions
to the relevant data structure where the MD pointer macros are used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 21:43:31 -08:00
Martin KaFai Lau c454a46b5e bpf: Add bpf_line_info support
This patch adds bpf_line_info support.

It accepts an array of bpf_line_info objects during BPF_PROG_LOAD.
The "line_info", "line_info_cnt" and "line_info_rec_size" are added
to the "union bpf_attr".  The "line_info_rec_size" makes
bpf_line_info extensible in the future.

The new "check_btf_line()" ensures the userspace line_info is valid
for the kernel to use.

When the verifier is translating/patching the bpf_prog (through
"bpf_patch_insn_single()"), the line_infos' insn_off is also
adjusted by the newly added "bpf_adj_linfo()".

If the bpf_prog is jited, this patch also provides the jited addrs (in
aux->jited_linfo) for the corresponding line_info.insn_off.
"bpf_prog_fill_jited_linfo()" is added to fill the aux->jited_linfo.
It is currently called by the x86 jit.  Other jits can also use
"bpf_prog_fill_jited_linfo()" and it will be done in the followup patches.
In the future, if it deemed necessary, a particular jit could also provide
its own "bpf_prog_fill_jited_linfo()" implementation.

A few "*line_info*" fields are added to the bpf_prog_info such
that the user can get the xlated line_info back (i.e. the line_info
with its insn_off reflecting the translated prog).  The jited_line_info
is available if the prog is jited.  It is an array of __u64.
If the prog is not jited, jited_line_info_cnt is 0.

The verifier's verbose log with line_info will be done in
a follow up patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09 13:54:38 -08:00
Yonghong Song eb04bbb608 bpf: btf: check name validity for various types
This patch added name checking for the following types:
 . BTF_KIND_PTR, BTF_KIND_ARRAY, BTF_KIND_VOLATILE,
   BTF_KIND_CONST, BTF_KIND_RESTRICT:
     the name must be null
 . BTF_KIND_STRUCT, BTF_KIND_UNION: the struct/member name
     is either null or a valid identifier
 . BTF_KIND_ENUM: the enum type name is either null or a valid
     identifier; the enumerator name must be a valid identifier.
 . BTF_KIND_FWD: the name must be a valid identifier
 . BTF_KIND_TYPEDEF: the name must be a valid identifier

For those places a valid name is required, the name must be
a valid C identifier. This can be relaxed later if we found
use cases for a different (non-C) frontend.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-28 16:03:04 -08:00
Yonghong Song cdbb096add bpf: btf: implement btf_name_valid_identifier()
Function btf_name_valid_identifier() have been implemented in
bpf-next commit 2667a2626f ("bpf: btf: Add BTF_KIND_FUNC and
BTF_KIND_FUNC_PROTO"). Backport this function so later patch
can use it.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-28 16:03:04 -08:00
Colin Ian King 311fe1a813 bpf: btf: fix spelling mistake "Memmber" -> "Member"
There is a spelling mistake in a btf_verifier_log_member message,
fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-26 01:28:16 +01:00
Yonghong Song 838e96904f bpf: Introduce bpf_func_info
This patch added interface to load a program with the following
additional information:
   . prog_btf_fd
   . func_info, func_info_rec_size and func_info_cnt
where func_info will provide function range and type_id
corresponding to each function.

The func_info_rec_size is introduced in the UAPI to specify
struct bpf_func_info size passed from user space. This
intends to make bpf_func_info structure growable in the future.
If the kernel gets a different bpf_func_info size from userspace,
it will try to handle user request with part of bpf_func_info
it can understand. In this patch, kernel can understand
  struct bpf_func_info {
       __u32   insn_offset;
       __u32   type_id;
  };
If user passed a bpf func_info record size of 16 bytes, the
kernel can still handle part of records with the above definition.

If verifier agrees with function range provided by the user,
the bpf_prog ksym for each function will use the func name
provided in the type_id, which is supposed to provide better
encoding as it is not limited by 16 bytes program name
limitation and this is better for bpf program which contains
multiple subprograms.

The bpf_prog_info interface is also extended to
return btf_id, func_info, func_info_rec_size and func_info_cnt
to userspace, so userspace can print out the function prototype
for each xlated function. The insn_offset in the returned
func_info corresponds to the insn offset for xlated functions.
With other jit related fields in bpf_prog_info, userspace can also
print out function prototypes for each jited function.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:39 -08:00
Martin KaFai Lau 2667a2626f bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
This patch adds BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
to support the function debug info.

BTF_KIND_FUNC_PROTO must not have a name (i.e. !t->name_off)
and it is followed by >= 0 'struct bpf_param' objects to
describe the function arguments.

The BTF_KIND_FUNC must have a valid name and it must
refer back to a BTF_KIND_FUNC_PROTO.

The above is the conclusion after the discussion between
Edward Cree, Alexei, Daniel, Yonghong and Martin.

By combining BTF_KIND_FUNC and BTF_LIND_FUNC_PROTO,
a complete function signature can be obtained.  It will be
used in the later patches to learn the function signature of
a running bpf program.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:38 -08:00
Martin KaFai Lau b47a0bd23e bpf: btf: Break up btf_type_is_void()
This patch breaks up btf_type_is_void() into
btf_type_is_void() and btf_type_is_fwd().

It also adds btf_type_nosize() to better describe it is
testing a type has nosize info.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:38 -08:00
Martin Lau 4a6998aff8 bpf, btf: fix a missing check bug in btf_parse
Wenwen Wang reported:

  In btf_parse(), the header of the user-space btf data 'btf_data'
  is firstly parsed and verified through btf_parse_hdr().
  In btf_parse_hdr(), the header is copied from user-space 'btf_data'
  to kernel-space 'btf->hdr' and then verified. If no error happens
  during the verification process, the whole data of 'btf_data',
  including the header, is then copied to 'data' in btf_parse(). It
  is obvious that the header is copied twice here. More importantly,
  no check is enforced after the second copy to make sure the headers
  obtained in these two copies are same. Given that 'btf_data' resides
  in the user space, a malicious user can race to modify the header
  between these two copies. By doing so, the user can inject
  inconsistent data, which can cause undefined behavior of the
  kernel and introduce potential security risk.

This issue is similar to the one fixed in commit 8af03d1ae2 ("bpf:
btf: Fix a missing check bug"). To fix it, this patch copies the user
'btf_data' *before* parsing / verifying the BTF header.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Co-developed-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-26 00:42:03 +02:00
Wenwen Wang 8af03d1ae2 bpf: btf: Fix a missing check bug
In btf_parse_hdr(), the length of the btf data header is firstly copied
from the user space to 'hdr_len' and checked to see whether it is larger
than 'btf_data_size'. If yes, an error code EINVAL is returned. Otherwise,
the whole header is copied again from the user space to 'btf->hdr'.
However, after the second copy, there is no check between
'btf->hdr->hdr_len' and 'hdr_len' to confirm that the two copies get the
same value. Given that the btf data is in the user space, a malicious user
can race to change the data between the two copies. By doing so, the user
can provide malicious data to the kernel and cause undefined behavior.

This patch adds a necessary check after the second copy, to make sure
'btf->hdr->hdr_len' has the same value as 'hdr_len'. Otherwise, an error
code EINVAL will be returned.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-09 21:42:51 -07:00
Martin KaFai Lau 4b1c5d917d bpf: btf: Fix end boundary calculation for type section
The end boundary math for type section is incorrect in
btf_check_all_metas().  It just happens that hdr->type_off
is always 0 for now because there are only two sections
(type and string) and string section must be at the end (ensured
in btf_parse_str_sec).

However, type_off may not be 0 if a new section would be added later.
This patch fixes it.

Fixes: f80442a4cd ("bpf: btf: Change how section is supported in btf_header")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-12 22:00:23 +02:00
Martin KaFai Lau 6283fa38dc bpf: btf: Ensure the member->offset is in the right order
This patch ensures the member->offset of a struct
is in the correct order (i.e the later member's offset cannot
go backward).

The current "pahole -J" BTF encoder does not generate something
like this.  However, checking this can ensure future encoder
will not violate this.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-24 01:20:44 +02:00
Martin KaFai Lau 36fc3c8c28 bpf: btf: Clean up BTF_INT_BITS() in uapi btf.h
This patch shrinks the BTF_INT_BITS() mask.  The current
btf_int_check_meta() ensures the nr_bits of an integer
cannot exceed 64.  Hence, it is mostly an uapi cleanup.

The actual btf usage (i.e. seq_show()) is also modified
to use u8 instead of u16.  The verification (e.g. btf_int_check_meta())
path stays as is to deal with invalid BTF situation.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-20 10:25:48 +02:00
Okash Khawaja b65f370d06 bpf: btf: Fix bitfield extraction for big endian
When extracting bitfield from a number, btf_int_bits_seq_show() builds
a mask and accesses least significant byte of the number in a way
specific to little-endian. This patch fixes that by checking endianness
of the machine and then shifting left and right the unneeded bits.

Thanks to Martin Lau for the help in navigating potential pitfalls when
dealing with endianess and for the final solution.

Fixes: b00b8daec8 ("bpf: btf: Add pretty print capability for data with BTF type info")
Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-11 22:36:08 +02:00
Kees Cook 778e1cdd81 treewide: kvzalloc() -> kvcalloc()
The kvzalloc() function has a 2-factor argument form, kvcalloc(). This
patch replaces cases of:

        kvzalloc(a * b, gfp)

with:
        kvcalloc(a * b, gfp)

as well as handling cases of:

        kvzalloc(a * b * c, gfp)

with:

        kvzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kvcalloc(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kvzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kvzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kvzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kvzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kvzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kvzalloc
+ kvcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kvzalloc
+ kvcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kvzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kvzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kvzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kvzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kvzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kvzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kvzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kvzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kvzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kvzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kvzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kvzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kvzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kvzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kvzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kvzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kvzalloc(C1 * C2 * C3, ...)
|
  kvzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kvzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kvzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kvzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kvzalloc(sizeof(THING) * C2, ...)
|
  kvzalloc(sizeof(TYPE) * C2, ...)
|
  kvzalloc(C1 * C2 * C3, ...)
|
  kvzalloc(C1 * C2, ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kvzalloc
+ kvcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Martin KaFai Lau 8175383f23 bpf: btf: Ensure t->type == 0 for BTF_KIND_FWD
The t->type in BTF_KIND_FWD is not used.  It must be 0.
This patch ensures that and also adds a test case in test_btf.c

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-02 11:22:36 -07:00
Martin KaFai Lau b9308ae696 bpf: btf: Check array t->size
This patch ensures array's t->size is 0.

The array size is decided by its individual elem's size and the
number of elements.  Hence, t->size is not used and
it must be 0.

A test case is added to test_btf.c

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-02 11:22:36 -07:00
Arnd Bergmann 53c8036cb7 bpf: btf: avoid -Wreturn-type warning
gcc warns about a noreturn function possibly returning in
some configurations:

kernel/bpf/btf.c: In function 'env_type_is_resolve_sink':
kernel/bpf/btf.c:729:1: error: control reaches end of non-void function [-Werror=return-type]

Using BUG() instead of BUG_ON() avoids that warning and otherwise
does the exact same thing.

Fixes: eb3f595dab ("bpf: btf: Validate type reference")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-28 17:40:58 +02:00