4 Commits

Author SHA1 Message Date
Jerome Marchand 773bccce1d bpf: Handle bpf_mprog_query with NULL entry
JIRA: https://issues.redhat.com/browse/RHEL-10691

Conflicts: Apply only the multi-progs part of the patch since tcx is
missing.

commit edfa9af0a73ecc2000d7bb81d0b0fd3158cc9a65
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Sat Oct 7 00:06:50 2023 +0200

    bpf: Handle bpf_mprog_query with NULL entry

    Improve consistency for bpf_mprog_query() API and let the latter also handle
    a NULL entry as can be the case for tcx. Instead of returning -ENOENT, we
    copy a count of 0 and revision of 1 to user space, so that this can be fed
    into a subsequent bpf_mprog_attach() call as expected_revision. A BPF self-
    test as part of this series has been added to assert this case.
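
The NULL-entry behaviour described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `mprog_entry_model`, `mprog_query_result` and `mprog_query_model()` are hypothetical names invented for illustration and are not the kernel's structures or the real bpf_mprog_query() signature.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct mprog_entry_model {
	uint32_t count;     /* number of attached programs */
	uint64_t revision;  /* current revision of the entry */
};

struct mprog_query_result {
	uint32_t count;
	uint64_t revision;
};

/*
 * Model of the query behaviour: a NULL entry is not an error. It is
 * reported as zero programs with revision 1, so the caller can pass
 * that revision as expected_revision to a following attach.
 */
static int mprog_query_model(const struct mprog_entry_model *entry,
			     struct mprog_query_result *out)
{
	if (!entry) {
		out->count = 0;
		out->revision = 1;
		return 0;
	}
	out->count = entry->count;
	out->revision = entry->revision;
	return 0;
}
```

In this model a caller never has to special-case "nothing attached yet": it always gets a count and a revision back.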

    Suggested-by: Lorenz Bauer <lmb@isovalent.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/r/20231006220655.1653-2-daniel@iogearbox.net
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-15 09:29:05 +01:00
Jerome Marchand 1acd65b8a6 bpf, mprog: Fix maximum program check on mprog attachment
JIRA: https://issues.redhat.com/browse/RHEL-10691

commit f9b0e1088bbf35933e25c839b75094039059b3be
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Sep 29 22:41:20 2023 +0200

    bpf, mprog: Fix maximum program check on mprog attachment

    After Paul's recent improvement to syzkaller to improve coverage for
    bpf_mprog and tcx, it hit a splat that the program limit was surpassed.
    What happened is that the maximum number of progs got added, followed
    by another prog add request which adds with BPF_F_BEFORE flag relative
    to the last program in the array. The idx >= bpf_mprog_max() check in
    bpf_mprog_attach() still passes because the index is below the maximum
    but the maximum will be surpassed. We need to add a check upfront for
    insertions to catch this situation.
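
The failure mode can be reproduced in a small user-space model of an array insertion. This is a sketch, not the kernel code: `MODEL_MAX_PROGS`, `mprog_array_model` and `mprog_insert_model()` are hypothetical, and the `-ERANGE` return value is an assumption chosen for illustration. The point is that an index check alone is not enough, because inserting before the last element of a full array passes it and still grows the array by one.

```c
#include <errno.h>
#include <string.h>

#define MODEL_MAX_PROGS 4	/* assumed small limit, for illustration only */

struct mprog_array_model {
	int ids[MODEL_MAX_PROGS];
	int total;
};

/*
 * Insert prog_id so that it ends up at position idx (0 <= idx <= total),
 * shifting the programs at idx and later back by one slot.
 */
static int mprog_insert_model(struct mprog_array_model *a, int idx, int prog_id)
{
	/* Upfront check: every insertion grows the array by one element. */
	if (a->total >= MODEL_MAX_PROGS)
		return -ERANGE;
	if (idx < 0 || idx > a->total)
		return -EINVAL;
	memmove(&a->ids[idx + 1], &a->ids[idx],
		(size_t)(a->total - idx) * sizeof(a->ids[0]));
	a->ids[idx] = prog_id;
	a->total++;
	return 0;
}
```

Without the first check, an insertion at index `MODEL_MAX_PROGS - 1` into a full array would shift the last element past the end of `ids`.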

    Fixes: 053c8e1f235d ("bpf: Add generic attach/detach/query API for multi-progs")
    Reported-by: syzbot+baa44e3dbbe48e05c1ad@syzkaller.appspotmail.com
    Reported-by: syzbot+b97d20ed568ce0951a06@syzkaller.appspotmail.com
    Reported-by: syzbot+2558ca3567a77b7af4e3@syzkaller.appspotmail.com
    Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Tested-by: syzbot+baa44e3dbbe48e05c1ad@syzkaller.appspotmail.com
    Tested-by: syzbot+b97d20ed568ce0951a06@syzkaller.appspotmail.com
    Link: https://github.com/google/syzkaller/pull/4207
    Link: https://lore.kernel.org/bpf/20230929204121.20305-1-daniel@iogearbox.net

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-15 09:29:05 +01:00
Jerome Marchand 0d4f4173a8 bpf: Fix mprog detachment for empty mprog entry
JIRA: https://issues.redhat.com/browse/RHEL-10691

commit d210f9735e13e9b1ef6ffbb636ee051f615bd109
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Aug 4 15:11:11 2023 +0200

    bpf: Fix mprog detachment for empty mprog entry

    syzbot reported an UBSAN array-index-out-of-bounds access in bpf_mprog_read()
    upon bpf_mprog_detach(). While it did not have a reproducer, I was able to
    manually reproduce through an empty mprog entry which just has miniq present.

    The latter is important given otherwise we get an ENOENT error as tcx detaches
    the whole mprog entry. The index 4294967295 was triggered via NULL dtuple.prog
    which then attempts to detach from the back. bpf_mprog_fetch() in this case
    did hit the idx == total and therefore tried to grab the entry at idx -1.

    Fix it by adding an explicit bpf_mprog_total() check in bpf_mprog_detach() and
    bail out early with ENOENT.
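
A minimal user-space sketch of the same guard, assuming a simplified list type: `mprog_list_model` and `mprog_detach_last_model()` are hypothetical names and do not mirror the kernel's bpf_mprog_detach() implementation. The model uses an unsigned counter so that the wrap-around is visible in the arithmetic.

```c
#include <errno.h>

#define MODEL_SLOTS 4	/* assumed capacity, for illustration only */

struct mprog_list_model {
	int ids[MODEL_SLOTS];
	unsigned int total;
};

/* Detach "from the back": remove the last program and report its id. */
static int mprog_detach_last_model(struct mprog_list_model *l, int *removed_id)
{
	/*
	 * Explicit emptiness check. Without it, total - 1 would wrap
	 * around to UINT_MAX (4294967295 with a 32-bit unsigned int)
	 * and index far outside the array.
	 */
	if (l->total == 0)
		return -ENOENT;
	*removed_id = l->ids[l->total - 1];
	l->total--;
	return 0;
}
```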

    Fixes: 053c8e1f235d ("bpf: Add generic attach/detach/query API for multi-progs")
    Reported-by: syzbot+0c06ba0f831fe07a8f27@syzkaller.appspotmail.com
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/r/20230804131112.11012-1-daniel@iogearbox.net
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-15 09:28:55 +01:00
Jerome Marchand 73b130693e bpf: Add generic attach/detach/query API for multi-progs
JIRA: https://issues.redhat.com/browse/RHEL-10691

Conflicts: The MAINTAINERS file has been reworked upstream. The added
file is already covered by the generic BPF section, which contains the
whole kernel/bpf/ directory.

commit 053c8e1f235dc3f69d13375b32f4209228e1cb96
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed Jul 19 16:08:51 2023 +0200

    bpf: Add generic attach/detach/query API for multi-progs

    This adds a generic layer called bpf_mprog which can be reused by different
    attachment layers to enable multi-program attachment and dependency resolution.
    In-kernel users of the bpf_mprog don't need to care about the dependency
    resolution internals, they can just consume it with few API calls.

    The initial idea of having a generic API sparked out of discussion [0] from an
    earlier revision of this work where tc's priority was reused and exposed via
    BPF uapi as a way to coordinate dependencies among tc BPF programs, similar
    to how it is done for classic tc BPF. The feedback was that priority provides
    a bad user experience and is hard to use [1], e.g.:

      I cannot help but feel that priority logic copy-paste from old tc, netfilter
      and friends is done because "that's how things were done in the past". [...]
      Priority gets exposed everywhere in uapi all the way to bpftool when it's
      right there for users to understand. And that's the main problem with it.

      The user don't want to and don't need to be aware of it, but uapi forces them
      to pick the priority. [...] Your cover letter [0] example proves that in
      real life different service pick the same priority. They simply don't know
      any better. Priority is an unnecessary magic that apps _have_ to pick, so
      they just copy-paste and everyone ends up using the same.

    The course of the discussion showed more and more the need for a generic,
    reusable API where the "same look and feel" can be applied for various other
    program types beyond just tc BPF, for example XDP today does not have multi-
    program support in kernel, but also there was interest around this API for
    improving management of cgroup program types. Such common multi-program
    management concept is useful for BPF management daemons or user space BPF
    applications coordinating internally about their attachments.

    Both from Cilium and Meta side [2], we've collected the following requirements
    for a generic attach/detach/query API for multi-progs which has been implemented
    as part of this work:

      - Support prog-based attach/detach and link API
      - Dependency directives (can also be combined):
        - BPF_F_{BEFORE,AFTER} with relative_{fd,id} which can be {prog,link,none}
          - BPF_F_ID flag as {fd,id} toggle; the rationale for id is so that user
            space application does not need CAP_SYS_ADMIN to retrieve foreign fds
            via bpf_*_get_fd_by_id()
          - BPF_F_LINK flag as {prog,link} toggle
          - If relative_{fd,id} is none, then BPF_F_BEFORE will just prepend, and
            BPF_F_AFTER will just append for attaching
          - Enforced only at attach time
        - BPF_F_REPLACE with replace_bpf_fd which can be prog, links have their
          own infra for replacing their internal prog
        - If no flags are set, then it's default append behavior for attaching
      - Internal revision counter and optionally being able to pass expected_revision
      - User space application can query current state with revision, and pass it
        along for attachment to assert current state before doing updates
      - Query also gets extension for link_ids array and link_attach_flags:
        - prog_ids are always filled with program IDs
        - link_ids are filled with link IDs when link was used, otherwise 0
        - {prog,link}_attach_flags for holding {prog,link}-specific flags
      - Must be easy to integrate/reuse for in-kernel users
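
The placement rules in the list above can be sketched as a small position resolver. This is an illustrative model only: `placement_model` and `mprog_position_model()` are hypothetical, program IDs are plain integers with 0 standing for "no relative program", a single directive is handled at a time (combined directives are not modeled), and returning `-EINVAL` for a relative program without a directive is a choice made for this sketch rather than documented kernel behaviour.

```c
#include <errno.h>

/* Hypothetical directive type for this model; not the uapi flag bits. */
enum placement_model { PLACE_DEFAULT, PLACE_BEFORE, PLACE_AFTER };

/*
 * Return the index at which a new program would be inserted into
 * ids[0..total), or a negative errno-style value.
 */
static int mprog_position_model(const int *ids, int total,
				enum placement_model where, int relative_id)
{
	int i;

	if (relative_id == 0) {
		/* No relative program: BEFORE prepends, the rest appends. */
		return where == PLACE_BEFORE ? 0 : total;
	}
	for (i = 0; i < total; i++) {
		if (ids[i] != relative_id)
			continue;
		if (where == PLACE_BEFORE)
			return i;
		if (where == PLACE_AFTER)
			return i + 1;
		return -EINVAL;	/* model choice: relative without directive */
	}
	return -ENOENT;		/* relative program is not attached */
}
```

Because the position is resolved against the current array contents, the directive only constrains ordering at the moment of attachment, which matches the "Enforced only at attach time" item.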

    The uapi-side changes needed for supporting bpf_mprog are rather minimal,
    consisting of the additions of the attachment flags, revision counter, and
    expanding existing union with relative_{fd,id} member.
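
How a revision counter lets user space assert the state it last observed can be sketched as follows. The names `mprog_rev_model` and `mprog_update_model()` are hypothetical, treating an expected revision of 0 as "no check requested" is an assumption of this model, and `-ESTALE` is used here only as a plausible error code for a mismatch.

```c
#include <errno.h>
#include <stdint.h>

struct mprog_rev_model {
	uint64_t revision;	/* bumped on every successful update */
};

/*
 * Apply an update only if the caller's view of the entry is current.
 * expected_revision == 0 means the caller did not ask for a check.
 */
static int mprog_update_model(struct mprog_rev_model *e,
			      uint64_t expected_revision)
{
	if (expected_revision && expected_revision != e->revision)
		return -ESTALE;
	e->revision++;
	return 0;
}
```

A caller that queried revision N and then loses a race with another updater gets an error instead of silently attaching relative to a state that no longer exists.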

    The bpf_mprog framework consists of a bpf_mprog_entry object which holds
    an array of bpf_mprog_fp (fast-path structure). The bpf_mprog_cp (control-path
    structure) is part of bpf_mprog_bundle. Both have been separated, so that
    fast-path gets efficient packing of bpf_prog pointers for maximum cache
    efficiency. Also, an array has been chosen instead of linked list or other
    structures to remove unnecessary indirections for a fast point-to-entry in
    tc for BPF.

    The bpf_mprog_entry comes as a pair via bpf_mprog_bundle so that in case of
    updates the peer bpf_mprog_entry is populated and then just swapped which
    avoids additional allocations that could otherwise fail, for example, in
    the detach case. bpf_mprog_{fp,cp} arrays are currently static, but they could
    be converted to dynamic allocation if necessary at some point in the future.
    Locking is deferred to the in-kernel user of bpf_mprog, for example, in case
    of tcx which uses this API in the next patch, it piggybacks on rtnl.
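
The pair-and-swap idea can be sketched in user space as below. Everything here is a simplified assumption: `entry_model`, `bundle_model` and the helper functions are hypothetical, the arrays hold plain integer IDs, and a bare pointer assignment stands in for whatever synchronization the in-kernel user applies when publishing the new entry.

```c
#include <string.h>

#define MODEL_SLOTS 4	/* assumed static capacity, for illustration only */

struct entry_model {
	int ids[MODEL_SLOTS];
	int total;
};

/* Two preallocated entries; exactly one of them is active at a time. */
struct bundle_model {
	struct entry_model a, b;
	struct entry_model *active;
};

static void bundle_init_model(struct bundle_model *bn)
{
	memset(bn, 0, sizeof(*bn));
	bn->active = &bn->a;
}

static struct entry_model *bundle_peer_model(struct bundle_model *bn)
{
	return bn->active == &bn->a ? &bn->b : &bn->a;
}

/*
 * Remove the program at idx by building the new state in the inactive
 * peer entry and then switching over. No memory is allocated, so the
 * removal cannot fail for lack of memory.
 */
static int bundle_remove_model(struct bundle_model *bn, int idx)
{
	struct entry_model *cur = bn->active;
	struct entry_model *peer = bundle_peer_model(bn);
	int i, n = 0;

	if (idx < 0 || idx >= cur->total)
		return -1;
	for (i = 0; i < cur->total; i++)
		if (i != idx)
			peer->ids[n++] = cur->ids[i];
	peer->total = n;
	bn->active = peer;	/* readers now see the fully built peer entry */
	return 0;
}
```

The old entry stays untouched until the switch, so a reader that still holds it keeps seeing a consistent array.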

    An extensive test suite for checking all aspects of this API for prog-based
    attach/detach and link API comes as BPF selftests in this series.

    Thanks also to Andrii Nakryiko for early API discussions wrt Meta's BPF prog
    management.

      [0] https://lore.kernel.org/bpf/20221004231143.19190-1-daniel@iogearbox.net
      [1] https://lore.kernel.org/bpf/CAADnVQ+gEY3FjCR=+DmjDR4gp5bOYZUFJQXj4agKFHT9CQPZBw@mail.gmail.com
      [2] http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/r/20230719140858.13224-2-daniel@iogearbox.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-12-15 09:28:53 +01:00