10 Commits

Author SHA1 Message Date
Jerome Marchand 0e62381974 bpf: Enable non-atomic allocations in local storage
Bugzilla: https://bugzilla.redhat.com/2120966

commit b00fa38a9c1cba044a32a601b49a55a18ed719d1
Author: Joanne Koong <joannelkoong@gmail.com>
Date:   Thu Mar 17 21:55:52 2022 -0700

    bpf: Enable non-atomic allocations in local storage

    Currently, local storage memory can only be allocated atomically
    (GFP_ATOMIC). This restriction is too strict for sleepable bpf
    programs.

    In this patch, the verifier detects whether the program is sleepable,
    and passes the corresponding GFP_KERNEL or GFP_ATOMIC flag as a
    5th argument to bpf_task/sk/inode_storage_get. This flag will propagate
    down to the local storage functions that allocate memory.

    Please note that bpf_task/sk/inode_storage_update_elem functions are
    invoked by userspace applications through syscalls. Preemption is
    disabled before bpf_task/sk/inode_storage_update_elem is called, which
    means they will always have to allocate memory atomically.

    Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: KP Singh <kpsingh@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20220318045553.3091807-2-joannekoong@fb.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:06 +02:00
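The flag selection described in the commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the `sketch_` names, the struct, and the flag values are hypothetical stand-ins rather than kernel definitions. The one thing taken from the commit message is the rule: sleepable programs get GFP_KERNEL, everything else (including the syscall update path) stays atomic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's GFP flags; the numeric values are arbitrary. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_ATOMIC 0x1u /* allocation must not sleep */
#define SKETCH_GFP_KERNEL 0x2u /* allocation may sleep */

/* Hypothetical model of a loaded program: only the sleepable bit matters. */
struct sketch_prog {
    bool sleepable;
};

/* The decision the commit message attributes to the verifier: the result
 * would be passed as the extra argument to the *_storage_get helpers. */
static sketch_gfp_t sketch_pick_gfp(const struct sketch_prog *prog)
{
    assert(prog != NULL);
    return prog->sleepable ? SKETCH_GFP_KERNEL : SKETCH_GFP_ATOMIC;
}

/* The syscall update path runs with preemption disabled, so per the
 * commit message it always has to allocate atomically. */
static sketch_gfp_t sketch_syscall_update_gfp(void)
{
    return SKETCH_GFP_ATOMIC;
}
```

Deciding once per program keeps the allocation sites free of context checks: they simply use whatever flag is handed down to them.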
Jerome Marchand c5056baccd bpf: Cleanup comments
Bugzilla: https://bugzilla.redhat.com/2120966

commit c561d11063009323a0e57c528cb1d77b7d2c41e0
Author: Tom Rix <trix@redhat.com>
Date:   Sun Feb 20 10:40:55 2022 -0800

    bpf: Cleanup comments

    Add leading space to spdx tag
    Use // for spdx c file comment

    Replacements
    resereved to reserved
    inbetween to in between
    everytime to every time
    intutivie to intuitive
    currenct to current
    encontered to encountered
    referenceing to referencing
    upto to up to
    exectuted to executed

    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20220220184055.3608317-1-trix@redhat.com

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:57:51 +02:00
Artem Savkov 9c42002344 bpf: Fix usage of trace RCU in local storage.
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit dcf456c9a095a6e71f53d6f6f004133ee851ee70
Author: KP Singh <kpsingh@kernel.org>
Date:   Mon Apr 18 15:51:58 2022 +0000

    bpf: Fix usage of trace RCU in local storage.

    bpf_{sk,task,inode}_storage_free() do not need to use
    call_rcu_tasks_trace as no BPF program should be accessing the owner
    as it's being destroyed. The only other reader at this point is
    bpf_local_storage_map_free() which uses normal RCU.

    The only paths that need trace RCU are:

    * bpf_local_storage_{delete,update} helpers
    * map_{delete,update}_elem() syscalls

    Fixes: 0fe4b381a59e ("bpf: Allow bpf_local_storage to be used by sleepable programs")
    Signed-off-by: KP Singh <kpsingh@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20220418155158.2865678-1-kpsingh@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:56 +02:00
Artem Savkov a3732d50aa bpf: Allow bpf_local_storage to be used by sleepable programs
Bugzilla: https://bugzilla.redhat.com/2069046

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 0fe4b381a59ebc53522fce579b281a67a9e1bee6
Author: KP Singh <kpsingh@kernel.org>
Date:   Fri Dec 24 15:29:15 2021 +0000

    bpf: Allow bpf_local_storage to be used by sleepable programs

    Other maps like hashmaps are already available to sleepable programs.
    Sleepable BPF programs run under trace RCU. Allow task, sk and inode
    storage to be used from sleepable programs. This allows sleepable and
    non-sleepable programs to provide shareable annotations on kernel
    objects.

    Sleepable programs run in trace RCU whereas non-sleepable programs run
    in a normal RCU critical section, i.e. __bpf_prog_enter{_sleepable}
    and __bpf_prog_exit{_sleepable} (rcu_read_lock or rcu_read_lock_trace).

    In order to make the local storage maps accessible to both sleepable
    and non-sleepable programs, one needs to call both
    call_rcu_tasks_trace and call_rcu to wait for both trace and classical
    RCU grace periods to expire before freeing memory.

    Paul's work on call_rcu_tasks_trace allows us to have per CPU queueing
    for call_rcu_tasks_trace. This behaviour can be achieved by setting
    rcupdate.rcu_task_enqueue_lim=<num_cpus> boot parameter.

    In light of these new performance changes and to keep the local storage
    code simple, avoid adding a new flag for sleepable maps / local storage
    to select the RCU synchronization (trace / classical).

    Also, update the dereferencing of the pointers to use
    rcu_dereference_check (with either the trace or normal RCU locks held)
    with a common bpf_rcu_lock_held helper method.

    Signed-off-by: KP Singh <kpsingh@kernel.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Link: https://lore.kernel.org/bpf/20211224152916.1550677-2-kpsingh@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2022-08-24 12:53:51 +02:00
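The requirement in the commit above, that memory is freed only after both a trace and a classical RCU grace period have passed, can be modelled in userspace as two chained deferred callbacks. This is a toy simulation and not kernel code: grace periods are "elapsed" by explicit function calls, every `sketch_` name is hypothetical, and chaining the classical wait behind the trace wait is an assumption about one way to arrange the two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_elem;
typedef void (*sketch_cb_t)(struct sketch_elem *);

struct sketch_elem {
    sketch_cb_t pending_trace;   /* runs after a trace grace period */
    sketch_cb_t pending_classic; /* runs after a classical grace period */
    bool *freed;                 /* lets the caller observe the free */
};

/* Final stage: both kinds of readers are gone, so release the memory. */
static void sketch_free_now(struct sketch_elem *e)
{
    *e->freed = true;
    free(e);
}

/* Stage two: the trace grace period passed, now wait for a classical one. */
static void sketch_after_trace_gp(struct sketch_elem *e)
{
    e->pending_classic = sketch_free_now;
}

/* Stage one: stand-in for queueing a callback behind a trace grace period. */
static void sketch_defer_free(struct sketch_elem *e)
{
    assert(e != NULL);
    e->pending_trace = sketch_after_trace_gp;
}

/* Simulated end of a trace grace period. */
static void sketch_trace_gp_elapsed(struct sketch_elem *e)
{
    sketch_cb_t cb = e->pending_trace;
    e->pending_trace = NULL;
    if (cb)
        cb(e);
}

/* Simulated end of a classical grace period. */
static void sketch_classic_gp_elapsed(struct sketch_elem *e)
{
    sketch_cb_t cb = e->pending_classic;
    e->pending_classic = NULL;
    if (cb)
        cb(e);
}
```

In this model a classical grace period that ends before the trace one has no effect, which is the point: neither kind of reader can still hold a pointer when the element is released.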
Song Liu bc235cdb42 bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]
BPF helpers bpf_task_storage_[get|delete] could hold two locks:
bpf_local_storage_map_bucket->lock and bpf_local_storage->lock. Calling
these helpers from fentry/fexit programs on functions in bpf_*_storage.c
may cause deadlock on either lock.

Prevent such deadlock with a per cpu counter, bpf_task_storage_busy. We
need this counter to be global, because the two locks here belong to two
different objects: bpf_local_storage_map and bpf_local_storage. If we
pick one of them as the owner of the counter, it is still possible to
trigger deadlock on the other lock. For example, if bpf_local_storage_map
owns the counters, it cannot prevent deadlock on bpf_local_storage->lock
when two maps are used.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210225234319.336131-3-songliubraving@fb.com
2021-02-26 11:51:48 -08:00
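The recursion guard described in the commit above can be approximated in userspace with a thread-local counter standing in for the per-CPU one. This is a sketch under that assumption: the `sketch_` names are hypothetical, and the real code additionally has to keep the task on one CPU while the counter is held, which this model does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the per-CPU bpf_task_storage_busy counter. */
static _Thread_local int sketch_storage_busy;

/* Succeeds only for the outermost caller on this thread. */
static bool sketch_storage_trylock(void)
{
    if (++sketch_storage_busy != 1) {
        --sketch_storage_busy; /* undo: somebody is already inside */
        return false;
    }
    return true;
}

static void sketch_storage_unlock(void)
{
    assert(sketch_storage_busy == 1);
    --sketch_storage_busy;
}

/* Hypothetical helper. With reenter set it calls itself while "holding
 * the locks", the way an fentry program attached to the storage code
 * would, and reports through inner_ok whether the nested call got in. */
static bool sketch_storage_get(bool reenter, bool *inner_ok)
{
    if (!sketch_storage_trylock())
        return false; /* refuse instead of deadlocking */
    /* ...both locks would be taken and released here... */
    if (reenter)
        *inner_ok = sketch_storage_get(false, NULL);
    sketch_storage_unlock();
    return true;
}
```

Because the counter is global rather than owned by one map or one storage object, a nested call is refused no matter which of the two locks it would have contended on.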
Song Liu a10787e6d5 bpf: Enable task local storage for tracing programs
To access per-task data, BPF programs usually create a hash table with
pid as the key. This is not ideal because:
 1. The user needs to estimate the proper size of the hash table, which may
    be inaccurate;
 2. Big hash tables are slow;
 3. To clean up the data properly during task termination, the user needs
    to write extra logic.

Task local storage overcomes these issues and offers a better option for
these per-task data. Task local storage is only available to BPF_LSM. Now
enable it for tracing programs.

Unlike LSM programs, tracing programs can be called in IRQ contexts.
Helpers that access task local storage are updated to use
raw_spin_lock_irqsave() instead of raw_spin_lock_bh().

Tracing programs can attach to functions on the task free path, e.g.
exit_creds(). To avoid allocating task local storage after
bpf_task_storage_free(), bpf_task_storage_get() is updated to not allocate
new storage when the task is not refcounted (task->usage == 0).

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210225234319.336131-2-songliubraving@fb.com
2021-02-26 11:51:47 -08:00
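The refcount check from the commit above can be illustrated with a small userspace model. Everything here is a hypothetical stand-in: `struct sketch_task` is not `struct task_struct`, and the `create` argument loosely models a "create if missing" request. Only the rule itself comes from the commit message: no new storage once `usage` has dropped to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy task: a reference count plus one lazily allocated storage slot. */
struct sketch_task {
    int usage;     /* 0 means the task is on its way out */
    long *storage; /* NULL until first created */
};

/* Return the existing slot, or create one only if asked to and the task
 * is still refcounted. */
static long *sketch_task_storage_get(struct sketch_task *task, int create)
{
    assert(task != NULL);
    if (task->storage)
        return task->storage;
    if (!create || task->usage == 0)
        return NULL; /* never allocate for a dying task */
    task->storage = calloc(1, sizeof(*task->storage));
    return task->storage;
}

/* Runs once on the free path; without the check above, a later get
 * could allocate storage that nothing would ever release. */
static void sketch_task_storage_free(struct sketch_task *task)
{
    free(task->storage);
    task->storage = NULL;
}
```

Unlike the pid-keyed hash table, the slot lives and dies with its owner, so there is no table to size and no separate cleanup logic to write.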
Roman Gushchin ab31be378a bpf: Eliminate rlimit-based memory accounting for bpf local storage maps
Do not use rlimit-based memory accounting for bpf local storage maps.
It has been replaced with the memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201201215900.3569844-32-guro@fb.com
2020-12-02 18:32:47 -08:00
Roman Gushchin e9aae8beba bpf: Memcg-based memory accounting for bpf local storage maps
Account memory used by bpf local storage maps:
per-socket, per-inode and per-task storages.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201201215900.3569844-16-guro@fb.com
2020-12-02 18:32:45 -08:00
Martin KaFai Lau 70b971118e bpf: Use hlist_add_head_rcu when linking to local_storage
The local_storage->list will be traversed by rcu reader in parallel.
Thus, hlist_add_head_rcu() is needed in bpf_selem_link_storage_nolock().
This patch fixes it.

This part of the code has recently been refactored in bpf-next
and this patch makes changes to the new file "bpf_local_storage.c".
Instead of using the original offending commit in the Fixes tag,
the commit that created the file "bpf_local_storage.c" is used.

A separate fix has been provided to the bpf tree.

Fixes: 450af8d0f6 ("bpf: Split bpf_local_storage to bpf_sk_storage")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200916204453.2003915-1-kafai@fb.com
2020-09-19 01:12:35 +02:00
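Why a plain list insert is not enough when readers traverse the list in parallel can be sketched with C11 atomics. This is a userspace analogy and not the kernel's hlist implementation: the list is singly linked, the `sketch_` names are hypothetical, and writers are assumed to be serialized by a lock held by the caller (as the `_nolock` suffix in the commit suggests).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_node {
    int value;
    struct sketch_node *_Atomic next;
};

struct sketch_list {
    struct sketch_node *_Atomic first;
};

/* Publish a node at the head. The node is fully initialized first; the
 * release store then makes it visible, so a concurrent reader never
 * sees a half-built node. Writers must already hold the list lock. */
static void sketch_add_head(struct sketch_list *list, struct sketch_node *node)
{
    assert(list != NULL && node != NULL);
    struct sketch_node *old =
        atomic_load_explicit(&list->first, memory_order_relaxed);
    atomic_store_explicit(&node->next, old, memory_order_relaxed);
    atomic_store_explicit(&list->first, node, memory_order_release);
}

/* Lock-free reader: the acquire loads pair with the release store above. */
static int sketch_sum(struct sketch_list *list)
{
    int sum = 0;
    struct sketch_node *n =
        atomic_load_explicit(&list->first, memory_order_acquire);
    while (n) {
        sum += n->value;
        n = atomic_load_explicit(&n->next, memory_order_acquire);
    }
    return sum;
}
```

With a non-atomic head update, the compiler or CPU could make the new head pointer visible before the node's fields are written, which is the class of bug the commit above fixes.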
KP Singh 450af8d0f6 bpf: Split bpf_local_storage to bpf_sk_storage
A purely mechanical change:

	bpf_sk_storage.c = bpf_sk_storage.c + bpf_local_storage.c
	bpf_sk_storage.h = bpf_sk_storage.h + bpf_local_storage.h

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-5-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00