Commit Graph

758 Commits

Donald Dutile 9d720c5a56 kallsyms: Delete an unused parameter related to {module_}kallsyms_on_each_symbol()
JIRA: https://issues.redhat.com/browse/RHEL-28063

commit 3703bd54cd37e7875f51ece8df8c85c184e40bba
Author: Zhen Lei <thunder.leizhen@huawei.com>
Date:   Wed Mar 8 15:38:46 2023 +0800

    kallsyms: Delete an unused parameter related to {module_}kallsyms_on_each_symbol()

    The parameter 'struct module *' in the hook function associated with
    {module_}kallsyms_on_each_symbol() is no longer used. Delete it.

    Suggested-by: Petr Mladek <pmladek@suse.com>
    Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
    Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:23 -04:00
Donald Dutile b0a329a194 livepatch: Improve the search performance of module_kallsyms_on_each_symbol()
JIRA: https://issues.redhat.com/browse/RHEL-28063

Conflicts: upstream 73feb8d5fa3b was already backported to RHEL9, which
           results in a missing hunk for RHEL9 in module.h.
           Removed the bpf_trace.c hunk since the bpf code was updated closer
           to upstream and no longer uses the kallsyms function.

commit 07cc2c931e8e1083a31f4c51d2244fe264af63bf
Author: Zhen Lei <thunder.leizhen@huawei.com>
Date:   Mon Jan 16 11:10:07 2023 +0100

    livepatch: Improve the search performance of module_kallsyms_on_each_symbol()

    Currently we traverse all symbols of all modules to find the specified
    function for the specified module. But in reality, we just need to find
    the given module and then traverse all the symbols in it.

    Let's add a new parameter 'const char *modname' to
    module_kallsyms_on_each_symbol(), so we can compare module names
    directly in this function and call hook 'fn' only after matching. If
    'modname' is NULL, the symbols of all modules are still traversed, for
    compatibility with other use cases.

    Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
                    |
    Phase2:          -->f1-->f2-->f3

    Assuming that there are m modules, each module has n symbols on average,
    then the time complexity is reduced from O(m * n) to O(m) + O(n).

    Reviewed-by: Petr Mladek <pmladek@suse.com>
    Acked-by: Song Liu <song@kernel.org>
    Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Acked-by: Miroslav Benes <mbenes@suse.cz>
    Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
    Link: https://lore.kernel.org/r/20230116101009.23694-2-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:22 -04:00
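The two-phase walk the commit above describes can be modeled in plain user-space C. This is an illustrative sketch, not the kernel implementation: the structures and the `lookup_symbol` helper are invented here to show how a non-NULL `modname` lets the search skip whole modules with one `strcmp()` (phase 1) and only traverse the symbols of the matching module (phase 2).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: each module owns its symbol table. */
struct sym { const char *name; unsigned long addr; };
struct mod { const char *name; const struct sym *syms; size_t nsyms; };

static unsigned long lookup_symbol(const struct mod *mods, size_t nmods,
                                   const char *modname, const char *symname)
{
    for (size_t i = 0; i < nmods; i++) {
        if (modname && strcmp(mods[i].name, modname) != 0)
            continue;                         /* phase 1: skip whole module */
        for (size_t j = 0; j < mods[i].nsyms; j++)
            if (strcmp(mods[i].syms[j].name, symname) == 0)
                return mods[i].syms[j].addr;  /* phase 2: symbol found */
        if (modname)
            break;            /* matched module but symbol absent: stop */
    }
    return 0;
}

static const struct sym m1_syms[] = { { "f1", 0x100 }, { "f2", 0x104 } };
static const struct sym m2_syms[] = { { "g1", 0x200 } };
static const struct mod mods[] = {
    { "mod1", m1_syms, 2 },
    { "mod2", m2_syms, 1 },
};
```

With m modules of ~n symbols each, a named lookup does at most m name compares plus one n-symbol scan, i.e. O(m) + O(n) instead of O(m * n).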
Donald Dutile e9c109d3f5 ftrace: Fix assuming build time sort works for s390
JIRA: https://issues.redhat.com/browse/RHEL-28063

commit 6b9b6413700e104934734b72a3be622a76923b98
Author: Steven Rostedt (Google) <rostedt@goodmis.org>
Date:   Sat Jan 22 09:17:10 2022 -0500

    ftrace: Fix assuming build time sort works for s390

    As mcount_loc needs to be sorted for ftrace to work properly, sorting it
    at build time is more efficient than doing so at boot and saves
    milliseconds of boot time. Unfortunately, this change broke s390, which
    modifies the mcount_loc entries after the sorting takes place and puts
    back the unsorted locations. Since the sorting is skipped at boot if it
    is believed the list was sorted at build time, ftrace can crash, as its
    algorithms depend on the list being sorted.

    Add a new config BUILDTIME_MCOUNT_SORT that is set when
    BUILDTIME_TABLE_SORT but not if S390 is set. Use this config to determine
    if sorting should take place at boot up.

    Link: https://lore.kernel.org/all/yt9dee51ctfn.fsf@linux.ibm.com/

    Fixes: 72b3942a173c ("scripts: ftrace - move the sort-processing in ftrace_init")
    Reported-by: Sven Schnelle <svens@linux.ibm.com>
    Tested-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:12 -04:00
Donald Dutile a22368bc3f ftrace: Add test to make sure compiled time sorts work
JIRA: https://issues.redhat.com/browse/RHEL-28063

Conflicts:
  Add this commit so the next one applies cleanly;
  turn it off by default in RHEL9.

commit 8147dc78e6e4b645f8277bdf377f2193ddfcdee1
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Mon Dec 6 15:18:58 2021 -0500

    ftrace: Add test to make sure compiled time sorts work

    Now that ftrace function pointers are sorted at compile time, add a
    run-time test that makes sure they really are sorted. This test is only
    run if it is configured in.

    Link: https://lkml.kernel.org/r/20211206151858.4d21a24d@gandalf.local.home

    Cc: Yinan Liu <yinan@linux.alibaba.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:12 -04:00
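The run-time check the commit above adds amounts to a single pass over the (supposedly build-time-sorted) address table, verifying it is non-decreasing. A minimal sketch, with an illustrative function name (the kernel walks mcount_loc):

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the address table is sorted in non-decreasing order. */
static int addrs_sorted(const unsigned long *addrs, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (addrs[i - 1] > addrs[i])
            return 0;
    return 1;
}

static const unsigned long sorted_tbl[]   = { 0x10, 0x20, 0x20, 0x30 };
static const unsigned long unsorted_tbl[] = { 0x10, 0x30, 0x20 };
```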
Donald Dutile 1c7ced2303 scripts: ftrace - move the sort-processing in ftrace_init
JIRA: https://issues.redhat.com/browse/RHEL-28063

commit 72b3942a173c387b27860ba1069636726e208777
Author: Yinan Liu <yinan@linux.alibaba.com>
Date:   Sun Dec 12 19:33:58 2021 +0800

    scripts: ftrace - move the sort-processing in ftrace_init

    When the kernel starts, ftrace initialization spends a portion of boot
    time (approximately 6~8 ms) sorting mcount addresses. We can save this
    time by moving the mcount sorting to compile time.

    Link: https://lkml.kernel.org/r/20211212113358.34208-2-yinan@linux.alibaba.com

    Signed-off-by: Yinan Liu <yinan@linux.alibaba.com>
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:12 -04:00
Prarit Bhargava c81bc94b07 sections: move and rename core_kernel_data() to is_kernel_core_data()
JIRA: https://issues.redhat.com/browse/RHEL-25415

commit a20deb3a348719adaf8c12e1bf4b599bfc51836e
Author: Kefeng Wang <wangkefeng.wang@huawei.com>
Date:   Mon Nov 8 18:33:51 2021 -0800

    sections: move and rename core_kernel_data() to is_kernel_core_data()

    Move core_kernel_data() into sections.h and rename it to
    is_kernel_core_data(), also make it return bool value, then update all the
    callers.

    Link: https://lkml.kernel.org/r/20210930071143.63410-4-wangkefeng.wang@huawei.com
    Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
    Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
    Cc: Matt Turner <mattst88@gmail.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Michal Simek <monstr@monstr.eu>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Richard Henderson <rth@twiddle.net>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
2024-03-20 09:43:21 -04:00
Prarit Bhargava a1d2c7a416 ftrace: Remove return value of ftrace_arch_modify_*()
JIRA: https://issues.redhat.com/browse/RHEL-25415

Conflicts: Minor drift issues.

commit 3a2bfec0b02f2226ff3376a5d2ff604d799bd7ea
Author: Li kunyu <kunyu@nfschina.com>
Date:   Wed May 18 10:36:40 2022 +0800

    ftrace: Remove return value of ftrace_arch_modify_*()

    All instances of the functions ftrace_arch_modify_prepare() and
    ftrace_arch_modify_post_process() return zero. There's no point in
    checking their return value. Just have them be void functions.

    Link: https://lkml.kernel.org/r/20220518023639.4065-1-kunyu@nfschina.com

    Signed-off-by: Li kunyu <kunyu@nfschina.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
2024-03-20 09:42:36 -04:00
Adrien Thierry d25975cb85 ftrace: Fix issue that 'direct->addr' not restored in modify_ftrace_direct()
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit 2a2d8c51defb446e8d89a83f42f8e5cd529111e9
Author: Zheng Yejian <zhengyejian1@huawei.com>
Date:   Thu Mar 30 10:52:23 2023 +0800

    ftrace: Fix issue that 'direct->addr' not restored in modify_ftrace_direct()

    Syzkaller reported a WARNING: "WARN_ON(!direct)" in modify_ftrace_direct().

    The root cause is that 'direct->addr' was changed from 'old_addr' to
    'new_addr' but not restored if an error happened when calling
    ftrace_modify_direct_caller(). The entry can then no longer be found by
    that 'old_addr'.

    To fix it, restore 'direct->addr' to 'old_addr' explicitly in the error path.

    Link: https://lore.kernel.org/linux-trace-kernel/20230330025223.1046087-1-zhengyejian1@huawei.com

    Cc: stable@vger.kernel.org
    Cc: <mhiramat@kernel.org>
    Cc: <mark.rutland@arm.com>
    Cc: <ast@kernel.org>
    Cc: <daniel@iogearbox.net>
    Fixes: 8a141dd7f7 ("ftrace: Fix modify_ftrace_direct.")
    Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:50 -04:00
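The error-path fix in the commit above is a classic save-and-restore rollback. A hedged user-space sketch (structure, callback, and helper names are invented for illustration, not the kernel's): tentatively switch the address, and put the old value back if patching fails so later lookups keyed by 'old_addr' still find the entry.

```c
#include <assert.h>
#include <stddef.h>

struct direct_entry { unsigned long addr; };

static int modify_direct(struct direct_entry *d, unsigned long new_addr,
                         int (*patch)(unsigned long))
{
    unsigned long old_addr = d->addr;
    int err;

    d->addr = new_addr;
    err = patch(new_addr);
    if (err)
        d->addr = old_addr;   /* the fix: roll back on failure */
    return err;
}

/* Test doubles standing in for the patching step. */
static int patch_ok(unsigned long a)   { (void)a; return 0; }
static int patch_fail(unsigned long a) { (void)a; return -1; }

/* Run one modify attempt and report the final stored address. */
static unsigned long final_addr_after(unsigned long old, unsigned long new,
                                      int (*patch)(unsigned long))
{
    struct direct_entry d = { old };
    modify_direct(&d, new, patch);
    return d.addr;
}
```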
Adrien Thierry 6bc4e822b2 ftrace: Fix invalid address access in lookup_rec() when index is 0
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit ee92fa443358f4fc0017c1d0d325c27b37802504
Author: Chen Zhongjin <chenzhongjin@huawei.com>
Date:   Thu Mar 9 16:02:30 2023 +0800

    ftrace: Fix invalid address access in lookup_rec() when index is 0

    KASAN reported the following problem:

     BUG: KASAN: use-after-free in lookup_rec
     Read of size 8 at addr ffff000199270ff0 by task modprobe
     CPU: 2 Comm: modprobe
     Call trace:
      kasan_report
      __asan_load8
      lookup_rec
      ftrace_location
      arch_check_ftrace_location
      check_kprobe_address_safe
      register_kprobe

    When checking pg->records[pg->index - 1].ip in lookup_rec(), it can get a
    pg which is newly added to ftrace_pages_start in ftrace_process_locs().
    Before the first pg->index++, index is 0 and accessing pg->records[-1].ip
    will cause this problem.

    Don't check the ip when pg->index is 0.

    Link: https://lore.kernel.org/linux-trace-kernel/20230309080230.36064-1-chenzhongjin@huawei.com

    Cc: stable@vger.kernel.org
    Fixes: 9644302e33 ("ftrace: Speed up search by skipping pages by address")
    Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:50 -04:00
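The guard the commit above adds prevents indexing one element before the start of a freshly added, still-empty page. A simplified sketch (types and the range-check helper are invented here; the kernel's lookup_rec() works over ftrace page records):

```c
#include <assert.h>
#include <stddef.h>

struct rec { unsigned long ip; };
struct sym_page { struct rec records[8]; int index; };

/* Is ip within the [first, last] record range of this page? */
static int ip_within_page(const struct sym_page *pg, unsigned long ip)
{
    if (pg->index == 0)
        return 0;   /* empty page: records[index - 1] would read records[-1] */
    return ip >= pg->records[0].ip &&
           ip <= pg->records[pg->index - 1].ip;
}

static const struct sym_page empty_pg = { .index = 0 };
static const struct sym_page full_pg = {
    .records = { { 0x10 }, { 0x20 }, { 0x30 } }, .index = 3,
};
```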
Adrien Thierry 71015cc8bd ftrace: Fix null pointer dereference in ftrace_add_mod()
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit 19ba6c8af9382c4c05dc6a0a79af3013b9a35cd0
Author: Xiu Jianfeng <xiujianfeng@huawei.com>
Date:   Wed Nov 16 09:52:07 2022 +0800

    ftrace: Fix null pointer dereference in ftrace_add_mod()

    The @ftrace_mod is allocated by kzalloc(), so both members {prev,next}
    of @ftrace_mod->list are NULL; it's not a valid state in which to call
    list_del(). If kstrdup() for @ftrace_mod->{func|module} fails, the code
    goes to the @out_free tag and calls free_ftrace_mod() to destroy
    @ftrace_mod; list_del() then writes prev->next and next->prev, which is
    where the null pointer dereference happens.

    BUG: kernel NULL pointer dereference, address: 0000000000000008
    Oops: 0002 [#1] PREEMPT SMP NOPTI
    Call Trace:
     <TASK>
     ftrace_mod_callback+0x20d/0x220
     ? do_filp_open+0xd9/0x140
     ftrace_process_regex.isra.51+0xbf/0x130
     ftrace_regex_write.isra.52.part.53+0x6e/0x90
     vfs_write+0xee/0x3a0
     ? __audit_filter_op+0xb1/0x100
     ? auditd_test_task+0x38/0x50
     ksys_write+0xa5/0xe0
     do_syscall_64+0x3a/0x90
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    Kernel panic - not syncing: Fatal exception

    So call INIT_LIST_HEAD() to initialize the list member to fix this issue.

    Link: https://lkml.kernel.org/r/20221116015207.30858-1-xiujianfeng@huawei.com

    Cc: stable@vger.kernel.org
    Fixes: 673feb9d76 ("ftrace: Add :mod: caching infrastructure to trace_array")
    Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:44 -04:00
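The bug above can be reproduced in a user-space model of the kernel's doubly linked list. kzalloc() leaves the node's prev/next NULL, and list_del() on such a node writes through those NULL pointers; initializing the node to point at itself (as INIT_LIST_HEAD() does) makes deletion from the error path a harmless self-rewrite. The list implementation below is a minimal re-creation for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *l) { l->next = l; l->prev = l; }

static void list_del(struct list_head *e)
{
    e->next->prev = e->prev;   /* NULL dereference if e was only zeroed */
    e->prev->next = e->next;
}

/* Returns 1 if deleting a freshly initialized node leaves it self-linked
 * (i.e. the error path is safe once INIT_LIST_HEAD has run). */
static int safe_error_path(void)
{
    struct list_head node;
    init_list_head(&node);     /* the fix */
    list_del(&node);           /* self-linked: pointers rewritten to itself */
    return node.next == &node && node.prev == &node;
}
```

Without the init call, `node.next` would be indeterminate (NULL after kzalloc in the kernel case) and `list_del` would crash exactly as the oops shows.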
Adrien Thierry 83097979d4 ftrace: Fix the possible incorrect kernel message
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit 08948caebe93482db1adfd2154eba124f66d161d
Author: Wang Wensheng <wangwensheng4@huawei.com>
Date:   Wed Nov 9 09:44:32 2022 +0000

    ftrace: Fix the possible incorrect kernel message

    If the number of mcount entries is an integer multiple of
    ENTRIES_PER_PAGE, the page count shown on the console is wrong.

    Link: https://lkml.kernel.org/r/20221109094434.84046-2-wangwensheng4@huawei.com

    Cc: <mhiramat@kernel.org>
    Cc: <mark.rutland@arm.com>
    Cc: stable@vger.kernel.org
    Fixes: 5821e1b74f ("function tracing: fix wrong pos computing when read buffer has been fulfilled")
    Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:44 -04:00
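The commit above fixes an off-by-one in that class of page-count arithmetic. This sketch illustrates the general shape of the bug, not the kernel's exact expression (the ENTRIES_PER_PAGE value here is arbitrary): `n / PER + 1` over-reports by one page whenever n divides evenly, whereas round-up division handles both cases.

```c
#include <assert.h>

#define ENTRIES_PER_PAGE 8

/* Buggy form: always adds one page, even when n divides evenly. */
static int pages_buggy(int n) { return n / ENTRIES_PER_PAGE + 1; }

/* Round-up division (kernel's DIV_ROUND_UP pattern). */
static int pages_fixed(int n) { return (n + ENTRIES_PER_PAGE - 1) / ENTRIES_PER_PAGE; }
```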
Adrien Thierry e4b8268538 ftrace: Fix use-after-free for dynamic ftrace_ops
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit 0e792b89e6800cd9cb4757a76a96f7ef3e8b6294
Author: Li Huafei <lihuafei1@huawei.com>
Date:   Thu Nov 3 11:10:10 2022 +0800

    ftrace: Fix use-after-free for dynamic ftrace_ops

    KASAN reported a use-after-free with ftrace ops [1]. It was found from
    vmcore that perf had registered two ops with the same content
    successively, both dynamic. After unregistering the second ops, a
    use-after-free occurred.

    In ftrace_shutdown(), when the second ops is unregistered, the
    FTRACE_UPDATE_CALLS command is not set because there is another enabled
    ops with the same content.  Also, both ops are dynamic and the ftrace
    callback function is ftrace_ops_list_func, so the
    FTRACE_UPDATE_TRACE_FUNC command will not be set. Eventually the value
    of 'command' will be 0 and ftrace_shutdown() will skip the rcu
    synchronization.

    However, ftrace may be activated. When the ops is released, another CPU
    may be accessing the ops.  Add the missing synchronization to fix this
    problem.

    [1]
    BUG: KASAN: use-after-free in __ftrace_ops_list_func kernel/trace/ftrace.c:7020 [inline]
    BUG: KASAN: use-after-free in ftrace_ops_list_func+0x2b0/0x31c kernel/trace/ftrace.c:7049
    Read of size 8 at addr ffff56551965bbc8 by task syz-executor.2/14468

    CPU: 1 PID: 14468 Comm: syz-executor.2 Not tainted 5.10.0 #7
    Hardware name: linux,dummy-virt (DT)
    Call trace:
     dump_backtrace+0x0/0x40c arch/arm64/kernel/stacktrace.c:132
     show_stack+0x30/0x40 arch/arm64/kernel/stacktrace.c:196
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0x1b4/0x248 lib/dump_stack.c:118
     print_address_description.constprop.0+0x28/0x48c mm/kasan/report.c:387
     __kasan_report mm/kasan/report.c:547 [inline]
     kasan_report+0x118/0x210 mm/kasan/report.c:564
     check_memory_region_inline mm/kasan/generic.c:187 [inline]
     __asan_load8+0x98/0xc0 mm/kasan/generic.c:253
     __ftrace_ops_list_func kernel/trace/ftrace.c:7020 [inline]
     ftrace_ops_list_func+0x2b0/0x31c kernel/trace/ftrace.c:7049
     ftrace_graph_call+0x0/0x4
     __might_sleep+0x8/0x100 include/linux/perf_event.h:1170
     __might_fault mm/memory.c:5183 [inline]
     __might_fault+0x58/0x70 mm/memory.c:5171
     do_strncpy_from_user lib/strncpy_from_user.c:41 [inline]
     strncpy_from_user+0x1f4/0x4b0 lib/strncpy_from_user.c:139
     getname_flags+0xb0/0x31c fs/namei.c:149
     getname+0x2c/0x40 fs/namei.c:209
     [...]

    Allocated by task 14445:
     kasan_save_stack+0x24/0x50 mm/kasan/common.c:48
     kasan_set_track mm/kasan/common.c:56 [inline]
     __kasan_kmalloc mm/kasan/common.c:479 [inline]
     __kasan_kmalloc.constprop.0+0x110/0x13c mm/kasan/common.c:449
     kasan_kmalloc+0xc/0x14 mm/kasan/common.c:493
     kmem_cache_alloc_trace+0x440/0x924 mm/slub.c:2950
     kmalloc include/linux/slab.h:563 [inline]
     kzalloc include/linux/slab.h:675 [inline]
     perf_event_alloc.part.0+0xb4/0x1350 kernel/events/core.c:11230
     perf_event_alloc kernel/events/core.c:11733 [inline]
     __do_sys_perf_event_open kernel/events/core.c:11831 [inline]
     __se_sys_perf_event_open+0x550/0x15f4 kernel/events/core.c:11723
     __arm64_sys_perf_event_open+0x6c/0x80 kernel/events/core.c:11723
     [...]

    Freed by task 14445:
     kasan_save_stack+0x24/0x50 mm/kasan/common.c:48
     kasan_set_track+0x24/0x34 mm/kasan/common.c:56
     kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:358
     __kasan_slab_free.part.0+0x11c/0x1b0 mm/kasan/common.c:437
     __kasan_slab_free mm/kasan/common.c:445 [inline]
     kasan_slab_free+0x2c/0x40 mm/kasan/common.c:446
     slab_free_hook mm/slub.c:1569 [inline]
     slab_free_freelist_hook mm/slub.c:1608 [inline]
     slab_free mm/slub.c:3179 [inline]
     kfree+0x12c/0xc10 mm/slub.c:4176
     perf_event_alloc.part.0+0xa0c/0x1350 kernel/events/core.c:11434
     perf_event_alloc kernel/events/core.c:11733 [inline]
     __do_sys_perf_event_open kernel/events/core.c:11831 [inline]
     __se_sys_perf_event_open+0x550/0x15f4 kernel/events/core.c:11723
     [...]

    Link: https://lore.kernel.org/linux-trace-kernel/20221103031010.166498-1-lihuafei1@huawei.com

    Fixes: edb096e007 ("ftrace: Fix memleak when unregistering dynamic ops when tracing disabled")
    Cc: stable@vger.kernel.org
    Suggested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Li Huafei <lihuafei1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:43 -04:00
Adrien Thierry 74b03360df ftrace: Fix char print issue in print_ip_ins()
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit 30f7d1cac2aab8fec560a388ad31ca5e5d04a822
Author: Zheng Yejian <zhengyejian1@huawei.com>
Date:   Tue Oct 11 12:03:52 2022 +0000

    ftrace: Fix char print issue in print_ip_ins()

    When an ftrace bug happens, the following log shows every hex byte at
    the problematic ip address:
      actual:   ffffffe8:6b:ffffffd9:01:21

    But so many 'f's are a little confusing, because format '%x' is used to
    print the signed chars in array 'ins'. As suggested by Joe, change to
    format "%*phC" to print array 'ins'.

    After this patch, the log is like:
      actual:   e8:6b:d9:01:21

    Link: https://lkml.kernel.org/r/20221011120352.1878494-1-zhengyejian1@huawei.com

    Fixes: 6c14133d2d ("ftrace: Do not blindly read the ip address in ftrace_bug()")
    Suggested-by: Joe Perches <joe@perches.com>
    Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:42 -04:00
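The stray 'f's in the log above come from C's integer promotions: a signed char passed through varargs is promoted to int with sign extension, so "%x" prints a negative byte like 0xe8 as ffffffe8. Casting each byte to unsigned char (which is effectively what printing per-byte with the kernel's "%*phC" achieves) avoids this. A small demonstration, assuming the usual 8-bit char / 32-bit int ABI:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static void fmt_signed(char *out, signed char c)   { sprintf(out, "%x", c); }
static void fmt_unsigned(char *out, signed char c) { sprintf(out, "%x", (unsigned char)c); }

/* Returns 1 when the two formats agree for byte value v. */
static int formats_agree(int v)
{
    char a[16], b[16];
    fmt_signed(a, (signed char)v);
    fmt_unsigned(b, (signed char)v);
    return strcmp(a, b) == 0;
}

/* Returns 1 when "%x" on the sign-extended byte yields 'expect'. */
static int signed_fmt_is(int v, const char *expect)
{
    char buf[16];
    fmt_signed(buf, (signed char)v);
    return strcmp(buf, expect) == 0;
}
```

Positive bytes (high bit clear) print identically either way; only bytes >= 0x80 grow the misleading ffffff prefix.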
Adrien Thierry 36d0574c87 ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit c3b0f72e805f0801f05fa2aa52011c4bfc694c44
Author: Yang Jihong <yangjihong1@huawei.com>
Date:   Thu Aug 18 11:26:59 2022 +0800

    ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead

    ftrace_startup does not remove ops from ftrace_ops_list when
    ftrace_startup_enable fails:

    register_ftrace_function
      ftrace_startup
        __register_ftrace_function
          ...
          add_ftrace_ops(&ftrace_ops_list, ops)
          ...
        ...
        ftrace_startup_enable // if ftrace failed to modify, ftrace_disabled is set to 1
        ...
      return 0 // ops is in the ftrace_ops_list.

    When ftrace_disabled = 1, unregister_ftrace_function simply returns without doing anything:
    unregister_ftrace_function
      ftrace_shutdown
        if (unlikely(ftrace_disabled))
                return -ENODEV;  // return here, __unregister_ftrace_function is not executed,
                                 // as a result, ops is still in the ftrace_ops_list
        __unregister_ftrace_function
        ...

    If ops is dynamically allocated, it will be freed later; in this case,
    is_ftrace_trampoline accesses a NULL pointer:

    is_ftrace_trampoline
      ftrace_ops_trampoline
        do_for_each_ftrace_op(op, ftrace_ops_list) // OOPS! op may be NULL!

    Syzkaller reports as follows:
    [ 1203.506103] BUG: kernel NULL pointer dereference, address: 000000000000010b
    [ 1203.508039] #PF: supervisor read access in kernel mode
    [ 1203.508798] #PF: error_code(0x0000) - not-present page
    [ 1203.509558] PGD 800000011660b067 P4D 800000011660b067 PUD 130fb8067 PMD 0
    [ 1203.510560] Oops: 0000 [#1] SMP KASAN PTI
    [ 1203.511189] CPU: 6 PID: 29532 Comm: syz-executor.2 Tainted: G    B   W         5.10.0 #8
    [ 1203.512324] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    [ 1203.513895] RIP: 0010:is_ftrace_trampoline+0x26/0xb0
    [ 1203.514644] Code: ff eb d3 90 41 55 41 54 49 89 fc 55 53 e8 f2 00 fd ff 48 8b 1d 3b 35 5d 03 e8 e6 00 fd ff 48 8d bb 90 00 00 00 e8 2a 81 26 00 <48> 8b ab 90 00 00 00 48 85 ed 74 1d e8 c9 00 fd ff 48 8d bb 98 00
    [ 1203.518838] RSP: 0018:ffffc900012cf960 EFLAGS: 00010246
    [ 1203.520092] RAX: 0000000000000000 RBX: 000000000000007b RCX: ffffffff8a331866
    [ 1203.521469] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000000010b
    [ 1203.522583] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8df18b07
    [ 1203.523550] R10: fffffbfff1be3160 R11: 0000000000000001 R12: 0000000000478399
    [ 1203.524596] R13: 0000000000000000 R14: ffff888145088000 R15: 0000000000000008
    [ 1203.525634] FS:  00007f429f5f4700(0000) GS:ffff8881daf00000(0000) knlGS:0000000000000000
    [ 1203.526801] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1203.527626] CR2: 000000000000010b CR3: 0000000170e1e001 CR4: 00000000003706e0
    [ 1203.528611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 1203.529605] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

    Therefore, when ftrace_startup_enable fails, we need to roll back the
    registration process and remove ops from ftrace_ops_list.

    Link: https://lkml.kernel.org/r/20220818032659.56209-1-yangjihong1@huawei.com

    Suggested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:42 -04:00
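The fix described above is a rollback on a half-completed registration: if the enable step fails after the ops was linked into the global list, the register path must unlink it again before returning the error, so no stale pointer is left for later list walkers. A user-space model with invented, simplified names (the kernel's list and error codes differ):

```c
#include <assert.h>
#include <stddef.h>

struct ops { struct ops *next; int enable_err; };

static struct ops *ops_list;    /* stands in for ftrace_ops_list */

static int register_ops(struct ops *op)
{
    op->next = ops_list;        /* link: mirrors add_ftrace_ops() */
    ops_list = op;
    if (op->enable_err) {       /* mirrors ftrace_startup_enable() failing */
        ops_list = op->next;    /* the fix: roll back the registration */
        op->next = NULL;
        return op->enable_err;
    }
    return 0;
}

static int list_contains(const struct ops *op)
{
    for (const struct ops *p = ops_list; p; p = p->next)
        if (p == op)
            return 1;
    return 0;
}

static struct ops good_op = { NULL, 0 };
static struct ops bad_op  = { NULL, -19 };  /* -ENODEV-style failure */
```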
Adrien Thierry 0681cb9c4e ftrace: fix building with SYSCTL=y but DYNAMIC_FTRACE=n
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit 8fd7c2144d1292f15c901211750dee021ed5079a
Author: Luis Chamberlain <mcgrof@kernel.org>
Date:   Thu Apr 21 11:38:43 2022 -0700

    ftrace: fix building with SYSCTL=y but DYNAMIC_FTRACE=n

    Ok so hopefully this is the last of it. 0day picked up a build
    failure [0] when SYSCTL=y but DYNAMIC_FTRACE=n. This can be fixed
    by declaring empty stub routines for the calls that were recently
    moved.

    [0] https://lkml.kernel.org/r/202204161203.6dSlgKJX-lkp@intel.com

    Reported-by: kernel test robot <lkp@intel.com>
    Fixes: f8b7d2b4c192 ("ftrace: fix building with SYSCTL=n but DYNAMIC_FTRACE=y")
    Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:35 -04:00
Adrien Thierry 0d6822ddd7 ftrace: fix building with SYSCTL=n but DYNAMIC_FTRACE=y
JIRA: https://issues.redhat.com/browse/RHEL-1491

commit f8b7d2b4c192118c37ab24c0540d1134dd0104d8
Author: Luis Chamberlain <mcgrof@kernel.org>
Date:   Fri Apr 15 14:29:57 2022 -0700

    ftrace: fix building with SYSCTL=n but DYNAMIC_FTRACE=y

    One can enable dynamic tracing but disable sysctls.
    When this is done we get the following kernel compile warning:

      CC      kernel/trace/ftrace.o
    kernel/trace/ftrace.c:3086:13: warning: ‘ftrace_shutdown_sysctl’ defined
    but not used [-Wunused-function]
     3086 | static void ftrace_shutdown_sysctl(void)
          |             ^~~~~~~~~~~~~~~~~~~~~~
    kernel/trace/ftrace.c:3068:13: warning: ‘ftrace_startup_sysctl’ defined
    but not used [-Wunused-function]
     3068 | static void ftrace_startup_sysctl(void)

    When CONFIG_DYNAMIC_FTRACE=n, the ftrace_startup_sysctl() and
    ftrace_shutdown_sysctl() routines still compile, so they are really
    only used when SYSCTL=y.

    Fix this by moving these routines under the sysctl-enabled code.

    Fixes: 7cde53da38a3 ("ftrace: move sysctl_ftrace_enabled to ftrace.c")
    Reported-by: kernel test robot <lkp@intel.com>
    Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Adrien Thierry <athierry@redhat.com>
2023-08-17 15:16:29 -04:00
Waiman Long 5185520983 ftrace: Export ftrace_free_filter() to modules
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2190342

commit 8be9fbd5345da52f4a74f7f81d55ff9fa0a2958e
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Tue, 3 Jan 2023 12:49:11 +0000

    ftrace: Export ftrace_free_filter() to modules

    Setting filters on an ftrace ops results in some memory being allocated
    for the filter hashes, which must be freed before the ops can be freed.
    This can be done by removing every individual element of the hash by
    calling ftrace_set_filter_ip() or ftrace_set_filter_ips() with `remove`
    set, but this is somewhat error prone as it's easy to forget to remove
    an element.

    Make it easier to clean this up by exporting ftrace_free_filter(), which
    can be used to clean up all of the filter hashes after an ftrace_ops has
    been unregistered.

    Using this, fix the ftrace-direct* samples to free hashes prior to being
    unloaded. All other code either removes individual filters explicitly or
    is built-in and already calls ftrace_free_filter().

    Link: https://lkml.kernel.org/r/20230103124912.2948963-3-mark.rutland@arm.com

    Cc: stable@vger.kernel.org
    Cc: Florent Revest <revest@chromium.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Fixes: e1067a07cfbc ("ftrace/samples: Add module to test multi direct modify interface")
    Fixes: 5fae941b9a6f ("ftrace/samples: Add multi direct interface test module")
    Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Waiman Long <longman@redhat.com>
2023-06-30 20:32:14 -04:00
Jerome Marchand 85bb9b95b8 ftrace: Add support to resolve module symbols in ftrace_lookup_symbols
Bugzilla: https://bugzilla.redhat.com/2177177

Conflicts: Context change from missing commit ("5d79fa0d3325 ftrace:
Fix build warning")

commit 3640bf8584f4ab0f5eed6285f09213954acd8b62
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Tue Oct 25 15:41:42 2022 +0200

    ftrace: Add support to resolve module symbols in ftrace_lookup_symbols

    Currently ftrace_lookup_symbols iterates only over core symbols; add a
    module_kallsyms_on_each_symbol call to check module symbols as well.

    Also remove the 'args.found == args.cnt' condition, because it's
    already checked in the kallsyms_callback function.

    Also remove the 'err < 0' check, because neither *kallsyms_on_each_symbol
    function returns an error.

    Reported-by: Martynas Pumputis <m@lambda.lt>
    Acked-by: Song Liu <song@kernel.org>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20221025134148.3300700-3-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2023-04-28 11:42:56 +02:00
Prarit Bhargava 2c8a78d6e0 ftrace: Cleanup ftrace_dyn_arch_init()
Bugzilla: https://bugzilla.redhat.com/2163809

commit 6644c654ea70e0d8b8d5111e1272f8f29df00f21
Author: Weizhao Ouyang <o451686892@gmail.com>
Date:   Thu Sep 9 17:02:16 2021 +0800

    ftrace: Cleanup ftrace_dyn_arch_init()

    Most architectures use an empty ftrace_dyn_arch_init(); introduce a
    weak common ftrace_dyn_arch_init() to clean them up.

    Link: https://lkml.kernel.org/r/20210909090216.1955240-1-o451686892@gmail.com

    Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390)
    Acked-by: Helge Deller <deller@gmx.de> (parisc)
    Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
2023-04-10 09:58:49 -04:00
Artem Savkov a1e313c3bf ftrace: Keep the resolved addr in kallsyms_callback
Bugzilla: https://bugzilla.redhat.com/2166911

commit 9d68c19c57d690547cde977bb3d9ccd3ceb6afe9
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Mon Sep 26 17:33:36 2022 +0200

    ftrace: Keep the resolved addr in kallsyms_callback
    
    Keep the resolved 'addr' in kallsyms_callback instead of taking the
    ftrace_location value, because the cookie-related code depends on the
    symbol address.

    With the CONFIG_X86_KERNEL_IBT option, the ftrace_location value differs
    from the symbol address, which breaks the symbol-address cookie matching.
    
    There are 2 users of this function:
    - bpf_kprobe_multi_link_attach
        for which this fix is for
    
    - get_ftrace_locations
        which is used by register_fprobe_syms
    
        this function needs to get symbols resolved to addresses,
        but does not need 'ftrace location addresses' at this point
        there's another ftrace location translation in the path done
        by ftrace_set_filter_ips call:
    
         register_fprobe_syms
           addrs = get_ftrace_locations
    
           register_fprobe_ips(addrs)
             ...
             ftrace_set_filter_ips
               ...
                 __ftrace_match_addr
                   ip = ftrace_location(ip);
                   ...
    
    Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20220926153340.1621984-3-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-03-06 14:54:18 +01:00
Artem Savkov e90b12cbbc ftrace: Add cleanup to unregister_ftrace_direct_multi
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit fea3ffa48c6d42a11dca766c89284d22eaf5603f
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Mon Dec 6 19:20:31 2021 +0100

    ftrace: Add cleanup to unregister_ftrace_direct_multi

    Adding ops cleanup to unregister_ftrace_direct_multi,
    so it can be reused in another register call.

    Link: https://lkml.kernel.org/r/20211206182032.87248-3-jolsa@kernel.org

    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Fixes: f64dd4627ec6 ("ftrace: Add multi direct register/unregister interface")
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:50 +01:00
Artem Savkov 834664bf5b ftrace: Use direct_ops hash in unregister_ftrace_direct
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 7d5b7cad79da76f3dad4a9f6040e524217814e5a
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Mon Dec 6 19:20:30 2021 +0100

    ftrace: Use direct_ops hash in unregister_ftrace_direct

    Now when we have *direct_multi interface the direct_functions
    hash is no longer owned just by direct_ops. It's also used by
    any other ftrace_ops passed to *direct_multi interface.

    Thus to find out that we are unregistering the last function
    from direct_ops, we need to check directly direct_ops's hash.

    Link: https://lkml.kernel.org/r/20211206182032.87248-2-jolsa@kernel.org

    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Fixes: f64dd4627ec6 ("ftrace: Add multi direct register/unregister interface")
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:49 +01:00
Artem Savkov 118f3039f7 ftrace: Fix recursive locking direct_mutex in ftrace_modify_direct_caller
Bugzilla: https://bugzilla.redhat.com/2137876

commit 9d2ce78ddcee159eb6a97449e9c68b6d60b9cec4
Author: Song Liu <song@kernel.org>
Date:   Mon Sep 26 17:41:46 2022 -0700

    ftrace: Fix recursive locking direct_mutex in ftrace_modify_direct_caller
    
    Naveen reported recursive locking of direct_mutex with sample
    ftrace-direct-modify.ko:
    
    [   74.762406] WARNING: possible recursive locking detected
    [   74.762887] 6.0.0-rc6+ #33 Not tainted
    [   74.763216] --------------------------------------------
    [   74.763672] event-sample-fn/1084 is trying to acquire lock:
    [   74.764152] ffffffff86c9d6b0 (direct_mutex){+.+.}-{3:3}, at: \
        register_ftrace_function+0x1f/0x180
    [   74.764922]
    [   74.764922] but task is already holding lock:
    [   74.765421] ffffffff86c9d6b0 (direct_mutex){+.+.}-{3:3}, at: \
        modify_ftrace_direct+0x34/0x1f0
    [   74.766142]
    [   74.766142] other info that might help us debug this:
    [   74.766701]  Possible unsafe locking scenario:
    [   74.766701]
    [   74.767216]        CPU0
    [   74.767437]        ----
    [   74.767656]   lock(direct_mutex);
    [   74.767952]   lock(direct_mutex);
    [   74.768245]
    [   74.768245]  *** DEADLOCK ***
    [   74.768245]
    [   74.768750]  May be due to missing lock nesting notation
    [   74.768750]
    [   74.769332] 1 lock held by event-sample-fn/1084:
    [   74.769731]  #0: ffffffff86c9d6b0 (direct_mutex){+.+.}-{3:3}, at: \
        modify_ftrace_direct+0x34/0x1f0
    [   74.770496]
    [   74.770496] stack backtrace:
    [   74.770884] CPU: 4 PID: 1084 Comm: event-sample-fn Not tainted ...
    [   74.771498] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
    [   74.772474] Call Trace:
    [   74.772696]  <TASK>
    [   74.772896]  dump_stack_lvl+0x44/0x5b
    [   74.773223]  __lock_acquire.cold.74+0xac/0x2b7
    [   74.773616]  lock_acquire+0xd2/0x310
    [   74.773936]  ? register_ftrace_function+0x1f/0x180
    [   74.774357]  ? lock_is_held_type+0xd8/0x130
    [   74.774744]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
    [   74.775213]  __mutex_lock+0x99/0x1010
    [   74.775536]  ? register_ftrace_function+0x1f/0x180
    [   74.775954]  ? slab_free_freelist_hook.isra.43+0x115/0x160
    [   74.776424]  ? ftrace_set_hash+0x195/0x220
    [   74.776779]  ? register_ftrace_function+0x1f/0x180
    [   74.777194]  ? kfree+0x3e1/0x440
    [   74.777482]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
    [   74.777941]  ? __schedule+0xb40/0xb40
    [   74.778258]  ? register_ftrace_function+0x1f/0x180
    [   74.778672]  ? my_tramp1+0xf/0xf [ftrace_direct_modify]
    [   74.779128]  register_ftrace_function+0x1f/0x180
    [   74.779527]  ? ftrace_set_filter_ip+0x33/0x70
    [   74.779910]  ? __schedule+0xb40/0xb40
    [   74.780231]  ? my_tramp1+0xf/0xf [ftrace_direct_modify]
    [   74.780678]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
    [   74.781147]  ftrace_modify_direct_caller+0x5b/0x90
    [   74.781563]  ? 0xffffffffa0201000
    [   74.781859]  ? my_tramp1+0xf/0xf [ftrace_direct_modify]
    [   74.782309]  modify_ftrace_direct+0x1b2/0x1f0
    [   74.782690]  ? __schedule+0xb40/0xb40
    [   74.783014]  ? simple_thread+0x2a/0xb0 [ftrace_direct_modify]
    [   74.783508]  ? __schedule+0xb40/0xb40
    [   74.783832]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
    [   74.784294]  simple_thread+0x76/0xb0 [ftrace_direct_modify]
    [   74.784766]  kthread+0xf5/0x120
    [   74.785052]  ? kthread_complete_and_exit+0x20/0x20
    [   74.785464]  ret_from_fork+0x22/0x30
    [   74.785781]  </TASK>
    
    Fix this by using register_ftrace_function_nolock in
    ftrace_modify_direct_caller.
    
    Link: https://lkml.kernel.org/r/20220927004146.1215303-1-song@kernel.org
    
    Fixes: 53cd885bc5c3 ("ftrace: Allow IPMODIFY and DIRECT ops on the same function")
    Reported-and-tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:49 +01:00
Artem Savkov df3ef06779 ftrace: Fix build warning for ops_references_rec() not used
Bugzilla: https://bugzilla.redhat.com/2137876

commit 123d6455771ec577ce65f8d1bda548fb0eb7ef21
Author: Wang Jingjin <wangjingjin1@huawei.com>
Date:   Mon Aug 1 16:47:45 2022 +0800

    ftrace: Fix build warning for ops_references_rec() not used
    
    The change that made IPMODIFY and DIRECT ops work together needed access
    to the ops_references_ip() function, which it pulled out of the module
    only code. But now if both CONFIG_MODULES and
    CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not set, we get the below
    warning:
    
        ‘ops_references_rec’ defined but not used.
    
    Since ops_references_rec() only calls ops_references_ip() replace the
    usage of ops_references_rec() with ops_references_ip() and encompass the
    function with an #ifdef of DIRECT_CALLS || MODULES being defined.
    
    Link: https://lkml.kernel.org/r/20220801084745.1187987-1-wangjingjin1@huawei.com
    
    Fixes: 53cd885bc5c3 ("ftrace: Allow IPMODIFY and DIRECT ops on the same function")
    Signed-off-by: Wang Jingjin <wangjingjin1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:48 +01:00
Artem Savkov c4e57782d9 ftrace: Allow IPMODIFY and DIRECT ops on the same function
Bugzilla: https://bugzilla.redhat.com/2137876

commit 53cd885bc5c3ea283cc9c00ca6446c778f00bfba
Author: Song Liu <song@kernel.org>
Date:   Tue Jul 19 17:21:24 2022 -0700

    ftrace: Allow IPMODIFY and DIRECT ops on the same function
    
    IPMODIFY (livepatch) and DIRECT (bpf trampoline) ops are both important
    users of ftrace. It is necessary to allow them to work on the same
    function at the same time.
    
    First, DIRECT ops no longer specify IPMODIFY flag. Instead, DIRECT flag is
    handled together with IPMODIFY flag in __ftrace_hash_update_ipmodify().
    
    Then, a callback function, ops_func, is added to ftrace_ops. This is used
    by ftrace core code to understand whether the DIRECT ops can share with an
    IPMODIFY ops. To share with IPMODIFY ops, the DIRECT ops need to implement
    the callback function and adjust the direct trampoline accordingly.
    
    If DIRECT ops is attached before the IPMODIFY ops, ftrace core code calls
    ENABLE_SHARE_IPMODIFY_PEER on the DIRECT ops before registering the
    IPMODIFY ops.
    
    If IPMODIFY ops is attached before the DIRECT ops, ftrace core code calls
    ENABLE_SHARE_IPMODIFY_SELF in __ftrace_hash_update_ipmodify. Owner of the
    DIRECT ops may return 0 if the DIRECT trampoline can share with IPMODIFY,
    or an error code otherwise. The error code is propagated to
    register_ftrace_direct_multi so that the owner of the DIRECT trampoline can
    handle it properly.
    
    For more details, please refer to comment before enum ftrace_ops_cmd.
    
    Signed-off-by: Song Liu <song@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Link: https://lore.kernel.org/all/20220602193706.2607681-2-song@kernel.org/
    Link: https://lore.kernel.org/all/20220718055449.3960512-1-song@kernel.org/
    Link: https://lore.kernel.org/bpf/20220720002126.803253-3-song@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
Artem Savkov c73bf446d3 ftrace: Add modify_ftrace_direct_multi_nolock
Bugzilla: https://bugzilla.redhat.com/2137876

commit f96f644ab97abeed3f7007c953836a574ce928cc
Author: Song Liu <song@kernel.org>
Date:   Tue Jul 19 17:21:23 2022 -0700

    ftrace: Add modify_ftrace_direct_multi_nolock
    
    This is similar to modify_ftrace_direct_multi, but does not acquire
    direct_mutex. This is useful when direct_mutex is already locked by the
    user.
    
    Signed-off-by: Song Liu <song@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Link: https://lore.kernel.org/bpf/20220720002126.803253-2-song@kernel.org

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
Artem Savkov 45f7db07df ftrace/direct: Fix lockup in modify_ftrace_direct_multi
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 2e6e9058d13a22a6fdd36a8c444ac71d9656003a
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Tue Nov 9 12:42:17 2021 +0100

    ftrace/direct: Fix lockup in modify_ftrace_direct_multi

    We can't call unregister_ftrace_function under ftrace_lock.

    Link: https://lkml.kernel.org/r/20211109114217.1645296-1-jolsa@kernel.org

    Fixes: ed29271894aa ("ftrace/direct: Do not disable when switching direct callers")
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
Artem Savkov 119b491f4f ftrace/direct: Do not disable when switching direct callers
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit ed29271894aa92826d308231593b7ee7ac5a4932
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Thu Oct 14 16:11:14 2021 -0400

    ftrace/direct: Do not disable when switching direct callers

    Currently to switch a set of "multi" direct trampolines from one
    trampoline to another, a full shutdown of the current set needs to be
    done, followed by an update to what trampoline the direct callers would
    call, and then re-enabling the callers. This leaves a time when the
    functions will not be calling anything, and events may be missed.

    Instead, use a trick so that all the functions with direct trampolines
    attached will always call either the new or old trampoline while the
    switch is happening. To do this, first attach a "dummy" callback via
    ftrace to all the functions that the current direct trampoline is attached
    to. This will cause the functions to call the "list func" instead of the
    direct trampoline. The list function will call the direct trampoline
    "helper" that will set the function it should call as it returns back to
    the ftrace trampoline.

    At this moment, the direct caller descriptor can safely update the direct
    call trampoline. The list function will pick either the new or old
    function (depending on the memory coherency model of the architecture).

    Now removing the dummy function from each of the locations of the direct
    trampoline caller, will put back the direct call, but now to the new
    trampoline.

    A better visual is:

    [ Changing direct call from my_direct_1 to my_direct_2 ]

      <traced_func>:
         call my_direct_1

     ||||||||||||||||||||
     vvvvvvvvvvvvvvvvvvvv

      <traced_func>:
         call ftrace_caller

      <ftrace_caller>:
        [..]
        call ftrace_ops_list_func

            ftrace_ops_list_func()
            {
                    ops->func() -> direct_helper -> set rax to my_direct_1 or my_direct_2
            }

       call rax (to either my_direct_1 or my_direct_2)

     ||||||||||||||||||||
     vvvvvvvvvvvvvvvvvvvv

      <traced_func>:
         call my_direct_2

    Link: https://lore.kernel.org/all/20211014162819.5c85618b@gandalf.local.home/

    Acked-by: Jiri Olsa <jolsa@redhat.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
Artem Savkov 2e9d0a0fb1 ftrace: Add multi direct modify interface
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit ccf5a89efd6f0a9483cea8acd4a0822b1a47e59a
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Fri Oct 8 11:13:35 2021 +0200

    ftrace: Add multi direct modify interface

    Adding interface to modify registered direct function
    for ftrace_ops. Adding following function:

       modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)

    The function changes the currently registered direct
    function for all attached functions.

    Link: https://lkml.kernel.org/r/20211008091336.33616-8-jolsa@kernel.org

    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
Artem Savkov 4cc08747a0 ftrace: Add multi direct register/unregister interface
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit f64dd4627ec6edc39bf1430fe6dbc923d2300a88
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Fri Oct 8 11:13:34 2021 +0200

    ftrace: Add multi direct register/unregister interface

    Adding interface to register multiple direct functions
    within single call. Adding following functions:

      register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
      unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)

    The register_ftrace_direct_multi registers the direct function (addr)
    for all functions in the ops filter. The ops filter can be updated
    beforehand with ftrace_set_filter_ip calls.

    All requested functions must not have a direct function currently
    registered, otherwise register_ftrace_direct_multi will fail.

    The unregister_ftrace_direct_multi unregisters ops related direct
    functions.

    Link: https://lkml.kernel.org/r/20211008091336.33616-7-jolsa@kernel.org

    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
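A typical consumer of the register/unregister pair above is a kernel module that owns a direct trampoline. The following is a minimal sketch modeled on the in-tree samples/ftrace direct-multi example; it only builds inside a kernel tree, and the trampoline `my_tramp` and the choice of `wake_up_process` as the traced function are illustrative assumptions:

```c
#include <linux/ftrace.h>
#include <linux/module.h>
#include <linux/sched.h>

/* Hypothetical asm direct trampoline (definition not shown here). */
extern void my_tramp(void);

static struct ftrace_ops direct;

static int __init direct_multi_init(void)
{
	int ret;

	/* Fill the ops filter first; register_ftrace_direct_multi then
	 * attaches the direct address to every function in that filter. */
	ret = ftrace_set_filter_ip(&direct, (unsigned long)wake_up_process, 0, 0);
	if (ret)
		return ret;

	return register_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
}

static void __exit direct_multi_exit(void)
{
	/* Unregisters the direct function from everything in the filter
	 * and, after the cleanup commit above, leaves ops reusable. */
	unregister_ftrace_direct_multi(&direct, (unsigned long)my_tramp);
}

module_init(direct_multi_init);
module_exit(direct_multi_exit);
MODULE_DESCRIPTION("direct-multi usage sketch");
MODULE_LICENSE("GPL");
```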
Artem Savkov 9b0bf08e5b ftrace: Add ftrace_add_rec_direct function
Bugzilla: https://bugzilla.redhat.com/2137876

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit 1904a8144598031af85406873c5fbec806ee3fd7
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Fri Oct 8 11:13:33 2021 +0200

    ftrace: Add ftrace_add_rec_direct function

    Factor out the code that adds (ip, addr) tuple to direct_functions
    hash in new ftrace_add_rec_direct function. It will be used in
    following patches.

    Link: https://lkml.kernel.org/r/20211008091336.33616-6-jolsa@kernel.org

    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Artem Savkov <asavkov@redhat.com>
2023-01-05 15:46:43 +01:00
Herton R. Krzesinski a12f14167c Merge: v5.18 backports for s390 expolines
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1610

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2072713
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2121735
Tested: Passed beaker CKI tests. https://beaker.engineering.redhat.com/jobs/7231265

This MR adds backports for s390 expolines. These commits were included.
- s390: replace cc-option-yn uses with cc-option
- s390/entry: remove unused expoline thunk
- s390: remove unused expoline to BC instructions
- s390/nospec: generate single register thunks if possible
- s390/nospec: add an option to use thunk-extern
- s390/nospec: align and size extern thunks
- s390/nospec: build expoline.o for modules_prepare target
- s390/nospec: remove unneeded header includes

Signed-off-by: Julia Denham <jdenham@redhat.com>

Approved-by: Joe Lawrence <joe.lawrence@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
2022-12-22 16:33:54 +00:00
Yauheni Kaliuta 4cb417aeff ftrace: Still disable enabled records marked as disabled
Bugzilla: http://bugzilla.redhat.com/2120968

commit cf04f2d5df0037741207382ac8fe289e8bf84ced
Author: Steven Rostedt (Google) <rostedt@goodmis.org>
Date:   Wed Oct 5 00:38:09 2022 -0400

    ftrace: Still disable enabled records marked as disabled
    
    Weak functions started causing havoc as they showed up in the
    "available_filter_functions" and this confused people as to why some
    functions marked as "notrace" were listed, but when enabled they did
    nothing. This was because weak functions can still have fentry calls, and
    these addresses get added to the "available_filter_functions" file.
    kallsyms is what converts those addresses to names, and since the weak
    functions are not listed in kallsyms, it would just pick the function
    before that.
    
    To solve this, there was a trick to detect weak functions listed, and
    these records would be marked as DISABLED so that they do not get enabled
    and are mostly ignored. As the processing of the list of all functions to
    figure out what is weak or not can take a long time, this process is put
    off into a kernel thread and run in parallel with the rest of start up.
    
    Now the issue happens when function tracing is enabled via the kernel
    command line. As it starts very early in boot up, it can be enabled before
    the records that are weak are marked to be disabled. This causes an issue
    in the accounting, as the weak records are enabled by the command line
    function tracing, but after boot up, they are not disabled.
    
    The ftrace records have several accounting flags and a ref count. The
    DISABLED flag is just one. If the record is enabled before it is marked
    DISABLED it will get an ENABLED flag and also have its ref counter
    incremented. After it is marked for DISABLED, neither the ENABLED flag nor
    the ref counter is cleared. There's sanity checks on the records that are
    performed after an ftrace function is registered or unregistered, and this
    detected that there were records marked as ENABLED with ref counter that
    should not have been.
    
    Note, the module loading code uses the DISABLED flag as well to keep its
    functions from being modified while it's being loaded and some of these
    flags may get set in this process. So changing the verification code to
    ignore DISABLED records is a no go, as it still needs to verify that the
    module records are working too.
    
    Also, the weak functions still are calling a trampoline. Even though they
    should never be called, it is dangerous to leave these weak functions
    calling a trampoline that is freed, so they should still be set back to
    nops.
    
    There are two places that need to not skip records that have the ENABLED
    and the DISABLED flags set. That is where the ftrace_ops is processed and
    sets the records ref counts, and then later when the function itself is to
    be updated, and the ENABLED flag gets removed. Add a helper function
    "skip_record()" that returns true if the record has the DISABLED flag set
    but not the ENABLED flag.
    
    Link: https://lkml.kernel.org/r/20221005003809.27d2b97b@gandalf.local.home
    
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: stable@vger.kernel.org
    Fixes: b39181f7c6907 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function")
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:11 +02:00
Yauheni Kaliuta a3babfbdc5 ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function
Bugzilla: http://bugzilla.redhat.com/2130850

commit b39181f7c6907dc66ff937b74758671fa6ba430c
Author: Steven Rostedt (Google) <rostedt@goodmis.org>
Date:   Thu May 26 14:19:12 2022 -0400

    ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function

    If an unused weak function was traced, its call to fentry will still
    exist, which gets added into the __mcount_loc table. Ftrace will use
    kallsyms to retrieve the name for each location in __mcount_loc to display
    it in the available_filter_functions and used to enable functions via the
    name matching in set_ftrace_filter/notrace. Enabling these functions does
    nothing but enable an unused call to ftrace_caller. If a traced weak
    function is overridden, the symbol of the function would be used for it,
    which will either create duplicate names, or, if the previous function was
    not traced, cause it to be incorrectly listed in available_filter_functions
    as a function that can be traced.

    This became an issue with BPF[1] as there are tooling that enables the
    direct callers via ftrace but then checks to see if the functions were
    actually enabled. One such case was a function marked notrace that was
    followed by an unused weak function that was traced. The unused
    function's call to fentry was added to the __mcount_loc section, and
    kallsyms retrieved the untraced function's symbol as the weak function was
    overridden. Since the untraced function would not get traced, the BPF
    check would detect this and fail.

    The real fix would be to fix kallsyms to not show addresses of weak
    functions as the function before it. But that would require adding code in
    the build to add function size to kallsyms so that it can know when the
    function ends instead of just using the start of the next known symbol.

    In the mean time, this is a work around. Add a FTRACE_MCOUNT_MAX_OFFSET
    macro that if defined, ftrace will ignore any function that has its call
    to fentry/mcount that has an offset from the symbol that is greater than
    FTRACE_MCOUNT_MAX_OFFSET.

    If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET
    to zero (unless IBT is enabled), which will have ftrace ignore all locations
    that are not at the start of the function (or one after the ENDBR
    instruction).

    A worker thread is added at boot up to scan all the ftrace record entries,
    and will mark any that fail the FTRACE_MCOUNT_MAX_OFFSET test as disabled.
    They will still appear in the available_filter_functions file as:

      __ftrace_invalid_address___<invalid-offset>

    (showing the offset that caused it to be invalid).

    This is required for tools that use libtracefs (like trace-cmd does) that
    scan the available_filter_functions and enable set_ftrace_filter and
    set_ftrace_notrace using indexes of the function listed in the file (this
    is a speedup, as enabling thousands of files via names is an O(n^2)
    operation and can take minutes to complete, where the indexing takes less
    than a second).

    The invalid functions cannot be removed from available_filter_functions as
    the names there correspond to the ftrace records in the array that manages
    them (and the indexing depends on this).

    [1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/

    Link: https://lkml.kernel.org/r/20220526141912.794c2786@gandalf.local.home

    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
2022-11-30 12:47:11 +02:00
Frantisek Hrbata 1269719102 Merge: BPF and XDP rebase to v5.18
Merge conflicts:
-----------------
arch/x86/net/bpf_jit_comp.c
        - bpf_arch_text_poke()
          HEAD(!1464) contains b73b002f7f ("x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline")
          Resolved in favour of !1464, but keep the return statement from !1477

MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1477

Bugzilla: https://bugzilla.redhat.com/2120966

Rebase BPF and XDP to the upstream kernel version 5.18

Patch applied, then reverted:
```
544356 selftests/bpf: switch to new libbpf XDP APIs
0bfb95 selftests, bpf: Do not yet switch to new libbpf XDP APIs
```
Taken in the perf rebase:
```
23fcfc perf: use generic bpf_program__set_type() to set BPF prog type
```
Unsuported arches:
```
5c1011 libbpf: Fix riscv register names
cf0b5b libbpf: Fix accessing syscall arguments on riscv
```
Depends on changes of other subsystems:
```
7fc8c3 s390/bpf: encode register within extable entry
aebfd1 x86/ibt,ftrace: Search for __fentry__ location
589127 x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline
```
Broken selftest:
```
edae34 selftests net: add UDP GRO fraglist + bpf self-tests
cf6783 selftests net: fix bpf build error
7b92aa selftests net: fix kselftest net fatal error
```
Out of scope:
```
baebdf net: dev: Makes sure netif_rx() can be invoked in any context.
5c8166 kbuild: replace $(if A,A,B) with $(or A,B)
1a97ce perf maps: Use a pointer for kmaps
967747 uaccess: remove CONFIG_SET_FS
42b01a s390: always use the packed stack layout
bf0882 flow_dissector: Add support for HSR
d09a30 s390/extable: move EX_TABLE define to asm-extable.h
3d6671 s390/extable: convert to relative table with data
4efd41 s390: raise minimum supported machine generation to z10
f65e58 flow_dissector: Add support for HSRv0
1a6d7a netdevsim: Introduce support for L3 offload xstats
9b1894 selftests: netdevsim: hw_stats_l3: Add a new test
84005b perf ftrace latency: Add -n/--use-nsec option
36c4a7 kasan, arm64: don't tag executable vmalloc allocations
8df013 docs: netdev: move the netdev-FAQ to the process pages
4d4d00 perf tools: Update copy of libbpf's hashmap.c
0df6ad perf evlist: Rename cpus to user_requested_cpus
1b8089 flow_dissector: fix false-positive __read_overflow2_field() warning
0ae065 perf build: Fix check for btf__load_from_kernel_by_id() in libbpf
8994e9 perf test bpf: Skip test if clang is not present
735346 perf build: Fix btf__load_from_kernel_by_id() feature check
f037ac s390/stack: merge empty stack frame slots
335220 docs: netdev: update maintainer-netdev.rst reference
a0b098 s390/nospec: remove unneeded header includes
34513a netdevsim: Fix hwstats debugfs file permissions
```

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>

Approved-by: John W. Linville <linville@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: Torez Smith <torez@redhat.com>
Approved-by: Jan Stancek <jstancek@redhat.com>
Approved-by: Prarit Bhargava <prarit@redhat.com>
Approved-by: Felix Maurer <fmaurer@redhat.com>
Approved-by: Viktor Malik <vmalik@redhat.com>

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
2022-11-21 05:30:47 -05:00
Julia Denham b1e59daec9 ftrace: Introduce ftrace_need_init_nop()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2121735

commit 67ccddf86621b18dbffe56f11a106774ee8f44bd
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date:   Wed Jul 28 23:25:45 2021 +0200

    ftrace: Introduce ftrace_need_init_nop()

    Implementing live patching on s390 requires each function's prologue to
    contain a very special kind of nop, which gcc and clang don't generate.
    However, the current code assumes that if CC_USING_NOP_MCOUNT is
    defined, then whatever the compiler generates is good enough.

    Move the CC_USING_NOP_MCOUNT check into the new ftrace_need_init_nop()
    macro, that the architectures can override.

    An alternative solution is to disable using -mnop-mcount in the
    Makefile, however, this makes the build logic (even) more complicated
    and forces the arch-specific code to deal with the useless __fentry__
    symbol.

    Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
    Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Link: https://lore.kernel.org/r/20210728212546.128248-2-iii@linux.ibm.com
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    (cherry picked from commit 67ccddf86621b18dbffe56f11a106774ee8f44bd)

Signed-off-by: Julia Denham <jdenham@redhat.com>
2022-11-08 17:33:02 -05:00
Joe Lawrence 80789bdd0d x86/ibt,ftrace: Search for __fentry__ location
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2121207

commit aebfd12521d9c7d0b502cf6d06314cfbcdccfe3b
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Tue Mar 8 16:30:29 2022 +0100

    x86/ibt,ftrace: Search for __fentry__ location

    Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
    with Intel IBT enabled the first instruction of a function will most
    likely be ENDBR.

    Change ftrace_location() to not only return the __fentry__ location
    when called for the __fentry__ location, but also when called for the
    sym+0 location.

    Then audit/update all callsites of this function to consistently use
    these new semantics.

    Suggested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
    Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org

Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
2022-10-27 14:27:56 -04:00
Jerome Marchand 2fbcdb2566 ftrace: Keep address offset in ftrace_lookup_symbols
Bugzilla: https://bugzilla.redhat.com/2120966

commit eb1b2985fe5c5f02e43e4c0d47bbe7ed835007f3
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Wed Jun 15 13:21:16 2022 +0200

    ftrace: Keep address offset in ftrace_lookup_symbols

    We want to store the resolved address on the same index as
    the symbol string, because that's the user (bpf kprobe link)
    code assumption.

    Also making sure we don't store duplicates that might be
    present in kallsyms.

    Acked-by: Song Liu <songliubraving@fb.com>
    Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Fixes: bed0d9a50dac ("ftrace: Add ftrace_lookup_symbols function")
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20220615112118.497303-3-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:11 +02:00
Jerome Marchand c52d055c5c ftrace: Add ftrace_lookup_symbols function
Bugzilla: https://bugzilla.redhat.com/2120966

commit bed0d9a50dacee6fcf785c555cfb0d2573355afc
Author: Jiri Olsa <jolsa@kernel.org>
Date:   Tue May 10 14:26:13 2022 +0200

    ftrace: Add ftrace_lookup_symbols function

    Adding ftrace_lookup_symbols function that resolves array of symbols
    with single pass over kallsyms.

    The user provides array of string pointers with count and pointer to
    allocated array for resolved values.

      int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt,
                                unsigned long *addrs)

    It iterates over all kallsyms symbols and tries to look up each one
    in the provided symbols array with bsearch. The symbols array needs
    to be sorted by name for this reason.

    We also check each symbol to pass ftrace_location, because this API
    will be used for fprobe symbols resolving.

    Suggested-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: Andrii Nakryiko <andrii@kernel.org>
    Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/r/20220510122616.2652285-3-jolsa@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:10 +02:00
Jerome Marchand 199c6e01ca ftrace: Add ftrace_set_filter_ips function
Bugzilla: https://bugzilla.redhat.com/2120966

commit 4f554e955614f19425cee86de4669351741a6280
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Tue Mar 15 23:00:26 2022 +0900

    ftrace: Add ftrace_set_filter_ips function

    Adding ftrace_set_filter_ips function to be able to set filter on
    multiple ip addresses at once.

    With the kprobe multi attach interface we have cases where we need to
    initialize an ftrace_ops object with thousands of functions, so having
    a single function dive into ftrace_hash_move_and_update_ops under
    ftrace_lock is faster.

    The functions' IPs are passed as an unsigned long array together with
    a count.

    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/164735282673.1084943.18310504594134769804.stgit@devnote2

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-10-25 14:58:03 +02:00
Phil Auld 5087f87023 sched/tracing: Append prev_state to tp args instead
Bugzilla: https://bugzilla.redhat.com/2078906
Conflicts: Skipped one hunk, in samples, due to not having 3a73333fb370
("tracing: Add TRACE_CUSTOM_EVENT() macro").

commit 9c2136be0878c88c53dea26943ce40bb03ad8d8d
Author: Delyan Kratunov <delyank@fb.com>
Date:   Wed May 11 18:28:36 2022 +0000

    sched/tracing: Append prev_state to tp args instead

    Commit fa2c3254d7cf (sched/tracing: Don't re-read p->state when emitting
    sched_switch event, 2022-01-20) added a new prev_state argument to the
    sched_switch tracepoint, before the prev task_struct pointer.

    This reordering of arguments broke BPF programs that use the raw
    tracepoint (e.g. tp_btf programs). The type of the second argument has
    changed and existing programs that assume a task_struct* argument
    (e.g. for bpf_task_storage access) will now fail to verify.

    If we instead append the new argument to the end, all existing programs
    would continue to work and can conditionally extract the prev_state
    argument on supported kernel versions.

    Fixes: fa2c3254d7cf (sched/tracing: Don't re-read p->state when emitting sched_switch event, 2022-01-20)
    Signed-off-by: Delyan Kratunov <delyank@fb.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Link: https://lkml.kernel.org/r/c8a6930dfdd58a4a5755fc01732675472979732b.camel@fb.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-06-02 09:20:55 -04:00
Jerome Marchand 1f4a51caa1 tracing: Disable "other" permission bits in the tracefs files
Bugzilla: https://bugzilla.redhat.com/2069708

commit 21ccc9cd72116289469e5519b6159c675a2fa58f
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Wed Aug 18 11:24:51 2021 -0400

    tracing: Disable "other" permission bits in the tracefs files

    When building the files in the tracefs file system, do not by default set
    any permissions for OTH (other). This will make it easier for admins who
    want to define a group for accessing tracefs and not having to first
    disable all the permission bits for "other" in the file system.

    As tracing can leak sensitive information, it should never by default
    allow all users access. An admin can still set the permission bits for
    others to have access, which may be useful for creating a honeypot and
    seeing who takes advantage of it and roots the machine.

    Link: https://lkml.kernel.org/r/20210818153038.864149276@goodmis.org

    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2022-05-16 09:14:22 +02:00
Phil Auld 090df5874d sched/tracing: Don't re-read p->state when emitting sched_switch event
Bugzilla: http://bugzilla.redhat.com/2062831

commit fa2c3254d7cfff5f7a916ab928a562d1165f17bb
Author: Valentin Schneider <valentin.schneider@arm.com>
Date:   Thu Jan 20 16:25:19 2022 +0000

    sched/tracing: Don't re-read p->state when emitting sched_switch event

    As of commit

      c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")

    the following sequence becomes possible:

                          p->__state = TASK_INTERRUPTIBLE;
                          __schedule()
                            deactivate_task(p);
      ttwu()
        READ !p->on_rq
        p->__state=TASK_WAKING
                            trace_sched_switch()
                              __trace_sched_switch_state()
                                task_state_index()
                                  return 0;

    TASK_WAKING isn't in TASK_REPORT, so the task appears as TASK_RUNNING in
    the trace event.

    Prevent this by pushing the value read from __schedule() down the trace
    event.

    Reported-by: Abhijeet Dharmapurikar <adharmap@quicinc.com>
    Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Link: https://lore.kernel.org/r/20220120162520.570782-2-valentin.schneider@arm.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-03-28 09:28:37 -04:00
Wander Lairson Costa d6846b1c2e
ftrace: disable preemption when recursion locked
Bugzilla: http://bugzilla.redhat.com/1938117

commit ce5e48036c9e76a2a5bd4d9079eac273087a533a
Author: 王贇 <yun.wang@linux.alibaba.com>
Date:   Wed Oct 27 11:14:44 2021 +0800

    ftrace: disable preemption when recursion locked

    As the documentation explains, ftrace_test_recursion_trylock()
    and ftrace_test_recursion_unlock() were supposed to disable and
    enable preemption properly; however, currently this work is done
    outside of the functions, where it could be missed by mistake.

    And since the internal use of trace_test_and_set_recursion()
    and trace_clear_recursion() also requires preemption to be
    disabled, we can just merge the logic.

    This patch makes sure that preemption has been disabled when
    trace_test_and_set_recursion() returns bit >= 0, and that
    trace_clear_recursion() re-enables preemption if it was
    previously enabled.

    Link: https://lkml.kernel.org/r/13bde807-779c-aa4c-0672-20515ae365ea@linux.alibaba.com

    CC: Petr Mladek <pmladek@suse.com>
    Cc: Guo Ren <guoren@kernel.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Helge Deller <deller@gmx.de>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Paul Walmsley <paul.walmsley@sifive.com>
    Cc: Palmer Dabbelt <palmer@dabbelt.com>
    Cc: Albert Ou <aou@eecs.berkeley.edu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Josh Poimboeuf <jpoimboe@redhat.com>
    Cc: Jiri Kosina <jikos@kernel.org>
    Cc: Joe Lawrence <joe.lawrence@redhat.com>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Jisheng Zhang <jszhang@kernel.org>
    CC: Steven Rostedt <rostedt@goodmis.org>
    CC: Miroslav Benes <mbenes@suse.cz>
    Reported-by: Abaci <abaci@linux.alibaba.com>
    Suggested-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
    [ Removed extra line in comment - SDR ]
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
2022-01-03 11:30:41 -03:00
Colin Ian King 3b1a8f457f ftrace: Remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never
read; it is updated later on. The assignment is redundant and
can be removed.

Link: https://lkml.kernel.org/r/20210721120915.122278-1-colin.king@canonical.com

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-07-23 08:46:02 -04:00
Nicolas Saenz Julienne 68e83498cb ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary
synchronize_rcu_tasks_rude() triggers IPIs and forces rescheduling on
all CPUs. It is a costly operation and, when targeting nohz_full CPUs,
very disrupting (hence the name). So avoid calling it when 'old_hash'
doesn't need to be freed.

Link: https://lkml.kernel.org/r/20210721114726.1545103-1-nsaenzju@redhat.com

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-07-23 08:45:53 -04:00
Baokun Li 3ecda64475 ftrace: Use list_move instead of list_del/list_add
Use list_move() instead of list_del() + list_add().

Link: https://lkml.kernel.org/r/20210608031108.2820996-1-libaokun1@huawei.com

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-07-08 13:02:58 -04:00
Steven Rostedt (VMware) 6c14133d2d ftrace: Do not blindly read the ip address in ftrace_bug()
It was reported that a bug on arm64 caused a bad ip address to be used for
updating into a nop in ftrace_init(), but the error path (rightfully)
returned -EINVAL and not -EFAULT, as the bug caused more than one error to
occur. But because -EINVAL was returned, the ftrace_bug() tried to report
what was at the location of the ip address, and read it directly. This
caused the machine to panic, as the ip was not pointing to a valid memory
address.

Instead, read the ip address with copy_from_kernel_nofault() to safely
access the memory, and if it faults, report that the address faulted,
otherwise report what was in that location.

Link: https://lore.kernel.org/lkml/20210607032329.28671-1-mark-pk.tsai@mediatek.com/

Cc: stable@vger.kernel.org
Fixes: 05736a427f ("ftrace: warn on failure to disable mcount callers")
Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-08 16:44:00 -04:00
Linus Torvalds 7ec901b6fa tracing: Fix probes written to the set_ftrace_filter file

Merge tag 'trace-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix probes written to the set_ftrace_filter file

  Now that there's a library that accesses the tracefs file system
  (libtracefs), the way the files are interacted with is slightly
  different than the command line. For instance, the write() system call
  is used directly instead of an echo. This exposes some old bugs.

  If a probe is written to "set_ftrace_filter" without any white space
   after it, it will be ignored. This is because the write code assumes
   that a string which does not end in white space has more to come. But
   if the file is closed, the release function needs to finish it. The
   "set_ftrace_filter" release function handles the
  filter part of the "set_ftrace_filter" file, but did not handle the
  probe part"

* tag 'trace-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Handle commands when closing set_ftrace_filter file
2021-05-06 10:03:38 -07:00
Steven Rostedt (VMware) 8c9af478c0 ftrace: Handle commands when closing set_ftrace_filter file
# echo switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter

will cause switch_mm to stop tracing by the traceoff command.

 # echo -n switch_mm:traceoff > /sys/kernel/tracing/set_ftrace_filter

does nothing.

The reason is that the parsing in the write function only processes
commands if it finished parsing (there is white space written after the
command). That's to handle:

 write(fd, "switch_mm:", 10);
 write(fd, "traceoff", 8);

cases, where the command is broken over multiple writes.

The problem is if the file descriptor is closed, then the write call is
not processed, and the command needs to be processed in the release code.
The release code can handle matching of functions, but does not handle
commands.

Cc: stable@vger.kernel.org
Fixes: eda1e32855 ("tracing: handle broken names in ftrace filter")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-05-05 10:38:24 -04:00