8 Commits

Author SHA1 Message Date
Dave Airlie bd578c3abb lib/ref_tracker: add printing to memory buffer
JIRA: https://issues.redhat.com/browse/RHEL-24101
Upstream Status: v6.5-rc1

commit 227c6c832303cec3941166d3335ecbccd980d615
Author:     Andrzej Hajda <andrzej.hajda@intel.com>
AuthorDate: Fri Jun  2 12:21:35 2023 +0200
Commit:     Jakub Kicinski <kuba@kernel.org>
CommitDate: Mon Jun  5 15:28:42 2023 -0700

    Similar to stack_(depot|trace)_snprint, this patch
    adds a helper for printing stats to a memory buffer.
    It will be helpful for debugfs.

    Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-17 10:46:58 +10:00
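
The "printing to memory buffer" idea in the commit above can be illustrated with a small userspace sketch. It is not the kernel helper: buf_out, buf_out_printf and buf_report_snprint are hypothetical names, and the report format is invented for the example. The sketch only shows the general technique of appending formatted text to a caller-provided buffer while tracking how much space the full output would need.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output cursor: caller-provided buffer plus a running length. */
struct buf_out {
	char *buf;
	size_t size;
	size_t used; /* bytes the full output needs so far, even if truncated */
};

/* Append formatted text without ever writing past buf + size. */
static void buf_out_printf(struct buf_out *o, const char *fmt, ...)
{
	size_t room = o->used < o->size ? o->size - o->used : 0;
	va_list ap;
	int n;

	va_start(ap, fmt);
	/* With room == 0, vsnprintf only computes the needed length. */
	n = vsnprintf(room ? o->buf + o->used : NULL, room, fmt, ap);
	va_end(ap);
	if (n > 0)
		o->used += (size_t)n;
}

/* Format a one-line summary; returns the length an untruncated report needs. */
static size_t buf_report_snprint(char *buf, size_t size, const char *name,
				 unsigned int tracked, unsigned int total)
{
	struct buf_out o = { buf, size, 0 };

	buf_out_printf(&o, "%s: %u/%u users\n", name, tracked, total);
	return o.used;
}
```

A caller such as a debugfs-style read handler could compare the returned length with the buffer size to detect truncation.
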
Dave Airlie 3b0a87ad0e lib/ref_tracker: improve printing stats
JIRA: https://issues.redhat.com/browse/RHEL-24101
Upstream Status: v6.5-rc1

This doesn't backport the namespace chunk that isn't
in RHEL yet.

Conflicts:
        net/core/net_namespace.c

commit b6d7c0eb2dcbd238fa233a3a1737654e380e784a
Author:     Andrzej Hajda <andrzej.hajda@intel.com>
AuthorDate: Fri Jun  2 12:21:34 2023 +0200
Commit:     Jakub Kicinski <kuba@kernel.org>
CommitDate: Mon Jun  5 15:28:42 2023 -0700

    If the library is tracking a busy subsystem, simply
    printing the stack for every active reference will spam the log
    with long, hard-to-read, redundant stack traces. To improve
    readability, the following changes have been made:
    - reports are printed per stack_handle - log is more compact,
    - added display name for ref_tracker_dir - it will differentiate
      multiple subsystems,
    - stack trace is printed indented, in the same printk call,
    - info about dropped references is printed as well.

    Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-17 10:46:57 +10:00
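
The "reports are printed per stack_handle" change in the commit above amounts to grouping live references by the stack that took them. Below is a minimal userspace sketch of that grouping, assuming handles are plain integers. The names agg_stats and agg_collect and the cap AGG_MAX_STACKS are hypothetical and not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

#define AGG_MAX_STACKS 4 /* hypothetical cap on distinct stacks reported */

struct agg_stack {
	unsigned int handle; /* stands in for a stack depot handle */
	unsigned int count;  /* live references taken from that stack */
};

struct agg_stats {
	unsigned int total;   /* every reference seen, reported or not */
	unsigned int nstacks; /* distinct handles recorded */
	struct agg_stack stacks[AGG_MAX_STACKS];
};

/* Group references by handle so each distinct stack is reported once. */
static void agg_collect(const unsigned int *handles, size_t n,
			struct agg_stats *st)
{
	size_t i;
	unsigned int j;

	st->total = 0;
	st->nstacks = 0;
	for (i = 0; i < n; i++) {
		st->total++;
		for (j = 0; j < st->nstacks; j++)
			if (st->stacks[j].handle == handles[i])
				break;
		if (j == st->nstacks) {
			if (st->nstacks == AGG_MAX_STACKS)
				continue; /* over the cap: counted in total only */
			st->stacks[j].handle = handles[i];
			st->stacks[j].count = 0;
			st->nstacks++;
		}
		st->stacks[j].count++;
	}
}
```

A printer can then emit one "count/total" line per recorded stack, plus a summary of references that fell over the cap.
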
Dave Airlie fe7dc05e4f lib/ref_tracker: add unlocked leak print helper
JIRA: https://issues.redhat.com/browse/RHEL-24101
Upstream Status: v6.5-rc1

commit 7a113ff6355944283402fb617dc97122f68d5a41
Author:     Andrzej Hajda <andrzej.hajda@intel.com>
AuthorDate: Fri Jun  2 12:21:33 2023 +0200
Commit:     Jakub Kicinski <kuba@kernel.org>
CommitDate: Mon Jun  5 15:28:42 2023 -0700

    To have reliable detection of leaks, the caller must be able to check
    both the tracked counter and the leaks under the same lock. dir.lock is
    the natural candidate for such a lock, and the unlocked print helper can
    be called with this lock taken.
    As a bonus, we can reuse this helper in ref_tracker_dir_exit.

    Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-17 10:46:57 +10:00
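
The locking argument in the commit above follows a common pattern: a helper that expects the lock to be held already, plus a thin wrapper that takes the lock. The sketch below models this in userspace with a toy lock that only records whether it is held. All lk_* names are hypothetical, and a counter stands in for the printed report.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only records its state; a stand-in for a real spinlock. */
struct lk_lock {
	bool held;
};

static void lk_acquire(struct lk_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void lk_release(struct lk_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct lk_dir {
	struct lk_lock lock;
	unsigned int tracked; /* live tracked references */
	unsigned int reports; /* how many leak reports were produced */
};

/* Helper that does no locking itself: the caller must hold dir->lock. */
static void lk_dir_print_locked(struct lk_dir *dir)
{
	assert(dir->lock.held);
	if (dir->tracked)
		dir->reports++;
}

/* Convenience wrapper for callers that do not hold the lock. */
static void lk_dir_print(struct lk_dir *dir)
{
	lk_acquire(&dir->lock);
	lk_dir_print_locked(dir);
	lk_release(&dir->lock);
}

/* Check the counter and report leaks inside one critical section. */
static bool lk_dir_check_leaks(struct lk_dir *dir)
{
	bool leaked;

	lk_acquire(&dir->lock);
	leaked = dir->tracked != 0;
	if (leaked)
		lk_dir_print_locked(dir);
	lk_release(&dir->lock);
	return leaked;
}
```

Because the check and the report share one critical section, the counter cannot change between the two steps.
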
Chris von Recklinghausen d21c52fd60 ref_tracker: remove filter_irq_stacks() call
Bugzilla: https://bugzilla.redhat.com/2120352

commit c2d1e3df4af59261777b39c2e47476acd4d1cbeb
Author: Eric Dumazet <edumazet@google.com>
Date:   Sat Feb 5 09:27:11 2022 -0800

    ref_tracker: remove filter_irq_stacks() call

    After commit e94006608949 ("lib/stackdepot: always do filter_irq_stacks()
    in stack_depot_save()") it became unnecessary to filter the stack
    before calling stack_depot_save().

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Marco Elver <elver@google.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2022-10-12 07:27:44 -04:00
Ivan Vecera 332ff16036 ref_tracker: add a count of untracked references
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

Conflicts:
- context conflict due to missing 2dba5eb1c73b ("lib/stackdepot: allow
  optional init and stack_table allocation by kvmalloc()")

commit 8fd5522f44dcd7f05454ddc4f16d0f821b676cd9
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Feb 4 14:42:36 2022 -0800

    ref_tracker: add a count of untracked references

    We are still chasing a netdev refcount imbalance, and we suspect
    we have one rogue dev_put() that is consuming a reference taken
    from a dev_hold_track().

    To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
    to be called with a NULL @trackerp parameter, and use a dedicated
    refcount_t just for them.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:39:26 +02:00
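
The NULL @trackerp convention described in the commit above can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel code: the ut_* names are hypothetical, and a plain counter with an explicit underflow check replaces the dedicated refcount_t.

```c
#include <assert.h>
#include <stdlib.h>

struct ut_tracker {
	int placeholder; /* a real tracker would record the caller's stack */
};

struct ut_dir {
	unsigned int untracked; /* references taken with a NULL tracker pointer */
	unsigned int tracked;   /* references that own a tracker object */
};

/* Take a reference; a NULL trackerp means "count it, but do not track it". */
static int ut_alloc(struct ut_dir *dir, struct ut_tracker **trackerp)
{
	if (!trackerp) {
		dir->untracked++;
		return 0;
	}
	*trackerp = calloc(1, sizeof(**trackerp));
	if (!*trackerp)
		return -1;
	dir->tracked++;
	return 0;
}

/* Drop a reference; an untracked put with no matching hold is an error. */
static int ut_free(struct ut_dir *dir, struct ut_tracker **trackerp)
{
	if (!trackerp) {
		if (dir->untracked == 0)
			return -1; /* more untracked puts than untracked holds */
		dir->untracked--;
		return 0;
	}
	if (!*trackerp)
		return -1;
	free(*trackerp);
	*trackerp = NULL;
	dir->tracked--;
	return 0;
}
```

Keeping the two counters separate means a rogue untracked put shows up as an underflow of the untracked counter instead of silently consuming a tracked reference.
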
Ivan Vecera 5de2574985 ref_tracker: implement use-after-free detection
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

commit e3ececfe668facd87d920b608349a32607060e66
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Feb 4 14:42:35 2022 -0800

    ref_tracker: implement use-after-free detection

    Whenever ref_tracker_dir_exit() is called, mark the struct ref_tracker_dir
    as dead.

    Test the dead status from ref_tracker_alloc() and ref_tracker_free().

    This should detect buggy dev_put()/dev_hold() happening too late
    in the netdevice dismantle process.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:39:20 +02:00
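
The use-after-free check from the commit above can be modelled with a flag that is set once the directory is torn down and tested on every get and put. In this userspace sketch the dead_* names are hypothetical, and the functions return an error where kernel code would more typically emit a warning.

```c
#include <assert.h>
#include <stdbool.h>

struct dead_dir {
	bool dead;         /* set once the directory has been torn down */
	unsigned int live; /* references currently held */
};

static void dead_dir_init(struct dead_dir *d)
{
	d->dead = false;
	d->live = 0;
}

/* Teardown marks the directory dead so late users can be detected. */
static void dead_dir_exit(struct dead_dir *d)
{
	d->dead = true;
}

/* Taking a reference on a dead directory is reported as an error. */
static int dead_ref_get(struct dead_dir *d)
{
	if (d->dead)
		return -1;
	d->live++;
	return 0;
}

/* Dropping a reference after teardown, or with none held, is an error too. */
static int dead_ref_put(struct dead_dir *d)
{
	if (d->dead || d->live == 0)
		return -1;
	d->live--;
	return 0;
}
```
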
Ivan Vecera 88939bc19e ref_tracker: use __GFP_NOFAIL more carefully
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

commit c12837d1bb31032bead9060dec99ef310d5b9fb7
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Jan 12 03:14:45 2022 -0800

    ref_tracker: use __GFP_NOFAIL more carefully

    syzbot was able to trigger this warning from new_slab()
                    /*
                     * All existing users of the __GFP_NOFAIL are blockable, so warn
                     * of any new users that actually require GFP_NOWAIT
                     */
                    if (WARN_ON_ONCE(!can_direct_reclaim))
                            goto fail;

    Indeed, we should use __GFP_NOFAIL only if direct reclaim is possible.

    Hopefully in the future we will be able to use SLAB_NOFAILSLAB
    option so that syzbot can benefit from full ref_tracker
    even in the presence of memory fault injections.

    WARNING: CPU: 0 PID: 13 at mm/page_alloc.c:5081 __alloc_pages_slowpath.constprop.0+0x1b7b/0x20d0 mm/page_alloc.c:5081 mm/page_alloc.c:5081
    Modules linked in:
    CPU: 0 PID: 13 Comm: ksoftirqd/0 Not tainted 5.16.0-rc5-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__alloc_pages_slowpath.constprop.0+0x1b7b/0x20d0 mm/page_alloc.c:5081 mm/page_alloc.c:5081
    Code: 90 08 00 00 48 81 c7 d8 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 30 00 0f 84 f0 ea ff ff e8 3d 82 09 00 e9 e6 ea ff ff 4d 89 fd <0f> 0b 48 b8 00 00 00 00 00 fc ff df 48 8b 54 24 30 48 c1 ea 03 80
    RSP: 0018:ffffc90000d272b8 EFLAGS: 00010246

    RAX: 0000000000000000 RBX: ffff88813fffc300 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88813fffc348
    RBP: ffff88813fffc300 R08: 00000000000013dc R09: 00000000000013c8
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: ffffc90000d274e8 R14: dffffc0000000000 R15: ffffc90000d274e8
    FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffefe6000f8 CR3: 000000001d21e000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     __alloc_pages+0x412/0x500 mm/page_alloc.c:5382 mm/page_alloc.c:5382
     alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191 mm/mempolicy.c:2191
     alloc_slab_page mm/slub.c:1793 [inline]
     allocate_slab mm/slub.c:1938 [inline]
     alloc_slab_page mm/slub.c:1793 [inline] mm/slub.c:1993
     allocate_slab mm/slub.c:1938 [inline] mm/slub.c:1993
     new_slab+0x349/0x4a0 mm/slub.c:1993 mm/slub.c:1993
     ___slab_alloc+0x918/0xfe0 mm/slub.c:3022 mm/slub.c:3022
     __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109 mm/slub.c:3109
     slab_alloc_node mm/slub.c:3200 [inline]
     slab_alloc mm/slub.c:3242 [inline]
     slab_alloc_node mm/slub.c:3200 [inline] mm/slub.c:3259
     slab_alloc mm/slub.c:3242 [inline] mm/slub.c:3259
     kmem_cache_alloc_trace+0x289/0x2c0 mm/slub.c:3259 mm/slub.c:3259
     kmalloc include/linux/slab.h:590 [inline]
     kzalloc include/linux/slab.h:724 [inline]
     kmalloc include/linux/slab.h:590 [inline] lib/ref_tracker.c:74
     kzalloc include/linux/slab.h:724 [inline] lib/ref_tracker.c:74
     ref_tracker_alloc+0xe1/0x430 lib/ref_tracker.c:74 lib/ref_tracker.c:74
     netdev_tracker_alloc include/linux/netdevice.h:3855 [inline]
     dev_hold_track include/linux/netdevice.h:3872 [inline]
     netdev_tracker_alloc include/linux/netdevice.h:3855 [inline] net/core/dst.c:52
     dev_hold_track include/linux/netdevice.h:3872 [inline] net/core/dst.c:52
     dst_init+0xe0/0x520 net/core/dst.c:52 net/core/dst.c:52
     dst_alloc+0x16b/0x1f0 net/core/dst.c:96 net/core/dst.c:96
     rt_dst_alloc+0x73/0x450 net/ipv4/route.c:1614 net/ipv4/route.c:1614
     ip_route_input_mc net/ipv4/route.c:1720 [inline]
     ip_route_input_mc net/ipv4/route.c:1720 [inline] net/ipv4/route.c:2465
     ip_route_input_rcu.part.0+0x4fe/0xcc0 net/ipv4/route.c:2465 net/ipv4/route.c:2465
     ip_route_input_rcu net/ipv4/route.c:2420 [inline]
     ip_route_input_rcu net/ipv4/route.c:2420 [inline] net/ipv4/route.c:2416
     ip_route_input_noref+0x1b8/0x2a0 net/ipv4/route.c:2416 net/ipv4/route.c:2416
     ip_rcv_finish_core.constprop.0+0x288/0x1e90 net/ipv4/ip_input.c:354 net/ipv4/ip_input.c:354
     ip_rcv_finish+0x135/0x2f0 net/ipv4/ip_input.c:427 net/ipv4/ip_input.c:427
     NF_HOOK include/linux/netfilter.h:307 [inline]
     NF_HOOK include/linux/netfilter.h:301 [inline]
     NF_HOOK include/linux/netfilter.h:307 [inline] net/ipv4/ip_input.c:540
     NF_HOOK include/linux/netfilter.h:301 [inline] net/ipv4/ip_input.c:540
     ip_rcv+0xaa/0xd0 net/ipv4/ip_input.c:540 net/ipv4/ip_input.c:540
     __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5350 net/core/dev.c:5350
     __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5464 net/core/dev.c:5464
     process_backlog+0x2a5/0x6c0 net/core/dev.c:5796 net/core/dev.c:5796
     __napi_poll+0xaf/0x440 net/core/dev.c:6364 net/core/dev.c:6364
     napi_poll net/core/dev.c:6431 [inline]
     napi_poll net/core/dev.c:6431 [inline] net/core/dev.c:6518
     net_rx_action+0x801/0xb40 net/core/dev.c:6518 net/core/dev.c:6518
     __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 kernel/softirq.c:558
     run_ksoftirqd kernel/softirq.c:921 [inline]
     run_ksoftirqd kernel/softirq.c:921 [inline] kernel/softirq.c:913
     run_ksoftirqd+0x2d/0x60 kernel/softirq.c:913 kernel/softirq.c:913
     smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164 kernel/smpboot.c:164
     kthread+0x405/0x4f0 kernel/kthread.c:327 kernel/kthread.c:327
     ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 arch/x86/entry/entry_64.S:295

    Fixes: 4e66934eaadc ("lib: add reference counting tracking infrastructure")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:39:19 +02:00
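
The rule stated in the commit above (request no-fail behaviour only when the caller's flags allow direct reclaim) reduces to a small mask computation. The sketch below uses made-up flag values, GFPX_DIRECT_RECLAIM and GFPX_NOFAIL, in place of the kernel's real GFP bits, so it shows the logic only.

```c
#include <assert.h>

/* Hypothetical flag bits; the kernel's real GFP values differ. */
#define GFPX_DIRECT_RECLAIM 0x1u
#define GFPX_NOFAIL         0x2u

/* Add the no-fail bit only when the caller may block in direct reclaim. */
static unsigned int gfpx_tracker_mask(unsigned int gfp)
{
	unsigned int mask = gfp;

	if (gfp & GFPX_DIRECT_RECLAIM)
		mask |= GFPX_NOFAIL;
	return mask;
}
```

An allocation made from a non-blocking context, such as the softirq path in the trace above, therefore keeps its original flags and is allowed to fail.
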
Ivan Vecera 2d60ac3218 lib: add reference counting tracking infrastructure
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2096377

commit 4e66934eaadc83b27ada8d42b60894018f3bfabf
Author: Eric Dumazet <edumazet@google.com>
Date:   Sat Dec 4 20:21:55 2021 -0800

    lib: add reference counting tracking infrastructure

    It can be hard to track where references are taken and released.

    In networking, we have annoying issues at device or netns dismantles,
    and we had various proposals to ease root causing them.

    This patch adds new infrastructure pairing refcount increases
    and decreases. This will self-document code, because programmers
    will have to associate increments/decrements.

    This is controlled by CONFIG_REF_TRACKER which can be selected
    by users of this feature.

    This adds both cpu and memory costs, and thus should probably be
    used with care.

    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
2022-06-13 18:35:56 +02:00
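
The pairing of refcount increases and decreases described in the commit above can be sketched in userspace as a list of tracker objects: taking a reference adds a tracker, releasing it removes the same tracker, and whatever remains at teardown is a leak. The pair_* names are hypothetical, and a string label stands in for the stack trace a real tracker would record.

```c
#include <assert.h>
#include <stdlib.h>

struct pair_tracker {
	struct pair_tracker *next;
	const char *where; /* label for the acquiring call site */
};

struct pair_dir {
	struct pair_tracker *head; /* all references not yet released */
};

/* Record a new reference and hand the caller its tracker. */
static int pair_alloc(struct pair_dir *dir, struct pair_tracker **trackerp,
		      const char *where)
{
	struct pair_tracker *t = malloc(sizeof(*t));

	*trackerp = t;
	if (!t)
		return -1;
	t->where = where;
	t->next = dir->head;
	dir->head = t;
	return 0;
}

/* Release a reference; fails if the tracker is not live in this dir. */
static int pair_free(struct pair_dir *dir, struct pair_tracker **trackerp)
{
	struct pair_tracker **pp;

	if (!*trackerp)
		return -1; /* already released, or never acquired */
	for (pp = &dir->head; *pp; pp = &(*pp)->next) {
		if (*pp == *trackerp) {
			*pp = (*trackerp)->next;
			free(*trackerp);
			*trackerp = NULL;
			return 0;
		}
	}
	return -1;
}

/* Count the references that were taken but never released. */
static unsigned int pair_leaks(const struct pair_dir *dir)
{
	const struct pair_tracker *t;
	unsigned int n = 0;

	for (t = dir->head; t; t = t->next)
		n++;
	return n;
}
```

Because every release must name the tracker returned by the matching acquire, a double release or a missing release becomes visible at the call site.
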