Centos-kernel-stream-9/kernel/locking
Augusto Caringi 376981b858 Merge: locking/semaphore: Use wake_q to wake up processes outside lock critical section
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6733

JIRA: https://issues.redhat.com/browse/RHEL-87008

commit 85b2b9c16d053364e2004883140538e73b333cdb
Author: Waiman Long <longman@redhat.com>
Date:   Fri, 7 Mar 2025 15:26:52 -0800

    locking/semaphore: Use wake_q to wake up processes outside lock critical section

    A circular lock dependency splat has been seen involving down_trylock():

      ======================================================
      WARNING: possible circular locking dependency detected
      6.12.0-41.el10.s390x+debug
      ------------------------------------------------------
      dd/32479 is trying to acquire lock:
      0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90

      but task is already holding lock:
      000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0

      the existing dependency chain (in reverse order) is:
      -> #4 (&zone->lock){-.-.}-{2:2}:
      -> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
      -> #2 (&rq->__lock){-.-.}-{2:2}:
      -> #1 (&p->pi_lock){-.-.}-{2:2}:
      -> #0 ((console_sem).lock){-.-.}-{2:2}:

    The console_sem -> pi_lock dependency is due to calling try_to_wake_up()
    while holding the console_sem raw_spinlock. This dependency can be broken
    by using wake_q to do the wakeup instead of calling try_to_wake_up()
    under the console_sem lock. This also makes the semaphore's
    raw_spinlock a terminal lock, with no further locks taken
    underneath it.
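
The chain reported in the splat can be pictured as a directed graph of "taken while holding" edges. The following is a minimal userspace sketch, not lockdep's actual implementation: the enum, the `dep` matrix, and the helper names are all hypothetical, and the two middle edges (pi_lock -> rq->__lock and rq->__lock -> hrtimer_bases.lock) are read off the #1 to #4 ordering of the report rather than stated explicitly in the text. It shows that taking console_sem under zone->lock closes a cycle, and that dropping the console_sem -> pi_lock edge removes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Locks named in the splat; all identifiers here are hypothetical. */
enum lock_id {
    CONSOLE_SEM, PI_LOCK, RQ_LOCK, HRTIMER_BASE, ZONE_LOCK, NR_LOCKS
};

/* dep[a][b] is true when lock b has been taken while lock a was held. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Record or clear a "taken while holding" edge. */
void set_dep(int held, int taken, bool present)
{
    assert(held >= 0 && held < NR_LOCKS);
    assert(taken >= 0 && taken < NR_LOCKS);
    dep[held][taken] = present;
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reaches(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    if (seen[from])
        return false;
    seen[from] = true;
    for (int next = 0; next < NR_LOCKS; next++)
        if (dep[from][next] && reaches(next, to, seen))
            return true;
    return false;
}

/*
 * Taking `taken` while holding `held` closes a cycle when `taken`
 * already leads back to `held` through existing edges.
 */
bool closes_cycle(int held, int taken)
{
    bool seen[NR_LOCKS] = { false };

    return reaches(taken, held, seen);
}

/* Edges #1 to #4 as read from the splat (assumed ordering, see above). */
void record_splat_chain(void)
{
    set_dep(CONSOLE_SEM, PI_LOCK, true);     /* try_to_wake_up() under sem->lock */
    set_dep(PI_LOCK, RQ_LOCK, true);
    set_dep(RQ_LOCK, HRTIMER_BASE, true);
    set_dep(HRTIMER_BASE, ZONE_LOCK, true);  /* via debug_objects_fill_pool() */
}
```

With the chain recorded, `closes_cycle(ZONE_LOCK, CONSOLE_SEM)` reports a cycle; after `set_dep(CONSOLE_SEM, PI_LOCK, false)` it no longer does, which is the effect of moving the wakeup out from under the semaphore's lock.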

    The hrtimer_bases.lock is a raw_spinlock while zone->lock is a
    spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via
    the debug_objects_fill_pool() helper function in the debugobjects code.

      -> #4 (&zone->lock){-.-.}-{2:2}:
             __lock_acquire+0xe86/0x1cc0
             lock_acquire.part.0+0x258/0x630
             lock_acquire+0xb8/0xe0
             _raw_spin_lock_irqsave+0xb4/0x120
             rmqueue_bulk+0xac/0x8f0
             __rmqueue_pcplist+0x580/0x830
             rmqueue_pcplist+0xfc/0x470
             rmqueue.isra.0+0xdec/0x11b0
             get_page_from_freelist+0x2ee/0xeb0
             __alloc_pages_noprof+0x2c2/0x520
             alloc_pages_mpol_noprof+0x1fc/0x4d0
             alloc_pages_noprof+0x8c/0xe0
             allocate_slab+0x320/0x460
             ___slab_alloc+0xa58/0x12b0
             __slab_alloc.isra.0+0x42/0x60
             kmem_cache_alloc_noprof+0x304/0x350
             fill_pool+0xf6/0x450
             debug_object_activate+0xfe/0x360
             enqueue_hrtimer+0x34/0x190
             __run_hrtimer+0x3c8/0x4c0
             __hrtimer_run_queues+0x1b2/0x260
             hrtimer_interrupt+0x316/0x760
             do_IRQ+0x9a/0xe0
             do_irq_async+0xf6/0x160

    Normally a raw_spinlock to spinlock dependency is not legitimate
    and triggers a warning if CONFIG_PROVE_RAW_LOCK_NESTING is enabled,
    but debug_objects_fill_pool() is an exception, as it explicitly
    allows this dependency on non-PREEMPT_RT kernels without causing
    a PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency
    is legitimate and not a bug.

    In any case, semaphore is the only locking primitive that still
    uses try_to_wake_up() to do the wakeup inside the critical section;
    all the other locking primitives have been migrated to use wake_q
    to do the wakeup outside of the critical section. It is also possible
    that other circular locking dependencies involving printk/console_sem
    or other existing/new semaphores are lurking somewhere and may show
    up in the future. Let's just do the migration to wake_q now to avoid
    headaches like this.
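
The deferred-wakeup pattern described here can be sketched in a single-threaded userspace toy model. This is not the kernel patch: every `toy_*` name is hypothetical, the "lock" is a plain flag standing in for the raw_spinlock, and a "wakeup" just sets a flag on the task. The point it illustrates is the ordering: under the lock the waiter is only dequeued and queued for wakeup, and the wakeup itself runs after the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy task: records whether it was woken and whether the lock was held then. */
struct toy_task {
    bool woken;
    bool woken_under_lock;
    struct toy_task *wake_next;   /* link used only while queued for wakeup */
};

struct toy_waiter {
    struct toy_task *task;
    bool up;                      /* set when the semaphore is handed over */
    struct toy_waiter *next;
};

struct toy_sem {
    bool locked;                  /* stand-in for the semaphore's raw_spinlock */
    unsigned int count;
    struct toy_waiter *wait_list; /* FIFO of blocked waiters */
};

/* Toy wake queue: singly linked list with a tail pointer. */
struct toy_wake_q {
    struct toy_task *first;
    struct toy_task **lastp;
};

static void toy_lock(struct toy_sem *sem)
{
    assert(!sem->locked);
    sem->locked = true;
}

static void toy_unlock(struct toy_sem *sem)
{
    assert(sem->locked);
    sem->locked = false;
}

/* Queue a task for a later wakeup; safe to call while holding the lock. */
static void toy_wake_q_add(struct toy_wake_q *q, struct toy_task *task)
{
    task->wake_next = NULL;
    *q->lastp = task;
    q->lastp = &task->wake_next;
}

/* Deliver all queued wakeups; meant to run after the lock is released. */
static void toy_wake_up_q(struct toy_wake_q *q, const struct toy_sem *sem)
{
    struct toy_task *t = q->first;

    while (t != NULL) {
        struct toy_task *next = t->wake_next;

        t->wake_next = NULL;
        t->woken = true;
        t->woken_under_lock = sem->locked;
        t = next;
    }
    q->first = NULL;
    q->lastp = &q->first;
}

/* Release the semaphore: hand it to the first waiter, or bump the count. */
void toy_up(struct toy_sem *sem)
{
    struct toy_wake_q q;

    q.first = NULL;
    q.lastp = &q.first;

    toy_lock(sem);
    if (sem->wait_list == NULL) {
        sem->count++;
    } else {
        struct toy_waiter *w = sem->wait_list;

        sem->wait_list = w->next;
        w->next = NULL;
        w->up = true;
        toy_wake_q_add(&q, w->task);  /* only queued here, not woken */
    }
    toy_unlock(sem);

    toy_wake_up_q(&q, sem);           /* wakeup happens outside the lock */
}
```

In this model a waiter queued on `wait_list` ends up with `woken` set and `woken_under_lock` clear after `toy_up()`, so the lock never has to nest over whatever the wakeup path would acquire.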

    Reported-by: syzbot+ed801a886dfdbfe7136d@syzkaller.appspotmail.com
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: https://lore.kernel.org/r/20250307232717.1759087-3-boqun.feng@gmail.com

Signed-off-by: Waiman Long <longman@redhat.com>

Approved-by: Čestmír Kalina <ckalina@redhat.com>
Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Augusto Caringi <acaringi@redhat.com>
2025-06-26 10:58:51 -03:00
Makefile locking/lockdep: Disable KASAN instrumentation of lockdep.c 2025-03-07 23:10:41 -05:00
irqflag-debug.c lockdep: Noinstr annotate warn_bogus_irq_restore() 2021-02-10 14:44:39 +01:00
lock_events.c locking/debug: Fix debugfs API return value checks to use IS_ERR() 2024-05-22 19:52:14 -04:00
lock_events.h locking/qspinlock: Always evaluate lockevent* non-event parameter once 2024-12-16 22:02:24 +01:00
lock_events_list.h locking/rwsem: Remove reader optimistic spinning 2020-12-09 17:08:48 +01:00
lockdep.c locking/lockdep: Add kasan_check_byte() check in lock_acquire() 2025-03-07 23:10:46 -05:00
lockdep_internals.h locking/lockdep: Iterate lock_classes directly when reading lockdep files 2022-05-12 08:34:11 -04:00
lockdep_proc.c locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning 2024-05-22 19:52:14 -04:00
lockdep_states.h
locktorture.c locktorture: Add MODULE_DESCRIPTION() 2024-12-16 22:02:25 +01:00
mcs_spinlock.h locking: Fix typos in comments 2021-03-22 02:45:52 +01:00
mutex-debug.c locking/mutex: Introduce devm_mutex_init() 2024-12-13 11:26:39 -03:00
mutex.c locking/mutex: Document that mutex_unlock() is non-atomic 2024-05-22 19:52:16 -04:00
mutex.h locking/mutex: Move the 'struct mutex_waiter' definition from <linux/mutex.h> to the internal header 2021-09-27 16:19:01 -04:00
osq_lock.c locking/osq_lock: Clarify osq_wait_next() 2024-05-22 19:52:16 -04:00
percpu-rwsem.c locking/percpu-rwsem: Trigger contention tracepoints only if contended 2024-12-16 22:02:24 +01:00
qrwlock.c locking: Add __lockfunc to slow path functions 2023-03-07 15:26:28 -05:00
qspinlock.c locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail() 2024-12-16 22:02:25 +01:00
qspinlock_paravirt.h locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node() 2024-12-18 17:06:50 +01:00
qspinlock_stat.h
rtmutex.c rtmutex: Drop rt_mutex::wait_lock before scheduling 2024-12-18 17:06:50 +01:00
rtmutex_api.c locking/rtmutex: Fix task->pi_waiters integrity 2024-01-15 10:10:43 -05:00
rtmutex_common.h locking/rtmutex: Fix task->pi_waiters integrity 2024-01-15 10:10:43 -05:00
rwbase_rt.c locking/rtmutex: Add a lockdep assert to catch potential nested blocking 2024-03-27 10:06:01 -04:00
rwsem.c locking/rwsem: Add __always_inline annotation to __down_write_common() and inlined callers 2024-12-16 22:02:26 +01:00
semaphore.c locking/semaphore: Use wake_q to wake up processes outside lock critical section 2025-04-11 22:44:16 -04:00
spinlock.c locking/local_lock: Add local nested BH locking infrastructure. 2024-10-08 11:35:35 +02:00
spinlock_debug.c locking/rwlock: Provide RT variant 2021-09-27 16:18:59 -04:00
spinlock_rt.c locking/rtmutex: Add a lockdep assert to catch potential nested blocking 2024-03-27 10:06:01 -04:00
test-ww_mutex.c locking/ww_mutex/test: Make sure we bail out instead of livelock 2024-05-22 19:52:14 -04:00
ww_mutex.h locking/rtmutex: Fix task->pi_waiters integrity 2024-01-15 10:10:43 -05:00
ww_rt_mutex.c locking/rtmutex: Avoid unconditional slowpath for DEBUG_RT_MUTEXES 2024-03-27 10:05:57 -04:00