Commit Graph

142 Commits

Author SHA1 Message Date
Čestmír Kalina 9361f76aa8 rtmutex: Drop rt_mutex::wait_lock before scheduling
JIRA: https://issues.redhat.com/browse/RHEL-60306

commit d33d26036a0274b472299d7dcdaa5fb34329f91b
Author: Roland Xu <mu001999@outlook.com>
Date: Thu, 15 Aug 2024 10:58:13 +0800

    rt_mutex_handle_deadlock() is called with rt_mutex::wait_lock held.  In the
    good case it returns with the lock held and in the deadlock case it emits a
    warning and goes into an endless scheduling loop with the lock held, which
    triggers the 'scheduling in atomic' warning.

    Unlock rt_mutex::wait_lock in the deadlock case before issuing the warning
    and dropping into the schedule-forever loop.

    [ tglx: Moved unlock before the WARN(), removed the pointless comment,
      	massaged changelog, added Fixes tag ]

    Fixes: 3d5c9340d1 ("rtmutex: Handle deadlock detection smarter")
    Signed-off-by: Roland Xu <mu001999@outlook.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/all/ME0P300MB063599BEF0743B8FA339C2CECC802@ME0P300MB0635.AUSP300.PROD.OUTLOOK.COM

Signed-off-by: Čestmír Kalina <ckalina@redhat.com>
2024-12-18 17:06:50 +01:00
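
For illustration, a condensed sketch of the fixed helper described above, based on the changelog rather than the verbatim upstream diff: the wait_lock is dropped before the warning and the schedule-forever loop, so the loop no longer runs with the raw spinlock held.

    static void __sched rt_mutex_handle_deadlock(int res, int detect_deadlock,
                                                 struct rt_mutex_base *lock,
                                                 struct rt_mutex_waiter *w)
    {
            /* Nothing to do unless deadlock detection reported -EDEADLOCK */
            if (res != -EDEADLOCK || detect_deadlock)
                    return;

            /* Drop wait_lock first: the loop below never returns */
            raw_spin_unlock_irq(&lock->wait_lock);

            WARN(1, "rtmutex deadlock detected\n");

            while (1) {
                    set_current_state(TASK_INTERRUPTIBLE);
                    schedule();
            }
    }
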
Čestmír Kalina 4d2e958bff locking/rtmutex: Use try_cmpxchg_relaxed() in mark_rt_mutex_waiters()
JIRA: https://issues.redhat.com/browse/RHEL-60306

commit ce3576ebd62d99f79c1dc98824e2ef6d6ab68434
Author: Uros Bizjak <ubizjak@gmail.com>
Date: Wed, 24 Jan 2024 11:49:53 +0100

    Use try_cmpxchg() instead of cmpxchg(*ptr, old, new) == old.

    The x86 CMPXCHG instruction returns success in the ZF flag, so this change
    saves a compare after CMPXCHG (and related move instruction in front of CMPXCHG).

    Also, try_cmpxchg() implicitly assigns the old *ptr value to "old" when CMPXCHG
    fails. There is no need to re-read the value in the loop.

    Note that the value from *ptr should be read using READ_ONCE() to prevent
    the compiler from merging, refetching or reordering the read.

    No functional change intended.

    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Waiman Long <longman@redhat.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Link: https://lore.kernel.org/r/20240124104953.612063-1-ubizjak@gmail.com

Signed-off-by: Čestmír Kalina <ckalina@redhat.com>
2024-12-16 22:02:24 +01:00
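
For readers unfamiliar with the try_cmpxchg() idiom, a minimal standalone userspace analogue of the mark_rt_mutex_waiters() loop, using GCC's __atomic builtins rather than the kernel primitives: on failure the builtin writes the current value back into the expected-value variable, so the loop needs no separate re-read.

    #include <stdbool.h>
    #include <stdio.h>

    static unsigned long lock_word;

    /* Set the "has waiters" bit, mirroring the mark_rt_mutex_waiters() pattern */
    static void mark_waiters(void)
    {
            unsigned long old = __atomic_load_n(&lock_word, __ATOMIC_RELAXED);

            /* On failure the builtin refreshes 'old' with the current value */
            do {
            } while (!__atomic_compare_exchange_n(&lock_word, &old, old | 1UL,
                                                  true /* weak */,
                                                  __ATOMIC_RELAXED, __ATOMIC_RELAXED));
    }

    int main(void)
    {
            lock_word = 0xf0;
            mark_waiters();
            printf("lock_word = %#lx\n", lock_word);        /* prints 0xf1 */
            return 0;
    }
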
Waiman Long ca8db1144f locking/rtmutex: Add a lockdep assert to catch potential nested blocking
JIRA: https://issues.redhat.com/browse/RHEL-28616

commit 45f67f30a22f264bc7a0a61255c2ee1a838e9403
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri, 8 Sep 2023 18:22:53 +0200

    locking/rtmutex: Add a lockdep assert to catch potential nested blocking

    There used to be a BUG_ON(current->pi_blocked_on) in the lock acquisition
    functions, but that vanished in one of the rtmutex overhauls.

    Bring it back in form of a lockdep assert to catch code paths which take
    rtmutex based locks with current::pi_blocked_on != NULL.

    Reported-by: Crystal Wood <swood@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20230908162254.999499-7-bigeasy@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2024-03-27 10:06:01 -04:00
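
The assert itself is a one-liner in the lock acquisition fast path; roughly (see upstream 45f67f30a22f for the exact placement):

    static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
                                               unsigned int state)
    {
            /* Blocking on a second rtmutex while already blocked on one is a bug */
            lockdep_assert(!current->pi_blocked_on);

            if (likely(rt_mutex_try_acquire(lock)))
                    return 0;

            return rt_mutex_slowlock(lock, NULL, state);
    }
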
Waiman Long f62c68f20c locking/rtmutex: Use rt_mutex specific scheduler helpers
JIRA: https://issues.redhat.com/browse/RHEL-28616

commit d14f9e930b9073de264c106bf04968286ef9b3a4
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Fri, 8 Sep 2023 18:22:52 +0200

    locking/rtmutex: Use rt_mutex specific scheduler helpers

    Have rt_mutex use the rt_mutex specific scheduler helpers to avoid
    recursion vs rtlock on the PI state.

    [[ peterz: adapted to new names ]]

    Reported-by: Crystal Wood <swood@redhat.com>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20230908162254.999499-6-bigeasy@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2024-03-27 10:05:58 -04:00
Waiman Long c6a557ade6 locking/rtmutex: Avoid unconditional slowpath for DEBUG_RT_MUTEXES
JIRA: https://issues.redhat.com/browse/RHEL-28616

commit af9f006393b53409be0ca83ae234bef840cdef4a
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Fri, 8 Sep 2023 18:22:49 +0200

    locking/rtmutex: Avoid unconditional slowpath for DEBUG_RT_MUTEXES

    With DEBUG_RT_MUTEXES enabled the fast-path rt_mutex_cmpxchg_acquire()
    always fails and all lock operations take the slow path.

    Provide a new helper inline rt_mutex_try_acquire() which maps to
    rt_mutex_cmpxchg_acquire() in the non-debug case. For the debug case
    it invokes rt_mutex_slowtrylock() which can acquire a non-contended
    rtmutex under full debug coverage.

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20230908162254.999499-3-bigeasy@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2024-03-27 10:05:57 -04:00
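
A sketch of the helper's shape as described above (see upstream af9f006393b5 for the exact code): debug builds go through the slow trylock so the cmpxchg fast path is not silently disabled.

    #ifdef CONFIG_DEBUG_RT_MUTEXES
    /*
     * Under DEBUG_RT_MUTEXES the cmpxchg fast path always fails, so use the
     * slow trylock, which can still take a non-contended lock with full
     * debug checks applied.
     */
    static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
    {
            return rt_mutex_slowtrylock(lock);
    }
    #else
    static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
    {
            return rt_mutex_cmpxchg_acquire(lock, NULL, current);
    }
    #endif
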
Waiman Long 0badc86620 Revert "locking/rtmutex: Submit/resume work explicitly before/after blocking"
JIRA: https://issues.redhat.com/browse/RHEL-28616
Upstream Status: RHEL only

Revert linux-rt-devel specific commit a44b38b17bf3 ("locking/rtmutex:
Submit/resume work explicitly before/after blocking") to prepare for
the submission of upstream equivalent.

Signed-off-by: Waiman Long <longman@redhat.com>
2024-03-27 09:56:35 -04:00
Waiman Long c07eb0516e Revert "locking/rtmutex: Avoid pointless blk_flush_plug() invocations"
JIRA: https://issues.redhat.com/browse/RHEL-28616
Upstream Status: RHEL only

Revert linux-rt-devel specific commit 96c0a06e80cb ("locking/rtmutex:
Avoid pointless blk_flush_plug() invocations") to prepare for the
submission of upstream equivalent.

Signed-off-by: Waiman Long <longman@redhat.com>
2024-03-27 09:56:35 -04:00
Waiman Long 3ad42081d5 Revert "locking/rtmutex: Add a lockdep assert to catch potential nested blocking"
JIRA: https://issues.redhat.com/browse/RHEL-28616
Upstream Status: RHEL only

Revert linux-rt-devel specific commit e2d27efe1923 ("locking/rtmutex:
Add a lockdep assert to catch potential nested blocking") to prepare
for the submission of upstream equivalent.

Signed-off-by: Waiman Long <longman@redhat.com>
2024-03-27 09:56:34 -04:00
Joel Savitz baca3f37f7 locking/rtmutex: Fix task->pi_waiters integrity
JIRA: https://issues.redhat.com/browse/RHEL-5226

commit f7853c34241807bb97673a5e97719123be39a09e
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Jul 7 16:19:09 2023 +0200

    locking/rtmutex: Fix task->pi_waiters integrity

    Henry reported that rt_mutex_adjust_prio_check() has an ordering
    problem and puts the lie to the comment in [7]. Sharing the sort key
    between lock->waiters and owner->pi_waiters *does* create problems,
    since unlike what the comment claims, holding [L] is insufficient.

    Notably, consider:

            A
          /   \
         M1   M2
         |     |
         B     C

    That is, task A owns both M1 and M2, B and C block on them. In this
    case a concurrent chain walk (B & C) will modify their resp. sort keys
    in [7] while holding M1->wait_lock and M2->wait_lock. So holding [L]
    is meaningless, they're different Ls.

    This then gives rise to a race condition between [7] and [11], where
    the requeue of pi_waiters will observe an inconsistent tree order.

            B                               C

      (holds M1->wait_lock,         (holds M2->wait_lock,
       holds B->pi_lock)             holds A->pi_lock)

      [7]
      waiter_update_prio();
      ...
      [8]
      raw_spin_unlock(B->pi_lock);
      ...
      [10]
      raw_spin_lock(A->pi_lock);

                                    [11]
                                    rt_mutex_enqueue_pi();
                                    // observes inconsistent A->pi_waiters
                                    // tree order

    Fixing this means either extending the range of the owner lock from
    [10-13] to [6-13], with the immediate problem that this means [6-8]
    hold both blocked and owner locks, or duplicating the sort key.

    Since the locking in chain walk is horrible enough without having to
    consider pi_lock nesting rules, duplicate the sort key instead.

    By giving each tree their own sort key, the above race becomes
    harmless, if C sees B at the old location, then B will correct things
    (if they need correcting) when it walks up the chain and reaches A.

    Fixes: fb00aca474 ("rtmutex: Turn the plist into an rb-tree")
    Reported-by: Henry Wu <triangletrap12@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Henry Wu <triangletrap12@gmail.com>
    Link: https://lkml.kernel.org/r/20230707161052.GF2883469%40hirez.programming.kicks-ass.net

Signed-off-by: Joel Savitz <jsavitz@redhat.com>
2024-01-15 10:10:43 -05:00
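
Structurally, "duplicate the sort key" means each rb-tree node in the waiter carries its own copy of the key; a sketch of the resulting layout (see upstream f7853c342418 for the real definitions, the field list here is abbreviated):

    /* One rb-tree node per tree, each with its own copy of the sort key */
    struct rt_waiter_node {
            struct rb_node  entry;
            int             prio;
            u64             deadline;
    };

    struct rt_mutex_waiter {
            struct rt_waiter_node   tree;       /* lock->waiters, under lock->wait_lock */
            struct rt_waiter_node   pi_tree;    /* owner->pi_waiters, under owner->pi_lock */
            struct task_struct      *task;
            struct rt_mutex_base    *lock;
            /* ... */
    };
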
Crystal Wood 31f7062808 locking/rtmutex: Add a lockdep assert to catch potential nested blocking
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2218724

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit e2d27efe19234c94a42e123dc8122c4f13c9a9ab
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Apr 27 13:19:37 2023 +0200

    locking/rtmutex: Add a lockdep assert to catch potential nested blocking

    There used to be a BUG_ON(current->pi_blocked_on) in the lock acquisition
    functions, but that vanished in one of the rtmutex overhauls.

    Bring it back in form of a lockdep assert to catch code paths which take
    rtmutex based locks with current::pi_blocked_on != NULL.

    Reported-by: Crystal Wood <swood@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/20230427111937.2745231-5-bigeasy@linutronix.de

Signed-off-by: Crystal Wood <swood@redhat.com>
2023-07-18 17:22:36 -05:00
Crystal Wood fbe16f5d83 locking/rtmutex: Avoid pointless blk_flush_plug() invocations
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2218724

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit 96c0a06e80cb53788a282e087773b2cfa5525545
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Thu Apr 27 13:19:36 2023 +0200

    locking/rtmutex: Avoid pointless blk_flush_plug() invocations

    With DEBUG_RT_MUTEXES enabled the fast-path rt_mutex_cmpxchg_acquire()
    always fails and all lock operations take the slow path, which leads to the
    invocation of blk_flush_plug() even if the lock is not contended, which is
    unnecessary and prevents batch processing of requests.

    Provide a new helper inline rt_mutex_try_acquire() which maps to
    rt_mutex_cmpxchg_acquire() in the non-debug case. For the debug case it
    invokes rt_mutex_slowtrylock() which can acquire a non-contended rtmutex
    under full debug coverage.

    Replace the rt_mutex_cmpxchg_acquire() invocations in __rt_mutex_lock() and
    __ww_rt_mutex_lock() with the new helper function, which avoids the
    blk_flush_plug() invocation for the non-contended case and preserves the
    debug mechanism.

    [ tglx: Created a new helper and massaged changelog ]

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/20230427111937.2745231-4-bigeasy@linutronix.de

Signed-off-by: Crystal Wood <swood@redhat.com>
2023-07-18 17:22:36 -05:00
Crystal Wood 2ef9c3d906 locking/rtmutex: Submit/resume work explicitly before/after blocking
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2218724

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit a44b38b17bf31d90509125a8d34c9ac8f0dcc886
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Thu Apr 27 13:19:35 2023 +0200

    locking/rtmutex: Submit/resume work explicitly before/after blocking

    schedule() invokes sched_submit_work() before scheduling and
    sched_resume_work() afterwards to ensure that queued block requests are
    flushed and the (IO)worker machineries can instantiate new workers if
    required. This avoids deadlocks and starvation.

    With rt_mutexes this can lead to a subtle problem:

      When a task blocks on an rtmutex, current::pi_blocked_on points to the
      rtmutex it blocks on. When one of the functions in sched_submit/resume_work()
      then contends on an rtmutex based lock, that would corrupt current::pi_blocked_on.

    Let rtmutex and the RT lock variants which are based on it invoke
    sched_submit/resume_work() explicitly before and after the slowpath so
    it's guaranteed that current::pi_blocked_on cannot be corrupted by blocking
    on two locks.

    This does not apply to the PREEMPT_RT variants of spinlock_t and rwlock_t
    as their scheduling slowpath is separate and cannot invoke the work related
    functions due to potential deadlocks anyway.

    [ tglx: Make it explicit and symmetric. Massage changelog ]

    Fixes: e17ba59b7e8e1 ("locking/rtmutex: Guard regular sleeping locks specific functions")
    Reported-by: Crystal Wood <swood@redhat.com>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/4b4ab374d3e24e6ea8df5cadc4297619a6d945af.camel@redhat.com
    Link: https://lore.kernel.org/r/20230427111937.2745231-3-bigeasy@linutronix.de

Signed-off-by: Crystal Wood <swood@redhat.com>
2023-07-18 17:22:36 -05:00
Joel Savitz 8f01288457 rtmutex: Ensure that the top waiter is always woken up
commit db370a8b9f67ae5f17e3d5482493294467784504
Author: Wander Lairson Costa <wander@redhat.com>
Date:   Thu Feb 2 09:30:20 2023 -0300

    rtmutex: Ensure that the top waiter is always woken up

    Let L1 and L2 be two spinlocks.

    Let T1 be a task holding L1 and blocked on L2. T1, currently, is the top
    waiter of L2.

    Let T2 be the task holding L2.

    Let T3 be a task trying to acquire L1.

    The following events will lead to a state in which the wait queue of L2
    isn't empty, but no task actually holds the lock.

    T1                T2                                  T3
    ==                ==                                  ==

                                                          spin_lock(L1)
                                                          | raw_spin_lock(L1->wait_lock)
                                                          | rtlock_slowlock_locked(L1)
                                                          | | task_blocks_on_rt_mutex(L1, T3)
                                                          | | | orig_waiter->lock = L1
                                                          | | | orig_waiter->task = T3
                                                          | | | raw_spin_unlock(L1->wait_lock)
                                                          | | | rt_mutex_adjust_prio_chain(T1, L1, L2, orig_waiter, T3)
                      spin_unlock(L2)                     | | | |
                      | rt_mutex_slowunlock(L2)           | | | |
                      | | raw_spin_lock(L2->wait_lock)    | | | |
                      | | wakeup(T1)                      | | | |
                      | | raw_spin_unlock(L2->wait_lock)  | | | |
                                                          | | | | waiter = T1->pi_blocked_on
                                                          | | | | waiter == rt_mutex_top_waiter(L2)
                                                          | | | | waiter->task == T1
                                                          | | | | raw_spin_lock(L2->wait_lock)
                                                          | | | | dequeue(L2, waiter)
                                                          | | | | update_prio(waiter, T1)
                                                          | | | | enqueue(L2, waiter)
                                                          | | | | waiter != rt_mutex_top_waiter(L2)
                                                          | | | | L2->owner == NULL
                                                          | | | | wakeup(T1)
                                                          | | | | raw_spin_unlock(L2->wait_lock)
    T1 wakes up
    T1 != top_waiter(L2)
    schedule_rtlock()

    If the deadline of T1 is updated before the call to update_prio(), and the
    new deadline is greater than the deadline of the second top waiter, then
    after the requeue, T1 is no longer the top waiter, and the wrong task is
    woken up which will then go back to sleep because it is not the top waiter.

    This can be reproduced in PREEMPT_RT with stress-ng:

    while true; do
        stress-ng --sched deadline --sched-period 1000000000 \
                --sched-runtime 800000000 --sched-deadline \
                1000000000 --mmapfork 23 -t 20
    done

    A similar issue was pointed out by Thomas versus the cases where the top
    waiter drops out early due to a signal or timeout, which is a general issue
    for all regular rtmutex use cases, e.g. futex.

    The problematic code is in rt_mutex_adjust_prio_chain():

            // Save the top waiter before dequeue/enqueue
            prerequeue_top_waiter = rt_mutex_top_waiter(lock);

            rt_mutex_dequeue(lock, waiter);
            waiter_update_prio(waiter, task);
            rt_mutex_enqueue(lock, waiter);

            // Lock has no owner?
            if (!rt_mutex_owner(lock)) {
                    // Top waiter changed
      ---->         if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
      ---->                 wake_up_state(waiter->task, waiter->wake_state);

    This only takes into account the case where @waiter is the new top waiter
    due to the requeue operation.

    But it fails to handle the case where @waiter is no longer the top
    waiter due to the requeue operation.

    Ensure that the new top waiter is woken up in all cases so it can take
    over the ownerless lock.

    [ tglx: Amend changelog, add Fixes tag ]

    Fixes: c014ef69b3ac ("locking/rtmutex: Add wake_state to rt_mutex_waiter")
    Signed-off-by: Wander Lairson Costa <wander@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230117172649.52465-1-wander@redhat.com
    Link: https://lore.kernel.org/r/20230202123020.14844-1-wander@redhat.com

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2176147
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
2023-03-07 15:26:28 -05:00
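
The direction of the fix, sketched against the snippet quoted in the changelog above (see upstream db370a8b9f67 for the exact diff): when the lock has no owner, wake whoever is the top waiter after the requeue, not only the requeued @waiter.

            /* Lock has no owner: the post-requeue top waiter must take over */
            if (!rt_mutex_owner(lock)) {
                    struct rt_mutex_waiter *top = rt_mutex_top_waiter(lock);

                    /* Top waiter changed by the requeue above? Wake the new one. */
                    if (top != prerequeue_top_waiter)
                            wake_up_state(top->task, top->wake_state);

                    raw_spin_unlock_irq(&lock->wait_lock);
                    return 0;
            }
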
Joel Savitz 2d216f7bd8 locking: Apply contention tracepoints in the slow path
conflict in kernel/locking/rtmutex.c
	detail: c9s commit c3a495f437 ("rtmutex: Add acquire semantics for rtmutex lock acquisition slow path"), backport of upstream commit 1c0908d8e441, adds a second parameter to fixup_rt_mutex_waiters(), which is not present in upstream commit ee042be16cb4.
	action: keep new call to fixup_rt_mutex_waiters()

commit ee042be16cb455116d0fe99b77c6bc8baf87c8c6
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Tue Mar 22 11:57:09 2022 -0700

    locking: Apply contention tracepoints in the slow path

    Adding the lock contention tracepoints in various lock function slow
    paths.  Note that each arch can define spinlock differently, I only
    added it only to the generic qspinlock for now.

    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    Link: https://lkml.kernel.org/r/20220322185709.141236-3-namhyung@kernel.org

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2176147
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
2023-03-07 15:26:28 -05:00
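
The pattern applied by this patch is simply to bracket each blocking slow path with the two tracepoints; a hypothetical wrapper only to show where they sit (trace_contention_begin/end and the LCB_F_RT flag are from the upstream series, the wrapper function itself is illustrative):

    /* Illustrative wrapper, not a function that exists in the kernel */
    static int __sched slowlock_with_tracepoints(struct rt_mutex_base *lock,
                                                 unsigned int state)
    {
            int ret;

            trace_contention_begin(lock, LCB_F_RT);    /* about to block */
            ret = rt_mutex_slowlock(lock, NULL, state);
            trace_contention_end(lock, ret);           /* woken up, report result */

            return ret;
    }
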
Brian Masney c3a495f437 rtmutex: Add acquire semantics for rtmutex lock acquisition slow path
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1c0908d8e441631f5b8ba433523cf39339ee2ba0
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2163507
Conflicts: Corrected minor context diff

commit 1c0908d8e441631f5b8ba433523cf39339ee2ba0
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Dec 2 10:02:23 2022 +0000

    rtmutex: Add acquire semantics for rtmutex lock acquisition slow path

    Jan Kara reported the following bug triggering on 6.0.5-rt14 running dbench
    on XFS on arm64.

     kernel BUG at fs/inode.c:625!
     Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
     CPU: 11 PID: 6611 Comm: dbench Tainted: G            E   6.0.0-rt14-rt+ #1
     pc : clear_inode+0xa0/0xc0
     lr : clear_inode+0x38/0xc0
     Call trace:
      clear_inode+0xa0/0xc0
      evict+0x160/0x180
      iput+0x154/0x240
      do_unlinkat+0x184/0x300
      __arm64_sys_unlinkat+0x48/0xc0
      el0_svc_common.constprop.4+0xe4/0x2c0
      do_el0_svc+0xac/0x100
      el0_svc+0x78/0x200
      el0t_64_sync_handler+0x9c/0xc0
      el0t_64_sync+0x19c/0x1a0

    It also affects 6.1-rc7-rt5 and affects a preempt-rt fork of 5.14 so this
    is likely a bug that existed forever and only became visible when ARM
    support was added to preempt-rt. The same problem does not occur on x86-64
    and he also reported that converting sb->s_inode_wblist_lock to
    raw_spinlock_t makes the problem disappear indicating that the RT spinlock
    variant is the problem.

    Which in turn means that RT mutexes on ARM64 and any other weakly ordered
    architecture are affected by this independent of RT.

    Will Deacon observed:

      "I'd be more inclined to be suspicious of the slowpath tbh, as we need to
       make sure that we have acquire semantics on all paths where the lock can
       be taken. Looking at the rtmutex code, this really isn't obvious to me
       -- for example, try_to_take_rt_mutex() appears to be able to return via
       the 'takeit' label without acquire semantics and it looks like we might
       be relying on the caller's subsequent _unlock_ of the wait_lock for
       ordering, but that will give us release semantics which aren't correct."

    Sebastian Andrzej Siewior prototyped a fix that does work based on that
    comment but it was a little bit overkill and added some fences that should
    not be necessary.

    The lock owner is updated with an IRQ-safe raw spinlock held, but the
    spin_unlock does not provide acquire semantics which are needed when
    acquiring a mutex.

    Adds the necessary acquire semantics for lock owner updates in the slow path
    acquisition and the waiter bit logic.

    It successfully completed 10 iterations of the dbench workload while the
    vanilla kernel fails on the first iteration.

    [ bigeasy@linutronix.de: Initial prototype fix ]

    Fixes: 700318d1d7 ("locking/rtmutex: Use acquire/release semantics")
    Fixes: 23f78d4a03 ("[PATCH] pi-futex: rt mutex core")
    Reported-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20221202100223.6mevpbl7i6x5udfd@techsingularity.net

Signed-off-by: Brian Masney <bmasney@redhat.com>
2023-01-23 13:03:46 -05:00
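
A standalone userspace illustration of why acquire ordering matters when taking a lock, using GCC __atomic builtins rather than the kernel helpers: the CAS that installs the owner must have acquire semantics so that accesses inside the critical section cannot be reordered before the acquisition on weakly ordered CPUs such as arm64.

    #include <stdbool.h>
    #include <stddef.h>

    struct fake_lock {
            void *owner;            /* NULL when unlocked */
    };

    static int shared_data;

    static bool try_take(struct fake_lock *l, void *me)
    {
            void *expected = NULL;

            /*
             * Acquire on success: accesses in the critical section cannot be
             * reordered before the CAS that observed owner == NULL. Using a
             * relaxed CAS here is the class of bug fixed by the commit above.
             */
            return __atomic_compare_exchange_n(&l->owner, &expected, me,
                                               false, __ATOMIC_ACQUIRE,
                                               __ATOMIC_RELAXED);
    }

    static void release(struct fake_lock *l)
    {
            /* Release: publish the critical section's writes to the next owner */
            __atomic_store_n(&l->owner, NULL, __ATOMIC_RELEASE);
    }

    int main(void)
    {
            struct fake_lock l = { .owner = NULL };
            int me;

            if (try_take(&l, &me)) {
                    shared_data++;          /* critical section */
                    release(&l);
            }
            return shared_data ? 0 : 1;
    }
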
Waiman Long 1f0d97425a locking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713
Conflicts: Upstream merge conflict, use resolution listed in merge
	   commit f16cc980d649 ("Merge branch 'locking/urgent' into
	   locking/core").

commit 8f556a326c93213927e683fc32bbf5be1b62540a
Author: Zqiang <qiang1.zhang@intel.com>
Date:   Fri, 17 Dec 2021 15:42:07 +0800

    locking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner()

    Optimistic spinning needs to be terminated when the spinning waiter is no
    longer the top waiter on the lock, but the condition is negated. It
    terminates if the waiter is the top waiter, which is defeating the whole
    purpose.

    Fixes: c3123c431447 ("locking/rtmutex: Dont dereference waiter lockless")
    Signed-off-by: Zqiang <qiang1.zhang@intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20211217074207.77425-1-qiang1.zhang@intel.com

Signed-off-by: Waiman Long <longman@redhat.com>
2022-05-12 08:34:04 -04:00
Waiman Long cf476291f3 locking: Make owner_on_cpu() into <linux/sched.h>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713

commit c0bed69daf4b67809b58cc7cd81a8fa4f45bc161
Author: Kefeng Wang <wangkefeng.wang@huawei.com>
Date:   Fri, 3 Dec 2021 15:59:34 +0800

    locking: Make owner_on_cpu() into <linux/sched.h>

    Move owner_on_cpu() from kernel/locking/rwsem.c into
    include/linux/sched.h under CONFIG_SMP, then use it
    in mutex/rwsem/rtmutex to simplify the code.

    Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20211203075935.136808-2-wangkefeng.wang@huawei.com

Signed-off-by: Waiman Long <longman@redhat.com>
2022-05-12 08:34:03 -04:00
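
The helper being consolidated is small; roughly (see upstream c0bed69daf4b for the exact definition under CONFIG_SMP):

    #ifdef CONFIG_SMP
    static inline bool owner_on_cpu(struct task_struct *owner)
    {
            /* Skip spinning if the owner is not running, or its (v)CPU is preempted */
            return READ_ONCE(owner->on_cpu) && !vcpu_is_preempted(task_cpu(owner));
    }
    #endif
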
Waiman Long b16109588c locking/rtmutex: Squash self-deadlock check for ww_rt_mutex.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713

commit 02ea9fc96fe976e7f7e067f38b12202f126e3f2f
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Mon, 29 Nov 2021 18:46:46 +0100

    locking/rtmutex: Squash self-deadlock check for ww_rt_mutex.

    Similar to the issues in commits:

      6467822b8cc9 ("locking/rtmutex: Prevent spurious EDEADLK return caused by ww_mutexes")
      a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")

    ww_rt_mutex_lock() should not return EDEADLK without first going through
    the __ww_mutex logic to set the required state. In fact, the chain-walk
    can deal with the spurious cycles (per the above commits) this check
    warns about and is trying to avoid.

    Therefore ignore this test for ww_rt_mutex and simply let things fall
    in place.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20211129174654.668506-4-bigeasy@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2022-05-12 08:33:53 -04:00
Waiman Long 65d9183f94 rtmutex: Wake up the waiters lockless while dropping the read lock.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713

commit 9321f8152d9a764208c3f0dad49e0c55f293b7ab
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue, 28 Sep 2021 17:00:06 +0200

    rtmutex: Wake up the waiters lockless while dropping the read lock.

    The rw_semaphore and rwlock_t implementation both wake the waiter while
    holding the rt_mutex_base::wait_lock acquired.
    This can be optimized by waking the waiter lockless outside of the
    locked section to avoid a needless contention on the
    rt_mutex_base::wait_lock lock.

    Extend rt_mutex_wake_q_add() to also accept task and state and use it in
    __rwbase_read_unlock().

    Suggested-by: Davidlohr Bueso <dave@stgolabs.net>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20210928150006.597310-3-bigeasy@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2022-05-12 08:32:18 -04:00
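
The resulting unlock pattern, sketched with the helpers named in this series (see upstream 9321f8152d9a for the exact code): record the wakeup while holding wait_lock, but issue it only after the lock has been dropped.

    static void __sched __rwbase_read_unlock(struct rwbase_rt *rwb, unsigned int state)
    {
            struct rt_mutex_base *rtm = &rwb->rtmutex;
            struct task_struct *owner;
            DEFINE_RT_WAKE_Q(wqh);

            raw_spin_lock_irq(&rtm->wait_lock);
            owner = rt_mutex_owner(rtm);
            /* Only queue the wakeup under wait_lock ... */
            if (owner)
                    rt_mutex_wake_q_add_task(&wqh, owner, state);
            raw_spin_unlock_irq(&rtm->wait_lock);

            /* ... and perform it lockless, after dropping wait_lock */
            rt_mutex_wake_up_q(&wqh);
    }
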
Waiman Long c5f0f13946 rtmutex: Check explicit for TASK_RTLOCK_WAIT.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713

commit 8fe46535e10dbfebad68ad9f2f8260e49f5852c9
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Tue, 28 Sep 2021 17:00:05 +0200

    rtmutex: Check explicit for TASK_RTLOCK_WAIT.

    rt_mutex_wake_q_add() needs to distinguish between sleeping
    locks (TASK_RTLOCK_WAIT) and normal locks, which use TASK_NORMAL, in order
    to use the proper wake mechanism.

    Instead of checking for != TASK_NORMAL, make it more robust and check
    explicitly for TASK_RTLOCK_WAIT, which is the reason why a different wake
    mechanism is used.

    No functional change.

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20210928150006.597310-2-bigeasy@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2022-05-12 08:32:18 -04:00
Waiman Long e21ed8b6d0 locking/rtmutex: Fix ww_mutex deadlock check
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit e5480572706da1b2c2dc2c6484eab64f92b9263b
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Wed, 1 Sep 2021 11:44:11 +0200

    locking/rtmutex: Fix ww_mutex deadlock check

    Dan reported that rt_mutex_adjust_prio_chain() can be called with
    .orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex: Return
    success on deadlock for ww_mutex waiters") unconditionally dereferences it.

    Since both call-sites that have .orig_waiter == NULL don't care for the
    return value, simply disable the deadlock squash by adding the NULL check.

    Notably, both callers use the deadlock condition as a termination condition
    for the iteration; once detected, it is sure that (de)boosting is done.
    Arguably step [3] would be a more natural termination point, but it's
    dubious whether adding a third deadlock detection state would improve the
    code.

    Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lore.kernel.org/r/YS9La56fHMiCCo75@hirez.programming.kicks-ass.net

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:30 -04:00
Waiman Long beb2236d5b locking/rtmutex: Return success on deadlock for ww_mutex waiters
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit a055fcc132d4c25b96d1115aea514258810dc6fc
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Thu, 26 Aug 2021 10:48:18 +0200

    locking/rtmutex: Return success on deadlock for ww_mutex waiters

    ww_mutexes can legitimately cause a deadlock situation in the lock graph
    which is resolved afterwards by the wait/wound mechanics. The rtmutex chain
    walk can detect such a deadlock and returns EDEADLK which in turn skips the
    wait/wound mechanism and returns EDEADLK to the caller. That's wrong
    because both lock chains might get EDEADLK or the wrong waiter would back
    out.

    Detect that situation and return 'success' in case that the waiter which
    initiated the chain walk is a ww_mutex with context. This allows the
    wait/wound mechanics to resolve the situation according to the rules.

    [ tglx: Split it apart and added changelog ]

    Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
    Fixes: add461325ec5 ("locking/rtmutex: Extend the rtmutex core to support ww_mutex")
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/YSeWjCHoK4v5OcOt@hirez.programming.kicks-ass.net

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:29 -04:00
Waiman Long dd328dbe46 locking/rtmutex: Prevent spurious EDEADLK return caused by ww_mutexes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 6467822b8cc96e5feda98c7bf5c6329c6a896c91
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Thu, 26 Aug 2021 09:36:53 +0200

    locking/rtmutex: Prevent spurious EDEADLK return caused by ww_mutexes

    rtmutex based ww_mutexes can legitimately create a cycle in the lock graph
    which can be observed by a blocker which didn't cause the problem:

       P1: A, ww_A, ww_B
       P2: ww_B, ww_A
       P3: A

    P3 might therefore be trapped in the ww_mutex induced cycle and run into
    the lock depth limitation of rt_mutex_adjust_prio_chain() which returns
    -EDEADLK to the caller.

    Disable the deadlock detection walk when the chain walk observes a
    ww_mutex to prevent this looping.

    [ tglx: Split it apart and added changelog ]

    Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
    Fixes: add461325ec5 ("locking/rtmutex: Extend the rtmutex core to support ww_mutex")
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/YSeWjCHoK4v5OcOt@hirez.programming.kicks-ass.net

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:28 -04:00
Waiman Long b9878279f9 locking/rtmutex: Dequeue waiter on ww_mutex deadlock
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 37e8abff2bebbf9947d6b784f5c75ed48a717089
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed, 25 Aug 2021 12:33:14 +0200

    locking/rtmutex: Dequeue waiter on ww_mutex deadlock

    The rt_mutex based ww_mutex variant queues the new waiter first in the
    lock's rbtree before evaluating the ww_mutex specific conditions which
    might decide that the waiter should back out. This check and conditional
    exit happens before the waiter is enqueued into the PI chain.

    The failure handling at the call site assumes that the waiter, if it is the
    top most waiter on the lock, is queued in the PI chain and then proceeds to
    adjust the unmodified PI chain, which results in RB tree corruption.

    Dequeue the waiter from the lock waiter list in the ww_mutex error exit
    path to prevent this.

    Fixes: add461325ec5 ("locking/rtmutex: Extend the rtmutex core to support ww_mutex")
    Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20210825102454.042280541@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:27 -04:00
Waiman Long c26102c71a locking/rtmutex: Dont dereference waiter lockless
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit c3123c431447da99db160264506de9897c003513
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed, 25 Aug 2021 12:33:12 +0200

    locking/rtmutex: Dont dereference waiter lockless

    The new rt_mutex_spin_on_owner() loop checks whether the spinning waiter is
    still the top waiter on the lock by utilizing rt_mutex_top_waiter(), which
    is broken because that function contains a sanity check which dereferences
    the top waiter pointer to check whether the waiter belongs to the
    lock. That's wrong in the lockless spinwait case:

     CPU 0                                                  CPU 1
     rt_mutex_lock(lock)                                    rt_mutex_lock(lock);
       queue(waiter0)
       waiter0 == rt_mutex_top_waiter(lock)
       rt_mutex_spin_on_owner(lock, waiter0) {              queue(waiter1)
                                                            waiter1 == rt_mutex_top_waiter(lock)
                                                            ...
         top_waiter = rt_mutex_top_waiter(lock)
           leftmost = rb_first_cached(&lock->waiters);
                                                            -> signal
                                                            dequeue(waiter1)
                                                            destroy(waiter1)
           w = rb_entry(leftmost, ....)
           BUG_ON(w->lock != lock)   <- UAF

    The BUG_ON() is correct for the case where the caller holds lock->wait_lock
    which guarantees that the leftmost waiter entry cannot vanish. For the
    lockless spinwait case it's broken.

    Create a new helper function which avoids the pointer dereference and just
    compares the leftmost entry pointer with current's waiter pointer to
    validate that current is still eligible for spinning.

    Fixes: 992caf7f1724 ("locking/rtmutex: Add adaptive spinwait mechanism")
    Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20210825102453.981720644@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:27 -04:00
Waiman Long 8f0a29c215 locking/rtmutex: Add adaptive spinwait mechanism
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 992caf7f17243d736fc996770bac6566103778f6
Author: Steven Rostedt <rostedt@goodmis.org>
Date:   Sun, 15 Aug 2021 23:29:25 +0200

    locking/rtmutex: Add adaptive spinwait mechanism

    Going to sleep when locks are contended can be quite inefficient when the
    contention time is short and the lock owner is running on a different CPU.

    The MCS mechanism cannot be used because MCS is strictly FIFO ordered while
    for rtmutex based locks the waiter ordering is priority based.

    Provide a simple adaptive spinwait mechanism which currently restricts the
    spinning to the top priority waiter.

    [ tglx: Provide a contemporary changelog, extended it to all rtmutex based
            locks and updated it to match the other spin on owner implementations ]

    Originally-by: Gregory Haskins <ghaskins@novell.com>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211305.912050691@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:25 -04:00
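
The spin loop shape, condensed from the changelog (see upstream 992caf7f1724, plus the two follow-up fixes earlier in this log, for the exact code): keep spinning only while the lock owner is running on another CPU and the spinner is still the top waiter.

    static bool rtmutex_spin_on_owner(struct rt_mutex_base *lock,
                                      struct rt_mutex_waiter *waiter,
                                      struct task_struct *owner)
    {
            bool res = true;

            rcu_read_lock();
            for (;;) {
                    /* Owner changed: stop spinning and retry the acquisition */
                    if (owner != rt_mutex_owner(lock))
                            break;
                    /*
                     * Stop spinning when the owner is not running, we are no
                     * longer the top waiter, or a reschedule is due.
                     */
                    if (!owner_on_cpu(owner) || need_resched() ||
                        !rt_mutex_waiter_is_top_waiter(lock, waiter)) {
                            res = false;
                            break;
                    }
                    cpu_relax();
            }
            rcu_read_unlock();
            return res;
    }
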
Waiman Long 511fea4a2d locking/rtmutex: Implement equal priority lock stealing
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 48eb3f4fcfd35495a8357459aa6fe437aa430b00
Author: Gregory Haskins <ghaskins@novell.com>
Date:   Sun, 15 Aug 2021 23:29:23 +0200

    locking/rtmutex: Implement equal priority lock stealing

    The current logic only allows lock stealing to occur if the current task is
    of higher priority than the pending owner.

    Significant throughput improvements can be gained by allowing the lock
    stealing to include tasks of equal priority when the contended lock is a
    spin_lock or a rw_lock and the tasks are not in an RT scheduling class.

    The assumption was that the system will make faster progress by allowing
    the task already on the CPU to take the lock rather than waiting for the
    system to wake up a different task.

    This does add a degree of unfairness, but in reality no negative side
    effects have been observed in the many years that this has been used in the
    RT kernel.

    [ tglx: Refactored and rewritten several times by Steve Rostedt, Sebastian
            Siewior and myself ]

    Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211305.857240222@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:24 -04:00
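
A sketch of the steal condition described above (field and helper names approximate, see upstream 48eb3f4fcfd3): lateral, equal-priority steals are only allowed for the spin/rwlock build and only for non-RT tasks.

    static __always_inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
                                               struct rt_mutex_waiter *top_waiter)
    {
            /* Strictly higher priority always wins */
            if (rt_mutex_waiter_less(waiter, top_waiter))
                    return true;

    #ifdef RT_MUTEX_BUILD_SPINLOCKS
            /*
             * RT and DL tasks are excluded from lateral (equal priority)
             * steals to avoid introducing unbounded latencies.
             */
            if (rt_prio(waiter->prio) || dl_prio(waiter->prio))
                    return false;

            return rt_mutex_waiter_equal(waiter, top_waiter);
    #else
            return false;
    #endif
    }
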
Waiman Long 83850b9f0b locking/rtmutex: Extend the rtmutex core to support ww_mutex
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit add461325ec5bc39aa619a1bfcde7245e5f31ac7
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Sun, 15 Aug 2021 23:28:58 +0200

    locking/rtmutex: Extend the rtmutex core to support ww_mutex

    Add a ww acquire context pointer to the waiter and various functions and
    add the ww_mutex related invocations to the proper spots in the locking
    code, similar to the mutex based variant.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211304.966139174@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:12 -04:00
Waiman Long 21ef327373 locking/rtmutex: Squash !RT tasks to DEFAULT_PRIO
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 715f7f9ece4685157bb59560f6c612340d730ab4
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Sun, 15 Aug 2021 23:28:30 +0200

    locking/rtmutex: Squash !RT tasks to DEFAULT_PRIO

    Ensure all !RT tasks have the same prio such that they end up in FIFO
    order and aren't split up according to nice level.

    The reason why nice levels were taken into account so far is historical. In
    the early days of the rtmutex code it was done to give the PI boosting and
    deboosting a larger coverage.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.938676930@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:19:00 -04:00
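
The squash is a tiny helper applied when a waiter's sort key is recorded; roughly (see upstream 715f7f9ece46):

    static __always_inline int __waiter_prio(struct task_struct *task)
    {
            int prio = task->prio;

            /* All !RT tasks get the same key: FIFO order, nice level ignored */
            if (!rt_prio(prio))
                    return DEFAULT_PRIO;

            return prio;
    }

    static __always_inline void waiter_update_prio(struct rt_mutex_waiter *waiter,
                                                   struct task_struct *task)
    {
            waiter->prio = __waiter_prio(task);
            waiter->deadline = task->dl.deadline;
    }
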
Waiman Long 3c4688ad75 locking/rtmutex: Provide the spin/rwlock core lock function
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 1c143c4b65da09081d644110e619decc49c9dee4
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:25 +0200

    locking/rtmutex: Provide the spin/rwlock core lock function

    Provide a simplified version of the rtmutex slowlock function which neither
    handles signals nor timeouts, and which is careful about preserving the
    state of the blocked task across the lock operation.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.770228446@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:57 -04:00
Waiman Long 7f6abebfbf locking/rtmutex: Guard regular sleeping locks specific functions
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit e17ba59b7e8e1f67e36d8fcc46daa13370efcf11
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:12 +0200

    locking/rtmutex: Guard regular sleeping locks specific functions

    Guard the regular sleeping lock specific functionality, which is used for
    rtmutex on non-RT enabled kernels and for mutex, rtmutex and semaphores on
    RT enabled kernels so the code can be reused for the RT specific
    implementation of spinlocks and rwlocks in a different compilation unit.

    No functional change.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.311535693@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:51 -04:00
Waiman Long a98d615dfb locking/rtmutex: Prepare RT rt_mutex_wake_q for RT locks
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 456cfbc65cd072f4f53936ee5a37eb1447a7d3ba
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:11 +0200

    locking/rtmutex: Prepare RT rt_mutex_wake_q for RT locks

    Add an rtlock_task pointer to rt_mutex_wake_q, which allows handling the RT
    specific wakeup for spin/rwlock waiters. The pointer consumes just 4/8 bytes
    on the stack, so it is provided unconditionally to avoid #ifdeffery all over
    the place.

    This cannot use a regular wake_q, because a task can have concurrent wakeups
    which would make it miss either the lock wakeup or the regular wakeups,
    depending on what gets queued first. Giving task_struct a separate
    wake_q_node for this would be overkill, because only a single task gets
    woken up in the spin/rw_lock unlock path.

    No functional change for non-RT enabled kernels.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.253614678@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:51 -04:00
Waiman Long c0e45115f7 locking/rtmutex: Use rt_mutex_wake_q_head
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 7980aa397cc0968ea3ffee7a985c31c92ad84f81
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:09 +0200

    locking/rtmutex: Use rt_mutex_wake_q_head

    Prepare for the required state aware handling of waiter wakeups via wake_q
    and switch the rtmutex code over to the rtmutex specific wrapper.

    No functional change.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.197113263@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:50 -04:00
Waiman Long 58001d18d1 locking/rtmutex: Provide rt_wake_q_head and helpers
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit b576e640ce5e22673e12949cf14ae3cb18d9b859
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:08 +0200

    locking/rtmutex: Provide rt_wake_q_head and helpers

    To handle the difference between wakeups for regular sleeping locks (mutex,
    rtmutex, rw_semaphore) and the wakeups for 'sleeping' spin/rwlocks on
    PREEMPT_RT enabled kernels correctly, it is required to provide a
    wake_q_head construct which allows keeping them separate.

    Provide a wrapper around wake_q_head and the required helpers, which will be
    extended with the state handling later.

    No functional change.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.139337655@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:49 -04:00
Waiman Long 07c9369108 locking/rtmutex: Add wake_state to rt_mutex_waiter
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit c014ef69b3acdb8c9e7fc412e96944f4d5c36fa0
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:06 +0200

    locking/rtmutex: Add wake_state to rt_mutex_waiter

    Regular sleeping locks like mutexes, rtmutexes and rw_semaphores are always
    entering and leaving a blocking section with task state == TASK_RUNNING.

    On a non-RT kernel spinlocks and rwlocks never affect the task state, but
    on RT kernels these locks are converted to rtmutex based 'sleeping' locks.

    So in case of contention the task blocks, which requires carefully
    preserving the task state and restoring it after acquiring the lock, taking
    into account regular wakeups for the task which happened while it was
    blocked. This state preservation is achieved by having a separate task state
    for blocking on an RT spin/rwlock and a saved_state field in task_struct,
    along with careful handling of these wakeup scenarios in try_to_wake_up().

    To avoid conditionals in the rtmutex code, store the wake state which has
    to be used for waking a lock waiter in rt_mutex_waiter, which allows
    handling the regular and RT spin/rwlocks by handing it to wake_up_state().

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211303.079800739@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:49 -04:00
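
In practice the waiter simply records the task state to be used at wakeup time; a sketch (the wake-up helper here is hypothetical, only the wake_state field and the wake_up_state() usage follow the commit): TASK_NORMAL for the regular sleeping locks, TASK_RTLOCK_WAIT for the RT 'sleeping' spin/rwlocks.

    struct rt_mutex_waiter {
            /* ... */
            struct task_struct      *task;
            struct rt_mutex_base    *lock;
            unsigned int            wake_state;   /* TASK_NORMAL or TASK_RTLOCK_WAIT */
    };

    /* Hypothetical helper to show the single wakeup call covering both cases */
    static inline void rt_mutex_wake_waiter(struct rt_mutex_waiter *waiter)
    {
            wake_up_state(waiter->task, waiter->wake_state);
    }
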
Waiman Long f90139f12c locking/rtmutex: Provide rt_mutex_slowlock_locked()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit ebbdc41e90ffce8b6bb3cbba1801ede2dd07a89b
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:28:00 +0200

    locking/rtmutex: Provide rt_mutex_slowlock_locked()

    Split the inner workings of rt_mutex_slowlock() out into a separate
    function, which can be reused by the upcoming RT lock substitutions,
    e.g. for rw_semaphores.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211302.841971086@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:46 -04:00
Waiman Long 3c29e6cff1 locking/rtmutex: Split out the inner parts of 'struct rtmutex'
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 830e6acc8a1cafe153a0d88f9b2455965b396131
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Sun, 15 Aug 2021 23:27:58 +0200

    locking/rtmutex: Split out the inner parts of 'struct rtmutex'

    RT builds substitutions for rwsem, mutex, spinlock and rwlock around
    rtmutexes. Split the inner working out so each lock substitution can use
    them with the appropriate lockdep annotations. This avoids having an extra
    unused lockdep map in the wrapped rtmutex.

    No functional change.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211302.784739994@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:45 -04:00
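
Structurally the split looks roughly like this (see upstream 830e6acc8a1c for the real definitions): the core fields move into rt_mutex_base and each lock substitution wraps it with its own lockdep map.

    struct rt_mutex_base {
            raw_spinlock_t          wait_lock;
            struct rb_root_cached   waiters;
            struct task_struct      *owner;
    };

    struct rt_mutex {
            struct rt_mutex_base    rtmutex;
    #ifdef CONFIG_DEBUG_LOCK_ALLOC
            struct lockdep_map      dep_map;
    #endif
    };
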
Waiman Long 41f37aa11c locking/rtmutex: Split API from implementation
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 531ae4b06a737ed5539cd75dc6f6b9a28f900bba
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:27:57 +0200

    locking/rtmutex: Split API from implementation

    Prepare for reusing the inner functions of rtmutex for RT lock
    substitutions: introduce kernel/locking/rtmutex_api.c and move
    them there.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211302.726560996@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:44 -04:00
Waiman Long cee965ef5a locking/rtmutex: Switch to from cmpxchg_*() to try_cmpxchg_*()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 709e0b62869f625afd18edd79f190c38cb39dfb2
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:27:55 +0200

    locking/rtmutex: Switch to from cmpxchg_*() to try_cmpxchg_*()

    Allows the compiler to generate better code depending on the architecture.

    Suggested-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211302.668958502@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:43 -04:00
Waiman Long b7b36f9743 locking/rtmutex: Convert macros to inlines
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit 785159301bedea25fae9b20cae3d12377246e941
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Sun, 15 Aug 2021 23:27:54 +0200

    locking/rtmutex: Convert macros to inlines

    Inlines are type-safe...

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211302.610830960@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:43 -04:00
Waiman Long 448f1816fe locking/rtmutex: Set proper wait context for lockdep
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032

commit b41cda03765580caf7723b8c1b672d191c71013f
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sun, 15 Aug 2021 23:27:38 +0200

    locking/rtmutex: Set proper wait context for lockdep

    RT mutexes belong to the LD_WAIT_SLEEP class. Make them so.

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20210815211302.031014562@linutronix.de

Signed-off-by: Waiman Long <longman@redhat.com>
2021-09-27 16:18:36 -04:00
Zhen Lei 07d25971b2 locking/rtmutex: Use the correct rtmutex debugging config option
It's CONFIG_DEBUG_RT_MUTEXES not CONFIG_DEBUG_RT_MUTEX.

Fixes: f7efc4799f ("locking/rtmutex: Inline chainwalk depth check")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210731123011.4555-1-thunder.leizhen@huawei.com
2021-08-10 08:21:52 +02:00
Peter Zijlstra 2f064a59a1 sched: Change task_struct::state
Change the type and name of task_struct::state. Drop the volatile and
shrink it to an 'unsigned int'. Rename it in order to find all uses
such that we can use READ_ONCE/WRITE_ONCE as appropriate.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20210611082838.550736351@infradead.org
2021-06-18 11:43:09 +02:00
Thomas Gleixner a51a327f3b locking/rtmutex: Clean up signal handling in __rt_mutex_slowlock()
The signal handling in __rt_mutex_slowlock() is open coded.

Use signal_pending_state() instead.

Aside from the cleanup this also prepares for the RT lock substitutions which
require support for TASK_KILLABLE.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153944.533811987@linutronix.de
2021-03-29 15:57:05 +02:00
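
With the cleanup, the wait loop uses one check that covers both TASK_INTERRUPTIBLE and the upcoming TASK_KILLABLE users; roughly, inside __rt_mutex_slowlock()'s loop:

            for (;;) {
                    if (try_to_take_rt_mutex(lock, current, waiter))
                            break;

                    /* signal_pending_state() handles both INTERRUPTIBLE and KILLABLE */
                    if (signal_pending_state(state, current)) {
                            ret = -EINTR;
                            break;
                    }

                    if (timeout && !timeout->task) {
                            ret = -ETIMEDOUT;
                            break;
                    }

                    raw_spin_unlock_irq(&lock->wait_lock);
                    schedule();
                    raw_spin_lock_irq(&lock->wait_lock);
                    set_current_state(state);
            }
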
Thomas Gleixner c2c360ed7f locking/rtmutex: Restrict the trylock WARN_ON() to debug
The warning as written is expensive and not really required for a
production kernel. Make it depend on rt mutex debugging and use !in_task()
for the condition which generates far better code and gives the same
answer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153944.436565064@linutronix.de
2021-03-29 15:57:04 +02:00
Thomas Gleixner 82cd5b1039 locking/rtmutex: Fix misleading comment in rt_mutex_postunlock()
Preemption is disabled in mark_wakeup_next_waiter(), not in
rt_mutex_slowunlock().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153944.341734608@linutronix.de
2021-03-29 15:57:04 +02:00
Thomas Gleixner 70c80103aa locking/rtmutex: Consolidate the fast/slowpath invocation
The indirection via a function pointer (which is at least optimized into a
tail call by the compiler) is making the code hard to read.

Clean it up and move the futex related trylock functions down to the futex
section.

Move the wake_q wakeup into rt_mutex_slowunlock(). No point in handing it
to the caller. The futex code uses a different function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153944.247927548@linutronix.de
2021-03-29 15:57:04 +02:00
Thomas Gleixner d7a2edb890 locking/rtmutex: Make text section and inlining consistent
rtmutex is half __sched and the other half is not. If the compiler decides
to not inline larger static functions then part of the code ends up in the
regular text section.

There are also quite a few performance related small helpers which are
either static or plain inline. Force inline those which make sense and mark
the rest __sched.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153944.152977820@linutronix.de
2021-03-29 15:57:04 +02:00
Thomas Gleixner f5a98866e5 locking/rtmutex: Decrapify __rt_mutex_init()
The conditional debug handling is just another layer of obfuscation. Split
the function so rt_mutex_init_proxy_locked() can invoke the inner init and
__rt_mutex_init() gets the full treatment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153943.955697588@linutronix.de
2021-03-29 15:57:03 +02:00
Thomas Gleixner f7efc4799f locking/rtmutex: Inline chainwalk depth check
There is no point for this wrapper at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210326153943.754254046@linutronix.de
2021-03-29 15:57:03 +02:00