Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
Waiman Long	dfd6ba19b1	rcutorture: Make rcutorture support print rcu-tasks gp state JIRA: https://issues.redhat.com/browse/RHEL-55557 commit dddcddef1414be3ebc37a40d13fcc0f6a672ba9f Author: Zqiang <qiang.zhang1211@gmail.com> Date: Mon, 18 Mar 2024 17:34:11 +0800 rcutorture: Make rcutorture support print rcu-tasks gp state This commit make rcu-tasks related rcutorture test support rcu-tasks gp state printing when the writer stall occurs or the at the end of rcutorture test, and generate rcu_ops->get_gp_data() operation to simplify the acquisition of gp state for different types of rcutorture tests. Signed-off-by: Zqiang <qiang.zhang1211@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:47 -04:00
Waiman Long	890d600997	srcu: Improve comments about acceleration leak JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 67050837ec14fc20a26b237ce965c50c85a318b7 Author: Joel Fernandes (Google) <joel@joelfernandes.org> Date: Wed, 27 Dec 2023 12:47:38 -0500 srcu: Improve comments about acceleration leak The comments added in commit 1ef990c4b36b ("srcu: No need to advance/accelerate if no callback enqueued") are a bit confusing. The comments are describing a scenario for code that was moved and is no longer the way it was (snapshot after advancing). Improve the code comments to reflect this and also document why acceleration can never fail. Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:22 -04:00
Waiman Long	52dceb8212	srcu: Explain why callbacks invocations can't run concurrently JIRA: https://issues.redhat.com/browse/RHEL-55557 commit c21357e4461f3f9c8ff93302906b5372411ee108 Author: Frederic Weisbecker <frederic@kernel.org> Date: Wed, 4 Oct 2023 01:29:03 +0200 srcu: Explain why callbacks invocations can't run concurrently If an SRCU barrier is queued while callbacks are running and a new callbacks invocator for the same sdp were to run concurrently, the RCU barrier might execute too early. As this requirement is non-obvious, make sure to keep a record. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:05 -04:00
Waiman Long	9f330005fd	srcu: No need to advance/accelerate if no callback enqueued JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 94c55b9e21979daa88e190bf971c47432a818ebe Author: Frederic Weisbecker <frederic@kernel.org> Date: Wed, 4 Oct 2023 01:29:02 +0200 srcu: No need to advance/accelerate if no callback enqueued While in grace period start, there is nothing to accelerate and therefore no need to advance the callbacks either if no callback is to be enqueued. Spare these needless operations in this case. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:04 -04:00
Waiman Long	fbd1964642	srcu: Remove superfluous callbacks advancing from srcu_gp_start() JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 20eb4142397cf3ec221de43f10ea149af462c572 Author: Frederic Weisbecker <frederic@kernel.org> Date: Wed, 4 Oct 2023 01:29:01 +0200 srcu: Remove superfluous callbacks advancing from srcu_gp_start() Callbacks advancing on SRCU must be performed on two specific places: 1) On enqueue time in order to make room for the acceleration of the new callback. 2) On invocation time in order to move the callbacks ready to invoke. Any other callback advancing callsite is needless. Remove the remaining one in srcu_gp_start(). Co-developed-by: Yong He <zhuangel570@gmail.com> Signed-off-by: Yong He <zhuangel570@gmail.com> Co-developed-by: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Co-developed-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:03 -04:00
Waiman Long	46ae167ae6	srcu: Only accelerate on enqueue time JIRA: https://issues.redhat.com/browse/RHEL-34076 commit 8a77f38bcd28d3c22ab7dd8eff3f299d43c00411 Author: Frederic Weisbecker <frederic@kernel.org> Date: Wed, 4 Oct 2023 01:29:00 +0200 srcu: Only accelerate on enqueue time Acceleration in SRCU happens on enqueue time for each new callback. This operation is expected not to fail and therefore any similar attempt from other places shouldn't find any remaining callbacks to accelerate. Moreover accelerations performed beyond enqueue time are error prone because rcu_seq_snap() then may return the snapshot for a new grace period that is not going to be started. Remove these dangerous and needless accelerations and introduce instead assertions reporting leaking unaccelerated callbacks beyond enqueue time. Co-developed-by: Yong He <alexyonghe@tencent.com> Signed-off-by: Yong He <alexyonghe@tencent.com> Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Co-developed-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Reviewed-by: Like Xu <likexu@tencent.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2024-05-31 10:56:17 -04:00
Waiman Long	3eedd5a2b3	srcu: Fix callbacks acceleration mishandling JIRA: https://issues.redhat.com/browse/RHEL-34076 commit 4a8e65b0c348e42107c64381e692e282900be361 Author: Frederic Weisbecker <frederic@kernel.org> Date: Wed, 4 Oct 2023 01:28:59 +0200 srcu: Fix callbacks acceleration mishandling SRCU callbacks acceleration might fail if the preceding callbacks advance also fails. This can happen when the following steps are met: 1) The RCU_WAIT_TAIL segment has callbacks (say for gp_num 8) and the RCU_NEXT_READY_TAIL also has callbacks (say for gp_num 12). 2) The grace period for RCU_WAIT_TAIL is observed as started but not yet completed so rcu_seq_current() returns 4 + SRCU_STATE_SCAN1 = 5. 3) This value is passed to rcu_segcblist_advance() which can't move any segment forward and fails. 4) srcu_gp_start_if_needed() still proceeds with callback acceleration. But then the call to rcu_seq_snap() observes the grace period for the RCU_WAIT_TAIL segment (gp_num 8) as completed and the subsequent one for the RCU_NEXT_READY_TAIL segment as started (ie: 8 + SRCU_STATE_SCAN1 = 9) so it returns a snapshot of the next grace period, which is 16. 5) The value of 16 is passed to rcu_segcblist_accelerate() but the freshly enqueued callback in RCU_NEXT_TAIL can't move to RCU_NEXT_READY_TAIL which already has callbacks for a previous grace period (gp_num = 12). So acceleration fails. 6) Note in all these steps, srcu_invoke_callbacks() hadn't had a chance to run srcu_invoke_callbacks(). Then some very bad outcome may happen if the following happens: 7) Some other CPU races and starts the grace period number 16 before the CPU handling previous steps had a chance. Therefore srcu_gp_start() isn't called on the latter sdp to fix the acceleration leak from previous steps with a new pair of call to advance/accelerate. 8) The grace period 16 completes and srcu_invoke_callbacks() is finally called. All the callbacks from previous grace periods (8 and 12) are correctly advanced and executed but callbacks in RCU_NEXT_READY_TAIL still remain. Then rcu_segcblist_accelerate() is called with a snaphot of 20. 9) Since nothing started the grace period number 20, callbacks stay unhandled. This has been reported in real load: [3144162.608392] INFO: task kworker/136:12:252684 blocked for more than 122 seconds. [3144162.615986] Tainted: G O K 5.4.203-1-tlinux4-0011.1 #1 [3144162.623053] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [3144162.631162] kworker/136:12 D 0 252684 2 0x90004000 [3144162.631189] Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm] [3144162.631192] Call Trace: [3144162.631202] __schedule+0x2ee/0x660 [3144162.631206] schedule+0x33/0xa0 [3144162.631209] schedule_timeout+0x1c4/0x340 [3144162.631214] ? update_load_avg+0x82/0x660 [3144162.631217] ? raw_spin_rq_lock_nested+0x1f/0x30 [3144162.631218] wait_for_completion+0x119/0x180 [3144162.631220] ? wake_up_q+0x80/0x80 [3144162.631224] __synchronize_srcu.part.19+0x81/0xb0 [3144162.631226] ? __bpf_trace_rcu_utilization+0x10/0x10 [3144162.631227] synchronize_srcu+0x5f/0xc0 [3144162.631236] irqfd_shutdown+0x3c/0xb0 [kvm] [3144162.631239] ? __schedule+0x2f6/0x660 [3144162.631243] process_one_work+0x19a/0x3a0 [3144162.631244] worker_thread+0x37/0x3a0 [3144162.631247] kthread+0x117/0x140 [3144162.631247] ? process_one_work+0x3a0/0x3a0 [3144162.631248] ? __kthread_cancel_work+0x40/0x40 [3144162.631250] ret_from_fork+0x1f/0x30 Fix this with taking the snapshot for acceleration _before_ the read of the current grace period number. The only side effect of this solution is that callbacks advancing happen then _after_ the full barrier in rcu_seq_snap(). This is not a problem because that barrier only cares about: 1) Ordering accesses of the update side before call_srcu() so they don't bleed. 2) See all the accesses prior to the grace period of the current gp_num The only things callbacks advancing need to be ordered against are carried by snp locking. Reported-by: Yong He <alexyonghe@tencent.com> Co-developed-by:: Yong He <alexyonghe@tencent.com> Signed-off-by: Yong He <alexyonghe@tencent.com> Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Co-developed-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com> Link: http://lore.kernel.org/CANZk6aR+CqZaqmMWrC2eRRPY12qAZnDZLwLnHZbNi=xXMB401g@mail.gmail.com Fixes: `da915ad5cf` ("srcu: Parallelize callback handling") Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2024-05-31 10:56:17 -04:00
Waiman Long	faa9279f82	srcu: Fix srcu_struct node grpmask overflow on 64-bit systems JIRA: https://issues.redhat.com/browse/RHEL-34076 commit d8d5b7bf6f2105883bbd91bbd4d5b67e4e3dff71 Author: Denis Arefev <arefev@swemel.ru> Date: Mon, 4 Sep 2023 15:21:14 +0300 srcu: Fix srcu_struct node grpmask overflow on 64-bit systems The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) is subject to overflow due to a failure to cast operands to a larger data type before performing the bitwise operation. The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF Kconfig option, which on 64-bit systems defaults to 16 (resulting in a maximum shift of 15), but which can be set up as high as 64 (resulting in a maximum shift of 63). A value of 31 can result in sign extension, resulting in 0xffffffff80000000 instead of the desired 0x80000000. A value of 32 or greater triggers undefined behavior per the C standard. This bug has not been known to cause issues because almost all kernels take the default CONFIG_RCU_FANOUT_LEAF=16. Furthermore, as long as a given compiler gives a deterministic non-zero result for 1<<N for N>=32, the code correctly invokes all SRCU callbacks, albeit wasting CPU time along the way. This commit therefore substitutes the correct 1UL for the buggy 1. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Denis Arefev <arefev@swemel.ru> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: David Laight <David.Laight@aculab.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2024-05-31 10:56:16 -04:00
Waiman Long	2ef5742a82	rcu: Dump memory object info if callback function is invalid JIRA: https://issues.redhat.com/browse/RHEL-34076 commit 2cbc482d325ee58001472c4359b311958c4efdd1 Author: Zhen Lei <thunder.leizhen@huawei.com> Date: Sat, 5 Aug 2023 11:17:26 +0800 rcu: Dump memory object info if callback function is invalid When a structure containing an RCU callback rhp is (incorrectly) freed and reallocated after rhp is passed to call_rcu(), it is not unusual for rhp->func to be set to NULL. This defeats the debugging prints used by __call_rcu_common() in kernels built with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, which expect to identify the offending code using the identity of this function. And in kernels build without CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, things are even worse, as can be seen from this splat: Unable to handle kernel NULL pointer dereference at virtual address 0 ... ... PC is at 0x0 LR is at rcu_do_batch+0x1c0/0x3b8 ... ... (rcu_do_batch) from (rcu_core+0x1d4/0x284) (rcu_core) from (__do_softirq+0x24c/0x344) (__do_softirq) from (__irq_exit_rcu+0x64/0x108) (__irq_exit_rcu) from (irq_exit+0x8/0x10) (irq_exit) from (__handle_domain_irq+0x74/0x9c) (__handle_domain_irq) from (gic_handle_irq+0x8c/0x98) (gic_handle_irq) from (__irq_svc+0x5c/0x94) (__irq_svc) from (arch_cpu_idle+0x20/0x3c) (arch_cpu_idle) from (default_idle_call+0x4c/0x78) (default_idle_call) from (do_idle+0xf8/0x150) (do_idle) from (cpu_startup_entry+0x18/0x20) (cpu_startup_entry) from (0xc01530) This commit therefore adds calls to mem_dump_obj(rhp) to output some information, for example: slab kmalloc-256 start ffff410c45019900 pointer offset 0 size 256 This provides the rough size of the memory block and the offset of the rcu_head structure, which as least provides at least a few clues to help locate the problem. If the problem is reproducible, additional slab debugging can be enabled, for example, CONFIG_DEBUG_SLAB=y, which can provide significantly more information. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2024-05-31 10:56:16 -04:00
Waiman Long	1cb8f685e9	srcu: Fix error handling in init_srcu_struct_fields() JIRA: https://issues.redhat.com/browse/RHEL-34076 commit f0a31b26be1f725e3f56beed76486c1b034120b3 Author: Joel Fernandes (Google) <joel@joelfernandes.org> Date: Sat, 29 Jul 2023 14:27:32 +0000 srcu: Fix error handling in init_srcu_struct_fields() The current error handling in init_srcu_struct_fields() is a bit inconsistent. If init_srcu_struct_nodes() fails, the function either returns -ENOMEM or 0 depending on whether ssp->sda_is_static is true or false. This can make init_srcu_struct_fields() return 0 even if memory allocation failed! Simplify the error handling by always returning -ENOMEM if either init_srcu_struct_nodes() or the per-CPU allocation fails. This makes the control flow easier to follow and avoids the inconsistent return values. Add goto labels to avoid duplicating the error cleanup code. Link: https://lore.kernel.org/r/20230404003508.GA254019@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2024-05-31 10:56:16 -04:00
Waiman Long	5e683ba142	srcu: Clarify comments on memory barrier "E" JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 754aa6427efeb8a059233e18e810263a108fdd71 Author: Joel Fernandes (Google) <joel@joelfernandes.org> Date: Sat, 28 Jan 2023 03:59:01 +0000 srcu: Clarify comments on memory barrier "E" There is an smp_mb() named "E" in srcu_flip() immediately before the increment (flip) of the srcu_struct structure's ->srcu_idx. The purpose of E is to order the preceding scan's read of lock counters against the flipping of the ->srcu_idx, in order to prevent new readers from continuing to use the old ->srcu_idx value, which might needlessly extend the grace period. However, this ordering is already enforced because of the control dependency between the preceding scan and the ->srcu_idx flip. This control dependency exists because atomic_long_read() is used to scan the counts, because WRITE_ONCE() is used to flip ->srcu_idx, and because ->srcu_idx is not flipped until the ->srcu_lock_count[] and ->srcu_unlock_count[] counts match. And such a match cannot happen when there is an in-flight reader that started before the flip (observation courtesy Mathieu Desnoyers). The litmus test below (courtesy of Frederic Weisbecker, with changes for ctrldep by Boqun and Joel) shows this: C srcu (* * bad condition: P0's first scan (SCAN1) saw P1's idx=0 LOCK count inc, though P1 saw flip. * * So basically, the ->po ordering on both P0 and P1 is enforced via ->ppo * (control deps) on both sides, and both P0 and P1 are interconnected by ->rf * relations. Combining the ->ppo with ->rf, a cycle is impossible. ) {} // updater P0(int IDX, int LOCK0, int UNLOCK0, int LOCK1, int UNLOCK1) { int lock1; int unlock1; int lock0; int unlock0; // SCAN1 unlock1 = READ_ONCE(UNLOCK1); smp_mb(); // A lock1 = READ_ONCE(LOCK1); // FLIP if (lock1 == unlock1) { // Control dep smp_mb(); // E // Remove E and still passes. WRITE_ONCE(IDX, 1); smp_mb(); // D // SCAN2 unlock0 = READ_ONCE(UNLOCK0); smp_mb(); // A lock0 = READ_ONCE(LOCK0); } } // reader P1(int IDX, int LOCK0, int UNLOCK0, int LOCK1, int UNLOCK1) { int tmp; int idx1; int idx2; // 1st reader idx1 = READ_ONCE(IDX); if (idx1 == 0) { // Control dep tmp = READ_ONCE(LOCK0); WRITE_ONCE(LOCK0, tmp + 1); smp_mb(); / B and C / tmp = READ_ONCE(UNLOCK0); WRITE_ONCE(UNLOCK0, tmp + 1); } else { tmp = READ_ONCE(LOCK1); WRITE_ONCE(LOCK1, tmp + 1); smp_mb(); / B and C / tmp = READ_ONCE(UNLOCK1); WRITE_ONCE(*UNLOCK1, tmp + 1); } } exists (0:lock1=1 /\ 1:idx1=1) More complicated litmus tests with multiple SRCU readers also show that memory barrier E is not needed. This commit therefore clarifies the comment on memory barrier E. Why not also remove that redundant smp_mb()? Because control dependencies are quite fragile due to their not being recognized by most compilers and tools. Control dependencies therefore exact an ongoing maintenance burden, and such a burden cannot be justified in this slowpath. Therefore, that smp_mb() stays until such time as its overhead becomes a measurable problem in a real workload running on a real production system, or until such time as compilers start paying attention to this sort of control dependency. Co-developed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Co-developed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:20 -04:00
Waiman Long	de3564aca0	srcu: Fix long lines in srcu_funnel_gp_start() JIRA: https://issues.redhat.com/browse/RHEL-5228 commit cefc0a599b19d8dd0e26d0b2e43311bae7530ca1 Author: Paul E. McKenney <paulmck@kernel.org> Date: Sat, 18 Mar 2023 12:31:53 -0700 srcu: Fix long lines in srcu_funnel_gp_start() This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:19 -04:00
Waiman Long	0756127b1a	srcu: Fix long lines in srcu_gp_end() JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 6c366522e10f2b5e916b3f08ac042df1a1cd512a Author: Paul E. McKenney <paulmck@kernel.org> Date: Sat, 18 Mar 2023 10:52:48 -0700 srcu: Fix long lines in srcu_gp_end() This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:18 -04:00
Waiman Long	e916f43a12	srcu: Fix long lines in cleanup_srcu_struct() JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 5ff8319f07db11c3b9347ce1dc0a6d951ae96d29 Author: Paul E. McKenney <paulmck@kernel.org> Date: Sat, 18 Mar 2023 10:51:50 -0700 srcu: Fix long lines in cleanup_srcu_struct() This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:17 -04:00
Waiman Long	5725498a44	srcu: Fix long lines in srcu_get_delay() JIRA: https://issues.redhat.com/browse/RHEL-5228 commit eabe7625f053050e4cecbbe98bd944f7e8eb14f5 Author: Paul E. McKenney <paulmck@kernel.org> Date: Sat, 18 Mar 2023 09:34:52 -0700 srcu: Fix long lines in srcu_get_delay() This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Cc: Christoph Hellwig <hch@lst.de> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:17 -04:00
Waiman Long	57aa7f8ce0	srcu: Check for readers at module-exit time JIRA: https://issues.redhat.com/browse/RHEL-5228 commit a7bf4d7c16c1cf9753873879630a5d5169eb3206 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 24 Mar 2023 09:05:50 -0700 srcu: Check for readers at module-exit time If a given statically allocated in-module srcu_struct structure was ever used for updates, srcu_module_going() will invoke cleanup_srcu_struct() at module-exit time. This will check for the error case of SRCU readers persisting past module-exit time. On the other hand, if this srcu_struct structure never went through a grace period, srcu_module_going() only invokes free_percpu(), which would result in strange failures if SRCU readers persisted past module-exit time. This commit therefore adds a srcu_readers_active() check to srcu_module_going(), splatting if readers have persisted and refraining from invoking free_percpu() in that case. Better to leak memory than to suffer silent memory corruption! [ paulmck: Apply Zhang, Qiang1 feedback on memory leak. ] Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:16 -04:00
Waiman Long	7b446e704b	srcu: Move work-scheduling fields from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit fd1b3f8e097b7fbbab8ac4a802b24fc23c703dcf Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 21:30:32 -0700 srcu: Move work-scheduling fields from srcu_struct to srcu_usage This commit moves the ->reschedule_jiffies, ->reschedule_count, and ->work fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. However, this means that the container_of() calls cannot get a pointer to the srcu_struct because they are no longer in the srcu_struct. This issue is addressed by adding a ->srcu_ssp field in the srcu_usage structure that references the corresponding srcu_struct structure. And given the presence of the sup pointer to the srcu_usage structure, replace some ssp->srcu_usage-> instances with sup->. [ paulmck Apply feedback from kernel test robot. ] Link: https://lore.kernel.org/oe-kbuild-all/202303191400.iO5BOqka-lkp@intel.com/ Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:16 -04:00
Waiman Long	b201447229	srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit d20162e0bfc222183a7c94cd00e74b6bbf1a605b Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 21:08:18 -0700 srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex, ->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:15 -04:00
Waiman Long	9c5b51a047	srcu: Move ->sda_is_static from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 660349ac79cb22bb64c44b026d879069783e97d5 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 20:22:58 -0700 srcu: Move ->sda_is_static from srcu_struct to srcu_usage This commit moves the ->sda_is_static field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:15 -04:00
Waiman Long	340cb0f7a8	srcu: Move heuristics fields from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 3b46679c623c2766f4c56fd3f9ce8edbb38c5d20 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 20:01:02 -0700 srcu: Move heuristics fields from srcu_struct to srcu_usage This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries, and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:14 -04:00
Waiman Long	871bc36c0e	srcu: Move grace-period fields from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 03200b5ca3b4d4edf634dc052bf3b8eb8dc8bbbc Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 19:30:50 -0700 srcu: Move grace-period fields from srcu_struct to srcu_usage This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed, ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:14 -04:00
Waiman Long	9238b6fe23	srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit e3a6ab25cfa0fcdcb31c346b9871a566d440980d Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 19:13:16 -0700 srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage This commit moves the ->srcu_gp_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:13 -04:00
Waiman Long	464974c24d	srcu: Move ->lock from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit b3fb11f7e9c3c64dd86403409a070c996d8ac081 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 18:29:38 -0700 srcu: Move ->lock from srcu_struct to srcu_usage This commit moves the ->lock field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:13 -04:00
Waiman Long	a9b4dbe54a	srcu: Move ->lock initialization after srcu_usage allocation JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 0839ade94bdef395bab02b3a579416c112062026 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 17:35:21 -0700 srcu: Move ->lock initialization after srcu_usage allocation Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernel initialize the srcu_struct structure's ->lock before the srcu_usage structure has been allocated. This of course prevents the ->lock from being moved to the srcu_usage structure, so this commit moves the initialization into the init_srcu_struct_fields() after the srcu_usage structure has been allocated. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:12 -04:00
Waiman Long	c47e914c6c	srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 574dc1a7efe490dffe5c1ce0285306feec16a880 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 17:22:27 -0700 srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage This commit moves the ->srcu_cb_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:11 -04:00
Waiman Long	e6644842d9	srcu: Move ->srcu_size_state from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit a0d8cbd3821369dc9478cabd605417afb9eb24dc Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 17:16:30 -0700 srcu: Move ->srcu_size_state from srcu_struct to srcu_usage This commit moves the ->srcu_size_state field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:11 -04:00
Waiman Long	906311b6eb	srcu: Move ->level from srcu_struct to srcu_usage JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 208f41b1312443401353bec0c1939e2bfc28adce Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 14:43:08 -0700 srcu: Move ->level from srcu_struct to srcu_usage This commit moves the ->level[] array from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:10 -04:00
Waiman Long	175ca90f08	srcu: Begin offloading srcu_struct fields to srcu_update JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 95433f7263011e0e6e83caef85d98896dd99cab7 Author: Paul E. McKenney <paulmck@kernel.org> Date: Thu, 16 Mar 2023 17:58:51 -0700 srcu: Begin offloading srcu_struct fields to srcu_update The current srcu_struct structure is on the order of 200 bytes in size (depending on architecture and .config), which is much better than the old-style 26K bytes, but still all too inconvenient when one is trying to achieve good cache locality on a fastpath involving SRCU readers. However, only a few fields in srcu_struct are used by SRCU readers. The remaining fields could be offloaded to a new srcu_update structure, thus shrinking the srcu_struct structure down to a few tens of bytes. This commit begins this noble quest, a quest that is complicated by open-coded initialization of the srcu_struct within the srcu_notifier_head structure. This complication is addressed by updating the srcu_notifier_head structure's open coding, given that there does not appear to be a straightforward way of abstracting that initialization. This commit moves only the ->node pointer to srcu_update. Later commits will move additional fields. [ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ] Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/ Suggested-by: Christoph Hellwig <hch@lst.de> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:08 -04:00
Waiman Long	713d6e5730	srcu: Use static init for statically allocated in-module srcu_struct JIRA: https://issues.redhat.com/browse/RHEL-5228 commit f4d01a259374ef358cd6b00a96b4dfc0fb05a844 Author: Paul E. McKenney <paulmck@kernel.org> Date: Fri, 17 Mar 2023 13:28:04 -0700 srcu: Use static init for statically allocated in-module srcu_struct Further shrinking the srcu_struct structure is eased by requiring that in-module srcu_struct structures rely more heavily on static initialization. In particular, this preserves the property that a module-load-time srcu_struct initialization can fail only due to memory-allocation failure of the per-CPU srcu_data structures. It might also slightly improve robustness by keeping the number of memory allocations that must succeed down percpu_alloc() call. This is in preparation for splitting an srcu_usage structure out of the srcu_struct structure. [ paulmck: Fold in qiang1.zhang@intel.com feedback. ] Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-10-06 21:44:07 -04:00
Waiman Long	d8689d7dfb	rcu: Annotate SRCU's update-side lockdep dependencies JIRA: https://issues.redhat.com/browse/RHEL-5228 commit f0f44752f5f61ee4e3bd88ae033fdb888320aafe Author: Boqun Feng <boqun.feng@gmail.com> Date: Thu, 12 Jan 2023 22:59:54 -0800 rcu: Annotate SRCU's update-side lockdep dependencies Although all flavors of RCU readers are annotated correctly with lockdep as recursive read locks, they do not set the lock_acquire 'check' parameter. This means that RCU read locks are not added to the lockdep dependency graph, which in turn means that lockdep cannot detect RCU-based deadlocks. This is not a problem for RCU flavors having atomic read-side critical sections because context-based annotations can catch these deadlocks, see for example the RCU_LOCKDEP_WARN() statement in synchronize_rcu(). But context-based annotations are not helpful for sleepable RCU, especially given that it is perfectly legal to do synchronize_srcu(&srcu1) within an srcu_read_lock(&srcu2). However, we can detect SRCU-based by: (1) Making srcu_read_lock() a 'check'ed recursive read lock and (2) Making synchronize_srcu() a empty write lock critical section. Even better, with the newly introduced lock_sync(), we can avoid false positives about irq-unsafe/safe. This commit therefore makes it so. Note that NMI-safe SRCU read side critical sections are currently not annotated, but might be annotated in the future. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ boqun: Add comments for annotation per Waiman's suggestion ] [ boqun: Fix comment warning reported by Stephen Rothwell ] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:46 -04:00
Waiman Long	793924bf53	srcu: Update comment after the index flip JIRA: https://issues.redhat.com/browse/RHEL-5228 commit dafc4d1603c27671adc2b41eb7e7827f8cc18961 Author: Paul E. McKenney <paulmck@kernel.org> Date: Wed, 21 Dec 2022 08:32:51 -0800 srcu: Update comment after the index flip Because there is not guaranteed to be a full memory barrier between the ->srcu_unlock_count increment of an srcu_read_unlock() and the ->srcu_lock_count increment of the next srcu_read_lock(), this next srcu_read_lock() is not guaranteed to see the effect of the index flip just prior to this comment. However, this next srcu_read_lock() will execute a full memory barrier, so the srcu_read_lock() after that is guaranteed to see that index flip. This guarantee is illustrated by the following diagram of events and the litmus test following that. ------------------------------------------------------------------------ READER UPDATER ------------- ---------- // idx is initially 0. srcu_flip() { smp_mb(); // RSCS srcu_read_unlock() { smp_mb(); idx++; // P smp_mb(); // QQ } srcu_readers_unlock_idx(0) { ,--counted------------ count all unlock[0]; // Q \| unlock[0]++; // X } smp_mb(); srcu_read_lock() { READ(idx) = 0; ,---- count all lock[0]; // contributes imbalance of 1. lock[0]++; ----counted \| smp_mb(); // PP } \| } \| \| // RSCS not going to effect above scan \| srcu_read_unlock() { \| smp_mb(); \| unlock[0]++; \| } \| / / srcu_read_lock() { \| READ(idx); // Y -----cannot be counted because of P (has to sample idx as 1) lock[1]++; ... } ------------------------------------------------------------------------ This makes it similar to the store buffer pattern. Using X, Y, P and Q annotated above, we get: ------------------------------------------------------------------------ READER UPDATER X (write) P (write) smp_mb(); //PP smp_mb(); //QQ Y (read) Q (read) ------------------------------------------------------------------------ ASCII art courtesy of Joel Fernandes. Reported-by: Joel Fernandes <joel@joelfernandes.org> Reported-by: Boqun Feng <boqun.feng@gmail.com> Reported-by: Frederic Weisbecker <frederic@kernel.org> Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:32 -04:00
Waiman Long	2d5559c0c7	srcu: Yet more detail for srcu_readers_active_idx_check() comments JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 0cd4b50b12d96d668b0627c149b19b5784ad4898 Author: Paul E. McKenney <paulmck@kernel.org> Date: Wed, 14 Dec 2022 10:50:30 -0800 srcu: Yet more detail for srcu_readers_active_idx_check() comments The comment in srcu_readers_active_idx_check() following the smp_mb() is out of date, hailing from a simpler time when preemption was disabled across the bulk of __srcu_read_lock(). The fact that preemption was disabled meant that the number of tasks that had fetched the old index but not yet incremented counters was limited by the number of CPUs. In our more complex modern times, the number of CPUs is no longer a limit. This commit therefore updates this comment, additionally giving more memory-ordering detail. [ paulmck: Apply Nt->Nc feedback from Joel Fernandes. ] Reported-by: Boqun Feng <boqun.feng@gmail.com> Reported-by: Frederic Weisbecker <frederic@kernel.org> Reported-by: "Joel Fernandes (Google)" <joel@joelfernandes.org> Reported-by: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Reported-by: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:31 -04:00
Waiman Long	a3d3ec08ee	srcu: Remove needless rcu_seq_done() check while holding read lock JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 1bafbfb3e1a18af7f404977ed0d218dc4f176f8e Author: Pingfan Liu <kernelfans@gmail.com> Date: Wed, 23 Nov 2022 21:56:37 +0800 srcu: Remove needless rcu_seq_done() check while holding read lock The srcu_gp_start_if_needed() function now read-holds the srcu_struct whose grace period is being started, which means that the corresponding SRCU grace period cannot end. This in turn means that the SRCU grace-period sequence number returned by rcu_seq_snap() cannot expire during this time. And that means that the calls to rcu_seq_done() in srcu_funnel_exp_start() and srcu_funnel_gp_start() can never return true. This commit therefore removes these rcu_seq_done() checks, but adds checks in kernels built with CONFIG_PROVE_RCU=y that splats if rcu_seq_done() does somehow return true. [ paulmck: Rearrange checks to handle kernels built with lockdep. ] Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> To: rcu@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:31 -04:00
Waiman Long	133e4893e5	srcu: Fix the comparision in srcu_invl_snp_seq() JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 50be0c0439fc1d8bda733ff26f6526e49970857a Author: Pingfan Liu <kernelfans@gmail.com> Date: Wed, 16 Nov 2022 09:52:44 +0800 srcu: Fix the comparision in srcu_invl_snp_seq() A grace-period sequence number contains two fields: counter and state. SRCU_SNP_INIT_SEQ provides a guaranteed invalid value for grace-period sequence numbers in newly allocated srcu_node structures' ->srcu_have_cbs[] and ->srcu_gp_seq_needed_exp fields. The point of the comparison in srcu_invl_snp_seq() is not to detect invalid grace-period sequence numbers in general, but rather to detect a newly allocated srcu_node structure whose ->srcu_have_cbs[] and ->srcu_gp_seq_needed_exp fields need to be brought into line with the srcu_struct structure's ->srcu_gp_seq field. This commit therefore causes srcu_invl_snp_seq() to compare both fields of the specified grace-period sequence number. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: <rcu@vger.kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:30 -04:00
Waiman Long	3cbc967ed1	srcu: Debug NMI safety even on archs that don't require it JIRA: https://issues.redhat.com/browse/RHEL-5228 Conflicts: A merge conflict in the srcu_barrier() hunk of kernel/rcu/srcutree.c due to the presence of a later upstream commit 7f24626d6dd8 ("srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL"). commit e29a4915db1480f96e0bc2e928699d086a71f43c Author: Frederic Weisbecker <frederic@kernel.org> Date: Thu, 13 Oct 2022 19:22:44 +0200 srcu: Debug NMI safety even on archs that don't require it Currently the NMI safety debugging is only performed on architectures that don't support NMI-safe this_cpu_inc(). Reorder the code so that other architectures like x86 also detect bad uses. [ paulmck: Apply kernel test robot, Stephen Rothwell, and Zqiang feedback. ] Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:14 -04:00
Waiman Long	9511be47c8	srcu: Explain the reason behind the read side critical section on GP start JIRA: https://issues.redhat.com/browse/RHEL-5228 commit ae3c0706160b60ac5e7d36aac428ae6e572dc932 Author: Frederic Weisbecker <frederic@kernel.org> Date: Thu, 13 Oct 2022 19:22:43 +0200 srcu: Explain the reason behind the read side critical section on GP start Tell about the need to protect against concurrent updaters who may overflow the GP counter behind the current update. Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:14 -04:00
Waiman Long	8cfa692f14	srcu: Warn when NMI-unsafe API is used in NMI JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 6b77bb9b99c66c6596c58e7a25169bc2ea6b82dd Author: Frederic Weisbecker <frederic@kernel.org> Date: Thu, 13 Oct 2022 19:22:42 +0200 srcu: Warn when NMI-unsafe API is used in NMI Using the NMI-unsafe reader API from within an NMI handler is very likely to be buggy for three reasons: 1) NMIs aren't strictly re-entrant (a pending nested NMI will execute at the end of the current one) so it should be fine to use a non-atomic increment here. However, breakpoints can still interrupt NMIs and if a breakpoint callback has a reader on that same ssp, a racy increment can happen. 2) If the only reader site for a given srcu_struct structure is in an NMI handler, then RCU should be used instead of SRCU. 3) Because of the previous reason (2), an srcu_struct structure having an SRCU read side critical section in an NMI handler is likely to have another one from a task context. For all these reasons, warn if an NMI-unsafe reader API is used from an NMI handler. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:13 -04:00
Waiman Long	c68d820dda	srcu: Check for consistent global per-srcu_struct NMI safety JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 36f65f1d1553e35cd9e6b281271f40d639a128c3 Author: Paul E. McKenney <paulmck@kernel.org> Date: Tue, 20 Sep 2022 14:54:41 -0700 srcu: Check for consistent global per-srcu_struct NMI safety This commit adds runtime checks to verify that a given srcu_struct uses consistent NMI-safe (or not) read-side primitives globally, but based on the per-CPU data. These global checks are made by the grace-period code that must scan the srcu_data structures anyway, and are done only in kernels built with CONFIG_PROVE_RCU=y. Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:11 -04:00
Waiman Long	8fb0ae0f42	srcu: Check for consistent per-CPU per-srcu_struct NMI safety JIRA: https://issues.redhat.com/browse/RHEL-5228 Conflicts: A merge conflict in the srcu_barrier() hunk of kernel/rcu/srcutree.c due to the presence of a later upstream commit 7f24626d6dd8 ("srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL"). commit 27120e7d2c4d5c438b76f9c6330037a52ad0722e Author: Paul E. McKenney <paulmck@kernel.org> Date: Mon, 19 Sep 2022 14:03:07 -0700 srcu: Check for consistent per-CPU per-srcu_struct NMI safety This commit adds runtime checks to verify that a given srcu_struct uses consistent NMI-safe (or not) read-side primitives on a per-CPU basis. Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:11 -04:00
Waiman Long	2cd5fec95b	srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() JIRA: https://issues.redhat.com/browse/RHEL-5228 Conflicts: A merge conflict in the srcu_barrier() hunk of kernel/rcu/srcutree.c due to the presence of a later upstream commit 7f24626d6dd8 ("srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL"). commit 2e83b879fb91dafe995967b46a1d38a5b0889242 Author: Paul E. McKenney <paulmck@kernel.org> Date: Thu, 15 Sep 2022 14:29:07 -0700 srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() On strict load-store architectures, the use of this_cpu_inc() by srcu_read_lock() and srcu_read_unlock() is not NMI-safe in TREE SRCU. To see this suppose that an NMI arrives in the middle of srcu_read_lock(), just after it has read ->srcu_lock_count, but before it has written the incremented value back to memory. If that NMI handler also does srcu_read_lock() and srcu_read_lock() on that same srcu_struct structure, then upon return from that NMI handler, the interrupted srcu_read_lock() will overwrite the NMI handler's update to ->srcu_lock_count, but leave unchanged the NMI handler's update by srcu_read_unlock() to ->srcu_unlock_count. This can result in a too-short SRCU grace period, which can in turn result in arbitrary memory corruption. If the NMI handler instead interrupts the srcu_read_unlock(), this can result in eternal SRCU grace periods, which is not much better. This commit therefore creates a pair of new srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions, which allow SRCU readers in both NMI handlers and in process and IRQ context. It is bad practice to mix the existing and the new _nmisafe() primitives on the same srcu_struct structure. Use one set or the other, not both. Just to underline that "bad practice" point, using srcu_read_lock() at process level and srcu_read_lock_nmisafe() in your NMI handler will not, repeat NOT, work. If you do not immediately understand why this is the case, please review the earlier paragraphs in this commit log. [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Apply feedback from Randy Dunlap. ] [ paulmck: Apply feedback from John Ogness. ] [ paulmck: Apply feedback from Frederic Weisbecker. ] Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:10 -04:00
Waiman Long	0b57e9b754	srcu: Convert ->srcu_lock_count and ->srcu_unlock_count to atomic JIRA: https://issues.redhat.com/browse/RHEL-5228 commit 5d0f5953b60f5f7a278085b55ddc73e2932f4c33 Author: Paul E. McKenney <paulmck@kernel.org> Date: Thu, 15 Sep 2022 12:09:30 -0700 srcu: Convert ->srcu_lock_count and ->srcu_unlock_count to atomic NMI-safe variants of srcu_read_lock() and srcu_read_unlock() are needed by printk(), which on many architectures entails read-modify-write atomic operations. This commit prepares Tree SRCU for this change by making both ->srcu_lock_count and ->srcu_unlock_count by atomic_long_t. [ paulmck: Apply feedback from John Ogness. ] Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Signed-off-by: Waiman Long <longman@redhat.com>	2023-09-22 13:21:08 -04:00
Pingfan Liu	1e3afd647b	srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2129726 Upstream: Linux tree Conflicts: Minor context diffs but not affect this simple substitution commit 7f24626d6dd844bfc6d1f492d214d29c86d02550 Author: Pingfan Liu <kernelfans@gmail.com> Date: Mon Oct 31 09:52:37 2022 +0800 srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL Commit 994f706872e6 ("srcu: Make Tree SRCU able to operate without snp_node array") assumes that cpu 0 is always online. However, there really are situations when some other CPU is the boot CPU, for example, when booting a kdump kernel with the maxcpus=1 boot parameter. On PowerPC, the kdump kernel can hang as follows: ... [ 1.740036] systemd[1]: Hostname set to <xyz.com> [ 243.686240] INFO: task systemd:1 blocked for more than 122 seconds. [ 243.686264] Not tainted 6.1.0-rc1 #1 [ 243.686272] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.686281] task:systemd state:D stack:0 pid:1 ppid:0 flags:0x00042000 [ 243.686296] Call Trace: [ 243.686301] [c000000016657640] [c000000016657670] 0xc000000016657670 (unreliable) [ 243.686317] [c000000016657830] [c00000001001dec0] __switch_to+0x130/0x220 [ 243.686333] [c000000016657890] [c000000010f607b8] __schedule+0x1f8/0x580 [ 243.686347] [c000000016657940] [c000000010f60bb4] schedule+0x74/0x140 [ 243.686361] [c0000000166579b0] [c000000010f699b8] schedule_timeout+0x168/0x1c0 [ 243.686374] [c000000016657a80] [c000000010f61de8] __wait_for_common+0x148/0x360 [ 243.686387] [c000000016657b20] [c000000010176bb0] __flush_work.isra.0+0x1c0/0x3d0 [ 243.686401] [c000000016657bb0] [c0000000105f2768] fsnotify_wait_marks_destroyed+0x28/0x40 [ 243.686415] [c000000016657bd0] [c0000000105f21b8] fsnotify_destroy_group+0x68/0x160 [ 243.686428] [c000000016657c40] [c0000000105f6500] inotify_release+0x30/0xa0 [ 243.686440] [c000000016657cb0] [c0000000105751a8] __fput+0xc8/0x350 [ 243.686452] [c000000016657d00] [c00000001017d524] task_work_run+0xe4/0x170 [ 243.686464] [c000000016657d50] [c000000010020e94] do_notify_resume+0x134/0x140 [ 243.686478] [c000000016657d80] [c00000001002eb18] interrupt_exit_user_prepare_main+0x198/0x270 [ 243.686493] [c000000016657de0] [c00000001002ec60] syscall_exit_prepare+0x70/0x180 [ 243.686505] [c000000016657e10] [c00000001000bf7c] system_call_vectored_common+0xfc/0x280 [ 243.686520] --- interrupt: 3000 at 0x7fffa47d5ba4 [ 243.686528] NIP: 00007fffa47d5ba4 LR: 0000000000000000 CTR: 0000000000000000 [ 243.686538] REGS: c000000016657e80 TRAP: 3000 Not tainted (6.1.0-rc1) [ 243.686548] MSR: 800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE> CR: 42044440 XER: 00000000 [ 243.686572] IRQMASK: 0 [ 243.686572] GPR00: 0000000000000006 00007ffffa606710 00007fffa48e7200 0000000000000000 [ 243.686572] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000001 [ 243.686572] GPR08: 000001000c172dd0 0000000000000000 0000000000000000 0000000000000000 [ 243.686572] GPR12: 0000000000000000 00007fffa4ff4bc0 0000000000000000 0000000000000000 [ 243.686572] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 243.686572] GPR20: 0000000132dfdc50 000000000000000e 0000000000189375 0000000000000000 [ 243.686572] GPR24: 00007ffffa606ae0 0000000000000005 000001000c185490 000001000c172570 [ 243.686572] GPR28: 000001000c172990 000001000c184850 000001000c172e00 00007fffa4fedd98 [ 243.686683] NIP [00007fffa47d5ba4] 0x7fffa47d5ba4 [ 243.686691] LR [0000000000000000] 0x0 [ 243.686698] --- interrupt: 3000 [ 243.686708] INFO: task kworker/u16:1:24 blocked for more than 122 seconds. [ 243.686717] Not tainted 6.1.0-rc1 #1 [ 243.686724] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.686733] task:kworker/u16:1 state:D stack:0 pid:24 ppid:2 flags:0x00000800 [ 243.686747] Workqueue: events_unbound fsnotify_mark_destroy_workfn [ 243.686758] Call Trace: [ 243.686762] [c0000000166736e0] [c00000004fd91000] 0xc00000004fd91000 (unreliable) [ 243.686775] [c0000000166738d0] [c00000001001dec0] __switch_to+0x130/0x220 [ 243.686788] [c000000016673930] [c000000010f607b8] __schedule+0x1f8/0x580 [ 243.686801] [c0000000166739e0] [c000000010f60bb4] schedule+0x74/0x140 [ 243.686814] [c000000016673a50] [c000000010f699b8] schedule_timeout+0x168/0x1c0 [ 243.686827] [c000000016673b20] [c000000010f61de8] __wait_for_common+0x148/0x360 [ 243.686840] [c000000016673bc0] [c000000010210840] __synchronize_srcu.part.0+0xa0/0xe0 [ 243.686855] [c000000016673c30] [c0000000105f2c64] fsnotify_mark_destroy_workfn+0xc4/0x1a0 [ 243.686868] [c000000016673ca0] [c000000010174ea8] process_one_work+0x2a8/0x570 [ 243.686882] [c000000016673d40] [c000000010175208] worker_thread+0x98/0x5e0 [ 243.686895] [c000000016673dc0] [c0000000101828d4] kthread+0x124/0x130 [ 243.686908] [c000000016673e10] [c00000001000cd40] ret_from_kernel_thread+0x5c/0x64 [ 366.566274] INFO: task systemd:1 blocked for more than 245 seconds. [ 366.566298] Not tainted 6.1.0-rc1 #1 [ 366.566305] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 366.566314] task:systemd state:D stack:0 pid:1 ppid:0 flags:0x00042000 [ 366.566329] Call Trace: ... The above splat occurs because PowerPC really does use maxcpus=1 instead of nr_cpus=1 in the kernel command line. Consequently, the (quite possibly non-zero) kdump CPU is the only online CPU in the kdump kernel. SRCU unconditionally queues a sdp->work on cpu 0, for which no worker thread has been created, so sdp->work will be never executed and __synchronize_srcu() will never be completed. This commit therefore replaces CPU ID 0 with get_boot_cpu_id() in key places in Tree SRCU. Since the CPU indicated by get_boot_cpu_id() is guaranteed to be online, this avoids the above splat. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> To: rcu@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Pingfan Liu <piliu@redhat.com>	2023-03-13 08:36:27 +08:00
Waiman Long	929a1c06d5	srcu: Make expedited RCU grace periods block even less frequently Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit 4f2bfd9494a072d58203600de6bedd72680e612a Author: Neeraj Upadhyay <quic_neeraju@quicinc.com> Date: Fri, 1 Jul 2022 08:45:45 +0530 srcu: Make expedited RCU grace periods block even less frequently The purpose of commit 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU") was to prevent a long series of never-blocking expedited SRCU grace periods from blocking kernel-live-patching (KLP) progress. Although it was successful, it also resulted in excessive boot times on certain embedded workloads running under qemu with the "-bios QEMU_EFI.fd" command line. Here "excessive" means increasing the boot time up into the three-to-four minute range. This increase in boot time was due to the more than 6000 back-to-back invocations of synchronize_rcu_expedited() within the KVM host OS, which in turn resulted from qemu's emulation of a long series of MMIO accesses. Commit 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace periods") did not significantly help this particular use case. Zhangfei Gao and Shameerali Kolothum Thodi did experiments varying the value of SRCU_MAX_NODELAY_PHASE with HZ=250 and with various values of non-sleeping per phase counts on a system with preemption enabled, and observed the following boot times: +──────────────────────────+────────────────+ \| SRCU_MAX_NODELAY_PHASE \| Boot time (s) \| +──────────────────────────+────────────────+ \| 100 \| 30.053 \| \| 150 \| 25.151 \| \| 200 \| 20.704 \| \| 250 \| 15.748 \| \| 500 \| 11.401 \| \| 1000 \| 11.443 \| \| 10000 \| 11.258 \| \| 1000000 \| 11.154 \| +──────────────────────────+────────────────+ Analysis on the experiment results show additional improvements with CPU-bound delays approaching one jiffy in duration. This improvement was also seen when number of per-phase iterations were scaled to one jiffy. This commit therefore scales per-grace-period phase number of non-sleeping polls so that non-sleeping polls extend for about one jiffy. In addition, the delay-calculation call to srcu_get_delay() in srcu_gp_end() is replaced with a simple check for an expedited grace period. This change schedules callback invocation immediately after expedited grace periods complete, which results in greatly improved boot times. Testing done by Marc and Zhangfei confirms that this change recovers most of the performance degradation in boottime; for CONFIG_HZ_250 configuration, specifically, boot times improve from 3m50s to 41s on Marc's setup; and from 2m40s to ~9.7s on Zhangfei's setup. In addition to the changes to default per phase delays, this change adds 3 new kernel parameters - srcutree.srcu_max_nodelay, srcutree.srcu_max_nodelay_phase, and srcutree.srcu_retry_check_delay. This allows users to configure the srcu grace period scanning delays in order to more quickly react to additional use cases. Fixes: 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace periods") Fixes: 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU") Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reported-by: yueluck <yueluck@163.com> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:38:29 -04:00
Waiman Long	9456a65490	srcu: Block less aggressively for expedited grace periods Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit 8f870e6eb8c0c3f9869bf3fcf9db39f86cfcea49 Author: Paul E. McKenney <paulmck@kernel.org> Date: Sun, 12 Jun 2022 15:00:06 -0700 srcu: Block less aggressively for expedited grace periods Commit 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU") fixed a problem where a long-running expedited SRCU grace period could block kernel live patching. It did so by giving up on expediting once a given SRCU expedited grace period grew too old. Unfortunately, this added excessive delays to boots of virtual embedded systems specifying "-bios QEMU_EFI.fd" to qemu. This commit therefore makes the transition away from expediting less aggressive, increasing the per-grace-period phase number of non-sleeping polls of readers from one to three and increasing the required grace-period age from one jiffy (actually from zero to one jiffies) to two jiffies (actually from one to two jiffies). Fixes: 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reported-by: chenxiang (M)" <chenxiang66@hisilicon.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/ Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:38:28 -04:00
Waiman Long	95af8dfe11	srcu: Drop needless initialization of sdp in srcu_gp_start() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit 586e31d59c436cda65a2e8ac04ff954bed247023 Author: Lukas Bulwahn <lukas.bulwahn@gmail.com> Date: Tue, 15 Mar 2022 09:55:49 +0100 srcu: Drop needless initialization of sdp in srcu_gp_start() Commit 9c7ef4c30f12 ("srcu: Make Tree SRCU able to operate without snp_node array") initializes the local variable sdp differently depending on the srcu's state in srcu_gp_start(). Either way, this initialization overwrites the value used when sdp is defined. This commit therefore drops this pointless definition-time initialization. Although there is no functional change, compiler code generation may be affected. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:22:23 -04:00
Waiman Long	cc172221e3	srcu: Prevent expedited GPs and blocking readers from consuming CPU Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit 282d8998e9979c2186af7f7d22366f2fc3149838 Author: Paul E. McKenney <paulmck@kernel.org> Date: Tue, 8 Mar 2022 15:45:33 -0800 srcu: Prevent expedited GPs and blocking readers from consuming CPU If an SRCU reader blocks while a synchronize_srcu_expedited() waits for that same reader, then that grace period will spawn an endless series of workqueue handlers, consuming a full CPU. This quickly gets pointless because consuming more CPU isn't going to make that reader get done faster, especially if it is blocked waiting for an external event. This commit therefore spawns at most one pair of back-to-back workqueue handlers per expedited grace period phase, instead inserting increasing delays as that grace period phase grows older, but capped at 10 jiffies. In any case, if there have been at least 100 back-to-back workqueue handlers within a single jiffy, regardless of grace period or grace-period phase, then a one-jiffy delay is inserted. [ paulmck: Apply feedback from kernel test robot. ] Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Reported-by: Song Liu <song@kernel.org> Tested-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:22:23 -04:00
Waiman Long	babac7a730	srcu: Add contention check to call_srcu() srcu_data ->lock acquisition Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit c2445d38785086422e56dcbe049b73a53b2ba81f Author: Paul E. McKenney <paulmck@kernel.org> Date: Mon, 31 Jan 2022 13:27:15 -0800 srcu: Add contention check to call_srcu() srcu_data ->lock acquisition This commit increases the sensitivity of contention detection by adding checks to the acquisition of the srcu_data structure's lock on the call_srcu() code path. Co-developed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:22:22 -04:00
Waiman Long	3538999436	srcu: Automatically determine size-transition strategy at boot Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit a57ffb3c6b67e59e8632f731414b792eacc6cca0 Author: Paul E. McKenney <paulmck@kernel.org> Date: Mon, 31 Jan 2022 11:21:30 -0800 srcu: Automatically determine size-transition strategy at boot This commit adds a srcutree.convert_to_big option of zero that causes SRCU to decide at boot whether to wait for contention (small systems) or immediately expand to large (large systems). A new srcutree.big_cpu_lim (defaulting to 128) defines how many CPUs constitute a large system. Co-developed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:22:22 -04:00
Waiman Long	6e9dff95d2	srcu: Add contention-triggered addition of srcu_node tree Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit 9f2e91d94c91558e3764fe4e01c5e6281a90f239 Author: Paul E. McKenney <paulmck@kernel.org> Date: Thu, 27 Jan 2022 20:32:05 -0800 srcu: Add contention-triggered addition of srcu_node tree This commit instruments the acquisitions of the srcu_struct structure's ->lock, enabling the initiation of a transition from SRCU_SIZE_SMALL to SRCU_SIZE_BIG when sufficient contention is experienced. The instrumentation counts the number of trylock failures within the confines of a single jiffy. If that number exceeds the value specified by the srcutree.small_contention_lim kernel boot parameter (which defaults to 100), and if the value specified by the srcutree.convert_to_big kernel boot parameter has the 0x10 bit set (defaults to 0), then a transition will be automatically initiated. By default, there will never be any transitions, so that none of the srcu_struct structures ever gains an srcu_node array. The useful values for srcutree.convert_to_big are: 0x00: Never convert. 0x01: Always convert at init_srcu_struct() time. 0x02: Convert when rcutorture prints its first round of statistics. 0x03: Decide conversion approach at boot given system size. 0x10: Convert if contention is encountered. 0x12: Convert if contention is encountered or when rcutorture prints its first round of statistics, whichever comes first. The value 0x11 acts the same as 0x01 because the conversion happens before there is any chance of contention. [ paulmck: Apply "static" feedback from kernel test robot. ] Co-developed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:21:59 -04:00
Waiman Long	0d2a81fe27	srcu: Create concurrency-safe helper for initiating size transition Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491 commit 99659f64b14e55cfa48980f5396f83820bafd028 Author: Paul E. McKenney <paulmck@kernel.org> Date: Thu, 27 Jan 2022 14:56:39 -0800 srcu: Create concurrency-safe helper for initiating size transition Once there are contention-initiated size transitions, it will be possible for rcutorture to initiate a transition at the same time as a contention-initiated transition. This commit therefore creates a concurrency-safe helper function named srcu_transition_to_big() to safely initiate size transitions. Co-developed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2022-08-30 17:21:59 -04:00

1 2 3

150 Commits