JIRA: https://issues.redhat.com/browse/RHEL-20288
commit 68d124b0999919015e6d23008eafea106ec6bb40
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Wed, 8 May 2024 20:11:58 -0700
rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter
If a CPU is running either a userspace application or a guest OS in
nohz_full mode, it is possible for a system call to occur just as an
RCU grace period is starting. If that CPU also has the scheduling-clock
tick enabled for any reason (such as a second runnable task), and if the
system was booted with rcutree.use_softirq=0, then RCU can add insult to
injury by awakening that CPU's rcuc kthread, resulting in yet another
task and yet more OS jitter due to switching to that task, running it,
and switching back.
In addition, in the common case where that system call is not of
excessively long duration, awakening the rcuc task is pointless.
This pointlessness is due to the fact that the CPU will enter an extended
quiescent state upon returning to the userspace application or guest OS.
In this case, the rcuc kthread cannot do anything that the main RCU
grace-period kthread cannot do on its behalf, at least if it is given
a few additional milliseconds (for example, given the time duration
specified by rcutree.jiffies_till_first_fqs, give or take scheduling
delays).
This commit therefore adds a rcutree.nohz_full_patience_delay kernel
boot parameter that specifies the grace period age (in milliseconds,
rounded to jiffies) before which RCU will refrain from awakening the
rcuc kthread. Preliminary experimentation suggests a value of 1000,
that is, one second. Increasing rcutree.nohz_full_patience_delay will
increase grace-period latency and in turn increase memory footprint,
so systems with constrained memory might choose a smaller value.
Systems with less-aggressive OS-jitter requirements might choose the
default value of zero, which keeps the traditional immediate-wakeup
behavior, thus avoiding increases in grace-period latency.
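For example (illustrative only; the nohz_full CPU list is hypothetical,
and rcutree.use_softirq=0 is shown only because it enables the rcuc
kthreads discussed above), such a system might boot with:
	nohz_full=2-7 rcutree.use_softirq=0 rcutree.nohz_full_patience_delay=1000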
[ paulmck: Apply Leonardo Bras feedback. ]
Link: https://lore.kernel.org/all/20240328171949.743211-1-leobras@redhat.com/
Reported-by: Leonardo Bras <leobras@redhat.com>
Suggested-by: Leonardo Bras <leobras@redhat.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-55557
commit ae2b217ab542d0db0ca1a6de4f442201a1982f00
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Fri, 8 Mar 2024 11:15:01 -0800
rcu: Make hotplug operations track GP state, not flags
Currently, there are rcu_data structure fields named ->rcu_onl_gp_flags
and ->rcu_ofl_gp_flags that track the rcu_state.gp_flags field at the
time of the corresponding CPU's last online or offline operation,
respectively. However, this information is not particularly useful.
It would be better to instead track the grace period state kept
in rcu_state.gp_state. This would also be consistent with the
initialization in rcu_boot_init_percpu_data(), which is to RCU_GP_CLEANED
(an rcu_state.gp_state value), and also with the diagnostics in
rcu_implicit_dynticks_qs(), whose format is consistent with an integer,
not a bitmask.
This commit therefore makes this change and changes the names to
->rcu_onl_gp_state and ->rcu_ofl_gp_state, respectively.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Waiman Long <longman@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-55557
commit b67cffcbbf9dc759d95d330a5af5d1480af2b1f1
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Fri, 12 Jan 2024 16:46:20 +0100
rcu/exp: Handle parallel exp gp kworkers affinity
Affine the parallel expedited gp kworkers to their respective RCU node
in order to make them close to the cache they are playing with.
This reuses the boost kthreads machinery that probes into CPU hotplug
operations such that the kthreads become/stay affine to their respective
node as soon/long as they contain online CPUs. Otherwise, if the
CPU going down was the last one online on the leaf node, the related
kthread is affined to the housekeeping CPUs.
In the long run, this affinity VS CPU hotplug operation game should
probably be implemented at the generic kthread level.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
[boqun: s/* rcu_boost_task/*rcu_boost_task as reported by checkpatch]
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Waiman Long <longman@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-55557
commit 8e5e621566485a3e160c0d8bfba206cb1d6b980d
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Fri, 12 Jan 2024 16:46:19 +0100
rcu/exp: Make parallel exp gp kworker per rcu node
When CONFIG_RCU_EXP_KTHREAD=n, the expedited grace period per-node
initialization is performed in parallel via workqueues (one work per
node).
However, in CONFIG_RCU_EXP_KTHREAD=y kernels, this per-node initialization
is performed by a single kworker serializing each node initialization (one
work for all nodes).
The second part is certainly less scalable and efficient beyond a single
leaf node.
To improve this, expand this single kworker into per-node kworkers. This
new layout is eventually intended to remove the workqueue-based
implementation since it will essentially now become duplicate code.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Waiman Long <longman@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-55557
commit 7836b270607676ed1c0c6a4a840a2ede9437a6a1
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Fri, 12 Jan 2024 16:46:17 +0100
rcu: s/boost_kthread_mutex/kthread_mutex
This mutex currently protects per-node boost kthread creation and
affinity setting across CPU hotplug operations.
Since the expedited kworkers will soon be split per node as well, they
will be subject to the same concurrency constraints against hotplug.
Therefore their creation and affinity tuning operations will be grouped
with those of boost kthreads and then rely on the same mutex.
To prepare for that, generalize its name.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Waiman Long <longman@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-34076
commit 9146eb25495ea8bfb5010192e61e3ed5805ce9ef
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Fri, 7 Apr 2023 16:05:38 -0700
rcu: Mark additional concurrent load from ->cpu_no_qs.b.exp
The per-CPU rcu_data structure's ->cpu_no_qs.b.exp field is updated
only on the instance corresponding to the current CPU, but can be read
more widely. Unmarked accesses are OK from the corresponding CPU, but
only if interrupts are disabled, given that interrupt handlers can and
do modify this field.
Unfortunately, although the load from rcu_preempt_deferred_qs() is always
carried out from the corresponding CPU, interrupts are not necessarily
disabled. This commit therefore upgrades this load to READ_ONCE.
Similarly, the diagnostic access from synchronize_rcu_expedited_wait()
might run with interrupts disabled and from some other CPU. This commit
therefore marks this load with data_race().
Finally, the C-language access in rcu_preempt_ctxt_queue() is OK as
is because interrupts are disabled and this load is always from the
corresponding CPU. This commit adds a comment giving the rationale for
this access being safe.
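A rough sketch of the two markings described above (surrounding context
elided, so this is illustrative rather than the exact diff):
	/* rcu_preempt_deferred_qs(): same-CPU load, but IRQs may be enabled. */
	READ_ONCE(this_cpu_ptr(&rcu_data)->cpu_no_qs.b.exp)
	/* synchronize_rcu_expedited_wait(): diagnostic-only, possibly cross-CPU. */
	data_race(rdp->cpu_no_qs.b.exp)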
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
JIRA: https://issues.redhat.com/browse/RHEL-34076
commit 6343402ac35dd534291a6c82924a4f09cf6cd1e5
Author: Pingfan Liu <kernelfans@gmail.com>
Date: Tue, 6 Sep 2022 11:36:42 -0700
rcu: Synchronize ->qsmaskinitnext in rcu_boost_kthread_setaffinity()
Once either rcutree_online_cpu() or rcutree_dead_cpu() is invoked
concurrently, the following rcu_boost_kthread_setaffinity() race can
occur:
         CPU 1                                    CPU2
mask = rcu_rnp_online_cpus(rnp);
...
                                      mask = rcu_rnp_online_cpus(rnp);
                                      ...
                                      set_cpus_allowed_ptr(t, cm);
set_cpus_allowed_ptr(t, cm);
This results in CPU2's update being overwritten by that of CPU1, and
thus the possibility of ->boost_kthread_task continuing to run on a
to-be-offlined CPU.
This commit therefore eliminates this race by relying on the pre-existing
acquisition of ->boost_kthread_mutex to serialize the full process of
changing the affinity of ->boost_kthread_task.
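A simplified sketch of the serialized flow (cpumask computation elided):
	mutex_lock(&rnp->boost_kthread_mutex);
	mask = rcu_rnp_online_cpus(rnp);
	/* ...build cpumask cm from mask... */
	set_cpus_allowed_ptr(t, cm);
	mutex_unlock(&rnp->boost_kthread_mutex);
With the snapshot of the online mask and the affinity update inside a
single critical section, the two CPUs' updates can no longer interleave.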
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 528262f50274079740b53e29bcaaabf219aa7417
Author: Zqiang <qiang1.zhang@intel.com>
Date: Tue, 19 Jul 2022 12:39:00 +0800
rcu-tasks: Make RCU Tasks Trace check for userspace execution
Userspace execution is a valid quiescent state for RCU Tasks Trace,
but the scheduling-clock interrupt does not currently report such
quiescent states.
Of course, the scheduling-clock interrupt is not strictly speaking
userspace execution. However, the only way that this code is not
in a quiescent state is if something invoked rcu_read_lock_trace(),
and that would be reflected in the ->trc_reader_nesting field in
the task_struct structure. Furthermore, this field is checked by
rcu_tasks_trace_qs(), which is invoked by rcu_tasks_qs() which is in
turn invoked by rcu_note_voluntary_context_switch() in kernels building
at least one of the RCU Tasks flavors. It is therefore safe to invoke
rcu_tasks_trace_qs() from the rcu_sched_clock_irq().
But rcu_tasks_qs() also invokes rcu_tasks_classic_qs() for RCU
Tasks, which lacks the read-side markers provided by RCU Tasks Trace.
This raises the possibility that an RCU Tasks grace period could start
after the interrupt from userspace execution, but before the call to
rcu_sched_clock_irq(). However, it turns out that this is safe because
the RCU Tasks grace period waits for an RCU grace period, which will
wait for the entire scheduling-clock interrupt handler, including any
RCU Tasks read-side critical section that this handler might contain.
This commit therefore updates the rcu_sched_clock_irq() function's
check for usermode execution and its call to rcu_tasks_classic_qs()
to instead check for both usermode execution and interrupt from idle,
and to instead call rcu_note_voluntary_context_switch(). This
consolidates code and provides faster RCU Tasks Trace
reporting of quiescent states in kernels that do scheduling-clock
interrupts for userspace execution.
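A hedged sketch of the consolidated check (the surrounding logic in
rcu_sched_clock_irq() is elided, so this is illustrative only):
	if (user || rcu_is_cpu_rrupt_from_idle())
		rcu_note_voluntary_context_switch(current);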
[ paulmck: Consolidate checks into rcu_sched_clock_irq(). ]
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 7634b1eaa0cd135d5eedadb04ad3c91b1ecf28a9
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Wed, 24 Aug 2022 14:46:56 -0700
rcu: Exclude outgoing CPU when it is the last to leave
The rcu_boost_kthread_setaffinity() function removes the outgoing CPU
from the set_cpus_allowed() mask for the corresponding leaf rcu_node
structure's rcub priority-boosting kthread. Except that if the outgoing
CPU will leave that structure without any online CPUs, the mask is set
to the housekeeping CPU mask from housekeeping_cpumask(). Which is fine
unless the outgoing CPU happens to be a housekeeping CPU.
This commit therefore removes the outgoing CPU from the housekeeping mask.
This would of course be problematic if the outgoing CPU was the last
online housekeeping CPU, but in that case you are in a world of hurt
anyway. If someone comes up with a valid use case for a system needing
all the housekeeping CPUs to be offline, further adjustments can be made.
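Illustrative sketch of the adjusted fallback (the housekeeping type name
varies across kernel versions, so treat this as a sketch):
	if (cpumask_empty(cm)) {
		cpumask_copy(cm, housekeeping_cpumask(HK_FLAG_KTHREAD));
		if (outgoingcpu >= 0)
			cpumask_clear_cpu(outgoingcpu, cm);
	}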
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 621189a1fe93cb2b34d62c5cdb9e258bca044813
Author: Zqiang <qiang1.zhang@intel.com>
Date: Mon, 8 Aug 2022 10:26:26 +0800
rcu: Avoid triggering strict-GP irq-work when RCU is idle
Kernels built with PREEMPT_RCU=y and RCU_STRICT_GRACE_PERIOD=y trigger
irq-work from rcu_read_unlock(), and the resulting irq-work handler
invokes rcu_preempt_deferred_qs_handle(). The point of this triggering
is to force grace periods to end quickly in order to give tools like KASAN
a better chance of detecting RCU usage bugs such as leaking RCU-protected
pointers out of an RCU read-side critical section.
However, this irq-work triggering is unconditional. This works, but
there is no point in doing this irq-work unless the current grace period
is waiting on the running CPU or task, which is not the common case.
After all, in the common case there are many rcu_read_unlock() calls
per CPU per grace period.
This commit therefore triggers the irq-work only when the current grace
period is waiting on the running CPU or task.
This change was tested as follows on a four-CPU system:
echo rcu_preempt_deferred_qs_handler > /sys/kernel/debug/tracing/set_ftrace_filter
echo 1 > /sys/kernel/debug/tracing/function_profile_enabled
insmod rcutorture.ko
sleep 20
rmmod rcutorture.ko
echo 0 > /sys/kernel/debug/tracing/function_profile_enabled
echo > /sys/kernel/debug/tracing/set_ftrace_filter
This procedure produces results in this per-CPU set of files:
/sys/kernel/debug/tracing/trace_stat/function*
Sample output from one of these files is as follows:
Function                           Hit      Time           Avg        s^2
--------                           ---      ----           ---        ---
rcu_preempt_deferred_qs_handle  838746   182650.3 us    0.217 us   0.004 us
The baseline sum of the "Hit" values (the number of calls to this
function) was 3,319,015. With this commit, that sum was 1,140,359,
for a 2.9x reduction. The worst-case variance across the CPUs was less
than 25%, so this large effect size is statistically significant.
The raw data is available in the Link: URL.
Link: https://lore.kernel.org/all/20220808022626.12825-1-qiang1.zhang@intel.com/
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 089254fd386eb6800dd7d7863f12a04ada0c35fa
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Wed, 3 Aug 2022 08:48:12 -0700
rcu: Document reason for rcu_all_qs() call to preempt_disable()
Given that rcu_all_qs() is in non-preemptible kernels, why on earth should
it invoke preempt_disable()? This commit adds the reason, which is to
work nicely with debugging enabled in CONFIG_PREEMPT_COUNT=y kernels.
Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Reported-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit bca4fa8cb0f4c096b515952f64e560fd784a0514
Author: Zqiang <qiang1.zhang@intel.com>
Date: Mon, 20 Jun 2022 14:42:24 +0800
rcu: Update rcu_preempt_deferred_qs() comments for !PREEMPT kernels
In non-preemptible kernels, tasks never do context switches within
RCU read-side critical sections. Therefore, in such kernels, each
leaf rcu_node structure's ->blkd_tasks list will always be empty.
The comment on the non-preemptible version of rcu_preempt_deferred_qs()
confuses this point, so this commit therefore fixes it.
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 6d60ea03ac2d3dcf6ddee6b45aa7213d8b0461c5
Author: Zqiang <qiang1.zhang@intel.com>
Date: Thu, 16 Jun 2022 21:53:47 +0800
rcu: Fix rcu_read_unlock_strict() strict QS reporting
Kernels built with CONFIG_PREEMPT=n and CONFIG_RCU_STRICT_GRACE_PERIOD=y
report the quiescent state directly from the outermost rcu_read_unlock().
However, the current CPU's rcu_data structure's ->cpu_no_qs.b.norm
might still be set, in which case rcu_report_qs_rdp() will exit early,
thus failing to report quiescent state.
This commit therefore causes rcu_read_unlock_strict() to clear
CPU's rcu_data structure's ->cpu_no_qs.b.norm field before invoking
rcu_report_qs_rdp().
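A sketch of the resulting sequence in rcu_read_unlock_strict()
(simplified from the actual function):
	rdp = this_cpu_ptr(&rcu_data);
	rdp->cpu_no_qs.b.norm = false;	/* Avoid rcu_report_qs_rdp()'s early exit. */
	rcu_report_qs_rdp(rdp);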
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 5103850654fdc651f0a7076ac753b958f018bb85
Author: Zqiang <qiang1.zhang@intel.com>
Date: Fri, 29 Apr 2022 20:42:22 +0800
rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread()
Callbacks are invoked in RCU kthreads when callbacks are offloaded
(rcu_nocbs boot parameter) or when RCU's softirq handler has been
offloaded to rcuc kthreads (use_softirq==0). The current code allows
for the rcu_nocbs case but not the use_softirq case. This commit adds
support for the use_softirq case.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 70a82c3c55c8665d3996dcb9968adcf24d52bbc4
Author: Zqiang <qiang1.zhang@intel.com>
Date: Fri, 13 May 2022 08:42:55 +0800
rcu: Immediately boost preempted readers for strict grace periods
The intent of the CONFIG_RCU_STRICT_GRACE_PERIOD Kconfig option is to
cause normal grace periods to complete quickly in order to better catch
errors resulting from improperly leaking pointers from RCU read-side
critical sections. However, kernels built with this option enabled still
wait for some hundreds of milliseconds before boosting RCU readers that
have been preempted within their current critical section. The value
of this delay is set by the CONFIG_RCU_BOOST_DELAY Kconfig option,
which defaults to 500 milliseconds.
This commit therefore causes kernels built with strict grace periods
to ignore CONFIG_RCU_BOOST_DELAY. This causes rcu_initiate_boost()
to start boosting immediately after all CPUs on a given leaf rcu_node
structure have passed through their quiescent states.
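Heavily simplified sketch of the adjusted condition in
rcu_initiate_boost() (the real condition carries additional terms):
	if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) ||
	    ULONG_CMP_GE(jiffies, rnp->boost_time))
		/* ...start boosting... */;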
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
Conflicts: 2 merge conflicts due to upstream merge conflict. Apply the
same resolutions as in upstream merge commit 34bc7b454dc3
("Merge branch 'ctxt.2022.07.05a' into HEAD").
commit 48f8070f5dd8e13148ae4647780a452d53c457a2
Author: Patrick Wang <patrick.wang.shcn@gmail.com>
Date: Tue, 26 Apr 2022 18:45:02 +0800
rcu: Avoid tracing a few functions executed in stop machine
Stop-machine recently started calling additional functions while waiting:
----------------------------------------------------------------
Former stop machine wait loop:
do {
cpu_relax(); => macro
...
} while (curstate != STOPMACHINE_EXIT);
-----------------------------------------------------------------
Current stop machine wait loop:
do {
stop_machine_yield(cpumask); => function (notraced)
...
touch_nmi_watchdog(); => function (notraced, inside calls also notraced)
...
rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced)
} while (curstate != MULTI_STOP_EXIT);
------------------------------------------------------------------
These functions (and the functions that they call) must be marked
notrace to prevent them from being updated while they are executing.
The consequences of failing to mark these functions can be severe:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0
rcu: 3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0
(detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4)
Task dump for CPU 1:
task:migration/1 state:R running task stack: 0 pid: 19 ppid: 2 flags:0x00000000
Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
Call Trace:
Task dump for CPU 3:
task:migration/3 state:R running task stack: 0 pid: 29 ppid: 2 flags:0x00000000
Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
Call Trace:
rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=2 timer-softirq=594
rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:I stack: 0 pid: 14 ppid: 2 flags:0x00000000
Call Trace:
schedule+0x56/0xc2
schedule_timeout+0x82/0x184
rcu_gp_fqs_loop+0x19a/0x318
rcu_gp_kthread+0x11a/0x140
kthread+0xee/0x118
ret_from_exception+0x0/0x14
rcu: Stack dump where RCU GP kthread last ran:
Task dump for CPU 2:
task:migration/2 state:R running task stack: 0 pid: 24 ppid: 2 flags:0x00000000
Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174
Call Trace:
This commit therefore marks these functions notrace:
rcu_preempt_deferred_qs()
rcu_preempt_need_deferred_qs()
rcu_preempt_deferred_qs_irqrestore()
[ paulmck: Apply feedback from Neeraj Upadhyay. ]
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 172114552701b85d5c3b1a089a73ee85d0d7786b
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Wed, 8 Jun 2022 16:40:33 +0200
rcu/context-tracking: Move RCU-dynticks internal functions to context_tracking
Move the core RCU eqs/dynticks functions to context tracking so that
we can later merge all that code within context tracking.
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Yu Liao <liaoyu15@huawei.com>
Cc: Phil Auld <pauld@redhat.com>
Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
Cc: Alex Belits <abelits@marvell.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
commit 6a694411977a6d57ff76a896a745c2f717372dac
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Tue, 24 May 2022 20:33:17 -0700
rcu-tasks: Make rcu_note_context_switch() unconditionally call rcu_tasks_qs()
This commit makes rcu_note_context_switch() unconditionally invoke the
rcu_tasks_qs() function, as opposed to doing so only when RCU (as opposed
to RCU Tasks Trace) urgently needs a grace period to end.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491
commit f596e2ce1c0f250bb3ecc179f611be37e862635f
Author: Zqiang <qiang1.zhang@intel.com>
Date: Mon, 4 Apr 2022 07:59:32 +0800
rcu: Use IRQ_WORK_INIT_HARD() to avoid rcu_read_unlock() hangs
When booting kernels built with both CONFIG_RCU_STRICT_GRACE_PERIOD=y
and CONFIG_PREEMPT_RT=y, the rcu_read_unlock_special() function's
invocation of irq_work_queue_on() on an irq_work structure initialized
with init_irq_work() causes the rcu_preempt_deferred_qs_handler()
function to execute in SCHED_FIFO
irq_work kthreads. Because rcu_read_unlock_special() is invoked on each
rcu_read_unlock() in such kernels, the amount of work just keeps piling
up, resulting in a boot-time hang.
This commit therefore avoids this hang by using IRQ_WORK_INIT_HARD()
instead of init_irq_work(), but only in kernels built with both
CONFIG_PREEMPT_RT=y and CONFIG_RCU_STRICT_GRACE_PERIOD=y.
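Illustrative shape of the conditional initialization (the rdp->defer_qs_iw
field name is an assumption here; handler name is from the text above):
	if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) &&
	    IS_ENABLED(CONFIG_PREEMPT_RT))
		rdp->defer_qs_iw =
			IRQ_WORK_INIT_HARD(rcu_preempt_deferred_qs_handler);
	else
		init_irq_work(&rdp->defer_qs_iw, rcu_preempt_deferred_qs_handler);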
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491
commit 88ca472f80604c070526eb58b977ea0a9c3c2e1f
Author: Zqiang <qiang1.zhang@intel.com>
Date: Thu, 24 Mar 2022 19:15:15 +0800
rcu: Check for successful spawn of ->boost_kthread_task
The spawning of the priority-boost kthreads can fail, improbable
though this might seem. This commit therefore refrains from attempting
to initiate RCU priority boosting when the ->boost_kthread_task pointer
is NULL.
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491
commit 90d2efe7bdbde5371b6122174af0718843f805c6
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Wed, 16 Feb 2022 09:54:56 -0800
rcu: Fix rcu_preempt_deferred_qs_irqrestore() strict QS reporting
Suppose we have a kernel built with both CONFIG_RCU_STRICT_GRACE_PERIOD=y
and CONFIG_PREEMPT=y. Suppose further that an RCU reader from which RCU
core needs a quiescent state ends in rcu_preempt_deferred_qs_irqrestore().
This function will then invoke rcu_report_qs_rdp() in order to immediately
report that quiescent state. Unfortunately, it will not have cleared
that reader's CPU's rcu_data structure's ->cpu_no_qs.b.norm field.
As a result, rcu_report_qs_rdp() will take an early exit because it
will believe that this CPU has not yet encountered a quiescent state,
and there will be no reporting of the current quiescent state.
This commit therefore causes rcu_preempt_deferred_qs_irqrestore() to
clear the ->cpu_no_qs.b.norm field before invoking rcu_report_qs_rdp().
Kudos to Boqun Feng and Neeraj Upadhyay for helping with analysis of
this issue!
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2117491
commit 3352911fa9b47a90165e5c6fed440048c55146d1
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Wed, 16 Feb 2022 16:42:07 +0100
rcu: Initialize boost kthread only for boot node prior SMP initialization
The rcu_spawn_gp_kthread() function is called as an early initcall,
which means that SMP initialization hasn't happened yet and only the
boot CPU is online. Therefore, create only the boost kthread for the
leaf node of the boot CPU.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713
commit c9515875850fefcc79492c5189fe8431e75ddec5
Author: Zqiang <qiang1.zhang@intel.com>
Date: Tue, 25 Jan 2022 10:47:44 +0800
rcu: Add per-CPU rcuc task dumps to RCU CPU stall warnings
When the rcutree.use_softirq kernel boot parameter is set to zero, all
RCU_SOFTIRQ processing is carried out by the per-CPU rcuc kthreads.
If these kthreads are being starved, quiescent states will not be
reported, which in turn means that the grace period will not end, which
can in turn trigger RCU CPU stall warnings. This commit therefore dumps
stack traces of stalled CPUs' rcuc kthreads, which can help identify
what is preventing those kthreads from running.
Suggested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713
Conflicts: A merge conflict in rcu_boost_kthread_setaffinity() of
kernel/rcu/tree_plugin.h due to the presence of a later
upstream commit 04d4e665a609 ("sched/isolation: Use single
feature type while referring to housekeeping cpumask").
commit 6a2c1d450a6a328027280a854019c55de989e14e
Author: Yury Norov <yury.norov@gmail.com>
Date: Sun, 23 Jan 2022 10:38:53 -0800
rcu: Replace cpumask_weight with cpumask_empty where appropriate
In some places, RCU code calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
the first set bit, while cpumask_weight() counts all bits unconditionally.
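For example (hypothetical call site):
	if (cpumask_weight(cm) == 0)	/* before: counts every bit */
	if (cpumask_empty(cm))		/* after: stops at the first set bit */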
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713
commit 10c535787436d62ea28156a4b91365fd89b5a432
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Fri, 21 Jan 2022 12:40:08 -0800
rcu: Don't deboost before reporting expedited quiescent state
Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx
before reporting the expedited quiescent state. Under heavy real-time
load, this can result in this function being preempted before the
quiescent state is reported, which can in turn prevent the expedited grace
period from completing. Tim Murray reports that the resulting expedited
grace periods can take hundreds of milliseconds and even more than one
second, when they should normally complete in less than a millisecond.
This was fine given that there were no particular response-time
constraints for synchronize_rcu_expedited(), as it was designed
for throughput rather than latency. However, some users now need
sub-100-millisecond response-time constraints.
This patch therefore follows Neeraj's suggestion (seconded by Tim and
by Uladzislau Rezki) of simply reversing the two operations.
Reported-by: Tim Murray <timmurray@google.com>
Reported-by: Joel Fernandes <joelaf@google.com>
Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Tim Murray <timmurray@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713
commit 5ae0f1b58b28b53f4ab3708ef9337a2665e79664
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Fri, 10 Dec 2021 13:44:17 -0800
rcu: Create and use an rcu_rdp_cpu_online()
The pattern "rdp->grpmask & rcu_rnp_online_cpus(rnp)" occurs frequently
in RCU code in order to determine whether rdp->cpu is online from an
RCU perspective. This commit therefore creates an rcu_rdp_cpu_online()
function to replace it.
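Given the quoted pattern, a plausible shape for the helper (a sketch,
not necessarily the exact upstream body):
	static bool rcu_rdp_cpu_online(struct rcu_data *rdp)
	{
		return !!(rdp->grpmask & rcu_rnp_online_cpus(rdp->mynode));
	}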
[ paulmck: Apply kernel test robot unused-variable feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2076713
Conflicts: A fuzz in rcu_boost_kthread_setaffinity() of
kernel/rcu/tree_plugin.h due to the presence of a later
upstream commit 04d4e665a609 ("sched/isolation: Use single
feature type while referring to housekeeping cpumask").
commit 218b957a6959a2fb5b3967fc824072bb89ac2611
Author: David Woodhouse <dwmw@amazon.co.uk>
Date: Wed, 8 Dec 2021 23:41:53 +0000
rcu: Add mutex for rcu boost kthread spawning and affinity setting
As we handle parallel CPU bringup, we will need to take care to avoid
spawning multiple boost threads, or race conditions when setting their
affinity. Spotted by Paul McKenney.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/671
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065222
Depends: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
Tested: Setup isolation and ran scheduler tests, checked that housekeeping
looked right (tasks offloaded from isolated cpus to HK ones etc).
Split the housekeeping flags into finer granularity in preparation
for allowing them to be configured dynamically. There should not be
much functional change.
Signed-off-by: Phil Auld <pauld@redhat.com>
Approved-by: Jiri Benc <jbenc@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Prarit Bhargava <prarit@redhat.com>
Approved-by: Paolo Bonzini <bonzini@gnu.org>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
Bugzilla: http://bugzilla.redhat.com/2065222
commit 04d4e665a60902cf36e7ad39af1179cb5df542ad
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Mon Feb 7 16:59:06 2022 +0100
sched/isolation: Use single feature type while referring to housekeeping cpumask
Refer to housekeeping APIs using single feature types instead of flags.
This prevents passing multiple isolation features at once to
housekeeping interfaces, which will soon no longer be possible as each
isolation feature will have its own cpumask.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220207155910.527133-5-frederic@kernel.org
Signed-off-by: Phil Auld <pauld@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit c2cf0767e98eb4487444e5c7ebba491a866811ce
Author: Zqiang <qiang.zhang1211@gmail.com>
Date: Mon, 15 Nov 2021 13:15:46 +0800
rcu: Avoid running boost kthreads on isolated CPUs
When the boost kthreads are created on systems with nohz_full CPUs,
the cpus_allowed_ptr is set to housekeeping_cpumask(HK_FLAG_KTHREAD).
However, when rcu_boost_kthread_setaffinity() is called, the original
affinity will be changed and these kthreads can subsequently run on
nohz_full CPUs. This commit makes rcu_boost_kthread_setaffinity()
restrict these boost kthreads to housekeeping CPUs.
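One plausible placement for the restriction, applied after the per-CPU
mask is built (a sketch; HK_FLAG_KTHREAD as named above):
	cpumask_and(cm, cm, housekeeping_cpumask(HK_FLAG_KTHREAD));
	if (cpumask_empty(cm))
		cpumask_copy(cm, housekeeping_cpumask(HK_FLAG_KTHREAD));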
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit 17ea3718824912e773b0fd78579694b2e75ee597
Author: Zhouyi Zhou <zhouzhouyi@gmail.com>
Date: Sun, 24 Oct 2021 08:36:34 +0800
rcu: Improve tree_plugin.h comments and add code cleanups
This commit cleans up some comments and code in kernel/rcu/tree_plugin.h.
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit 790da248978a0722d92d1471630c881704f7eb0d
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Wed, 29 Sep 2021 11:09:34 -0700
rcu: Make idle entry report expedited quiescent states
In non-preemptible kernels, an unfortunately timed expedited grace period
can result in the rcu_exp_handler() IPI handler setting the rcu_data
structure's cpu_no_qs.b.exp field just as the target CPU enters idle.
There are situations in which this field will not be checked until after
that CPU exits idle. The resulting grace-period latency does not qualify
as "expedited".
This commit therefore checks this field upon non-preemptible idle entry in
the rcu_preempt_deferred_qs() function. It also qualifies the rcu_core()
preempt_count() check with IS_ENABLED(CONFIG_PREEMPT_COUNT) to prevent
false-positive quiescent states from count-free kernels.
Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit 7663ad9a5dbcc27f3090e6bfd192c7e59222709f
Author: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 28 Sep 2021 10:40:21 +0200
rcu: Always inline rcu_dynticks_task*_{enter,exit}()
RCU managed to grow a few noinstr violations:
vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x0: call to rcu_dynticks_task_trace_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0xe: call to rcu_dynticks_task_trace_exit() leaves .noinstr.text section
Fix them by adding __always_inline to the relevant trivial functions.
Also replace the noinstr with __always_inline for the existing
rcu_dynticks_task_*() functions since noinstr would force noinline
them, even when empty, which seems silly.
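Illustrative diff-style sketch of the change for one of the affected
functions (names taken from the objtool warnings above):
	-static void noinstr rcu_dynticks_task_trace_enter(void)
	+static __always_inline void rcu_dynticks_task_trace_enter(void)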
Fixes: 7d0c9c50c5 ("rcu-tasks: Avoid IPIing userspace/idle tasks if kernel is so built")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit 2407a64f8045552203ee5cb9904ce75ce2fceef4
Author: Changbin Du <changbin.du@intel.com>
Date: Tue, 28 Sep 2021 08:21:28 +0800
rcu: in_irq() cleanup
This commit replaces the obsolete and ambiguous macro in_irq() with its
shiny new in_hardirq() equivalent.
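For example:
	-	if (in_irq())
	+	if (in_hardirq())
Both test for hardirq context; the new name simply says so explicitly.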
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit 6120b72e25e195b6fa15b0a674479a38166c392a
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Thu, 16 Sep 2021 14:10:48 +0200
rcu: Remove rcu_data.exp_deferred_qs and convert to rcu_data.cpu_no_qs.b.exp
Having two fields for the same purpose with subtle differences on
different RCU flavours is confusing, especially when both fields always
exist on both RCU flavours.
Fortunately, it is now safe for preemptible RCU to rely on the rcu_data
structure's ->cpu_no_qs.b.exp field, just like non-preemptible RCU.
This commit therefore removes the ad-hoc ->exp_deferred_qs field.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit 6e16b0f7bae3817ea67f4bef4f84298e880fbf66
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Thu, 16 Sep 2021 14:10:47 +0200
rcu: Move rcu_data.cpu_no_qs.b.exp reset to rcu_report_exp_rdp()
On non-preemptible RCU, move clearing of the rcu_data structure's
->cpu_no_qs.b.exp field to the actual expedited quiescent state report
function, matching how preemptible RCU handles the ->exp_deferred_qs field.
This prepares for removing ->exp_deferred_qs in favor of ->cpu_no_qs.b.exp
for both preemptible and non-preemptible RCU.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2065994
commit a4382659487f84c00b5fbb61df25a9ad59396789
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Thu, 16 Sep 2021 14:10:45 +0200
rcu: Ignore rdp.cpu_no_qs.b.exp on preemptible RCU's rcu_qs()
Preemptible RCU does not use the rcu_data structure's ->cpu_no_qs.b.exp,
instead using a separate ->exp_deferred_qs field to record the need for
an expedited quiescent state.
In fact ->cpu_no_qs.b.exp should never be set in preemptible RCU because
preemptible RCU's expedited grace periods use other mechanisms to record
quiescent states.
This commit therefore removes the implicit rcu_qs() reference to
->cpu_no_qs.b.exp in favor of a direct reference to ->cpu_no_qs.b.norm.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2059555
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=e2c73a6860bdf54f2c6bf8cddc34ddc91a1343e1
commit e2c73a6860bdf54f2c6bf8cddc34ddc91a1343e1
Author: "Paul E. McKenney" <paulmck@kernel.org>
Date: Mon, 27 Sep 2021 14:18:51 -0700
rcu: Remove the RCU_FAST_NO_HZ Kconfig option
All of the uses of CONFIG_RCU_FAST_NO_HZ=y that I have seen involve
systems with RCU callbacks offloaded. In this situation, all that this
Kconfig option does is slow down idle entry/exit with an additional
always-taken early exit. If this is the only use case, then this
Kconfig option is nothing but an attractive nuisance that needs to go away.
This commit therefore removes the RCU_FAST_NO_HZ Kconfig option.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Desnes A. Nunes do Rosario <drosario@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2022806
commit 521c89b3a4022269c75b35062358d1dae4ebfa79
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Mon, 19 Jul 2021 11:52:12 -0700
rcu: Print human-readable message for schedule() in RCU reader
The WARN_ON_ONCE() invocation within the CONFIG_PREEMPT=y version of
rcu_note_context_switch() triggers when there is a voluntary context
switch in an RCU read-side critical section, but there is quite a gap
between the output of that WARN_ON_ONCE() and this RCU-usage error.
This commit therefore converts the WARN_ON_ONCE() to a WARN_ONCE()
that explicitly describes the problem in its message.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2022806
commit fed31a4dd3adb5455df7c704de2abb639a1dc1c0
Author: Zhouyi Zhou <zhouzhouyi@gmail.com>
Date: Tue, 13 Jul 2021 08:56:45 +0800
rcu: Fix macro name CONFIG_TASKS_RCU_TRACE
This commit fixes several typos where CONFIG_TASKS_RCU_TRACE should
instead be CONFIG_TASKS_TRACE_RCU. Among other things, these typos
could cause CONFIG_TASKS_TRACE_RCU_READ_MB=y kernels to suffer from
memory-ordering bugs that could result in false-positive quiescent
states and too-short grace periods.
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2022806
commit 5fcb3a5f04ee6422714adb02f5364042228bfc2e
Author: Paul E. McKenney <paulmck@kernel.org>
Date: Thu, 20 May 2021 13:35:50 -0700
rcu: Mark accesses to ->rcu_read_lock_nesting
KCSAN flags accesses to ->rcu_read_lock_nesting as data races, but
in the past, the overhead of marked accesses was excessive. However,
that was long ago, and much has changed since then, both in terms of
hardware and of compilers. Here is data taken on an eight-core laptop
using Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz with a kernel built
using gcc version 9.3.0, with all data in nanoseconds.
Unmarked accesses (status quo), measured by three refscale runs:
Minimum reader duration: 3.286 2.851 3.395
Median reader duration: 3.698 3.531 3.4695
Maximum reader duration: 4.481 5.215 5.157
Marked accesses, also measured by three refscale runs:
Minimum reader duration: 3.501 3.677 3.580
Median reader duration: 4.053 3.723 3.895
Maximum reader duration: 7.307 4.999 5.511
This focused microbenchmark shows only sub-nanosecond differences which
are unlikely to be visible at the system level. This commit therefore
marks data-racing accesses to ->rcu_read_lock_nesting.
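For instance, the nesting increment can be written with marked accesses
(a sketch of one such site):
	WRITE_ONCE(current->rcu_read_lock_nesting,
		   READ_ONCE(current->rcu_read_lock_nesting) + 1);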
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2022806
commit dfcb27540213e8061ecffacd4bd8ed54a310a7b0
Author: Frederic Weisbecker <frederic@kernel.org>
Date: Wed, 19 May 2021 02:09:28 +0200
rcu/nocb: Start moving nocb code to its own plugin file
The kernel/rcu/tree_plugin.h file contains not only the plugins for
preemptible RCU, but also many other features including rcu_nocbs
callback offloading. This offloading has become large and complex,
so it is time to put it in its own file.
This commit starts that process.
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[ paulmck: Rename to tree_nocb.h, add Frederic as author. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2007032
commit 830e6acc8a1cafe153a0d88f9b2455965b396131
Author: Peter Zijlstra <peterz@infradead.org>
Date: Sun, 15 Aug 2021 23:27:58 +0200
locking/rtmutex: Split out the inner parts of 'struct rtmutex'
RT builds substitutions for rwsem, mutex, spinlock and rwlock around
rtmutexes. Split the inner working out so each lock substitution can use
them with the appropriate lockdep annotations. This avoids having an extra
unused lockdep map in the wrapped rtmutex.
No functional change.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211302.784739994@linutronix.de
Signed-off-by: Waiman Long <longman@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1998549
Upstream Status: linux-rcu commit fd07d7b373a8e7c8406a04b206bed89ec3cc2b52
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
Tested: Before this patch, the symbol rcu_read_unlock_strict is found
in 100s of kernel modules. After this patch, the symbol is no longer
found in any of the kernel modules.
commit fd07d7b373a8e7c8406a04b206bed89ec3cc2b52
Author: Waiman Long <longman@redhat.com>
Date: Thu, 26 Aug 2021 22:21:22 -0400
rcu: Avoid unneeded function call in rcu_read_unlock()
Since commit aa40c138cc ("rcu: Report QS for outermost PREEMPT=n
rcu_read_unlock() for strict GPs") the function rcu_read_unlock_strict()
is invoked by the inlined rcu_read_unlock() function. However,
rcu_read_unlock_strict() is an empty function in production kernels,
which are built with CONFIG_RCU_STRICT_GRACE_PERIOD=n.
There is a mention of rcu_read_unlock_strict() in the BPF verifier,
but this is in a deny-list, meaning that BPF does not care whether
rcu_read_unlock_strict() is ever called.
This commit therefore provides a slight performance improvement
by hoisting the check of CONFIG_RCU_STRICT_GRACE_PERIOD from
rcu_read_unlock_strict() into rcu_read_unlock(), thus avoiding the
pointless call to an empty function.
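A sketch of the hoisted check, assuming the non-preemptible
__rcu_read_unlock() wrapper as the call site:
	static inline void __rcu_read_unlock(void)
	{
		preempt_enable();
		if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD))
			rcu_read_unlock_strict();
	}
Because IS_ENABLED() folds to a compile-time constant, production
kernels emit no call at all.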
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Pull RCU updates from Paul McKenney:
- Bitmap parsing support for "all" as an alias for all bits
- Documentation updates
- Miscellaneous fixes, including some that overlap into mm and lockdep
- kvfree_rcu() updates
- mem_dump_obj() updates, with acks from one of the slab-allocator
maintainers
- RCU NOCB CPU updates, including limited deoffloading
- SRCU updates
- Tasks-RCU updates
- Torture-test updates
* 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (78 commits)
tasks-rcu: Make show_rcu_tasks_gp_kthreads() be static inline
rcu-tasks: Make ksoftirqd provide RCU Tasks quiescent states
rcu: Add missing __releases() annotation
rcu: Remove obsolete rcu_read_unlock() deadlock commentary
rcu: Improve comments describing RCU read-side critical sections
rcu: Create an unrcu_pointer() to remove __rcu from a pointer
srcu: Early test SRCU polling start
rcu: Fix various typos in comments
rcu/nocb: Unify timers
rcu/nocb: Prepare for fine-grained deferred wakeup
rcu/nocb: Only cancel nocb timer if not polling
rcu/nocb: Delete bypass_timer upon nocb_gp wakeup
rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup
rcu/nocb: Allow de-offloading rdp leader
rcu/nocb: Directly call __wake_nocb_gp() from bypass timer
rcu: Don't penalize priority boosting when there is nothing to boost
rcu: Point to documentation of ordering guarantees
rcu: Make rcu_gp_cleanup() be noinline for tracing
rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs
rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP
...
Replace a bunch of 'p->state == TASK_RUNNING' with a new helper:
task_is_running(p).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210611082838.222401495@infradead.org