Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
David Arcari	8b4033c281	cpufreq: intel_pstate: Make it possible to avoid enabling CAS JIRA: https://issues.redhat.com/browse/RHEL-85517 commit 7802fce7dc18394d041a1310fe4ad76120e08145 Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Date: Mon Jan 27 14:07:12 2025 +0100 cpufreq: intel_pstate: Make it possible to avoid enabling CAS Capacity-aware scheduling (CAS) is enabled by default by intel_pstate on hybrid systems without SMT, but in some usage scenarios it may be more attractive to place tasks for maximum CPU performance regardless of the extra cost in terms of energy, which is the case on such systems when CAS is not enabled, so introduce a command line option to forbid intel_pstate to enable CAS. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by:Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/2781262.mvXUDI8C0e@rjwysocki.net Signed-off-by: David Arcari <darcari@redhat.com>	2025-03-31 08:07:03 -04:00
Augusto Caringi	51bbb488a9	Merge: Scheduler updates for 9.7 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6398 JIRA: https://issues.redhat.com/browse/RHEL-78821 Proactive fixes and minor updates for scheduler related code. This includes needed commits up to v6.14-rc1. There are not as many since there are a few features upstream which we are not taking into rhel9 at this point. Signed-off-by: Phil Auld <pauld@redhat.com> Approved-by: Waiman Long <longman@redhat.com> Approved-by: Herton R. Krzesinski <herton@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: Juri Lelli <juri.lelli@redhat.com> Approved-by: Rafael Aquini <raquini@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Augusto Caringi <acaringi@redhat.com>	2025-03-12 14:53:01 -03:00
Phil Auld	37dc45d04d	sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full" JIRA: https://issues.redhat.com/browse/RHEL-78821 commit 1174b9344bc7e7989439cad207fcd94eaab028db Author: Waiman Long <longman@redhat.com> Date: Wed Oct 30 13:52:51 2024 -0400 sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full" The "isolcpus=nohz" boot parameter and flag were used to disable tick when running a single task. Nowadays, this "nohz" flag is seldom used as it is included as part of the "nohz_full" parameter. Extend this flag to cover other kernel noises disabled by the "nohz_full" parameter to make them equivalent. This also eliminates the need to use both the "isolcpus" and the "nohz_full" parameters to fully isolate a given set of CPUs. Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20241030175253.125248-3-longman@redhat.com Signed-off-by: Phil Auld <pauld@redhat.com>	2025-02-27 15:13:10 +00:00
Waiman Long	8b6c3917c0	clocksource: Scale the watchdog read retries automatically JIRA: https://issues.redhat.com/browse/RHEL-76143 Conflicts: A context diff in the include/linux/clocksource.h hunk due to the presence of later upstream commit 6b2e29977518 ("timekeeping: Provide infrastructure for converting to/from a base clock"). commit 2ed08e4bc53298db3f87b528cd804cb0cce066a9 Author: Feng Tang <feng.tang@intel.com> Date: Wed, 21 Feb 2024 14:08:59 +0800 clocksource: Scale the watchdog read retries automatically On an 8-socket server the TSC is wrongly marked as 'unstable' and disabled during boot time on about one out of 120 boot attempts: clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns, wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable tsc: Marking TSC unstable due to clocksource watchdog TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'. sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152) clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896. clocksource: Switched to clocksource hpet The reason is that for platforms with a large number of CPUs, there are sporadic big or huge read latencies while reading the watchdog/clocksource during boot or when the system is under stress work load, and the frequency and maximum value of the latency go up with the number of online CPUs. The current code already has logic to detect and filter such high latency cases by reading the watchdog twice and checking the two deltas. Due to the randomness of the latency, there is a low probability that the first delta (latency) is big, but the second delta is small and looks valid. The watchdog code retries the readouts by default twice, which is not necessarily sufficient for systems with a large number of CPUs. There is a command line parameter 'max_cswd_read_retries' which allows increasing the number of retries, but that's not user friendly as it needs to be tweaked per system. As the number of required retries is proportional to the number of online CPUs, this parameter can be calculated at runtime. Scale and enlarge the number of retries according to the number of online CPUs and remove the command line parameter completely. [ tglx: Massaged change log and comments ] Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jin Wang <jin1.wang@intel.com> Tested-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240221060859.1027450-1-feng.tang@intel.com Signed-off-by: Waiman Long <longman@redhat.com>	2025-02-04 13:20:56 -05:00
Patrick Talbert	4003ae72c9	Merge: Preparatory patches for TDX support in KVM MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6045 # Merge Request Required Information ## Summary of Changes Backport more patches, mostly from 6.12, that are needed to enable TDX support in KVM. These prerequisites are less self contained, but are enough to have a mostly conflict-free TDX backport. ## Approved Development Ticket(s) All submissions to CentOS Stream must reference a ticket in [Red Hat Jira](https://issues.redhat.com/). ``` JIRA: https://issues.redhat.com/browse/RHEL-71541 Depends: https://issues.redhat.com/browse/RHEL-64444 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Omitted-fix: 3f749befb0998472470d850b11b430477c0718cc (irrelevant series of changes for odd Kconfigs) Omitted-fix: ea4290d77bda2bd1f173a86f07aa79b568e0a6f8 (irrelevant series of changes for odd Kconfigs) Omitted-fix: 2a5fe5a01668e831af1de3951718fbf88b9a9b9c (irrelevant series of changes for odd Kconfigs) Omitted-fix: 338b655a1178900ac05aca7ac66dc28b05100430 (irrelevant series of changes for odd Kconfigs) Omitted-fix: 341e4023032fba6c02326bfc6babd63ef4039712 (irrelevant series of changes for odd Kconfigs) Omitted-fix: 1331343af6f502aecd274d522dd34bf7c965f484 (irrelevant series of changes for odd Kconfigs) Omitted-fix: 9ee62c33c0fe017ee02501a877f6f562363122fa (irrelevant series of changes for odd Kconfigs) Omitted-fix: 2a5fe5a01668e831af1de3951718fbf88b9a9b9c (irrelevant series of changes for odd Kconfigs) Omitted-fix: d822ca29a4fc5278fb511790dace44836e8cc40d (can be backported via perf) Omitted-fix: 979956bc681105f34642971448c4cda048954a07 (irrelevant with RHEL gcc) Omitted-fix: e120829dbf927c8b93cd5e06acfec0332cc82e02 (can be backported via perf) ``` Approved-by: Vitaly Kuznetsov <vkuznets@redhat.com> Approved-by: Steve Best <sbest@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Patrick Talbert <ptalbert@redhat.com>	2025-01-27 15:24:23 +01:00
Paolo Bonzini	b02548e87e	KVM: Add a module param to allow enabling virtualization when KVM is loaded JIRA: https://issues.redhat.com/browse/RHEL-71541 Add an on-by-default module param, enable_virt_at_load, to let userspace force virtualization to be enabled in hardware when KVM is initialized, i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization during KVM initialization allows userspace to avoid the additional latency when creating/destroying the first/last VM (or more specifically, on the 0=>1 and 1=>0 edges of creation/destruction). Now that KVM uses the cpuhp framework to do per-CPU enabling, the latency could be non-trivial as the cpuhp bringup/teardown is serialized across CPUs, e.g. the latency could be problematic for use cases that need to spin up VMs quickly. Prior to commit `10474ae894` ("KVM: Activate Virtualization On Demand"), KVM _unconditionally_ enabled virtualization during load, i.e. there's no fundamental reason KVM needs to dynamically toggle virtualization. These days, the only known argument for not enabling virtualization is to allow KVM to be autoloaded without blocking other out-of-tree hypervisors, and such use cases can simply change the module param, e.g. via command line. Note, the aforementioned commit also mentioned that enabling SVM (AMD's virtualization extensions) can result in "using invalid TLB entries". It's not clear whether the changelog was referring to a KVM bug, a CPU bug, or something else entirely. Regardless, leaving virtualization off by default is not a robust "fix", as any protection provided is lost the instant userspace creates the first VM. Reviewed-by: Chao Gao <chao.gao@intel.com> Acked-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Tested-by: Farrah Chen <farrah.chen@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240830043600.127750-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit b4886fab6fb620b96ad7eeefb9801c42dfa91741) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-12-17 14:42:06 +01:00
Rado Vrbovsky	4b9fce484f	Merge: mm: proactive fixes for RHEL-9.6 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5812 JIRA: https://issues.redhat.com/browse/RHEL-27745 JIRA: https://issues.redhat.com/browse/RHEL-15601 JIRA: https://issues.redhat.com/browse/RHEL-28873 JIRA: https://issues.redhat.com/browse/RHEL-54929 JIRA: https://issues.redhat.com/browse/RHEL-61137 JIRA: https://issues.redhat.com/browse/RHEL-62336 JIRA: https://issues.redhat.com/browse/RHEL-66627 JIRA: https://issues.redhat.com/browse/RHEL-66794 JIRA: https://issues.redhat.com/browse/RHEL-66818 JIRA: https://issues.redhat.com/browse/RHEL-66950 JIRA: https://issues.redhat.com/browse/RHEL-66977 JIRA: https://issues.redhat.com/browse/RHEL-68011 JIRA: https://issues.redhat.com/browse/RHEL-68909 JIRA: https://issues.redhat.com/browse/RHEL-69683 JIRA: https://issues.redhat.com/browse/RHEL-70053 CVE: CVE-2023-52490 CVE: CVE-2024-42316 CVE: CVE-2024-50182 CVE: CVE-2024-50199 CVE: CVE-2024-50200 CVE: CVE-2024-50219 CVE: CVE-2024-50228 CVE: CVE-2024-50272 CVE: CVE-2024-53097 CVE: CVE-2024-53105 CVE: CVE-2024-53136 This set proactively brings into RHEL9 core MM code a set of follow-up fixes as they were pushed into upstream's stable v6.6 LTS branch, but Mainline commits are backported instead in order to keep it easy to track the RHEL backports against upstream. Dependencies were also selectively backported where it made sense to do so, and all the selected commits are sorted in upstream's topological order. Omitted-fix: c567f2948f57 ("Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."") Omitted-fix: 4b944f8ef996 ("Revert "mm/filemap: avoid buffered read/write race to read inconsistent data"") Omitted-fix: 9d08ec41a064 ("mm: allow set/clear page_type again") Omitted-fix: cc9bc36ebef7 ("mm: zswap: remove nr_zswap_stored atomic") Omitted-fix: 0e4008447242 ("zswap: track swapins from disk more accurately") Omitted-fix: 6359c39c9de6 ("mm: remove unused hugepage for vma_alloc_folio()") Omitted-fix: 9b5c87d47949 ("mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount") Omitted-fix: 1390a3334a48 ("mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio") Omitted-fix: f708f6970cc9 ("mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio") Omitted-fix: 4de22b2a6a74 ("mm: open-code PageTail in folio_flags() and const_folio_flags()") Omitted-fix: 6a7de1bf218d ("mm: open-code page_folio() in dump_page()") Omitted-fix: 40a024b81d1c ("ALSA: core: Drop superfluous no_free_ptr() for memdup_user() errors") Omitted-fix: 9d197b627e5f ("docs/zh_CN: update the translation of mm/page_table_check.rst") Omitted-fix: ce8f9fb651fa ("comedi: Flush partial mappings in error case") Signed-off-by: Rafael Aquini <raquini@redhat.com> Approved-by: Phil Auld <pauld@redhat.com> Approved-by: Herton R. Krzesinski <herton@redhat.com> Approved-by: Jerry Snitselaar <jsnitsel@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: Steve Best <sbest@redhat.com> Approved-by: John W. Linville <linville@redhat.com> Approved-by: Mark Langsdorf <mlangsdo@redhat.com> Approved-by: Jocelyn Falempe <jfalempe@redhat.com> Approved-by: Lucas Zampieri <lzampier@redhat.com> Approved-by: Ivan Vecera <ivecera@redhat.com> Approved-by: Gavin Shan <gshan@redhat.com> Approved-by: Andrea Claudi <aclaudi@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-12-16 19:49:11 +00:00
Rafael Aquini	c8c9c0b259	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER JIRA: https://issues.redhat.com/browse/RHEL-27745 Conflicts: * arch/*/Kconfig: all hunks dropped as there were only text blurbs and comments being changed with no functional changes whatsoever, and RHEL9 is missing several (unrelated) commits to these arches that transform the text blurbs in the way these non-functional hunks were expecting; * drivers/accel/qaic/qaic_data.c: hunk dropped due to RHEL-only commit `083c0cdce2` ("Merge DRM changes from upstream v6.8..v6.9"); * drivers/gpu/drm/i915/gem/selftests/huge_pages.c: hunk dropped due to RHEL-only commit `ca8b16c11b` ("Merge DRM changes from upstream v6.7..v6.8"); * drivers/gpu/drm/ttm/tests/ttm_pool_test.c: all hunks dropped due to RHEL-only commit `ca8b16c11b` ("Merge DRM changes from upstream v6.7..v6.8"); * drivers/video/fbdev/vermilion/vermilion.c: hunk dropped as RHEL9 misses commit `dbe7e429fe` ("vmlfb: framebuffer driver for Intel Vermilion Range"); * include/linux/pageblock-flags.h: differences due to out-of-order backport of upstream commits 72801513b2bf ("mm: set pageblock_order to HPAGE_PMD_ORDER in case with !CONFIG_HUGETLB_PAGE but THP enabled"), and 3a7e02c040b1 ("minmax: avoid overly complicated constant expressions in VM code"); * mm/mm_init.c: differences on the 3rd, and 4th hunks are due to RHEL backport commit `1845b92dcf` ("mm: move most of core MM initialization to mm/mm_init.c") ignoring the out-of-order backport of commit 3f6dac0fd1b8 ("mm/page_alloc: make deferred page init free pages in MAX_ORDER blocks") thus partially reverting the changes introduced by the latter; This patch is a backport of the following upstream commit: commit 5e0a760b44417f7cadd79de2204d6247109558a0 Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Date: Thu Dec 28 17:47:04 2023 +0300 mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rafael Aquini <raquini@redhat.com>	2024-12-09 12:24:17 -05:00
Rado Vrbovsky	191f608532	Merge: PCI: ACS updates MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5246 ``` JIRA: https://issues.redhat.com/browse/RHEL-48601 Signed-off-by: Myron Stowe <mstowe@redhat.com> ``` Approved-by: John W. Linville <linville@redhat.com> Approved-by: David Arcari <darcari@redhat.com> Approved-by: Steve Best <sbest@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-12-09 08:21:20 +00:00
Rado Vrbovsky	492f67b5c3	Merge: [RHEL 9.6] Update core Arm code MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5092 JIRA: https://issues.redhat.com/browse/RHEL-40604 Depends: !5252 Omitted-fix: b8995a184170 Revert "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD" Omitted-fix: f481bb32d60e Reapply "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD" Don't need to revert then revert the revert. Omitted-fix: cb1a393c40ee mm: add arch hook to validate mmap() prot flags Omitted-fix: 50e3ed0f93f4 arm64: mm: add support for WXN memory translation attribute These get reverted. Omitted-fix: a07a59415217 arm64: smp: avoid NMI IPIs with broken MediaTek FW Omitted-fix: 4bb49009e071 Revert "arm64: smp: avoid NMI IPIs with broken MediaTek FW" Backport selected patches through upstream 6.9, including: - bug fixes - various cpu feature detection enhancements - save/restore fpsimd state on context switch - ARM Cortex-A510 erratum 3117295 workaround - LPA2 related patch series. Not to enable LPA2 but for cleanup of startup code Signed-off-by: Mark Salter <msalter@redhat.com> Approved-by: Gavin Shan <gshan@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: Rafael Aquini <raquini@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-12-09 08:21:13 +00:00
Rado Vrbovsky	05df4237af	Merge: USB/TBT code rebase of supported drivers to upstream v6.11 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5592 JIRA: https://issues.redhat.com/browse/RHEL-59051 CVE: CVE-2024-44960 CVE JIRA: https://issues.redhat.com/browse/RHEL-57138 CVE: CVE-2024-46675 CVE JIRA: https://issues.redhat.com/browse/RHEL-64322 This MR rebases supported USB/TBT drivers to upstream kernel v6.11. By design, changes on this rebase are limited to supported USB/Thunderbolt drivers and infrastructure. Changes which happen to touch the drivers but are tree-wide are selectively or partially pulled in, whenever relevant. Notes: I) Omits: Omitted-fix: aefa036be8c2 ("phy: freescale: imx8qm-hsio: Include bitfield.h for FIELD_PREP") Omitted-fix: 2d6213bd592b ("crypto: spacc - Add ifndef around MIN") Omitted-fix: b8fc70ab7b5f ("Revert "crypto: spacc - Add SPAcc Skcipher support") Omitted-fix: bf791751162a ("thunderbolt: Add only on-board retimers when !CONFIG_USB4_DEBUGFS_MARGINING") II) This MR drops `rtsx_pci_ms` driver because it became dead code with commit <c0e5f4e73a71> ("misc: rtsx: Add support for RTS5261"), which as a consequence was later dropped on commit <d0f459259c13> ("memstick: rtsx_pci_ms: Remove Realtek PCI memstick driver"). The latter is being merged here. III) This MR also includes minmax updates to fix these build and test errors: 1 - Signedness error: ``` drivers/usb/typec/ucsi/ucsi.c: In function 'ucsi_get_pd_message': ./include/linux/build_bug.h:78:41: error: static assertion failed: "min(bytes, (((con->ucsi)->version < 0x0200) ? 0x10 : 0xff)) signedness error, fix types or consider umin() before min_t()" 78 \| #define __static_assert(expr, msg, ...) _Static_assert(expr, msg) ``` 2 - ISO C90 error: ``` drivers/scsi/Makefile:196: FORCE prerequisite is missing lib/vsprintf.c: In function 'resource_string': lib/vsprintf.c:1068:9: error: ISO C90 forbids variable length array 'sym' [-Werror=vla] 1068 \| char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE, \| ^~~~ ``` 3 - Oops on drm_gem_shmem CKI testing: ``` Unable to handle kernel paging request at virtual address ffffffff80000000 ... Internal error: Oops: 0000000096000146 [#1] SMP ... drm_gem_shmem_test_obj_create_private+0x1cc/0x41c [drm_gem_shmem_test] ... # drm_gem_shmem_test_obj_create_private: try faulted: last line seen drivers/gpu/drm/tests/drm_gem_shmem_test.c:120 # drm_gem_shmem_test_obj_create_private: internal error occurred preventing test case from running: -4 ``` Signed-off-by: Desnes Nunes <desnesn@redhat.com> Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com> Approved-by: Bastien Nocera <bnocera@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: Rafael Aquini <raquini@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: Ivan Vecera <ivecera@redhat.com> Approved-by: David Arcari <darcari@redhat.com> Approved-by: Eric Chanudet <echanude@redhat.com> Approved-by: Adam Jackson <ajax@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-11-25 13:17:44 +00:00
Rado Vrbovsky	993b335734	Merge: Update arch/{x86,powerpc,arm64}/mm to v6.6 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5391 JIRA: https://issues.redhat.com/browse/RHEL-55461 JIRA: https://issues.redhat.com/browse/RHEL-55465 JIRA: https://issues.redhat.com/browse/RHEL-55462 Depends: !5252 Updated the respective arch mm directories to v6.6. Most of the patches have already been updated or included by the respective arch teams and by Rafael's mm update to v6.6. Dropped the following to avoid issues with the ppc64le build: 41b7a347bf14 powerpc: Book3S 64-bit outline-only KASAN support c7b9ed7c34a9 powerpc/64e: KASAN Full support for BOOK3E/64 Omitted-fix: 7bd6680b47fa Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"" Omitted-fix: 7b59e8ae92fe arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices Omitted-fix: a54b7fa6b9ab arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor Omitted-fix: 9a5f0b11e49e arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP Omitted-fix: cd87d9f58439 x86/mm: further clarify switch_mm_irqs_off() documentation Signed-off-by: Audra Mitchell <audra@redhat.com> Approved-by: Rafael Aquini <raquini@redhat.com> Approved-by: Vladis Dronov <vdronov@redhat.com> Approved-by: Herton R. Krzesinski <herton@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: Nico Pache <npache@redhat.com> Approved-by: Lenny Szubowicz <lszubowi@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-11-12 08:02:20 +00:00
Desnes Nunes	f10fe0c8b9	usb-storage: Optimize scan delay more precisely JIRA: https://issues.redhat.com/browse/RHEL-59051 commit 804da867ad016d53bf33373cfeaae041775455f1 Author: Norihiko Hama <Norihiko.Hama@alpsalpine.com> Date: Wed, 15 May 2024 09:43:39 +0900 Current storage scan delay is reduced by the following old commit. `a4a47bc03f` ("Lower USB storage settling delay to something more reasonable") It means that delay is at least 'one second', or zero with delay_use=0. 'one second' is still long delay especially for embedded system but when delay_use is set to 0 (no delay), still error observed on some USB drives. So delay_use should not be set to 0 but 'one second' is quite long. Especially for embedded system, it's important for end user how quickly access to USB drive when it's connected. That's why we have a chance to minimize such a constant long delay. This patch optimizes scan delay more precisely to minimize delay time but not to have any problems on USB drives by extending module parameter 'delay_use' in milliseconds internally. The parameter 'delay_use' optionally supports in milliseconds if it ends with 'ms'. It makes the range of value to 1 / 1000 in internal 32-bit value but it's still enough to set the delay time. By default, delay time is 'one second' for backward compatibility. For example, it seems to be good by changing delay_use=100ms, that is 100 millisecond delay without issues for most USB pen drives. Signed-off-by: Norihiko Hama <Norihiko.Hama@alpsalpine.com> Link: https://lore.kernel.org/r/20240515004339.29892-1-Norihiko.Hama@alpsalpine.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Desnes Nunes <desnesn@redhat.com>	2024-11-07 23:01:28 -03:00
Jerry Snitselaar	ceab946260	iommu/amd: Add kernel parameters to limit V1 page-sizes JIRA: https://issues.redhat.com/browse/RHEL-61942 Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git commit f0295913c4b4f377c454e06f50c1a04f2f80d9df Author: Joerg Roedel <jroedel@suse.de> Date: Thu Sep 5 09:22:40 2024 +0200 iommu/amd: Add kernel parameters to limit V1 page-sizes Add two new kernel command line parameters to limit the page-sizes used for v1 page-tables: nohugepages - Limits page-sizes to 4KiB v2_pgsizes_only - Limits page-sizes to 4KiB/2MiB/1GiB; The same as the sizes used with v2 page-tables This is needed for multiple scenarios. When assigning devices to SEV-SNP guests the IOMMU page-sizes need to match the sizes in the RMP table, otherwise the device will not be able to access all shared memory. Also, some ATS devices do not work properly with arbitrary IO page-sizes as supported by AMD-Vi, so limiting the sizes used by the driver is a suitable workaround. All-in-all, these parameters are only workarounds until the IOMMU core and related APIs gather the ability to negotiate the page-sizes in a better way. Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240905072240.253313-1-joro@8bytes.org (cherry picked from commit f0295913c4b4f377c454e06f50c1a04f2f80d9df) Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>	2024-11-04 08:57:30 -07:00
Audra Mitchell	d14eefe788	powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults JIRA: https://issues.redhat.com/browse/RHEL-55462 This patch is a backport of the following upstream commit: commit 6b34a099faa123488b13caf704562f4dbe483fc4 Author: Nicholas Piggin <npiggin@gmail.com> Date: Mon Oct 24 13:01:50 2022 +1000 powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults This option increases the number of hash misses by limiting the number of kernel HPT entries, by keeping a per-CPU record of the last kernel HPTEs installed, and removing that from the hash table on the next hash insertion. A timer round-robins CPUs removing remaining kernel HPTEs and clearing the TLB (in the case of bare metal) to increase and slightly randomise kernel fault activity. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add comment about NR_CPUS usage, fixup whitespace] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221024030150.852517-1-npiggin@gmail.com Signed-off-by: Audra Mitchell <audra@redhat.com>	2024-11-04 09:14:16 -05:00
Rado Vrbovsky	8d10957dfa	Merge: kvm/aarch64: rhel9.6 rebase upto 6.11 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5430 JIRA: https://issues.redhat.com/browse/RHEL-57113 Upstream Status: up to v6.11 and fixes up to v6.12-rc5 \ Tested: kvm-unit-tests, kselftest, migration test. This is the first round rebase kvm-arm up to v6.11, which contains the below series: 1. KVM: arm64: pKVM host proxy FF-A fixes (part of them) 2. KVM: arm64: nv: Shadow stage-2 page table handling 3. KVM: arm64: Allow userspace to modify CTR_EL0 4. KVM: arm64: nv: FPSIMD/SVE, plus some other CPTR goodies 5. KVM: arm64: fix warnings in W=1 build 6. Misc commits Besides that, it also takes the fixes commit `4155539bc5ba ("KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set")` which up to v6.12-rc1. * 42fb33dde42b KVM: arm64: Use FF-A 1.1 with pKVM \ This commit belongs to the series 1, don't pick it because downstream doesn't support FF-A 1.1 (The related upstream commit is `1609626c32c4 ("firmware: arm_ffa: Update the FF-A command list with v1.1 additions")`). This `KVM: arm64: Fix handling of TCR2_EL1` series can be taken by kvm-arm rebase but since it depends on the arm64 rebase, so will pick them in the second round when the arm64 rebase being merged. * 838d992b8448 KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() \ Don't pick this commit since downstream doesn't support bitmap_gather(). Changelog: \ v2 -> v3: \ Add commits: * eb9d53d4a949 KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity * cb52b5c8b81b Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity" * 810ecbefdd54 KVM: Documentation: Correct the VGIC V2 CPU interface addr space size * 03bd36a387b8 KVM: Documentation: Enumerate allowed value macros of irq_type * ae8f8b376102 KVM: arm64: Unregister redistributor for failed vCPU creation * c6c167afa090 KVM: arm64: Fix shift-out-of-bounds bug * 78a005555500 KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration v1 -> v2: \ Add those two commits to avoid conflicts when backport `894376385a2d KVM: arm64: Add support for FFA_PARTITION_INFO_GET`. * 3fad96e9b21b ("firmware: arm_ffa: Declare ffa_bus_type structure in the header") * 989e8661dc45 ("firmware: arm_ffa: Make ffa_bus_type const") Add commits: * b26e484b8bb3 ("arm64: Add CFI error handling") * 7a928b32f1de arm64: Introduce esr_brk_comment, esr_is_cfi_brk * 8f3873a39529 KVM: arm64: Introduce print_nvhe_hyp_panic helper * eca4ba5b6dff KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2 Add commits: * f26a525b77e0 KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer * a1d402abf8e3 KVM: arm64: Fix kvm_has_feat''() handling of negative features 78fee4198bb4 KVM: arm64: Fix __pkvm_init_vcpu cptr_el2 error path * a9f41588a902 KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM * dc0dddb1d66d KVM: arm64: Invalidate EL1&0 TLB entries for all VMIDs in nvhe hyp init * ed49fe5a6fb9 KVM: arm64: Ensure TLBI uses correct VMID after changing context * e0b7de4fd18c KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging * ae41d7dbaeb4 KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE * 38753cbc4dca KVM: arm64: Move data barrier to end of split walk Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Approved-by: Gavin Shan <gshan@redhat.com> Approved-by: Sebastian Ott <sebott@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-11-01 08:10:52 +00:00
Mark Salter	90e536e7f6	arm64: Add the arm64.no32bit_el0 command line option JIRA: https://issues.redhat.com/browse/RHEL-40604 commit 1279e8d0dcead53cf1f51e926a1cf6d2a79332d6 Author: Andrea della Porta <andrea.porta@suse.com> Date: Mon, 29 Apr 2024 12:28:33 +0200 Introducing the field 'el0' to the idreg-override for register ID_AA64PFR0_EL1. This field is also aliased to the new kernel command line option 'arm64.no32bit_el0' as a more recognizable and mnemonic name to disable the execution of 32 bit userspace applications (i.e. avoid Aarch32 execution state in EL0) from kernel command line. Link: https://lore.kernel.org/all/20240207105847.7739-1-andrea.porta@suse.com/ Signed-off-by: Andrea della Porta <andrea.porta@suse.com> Link: https://lore.kernel.org/r/20240429102833.6426-1-andrea.porta@suse.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Mark Salter <msalter@redhat.com>	2024-10-31 10:42:52 -04:00
Shaoqin Huang	282b3b61c1	KVM: arm64: Add early_param to control WFx trapping JIRA: https://issues.redhat.com/browse/RHEL-57113 Conflicts: - Documentation/admin-guide/kernel-parameters.txt Contextual conflicts due to missing commit 600716592a3a ("doc: Add EARLY flag to early-parsed kernel boot parameters"). commit 0b5afe05377d7993f19292bf49dd13e959000790 Author: Colton Lewis <coltonlewis@google.com> Date: Thu May 23 17:40:55 2024 +0000 KVM: arm64: Add early_param to control WFx trapping Add an early_params to control WFI and WFE trapping. This is to control the degree guests can wait for interrupts on their own without being trapped by KVM. Options for each param are trap and notrap. trap enables the trap. notrap disables the trap. Note that when enabled, traps are allowed but not guaranteed by the CPU architecture. Absent an explicitly set policy, default to current behavior: disabling the trap if only a single task is running and enabling otherwise. Signed-off-by: Colton Lewis <coltonlewis@google.com> Reviewed-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20240523174056.1565133-1-coltonlewis@google.com [ oliver: rework kvm_vcpu_should_clear_tw*() for readability ] Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Shaoqin Huang <shahuang@redhat.com>	2024-10-28 04:37:46 -04:00
Rado Vrbovsky	d2bd7080ef	Merge: Sched: Updates and fixes for 9.6 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5250 JIRA: https://issues.redhat.com/browse/RHEL-56494 JIRA: https://issues.redhat.com/browse/RHEL-57142 CVE: CVE-2024-44958 Tested: Ran scheduler tests and general stress testing. Have asked perf QE for sanity tests. Omitted-fix: c049acee3c71 ("selftests/ftrace: Fix test to handle both old and new kernels"): Somewhat out of scope for this MR and should not need to run test against old kernels in RHEL. Series of scheduler related fixes and updates, up to v6.11. A large number of these are refactoring (making naming consistent, breaking out code into new files etc) with no functional changes. Otherwise, primarily bug fixes and cleanups, no real feature additions. Signed-off-by: Phil Auld <pauld@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: Mark Langsdorf <mlangsdo@redhat.com> Approved-by: Juri Lelli <juri.lelli@redhat.com> Approved-by: Eric Chanudet <echanude@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-10-25 16:52:35 +00:00
Rado Vrbovsky	16bf54f108	Merge: Fix RCUC latency issue MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5165 JIRA: https://issues.redhat.com/browse/RHEL-20288 Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Leonardo Bras <leobras@redhat.com> Approved-by: Wander Lairson Costa <wander@redhat.com> Approved-by: Waiman Long <longman@redhat.com> Approved-by: Marcelo Tosatti <mtosatti@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-10-25 16:26:53 +00:00
Rado Vrbovsky	d30d477e21	Merge: rcu: Backport upstream RCU commits up to v6.10 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5074 JIRA: https://issues.redhat.com/browse/RHEL-55557 This MR backports upstream RCU commits up to v6.10 with relevant bug fixes, if applicable. Signed-off-by: Waiman Long <longman@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: Phil Auld <pauld@redhat.com> Approved-by: Wander Lairson Costa <wander@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-10-25 16:11:27 +00:00
Leonardo Bras	483ecb54c6	rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter JIRA: https://issues.redhat.com/browse/RHEL-20288 commit 68d124b0999919015e6d23008eafea106ec6bb40 Author: Paul E. McKenney <paulmck@kernel.org> Date: 2024-05-08 20:11:58 -0700 rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter If a CPU is running either a userspace application or a guest OS in nohz_full mode, it is possible for a system call to occur just as an RCU grace period is starting. If that CPU also has the scheduling-clock tick enabled for any reason (such as a second runnable task), and if the system was booted with rcutree.use_softirq=0, then RCU can add insult to injury by awakening that CPU's rcuc kthread, resulting in yet another task and yet more OS jitter due to switching to that task, running it, and switching back. In addition, in the common case where that system call is not of excessively long duration, awakening the rcuc task is pointless. This pointlessness is due to the fact that the CPU will enter an extended quiescent state upon returning to the userspace application or guest OS. In this case, the rcuc kthread cannot do anything that the main RCU grace-period kthread cannot do on its behalf, at least if it is given a few additional milliseconds (for example, given the time duration specified by rcutree.jiffies_till_first_fqs, give or take scheduling delays). This commit therefore adds a rcutree.nohz_full_patience_delay kernel boot parameter that specifies the grace period age (in milliseconds, rounded to jiffies) before which RCU will refrain from awakening the rcuc kthread. Preliminary experimentation suggests a value of 1000, that is, one second. Increasing rcutree.nohz_full_patience_delay will increase grace-period latency and in turn increase memory footprint, so systems with constrained memory might choose a smaller value. 
Systems with less-aggressive OS-jitter requirements might choose the default value of zero, which keeps the traditional immediate-wakeup behavior, thus avoiding increases in grace-period latency. [ paulmck: Apply Leonardo Bras feedback. ] Link: https://lore.kernel.org/all/20240328171949.743211-1-leobras@redhat.com/ Reported-by: Leonardo Bras <leobras@redhat.com> Suggested-by: Leonardo Bras <leobras@redhat.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com>	2024-10-08 18:52:03 -03:00
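The boot parameters named in the message above are passed on the kernel command line. As a rough illustration only, the helper below pulls one parameter's value out of a command-line string, so a script can check what a kernel was booted with. The function name `cmdline_value`, the `nohz_full=2-7` CPU list, and the example string are hypothetical; only the parameter names `rcutree.use_softirq` and `rcutree.nohz_full_patience_delay` and the suggested value of 1000 come from the commit message.

```shell
#!/bin/sh
# Sketch: look up one "name=value" parameter in a kernel command line.
# Usage: cmdline_value NAME CMDLINE
# Prints the value and returns 0, or returns 1 if NAME is absent.
cmdline_value() {
    name=$1
    set -f                      # keep words such as "*" from glob-expanding
    for word in $2; do
        case $word in
            "$name="*)
                set +f
                printf '%s\n' "${word#*=}"   # strip everything up to the first "="
                return 0
                ;;
        esac
    done
    set +f
    return 1
}

# Hypothetical command line; on a live system use "$(cat /proc/cmdline)" instead.
example_cmdline='ro nohz_full=2-7 rcutree.use_softirq=0 rcutree.nohz_full_patience_delay=1000'

cmdline_value rcutree.nohz_full_patience_delay "$example_cmdline"
```

The match is on the exact parameter name followed by `=`, so asking for `nohz_full` does not accidentally pick up `rcutree.nohz_full_patience_delay`.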
Phil Auld	d3ffc226fc	sched/core: Drop spinlocks on contention iff kernel is preemptible JIRA: https://issues.redhat.com/browse/RHEL-56494 Conflicts: Minor context differences. commit c793a62823d1ce8f70d9cfc7803e3ea436277cda Author: Sean Christopherson <seanjc@google.com> Date: Mon May 27 17:34:48 2024 -0700 sched/core: Drop spinlocks on contention iff kernel is preemptible Use preempt_model_preemptible() to detect a preemptible kernel when deciding whether or not to reschedule in order to drop a contended spinlock or rwlock. Because PREEMPT_DYNAMIC selects PREEMPTION, kernels built with PREEMPT_DYNAMIC=y will yield contended locks even if the live preemption model is "none" or "voluntary". In short, make kernels with dynamically selected models behave the same as kernels with statically selected models. Somewhat counter-intuitively, NOT yielding a lock can provide better latency for the relevant tasks/processes. E.g. KVM x86's mmu_lock, a rwlock, is often contended between an invalidation event (takes mmu_lock for write) and a vCPU servicing a guest page fault (takes mmu_lock for read). For _some_ setups, letting the invalidation task complete even if there is mmu_lock contention provides lower latency for all tasks, i.e. the invalidation completes sooner and the vCPU services the guest page fault sooner. But even KVM's mmu_lock behavior isn't uniform, e.g. the "best" behavior can vary depending on the host VMM, the guest workload, the number of vCPUs, the number of pCPUs in the host, why there is lock contention, etc. In other words, simply deleting the CONFIG_PREEMPTION guard (or doing the opposite and removing contention yielding entirely) needs to come with a big pile of data proving that changing the status quo is a net positive. Opportunistically document this side effect of preempt=full, as yielding contended spinlocks can have significant, user-visible impact. 
Fixes: c597bfddc9e9 ("sched: Provide Kconfig support for default dynamic preempt mode") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Link: https://lore.kernel.org/kvm/ef81ff36-64bb-4cfe-ae9b-e3acf47bff24@proxmox.com Signed-off-by: Phil Auld <pauld@redhat.com>	2024-09-23 13:33:03 -04:00
Phil Auld	14a470e760	sched/pelt: Remove shift of thermal clock JIRA: https://issues.redhat.com/browse/RHEL-56494 commit 97450eb909658573dcacc1063b06d3d08642c0c1 Author: Vincent Guittot <vincent.guittot@linaro.org> Date: Tue Mar 26 10:16:16 2024 +0100 sched/pelt: Remove shift of thermal clock The optional shift of the clock used by thermal/hw load avg has been introduced to handle case where the signal was not always a high frequency hw signal. Now that cpufreq provides a signal for firmware and SW pressure, we can remove this exception and always keep this PELT signal aligned with other signals. Mark sysctl_sched_migration_cost boot parameter as deprecated Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Qais Yousef <qyousef@layalina.io> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20240326091616.3696851-6-vincent.guittot@linaro.org Signed-off-by: Phil Auld <pauld@redhat.com>	2024-09-23 13:33:02 -04:00
Myron Stowe	4eeffc7615	PCI: Extend ACS configurability JIRA: https://issues.redhat.com/browse/RHEL-48601 Upstream Status: 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25 commit 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25 Author: Vidya Sagar <vidyas@nvidia.com> Date: Tue Jun 25 21:01:50 2024 +0530 PCI: Extend ACS configurability PCIe ACS settings control the level of isolation and the possible P2P paths between devices. With greater isolation the kernel will create smaller iommu_groups and with less isolation there is more HW that can achieve P2P transfers. From a virtualization perspective all devices in the same iommu_group must be assigned to the same VM as they lack security isolation. There is no way for the kernel to automatically know the correct ACS settings for any given system and workload. Existing command line options (e.g., disable_acs_redir) allow only for large scale change, disabling all isolation, but this is not sufficient for more complex cases. Add a kernel command-line option 'config_acs' to directly control all the ACS bits for specific devices, which allows the operator to setup the right level of isolation to achieve the desired P2P configuration. The definition is future proof; when new ACS bits are added to the spec the open syntax can be extended. ACS needs to be setup early in the kernel boot as the ACS settings affect how iommu_groups are formed. iommu_group formation is a one time event during initial device discovery, so changing ACS bits after kernel boot can result in an inaccurate view of the iommu_groups compared to the current isolation configuration. ACS applies to PCIe Downstream Ports and multi-function devices. The default ACS settings are strict and deny any direct traffic between two functions. This results in the smallest iommu_group the HW can support. Frequently these values result in slow or non-working P2PDMA. ACS offers a range of security choices controlling how traffic is allowed to go directly between two devices. 
Some popular choices: - Full prevention - Translated requests can be direct, with various options - Asymmetric direct traffic, A can reach B but not the reverse - All traffic can be direct Along with some other less common ones for special topologies. The intention is that this option would be used with expert knowledge of the HW capability and workload to achieve the desired configuration. Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [bhelgaas: add example, tidy printk formats] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-09-19 14:13:25 -06:00
Thomas Huth	ba994843de	docs: move s390 under arch JIRA: https://issues.redhat.com/browse/RHEL-54248 commit 37002bc6b6039e1491140869c6801e0a2deee43e Author: Costa Shulyupin <costa.shul@redhat.com> Date: Tue Jul 18 07:55:02 2023 +0300 docs: move s390 under arch and fix all in-tree references. Architecture-specific documentation is being moved into Documentation/arch/ as a way of cleaning up the top-level documentation directory and making the docs hierarchy more closely match the source hierarchy. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230718045550.495428-1-costa.shul@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Conflicts: Documentation/admin-guide/kernel-parameters.txt Documentation/arch/index.rst MAINTAINERS (contextual conflicts due to missing other patches in downstream) Signed-off-by: Thomas Huth <thuth@redhat.com>	2024-09-06 17:33:51 +02:00
Thomas Huth	1bbaf4f572	s390/con3215: Drop console data printout when buffer full JIRA: https://issues.redhat.com/browse/RHEL-54248 commit 1f3307cf3aac88763077fac90404f2c57bc5181a Author: Thomas Richter <tmricht@linux.ibm.com> Date: Tue Sep 20 14:26:16 2022 +0200 s390/con3215: Drop console data printout when buffer full Using z/VM, the 3270 terminal emulator also emulates an IBM 3215 console which outputs line by line. When the screen is full, the console enters the MORE... state and waits for the operator to confirm the data on the screen by pressing a clear key. If this does not happen in the default time frame (currently 50 seconds) the console enters the HOLDING state. It then waits another time frame (currently 10 seconds) before the output continues on the next screen. When the operator presses the clear key during these wait times, the output continues immediately. This may lead to a very long boot time when the console has to print many messages. The system may also hang, because the console's buffer space is limited and the system waits for the console output to drain and finally to finish. This problem can only occur when a terminal emulator is actually connected to the 3215 console driver. If not, z/VM simply drops console output. Remedy this rare situation and add a kernel boot command line parameter con3215_drop. It can be set to 0 (do not drop) or 1 (do drop), which is the default. This instructs the kernel to drop console data when the console buffer is full. This speeds up the boot time considerably and also does not hang the system anymore. Add a sysfs attribute file for console IBM 3215 named con_drop. This allows for changing the behavior after the boot, for example when a panic/crash is expected during interactive debugging.
Here is a test of the new behavior using the following test program: #!/bin/bash declare -i cnt=4 mode=$(cat /sys/bus/ccw/drivers/3215/con_drop) [ $mode = yes ] && cnt=25 echo "cons_drop $(cat /sys/bus/ccw/drivers/3215/con_drop)" echo "vmcp term more 5 2" vmcp term more 5 2 echo "Run $cnt iterations of "'echo t > /proc/sysrq-trigger' for i in $(seq $cnt) do echo "$i. command 'echo t > /proc/sysrq-trigger' at $(date +%F,%T)" echo t > /proc/sysrq-trigger sleep 1 done echo "droptest done" > /dev/kmsg # Output with sysfs attribute con_drop set to 1: # ./droptest.sh cons_drop yes vmcp term more 5 2 Run 25 iterations of echo t > /proc/sysrq-trigger 1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:09 2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:10 3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:11 4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:12 5. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:13 6. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:14 7. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:15 8. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:16 9. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:17 10. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:18 11. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:19 12. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:20 13. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:21 14. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:22 15. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:23 16. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:24 17. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:25 18. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:26 19. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:27 20. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:28 21. 
command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:29 22. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:30 23. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:31 24. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:32 25. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:33 # There are no hangs anymore. Output with sysfs attribute con_drop set to 0 and identical setting for z/VM console 'term more 5 2'. Sometimes hitting the clear key at the x3270 console to progress output. # ./droptest.sh cons_drop no vmcp term more 5 2 Run 4 iterations of echo t > /proc/sysrq-trigger 1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:20:58 2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:24:32 3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:28:04 4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:31:37 # Details: Enable function raw3215_write() to handle tab expansion and newlines and feed it with input not larger than the console buffer of 65536 bytes. Function raw3215_putchar() just forwards its character for output to raw3215_write(). This moves tab to blank conversion to one function raw3215_write() which also does call raw3215_make_room() to wait for enough free buffer space. Function handle_write() loops over all its input and segments input into chunks of console buffer size (should the input be larger). Rework tab expansion handling logic to avoid code duplication. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2024-09-06 17:33:35 +02:00
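The segmentation step described in the Details paragraph above (input larger than the console buffer is written in buffer-sized pieces) can be pictured with a small arithmetic sketch. This is not the driver code, which is C inside the kernel; the function name `chunk_ranges` and the 150000-byte input size are hypothetical, and only the 65536-byte buffer size comes from the commit message.

```shell
#!/bin/sh
# Sketch: split TOTAL bytes of input into pieces of at most BUF bytes.
# Usage: chunk_ranges TOTAL BUF
# Prints one "offset length" pair per piece.
chunk_ranges() {
    total=$1
    buf=$2
    [ "$buf" -gt 0 ] || return 1      # a zero-sized buffer would never make progress
    off=0
    while [ "$off" -lt "$total" ]; do
        len=$(( total - off ))        # bytes still to be written
        if [ "$len" -gt "$buf" ]; then
            len=$buf                  # cap each piece at the buffer size
        fi
        echo "$off $len"
        off=$(( off + len ))
    done
}

# 65536 is the console buffer size quoted above; 150000 is an arbitrary input size.
chunk_ranges 150000 65536
```

For these numbers the sketch prints three pieces: two full 65536-byte chunks and a final 18928-byte remainder.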
David Arcari	3531f1645e	x86/cpu: Detect real BSP on crash kernels JIRA: https://issues.redhat.com/browse/RHEL-43147 commit 5c5682b9f87a3b7bd4833884f300ec673685f6a6 Author: Thomas Gleixner <tglx@linutronix.de> Date: Tue Feb 13 22:05:54 2024 +0100 x86/cpu: Detect real BSP on crash kernels When a kdump kernel is started from a crashing CPU then there is no guarantee that this CPU is the real boot CPU (BSP). If the kdump kernel tries to online the BSP then the INIT sequence will reset the machine. There is a command line option to prevent this, but in case of nested kdump kernels this is wrong. But that command line option is not required at all because the real BSP is enumerated as the first CPU by firmware. Support for the only known system which was different (Voyager) got removed long ago. Detect whether the boot CPU APIC ID is the first APIC ID enumerated by the firmware. If the first APIC ID enumerated is not matching the boot CPU APIC ID then skip registering it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240213210252.348542071@linutronix.de Signed-off-by: David Arcari <darcari@redhat.com>	2024-08-29 08:19:49 -04:00
Waiman Long	e62041bc08	rcu: Reduce synchronize_rcu() latency JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 988f569ae041ccc93a79d98d1b0043dff4d7e9b7 Author: Uladzislau Rezki (Sony) <urezki@gmail.com> Date: Fri, 8 Mar 2024 18:34:05 +0100 rcu: Reduce synchronize_rcu() latency A call to a synchronize_rcu() can be optimized from a latency point of view. Workloads which depend on this can benefit of it. The delay of wakeme_after_rcu() callback, which unblocks a waiter, depends on several factors: - how fast a process of offloading is started. Combination of: - !CONFIG_RCU_NOCB_CPU/CONFIG_RCU_NOCB_CPU; - !CONFIG_RCU_LAZY/CONFIG_RCU_LAZY; - other. - when started, invoking path is interrupted due to: - time limit; - need_resched(); - if limit is reached. - where in a nocb list it is located; - how fast previous callbacks completed; Example: 1. On our embedded devices i can easily trigger the scenario when it is a last in the list out of ~3600 callbacks: <snip> <...>-29 [001] d..1. 21950.145313: rcu_batch_start: rcu_preempt CBs=3613 bl=28 ... <...>-29 [001] ..... 21950.152578: rcu_invoke_callback: rcu_preempt rhp=00000000b2d6dee8 func=__free_vm_area_struct.cfi_jt <...>-29 [001] ..... 21950.152579: rcu_invoke_callback: rcu_preempt rhp=00000000a446f607 func=__free_vm_area_struct.cfi_jt <...>-29 [001] ..... 21950.152580: rcu_invoke_callback: rcu_preempt rhp=00000000a5cab03b func=__free_vm_area_struct.cfi_jt <...>-29 [001] ..... 21950.152581: rcu_invoke_callback: rcu_preempt rhp=0000000013b7e5ee func=__free_vm_area_struct.cfi_jt <...>-29 [001] ..... 21950.152582: rcu_invoke_callback: rcu_preempt rhp=000000000a8ca6f9 func=__free_vm_area_struct.cfi_jt <...>-29 [001] ..... 21950.152583: rcu_invoke_callback: rcu_preempt rhp=000000008f162ca8 func=wakeme_after_rcu.cfi_jt <...>-29 [001] d..1. 21950.152625: rcu_batch_end: rcu_preempt CBs-invoked=3612 idle=.... <snip> 2. We use cpuset/cgroup to classify tasks and assign them into different cgroups. 
For example, the "background" group, which binds tasks only to little CPUs, or "foreground", which makes use of all CPUs. Tasks can be migrated between groups by a request if an acceleration is needed. See below an example of how the "surfaceflinger" task gets migrated. Initially it is located in the "system-background" cgroup, which allows it to run only on little cores. In order to speed it up it can be temporarily moved into the "foreground" cgroup, which allows it to use big/all CPUs: cgroup_attach_task(): -> cgroup_migrate_execute() -> cpuset_can_attach() -> percpu_down_write() -> rcu_sync_enter() -> synchronize_rcu() -> now move tasks to the new cgroup. -> cgroup_migrate_finish() <snip> rcuop/1-29 [000] ..... 7030.528570: rcu_invoke_callback: rcu_preempt rhp=00000000461605e0 func=wakeme_after_rcu.cfi_jt PERFD-SERVER-1855 [000] d..1. 7030.530293: cgroup_attach_task: dst_root=3 dst_id=22 dst_level=1 dst_path=/foreground pid=1900 comm=surfaceflinger TimerDispatch-2768 [002] d..5. 7030.537542: sched_migrate_task: comm=surfaceflinger pid=1900 prio=98 orig_cpu=0 dest_cpu=4 <snip> "Boosting a task" depends on synchronize_rcu() latency: - first trace shows a completion of synchronize_rcu(); - second shows attaching a task to a new group; - last shows a final step when migration occurs. 3. To address this drawback, maintain a separate track that consists of synchronize_rcu() callers only. After completion of a grace period users are deferred to a dedicated worker to process requests. 4. This patch reduces the latency of synchronize_rcu() by approximately 30-40% on synthetic tests. 
The real test case, camera launch time, shows(time is in milliseconds): 1-run 542 vs 489 improvement 9% 2-run 540 vs 466 improvement 13% 3-run 518 vs 468 improvement 9% 4-run 531 vs 457 improvement 13% 5-run 548 vs 475 improvement 13% 6-run 509 vs 484 improvement 4% Synthetic test(no "noise" from other callbacks): Hardware: x86_64 64 CPUs, 64GB of memory Linux-6.6 - 10K tasks(simultaneous); - each task does(1000 loops) synchronize_rcu(); kfree(p); default: CONFIG_RCU_NOCB_CPU: takes 54 seconds to complete all users; patch: CONFIG_RCU_NOCB_CPU: takes 35 seconds to complete all users. Running 60K gives approximately same results on my setup. Please note it is without any interaction with another type of callbacks, otherwise it will impact a lot a default case. 5. By default it is disabled. To enable this perform one of the below sequence: echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1" Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Co-developed-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:37 -04:00
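The improvement figures quoted in the message above can be re-derived from the raw timings. The sketch below is only a cross-check, and the helper name `improvement_pct` is hypothetical. It computes (before - after) * 100 / before with integer division, which reproduces the quoted percentages and suggests they were truncated rather than rounded.

```shell
#!/bin/sh
# Sketch: recompute the improvements quoted above.
# Usage: improvement_pct BEFORE AFTER   (same unit for both values)
improvement_pct() {
    echo $(( ($1 - $2) * 100 / $1 ))   # integer division truncates toward zero
}

# Camera launch times in milliseconds, "before after", as listed in the message.
for run in "542 489" "540 466" "518 468" "531 457" "548 475" "509 484"; do
    set -- $run
    echo "$1 vs $2 improvement $(improvement_pct "$1" "$2")%"
done

# Synthetic test: 54 seconds before, 35 seconds with the patch.
echo "synthetic: $(improvement_pct 54 35)%"
```

The same formula applied to the synthetic test (54 versus 35 seconds) gives 35%, which sits inside the 30-40% range stated for synthetic tests.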
Waiman Long	a7ee6faa72	rcu: Provide a boot time parameter to control lazy RCU JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 7f66f099de4dc4b1a66a3f94e6db16409924a6f8 Author: Qais Yousef <qyousef@layalina.io> Date: Sun, 3 Dec 2023 01:12:52 +0000 rcu: Provide a boot time parameter to control lazy RCU To allow more flexible arrangements while still provide a single kernel for distros, provide a boot time parameter to enable/disable lazy RCU. Specify: rcutree.enable_rcu_lazy=[y\|1\|n\|0] Which also requires rcu_nocbs=all at boot time to enable/disable lazy RCU. To disable it by default at build time when CONFIG_RCU_LAZY=y, the new CONFIG_RCU_LAZY_DEFAULT_OFF can be used. Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Tested-by: Andrea Righi <andrea.righi@canonical.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-26 10:57:22 -04:00
Waiman Long	4175e632cf	doc: Get rcutree module parameters back into alpha order JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 51823ca651364f68bd3ad33d848c1542fffdd627 Author: Paul E. McKenney <paulmck@kernel.org> Date: Tue, 21 Mar 2023 17:28:40 -0700 doc: Get rcutree module parameters back into alpha order This commit puts the rcutree module parameters back into proper alphabetical order. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-21 15:04:30 -04:00
Waiman Long	83b3fc77cb	doc: Document rcutree.nocb_nobypass_lim_per_jiffy kernel parameter JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 89f7f29140da767f4675efbbe7892f38786451ec Author: Paul E. McKenney <paulmck@kernel.org> Date: Wed, 27 Apr 2022 09:24:31 -0700 doc: Document rcutree.nocb_nobypass_lim_per_jiffy kernel parameter This commit provides documentation for the kernel parameter controlling RCU's handling of callback floods on offloaded (rcu_nocbs) CPUs. This parameter might be obscure, but it is always there when you need it. Reported-by: Frederic Weisbecker <frederic@kernel.org> Reported-by: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-21 15:04:29 -04:00
Waiman Long	e15ff5264d	doc: Document the rcutree.rcu_divisor kernel boot parameter JIRA: https://issues.redhat.com/browse/RHEL-55557 commit 71de1e34f1dfc31ab3cb052cdd7038950aae06e7 Author: Paul E. McKenney <paulmck@kernel.org> Date: Wed, 20 Apr 2022 08:59:46 -0700 doc: Document the rcutree.rcu_divisor kernel boot parameter This commit adds kernel-parameters.txt documentation for the rcutree.rcu_divisor kernel boot parameter, which controls the softirq callback-invocation batch limit. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-08-21 15:04:29 -04:00
Waiman Long	3342973efe	x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE JIRA: https://issues.redhat.com/browse/RHEL-31230 Conflicts: 1) The net/netfilter/Makefile hunk is dropped due to missing nft_ct_fast.c file first introduced by commit d9e789147605 ("netfilter: nf_tables: avoid retpoline overhead for some ct expression calls"). 2) A merge conflict in the tools/objtool/check.c hunk due to missing upstream commit 9bb2ec608a20 ("objtool: Update Retpoline validation"). 3) First hunk of net/netfilter/nf_tables_core.c is dropped and a merge conflict in the second hunk due to missing upstream commit d8d760627855 ("netfilter: nf_tables: add static key to skip retpoline workarounds"). 4) The net/netfilter/nft_ct.c hunks are dropped due to missing upstream commit d9e789147605 ("netfilter: nf_tables: avoid retpoline overhead for some ct expression calls"). commit aefb2f2e619b6c334bcb31de830aa00ba0b11129 Author: Breno Leitao <leitao@debian.org> Date: Tue, 21 Nov 2023 08:07:32 -0800 x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE Step 5/10 of the namespace unification of CPU mitigations related Kconfig options. [ mingo: Converted a few more uses in comments/messages as well. ] Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ariel Miculas <amiculas@cisco.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20231121160740.1249350-6-leitao@debian.org Signed-off-by: Waiman Long <longman@redhat.com>	2024-07-26 14:33:35 -04:00
Lucas Zampieri	5c0d3906e7	Merge: RHEL-9.5: NFS Updates to v6.8 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4239 This MR updates NFS, kNFSD, lockd, and sunrpc subsystems to upstream v6.8, with some omissions and additions for compatibility and fixes. Testing is currently in progress. JIRA: https://issues.redhat.com/browse/RHEL-34875 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=62234972 Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Approved-by: Steve Dickson <steved@redhat.com> Approved-by: Rafael Aquini <aquini@redhat.com> Approved-by: Paulo Alcantara <paalcant@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Approved-by: Scott Mayhew <smayhew@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-07-16 19:40:48 +00:00
Paolo Bonzini	1bc808f550	x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT JIRA: https://issues.redhat.com/browse/RHEL-16745 It was meant well at the time but nothing's using it so get rid of it. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240202163510.GDZb0Zvj8qOndvFOiZ@fat_crate.local (cherry picked from commit 29956748339aa8757a7e2f927a8679dd08f24bb6) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-07-01 08:55:34 +02:00
Benjamin Coddington	8520ae59d8	NFSv4: Add a parameter to limit the number of retries after NFS4ERR_DELAY JIRA: https://issues.redhat.com/browse/RHEL-34875 commit 5b9d31ae1c925bb5f15975e31b31ff5ae3c81f8f Author: Trond Myklebust <trond.myklebust@hammerspace.com> Date: Sat Sep 9 12:23:01 2023 -0400 NFSv4: Add a parameter to limit the number of retries after NFS4ERR_DELAY When using a 'softerr' mount, the NFSv4 client can get stuck waiting forever while the server just returns NFS4ERR_DELAY. Among other things, this causes the knfsd server threads to busy wait. Add a parameter that tells the NFSv4 client how many times to retry before giving up. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>	2024-06-27 08:14:24 -04:00
Lucas Zampieri	cd66a5d192	Merge: Update kernel-module support to v6.8 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4061 # Merge Request Required Information ## Summary of Changes RHIVOS is running into early mm init performance issues, and a long-term set of solutions is to improve the kernel linear map when kernel security is set to a max-level, a RHIVOS FuSa requirement, where all of memory is -not- read/writeable via the linear map (all of memory mapping from PAGE_OFFSET), but has strict execute-only, rodata, rw-data and no-execute pages. Although RHEL9 and upstream can support the latter functionally, it is a significant performance issue as page-level mapping of the kernel linear map has to be employed from the default huge-page mappings that the various arch's support. The boot kernel itself is relatively easy to know how to map for optimal page-mappings and protection, because it is the first to load and ELF sections can be scanned for needed info; the same can't be said for all the loadable kernel modules, which is the impetus for page-splitting of the linear map (on x86) and the per-page-mapping on ARM64, where page-splitting of the linear map is not supported, but is the long-term optimal solution. In order to make a step in this long-term effort, this patch series attempts to take the existing RHEL9 kernel module load support, which is barely 12 patches past the initial v5.14 base, and bring it up to a current, v6.8 version. Of course, such an update brings a lot of other needed backports to apply cleanly, if the goal is to get close to upstream, maintain RHEL kmod support, and not regress. Thus, this series results with major updates to dynamic-debug (since it involves modifying kernel module sections), kbuild, modpost, genksyms, and sprinkle an odd livepatch, fpatch, and BPF patch, although the latter were trimmed or dropped wherever possible. 
The split is approximately 150 kernel-module, 30 dyndbg, 80 modpost, 15 kbuild, 3 livepatch, 3 ftrace, 2 bpf (one being a fix for earlier kernel commit). Note: modpost and related kbuild updates moved it to approximately v6.4. A full update to 6.8 wasn't deemed necessary, and was an additional 30+ commits, and more kbuild modifications. This effort was deemed sufficiently large and complete for the intended goal of making RHEL9 amenable to future updates to the kernel-module subsystem for posted patches on review now in linux-mm by Mike Rapaport. Those patches and expected follow-ons, will be backported to RHEL-9 when upstream settles on final updates in this area; these updates will make the kernel-load subsystem more common, and less arch-specific. One patch from v6.9-rc1 was taken, modules: wait do_free_init correctly, to repair a race seen in the module-load path on a RHIVOS platform, which needed to sit on top of this series for ease of backporting. v6: Evidently the rebase to -457 kept a merge conflict, which was a duplicate patch already taken in. Latest series is now 299 patches vs 300. No functional changes! v5: rebased to latest kernel (-457) since gitlab punted due to claimed merge conflict; only conflict was relative source, due to other MRs pulled into cs9/9.5 ahead of this MR; no code changes, and (tkdiff+)diff-ing v4 patches to v5, showed no diffs to the author's naked eye. v4: Just updated 3rd patch's revert to put Upstream status after Subject, so it shows correctly in a git-format output. No code changes from v3. (although CKI running a-muck after push'd update w/only a commit-log change). v3: Rebase to -455 kernel since v2 was 8300 commits behind and had merge conflicts with JoeL's objtool update MR. v2: Pulling out of Draft. : (Hopefully) fixed numerous nits (Jira: -> JIRA:; proper link so no more 404's, etc.) 
: add new/latest Fixes, some id'd by reviewers, some new to v6.9 : Cleaned up/out bad merges that had introduced RHEL-only hunks : Significantly re-ordered the series to make it more bisectable; still breaks where the upstream maintainer tore code out of modpost.c and into a sed script, and then put the functionality back into modpost.c, and removed the sed script, which this series didn't backport since it was already large enough. : identified a failure with systemtap, that Will Cohen is repairing; thus, this MR has to wait for a systemtap update before it will pass its check in the (brew? cki?) builds. v1: Draft! This series has gone through some simple, preliminary testing, but it needs deep review by ftrace, BPF, livepatch, and rh-kabi support to ensure no regressions in these few, but corner kernel-modifying code paths. rh-kabi tooling is a bit unknown, as it isn't in the kernel, but there are RHEL-only patches in the kernel for it. A patchreview run against the series was exed'd, and needed Fixes were added/included. The list of self-documented omissions is listed below. If new ones have popped in v6.9-rc<n>, please forward them for addition. Bisectability: The series is has known bisectability (patch-ordering) issues at the moment, but plan to re-shuffle the patches in v2 to improve if not make it completely bisectable. Expected feedback will be incorporated in v2, and planned upgrade to full-MR/drop-Draft status. Shout-out to Joe Lawrence who aided in debugging and providing fixes for well-hidden noarch build failures around Documentation generation, as well as warning cleanups for EXPORT'd init-tagged functions, which the update checks for now. Joe was instrumental in finding key chunks of the modpost update that appears to have closed gaps in my original backport efforts. 
Intentionally Omitted Fix: 0aa24a79ee3b603f kbuild: do not try to parse .cmd files for objects provided by compiler -- for parisc & sky arch's, not needed in RHEL9 Intentionally Omitted Fix: f5983dab0ead modpost: define more R_ARM_ for old distributions For old releases not having R_ARM_* in arch/arm/include/asm/elf.h, which RHEL9 has Intentionally Omitted Fix: 08700ec705043e linux/export: fix reference to exported functions for parisc64 -- no parisc64 support in RHEL9 Intentionally Omitted Fix: 86495af1171e1feec79f media: dvb: symbol fixup for dvb_attach() -- not included in this backport due to partner request not to include until RHEL-10 Intentionally Omitted Fix: d81f0d7b8 Subject: kunit: add KUNIT_INIT_TABLE to init link -- will let KUNIT update bring in and enable as needed ## Approved Development Ticket JIRA: https://issues.redhat.com/browse/RHEL-28063 Signed-off-by: Donald Dutile <ddutile@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: David Arcari <darcari@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Approved-by: Eric Chanudet <echanude@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-24 12:17:19 +00:00
Lucas Zampieri	0ff0944e55	Merge: smp: Backport CSD tracepoints MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3614 # Merge Request Required Information ## Summary of Changes Introduce csd tracepoints that help tracking IPIs that can be messing with latency. Also, make the trace available for all smp_function_call*(), not only the ones that result in an IPI. ## Approved Development Ticket JIRA: https://issues.redhat.com/browse/RHEL-13876 Signed-off-by: Leonardo Bras <leobras@redhat.com> Approved-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Approved-by: Wander Lairson Costa <wander@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-21 12:50:04 +00:00
Lucas Zampieri	175f008f91	Merge: rcu: Backport upstream RCU commits up to v6.7 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4115 JIRA: https://issues.redhat.com/browse/RHEL-34076 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4115 This MR backports upstream RCU commits up to v6.7 with relevant bug fixes, if applicable. Signed-off-by: Waiman Long <longman@redhat.com> Approved-by: Phil Auld <pauld@redhat.com> Approved-by: Wander Lairson Costa <wander@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-19 17:58:02 +00:00
Lucas Zampieri	3cadd5b0ec	Merge: x86/bhi: Additional mitigation for BHI vulnerability (CVE-2024-2201) MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4014 JIRA: https://issues.redhat.com/browse/RHEL-28203 JIRA: https://issues.redhat.com/browse/RHEL-28209 CVE: CVE-2024-2201 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4014 Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3961 Branch History Injection (BHI) attacks may allow a malicious application to influence indirect branch prediction in kernel by poisoning the branch history. eIBRS isolates indirect branch targets in ring0. The BHB can still influence the choice of indirect branch predictor entry, and although branch predictor entries are isolated between modes when eIBRS is enabled, the BHB itself is not isolated between modes. Alder Lake and new processors supports a hardware control BHI_DIS_S to mitigate BHI. For older processors Intel has released a software sequence to clear the branch history on parts that don't support BHI_DIS_S. Add support to execute the software sequence at syscall entry and VMexit to overwrite the branch history. This MR extends the existing spectre_v2 mitigation to enable either software or hardware BHI mitigation for vulnerable Intel processors, if enabled. The spectre_v2 vulnerability sysfs file will now show the status of the BHI mitigation like ...; SW sequence; BHI: SW loop, KVM: SW loop As Linus has changed the default upstream to CONFIG_SPECTRE_BHI_ON, the syscall hardening commit 1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls") is skipped for now. It may be backported in the future, if necessary. 
Signed-off-by: Waiman Long <longman@redhat.com> Approved-by: Paolo Bonzini <bonzini@gnu.org> Approved-by: David Arcari <darcari@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-18 12:42:44 +00:00
Donald Dutile	a27f75beb4	module: add debugging auto-load duplicate module support JIRA: https://issues.redhat.com/browse/RHEL-28063 commit 8660484ed1cf3261e89e0bad94c6395597e87599 Author: Luis Chamberlain <mcgrof@kernel.org> Date: Thu Apr 13 22:28:39 2023 -0700 module: add debugging auto-load duplicate module support The finit_module() system call can in the worst case use more than twice a module's size in virtual memory. Duplicate finit_module() system calls are non-fatal, however they unnecessarily strain virtual memory during bootup and in the worst case can cause a system to fail to boot. This is only known to currently be an issue on systems with a larger number of CPUs. To help debug this situation we need to consider the different sources for finit_module(). Requests from the kernel that rely on module auto-loading, i.e., the kernel's request_module() API, are one source of calls. Although modprobe checks to see if a module is already loaded prior to calling finit_module(), there is a small race possible allowing userspace to trigger multiple modprobe calls racing against modprobe and thus not seeing the module as already loaded. This adds debugging support to the kernel module auto-loader (request_module() calls) to easily detect duplicate module requests. To aid with possible bootup failure issues incurred by this, it will converge duplicate requests to a single request. This avoids any possible strain on virtual memory during bootup which could be incurred by duplicate module autoloading requests. Folks debugging virtual memory abuse on bootup can and should enable this to see what pr_warn()s come on, to see if module auto-loading is to blame for their woes. If they see duplicates they can further debug this by enabling the module.enable_dups_trace kernel parameter or by enabling CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE. Current evidence seems to point to only a few duplicates for module auto-loading. 
And so the source for other duplicates creating heavy virtual memory pressure due to a larger number of CPUs should be coming from another place (likely udev). Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Donald Dutile <ddutile@redhat.com>	2024-06-17 14:17:26 -04:00
Donald Dutile	ca8f0d3fa6	module: Add support for default value for module async_probe JIRA: https://issues.redhat.com/browse/RHEL-28063 commit ae39e9ed964f8e450d0de410b5a757e19581dfc5 Author: Saravana Kannan <saravanak@google.com> Date: Fri Jun 3 18:01:00 2022 -0700 module: Add support for default value for module async_probe Add a module.async_probe kernel command line option that allows enabling async probing for all modules. When this command line option is used, there might still be some modules for which we want to explicitly force synchronous probing, so extend <modulename>.async_probe to take an optional bool input so that async probing can be disabled for a specific module. Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Donald Dutile <ddutile@redhat.com>	2024-06-17 14:17:18 -04:00
Donald Dutile	c4f068de33	dyndbg: Remove support for ddebug_query param JIRA: https://issues.redhat.com/browse/RHEL-28063 commit 9c40e1aa84123750773a57c9cf39112459a952dd Author: Andrew Halaney <ahalaney@redhat.com> Date: Wed Oct 13 11:40:21 2021 -0400 dyndbg: Remove support for ddebug_query param This param has been deprecated for a very long time now, let's rip it out. Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Link: https://lore.kernel.org/r/1634139622-20667-3-git-send-email-jbaron@akamai.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Donald Dutile <ddutile@redhat.com>	2024-06-17 14:17:11 -04:00
Leonardo Bras	bd63f8635f	locking/csd_lock: Remove added data from CSD lock debugging JIRA: https://issues.redhat.com/browse/RHEL-13876 Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git conflicts: Fixes (some) conflicts introduced by downstream commit `aa5786b04d` ("sched, smp: Trace smp callback causing an IPI") by applying the original dependency commit, and making it easier to cherry-pick the next upstream commits due to not having conflicts. commit 1771257cb447a7b27a15ed9aaf332726c47fcbcf Author: Paul E. McKenney <paulmck@kernel.org> Date: 2023-03-20 17:55:14 -0700 locking/csd_lock: Remove added data from CSD lock debugging The diagnostics added by this commit were extremely useful in one instance: `a5aabace5f` ("locking/csd_lock: Add more data to CSD lock debugging") However, they have not seen much action since, and there have been some concerns expressed that the complexity is not worth the benefit. Therefore, manually revert this commit, but leave a comment telling people where to find these diagnostics. [ paulmck: Apply Juergen Gross feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230321005516.50558-2-paulmck@kernel.org Signed-off-by: Leonardo Bras <leobras@redhat.com>	2024-06-17 12:58:15 -03:00
Leonardo Bras	6e00a94924	trace,smp: Add tracepoints for scheduling remotelly called functions JIRA: https://issues.redhat.com/browse/RHEL-13876 Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git commit c52198601695851622f361d3f16456e9fc857629 Author: Paul E. McKenney <paulmck@kernel.org> Date: 2023-03-20 17:55:13 -0700 locking/csd_lock: Add Kconfig option for csd_debug default The csd_debug kernel parameter works well, but is inconvenient in cases where it is more closely associated with boot loaders or automation than with a particular kernel version or release. Therefore, provide a new CSD_LOCK_WAIT_DEBUG_DEFAULT Kconfig option that defaults csd_debug to 1 when selected and 0 otherwise, with this latter being the default. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230321005516.50558-1-paulmck@kernel.org Signed-off-by: Leonardo Bras <leobras@redhat.com>	2024-06-17 12:58:14 -03:00
Lucas Zampieri	f6029bf351	Merge: workqueue: Backport workqueue commits to v6.9 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3910 JIRA: https://issues.redhat.com/browse/RHEL-25103 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3910 Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3847 The primary purpose of this MR is to backport those upstream workqueue commits which enables ordered workqueues and rescuers to follow changes in workqueue unbound cpumask which is necessary to make sure that isolated CPUs won't be disturbed due to unbound work items being handled by those CPUs. These upstream commits were merged into the v6.9 kernel which also contains some major changes in workqueue code. This makes the required commits dependent on some of the v6.9 workqueue commits. It is less risky to sync the workqueue code up to v6.9 instead of selective backports of some dependent commits. This MR also includes some miscellaneous commits in other subsystems due to changes in the underlying workqueue implementations. A follow-up proactive workqueue fixes MR will be created later on, if necessary. Signed-off-by: Waiman Long <longman@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: Steve Best <sbest@redhat.com> Approved-by: Vladis Dronov <vdronov@redhat.com> Approved-by: Prarit Bhargava <prarit@redhat.com> Approved-by: Wander Lairson Costa <wander@redhat.com> Approved-by: Phil Auld <pauld@redhat.com> Approved-by: Radu Rendec <rrendec@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-13 13:07:43 +00:00
Lucas Zampieri	304e2a4e29	Merge: [RHEL-9.5.0] iommu and dma mapping api updates MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4151 # Merge Request Required Information JIRA: https://issues.redhat.com/browse/RHEL-28780 JIRA: https://issues.redhat.com/browse/RHEL-12083 JIRA: https://issues.redhat.com/browse/RHEL-12322 JIRA: https://issues.redhat.com/browse/RHEL-29105 JIRA: https://issues.redhat.com/browse/RHEL-29357 JIRA: https://issues.redhat.com/browse/RHEL-29359 Omitted-fix: ed8b94f6e0ac ("powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add") - Reverted by 1fba2bf8e9d5 ("Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add"") Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git branch: next Tested: In progress - general cki coverage - Nvidia testing arm-smmu-v3 and iommufd related changes they have requested. - Multiple rounds of testing of amd_iommu, intel_iommu, and arm-smmu-v3 with various iommu configurations with disk i/o using fio, covering lazy iotlb invalidation, strict iotlb invalidation, and passthrough. Also tested with forcedac set. Intel Scalable Mode capable systems tested with the iotlb invalidation policies, and passthrough with scalable mode enabled, and disabled. AMD systems tested with v1 and v2 page tables. - Tested booting with various iommu configurations, and verifying system in correct state on AMD, Intel, and ARM. - Limited test on ppc64le. The system I had access to was setting up a 64-bit bypass window, and using dma_direct calls. It ran, but since I don't normally touch ppc64le iommu code, I need to investigate more or get IBM assistance to more thoroughly test it. - Working on getting testing assistance from IBM for the s390x changes. 
## Summary of Changes This brings iommu, iommufd, and dma mapping api up to 6.9 with some additions from Joerg's next branch minus some commits changes in a 6.9 SEV-SNP pull for AMD. Some highlights: - The removal of the amd_iommu_v2 code, and the addition of its replacement based on the iommu core SVA api, along with a re-org of the amd_iommu code. - The migration of s390 to the iommu core dma-iommu dma ops implementation, joining Intel, AMD, and ARM as users of the same code base. - The beginnings of a re-work of the arm-smmu-v3 driver by Jason, and others. - A number of changes to iommufd as it continues to get fleshed out. - IOPT memory usage observability (code that was basis for talk at LPC last year) Example output in vmstat files: ``` # grep iommu /sys/devices/system/node/node*/vmstat /sys/devices/system/node/node0/vmstat:nr_iommu_pages 342 /sys/devices/system/node/node1/vmstat:nr_iommu_pages 0 ``` - Continued work on shared virtual addressing and io page faulting (PRI). - Dynamic swiotlb memory pools. This is not enabled yet, as they still seem to be shaking out issues upstream, but the code is in place now. - Re-working of iommu core domain allocation. Note: iommufd selftest is being enabled in separate work that has been delegated to another engineer starting to help with iommu. So that will be enabled in the next few weeks to add more coverage for iommufd. Conflict-wise, they should be noted in the individual commits, but not too bad overall. 13/30 were dropping unsupported bits, and another 8 were context diffs. A couple were caused by out-of-order backports due to fixes, and a couple of upstream conflicts from colliding patchsets had to be resolved in the merge commits. 
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Approved-by: Jan Stancek <jstancek@redhat.com> Approved-by: Donald Dutile <ddutile@redhat.com> Approved-by: Phil Auld <pauld@redhat.com> Approved-by: David Airlie <airlied@redhat.com> Approved-by: Lenny Szubowicz <lszubowi@redhat.com> Approved-by: Steve Best <sbest@redhat.com> Approved-by: John W. Linville <linville@redhat.com> Approved-by: Mark Langsdorf <mlangsdo@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-05 20:03:50 +00:00
Lucas Zampieri	95ec32f109	Merge: USB/TB code rebase of supported drivers to upstream v6.8 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4268 JIRA: https://issues.redhat.com/browse/RHEL-34114 This rebases supported USB and Thunderbolt drivers to upstream kernel v6.8 By design, changes on this rebase are limited to supported usb/thunderbolt drivers. Changes which happen to touch the drivers but are tree-wide are selectively or partially pulled in, when relevant. Omitted-fix: 9dc292413c56 ("usb: gadget: ncm: Fix endianness of wMaxSegmentSize variable in ecm_desc") Omitted-fix: f90ce1e04cbc ("usb: gadget: ncm: Fix handling of zero block length packets") Omitted-fix: 5b9e00a6004c ("powerpc/4xx: Fix warp_gpio_leds build failure") Omitted-fix: 6f98e44984d5 ("spi: ppc4xx: Fix fallout from include cleanup") Omitted-fix: 70e6163d17dd ("arm64: dts: qcom: qrb5165-rb5: use u16 for DP altmode svid") Signed-off-by: Desnes Nunes <desnesn@redhat.com> Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com> Approved-by: Eric Chanudet <echanude@redhat.com> Approved-by: David Arcari <darcari@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Lucas Zampieri <lzampier@redhat.com>	2024-06-03 19:41:30 +00:00
Waiman Long	65e2702499	rcu: Restrict access to RCU CPU stall notifiers JIRA: https://issues.redhat.com/browse/RHEL-34076 commit 4e58aaeebb3c27993c734c99eae6881b196b1ddb Author: Paul E. McKenney <paulmck@kernel.org> Date: Wed, 1 Nov 2023 18:28:38 -0700 rcu: Restrict access to RCU CPU stall notifiers Although the RCU CPU stall notifiers can be useful for dumping state when tracking down delicate forward-progress bugs where NUMA effects cause cache lines to be delivered to a given CPU regularly, but always in a state that prevents that CPU from making forward progress. These bugs can be detected by the RCU CPU stall-warning mechanism, but in some cases, the stall-warnings printk()s disrupt the forward-progress bug before any useful state can be obtained. Unfortunately, the notifier mechanism added by commit 5b404fdabacf ("rcu: Add RCU CPU stall notifier") can make matters worse if used at all carelessly. For example, if the stall warning was caused by a lock not being released, then any attempt to acquire that lock in the notifier will hang. This will prevent not only the notifier from producing any useful output, but it will also prevent the stall-warning message from ever appearing. This commit therefore hides this new RCU CPU stall notifier mechanism under a new RCU_CPU_STALL_NOTIFIER Kconfig option that depends on both DEBUG_KERNEL and RCU_EXPERT. In addition, the rcupdate.rcu_cpu_stall_notifiers=1 kernel boot parameter must also be specified. The RCU_CPU_STALL_NOTIFIER Kconfig option's help text contains a warning and explains the dangers of careless use, recommending lockless notifier code. In addition, a WARN() is triggered each time that an attempt is made to register a stall-warning notifier in kernels built with CONFIG_RCU_CPU_STALL_NOTIFIER=y. This combination of measures will keep use of this mechanism confined to debug kernels and away from routine deployments. [ paulmck: Apply Dan Carpenter feedback. 
] Fixes: 5b404fdabacf ("rcu: Add RCU CPU stall notifier") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com> Signed-off-by: Waiman Long <longman@redhat.com>	2024-05-31 10:56:18 -04:00
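
The iommu merge listed above (304e2a4e29) quotes per-node `nr_iommu_pages` counters from the vmstat files. As a minimal sketch, assuming the `grep` output format shown in that commit message (`<path>:nr_iommu_pages <count>`), the per-node values could be collected and totalled as follows. The function name `parse_iommu_pages` is hypothetical, and the sample input is just the two lines quoted in the commit message, not fresh measurements.

```python
import re

# Matches one line of `grep iommu /sys/devices/system/node/node*/vmstat` output,
# e.g. "/sys/devices/system/node/node0/vmstat:nr_iommu_pages 342".
# The format is assumed from the example quoted in the commit message.
LINE_RE = re.compile(r"/node(\d+)/vmstat:nr_iommu_pages\s+(\d+)\s*$")


def parse_iommu_pages(grep_output: str) -> dict[int, int]:
    """Return {node_id: nr_iommu_pages} for every matching line (hypothetical helper)."""
    pages = {}
    for line in grep_output.splitlines():
        match = LINE_RE.search(line)
        if match:
            pages[int(match.group(1))] = int(match.group(2))
    return pages


# The two sample lines quoted in the commit message.
SAMPLE = """\
/sys/devices/system/node/node0/vmstat:nr_iommu_pages 342
/sys/devices/system/node/node1/vmstat:nr_iommu_pages 0
"""

if __name__ == "__main__":
    per_node = parse_iommu_pages(SAMPLE)
    for node, count in sorted(per_node.items()):
        print(f"node{node}: {count} pages")
    print(f"total: {sum(per_node.values())} pages")
```

On a live system the same function could be fed the real `grep` output instead of `SAMPLE`; lines that do not match the assumed format are silently skipped.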

986 Commits