Commit Graph

986 Commits

Author SHA1 Message Date
David Arcari 8b4033c281 cpufreq: intel_pstate: Make it possible to avoid enabling CAS
JIRA: https://issues.redhat.com/browse/RHEL-85517

commit 7802fce7dc18394d041a1310fe4ad76120e08145
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Mon Jan 27 14:07:12 2025 +0100

    cpufreq: intel_pstate: Make it possible to avoid enabling CAS

    Capacity-aware scheduling (CAS) is enabled by default by intel_pstate on
    hybrid systems without SMT. In some usage scenarios, however, it may be
    more attractive to place tasks for maximum CPU performance regardless of
    the extra energy cost, which is how such systems behave when CAS is not
    enabled. Introduce a command line option that forbids intel_pstate from
    enabling CAS.

    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Link: https://patch.msgid.link/2781262.mvXUDI8C0e@rjwysocki.net

Signed-off-by: David Arcari <darcari@redhat.com>
2025-03-31 08:07:03 -04:00
Augusto Caringi 51bbb488a9 Merge: Scheduler updates for 9.7
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6398

JIRA: https://issues.redhat.com/browse/RHEL-78821

Proactive fixes and minor updates for scheduler related
code. This includes needed commits up to v6.14-rc1. There
are not as many since there are a few features upstream
which we are not taking into rhel9 at this point.

Signed-off-by: Phil Auld <pauld@redhat.com>

Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Herton R. Krzesinski <herton@redhat.com>
Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Juri Lelli <juri.lelli@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Augusto Caringi <acaringi@redhat.com>
2025-03-12 14:53:01 -03:00
Phil Auld 37dc45d04d sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full"
JIRA: https://issues.redhat.com/browse/RHEL-78821

commit 1174b9344bc7e7989439cad207fcd94eaab028db
Author: Waiman Long <longman@redhat.com>
Date:   Wed Oct 30 13:52:51 2024 -0400

    sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full"

    The "isolcpus=nohz" boot parameter and flag were used to disable the tick
    when running a single task.  Nowadays, this "nohz" flag is seldom used
    as it is included as part of the "nohz_full" parameter.  Extend this
    flag to cover other kernel noises disabled by the "nohz_full" parameter
    to make them equivalent. This also eliminates the need to use both the
    "isolcpus" and the "nohz_full" parameters to fully isolate a given
    set of CPUs.

    Suggested-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Frederic Weisbecker <frederic@kernel.org>
    Link: https://lore.kernel.org/r/20241030175253.125248-3-longman@redhat.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2025-02-27 15:13:10 +00:00
Waiman Long 8b6c3917c0 clocksource: Scale the watchdog read retries automatically
JIRA: https://issues.redhat.com/browse/RHEL-76143
Conflicts: A context diff in the include/linux/clocksource.h hunk due
	   to the presence of later upstream commit 6b2e29977518
	   ("timekeeping: Provide infrastructure for converting to/from
	   a base clock").

commit 2ed08e4bc53298db3f87b528cd804cb0cce066a9
Author: Feng Tang <feng.tang@intel.com>
Date:   Wed, 21 Feb 2024 14:08:59 +0800

    clocksource: Scale the watchdog read retries automatically

    On an 8-socket server the TSC is wrongly marked as 'unstable' and disabled
    during boot time on about one out of 120 boot attempts:

        clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns,
        wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable
        tsc: Marking TSC unstable due to clocksource watchdog
        TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
        sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152)
        clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896.
        clocksource: Switched to clocksource hpet

    The reason is that on platforms with a large number of CPUs, there are
    sporadic big or huge read latencies while reading the watchdog/clocksource
    during boot or when the system is under a stress workload, and the
    frequency and maximum value of the latency go up with the number of
    online CPUs.

    The current code already has logic to detect and filter such high-latency
    cases by reading the watchdog twice and checking the two deltas. Due to the
    randomness of the latency, there is a low probability that the first delta
    (latency) is big, but the second delta is small and looks valid. The
    watchdog code retries the readouts by default twice, which is not
    necessarily sufficient for systems with a large number of CPUs.

    There is a command line parameter 'max_cswd_read_retries' which allows
    increasing the number of retries, but that's not user friendly as it needs
    to be tweaked per system. As the number of required retries is proportional
    to the number of online CPUs, this parameter can be calculated at runtime.

    Scale and enlarge the number of retries according to the number of online
    CPUs and remove the command line parameter completely.

    [ tglx: Massaged change log and comments ]

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Jin Wang <jin1.wang@intel.com>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Waiman Long <longman@redhat.com>
    Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
    Link: https://lore.kernel.org/r/20240221060859.1027450-1-feng.tang@intel.com

Signed-off-by: Waiman Long <longman@redhat.com>
2025-02-04 13:20:56 -05:00
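The retry scaling described in the clocksource commit above can be illustrated with a short sketch. The commit text only says that the retry count is derived from the number of online CPUs; the specific formula used here (half the base-2 logarithm of the CPU count, plus one) and the function name `watchdog_max_retries` are assumptions made for illustration, not quotes from the patch.

```python
def watchdog_max_retries(online_cpus: int) -> int:
    """Illustrative retry count that grows slowly with the online CPU count.

    Assumption: retries = floor(log2(cpus)) // 2 + 1. This formula is a
    stand-in chosen for the sketch, not taken from the commit message.
    """
    if online_cpus < 1:
        raise ValueError("online_cpus must be at least 1")
    # int.bit_length() - 1 equals floor(log2(n)) for n >= 1.
    return (online_cpus.bit_length() - 1) // 2 + 1


if __name__ == "__main__":
    for cpus in (1, 8, 64, 960):
        print(cpus, watchdog_max_retries(cpus))
```

Whatever the exact formula, the point of the change is that the value is computed at runtime instead of being tuned per system through a boot parameter.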
Patrick Talbert 4003ae72c9 Merge: Preparatory patches for TDX support in KVM
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6045

# Merge Request Required Information

## Summary of Changes

Backport more patches, mostly from 6.12, that are needed to enable TDX support in KVM. These prerequisites are less self-contained, but are enough to have a mostly conflict-free TDX backport.

## Approved Development Ticket(s)
All submissions to CentOS Stream must reference a ticket in [Red Hat Jira](https://issues.redhat.com/).

```
JIRA: https://issues.redhat.com/browse/RHEL-71541
Depends: https://issues.redhat.com/browse/RHEL-64444
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Omitted-fix: 3f749befb0998472470d850b11b430477c0718cc (irrelevant series of changes for odd Kconfigs)
Omitted-fix: ea4290d77bda2bd1f173a86f07aa79b568e0a6f8 (irrelevant series of changes for odd Kconfigs)
Omitted-fix: 2a5fe5a01668e831af1de3951718fbf88b9a9b9c (irrelevant series of changes for odd Kconfigs)
Omitted-fix: 338b655a1178900ac05aca7ac66dc28b05100430 (irrelevant series of changes for odd Kconfigs)
Omitted-fix: 341e4023032fba6c02326bfc6babd63ef4039712 (irrelevant series of changes for odd Kconfigs)
Omitted-fix: 1331343af6f502aecd274d522dd34bf7c965f484 (irrelevant series of changes for odd Kconfigs)
Omitted-fix: 9ee62c33c0fe017ee02501a877f6f562363122fa (irrelevant series of changes for odd Kconfigs)
Omitted-fix: 2a5fe5a01668e831af1de3951718fbf88b9a9b9c (irrelevant series of changes for odd Kconfigs)
Omitted-fix: d822ca29a4fc5278fb511790dace44836e8cc40d (can be backported via perf)
Omitted-fix: 979956bc681105f34642971448c4cda048954a07 (irrelevant with RHEL gcc)
Omitted-fix: e120829dbf927c8b93cd5e06acfec0332cc82e02 (can be backported via perf)
```

Approved-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Patrick Talbert <ptalbert@redhat.com>
2025-01-27 15:24:23 +01:00
Paolo Bonzini b02548e87e KVM: Add a module param to allow enabling virtualization when KVM is loaded
JIRA: https://issues.redhat.com/browse/RHEL-71541

Add an on-by-default module param, enable_virt_at_load, to let userspace
force virtualization to be enabled in hardware when KVM is initialized,
i.e. just before /dev/kvm is exposed to userspace.  Enabling virtualization
during KVM initialization allows userspace to avoid the additional latency
when creating/destroying the first/last VM (or more specifically, on the
0=>1 and 1=>0 edges of creation/destruction).

Now that KVM uses the cpuhp framework to do per-CPU enabling, the latency
could be non-trivial as the cpuhp bringup/teardown is serialized across
CPUs, e.g. the latency could be problematic for use cases that need to spin
up VMs quickly.

Prior to commit 10474ae894 ("KVM: Activate Virtualization On Demand"),
KVM _unconditionally_ enabled virtualization during load, i.e. there's no
fundamental reason KVM needs to dynamically toggle virtualization.  These
days, the only known argument for not enabling virtualization is to allow
KVM to be autoloaded without blocking other out-of-tree hypervisors, and
such use cases can simply change the module param, e.g. via command line.

Note, the aforementioned commit also mentioned that enabling SVM (AMD's
virtualization extensions) can result in "using invalid TLB entries".
It's not clear whether the changelog was referring to a KVM bug, a CPU
bug, or something else entirely.  Regardless, leaving virtualization off
by default is not a robust "fix", as any protection provided is lost the
instant userspace creates the first VM.

Reviewed-by: Chao Gao <chao.gao@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240830043600.127750-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b4886fab6fb620b96ad7eeefb9801c42dfa91741)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-17 14:42:06 +01:00
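To check how the `enable_virt_at_load` parameter from the commit above is set on a running system, something like the following sketch could be used. It assumes the parameter is exposed read-only under `/sys/module/kvm/parameters/`, which is the usual location for module parameters but is not stated in the commit message; the helper name `read_bool_module_param` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_bool_module_param(module: str, param: str,
                           sysfs_root: Path = Path("/sys/module")) -> Optional[bool]:
    """Return a boolean module parameter, or None if it is not exposed.

    Assumption: the parameter lives at <sysfs_root>/<module>/parameters/<param>.
    """
    param_file = sysfs_root / module / "parameters" / param
    try:
        raw = param_file.read_text().strip()
    except OSError:
        # Module not loaded, parameter absent, or file not readable.
        return None
    # Boolean module parameters are normally reported as "Y" or "N".
    return raw in ("Y", "y", "1")


if __name__ == "__main__":
    print(read_bool_module_param("kvm", "enable_virt_at_load"))
```

A result of `None` means only that the file could not be read; it says nothing about whether virtualization is enabled in hardware.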
Rado Vrbovsky 4b9fce484f Merge: mm: proactive fixes for RHEL-9.6
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5812

JIRA: https://issues.redhat.com/browse/RHEL-27745
JIRA: https://issues.redhat.com/browse/RHEL-15601
JIRA: https://issues.redhat.com/browse/RHEL-28873
JIRA: https://issues.redhat.com/browse/RHEL-54929
JIRA: https://issues.redhat.com/browse/RHEL-61137
JIRA: https://issues.redhat.com/browse/RHEL-62336
JIRA: https://issues.redhat.com/browse/RHEL-66627
JIRA: https://issues.redhat.com/browse/RHEL-66794
JIRA: https://issues.redhat.com/browse/RHEL-66818
JIRA: https://issues.redhat.com/browse/RHEL-66950
JIRA: https://issues.redhat.com/browse/RHEL-66977
JIRA: https://issues.redhat.com/browse/RHEL-68011
JIRA: https://issues.redhat.com/browse/RHEL-68909
JIRA: https://issues.redhat.com/browse/RHEL-69683
JIRA: https://issues.redhat.com/browse/RHEL-70053

CVE: CVE-2023-52490
CVE: CVE-2024-42316
CVE: CVE-2024-50182
CVE: CVE-2024-50199
CVE: CVE-2024-50200
CVE: CVE-2024-50219
CVE: CVE-2024-50228
CVE: CVE-2024-50272
CVE: CVE-2024-53097
CVE: CVE-2024-53105
CVE: CVE-2024-53136

This set proactively brings into RHEL9 core MM code a set of follow-up
fixes as they were pushed into upstream's stable v6.6 LTS branch, but
Mainline commits are backported instead in order to keep it easy to
track the RHEL backports against upstream. Dependencies were also
selectively backported where it made sense to do so, and all the
selected commits are sorted in upstream's topological order.

Omitted-fix: c567f2948f57 ("Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."")
Omitted-fix: 4b944f8ef996 ("Revert "mm/filemap: avoid buffered read/write race to read inconsistent data"")
Omitted-fix: 9d08ec41a064 ("mm: allow set/clear page_type again")
Omitted-fix: cc9bc36ebef7 ("mm: zswap: remove nr_zswap_stored atomic")
Omitted-fix: 0e4008447242 ("zswap: track swapins from disk more accurately")
Omitted-fix: 6359c39c9de6 ("mm: remove unused hugepage for vma_alloc_folio()")
Omitted-fix: 9b5c87d47949 ("mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount")
Omitted-fix: 1390a3334a48 ("mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio")
Omitted-fix: f708f6970cc9 ("mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio")
Omitted-fix: 4de22b2a6a74 ("mm: open-code PageTail in folio_flags() and const_folio_flags()")
Omitted-fix: 6a7de1bf218d ("mm: open-code page_folio() in dump_page()")
Omitted-fix: 40a024b81d1c ("ALSA: core: Drop superfluous no_free_ptr() for memdup_user() errors")
Omitted-fix: 9d197b627e5f ("docs/zh_CN: update the translation of mm/page_table_check.rst")
Omitted-fix: ce8f9fb651fa ("comedi: Flush partial mappings in error case")

Signed-off-by: Rafael Aquini <raquini@redhat.com>

Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: Herton R. Krzesinski <herton@redhat.com>
Approved-by: Jerry Snitselaar <jsnitsel@redhat.com>
Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: John W. Linville <linville@redhat.com>
Approved-by: Mark Langsdorf <mlangsdo@redhat.com>
Approved-by: Jocelyn Falempe <jfalempe@redhat.com>
Approved-by: Lucas Zampieri <lzampier@redhat.com>
Approved-by: Ivan Vecera <ivecera@redhat.com>
Approved-by: Gavin Shan <gshan@redhat.com>
Approved-by: Andrea Claudi <aclaudi@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-12-16 19:49:11 +00:00
Rafael Aquini c8c9c0b259 mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
JIRA: https://issues.redhat.com/browse/RHEL-27745
Conflicts:
  * arch/*/Kconfig: all hunks dropped as there were only text blurbs and comments
     being changed with no functional changes whatsoever, and RHEL9 is missing
     several (unrelated) commits to these arches that transform the text blurbs in
     the way these non-functional hunks were expecting;
  * drivers/accel/qaic/qaic_data.c: hunk dropped due to RHEL-only commit
     083c0cdce2 ("Merge DRM changes from upstream v6.8..v6.9");
  * drivers/gpu/drm/i915/gem/selftests/huge_pages.c: hunk dropped due to RHEL-only
     commit ca8b16c11b ("Merge DRM changes from upstream v6.7..v6.8");
  * drivers/gpu/drm/ttm/tests/ttm_pool_test.c: all hunks dropped due to RHEL-only
     commit ca8b16c11b ("Merge DRM changes from upstream v6.7..v6.8");
  * drivers/video/fbdev/vermilion/vermilion.c: hunk dropped as RHEL9 misses
     commit dbe7e429fe ("vmlfb: framebuffer driver for Intel Vermilion Range");
  * include/linux/pageblock-flags.h: differences due to out-of-order backport
    of upstream commits 72801513b2bf ("mm: set pageblock_order to HPAGE_PMD_ORDER
    in case with !CONFIG_HUGETLB_PAGE but THP enabled"), and 3a7e02c040b1
    ("minmax: avoid overly complicated constant expressions in VM code");
  * mm/mm_init.c: differences on the 3rd, and 4th hunks are due to RHEL
     backport commit 1845b92dcf ("mm: move most of core MM initialization to
     mm/mm_init.c") ignoring the out-of-order backport of commit 3f6dac0fd1b8
     ("mm/page_alloc: make deferred page init free pages in MAX_ORDER blocks")
     thus partially reverting the changes introduced by the latter;

This patch is a backport of the following upstream commit:
commit 5e0a760b44417f7cadd79de2204d6247109558a0
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Thu Dec 28 17:47:04 2023 +0300

    mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER

    commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has
    changed the definition of MAX_ORDER to be inclusive.  This has caused
    issues with code that was not yet upstream and depended on the previous
    definition.

    To draw attention to the altered meaning of the define, rename MAX_ORDER
    to MAX_PAGE_ORDER.

    Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:24:17 -05:00
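The reason the rename above matters is the off-by-one between the old exclusive and the new inclusive definition. The sketch below illustrates it with hypothetical values (11 for the old exclusive limit, 10 for the inclusive one); the real value depends on architecture and configuration, and the helper names are made up for this example.

```python
from typing import List

# Hypothetical values for illustration only.
OLD_MAX_ORDER_EXCLUSIVE = 11  # old meaning: one past the largest allocatable order
MAX_PAGE_ORDER = 10           # new meaning: the largest allocatable order itself


def orders_old(max_order_exclusive: int) -> List[int]:
    """Valid orders under the old, exclusive definition: 0 .. MAX_ORDER - 1."""
    return list(range(max_order_exclusive))


def orders_new(max_page_order: int) -> List[int]:
    """Valid orders under the inclusive definition: 0 .. MAX_PAGE_ORDER."""
    return list(range(max_page_order + 1))


def stale_loop(max_page_order: int) -> List[int]:
    """Old-style loop fed the new value: it silently skips the top order."""
    return list(range(max_page_order))


if __name__ == "__main__":
    print(orders_old(OLD_MAX_ORDER_EXCLUSIVE) == orders_new(MAX_PAGE_ORDER))
    print(stale_loop(MAX_PAGE_ORDER)[-1])
```

Code that kept the old name while the meaning changed would compile cleanly and misbehave like `stale_loop`; a new name turns that into a build failure instead.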
Rado Vrbovsky 191f608532 Merge: PCI: ACS updates
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5246

```
JIRA: https://issues.redhat.com/browse/RHEL-48601

Signed-off-by: Myron Stowe <mstowe@redhat.com>

```

Approved-by: John W. Linville <linville@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-12-09 08:21:20 +00:00
Rado Vrbovsky 492f67b5c3 Merge: [RHEL 9.6] Update core Arm code
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5092

JIRA: https://issues.redhat.com/browse/RHEL-40604
Depends: !5252

Omitted-fix: b8995a184170 Revert "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"
Omitted-fix: f481bb32d60e Reapply "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"
Don't need to revert then revert the revert.

Omitted-fix: cb1a393c40ee mm: add arch hook to validate mmap() prot flags
Omitted-fix: 50e3ed0f93f4 arm64: mm: add support for WXN memory translation attribute
These get reverted.

Omitted-fix: a07a59415217 arm64: smp: avoid NMI IPIs with broken MediaTek FW
Omitted-fix: 4bb49009e071 Revert "arm64: smp: avoid NMI IPIs with broken MediaTek FW"

Backport selected patches through upstream 6.9, including:

- bug fixes
- various cpu feature detection enhancements
- save/restore fpsimd state on context switch
- ARM Cortex-A510 erratum 3117295 workaround
- LPA2 related patch series. Not to enable LPA2 but for cleanup of startup code

Signed-off-by: Mark Salter <msalter@redhat.com>

Approved-by: Gavin Shan <gshan@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-12-09 08:21:13 +00:00
Rado Vrbovsky 05df4237af Merge: USB/TBT code rebase of supported drivers to upstream v6.11
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5592

JIRA: https://issues.redhat.com/browse/RHEL-59051

CVE: CVE-2024-44960
CVE JIRA: https://issues.redhat.com/browse/RHEL-57138

CVE: CVE-2024-46675
CVE JIRA: https://issues.redhat.com/browse/RHEL-64322

This MR rebases supported USB/TBT drivers to upstream kernel v6.11. By
design, changes on this rebase are limited to supported USB/Thunderbolt
drivers and infrastructure. Changes which happen to touch the drivers but
are tree-wide are selectively or partially pulled in, whenever relevant.

Notes:

I) Omits:

Omitted-fix: aefa036be8c2 ("phy: freescale: imx8qm-hsio: Include bitfield.h for FIELD_PREP")
Omitted-fix: 2d6213bd592b ("crypto: spacc - Add ifndef around MIN")
Omitted-fix: b8fc70ab7b5f ("Revert "crypto: spacc - Add SPAcc Skcipher support")
Omitted-fix: bf791751162a ("thunderbolt: Add only on-board retimers when !CONFIG_USB4_DEBUGFS_MARGINING")

II) This MR drops the `rtsx_pci_ms` driver because it became dead code with
commit <c0e5f4e73a71> ("misc: rtsx: Add support for RTS5261") and, as a
consequence, was later dropped by commit <d0f459259c13> ("memstick:
rtsx_pci_ms: Remove Realtek PCI memstick driver"). The latter is being
merged here.

III) This MR also includes minmax updates to fix these build and test errors:

1 - Signedness error:

```
drivers/usb/typec/ucsi/ucsi.c: In function 'ucsi_get_pd_message':
./include/linux/build_bug.h:78:41: error: static assertion failed: "min(bytes, (((con->ucsi)->version < 0x0200) ? 0x10 : 0xff)) signedness error, fix types or consider umin() before min_t()"
   78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
```

2 - ISO C90 error:

```
drivers/scsi/Makefile:196: FORCE prerequisite is missing
lib/vsprintf.c: In function 'resource_string':
lib/vsprintf.c:1068:9: error: ISO C90 forbids variable length array 'sym' [-Werror=vla]
 1068 |         char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
      |         ^~~~
```

3 - Oops on drm_gem_shmem CKI testing:

```
Unable to handle kernel paging request at virtual address ffffffff80000000
...
Internal error: Oops: 0000000096000146 [#1] SMP
...
drm_gem_shmem_test_obj_create_private+0x1cc/0x41c [drm_gem_shmem_test]
...
# drm_gem_shmem_test_obj_create_private: try faulted: last line seen drivers/gpu/drm/tests/drm_gem_shmem_test.c:120
# drm_gem_shmem_test_obj_create_private: internal error occurred preventing test case from running: -4
```

Signed-off-by: Desnes Nunes <desnesn@redhat.com>

Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com>
Approved-by: Bastien Nocera <bnocera@redhat.com>
Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Ivan Vecera <ivecera@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>
Approved-by: Adam Jackson <ajax@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-25 13:17:44 +00:00
Rado Vrbovsky 993b335734 Merge: Update arch/{x86,powerpc,arm64}/mm to v6.6
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5391

JIRA: https://issues.redhat.com/browse/RHEL-55461  
JIRA: https://issues.redhat.com/browse/RHEL-55465  
JIRA: https://issues.redhat.com/browse/RHEL-55462  
Depends: !5252 

Updated the respective arch mm directories to v6.6. Most of the patches  
have already been updated or included by the respective arch teams and by  
Rafael's mm update to v6.6.   
  
Dropped the following to avoid issues with the ppc64le build:  
41b7a347bf14 powerpc: Book3S 64-bit outline-only KASAN support  
c7b9ed7c34a9 powerpc/64e: KASAN Full support for BOOK3E/64  

Omitted-fix: 7bd6680b47fa Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""
Omitted-fix: 7b59e8ae92fe arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices
Omitted-fix: a54b7fa6b9ab arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor
Omitted-fix: 9a5f0b11e49e arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP
Omitted-fix: cd87d9f58439 x86/mm: further clarify switch_mm_irqs_off() documentation
  
Signed-off-by: Audra Mitchell <audra@redhat.com>

Approved-by: Rafael Aquini <raquini@redhat.com>
Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Herton R. Krzesinski <herton@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: Nico Pache <npache@redhat.com>
Approved-by: Lenny Szubowicz <lszubowi@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-12 08:02:20 +00:00
Desnes Nunes f10fe0c8b9 usb-storage: Optimize scan delay more precisely
JIRA: https://issues.redhat.com/browse/RHEL-59051

commit 804da867ad016d53bf33373cfeaae041775455f1
Author: Norihiko Hama <Norihiko.Hama@alpsalpine.com>
Date: Wed, 15 May 2024 09:43:39 +0900

  The storage scan delay was already reduced by the following old commit:

  a4a47bc03f ("Lower USB storage settling delay to something more reasonable")

  As a result, the delay is at least one second, or zero with delay_use=0.
  One second is still a long delay, especially for embedded systems, but
  with delay_use=0 (no delay) errors are still observed on some USB drives.

  So delay_use should not be set to 0, yet one second is quite long.
  For embedded systems in particular, how quickly the end user can access
  a USB drive after connecting it matters, so there is room to shorten
  this constant delay.

  This patch makes the scan delay more precise, minimizing the delay
  without causing problems on USB drives, by handling the module parameter
  'delay_use' in milliseconds internally.
  The parameter 'delay_use' is optionally interpreted in milliseconds
  if its value ends with 'ms'.
  This reduces the representable range to 1/1000 of the internal 32-bit
  value, but that is still enough to set the delay time.
  By default, the delay is one second for backward compatibility.

  For example, delay_use=100ms, i.e. a 100 millisecond delay, appears to
  work without issues for most USB pen drives.

  Signed-off-by: Norihiko Hama <Norihiko.Hama@alpsalpine.com>
  Link: https://lore.kernel.org/r/20240515004339.29892-1-Norihiko.Hama@alpsalpine.com
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Desnes Nunes <desnesn@redhat.com>
2024-11-07 23:01:28 -03:00
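The `delay_use` handling described in the usb-storage commit above can be sketched as a small parser. This is a minimal illustration that covers only the two forms the commit message mentions (a bare number meaning seconds, and a number with an 'ms' suffix); the function name `parse_delay_use` is hypothetical and any other syntax the driver may accept is not modelled.

```python
def parse_delay_use(value: str) -> int:
    """Return the scan delay in milliseconds.

    Assumption: a bare integer means seconds, an integer followed by 'ms'
    means milliseconds. Anything else is rejected in this sketch.
    """
    text = value.strip()
    if text.endswith("ms"):
        number, scale = text[:-2], 1
    else:
        number, scale = text, 1000
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"invalid delay_use value: {value!r}")
    return int(number) * scale


if __name__ == "__main__":
    for raw in ("1", "0", "100ms"):
        print(raw, parse_delay_use(raw))
```

Storing the value in milliseconds is what shrinks the usable range by a factor of 1000 compared with storing whole seconds in the same 32-bit field.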
Jerry Snitselaar ceab946260 iommu/amd: Add kernel parameters to limit V1 page-sizes
JIRA: https://issues.redhat.com/browse/RHEL-61942
Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit f0295913c4b4f377c454e06f50c1a04f2f80d9df
Author: Joerg Roedel <jroedel@suse.de>
Date:   Thu Sep 5 09:22:40 2024 +0200

    iommu/amd: Add kernel parameters to limit V1 page-sizes

    Add two new kernel command line parameters to limit the page-sizes
    used for v1 page-tables:

	    nohugepages     - Limits page-sizes to 4KiB

	    v2_pgsizes_only - Limits page-sizes to 4KiB/2MiB/1GiB; the
			      same as the sizes used with v2 page-tables

    This is needed for multiple scenarios. When assigning devices to
    SEV-SNP guests the IOMMU page-sizes need to match the sizes in the RMP
    table, otherwise the device will not be able to access all shared
    memory.

    Also, some ATS devices do not work properly with arbitrary IO
    page-sizes as supported by AMD-Vi, so limiting the sizes used by the
    driver is a suitable workaround.

    All-in-all, these parameters are only workarounds until the IOMMU core
    and related APIs gather the ability to negotiate the page-sizes in a
    better way.

    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
    Link: https://lore.kernel.org/r/20240905072240.253313-1-joro@8bytes.org

(cherry picked from commit f0295913c4b4f377c454e06f50c1a04f2f80d9df)
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
2024-11-04 08:57:30 -07:00
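The effect of the two iommu/amd options above can be pictured as narrowing a page-size bitmap in which bit N set means a page size of 2^N bytes is allowed. The sketch below is illustrative only: the unrestricted mask (every power of two from 4 KiB upward in a 64-bit word) and all names are assumptions, not taken from the driver.

```python
from typing import Optional

SZ_4K = 1 << 12
SZ_2M = 1 << 21
SZ_1G = 1 << 30

# Assumption: with no limit, every power-of-two size >= 4 KiB is permitted.
ALL_V1_PGSIZES = ((1 << 64) - 1) & ~(SZ_4K - 1)


def v1_pgsize_bitmap(option: Optional[str] = None) -> int:
    """Return the allowed page-size bitmap for a given limiting option."""
    if option is None:
        return ALL_V1_PGSIZES
    if option == "nohugepages":
        return SZ_4K
    if option == "v2_pgsizes_only":
        return SZ_4K | SZ_2M | SZ_1G
    raise ValueError(f"unknown option: {option!r}")


def size_allowed(bitmap: int, size: int) -> bool:
    """True if size is a power of two whose bit is set in the bitmap."""
    return (size & (size - 1)) == 0 and bool(bitmap & size)


if __name__ == "__main__":
    print(size_allowed(v1_pgsize_bitmap("nohugepages"), SZ_2M))
    print(size_allowed(v1_pgsize_bitmap("v2_pgsizes_only"), SZ_2M))
```

An intermediate size such as 64 KiB passes the unrestricted mask but is rejected under either option, which is the behaviour the commit relies on for SEV-SNP guests and problematic ATS devices.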
Audra Mitchell d14eefe788 powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults
JIRA: https://issues.redhat.com/browse/RHEL-55462

This patch is a backport of the following upstream commit:
commit 6b34a099faa123488b13caf704562f4dbe483fc4
Author: Nicholas Piggin <npiggin@gmail.com>
Date:   Mon Oct 24 13:01:50 2022 +1000

    powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults

    This option increases the number of hash misses by limiting the number
    of kernel HPT entries, by keeping a per-CPU record of the last kernel
    HPTEs installed, and removing that from the hash table on the next hash
    insertion. A timer round-robins CPUs removing remaining kernel HPTEs and
    clearing the TLB (in the case of bare metal) to increase and slightly
    randomise kernel fault activity.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    [mpe: Add comment about NR_CPUS usage, fixup whitespace]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20221024030150.852517-1-npiggin@gmail.com

Signed-off-by: Audra Mitchell <audra@redhat.com>
2024-11-04 09:14:16 -05:00
Rado Vrbovsky 8d10957dfa Merge: kvm/aarch64: rhel9.6 rebase upto 6.11
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5430

JIRA: https://issues.redhat.com/browse/RHEL-57113

Upstream Status: up to v6.11 and fixes up to v6.12-rc5 \
Tested: kvm-unit-tests, kselftest, migration test.

This is the first round of the kvm-arm rebase up to v6.11, which contains the following series:
1. KVM: arm64: pKVM host proxy FF-A fixes (part of them)
2. KVM: arm64: nv: Shadow stage-2 page table handling
3. KVM: arm64: Allow userspace to modify CTR_EL0
4. KVM: arm64: nv: FPSIMD/SVE, plus some other CPTR goodies
5. KVM: arm64: fix warnings in W=1 build
6. Misc commits

Besides that, it also takes the fix commit `4155539bc5ba ("KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set")`, which is upstream as of v6.12-rc1.

* 42fb33dde42b KVM: arm64: Use FF-A 1.1 with pKVM \
This commit belongs to series 1; it is not picked because downstream doesn't support FF-A 1.1 (the related upstream commit is `1609626c32c4 ("firmware: arm_ffa: Update the FF-A command list with v1.1 additions")`).

The `KVM: arm64: Fix handling of TCR2_EL1` series could be taken by the kvm-arm rebase, but since it depends on the arm64 rebase, it will be picked in the second round once the arm64 rebase is merged.

* 838d992b8448 KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() \
Don't pick this commit since downstream doesn't support bitmap_gather().

Changelog: \
v2 -> v3: \
Add commits:
* eb9d53d4a949 KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity
* cb52b5c8b81b Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity"
* 810ecbefdd54 KVM: Documentation: Correct the VGIC V2 CPU interface addr space size
* 03bd36a387b8 KVM: Documentation: Enumerate allowed value macros of irq_type
* ae8f8b376102 KVM: arm64: Unregister redistributor for failed vCPU creation
* c6c167afa090 KVM: arm64: Fix shift-out-of-bounds bug
* 78a005555500 KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration

v1 -> v2: \
Add these two commits to avoid conflicts when backporting `894376385a2d KVM: arm64: Add support for FFA_PARTITION_INFO_GET`.
* 3fad96e9b21b ("firmware: arm_ffa: Declare ffa_bus_type structure in the header")
* 989e8661dc45 ("firmware: arm_ffa: Make ffa_bus_type const")

Add commits:
* b26e484b8bb3 ("arm64: Add CFI error handling")
* 7a928b32f1de arm64: Introduce esr_brk_comment, esr_is_cfi_brk
* 8f3873a39529 KVM: arm64: Introduce print_nvhe_hyp_panic helper
* eca4ba5b6dff KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2

Add commits:
* f26a525b77e0 KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
* a1d402abf8e3 KVM: arm64: Fix kvm_has_feat'*'() handling of negative features
* 78fee4198bb4 KVM: arm64: Fix __pkvm_init_vcpu cptr_el2 error path
* a9f41588a902 KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM
* dc0dddb1d66d KVM: arm64: Invalidate EL1&0 TLB entries for all VMIDs in nvhe hyp init
* ed49fe5a6fb9 KVM: arm64: Ensure TLBI uses correct VMID after changing context
* e0b7de4fd18c KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging 
* ae41d7dbaeb4 KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE
* 38753cbc4dca KVM: arm64: Move data barrier to end of split walk


Signed-off-by: Shaoqin Huang <shahuang@redhat.com>

Approved-by: Gavin Shan <gshan@redhat.com>
Approved-by: Sebastian Ott <sebott@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-11-01 08:10:52 +00:00
Mark Salter 90e536e7f6 arm64: Add the arm64.no32bit_el0 command line option
JIRA: https://issues.redhat.com/browse/RHEL-40604

commit 1279e8d0dcead53cf1f51e926a1cf6d2a79332d6
Author: Andrea della Porta <andrea.porta@suse.com>
Date: Mon, 29 Apr 2024 12:28:33 +0200

    Introduce the field 'el0' to the idreg-override for register
    ID_AA64PFR0_EL1. This field is also aliased to the new kernel
    command line option 'arm64.no32bit_el0' as a more recognizable
    and mnemonic name to disable the execution of 32-bit userspace
    applications (i.e. avoid the AArch32 execution state in EL0) from
    the kernel command line.

    Link: https://lore.kernel.org/all/20240207105847.7739-1-andrea.porta@suse.com/
    Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
    Link: https://lore.kernel.org/r/20240429102833.6426-1-andrea.porta@suse.com
    Signed-off-by: Will Deacon <will@kernel.org>

Signed-off-by: Mark Salter <msalter@redhat.com>
2024-10-31 10:42:52 -04:00
Shaoqin Huang 282b3b61c1 KVM: arm64: Add early_param to control WFx trapping
JIRA: https://issues.redhat.com/browse/RHEL-57113

Conflicts:
- Documentation/admin-guide/kernel-parameters.txt
Contextual conflicts due to missing commit
600716592a3a ("doc: Add EARLY flag to early-parsed kernel boot parameters").

commit 0b5afe05377d7993f19292bf49dd13e959000790
Author: Colton Lewis <coltonlewis@google.com>
Date:   Thu May 23 17:40:55 2024 +0000

    KVM: arm64: Add early_param to control WFx trapping

    Add early_params to control WFI and WFE trapping. This is to
    control the degree to which guests can wait for interrupts on their own without
    being trapped by KVM. Options for each param are trap and notrap. trap
    enables the trap. notrap disables the trap. Note that when enabled,
    traps are allowed but not guaranteed by the CPU architecture. Absent
    an explicitly set policy, default to current behavior: disabling the
    trap if only a single task is running and enabling otherwise.
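
    The policy described above can be sketched as a small decision
    function. This is an illustrative Python model, not the kernel code:
    the name should_trap, its arguments, and the use of None for "no
    policy given" are all hypothetical.

```python
from typing import Optional


def should_trap(policy: Optional[str], single_task_running: bool) -> bool:
    """Model the WFx trap decision described in the commit message.

    policy is "trap", "notrap", or None when no policy was set on the
    command line (a hypothetical encoding chosen for this sketch).
    """
    if policy == "trap":
        return True
    if policy == "notrap":
        return False
    if policy is None:
        # Default behavior: disable the trap only when a single task is
        # running, enable it otherwise.
        return not single_task_running
    raise ValueError(f"unknown policy: {policy!r}")
```

    Even when this model returns True, the message notes that the
    architecture allows the trap but does not guarantee it.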

    Signed-off-by: Colton Lewis <coltonlewis@google.com>
    Reviewed-by: Jing Zhang <jingzhangos@google.com>
    Link: https://lore.kernel.org/r/20240523174056.1565133-1-coltonlewis@google.com
    [ oliver: rework kvm_vcpu_should_clear_tw*() for readability ]
    Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
2024-10-28 04:37:46 -04:00
Rado Vrbovsky d2bd7080ef Merge: Sched: Updates and fixes for 9.6
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5250

JIRA: https://issues.redhat.com/browse/RHEL-56494
  
JIRA: https://issues.redhat.com/browse/RHEL-57142

CVE: CVE-2024-44958

Tested: Ran scheduler tests and general stress testing. Have asked  
perf QE for sanity tests.   

Omitted-fix: c049acee3c71 ("selftests/ftrace: Fix test to handle both old and new kernels"): Somewhat out of scope for this MR and should not need to run test against old kernels in RHEL. 

Series of scheduler related fixes and updates, up to v6.11. A large  
number of these are refactoring (making naming consistent, breaking out  
code into new files etc) with no functional changes. Otherwise, primarily  
bug fixes and cleanups, no real feature additions.   
  
Signed-off-by: Phil Auld <pauld@redhat.com>

Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Mark Langsdorf <mlangsdo@redhat.com>
Approved-by: Juri Lelli <juri.lelli@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-10-25 16:52:35 +00:00
Rado Vrbovsky 16bf54f108 Merge: Fix RCUC latency issue
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5165

JIRA: https://issues.redhat.com/browse/RHEL-20288

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Signed-off-by: Leonardo Bras <leobras@redhat.com>

Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: Waiman Long <longman@redhat.com>
Approved-by: Marcelo Tosatti <mtosatti@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-10-25 16:26:53 +00:00
Rado Vrbovsky d30d477e21 Merge: rcu: Backport upstream RCU commits up to v6.10
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5074

JIRA: https://issues.redhat.com/browse/RHEL-55557    

This MR backports upstream RCU commits up to v6.10 with relevant bug
fixes, if applicable.

Signed-off-by: Waiman Long <longman@redhat.com>

Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>
2024-10-25 16:11:27 +00:00
Leonardo Bras 483ecb54c6 rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter
JIRA: https://issues.redhat.com/browse/RHEL-20288

commit 68d124b0999919015e6d23008eafea106ec6bb40
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   2024-05-08 20:11:58 -0700

    rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter

    If a CPU is running either a userspace application or a guest OS in
    nohz_full mode, it is possible for a system call to occur just as an
    RCU grace period is starting.  If that CPU also has the scheduling-clock
    tick enabled for any reason (such as a second runnable task), and if the
    system was booted with rcutree.use_softirq=0, then RCU can add insult to
    injury by awakening that CPU's rcuc kthread, resulting in yet another
    task and yet more OS jitter due to switching to that task, running it,
    and switching back.

    In addition, in the common case where that system call is not of
    excessively long duration, awakening the rcuc task is pointless.
    This pointlessness is due to the fact that the CPU will enter an extended
    quiescent state upon returning to the userspace application or guest OS.
    In this case, the rcuc kthread cannot do anything that the main RCU
    grace-period kthread cannot do on its behalf, at least if it is given
    a few additional milliseconds (for example, given the time duration
    specified by rcutree.jiffies_till_first_fqs, give or take scheduling
    delays).

    This commit therefore adds a rcutree.nohz_full_patience_delay kernel
    boot parameter that specifies the grace period age (in milliseconds,
    rounded to jiffies) before which RCU will refrain from awakening the
    rcuc kthread.  Preliminary experimentation suggests a value of 1000,
    that is, one second.  Increasing rcutree.nohz_full_patience_delay will
    increase grace-period latency and in turn increase memory footprint,
    so systems with constrained memory might choose a smaller value.
    Systems with less-aggressive OS-jitter requirements might choose the
    default value of zero, which keeps the traditional immediate-wakeup
    behavior, thus avoiding increases in grace-period latency.
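
    As a rough model of the rule above, the following Python sketch
    converts the delay from milliseconds to jiffies and decides whether
    the rcuc wakeup would be deferred. The helper names, the HZ value of
    250, and the round-up conversion are assumptions made for
    illustration; the kernel's own implementation is in C and may differ
    in detail.

```python
def msecs_to_jiffies(msecs: int, hz: int) -> int:
    """Convert milliseconds to jiffies, rounding up (assumed rounding)."""
    return (msecs * hz + 999) // 1000


def defer_rcuc_wakeup(gp_age_jiffies: int, patience_delay_ms: int,
                      hz: int = 250) -> bool:
    """Return True while the grace period is younger than the delay.

    With the default delay of zero this never defers, which matches the
    traditional immediate-wakeup behavior described in the message.
    """
    delay_jiffies = msecs_to_jiffies(patience_delay_ms, hz)
    return gp_age_jiffies < delay_jiffies
```

    With hz=250 and a delay of 1000 ms, wakeups are deferred for the
    first 250 jiffies of a grace period.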

    [ paulmck: Apply Leonardo Bras feedback.  ]

    Link: https://lore.kernel.org/all/20240328171949.743211-1-leobras@redhat.com/

    Reported-by: Leonardo Bras <leobras@redhat.com>
    Suggested-by: Leonardo Bras <leobras@redhat.com>
    Suggested-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Leonardo Bras <leobras@redhat.com>

Signed-off-by: Leonardo Bras <leobras@redhat.com>
2024-10-08 18:52:03 -03:00
Phil Auld d3ffc226fc sched/core: Drop spinlocks on contention iff kernel is preemptible
JIRA: https://issues.redhat.com/browse/RHEL-56494
Conflicts: Minor context differences.

commit c793a62823d1ce8f70d9cfc7803e3ea436277cda
Author: Sean Christopherson <seanjc@google.com>
Date:   Mon May 27 17:34:48 2024 -0700

    sched/core: Drop spinlocks on contention iff kernel is preemptible

    Use preempt_model_preemptible() to detect a preemptible kernel when
    deciding whether or not to reschedule in order to drop a contended
    spinlock or rwlock.  Because PREEMPT_DYNAMIC selects PREEMPTION, kernels
    built with PREEMPT_DYNAMIC=y will yield contended locks even if the live
    preemption model is "none" or "voluntary".  In short, make kernels with
    dynamically selected models behave the same as kernels with statically
    selected models.
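
    The behavioral change can be summarized with two predicates. This is
    a simplified Python model with hypothetical names: it treats
    preempt_model_preemptible() as "the live model is full" and ignores
    real-time kernels.

```python
def yields_before(built_with_preemption: bool, live_model: str) -> bool:
    """Old check: only the build-time CONFIG_PREEMPTION setting matters.

    PREEMPT_DYNAMIC selects PREEMPTION, so the live model is ignored.
    """
    return built_with_preemption


def yields_after(live_model: str) -> bool:
    """New check: follow the live preemption model (simplified)."""
    return live_model == "full"


# A PREEMPT_DYNAMIC kernel booted with preempt=none:
old = yields_before(True, "none")   # contended locks were yielded
new = yields_after("none")          # now behaves like a static "none" kernel
```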

    Somewhat counter-intuitively, NOT yielding a lock can provide better
    latency for the relevant tasks/processes.  E.g. KVM x86's mmu_lock, a
    rwlock, is often contended between an invalidation event (takes mmu_lock
    for write) and a vCPU servicing a guest page fault (takes mmu_lock for
    read).  For _some_ setups, letting the invalidation task complete even
    if there is mmu_lock contention provides lower latency for *all* tasks,
    i.e. the invalidation completes sooner *and* the vCPU services the guest
    page fault sooner.

    But even KVM's mmu_lock behavior isn't uniform, e.g. the "best" behavior
    can vary depending on the host VMM, the guest workload, the number of
    vCPUs, the number of pCPUs in the host, why there is lock contention, etc.

    In other words, simply deleting the CONFIG_PREEMPTION guard (or doing the
    opposite and removing contention yielding entirely) needs to come with a
    big pile of data proving that changing the status quo is a net positive.

    Opportunistically document this side effect of preempt=full, as yielding
    contended spinlocks can have significant, user-visible impact.

    Fixes: c597bfddc9e9 ("sched: Provide Kconfig support for default dynamic preempt mode")
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>
    Reviewed-by: Chen Yu <yu.c.chen@intel.com>
    Link: https://lore.kernel.org/kvm/ef81ff36-64bb-4cfe-ae9b-e3acf47bff24@proxmox.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2024-09-23 13:33:03 -04:00
Phil Auld 14a470e760 sched/pelt: Remove shift of thermal clock
JIRA: https://issues.redhat.com/browse/RHEL-56494

commit 97450eb909658573dcacc1063b06d3d08642c0c1
Author: Vincent Guittot <vincent.guittot@linaro.org>
Date:   Tue Mar 26 10:16:16 2024 +0100

    sched/pelt: Remove shift of thermal clock

    The optional shift of the clock used by thermal/hw load avg was
    introduced to handle the case where the signal was not always a high frequency
    hw signal. Now that cpufreq provides a signal for firmware and
    SW pressure, we can remove this exception and always keep this PELT signal
    aligned with other signals.
    Mark sysctl_sched_migration_cost boot parameter as deprecated

    Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Tested-by: Lukasz Luba <lukasz.luba@arm.com>
    Reviewed-by: Qais Yousef <qyousef@layalina.io>
    Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
    Link: https://lore.kernel.org/r/20240326091616.3696851-6-vincent.guittot@linaro.org

Signed-off-by: Phil Auld <pauld@redhat.com>
2024-09-23 13:33:02 -04:00
Myron Stowe 4eeffc7615 PCI: Extend ACS configurability
JIRA: https://issues.redhat.com/browse/RHEL-48601
Upstream Status: 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25

commit 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25
Author: Vidya Sagar <vidyas@nvidia.com>
Date:   Tue Jun 25 21:01:50 2024 +0530

    PCI: Extend ACS configurability

    PCIe ACS settings control the level of isolation and the possible P2P paths
    between devices. With greater isolation the kernel will create smaller
    iommu_groups and with less isolation there is more HW that can achieve P2P
    transfers. From a virtualization perspective all devices in the same
    iommu_group must be assigned to the same VM as they lack security
    isolation.

    There is no way for the kernel to automatically know the correct ACS
    settings for any given system and workload. Existing command line options
    (e.g., disable_acs_redir) allow only for large scale change, disabling all
    isolation, but this is not sufficient for more complex cases.

    Add a kernel command-line option 'config_acs' to directly control all the
    ACS bits for specific devices, which allows the operator to setup the right
    level of isolation to achieve the desired P2P configuration.  The
    definition is future proof; when new ACS bits are added to the spec the
    open syntax can be extended.
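
    The commit message does not spell out the option syntax. The sketch
    below assumes the flag-string form described in
    kernel-parameters.txt, as best recalled here: one character per ACS
    control bit, with the rightmost character mapping to bit 0, where
    '0' forces the bit off, '1' forces it on, and 'x' leaves it
    unchanged. Treat that encoding and the helper names as assumptions
    to verify against the documentation of the kernel in use.

```python
from typing import Tuple


def parse_acs_flags(flags: str) -> Tuple[int, int]:
    """Return (mask, value) for a flag string such as "10x".

    mask has a bit set for every position that is forced to 0 or 1;
    value holds the forced-on bits. Positions marked 'x' are left out
    of the mask and therefore stay unchanged.
    """
    mask = 0
    value = 0
    for bit, ch in enumerate(reversed(flags)):
        if ch == "x":
            continue
        if ch not in "01":
            raise ValueError(f"invalid ACS flag character: {ch!r}")
        mask |= 1 << bit
        if ch == "1":
            value |= 1 << bit
    return mask, value


def apply_acs_flags(ctrl: int, flags: str) -> int:
    """Apply the flag string to a current ACS control register value."""
    mask, value = parse_acs_flags(flags)
    return (ctrl & ~mask) | value
```

    Under these assumptions, "10x" forces bit 2 on, forces bit 1 off,
    and keeps whatever firmware left in bit 0.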

    ACS needs to be setup early in the kernel boot as the ACS settings affect
    how iommu_groups are formed. iommu_group formation is a one time event
    during initial device discovery, so changing ACS bits after kernel boot can
    result in an inaccurate view of the iommu_groups compared to the current
    isolation configuration.

    ACS applies to PCIe Downstream Ports and multi-function devices.  The
    default ACS settings are strict and deny any direct traffic between two
    functions. This results in the smallest iommu_group the HW can support.
    Frequently these values result in slow or non-working P2PDMA.

    ACS offers a range of security choices controlling how traffic is
    allowed to go directly between two devices. Some popular choices:

      - Full prevention

      - Translated requests can be direct, with various options

      - Asymmetric direct traffic, A can reach B but not the reverse

      - All traffic can be direct

    Along with some other less common ones for special topologies.

    The intention is that this option would be used with expert knowledge of
    the HW capability and workload to achieve the desired configuration.

    Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com
    Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
    [bhelgaas: add example, tidy printk formats]
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-09-19 14:13:25 -06:00
Thomas Huth ba994843de docs: move s390 under arch
JIRA: https://issues.redhat.com/browse/RHEL-54248

commit 37002bc6b6039e1491140869c6801e0a2deee43e
Author: Costa Shulyupin <costa.shul@redhat.com>
Date:   Tue Jul 18 07:55:02 2023 +0300

    docs: move s390 under arch

    and fix all in-tree references.

    Architecture-specific documentation is being moved into Documentation/arch/
    as a way of cleaning up the top-level documentation directory and making
    the docs hierarchy more closely match the source hierarchy.

    Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
    Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
    Acked-by: Jonathan Corbet <corbet@lwn.net>
    Acked-by: Heiko Carstens <hca@linux.ibm.com>
    Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
    Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
    Link: https://lore.kernel.org/r/20230718045550.495428-1-costa.shul@redhat.com
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

Conflicts:
    Documentation/admin-guide/kernel-parameters.txt
    Documentation/arch/index.rst
    MAINTAINERS
    (contextual conflicts due to missing other patches in downstream)
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-06 17:33:51 +02:00
Thomas Huth 1bbaf4f572 s390/con3215: Drop console data printout when buffer full
JIRA: https://issues.redhat.com/browse/RHEL-54248

commit 1f3307cf3aac88763077fac90404f2c57bc5181a
Author: Thomas Richter <tmricht@linux.ibm.com>
Date:   Tue Sep 20 14:26:16 2022 +0200

    s390/con3215: Drop console data printout when buffer full

    Using z/VM the 3270 terminal emulator also emulates an IBM 3215 console
    which outputs line by line. When the screen is full, the console enters
    the MORE... state and waits for the operator to confirm the data
    on the screen by pressing a clear key. If this does not happen in the
    default time frame (currently 50 seconds) the console enters the HOLDING
    state.
    It then waits another time frame (currently 10 seconds) before the output
    continues on the next screen. When the operator presses the clear key
    during these wait times, the output continues immediately.

    This may lead to a very long boot time when the console
    has to print many messages. The system may also hang because of the
    console's limited buffer space while it waits for the console
    output to drain and finally to finish. This problem can only occur
    when a terminal emulator is actually connected to the 3215 console
    driver. If not, z/VM simply drops console output.

    Remedy this rare situation and add a kernel boot command line parameter
    con3215_drop. It can be set to 0 (do not drop) or 1 (do drop), which is
    the default. This instructs the kernel to drop console data when the
    console buffer is full. This speeds up the boot time considerably and
    no longer hangs the system.

    Add a sysfs attribute file for console IBM 3215 named con_drop.
    This allows for changing the behavior after the boot, for example when
    during interactive debugging a panic/crash is expected.

    Here is a test of the new behavior using the following test program:
     #!/bin/bash
     declare -i cnt=4

     mode=$(cat /sys/bus/ccw/drivers/3215/con_drop)
     [ "$mode" = yes ] && cnt=25

     echo "cons_drop $(cat /sys/bus/ccw/drivers/3215/con_drop)"
     echo "vmcp term more 5 2"
     vmcp term more 5 2
     echo "Run $cnt iterations of "'echo t > /proc/sysrq-trigger'

     for i in $(seq $cnt)
     do
            echo "$i. command 'echo t > /proc/sysrq-trigger' at $(date +%F,%T)"
            echo t > /proc/sysrq-trigger
            sleep 1
     done
     echo "droptest done" > /dev/kmsg
     #

    Output with sysfs attribute con_drop set to 1:
     # ./droptest.sh
     cons_drop yes
     vmcp term more 5 2
     Run 25 iterations of echo t > /proc/sysrq-trigger
     1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:09
     2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:10
     3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:11
     4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:12
     5. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:13
     6. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:14
     7. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:15
     8. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:16
     9. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:17
     10. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:18
     11. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:19
     12. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:20
     13. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:21
     14. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:22
     15. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:23
     16. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:24
     17. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:25
     18. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:26
     19. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:27
     20. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:28
     21. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:29
     22. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:30
     23. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:31
     24. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:32
     25. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:33
     #

    There are no hangs anymore.

    Output with sysfs attribute con_drop set to 0 and identical
    setting for z/VM console 'term more 5 2'. Sometimes hitting the
    clear key at the x3270 console to progress output.

     # ./droptest.sh
     cons_drop no
     vmcp term more 5 2
     Run 4 iterations of echo t > /proc/sysrq-trigger
     1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:20:58
     2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:24:32
     3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:28:04
     4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:31:37
     #

    Details:
    Enable function raw3215_write() to handle tab expansion and newlines
    and feed it with input not larger than the console buffer of 65536
    bytes. Function raw3215_putchar() just forwards its character for
    output to raw3215_write().

    This moves tab to blank conversion to one function raw3215_write()
    which also does call raw3215_make_room() to wait for enough free
    buffer space.

    Function handle_write() loops over all its input and segments input
    into chunks of console buffer size (should the input be larger).

    Rework tab expansion handling logic to avoid code duplication.

    Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
    Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
    Acked-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-06 17:33:35 +02:00
David Arcari 3531f1645e x86/cpu: Detect real BSP on crash kernels
JIRA: https://issues.redhat.com/browse/RHEL-43147

commit 5c5682b9f87a3b7bd4833884f300ec673685f6a6
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Feb 13 22:05:54 2024 +0100

    x86/cpu: Detect real BSP on crash kernels

    When a kdump kernel is started from a crashing CPU then there is no
    guarantee that this CPU is the real boot CPU (BSP). If the kdump kernel
    tries to online the BSP then the INIT sequence will reset the machine.

    There is a command line option to prevent this, but in case of nested kdump
    kernels this is wrong.

    But that command line option is not required at all because the real
    BSP is enumerated as the first CPU by firmware. Support for the only
    known system which was different (Voyager) got removed long ago.

    Detect whether the boot CPU APIC ID is the first APIC ID enumerated by
    the firmware. If the first APIC ID enumerated is not matching the boot
    CPU APIC ID then skip registering it.
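
    The detection rule can be illustrated with a short Python model. It
    is a sketch only: the function name and the list-of-integers
    representation of the firmware enumeration are hypothetical, and the
    kernel performs this check while parsing the firmware tables.

```python
from typing import List


def apic_ids_to_register(enumerated: List[int], boot_apic_id: int) -> List[int]:
    """Return the APIC IDs that would be registered.

    The first enumerated APIC ID is taken to be the real BSP. If the
    CPU we booted on is a different one (a kdump kernel started from a
    crashing CPU), the real BSP is skipped so it is never onlined.
    """
    if not enumerated:
        return []
    real_bsp = enumerated[0]
    if real_bsp != boot_apic_id:
        return enumerated[1:]
    return list(enumerated)
```

    For example, with enumeration order [0, 2, 4, 6] and a kdump kernel
    running on APIC ID 4, the result is [2, 4, 6].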

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Michael Kelley <mhklinux@outlook.com>
    Tested-by: Sohil Mehta <sohil.mehta@intel.com>
    Link: https://lore.kernel.org/r/20240213210252.348542071@linutronix.de

Signed-off-by: David Arcari <darcari@redhat.com>
2024-08-29 08:19:49 -04:00
Waiman Long e62041bc08 rcu: Reduce synchronize_rcu() latency
JIRA: https://issues.redhat.com/browse/RHEL-55557

commit 988f569ae041ccc93a79d98d1b0043dff4d7e9b7
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date:   Fri, 8 Mar 2024 18:34:05 +0100

    rcu: Reduce synchronize_rcu() latency

    A call to a synchronize_rcu() can be optimized from a latency
    point of view. Workloads which depend on this can benefit from it.

    The delay of wakeme_after_rcu() callback, which unblocks a waiter,
    depends on several factors:

    - how fast a process of offloading is started. Combination of:
        - !CONFIG_RCU_NOCB_CPU/CONFIG_RCU_NOCB_CPU;
        - !CONFIG_RCU_LAZY/CONFIG_RCU_LAZY;
        - other.
    - when started, invoking path is interrupted due to:
        - time limit;
        - need_resched();
        - if limit is reached.
    - where in a nocb list it is located;
    - how fast previous callbacks completed;

    Example:

    1. On our embedded devices I can easily trigger the scenario where
    it is the last in the list out of ~3600 callbacks:

    <snip>
      <...>-29      [001] d..1. 21950.145313: rcu_batch_start: rcu_preempt CBs=3613 bl=28
    ...
      <...>-29      [001] ..... 21950.152578: rcu_invoke_callback: rcu_preempt rhp=00000000b2d6dee8 func=__free_vm_area_struct.cfi_jt
      <...>-29      [001] ..... 21950.152579: rcu_invoke_callback: rcu_preempt rhp=00000000a446f607 func=__free_vm_area_struct.cfi_jt
      <...>-29      [001] ..... 21950.152580: rcu_invoke_callback: rcu_preempt rhp=00000000a5cab03b func=__free_vm_area_struct.cfi_jt
      <...>-29      [001] ..... 21950.152581: rcu_invoke_callback: rcu_preempt rhp=0000000013b7e5ee func=__free_vm_area_struct.cfi_jt
      <...>-29      [001] ..... 21950.152582: rcu_invoke_callback: rcu_preempt rhp=000000000a8ca6f9 func=__free_vm_area_struct.cfi_jt
      <...>-29      [001] ..... 21950.152583: rcu_invoke_callback: rcu_preempt rhp=000000008f162ca8 func=wakeme_after_rcu.cfi_jt
      <...>-29      [001] d..1. 21950.152625: rcu_batch_end: rcu_preempt CBs-invoked=3612 idle=....
    <snip>

    2. We use cpuset/cgroup to classify tasks and assign them into
    different cgroups. For example, the "background" group binds tasks
    only to little CPUs, while "foreground" makes use of all CPUs.
    Tasks can be migrated between groups by a request if an acceleration
    is needed.

    See below an example of how the "surfaceflinger" task gets migrated.
    Initially it is located in the "system-background" cgroup, which
    allows it to run only on little cores. In order to speed it up it
    can be temporarily moved into the "foreground" cgroup, which allows
    it to use big/all CPUs:

    cgroup_attach_task():
     -> cgroup_migrate_execute()
       -> cpuset_can_attach()
         -> percpu_down_write()
           -> rcu_sync_enter()
             -> synchronize_rcu()
       -> now move tasks to the new cgroup.
     -> cgroup_migrate_finish()

    <snip>
             rcuop/1-29      [000] .....  7030.528570: rcu_invoke_callback: rcu_preempt rhp=00000000461605e0 func=wakeme_after_rcu.cfi_jt
        PERFD-SERVER-1855    [000] d..1.  7030.530293: cgroup_attach_task: dst_root=3 dst_id=22 dst_level=1 dst_path=/foreground pid=1900 comm=surfaceflinger
       TimerDispatch-2768    [002] d..5.  7030.537542: sched_migrate_task: comm=surfaceflinger pid=1900 prio=98 orig_cpu=0 dest_cpu=4
    <snip>

    "Boosting a task" depends on synchronize_rcu() latency:

    - first trace shows a completion of synchronize_rcu();
    - second shows attaching a task to a new group;
    - last shows a final step when migration occurs.

    3. To address this drawback, maintain a separate track that consists
    of synchronize_rcu() callers only. After completion of a grace period
    users are deferred to a dedicated worker to process requests.

    4. This patch reduces the latency of synchronize_rcu() by approximately
    30-40% on synthetic tests. The real test case, camera launch time,
    shows (time is in milliseconds):

    1-run 542 vs 489 improvement 9%
    2-run 540 vs 466 improvement 13%
    3-run 518 vs 468 improvement 9%
    4-run 531 vs 457 improvement 13%
    5-run 548 vs 475 improvement 13%
    6-run 509 vs 484 improvement 4%
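
    The listed percentages match the relative reduction
    (before - after) / before with the fraction truncated, which the
    following Python check reproduces. The truncation is an inference
    from the numbers, not something the message states.

```python
# (before_ms, after_ms) pairs copied from the six runs listed above.
runs = [(542, 489), (540, 466), (518, 468), (531, 457), (548, 475), (509, 484)]


def improvement_percent(before_ms: int, after_ms: int) -> int:
    """Relative reduction in percent, truncated to an integer."""
    return (before_ms - after_ms) * 100 // before_ms


percentages = [improvement_percent(before, after) for before, after in runs]
print(percentages)  # [9, 13, 9, 13, 13, 4]
```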

    Synthetic test(no "noise" from other callbacks):
    Hardware: x86_64 64 CPUs, 64GB of memory
    Linux-6.6

    - 10K tasks (simultaneous);
    - each task does (1000 loops)
         synchronize_rcu();
         kfree(p);

    default: CONFIG_RCU_NOCB_CPU: takes 54 seconds to complete all users;
    patch: CONFIG_RCU_NOCB_CPU: takes 35 seconds to complete all users.

    Running 60K gives approximately the same results on my setup. Please note
    that this is without any interaction with other types of callbacks; otherwise
    the default case would be impacted a lot.

    5. By default it is disabled. To enable it, do one of the
    following:

    echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp
    or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1"

    Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
    Co-developed-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
    Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-08-26 10:57:37 -04:00
Waiman Long a7ee6faa72 rcu: Provide a boot time parameter to control lazy RCU
JIRA: https://issues.redhat.com/browse/RHEL-55557

commit 7f66f099de4dc4b1a66a3f94e6db16409924a6f8
Author: Qais Yousef <qyousef@layalina.io>
Date:   Sun, 3 Dec 2023 01:12:52 +0000

    rcu: Provide a boot time parameter to control lazy RCU

    To allow more flexible arrangements while still providing a single kernel
    for distros, provide a boot time parameter to enable/disable lazy RCU.

    Specify:

            rcutree.enable_rcu_lazy=[y|1|n|0]

    Which also requires

            rcu_nocbs=all

    at boot time to enable/disable lazy RCU.

    To disable it by default at build time when CONFIG_RCU_LAZY=y, the new
    CONFIG_RCU_LAZY_DEFAULT_OFF can be used.
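
    A quick way to check a command line for the combination above is
    sketched below in Python. The function name is hypothetical and the
    parsing is deliberately naive (whitespace splitting, no quoting
    support), so it only approximates how the kernel reads its command
    line.

```python
from typing import Optional, Tuple


def lazy_rcu_setting(cmdline: str) -> Tuple[Optional[bool], bool]:
    """Return (requested, has_nocbs_all) for a kernel command line.

    requested is True/False when rcutree.enable_rcu_lazy is given as
    y|1 or n|0, and None when the parameter is absent.
    has_nocbs_all reports whether rcu_nocbs=all is also present.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value

    raw = params.get("rcutree.enable_rcu_lazy")
    if raw is None:
        requested = None
    elif raw in ("y", "1"):
        requested = True
    elif raw in ("n", "0"):
        requested = False
    else:
        raise ValueError(f"unexpected value: {raw!r}")

    return requested, params.get("rcu_nocbs") == "all"
```

    On a running system the input string could be read from
    /proc/cmdline.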

    Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
    Tested-by: Andrea Righi <andrea.righi@canonical.com>
    Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Boqun Feng <boqun.feng@gmail.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-08-26 10:57:22 -04:00
Waiman Long 4175e632cf doc: Get rcutree module parameters back into alpha order
JIRA: https://issues.redhat.com/browse/RHEL-55557

commit 51823ca651364f68bd3ad33d848c1542fffdd627
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue, 21 Mar 2023 17:28:40 -0700

    doc: Get rcutree module parameters back into alpha order

    This commit puts the rcutree module parameters back into proper
    alphabetical order.

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-08-21 15:04:30 -04:00
Waiman Long 83b3fc77cb doc: Document rcutree.nocb_nobypass_lim_per_jiffy kernel parameter
JIRA: https://issues.redhat.com/browse/RHEL-55557

commit 89f7f29140da767f4675efbbe7892f38786451ec
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed, 27 Apr 2022 09:24:31 -0700

    doc: Document rcutree.nocb_nobypass_lim_per_jiffy kernel parameter

    This commit provides documentation for the kernel parameter controlling
    RCU's handling of callback floods on offloaded (rcu_nocbs) CPUs.
    This parameter might be obscure, but it is always there when you need it.

    Reported-by: Frederic Weisbecker <frederic@kernel.org>
    Reported-by: Uladzislau Rezki <urezki@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-08-21 15:04:29 -04:00
Waiman Long e15ff5264d doc: Document the rcutree.rcu_divisor kernel boot parameter
JIRA: https://issues.redhat.com/browse/RHEL-55557

commit 71de1e34f1dfc31ab3cb052cdd7038950aae06e7
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed, 20 Apr 2022 08:59:46 -0700

    doc: Document the rcutree.rcu_divisor kernel boot parameter

    This commit adds kernel-parameters.txt documentation for the
    rcutree.rcu_divisor kernel boot parameter, which controls the softirq
    callback-invocation batch limit.

    Cc: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-08-21 15:04:29 -04:00
Waiman Long 3342973efe x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE
JIRA: https://issues.redhat.com/browse/RHEL-31230
Conflicts:
 1) The net/netfilter/Makefile hunk is dropped due to missing
    nft_ct_fast.c file first introduced by commit d9e789147605
    ("netfilter: nf_tables: avoid retpoline overhead for some ct
    expression calls").
 2) A merge conflict in the tools/objtool/check.c hunk due to missing
    upstream commit 9bb2ec608a20 ("objtool: Update Retpoline validation").
 3) First hunk of net/netfilter/nf_tables_core.c is dropped and a merge
    conflict in the second hunk due to missing upstream commit
    d8d760627855 ("netfilter: nf_tables: add static key to skip retpoline
    workarounds").
 4) The net/netfilter/nft_ct.c hunks are dropped due to missing upstream
    commit d9e789147605 ("netfilter: nf_tables: avoid retpoline overhead
    for some ct expression calls").

commit aefb2f2e619b6c334bcb31de830aa00ba0b11129
Author: Breno Leitao <leitao@debian.org>
Date:   Tue, 21 Nov 2023 08:07:32 -0800

    x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE

    Step 5/10 of the namespace unification of CPU mitigations related Kconfig options.

    [ mingo: Converted a few more uses in comments/messages as well. ]

    Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Signed-off-by: Breno Leitao <leitao@debian.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Reviewed-by: Ariel Miculas <amiculas@cisco.com>
    Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: https://lore.kernel.org/r/20231121160740.1249350-6-leitao@debian.org

Signed-off-by: Waiman Long <longman@redhat.com>
2024-07-26 14:33:35 -04:00
Lucas Zampieri 5c0d3906e7 Merge: RHEL-9.5: NFS Updates to v6.8
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4239

This MR updates NFS, kNFSD, lockd, and sunrpc subsystems to upstream v6.8, with some omissions and additions for compatibility and fixes.

Testing is currently in progress.

JIRA: https://issues.redhat.com/browse/RHEL-34875  
Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=62234972  
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>

Approved-by: Steve Dickson <steved@redhat.com>
Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: Paulo Alcantara <paalcant@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Approved-by: Scott Mayhew <smayhew@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-07-16 19:40:48 +00:00
Paolo Bonzini 1bc808f550 x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
JIRA: https://issues.redhat.com/browse/RHEL-16745

It was meant well at the time but nothing's using it so get rid of it.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240202163510.GDZb0Zvj8qOndvFOiZ@fat_crate.local
(cherry picked from commit 29956748339aa8757a7e2f927a8679dd08f24bb6)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-01 08:55:34 +02:00
Benjamin Coddington 8520ae59d8 NFSv4: Add a parameter to limit the number of retries after NFS4ERR_DELAY
JIRA: https://issues.redhat.com/browse/RHEL-34875

commit 5b9d31ae1c925bb5f15975e31b31ff5ae3c81f8f
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Sat Sep 9 12:23:01 2023 -0400

    NFSv4: Add a parameter to limit the number of retries after NFS4ERR_DELAY

    When using a 'softerr' mount, the NFSv4 client can get stuck waiting
    forever while the server just returns NFS4ERR_DELAY. Among other things,
    this causes the knfsd server threads to busy wait.
    Add a parameter that tells the NFSv4 client how many times to retry
    before giving up.

    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
2024-06-27 08:14:24 -04:00
Lucas Zampieri cd66a5d192 Merge: Update kernel-module support to v6.8
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4061

# Merge Request Required Information

## Summary of Changes
RHIVOS is running into early mm init performance issues, and a long-term set of solutions is to improve the kernel linear map when kernel security is set to a max-level, a RHIVOS FuSa requirement, where all of memory is -not- read/writeable via the linear map (all of memory mapping from PAGE_OFFSET), but has strict execute-only, rodata, rw-data and no-execute pages.
Although RHEL9 and upstream can support the latter functionally, it is a significant performance issue as page-level mapping of the kernel linear map has to be employed from the default huge-page mappings that the various arch's support. The boot kernel itself is relatively easy to know how to map for optimal page-mappings and protection, because it is the first to load and ELF sections can be scanned for needed info; the same can't be said for all the loadable kernel modules, which is the impetus for page-splitting of the linear map (on x86) and the per-page-mapping on ARM64, where page-splitting of the linear map is not supported, but is the long-term optimal solution.
In order to make a step in this long-term effort, this patch series attempts to take the existing RHEL9 kernel module load support, which is barely 12 patches past the initial v5.14 base, and bring it up to a current, v6.8 version.
Of course, such an update requires many other backports in order to apply cleanly, if the goal is to get close to upstream, maintain RHEL kmod support, and not regress.   Thus, this series results in major updates to dynamic-debug (since it involves modifying kernel module sections), kbuild, modpost, and genksyms, plus the odd livepatch, ftrace, and BPF patch, although the latter were trimmed or dropped wherever possible.  
The split is approximately 150 kernel-module, 30 dyndbg, 80 modpost, 15 kbuild, 3 livepatch, 3 ftrace, 2 bpf (one being a fix for earlier kernel commit).
Note: modpost and related kbuild updates moved it to approximately v6.4.  A full update to 6.8 wasn't deemed necessary, and was an additional 30+ commits, and more kbuild modifications.  This effort was deemed sufficiently large and complete for the intended goal of making RHEL9 amenable to future updates to the kernel-module subsystem for posted patches on review now in linux-mm by Mike Rapoport. Those patches and expected follow-ons will be backported to RHEL-9 when upstream settles on final updates in this area; these updates will make the kernel-load subsystem more common, and less arch-specific.
One patch from v6.9-rc1 was taken, modules: wait do_free_init correctly, to repair a race seen in the module-load path on a RHIVOS platform, which needed to sit on top of this series for ease of backporting.

v6: Evidently the rebase to -457 kept a merge conflict, which was a duplicate patch already taken in.  Latest series is now 299 patches vs 300.  No functional changes!

v5: rebased to latest kernel (-457) since gitlab punted due to claimed merge conflict; only conflict was relative source, due to other MRs pulled into cs9/9.5 ahead of this MR; no code changes, and (tkdiff+)diff-ing v4 patches to v5, showed no diffs to the author's naked eye.

v4: Just updated 3rd patch's revert to put Upstream status *after* Subject, so it shows correctly in a git-format output. No code changes from v3. (although CKI running a-muck after push'd update w/only a commit-log change).

v3: Rebase to -455 kernel since v2 was 8300 commits behind and had merge conflicts with JoeL's objtool update MR.

v2: Pulling out of Draft.
  : (Hopefully) fixed numerous nits (Jira: -> JIRA:; proper link so no more 404's, etc.)
  : add new/latest Fixes, some id'd by reviewers, some new to v6.9
  : Cleaned up/out bad merges that had introduced RHEL-only hunks
  : Significantly re-ordered the series to make it more bisectable; still breaks where the upstream maintainer tore code out of modpost.c and into a sed script, and then put the functionality back into modpost.c, and removed the sed script, which this series didn't backport since it was already large enough.
  : identified a failure with systemtap, that Will Cohen is repairing;  thus, this MR has to wait for a systemtap update before it will pass its check in the (brew? cki?) builds.
 
v1: Draft! 
This series has gone through some simple, preliminary testing, but it needs deep review by ftrace, BPF, livepatch, and rh-kabi support to ensure no regressions in these few, but corner kernel-modifying code paths.  rh-kabi tooling is a bit unknown, as it isn't in the kernel, but there are RHEL-only patches in the kernel for it.  
A patchreview run against the series was executed, and needed Fixes were added/included.
The list of self-documented omissions is listed below.  If new ones have popped in v6.9-rc<n>,
please forward them for addition.
Bisectability: The series has known bisectability (patch-ordering) issues at the moment, but the plan is to re-shuffle the patches in v2 to improve it, if not make it completely bisectable.
Expected feedback will be incorporated in v2, and planned upgrade to full-MR/drop-Draft status.

Shout-out to Joe Lawrence who aided in debugging and providing fixes for well-hidden noarch build failures around Documentation generation, as well as warning cleanups for EXPORT'd init-tagged functions, which the update checks for now.  Joe was instrumental in finding key chunks of the modpost update that appear to have closed gaps in my original backport efforts.

Intentionally Omitted Fix: 0aa24a79ee3b603f kbuild: do not try to parse *.cmd files for objects provided by compiler
   -- for parisc & sky arch's, not needed in RHEL9

Intentionally Omitted Fix: f5983dab0ead modpost: define more R_ARM_* for old distributions
  For old releases not having R_ARM_* in arch/arm/include/asm/elf.h, which RHEL9 has

Intentionally Omitted Fix: 08700ec705043e linux/export: fix reference to exported functions for parisc64
 -- no parisc64 support in RHEL9

Intentionally Omitted Fix: 86495af1171e1feec79f media: dvb: symbol fixup for dvb_attach()
 -- not included in this backport due to partner request not to include until RHEL-10

Intentionally Omitted Fix: d81f0d7b8 Subject: kunit: add KUNIT_INIT_TABLE to init link
  -- will let KUNIT update bring in and enable as needed

## Approved Development Ticket
JIRA: https://issues.redhat.com/browse/RHEL-28063

Signed-off-by: Donald Dutile <ddutile@redhat.com>

Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-24 12:17:19 +00:00
Lucas Zampieri 0ff0944e55 Merge: smp: Backport CSD tracepoints
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3614

# Merge Request Required Information

## Summary of Changes
Introduce csd tracepoints that help track IPIs that may be affecting latency.
Also, make the trace available for all smp_function_call*(), not only the ones that result in an IPI.

## Approved Development Ticket

JIRA: https://issues.redhat.com/browse/RHEL-13876

Signed-off-by: Leonardo Bras <leobras@redhat.com>

Approved-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-21 12:50:04 +00:00
Lucas Zampieri 175f008f91 Merge: rcu: Backport upstream RCU commits up to v6.7
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4115

JIRA: https://issues.redhat.com/browse/RHEL-34076    
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4115

This MR backports upstream RCU commits up to v6.7 with relevant bug
fixes, if applicable.

Signed-off-by: Waiman Long <longman@redhat.com>

Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-19 17:58:02 +00:00
Lucas Zampieri 3cadd5b0ec Merge: x86/bhi: Additional mitigation for BHI vulnerability (CVE-2024-2201)
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4014

JIRA: https://issues.redhat.com/browse/RHEL-28203    
JIRA: https://issues.redhat.com/browse/RHEL-28209    
CVE: CVE-2024-2201    
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4014    
Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3961

Branch History Injection (BHI) attacks may allow a malicious application to
influence indirect branch prediction in kernel by poisoning the branch
history. eIBRS isolates indirect branch targets in ring0.  The BHB can
still influence the choice of indirect branch predictor entry, and although
branch predictor entries are isolated between modes when eIBRS is enabled,
the BHB itself is not isolated between modes.

Alder Lake and newer processors support a hardware control BHI_DIS_S to
mitigate BHI.  For older processors Intel has released a software sequence
to clear the branch history on parts that don't support BHI_DIS_S. Add
support to execute the software sequence at syscall entry and VMexit to
overwrite the branch history.

This MR extends the existing spectre_v2 mitigation to enable either
software or hardware BHI mitigation for vulnerable Intel processors,
if enabled. The spectre_v2 vulnerability sysfs file will now show the
status of the BHI mitigation like

   ...; SW sequence; BHI: SW loop, KVM: SW loop
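
As a rough illustration of how the status shown above could be checked from a script, here is a minimal Python sketch. It assumes the standard sysfs location `/sys/devices/system/cpu/vulnerabilities/spectre_v2` and that the status line is a `;`-separated list with a `BHI:` field, as in the example; the helper names `parse_bhi_status` and `read_bhi_status` are hypothetical and not part of this MR.

```python
from pathlib import Path

# Assumed location of the spectre_v2 vulnerability status file.
SPECTRE_V2_PATH = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def parse_bhi_status(line):
    """Return the text after 'BHI:' in a spectre_v2 status line, or None.

    The status line is treated as a ';'-separated list of fields, which
    matches the example in the MR description but is otherwise an assumption.
    """
    for field in line.strip().split(";"):
        field = field.strip()
        if field.startswith("BHI:"):
            return field[len("BHI:"):].strip()
    return None


def read_bhi_status(path=SPECTRE_V2_PATH):
    """Read the sysfs file and return its BHI field, or None if unavailable."""
    try:
        return parse_bhi_status(path.read_text())
    except OSError:
        # File missing (non-x86 system, older kernel) or not readable.
        return None
```

On a kernel without this backport, or with no BHI field reported, `read_bhi_status()` simply returns `None`.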

As Linus has changed the default upstream to CONFIG_SPECTRE_BHI_ON,
the syscall hardening commit 1e3ad78334a6 ("x86/syscall: Don't force
use of indirect calls for system calls") is skipped for now. It may be
backported in the future, if necessary.

Signed-off-by: Waiman Long <longman@redhat.com>

Approved-by: Paolo Bonzini <bonzini@gnu.org>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-18 12:42:44 +00:00
Donald Dutile a27f75beb4 module: add debugging auto-load duplicate module support
JIRA: https://issues.redhat.com/browse/RHEL-28063

commit 8660484ed1cf3261e89e0bad94c6395597e87599
Author: Luis Chamberlain <mcgrof@kernel.org>
Date:   Thu Apr 13 22:28:39 2023 -0700

    module: add debugging auto-load duplicate module support

    The finit_module() system call can in the worst case use more than
    twice a module's size in virtual memory. Duplicate finit_module()
    system calls are non fatal, however they unnecessarily strain virtual
    memory during bootup and in the worst case can cause a system to fail
    to boot. This is only known to currently be an issue on systems with
    larger number of CPUs.

    To help debug this situation we need to consider the different sources for
    finit_module(). Requests from the kernel that rely on module auto-loading,
    i.e., the kernel's *request_module() API, are one source of calls. Although
    modprobe checks to see if a module is already loaded prior to calling
    finit_module(), there is a small race possible allowing userspace to
    trigger multiple modprobe calls racing against modprobe and thus not
    seeing the module as loaded yet.

    This adds debugging support to the kernel module auto-loader (*request_module()
    calls) to easily detect duplicate module requests. To aid with possible bootup
    failure issues incurred by this, it will converge duplicate requests to a
    single request. This avoids any possible strain on virtual memory during
    bootup which could be incurred by duplicate module autoloading requests.

    Folks debugging virtual memory abuse on bootup can and should enable
    this to see what pr_warn()s come on, to see if module auto-loading is to
    blame for their woes. If they see duplicates they can further debug this
    by enabling the module.enable_dups_trace kernel parameter or by enabling
    CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE.

    Current evidence seems to point to only a few duplicates for module
    auto-loading. And so the source for other duplicates creating heavy
    virtual memory pressure due to larger number of CPUs should be coming
    from another place (likely udev).

    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:26 -04:00
Donald Dutile ca8f0d3fa6 module: Add support for default value for module async_probe
JIRA: https://issues.redhat.com/browse/RHEL-28063

commit ae39e9ed964f8e450d0de410b5a757e19581dfc5
Author: Saravana Kannan <saravanak@google.com>
Date:   Fri Jun 3 18:01:00 2022 -0700

    module: Add support for default value for module async_probe

    Add a module.async_probe kernel command line option that allows enabling
    async probing for all modules. When this command line option is used,
    there might still be some modules for which we want to explicitly force
    synchronous probing, so extend <modulename>.async_probe to take an
    optional bool input so that async probing can be disabled for a specific
    module.

    Signed-off-by: Saravana Kannan <saravanak@google.com>
    Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:18 -04:00
Donald Dutile c4f068de33 dyndbg: Remove support for ddebug_query param
JIRA: https://issues.redhat.com/browse/RHEL-28063

commit 9c40e1aa84123750773a57c9cf39112459a952dd
Author: Andrew Halaney <ahalaney@redhat.com>
Date:   Wed Oct 13 11:40:21 2021 -0400

    dyndbg: Remove support for ddebug_query param

    This param has been deprecated for a very long time now, let's rip it
    out.

    Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
    Signed-off-by: Jason Baron <jbaron@akamai.com>
    Link: https://lore.kernel.org/r/1634139622-20667-3-git-send-email-jbaron@akamai.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Donald Dutile <ddutile@redhat.com>
2024-06-17 14:17:11 -04:00
Leonardo Bras bd63f8635f locking/csd_lock: Remove added data from CSD lock debugging
JIRA: https://issues.redhat.com/browse/RHEL-13876
Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

conflicts: Fixes (some) conflicts introduced by downstream commit
aa5786b04d ("sched, smp: Trace smp callback causing an IPI")
by applying the original dependency commit, and making it easier to
cherry-pick the next upstream commits due to not having conflicts.

commit 1771257cb447a7b27a15ed9aaf332726c47fcbcf
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   2023-03-20 17:55:14 -0700

    locking/csd_lock: Remove added data from CSD lock debugging

    The diagnostics added by this commit were extremely useful in one instance:

    a5aabace5f ("locking/csd_lock: Add more data to CSD lock debugging")

    However, they have not seen much action since, and there have been some
    concerns expressed that the complexity is not worth the benefit.

    Therefore, manually revert this commit, but leave a comment telling
    people where to find these diagnostics.

    [ paulmck: Apply Juergen Gross feedback. ]

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Juergen Gross <jgross@suse.com>
    Link: https://lore.kernel.org/r/20230321005516.50558-2-paulmck@kernel.org

Signed-off-by: Leonardo Bras <leobras@redhat.com>
2024-06-17 12:58:15 -03:00
Leonardo Bras 6e00a94924 trace,smp: Add tracepoints for scheduling remotely called functions
JIRA: https://issues.redhat.com/browse/RHEL-13876
Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

commit c52198601695851622f361d3f16456e9fc857629
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   2023-03-20 17:55:13 -0700

    locking/csd_lock: Add Kconfig option for csd_debug default

    The csd_debug kernel parameter works well, but is inconvenient in cases
    where it is more closely associated with boot loaders or automation than
    with a particular kernel version or release.  Therefore, provide a new
    CSD_LOCK_WAIT_DEBUG_DEFAULT Kconfig option that defaults csd_debug to
    1 when selected and 0 otherwise, with this latter being the default.

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Juergen Gross <jgross@suse.com>
    Link: https://lore.kernel.org/r/20230321005516.50558-1-paulmck@kernel.org

Signed-off-by: Leonardo Bras <leobras@redhat.com>
2024-06-17 12:58:14 -03:00
Lucas Zampieri f6029bf351 Merge: workqueue: Backport workqueue commits to v6.9
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3910

JIRA: https://issues.redhat.com/browse/RHEL-25103    
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3910    
Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3847    

The primary purpose of this MR is to backport those upstream workqueue
commits which enable ordered workqueues and rescuers to follow
changes in workqueue unbound cpumask which is necessary to make sure
that isolated CPUs won't be disturbed due to unbound work items being
handled by those CPUs.

These upstream commits were merged into the v6.9 kernel which also
contains some major changes in workqueue code. This makes the required
commits dependent on some of the v6.9 workqueue commits. It is less risky
to sync the workqueue code up to v6.9 instead of selective backports
of some dependent commits. This MR also includes some miscellaneous
commits in other subsystems due to changes in the underlying workqueue
implementations.

A follow-up proactive workqueue fixes MR will be created later on,
if necessary.

Signed-off-by: Waiman Long <longman@redhat.com>

Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Prarit Bhargava <prarit@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: Radu Rendec <rrendec@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-13 13:07:43 +00:00
Lucas Zampieri 304e2a4e29 Merge: [RHEL-9.5.0] iommu and dma mapping api updates
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4151

# Merge Request Required Information

JIRA: https://issues.redhat.com/browse/RHEL-28780  
JIRA: https://issues.redhat.com/browse/RHEL-12083  
JIRA: https://issues.redhat.com/browse/RHEL-12322  
JIRA: https://issues.redhat.com/browse/RHEL-29105  
JIRA: https://issues.redhat.com/browse/RHEL-29357  
JIRA: https://issues.redhat.com/browse/RHEL-29359  

Omitted-fix: ed8b94f6e0ac ("powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add")  
     - Reverted by 1fba2bf8e9d5 ("Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add"")

Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git  
Upstream Status: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git  
                 branch: next   
Tested: In progress  

	- general cki coverage  

	- Nvidia testing arm-smmu-v3 and iommufd related changes they have requested.  

	- Multiple rounds testing of amd_iommu, intel_iommu, and arm-smmu-v3 with  
	  various iommu configurations with disk i/o using fio,  
	  covering lazy iotlb invalidation, strict iotlb invalidation,  
	  and passthrough. Also tested with forcedac set. Intel  
	  Scalable Mode capable systems tested with the iotlb invalidation  
	  policies, and passthrough with scalable mode enabled, and disabled.  
	  AMD systems tested with v1 and v2 page tables.  

	- Tested booting with various iommu configurations, and verifying system  
	  in correct state on AMD, Intel, and ARM.  

	- Limited test on ppc64le. The system I had access to was  
	  setting up a 64-bit bypass window, and using dma_direct  
	  calls.  It ran, but since I don't normally touch ppc64le  
	  iommu code, I need to investigate more or get IBM assistance  
	  to more thoroughly test it.  

	- Working on getting testing assistance from IBM for the s390x changes.  

## Summary of Changes


This brings iommu, iommufd, and dma mapping api up to 6.9 with some additions from Joerg's  
next branch minus some commit changes in a 6.9 SEV-SNP pull for AMD. Some highlights:  

- The removal of the amd_iommu_v2 code, and the addition of its replacement based on the    
  iommu core SVA api, along with a re-org of the amd_iommu code.  
- The migration of s390 to the iommu core dma-iommu dma ops implementation, joining Intel,  
  AMD, and ARM as users of the same code base.  
- The beginnings of a re-work of the arm-smmu-v3 driver by Jason, and others.  
- A number of changes to iommufd as it continues to get fleshed out.  
- IOPT memory usage observability (code that was basis for talk at LPC last year)  

  Example output in vmstat files:  

```
    # grep iommu /sys/devices/system/node/node*/vmstat  
    /sys/devices/system/node/node0/vmstat:nr_iommu_pages 342  
    /sys/devices/system/node/node1/vmstat:nr_iommu_pages 0  
```

- Continued work on shared virtual addressing and io page faulting (PRI).  
- Dynamic swiotlb memory pools. This is not enabled yet, as they still seem to be  
  shaking out issues upstream, but the code is in place now.  
- Re-working of iommu core domain allocation.  
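
The `nr_iommu_pages` counter from the vmstat example in the list above can also be collected programmatically. The following is a minimal Python sketch, assuming the per-node vmstat files live under `/sys/devices/system/node/node*/vmstat` and use the `name value` line format shown in that example; the function names are hypothetical and not part of this MR.

```python
import glob
import re


def parse_nr_iommu_pages(vmstat_text):
    """Return the nr_iommu_pages value from vmstat file contents, or None."""
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "nr_iommu_pages":
            return int(parts[1])
    return None


def iommu_pages_per_node(pattern="/sys/devices/system/node/node*/vmstat"):
    """Map NUMA node number -> nr_iommu_pages for every readable vmstat file."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        match = re.search(r"node(\d+)/vmstat$", path)
        if not match:
            continue
        try:
            with open(path) as f:
                count = parse_nr_iommu_pages(f.read())
        except OSError:
            continue  # Node vanished or file is unreadable; skip it.
        if count is not None:
            result[int(match.group(1))] = count
    return result
```

Kernels that do not export the counter yield an empty dictionary rather than an error.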

Note: iommufd selftest is being enabled in separate work that has been delegated to    
      another engineer starting to help with iommu. So that will be enabled in the  
      next few weeks to add more coverage for iommufd.  

Conflict-wise, they should be noted in the individual commits, but  
not too bad overall. 13/30 were dropping unsupported bits, and another  
8 were context diffs. A couple were caused by out of order backports due  
to fixes, and a couple of upstream conflicts from colliding patchsets  
had to be resolved in the merge commits.  

Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>

Approved-by: Jan Stancek <jstancek@redhat.com>
Approved-by: Donald Dutile <ddutile@redhat.com>
Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: David Airlie <airlied@redhat.com>
Approved-by: Lenny Szubowicz <lszubowi@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: John W. Linville <linville@redhat.com>
Approved-by: Mark Langsdorf <mlangsdo@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-05 20:03:50 +00:00
Lucas Zampieri 95ec32f109 Merge: USB/TB code rebase of supported drivers to upstream v6.8
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/4268


JIRA: https://issues.redhat.com/browse/RHEL-34114

This rebases supported USB and Thunderbolt drivers to upstream kernel v6.8.
By design, changes on this rebase are limited to supported usb/thunderbolt
drivers. Changes which happen to touch the drivers but are tree-wide are
selectively or partially pulled in, when relevant.

Omitted-fix: 9dc292413c56 ("usb: gadget: ncm: Fix endianness of wMaxSegmentSize variable in ecm_desc")
Omitted-fix: f90ce1e04cbc ("usb: gadget: ncm: Fix handling of zero block length packets")
Omitted-fix: 5b9e00a6004c ("powerpc/4xx: Fix warp_gpio_leds build failure")
Omitted-fix: 6f98e44984d5 ("spi: ppc4xx: Fix fallout from include cleanup")
Omitted-fix: 70e6163d17dd ("arm64: dts: qcom: qrb5165-rb5: use u16 for DP altmode svid")

Signed-off-by: Desnes Nunes <desnesn@redhat.com>

Approved-by: José Ignacio Tornos Martínez <jtornosm@redhat.com>
Approved-by: Eric Chanudet <echanude@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-03 19:41:30 +00:00
Waiman Long 65e2702499 rcu: Restrict access to RCU CPU stall notifiers
JIRA: https://issues.redhat.com/browse/RHEL-34076

commit 4e58aaeebb3c27993c734c99eae6881b196b1ddb
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed, 1 Nov 2023 18:28:38 -0700

    rcu: Restrict access to RCU CPU stall notifiers

    The RCU CPU stall notifiers can be useful for dumping state when
    tracking down delicate forward-progress bugs where NUMA effects cause
    cache lines to be delivered to a given CPU regularly, but always in a
    state that prevents that CPU from making forward progress.  These bugs can
    be detected by the RCU CPU stall-warning mechanism, but in some cases,
    the stall-warnings printk()s disrupt the forward-progress bug before
    any useful state can be obtained.

    Unfortunately, the notifier mechanism added by commit 5b404fdabacf ("rcu:
    Add RCU CPU stall notifier") can make matters worse if used at all
    carelessly. For example, if the stall warning was caused by a lock not
    being released, then any attempt to acquire that lock in the notifier
    will hang. This will prevent not only the notifier from producing any
    useful output, but it will also prevent the stall-warning message from
    ever appearing.

    This commit therefore hides this new RCU CPU stall notifier
    mechanism under a new RCU_CPU_STALL_NOTIFIER Kconfig option that
    depends on both DEBUG_KERNEL and RCU_EXPERT.  In addition, the
    rcupdate.rcu_cpu_stall_notifiers=1 kernel boot parameter must also
    be specified.  The RCU_CPU_STALL_NOTIFIER Kconfig option's help text
    contains a warning and explains the dangers of careless use, recommending
    lockless notifier code.  In addition, a WARN() is triggered each time
    that an attempt is made to register a stall-warning notifier in kernels
    built with CONFIG_RCU_CPU_STALL_NOTIFIER=y.

    This combination of measures will keep use of this mechanism confined to
    debug kernels and away from routine deployments.

    [ paulmck: Apply Dan Carpenter feedback. ]

    Fixes: 5b404fdabacf ("rcu: Add RCU CPU stall notifier")
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-05-31 10:56:18 -04:00