Centos-kernel-stream-9

Commit Graph

Author	SHA1	Message	Date
Myron Stowe	66c02b62b2	PCI: Honor Max Link Speed when determining supported speeds JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: 3202ca221578850f34e0fea39dc6cfa745ed7aac commit 3202ca221578850f34e0fea39dc6cfa745ed7aac Author: Lukas Wunner <lukas@wunner.de> Date: Tue Dec 17 10:51:01 2024 +0100 PCI: Honor Max Link Speed when determining supported speeds The Supported Link Speeds Vector in the Link Capabilities 2 Register indicates the supported link speeds. The Max Link Speed field in the Link Capabilities Register indicates the maximum of those speeds. pcie_get_supported_speeds() neglects to honor the Max Link Speed field and will thus incorrectly deem higher speeds as supported. Fix it. One user-visible issue addressed here is an incorrect value in the sysfs attribute "max_link_speed". But the main motivation is a boot hang reported by Niklas: Intel JHL7540 "Titan Ridge 2018" Thunderbolt controllers support 2.5-8 GT/s speeds, but indicate 2.5 GT/s as maximum. Ilpo recalls seeing this on more devices. It can be explained by the controller's Downstream Ports supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s if the port interfaces to a PCIe Adapter, in accordance with USB4 v2 sec 11.2.1: "This section defines the functionality of an Internal PCIe Port that interfaces to a PCIe Adapter. [...] The Logical sub-block shall update the PCIe configuration registers with the following characteristics: [...] Max Link Speed field in the Link Capabilities Register set to 0001b (data rate of 2.5 GT/s only). Note: These settings do not represent actual throughput. Throughput is implementation specific and based on the USB4 Fabric performance." The present commit is not sufficient on its own to fix Niklas' boot hang, but it is a prerequisite: A subsequent commit will fix the boot hang by enabling bandwidth control only if more than one speed is supported.
The GENMASK() macro used herein specifies 0 as lowest bit, even though the Supported Link Speeds Vector ends at bit 1. This is done on purpose to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro would be invalid as the lowest bit is greater than the highest bit. Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated Endpoints in particular, so it does occur in practice. (The Link Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.) Fixes: d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds") Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org Link: https://lore.kernel.org/r/fe03941e3e1cc42fb9bf4395e302bff53ee2198b.1734428762.git.lukas@wunner.de Reported-by: Niklas Schnelle <niks@kernel.org> Tested-by: Niklas Schnelle <niks@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2025-03-20 10:33:58 -06:00
Myron Stowe	9512e6dbf7	PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: 665745f274870c921020f610e2c99a3b1613519b commit 665745f274870c921020f610e2c99a3b1613519b Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Fri Oct 18 17:47:52 2024 +0300 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller This mostly reverts the commit `b4c7d2076b` ("PCI/LINK: Remove bandwidth notification"). An upcoming commit extends this driver building PCIe bandwidth controller on top of it. PCIe bandwidth notifications were first added in the commit `e8303bb7a7` ("PCI/LINK: Report degraded links via link bandwidth notification") but later had to be removed. The significant changes compared with the old bandwidth notification driver include: 1) Don't print the notifications into kernel log, just keep the Link Speed cached in struct pci_bus updated. While somewhat unfortunate, the log spam was the source of complaints that eventually led to the removal of the bandwidth notifications driver (see the links below for further information). 2) Besides the Link Bandwidth Management Interrupt, also enable Link Autonomous Bandwidth Interrupt to cover the other source of bandwidth changes. 3) Handle Link Speed updates robustly. Refresh the cached Link Speed when enabling Bandwidth Notification Interrupts, and solve the race between Link Speed read and LBMS/LABS update in pcie_bwnotif_irq_thread(). 4) Use concurrency safe LNKCTL RMW operations. 5) The driver is now called PCIe bwctrl (bandwidth controller) instead of just bandwidth notifications because of increased scope and functionality within the driver. 6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain(). Provide LBMS counting API for it. 7) Tweaks to variable/function names for consistency and length reasons. Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track of PCIe Link Speed changes.
[bhelgaas: This is based on previous work by Alexandru Gagniuc <mr.nuke.me@gmail.com>; see `e8303bb7a7` ("PCI/LINK: Report degraded links via link bandwidth notification")] Link: https://lore.kernel.org/r/20241018144755.7875-7-ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/ Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/ Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/ Suggested-by: Lukas Wunner <lukas@wunner.de> # Building bwctrl on top of bwnotif Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler: https://lore.kernel.org/r/20241115165717.15233-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2025-03-20 10:33:57 -06:00
Myron Stowe	13bc7bd987	PCI: Store all PCIe Supported Link Speeds JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: d2bd39c0456b75be9dfc7d774b8d021355c26ae3 commit d2bd39c0456b75be9dfc7d774b8d021355c26ae3 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Fri Oct 18 17:47:49 2024 +0300 PCI: Store all PCIe Supported Link Speeds The PCIe bandwidth controller added by a subsequent commit will require selecting PCIe Link Speeds that are lower than the Maximum Link Speed. The struct pci_bus only stores max_bus_speed. Even if PCIe r6.1 sec 8.2.1 currently disallows gaps in supported Link Speeds, the Implementation Note in PCIe r6.1 sec 7.5.3.18, recommends determining supported Link Speeds using the Supported Link Speeds Vector in the Link Capabilities 2 Register (when available) to "avoid software being confused if a future specification defines Links that do not require support for all slower speeds." Reuse code in pcie_get_speed_cap() to add pcie_get_supported_speeds() to query the Supported Link Speeds Vector of a PCIe device. The value is taken directly from the Supported Link Speeds Vector or synthesized from the Max Link Speed in the Link Capabilities Register when the Link Capabilities 2 Register is not available. The Supported Link Speeds Vector in the Link Capabilities Register 2 corresponds to the bus below on Root Ports and Downstream Ports, whereas it corresponds to the bus above on Upstream Ports and Endpoints (PCIe r6.1 sec 7.5.3.18): Supported Link Speeds Vector - This field indicates the supported Link speed(s) of the associated Port. Add supported_speeds into the struct pci_dev that caches the Supported Link Speeds Vector. supported_speeds contains a set of Link Speeds only in the case where PCIe Link Speed can be determined. 
Root Complex Integrated Endpoints do not have a well-defined Link Speed because they do not implement either of the Link Capabilities Registers, which is allowed by PCIe r6.1 sec 7.5.3 (the same limitation applies to determining cur_bus_speed and max_bus_speed that are PCI_SPEED_UNKNOWN in such case). This is of no concern from PCIe bandwidth controller point of view because such devices are not attached into a PCIe Root Port that could be controlled. The supported_speeds field keeps the extra reserved zero at the least significant bit to match the Link Capabilities 2 Register layout. An attempt was made to store supported_speeds field into the struct pci_bus as an intersection of both ends of the Link, however, the subordinate struct pci_bus is not available early enough. The Target Speed quirk (in pcie_failed_link_retrain()) can run either during initial scan or later, requiring it to use the API provided by the PCIe bandwidth controller to set the Target Link Speed in order to co-exist with the bandwidth controller. When the Target Speed quirk is calling the bandwidth controller during initial scan, the struct pci_bus is not yet initialized. As such, storing supported_speeds into the struct pci_bus is not viable. Suggested-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/20241018144755.7875-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: move pcie_get_supported_speeds() decl to drivers/pci/pci.h] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2025-03-20 10:33:56 -06:00
Myron Stowe	e5d50346c5	PCI: Fix pci_enable_acs() support for the ACS quirks JIRA: https://issues.redhat.com/browse/RHEL-67693 Upstream Status: f3c3ccc4fe49dbc560b01d16bebd1b116c46c2b4 commit f3c3ccc4fe49dbc560b01d16bebd1b116c46c2b4 Author: Jason Gunthorpe <jgg@ziepe.ca> Date: Wed Oct 16 20:52:33 2024 -0300 PCI: Fix pci_enable_acs() support for the ACS quirks There are ACS quirks that hijack the normal ACS processing and deliver to special quirk code. The enable path needs to call pci_dev_specific_enable_acs() and then pci_dev_specific_acs_enabled() will report the hidden ACS state controlled by the quirk. The recent rework got this out of order and we should try to call pci_dev_specific_enable_acs() regardless of any actual ACS support in the device. As before, command line parameters that affect standard PCI ACS don't interact with the quirk versions, including the new config_acs= option. Link: https://lore.kernel.org/r/0-v1-f96b686c625b+124-pci_acs_quirk_fix_jgg@nvidia.com Fixes: 47c8846a49ba ("PCI: Extend ACS configurability") Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/e89107da-ac99-4d3a-9527-a4df9986e120@kernel.org Closes: https://bugzilla.suse.com/show_bug.cgi?id=1229019 Tested-by: Steffen Dirkwinkel <me@steffen.cc> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2025-02-18 09:48:10 -07:00
Myron Stowe	d23dc59b04	PCI: Pass domain number to pci_bus_release_domain_nr() explicitly JIRA: https://issues.redhat.com/browse/RHEL-67693 Upstream Status: 0cca961a026177af69044f10d6ae76d8ce043764 commit 0cca961a026177af69044f10d6ae76d8ce043764 Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Date: Thu Sep 12 11:00:25 2024 +0530 PCI: Pass domain number to pci_bus_release_domain_nr() explicitly The pci_bus_release_domain_nr() API is supposed to free the domain number allocated by pci_bus_find_domain_nr(). Most of the callers of pci_bus_find_domain_nr(), store the domain number in pci_bus::domain_nr. As such, the pci_bus_release_domain_nr() implicitly frees the domain number by dereferencing 'struct pci_bus'. However, one of the callers of this API, the PCI endpoint subsystem, doesn't have 'struct pci_bus', so it only passes NULL. Due to this, the API will end up dereferencing the NULL pointer. To fix this issue, pass the domain number to this API explicitly. Since 'struct pci_bus' is not used for anything else other than extracting the domain number, it makes sense to pass the domain number directly. Fixes: 0328947c5032 ("PCI: endpoint: Assign PCI domain number for endpoint controllers") Closes: https://lore.kernel.org/linux-pci/c0c40ddb-bf64-4b22-9dd1-8dbb18aa2813@stanley.mountain Link: https://lore.kernel.org/linux-pci/20240912053025.25314-1-manivannan.sadhasivam@linaro.org Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2025-02-18 09:48:10 -07:00
Myron Stowe	731f98d0ae	PCI: Rename CRS Completion Status to RRS JIRA: https://issues.redhat.com/browse/RHEL-67693 Upstream Status: 87f10faf166a9114aa0d4132298cad379de16fdd commit 87f10faf166a9114aa0d4132298cad379de16fdd Author: Bjorn Helgaas <bhelgaas@google.com> Date: Tue Aug 27 18:48:48 2024 -0500 PCI: Rename CRS Completion Status to RRS PCIe r6.0 changed the abbreviation for "Configuration Request Retry Status" Completion Status from "CRS" to "RRS" and uses the terminology of "Configuration RRS Software Visibility" instead of "CRS Software Visibility". Align the Linux usage with the r6.0 spec language. No functional change intended. It's confusing to make this change, but I think "RRS" is a better abbreviation because it was easy to interpret "CRS" as "Completion Retry Status", which really didn't make any sense. Link: https://lore.kernel.org/r/20240827234848.4429-4-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2025-02-17 12:01:29 -07:00
Rado Vrbovsky	2ba815bf62	Merge: PCI/ASPM: PCIe link training fixes MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/6055 ``` JIRA: https://issues.redhat.com/browse/RHEL-71363 This series include a set of key fixes related to PCIe's link training from upstream v6.12. Signed-off-by: Myron Stowe <mstowe@redhat.com> ``` Approved-by: Charles Mirabile <cmirabil@redhat.com> Approved-by: David Arcari <darcari@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2025-01-23 13:14:39 +00:00
Myron Stowe	5f7319dc18	PCI: Wait for Link before restoring Downstream Buses JIRA: https://issues.redhat.com/browse/RHEL-71363 Upstream Status: 3e40aa29d47e231a54640addf6a09c1f64c5b63f commit 3e40aa29d47e231a54640addf6a09c1f64c5b63f Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Thu Aug 8 15:17:07 2024 +0300 PCI: Wait for Link before restoring Downstream Buses __pci_reset_bus() calls pci_bridge_secondary_bus_reset() to perform the reset and also waits for the Secondary Bus to become again accessible. __pci_reset_bus() then calls pci_bus_restore_locked() that restores the PCI devices connected to the bus, and if necessary, recursively restores also the subordinate buses and their devices. The logic in pci_bus_restore_locked() does not take into account that after restoring a device on one level, there might be another Link Downstream that can only start to come up after restore has been performed for its Downstream Port device. That is, the Link may require additional wait until it becomes accessible. Similarly, pci_slot_restore_locked() lacks wait. Amend pci_bus_restore_locked() and pci_slot_restore_locked() to wait for the Secondary Bus before recursively performing the restore of that bus. Fixes: `090a3c5322` ("PCI: Add pci_reset_slot() and pci_reset_bus()") Link: https://lore.kernel.org/r/20240808121708.2523-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-12-18 08:00:17 -07:00
Myron Stowe	2844a0051c	PCI: Use an error code with PCIe failed link retraining JIRA: https://issues.redhat.com/browse/RHEL-71363 Upstream Status: 59100eb248c0b15585affa546c7f6834b30eb5a4 commit 59100eb248c0b15585affa546c7f6834b30eb5a4 Author: Maciej W. Rozycki <macro@orcam.me.uk> Date: Fri Aug 9 14:25:02 2024 +0100 PCI: Use an error code with PCIe failed link retraining Given how the call place in pcie_wait_for_link_delay() got structured now, and that pcie_retrain_link() returns a potentially useful error code, convert pcie_failed_link_retrain() to return an error code rather than a boolean status, fixing handling at the call site mentioned. Update the other call site accordingly. Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'") Link: https://lore.kernel.org/r/alpine.DEB.2.21.2408091156530.61955@angie.orcam.me.uk Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/ Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.5+ Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-12-18 07:59:58 -07:00
Myron Stowe	916f943e31	PCI: Clear the LBMS bit after a link retrain JIRA: https://issues.redhat.com/browse/RHEL-71363 Upstream Status: 8037ac08c2bbb3186f83a5a924f52d1048dbaec5 commit 8037ac08c2bbb3186f83a5a924f52d1048dbaec5 Author: Maciej W. Rozycki <macro@orcam.me.uk> Date: Fri Aug 9 14:24:46 2024 +0100 PCI: Clear the LBMS bit after a link retrain The LBMS bit, where implemented, is set by hardware either in response to the completion of retraining caused by writing 1 to the Retrain Link bit or whenever hardware has changed the link speed or width in attempt to correct unreliable link operation. It is never cleared by hardware other than by software writing 1 to the bit position in the Link Status register and we never do such a write. We currently have two places, namely apply_bad_link_workaround() and pcie_failed_link_retrain() in drivers/pci/controller/dwc/pcie-tegra194.c and drivers/pci/quirks.c respectively where we check the state of the LBMS bit and neither is interested in the state of the bit resulting from the completion of retraining, both check for a link fault. And in particular pcie_failed_link_retrain() causes issues consequently, by trying to retrain a link where there's no downstream device anymore and the state of 1 in the LBMS bit has been retained from when there was a device downstream that has since been removed. Clear the LBMS bit then at the conclusion of pcie_retrain_link(), so that we have a single place that controls it and that our code can track link speed or width changes resulting from unreliable link operation. Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures") Link: https://lore.kernel.org/r/alpine.DEB.2.21.2408091133140.61955@angie.orcam.me.uk Reported-by: Matthew W Carlis <mattc@purestorage.com> Link: https://lore.kernel.org/r/20240806000659.30859-1-mattc@purestorage.com/ Link: https://lore.kernel.org/r/20240722193407.23255-1-mattc@purestorage.com/ Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Cc: <stable@vger.kernel.org> # v6.5+ Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-12-18 07:59:22 -07:00
Myron Stowe	7307178813	PCI: Wait for device readiness with Configuration RRS JIRA: https://issues.redhat.com/browse/RHEL-71363 Upstream Status: d591f6804e7e1310881c9224d72247a2b65039af commit d591f6804e7e1310881c9224d72247a2b65039af Author: Bjorn Helgaas <bhelgaas@google.com> Date: Tue Aug 27 18:48:46 2024 -0500 PCI: Wait for device readiness with Configuration RRS After a device reset, delays are required before the device can successfully complete config accesses. PCIe r6.0, sec 6.6, specifies some delays required before software can perform config accesses. Devices that require more time after those delays may respond to config accesses with Configuration Request Retry Status (RRS) completions. Callers of pci_dev_wait() are responsible for delays until the device can respond to config accesses. pci_dev_wait() waits any additional time until the device can successfully complete config accesses. Reading config space of devices that are not present or not ready typically returns ~0 (PCI_ERROR_RESPONSE). Previously we polled the Command register until we got a value other than ~0. This is sometimes a problem because Root Complex handling of RRS completions may include several retries and implementation-specific behavior that is invisible to software (see sec 2.3.2), so the exponential backoff in pci_dev_wait() may not work as intended. Linux enables Configuration RRS Software Visibility on all Root Ports that support it. If it is enabled, read the Vendor ID instead of the Command register. RRS completions cause immediate return of the 0x0001 reserved Vendor ID value, so the pci_dev_wait() backoff works correctly. When a read of Vendor ID eventually completes successfully by returning a non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be initialized and ready to respond to config requests. For conventional PCI devices or devices below Root Ports that don't support Configuration RRS Software Visibility, poll the Command register as before. 
This was developed independently, but is very similar to Stanislav Spassov's previous work at https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com Link: https://lore.kernel.org/r/20240827234848.4429-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Duc Dang <ducdang@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-12-18 07:59:11 -07:00
Robert Foss	c7bc023366	PM: runtime: Simplify pm_runtime_get_if_active() usage JIRA: https://issues.redhat.com/browse/RHEL-53569 Upstream Status: v6.9-rc1 Conflicts: Conflicts due to whitespace change DRM v6.9 backport drivers/gpu/drm/i915/intel_runtime_pm.c 0d08026ac609 ("net: ipa: kill ipa_clock_get_additional()") drivers/net/ipa/ipa_smp2p.c d3fcd7360338 ("PCI: Fix runtime PM race with PME polling") drivers/pci/pci.c commit c0ef3df8dbaef51ee4cfd58a471adf2eaee6f6b3 Author: Sakari Ailus <sakari.ailus@linux.intel.com> AuthorDate: Tue Jan 30 13:28:05 2024 +0200 Commit: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CommitDate: Mon Feb 12 16:57:47 2024 +0100 There are two ways to opportunistically increment a device's runtime PM usage count, calling either pm_runtime_get_if_active() or pm_runtime_get_if_in_use(). The former has an argument to tell whether to ignore the usage count or not, and the latter simply calls the former with ign_usage_count set to false. The other users that want to ignore the usage_count will have to explicitly set that argument to true which is a bit cumbersome. To make this function more practical to use, remove the ign_usage_count argument from the function. The main implementation is in a static function called pm_runtime_get_conditional() and implementations of pm_runtime_get_if_active() and pm_runtime_get_if_in_use() are moved to runtime.c. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Takashi Iwai <tiwai@suse.de> # sound/ Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> # drivers/accel/ivpu/ Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> # drivers/gpu/drm/i915/ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Robert Foss <rfoss@redhat.com>	2024-12-17 22:59:19 +01:00
Rado Vrbovsky	191f608532	Merge: PCI: ACS updates MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5246 ``` JIRA: https://issues.redhat.com/browse/RHEL-48601 Signed-off-by: Myron Stowe <mstowe@redhat.com> ``` Approved-by: John W. Linville <linville@redhat.com> Approved-by: David Arcari <darcari@redhat.com> Approved-by: Steve Best <sbest@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-12-09 08:21:20 +00:00
Myron Stowe	854d83025c	PCI: Bring the PCIe speed to MBps logic to new pcie_dev_speed_mbps() JIRA: https://issues.redhat.com/browse/RHEL-65598 Upstream Status: 100ae5d77f07f9f046106e228778c7aa1c6d3af3 commit 100ae5d77f07f9f046106e228778c7aa1c6d3af3 Author: Krishna chaitanya chundru <quic_krichai@quicinc.com> Date: Wed Jun 19 20:41:12 2024 +0530 PCI: Bring the PCIe speed to MBps logic to new pcie_dev_speed_mbps() Bring the switch case in pcie_link_speed_mbps() to new function to the header file so that it can be used in other places like in controller driver. Link: https://lore.kernel.org/linux-pci/20240619-opp_support-v15-3-aa769a2173a3@quicinc.com Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-11-04 15:44:21 -07:00
Rado Vrbovsky	14b4cc02eb	Merge: BPF 6.9 rebase MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5142 Rebase BPF subsystem to upstream version 6.9 JIRA: https://issues.redhat.com/browse/RHEL-23649 Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Approved-by: Viktor Malik <vmalik@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: Rafael Aquini <raquini@redhat.com> Approved-by: Mark Salter <msalter@redhat.com> Approved-by: Toke Høiland-Jørgensen <toke@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-10-30 07:25:08 +00:00
Rado Vrbovsky	aae21e3edb	Merge: Update CXL subsystem with content from v6.10 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5049 Back-port kernel's CXL subsystem core content from upstream v6.10 Notably excluded is "cxl/dax: Create dax devices for CXL RAM regions" (09d09e04d2fc). Also, the memory tiering code is updated to match v6.10. ## Approved Development Ticket JIRA: https://issues.redhat.com/browse/RHEL-54609 Depends: !4961 Signed-off-by: John W. Linville <linville@redhat.com> Approved-by: Chris von Recklinghausen <crecklin@redhat.com> Approved-by: Jeff Moyer <jmoyer@redhat.com> Approved-by: Myron Stowe <mstowe@redhat.com> Approved-by: Tony Camuso <tcamuso@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-10-30 07:20:45 +00:00
Rado Vrbovsky	67448d15b8	Merge: Update kernel's PCI subsystem to v6.11 MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5357 ``` This series updates RHEL9's PCI subsystem with content from upstream v6.11 - Merge tag 'pci-v6.11-fixes-4' of git://git.kernel.org/pub/scm/../pci/pci https://lkml.org/lkml/2024/9/13/ commit b7718454f937f50f44f98c1222f5135eaef29132 Merge: e936e7d4a83b fc8c818e7569 Merge tag 'pci-v6.11-fixes-3' of git://git.kernel.org/pub/scm/../pci/pci https://lkml.org/lkml/2024/9/6/1405 commit 487ee43bac846446fb3e832436bdedd7acb4fe46 Merge: a86b83f77797 8f62819aaace 4 files changed, 44 insertions(+), 5 deletions(-) Merge tag 'pci-v6.11-fixes-2' of git://git.kernel.org/pub/scm/../pci/pci https://lkml.org/lkml/2024/8/30/1561 commit 8101b2766d5bfee43a4de737107b9592db251470 Merge: 216d163165a9 150b572a7c1d 3 files changed, 21 insertions(+), 2 deletions(-) Merge tag 'pci-v6.11-fixes-1' of git://git.kernel.org/pub/scm/../pci/pci https://lkml.org/lkml/2024/8/1/1278 commit c0ecd6388360d930440cc5554026818895199923 Merge: 183d46ff422e 5560a612c20d 2 files changed, 11 insertions(+), 8 deletions(-) Merge tag 'pci-v6.11-changes' of git://git.kernel.org/pub/scm/../pci/pci https://lkml.org/lkml/2024/7/19/844 commit 3f386cb8ee9f04ff4be164ca7a1d0ef3f81f7374 Merge: 8e5c0abfa02d 45659274e608 105 files changed, 5208 insertions(+), 1932 deletions(-) All but three of the patches within the series back-ported cleanly. However, there were a few back-ports where some changes were made to the originating upstream patch due to it either not being quite up to date with more recent changes, or subsequent changes were made during its merge commit. All such occurrences are noted in the back-port's commit message with the same changes that occurred upstream being made in the back-port to keep things in sync.
v2: Removing back-ports of merge commit df5dd337283a "Merge branch 'pci/controller/qcom'" due to prerequisite content that conflicts with other MRs. Will create a separate MR for df5dd337283a once the dependent MRs have merged. JIRA: https://issues.redhat.com/browse/RHEL-59033 Signed-off-by: Myron Stowe <mstowe@redhat.com> ``` Approved-by: Andrew Halaney <ahalaney@redhat.com> Approved-by: Jarod Wilson <jarod@redhat.com> Approved-by: Mika Penttilä <mpenttil@redhat.com> Approved-by: John W. Linville <linville@redhat.com> Approved-by: Ivan Vecera <ivecera@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-10-19 08:16:08 +00:00
Jerome Marchand	33482c3f06	mm: Introduce vmap_page_range() to map pages in PCI address space JIRA: https://issues.redhat.com/browse/RHEL-23649 Conflicts: There is no loongarch arch on RHEL-9 kernel. commit d7bca9199a27b8690ae1c71dc11f825154af7234 Author: Alexei Starovoitov <ast@kernel.org> Date: Fri Mar 8 09:12:54 2024 -0800 mm: Introduce vmap_page_range() to map pages in PCI address space ioremap_page_range() should be used for ranges within vmalloc range only. The vmalloc ranges are allocated by get_vm_area(). PCI has "resource" allocator that manages PCI_IOBASE, IO_SPACE_LIMIT address range, hence introduce vmap_page_range() to be used exclusively to map pages in PCI address space. Fixes: 3e49a866c9dc ("mm: Enforce VM_IOREMAP flag and range in ioremap_page_range.") Reported-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/bpf/CANiq72ka4rir+RTN2FQoT=Vvprp_Ao-CvoYEkSNqtSY+RZj+AA@mail.gmail.com Signed-off-by: Jerome Marchand <jmarchan@redhat.com>	2024-10-15 10:49:14 +02:00
John W. Linville	b0fc7cbc66	PCI/CXL: Add 'cxl_bus' reset method for devices below CXL Ports JIRA: https://issues.redhat.com/browse/RHEL-54609 By default Secondary Bus Reset (SBR) is masked for CXL Ports (see CXL r3.1, sec 8.1.5.2). Add cxl_reset_bus_function() (method "cxl_bus") to set the "Unmask SBR" bit in the upstream CXL Port before performing the bus reset and restore the original value afterwards. This method allows the user to perform a bus reset on a CXL device without needing to set the "Unmask SBR" bit via a user tool. Link: https://lore.kernel.org/r/20240502165851.1948523-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> [bhelgaas: simplify commit log, invert condition to avoid negation] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> (cherry picked from commit 53c49b6e6dd2ebc1d3257ae838e067699229bc8d) Signed-off-by: John W. Linville <linville@redhat.com>	2024-10-07 14:03:30 -04:00
John W. Linville	9c6c2e14df	PCI/CXL: Fail bus reset if upstream CXL Port has SBR masked JIRA: https://issues.redhat.com/browse/RHEL-54609 Per CXL spec r3.1, sec 8.1.5.2, the Secondary Bus Reset (SBR) bit in the Bridge Control register of a CXL port has no effect unless the "Unmask SBR" bit is set. Return -ENOTTY if we attempt a bus reset on a device below a CXL Port where "Unmask SBR" is 0. Otherwise, the bus reset would appear to have succeeded even though setting the bridge SBR bit had no effect. Link: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/ Link: https://lore.kernel.org/r/20240502165851.1948523-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> [bhelgaas: simplify commit log and comments] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> (cherry picked from commit b1956e2d0713e210a56ae65ad3488ae36f833e76) Signed-off-by: John W. Linville <linville@redhat.com>	2024-10-07 14:03:30 -04:00
John W. Linville	9fcbacec86	cxl: Calculate and store PCI link latency for the downstream ports JIRA: https://issues.redhat.com/browse/RHEL-54609 The latency is calculated by dividing the flit size over the bandwidth. Add support to retrieve the flit size for the CXL switch device and calculate the latency of the PCIe link. Cache the latency number with cxl_dport. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319621931.2212653.6800240203604822886.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com> (cherry picked from commit 4d07a05397c8c15c37c8c3abb7afaea1dcd2f0e7) Signed-off-by: John W. Linville <linville@redhat.com>	2024-10-07 13:43:50 -04:00
Myron Stowe	1f804955d7	PCI: Warn on missing cfg_access_lock during secondary bus reset JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: 920f6468924f8dc7e0e6e1510d000888592ef861 Conflict(s): There isn't a conflict per se; the upstream patch was based on code prior to commit c9d52fb313d3 "PCI: Revert the cfg_access_lock lockdep mechanism". However, commit c9d52fb313d3 was in place prior to this patch so it doesn't apply cleanly. commit 920f6468924f8dc7e0e6e1510d000888592ef861 Author: Dan Williams <dan.j.williams@intel.com> Date: Thu May 30 18:04:29 2024 -0700 PCI: Warn on missing cfg_access_lock during secondary bus reset The recent adventure with adding lockdep tracking for cfg_access_lock, while it yielded many false positives [1], did catch a true positive in the pci_reset_bus() path [2]. So, while lockdep is difficult to deploy, open coding a check that cfg_access_lock is held during the reset is feasible. While this does not offer a full backtrace, it should be sufficient to implicate the caller of pci_bridge_secondary_bus_reset() as a path that needs investigation. Link: https://lore.kernel.org/r/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com Link: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html [1] Link: http://lore.kernel.org/r/cfb50601-5d2a-4676-a958-1bd3f1b06654@intel.com [2] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 13:25:26 -06:00
Myron Stowe	2df1e4bea1	PCI: Fix devres regression in pci_intx() JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: 00f89ae4e759a7eef07e4188e1534af7dd2c7e9c commit 00f89ae4e759a7eef07e4188e1534af7dd2c7e9c Author: Philipp Stanner <pstanner@redhat.com> Date: Thu Jul 25 14:07:30 2024 +0200 PCI: Fix devres regression in pci_intx() pci_intx() becomes managed if pcim_enable_device() has been called in advance. Commit 25216afc9db5 ("PCI: Add managed pcim_intx()") changed this behavior so that pci_intx() always leads to creation of a separate device resource for itself, whereas earlier, a shared resource was used for all PCI devres operations. Unfortunately, pci_intx() seems to be used in some drivers' remove() paths; in the managed case this causes a device resource to be created on driver detach, which causes .probe() to fail if the driver is reloaded: pci 0000:00:1f.2: Resources present before probing Fix the regression by only redirecting pci_intx() to its managed twin pcim_intx() if the pci_command changes. Link: https://lore.kernel.org/r/20240725120729.59788-2-pstanner@redhat.com Fixes: 25216afc9db5 ("PCI: Add managed pcim_intx()") Reported-by: Damien Le Moal <dlemoal@kernel.org> Closes: https://lore.kernel.org/all/b8f4ba97-84fc-4b7e-ba1a-99de2d9f0118@kernel.org/ Signed-off-by: Philipp Stanner <pstanner@redhat.com> [bhelgaas: add error message to commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 11:50:43 -06:00
Myron Stowe	2d1cf513dd	PCI: Add managed pcim_intx() JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: 25216afc9db53d85dc648aba8fb7f6d31f2c8731 commit 25216afc9db53d85dc648aba8fb7f6d31f2c8731 Author: Philipp Stanner <pstanner@redhat.com> Date: Thu Jun 13 13:50:23 2024 +0200 PCI: Add managed pcim_intx() pci_intx() is a "hybrid" function, i.e., it is managed if pcim_enable_device() has been called, but unmanaged otherwise. Add pcim_intx(), which is always managed, and implement pci_intx() using it. Remove the now-unused struct pci_devres.orig_intx and .restore_intx and find_pci_dr(). Link: https://lore.kernel.org/r/20240613115032.29098-11-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> [kwilczynski: squashed in https://lore.kernel.org/r/426645d40776198e0fcc942f4a6cac4433c7a9aa.camel@redhat.com to fix problem reported and tested by Ashish Kalra <Ashish.Kalra@amd.com>: https://lore.kernel.org/r/20240708214656.4721-1-Ashish.Kalra@amd.com https://lore.kernel.org/r/8c4634e9-4f02-4c54-9c89-d75e2f4bf026@amd.com/] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 11:50:42 -06:00
Myron Stowe	43236dc826	PCI: Remove struct pci_devres.enabled status bit JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: 77f79ac8de0f490fca4f0a5f2e1e38eeee191f05 commit 77f79ac8de0f490fca4f0a5f2e1e38eeee191f05 Author: Philipp Stanner <pstanner@redhat.com> Date: Thu Jun 13 13:50:20 2024 +0200 PCI: Remove struct pci_devres.enabled status bit The struct pci_devres has a separate boolean to track whether a device is enabled. That, however, can easily be tracked in an agnostic manner through the function pci_is_enabled(). Using it allows for simplifying the PCI devres implementation. Replace the separate 'enabled' status bit from struct pci_devres with calls to pci_is_enabled() at the appropriate places. Link: https://lore.kernel.org/r/20240613115032.29098-8-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 11:50:42 -06:00
Myron Stowe	ec1f828164	PCI: Document hybrid devres hazards JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: 81fcf28e74a3ffda67a6896cd38843d80bc9ec68 commit 81fcf28e74a3ffda67a6896cd38843d80bc9ec68 Author: Philipp Stanner <pstanner@redhat.com> Date: Thu Jun 13 13:50:19 2024 +0200 PCI: Document hybrid devres hazards These functions: pci_request_region() pci_request_regions() pci_request_regions_exclusive() pci_request_selected_regions() pci_request_selected_regions_exclusive() pci_intx() are "hybrid" functions that are managed if pcim_enable_device() has been called, but unmanaged otherwise. This is confusing and has already caused a bug (in `8558de401b` ("drm/vboxvideo: use managed pci functions")) because users believe all PCI functions, such as pci_iomap_range(), can become managed that way, which is not the case. Add comments to the relevant functions' docstrings that warn users about this behavior. Link: https://lore.kernel.org/r/20240613115032.29098-7-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 11:50:42 -06:00
Myron Stowe	662face48b	PCI: Add managed pcim_request_region() JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: d47bde708086c77b1ceeb7643e600089f63dd03b commit d47bde708086c77b1ceeb7643e600089f63dd03b Author: Philipp Stanner <pstanner@redhat.com> Date: Thu Jun 13 13:50:18 2024 +0200 PCI: Add managed pcim_request_region() These existing functions: pci_request_region() pci_request_selected_regions() pci_request_selected_regions_exclusive() are "hybrid" functions built on __pci_request_region() and are managed if pcim_enable_device() has been called, but unmanaged otherwise. Add these new functions: pcim_request_region() pcim_request_region_exclusive() These are always managed and use the new pcim_addr_devres tracking infrastructure instead of find_pci_dr() and struct pci_devres.region_mask. Implement the hybrid functions using the new "pure" functions and remove struct pci_devres.region_mask, which is no longer needed. Link: https://lore.kernel.org/r/20240613115032.29098-6-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 11:50:42 -06:00
Myron Stowe	37d2c8944f	PCI: Add managed partial-BAR request and map infrastructure JIRA: https://issues.redhat.com/browse/RHEL-59033 Upstream Status: bbaff68bf4a404bee5f5e20e7b1e30301b26304a commit bbaff68bf4a404bee5f5e20e7b1e30301b26304a Author: Philipp Stanner <pstanner@redhat.com> Date: Thu Jun 13 13:50:16 2024 +0200 PCI: Add managed partial-BAR request and map infrastructure The pcim_iomap_devres table tracks entire-BAR mappings, so we can't use it to build a managed version of pci_iomap_range(), which maps partial BARs. Add struct pcim_addr_devres, which can track request and mapping of both entire BARs and partial BARs. Add the following internal devres functions based on struct pcim_addr_devres: pcim_iomap_region() # request & map entire BAR pcim_iounmap_region() # unmap & release entire BAR pcim_request_region() # request entire BAR pcim_release_region() # release entire BAR pcim_request_all_regions() # request all entire BARs pcim_release_all_regions() # release all entire BARs Rework the following public interfaces using the new infrastructure listed above: pcim_iomap() # map partial BAR pcim_iounmap() # unmap partial BAR pcim_iomap_regions() # request & map specified BARs pcim_iomap_regions_request_all() # request all BARs, map specified BARs pcim_iounmap_regions() # unmap & release specified BARs Link: https://lore.kernel.org/r/20240613115032.29098-4-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-10-01 11:50:41 -06:00
Myron Stowe	4eeffc7615	PCI: Extend ACS configurability JIRA: https://issues.redhat.com/browse/RHEL-48601 Upstream Status: 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25 commit 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25 Author: Vidya Sagar <vidyas@nvidia.com> Date: Tue Jun 25 21:01:50 2024 +0530 PCI: Extend ACS configurability PCIe ACS settings control the level of isolation and the possible P2P paths between devices. With greater isolation the kernel will create smaller iommu_groups and with less isolation there is more HW that can achieve P2P transfers. From a virtualization perspective all devices in the same iommu_group must be assigned to the same VM as they lack security isolation. There is no way for the kernel to automatically know the correct ACS settings for any given system and workload. Existing command line options (e.g., disable_acs_redir) allow only for large scale change, disabling all isolation, but this is not sufficient for more complex cases. Add a kernel command-line option 'config_acs' to directly control all the ACS bits for specific devices, which allows the operator to setup the right level of isolation to achieve the desired P2P configuration. The definition is future proof; when new ACS bits are added to the spec the open syntax can be extended. ACS needs to be setup early in the kernel boot as the ACS settings affect how iommu_groups are formed. iommu_group formation is a one time event during initial device discovery, so changing ACS bits after kernel boot can result in an inaccurate view of the iommu_groups compared to the current isolation configuration. ACS applies to PCIe Downstream Ports and multi-function devices. The default ACS settings are strict and deny any direct traffic between two functions. This results in the smallest iommu_group the HW can support. Frequently these values result in slow or non-working P2PDMA. ACS offers a range of security choices controlling how traffic is allowed to go directly between two devices. 
Some popular choices: - Full prevention - Translated requests can be direct, with various options - Asymmetric direct traffic, A can reach B but not the reverse - All traffic can be direct Along with some other less common ones for special topologies. The intention is that this option would be used with expert knowledge of the HW capability and workload to achieve the desired configuration. Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [bhelgaas: add example, tidy printk formats] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-09-19 14:13:25 -06:00
CKI Backport Bot	699ed49382	PCI: Add missing bridge lock to pci_bus_lock() JIRA: https://issues.redhat.com/browse/RHEL-59331 CVE: CVE-2024-46750 commit a4e772898f8bf2e7e1cf661a12c60a5612c4afab Author: Dan Williams <dan.j.williams@intel.com> Date: Thu May 30 18:04:35 2024 -0700 PCI: Add missing bridge lock to pci_bus_lock() One of the true positives that the cfg_access_lock lockdep effort identified is this sequence: WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70 RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70 Call Trace: <TASK> ? __warn+0x8c/0x190 ? pci_bridge_secondary_bus_reset+0x5d/0x70 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? pci_bridge_secondary_bus_reset+0x5d/0x70 pci_reset_bus+0x1d8/0x270 vmd_probe+0x778/0xa10 pci_device_probe+0x95/0x120 Where pci_reset_bus() users are triggering unlocked secondary bus resets. Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses pci_bus_lock() before issuing the reset which locks everything but the bridge itself. For the same motivation as adding: bridge = pci_upstream_bridge(dev); if (bridge) pci_dev_lock(bridge); to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add pci_dev_lock() for @bus->self to pci_bus_lock(). 
Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dwillia2-xfh.jf.intel.com Reported-by: Imre Deak <imre.deak@intel.com> Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com.notmuch Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org> [bhelgaas: squash in recursive locking deadlock fix from Keith Busch: https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: CKI Backport Bot <cki-ci-bot+cki-gitlab-backport-bot@redhat.com>	2024-09-18 10:00:50 +00:00
Rado Vrbovsky	2131e1ec0c	Merge: PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/5087 ``` JIRA: https://issues.redhat.com/browse/RHEL-54981 CVE: CVE-2024-42302 Signed-off-by: Myron Stowe <mstowe@redhat.com> ``` Approved-by: Desnes Nunes <desnesn@redhat.com> Approved-by: John W. Linville <linville@redhat.com> Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com> Merged-by: Rado Vrbovsky <rvrbovsk@redhat.com>	2024-09-11 07:16:05 +00:00
Myron Stowe	3482703d91	PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal JIRA: https://issues.redhat.com/browse/RHEL-54981 CVE: CVE-2024-42302 Upstream Status: 11a1f4bc47362700fcbde717292158873fb847ed commit 11a1f4bc47362700fcbde717292158873fb847ed Author: Lukas Wunner <lukas@wunner.de> Date: Tue Jun 18 12:54:55 2024 +0200 PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal Keith reports a use-after-free when a DPC event occurs concurrently to hot-removal of the same portion of the hierarchy: The dpc_handler() awaits readiness of the secondary bus below the Downstream Port where the DPC event occurred. To do so, it polls the config space of the first child device on the secondary bus. If that child device is concurrently removed, accesses to its struct pci_dev cause the kernel to oops. That's because pci_bridge_wait_for_secondary_bus() neglects to hold a reference on the child device. Before v6.3, the function was only called on resume from system sleep or on runtime resume. Holding a reference wasn't necessary back then because the pciehp IRQ thread could never run concurrently. (On resume from system sleep, IRQs are not enabled until after the resume_noirq phase. And runtime resume is always awaited before a PCI device is removed.) However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset"), which introduced that, failed to appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a reference on the child device because dpc_handler() and pciehp may indeed run concurrently. The commit was backported to v5.10+ stable kernels, so that's the oldest one affected. Add the missing reference acquisition. 
Abridged stack trace: BUG: unable to handle page fault for address: 00000000091400c0 CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0 RIP: pci_bus_read_config_dword+0x17/0x50 pci_dev_wait() pci_bridge_wait_for_secondary_bus() dpc_reset_link() pcie_do_recovery() dpc_handler() Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset") Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/ Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.1718707743.git.lukas@wunner.de Reported-by: Keith Busch <kbusch@kernel.org> Tested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-23 09:16:10 -06:00
Myron Stowe	57bfaa7c9d	PCI: Revert the cfg_access_lock lockdep mechanism JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: c9d52fb313d3719d69a040f4ca78a3e2e95fba21 commit c9d52fb313d3719d69a040f4ca78a3e2e95fba21 Author: Dan Williams <dan.j.williams@intel.com> Date: Thu May 30 18:04:24 2024 -0700 PCI: Revert the cfg_access_lock lockdep mechanism While the experiment did reveal that there are additional places that are missing the lock during secondary bus reset, one of the places that needs to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep annotation. Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is currently dependent on the fact that the device_lock() is marked lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that annotation, pci_bus_lock() would need to use something like a new pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the topology, and a hope that the depth of a PCI tree never exceeds the max value for a lockdep subclass. The alternative to ripping out the lockdep coverage would be to deploy a dynamic lock key for every PCI device. Unfortunately, there is evidence that increasing the number of keys that lockdep needs to track to be per-PCI-device is prohibitively expensive for something like the cfg_access_lock. The main motivation for adding the annotation in the first place was to catch unlocked secondary bus resets, not necessarily catch lock ordering problems between cfg_access_lock and other locks. Solve that narrower problem with follow-on patches, and just do a targeted revert for now. 
Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()") Reported-by: Imre Deak <imre.deak@intel.com> Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Cc: Jani Saarinen <jani.saarinen@intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	815e68c019	PCI: Make pcie_bandwidth_capable() static JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: fe4a83ec07818f2243eac584488e65397699550c commit fe4a83ec07818f2243eac584488e65397699550c Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Tue May 7 15:17:58 2024 +0300 PCI: Make pcie_bandwidth_capable() static pcie_bandwidth_capable() is only used within pci.c, make it static. Link: https://lore.kernel.org/r/20240507121758.13849-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	e9671490b6	PCI: Annotate pci_cache_line_size variables as __ro_after_init JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: c7ae396ec597b2f3644f90f5c7278674b0527aa9 commit c7ae396ec597b2f3644f90f5c7278674b0527aa9 Author: Heiner Kallweit <hkallweit1@gmail.com> Date: Thu Apr 18 20:29:21 2024 +0200 PCI: Annotate pci_cache_line_size variables as __ro_after_init Annotate both variables as __ro_after_init, enforcing that they can't be changed after the init phase. Link: https://lore.kernel.org/r/52fd058d-6d72-48db-8e61-5fcddcd0aa51@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	837b24c1e2	PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: 256df20c590bf0e4d63ac69330cf23faddac3e08 commit 256df20c590bf0e4d63ac69330cf23faddac3e08 Author: Mario Limonciello <mario.limonciello@amd.com> Date: Thu Mar 7 10:37:09 2024 -0600 PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports Hewlett-Packard HP Pavilion 17 Notebook PC/1972 is an Intel Ivy Bridge system with a muxless AMD Radeon dGPU. Attempting to use the dGPU fails with the following sequence: ACPI Error: Aborting method \AMD3._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) radeon 0000:01:00.0: not ready 1023ms after resume; waiting radeon 0000:01:00.0: not ready 2047ms after resume; waiting radeon 0000:01:00.0: not ready 4095ms after resume; waiting radeon 0000:01:00.0: not ready 8191ms after resume; waiting radeon 0000:01:00.0: not ready 16383ms after resume; waiting radeon 0000:01:00.0: not ready 32767ms after resume; waiting radeon 0000:01:00.0: not ready 65535ms after resume; giving up radeon 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible The issue is that the Root Port the dGPU is connected to can't handle the transition from D3cold to D0 so the dGPU can't properly exit runtime PM. The existing logic in pci_bridge_d3_possible() checks for systems that are newer than 2015 to decide that D3 is safe. This would nominally work for an Ivy Bridge system (which was discontinued in 2015), but this system appears to have continued to receive BIOS updates until 2017 and so this existing logic doesn't appropriately capture it. Add the system to bridge_d3_blacklist to prevent D3cold from being used. 
Link: https://lore.kernel.org/r/20240307163709.323-1-mario.limonciello@amd.com Reported-by: Eric Heintzmann <heintzmann.eric@free.fr> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3229 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Eric Heintzmann <heintzmann.eric@free.fr> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	b4d4ad3baa	PCI: Do not wait for disconnected devices when resuming JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: 6613443ffc49d03e27f0404978f685c4eac43fba commit 6613443ffc49d03e27f0404978f685c4eac43fba Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Thu Feb 8 15:23:21 2024 +0200 PCI: Do not wait for disconnected devices when resuming On runtime resume, pci_dev_wait() is called: pci_pm_runtime_resume() pci_pm_bridge_power_up_actions() pci_bridge_wait_for_secondary_bus() pci_dev_wait() While a device is runtime suspended along with its PCI hierarchy, the device could get disconnected. In such case, the link will not come up no matter how long pci_dev_wait() waits for it. Besides the above mentioned case, there could be other ways to get the device disconnected while pci_dev_wait() is waiting for the link to come up. Make pci_dev_wait() exit if the device is already disconnected to avoid unnecessary delay. The use cases of pci_dev_wait() boil down to two: 1. Waiting for the device after reset 2. pci_bridge_wait_for_secondary_bus() The callers in both cases seem to benefit from propagating the disconnection as an error, even if device disconnection would be more analogous to the case where there is no device in the first place, which returns 0 from pci_dev_wait(). In case 2, it results in unnecessary marking of the devices disconnected again but that is just harmless extra work. Also make sure the compiler does not become too clever with dev->error_state and use READ_ONCE() to force a fetch for the up-to-date value. Link: https://lore.kernel.org/r/20240208132322.4811-1-ilpo.jarvinen@linux.intel.com Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	73528b22d7	PCI: Remove unused pci_enable_device_io() JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: 844177a80753fc173131f3e591124c8dcbc89812 commit 844177a80753fc173131f3e591124c8dcbc89812 Author: Heiner Kallweit <hkallweit1@gmail.com> Date: Sat Mar 23 18:16:36 2024 +0100 PCI: Remove unused pci_enable_device_io() After the last user was removed, remove this PCI core function. It's very unlikely that we'll see a new device requiring io space access, even though memory space access is supported. Link: https://lore.kernel.org/r/213ebf62-53a3-42b7-8518-ecd5cd6d6b08@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	e1ce2d3e77	PCI: Clarify intent of LT wait JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: cdc6c4abcb313be1b7118b6e86eb99a85a626578 commit cdc6c4abcb313be1b7118b6e86eb99a85a626578 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Tue Apr 23 16:08:20 2024 +0300 PCI: Clarify intent of LT wait Clarify the comment relating to the LT wait and the purpose of the check that implements the implementation note in PCIe r6.1 sec 7.5.3.7. Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/20240423130820.43824-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	af5d87ce2b	PCI: Wait for Link Training==0 before starting Link retrain JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: 73cb3a35f94db723c0211ad099bce55b2155e3f0 commit 73cb3a35f94db723c0211ad099bce55b2155e3f0 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Tue Apr 23 16:08:19 2024 +0300 PCI: Wait for Link Training==0 before starting Link retrain Two changes were made in link retraining logic independent of each other. The commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") added a check to pcie_retrain_link() to ensure no Link Training is currently active to address the Implementation Note in PCIe r6.1 sec 7.5.3.7. At that time pcie_wait_for_retrain() only checked for the Link Training (LT) bit being cleared. The commit 680e9c47a229 ("PCI: Add support for polling DLLLA to pcie_retrain_link()") generalized pcie_wait_for_retrain() into pcie_wait_for_link_status() which can wait either for LT or the Data Link Layer Link Active (DLLLA) bit with 'use_lt' argument and supporting waiting for either cleared or set using 'active' argument. In the merge commit 1abb47390350 ("Merge branch 'pci/enumeration'"), those two divergent branches converged. The merge changed LT bit checking added in the commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") to now wait for completion of any ongoing Link Training using DLLLA bit being set if 'use_lt' is false. When 'use_lt' is false, the pseudo-code steps of what occurs in pcie_retrain_link(): 1. Wait for DLLLA==1 2. Trigger link to retrain 3. Wait for DLLLA==1 Step 3 waits for the link to come up from the retraining triggered by Step 2. As Step 1 is supposed to wait for any ongoing retraining to end, using DLLLA also for it does not make sense because link training being active is still indicated using LT bit, not with DLLLA. Correct the pcie_wait_for_link_status() parameters in Step 1 to only wait for LT==0 to ensure there is no ongoing Link Training. 
This only impacts the Target Speed quirk, which is the only case where waiting for DLLLA bit is used. It currently works in the problematic case by means of link training getting initiated by hardware repeatedly and respecting the new link parameters set by the caller, which then make training succeed and bring the link up, setting DLLLA and causing pcie_wait_for_link_status() to return success. We are not supposed to rely on luck and need to make sure that LT transitioned through the inactive state though before we initiate link training by hand via RL (Retrain Link) bit. Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'") Link: https://lore.kernel.org/r/20240423130820.43824-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	2d69d79e9a	PCI: Lock upstream bridge for pci_reset_function() JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: 7e89efc6e9e402839643cb297bab14055c547f07 commit 7e89efc6e9e402839643cb297bab14055c547f07 Author: Dave Jiang <dave.jiang@intel.com> Date: Thu May 2 09:57:31 2024 -0700 PCI: Lock upstream bridge for pci_reset_function() Fix a long-standing locking gap for missing pci_cfg_access_lock() while manipulating bridge reset registers and configuration during pci_reset_bus_function(). If there is an upstream bridge, lock it before locking the device itself. pci_dev_lock() calls pci_cfg_access_lock(), which blocks the writing of PCI config space by user space. Add lockdep assertion via pci_dev->cfg_access_lock to verify pci_dev->block_cfg_access is set. Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240502165851.1948523-3-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-08-15 15:31:13 -06:00
Myron Stowe	934bc80496	PCI: Place interrupt related code into irq.c JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 1e8cc8e6bd85d7b25e0ed3759aedde804c91ba97 Conflict(s) drivers/pci/Makefile: Same conflict(s) as encountered upstream see 420b8c360695 Merge branch 'pci/enumeration'. commit 1e8cc8e6bd85d7b25e0ed3759aedde804c91ba97 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Mon Jan 29 13:36:54 2024 +0200 PCI: Place interrupt related code into irq.c Interrupt related code is spread into irq.c, pci.c, and setup-irq.c. Group them into pre-existing irq.c. Link: https://lore.kernel.org/r/20240129113655.3368-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:55:19 -06:00
Myron Stowe	39776a1a0a	PCI: Move devres code from pci.c to devres.c JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 815a3909ead7440e2827042e5ec618f4396f022c commit 815a3909ead7440e2827042e5ec618f4396f022c Author: Philipp Stanner <pstanner@redhat.com> Date: Wed Jan 31 10:00:23 2024 +0100 PCI: Move devres code from pci.c to devres.c The file pci.c is very large and contains a number of devres functions. These functions should now reside in devres.c. Move as much devres-specific code from pci.c to devres.c as possible. There are a few callers left in pci.c that do devres operations. These should be ported in the future. Add corresponding TODOs. The reason they are not moved right now in this commit is that PCI's devres currently implements a sort of "hybrid-mode": pci_request_region(), for instance, does not have a corresponding pcim_ equivalent, yet. Instead, the function can be made managed by previously calling pcim_enable_device() (instead of pci_enable_device()). This makes it unreasonable to move pci_request_region() to devres.c. Moving the functions would require changes to PCI's API and is, therefore, left for future work. In summary, this commit serves as a preparation step for a following patch series that will cleanly separate the PCI's managed and unmanaged API. Link: https://lore.kernel.org/r/20240131090023.12331-5-pstanner@redhat.com Suggested-by: Danilo Krummrich <dakr@redhat.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:52:32 -06:00
Myron Stowe	6104885fc7	PCI/ASPM: Disable L1 before configuring L1 Substates JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 64dbb2d707444f691539fb12aacf81797786c10b commit 64dbb2d707444f691539fb12aacf81797786c10b Author: Bjorn Helgaas <bhelgaas@google.com> Date: Tue Mar 5 15:15:25 2024 -0600 PCI/ASPM: Disable L1 before configuring L1 Substates Per PCIe r6.1, sec 5.5.4, L1 must be disabled while setting ASPM L1 PM Substates enable bits. Previously this was enforced by clearing PCI_EXP_LNKCTL_ASPMC before calling pci_restore_aspm_l1ss_state(). Move the L1 (and L0s, although that doesn't seem required) disable into pci_restore_aspm_l1ss_state() itself so it's closer to the code that depends on it. Link: https://lore.kernel.org/r/20240223213733.GA115410@bhelgaas Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:52:32 -06:00
Myron Stowe	c5e76d4ac3	PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state() JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: c198fafa0125e97728d16411aa653602900ab0bc commit c198fafa0125e97728d16411aa653602900ab0bc Author: David E. Box <david.e.box@linux.intel.com> Date: Fri Feb 23 14:58:51 2024 -0600 PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state() ASPM state is saved and restored from pci_save/restore_pcie_state(). Since the LTR Capability is linked with ASPM, move the LTR save and restore calls there as well. No functional change intended. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20240128233212.1139663-6-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-6-helgaas@kernel.org Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:52:32 -06:00
Myron Stowe	39e98b2dcc	PCI/ASPM: Save L1 PM Substates Capability for suspend/resume JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 17423360a27ae58c1850f588bdd8013bbfcd250b commit 17423360a27ae58c1850f588bdd8013bbfcd250b Author: David E. Box <david.e.box@linux.intel.com> Date: Fri Feb 23 14:58:50 2024 -0600 PCI/ASPM: Save L1 PM Substates Capability for suspend/resume 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") restored the L1 PM Substates Capability after resume, which reduced power consumption by making the ASPM L1.x states work after resume. a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"") reverted 4ff116d0d5fd because resume failed on some systems, so power consumption after resume increased again. a7152be79b62 mentioned that we restore L1 PM substate configuration even though ASPM L1 may already be enabled. This is due the fact that the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state(). Save and restore the L1 PM Substates Capability, following PCIe r6.1, sec 5.5.4 more closely by: 1) Do not restore ASPM configuration in pci_restore_pcie_state() but do that after PCIe capability is restored in pci_restore_aspm_state() following PCIe r6.1, sec 5.5.4. 2) If BIOS reenables L1SS, particularly L1.2, we need to clear the enables in the right order, downstream before upstream. Defer restoring the L1SS config until we are at the downstream component. Then update the config for both ends of the link in the prescribed order. 3) Program ASPM L1 PM substate configuration before L1 enables. 4) Program ASPM L1 PM substate enables last, after rest of the fields in the capability are programmed. 
[bhelgaas: commit log, squash L1SS-related patches, do both LNKCTL restores in pci_restore_pcie_state()] Link: https://lore.kernel.org/r/20240128233212.1139663-3-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240128233212.1139663-4-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-5-helgaas@kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Co-developed-by: David E. Box <david.e.box@linux.intel.com> Reported-by: Koba Ko <koba.ko@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Tasev Nikola <tasev.stefanoska@skynet.be> # Asus UX305FA Cc: Mark Enriquez <enriquezmark36@gmail.com> Cc: Thomas Witt <kernel@witt.link> Cc: Werner Sembach <wse@tuxedocomputers.com> Cc: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:52:32 -06:00
Myron Stowe	11beb3fff4	PCI/ASPM: Move pci_save_ltr_state() to aspm.c JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 1e11b5494c3dbb1e5fce7e95021c1698799c7288 commit 1e11b5494c3dbb1e5fce7e95021c1698799c7288 Author: David E. Box <david.e.box@linux.intel.com> Date: Fri Feb 23 14:58:49 2024 -0600 PCI/ASPM: Move pci_save_ltr_state() to aspm.c Even when CONFIG_PCIEASPM is not set, we save and restore the LTR Capability so that if ASPM L1.2 and LTR were configured by the platform, ASPM L1.2 will still work after suspend/resume, when that platform configuration may be lost. See dbbfadf231 ("PCI/ASPM: Save LTR Capability for suspend/resume"). Since ASPM L1.2 depends on the LTR Capability, move the save/restore code to the part of aspm.c that is always compiled regardless of CONFIG_PCIEASPM. No functional change intended. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20240128233212.1139663-5-david.e.box@linux.intel.com [bhelgaas: commit log, reorder to make this a pure move] Link: https://lore.kernel.org/r/20240223205851.114931-4-helgaas@kernel.org Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:52:32 -06:00
Myron Stowe	47a9ded2ef	PCI/ASPM: Move pci_configure_ltr() to aspm.c JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: fa84f4435a6202dd90248517f41e54bf3fb85bc5 Conflict(s): Patching file drivers/pci/pci.h; Hunk #2 FAILED at 572. False conflict as this upstream patch's basis was prior to upstream commit 1e560864159d "PCI/ASPM: Fix deadlock when enabling ASPM". commit fa84f4435a6202dd90248517f41e54bf3fb85bc5 Author: David E. Box <david.e.box@linux.intel.com> Date: Fri Feb 23 14:58:47 2024 -0600 PCI/ASPM: Move pci_configure_ltr() to aspm.c The Latency Tolerance Reporting (LTR) mechanism supports the ASPM L1.2 state and is only configured when CONFIG_PCIEASPM is set. Move pci_configure_ltr() and pci_bridge_reconfigure_ltr() into aspm.c since they only build when CONFIG_PCIEASPM is set. No functional change intended. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20240128233212.1139663-2-david.e.box@linux.intel.com [bhelgaas: commit log, split build change from function moves] Link: https://lore.kernel.org/r/20240223205851.114931-2-helgaas@kernel.org Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:52:11 -06:00
Myron Stowe	e4b1b96519	PCI/AER: Generalize TLP Header Log reading JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 0a5a46a6a61be7b63c12c18495d427f91f3662a9 commit 0a5a46a6a61be7b63c12c18495d427f91f3662a9 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Tue Feb 6 15:57:15 2024 +0200 PCI/AER: Generalize TLP Header Log reading Both AER and DPC RP PIO provide TLP Header Log registers (PCIe r6.1 secs 7.8.4 & 7.9.14) to convey error diagnostics but the struct is named after AER as the struct aer_header_log_regs. Also, not all places that handle TLP Header Log use the struct and the struct members are named individually. Generalize the struct name and members, and use it consistently where TLP Header Log is being handled so that a pcie_read_tlp_log() helper can be easily added. Link: https://lore.kernel.org/r/20240206135717.8565-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: drop ixgbe changes for now, tidy whitespace] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:49:50 -06:00
Myron Stowe	33e16d3ebf	PCI: Add debug print for device ready delay JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 0a5ef95923e01aa93210d22e0d62d66b601238d7 commit 0a5ef95923e01aa93210d22e0d62d66b601238d7 Author: Ido Schimmel <idosch@nvidia.com> Date: Wed Nov 15 13:17:17 2023 +0100 PCI: Add debug print for device ready delay Currently, the time it took a PCI device to become ready after reset is only printed if it was longer than 1000ms ('PCI_RESET_WAIT'). However, for debugging purposes it is useful to know this time even if it was shorter. For example, with the device I am working on, hardware engineers asked to verify that it becomes ready on the first try (no delay). To that end, add a debug level print that can be enabled using dynamic debug. Example: # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset # dmesg -c | grep ready # echo "file drivers/pci/pci.c +p" > /sys/kernel/debug/dynamic_debug/control # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset # dmesg -c | grep ready [ 396.060335] mlxsw_spectrum4 0000:01:00.0: ready 0ms after bus reset # echo "file drivers/pci/pci.c -p" > /sys/kernel/debug/dynamic_debug/control # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset # dmesg -c | grep ready Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Myron Stowe <mstowe@redhat.com>	2024-05-13 15:49:50 -06:00

997 Commits