Centos-kernel-stream-9/drivers/pci/pci.h

989 lines
31 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef DRIVERS_PCI_H
#define DRIVERS_PCI_H
#include <linux/pci.h>
/* Number of possible devfns: 0.0 to 1f.7 inclusive */
#define MAX_NR_DEVFNS 256
#define PCI_FIND_CAP_TTL 48
PCI: Recognize Thunderbolt devices Detect on probe whether a PCI device is part of a Thunderbolt controller. Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234 on such devices. Detect presence of this VSEC and cache it in a newly added is_thunderbolt bit in struct pci_dev. Also, add a helper to check whether a given PCI device is situated on a Thunderbolt daisy chain (i.e., below a PCI device with is_thunderbolt set). The necessity arises from the following: * If an external Thunderbolt GPU is connected to a dual GPU laptop, that GPU is currently registered with vga_switcheroo even though it can neither drive the laptop's panel nor be powered off by the platform. To vga_switcheroo it will appear as if two discrete GPUs are present. As a result, when the external GPU is runtime suspended, vga_switcheroo will cut power to the internal discrete GPU which may not be runtime suspended at all at this moment. The solution is to not register external GPUs with vga_switcheroo, which necessitates a way to recognize if they're on a Thunderbolt daisy chain. * Dual GPU MacBook Pros introduced 2011+ can no longer switch external DisplayPort ports between GPUs. (They're no longer just used for DP but have become combined DP/Thunderbolt ports.) The driver to switch the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence of a Thunderbolt controller and, if found, keep external ports permanently switched to the discrete GPU. v2: Make kerneldoc for pci_is_thunderbolt_attached() more precise, drop portion of commit message pertaining to separate series. (Bjorn Helgaas) Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Michael Jamet <michael.jamet@intel.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Cc: Amir Levy <amir.jer.levy@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: http://patchwork.freedesktop.org/patch/msgid/0ab165a4a35c0b60f29d4c306c653ead14fcd8f9.1489145162.git.lukas@wunner.de
2017-03-10 20:23:45 +00:00
#define PCI_VSEC_ID_INTEL_TBT 0x1234 /* Thunderbolt */
#define PCIE_LINK_RETRAIN_TIMEOUT_MS 1000
/*
* Power stable to PERST# inactive.
*
* See the "Power Sequencing and Reset Signal Timings" table of the PCI Express
* Card Electromechanical Specification, Revision 5.1, Section 2.9.2, Symbol
* "T_PVPERL".
*/
#define PCIE_T_PVPERL_MS 100
/*
* REFCLK stable before PERST# inactive.
*
* See the "Power Sequencing and Reset Signal Timings" table of the PCI Express
* Card Electromechanical Specification, Revision 5.1, Section 2.9.2, Symbol
* "T_PERST-CLK".
*/
#define PCIE_T_PERST_CLK_US 100
/*
* End of conventional reset (PERST# de-asserted) to first configuration
* request (device able to respond with a "Request Retry Status" completion),
* from PCIe r6.0, sec 6.6.1.
*/
#define PCIE_T_RRS_READY_MS 100
/*
* PCIe r6.0, sec 5.3.3.2.1 <PME Synchronization>
* Recommends 1ms to 10ms timeout to check L2 ready.
*/
#define PCIE_PME_TO_L2_TIMEOUT_US 10000
/* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */
#define PCIE_MSG_TYPE_R_RC 0
#define PCIE_MSG_TYPE_R_ADDR 1
#define PCIE_MSG_TYPE_R_ID 2
#define PCIE_MSG_TYPE_R_BC 3
#define PCIE_MSG_TYPE_R_LOCAL 4
#define PCIE_MSG_TYPE_R_GATHER 5
/* Power Management Messages; PCIe r6.0, sec 2.2.8.2 */
#define PCIE_MSG_CODE_PME_TURN_OFF 0x19
/* INTx Mechanism Messages; PCIe r6.0, sec 2.2.8.1 */
#define PCIE_MSG_CODE_ASSERT_INTA 0x20
#define PCIE_MSG_CODE_ASSERT_INTB 0x21
#define PCIE_MSG_CODE_ASSERT_INTC 0x22
#define PCIE_MSG_CODE_ASSERT_INTD 0x23
#define PCIE_MSG_CODE_DEASSERT_INTA 0x24
#define PCIE_MSG_CODE_DEASSERT_INTB 0x25
#define PCIE_MSG_CODE_DEASSERT_INTC 0x26
#define PCIE_MSG_CODE_DEASSERT_INTD 0x27
extern const unsigned char pcie_link_speed[];
extern bool pci_early_dump;
bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
bool pcie_cap_has_lnkctl2(const struct pci_dev *dev);
bool pcie_cap_has_rtctl(const struct pci_dev *dev);
/* Functions internal to the PCI core code */
#ifdef CONFIG_DMI
extern const struct attribute_group pci_dev_smbios_attr_group;
#endif
enum pci_mmap_api {
PCI_MMAP_SYSFS, /* mmap on /sys/bus/pci/devices/<BDF>/resource<N> */
PCI_MMAP_PROCFS /* mmap on /proc/bus/pci/<BDF> */
};
int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
enum pci_mmap_api mmap_api);
bool pci_reset_supported(struct pci_dev *dev);
void pci_init_reset_methods(struct pci_dev *dev);
int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
struct pci_cap_saved_data {
u16 cap_nr;
bool cap_extended;
unsigned int size;
u32 data[];
};
struct pci_cap_saved_state {
struct hlist_node next;
struct pci_cap_saved_data cap;
};
void pci_allocate_cap_save_buffers(struct pci_dev *dev);
void pci_free_cap_save_buffers(struct pci_dev *dev);
int pci_add_cap_save_buffer(struct pci_dev *dev, char cap, unsigned int size);
int pci_add_ext_cap_save_buffer(struct pci_dev *dev,
u16 cap, unsigned int size);
struct pci_cap_saved_state *pci_find_saved_cap(struct pci_dev *dev, char cap);
struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
u16 cap);
#define PCI_PM_D2_DELAY 200 /* usec; see PCIe r4.0, sec 5.9.1 */
#define PCI_PM_D3HOT_WAIT 10 /* msec */
#define PCI_PM_D3COLD_WAIT 100 /* msec */
void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
void pci_refresh_power_state(struct pci_dev *dev);
int pci_power_up(struct pci_dev *dev);
void pci_disable_enabled_device(struct pci_dev *dev);
int pci_finish_runtime_suspend(struct pci_dev *dev);
void pcie_clear_device_status(struct pci_dev *dev);
void pcie_clear_root_pme_status(struct pci_dev *dev);
bool pci_check_pme_status(struct pci_dev *dev);
void pci_pme_wakeup_bus(struct pci_bus *bus);
void pci_pme_restore(struct pci_dev *dev);
bool pci_dev_need_resume(struct pci_dev *dev);
void pci_dev_adjust_pme(struct pci_dev *dev);
void pci_dev_complete_resume(struct pci_dev *pci_dev);
void pci_config_pm_runtime_get(struct pci_dev *dev);
void pci_config_pm_runtime_put(struct pci_dev *dev);
void pci_pm_init(struct pci_dev *dev);
void pci_ea_init(struct pci_dev *dev);
void pci_msi_init(struct pci_dev *dev);
void pci_msix_init(struct pci_dev *dev);
bool pci_bridge_d3_possible(struct pci_dev *dev);
void pci_bridge_d3_update(struct pci_dev *dev);
int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type);
static inline bool pci_bus_rrs_vendor_id(u32 l)
PCI: Wait for device readiness with Configuration RRS JIRA: https://issues.redhat.com/browse/RHEL-71363 Upstream Status: d591f6804e7e1310881c9224d72247a2b65039af commit d591f6804e7e1310881c9224d72247a2b65039af Author: Bjorn Helgaas <bhelgaas@google.com> Date: Tue Aug 27 18:48:46 2024 -0500 PCI: Wait for device readiness with Configuration RRS After a device reset, delays are required before the device can successfully complete config accesses. PCIe r6.0, sec 6.6, specifies some delays required before software can perform config accesses. Devices that require more time after those delays may respond to config accesses with Configuration Request Retry Status (RRS) completions. Callers of pci_dev_wait() are responsible for delays until the device can respond to config accesses. pci_dev_wait() waits any additional time until the device can successfully complete config accesses. Reading config space of devices that are not present or not ready typically returns ~0 (PCI_ERROR_RESPONSE). Previously we polled the Command register until we got a value other than ~0. This is sometimes a problem because Root Complex handling of RRS completions may include several retries and implementation-specific behavior that is invisible to software (see sec 2.3.2), so the exponential backoff in pci_dev_wait() may not work as intended. Linux enables Configuration RRS Software Visibility on all Root Ports that support it. If it is enabled, read the Vendor ID instead of the Command register. RRS completions cause immediate return of the 0x0001 reserved Vendor ID value, so the pci_dev_wait() backoff works correctly. When a read of Vendor ID eventually completes successfully by returning a non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be initialized and ready to respond to config requests. For conventional PCI devices or devices below Root Ports that don't support Configuration RRS Software Visibility, poll the Command register as before. This was developed independently, but is very similar to Stanislav Spassov's previous work at https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com Link: https://lore.kernel.org/r/20240827234848.4429-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Duc Dang <ducdang@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-12-16 16:42:52 +00:00
{
return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG;
}
static inline void pci_wakeup_event(struct pci_dev *dev)
{
/* Wait 100 ms before the system can be put into a sleep state. */
pm_wakeup_event(&dev->dev, 100);
}
static inline bool pci_has_subordinate(struct pci_dev *pci_dev)
{
return !!(pci_dev->subordinate);
}
PCI: Put PCIe ports into D3 during suspend Currently the Linux PCI core does not touch power state of PCI bridges and PCIe ports when system suspend is entered. Leaving them in D0 consumes power unnecessarily and may prevent the CPU from entering deeper C-states. With recent PCIe hardware we can power down the ports to save power given that we take into account few restrictions: - The PCIe port hardware is recent enough, starting from 2015. - Devices connected to PCIe ports are effectively in D3cold once the port is transitioned to D3 (the config space is not accessible anymore and the link may be powered down). - Devices behind the PCIe port need to be allowed to transition to D3cold and back. There is a way both drivers and userspace can forbid this. - If the device behind the PCIe port is capable of waking the system it needs to be able to do so from D3cold. This patch adds a new flag to struct pci_device called 'bridge_d3'. This flag is set and cleared by the PCI core whenever there is a change in power management state of any of the devices behind the PCIe port. When system later on is suspended we only need to check this flag and if it is true transition the port to D3 otherwise we leave it in D0. Also provide override mechanism via command line parameter "pcie_port_pm=[off|force]" that can be used to disable or enable the feature regardless of the BIOS manufacturing date. Tested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-02 08:17:12 +00:00
static inline bool pci_power_manageable(struct pci_dev *pci_dev)
{
/*
* Currently we allow normal PCI devices and PCI bridges transition
* into D3 if their bridge_d3 is set.
*/
return !pci_has_subordinate(pci_dev) || pci_dev->bridge_d3;
}
static inline bool pcie_downstream_port(const struct pci_dev *dev)
{
int type = pci_pcie_type(dev);
return type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
type == PCI_EXP_TYPE_PCIE_BRIDGE;
}
void pci_vpd_init(struct pci_dev *dev);
extern const struct attribute_group pci_dev_vpd_attr_group;
/* PCI Virtual Channel */
int pci_save_vc_state(struct pci_dev *dev);
void pci_restore_vc_state(struct pci_dev *dev);
void pci_allocate_vc_save_buffers(struct pci_dev *dev);
/* PCI /proc functions */
#ifdef CONFIG_PROC_FS
int pci_proc_attach_device(struct pci_dev *dev);
int pci_proc_detach_device(struct pci_dev *dev);
int pci_proc_detach_bus(struct pci_bus *bus);
#else
static inline int pci_proc_attach_device(struct pci_dev *dev) { return 0; }
static inline int pci_proc_detach_device(struct pci_dev *dev) { return 0; }
static inline int pci_proc_detach_bus(struct pci_bus *bus) { return 0; }
#endif
/* Functions for PCI Hotplug drivers to use */
int pci_hp_add_bridge(struct pci_dev *dev);
PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Conflict(s): Patching file drivers/pci/Makefile: Hunk #1 FAILED at 4. Post context conflict due to originating patch not taking into account upstream commit 1e8cc8e6bd85 "PCI: Place interrupt related code into irq.c" (see merge commit b8de187056f "Merge branch 'pci/sysfs'"). commit be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Author: Lukas Wunner <lukas@wunner.de> Date: Mon Oct 30 13:32:12 2023 +0100 PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for space-constrained devices such as routers, such a configuration may actually make sense. However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled, unnecessarily increasing the kernel's size. To rectify that: * Move pci_mmap_fits() to mmap.c. It is not only needed by pci-sysfs.c, but also proc.c. * Move pci_dev_type to probe.c and make it private. It references pci_dev_attr_groups in pci-sysfs.c. Make that public instead for consistency with pci_dev_groups, pcibus_groups and pci_bus_groups, which are likewise public and referenced by struct definitions in pci-driver.c and probe.c. * Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty static inlines for pci_{create,remove}_legacy_files() and pci_{create,remove}_sysfs_dev_files(). Result: vmlinux size is reduced by 122996 bytes in my arm 32-bit test build. Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-05-03 01:04:49 +00:00
#if defined(CONFIG_SYSFS) && defined(HAVE_PCI_LEGACY)
void pci_create_legacy_files(struct pci_bus *bus);
void pci_remove_legacy_files(struct pci_bus *bus);
#else
static inline void pci_create_legacy_files(struct pci_bus *bus) { }
static inline void pci_remove_legacy_files(struct pci_bus *bus) { }
#endif
/* Lock for read/write access to pci device and bus lists */
extern struct rw_semaphore pci_bus_sem;
extern struct mutex pci_slot_mutex;
extern raw_spinlock_t pci_lock;
extern unsigned int pci_pm_d3hot_delay;
#ifdef CONFIG_PCI_MSI
void pci_no_msi(void);
#else
static inline void pci_no_msi(void) { }
#endif
void pci_realloc_get_opt(char *);
static inline int pci_no_d1d2(struct pci_dev *dev)
{
unsigned int parent_dstates = 0;
if (dev->bus->self)
parent_dstates = dev->bus->self->no_d1d2;
return (dev->no_d1d2 || parent_dstates);
}
PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Conflict(s): Patching file drivers/pci/Makefile: Hunk #1 FAILED at 4. Post context conflict due to originating patch not taking into account upstream commit 1e8cc8e6bd85 "PCI: Place interrupt related code into irq.c" (see merge commit b8de187056f "Merge branch 'pci/sysfs'"). commit be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Author: Lukas Wunner <lukas@wunner.de> Date: Mon Oct 30 13:32:12 2023 +0100 PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for space-constrained devices such as routers, such a configuration may actually make sense. However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled, unnecessarily increasing the kernel's size. To rectify that: * Move pci_mmap_fits() to mmap.c. It is not only needed by pci-sysfs.c, but also proc.c. * Move pci_dev_type to probe.c and make it private. It references pci_dev_attr_groups in pci-sysfs.c. Make that public instead for consistency with pci_dev_groups, pcibus_groups and pci_bus_groups, which are likewise public and referenced by struct definitions in pci-driver.c and probe.c. * Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty static inlines for pci_{create,remove}_legacy_files() and pci_{create,remove}_sysfs_dev_files(). Result: vmlinux size is reduced by 122996 bytes in my arm 32-bit test build. Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-05-03 01:04:49 +00:00
#ifdef CONFIG_SYSFS
int pci_create_sysfs_dev_files(struct pci_dev *pdev);
void pci_remove_sysfs_dev_files(struct pci_dev *pdev);
extern const struct attribute_group *pci_dev_groups[];
PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Conflict(s): Patching file drivers/pci/Makefile: Hunk #1 FAILED at 4. Post context conflict due to originating patch not taking into account upstream commit 1e8cc8e6bd85 "PCI: Place interrupt related code into irq.c" (see merge commit b8de187056f "Merge branch 'pci/sysfs'"). commit be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Author: Lukas Wunner <lukas@wunner.de> Date: Mon Oct 30 13:32:12 2023 +0100 PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for space-constrained devices such as routers, such a configuration may actually make sense. However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled, unnecessarily increasing the kernel's size. To rectify that: * Move pci_mmap_fits() to mmap.c. It is not only needed by pci-sysfs.c, but also proc.c. * Move pci_dev_type to probe.c and make it private. It references pci_dev_attr_groups in pci-sysfs.c. Make that public instead for consistency with pci_dev_groups, pcibus_groups and pci_bus_groups, which are likewise public and referenced by struct definitions in pci-driver.c and probe.c. * Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty static inlines for pci_{create,remove}_legacy_files() and pci_{create,remove}_sysfs_dev_files(). Result: vmlinux size is reduced by 122996 bytes in my arm 32-bit test build. Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-05-03 01:04:49 +00:00
extern const struct attribute_group *pci_dev_attr_groups[];
extern const struct attribute_group *pcibus_groups[];
extern const struct attribute_group *pci_bus_groups[];
PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Conflict(s): Patching file drivers/pci/Makefile: Hunk #1 FAILED at 4. Post context conflict due to originating patch not taking into account upstream commit 1e8cc8e6bd85 "PCI: Place interrupt related code into irq.c" (see merge commit b8de187056f "Merge branch 'pci/sysfs'"). commit be9c3a4c8be13326e434d8817d6dda6c5d2835f5 Author: Lukas Wunner <lukas@wunner.de> Date: Mon Oct 30 13:32:12 2023 +0100 PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for space-constrained devices such as routers, such a configuration may actually make sense. However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled, unnecessarily increasing the kernel's size. To rectify that: * Move pci_mmap_fits() to mmap.c. It is not only needed by pci-sysfs.c, but also proc.c. * Move pci_dev_type to probe.c and make it private. It references pci_dev_attr_groups in pci-sysfs.c. Make that public instead for consistency with pci_dev_groups, pcibus_groups and pci_bus_groups, which are likewise public and referenced by struct definitions in pci-driver.c and probe.c. * Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty static inlines for pci_{create,remove}_legacy_files() and pci_{create,remove}_sysfs_dev_files(). Result: vmlinux size is reduced by 122996 bytes in my arm 32-bit test build. Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-05-03 01:04:49 +00:00
#else
static inline int pci_create_sysfs_dev_files(struct pci_dev *pdev) { return 0; }
static inline void pci_remove_sysfs_dev_files(struct pci_dev *pdev) { }
#define pci_dev_groups NULL
#define pci_dev_attr_groups NULL
#define pcibus_groups NULL
#define pci_bus_groups NULL
#endif
extern unsigned long pci_hotplug_io_size;
extern unsigned long pci_hotplug_mmio_size;
extern unsigned long pci_hotplug_mmio_pref_size;
extern unsigned long pci_hotplug_bus_size;
/**
* pci_match_one_device - Tell if a PCI device structure has a matching
* PCI device id structure
* @id: single PCI device id structure to match
* @dev: the PCI device structure to match against
*
* Returns the matching pci_device_id structure or %NULL if there is no match.
*/
static inline const struct pci_device_id *
pci_match_one_device(const struct pci_device_id *id, const struct pci_dev *dev)
{
if ((id->vendor == PCI_ANY_ID || id->vendor == dev->vendor) &&
(id->device == PCI_ANY_ID || id->device == dev->device) &&
(id->subvendor == PCI_ANY_ID || id->subvendor == dev->subsystem_vendor) &&
(id->subdevice == PCI_ANY_ID || id->subdevice == dev->subsystem_device) &&
!((id->class ^ dev->class) & id->class_mask))
return id;
return NULL;
}
PCI: introduce pci_slot Currently, /sys/bus/pci/slots/ only exposes hotplug attributes when a hotplug driver is loaded, but PCI slots have attributes such as address, speed, width, etc. that are not related to hotplug at all. Introduce pci_slot as the primary data structure and kobject model. Hotplug attributes described in hotplug_slot become a secondary structure associated with the pci_slot. This patch only creates the infrastructure that allows the separation of PCI slot attributes and hotplug attributes. In this patch, the PCI hotplug core remains the only user of this infrastructure, and thus, /sys/bus/pci/slots/ will still only become populated when a hotplug driver is loaded. A later patch in this series will add a second user of this new infrastructure and demonstrate splitting the task of exposing pci_slot attributes from hotplug_slot attributes. - Make pci_slot the primary sysfs entity. hotplug_slot becomes a subsidiary structure. o pci_create_slot() creates and registers a slot with the PCI core o pci_slot_add_hotplug() gives it hotplug capability - Change the prototype of pci_hp_register() to take the bus and slot number (on parent bus) as parameters. - Remove all the ->get_address methods since this functionality is now handled by pci_slot directly. [achiang@hp.com: rpaphp-correctly-pci_hp_register-for-empty-pci-slots] Tested-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: make headers_check happy] [akpm@linux-foundation.org: nuther build fix] [akpm@linux-foundation.org: fix typo in #include] Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Matthew Wilcox <matthew@wil.cx> Cc: Greg KH <greg@kroah.com> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Cc: Len Brown <lenb@kernel.org> Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-06-10 21:28:50 +00:00
/* PCI slot sysfs helper code */
#define to_pci_slot(s) container_of(s, struct pci_slot, kobj)
extern struct kset *pci_slots_kset;
struct pci_slot_attribute {
struct attribute attr;
ssize_t (*show)(struct pci_slot *, char *);
ssize_t (*store)(struct pci_slot *, const char *, size_t);
};
#define to_pci_slot_attr(s) container_of(s, struct pci_slot_attribute, attr)
enum pci_bar_type {
pci_bar_unknown, /* Standard PCI BAR probe */
pci_bar_io, /* An I/O port BAR */
pci_bar_mem32, /* A 32-bit memory BAR */
pci_bar_mem64, /* A 64-bit memory BAR */
};
struct device *pci_get_host_bridge_device(struct pci_dev *dev);
void pci_put_host_bridge_device(struct device *dev);
int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int rrs_timeout);
PCI: Workaround IDT switch ACS Source Validation erratum Some IDT switches incorrectly flag an ACS Source Validation error on completions for config read requests even though PCIe r4.0, sec 6.12.1.1, says that completions are never affected by ACS Source Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36: Item #36 - Downstream port applies ACS Source Validation to Completions Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that completions are never affected by ACS Source Validation. However, completions received by a downstream port of the PCIe switch from a device that has not yet captured a PCIe bus number are incorrectly dropped by ACS Source Validation by the switch downstream port. Workaround: Issue a CfgWr1 to the downstream device before issuing the first CfgRd1 to the device. This allows the downstream device to capture its bus number; ACS Source Validation no longer stops completions from being forwarded by the downstream port. It has been observed that Microsoft Windows implements this workaround already; however, some versions of Linux and other operating systems may not. When doing the first config read to probe for a device, if the device is behind an IDT switch with this erratum: 1. Disable ACS Source Validation if enabled 2. Wait for device to become ready to accept config accesses (by using the Config Request Retry Status mechanism) 3. Do a config write to the endpoint 4. Enable ACS Source Validation (if it was enabled to begin with) The workaround suggested by IDT is basically only step 3, but we don't know when the device is ready to accept config requests. That means we need to do config reads until we receive a non-Config Request Retry Status, which means we need to disable ACS SV temporarily. Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> [bhelgaas: changelog, clean up whitespace, fold in unused variable fix from Anders Roxell <anders.roxell@linaro.org>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2018-07-09 15:31:25 +00:00
bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int rrs_timeout);
int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int rrs_timeout);
PCI: Workaround IDT switch ACS Source Validation erratum Some IDT switches incorrectly flag an ACS Source Validation error on completions for config read requests even though PCIe r4.0, sec 6.12.1.1, says that completions are never affected by ACS Source Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36: Item #36 - Downstream port applies ACS Source Validation to Completions Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that completions are never affected by ACS Source Validation. However, completions received by a downstream port of the PCIe switch from a device that has not yet captured a PCIe bus number are incorrectly dropped by ACS Source Validation by the switch downstream port. Workaround: Issue a CfgWr1 to the downstream device before issuing the first CfgRd1 to the device. This allows the downstream device to capture its bus number; ACS Source Validation no longer stops completions from being forwarded by the downstream port. It has been observed that Microsoft Windows implements this workaround already; however, some versions of Linux and other operating systems may not. When doing the first config read to probe for a device, if the device is behind an IDT switch with this erratum: 1. Disable ACS Source Validation if enabled 2. Wait for device to become ready to accept config accesses (by using the Config Request Retry Status mechanism) 3. Do a config write to the endpoint 4. Enable ACS Source Validation (if it was enabled to begin with) The workaround suggested by IDT is basically only step 3, but we don't know when the device is ready to accept config requests. That means we need to do config reads until we receive a non-Config Request Retry Status, which means we need to disable ACS SV temporarily. Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> [bhelgaas: changelog, clean up whitespace, fold in unused variable fix from Anders Roxell <anders.roxell@linaro.org>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2018-07-09 15:31:25 +00:00
int pci_setup_device(struct pci_dev *dev);
PCI: Batch BAR sizing operations JIRA: https://issues.redhat.com/browse/RHEL-76025 Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=enumeration&id=4453f360862e5d9f0807941d613162c3f7a36559 In maintainer's PCI 'enumeration' branch, slated for v6.14. https://lore.kernel.org/all/20250120224418.GA906057@bhelgaas/ commit 4453f360862e5d9f0807941d613162c3f7a36559 (pci/enumeration) Author: Alex Williamson <alex.williamson@redhat.com> Date: Mon Jan 20 11:21:59 2025 -0700 PCI: Batch BAR sizing operations Toggling memory enable is free on bare metal, but potentially expensive in virtualized environments as the device MMIO spaces are added and removed from the VM address space, including DMA mapping of those spaces through the IOMMU where peer-to-peer is supported. Currently memory decode is disabled around sizing each individual BAR, even for SR-IOV BARs while VF Enable is cleared. This can be better optimized for virtual environments by sizing a set of BARs at once, stashing the resulting mask into an array, while only toggling memory enable once. This also naturally improves the SR-IOV path as the caller becomes responsible for any necessary decode disables while sizing BARs, therefore SR-IOV BARs are sized relying only on the VF Enable rather than toggling the PF memory enable in the command register. Link: https://lore.kernel.org/r/20250120182202.1878581-1-alex.williamson@redhat.com Reported-by: Mitchell Augustin <mitchell.augustin@canonical.com> Link: https://lore.kernel.org/r/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Mitchell Augustin <mitchell.augustin@canonical.com> Reviewed-by: Mitchell Augustin <mitchell.augustin@canonical.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2025-01-24 16:12:41 +00:00
void __pci_size_stdbars(struct pci_dev *dev, int count,
unsigned int pos, u32 *sizes);
int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
PCI: Batch BAR sizing operations JIRA: https://issues.redhat.com/browse/RHEL-76025 Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=enumeration&id=4453f360862e5d9f0807941d613162c3f7a36559 In maintainer's PCI 'enumeration' branch, slated for v6.14. https://lore.kernel.org/all/20250120224418.GA906057@bhelgaas/ commit 4453f360862e5d9f0807941d613162c3f7a36559 (pci/enumeration) Author: Alex Williamson <alex.williamson@redhat.com> Date: Mon Jan 20 11:21:59 2025 -0700 PCI: Batch BAR sizing operations Toggling memory enable is free on bare metal, but potentially expensive in virtualized environments as the device MMIO spaces are added and removed from the VM address space, including DMA mapping of those spaces through the IOMMU where peer-to-peer is supported. Currently memory decode is disabled around sizing each individual BAR, even for SR-IOV BARs while VF Enable is cleared. This can be better optimized for virtual environments by sizing a set of BARs at once, stashing the resulting mask into an array, while only toggling memory enable once. This also naturally improves the SR-IOV path as the caller becomes responsible for any necessary decode disables while sizing BARs, therefore SR-IOV BARs are sized relying only on the VF Enable rather than toggling the PF memory enable in the command register. Link: https://lore.kernel.org/r/20250120182202.1878581-1-alex.williamson@redhat.com Reported-by: Mitchell Augustin <mitchell.augustin@canonical.com> Link: https://lore.kernel.org/r/CAHTA-uYp07FgM6T1OZQKqAdSA5JrZo0ReNEyZgQZub4mDRrV5w@mail.gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Mitchell Augustin <mitchell.augustin@canonical.com> Reviewed-by: Mitchell Augustin <mitchell.augustin@canonical.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2025-01-24 16:12:41 +00:00
struct resource *res, unsigned int reg, u32 *sizes);
void pci_configure_ari(struct pci_dev *dev);
void __pci_bus_size_bridges(struct pci_bus *bus,
PCI / ACPI: Use boot-time resource allocation rules during hotplug On x86 platforms, the kernel respects PCI resource assignments from the BIOS and only reassigns resources for unassigned BARs at boot time. However, with the ACPI-based hotplug (acpiphp), it ignores the BIOS' PCI resource assignments completely and reassigns all resources by itself. This causes differences in PCI resource allocation between boot time and runtime hotplug to occur, which is generally undesirable and sometimes actively breaks things. Namely, if there are enough resources, reassigning all PCI resources during runtime hotplug should work, but it may fail if the resources are constrained. This may happen, for instance, when some PCI devices with huge MMIO BARs are involved in the runtime hotplug operations, because the current PCI MMIO alignment algorithm may waste huge chunks of MMIO address space in those cases. On the Alexander's Sony VAIO VPCZ23A4R the BIOS allocates limited MMIO resources for the dock station which contains a device (graphics adapter) with a 256MB MMIO BAR. An attempt to reassign that during runtime hotplug causes the dock station MMIO window to be exhausted and acpiphp fails to allocate resources for the majority of devices on the dock station as a result. To prevent that from happening, modify acpiphp to follow the boot time resources allocation behavior so that the BIOS' resource assignments are respected during runtime hotplug too. [rjw: Changelog] References: https://bugzilla.kernel.org/show_bug.cgi?id=56531 Reported-and-tested-by: Alexander E. Patrakov <patrakov@gmail.com> Tested-by: Illya Klymov <xanf@xanf.me> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: 3.9+ <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-22 23:01:35 +00:00
struct list_head *realloc_head);
void __pci_bus_assign_resources(const struct pci_bus *bus,
struct list_head *realloc_head,
struct list_head *fail_head);
bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
const char *pci_resource_name(struct pci_dev *dev, unsigned int i);
void pci_reassigndev_resource_alignment(struct pci_dev *dev);
void pci_disable_bridge_window(struct pci_dev *dev);
struct pci_bus *pci_bus_get(struct pci_bus *bus);
void pci_bus_put(struct pci_bus *bus);
2009-03-16 08:13:39 +00:00
PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: de9a6c8d5dbfedb5eb3722c822da0490f6a59a45 commit de9a6c8d5dbfedb5eb3722c822da0490f6a59a45 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Fri Oct 18 17:47:53 2024 +0300 PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed Currently, PCIe Link Speeds are adjusted by custom code rather than in a common function provided in PCI core. The PCIe bandwidth controller (bwctrl) introduces an in-kernel API, pcie_set_target_speed(), to set PCIe Link Speed. Convert Target Speed quirk to use the new API. The Target Speed quirk runs very early when bwctrl is not yet probed for a Port and can also run later when bwctrl is already setup for the Port, which requires the per port mutex (set_speed_mutex) to be only taken if the bwctrl setup is already complete. The new API is also intended to be used in an upcoming commit that adds a thermal cooling device to throttle PCIe bandwidth when thermal thresholds are reached. The PCIe bandwidth control procedure is as follows. The highest speed supported by the Port and the PCIe device which is not higher than the requested speed is selected and written into the Target Link Speed in the Link Control 2 Register. Then bandwidth controller retrains the PCIe Link. Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track PCIe Link Speed changes. While Bandwidth Notifications should also be generated when bandwidth controller alters the PCIe Link Speed, a few platforms do not deliver LMBS interrupt after Link Training as expected. Thus, after changing the Link Speed, bandwidth controller makes additional read for the Link Status Register to ensure cur_bus_speed is consistent with the new PCIe Link Speed. Link: https://lore.kernel.org/r/20241018144755.7875-8-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash devm_mutex_init() error checking from https://lore.kernel.org/r/20241030163139.2111689-1-andriy.shevchenko@linux.intel.com, drop export of pcie_set_target_speed()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2025-03-20 16:33:57 +00:00
#define PCIE_LNKCAP_SLS2SPEED(lnkcap) \
({ \
((lnkcap) == PCI_EXP_LNKCAP_SLS_64_0GB ? PCIE_SPEED_64_0GT : \
(lnkcap) == PCI_EXP_LNKCAP_SLS_32_0GB ? PCIE_SPEED_32_0GT : \
(lnkcap) == PCI_EXP_LNKCAP_SLS_16_0GB ? PCIE_SPEED_16_0GT : \
(lnkcap) == PCI_EXP_LNKCAP_SLS_8_0GB ? PCIE_SPEED_8_0GT : \
(lnkcap) == PCI_EXP_LNKCAP_SLS_5_0GB ? PCIE_SPEED_5_0GT : \
(lnkcap) == PCI_EXP_LNKCAP_SLS_2_5GB ? PCIE_SPEED_2_5GT : \
PCI_SPEED_UNKNOWN); \
})
/* PCIe link information from Link Capabilities 2 */
#define PCIE_LNKCAP2_SLS2SPEED(lnkcap2) \
((lnkcap2) & PCI_EXP_LNKCAP2_SLS_64_0GB ? PCIE_SPEED_64_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_32_0GB ? PCIE_SPEED_32_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_16_0GB ? PCIE_SPEED_16_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_8_0GB ? PCIE_SPEED_8_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_5_0GB ? PCIE_SPEED_5_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_2_5GB ? PCIE_SPEED_2_5GT : \
PCI_SPEED_UNKNOWN)
PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: de9a6c8d5dbfedb5eb3722c822da0490f6a59a45 commit de9a6c8d5dbfedb5eb3722c822da0490f6a59a45 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Fri Oct 18 17:47:53 2024 +0300 PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed Currently, PCIe Link Speeds are adjusted by custom code rather than in a common function provided in PCI core. The PCIe bandwidth controller (bwctrl) introduces an in-kernel API, pcie_set_target_speed(), to set PCIe Link Speed. Convert Target Speed quirk to use the new API. The Target Speed quirk runs very early when bwctrl is not yet probed for a Port and can also run later when bwctrl is already setup for the Port, which requires the per port mutex (set_speed_mutex) to be only taken if the bwctrl setup is already complete. The new API is also intended to be used in an upcoming commit that adds a thermal cooling device to throttle PCIe bandwidth when thermal thresholds are reached. The PCIe bandwidth control procedure is as follows. The highest speed supported by the Port and the PCIe device which is not higher than the requested speed is selected and written into the Target Link Speed in the Link Control 2 Register. Then bandwidth controller retrains the PCIe Link. Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track PCIe Link Speed changes. While Bandwidth Notifications should also be generated when bandwidth controller alters the PCIe Link Speed, a few platforms do not deliver LMBS interrupt after Link Training as expected. Thus, after changing the Link Speed, bandwidth controller makes additional read for the Link Status Register to ensure cur_bus_speed is consistent with the new PCIe Link Speed. Link: https://lore.kernel.org/r/20241018144755.7875-8-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash devm_mutex_init() error checking from https://lore.kernel.org/r/20241030163139.2111689-1-andriy.shevchenko@linux.intel.com, drop export of pcie_set_target_speed()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2025-03-20 16:33:57 +00:00
#define PCIE_LNKCTL2_TLS2SPEED(lnkctl2) \
((lnkctl2) == PCI_EXP_LNKCTL2_TLS_64_0GT ? PCIE_SPEED_64_0GT : \
(lnkctl2) == PCI_EXP_LNKCTL2_TLS_32_0GT ? PCIE_SPEED_32_0GT : \
(lnkctl2) == PCI_EXP_LNKCTL2_TLS_16_0GT ? PCIE_SPEED_16_0GT : \
(lnkctl2) == PCI_EXP_LNKCTL2_TLS_8_0GT ? PCIE_SPEED_8_0GT : \
(lnkctl2) == PCI_EXP_LNKCTL2_TLS_5_0GT ? PCIE_SPEED_5_0GT : \
(lnkctl2) == PCI_EXP_LNKCTL2_TLS_2_5GT ? PCIE_SPEED_2_5GT : \
PCI_SPEED_UNKNOWN)
/* PCIe speed to Mb/s reduced by encoding overhead */
#define PCIE_SPEED2MBS_ENC(speed) \
((speed) == PCIE_SPEED_64_0GT ? 64000*1/1 : \
(speed) == PCIE_SPEED_32_0GT ? 32000*128/130 : \
(speed) == PCIE_SPEED_16_0GT ? 16000*128/130 : \
(speed) == PCIE_SPEED_8_0GT ? 8000*128/130 : \
(speed) == PCIE_SPEED_5_0GT ? 5000*8/10 : \
(speed) == PCIE_SPEED_2_5GT ? 2500*8/10 : \
0)
static inline int pcie_dev_speed_mbps(enum pci_bus_speed speed)
{
switch (speed) {
case PCIE_SPEED_2_5GT:
return 2500;
case PCIE_SPEED_5_0GT:
return 5000;
case PCIE_SPEED_8_0GT:
return 8000;
case PCIE_SPEED_16_0GT:
return 16000;
case PCIE_SPEED_32_0GT:
return 32000;
case PCIE_SPEED_64_0GT:
return 64000;
default:
break;
}
return -EINVAL;
}
PCI: Store all PCIe Supported Link Speeds JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: d2bd39c0456b75be9dfc7d774b8d021355c26ae3 commit d2bd39c0456b75be9dfc7d774b8d021355c26ae3 Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Fri Oct 18 17:47:49 2024 +0300 PCI: Store all PCIe Supported Link Speeds The PCIe bandwidth controller added by a subsequent commit will require selecting PCIe Link Speeds that are lower than the Maximum Link Speed. The struct pci_bus only stores max_bus_speed. Even if PCIe r6.1 sec 8.2.1 currently disallows gaps in supported Link Speeds, the Implementation Note in PCIe r6.1 sec 7.5.3.18, recommends determining supported Link Speeds using the Supported Link Speeds Vector in the Link Capabilities 2 Register (when available) to "avoid software being confused if a future specification defines Links that do not require support for all slower speeds." Reuse code in pcie_get_speed_cap() to add pcie_get_supported_speeds() to query the Supported Link Speeds Vector of a PCIe device. The value is taken directly from the Supported Link Speeds Vector or synthesized from the Max Link Speed in the Link Capabilities Register when the Link Capabilities 2 Register is not available. The Supported Link Speeds Vector in the Link Capabilities Register 2 corresponds to the bus below on Root Ports and Downstream Ports, whereas it corresponds to the bus above on Upstream Ports and Endpoints (PCIe r6.1 sec 7.5.3.18): Supported Link Speeds Vector - This field indicates the supported Link speed(s) of the associated Port. Add supported_speeds into the struct pci_dev that caches the Supported Link Speeds Vector. supported_speeds contains a set of Link Speeds only in the case where PCIe Link Speed can be determined. Root Complex Integrated Endpoints do not have a well-defined Link Speed because they do not implement either of the Link Capabilities Registers, which is allowed by PCIe r6.1 sec 7.5.3 (the same limitation applies to determining cur_bus_speed and max_bus_speed that are PCI_SPEED_UNKNOWN in such case). This is of no concern from PCIe bandwidth controller point of view because such devices are not attached into a PCIe Root Port that could be controlled. The supported_speeds field keeps the extra reserved zero at the least significant bit to match the Link Capabilities 2 Register layout. An attempt was made to store supported_speeds field into the struct pci_bus as an intersection of both ends of the Link, however, the subordinate struct pci_bus is not available early enough. The Target Speed quirk (in pcie_failed_link_retrain()) can run either during initial scan or later, requiring it to use the API provided by the PCIe bandwidth controller to set the Target Link Speed in order to co-exist with the bandwidth controller. When the Target Speed quirk is calling the bandwidth controller during initial scan, the struct pci_bus is not yet initialized. As such, storing supported_speeds into the struct pci_bus is not viable. Suggested-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/20241018144755.7875-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: move pcie_get_supported_speeds() decl to drivers/pci/pci.h] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2025-03-20 16:33:56 +00:00
u8 pcie_get_supported_speeds(struct pci_dev *dev);
const char *pci_speed_string(enum pci_bus_speed speed);
void __pcie_print_link_status(struct pci_dev *dev, bool verbose);
void pcie_report_downtraining(struct pci_dev *dev);
static inline void __pcie_update_link_speed(struct pci_bus *bus, u16 linksta)
{
bus->cur_bus_speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];
}
void pcie_update_link_speed(struct pci_bus *bus);
/* Single Root I/O Virtualization */
struct pci_sriov {
int pos; /* Capability position */
int nres; /* Number of resources */
u32 cap; /* SR-IOV Capabilities */
u16 ctrl; /* SR-IOV Control */
u16 total_VFs; /* Total VFs associated with the PF */
u16 initial_VFs; /* Initial VFs associated with the PF */
u16 num_VFs; /* Number of VFs available */
u16 offset; /* First VF Routing ID offset */
u16 stride; /* Following VF stride */
u16 vf_device; /* VF device ID */
u32 pgsz; /* Page size for BAR alignment */
u8 link; /* Function Dependency Link */
u8 max_VF_buses; /* Max buses consumed by VFs */
u16 driver_max_VFs; /* Max num VFs driver supports */
struct pci_dev *dev; /* Lowest numbered PF */
struct pci_dev *self; /* This PF */
u32 class; /* VF device */
u8 hdr_type; /* VF header type */
u16 subsystem_vendor; /* VF subsystem vendor */
u16 subsystem_device; /* VF subsystem device */
resource_size_t barsz[PCI_SRIOV_NUM_BARS]; /* VF BAR size */
bool drivers_autoprobe; /* Auto probing of VFs by driver */
RH_KABI_RESERVE(1)
RH_KABI_RESERVE(2)
RH_KABI_RESERVE(3)
RH_KABI_RESERVE(4)
RH_KABI_RESERVE(5)
RH_KABI_RESERVE(6)
RH_KABI_RESERVE(7)
RH_KABI_RESERVE(8)
};
PCI/DOE: Create mailboxes on device enumeration JIRA: https://issues.redhat.com/browse/RHEL-23582 Currently a DOE instance cannot be shared by multiple drivers because each driver creates its own pci_doe_mb struct for a given DOE instance. For the same reason a DOE instance cannot be shared between the PCI core and a driver. Moreover, finding out which protocols a DOE instance supports requires creating a pci_doe_mb for it. If a device has multiple DOE instances, a driver looking for a specific protocol may need to create a pci_doe_mb for each of the device's DOE instances and then destroy those which do not support the desired protocol. That's obviously an inefficient way to do things. Overcome these issues by creating mailboxes in the PCI core on device enumeration. Provide a pci_find_doe_mailbox() API call to allow drivers to get a pci_doe_mb for a given (pci_dev, vendor, protocol) triple. This API is modeled after pci_find_capability() and can later be amended with a pci_find_next_doe_mailbox() call to iterate over all mailboxes of a given pci_dev which support a specific protocol. On removal, destroy the mailboxes in pci_destroy_dev(), after the driver is unbound. This allows drivers to use DOE in their ->remove() hook. On surprise removal, cancel ongoing DOE exchanges and prevent new ones from being scheduled. Thereby ensure that a hot-removed device doesn't needlessly wait for a running exchange to time out. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/40a6f973f72ef283d79dd55e7e6fddc7481199af.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com> (cherry picked from commit ac04840350e2c21a17d867b262a1586603b87a92) Signed-off-by: John W. Linville <linville@redhat.com>
2023-03-11 14:40:12 +00:00
#ifdef CONFIG_PCI_DOE
void pci_doe_init(struct pci_dev *pdev);
void pci_doe_destroy(struct pci_dev *pdev);
void pci_doe_disconnected(struct pci_dev *pdev);
#else
static inline void pci_doe_init(struct pci_dev *pdev) { }
static inline void pci_doe_destroy(struct pci_dev *pdev) { }
static inline void pci_doe_disconnected(struct pci_dev *pdev) { }
#endif
/**
* pci_dev_set_io_state - Set the new error state if possible.
*
* @dev: PCI device to set new error_state
* @new: the state we want dev to be in
*
PCI: hotplug: Allow marking devices as disconnected during bind/unbind Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2179137 Upstream Status: 74ff8864cc842be994853095dba6db48e716400a commit 74ff8864cc842be994853095dba6db48e716400a Author: Lukas Wunner <lukas@wunner.de> Date: Fri Jan 20 10:19:02 2023 +0100 PCI: hotplug: Allow marking devices as disconnected during bind/unbind On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.20+ Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-03-29 14:56:44 +00:00
* If the device is experiencing perm_failure, it has to remain in that state.
* Any other transition is allowed.
*
* Returns true if state has been changed to the requested state.
*/
static inline bool pci_dev_set_io_state(struct pci_dev *dev,
pci_channel_state_t new)
{
PCI: hotplug: Allow marking devices as disconnected during bind/unbind Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2179137 Upstream Status: 74ff8864cc842be994853095dba6db48e716400a commit 74ff8864cc842be994853095dba6db48e716400a Author: Lukas Wunner <lukas@wunner.de> Date: Fri Jan 20 10:19:02 2023 +0100 PCI: hotplug: Allow marking devices as disconnected during bind/unbind On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.20+ Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-03-29 14:56:44 +00:00
pci_channel_state_t old;
switch (new) {
case pci_channel_io_perm_failure:
PCI: hotplug: Allow marking devices as disconnected during bind/unbind Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2179137 Upstream Status: 74ff8864cc842be994853095dba6db48e716400a commit 74ff8864cc842be994853095dba6db48e716400a Author: Lukas Wunner <lukas@wunner.de> Date: Fri Jan 20 10:19:02 2023 +0100 PCI: hotplug: Allow marking devices as disconnected during bind/unbind On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.20+ Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-03-29 14:56:44 +00:00
xchg(&dev->error_state, pci_channel_io_perm_failure);
return true;
case pci_channel_io_frozen:
PCI: hotplug: Allow marking devices as disconnected during bind/unbind Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2179137 Upstream Status: 74ff8864cc842be994853095dba6db48e716400a commit 74ff8864cc842be994853095dba6db48e716400a Author: Lukas Wunner <lukas@wunner.de> Date: Fri Jan 20 10:19:02 2023 +0100 PCI: hotplug: Allow marking devices as disconnected during bind/unbind On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.20+ Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-03-29 14:56:44 +00:00
old = cmpxchg(&dev->error_state, pci_channel_io_normal,
pci_channel_io_frozen);
return old != pci_channel_io_perm_failure;
case pci_channel_io_normal:
PCI: hotplug: Allow marking devices as disconnected during bind/unbind Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2179137 Upstream Status: 74ff8864cc842be994853095dba6db48e716400a commit 74ff8864cc842be994853095dba6db48e716400a Author: Lukas Wunner <lukas@wunner.de> Date: Fri Jan 20 10:19:02 2023 +0100 PCI: hotplug: Allow marking devices as disconnected during bind/unbind On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.20+ Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-03-29 14:56:44 +00:00
old = cmpxchg(&dev->error_state, pci_channel_io_frozen,
pci_channel_io_normal);
return old != pci_channel_io_perm_failure;
default:
return false;
}
}
static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
{
pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
PCI/DOE: Create mailboxes on device enumeration JIRA: https://issues.redhat.com/browse/RHEL-23582 Currently a DOE instance cannot be shared by multiple drivers because each driver creates its own pci_doe_mb struct for a given DOE instance. For the same reason a DOE instance cannot be shared between the PCI core and a driver. Moreover, finding out which protocols a DOE instance supports requires creating a pci_doe_mb for it. If a device has multiple DOE instances, a driver looking for a specific protocol may need to create a pci_doe_mb for each of the device's DOE instances and then destroy those which do not support the desired protocol. That's obviously an inefficient way to do things. Overcome these issues by creating mailboxes in the PCI core on device enumeration. Provide a pci_find_doe_mailbox() API call to allow drivers to get a pci_doe_mb for a given (pci_dev, vendor, protocol) triple. This API is modeled after pci_find_capability() and can later be amended with a pci_find_next_doe_mailbox() call to iterate over all mailboxes of a given pci_dev which support a specific protocol. On removal, destroy the mailboxes in pci_destroy_dev(), after the driver is unbound. This allows drivers to use DOE in their ->remove() hook. On surprise removal, cancel ongoing DOE exchanges and prevent new ones from being scheduled. Thereby ensure that a hot-removed device doesn't needlessly wait for a running exchange to time out. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/40a6f973f72ef283d79dd55e7e6fddc7481199af.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com> (cherry picked from commit ac04840350e2c21a17d867b262a1586603b87a92) Signed-off-by: John W. Linville <linville@redhat.com>
2023-03-11 14:40:12 +00:00
pci_doe_disconnected(dev);
return 0;
}
/* pci_dev priv_flags */
#define PCI_DEV_ADDED 0
PCI: pciehp: Ignore Link Down/Up caused by DPC Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver. A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state. Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot. Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event. Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist(). The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes. Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag. If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default. pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform. The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete. This commit draws inspiration from previous attempts to synchronize DPC with pciehp: By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/ By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/ By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de Reported-by: Sinan Kaya <okaya@kernel.org> Reported-by: Ethan Zhao <haifeng.zhao@intel.com> Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>
2021-05-01 08:29:00 +00:00
#define PCI_DPC_RECOVERED 1
#define PCI_DPC_RECOVERING 2
static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
{
assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
}
static inline bool pci_dev_is_added(const struct pci_dev *dev)
{
return test_bit(PCI_DEV_ADDED, &dev->priv_flags);
}
#ifdef CONFIG_PCIEAER
#include <linux/aer.h>
#define AER_MAX_MULTI_ERR_DEVICES 5 /* Not likely to have more */
struct aer_err_info {
struct pci_dev *dev[AER_MAX_MULTI_ERR_DEVICES];
int error_dev_num;
unsigned int id:16;
unsigned int severity:2; /* 0:NONFATAL | 1:FATAL | 2:COR */
unsigned int __pad1:5;
unsigned int multi_error_valid:1;
unsigned int first_error:5;
unsigned int __pad2:2;
unsigned int tlp_header_valid:1;
unsigned int status; /* COR/UNCOR Error Status */
unsigned int mask; /* COR/UNCOR Error Mask */
struct pcie_tlp_log tlp; /* TLP Header */
};
int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info);
void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
#endif /* CONFIG_PCIEAER */
#ifdef CONFIG_PCIEPORTBUS
/* Cached RCEC Endpoint Association */
struct rcec_ea {
u8 nextbusn;
u8 lastbusn;
u32 bitmap;
};
#endif
#ifdef CONFIG_PCIE_DPC
void pci_save_dpc_state(struct pci_dev *dev);
void pci_restore_dpc_state(struct pci_dev *dev);
void pci_dpc_init(struct pci_dev *pdev);
void dpc_process_error(struct pci_dev *pdev);
pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
PCI: pciehp: Ignore Link Down/Up caused by DPC Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver. A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state. Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot. Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event. Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist(). The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes. Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag. If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default. pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform. The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete. This commit draws inspiration from previous attempts to synchronize DPC with pciehp: By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/ By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/ By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de Reported-by: Sinan Kaya <okaya@kernel.org> Reported-by: Ethan Zhao <haifeng.zhao@intel.com> Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>
2021-05-01 08:29:00 +00:00
bool pci_dpc_recovered(struct pci_dev *pdev);
#else
static inline void pci_save_dpc_state(struct pci_dev *dev) { }
static inline void pci_restore_dpc_state(struct pci_dev *dev) { }
static inline void pci_dpc_init(struct pci_dev *pdev) { }
PCI: pciehp: Ignore Link Down/Up caused by DPC Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver. A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state. Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot. Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event. Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist(). The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes. Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag. If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default. pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform. The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete. This commit draws inspiration from previous attempts to synchronize DPC with pciehp: By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/ By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/ By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de Reported-by: Sinan Kaya <okaya@kernel.org> Reported-by: Ethan Zhao <haifeng.zhao@intel.com> Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>
2021-05-01 08:29:00 +00:00
static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
#endif
#ifdef CONFIG_PCIEPORTBUS
void pci_rcec_init(struct pci_dev *dev);
void pci_rcec_exit(struct pci_dev *dev);
void pcie_link_rcec(struct pci_dev *rcec);
void pcie_walk_rcec(struct pci_dev *rcec,
int (*cb)(struct pci_dev *, void *),
void *userdata);
#else
static inline void pci_rcec_init(struct pci_dev *dev) { }
static inline void pci_rcec_exit(struct pci_dev *dev) { }
static inline void pcie_link_rcec(struct pci_dev *rcec) { }
static inline void pcie_walk_rcec(struct pci_dev *rcec,
int (*cb)(struct pci_dev *, void *),
void *userdata) { }
#endif
#ifdef CONFIG_PCI_ATS
/* Address Translation Service */
void pci_ats_init(struct pci_dev *dev);
void pci_restore_ats_state(struct pci_dev *dev);
#else
static inline void pci_ats_init(struct pci_dev *d) { }
static inline void pci_restore_ats_state(struct pci_dev *dev) { }
#endif /* CONFIG_PCI_ATS */
#ifdef CONFIG_PCI_PRI
void pci_pri_init(struct pci_dev *dev);
void pci_restore_pri_state(struct pci_dev *pdev);
#else
static inline void pci_pri_init(struct pci_dev *dev) { }
static inline void pci_restore_pri_state(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCI_PASID
void pci_pasid_init(struct pci_dev *dev);
void pci_restore_pasid_state(struct pci_dev *pdev);
#else
static inline void pci_pasid_init(struct pci_dev *dev) { }
static inline void pci_restore_pasid_state(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCI_IOV
int pci_iov_init(struct pci_dev *dev);
void pci_iov_release(struct pci_dev *dev);
void pci_iov_remove(struct pci_dev *dev);
void pci_iov_update_resource(struct pci_dev *dev, int resno);
resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
void pci_restore_iov_state(struct pci_dev *dev);
int pci_iov_bus_range(struct pci_bus *bus);
PCI/IOV: Add sysfs MSI-X vector assignment interface A typical cloud provider SR-IOV use case is to create many VFs for use by guest VMs. The VFs may not be assigned to a VM until a customer requests a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors proportional to the number of CPUs in the VM, but there is no standard way to change the number of MSI-X vectors supported by a VF. Some Mellanox ConnectX devices support dynamic assignment of MSI-X vectors to SR-IOV VFs. This can be done by the PF driver after VFs are enabled, and it can be done without affecting VFs that are already in use. The hardware supports a limited pool of MSI-X vectors that can be assigned to the PF or to individual VFs. This is device-specific behavior that requires support in the PF driver. Add a read-only "sriov_vf_total_msix" sysfs file for the PF and a writable "sriov_vf_msix_count" file for each VF. Management software may use these to learn how many MSI-X vectors are available and to dynamically assign them to VFs before the VFs are passed through to a VM. If the PF driver implements the ->sriov_get_vf_total_msix() callback, "sriov_vf_total_msix" contains the total number of MSI-X vectors available for distribution among VFs. If no driver is bound to the VF, writing "N" to "sriov_vf_msix_count" uses the PF driver ->sriov_set_msix_vec_count() callback to assign "N" MSI-X vectors to the VF. When a VF driver subsequently reads the MSI-X Message Control register, it will see the new Table Size "N". Link: https://lore.kernel.org/linux-pci/20210314124256.70253-2-leon@kernel.org Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-04-04 07:22:18 +00:00
extern const struct attribute_group sriov_pf_dev_attr_group;
extern const struct attribute_group sriov_vf_dev_attr_group;
#else
static inline int pci_iov_init(struct pci_dev *dev)
{
return -ENODEV;
}
static inline void pci_iov_release(struct pci_dev *dev) { }
static inline void pci_iov_remove(struct pci_dev *dev) { }
static inline void pci_restore_iov_state(struct pci_dev *dev) { }
static inline int pci_iov_bus_range(struct pci_bus *bus)
{
return 0;
}
#endif /* CONFIG_PCI_IOV */
#ifdef CONFIG_PCIE_PTM
void pci_ptm_init(struct pci_dev *dev);
void pci_save_ptm_state(struct pci_dev *dev);
void pci_restore_ptm_state(struct pci_dev *dev);
PCI/PTM: Add pci_suspend_ptm() and pci_resume_ptm() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2135902 Upstream Status: e8bdc5ea481638e0a4fd5639050d2b170417f493 commit e8bdc5ea481638e0a4fd5639050d2b170417f493 Author: Bjorn Helgaas <bhelgaas@google.com> Date: Fri Sep 9 15:25:00 2022 -0500 PCI/PTM: Add pci_suspend_ptm() and pci_resume_ptm() We disable PTM during suspend because that allows some Root Ports to enter lower-power PM states, which means we also need to disable PTM for all downstream devices. Add pci_suspend_ptm() and pci_resume_ptm() for this purpose. pci_enable_ptm() and pci_disable_ptm() are for drivers to use to enable or disable PTM. They use dev->ptm_enabled to keep track of whether PTM should be enabled. pci_suspend_ptm() and pci_resume_ptm() are PCI core-internal functions to temporarily disable PTM during suspend and (depending on dev->ptm_enabled) re-enable PTM during resume. Enable/disable/suspend/resume all use internal __pci_enable_ptm() and __pci_disable_ptm() functions that only update the PTM Control register. Outline: pci_enable_ptm(struct pci_dev *dev) { __pci_enable_ptm(dev); dev->ptm_enabled = 1; pci_ptm_info(dev); } pci_disable_ptm(struct pci_dev *dev) { if (dev->ptm_enabled) { __pci_disable_ptm(dev); dev->ptm_enabled = 0; } } pci_suspend_ptm(struct pci_dev *dev) { if (dev->ptm_enabled) __pci_disable_ptm(dev); } pci_resume_ptm(struct pci_dev *dev) { if (dev->ptm_enabled) __pci_enable_ptm(dev); } Nothing currently calls pci_resume_ptm(); the suspend path saves the PTM state before disabling PTM, so the PTM state restore in the resume path implicitly re-enables it. A future change will use pci_resume_ptm() to fix some problems with this approach. Link: https://lore.kernel.org/r/20220909202505.314195-5-helgaas@kernel.org Tested-by: Rajvi Jingar <rajvi.jingar@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2022-10-18 23:40:36 +00:00
void pci_suspend_ptm(struct pci_dev *dev);
void pci_resume_ptm(struct pci_dev *dev);
#else
static inline void pci_ptm_init(struct pci_dev *dev) { }
static inline void pci_save_ptm_state(struct pci_dev *dev) { }
static inline void pci_restore_ptm_state(struct pci_dev *dev) { }
PCI/PTM: Add pci_suspend_ptm() and pci_resume_ptm() Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2135902 Upstream Status: e8bdc5ea481638e0a4fd5639050d2b170417f493 commit e8bdc5ea481638e0a4fd5639050d2b170417f493 Author: Bjorn Helgaas <bhelgaas@google.com> Date: Fri Sep 9 15:25:00 2022 -0500 PCI/PTM: Add pci_suspend_ptm() and pci_resume_ptm() We disable PTM during suspend because that allows some Root Ports to enter lower-power PM states, which means we also need to disable PTM for all downstream devices. Add pci_suspend_ptm() and pci_resume_ptm() for this purpose. pci_enable_ptm() and pci_disable_ptm() are for drivers to use to enable or disable PTM. They use dev->ptm_enabled to keep track of whether PTM should be enabled. pci_suspend_ptm() and pci_resume_ptm() are PCI core-internal functions to temporarily disable PTM during suspend and (depending on dev->ptm_enabled) re-enable PTM during resume. Enable/disable/suspend/resume all use internal __pci_enable_ptm() and __pci_disable_ptm() functions that only update the PTM Control register. Outline: pci_enable_ptm(struct pci_dev *dev) { __pci_enable_ptm(dev); dev->ptm_enabled = 1; pci_ptm_info(dev); } pci_disable_ptm(struct pci_dev *dev) { if (dev->ptm_enabled) { __pci_disable_ptm(dev); dev->ptm_enabled = 0; } } pci_suspend_ptm(struct pci_dev *dev) { if (dev->ptm_enabled) __pci_disable_ptm(dev); } pci_resume_ptm(struct pci_dev *dev) { if (dev->ptm_enabled) __pci_enable_ptm(dev); } Nothing currently calls pci_resume_ptm(); the suspend path saves the PTM state before disabling PTM, so the PTM state restore in the resume path implicitly re-enables it. A future change will use pci_resume_ptm() to fix some problems with this approach. Link: https://lore.kernel.org/r/20220909202505.314195-5-helgaas@kernel.org Tested-by: Rajvi Jingar <rajvi.jingar@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2022-10-18 23:40:36 +00:00
static inline void pci_suspend_ptm(struct pci_dev *dev) { }
static inline void pci_resume_ptm(struct pci_dev *dev) { }
#endif
unsigned long pci_cardbus_resource_alignment(struct resource *);
static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
struct resource *res)
PCI SR-IOV: correct broken resource alignment calculations An SR-IOV capable device includes an SR-IOV PCIe capability which describes the Virtual Function (VF) BAR requirements. A typical SR-IOV device can support multiple VFs whose BARs must be in a contiguous region, effectively an array of VF BARs. The BAR reports the size requirement for a single VF. We calculate the full range needed by simply multiplying the VF BAR size with the number of possible VFs and create a resource spanning the full range. This all seems sane enough except it artificially inflates the alignment requirement for the VF BAR. The VF BAR need only be aligned to the size of a single BAR not the contiguous range of VF BARs. This can cause us to fail to allocate resources for the BAR despite the fact that we actually have enough space. This patch adds a thin PCI specific layer over the generic resource_alignment() function which is aware of the special nature of VF BARs and does sorting and allocation based on the smaller alignment requirement. I recognize that while resource_alignment is generic, it's basically a PCI helper. An alternative to this patch is to add PCI VF BAR specific information to struct resource. I opted for the extra layer rather than adding such PCI specific information to struct resource. This does have the slight downside that we don't cache the BAR size and re-read for each alignment query (happens a small handful of times during boot for each VF BAR). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Yu Zhao <yu.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-28 20:00:06 +00:00
{
#ifdef CONFIG_PCI_IOV
int resno = res - dev->resource;
if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
return pci_sriov_resource_alignment(dev, resno);
#endif
if (dev->class >> 8 == PCI_CLASS_BRIDGE_CARDBUS)
return pci_cardbus_resource_alignment(res);
PCI SR-IOV: correct broken resource alignment calculations An SR-IOV capable device includes an SR-IOV PCIe capability which describes the Virtual Function (VF) BAR requirements. A typical SR-IOV device can support multiple VFs whose BARs must be in a contiguous region, effectively an array of VF BARs. The BAR reports the size requirement for a single VF. We calculate the full range needed by simply multiplying the VF BAR size with the number of possible VFs and create a resource spanning the full range. This all seems sane enough except it artificially inflates the alignment requirement for the VF BAR. The VF BAR need only be aligned to the size of a single BAR not the contiguous range of VF BARs. This can cause us to fail to allocate resources for the BAR despite the fact that we actually have enough space. This patch adds a thin PCI specific layer over the generic resource_alignment() function which is aware of the special nature of VF BARs and does sorting and allocation based on the smaller alignment requirement. I recognize that while resource_alignment is generic, it's basically a PCI helper. An alternative to this patch is to add PCI VF BAR specific information to struct resource. I opted for the extra layer rather than adding such PCI specific information to struct resource. This does have the slight downside that we don't cache the BAR size and re-read for each alignment query (happens a small handful of times during boot for each VF BAR). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Yu Zhao <yu.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-28 20:00:06 +00:00
return resource_alignment(res);
}
void pci_acs_init(struct pci_dev *dev);
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
int pci_dev_specific_enable_acs(struct pci_dev *dev);
int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
int pcie_failed_link_retrain(struct pci_dev *dev);
#else
static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
u16 acs_flags)
{
return -ENOTTY;
}
static inline int pci_dev_specific_enable_acs(struct pci_dev *dev)
{
return -ENOTTY;
}
static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
{
return -ENOTTY;
}
static inline int pcie_failed_link_retrain(struct pci_dev *dev)
PCI: Work around PCIe link training failures JIRA: https://issues.redhat.com/browse/RHEL-2570 Upstream Status: a89c82249c3763780522f763dd2e615e2ea114de commit a89c82249c3763780522f763dd2e615e2ea114de Author: Maciej W. Rozycki <macro@orcam.me.uk> Date: Sun Jun 11 18:20:10 2023 +0100 PCI: Work around PCIe link training failures Attempt to handle cases such as with a downstream port of the ASMedia ASM2824 PCIe switch where link training never completes and the link continues switching between speeds indefinitely with the data link layer never reaching the active state. It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433, wired to a SiFive HiFive Unmatched board. In this setup the switches should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if necessary. Instead the link continues oscillating between the two speeds, at the rate of 34-35 times per second, with link training reported repeatedly active ~84% of the time. Limiting the target link speed to 2.5GT/s with the upstream ASM2824 device makes the two switches communicate correctly. Removing the speed restriction afterwards makes the two devices switch to 5.0GT/s then. Make use of these observations and detect the inability to train the link by checking for the Data Link Layer Link Active status bit being off while the Link Bandwidth Management Status indicating that hardware has changed the link speed or width in an attempt to correct unreliable link operation. Restrict the speed to 2.5GT/s then with the Target Link Speed field, request a retrain and wait 200ms for the data link to go up. If this is successful, lift the restriction, letting the devices negotiate a higher speed. Also check for a 2.5GT/s speed restriction the firmware may have already arranged and lift it too with ports of devices known to continue working afterwards (currently only ASM2824), that already report their data link being up. [bhelgaas: reorder and squash stubs from https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk to avoid adding stubs that do nothing] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/ Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68 Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-09-18 17:25:19 +00:00
{
return -ENOTTY;
PCI: Work around PCIe link training failures JIRA: https://issues.redhat.com/browse/RHEL-2570 Upstream Status: a89c82249c3763780522f763dd2e615e2ea114de commit a89c82249c3763780522f763dd2e615e2ea114de Author: Maciej W. Rozycki <macro@orcam.me.uk> Date: Sun Jun 11 18:20:10 2023 +0100 PCI: Work around PCIe link training failures Attempt to handle cases such as with a downstream port of the ASMedia ASM2824 PCIe switch where link training never completes and the link continues switching between speeds indefinitely with the data link layer never reaching the active state. It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433, wired to a SiFive HiFive Unmatched board. In this setup the switches should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if necessary. Instead the link continues oscillating between the two speeds, at the rate of 34-35 times per second, with link training reported repeatedly active ~84% of the time. Limiting the target link speed to 2.5GT/s with the upstream ASM2824 device makes the two switches communicate correctly. Removing the speed restriction afterwards makes the two devices switch to 5.0GT/s then. Make use of these observations and detect the inability to train the link by checking for the Data Link Layer Link Active status bit being off while the Link Bandwidth Management Status indicating that hardware has changed the link speed or width in an attempt to correct unreliable link operation. Restrict the speed to 2.5GT/s then with the Target Link Speed field, request a retrain and wait 200ms for the data link to go up. If this is successful, lift the restriction, letting the devices negotiate a higher speed. Also check for a 2.5GT/s speed restriction the firmware may have already arranged and lift it too with ports of devices known to continue working afterwards (currently only ASM2824), that already report their data link being up. [bhelgaas: reorder and squash stubs from https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk to avoid adding stubs that do nothing] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/ Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68 Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-09-18 17:25:19 +00:00
}
#endif
/* PCI error reporting and recovery */
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_subordinates)(struct pci_dev *pdev));
bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
Merge branch 'pci/enumeration' JIRA: https://issues.redhat.com/browse/RHEL-2570 Upstream Status: 1abb47390350a1bd5430390e296492e6248865b2 Conflict(s): Upstream had two patches that effectively did the same thing which obviously lead to conflicts: 3c0ec896a4b4 PCI/ASPM: Factor out waiting for link training to complete 9c7f136433d2 PCI/ASPM: Factor out pcie_wait_for_retrain() These were resolved upstream via merge commit 1abb47390350 "Merge branch 'pci/enumeration'". This patch does effectively the same thing as upstream. One way to detect the needed content is to run 'git blame' on the files and look for this patch's commit ID. The end result was also verified by running 'diff -u' on the files against upstream to assure that the content in question matched. commit 1abb47390350a1bd5430390e296492e6248865b2 Merge: 0f32114ea074 08e3ed12ca86 Author: Bjorn Helgaas <bhelgaas@google.com> Date: Mon Jun 26 12:59:56 2023 -0500 Merge branch 'pci/enumeration' - Add PCI_EXT_CAP_ID_PL_32GT define (Ben Dooks) - Propagate firmware node by calling device_set_node() for better modularity (Andy Shevchenko) - Discover Data Link Layer Link Active Reporting earlier so quirks can take advantage of it (Maciej W. Rozycki) - Use cached Data Link Layer Link Active Reporting capability in pciehp, powerpc/eeh, and mlx5 (Maciej W. Rozycki) - Run quirk for devices that require OS to clear Retrain Link earlier, so later quirks can rely on it (Maciej W. Rozycki) - Export pcie_retrain_link() for use outside ASPM (Maciej W. Rozycki) - Add Data Link Layer Link Active Reporting as another way for pcie_retrain_link() to determine the link is up (Maciej W. Rozycki) - Work around link training failures (especially on the ASMedia ASM2824 switch) by training first at 2.5GT/s and then attempting higher rates (Maciej W. Rozycki) * pci/enumeration: PCI: Add failed link recovery for device reset events PCI: Work around PCIe link training failures PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay() PCI: Add support for polling DLLLA to pcie_retrain_link() PCI: Export pcie_retrain_link() for use outside ASPM PCI: Export PCIe link retrain timeout PCI: Execute quirk_enable_clear_retrain_link() earlier PCI/ASPM: Factor out waiting for link training to complete PCI/ASPM: Avoid unnecessary pcie_link_state use PCI/ASPM: Use distinct local vars in pcie_retrain_link() net/mlx5: Rely on dev->link_active_reporting powerpc/eeh: Rely on dev->link_active_reporting PCI: pciehp: Rely on dev->link_active_reporting PCI: Initialize dev->link_active_reporting earlier PCI: of: Propagate firmware node by calling device_set_node() PCI: Add PCI_EXT_CAP_ID_PL_32GT define # Conflicts: # drivers/pci/pcie/aspm.c Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-09-18 17:25:20 +00:00
int pcie_retrain_link(struct pci_dev *pdev, bool use_lt);
/* ASPM-related functionality we need even without CONFIG_PCIEASPM */
void pci_save_ltr_state(struct pci_dev *dev);
void pci_restore_ltr_state(struct pci_dev *dev);
PCI/ASPM: Save L1 PM Substates Capability for suspend/resume JIRA: https://issues.redhat.com/browse/RHEL-33544 Upstream Status: 17423360a27ae58c1850f588bdd8013bbfcd250b commit 17423360a27ae58c1850f588bdd8013bbfcd250b Author: David E. Box <david.e.box@linux.intel.com> Date: Fri Feb 23 14:58:50 2024 -0600 PCI/ASPM: Save L1 PM Substates Capability for suspend/resume 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") restored the L1 PM Substates Capability after resume, which reduced power consumption by making the ASPM L1.x states work after resume. a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"") reverted 4ff116d0d5fd because resume failed on some systems, so power consumption after resume increased again. a7152be79b62 mentioned that we restore L1 PM substate configuration even though ASPM L1 may already be enabled. This is due the fact that the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state(). Save and restore the L1 PM Substates Capability, following PCIe r6.1, sec 5.5.4 more closely by: 1) Do not restore ASPM configuration in pci_restore_pcie_state() but do that after PCIe capability is restored in pci_restore_aspm_state() following PCIe r6.1, sec 5.5.4. 2) If BIOS reenables L1SS, particularly L1.2, we need to clear the enables in the right order, downstream before upstream. Defer restoring the L1SS config until we are at the downstream component. Then update the config for both ends of the link in the prescribed order. 3) Program ASPM L1 PM substate configuration before L1 enables. 4) Program ASPM L1 PM substate enables last, after rest of the fields in the capability are programmed. [bhelgaas: commit log, squash L1SS-related patches, do both LNKCTL restores in pci_restore_pcie_state()] Link: https://lore.kernel.org/r/20240128233212.1139663-3-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240128233212.1139663-4-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-5-helgaas@kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Co-developed-by: David E. Box <david.e.box@linux.intel.com> Reported-by: Koba Ko <koba.ko@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Tasev Nikola <tasev.stefanoska@skynet.be> # Asus UX305FA Cc: Mark Enriquez <enriquezmark36@gmail.com> Cc: Thomas Witt <kernel@witt.link> Cc: Werner Sembach <wse@tuxedocomputers.com> Cc: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-04-24 18:16:10 +00:00
void pci_configure_aspm_l1ss(struct pci_dev *dev);
void pci_save_aspm_l1ss_state(struct pci_dev *dev);
void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
#ifdef CONFIG_PCIEASPM
void pcie_aspm_init_link_state(struct pci_dev *pdev);
void pcie_aspm_exit_link_state(struct pci_dev *pdev);
PCI/ASPM: Fix deadlock when enabling ASPM JIRA: https://issues.redhat.com/browse/RHEL-26162 Upstream Status: 1e560864159d002b453da42bd2c13a1805515a20 commit 1e560864159d002b453da42bd2c13a1805515a20 Author: Johan Hovold <johan+linaro@kernel.org> Date: Tue Jan 30 11:02:43 2024 +0100 PCI/ASPM: Fix deadlock when enabling ASPM A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); *** DEADLOCK *** Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom] The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock. Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice. Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 6.7 Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-02-25 15:40:35 +00:00
void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked);
void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
void pci_configure_ltr(struct pci_dev *pdev);
void pci_bridge_reconfigure_ltr(struct pci_dev *pdev);
#else
static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
PCI/ASPM: Fix deadlock when enabling ASPM JIRA: https://issues.redhat.com/browse/RHEL-26162 Upstream Status: 1e560864159d002b453da42bd2c13a1805515a20 commit 1e560864159d002b453da42bd2c13a1805515a20 Author: Johan Hovold <johan+linaro@kernel.org> Date: Tue Jan 30 11:02:43 2024 +0100 PCI/ASPM: Fix deadlock when enabling ASPM A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); *** DEADLOCK *** Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom] The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock. Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice. Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 6.7 Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-02-25 15:40:35 +00:00
static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { }
static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
static inline void pci_configure_ltr(struct pci_dev *pdev) { }
static inline void pci_bridge_reconfigure_ltr(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCIE_ECRC
void pcie_set_ecrc_checking(struct pci_dev *dev);
void pcie_ecrc_get_policy(char *str);
#else
static inline void pcie_set_ecrc_checking(struct pci_dev *dev) { }
static inline void pcie_ecrc_get_policy(char *str) { }
#endif
PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller JIRA: https://issues.redhat.com/browse/RHEL-81906 Upstream Status: 665745f274870c921020f610e2c99a3b1613519b commit 665745f274870c921020f610e2c99a3b1613519b Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Date: Fri Oct 18 17:47:52 2024 +0300 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth notification"). An upcoming commit extends this driver building PCIe bandwidth controller on top of it. PCIe bandwidth notifications were first added in the commit e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification") but later had to be removed. The significant changes compared with the old bandwidth notification driver include: 1) Don't print the notifications into kernel log, just keep the Link Speed cached in struct pci_bus updated. While somewhat unfortunate, the log spam was the source of complaints that eventually lead to the removal of the bandwidth notifications driver (see the links below for further information). 2) Besides the Link Bandwidth Management Interrupt, also enable Link Autonomous Bandwidth Interrupt to cover the other source of bandwidth changes. 3) Handle Link Speed updates robustly. Refresh the cached Link Speed when enabling Bandwidth Notification Interrupts, and solve the race between Link Speed read and LBMS/LABS update in pcie_bwnotif_irq_thread(). 4) Use concurrency safe LNKCTL RMW operations. 5) The driver is now called PCIe bwctrl (bandwidth controller) instead of just bandwidth notifications because of increased scope and functionality within the driver. 6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain(). Provide LBMS counting API for it. 7) Tweaks to variable/functions names for consistency and length reasons. Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track PCIe Link Speed changes. [bhelgaas: This is based on previous work by Alexandru Gagniuc <mr.nuke.me@gmail.com>; see e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification")] Link: https://lore.kernel.org/r/20241018144755.7875-7-ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/ Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/ Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/ Suggested-by: Lukas Wunner <lukas@wunner.de> # Building bwctrl on top of bwnotif Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler: https://lore.kernel.org/r/20241115165717.15233-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2025-03-20 16:33:57 +00:00
#ifdef CONFIG_PCIEPORTBUS
void pcie_reset_lbms_count(struct pci_dev *port);
int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
#else
static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
{
return -EOPNOTSUPP;
}
#endif
struct pci_dev_reset_methods {
u16 vendor;
u16 device;
int (*reset)(struct pci_dev *dev, bool probe);
};
struct pci_reset_fn_method {
int (*reset_fn)(struct pci_dev *pdev, bool probe);
char *name;
};
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_reset(struct pci_dev *dev, bool probe);
#else
static inline int pci_dev_specific_reset(struct pci_dev *dev, bool probe)
{
return -ENOTTY;
}
#endif
#if defined(CONFIG_PCI_QUIRKS) && defined(CONFIG_ARM64)
int acpi_get_rc_resources(struct device *dev, const char *hid, u16 segment,
struct resource *res);
#else
static inline int acpi_get_rc_resources(struct device *dev, const char *hid,
u16 segment, struct resource *res)
{
return -ENODEV;
}
#endif
int pci_rebar_get_current_size(struct pci_dev *pdev, int bar);
int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size);
static inline u64 pci_rebar_size_to_bytes(int size)
{
return 1ULL << (size + 20);
}
struct device_node;
#ifdef CONFIG_OF
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
int of_get_pci_domain_nr(struct device_node *node);
int of_pci_get_max_link_speed(struct device_node *node);
u32 of_pci_get_slot_power_limit(struct device_node *node,
u8 *slot_power_limit_value,
u8 *slot_power_limit_scale);
bool of_pci_preserve_config(struct device_node *node);
PCI: Restrict device disabled status check to DT Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2184745 Upstream Status: 0d21e71a91debc87e88437a2cf9c6f34f8bf012f commit 0d21e71a91debc87e88437a2cf9c6f34f8bf012f Author: Rob Herring <robh@kernel.org> Date: Wed Apr 19 14:35:13 2023 -0500 PCI: Restrict device disabled status check to DT Commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") checked the firmware device status for both DT and ACPI devices. That caused a regression in some ACPI systems. The exact reason isn't clear. It's possibly a firmware bug. For now, at least, refactor the check to be for DT based systems only. Note that the original implementation leaked a refcount which is now correctly handled. [bhelgaas: Per ACPI r6.5, sec 6.3.7, for devices on an enumerable bus, _STA must return with bit[0] ("device is present") set] Link: https://lore.kernel.org/all/m2fs9lgndw.fsf@gmail.com/ Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/r/20230419193513.708818-1-robh@kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217317 Reported-by: Donald Hunter <donald.hunter@gmail.com> Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Donald Hunter <donald.hunter@gmail.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Binbin Zhou <zhoubinbin@loongson.cn> Cc: Liu Peibao <liupeibao@loongson.cn> Cc: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-04-20 22:48:17 +00:00
int pci_set_of_node(struct pci_dev *dev);
void pci_release_of_node(struct pci_dev *dev);
void pci_set_bus_of_node(struct pci_bus *bus);
void pci_release_bus_of_node(struct pci_bus *bus);
int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge);
#else
static inline int
of_pci_parse_bus_range(struct device_node *node, struct resource *res)
{
return -EINVAL;
}
static inline int
of_get_pci_domain_nr(struct device_node *node)
{
return -1;
}
static inline int
of_pci_get_max_link_speed(struct device_node *node)
{
return -EINVAL;
}
static inline u32
of_pci_get_slot_power_limit(struct device_node *node,
u8 *slot_power_limit_value,
u8 *slot_power_limit_scale)
{
if (slot_power_limit_value)
*slot_power_limit_value = 0;
if (slot_power_limit_scale)
*slot_power_limit_scale = 0;
return 0;
}
static inline bool of_pci_preserve_config(struct device_node *node)
{
return false;
}
PCI: Restrict device disabled status check to DT Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2184745 Upstream Status: 0d21e71a91debc87e88437a2cf9c6f34f8bf012f commit 0d21e71a91debc87e88437a2cf9c6f34f8bf012f Author: Rob Herring <robh@kernel.org> Date: Wed Apr 19 14:35:13 2023 -0500 PCI: Restrict device disabled status check to DT Commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") checked the firmware device status for both DT and ACPI devices. That caused a regression in some ACPI systems. The exact reason isn't clear. It's possibly a firmware bug. For now, at least, refactor the check to be for DT based systems only. Note that the original implementation leaked a refcount which is now correctly handled. [bhelgaas: Per ACPI r6.5, sec 6.3.7, for devices on an enumerable bus, _STA must return with bit[0] ("device is present") set] Link: https://lore.kernel.org/all/m2fs9lgndw.fsf@gmail.com/ Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/r/20230419193513.708818-1-robh@kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217317 Reported-by: Donald Hunter <donald.hunter@gmail.com> Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Donald Hunter <donald.hunter@gmail.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Binbin Zhou <zhoubinbin@loongson.cn> Cc: Liu Peibao <liupeibao@loongson.cn> Cc: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2023-04-20 22:48:17 +00:00
static inline int pci_set_of_node(struct pci_dev *dev) { return 0; }
static inline void pci_release_of_node(struct pci_dev *dev) { }
static inline void pci_set_bus_of_node(struct pci_bus *bus) { }
static inline void pci_release_bus_of_node(struct pci_bus *bus) { }
static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
{
return 0;
}
#endif /* CONFIG_OF */
PCI: Create device tree node for bridge JIRA: https://issues.redhat.com/browse/RHEL-50255 Upstream Status: 407d1a51921e9f28c1bcec647c2205925bd1fdab Conflict(s): Files: drivers/pci/Kconfig, drivers/pci/Makefile Not intending to enable new functionality, thus why the hunks pertaining to the above files were skipped, but keeping up with code changes to help general re-base dependency efforts. Iff PCI_DYNAMIC_OF_NODES is needed in the future, it can be handled when such occurs. commit 407d1a51921e9f28c1bcec647c2205925bd1fdab Author: Lizhi Hou <lizhi.hou@amd.com> Date: Tue Aug 15 10:19:57 2023 -0700 PCI: Create device tree node for bridge The PCI endpoint device such as Xilinx Alveo PCI card maps the register spaces from multiple hardware peripherals to its PCI BAR. Normally, the PCI core discovers devices and BARs using the PCI enumeration process. There is no infrastructure to discover the hardware peripherals that are present in a PCI device, and which can be accessed through the PCI BARs. Apparently, the device tree framework requires a device tree node for the PCI device. Thus, it can generate the device tree nodes for hardware peripherals underneath. Because PCI is self discoverable bus, there might not be a device tree node created for PCI devices. Furthermore, if the PCI device is hot pluggable, when it is plugged in, the device tree nodes for its parent bridges are required. Add support to generate device tree node for PCI bridges. Add an of_pci_make_dev_node() interface that can be used to create device tree node for PCI devices. Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on, the kernel will generate device tree nodes for PCI bridges unconditionally. Initially, add the basic properties for the dynamically generated device tree nodes which include #address-cells, #size-cells, device_type, compatible, ranges, reg. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/1692120000-46900-3-git-send-email-lizhi.hou@amd.com Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Myron Stowe <mstowe@redhat.com>
2024-08-15 01:30:53 +00:00
struct of_changeset;
#ifdef CONFIG_PCI_DYNAMIC_OF_NODES
void of_pci_make_dev_node(struct pci_dev *pdev);
void of_pci_remove_node(struct pci_dev *pdev);
int of_pci_add_properties(struct pci_dev *pdev, struct of_changeset *ocs,
struct device_node *np);
#else
static inline void of_pci_make_dev_node(struct pci_dev *pdev) { }
static inline void of_pci_remove_node(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCIEAER
void pci_no_aer(void);
void pci_aer_init(struct pci_dev *dev);
void pci_aer_exit(struct pci_dev *dev);
extern const struct attribute_group aer_stats_attr_group;
void pci_aer_clear_fatal_status(struct pci_dev *dev);
int pci_aer_clear_status(struct pci_dev *dev);
int pci_aer_raw_clear_status(struct pci_dev *dev);
void pci_save_aer_state(struct pci_dev *dev);
void pci_restore_aer_state(struct pci_dev *dev);
#else
static inline void pci_no_aer(void) { }
static inline void pci_aer_init(struct pci_dev *d) { }
static inline void pci_aer_exit(struct pci_dev *d) { }
static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
static inline void pci_save_aer_state(struct pci_dev *dev) { }
static inline void pci_restore_aer_state(struct pci_dev *dev) { }
#endif
#ifdef CONFIG_ACPI
bool pci_acpi_preserve_config(struct pci_host_bridge *bridge);
PCI/ACPI: Remove unnecessary struct hotplug_program_ops Move the ACPI-specific structs hpx_type0, hpx_type1, hpx_type2 and hpx_type3 to drivers/pci/pci-acpi.c as they are not used anywhere else. Then remove the struct hotplug_program_ops that has been shared between drivers/pci/probe.c and drivers/pci/pci-acpi.c from drivers/pci/pci.h as it is no longer needed. The struct hotplug_program_ops was added by 87fcf12e846a ("PCI/ACPI: Remove the need for 'struct hotplug_params'") and replaced previously used struct hotplug_params enabling the support for the _HPX Type 3 Setting Record that was added by f873c51a155a ("PCI/ACPI: Implement _HPX Type 3 Setting Record"). The new struct allowed for the static functions such program_hpx_type0(), program_hpx_type1(), etc., from the drivers/pci/probe.c to be called from the function pci_acpi_program_hp_params() in the drivers/pci/pci-acpi.c. Previously a programming of _HPX Type 0 was as follows: drivers/pci/probe.c: program_hpx_type0() ... pci_configure_device() hp_ops = { .program_type0 = program_hpx_type0, ... } pci_acpi_program_hp_params(&hp_ops) drivers/pci/pci-acpi.c: pci_acpi_program_hp_params(&hp_ops) acpi_run_hpx(hp_ops) decode_type0_hpx_record() hp_ops->program_type0 # program_hpx_type0() called via hp_ops After the ACPI-specific functions, structs, enums, etc., have been moved to drivers/pci/pci-acpi.c there is no need for the hotplug_program_ops as all of the _HPX Type 0, 1, 2 and 3 are directly accessible. Link: https://lore.kernel.org/r/20190827094951.10613-4-kw@linux.com Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-08-27 09:49:51 +00:00
int pci_acpi_program_hp_params(struct pci_dev *dev);
extern const struct attribute_group pci_dev_acpi_attr_group;
void pci_set_acpi_fwnode(struct pci_dev *dev);
int pci_dev_acpi_reset(struct pci_dev *dev, bool probe);
bool acpi_pci_power_manageable(struct pci_dev *dev);
bool acpi_pci_bridge_d3(struct pci_dev *dev);
int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state);
pci_power_t acpi_pci_get_power_state(struct pci_dev *dev);
void acpi_pci_refresh_power_state(struct pci_dev *dev);
int acpi_pci_wakeup(struct pci_dev *dev, bool enable);
bool acpi_pci_need_resume(struct pci_dev *dev);
pci_power_t acpi_pci_choose_state(struct pci_dev *pdev);
#else
static inline bool pci_acpi_preserve_config(struct pci_host_bridge *bridge)
{
return false;
}
static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe)
{
return -ENOTTY;
}
static inline void pci_set_acpi_fwnode(struct pci_dev *dev) { }
PCI/ACPI: Remove unnecessary struct hotplug_program_ops Move the ACPI-specific structs hpx_type0, hpx_type1, hpx_type2 and hpx_type3 to drivers/pci/pci-acpi.c as they are not used anywhere else. Then remove the struct hotplug_program_ops that has been shared between drivers/pci/probe.c and drivers/pci/pci-acpi.c from drivers/pci/pci.h as it is no longer needed. The struct hotplug_program_ops was added by 87fcf12e846a ("PCI/ACPI: Remove the need for 'struct hotplug_params'") and replaced previously used struct hotplug_params enabling the support for the _HPX Type 3 Setting Record that was added by f873c51a155a ("PCI/ACPI: Implement _HPX Type 3 Setting Record"). The new struct allowed for the static functions such program_hpx_type0(), program_hpx_type1(), etc., from the drivers/pci/probe.c to be called from the function pci_acpi_program_hp_params() in the drivers/pci/pci-acpi.c. Previously a programming of _HPX Type 0 was as follows: drivers/pci/probe.c: program_hpx_type0() ... pci_configure_device() hp_ops = { .program_type0 = program_hpx_type0, ... } pci_acpi_program_hp_params(&hp_ops) drivers/pci/pci-acpi.c: pci_acpi_program_hp_params(&hp_ops) acpi_run_hpx(hp_ops) decode_type0_hpx_record() hp_ops->program_type0 # program_hpx_type0() called via hp_ops After the ACPI-specific functions, structs, enums, etc., have been moved to drivers/pci/pci-acpi.c there is no need for the hotplug_program_ops as all of the _HPX Type 0, 1, 2 and 3 are directly accessible. Link: https://lore.kernel.org/r/20190827094951.10613-4-kw@linux.com Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-08-27 09:49:51 +00:00
static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
{
return -ENODEV;
}
static inline bool acpi_pci_power_manageable(struct pci_dev *dev)
{
return false;
}
static inline bool acpi_pci_bridge_d3(struct pci_dev *dev)
{
return false;
}
static inline int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
{
return -ENODEV;
}
static inline pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
{
return PCI_UNKNOWN;
}
static inline void acpi_pci_refresh_power_state(struct pci_dev *dev) { }
static inline int acpi_pci_wakeup(struct pci_dev *dev, bool enable)
{
return -ENODEV;
}
static inline bool acpi_pci_need_resume(struct pci_dev *dev)
{
return false;
}
static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
{
return PCI_POWER_ERROR;
}
#endif
#ifdef CONFIG_PCIEASPM
extern const struct attribute_group aspm_ctrl_attr_group;
#endif
extern const struct attribute_group pci_dev_reset_method_attr_group;
#ifdef CONFIG_X86_INTEL_MID
bool pci_use_mid_pm(void);
int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state);
pci_power_t mid_pci_get_power_state(struct pci_dev *pdev);
#else
static inline bool pci_use_mid_pm(void)
{
return false;
}
static inline int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state)
{
return -ENODEV;
}
static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
{
return PCI_UNKNOWN;
}
#endif
int pcim_intx(struct pci_dev *dev, int enable);
int pcim_request_region_exclusive(struct pci_dev *pdev, int bar,
const char *name);
void pcim_release_region(struct pci_dev *pdev, int bar);
/*
* Config Address for PCI Configuration Mechanism #1
*
* See PCI Local Bus Specification, Revision 3.0,
* Section 3.2.2.3.2, Figure 3-2, p. 50.
*/
#define PCI_CONF1_BUS_SHIFT 16 /* Bus number */
#define PCI_CONF1_DEV_SHIFT 11 /* Device number */
#define PCI_CONF1_FUNC_SHIFT 8 /* Function number */
#define PCI_CONF1_BUS_MASK 0xff
#define PCI_CONF1_DEV_MASK 0x1f
#define PCI_CONF1_FUNC_MASK 0x7
#define PCI_CONF1_REG_MASK 0xfc /* Limit aligned offset to a maximum of 256B */
#define PCI_CONF1_ENABLE BIT(31)
#define PCI_CONF1_BUS(x) (((x) & PCI_CONF1_BUS_MASK) << PCI_CONF1_BUS_SHIFT)
#define PCI_CONF1_DEV(x) (((x) & PCI_CONF1_DEV_MASK) << PCI_CONF1_DEV_SHIFT)
#define PCI_CONF1_FUNC(x) (((x) & PCI_CONF1_FUNC_MASK) << PCI_CONF1_FUNC_SHIFT)
#define PCI_CONF1_REG(x) ((x) & PCI_CONF1_REG_MASK)
#define PCI_CONF1_ADDRESS(bus, dev, func, reg) \
(PCI_CONF1_ENABLE | \
PCI_CONF1_BUS(bus) | \
PCI_CONF1_DEV(dev) | \
PCI_CONF1_FUNC(func) | \
PCI_CONF1_REG(reg))
/*
* Extension of PCI Config Address for accessing extended PCIe registers
*
* No standardized specification, but used on lot of non-ECAM-compliant ARM SoCs
* or on AMD Barcelona and new CPUs. Reserved bits [27:24] of PCI Config Address
* are used for specifying additional 4 high bits of PCI Express register.
*/
#define PCI_CONF1_EXT_REG_SHIFT 16
#define PCI_CONF1_EXT_REG_MASK 0xf00
#define PCI_CONF1_EXT_REG(x) (((x) & PCI_CONF1_EXT_REG_MASK) << PCI_CONF1_EXT_REG_SHIFT)
#define PCI_CONF1_EXT_ADDRESS(bus, dev, func, reg) \
(PCI_CONF1_ADDRESS(bus, dev, func, reg) | \
PCI_CONF1_EXT_REG(reg))
#endif /* DRIVERS_PCI_H */