Commit Graph

62 Commits

Author SHA1 Message Date
David Arcari ad113b28ee cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU
JIRA: https://issues.redhat.com/browse/RHEL-22704

commit b3fce429a1e030b50c1c91351d69b8667eef627b
Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Date:   Wed Nov 27 16:22:46 2024 -0800

    cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU

    Commit

      5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU")

    adds functionality that architectures can use to optionally allocate and
    build cacheinfo early during boot. Commit

      6539cffa9495 ("cacheinfo: Add arch specific early level initializer")

    lets secondary CPUs correct (and reallocate memory) cacheinfo data if
    needed.

    If the early build functionality is not used and cacheinfo does not need
    correction, memory for cacheinfo is never allocated. x86 does not use
    the early build functionality. Consequently, during the cacheinfo CPU
    hotplug callback, last_level_cache_is_valid() attempts to dereference
    a NULL pointer:

      BUG: kernel NULL pointer dereference, address: 0000000000000100
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not present page
      PGD 0 P4D 0
      Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 0 PID: 19 Comm: cpuhp/0 Not tainted 6.4.0-rc2 #1
      RIP: 0010: last_level_cache_is_valid+0x95/0xe0a

    Allocate memory for cacheinfo during the cacheinfo CPU hotplug callback
    if not done earlier.

    Moreover, before determining the validity of the last-level cache info,
    ensure that it has been allocated. Simply checking for non-zero
    cache_leaves() is not sufficient, as some architectures (e.g., Intel
    processors) have non-zero cache_leaves() before allocation.

    Dereferencing NULL cacheinfo can occur in update_per_cpu_data_slice_size().
    This function iterates over all online CPUs. However, a CPU may have come
    online recently, but its cacheinfo may not have been allocated yet.
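
The allocation check described above can be illustrated with a small Python model. This is only a sketch: the kernel code is C, and the `CacheLeaf` and `CpuCacheInfo` classes, the `has_attributes` field, and the function bodies are assumptions made for illustration, not the upstream implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheLeaf:
    has_attributes: bool = False  # stands in for "attributes/fw_token are set"


@dataclass
class CpuCacheInfo:
    num_leaves: int = 0                       # may be non-zero before allocation
    leaves: Optional[List[CacheLeaf]] = None  # None models unallocated memory


def last_level_cache_is_valid(ci: CpuCacheInfo) -> bool:
    # A non-zero leaf count alone proves nothing: check the allocation first.
    if ci.num_leaves == 0 or ci.leaves is None:
        return False
    return ci.leaves[ci.num_leaves - 1].has_attributes


def cacheinfo_cpu_online(ci: CpuCacheInfo) -> bool:
    # Allocate during the hotplug callback if no earlier stage did so.
    if ci.leaves is None:
        ci.leaves = [CacheLeaf() for _ in range(ci.num_leaves)]
    return last_level_cache_is_valid(ci)
```

In this model, a CPU with four leaves and no allocation reports an invalid last-level cache instead of faulting, and the online callback allocates the leaves on demand.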

    While here, remove an unnecessary indentation in allocate_cache_info().

      [ bp: Massage. ]

    Fixes: 6539cffa9495 ("cacheinfo: Add arch specific early level initializer")
    Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Radu Rendec <rrendec@redhat.com>
    Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
    Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
    Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
    Cc: stable@vger.kernel.org # 6.3+
    Link: https://lore.kernel.org/r/20241128002247.26726-2-ricardo.neri-calderon@linux.intel.com

Signed-off-by: David Arcari <darcari@redhat.com>
2024-12-17 09:06:50 -05:00
Chris von Recklinghausen 8bfb38d72a mm and cache_info: remove unnecessary CPU cache info update
JIRA: https://issues.redhat.com/browse/RHEL-20141

commit 5cec4eb7fad6fb1e9a3dd8403b558d1eff7490ff
Author: Huang Ying <ying.huang@intel.com>
Date:   Fri Jan 26 16:19:44 2024 +0800

    mm and cache_info: remove unnecessary CPU cache info update

    For each CPU hotplug event, we will update per-CPU data slice size and
    corresponding PCP configuration for every online CPU to make the
    implementation simple.  But, Kyle reported that this takes tens of seconds
    during boot on a machine with 34 zones and 3840 CPUs.

    So, in this patch, for each CPU hotplug event, we only update per-CPU data
    slice size and corresponding PCP configuration for the CPUs that share
    caches with the hotplugged CPU.  With the patch, the system boot time
    is reduced by 67 seconds on the machine.
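
As a rough illustration of the narrowed update, the Python sketch below selects only the CPUs that share a last-level cache with the hotplugged CPU. The function name, the dictionary-based `llc_shared_cpu_map`, and the example topology are hypothetical; the kernel works on cpumasks in C.

```python
from typing import Dict, List, Set


def cpus_to_update(hotplugged_cpu: int,
                   llc_shared_cpu_map: Dict[int, Set[int]],
                   online_cpus: Set[int]) -> List[int]:
    """Return the online CPUs whose slice size and PCP config need a refresh."""
    sharers = llc_shared_cpu_map.get(hotplugged_cpu, set())
    # Only CPUs that share a cache with the hotplugged CPU are affected.
    return sorted(cpu for cpu in online_cpus
                  if cpu in sharers or cpu == hotplugged_cpu)


def example_topology() -> Dict[int, Set[int]]:
    # Assumed layout: two LLC domains of four CPUs each.
    groups = [{0, 1, 2, 3}, {4, 5, 6, 7}]
    return {cpu: group for group in groups for cpu in group}
```

With this assumed layout, a hotplug event on CPU 1 touches four CPUs rather than all eight, which is the source of the saving on very large machines.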

    Link: https://lkml.kernel.org/r/20240126081944.414520-1-ying.huang@intel.com
    Fixes: 362d37a106dd ("mm, pcp: reduce lock contention for draining high-order pages")
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Originally-by: Kyle Meyer <kyle.meyer@hpe.com>
    Reported-and-tested-by: Kyle Meyer <kyle.meyer@hpe.com>
    Cc: Sudeep Holla <sudeep.holla@arm.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-06-07 13:14:13 -04:00
Chris von Recklinghausen 0ae67db50a mm, pcp: reduce lock contention for draining high-order pages
JIRA: https://issues.redhat.com/browse/RHEL-20141

commit 362d37a106dd3f6431b2fdd91d9208b0d023b50d
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Oct 16 13:29:56 2023 +0800

    mm, pcp: reduce lock contention for draining high-order pages

    Since commit f26b3fa04611 ("mm/page_alloc: limit number of high-order pages
    on PCP during bulk free"), the PCP (Per-CPU Pageset) is drained when it
    is mostly used for freeing high-order pages, to improve the reuse of
    cache-hot pages between the CPUs that allocate pages and those that free them.

    On a system with a small per-CPU data cache slice, pages shouldn't be cached
    before draining, so that they stay cache-hot.  But on a system with a large
    per-CPU data cache slice, some pages can be cached before draining to
    reduce zone lock contention.

    So, in this patch, instead of draining without any caching, "pcp->batch"
    pages will be cached in PCP before draining if the size of the per-CPU
    data cache slice is more than "3 * batch".

    In theory, if the size of per-CPU data cache slice is more than "2 *
    batch", we can reuse cache-hot pages between CPUs.  But considering the
    other usage of cache (code, other data accessing, etc.), "3 * batch" is
    used.

    Note: "3 * batch" is chosen to make sure the optimization works on recent
    x86_64 server CPUs.  If you want to increase it, please check whether it
    breaks the optimization.
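
The threshold rule can be sketched as follows. This is a simplified Python model, not the kernel code: the function name, the 4 KiB page size, and the choice to express the slice size in bytes are assumptions. Only the "3 * batch" comparison and the "pcp->batch" pages kept come from the text above.

```python
def pages_kept_before_drain(slice_size_bytes: int, batch: int,
                            page_size: int = 4096) -> int:
    """Pages left cached in the PCP when a high-order free triggers a drain."""
    slice_pages = slice_size_bytes // page_size
    # Keep "batch" pages only if the per-CPU data cache slice is larger
    # than 3 * batch pages; otherwise drain without caching anything.
    if slice_pages > 3 * batch:
        return batch
    return 0
```

For example, with a batch of 63 pages the threshold is 189 pages, so a 1 MiB slice (256 pages) keeps 63 pages cached while a 512 KiB slice (128 pages) keeps none.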

    On a 2-socket Intel server with 128 logical CPUs, with the patch, the
    network bandwidth of the UNIX (AF_UNIX) test case of the lmbench test suite
    with 16-pair processes increases by 70.5%.  The cycles% of the spinlock
    contention (mostly for zone lock) decreases from 46.1% to 21.3%.  The
    number of PCP drains for freeing high-order pages (free_high) decreases
    by 89.9%.  The cache miss rate stays at 0.2%.

    Link: https://lkml.kernel.org/r/20231016053002.756205-4-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Sudeep Holla <sudeep.holla@arm.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Johannes Weiner <jweiner@redhat.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Arjan van de Ven <arjan@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-06-07 13:14:12 -04:00
Chris von Recklinghausen 57f22f324e cacheinfo: calculate size of per-CPU data cache slice
JIRA: https://issues.redhat.com/browse/RHEL-20141

commit 94a3bfe4073cd88b05f7fb201ea7bf9dfa2cf5d5
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Oct 16 13:29:55 2023 +0800

    cacheinfo: calculate size of per-CPU data cache slice

    This can be used to estimate the size of the data cache slice that can be
    used by one CPU under ideal circumstances.  Both DATA caches and UNIFIED
    caches are used in calculation.  So, the users need to consider the impact
    of the code cache usage.

    Because the cache inclusive/non-inclusive information isn't available now,
    we just use the size of the per-CPU slice of LLC to make the result more
    predictable across architectures.  This may be improved when more cache
    information is available in the future.

    A brute-force algorithm that iterates over all online CPUs is used to avoid
    allocating an extra cpumask, especially in the offline callback.
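
A minimal sketch of the estimate, assuming the slice is the LLC size divided evenly among the CPUs that share it. The function name, the string cache types, and the even split are assumptions for illustration; the commit message only states that DATA and UNIFIED caches count and that the per-CPU slice of the LLC is used.

```python
from typing import Iterable


def per_cpu_data_slice_size(llc_type: str, llc_size: int,
                            llc_shared_cpus: Iterable[int]) -> int:
    """Estimate the data cache bytes one CPU can use under ideal conditions."""
    # Instruction-only caches do not hold data, so they contribute nothing.
    if llc_type not in ("data", "unified"):
        return 0
    nr_shared = len(list(llc_shared_cpus))
    if nr_shared == 0:
        return 0
    return llc_size // nr_shared
```

Under these assumptions, a 32 MiB unified LLC shared by 16 CPUs gives each CPU a 2 MiB slice.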

    Link: https://lkml.kernel.org/r/20231016053002.756205-3-ying.huang@intel.com
    Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Sudeep Holla <sudeep.holla@arm.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Johannes Weiner <jweiner@redhat.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Arjan van de Ven <arjan@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com>
2024-06-07 13:14:12 -04:00
Mark Langsdorf 0060df544c drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug
JIRA: https://issues.redhat.com/browse/RHEL-1023

commit c26fabe73330d983c7ce822c6b6ec0879b4da61f
Author: K Prateek Nayak <kprateek.nayak@amd.com>
Date: Wed, 31 May 2023 20:36:47 +0000

Until commit 5c2712387d48 ("cacheinfo: Fix LLC is not exported through
sysfs"), cacheinfo called populate_cache_leaves() for a CPU coming online,
which let the arch specific functions handle (at least on x86)
populating the shared_cpu_map. However, with the changes in the
aforementioned commit, populate_cache_leaves() is not called when a CPU
comes online as a result of hotplug, since last_level_cache_is_valid()
returns true as the cacheinfo data is not discarded. The CPU coming
online is not present in shared_cpu_map; however, it will not be added
since the cpu_cacheinfo->cpu_map_populated flag is set (it is set in
populate_cache_leaves() when cacheinfo is first populated for x86).

This can lead to inconsistencies in the shared_cpu_map when an offlined
CPU comes online again. The example below depicts the inconsistency in the
shared_cpu_list in cacheinfo when CPU8 is offlined and onlined again on
a 3rd Generation EPYC processor:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 > /sys/devices/system/cpu/cpu8/online
  # echo 1 > /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8

  # cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
    136

  # cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
    9-15,136-143

Clear the flag when cache_shared_cpu_map_remove() removes the CPU from
shared_cpu_map during CPU hotplug. This will allow
cache_shared_cpu_map_setup() to add the CPU coming back online to
the shared_cpu_map. Set the flag again when the shared_cpu_map is set up.
The following are the results of performing the same test as described
above with the changes:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 > /sys/devices/system/cpu/cpu8/online
  # echo 1 > /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
    8,136

  # cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
    8-15,136-143
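
The effect of the cpu_map_populated flag can be reproduced with a toy Python model of two SMT siblings sharing one cache. It is a sketch under simplifying assumptions (one cache per CPU, sets instead of cpumasks, hypothetical `CpuInfo` class and `clear_flag` switch), not the kernel implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CpuInfo:
    cache_id: int
    shared_cpu_map: Set[int] = field(default_factory=set)
    cpu_map_populated: bool = False


def shared_cpu_map_setup(cpu: int, infos: Dict[int, CpuInfo], online: Set[int]) -> None:
    me = infos[cpu]
    if me.cpu_map_populated:
        return  # the map is treated as final, so nothing is added
    for sib in online:
        if infos[sib].cache_id == me.cache_id:
            me.shared_cpu_map.add(sib)
            infos[sib].shared_cpu_map.add(cpu)
    me.cpu_map_populated = True


def shared_cpu_map_remove(cpu: int, infos: Dict[int, CpuInfo], clear_flag: bool) -> None:
    me = infos[cpu]
    for sib in sorted(me.shared_cpu_map - {cpu}):
        infos[sib].shared_cpu_map.discard(cpu)
        me.shared_cpu_map.discard(sib)
    if clear_flag:  # models the behavior with the fix applied
        me.cpu_map_populated = False


def hotplug_cycle(clear_flag: bool):
    infos = {8: CpuInfo(cache_id=0), 136: CpuInfo(cache_id=0)}
    shared_cpu_map_setup(8, infos, {8})
    shared_cpu_map_setup(136, infos, {8, 136})
    shared_cpu_map_remove(8, infos, clear_flag)   # CPU8 goes offline
    shared_cpu_map_setup(8, infos, {8, 136})      # CPU8 comes back online
    return sorted(infos[8].shared_cpu_map), sorted(infos[136].shared_cpu_map)
```

Calling `hotplug_cycle(False)` leaves each sibling alone in its own map, as in the first set of results, while `hotplug_cycle(True)` restores both CPUs in both maps.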

Fixes: 5c2712387d48 ("cacheinfo: Fix LLC is not exported through sysfs")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-3-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-11-01 11:12:34 -05:00
Mark Langsdorf b79939d1e5 drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
JIRA: https://issues.redhat.com/browse/RHEL-1023

commit 126310c9f669c9a8c875a3e5c2292299ca90225d
Author: K Prateek Nayak <kprateek.nayak@amd.com>
Date: Wed, 31 May 2023 20:36:46 +0000

While building the shared_cpu_map, check if the cache level and cache
type match. On certain systems that build the cache topology based on
the instance ID, there are cases where the same ID may repeat across
multiple cache levels, leading to an inaccurate topology.

In the event of CPU offlining, cache_shared_cpu_map_remove() does not
check whether the IDs being compared belong to the same level. As a
result, when the same IDs repeat across different cache levels, the CPU
going offline is not removed from all the shared_cpu_maps.

Below is the output of the cache topology of CPU8 and its SMT sibling after
CPU8 is offlined on a dual socket 3rd Generation AMD EPYC processor
(2 x 64C/128T) running kernel release v6.3:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 > /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143

CPU8 is removed from index0 (L1i) but remains in the shared_cpu_list of
index1 (L1d) and index2 (L2). Since L1i, L1d, and L2 are shared by the
SMT siblings, and they have the same cache instance ID, CPU8 is only
removed from the first index with a matching ID, which is index0 (L1i) in
this case. With this fix, the results are as expected when performing
the same experiment on the same system:

  # for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
    /sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143

  # echo 0 > /sys/devices/system/cpu/cpu8/online

  # for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
    /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 136
    /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143

When rebuilding the topology, the same problem appears, as
cache_shared_cpu_map_setup() implements similar logic. Consider the
same 3rd Generation EPYC processor: CPUs in Core 1, which share the L1
and L2 caches, have 1 as their L1 and L2 instance ID. For all the CPUs on
the second chiplet, the L3 ID is also 1, leading to the grouping of CPUs from
Core 1 (1, 17) and the entire second chiplet (8-15, 24-31) as CPUs
sharing one cache domain. This went undetected since, until kernel release
v6.3-rc5, x86 processors depended on the arch specific
populate_cache_leaves() method to repopulate the shared_cpu_map when a
CPU came back online.
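
The offlining problem can be modelled in a few lines of Python. This is an illustrative sketch, not the kernel code: the `Leaf` class, the `check_level_and_type` switch, and the shared instance ID of 4 are assumptions; the loop structure (stop at the first sibling leaf that matches) follows the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Leaf:
    level: int
    cache_type: str
    cache_id: int
    shared: Set[int]


def leaves_are_shared(a: Leaf, b: Leaf, check_level_and_type: bool) -> bool:
    if check_level_and_type and (a.level != b.level or a.cache_type != b.cache_type):
        return False
    return a.cache_id == b.cache_id


def remove_cpu(cpu: int, leaves_by_cpu: Dict[int, List[Leaf]],
               check_level_and_type: bool) -> None:
    for this_leaf in leaves_by_cpu[cpu]:
        for sib in sorted(this_leaf.shared - {cpu}):
            for sib_leaf in leaves_by_cpu[sib]:
                if leaves_are_shared(sib_leaf, this_leaf, check_level_and_type):
                    sib_leaf.shared.discard(cpu)
                    this_leaf.shared.discard(sib)
                    break  # only the first matching sibling leaf is updated


def sibling_view_after_offline(check_level_and_type: bool) -> List[List[int]]:
    def make() -> List[Leaf]:
        # L1i, L1d and L2 all carry the same instance ID (assumed to be 4).
        return [Leaf(1, "instruction", 4, {8, 136}),
                Leaf(1, "data", 4, {8, 136}),
                Leaf(2, "unified", 4, {8, 136})]
    leaves_by_cpu = {8: make(), 136: make()}
    remove_cpu(8, leaves_by_cpu, check_level_and_type)
    return [sorted(leaf.shared) for leaf in leaves_by_cpu[136]]
```

Without the level and type check, CPU136 still lists CPU8 in its second and third leaves; with the check, CPU8 is removed from all three.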

Fixes: 198102c9103f ("cacheinfo: Fix shared_cpu_map to handle shared caches at different levels")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-2-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-11-01 11:12:34 -05:00
Mark Langsdorf 4fe19e00d8 cacheinfo: Adjust includes to remove of_device.h
JIRA: https://issues.redhat.com/browse/RHEL-1023

commit b9581552b0b94586fa7296613a5d8a4a63e801be
Author: Rob Herring <robh@kernel.org>
Date: Thu, 13 Apr 2023 17:46:34 +0000

Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
implicitly including other includes, and is no longer needed. Update the
includes to use of.h instead of of_device.h.

Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230329-dt-cpu-header-cleanups-v1-10-581e2605fe47@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-11-01 11:12:34 -05:00
Mark Langsdorf d003f011de cacheinfo: Fix shared_cpu_map to handle shared caches at different levels
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2178302

commit 198102c9103fc78d8478495971947af77edb05c1
Author: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Date: Tue, 17 Jan 2023 10:51:33 +0000

The cacheinfo sets up the shared_cpu_map by checking whether the caches
with the same index are shared between CPUs. However, this will trigger
slab-out-of-bounds access if the CPUs do not have the same cache hierarchy.
Another problem is the mismatched shared_cpu_map when the shared cache does
not have the same index between CPUs.

CPU0	I	D	L3
index	0	1	2	x
	^	^	^	^
index	0	1	2	3
CPU1	I	D	L2	L3

This patch checks each cache against all caches on the other CPUs to
determine whether it is shared.
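
The difference between comparing by index and comparing against every leaf can be sketched as below. The tuple representation `(level, type, token)` and the token strings are hypothetical; the two hierarchies mirror the CPU0/CPU1 diagram above.

```python
from typing import List, Tuple

CacheLeaf = Tuple[int, str, str]  # (level, type, identifying token)


def shared_leaf_pairs(leaves_a: List[CacheLeaf],
                      leaves_b: List[CacheLeaf]) -> List[Tuple[int, int]]:
    """Return (index_a, index_b) pairs that describe the same physical cache."""
    # Every leaf of A is compared with every leaf of B, so the lists may
    # differ in length and a shared cache may sit at different indices.
    return [(i, j)
            for i, a in enumerate(leaves_a)
            for j, b in enumerate(leaves_b)
            if a == b]


CPU0 = [(1, "I", "cpu0-l1i"), (1, "D", "cpu0-l1d"), (3, "U", "l3")]
CPU1 = [(1, "I", "cpu1-l1i"), (1, "D", "cpu1-l1d"), (2, "U", "cpu1-l2"), (3, "U", "l3")]
```

Here `shared_leaf_pairs(CPU0, CPU1)` pairs index 2 on CPU0 with index 3 on CPU1, which a same-index comparison would miss, and it never reads past the end of the shorter list.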

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Link: https://lore.kernel.org/r/20230117105133.4445-2-yongxuan.wang@sifive.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2023-06-08 12:33:12 -04:00
Radu Rendec 408e8b893f cacheinfo: Add use_arch[|_cache]_info field/function
Bugzilla: https://bugzilla.redhat.com/2180619

commit ef9f643a9f8b62bcbcc51f0e0af8599adc2e17ed
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Fri Apr 14 10:14:52 2023 +0200

    cacheinfo: Add use_arch[|_cache]_info field/function

    The cache information can be extracted from either a Device
    Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
    for arm64).

    The clidr_el1 register is used only if DT/ACPI information is not
    available. It does not state how caches are shared among CPUs.

    Add a use_arch_cache_info field/function to identify when the
    DT/ACPI doesn't provide cache information. Use this information
    to assume L1 caches are private and L2 and higher are shared among
    all CPUs.

    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Link: https://lore.kernel.org/r/20230414081453.244787-5-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec a9577a3816 cacheinfo: Check cache properties are present in DT
Bugzilla: https://bugzilla.redhat.com/2180619

commit cde0fbff07eff7e4e0e85fa053fe19a24c86b1e0
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Fri Apr 14 10:14:50 2023 +0200

    cacheinfo: Check cache properties are present in DT

    If a Device Tree (DT) is used, the presence of cache properties is
    assumed. The case where none are found is not handled. For arm64 platforms,
    cache information can be fetched from the clidr_el1 register.
    Checking whether cache information is available in the DT
    allows switching to clidr_el1.

    init_of_cache_level()
    \-of_count_cache_leaves()
    will assume there are 2 cache leaves (L1 data/instruction caches), which
    can be different from clidr_el1 information.

    cache_setup_of_node() tries to read cache properties in the DT.
    If there are none, this is considered a success. Knowing no
    information was available would allow switching to clidr_el1.

    Fixes: de0df442ee49 ("cacheinfo: Check 'cache-unified' property to count cache leaves")
    Reported-by: Alexandre Ghiti <alexghiti@rivosinc.com>
    Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20230414081453.244787-3-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec 5d00293d35 cacheinfo: Check sib_leaf in cache_leaves_are_shared()
Bugzilla: https://bugzilla.redhat.com/2180619

commit 7a306e3eabf2b2fd8cffa69b87b32dbf814d79ce
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Fri Apr 14 10:14:49 2023 +0200

    cacheinfo: Check sib_leaf in cache_leaves_are_shared()

    If there is no ACPI/DT information, it is assumed that L1 caches
    are private and L2 (and higher) caches are shared. A cache is
    'shared' between two CPUs if it is accessible from these two
    CPUs.

    Each CPU owns a representation (i.e. has a dedicated cacheinfo struct)
    of the caches it has access to. cache_leaves_are_shared() tries to
    identify whether two representations are designating the same actual
    cache.

    In cache_leaves_are_shared(), if 'this_leaf' is a L2 cache (or higher)
    and 'sib_leaf' is a L1 cache, the caches are detected as shared as
    only this_leaf's cache level is checked.
    This leads to setting sib_leaf as being shared with another CPU,
    which is incorrect as this is a L1 cache.

    Check 'sib_leaf->level'. Also update the comment as the function is
    called when populating 'shared_cpu_map'.
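
A minimal Python sketch of the fallback rule used when no ACPI/DT information exists. The function name and the `check_sibling` switch (which models the behavior before and after the fix) are hypothetical; the rule itself, L1 private and L2 or higher shared, comes from the text above.

```python
def leaves_are_shared_without_firmware(this_level: int, sib_level: int,
                                       check_sibling: bool = True) -> bool:
    """Decide sharing from cache levels alone (no DT/ACPI data available)."""
    if check_sibling:
        # Both leaves must be L2 or higher to count as shared.
        return this_level != 1 and sib_level != 1
    # Old behavior: only this_leaf's level is inspected.
    return this_level != 1
```

With `check_sibling=False`, an L2 leaf compared with a sibling's L1 leaf is reported as shared, which is the incorrect result described above; with the check enabled the same comparison returns `False`.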

    Fixes: f16d1becf96f ("cacheinfo: Use cache identifiers to check if the caches are shared if available")
    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20230414081453.244787-2-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec d5bcd008ad cacheinfo: Add arch specific early level initializer
Bugzilla: https://bugzilla.redhat.com/2180619

commit 6539cffa94957241c096099a57d05fa4d8c7db8a
Author: Radu Rendec <rrendec@redhat.com>
Date:   Wed Apr 12 14:57:57 2023 -0400

    cacheinfo: Add arch specific early level initializer

    This patch gives architecture specific code the ability to initialize
    the cache level and allocate cacheinfo memory early, when cache level
    initialization runs on the primary CPU for all possible CPUs.

    This is part of a patch series that attempts to further the work in
    commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
    Previously, in the absence of any DT/ACPI cache info, architecture
    specific cache detection and info allocation for secondary CPUs would
    happen in non-preemptible context during early CPU initialization and
    trigger a "BUG: sleeping function called from invalid context" splat on
    an RT kernel.

    More specifically, this patch adds the early_cache_level() function,
    which is called by fetch_cache_info() as a fallback when the number of
    cache leaves cannot be extracted from DT/ACPI. In the default generic
    (weak) implementation, this new function returns -ENOENT, which
    preserves the original behavior for architectures that do not implement
    the function.

    Since early detection can get the number of cache leaves wrong in some
    cases*, additional logic is added to still call init_cache_level() later
    on the secondary CPU, therefore giving the architecture specific code an
    opportunity to go back and fix the initial guess. Again, the original
    behavior is preserved for architectures that do not implement the new
    function.

    * For example, on arm64, CLIDR_EL1 detection works only when it runs on
      the current CPU. In other words, a CPU cannot detect the cache depth
      for any other CPU than itself.
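
The fallback order can be sketched in Python as follows. This is a model only: the dictionary return value, the `firmware_leaves` parameter, and the `(status, leaves)` tuple returned by the hook are assumptions. The default hook returning -ENOENT and the flag for a later recheck on the secondary CPU follow the description above.

```python
import errno
from typing import Callable, Dict, Tuple


def early_cache_level_default(cpu: int) -> Tuple[int, int]:
    # Generic (weak) implementation: no architecture-specific early detection.
    return -errno.ENOENT, 0


def fetch_cache_info(cpu: int, firmware_leaves: int,
                     early_cache_level: Callable[[int], Tuple[int, int]]
                     = early_cache_level_default) -> Dict[str, object]:
    if firmware_leaves > 0:
        # DT/ACPI supplied the leaf count, so no guess is needed.
        return {"ret": 0, "leaves": firmware_leaves, "early_guess": False}
    ret, leaves = early_cache_level(cpu)
    if ret != 0:
        return {"ret": ret, "leaves": 0, "early_guess": False}
    # Mark the count as a guess so the secondary CPU can correct it later.
    return {"ret": 0, "leaves": leaves, "early_guess": True}
```

An architecture that cannot detect anything early keeps the original behavior (the call fails with -ENOENT), while one that supplies a hook gets an allocation sized from its guess plus a marker that the guess may need fixing.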

    Signed-off-by: Radu Rendec <rrendec@redhat.com>
    Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
    Link: https://lore.kernel.org/r/20230412185759.755408-2-rrendec@redhat.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec cbbbaceabb cacheinfo: Fix LLC is not exported through sysfs
Bugzilla: https://bugzilla.redhat.com/2180619

commit 5c2712387d4850e0b64121d5fd3e6c4e84ea3266
Author: Yicong Yang <yangyicong@hisilicon.com>
Date:   Tue Mar 28 19:49:15 2023 +0800

    cacheinfo: Fix LLC is not exported through sysfs

    After entering 6.3-rc1 the LLC cacheinfo is not exported on our ACPI
    based arm64 server. This is because the LLC cacheinfo is partly reset
    when secondary CPUs boot up. On arm64 the primary cpu will allocate
    and setup cacheinfo:
    init_cpu_topology()
      for_each_possible_cpu()
        fetch_cache_info() // Allocate cacheinfo and init levels
    detect_cache_attributes()
      cache_shared_cpu_map_setup()
        if (!last_level_cache_is_valid()) // not valid, setup LLC
          cache_setup_properties() // setup LLC

    On secondary CPU boot up:
    detect_cache_attributes()
      populate_cache_leaves()
        get_cache_type() // Get cache type from clidr_el1,
                         // for LLC type=CACHE_TYPE_NOCACHE
      cache_shared_cpu_map_setup()
        if (!last_level_cache_is_valid()) // Valid and won't go to this branch,
                                          // leave LLC's type=CACHE_TYPE_NOCACHE

    last_level_cache_is_valid() uses cacheinfo->{attributes, fw_token} to
    test whether the LLC is valid, but populate_cache_leaves() only resets
    the LLC's type. As a result, the LLC's type is never set up again: it
    stays CACHE_TYPE_NOCACHE and the LLC is not exported through sysfs.

    This patch tries to fix this by not re-populating the cache leaves if
    the LLC is valid.
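
The sequence on a secondary CPU can be modelled with a short Python sketch. The dictionary-based leaves, the `skip_if_llc_valid` switch, and the example token are hypothetical and only illustrate the control flow described above, not the kernel's data structures.

```python
from typing import Dict, List

NOCACHE = "nocache"


def llc_is_valid(leaves: List[Dict]) -> bool:
    llc = leaves[-1]
    return bool(llc["attributes"]) or llc["fw_token"] is not None


def populate_cache_leaves(leaves: List[Dict]) -> None:
    # Models the secondary-CPU path in which the LLC type is reset.
    leaves[-1]["type"] = NOCACHE


def detect_cache_attributes(leaves: List[Dict], skip_if_llc_valid: bool) -> List[Dict]:
    # With the fix, leaves are not re-populated when the LLC is already valid.
    if skip_if_llc_valid and llc_is_valid(leaves):
        return leaves
    populate_cache_leaves(leaves)
    return leaves


def example_leaves() -> List[Dict]:
    return [{"type": "data", "attributes": 0, "fw_token": None},
            {"type": "unified", "attributes": 0, "fw_token": "pptt-node"}]
```

Without the skip, the last leaf ends up as `nocache` even though its token is still set; with the skip, it keeps the `unified` type assigned by the primary CPU.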

    Fixes: 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU")
    Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
    Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
    Link: https://lore.kernel.org/r/20230328114915.33340-1-yangyicong@huawei.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec 785c37a73f cacheinfo: Remove of_node_put() for fw_token
Bugzilla: https://bugzilla.redhat.com/2180619

commit 2613cc29c5723881ca603b1a3b50f0107010d5d6
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Wed Nov 16 10:49:58 2022 +0100

    cacheinfo: Remove of_node_put() for fw_token

    fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
    For DT based systems, fw_token is set to a pointer to a DT node.

    commit 3da72e18371c ("cacheinfo: Decrement refcount in
    cache_setup_of_node()")
    doesn't increment the refcount of fw_token anymore in
    cache_setup_of_node(). fw_token is indeed used as a token and not
    as a (struct device_node*), so no reference to fw_token should be
    kept.

    However, [1] is triggered when hotplugging a CPU multiple times
    since cache_shared_cpu_map_remove() decrements the refcount to
    fw_token at each CPU unplugging, eventually reaching 0.

    Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().

    [1]
    ------------[ cut here ]------------
    refcount_t: saturated; leaking memory.
    WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
    Modules linked in:
    CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G        W          6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
    Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
    pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
    lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
    [...]
    Call trace:
    [...]
    of_node_release (drivers/of/dynamic.c:335)
    kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
    of_node_put (drivers/of/dynamic.c:49)
    free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
    cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
    cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
    cpuhp_thread_fun (kernel/cpu.c:785)
    smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
    kthread (kernel/kthread.c:376)
    ret_from_fork (arch/arm64/kernel/entry.S:861)
    ---[ end trace 0000000000000000 ]---
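
The refcount imbalance can be illustrated with a simple counter in Python. This is a sketch under assumptions: the `Node` class, the starting count of 1, and the `put_fw_token` switch are hypothetical, and a plain integer is used where the kernel's refcount_t would saturate and warn instead of going negative.

```python
from dataclasses import dataclass


@dataclass
class Node:
    refcount: int = 1  # assumed baseline reference held by the DT core


def cache_setup_of_node(node: Node) -> None:
    node.refcount += 1  # reference taken while looking up the cache node
    node.refcount -= 1  # dropped again: fw_token is only used as a token


def cache_shared_cpu_map_remove(node: Node, put_fw_token: bool) -> None:
    if put_fw_token:
        node.refcount -= 1  # unbalanced put on every CPU unplug


def refcount_after_hotplug_cycles(cycles: int, put_fw_token: bool) -> int:
    node = Node()
    for _ in range(cycles):
        cache_setup_of_node(node)
        cache_shared_cpu_map_remove(node, put_fw_token)
    return node.refcount
```

With the extra put, each unplug lowers the count by one, so repeated hotplug cycles eventually exhaust it; without the put, the count stays at its baseline.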

    Fixes: 3da72e18371c ("cacheinfo: Decrement refcount in cache_setup_of_node()")
    Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Tested-by: Sudeep Holla <sudeep.holla@arm.com>
    Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Link: https://lore.kernel.org/r/20221116094958.2141072-1-pierre.gondois@arm.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec de7953501c cacheinfo: Decrement refcount in cache_setup_of_node()
Bugzilla: https://bugzilla.redhat.com/2180619

commit 3da72e18371c41a6f6f96b594854b178168c7757
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Wed Oct 26 20:59:54 2022 +0200

    cacheinfo: Decrement refcount in cache_setup_of_node()

    Refcounts to DT nodes are only incremented in the function
    and never decremented. Decrease the refcounts when necessary.

    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
    Link: https://lore.kernel.org/r/20221026185954.991547-1-pierre.gondois@arm.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-04 17:28:09 -04:00
Radu Rendec a919145004 cacheinfo: Initialize variables in fetch_cache_info()
Bugzilla: https://bugzilla.redhat.com/2180619

commit ecaef469920fd6d2c7687f19081946f47684a423
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Tue Jan 24 16:40:46 2023 +0100

    cacheinfo: Initialize variables in fetch_cache_info()

    Set potentially uninitialized variables to 0. This is particularly
    relevant when CONFIG_ACPI_PPTT is not set.

    Reported-by: kernel test robot <lkp@intel.com>
    Link: https://lore.kernel.org/all/202301052307.JYt1GWaJ-lkp@intel.com/
    Reported-by: Dan Carpenter <error27@gmail.com>
    Link: https://lore.kernel.org/all/Y86iruJPuwNN7rZw@kili/
    Fixes: 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU")
    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20230124154053.355376-2-pierre.gondois@arm.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-02 16:57:28 -04:00
Radu Rendec 566b5dbe26 arch_topology: Build cacheinfo from primary CPU
Bugzilla: https://bugzilla.redhat.com/2180619

commit 5944ce092b97caed5d86d961e963b883b5c44ee2
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Wed Jan 4 19:30:29 2023 +0100

    arch_topology: Build cacheinfo from primary CPU

    commit 3fcbf1c77d08 ("arch_topology: Fix cache attributes detection
    in the CPU hotplug path")
    adds a call to detect_cache_attributes() to populate the cacheinfo
    before updating the siblings mask. detect_cache_attributes() allocates
    memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
    kernels, on secondary CPUs, this triggers a:
      'BUG: sleeping function called from invalid context' [1]
    as the code is executed with preemption and interrupts disabled.

    The primary CPU was previously storing the cache information using
    the now removed (struct cpu_topology).llc_id:
    commit 5b8dc787ce4a ("arch_topology: Drop LLC identifier stash from
    the CPU topology")

    allocate_cache_info() tries to build the cacheinfo from the primary
    CPU prior to secondary CPUs booting, if the DT/ACPI description
    contains cache information.
    If allocate_cache_info() fails, then fall back to the current state
    for the cacheinfo allocation. [1] will be triggered in such a case.

    When unplugging a CPU, the cacheinfo memory cannot be freed. If it
    was, then the memory would be allocated early by the re-plugged
    CPU and would trigger [1].

    Note that populate_cache_leaves() might be called multiple times
    due to populate_leaves being moved up. This is required since
    detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
    being allocated but not populated.

    [1]:
     | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
     | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
     | preempt_count: 1, expected: 0
     | RCU nest depth: 1, expected: 1
     | 3 locks held by swapper/111/0:
     |  #0:  (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
     |  #1:  (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
     |  #2:  (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
     | irq event stamp: 0
     | hardirqs last  enabled at (0):  0x0
     | hardirqs last disabled at (0):  copy_process+0x5dc/0x1ab8
     | softirqs last  enabled at (0):  copy_process+0x5dc/0x1ab8
     | softirqs last disabled at (0):  0x0
     | Preemption disabled at:
     |  migrate_enable+0x30/0x130
     | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G        W          6.0.0-rc4-rt6-[...]
     | Call trace:
     |  __kmalloc+0xbc/0x1e8
     |  detect_cache_attributes+0x2d4/0x5f0
     |  update_siblings_masks+0x30/0x368
     |  store_cpu_topology+0x78/0xb8
     |  secondary_start_kernel+0xd0/0x198
     |  __secondary_switched+0xb0/0xb4

    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-02 16:57:28 -04:00
Radu Rendec 905d03f79b cacheinfo: Check 'cache-unified' property to count cache leaves
Bugzilla: https://bugzilla.redhat.com/2180619

commit de0df442ee49cb1f6ee58f3fec5dcb5e5eb70aab
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Wed Jan 4 19:30:26 2023 +0100

    cacheinfo: Check 'cache-unified' property to count cache leaves

    The DeviceTree Specification v0.3 specifies that the cache node
    '[d-|i-|]cache-size' property is required. The 'cache-unified'
    property specifies whether the cache level is separate
    or unified.

    If the cache-size property is missing, no cache leaf is accounted for.
    This can lead to a 'BUG: KASAN: slab-out-of-bounds' [1] bug.

    Check 'cache-unified' property and always account for at least
    one cache leaf when parsing the device tree.

    [1] https://lore.kernel.org/all/0f19cb3f-d6cf-4032-66d2-dedc9d09a0e3@linaro.org/

    Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Tested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://lore.kernel.org/r/20230104183033.755668-4-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-02 16:57:27 -04:00
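The leaf-counting rule from the 'cache-unified' commit above can be modeled in plain C. This is a minimal sketch, not the kernel function: struct cache_node_props and count_cache_leaves() are hypothetical names, and the choice to count a non-unified level without size properties as two leaves (one instruction, one data) is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DT cache node: one flag per property. */
struct cache_node_props {
	bool cache_size;	/* 'cache-size' present */
	bool i_cache_size;	/* 'i-cache-size' present */
	bool d_cache_size;	/* 'd-cache-size' present */
	bool cache_unified;	/* 'cache-unified' present */
};

/* Count the leaves of one cache level; never returns 0. */
static unsigned int count_cache_leaves(const struct cache_node_props *np)
{
	unsigned int leaves = 0;

	if (np->cache_size)
		++leaves;
	if (np->i_cache_size)
		++leaves;
	if (np->d_cache_size)
		++leaves;

	/*
	 * Size properties are missing: fall back on 'cache-unified'.
	 * Assumption: a split level is counted as two leaves.
	 */
	if (!leaves)
		return np->cache_unified ? 1 : 2;

	return leaves;
}
```

Because the result is never zero, an array sized from it always has room for at least one leaf per level, which is the property the commit message relies on.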
Radu Rendec 233ab57771 cacheinfo: Return error code in init_of_cache_level()
Bugzilla: https://bugzilla.redhat.com/2180619

commit 8844c3df001bc1d8397fddea341308da63855d53
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Wed Jan 4 19:30:25 2023 +0100

    cacheinfo: Return error code in init_of_cache_level()

    Make init_of_cache_level() return an error code when the cache
    information parsing fails, to help detect missing information.

    init_of_cache_level() is only called for riscv. Returning an error
    code instead of 0 will prevent detect_cache_attributes() from
    allocating memory if an incomplete DT is parsed.

    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Link: https://lore.kernel.org/r/20230104183033.755668-3-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-02 16:57:27 -04:00
Radu Rendec d86dc62d68 cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
Bugzilla: https://bugzilla.redhat.com/2180619

commit c3719bd9eeb2edf84bd263d662e36ca0ba262a23
Author: Pierre Gondois <pierre.gondois@arm.com>
Date:   Wed Jan 4 19:30:24 2023 +0100

    cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation

    RISC-V's implementation of init_of_cache_level() is following
    the Devicetree Specification v0.3 regarding caches, cf.:
    - s3.7.3 'Internal (L1) Cache Properties'
    - s3.8 'Multi-level and Shared Cache Nodes'

    Allow reusing the implementation by moving it.

    Also make 'levels', 'leaves' and 'level' unsigned int.

    Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
    Link: https://lore.kernel.org/r/20230104183033.755668-2-pierre.gondois@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Signed-off-by: Radu Rendec <rrendec@redhat.com>
2023-05-02 16:57:27 -04:00
Mark Langsdorf 8082711d95 cacheinfo: Use atomic allocation for percpu cache attributes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 11969d698f8cda31bd176ec346833ef97ea7c67e
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Wed, 20 Jul 2022 13:55:38 +0100

On a couple of architectures like RISC-V and ARM64, we need to detect
cache attributes quite early during boot, when the secondary CPUs
start. So we will call detect_cache_attributes in atomic context,
and since a normal allocation can sleep, we will end up getting a
"sleeping in the atomic context" bug splat.

To avoid that, switch the allocation to the atomic version in
preparation for moving the actual detection of cache attributes into the
CPU hotplug path, which is atomic.

Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-1-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 15:18:45 -04:00
Mark Langsdorf 212201b353 cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 521103134a0d07774c8b17f25ff0ef70cbd56c9d
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:52 +0100

The checks to skip the CPU itself or the no-cacheinfo case are implemented
a bit differently, though the effect is exactly the same. Align the
implementation in both cache_shared_cpu_map_{setup,remove} for
improved readability. No functional change.

Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.com
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
Mark Langsdorf 04986160d9 cacheinfo: Use cache identifiers to check if the caches are shared if available
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit f16d1becf96f0a95dc9e1a5a7f97feeec2b149d5
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:51 +0100

The cache identifier is an optional property on most platforms.
Its presence must be indicated by the CACHE_ID valid bit in the
attributes.

We can use the cache identifiers provided by the firmware to check if
any two cpus share the same cache instead of relying on the fw_token
generated and set in the OS.

Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
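The comparison order described in the commit above (prefer firmware cache identifiers when both leaves carry one, otherwise fall back to fw_token) can be sketched as follows. The names MODEL_CACHE_ID, struct model_cacheinfo and model_leaves_are_shared() are hypothetical, and the sketch leaves out any handling for systems without DT or ACPI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "identifier is valid" attribute bit. */
#define MODEL_CACHE_ID (1u << 0)

struct model_cacheinfo {
	unsigned int attributes;
	uint32_t id;		/* firmware-provided cache identifier */
	const void *fw_token;	/* OS-side token, compared by pointer */
};

static bool model_leaves_are_shared(const struct model_cacheinfo *a,
				    const struct model_cacheinfo *b)
{
	/* Trust the identifiers only when both leaves mark them valid. */
	if ((a->attributes & MODEL_CACHE_ID) &&
	    (b->attributes & MODEL_CACHE_ID))
		return a->id == b->id;

	/* Otherwise fall back to comparing the OS-generated tokens. */
	return a->fw_token == b->fw_token;
}
```

Requiring the valid bit on both sides keeps a stale or zero id field from being mistaken for a real match.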
Mark Langsdorf 730c929b3f cacheinfo: Allow early detection and population of cache attributes
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 36bbc5b4ffab33ccac0f4db27f619a6ba7a4fd32
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:50 +0100

Some architectures/platforms may need to set up cache properties very
early in boot, along with other CPU topology, so that all this
information can be used to build the sched_domains used by the
scheduler.

Allow detect_cache_attributes to be called quite early during the boot.

Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
Mark Langsdorf ab49e7ba77 cacheinfo: Add support to check if last level cache(LLC) is valid or shared
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit cc1cfc47ea47187a21ec1f079b3c53264157fe15
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:49 +0100

It is useful to have a helper to check if two given CPUs share the last
level cache. We can do that check by comparing fw_token or by comparing
the cache ID. Currently we check just for fw_token as the cache ID is
optional.

This helper can be used to build the llc_sibling during arch specific
topology parsing and to feed information to the sched_domains. This also
helps to get rid of llc_id in the CPU topology as it is sort of duplicate
information.

Also add a helper to check if the llc information in cacheinfo is valid
or not.

Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
Mark Langsdorf 6258474f9f cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit 9447eb0f1575572218267180b4edff937b3aec57
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:48 +0100

cache_leaves_are_shared is already used even with ACPI and PPTT. It
checks if the cache leaves are shared based on the fw_token pointer.
However, it is defined only if CONFIG_OF is enabled, which
is wrong.

Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It also handles the case where neither OF nor ACPI is defined.

Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
Mark Langsdorf 86b6209df8 cacheinfo: Add helper to access any cache index for a given CPU
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit b14e8d21f726f4ffeaf8833783eda68a1c152b15
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:47 +0100

The cacheinfo for a given CPU at a given index is used in quite a few
places by fetching the base pointer for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.

Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.

Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
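The idea of the index helper in the commit above can be shown with a small model. The array model_leaves and the macros model_per_cpu_cacheinfo() and model_per_cpu_cacheinfo_idx() are hypothetical names for this sketch; it uses a static two-dimensional array in place of real per-CPU storage.

```c
#define MODEL_NR_CPUS	2
#define MODEL_NR_LEAVES	3

struct leaf {
	unsigned int level;
};

/* Stand-in for per-CPU storage: one row of leaves per CPU. */
static struct leaf model_leaves[MODEL_NR_CPUS][MODEL_NR_LEAVES];

/* Base pointer: leaf 0 of the given CPU (the older access style). */
#define model_per_cpu_cacheinfo(cpu)	(model_leaves[(cpu)])

/* Direct accessor: pointer to leaf 'idx' of the given CPU. */
#define model_per_cpu_cacheinfo_idx(cpu, idx) \
	(model_per_cpu_cacheinfo(cpu) + (idx))
```

A caller can then write model_per_cpu_cacheinfo_idx(cpu, idx)->level instead of fetching the base pointer and adding the offset at every call site.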
Mark Langsdorf 99b796880b cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2122318

commit d4ec840baecbed280c7305f9103a10641d4d3799
Author: Sudeep Holla <sudeep.holla@arm.com>
Date: Mon, 4 Jul 2022 11:15:46 +0100

The of_cpu_device_node_get takes care of fetching the CPU's device node
either from the cached cpu_dev->of_node if cpu_dev is initialised, or by
using of_get_cpu_node to parse and fetch the node if cpu_dev isn't
available yet.

Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node for two reasons:
1. There is no other use of cpu_dev, so the code can be simplified
2. It enables the use of detect_cache_attributes and hence cache_setup_of_node
   much earlier, before the CPUs are registered as devices.

Link: https://lore.kernel.org/r/20220704101605.1318280-3-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-10-25 14:21:14 -04:00
Mark Langsdorf 26d4637c02 cacheinfo: clear cache_leaves(cpu) in free_cache_attributes()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2067252

commit e022eac85ecd2140a0829970d923d984356185eb
Author: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Date: Wed, 14 Jul 2021 09:32:55 +0800

On ARM64, when PPTT(Processor Properties Topology Table) is not
implemented in ACPI boot, we will goto 'free_ci' with the following
print:
  Unable to detect cache hierarchy for CPU 0

But other code may still use 'num_leaves' to iterate through the
'info_list', such as get_cpu_cacheinfo_id(). If 'info_list' is NULL, it
would crash. So clear 'num_leaves' in free_cache_attributes().

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Link: https://lore.kernel.org/r/1626226375-58730-1-git-send-email-wangxiongfeng2@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
2022-06-14 16:01:47 -05:00
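The hazard described in the commit above, a leaf count that outlives the list it describes, can be reproduced in a userspace model. struct model_cpu_cacheinfo, model_free_cache_attributes() and model_sum_leaves() are hypothetical names, and an int array stands in for the per-leaf structures.

```c
#include <stdlib.h>

struct model_cpu_cacheinfo {
	unsigned int num_leaves;
	int *info_list;		/* stand-in for the per-leaf array */
};

static void model_free_cache_attributes(struct model_cpu_cacheinfo *ci)
{
	free(ci->info_list);
	ci->info_list = NULL;
	ci->num_leaves = 0;	/* keep the count in sync with the list */
}

/* A reader that trusts num_leaves, in the style of an ID lookup loop. */
static int model_sum_leaves(const struct model_cpu_cacheinfo *ci)
{
	int sum = 0;

	for (unsigned int i = 0; i < ci->num_leaves; i++)
		sum += ci->info_list[i];
	return sum;
}
```

With the count cleared, a later call to the reader sees zero leaves and never touches the NULL list; leaving num_leaves at its old value would make the loop dereference NULL.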
Joe Perches e015e036ae drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_show
Do not indirect the bitmap printing of these shared_cpu show functions by
using cpumap_print_to_pagebuf/bitmap_print_to_pagebuf.

Use the more typical style with the vsnprintf %*pb and %*pbl extensions
directly so there is no possible mixup about the use of offset_in_page(buf)
by bitmap_print_to_pagebuf.

Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/80457b467ab6cde13a173cfd8a4f49cd8467a7fd.1600285923.git.joe@perches.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02 13:24:40 +02:00
Joe Perches 948b3edba8 drivers core: Miscellaneous changes for sysfs_emit
Change additional instances that could use sysfs_emit and sysfs_emit_at
that the coccinelle script could not convert.

o macros creating show functions with ## concatenation
o unbound sprintf uses with buf+len for start of output to sysfs_emit_at
o returns with ?: tests and sprintf to sysfs_emit
o sysfs output with struct class * not struct device * arguments

Miscellanea:

o remove unnecessary initializations around these changes
o consistently use int len for return length of show functions
o use octal permissions and not S_<FOO>
o rename a few show function names so DEVICE_ATTR_<FOO> can be used
o use DEVICE_ATTR_ADMIN_RO where appropriate
o consistently use const char *output for strings
o checkpatch/style neatening

Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/8bc24444fe2049a9b2de6127389b57edfdfe324d.1600285923.git.joe@perches.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02 13:12:07 +02:00
Joe Perches 973c39115c drivers core: Remove strcat uses around sysfs_emit and neaten
strcat is no longer necessary for sysfs_emit and sysfs_emit_at uses.

Convert the strcat uses to sysfs_emit calls and neaten other block
uses of direct returns to use an intermediate const char *.

Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/5d606519698ce4c8f1203a2b35797d8254c6050a.1600285923.git.joe@perches.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02 13:09:10 +02:00
Joe Perches aa838896d8 drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions
Convert the various sprintf family calls in sysfs device show functions
to sysfs_emit and sysfs_emit_at for PAGE_SIZE buffer safety.

Done with:

$ spatch -sp-file sysfs_emit_dev.cocci --in-place --max-width=80 .

And cocci script:

$ cat sysfs_emit_dev.cocci
@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	strcpy(buf, chr);
+	sysfs_emit(buf, chr);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
-	len += scnprintf(buf + len, PAGE_SIZE - len,
+	len += sysfs_emit_at(buf, len,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	...
-	strcpy(buf, chr);
-	return strlen(buf);
+	return sysfs_emit(buf, chr);
}

Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/3d033c33056d88bbe34d4ddb62afd05ee166ab9a.1600285923.git.joe@perches.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02 13:09:10 +02:00
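The buffer-safety argument behind the conversion above can be illustrated with a userspace model of an offset-aware, page-bounded formatter. This is not the kernel implementation: model_emit_at() and MODEL_PAGE_SIZE are hypothetical names, the page size of 4096 is an assumption, and the model simply returns 0 for an out-of-range offset.

```c
#include <stdarg.h>
#include <stdio.h>

#define MODEL_PAGE_SIZE 4096	/* assumed page size for this sketch */

/*
 * Format into buf starting at offset 'at', never writing past the page.
 * Returns the number of characters actually stored (excluding the NUL).
 */
static int model_emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int room, len;

	if (at < 0 || at >= MODEL_PAGE_SIZE)
		return 0;
	room = MODEL_PAGE_SIZE - at;

	va_start(args, fmt);
	len = vsnprintf(buf + at, room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	/* vsnprintf reports the untruncated length; clamp to what fit. */
	return len >= room ? room - 1 : len;
}
```

Because the helper derives the remaining room from the offset itself, a show function can accumulate output with len += model_emit_at(buf, len, ...) without repeating the PAGE_SIZE - len arithmetic that the scnprintf pattern required.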
Linus Torvalds f632a8170a Driver Core and debugfs changes for 5.3-rc1
Here is the "big" driver core and debugfs changes for 5.3-rc1
 
 It's a lot of different patches, all across the tree due to some api
 changes and lots of debugfs cleanups.  Because of this, there is going
 to be some merge issues with your tree at the moment, I'll follow up
 with the expected resolutions to make it easier for you.
 
 Other than the debugfs cleanups, in this set of changes we have:
 	- bus iteration function cleanups (will cause build warnings
 	  with s390 and coresight drivers in your tree)
 	- scripts/get_abi.pl tool to display and parse Documentation/ABI
 	  entries in a simple way
 	- cleanups to Documentation/ABI/ entries to make them parse
 	  easier due to typos and other minor things
 	- default_attrs use for some ktype users
 	- driver model documentation file conversions to .rst
 	- compressed firmware file loading
 	- deferred probe fixes
 
 All of these have been in linux-next for a while, with a bunch of merge
 issues that Stephen has been patient with me for.  Other than the merge
 issues, functionality is working properly in linux-next :)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs updates from Greg KH:
 "Here is the "big" driver core and debugfs changes for 5.3-rc1

  It's a lot of different patches, all across the tree due to some api
  changes and lots of debugfs cleanups.

  Other than the debugfs cleanups, in this set of changes we have:

   - bus iteration function cleanups

   - scripts/get_abi.pl tool to display and parse Documentation/ABI
     entries in a simple way

   - cleanups to Documentation/ABI/ entries to make them parse easier
     due to typos and other minor things

   - default_attrs use for some ktype users

   - driver model documentation file conversions to .rst

   - compressed firmware file loading

   - deferred probe fixes

  All of these have been in linux-next for a while, with a bunch of
  merge issues that Stephen has been patient with me for"

* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
  debugfs: make error message a bit more verbose
  orangefs: fix build warning from debugfs cleanup patch
  ubifs: fix build warning after debugfs cleanup patch
  driver: core: Allow subsystems to continue deferring probe
  drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
  arch_topology: Remove error messages on out-of-memory conditions
  lib: notifier-error-inject: no need to check return value of debugfs_create functions
  swiotlb: no need to check return value of debugfs_create functions
  ceph: no need to check return value of debugfs_create functions
  sunrpc: no need to check return value of debugfs_create functions
  ubifs: no need to check return value of debugfs_create functions
  orangefs: no need to check return value of debugfs_create functions
  nfsd: no need to check return value of debugfs_create functions
  lib: 842: no need to check return value of debugfs_create functions
  debugfs: provide pr_fmt() macro
  debugfs: log errors when something goes wrong
  drivers: s390/cio: Fix compilation warning about const qualifiers
  drivers: Add generic helper to match by of_node
  driver_find_device: Unify the match function with class_find_device()
  bus_find_device: Unify the match callback with class_find_device
  ...
2019-07-12 12:24:03 -07:00
James Morse 83b44fe343 drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
The cacheinfo structures are alloced/freed by cpu online/offline
callbacks. Originally these were only used by sysfs to expose the
cache topology to user space. Without any in-kernel dependencies
CPUHP_AP_ONLINE_DYN was an appropriate choice.

resctrl has started using these structures to identify CPUs that
share a cache. It updates its 'domain' structures from cpu
online/offline callbacks. These depend on the cacheinfo structures
(resctrl_online_cpu()->domain_add_cpu()->get_cache_id()->
 get_cpu_cacheinfo()).
These also run as CPUHP_AP_ONLINE_DYN.

Now that there is an in-kernel dependency, move the cacheinfo
work earlier so we know it's done before resctrl's CPUHP_AP_ONLINE_DYN
work runs.

Fixes: 2264d9c74d ("x86/intel_rdt: Build structures for each resource based on cache topology")
Cc: <stable@vger.kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20190624173656.202407-1-james.morse@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-03 21:25:41 +02:00
Shaokun Zhang 9a83c84c3a drivers: base: cacheinfo: Add variable to record max cache line size
Add coherency_max_size variable to record the maximum cache line size
for different cache levels. If it is available, we will synchronize
it as cache line size, otherwise we will use CTR_EL0.CWG reporting
in cache_line_size() for arm64.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 13:42:54 +01:00
Huacai Chen 3a34c98632 cacheinfo: Keep the old value if of_property_read_u32 fails
Commit 448a5a552f ("drivers: base: cacheinfo: use OF
property_read_u32 instead of get_property,read_number") makes cache
size and number_of_sets be 0 if DT doesn't provide these values. I
think this is unreasonable, so make them keep the old values, which is
the same as old kernels.

Fixes: 448a5a552f ("drivers: base: cacheinfo: use OF property_read_u32 instead of get_property,read_number")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22 13:50:31 +01:00
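The behavior change in the commit above can be sketched with a property reader that leaves its output untouched on failure. All names here (model_read_u32(), struct model_leaf, set_size_overwrite(), set_size_keep()) are hypothetical, and the "overwrite" variant is only one plausible way a probed value can be lost, not a copy of the earlier kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical property lookup: returns 0 and writes *out on success,
 * returns a negative value and leaves *out untouched when the property
 * is missing (modeled here as a NULL pointer).
 */
static int model_read_u32(const uint32_t *prop, uint32_t *out)
{
	if (!prop)
		return -1;
	*out = *prop;
	return 0;
}

struct model_leaf {
	uint32_t size;
};

/* Lossy pattern: a failed read replaces the probed value with 0. */
static void set_size_overwrite(struct model_leaf *leaf, const uint32_t *prop)
{
	if (model_read_u32(prop, &leaf->size))
		leaf->size = 0;
}

/* Preserving pattern: a failed read keeps whatever was there before. */
static void set_size_keep(struct model_leaf *leaf, const uint32_t *prop)
{
	model_read_u32(prop, &leaf->size);
}
```

With the preserving pattern, a size probed from hardware survives a device tree that omits the property, which matches the "keep the old values" intent stated in the commit message.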
Jeffrey Hugo ca388e436f drivers: base: cacheinfo: Do not populate sysfs for unknown cache types
If a cache has an unknown type because neither the hardware nor the
firmware told us, an entry in the sysfs tree will be made, but the type
file will not be present.  lscpu depends on the type file being present
for every entry, and will error out without printing system information
if lscpu cannot open the type file.

Presenting information about a cache without indicating its type is not
useful, therefore if we hit a cache with an unknown type, stop populating
sysfs so that userspace has the maximum amount of useful information.

This addresses the following lscpu error, which prevents any output.
lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
file or directory

Suggested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-04 23:02:17 +02:00
Sudeep Holla 448a5a552f drivers: base: cacheinfo: use OF property_read_u32 instead of get_property,read_number
of_property_read_u32 searches for a property in a device node and reads
a 32-bit value from it. Instead of using of_get_property to get the
property and then reading the 32-bit value using of_read_number, we can
simplify it by using of_property_read_u32.

Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-07 17:20:47 +02:00
Jeremy Linton 582b468bdc drivers: base cacheinfo: Add support for ACPI based firmware tables
Call ACPI cache parsing routines from base cacheinfo code if ACPI
is enabled. Also stub out cache_setup_acpi and acpi_find_last_cache_level
so that individual architectures can enable ACPI topology parsing.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17 17:28:09 +01:00
Jeremy Linton 9b97387c5c cacheinfo: rename of_node to fw_token
Rename and change the type of of_node to indicate
it is a generic pointer which is generally only used
for comparison purposes. In a later patch we will put
an ACPI/PPTT token pointer in fw_token so that
the code which builds the shared cpu masks can be reused.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17 17:28:09 +01:00
Jeremy Linton 2ff075c7df drivers: base: cacheinfo: setup DT cache properties early
The original intent in cacheinfo was that an architecture
specific populate_cache_leaves() would probe the hardware
and then cache_shared_cpu_map_setup() and
cache_override_properties() would provide firmware help to
extend/expand upon what was probed. Arm64 was really
the only architecture that was working this way, and
with the removal of most of the hardware probing logic it
became clear that it was possible to simplify the logic a bit.

This patch combines the walk of the DT nodes with the
code updating the cache size/line_size and nr_sets.
cache_override_properties() (which was DT specific) is
then removed. The result is that cacheinfo.of_node is
no longer used as a temporary place to hold DT references
for future calls that update cache properties. That change
helps to clarify its one remaining use (matching
cacheinfo nodes that represent shared caches) which
will be used by the ACPI/PPTT code in the following patches.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17 17:27:49 +01:00
Jeremy Linton d529a18a61 drivers: base: cacheinfo: move cache_setup_of_node()
In preparation for the next patch, and to aid in
review of that patch, let's move cache_setup_of_node
further down in the module without any changes.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-17 17:06:49 +01:00
Greg Kroah-Hartman 8c9076b07c Merge 4.15-rc6 into driver-core-next
We want the fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02 14:56:51 +01:00
Sudeep Holla f57ab9a01a drivers: base: cacheinfo: fix cache type for non-architected system cache
Commit dfea747d2a ("drivers: base: cacheinfo: support DT overrides for
cache properties") doesn't initialise the cache type if it's present
only in DT and the architecture is not aware of it. Such caches are
unified system-level caches, which are generally transparent.

This patch checks if the cache type is set to NOCACHE but the DT node
indicates that it's a unified cache, and sets the cache type accordingly.
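
A minimal sketch of that check, written as stand-alone C rather than
kernel code. The enum mirrors the kernel's cache type names, but
struct cache_leaf, fixup_cache_type() and the dt_says_unified flag are
hypothetical; the flag stands in for whatever the DT lookup reports
(assumed here to be a boolean "cache is unified" answer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cache types, named after the kernel's; values are local to this sketch. */
enum cache_type {
	CACHE_TYPE_NOCACHE = 0,
	CACHE_TYPE_INST,
	CACHE_TYPE_DATA,
	CACHE_TYPE_UNIFIED,
};

/* Hypothetical, reduced cache leaf: only the type matters here. */
struct cache_leaf {
	enum cache_type type;
};

/*
 * If the architecture code left the type unset but the device tree
 * describes a unified cache, take the type from the device tree.
 * Types that were already probed are left untouched.
 */
static void fixup_cache_type(struct cache_leaf *leaf, bool dt_says_unified)
{
	assert(leaf != NULL);

	if (leaf->type == CACHE_TYPE_NOCACHE && dt_says_unified)
		leaf->type = CACHE_TYPE_UNIFIED;
}
```

The guard on CACHE_TYPE_NOCACHE is the important part: the device tree
only fills a gap and does not override a type the architecture reported.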

Fixes: dfea747d2a ("drivers: base: cacheinfo: support DT overrides for cache properties")
Reported-and-tested-by: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-15 20:07:49 +01:00
Greg Kroah-Hartman 3282570990 driver core: Remove redundant license text
Now that the SPDX tag is in all driver core files, that identifies the
license in a specific and legally-defined manner.  So the extra GPL text
wording can be removed as it is no longer needed at all.

This is done on a quest to remove the 700+ different ways that files in
the kernel describe the GPL license text.  And there's unneeded stuff
like the address (sometimes incorrect) for the FSF which is never
needed.

No copyright headers or other non-license-description text was removed.

Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-07 18:36:44 +01:00
Greg Kroah-Hartman 989d42e85d driver core: add SPDX identifiers to all driver core files
It's good to have SPDX identifiers in all files to make it easier to
audit the kernel tree for correct licenses.

Update the driver core files with the correct SPDX license
identifier based on the license text in the file itself.  The SPDX
identifier is a legally binding shorthand, which can be used instead of
the full boiler plate text.

This work is based on a script and data from Thomas Gleixner, Philippe
Ombredanne, and Kate Stewart.

Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-07 18:36:43 +01:00
Linus Torvalds eb254f323b Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cache allocation interface from Thomas Gleixner:
 "This provides support for Intel's Cache Allocation Technology, a cache
  partitioning mechanism.

  The interface is odd, but the hardware interface of that CAT stuff is
  odd as well.

  We tried hard to come up with an abstraction, but that allowed only
  rather simple partitioning, with no way of sharing or of dealing with
  the per package nature of this mechanism.

  In the end we decided to expose the allocation bitmaps directly so all
  combinations of the hardware can be utilized.

  There are two ways of associating a cache partition:

   - Task

     A task can be added to a resource group. It uses the cache
     partition associated to the group.

   - CPU

     All tasks which are not members of a resource group use the group
     to which the CPU they are running on is associated.

     That allows for simple CPU based partitioning schemes.

  The main expected users are:

   - Virtualization so a VM can trash only the associated part of
     the cache w/o disturbing others

   - Real-Time systems to separate RT and general workloads.

   - Latency sensitive enterprise workloads

   - In theory this also can be used to protect against cache side
     channel attacks"

[ Intel RDT is "Resource Director Technology". The interface really is
  rather odd and very specific, which delayed this pull request while I
  was thinking about it. The pull request itself came in early during
  the merge window, I just delayed it until things had calmed down and I
  had more time.

  But people tell me they'll use this, and the good news is that it is
  _so_ specific that it's rather independent of anything else, and no
  user is going to depend on the interface since it's pretty rare. So if
  push comes to shove, we can just remove the interface and nothing will
  break ]
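
The task and CPU association rules quoted above can be illustrated with
a small stand-alone C sketch. It assumes that a class-of-service id
("closid") of 0 means "not in any resource group"; struct task and
effective_closid() are hypothetical names for illustration, not the
kernel's interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task: closid 0 means "member of no resource group". */
struct task {
	unsigned int closid;
};

/*
 * Pick the cache partition a task runs with: its own group's closid if
 * it has one, otherwise the closid of the CPU it is running on.
 */
static unsigned int effective_closid(const struct task *t,
				     unsigned int cpu_closid)
{
	assert(t != NULL);

	return t->closid ? t->closid : cpu_closid;
}
```

With this rule, assigning CPUs to groups is enough for simple CPU-based
partitioning, while individual tasks can still be pinned to a group.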

* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  x86/intel_rdt: Implement show_options() for resctrlfs
  x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled
  x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  x86/intel_rdt: Fix setting of closid when adding CPUs to a group
  x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee
  x86/intel_rdt: Reset per cpu closids on unmount
  x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A
  x86/intel_rdt: Prevent deadlock against hotplug lock
  x86/intel_rdt: Protect info directory from removal
  x86/intel_rdt: Add info files to Documentation
  x86/intel_rdt: Export the minimum number of set mask bits in sysfs
  x86/intel_rdt: Propagate error in rdt_mount() properly
  x86/intel_rdt: Add a missing #include
  MAINTAINERS: Add maintainer for Intel RDT resource allocation
  x86/intel_rdt: Add scheduler hook
  x86/intel_rdt: Add schemata file
  x86/intel_rdt: Add tasks files
  x86/intel_rdt: Add cpus file
  x86/intel_rdt: Add mkdir to resctrl file system
  x86/intel_rdt: Add "info" files to resctrl file system
  ...
2016-12-22 09:25:45 -08:00
Linus Torvalds 098c30557a Driver core patches for 4.10-rc1

Merge tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here's the new driver core patches for 4.10-rc1.

  Big thing here is the nice addition of "functional dependencies" to
  the driver core. The idea has been talked about for a very long time,
  great job to Rafael for stepping up and implementing it. It's been
  tested for longer than the 4.9-rc1 date, we held off on merging it
  earlier in order to feel more comfortable about it.

  Other than that, it's just a handful of small other patches, some good
  cleanups to the mess that is the firmware class code, and we have a
  test driver for the deferred probe logic.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (30 commits)
  firmware: Correct handling of fw_state_wait() return value
  driver core: Silence device links sphinx warning
  firmware: remove warning at documentation generation time
  drivers: base: dma-mapping: Fix typo in dmam_alloc_non_coherent comments
  driver core: test_async: fix up typo found by 0-day
  firmware: move fw_state_is_done() into UHM section
  firmware: do not use fw_lock for fw_state protection
  firmware: drop bit ops in favor of simple state machine
  firmware: refactor loading status
  firmware: fix usermode helper fallback loading
  driver core: firmware_class: convert to use class_groups
  driver core: devcoredump: convert to use class_groups
  driver core: class: add class_groups support
  kernfs: Declare two local data structures static
  driver-core: fix platform_no_drv_owner.cocci warnings
  drivers/base/memory.c: Remove unused 'first_page' variable
  driver core: add CLASS_ATTR_WO()
  drivers: base: cacheinfo: support DT overrides for cache properties
  drivers: base: cacheinfo: add pr_fmt logging
  drivers: base: cacheinfo: fix boot error message when acpi is enabled
  ...
2016-12-13 11:42:18 -08:00
Sudeep Holla dfea747d2a drivers: base: cacheinfo: support DT overrides for cache properties
A few architectures like x86, ia64 and s390 derive the cache topology
and all the properties using a specific architected mechanism, while on
some other architectures like powerpc all that information is derived
from the device tree.

On ARM, both mechanisms are used. While all the cache properties can
be derived in an architected way, ARM needs to rely on the device tree
to get the cache topology information.

However, there are a few platforms where this architected mechanism is
broken and the device tree properties can be used to override these
incorrect values.

This patch adds support for overriding the cache property values with
the values specified in the device tree.
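
The override idea can be sketched in stand-alone C as follows. This is
not the patch itself: struct dt_u32, struct dt_cache_node, struct
cache_leaf, override_u32() and cache_override_from_dt() are hypothetical
names, and recomputing the associativity after the override is an
assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical optional DT property: present flag plus value. */
struct dt_u32 {
	bool present;
	uint32_t value;
};

/* Hypothetical view of the cache properties a DT node may carry. */
struct dt_cache_node {
	struct dt_u32 size;
	struct dt_u32 line_size;
	struct dt_u32 nr_sets;
};

/* Reduced cache leaf holding the probed (possibly wrong) values. */
struct cache_leaf {
	uint32_t size;
	uint32_t coherency_line_size;
	uint32_t number_of_sets;
	uint32_t ways_of_associativity;
};

/* Replace a probed value only when the device tree supplies one. */
static void override_u32(uint32_t *field, struct dt_u32 prop)
{
	if (prop.present)
		*field = prop.value;
}

static void cache_override_from_dt(struct cache_leaf *leaf,
				   const struct dt_cache_node *np)
{
	assert(leaf != NULL && np != NULL);

	override_u32(&leaf->size, np->size);
	override_u32(&leaf->coherency_line_size, np->line_size);
	override_u32(&leaf->number_of_sets, np->nr_sets);

	/* Assumption: derive the ways again from the final values. */
	if (leaf->number_of_sets > 1 && leaf->size > 0 &&
	    leaf->coherency_line_size > 0)
		leaf->ways_of_associativity =
			(leaf->size / leaf->number_of_sets) /
			leaf->coherency_line_size;
}
```

Properties absent from the device tree keep their probed values, so a
platform only needs to describe the fields its hardware reports wrongly.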

Cc: Alex Van Brunt <avanbrunt@nvidia.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-10 17:30:53 +01:00