Commit Graph

437 Commits

Author SHA1 Message Date
Michael Petlan 56250a0061 perf parse-events: Remove "not supported" hybrid cache events
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 71c86cda750b001100e0d6dc04a88449b7381a59
Author: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Date: Fri Sep 23 11:00:13 2022 +0800

description
===========
By default, we create two hybrid cache events, one is for cpu_core, and
another is for cpu_atom. But Some hybrid hardware cache events are only
available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only
available on cpu_core, while the 'L1-icache-loads' is only available on
cpu_atom. We need to remove "not supported" hybrid cache events. By
extending is_event_supported() to global API and using it to check if the
hybrid cache events are supported before being created, we can remove the
"not supported" hybrid cache events.

Before:

 # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1

 Performance counter stats for 'system wide':

            52,570      cpu_core/L1-dcache-load-misses/
   <not supported>      cpu_atom/L1-dcache-load-misses/
   <not supported>      cpu_core/L1-icache-loads/
         1,471,817      cpu_atom/L1-icache-loads/

       1.004915229 seconds time elapsed

After:

 # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1

 Performance counter stats for 'system wide':

            54,510      cpu_core/L1-dcache-load-misses/
         1,441,286      cpu_atom/L1-icache-loads/

       1.005114281 seconds time elapsed

Fixes: 30def61f64 ("perf parse-events: Create two hybrid cache events")
    Reported-by: Yi Ammy <ammy.yi@intel.com>
    Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
    Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jin Yao <yao.jin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20220923030013.3726410-2-zhengjun.xing@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:26:07 +01:00
Michael Petlan d164a27d46 perf tools: Do not pass NULL to parse_events()
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 806731a9465b42aaf887cbaf8bfee7eccc9417de
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Tue Aug 9 11:07:02 2022 +0300

description
===========
Many cases do not use the extra error information provided by
parse_events and instead pass NULL as the struct parse_events_error
pointer. Add a wrapper for those cases so that the pointer is never
NULL.

    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220809080702.6921-4-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:58 +01:00
Michael Petlan 40ce65a372 perf parse-events: Fix segfault when event parser gets an error
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 2e828582b81f5bc76a4fe8e7812df259ab208302
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Tue Aug 9 11:07:00 2022 +0300

description
===========
parse_events() is often called with parse_events_error set to NULL.
Make parse_events_error__handle() not segfault in that case.

A subsequent patch changes to avoid passing NULL in the first place.

Fixes: 43eb05d066 ("perf tests: Support 'Track with sched_switch' test for hybrid")
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jin Yao <yao.jin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220809080702.6921-2-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:58 +01:00
Michael Petlan 980c403b61 perf parse-events: Break out tracepoint and printing
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 9b7c7728f4e4ba8dd75269fb111fa187faa018c6
Author: Ian Rogers <irogers@google.com>
Date: Fri Jul 29 13:42:17 2022 -0700

description
===========
Move print_*_events functions out of parse-events.c into a new
print-events.c. Move tracepoint code into tracepoint.c or
trace-event-info.c (sole user). This reduces the dependencies of
parse-events.c and makes it more amenable to being a library in the
future.

Remove some unnecessary definitions from parse-events.h. Fix a
checkpatch.pl warning on using unsigned rather than unsigned int.  Fix
some line length warnings too.

    Signed-off-by: Ian Rogers <irogers@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com
[ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ]
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:55 +01:00
Michael Petlan d9dd69e316 perf parse-events: Don't #define YY_EXTRA_TYPE
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 32f457abb84676f54a8d3144fd7504207d348f87
Author: Ian Rogers <irogers@google.com>
Date: Fri Jul 29 13:42:16 2022 -0700

description
===========
Adding a #define to side-effect a local include isn't clean, for
example, it inhibits header precompilation. YY_EXTRA_TYPE is
defined to be void* by default, so just remove.

    Signed-off-by: Ian Rogers <irogers@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: https://lore.kernel.org/r/20220729204217.250166-2-irogers@google.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:55 +01:00
Michael Petlan 7a532ba916 perf stat: Add requires_cpu flag for uncore
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit d3345fecf9e5f63be7946a1e5bf1f5695c67b445
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Tue May 24 10:54:33 2022 +0300

description
===========
Uncore events require a CPU i.e. it cannot be -1.

The evsel system_wide flag is intended for events that should be on every
CPU, which does not make sense for uncore events because uncore events do
not map one-to-one with CPUs.

These 2 requirements are not exactly the same, so introduce a new flag
'requires_cpu' for the uncore case.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:10 +02:00
Michael Petlan 15d9160cb8 perf evsel: Constify a few arrays
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 545a96c90fbe82c4c1e214b0aada4a22e5ffd1e4
Author: Ian Rogers <irogers@google.com>
Date: Fri May 6 22:34:07 2022 -0700

description
===========
Remove public definition of evsel__tool_names(). Not used outside
util/evsel.c.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:03 +02:00
Michael Petlan 059e673404 perf list: Print all available tool events
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 75eafc970bd9d36d906960a81376549f5dc99696
Author: Florian Fischer <florian.fischer@muhq.space>
Date: Wed Apr 20 19:42:44 2022 +0200

description
===========
Introduce names for the new tool events 'user_time' and 'system_time'.

  $ perf list
  ...
  duration_time                                      [Tool event]
  user_time                                          [Tool event]
  system_time                                        [Tool event]
  ...

Committer testing:

Before:

  $ perf list | grep Tool
  duration_time                                      [Tool event]
  $

After:

  $ perf list | grep Tool
    duration_time                                    [Tool event]
    user_time                                        [Tool event]
    system_time                                      [Tool event]
  $

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:01 +02:00
Michael Petlan 80229988ad perf stat: Add user_time and system_time events
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit b03b89b350034f220cc24fc77c56990a97a796b2
Author: Florian Fischer <florian.fischer@muhq.space>
Date: Wed Apr 20 12:23:53 2022 +0200

description
===========
It bothered me that during benchmarking using 'perf stat' (to collect
for example CPU cache events) I could not simultaneously retrieve the
times spend in user or kernel mode in a machine readable format.

When running 'perf stat' the output for humans contains the times
reported by rusage and wait4.

  $ perf stat -e cache-misses:u -- true

   Performance counter stats for 'true':

             4,206      cache-misses:u

       0.001113619 seconds time elapsed

       0.001175000 seconds user
       0.000000000 seconds sys

But 'perf stat's machine-readable format does not provide this information.

  $ perf stat -x, -e cache-misses:u -- true
  4282,,cache-misses:u,492859,100.00,,

I found no way to retrieve this information using the available events
while using machine-readable output.

This patch adds two new tool internal events 'user_time' and
'system_time', similarly to the already present 'duration_time' event.

Both events use the already collected rusage information obtained by
wait4 and tracked in the global ru_stats.

Examples presenting cache-misses and rusage information in both human
and machine-readable form:

  $ perf stat -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .

   Performance counter stats for 'grep -q -r duration_time .':

        67,422,542 ns   duration_time:u
        50,517,000 ns   user_time:u
        16,839,000 ns   system_time:u
            30,937      cache-misses:u

       0.067422542 seconds time elapsed

       0.050517000 seconds user
       0.016839000 seconds sys

  $ perf stat -x, -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .
  72134524,ns,duration_time:u,72134524,100.00,,
  65225000,ns,user_time:u,65225000,100.00,,
  6865000,ns,system_time:u,6865000,100.00,,
  38705,,cache-misses:u,71189328,100.00,,

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:01 +02:00
Michael Petlan e078eae354 perf tools: Fix misleading add event PMU debug message
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit f034fc50d3c7d9385c20d505ab4cf56b8fd18ac7
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Mon Apr 11 09:17:58 2022 +0300

description
===========
Fix incorrect debug message:

   Attempting to add event pmu 'intel_pt' with '' that may result in
   non-fatal errors

which always appears with perf record -vv and intel_pt e.g.

    perf record -vv -e intel_pt//u uname

The message is incorrect because there will never be non-fatal errors.

Suppress the message if the PMU is 'selectable' i.e. meant to be
selected directly as an event.

Fixes: 4ac22b484d ("perf parse-events: Make add PMU verbose output clearer")

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:57 +02:00
Michael Petlan 1208b28107 perf parse: Fix event parser error for hybrid systems
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 91c9923a473a694eb1c5c01ab778a77114969707
Author: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Date: Mon Mar 7 23:16:27 2022 +0800

description
===========
This bug happened on hybrid systems when both cpu_core and cpu_atom
have the same event name such as "UOPS_RETIRED.MS" while their event
terms are different, then during perf stat, the event for cpu_atom
will parse fail and then no output for cpu_atom.

UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/

It is because event terms in the "head" of parse_events_multi_pmu_add
will be changed to event terms for cpu_core after parsing UOPS_RETIRED.MS
for cpu_core, then when parsing the same event for cpu_atom, it still
uses the event terms for cpu_core, but event terms for cpu_atom are
different with cpu_core, the event parses for cpu_atom will fail. This
patch fixes it, the event terms should be parsed from the original
event.

This patch can work for the hybrid systems that have the same event
in more than 2 PMUs. It also can work in non-hybrid systems.

Before:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 2737845 16068518485 16068518485

 Performance counter stats for 'system wide':

         2,737,845      cpu_core/UOPS_RETIRED.MS/

       1.002553850 seconds time elapsed

After:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 1977555 16076950711 16076950711
  UOPS_RETIRED.MS: 568684 8038694234 8038694234

 Performance counter stats for 'system wide':

         1,977,555      cpu_core/UOPS_RETIRED.MS/
           568,684      cpu_atom/UOPS_RETIRED.MS/

       1.004758259 seconds time elapsed

Fixes: fb0811535e92c6c1 ("perf parse-events: Allow config on kernel PMU events")

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:36:14 +02:00
Michael Petlan dad904186e perf parse-events: Fix NULL check against wrong variable
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit a7a72631f62445e3671b7cab5ad01f856c1aa90d
Author: Weiguo Li <liwg06@foxmail.com>
Date: Fri Mar 11 21:06:57 2022 +0800

description
===========
We did a null check after "tmp->symbol = strdup(...)", but we checked
"list->symbol" other than "tmp->symbol".

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:36:14 +02:00
Michael Petlan e2f499f686 perf test: Add parse-events test for aliases with hyphens
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit b4a7276c5e9a79c238a2fad4fb9498dd3558ad2e
Author: John Garry <john.garry@huawei.com>
Date: Mon Jan 17 23:10:15 2022 +0800

description
===========
Add a test which allows us to test parsing an event alias with hyphens.

Since these events typically do not exist on most host systems, add the
alias to the fake pmu.

Function perf_pmu__test_parse_init() has terms added to match known test
aliases.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:36:08 +02:00
Michael Petlan 28ddab1a11 perf parse-events: Support event alias in form foo-bar-baz
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 864bc8c905261f264c3ea357027cf555fe51c5a3
Author: John Garry <john.garry@huawei.com>
Date: Mon Jan 17 23:10:13 2022 +0800

description
===========
Event aliasing for events whose name in the form foo-bar-baz is not
supported, while foo-bar, foo_bar_baz, and other combinations are, i.e.
two hyphens are not supported.

The HiSilicon D06 platform has events in such form:

  $ ./perf list sdir-home-migrate

  List of pre-defined events (to be used in -e):

  uncore hha:
    sdir-home-migrate
   [Unit: hisi_sccl,hha]

  $ sudo ./perf stat -e sdir-home-migrate
  event syntax error: 'sdir-home-migrate'
                          \___ parser error
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

   -e, --event <event>event selector. use 'perf list' to list available events

To support, add an extra PMU event symbol type for "baz", and add a new
rule in the bison file.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:36:07 +02:00
Michael Petlan fe5233088f perf parse-events: Architecture specific leader override
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 94dbfd6781a0e87b6faa6012810eb22e7d5b8a70
Author: Ian Rogers <irogers@google.com>
Date: Tue Nov 30 09:49:45 2021 -0800

description
===========
Currently topdown events must appear after a slots event:

  $ perf stat -e '{slots,topdown-fe-bound}' /bin/true

   Performance counter stats for '/bin/true':

         3,183,090      slots
           986,133      topdown-fe-bound

Reversing the events yields:

  $ perf stat -e '{topdown-fe-bound,slots}' /bin/true
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-fe-bound).

For metrics the order of events is determined by iterating over a
hashmap, and so slots isn't guaranteed to be first which can yield this
error.

Change the set_leader in parse-events, called when a group is closed, so
that rather than always making the first event the leader, if the slots
event exists then it is made the leader. It is then moved to the head of
the evlist otherwise it won't be opened in the correct order.

The result is:

  $ perf stat -e '{topdown-fe-bound,slots}' /bin/true

   Performance counter stats for '/bin/true':

         3,274,795      slots
         1,001,702      topdown-fe-bound

A problem with this approach is the slots event is identified by name,
names can be overwritten like 'cpu/slots,name=foo/' and this causes the
leader change to fail.

The change also modifies and fixes mixed groups like, with the change:

  $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2

   Performance counter stats for 'system wide':

        5574985410      slots
         971981616      instructions
        1348461887      topdown-fe-bound

       2.001263120 seconds time elapsed

Without the change:

  $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2

   Performance counter stats for 'system wide':

     <not counted>      instructions
     <not counted>      slots
   <not supported>      topdown-fe-bound

       2.006247990 seconds time elapsed

Something that may be undesirable here is that the events are reordered
in the output.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:41 +02:00
Michael Petlan 8d8613cbac perf evlist: Allow setting arbitrary leader
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit ecdcf630d71f3b4c64097cad0add561cd5010c02
Author: Ian Rogers <irogers@google.com>
Date: Tue Nov 30 09:49:44 2021 -0800

description
===========
The leader of a group is the first, but allow it to be an arbitrary list
member so that for Intel topdown events slots may always be the group
leader.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:41 +02:00
Michael Petlan e06beabb61 perf evsel: Fix memory leaks relating to unit
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit b194c9cd09dd98af76beaa32a041af674260d730
Author: Ian Rogers <irogers@google.com>
Date: Thu Nov 18 00:47:49 2021 -0800

description
===========
unit may have a strdup pointer or be to a literal, consequently memory
assocciated with it isn't freed. Change it so the unit is always strdup
and so the memory can be safely freed.

Fix related issue in perf_event__process_event_update() for name and
own_cpus. Leaks were spotted by leak sanitizer.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:30 +02:00
Michael Petlan be19bc8026 perf parse-event: Add init and exit to parse_event_error
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 07eafd4e053a41d72611848b8758df0752b53ee4
Author: Ian Rogers <irogers@google.com>
Date: Sun Nov 7 01:00:01 2021 -0800

description
===========
parse_events() may succeed but leave string memory allocations reachable
in the error.

Add an init/exit that must be called to initialize and clean up the
error. This fixes a leak in metricgroup parse_ids.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:13 +02:00
Michael Petlan 2620c00cd7 perf parse-events: Rename parse_events_error functions
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 6c1912898ed21bef2d7f8b52902b8bc3c0e5c2b5
Author: Ian Rogers <irogers@google.com>
Date: Sun Nov 7 01:00:00 2021 -0800

description
===========
Group error functions and name after the data type they manipulate.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:12 +02:00
Michael Petlan 0a3a21add7 perf list: Display hybrid PMU events with cpu type
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 0e0ae8742207c3b477cf0357b8115cec7b19612c
Author: Jin Yao <yao.jin@linux.intel.com>
Date: Fri Sep 3 10:52:39 2021 +0800

description
===========
Add a new option '--cputype' to 'perf list' to display core-only PMU
events or atom-only PMU events.

Each hybrid PMU event has been assigned with a PMU name, this patch
compares the PMU name before listing the result.

For example:

  perf list --cputype atom
  ...
  cache:
    core_reject_l2q.any
         [Counts the number of request that were not accepted into the L2Q because the L2Q is FULL. Unit: cpu_atom]
  ...

The "Unit: cpu_atom" is displayed in the brief description section
to indicate this is an atom event.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:05 +02:00
Michael Petlan 303319165f perf parse-events: Allow config on kernel PMU events
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit fb0811535e92c6c1e093d7f59eb9d66426653b39
Author: Ian Rogers <irogers@google.com>
Date: Fri Oct 15 10:21:26 2021 -0700

description
===========
An event like inst_retired.any on an Intel skylake is found in the
pmu-events code created from the pipeline event JSON.

The event is an alias for cpu/event=0xc0,period=2000003/ and
parse-events recognizes the event with the token PE_KERNEL_PMU_EVENT.

The parser doesn't currently allow extra configuration on such events,
except for modifiers, so:

  $ perf stat -e inst_retired.any// /bin/true
  event syntax error: 'inst_retired.any//'
                       \___ parser error
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events

This patch adds configuration to these events which can be useful for a
number of parameters like name and call-graph:

  $ sudo perf record -e inst_retired.any/call-graph=lbr/ -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.856 MB perf.data (44 samples) ]

It is necessary for the metric code so that we may add metric-id values
to these events before they are parsed.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:59 +02:00
Michael Petlan bddff40e39 perf parse-events: Add new "metric-id" term
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 2b62b3a611715d3ca612e3225cf436277ed9648b
Author: Ian Rogers <irogers@google.com>
Date: Fri Oct 15 10:21:25 2021 -0700

description
===========
Add a new "metric-id" term to events so that metric parsing can set an
ID that can be reliably looked up.

Metric parsing currently will turn a metric like "instructions/cycles"
into a parse events string of "{instructions,cycles}:W".

However, parse-events may change "instructions" into "instructions:u" if
perf_event_paranoid=2.

When this happens expr__resolve_id currently fails as stat-shadow adds
the ID "instructions:u" to match with the counter value and the metric
tries to look up the ID just "instructions".

A later patch will use the new term.

An example of the current problem:

  $ echo -1 > /proc/sys/kernel/perf_event_paranoid
  $ perf stat -M IPC /bin/true
   Performance counter stats for '/bin/true':

           1,217,161      inst_retired.any          #     0.97 IPC
           1,250,389      cpu_clk_unhalted.thread

         0.002064773 seconds time elapsed

         0.002378000 seconds user
         0.000000000 seconds sys

  $ echo 2 > /proc/sys/kernel/perf_event_paranoid
  $ perf stat -M IPC /bin/true
   Performance counter stats for '/bin/true':

             150,298      inst_retired.any:u        #      nan IPC
             187,095      cpu_clk_unhalted.thread:u

         0.002042731 seconds time elapsed

         0.000000000 seconds user
         0.002377000 seconds sys

Note: nan IPC is printed as an effect of "perf metric: Use NAN for
missing event IDs." but earlier versions of perf just fail with a parse
error and display no value.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:59 +02:00
Michael Petlan 5852e30573 perf parse-events: Add const to evsel name
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 8e8bbfb311a26a17834f1839e15e2c29ea5e58c6
Author: Ian Rogers <irogers@google.com>
Date: Fri Oct 15 10:21:24 2021 -0700

description
===========
The evsel name is strdup-ed before assignment and so can be const.

A later change will add another similar string.

Using const makes it clearer that these are not out arguments.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:58 +02:00
Michael Petlan b31bf1d526 perf parse-events: Set numeric term config
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 4f9d4f8aa7328068a2f25cfb3d1d04c1b6fa54ac
Author: John Garry <john.garry@huawei.com>
Date: Thu Sep 16 20:34:21 2021 +0800

description
===========
For numeric terms, the config field may be NULL as it is not set from
the l+y parsing.

Fix by setting the term config from the term type name.

Also fix up the pmu-events test to set the alias strings to set the
period term properly, and fix up parse-events test to check the term
config string.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:47 +02:00
Michael Petlan 50b64087cb perf list: Display pmu prefix for partially supported hybrid cache events
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 6c93f39f2f435d822c2f765650f405acebdc49fc
Author: Jin Yao <yao.jin@linux.intel.com>
Date: Thu Sep 9 14:18:44 2021 +0800

description
===========
Part of hardware cache events are only available on one CPU PMU.
For example, 'L1-dcache-load-misses' is only available on cpu_core.
perf list should clearly report this info.

root@otcpl-adl-s-2:~# ./perf list

Before:
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]
  node-store-misses                                  [Hardware cache event]
  node-stores                                        [Hardware cache event]

After:
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  cpu_atom/L1-icache-loads/                          [Hardware cache event]
  cpu_core/L1-dcache-load-misses/                    [Hardware cache event]
  cpu_core/node-load-misses/                         [Hardware cache event]
  cpu_core/node-loads/                               [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]

Now we can clearly see 'L1-dcache-load-misses' is only available
on cpu_core.

If without pmu prefix, it indicates the event is available on both
cpu_core and cpu_atom.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:45 +02:00
Michael Petlan 33db845289 perf parse-events: Remove unnecessary #includes
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit cb7bfb1da6f609bf954d5f164733ff35b1cb4d4e
Author: Ian Rogers <irogers@google.com>
Date: Wed Jan 27 10:46:29 2021 -0800

description
===========
Minor cleanup motivated by trying to separately fuzz test parse-events.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:44 +02:00
Michael Petlan 8a3c82fc8f perf parse-events: Avoid enum forward declaration.
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 8228e9361e2a7447eaed87499123e85871c8bc18
Author: Ian Rogers <irogers@google.com>
Date: Wed Sep 15 14:14:28 2021 -0700

description
===========
Enum forward declarations aren't allowed as the size can't be implied.
Switch to just using an int. This fixes a clang warning:

  In file included from tools/perf/bench/evlist-open-close.c:13:
  tools/perf/bench/../util/parse-events.h:185:6: error: redeclaration of already-defined enum 'perf_tool_event' is a GNU extension [-Werror,-Wgnu-redeclared-enum]
  enum perf_tool_event;
       ^
  tools/perf/bench/../util/evsel.h:28:6: note: previous definition is here
  enum perf_tool_event {
       ^

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:43 +02:00
Michael Petlan 7e244c35ab perf tools: Fix hybrid config terms list corruption
Bugzilla: https://bugzilla.redhat.com/2069070

upstream
========
commit 99fc5941b835d662eb2e91d8b61249e9a51df9f0
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Thu Sep 9 15:55:08 2021 +0300

description
===========
A config terms list was spliced twice, resulting in a never-ending loop
when the list was traversed. Fix by using list_splice_init() and copying
and freeing the lists as necessary.

This patch also depends on patch "perf tools: Factor out
copy_config_terms() and free_config_terms()"

Example on ADL:

 Before:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  # jobs
  [1]+  Running                    perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname
  # perf top -E 10
    PerfTop:    4071 irqs/sec  kernel: 6.9%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles],  (all, 24 CPUs)
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    97.60%  perf           [.] __evsel__get_config_term
     0.25%  [kernel]       [k] kallsyms_expand_symbol.constprop.13
     0.24%  perf           [.] kallsyms__parse
     0.15%  [kernel]       [k] _raw_spin_lock
     0.14%  [kernel]       [k] number
     0.13%  [kernel]       [k] advance_transaction
     0.08%  [kernel]       [k] format_decode
     0.08%  perf           [.] map__process_kallsym_symbol
     0.08%  perf           [.] rb_insert_color
     0.08%  [kernel]       [k] vsnprintf
  exiting.
  # kill %1

After:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.060 MB perf.data ]
  # perf script | head
       perf-exec   604 [001]  1827.312293:                            psb:  psb offs: 0                       ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms])

Fixes: 30def61f64 ("perf parse-events Create two hybrid cache events")
Fixes: 94da591b1c ("perf parse-events Create two hybrid raw events")
Fixes: 9cbfa2f64c ("perf parse-events Create two hybrid hardware events")

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-04-25 12:34:24 +02:00
Michael Petlan acf52cbf64 perf tools: Factor out copy_config_terms() and free_config_terms()
Bugzilla: https://bugzilla.redhat.com/2069070

upstream
========
commit a7d212fc6c89d1619b9441f4c801cbff8ca34197
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Thu Sep 9 15:55:07 2021 +0300

description
===========
Factor out copy_config_terms() and free_config_terms() so that they can
be reused.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-04-25 12:34:24 +02:00
Jiri Olsa 2e6263ab54 libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()
Move the implementation of evlist__set_leader() to a new libperf
perf_evlist__set_leader() function with the same functionality make it a
libperf exported API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa 3a683120d8 libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups
Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group
interface to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa fba7c86601 libperf: Move 'leader' from tools/perf to perf_evsel::leader
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

  struct evsel *evsel__leader(struct evsel *evsel);
  - get leader evsel

  bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
  - true if evsel has leader as leader

  bool evsel__is_leader(struct evsel *evsel);
  - true if evsel is itw own leader

  void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
  - set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

  tools/perf/util/bpf_counter.c

  -       if (evsel->leader->core.nr_members > 1) {
  +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:31 -03:00
Jiri Olsa 38fe0e0156 libperf: Move 'idx' from tools/perf to perf_evsel::idx
Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.

Committer notes:

Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.

Also fixed up these:

$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c:                      evsel->idx + i);
tools/perf/ui/gtk/annotate.c:                   evsel->idx);
$

That running 'make -C tools/perf build-test' caught.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:28 -03:00
Arnaldo Carvalho de Melo 3b2f17ad17 perf parse-events: Check if the software events array slots are populated
To avoid a NULL pointer dereference when the kernel supports the new
feature but the tooling still hasn't an entry for it.

This happened with the recently added PERF_COUNT_SW_CGROUP_SWITCHES
software event.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/linux-perf-users/YKVESEKRjKtILhog@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 07:47:56 -03:00
Namhyung Kim fb6c79d726 perf tools: Add 'cgroup-switches' software event
It counts how often cgroups are changed actually during the context
switches.

  # perf stat -a -e context-switches,cgroup-switches -a sleep 1

   Performance counter stats for 'system wide':

              11,267      context-switches
              10,950      cgroup-switches

         1.015634369 seconds time elapsed

Committer notes:

The kernel patches landed in v5.13, but this entry wasn't filled in
perf's parse-events tables, which was leading to a segfault when running
'perf list' on a kernel with that feature, as reported by Thomas
Richter.

Also removed the part touching tools/include/uapi/linux/perf_event.h as
it was updated in the usual sync with the kernel UAPI headers, in a
previous, already upstream, patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210210083327.22726-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 14:23:23 -03:00
Jin Yao 5e4edd1f73 perf parse-events: Support event inside hybrid pmu
On hybrid platform, user may want to enable events on one pmu.

Following syntax are supported:

cpu_core/<event>/
cpu_atom/<event>/

But the syntax doesn't work for cache event.

Before:

  # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
  event syntax error: 'cpu_core/LLC-loads/'
                                \___ unknown term 'LLC-loads' for pmu 'cpu_core'

Cache events are a bit complex. We can't create aliases for them.
We use another solution. For example, if we use "cpu_core/LLC-loads/",
in parse_events_add_pmu(), term->config is "LLC-loads".

Then we create a new parser to scan "LLC-loads". The
parse_events_add_cache() would be called during parsing.
The parse_state->hybrid_pmu_name is used to identify the pmu
where the event should be enabled on.

After:

  # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1

   Performance counter stats for 'system wide':

              24,593      cpu_core/LLC-loads/

         1.003911601 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-13-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao c93afadc92 perf parse-events: Compare with hybrid pmu name
On hybrid platform, user may want to enable event only on one pmu.
Following syntax will be supported:

cpu_core/<event>/
cpu_atom/<event>/

For hardware event, hardware cache event and raw event, two events
are created by default. We pass the specified pmu name in parse_state
and it would be checked before event creation. So next only the
event with the specified pmu would be created.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao 30def61f64 perf parse-events: Create two hybrid cache events
For cache events, they have pre-defined configs. The kernel needs
to know where the cache event comes from (e.g. from cpu_core pmu
or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE
can't carry pmu information.

Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type.
The PMU type ID is stored at attr.config[63:32].

When enabling a hybrid cache event without specified pmu, such as,
'perf stat -e LLC-loads -a', two events are created
automatically. One is for atom, the other is for core.

  # perf stat -e LLC-loads -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  LLC-loads: 0: 1507 1001800280 1001800280
  LLC-loads: 1: 666 1001812250 1001812250
  LLC-loads: 2: 3353 1001813453 1001813453
  LLC-loads: 3: 514 1001848795 1001848795
  LLC-loads: 4: 627 1001952832 1001952832
  LLC-loads: 5: 4399 1001451154 1001451154
  LLC-loads: 6: 1240 1001481052 1001481052
  LLC-loads: 7: 478 1001520348 1001520348
  LLC-loads: 8: 691 1001551236 1001551236
  LLC-loads: 9: 310 1001578945 1001578945
  LLC-loads: 10: 1018 1001594354 1001594354
  LLC-loads: 11: 3656 1001622355 1001622355
  LLC-loads: 12: 882 1001661416 1001661416
  LLC-loads: 13: 506 1001693963 1001693963
  LLC-loads: 14: 3547 1001721013 1001721013
  LLC-loads: 15: 1399 1001734818 1001734818
  LLC-loads: 0: 1314 1001793826 1001793826
  LLC-loads: 1: 2857 1001752764 1001752764
  LLC-loads: 2: 646 1001830694 1001830694
  LLC-loads: 3: 1612 1001864861 1001864861
  LLC-loads: 4: 2244 1001912381 1001912381
  LLC-loads: 5: 1255 1001943889 1001943889
  LLC-loads: 6: 4624 1002021109 1002021109
  LLC-loads: 7: 2703 1001959302 1001959302
  LLC-loads: 24793 16026838264 16026838264
  LLC-loads: 17255 8015078826 8015078826

   Performance counter stats for 'system wide':

              24,793      cpu_core/LLC-loads/
              17,255      cpu_atom/LLC-loads/

         1.001970988 seconds time elapsed

0x4 in 0x400000002 indicates the cpu_core pmu.
0x8 in 0x800000002 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao 9cbfa2f64c perf parse-events: Create two hybrid hardware events
Current hardware events has special perf types PERF_TYPE_HARDWARE.
But it doesn't pass the PMU type in the user interface. For a hybrid
system, the perf kernel doesn't know which PMU the events belong to.

So now this type is extended to be PMU aware type. The PMU type ID
is stored at attr.config[63:32].

PMU type ID is retrieved from sysfs.

  root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
  8

  root@lkp-adl-d01:/sys/devices/cpu_core# cat type
  4

When enabling a hybrid hardware event without specified pmu, such as,
'perf stat -e cycles -a', two events are created automatically. One
is for atom, the other is for core.

  # perf stat -e cycles -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  cycles: 0: 836272 1001525722 1001525722
  cycles: 1: 628564 1001580453 1001580453
  cycles: 2: 872693 1001605997 1001605997
  cycles: 3: 70417 1001641369 1001641369
  cycles: 4: 88593 1001726722 1001726722
  cycles: 5: 470495 1001752993 1001752993
  cycles: 6: 484733 1001840440 1001840440
  cycles: 7: 1272477 1001593105 1001593105
  cycles: 8: 209185 1001608616 1001608616
  cycles: 9: 204391 1001633962 1001633962
  cycles: 10: 264121 1001661745 1001661745
  cycles: 11: 826104 1001689904 1001689904
  cycles: 12: 89935 1001728861 1001728861
  cycles: 13: 70639 1001756757 1001756757
  cycles: 14: 185266 1001784810 1001784810
  cycles: 15: 171094 1001825466 1001825466
  cycles: 0: 129624 1001854843 1001854843
  cycles: 1: 122533 1001840421 1001840421
  cycles: 2: 90055 1001882506 1001882506
  cycles: 3: 139607 1001896463 1001896463
  cycles: 4: 141791 1001907838 1001907838
  cycles: 5: 530927 1001883880 1001883880
  cycles: 6: 143246 1001852529 1001852529
  cycles: 7: 667769 1001872626 1001872626
  cycles: 6744979 16026956922 16026956922
  cycles: 1965552 8014991106 8014991106

   Performance counter stats for 'system wide':

           6,744,979      cpu_core/cycles/
           1,965,552      cpu_atom/cycles/

         1.001882711 seconds time elapsed

0x4 in 0x400000000 indicates the cpu_core pmu.
0x8 in 0x800000000 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao 12279429d8 perf stat: Uniquify hybrid event name
It would be useful to let user know the pmu which the event belongs to.
perf-stat has supported '--no-merge' option and it can print the pmu
name after the event name, such as:

"cycles [cpu_core]"

Now this option is enabled by default for hybrid platform but change
the format to:

"cpu_core/cycles/"

If user configs the name, we still use the user specified name.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
ink: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Song Liu 01bd8efcec perf stat: Introduce ':b' modifier
Introduce 'b' modifier to event parser, which means use BPF program to
manage this event. This is the same as --bpf-counters option, but only
applies to this event. For example,

  perf stat -e cycles:b,cs               # use bpf for cycles, but not cs
  perf stat -e cycles,cs --bpf-counters  # use bpf for both cycles and cs

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-5-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:58 -03:00
Arnaldo Carvalho de Melo b0a752d43b Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes sent via perf/urgent and in the BPF tools/ directories.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-29 10:39:10 -03:00
Ingo Molnar 4d39c89f0b perf tools: Fix various typos in comments
Fix ~124 single-word typos and a few spelling errors in the perf tooling code,
accumulated over the years.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210321113734.GA248990@gmail.com
Link: http://lore.kernel.org/lkml/20210323160915.GA61903@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-23 17:13:43 -03:00
Jin Yao e40647762f perf pmu: Validate raw event with sysfs exported format bits
A raw PMU event (eventsel+umask) in the form of rNNN is supported
by perf but lacks of checking for the validity of raw encoding.

For example, bit 16 and bit 17 are not valid on KBL but perf doesn't
report warning when encoding with these bits.

Before:

  # ./perf stat -e cpu/r031234/ -a -- sleep 1

   Performance counter stats for 'system wide':

                   0      cpu/r031234/

         1.003798924 seconds time elapsed

It may silently measure the wrong event!

The kernel supported bits have been exported through
/sys/devices/<pmu>/format/. Perf collects the information to
'struct perf_pmu_format' and links it to 'pmu->format' list.

The 'struct perf_pmu_format' has a bitmap which records the
valid bits for this format. For example,

  root@kbl-ppc:/sys/devices/cpu/format# cat umask
  config:8-15

The valid bits (bit8-bit15) are recorded in bitmap of format 'umask'.

We collect total valid bits of all formats, save to a local variable
'masks' and reverse it. Now '~masks' represents total invalid bits.

bits = config & ~masks;

The set bits in 'bits' indicate the invalid bits used in config.
Finally we use bitmap_scnprintf to report the invalid bits.

Some architectures may not export supported bits through sysfs,
so if masks is 0, perf_pmu__warn_invalid_config directly returns.

After:

Single event without name:

  # ./perf stat -e cpu/r031234/ -a -- sleep 1
  WARNING: event 'N/A' not valid (bits 16-17 of config '31234' not supported by kernel)!

   Performance counter stats for 'system wide':

                   0      cpu/r031234/

         1.001597373 seconds time elapsed

Multiple events with names:

  # ./perf stat -e cpu/rf01234,name=aaa/,cpu/r031234,name=bbb/ -a -- sleep 1
  WARNING: event 'aaa' not valid (bits 20,22 of config 'f01234' not supported by kernel)!
  WARNING: event 'bbb' not valid (bits 16-17 of config '31234' not supported by kernel)!

   Performance counter stats for 'system wide':

                   0      aaa
                   0      bbb

         1.001573787 seconds time elapsed

Warnings are reported for invalid bits.

Co-developed-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210310051138.12154-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-15 10:12:02 -03:00
Arnaldo Carvalho de Melo e414fd1a3f perf evlist: Use the right prefix for 'struct evlist' evsel list methods
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-30 14:52:44 -03:00
Arnaldo Carvalho de Melo a622eafa1a perf evlist: Use the right prefix for 'struct evlist' methods: evlist__set_leader()
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-30 09:22:07 -03:00
Arnaldo Carvalho de Melo c18cf78d79 perf bpf: Enclose libbpf.h include within HAVE_LIBBPF_SUPPORT
As it uses the 'deprecated' attribute in a way that breaks the build
with old gcc compilers, so to continue being able to build in such
systems where NO_LIBBPF=1 is being used, enclose it under
HAVE_LIBBPF_SUPPORT.

   1 centos:6          : FAIL gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
   2 oraclelinux:6     : FAIL gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)

    CC       /tmp/build/perf/builtin-record.o
  In file included from util/bpf-loader.h:11,
                   from builtin-record.c:39:
  /git/linux/tools/lib/bpf/libbpf.h:203: error: wrong number of arguments specified for 'deprecated' attribute

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-04 09:42:40 -03:00
Andi Kleen 0997a2662f perf tools: Add support for exclusive groups/events
Peter suggested that using the exclusive mode in perf could avoid some
problems with bad scheduling of groups. Exclusive is implemented in the
kernel, but wasn't exposed by the perf tool, so hard to use without
custom low level API users.

Add support for marking groups or events with :e for exclusive in the
perf tool.  The implementation is basically the same as the existing
pinned attribute.

Committer testing:

  # perf test "parse event"
   6: Parse event definition strings                                  : Ok
  # perf test -v "parse event" |& grep :u*e
  running test 56 'instructions:uep'
  running test 57 '{cycles,cache-misses,branch-misses}:e'
  #
  #
  # grep "model name" -m1 /proc/cpuinfo
  model name	: AMD Ryzen 9 3900X 12-Core Processor
  #
  # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1

   Performance counter stats for 'system wide':

       <not counted>      cycles                                                        (0.00%)
       <not counted>      cache-misses                                                  (0.00%)
       <not counted>      branch-misses                                                 (0.00%)

         1.001269893 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
  	echo 0 > /proc/sys/kernel/nmi_watchdog
  	perf stat ...
  	echo 1 > /proc/sys/kernel/nmi_watchdog
  # echo 0 > /proc/sys/kernel/nmi_watchdog
  # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1

   Performance counter stats for 'system wide':

       1,298,663,141      cycles
          30,962,215      cache-misses
           5,325,150      branch-misses

         1.001474934 seconds time elapsed

  #
  # The output for asking for precise events on AMD needs to improve, it
  # supposedly works only for system wide or per CPU
  #
  # perf stat -a -e '{cycles,cache-misses,branch-misses}:uep' sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
  /bin/dmesg | grep -i perf may provide additional information.

  # perf stat -a -e '{cycles,cache-misses,branch-misses}:ue' sleep 1

   Performance counter stats for 'system wide':

         746,363,126      cycles
          16,881,611      cache-misses
           2,871,259      branch-misses

         1.001636066 seconds time elapsed

  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20201014144255.22699-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 12:24:28 -03:00
Arnaldo Carvalho de Melo dbaa1b3d9a Merge branch 'perf/urgent' into perf/core
To pick fixes that missed v5.9.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 13:02:20 -03:00
Ian Rogers aa98d8482c perf parse-events: Reduce casts around bp_addr
perf_event_attr bp_addr is a u64. parse-events.y parses it as a u64, but
casts it to a void* and then parse-events.c casts it back to a u64.
Rather than all the casts, change the type of the address to be a u64.

This removes an issue noted in:

  https://lore.kernel.org/lkml/20200903184359.GC3495158@kernel.org/

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200925003903.561568-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 09:22:39 -03:00