659 Commits

Author SHA1 Message Date
Michael Petlan 03ff35bf05 perf record: Fix cpu mask bit setting for mixed mmaps
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit ca76d7d2812b46124291f99c9b50aaf63a936f23
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Thu Sep 15 15:26:11 2022 +0300

description
===========
With mixed per-thread and (system-wide) per-cpu maps, the "any cpu" value
 -1 must be skipped when setting CPU mask bits.

Prior to commit cbd7bfc7fd99acdd ("tools/perf: Fix out of bound access
to cpu mask array") the invalid setting went unnoticed, but since then
it causes perf record to fail with an error.
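
The rule described above can be modelled in a few lines. This is an illustrative Python sketch only, not perf's C code; the names `ANY_CPU` and `cpu_mask_bits` are hypothetical and the mask is modelled as a plain integer.

```python
ANY_CPU = -1  # hypothetical name for the "any cpu" placeholder value


def cpu_mask_bits(cpus):
    """Return an integer bitmask with one bit set per real CPU in `cpus`.

    The "any cpu" entry (-1) carries no CPU number, so it is skipped
    instead of being turned into a bit index.
    """
    mask = 0
    for cpu in cpus:
        if cpu == ANY_CPU:
            continue  # per-thread entry: nothing to set in the mask
        mask |= 1 << cpu
    return mask
```

A mixed map such as `[-1, 0, 2]` then yields a mask with only bits 0 and 2 set.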

Example:

 Before:

   $ perf record -e intel_pt// --per-thread uname
   Failed to initialize parallel data streaming masks

 After:

   $ perf record -e intel_pt// --per-thread uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.068 MB perf.data ]

Fixes: ae4f8ae16a078964 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Acked-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20220915122612.81738-2-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:26:06 +01:00
Michael Petlan 946d524ed7 perf record: Fix synthesis failure warnings
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit faf59ec8c3c3708c64ff76b50e6f757c6b4a1054
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Wed Sep 7 19:24:58 2022 +0300

description
===========
Some calls to synthesis functions set err < 0 but only warn about the
failure and continue.  However they do not set err back to zero, relying
on subsequent code to do that.

That changed with the introduction of option --synth. With --synth=no,
subsequent functions that set err back to zero are not called.

Fix by setting err = 0 in those cases.
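
A minimal sketch of the warn-and-continue pattern, written in Python rather than perf's C. The helper name `run_synthesis` and the `(name, step)` tuples are assumptions made for illustration.

```python
def run_synthesis(steps, log):
    """Run each synthesis step; a failing step only produces a warning.

    `steps` is a list of (name, callable) pairs, where the callable
    returns 0 on success and a negative value on failure.
    """
    err = 0
    for name, step in steps:
        err = step()
        if err < 0:
            log.append(f"Couldn't synthesize {name}.")
            err = 0  # non-fatal: do not rely on later code to clear it
    return err
```

Resetting `err` at the point of the warning means the result no longer depends on which later steps happen to run.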

Example:

 Before:

   $ perf record --no-bpf-event --synth=all -o /tmp/huh uname
   Couldn't synthesize bpf events.
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ]
   $ perf record --no-bpf-event --synth=no -o /tmp/huh uname
   Couldn't synthesize bpf events.

 After:

   $ perf record --no-bpf-event --synth=no -o /tmp/huh uname
   Couldn't synthesize bpf events.
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ]

Fixes: 41b740b6e8a994e5 ("perf record: Add --synth option")
    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Acked-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220907162458.72817-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:26:05 +01:00
Michael Petlan f0e330e6b6 tools/perf: Fix out of bound access to cpu mask array
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit cbd7bfc7fd99acdde58ec2b0bce990158fba1654
Author: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Date: Mon Sep 5 19:49:29 2022 +0530

description
===========
The cpu mask init code in the "record__mmap_cpu_mask_init" function accesses
the "bits" array that is part of "struct mmap_cpu_mask".  The size of this
array is the value from cpu__max_cpu().cpu.  This array is used to contain the
cpumask value for each cpu. While setting the bit for each cpu, it calls the
"set_bit" function, which accesses an index in the "bits" array.

If we provide a command line option to -C which is greater than the
number of CPUs present in the system, set_bit could access an array
member that is outside the array bounds. This is because there is currently
no boundary check for the CPU. This results in a segfault:

<<>>
  ./perf record -C 12341234 ls
  Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS
  Segmentation fault (core dumped)
<<>>

Debugging with gdb points to the call flow below:

<<>>
  set_bit
  record__mmap_cpu_mask_init
  record__init_thread_default_masks
  record__init_thread_masks
  cmd_record
<<>>

Fix this by adding a boundary check for the array.
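
The idea of the boundary check can be sketched as follows. This is a Python model, not the actual patch: `mask_alloc` and `mask_set_cpu` are hypothetical helpers, and returning `-ENODEV` is an assumption about a reasonable error code.

```python
import errno

BITS_PER_WORD = 64  # assumed word size of the modelled "bits" array


def mask_alloc(nbits):
    """Allocate a zeroed word array large enough to hold `nbits` bits."""
    return [0] * ((nbits + BITS_PER_WORD - 1) // BITS_PER_WORD)


def mask_set_cpu(bits, nbits, cpu):
    """Set the bit for `cpu`, refusing indexes outside the mask."""
    if cpu < 0 or cpu >= nbits:
        return -errno.ENODEV  # assumed error code for a non-existing CPU
    bits[cpu // BITS_PER_WORD] |= 1 << (cpu % BITS_PER_WORD)
    return 0
```

With the check in place, an out-of-range CPU number is reported to the caller instead of indexing past the end of the array.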

After the patch:

<<>>
  ./perf record -C 12341234 ls
  Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS
  Failed to initialize parallel data streaming masks
<<>>

With this fix, if -C is given a non-existing CPU, perf
record will fail with:

<<>>
  ./perf record -C 50 ls
  Failed to initialize parallel data streaming masks
<<>>

    Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
    Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lore.kernel.org/r/20220905141929.7171-2-atrajeev@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:26:05 +01:00
Michael Petlan 563c7868cf perf record: Improve error message of -p not_existing_pid
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 1bf7d836e57ba4943e33e163e730cd77ab837572
Author: Martin Liška <mliska@suse.cz>
Date: Fri Aug 12 13:40:49 2022 +0200

description
===========
When one uses -p $not_existing_pid, the output of --help is printed:

  $ perf record -p 123456789 2>&1 | head -n3

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

Let's change it to something similar to what perf top -p $not_existing_pid
prints:

  $ ./perf top -p 123456789 --stdio
  Error:
  Couldn't create thread/CPU maps: No such process

Newly suggested error message:

  $ ./perf record -p 123456789
  Couldn't create thread/CPU maps: No such process

    Signed-off-by: Martin Liška <mliska@suse.cz>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Link: http://lore.kernel.org/lkml/8e00eda1-4de0-2c44-ce67-d4df48ac1f7c@suse.cz
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:26:00 +01:00
Michael Petlan d7493e1c75 perf record: Add finished init event
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 3812d2987733c5a00e103be4e23d63ec9342043a
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Fri Jun 10 14:33:15 2022 +0300

description
===========
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.

This is needed to enable injecting events after the initial synthesized
user events (that have an all zero id sample) but before regular events.
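
One way to picture the role of the marker is the sketch below. It is a Python illustration under stated assumptions: records are modelled as plain strings and `inject_after_init` is a hypothetical helper, not the perf inject implementation.

```python
FINISHED_INIT = "FINISHED_INIT"  # stand-in for PERF_RECORD_FINISHED_INIT


def inject_after_init(records, injected):
    """Return a new list with `injected` placed right after the marker.

    Everything before the marker is initial synthesized data; everything
    after it is regular event data.
    """
    for i, rec in enumerate(records):
        if rec == FINISHED_INIT:
            return records[: i + 1] + list(injected) + records[i + 1 :]
    raise ValueError("no FINISHED_INIT marker in the stream")
```

Without such a marker there is no reliable boundary between the synthesized events and the regular ones.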

Committer notes:

Add entry about PERF_RECORD_FINISHED_INIT to
tools/perf/Documentation/perf.data-file-format.txt.

Committer testing:

Before:

  # perf report -D | grep FINISHED
  0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND
    FINISHED_ROUND events:          1  ( 0.5%)
  #

After:

  # perf record -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
  # perf report -D | grep FINISHED
  0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled!
  0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND
    FINISHED_ROUND events:          1  ( 0.5%)
     FINISHED_INIT events:          1  ( 0.5%)
  #

    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:42 +01:00
Michael Petlan 105c09c919 perf record: Add new option to sample identifier
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 61110883a02090cb5fd1f890978e238cc99f0164
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Wed Jun 15 08:25:11 2022 +0300

description
===========
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.

Add an option to always include sample type PERF_SAMPLE_IDENTIFIER.

Committer testing:

  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
  # perf evlist -v
  cycles: size: 128, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #
  #
  # perf record --sample-identifier sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.022 MB perf.data (7 samples) ]
  # perf evlist -v
  cycles: size: 128, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #
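
The difference between the two `sample_type` strings above is a single flag bit. The Python sketch below models only the five flags that appear in that output, using the bit positions from the perf_event UAPI header; the `SampleType` class and the helper names are hypothetical.

```python
from enum import IntFlag


class SampleType(IntFlag):
    """Subset of the PERF_SAMPLE_* bits shown in the evlist output."""
    IP = 1 << 0
    TID = 1 << 1
    TIME = 1 << 2
    PERIOD = 1 << 8
    IDENTIFIER = 1 << 16


def describe(sample_type):
    """Render a sample_type value the way `perf evlist -v` prints it."""
    return "|".join(flag.name for flag in SampleType if flag in sample_type)


def with_identifier(sample_type):
    """Model the new option: always include the identifier bit."""
    return sample_type | SampleType.IDENTIFIER
```

Applying `with_identifier` to the default flags and passing the result to `describe` reproduces the second string.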

    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220615052511.4441-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:42 +01:00
Michael Petlan bfde98a192 perf record: Always record id index
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit 6b080312fc821658a479e74bdb0c3f7d9ac5838f
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Fri Jun 10 14:33:13 2022 +0300

description
===========
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.

Adjust the logic so that if there are IDs then the id index is recorded.

    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220610113316.6682-3-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:42 +01:00
Michael Petlan 7d029575f4 perf record: Always get text_poke events with --kcore option
Bugzilla: https://bugzilla.redhat.com/2123229

upstream
========
commit f42c0ce573df79d1b8bd169008c994dcdd43585a
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Fri Jun 10 14:33:12 2022 +0300

description
===========
kcore provides a copy of the running kernel including any modified code.
A trace that benefits from that also benefits from text_poke events, so
enable them.

    Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20220610113316.6682-2-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-11-14 20:25:42 +01:00
Michael Petlan b5a8dd0e08 perf record: Add cgroup support for off-cpu profiling
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 685439a7a037d8677e3d1acf0302624002ee6a6d
Author: Namhyung Kim <namhyung@kernel.org>
Date: Wed May 18 15:47:24 2022 -0700

description
===========
This covers two different use cases.  The first one is cgroup
filtering given by -G/--cgroup option which controls the off-cpu
profiling for tasks in the given cgroups only.

The other use case is cgroup sampling which is enabled by
--all-cgroups option and it adds PERF_SAMPLE_CGROUP to the sample_type
to set the cgroup id of the task in the sample data.

Example output.

  $ sudo perf record -a --off-cpu --all-cgroups sleep 1

  $ sudo perf report --stdio -s comm,cgroup --call-graph=no
  ...
  # Samples: 144  of event 'offcpu-time'
  # Event count (approx.): 48452045427
  #
  # Children      Self  Command          Cgroup
  # ........  ........  ...............  ..........................................
  #
      61.57%     5.60%  Chrome_ChildIOT  /user.slice/user-657345.slice/user@657345.service/app.slice/...
      29.51%     7.38%  Web Content      /user.slice/user-657345.slice/user@657345.service/app.slice/...
      17.48%     1.59%  Chrome_IOThread  /user.slice/user-657345.slice/user@657345.service/app.slice/...
      16.48%     4.12%  pipewire-pulse   /user.slice/user-657345.slice/user@657345.service/session.slice/...
      14.48%     2.07%  perf             /user.slice/user-657345.slice/user@657345.service/app.slice/...
      14.30%     7.15%  CompositorTileW  /user.slice/user-657345.slice/user@657345.service/app.slice/...
      13.33%     6.67%  Timer            /user.slice/user-657345.slice/user@657345.service/app.slice/...
  ...

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:10 +02:00
Michael Petlan c52b642036 perf record: Implement basic filtering for off-cpu
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 10742d0c0771d9fb0329d03bb7c7620c8738f065
Author: Namhyung Kim <namhyung@kernel.org>
Date: Wed May 18 15:47:22 2022 -0700

description
===========
It should honor cpu and task filtering with -a, -C or -p, -t options.

Committer testing:

  # perf record --off-cpu --cpu 1 perf bench sched messaging -l 1000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 1.722 [sec]
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 1.446 MB perf.data (7248 samples) ]
  #
  # perf script | head -20
              perf 97164 [001] 38287.696761:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
              perf 97164 [001] 38287.696764:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
              perf 97164 [001] 38287.696765:          9      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
              perf 97164 [001] 38287.696767:        212      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
              perf 97164 [001] 38287.696768:       5130      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
              perf 97164 [001] 38287.696770:     123063      cycles:  ffffffffb6e0011e syscall_return_via_sysret+0x38 (vmlinux)
              perf 97164 [001] 38287.696803:    2292748      cycles:  ffffffffb636c82d __fput+0xad (vmlinux)
           swapper     0 [001] 38287.702852:    1927474      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
            :97513 97513 [001] 38287.767207:    1172536      cycles:  ffffffffb612ff65 newidle_balance+0x5 (vmlinux)
           swapper     0 [001] 38287.769567:    1073081      cycles:  ffffffffb618216d ktime_get_mono_fast_ns+0xd (vmlinux)
            :97533 97533 [001] 38287.770962:     984460      cycles:  ffffffffb65b2900 selinux_socket_sendmsg+0x0 (vmlinux)
            :97540 97540 [001] 38287.772242:     883462      cycles:  ffffffffb6d0bf59 irqentry_exit_to_user_mode+0x9 (vmlinux)
           swapper     0 [001] 38287.773633:     741963      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
            :97552 97552 [001] 38287.774539:     606680      cycles:  ffffffffb62eda0a page_add_file_rmap+0x7a (vmlinux)
            :97556 97556 [001] 38287.775333:     502254      cycles:  ffffffffb634f964 get_obj_cgroup_from_current+0xc4 (vmlinux)
            :97561 97561 [001] 38287.776163:     427891      cycles:  ffffffffb61b1522 cgroup_rstat_updated+0x22 (vmlinux)
           swapper     0 [001] 38287.776854:     359030      cycles:  ffffffffb612fc5e load_balance+0x9ce (vmlinux)
            :97567 97567 [001] 38287.777312:     330371      cycles:  ffffffffb6a8d8d0 skb_set_owner_w+0x0 (vmlinux)
            :97566 97566 [001] 38287.777589:     311622      cycles:  ffffffffb614a7a8 native_queued_spin_lock_slowpath+0x148 (vmlinux)
            :97512 97512 [001] 38287.777671:     307851      cycles:  ffffffffb62e0f35 find_vma+0x55 (vmlinux)
  #
  # perf record --off-cpu --cpu 4 perf bench sched messaging -l 1000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 1.613 [sec]
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 1.415 MB perf.data (6729 samples) ]
  # perf script | head -20
              perf 97650 [004] 38323.728036:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
              perf 97650 [004] 38323.728040:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
              perf 97650 [004] 38323.728041:          9      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
              perf 97650 [004] 38323.728042:        208      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
              perf 97650 [004] 38323.728044:       5026      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
              perf 97650 [004] 38323.728046:     119970      cycles:  ffffffffb6d0bebc syscall_exit_to_user_mode+0x1c (vmlinux)
              perf 97650 [004] 38323.728078:    2190103      cycles:            54b756 perf_tool__process_synth_event+0x16 (/home/acme/bin/perf)
           swapper     0 [004] 38323.783357:    1593139      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
           swapper     0 [004] 38323.785352:    1593139      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
           swapper     0 [004] 38323.797330:    1418936      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
           swapper     0 [004] 38323.802350:    1418936      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
           swapper     0 [004] 38323.806333:    1418936      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
            :97996 97996 [004] 38323.807145:    1418936      cycles:      7f5db9be6917 [unknown] ([unknown])
            :97959 97959 [004] 38323.807730:    1445074      cycles:  ffffffffb6329d36 memcg_slab_post_alloc_hook+0x146 (vmlinux)
            :97959 97959 [004] 38323.808103:    1341584      cycles:  ffffffffb62fd90f get_page_from_freelist+0x112f (vmlinux)
            :97959 97959 [004] 38323.808451:    1227537      cycles:  ffffffffb65b2905 selinux_socket_sendmsg+0x5 (vmlinux)
            :97959 97959 [004] 38323.808768:    1184321      cycles:  ffffffffb6d1ba35 _raw_spin_lock_irqsave+0x15 (vmlinux)
            :97959 97959 [004] 38323.809073:    1153017      cycles:  ffffffffb6a8d92d skb_set_owner_w+0x5d (vmlinux)
            :97959 97959 [004] 38323.809402:    1126875      cycles:  ffffffffb6329c64 memcg_slab_post_alloc_hook+0x74 (vmlinux)
            :97959 97959 [004] 38323.809695:    1073248      cycles:  ffffffffb6e0001d entry_SYSCALL_64+0x1d (vmlinux)
  #

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:10 +02:00
Michael Petlan b22b83620d perf record: Enable off-cpu analysis with BPF
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit edc41a1099c2d08ccfd4ed7d59688501e3749015
Author: Namhyung Kim <namhyung@kernel.org>
Date: Wed May 18 15:47:21 2022 -0700

description
===========
Add --off-cpu option to enable the off-cpu profiling with BPF.  It'd
use a bpf_output event and rename it to "offcpu-time".  Samples will
be synthesized at the end of the record session using data from a BPF
map which contains the aggregated off-cpu time at context switches.
So it needs root privilege to get the off-cpu profiling.

Each sample will have a separate user stacktrace so it will skip
kernel threads.  The sample ip will be set from the stacktrace and
other sample data will be updated accordingly.  Currently it only
handles some basic sample types.

The sample timestamp is set to a dummy value so that it does not interfere
with other events during sorting.  It starts at a very large initial value
and is increased as each sample is processed.

A good thing is that it can be used together with regular profiling like
cpu cycles.  If you don't want that, you can use a dummy event to
enable off-cpu profiling only.
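
The aggregation at context switches can be illustrated with a small user-space model. This Python sketch is not the BPF program: the event tuple layout and the name `aggregate_offcpu` are assumptions, and stack traces are reduced to a single label.

```python
from collections import defaultdict


def aggregate_offcpu(switches):
    """Sum off-cpu time per (tid, stack) from context-switch events.

    Each event is (timestamp_ns, prev_tid, next_tid, prev_stack): the
    task `prev_tid` leaves the CPU with stack `prev_stack`, and the task
    `next_tid` gets back on.
    """
    went_off = {}              # tid -> (timestamp, stack) when it left the CPU
    totals = defaultdict(int)  # (tid, stack) -> accumulated off-cpu ns
    for ts, prev_tid, next_tid, prev_stack in switches:
        went_off[prev_tid] = (ts, prev_stack)
        if next_tid in went_off:
            start, stack = went_off.pop(next_tid)
            totals[(next_tid, stack)] += ts - start
    return dict(totals)
```

Each entry of the resulting map corresponds to one synthesized sample, which is why the samples can only be produced at the end of the session.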

Example output:
  $ sudo perf record --off-cpu perf bench sched messaging -l 1000

  $ sudo perf report --stdio --call-graph=no
  # Total Lost Samples: 0
  #
  # Samples: 41K of event 'cycles'
  # Event count (approx.): 42137343851
  ...

  # Samples: 1K of event 'offcpu-time'
  # Event count (approx.): 587990831640
  #
  # Children      Self  Command          Shared Object       Symbol
  # ........  ........  ...............  ..................  .........................
  #
      81.66%     0.00%  sched-messaging  libc-2.33.so        [.] __libc_start_main
      81.66%     0.00%  sched-messaging  perf                [.] cmd_bench
      81.66%     0.00%  sched-messaging  perf                [.] main
      81.66%     0.00%  sched-messaging  perf                [.] run_builtin
      81.43%     0.00%  sched-messaging  perf                [.] bench_sched_messaging
      40.86%    40.86%  sched-messaging  libpthread-2.33.so  [.] __read
      37.66%    37.66%  sched-messaging  libpthread-2.33.so  [.] __write
       2.91%     2.91%  sched-messaging  libc-2.33.so        [.] __poll
  ...

As you can see it spent most of off-cpu time in read and write in
bench_sched_messaging().  The --call-graph=no was added just to make
the output concise here.

It uses perf hooks facility to control BPF program during the record
session rather than adding new BPF/off-cpu specific calls.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:10 +02:00
Michael Petlan 88fa1aca8c perf tools: Allow all_cpus to be a superset of user_requested_cpus
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 7be1fedd2a0a5b8f20952a675c611815254b74b6
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Tue May 24 10:54:30 2022 +0300

description
===========
To support collection of system-wide events with user requested CPUs,
all_cpus must be a superset of user_requested_cpus.

To allow all_cpus to be a superset of user_requested_cpus, all_cpus must be
used instead of user_requested_cpus when dealing with the CPUs of all events
rather than the CPUs of requested events.
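
The relationship between the two maps can be shown with sets. This is a Python sketch under the assumption that each event's CPU map is a set of CPU numbers; `all_cpus` here is a hypothetical helper named after the struct member.

```python
def all_cpus(evsel_cpu_maps):
    """Union of the CPU maps of all events."""
    result = set()
    for cpus in evsel_cpu_maps:
        result |= set(cpus)
    return result
```

For example, with user events on CPUs {1, 2} and a system-wide event on CPUs {0, 1, 2, 3}, the union is {0, 1, 2, 3}, a strict superset of the user requested CPUs. Code that iterates over the CPUs of every event therefore has to use the union.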

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:09 +02:00
Michael Petlan a90760a767 perf record: Use evlist__add_dummy_on_all_cpus() in record__config_text_poke()
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 921e3be5a5648f483f80c9ba21ca2942d82d581c
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Tue May 24 10:54:27 2022 +0300

description
===========
Use evlist__add_dummy_on_all_cpus() in record__config_text_poke() in
preparation for allowing system-wide events on all CPUs while the user
requested events are on only user requested CPUs.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:09 +02:00
Michael Petlan 1f17466f8e perf cpumap: Switch to using perf_cpu_map API
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 0255571a16059c8e863a65a4b1611db93bb9b3ae
Author: Ian Rogers <irogers@google.com>
Date: Mon May 2 21:17:52 2022 -0700

description
===========
Switch some raw accesses to the cpu map to using the library API. This
can help with reference count checking. Some BPF cases switch from index
to CPU for consistency; this shouldn't matter as the CPU map is full.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:23:03 +02:00
Michael Petlan f63d52f364 perf record: Fix per-thread option
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 23380e4d53305765789fd1a2cf6bddb07239cd3b
Author: Alexey Bayduraev <alexey.bayduraev@gmail.com>
Date: Wed Apr 13 18:46:40 2022 -0700

description
===========
Per-thread mode doesn't have specific CPUs for events, add checks for
this case.

Minor fix to a pr_debug by Ian Rogers <irogers@google.com> to avoid an
out of bound array access.

Fixes: 7954f71689f90cb2 ("perf record: Introduce thread affinity and mmap masks")

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:57 +02:00
Michael Petlan 1ed24eddc8 perf evlist: Rename cpus to user_requested_cpus
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 0df6ade7119daa40904b0c18871169e753663e14
Author: Ian Rogers <irogers@google.com>
Date: Mon Mar 28 16:26:44 2022 -0700

description
===========
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
of all evsels.

For non-task targets, cpus is set to be cpus requested from the command
line, defaulting to all online cpus if no cpus are specified.

For an uncore event, all_cpus may be just CPU 0 or every online CPU.

This causes all_cpus to have fewer values than the cpus variable which
is confusing given the 'all' in the name.

To try to make the behavior clearer, rename cpus to user_requested_cpus
and add comments on the two struct variables.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:55 +02:00
Michael Petlan 46ba8a7118 perf data: Adding error message if perf_data__create_dir() fails
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 65e7c963267f128df155f496a50933cea7dfa5b8
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Tue Feb 22 12:14:17 2022 +0300

description
===========
Add proper return codes for all cases of data directory creation failure
and add error message output based on these codes.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:52 +02:00
Michael Petlan b6823ef9b1 perf record: Implement compatibility checks
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit b5f2511d4b3976e352b47b79c3c119addb7c2033
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:34 2022 +0300

description
===========
Implement compatibility checks for other modes and related command line
options: asynchronous (--aio) trace streaming and affinity (--affinity)
modes, pipe mode, AUX area tracing --snapshot and --aux-sample options,
--switch-output, --switch-output-event, --switch-max-files and
--timestamp-filename options. Parallel data streaming is compatible with
Zstd compression (--compression-level) and external control commands
(--control). CPU mask provided via -C option filters --threads
specification masks.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan 5a53f236d5 perf record: Extend --threads command line option
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit f466e5ed6c356d1dc22dda68f46315a92ec160c6
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:33 2022 +0300

description
===========
Extend --threads option in perf record command line interface.
The option can have a value in the form of masks that specify
CPUs to be monitored with data streaming threads and its layout
in system topology. The masks can be filtered using CPU mask
provided via -C option.

The specification value can be a user-defined list of masks. Masks
separated by colon define CPUs to be monitored by one thread and
the affinity mask of that thread is separated by slash. For example:
<cpus mask 1>/<affinity mask 1>:<cpu mask 2>/<affinity mask 2>
specifies a parallel threads layout that consists of two threads
with corresponding assigned CPUs to be monitored.
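
The user-defined form can be parsed along these lines. This Python sketch assumes that each mask is written as a CPU list such as "0,2-4"; the functions `parse_cpu_list` and `parse_threads_spec` are hypothetical and do not reproduce perf's validation or error handling.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "0,2-4" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_threads_spec(spec):
    """Split "<cpus>/<affinity>:<cpus>/<affinity>" into per-thread pairs."""
    layout = []
    for entry in spec.split(":"):
        maps, affinity = entry.split("/")
        layout.append((parse_cpu_list(maps), parse_cpu_list(affinity)))
    return layout
```

Under these assumptions, a value like "0,2-4/2-4:1,5-7/5-7" describes two threads: the first monitors CPUs 0, 2, 3 and 4 and runs on CPUs 2 to 4, the second monitors CPUs 1, 5, 6 and 7 and runs on CPUs 5 to 7.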

The specification value can be a string e.g. "cpu", "core" or
"package" meaning creation of data streaming thread for every
CPU or core or package to monitor distinct CPUs or CPUs grouped
by core or package.

The option provided with no or empty value defaults to per-cpu
parallel threads layout creating data streaming thread for every
CPU being monitored.

Document --threads option syntax and parallel data streaming modes
in Documentation/perf-record.txt.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan d12e83820a perf record: Introduce --threads command line option
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 06380a849fa89da33d309597890ef26d24095b41
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:32 2022 +0300

description
===========
Provide --threads option in perf record command line interface.
The option creates a data streaming thread for each CPU in the system.
Document --threads option in Documentation/perf-record.txt.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan 59f5923ba3 perf record: Introduce data transferred and compressed stats
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 610fbc016531b7a09dcc98febd2a8f4a0cdd3190
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:31 2022 +0300

description
===========
Introduce bytes_transferred and bytes_compressed stats so they
would capture statistics for the related data buffer transfers.

[ Use PRiu64 to print u64 values, fixing the build on 32-bit architectures ]

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan e7a354e033 perf record: Introduce compressor at mmap buffer object
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 75f5f1fcb9c0f0f542f44d993de18047b2b7f37f
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:30 2022 +0300

description
===========
Introduce compressor object into mmap object so it could be used to
pack the data stream from the corresponding kernel data buffer.
Initialize and make use of the introduced per mmap compressor.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan 9cef0e1b26 perf record: Introduce bytes written stats
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit ae9c7242b29fa2976c70b5b250f8942cf7289211
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:29 2022 +0300

description
===========
Introduce a function to calculate the total amount of data written
and use it to support the --max-size option.
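
A rough Python model of this accounting follows. The function names and the split into a main-thread counter plus per-thread counters are assumptions made for illustration, and a limit of 0 is taken to mean "no limit".

```python
def total_bytes_written(main_bytes, thread_bytes):
    """Total output so far: main thread plus every streaming thread."""
    return main_bytes + sum(thread_bytes)


def max_size_reached(main_bytes, thread_bytes, max_size):
    """True once a configured size limit has been reached."""
    if max_size == 0:
        return False  # assumed convention: 0 disables the limit
    return total_bytes_written(main_bytes, thread_bytes) >= max_size
```

With several writers, a size limit only makes sense against the combined total, which is why a dedicated helper is needed.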

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan bc0a90e384 perf record: Introduce data file at mmap buffer object
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 56f735fff35e31e54027df36a653b0268bc94f06
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:28 2022 +0300

description
===========
Introduce data file objects into the mmap object so they can be used to
process and store the data stream from the corresponding kernel data buffer.
Initialize data files located at mmap buffer objects so trace data
can be written into several data files located in the data directory.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan eb10d57999 perf record: Start threads in the beginning of trace streaming
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 3217e9fecf118d5dcabdd68d91e0c6afcb4c3e1b
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:27 2022 +0300

description
===========
Start thread in detached state because its management is implemented
via messaging to avoid any scaling issues. Block signals prior to thread
start so only main tool thread would be notified on external async
signals during data collection. Thread affinity mask is used to assign
eligible CPUs for the thread to run. Wait and sync on thread start using
thread ack pipe.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:46 +02:00
Michael Petlan c2f92debce perf record: Stop threads in the end of trace streaming
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 1e5de7d9c6ded0722736eb6e58c72b18937efc06
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:26 2022 +0300

description
===========
Signal thread to terminate by closing write fd of msg pipe.
Receive THREAD_MSG__READY message as the confirmation of the
thread's termination. Stop threads created for parallel trace
streaming prior to their stats processing.
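
The close-to-terminate convention relies on plain pipe semantics: once every write end is closed, read() on the other end returns 0. A minimal single-process sketch follows; the function name is hypothetical and the THREAD_MSG__READY handshake is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Returns 1 once the peer has closed the write end (read() returns 0),
 * 0 if a message byte was read instead, and -1 on error.
 */
static int msg_pipe_closed(int read_fd)
{
	char byte;
	ssize_t n;

	assert(read_fd >= 0);
	do {
		n = read(read_fd, &byte, 1);
	} while (n < 0 && errno == EINTR);	/* retry if interrupted */

	if (n < 0)
		return -1;
	return n == 0;
}
```

A worker thread would treat a return value of 1 as the request to terminate and then send its confirmation on its own pipe.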

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:45 +02:00
Michael Petlan 3e7fafcf71 perf record: Introduce thread local variable
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 396b626b95d22664d2f2e5ca332e777ea699a10e
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:25 2022 +0300

description
===========
Introduce thread local variable and use it for threaded trace streaming.
Use thread affinity mask instead of record affinity mask in affinity
modes. Use evlist__ctlfd_update() to propagate control commands from
thread object to global evlist object to enable evlist__ctlfd_*
functionality. Move waking and sample statistic to struct record_thread
and introduce record__waking function to calculate the total number of
wakes.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:45 +02:00
Michael Petlan 538a3bf73d perf record: Introduce thread specific data array
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 415ccb58f68a6bebcbb9db373973394a6af3d553
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:23 2022 +0300

description
===========
Introduce thread specific data object and array of such objects
to store and manage thread local data. Implement functions to
allocate, initialize, finalize and release thread specific data.

Thread local maps and overwrite_maps arrays keep pointers to
mmap buffer objects to serve according to maps thread mask.
Thread local pollfd array keeps event fds connected to mmaps
buffers according to maps thread mask.

Thread control commands are delivered via thread local comm pipes
and ctlfd_pos fd. External control commands (--control option)
are delivered via evlist ctlfd_pos fd and handled by the main
tool thread.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:45 +02:00
Michael Petlan 4b283bbb14 perf record: Introduce thread affinity and mmap masks
Bugzilla: https://bugzilla.redhat.com/2123231

upstream
========
commit 7954f71689f90cb2ae252d3923354d48071994bf
Author: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Date: Mon Jan 17 21:34:21 2022 +0300

description
===========
Introduce affinity and mmap thread masks. Thread affinity mask
defines CPUs that a thread is allowed to run on. Thread maps
mask defines mmap data buffers the thread serves to stream
profiling data from.
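
A minimal sketch of the two masks, assuming at most 64 CPUs so that each mask fits in one word. The struct and helper names are hypothetical; the patch itself uses dynamically sized bitmaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64

/* Hypothetical fixed-size stand-in for the two per-thread masks. */
struct thread_mask_sketch {
	uint64_t maps;		/* bit i set: thread serves the mmap buffer of CPU i */
	uint64_t affinity;	/* bit i set: thread is allowed to run on CPU i */
};

/* Set one CPU bit; out-of-range values such as -1 are rejected. */
static bool mask_set_cpu(uint64_t *mask, int cpu)
{
	assert(mask != NULL);
	if (cpu < 0 || cpu >= SKETCH_MAX_CPUS)
		return false;
	*mask |= UINT64_C(1) << cpu;
	return true;
}

/* Test one CPU bit; out-of-range values are reported as not set. */
static bool mask_test_cpu(uint64_t mask, int cpu)
{
	return cpu >= 0 && cpu < SKETCH_MAX_CPUS && ((mask >> cpu) & 1);
}
```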

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-09-21 07:22:45 +02:00
Michael Petlan 5f091c527c perf record: Disable debuginfod by default
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 9bce13ea88f85344b765abe5d3dabdd0f44dc177
Author: Jiri Olsa <jolsa@redhat.com>
Date: Thu Dec 9 21:04:25 2021 +0100

description
===========
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to
unexpected stalls in perf record exit path, when we try to cache
profiled binaries.

  # DEBUGINFOD_PROGRESS=1 ./perf record -a
  ^C[ perf record: Woken up 1 times to write data ]
  Downloading from https://debuginfod.fedoraproject.org/ 447069
  Downloading from https://debuginfod.fedoraproject.org/ 1502175
  Downloading \^Z

Disabling DEBUGINFOD_URLS by default in perf record and adding
debuginfod option and .perfconfig variable support to enable it.

  Default without debuginfo processing:
  # perf record -a

  Using system debuginfod setup:
  # perf record -a --debuginfod

  Using custom debuginfod url:
  # perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava'

Adding single perf_debuginfod_setup function and using
it also in perf buildid-cache command.
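
A minimal sketch of such a single setup function. The struct, the function name and the "system" keyword are assumptions made for illustration, as is the premise that an empty DEBUGINFOD_URLS value disables lookups.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option state; not the struct used by perf. */
struct debuginfod_opt_sketch {
	bool set;		/* true if --debuginfod was given */
	const char *urls;	/* NULL or "system": keep the environment as is */
};

static int debuginfod_setup_sketch(const struct debuginfod_opt_sketch *opt)
{
	assert(opt != NULL);
	if (!opt->set)		/* default: override with an empty value */
		return setenv("DEBUGINFOD_URLS", "", 1);
	if (opt->urls && strcmp(opt->urls, "system") != 0)
		return setenv("DEBUGINFOD_URLS", opt->urls, 1);
	return 0;		/* keep the system-provided setting */
}
```

Calling the same function from both commands keeps the default behaviour identical in each.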

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:36:04 +02:00
Michael Petlan 86ce88496c perf cpumap: Give CPUs their own type
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 6d18804b963b78dcd53851f11e9080408b3d85c2
Author: Ian Rogers <irogers@google.com>
Date: Tue Jan 4 22:13:51 2022 -0800

description
===========
A common problem is confusing CPU map indices with the CPU. Wrapping
the CPU in a struct avoids this. The approach is similar to
atomic_t.
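
A minimal sketch of the wrapping idea, using hypothetical names (cpu_sketch, cpu_map_sketch) rather than the real 'struct perf_cpu' API. Because the CPU is a struct and the index is an int, passing one where the other is expected fails to compile.

```c
#include <assert.h>
#include <stddef.h>

/* Wrapper type: a logical CPU number, distinct from a map index. */
struct cpu_sketch {
	int cpu;
};

/* Hypothetical fixed-size CPU map. */
struct cpu_map_sketch {
	int nr;
	struct cpu_sketch map[8];
};

/* Takes an index and returns the CPU stored there, or CPU -1. */
static struct cpu_sketch cpu_map_cpu(const struct cpu_map_sketch *cpus, int idx)
{
	assert(cpus != NULL);
	if (idx < 0 || idx >= cpus->nr)
		return (struct cpu_sketch){ .cpu = -1 };
	return cpus->map[idx];
}

/* Takes a CPU and returns its index in the map, or -1 if absent. */
static int cpu_map_idx(const struct cpu_map_sketch *cpus, struct cpu_sketch cpu)
{
	assert(cpus != NULL);
	for (int i = 0; i < cpus->nr; i++) {
		if (cpus->map[i].cpu == cpu.cpu)
			return i;
	}
	return -1;
}
```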

Committer notes:

To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:

  tools/perf/util/bpf_counter.c
  tools/perf/util/bpf_counter_cgroup.c
  tools/perf/util/bpf_ftrace.c

Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".

Additionally these needed to be fixed for the ARM builds to complete:

  tools/perf/arch/arm/util/cs-etm.c
  tools/perf/arch/arm64/util/pmu.c

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:36:01 +02:00
Michael Petlan 49007a9814 perf tools: Record ARM64 LR register automatically
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 7248e308a57587615431b83689cd57e957815bfc
Author: Alexandre Truong <alexandre.truong@arm.com>
Date: Fri Dec 17 15:45:15 2021 +0000

description
===========
On ARM64, automatically record the link register if the frame pointer
mode is on. It will be used to do a dwarf unwind to find the caller of
the leaf frame if the frame pointer was omitted.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:47 +02:00
Michael Petlan de451198af perf tools: Check vmlinux/kallsyms arguments in all tools
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 7cc72553ac03ec20afe2dec91dce4624ccd379b8
Author: James Clark <james.clark@arm.com>
Date: Mon Oct 18 14:48:42 2021 +0100

description
===========
Only perf report checked the validity of these arguments so apply the
same check to all tools that read them for consistency.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:11 +02:00
Michael Petlan d2b1c92483 perf tools: Add support for PERF_RECORD_AUX_OUTPUT_HW_ID
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 61750473589b6f8adc35007c8261986043907f13
Author: Adrian Hunter <adrian.hunter@intel.com>
Date: Tue Sep 7 19:39:02 2021 +0300

description
===========
The PERF_RECORD_AUX_OUTPUT_HW_ID event provides a way to match AUX output
data like Intel PT PEBS-via-PT back to the event that it came from, by
providing a hardware ID that is present in the AUX output.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:35:02 +02:00
Michael Petlan bf53798f1c perf record: Add --synth option
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 41b740b6e8a994e5830daa5e15785522874f7456
Author: Namhyung Kim <namhyung@kernel.org>
Date: Tue Aug 10 21:46:58 2021 -0700

description
===========
Add an option to control the synthesizing behavior.

    --synth <no|all|task|mmap|cgroup>
                      Fine-tune event synthesis: default=all

This can be useful when we know that some synthesis is not needed, as
in a specific use case and/or when using a pipe:

  $ perf record -a --all-cgroups --synth cgroup -o- sleep 1 | \
  > perf report -i- -s cgroup
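
A minimal sketch of mapping a single --synth keyword to a bit flag. The enum values and the function name are hypothetical, and the real parser may accept more spellings or combine flags differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values, one bit per kind of synthesis. */
enum {
	SYNTH_SKETCH_TASK   = 1 << 0,
	SYNTH_SKETCH_MMAP   = 1 << 1,
	SYNTH_SKETCH_CGROUP = 1 << 2,
	SYNTH_SKETCH_ALL    = (1 << 3) - 1,
};

/* Returns the flag set for one keyword, or -1 for an unknown keyword. */
static int parse_synth_sketch(const char *arg)
{
	assert(arg != NULL);
	if (strcmp(arg, "no") == 0)
		return 0;
	if (strcmp(arg, "all") == 0)
		return SYNTH_SKETCH_ALL;
	if (strcmp(arg, "task") == 0)
		return SYNTH_SKETCH_TASK;
	if (strcmp(arg, "mmap") == 0)
		return SYNTH_SKETCH_MMAP;
	if (strcmp(arg, "cgroup") == 0)
		return SYNTH_SKETCH_CGROUP;
	return -1;
}
```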

Committer notes:

Added a clarification to the man page entry for --synth that this is
about pre-existing threads.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:44 +02:00
Michael Petlan 3de72356e1 perf tools: Allow controlling synthesizing PERF_RECORD_ metadata events during record
Bugzilla: https://bugzilla.redhat.com/2069073

upstream
========
commit 84111b9c950ec9a8b31166973e79aa77ddcee7e3
Author: Namhyung Kim <namhyung@kernel.org>
Date: Tue Aug 10 21:46:57 2021 -0700

description
===========
Depending on the use case, some kinds of synthesizing may be required
and others not.  Make it controllable to turn off heavy operations like
MMAP for all tasks.

Currently all users are converted to enable all the synthesis by
default.  It'll be updated in the later patch.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-05-16 11:34:44 +02:00
Michael Petlan f183b25b86 perf record: Fix wrong comm in system-wide mode with delay
Bugzilla: https://bugzilla.redhat.com/2069070

upstream
========
commit bb07d62e039b592f8006c9faedab48cd627e20c4
Author: Namhyung Kim <namhyung@kernel.org>
Date: Fri Aug 27 16:32:12 2021 -0700

description
===========
Stephane found that the name of the forked process in a system-wide
mode is wrong when --delay option is used.  For example,

  # perf record -a --delay=1000  noploop 3

The noploop process will run a busy loop for 3 seconds.  And on an idle
machine it should show up at the top in the perf report.  It works
well without the --delay option.  But if I add the option, it showed
'perf' not 'noploop'.

  # perf report -s comm -q | head -3
      52.94%  perf
      16.65%  swapper
      12.04%  chrome

It turned out that the dummy event didn't work at all and it missed
COMM and MMAP events for the noploop process (and others too).  We
should enable the dummy event immediately in system-wide mode, as the
enable-on-exec would work only for task events.

With this change,

  # perf report -s comm -q | head -3
      52.75%  noploop
      17.03%  swapper
      12.83%  chrome

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-04-25 12:33:07 +02:00
Michael Petlan 1b539daf08 perf tools: Enable on a list of CPUs for hybrid
Bugzilla: https://bugzilla.redhat.com/2069070

upstream
========
commit 1d3351e631fc34d73b530a67263188062fe598ba
Author: Jin Yao <yao.jin@linux.intel.com>
Date: Fri Jul 23 14:34:33 2021 +0800

description
===========
The 'perf record' and 'perf stat' commands have supported the option
'-C/--cpus' to count or collect only on the list of CPUs provided. This
option needs to be supported for hybrid as well.

For hybrid support, it needs to check that the CPUs in the list are available
on the hybrid PMU. For example, on AlderLake, cpu0-7 is 'cpu_core' and cpu8-11
is 'cpu_atom'.

Before:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1

   Performance counter stats for 'CPU(s) 11':

     <not supported>      cpu_core/cycles/

         1.006179431 seconds time elapsed

The 'perf stat' command silently returned "<not supported>" without any
helpful information. It should error out pointing out that cpu11
was not 'cpu_core'.

After:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
  failed to use cpu list 11

We also need to support the events without pmu prefix specified.

  # perf stat -e cycles -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)

   Performance counter stats for 'CPU(s) 11':

           1,067,373      cpu_atom/cycles/

         1.005544738 seconds time elapsed

The perf tool creates two cycles events automatically, cpu_core/cycles/ and
cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
for cpu_core/cycles/ and only counts the cpu_atom/cycles/.

If some of the cpus are 'cpu_core' and others are 'cpu_atom', for example,

  # perf stat -e cycles -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           1,914,704      cpu_core/cycles/
           2,036,983      cpu_atom/cycles/

         1.005815641 seconds time elapsed

It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for
cpu_atom/cycles/, and outputs some warnings.
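
The per-PMU selection can be sketched as a filter over the requested CPU list. The struct and function names are hypothetical, and the contiguous ranges follow the AlderLake example (cpu_core 0-7, cpu_atom 8-11); real PMU CPU masks need not be contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a hybrid PMU as one contiguous CPU range. */
struct pmu_range_sketch {
	const char *name;
	int first_cpu;
	int last_cpu;
};

/*
 * Copy into out the CPUs from the requested list that belong to pmu.
 * Returns how many were kept; 0 means the list is unusable for this PMU.
 */
static size_t cpus_for_pmu(const struct pmu_range_sketch *pmu,
			   const int *cpus, size_t nr, int *out)
{
	size_t kept = 0;

	assert(pmu != NULL && cpus != NULL && out != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (cpus[i] >= pmu->first_cpu && cpus[i] <= pmu->last_cpu)
			out[kept++] = cpus[i];
	}
	return kept;
}
```

With the list 0,11 this keeps cpu0 for cpu_core and cpu11 for cpu_atom; with the list 11 alone, nothing is left for cpu_core.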

Some more complex examples,

  # perf stat -e cycles,instructions -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           2,780,387      cpu_core/cycles/
           1,583,432      cpu_atom/cycles/
           3,957,277      cpu_core/instructions/
           1,167,089      cpu_atom/instructions/

         1.006005124 seconds time elapsed

  # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           3,290,301      cpu_core/cycles/
           1,953,073      cpu_atom/cycles/
           1,407,869      cpu_atom/instructions/

         1.006260912 seconds time elapsed

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-04-25 12:33:06 +02:00
Michael Petlan f8ed20f6c5 perf inject: Fix output from a file to a pipe
Bugzilla: https://bugzilla.redhat.com/2069070

upstream
========
commit c3a057dc3aa9979ce6dc350e05eb2e4c021432cd
Author: Namhyung Kim <namhyung@kernel.org>
Date: Mon Jul 19 15:31:52 2021 -0700

description
===========
When the input is a regular file but the output is a pipe, it should
write a pipe header.  But just repiping would write a portion of the
existing header which is different in 'size' value.  So we need to
prevent it and write a new pipe header along with other information
like event attributes and features.
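
A minimal sketch of why a truncated file header cannot be reused. It assumes the pipe header consists of just two 64-bit fields, a magic and its own size, and that the magic is the ASCII string "PERFILE2"; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: magic followed by the header's own size. */
struct pipe_header_sketch {
	uint64_t magic;
	uint64_t size;
};

/*
 * Build a fresh pipe header. The size field describes this short header,
 * so the first bytes of a longer on-disk header would carry a different
 * size value and must not be repiped as they are.
 */
static void pipe_header_init(struct pipe_header_sketch *hdr)
{
	assert(hdr != NULL);
	memcpy(&hdr->magic, "PERFILE2", sizeof(hdr->magic));
	hdr->size = sizeof(*hdr);
}
```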

This can handle something like this:

  # perf record -a -B sleep 1

  # perf inject -b -i perf.data | perf report -i -

Factor out perf_event__synthesize_for_pipe() to be shared between perf
record and inject.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-04-25 12:33:02 +02:00
Michael Petlan 1df7d85569 perf tools: Remove repipe argument from perf_session__new()
Bugzilla: https://bugzilla.redhat.com/2069070

upstream
========
commit 2681bd85a4b92788e265934d0d76bd56b5b08d16
Author: Namhyung Kim <namhyung@kernel.org>
Date: Mon Jul 19 15:31:49 2021 -0700

description
===========
The repipe argument is only used by perf inject and all the others
pass 'false'.  Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.

This is in preparation for changing the pipe input/output.

[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
2022-04-25 12:33:02 +02:00
Vitaly Kuznetsov f43071af87 tools: rename bitmap_alloc() to bitmap_zalloc()
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2009338

commit 7fc5b571325f1bcbe1ce384409b2d05546431b04
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Tue Sep 7 19:59:35 2021 -0700

    tools: rename bitmap_alloc() to bitmap_zalloc()

    Rename bitmap_alloc() to bitmap_zalloc() in tools to follow the bitmap API
    in the kernel.

    No functional changes intended.
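
For illustration, a minimal sketch of a zeroing bitmap allocator. The _sketch names are hypothetical and this is not the tools/ implementation; the "z" in the name signals that the returned bits are all clear.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Allocate a zero-initialized bitmap able to hold nbits bits. */
static unsigned long *bitmap_zalloc_sketch(size_t nbits)
{
	size_t nlongs = (nbits + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;

	/* calloc() zeroes the memory; allocate at least one word. */
	return calloc(nlongs ? nlongs : 1, sizeof(unsigned long));
}

/* Return 1 if the given bit is set, 0 otherwise. */
static int bitmap_test_bit_sketch(const unsigned long *map, size_t bit)
{
	assert(map != NULL);
	return (map[bit / BITS_PER_LONG_SKETCH] >> (bit % BITS_PER_LONG_SKETCH)) & 1UL;
}
```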

    Link: https://lkml.kernel.org/r/20210814211713.180533-14-yury.norov@gmail.com
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Yury Norov <yury.norov@gmail.com>
    Suggested-by: Yury Norov <yury.norov@gmail.com>
    Acked-by: Yury Norov <yury.norov@gmail.com>
    Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Acked-by: Jiri Olsa <jolsa@redhat.com>
    Cc: Alexander Lobakin <alobakin@pm.me>
    Cc: Alexey Klimov <aklimov@redhat.com>
    Cc: Dennis Zhou <dennis@kernel.org>
    Cc: Ulf Hansson <ulf.hansson@linaro.org>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
2021-12-08 10:43:17 +01:00
Kan Liang b91e5492f9 perf record: Add a dummy event on hybrid systems to collect metadata records
Some symbols may not be resolved if a user only monitors one type of
PMU.

  $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
  $ sudo perf report --stdio
  # Overhead  Command    Shared Object      Symbol
  # ........  .........  .................  .....................
  #
     28.02%  perf-exec  [unknown]          [.] 0x0000000000401cf6
     11.32%  perf-exec  [unknown]          [.] 0x0000000000401d04
     10.90%  perf-exec  [unknown]          [.] 0x0000000000401d11
     10.61%  perf-exec  [unknown]          [.] 0x0000000000401cfc

To parse symbols the metadata records, e.g., PERF_RECORD_COMM, which are
generated by the kernel, are required.

To decide whether to generate the metadata records, the kernel relies on
the event_filter_match() to filter the unrelated events.

On a hybrid system, event_filter_match() further checks the CPU mask of
the current enabled PMU. If an event is collected on the CPU which
doesn't have an enabled PMU, it's treated as an unrelated event.

The "big_small_workload" is created in a big core, but runs on a small
core. The metadata records are filtered, because the user only monitors
the PMU of the small core. The big core PMU is not enabled.

For a hybrid system, a dummy event is required to generate the complete
side-band events.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1625760212-18441-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa 3a683120d8 libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups
Move evlist::nr_groups to perf_evlist::nr_groups, so we can move the group
interface to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa fba7c86601 libperf: Move 'leader' from tools/perf to perf_evsel::leader
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

  struct evsel *evsel__leader(struct evsel *evsel);
  - get leader evsel

  bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
  - true if evsel has leader as leader

  bool evsel__is_leader(struct evsel *evsel);
  - true if evsel is its own leader

  void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
  - set leader for evsel
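
A minimal sketch of how such helpers can be written over a simplified, hypothetical struct. The real evsel embeds a libperf core object, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in: each event points at its group leader. */
struct evsel_sketch {
	struct evsel_sketch *leader;
	int nr_members;
};

/* Get the leader of evsel. */
static struct evsel_sketch *evsel_leader(struct evsel_sketch *evsel)
{
	assert(evsel != NULL);
	return evsel->leader;
}

/* True if evsel has the given event as its leader. */
static bool evsel_has_leader(struct evsel_sketch *evsel,
			     struct evsel_sketch *leader)
{
	assert(evsel != NULL);
	return evsel->leader == leader;
}

/* True if evsel is its own leader. */
static bool evsel_is_leader(struct evsel_sketch *evsel)
{
	assert(evsel != NULL);
	return evsel->leader == evsel;
}

/* Set the leader for evsel. */
static void evsel_set_leader(struct evsel_sketch *evsel,
			     struct evsel_sketch *leader)
{
	assert(evsel != NULL);
	evsel->leader = leader;
}
```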

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

  tools/perf/util/bpf_counter.c

  -       if (evsel->leader->core.nr_members > 1) {
  +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:31 -03:00
Arnaldo Carvalho de Melo ce09673636 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes, since perf/urgent is already upstream.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 13:56:50 -03:00
Namhyung Kim 4f2abe9192 perf record: Move probing cgroup sampling support
I found that checking cgroup sampling support using the missing features
doesn't work on old kernels.  Because it added both attr.cgroup bit and
PERF_SAMPLE_CGROUP bit, it needs to check whichever comes first (usually
the actual event, not dummy).

But it only checks the attr.cgroup bit which is set only in the dummy
event so cannot detect failures due to the sample bits.  Also we don't
ignore the missing feature and retry, so it'd be better to check it with
the API probing logic.

Committer notes:

Extracted the minimal part to check using the new cgroup API probe
routine, the part that removes the cgroup member can be left for further
discussion.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210527182835.1634339-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:32:00 -03:00
Adrian Hunter 66286ed3e8 perf record: Set timestamp boundary for AUX area events
AUX area data is not processed by 'perf record' and consequently the
 --timestamp-boundary option may result in no values for "time of first
sample" and "time of last sample". However there are non-sample events
that can be used instead, namely 'itrace_start' and 'aux'.
'itrace_start' is issued before tracing starts, and 'aux' is issued
every time data is ready.

Implement tool callbacks for those two for 'perf record', to update the
timestamp boundary.
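
A minimal sketch of a boundary update that such a callback could perform. The names are hypothetical, a timestamp of 0 is assumed to mean "no timestamp", and the min/max policy is a simplification that may differ from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the first and last timestamps seen. */
struct time_boundary_sketch {
	uint64_t first;	/* 0 until a timestamp has been recorded */
	uint64_t last;
};

/* Widen the boundary so that it includes timestamp t. */
static void boundary_update(struct time_boundary_sketch *tb, uint64_t t)
{
	assert(tb != NULL);
	if (t == 0)		/* assumption: 0 means no timestamp */
		return;
	if (tb->first == 0 || t < tb->first)
		tb->first = t;
	if (t > tb->last)
		tb->last = t;
}
```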

Example:

 $ perf record -e intel_pt//u --timestamp-boundary uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.022 MB perf.data ]
 $ perf script --header-only | grep "time of"
 # time of first sample : 4574.835541
 # time of last sample : 4574.835907
 $ perf script --itrace=be -F-ip | head -1
           uname 13752 [001]  4574.835589:          1 branches:uH:
 $ perf script --itrace=be -F-ip | tail -1
           uname 13752 [001]  4574.835867:          1 branches:uH:
 $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210503064222.5319-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00
Jin Yao 91c0f5ec81 perf record: Uniquify hybrid event name
For perf-record, it would be useful to tell the user which PMU the
event belongs to.

For example,

  # perf record -a -- sleep 1
  # perf report

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 106  of event 'cpu_core/cycles/'
  # Event count (approx.): 22043448
  #
  # Overhead  Command       Shared Object            Symbol
  # ........  ............  .......................  ............................
  #
  ...

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Jin Yao b53a0755d5 perf record: Create two hybrid 'cycles' events by default
When evlist is empty, for example no '-e' specified in perf record,
one default 'cycles' event is added to evlist.

On a hybrid platform, however, it needs to create two default 'cycles'
events. One is for cpu_core, the other is for cpu_atom.

This patch actually calls evsel__new_cycles() two times to create
two 'cycles' events.

  # ./perf record -vv -a -- sleep 1
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 21
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
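
The two config values in the output above are consistent with a PMU type placed in the upper 32 bits and the hardware event id (0 for cycles) in the lower 32 bits. A minimal sketch of that arithmetic follows; the macro and function names are hypothetical, and reading the PMU types as 4 and 8 is an inference from this output only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the PMU type inside the 64-bit config value. */
#define PMU_TYPE_SHIFT_SKETCH 32

static_assert(PMU_TYPE_SHIFT_SKETCH < 64, "shift must fit in 64 bits");

/* Combine a PMU type and a hardware event id into one config value. */
static uint64_t hybrid_hw_config(uint32_t pmu_type, uint32_t hw_id)
{
	return ((uint64_t)pmu_type << PMU_TYPE_SHIFT_SKETCH) | hw_id;
}

/* Recover the PMU type from a config value. */
static uint32_t hybrid_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> PMU_TYPE_SHIFT_SKETCH);
}
```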

We have to create evlist-hybrid.c; otherwise, due to the symbol
dependency, the 'perf test python' test would fail.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-29 10:30:59 -03:00
Arnaldo Carvalho de Melo 3535a6967c perf record: Improve 'Workload failed' message printing events + what was exec'ed
Before:

  # perf record -a cycles,instructions,cache-misses
  Workload failed: No such file or directory
  #

After:

  # perf record -a cycles,instructions,cache-misses
  Failed to collect 'cycles' for the 'cycles,instructions,cache-misses' workload: No such file or directory
  #

Helps disambiguate other error scenarios:

  # perf record -a -e cycles,instructions,cache-misses bla
  Failed to collect 'cycles,instructions,cache-misses' for the 'bla' workload: No such file or directory
  # perf record -a cycles,instructions,cache-misses sleep 1
  Failed to collect 'cycles' for the 'cycles,instructions,cache-misses' workload: No such file or directory
  #

When all goes well we're back to the usual:

  # perf record -a -e cycles,instructions,cache-misses sleep 1
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 3.151 MB perf.data (21242 samples) ]
  #

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210414131628.2064862-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-15 16:34:05 -03:00