Commit Graph

495 Commits

Author SHA1 Message Date
Sandipan Das 2fe6575924 perf script: Skip aggregation for stat events
The script command does not support aggregation modes by itself although
that can be achieved using post-processing scripts. Because of this, it
does not allocate memory for aggregated event values.

Upon running perf stat record, the aggregation mode is set in the perf
data file. If the mode is AGGR_GLOBAL, the aggregated event values are
accessed and this leads to a segmentation fault since these were never
allocated to begin with. Set the mode to AGGR_NONE explicitly to avoid
this.

E.g.

  $ perf stat record -e cycles true
  $ perf script

Before:
  Segmentation fault (core dumped)

After:
  CPU   THREAD             VAL             ENA             RUN            TIME EVENT
   -1   231919          162831          362069          362069          935289 cycles:u

Fixes: 8b76a3188b ("perf stat: Remove unused perf_counts.aggr field")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: stable@vger.kernel.org # v6.2+
Link: https://lore.kernel.org/r/83d6c6c05c54bf00c5a9df32ac160718efca0c7a.1683280603.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-10 14:35:58 -03:00
Arnaldo Carvalho de Melo 3ad1be6fae perf dso: Fix use before NULL check introduced by map__dso() introduction
James Clark noticed that the recent 63df0e4bc3 ("perf map: Add
accessor for dso") patch accessed map->dso before the 'map' variable was
NULL checked, which is a change in logic that leads to segmentation
faults, so comb thru that patch to fix similar cases.

Fixes: 63df0e4bc3 ("perf map: Add accessor for dso")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/lkml/ZD68RYCVT8hqPuxr@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 10:51:48 -03:00
Ian Rogers 78a1f7cd90 perf map: Add helper for ->map_ip() and ->unmap_ip()
Later changes will add reference count checking for struct map, add a
helper function to invoke the map_ip and unmap_ip function pointers. The
helper allows the reference count check to be in fewer places.

Committer notes:

Add missing conversions to:

  tools/perf/util/map.c
  tools/perf/util/cs-etm.c
  tools/perf/util/annotate.c
  tools/perf/arch/powerpc/util/sym-handling.c
  tools/perf/arch/s390/annotate/instructions.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-06 22:10:17 -03:00
Ian Rogers 0e6aa013bb perf map: Rename map_ip() and unmap_ip()
Add dso to match comment. This avoids a naming conflict with later
added accessor functions for variables in struct map.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230404205954.2245628-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-06 21:59:45 -03:00
Ian Rogers e5116f46d4 perf map: Add accessor for start and end
Later changes will add reference count checking for struct map, start
and end are frequently accessed variables. Add an accessor so that the
reference count check is only necessary in one place.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04 16:54:11 -03:00
Ian Rogers 63df0e4bc3 perf map: Add accessor for dso
Later changes will add reference count checking for struct map, with
dso being the most frequently accessed variable. Add an accessor so
that the reference count check is only necessary in one place.

Additional changes:
 - add a dso variable to avoid repeated map__dso calls.
 - in builtin-mem.c dump_raw_samples, code only partially tested for
   dso == NULL. Make the possibility of NULL consistent.
 - in thread.c thread__memcpy fix use of spaces and use tabs.

Committer notes:

Did missing conversions on these files:

   tools/perf/arch/powerpc/util/skip-callchain-idx.c
   tools/perf/arch/powerpc/util/sym-handling.c
   tools/perf/ui/browsers/hists.c
   tools/perf/ui/gtk/annotate.c
   tools/perf/util/cs-etm.c
   tools/perf/util/thread.c
   tools/perf/util/unwind-libunwind-local.c
   tools/perf/util/unwind-libunwind.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04 16:41:57 -03:00
Adrian Hunter 34f576c95d perf intel-pt: Add event type names UINTR and UIRET
UINTR and UIRET are listed in table 32-50 "CFE Packet Type and Vector
Fields Details" in the Intel Processor Trace chapter of The Intel SDM
Volume 3 version 078.

The codes are for "User interrupt delivered" and "Exiting from user
interrupt routine" respectively.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230320183517.15099-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-20 19:25:27 -03:00
Adrian Hunter 80c3a7d9f2 perf script: Fix Python support when no libtraceevent
Python scripting can be used without libtraceevent. In particular,
scripting for Intel PT does not use tracepoints, and so does not need
libtraceevent support.

Alter the build and employ conditional compilation to allow Python
scripting without libtraceevent.

Example:

 Before:

    $ ldd `which perf` | grep -i python
    $ ldd `which perf` | grep -i libtraceevent
    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.031 MB perf.data ]
    $ perf script intel-pt-events.py |& head -3
      Error: Couldn't find script `intel-pt-events.py'

     See perf script -l for available scripts.

 After:

    $ ldd `which perf` | grep -i python
            libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000)
    $ ldd `which perf` | grep -i libtraceevent
    $ perf script intel-pt-events.py | head
    Intel PT Branch Trace, Power Events, Event Trace and PTWRITE
         Switch In    8021/8021  [000]     11234.097713404     0/0
           perf-exec  8021/8021  [000]     11234.098041726       psb                        offset: 0x0                0 [unknown] ([unknown])
           perf-exec  8021/8021  [000]     11234.098041726       cbr                         45  freq: 4505 MHz  (161%)                0 [unknown] ([unknown])
               uname  8021/8021  [000]     11234.098082170  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098082379  branches:uH  tr end                    7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])
               uname  8021/8021  [000]     11234.098083629  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098083629  branches:uH  call                      7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098083837  branches:uH  tr end                    7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])  IPC: 0.01 (9/938)
               uname  8021/8021  [000]     11234.098084670  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

Fixes: 378ef0f5d9 ("perf build: Use libtraceevent from the system")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230315084321.14563-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-15 10:27:07 -03:00
Ian Rogers aa0964e3ec perf stat: Remove saved_value/runtime_stat
As saved_value/runtime_stat are only written to and not read, remove
the associated logic and writes.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-52-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19 08:10:25 -03:00
Ian Rogers cc26ffaa01 perf stat: Hide runtime_stat
runtime_stat is only shared for the sake of tests that don't care
about its value. Move the definition into stat-shadow.c and have the
tests also use the global version.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-48-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-19 08:10:25 -03:00
Kan Liang 17f248aa86 perf script: Support Retire Latency
The Retire Latency field is added in the var3_w of the
PERF_SAMPLE_WEIGHT_STRUCT. The Retire Latency reports the number of
elapsed core clocks between the retirement of the instruction indicated
by the Instruction Pointer field of the PEBS record and the retirement
of the prior instruction. That's quite useful to display the information
with perf script.

Add a new field retire_lat for the Retire Latency information.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20230104201349.1451191-9-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:26:40 -03:00
Sandipan Das 6ade6c6460 perf script: Show branch speculation info
Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for optimizing code further.

The speculation info is appended to the end of the list of fields so any
existing tools that use "/" as a delimiter for access fields via an index
remain unaffected. Also show "-" instead of "N/A" when speculation info
is unavailable because "/" is used as the field separator.

E.g.

  $ perf record -j any,u,save_type ./test_branch
  $ perf script --fields brstacksym

Before:

  [...]
  check_match+0x60/strcmp+0x0/P/-/-/0/CALL
  do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL
  [...]

After:

  [...]
  check_match+0x60/strcmp+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  [...]

The bitfield swapping scheme used duing sample parsing has changed
because of the addition of new branch flags, namely "spec", "new_type"
and "priv". Earlier, these were all part of the "reserved" field but
now, each of these fields get swapped separately. Change the expected
flag values accordingly for the test to pass.

E.g.

  $ perf test -v 27

Before:

   27: Sample parsing                                                  :
  --- start ---
  test child forked, pid 61979
  parsing failed for sample_type 0x800
  test child finished with -1
  ---- end ----
  Sample parsing: FAILED!

After:

   27: Sample parsing                                                  :
  --- start ---
  test child forked, pid 63293
  test child finished with 0
  ---- end ----
  Sample parsing: Ok

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/56e272583552526e999ba0b536ac009ae3613966.1675333809.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Namhyung Kim 3fd7a168bf perf script: Add 'cgroup' field for output
There's no field for the cgroup, let's add one.  To do that, users need to
specify --all-cgroup option for perf record to capture the cgroup info.

  $ perf record --all-cgroups -- true

  $ perf script -F comm,pid,cgroup
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...

If it's recorded without the --all-cgroups, it'd complain.

  $ perf script -F comm,pid,cgroup
  Samples for 'cycles:u' event do not have CGROUP attribute set. Cannot print 'cgroup' field.
  Hint: run 'perf record --all-cgroups ...'

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230126213610.3381147-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Diederik de Haas fc5d836c67 perf: Various spelling fixes
Fix various spelling errors as reported by Debian's lintian tool.

"amount of times" -> "number of times"
ocurrence -> occurrence
upto -> up to

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230122122034.48020-1-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-23 10:00:47 -03:00
Yang Jihong 7c0a6144f9 perf tools: Fix usage of the verbose variable
The data type of the verbose variable is integer and can be negative,
replace improperly used cases in a unified manner:
 1. if (verbose)        => if (verbose > 0)
 2. if (!verbose)       => if (verbose <= 0)
 3. if (XX && verbose)  => if (XX && verbose > 0)
 4. if (XX && !verbose) => if (XX && verbose <= 0)

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221220035702.188413-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 15:16:33 -03:00
Ian Rogers 378ef0f5d9 perf build: Use libtraceevent from the system
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.

If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.

This also disables CONFIG_TRACE that controls "perf trace".

CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.

Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed.  The
majority of commands continue to work including "perf test".

Committer notes:

Fixed up a tools/perf/util/Build reject and added:

  #include <traceevent/event-parse.h>

to tools/perf/util/scripting-engines/trace-event-perl.c.

Committer testing:

  $ rpm -qi libtraceevent-devel
  Name        : libtraceevent-devel
  Version     : 1.5.3
  Release     : 2.fc36
  Architecture: x86_64
  Install Date: Mon 25 Jul 2022 03:20:19 PM -03
  Group       : Unspecified
  Size        : 27728
  License     : LGPLv2+ and GPLv2+
  Signature   : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
  Source RPM  : libtraceevent-1.5.3-2.fc36.src.rpm
  Build Date  : Fri 15 Apr 2022 10:57:01 AM -03
  Build Host  : buildvm-x86-05.iad2.fedoraproject.org
  Packager    : Fedora Project
  Vendor      : Fedora Project
  URL         : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
  Bug URL     : https://bugz.fedoraproject.org/libtraceevent
  Summary     : Development headers of libtraceevent
  Description :
  Development headers of libtraceevent-libs
  $

Default build:

  $ ldd ~/bin/perf | grep tracee
  	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
  $

  # perf trace -e sched:* --max-events 10
       0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
       0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
       0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
       1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
       1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
       0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
       0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
       0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
       1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
       1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
  #

Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.

Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:

- Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y

- perf-$(CONFIG_LIBTRACEEVENT) += scripts/

- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y

- The python binding needed some fixups and util/trace-event.c can't be
  built and linked with the python binding shared object, so remove it
  in tools/perf/util/setup.py and exclude it from the list of
  dependencies in the python/perf.so Makefile.perf target.

Building without libtraceevent-devel installed uncovered more build
failures:

- The python binding tools/perf/util/python.c was assuming that
  traceevent/parse-events.h was always available, which was the case
  when we defaulted to using the in-kernel tools/lib/traceevent/ files,
  now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
  the other parts of it that deal with tracepoints.

- We have to ifdef the rules in the Build files with
  CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
  tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
  setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
  detect libtraceevent-devel installed in the system. Simplification here
  to avoid these two ways of disabling builtin-trace.c and not having
  CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
  way.

From Athira:

<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>

Then, ditto for arm64 and s390, detected by container cross build tests.

- s/390 uses test__checkevent_tracepoint() that is now only available if
  HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT.

Also from Athira:

<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>

Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Namhyung Kim 1f297a6eb2 perf stat: Allocate evsel->stats->aggr properly
The perf_stat_config.aggr_map should have a correct size of the
aggregation map.  Use it to allocate aggr_counts.

Also AGGR_NONE with per-core events can be tricky because it doesn't
aggreate basically but it needs to do so for per-core events only.
So only per-core evsels will have stats->aggr data.

Note that other caller of evlist__alloc_stat() might not have
stat_config or aggr_map.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221018020227.85905-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-27 16:37:25 -03:00
Ravi Bangoria d793107005 perf script: Add missing fields in usage hint
A few fields are missing in the usage message printed when an unknown
field option is passed. Add them to the list.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-9-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06 16:32:20 -03:00
Namhyung Kim 1337b9dcb0 perf tools: Remove special handling of system-wide evsel
For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1.  But the code guarantees such a thread map, so no
need to handle it specially.

No functional change intended.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221003204647.1481128-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06 08:03:53 -03:00
Anshuman Khandual 0ddea8e2a0 perf branch: Extend branch type classification
This updates the perf tool with generic branch type classification with new
ABI extender place holder i.e PERF_BR_EXTEND_ABI, the new 4 bit branch type
field i.e perf_branch_entry.new_type, new generic page fault related branch
types and some arch specific branch types as added earlier in the kernel.

Committer note:

Add an extra entry to the branch_type_name array to cope with
PERF_BR_EXTEND_ABI, to address build warnings on some compiler/systems,
like:

  75     8.89 ubuntu:20.04-x-powerpc64el    : FAIL gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~20.04)
        inlined from 'branch_type_stat_display' at util/branch.c:152:4:
    /usr/powerpc64le-linux-gnu/include/bits/stdio2.h💯10: error: '%8s' directive argument is null [-Werror=format-overflow=]
      100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
          |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      101 |    __va_arg_pack ());
          |    ~~~~~~~~~~~~~~~~~

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-7-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04 08:55:20 -03:00
Zhengjun Xing 82b2425fad perf script: Fix Cannot print 'iregs' field for hybrid systems
Commit b91e5492f9 ("perf record: Add a dummy event on hybrid
systems to collect metadata records") adds a dummy event on hybrid
systems to fix the symbol "unknown" issue when the workload is created
in a P-core but runs on an E-core. The added dummy event will cause
"perf script -F iregs" to fail. Dummy events do not have "iregs"
attribute set, so when we do evsel__check_attr, the "iregs" attribute
check will fail, so the issue happened.

The following commit [1] has fixed a similar issue by skipping the attr
check for the dummy event because it does not have any samples anyway. It
works okay for the normal mode, but the issue still happened when running
the test in the pipe mode. In the pipe mode, it calls process_attr() which
still checks the attr for the dummy event. This commit fixed the issue by
skipping the attr check for the dummy event in the API evsel__check_attr,
Otherwise, we have to patch everywhere when evsel__check_attr() is called.

Before:

  #./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null|./perf script -F iregs |head -5
  Samples for 'dummy:HG' event do not have IREGS attribute set. Cannot print 'iregs' field.
  0x120 [0x90]: failed to process type: 64
  #

After:

  # ./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null|./perf script -F iregs |head -5
  ABI:2    CX:0x55b8efa87000    DX:0x55b8efa7e000    DI:0xffffba5e625efbb0    R8:0xffff90e51f8ae100
  ABI:2    CX:0x7f1dae1e4000    DX:0xd0    DI:0xffff90e18c675ac0    R8:0x71
  ABI:2    CX:0xcc0    DX:0x1    DI:0xffff90e199880240    R8:0x0
  ABI:2    CX:0xffff90e180dd7500    DX:0xffff90e180dd7500    DI:0xffff90e180043500    R8:0x1
  ABI:2    CX:0x50    DX:0xffff90e18c583bd0    DI:0xffff90e1998803c0    R8:0x58
  #

[1]https://lore.kernel.org/lkml/20220831124041.219925-1-jolsa@kernel.org/

Fixes: b91e5492f9 ("perf record: Add a dummy event on hybrid systems to collect metadata records")
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220908070030.3455164-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-08 15:27:39 -03:00
Jiri Olsa 35503ce12a perf script: Skip dummy event attr check
Hongtao Yu reported problem when displaying uregs in perf script
for system wide perf.data:

  # perf script -F uregs | head -10
  Samples for 'dummy:HG' event do not have UREGS attribute set. Cannot print 'uregs' field.

The problem is the extra dummy event added for system wide,
which does not have proper sample_type setup.

Skipping attr check completely for dummy event as suggested
by Namhyung, because it does not have any samples anyway.

Reported-by: Hongtao Yu <hoy@fb.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220831124041.219925-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-31 13:23:44 -03:00
shaomin Deng ae4e4a0ba3 perf script: Delete repeated word "from"
Delete the repeated word "from" in code.

Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220807080642.13004-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-12 16:44:30 -03:00
Adrian Hunter e28fb159f1 perf script: Add machine_pid and vcpu
Add fields machine_pid and vcpu. These are displayed only if machine_pid is
non-zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:13 -03:00
Adrian Hunter 57190e38b0 perf script: Add --dump-unsorted-raw-trace option
When reviewing the results of perf inject, it is useful to be able to see
the events in the order they appear in the file.

So add --dump-unsorted-raw-trace option to do an unsorted dump.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter 52f28b7bac perf script: Add some missing event dumps
When the -D option is used, the details of thread-map, cpu-map and
event-update events are not currently dumped. Add prints so that they are.

Example:

  # perf record --kcore sleep 0.1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ]
  # perf script -D | grep 'THREAD\|CPU'
  0 0x4950 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 35116
  0 0x4978 [0x20]: PERF_RECORD_CPU_MAP: 0-7
  # perf script -D | grep -A4 'UPDATE'
  0 0x4920 [0x30]: PERF_RECORD_EVENT_UPDATE
  ... id:    147
  ... 0-7

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-23 11:54:22 -03:00
Adrian Hunter 5b20814460 perf script: Add guest_code support
Add an option to indicate that guest code can be found in the hypervisor
process.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220517131011.6117-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:19:04 -03:00
Adrian Hunter a5014310f7 perf script: Print Intel ptwrite value as a string if it is ASCII
It can be convenient to put a string value into a ptwrite payload as
a quick and easy way to identify what is being printed.

To make that useful, if the Intel ptwrite payload value contains only
printable ASCII characters padded with NULLs, then print it also as a
string.

Using the example program from the "Emulated PTWRITE" section of
tools/perf/Documentation/perf-intel-pt.txt:

 $ echo -n "Hello" | od -t x8
 0000000 0000006f6c6c6548
 0000005
 $ perf record -e intel_pt//u ./eg_ptw 0x0000006f6c6c6548
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.016 MB perf.data ]
 $ perf script --itrace=ew
           eg_ptw 35563 [005] 18256.087338:     ptwrite:  IP: 0 payload: 0x6f6c6c6548 Hello      55e764db5196 perf_emulate_ptwrite+0x16 (/home/user/eg_ptw)
 $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220509152400.376613-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 11:56:02 -03:00
Leo Yan c6d8df0106 perf script: Always allow field 'data_src' for auxtrace
If use command 'perf script -F,+data_src' to dump memory samples with
Arm SPE trace data, it reports error:

  # perf script -F,+data_src
  Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field.

This is because the 'dummy:u' event is absent DATA_SRC bit in its sample
type, so if a file contains AUX area tracing data then always allow
field 'data_src' to be selected as an option for perf script.

Fixes: e55ed3423c ("perf arm-spe: Synthesize memory event")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 18:39:34 -03:00
Wei Li ae0f4eb34f perf tools: Enhance the matching of sub-commands abbreviations
We support short command 'rec*' for 'record' and 'rep*' for 'report' in
lots of sub-commands, but the matching is not quite strict currnetly.

It may be puzzling sometime, like we mis-type a 'recport' to report but
it will perform 'record' in fact without any message.

To fix this, add a check to ensure that the short cmd is valid prefix
of the real command.

Committer testing:

  [root@quaco ~]# perf c2c re sleep 1

   Usage: perf c2c {record|report}

      -v, --verbose         be more verbose (show counter open errors, etc)

  # perf c2c rec sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.038 MB perf.data (16 samples) ]
  # perf c2c recport sleep 1

   Usage: perf c2c {record|report}

      -v, --verbose         be more verbose (show counter open errors, etc)

  # perf c2c record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.038 MB perf.data (15 samples) ]
  # perf c2c records sleep 1

   Usage: perf c2c {record|report}

      -v, --verbose         be more verbose (show counter open errors, etc)

  #

Signed-off-by: Wei Li <liwei391@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rui Xiang <rui.xiang@huawei.com>
Link: http://lore.kernel.org/lkml/20220325092032.2956161-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26 10:55:57 -03:00
Kan Liang 6f680c6aa2 perf script: Add 'brstackinsnlen' for branch stacks
When analyzing with 'perf script', it's useful to understand the
captured instruction and the next sequential instruction.

To calculate the address of the next sequential instruction, the length
of the captured instruction is required.

For example, you can’t know the next sequential instruction after an
unconditional branch unless you calculate that based on its length.

For branch stacks, 'perf script' only prints the instruction bytes with
'brstackinsn', but lacks the instruction length.

Add 'brstackinsnlen' to print the instruction length.

  $ perf script -F ip,brstackinsn,brstackinsnlen --xed
     7fa555be8f75
        _start:
        00007fa555be8090    mov %rsp, %rdi              ilen: 3
        00007fa555be8093    callq  0x7fa555be8ea0       ilen: 5 # PRED 102 cycles [102] 0.02 IPC
        _dl_start+38:
        00007fa555be8ec6    movq  %rdx,0x227853(%rip)   ilen: 7
        00007fa555be8ecd    leaq  0x227f94(%rip),%rdx   ilen: 7

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/1647871212-184070-1-git-send-email-kan.liang@linux.intel.com
[ Added the new field to tools/perf/Documentation/perf-script.txt ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 18:00:45 -03:00
Arnaldo Carvalho de Melo 65eab2bc7d Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes that went thru perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-14 19:15:16 -03:00
James Clark 1f48989cdc perf script: Output branch sample type
The type info is saved when using '-j save_type'. Output this in 'perf
script' so it can be accessed by other tools or for debugging.

It's appended to the end of the list of fields so any existing tools
that split on / and access fields via an index are not affected. Also
output '-' instead of 'N/A' when the branch type isn't saved because /
is used as a field separator.

Entries before this change look like this:

  0xaaaadb350838/0xaaaadb3507a4/P/-/-/0

And afterwards like this:

  0xaaaadb350838/0xaaaadb3507a4/P/-/-/0/CALL

or this if no type info is saved:

  0x7fb57586df6b/0x7fb5758731f0/P/-/-/143/-

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220307171917.2555829-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-07 14:51:56 -03:00
James Clark b2dac688a5 perf script: Refactor branch stack printing
Remove duplicate code so that future changes to flags are always made to
all 3 printing variations.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220307171917.2555829-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-07 14:51:38 -03:00
German Gomez 13e741b834 perf script: Fix error when printing 'weight' field
In SPE traces the 'weight' field can't be printed in 'perf script'
because the 'dummy:u' event doesn't have the WEIGHT attribute set.

Use evsel__do_check_stype(..) to check this field, as it's done with
other fields such as "phys_addr".

Before:

  $ perf record -e arm_spe_0// -- sleep 1
  $ perf script -F event,ip,weight
  Samples for 'dummy:u' event do not have WEIGHT attribute set. Cannot print 'weight' field.

After:

  $ perf script -F event,ip,weight
     l1d-access:               12 ffffaf629d4cb320
     tlb-access:               12 ffffaf629d4cb320
         memory:               12 ffffaf629d4cb320

Fixes: b0fde9c6e2 ("perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT")
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220221171707.62960-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22 21:17:55 -03:00
Adrian Hunter 2673859865 perf script: Display new D (Intr Disabled) and t (Intr Toggle) flags
Amend the display to include D and t flags in the same way as the x flag.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15 17:14:43 -03:00
Adrian Hunter a48b96ca5a perf script: Display Intel PT iflag synthesized event
Similar to other Intel PT synth events, display changes to the interrupt
flag represented by the MODE.Exec packet.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15 17:14:24 -03:00
Adrian Hunter 5b11749b36 perf script: Display Intel PT CFE (Control Flow Event) / EVD (Event Data) synthesized event
Similar to other Intel PT synth events, display Event Trace events recorded
by CFE / EVD packets.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15 17:14:05 -03:00
Yao Jin 9edcde68d6 perf script: Fix printing 'phys_addr' failure issue
Perf script was failed to print the phys_addr for SPE profiling.
One 'dummy' event is added by SPE profiling but it doesn't have PHYS_ADDR
attribute set, perf script then exits with error.

Now referring to 'addr', use evsel__do_check_stype() to check the type.

Before:

  # perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
		store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
  # perf script -F pid,tid,addr,phys_addr
  Samples for 'dummy:u' event do not have PHYS_ADDR attribute set. Cannot print 'phys_addr' field.

After:

  # perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
		store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
  # perf script -F pid,tid,addr,phys_addr
  4064384/4064384 ffff802f921be0d0      2f921be0d0
  4064384/4064384 ffff802f921be0d0      2f921be0d0

Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Yao Jin <jinyao5@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220121065954.2121900-1-liwei391@huawei.com
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-22 17:02:08 -03:00
Ian Rogers 6d18804b96 perf cpumap: Give CPUs their own type
A common problem is confusing CPU map indices with the CPU, by wrapping
the CPU with a struct then this is avoided. This approach is similar to
atomic_t.

Committer notes:

To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:

  tools/perf/util/bpf_counter.c
  tools/perf/util/bpf_counter_cgroup.c
  tools/perf/util/bpf_ftrace.c

Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".

Additionally these needed to be fixed for the ARM builds to complete:

  tools/perf/arch/arm/util/cs-etm.c
  tools/perf/arch/arm64/util/pmu.c

Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12 14:28:23 -03:00
Ian Rogers b57af1b401 perf script: Fix flipped index and cpu
perf_counts are accessed by the densely packed index.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-47-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12 14:28:23 -03:00
Ian Rogers f9551b3f62 perf script: Use for each cpu to aid readability
Use perf_cpu_map__for_each_cpu() to help with readability.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-33-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12 14:28:22 -03:00
Arnaldo Carvalho de Melo 65f8d08cf8 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-03 11:54:30 -03:00
Adrian Hunter 5e0c325cdb perf script: Fix CPU filtering of a script's switch events
CPU filtering was not being applied to a script's switch events.

Fixes: 5bf83c29a0 ("perf script: Add scripting operation process_switch()")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211215080636.149562-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-28 17:26:25 -03:00
Alexandre Truong aa8db3e41d perf callchain: Enable dwarf_callchain_users on arm64
Enable dwarf_callchain_users on arm64 which will be needed to do a
DWARF unwind in order to get the caller of the leaf frame.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-5-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-21 18:35:44 -03:00
Alexandre Truong ab23692134 perf script: Use callchain_param_setup() instead of open coded equivalent
Refactoring script__setup_sample_type() by using callchain_param_setup()
to replace the duplicate code for callchain parameter setting up.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-4-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-21 18:35:44 -03:00
German Gomez 83869019c7 perf arch: Support register names from all archs
When reading a perf.data file with register values, there is a mismatch
between the names and the values of the registers because the tool is
built using only the register names from the local architecture.

Reading a perf.data file that was recorded on ARM64, gives the following
erroneous output on an X86 machine:

  # perf report -i perf_arm64.data -D
  [...]
  24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
  ... user regs: mask 0x1ffffffff ABI 64-bit
  .... AX    0x0000ffffd1515817
  .... BX    0x0000ffffd1515480
  .... CX    0x0000aaaadabf6c80
  .... DX    0x000000000000002e
  .... SI    0x0000000040100401
  .... DI    0x0040600200000080
  .... BP    0x0000ffffd1510e10
  .... SP    0x0000000000000000
  .... IP    0x00000000000000dd
  .... FLAGS 0x0000ffffd1510cd0
  .... CS    0x0000000000000000
  .... SS    0x0000000000000030
  .... DS    0x0000ffffa569a208
  .... ES    0x0000000000000000
  .... FS    0x0000000000000000
  .... GS    0x0000000000000000
  .... R8    0x0000aaaad3de9650
  .... R9    0x0000ffffa57397f0
  .... R10   0x0000000000000001
  .... R11   0x0000ffffa57fd000
  .... R12   0x0000ffffd1515817
  .... R13   0x0000ffffd1515480
  .... R14   0x0000aaaadabf6c80
  .... R15   0x0000000000000000
  .... unknown 0x0000000000000001
  .... unknown 0x0000000000000000
  .... unknown 0x0000000000000000
  .... unknown 0x0000000000000000
  .... unknown 0x0000000000000000
  .... unknown 0x0000ffffd1510d90
  .... unknown 0x0000ffffa5739b90
  .... unknown 0x0000ffffd1510d80
  .... XMM0  0x0000ffffa57392c8
   ... thread: perf-exec:43239
   ...... dso: [kernel.kallsyms]

As can be seen, the register names correspond to X86 registers, even
though the perf.data file was recorded on an ARM64 system. After this
patch, the output of the command displays the correct register names:

  # perf report -i perf_arm64.data -D
  [...]
  24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
  ... user regs: mask 0x1ffffffff ABI 64-bit
  .... x0    0x0000ffffd1515817
  .... x1    0x0000ffffd1515480
  .... x2    0x0000aaaadabf6c80
  .... x3    0x000000000000002e
  .... x4    0x0000000040100401
  .... x5    0x0040600200000080
  .... x6    0x0000ffffd1510e10
  .... x7    0x0000000000000000
  .... x8    0x00000000000000dd
  .... x9    0x0000ffffd1510cd0
  .... x10   0x0000000000000000
  .... x11   0x0000000000000030
  .... x12   0x0000ffffa569a208
  .... x13   0x0000000000000000
  .... x14   0x0000000000000000
  .... x15   0x0000000000000000
  .... x16   0x0000aaaad3de9650
  .... x17   0x0000ffffa57397f0
  .... x18   0x0000000000000001
  .... x19   0x0000ffffa57fd000
  .... x20   0x0000ffffd1515817
  .... x21   0x0000ffffd1515480
  .... x22   0x0000aaaadabf6c80
  .... x23   0x0000000000000000
  .... x24   0x0000000000000001
  .... x25   0x0000000000000000
  .... x26   0x0000000000000000
  .... x27   0x0000000000000000
  .... x28   0x0000000000000000
  .... x29   0x0000ffffd1510d90
  .... lr    0x0000ffffa5739b90
  .... sp    0x0000ffffd1510d80
  .... pc    0x0000ffffa57392c8
   ... thread: perf-exec:43239
   ...... dso: [kernel.kallsyms]

Tester comments:

Athira reports:

"Looks good to me. Tested this patchset in powerpc by capturing regs in
powerpc and doing perf report to read the data from x86."

Reported-by: Alexandre Truong <alexandre.truong@arm.com>
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211207180653.1147374-4-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-16 12:18:12 -03:00
James Clark 7cc72553ac perf tools: Check vmlinux/kallsyms arguments in all tools
Only perf report checked the validity of these arguments so apply the
same check to all tools that read them for consistency.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:27:38 -03:00
Arnaldo Carvalho de Melo 875eaa3990 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01 07:10:30 -03:00
Kan Liang 27730c8cd6 perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
-F weight in perf script is broken.

  # ./perf mem record
  # ./perf script -F weight
  Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.

The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.

With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.

Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT

Fixes: ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Song Liu 29c77550ee perf script: Check session->header.env.arch before using it
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.

Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.

Committer notes:

If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.

Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Kan Liang 6ea5d1a3e3 perf script: Support instruction latency
The instruction latency information can be recorded on
some platforms, e.g., the Intel Sapphire Rapids server. With both memory
latency (weight) and the new instruction latency information, users can
easily locate the expensive load instructions, and also understand the time
spent in different stages. The users can optimize their applications in
different pipeline stages.

Add a new field "ins_lat" to filter the instruction latency information,
which is available with sample type PERF_SAMPLE_WEIGHT_STRUCT.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Link: https://lore.kernel.org/r/1632929894-102778-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 09:28:03 -03:00
Adrian Hunter ff6f41fbce perf script: Fix ip display when type != attr->type
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.

The attr_type() function is removed, and the match of attr->type to
output type is corrected.

Example on ADL using taskset to select an atom cpu:

 # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]

 Before:

  # perf script | head
         taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
           uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:

 After:

  # perf script | head
         taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
           uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Stephen Brennan 538d9c1829 perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta event
perf_events may sometimes throttle an event due to creating too many
samples during a given timer tick.

As of now, the perf tool will not report on throttling, which means this
is a silent error.

Implement a callback for the throttle and unthrottle events within the
Python scripting engine, which can allow scripts to detect and report
when events may have been lost due to throttling.

The simplest script to report throttle events is:

  def throttle(*args):
      print("throttle" + repr(args))

  def unthrottle(*args):
      print("unthrottle" + repr(args))

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210901210815.133251-1-stephen.s.brennan@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:18:25 -03:00
Adrian Hunter 29159727aa perf script: Fix unnecessary machine_resolve()
machine_resolve() may have already been called. Test for that to avoid
calling it again unnecessarily.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:33:59 -03:00
Namhyung Kim 2681bd85a4 perf tools: Remove repipe argument from perf_session__new()
The repipe argument is only used by perf inject and the all others
passes 'false'.  Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.

This is a preparation of the change the pipe input/output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:06:51 -03:00
Riccardo Mancini faf3ac305d perf script: Fix memory 'threads' and 'cpus' leaks on exit
ASan reports several memory leaks while running:

  # perf test "82: Use vfs_getname probe to get syscall args filenames"

Two of these are caused by some refcounts not being decreased on
perf-script exit, namely script.threads and script.cpus.

This patch adds the missing __put calls in a new perf_script__exit
function, which is called at the end of cmd_script.

This patch concludes the fixes of all remaining memory leaks in perf
test "82: Use vfs_getname probe to get syscall args filenames".

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: cfc8874a48 ("perf script: Process cpu/threads maps")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/5ee73b19791c6fa9d24c4d57f4ac1a23609400d7.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:28:14 -03:00
Riccardo Mancini 1b1f57cf9e perf script: Release zstd data
ASan reports several memory leak while running:

  # perf test "82: Use vfs_getname probe to get syscall args filenames"

One of the leaks is caused by zstd data not being released on exit in
perf-script.

This patch adds the missing zstd_fini().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: b13b04d938 ("perf script: Initialize zstd_data")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/39388e8cc2f85ca219ea18697a17b7bd8f74b693.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Jiri Olsa fba7c86601 libperf: Move 'leader' from tools/perf to perf_evsel::leader
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

  struct evsel *evsel__leader(struct evsel *evsel);
  - get leader evsel

  bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
  - true if evsel has leader as leader

  bool evsel__is_leader(struct evsel *evsel);
  - true if evsel is itw own leader

  void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
  - set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

  tools/perf/util/bpf_counter.c

  -       if (evsel->leader->core.nr_members > 1) {
  +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:31 -03:00
Adrian Hunter 3d032a2516 perf script: Add option to pass arguments to dlfilters
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than 1 argument.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter 638e2b9984 perf script: Add option to list dlfilters
Add option --list-dlfilters to list dlfilters in the current directory or
the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
come before option --list-dlfilters) to show long descriptions.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter 9bde93a79a perf script: Add dlfilter__filter_event_early()
filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter 291961fc3c perf script: Add API for filtering via dynamically loaded shared object
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter b743b86ce6 perf script: Share addr_al between functions
Share the addr_location of 'addr' so that it need not be resolved more than
once.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 15:21:54 -03:00
Adrian Hunter 4371fbc0c9 perf script: Move filtering before scripting
To make it possible to use filtering with scripts, move filtering before
scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 15:18:15 -03:00
Adrian Hunter 9300041c66 perf script: Move filter_cpu() earlier
Generally, it should be more efficient if filter_cpu() comes before
machine__resolve() because filter_cpu() is much less code than
machine__resolve().

Example:

 $ perf record --sample-cpu -- make -C tools/perf >/dev/null

Before:

 $ perf stat -- perf script -C 0 >/dev/null

  Performance counter stats for 'perf script -C 0':

            116.94 msec task-clock                #    0.992 CPUs utilized
                 2      context-switches          #   17.103 /sec
                 0      cpu-migrations            #    0.000 /sec
             8,187      page-faults               #   70.011 K/sec
       478,351,812      cycles                    #    4.091 GHz
       564,785,464      instructions              #    1.18  insn per cycle
       114,341,105      branches                  #  977.789 M/sec
         2,615,495      branch-misses             #    2.29% of all branches

       0.117840576 seconds time elapsed

       0.085040000 seconds user
       0.032396000 seconds sys

After:

 $ perf stat -- perf script -C 0 >/dev/null

  Performance counter stats for 'perf script -C 0':

            107.45 msec task-clock                #    0.992 CPUs utilized
                 3      context-switches          #   27.919 /sec
                 0      cpu-migrations            #    0.000 /sec
             7,964      page-faults               #   74.117 K/sec
       438,417,260      cycles                    #    4.080 GHz
       522,571,855      instructions              #    1.19  insn per cycle
       105,187,488      branches                  #  978.921 M/sec
         2,356,261      branch-misses             #    2.24% of all branches

       0.108282546 seconds time elapsed

       0.095935000 seconds user
       0.011991000 seconds sys

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 15:15:42 -03:00
Adrian Hunter d9ae9c9776 perf script: Factor out script_fetch_insn()
Factor out script_fetch_insn() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:46 -03:00
Adrian Hunter 67e50ce0e3 perf scripting: Add perf_session to scripting_context
This is preparation for allowing a script to set the itrace options
for the session if they have not already been set.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:17 -03:00
Adrian Hunter 2ede92173f perf scripting python: Add auxtrace error
Add auxtrace_error to general python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter 54cd8b0324 perf script: Factor out perf_sample__sprintf_flags()
Factor out perf_sample__sprintf_flags() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter 3f8e009e01 perf scripting python: Add 'addr_location' for 'addr'
If sample addr correlates to a symbol, add  "addr_dso", "addr_symbol", and
"addr_symoff" to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter 6ea4b5dbe0 perf script: Find script file relative to exec path
Allow perf script to find a script in the exec path.

Example:

Before:

 $ perf record -a -e intel_pt/branch=0/ sleep 0.1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.954 MB perf.data ]
 $ perf script intel-pt-events.py 2>&1 | head -3
   Error: Couldn't find script `intel-pt-events.py'
   See perf script -l for available scripts.
 $ perf script -s intel-pt-events.py 2>&1 | head -3
 Can't open python script "intel-pt-events.py": No such file or directory
 $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
   Error: Couldn't find script `/home/ahunter/libexec/perf-core/scripts/python/intel-pt-events.py'
   See perf script -l for available scripts.
 $

After:

 $ perf script intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $ perf script -s intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210524065718.11421-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 09:51:44 -03:00
Ingo Molnar 4d39c89f0b perf tools: Fix various typos in comments
Fix ~124 single-word typos and a few spelling errors in the perf tooling code,
accumulated over the years.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210321113734.GA248990@gmail.com
Link: http://lore.kernel.org/lkml/20210323160915.GA61903@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-23 17:13:43 -03:00
Arnaldo Carvalho de Melo 297e69bfa4 perf script: Fixup 'struct evsel_script' method prefix
They all operate on 'struct evsel_script' instances, so should be
prefixed with evsel_script__, not with perf_evsel_script__.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-03-09 08:59:21 -03:00
Adrian Hunter c025d46cd9 perf script: Add branch types for VM-Entry and VM-Exit
In preparation to support Intel PT decoding of virtual machine traces, add
branch types for VM-Entry and VM-Exit.

Note they are both treated as "calls" because the VM-Exit transfers control
to a different address.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-18 16:12:51 -03:00
Adrian Hunter c840cbfeff perf intel-pt: Add PSB events
Emitting a PSB+ can cause a CPU a slight delay. When doing timing analysis
of code with Intel PT, it is useful to know if a timing bubble was caused
by Intel PT or not. Add reporting of PSB events via perf script. PSB
events are printed with the existing itrace 'p' option which also prints
power and frequency changes. The PSB event contains the trace offset at
which the PSB occurs, to allow easy reference back to the PSB+ packets.

The PSB event timestamp is always the timestamp from the PSB+ TSC
packet, and the ip is always the address from the PSB+ FUP packet.

The code changes are non-trivial because the decoder must walk to the
PSB+ FUP address before outputting the PSB event.

Example:

  $ perf record -e intel_pt/cyc,psb_period=0/u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.046 MB perf.data ]
  $ perf script --itrace=p --ns
     perf 17981 [006] 25617.510820383:  psb:  psb offs: 0                               0 [unknown] ([unknown])
     perf 17981 [006] 25617.510820383:  cbr:  cbr: 42 freq: 4219 MHz (156%)             0 [unknown] ([unknown])
    uname 17981 [006] 25617.510889753:  psb:  psb offs: 0xb50                7f78c12a212e __GI___tunables_init+0xee (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.510899162:  psb:  psb offs: 0x12d0               7f78c128af1c dl_main+0x93c (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.510939242:  psb:  psb offs: 0x1a50               7f78c128eefc _dl_map_object_from_fd+0x13c (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.510981274:  psb:  psb offs: 0x21c8               7f78c1296307 _dl_relocate_object+0x927 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.510993034:  psb:  psb offs: 0x2948               7f78c12940e4 _dl_lookup_symbol_x+0x14 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.511003871:  psb:  psb offs: 0x30c8               7f78c12937b3 do_lookup_x+0x2f3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.511019854:  psb:  psb offs: 0x3850               7f78c1295eed _dl_relocate_object+0x50d (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.511029015:  psb:  psb offs: 0x4390               7f78c12a855a strcmp+0xf6a (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
    uname 17981 [006] 25617.511064876:  psb:  psb offs: 0x4b10                          0 [unknown] ([unknown])
    uname 17981 [006] 25617.511080762:  psb:  psb offs: 0x5290               7f78c11db53d _dl_addr+0x13d (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511086035:  psb:  psb offs: 0x5a08               7f78c11db538 _dl_addr+0x138 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511091381:  psb:  psb offs: 0x6190               7f78c11db534 _dl_addr+0x134 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511096681:  psb:  psb offs: 0x6910               7f78c11db4c3 _dl_addr+0xc3 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511119520:  psb:  psb offs: 0x7090               7f78c10ada5e _nl_intern_locale_data+0x12e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511126584:  psb:  psb offs: 0x7818               7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511132775:  psb:  psb offs: 0x8358               7f78c10c20c0 getenv+0xa0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511134598:  psb:  psb offs: 0x8ad0               7f78c10ada09 _nl_intern_locale_data+0xd9 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511135685:  psb:  psb offs: 0x9258               7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511138322:  psb:  psb offs: 0x99d0               7f78c11fffd9 __strncmp_avx2+0x39 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
    uname 17981 [006] 25617.511158907:  psb:  psb offs: 0xa150                          0 [unknown] ([unknown])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210205175350.23817-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-18 16:04:10 -03:00
Yang Li 8524711d2c perf script: Simplify bool conversion
Fix the following coccicheck warning:
  ./tools/perf/builtin-script.c:2789:36-41: WARNING: conversion to bool
  not needed here
  ./tools/perf/builtin-script.c:3237:48-53: WARNING: conversion to bool
  not needed here

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1612773936-98691-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-09 14:58:56 -03:00
Jin Yao 61d9fc4449 perf script: Support filtering by hex address
'perf script' supports '-S' or '--symbol' options to only list the
records for these symbols. A symbol is typically a name or hex address.
If it's hex address, it is the start address of one symbol.

While it would be useful if we can filter trace records by any hex
address (not only the start address of symbol). So now we support
filtering trace records by more conditions, such as:

- symbol name
- start address of symbol
- any hexadecimal address
- address range

The comparison order is defined as:

1. symbol name comparison
2. symbol start address comparison.
3. any hexadecimal address comparison.
4. address range comparison.

The idea is if we can get a valid address from -S list, we add the
address to addr_list for address comparison otherwise we still leave
it to sym_list for symbol comparison.

Some examples:

  root@kbl-ppc:~# ./perf script -S ffffffff9a477308
            perf  8562 [000] 347303.578858:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [000] 347303.578860:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [000] 347303.578861:         11   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [001] 347303.578903:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [001] 347303.578905:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [001] 347303.578906:         15   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [002] 347303.578952:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
            perf  8562 [002] 347303.578953:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])

Filter the traced records by hex address ffffffff9a477308.

  root@kbl-ppc:~# ./perf script -S ffffffff9a4dd4ce,ffffffff9a4d2de9,ffffffff9a6bf9f4
            perf  8562 [001] 347303.578911:     311706   cycles:  ffffffff9a6bf9f4 __kmalloc_node+0x204 ([kernel.kallsyms])
            perf  8562 [002] 347303.578960:     354477   cycles:  ffffffff9a4d2de9 sched_setaffinity+0x49 ([kernel.kallsyms])
            perf  8562 [003] 347303.579015:     450958   cycles:  ffffffff9a4dd4ce dequeue_task_fair+0x1ae ([kernel.kallsyms])

Filter the traced records by hex address ffffffff9a4dd4ce, ffffffff9a4d2de9, ffffffff9a6bf9f4.

  root@kbl-ppc:~# ./perf script -S ffffffff9a477309 --addr-range 16
            perf  8562 [000] 347303.578863:        291   cycles:  ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
            perf  8562 [001] 347303.578907:        411   cycles:  ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
            perf  8562 [002] 347303.578956:        462   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
            perf  8562 [003] 347303.579010:        497   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
            perf  8562 [004] 347303.579059:        429   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
            perf  8562 [005] 347303.579109:        408   cycles:  ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
            perf  8562 [006] 347303.579159:        460   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
            perf  8562 [007] 347303.579213:        436   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])

Filter the traced records from address range [ffffffff9a477309, ffffffff9a477309 + 15].

  root@kbl-ppc:~# ./perf script -S "ffffffff9b163046,rcu_nmi_exit"
            perf  8562 [004] 347303.579060:      12013   cycles:  ffffffff9b163046 exc_nmi+0x166 ([kernel.kallsyms])
            perf  8562 [007] 347303.579214:      12138   cycles:  ffffffff9b165944 rcu_nmi_exit+0x34 ([kernel.kallsyms])

Filter by address + symbol

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210207080935.31784-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08 17:09:11 -03:00
Jin Yao 4b799a9b77 perf script: Support DSO filter like in other perf tools
Other perf tool builtins already supported a DSO filter.

For example:

  $ perf report --dsos a,b,c

which only considers symbols in these dsos.

Now the DSO filter is supported in 'perf script':

  root@kbl-ppc:~# ./perf script --dsos "[kernel.kallsyms]"
            perf 18123 [000] 6142863.075104:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
            perf 18123 [000] 6142863.075107:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
            perf 18123 [000] 6142863.075108:         10   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
            perf 18123 [000] 6142863.075109:        273   cycles:  ffffffff9ca7730a native_write_msr+0xa ([kernel.kallsyms])
            perf 18123 [000] 6142863.075110:       7684   cycles:  ffffffff9ca3c9c0 native_sched_clock+0x50 ([kernel.kallsyms])
            perf 18123 [000] 6142863.075112:     213017   cycles:  ffffffff9d765a92 syscall_exit_to_user_mode+0x32 ([kernel.kallsyms])
            perf 18123 [001] 6142863.075156:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
            perf 18123 [001] 6142863.075158:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
            perf 18123 [001] 6142863.075159:         17   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])

Committer testing:

  $ perf script
                ls 2364888 29303.010949:          1 cycles:u:  ffffffffa4bbc6a9 [unknown] ([unknown])
                ls 2364888 29303.010957:          1 cycles:u:  ffffffffa429ef48 [unknown] ([unknown])
                ls 2364888 29303.010961:          1 cycles:u:  ffffffffa4260133 [unknown] ([unknown])
                ls 2364888 29303.010964:          5 cycles:u:  ffffffffa429efad [unknown] ([unknown])
                ls 2364888 29303.010967:         41 cycles:u:  ffffffffa42a4586 [unknown] ([unknown])
                ls 2364888 29303.010972:        435 cycles:u:  ffffffffa429efe0 [unknown] ([unknown])
                ls 2364888 29303.010978:       5142 cycles:u:      7f9b95bc2abf __GI___tunables_init+0x11f (/usr/lib64/ld-2.32.so)
                ls 2364888 29303.011006:      38551 cycles:u:  ffffffffa4290f61 [unknown] ([unknown])
                ls 2364888 29303.011486:     238234 cycles:u:      7f9b95bb7741 _dl_relocate_object+0xa71 (/usr/lib64/ld-2.32.so)
                ls 2364888 29303.011937:     415870 cycles:u:      7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
  $

Before:

  $ perf script --dsos /usr/lib64/libc-2.32.so |& head -5
    Error: unknown option `dsos'

   Usage: perf script [<options>]
      or: perf script [<options>] record <script> [<record-options>] <command>
      or: perf script [<options>] report <script> [script-args]
  $

After:

  $ perf script --dsos /usr/lib64/libc-2.32.so
                ls 2364888 29303.011937:     415870 cycles:u:      7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
  $

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210124232750.19170-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03 13:10:43 -03:00
Arnaldo Carvalho de Melo 70f0ba9f24 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-27 16:48:04 -03:00
Jin Yao 8adc0a06d6 perf script: Fix overrun issue for dynamically-allocated PMU type number
When unpacking the event which is from dynamic PMU, the array
output[OUTPUT_TYPE_MAX] may be overrun. For example, type number of SKL
uncore_imc is 10, but OUTPUT_TYPE_MAX is 7 now (OUTPUT_TYPE_MAX =
PERF_TYPE_MAX + 1).

/* In builtin-script.c */

process_event()
{
        unsigned int type = output_type(attr->type);

        if (output[type].fields == 0)
                return;
}

output[10] is overrun.

Create a type OUTPUT_TYPE_OTHER for dynamic PMU events, then
output_type(attr->type) will return OUTPUT_TYPE_OTHER here.

Note that if PERF_TYPE_MAX ever changed, then there would be a conflict
between old perf.data files that had a dynamicaliy allocated PMU number
that would then be the same as a fixed PERF_TYPE.

Example:

  # perf record --switch-events -C 0 -e "{cpu-clock,uncore_imc/data_reads/,uncore_imc/data_writes/}:SD" -a -- sleep 1
  # perf script

  Before:
         swapper     0 [000] 1479253.987551:     277766               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.987797:     246709               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.988127:     329883               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.988273:     146393               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.988523:     249977               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.988877:     354090               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.989023:     145940               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.989383:     359856               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1479253.989523:     140082               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])

  After:
         swapper     0 [000] 1397040.402011:     272384               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1397040.402011:       5396  uncore_imc/data_reads/:
         swapper     0 [000] 1397040.402011:        967 uncore_imc/data_writes/:
         swapper     0 [000] 1397040.402259:     249153               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1397040.402259:       7231  uncore_imc/data_reads/:
         swapper     0 [000] 1397040.402259:       1297 uncore_imc/data_writes/:
         swapper     0 [000] 1397040.402508:     249108               cpu-clock:  ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
         swapper     0 [000] 1397040.402508:       5333  uncore_imc/data_reads/:
         swapper     0 [000] 1397040.402508:       1008 uncore_imc/data_writes/:

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: https://lore.kernel.org/r/20201209005828.21302-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-21 17:25:33 -03:00
Stephane Eranian c513de8a70 perf script: Add support for PERF_SAMPLE_CODE_PAGE_SIZE
Display sampled code page sizes when PERF_SAMPLE_CODE_PAGE_SIZE was set.

For example:

  # perf script --fields comm,event,ip,code_page_size
            dtlb mem-loads:uP:            445777 4K
            dtlb mem-loads:uP:            40f724 4K
            dtlb mem-loads:uP:            474926 4K
            dtlb mem-loads:uP:            401075 4K
            dtlb mem-loads:uP:            401095 4K
            dtlb mem-loads:uP:            401095 4K
            dtlb mem-loads:uP:            4010cc 4K
            dtlb mem-loads:uP:            440b6f 4K
  #

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210105195752.43489-5-kan.liang@linux.intel.com
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-20 14:34:20 -03:00
Kan Liang 6b9bae63de perf script: Support data page size
Display the data page size if it is available and asked by the user:

Can be configured by the user, for example:

  perf script --fields comm,event,phys_addr,data_page_size
            dtlb mem-loads:uP:        3fec82ea8 4K
            dtlb mem-loads:uP:        3fec82e90 4K
            dtlb mem-loads:uP:        3e23700a4 4K
            dtlb mem-loads:uP:        3fec82f20 4K
            dtlb mem-loads:uP:        3e23700a4 4K
            dtlb mem-loads:uP:        3b4211bec 4K
            dtlb mem-loads:uP:        382205dc0 2M
            dtlb mem-loads:uP:        36fa082c0 2M
            dtlb mem-loads:uP:        377607340 2M
            dtlb mem-loads:uP:        330010180 2M
            dtlb mem-loads:uP:        33200fd80 2M
            dtlb mem-loads:uP:        31b012b80 2M

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20201216185805.9981-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-19 17:04:39 -03:00
Arnaldo Carvalho de Melo 3ccf8a7b66 perf evlist: Use the right prefix for 'struct evlist' sample id lookup methods
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-30 14:17:57 -03:00
Arnaldo Carvalho de Melo 53f5e9084d perf evlist: Use the right prefix for 'struct evlist' stats methods
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-30 09:31:04 -03:00
Adrian Hunter fc18380fb9 perf script: Display negative tid in non-sample events
The kernel can release tasks while they are still running. This can
result in a task having no tid, in which case perf records a tid of -1.
Improve the perf script output in that case.

Example:

Before:

  # cat ./autoreap.c

  #include <sys/types.h>
  #include <unistd.h>
  #include <sys/wait.h>
  #include <signal.h>

  struct sigaction act = {
          .sa_handler = SIG_IGN,
  };

  int main()
  {
          pid_t child;
          int status = 0;

          sigaction(SIGCHLD, &act, NULL);
          child = fork();
          if (child == 0)
                  return 123;
          wait(&status);
          return 0;
  }

  # gcc -o autoreap autoreap.c
  # ./perf record -a -e dummy --switch-events ./autoreap
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.948 MB perf.data ]
  # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
           swapper     0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
              perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
          autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
          autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25191/25191
          autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
           swapper     0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:    11/11
       rcu_preempt    11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
       rcu_preempt    11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    11/11
          autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
  PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 4294967295/4294967295
           swapper     0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
          autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
          autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25188/25188

After:

  # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
           swapper     0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
              perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
          autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
          autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25191/25191
          autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
           swapper     0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:    11/11
       rcu_preempt    11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
       rcu_preempt    11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    11/11
          autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
               :-1    -1 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    -1/-1
           swapper     0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
          autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
          autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25188/25188

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lore.kernel.org/lkml/20200909084923.9096-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 16:06:22 -03:00
Jiri Olsa e534bfb164 perf script: Add 'tod' field to display time of day
Add a 'tod' field to display time of day column with time of date
(wallclock) time.

  # perf record -k CLOCK_MONOTONIC kill
  kill: not enough arguments
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.033 MB perf.data (8 samples) ]

  # perf script
            perf 261340 152919.481538:          1 cycles:  ffffffff8106d104 ...
            perf 261340 152919.481543:          1 cycles:  ffffffff8106d104 ...
            perf 261340 152919.481545:          7 cycles:  ffffffff8106d104 ...
  ...

  # perf script --ns
            perf 261340 152919.481538922:          1 cycles:  ffffffff8106d ...
            perf 261340 152919.481543286:          1 cycles:  ffffffff8106d ...
            perf 261340 152919.481545397:          7 cycles:  ffffffff8106d ...
  ...

  # perf script -F+tod
            perf 261340 2020-07-13 18:26:55.620971 152919.481538:           ...
            perf 261340 2020-07-13 18:26:55.620975 152919.481543:           ...
            perf 261340 2020-07-13 18:26:55.620978 152919.481545:           ...
  ...

  # perf script -F+tod --ns
            perf 261340 2020-07-13 18:26:55.620971621 152919.481538922:     ...
            perf 261340 2020-07-13 18:26:55.620975985 152919.481543286:     ...
            perf 261340 2020-07-13 18:26:55.620978096 152919.481545397:     ...
  ...

It's available only for recording with clockid specified, because it's
the only case where we can get reference time to wallclock time. It's
can't do that with perf clock yet.

Error is display if you want to use --tod on data without clockid
specified:

  # perf script -F+tod
  Can't provide 'tod' time, missing clock data. Please record with -k/--clockid option.

Original-patch-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:45:23 -03:00
Jiri Olsa 60e5eeb56a perf script: Change the 'enum perf_output_field' enumerators to be 64 bits
So it's possible to add new values. I did not find any place where the
enum values are passed through some number type, so it's safe to make
this change.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:44:34 -03:00
Adrian Hunter 7eeb9855c1 perf script: Show text poke address symbol
It is generally more useful to show the symbol with an address. In this
case, the print function requires the 'machine' which means changing
callers to provide it as a parameter. It is optional because most events
do not need it and the callers that matter can provide it.

Committer notes:

Made 'union perf_event' continue to be the first parameter to the
perf_event__fprintf() and perf_event__fprintf_text_poke() events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-10 08:39:14 -03:00
Adrian Hunter 92ecf3a64f perf script: Add option --show-text-poke-events
Consistent with other new events, add an option to perf script to
display text poke events and ksymbol events. Both text poke events and
ksymbol events are displayed because some text pokes (e.g. ftrace
trampolines) have corresponding ksymbol events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-10 08:31:45 -03:00
Arnaldo Carvalho de Melo facbf0b982 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes and move perf/core forward, minor conflict as
perf_evlist__add_dummy() lost its 'perf_' prefix as it operates on a
'struct evlist', not on a 'struct perf_evlist', i.e. its tools/perf/
specific, it is not in libperf.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-08 13:49:15 -03:00
Adrian Hunter add07ccd92 perf intel-pt: Fix displaying PEBS-via-PT with registers
After recording PEBS-via-PT, perf script will not accept 'iregs' field e.g.

 # perf record -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
 ...
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.062 MB perf.data ]
 # ./perf script --itrace=eop -F+iregs
 Samples for 'dummy:u' event do not have IREGS attribute set. Cannot print 'iregs' field.

Fix by using allow_user_set, which is true when recording AUX area data.

Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-06 09:03:39 -03:00
Arnaldo Carvalho de Melo 92c7d7cdf4 perf evlist: Fix the class prefix for 'struct evlist' branch_type methods
To differentiate from libperf's 'struct perf_evlist' methods.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-22 16:28:10 -03:00
Arnaldo Carvalho de Melo b3c2cc2bd2 perf evlist: Fix the class prefix for 'struct evlist' sample_type methods
To differentiate from libperf's 'struct perf_evlist' methods.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-22 16:28:09 -03:00
Arnaldo Carvalho de Melo afdd63f593 perf script: Fixup some evsel/evlist method names
Fixups related to the introduction of libperf, where the
perf_{evsel,evlist}__ prefix is reserved for functions operating on
struct perf_{evsel,evlist}.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-22 16:28:09 -03:00
Milian Wolff b13b04d938 perf script: Initialize zstd_data
Fixes segmentation fault when trying to interpret zstd-compressed data
with perf script:

```
  $ perf record -z ls
  ...
  [ perf record: Captured and wrote 0,010 MB perf.data, compressed (original 0,001 MB, ratio is 2,190) ]
  $ memcheck perf script
  ...
  ==67911== Invalid read of size 4
  ==67911==    at 0x5568188: ZSTD_decompressStream (in /usr/lib/libzstd.so.1.4.5)
  ==67911==    by 0x6E726B: zstd_decompress_stream (zstd.c:100)
  ==67911==    by 0x65729C: perf_session__process_compressed_event (session.c:72)
  ==67911==    by 0x6598E8: perf_session__process_user_event (session.c:1583)
  ==67911==    by 0x65BA59: reader__process_events (session.c:2177)
  ==67911==    by 0x65BA59: __perf_session__process_events (session.c:2234)
  ==67911==    by 0x65BA59: perf_session__process_events (session.c:2267)
  ==67911==    by 0x5A7397: __cmd_script (builtin-script.c:2447)
  ==67911==    by 0x5A7397: cmd_script (builtin-script.c:3840)
  ==67911==    by 0x5FE9D2: run_builtin (perf.c:312)
  ==67911==    by 0x711627: handle_internal_command (perf.c:364)
  ==67911==    by 0x711627: run_argv (perf.c:408)
  ==67911==    by 0x711627: main (perf.c:538)
  ==67911==  Address 0x71d8 is not stack'd, malloc'd or (recently) free'd
```

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 20200612230333.72140-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-17 13:19:37 -03:00
Adrian Hunter b51640854d perf script: Fix --call-trace for Intel PT
Make process_attr() respect -F-ip, noting also that the condition in
process_attr() (callchain_param.record_mode != CALLCHAIN_NONE) is always
true so test the sample type directly.

Example:

  Before:

    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.033 MB perf.data ]
    $ perf script --call-trace | head -5
           uname 30992 [006] 41758.313696574:  cbr: 42 freq: 4219 MHz (156%)                    0 [unknown] ([unknown]                                         )
           uname 30992 [006] 41758.313696907: _start                               7f71792c4100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )
           uname 30992 [006] 41758.313699574:     _dl_start                        7f71792c4103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )
           uname 30992 [006] 41758.313699907:     _dl_start                        7f71792c4e18 _dl_start+0x28 (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )
           uname 30992 [006] 41758.313701574:     _dl_start                        7f71792c5128 _dl_start+0x338 (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )

  After:

    $ perf script --call-trace | head -5
           uname 30992 [006] 41758.313696574:  cbr: 42 freq: 4219 MHz (156%)
           uname 30992 [006] 41758.313696907: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )      _start
           uname 30992 [006] 41758.313699574: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          _dl_start
           uname 30992 [006] 41758.313699907: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          _dl_start
           uname 30992 [006] 41758.313701574: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          _dl_start

Fixes: f288e8e1aa4f ("perf script: Enable IP fields for callchains")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200527180250.16723-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 11:31:13 -03:00
Andi Kleen 8c3e05c827 perf script: Don't force less for non tty output with --xed
--xed currently forces less. When piping the output to other scripts
this can waste a lot of CPU time because less is rather slow.
I've seen it using up a full core on its own in a pipeline.
Only force less when the output is actually a terminal.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20200522020914.527564-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:28 -03:00
Gustavo A. R. Silva 6549a8c0c3 perf tools: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array
member[1][2], introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by this
change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200515172926.GA31976@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:27 -03:00
Jiri Olsa 53fb18941d perf script: Enable IP fields for callchains
In case the callchains were deleted in pipe mode, we need to ensure that
the IP fields are enabled, otherwise the callchain is not displayed.

Enabling IP and SYM, which should be enough for callchains.

Committer testing:

Before:

Committer Testing:

before:

  # ls
  # perf record -g -e 'syscalls:*' sleep 0.1 2>/dev/null | perf script | tail
       sleep 5677 [0] 5034.295882:         syscalls:sys_exit_mmap: 0x7fcbcfa74000
       sleep 5677 [0] 5034.295885:       syscalls:sys_enter_close: fd: 0x00000003
       sleep 5677 [0] 5034.295886:        syscalls:sys_exit_close: 0x0
       sleep 5677 [0] 5034.295911:   syscalls:sys_enter_nanosleep: rqtp: 0x7fff775b33a0, rmtp: 0x00000000
       sleep 5677 [0] 5034.396021:    syscalls:sys_exit_nanosleep: 0x0
       sleep 5677 [0] 5034.396027:       syscalls:sys_enter_close: fd: 0x00000001
       sleep 5677 [0] 5034.396028:        syscalls:sys_exit_close: 0x0
       sleep 5677 [0] 5034.396029:       syscalls:sys_enter_close: fd: 0x00000002
       sleep 5677 [0] 5034.396029:        syscalls:sys_exit_close: 0x0
       sleep 5677 [0] 5034.396032:  syscalls:sys_enter_exit_group: error_code: 0x00000000
  #
  # ls
  #

After:

  # perf record --call-graph=dwarf -e 'syscalls:sys_enter*' sleep 0.1 2>/dev/null | perf script | tail -37
  sleep 33010 [000]  5400.625269:              syscalls:sys_enter_nanosleep: rqtp: 0x7fff2d0e7860, rmtp: 0x00000000
  	    7f1406f131a7 __GI___nanosleep (inlined)
  	    561c4f996966 [unknown]
  	    561c4f99673f [unknown]
  	    561c4f9937af [unknown]
  	    7f1406e6c1a2 __libc_start_main
  	    561c4f99388d [unknown]

  sleep 33010 [000]  5400.725391:                  syscalls:sys_enter_close: fd: 0x00000001
  	    7f1406f3c3cb __GI___close_nocancel (inlined)
  	    7f1406ec7d6f _IO_new_file_close_it (inlined)
  	    7f1406ebafa5 _IO_new_fclose (inlined)
  	    561c4f996a40 [unknown]
  	    561c4f993d79 [unknown]
  	    7f1406e83e86 __run_exit_handlers
  	    7f1406e8403f __GI_exit (inlined)
  	    7f1406e6c1a9 __libc_start_main
  	    561c4f99388d [unknown]

  sleep 33010 [000]  5400.725395:                  syscalls:sys_enter_close: fd: 0x00000002
  	    7f1406f3c3cb __GI___close_nocancel (inlined)
  	    7f1406ec7d6f _IO_new_file_close_it (inlined)
  	    7f1406ebafa5 _IO_new_fclose (inlined)
  	    561c4f996a40 [unknown]
  	    561c4f993da2 [unknown]
  	    7f1406e83e86 __run_exit_handlers
  	    7f1406e8403f __GI_exit (inlined)
  	    7f1406e6c1a9 __libc_start_main
  	    561c4f99388d [unknown]

  sleep 33010 [000]  5400.725399:             syscalls:sys_enter_exit_group: error_code: 0x00000000
  	    7f1406f13466 __GI__exit (inlined)
  	    7f1406e83fa1 __run_exit_handlers
  	    7f1406e8403f __GI_exit (inlined)
  	    7f1406e6c1a9 __libc_start_main
  	    561c4f99388d [unknown]
  #

And, if we install coreutils-debuginfo, we'll have those [unknown] resolved,
those are for the /usr/bin/sleep binary, use:

  # dnf debuginfo-install coreutils

On Fedora and derivatives, then:

  # perf record --call-graph=dwarf -e 'syscalls:sys_enter*' sleep 0.1 2>/dev/null | perf script | tail -37
  sleep 33046 [009]  5533.910074:              syscalls:sys_enter_nanosleep: rqtp: 0x7ffea6fa7ab0, rmtp: 0x00000000
  	    7f5f786e81a7 __GI___nanosleep (inlined)
  	    564472454966 rpl_nanosleep
  	    56447245473f xnanosleep
  	    5644724517af main
  	    7f5f786411a2 __libc_start_main
  	    56447245188d _start

  sleep 33046 [009]  5534.010218:                  syscalls:sys_enter_close: fd: 0x00000001
  	    7f5f787113cb __GI___close_nocancel (inlined)
  	    7f5f7869cd6f _IO_new_file_close_it (inlined)
  	    7f5f7868ffa5 _IO_new_fclose (inlined)
  	    564472454a40 close_stream
  	    564472451d79 close_stdout
  	    7f5f78658e86 __run_exit_handlers
  	    7f5f7865903f __GI_exit (inlined)
  	    7f5f786411a9 __libc_start_main
  	    56447245188d _start

  sleep 33046 [009]  5534.010224:                  syscalls:sys_enter_close: fd: 0x00000002
  	    7f5f787113cb __GI___close_nocancel (inlined)
  	    7f5f7869cd6f _IO_new_file_close_it (inlined)
  	    7f5f7868ffa5 _IO_new_fclose (inlined)
  	    564472454a40 close_stream
  	    564472451da2 close_stdout
  	    7f5f78658e86 __run_exit_handlers
  	    7f5f7865903f __GI_exit (inlined)
  	    7f5f786411a9 __libc_start_main
  	    56447245188d _start

  sleep 33046 [009]  5534.010229:             syscalls:sys_enter_exit_group: error_code: 0x00000000
  	    7f5f786e8466 __GI__exit (inlined)
  	    7f5f78658fa1 __run_exit_handlers
  	    7f5f7865903f __GI_exit (inlined)
  	    7f5f786411a9 __libc_start_main
  	    56447245188d _start

  #

Reported-by: Paul Khuong <pvk@pvk.ca>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200507095024.2789147-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:25 -03:00