Commit 2322d6c5 authored by Linus Torvalds

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf tooling updates from Thomas Gleixner:
 "Perf tool updates and fixes:

  perf stat:

   - Display user and system time for workload targets (Jiri Olsa)

  perf record:

   - Enable arbitrary event names thru name= modifier (Alexey Budankov)

  PowerPC:

   - Add a python script for hypervisor call statistics (Ravi Bangoria)

  Intel PT: (Adrian Hunter)

   - Fix sync_switch INTEL_PT_SS_NOT_TRACING

   - Fix decoding to accept CBR between FUP and corresponding TIP

   - Fix MTC timing after overflow

   - Fix "Unexpected indirect branch" error

  perf test:

   - record+probe_libc_inet_pton:
      - To get the symbol table for dynamic shared objects on ubuntu we
        need to pass the -D/--dynamic command line option, unlike with
        the fedora distros (Arnaldo Carvalho de Melo)

   - code-reading:
      - Fix perf_env setup for PTI entry trampolines (Adrian Hunter)

   - kmod-path:
      - Add tests for vdso32 and vdsox32 (Adrian Hunter)

   - Use header file util/debug.h (Thomas Richter)

  perf annotate:

   - Make the various UI backends (stdio, TUI, gtk) use more
     consistently structs with annotation options as specified by the
     user (Arnaldo Carvalho de Melo)

   - Move annotation specific knobs from the symbol_conf global kitchen
     sink to the annotation option structs (Arnaldo Carvalho de Melo)

  perf script:

   - Add more PMU fields to python scripts event handler dict (Jin Yao)

  Core:

   - Fix misleading error for some unparsable events mentioning PMUs
     when those are not involved in the problem (Jiri Olsa)

   - Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
     (Arnaldo Carvalho de Melo)

   - Be more robust when trying to use per-symbol histograms, checking
     for unlikely but possible cases where the space for the histograms
     wasn't allocated, print a debug message for such cases (Arnaldo
     Carvalho de Melo)

   - Fix symbol and object code resolution for vdso32 and vdsox32
     (Adrian Hunter)

   - No need to check for null when passing pointers to foo__get() style
     refcount grabbing helpers, just like in the kernel and with free(),
     it's safe to pass a NULL pointer to avoid having to check it before
     each and every foo__get() call (Arnaldo Carvalho de Melo)

   - Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)

   - Remove some needless globals, making them local (Arnaldo Carvalho
     de Melo)

   - Reduce usage of symbol_conf.use_callchain, using other means of
     finding out if callchains are in use or available for specific
     events, as we evolved this codebase to allow requesting callchains
     for just a subset of the monitored events. In time it will help
     polish recording and showing mixed sets across the various tools:

        perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions

     (Arnaldo Carvalho de Melo)

   - Consider PTI entry trampolines in map__rip_2objdump() (Adrian
     Hunter)"
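The move away from the global symbol_conf.use_callchain flag can be pictured with a small sketch. It assumes that the per-event check amounts to testing the PERF_SAMPLE_CALLCHAIN bit in each event's sample_type, which is what the open-coded checks replaced in this diff did. The helper name has_callchain() and the sample_type masks are made up for illustration and are not part of perf.

```python
# Bit value of PERF_SAMPLE_CALLCHAIN in the perf_event_attr.sample_type mask
# (as defined in include/uapi/linux/perf_event.h).
PERF_SAMPLE_CALLCHAIN = 1 << 5


def has_callchain(sample_type: int) -> bool:
    """Return True if this event's sample_type requests callchains.

    Hypothetical stand-in for a per-event check; not a perf API.
    """
    return bool(sample_type & PERF_SAMPLE_CALLCHAIN)


# Made-up sample_type masks for three events recorded together;
# only the first two ask for call graphs.
events = {
    "cycles": 0x27,        # IP | TID | TIME | CALLCHAIN
    "cache-misses": 0x27,  # IP | TID | TIME | CALLCHAIN
    "instructions": 0x07,  # IP | TID | TIME
}

# Decide per event instead of consulting one global flag.
with_callchains = [name for name, st in events.items() if has_callchain(st)]
print(with_callchains)
```

With a single global flag, the "instructions" event above would be treated as if it carried callchains whenever any other event did.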

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  perf script python: Add dict fields introduction to Documentation
  perf script python: Add more PMU fields to event handler dict
  perf script python: Move dsoname code to a new function
  perf symbols: Add BSS symbols when reading from /proc/kallsyms
  perf annnotate: Make __symbol__inc_addr_samples handle src->histograms == NULL
  perf intel-pt: Fix "Unexpected indirect branch" error
  perf intel-pt: Fix MTC timing after overflow
  perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
  perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
  perf script powerpc: Python script for hypervisor call statistics
  perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols
  perf map: Consider PTI entry trampolines in rip_2objdump()
  perf test code-reading: Fix perf_env setup for PTI entry trampolines
  perf tools: Fix pmu events parsing rule
  perf stat: Display user and system time
  perf record: Enable arbitrary event names thru name= modifier
  perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
  perf tests kmod-path: Add tests for vdso32 and vdsox32
  perf hists: Check if a hist_entry has callchains before using them
  perf hists: Introduce hist_entry__has_callchain() method
  ...
parents 9f3fbe85 2696ec45
......@@ -124,7 +124,11 @@ The available PMUs and their raw parameters can be listed with
For example the raw event "LSD.UOPS" core pmu event above could
be specified as
perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ...
perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...
or using extended name syntax
perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...
PER SOCKET PMUS
---------------
......
......@@ -57,6 +57,9 @@ OPTIONS
FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
"no" for disable callgraph.
- 'stack-size': user stack size for dwarf mode
- 'name' : User defined event name. Single quotes (') may be used to
escape symbols in the name from parsing by shell and tool
like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'.
See the linkperf:perf-list[1] man page for more parameters.
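As a rough illustration of the quoting described for 'name', the sketch below builds the event string from the LSD.UOPS example and shows two ways of handing it to perf. The helper named_event() is hypothetical. The sketch assumes that perf itself expects the single quotes around the name and that the backslashes in the documentation only protect those quotes from the shell.

```python
import shlex


def named_event(pmu: str, terms: dict, name: str) -> str:
    """Build a PMU event string whose name= term is wrapped in single quotes.

    Hypothetical helper for illustration; not part of perf.
    """
    body = ",".join(f"{key}={value}" for key, value in terms.items())
    return f"{pmu}/{body},name='{name}'/"


event = named_event(
    "cpu",
    {"event": "0xa8", "umask": "0x1", "cmask": "0x1"},
    "LSD.UOPS_CYCLES:cmask=0x1",
)
print(event)

# Argument vector for running without a shell: the quotes reach perf as-is,
# so no backslashes are needed.
argv = ["perf", "stat", "-e", event, "--", "true"]

# Shell-safe rendering of the same command line, for pasting into a terminal.
print(" ".join(shlex.quote(arg) for arg in argv))
```

When the command goes through a shell, the quotes have to be protected from it, either with backslashes as in the documentation or by quoting the whole argument as shlex.quote() does.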
......
......@@ -610,6 +610,32 @@ Various utility functions for use with perf script:
nsecs_str(nsecs) - returns printable string in the form secs.nsecs
avg(total, n) - returns average given a sum and a total number of values
SUPPORTED FIELDS
----------------
Currently supported fields:
ev_name, comm, pid, tid, cpu, ip, time, period, phys_addr, addr,
symbol, dso, time_enabled, time_running, values, callchain,
brstack, brstacksym, datasrc, datasrc_decode, iregs, uregs,
weight, transaction, raw_buf, attr.
Some fields have sub items:
brstack:
from, to, from_dsoname, to_dsoname, mispred,
predicted, in_tx, abort, cycles.
brstacksym:
items: from, to, pred, in_tx, abort (converted string)
For example,
We can use this code to print brstack "from", "to", "cycles".
    if 'brstack' in dict:
        for entry in dict['brstack']:
            print "from %s, to %s, cycles %s" % (entry["from"], entry["to"], entry["cycles"])
SEE ALSO
--------
linkperf:perf-script[1]
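The brstack example in the SUPPORTED FIELDS text uses the Python 2 print statement. A Python 3 flavoured sketch of the same loop is given below, run against a hand-made dict so it can be tried without a perf.data file. The function name format_brstack() and every value in the sample dict are invented for illustration. Only the field names come from the documentation above.

```python
def format_brstack(sample: dict) -> list:
    """Return one printable line per branch-stack entry, or [] if absent."""
    lines = []
    for entry in sample.get("brstack", []):
        lines.append(
            "from %s, to %s, cycles %s"
            % (entry["from"], entry["to"], entry["cycles"])
        )
    return lines


# Hypothetical event-handler dict; real ones are filled in by perf script.
sample = {
    "ev_name": "cycles",
    "brstack": [
        {"from": 0x401000, "to": 0x401020, "cycles": 3},
        {"from": 0x401040, "to": 0x401000, "cycles": 7},
    ],
}

for line in format_brstack(sample):
    print(line)
```

Using sample.get("brstack", []) keeps the handler safe for events recorded without branch stacks, which is what the 'brstack' in dict test does in the original snippet.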
......@@ -310,20 +310,38 @@ Users who wants to get the actual value can apply --no-metric-only.
EXAMPLES
--------
$ perf stat -- make -j
$ perf stat -- make
Performance counter stats for 'make -j':
Performance counter stats for 'make':
8117.370256 task clock ticks # 11.281 CPU utilization factor
678 context switches # 0.000 M/sec
133 CPU migrations # 0.000 M/sec
235724 pagefaults # 0.029 M/sec
24821162526 CPU cycles # 3057.784 M/sec
18687303457 instructions # 2302.138 M/sec
172158895 cache references # 21.209 M/sec
27075259 cache misses # 3.335 M/sec
83723.452481 task-clock:u (msec) # 1.004 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
3,228,188 page-faults:u # 0.039 M/sec
229,570,665,834 cycles:u # 2.742 GHz
313,163,853,778 instructions:u # 1.36 insn per cycle
69,704,684,856 branches:u # 832.559 M/sec
2,078,861,393 branch-misses:u # 2.98% of all branches
Wall-clock time elapsed: 719.554352 msecs
83.409183620 seconds time elapsed
74.684747000 seconds user
8.739217000 seconds sys
TIMINGS
-------
As displayed in the example above we can display 3 types of timings.
We always display the time the counters were enabled/alive:
83.409183620 seconds time elapsed
For workload sessions we also display time the workloads spent in
user/system lands:
74.684747000 seconds user
8.739217000 seconds sys
Those times are the very same as displayed by the 'time' tool.
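The relationship between the three timings can be sketched outside perf. The example below is a Python analogue, not perf code: it starts a child process, reaps it with os.wait4() (the same system call this commit switches perf stat to), and prints elapsed, user and sys times in a similar layout. The function name run_with_timings() and the chosen workload are assumptions made for the example.

```python
import os
import sys
import time


def run_with_timings(argv):
    """Run argv and return (elapsed, user, sys) seconds for the child."""
    start = time.monotonic()
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    # wait4() reaps the child and also returns its resource usage.
    _pid, _status, rusage = os.wait4(pid, 0)
    elapsed = time.monotonic() - start
    return elapsed, rusage.ru_utime, rusage.ru_stime


# Small CPU-bound workload so that the user time is non-zero.
workload = [sys.executable, "-c", "sum(range(10**6))"]
elapsed, user, system = run_with_timings(workload)

print(" %17.9f seconds time elapsed\n" % elapsed)
print(" %17.9f seconds user" % user)
print(" %17.9f seconds sys" % system)
```

As in the example output above, user plus sys need not equal the elapsed time: a workload that sleeps accumulates wall-clock time only, while a multi-threaded one can accumulate more CPU time than wall-clock time.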
CSV FORMAT
----------
......
......@@ -189,7 +189,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
return -1;
}
int perf_env__lookup_objdump(struct perf_env *env)
int perf_env__lookup_objdump(struct perf_env *env, const char **path)
{
/*
* For live mode, env->arch will be NULL and we can use
......@@ -198,5 +198,5 @@ int perf_env__lookup_objdump(struct perf_env *env)
if (env->arch == NULL)
return 0;
return perf_env__lookup_binutils_path(env, "objdump", &objdump_path);
return perf_env__lookup_binutils_path(env, "objdump", path);
}
......@@ -4,8 +4,6 @@
#include "../util/env.h"
extern const char *objdump_path;
int perf_env__lookup_objdump(struct perf_env *env);
int perf_env__lookup_objdump(struct perf_env *env, const char **path);
#endif /* ARCH_PERF_COMMON_H */
......@@ -40,9 +40,8 @@
struct perf_annotate {
struct perf_tool tool;
struct perf_session *session;
struct annotation_options opts;
bool use_tui, use_stdio, use_stdio2, use_gtk;
bool full_paths;
bool print_line;
bool skip_missing;
bool has_br_stack;
bool group_set;
......@@ -162,12 +161,12 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
hist__account_cycles(sample->branch_stack, al, sample, false);
bi = he->branch_info;
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
if (err)
goto out;
err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);
out:
return err;
......@@ -249,7 +248,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
if (he == NULL)
return -ENOMEM;
ret = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
ret = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
hists__inc_nr_samples(hists, true);
return ret;
}
......@@ -289,10 +288,9 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
struct perf_annotate *ann)
{
if (!ann->use_stdio2)
return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel,
ann->print_line, ann->full_paths, 0, 0);
return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel,
ann->print_line, ann->full_paths);
return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel, &ann->opts);
return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel, &ann->opts);
}
static void hists__find_annotations(struct hists *hists,
......@@ -343,7 +341,7 @@ static void hists__find_annotations(struct hists *hists,
/* skip missing symbols */
nd = rb_next(nd);
} else if (use_browser == 1) {
key = hist_entry__tui_annotate(he, evsel, NULL);
key = hist_entry__tui_annotate(he, evsel, NULL, &ann->opts);
switch (key) {
case -1:
......@@ -390,8 +388,9 @@ static int __cmd_annotate(struct perf_annotate *ann)
goto out;
}
if (!objdump_path) {
ret = perf_env__lookup_objdump(&session->header.env);
if (!ann->opts.objdump_path) {
ret = perf_env__lookup_objdump(&session->header.env,
&ann->opts.objdump_path);
if (ret)
goto out;
}
......@@ -476,6 +475,7 @@ int cmd_annotate(int argc, const char **argv)
.ordered_events = true,
.ordering_requires_timestamps = true,
},
.opts = annotation__default_options,
};
struct perf_data data = {
.mode = PERF_DATA_MODE_READ,
......@@ -503,9 +503,9 @@ int cmd_annotate(int argc, const char **argv)
"file", "vmlinux pathname"),
OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules,
"load module symbols - WARNING: use only with -k and LIVE kernel"),
OPT_BOOLEAN('l', "print-line", &annotate.print_line,
OPT_BOOLEAN('l', "print-line", &annotate.opts.print_lines,
"print matching source lines (may be slow)"),
OPT_BOOLEAN('P', "full-paths", &annotate.full_paths,
OPT_BOOLEAN('P', "full-paths", &annotate.opts.full_path,
"Don't shorten the displayed pathnames"),
OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing,
"Skip symbols that cannot be annotated"),
......@@ -516,13 +516,13 @@ int cmd_annotate(int argc, const char **argv)
OPT_CALLBACK(0, "symfs", NULL, "directory",
"Look for files with symbols relative to this directory",
symbol__config_symfs),
OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
OPT_BOOLEAN(0, "source", &annotate.opts.annotate_src,
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING(0, "objdump", &objdump_path, "path",
OPT_STRING(0, "objdump", &annotate.opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
"Show event group information together"),
......
......@@ -1976,7 +1976,7 @@ static int filter_cb(struct hist_entry *he)
c2c_he = container_of(he, struct c2c_hist_entry, he);
if (c2c.show_src && !he->srcline)
he->srcline = hist_entry__get_srcline(he);
he->srcline = hist_entry__srcline(he);
calc_width(c2c_he);
......
......@@ -1438,8 +1438,6 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
goto out;
}
symbol_conf.nr_events = kvm->evlist->nr_entries;
if (perf_evlist__create_maps(kvm->evlist, &kvm->opts.target) < 0)
usage_with_options(live_usage, live_options);
......
......@@ -81,8 +81,7 @@ static int parse_probe_event(const char *str)
params.target_used = true;
}
if (params.nsi)
pev->nsi = nsinfo__get(params.nsi);
pev->nsi = nsinfo__get(params.nsi);
/* Parse a perf-probe command into event */
ret = parse_perf_probe_command(str, pev);
......
......@@ -71,6 +71,7 @@ struct report {
bool group_set;
int max_stack;
struct perf_read_values show_threads_values;
struct annotation_options annotation_opts;
const char *pretty_printing_style;
const char *cpu_list;
const char *symbol_filter_str;
......@@ -136,26 +137,25 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
if (sort__mode == SORT_MODE__BRANCH) {
bi = he->branch_info;
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
if (err)
goto out;
err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);
} else if (rep->mem_mode) {
mi = he->mem_info;
err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel);
if (err)
goto out;
err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
} else if (symbol_conf.cumulate_callchain) {
if (single)
err = hist_entry__inc_addr_samples(he, sample, evsel->idx,
al->addr);
err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
} else {
err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
}
out:
......@@ -181,11 +181,11 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
rep->nonany_branch_mode);
bi = he->branch_info;
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
if (err)
goto out;
err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);
branch_type_count(&rep->brtype_stat, &bi->flags,
bi->from.addr, bi->to.addr);
......@@ -561,7 +561,7 @@ static int report__browse_hists(struct report *rep)
ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
rep->min_percent,
&session->header.env,
true);
true, &rep->annotation_opts);
/*
* Usually "ret" is the last pressed key, and we only
* care if the key notifies us to switch data file.
......@@ -946,12 +946,6 @@ parse_percent_limit(const struct option *opt, const char *str,
return 0;
}
#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
CALLCHAIN_REPORT_HELP
"\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
int cmd_report(int argc, const char **argv)
{
struct perf_session *session;
......@@ -960,6 +954,10 @@ int cmd_report(int argc, const char **argv)
bool has_br_stack = false;
int branch_mode = -1;
bool branch_call_mode = false;
#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
CALLCHAIN_REPORT_HELP
"\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
const char * const report_usage[] = {
"perf report [<options>]",
......@@ -989,6 +987,7 @@ int cmd_report(int argc, const char **argv)
.max_stack = PERF_MAX_STACK_DEPTH,
.pretty_printing_style = "normal",
.socket_filter = -1,
.annotation_opts = annotation__default_options,
};
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file",
......@@ -1078,11 +1077,11 @@ int cmd_report(int argc, const char **argv)
"list of cpus to profile"),
OPT_BOOLEAN('I', "show-info", &report.show_full_info,
"Display extended information about perf.data file"),
OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
OPT_BOOLEAN(0, "source", &report.annotation_opts.annotate_src,
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
OPT_STRING('M', "disassembler-style", &report.annotation_opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
"Show a column with the sum of periods"),
......@@ -1093,7 +1092,7 @@ int cmd_report(int argc, const char **argv)
parse_branch_mode),
OPT_BOOLEAN(0, "branch-history", &branch_call_mode,
"add last branch records to call history"),
OPT_STRING(0, "objdump", &objdump_path, "path",
OPT_STRING(0, "objdump", &report.annotation_opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
"Disable symbol demangling"),
......
......@@ -2143,7 +2143,7 @@ static void save_task_callchain(struct perf_sched *sched,
return;
}
if (!symbol_conf.use_callchain || sample->callchain == NULL)
if (!sched->show_callchain || sample->callchain == NULL)
return;
if (thread__resolve_callchain(thread, cursor, evsel, sample,
......@@ -2271,10 +2271,11 @@ static struct thread *get_idle_thread(int cpu)
return idle_threads[cpu];
}
static void save_idle_callchain(struct idle_thread_runtime *itr,
static void save_idle_callchain(struct perf_sched *sched,
struct idle_thread_runtime *itr,
struct perf_sample *sample)
{
if (!symbol_conf.use_callchain || sample->callchain == NULL)
if (!sched->show_callchain || sample->callchain == NULL)
return;
callchain_cursor__copy(&itr->cursor, &callchain_cursor);
......@@ -2320,7 +2321,7 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
/* copy task callchain when entering to idle */
if (perf_evsel__intval(evsel, sample, "next_pid") == 0)
save_idle_callchain(itr, sample);
save_idle_callchain(sched, itr, sample);
}
}
......@@ -2849,7 +2850,7 @@ static void timehist_print_summary(struct perf_sched *sched,
printf(" CPU %2d idle entire time window\n", i);
}
if (sched->idle_hist && symbol_conf.use_callchain) {
if (sched->idle_hist && sched->show_callchain) {
callchain_param.mode = CHAIN_FOLDED;
callchain_param.value = CCVAL_PERIOD;
......@@ -2933,8 +2934,7 @@ static int timehist_check_attr(struct perf_sched *sched,
return -1;
}
if (sched->show_callchain &&
!(evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) {
if (sched->show_callchain && !evsel__has_callchain(evsel)) {
pr_info("Samples do not have callchains.\n");
sched->show_callchain = 0;
symbol_conf.use_callchain = 0;
......
......@@ -517,7 +517,7 @@ static int perf_session__check_output_opt(struct perf_session *session)
evlist__for_each_entry(session->evlist, evsel) {
not_pipe = true;
if (evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
if (evsel__has_callchain(evsel)) {
use_callchain = true;
break;
}
......@@ -532,22 +532,18 @@ static int perf_session__check_output_opt(struct perf_session *session)
*/
if (symbol_conf.use_callchain &&
!output[PERF_TYPE_TRACEPOINT].user_set) {
struct perf_event_attr *attr;
j = PERF_TYPE_TRACEPOINT;
evlist__for_each_entry(session->evlist, evsel) {
if (evsel->attr.type != j)
continue;
attr = &evsel->attr;
if (attr->sample_type & PERF_SAMPLE_CALLCHAIN) {
if (evsel__has_callchain(evsel)) {
output[j].fields |= PERF_OUTPUT_IP;
output[j].fields |= PERF_OUTPUT_SYM;
output[j].fields |= PERF_OUTPUT_SYMOFFSET;
output[j].fields |= PERF_OUTPUT_DSO;
set_print_ip_opts(attr);
set_print_ip_opts(&evsel->attr);
goto out;
}
}
......@@ -610,7 +606,7 @@ static int perf_sample__fprintf_start(struct perf_sample *sample,
if (PRINT_FIELD(COMM)) {
if (latency_format)
printed += fprintf(fp, "%8.8s ", thread__comm_str(thread));
else if (PRINT_FIELD(IP) && symbol_conf.use_callchain)
else if (PRINT_FIELD(IP) && evsel__has_callchain(evsel) && symbol_conf.use_callchain)
printed += fprintf(fp, "%s ", thread__comm_str(thread));
else
printed += fprintf(fp, "%16s ", thread__comm_str(thread));
......
......@@ -80,6 +80,9 @@
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include "sane_ctype.h"
......@@ -175,6 +178,8 @@ static int output_fd;
static int print_free_counters_hint;
static int print_mixed_hw_group_error;
static u64 *walltime_run;
static bool ru_display = false;
static struct rusage ru_data;
struct perf_stat {
bool record;
......@@ -726,7 +731,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
break;
}
}
waitpid(child_pid, &status, 0);
wait4(child_pid, &status, 0, &ru_data);
if (workload_exec_errno) {
const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
......@@ -1804,6 +1809,11 @@ static void print_table(FILE *output, int precision, double avg)
fprintf(output, "\n%*s# Final result:\n", indent, "");
}
static double timeval2double(struct timeval *t)
{
return t->tv_sec + (double) t->tv_usec/USEC_PER_SEC;
}
static void print_footer(void)
{
double avg = avg_stats(&walltime_nsecs_stats) / NSEC_PER_SEC;
......@@ -1815,6 +1825,15 @@ static void print_footer(void)
if (run_count == 1) {
fprintf(output, " %17.9f seconds time elapsed", avg);
if (ru_display) {
double ru_utime = timeval2double(&ru_data.ru_utime);
double ru_stime = timeval2double(&ru_data.ru_stime);
fprintf(output, "\n\n");
fprintf(output, " %17.9f seconds user\n", ru_utime);
fprintf(output, " %17.9f seconds sys\n", ru_stime);
}
} else {
double sd = stddev_stats(&walltime_nsecs_stats) / NSEC_PER_SEC;
/*
......@@ -2950,6 +2969,13 @@ int cmd_stat(int argc, const char **argv)
setup_system_wide(argc);
/*
* Display user/system times only for single
* run and when there's specified tracee.
*/
if ((run_count == 1) && target__none(&target))
ru_display = true;
if (run_count < 0) {
pr_err("Run count must be a positive number\n");
parse_options_usage(stat_usage, stat_options, "r", 1);
......
......@@ -123,14 +123,9 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
}
notes = symbol__annotation(sym);
if (notes->src != NULL) {
pthread_mutex_lock(&notes->lock);
goto out_assign;
}
pthread_mutex_lock(&notes->lock);
if (symbol__alloc_hist(sym) < 0) {
if (!symbol__hists(sym, top->evlist->nr_entries)) {
pthread_mutex_unlock(&notes->lock);
pr_err("Not enough memory for annotating '%s' symbol!\n",
sym->name);
......@@ -138,9 +133,8 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}
err = symbol__annotate(sym, map, evsel, 0, NULL);
err = symbol__annotate(sym, map, evsel, 0, &top->annotation_opts, NULL);
if (err == 0) {
out_assign:
top->sym_filter_entry = he;
} else {
char msg[BUFSIZ];
......@@ -188,7 +182,7 @@ static void ui__warn_map_erange(struct map *map, struct symbol *sym, u64 ip)
static void perf_top__record_precise_ip(struct perf_top *top,
struct hist_entry *he,
struct perf_sample *sample,
int counter, u64 ip)
struct perf_evsel *evsel, u64 ip)
{
struct annotation *notes;
struct symbol *sym = he->ms.sym;
......@@ -204,7 +198,7 @@ static void perf_top__record_precise_ip(struct perf_top *top,
if (pthread_mutex_trylock(&notes->lock))
return;
err = hist_entry__inc_addr_samples(he, sample, counter, ip);
err = hist_entry__inc_addr_samples(he, sample, evsel, ip);
pthread_mutex_unlock(&notes->lock);
......@@ -249,10 +243,9 @@ static void perf_top__show_details(struct perf_top *top)
goto out_unlock;
printf("Showing %s for %s\n", perf_evsel__name(top->sym_evsel), symbol->name);
printf(" Events Pcnt (>=%d%%)\n", top->sym_pcnt_filter);
printf(" Events Pcnt (>=%d%%)\n", top->annotation_opts.min_pcnt);
more = symbol__annotate_printf(symbol, he->ms.map, top->sym_evsel,
0, top->sym_pcnt_filter, top->print_entries, 4);
more = symbol__annotate_printf(symbol, he->ms.map, top->sym_evsel, &top->annotation_opts);
if (top->evlist->enabled) {
if (top->zero)
......@@ -412,7 +405,7 @@ static void perf_top__print_mapped_keys(struct perf_top *top)
fprintf(stdout, "\t[f] profile display filter (count). \t(%d)\n", top->count_filter);
fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", top->sym_pcnt_filter);
fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", top->annotation_opts.min_pcnt);
fprintf(stdout, "\t[s] annotate symbol. \t(%s)\n", name?: "NULL");
fprintf(stdout, "\t[S] stop annotation.\n");
......@@ -515,7 +508,7 @@ static bool perf_top__handle_keypress(struct perf_top *top, int c)
prompt_integer(&top->count_filter, "Enter display event count filter");
break;
case 'F':
prompt_percent(&top->sym_pcnt_filter,
prompt_percent(&top->annotation_opts.min_pcnt,
"Enter details display event filter (percent)");
break;
case 'K':
......@@ -613,7 +606,8 @@ static void *display_thread_tui(void *arg)
perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
top->min_percent,
&top->session->header.env,
!top->record_opts.overwrite);
!top->record_opts.overwrite,
&top->annotation_opts);
done = 1;
return NULL;
......@@ -691,7 +685,7 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
struct perf_evsel *evsel = iter->evsel;
if (perf_hpp_list.sym && single)
perf_top__record_precise_ip(top, he, iter->sample, evsel->idx, al->addr);
perf_top__record_precise_ip(top, he, iter->sample, evsel, al->addr);
hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
!(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
......@@ -1083,8 +1077,9 @@ static int __cmd_top(struct perf_top *top)
if (top->session == NULL)
return -1;
if (!objdump_path) {
ret = perf_env__lookup_objdump(&top->session->header.env);
if (!top->annotation_opts.objdump_path) {
ret = perf_env__lookup_objdump(&top->session->header.env,
&top->annotation_opts.objdump_path);
if (ret)
goto out_delete;
}
......@@ -1265,7 +1260,7 @@ int cmd_top(int argc, const char **argv)
.overwrite = 1,
},
.max_stack = sysctl__max_stack(),
.sym_pcnt_filter = 5,
.annotation_opts = annotation__default_options,
.nr_threads_synthesize = UINT_MAX,
};
struct record_opts *opts = &top.record_opts;
......@@ -1347,15 +1342,15 @@ int cmd_top(int argc, const char **argv)
"only consider symbols in these comms"),
OPT_STRING(0, "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
"only consider these symbols"),
OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
OPT_BOOLEAN(0, "source", &top.annotation_opts.annotate_src,
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
OPT_BOOLEAN(0, "asm-raw", &top.annotation_opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
OPT_STRING(0, "objdump", &objdump_path, "path",
OPT_STRING(0, "objdump", &top.annotation_opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
OPT_STRING('M', "disassembler-style", &top.annotation_opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING('u', "uid", &target->uid_str, "user", "user to profile"),
OPT_CALLBACK(0, "percent-limit", &top, "percent",
......@@ -1391,6 +1386,9 @@ int cmd_top(int argc, const char **argv)
if (status < 0)
return status;
top.annotation_opts.min_pcnt = 5;
top.annotation_opts.context = 4;
top.evlist = perf_evlist__new();
if (top.evlist == NULL)
return -ENOMEM;
......@@ -1468,8 +1466,6 @@ int cmd_top(int argc, const char **argv)
goto out_delete_evlist;
}
symbol_conf.nr_events = top.evlist->nr_entries;
if (top.delay_secs < 1)
top.delay_secs = 1;
......
......@@ -2491,7 +2491,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)