From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: jolsa@kernel.org, peterz@infradead.org, mingo@redhat.com,
alexander.shishkin@linux.intel.com, Linux-kernel@vger.kernel.org,
ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH v5 06/12] perf util: Update and print per-thread shadow stats
Date: Sat, 2 Dec 2017 12:46:10 +0800
Message-ID: <45c17cab-e040-8145-ea59-5c149d6e4d59@linux.intel.com>
In-Reply-To: <20171201142148.GY3298@kernel.org>

On 12/1/2017 10:21 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Dec 01, 2017 at 06:57:30PM +0800, Jin Yao escreveu:
>> The functions perf_stat__update_shadow_stats() and
>> perf_stat__print_shadow_stats() are called to update
>> and print the shadow stats on a set of static variables.
>>
>> But the static variables are a limitation for supporting
>> per-thread shadow stats.
>>
>> This patch lets perf_stat__update_shadow_stats() update
>> the shadow stats on an input parameter 'stat', using
>> update_runtime_stat() to update the stats. It no longer
>> updates the static variables directly as before.
>>
>> And this patch also lets the perf_stat__print_shadow_stats()
>
> When 'also' appears on a patch usually it means it should be split in
> two, one for the things up to the 'also' and another for the remaining
> parts.
>
> A patch that has these stats:
>
> 5 files changed, 219 insertions(+), 120 deletions(-)
>
> raises eyebrows :-\
>
> I'm trying now to break it into at least two, one for printing and the
> other for the rest.
>
> - Arnaldo
>
Yes, there is too much in this patch.
Actually, I also wanted to split it into more patches, but I found it a
little difficult because of the dependencies between the changes.
If you need me to do anything on this patch (e.g.
refine/split/reorg/...), I'd be glad to; please let me know.
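One such dependency, as far as the diff shows, is that the update path
(update_runtime_stat()) and the print path (runtime_stat_avg() and
runtime_stat_n()) all go through the same lookup on 'stat'. Below is a
rough standalone sketch of that idea, not the perf code: the array-backed
struct runtime_stat, the reduced struct stats, the shortened enum and the
simplified saved_value_lookup() signature are all assumptions made to keep
it self-contained (perf uses an rblist and a richer struct stats).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the perf definitions. */
enum stat_type { STAT_NONE, STAT_NSECS, STAT_CYCLES, STAT_BRANCHES };

struct stats {
	double n;	/* number of samples seen */
	double mean;	/* running mean of the samples */
};

struct saved_value {
	enum stat_type type;
	int ctx;
	int cpu;
	int used;
	struct stats stats;
};

#define MAX_VALUES 64

/* Assumption: a flat array stands in for the rblist used in the patch. */
struct runtime_stat {
	struct saved_value values[MAX_VALUES];
};

/* Incremental mean update (only n and mean are tracked in this sketch). */
static void update_stats(struct stats *s, double val)
{
	s->n += 1.0;
	s->mean += (val - s->mean) / s->n;
}

/*
 * Shared by the update and print paths: find the entry keyed by
 * (type, ctx, cpu) inside 'stat', creating it only when 'create' is set.
 */
static struct saved_value *saved_value_lookup(struct runtime_stat *stat,
					      enum stat_type type,
					      int ctx, int cpu, int create)
{
	struct saved_value *free_slot = NULL;
	size_t i;

	assert(stat != NULL);

	for (i = 0; i < MAX_VALUES; i++) {
		struct saved_value *v = &stat->values[i];

		if (v->used && v->type == type && v->ctx == ctx && v->cpu == cpu)
			return v;
		if (!v->used && !free_slot)
			free_slot = v;
	}

	if (!create || !free_slot)
		return NULL;

	free_slot->used = 1;
	free_slot->type = type;
	free_slot->ctx = ctx;
	free_slot->cpu = cpu;
	free_slot->stats.n = 0.0;
	free_slot->stats.mean = 0.0;
	return free_slot;
}

/* Update path: creates the entry on first use. */
static void update_runtime_stat(struct runtime_stat *stat, enum stat_type type,
				int ctx, int cpu, unsigned long long count)
{
	struct saved_value *v = saved_value_lookup(stat, type, ctx, cpu, 1);

	if (v)
		update_stats(&v->stats, (double)count);
}

/* Print path: never creates entries; a missing entry reads as 0.0. */
static double runtime_stat_avg(struct runtime_stat *stat, enum stat_type type,
			       int ctx, int cpu)
{
	struct saved_value *v = saved_value_lookup(stat, type, ctx, cpu, 0);

	return v ? v->stats.mean : 0.0;
}

static double runtime_stat_n(struct runtime_stat *stat, enum stat_type type,
			     int ctx, int cpu)
{
	struct saved_value *v = saved_value_lookup(stat, type, ctx, cpu, 0);

	return v ? v->stats.n : 0.0;
}
```

With two separate struct runtime_stat instances (say, one per thread), an
update to one is not visible through the other, which is what the static
arrays could not offer.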
Thanks
Jin Yao
>> print the shadow stats from an input parameter 'stat'.
>>
>> It no longer reads values directly from the static variables.
>> Instead, it now uses runtime_stat_avg() and runtime_stat_n() to
>> get and compute the values.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> tools/perf/builtin-script.c | 6 +-
>> tools/perf/builtin-stat.c | 27 ++--
>> tools/perf/util/stat-shadow.c | 293 +++++++++++++++++++++++++++---------------
>> tools/perf/util/stat.c | 8 +-
>> tools/perf/util/stat.h | 5 +-
>> 5 files changed, 219 insertions(+), 120 deletions(-)
>>
>> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
>> index 39d8b55..fac6f05 100644
>> --- a/tools/perf/builtin-script.c
>> +++ b/tools/perf/builtin-script.c
>> @@ -1548,7 +1548,8 @@ static void perf_sample__fprint_metric(struct perf_script *script,
>> val = sample->period * evsel->scale;
>> perf_stat__update_shadow_stats(evsel,
>> val,
>> - sample->cpu);
>> + sample->cpu,
>> + &rt_stat);
>> evsel_script(evsel)->val = val;
>> if (evsel_script(evsel->leader)->gnum == evsel->leader->nr_members) {
>> for_each_group_member (ev2, evsel->leader) {
>> @@ -1556,7 +1557,8 @@ static void perf_sample__fprint_metric(struct perf_script *script,
>> evsel_script(ev2)->val,
>> sample->cpu,
>> &ctx,
>> - NULL);
>> + NULL,
>> + &rt_stat);
>> }
>> evsel_script(evsel->leader)->gnum = 0;
>> }
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index a027b47..1edc082 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -1097,7 +1097,8 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
>> }
>>
>> static void printout(int id, int nr, struct perf_evsel *counter, double uval,
>> - char *prefix, u64 run, u64 ena, double noise)
>> + char *prefix, u64 run, u64 ena, double noise,
>> + struct runtime_stat *stat)
>> {
>> struct perf_stat_output_ctx out;
>> struct outstate os = {
>> @@ -1190,7 +1191,8 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
>>
>> perf_stat__print_shadow_stats(counter, uval,
>> first_shadow_cpu(counter, id),
>> - &out, &metric_events);
>> + &out, &metric_events,
>> + stat);
>> if (!csv_output && !metric_only) {
>> print_noise(counter, noise);
>> print_running(run, ena);
>> @@ -1214,7 +1216,8 @@ static void aggr_update_shadow(void)
>> val += perf_counts(counter->counts, cpu, 0)->val;
>> }
>> perf_stat__update_shadow_stats(counter, val,
>> - first_shadow_cpu(counter, id));
>> + first_shadow_cpu(counter, id),
>> + &rt_stat);
>> }
>> }
>> }
>> @@ -1334,7 +1337,8 @@ static void print_aggr(char *prefix)
>> fprintf(output, "%s", prefix);
>>
>> uval = val * counter->scale;
>> - printout(id, nr, counter, uval, prefix, run, ena, 1.0);
>> + printout(id, nr, counter, uval, prefix, run, ena, 1.0,
>> + &rt_stat);
>> if (!metric_only)
>> fputc('\n', output);
>> }
>> @@ -1364,7 +1368,8 @@ static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
>> fprintf(output, "%s", prefix);
>>
>> uval = val * counter->scale;
>> - printout(thread, 0, counter, uval, prefix, run, ena, 1.0);
>> + printout(thread, 0, counter, uval, prefix, run, ena, 1.0,
>> + &rt_stat);
>> fputc('\n', output);
>> }
>> }
>> @@ -1401,7 +1406,8 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
>> fprintf(output, "%s", prefix);
>>
>> uval = cd.avg * counter->scale;
>> - printout(-1, 0, counter, uval, prefix, cd.avg_running, cd.avg_enabled, cd.avg);
>> + printout(-1, 0, counter, uval, prefix, cd.avg_running, cd.avg_enabled,
>> + cd.avg, &rt_stat);
>> if (!metric_only)
>> fprintf(output, "\n");
>> }
>> @@ -1440,7 +1446,8 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
>> fprintf(output, "%s", prefix);
>>
>> uval = val * counter->scale;
>> - printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
>> + printout(cpu, 0, counter, uval, prefix, run, ena, 1.0,
>> + &rt_stat);
>>
>> fputc('\n', output);
>> }
>> @@ -1472,7 +1479,8 @@ static void print_no_aggr_metric(char *prefix)
>> run = perf_counts(counter->counts, cpu, 0)->run;
>>
>> uval = val * counter->scale;
>> - printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
>> + printout(cpu, 0, counter, uval, prefix, run, ena, 1.0,
>> + &rt_stat);
>> }
>> fputc('\n', stat_config.output);
>> }
>> @@ -1528,7 +1536,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
>> perf_stat__print_shadow_stats(counter, 0,
>> 0,
>> &out,
>> - &metric_events);
>> + &metric_events,
>> + &rt_stat);
>> }
>> fputc('\n', stat_config.output);
>> }
>> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
>> index e60c321..0d34d5e 100644
>> --- a/tools/perf/util/stat-shadow.c
>> +++ b/tools/perf/util/stat-shadow.c
>> @@ -116,19 +116,29 @@ static void saved_value_delete(struct rblist *rblist __maybe_unused,
>>
>> static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
>> int cpu,
>> - bool create)
>> + bool create,
>> + enum stat_type type,
>> + int ctx,
>> + struct runtime_stat *stat)
>> {
>> + struct rblist *rblist;
>> struct rb_node *nd;
>> struct saved_value dm = {
>> .cpu = cpu,
>> .evsel = evsel,
>> + .type = type,
>> + .ctx = ctx,
>> + .stat = stat,
>> };
>> - nd = rblist__find(&runtime_saved_values, &dm);
>> +
>> + rblist = &stat->value_list;
>> +
>> + nd = rblist__find(rblist, &dm);
>> if (nd)
>> return container_of(nd, struct saved_value, rb_node);
>> if (create) {
>> - rblist__add_node(&runtime_saved_values, &dm);
>> - nd = rblist__find(&runtime_saved_values, &dm);
>> + rblist__add_node(rblist, &dm);
>> + nd = rblist__find(rblist, &dm);
>> if (nd)
>> return container_of(nd, struct saved_value, rb_node);
>> }
>> @@ -217,13 +227,24 @@ void perf_stat__reset_shadow_stats(void)
>> }
>> }
>>
>> +static void update_runtime_stat(struct runtime_stat *stat,
>> + enum stat_type type,
>> + int ctx, int cpu, u64 count)
>> +{
>> + struct saved_value *v = saved_value_lookup(NULL, cpu, true,
>> + type, ctx, stat);
>> +
>> + if (v)
>> + update_stats(&v->stats, count);
>> +}
>> +
>> /*
>> * Update various tracking values we maintain to print
>> * more semantic information such as miss/hit ratios,
>> * instruction rates, etc:
>> */
>> void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
>> - int cpu)
>> + int cpu, struct runtime_stat *stat)
>> {
>> int ctx = evsel_context(counter);
>>
>> @@ -231,50 +252,58 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
>>
>> if (perf_evsel__match(counter, SOFTWARE, SW_TASK_CLOCK) ||
>> perf_evsel__match(counter, SOFTWARE, SW_CPU_CLOCK))
>> - update_stats(&runtime_nsecs_stats[cpu], count);
>> + update_runtime_stat(stat, STAT_NSECS, 0, cpu, count);
>> else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
>> - update_stats(&runtime_cycles_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_CYCLES, ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
>> - update_stats(&runtime_cycles_in_tx_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_CYCLES_IN_TX, ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, TRANSACTION_START))
>> - update_stats(&runtime_transaction_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_TRANSACTION, ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, ELISION_START))
>> - update_stats(&runtime_elision_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_ELISION, ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, TOPDOWN_TOTAL_SLOTS))
>> - update_stats(&runtime_topdown_total_slots[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_TOPDOWN_TOTAL_SLOTS,
>> + ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_ISSUED))
>> - update_stats(&runtime_topdown_slots_issued[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_TOPDOWN_SLOTS_ISSUED,
>> + ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_RETIRED))
>> - update_stats(&runtime_topdown_slots_retired[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_TOPDOWN_SLOTS_RETIRED,
>> + ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_BUBBLES))
>> - update_stats(&runtime_topdown_fetch_bubbles[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_TOPDOWN_FETCH_BUBBLES,
>> + ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
>> - update_stats(&runtime_topdown_recovery_bubbles[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_TOPDOWN_RECOVERY_BUBBLES,
>> + ctx, cpu, count);
>> else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
>> - update_stats(&runtime_stalled_cycles_front_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_STALLED_CYCLES_FRONT,
>> + ctx, cpu, count);
>> else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
>> - update_stats(&runtime_stalled_cycles_back_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_STALLED_CYCLES_BACK,
>> + ctx, cpu, count);
>> else if (perf_evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
>> - update_stats(&runtime_branches_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_BRANCHES, ctx, cpu, count);
>> else if (perf_evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
>> - update_stats(&runtime_cacherefs_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_CACHEREFS, ctx, cpu, count);
>> else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
>> - update_stats(&runtime_l1_dcache_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_L1_DCACHE, ctx, cpu, count);
>> else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
>> - update_stats(&runtime_ll_cache_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_L1_ICACHE, ctx, cpu, count);
>> else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_LL))
>> - update_stats(&runtime_ll_cache_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_LL_CACHE, ctx, cpu, count);
>> else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
>> - update_stats(&runtime_dtlb_cache_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_DTLB_CACHE, ctx, cpu, count);
>> else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
>> - update_stats(&runtime_itlb_cache_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_ITLB_CACHE, ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, SMI_NUM))
>> - update_stats(&runtime_smi_num_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_SMI_NUM, ctx, cpu, count);
>> else if (perf_stat_evsel__is(counter, APERF))
>> - update_stats(&runtime_aperf_stats[ctx][cpu], count);
>> + update_runtime_stat(stat, STAT_APERF, ctx, cpu, count);
>>
>> if (counter->collect_stat) {
>> - struct saved_value *v = saved_value_lookup(counter, cpu, true);
>> + struct saved_value *v = saved_value_lookup(counter, cpu, true,
>> + STAT_NONE, 0, stat);
>> update_stats(&v->stats, count);
>> }
>> }
>> @@ -395,15 +424,40 @@ void perf_stat__collect_metric_expr(struct perf_evlist *evsel_list)
>> }
>> }
>>
>> +static double runtime_stat_avg(struct runtime_stat *stat,
>> + enum stat_type type, int ctx, int cpu)
>> +{
>> + struct saved_value *v;
>> +
>> + v = saved_value_lookup(NULL, cpu, false, type, ctx, stat);
>> + if (!v)
>> + return 0.0;
>> +
>> + return avg_stats(&v->stats);
>> +}
>> +
>> +static double runtime_stat_n(struct runtime_stat *stat,
>> + enum stat_type type, int ctx, int cpu)
>> +{
>> + struct saved_value *v;
>> +
>> + v = saved_value_lookup(NULL, cpu, false, type, ctx, stat);
>> + if (!v)
>> + return 0.0;
>> +
>> + return v->stats.n;
>> +}
>> +
>> static void print_stalled_cycles_frontend(int cpu,
>> struct perf_evsel *evsel, double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -419,13 +473,14 @@ static void print_stalled_cycles_frontend(int cpu,
>>
>> static void print_stalled_cycles_backend(int cpu,
>> struct perf_evsel *evsel, double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -438,13 +493,14 @@ static void print_stalled_cycles_backend(int cpu,
>> static void print_branch_misses(int cpu,
>> struct perf_evsel *evsel,
>> double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_branches_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_BRANCHES, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -457,13 +513,15 @@ static void print_branch_misses(int cpu,
>> static void print_l1_dcache_misses(int cpu,
>> struct perf_evsel *evsel,
>> double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> +
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_l1_dcache_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_L1_DCACHE, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -476,13 +534,15 @@ static void print_l1_dcache_misses(int cpu,
>> static void print_l1_icache_misses(int cpu,
>> struct perf_evsel *evsel,
>> double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> +
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_l1_icache_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_L1_ICACHE, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -494,13 +554,14 @@ static void print_l1_icache_misses(int cpu,
>> static void print_dtlb_cache_misses(int cpu,
>> struct perf_evsel *evsel,
>> double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_dtlb_cache_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_DTLB_CACHE, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -512,13 +573,14 @@ static void print_dtlb_cache_misses(int cpu,
>> static void print_itlb_cache_misses(int cpu,
>> struct perf_evsel *evsel,
>> double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_itlb_cache_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_ITLB_CACHE, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -530,13 +592,14 @@ static void print_itlb_cache_misses(int cpu,
>> static void print_ll_cache_misses(int cpu,
>> struct perf_evsel *evsel,
>> double avg,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double total, ratio = 0.0;
>> const char *color;
>> int ctx = evsel_context(evsel);
>>
>> - total = avg_stats(&runtime_ll_cache_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_LL_CACHE, ctx, cpu);
>>
>> if (total)
>> ratio = avg / total * 100.0;
>> @@ -594,68 +657,72 @@ static double sanitize_val(double x)
>> return x;
>> }
>>
>> -static double td_total_slots(int ctx, int cpu)
>> +static double td_total_slots(int ctx, int cpu, struct runtime_stat *stat)
>> {
>> - return avg_stats(&runtime_topdown_total_slots[ctx][cpu]);
>> + return runtime_stat_avg(stat, STAT_TOPDOWN_TOTAL_SLOTS, ctx, cpu);
>> }
>>
>> -static double td_bad_spec(int ctx, int cpu)
>> +static double td_bad_spec(int ctx, int cpu, struct runtime_stat *stat)
>> {
>> double bad_spec = 0;
>> double total_slots;
>> double total;
>>
>> - total = avg_stats(&runtime_topdown_slots_issued[ctx][cpu]) -
>> - avg_stats(&runtime_topdown_slots_retired[ctx][cpu]) +
>> - avg_stats(&runtime_topdown_recovery_bubbles[ctx][cpu]);
>> - total_slots = td_total_slots(ctx, cpu);
>> + total = runtime_stat_avg(stat, STAT_TOPDOWN_SLOTS_ISSUED, ctx, cpu) -
>> + runtime_stat_avg(stat, STAT_TOPDOWN_SLOTS_RETIRED, ctx, cpu) +
>> + runtime_stat_avg(stat, STAT_TOPDOWN_RECOVERY_BUBBLES, ctx, cpu);
>> +
>> + total_slots = td_total_slots(ctx, cpu, stat);
>> if (total_slots)
>> bad_spec = total / total_slots;
>> return sanitize_val(bad_spec);
>> }
>>
>> -static double td_retiring(int ctx, int cpu)
>> +static double td_retiring(int ctx, int cpu, struct runtime_stat *stat)
>> {
>> double retiring = 0;
>> - double total_slots = td_total_slots(ctx, cpu);
>> - double ret_slots = avg_stats(&runtime_topdown_slots_retired[ctx][cpu]);
>> + double total_slots = td_total_slots(ctx, cpu, stat);
>> + double ret_slots = runtime_stat_avg(stat, STAT_TOPDOWN_SLOTS_RETIRED,
>> + ctx, cpu);
>>
>> if (total_slots)
>> retiring = ret_slots / total_slots;
>> return retiring;
>> }
>>
>> -static double td_fe_bound(int ctx, int cpu)
>> +static double td_fe_bound(int ctx, int cpu, struct runtime_stat *stat)
>> {
>> double fe_bound = 0;
>> - double total_slots = td_total_slots(ctx, cpu);
>> - double fetch_bub = avg_stats(&runtime_topdown_fetch_bubbles[ctx][cpu]);
>> + double total_slots = td_total_slots(ctx, cpu, stat);
>> + double fetch_bub = runtime_stat_avg(stat, STAT_TOPDOWN_FETCH_BUBBLES,
>> + ctx, cpu);
>>
>> if (total_slots)
>> fe_bound = fetch_bub / total_slots;
>> return fe_bound;
>> }
>>
>> -static double td_be_bound(int ctx, int cpu)
>> +static double td_be_bound(int ctx, int cpu, struct runtime_stat *stat)
>> {
>> - double sum = (td_fe_bound(ctx, cpu) +
>> - td_bad_spec(ctx, cpu) +
>> - td_retiring(ctx, cpu));
>> + double sum = (td_fe_bound(ctx, cpu, stat) +
>> + td_bad_spec(ctx, cpu, stat) +
>> + td_retiring(ctx, cpu, stat));
>> if (sum == 0)
>> return 0;
>> return sanitize_val(1.0 - sum);
>> }
>>
>> static void print_smi_cost(int cpu, struct perf_evsel *evsel,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> double smi_num, aperf, cycles, cost = 0.0;
>> int ctx = evsel_context(evsel);
>> const char *color = NULL;
>>
>> - smi_num = avg_stats(&runtime_smi_num_stats[ctx][cpu]);
>> - aperf = avg_stats(&runtime_aperf_stats[ctx][cpu]);
>> - cycles = avg_stats(&runtime_cycles_stats[ctx][cpu]);
>> + smi_num = runtime_stat_avg(stat, STAT_SMI_NUM, ctx, cpu);
>> + aperf = runtime_stat_avg(stat, STAT_APERF, ctx, cpu);
>> + cycles = runtime_stat_avg(stat, STAT_CYCLES, ctx, cpu);
>>
>> if ((cycles == 0) || (aperf == 0))
>> return;
>> @@ -675,7 +742,8 @@ static void generic_metric(const char *metric_expr,
>> const char *metric_name,
>> double avg,
>> int cpu,
>> - struct perf_stat_output_ctx *out)
>> + struct perf_stat_output_ctx *out,
>> + struct runtime_stat *stat)
>> {
>> print_metric_t print_metric = out->print_metric;
>> struct parse_ctx pctx;
>> @@ -694,7 +762,8 @@ static void generic_metric(const char *metric_expr,
>> stats = &walltime_nsecs_stats;
>> scale = 1e-9;
>> } else {
>> - v = saved_value_lookup(metric_events[i], cpu, false);
>> + v = saved_value_lookup(metric_events[i], cpu, false,
>> + STAT_NONE, 0, stat);
>> if (!v)
>> break;
>> stats = &v->stats;
>> @@ -722,7 +791,8 @@ static void generic_metric(const char *metric_expr,
>> void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> double avg, int cpu,
>> struct perf_stat_output_ctx *out,
>> - struct rblist *metric_events)
>> + struct rblist *metric_events,
>> + struct runtime_stat *stat)
>> {
>> void *ctxp = out->ctx;
>> print_metric_t print_metric = out->print_metric;
>> @@ -733,7 +803,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> int num = 1;
>>
>> if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
>> - total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES, ctx, cpu);
>> +
>> if (total) {
>> ratio = avg / total;
>> print_metric(ctxp, NULL, "%7.2f ",
>> @@ -741,8 +812,13 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> } else {
>> print_metric(ctxp, NULL, NULL, "insn per cycle", 0);
>> }
>> - total = avg_stats(&runtime_stalled_cycles_front_stats[ctx][cpu]);
>> - total = max(total, avg_stats(&runtime_stalled_cycles_back_stats[ctx][cpu]));
>> +
>> + total = runtime_stat_avg(stat, STAT_STALLED_CYCLES_FRONT,
>> + ctx, cpu);
>> +
>> + total = max(total, runtime_stat_avg(stat,
>> + STAT_STALLED_CYCLES_BACK,
>> + ctx, cpu));
>>
>> if (total && avg) {
>> out->new_line(ctxp);
>> @@ -755,8 +831,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> "stalled cycles per insn", 0);
>> }
>> } else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
>> - if (runtime_branches_stats[ctx][cpu].n != 0)
>> - print_branch_misses(cpu, evsel, avg, out);
>> + if (runtime_stat_n(stat, STAT_BRANCHES, ctx, cpu) != 0)
>> + print_branch_misses(cpu, evsel, avg, out, stat);
>> else
>> print_metric(ctxp, NULL, NULL, "of all branches", 0);
>> } else if (
>> @@ -764,8 +840,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> evsel->attr.config == ( PERF_COUNT_HW_CACHE_L1D |
>> ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>> - if (runtime_l1_dcache_stats[ctx][cpu].n != 0)
>> - print_l1_dcache_misses(cpu, evsel, avg, out);
>> +
>> + if (runtime_stat_n(stat, STAT_L1_DCACHE, ctx, cpu) != 0)
>> + print_l1_dcache_misses(cpu, evsel, avg, out, stat);
>> else
>> print_metric(ctxp, NULL, NULL, "of all L1-dcache hits", 0);
>> } else if (
>> @@ -773,8 +850,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> evsel->attr.config == ( PERF_COUNT_HW_CACHE_L1I |
>> ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>> - if (runtime_l1_icache_stats[ctx][cpu].n != 0)
>> - print_l1_icache_misses(cpu, evsel, avg, out);
>> +
>> + if (runtime_stat_n(stat, STAT_L1_ICACHE, ctx, cpu) != 0)
>> + print_l1_icache_misses(cpu, evsel, avg, out, stat);
>> else
>> print_metric(ctxp, NULL, NULL, "of all L1-icache hits", 0);
>> } else if (
>> @@ -782,8 +860,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> evsel->attr.config == ( PERF_COUNT_HW_CACHE_DTLB |
>> ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>> - if (runtime_dtlb_cache_stats[ctx][cpu].n != 0)
>> - print_dtlb_cache_misses(cpu, evsel, avg, out);
>> +
>> + if (runtime_stat_n(stat, STAT_DTLB_CACHE, ctx, cpu) != 0)
>> + print_dtlb_cache_misses(cpu, evsel, avg, out, stat);
>> else
>> print_metric(ctxp, NULL, NULL, "of all dTLB cache hits", 0);
>> } else if (
>> @@ -791,8 +870,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> evsel->attr.config == ( PERF_COUNT_HW_CACHE_ITLB |
>> ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>> - if (runtime_itlb_cache_stats[ctx][cpu].n != 0)
>> - print_itlb_cache_misses(cpu, evsel, avg, out);
>> +
>> + if (runtime_stat_n(stat, STAT_ITLB_CACHE, ctx, cpu) != 0)
>> + print_itlb_cache_misses(cpu, evsel, avg, out, stat);
>> else
>> print_metric(ctxp, NULL, NULL, "of all iTLB cache hits", 0);
>> } else if (
>> @@ -800,27 +880,28 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> evsel->attr.config == ( PERF_COUNT_HW_CACHE_LL |
>> ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
>> - if (runtime_ll_cache_stats[ctx][cpu].n != 0)
>> - print_ll_cache_misses(cpu, evsel, avg, out);
>> +
>> + if (runtime_stat_n(stat, STAT_LL_CACHE, ctx, cpu) != 0)
>> + print_ll_cache_misses(cpu, evsel, avg, out, stat);
>> else
>> print_metric(ctxp, NULL, NULL, "of all LL-cache hits", 0);
>> } else if (perf_evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
>> - total = avg_stats(&runtime_cacherefs_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CACHEREFS, ctx, cpu);
>>
>> if (total)
>> ratio = avg * 100 / total;
>>
>> - if (runtime_cacherefs_stats[ctx][cpu].n != 0)
>> + if (runtime_stat_n(stat, STAT_CACHEREFS, ctx, cpu) != 0)
>> print_metric(ctxp, NULL, "%8.3f %%",
>> "of all cache refs", ratio);
>> else
>> print_metric(ctxp, NULL, NULL, "of all cache refs", 0);
>> } else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
>> - print_stalled_cycles_frontend(cpu, evsel, avg, out);
>> + print_stalled_cycles_frontend(cpu, evsel, avg, out, stat);
>> } else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
>> - print_stalled_cycles_backend(cpu, evsel, avg, out);
>> + print_stalled_cycles_backend(cpu, evsel, avg, out, stat);
>> } else if (perf_evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
>> - total = avg_stats(&runtime_nsecs_stats[cpu]);
>> + total = runtime_stat_avg(stat, STAT_NSECS, 0, cpu);
>>
>> if (total) {
>> ratio = avg / total;
>> @@ -829,7 +910,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> print_metric(ctxp, NULL, NULL, "Ghz", 0);
>> }
>> } else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX)) {
>> - total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES, ctx, cpu);
>> +
>> if (total)
>> print_metric(ctxp, NULL,
>> "%7.2f%%", "transactional cycles",
>> @@ -838,8 +920,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> print_metric(ctxp, NULL, NULL, "transactional cycles",
>> 0);
>> } else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX_CP)) {
>> - total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
>> - total2 = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES, ctx, cpu);
>> + total2 = runtime_stat_avg(stat, STAT_CYCLES_IN_TX, ctx, cpu);
>> +
>> if (total2 < avg)
>> total2 = avg;
>> if (total)
>> @@ -848,19 +931,21 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> else
>> print_metric(ctxp, NULL, NULL, "aborted cycles", 0);
>> } else if (perf_stat_evsel__is(evsel, TRANSACTION_START)) {
>> - total = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES_IN_TX,
>> + ctx, cpu);
>>
>> if (avg)
>> ratio = total / avg;
>>
>> - if (runtime_cycles_in_tx_stats[ctx][cpu].n != 0)
>> + if (runtime_stat_n(stat, STAT_CYCLES_IN_TX, ctx, cpu) != 0)
>> print_metric(ctxp, NULL, "%8.0f",
>> "cycles / transaction", ratio);
>> else
>> print_metric(ctxp, NULL, NULL, "cycles / transaction",
>> - 0);
>> + 0);
>> } else if (perf_stat_evsel__is(evsel, ELISION_START)) {
>> - total = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
>> + total = runtime_stat_avg(stat, STAT_CYCLES_IN_TX,
>> + ctx, cpu);
>>
>> if (avg)
>> ratio = total / avg;
>> @@ -874,28 +959,28 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> else
>> print_metric(ctxp, NULL, NULL, "CPUs utilized", 0);
>> } else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_BUBBLES)) {
>> - double fe_bound = td_fe_bound(ctx, cpu);
>> + double fe_bound = td_fe_bound(ctx, cpu, stat);
>>
>> if (fe_bound > 0.2)
>> color = PERF_COLOR_RED;
>> print_metric(ctxp, color, "%8.1f%%", "frontend bound",
>> fe_bound * 100.);
>> } else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_RETIRED)) {
>> - double retiring = td_retiring(ctx, cpu);
>> + double retiring = td_retiring(ctx, cpu, stat);
>>
>> if (retiring > 0.7)
>> color = PERF_COLOR_GREEN;
>> print_metric(ctxp, color, "%8.1f%%", "retiring",
>> retiring * 100.);
>> } else if (perf_stat_evsel__is(evsel, TOPDOWN_RECOVERY_BUBBLES)) {
>> - double bad_spec = td_bad_spec(ctx, cpu);
>> + double bad_spec = td_bad_spec(ctx, cpu, stat);
>>
>> if (bad_spec > 0.1)
>> color = PERF_COLOR_RED;
>> print_metric(ctxp, color, "%8.1f%%", "bad speculation",
>> bad_spec * 100.);
>> } else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_ISSUED)) {
>> - double be_bound = td_be_bound(ctx, cpu);
>> + double be_bound = td_be_bound(ctx, cpu, stat);
>> const char *name = "backend bound";
>> static int have_recovery_bubbles = -1;
>>
>> @@ -908,19 +993,19 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>>
>> if (be_bound > 0.2)
>> color = PERF_COLOR_RED;
>> - if (td_total_slots(ctx, cpu) > 0)
>> + if (td_total_slots(ctx, cpu, stat) > 0)
>> print_metric(ctxp, color, "%8.1f%%", name,
>> be_bound * 100.);
>> else
>> print_metric(ctxp, NULL, NULL, name, 0);
>> } else if (evsel->metric_expr) {
>> generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
>> - evsel->metric_name, avg, cpu, out);
>> - } else if (runtime_nsecs_stats[cpu].n != 0) {
>> + evsel->metric_name, avg, cpu, out, stat);
>> + } else if (runtime_stat_n(stat, STAT_NSECS, 0, cpu) != 0) {
>> char unit = 'M';
>> char unit_buf[10];
>>
>> - total = avg_stats(&runtime_nsecs_stats[cpu]);
>> + total = runtime_stat_avg(stat, STAT_NSECS, 0, cpu);
>>
>> if (total)
>> ratio = 1000.0 * avg / total;
>> @@ -931,7 +1016,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
>> print_metric(ctxp, NULL, "%8.3f", unit_buf, ratio);
>> } else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
>> - print_smi_cost(cpu, evsel, out);
>> + print_smi_cost(cpu, evsel, out, stat);
>> } else {
>> num = 0;
>> }
>> @@ -944,7 +1029,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> out->new_line(ctxp);
>> generic_metric(mexp->metric_expr, mexp->metric_events,
>> evsel->name, mexp->metric_name,
>> - avg, cpu, out);
>> + avg, cpu, out, stat);
>> }
>> }
>> if (num == 0)
>> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
>> index 151e9ef..78abfd4 100644
>> --- a/tools/perf/util/stat.c
>> +++ b/tools/perf/util/stat.c
>> @@ -278,9 +278,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
>> perf_evsel__compute_deltas(evsel, cpu, thread, count);
>> perf_counts_values__scale(count, config->scale, NULL);
>> if (config->aggr_mode == AGGR_NONE)
>> - perf_stat__update_shadow_stats(evsel, count->val, cpu);
>> + perf_stat__update_shadow_stats(evsel, count->val, cpu,
>> + &rt_stat);
>> if (config->aggr_mode == AGGR_THREAD)
>> - perf_stat__update_shadow_stats(evsel, count->val, 0);
>> + perf_stat__update_shadow_stats(evsel, count->val, 0,
>> + &rt_stat);
>> break;
>> case AGGR_GLOBAL:
>> aggr->val += count->val;
>> @@ -362,7 +364,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
>> /*
>> * Save the full runtime - to allow normalization during printout:
>> */
>> - perf_stat__update_shadow_stats(counter, *count, 0);
>> + perf_stat__update_shadow_stats(counter, *count, 0, &rt_stat);
>>
>> return 0;
>> }
>> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
>> index 1e2b761..b8448b1 100644
>> --- a/tools/perf/util/stat.h
>> +++ b/tools/perf/util/stat.h
>> @@ -130,7 +130,7 @@ void runtime_stat__exit(struct runtime_stat *stat);
>> void perf_stat__init_shadow_stats(void);
>> void perf_stat__reset_shadow_stats(void);
>> void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
>> - int cpu);
>> + int cpu, struct runtime_stat *stat);
>> struct perf_stat_output_ctx {
>> void *ctx;
>> print_metric_t print_metric;
>> @@ -141,7 +141,8 @@ struct perf_stat_output_ctx {
>> void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
>> double avg, int cpu,
>> struct perf_stat_output_ctx *out,
>> - struct rblist *metric_events);
>> + struct rblist *metric_events,
>> + struct runtime_stat *stat);
>> void perf_stat__collect_metric_expr(struct perf_evlist *);
>>
>> int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
>> --
>> 2.7.4
Thread overview: 27+ messages
2017-12-01 10:57 [PATCH v5 00/12] perf stat: Enable '--per-thread' on all thread Jin Yao
2017-12-01 10:57 ` [PATCH v5 01/12] perf util: Create rblist__exit() function Jin Yao
2017-12-06 16:36 ` [tip:perf/core] perf rblist: " tip-bot for Jin Yao
2017-12-01 10:57 ` [PATCH v5 02/12] perf util: Define a structure for runtime shadow stats Jin Yao
2017-12-01 14:02 ` Arnaldo Carvalho de Melo
2017-12-02 4:39 ` Jin, Yao
2017-12-01 10:57 ` [PATCH v5 03/12] perf util: Extend rbtree to support " Jin Yao
2017-12-01 14:10 ` Arnaldo Carvalho de Melo
2017-12-02 4:40 ` Jin, Yao
2017-12-01 10:57 ` [PATCH v5 04/12] perf util: Add rbtree node_delete ops Jin Yao
2017-12-01 14:14 ` Arnaldo Carvalho de Melo
2017-12-01 18:29 ` Andi Kleen
2017-12-06 16:37 ` [tip:perf/core] perf stat: Add rbtree node_delete op tip-bot for Jin Yao
2017-12-01 10:57 ` [PATCH v5 05/12] perf util: Create the runtime_stat init/exit function Jin Yao
2017-12-01 10:57 ` [PATCH v5 06/12] perf util: Update and print per-thread shadow stats Jin Yao
2017-12-01 14:21 ` Arnaldo Carvalho de Melo
2017-12-02 4:46 ` Jin, Yao [this message]
2017-12-01 10:57 ` [PATCH v5 07/12] perf util: Remove a set of shadow stats static variables Jin Yao
2017-12-01 10:57 ` [PATCH v5 08/12] perf stat: Allocate shadow stats buffer for threads Jin Yao
2017-12-01 10:57 ` [PATCH v5 09/12] perf stat: Update or print per-thread stats Jin Yao
2017-12-01 10:57 ` [PATCH v5 10/12] perf util: Reuse thread_map__new_by_uid to enumerate threads from /proc Jin Yao
2017-12-01 14:44 ` Arnaldo Carvalho de Melo
2017-12-01 15:02 ` Arnaldo Carvalho de Melo
2017-12-02 4:53 ` Jin, Yao
2017-12-02 4:47 ` Jin, Yao
2017-12-01 10:57 ` [PATCH v5 11/12] perf stat: Remove --per-thread pid/tid limitation Jin Yao
2017-12-01 10:57 ` [PATCH v5 12/12] perf stat: Resort '--per-thread' result Jin Yao