Re: [PATCH 15/15] perf tools: Support spark lines in perf stat

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@kernel.org>
To: Jiri Olsa <jolsa@kernel.org>
Cc: linux-kernel@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 15/15] perf tools: Support spark lines in perf stat
Date: Tue, 10 Jun 2014 07:04:43 +0200	[thread overview]
Message-ID: <20140610050443.GA14522@gmail.com> (raw)
In-Reply-To: <1402344150-14860-16-git-send-email-jolsa@kernel.org>


* Jiri Olsa <jolsa@kernel.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> perf stat -rX prints the stddev for multiple measurements.

Call it "--repeat X", that's how most people know it.

> Just looking at the stddev for judging the quality of the data
> is a bit dangerous The simplest sanity check is to just look

"Dangerous?". Unexplained FUD. Also a missing perid.

> at a simple plot. This patchs add a sparkline to the end
> of the measurements to make it simple to judge the data.

Should explain what a 'sparkline' is when it's first mentioned, not 2 
sentences later.

> The sparkline only uses UTF-8, so should be readable
> in all modern tools and terminals.
> 
> The sparkline is between the minimum and maximum of the data,
> so it's mainly a indicator of variance. To keep the code
> simple and make the output not too wide only the first
> 8 values are printed. If more values are there it adds '..'
> 
> The code is inspired by Zach Holman's spark shell script.

Zach Holman should be on Cc:.

> Example output (view in non-proportial font):
> 
>  Performance counter stats for 'true' (10 runs):
> 
>           0.175672      task-clock (msec)         #    0.555 CPUs utilized            ( +-  1.77% ) █▄▁▁▁▁▁▁..
>                  0      context-switches          #    0.000 K/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                114      page-faults               #    0.647 M/sec                    ( +-  0.14% ) ▁█▁▁████..
>            520,798      cycles                    #    2.965 GHz                      ( +-  1.75% ) █▄▁▁▁▁▁▁..
>            433,525      instructions              #    0.83  insns per cycle          ( +-  0.28% ) ▅▇▅▄▇█▁▆..
>             83,012      branches                  #  472.537 M/sec                    ( +-  0.31% ) ▅▇▆▄▇█▁▆..
>              3,157      branch-misses             #    3.80% of all branches          ( +-  2.55% ) ▇█▃▅▁▃▁▂..
> 
>        0.000316660 seconds time elapsed                                          ( +-  1.78% ) █▅▁▁▁▁▁▁..

The noise data is not aligned vertically in the changelog - is it in 
the real code? It should be.

> As you can see even in the most simple run there are quite 
> interesting patterns. The time sparkline suggests it would be also 
> useful to have an option to throw the first measurement away.

First two should be thrown away, probably, to reduce the resulting 
stddev.

> Known issues:
> - Makes the perf stat output wider. Could be adjust by shrinking
> some white space. Not done so far.
> - No output for -A/--per-socket/--per-core with -rX. This code
> is missing the basic noise detection code. Once it's added there
> sparklines could be shown too.

The feauture looks useful, but at minimum the second shortcoming 
should be fixed first in a separate patch, before applying new 
features and further complicating the code.

> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> Link: http://lkml.kernel.org/r/1401838086-17750-1-git-send-email-andi@firstfloor.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/Documentation/perf-stat.txt |  4 ++++
>  tools/perf/Makefile.perf               |  1 +
>  tools/perf/builtin-stat.c              | 12 ++++++++++++
>  tools/perf/util/spark.c                | 31 +++++++++++++++++++++++++++++++
>  tools/perf/util/spark.h                |  4 ++++
>  tools/perf/util/stat.c                 | 34 ++++++++++++++++++++++++++++++++++
>  tools/perf/util/stat.h                 | 10 ++++++++++
>  7 files changed, 96 insertions(+)
>  create mode 100644 tools/perf/util/spark.c
>  create mode 100644 tools/perf/util/spark.h
> 
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 29ee857..840c1db 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -53,6 +53,10 @@ OPTIONS
>  -r::
>  --repeat=<n>::
>  	repeat command and print average + stddev (max: 100). 0 means forever.
> +	In addition it prints a spark line (when not in CSV mode), which visualizes the
> +	variance between minimum and maximum of the measurements. This allows a simple sanity
> +	check of the measurements. Only 8 values are printed, when more are available
> +	it adds ..
>  
>  -B::
>  --big-num::
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 9670a16..90b3aa7 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -359,6 +359,7 @@ LIB_OBJS += $(OUTPUT)util/trace-event-scripting.o
>  LIB_OBJS += $(OUTPUT)util/trace-event.o
>  LIB_OBJS += $(OUTPUT)util/svghelper.o
>  LIB_OBJS += $(OUTPUT)util/sort.o
> +LIB_OBJS += $(OUTPUT)util/spark.o
>  LIB_OBJS += $(OUTPUT)util/hist.o
>  LIB_OBJS += $(OUTPUT)util/probe-event.o
>  LIB_OBJS += $(OUTPUT)util/util.o
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 65a151e..cb0f7c5 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1176,6 +1176,9 @@ static void print_aggr(char *prefix)
>  				if (run != ena)
>  					fprintf(output, "  (%.2f%%)",
>  						100.0 * run / ena);
> +
> +				fputc(' ', output);
> +				print_stat_spark(output, counter->priv);
>  			}
>  			fputc('\n', output);
>  		}
> @@ -1229,6 +1232,9 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
>  		return;
>  	}
>  
> +	fputc(' ', output);
> +	print_stat_spark(output, counter->priv);
> +
>  	if (scaled) {
>  		double avg_enabled, avg_running;
>  
> @@ -1295,6 +1301,9 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
>  			if (run != ena)
>  				fprintf(output, "  (%.2f%%)",
>  					100.0 * run / ena);
> +
> +			fputc(' ', output);
> +			print_stat_spark(output, counter->priv);
>  		}
>  		fputc('\n', output);
>  	}
> @@ -1355,6 +1364,9 @@ static void print_stat(int argc, const char **argv)
>  			fprintf(output, "                                        ");
>  			print_noise_pct(stddev_stats(&walltime_nsecs_stats),
>  					avg_stats(&walltime_nsecs_stats));
> +
> +			fputc(' ', output);
> +			print_stat_spark(output, &walltime_nsecs_stats);


Why is that 2-line pattern repeated 4 times?

>  		}
>  		fprintf(output, "\n\n");
>  	}
> diff --git a/tools/perf/util/spark.c b/tools/perf/util/spark.c
> new file mode 100644
> index 0000000..5a1033f
> --- /dev/null
> +++ b/tools/perf/util/spark.c
> @@ -0,0 +1,31 @@
> +#include <stdio.h>
> +#include <limits.h>
> +#include "spark.h"
> +
> +#define NUM_SPARKS 8
> +#define SPARK_SHIFT 8
> +
> +/* Print spark lines on outf for numval values in val. */
> +void print_spark(FILE *outf, unsigned long long *val, int numval)

The output file is called 'output' in the calling context, there's no 
reason to randomly depart from that convention and introduce a random 
new one: not a single existing line in tools/perf/ calls an output 
file 'outf'.

Also, I don't see 'numval' used anywhere in perf. The canonical naming 
would be different.

> +{
> +	static const char *ticks[NUM_SPARKS] = {
> +		"▁",  "▂", "▃", "▄", "▅", "▆", "▇", "█"
> +	};
> +	int i;
> +	unsigned long long min = ULLONG_MAX, max = 0, f;

min/max is a somewhat sloppy name as well (easy to confuse with 
min()/max()), should be somehing like min_val/max_val, which is in 
line .

> +
> +	for (i = 0; i < numval; i++) {
> +		if (val[i] < min)
> +			min = val[i];
> +		if (val[i] > max)
> +			max = val[i];
> +	}
> +	f = ((max - min) << SPARK_SHIFT) / (NUM_SPARKS - 1);
> +	if (f < 1)
> +		f = 1;
> +	for (i = 0; i < numval; i++) {
> +		int index = ((val[i] - min) << SPARK_SHIFT) / f;
> +		if (index >= 0 && index < NUM_SPARKS)
> +			fputs(ticks[index], outf);

That's not primarily an 'index', it's an indexed 'variance' value. So 
'variance_idx' or so.

> +	}
> +}
> diff --git a/tools/perf/util/spark.h b/tools/perf/util/spark.h
> new file mode 100644
> index 0000000..d9ebc94
> --- /dev/null
> +++ b/tools/perf/util/spark.h
> @@ -0,0 +1,4 @@
> +#ifndef SPARK_H
> +#define SPARK_H 1
> +void print_spark(FILE *outf, unsigned long long *val, int numval);
> +#endif
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 6506b3d..a7c02c5 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -1,10 +1,16 @@
>  #include <math.h>
> +#include <stdio.h>
>  
>  #include "stat.h"
> +#include "spark.h"
>  
>  void update_stats(struct stats *stats, u64 val)
>  {
>  	double delta;
> +	int n = stats->n;
> +
> +	if (n < NUM_SPARK_VALS)
> +		stats->svals[n] = val;
>  
>  	stats->n++;
>  	delta = val - stats->mean;
>	stats->mean += delta / stats->n;

It is absolutely sloppy, potentially misleading and unrobust to put an 
integer rounded value of 'double n' into a local variable named 'int 
n' ...

> @@ -61,3 +67,31 @@ double rel_stddev_stats(double stddev, double avg)
>  
>  	return pct;
>  }
> +
> +static int all_zero(unsigned long long *vals, int len)
> +{
> +	int i;
> +
> +	for (i = 0; i < len; i++)
> +		if (vals[i] != 0)
> +			return 0;
> +	return 1;
> +}
> +
> +void print_stat_spark(FILE *f, struct stats *stat)
> +{
> +	int len;
> +
> +	if (stat->n <= 1)
> +		return;
> +
> +	len = stat->n;
> +	if (len > NUM_SPARK_VALS)
> +		len = NUM_SPARK_VALS;
> +	if (all_zero(stat->svals, len))
> +		return;
> +
> +	print_spark(f, stat->svals, len);
> +	if (stat->n > NUM_SPARK_VALS)
> +		fputs("..", f);
> +}
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 5667fc3..a049f03 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -1,12 +1,16 @@
>  #ifndef __PERF_STATS_H
>  #define __PERF_STATS_H
>  
> +#include <stdio.h>
>  #include <linux/types.h>
>  
> +#define NUM_SPARK_VALS 8 /* support spark line on first N items */
> +
>  struct stats
>  {
>  	double n, mean, M2;
>  	u64 max, min;
> +	unsigned long long svals[NUM_SPARK_VALS];
>  };

The structure's increasing size justifies proper vertical alignment.

>  
>  void update_stats(struct stats *stats, u64 val);
> @@ -14,12 +18,18 @@ double avg_stats(struct stats *stats);
>  double stddev_stats(struct stats *stats);
>  double rel_stddev_stats(double stddev, double avg);
>  
> +void print_stat_spark(FILE *f, struct stats *stat);
> +
>  static inline void init_stats(struct stats *stats)
>  {
> +	int i;
> +
>  	stats->n    = 0.0;
>  	stats->mean = 0.0;
>  	stats->M2   = 0.0;
>  	stats->min  = (u64) -1;
>  	stats->max  = 0;
> +	for (i = 0; i < NUM_SPARK_VALS; i++)
> +		stats->svals[i] = 0;

memset(), or introduce a new helper, memclear_struct():

 #define memclear_struct(struct_ptr) memset(struct_ptr, 0, sizeof(*(struct_ptr)))

Also, init_stats() should now probably move into util/stat.c. (in a 
separate patch preparing these changes.)

Thanks,

	Ingo

next prev parent reply	other threads:[~2014-06-10  5:04 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-09 20:02 [GIT PULL 00/15] perf/core improvements and fixes Jiri Olsa
2014-06-09 20:02 ` [PATCH 01/15] tools lib traceevent: Add flag to not load event plugins Jiri Olsa
2014-06-09 20:02 ` [PATCH 02/15] tools lib traceevent: Add options to plugins Jiri Olsa
2014-06-09 20:02 ` [PATCH 03/15] tools lib traceevent: Add options to function plugin Jiri Olsa
2014-06-09 20:02 ` [PATCH 04/15] tools lib traceevent: Added support for __get_bitmask() macro Jiri Olsa
2014-06-09 20:02 ` [PATCH 05/15] perf tools: Fix pipe check regression in attr event callback Jiri Olsa
2014-06-09 20:02 ` [PATCH 06/15] perf tools: Prettify the tags/TAGS/cscope targets output Jiri Olsa
2014-06-09 20:02 ` [PATCH 07/15] perf script/python: Print array argument as string Jiri Olsa
2014-06-09 20:02 ` [PATCH 08/15] perf: Pass protection and flags bits through mmap2 interface Jiri Olsa
2014-06-09 20:02 ` [PATCH 09/15] perf tools: Update mmap2 interface with protection and flag bits Jiri Olsa
2014-06-09 20:02 ` [PATCH 10/15] Revert "perf: Disable PERF_RECORD_MMAP2 support" Jiri Olsa
2014-06-09 20:02 ` [PATCH 11/15] perf report: Add mem-mode documentation to report command Jiri Olsa
2014-06-09 20:02 ` [PATCH 12/15] perf tools: Add cpumode to struct hist_entry Jiri Olsa
2014-06-09 20:02 ` [PATCH 13/15] perf tools: Add support to dynamically get cacheline size Jiri Olsa
2014-06-09 20:02 ` [PATCH 14/15] perf tools: Add dcacheline sort Jiri Olsa
2014-06-09 20:02 ` [PATCH 15/15] perf tools: Support spark lines in perf stat Jiri Olsa
2014-06-10  5:04   ` Ingo Molnar [this message]
2014-06-10  7:37     ` Peter Zijlstra
2014-06-12 11:54 ` [GIT PULL 00/15] perf/core improvements and fixes Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140610050443.GA14522@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.