public inbox for linux-kernel@vger.kernel.org
From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Andi Kleen <ak@linux.intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 02/10] perf record: implement -f,--mmap-flush=<threshold> option
Date: Thu, 28 Feb 2019 15:44:45 -0300	[thread overview]
Message-ID: <20190228184445.GE9508@kernel.org> (raw)
In-Reply-To: <4abe0a4f-a0a9-f8e2-d2d1-e24846de15c9@linux.intel.com>

On Thu, Feb 28, 2019 at 12:00:41PM +0300, Alexey Budankov wrote:
> 
> Implemented the -f,--mmap-flush option, which specifies the threshold used
> to postpone and/or trigger the move of data from the mmapped kernel buffers
> to storage. The option can be used to avoid capturing every single byte of
> data into the stored trace as it arrives. The default option value is 1.

Can you add something here explaining more clearly what this means, and
also to the tools/perf/Documentation/perf-record.txt file? Something
like a paragraph explaining when an mmap flush normally happens, or
rephrase what you wrote if you think you already said that.

- Arnaldo

 
>   $ tools/perf/perf record -f 1024 -e cycles -- matrix.gcc
>   $ tools/perf/perf record --aio -f 1024 -e cycles -- matrix.gcc
> 
> The implemented sync parameter is the means to force the data move
> independently of the threshold value. Even though a user provides a flush
> value on the command line, the tool needs the capability to drain memory
> buffers, at least at the end of the collection.
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
>  tools/perf/Documentation/perf-record.txt |  5 +++
>  tools/perf/builtin-record.c              | 53 +++++++++++++++++++++---
>  tools/perf/perf.h                        |  1 +
>  tools/perf/util/evlist.c                 |  6 +--
>  tools/perf/util/evlist.h                 |  3 +-
>  tools/perf/util/mmap.c                   |  4 +-
>  tools/perf/util/mmap.h                   |  3 +-
>  7 files changed, 63 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 8f0c2be34848..1727663382f1 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -459,6 +459,11 @@ Set affinity mask of trace reading thread according to the policy defined by 'mo
>    node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
>    cpu  - thread affinity mask is set to cpu of the processed mmap buffer
>  
> +-f::
> +--mmap-flush=n::
> +Minimal number of bytes accumulated in the mmapped kernel buffer before the data is flushed to storage (default: 1).
> +The maximal allowed value is a quarter of the mmapped kernel buffer size.
> +
>  --all-kernel::
>  Configure all used events to run in kernel space.
>  
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index f3f7f3100336..61818cbce443 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -334,6 +334,29 @@ static int record__aio_enabled(struct record *rec)
>  	return rec->opts.nr_cblocks > 0;
>  }
>  
> +#define MMAP_FLUSH_DEFAULT 1
> +static int record__mmap_flush_parse(const struct option *opt,
> +				    const char *str,
> +				    int unset)
> +{
> +	int mmap_len;
> +	struct record_opts *opts = (struct record_opts *)opt->value;
> +
> +	if (unset)
> +		return 0;
> +
> +	if (str)
> +		opts->mmap_flush = strtol(str, NULL, 0);
> +	if (!opts->mmap_flush)
> +		opts->mmap_flush = MMAP_FLUSH_DEFAULT;
> +
> +	mmap_len = perf_evlist__mmap_size(opts->mmap_pages);
> +	if (opts->mmap_flush > mmap_len / 4)
> +		opts->mmap_flush = mmap_len / 4;
> +
> +	return 0;
> +}
> +
>  static int process_synthesized_event(struct perf_tool *tool,
>  				     union perf_event *event,
>  				     struct perf_sample *sample __maybe_unused,
> @@ -543,7 +566,8 @@ static int record__mmap_evlist(struct record *rec,
>  	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
>  				 opts->auxtrace_mmap_pages,
>  				 opts->auxtrace_snapshot_mode,
> -				 opts->nr_cblocks, opts->affinity) < 0) {
> +				 opts->nr_cblocks, opts->affinity,
> +				 opts->mmap_flush) < 0) {
>  		if (errno == EPERM) {
>  			pr_err("Permission error mapping pages.\n"
>  			       "Consider increasing "
> @@ -733,7 +757,7 @@ static void record__adjust_affinity(struct record *rec, struct perf_mmap *map)
>  }
>  
>  static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
> -				    bool overwrite)
> +				    bool overwrite, bool sync)
>  {
>  	u64 bytes_written = rec->bytes_written;
>  	int i;
> @@ -756,12 +780,19 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
>  		off = record__aio_get_pos(trace_fd);
>  
>  	for (i = 0; i < evlist->nr_mmaps; i++) {
> +		u64 flush = MMAP_FLUSH_DEFAULT;
>  		struct perf_mmap *map = &maps[i];
>  
>  		if (map->base) {
>  			record__adjust_affinity(rec, map);
> +			if (sync) {
> +				flush = map->flush;
> +				map->flush = MMAP_FLUSH_DEFAULT;
> +			}
>  			if (!record__aio_enabled(rec)) {
>  				if (perf_mmap__push(map, rec, record__pushfn) != 0) {
> +					if (sync)
> +						map->flush = flush;
>  					rc = -1;
>  					goto out;
>  				}
> @@ -774,10 +805,14 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
>  				idx = record__aio_sync(map, false);
>  				if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off) != 0) {
>  					record__aio_set_pos(trace_fd, off);
> +					if (sync)
> +						map->flush = flush;
>  					rc = -1;
>  					goto out;
>  				}
>  			}
> +			if (sync)
> +				map->flush = flush;
>  		}
>  
>  		if (map->auxtrace_mmap.base && !rec->opts.auxtrace_snapshot_mode &&
> @@ -803,15 +838,15 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
>  	return rc;
>  }
>  
> -static int record__mmap_read_all(struct record *rec)
> +static int record__mmap_read_all(struct record *rec, bool sync)
>  {
>  	int err;
>  
> -	err = record__mmap_read_evlist(rec, rec->evlist, false);
> +	err = record__mmap_read_evlist(rec, rec->evlist, false, sync);
>  	if (err)
>  		return err;
>  
> -	return record__mmap_read_evlist(rec, rec->evlist, true);
> +	return record__mmap_read_evlist(rec, rec->evlist, true, sync);
>  }
>  
>  static void record__init_features(struct record *rec)
> @@ -1310,7 +1345,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		if (trigger_is_hit(&switch_output_trigger) || done || draining)
>  			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_DATA_PENDING);
>  
> -		if (record__mmap_read_all(rec) < 0) {
> +		if (record__mmap_read_all(rec, false) < 0) {
>  			trigger_error(&auxtrace_snapshot_trigger);
>  			trigger_error(&switch_output_trigger);
>  			err = -1;
> @@ -1411,6 +1446,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		record__synthesize_workload(rec, true);
>  
>  out_child:
> +	record__mmap_read_all(rec, true);
>  	record__aio_mmap_read_sync(rec);
>  
>  	if (forks) {
> @@ -1813,6 +1849,7 @@ static struct record record = {
>  			.uses_mmap   = true,
>  			.default_per_cpu = true,
>  		},
> +		.mmap_flush          = MMAP_FLUSH_DEFAULT,
>  	},
>  	.tool = {
>  		.sample		= process_sample_event,
> @@ -1879,6 +1916,9 @@ static struct option __record_options[] = {
>  	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
>  		     "number of mmap data pages and AUX area tracing mmap pages",
>  		     record__parse_mmap_pages),
> +	OPT_CALLBACK('f', "mmap-flush", &record.opts, "bytes",
>  		     "Minimal number of bytes accumulated in mmap data pages before they are written to storage (default: 1)",
> +		     record__mmap_flush_parse),
>  	OPT_BOOLEAN(0, "group", &record.opts.group,
>  		    "put the counters into a counter group"),
>  	OPT_CALLBACK_NOOPT('g', NULL, &callchain_param,
> @@ -2182,6 +2222,7 @@ int cmd_record(int argc, const char **argv)
>  		pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks);
>  
>  	pr_debug("affinity: %s\n", affinity_tags[rec->opts.affinity]);
> +	pr_debug("mmap flush: %d\n", rec->opts.mmap_flush);
>  
>  	err = __cmd_record(&record, argc, argv);
>  out:
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index b120e547ddc7..7886cc9771cf 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -85,6 +85,7 @@ struct record_opts {
>  	u64          clockid_res_ns;
>  	int	     nr_cblocks;
>  	int	     affinity;
> +	int	     mmap_flush;
>  };
>  
>  enum perf_affinity {
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 08cedb643ea6..937039faac59 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1022,7 +1022,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
>   */
>  int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  			 unsigned int auxtrace_pages,
> -			 bool auxtrace_overwrite, int nr_cblocks, int affinity)
> +			 bool auxtrace_overwrite, int nr_cblocks, int affinity, int flush)
>  {
>  	struct perf_evsel *evsel;
>  	const struct cpu_map *cpus = evlist->cpus;
> @@ -1032,7 +1032,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  	 * Its value is decided by evsel's write_backward.
>  	 * So &mp should not be passed through const pointer.
>  	 */
> -	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity };
> +	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity, .flush = flush };
>  
>  	if (!evlist->mmap)
>  		evlist->mmap = perf_evlist__alloc_mmap(evlist, false);
> @@ -1064,7 +1064,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  
>  int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
>  {
> -	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS);
> +	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS, 1);
>  }
>  
>  int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 744906dd4887..edf18811e39f 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -165,7 +165,8 @@ unsigned long perf_event_mlock_kb_in_pages(void);
>  
>  int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  			 unsigned int auxtrace_pages,
> -			 bool auxtrace_overwrite, int nr_cblocks, int affinity);
> +			 bool auxtrace_overwrite, int nr_cblocks,
> +			 int affinity, int flush);
>  int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
>  void perf_evlist__munmap(struct perf_evlist *evlist);
>  
> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index cdc7740fc181..ef3d79b2c90b 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -440,6 +440,8 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd, int c
>  
>  	perf_mmap__setup_affinity_mask(map, mp);
>  
> +	map->flush = mp->flush;
> +
>  	if (auxtrace_mmap__mmap(&map->auxtrace_mmap,
>  				&mp->auxtrace_mp, map->base, fd))
>  		return -1;
> @@ -492,7 +494,7 @@ static int __perf_mmap__read_init(struct perf_mmap *md)
>  	md->start = md->overwrite ? head : old;
>  	md->end = md->overwrite ? old : head;
>  
> -	if (md->start == md->end)
> +	if ((md->end - md->start) < md->flush)
>  		return -EAGAIN;
>  
>  	size = md->end - md->start;
> diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
> index e566c19b242b..b82f8c2d55c4 100644
> --- a/tools/perf/util/mmap.h
> +++ b/tools/perf/util/mmap.h
> @@ -39,6 +39,7 @@ struct perf_mmap {
>  	} aio;
>  #endif
>  	cpu_set_t	affinity_mask;
> +	u64		flush;
>  };
>  
>  /*
> @@ -70,7 +71,7 @@ enum bkw_mmap_state {
>  };
>  
>  struct mmap_params {
> -	int			    prot, mask, nr_cblocks, affinity;
> +	int			    prot, mask, nr_cblocks, affinity, flush;
>  	struct auxtrace_mmap_params auxtrace_mp;
>  };

-- 

- Arnaldo


Thread overview: 21+ messages
2019-02-28  8:35 [PATCH v4 0/10] perf: enable compression of record mode trace to save storage space Alexey Budankov
2019-02-28  8:59 ` [PATCH v4 01/10] feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines Alexey Budankov
2019-02-28 18:46   ` Arnaldo Carvalho de Melo
2019-02-28 20:11     ` Alexey Budankov
2019-03-01  6:52       ` Alexey Budankov
2019-02-28  9:00 ` [PATCH v4 02/10] perf record: implement -f,--mmap-flush=<threshold> option Alexey Budankov
2019-02-28 18:44   ` Arnaldo Carvalho de Melo [this message]
2019-02-28 20:19     ` Alexey Budankov
2019-02-28  9:02 ` [PATCH v4 03/10] perf session: define bytes_transferred and bytes_compressed metrics Alexey Budankov
2019-02-28 18:47   ` Arnaldo Carvalho de Melo
2019-02-28 20:23     ` Alexey Budankov
2019-02-28  9:03 ` [PATCH v4 04/10] perf record: implement COMPRESSED event record and its attributes Alexey Budankov
2019-02-28  9:08 ` [PATCH v4 05/10] perf mmap: implement dedicated memory buffer for data compression Alexey Budankov
2019-02-28  9:09 ` [PATCH v4 06/10] perf util: introduce Zstd based streaming compression API Alexey Budankov
2019-02-28  9:11 ` [PATCH v4 07/10] perf record: implement -z,--compression_level=n option and compression Alexey Budankov
2019-02-28  9:16 ` [PATCH v4 08/10] perf report: implement record trace decompression Alexey Budankov
2019-02-28  9:17 ` [PATCH v4 09/10] perf inject: enable COMPRESSED records decompression Alexey Budankov
2019-02-28  9:18 ` [PATCH v4 10/10] perf tests: implement Zstd comp/decomp integration test Alexey Budankov
2019-03-13 14:37 ` [PATCH v4 0/10] perf: enable compression of record mode trace to save storage space Jiri Olsa
2019-03-13 15:00   ` Alexey Budankov
2019-03-13 14:37 ` Jiri Olsa
