From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Andi Kleen <ak@linux.intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 02/10] perf record: implement -f,--mmap-flush=<threshold> option
Date: Thu, 28 Feb 2019 15:44:45 -0300
Message-ID: <20190228184445.GE9508@kernel.org>
In-Reply-To: <4abe0a4f-a0a9-f8e2-d2d1-e24846de15c9@linux.intel.com>

On Thu, Feb 28, 2019 at 12:00:41PM +0300, Alexey Budankov wrote:
> 
> Implement the -f,--mmap-flush option, which specifies a threshold that
> postpones and/or triggers the move of data from mmaped kernel buffers to
> storage. The option can be used to avoid capturing every single byte of
> data into the stored trace. The default option value is 1.

Can you add something here that explains more clearly what this means,
and also in the tools/perf/Documentation/perf-record.txt file? Something
like a paragraph explaining when an mmap flush normally happens, or
rephrase what you wrote if you think you already said that.

- Arnaldo

 
>   $ tools/perf/perf record -f 1024 -e cycles -- matrix.gcc
>   $ tools/perf/perf record --aio -f 1024 -e cycles -- matrix.gcc
> 
> The new sync parameter is the means to force a data move independently
> of the threshold value. Even when a user provides a flush value on the
> command line, the tool needs the capability to drain the memory buffers,
> at least at the end of the collection.
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
>  tools/perf/Documentation/perf-record.txt |  5 +++
>  tools/perf/builtin-record.c              | 53 +++++++++++++++++++++---
>  tools/perf/perf.h                        |  1 +
>  tools/perf/util/evlist.c                 |  6 +--
>  tools/perf/util/evlist.h                 |  3 +-
>  tools/perf/util/mmap.c                   |  4 +-
>  tools/perf/util/mmap.h                   |  3 +-
>  7 files changed, 63 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 8f0c2be34848..1727663382f1 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -459,6 +459,11 @@ Set affinity mask of trace reading thread according to the policy defined by 'mo
>    node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
>    cpu  - thread affinity mask is set to cpu of the processed mmap buffer
>  
> +-f::
> +--mmap-flush=n::
> +Minimal number of bytes accumulated in mmaped kernel buffer that is flushed to a storage (default: 1).
> +Maximal allowed value is a quarter of mmaped kernel buffer size.
> +
>  --all-kernel::
>  Configure all used events to run in kernel space.
>  
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index f3f7f3100336..61818cbce443 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -334,6 +334,29 @@ static int record__aio_enabled(struct record *rec)
>  	return rec->opts.nr_cblocks > 0;
>  }
>  
> +#define MMAP_FLUSH_DEFAULT 1
> +static int record__mmap_flush_parse(const struct option *opt,
> +				    const char *str,
> +				    int unset)
> +{
> +	int mmap_len;
> +	struct record_opts *opts = (struct record_opts *)opt->value;
> +
> +	if (unset)
> +		return 0;
> +
> +	if (str)
> +		opts->mmap_flush = strtol(str, NULL, 0);
> +	if (!opts->mmap_flush)
> +		opts->mmap_flush = MMAP_FLUSH_DEFAULT;
> +
> +	mmap_len = perf_evlist__mmap_size(opts->mmap_pages);
> +	if (opts->mmap_flush > mmap_len / 4)
> +		opts->mmap_flush = mmap_len / 4;
> +
> +	return 0;
> +}
> +
>  static int process_synthesized_event(struct perf_tool *tool,
>  				     union perf_event *event,
>  				     struct perf_sample *sample __maybe_unused,
> @@ -543,7 +566,8 @@ static int record__mmap_evlist(struct record *rec,
>  	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
>  				 opts->auxtrace_mmap_pages,
>  				 opts->auxtrace_snapshot_mode,
> -				 opts->nr_cblocks, opts->affinity) < 0) {
> +				 opts->nr_cblocks, opts->affinity,
> +				 opts->mmap_flush) < 0) {
>  		if (errno == EPERM) {
>  			pr_err("Permission error mapping pages.\n"
>  			       "Consider increasing "
> @@ -733,7 +757,7 @@ static void record__adjust_affinity(struct record *rec, struct perf_mmap *map)
>  }
>  
>  static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
> -				    bool overwrite)
> +				    bool overwrite, bool sync)
>  {
>  	u64 bytes_written = rec->bytes_written;
>  	int i;
> @@ -756,12 +780,19 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
>  		off = record__aio_get_pos(trace_fd);
>  
>  	for (i = 0; i < evlist->nr_mmaps; i++) {
> +		u64 flush = MMAP_FLUSH_DEFAULT;
>  		struct perf_mmap *map = &maps[i];
>  
>  		if (map->base) {
>  			record__adjust_affinity(rec, map);
> +			if (sync) {
> +				flush = map->flush;
> +				map->flush = MMAP_FLUSH_DEFAULT;
> +			}
>  			if (!record__aio_enabled(rec)) {
>  				if (perf_mmap__push(map, rec, record__pushfn) != 0) {
> +					if (sync)
> +						map->flush = flush;
>  					rc = -1;
>  					goto out;
>  				}
> @@ -774,10 +805,14 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
>  				idx = record__aio_sync(map, false);
>  				if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off) != 0) {
>  					record__aio_set_pos(trace_fd, off);
> +					if (sync)
> +						map->flush = flush;
>  					rc = -1;
>  					goto out;
>  				}
>  			}
> +			if (sync)
> +				map->flush = flush;
>  		}
>  
>  		if (map->auxtrace_mmap.base && !rec->opts.auxtrace_snapshot_mode &&
> @@ -803,15 +838,15 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
>  	return rc;
>  }
>  
> -static int record__mmap_read_all(struct record *rec)
> +static int record__mmap_read_all(struct record *rec, bool sync)
>  {
>  	int err;
>  
> -	err = record__mmap_read_evlist(rec, rec->evlist, false);
> +	err = record__mmap_read_evlist(rec, rec->evlist, false, sync);
>  	if (err)
>  		return err;
>  
> -	return record__mmap_read_evlist(rec, rec->evlist, true);
> +	return record__mmap_read_evlist(rec, rec->evlist, true, sync);
>  }
>  
>  static void record__init_features(struct record *rec)
> @@ -1310,7 +1345,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		if (trigger_is_hit(&switch_output_trigger) || done || draining)
>  			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_DATA_PENDING);
>  
> -		if (record__mmap_read_all(rec) < 0) {
> +		if (record__mmap_read_all(rec, false) < 0) {
>  			trigger_error(&auxtrace_snapshot_trigger);
>  			trigger_error(&switch_output_trigger);
>  			err = -1;
> @@ -1411,6 +1446,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		record__synthesize_workload(rec, true);
>  
>  out_child:
> +	record__mmap_read_all(rec, true);
>  	record__aio_mmap_read_sync(rec);
>  
>  	if (forks) {
> @@ -1813,6 +1849,7 @@ static struct record record = {
>  			.uses_mmap   = true,
>  			.default_per_cpu = true,
>  		},
> +		.mmap_flush          = MMAP_FLUSH_DEFAULT,
>  	},
>  	.tool = {
>  		.sample		= process_sample_event,
> @@ -1879,6 +1916,9 @@ static struct option __record_options[] = {
>  	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
>  		     "number of mmap data pages and AUX area tracing mmap pages",
>  		     record__parse_mmap_pages),
> +	OPT_CALLBACK('f', "mmap-flush", &record.opts, "bytes",
> +		     "Minimal number of bytes in mmap data pages that is written to a storage (default: 1)",
> +		     record__mmap_flush_parse),
>  	OPT_BOOLEAN(0, "group", &record.opts.group,
>  		    "put the counters into a counter group"),
>  	OPT_CALLBACK_NOOPT('g', NULL, &callchain_param,
> @@ -2182,6 +2222,7 @@ int cmd_record(int argc, const char **argv)
>  		pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks);
>  
>  	pr_debug("affinity: %s\n", affinity_tags[rec->opts.affinity]);
> +	pr_debug("mmap flush: %d\n", rec->opts.mmap_flush);
>  
>  	err = __cmd_record(&record, argc, argv);
>  out:
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index b120e547ddc7..7886cc9771cf 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -85,6 +85,7 @@ struct record_opts {
>  	u64          clockid_res_ns;
>  	int	     nr_cblocks;
>  	int	     affinity;
> +	int	     mmap_flush;
>  };
>  
>  enum perf_affinity {
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 08cedb643ea6..937039faac59 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1022,7 +1022,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
>   */
>  int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  			 unsigned int auxtrace_pages,
> -			 bool auxtrace_overwrite, int nr_cblocks, int affinity)
> +			 bool auxtrace_overwrite, int nr_cblocks, int affinity, int flush)
>  {
>  	struct perf_evsel *evsel;
>  	const struct cpu_map *cpus = evlist->cpus;
> @@ -1032,7 +1032,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  	 * Its value is decided by evsel's write_backward.
>  	 * So &mp should not be passed through const pointer.
>  	 */
> -	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity };
> +	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity, .flush = flush };
>  
>  	if (!evlist->mmap)
>  		evlist->mmap = perf_evlist__alloc_mmap(evlist, false);
> @@ -1064,7 +1064,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  
>  int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
>  {
> -	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS);
> +	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS, 1);
>  }
>  
>  int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 744906dd4887..edf18811e39f 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -165,7 +165,8 @@ unsigned long perf_event_mlock_kb_in_pages(void);
>  
>  int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  			 unsigned int auxtrace_pages,
> -			 bool auxtrace_overwrite, int nr_cblocks, int affinity);
> +			 bool auxtrace_overwrite, int nr_cblocks,
> +			 int affinity, int flush);
>  int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
>  void perf_evlist__munmap(struct perf_evlist *evlist);
>  
> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index cdc7740fc181..ef3d79b2c90b 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -440,6 +440,8 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd, int c
>  
>  	perf_mmap__setup_affinity_mask(map, mp);
>  
> +	map->flush = mp->flush;
> +
>  	if (auxtrace_mmap__mmap(&map->auxtrace_mmap,
>  				&mp->auxtrace_mp, map->base, fd))
>  		return -1;
> @@ -492,7 +494,7 @@ static int __perf_mmap__read_init(struct perf_mmap *md)
>  	md->start = md->overwrite ? head : old;
>  	md->end = md->overwrite ? old : head;
>  
> -	if (md->start == md->end)
> +	if ((md->end - md->start) < md->flush)
>  		return -EAGAIN;
>  
>  	size = md->end - md->start;
> diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
> index e566c19b242b..b82f8c2d55c4 100644
> --- a/tools/perf/util/mmap.h
> +++ b/tools/perf/util/mmap.h
> @@ -39,6 +39,7 @@ struct perf_mmap {
>  	} aio;
>  #endif
>  	cpu_set_t	affinity_mask;
> +	u64		flush;
>  };
>  
>  /*
> @@ -70,7 +71,7 @@ enum bkw_mmap_state {
>  };
>  
>  struct mmap_params {
> -	int			    prot, mask, nr_cblocks, affinity;
> +	int			    prot, mask, nr_cblocks, affinity, flush;
>  	struct auxtrace_mmap_params auxtrace_mp;
>  };
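
As a reading aid, the threshold behaviour described in the quoted patch
can be sketched in isolation. This is only a minimal sketch, not the perf
code: struct ring_view, flush_threshold_parse() and ring_should_flush()
are hypothetical names that stand in for the mmap_flush option parsing
and the check in __perf_mmap__read_init(). It assumes a non-overwrite
buffer where end >= start, and it treats any non-positive input as the
default, which is a simplification of what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MMAP_FLUSH_DEFAULT 1

/* Hypothetical stand-in for the few perf_mmap fields the check uses. */
struct ring_view {
	uint64_t start;	/* first byte not yet written to storage */
	uint64_t end;	/* one past the last byte produced by the kernel */
	uint64_t flush;	/* minimal number of pending bytes worth writing */
};

/*
 * Parse a user-supplied threshold and clamp it to [1, mmap_len / 4],
 * mirroring the "quarter of the mmaped buffer" limit from the patch.
 * Assumption: non-positive or missing input falls back to the default.
 */
static uint64_t flush_threshold_parse(const char *str, uint64_t mmap_len)
{
	long val = str ? strtol(str, NULL, 0) : 0;
	uint64_t max = mmap_len / 4;
	uint64_t flush;

	assert(mmap_len >= 4);	/* otherwise the upper bound would be 0 */

	flush = val > 0 ? (uint64_t)val : MMAP_FLUSH_DEFAULT;
	return flush > max ? max : flush;
}

/*
 * Return true when the pending data should be moved to storage.
 * With sync set, the threshold drops to the default of 1 byte, so
 * anything pending is drained, e.g. at the end of the collection.
 */
static bool ring_should_flush(const struct ring_view *rv, bool sync)
{
	uint64_t pending = rv->end - rv->start;
	uint64_t threshold = sync ? MMAP_FLUSH_DEFAULT : rv->flush;

	return pending >= threshold;
}
```

With a 64 KiB buffer, a requested threshold of 1024 is kept as is, while
a request larger than 16384 bytes is clamped to 16384. A buffer holding
500 pending bytes is skipped in the normal loop when the threshold is
1024, but is written out once sync is set.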

-- 

- Arnaldo
