linux-perf-users.vger.kernel.org archive mirror
From: Riccardo Mancini <rickyman7@gmail.com>
To: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andi Kleen <ak@linux.intel.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Antonov <alexander.antonov@linux.intel.com>,
	Alexei Budankov <abudankov@huawei.com>,
	linux-perf-users@vger.kernel.org, Ian Rogers <irogers@google.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: [PATCH v6 20/20] perf session: Load data directory files for analysis
Date: Fri, 04 Jun 2021 01:28:17 +0200
Message-ID: <c3f5c4ecfc86ec1de29f6db681b2e5fce7ef23a3.camel@gmail.com>
In-Reply-To: <be40346cdb384e0721f79d918067ff9026743845.1622025774.git.alexey.v.bayduraev@linux.intel.com>

Hi,

On Wed, 2021-05-26 at 13:53 +0300, Alexey Bayduraev wrote:
> Load data directory files and provide basic raw dump and aggregated
> analysis support of data directories in report mode, still with no
> memory consumption optimizations.
> 
> Design and implementation are based on the prototype [1], [2].
> 
> [1] git clone https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git -b perf/record_threads
> [2] https://lore.kernel.org/lkml/20180913125450.21342-1-jolsa@kernel.org/
> 
> Suggested-by: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
> ---
>  tools/perf/util/session.c | 129 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 129 insertions(+)
> 
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 041601810b85..dd4ef9749cd0 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -65,6 +65,7 @@ struct reader_state {
>         u64      data_size;
>         u64      head;
>         bool     eof;
> +       u64      size;
>  };
>  
>  enum {
> @@ -2319,6 +2320,7 @@ reader__read_event(struct reader *rd, struct perf_session *session,
>         if (skip)
>                 size += skip;
>  
> +       st->size += size;
>         st->head += size;
>         st->file_pos += size;
>  
> @@ -2418,6 +2420,130 @@ static int __perf_session__process_events(struct perf_session *session)
>         return err;
>  }
>  
> +/*
> + * This function reads, merges and processes directory data.
> + * It assumes version 1 of directory data, where each
> + * data file holds per-cpu data, already sorted by the kernel.
> + */
> +static int __perf_session__process_dir_events(struct perf_session *session)
> +{
> +       struct perf_data *data = session->data;
> +       struct perf_tool *tool = session->tool;
> +       int i, ret = 0, readers = 1;
> +       struct ui_progress prog;
> +       u64 total_size = perf_data__size(session->data);
> +       struct reader *rd;
> +
> +       perf_tool__fill_defaults(tool);
> +
> +       ui_progress__init_size(&prog, total_size, "Sorting events...");
> +
> +       for (i = 0; i < data->dir.nr; i++) {
> +               if (data->dir.files[i].size)
> +                       readers++;
> +       }
> +
> +       rd = session->readers = zalloc(readers * sizeof(struct reader));
> +       if (!rd)
> +               return -ENOMEM;
> +       session->nr_readers = readers;
> +       readers = 0;
> +
> +       rd[readers] = (struct reader) {
> +               .fd              = perf_data__fd(session->data),
> +               .path            = session->data->file.path,
> +               .data_size       = session->header.data_size,
> +               .data_offset     = session->header.data_offset,
> +               .in_place_update = session->data->in_place_update,
> +       };
> +       ret = reader__init(&rd[readers], NULL);
> +       if (ret)
> +               goto out_err;
> +       ret = reader__mmap(&rd[readers], session);
> +       if (ret != READER_OK) {
> +               if (ret == READER_EOF)
> +                       ret = -EINVAL;
> +               goto out_err;
> +       }
> +       readers++;
> +
> +       for (i = 0; i < data->dir.nr; i++) {
> +               if (data->dir.files[i].size) {
> +                       rd[readers] = (struct reader) {
> +                               .fd              = data->dir.files[i].fd,
> +                               .path            = data->dir.files[i].path,
> +                               .data_size       = data->dir.files[i].size,
> +                               .data_offset     = 0,
> +                               .in_place_update = session->data->in_place_update,
> +                       };
> +                       ret = reader__init(&rd[readers], NULL);

zstd_fini() is never called on rd[readers].zstd_data, so the zstd data is
never released. Maybe this can be done in perf_session__delete(). For example,
we could add a new reader__fini() function that cleans up the zstd data and
calls perf_decomp__release_events().
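
Roughly what I have in mind is sketched below. It is untested and only meant
to show the shape of the cleanup: the struct definitions, the `decomp` field
name and the bodies of zstd_fini() and perf_decomp__release_events() are
simplified stand-ins so that the snippet compiles on its own, the signature of
perf_decomp__release_events() is an assumption, and readers__delete() is a
hypothetical helper standing in for what perf_session__delete() would do.

```c
#include <stdlib.h>

/* Stand-in types: NOT the real perf definitions. */
struct zstd_data { void *dstream; };
struct decomp { struct decomp *next; };

struct reader {
	struct zstd_data zstd_data;
	struct decomp *decomp;	/* hypothetical field name */
};

/* Stand-in for the real zstd_fini(): releases the decompression state. */
static int zstd_fini(struct zstd_data *data)
{
	free(data->dstream);
	data->dstream = NULL;
	return 0;
}

/* Stand-in (assumed signature): frees the list of decompressed chunks. */
static void perf_decomp__release_events(struct decomp *next)
{
	while (next) {
		struct decomp *cur = next;

		next = next->next;
		free(cur);
	}
}

/* Proposed helper: undo what reader__init() set up for one reader. */
static void reader__fini(struct reader *rd)
{
	perf_decomp__release_events(rd->decomp);
	rd->decomp = NULL;
	zstd_fini(&rd->zstd_data);
}

/* Hypothetical: what perf_session__delete() could do with session->readers. */
static void readers__delete(struct reader *rd, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		reader__fini(&rd[i]);
	free(rd);
}
```

Resetting the pointers in reader__fini() makes it harmless to call twice,
which helps if the error path in __perf_session__process_dir_events() and
perf_session__delete() both end up calling it.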

Thanks,
Riccardo

> +                       if (ret)
> +                               goto out_err;
> +                       ret = reader__mmap(&rd[readers], session);
> +                       if (ret != READER_OK) {
> +                               if (ret == READER_EOF)
> +                                       ret = -EINVAL;
> +                               goto out_err;
> +                       }
> +                       readers++;
> +               }
> +       }
> +
> +       i = 0;
> +
> +       while ((ret >= 0) && readers) {
> +               if (session_done())
> +                       return 0;
> +
> +               if (rd[i].state.eof) {
> +                       i = (i + 1) % session->nr_readers;
> +                       continue;
> +               }
> +
> +               ret = reader__read_event(&rd[i], session, &prog);
> +               if (ret < 0)
> +                       break;
> +               if (ret == READER_EOF) {
> +                       ret = reader__mmap(&rd[i], session);
> +                       if (ret < 0)
> +                               goto out_err;
> +                       if (ret == READER_EOF)
> +                               readers--;
> +               }
> +
> +               /*
> +                * Processing 10MBs of data from each reader in sequence,
> +                * because that's the way the ordered events sorting works
> +                * most efficiently.
> +                */
> +               if (rd[i].state.size >= 10*1024*1024) {
> +                       rd[i].state.size = 0;
> +                       i = (i + 1) % session->nr_readers;
> +               }
> +       }
> +
> +       ret = ordered_events__flush(&session->ordered_events, OE_FLUSH__FINAL);
> +       if (ret)
> +               goto out_err;
> +
> +       ret = perf_session__flush_thread_stacks(session);
> +out_err:
> +       ui_progress__finish();
> +
> +       if (!tool->no_warn)
> +               perf_session__warn_about_errors(session);
> +
> +       /*
> +        * We may be switching perf.data output, make ordered_events
> +        * reusable.
> +        */
> +       ordered_events__reinit(&session->ordered_events);
> +
> +       session->one_mmap = false;
> +
> +       return ret;
> +}
> +
>  int perf_session__process_events(struct perf_session *session)
>  {
>         if (perf_session__register_idle_thread(session) < 0)
> @@ -2426,6 +2552,9 @@ int perf_session__process_events(struct perf_session *session)
>         if (perf_data__is_pipe(session->data))
>                 return __perf_session__process_pipe_events(session);
>  
> +       if (perf_data__is_dir(session->data))
> +               return __perf_session__process_dir_events(session);
> +
>         return __perf_session__process_events(session);
>  }
>  


