From: Frederic Weisbecker <fweisbec@gmail.com>
To: David Ahern <dsahern@gmail.com>
Cc: acme@ghostprotocols.net, linux-kernel@vger.kernel.org,
Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
Mike Galbraith <efault@gmx.de>,
Namhyung Kim <namhyung@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH] perf session: Add option to copy events when queueing
Date: Sat, 14 Sep 2013 18:16:27 +0200 [thread overview]
Message-ID: <20130914161626.GD1718@localhost.localdomain> (raw)
In-Reply-To: <1378496221-61525-1-git-send-email-dsahern@gmail.com>
On Fri, Sep 06, 2013 at 01:37:01PM -0600, David Ahern wrote:
> When processing events the session code has an ordered samples queue which is
> used to time-sort events coming in across multiple mmaps. At a later point in
> time samples on the queue are flushed up to some timestamp at which point the
> event is actually processed.
>
> When analyzing events live (ie., record/analysis path in the same command)
> there is a race that leads to corrupted events and parse errors which cause
> perf to terminate. The problem is that when the event is placed in the ordered
> samples queue it is only a reference to the event which is really sitting in
> the mmap buffer. Even though the event is queued for later processing the mmap
> tail pointer is updated which indicates to the kernel that the event has been
> processed. The race is flushing the event from the queue before it gets
> overwritten by some other event. For commands trying to process events live
> (versus just writing to a file) and processing a high rate of events this leads
> to parse failures and perf terminates.
>
> Examples hitting this problem are 'perf kvm stat live', especially with nested
> VMs which generate 100,000+ traces per second, and a command processing
> scheduling events with a high rate of context switching -- e.g., running
> 'perf bench sched pipe'.
>
> This patch offers live commands an option to copy the event when it is placed in
> the ordered samples queue.
>
> Signed-off-by: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Mike Galbraith <efault@gmx.de>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Stephane Eranian <eranian@google.com>
> ---
> tools/perf/util/session.c | 17 ++++++++++++++---
> tools/perf/util/session.h | 1 +
> 2 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 1b185ca..71f16db 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -483,6 +483,8 @@ static void perf_session_free_sample_buffers(struct perf_session *session)
>
> sq = list_entry(os->to_free.next, struct sample_queue, list);
> list_del(&sq->list);
> + if (session->copy_on_queue)
> + free(sq->event);
> free(sq);
> }
> }
> @@ -513,11 +515,15 @@ static int flush_sample_queue(struct perf_session *s,
> break;
>
> ret = perf_evlist__parse_sample(s->evlist, iter->event, &sample);
> - if (ret)
> + if (ret) {
> pr_err("Can't parse sample, err = %d\n", ret);
> - else {
> + if (s->copy_on_queue)
> + free(iter->event);
> + } else {
> ret = perf_session_deliver_event(s, iter->event, &sample, tool,
> iter->file_offset);
> + if (s->copy_on_queue)
> + free(iter->event);
> if (ret)
> return ret;
> }
> @@ -676,7 +682,12 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
>
> new->timestamp = timestamp;
> new->file_offset = file_offset;
> - new->event = event;
> +
> + if (s->copy_on_queue) {
> + new->event = malloc(event->header.size);
> + memcpy(new->event, event, event->header.size);
> + } else
> + new->event = event;
>
> __queue_event(new, s);
>
> diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
> index 3aa75fb..4adfcbb 100644
> --- a/tools/perf/util/session.h
> +++ b/tools/perf/util/session.h
> @@ -38,6 +38,7 @@ struct perf_session {
> bool fd_pipe;
> bool repipe;
> struct ordered_samples ordered_samples;
> + bool copy_on_queue;
So do you think it should stay optional? This looks like a global problem, I mean
the event can be unmapped anytime for any builtin tool mapping it, right?
Also we already allocate the sample list node (struct sample_queue) from os->sample
buffer. ie: we have our own allocator there.
Probably we should reuse that and include the copied event space in "struct sample_queue"?
Also looking at it now, it seems we have a bug on the existing code:
if (!list_empty(sc)) {
new = list_entry(sc->next, struct sample_queue, list);
list_del(&new->list);
} else if (os->sample_buffer) {
new = os->sample_buffer + os->sample_buffer_idx;
if (++os->sample_buffer_idx == MAX_SAMPLE_BUFFER)
os->sample_buffer = NULL;
} else {
os->sample_buffer = malloc(MAX_SAMPLE_BUFFER * sizeof(*new));
if (!os->sample_buffer)
return -ENOMEM;
list_add(&os->sample_buffer->list, &os->to_free);
os->sample_buffer_idx = 2;
new = os->sample_buffer + 1;
}
If we actually run out of buffer rooms, we should realloc right after and not
wait for the next entry, otherwise we loose an event:
if (!list_empty(sc)) {
new = list_entry(sc->next, struct sample_queue, list);
list_del(&new->list);
} else {
if (os->sample_buffer) {
new = os->sample_buffer + os->sample_buffer_idx;
if (++os->sample_buffer_idx == MAX_SAMPLE_BUFFER)
os->sample_buffer = NULL;
}
if (!os->sample_buffer) {
os->sample_buffer = malloc(MAX_SAMPLE_BUFFER * sizeof(*new));
if (!os->sample_buffer)
return -ENOMEM;
list_add(&os->sample_buffer->list, &os->to_free);
os->sample_buffer_idx = 2;
new = os->sample_buffer + 1;
}
Although the mirrored os->sample_buffer condition check is a bit ugly and should move to
a function. But the idea is there.
next prev parent reply other threads:[~2013-09-14 16:16 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-06 19:37 [PATCH] perf session: Add option to copy events when queueing David Ahern
2013-09-14 16:16 ` Frederic Weisbecker [this message]
2013-09-14 17:25 ` David Ahern
2013-09-16 16:40 ` Frederic Weisbecker
2013-09-17 4:14 ` David Ahern
2013-10-24 9:30 ` Frederic Weisbecker
2013-10-24 10:23 ` David Ahern
2013-10-24 12:27 ` Arnaldo Melo
2013-10-24 13:12 ` David Ahern
2013-10-24 14:07 ` Arnaldo Melo
2013-10-25 16:04 ` David Ahern
2013-10-02 12:18 ` Jiri Olsa
2013-10-02 12:38 ` Jiri Olsa
2013-10-02 12:52 ` David Ahern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130914161626.GD1718@localhost.localdomain \
--to=fweisbec@gmail.com \
--cc=acme@ghostprotocols.net \
--cc=dsahern@gmail.com \
--cc=efault@gmx.de \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.