From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org,
Alexander Yarygin <yarygin@linux.vnet.ibm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
David Ahern <dsahern@gmail.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Jiri Olsa <jolsa@redhat.com>, Mike Galbraith <efault@gmx.de>,
Namhyung Kim <namhyung.kim@lge.com>,
Paul Mackerras <paulus@samba.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Stephane Eranian <eranian@google.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 14/15] perf session: Add option to copy events when queueing
Date: Wed, 15 Oct 2014 17:52:47 -0300
Message-ID: <1413406368-26245-15-git-send-email-acme@kernel.org>
In-Reply-To: <1413406368-26245-1-git-send-email-acme@kernel.org>
From: Alexander Yarygin <yarygin@linux.vnet.ibm.com>

When processing events, the session code uses an ordered samples queue
to time-sort events coming in across multiple mmaps. At a later point,
samples on the queue are flushed up to some timestamp, and only then is
each event actually processed.

When analyzing events live (i.e., the record and analysis paths are in
the same command) there is a race that leads to corrupted events and
parse errors, which cause perf to terminate. The problem is that what is
placed in the ordered samples queue is only a reference to the event,
which is really sitting in the mmap buffer. Even though the event is
only queued for later processing, the mmap tail pointer is updated,
which indicates to the kernel that the event has been processed and its
space can be reused. The race is between flushing the event from the
queue and the kernel overwriting it with some other event. For commands
that process events live (versus just writing them to a file) at a high
event rate, this leads to parse failures and perf terminates.
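
The race can be modelled outside of perf. The sketch below is a
self-contained toy, not perf code: toy_event, ring_write(),
queue_by_ref() and queue_by_copy() are hypothetical names, and a
two-slot array stands in for the mmap ring buffer. It only illustrates
why a queued reference can change under the consumer once its slot has
been handed back, while a queued copy cannot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an event record; not the real union perf_event. */
struct toy_event {
	uint16_t size;		/* total record size, in the spirit of header.size */
	uint64_t payload;
};

#define RING_SLOTS 2

/* A tiny ring that plays the role of the mmap buffer. */
static struct toy_event ring[RING_SLOTS];
static unsigned int ring_head;

/* Producer side: write into the next slot, reusing slots once it wraps. */
static struct toy_event *ring_write(uint64_t payload)
{
	struct toy_event *slot = &ring[ring_head++ % RING_SLOTS];

	slot->size = sizeof(*slot);
	slot->payload = payload;
	return slot;
}

/* Queueing by reference: the queue entry points into the ring. */
static struct toy_event *queue_by_ref(struct toy_event *ev)
{
	return ev;
}

/* Queueing by copy: the queue entry owns private memory. */
static struct toy_event *queue_by_copy(const struct toy_event *ev)
{
	struct toy_event *dup = malloc(ev->size);

	if (dup)
		memcpy(dup, ev, ev->size);
	return dup;
}

/*
 * Queue the first record both ways, let the producer wrap around, then
 * report what each queue entry holds at "flush" time.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
int toy_race_demo(uint64_t *seen_by_ref, uint64_t *seen_by_copy)
{
	struct toy_event *first, *by_ref, *by_copy;

	assert(seen_by_ref && seen_by_copy);

	ring_head = 0;
	first = ring_write(1);
	by_ref = queue_by_ref(first);
	by_copy = queue_by_copy(first);
	if (!by_copy)
		return -1;

	/* The slot has been released, so later records may land on it. */
	ring_write(2);
	ring_write(3);	/* wraps and overwrites the slot of the first record */

	*seen_by_ref = by_ref->payload;
	*seen_by_copy = by_copy->payload;
	free(by_copy);
	return 0;
}
```

After toy_race_demo() returns, the by-reference view holds the payload
of the record that wrapped onto the slot, while the copy still holds the
payload that was queued.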

Examples hitting this problem are 'perf kvm stat live', especially with
nested VMs which generate 100,000+ traces per second, and a command
processing scheduling events with a high rate of context switching --
e.g., running 'perf bench sched pipe'.
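
This patch offers live commands an option to copy the event when it is
placed in the ordered samples queue, so that the queued event no longer
depends on the contents of the mmap buffer. Callers enable it with
ordered_events__set_copy_on_queue(); the copies are accounted against
the queue's max_alloc_size and freed when the event is deleted from the
queue.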
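
A minimal sketch of the copy-on-queue accounting, written as a
standalone toy so it can be compiled without the perf tree. The toy_*
names and the 32-byte event are assumptions made for illustration; only
the shape (duplicate while under the allocation limit, subtract on free,
retry after something has been released) follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 32-byte event; header.size covers the whole record. */
struct toy_event {
	struct { uint16_t size; } header;
	char data[30];
};

/* Only the fields needed for the accounting, named after the patch. */
struct toy_queue {
	uint64_t cur_alloc_size;
	uint64_t max_alloc_size;
	bool copy_on_queue;
};

/* Return the event itself, or a private copy charged to the budget. */
static struct toy_event *toy_dup_event(struct toy_queue *q, struct toy_event *ev)
{
	struct toy_event *dup;

	if (!q->copy_on_queue)
		return ev;
	if (q->cur_alloc_size >= q->max_alloc_size)
		return NULL;	/* over budget: caller should flush and retry */

	dup = malloc(ev->header.size);
	if (!dup)
		return NULL;
	memcpy(dup, ev, ev->header.size);
	q->cur_alloc_size += ev->header.size;
	return dup;
}

/* Undo toy_dup_event(); a no-op when events are queued by reference. */
static void toy_free_dup_event(struct toy_queue *q, struct toy_event *ev)
{
	if (q->copy_on_queue) {
		q->cur_alloc_size -= ev->header.size;
		free(ev);
	}
}

/*
 * Fill a 64-byte budget with two copies, observe the third attempt fail,
 * release one copy (standing in for a flush) and retry.
 * Returns the number of bytes still charged at the end, or -1 on error.
 */
int64_t toy_budget_demo(void)
{
	struct toy_queue q = { .max_alloc_size = 64, .copy_on_queue = true };
	struct toy_event ev = { .header = { .size = sizeof(ev) } };
	struct toy_event *a, *b, *c;
	int64_t charged;

	a = toy_dup_event(&q, &ev);
	b = toy_dup_event(&q, &ev);
	if (!a || !b)
		return -1;
	if (toy_dup_event(&q, &ev) != NULL)
		return -1;	/* budget of 64 bytes should be exhausted */

	toy_free_dup_event(&q, a);	/* "flush" one queued event */
	c = toy_dup_event(&q, &ev);	/* the retry now fits */
	if (!c)
		return -1;

	charged = (int64_t)q.cur_alloc_size;
	toy_free_dup_event(&q, b);
	toy_free_dup_event(&q, c);
	return charged;
}
```

The failed third attempt followed by a release and a retry is the same
pattern perf_session_queue_event() uses when ordered_events__new()
returns NULL: flush with OE_FLUSH__HALF, then try once more.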

Based on a patch from David Ahern <dsahern@gmail.com>

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1412347212-28237-2-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/ordered-events.c | 49 ++++++++++++++++++++++++++++++++++++----
tools/perf/util/ordered-events.h | 10 +++++++-
tools/perf/util/session.c | 5 ++--
3 files changed, 56 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index 706ce1a66169..fd4be94125fb 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -1,5 +1,6 @@
 #include <linux/list.h>
 #include <linux/compiler.h>
+#include <linux/string.h>
 #include "ordered-events.h"
 #include "evlist.h"
 #include "session.h"
@@ -57,11 +58,45 @@ static void queue_event(struct ordered_events *oe, struct ordered_event *new)
 	}
 }

+static union perf_event *__dup_event(struct ordered_events *oe,
+				     union perf_event *event)
+{
+	union perf_event *new_event = NULL;
+
+	if (oe->cur_alloc_size < oe->max_alloc_size) {
+		new_event = memdup(event, event->header.size);
+		if (new_event)
+			oe->cur_alloc_size += event->header.size;
+	}
+
+	return new_event;
+}
+
+static union perf_event *dup_event(struct ordered_events *oe,
+				   union perf_event *event)
+{
+	return oe->copy_on_queue ? __dup_event(oe, event) : event;
+}
+
+static void free_dup_event(struct ordered_events *oe, union perf_event *event)
+{
+	if (oe->copy_on_queue) {
+		oe->cur_alloc_size -= event->header.size;
+		free(event);
+	}
+}
+
 #define MAX_SAMPLE_BUFFER (64 * 1024 / sizeof(struct ordered_event))
-static struct ordered_event *alloc_event(struct ordered_events *oe)
+static struct ordered_event *alloc_event(struct ordered_events *oe,
+					 union perf_event *event)
 {
 	struct list_head *cache = &oe->cache;
 	struct ordered_event *new = NULL;
+	union perf_event *new_event;
+
+	new_event = dup_event(oe, event);
+	if (!new_event)
+		return NULL;

 	if (!list_empty(cache)) {
 		new = list_entry(cache->next, struct ordered_event, list);
@@ -74,8 +109,10 @@ static struct ordered_event *alloc_event(struct ordered_events *oe)
 		size_t size = MAX_SAMPLE_BUFFER * sizeof(*new);

 		oe->buffer = malloc(size);
-		if (!oe->buffer)
+		if (!oe->buffer) {
+			free_dup_event(oe, new_event);
 			return NULL;
+		}

 		pr("alloc size %" PRIu64 "B (+%zu), max %" PRIu64 "B\n",
 		   oe->cur_alloc_size, size, oe->max_alloc_size);
@@ -90,15 +127,17 @@ static struct ordered_event *alloc_event(struct ordered_events *oe)
 		pr("allocation limit reached %" PRIu64 "B\n", oe->max_alloc_size);
 	}

+	new->event = new_event;
 	return new;
 }

 struct ordered_event *
-ordered_events__new(struct ordered_events *oe, u64 timestamp)
+ordered_events__new(struct ordered_events *oe, u64 timestamp,
+		    union perf_event *event)
 {
 	struct ordered_event *new;

-	new = alloc_event(oe);
+	new = alloc_event(oe, event);
 	if (new) {
 		new->timestamp = timestamp;
 		queue_event(oe, new);
@@ -111,6 +150,7 @@ void ordered_events__delete(struct ordered_events *oe, struct ordered_event *eve
 {
 	list_move(&event->list, &oe->cache);
 	oe->nr_events--;
+	free_dup_event(oe, event->event);
 }

 static int __ordered_events__flush(struct perf_session *s,
@@ -240,6 +280,7 @@ void ordered_events__free(struct ordered_events *oe)

 		event = list_entry(oe->to_free.next, struct ordered_event, list);
 		list_del(&event->list);
+		free_dup_event(oe, event->event);
 		free(event);
 	}
 }
diff --git a/tools/perf/util/ordered-events.h b/tools/perf/util/ordered-events.h
index 3b2f20542a01..7b8f9b011f38 100644
--- a/tools/perf/util/ordered-events.h
+++ b/tools/perf/util/ordered-events.h
@@ -34,9 +34,11 @@ struct ordered_events {
 	int buffer_idx;
 	unsigned int nr_events;
 	enum oe_flush last_flush_type;
+	bool copy_on_queue;
 };

-struct ordered_event *ordered_events__new(struct ordered_events *oe, u64 timestamp);
+struct ordered_event *ordered_events__new(struct ordered_events *oe, u64 timestamp,
+					  union perf_event *event);
 void ordered_events__delete(struct ordered_events *oe, struct ordered_event *event);
 int ordered_events__flush(struct perf_session *s, struct perf_tool *tool,
 			  enum oe_flush how);
@@ -48,4 +50,10 @@ void ordered_events__set_alloc_size(struct ordered_events *oe, u64 size)
 {
 	oe->max_alloc_size = size;
 }
+
+static inline
+void ordered_events__set_copy_on_queue(struct ordered_events *oe, bool copy)
+{
+	oe->copy_on_queue = copy;
+}
 #endif /* __ORDERED_EVENTS_H */
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 896bac73ea08..6702ac28754b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -532,17 +532,16 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 		return -EINVAL;
 	}

-	new = ordered_events__new(oe, timestamp);
+	new = ordered_events__new(oe, timestamp, event);
 	if (!new) {
 		ordered_events__flush(s, tool, OE_FLUSH__HALF);
-		new = ordered_events__new(oe, timestamp);
+		new = ordered_events__new(oe, timestamp, event);
 	}

 	if (!new)
 		return -ENOMEM;

 	new->file_offset = file_offset;
-	new->event = event;
 	return 0;
 }

--
1.9.3