* [PATCH] perf: detect loops processing events
@ 2013-05-08 15:18 David Ahern
2013-05-09 9:24 ` Ingo Molnar
0 siblings, 1 reply; 5+ messages in thread
From: David Ahern @ 2013-05-08 15:18 UTC (permalink / raw)
To: acme, linux-kernel
Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
Jiri Olsa, Namhyung Kim, Stephane Eranian
Recovery algorithm in __perf_session__process_events attempts to remap
a perf.data file with a different file_offset and try again at a new head
position. Both of these adjustment rely on page_offset. If page_offset is
0 then file_offset and head never change which means the remap attempt is
the same and the fetch_mmaped_event is the same and the processing just
loops forever.
Detect this condition and warn the user.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
---
tools/perf/util/session.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index cf1fe01..1c4dc45 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1235,6 +1235,12 @@ more:
}
page_offset = page_size * (head / page_size);
+ /* catch looping where we never make forward progress. */
+ if (page_offset == 0) {
+ pr_err("Loop detection processing events. Is file corrupted?\n");
+ return -1;
+ }
+
file_offset += page_offset;
head -= page_offset;
goto remap;
--
1.7.10.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] perf: detect loops processing events
2013-05-08 15:18 [PATCH] perf: detect loops processing events David Ahern
@ 2013-05-09 9:24 ` Ingo Molnar
2013-05-09 9:30 ` Ingo Molnar
0 siblings, 1 reply; 5+ messages in thread
From: Ingo Molnar @ 2013-05-09 9:24 UTC (permalink / raw)
To: David Ahern
Cc: acme, linux-kernel, Frederic Weisbecker, Peter Zijlstra,
Jiri Olsa, Namhyung Kim, Stephane Eranian
* David Ahern <dsahern@gmail.com> wrote:
> Recovery algorithm in __perf_session__process_events attempts to remap
> a perf.data file with a different file_offset and try again at a new head
> position. Both of these adjustment rely on page_offset. If page_offset is
> 0 then file_offset and head never change which means the remap attempt is
> the same and the fetch_mmaped_event is the same and the processing just
> loops forever.
>
> Detect this condition and warn the user.
>
> Signed-off-by: David Ahern <dsahern@gmail.com>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Stephane Eranian <eranian@google.com>
> ---
> tools/perf/util/session.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index cf1fe01..1c4dc45 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1235,6 +1235,12 @@ more:
> }
>
> page_offset = page_size * (head / page_size);
> + /* catch looping where we never make forward progress. */
> + if (page_offset == 0) {
> + pr_err("Loop detection processing events. Is file corrupted?\n");
> + return -1;
> + }
> +
> file_offset += page_offset;
> head -= page_offset;
> goto remap;
Ah, nice!
Btw., would it make sense to emit a (once-only) warning and optimistically
fix page_offset up to 1 (or 4096) and let things continue with the next
set of data - can we recover most of the data in that case?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] perf: detect loops processing events
2013-05-09 9:24 ` Ingo Molnar
@ 2013-05-09 9:30 ` Ingo Molnar
2013-05-10 1:10 ` Namhyung Kim
0 siblings, 1 reply; 5+ messages in thread
From: Ingo Molnar @ 2013-05-09 9:30 UTC (permalink / raw)
To: David Ahern
Cc: acme, linux-kernel, Frederic Weisbecker, Peter Zijlstra,
Jiri Olsa, Namhyung Kim, Stephane Eranian
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * David Ahern <dsahern@gmail.com> wrote:
>
> > Recovery algorithm in __perf_session__process_events attempts to remap
> > a perf.data file with a different file_offset and try again at a new head
> > position. Both of these adjustment rely on page_offset. If page_offset is
> > 0 then file_offset and head never change which means the remap attempt is
> > the same and the fetch_mmaped_event is the same and the processing just
> > loops forever.
> >
> > Detect this condition and warn the user.
> >
> > Signed-off-by: David Ahern <dsahern@gmail.com>
> > Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Jiri Olsa <jolsa@redhat.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Stephane Eranian <eranian@google.com>
> > ---
> > tools/perf/util/session.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> > index cf1fe01..1c4dc45 100644
> > --- a/tools/perf/util/session.c
> > +++ b/tools/perf/util/session.c
> > @@ -1235,6 +1235,12 @@ more:
> > }
> >
> > page_offset = page_size * (head / page_size);
> > + /* catch looping where we never make forward progress. */
> > + if (page_offset == 0) {
> > + pr_err("Loop detection processing events. Is file corrupted?\n");
> > + return -1;
> > + }
> > +
> > file_offset += page_offset;
> > head -= page_offset;
> > goto remap;
>
> Ah, nice!
>
> Btw., would it make sense to emit a (once-only) warning and optimistically
> fix page_offset up to 1 (or 4096) and let things continue with the next
> set of data - can we recover most of the data in that case?
Basically, what'd like to do with binary data is similar to what we do
with text data if some trace output is corrupted:
sshd-15478 [002] 1100.859353: 15478:120:S ==> [002] 0:140:R <idle>
<idle>-0 [005] 1100.859378: 0:140:R ==> [005] 20169:120:R cat
cat-20169 [005] 1100.860718: 20169:120:R + [005] 15521:120:S bash
^@¨<81><92>^@^Bª^P^P^D<9c>8@<88>^A^M ;¤0 ^D<90>"ª<81>^B^T)Ò^C$^@^N^@^A
cat-20169 [005] 1100.860720: 20169:120:R + [005] 305:115:S kblockd/5
cat-20169 [005] 1100.860722: 20169:120:? ==> [005] 305:115:R kblockd/5
kblockd/5-305 [005] 1100.860755: 305:115:S ==> [005] 15521:120:R bash
bash-15521 [005] 1100.860792: 15521:120:S + [002] 15478:120:S sshd
<idle>-0 [002] 1100.860853: 0:140:R ==> [002] 15478:120:R sshd
sshd-15478 [002] 1100.860895: 15478:120:S ==> [002] 0:140:R <idle>
bash-15521 [005] 1100.860925: 15521:120:S + [002] 15478:120:S sshd
bash-15521 [005] 1100.860999: 15521:120:S ==> [005] 0:140:R <idle>
See that junk in the middle, sign of some sort of file corruption? Instead
of detecting it and aborting we just try to skip that line and try to find
the next useful looking line, ignoring the junk and bits around it.
Is there a perf.data equivalent of intelligently trying to skip to the
next plausible looking event record?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] perf: detect loops processing events
2013-05-09 9:30 ` Ingo Molnar
@ 2013-05-10 1:10 ` Namhyung Kim
2013-05-10 1:40 ` David Ahern
0 siblings, 1 reply; 5+ messages in thread
From: Namhyung Kim @ 2013-05-10 1:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: David Ahern, acme, linux-kernel, Frederic Weisbecker,
Peter Zijlstra, Jiri Olsa, Stephane Eranian
Hi Ingo,
On Thu, 9 May 2013 11:30:25 +0200, Ingo Molnar wrote:
> * Ingo Molnar <mingo@kernel.org> wrote:
>> Btw., would it make sense to emit a (once-only) warning and optimistically
>> fix page_offset up to 1 (or 4096) and let things continue with the next
>> set of data - can we recover most of the data in that case?
>
> Basically, what'd like to do with binary data is similar to what we do
> with text data if some trace output is corrupted:
>
> sshd-15478 [002] 1100.859353: 15478:120:S ==> [002] 0:140:R <idle>
> <idle>-0 [005] 1100.859378: 0:140:R ==> [005] 20169:120:R cat
> cat-20169 [005] 1100.860718: 20169:120:R + [005] 15521:120:S bash
> ^@¨<81><92>^@^Bª^P^P^D<9c>8@<88>^A^M ;¤0 ^D<90>"ª<81>^B^T)Ò^C$^@^N^@^A
> cat-20169 [005] 1100.860720: 20169:120:R + [005] 305:115:S kblockd/5
> cat-20169 [005] 1100.860722: 20169:120:? ==> [005] 305:115:R kblockd/5
> kblockd/5-305 [005] 1100.860755: 305:115:S ==> [005] 15521:120:R bash
> bash-15521 [005] 1100.860792: 15521:120:S + [002] 15478:120:S sshd
> <idle>-0 [002] 1100.860853: 0:140:R ==> [002] 15478:120:R sshd
> sshd-15478 [002] 1100.860895: 15478:120:S ==> [002] 0:140:R <idle>
> bash-15521 [005] 1100.860925: 15521:120:S + [002] 15478:120:S sshd
> bash-15521 [005] 1100.860999: 15521:120:S ==> [005] 0:140:R <idle>
>
> See that junk in the middle, sign of some sort of file corruption? Instead
> of detecting it and aborting we just try to skip that line and try to find
> the next useful looking line, ignoring the junk and bits around it.
>
> Is there a perf.data equivalent of intelligently trying to skip to the
> next plausible looking event record?
I think we should not truncate file_size for this case. It was
decreased to data_offset + data_size in order not to read unrelated
metadata (additional header feature info). But in this case, since
data_size is 0 it'd have same value as data_offset, and in turn
mmap_size truncated to data_offset too. So fetch_mmaped_event() always
return NULL as head + sizeof(event->header) exceeds mmap_size.
If we keep original file_size, perf can report existing samples but no
metadata. So does the patch below make sense?
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index cf1fe01b7e89..cf4e574c7b7f 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1196,7 +1196,7 @@ int __perf_session__process_events(struct perf_session *session,
file_offset = page_offset;
head = data_offset - page_offset;
- if (data_offset + data_size < file_size)
+ if (data_size && (data_offset + data_size < file_size))
file_size = data_offset + data_size;
progress_next = file_size / 16;
--
Thanks,
Namhyung
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] perf: detect loops processing events
2013-05-10 1:10 ` Namhyung Kim
@ 2013-05-10 1:40 ` David Ahern
0 siblings, 0 replies; 5+ messages in thread
From: David Ahern @ 2013-05-10 1:40 UTC (permalink / raw)
To: Namhyung Kim
Cc: Ingo Molnar, acme, linux-kernel, Frederic Weisbecker,
Peter Zijlstra, Jiri Olsa, Stephane Eranian
On 5/9/13 7:10 PM, Namhyung Kim wrote:
>
> I think we should not truncate file_size for this case. It was
> decreased to data_offset + data_size in order not to read unrelated
> metadata (additional header feature info). But in this case, since
> data_size is 0 it'd have same value as data_offset, and in turn
> mmap_size truncated to data_offset too. So fetch_mmaped_event() always
> return NULL as head + sizeof(event->header) exceeds mmap_size.
>
> If we keep original file_size, perf can report existing samples but no
> metadata. So does the patch below make sense?
>
>
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index cf1fe01b7e89..cf4e574c7b7f 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1196,7 +1196,7 @@ int __perf_session__process_events(struct perf_session *session,
> file_offset = page_offset;
> head = data_offset - page_offset;
>
> - if (data_offset + data_size < file_size)
> + if (data_size && (data_offset + data_size < file_size))
> file_size = data_offset + data_size;
>
> progress_next = file_size / 16;
Nice. That does handle the case of the perf.data file not getting closed
properly. With this, my patch should not return -1 just print the error
message to the user which would explain why the feature data is not printed.
David
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-05-10 1:40 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-08 15:18 [PATCH] perf: detect loops processing events David Ahern
2013-05-09 9:24 ` Ingo Molnar
2013-05-09 9:30 ` Ingo Molnar
2013-05-10 1:10 ` Namhyung Kim
2013-05-10 1:40 ` David Ahern
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox