All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Ahern <dsahern@gmail.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: acme@ghostprotocols.net, linux-kernel@vger.kernel.org,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Mike Galbraith <efault@gmx.de>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH] perf record: Delete file if a failure occurs writing the perf data file
Date: Tue, 12 Nov 2013 07:51:16 -0700	[thread overview]
Message-ID: <52824064.4060100@gmail.com> (raw)
In-Reply-To: <5280ECFF.10103@gmail.com>

Ingo:

On 11/11/13, 7:43 AM, David Ahern wrote:
> On 11/11/13, 2:37 AM, Ingo Molnar wrote:
>>
>> * David Ahern <dsahern@gmail.com> wrote:
>>
>>> If perf fails to write data to the data file (e.g., ENOSPC error) it
>>> fails
>>> with the message:
>>>    failed to write perf data, error: No space left on device
>>>
>>> and stops — killing the workload too. The file is an unknown state.
>>> Trying to read it (e.g., perf report) fails with a SIGBUS error.
>>
>> Ouch - guys please first investiage that SIGBUS, we should not behave
>> unexpectedly on _any_ (read: random) perf.data file contents. The SIGBUS
>> likely suggests that the parsing isn't robust enough.

If you agree with the below summary then any further objections to 
deleting the file on write failure?

David

>
> I think we know why the SIGBUS is happening. From 'man mmap':
>
>
>  From man mmap:
>         SIGBUS Attempted access to a portion of the buffer that
>         does not correspond  to  the  file (for  example, beyond
>         the end of the file, ...
>
>
> With regards to perf-record, on a write() failure the header is not
> updated. From a recent change we try to proceed even though the data
> size is 0 - parsing the events we can. We finally hit upon an event that
> is only partially in the file (eg., header, but no data for event).
> Trying to read the event data leads to the SIGBUS:
>
> Running perf-report in gdb:
>
> WARNING: The /tmp/mnt/perf.data file's data size field is 0 which is
> unexpected.
> Was the 'perf record' command properly terminated?
>
>
> Program received signal SIGBUS, Bus error.
> perf_evsel__parse_sample (evsel=0x94eec0, event=0x7ffff7ed9d80,
> data=0x7fffffffd260)
>      at util/evsel.c:1242
> 1242        u16 max_size = event->header.size;
> (gdb) bt
> #0  perf_evsel__parse_sample (evsel=0x94eec0, event=0x7ffff7ed9d80,
> data=0x7fffffffd260)
>      at util/evsel.c:1242
> #1  0x000000000047c9ce in flush_sample_queue (s=0x94e2b0,
> tool=0x7fffffffde80)
>      at util/session.c:542
> #2  0x000000000047e2d4 in __perf_session__process_events (session=0x94e2b0,
>      data_offset=<optimized out>, data_size=<optimized out>,
> file_size=1048576, tool=0x7fffffffde80)
>      at util/session.c:1388
> #3  0x000000000042993c in __cmd_report (rep=0x7fffffffde80) at
> builtin-report.c:509
> #4  cmd_report (argc=0, argv=0x7fffffffe370, prefix=<optimized out>) at
> builtin-report.c:967
> #5  0x000000000041b063 in run_builtin (p=0x7cdf28, argc=4,
> argv=0x7fffffffe370) at perf.c:319
> #6  0x000000000041a8e3 in handle_internal_command (argv=0x7fffffffe370,
> argc=4) at perf.c:376
> #7  run_argv (argv=0x7fffffffe180, argcp=0x7fffffffe18c) at perf.c:420
> #8  main (argc=4, argv=0x7fffffffe370) at perf.c:521
>
>>
>>> Fix by deleting the file on a failure.
>>
>> That only works around the issue - if the same data file is produced by
>> some other method (or maliciously) then perf report will still SIGBUS ...
>
> We could handle SIGBUS in the analysis commands too. See the suggestion
> I had for handling the output failure using the mmap output option which
> uses lngjmp.
>
> David


  reply	other threads:[~2013-11-12 14:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-08 16:41 [PATCH] perf record: Delete file if a failure occurs writing the perf data file David Ahern
2013-11-08 17:58 ` Jiri Olsa
2013-11-11  9:37 ` Ingo Molnar
2013-11-11 14:43   ` David Ahern
2013-11-12 14:51     ` David Ahern [this message]
2013-11-12 15:04       ` Peter Zijlstra
2013-11-12 15:25         ` David Ahern
2013-11-12 15:34           ` Peter Zijlstra
2013-11-20  4:39             ` David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52824064.4060100@gmail.com \
    --to=dsahern@gmail.com \
    --cc=acme@ghostprotocols.net \
    --cc=efault@gmx.de \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.