From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
linux-kernel@vger.kernel.org, zhlbj@cn.ibm.com
Subject: Re: perf_evlist__filter_pollfd() in trace__run()
Date: Fri, 30 Jan 2015 12:22:16 -0300
Message-ID: <20150130152216.GE3101@kernel.org>
In-Reply-To: <20150129235522.GA14385@us.ibm.com>

On Thu, Jan 29, 2015 at 03:55:22PM -0800, Sukadev Bhattiprolu wrote:
> Arnaldo,
>
> On one of our systems we are seeing an intermittent SIGSEGV with
>
> perf trace sleep 1
>
> and I have a question about the 'draining' flag below:
> |
> | From 46fb3c21d20415dd2693570c33d0ea6eb8745e04 Mon Sep 17 00:00:00 2001
> | From: Arnaldo Carvalho de Melo <acme@redhat.com>
> | Date: Mon, 22 Sep 2014 14:39:48 -0300
> | Subject: [PATCH 1/1] perf trace: Filter out POLLHUP'ed file descriptors
> |
> | So that we don't continue polling on vanished file descriptors, i.e.
> | file descriptors for events monitoring threads that exited.
> |
> | I.e. the following 'trace' command now exits as expected, instead
> | of staying in an eternal loop:
> |
> | $ sleep 5s &
> | $ trace -p `pidof sleep`
> |
> | Reported-by: Jiri Olsa <jolsa@redhat.com>
> | Cc: Adrian Hunter <adrian.hunter@intel.com>
> | Cc: David Ahern <dsahern@gmail.com>
> | Cc: Don Zickus <dzickus@redhat.com>
> | Cc: Frederic Weisbecker <fweisbec@gmail.com>
> | Cc: Jiri Olsa <jolsa@redhat.com>
> | Cc: Mike Galbraith <efault@gmx.de>
> | Cc: Namhyung Kim <namhyung@kernel.org>
> | Cc: Paul Mackerras <paulus@samba.org>
> | Cc: Peter Zijlstra <peterz@infradead.org>
> | Cc: Stephane Eranian <eranian@google.com>
> | Link: http://lkml.kernel.org/n/tip-6qegv786zbf6i8us6t4rxug9@git.kernel.org
> | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> | ---
> | tools/perf/builtin-trace.c | 7 ++++++-
> | 1 file changed, 6 insertions(+), 1 deletion(-)
> |
> | diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> | index b8fedf3..fe39dc6 100644
> | --- a/tools/perf/builtin-trace.c
> | +++ b/tools/perf/builtin-trace.c
> | @@ -2044,6 +2044,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> | int err = -1, i;
> | unsigned long before;
> | const bool forks = argc > 0;
> | + bool draining = false;
> | char sbuf[STRERR_BUFSIZE];
> |
> | trace->live = true;
> | @@ -2171,8 +2172,12 @@ next_event:
> | if (trace->nr_events == before) {
> | int timeout = done ? 100 : -1;
> |
> | - if (perf_evlist__poll(evlist, timeout) > 0)
> | + if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
> | + if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
> | + draining = true;
> | +
>
> If an fd gets into POLLHUP state, perf_evlist__filter_pollfd() removes
> ("puts") the mmap for the fd. We are seeing that sometimes (frequently)
> _all_ fds are in the POLLHUP state and hence their mmap->base are set
> to NULL.
>
>
> | goto again;
>
> Now with this goto, we go back and call perf_evlist__mmap_read() which
> tries to access the freed mmaps.
>
> Should there be another check before reading the mmap again?

Possibly; I'm checking. But a similar algorithm should be in place for
'record'. Do you see any problems there, i.e. with 'perf record sleep
1'?

- Arnaldo

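To make the failure mode described above concrete, here is a minimal
standalone sketch, not perf code. It assumes a hypothetical
struct demo_mmap standing in for perf's per-fd mmap and hypothetical
helpers demo_filter_pollfd() and demo_mmap_read(); only poll(2)'s
struct pollfd, POLLERR and POLLHUP are real. It shows a filter that
releases the buffer of every hung-up fd and reports how many fds are
left, plus a reader that checks for a released buffer instead of
dereferencing it.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-fd ring buffer; not perf's struct. */
struct demo_mmap {
	const int *base;	/* NULL once the buffer has been released */
};

/*
 * Retire every entry whose revents match 'mask': mark the fd negative so
 * that poll(2) ignores it, and release ("put") its buffer.
 * Returns the number of fds that are still being watched.
 */
static int demo_filter_pollfd(struct pollfd *fds, struct demo_mmap *maps,
			      int nr, short mask)
{
	int i, remaining = 0;

	assert(fds != NULL && maps != NULL);

	for (i = 0; i < nr; i++) {
		if (fds[i].fd < 0)
			continue;		/* already retired */
		if (fds[i].revents & mask) {
			fds[i].fd = -1;		/* poll() skips negative fds */
			maps[i].base = NULL;	/* buffer is gone from now on */
			continue;
		}
		remaining++;
	}
	return remaining;
}

/*
 * Guarded read: returns 0 and stores the value in *out, or -1 if the
 * buffer was already released. Without the NULL check, a read pass that
 * runs after the filter would dereference a NULL base.
 */
static int demo_mmap_read(const struct demo_mmap *map, int *out)
{
	if (map->base == NULL)
		return -1;
	*out = *map->base;
	return 0;
}
```

A caller would run demo_filter_pollfd() after each poll() and treat a
return value of 0 as "nothing left to wait for"; any read pass that
still follows has to go through a check like the one in
demo_mmap_read().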
> I must add that I don't get the SIGSEGV on recent perf-core, and the
> system where we get the crash first runs into the following
> errors that we are still looking into (maybe related to the "ppc64le"
> architecture).
>
> Problems reading syscall 45 information
> Problems reading syscall 5 information
> Problems reading syscall 5 information
> Problems reading syscall 108 information
> Problems reading syscall 108 information
> Problems reading syscall 90 information
> Problems reading syscall 90 information
> Problems reading syscall 6 information
>
> Unlike the SIGSEGV, these errors always occur.
>
> | + }
> | } else {
> | goto again;
> | }
> | --
> | 1.8.3.1
> |
>
> The following hack seems to fix the SIGSEGV, but then we completely
> ignore the 'draining' flag.
>
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index fb12645..ac25e16 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -2173,8 +2173,10 @@ next_event:
>  			int timeout = done ? 100 : -1;
>  
>  			if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
> -				if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
> +				if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0) {
>  					draining = true;
> +					goto out_disable;
> +				}
>  
>  				goto again;
>  			}
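
The difference between the two variants of the loop can be modelled
without perf at all. The sketch below is only a toy model of the
control flow quoted above, with hypothetical names (demo_loop(),
polls_until_hup, exit_immediately); it assumes no new events arrive, so
each pass ends up in the poll branch. It counts how many read passes
happen after the filter has reported that no fds remain, which is the
window in which the released mmaps would be touched.

```c
#include <stdbool.h>

/*
 * Toy model of the trace__run() loop shape discussed in this thread.
 * 'polls_until_hup' is the poll round on which the filter reports that
 * no fds remain. 'exit_immediately' selects the hack (leave the loop
 * right away) instead of the original behaviour (set 'draining' and go
 * back for one more read pass).
 * Returns the number of read passes made after the filter returned 0.
 */
static int demo_loop(int polls_until_hup, bool exit_immediately)
{
	bool draining = false;
	int reads_after_filter = 0;
	int polls = 0;

	for (;;) {
		/* "again:" one read pass over all buffers */
		if (draining)
			reads_after_filter++;

		/* no new events were seen; only poll while not draining */
		if (draining)
			break;			/* falls through to the exit path */

		polls++;
		if (polls >= polls_until_hup) {	/* filter returned 0 */
			draining = true;
			if (exit_immediately)
				break;		/* the "goto out_disable" hack */
		}
		/* otherwise: "goto again" */
	}
	return reads_after_filter;
}
```

In this model the original flow makes exactly one read pass after the
filter has emptied the set, while the hack makes none, which matches
the observation that the hack avoids the crash but makes the
'draining' flag pointless.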