From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932540AbbFQVYI (ORCPT ); Wed, 17 Jun 2015 17:24:08 -0400 Received: from casper.infradead.org ([85.118.1.10]:35738 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932422AbbFQVXx (ORCPT ); Wed, 17 Jun 2015 17:23:53 -0400 From: Arnaldo Carvalho de Melo To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, Sukadev Bhattiprolu , Jiri Olsa , Li Zhang , Arnaldo Carvalho de Melo Subject: [PATCH 6/8] perf trace: Fix race condition at the end of started workloads Date: Wed, 17 Jun 2015 18:22:33 -0300 Message-Id: <1434576155-30038-7-git-send-email-acme@kernel.org> X-Mailer: git-send-email 2.1.0 In-Reply-To: <1434576155-30038-1-git-send-email-acme@kernel.org> References: <1434576155-30038-1-git-send-email-acme@kernel.org> X-SRS-Rewrite: SMTP reverse-path rewritten from by casper.infradead.org See http://www.infradead.org/rpr.html Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Sukadev Bhattiprolu I get following crash on multiple systems and across several releases (at least since v3.18). Core was generated by `/tmp/perf trace sleep 0.2 '. Program terminated with signal SIGSEGV, Segmentation fault. #0 perf_mmap__read_head (mm=0x3fff9bf30070) at util/evlist.h:195 195 u64 head = ACCESS_ONCE(pc->data_head); (gdb) bt #0 perf_mmap__read_head (mm=0x3fff9bf30070) at util/evlist.h:195 #1 perf_evlist__mmap_read (evlist=0x10027f11910, idx=) at util/evlist.c:637 #2 0x000000001003ce4c in trace__run (argv=, argc=, trace=0x3fffd7b28288) at builtin-trace.c:2259 #3 cmd_trace (argc=, argv=, prefix=) at builtin-trace.c:2799 #4 0x00000000100657b8 in run_builtin (p=0x10176798 , argc=3, argv=0x3fffd7b2b550) at perf.c:370 #5 0x00000000100063e8 in handle_internal_command (argv=0x3fffd7b2b550, argc=3) at perf.c:429 #6 run_argv (argv=0x3fffd7b2af70, argcp=0x3fffd7b2af7c) at perf.c:473 #7 main (argc=3, argv=0x3fffd7b2b550) at perf.c:588 The problem seems to be a race condition, when the application has just exited. Some/all fds associated with the perf-events (tracepoints) go into a POLLHUP/ POLLERR state and the mmap region associated with those events are unmapped (in perf_evlist__filter_pollfd()). But we go back and do a perf_evlist__mmap_read() which assumes that the mmaps are still valid and we hit the crash. If the mapping for an event is released, its refcnt is 0 (and ->base is NULL), so ensure we have non-zero refcount before accessing the map. Note that perf-record has a similar logic but unlike perf-trace, the record__mmap_read_all() checks the evlist->mmap[i].base before accessing the map. Signed-off-by: Sukadev Bhattiprolu Cc: Jiri Olsa Cc: Li Zhang Link: http://lkml.kernel.org/r/20150612060003.GA19913@us.ibm.com [ Fixed it up to use atomic_read() ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/evlist.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index dc1dc2c181ef..6b58a47a79ec 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -634,11 +634,18 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist, union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx) { struct perf_mmap *md = &evlist->mmap[idx]; - u64 head = perf_mmap__read_head(md); + u64 head; u64 old = md->prev; unsigned char *data = md->base + page_size; union perf_event *event = NULL; + /* + * Check if event was unmapped due to a POLLHUP/POLLERR. + */ + if (!atomic_read(&md->refcnt)) + return NULL; + + head = perf_mmap__read_head(md); if (evlist->overwrite) { /* * If we're further behind than half the buffer, there's a chance -- 2.1.0