From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: Clark Williams <williams@redhat.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Jiri Olsa <jolsa@redhat.com>, Jiri Olsa <jolsa@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>, David Ahern <dsahern@gmail.com>,
Kan Liang <kan.liang@linux.intel.com>,
Lukasz Odzioba <lukasz.odzioba@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Wang Nan <wangnan0@huawei.com>,
kernel-team@lge.com, Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 24/27] perf tools: Fix struct comm_str removal crash
Date: Wed, 25 Jul 2018 14:59:58 -0300
Message-ID: <20180725180001.15108-25-acme@kernel.org>
In-Reply-To: <20180725180001.15108-1-acme@kernel.org>
From: Jiri Olsa <jolsa@redhat.com>
We occasionally hit the following assert failure in 'perf top' when
processing the /proc info in multiple threads.
perf: ...include/linux/refcount.h:109: refcount_inc:
Assertion `!(!refcount_inc_not_zero(r))' failed.
The gdb backtrace looks like this:
[Switching to Thread 0x7ffff11ba700 (LWP 13749)]
0x00007ffff50839fb in raise () from /lib64/libc.so.6
(gdb)
#0 0x00007ffff50839fb in raise () from /lib64/libc.so.6
#1 0x00007ffff5085800 in abort () from /lib64/libc.so.6
#2 0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000535373 in refcount_inc (r=0x7fffdc009be0)
at ...include/linux/refcount.h:109
#5 0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0)
at util/comm.c:24
#6 0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2",
root=0xbed5c0 <comm_str_root>) at util/comm.c:72
#7 0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2",
root=0xbed5c0 <comm_str_root>) at util/comm.c:95
#8 0x000000000053582e in comm__new (str=0x7fffd000b260 ":2",
timestamp=0, exec=false) at util/comm.c:111
#9 0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57
#10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38,
threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457
#11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
...
The failing assertion is this one:
REFCOUNT_WARN(!refcount_inc_not_zero(r), ...
The problem is that we keep a global comm_str_root list, which
is accessed by multiple threads during the 'perf top' startup,
and the following 2 paths can race:
thread 1:
  ...
  thread__new
    comm__new
      comm_str__findnew
        down_write(&comm_str_lock);
        __comm_str__findnew
          comm_str__get
thread 2:
  ...
  comm__override or comm__free
    comm_str__put
      refcount_dec_and_test
        down_write(&comm_str_lock);
        rb_erase(&cs->rb_node, &comm_str_root);
Because thread 2 first decrements the refcnt and only then removes the
struct comm_str from the list, thread 1 can find this object on the list
with refcnt equal to 0 and hit the assert.
This patch fixes the thread 1 __comm_str__findnew path by ignoring objects
whose refcnt has already dropped to 0. For the rest of the objects we take the
refcnt before comparing its name and release it afterwards with comm_str__put,
which can also release the object completely.
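The "ignore objects whose refcnt is already 0" idea can be illustrated
outside of perf with a minimal sketch. Everything below is hypothetical:
struct demo_str, demo_inc_not_zero(), demo_get() and demo_find() are
stand-ins written with C11 atomics, not the perf or kernel refcount
implementation, and a linear array scan replaces the rb-tree walk.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct comm_str; not the perf definition. */
struct demo_str {
	const char *str;
	atomic_uint refcnt;
};

/* Take a reference only if the count has not already reached zero. */
static bool demo_inc_not_zero(atomic_uint *r)
{
	unsigned int old = atomic_load(r);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
	}
	return false;
}

/* Return the object with a new reference, or NULL if it is being freed. */
static struct demo_str *demo_get(struct demo_str *s)
{
	if (s && demo_inc_not_zero(&s->refcnt))
		return s;

	return NULL;
}

/*
 * Assumed simplification: a linear scan instead of an rb-tree walk.
 * A matching entry whose refcnt is 0 is skipped, as if it were absent.
 */
static struct demo_str *demo_find(struct demo_str *tab, size_t n,
				  const char *str)
{
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(str, tab[i].str) && demo_get(&tab[i]))
			return &tab[i];
	}
	return NULL;
}
```

With this shape, a lookup that races with the final put sees the entry
as missing rather than tripping an assertion, and the caller can fall
through to whatever it does when no entry is found.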
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20180720101740.GA27176@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/comm.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c
index 7798a2cc8a86..31279a7bd919 100644
--- a/tools/perf/util/comm.c
+++ b/tools/perf/util/comm.c
@@ -20,9 +20,10 @@ static struct rw_semaphore comm_str_lock = {.lock = PTHREAD_RWLOCK_INITIALIZER,}
static struct comm_str *comm_str__get(struct comm_str *cs)
{
- if (cs)
- refcount_inc(&cs->refcnt);
- return cs;
+ if (cs && refcount_inc_not_zero(&cs->refcnt))
+ return cs;
+
+ return NULL;
}
static void comm_str__put(struct comm_str *cs)
@@ -67,9 +68,14 @@ struct comm_str *__comm_str__findnew(const char *str, struct rb_root *root)
parent = *p;
iter = rb_entry(parent, struct comm_str, rb_node);
+ /*
+ * If we race with comm_str__put, iter->refcnt is 0
+ * and it will be removed within comm_str__put call
+ * shortly, ignore it in this search.
+ */
cmp = strcmp(str, iter->str);
- if (!cmp)
- return comm_str__get(iter);
+ if (!cmp && comm_str__get(iter))
+ return iter;
if (cmp < 0)
p = &(*p)->rb_left;
--
2.14.4