From: Masami Hiramatsu <mhiramat@kernel.org>
To: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: oleg@redhat.com, srikar@linux.vnet.ibm.com, rostedt@goodmis.org,
peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
alexander.shishkin@linux.intel.com, jolsa@redhat.com,
namhyung@kernel.org, linux-kernel@vger.kernel.org,
corbet@lwn.net, linux-doc@vger.kernel.org,
ananth@linux.vnet.ibm.com, alexis.berlemont@gmail.com,
naveen.n.rao@linux.vnet.ibm.com
Subject: Re: [PATCH 0/7] Uprobes: Support SDT markers having reference count (semaphore)
Date: Fri, 8 Jun 2018 10:10:23 +0900 [thread overview]
Message-ID: <20180608101023.cbf20db485026213d490cb7c@kernel.org> (raw)
In-Reply-To: <20180606083344.31320-1-ravi.bangoria@linux.ibm.com>
On Wed, 6 Jun 2018 14:03:37 +0530
Ravi Bangoria <ravi.bangoria@linux.ibm.com> wrote:
> Why RFC again:
>
> This series is different from earlier versions[1]. Earlier series
> implemented this feature in trace_uprobe while this has implemented
> the logic in core uprobe. Few reasons for this:
> 1. One of the major reason was the deadlock between uprobe_lock and
> mm->mmap inside trace_uprobe_mmap(). That deadlock was not easy to fix
> because mm->mmap is not in control of trace_uprobe_mmap() and it has
> to take uprobe_lock to loop over trace_uprobe list. More details can
> be found at[2]. With this new approach, there are no deadlocks found
> so far.
> 2. Many of the core uprobe function and data-structures needs to be
> exported to make earlier implementation simple. With this new approach,
> reference counter logic is been implemented in core uprobe and thus
> no need to export anything.
I agree with you. Moreover, since uprobe_register/unregister() are
exported to modules, this enablement would better be implemented
inside uprobe so that all uprobe users benefit from this.
>
> Description:
>
> Userspace Statically Defined Tracepoints[3] are dtrace style markers
> inside userspace applications. Applications like PostgreSQL, MySQL,
> Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib etc
> have these markers embedded in them. These markers are added by developer
> at important places in the code. Each marker source expands to a single
> nop instruction in the compiled code but there may be additional
> overhead for computing the marker arguments which expands to couple of
> instructions. In case the overhead is more, execution of it can be
> omitted by runtime if() condition when no one is tracing on the marker:
>
> if (reference_counter > 0) {
> Execute marker instructions;
> }
>
> Default value of reference counter is 0. Tracer has to increment the
> reference counter before tracing on a marker and decrement it when
> done with the tracing.
>
> Currently, perf tool has limited supports for SDT markers. I.e. it
> can not trace markers surrounded by reference counter. Also, it's
> not easy to add reference counter logic in userspace tool like perf,
> so basic idea for this patchset is to add reference counter logic in
> the trace_uprobe infrastructure. Ex,[4]
>
> # cat tick.c
> ...
> for (i = 0; i < 100; i++) {
> DTRACE_PROBE1(tick, loop1, i);
> if (TICK_LOOP2_ENABLED()) {
> DTRACE_PROBE1(tick, loop2, i);
> }
> printf("hi: %d\n", i);
> sleep(1);
> }
> ...
>
> Here tick:loop1 is marker without reference counter where as tick:loop2
> is surrounded by reference counter condition.
>
> # perf buildid-cache --add /tmp/tick
> # perf probe sdt_tick:loop1
> # perf probe sdt_tick:loop2
>
> # perf stat -e sdt_tick:loop1,sdt_tick:loop2 -- /tmp/tick
> hi: 0
> hi: 1
> hi: 2
> ^C
> Performance counter stats for '/tmp/tick':
> 3 sdt_tick:loop1
> 0 sdt_tick:loop2
> 2.747086086 seconds time elapsed
>
> Perf failed to record data for tick:loop2. Same experiment with this
> patch series:
>
> # ./perf buildid-cache --add /tmp/tick
> # ./perf probe sdt_tick:loop2
> # ./perf stat -e sdt_tick:loop2 /tmp/tick
> hi: 0
> hi: 1
> hi: 2
> ^C
> Performance counter stats for '/tmp/tick':
> 3 sdt_tick:loop2
> 2.561851452 seconds time elapsed
>
>
> Note:
> - 'reference counter' is called as 'semaphore' in original Dtrace
> (or Systemtap, bcc and even in ELF) documentation and code. But the
> term 'semaphore' is misleading in this context. This is just a counter
> used to hold number of tracers tracing on a marker. This is not haeally
> used for any synchronization. So we are referring it as 'reference
> counter' in kernel / perf code.
+1. I like this "reference counter" term :)
> - This patches still has one issue. If there are multiple instances of
> same application running and user wants to trace any particular
> instance, trace_uprobe is updating reference counter in all instances.
> This is not a problem on user side because instruction is not replaced
> with trap/int3 and thus user will only see samples from his interested
> process. But still this is more of a correctness issue. I'm working on
> a fix for this.
Hmm, it sounds like not a correctness issue, but there maybe a performace
tradeoff. Tracing one particulear instance, other instances also will get
a performance loss (Only if the parameter preparation block is heavy,
because the heaviest part of probing - trap/int3 and recording data - isn't
executed.)
BTW, why this happens? I thought the refcounter part is just a data which
is not shared among processes...
Thank you,
>
> [1] https://lkml.org/lkml/2018/4/17/23
> [2] https://lkml.org/lkml/2018/5/25/111
> [3] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
> [4] https://github.com/iovisor/bcc/issues/327#issuecomment-200576506
>
> Ravi Bangoria (7):
> Uprobes: Simplify uprobe_register() body
> Uprobes: Support SDT markers having reference count (semaphore)
> Uprobes/sdt: Fix multiple update of same reference counter
> trace_uprobe/sdt: Prevent multiple reference counter for same uprobe
> Uprobes/sdt: Prevent multiple reference counter for same uprobe
> Uprobes/sdt: Document about reference counter
> perf probe: Support SDT markers having reference counter (semaphore)
>
> Documentation/trace/uprobetracer.rst | 16 +-
> include/linux/uprobes.h | 5 +
> kernel/events/uprobes.c | 502 +++++++++++++++++++++++++++++++----
> kernel/trace/trace.c | 2 +-
> kernel/trace/trace_uprobe.c | 74 +++++-
> tools/perf/util/probe-event.c | 39 ++-
> tools/perf/util/probe-event.h | 1 +
> tools/perf/util/probe-file.c | 34 ++-
> tools/perf/util/probe-file.h | 1 +
> tools/perf/util/symbol-elf.c | 46 +++-
> tools/perf/util/symbol.h | 7 +
> 11 files changed, 643 insertions(+), 84 deletions(-)
>
> --
> 1.8.3.1
>
--
Masami Hiramatsu <mhiramat@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2018-06-08 1:10 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-06 8:33 [PATCH 0/7] Uprobes: Support SDT markers having reference count (semaphore) Ravi Bangoria
2018-06-06 8:33 ` [PATCH 1/7] Uprobes: Simplify uprobe_register() body Ravi Bangoria
2018-06-06 8:33 ` [PATCH 2/7] Uprobes: Support SDT markers having reference count (semaphore) Ravi Bangoria
2018-06-06 8:33 ` [PATCH 3/7] Uprobes/sdt: Fix multiple update of same reference counter Ravi Bangoria
2018-06-06 8:33 ` [PATCH 4/7] trace_uprobe/sdt: Prevent multiple reference counter for same uprobe Ravi Bangoria
2018-06-06 8:33 ` [PATCH 5/7] Uprobes/sdt: " Ravi Bangoria
2018-06-06 8:33 ` [PATCH 6/7] Uprobes/sdt: Document about reference counter Ravi Bangoria
2018-06-06 8:33 ` [PATCH 7/7] perf probe: Support SDT markers having reference counter (semaphore) Ravi Bangoria
2018-06-06 8:35 ` [PATCH 0/7] Uprobes: Support SDT markers having reference count (semaphore) Ravi Bangoria
2018-06-08 1:10 ` Masami Hiramatsu [this message]
2018-06-08 2:29 ` Ravi Bangoria
2018-06-08 5:14 ` Masami Hiramatsu
2018-06-08 6:34 ` Ravi Bangoria
2018-06-08 15:45 ` Masami Hiramatsu
2018-06-11 4:31 ` Ravi Bangoria
2018-06-16 13:50 ` Masami Hiramatsu
2018-06-16 15:07 ` Ravi Bangoria
2018-06-08 16:36 ` Oleg Nesterov
2018-06-11 4:13 ` Ravi Bangoria
2018-06-20 16:37 ` Steven Rostedt
2018-06-21 2:35 ` Ravi Bangoria
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180608101023.cbf20db485026213d490cb7c@kernel.org \
--to=mhiramat@kernel.org \
--cc=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=alexis.berlemont@gmail.com \
--cc=ananth@linux.vnet.ibm.com \
--cc=corbet@lwn.net \
--cc=jolsa@redhat.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=naveen.n.rao@linux.vnet.ibm.com \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=ravi.bangoria@linux.ibm.com \
--cc=rostedt@goodmis.org \
--cc=srikar@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).