From: Ravi Bangoria
To: srikar@linux.vnet.ibm.com, oleg@redhat.com, rostedt@goodmis.org,
    mhiramat@kernel.org
Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    alexander.shishkin@linux.intel.com, jolsa@redhat.com,
    namhyung@kernel.org, linux-kernel@vger.kernel.org, corbet@lwn.net,
    linux-doc@vger.kernel.org, ananth@linux.vnet.ibm.com,
    alexis.berlemont@gmail.com, naveen.n.rao@linux.vnet.ibm.com,
    linux-arm-kernel@lists.infradead.org, linux-mips@linux-mips.org,
    linux@armlinux.org.uk, ralf@linux-mips.org, paul.burton@mips.com,
    Ravi Bangoria
Subject: [PATCH v5 00/10] Uprobes: Support SDT markers having reference count (semaphore)
Date: Thu, 28 Jun 2018 10:51:59 +0530
X-Mailer: git-send-email 2.14.4
Message-Id: <20180628052209.13056-1-ravi.bangoria@linux.ibm.com>

Userspace Statically Defined Tracepoints[1] are dtrace-style markers
inside userspace applications. Applications like PostgreSQL, MySQL,
Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib etc.
have these markers embedded in them. Developers add these markers at
important places in the code. Each marker expands to a single nop
instruction in the compiled code, but there may be additional overhead
for computing the marker arguments, which expands to a couple of
instructions. If that overhead is significant, its execution can be
skipped with a runtime if() condition when no one is tracing the marker:

    if (reference_counter > 0) {
        Execute marker instructions;
    }

The default value of the reference counter is 0. A tracer has to
increment the reference counter before tracing a marker and decrement
it when done with the tracing.

Currently, the perf tool has limited support for SDT markers, i.e. it
cannot trace markers surrounded by a reference counter. Also, it's not
easy to add reference counter logic to a userspace tool like perf, so
the basic idea of this patchset is to add the reference counter logic
to the uprobe infrastructure. Ex,[2]

  # cat tick.c
    ...
    for (i = 0; i < 100; i++) {
        DTRACE_PROBE1(tick, loop1, i);
        if (TICK_LOOP2_ENABLED()) {
            DTRACE_PROBE1(tick, loop2, i);
        }
        printf("hi: %d\n", i);
        sleep(1);
    }
    ...

Here tick:loop1 is a marker without a reference counter, whereas
tick:loop2 is surrounded by a reference counter condition.
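
To make the gating concrete, here is a minimal userspace sketch of a
marker guarded by a reference counter. It is only an illustration: the
names loop2_refcnt, fire_loop2_marker(), tracer_attach(),
tracer_detach() and run_loop() are hypothetical, and the counter is a
plain variable that the program itself changes, whereas for a real SDT
marker the tracer updates the counter from outside the program.

```c
#include <assert.h>

/* Hypothetical stand-in for the marker's reference counter ("semaphore").
 * Default value 0 means nobody is tracing the marker. */
unsigned short loop2_refcnt;

/* Counts how many times the marker body actually ran. */
int marker_hits;

/* Hypothetical helper standing in for argument setup plus the marker. */
void fire_loop2_marker(int i)
{
    (void)i;            /* a real marker would pass i to the tracer */
    marker_hits++;
}

/* What a tracer does before it starts and after it stops tracing. */
void tracer_attach(void)
{
    loop2_refcnt++;
}

void tracer_detach(void)
{
    assert(loop2_refcnt > 0);   /* must pair with an earlier attach */
    loop2_refcnt--;
}

/* Application loop: the marker body is skipped while the counter is 0.
 * Returns how many times the marker fired during this call. */
int run_loop(int iterations)
{
    int before = marker_hits;

    for (int i = 0; i < iterations; i++) {
        if (loop2_refcnt > 0)
            fire_loop2_marker(i);
    }
    return marker_hits - before;
}
```

With no tracer attached, run_loop(3) fires the marker zero times; after
tracer_attach() it fires three times, and after tracer_detach() it goes
back to zero. That is the behaviour tick:loop2 depends on.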

  # perf buildid-cache --add /tmp/tick
  # perf probe sdt_tick:loop1
  # perf probe sdt_tick:loop2
  # perf stat -e sdt_tick:loop1,sdt_tick:loop2 -- /tmp/tick
  hi: 0
  hi: 1
  hi: 2
  ^C
   Performance counter stats for '/tmp/tick':
               3      sdt_tick:loop1
               0      sdt_tick:loop2
     2.747086086 seconds time elapsed

Perf failed to record data for tick:loop2. Same experiment with this
patch series:

  # ./perf buildid-cache --add /tmp/tick
  # ./perf probe sdt_tick:loop2
  # ./perf stat -e sdt_tick:loop2 /tmp/tick
  hi: 0
  hi: 1
  hi: 2
  ^C
   Performance counter stats for '/tmp/tick':
               3      sdt_tick:loop2
     2.561851452 seconds time elapsed

v5 changes:
 - Fix build failure. 'struct uprobe' was local to the uprobe.c file and
   I was using it in some arch-specific code, which caused a build
   failure. Added a new patch [PATCH v5 1/10] to fix it.

v4 changes:
 - The previous version moved the implementation from trace_uprobe to
   core uprobe, but it had some issues. I've fixed them in this version.
   To cut a long story short, the reference counter increment/decrement
   is tied to instruction patching. Whenever an instruction gets patched,
   we update the reference counter. Now, what if the vma holding the
   reference counter is not present while patching an instruction? To
   overcome this issue, we add such a uprobe to the delayed_uprobe list.
   Whenever the process maps the region holding the reference counter,
   we increment it and remove the uprobe from the delayed_uprobe list.
 - Until the last version, we were incrementing the reference counter
   for each uprobe_consumer. That isn't true now. With this
   implementation, we increment and decrement the counter only once.
   This is fine because we increment the counter when we find the first
   valid consumer and we decrement it when the last consumer is going
   away. (For a tiny binary, multiple vmas map to the same file portion.
   In such a case, we patch the instruction multiple times and thus
   increment / decrement the reference counter multiple times as well.)
 - Fortunately, there is no need to maintain sdt_mm_list now, because we
   are sure that the increment and decrement will always happen in sync.
   This makes the implementation a lot simpler compared to earlier
   versions.

v4 can be found at: https://lkml.org/lkml/2018/6/19/1324

Note:
 - 'reference counter' is called 'semaphore' in the original Dtrace (or
   Systemtap, bcc and even ELF) documentation and code. But the term
   'semaphore' is misleading in this context. This is just a counter
   used to hold the number of tracers tracing a marker. It is not really
   used for any synchronization. So we refer to it as 'reference
   counter' in kernel / perf code.

[1] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
[2] https://github.com/iovisor/bcc/issues/327#issuecomment-200576506

Ravi Bangoria (10):
  Uprobes: Move uprobe structure to uprobe.h
  Uprobes: Simplify uprobe_register() body
  Uprobe: Change set_swbp definition
  Uprobe: Change set_orig_insn definition
  Uprobe: Change uprobe_write_opcode definition
  Uprobes: Support SDT markers having reference count (semaphore)
  trace_uprobe/sdt: Prevent multiple reference counter for same uprobe
  Uprobes/sdt: Prevent multiple reference counter for same uprobe
  Uprobes/sdt: Document about reference counter
  perf probe: Support SDT markers having reference counter (semaphore)

 Documentation/trace/uprobetracer.rst |  16 +-
 arch/arm/probes/uprobes/core.c       |   6 +-
 arch/mips/kernel/uprobes.c           |   6 +-
 include/linux/uprobes.h              |  37 +++-
 kernel/events/uprobes.c              | 390 ++++++++++++++++++++++++++++-------
 kernel/trace/trace.c                 |   2 +-
 kernel/trace/trace_uprobe.c          |  74 ++++++-
 tools/perf/util/probe-event.c        |  39 +++-
 tools/perf/util/probe-event.h        |   1 +
 tools/perf/util/probe-file.c         |  34 ++-
 tools/perf/util/probe-file.h         |   1 +
 tools/perf/util/symbol-elf.c         |  46 +++--
 tools/perf/util/symbol.h             |   7 +
 13 files changed, 542 insertions(+), 117 deletions(-)

-- 
2.14.4
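
As a postscript to the v4 change notes, the counting rule (increment on
the first consumer, decrement on the last, defer the increment while
the counter's region is not mapped) can be sketched as a toy model.
This is not the code from the series: every name below is hypothetical,
the delayed_uprobe list is reduced to a single flag, and dropping a
still-pending increment when the last consumer leaves is an assumption
of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
struct toy_probe {
    int consumers;        /* number of active consumers (tracers) */
    bool delayed;         /* increment pending until the counter is mapped */
};

struct toy_process {
    bool counter_mapped;  /* is the region holding the counter mapped? */
    int ref_counter;      /* the marker's reference counter */
};

/* First consumer: bump the counter once. If the counter is not mapped
 * yet, remember the pending increment instead. */
void toy_add_consumer(struct toy_process *p, struct toy_probe *u)
{
    if (u->consumers++ > 0)
        return;                 /* not the first consumer: nothing to do */
    if (p->counter_mapped)
        p->ref_counter++;
    else
        u->delayed = true;
}

/* The process maps the region holding the counter: apply pending work. */
void toy_map_counter(struct toy_process *p, struct toy_probe *u)
{
    p->counter_mapped = true;
    if (u->delayed) {
        p->ref_counter++;
        u->delayed = false;
    }
}

/* Last consumer going away: undo the single increment. */
void toy_remove_consumer(struct toy_process *p, struct toy_probe *u)
{
    assert(u->consumers > 0);
    if (--u->consumers > 0)
        return;                 /* other consumers remain */
    if (u->delayed)
        u->delayed = false;     /* assumed: the increment never happened */
    else
        p->ref_counter--;
}
```

In this model, two consumers attaching before the region is mapped
leave the counter at 0 with one pending increment; mapping the region
raises it to 1, and it returns to 0 only when the second consumer
detaches.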