linux-mm.kvack.org archive mirror
From: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: mhiramat@kernel.org, peterz@infradead.org,
	srikar@linux.vnet.ibm.com, acme@kernel.org,
	ananth@linux.vnet.ibm.com, akpm@linux-foundation.org,
	alexander.shishkin@linux.intel.com, alexis.berlemont@gmail.com,
	corbet@lwn.net, dan.j.williams@intel.com,
	gregkh@linuxfoundation.org, huawei.libin@huawei.com,
	hughd@google.com, jack@suse.cz, jglisse@redhat.com,
	jolsa@redhat.com, kan.liang@intel.com,
	kirill.shutemov@linux.intel.com, kjlx@templeofstupid.com,
	kstewart@linuxfoundation.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mhocko@suse.com, milian.wolff@kdab.com, mingo@redhat.com,
	namhyung@kernel.org, naveen.n.rao@linux.vnet.ibm.com,
	pc@us.ibm.com, pombredanne@nexb.com, rostedt@goodmis.org,
	tglx@linutronix.de, tmricht@linux.vnet.ibm.com,
	willy@infradead.org, yao.jin@linux.intel.com,
	fengguang.wu@intel.com,
	Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Subject: Re: [PATCH 6/8] trace_uprobe/sdt: Fix multiple update of same reference counter
Date: Fri, 16 Mar 2018 19:19:34 +0530
Message-ID: <20efff94-de74-dcbe-68e4-a72476fab209@linux.vnet.ibm.com>
In-Reply-To: <c93216a4-a4e1-dd8f-00be-17254e308cd1@linux.vnet.ibm.com>



On 03/16/2018 05:42 PM, Ravi Bangoria wrote:
>
> On 03/15/2018 08:19 PM, Oleg Nesterov wrote:
>> On 03/13, Ravi Bangoria wrote:
>>> For tiny binaries/libraries, different mmap regions point to the
>>> same file portion. In such cases, we may increment the reference counter
>>> multiple times.
>> Yes,
>>
>>> But during de-registration, the reference counter will get
>>> decremented only once
>> could you explain why this happens? sdt_increment_ref_ctr() and
>> sdt_decrement_ref_ctr() look symmetrical, _decrement_ should see
>> the same mappings?
> Sorry, I thought this happens only for tiny binaries, but that is not the case.
> It happens for binaries and libraries of any size.
>
> Also, the problem is not with sdt_increment_ref_ctr() / sdt_decrement_ref_ctr().
> The problem is in trace_uprobe_mmap_callback().
>
> To illustrate in detail, I'm adding a pr_info() in trace_uprobe_mmap_callback():
>
>                vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> +              pr_info("0x%lx-0x%lx : 0x%lx\n", vma->vm_start, vma->vm_end, vaddr);
>                sdt_update_ref_ctr(vma->vm_mm, vaddr, 1);
>
>
> Ok now, libpython has SDT markers with reference counter:
>
>     # readelf -n /usr/lib64/libpython2.7.so.1.0 | grep -A2 Provider
>         Provider: python
>         Name: function__entry
>         ... Semaphore: 0x00000000002899d8
>
> Probing on that marker:
>
>     # cd /sys/kernel/debug/tracing/
>     # echo "p:sdt_python/function__entry /usr/lib64/libpython2.7.so.1.0:0x16a4d4(0x2799d8)" > uprobe_events
>     # echo 1 > events/sdt_python/function__entry/enable
>
> When I run python:
>
>     # strace -o out python
>       mmap(NULL, 2738968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff92460000
>       mmap(0x7fff926a0000, 327680, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x230000) = 0x7fff926a0000
>       mprotect(0x7fff926a0000, 65536, PROT_READ) = 0
>
> The first mmap() maps the whole library into one region. The second mmap()
> and the mprotect() split that region into smaller vmas and set the
> appropriate protection flags.
>
> In this case, trace_uprobe_mmap_callback() updates the reference counter
> twice -- once for the second mmap() call and once for the mprotect() call --
> because both regions contain the reference counter offset. I can verify this
> in dmesg:
>
>     # dmesg | tail
>       trace_kprobe: 0x7fff926a0000-0x7fff926f0000 : 0x7fff926e99d8
>       trace_kprobe: 0x7fff926b0000-0x7fff926f0000 : 0x7fff926e99d8
>
> Final vmas of libpython:
>
>     # cat /proc/`pgrep python`/maps | grep libpython
>       7fff92460000-7fff926a0000 r-xp 00000000 08:05 403934  /usr/lib64/libpython2.7.so.1.0
>       7fff926a0000-7fff926b0000 r--p 00230000 08:05 403934  /usr/lib64/libpython2.7.so.1.0
>       7fff926b0000-7fff926f0000 rw-p 00240000 08:05 403934  /usr/lib64/libpython2.7.so.1.0
>
>
> I see a similar problem with a normal binary as well. I'm using Brendan Gregg's
> example[1]:
>
>     # readelf -n /tmp/tick | grep -A2 Provider
>         Provider: tick
>         Name: loop2
>         ... Semaphore: 0x000000001005003c
>
> Probing that marker:
>
>     # echo "p:sdt_tick/loop2 /tmp/tick:0x6e4(0x10036)" > uprobe_events
>     # echo 1 > events/sdt_tick/loop2/enable
>
> Now, when I run the binary:
>
>     # /tmp/tick
>
> load_elf_binary() internally calls mmap(), and I see trace_uprobe_mmap_callback()
> updating the reference counter twice:
>
>     # dmesg | tail
>       trace_kprobe: 0x10010000-0x10030000 : 0x10020036
>       trace_kprobe: 0x10020000-0x10030000 : 0x10020036
>
> /proc/<pid>/maps of tick:
>
>     # cat /proc/`pgrep tick`/maps
>       10000000-10010000 r-xp 00000000 08:05 1335712  /tmp/tick
>       10010000-10020000 r--p 00000000 08:05 1335712  /tmp/tick
>       10020000-10030000 rw-p 00010000 08:05 1335712  /tmp/tick
>
> [1] https://github.com/iovisor/bcc/issues/327#issuecomment-200576506
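
The addresses in the two dmesg outputs above can be cross-checked with a small
user-space model. This is only a sketch in Python, not kernel code. It assumes
vma_offset_to_vaddr() computes vm_start + offset - (vm_pgoff << PAGE_SHIFT).
The Vma class and covers() helper are hypothetical names used for illustration.
The VMA ranges come from the pr_info() lines; the file offsets come from the
strace and maps output, and the file offset of 0 for the first tick VMA is
inferred from the maps listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vma:
    """Hypothetical stand-in for the vm_area_struct fields used here."""
    start: int     # vm_start
    end: int       # vm_end
    file_off: int  # vm_pgoff << PAGE_SHIFT, file offset mapped at vm_start


def vma_offset_to_vaddr(vma: Vma, offset: int) -> int:
    # Assumed arithmetic: vm_start + offset - (vm_pgoff << PAGE_SHIFT).
    return vma.start + offset - vma.file_off


def covers(vma: Vma, offset: int) -> bool:
    # True if the file offset lies inside the file range mapped by this VMA.
    return vma.file_off <= offset < vma.file_off + (vma.end - vma.start)


# (name, ref_ctr_offset, VMAs reported by the two callback invocations)
CASES = [
    ("libpython", 0x2799d8, [
        Vma(0x7fff926a0000, 0x7fff926f0000, 0x230000),
        Vma(0x7fff926b0000, 0x7fff926f0000, 0x240000),
    ]),
    ("tick", 0x10036, [
        Vma(0x10010000, 0x10030000, 0x00000),  # file offset inferred from maps
        Vma(0x10020000, 0x10030000, 0x10000),
    ]),
]

for name, ref_ctr_offset, vmas in CASES:
    for vma in vmas:
        # Both VMAs must contain the counter for the callback to touch it.
        assert covers(vma, ref_ctr_offset)
        vaddr = vma_offset_to_vaddr(vma, ref_ctr_offset)
        print(f"{name}: 0x{vma.start:x}-0x{vma.end:x} : 0x{vaddr:x}")
```

Under these assumptions both VMAs in each case resolve to the same virtual
address (0x7fff926e99d8 for libpython, 0x10020036 for tick), which matches the
dmesg lines: the same counter is reached through two different VMAs.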

Also, during de-registration, we look up all existing mms using
uprobe_build_map_info() and decrement the counter in each
mm, i.e. we decrement the counter only once per mm.
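
The resulting imbalance can be illustrated with a toy model. This is a sketch
in Python under simplifying assumptions, not the kernel logic: the Mm class
and the unregister() helper are hypothetical, and the model ignores any
filtering the real callback does. It only counts the two increments seen in
dmesg against the single per-mm decrement at de-registration.

```python
class Mm:
    """Hypothetical model of one process address space (mm)."""

    def __init__(self) -> None:
        self.ref_ctr = 0  # value of the SDT reference counter in this mm

    def mmap_callback(self) -> None:
        # Models one trace_uprobe_mmap_callback() invocation for a VMA
        # that contains the reference counter offset.
        self.ref_ctr += 1


def unregister(mms) -> None:
    # Models de-registration: each existing mm is visited once and its
    # counter is decremented once.
    for mm in mms:
        mm.ref_ctr -= 1


mm = Mm()
mm.mmap_callback()  # first callback that sees the counter
mm.mmap_callback()  # second callback for the same counter
unregister([mm])
print("ref_ctr after de-registration:", mm.ref_ctr)
```

In this model the counter is left at 1 after de-registration, so the marker
would stay enabled even though no probe is attached any more.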

-Ravi
