From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Borislav Petkov <bp@alien8.de>, Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH 1/2] perf: Add munmap callback
Date: Mon, 5 Nov 2018 10:43:41 -0500
Message-ID: <abe1cbab-d010-0899-acb1-32b1f9bf17df@linux.intel.com>
In-Reply-To: <CABPqkBRT678TYMbPMwsUjmazQij_qFDgu2nS7dm619ukni5EGQ@mail.gmail.com>
On 11/5/2018 5:59 AM, Stephane Eranian wrote:
> Hi Kan,
>
> I built a small test case for you to demonstrate the issue for code and data.
> Compile the test program and then do:
> For text:
> $ perf record ./mmap
> $ perf report -D | fgrep MMAP2
>
> The test program mmaps 2 pages, unmaps the second, and remaps 1 page
> over the freed space.
> If you look at the MMAP2 records, you will not be able to reconstruct
> what happened, and perf will get confused should it try to symbolize
> from the address range.
>
> With Text:
> PERF_RECORD_MMAP2 5937/5937: [0x400000(0x1000) @ 0 08:01 400938 824817672]: r-xp /home/eranian/mmap
> PERF_RECORD_MMAP2 5937/5937: [0x7f7c01019000(0x2000) @ 0x7f7c01019000 00:00 0 0]: rwxp //anon
> PERF_RECORD_MMAP2 5937/5937: [0x7f7c01019000(0x2000) @ 0x7f7c01019000 00:00 0 0]: rwxp //anon
>
> ^^^^^^^^^^^^^^^^^^^^^^^^ captures the whole VMA but not the mapping
> change in user space
>
> For data:
> $ perf record -d ./mmap
> $ perf report -D | fgrep MMAP2
> With data:
> PERF_RECORD_MMAP2 6430/6430: [0x400000(0x1000) @ 0 08:01 400938 3278843184]: r-xp /home/eranian/mmap
> PERF_RECORD_MMAP2 6430/6430: [0x7f4aa704b000(0x2000) @ 0x7f4aa704b000 00:00 0 0]: rw-p //anon
> PERF_RECORD_MMAP2 6430/6430: [0x7f4aa704b000(0x2000) @ 0x7f4aa704b000 00:00 0 0]: rw-p //anon
>
> Same test case with data.
> Perf will think the entire 2 pages have been replaced when in fact
> only the second has.
> I believe the problem is likely to impact data and jitted code caches.
>
> #include <sys/types.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <err.h>
> #include <getopt.h>
>
> int main(int argc, char **argv)
> {
>     void *addr1, *addr2;
>     size_t pgsz = sysconf(_SC_PAGESIZE);
>     int n = 2;
>     int ret;
>     int c, mode = 0;
>
>     while ((c = getopt(argc, argv, "hd")) != -1) {
>         switch (c) {
>         case 'h':
>             printf("[-h]\tget this help\n");
>             printf("[-d]\tuse data mmaps (no PROT_EXEC)\n");
>             return 0;
>         case 'd':
>             mode = PROT_EXEC;
>             break;
>         default:
>             errx(1, "unknown option");
>         }
>     }
>     /* default to data */
>     if (mode == 0)
>         mode = PROT_WRITE;
>
>     /*
>      * mmap 2 contiguous pages
>      */
>     addr1 = mmap(NULL, n * pgsz, PROT_READ | mode, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>     if (addr1 == (void *)MAP_FAILED)
>         err(1, "mmap 1 failed");
>
>     printf("addr1=[%p : %p]\n", addr1, addr1 + n * pgsz);
>
>     /*
>      * unmap only the second page
>      */
>     ret = munmap(addr1 + pgsz, pgsz);
>     if (ret == -1)
>         err(1, "munmap failed");
>
>     /*
>      * mmap 1 page at the location of the unmapped page (should reuse virtual space)
>      * This creates a contiguous region built from two mmaps and
>      * potentially two different sources
>      * especially with jitted runtimes
>      */

The two mmaps are both anon. As I understand it, we cannot symbolize
anonymous addresses, can we?
If we cannot, why do we have to distinguish between them? I think we do
not need to know their sources for symbolization.
As I understand it, only perf inject --jit can inject MMAP events, which
tag an anon region. Perf can symbolize the addresses after that. In that
case the unmap is needed.

Thanks,
Kan

>     addr2 = mmap(addr1 + pgsz, 1 * pgsz, PROT_READ | PROT_WRITE | mode,
>                  MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>
>     printf("addr2=%p\n", addr2);
>
>     if (addr2 == (void *)MAP_FAILED)
>         err(1, "mmap 2 failed");
>     if (addr2 != (addr1 + pgsz))
>         errx(1, "wrong mmap2 address");
>
>     sleep(1);
>
>     return 0;
> }
>
> On Thu, Nov 1, 2018 at 7:10 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 10/24/2018 3:30 PM, Stephane Eranian wrote:
>>> The need for this new record type extends beyond physical address conversions
>>> and PEBS. A long while ago, someone reported issues with symbolization related
>>> to perf lacking munmap tracking. It had to do with vma merging. I think the
>>> sequence of mmaps was as follows in the problematic case:
>>> 1. addr1 = mmap(8192);
>>> 2. munmap(addr1 + 4096, 4096)
>>> 3. addr2 = mmap(addr1+4096, 4096)
>>>
>>> If successful, that yields addr2 = addr1 + 4096 (could also get the
>>> same without forcing the address).
>>>
>>> In that case, if I recall correctly, the vma for 1st mapping (now at
>>> 4k) and that of the 2nd mapping (4k)
>>> get merged into a single 8k vma and this is what perf_events will
>>> record for PERF_RECORD_MMAP.
>>> On the perf tool side, it is assumed that if two timestamped mappings
>>> overlap, then the latter overrides the former. In this case, perf
>>> would lose the mapping of the first 4kb and assume all symbols come
>>> from the 2nd mapping. Hopefully I got the scenario right. If so, then
>>> you'd need PERF_RECORD_UNMAP to disambiguate, assuming the perf tool
>>> is modified accordingly.
>>>
>>
>> Hi Stephane and Peter,
>>
>> I went through the link (https://lkml.org/lkml/2017/1/27/452). I'm trying
>> to understand the problematic case.
>>
>> It looks like the issue can only be triggered by perf inject --jit,
>> because it can inject extra MMAP events.
>> As I understand it, the Linux kernel only tries to merge VMAs if they
>> are both anon or both from the same file. --jit breaks that rule and
>> makes the merged VMA partly from anon, partly from file.
>> Now there is a new MMAP event whose range covers the modified VMA.
>> Without the help of a MUNMAP event, the perf tool has no idea whether
>> the new one is a newly merged VMA (modified VMA + a new VMA) or a brand
>> new VMA. The current code simply overwrites the modified VMAs. The VMA
>> information that --jit injected may be lost, and the symbolization may
>> be lost as well.
>>
>> Apart from --jit, the VMA information should be consistent between the
>> kernel and the perf tools, so we shouldn't observe the problem and a
>> MUNMAP event is not needed.
>>
>> Is my understanding correct?
>>
>> Do you have a test case for the problem?
>>
>> Thanks,
>> Kan