From: Masami Hiramatsu <mhiramat@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@elte.hu, linux-kernel@vger.kernel.org, paulus@samba.org,
eranian@google.com, robert.richter@amd.com, fweisbec@gmail.com
Subject: Re: [RFC][PATCH 10/11] perf, x86: use LBR for PEBS IP+1 fixup
Date: Wed, 03 Mar 2010 16:11:51 -0500
Message-ID: <4B8ED097.6090006@redhat.com>
In-Reply-To: <1267645029.25158.106.camel@laptop>
Peter Zijlstra wrote:
> On Wed, 2010-03-03 at 13:05 -0500, Masami Hiramatsu wrote:
>> Peter Zijlstra wrote:
>>> PEBS always reports the IP+1, that is the instruction after the one
>>> that got sampled, cure this by using the LBR to reliably rewind the
>>> instruction stream.
>>
>> Hmm, does PEBS always report one byte after the end address of the
>> sampled instruction? Or the address of the instruction that will be
>> executed next?
>
> The next instruction, it's trap-like.
>
>> [...]
>>> +#include <asm/insn.h>
>>> +
>>> +#define MAX_INSN_SIZE 16
>>
>> Hmm, we'd better integrate these kinds of definitions into
>> asm/insn.h... (several features define it)
>
> Agreed, I'll look at doing a patch to collect them all into asm/insn.h
> if nobody beats me to it :-)
At least kprobes doesn't :)
>>> +
>>> +static void intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
>>> +{
>>> +#if 0
>>> + /*
>>> + * Borken, makes the machine explode at times trying to
>>> + * dereference funny userspace addresses.
>>> + *
>>> + * Should we always fwd decode from @to, instead of trying
>>> + * to rewind as implemented?
>>> + */
>>> +
>>> + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
>>> + unsigned long from = cpuc->lbr_entries[0].from;
>>> + unsigned long to = cpuc->lbr_entries[0].to;
>>
>> Ah, I see. In the branch instruction case, we can use the LBR to
>> find the previous IP...
>
> Right, we use the LBR to find the basic block.
Hm, that's a good idea :)
>>> + unsigned long ip = regs->ip;
>>> + u8 buf[2*MAX_INSN_SIZE];
>>> + u8 *kaddr;
>>> + int i;
>>> +
>>> + if (from && to) {
>>> + /*
>>> + * We sampled a branch insn, rewind using the LBR stack
>>> + */
>>> + if (ip == to) {
>>> + regs->ip = from;
>>> + return;
>>> + }
>>> + }
>>> +
>>> + if (user_mode(regs)) {
>>> + int bytes = copy_from_user_nmi(buf,
>>> + (void __user *)(ip - MAX_INSN_SIZE),
>>> + 2*MAX_INSN_SIZE);
>>> +
>>
>> Maybe you'd better check that the source address range is within
>> the user address range, e.g. reject ip < MAX_INSN_SIZE, where
>> ip - MAX_INSN_SIZE would wrap around.
>
> Not only that, I realized user_mode() checks regs->cs, which is not set
> by the PEBS code, so I added some helpers.
>
>>> +
>>> + /*
>>> + * Try to find the longest insn ending up at the given IP
>>> + */
>>> + for (i = MAX_INSN_SIZE; i > 0; i--) {
>>> + struct insn insn;
>>> +
>>> + kernel_insn_init(&insn, kaddr + MAX_INSN_SIZE - i);
>>> + insn_get_length(&insn);
>>> + if (insn.length == i) {
>>> + regs->ip -= i;
>>> + return;
>>> + }
>>> + }
>>
>> Hmm, this will not work correctly on x86, since the decoder can
>> mis-decode the tail bytes of the previous instruction as prefix bytes. :(
>>
>> Thus, if you want to rewind instruction stream, you need to decode
>> a function (or basic block) entirely.
>
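To make the mis-decode concern above concrete, here is a minimal user-space sketch. Everything prefixed toy_ is hypothetical: the length rule (low two bits of the first byte, plus one) is a stand-in for a real decoder and is not x86 encoding. It only illustrates that a "longest instruction ending at ip" scan can latch onto a tail byte of the previous instruction.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_INSN 4

/* Hypothetical toy rule: low two bits of the first byte encode length - 1. */
static size_t toy_insn_length(const uint8_t *p)
{
	return (size_t)(p[0] & 0x3) + 1;
}

/*
 * Backward scan in the spirit of the quoted loop: try the longest
 * candidate first and accept the first one whose decoded length
 * ends exactly at ip. Returns the guessed start, or ip if nothing fits.
 */
static size_t toy_rewind_guess(const uint8_t *code, size_t ip)
{
	size_t i;

	for (i = TOY_MAX_INSN; i > 0; i--) {
		if (i > ip)
			continue;	/* would start before the buffer */
		if (toy_insn_length(code + ip - i) == i)
			return ip - i;
	}
	return ip;
}
```

With the bytes { 0x00, 0x00, 0x01, 0x02, 0x01, 0xFF } the real boundaries under the toy rule are 0, 1, 2, 4 and 6, so the instruction ending at 6 starts at 4. The scan instead accepts offset 3, because the tail byte 0x02 happens to decode as a 3-byte instruction.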
> Something like the below?
Great! It looks good to me.
Yeah, LBR.to should always be smaller than the current ip (as long as
no one has disabled LBR).
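
For anyone who wants to try the forward walk outside the kernel, a minimal sketch follows. The toy_ names are hypothetical: toy_insn_length() stands in for insn_get_length() with a made-up length rule (low two bits of the first byte, plus one), and plain buffer offsets replace LBR.to and the sampled ip. The ip == to case, which the quoted code resolves through LBR.from, is simply rejected here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy rule: low two bits of the first byte encode length - 1. */
static size_t toy_insn_length(const uint8_t *p)
{
	return (size_t)(p[0] & 0x3) + 1;
}

/*
 * Walk forward from 'to', a known instruction boundary, until the
 * walk reaches or passes *ipp. If it lands exactly on *ipp, store
 * the start of the last decoded instruction and return 1.
 * Otherwise leave *ipp untouched and return 0.
 */
static int toy_fixup_ip(const uint8_t *code, size_t len, size_t to, size_t *ipp)
{
	size_t ip = *ipp;
	size_t old_to;

	assert(code != NULL);

	/* Nothing to decode if ip is out of range or not after 'to'. */
	if (ip > len || ip <= to)
		return 0;

	do {
		old_to = to;
		to += toy_insn_length(code + to);
	} while (to < ip);

	if (to == ip) {
		*ipp = old_to;
		return 1;
	}

	return 0;
}
```

Because the walk starts on a real boundary, it never decodes from the middle of an instruction. An ip that the walk steps over is reported as a failure rather than guessed at.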
Thank you,
>
> #ifdef CONFIG_X86_32
> static bool kernel_ip(unsigned long ip)
> {
> 	return ip > TASK_SIZE;
> }
> #else
> static bool kernel_ip(unsigned long ip)
> {
> 	return (long)ip < 0;
> }
> #endif
>
> static int intel_pmu_pebs_fixup_ip(unsigned long *ipp)
> {
> 	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
> 	unsigned long from = cpuc->lbr_entries[0].from;
> 	unsigned long old_to, to = cpuc->lbr_entries[0].to;
> 	unsigned long ip = *ipp;
> 	int i;
>
> 	/*
> 	 * We don't need to fixup if the PEBS assist is fault like
> 	 */
> 	if (!x86_pmu.intel_perf_capabilities.pebs_trap)
> 		return 0;
>
> 	if (!cpuc->lbr_stack.nr || !from || !to)
> 		return 0;
>
> 	if (ip < to)
> 		return 0;
>
> 	/*
> 	 * We sampled a branch insn, rewind using the LBR stack
> 	 */
> 	if (ip == to) {
> 		*ipp = from;
> 		return 1;
> 	}
>
> 	do {
> 		struct insn insn;
> 		u8 buf[MAX_INSN_SIZE];
> 		void *kaddr;
>
> 		old_to = to;
> 		if (!kernel_ip(ip)) {
> 			int bytes = copy_from_user_nmi(buf, (void __user *)to,
> 						       MAX_INSN_SIZE);
>
> 			if (bytes != MAX_INSN_SIZE)
> 				return 0;
>
> 			kaddr = buf;
> 		} else kaddr = (void *)to;
>
> 		kernel_insn_init(&insn, kaddr);
> 		insn_get_length(&insn);
> 		to += insn.length;
> 	} while (to < ip);
>
> 	if (to == ip) {
> 		*ipp = old_to;
> 		return 1;
> 	}
>
> 	return 0;
> }
>
> I thought about exposing the success of this fixup as a PERF_RECORD_MISC
> bit.
>
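
As a side note on the 64-bit kernel_ip() quoted above: the sign test works because kernel addresses sit in the upper half of the address space, so their top bit is set. A small user-space sketch, assuming two's complement and a hypothetical helper name (toy_kernel_ip64 is not a kernel function):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space mirror of the 64-bit check: treat an
 * address as a kernel address when its top bit is set, i.e. when
 * it reads as negative after conversion to a signed 64-bit value.
 */
static bool toy_kernel_ip64(uint64_t ip)
{
	return (int64_t)ip < 0;
}
```

An address such as 0xffffffff81000000 is classified as kernel, while 0x00007fffdeadbeef is classified as user. The 32-bit variant cannot use this trick and compares against TASK_SIZE instead.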
--
Masami Hiramatsu
e-mail: mhiramat@redhat.com