* [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU @ 2021-01-08 0:01 Rob Herring 2021-01-12 15:32 ` Peter Zijlstra 0 siblings, 1 reply; 5+ messages in thread From: Rob Herring @ 2021-01-08 0:01 UTC (permalink / raw) To: Jiri Olsa Cc: Will Deacon, Andy Lutomirski, linux-kernel, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin Userspace access using rdpmc only makes sense if the event is valid for the current CPU. However, cap_user_rdpmc is currently set no matter which CPU the event is associated with. The result is userspace reading another CPU's event thinks it can use rdpmc to read the counter. In doing so, the wrong counter will be read. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Rob Herring <robh@kernel.org> --- I'm working on adding userspace counter access for arm64 along with a common access routine in libperf. I found this issue when testing per CPU events. More details are here[1]. I'm going to need the same thing for arm64. This could possibly go into perf_event_update_userpage() instead, but perhaps there could be an arch where userspace can read other cpu's counters. What's the ABI between libperf and kernel versions? This change will only help the libperf implementation if libperf doesn't need to support old kernels. [1] https://lore.kernel.org/r/CAL_JsqJzeCebq4VP+xBtfh=fbomvaJoVMp35AQQDGTYD-fRWgw@mail.gmail.com --- arch/x86/events/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index a88c94d65693..6e6d4c1d03ca 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event, userpg->cap_user_time = 0; userpg->cap_user_time_zero = 0; userpg->cap_user_rdpmc = - !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED); + !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) && + (event->oncpu == smp_processor_id()); userpg->pmc_width = x86_pmu.cntval_bits; if (!using_native_sched_clock() || !sched_clock_stable()) -- 2.27.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU 2021-01-08 0:01 [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU Rob Herring @ 2021-01-12 15:32 ` Peter Zijlstra 2021-01-12 16:16 ` Rob Herring 0 siblings, 1 reply; 5+ messages in thread From: Peter Zijlstra @ 2021-01-12 15:32 UTC (permalink / raw) To: Rob Herring Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Namhyung Kim, Thomas Gleixner, Borislav Petkov, x86, H. Peter Anvin On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote: > Userspace access using rdpmc only makes sense if the event is valid for > the current CPU. However, cap_user_rdpmc is currently set no matter which > CPU the event is associated with. The result is userspace reading another > CPU's event thinks it can use rdpmc to read the counter. In doing so, the > wrong counter will be read. Don't do that then? > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c > index a88c94d65693..6e6d4c1d03ca 100644 > --- a/arch/x86/events/core.c > +++ b/arch/x86/events/core.c > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event, > userpg->cap_user_time = 0; > userpg->cap_user_time_zero = 0; > userpg->cap_user_rdpmc = > - !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED); > + !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) && > + (event->oncpu == smp_processor_id()); > userpg->pmc_width = x86_pmu.cntval_bits; > > if (!using_native_sched_clock() || !sched_clock_stable()) Isn't that a nop? That is, from the few sites I checked, we're always calling this on the event's CPU. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU 2021-01-12 15:32 ` Peter Zijlstra @ 2021-01-12 16:16 ` Rob Herring 2021-01-12 17:03 ` Peter Zijlstra 0 siblings, 1 reply; 5+ messages in thread From: Rob Herring @ 2021-01-12 16:16 UTC (permalink / raw) To: Peter Zijlstra Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel@vger.kernel.org, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Namhyung Kim, Thomas Gleixner, Borislav Petkov, X86 ML, H. Peter Anvin On Tue, Jan 12, 2021 at 9:33 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote: > > Userspace access using rdpmc only makes sense if the event is valid for > > the current CPU. However, cap_user_rdpmc is currently set no matter which > > CPU the event is associated with. The result is userspace reading another > > CPU's event thinks it can use rdpmc to read the counter. In doing so, the > > wrong counter will be read. > > Don't do that then? I could check this in userspace I suppose, but then it's yet another thing the rdpmc loop has to check. I think it's better to not add more overhead there. Or we just ignore this. This issue came up in the referenced thread with Jiri. I'm not all that convinced that per CPU events and userspace rdpmc is too useful of a combination. It would only reduce the overhead on 1 out of N cpus. In this case, we'd have to limit the userspace read to per thread events (and ones on any cpu I suppose). > > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c > > index a88c94d65693..6e6d4c1d03ca 100644 > > --- a/arch/x86/events/core.c > > +++ b/arch/x86/events/core.c > > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event, > > userpg->cap_user_time = 0; > > userpg->cap_user_time_zero = 0; > > userpg->cap_user_rdpmc = > > - !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED); > > + !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) && > > + (event->oncpu == smp_processor_id()); > > userpg->pmc_width = x86_pmu.cntval_bits; > > > > if (!using_native_sched_clock() || !sched_clock_stable()) > > Isn't that a nop? That is, from the few sites I checked, we're always > calling this on the event's CPU. If cpu0 opens and mmaps an event for cpu1, then cpu0 will see cap_user_rdpmc set and think it can use rdpmc. Rob ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU 2021-01-12 16:16 ` Rob Herring @ 2021-01-12 17:03 ` Peter Zijlstra 2021-01-12 20:09 ` Rob Herring 0 siblings, 1 reply; 5+ messages in thread From: Peter Zijlstra @ 2021-01-12 17:03 UTC (permalink / raw) To: Rob Herring Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel@vger.kernel.org, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Namhyung Kim, Thomas Gleixner, Borislav Petkov, X86 ML, H. Peter Anvin On Tue, Jan 12, 2021 at 10:16:50AM -0600, Rob Herring wrote: > On Tue, Jan 12, 2021 at 9:33 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote: > > > Userspace access using rdpmc only makes sense if the event is valid for > > > the current CPU. However, cap_user_rdpmc is currently set no matter which > > > CPU the event is associated with. The result is userspace reading another > > > CPU's event thinks it can use rdpmc to read the counter. In doing so, the > > > wrong counter will be read. > > > > Don't do that then? > > I could check this in userspace I suppose, but then it's yet another > thing the rdpmc loop has to check. I think it's better to not add more > overhead there. So all this was designed for self monitoring; attempting rdpmc on an event not for yourself is out of spec. > > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c > > > index a88c94d65693..6e6d4c1d03ca 100644 > > > --- a/arch/x86/events/core.c > > > +++ b/arch/x86/events/core.c > > > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event, > > > userpg->cap_user_time = 0; > > > userpg->cap_user_time_zero = 0; > > > userpg->cap_user_rdpmc = > > > - !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED); > > > + !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) && > > > + (event->oncpu == smp_processor_id()); > > > userpg->pmc_width = x86_pmu.cntval_bits; > > > > > > if (!using_native_sched_clock() || !sched_clock_stable()) > > > > Isn't that a nop? That is, from the few sites I checked, we're always > > calling this on the event's CPU. > > If cpu0 opens and mmaps an event for cpu1, then cpu0 will see > cap_user_rdpmc set and think it can use rdpmc. I don't think your check helps with that. IIRC we always call arch_perf_update_userpage() on the CPU the event actually runs on. So it's always true. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU 2021-01-12 17:03 ` Peter Zijlstra @ 2021-01-12 20:09 ` Rob Herring 0 siblings, 0 replies; 5+ messages in thread From: Rob Herring @ 2021-01-12 20:09 UTC (permalink / raw) To: Peter Zijlstra Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel@vger.kernel.org, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Namhyung Kim, Thomas Gleixner, Borislav Petkov, X86 ML, H. Peter Anvin On Tue, Jan 12, 2021 at 11:05 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Jan 12, 2021 at 10:16:50AM -0600, Rob Herring wrote: > > On Tue, Jan 12, 2021 at 9:33 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote: > > > > Userspace access using rdpmc only makes sense if the event is valid for > > > > the current CPU. However, cap_user_rdpmc is currently set no matter which > > > > CPU the event is associated with. The result is userspace reading another > > > > CPU's event thinks it can use rdpmc to read the counter. In doing so, the > > > > wrong counter will be read. > > > > > > Don't do that then? > > > > I could check this in userspace I suppose, but then it's yet another > > thing the rdpmc loop has to check. I think it's better to not add more > > overhead there. > > So all this was designed for self monitoring; attempting rdpmc on an > event not for yourself is out of spec. > > > > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c > > > > index a88c94d65693..6e6d4c1d03ca 100644 > > > > --- a/arch/x86/events/core.c > > > > +++ b/arch/x86/events/core.c > > > > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event, > > > > userpg->cap_user_time = 0; > > > > userpg->cap_user_time_zero = 0; > > > > userpg->cap_user_rdpmc = > > > > - !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED); > > > > + !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) && > > > > + (event->oncpu == smp_processor_id()); > > > > userpg->pmc_width = x86_pmu.cntval_bits; > > > > > > > > if (!using_native_sched_clock() || !sched_clock_stable()) > > > > > > Isn't that a nop? That is, from the few sites I checked, we're always > > > calling this on the event's CPU. > > > > If cpu0 opens and mmaps an event for cpu1, then cpu0 will see > > cap_user_rdpmc set and think it can use rdpmc. > > I don't think your check helps with that. IIRC we always call > arch_perf_update_userpage() on the CPU the event actually runs on. So > it's always true. My testing says otherwise. I tested this change on the arm64 version of arch_perf_update_userpage, but I don't think x86 should be any different here. I'm testing with libperf test_stat_cpu() modified to mmap each cpu event. Without the change I get the following result: # taskset 2 test-evsel-a -v - running test-evsel.c... mmap base 0xffff9fd77000 userspace counter access enabled on cpu0 <<<<< Reflects cap_user_rdpmc state cpu0: count = 0x72cf, ena = 0x1a838, run = 0x1a838 <<<<< count is from rdpmc mmap base 0xffff9fd76000 userspace counter access enabled on cpu1 cpu1: count = 0xc978, ena = 0x163e6, run = 0x163e6 <<<<< count is from rdpmc cpu0 is idle here, so we'd expect count to be near zero, but it's not. Then with the change, I get the following: # taskset 2 test-evsel-a -v - running test-evsel.c... mmap base 0xffffa742d000 cpu0: count = 0xddb, ena = 0x3f8d6, run = 0x3f8d6 <<<<< count is from read() mmap base 0xffffa742c000 userspace counter access enabled on cpu1 cpu1: count = 0xb538, ena = 0x154f0, run = 0x154f0 <<<<< count is from rdpmc # taskset 1 test-evsel-a -v - running test-evsel.c... mmap base 0xffff8b008000 userspace counter access enabled on cpu0 cpu0: count = 0x7c21, ena = 0x18574, run = 0x18574 mmap base 0xffff8b007000 cpu1: count = 0xb3c, ena = 0x61aa8, run = 0x61aa8 As you can see, count tracks the idle and not idle cpu, and cap_user_rdpmc is only set for the cpu event matching the cpu we are running on. Rob ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-01-12 21:59 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-01-08 0:01 [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU Rob Herring 2021-01-12 15:32 ` Peter Zijlstra 2021-01-12 16:16 ` Rob Herring 2021-01-12 17:03 ` Peter Zijlstra 2021-01-12 20:09 ` Rob Herring
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.