From: Peter Zijlstra <peterz@infradead.org>
To: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: mingo@kernel.org, acme@kernel.org, namhyung@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org,
	ak@linux.intel.com, eranian@google.com,
	Sandipan Das <sandipan.das@amd.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	silviazhao <silviazhao-oc@zhaoxin.com>
Subject: Re: [PATCH V4 1/5] perf/x86: Extend event update interface
Date: Thu, 1 Aug 2024 18:36:18 +0200
Message-ID: <20240801163618.GD39708@noisy.programming.kicks-ass.net>
In-Reply-To: <f9b18e66-eb7d-4998-8843-b1a16cc004b0@linux.intel.com>

On Thu, Aug 01, 2024 at 11:31:40AM -0400, Liang, Kan wrote:
> 
> 
> On 2024-08-01 10:03 a.m., Peter Zijlstra wrote:
> > On Wed, Jul 31, 2024 at 07:38:31AM -0700, kan.liang@linux.intel.com wrote:
> >> From: Kan Liang <kan.liang@linux.intel.com>
> >>
> >> The current event update interface directly reads the values from the
> >> counter, but the values may not be the accurate ones users require. For
> >> example, the sample read feature wants the counter value of the member
> >> events when the leader event is overflow. But with the current
> >> implementation, the read (event update) actually happens in the NMI
> >> handler. There may be a small gap between the overflow and the NMI
> >> handler.
> > 
> > This...
> > 
> >> The new Intel PEBS counters snapshotting feature can provide
> >> the accurate counter value in the overflow. The event update interface
> >> has to be updated to apply the given accurate values.
> >>
> >> Pass the accurate values via the event update interface. If the value is
> >> not available, still directly read the counter.
> > 
> >> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> >> index 12f2a0c14d33..07a56bf71160 100644
> >> --- a/arch/x86/events/core.c
> >> +++ b/arch/x86/events/core.c
> >> @@ -112,7 +112,7 @@ u64 __read_mostly hw_cache_extra_regs
> >>   * Can only be executed on the CPU where the event is active.
> >>   * Returns the delta events processed.
> >>   */
> >> -u64 x86_perf_event_update(struct perf_event *event)
> >> +u64 x86_perf_event_update(struct perf_event *event, u64 *val)
> >>  {
> >>  	struct hw_perf_event *hwc = &event->hw;
> >>  	int shift = 64 - x86_pmu.cntval_bits;
> >> @@ -131,7 +131,10 @@ u64 x86_perf_event_update(struct perf_event *event)
> >>  	 */
> >>  	prev_raw_count = local64_read(&hwc->prev_count);
> >>  	do {
> >> -		rdpmcl(hwc->event_base_rdpmc, new_raw_count);
> >> +		if (!val)
> >> +			rdpmcl(hwc->event_base_rdpmc, new_raw_count);
> >> +		else
> >> +			new_raw_count = *val;
> >>  	} while (!local64_try_cmpxchg(&hwc->prev_count,
> >>  				      &prev_raw_count, new_raw_count));
> >>  
> > 
> > Does that mean the following is possible?
> > 
> > Two counters: C0 and C1, where C0 is a PEBS counter that also samples
> > C1.
> > 
> >   C0: overflow-with-PEBS-assist -> PEBS entry with counter value A
> >       (DS buffer threshold not reached)
> > 
> >   C1: overflow -> PMI -> x86_perf_event_update(C1, NULL)
> >       rdpmcl reads value 'A+d', and sets prev_raw_count
> > 
> >   C0: more assists, hit DS threshold -> PMI
> >       PEBS processing does x86_perf_event_update(C1, A)
> >       and sets prev_raw_count *backwards*
> 
> I think the C0 PMI handler doesn't touch other counters unless
> PERF_SAMPLE_READ is applied. For the PERF_SAMPLE_READ, only one counter
> does sampling. It's impossible that C0 and C1 do sampling at the same
> time. I don't think the above scenario is possible.

It is perfectly fine for C0 to have PERF_SAMPLE_READ and C1 to be a
normal counter, sampling or otherwise.

> Maybe we can add the below check to further prevent the abuse of the
> interface.

There is no abuse in the above scenario. You can have a group with all
sampling events and any number of them can have PERF_SAMPLE_READ. This
is perfectly fine.

> WARN_ON_ONCE(!(event->attr.sample_type & PERF_SAMPLE_READ) && val);

I don't see how PERF_SAMPLE_READ is relevant, *any* PMI for the C1 event
will cause x86_perf_event_update() to be called. And remember that even
non-sampling events have EVENTSEL_INT set to deal with counter overflow.

The problem here is that C0/PEBS will come in late and try to force
update an out-of-date value.

If you have C1 be a non-sampling event, this will typically not happen,
but it still *can*, and when it does, you get your counter moving
backwards.
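
To make the ordering problem concrete, here is a rough user-space sketch
of the C0/C1 scenario above. It is not kernel code: struct model_event,
model_event_update() and replay_scenario() are hypothetical names, the
counter values are made up, and the counter-width shift done by the real
x86_perf_event_update() is left out for brevity. It only models "use
*val if given, else read the counter" and shows the stale snapshot
producing a negative delta.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant bits of perf_event/hw_perf_event. */
struct model_event {
	uint64_t prev_count;	/* models hwc->prev_count */
	int64_t  count;		/* models event->count */
	uint64_t hw_counter;	/* models the PMC that rdpmcl() would read */
};

/*
 * Simplified model of the patched update: take *val when supplied,
 * otherwise read the "hardware" counter. Returns the delta applied.
 */
static int64_t model_event_update(struct model_event *ev, const uint64_t *val)
{
	uint64_t new_raw;
	int64_t delta;

	assert(ev != 0);
	new_raw = val ? *val : ev->hw_counter;
	delta = (int64_t)(new_raw - ev->prev_count);
	ev->prev_count = new_raw;
	ev->count += delta;
	return delta;
}

/* Replays the three steps of the scenario for C1; returns the last delta. */
static int64_t replay_scenario(void)
{
	struct model_event c1 = { .prev_count = 100, .count = 0, .hw_counter = 100 };
	uint64_t snapshot_a;

	/* C0 overflows with PEBS assist: the record snapshots C1 at value A. */
	c1.hw_counter = 150;
	snapshot_a = c1.hw_counter;

	/* C1 overflows -> PMI -> update(C1, NULL) reads A+d. */
	c1.hw_counter = 170;
	model_event_update(&c1, 0);

	/* C0 hits the DS threshold -> PMI -> PEBS processing applies stale A. */
	return model_event_update(&c1, &snapshot_a);
}
```

With these made-up numbers the second update computes 150 - 170, so
replay_scenario() returns -20 and the modelled prev_count goes from 170
back to 150.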

Thread overview: 10+ messages
2024-07-31 14:38 [PATCH V4 0/5] Support Lunar Lake and Arrow Lake core PMU kan.liang
2024-07-31 14:38 ` [PATCH V4 1/5] perf/x86: Extend event update interface kan.liang
2024-08-01 14:03   ` Peter Zijlstra
2024-08-01 15:31     ` Liang, Kan
2024-08-01 16:36       ` Peter Zijlstra [this message]
2024-08-01 19:18         ` Liang, Kan
2024-07-31 14:38 ` [PATCH V4 2/5] perf: Extend perf_output_read kan.liang
2024-07-31 14:38 ` [PATCH V4 3/5] perf/x86/intel: Move PEBS event update after the sample output kan.liang
2024-07-31 14:38 ` [PATCH V4 4/5] perf/x86/intel: Support PEBS counters snapshotting kan.liang
2024-07-31 14:38 ` [PATCH V4 5/5] perf/x86/intel: Support RDPMC metrics clear mode kan.liang
