From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, acme@kernel.org, namhyung@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	ak@linux.intel.com, eranian@google.com,
	dapeng1.mi@linux.intel.com
Subject: Re: [PATCH V9 3/3] perf/x86/intel: Support PEBS counters snapshotting
Date: Tue, 21 Jan 2025 10:25:03 -0500
Message-ID: <e05bb0c7-cbbb-422a-882c-1dfa648b29e9@linux.intel.com>
In-Reply-To: <7f0ed750-b4b3-4adc-98d2-1e9cccd3bf02@linux.intel.com>



On 2025-01-16 4:50 p.m., Liang, Kan wrote:
> 
> 
> On 2025-01-16 3:56 p.m., Peter Zijlstra wrote:
>> On Thu, Jan 16, 2025 at 09:42:25PM +0100, Peter Zijlstra wrote:
>>> On Thu, Jan 16, 2025 at 10:55:46AM -0500, Liang, Kan wrote:
>>>
>>>>> Also, I think I found you another bug... Consider what happens to the
>>>>> counter value when we reschedule a HES_STOPPED counter: we skip
>>>>> x86_pmu_start(RELOAD) on step2, which leaves the counter value with
>>>>> 'random' crap from whatever was there last.
>>>>>
>>>>> But meanwhile you do program PEBS to sample it. That will happily sample
>>>>> this garbage.
>>>>>
>>>>> Hmm?
>>>>
>>>> I'm not quite sure I understand the issue.
>>>>
>>>> The HES_STOPPED counter should be a pre-existing counter. Just for some
>>>> reason, it's stopped, right? So perf doesn't need to re-configure the
>>>> PEBS_DATA_CFG, since the idx is not changed.
>>>
>>> Suppose you have your group {A, B, C} and let's suppose A is the PEBS
>>> event, further suppose that B is also a sampling event. Let's say they
>>> get hardware counters 1, 2 and 3 respectively.
>>>
>>> Then let's say B gets throttled.
>>>
>>> While it is throttled, we get a new event D scheduled, and D gets placed
>>> on counter 2 -- where B lives, which gets moved over to counter 4.
>>>
>>> Then our loops will update and remove B from 2, but because
>>> throttled/HES_STOPPED it will not start it on counter 4.
>>>
>>> Meanwhile, we do have the PEBS_DATA_CFG thing updated to sample counter
>>> 1,3 and 4.
>>>
>>> PEBS assist happens, and samples the uninitialized counter 4.
>>
>> Also, by skipping x86_pmu_start() we miss the assignment of
>> cpuc->events[] so PEBS buffer decode can't even find the dodgy event.
>>
> 
> Yes, counter 4 contains garbage until B is started again.
> But cpuc->events[counter 4] is NULL as well.
> 
> The current implementation ignores the NULL cpuc->events[]. The stopped
> B should not be mistakenly updated.
> 
> +static void intel_perf_event_pmc_to_count(struct perf_event *event, u64 pmc)
> +{
> +	int shift = 64 - x86_pmu.cntval_bits;
> +	struct hw_perf_event *hwc;
> +	u64 delta, prev_pmc;
> +
> +	/*
> +	 * The PEBS record doesn't shrink on pmu::del().
> +	 * See pebs_update_state().
> +	 * Ignore the non-exist event.
> +	 */
> +	if (!event)
> +		return;
> 
> 
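To make the {A, B, C} + D case above concrete, here is a rough user-space
sketch of the idea, not the kernel code. Everything named sketch_* is
hypothetical, events[] only stands in for cpuc->events[], and the bitmask
only stands in for the counters selected via PEBS_DATA_CFG. It assumes the
snapshot is folded in as a wrap-safe delta over a cntval_bits-wide counter.

```c
#include <stdint.h>

#define NUM_COUNTERS 8

/* Hypothetical stand-in for struct perf_event / hw_perf_event. */
struct sketch_event {
	const char *name;
	uint64_t prev_pmc;	/* last raw counter value folded in */
	uint64_t count;		/* accumulated event count */
};

/* Hypothetical stand-in for cpuc->events[]; NULL means "not started". */
static struct sketch_event *events[NUM_COUNTERS];

/*
 * Fold one raw snapshot value into an event.
 * Returns 1 if the event was updated, 0 if the slot was empty.
 */
static int sketch_pmc_to_count(struct sketch_event *event, uint64_t pmc,
			       int cntval_bits)
{
	int shift = 64 - cntval_bits;
	uint64_t delta;

	/* A stopped/moved event was never started here: ignore the value. */
	if (!event)
		return 0;

	/* Unsigned arithmetic keeps the delta correct across a wrap. */
	delta = (pmc << shift) - (event->prev_pmc << shift);
	delta >>= shift;

	event->prev_pmc = pmc;
	event->count += delta;
	return 1;
}

/* Walk every counter selected in the mask; return how many were updated. */
static int sketch_process_snapshot(uint64_t cntr_mask,
				   const uint64_t *snapshot, int cntval_bits)
{
	int idx, updated = 0;

	for (idx = 0; idx < NUM_COUNTERS; idx++) {
		if (!(cntr_mask & (1ULL << idx)))
			continue;
		updated += sketch_pmc_to_count(events[idx], snapshot[idx],
					       cntval_bits);
	}
	return updated;
}
```

With A on counter 1, C on counter 3, and the throttled B moved to counter 4
but never started, events[4] stays NULL. A snapshot covering counters 1, 3
and 4 then updates only A and C, and the garbage in slot 4 is dropped.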

I've sent a V10 to address all the comments in V9.
The above case is explained in the comments of
intel_perf_event_update_pmc() in V10.
https://lore.kernel.org/lkml/20250121152303.3128733-4-kan.liang@linux.intel.com/

Please take a look and let me know if it's not sufficient to handle the
case.

Thanks,
Kan

Thread overview: 14+ messages
2025-01-15 18:43 [PATCH V9 1/3] perf/x86/intel: Avoid pmu_disable/enable if !cpuc->enabled in sample read kan.liang
2025-01-15 18:43 ` [PATCH V9 2/3] perf: Avoid the read if the count is already updated kan.liang
2025-01-15 18:43 ` [PATCH V9 3/3] perf/x86/intel: Support PEBS counters snapshotting kan.liang
2025-01-16 11:47   ` Peter Zijlstra
2025-01-16 15:55     ` Liang, Kan
2025-01-16 20:42       ` Peter Zijlstra
2025-01-16 20:56         ` Peter Zijlstra
2025-01-16 21:50           ` Liang, Kan
2025-01-21 15:25             ` Liang, Kan [this message]
2025-01-23  9:14             ` Peter Zijlstra
2025-01-23 15:36               ` Liang, Kan
2025-01-16 12:02   ` Peter Zijlstra
2025-01-16 10:32 ` [PATCH V9 1/3] perf/x86/intel: Avoid pmu_disable/enable if !cpuc->enabled in sample read Peter Zijlstra
2025-01-16 10:51   ` Peter Zijlstra
