All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Breno Leitao <leitao@debian.org>,
	sandipan.das@amd.com, Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	"Liang, Kan" <kan.liang@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	leit@meta.com,
	"open list:PERFORMANCE EVENTS SUBSYSTEM"
	<linux-perf-users@vger.kernel.org>,
	"open list:PERFORMANCE EVENTS SUBSYSTEM"
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] perf/x86/amd: Warn only on new bits set
Date: Wed, 26 Jun 2024 10:51:53 +0200	[thread overview]
Message-ID: <20240626085153.GA31592@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <e17a924d-9699-465f-8bef-cde4411e2146@paulmck-laptop>

On Tue, Jun 25, 2024 at 07:47:06AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 25, 2024 at 01:57:34PM +0200, Peter Zijlstra wrote:
> > On Fri, May 24, 2024 at 07:10:20AM -0700, Breno Leitao wrote:
> > > Warning at every leaking bits can cause a flood of message, triggering
> > > vairous stall-warning mechanisms to fire, including CSD locks, which
> > > makes the machine to be unusable.
> > > 
> > > Track the bits that are being leaked, and only warn when a new bit is
> > > set.
> > > 
> > > Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> > > ---
> > >  arch/x86/events/amd/core.c | 9 +++++++--
> > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
> > > index 1fc4ce44e743..df0ba2382d13 100644
> > > --- a/arch/x86/events/amd/core.c
> > > +++ b/arch/x86/events/amd/core.c
> > > @@ -941,11 +941,12 @@ static int amd_pmu_v2_snapshot_branch_stack(struct perf_branch_entry *entries, u
> > >  static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
> > >  {
> > >  	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > > +	static atomic64_t status_warned = ATOMIC64_INIT(0);
> > > +	u64 reserved, status, mask, new_bits;
> > >  	struct perf_sample_data data;
> > >  	struct hw_perf_event *hwc;
> > >  	struct perf_event *event;
> > >  	int handled = 0, idx;
> > > -	u64 reserved, status, mask;
> > >  	bool pmu_enabled;
> > >  
> > >  	/*
> > > @@ -1010,7 +1011,11 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
> > >  	 * the corresponding PMCs are expected to be inactive according to the
> > >  	 * active_mask
> > >  	 */
> > > -	WARN_ON(status > 0);
> > > +	if (status > 0) {
> > > +		new_bits = atomic64_fetch_or(status, &status_warned) ^ atomic64_read(&status_warned);
> > > +		// A new bit was set for the very first time.
> > > +		WARN(new_bits, "New overflows for inactive PMCs: %llx\n", new_bits);
> > > +	}
> > 
> > Why not just a WARN_ON_ONCE() instead? This really shouldn't be
> > happening in the first place.
> 
> We did consider that, but seeing the full set of bits that shouldn't
> have been happening in the first place helps with debuggging.
> 
> But is there a better way to accumulate and print the full set of
> unexpected bits?

Dunno, I was just wondering if the whole thing wasn't massive overkill.
The changelog wasn't really explaining much here.

  reply	other threads:[~2024-06-26  8:52 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-24 14:10 [PATCH] perf/x86/amd: Warn only on new bits set Breno Leitao
2024-06-06  5:34 ` Sandipan Das
2024-06-25 11:57 ` Peter Zijlstra
2024-06-25 14:47   ` Paul E. McKenney
2024-06-26  8:51     ` Peter Zijlstra [this message]
2024-06-26 13:47       ` Paul E. McKenney
2024-06-26 13:57       ` Breno Leitao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240626085153.GA31592@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=leit@meta.com \
    --cc=leitao@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=sandipan.das@amd.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.