[PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"


From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: acme@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@redhat.com,
	namhyung@kernel.org, eranian@google.com, ak@linux.intel.com,
	Kan Liang <kan.liang@linux.intel.com>,
	stable@vger.kernel.org
Subject: [PATCH] Revert "perf/x86: Allow zero PEBS status with only single active event"
Date: Wed,  3 Mar 2021 05:42:18 -0800
Message-ID: <1614778938-93092-1-git-send-email-kan.liang@linux.intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

This reverts commit 01330d7288e0 ("perf/x86: Allow zero PEBS status with
only single active event")

A repeatable crash can be triggered by the perf_fuzzer on some Haswell
systems.
https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@maine.edu/

On some old CPUs (HSW and earlier), the PEBS status in a PEBS record
may be mistakenly set to 0. To minimize the impact of the defect, the
commit was introduced to avoid dropping the PEBS record in some cases.
It adds a check in intel_pmu_drain_pebs_nhm() and updates the local
pebs_status accordingly. However, it doesn't correct the PEBS status in
the PEBS record itself, which may trigger the crash, especially with
large PEBS.
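
As a rough illustration of the check described above, here is a minimal
userspace sketch. The expression is modeled on the hunk removed by this
patch; the helper names has_single_bit() and fixup_local_status() are
hypothetical and do not exist in the kernel. Note that the fixup only
changes a local value and never touches the record.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit of mask is set (non-zero power of two). */
static bool has_single_bit(uint64_t mask)
{
	return mask && !(mask & (mask - 1));
}

/*
 * Hypothetical model of the reverted fixup: a zero status is replaced
 * by the enable mask only when that mask names a single event.
 * Only the returned (local) value changes; no record is modified.
 */
static uint64_t fixup_local_status(uint64_t pebs_status, uint64_t pebs_enabled)
{
	if (!pebs_status && has_single_bit(pebs_enabled))
		return pebs_enabled;
	return pebs_status;
}
```

With a single enabled event (mask 0x4) a zero status becomes 0x4; with
two enabled events (mask 0x5) it stays 0 and the record would still be
dropped.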

It's possible that all the PEBS records in a large PEBS have a PEBS
status of 0. If so, the first get_next_pebs_record_by_bit() call in
__intel_pmu_pebs_event() returns NULL, so at = NULL. Since it's a large
PEBS, the 'count' parameter must be > 1, and the second
get_next_pebs_record_by_bit() call will crash.
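
The failure mode can be modeled in userspace with the simplified sketch
below. It is not the kernel code: struct model_record and the model_*
functions are hypothetical stand-ins that assume the lookup only scans
the status field of each record. The caller here stops on NULL purely
so the sketch is safe to run; it is not a proposed fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PEBS record; only status matters here. */
struct model_record {
	uint64_t status;
};

/*
 * Simplified model of the by-bit lookup: return the first record in
 * [base, top) whose status has 'bit' set, or NULL if none does.
 */
static struct model_record *
model_next_record_by_bit(struct model_record *base,
			 struct model_record *top, int bit)
{
	struct model_record *at;

	for (at = base; at < top; at++) {
		if (at->status & (1ULL << bit))
			return at;
	}
	return NULL;
}

/*
 * Count the records a caller can safely visit for 'bit'. The loop ends
 * as soon as the lookup returns NULL, so a buffer whose records all
 * have status 0 yields 0 rather than a walk from a NULL pointer.
 */
static int model_count_records(struct model_record *base,
			       struct model_record *top, int bit)
{
	int n = 0;
	struct model_record *at = model_next_record_by_bit(base, top, bit);

	while (at) {
		n++;
		at = model_next_record_by_bit(at + 1, top, bit);
	}
	return n;
}
```

If every record in the buffer has status 0, the first lookup already
returns NULL. A caller that trusts a count greater than 1 and keeps
going from that result is the situation described above.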

Two solutions were considered to fix the crash.
- Keep the SW workaround and add extra checks in
  get_next_pebs_record_by_bit() to work around the issue. However,
  get_next_pebs_record_by_bit() is on a critical path, and the extra
  checks would add overhead on the latest CPUs, which don't have the
  defect. Also, the defect can only be observed on some old CPUs (for
  example, the issue can be reproduced on an HSW client, but I didn't
  observe it on my Haswell server machine), so the impact of the
  defect should be limited.
  This solution is dropped.
- Drop the SW workaround and revert the commit.
  It seems that the commit never worked, because the PEBS status in the
  PEBS record is never changed. get_next_pebs_record_by_bit() only
  checks the PEBS status in the PEBS record, so the record is dropped
  eventually anyway. Reverting the commit should not change the current
  behavior.

Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/ds.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ebae18..9c90d1e 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2000,18 +2000,6 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
 			continue;
 		}
 
-		/*
-		 * On some CPUs the PEBS status can be zero when PEBS is
-		 * racing with clearing of GLOBAL_STATUS.
-		 *
-		 * Normally we would drop that record, but in the
-		 * case when there is only a single active PEBS event
-		 * we can assume it's for that event.
-		 */
-		if (!pebs_status && cpuc->pebs_enabled &&
-			!(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
-			pebs_status = cpuc->pebs_enabled;
-
 		bit = find_first_bit((unsigned long *)&pebs_status,
 					x86_pmu.max_pebs_events);
 		if (bit >= x86_pmu.max_pebs_events)
-- 
2.7.4


