stable.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Kim Phillips <kim.phillips@amd.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Arnaldo Carvalho de Melo" <acme@kernel.org>,
	x86@kernel.org, Ingo Molnar <mingo@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Jiri Olsa <jolsa@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Borislav Petkov" <bp@alien8.de>,
	Stephane Eranian <eranian@google.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.14 30/36] perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
Date: Wed,  4 Sep 2019 12:01:16 -0400
Message-ID: <20190904160122.4179-30-sashal@kernel.org>
In-Reply-To: <20190904160122.4179-1-sashal@kernel.org>

From: Kim Phillips <kim.phillips@amd.com>

[ Upstream commit 0f4cd769c410e2285a4e9873a684d90423f03090 ]

When counting dispatched micro-ops with cnt_ctl=1, IBS hardware prevents
sample bias by preloading the least significant 7 bits of the current
count (IbsOpCurCnt) with random values, so that after the interrupt is
handled and counting resumes, the next sample taken is slightly
perturbed.

The current count bitfield is in the IBS execution control hardware
register, alongside the maximum count field.

Currently, the IBS driver writes that register with the maximum count,
leaving zeroes to fill the current count field, thereby overwriting
the random bits the hardware preloaded for itself.

Fix the driver to carry those random bits from the read of the IBS
control register through to its write, instead of overwriting the
lower current count bits with zeroes.
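
As a rough illustration of the fix described above, the re-arm value can be
sketched in plain C outside the kernel. The two masks are copied from this
patch; ibs_op_rearm_bits() and its has_rdwropcnt parameter are hypothetical
names that stand in for the logic in perf_ibs_handle_irq() and for the
(ibs_caps & IBS_CAPS_RDWROPCNT) check. This is a sketch under those
assumptions, not the driver code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks copied from this patch (arch/x86/include/asm/perf_event.h). */
#define IBS_OP_CUR_CNT_RAND	(0x0007FULL << 32)
#define IBS_OP_CNT_CTL		(1ULL << 19)

/*
 * Hypothetical helper, not a kernel function: compute the value that
 * the patched handler passes to perf_ibs_enable_event().
 *
 * config:        IBS op control register value read in the handler
 * period:        next sample period
 * has_rdwropcnt: stands in for (ibs_caps & IBS_CAPS_RDWROPCNT)
 */
static uint64_t ibs_op_rearm_bits(uint64_t config, uint64_t period,
				  bool has_rdwropcnt)
{
	/* Same shift the driver applied before this patch. */
	period >>= 4;

	/* Carry the hardware's random low bits instead of zeroing them. */
	if (has_rdwropcnt && (config & IBS_OP_CNT_CTL))
		period |= config & IBS_OP_CUR_CNT_RAND;

	return period;
}
```

With cnt_ctl clear, or without the RDWROPCNT capability, the helper returns
period >> 4 unchanged, which matches the pre-patch behaviour.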

Tested with:

perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>

'perf annotate' output before:

 15.70  65:   addsd     %xmm0,%xmm1
 17.30        add       $0x1,%rax
 15.88        cmp       %rdx,%rax
              je        82
 17.32  72:   test      $0x1,%al
              jne       7c
  7.52        movapd    %xmm1,%xmm0
  5.90        jmp       65
  8.23  7c:   sqrtsd    %xmm1,%xmm0
 12.15        jmp       65

'perf annotate' output after:

 16.63  65:   addsd     %xmm0,%xmm1
 16.82        add       $0x1,%rax
 16.81        cmp       %rdx,%rax
              je        82
 16.69  72:   test      $0x1,%al
              jne       7c
  8.30        movapd    %xmm1,%xmm0
  8.13        jmp       65
  8.24  7c:   sqrtsd    %xmm1,%xmm0
  8.39        jmp       65
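
One way to read the two listings is to compare them against the share each
instruction would get from unbiased sampling. The sketch below assumes that
every listed instruction accounts for one counted op, that the branch on the
low bit of %rax alternates between the movapd and sqrtsd paths, and that the
je/jne lines (which show no samples in either listing) can be left out. The
table and expected_percent() are hypothetical names used only for this
illustration.

```c
#include <stddef.h>

struct loop_op {
	const char *insn;
	unsigned int execs;	/* executions per two loop iterations */
};

/* Assumed execution counts, derived from the annotate listing. */
static const struct loop_op loop_ops[] = {
	{ "addsd",              2 },
	{ "add",                2 },
	{ "cmp",                2 },
	{ "test",               2 },
	{ "movapd",             1 },
	{ "jmp (after movapd)", 1 },
	{ "sqrtsd",             1 },
	{ "jmp (after sqrtsd)", 1 },
};

#define NR_LOOP_OPS	(sizeof(loop_ops) / sizeof(loop_ops[0]))

/* Percentage of samples expected on loop_ops[idx] without bias. */
static double expected_percent(size_t idx)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < NR_LOOP_OPS; i++)
		total += loop_ops[i].execs;

	return 100.0 * loop_ops[idx].execs / total;
}
```

Under these assumptions the four always-executed instructions come out at
about 16.67% each and the four alternating ones at about 8.33% each, which is
close to the 'after' listing and noticeably different from the 'before' one.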

Tested on Family 15h and 17h machines.

Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
affect their operation.

It is unknown why commit db98c5faf8cb ("perf/x86: Implement 64-bit
counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
field; as far as I can tell, the number of preloaded random bits has
always been 7.
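
The difference between the old and new mask layouts can be checked with a
few lines of C. The mask values are taken from this patch;
IBS_OP_CUR_CNT_OLD is a name introduced here for the pre-patch value of
IBS_OP_CUR_CNT, and mask_width() is a hypothetical helper. The 20-bit total
is only what these masks cover, not a statement about the hardware field
width.

```c
#include <stdint.h>

#define IBS_OP_CUR_CNT_OLD	(0xFFFF0ULL << 32)	/* before this patch */
#define IBS_OP_CUR_CNT		(0xFFF80ULL << 32)	/* after this patch */
#define IBS_OP_CUR_CNT_RAND	(0x0007FULL << 32)	/* added by this patch */

/* Hypothetical helper: number of bits set in a mask. */
static int mask_width(uint64_t mask)
{
	int n = 0;

	while (mask) {
		n += (int)(mask & 1);
		mask >>= 1;
	}

	return n;
}
```

The old mask covers 16 bits and skips the low 4. The new pair covers 13 and 7
bits respectively, does not overlap, and together spans 0xFFFFFULL << 32.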

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: "Namhyung Kim" <namhyung@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/events/amd/ibs.c         | 13 ++++++++++---
 arch/x86/include/asm/perf_event.h | 12 ++++++++----
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 8c51844694e2f..7a86fbc07ddc1 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -672,10 +672,17 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 
 	throttle = perf_event_overflow(event, &data, &regs);
 out:
-	if (throttle)
+	if (throttle) {
 		perf_ibs_stop(event, 0);
-	else
-		perf_ibs_enable_event(perf_ibs, hwc, period >> 4);
+	} else {
+		period >>= 4;
+
+		if ((ibs_caps & IBS_CAPS_RDWROPCNT) &&
+		    (*config & IBS_OP_CNT_CTL))
+			period |= *config & IBS_OP_CUR_CNT_RAND;
+
+		perf_ibs_enable_event(perf_ibs, hwc, period);
+	}
 
 	perf_event_update_userpage(event);
 
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 78241b736f2a0..f6c4915a863e0 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -209,16 +209,20 @@ struct x86_pmu_capability {
 #define IBSCTL_LVT_OFFSET_VALID		(1ULL<<8)
 #define IBSCTL_LVT_OFFSET_MASK		0x0F
 
-/* ibs fetch bits/masks */
+/* IBS fetch bits/masks */
 #define IBS_FETCH_RAND_EN	(1ULL<<57)
 #define IBS_FETCH_VAL		(1ULL<<49)
 #define IBS_FETCH_ENABLE	(1ULL<<48)
 #define IBS_FETCH_CNT		0xFFFF0000ULL
 #define IBS_FETCH_MAX_CNT	0x0000FFFFULL
 
-/* ibs op bits/masks */
-/* lower 4 bits of the current count are ignored: */
-#define IBS_OP_CUR_CNT		(0xFFFF0ULL<<32)
+/*
+ * IBS op bits/masks
+ * The lower 7 bits of the current count are random bits
+ * preloaded by hardware and ignored in software
+ */
+#define IBS_OP_CUR_CNT		(0xFFF80ULL<<32)
+#define IBS_OP_CUR_CNT_RAND	(0x0007FULL<<32)
 #define IBS_OP_CNT_CTL		(1ULL<<19)
 #define IBS_OP_VAL		(1ULL<<18)
 #define IBS_OP_ENABLE		(1ULL<<17)
-- 
2.20.1

