public inbox for linux-kernel@vger.kernel.org
From: Adrian Hunter <adrian.hunter@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@redhat.com>,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, kvm@vger.kernel.org,
	H Peter Anvin <hpa@zytor.com>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Leo Yan <leo.yan@linaro.org>
Subject: [PATCH V2 03/11] perf/x86: Add support for TSC in nanoseconds as a perf event clock
Date: Mon, 14 Feb 2022 13:09:06 +0200
Message-ID: <20220214110914.268126-4-adrian.hunter@intel.com>
In-Reply-To: <20220214110914.268126-1-adrian.hunter@intel.com>

Currently, when Intel PT is used within a VM guest, it is not possible to
make use of the TSC because the perf clock is subject to paravirtualization.

If the hypervisor leaves rdtsc alone, the TSC value will be subject only to
the VMCS TSC Offset and Scaling, the same as the TSC packet from Intel PT.
The new clock is based on rdtsc and is not subject to paravirtualization.

Hence this new clock can be used for Intel PT decoding within a VM guest.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
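Note (not part of the commit message): the time_mult, time_shift and
time_zero values that arch_perf_update_userpage() publishes are meant to be
consumed by user space to turn a raw TSC value into a perf timestamp. The
following is a minimal sketch of that conversion, assuming the formula
documented for cap_user_time_zero in include/uapi/linux/perf_event.h. The
struct tsc_conv type and the tsc_to_perf_time() name are hypothetical and
exist only for this illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical container for the conversion parameters that user space
 * would read from the perf mmap page (field widths match the page).
 */
struct tsc_conv {
	uint32_t time_mult;
	uint16_t time_shift;
	uint64_t time_zero;
};

/*
 * Convert a raw TSC value to a perf timestamp. The multiplication is
 * split into quotient and remainder parts so that the intermediate
 * product is less likely to overflow 64 bits.
 */
uint64_t tsc_to_perf_time(uint64_t cyc, const struct tsc_conv *tc)
{
	uint64_t quot, rem;

	assert(tc->time_shift < 64);

	quot = cyc >> tc->time_shift;
	rem = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}
```

With the CLOCK_PERF_HW_CLOCK settings from the hunk below (time_mult = 1,
time_shift = 0, time_zero = 0) this reduces to the identity, so the
timestamp is the TSC value itself. For CLOCK_PERF_HW_CLOCK_NS the cyc2ns
multiplier and shift apply instead.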
 arch/x86/events/core.c            | 41 ++++++++++++++++++++-----------
 arch/x86/include/asm/perf_event.h |  2 ++
 include/uapi/linux/perf_event.h   |  6 +++++
 kernel/events/core.c              |  6 +++++
 4 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 51d5345de30a..905975a7d475 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -41,6 +41,7 @@
 #include <asm/desc.h>
 #include <asm/ldt.h>
 #include <asm/unwind.h>
+#include <asm/tsc.h>
 
 #include "perf_event.h"
 
@@ -2728,18 +2729,26 @@ void arch_perf_update_userpage(struct perf_event *event,
 		!!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT);
 	userpg->pmc_width = x86_pmu.cntval_bits;
 
-	if (event->attr.use_clockid &&
-	    event->attr.ns_clockid &&
-	    event->attr.clockid == CLOCK_PERF_HW_CLOCK) {
-		userpg->cap_user_time_zero = 1;
-		userpg->time_mult = 1;
-		userpg->time_shift = 0;
-		userpg->time_offset = 0;
-		userpg->time_zero = 0;
-		return;
+	if (event->attr.use_clockid && event->attr.ns_clockid) {
+		if (event->attr.clockid == CLOCK_PERF_HW_CLOCK) {
+			userpg->cap_user_time_zero = 1;
+			userpg->time_mult = 1;
+			userpg->time_shift = 0;
+			userpg->time_offset = 0;
+			userpg->time_zero = 0;
+			return;
+		}
+		if (event->attr.clockid == CLOCK_PERF_HW_CLOCK_NS)
+			userpg->cap_user_time_zero = 1;
+	}
+
+	if (using_native_sched_clock() && sched_clock_stable()) {
+		userpg->cap_user_time = 1;
+		if (!event->attr.use_clockid)
+			userpg->cap_user_time_zero = 1;
 	}
 
-	if (!using_native_sched_clock() || !sched_clock_stable())
+	if (!userpg->cap_user_time && !userpg->cap_user_time_zero)
 		return;
 
 	cyc2ns_read_begin(&data);
@@ -2750,19 +2759,16 @@ void arch_perf_update_userpage(struct perf_event *event,
 	 * Internal timekeeping for enabled/running/stopped times
 	 * is always in the local_clock domain.
 	 */
-	userpg->cap_user_time = 1;
 	userpg->time_mult = data.cyc2ns_mul;
 	userpg->time_shift = data.cyc2ns_shift;
 	userpg->time_offset = offset - now;
 
 	/*
 	 * cap_user_time_zero doesn't make sense when we're using a different
-	 * time base for the records.
+	 * time base for the records, except for CLOCK_PERF_HW_CLOCK_NS.
 	 */
-	if (!event->attr.use_clockid) {
-		userpg->cap_user_time_zero = 1;
+	if (userpg->cap_user_time_zero)
 		userpg->time_zero = offset;
-	}
 
 	cyc2ns_read_end();
 }
@@ -2996,6 +3002,11 @@ u64 perf_hw_clock(void)
 	return rdtsc_ordered();
 }
 
+u64 perf_hw_clock_ns(void)
+{
+	return native_sched_clock_from_tsc(perf_hw_clock());
+}
+
 void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 {
 	cap->version		= x86_pmu.version;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 5288ea1ae2ba..46cbca90cdd1 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -453,6 +453,8 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 
 extern u64 perf_hw_clock(void);
 #define perf_hw_clock		perf_hw_clock
+extern u64 perf_hw_clock_ns(void);
+#define perf_hw_clock_ns	perf_hw_clock_ns
 
 #include <asm/stacktrace.h>
 
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e8617efd552b..0edc005f8ddf 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -298,6 +298,12 @@ enum {
  * paravirtualized.
  */
 #define CLOCK_PERF_HW_CLOCK		0x10000000
+/*
+ * Same as CLOCK_PERF_HW_CLOCK but in nanoseconds. Note support of
+ * CLOCK_PERF_HW_CLOCK_NS does not necesssarily imply support of
+ * CLOCK_PERF_HW_CLOCK or vice versa.
+ */
+#define CLOCK_PERF_HW_CLOCK_NS	0x10000001
 
 /*
  * The format of the data returned by read() on a perf event fd,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 15dee265a5b9..65e70fb669fd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12019,6 +12019,12 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id, bool
 			event->clock = &perf_hw_clock;
 			nmi_safe = true;
 			break;
+#endif
+#ifdef perf_hw_clock_ns
+		case CLOCK_PERF_HW_CLOCK_NS:
+			event->clock = &perf_hw_clock_ns;
+			nmi_safe = true;
+			break;
 #endif
 		default:
 			return -EINVAL;
-- 
2.25.1
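
Note (not part of the patch): the patch declares perf_hw_clock_ns() in the
arch header together with a macro of the same name, and the generic code
tests that macro with #ifdef before accepting the new clock ID. The sketch
below illustrates that pattern in isolation, in user-space C. Every demo_*
and DEMO_* name is hypothetical, and the constant return value stands in for
a real clock read; this is an illustration of the idiom under those
assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*clock_fn)(void);

/*
 * "Arch header": declare the hook and define a macro with the same name.
 * The macro expands to itself, so calls are unaffected, but generic code
 * can now detect the hook with #ifdef.
 */
uint64_t demo_hw_clock_ns(void);
#define demo_hw_clock_ns demo_hw_clock_ns

/* "Arch code": hypothetical stand-in that returns a fixed value. */
uint64_t demo_hw_clock_ns(void)
{
	return 42;
}

/* Hypothetical clock ID, mirroring the value style used in the patch. */
#define DEMO_CLOCK_HW_NS 0x10000001

/*
 * "Generic code": accept the clock ID only when the arch provides the
 * hook; otherwise the ID falls through to the default case.
 */
int demo_set_clock(int clk_id, clock_fn *out)
{
	assert(out != NULL);

	switch (clk_id) {
#ifdef demo_hw_clock_ns
	case DEMO_CLOCK_HW_NS:
		*out = &demo_hw_clock_ns;
		return 0;
#endif
	default:
		return -EINVAL;
	}
}
```

On an architecture that does not define the macro, the case label is
compiled out and the clock ID is rejected with -EINVAL, which is how user
space can probe for support.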


Thread overview: 38+ messages
2022-02-14 11:09 [PATCH V2 00/11] perf intel-pt: Add perf event clocks to better support VM tracing Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 01/11] perf/x86: Fix native_perf_sched_clock_from_tsc() with __sched_clock_offset Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 02/11] perf/x86: Add support for TSC as a perf event clock Adrian Hunter
2022-03-04 12:30   ` Peter Zijlstra
2022-03-04 13:03     ` Adrian Hunter
2022-03-04 12:32   ` Peter Zijlstra
2022-03-04 17:51     ` Thomas Gleixner
2022-03-04 12:33   ` Peter Zijlstra
2022-03-04 12:41     ` Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 03/11] perf/x86: Add support for TSC in nanoseconds as a perf event clock Adrian Hunter
2022-03-04 13:41   ` [PATCH V2 03/11] perf/x86: Add support for TSC in nanoseconds " Peter Zijlstra
2022-03-04 18:27     ` Adrian Hunter
2022-03-07  9:50       ` Peter Zijlstra
2022-03-07 10:06         ` Juergen Gross
2022-03-07 10:38           ` Peter Zijlstra
2022-03-07 10:58             ` Juergen Gross
2022-03-07 12:36         ` Adrian Hunter
2022-03-07 14:42           ` Peter Zijlstra
2022-03-08 14:23             ` Adrian Hunter
2022-03-08 21:06               ` Hall, Christopher S
2022-03-14 11:50                 ` Adrian Hunter
2022-04-25  5:30                   ` Adrian Hunter
2022-04-25  9:32                     ` Thomas Gleixner
2022-04-25 13:15                       ` Adrian Hunter
2022-04-25 17:05                         ` Thomas Gleixner
2022-04-26  6:51                           ` Adrian Hunter
2022-04-27 23:10                             ` Thomas Gleixner
2022-05-16  7:20                               ` Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 04/11] perf tools: Add new perf clock IDs Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 05/11] perf tools: Add API probes for new " Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 06/11] perf tools: Add new clock IDs to "perf time to TSC" test Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 07/11] perf tools: Add perf_read_tsc_conv_for_clockid() Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 08/11] perf intel-pt: Add support for new clock IDs Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 09/11] perf intel-pt: Use CLOCK_PERF_HW_CLOCK_NS by default Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 10/11] perf intel-pt: Add config variables for timing parameters Adrian Hunter
2022-02-14 11:09 ` [PATCH V2 11/11] perf intel-pt: Add documentation for new clock IDs Adrian Hunter
2022-02-21  6:54 ` [PATCH V2 00/11] perf intel-pt: Add perf event clocks to better support VM tracing Adrian Hunter
2022-03-01 11:06   ` Adrian Hunter
