From: Adrian Hunter <adrian.hunter@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Jiri Olsa <jolsa@redhat.com>,
linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, kvm@vger.kernel.org,
H Peter Anvin <hpa@zytor.com>,
Mathieu Poirier <mathieu.poirier@linaro.org>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Leo Yan <leo.yan@linaro.org>
Subject: [PATCH 02/11] perf/x86: Add support for TSC as a perf event clock
Date: Wed, 9 Feb 2022 10:49:20 +0200
Message-ID: <20220209084929.54331-3-adrian.hunter@intel.com>
In-Reply-To: <20220209084929.54331-1-adrian.hunter@intel.com>
Currently, using Intel PT to trace a VM guest is limited to kernel space
because decoding requires side band events such as MMAP and CONTEXT_SWITCH.
While these events can be collected for the host, there is not yet a way to
do that for a guest. One approach would be to collect them inside the
guest, but that would require being able to synchronize with host
timestamps.
The motivation for this patch is to provide a clock that can be used within
a VM guest, and that correlates to a VM host clock. In the case of TSC, if
the hypervisor leaves rdtsc alone, the TSC value will be subject only to
the VMCS TSC Offset and Scaling. Adjusting for that would make it possible
to inject events from a guest perf.data file into a host perf.data file,
which in turn makes it possible to collect VM guest side band for Intel PT
decoding.
There are other potential benefits of TSC as a perf event clock:
- ability to work directly with TSC
- ability to inject non-Intel-PT-related events from a guest
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
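Not part of the patch, for illustration only: a minimal user-space sketch
of the adjustment for VMCS TSC Offset and Scaling mentioned above. It
assumes the Intel VMX convention that the TSC multiplier is a fixed-point
value with 48 fractional bits and that the offset is added after scaling.
The function names and TSC_FRAC_BITS are hypothetical and do not appear in
this series.

```c
#include <stdint.h>

/* Assumption: VMX TSC multiplier has 48 fractional bits. */
#define TSC_FRAC_BITS 48

/*
 * What a guest rdtsc would return for a given host TSC value, assuming
 * the hypervisor applies only scaling and offset.
 * Relies on unsigned __int128 (GCC/Clang on 64-bit targets).
 */
uint64_t host_to_guest_tsc(uint64_t host_tsc, uint64_t mult, uint64_t offset)
{
	unsigned __int128 scaled = (unsigned __int128)host_tsc * mult;

	/* Offset is a two's complement value, so the add wraps as intended. */
	return (uint64_t)(scaled >> TSC_FRAC_BITS) + offset;
}

/*
 * Inverse: map a guest timestamp back to the host TSC domain.
 * mult must be non-zero. The division truncates.
 */
uint64_t guest_to_host_tsc(uint64_t guest_tsc, uint64_t mult, uint64_t offset)
{
	unsigned __int128 unscaled =
		(unsigned __int128)(guest_tsc - offset) << TSC_FRAC_BITS;

	return (uint64_t)(unscaled / mult);
}
```

With a multiplier of 1 << 48 (ratio 1.0) the conversion reduces to
subtracting the offset. Because both directions truncate, a value may not
round-trip exactly when the ratio is not 1.0.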
arch/x86/events/core.c | 14 ++++++++++++++
arch/x86/include/asm/perf_event.h | 3 +++
include/uapi/linux/perf_event.h | 8 ++++++++
kernel/events/core.c | 7 +++++++
4 files changed, 32 insertions(+)
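Not part of the patch: the user page values set in
arch_perf_update_userpage() in the first hunk below (time_mult = 1,
time_shift = 0, time_zero = 0) should make the documented
cap_user_time_zero conversion an identity, so that perf timestamps are raw
TSC ticks. The sketch follows the formula given in the
perf_event_mmap_page comments in include/uapi/linux/perf_event.h; the
function name is hypothetical.

```c
#include <stdint.h>

/*
 * Convert a hardware counter value to a perf timestamp using the
 * cap_user_time_zero fields of the perf user page:
 *
 *   quot = cyc >> time_shift
 *   rem  = cyc & ((1 << time_shift) - 1)
 *   time = time_zero + quot * time_mult + ((rem * time_mult) >> time_shift)
 */
uint64_t tsc_to_perf_time(uint64_t cyc, uint64_t time_zero,
			  uint32_t time_mult, uint16_t time_shift)
{
	uint64_t quot = cyc >> time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << time_shift) - 1);

	return time_zero + quot * time_mult +
	       ((rem * time_mult) >> time_shift);
}
```

With time_zero = 0, time_mult = 1 and time_shift = 0, rem is always 0 and
the result equals cyc.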
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index e686c5e0537b..e2ad3f9cca93 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2728,6 +2728,15 @@ void arch_perf_update_userpage(struct perf_event *event,
 		!!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT);
 	userpg->pmc_width = x86_pmu.cntval_bits;
 
+	if (event->attr.use_clockid && event->attr.clockid == CLOCK_PERF_HW_CLOCK) {
+		userpg->cap_user_time_zero = 1;
+		userpg->time_mult = 1;
+		userpg->time_shift = 0;
+		userpg->time_offset = 0;
+		userpg->time_zero = 0;
+		return;
+	}
+
 	if (!using_native_sched_clock() || !sched_clock_stable())
 		return;
 
@@ -2980,6 +2989,11 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
 	return misc;
 }
 
+u64 perf_hw_clock(void)
+{
+	return rdtsc_ordered();
+}
+
 void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 {
 	cap->version = x86_pmu.version;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 58d9e4b1fa0a..5288ea1ae2ba 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -451,6 +451,9 @@ extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
 extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs) perf_misc_flags(regs)
 
+extern u64 perf_hw_clock(void);
+#define perf_hw_clock perf_hw_clock
+
 #include <asm/stacktrace.h>
 
 /*
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 82858b697c05..150d2b70a41f 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -290,6 +290,14 @@ enum {
 	PERF_TXN_ABORT_SHIFT = 32,
 };
 
+/*
+ * If supported, clockid value to select an architecture dependent hardware
+ * clock. Note this means the unit of time is ticks not nanoseconds.
+ * On x86, this is provided by the rdtsc instruction, and is not
+ * paravirtualized.
+ */
+#define CLOCK_PERF_HW_CLOCK 0x10000000
+
 /*
  * The format of the data returned by read() on a perf event fd,
  * as specified by attr.read_format:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 57249f37c37d..aab78f033711 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12035,6 +12035,13 @@ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
 		event->clock = &ktime_get_clocktai_ns;
 		break;
 
+#ifdef perf_hw_clock
+	case CLOCK_PERF_HW_CLOCK:
+		event->clock = &perf_hw_clock;
+		nmi_safe = true;
+		break;
+#endif
+
 	default:
 		return -EINVAL;
 	}
--
2.25.1
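Not part of the patch: a sketch of how user space might request the new
clock, assuming a kernel with this patch applied. The helper name
init_hw_clock_attr() and the choice of a dummy software event are
assumptions for illustration only. CLOCK_PERF_HW_CLOCK is defined locally
because installed uapi headers may not provide it.

```c
#include <linux/perf_event.h>
#include <string.h>

/* Value proposed by this patch; may be absent from installed headers. */
#ifndef CLOCK_PERF_HW_CLOCK
#define CLOCK_PERF_HW_CLOCK 0x10000000
#endif

/*
 * Fill in a perf_event_attr that asks for timestamps from the
 * architecture hardware clock (TSC ticks on x86 with this patch).
 */
void init_hw_clock_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;	/* illustrative event choice */
	attr->config = PERF_COUNT_SW_DUMMY;
	attr->use_clockid = 1;			/* honour attr->clockid */
	attr->clockid = CLOCK_PERF_HW_CLOCK;
}
```

The attr would then be passed to perf_event_open(2). Without this patch,
perf_event_set_clock() falls through to the default case, so the call is
expected to fail with EINVAL.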
Thread overview: 17+ messages
2022-02-09 8:49 [PATCH 00/11] perf intel-pt: Add perf event clocks to better support VM tracing Adrian Hunter
2022-02-09 8:49 ` [PATCH 01/11] perf/x86: Fix native_perf_sched_clock_from_tsc() with __sched_clock_offset Adrian Hunter
2022-02-09 12:54 ` Peter Zijlstra
2022-02-09 14:26 ` Adrian Hunter
2022-02-09 8:49 ` Adrian Hunter [this message]
2022-02-09 13:11 ` [PATCH 02/11] perf/x86: Add support for TSC as a perf event clock Peter Zijlstra
2022-02-09 13:39 ` Adrian Hunter
2022-02-09 8:49 ` [PATCH 03/11] perf/x86: Add support for TSC in nanoseconds " Adrian Hunter
2022-02-09 13:00 ` Peter Zijlstra
2022-02-09 8:49 ` [PATCH 04/11] perf tools: Add new perf clock IDs Adrian Hunter
2022-02-09 8:49 ` [PATCH 05/11] perf tools: Add API probes for new " Adrian Hunter
2022-02-09 8:49 ` [PATCH 06/11] perf tools: Add new clock IDs to "perf time to TSC" test Adrian Hunter
2022-02-09 8:49 ` [PATCH 07/11] perf tools: Add perf_read_tsc_conv_for_clockid() Adrian Hunter
2022-02-09 8:49 ` [PATCH 08/11] perf intel-pt: Add support for new clock IDs Adrian Hunter
2022-02-09 8:49 ` [PATCH 09/11] perf intel-pt: Use CLOCK_PERF_HW_CLOCK_NS by default Adrian Hunter
2022-02-09 8:49 ` [PATCH 10/11] perf intel-pt: Add config variables for timing parameters Adrian Hunter
2022-02-09 8:49 ` [PATCH 11/11] perf intel-pt: Add documentation for new clock IDs Adrian Hunter