* [PATCH v2 0/5] Correct perf sampling with Guest VMs
@ 2024-09-11 22:24 Colton Lewis
2024-09-11 22:24 ` [PATCH v2 1/5] arm: perf: Drop unused functions Colton Lewis
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Colton Lewis @ 2024-09-11 22:24 UTC (permalink / raw)
To: kvm
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390, Colton Lewis
v2:
* Better explain commit messages
* Fix incorrect type in patch 2
* Fix missing argument in call to common_misc_flags() in patch 5
* Rebase to 6.11-rc7 and include base commit in letter
v1:
https://lore.kernel.org/kvm/20240904204133.1442132-1-coltonlewis@google.com/
This series cleans up perf recording around guest events and improves
the accuracy of the resulting perf reports.
Perf was incorrectly counting any PMU overflow interrupt that occurred
while a VCPU was loaded as a guest event even when the events were not
truely guest events. This lead to much less accurate and useful perf
recordings.
See as an example the below reports of `perf record
dirty_log_perf_test -m 2 -v 4` before and after the series on ARM64.
Without series:
Samples: 15K of event 'instructions', Event count (approx.): 31830580924
Overhead Command Shared Object Symbol
54.54% dirty_log_perf_ dirty_log_perf_test [.] run_test
5.39% dirty_log_perf_ dirty_log_perf_test [.] vcpu_worker
0.89% dirty_log_perf_ [kernel.vmlinux] [k] release_pages
0.70% dirty_log_perf_ [kernel.vmlinux] [k] free_pcppages_bulk
0.62% dirty_log_perf_ dirty_log_perf_test [.] userspace_mem_region_find
0.49% dirty_log_perf_ dirty_log_perf_test [.] sparsebit_is_set
0.46% dirty_log_perf_ dirty_log_perf_test [.] _virt_pg_map
0.46% dirty_log_perf_ dirty_log_perf_test [.] node_add
0.37% dirty_log_perf_ dirty_log_perf_test [.] node_reduce
0.35% dirty_log_perf_ [kernel.vmlinux] [k] free_unref_page_commit
0.33% dirty_log_perf_ [kernel.vmlinux] [k] __kvm_pgtable_walk
0.31% dirty_log_perf_ [kernel.vmlinux] [k] stage2_attr_walker
0.29% dirty_log_perf_ [kernel.vmlinux] [k] unmap_page_range
0.29% dirty_log_perf_ dirty_log_perf_test [.] test_assert
0.26% dirty_log_perf_ [kernel.vmlinux] [k] __mod_memcg_lruvec_state
0.24% dirty_log_perf_ [kernel.vmlinux] [k] kvm_s2_put_page
With series:
Samples: 15K of event 'instructions', Event count (approx.): 31830580924
Samples: 15K of event 'instructions', Event count (approx.): 30898031385
Overhead Command Shared Object Symbol
54.05% dirty_log_perf_ dirty_log_perf_test [.] run_test
5.48% dirty_log_perf_ [kernel.kallsyms] [k] kvm_arch_vcpu_ioctl_run
4.70% dirty_log_perf_ dirty_log_perf_test [.] vcpu_worker
3.11% dirty_log_perf_ [kernel.kallsyms] [k] kvm_handle_guest_abort
2.24% dirty_log_perf_ [kernel.kallsyms] [k] up_read
1.98% dirty_log_perf_ [kernel.kallsyms] [k] __kvm_tlb_flush_vmid_ipa_nsh
1.97% dirty_log_perf_ [kernel.kallsyms] [k] __pi_clear_page
1.30% dirty_log_perf_ [kernel.kallsyms] [k] down_read
1.13% dirty_log_perf_ [kernel.kallsyms] [k] release_pages
1.12% dirty_log_perf_ [kernel.kallsyms] [k] __kvm_pgtable_walk
1.08% dirty_log_perf_ [kernel.kallsyms] [k] folio_batch_move_lru
1.06% dirty_log_perf_ [kernel.kallsyms] [k] __srcu_read_lock
1.03% dirty_log_perf_ [kernel.kallsyms] [k] get_page_from_freelist
1.01% dirty_log_perf_ [kernel.kallsyms] [k] __pte_offset_map_lock
0.82% dirty_log_perf_ [kernel.kallsyms] [k] handle_mm_fault
0.74% dirty_log_perf_ [kernel.kallsyms] [k] mas_state_walk
Colton Lewis (5):
arm: perf: Drop unused functions
perf: Hoist perf_instruction_pointer() and perf_misc_flags()
powerpc: perf: Use perf_arch_instruction_pointer()
x86: perf: Refactor misc flag assignments
perf: Correct perf sampling with guest VMs
arch/arm/include/asm/perf_event.h | 7 ---
arch/arm/kernel/perf_callchain.c | 17 -------
arch/arm64/include/asm/perf_event.h | 4 --
arch/arm64/kernel/perf_callchain.c | 28 ------------
arch/powerpc/include/asm/perf_event_server.h | 6 +--
arch/powerpc/perf/callchain.c | 2 +-
arch/powerpc/perf/callchain_32.c | 2 +-
arch/powerpc/perf/callchain_64.c | 2 +-
arch/powerpc/perf/core-book3s.c | 4 +-
arch/s390/include/asm/perf_event.h | 6 +--
arch/s390/kernel/perf_event.c | 4 +-
arch/x86/events/core.c | 48 ++++++++++++--------
arch/x86/include/asm/perf_event.h | 12 +++--
include/linux/perf_event.h | 26 +++++++++--
kernel/events/core.c | 27 ++++++++++-
15 files changed, 96 insertions(+), 99 deletions(-)
base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
--
2.46.0.598.g6f2099f65c-goog
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/5] arm: perf: Drop unused functions
2024-09-11 22:24 [PATCH v2 0/5] Correct perf sampling with Guest VMs Colton Lewis
@ 2024-09-11 22:24 ` Colton Lewis
2024-09-11 22:24 ` [PATCH v2 2/5] perf: Hoist perf_instruction_pointer() and perf_misc_flags() Colton Lewis
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Colton Lewis @ 2024-09-11 22:24 UTC (permalink / raw)
To: kvm
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390, Colton Lewis
For arm's implementation, perf_instruction_pointer() and
perf_misc_flags() are equivalent to the generic versions in
include/linux/perf_event.h so arch/arm doesn't need to provide its
own versions. Drop them here.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/arm/include/asm/perf_event.h | 7 -------
arch/arm/kernel/perf_callchain.c | 17 -----------------
2 files changed, 24 deletions(-)
diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h
index bdbc1e590891..c08f16f2e243 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -8,13 +8,6 @@
#ifndef __ARM_PERF_EVENT_H__
#define __ARM_PERF_EVENT_H__
-#ifdef CONFIG_PERF_EVENTS
-struct pt_regs;
-extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
-extern unsigned long perf_misc_flags(struct pt_regs *regs);
-#define perf_misc_flags(regs) perf_misc_flags(regs)
-#endif
-
#define perf_arch_fetch_caller_regs(regs, __ip) { \
(regs)->ARM_pc = (__ip); \
frame_pointer((regs)) = (unsigned long) __builtin_frame_address(0); \
diff --git a/arch/arm/kernel/perf_callchain.c b/arch/arm/kernel/perf_callchain.c
index 1d230ac9d0eb..a2601b1ef318 100644
--- a/arch/arm/kernel/perf_callchain.c
+++ b/arch/arm/kernel/perf_callchain.c
@@ -96,20 +96,3 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
arm_get_current_stackframe(regs, &fr);
walk_stackframe(&fr, callchain_trace, entry);
}
-
-unsigned long perf_instruction_pointer(struct pt_regs *regs)
-{
- return instruction_pointer(regs);
-}
-
-unsigned long perf_misc_flags(struct pt_regs *regs)
-{
- int misc = 0;
-
- if (user_mode(regs))
- misc |= PERF_RECORD_MISC_USER;
- else
- misc |= PERF_RECORD_MISC_KERNEL;
-
- return misc;
-}
--
2.46.0.598.g6f2099f65c-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/5] perf: Hoist perf_instruction_pointer() and perf_misc_flags()
2024-09-11 22:24 [PATCH v2 0/5] Correct perf sampling with Guest VMs Colton Lewis
2024-09-11 22:24 ` [PATCH v2 1/5] arm: perf: Drop unused functions Colton Lewis
@ 2024-09-11 22:24 ` Colton Lewis
2024-09-11 22:24 ` [PATCH v2 3/5] powerpc: perf: Use perf_arch_instruction_pointer() Colton Lewis
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Colton Lewis @ 2024-09-11 22:24 UTC (permalink / raw)
To: kvm
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390, Colton Lewis
For clarity, rename the arch-specific definitions of these functions
to perf_arch_* to denote they are arch-specifc. Define the
generic-named functions in one place where they can call the
arch-specific ones as needed.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/arm64/include/asm/perf_event.h | 6 +++---
arch/arm64/kernel/perf_callchain.c | 4 ++--
arch/powerpc/include/asm/perf_event_server.h | 6 +++---
arch/powerpc/perf/core-book3s.c | 4 ++--
arch/s390/include/asm/perf_event.h | 6 +++---
arch/s390/kernel/perf_event.c | 4 ++--
arch/x86/events/core.c | 4 ++--
arch/x86/include/asm/perf_event.h | 10 +++++-----
include/linux/perf_event.h | 9 ++++++---
kernel/events/core.c | 10 ++++++++++
10 files changed, 38 insertions(+), 25 deletions(-)
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index eb7071c9eb34..31a5584ed423 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -11,9 +11,9 @@
#ifdef CONFIG_PERF_EVENTS
struct pt_regs;
-extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
-extern unsigned long perf_misc_flags(struct pt_regs *regs);
-#define perf_misc_flags(regs) perf_misc_flags(regs)
+extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
+extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
+#define perf_arch_misc_flags(regs) perf_misc_flags(regs)
#define perf_arch_bpf_user_pt_regs(regs) ®s->user_regs
#endif
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index e8ed5673f481..01a9d08fc009 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -39,7 +39,7 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
arch_stack_walk(callchain_trace, entry, current, regs);
}
-unsigned long perf_instruction_pointer(struct pt_regs *regs)
+unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
{
if (perf_guest_state())
return perf_guest_get_ip();
@@ -47,7 +47,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
return instruction_pointer(regs);
}
-unsigned long perf_misc_flags(struct pt_regs *regs)
+unsigned long perf_arch_misc_flags(struct pt_regs *regs)
{
unsigned int guest_state = perf_guest_state();
int misc = 0;
diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 5995614e9062..41587d3f8446 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -102,8 +102,8 @@ struct power_pmu {
int __init register_power_pmu(struct power_pmu *pmu);
struct pt_regs;
-extern unsigned long perf_misc_flags(struct pt_regs *regs);
-extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
+extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
+extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
extern unsigned long int read_bhrb(int n);
/*
@@ -111,7 +111,7 @@ extern unsigned long int read_bhrb(int n);
* if we have hardware PMU support.
*/
#ifdef CONFIG_PPC_PERF_CTRS
-#define perf_misc_flags(regs) perf_misc_flags(regs)
+#define perf_arch_misc_flags(regs) perf_misc_flags(regs)
#endif
/*
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 42867469752d..dc01aa604cc1 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2332,7 +2332,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
* Called from generic code to get the misc flags (i.e. processor mode)
* for an event_id.
*/
-unsigned long perf_misc_flags(struct pt_regs *regs)
+unsigned long perf_arch_misc_flags(struct pt_regs *regs)
{
u32 flags = perf_get_misc_flags(regs);
@@ -2346,7 +2346,7 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
* Called from generic code to get the instruction pointer
* for an event_id.
*/
-unsigned long perf_instruction_pointer(struct pt_regs *regs)
+unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
{
unsigned long siar = mfspr(SPRN_SIAR);
diff --git a/arch/s390/include/asm/perf_event.h b/arch/s390/include/asm/perf_event.h
index 9917e2717b2b..f2d83289ec7a 100644
--- a/arch/s390/include/asm/perf_event.h
+++ b/arch/s390/include/asm/perf_event.h
@@ -37,9 +37,9 @@ extern ssize_t cpumf_events_sysfs_show(struct device *dev,
/* Perf callbacks */
struct pt_regs;
-extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
-extern unsigned long perf_misc_flags(struct pt_regs *regs);
-#define perf_misc_flags(regs) perf_misc_flags(regs)
+extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
+extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
+#define perf_arch_misc_flags(regs) perf_misc_flags(regs)
#define perf_arch_bpf_user_pt_regs(regs) ®s->user_regs
/* Perf pt_regs extension for sample-data-entry indicators */
diff --git a/arch/s390/kernel/perf_event.c b/arch/s390/kernel/perf_event.c
index 5fff629b1a89..f9000ab49f4a 100644
--- a/arch/s390/kernel/perf_event.c
+++ b/arch/s390/kernel/perf_event.c
@@ -57,7 +57,7 @@ static unsigned long instruction_pointer_guest(struct pt_regs *regs)
return sie_block(regs)->gpsw.addr;
}
-unsigned long perf_instruction_pointer(struct pt_regs *regs)
+unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
{
return is_in_guest(regs) ? instruction_pointer_guest(regs)
: instruction_pointer(regs);
@@ -84,7 +84,7 @@ static unsigned long perf_misc_flags_sf(struct pt_regs *regs)
return flags;
}
-unsigned long perf_misc_flags(struct pt_regs *regs)
+unsigned long perf_arch_misc_flags(struct pt_regs *regs)
{
/* Check if the cpum_sf PMU has created the pt_regs structure.
* In this case, perf misc flags can be easily extracted. Otherwise,
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index be01823b1bb4..760ad067527c 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2940,7 +2940,7 @@ static unsigned long code_segment_base(struct pt_regs *regs)
return 0;
}
-unsigned long perf_instruction_pointer(struct pt_regs *regs)
+unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
{
if (perf_guest_state())
return perf_guest_get_ip();
@@ -2948,7 +2948,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
return regs->ip + code_segment_base(regs);
}
-unsigned long perf_misc_flags(struct pt_regs *regs)
+unsigned long perf_arch_misc_flags(struct pt_regs *regs)
{
unsigned int guest_state = perf_guest_state();
int misc = 0;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 91b73571412f..feb87bf3d2e9 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -536,15 +536,15 @@ struct x86_perf_regs {
u64 *xmm_regs;
};
-extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
-extern unsigned long perf_misc_flags(struct pt_regs *regs);
-#define perf_misc_flags(regs) perf_misc_flags(regs)
+extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
+extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
+#define perf_arch_misc_flags(regs) perf_arch_misc_flags(regs)
#include <asm/stacktrace.h>
/*
- * We abuse bit 3 from flags to pass exact information, see perf_misc_flags
- * and the comment with PERF_EFLAGS_EXACT.
+ * We abuse bit 3 from flags to pass exact information, see
+ * perf_arch_misc_flags() and the comment with PERF_EFLAGS_EXACT.
*/
#define perf_arch_fetch_caller_regs(regs, __ip) { \
(regs)->ip = (__ip); \
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 1a8942277dda..d061e327ad54 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1633,10 +1633,13 @@ extern void perf_tp_event(u16 event_type, u64 count, void *record,
struct task_struct *task);
extern void perf_bp_event(struct perf_event *event, void *data);
-#ifndef perf_misc_flags
-# define perf_misc_flags(regs) \
+extern unsigned long perf_misc_flags(struct pt_regs *regs);
+extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
+
+#ifndef perf_arch_misc_flags
+# define perf_arch_misc_flags(regs) \
(user_mode(regs) ? PERF_RECORD_MISC_USER : PERF_RECORD_MISC_KERNEL)
-# define perf_instruction_pointer(regs) instruction_pointer(regs)
+# define perf_arch_instruction_pointer(regs) instruction_pointer(regs)
#endif
#ifndef perf_arch_bpf_user_pt_regs
# define perf_arch_bpf_user_pt_regs(regs) regs
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8a6c6bbcd658..eeabbf791a8c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6921,6 +6921,16 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
#endif
+unsigned long perf_misc_flags(struct pt_regs *regs)
+{
+ return perf_arch_misc_flags(regs);
+}
+
+unsigned long perf_instruction_pointer(struct pt_regs *regs)
+{
+ return perf_arch_instruction_pointer(regs);
+}
+
static void
perf_output_sample_regs(struct perf_output_handle *handle,
struct pt_regs *regs, u64 mask)
--
2.46.0.598.g6f2099f65c-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 3/5] powerpc: perf: Use perf_arch_instruction_pointer()
2024-09-11 22:24 [PATCH v2 0/5] Correct perf sampling with Guest VMs Colton Lewis
2024-09-11 22:24 ` [PATCH v2 1/5] arm: perf: Drop unused functions Colton Lewis
2024-09-11 22:24 ` [PATCH v2 2/5] perf: Hoist perf_instruction_pointer() and perf_misc_flags() Colton Lewis
@ 2024-09-11 22:24 ` Colton Lewis
2024-09-11 22:24 ` [PATCH v2 4/5] x86: perf: Refactor misc flag assignments Colton Lewis
2024-09-11 22:24 ` [PATCH v2 5/5] perf: Correct perf sampling with guest VMs Colton Lewis
4 siblings, 0 replies; 8+ messages in thread
From: Colton Lewis @ 2024-09-11 22:24 UTC (permalink / raw)
To: kvm
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390, Colton Lewis
Make sure powerpc uses the arch-specific function now that those have
been reorganized.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/powerpc/perf/callchain.c | 2 +-
arch/powerpc/perf/callchain_32.c | 2 +-
arch/powerpc/perf/callchain_64.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index 6b4434dd0ff3..26aa26482c9a 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -51,7 +51,7 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
lr = regs->link;
sp = regs->gpr[1];
- perf_callchain_store(entry, perf_instruction_pointer(regs));
+ perf_callchain_store(entry, perf_arch_instruction_pointer(regs));
if (!validate_sp(sp, current))
return;
diff --git a/arch/powerpc/perf/callchain_32.c b/arch/powerpc/perf/callchain_32.c
index ea8cfe3806dc..ddcc2d8aa64a 100644
--- a/arch/powerpc/perf/callchain_32.c
+++ b/arch/powerpc/perf/callchain_32.c
@@ -139,7 +139,7 @@ void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
long level = 0;
unsigned int __user *fp, *uregs;
- next_ip = perf_instruction_pointer(regs);
+ next_ip = perf_arch_instruction_pointer(regs);
lr = regs->link;
sp = regs->gpr[1];
perf_callchain_store(entry, next_ip);
diff --git a/arch/powerpc/perf/callchain_64.c b/arch/powerpc/perf/callchain_64.c
index 488e8a21a11e..115d1c105e8a 100644
--- a/arch/powerpc/perf/callchain_64.c
+++ b/arch/powerpc/perf/callchain_64.c
@@ -74,7 +74,7 @@ void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
struct signal_frame_64 __user *sigframe;
unsigned long __user *fp, *uregs;
- next_ip = perf_instruction_pointer(regs);
+ next_ip = perf_arch_instruction_pointer(regs);
lr = regs->link;
sp = regs->gpr[1];
perf_callchain_store(entry, next_ip);
--
2.46.0.598.g6f2099f65c-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 4/5] x86: perf: Refactor misc flag assignments
2024-09-11 22:24 [PATCH v2 0/5] Correct perf sampling with Guest VMs Colton Lewis
` (2 preceding siblings ...)
2024-09-11 22:24 ` [PATCH v2 3/5] powerpc: perf: Use perf_arch_instruction_pointer() Colton Lewis
@ 2024-09-11 22:24 ` Colton Lewis
2024-09-11 22:24 ` [PATCH v2 5/5] perf: Correct perf sampling with guest VMs Colton Lewis
4 siblings, 0 replies; 8+ messages in thread
From: Colton Lewis @ 2024-09-11 22:24 UTC (permalink / raw)
To: kvm
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390, Colton Lewis
Break the assignment logic for misc flags into their own respective
functions to reduce the complexity of the nested logic.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/x86/events/core.c | 31 +++++++++++++++++++++++--------
arch/x86/include/asm/perf_event.h | 2 ++
2 files changed, 25 insertions(+), 8 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 760ad067527c..d51e5d24802b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2948,16 +2948,34 @@ unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
return regs->ip + code_segment_base(regs);
}
+static unsigned long common_misc_flags(struct pt_regs *regs)
+{
+ if (regs->flags & PERF_EFLAGS_EXACT)
+ return PERF_RECORD_MISC_EXACT_IP;
+
+ return 0;
+}
+
+unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs)
+{
+ unsigned long guest_state = perf_guest_state();
+ unsigned long flags = common_misc_flags(regs);
+
+ if (guest_state & PERF_GUEST_USER)
+ flags |= PERF_RECORD_MISC_GUEST_USER;
+ else if (guest_state & PERF_GUEST_ACTIVE)
+ flags |= PERF_RECORD_MISC_GUEST_KERNEL;
+
+ return flags;
+}
+
unsigned long perf_arch_misc_flags(struct pt_regs *regs)
{
unsigned int guest_state = perf_guest_state();
- int misc = 0;
+ unsigned long misc = common_misc_flags(regs);
if (guest_state) {
- if (guest_state & PERF_GUEST_USER)
- misc |= PERF_RECORD_MISC_GUEST_USER;
- else
- misc |= PERF_RECORD_MISC_GUEST_KERNEL;
+ misc |= perf_arch_guest_misc_flags(regs);
} else {
if (user_mode(regs))
misc |= PERF_RECORD_MISC_USER;
@@ -2965,9 +2983,6 @@ unsigned long perf_arch_misc_flags(struct pt_regs *regs)
misc |= PERF_RECORD_MISC_KERNEL;
}
- if (regs->flags & PERF_EFLAGS_EXACT)
- misc |= PERF_RECORD_MISC_EXACT_IP;
-
return misc;
}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index feb87bf3d2e9..d95f902acc52 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -538,7 +538,9 @@ struct x86_perf_regs {
extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
+extern unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs);
#define perf_arch_misc_flags(regs) perf_arch_misc_flags(regs)
+#define perf_arch_guest_misc_flags(regs) perf_arch_guest_misc_flags(regs)
#include <asm/stacktrace.h>
--
2.46.0.598.g6f2099f65c-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 5/5] perf: Correct perf sampling with guest VMs
2024-09-11 22:24 [PATCH v2 0/5] Correct perf sampling with Guest VMs Colton Lewis
` (3 preceding siblings ...)
2024-09-11 22:24 ` [PATCH v2 4/5] x86: perf: Refactor misc flag assignments Colton Lewis
@ 2024-09-11 22:24 ` Colton Lewis
2024-09-11 22:56 ` Sean Christopherson
4 siblings, 1 reply; 8+ messages in thread
From: Colton Lewis @ 2024-09-11 22:24 UTC (permalink / raw)
To: kvm
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390, Colton Lewis
Previously any PMU overflow interrupt that fired while a VCPU was
loaded was recorded as a guest event whether it truly was or not. This
resulted in nonsense perf recordings that did not honor
perf_event_attr.exclude_guest and recorded guest IPs where it should
have recorded host IPs.
Rework the sampling logic to only record guest samples for events with
exclude_guest clear. This way any host-only events with exclude_guest
set will never see unexpected guest samples. The behaviour of events
with exclude_guest clear is unchanged.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/arm64/include/asm/perf_event.h | 4 ----
arch/arm64/kernel/perf_callchain.c | 28 ----------------------------
arch/x86/events/core.c | 15 ++++-----------
include/linux/perf_event.h | 21 +++++++++++++++++++--
kernel/events/core.c | 21 +++++++++++++++++----
5 files changed, 40 insertions(+), 49 deletions(-)
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 31a5584ed423..ee45b4e77347 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -10,10 +10,6 @@
#include <asm/ptrace.h>
#ifdef CONFIG_PERF_EVENTS
-struct pt_regs;
-extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
-extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
-#define perf_arch_misc_flags(regs) perf_misc_flags(regs)
#define perf_arch_bpf_user_pt_regs(regs) ®s->user_regs
#endif
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 01a9d08fc009..9b7f26b128b5 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -38,31 +38,3 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
arch_stack_walk(callchain_trace, entry, current, regs);
}
-
-unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
-{
- if (perf_guest_state())
- return perf_guest_get_ip();
-
- return instruction_pointer(regs);
-}
-
-unsigned long perf_arch_misc_flags(struct pt_regs *regs)
-{
- unsigned int guest_state = perf_guest_state();
- int misc = 0;
-
- if (guest_state) {
- if (guest_state & PERF_GUEST_USER)
- misc |= PERF_RECORD_MISC_GUEST_USER;
- else
- misc |= PERF_RECORD_MISC_GUEST_KERNEL;
- } else {
- if (user_mode(regs))
- misc |= PERF_RECORD_MISC_USER;
- else
- misc |= PERF_RECORD_MISC_KERNEL;
- }
-
- return misc;
-}
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d51e5d24802b..5f119b9516dc 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2942,9 +2942,6 @@ static unsigned long code_segment_base(struct pt_regs *regs)
unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
{
- if (perf_guest_state())
- return perf_guest_get_ip();
-
return regs->ip + code_segment_base(regs);
}
@@ -2974,14 +2971,10 @@ unsigned long perf_arch_misc_flags(struct pt_regs *regs)
unsigned int guest_state = perf_guest_state();
unsigned long misc = common_misc_flags(regs);
- if (guest_state) {
- misc |= perf_arch_guest_misc_flags(regs);
- } else {
- if (user_mode(regs))
- misc |= PERF_RECORD_MISC_USER;
- else
- misc |= PERF_RECORD_MISC_KERNEL;
- }
+ if (user_mode(regs))
+ misc |= PERF_RECORD_MISC_USER;
+ else
+ misc |= PERF_RECORD_MISC_KERNEL;
return misc;
}
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d061e327ad54..968f3edd95e4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1633,8 +1633,9 @@ extern void perf_tp_event(u16 event_type, u64 count, void *record,
struct task_struct *task);
extern void perf_bp_event(struct perf_event *event, void *data);
-extern unsigned long perf_misc_flags(struct pt_regs *regs);
-extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
+extern unsigned long perf_misc_flags(struct perf_event *event, struct pt_regs *regs);
+extern unsigned long perf_instruction_pointer(struct perf_event *event,
+ struct pt_regs *regs);
#ifndef perf_arch_misc_flags
# define perf_arch_misc_flags(regs) \
@@ -1645,6 +1646,22 @@ extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
# define perf_arch_bpf_user_pt_regs(regs) regs
#endif
+#ifndef perf_arch_guest_misc_flags
+static inline unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs)
+{
+ unsigned long guest_state = perf_guest_state();
+
+ if (guest_state & PERF_GUEST_USER)
+ return PERF_RECORD_MISC_GUEST_USER;
+
+ if (guest_state & PERF_GUEST_ACTIVE)
+ return PERF_RECORD_MISC_GUEST_KERNEL;
+
+ return 0;
+}
+# define perf_arch_guest_misc_flags(regs) perf_arch_guest_misc_flags(regs)
+#endif
+
static inline bool has_branch_stack(struct perf_event *event)
{
return event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index eeabbf791a8c..c5e57c024d9a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6921,13 +6921,26 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
#endif
-unsigned long perf_misc_flags(struct pt_regs *regs)
+static bool should_sample_guest(struct perf_event *event)
{
+ return !event->attr.exclude_guest && perf_guest_state();
+}
+
+unsigned long perf_misc_flags(struct perf_event *event,
+ struct pt_regs *regs)
+{
+ if (should_sample_guest(event))
+ return perf_arch_guest_misc_flags(regs);
+
return perf_arch_misc_flags(regs);
}
-unsigned long perf_instruction_pointer(struct pt_regs *regs)
+unsigned long perf_instruction_pointer(struct perf_event *event,
+ struct pt_regs *regs)
{
+ if (should_sample_guest(event))
+ return perf_guest_get_ip();
+
return perf_arch_instruction_pointer(regs);
}
@@ -7743,7 +7756,7 @@ void perf_prepare_sample(struct perf_sample_data *data,
__perf_event_header__init_id(data, event, filtered_sample_type);
if (filtered_sample_type & PERF_SAMPLE_IP) {
- data->ip = perf_instruction_pointer(regs);
+ data->ip = perf_instruction_pointer(event, regs);
data->sample_flags |= PERF_SAMPLE_IP;
}
@@ -7907,7 +7920,7 @@ void perf_prepare_header(struct perf_event_header *header,
{
header->type = PERF_RECORD_SAMPLE;
header->size = perf_sample_data_size(data, event);
- header->misc = perf_misc_flags(regs);
+ header->misc = perf_misc_flags(event, regs);
/*
* If you're adding more sample types here, you likely need to do
--
2.46.0.598.g6f2099f65c-goog
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 5/5] perf: Correct perf sampling with guest VMs
2024-09-11 22:24 ` [PATCH v2 5/5] perf: Correct perf sampling with guest VMs Colton Lewis
@ 2024-09-11 22:56 ` Sean Christopherson
2024-09-12 20:31 ` Colton Lewis
0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2024-09-11 22:56 UTC (permalink / raw)
To: Colton Lewis
Cc: kvm, Oliver Upton, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Kan Liang, Will Deacon, Russell King, Catalin Marinas,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
linux-perf-users, linux-kernel, linux-arm-kernel, linuxppc-dev,
linux-s390
On Wed, Sep 11, 2024, Colton Lewis wrote:
> Previously any PMU overflow interrupt that fired while a VCPU was
> loaded was recorded as a guest event whether it truly was or not. This
> resulted in nonsense perf recordings that did not honor
> perf_event_attr.exclude_guest and recorded guest IPs where it should
> have recorded host IPs.
>
> Rework the sampling logic to only record guest samples for events with
> exclude_guest clear. This way any host-only events with exclude_guest
> set will never see unexpected guest samples. The behaviour of events
> with exclude_guest clear is unchanged.
Nit, "with exclude_guest clear" is easy to misread as simply "with exclude_guest"
(I did so at least three times). Maybe
The behavior of exclude_guest=0 events is unchanged.
or
The behavior of events without exclude_guest is unchanged.
I think it's also worth explicitly calling out that events that are configured
to sample both host and guest may still be prone to misattributing a PMI that
arrived in the host as a guest event, depending on the KVM arch and/or vendor
behavior.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 5/5] perf: Correct perf sampling with guest VMs
2024-09-11 22:56 ` Sean Christopherson
@ 2024-09-12 20:31 ` Colton Lewis
0 siblings, 0 replies; 8+ messages in thread
From: Colton Lewis @ 2024-09-12 20:31 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, oliver.upton, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
will, linux, catalin.marinas, mpe, npiggin, christophe.leroy,
naveen, hca, gor, agordeev, borntraeger, svens, tglx, bp,
dave.hansen, x86, hpa, linux-perf-users, linux-kernel,
linux-arm-kernel, linuxppc-dev, linux-s390
Sean Christopherson <seanjc@google.com> writes:
> On Wed, Sep 11, 2024, Colton Lewis wrote:
>> Previously any PMU overflow interrupt that fired while a VCPU was
>> loaded was recorded as a guest event whether it truly was or not. This
>> resulted in nonsense perf recordings that did not honor
>> perf_event_attr.exclude_guest and recorded guest IPs where it should
>> have recorded host IPs.
>> Rework the sampling logic to only record guest samples for events with
>> exclude_guest clear. This way any host-only events with exclude_guest
>> set will never see unexpected guest samples. The behaviour of events
>> with exclude_guest clear is unchanged.
> Nit, "with exclude_guest clear" is easy to misread as simply "with
> exclude_guest"
> (I did so at least three times). Maybe
> The behavior of exclude_guest=0 events is unchanged.
> or
> The behavior of events without exclude_guest is unchanged.
> I think it's also worth explicitly calling out that events that are
> configured
> to sample both host and guest may still be prone to misattributing a PMI
> that
> arrived in the host as a guest event, depending on the KVM arch and/or
> vendor
> behavior.
Done
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-09-12 20:31 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-11 22:24 [PATCH v2 0/5] Correct perf sampling with Guest VMs Colton Lewis
2024-09-11 22:24 ` [PATCH v2 1/5] arm: perf: Drop unused functions Colton Lewis
2024-09-11 22:24 ` [PATCH v2 2/5] perf: Hoist perf_instruction_pointer() and perf_misc_flags() Colton Lewis
2024-09-11 22:24 ` [PATCH v2 3/5] powerpc: perf: Use perf_arch_instruction_pointer() Colton Lewis
2024-09-11 22:24 ` [PATCH v2 4/5] x86: perf: Refactor misc flag assignments Colton Lewis
2024-09-11 22:24 ` [PATCH v2 5/5] perf: Correct perf sampling with guest VMs Colton Lewis
2024-09-11 22:56 ` Sean Christopherson
2024-09-12 20:31 ` Colton Lewis
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).