* [PATCH v2 0/2] riscv: Add perf support to collect KVM guest statistics from host side
@ 2024-08-13 13:23 zhouquan
2024-08-13 13:23 ` [PATCH v2 1/2] riscv: perf: add guest vs host distinction zhouquan
2024-08-13 13:24 ` [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling zhouquan
0 siblings, 2 replies; 8+ messages in thread
From: zhouquan @ 2024-08-13 13:23 UTC (permalink / raw)
To: anup, ajones, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa
Cc: linux-kernel, linux-riscv, kvm, kvm-riscv, linux-perf-users,
Quan Zhou
From: Quan Zhou <zhouquan@iscas.ac.cn>
Add basic guest support to RISC-V perf, enabling it to distinguish
whether PMU interrupts occur in the host or the guest, and then
collect some basic guest information from the host side
(guest os callchain is not supported for now).
Based on the x86/arm implementation, tested with kvm-riscv.
test env:
- host: qemu-9.0.0
- guest: qemu-9.0.0 --enable-kvm (only start one guest and run top)
-----------------------------------------
1) perf kvm top
./perf kvm --host --guest \
--guestkallsyms=/root/repo/shared/kallsyms \
--guestmodules=/root/repo/shared/modules top
PerfTop: 41 irqs/sec kernel:97.6% us: 0.0% guest kernel: 0.0% guest us: 0.0% exact: 0.0% [250Hz cycles:P], (all, 4 CPUs)
-------------------------------------------------------------------------------
64.57% [kernel] [k] default_idle_call
3.12% [kernel] [k] _raw_spin_unlock_irqrestore
3.03% [guest.kernel] [g] mem_serial_out
2.61% [kernel] [k] handle_softirqs
2.32% [kernel] [k] do_trap_ecall_u
1.71% [kernel] [k] _raw_spin_unlock_irq
1.26% [guest.kernel] [g] do_raw_spin_lock
1.25% [kernel] [k] finish_task_switch.isra.0
1.16% [kernel] [k] do_idle
0.77% libc.so.6 [.] ioctl
0.76% [kernel] [k] queue_work_on
0.69% [kernel] [k] __local_bh_enable_ip
0.67% [guest.kernel] [g] __noinstr_text_start
0.64% [guest.kernel] [g] mem_serial_in
0.41% libc.so.6 [.] pthread_sigmask
0.39% [kernel] [k] mem_cgroup_uncharge_skmem
0.39% [kernel] [k] __might_resched
0.39% [guest.kernel] [g] _nohz_idle_balance.isra.0
0.37% [kernel] [k] sched_balance_update_blocked_averages
0.34% [kernel] [k] sched_balance_rq
2) perf kvm record
./perf kvm --host --guest \
--guestkallsyms=/root/repo/shared/kallsyms \
--guestmodules=/root/repo/shared/modules record -a sleep 60
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 1.292 MB perf.data.kvm (17990 samples) ]
3) perf kvm report (the data shown here is not complete)
./perf kvm --host --guest \
--guestkallsyms=/root/repo/shared/kallsyms \
--guestmodules=/root/repo/shared/modules report -i perf.data.kvm
# Total Lost Samples: 0
#
# Samples: 17K of event 'cycles:P'
# Event count (approx.): 269968947184
#
# Overhead Command Shared Object Symbol
# ........ ............... ....................... ..............................................
#
61.86% swapper [kernel.kallsyms] [k] default_idle_call
2.93% :6463 [guest.kernel.kallsyms] [g] do_raw_spin_lock
2.82% :6462 [guest.kernel.kallsyms] [g] mem_serial_out
2.11% sshd [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.78% :6462 [guest.kernel.kallsyms] [g] do_raw_spin_lock
1.37% swapper [kernel.kallsyms] [k] handle_softirqs
1.36% swapper [kernel.kallsyms] [k] do_idle
1.21% sshd [kernel.kallsyms] [k] do_trap_ecall_u
1.21% sshd [kernel.kallsyms] [k] _raw_spin_unlock_irq
1.11% qemu-system-ris [kernel.kallsyms] [k] do_trap_ecall_u
0.93% qemu-system-ris libc.so.6 [.] ioctl
0.89% sshd [kernel.kallsyms] [k] __local_bh_enable_ip
0.77% qemu-system-ris [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.68% qemu-system-ris [kernel.kallsyms] [k] queue_work_on
0.65% sshd [kernel.kallsyms] [k] handle_softirqs
0.44% :6462 [guest.kernel.kallsyms] [g] mem_serial_in
0.42% sshd libc.so.6 [.] pthread_sigmask
0.34% :6462 [guest.kernel.kallsyms] [g] serial8250_tx_chars
0.30% swapper [kernel.kallsyms] [k] finish_task_switch.isra.0
0.29% swapper [kernel.kallsyms] [k] sched_balance_rq
0.29% sshd [kernel.kallsyms] [k] __might_resched
0.26% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit
0.26% swapper [kernel.kallsyms] [k] sched_balance_update_blocked_averages
0.26% swapper [kernel.kallsyms] [k] _nohz_idle_balance.isra.0
0.24% qemu-system-ris [kernel.kallsyms] [k] finish_task_switch.isra.0
0.23% :6462 [guest.kernel.kallsyms] [g] __noinstr_text_start
---
Change since v1:
- Rebased on v6.11-rc3
- Fix incorrect misc type (Andrew)
---
v1 link:
https://lore.kernel.org/all/cover.1721271251.git.zhouquan@iscas.ac.cn/
Quan Zhou (2):
riscv: perf: add guest vs host distinction
riscv: KVM: add basic support for host vs guest profiling
arch/riscv/include/asm/kvm_host.h | 5 ++++
arch/riscv/include/asm/perf_event.h | 7 ++++++
arch/riscv/kernel/perf_callchain.c | 38 +++++++++++++++++++++++++++++
arch/riscv/kvm/Kconfig | 1 +
arch/riscv/kvm/main.c | 12 +++++++--
arch/riscv/kvm/vcpu.c | 7 ++++++
6 files changed, 68 insertions(+), 2 deletions(-)
base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
--
2.34.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/2] riscv: perf: add guest vs host distinction
2024-08-13 13:23 [PATCH v2 0/2] riscv: Add perf support to collect KVM guest statistics from host side zhouquan
@ 2024-08-13 13:23 ` zhouquan
2024-08-21 12:48 ` Andrew Jones
2024-08-13 13:24 ` [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling zhouquan
1 sibling, 1 reply; 8+ messages in thread
From: zhouquan @ 2024-08-13 13:23 UTC (permalink / raw)
To: anup, ajones, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa
Cc: linux-kernel, linux-riscv, kvm, kvm-riscv, linux-perf-users,
Quan Zhou
From: Quan Zhou <zhouquan@iscas.ac.cn>
Introduce basic guest support in perf, enabling it to distinguish
between PMU interrupts in the host or guest, and collect
fundamental information.
Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
---
arch/riscv/include/asm/perf_event.h | 7 ++++++
arch/riscv/kernel/perf_callchain.c | 38 +++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/arch/riscv/include/asm/perf_event.h b/arch/riscv/include/asm/perf_event.h
index 665bbc9b2f84..c2b73c3aefe4 100644
--- a/arch/riscv/include/asm/perf_event.h
+++ b/arch/riscv/include/asm/perf_event.h
@@ -8,13 +8,20 @@
#ifndef _ASM_RISCV_PERF_EVENT_H
#define _ASM_RISCV_PERF_EVENT_H
+#ifdef CONFIG_PERF_EVENTS
#include <linux/perf_event.h>
#define perf_arch_bpf_user_pt_regs(regs) (struct user_regs_struct *)regs
+extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
+extern unsigned short perf_misc_flags(struct pt_regs *regs);
+#define perf_misc_flags(regs) perf_misc_flags(regs)
+
#define perf_arch_fetch_caller_regs(regs, __ip) { \
(regs)->epc = (__ip); \
(regs)->s0 = (unsigned long) __builtin_frame_address(0); \
(regs)->sp = current_stack_pointer; \
(regs)->status = SR_PP; \
}
+#endif
+
#endif /* _ASM_RISCV_PERF_EVENT_H */
diff --git a/arch/riscv/kernel/perf_callchain.c b/arch/riscv/kernel/perf_callchain.c
index 3348a61de7d9..7af90a3bb373 100644
--- a/arch/riscv/kernel/perf_callchain.c
+++ b/arch/riscv/kernel/perf_callchain.c
@@ -58,6 +58,11 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
{
unsigned long fp = 0;
+ if (perf_guest_state()) {
+ /* TODO: We don't support guest os callchain now */
+ return;
+ }
+
fp = regs->s0;
perf_callchain_store(entry, regs->epc);
@@ -74,5 +79,38 @@ static bool fill_callchain(void *entry, unsigned long pc)
void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
+ if (perf_guest_state()) {
+ /* TODO: We don't support guest os callchain now */
+ return;
+ }
+
walk_stackframe(NULL, regs, fill_callchain, entry);
}
+
+unsigned long perf_instruction_pointer(struct pt_regs *regs)
+{
+ if (perf_guest_state())
+ return perf_guest_get_ip();
+
+ return instruction_pointer(regs);
+}
+
+unsigned short perf_misc_flags(struct pt_regs *regs)
+{
+ unsigned int guest_state = perf_guest_state();
+ unsigned short misc = 0;
+
+ if (guest_state) {
+ if (guest_state & PERF_GUEST_USER)
+ misc |= PERF_RECORD_MISC_GUEST_USER;
+ else
+ misc |= PERF_RECORD_MISC_GUEST_KERNEL;
+ } else {
+ if (user_mode(regs))
+ misc |= PERF_RECORD_MISC_USER;
+ else
+ misc |= PERF_RECORD_MISC_KERNEL;
+ }
+
+ return misc;
+}
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling
2024-08-13 13:23 [PATCH v2 0/2] riscv: Add perf support to collect KVM guest statistics from host side zhouquan
2024-08-13 13:23 ` [PATCH v2 1/2] riscv: perf: add guest vs host distinction zhouquan
@ 2024-08-13 13:24 ` zhouquan
2024-08-21 12:51 ` Andrew Jones
1 sibling, 1 reply; 8+ messages in thread
From: zhouquan @ 2024-08-13 13:24 UTC (permalink / raw)
To: anup, ajones, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa
Cc: linux-kernel, linux-riscv, kvm, kvm-riscv, linux-perf-users,
Quan Zhou
From: Quan Zhou <zhouquan@iscas.ac.cn>
For the information collected on the host side, we need to
identify which data originates from the guest and record
these events separately, this can be achieved by having
KVM register perf callbacks.
Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
---
arch/riscv/include/asm/kvm_host.h | 5 +++++
arch/riscv/kvm/Kconfig | 1 +
arch/riscv/kvm/main.c | 12 ++++++++++--
arch/riscv/kvm/vcpu.c | 7 +++++++
4 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index 2e2254fd2a2a..d2350b08a3f4 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -286,6 +286,11 @@ struct kvm_vcpu_arch {
} sta;
};
+static inline bool kvm_arch_pmi_in_guest(struct kvm_vcpu *vcpu)
+{
+ return IS_ENABLED(CONFIG_GUEST_PERF_EVENTS) && !!vcpu;
+}
+
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
#define KVM_RISCV_GSTAGE_TLB_MIN_ORDER 12
diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
index 26d1727f0550..0c3cbb0915ff 100644
--- a/arch/riscv/kvm/Kconfig
+++ b/arch/riscv/kvm/Kconfig
@@ -32,6 +32,7 @@ config KVM
select KVM_XFER_TO_GUEST_WORK
select KVM_GENERIC_MMU_NOTIFIER
select SCHED_INFO
+ select GUEST_PERF_EVENTS if PERF_EVENTS
help
Support hosting virtualized guest machines.
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index bab2ec34cd87..734b48d8f6dd 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -51,6 +51,12 @@ void kvm_arch_hardware_disable(void)
csr_write(CSR_HIDELEG, 0);
}
+static void kvm_riscv_teardown(void)
+{
+ kvm_riscv_aia_exit();
+ kvm_unregister_perf_callbacks();
+}
+
static int __init riscv_kvm_init(void)
{
int rc;
@@ -105,9 +111,11 @@ static int __init riscv_kvm_init(void)
kvm_info("AIA available with %d guest external interrupts\n",
kvm_riscv_aia_nr_hgei);
+ kvm_register_perf_callbacks(NULL);
+
rc = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
if (rc) {
- kvm_riscv_aia_exit();
+ kvm_riscv_teardown();
return rc;
}
@@ -117,7 +125,7 @@ module_init(riscv_kvm_init);
static void __exit riscv_kvm_exit(void)
{
- kvm_riscv_aia_exit();
+ kvm_riscv_teardown();
kvm_exit();
}
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index 8d7d381737ee..e8ffb3456898 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -226,6 +226,13 @@ bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
return (vcpu->arch.guest_context.sstatus & SR_SPP) ? true : false;
}
+#ifdef CONFIG_GUEST_PERF_EVENTS
+unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.guest_context.sepc;
+}
+#endif
+
vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
{
return VM_FAULT_SIGBUS;
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] riscv: perf: add guest vs host distinction
2024-08-13 13:23 ` [PATCH v2 1/2] riscv: perf: add guest vs host distinction zhouquan
@ 2024-08-21 12:48 ` Andrew Jones
2024-08-22 6:38 ` Quan Zhou
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Jones @ 2024-08-21 12:48 UTC (permalink / raw)
To: zhouquan
Cc: anup, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa, linux-kernel, linux-riscv, kvm,
kvm-riscv, linux-perf-users
On Tue, Aug 13, 2024 at 09:23:54PM GMT, zhouquan@iscas.ac.cn wrote:
> From: Quan Zhou <zhouquan@iscas.ac.cn>
>
> Introduce basic guest support in perf, enabling it to distinguish
> between PMU interrupts in the host or guest, and collect
> fundamental information.
>
> Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
> ---
> arch/riscv/include/asm/perf_event.h | 7 ++++++
> arch/riscv/kernel/perf_callchain.c | 38 +++++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+)
>
> diff --git a/arch/riscv/include/asm/perf_event.h b/arch/riscv/include/asm/perf_event.h
> index 665bbc9b2f84..c2b73c3aefe4 100644
> --- a/arch/riscv/include/asm/perf_event.h
> +++ b/arch/riscv/include/asm/perf_event.h
> @@ -8,13 +8,20 @@
> #ifndef _ASM_RISCV_PERF_EVENT_H
> #define _ASM_RISCV_PERF_EVENT_H
>
> +#ifdef CONFIG_PERF_EVENTS
> #include <linux/perf_event.h>
> #define perf_arch_bpf_user_pt_regs(regs) (struct user_regs_struct *)regs
>
> +extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
> +extern unsigned short perf_misc_flags(struct pt_regs *regs);
> +#define perf_misc_flags(regs) perf_misc_flags(regs)
> +
> #define perf_arch_fetch_caller_regs(regs, __ip) { \
> (regs)->epc = (__ip); \
> (regs)->s0 = (unsigned long) __builtin_frame_address(0); \
> (regs)->sp = current_stack_pointer; \
> (regs)->status = SR_PP; \
> }
> +#endif
> +
> #endif /* _ASM_RISCV_PERF_EVENT_H */
> diff --git a/arch/riscv/kernel/perf_callchain.c b/arch/riscv/kernel/perf_callchain.c
> index 3348a61de7d9..7af90a3bb373 100644
> --- a/arch/riscv/kernel/perf_callchain.c
> +++ b/arch/riscv/kernel/perf_callchain.c
> @@ -58,6 +58,11 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
> {
> unsigned long fp = 0;
>
> + if (perf_guest_state()) {
> + /* TODO: We don't support guest os callchain now */
> + return;
> + }
> +
> fp = regs->s0;
> perf_callchain_store(entry, regs->epc);
>
> @@ -74,5 +79,38 @@ static bool fill_callchain(void *entry, unsigned long pc)
> void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
> struct pt_regs *regs)
> {
> + if (perf_guest_state()) {
> + /* TODO: We don't support guest os callchain now */
> + return;
> + }
> +
> walk_stackframe(NULL, regs, fill_callchain, entry);
> }
> +
> +unsigned long perf_instruction_pointer(struct pt_regs *regs)
> +{
> + if (perf_guest_state())
> + return perf_guest_get_ip();
> +
> + return instruction_pointer(regs);
> +}
> +
> +unsigned short perf_misc_flags(struct pt_regs *regs)
I see that the consumer of perf_misc_flags is only a u16, but all other
architectures define this function as returning an unsigned long, and
your last version did as well. My comment in the last version was that
we should use an unsigned long for the 'misc' variable to match the
return type of the function. I still think we should do that instead
since the function should be consistent with the other architectures.
> +{
> + unsigned int guest_state = perf_guest_state();
> + unsigned short misc = 0;
> +
> + if (guest_state) {
> + if (guest_state & PERF_GUEST_USER)
> + misc |= PERF_RECORD_MISC_GUEST_USER;
> + else
> + misc |= PERF_RECORD_MISC_GUEST_KERNEL;
> + } else {
> + if (user_mode(regs))
> + misc |= PERF_RECORD_MISC_USER;
> + else
> + misc |= PERF_RECORD_MISC_KERNEL;
> + }
> +
> + return misc;
> +}
> --
> 2.34.1
>
Thanks,
drew
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling
2024-08-13 13:24 ` [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling zhouquan
@ 2024-08-21 12:51 ` Andrew Jones
2024-08-22 6:42 ` Quan Zhou
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Jones @ 2024-08-21 12:51 UTC (permalink / raw)
To: zhouquan
Cc: anup, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa, linux-kernel, linux-riscv, kvm,
kvm-riscv, linux-perf-users
On Tue, Aug 13, 2024 at 09:24:10PM GMT, zhouquan@iscas.ac.cn wrote:
> From: Quan Zhou <zhouquan@iscas.ac.cn>
>
> For the information collected on the host side, we need to
> identify which data originates from the guest and record
> these events separately, this can be achieved by having
> KVM register perf callbacks.
>
> Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
> ---
> arch/riscv/include/asm/kvm_host.h | 5 +++++
> arch/riscv/kvm/Kconfig | 1 +
> arch/riscv/kvm/main.c | 12 ++++++++++--
> arch/riscv/kvm/vcpu.c | 7 +++++++
> 4 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> index 2e2254fd2a2a..d2350b08a3f4 100644
> --- a/arch/riscv/include/asm/kvm_host.h
> +++ b/arch/riscv/include/asm/kvm_host.h
> @@ -286,6 +286,11 @@ struct kvm_vcpu_arch {
> } sta;
> };
>
Let's add the same comment here that arm64 has unless you determine
that 'any event that arrives while a vCPU is loaded is considered to be
"in guest"' is not true for riscv.
> +static inline bool kvm_arch_pmi_in_guest(struct kvm_vcpu *vcpu)
> +{
> + return IS_ENABLED(CONFIG_GUEST_PERF_EVENTS) && !!vcpu;
> +}
> +
> static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>
> #define KVM_RISCV_GSTAGE_TLB_MIN_ORDER 12
> diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
> index 26d1727f0550..0c3cbb0915ff 100644
> --- a/arch/riscv/kvm/Kconfig
> +++ b/arch/riscv/kvm/Kconfig
> @@ -32,6 +32,7 @@ config KVM
> select KVM_XFER_TO_GUEST_WORK
> select KVM_GENERIC_MMU_NOTIFIER
> select SCHED_INFO
> + select GUEST_PERF_EVENTS if PERF_EVENTS
> help
> Support hosting virtualized guest machines.
>
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index bab2ec34cd87..734b48d8f6dd 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -51,6 +51,12 @@ void kvm_arch_hardware_disable(void)
> csr_write(CSR_HIDELEG, 0);
> }
>
> +static void kvm_riscv_teardown(void)
> +{
> + kvm_riscv_aia_exit();
> + kvm_unregister_perf_callbacks();
> +}
> +
> static int __init riscv_kvm_init(void)
> {
> int rc;
> @@ -105,9 +111,11 @@ static int __init riscv_kvm_init(void)
> kvm_info("AIA available with %d guest external interrupts\n",
> kvm_riscv_aia_nr_hgei);
>
> + kvm_register_perf_callbacks(NULL);
> +
> rc = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
> if (rc) {
> - kvm_riscv_aia_exit();
> + kvm_riscv_teardown();
> return rc;
> }
>
> @@ -117,7 +125,7 @@ module_init(riscv_kvm_init);
>
> static void __exit riscv_kvm_exit(void)
> {
> - kvm_riscv_aia_exit();
> + kvm_riscv_teardown();
>
> kvm_exit();
> }
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index 8d7d381737ee..e8ffb3456898 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -226,6 +226,13 @@ bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> return (vcpu->arch.guest_context.sstatus & SR_SPP) ? true : false;
> }
>
> +#ifdef CONFIG_GUEST_PERF_EVENTS
> +unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
> +{
> + return vcpu->arch.guest_context.sepc;
> +}
> +#endif
> +
> vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
> {
> return VM_FAULT_SIGBUS;
> --
> 2.34.1
>
Thanks,
drew
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] riscv: perf: add guest vs host distinction
2024-08-21 12:48 ` Andrew Jones
@ 2024-08-22 6:38 ` Quan Zhou
2024-08-22 13:41 ` Andrew Jones
0 siblings, 1 reply; 8+ messages in thread
From: Quan Zhou @ 2024-08-22 6:38 UTC (permalink / raw)
To: Andrew Jones
Cc: anup, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa, linux-kernel, linux-riscv, kvm,
kvm-riscv, linux-perf-users
On 2024/8/21 20:48, Andrew Jones wrote:
> On Tue, Aug 13, 2024 at 09:23:54PM GMT, zhouquan@iscas.ac.cn wrote:
>> From: Quan Zhou <zhouquan@iscas.ac.cn>
>>
>> Introduce basic guest support in perf, enabling it to distinguish
>> between PMU interrupts in the host or guest, and collect
>> fundamental information.
>>
>> Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
>> ---
>> arch/riscv/include/asm/perf_event.h | 7 ++++++
>> arch/riscv/kernel/perf_callchain.c | 38 +++++++++++++++++++++++++++++
>> 2 files changed, 45 insertions(+)
>>
>> diff --git a/arch/riscv/include/asm/perf_event.h b/arch/riscv/include/asm/perf_event.h
>> index 665bbc9b2f84..c2b73c3aefe4 100644
>> --- a/arch/riscv/include/asm/perf_event.h
>> +++ b/arch/riscv/include/asm/perf_event.h
>> @@ -8,13 +8,20 @@
>> #ifndef _ASM_RISCV_PERF_EVENT_H
>> #define _ASM_RISCV_PERF_EVENT_H
>>
>> +#ifdef CONFIG_PERF_EVENTS
>> #include <linux/perf_event.h>
>> #define perf_arch_bpf_user_pt_regs(regs) (struct user_regs_struct *)regs
>>
>> +extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
>> +extern unsigned short perf_misc_flags(struct pt_regs *regs);
>> +#define perf_misc_flags(regs) perf_misc_flags(regs)
>> +
>> #define perf_arch_fetch_caller_regs(regs, __ip) { \
>> (regs)->epc = (__ip); \
>> (regs)->s0 = (unsigned long) __builtin_frame_address(0); \
>> (regs)->sp = current_stack_pointer; \
>> (regs)->status = SR_PP; \
>> }
>> +#endif
>> +
>> #endif /* _ASM_RISCV_PERF_EVENT_H */
>> diff --git a/arch/riscv/kernel/perf_callchain.c b/arch/riscv/kernel/perf_callchain.c
>> index 3348a61de7d9..7af90a3bb373 100644
>> --- a/arch/riscv/kernel/perf_callchain.c
>> +++ b/arch/riscv/kernel/perf_callchain.c
>> @@ -58,6 +58,11 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
>> {
>> unsigned long fp = 0;
>>
>> + if (perf_guest_state()) {
>> + /* TODO: We don't support guest os callchain now */
>> + return;
>> + }
>> +
>> fp = regs->s0;
>> perf_callchain_store(entry, regs->epc);
>>
>> @@ -74,5 +79,38 @@ static bool fill_callchain(void *entry, unsigned long pc)
>> void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
>> struct pt_regs *regs)
>> {
>> + if (perf_guest_state()) {
>> + /* TODO: We don't support guest os callchain now */
>> + return;
>> + }
>> +
>> walk_stackframe(NULL, regs, fill_callchain, entry);
>> }
>> +
>> +unsigned long perf_instruction_pointer(struct pt_regs *regs)
>> +{
>> + if (perf_guest_state())
>> + return perf_guest_get_ip();
>> +
>> + return instruction_pointer(regs);
>> +}
>> +
>> +unsigned short perf_misc_flags(struct pt_regs *regs)
>
> I see that the consumer of perf_misc_flags is only a u16, but all other
> architectures define this function as returning an unsigned long, and
> your last version did as well. My comment in the last version was that
> we should use an unsigned long for the 'misc' variable to match the
> return type of the function. I still think we should do that instead
> since the function should be consistent with the other architectures.
>
I agree with your point that the type of `misc` should be consistent
with other architectures.
However, one thing confuses me. The return value of perf_misc_flags
is assigned to the `misc` field of the perf_event_header structure,
and the field is defined as `u16`. I checked the return type of
`perf_misc_flags` in other architectures, and I found that for
x86/arm/s390, the type is `unsigned long`, while for powerpc, it is `u32`.
These do not match `u16`, which seems to pose a risk of type truncation
and feels a bit unconventional. Or is there some other reasonable
consideration behind this?
Thanks a lot!
Quan
>> +{
>> + unsigned int guest_state = perf_guest_state();
>> + unsigned short misc = 0;
>> +
>> + if (guest_state) {
>> + if (guest_state & PERF_GUEST_USER)
>> + misc |= PERF_RECORD_MISC_GUEST_USER;
>> + else
>> + misc |= PERF_RECORD_MISC_GUEST_KERNEL;
>> + } else {
>> + if (user_mode(regs))
>> + misc |= PERF_RECORD_MISC_USER;
>> + else
>> + misc |= PERF_RECORD_MISC_KERNEL;
>> + }
>> +
>> + return misc;
>> +}
>> --
>> 2.34.1
>>
>
> Thanks,
> drew
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling
2024-08-21 12:51 ` Andrew Jones
@ 2024-08-22 6:42 ` Quan Zhou
0 siblings, 0 replies; 8+ messages in thread
From: Quan Zhou @ 2024-08-22 6:42 UTC (permalink / raw)
To: Andrew Jones
Cc: anup, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa, linux-kernel, linux-riscv, kvm,
kvm-riscv, linux-perf-users
On 2024/8/21 20:51, Andrew Jones wrote:
> On Tue, Aug 13, 2024 at 09:24:10PM GMT, zhouquan@iscas.ac.cn wrote:
>> From: Quan Zhou <zhouquan@iscas.ac.cn>
>>
>> For the information collected on the host side, we need to
>> identify which data originates from the guest and record
>> these events separately, this can be achieved by having
>> KVM register perf callbacks.
>>
>> Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
>> ---
>> arch/riscv/include/asm/kvm_host.h | 5 +++++
>> arch/riscv/kvm/Kconfig | 1 +
>> arch/riscv/kvm/main.c | 12 ++++++++++--
>> arch/riscv/kvm/vcpu.c | 7 +++++++
>> 4 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
>> index 2e2254fd2a2a..d2350b08a3f4 100644
>> --- a/arch/riscv/include/asm/kvm_host.h
>> +++ b/arch/riscv/include/asm/kvm_host.h
>> @@ -286,6 +286,11 @@ struct kvm_vcpu_arch {
>> } sta;
>> };
>>
>
> Let's add the same comment here that arm64 has unless you determine
> that 'any event that arrives while a vCPU is loaded is considered to be
> "in guest"' is not true for riscv.
>
Ok, i will add this comment to clarify the point.
Thanks,
Quan
>> +static inline bool kvm_arch_pmi_in_guest(struct kvm_vcpu *vcpu)
>> +{
>> + return IS_ENABLED(CONFIG_GUEST_PERF_EVENTS) && !!vcpu;
>> +}
>> +
>> static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>>
>> #define KVM_RISCV_GSTAGE_TLB_MIN_ORDER 12
>> diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
>> index 26d1727f0550..0c3cbb0915ff 100644
>> --- a/arch/riscv/kvm/Kconfig
>> +++ b/arch/riscv/kvm/Kconfig
>> @@ -32,6 +32,7 @@ config KVM
>> select KVM_XFER_TO_GUEST_WORK
>> select KVM_GENERIC_MMU_NOTIFIER
>> select SCHED_INFO
>> + select GUEST_PERF_EVENTS if PERF_EVENTS
>> help
>> Support hosting virtualized guest machines.
>>
>> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
>> index bab2ec34cd87..734b48d8f6dd 100644
>> --- a/arch/riscv/kvm/main.c
>> +++ b/arch/riscv/kvm/main.c
>> @@ -51,6 +51,12 @@ void kvm_arch_hardware_disable(void)
>> csr_write(CSR_HIDELEG, 0);
>> }
>>
>> +static void kvm_riscv_teardown(void)
>> +{
>> + kvm_riscv_aia_exit();
>> + kvm_unregister_perf_callbacks();
>> +}
>> +
>> static int __init riscv_kvm_init(void)
>> {
>> int rc;
>> @@ -105,9 +111,11 @@ static int __init riscv_kvm_init(void)
>> kvm_info("AIA available with %d guest external interrupts\n",
>> kvm_riscv_aia_nr_hgei);
>>
>> + kvm_register_perf_callbacks(NULL);
>> +
>> rc = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
>> if (rc) {
>> - kvm_riscv_aia_exit();
>> + kvm_riscv_teardown();
>> return rc;
>> }
>>
>> @@ -117,7 +125,7 @@ module_init(riscv_kvm_init);
>>
>> static void __exit riscv_kvm_exit(void)
>> {
>> - kvm_riscv_aia_exit();
>> + kvm_riscv_teardown();
>>
>> kvm_exit();
>> }
>> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
>> index 8d7d381737ee..e8ffb3456898 100644
>> --- a/arch/riscv/kvm/vcpu.c
>> +++ b/arch/riscv/kvm/vcpu.c
>> @@ -226,6 +226,13 @@ bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
>> return (vcpu->arch.guest_context.sstatus & SR_SPP) ? true : false;
>> }
>>
>> +#ifdef CONFIG_GUEST_PERF_EVENTS
>> +unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
>> +{
>> + return vcpu->arch.guest_context.sepc;
>> +}
>> +#endif
>> +
>> vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>> {
>> return VM_FAULT_SIGBUS;
>> --
>> 2.34.1
>>
>
> Thanks,
> drew
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 1/2] riscv: perf: add guest vs host distinction
2024-08-22 6:38 ` Quan Zhou
@ 2024-08-22 13:41 ` Andrew Jones
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Jones @ 2024-08-22 13:41 UTC (permalink / raw)
To: Quan Zhou
Cc: anup, atishp, paul.walmsley, palmer, aou, mark.rutland,
alexander.shishkin, jolsa, linux-kernel, linux-riscv, kvm,
kvm-riscv, linux-perf-users
On Thu, Aug 22, 2024 at 02:38:44PM GMT, Quan Zhou wrote:
...
> > > +unsigned short perf_misc_flags(struct pt_regs *regs)
> >
> > I see that the consumer of perf_misc_flags is only a u16, but all other
> > architectures define this function as returning an unsigned long, and
> > your last version did as well. My comment in the last version was that
> > we should use an unsigned long for the 'misc' variable to match the
> > return type of the function. I still think we should do that instead
> > since the function should be consistent with the other architectures.
> >
>
> I agree with your point that the type of `misc` should be consistent with
> other architectures.
>
> However, one thing confuses me. The return value of perf_misc_flags
> is assigned to the `misc` field of the perf_event_header structure,
> and the field is defined as `u16`. I checked the return type of
Yes, that's what I mentioned above.
> `perf_misc_flags` in other architectures, and I found that for x86/arm/s390,
> the type is `unsigned long`, while for powerpc, it is `u32`.
> These do not match `u16`, which seems to pose a risk of type truncation and
> feels a bit unconventional. Or is there some other reasonable consideration
> behind this?
No, it's just historic. I see three paths, one is use 'unsigned long' like
the other architectures and assume we'll never have flags touching bits
over 15, so it's fine. Or, same as the first path, but also add
'#define PERF_RECORD_MISC_MAX 15' with a comment explaining misc flags
must be 15 or less as a separate patch. Or, for the third, add patches to
this series that first change all architectures to return u16s.
Thanks,
drew
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-08-22 13:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-13 13:23 [PATCH v2 0/2] riscv: Add perf support to collect KVM guest statistics from host side zhouquan
2024-08-13 13:23 ` [PATCH v2 1/2] riscv: perf: add guest vs host distinction zhouquan
2024-08-21 12:48 ` Andrew Jones
2024-08-22 6:38 ` Quan Zhou
2024-08-22 13:41 ` Andrew Jones
2024-08-13 13:24 ` [PATCH v2 2/2] riscv: KVM: add basic support for host vs guest profiling zhouquan
2024-08-21 12:51 ` Andrew Jones
2024-08-22 6:42 ` Quan Zhou
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).