From: John Garry <john.g.garry@oracle.com>
To: Sandipan Das <sandipan.das@amd.com>
Cc: linux-perf-users@vger.kernel.org, x86@kernel.org,
ravi.bangoria@amd.com, Namhyung Kim <namhyung@kernel.org>
Subject: Re: [bug report] perf top generates kernel "unchecked MSR access error: WRMSR"
Date: Thu, 24 Oct 2024 17:20:24 +0100
Message-ID: <8be4ca4e-ce69-468f-a846-d532a0e7393c@oracle.com>
In-Reply-To: <368573d6-fd3d-43c3-8c15-d01ef0c35026@amd.com>
On 24/10/2024 07:21, Sandipan Das wrote:
> Thanks for bringing this to our attention.
>
>> On Tue, Oct 22, 2024 at 03:55:05PM +0100, John Garry wrote:
>>> Hi all,
>>>
>>> On my VM, "perf top" gives this stack trace on v6.12-rc4:
>>>
>>> [ 930.527581] unchecked MSR access error: WRMSR to 0xc0010200 (tried to
>>> write 0x0000020000510076) at rIP: 0xffffffff94ead548
>>> (native_write_msr+0x8/0x30)
>>> [ 930.531135] Call Trace:
>>> [ 930.531456] <IRQ>
>>> [ 930.531749] ? ex_handler_msr+0x138/0x150
>>> [ 930.532285] ? search_extable+0x26/0x30
>>> [ 930.532780] ? fixup_exception+0x9c/0x310
>>> [ 930.533405] ? exc_general_protection+0x10c/0x490
>>> [ 930.534081] ? asm_exc_general_protection+0x26/0x30
>>> [ 930.534768] ? native_write_msr+0x8/0x30
>>> [ 930.535357] ? srso_alias_return_thunk+0x5/0xfbef5
>>> [ 930.535998] x86_pmu_enable_event+0xa5/0xd0
>>> [ 930.536641] amd_pmu_enable_all+0x4e/0x80
>>> [ 930.537211] ctx_resched+0x13b/0x1d0
>>> [ 930.537735] __perf_install_in_context+0x2a2/0x390
>>> [ 930.538439] remote_function+0x49/0x60
>>> [ 930.538931] __flush_smp_call_function_queue+0xdc/0x700
>>> [ 930.539694] ? __pfx_remote_function+0x10/0x10
>>> [ 930.540480] __sysvec_call_function_single+0x38/0x140
>>> [ 930.541134] sysvec_call_function_single+0x6c/0x90
>>> [ 930.541970] </IRQ>
>>> [ 930.542269] <TASK>
>>> [ 930.542766] asm_sysvec_call_function_single+0x1a/0x20
>>> [ 930.543493] RIP: 0010:pv_native_safe_halt+0xf/0x20
>>> [ 930.544195] Code: 22 d7 e9 ff b5 13 00 0f 1f 40 00 90 90 90 90 90 90 90
>>> 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d d3 e3 25 00 fb f4 <e9>
>>> d7 b5 13 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
>>> [ 930.546841] RSP: 0018:ffffffff96a03e68 EFLAGS: 00000206
>>> [ 930.547563] RAX: 0000000000000006 RBX: ffffffff96a269c0 RCX:
>>> 0000000000000000
>>> [ 930.548579] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
>>> ffffffff94f53f31
>>> [ 930.549568] RBP: 0000000000000000 R08: 0000000000000001 R09:
>>> 0000000000000000
>>> [ 930.550529] R10: 0000000000000001 R11: 0000000000000000 R12:
>>> ffffffff970608e0
>>> [ 930.551582] R13: ffffffff96a269c0 R14: 0000000000000000 R15:
>>> 0000000000000000
>>> [ 930.552683] ? do_idle+0x1d1/0x2a0
>>> [ 930.553182] default_idle+0x9/0x20
>>> [ 930.553670] default_idle_call+0x7d/0xc0
>>> [ 930.554226] do_idle+0x1d1/0x2a0
>>> [ 930.554696] cpu_startup_entry+0x29/0x30
>>> [ 930.555154] rest_init+0x12e/0x1d0
>>> [ 930.555621] start_kernel+0x60f/0x6d0
>>> [ 930.556064] x86_64_start_reservations+0x21/0x40
>>> [ 930.556633] x86_64_start_kernel+0x91/0xa0
>>> [ 930.557107] common_startup_64+0x13e/0x141
>>> [ 930.558038] </TASK>
>>> [ 930.738880] perf: interrupt took too long (2511 > 2500), lowering
>>> kernel.perf_event_max_sample_rate to 79000
>>> [ 930.772912] perf: interrupt took too long (3414 > 3138), lowering
>>> kernel.perf_event_max_sample_rate to 58000
>>> [ 930.797764] perf: interrupt took too long (4275 > 4267), lowering
>>> kernel.perf_event_max_sample_rate to 46000
>>> [ 931.117733] perf: interrupt took too long (5345 > 5343), lowering
>>> kernel.perf_event_max_sample_rate to 37000
>>> [ 933.862829] perf: interrupt took too long (6765 > 6681), lowering
>>> kernel.perf_event_max_sample_rate to 29000
>>> [opc@jgarry-atomic-write-exp-e4-8-instance-20231214-1221 ~]$ ^C
>>>
>>> Is this a known issue?
> I am unable to replicate this with KVM guests. MSR 0xc0010200 is the
> first PERF_CTL (event selector) and generally, unchecked MSR accesses
> happen when the hypervisor restricts what guests can access.
>
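For reference, the value from the WRMSR line can be split into its event-select fields. The following is a minimal sketch, not kernel code: the bit positions are taken from the definitions in the Linux source (arch/x86/include/asm/perf_event.h) as I understand them, and the names `FLAG_BITS` and `decode_perf_ctl` are hypothetical helpers made up for illustration.

```python
# Sketch: decode an AMD PERF_CTL (event selector) value.
# Assumed bit layout (from arch/x86/include/asm/perf_event.h):
#   [7:0] + [35:32] event select, [15:8] unit mask, [31:24] counter mask,
#   single-bit flags as listed in FLAG_BITS below.
FLAG_BITS = {
    16: "USR",        # count in user mode
    17: "OS",         # count in kernel mode
    18: "EDGE",       # edge detect
    20: "INT",        # interrupt on overflow
    22: "EN",         # counter enable
    23: "INV",        # invert counter mask comparison
    40: "GuestOnly",  # count only while in guest mode
    41: "HostOnly",   # count only while in host mode
}


def decode_perf_ctl(value):
    """Return the event, unit mask, counter mask and set flag names."""
    event = (value & 0xFF) | (((value >> 32) & 0xF) << 8)
    umask = (value >> 8) & 0xFF
    cmask = (value >> 24) & 0xFF
    flags = [name for bit, name in sorted(FLAG_BITS.items())
             if (value >> bit) & 1]
    return {"event": event, "umask": umask, "cmask": cmask, "flags": flags}


# Value from the "unchecked MSR access error" line in the report.
decoded = decode_perf_ctl(0x0000020000510076)
print(hex(decoded["event"]), decoded["flags"])
# prints: 0x76 ['USR', 'INT', 'EN', 'HostOnly']
```

Under those assumptions the write selects event 0x76 with the USR, INT, EN and HostOnly bits set. The decode by itself does not establish why the hypervisor rejected the write; it only shows which bits were in the value.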
> Can you share details about the hypervisor?
> If it's just KVM, can you share the host kernel version as well?
It's KVM, but I don't know the host kernel version, and I don't think
that information is easy to get from inside the guest. Here are the
KVM-related lines from dmesg:
[opc@jgarry-atomic-write-exp-e4-8-instance-20231214-1221 ~]$ sudo dmesg | grep -i kvm
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: using sched offset of 394705327075 cycles
[ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff
max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.010289] kvm-guest: APIC: eoi() replaced with
kvm_guest_apic_eoi_write()
[ 0.010297] kvm-guest: KVM setup pv remote TLB flush
[ 0.010300] kvm-guest: setup PV sched yield
[ 0.010324] Booting paravirtualized kernel on KVM
[ 0.016465] kvm-guest: PV spinlocks enabled
[ 0.058487] kvm-guest: APIC: send_IPI_mask() replaced with
kvm_send_ipi_mask()
[ 0.058491] kvm-guest: APIC: send_IPI_mask_allbutself() replaced with
kvm_send_ipi_mask_allbutself()
[ 0.058492] kvm-guest: setup PV IPIs
[ 0.280302] clocksource: Switched to clocksource kvm-clock
[ 1.302630] systemd[1]: Detected virtualization kvm.
[ 1.312201] systemd[1]: Initializing machine ID from KVM UUID.
[ 13.771695] systemd[1]: Detected virtualization kvm.
[ 15.072121] kvm_amd: Nested Virtualization enabled
[ 15.072124] kvm_amd: Nested Paging enabled
[opc@jgarry-atomic-write-exp-e4-8-instance-20231214-1221 ~]$
>
>>> more /proc/cpuinfo gives:
>>>
>>> processor : 0
>>> vendor_id : AuthenticAMD
>>> cpu family : 25
>>> model : 1
>>> model name : AMD EPYC 7J13 64-Core Processor
>>> stepping : 1
>>> microcode : 0x1000065
>>> cpu MHz : 2445.322
>>> cache size : 512 KB
>>> physical id : 0
>>> siblings : 16
>>> core id : 0
>>> cpu cores : 8
>>> apicid : 0
>>> initial apicid : 0
>>> fpu : yes
>>> fpu_exception : yes
>>> cpuid level : 16
>>> wp : yes
>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
>>> pdpe1gb rdtscp lm rep_good nopl xtopology cpuid extd_apicid tsc_known_freq
>>> pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt
>>> tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy
>>> svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext
>>> perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2
>>> smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt
>>> xsavec xgetbv1 xsaves clzero xsaveerptr wbnoinvd arat npt nrip_save
>>> umip pku ospke vaes vpclmulqdq rdpid arch_capabilities
>>> bugs : sysret_ss_attrs null_seg spectre_v1 spectre_v2
>>> spec_store_bypass srso ibpb_no_ret
>>> bogomips : 4890.64
>>> TLB size : 1024 4K pages
>>> clflush size : 64
>>> cache_alignment : 64
>>> address sizes : 40 bits physical, 48 bits virtual
>>> power management:
>>>
> I tried replicating this on systems with an EPYC 7713 (very similar to the
> one above) and an EPYC 9654 but had no luck.
Thanks for checking.

But wouldn't you know it, the error does not occur now. I expect that it
will reappear, and I'll let you know when it does.
Thanks
John