From: Sean Christopherson <seanjc@google.com>
To: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Jim Mattson <jmattson@google.com>,
Mingwei Zhang <mizhang@google.com>,
Xiong Zhang <xiong.y.zhang@intel.com>,
Zhenyu Wang <zhenyuw@linux.intel.com>,
Like Xu <like.xu.linux@gmail.com>,
Jinrong Liang <cloudliang@tencent.com>,
Yongwei Ma <yongwei.ma@intel.com>,
Dapeng Mi <dapeng1.mi@intel.com>
Subject: Re: [kvm-unit-tests patch v6 04/18] x86: pmu: Fix the issue that pmu_counter_t.config crosses cache line
Date: Fri, 14 Feb 2025 13:05:34 -0800 [thread overview]
Message-ID: <Z6-wHilax9b59ps8@google.com> (raw)
In-Reply-To: <20240914101728.33148-5-dapeng1.mi@linux.intel.com>
On Sat, Sep 14, 2024, Dapeng Mi wrote:
> When running pmu test on SPR, the following #GP fault is reported.
>
> Unhandled exception 13 #GP at ip 000000000040771f
> error_code=0000 rflags=00010046 cs=00000008
> rax=00000000004031ad rcx=0000000000000186 rdx=0000000000000000 rbx=00000000005142f0
> rbp=0000000000514260 rsi=0000000000000020 rdi=0000000000000340
> r8=0000000000513a65 r9=00000000000003f8 r10=000000000000000d r11=00000000ffffffff
> r12=000000000043003c r13=0000000000514450 r14=000000000000000b r15=0000000000000001
> cr0=0000000080010011 cr2=0000000000000000 cr3=0000000001007000 cr4=0000000000000020
> cr8=0000000000000000
> STACK: @40771f 40040e 400976 400aef 40148d 401da9 4001ad
> FAIL pmu
>
> It looks EVENTSEL0 MSR (0x186) is written a invalid value (0x4031ad) and
> cause a #GP.
Nope.
> Further investigation shows the #GP is caused by below code in
> __start_event().
>
> rmsr(MSR_GP_EVENT_SELECTx(event_to_global_idx(evt)),
> evt->config | EVNTSEL_EN);
Nope.
> The evt->config is correctly initialized but seems corrupted before
> writing to MSR.
Still nope.
>
> The original pmu_counter_t layout looks as below.
>
> typedef struct {
> uint32_t ctr;
> uint64_t config;
> uint64_t count;
> int idx;
> } pmu_counter_t;
>
> Obviously the config filed crosses two cache lines.
Yeah, no. Cache lines are 64 bytes on x86, and even with the bad layout, the size
only adds up to 32 bytes on x86-64. Packing it slightly better drops it to 24
bytes, but that has nothing to do with cache lines.
> When the two cache lines are not updated simultaneously, the config value is
> corrupted.
This is simply nonsensical. Compilers don't generate accesses that split cache
lines unless software is being deliberately stupid, and x86 doesn't corrupt data
on unaligned accesses.
The actual problem is that your next patch increases the size of the array in
check_counters_many() from 10 to 48 entries. With 32 bytes per entry, it's just
enough to overflow the stack when making function calls (the array itself stays
on the stack page). And because KUT's percpu and stack management is complete
and utter garbage, overflowing the stack clobbers the percpu area.
Of course, it's way too hard to even see that, because all of the code is beyond
stupid and (a) doesn't align the stacks to 4KiB, and (b) puts the percpu area at
the bottom of the stack "page".
I'll send patches to put band-aids on the per-CPU insanity, along with a refreshed
version of this series.
next prev parent reply other threads:[~2025-02-14 21:05 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-14 10:17 [kvm-unit-tests patch v6 00/18] pmu test bugs fix and improvements Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 01/18] x86: pmu: Remove duplicate code in pmu_init() Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 02/18] x86: pmu: Remove blank line and redundant space Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 03/18] x86: pmu: Refine fixed_events[] names Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 04/18] x86: pmu: Fix the issue that pmu_counter_t.config crosses cache line Dapeng Mi
2025-02-14 21:05 ` Sean Christopherson [this message]
2025-02-18 9:07 ` Mi, Dapeng
2024-09-14 10:17 ` [kvm-unit-tests patch v6 05/18] x86: pmu: Enlarge cnt[] length to 48 in check_counters_many() Dapeng Mi
2025-02-14 21:06 ` Sean Christopherson
2025-02-18 9:24 ` Mi, Dapeng
2025-02-18 15:56 ` Sean Christopherson
2024-09-14 10:17 ` [kvm-unit-tests patch v6 06/18] x86: pmu: Print measured event count if test fails Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 07/18] x86: pmu: Fix potential out of bound access for fixed events Dapeng Mi
2025-02-14 21:07 ` Sean Christopherson
2025-02-18 9:34 ` Mi, Dapeng
2025-02-18 15:04 ` Sean Christopherson
2024-09-14 10:17 ` [kvm-unit-tests patch v6 08/18] x86: pmu: Fix cycles event validation failure Dapeng Mi
2025-02-14 21:07 ` Sean Christopherson
2025-02-18 9:36 ` Mi, Dapeng
2024-09-14 10:17 ` [kvm-unit-tests patch v6 09/18] x86: pmu: Use macro to replace hard-coded branches event index Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 10/18] x86: pmu: Use macro to replace hard-coded ref-cycles " Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 11/18] x86: pmu: Use macro to replace hard-coded instructions " Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 12/18] x86: pmu: Enable and disable PMCs in loop() asm blob Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 13/18] x86: pmu: Improve instruction and branches events verification Dapeng Mi
2025-02-14 21:08 ` Sean Christopherson
2025-02-18 9:40 ` Mi, Dapeng
2024-09-14 10:17 ` [kvm-unit-tests patch v6 14/18] x86: pmu: Improve LLC misses event verification Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 15/18] x86: pmu: Adjust lower boundary of llc-misses event to 0 for legacy CPUs Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 16/18] x86: pmu: Add IBPB indirect jump asm blob Dapeng Mi
2024-09-14 10:17 ` [kvm-unit-tests patch v6 17/18] x86: pmu: Adjust lower boundary of branch-misses event Dapeng Mi
2025-02-14 21:09 ` Sean Christopherson
2025-02-18 9:42 ` Mi, Dapeng
2024-09-14 10:17 ` [kvm-unit-tests patch v6 18/18] x86: pmu: Optimize emulated instruction validation Dapeng Mi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z6-wHilax9b59ps8@google.com \
--to=seanjc@google.com \
--cc=cloudliang@tencent.com \
--cc=dapeng1.mi@intel.com \
--cc=dapeng1.mi@linux.intel.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=like.xu.linux@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mizhang@google.com \
--cc=pbonzini@redhat.com \
--cc=xiong.y.zhang@intel.com \
--cc=yongwei.ma@intel.com \
--cc=zhenyuw@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox