* arch_timer_edge_cases failures on ampere-one
From: Sebastian Ott @ 2025-04-10 15:10 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, kvmarm, kvm, linux-kselftest
Hey,
I'm seeing consistent failures for the arch_timer_edge_cases
selftest on ampere-one(x):
==== Test Assertion Failure ====
arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
pid=6277 tid=6277 errno=4 - Interrupted system call
1 0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
2 0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
3 0x0000ffffa8b2625b: ?? ??:0
4 0x0000ffffa8b2633b: ?? ??:0
5 0x000000000040202f: _start at ??:?
0x1 != 0x0 (timer_condition != istatus)
The (first) test that's failing is from test_timers_in_the_past():
/* Set a timer to counter=0 (in the past) */
test_timer_cval(timer, 0, wm, true, DEF_CNT);
If I understand this correctly, the timer condition is met, so an
irq should be raised and the istatus bit in SYS_CNTV_CTL_EL0 should be set.
What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
set). KVM also reads 1 in timer_save_state() via
read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?
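To make the register value above concrete, here is a minimal sketch that decodes a CNTV_CTL_EL0 value into its three architectural fields (ENABLE is bit 0, IMASK is bit 1, ISTATUS is bit 2). The macro names, `struct ctl_fields` and `decode_ctl()` are made up for illustration and are not kernel or selftest code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of CNT{P,V}_CTL_EL0 per the Arm architecture. */
#define CTL_ENABLE  (UINT64_C(1) << 0)  /* timer enabled */
#define CTL_IMASK   (UINT64_C(1) << 1)  /* interrupt masked */
#define CTL_ISTATUS (UINT64_C(1) << 2)  /* timer condition met */

/* Hypothetical helper type, for illustration only. */
struct ctl_fields {
	bool enable;
	bool imask;
	bool istatus;
};

/* Split a raw control register value into its three fields. */
static struct ctl_fields decode_ctl(uint64_t ctl)
{
	struct ctl_fields f = {
		.enable  = (ctl & CTL_ENABLE) != 0,
		.imask   = (ctl & CTL_IMASK) != 0,
		.istatus = (ctl & CTL_ISTATUS) != 0,
	};
	return f;
}
```

With the value reported above, `decode_ctl(1)` gives enable set and istatus clear, whereas the test expects a value with bit 2 set (for example 5) once the timer condition is met.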
These machines have FEAT_ECV (as a test I disabled that in the kernel
but with the same result).
As a hack I set ARCH_TIMER_CTRL_IT_STAT in timer_save_state() when
the timer condition is met and set up traps for the register - this
lets the testcase succeed.
All with the current upstream kernel - but this is not new, I saw
this a couple of months ago but lost access to the machine before
I could debug it.
Any hints what to do here?
Sebastian
* Re: arch_timer_edge_cases failures on ampere-one
From: Marc Zyngier @ 2025-04-10 15:35 UTC (permalink / raw)
To: Sebastian Ott
Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Catalin Marinas, Will Deacon, linux-arm-kernel, kvmarm, kvm,
linux-kselftest
On Thu, 10 Apr 2025 16:10:43 +0100,
Sebastian Ott <sebott@redhat.com> wrote:
>
> Hey,
>
> I'm seeing consistent failures for the arch_timer_edge_cases
> selftest on ampere-one(x):
> ==== Test Assertion Failure ====
> arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
> pid=6277 tid=6277 errno=4 - Interrupted system call
> 1 0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
> 2 0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
> 3 0x0000ffffa8b2625b: ?? ??:0
> 4 0x0000ffffa8b2633b: ?? ??:0
> 5 0x000000000040202f: _start at ??:?
> 0x1 != 0x0 (timer_condition != istatus)
>
> The (first) test that's failing is from test_timers_in_the_past():
> /* Set a timer to counter=0 (in the past) */
> test_timer_cval(timer, 0, wm, true, DEF_CNT);
>
> If I understand this correctly then the timer condition is met, an
> irq should be raised with the istatus bit from SYS_CNTV_CTL_EL0 set.
>
> What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
> set). KVM also reads 1 in timer_save_state() via
> read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?
My hunch is that this is related to AC03_CPU_14 in [1] (now archived
locally for future reference...).
>
> These machines have FEAT_ECV (as a test I disabled that in the kernel
> but with the same result).
>
> As a hack I set ARCH_TIMER_CTRL_IT_STAT in timer_save_state() when
> the timer condition is met and set up traps for the register - this
> lets the testcase succeed.
>
> All with the current upstream kernel - but this is not new, I saw
> this a couple of months ago but lost access to the machine before
> I could debug..
>
> Any hints what to do here?
Not a lot to do, assuming this is the actual cause. Similar things
happen on my QC box, which has a "remarkable" timer implementation and
a bunch of terrible hacks to keep it alive.
On the other hand, I'm not even convinced that this test-case is
legit. It seems to rely on the counter being 64bit wide, which is not
always the case.
M.
[1] https://amperecomputing.com/assets/AmpereOne_Developer_ER_v0_80_20240823_28945022f4.pdf
--
Without deviation from the norm, progress is not possible.
* Re: arch_timer_edge_cases failures on ampere-one
From: Sebastian Ott @ 2025-04-15 17:31 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Catalin Marinas, Will Deacon, linux-arm-kernel, kvmarm, kvm,
linux-kselftest
On Thu, 10 Apr 2025, Marc Zyngier wrote:
> On Thu, 10 Apr 2025 16:10:43 +0100,
> Sebastian Ott <sebott@redhat.com> wrote:
>>
>> Hey,
>>
>> I'm seeing consistent failures for the arch_timer_edge_cases
>> selftest on ampere-one(x):
>> ==== Test Assertion Failure ====
>> arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
>> pid=6277 tid=6277 errno=4 - Interrupted system call
>> 1 0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
>> 2 0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
>> 3 0x0000ffffa8b2625b: ?? ??:0
>> 4 0x0000ffffa8b2633b: ?? ??:0
>> 5 0x000000000040202f: _start at ??:?
>> 0x1 != 0x0 (timer_condition != istatus)
>>
>> The (first) test that's failing is from test_timers_in_the_past():
>> /* Set a timer to counter=0 (in the past) */
>> test_timer_cval(timer, 0, wm, true, DEF_CNT);
>>
>> If I understand this correctly then the timer condition is met, an
>> irq should be raised with the istatus bit from SYS_CNTV_CTL_EL0 set.
>>
>> What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
>> set). KVM also reads 1 in timer_save_state() via
>> read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?
>
> My hunch is that this is related to AC03_CPU_14 in [1] (now archived
> locally for future reference...).
Thanks! Your hunch seems to point in the right direction. If the diff is
smaller than 2^63 this is working as expected. I'll try to figure out some
changes that let the test succeed on ampere while still doing meaningful
tests.
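The 2^63 observation can be illustrated with a small model. This is only a sketch: `cond_unsigned()` follows the architectural rule (condition met once the counter has reached CVAL, compared as unsigned 64-bit values), while `cond_within_half_range()` is a hypothetical model that merely matches the behaviour observed here. It is an assumption, not taken from the erratum document, and both function names are made up:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Architectural rule: the timer condition is met once the counter
 * has reached the compare value, both taken as unsigned 64-bit.
 */
static bool cond_unsigned(uint64_t cnt, uint64_t cval)
{
	return cnt >= cval;
}

/*
 * HYPOTHETICAL model of the observed behaviour (an assumption):
 * the condition is only reported while the wrapped difference
 * cnt - cval is smaller than 2^63.
 */
static bool cond_within_half_range(uint64_t cnt, uint64_t cval)
{
	uint64_t diff = cnt - cval;	/* wraps modulo 2^64 */

	return diff < (UINT64_C(1) << 63);
}
```

For CVAL = 0 and a counter of 1000 both models agree that the condition is met. For CVAL = 0 and a counter just above 2^63 only `cond_unsigned()` reports it, which is the kind of mismatch a test with a timer far in the past would trip over.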
Sebastian