kvm.vger.kernel.org archive mirror
* arch_timer_edge_cases failures on ampere-one
@ 2025-04-10 15:10 Sebastian Ott
  2025-04-10 15:35 ` Marc Zyngier
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Ott @ 2025-04-10 15:10 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon
  Cc: linux-arm-kernel, kvmarm, kvm, linux-kselftest

Hey,

I'm seeing consistent failures for the arch_timer_edge_cases
selftest on ampere-one(x):
==== Test Assertion Failure ====
   arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
   pid=6277 tid=6277 errno=4 - Interrupted system call
      1  0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
      2  0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
      3  0x0000ffffa8b2625b: ?? ??:0
      4  0x0000ffffa8b2633b: ?? ??:0
      5  0x000000000040202f: _start at ??:?
   0x1 != 0x0 (timer_condition != istatus)

The (first) test that's failing is from test_timers_in_the_past():
     /* Set a timer to counter=0 (in the past) */
     test_timer_cval(timer, 0, wm, true, DEF_CNT);

If I understand this correctly, the timer condition is met, so an
IRQ should be raised with the ISTATUS bit of SYS_CNTV_CTL_EL0 set.

What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
set). KVM also reads 1 in timer_save_state() via
read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?
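
A minimal sketch of how that register value decodes, assuming the
architectural CNTV_CTL_EL0 layout (ENABLE in bit 0, IMASK in bit 1,
ISTATUS in bit 2). The macro, struct, and function names below are
illustrative only and are not taken from the selftest or from KVM:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed architectural bit positions of CNTV_CTL_EL0. */
#define TIMER_CTL_ENABLE  (UINT64_C(1) << 0)
#define TIMER_CTL_IMASK   (UINT64_C(1) << 1)
#define TIMER_CTL_ISTATUS (UINT64_C(1) << 2)

/* Hypothetical helper type: one flag per control bit. */
struct timer_ctl {
	bool enable;
	bool imask;
	bool istatus;
};

/* Split a raw control register value into its three flags. */
static struct timer_ctl decode_timer_ctl(uint64_t ctl)
{
	struct timer_ctl out = {
		.enable  = (ctl & TIMER_CTL_ENABLE) != 0,
		.imask   = (ctl & TIMER_CTL_IMASK) != 0,
		.istatus = (ctl & TIMER_CTL_ISTATUS) != 0,
	};
	return out;
}
```

With this decoding, the observed value 1 means the timer is enabled and
unmasked but ISTATUS is clear, whereas an enabled timer whose condition
is met would be expected to read back with bit 2 set as well.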

These machines have FEAT_ECV (as a test I disabled that in the kernel
but with the same result).

As a hack I set ARCH_TIMER_CTRL_IT_STAT in timer_save_state() when
the timer condition is met and set up traps for the register - this
lets the testcase succeed.

All of this is with the current upstream kernel, but it is not new: I
saw this a couple of months ago but lost access to the machine before
I could debug it.

Any hints what to do here?

Sebastian



* Re: arch_timer_edge_cases failures on ampere-one
  2025-04-10 15:10 arch_timer_edge_cases failures on ampere-one Sebastian Ott
@ 2025-04-10 15:35 ` Marc Zyngier
  2025-04-15 17:31   ` Sebastian Ott
  0 siblings, 1 reply; 3+ messages in thread
From: Marc Zyngier @ 2025-04-10 15:35 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kselftest

On Thu, 10 Apr 2025 16:10:43 +0100,
Sebastian Ott <sebott@redhat.com> wrote:
> 
> Hey,
> 
> I'm seeing consistent failures for the arch_timer_edge_cases
> selftest on ampere-one(x):
> ==== Test Assertion Failure ====
>   arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
>   pid=6277 tid=6277 errno=4 - Interrupted system call
>      1  0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
>      2  0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
>      3  0x0000ffffa8b2625b: ?? ??:0
>      4  0x0000ffffa8b2633b: ?? ??:0
>      5  0x000000000040202f: _start at ??:?
>   0x1 != 0x0 (timer_condition != istatus)
> 
> The (first) test that's failing is from test_timers_in_the_past():
>     /* Set a timer to counter=0 (in the past) */
>     test_timer_cval(timer, 0, wm, true, DEF_CNT);
> 
> If I understand this correctly then the timer condition is met, an
> irq should be raised with the istatus bit from SYS_CNTV_CTL_EL0 set.
> 
> What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
> set). KVM also reads 1 in timer_save_state() via
> read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?

My hunch is that this is related to AC03_CPU_14 in [1] (now archived
locally for future reference...).

> 
> These machines have FEAT_ECV (as a test I disabled that in the kernel
> but with the same result).
> 
> As a hack I set ARCH_TIMER_CTRL_IT_STAT in timer_save_state() when
> the timer condition is met and set up traps for the register - this
> lets the testcase succeed.
> 
> All with the current upstream kernel - but this is not new, I saw
> this a couple of months ago but lost access to the machine before
> I could debug..
> 
> Any hints what to do here?

Not a lot to do, assuming this is the actual cause. Similar things
happen on my QC box, which has a "remarkable" timer implementation and
a bunch of terrible hacks to keep it alive.

On the other hand, I'm not even convinced that this test-case is
legit. It seems to rely on the counter being 64 bits wide, which is
not always the case.
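
A rough sketch of why counter width matters, assuming a counter
narrower than 64 bits is zero-extended when it is compared against the
64-bit compare value. The width parameter and both function names are
hypothetical and only serve the illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest value a counter of the given width can hold. */
static uint64_t counter_max(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * Under the zero-extension assumption, a compare value above the
 * counter's maximum can never be reached by counting up.
 */
static bool cval_reachable(uint64_t cval, unsigned int bits)
{
	return cval <= counter_max(bits);
}
```

In this model a compare value near the top of the 64-bit range is
reachable on a 64-bit counter but not on, say, a 56-bit one, so a test
built around such values implicitly assumes the full width.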

	M.

[1] https://amperecomputing.com/assets/AmpereOne_Developer_ER_v0_80_20240823_28945022f4.pdf

-- 
Without deviation from the norm, progress is not possible.


* Re: arch_timer_edge_cases failures on ampere-one
  2025-04-10 15:35 ` Marc Zyngier
@ 2025-04-15 17:31   ` Sebastian Ott
  0 siblings, 0 replies; 3+ messages in thread
From: Sebastian Ott @ 2025-04-15 17:31 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kselftest

On Thu, 10 Apr 2025, Marc Zyngier wrote:
> On Thu, 10 Apr 2025 16:10:43 +0100,
> Sebastian Ott <sebott@redhat.com> wrote:
>>
>> Hey,
>>
>> I'm seeing consistent failures for the arch_timer_edge_cases
>> selftest on ampere-one(x):
>> ==== Test Assertion Failure ====
>>   arm64/arch_timer_edge_cases.c:170: timer_condition == istatus
>>   pid=6277 tid=6277 errno=4 - Interrupted system call
>>      1  0x0000000000403bcf: test_run at arch_timer_edge_cases.c:962
>>      2  0x0000000000401f1f: main at arch_timer_edge_cases.c:1083
>>      3  0x0000ffffa8b2625b: ?? ??:0
>>      4  0x0000ffffa8b2633b: ?? ??:0
>>      5  0x000000000040202f: _start at ??:?
>>   0x1 != 0x0 (timer_condition != istatus)
>>
>> The (first) test that's failing is from test_timers_in_the_past():
>>     /* Set a timer to counter=0 (in the past) */
>>     test_timer_cval(timer, 0, wm, true, DEF_CNT);
>>
>> If I understand this correctly then the timer condition is met, an
>> irq should be raised with the istatus bit from SYS_CNTV_CTL_EL0 set.
>>
>> What the guest gets for SYS_CNTV_CTL_EL0 is 1 (only the enable bit
>> set). KVM also reads 1 in timer_save_state() via
>> read_sysreg_el0(SYS_CNTV_CTL). Is this a HW/FW issue?
>
> My hunch is that this is related to AC03_CPU_14 in [1] (now archived
> locally for future reference...).

Thanks! Your hunch seems to point in the right direction. If the diff
between the counter and the compare value is smaller than 2^63, the
timer works as expected. I'll try to figure out some changes that let
the test succeed on ampere while still doing meaningful tests.
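
A small model of that observation, as a sketch only. It assumes the
reference behaviour is an unsigned 64-bit compare of counter against
compare value, and contrasts it with a hypothetical implementation that
treats the 64-bit difference as signed. The second function is a guess
that happens to match the 2^63 boundary; it is not a description of the
erratum or of the actual hardware:

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference model (assumed): condition met once counter >= cval. */
static bool cond_unsigned(uint64_t cnt, uint64_t cval)
{
	return cnt >= cval;
}

/*
 * Hypothetical model: the wrapped difference is read as a signed
 * value, so the condition is met only while its top bit is clear.
 */
static bool cond_signed_diff(uint64_t cnt, uint64_t cval)
{
	return ((cnt - cval) >> 63) == 0;
}
```

Whenever the counter is ahead of the compare value by less than 2^63,
the two models agree. They diverge once the difference reaches 2^63:
the unsigned model still reports the condition as met, the signed one
does not.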

Sebastian


