* [REGRESSION][BISECTED] Long boot time with Xen HVM guests during PV spinlock initialization
From: Teddy Astie @ 2026-06-08 10:29 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org, regressions; +Cc: Xen-devel, Olivier Lambert



Hello,

On 6.12.5+ kernels on AMD CPUs, we observe abnormally long boot times,
with the guest stalling during PV spinlock initialization.

This starts with 6.12.5. It also occurs with more recent kernels on
Intel platforms; that case hasn't been fully investigated yet, but I
assume it's a variant of the same issue.

The regression was bisected to the backport of 76031d9 ("clocksource:
Make negative motion detection more robust").

Some (Claude-based) analysis suggests this is related to max_raw_delta
not being set properly in the jiffies clocksource, which appears to
keep the clock from progressing meaningfully.

Here is a raw summary of the analysis:
 > We tracked it down to a single stable backport in 6.12.5: commit 
1a678f6829a8 ("clocksource: Make negative motion detection more robust", 
upstream 76031d9536a0). It introduces a max_raw_delta field on struct 
clocksource but never initializes it for the default boot timekeeper 
(the jiffies clocksource), so clocksource_delta() clamps every delta to 
0 and CLOCK_MONOTONIC freezes while that clocksource is active. On this 
HVM guest, SMP bring-up runs while the jiffies clocksource is still the 
timekeeper, and the Xen single shot (high resolution) tick then advances 
jiffies far too slowly, so the secondary CPUs burn seconds in 
calibrate_delay().
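
The clamping described in the summary can be modelled in user space. This
is only a sketch under stated assumptions, not the kernel code: the
function names are hypothetical, the 32-bit mask is assumed for the jiffies
clocksource, and the clamp is assumed to return 0 whenever the masked delta
exceeds the bound.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space model of the clamp described above (an assumption about its
 * shape, not a copy of the kernel function): a masked delta larger than
 * max_delta is treated as negative motion and reported as 0.
 */
static uint64_t model_clocksource_delta(uint64_t now, uint64_t last,
					uint64_t mask, uint64_t max_delta)
{
	uint64_t delta = (now - last) & mask;

	return delta > max_delta ? 0 : delta;
}

/*
 * Hypothetical helper: advance a counter by one per tick and sum up the
 * deltas a timekeeper would accumulate from it.
 */
static uint64_t model_elapsed(uint64_t max_delta, unsigned int ticks)
{
	const uint64_t mask = 0xffffffffULL;	/* assumed 32-bit mask */
	uint64_t last = 0, total = 0;

	for (unsigned int i = 1; i <= ticks; i++) {
		uint64_t now = i;

		total += model_clocksource_delta(now, last, mask, max_delta);
		last = now;
	}
	return total;
}
```

With max_delta left at 0, model_elapsed(0, 100) accumulates nothing, which
matches the "clock fails to progress" symptom. With any bound of at least
one tick, the same 100 ticks accumulate normally.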

Teddy


* Re: [REGRESSION][BISECTED] Long boot time with Xen HVM guests during PV spinlock initialization
From: Juergen Gross @ 2026-06-08 10:57 UTC (permalink / raw)
  To: Teddy Astie, linux-kernel@vger.kernel.org, regressions,
	Thomas Gleixner
  Cc: Xen-devel, Olivier Lambert



Add Thomas Gleixner (author of the patch introducing the regression).


Juergen

On 08.06.26 12:29, Teddy Astie wrote:
> Hello,
> 
> In 6.12.5+ kernels on AMD CPUs, we observe abnormally long boot times where the 
> guest is struggling on PV spinlock initialization.
> 
> This occurs starting with 6.12.5, and also on more recent kernels on Intel 
> platforms, but that hasn't been fully investigated at this time (but I assume 
> it's a variant of the same issue).
> 
> This occurs since a backport of 76031d9 ("clocksource: Make negative motion 
> detection more robust").
> 
> Some (claude-based) analysis made appears to relate that to the lack of proper 
> max_raw_delta in the jiffies clocksource which appears to make the clock fail to 
> progress meaningfully.
> 
> Here is a raw summary of the analysis
>  > We tracked it down to a single stable backport in 6.12.5: commit 1a678f6829a8 
> ("clocksource: Make negative motion detection more robust", upstream 
> 76031d9536a0). It introduces a max_raw_delta field on struct clocksource but 
> never initializes it for the default boot timekeeper (the jiffies clocksource), 
> so clocksource_delta() clamps every delta to 0 and CLOCK_MONOTONIC freezes while 
> that clocksource is active. On this HVM guest, SMP bring-up runs while the 
> jiffies clocksource is still the timekeeper, and the Xen single shot (high 
> resolution) tick then advances jiffies far too slowly, so the secondary CPUs 
> burn seconds in calibrate_delay().
> 
> Teddy



* Re: [REGRESSION][BISECTED] Long boot time with Xen HVM guests during PV spinlock initialization
From: Thomas Gleixner @ 2026-06-08 15:13 UTC (permalink / raw)
  To: Teddy Astie, linux-kernel@vger.kernel.org, regressions
  Cc: Xen-devel, Olivier Lambert

On Mon, Jun 08 2026 at 12:29, Teddy Astie wrote:
> In 6.12.5+ kernels on AMD CPUs, we observe abnormally long boot times 
> where the guest is struggling on PV spinlock initialization.
>
> This occurs starting with 6.12.5, and also on more recent kernels on 
> Intel platforms, but that hasn't been fully investigated at this time 
> (but I assume it's a variant of the same issue).
>
> This occurs since a backport of 76031d9 ("clocksource: Make negative 
> motion detection more robust").
>
> Some (claude-based) analysis made appears to relate that to the lack of 
> proper max_raw_delta in the jiffies clocksource which appears to make 
> the clock fail to progress meaningfully.
>
> Here is a raw summary of the analysis
>  > We tracked it down to a single stable backport in 6.12.5: commit 
> 1a678f6829a8 ("clocksource: Make negative motion detection more robust", 
> upstream 76031d9536a0). It introduces a max_raw_delta field on struct 
> clocksource but never initializes it for the default boot timekeeper 
> (the jiffies clocksource), so clocksource_delta() clamps every delta to 
> 0 and CLOCK_MONOTONIC freezes while that clocksource is active.

Bah. jiffies clocksource is registered way _after_ timekeeping started to
use it.

The untested below should fix that.

Thanks,

        tglx
---
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -60,15 +60,9 @@ EXPORT_SYMBOL(get_jiffies_64);
 
 EXPORT_SYMBOL(jiffies);
 
-static int __init init_jiffies_clocksource(void)
-{
-	return __clocksource_register(&clocksource_jiffies);
-}
-
-core_initcall(init_jiffies_clocksource);
-
 struct clocksource * __init __weak clocksource_default_clock(void)
 {
+	clocksource_register(&clocksource_jiffies);
 	return &clocksource_jiffies;
 }
 



* Re: [REGRESSION][BISECTED] Long boot time with Xen HVM guests during PV spinlock initialization
From: Thomas Gleixner @ 2026-06-08 21:29 UTC (permalink / raw)
  To: Teddy Astie, linux-kernel@vger.kernel.org, regressions
  Cc: Xen-devel, Olivier Lambert

On Mon, Jun 08 2026 at 17:13, Thomas Gleixner wrote:
> On Mon, Jun 08 2026 at 12:29, Teddy Astie wrote:
>> In 6.12.5+ kernels on AMD CPUs, we observe abnormally long boot times 
>> where the guest is struggling on PV spinlock initialization.
>>
>> This occurs starting with 6.12.5, and also on more recent kernels on 
>> Intel platforms, but that hasn't been fully investigated at this time 
>> (but I assume it's a variant of the same issue).
>>
>> This occurs since a backport of 76031d9 ("clocksource: Make negative 
>> motion detection more robust").
>>
>> Some (claude-based) analysis made appears to relate that to the lack of 
>> proper max_raw_delta in the jiffies clocksource which appears to make 
>> the clock fail to progress meaningfully.
>>
>> Here is a raw summary of the analysis
>>  > We tracked it down to a single stable backport in 6.12.5: commit 
>> 1a678f6829a8 ("clocksource: Make negative motion detection more robust", 
>> upstream 76031d9536a0). It introduces a max_raw_delta field on struct 
>> clocksource but never initializes it for the default boot timekeeper 
>> (the jiffies clocksource), so clocksource_delta() clamps every delta to 
>> 0 and CLOCK_MONOTONIC freezes while that clocksource is active.
>
> Bah. jiffies clocksource is registered way _after_ timekeeping started to
> use it.
>
> The untested below should fix that.

That obviously needs to be:

--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -60,15 +60,9 @@ EXPORT_SYMBOL(get_jiffies_64);
 
 EXPORT_SYMBOL(jiffies);
 
-static int __init init_jiffies_clocksource(void)
-{
-	return __clocksource_register(&clocksource_jiffies);
-}
-
-core_initcall(init_jiffies_clocksource);
-
 struct clocksource * __init __weak clocksource_default_clock(void)
 {
+	__clocksource_register(&clocksource_jiffies);
 	return &clocksource_jiffies;
 }
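
The ordering problem the patch addresses can be sketched in user space.
This is a model under assumptions, not kernel code: struct
model_clocksource and all model_* names are hypothetical, registration is
assumed to be the step that sets max_raw_delta, and the bound used here is
only an illustrative nonzero value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct clocksource with only the fields used. */
struct model_clocksource {
	uint64_t mask;
	uint64_t max_raw_delta;	/* stays 0 until registration runs */
};

static struct model_clocksource model_jiffies = { .mask = 0xffffffffULL };

/* Stand-in for registration, assumed to be where the bound is computed. */
static void model_register(struct model_clocksource *cs)
{
	/* Illustrative nonzero bound derived from the mask (assumption). */
	cs->max_raw_delta = (cs->mask >> 1) + (cs->mask >> 2);
}

/*
 * Old ordering: the default clock is handed out as is, and registration
 * only happens later from an initcall.
 */
static struct model_clocksource *model_default_clock_old(void)
{
	return &model_jiffies;
}

/* New ordering: register first, then hand the clocksource out. */
static struct model_clocksource *model_default_clock_new(void)
{
	model_register(&model_jiffies);
	return &model_jiffies;
}
```

In this model, a caller of model_default_clock_old() sees max_raw_delta
still at 0, so every delta would be clamped until the later initcall runs.
model_default_clock_new() returns a clocksource whose bound is already set.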
 
