* [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context
@ 2026-05-20 20:15 Waiman Long
  2026-06-03  2:26 ` Waiman Long
  2026-06-03  7:56 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 6+ messages in thread
From: Waiman Long @ 2026-05-20 20:15 UTC
  To: Thomas Gleixner, Andrew Morton, Sebastian Andrzej Siewior,
	Clark Williams, Steven Rostedt
  Cc: linux-kernel, linux-rt-devel, Waiman Long

When booting a debug PREEMPT_RT kernel on an arm64 system with a Grace
processor, the following lockdep warning was reported during early boot.

  ================================
  WARNING: inconsistent lock state
  7.1.0-rc4-test+ #1 Not tainted
  --------------------------------
  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
  swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
  ffff0000803346a0 (&n->list_lock){?.+.}-{3:3}, at: get_from_partial_node+0x74/0xa0
    :
  Call trace:
    :
   rt_spin_lock+0xa0/0x400
   get_from_partial_node+0x74/0xa0
   ___slab_alloc+0x94/0x4f8
   kmem_cache_alloc_noprof+0x2d4/0x598
   kmem_alloc_batch+0x54/0x170
   fill_pool+0x12c/0x438
   debug_objects_fill_pool+0x58/0x60
   debug_object_activate+0xfc/0x3d0
   add_timer_on+0x250/0x3a0
   add_interrupt_randomness+0x2d4/0x340
   handle_percpu_devid_irq+0x2e0/0x4e0
   handle_irq_desc+0xc0/0x120
   generic_handle_domain_irq+0x20/0x40
   __gic_handle_irq_from_irqson.isra.0+0x3c4/0x708
   gic_handle_irq+0x7c/0xe0
   call_on_irq_stack+0x30/0x48
   do_interrupt_handler+0x134/0x158
   el1_interrupt+0x48/0xb0
    :

The {IN-HARDIRQ-W} usage happens when debug_objects_fill_pool() calls
fill_pool() in the hardirq context during early boot. It is caused by the
"system_state < SYSTEM_SCHEDULING" check in debug_objects_fill_pool()
which allows fill_pool() to be called from any context during early
boot before scheduling is enabled.

Calling fill_pool() from any context is problematic because a deadlock
can happen, even though the early boot window should be pretty short.
Fix that by restricting the fill_pool() call to in_task() context
during early boot.

Fixes: 06e0ae988f6e ("debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

 [v3] Rebased on top of tip/urgent/core & trim call trace.

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 772ddabcbe7d..76bfc2571591 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -736,11 +736,15 @@ static void debug_objects_fill_pool(void)
 
 	/*
 	 * On RT enabled kernels the pool refill must happen in preemptible
-	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
-	 * on the fact that spinlock_t and raw_spinlock_t are basically the
-	 * same type and this lock-type inversion works just fine.
+	 * context and not enqueued on an rt_mutex or in task context during
+	 * early boot before scheduling starts.
+	 *
+	 * For !RT kernels we rely on the fact that spinlock_t and
+	 * raw_spinlock_t are basically the same type and this lock-type
+	 * inversion works just fine.
 	 */
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
+	    (system_state < SYSTEM_SCHEDULING && in_task()) ||
 	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
 		/*
 		 * Annotate away the spinlock_t inside raw_spinlock_t warning
-- 
2.54.0
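
The effect of the one-line change above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: struct ctx,
may_fill_old(), may_fill_v3() and check_early_hardirq() are hypothetical
names, and the bool fields merely stand in for the kernel predicates
named in the comments. It is not code from lib/debugobjects.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the kernel predicates used in the condition. */
struct ctx {
	bool preempt_rt;    /* IS_ENABLED(CONFIG_PREEMPT_RT) */
	bool early_boot;    /* system_state < SYSTEM_SCHEDULING */
	bool in_task;       /* in_task() */
	bool preemptible;   /* preemptible() */
	bool pi_blocked_on; /* debug_objects_is_pi_blocked_on() */
};

/* Condition before the patch: early boot allows a refill from any context. */
bool may_fill_old(const struct ctx *c)
{
	return !c->preempt_rt || c->early_boot ||
	       (c->preemptible && !c->pi_blocked_on);
}

/* Condition with the v3 patch: early boot allows a refill only in task context. */
bool may_fill_v3(const struct ctx *c)
{
	return !c->preempt_rt ||
	       (c->early_boot && c->in_task) ||
	       (c->preemptible && !c->pi_blocked_on);
}

/* The scenario from the lockdep splat: RT kernel, early boot, hardirq. */
void check_early_hardirq(void)
{
	struct ctx c = {
		.preempt_rt = true,
		.early_boot = true,
		.in_task = false,
		.preemptible = false,
		.pi_blocked_on = false,
	};

	assert(may_fill_old(&c));  /* old condition let the refill through */
	assert(!may_fill_v3(&c));  /* v3 condition rejects it */
}
```

Calling check_early_hardirq() exercises exactly the case in the call
trace; early boot task context is still allowed by both forms.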


* Re: [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context
  2026-05-20 20:15 [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context Waiman Long
@ 2026-06-03  2:26 ` Waiman Long
  2026-06-03  7:56 ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 6+ messages in thread
From: Waiman Long @ 2026-06-03  2:26 UTC
  To: Thomas Gleixner, Andrew Morton, Sebastian Andrzej Siewior,
	Clark Williams, Steven Rostedt
  Cc: linux-kernel, linux-rt-devel

Ping. Any comment on this patch?

Cheers,
Longman


* Re: [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context
  2026-05-20 20:15 [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context Waiman Long
  2026-06-03  2:26 ` Waiman Long
@ 2026-06-03  7:56 ` Sebastian Andrzej Siewior
  2026-06-03 14:59   ` Waiman Long
  2026-06-03 17:07   ` Thomas Gleixner
  1 sibling, 2 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-06-03  7:56 UTC
  To: Waiman Long
  Cc: Thomas Gleixner, Andrew Morton, Clark Williams, Steven Rostedt,
	linux-kernel, linux-rt-devel

On 2026-05-20 16:15:09 [-0400], Waiman Long wrote:
> When booting a debug PREEMPT_RT kernel on an arm64 system with a Grace
> processor, the following lockdep warning was reported during early boot.
> 
>   ================================
>   WARNING: inconsistent lock state
>   7.1.0-rc4-test+ #1 Not tainted
>   --------------------------------
>   inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>   swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>   ffff0000803346a0 (&n->list_lock){?.+.}-{3:3}, at: get_from_partial_node+0x74/0xa0
>     :
>   Call trace:
>     :
>    rt_spin_lock+0xa0/0x400
>    get_from_partial_node+0x74/0xa0
>    ___slab_alloc+0x94/0x4f8
>    kmem_cache_alloc_noprof+0x2d4/0x598
>    kmem_alloc_batch+0x54/0x170
>    fill_pool+0x12c/0x438
>    debug_objects_fill_pool+0x58/0x60
>    debug_object_activate+0xfc/0x3d0
>    add_timer_on+0x250/0x3a0
>    add_interrupt_randomness+0x2d4/0x340
>    handle_percpu_devid_irq+0x2e0/0x4e0
>    handle_irq_desc+0xc0/0x120
>    generic_handle_domain_irq+0x20/0x40
>    __gic_handle_irq_from_irqson.isra.0+0x3c4/0x708
>    gic_handle_irq+0x7c/0xe0
>    call_on_irq_stack+0x30/0x48
>    do_interrupt_handler+0x134/0x158
>    el1_interrupt+0x48/0xb0
>     :

What about:

  During early boot, interrupts are enabled before the scheduler is.
  In this window (before SYSTEM_SCHEDULING is set) interrupts can fire
  and attempt to fill the pool from within hardirq context. This can
  lead to a deadlock if the interrupt occurred while in the memory
  allocator.

  Reorder the exception rule and forbid this scenario by excluding
  allocations from hardirq.

…
> Fixes: 06e0ae988f6e ("debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  	/*
>  	 * On RT enabled kernels the pool refill must happen in preemptible
> -	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
> -	 * on the fact that spinlock_t and raw_spinlock_t are basically the
> -	 * same type and this lock-type inversion works just fine.
> +	 * context and not enqueued on an rt_mutex or in task context during
> +	 * early boot before scheduling starts.
> +	 *
> +	 * For !RT kernels we rely on the fact that spinlock_t and
> +	 * raw_spinlock_t are basically the same type and this lock-type
> +	 * inversion works just fine.
>  	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
> +	    (system_state < SYSTEM_SCHEDULING && in_task()) ||
>  	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
>  		/*
>  		 * Annotate away the spinlock_t inside raw_spinlock_t warning

I updated the comment to explain in more detail why each check is
there.
I reordered the whole thing to start with the pi-blocked-on part, since
that check is always valid. It shouldn't trigger during early boot, but
I think it is easier to read that way. Then we restrict it to the
preemptible case, which can be overruled by the SYSTEM_SCHEDULING
exception as long as we are not in hardirq. It looks easier to parse
and hopefully brings an end to this.

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index b18a682fe3da2..2adfe2a79a086 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -736,12 +736,17 @@ static void debug_objects_fill_pool(void)
 
 	/*
 	 * On RT enabled kernels the pool refill must happen in preemptible
-	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
-	 * on the fact that spinlock_t and raw_spinlock_t are basically the
-	 * same type and this lock-type inversion works just fine.
+	 * context and not while blocking on a lock which can trigger recursion
+	 * during PI. During system boot (before scheduling) preemption is
+	 * disabled and the pool gets exhausted. Without scheduling a deadlock
+	 * is not possible if allocations from interrupt context are excluded.
+	 * For !RT kernels we rely on the fact that spinlock_t and
+	 * raw_spinlock_t are basically the same type and this lock-type
+	 * inversion works just fine.
 	 */
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
-	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
+	    !debug_objects_is_pi_blocked_on() &&
+	    (preemptible() || (system_state < SYSTEM_SCHEDULING && !in_hardirq()))) {
 		/*
 		 * Annotate away the spinlock_t inside raw_spinlock_t warning
 		 * by temporarily raising the wait-type to LD_WAIT_CONFIG, matching

Sebastian

* Re: [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context
  2026-06-03  7:56 ` Sebastian Andrzej Siewior
@ 2026-06-03 14:59   ` Waiman Long
  2026-06-03 15:30     ` Sebastian Andrzej Siewior
  2026-06-03 17:07   ` Thomas Gleixner
  1 sibling, 1 reply; 6+ messages in thread
From: Waiman Long @ 2026-06-03 14:59 UTC
  To: Sebastian Andrzej Siewior
  Cc: Thomas Gleixner, Andrew Morton, Clark Williams, Steven Rostedt,
	linux-kernel, linux-rt-devel

On 6/3/26 3:56 AM, Sebastian Andrzej Siewior wrote:
> On 2026-05-20 16:15:09 [-0400], Waiman Long wrote:
>> When booting a debug PREEMPT_RT kernel on an arm64 system with a Grace
>> processor, the following lockdep warning was reported during early boot.
>>
>>    ================================
>>    WARNING: inconsistent lock state
>>    7.1.0-rc4-test+ #1 Not tainted
>>    --------------------------------
>>    inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>>    swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>>    ffff0000803346a0 (&n->list_lock){?.+.}-{3:3}, at: get_from_partial_node+0x74/0xa0
>>      :
>>    Call trace:
>>      :
>>     rt_spin_lock+0xa0/0x400
>>     get_from_partial_node+0x74/0xa0
>>     ___slab_alloc+0x94/0x4f8
>>     kmem_cache_alloc_noprof+0x2d4/0x598
>>     kmem_alloc_batch+0x54/0x170
>>     fill_pool+0x12c/0x438
>>     debug_objects_fill_pool+0x58/0x60
>>     debug_object_activate+0xfc/0x3d0
>>     add_timer_on+0x250/0x3a0
>>     add_interrupt_randomness+0x2d4/0x340
>>     handle_percpu_devid_irq+0x2e0/0x4e0
>>     handle_irq_desc+0xc0/0x120
>>     generic_handle_domain_irq+0x20/0x40
>>     __gic_handle_irq_from_irqson.isra.0+0x3c4/0x708
>>     gic_handle_irq+0x7c/0xe0
>>     call_on_irq_stack+0x30/0x48
>>     do_interrupt_handler+0x134/0x158
>>     el1_interrupt+0x48/0xb0
>>      :
> What about:
>
>    During early boot, interrupts are enabled before the scheduler is.
>    In this window (before SYSTEM_SCHEDULING is set) interrupts can fire
>    and attempt to fill the pool from within hardirq context. This can
>    lead to a deadlock if the interrupt occurred while in the memory
>    allocator.
>
>    Reorder the exception rule and forbid this scenario by excluding
>    allocations from hardirq.
Yes, the debug_objects_is_pi_blocked_on() check should cover the 
system_state check as well.
>
> …
>> Fixes: 06e0ae988f6e ("debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
> …
>>   	/*
>>   	 * On RT enabled kernels the pool refill must happen in preemptible
>> -	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
>> -	 * on the fact that spinlock_t and raw_spinlock_t are basically the
>> -	 * same type and this lock-type inversion works just fine.
>> +	 * context and not enqueued on an rt_mutex or in task context during
>> +	 * early boot before scheduling starts.
>> +	 *
>> +	 * For !RT kernels we rely on the fact that spinlock_t and
>> +	 * raw_spinlock_t are basically the same type and this lock-type
>> +	 * inversion works just fine.
>>   	 */
>> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
>> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
>> +	    (system_state < SYSTEM_SCHEDULING && in_task()) ||
>>   	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
>>   		/*
>>   		 * Annotate away the spinlock_t inside raw_spinlock_t warning
> I updated the comment to explain in more detail why each check is
> there.
> I reordered the whole thing to start with the pi-blocked-on part, since
> that check is always valid. It shouldn't trigger during early boot, but
> I think it is easier to read that way. Then we restrict it to the
> preemptible case, which can be overruled by the SYSTEM_SCHEDULING
> exception as long as we are not in hardirq. It looks easier to parse
> and hopefully brings an end to this.
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index b18a682fe3da2..2adfe2a79a086 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -736,12 +736,17 @@ static void debug_objects_fill_pool(void)
>   
>   	/*
>   	 * On RT enabled kernels the pool refill must happen in preemptible
> -	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
> -	 * on the fact that spinlock_t and raw_spinlock_t are basically the
> -	 * same type and this lock-type inversion works just fine.
> +	 * context and not while blocking on a lock which can trigger recursion
> +	 * during PI. During system boot (before scheduling) preemption is
> +	 * disabled and the pool gets exhausted. Without scheduling a deadlock
> +	 * is not possible if allocations from interrupt context are excluded.
> +	 * For !RT kernels we rely on the fact that spinlock_t and
> +	 * raw_spinlock_t are basically the same type and this lock-type
> +	 * inversion works just fine.
>   	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
> -	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
> +	    !debug_objects_is_pi_blocked_on() &&
> +	    (preemptible() || (system_state < SYSTEM_SCHEDULING && !in_hardirq()))) {
>   		/*
>   		 * Annotate away the spinlock_t inside raw_spinlock_t warning
>   		 * by temporarily raising the wait-type to LD_WAIT_CONFIG, matching

I guess softirq won't be active during early boot. If in_nmi() is true,
we are screwed on non-RT kernels as well. So checking only for
in_hardirq() should be fine. I will adopt your suggestion and send a new
version.

Thanks,
Longman

> Sebastian
>


* Re: [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context
  2026-06-03 14:59   ` Waiman Long
@ 2026-06-03 15:30     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-06-03 15:30 UTC
  To: Waiman Long
  Cc: Thomas Gleixner, Andrew Morton, Clark Williams, Steven Rostedt,
	linux-kernel, linux-rt-devel

On 2026-06-03 10:59:55 [-0400], Waiman Long wrote:
> I guess softirq won't be active during early boot. If in_nmi() is true, we
> are screwed for non-RT kernel as well. So only checking for in_hardirq()
> should be fine. I will adopt your suggestion and send a new version.

Softirqs shouldn't be active, but if they are, they are either deferred
to ksoftirqd or run inline from local_bh_enable(), which is fine.
And for in_nmi() we are screwed anyway, yes :)
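
The difference between in_task() and !in_hardirq() discussed here can
be illustrated with a simplified model. The CTX_* bits and the model_*()
helpers are hypothetical; the kernel derives the real predicates from
preempt_count() and the details differ. Roughly, in_task() is false in
NMI, hardirq and softirq-serving context, while in_hardirq() looks only
at the hardirq part.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context bits, loosely modelled on preempt_count() fields. */
enum {
	CTX_SOFTIRQ = 1 << 0, /* serving a softirq */
	CTX_HARDIRQ = 1 << 1, /* in a hard interrupt handler */
	CTX_NMI     = 1 << 2, /* in an NMI handler */
	CTX_ALL     = CTX_SOFTIRQ | CTX_HARDIRQ | CTX_NMI,
};

/* True only when the hardirq bit is set. */
bool model_in_hardirq(unsigned int ctx)
{
	assert((ctx & ~(unsigned int)CTX_ALL) == 0);
	return (ctx & CTX_HARDIRQ) != 0;
}

/* True only when none of the interrupt-like bits is set. */
bool model_in_task(unsigned int ctx)
{
	assert((ctx & ~(unsigned int)CTX_ALL) == 0);
	return (ctx & CTX_ALL) == 0;
}
```

In this model a softirq-serving context fails model_in_task() but
passes !model_in_hardirq(), which is the only case where the two checks
would decide differently outside of NMI.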

> Thanks,
> Longman
> 
Sebastian

* Re: [PATCH-tip v3] debugobjects: Don't call fill_pool() in early boot non-task context
  2026-06-03  7:56 ` Sebastian Andrzej Siewior
  2026-06-03 14:59   ` Waiman Long
@ 2026-06-03 17:07   ` Thomas Gleixner
  1 sibling, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2026-06-03 17:07 UTC
  To: Sebastian Andrzej Siewior, Waiman Long
  Cc: Andrew Morton, Clark Williams, Steven Rostedt, linux-kernel,
	linux-rt-devel

On Wed, Jun 03 2026 at 09:56, Sebastian Andrzej Siewior wrote:
> On 2026-05-20 16:15:09 [-0400], Waiman Long wrote:
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index b18a682fe3da2..2adfe2a79a086 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -736,12 +736,17 @@ static void debug_objects_fill_pool(void)
>  
>  	/*
>  	 * On RT enabled kernels the pool refill must happen in preemptible
> -	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
> -	 * on the fact that spinlock_t and raw_spinlock_t are basically the
> -	 * same type and this lock-type inversion works just fine.
> +	 * context and not while blocking on a lock which can trigger recursion
> +	 * during PI. During system boot (before scheduling) preemption is
> +	 * disabled and the pool gets exhausted. Without scheduling a deadlock
> +	 * is not possible if allocations from interrupt context are excluded.
> +	 * For !RT kernels we rely on the fact that spinlock_t and
> +	 * raw_spinlock_t are basically the same type and this lock-type
> +	 * inversion works just fine.
>  	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
> -	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
> +	    !debug_objects_is_pi_blocked_on() &&
> +	    (preemptible() || (system_state < SYSTEM_SCHEDULING && !in_hardirq()))) {

This whole thing is unreadable gunk by now and I really can't decode the
correctness of the condition without reading it five times. Something
like the below:

--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -720,6 +720,34 @@ static inline bool debug_objects_is_pi_b
 #endif
 }
 
+static inline bool can_fill_pool(void)
+{
+	/*
+	 * On !RT enabled kernels there are no restrictions and spinlock_t and
+	 * raw_spinlock_t are the same types.
+	 */
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		return true;
+
+	/*
+	 * On RT enabled kernels the pool refill must happen in preemptible
+	 * context and the task must not be blocked on a lock as that could
+	 * corrupt the PI state when blocking on a lock in the allocation path.
+	 */
+	if (preemptible() && !debug_objects_is_pi_blocked_on())
+		return true;
+
+	/*
+	 * Though during system boot before scheduling is set up, preemption is
+	 * disabled and the pool can get exhausted. Before scheduling is active
+	 * a task cannot be blocked on a sleeping lock, but it might hold a lock
+	 * and if interrupted then hard interrupt context might run into a lock
+	 * inversion. So exclude hard interrupt context from allocations before
+	 * scheduling is active.
+	 */
+	return system_state < SYSTEM_SCHEDULING && !in_hardirq();
+}
+
 static void debug_objects_fill_pool(void)
 {
 	if (!static_branch_likely(&obj_cache_enabled))
@@ -734,18 +762,11 @@ static void debug_objects_fill_pool(void
 	if (likely(!pool_should_refill(&pool_global)))
 		return;
 
-	/*
-	 * On RT enabled kernels the pool refill must happen in preemptible
-	 * context and not enqueued on an rt_mutex -- for !RT kernels we rely
-	 * on the fact that spinlock_t and raw_spinlock_t are basically the
-	 * same type and this lock-type inversion works just fine.
-	 */
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || system_state < SYSTEM_SCHEDULING ||
-	    (preemptible() && !debug_objects_is_pi_blocked_on())) {
+	if (can_fill_pool()) {
 		/*
 		 * Annotate away the spinlock_t inside raw_spinlock_t warning
 		 * by temporarily raising the wait-type to LD_WAIT_CONFIG, matching
-		 * the preemptible() condition above.
+		 * the preemptible() condition in can_fill_pool().
 		 */
 		static DEFINE_WAIT_OVERRIDE_MAP(fill_pool_map, LD_WAIT_CONFIG);
 		lock_map_acquire_try(&fill_pool_map);
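
The early-return helper above and the single-expression condition
proposed earlier in the thread are close but not identical. A userspace
sketch can enumerate the inputs to see where they differ. Everything
below is an assumption for illustration: struct fill_ctx and the model
functions are hypothetical, with bool fields standing in for the kernel
predicates named in the comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the kernel predicates; names are illustrative. */
struct fill_ctx {
	bool preempt_rt;    /* IS_ENABLED(CONFIG_PREEMPT_RT) */
	bool preemptible;   /* preemptible() */
	bool pi_blocked_on; /* debug_objects_is_pi_blocked_on() */
	bool early_boot;    /* system_state < SYSTEM_SCHEDULING */
	bool in_hardirq;    /* in_hardirq() */
};

/* Early-return form, mirroring can_fill_pool() in the diff above. */
bool model_can_fill_pool(const struct fill_ctx *c)
{
	if (!c->preempt_rt)
		return true;
	if (c->preemptible && !c->pi_blocked_on)
		return true;
	return c->early_boot && !c->in_hardirq;
}

/* Single-expression form from earlier in the thread, fully parenthesised. */
bool model_inline_condition(const struct fill_ctx *c)
{
	return !c->preempt_rt ||
	       (!c->pi_blocked_on &&
		(c->preemptible || (c->early_boot && !c->in_hardirq)));
}

/* Count the input combinations on which the two forms disagree. */
int count_disagreements(void)
{
	int n = 0;

	for (unsigned int bits = 0; bits < 32; bits++) {
		struct fill_ctx c = {
			.preempt_rt    = bits & 1,
			.preemptible   = bits & 2,
			.pi_blocked_on = bits & 4,
			.early_boot    = bits & 8,
			.in_hardirq    = bits & 16,
		};

		/* The helper is never stricter than the inline form. */
		assert(!model_inline_condition(&c) || model_can_fill_pool(&c));

		if (model_can_fill_pool(&c) != model_inline_condition(&c))
			n++;
	}
	return n;
}
```

In this model the two forms disagree only when the task is PI-blocked
during early boot outside hardirq context. The helper's last comment
argues that a task cannot be blocked on a sleeping lock before
scheduling is active, so that combination should not arise in practice.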
