netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG]? xfrm: INFO: inconsistent lock state
@ 2010-02-01 10:01 Wei Yongjun
  2010-02-02  7:49 ` Herbert Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Wei Yongjun @ 2010-02-01 10:01 UTC (permalink / raw)
  To: netdev@vger.kernel.org; +Cc: Herbert Xu

I got following lock INFO in my Fedora12 box uner 2.6.33-rc6.
Is this a BUG?
 
=================================
[ INFO: inconsistent lock state ]
2.6.33-rc6 #120
---------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (&(&x->lock)->rlock){?.-...}, at: [<c072d03f>] xfrm_input+0xdf/0x5c0
{IN-HARDIRQ-W} state was registered at:
  [<c04718dc>] __lock_acquire+0xa9c/0x1890
  [<c047274f>] lock_acquire+0x7f/0xf0
  [<c0762958>] _raw_spin_lock+0x38/0x50
  [<c072b5ca>] xfrm_timer_handler+0x3a/0x260
  [<c0447d9d>] __hrtimer_tasklet_trampoline+0xd/0x10
  [<c04634ce>] hrtimer_run_queues+0x15e/0x2a0
  [<c045146d>] run_local_timers+0xd/0x20
  [<c04514b4>] update_process_times+0x34/0x70
  [<c046ce8a>] tick_periodic+0x2a/0x80
  [<c046cefe>] tick_handle_periodic+0x1e/0x90
  [<c0768377>] smp_apic_timer_interrupt+0x57/0x8b
  [<c076382f>] apic_timer_interrupt+0x2f/0x34
  [<c0401d3b>] cpu_idle+0x4b/0x80
  [<c074e0d7>] rest_init+0x67/0x70
  [<c0956874>] start_kernel+0x30e/0x314
  [<c095609e>] i386_start_kernel+0x9e/0xa5
irq event stamp: 1113374
hardirqs last  enabled at (1113374): [<c0449288>] local_bh_enable_ip+0x68/0xd0
hardirqs last disabled at (1113373): [<c0449248>] local_bh_enable_ip+0x28/0xd0
softirqs last  enabled at (1113338): [<c0449570>] __do_softirq+0x100/0x1c0
softirqs last disabled at (1113343): [<c0404e17>] do_softirq+0x97/0xe0

other info that might help us debug this:
3 locks held by swapper/0:
 #0:  (rcu_read_lock){.+.+..}, at: [<c06b8426>] net_rx_action+0xd6/0x230
 #1:  (rcu_read_lock){.+.+..}, at: [<c06b77c8>] netif_receive_skb+0x138/0x570
 #2:  (rcu_read_lock){.+.+..}, at: [<d1c5d540>] ip6_input_finish+0x0/0x390 [ipv6]

stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.33-rc6 #120
Call Trace:
 [<c075f984>] ? printk+0x1d/0x21
 [<c046fb8f>] print_usage_bug+0x15f/0x1a0
 [<c046ff8a>] mark_lock+0x3ba/0x5c0
 [<c04701f2>] ? mark_held_locks+0x62/0x90
 [<c0473540>] ? check_usage_backwards+0x0/0xd0
 [<c04717ef>] __lock_acquire+0x9af/0x1890
 [<c05a0cac>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c07633fc>] ? restore_all_notrace+0x0/0x18
 [<c04701f2>] ? mark_held_locks+0x62/0x90
 [<c047274f>] lock_acquire+0x7f/0xf0
 [<c072d03f>] ? xfrm_input+0xdf/0x5c0
 [<c0762958>] _raw_spin_lock+0x38/0x50
 [<c072d03f>] ? xfrm_input+0xdf/0x5c0
 [<c072d03f>] xfrm_input+0xdf/0x5c0
 [<d1c88929>] xfrm6_rcv_spi+0x29/0x30 [ipv6]
 [<d1c8894c>] xfrm6_rcv+0x1c/0x20 [ipv6]
 [<d1c5d681>] ip6_input_finish+0x141/0x390 [ipv6]
 [<d1c5d540>] ? ip6_input_finish+0x0/0x390 [ipv6]
 [<d1c5d92f>] ip6_input+0x5f/0x70 [ipv6]
 [<d1c5de78>] ipv6_rcv+0x338/0x440 [ipv6]
 [<d1c5db6e>] ? ipv6_rcv+0x2e/0x440 [ipv6]
 [<c06b79df>] netif_receive_skb+0x34f/0x570
 [<c06b77c8>] ? netif_receive_skb+0x138/0x570
 [<c06affde>] ? __alloc_skb+0x4e/0x130
 [<c04080b0>] ? nommu_sync_single_for_device+0x0/0x10
 [<d087ec5e>] pcnet32_poll+0x27e/0x7a0 [pcnet32]
 [<c0450797>] ? run_timer_softirq+0x37/0x2c0
 [<c0763067>] ? _raw_spin_unlock_irq+0x27/0x30
 [<c06b84a0>] net_rx_action+0x150/0x230
 [<c06b8426>] ? net_rx_action+0xd6/0x230
 [<c04494c8>] ? __do_softirq+0x58/0x1c0
 [<c0449510>] __do_softirq+0xa0/0x1c0
 [<c0449470>] ? __do_softirq+0x0/0x1c0
 <IRQ>  [<c0448ce5>] ? irq_exit+0x45/0x50
 [<c076837c>] ? smp_apic_timer_interrupt+0x5c/0x8b
 [<c05a0cbc>] ? trace_hardirqs_off_thunk+0xc/0x10
 [<c076382f>] ? apic_timer_interrupt+0x2f/0x34
 [<c0409de3>] ? default_idle+0x43/0xa0
 [<c0401d3b>] ? cpu_idle+0x4b/0x80
 [<c074e0d7>] ? rest_init+0x67/0x70
 [<c0956874>] ? start_kernel+0x30e/0x314
 [<c0956390>] ? unknown_bootoption+0x0/0x1a0
 [<c095609e>] ? i386_start_kernel+0x9e/0xa5



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG]? xfrm: INFO: inconsistent lock state
  2010-02-01 10:01 [BUG]? xfrm: INFO: inconsistent lock state Wei Yongjun
@ 2010-02-02  7:49 ` Herbert Xu
  2010-02-02 13:51   ` Yury Polyanskiy
  0 siblings, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2010-02-02  7:49 UTC (permalink / raw)
  To: Wei Yongjun; +Cc: netdev@vger.kernel.org, David S. Miller, Yury Polyanskiy

On Mon, Feb 01, 2010 at 06:01:20PM +0800, Wei Yongjun wrote:
> I got following lock INFO in my Fedora12 box uner 2.6.33-rc6.
> Is this a BUG?

Apparently.

This is introduced by

commit 9e0d57fd6dad37d72a3ca6db00ca8c76f2215454
Author: Yury Polyanskiy <polyanskiy@gmail.com>
Date:   Sun Nov 8 20:58:41 2009 -0800

    xfrm: SAD entries do not expire correctly after suspend-resume

To be honest I never liked that patch since the semantic change
makes sense for suspend-resume but is totally the wrong thing to
do when the system clock is stepped/changed.

Whether this is a real dead-lock depends on whether hrtimers
behave as real hard IRQs.  If they do then yes it can deadlock.

So my suggestion would be to just revert this change and consider
an alternative solution to the original suspend-resume problem.

My preference would be something that is triggered through power-
management as that would then allows us to differentiate between
the suspend-resume case and the clock change case.

It could even be triggered from user-space as it knows when an
resume occurs.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [BUG]? xfrm: INFO: inconsistent lock state
  2010-02-02  7:49 ` Herbert Xu
@ 2010-02-02 13:51   ` Yury Polyanskiy
  2010-02-02 14:20     ` [PATCH] hrtimer, softirq: Fix hrtimer->softirq trampoline Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Yury Polyanskiy @ 2010-02-02 13:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Wei Yongjun, netdev@vger.kernel.org, David S. Miller, peterz,
	polyanskiy

[-- Attachment #1: Type: text/plain, Size: 1537 bytes --]

On Tue, 2 Feb 2010 18:49:14 +1100
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Mon, Feb 01, 2010 at 06:01:20PM +0800, Wei Yongjun wrote:
> > I got following lock INFO in my Fedora12 box uner 2.6.33-rc6.
> > Is this a BUG?
> 
> Apparently.
> 
> This is introduced by
> 
> commit 9e0d57fd6dad37d72a3ca6db00ca8c76f2215454
> Author: Yury Polyanskiy <polyanskiy@gmail.com>
> Date:   Sun Nov 8 20:58:41 2009 -0800
> 
>     xfrm: SAD entries do not expire correctly after suspend-resume
>  
> Whether this is a real dead-lock depends on whether hrtimers
> behave as real hard IRQs.  If they do then yes it can deadlock.

If hrtimer_tasklet interface functions properly, the
xfrm_timer_handler should be called in softirq context (and thus is
never in parallel with xfrm_input()). The deadlock isn't possible then.

In this case it seems that for some reason xfrm_timer_handler() is
called in the hardirq context. The relevant code in hrtimer_tasklet:

static enum hrtimer_restart __hrtimer_tasklet_trampoline(struct hrtimer *timer)
{
	struct tasklet_hrtimer *ttimer =
		container_of(timer, struct tasklet_hrtimer, timer);

	if (hrtimer_is_hres_active(timer)) {
		tasklet_hi_schedule(&ttimer->tasklet);
		return HRTIMER_NORESTART;
	}
	return ttimer->function(timer);
}

I am copying Peter on this. Peter, how is it possible that
ttimer->function() is called in hardirq?

Could it be that switch from hres_active happened after the call to
trampoline and before the if() above?

Best,
Yury

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH]  hrtimer, softirq: Fix hrtimer->softirq trampoline
  2010-02-02 13:51   ` Yury Polyanskiy
@ 2010-02-02 14:20     ` Peter Zijlstra
  2010-02-02 14:28       ` Yury Polyanskiy
                         ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Peter Zijlstra @ 2010-02-02 14:20 UTC (permalink / raw)
  To: Yury Polyanskiy
  Cc: Herbert Xu, Wei Yongjun, netdev@vger.kernel.org, David S. Miller,
	polyanskiy, Thomas Gleixner, lkml

On Tue, 2010-02-02 at 08:51 -0500, Yury Polyanskiy wrote:

> If hrtimer_tasklet interface functions properly, the
> xfrm_timer_handler should be called in softirq context (and thus is
> never in parallel with xfrm_input()). The deadlock isn't possible then.
> 
> In this case it seems that for some reason xfrm_timer_handler() is
> called in the hardirq context. The relevant code in hrtimer_tasklet:
> 
> static enum hrtimer_restart __hrtimer_tasklet_trampoline(struct hrtimer *timer)
> {
> 	struct tasklet_hrtimer *ttimer =
> 		container_of(timer, struct tasklet_hrtimer, timer);
> 
> 	if (hrtimer_is_hres_active(timer)) {
> 		tasklet_hi_schedule(&ttimer->tasklet);
> 		return HRTIMER_NORESTART;
> 	}
> 	return ttimer->function(timer);
> }
> 
> I am copying Peter on this. Peter, how is it possible that
> ttimer->function() is called in hardirq?
> 
> Could it be that switch from hres_active happened after the call to
> trampoline and before the if() above?

The original email had more information:

> {IN-HARDIRQ-W} state was registered at:
>   [<c04718dc>] __lock_acquire+0xa9c/0x1890
>   [<c047274f>] lock_acquire+0x7f/0xf0
>   [<c0762958>] _raw_spin_lock+0x38/0x50
>   [<c072b5ca>] xfrm_timer_handler+0x3a/0x260
>   [<c0447d9d>] __hrtimer_tasklet_trampoline+0xd/0x10
>   [<c04634ce>] hrtimer_run_queues+0x15e/0x2a0
>   [<c045146d>] run_local_timers+0xd/0x20
>   [<c04514b4>] update_process_times+0x34/0x70
>   [<c046ce8a>] tick_periodic+0x2a/0x80
>   [<c046cefe>] tick_handle_periodic+0x1e/0x90
>   [<c0768377>] smp_apic_timer_interrupt+0x57/0x8b
>   [<c076382f>] apic_timer_interrupt+0x2f/0x34
>   [<c0401d3b>] cpu_idle+0x4b/0x80
>   [<c074e0d7>] rest_init+0x67/0x70
>   [<c0956874>] start_kernel+0x30e/0x314
>   [<c095609e>] i386_start_kernel+0x9e/0xa5

Which indicates we were called from hardirq context, it appears that
that hrtimer_is_hres_active() case is indeed faulty. Not sure if I made
a mistake when I wrote that or if we changed hrtimer behaviour
afterwards, but the hrtimer fallback is still from hardirq context.

Which would seem to suggest the following patch:

---
Subject: hrtimer, softirq: Fix hrtimer->softirq trampoline

hrtimers callbacks are always done from hardirq context, either the
jiffy tick interrupt or the hrtimer device interrupt.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/softirq.c |   13 +++----------
 1 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index a09502e..c1983b7 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -500,22 +500,15 @@ EXPORT_SYMBOL(tasklet_kill);
  */
 
 /*
- * The trampoline is called when the hrtimer expires. If this is
- * called from the hrtimer interrupt then we schedule the tasklet as
- * the timer callback function expects to run in softirq context. If
- * it's called in softirq context anyway (i.e. high resolution timers
- * disabled) then the hrtimer callback is called right away.
+ * The trampoline is called when the hrtimer expires. 
  */
 static enum hrtimer_restart __hrtimer_tasklet_trampoline(struct hrtimer *timer)
 {
 	struct tasklet_hrtimer *ttimer =
 		container_of(timer, struct tasklet_hrtimer, timer);
 
-	if (hrtimer_is_hres_active(timer)) {
-		tasklet_hi_schedule(&ttimer->tasklet);
-		return HRTIMER_NORESTART;
-	}
-	return ttimer->function(timer);
+	tasklet_hi_schedule(&ttimer->tasklet);
+	return HRTIMER_NORESTART;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH]  hrtimer, softirq: Fix hrtimer->softirq trampoline
  2010-02-02 14:20     ` [PATCH] hrtimer, softirq: Fix hrtimer->softirq trampoline Peter Zijlstra
@ 2010-02-02 14:28       ` Yury Polyanskiy
  2010-02-02 14:35         ` Peter Zijlstra
  2010-02-03  1:47       ` Wei Yongjun
  2010-02-03 16:56       ` David Miller
  2 siblings, 1 reply; 8+ messages in thread
From: Yury Polyanskiy @ 2010-02-02 14:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Herbert Xu, Wei Yongjun, netdev@vger.kernel.org, David S. Miller,
	polyanskiy, Thomas Gleixner, lkml

[-- Attachment #1: Type: text/plain, Size: 571 bytes --]

On Tue, 02 Feb 2010 15:20:01 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
>  static enum hrtimer_restart __hrtimer_tasklet_trampoline(struct hrtimer *timer)
>  {
>  	struct tasklet_hrtimer *ttimer =
>  		container_of(timer, struct tasklet_hrtimer, timer);
>  
> -	if (hrtimer_is_hres_active(timer)) {
> -		tasklet_hi_schedule(&ttimer->tasklet);
> -		return HRTIMER_NORESTART;
> -	}
> -	return ttimer->function(timer);
> +	tasklet_hi_schedule(&ttimer->tasklet);
> +	return HRTIMER_NORESTART;
>  }
>  

Are you totally against if(in_irq())?

Yury

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH]  hrtimer, softirq: Fix hrtimer->softirq trampoline
  2010-02-02 14:28       ` Yury Polyanskiy
@ 2010-02-02 14:35         ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2010-02-02 14:35 UTC (permalink / raw)
  To: Yury Polyanskiy
  Cc: Herbert Xu, Wei Yongjun, netdev@vger.kernel.org, David S. Miller,
	polyanskiy, Thomas Gleixner, lkml

On Tue, 2010-02-02 at 09:28 -0500, Yury Polyanskiy wrote:
> On Tue, 02 Feb 2010 15:20:01 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> >  static enum hrtimer_restart __hrtimer_tasklet_trampoline(struct hrtimer *timer)
> >  {
> >  	struct tasklet_hrtimer *ttimer =
> >  		container_of(timer, struct tasklet_hrtimer, timer);
> >  
> > -	if (hrtimer_is_hres_active(timer)) {
> > -		tasklet_hi_schedule(&ttimer->tasklet);
> > -		return HRTIMER_NORESTART;
> > -	}
> > -	return ttimer->function(timer);
> > +	tasklet_hi_schedule(&ttimer->tasklet);
> > +	return HRTIMER_NORESTART;
> >  }
> >  
> 
> Are you totally against if(in_irq())?

Yeah, things like that are an indication that you really don't know wtf
you're doing and are just patching up.

There is a single site where hrtimer callbacks can indeed be done from
softirq, but in that case the above still works correctly, and I've been
meaning to get rid of that anyway.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH]  hrtimer, softirq: Fix hrtimer->softirq trampoline
  2010-02-02 14:20     ` [PATCH] hrtimer, softirq: Fix hrtimer->softirq trampoline Peter Zijlstra
  2010-02-02 14:28       ` Yury Polyanskiy
@ 2010-02-03  1:47       ` Wei Yongjun
  2010-02-03 16:56       ` David Miller
  2 siblings, 0 replies; 8+ messages in thread
From: Wei Yongjun @ 2010-02-03  1:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yury Polyanskiy, Herbert Xu, netdev@vger.kernel.org,
	David S. Miller, polyanskiy, Thomas Gleixner, lkml

Peter Zijlstra wrote:
> On Tue, 2010-02-02 at 08:51 -0500, Yury Polyanskiy wrote:
>
> The original email had more information:
>
>   
>> {IN-HARDIRQ-W} state was registered at:
>>   [<c04718dc>] __lock_acquire+0xa9c/0x1890
>>   [<c047274f>] lock_acquire+0x7f/0xf0
>>   [<c0762958>] _raw_spin_lock+0x38/0x50
>>   [<c072b5ca>] xfrm_timer_handler+0x3a/0x260
>>   [<c0447d9d>] __hrtimer_tasklet_trampoline+0xd/0x10
>>   [<c04634ce>] hrtimer_run_queues+0x15e/0x2a0
>>   [<c045146d>] run_local_timers+0xd/0x20
>>   [<c04514b4>] update_process_times+0x34/0x70
>>   [<c046ce8a>] tick_periodic+0x2a/0x80
>>   [<c046cefe>] tick_handle_periodic+0x1e/0x90
>>   [<c0768377>] smp_apic_timer_interrupt+0x57/0x8b
>>   [<c076382f>] apic_timer_interrupt+0x2f/0x34
>>   [<c0401d3b>] cpu_idle+0x4b/0x80
>>   [<c074e0d7>] rest_init+0x67/0x70
>>   [<c0956874>] start_kernel+0x30e/0x314
>>   [<c095609e>] i386_start_kernel+0x9e/0xa5
>>     
>
> Which indicates we were called from hardirq context, it appears that
> that hrtimer_is_hres_active() case is indeed faulty. Not sure if I made
> a mistake when I wrote that or if we changed hrtimer behaviour
> afterwards, but the hrtimer fallback is still from hardirq context.
>
> Which would seem to suggest the following patch:
>
> ---
> Subject: hrtimer, softirq: Fix hrtimer->softirq trampoline
>
> hrtimers callbacks are always done from hardirq context, either the
> jiffy tick interrupt or the hrtimer device interrupt.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>
>   

With this patch, the inconsistent lock state INFO is gone. Thanks.

Wei Yongjun


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] hrtimer, softirq: Fix hrtimer->softirq trampoline
  2010-02-02 14:20     ` [PATCH] hrtimer, softirq: Fix hrtimer->softirq trampoline Peter Zijlstra
  2010-02-02 14:28       ` Yury Polyanskiy
  2010-02-03  1:47       ` Wei Yongjun
@ 2010-02-03 16:56       ` David Miller
  2 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2010-02-03 16:56 UTC (permalink / raw)
  To: peterz; +Cc: ypolyans, herbert, yjwei, netdev, polyanskiy, tglx, linux-kernel

From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 02 Feb 2010 15:20:01 +0100

> Subject: hrtimer, softirq: Fix hrtimer->softirq trampoline
> 
> hrtimers callbacks are always done from hardirq context, either the
> jiffy tick interrupt or the hrtimer device interrupt.
> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

Acked-by: David S. Miller <davem@davemloft.net>

It would be nice to give mention of the bug reporter et al.
in the final commit message.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-02-03 16:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-01 10:01 [BUG]? xfrm: INFO: inconsistent lock state Wei Yongjun
2010-02-02  7:49 ` Herbert Xu
2010-02-02 13:51   ` Yury Polyanskiy
2010-02-02 14:20     ` [PATCH] hrtimer, softirq: Fix hrtimer->softirq trampoline Peter Zijlstra
2010-02-02 14:28       ` Yury Polyanskiy
2010-02-02 14:35         ` Peter Zijlstra
2010-02-03  1:47       ` Wei Yongjun
2010-02-03 16:56       ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).