inconsistent lock state

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* inconsistent lock state
@ 2010-03-07 10:21 Sergey Senozhatsky
  0 siblings, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-07 10:21 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 5293 bytes --]

Hello,

Hardly reproducible.
/*
* 2.6.33. x86. ASUS f3jc 
*/

[329645.010697] =================================
[329645.010699] [ INFO: inconsistent lock state ]
[329645.010703] 2.6.33-33-0-dbg #31
[329645.010705] ---------------------------------
[329645.010708] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[329645.010712] events/0/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
[329645.010715]  (&(&table->hash[i].lock)->rlock){+.?.-.}, at: [<c12c1892>] spin_lock+0x8/0xa
[329645.010729] {IN-SOFTIRQ-W} state was registered at:
[329645.010732]   [<c105878f>] __lock_acquire+0x27e/0xb86
[329645.010739]   [<c1059988>] lock_acquire+0xa1/0xb8
[329645.010744]   [<c12f39b2>] _raw_spin_lock+0x28/0x58
[329645.010750]   [<c12c1892>] spin_lock+0x8/0xa
[329645.010755]   [<c12c2c8f>] T.958+0x3c/0x141
[329645.010759]   [<c12c2f7a>] __udp4_lib_rcv+0x1e6/0x3cd
[329645.010764]   [<c12c3173>] udp_rcv+0x12/0x14
[329645.010769]   [<c12a559b>] ip_local_deliver_finish+0xc9/0x130
[329645.010774]   [<c12a5663>] ip_local_deliver+0x61/0x66
[329645.010778]   [<c12a51a5>] ip_rcv_finish+0x275/0x29d
[329645.010783]   [<c12a539a>] ip_rcv+0x1cd/0x1ed
[329645.010786]   [<c128858d>] netif_receive_skb+0x340/0x360
[329645.010791]   [<fd1ab6f4>] rtl8169_rx_interrupt+0x2bf/0x37e [r8169]
[329645.010801]   [<fd1adab8>] rtl8169_poll+0x29/0x15a [r8169]
[329645.010808]   [<c1288b8c>] net_rx_action+0x95/0x1af
[329645.010812]   [<c1037243>] __do_softirq+0xc6/0x187
[329645.010819]   [<c103732f>] do_softirq+0x2b/0x43
[329645.010823]   [<c10374bf>] irq_exit+0x38/0x75
[329645.010828]   [<c1004116>] do_IRQ+0x88/0x9c
[329645.010833]   [<c1002f35>] common_interrupt+0x35/0x3c
[329645.010837]   [<c1275d0b>] cpuidle_idle_call+0x72/0xd3
[329645.010844]   [<c1001c1b>] cpu_idle+0x92/0xbf
[329645.010848]   [<c12e3f16>] rest_init+0x76/0x78
[329645.010853]   [<c14c2868>] start_kernel+0x33c/0x341
[329645.010859]   [<c14c2092>] i386_start_kernel+0x92/0x99
[329645.010864] irq event stamp: 157782307
[329645.010866] hardirqs last  enabled at (157782307): [<c10b74b8>] kmem_cache_free+0x97/0xd6
[329645.010873] hardirqs last disabled at (157782306): [<c10b745a>] kmem_cache_free+0x39/0xd6
[329645.010878] softirqs last  enabled at (157782304): [<c12928ef>] rcu_read_unlock_bh+0x1c/0x1e
[329645.010885] softirqs last disabled at (157782302): [<c129289b>] rcu_read_lock_bh+0x8/0x26
[329645.010892] 
[329645.010893] other info that might help us debug this:
[329645.010896] 5 locks held by events/0/9:
[329645.010898]  #0:  (events){+.+.+.}, at: [<c1044f2f>] worker_thread+0x16a/0x27c
[329645.010908]  #1:  ((&(&tp->task)->work)){+.+...}, at: [<c1044f2f>] worker_thread+0x16a/0x27c
[329645.010915]  #2:  (rtnl_mutex){+.+.+.}, at: [<c12911dd>] rtnl_lock+0xf/0x11
[329645.010922]  #3:  (rcu_read_lock){.+.+..}, at: [<c1286a01>] rcu_read_lock+0x0/0x2b
[329645.010931]  #4:  (rcu_read_lock){.+.+..}, at: [<c12a4ed0>] rcu_read_lock+0x0/0x2b
[329645.010938] 
[329645.010939] stack backtrace:
[329645.010942] Pid: 9, comm: events/0 Not tainted 2.6.33-33-0-dbg #31
[329645.010945] Call Trace:
[329645.010950]  [<c12f18ec>] ? printk+0xf/0x11
[329645.010955]  [<c1057800>] valid_state+0x12a/0x13d
[329645.010960]  [<c1057904>] mark_lock+0xf1/0x1e2
[329645.010965]  [<c1057ffa>] ? check_usage_backwards+0x0/0x6f
[329645.010970]  [<c10587fd>] __lock_acquire+0x2ec/0xb86
[329645.010976]  [<c100864f>] ? native_sched_clock+0x48/0x8d
[329645.010982]  [<c104d24b>] ? sched_clock_local+0x17/0x11e
[329645.010987]  [<c12c1892>] ? spin_lock+0x8/0xa
[329645.010992]  [<c1059988>] lock_acquire+0xa1/0xb8
[329645.010997]  [<c12c1892>] ? spin_lock+0x8/0xa
[329645.011002]  [<c12f39b2>] _raw_spin_lock+0x28/0x58
[329645.011006]  [<c12c1892>] ? spin_lock+0x8/0xa
[329645.011010]  [<c12c1892>] spin_lock+0x8/0xa
[329645.011015]  [<c12c2c8f>] T.958+0x3c/0x141
[329645.011020]  [<c104d472>] ? sched_clock_cpu+0x120/0x128
[329645.011025]  [<c100864f>] ? native_sched_clock+0x48/0x8d
[329645.011031]  [<c1059088>] ? __lock_acquire+0xb77/0xb86
[329645.011037]  [<c12a1345>] ? rcu_read_unlock+0x0/0x35
[329645.011042]  [<c12a38c0>] ? ip_route_input+0x102/0xacb
[329645.011046]  [<c1057831>] ? mark_lock+0x1e/0x1e2
[329645.011051]  [<c12a4ed0>] ? rcu_read_lock+0x0/0x2b
[329645.011056]  [<c12c2f7a>] __udp4_lib_rcv+0x1e6/0x3cd
[329645.011061]  [<c12c3173>] udp_rcv+0x12/0x14
[329645.011065]  [<c12a559b>] ip_local_deliver_finish+0xc9/0x130
[329645.011070]  [<c12a5663>] ip_local_deliver+0x61/0x66
[329645.011074]  [<c12a51a5>] ip_rcv_finish+0x275/0x29d
[329645.011078]  [<c12a539a>] ip_rcv+0x1cd/0x1ed
[329645.011083]  [<c128858d>] netif_receive_skb+0x340/0x360
[329645.011093]  [<fd1ab6f4>] rtl8169_rx_interrupt+0x2bf/0x37e [r8169]
[329645.011100]  [<fd1aba02>] rtl8169_reset_task+0x38/0xcd [r8169]
[329645.011105]  [<c1044f71>] worker_thread+0x1ac/0x27c
[329645.011110]  [<c1044f2f>] ? worker_thread+0x16a/0x27c
[329645.011116]  [<fd1ab9ca>] ? rtl8169_reset_task+0x0/0xcd [r8169]
[329645.011123]  [<c1048725>] ? autoremove_wake_function+0x0/0x2f
[329645.011128]  [<c1044dc5>] ? worker_thread+0x0/0x27c
[329645.011132]  [<c104838a>] kthread+0x6a/0x6f
[329645.011137]  [<c1048320>] ? kthread+0x0/0x6f
[329645.011142]  [<c1002f42>] kernel_thread_helper+0x6/0x1a



	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
       [not found] <20100307192305.GA598@elte.hu>
@ 2010-03-08 12:51 ` Oleg Nesterov
  2010-03-15 21:01   ` Eric Dumazet
  0 siblings, 1 reply; 37+ messages in thread
From: Oleg Nesterov @ 2010-03-08 12:51 UTC (permalink / raw)
  To: Ingo Molnar, Sergey Senozhatsky, Francois Romieu
  Cc: Peter Zijlstra, netdev, linux-kernel, David Miller

Sergey Senozhatsky wrote:
>
> Hello,
>
> Hardly reproducible.
> /*
> * 2.6.33. x86. ASUS f3jc
> */
>
> [329645.010697] =================================
> [329645.010699] [ INFO: inconsistent lock state ]
> [329645.010703] 2.6.33-33-0-dbg #31
> [329645.010705] ---------------------------------
> [329645.010708] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>
> ...
>
> [329645.011083]  [<c128858d>] netif_receive_skb+0x340/0x360
> [329645.011093]  [<fd1ab6f4>] rtl8169_rx_interrupt+0x2bf/0x37e [r8169]
> [329645.011100]  [<fd1aba02>] rtl8169_reset_task+0x38/0xcd [r8169]
> [329645.011105]  [<c1044f71>] worker_thread+0x1ac/0x27c

I don't understand this code, but at first glance drivers/net/r8169.c
is obviously buggy.

The work->func, rtl8169_reset_task(), calls rtl8169_rx_interrupt() ->
netif_receive_skb(), and the last one may only be called from softirq.

Oleg.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-03-08 12:51 ` inconsistent lock state Oleg Nesterov
@ 2010-03-15 21:01   ` Eric Dumazet
  2010-03-15 21:09     ` David Miller
  2010-03-16  0:33     ` [PATCH] r8169: Fix rtl8169_rx_interrupt() Eric Dumazet
  0 siblings, 2 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-03-15 21:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Sergey Senozhatsky, Francois Romieu, Peter Zijlstra,
	netdev, linux-kernel, David Miller

Le lundi 08 mars 2010 à 13:51 +0100, Oleg Nesterov a écrit :
> Sergey Senozhatsky wrote:
> >
> > Hello,
> >
> > Hardly reproducible.
> > /*
> > * 2.6.33. x86. ASUS f3jc
> > */
> >
> > [329645.010697] =================================
> > [329645.010699] [ INFO: inconsistent lock state ]
> > [329645.010703] 2.6.33-33-0-dbg #31
> > [329645.010705] ---------------------------------
> > [329645.010708] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> >
> > ...
> >
> > [329645.011083]  [<c128858d>] netif_receive_skb+0x340/0x360
> > [329645.011093]  [<fd1ab6f4>] rtl8169_rx_interrupt+0x2bf/0x37e [r8169]
> > [329645.011100]  [<fd1aba02>] rtl8169_reset_task+0x38/0xcd [r8169]
> > [329645.011105]  [<c1044f71>] worker_thread+0x1ac/0x27c
> 
> I don't understand this code, but at first glance drivers/net/r8169.c
> is obviously buggy.
> 
> The work->func, rtl8169_reset_task(), calls rtl8169_rx_interrupt() ->
> netif_receive_skb(), and the last one may only be called from softirq.
> 

Yes, this is wrong. In this context (process context, not softirq), we
should use netif_rx() or just drop frames if we are in reset phase.

Also, this driver schedules a reset in case a fifo error is reported in
rtl8169_rx_interrupt()

                if (status & RxFOVF) {
                        rtl8169_schedule_work(dev, rtl8169_reset_task);
                        dev->stats.rx_fifo_errors++;
                }


This seems very strange too : In case of a RX spike, we reset card ?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-03-15 21:01   ` Eric Dumazet
@ 2010-03-15 21:09     ` David Miller
  2010-03-16  0:33     ` [PATCH] r8169: Fix rtl8169_rx_interrupt() Eric Dumazet
  1 sibling, 0 replies; 37+ messages in thread
From: David Miller @ 2010-03-15 21:09 UTC (permalink / raw)
  To: eric.dumazet
  Cc: oleg, mingo, sergey.senozhatsky, romieu, a.p.zijlstra, netdev,
	linux-kernel

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 15 Mar 2010 22:01:05 +0100

> Also, this driver schedules a reset in case a fifo error is reported in
> rtl8169_rx_interrupt()
> 
>                 if (status & RxFOVF) {
>                         rtl8169_schedule_work(dev, rtl8169_reset_task);
>                         dev->stats.rx_fifo_errors++;
>                 }
> 
> 
> This seems very strange too : In case of a RX spike, we reset card ?

It's possible that this has been found to hang the card.

If so, it should be documented because otherwise yes we
should not be doing this.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-15 21:01   ` Eric Dumazet
  2010-03-15 21:09     ` David Miller
@ 2010-03-16  0:33     ` Eric Dumazet
  2010-03-16  6:50       ` Sergey Senozhatsky
                         ` (2 more replies)
  1 sibling, 3 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-03-16  0:33 UTC (permalink / raw)
  To: Oleg Nesterov, David Miller
  Cc: Ingo Molnar, Sergey Senozhatsky, Francois Romieu, Peter Zijlstra,
	netdev, linux-kernel

Le lundi 15 mars 2010 à 22:01 +0100, Eric Dumazet a écrit :

> Yes, this is wrong. In this context (process context, not softirq), we
> should use netif_rx() or just drop frames if we are in reset phase.
> 

Sergey,

Here is a compiled but untested patch (I dont have the hardware), could
you please test it ?

Thanks

[PATCH] r8169: Fix rtl8169_rx_interrupt()

In case a reset is performed, rtl8169_rx_interrupt() is called from
process context instead of softirq context. Special care must be taken
to call appropriate network core services (netif_rx() instead of
netif_receive_skb()). VLAN handling also corrected.

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 9d3ebf3..d873639 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	u32 opts2 = le32_to_cpu(desc->opts2);
 	struct vlan_group *vlgrp = tp->vlgrp;
 	int ret;
 
 	if (vlgrp && (opts2 & RxVlanTag)) {
-		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
+		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
 		ret = 0;
 	} else
 		ret = -1;
@@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	return -1;
 }
@@ -4429,12 +4429,20 @@ out:
 	return done;
 }
 
+/*
+ * Warning : rtl8169_rx_interrupt() might be called :
+ * 1) from NAPI (softirq) context
+ *	(polling = 1 : we should call netif_receive_skb())
+ * 2) from process context (rtl8169_reset_task())
+ *	(polling = 0 : we must call netif_rx() instead)
+ */		
 static int rtl8169_rx_interrupt(struct net_device *dev,
 				struct rtl8169_private *tp,
 				void __iomem *ioaddr, u32 budget)
 {
 	unsigned int cur_rx, rx_left;
 	unsigned int delta, count;
+	int polling = (budget != ~(u32)0) ? 1 : 0;
 
 	cur_rx = tp->cur_rx;
 	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
@@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);
 
-			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
-				netif_receive_skb(skb);
+			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
+				if (likely(polling))
+					netif_receive_skb(skb);
+				else
+					netif_rx(skb);
+			}
 
 			dev->stats.rx_bytes += pkt_size;
 			dev->stats.rx_packets++;



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16  0:33     ` [PATCH] r8169: Fix rtl8169_rx_interrupt() Eric Dumazet
@ 2010-03-16  6:50       ` Sergey Senozhatsky
  2010-03-16  7:12         ` Eric Dumazet
  2010-03-16 14:50       ` Sergey Senozhatsky
  2010-03-16 15:00       ` Sergey Senozhatsky
  2 siblings, 1 reply; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-16  6:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Sergey Senozhatsky,
	Francois Romieu, Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3237 bytes --]

Hello,

Sure. Give me a couple of days. By the way, should I test against .34
or .33?

	Sergey


On (03/16/10 01:33), Eric Dumazet wrote:
> Le lundi 15 mars 2010 à 22:01 +0100, Eric Dumazet a écrit :
> 
> > Yes, this is wrong. In this context (process context, not softirq), we
> > should use netif_rx() or just drop frames if we are in reset phase.
> > 
> 
> Sergey,
> 
> Here is a compiled but untested patch (I dont have the hardware), could
> you please test it ?
> 
> Thanks
> 
> [PATCH] r8169: Fix rtl8169_rx_interrupt()
> 
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d873639 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	u32 opts2 = le32_to_cpu(desc->opts2);
>  	struct vlan_group *vlgrp = tp->vlgrp;
>  	int ret;
>  
>  	if (vlgrp && (opts2 & RxVlanTag)) {
> -		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> +		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
>  		ret = 0;
>  	} else
>  		ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	return -1;
>  }
> @@ -4429,12 +4429,20 @@ out:
>  	return done;
>  }
>  
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + *	(polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + *	(polling = 0 : we must call netif_rx() instead)
> + */		
>  static int rtl8169_rx_interrupt(struct net_device *dev,
>  				struct rtl8169_private *tp,
>  				void __iomem *ioaddr, u32 budget)
>  {
>  	unsigned int cur_rx, rx_left;
>  	unsigned int delta, count;
> +	int polling = (budget != ~(u32)0) ? 1 : 0;
>  
>  	cur_rx = tp->cur_rx;
>  	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			skb_put(skb, pkt_size);
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> -				netif_receive_skb(skb);
> +			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> +				if (likely(polling))
> +					netif_receive_skb(skb);
> +				else
> +					netif_rx(skb);
> +			}
>  
>  			dev->stats.rx_bytes += pkt_size;
>  			dev->stats.rx_packets++;
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16  6:50       ` Sergey Senozhatsky
@ 2010-03-16  7:12         ` Eric Dumazet
  0 siblings, 0 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-03-16  7:12 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

Le mardi 16 mars 2010 à 08:50 +0200, Sergey Senozhatsky a écrit :
> Hello,
> 
> Sure. Give me a couple of days. By the way, should I test against .34
> or .33?
> 

It's an old bug anyway, so you can pick up whats is the best for you.
My personal pref would be current kernel.

To trigger the original condition (fifo overflow), you might flood your
machine from another one with small packets, for example using pktgen.

Thanks

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16  0:33     ` [PATCH] r8169: Fix rtl8169_rx_interrupt() Eric Dumazet
  2010-03-16  6:50       ` Sergey Senozhatsky
@ 2010-03-16 14:50       ` Sergey Senozhatsky
  2010-03-16 15:00       ` Sergey Senozhatsky
  2 siblings, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-16 14:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4209 bytes --]

Hello,

Eric, I did pktgen testing (3 times).

sudo cat /proc/net/pktgen/eth0 
Params: count 1000000000  min_pkt_size: 42  max_pkt_size: 42
     frags: 0  delay: 0  clone_skb: 4000000  ifname: eth0
     flows: 0 flowlen: 0
     queue_map_min: 0  queue_map_max: 0
     dst_min: 10.6.22.59  dst_max: 
        src_min:   src_max: 
     src_mac: 00:1a:92:c9:a0:68 dst_mac: 00:1a:92:c9:a0:68
     udp_src_min: 9  udp_src_max: 9  udp_dst_min: 9  udp_dst_max: 9
     src_mac_count: 0  dst_mac_count: 0
     Flags: 
Current:
     pkts-sofar: 1000000000  errors: 0
     started: 9957912410us  stopped: 16682620361us idle: 1231us
     seq_num: 1000000001  cur_dst_mac_offset: 0  cur_src_mac_offset: 0
     cur_saddr: 0x3b16060a  cur_daddr: 0x3b16060a
     cur_udp_dst: 9  cur_udp_src: 9
     cur_queue_map: 0
     flows: 0
Result: OK: 6724707951(c6724706719+d1231) nsec, 1000000000 (42byte,0frags)
  148705pps 49Mb/sec (49964880bps) errors: 0


Without any errors from the r8169 side. Can we consider testing successful?
If so, fell free to add Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>


	Sergey



On (03/16/10 01:33), Eric Dumazet wrote:
> > Yes, this is wrong. In this context (process context, not softirq), we
> > should use netif_rx() or just drop frames if we are in reset phase.
> > 
> 
> Sergey,
> 
> Here is a compiled but untested patch (I dont have the hardware), could
> you please test it ?
> 
> Thanks
> 
> [PATCH] r8169: Fix rtl8169_rx_interrupt()
> 
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d873639 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	u32 opts2 = le32_to_cpu(desc->opts2);
>  	struct vlan_group *vlgrp = tp->vlgrp;
>  	int ret;
>  
>  	if (vlgrp && (opts2 & RxVlanTag)) {
> -		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> +		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
>  		ret = 0;
>  	} else
>  		ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	return -1;
>  }
> @@ -4429,12 +4429,20 @@ out:
>  	return done;
>  }
>  
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + *	(polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + *	(polling = 0 : we must call netif_rx() instead)
> + */		
>  static int rtl8169_rx_interrupt(struct net_device *dev,
>  				struct rtl8169_private *tp,
>  				void __iomem *ioaddr, u32 budget)
>  {
>  	unsigned int cur_rx, rx_left;
>  	unsigned int delta, count;
> +	int polling = (budget != ~(u32)0) ? 1 : 0;
>  
>  	cur_rx = tp->cur_rx;
>  	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			skb_put(skb, pkt_size);
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> -				netif_receive_skb(skb);
> +			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> +				if (likely(polling))
> +					netif_receive_skb(skb);
> +				else
> +					netif_rx(skb);
> +			}
>  
>  			dev->stats.rx_bytes += pkt_size;
>  			dev->stats.rx_packets++;
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16  0:33     ` [PATCH] r8169: Fix rtl8169_rx_interrupt() Eric Dumazet
  2010-03-16  6:50       ` Sergey Senozhatsky
  2010-03-16 14:50       ` Sergey Senozhatsky
@ 2010-03-16 15:00       ` Sergey Senozhatsky
  2010-03-16 15:05         ` Eric Dumazet
  2 siblings, 1 reply; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-16 15:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5442 bytes --]

Nope!
Here it is:


[17250.998293] ------------[ cut here ]------------
[17250.998305] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
[17250.998308] Hardware name: F3JC                
[17250.998312] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[17250.998315] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci asus_laptop sparse_keymap
mmc_core led_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd_page_alloc rng_core sg evdev i2c_i801 snd soundcore psmouse r8169 serio_raw mii uhci_hcd ehci_hcd sr_mod
usbcore cdrom sd_mod ata_piix
[17250.998371] Pid: 3985, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #46
[17250.998375] Call Trace:
[17250.998383]  [<c102e353>] warn_slowpath_common+0x65/0x7c
[17250.998388]  [<c126b654>] ? dev_watchdog+0xc1/0x125
[17250.998393]  [<c102e39e>] warn_slowpath_fmt+0x24/0x27
[17250.998398]  [<c126b654>] dev_watchdog+0xc1/0x125
[17250.998405]  [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
[17250.998411]  [<c1036c11>] run_timer_softirq+0x176/0x1eb
[17250.998416]  [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
[17250.998421]  [<c126b593>] ? dev_watchdog+0x0/0x125
[17250.998426]  [<c1032df9>] __do_softirq+0x8d/0x117
[17250.998431]  [<c1032eae>] do_softirq+0x2b/0x43
[17250.998436]  [<c1032fd3>] irq_exit+0x38/0x75
[17250.998442]  [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
[17250.998448]  [<c12c812a>] apic_timer_interrupt+0x36/0x3c
[17250.998457]  [<c1185d18>] ? do_raw_spin_trylock+0x28/0x37
[17250.998464]  [<c12c7101>] _raw_spin_lock+0x2f/0x58
[17250.998472]  [<f80855ca>] ? spin_lock+0x8/0xa [pktgen]
[17250.998478]  [<f80855ca>] spin_lock+0x8/0xa [pktgen]
[17250.998484]  [<f80873b6>] pktgen_thread_worker+0x9b/0x631 [pktgen]
[17250.998491]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
[17250.998497]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
[17250.998503]  [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
[17250.998508]  [<c103f6ce>] kthread+0x6a/0x6f
[17250.998514]  [<c103f664>] ? kthread+0x0/0x6f
[17250.998520]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
[17250.998523] ---[ end trace a22d306b065d4a68 ]---
[17251.011663] r8169 0000:02:00.0: eth0: link up

[17419.011748] NOHZ: local_softirq_pending 08



	Sergey


On (03/16/10 01:33), Eric Dumazet wrote:
> > Yes, this is wrong. In this context (process context, not softirq), we
> > should use netif_rx() or just drop frames if we are in reset phase.
> > 
> 
> Sergey,
> 
> Here is a compiled but untested patch (I dont have the hardware), could
> you please test it ?
> 
> Thanks
> 
> [PATCH] r8169: Fix rtl8169_rx_interrupt()
> 
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d873639 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	u32 opts2 = le32_to_cpu(desc->opts2);
>  	struct vlan_group *vlgrp = tp->vlgrp;
>  	int ret;
>  
>  	if (vlgrp && (opts2 & RxVlanTag)) {
> -		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> +		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
>  		ret = 0;
>  	} else
>  		ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	return -1;
>  }
> @@ -4429,12 +4429,20 @@ out:
>  	return done;
>  }
>  
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + *	(polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + *	(polling = 0 : we must call netif_rx() instead)
> + */		
>  static int rtl8169_rx_interrupt(struct net_device *dev,
>  				struct rtl8169_private *tp,
>  				void __iomem *ioaddr, u32 budget)
>  {
>  	unsigned int cur_rx, rx_left;
>  	unsigned int delta, count;
> +	int polling = (budget != ~(u32)0) ? 1 : 0;
>  
>  	cur_rx = tp->cur_rx;
>  	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			skb_put(skb, pkt_size);
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> -				netif_receive_skb(skb);
> +			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> +				if (likely(polling))
> +					netif_receive_skb(skb);
> +				else
> +					netif_rx(skb);
> +			}
>  
>  			dev->stats.rx_bytes += pkt_size;
>  			dev->stats.rx_packets++;
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 15:00       ` Sergey Senozhatsky
@ 2010-03-16 15:05         ` Eric Dumazet
  2010-03-16 15:10           ` Sergey Senozhatsky
  0 siblings, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2010-03-16 15:05 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

Le mardi 16 mars 2010 à 17:00 +0200, Sergey Senozhatsky a écrit :
> Nope!
> Here it is:
> 
> 
> [17250.998293] ------------[ cut here ]------------
> [17250.998305] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> [17250.998308] Hardware name: F3JC                
> [17250.998312] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> [17250.998315] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci asus_laptop sparse_keymap
> mmc_core led_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd_page_alloc rng_core sg evdev i2c_i801 snd soundcore psmouse r8169 serio_raw mii uhci_hcd ehci_hcd sr_mod
> usbcore cdrom sd_mod ata_piix
> [17250.998371] Pid: 3985, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #46
> [17250.998375] Call Trace:
> [17250.998383]  [<c102e353>] warn_slowpath_common+0x65/0x7c
> [17250.998388]  [<c126b654>] ? dev_watchdog+0xc1/0x125
> [17250.998393]  [<c102e39e>] warn_slowpath_fmt+0x24/0x27
> [17250.998398]  [<c126b654>] dev_watchdog+0xc1/0x125
> [17250.998405]  [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
> [17250.998411]  [<c1036c11>] run_timer_softirq+0x176/0x1eb
> [17250.998416]  [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
> [17250.998421]  [<c126b593>] ? dev_watchdog+0x0/0x125
> [17250.998426]  [<c1032df9>] __do_softirq+0x8d/0x117
> [17250.998431]  [<c1032eae>] do_softirq+0x2b/0x43
> [17250.998436]  [<c1032fd3>] irq_exit+0x38/0x75
> [17250.998442]  [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
> [17250.998448]  [<c12c812a>] apic_timer_interrupt+0x36/0x3c
> [17250.998457]  [<c1185d18>] ? do_raw_spin_trylock+0x28/0x37
> [17250.998464]  [<c12c7101>] _raw_spin_lock+0x2f/0x58
> [17250.998472]  [<f80855ca>] ? spin_lock+0x8/0xa [pktgen]
> [17250.998478]  [<f80855ca>] spin_lock+0x8/0xa [pktgen]
> [17250.998484]  [<f80873b6>] pktgen_thread_worker+0x9b/0x631 [pktgen]
> [17250.998491]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> [17250.998497]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> [17250.998503]  [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
> [17250.998508]  [<c103f6ce>] kthread+0x6a/0x6f
> [17250.998514]  [<c103f664>] ? kthread+0x0/0x6f
> [17250.998520]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> [17250.998523] ---[ end trace a22d306b065d4a68 ]---
> [17251.011663] r8169 0000:02:00.0: eth0: link up
> 
> [17419.011748] NOHZ: local_softirq_pending 08
> 
> 

But this stack trace is on pktgen side (the sender), and my patch was
about the receiver ?

I wonder if you dont hit another problem :)



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 15:05         ` Eric Dumazet
@ 2010-03-16 15:10           ` Sergey Senozhatsky
  2010-03-16 15:20             ` Eric Dumazet
  0 siblings, 1 reply; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-16 15:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sergey Senozhatsky, Oleg Nesterov, David Miller, Ingo Molnar,
	Francois Romieu, Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2843 bytes --]

On (03/16/10 16:05), Eric Dumazet wrote:
> > Nope!
> > Here it is:
> > 
> > 
> > [17250.998293] ------------[ cut here ]------------
> > [17250.998305] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> > [17250.998308] Hardware name: F3JC                
> > [17250.998312] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > [17250.998315] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci asus_laptop sparse_keymap
> > mmc_core led_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd_page_alloc rng_core sg evdev i2c_i801 snd soundcore psmouse r8169 serio_raw mii uhci_hcd ehci_hcd sr_mod
> > usbcore cdrom sd_mod ata_piix
> > [17250.998371] Pid: 3985, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #46
> > [17250.998375] Call Trace:
> > [17250.998383]  [<c102e353>] warn_slowpath_common+0x65/0x7c
> > [17250.998388]  [<c126b654>] ? dev_watchdog+0xc1/0x125
> > [17250.998393]  [<c102e39e>] warn_slowpath_fmt+0x24/0x27
> > [17250.998398]  [<c126b654>] dev_watchdog+0xc1/0x125
> > [17250.998405]  [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
> > [17250.998411]  [<c1036c11>] run_timer_softirq+0x176/0x1eb
> > [17250.998416]  [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
> > [17250.998421]  [<c126b593>] ? dev_watchdog+0x0/0x125
> > [17250.998426]  [<c1032df9>] __do_softirq+0x8d/0x117
> > [17250.998431]  [<c1032eae>] do_softirq+0x2b/0x43
> > [17250.998436]  [<c1032fd3>] irq_exit+0x38/0x75
> > [17250.998442]  [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
> > [17250.998448]  [<c12c812a>] apic_timer_interrupt+0x36/0x3c
> > [17250.998457]  [<c1185d18>] ? do_raw_spin_trylock+0x28/0x37
> > [17250.998464]  [<c12c7101>] _raw_spin_lock+0x2f/0x58
> > [17250.998472]  [<f80855ca>] ? spin_lock+0x8/0xa [pktgen]
> > [17250.998478]  [<f80855ca>] spin_lock+0x8/0xa [pktgen]
> > [17250.998484]  [<f80873b6>] pktgen_thread_worker+0x9b/0x631 [pktgen]
> > [17250.998491]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> > [17250.998497]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> > [17250.998503]  [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
> > [17250.998508]  [<c103f6ce>] kthread+0x6a/0x6f
> > [17250.998514]  [<c103f664>] ? kthread+0x0/0x6f
> > [17250.998520]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [17250.998523] ---[ end trace a22d306b065d4a68 ]---
> > [17251.011663] r8169 0000:02:00.0: eth0: link up
> > 
> > [17419.011748] NOHZ: local_softirq_pending 08
> > 
> > 
> 
> But this stack trace is on pktgen side (the sender), and my patch was
> about the receiver ?
> 

[...]
>> NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Isn't it r8169 related?


> I wonder if you dont hit another problem :)
:)


	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 15:10           ` Sergey Senozhatsky
@ 2010-03-16 15:20             ` Eric Dumazet
  2010-03-16 18:26               ` Sergey Senozhatsky
  0 siblings, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2010-03-16 15:20 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

Le mardi 16 mars 2010 à 17:10 +0200, Sergey Senozhatsky a écrit :

> [...]
> >> NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> Isn't it r8169 related?
> 

This is tx side which seems blocked for more than 6 seconds.

To really test the patch, you need following setup :

machine A with r8169 NIC, patch applied, receiver of pktgen flood.

machine B with any NIC, preferably a not buggy one, doing the pktgen
flood to machine A

If machine A survives, my patch is tested and ok.

If machine B crashes, we have another problem to investigate

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 15:20             ` Eric Dumazet
@ 2010-03-16 18:26               ` Sergey Senozhatsky
  2010-03-16 18:48                 ` Eric Dumazet
  0 siblings, 1 reply; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-16 18:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2982 bytes --]

Hello,
Got it right now.

System completely froze. Even SysRq didn't work.
/*spin_lock deadlock?*/

NOTE: I'm losing network constantly with pktgen tests
[24208.010980] r8169 0000:02:00.0: eth0: link up
[24220.010980] r8169 0000:02:00.0: eth0: link up
[24232.011030] r8169 0000:02:00.0: eth0: link up
[24340.010980] r8169 0000:02:00.0: eth0: link up
[24352.010966] r8169 0000:02:00.0: eth0: link up
[24364.010966] r8169 0000:02:00.0: eth0: link up
[24376.010964] r8169 0000:02:00.0: eth0: link up
[24388.010961] r8169 0000:02:00.0: eth0: link up
[24400.010959] r8169 0000:02:00.0: eth0: link up
[24412.010963] r8169 0000:02:00.0: eth0: link up


Traces:
[24600.625078] INFO: task events/0:9 blocked for more than 120 seconds.
[24600.625083] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[24600.625087] events/0      D 00001636     0     9      2 0x00000000
[24600.625096]  f7085ebc 00000046 a87e9490 00001636 c1617cc0 c1617cc0 c1617cc0 c1617cc0
[24600.625109]  f707c250 c1617cc0 c1617cc0 00000000 00000000 00000000 00000000 f707bfc0
[24600.625122]  c14745c8 c14745c8 f707bfc0 00000202 f7085efc c12c6449 00000000 00085edc
[24600.625135] Call Trace:
[24600.625148]  [<c12c6449>] __mutex_lock_common+0x233/0x3af
[24600.625155]  [<c12c65ff>] mutex_lock_nested+0x12/0x15
[24600.625163]  [<c1265e0f>] ? rtnl_lock+0xf/0x11
[24600.625168]  [<c1265e0f>] rtnl_lock+0xf/0x11
[24600.625183]  [<f9174acd>] rtl8169_reset_task+0x16/0xee [r8169]
[24600.625191]  [<c103c887>] worker_thread+0x161/0x233
[24600.625196]  [<c103c845>] ? worker_thread+0x11f/0x233
[24600.625205]  [<f9174ab7>] ? rtl8169_reset_task+0x0/0xee [r8169]
[24600.625214]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
[24600.625220]  [<c103c726>] ? worker_thread+0x0/0x233
[24600.625225]  [<c103f6ce>] kthread+0x6a/0x6f
[24600.625232]  [<c103f664>] ? kthread+0x0/0x6f
[24600.625238]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
[24600.625242] INFO: lockdep is turned off.
[24600.625259] INFO: task X:3176 blocked for more than 120 seconds.
[24600.625262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[24600.625266] X             D 00000000     0  3176   3175 0x00400004
[24600.625273]  f6435a10 00003046 00025709 00000000 c1617cc0 c1617cc0 c1617cc0 c1617cc0
[24600.625286]  f60897d0 c1617c  ... 
/*THE REST IS LOST*/



	Sergey


On (03/16/10 16:20), Eric Dumazet wrote:
> > [...]
> > >> NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > Isn't it r8169 related?
> > 
> 
> This is tx side which seems blocked for more than 6 seconds.
> 
> To really test the patch, you need following setup :
> 
> machine A with r8169 NIC, patch applied, receiver of pktgen flood.
> 
> machine B with any NIC, preferably a not buggy one, doing the pktgen
> flood to machine A
> 
> If machine A survives, my patch is tested and ok.
> 
> If machine B crashes, we have another problem to investigate
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 18:26               ` Sergey Senozhatsky
@ 2010-03-16 18:48                 ` Eric Dumazet
  2010-03-16 19:02                   ` Sergey Senozhatsky
                                     ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-03-16 18:48 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

Le mardi 16 mars 2010 à 20:26 +0200, Sergey Senozhatsky a écrit :
> Hello,
> Got it right now.
> 
> System completely froze. Even SysRq didn't work.
> /*spin_lock deadlock?*/
> 
> NOTE: I'm losing network constantly with pktgen tests

yes, every 12 seconds there is a reset because of fifo overflow

> [24208.010980] r8169 0000:02:00.0: eth0: link up
> [24220.010980] r8169 0000:02:00.0: eth0: link up
> [24232.011030] r8169 0000:02:00.0: eth0: link up
> [24340.010980] r8169 0000:02:00.0: eth0: link up
> [24352.010966] r8169 0000:02:00.0: eth0: link up
> [24364.010966] r8169 0000:02:00.0: eth0: link up
> [24376.010964] r8169 0000:02:00.0: eth0: link up
> [24388.010961] r8169 0000:02:00.0: eth0: link up
> [24400.010959] r8169 0000:02:00.0: eth0: link up
> [24412.010963] r8169 0000:02:00.0: eth0: link up
> 
> 
> Traces:
> [24600.625078] INFO: task events/0:9 blocked for more than 120 seconds.
> [24600.625083] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [24600.625087] events/0      D 00001636     0     9      2 0x00000000
> [24600.625096]  f7085ebc 00000046 a87e9490 00001636 c1617cc0 c1617cc0 c1617cc0 c1617cc0
> [24600.625109]  f707c250 c1617cc0 c1617cc0 00000000 00000000 00000000 00000000 f707bfc0
> [24600.625122]  c14745c8 c14745c8 f707bfc0 00000202 f7085efc c12c6449 00000000 00085edc
> [24600.625135] Call Trace:
> [24600.625148]  [<c12c6449>] __mutex_lock_common+0x233/0x3af
> [24600.625155]  [<c12c65ff>] mutex_lock_nested+0x12/0x15
> [24600.625163]  [<c1265e0f>] ? rtnl_lock+0xf/0x11
> [24600.625168]  [<c1265e0f>] rtnl_lock+0xf/0x11
> [24600.625183]  [<f9174acd>] rtl8169_reset_task+0x16/0xee [r8169]
> [24600.625191]  [<c103c887>] worker_thread+0x161/0x233
> [24600.625196]  [<c103c845>] ? worker_thread+0x11f/0x233
> [24600.625205]  [<f9174ab7>] ? rtl8169_reset_task+0x0/0xee [r8169]
> [24600.625214]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> [24600.625220]  [<c103c726>] ? worker_thread+0x0/0x233
> [24600.625225]  [<c103f6ce>] kthread+0x6a/0x6f
> [24600.625232]  [<c103f664>] ? kthread+0x0/0x6f
> [24600.625238]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> [24600.625242] INFO: lockdep is turned off.
> [24600.625259] INFO: task X:3176 blocked for more than 120 seconds.
> [24600.625262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [24600.625266] X             D 00000000     0  3176   3175 0x00400004
> [24600.625273]  f6435a10 00003046 00025709 00000000 c1617cc0 c1617cc0 c1617cc0 c1617cc0
> [24600.625286]  f60897d0 c1617c  ... 
> /*THE REST IS LOST*/
> 
> 

OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
or multiple invocation...

Did you tried removing the rtl8169_schedule_work() call from
rtl8169_rx_interrupt() ?

Maybe the reset is not necessary at all in case of fifo overflow..

Cumulative patch :

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 9d3ebf3..d6ef4dd 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	u32 opts2 = le32_to_cpu(desc->opts2);
 	struct vlan_group *vlgrp = tp->vlgrp;
 	int ret;
 
 	if (vlgrp && (opts2 & RxVlanTag)) {
-		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
+		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
 		ret = 0;
 	} else
 		ret = -1;
@@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	return -1;
 }
@@ -4429,12 +4429,20 @@ out:
 	return done;
 }
 
+/*
+ * Warning : rtl8169_rx_interrupt() might be called :
+ * 1) from NAPI (softirq) context
+ *	(polling = 1 : we should call netif_receive_skb())
+ * 2) from process context (rtl8169_reset_task())
+ *	(polling = 0 : we must call netif_rx() instead)
+ */		
 static int rtl8169_rx_interrupt(struct net_device *dev,
 				struct rtl8169_private *tp,
 				void __iomem *ioaddr, u32 budget)
 {
 	unsigned int cur_rx, rx_left;
 	unsigned int delta, count;
+	int polling = (budget != ~(u32)0) ? 1 : 0;
 
 	cur_rx = tp->cur_rx;
 	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
@@ -4459,7 +4467,6 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 			if (status & RxCRC)
 				dev->stats.rx_crc_errors++;
 			if (status & RxFOVF) {
-				rtl8169_schedule_work(dev, rtl8169_reset_task);
 				dev->stats.rx_fifo_errors++;
 			}
 			rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
@@ -4496,8 +4503,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);
 
-			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
-				netif_receive_skb(skb);
+			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
+				if (likely(polling))
+					netif_receive_skb(skb);
+				else
+					netif_rx(skb);
+			}
 
 			dev->stats.rx_bytes += pkt_size;
 			dev->stats.rx_packets++;

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 18:48                 ` Eric Dumazet
@ 2010-03-16 19:02                   ` Sergey Senozhatsky
  2010-03-17  7:25                   ` Sergey Senozhatsky
  2010-03-25 11:30                   ` Sergey Senozhatsky
  2 siblings, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-16 19:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sergey Senozhatsky, Oleg Nesterov, David Miller, Ingo Molnar,
	Francois Romieu, Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4906 bytes --]

On (03/16/10 19:48), Eric Dumazet wrote:
> > Traces:
> > [24600.625078] INFO: task events/0:9 blocked for more than 120 seconds.
> > [24600.625083] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [24600.625087] events/0      D 00001636     0     9      2 0x00000000
> > [24600.625096]  f7085ebc 00000046 a87e9490 00001636 c1617cc0 c1617cc0 c1617cc0 c1617cc0
> > [24600.625109]  f707c250 c1617cc0 c1617cc0 00000000 00000000 00000000 00000000 f707bfc0
> > [24600.625122]  c14745c8 c14745c8 f707bfc0 00000202 f7085efc c12c6449 00000000 00085edc
> > [24600.625135] Call Trace:
> > [24600.625148]  [<c12c6449>] __mutex_lock_common+0x233/0x3af
> > [24600.625155]  [<c12c65ff>] mutex_lock_nested+0x12/0x15
> > [24600.625163]  [<c1265e0f>] ? rtnl_lock+0xf/0x11
> > [24600.625168]  [<c1265e0f>] rtnl_lock+0xf/0x11
> > [24600.625183]  [<f9174acd>] rtl8169_reset_task+0x16/0xee [r8169]
> > [24600.625191]  [<c103c887>] worker_thread+0x161/0x233
> > [24600.625196]  [<c103c845>] ? worker_thread+0x11f/0x233
> > [24600.625205]  [<f9174ab7>] ? rtl8169_reset_task+0x0/0xee [r8169]
> > [24600.625214]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> > [24600.625220]  [<c103c726>] ? worker_thread+0x0/0x233
> > [24600.625225]  [<c103f6ce>] kthread+0x6a/0x6f
> > [24600.625232]  [<c103f664>] ? kthread+0x0/0x6f
> > [24600.625238]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [24600.625242] INFO: lockdep is turned off.
> > [24600.625259] INFO: task X:3176 blocked for more than 120 seconds.
> > [24600.625262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [24600.625266] X             D 00000000     0  3176   3175 0x00400004
> > [24600.625273]  f6435a10 00003046 00025709 00000000 c1617cc0 c1617cc0 c1617cc0 c1617cc0
> > [24600.625286]  f60897d0 c1617c  ... 
> > /*THE REST IS LOST*/
> > 
> > 
> 
> OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
> or multiple invocation...
> 
> Did you tried removing the rtl8169_schedule_work() call from
> rtl8169_rx_interrupt() ?
>

Not yet. I'm reading rtl8169_rx_interrupt now and rtl8169_schedule_work in case of RxFOVF
grabbed my attention too. 
 

> Maybe the reset is not necessary at all in case of fifo overflow..
> 
> Cumulative patch :
>

Will try it.


	Sergey 


> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d6ef4dd 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	u32 opts2 = le32_to_cpu(desc->opts2);
>  	struct vlan_group *vlgrp = tp->vlgrp;
>  	int ret;
>  
>  	if (vlgrp && (opts2 & RxVlanTag)) {
> -		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> +		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
>  		ret = 0;
>  	} else
>  		ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	return -1;
>  }
> @@ -4429,12 +4429,20 @@ out:
>  	return done;
>  }
>  
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + *	(polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + *	(polling = 0 : we must call netif_rx() instead)
> + */		
>  static int rtl8169_rx_interrupt(struct net_device *dev,
>  				struct rtl8169_private *tp,
>  				void __iomem *ioaddr, u32 budget)
>  {
>  	unsigned int cur_rx, rx_left;
>  	unsigned int delta, count;
> +	int polling = (budget != ~(u32)0) ? 1 : 0;
>  
>  	cur_rx = tp->cur_rx;
>  	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4459,7 +4467,6 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			if (status & RxCRC)
>  				dev->stats.rx_crc_errors++;
>  			if (status & RxFOVF) {
> -				rtl8169_schedule_work(dev, rtl8169_reset_task);
>  				dev->stats.rx_fifo_errors++;
>  			}
>  			rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
> @@ -4496,8 +4503,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			skb_put(skb, pkt_size);
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> -				netif_receive_skb(skb);
> +			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> +				if (likely(polling))
> +					netif_receive_skb(skb);
> +				else
> +					netif_rx(skb);
> +			}
>  
>  			dev->stats.rx_bytes += pkt_size;
>  			dev->stats.rx_packets++;
> 
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 18:48                 ` Eric Dumazet
  2010-03-16 19:02                   ` Sergey Senozhatsky
@ 2010-03-17  7:25                   ` Sergey Senozhatsky
  2010-03-17  7:37                     ` Eric Dumazet
  2010-03-25 11:30                   ` Sergey Senozhatsky
  2 siblings, 1 reply; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-17  7:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Oleg Nesterov, David Miller, Ingo Molnar, Francois Romieu,
	Peter Zijlstra, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7388 bytes --]

Hello,

cumulative patch:

[  155.337373] ------------[ cut here ]------------
[  155.337386] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
[  155.337390] Hardware name: F3JC                
[  155.337394] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[  155.337397] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel snd_hda_codec
asus_laptop sparse_keymap mmc_core led_class snd_pcm snd_timer snd_page_alloc psmouse rng_core snd soundcore sg i2c_i801 evdev serio_raw r8169 mii usbhid hid uhci_hcd ehci_hcd
sr_mod cdrom sd_mod usbcore ata_piix
[  155.337468] Pid: 7, comm: ksoftirqd/1 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
[  155.337472] Call Trace:
[  155.337481]  [<c102e293>] warn_slowpath_common+0x65/0x7c
[  155.337506]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
[  155.337512]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
[  155.337517]  [<c126ac34>] dev_watchdog+0xc1/0x125
[  155.337525]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
[  155.337530]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
[  155.337536]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
[  155.337566]  [<c126ab73>] ? dev_watchdog+0x0/0x125
[  155.337576]  [<c1032d39>] __do_softirq+0x8d/0x117
[  155.337667]  [<c1032dee>] do_softirq+0x2b/0x43
[  155.337729]  [<c1032fc1>] run_ksoftirqd+0x71/0x140
[  155.337745]  [<c1032f50>] ? run_ksoftirqd+0x0/0x140
[  155.337810]  [<c103f60e>] kthread+0x6a/0x6f
[  155.337832]  [<c103f5a4>] ? kthread+0x0/0x6f
[  155.337903]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
[  155.337907] ---[ end trace a22d306b065d4a68 ]---
[  155.350902] r8169 0000:02:00.0: eth0: link up
[  167.350892] r8169 0000:02:00.0: eth0: link up



	Sergey



On (03/16/10 19:48), Eric Dumazet wrote:
> > Hello,
> > Got it right now.
> > 
> > System completely froze. Even SysRq didn't work.
> > /*spin_lock deadlock?*/
> > 
> > NOTE: I'm losing network constantly with pktgen tests
> 
> yes, every 12 seconds there is a reset because of fifo overflow
> 
> > [24208.010980] r8169 0000:02:00.0: eth0: link up
> > [24220.010980] r8169 0000:02:00.0: eth0: link up
> > [24232.011030] r8169 0000:02:00.0: eth0: link up
> > [24340.010980] r8169 0000:02:00.0: eth0: link up
> > [24352.010966] r8169 0000:02:00.0: eth0: link up
> > [24364.010966] r8169 0000:02:00.0: eth0: link up
> > [24376.010964] r8169 0000:02:00.0: eth0: link up
> > [24388.010961] r8169 0000:02:00.0: eth0: link up
> > [24400.010959] r8169 0000:02:00.0: eth0: link up
> > [24412.010963] r8169 0000:02:00.0: eth0: link up
> > 
> > 
> > Traces:
> > [24600.625078] INFO: task events/0:9 blocked for more than 120 seconds.
> > [24600.625083] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [24600.625087] events/0      D 00001636     0     9      2 0x00000000
> > [24600.625096]  f7085ebc 00000046 a87e9490 00001636 c1617cc0 c1617cc0 c1617cc0 c1617cc0
> > [24600.625109]  f707c250 c1617cc0 c1617cc0 00000000 00000000 00000000 00000000 f707bfc0
> > [24600.625122]  c14745c8 c14745c8 f707bfc0 00000202 f7085efc c12c6449 00000000 00085edc
> > [24600.625135] Call Trace:
> > [24600.625148]  [<c12c6449>] __mutex_lock_common+0x233/0x3af
> > [24600.625155]  [<c12c65ff>] mutex_lock_nested+0x12/0x15
> > [24600.625163]  [<c1265e0f>] ? rtnl_lock+0xf/0x11
> > [24600.625168]  [<c1265e0f>] rtnl_lock+0xf/0x11
> > [24600.625183]  [<f9174acd>] rtl8169_reset_task+0x16/0xee [r8169]
> > [24600.625191]  [<c103c887>] worker_thread+0x161/0x233
> > [24600.625196]  [<c103c845>] ? worker_thread+0x11f/0x233
> > [24600.625205]  [<f9174ab7>] ? rtl8169_reset_task+0x0/0xee [r8169]
> > [24600.625214]  [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> > [24600.625220]  [<c103c726>] ? worker_thread+0x0/0x233
> > [24600.625225]  [<c103f6ce>] kthread+0x6a/0x6f
> > [24600.625232]  [<c103f664>] ? kthread+0x0/0x6f
> > [24600.625238]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [24600.625242] INFO: lockdep is turned off.
> > [24600.625259] INFO: task X:3176 blocked for more than 120 seconds.
> > [24600.625262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [24600.625266] X             D 00000000     0  3176   3175 0x00400004
> > [24600.625273]  f6435a10 00003046 00025709 00000000 c1617cc0 c1617cc0 c1617cc0 c1617cc0
> > [24600.625286]  f60897d0 c1617c  ... 
> > /*THE REST IS LOST*/
> > 
> > 
> 
> OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
> or multiple invocation...
> 
> Did you tried removing the rtl8169_schedule_work() call from
> rtl8169_rx_interrupt() ?
> 
> Maybe the reset is not necessary at all in case of fifo overflow..
> 
> Cumulative patch :
> 
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d6ef4dd 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	u32 opts2 = le32_to_cpu(desc->opts2);
>  	struct vlan_group *vlgrp = tp->vlgrp;
>  	int ret;
>  
>  	if (vlgrp && (opts2 & RxVlanTag)) {
> -		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> +		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
>  		ret = 0;
>  	} else
>  		ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
>  }
>  
>  static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> -			       struct sk_buff *skb)
> +			       struct sk_buff *skb, int polling)
>  {
>  	return -1;
>  }
> @@ -4429,12 +4429,20 @@ out:
>  	return done;
>  }
>  
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + *	(polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + *	(polling = 0 : we must call netif_rx() instead)
> + */		
>  static int rtl8169_rx_interrupt(struct net_device *dev,
>  				struct rtl8169_private *tp,
>  				void __iomem *ioaddr, u32 budget)
>  {
>  	unsigned int cur_rx, rx_left;
>  	unsigned int delta, count;
> +	int polling = (budget != ~(u32)0) ? 1 : 0;
>  
>  	cur_rx = tp->cur_rx;
>  	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4459,7 +4467,6 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			if (status & RxCRC)
>  				dev->stats.rx_crc_errors++;
>  			if (status & RxFOVF) {
> -				rtl8169_schedule_work(dev, rtl8169_reset_task);
>  				dev->stats.rx_fifo_errors++;
>  			}
>  			rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
> @@ -4496,8 +4503,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  			skb_put(skb, pkt_size);
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> -				netif_receive_skb(skb);
> +			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> +				if (likely(polling))
> +					netif_receive_skb(skb);
> +				else
> +					netif_rx(skb);
> +			}
>  
>  			dev->stats.rx_bytes += pkt_size;
>  			dev->stats.rx_packets++;
> 
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17  7:25                   ` Sergey Senozhatsky
@ 2010-03-17  7:37                     ` Eric Dumazet
  2010-03-17  7:58                       ` Sergey Senozhatsky
                                         ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-03-17  7:37 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

trimming some cc

Le mercredi 17 mars 2010 à 09:25 +0200, Sergey Senozhatsky a écrit :
> Hello,
> 
> cumulative patch:
> 
> [  155.337373] ------------[ cut here ]------------
> [  155.337386] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> [  155.337390] Hardware name: F3JC                
> [  155.337394] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> [  155.337397] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel snd_hda_codec
> asus_laptop sparse_keymap mmc_core led_class snd_pcm snd_timer snd_page_alloc psmouse rng_core snd soundcore sg i2c_i801 evdev serio_raw r8169 mii usbhid hid uhci_hcd ehci_hcd
> sr_mod cdrom sd_mod usbcore ata_piix
> [  155.337468] Pid: 7, comm: ksoftirqd/1 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
> [  155.337472] Call Trace:
> [  155.337481]  [<c102e293>] warn_slowpath_common+0x65/0x7c
> [  155.337506]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
> [  155.337512]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
> [  155.337517]  [<c126ac34>] dev_watchdog+0xc1/0x125
> [  155.337525]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> [  155.337530]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
> [  155.337536]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> [  155.337566]  [<c126ab73>] ? dev_watchdog+0x0/0x125
> [  155.337576]  [<c1032d39>] __do_softirq+0x8d/0x117
> [  155.337667]  [<c1032dee>] do_softirq+0x2b/0x43
> [  155.337729]  [<c1032fc1>] run_ksoftirqd+0x71/0x140
> [  155.337745]  [<c1032f50>] ? run_ksoftirqd+0x0/0x140
> [  155.337810]  [<c103f60e>] kthread+0x6a/0x6f
> [  155.337832]  [<c103f5a4>] ? kthread+0x0/0x6f
> [  155.337903]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> [  155.337907] ---[ end trace a22d306b065d4a68 ]---
> [  155.350902] r8169 0000:02:00.0: eth0: link up
> [  167.350892] r8169 0000:02:00.0: eth0: link up
> 
> 
> 

On receiver ?

I suspect lot of work is needed on this driver to make it working, but I
dont have a machine with said adapter.

Are you in 100 Mb full duplex mode ?



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17  7:37                     ` Eric Dumazet
@ 2010-03-17  7:58                       ` Sergey Senozhatsky
  2010-03-17 10:58                       ` Sergey Senozhatsky
  2010-03-17 23:55                       ` Francois Romieu
  2 siblings, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-17  7:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

[-- Attachment #1: Type: text/plain, Size: 3064 bytes --]

On (03/17/10 08:37), Eric Dumazet wrote:
> trimming some cc
> 
> Le mercredi 17 mars 2010 à 09:25 +0200, Sergey Senozhatsky a écrit :
> > Hello,
> > 
> > cumulative patch:
> > 
> > [  155.337373] ------------[ cut here ]------------
> > [  155.337386] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> > [  155.337390] Hardware name: F3JC                
> > [  155.337394] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > [  155.337397] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel snd_hda_codec
> > asus_laptop sparse_keymap mmc_core led_class snd_pcm snd_timer snd_page_alloc psmouse rng_core snd soundcore sg i2c_i801 evdev serio_raw r8169 mii usbhid hid uhci_hcd ehci_hcd
> > sr_mod cdrom sd_mod usbcore ata_piix
> > [  155.337468] Pid: 7, comm: ksoftirqd/1 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
> > [  155.337472] Call Trace:
> > [  155.337481]  [<c102e293>] warn_slowpath_common+0x65/0x7c
> > [  155.337506]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
> > [  155.337512]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
> > [  155.337517]  [<c126ac34>] dev_watchdog+0xc1/0x125
> > [  155.337525]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> > [  155.337530]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
> > [  155.337536]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> > [  155.337566]  [<c126ab73>] ? dev_watchdog+0x0/0x125
> > [  155.337576]  [<c1032d39>] __do_softirq+0x8d/0x117
> > [  155.337667]  [<c1032dee>] do_softirq+0x2b/0x43
> > [  155.337729]  [<c1032fc1>] run_ksoftirqd+0x71/0x140
> > [  155.337745]  [<c1032f50>] ? run_ksoftirqd+0x0/0x140
> > [  155.337810]  [<c103f60e>] kthread+0x6a/0x6f
> > [  155.337832]  [<c103f5a4>] ? kthread+0x0/0x6f
> > [  155.337903]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [  155.337907] ---[ end trace a22d306b065d4a68 ]---
> > [  155.350902] r8169 0000:02:00.0: eth0: link up
> > [  167.350892] r8169 0000:02:00.0: eth0: link up
> > 
> > 
> > 
> 
> On receiver ?
> 
Localhost pollution (both sender and receiver are localhost). In 2 hours I will
try to pktgen over LAN.


> I suspect lot of work is needed on this driver to make it working, but I
> dont have a machine with said adapter.
> 
> Are you in 100 Mb full duplex mode ?
> 

ethtool eth0
Settings for eth0:
	Supported ports: [ TP MII ]
	Supported link modes:   10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Supports auto-negotiation: Yes
	Advertised link modes:  10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Advertised auto-negotiation: Yes
	Speed: 100Mb/s
	Duplex: Full
	Port: MII
	PHYAD: 0
	Transceiver: internal
	Auto-negotiation: on
	Supports Wake-on: pumbg
	Wake-on: g
	Current message level: 0x00000033 (51)
	Link detected: yes



	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17  7:37                     ` Eric Dumazet
  2010-03-17  7:58                       ` Sergey Senozhatsky
@ 2010-03-17 10:58                       ` Sergey Senozhatsky
  2010-03-17 13:54                         ` Eric Dumazet
  2010-03-17 23:55                       ` Francois Romieu
  2 siblings, 1 reply; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-17 10:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

[-- Attachment #1: Type: text/plain, Size: 6036 bytes --]

Hello,

We did pktgen over LAN testing for several hours:

iftop
                       19.1Mb                 38.1Mb                  57.2Mb                 76.3Mb            95.4Mb
└──────────────────────┴──────────────────────┴───────────────────────┴──────────────────────┴───────────────────────
xxxxxxxx0007t2                               => xxxxxxxxxxxxxx.xxxxx.xxxx.xxx                    0b   3.60Kb  4.28Kb
                                             <=                                               92.7Mb  92.7Mb  92.5Mb

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
TX:             cumm:  2.50MB   peak:   57.7Kb                                       rates:   7.71Kb  5.97Kb  10.7Kb
RX:                    28.0GB           92.9Mb                                                92.7Mb  92.8Mb  92.6Mb
TOTAL:                 28.0GB           92.9Mb                                                92.7Mb  92.8Mb  92.6M

Without any problems.


As soon as I switched back to localhost "pollution":

[ 7343.999279] ------------[ cut here ]------------
[ 7343.999292] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
[ 7343.999295] Hardware name: F3JC                
[ 7343.999298] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[ 7343.999301] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel mmc_core
asus_laptop sparse_keymap snd_hda_codec led_class snd_pcm rng_core sg psmouse snd_timer snd_page_alloc i2c_i801 evdev snd soundcore serio_raw r8169 mii usbhid hid uhci_hcd
ehci_hcd sr_mod cdrom sd_mod usbcore ata_piix
[ 7343.999361] Pid: 4801, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
[ 7343.999364] Call Trace:
[ 7343.999372]  [<c102e293>] warn_slowpath_common+0x65/0x7c
[ 7343.999378]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
[ 7343.999383]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
[ 7343.999388]  [<c126ac34>] dev_watchdog+0xc1/0x125
[ 7343.999395]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
[ 7343.999401]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
[ 7343.999406]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
[ 7343.999411]  [<c126ab73>] ? dev_watchdog+0x0/0x125
[ 7343.999417]  [<c1032d39>] __do_softirq+0x8d/0x117
[ 7343.999422]  [<c1032dee>] do_softirq+0x2b/0x43
[ 7343.999426]  [<c1032f13>] irq_exit+0x38/0x75
[ 7343.999433]  [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
[ 7343.999438]  [<c12c770a>] apic_timer_interrupt+0x36/0x3c
[ 7343.999447]  [<f80878df>] ? pktgen_thread_worker+0x5c4/0x631 [pktgen]
[ 7343.999454]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
[ 7343.999459]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
[ 7343.999465]  [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
[ 7343.999470]  [<c103f60e>] kthread+0x6a/0x6f
[ 7343.999476]  [<c103f5a4>] ? kthread+0x0/0x6f
[ 7343.999481]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
[ 7343.999485] ---[ end trace a22d306b065d4a68 ]---
[ 7344.012654] r8169 0000:02:00.0: eth0: link up
[ 7356.012657] r8169 0000:02:00.0: eth0: link up
[ 7368.013545] r8169 0000:02:00.0: eth0: link up


	Sergey



On (03/17/10 08:37), Eric Dumazet wrote:
> trimming some cc
> 
> Le mercredi 17 mars 2010 à 09:25 +0200, Sergey Senozhatsky a écrit :
> > Hello,
> > 
> > cumulative patch:
> > 
> > [  155.337373] ------------[ cut here ]------------
> > [  155.337386] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> > [  155.337390] Hardware name: F3JC                
> > [  155.337394] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > [  155.337397] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel snd_hda_codec
> > asus_laptop sparse_keymap mmc_core led_class snd_pcm snd_timer snd_page_alloc psmouse rng_core snd soundcore sg i2c_i801 evdev serio_raw r8169 mii usbhid hid uhci_hcd ehci_hcd
> > sr_mod cdrom sd_mod usbcore ata_piix
> > [  155.337468] Pid: 7, comm: ksoftirqd/1 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
> > [  155.337472] Call Trace:
> > [  155.337481]  [<c102e293>] warn_slowpath_common+0x65/0x7c
> > [  155.337506]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
> > [  155.337512]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
> > [  155.337517]  [<c126ac34>] dev_watchdog+0xc1/0x125
> > [  155.337525]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> > [  155.337530]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
> > [  155.337536]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> > [  155.337566]  [<c126ab73>] ? dev_watchdog+0x0/0x125
> > [  155.337576]  [<c1032d39>] __do_softirq+0x8d/0x117
> > [  155.337667]  [<c1032dee>] do_softirq+0x2b/0x43
> > [  155.337729]  [<c1032fc1>] run_ksoftirqd+0x71/0x140
> > [  155.337745]  [<c1032f50>] ? run_ksoftirqd+0x0/0x140
> > [  155.337810]  [<c103f60e>] kthread+0x6a/0x6f
> > [  155.337832]  [<c103f5a4>] ? kthread+0x0/0x6f
> > [  155.337903]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [  155.337907] ---[ end trace a22d306b065d4a68 ]---
> > [  155.350902] r8169 0000:02:00.0: eth0: link up
> > [  167.350892] r8169 0000:02:00.0: eth0: link up
> > 
> > 
> > 
> 
> On receiver ?
> 
> I suspect lot of work is needed on this driver to make it working, but I
> dont have a machine with said adapter.
> 
> Are you in 100 Mb full duplex mode ?
> 
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17 10:58                       ` Sergey Senozhatsky
@ 2010-03-17 13:54                         ` Eric Dumazet
  2010-03-18 12:28                           ` Sergey Senozhatsky
  0 siblings, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2010-03-17 13:54 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

Le mercredi 17 mars 2010 à 12:58 +0200, Sergey Senozhatsky a écrit :
> Hello,
> 
> We did pktgen over LAN testing for several hours:
> 
> iftop
>                        19.1Mb                 38.1Mb                  57.2Mb                 76.3Mb            95.4Mb
> └──────────────────────┴──────────────────────┴───────────────────────┴──────────────────────┴───────────────────────
> xxxxxxxx0007t2                               => xxxxxxxxxxxxxx.xxxxx.xxxx.xxx                    0b   3.60Kb  4.28Kb
>                                              <=                                               92.7Mb  92.7Mb  92.5Mb
> 
> ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> TX:             cumm:  2.50MB   peak:   57.7Kb                                       rates:   7.71Kb  5.97Kb  10.7Kb
> RX:                    28.0GB           92.9Mb                                                92.7Mb  92.8Mb  92.6Mb
> TOTAL:                 28.0GB           92.9Mb                                                92.7Mb  92.8Mb  92.6M
> 
> Without any problems.
> 
> 
> As soon as I switched back to localhost "pollution":
> 
> [ 7343.999279] ------------[ cut here ]------------
> [ 7343.999292] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> [ 7343.999295] Hardware name: F3JC                
> [ 7343.999298] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> [ 7343.999301] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel mmc_core
> asus_laptop sparse_keymap snd_hda_codec led_class snd_pcm rng_core sg psmouse snd_timer snd_page_alloc i2c_i801 evdev snd soundcore serio_raw r8169 mii usbhid hid uhci_hcd
> ehci_hcd sr_mod cdrom sd_mod usbcore ata_piix
> [ 7343.999361] Pid: 4801, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
> [ 7343.999364] Call Trace:
> [ 7343.999372]  [<c102e293>] warn_slowpath_common+0x65/0x7c
> [ 7343.999378]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
> [ 7343.999383]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
> [ 7343.999388]  [<c126ac34>] dev_watchdog+0xc1/0x125
> [ 7343.999395]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> [ 7343.999401]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
> [ 7343.999406]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> [ 7343.999411]  [<c126ab73>] ? dev_watchdog+0x0/0x125
> [ 7343.999417]  [<c1032d39>] __do_softirq+0x8d/0x117
> [ 7343.999422]  [<c1032dee>] do_softirq+0x2b/0x43
> [ 7343.999426]  [<c1032f13>] irq_exit+0x38/0x75
> [ 7343.999433]  [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
> [ 7343.999438]  [<c12c770a>] apic_timer_interrupt+0x36/0x3c
> [ 7343.999447]  [<f80878df>] ? pktgen_thread_worker+0x5c4/0x631 [pktgen]
> [ 7343.999454]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
> [ 7343.999459]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
> [ 7343.999465]  [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
> [ 7343.999470]  [<c103f60e>] kthread+0x6a/0x6f
> [ 7343.999476]  [<c103f5a4>] ? kthread+0x0/0x6f
> [ 7343.999481]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> [ 7343.999485] ---[ end trace a22d306b065d4a68 ]---
> [ 7344.012654] r8169 0000:02:00.0: eth0: link up
> [ 7356.012657] r8169 0000:02:00.0: eth0: link up
> [ 7368.013545] r8169 0000:02:00.0: eth0: link up
> 
> 

What do you exactly mean by 'localhost pollution' ?



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17  7:37                     ` Eric Dumazet
  2010-03-17  7:58                       ` Sergey Senozhatsky
  2010-03-17 10:58                       ` Sergey Senozhatsky
@ 2010-03-17 23:55                       ` Francois Romieu
  2010-03-18 12:32                         ` Sergey Senozhatsky
  2010-03-18 13:31                         ` Sergey Senozhatsky
  2 siblings, 2 replies; 37+ messages in thread
From: Francois Romieu @ 2010-03-17 23:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Sergey Senozhatsky, Oleg Nesterov, David Miller, netdev

On Wed, Mar 17, 2010 at 08:37:17AM +0100, Eric Dumazet wrote:
[...]
> I suspect lot of work is needed on this driver to make it working, but I
> dont have a machine with said adapter.

This one should help too if Sergey owns a (MSI) 8168.

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index dfc3573..721e7f3 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4532,21 +4532,39 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 	return count;
 }
 
+static void rtl_napi_cond_schedule(struct rtl8169_private *tp, u16 status)
+{
+	if (status & tp->napi_event) {
+		void __iomem *ioaddr = tp->mmio_addr;
+
+		RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
+		mmiowb();
+		napi_schedule(&tp->napi);
+	}
+}
+
 static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 {
 	struct net_device *dev = dev_instance;
 	struct rtl8169_private *tp = netdev_priv(dev);
 	void __iomem *ioaddr = tp->mmio_addr;
 	int handled = 0;
-	int status;
+	u16 status;
 
 	/* loop handling interrupts until we have no new ones or
 	 * we hit a invalid/hotplug case.
 	 */
 	status = RTL_R16(IntrStatus);
 	while (status && status != 0xffff) {
+		u16 acked;
+
 		handled = 1;
 
+		acked = (status & RxFIFOOver) ? (status | RxOverflow) : status;
+		acked &= ~tp->napi_event;
+
+		RTL_W16(IntrStatus, acked);
+
 		/* Handle all of the error cases first. These will reset
 		 * the chip, so just exit the loop.
 		 */
@@ -4557,7 +4575,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 
 		/* Work around for rx fifo overflow */
 		if (unlikely(status & RxFIFOOver) &&
-		(tp->mac_version == RTL_GIGA_MAC_VER_11)) {
+		    (tp->mac_version == RTL_GIGA_MAC_VER_11)) {
 			netif_stop_queue(dev);
 			rtl8169_tx_timeout(dev);
 			break;
@@ -4571,30 +4589,9 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 		if (status & LinkChg)
 			rtl8169_check_link_status(dev, tp, ioaddr);
 
-		/* We need to see the lastest version of tp->intr_mask to
-		 * avoid ignoring an MSI interrupt and having to wait for
-		 * another event which may never come.
-		 */
-		smp_rmb();
-		if (status & tp->intr_mask & tp->napi_event) {
-			RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
-			tp->intr_mask = ~tp->napi_event;
-
-			if (likely(napi_schedule_prep(&tp->napi)))
-				__napi_schedule(&tp->napi);
-			else
-				netif_info(tp, intr, dev,
-					   "interrupt %04x in poll\n", status);
-		}
+		rtl_napi_cond_schedule(tp, status);
 
-		/* We only get a new MSI interrupt when all active irq
-		 * sources on the chip have been acknowledged. So, ack
-		 * everything we've seen and check if new sources have become
-		 * active to avoid blocking all interrupts from the chip.
-		 */
-		RTL_W16(IntrStatus,
-			(status & RxFIFOOver) ? (status | RxOverflow) : status);
-		status = RTL_R16(IntrStatus);
+		break;
 	}
 
 	return IRQ_RETVAL(handled);
@@ -4607,22 +4604,19 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
 	void __iomem *ioaddr = tp->mmio_addr;
 	int work_done;
 
+
+	RTL_W16(IntrStatus, tp->napi_event);
+
 	work_done = rtl8169_rx_interrupt(dev, tp, ioaddr, (u32) budget);
 	rtl8169_tx_interrupt(dev, tp, ioaddr);
 
 	if (work_done < budget) {
 		napi_complete(napi);
 
-		/* We need for force the visibility of tp->intr_mask
-		 * for other CPUs, as we can loose an MSI interrupt
-		 * and potentially wait for a retransmit timeout if we don't.
-		 * The posted write to IntrMask is safe, as it will
-		 * eventually make it to the chip and we won't loose anything
-		 * until it does.
-		 */
-		tp->intr_mask = 0xffff;
-		smp_wmb();
 		RTL_W16(IntrMask, tp->intr_event);
+		mmiowb();
+
+		rtl_napi_cond_schedule(tp, RTL_R16(IntrStatus));
 	}
 
 	return work_done;

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17 13:54                         ` Eric Dumazet
@ 2010-03-18 12:28                           ` Sergey Senozhatsky
  0 siblings, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-18 12:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sergey Senozhatsky, Oleg Nesterov, David Miller, Francois Romieu,
	netdev

[-- Attachment #1: Type: text/plain, Size: 4276 bytes --]

Hello,

On (03/17/10 14:54), Eric Dumazet wrote:
> Le mercredi 17 mars 2010 à 12:58 +0200, Sergey Senozhatsky a écrit :
> > Hello,
> > 
> > We did pktgen over LAN testing for several hours:
> > 
> > iftop
> >                        19.1Mb                 38.1Mb                  57.2Mb                 76.3Mb            95.4Mb
> > └──────────────────────┴──────────────────────┴───────────────────────┴──────────────────────┴───────────────────────
> > xxxxxxxx0007t2                               => xxxxxxxxxxxxxx.xxxxx.xxxx.xxx                    0b   3.60Kb  4.28Kb
> >                                              <=                                               92.7Mb  92.7Mb  92.5Mb
> > 
> > ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> > TX:             cumm:  2.50MB   peak:   57.7Kb                                       rates:   7.71Kb  5.97Kb  10.7Kb
> > RX:                    28.0GB           92.9Mb                                                92.7Mb  92.8Mb  92.6Mb
> > TOTAL:                 28.0GB           92.9Mb                                                92.7Mb  92.8Mb  92.6M
> > 
> > Without any problems.
> > 
> > 
> > As soon as I switched back to localhost "pollution":
> > 
> > [ 7343.999279] ------------[ cut here ]------------
> > [ 7343.999292] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> > [ 7343.999295] Hardware name: F3JC                
> > [ 7343.999298] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > [ 7343.999301] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci snd_hda_intel mmc_core
> > asus_laptop sparse_keymap snd_hda_codec led_class snd_pcm rng_core sg psmouse snd_timer snd_page_alloc i2c_i801 evdev snd soundcore serio_raw r8169 mii usbhid hid uhci_hcd
> > ehci_hcd sr_mod cdrom sd_mod usbcore ata_piix
> > [ 7343.999361] Pid: 4801, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #47
> > [ 7343.999364] Call Trace:
> > [ 7343.999372]  [<c102e293>] warn_slowpath_common+0x65/0x7c
> > [ 7343.999378]  [<c126ac34>] ? dev_watchdog+0xc1/0x125
> > [ 7343.999383]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
> > [ 7343.999388]  [<c126ac34>] dev_watchdog+0xc1/0x125
> > [ 7343.999395]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> > [ 7343.999401]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
> > [ 7343.999406]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
> > [ 7343.999411]  [<c126ab73>] ? dev_watchdog+0x0/0x125
> > [ 7343.999417]  [<c1032d39>] __do_softirq+0x8d/0x117
> > [ 7343.999422]  [<c1032dee>] do_softirq+0x2b/0x43
> > [ 7343.999426]  [<c1032f13>] irq_exit+0x38/0x75
> > [ 7343.999433]  [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
> > [ 7343.999438]  [<c12c770a>] apic_timer_interrupt+0x36/0x3c
> > [ 7343.999447]  [<f80878df>] ? pktgen_thread_worker+0x5c4/0x631 [pktgen]
> > [ 7343.999454]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
> > [ 7343.999459]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
> > [ 7343.999465]  [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
> > [ 7343.999470]  [<c103f60e>] kthread+0x6a/0x6f
> > [ 7343.999476]  [<c103f5a4>] ? kthread+0x0/0x6f
> > [ 7343.999481]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [ 7343.999485] ---[ end trace a22d306b065d4a68 ]---
> > [ 7344.012654] r8169 0000:02:00.0: eth0: link up
> > [ 7356.012657] r8169 0000:02:00.0: eth0: link up
> > [ 7368.013545] r8169 0000:02:00.0: eth0: link up
> > 
> > 
> 
> What do you exactly mean by 'localhost pollution' ?
> 
Sorry. I mean pktgen started on localhost with src set to localhost (not pktgen over LAN from another machine).


	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17 23:55                       ` Francois Romieu
@ 2010-03-18 12:32                         ` Sergey Senozhatsky
  2010-03-18 13:31                         ` Sergey Senozhatsky
  1 sibling, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-18 12:32 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Oleg Nesterov, David Miller, netdev

[-- Attachment #1: Type: text/plain, Size: 5145 bytes --]

Hello,

This hunk is rejected. I suppose I should apply patch against unpatched version (Eric's patch).
Correct? /* Eric's patch does make sense to my mind. */


@@ -4604,22 +4601,19 @@
        void __iomem *ioaddr = tp->mmio_addr;
        int work_done;
 
+
+       RTL_W16(IntrStatus, tp->napi_event);
+
        work_done = rtl8169_rx_interrupt(dev, tp, ioaddr, (u32) budget);
        rtl8169_tx_interrupt(dev, tp, ioaddr);
 
        if (work_done < budget) {
                napi_complete(napi);
 
-               /* We need for force the visibility of tp->intr_mask
-                * for other CPUs, as we can loose an MSI interrupt
-                * and potentially wait for a retransmit timeout if we don't.
-                * The posted write to IntrMask is safe, as it will
-                * eventually make it to the chip and we won't loose anything
-                * until it does.
-                */
-               tp->intr_mask = 0xffff;
-               smp_wmb();
                RTL_W16(IntrMask, tp->intr_event);
+               mmiowb();
+
+               rtl_napi_cond_schedule(tp, RTL_R16(IntrStatus));
        }



	Sergey



On (03/18/10 00:55), Francois Romieu wrote:
> This one should help too if Sergey owns a (MSI) 8168.
> 
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index dfc3573..721e7f3 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -4532,21 +4532,39 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  	return count;
>  }
>  
> +static void rtl_napi_cond_schedule(struct rtl8169_private *tp, u16 status)
> +{
> +	if (status & tp->napi_event) {
> +		void __iomem *ioaddr = tp->mmio_addr;
> +
> +		RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
> +		mmiowb();
> +		napi_schedule(&tp->napi);
> +	}
> +}
> +
>  static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  {
>  	struct net_device *dev = dev_instance;
>  	struct rtl8169_private *tp = netdev_priv(dev);
>  	void __iomem *ioaddr = tp->mmio_addr;
>  	int handled = 0;
> -	int status;
> +	u16 status;
>  
>  	/* loop handling interrupts until we have no new ones or
>  	 * we hit a invalid/hotplug case.
>  	 */
>  	status = RTL_R16(IntrStatus);
>  	while (status && status != 0xffff) {
> +		u16 acked;
> +
>  		handled = 1;
>  
> +		acked = (status & RxFIFOOver) ? (status | RxOverflow) : status;
> +		acked &= ~tp->napi_event;
> +
> +		RTL_W16(IntrStatus, acked);
> +
>  		/* Handle all of the error cases first. These will reset
>  		 * the chip, so just exit the loop.
>  		 */
> @@ -4557,7 +4575,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  
>  		/* Work around for rx fifo overflow */
>  		if (unlikely(status & RxFIFOOver) &&
> -		(tp->mac_version == RTL_GIGA_MAC_VER_11)) {
> +		    (tp->mac_version == RTL_GIGA_MAC_VER_11)) {
>  			netif_stop_queue(dev);
>  			rtl8169_tx_timeout(dev);
>  			break;
> @@ -4571,30 +4589,9 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  		if (status & LinkChg)
>  			rtl8169_check_link_status(dev, tp, ioaddr);
>  
> -		/* We need to see the lastest version of tp->intr_mask to
> -		 * avoid ignoring an MSI interrupt and having to wait for
> -		 * another event which may never come.
> -		 */
> -		smp_rmb();
> -		if (status & tp->intr_mask & tp->napi_event) {
> -			RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
> -			tp->intr_mask = ~tp->napi_event;
> -
> -			if (likely(napi_schedule_prep(&tp->napi)))
> -				__napi_schedule(&tp->napi);
> -			else
> -				netif_info(tp, intr, dev,
> -					   "interrupt %04x in poll\n", status);
> -		}
> +		rtl_napi_cond_schedule(tp, status);
>  
> -		/* We only get a new MSI interrupt when all active irq
> -		 * sources on the chip have been acknowledged. So, ack
> -		 * everything we've seen and check if new sources have become
> -		 * active to avoid blocking all interrupts from the chip.
> -		 */
> -		RTL_W16(IntrStatus,
> -			(status & RxFIFOOver) ? (status | RxOverflow) : status);
> -		status = RTL_R16(IntrStatus);
> +		break;
>  	}
>  
>  	return IRQ_RETVAL(handled);
> @@ -4607,22 +4604,19 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
>  	void __iomem *ioaddr = tp->mmio_addr;
>  	int work_done;
>  
> +
> +	RTL_W16(IntrStatus, tp->napi_event);
> +
>  	work_done = rtl8169_rx_interrupt(dev, tp, ioaddr, (u32) budget);
>  	rtl8169_tx_interrupt(dev, tp, ioaddr);
>  
>  	if (work_done < budget) {
>  		napi_complete(napi);
>  
> -		/* We need for force the visibility of tp->intr_mask
> -		 * for other CPUs, as we can loose an MSI interrupt
> -		 * and potentially wait for a retransmit timeout if we don't.
> -		 * The posted write to IntrMask is safe, as it will
> -		 * eventually make it to the chip and we won't loose anything
> -		 * until it does.
> -		 */
> -		tp->intr_mask = 0xffff;
> -		smp_wmb();
>  		RTL_W16(IntrMask, tp->intr_event);
> +		mmiowb();
> +
> +		rtl_napi_cond_schedule(tp, RTL_R16(IntrStatus));
>  	}
>  
>  	return work_done;
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-17 23:55                       ` Francois Romieu
  2010-03-18 12:32                         ` Sergey Senozhatsky
@ 2010-03-18 13:31                         ` Sergey Senozhatsky
  1 sibling, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-18 13:31 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eric Dumazet, Oleg Nesterov, David Miller, netdev

[-- Attachment #1: Type: text/plain, Size: 6635 bytes --]

Hello,
Patched r8169 (both Eric's and Francois' patches are applied). /*pktgen localhost2localhost*/

[  498.818640] pktgen 2.72: Packet Generator for packet performance testing.
[  568.999957] ------------[ cut here ]------------
[  568.999969] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
[  568.999973] Hardware name: F3JC                
[  568.999976] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[  568.999979] Modules linked in: pktgen snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek asus_laptop sparse_keymap sdhci_pci sdhci snd_hda_intel mmc_core led_class
snd_hda_codec snd_pcm snd_timer psmouse snd soundcore snd_page_alloc sg evdev i2c_i801 rng_core serio_raw r8169 mii uhci_hcd sr_mod ehci_hcd cdrom sd_mod usbcore ata_piix
[  569.000029] Pid: 3350, comm: kpktgend_0 Tainted: G        W  2.6.34-rc1-dbg-git6-r8169 #48
[  569.000033] Call Trace:
[  569.000041]  [<c102e293>] warn_slowpath_common+0x65/0x7c
[  569.000046]  [<c126c024>] ? dev_watchdog+0xc1/0x125
[  569.000051]  [<c102e2de>] warn_slowpath_fmt+0x24/0x27
[  569.000056]  [<c126c024>] dev_watchdog+0xc1/0x125
[  569.000063]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
[  569.000069]  [<c1036b51>] run_timer_softirq+0x176/0x1eb
[  569.000074]  [<c1036afb>] ? run_timer_softirq+0x120/0x1eb
[  569.000079]  [<c126bf63>] ? dev_watchdog+0x0/0x125
[  569.000084]  [<c1032d39>] __do_softirq+0x8d/0x117
[  569.000089]  [<c1032dee>] do_softirq+0x2b/0x43
[  569.000097]  [<f809510d>] ? pktgen_xmit+0xdb7/0xe8e [pktgen]
[  569.000102]  [<c1032e9c>] _local_bh_enable_ip+0x88/0xb0
[  569.000107]  [<c1032ecc>] local_bh_enable_ip+0x8/0xa
[  569.000114]  [<c12c83a0>] _raw_spin_unlock_bh+0x2f/0x32
[  569.000120]  [<f809510d>] pktgen_xmit+0xdb7/0xe8e [pktgen]
[  569.000129]  [<f91674a9>] ? rtl8169_start_xmit+0x0/0x304 [r8169]
[  569.000136]  [<c1183940>] ? trace_hardirqs_on_thunk+0xc/0x10
[  569.000143]  [<c105012d>] ? print_lock_contention_bug+0x11/0xb2
[  569.000150]  [<f80935ca>] ? spin_lock+0x8/0xa [pktgen]
[  569.000156]  [<f80954a8>] pktgen_thread_worker+0x18d/0x631 [pktgen]
[  569.000163]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
[  569.000169]  [<c103f931>] ? autoremove_wake_function+0x0/0x2f
[  569.000175]  [<f809531b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
[  569.000180]  [<c103f60e>] kthread+0x6a/0x6f
[  569.000185]  [<c103f5a4>] ? kthread+0x0/0x6f
[  569.000191]  [<c1002e42>] kernel_thread_helper+0x6/0x1a
[  569.000195] ---[ end trace a22d306b065d4a68 ]---



	Sergey



On (03/18/10 00:55), Francois Romieu wrote:
> > I suspect lot of work is needed on this driver to make it working, but I
> > dont have a machine with said adapter.
> 
> This one should help too if Sergey owns a (MSI) 8168.
> 
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index dfc3573..721e7f3 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -4532,21 +4532,39 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
>  	return count;
>  }
>  
> +static void rtl_napi_cond_schedule(struct rtl8169_private *tp, u16 status)
> +{
> +	if (status & tp->napi_event) {
> +		void __iomem *ioaddr = tp->mmio_addr;
> +
> +		RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
> +		mmiowb();
> +		napi_schedule(&tp->napi);
> +	}
> +}
> +
>  static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  {
>  	struct net_device *dev = dev_instance;
>  	struct rtl8169_private *tp = netdev_priv(dev);
>  	void __iomem *ioaddr = tp->mmio_addr;
>  	int handled = 0;
> -	int status;
> +	u16 status;
>  
>  	/* loop handling interrupts until we have no new ones or
>  	 * we hit a invalid/hotplug case.
>  	 */
>  	status = RTL_R16(IntrStatus);
>  	while (status && status != 0xffff) {
> +		u16 acked;
> +
>  		handled = 1;
>  
> +		acked = (status & RxFIFOOver) ? (status | RxOverflow) : status;
> +		acked &= ~tp->napi_event;
> +
> +		RTL_W16(IntrStatus, acked);
> +
>  		/* Handle all of the error cases first. These will reset
>  		 * the chip, so just exit the loop.
>  		 */
> @@ -4557,7 +4575,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  
>  		/* Work around for rx fifo overflow */
>  		if (unlikely(status & RxFIFOOver) &&
> -		(tp->mac_version == RTL_GIGA_MAC_VER_11)) {
> +		    (tp->mac_version == RTL_GIGA_MAC_VER_11)) {
>  			netif_stop_queue(dev);
>  			rtl8169_tx_timeout(dev);
>  			break;
> @@ -4571,30 +4589,9 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>  		if (status & LinkChg)
>  			rtl8169_check_link_status(dev, tp, ioaddr);
>  
> -		/* We need to see the lastest version of tp->intr_mask to
> -		 * avoid ignoring an MSI interrupt and having to wait for
> -		 * another event which may never come.
> -		 */
> -		smp_rmb();
> -		if (status & tp->intr_mask & tp->napi_event) {
> -			RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
> -			tp->intr_mask = ~tp->napi_event;
> -
> -			if (likely(napi_schedule_prep(&tp->napi)))
> -				__napi_schedule(&tp->napi);
> -			else
> -				netif_info(tp, intr, dev,
> -					   "interrupt %04x in poll\n", status);
> -		}
> +		rtl_napi_cond_schedule(tp, status);
>  
> -		/* We only get a new MSI interrupt when all active irq
> -		 * sources on the chip have been acknowledged. So, ack
> -		 * everything we've seen and check if new sources have become
> -		 * active to avoid blocking all interrupts from the chip.
> -		 */
> -		RTL_W16(IntrStatus,
> -			(status & RxFIFOOver) ? (status | RxOverflow) : status);
> -		status = RTL_R16(IntrStatus);
> +		break;
>  	}
>  
>  	return IRQ_RETVAL(handled);
> @@ -4607,22 +4604,19 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
>  	void __iomem *ioaddr = tp->mmio_addr;
>  	int work_done;
>  
> +
> +	RTL_W16(IntrStatus, tp->napi_event);
> +
>  	work_done = rtl8169_rx_interrupt(dev, tp, ioaddr, (u32) budget);
>  	rtl8169_tx_interrupt(dev, tp, ioaddr);
>  
>  	if (work_done < budget) {
>  		napi_complete(napi);
>  
> -		/* We need for force the visibility of tp->intr_mask
> -		 * for other CPUs, as we can loose an MSI interrupt
> -		 * and potentially wait for a retransmit timeout if we don't.
> -		 * The posted write to IntrMask is safe, as it will
> -		 * eventually make it to the chip and we won't loose anything
> -		 * until it does.
> -		 */
> -		tp->intr_mask = 0xffff;
> -		smp_wmb();
>  		RTL_W16(IntrMask, tp->intr_event);
> +		mmiowb();
> +
> +		rtl_napi_cond_schedule(tp, RTL_R16(IntrStatus));
>  	}
>  
>  	return work_done;
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-16 18:48                 ` Eric Dumazet
  2010-03-16 19:02                   ` Sergey Senozhatsky
  2010-03-17  7:25                   ` Sergey Senozhatsky
@ 2010-03-25 11:30                   ` Sergey Senozhatsky
  2010-03-25 13:19                     ` Eric Dumazet
  2010-03-31 12:08                     ` Eric Dumazet
  2 siblings, 2 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-25 11:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

Hello,

On (03/16/10 19:48), Eric Dumazet wrote:
[..]
> OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
> or multiple invocation...
> 
> Did you tried removing the rtl8169_schedule_work() call from
> rtl8169_rx_interrupt() ?
> 
> Maybe the reset is not necessary at all in case of fifo overflow..
> 
> Cumulative patch :
> 
[..]

Eric, I think you can put Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
on the cumulative patch as I don't see any errors with pktgen over LAN testing.

Thanks.


	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-25 11:30                   ` Sergey Senozhatsky
@ 2010-03-25 13:19                     ` Eric Dumazet
  2010-03-25 13:48                       ` Sergey Senozhatsky
  2010-03-26 20:29                       ` François Romieu
  2010-03-31 12:08                     ` Eric Dumazet
  1 sibling, 2 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-03-25 13:19 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

Le jeudi 25 mars 2010 à 13:30 +0200, Sergey Senozhatsky a écrit :
> Hello,
> 
> On (03/16/10 19:48), Eric Dumazet wrote:
> [..]
> > OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
> > or multiple invocation...
> > 
> > Did you tried removing the rtl8169_schedule_work() call from
> > rtl8169_rx_interrupt() ?
> > 
> > Maybe the reset is not necessary at all in case of fifo overflow..
> > 
> > Cumulative patch :
> > 
> [..]
> 
> Eric, I think you can put Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> on the cumulative patch as I don't see any errors with pktgen over LAN testing.
> 

Hi

Thanks for the report. Problem is I am not confident enough on this
driver.
I believe François Romieu is the one that can possibly push a patch ?

Thanks



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-25 13:19                     ` Eric Dumazet
@ 2010-03-25 13:48                       ` Sergey Senozhatsky
  2010-03-26 20:29                       ` François Romieu
  1 sibling, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-03-25 13:48 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Sergey Senozhatsky, Oleg Nesterov, David Miller, Francois Romieu,
	netdev

[-- Attachment #1: Type: text/plain, Size: 532 bytes --]

On (03/25/10 14:19), Eric Dumazet wrote:
> > Eric, I think you can put Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > on the cumulative patch as I don't see any errors with pktgen over LAN testing.
> > 
> 
> Hi
> 
> Thanks for the report. Problem is I am not confident enough on this
> driver.
> I believe François Romieu is the one that can possibly push a patch ?
> 

Thanks.

Hm. François' patch introduced new 'problems'/'stack traces' with pktgen testing...
François, any comments?


	Sergey

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-25 13:19                     ` Eric Dumazet
  2010-03-25 13:48                       ` Sergey Senozhatsky
@ 2010-03-26 20:29                       ` François Romieu
  1 sibling, 0 replies; 37+ messages in thread
From: François Romieu @ 2010-03-26 20:29 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Sergey Senozhatsky, Oleg Nesterov, David Miller, netdev

Eric Dumazet <eric.dumazet@gmail.com> :
[...]
> Thanks for the report. Problem is I am not confident enough on this
> driver.

Your patch is simple and it improves things.

> I believe François Romieu is the one that can possibly push a patch ?

Feel free to push.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-25 11:30                   ` Sergey Senozhatsky
  2010-03-25 13:19                     ` Eric Dumazet
@ 2010-03-31 12:08                     ` Eric Dumazet
  2010-04-02  1:43                       ` David Miller
  1 sibling, 1 reply; 37+ messages in thread
From: Eric Dumazet @ 2010-03-31 12:08 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Oleg Nesterov, David Miller, Francois Romieu, netdev

Le jeudi 25 mars 2010 à 13:30 +0200, Sergey Senozhatsky a écrit :
> Hello,
> 
> On (03/16/10 19:48), Eric Dumazet wrote:
> [..]
> > OK thanks for the report, this rtl8169_reset_task() seems pretty buggy,
> > or multiple invocation...
> > 
> > Did you tried removing the rtl8169_schedule_work() call from
> > rtl8169_rx_interrupt() ?
> > 
> > Maybe the reset is not necessary at all in case of fifo overflow..
> > 
> > Cumulative patch :
> > 
> [..]
> 
> Eric, I think you can put Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> on the cumulative patch as I don't see any errors with pktgen over LAN testing.
> 

OK thanks, here is the patch then, but I prefer let the
rtl8169_schedule_work() from rtl8169_rx_interrupt() for now, its removal
might be the subject of another patch.

Lets be very conservative for now.

[PATCH net-next-2.6] r8169: Fix rtl8169_rx_interrupt()

In case a reset is performed, rtl8169_rx_interrupt() is called from
process context instead of softirq context. Special care must be taken
to call appropriate network core services (netif_rx() instead of
netif_receive_skb()). VLAN handling also corrected.

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/r8169.c |   22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 964305c..1dffe29 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1054,14 +1054,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	u32 opts2 = le32_to_cpu(desc->opts2);
 	struct vlan_group *vlgrp = tp->vlgrp;
 	int ret;
 
 	if (vlgrp && (opts2 & RxVlanTag)) {
-		vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
+		__vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
 		ret = 0;
 	} else
 		ret = -1;
@@ -1078,7 +1078,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
 }
 
 static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
-			       struct sk_buff *skb)
+			       struct sk_buff *skb, int polling)
 {
 	return -1;
 }
@@ -4467,12 +4467,20 @@ out:
 	return done;
 }
 
+/*
+ * Warning : rtl8169_rx_interrupt() might be called :
+ * 1) from NAPI (softirq) context
+ *	(polling = 1 : we should call netif_receive_skb())
+ * 2) from process context (rtl8169_reset_task())
+ *	(polling = 0 : we must call netif_rx() instead)
+ */		
 static int rtl8169_rx_interrupt(struct net_device *dev,
 				struct rtl8169_private *tp,
 				void __iomem *ioaddr, u32 budget)
 {
 	unsigned int cur_rx, rx_left;
 	unsigned int delta, count;
+	int polling = (budget != ~(u32)0) ? 1 : 0;
 
 	cur_rx = tp->cur_rx;
 	rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
@@ -4534,8 +4542,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);
 
-			if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
-				netif_receive_skb(skb);
+			if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
+				if (likely(polling))
+					netif_receive_skb(skb);
+				else
+					netif_rx(skb);
+			}
 
 			dev->stats.rx_bytes += pkt_size;
 			dev->stats.rx_packets++;



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-03-31 12:08                     ` Eric Dumazet
@ 2010-04-02  1:43                       ` David Miller
  2010-04-02 13:51                         ` Eric Dumazet
  0 siblings, 1 reply; 37+ messages in thread
From: David Miller @ 2010-04-02  1:43 UTC (permalink / raw)
  To: eric.dumazet; +Cc: sergey.senozhatsky, oleg, romieu, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 31 Mar 2010 14:08:31 +0200

> [PATCH net-next-2.6] r8169: Fix rtl8169_rx_interrupt()
> 
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied to net-next-2.6, but:

> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + *	(polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + *	(polling = 0 : we must call netif_rx() instead)
> + */		
      ^^^^^^^^

Trailing whitespace I had to delete.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] r8169: Fix rtl8169_rx_interrupt()
  2010-04-02  1:43                       ` David Miller
@ 2010-04-02 13:51                         ` Eric Dumazet
  0 siblings, 0 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-04-02 13:51 UTC (permalink / raw)
  To: David Miller; +Cc: sergey.senozhatsky, oleg, romieu, netdev

Le jeudi 01 avril 2010 à 18:43 -0700, David Miller a écrit :

> Applied to net-next-2.6, but:
> 
> > +/*
> > + * Warning : rtl8169_rx_interrupt() might be called :
> > + * 1) from NAPI (softirq) context
> > + *	(polling = 1 : we should call netif_receive_skb())
> > + * 2) from process context (rtl8169_reset_task())
> > + *	(polling = 0 : we must call netif_rx() instead)
> > + */		
>       ^^^^^^^^
> 
> Trailing whitespace I had to delete.

Thanks David



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
       [not found] <20100615112434.GA3967@swordfish.minsk.epam.com>
@ 2010-06-18 20:30 ` Andrew Morton
  2010-06-19  8:33   ` Sergey Senozhatsky
                     ` (4 more replies)
  0 siblings, 5 replies; 37+ messages in thread
From: Andrew Morton @ 2010-06-18 20:30 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Alexander Viro, Peter Zijlstra, Sage Weil, linux-fsdevel,
	linux-kernel, Dominik Brodowski, Maciej Rutecki, Eric Dumazet,
	Paul E. McKenney, Lai Jiangshan, David S. Miller, netdev


This was also reported by Dominik and is being tracked at
https://bugzilla.kernel.org/show_bug.cgi?id=16230

On Tue, 15 Jun 2010 14:24:34 +0300
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> Hello,
> 
> kernel: [ 3272.351191] 
> kernel: [ 3272.351194] =================================
> kernel: [ 3272.351199] [ INFO: inconsistent lock state ]
> kernel: [ 3272.351204] 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> kernel: [ 3272.351206] ---------------------------------
> kernel: [ 3272.351210] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> kernel: [ 3272.351215] X/3827 [HC0[0]:SC0[0]:HE1:SE1] takes:
> kernel: [ 3272.351218]  (&(&new->fa_lock)->rlock){?.-...}, at: [<c10aefb4>] kill_fasync+0x37/0x71
> kernel: [ 3272.351232] {IN-HARDIRQ-W} state was registered at:
> kernel: [ 3272.351235]   [<c104e95c>] __lock_acquire+0x281/0xbe1
> kernel: [ 3272.351243]   [<c104f652>] lock_acquire+0x59/0x70
> kernel: [ 3272.351248]   [<c12c6c48>] _raw_spin_lock+0x25/0x34
> kernel: [ 3272.351255]   [<c10aefb4>] kill_fasync+0x37/0x71
> kernel: [ 3272.351261]   [<fd220c81>] evdev_event+0x135/0x190 [evdev]
> kernel: [ 3272.351275]   [<c1232003>] input_pass_event+0x6f/0xae
> kernel: [ 3272.351283]   [<c1232ef5>] input_handle_event+0x38d/0x396
> kernel: [ 3272.351288]   [<c1232fbf>] input_event+0x4f/0x62
> kernel: [ 3272.351293]   [<c12368e4>] input_sync+0xe/0x11
> kernel: [ 3272.351299]   [<c1236d72>] atkbd_interrupt+0x48b/0x541
> kernel: [ 3272.351304]   [<c122ecb2>] serio_interrupt+0x35/0x68
> kernel: [ 3272.351309]   [<c122fbff>] i8042_interrupt+0x264/0x26e
> kernel: [ 3272.351314]   [<c106bb02>] handle_IRQ_event+0x1d/0x98
> kernel: [ 3272.351321]   [<c106d506>] handle_edge_irq+0xc0/0x107
> kernel: [ 3272.351326]   [<c10045ca>] handle_irq+0x1a/0x20
> kernel: [ 3272.351332]   [<c100435f>] do_IRQ+0x43/0x8d
> kernel: [ 3272.351337]   [<c1002d75>] common_interrupt+0x35/0x3c
> kernel: [ 3272.351342]   [<c124723d>] cpuidle_idle_call+0x6a/0xa0
> kernel: [ 3272.351349]   [<c100170d>] cpu_idle+0x89/0xbe
> kernel: [ 3272.351354]   [<c12b6d11>] rest_init+0xb5/0xba
> kernel: [ 3272.351361]   [<c148a7bf>] start_kernel+0x33b/0x340
> kernel: [ 3272.351368]   [<c148a0c9>] i386_start_kernel+0xc9/0xd0
> kernel: [ 3272.351374] irq event stamp: 54104917
> kernel: [ 3272.351377] hardirqs last  enabled at (54104917): [<c12c70f2>] _raw_spin_unlock_irqrestore+0x36/0x5b
> kernel: [ 3272.351384] hardirqs last disabled at (54104916): [<c12c6ced>] _raw_spin_lock_irqsave+0x13/0x42
> kernel: [ 3272.351391] softirqs last  enabled at (54104732): [<c1032cf2>] __do_softirq+0xfd/0x10c
> kernel: [ 3272.351398] softirqs last disabled at (54104703): [<c1032d30>] do_softirq+0x2f/0x47
> kernel: [ 3272.351404] 
> kernel: [ 3272.351405] other info that might help us debug this:
> kernel: [ 3272.351409] 3 locks held by X/3827:
> kernel: [ 3272.351412]  #0:  (rcu_read_lock){.+.+..}, at: [<c124fdfa>] rcu_read_lock+0x0/0x26
> kernel: [ 3272.351423]  #1:  (rcu_read_lock){.+.+..}, at: [<c124d5d9>] rcu_read_lock+0x0/0x26
> kernel: [ 3272.351432]  #2:  (rcu_read_lock){.+.+..}, at: [<c10ae429>] rcu_read_lock+0x0/0x26
> kernel: [ 3272.351442] 
> kernel: [ 3272.351443] stack backtrace:
> kernel: [ 3272.351448] Pid: 3827, comm: X Not tainted 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> kernel: [ 3272.351451] Call Trace:
> kernel: [ 3272.351456]  [<c12c4ff1>] ? printk+0xf/0x11
> kernel: [ 3272.351462]  [<c104e51a>] valid_state+0x133/0x141
> kernel: [ 3272.351468]  [<c104e5f7>] mark_lock+0xcf/0x1b3
> kernel: [ 3272.351473]  [<c104e54e>] ? mark_lock+0x26/0x1b3
> kernel: [ 3272.351479]  [<c104dfd2>] ? check_usage_backwards+0x0/0x68
> kernel: [ 3272.351484]  [<c104e9d0>] __lock_acquire+0x2f5/0xbe1
> kernel: [ 3272.351489]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> kernel: [ 3272.351495]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> kernel: [ 3272.351502]  [<c102ab40>] ? try_to_wake_up+0x2a8/0x2bb
> kernel: [ 3272.351508]  [<c104f652>] lock_acquire+0x59/0x70
> kernel: [ 3272.351513]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> kernel: [ 3272.351519]  [<c12c6c48>] _raw_spin_lock+0x25/0x34
> kernel: [ 3272.351524]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> kernel: [ 3272.351529]  [<c10aefb4>] kill_fasync+0x37/0x71
> kernel: [ 3272.351534]  [<c124d694>] sock_wake_async+0x77/0x83
> kernel: [ 3272.351540]  [<c124fe4d>] sk_wake_async+0x2d/0x32
> kernel: [ 3272.351545]  [<c1250004>] sock_def_readable+0x45/0x51
> kernel: [ 3272.351551]  [<c12b0247>] unix_stream_sendmsg+0x1e2/0x269
> kernel: [ 3272.351557]  [<c124fe6e>] ? rcu_read_unlock+0x1c/0x1e
> kernel: [ 3272.351562]  [<c124cf1a>] __sock_sendmsg+0x51/0x5a
> kernel: [ 3272.351567]  [<c124cff7>] sock_aio_write+0xd4/0xdd
> kernel: [ 3272.351575]  [<c10a4d95>] do_sync_readv_writev+0x84/0xb7
> kernel: [ 3272.351582]  [<c10a4288>] ? copy_from_user+0x8/0xa
> kernel: [ 3272.351587]  [<c10a4e69>] ? rw_copy_check_uvector+0x55/0xc7
> kernel: [ 3272.351594]  [<c1164082>] ? security_file_permission+0xf/0x11
> kernel: [ 3272.351599]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> kernel: [ 3272.351605]  [<c10a4f58>] do_readv_writev+0x7d/0xdf
> kernel: [ 3272.351610]  [<c124cf23>] ? sock_aio_write+0x0/0xdd
> kernel: [ 3272.351615]  [<c1164082>] ? security_file_permission+0xf/0x11
> kernel: [ 3272.351621]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> kernel: [ 3272.351626]  [<c10a4ff3>] vfs_writev+0x39/0x42
> kernel: [ 3272.351632]  [<c10a5102>] sys_writev+0x3b/0x8c
> kernel: [ 3272.351637]  [<c10027d3>] sysenter_do_call+0x12/0x32
> 

This, I think?


From: Andrew Morton <akpm@linux-foundation.org>

Fix a lockdep-splat-causing regression introduced by

: commit 989a2979205dd34269382b357e6d4b4b6956b889
: Author:     Eric Dumazet <eric.dumazet@gmail.com>
: AuthorDate: Wed Apr 14 09:55:35 2010 +0000
: Commit:     David S. Miller <davem@davemloft.net>
: CommitDate: Wed Apr 21 16:19:29 2010 -0700
: 
:     fasync: RCU and fine grained locking

kill_fasync() can be called from both process and hard-irq context, so
fa_lock must be taken with IRQs disabled.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fcntl.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff -puN fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe fs/fcntl.c
--- a/fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe
+++ a/fs/fcntl.c
@@ -733,12 +733,14 @@ static void kill_fasync_rcu(struct fasyn
 {
 	while (fa) {
 		struct fown_struct *fown;
+		unsigned long flags;
+
 		if (fa->magic != FASYNC_MAGIC) {
 			printk(KERN_ERR "kill_fasync: bad magic number in "
 			       "fasync_struct!\n");
 			return;
 		}
-		spin_lock(&fa->fa_lock);
+		spin_lock_irqsave(&fa->fa_lock, flags);
 		if (fa->fa_file) {
 			fown = &fa->fa_file->f_owner;
 			/* Don't send SIGURG to processes which have not set a
@@ -747,7 +749,7 @@ static void kill_fasync_rcu(struct fasyn
 			if (!(sig == SIGURG && fown->signum == 0))
 				send_sigio(fown, fa->fa_fd, band);
 		}
-		spin_unlock(&fa->fa_lock);
+		spin_unlock_irqrestore(&fa->fa_lock, flags);
 		fa = rcu_dereference(fa->fa_next);
 	}
 }
_


afaict all other lockers of fa_lock are OK (but one never really knows
with spin_lock_irq()).

Guys, please review-and-ack and I'll get it merged up.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-06-18 20:30 ` inconsistent lock state Andrew Morton
@ 2010-06-19  8:33   ` Sergey Senozhatsky
  2010-06-19 13:05   ` Dominik Brodowski
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 37+ messages in thread
From: Sergey Senozhatsky @ 2010-06-19  8:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Alexander Viro, Peter Zijlstra, Sage Weil,
	linux-fsdevel, linux-kernel, Dominik Brodowski, Maciej Rutecki,
	Eric Dumazet, Paul E. McKenney, Lai Jiangshan, David S. Miller,
	netdev

[-- Attachment #1: Type: text/plain, Size: 8190 bytes --]

Hello Andrew,

Thanks. I'll test.


	Sergey


On (06/18/10 13:30), Andrew Morton wrote:
> This was also reported by Dominik and is being tracked at
> https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> On Tue, 15 Jun 2010 14:24:34 +0300
> Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:
> 
> > Hello,
> > 
> > kernel: [ 3272.351191] 
> > kernel: [ 3272.351194] =================================
> > kernel: [ 3272.351199] [ INFO: inconsistent lock state ]
> > kernel: [ 3272.351204] 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> > kernel: [ 3272.351206] ---------------------------------
> > kernel: [ 3272.351210] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> > kernel: [ 3272.351215] X/3827 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > kernel: [ 3272.351218]  (&(&new->fa_lock)->rlock){?.-...}, at: [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351232] {IN-HARDIRQ-W} state was registered at:
> > kernel: [ 3272.351235]   [<c104e95c>] __lock_acquire+0x281/0xbe1
> > kernel: [ 3272.351243]   [<c104f652>] lock_acquire+0x59/0x70
> > kernel: [ 3272.351248]   [<c12c6c48>] _raw_spin_lock+0x25/0x34
> > kernel: [ 3272.351255]   [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351261]   [<fd220c81>] evdev_event+0x135/0x190 [evdev]
> > kernel: [ 3272.351275]   [<c1232003>] input_pass_event+0x6f/0xae
> > kernel: [ 3272.351283]   [<c1232ef5>] input_handle_event+0x38d/0x396
> > kernel: [ 3272.351288]   [<c1232fbf>] input_event+0x4f/0x62
> > kernel: [ 3272.351293]   [<c12368e4>] input_sync+0xe/0x11
> > kernel: [ 3272.351299]   [<c1236d72>] atkbd_interrupt+0x48b/0x541
> > kernel: [ 3272.351304]   [<c122ecb2>] serio_interrupt+0x35/0x68
> > kernel: [ 3272.351309]   [<c122fbff>] i8042_interrupt+0x264/0x26e
> > kernel: [ 3272.351314]   [<c106bb02>] handle_IRQ_event+0x1d/0x98
> > kernel: [ 3272.351321]   [<c106d506>] handle_edge_irq+0xc0/0x107
> > kernel: [ 3272.351326]   [<c10045ca>] handle_irq+0x1a/0x20
> > kernel: [ 3272.351332]   [<c100435f>] do_IRQ+0x43/0x8d
> > kernel: [ 3272.351337]   [<c1002d75>] common_interrupt+0x35/0x3c
> > kernel: [ 3272.351342]   [<c124723d>] cpuidle_idle_call+0x6a/0xa0
> > kernel: [ 3272.351349]   [<c100170d>] cpu_idle+0x89/0xbe
> > kernel: [ 3272.351354]   [<c12b6d11>] rest_init+0xb5/0xba
> > kernel: [ 3272.351361]   [<c148a7bf>] start_kernel+0x33b/0x340
> > kernel: [ 3272.351368]   [<c148a0c9>] i386_start_kernel+0xc9/0xd0
> > kernel: [ 3272.351374] irq event stamp: 54104917
> > kernel: [ 3272.351377] hardirqs last  enabled at (54104917): [<c12c70f2>] _raw_spin_unlock_irqrestore+0x36/0x5b
> > kernel: [ 3272.351384] hardirqs last disabled at (54104916): [<c12c6ced>] _raw_spin_lock_irqsave+0x13/0x42
> > kernel: [ 3272.351391] softirqs last  enabled at (54104732): [<c1032cf2>] __do_softirq+0xfd/0x10c
> > kernel: [ 3272.351398] softirqs last disabled at (54104703): [<c1032d30>] do_softirq+0x2f/0x47
> > kernel: [ 3272.351404] 
> > kernel: [ 3272.351405] other info that might help us debug this:
> > kernel: [ 3272.351409] 3 locks held by X/3827:
> > kernel: [ 3272.351412]  #0:  (rcu_read_lock){.+.+..}, at: [<c124fdfa>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351423]  #1:  (rcu_read_lock){.+.+..}, at: [<c124d5d9>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351432]  #2:  (rcu_read_lock){.+.+..}, at: [<c10ae429>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351442] 
> > kernel: [ 3272.351443] stack backtrace:
> > kernel: [ 3272.351448] Pid: 3827, comm: X Not tainted 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> > kernel: [ 3272.351451] Call Trace:
> > kernel: [ 3272.351456]  [<c12c4ff1>] ? printk+0xf/0x11
> > kernel: [ 3272.351462]  [<c104e51a>] valid_state+0x133/0x141
> > kernel: [ 3272.351468]  [<c104e5f7>] mark_lock+0xcf/0x1b3
> > kernel: [ 3272.351473]  [<c104e54e>] ? mark_lock+0x26/0x1b3
> > kernel: [ 3272.351479]  [<c104dfd2>] ? check_usage_backwards+0x0/0x68
> > kernel: [ 3272.351484]  [<c104e9d0>] __lock_acquire+0x2f5/0xbe1
> > kernel: [ 3272.351489]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> > kernel: [ 3272.351495]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> > kernel: [ 3272.351502]  [<c102ab40>] ? try_to_wake_up+0x2a8/0x2bb
> > kernel: [ 3272.351508]  [<c104f652>] lock_acquire+0x59/0x70
> > kernel: [ 3272.351513]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> > kernel: [ 3272.351519]  [<c12c6c48>] _raw_spin_lock+0x25/0x34
> > kernel: [ 3272.351524]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> > kernel: [ 3272.351529]  [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351534]  [<c124d694>] sock_wake_async+0x77/0x83
> > kernel: [ 3272.351540]  [<c124fe4d>] sk_wake_async+0x2d/0x32
> > kernel: [ 3272.351545]  [<c1250004>] sock_def_readable+0x45/0x51
> > kernel: [ 3272.351551]  [<c12b0247>] unix_stream_sendmsg+0x1e2/0x269
> > kernel: [ 3272.351557]  [<c124fe6e>] ? rcu_read_unlock+0x1c/0x1e
> > kernel: [ 3272.351562]  [<c124cf1a>] __sock_sendmsg+0x51/0x5a
> > kernel: [ 3272.351567]  [<c124cff7>] sock_aio_write+0xd4/0xdd
> > kernel: [ 3272.351575]  [<c10a4d95>] do_sync_readv_writev+0x84/0xb7
> > kernel: [ 3272.351582]  [<c10a4288>] ? copy_from_user+0x8/0xa
> > kernel: [ 3272.351587]  [<c10a4e69>] ? rw_copy_check_uvector+0x55/0xc7
> > kernel: [ 3272.351594]  [<c1164082>] ? security_file_permission+0xf/0x11
> > kernel: [ 3272.351599]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> > kernel: [ 3272.351605]  [<c10a4f58>] do_readv_writev+0x7d/0xdf
> > kernel: [ 3272.351610]  [<c124cf23>] ? sock_aio_write+0x0/0xdd
> > kernel: [ 3272.351615]  [<c1164082>] ? security_file_permission+0xf/0x11
> > kernel: [ 3272.351621]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> > kernel: [ 3272.351626]  [<c10a4ff3>] vfs_writev+0x39/0x42
> > kernel: [ 3272.351632]  [<c10a5102>] sys_writev+0x3b/0x8c
> > kernel: [ 3272.351637]  [<c10027d3>] sysenter_do_call+0x12/0x32
> > 
> 
> This, I think?
> 
> 
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Fix a lockdep-splat-causing regression introduced by
> 
> : commit 989a2979205dd34269382b357e6d4b4b6956b889
> : Author:     Eric Dumazet <eric.dumazet@gmail.com>
> : AuthorDate: Wed Apr 14 09:55:35 2010 +0000
> : Commit:     David S. Miller <davem@davemloft.net>
> : CommitDate: Wed Apr 21 16:19:29 2010 -0700
> : 
> :     fasync: RCU and fine grained locking
> 
> kill_fasync() can be called from both process and hard-irq context, so
> fa_lock must be taken with IRQs disabled.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/fcntl.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff -puN fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe fs/fcntl.c
> --- a/fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe
> +++ a/fs/fcntl.c
> @@ -733,12 +733,14 @@ static void kill_fasync_rcu(struct fasyn
>  {
>  	while (fa) {
>  		struct fown_struct *fown;
> +		unsigned long flags;
> +
>  		if (fa->magic != FASYNC_MAGIC) {
>  			printk(KERN_ERR "kill_fasync: bad magic number in "
>  			       "fasync_struct!\n");
>  			return;
>  		}
> -		spin_lock(&fa->fa_lock);
> +		spin_lock_irqsave(&fa->fa_lock, flags);
>  		if (fa->fa_file) {
>  			fown = &fa->fa_file->f_owner;
>  			/* Don't send SIGURG to processes which have not set a
> @@ -747,7 +749,7 @@ static void kill_fasync_rcu(struct fasyn
>  			if (!(sig == SIGURG && fown->signum == 0))
>  				send_sigio(fown, fa->fa_fd, band);
>  		}
> -		spin_unlock(&fa->fa_lock);
> +		spin_unlock_irqrestore(&fa->fa_lock, flags);
>  		fa = rcu_dereference(fa->fa_next);
>  	}
>  }
> _
> 
> 
> afaict all other lockers of fa_lock are OK (but one never really knows
> with spin_lock_irq()).
> 
> Guys, please review-and-ack and I'll get it merged up.
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-06-18 20:30 ` inconsistent lock state Andrew Morton
  2010-06-19  8:33   ` Sergey Senozhatsky
@ 2010-06-19 13:05   ` Dominik Brodowski
  2010-06-21  6:54   ` Peter Zijlstra
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 37+ messages in thread
From: Dominik Brodowski @ 2010-06-19 13:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Alexander Viro, Peter Zijlstra, Sage Weil,
	linux-fsdevel, linux-kernel, Maciej Rutecki, Eric Dumazet,
	Paul E. McKenney, Lai Jiangshan, David S. Miller, netdev

Hey,

On Fri, Jun 18, 2010 at 01:30:04PM -0700, Andrew Morton wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Fix a lockdep-splat-causing regression introduced by
> 
> : commit 989a2979205dd34269382b357e6d4b4b6956b889
> : Author:     Eric Dumazet <eric.dumazet@gmail.com>
> : AuthorDate: Wed Apr 14 09:55:35 2010 +0000
> : Commit:     David S. Miller <davem@davemloft.net>
> : CommitDate: Wed Apr 21 16:19:29 2010 -0700
> : 
> :     fasync: RCU and fine grained locking
> 
> kill_fasync() can be called from both process and hard-irq context, so
> fa_lock must be taken with IRQs disabled.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230

Seems to work fine -- thanks!

Best,
	Dominik

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-06-18 20:30 ` inconsistent lock state Andrew Morton
  2010-06-19  8:33   ` Sergey Senozhatsky
  2010-06-19 13:05   ` Dominik Brodowski
@ 2010-06-21  6:54   ` Peter Zijlstra
  2010-06-22  8:59   ` Eric Dumazet
  2010-06-22 17:17   ` Paul E. McKenney
  4 siblings, 0 replies; 37+ messages in thread
From: Peter Zijlstra @ 2010-06-21  6:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Alexander Viro, Sage Weil, linux-fsdevel,
	linux-kernel, Dominik Brodowski, Maciej Rutecki, Eric Dumazet,
	Paul E.McKenney, Lai Jiangshan, David S.Miller, netdev

On Fri, 2010-06-18 at 13:30 -0700, Andrew Morton wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Fix a lockdep-splat-causing regression introduced by
> 
> : commit 989a2979205dd34269382b357e6d4b4b6956b889
> : Author:     Eric Dumazet <eric.dumazet@gmail.com>
> : AuthorDate: Wed Apr 14 09:55:35 2010 +0000
> : Commit:     David S. Miller <davem@davemloft.net>
> : CommitDate: Wed Apr 21 16:19:29 2010 -0700
> : 
> :     fasync: RCU and fine grained locking
> 
> kill_fasync() can be called from both process and hard-irq context, so
> fa_lock must be taken with IRQs disabled.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> 

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-06-18 20:30 ` inconsistent lock state Andrew Morton
                     ` (2 preceding siblings ...)
  2010-06-21  6:54   ` Peter Zijlstra
@ 2010-06-22  8:59   ` Eric Dumazet
  2010-06-22 17:17   ` Paul E. McKenney
  4 siblings, 0 replies; 37+ messages in thread
From: Eric Dumazet @ 2010-06-22  8:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Alexander Viro, Peter Zijlstra, Sage Weil,
	linux-fsdevel, linux-kernel, Dominik Brodowski, Maciej Rutecki,
	Paul E.McKenney, Lai Jiangshan, David S.Miller, netdev

Le vendredi 18 juin 2010 à 13:30 -0700, Andrew Morton a écrit : 
> This was also reported by Dominik and is being tracked at
> https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> On Tue, 15 Jun 2010 14:24:34 +0300
> Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:
> 
> > Hello,
> > 
> > kernel: [ 3272.351191] 
> > kernel: [ 3272.351194] =================================
> > kernel: [ 3272.351199] [ INFO: inconsistent lock state ]
> > kernel: [ 3272.351204] 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> > kernel: [ 3272.351206] ---------------------------------
> > kernel: [ 3272.351210] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> > kernel: [ 3272.351215] X/3827 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > kernel: [ 3272.351218]  (&(&new->fa_lock)->rlock){?.-...}, at: [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351232] {IN-HARDIRQ-W} state was registered at:
> > kernel: [ 3272.351235]   [<c104e95c>] __lock_acquire+0x281/0xbe1
> > kernel: [ 3272.351243]   [<c104f652>] lock_acquire+0x59/0x70
> > kernel: [ 3272.351248]   [<c12c6c48>] _raw_spin_lock+0x25/0x34
> > kernel: [ 3272.351255]   [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351261]   [<fd220c81>] evdev_event+0x135/0x190 [evdev]
> > kernel: [ 3272.351275]   [<c1232003>] input_pass_event+0x6f/0xae
> > kernel: [ 3272.351283]   [<c1232ef5>] input_handle_event+0x38d/0x396
> > kernel: [ 3272.351288]   [<c1232fbf>] input_event+0x4f/0x62
> > kernel: [ 3272.351293]   [<c12368e4>] input_sync+0xe/0x11
> > kernel: [ 3272.351299]   [<c1236d72>] atkbd_interrupt+0x48b/0x541
> > kernel: [ 3272.351304]   [<c122ecb2>] serio_interrupt+0x35/0x68
> > kernel: [ 3272.351309]   [<c122fbff>] i8042_interrupt+0x264/0x26e
> > kernel: [ 3272.351314]   [<c106bb02>] handle_IRQ_event+0x1d/0x98
> > kernel: [ 3272.351321]   [<c106d506>] handle_edge_irq+0xc0/0x107
> > kernel: [ 3272.351326]   [<c10045ca>] handle_irq+0x1a/0x20
> > kernel: [ 3272.351332]   [<c100435f>] do_IRQ+0x43/0x8d
> > kernel: [ 3272.351337]   [<c1002d75>] common_interrupt+0x35/0x3c
> > kernel: [ 3272.351342]   [<c124723d>] cpuidle_idle_call+0x6a/0xa0
> > kernel: [ 3272.351349]   [<c100170d>] cpu_idle+0x89/0xbe
> > kernel: [ 3272.351354]   [<c12b6d11>] rest_init+0xb5/0xba
> > kernel: [ 3272.351361]   [<c148a7bf>] start_kernel+0x33b/0x340
> > kernel: [ 3272.351368]   [<c148a0c9>] i386_start_kernel+0xc9/0xd0
> > kernel: [ 3272.351374] irq event stamp: 54104917
> > kernel: [ 3272.351377] hardirqs last  enabled at (54104917): [<c12c70f2>] _raw_spin_unlock_irqrestore+0x36/0x5b
> > kernel: [ 3272.351384] hardirqs last disabled at (54104916): [<c12c6ced>] _raw_spin_lock_irqsave+0x13/0x42
> > kernel: [ 3272.351391] softirqs last  enabled at (54104732): [<c1032cf2>] __do_softirq+0xfd/0x10c
> > kernel: [ 3272.351398] softirqs last disabled at (54104703): [<c1032d30>] do_softirq+0x2f/0x47
> > kernel: [ 3272.351404] 
> > kernel: [ 3272.351405] other info that might help us debug this:
> > kernel: [ 3272.351409] 3 locks held by X/3827:
> > kernel: [ 3272.351412]  #0:  (rcu_read_lock){.+.+..}, at: [<c124fdfa>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351423]  #1:  (rcu_read_lock){.+.+..}, at: [<c124d5d9>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351432]  #2:  (rcu_read_lock){.+.+..}, at: [<c10ae429>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351442] 
> > kernel: [ 3272.351443] stack backtrace:
> > kernel: [ 3272.351448] Pid: 3827, comm: X Not tainted 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> > kernel: [ 3272.351451] Call Trace:
> > kernel: [ 3272.351456]  [<c12c4ff1>] ? printk+0xf/0x11
> > kernel: [ 3272.351462]  [<c104e51a>] valid_state+0x133/0x141
> > kernel: [ 3272.351468]  [<c104e5f7>] mark_lock+0xcf/0x1b3
> > kernel: [ 3272.351473]  [<c104e54e>] ? mark_lock+0x26/0x1b3
> > kernel: [ 3272.351479]  [<c104dfd2>] ? check_usage_backwards+0x0/0x68
> > kernel: [ 3272.351484]  [<c104e9d0>] __lock_acquire+0x2f5/0xbe1
> > kernel: [ 3272.351489]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> > kernel: [ 3272.351495]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> > kernel: [ 3272.351502]  [<c102ab40>] ? try_to_wake_up+0x2a8/0x2bb
> > kernel: [ 3272.351508]  [<c104f652>] lock_acquire+0x59/0x70
> > kernel: [ 3272.351513]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> > kernel: [ 3272.351519]  [<c12c6c48>] _raw_spin_lock+0x25/0x34
> > kernel: [ 3272.351524]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> > kernel: [ 3272.351529]  [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351534]  [<c124d694>] sock_wake_async+0x77/0x83
> > kernel: [ 3272.351540]  [<c124fe4d>] sk_wake_async+0x2d/0x32
> > kernel: [ 3272.351545]  [<c1250004>] sock_def_readable+0x45/0x51
> > kernel: [ 3272.351551]  [<c12b0247>] unix_stream_sendmsg+0x1e2/0x269
> > kernel: [ 3272.351557]  [<c124fe6e>] ? rcu_read_unlock+0x1c/0x1e
> > kernel: [ 3272.351562]  [<c124cf1a>] __sock_sendmsg+0x51/0x5a
> > kernel: [ 3272.351567]  [<c124cff7>] sock_aio_write+0xd4/0xdd
> > kernel: [ 3272.351575]  [<c10a4d95>] do_sync_readv_writev+0x84/0xb7
> > kernel: [ 3272.351582]  [<c10a4288>] ? copy_from_user+0x8/0xa
> > kernel: [ 3272.351587]  [<c10a4e69>] ? rw_copy_check_uvector+0x55/0xc7
> > kernel: [ 3272.351594]  [<c1164082>] ? security_file_permission+0xf/0x11
> > kernel: [ 3272.351599]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> > kernel: [ 3272.351605]  [<c10a4f58>] do_readv_writev+0x7d/0xdf
> > kernel: [ 3272.351610]  [<c124cf23>] ? sock_aio_write+0x0/0xdd
> > kernel: [ 3272.351615]  [<c1164082>] ? security_file_permission+0xf/0x11
> > kernel: [ 3272.351621]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> > kernel: [ 3272.351626]  [<c10a4ff3>] vfs_writev+0x39/0x42
> > kernel: [ 3272.351632]  [<c10a5102>] sys_writev+0x3b/0x8c
> > kernel: [ 3272.351637]  [<c10027d3>] sysenter_do_call+0x12/0x32
> > 
> 
> This, I think?
> 
> 
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Fix a lockdep-splat-causing regression introduced by
> 
> : commit 989a2979205dd34269382b357e6d4b4b6956b889
> : Author:     Eric Dumazet <eric.dumazet@gmail.com>
> : AuthorDate: Wed Apr 14 09:55:35 2010 +0000
> : Commit:     David S. Miller <davem@davemloft.net>
> : CommitDate: Wed Apr 21 16:19:29 2010 -0700
> : 
> :     fasync: RCU and fine grained locking
> 
> kill_fasync() can be called from both process and hard-irq context, so
> fa_lock must be taken with IRQs disabled.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/fcntl.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff -puN fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe fs/fcntl.c
> --- a/fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe
> +++ a/fs/fcntl.c
> @@ -733,12 +733,14 @@ static void kill_fasync_rcu(struct fasyn
>  {
>  	while (fa) {
>  		struct fown_struct *fown;
> +		unsigned long flags;
> +
>  		if (fa->magic != FASYNC_MAGIC) {
>  			printk(KERN_ERR "kill_fasync: bad magic number in "
>  			       "fasync_struct!\n");
>  			return;
>  		}
> -		spin_lock(&fa->fa_lock);
> +		spin_lock_irqsave(&fa->fa_lock, flags);
>  		if (fa->fa_file) {
>  			fown = &fa->fa_file->f_owner;
>  			/* Don't send SIGURG to processes which have not set a
> @@ -747,7 +749,7 @@ static void kill_fasync_rcu(struct fasyn
>  			if (!(sig == SIGURG && fown->signum == 0))
>  				send_sigio(fown, fa->fa_fd, band);
>  		}
> -		spin_unlock(&fa->fa_lock);
> +		spin_unlock_irqrestore(&fa->fa_lock, flags);
>  		fa = rcu_dereference(fa->fa_next);
>  	}
>  }
> _
> 
> 
> afaict all other lockers of fa_lock are OK (but one never really knows
> with spin_lock_irq()).
> 
> Guys, please review-and-ack and I'll get it merged up.
> 

Sorry for the delay, I was travelling...

Alternative solution would be to change fasync_remove_entry() and
fasync_add_entry() to use spin_lock_irq(&fasync_lock) and
spin_lock(&fa->fa_lock), but we would disable IRQ on possibly long scans
(as before my "fasync: RCU and fine grained locking" patch).

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: inconsistent lock state
  2010-06-18 20:30 ` inconsistent lock state Andrew Morton
                     ` (3 preceding siblings ...)
  2010-06-22  8:59   ` Eric Dumazet
@ 2010-06-22 17:17   ` Paul E. McKenney
  4 siblings, 0 replies; 37+ messages in thread
From: Paul E. McKenney @ 2010-06-22 17:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Alexander Viro, Peter Zijlstra, Sage Weil,
	linux-fsdevel, linux-kernel, Dominik Brodowski, Maciej Rutecki,
	Eric Dumazet, Lai Jiangshan, David S. Miller, netdev

On Fri, Jun 18, 2010 at 01:30:04PM -0700, Andrew Morton wrote:
> 
> This was also reported by Dominik and is being tracked at
> https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> On Tue, 15 Jun 2010 14:24:34 +0300
> Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:
> 
> > Hello,
> > 
> > kernel: [ 3272.351191] 
> > kernel: [ 3272.351194] =================================
> > kernel: [ 3272.351199] [ INFO: inconsistent lock state ]
> > kernel: [ 3272.351204] 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> > kernel: [ 3272.351206] ---------------------------------
> > kernel: [ 3272.351210] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> > kernel: [ 3272.351215] X/3827 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > kernel: [ 3272.351218]  (&(&new->fa_lock)->rlock){?.-...}, at: [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351232] {IN-HARDIRQ-W} state was registered at:
> > kernel: [ 3272.351235]   [<c104e95c>] __lock_acquire+0x281/0xbe1
> > kernel: [ 3272.351243]   [<c104f652>] lock_acquire+0x59/0x70
> > kernel: [ 3272.351248]   [<c12c6c48>] _raw_spin_lock+0x25/0x34
> > kernel: [ 3272.351255]   [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351261]   [<fd220c81>] evdev_event+0x135/0x190 [evdev]
> > kernel: [ 3272.351275]   [<c1232003>] input_pass_event+0x6f/0xae
> > kernel: [ 3272.351283]   [<c1232ef5>] input_handle_event+0x38d/0x396
> > kernel: [ 3272.351288]   [<c1232fbf>] input_event+0x4f/0x62
> > kernel: [ 3272.351293]   [<c12368e4>] input_sync+0xe/0x11
> > kernel: [ 3272.351299]   [<c1236d72>] atkbd_interrupt+0x48b/0x541
> > kernel: [ 3272.351304]   [<c122ecb2>] serio_interrupt+0x35/0x68
> > kernel: [ 3272.351309]   [<c122fbff>] i8042_interrupt+0x264/0x26e
> > kernel: [ 3272.351314]   [<c106bb02>] handle_IRQ_event+0x1d/0x98
> > kernel: [ 3272.351321]   [<c106d506>] handle_edge_irq+0xc0/0x107
> > kernel: [ 3272.351326]   [<c10045ca>] handle_irq+0x1a/0x20
> > kernel: [ 3272.351332]   [<c100435f>] do_IRQ+0x43/0x8d
> > kernel: [ 3272.351337]   [<c1002d75>] common_interrupt+0x35/0x3c
> > kernel: [ 3272.351342]   [<c124723d>] cpuidle_idle_call+0x6a/0xa0
> > kernel: [ 3272.351349]   [<c100170d>] cpu_idle+0x89/0xbe
> > kernel: [ 3272.351354]   [<c12b6d11>] rest_init+0xb5/0xba
> > kernel: [ 3272.351361]   [<c148a7bf>] start_kernel+0x33b/0x340
> > kernel: [ 3272.351368]   [<c148a0c9>] i386_start_kernel+0xc9/0xd0
> > kernel: [ 3272.351374] irq event stamp: 54104917
> > kernel: [ 3272.351377] hardirqs last  enabled at (54104917): [<c12c70f2>] _raw_spin_unlock_irqrestore+0x36/0x5b
> > kernel: [ 3272.351384] hardirqs last disabled at (54104916): [<c12c6ced>] _raw_spin_lock_irqsave+0x13/0x42
> > kernel: [ 3272.351391] softirqs last  enabled at (54104732): [<c1032cf2>] __do_softirq+0xfd/0x10c
> > kernel: [ 3272.351398] softirqs last disabled at (54104703): [<c1032d30>] do_softirq+0x2f/0x47
> > kernel: [ 3272.351404] 
> > kernel: [ 3272.351405] other info that might help us debug this:
> > kernel: [ 3272.351409] 3 locks held by X/3827:
> > kernel: [ 3272.351412]  #0:  (rcu_read_lock){.+.+..}, at: [<c124fdfa>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351423]  #1:  (rcu_read_lock){.+.+..}, at: [<c124d5d9>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351432]  #2:  (rcu_read_lock){.+.+..}, at: [<c10ae429>] rcu_read_lock+0x0/0x26
> > kernel: [ 3272.351442] 
> > kernel: [ 3272.351443] stack backtrace:
> > kernel: [ 3272.351448] Pid: 3827, comm: X Not tainted 2.6.35-rc3-dbg-00106-ga75e02b-dirty #15
> > kernel: [ 3272.351451] Call Trace:
> > kernel: [ 3272.351456]  [<c12c4ff1>] ? printk+0xf/0x11
> > kernel: [ 3272.351462]  [<c104e51a>] valid_state+0x133/0x141
> > kernel: [ 3272.351468]  [<c104e5f7>] mark_lock+0xcf/0x1b3
> > kernel: [ 3272.351473]  [<c104e54e>] ? mark_lock+0x26/0x1b3
> > kernel: [ 3272.351479]  [<c104dfd2>] ? check_usage_backwards+0x0/0x68
> > kernel: [ 3272.351484]  [<c104e9d0>] __lock_acquire+0x2f5/0xbe1
> > kernel: [ 3272.351489]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> > kernel: [ 3272.351495]  [<c104ea44>] ? __lock_acquire+0x369/0xbe1
> > kernel: [ 3272.351502]  [<c102ab40>] ? try_to_wake_up+0x2a8/0x2bb
> > kernel: [ 3272.351508]  [<c104f652>] lock_acquire+0x59/0x70
> > kernel: [ 3272.351513]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> > kernel: [ 3272.351519]  [<c12c6c48>] _raw_spin_lock+0x25/0x34
> > kernel: [ 3272.351524]  [<c10aefb4>] ? kill_fasync+0x37/0x71
> > kernel: [ 3272.351529]  [<c10aefb4>] kill_fasync+0x37/0x71
> > kernel: [ 3272.351534]  [<c124d694>] sock_wake_async+0x77/0x83
> > kernel: [ 3272.351540]  [<c124fe4d>] sk_wake_async+0x2d/0x32
> > kernel: [ 3272.351545]  [<c1250004>] sock_def_readable+0x45/0x51
> > kernel: [ 3272.351551]  [<c12b0247>] unix_stream_sendmsg+0x1e2/0x269
> > kernel: [ 3272.351557]  [<c124fe6e>] ? rcu_read_unlock+0x1c/0x1e
> > kernel: [ 3272.351562]  [<c124cf1a>] __sock_sendmsg+0x51/0x5a
> > kernel: [ 3272.351567]  [<c124cff7>] sock_aio_write+0xd4/0xdd
> > kernel: [ 3272.351575]  [<c10a4d95>] do_sync_readv_writev+0x84/0xb7
> > kernel: [ 3272.351582]  [<c10a4288>] ? copy_from_user+0x8/0xa
> > kernel: [ 3272.351587]  [<c10a4e69>] ? rw_copy_check_uvector+0x55/0xc7
> > kernel: [ 3272.351594]  [<c1164082>] ? security_file_permission+0xf/0x11
> > kernel: [ 3272.351599]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> > kernel: [ 3272.351605]  [<c10a4f58>] do_readv_writev+0x7d/0xdf
> > kernel: [ 3272.351610]  [<c124cf23>] ? sock_aio_write+0x0/0xdd
> > kernel: [ 3272.351615]  [<c1164082>] ? security_file_permission+0xf/0x11
> > kernel: [ 3272.351621]  [<c10a47e5>] ? rw_verify_area+0x90/0xac
> > kernel: [ 3272.351626]  [<c10a4ff3>] vfs_writev+0x39/0x42
> > kernel: [ 3272.351632]  [<c10a5102>] sys_writev+0x3b/0x8c
> > kernel: [ 3272.351637]  [<c10027d3>] sysenter_do_call+0x12/0x32
> > 
> 
> This, I think?
> 
> 
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Fix a lockdep-splat-causing regression introduced by
> 
> : commit 989a2979205dd34269382b357e6d4b4b6956b889
> : Author:     Eric Dumazet <eric.dumazet@gmail.com>
> : AuthorDate: Wed Apr 14 09:55:35 2010 +0000
> : Commit:     David S. Miller <davem@davemloft.net>
> : CommitDate: Wed Apr 21 16:19:29 2010 -0700
> : 
> :     fasync: RCU and fine grained locking
> 
> kill_fasync() can be called from both process and hard-irq context, so
> fa_lock must be taken with IRQs disabled.
> 
> Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16230
> 
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/fcntl.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff -puN fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe fs/fcntl.c
> --- a/fs/fcntl.c~fs-fcntlc-kill_fasync_rcu-fa_lock-must-be-irq-safe
> +++ a/fs/fcntl.c
> @@ -733,12 +733,14 @@ static void kill_fasync_rcu(struct fasyn
>  {
>  	while (fa) {
>  		struct fown_struct *fown;
> +		unsigned long flags;
> +
>  		if (fa->magic != FASYNC_MAGIC) {
>  			printk(KERN_ERR "kill_fasync: bad magic number in "
>  			       "fasync_struct!\n");
>  			return;
>  		}
> -		spin_lock(&fa->fa_lock);
> +		spin_lock_irqsave(&fa->fa_lock, flags);
>  		if (fa->fa_file) {
>  			fown = &fa->fa_file->f_owner;
>  			/* Don't send SIGURG to processes which have not set a
> @@ -747,7 +749,7 @@ static void kill_fasync_rcu(struct fasyn
>  			if (!(sig == SIGURG && fown->signum == 0))
>  				send_sigio(fown, fa->fa_fd, band);
>  		}
> -		spin_unlock(&fa->fa_lock);
> +		spin_unlock_irqrestore(&fa->fa_lock, flags);
>  		fa = rcu_dereference(fa->fa_next);
>  	}
>  }
> _
> 
> 
> afaict all other lockers of fa_lock are OK (but one never really knows
> with spin_lock_irq()).
> 
> Guys, please review-and-ack and I'll get it merged up.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2010-06-22 17:17 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20100307192305.GA598@elte.hu>
2010-03-08 12:51 ` inconsistent lock state Oleg Nesterov
2010-03-15 21:01   ` Eric Dumazet
2010-03-15 21:09     ` David Miller
2010-03-16  0:33     ` [PATCH] r8169: Fix rtl8169_rx_interrupt() Eric Dumazet
2010-03-16  6:50       ` Sergey Senozhatsky
2010-03-16  7:12         ` Eric Dumazet
2010-03-16 14:50       ` Sergey Senozhatsky
2010-03-16 15:00       ` Sergey Senozhatsky
2010-03-16 15:05         ` Eric Dumazet
2010-03-16 15:10           ` Sergey Senozhatsky
2010-03-16 15:20             ` Eric Dumazet
2010-03-16 18:26               ` Sergey Senozhatsky
2010-03-16 18:48                 ` Eric Dumazet
2010-03-16 19:02                   ` Sergey Senozhatsky
2010-03-17  7:25                   ` Sergey Senozhatsky
2010-03-17  7:37                     ` Eric Dumazet
2010-03-17  7:58                       ` Sergey Senozhatsky
2010-03-17 10:58                       ` Sergey Senozhatsky
2010-03-17 13:54                         ` Eric Dumazet
2010-03-18 12:28                           ` Sergey Senozhatsky
2010-03-17 23:55                       ` Francois Romieu
2010-03-18 12:32                         ` Sergey Senozhatsky
2010-03-18 13:31                         ` Sergey Senozhatsky
2010-03-25 11:30                   ` Sergey Senozhatsky
2010-03-25 13:19                     ` Eric Dumazet
2010-03-25 13:48                       ` Sergey Senozhatsky
2010-03-26 20:29                       ` François Romieu
2010-03-31 12:08                     ` Eric Dumazet
2010-04-02  1:43                       ` David Miller
2010-04-02 13:51                         ` Eric Dumazet
     [not found] <20100615112434.GA3967@swordfish.minsk.epam.com>
2010-06-18 20:30 ` inconsistent lock state Andrew Morton
2010-06-19  8:33   ` Sergey Senozhatsky
2010-06-19 13:05   ` Dominik Brodowski
2010-06-21  6:54   ` Peter Zijlstra
2010-06-22  8:59   ` Eric Dumazet
2010-06-22 17:17   ` Paul E. McKenney
2010-03-07 10:21 Sergey Senozhatsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).