* [PATCH] Deadlock in af_packet/packet_rcv
@ 2004-11-25 20:55 Olaf Kirch
2004-11-30 4:02 ` David S. Miller
2004-11-30 9:32 ` Tommy Christensen
0 siblings, 2 replies; 9+ messages in thread
From: Olaf Kirch @ 2004-11-25 20:55 UTC (permalink / raw)
To: netdev
Before the introduction of lock-less loopback, all of dev_queue_xmit_nit was
running with bottom halves disabled, and some code seems to rely on
this. One example is af_packet's packet_rcv handler, which is invoked
on the transmit path via dev_queue_xmit_nit. It takes a spin lock,
and if an interrupt occurs and calls dev_queue_xmit during this time,
the CPU deadlocks.
The patch below disables BHs while in the TX path for loopback and
similar devices.
An alternative, less conservative fix would be to just use spin_lock_bh
in af_packet.c.
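For illustration only (just a sketch, not part of this patch, and the exact
surrounding code in packet_rcv() may differ a bit), that would be something
like:

	/* net/packet/af_packet.c, packet_rcv() -- hypothetical variant */
	spin_lock_bh(&sk->sk_receive_queue.lock);	/* was spin_lock() */
	__skb_queue_tail(&sk->sk_receive_queue, skb);	/* queue the clone as before */
	spin_unlock_bh(&sk->sk_receive_queue.lock);	/* was spin_unlock() */
	sk->sk_data_ready(sk, skb->len);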
Signed-off-by: Olaf Kirch <okir@suse.de>
Index: linux-2.6.9/net/core/dev.c
===================================================================
--- linux-2.6.9.orig/net/core/dev.c
+++ linux-2.6.9/net/core/dev.c
@@ -1222,6 +1222,7 @@ int __skb_linearize(struct sk_buff *skb,
}
#define HARD_TX_LOCK(dev, cpu) { \
+	local_bh_disable(); \
if ((dev->features & NETIF_F_LLTX) == 0) { \
spin_lock(&dev->xmit_lock); \
dev->xmit_lock_owner = cpu; \
@@ -1233,6 +1234,7 @@ int __skb_linearize(struct sk_buff *skb,
dev->xmit_lock_owner = -1; \
spin_unlock(&dev->xmit_lock); \
} \
+	local_bh_enable(); \
}
static inline void qdisc_run(struct net_device *dev)
--
Olaf Kirch | Things that make Monday morning interesting, #2:
okir@suse.de | "We have 8,000 NFS mount points, why do we keep
---------------+ running out of privileged ports?"
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-25 20:55 [PATCH] Deadlock in af_packet/packet_rcv Olaf Kirch
@ 2004-11-30 4:02 ` David S. Miller
2004-11-30 10:48 ` Olaf Kirch
2004-11-30 9:32 ` Tommy Christensen
1 sibling, 1 reply; 9+ messages in thread
From: David S. Miller @ 2004-11-30 4:02 UTC (permalink / raw)
To: Olaf Kirch; +Cc: netdev
On Thu, 25 Nov 2004 21:55:03 +0100
Olaf Kirch <okir@suse.de> wrote:
> Before the introduction of lock-less loopback, all of dev_queue_xmit_nit was
> running with bottom halves disabled, and some code seems to rely on
> this. One example is af_packet's packet_rcv handler, which is invoked
> on the transmit path via dev_queue_xmit_nit. It takes a spin lock,
> and if an interrupt occurs and calls dev_queue_xmit during this time,
> the CPU deadlocks.
>
> The patch below disables BHs while in the TX path for loopback and
> similar devices.
>
> An alternative, less conservative fix would be to just use spin_lock_bh
> in af_packet.c.
Good spotting.
If it's only the dev_queue_xmit_nit() handlers which all want
BHs disabled, why don't we just disable BHs in that function?
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-25 20:55 [PATCH] Deadlock in af_packet/packet_rcv Olaf Kirch
2004-11-30 4:02 ` David S. Miller
@ 2004-11-30 9:32 ` Tommy Christensen
2004-11-30 11:01 ` Olaf Kirch
1 sibling, 1 reply; 9+ messages in thread
From: Tommy Christensen @ 2004-11-30 9:32 UTC (permalink / raw)
To: Olaf Kirch; +Cc: netdev
Olaf Kirch wrote:
> Before the introduction of lock-less loopback, all of dev_queue_xmit_nit was
> running with bottom halves disabled, and some code seems to rely on
> this. One example is af_packet's packet_rcv handler, which is invoked
> on the transmit path via dev_queue_xmit_nit. It takes a spin lock,
> and if an interrupt occurs and calls dev_queue_xmit during this time,
> the CPU deadlocks.
An interrupt handler shouldn't call dev_queue_xmit() directly. If
this indeed happens, it needs to be fixed. Which handler is this?
> The patch below disables BHs while in the TX path for loopback and
> similar devices.
dev_queue_xmit() already has BHs disabled, so this won't do anything.
And weren't you talking about interrupts?
Protocol handlers are called with bottom halves disabled, but IRQs
are enabled. This hasn't changed - and is assumed in lots of places.
-Tommy
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-30 4:02 ` David S. Miller
@ 2004-11-30 10:48 ` Olaf Kirch
0 siblings, 0 replies; 9+ messages in thread
From: Olaf Kirch @ 2004-11-30 10:48 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
On Mon, Nov 29, 2004 at 08:02:20PM -0800, David S. Miller wrote:
> If it's only the dev_queue_xmit_nit() handlers which all want
> BHs disabled, why don't we just disable BHs in that function?
That was my original fix (see attachment), but Andi Kleen suggested
disabling BHs at the HARD_TX_LOCK level instead; I think as a matter of caution.
Olaf
--
Olaf Kirch | Things that make Monday morning interesting, #2:
okir@suse.de | "We have 8,000 NFS mount points, why do we keep
---------------+ running out of privileged ports?"
[-- Attachment #2: netdev-xmit --]
[-- Type: text/plain, Size: 626 bytes --]
Index: linux-2.6.9/net/core/dev.c
===================================================================
--- linux-2.6.9.orig/net/core/dev.c 2004-10-18 23:54:08.000000000 +0200
+++ linux-2.6.9/net/core/dev.c 2004-11-30 11:46:39.000000000 +0100
@@ -1094,9 +1094,15 @@ void dev_queue_xmit_nit(struct sk_buff *
skb2->nh.raw = skb2->data;
}
+ /* The packet handler may expect BHs to be disabled,
+ * so don't let them down */
+ local_bh_disable();
+
skb2->h.raw = skb2->nh.raw;
skb2->pkt_type = PACKET_OUTGOING;
ptype->func(skb2, skb->dev, ptype);
+
+ local_bh_enable();
}
}
rcu_read_unlock();
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-30 9:32 ` Tommy Christensen
@ 2004-11-30 11:01 ` Olaf Kirch
2004-11-30 11:31 ` Tommy Christensen
0 siblings, 1 reply; 9+ messages in thread
From: Olaf Kirch @ 2004-11-30 11:01 UTC (permalink / raw)
To: Tommy Christensen; +Cc: netdev
On Tue, Nov 30, 2004 at 10:32:31AM +0100, Tommy Christensen wrote:
> An interrupt handler shouldn't call dev_queue_xmit() directly. If
> this indeed happens, it needs to be fixed. Which handler is this?
The call path according to KDB goes like this:
application does sendmsg()
  udp_push_pending_frames
    ip_push_pending_frames
      ip_output
        dev_queue_xmit
          dev_queue_xmit_nit
            calls ptype->func(skb2, skb->dev, ptype),
            where func=packet_rcv
            packet_rcv (and this runs with BHs enabled)
              take the &sk->sk_receive_queue spinlock
              *** timer interrupt
                net_tx_action
                  take the dev->queue_lock spin lock
                  qdisc_run
                    qdisc_restart
                      dev_queue_xmit_nit
                        as above
                        packet_rcv
                          blocks on the &sk->sk_receive_queue spinlock
Before lockless-loopback this never triggered because we did a
spin_lock_bh(&dev->xmit_lock) around the call to dev_queue_xmit_nit.
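Roughly, from memory (so take the exact details with a grain of salt),
the old transmit path was:

	/* pre-LLTX dev_queue_xmit(), paraphrased -- not an exact excerpt */
	spin_lock_bh(&dev->xmit_lock);		/* BHs off from here on */
	dev->xmit_lock_owner = cpu;
	if (netdev_nit)
		dev_queue_xmit_nit(skb, dev);	/* packet_rcv() ran with BHs disabled */
	dev->hard_start_xmit(skb, dev);
	dev->xmit_lock_owner = -1;
	spin_unlock_bh(&dev->xmit_lock);

so net_tx_action could not sneak in on the same CPU while packet_rcv was
holding the receive-queue lock.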
Olaf
--
Olaf Kirch | Things that make Monday morning interesting, #2:
okir@suse.de | "We have 8,000 NFS mount points, why do we keep
---------------+ running out of privileged ports?"
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-30 11:01 ` Olaf Kirch
@ 2004-11-30 11:31 ` Tommy Christensen
2004-11-30 11:45 ` Olaf Kirch
0 siblings, 1 reply; 9+ messages in thread
From: Tommy Christensen @ 2004-11-30 11:31 UTC (permalink / raw)
To: Olaf Kirch; +Cc: netdev
Olaf Kirch wrote:
> On Tue, Nov 30, 2004 at 10:32:31AM +0100, Tommy Christensen wrote:
>
>>An interrupt handler shouldn't call dev_queue_xmit() directly. If
>>this indeed happens, it needs to be fixed. Which handler is this?
>
>
> The call path according to KDB goes like this:
>
> application does sendmsg()
>   udp_push_pending_frames
>     ip_push_pending_frames
>       ip_output
>         dev_queue_xmit
>           dev_queue_xmit_nit
>             calls ptype->func(skb2, skb->dev, ptype),
>             where func=packet_rcv
>             packet_rcv (and this runs with BHs enabled)
>               take the &sk->sk_receive_queue spinlock
>               *** timer interrupt
>                 net_tx_action
>                   take the dev->queue_lock spin lock
>                   qdisc_run
>                     qdisc_restart
>                       dev_queue_xmit_nit
>                         as above
>                         packet_rcv
>                           blocks on the &sk->sk_receive_queue spinlock
>
> Before lockless-loopback this never triggered because we did a
> spin_lock_bh(&dev->xmit_lock) around the call to dev_queue_xmit_nit.
>
> Olaf
Ahh, back-traces are *so* nice to have.
I still don't agree with the conclusion, though. The spin_lock_bh()
is changed to a local_bh_disable() and an optional spin_lock().
That should not lead to what you are seeing!
I think perhaps your 'BH disabled count' has been corrupted.
There's a fix for that in 2.6.10-rc2.
-Tommy
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-30 11:31 ` Tommy Christensen
@ 2004-11-30 11:45 ` Olaf Kirch
2004-11-30 11:56 ` Tommy Christensen
0 siblings, 1 reply; 9+ messages in thread
From: Olaf Kirch @ 2004-11-30 11:45 UTC (permalink / raw)
To: Tommy Christensen; +Cc: netdev
On Tue, Nov 30, 2004 at 12:31:50PM +0100, Tommy Christensen wrote:
> I still don't agree with the conclusion, though. The spin_lock_bh()
> is changed to a local_bh_disable() and an optional spin_lock().
> That should not lead to what you are seeing!
Well, the code in 2.6.9 has
#define HARD_TX_LOCK(dev, cpu) { \
if ((dev->features & NETIF_F_LLTX) == 0) { \
spin_lock(&dev->xmit_lock); \
dev->xmit_lock_owner = cpu; \
} \
}
i.e. there's no local_bh_disable at all - adding the local_bh_disable
was the whole point of my patch. Or did you refer to a different spinlock?
Olaf
--
Olaf Kirch | Things that make Monday morning interesting, #2:
okir@suse.de | "We have 8,000 NFS mount points, why do we keep
---------------+ running out of privileged ports?"
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-30 11:45 ` Olaf Kirch
@ 2004-11-30 11:56 ` Tommy Christensen
2004-11-30 21:07 ` David S. Miller
0 siblings, 1 reply; 9+ messages in thread
From: Tommy Christensen @ 2004-11-30 11:56 UTC (permalink / raw)
To: Olaf Kirch; +Cc: netdev
Olaf Kirch wrote:
> On Tue, Nov 30, 2004 at 12:31:50PM +0100, Tommy Christensen wrote:
>
>>I still don't agree with the conclusion, though. The spin_lock_bh()
>>is changed to a local_bh_disable() and an optional spin_lock().
>>That should not lead to what you are seeing!
>
>
> Well, the code in 2.6.9 has
>
> #define HARD_TX_LOCK(dev, cpu) { \
> if ((dev->features & NETIF_F_LLTX) == 0) { \
> spin_lock(&dev->xmit_lock); \
> dev->xmit_lock_owner = cpu; \
> } \
> }
>
> i.e. there's no local_bh_disable at all - adding the local_bh_disable
> was the whole point of my patch. Or did you refer to a different spinlock?
The local_bh_disable() is called earlier in dev_queue_xmit(), and BHs
stay disabled across the whole HARD_TX_LOCK/dev_queue_xmit_nit/
hard_start_xmit/HARD_TX_UNLOCK sequence.
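Paraphrasing the 2.6.9 flow from memory (not an exact excerpt):

	/* net/core/dev.c, dev_queue_xmit() */
	local_bh_disable();			/* BHs off for everything below */
	...
	HARD_TX_LOCK(dev, cpu);			/* spin_lock(&dev->xmit_lock) unless NETIF_F_LLTX */
	if (!netif_queue_stopped(dev)) {
		if (netdev_nit)
			dev_queue_xmit_nit(skb, dev);	/* packet_rcv() and friends */
		dev->hard_start_xmit(skb, dev);
	}
	HARD_TX_UNLOCK(dev);
	...
	local_bh_enable();

So packet_rcv() should still see BHs disabled on this path, LLTX or not.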
-Tommy
* Re: [PATCH] Deadlock in af_packet/packet_rcv
2004-11-30 11:56 ` Tommy Christensen
@ 2004-11-30 21:07 ` David S. Miller
0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2004-11-30 21:07 UTC (permalink / raw)
To: Tommy Christensen; +Cc: okir, netdev
On Tue, 30 Nov 2004 12:56:30 +0100
Tommy Christensen <tommy.christensen@tpack.net> wrote:
> > i.e. there's no local_bh_disable at all - adding the local_bh_disable
> > was the whole point of my patch. Or did you refer to a different spinlock?
>
> The local_bh_disable() is called earlier in dev_queue_xmit(), and BHs
> stay disabled across the whole HARD_TX_LOCK/dev_queue_xmit_nit/
> hard_start_xmit/HARD_TX_UNLOCK sequence.
He's right. There must be something else happening in your
tree, Olaf.