public inbox for linux-kernel@vger.kernel.org
* [PATCH] net_device refcnt bug when NFQUEUEing bridged packets
From: Jan Christoph Nordholz @ 2008-01-15 23:05 UTC
  To: shemminger, kaber; +Cc: linux-kernel

Hi,

I came across the following bug a few weeks ago (it still applies to
2.6.24-rc7):

Packets that are to be sent out over a bridge device are skb_clone()d in
br_loop() before traversing the appropriate (FORWARD/OUTPUT) NF chain.
The copies made by skb_clone() share their nf_bridge metadata with the
original, which is usually not a problem.
If, however, one or more packets of a br_loop() run end up in an NFQUEUE,
the shared nf_bridge metadata causes trouble when they are reinjected:
nf_reinject() decrements the net_device refcounts that __nf_queue()
incremented when the packet was queued, but because
skb->nf_bridge->physoutdev points to the same device for all of these
packets, most (if not all) of them will decrement the wrong refcnt.
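
To make the mismatch concrete, here is a minimal userspace sketch, not
kernel code. Every name in it (toy_dev, toy_bridge_info, toy_skb,
toy_queue, toy_reinject, toy_flood) is a hypothetical stand-in, and it
assumes that each clone records its own outgoing port in the metadata
just before it is queued. With shared metadata the second packet
overwrites the device pointer, so both "reinject" calls drop a reference
on the second port while the first port keeps a stray one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are illustrative. */
struct toy_dev { const char *name; int refcnt; };
struct toy_bridge_info { struct toy_dev *physoutdev; };
struct toy_skb { struct toy_bridge_info *nf_bridge; };

/* Queueing step: pin whatever device the metadata names right now. */
static void toy_queue(struct toy_skb *skb)
{
	skb->nf_bridge->physoutdev->refcnt++;
}

/* Reinjection step: unpin whatever device the metadata names by then. */
static void toy_reinject(struct toy_skb *skb)
{
	skb->nf_bridge->physoutdev->refcnt--;
}

/*
 * Send one packet to port_a and one to port_b, queue both, reinject both.
 * shared != 0: both packets point at the same metadata (clone behaviour).
 * shared == 0: each packet has private metadata (what a real copy gives).
 */
static void toy_flood(int shared, struct toy_dev *port_a, struct toy_dev *port_b)
{
	struct toy_bridge_info info1 = { NULL }, info2 = { NULL };
	struct toy_skb skb1 = { &info1 };
	struct toy_skb skb2 = { shared ? &info1 : &info2 };

	skb1.nf_bridge->physoutdev = port_a;
	toy_queue(&skb1);
	skb2.nf_bridge->physoutdev = port_b;	/* clobbers skb1's view if shared */
	toy_queue(&skb2);

	toy_reinject(&skb1);
	toy_reinject(&skb2);
}

/* Runs both variants and checks the resulting counters. */
static void toy_demo(void)
{
	struct toy_dev a = { "a", 0 }, b = { "b", 0 };
	struct toy_dev c = { "c", 0 }, d = { "d", 0 };

	toy_flood(1, &a, &b);
	assert(a.refcnt == 1 && b.refcnt == -1);	/* unbalanced */

	toy_flood(0, &c, &d);
	assert(c.refcnt == 0 && d.refcnt == 0);		/* balanced */
}
```

Calling toy_demo() from a small main() exercises both cases; the shared
variant leaves one device over-referenced and the other under-referenced,
which is the kind of anomaly described above.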

(I originally encountered the bug on a Xen host because the hypervisor
refused to shut down a virtual device with a non-zero refcount... but it
is perfectly reproducible with a standard kernel, too, although creating
a test scenario was a bit more tedious and involved a couple of UMLs.)

I'd suggest making a real copy of the nf_bridge member in br_loop() if
CONFIG_BRIDGE_NETFILTER is defined. I've attached a patch that
illustrates how to fix the bug; the machine on which I found it has been
running a kernel with this patch for weeks without any further refcount
anomalies. I admit the patch is ugly, though: it returns the reference
acquired by __nf_copy() and then copies the structure manually...
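
As a rough illustration of the "real copy" idea, independent of where the
logic finally lives, here is a userspace sketch. The names demo_info,
demo_info_put and demo_info_unshare are hypothetical and are not kernel
APIs; malloc()/free() stand in for the kernel allocator. Unlike the
attached patch, the sketch copies first and drops the shared reference
afterwards, so an allocation failure leaves the caller's reference
untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted metadata block, loosely modelled on nf_bridge. */
struct demo_info {
	int use;		/* number of holders */
	int physoutdev_id;	/* stand-in for a device pointer */
};

/* Drop one reference and free the block when the last holder is gone. */
static void demo_info_put(struct demo_info *info)
{
	if (info && --info->use == 0)
		free(info);
}

/*
 * Trade the caller's reference on a shared block for a private copy with
 * use == 1. Returns NULL on allocation failure or NULL input; in that
 * case the caller still owns its reference on the shared block.
 */
static struct demo_info *demo_info_unshare(struct demo_info *shared)
{
	struct demo_info *priv;

	if (shared == NULL)
		return NULL;

	priv = malloc(sizeof(*priv));
	if (priv == NULL)
		return NULL;

	memcpy(priv, shared, sizeof(*priv));
	priv->use = 1;			/* the copy starts with a single holder */
	demo_info_put(shared);		/* give back the reference we held */
	return priv;
}

/* Shows that the copy and the original no longer influence each other. */
static void demo_selfcheck(void)
{
	struct demo_info *orig = malloc(sizeof(*orig));
	struct demo_info *priv;

	assert(orig != NULL);
	orig->use = 2;			/* original packet plus its clone */
	orig->physoutdev_id = 7;

	priv = demo_info_unshare(orig);
	assert(priv != NULL && priv != orig);
	assert(priv->use == 1 && orig->use == 1);

	priv->physoutdev_id = 9;	/* private change stays private */
	assert(orig->physoutdev_id == 7);

	demo_info_put(priv);
	demo_info_put(orig);
}
```

demo_selfcheck() can be called from a small main(); it only checks that
the copy gets its own counter and that later writes to one block do not
show up in the other.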

Please tell me where that logic should really go (skbuff.h? br_netfilter.c?)
so I can wrap up a final and CodingStyle-conformant version, or feel free
to simply apply a modified version.


Regards,

Jan


Signed-off-by: Jan Christoph Nordholz <hesso@pool.math.tu-berlin.de>
---
diff -Naur linux-2.6.24-rc7/ linux/
--- linux-2.6.24-rc7/net/bridge/br_forward.c
+++ linux/net/bridge/br_forward.c
@@ -120,6 +120,20 @@
					return;
				}
 
+#ifdef CONFIG_BRIDGE_NETFILTER
+				if (skb->nf_bridge) {
+					nf_bridge_put(skb2->nf_bridge);
+					if ((skb2->nf_bridge = kzalloc(sizeof(struct nf_bridge_info), GFP_ATOMIC)) == NULL) {
+						br->statistics.tx_dropped++;
+						kfree_skb(skb2);
+						kfree_skb(skb);
+						return;
+					}
+					memcpy(skb2->nf_bridge, skb->nf_bridge, sizeof(struct nf_bridge_info));
+					atomic_set(&(skb2->nf_bridge->use), 1);
+				}
+#endif
+
				__packet_hook(prev, skb2);
			}


* Re: [PATCH] net_device refcnt bug when NFQUEUEing bridged packets
From: Stephen Hemminger @ 2008-01-15 23:11 UTC
  To: Jan Christoph Nordholz; +Cc: kaber, linux-kernel

Please file a bug in the kernel Bugzilla.  I'm not sure that the patch is
the proper way to fix this.

-- 
Stephen Hemminger <stephen.hemminger@vyatta.com>
