Ethernet Bridge development
* [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
  2006-04-14 13:42 [Bridge] " Chris Rankin
@ 2006-04-14 16:40 ` Stephen Hemminger
       [not found]   ` <20060414192656.870.qmail@web52914.mail.yahoo.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2006-04-14 16:40 UTC (permalink / raw)
  To: Chris Rankin; +Cc: nfs, bridge

On Fri, 14 Apr 2006 14:42:20 +0100 (BST)
Chris Rankin <rankincj@yahoo.com> wrote:

> Hi,
> 
> I have 2 servers which are connected to a gateway machine. The gateway and one server are running
> Linux 2.6.16.2, while the third machine is running 2.6.16.5. The two ethernet ports on the gateway
> which are connected to the servers are combined into a single ethernet bridge device.
> 
> Ever since 2.6.16, I have noticed that I can no longer cross-mount the two servers' /home
> directories via UDP NFS. Which is to say that the mount command succeeds, but that trying to
> access the filesystem makes the process hang and the "NFS server not responding" message appear
> in the console log. This is true regardless of which machine is the NFS server and which is the
> NFS client.
> 
> It all works fine if I use TCP NFS instead.
> 
> Also, UDP NFS works OK between any server and the gateway itself, so it only goes wrong when UDP
> NFS traffic is forwarded across the bridge. (I have not changed my firewall rules, which just tell
> the gateway to forward all traffic coming in from the bridge device anyway.)
> 
> Can anyone reproduce this, please? I obviously have a workaround (using TCP instead of UDP) but it
> sounds like there's a bug somewhere.
> 
> Cheers,
> Chris

Most likely the problem is that the two devices in the bridge have different MTUs.
The bridge silently drops packets that are too large for the destination port (it's in the 802.1D
standard). TCP has path MTU discovery and is smart enough to recover.  UDP doesn't do
that.

Anyway, don't run NFS over UDP unless you want data corruption.  There are sequence number wraparound
issues that are unsolvable when running NFS over UDP/IP on faster links.
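
As a rough sketch of why the wraparound matters, assume it refers to the 16-bit
IPv4 identification field that a receiver uses to match fragments to their
datagram. The helper name ip_id_wrap_ms and the 8192-byte datagram size used
below are illustrative assumptions, not taken from this thread. The helper
estimates how long a sender transmitting back-to-back datagrams takes to reuse
an ID, ignoring header overhead.

```c
#include <assert.h>
#include <stdint.h>

/* The IPv4 identification field is 16 bits wide: 65536 distinct values. */
#define IP_ID_SPACE 65536ULL

/*
 * Hypothetical helper: milliseconds until a sender that emits back-to-back
 * datagrams of `datagram_bytes` on a link of `link_bits_per_sec` has cycled
 * through every IP ID once. Header overhead is ignored, so this is only an
 * order-of-magnitude estimate.
 */
uint64_t ip_id_wrap_ms(uint64_t datagram_bytes, uint64_t link_bits_per_sec)
{
	assert(datagram_bytes > 0 && link_bits_per_sec > 0);

	/* Bits sent before the 16-bit counter returns to its starting value. */
	uint64_t bits_per_cycle = IP_ID_SPACE * datagram_bytes * 8;

	return bits_per_cycle * 1000 / link_bits_per_sec;
}
```

Under these assumptions, 8192-byte datagrams reuse an ID after about 43 seconds
at 100 Mbit/s but after only about 4.3 seconds at 1 Gbit/s. The second figure is
well inside the 30-second fragment reassembly timeout that Linux uses by
default, so a stale fragment can end up spliced into a newer datagram that
carries the same ID.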


* [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
       [not found]   ` <20060414192656.870.qmail@web52914.mail.yahoo.com>
@ 2006-04-14 20:53     ` Stephen Hemminger
       [not found]       ` <20060414205815.15338.qmail@web52911.mail.yahoo.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2006-04-14 20:53 UTC (permalink / raw)
  To: Chris Rankin; +Cc: nfs, bridge

On Fri, 14 Apr 2006 20:26:56 +0100 (BST)
Chris Rankin <rankincj@yahoo.com> wrote:

> --- Stephen Hemminger <shemminger@osdl.org> wrote:
> > Most likely the problem is that the MTU on the two devices in the bridge is different.
> > The bridge will silently drop packets if they are too large for the destination port (it's in
> > the 802.1d standard). TCP has path mtu discovery and is smart enough to recover.  UDP doesn't do
> > that.
> 
> Hi,
> 
> Thanks for the tip about the dangers of NFS and UDP. However, I don't think that the MTU is the
> problem. All the ethernet devices (including the bridge) have an MTU of 1500, and according to my
> routing table, only the default route has a lower MTU. Both servers are configured like this:
> 
> 192.168.0.0/24 dev eth0  proto kernel  scope link  src 192.168.0.x
> 169.254.0.0/16 dev eth0  scope link
> default via 192.168.0.1 dev eth0  src 192.168.0.x  advmss 1452
> 
> eth0      Link encap:Ethernet  HWaddr nn:nn:nn:nn:nn:nn
>           inet addr:192.168.0.x  Bcast:192.168.0.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:6817 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:4951 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:1840267 (1.7 MiB)  TX bytes:678873 (662.9 KiB)
>           Base address:0xdcc0 Memory:ff6e0000-ff700000
> 
> So all traffic between 192.168.0.x machines should be routed with an MTU of 1500.
> 
> Cheers,
> Chris

What is the MTU of eth0 and eth1 on the bridge?



-- 
Stephen Hemminger <shemminger@osdl.org>
OSDL http://developer.osdl.org/~shemminger


* [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
       [not found]       ` <20060414205815.15338.qmail@web52911.mail.yahoo.com>
@ 2006-04-14 21:00         ` Stephen Hemminger
       [not found]           ` <20060414221312.64298.qmail@web52901.mail.yahoo.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2006-04-14 21:00 UTC (permalink / raw)
  To: Chris Rankin; +Cc: nfs, bridge

On Fri, 14 Apr 2006 21:58:15 +0100 (BST)
Chris Rankin <rankincj@yahoo.com> wrote:

> --- Stephen Hemminger <shemminger@osdl.org> wrote:
> > What is the mtu of eth0 and eth1 on the bridge?
> 
> 1500 on both eth0 and eth1, and on the actual bridge device too.

Are you doing brouting or filtering? Or VLANs?
Are the Ethernet devices on the bridge doing hardware checksumming?
What kernel version and configuration is the bridge running?


* [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
       [not found]           ` <20060414221312.64298.qmail@web52901.mail.yahoo.com>
@ 2006-04-14 22:17             ` Stephen Hemminger
  0 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2006-04-14 22:17 UTC (permalink / raw)
  To: Chris Rankin; +Cc: nfs, bridge

On Fri, 14 Apr 2006 23:13:12 +0100 (BST)
Chris Rankin <rankincj@yahoo.com> wrote:

> --- Stephen Hemminger <shemminger@osdl.org> wrote:
> > Are you doing brouting or filtering? or vlan's?
> > Are the ethernet devices on the bridge doing hardware checksumming?
> > What version of kernel and configuration are running on the bridge?
> 
> The gateway is running 2.6.16.2, and all its ethernet devices are e100s. They might well do
> hardware checksumming, but the configuration used to work fine. There is no brouting / ebfilters,
> and I don't think that there's any vlan either (on the basis that I can't have it if I don't know
> what it is ;-).)
> 

If you have the patience, then "git bisect" will pin down the regression.


* [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
       [not found] <20060414235008.66335.qmail@web52913.mail.yahoo.com>
@ 2006-04-15  5:17 ` Stephen Hemminger
  2006-04-17 18:00   ` Patrick McHardy
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2006-04-15  5:17 UTC (permalink / raw)
  To: Chris Rankin; +Cc: nfs, bridge

Then get a packet trace of a failing session with tcpdump. You may need
to get two, one on the client and one on the server, to be able to see
which packet isn't getting past the bridge.

There are tools to sanitize tcpdump files if you are paranoid about IP
addresses, etc.


* Re: [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
  2006-04-15  5:17 ` Stephen Hemminger
@ 2006-04-17 18:00   ` Patrick McHardy
  0 siblings, 0 replies; 7+ messages in thread
From: Patrick McHardy @ 2006-04-17 18:00 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: bridge, nfs, Chris Rankin

Stephen Hemminger wrote:
> Then get a packet trace of a failing session with tcpdump. You may need
> to get two, one on the client and one on the server, to be able to see
> which packet isn't getting past the bridge.

I only saw half of this thread (Chris' mails haven't made it to the list
yet), but in case you're using bridge-netfilter and conntrack, it's most
likely because of the conntrack fragmentation changes in 2.6.16. Conntrack
defragments packets, but relies on the IP layer to do the
refragmentation now. With purely bridged traffic, the packets don't go
through the IP layer, so they exceed the MTU of the outgoing bridge
port. 2.6.16.6 will include a fix for this problem:

[patch 06/22] NETFILTER: Fix fragmentation issues with bridge netfilter
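
To make the size mismatch concrete, here is a minimal sketch of the arithmetic.
It assumes an NFS reply carried in a single UDP datagram with 8192 bytes of
payload (the thread does not state the actual rsize) and an IPv4 header without
options. All function names are hypothetical and chosen for illustration only.

```c
#include <assert.h>

#define IPV4_HEADER_LEN 20u	/* IPv4 header without options */
#define UDP_HEADER_LEN   8u

/* Length of the IP datagram that carries `udp_payload` bytes of data. */
unsigned int ip_datagram_len(unsigned int udp_payload)
{
	return IPV4_HEADER_LEN + UDP_HEADER_LEN + udp_payload;
}

/*
 * Number of IP fragments needed to send a datagram of `datagram_len` bytes
 * through a device with the given MTU. Every fragment except the last must
 * carry a multiple of 8 payload bytes.
 */
unsigned int ip_fragment_count(unsigned int datagram_len, unsigned int mtu)
{
	assert(mtu >= IPV4_HEADER_LEN + 8 && datagram_len >= IPV4_HEADER_LEN);

	if (datagram_len <= mtu)
		return 1;

	unsigned int per_frag = (mtu - IPV4_HEADER_LEN) & ~7u;
	unsigned int payload = datagram_len - IPV4_HEADER_LEN;

	return (payload + per_frag - 1) / per_frag;
}

/* Nonzero if a packet this large cannot leave the port in one piece. */
int exceeds_port_mtu(unsigned int datagram_len, unsigned int mtu)
{
	return datagram_len > mtu;
}
```

With these assumptions the server sends the 8220-byte datagram as 6 fragments
that each fit a 1500-byte MTU. Once conntrack on the bridge has reassembled
them, the single 8220-byte packet exceeds the MTU of the outgoing port, which
matches the symptom of replies never reaching the client. A small request fits
in one packet and passes through unaffected.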


* Re: [Bridge] Re: No UDP NFS over bridges in Linux 2.6.16.x?
       [not found] <20060417181727.43038.qmail@web52902.mail.yahoo.com>
@ 2006-04-17 18:24 ` Patrick McHardy
  0 siblings, 0 replies; 7+ messages in thread
From: Patrick McHardy @ 2006-04-17 18:24 UTC (permalink / raw)
  To: Chris Rankin; +Cc: bridge, nfs, Stephen Hemminger

[-- Attachment #1: Type: text/plain, Size: 1335 bytes --]

Chris Rankin wrote:
> --- Patrick McHardy <kaber@trash.net> wrote:
> 
>>I only saw half of this thread (Chris' mails haven't made it to the list
>>yet), but in case you're using bridge-netfilter and conntrack, it's most
>>likely because of the conntrack fragmentation changes in 2.6.16. Conntrack
>>defragments packets, but relies on the IP layer to do the
>>refragmentation now. With purely bridged traffic, the packets don't go
>>through the IP layer, so they exceed the MTU of the outgoing bridge
>>port. 2.6.16.6 will include a fix for this problem:
>>
>>[patch 06/22] NETFILTER: Fix fragmentation issues with bridge netfilter
> 
> 
> I emailed the packet dumps to Stephen privately, but what was happening was that the server was
> receiving the request and was fragmenting the reply.  However, the client was never receiving the
> reply packets for some reason.

I guess the request is small enough so it doesn't have to be fragmented.

> Yes, I am using connection tracking and netfilter, and the br0 interface is referenced in my
> iptables rules. I am not using / have not loaded the ebtables modules, although I did compile
> them.

It's enough to have CONFIG_BRIDGE_NETFILTER enabled for this error
to occur; it passes bridged packets to IP netfilter by default.

Attached is the patch queued for -stable; please try it and see if it helps.

[-- Attachment #2: netfilter-fix-fragmentation-issues-with-bridge-netfilter.patch --]
[-- Type: text/plain, Size: 3336 bytes --]

-stable review patch.  If anyone has any objections, please let us know.

------------------
[NETFILTER]: Fix fragmentation issues with bridge netfilter

The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/net/ip.h          |    1 +
 net/bridge/br_netfilter.c |   13 +++++++++++--
 net/ipv4/ip_output.c      |    6 +++---
 3 files changed, 15 insertions(+), 5 deletions(-)

--- linux-2.6.16.5.orig/include/net/ip.h
+++ linux-2.6.16.5/include/net/ip.h
@@ -95,6 +95,7 @@ extern int		ip_local_deliver(struct sk_b
 extern int		ip_mr_input(struct sk_buff *skb);
 extern int		ip_output(struct sk_buff *skb);
 extern int		ip_mc_output(struct sk_buff *skb);
+extern int		ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));
 extern int		ip_do_nat(struct sk_buff *skb);
 extern void		ip_send_check(struct iphdr *ip);
 extern int		ip_queue_xmit(struct sk_buff *skb, int ipfragok);
--- linux-2.6.16.5.orig/net/bridge/br_netfilter.c
+++ linux-2.6.16.5/net/bridge/br_netfilter.c
@@ -739,6 +739,15 @@ out:
 	return NF_STOLEN;
 }
 
+static int br_nf_dev_queue_xmit(struct sk_buff *skb)
+{
+	if (skb->protocol == htons(ETH_P_IP) &&
+	    skb->len > skb->dev->mtu &&
+	    !(skb_shinfo(skb)->ufo_size || skb_shinfo(skb)->tso_size))
+		return ip_fragment(skb, br_dev_queue_push_xmit);
+	else
+		return br_dev_queue_push_xmit(skb);
+}
 
 /* PF_BRIDGE/POST_ROUTING ********************************************/
 static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff **pskb,
@@ -798,7 +807,7 @@ static unsigned int br_nf_post_routing(u
 		realoutdev = nf_bridge->netoutdev;
 #endif
 	NF_HOOK(pf, NF_IP_POST_ROUTING, skb, NULL, realoutdev,
-	        br_dev_queue_push_xmit);
+	        br_nf_dev_queue_xmit);
 
 	return NF_STOLEN;
 
@@ -843,7 +852,7 @@ static unsigned int ip_sabotage_out(unsi
 	if ((out->hard_start_xmit == br_dev_xmit &&
 	    okfn != br_nf_forward_finish &&
 	    okfn != br_nf_local_out_finish &&
-	    okfn != br_dev_queue_push_xmit)
+	    okfn != br_nf_dev_queue_xmit)
 #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
 	    || ((out->priv_flags & IFF_802_1Q_VLAN) &&
 	    VLAN_DEV_INFO(out)->real_dev->hard_start_xmit == br_dev_xmit)
--- linux-2.6.16.5.orig/net/ipv4/ip_output.c
+++ linux-2.6.16.5/net/ipv4/ip_output.c
@@ -86,8 +86,6 @@
 
 int sysctl_ip_default_ttl = IPDEFTTL;
 
-static int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*));
-
 /* Generate a checksum for an outgoing IP datagram. */
 __inline__ void ip_send_check(struct iphdr *iph)
 {
@@ -421,7 +419,7 @@ static void ip_copy_metadata(struct sk_b
  *	single device frame, and queue such a frame for sending.
  */
 
-static int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*))
+int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*))
 {
 	struct iphdr *iph;
 	int raw = 0;
@@ -673,6 +671,8 @@ fail:
 	return err;
 }
 
+EXPORT_SYMBOL(ip_fragment);
+
 int
 ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
 {

--

