public inbox for linux-kernel@vger.kernel.org
* [BUG] Dropping fragmented IP packets within VLAN frames on bridge
@ 2007-05-25  8:17 Adam Osuchowski
  2007-05-25 15:59 ` [Bridge] " Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Osuchowski @ 2007-05-25  8:17 UTC (permalink / raw)
  To: bridge; +Cc: linux-kernel

There is a problem with fragmented IP packets sent within 802.1Q tagged
Ethernet frames through a bridge. The problem occurs when conntrack is
enabled (i.e. the nf_conntrack_ipv4 module is loaded). Such packets are
reassembled on the bridge device but are not fragmented again when they
are passed to the bridge-enslaved NIC. They then exceed the MTU and, as a
result, are dropped.

The problem exists in kernel versions 2.6.17 through 2.6.21.3 inclusive.

A patch to fix it is below.

Regards.


--- linux-2.6.21.3.orig/net/bridge/br_netfilter.c	2007-05-25 09:56:15.000000000 +0200
+++ linux-2.6.21.3/net/bridge/br_netfilter.c	2007-05-25 10:11:42.000000000 +0200
@@ -731,7 +731,7 @@
 
 static int br_nf_dev_queue_xmit(struct sk_buff *skb)
 {
-	if (skb->protocol == htons(ETH_P_IP) &&
+	if ((skb->protocol == htons(ETH_P_IP) || IS_VLAN_IP(skb)) &&
 	    skb->len > skb->dev->mtu &&
 	    !skb_is_gso(skb))
 		return ip_fragment(skb, br_dev_queue_push_xmit);


* Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
  2007-05-25  8:17 [BUG] Dropping fragmented IP packets within VLAN frames on bridge Adam Osuchowski
@ 2007-05-25 15:59 ` Stephen Hemminger
  2007-05-25 17:49   ` Adam Osuchowski
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2007-05-25 15:59 UTC (permalink / raw)
  To: Adam Osuchowski; +Cc: bridge, linux-kernel

On Fri, 25 May 2007 10:17:50 +0200
Adam Osuchowski <adwol@zonk.pl> wrote:

> There is a problem with fragmented IP packets sent within 802.1Q tagged
> Ethernet frames through a bridge. The problem occurs when conntrack is
> enabled (i.e. the nf_conntrack_ipv4 module is loaded). Such packets are
> reassembled on the bridge device but are not fragmented again when they
> are passed to the bridge-enslaved NIC. They then exceed the MTU and, as a
> result, are dropped.
> 
> The problem exists in kernel versions 2.6.17 through 2.6.21.3 inclusive.
> 
> A patch to fix it is below.
> 
> Regards.
> 
> 
> --- linux-2.6.21.3.orig/net/bridge/br_netfilter.c	2007-05-25 09:56:15.000000000 +0200
> +++ linux-2.6.21.3/net/bridge/br_netfilter.c	2007-05-25 10:11:42.000000000 +0200
> @@ -731,7 +731,7 @@
>  
>  static int br_nf_dev_queue_xmit(struct sk_buff *skb)
>  {
> -	if (skb->protocol == htons(ETH_P_IP) &&
> +	if ((skb->protocol == htons(ETH_P_IP) || IS_VLAN_IP(skb)) &&
>  	    skb->len > skb->dev->mtu &&
>  	    !skb_is_gso(skb))
>  		return ip_fragment(skb, br_dev_queue_push_xmit);

It would be better to account for the tag in the length check.
Something like
	if (skb->protocol == htons(ETH_P_IP) &&
	    skb->len > skb->dev->mtu - (IS_VLAN_IP(skb) ? VLAN_HLEN : 0) &&
	    !skb_is_gso(skb))
		return ip_fragment ...


-- 
Stephen Hemminger <shemminger@linux-foundation.org>


* Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
  2007-05-25 15:59 ` [Bridge] " Stephen Hemminger
@ 2007-05-25 17:49   ` Adam Osuchowski
  2007-05-26  8:13     ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Osuchowski @ 2007-05-25 17:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: bridge, linux-kernel

Stephen Hemminger wrote:
> It would be better to account for the tag in the length check.
> Something like
> 	if (skb->protocol == htons(ETH_P_IP) &&
> 	    skb->len > skb->dev->mtu - (IS_VLAN_IP(skb) ? VLAN_HLEN : 0) &&
> 	    !skb_is_gso(skb))
> 		return ip_fragment ...

That isn't a good solution because one of the necessary conditions of
IS_VLAN_IP() is

    skb->protocol == htons(ETH_P_8021Q)

which is, of course, mutually exclusive with

    skb->protocol == htons(ETH_P_IP)

from br_nf_dev_queue_xmit(). IMHO, one should check the length of ETH_P_IP
and ETH_P_8021Q frames separately:

    if (((skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu) ||
        (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)) &&
	!skb_is_gso(skb))
	    return ip_fragment ...
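
The difference between the two conditions can be checked in userspace. The
following is a minimal sketch, not kernel code: struct pkt is a hypothetical
stand-in for the few sk_buff fields the check reads, is_vlan_ip() is a
simplified model of IS_VLAN_IP() that only looks at the two protocol values,
and protocols are kept in host byte order so no htons() is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* EtherType values and the size of an 802.1Q tag. */
#define ETH_P_IP    0x0800
#define ETH_P_8021Q 0x8100
#define VLAN_HLEN   4

/* Hypothetical stand-in for the sk_buff fields used by the check. */
struct pkt {
	uint16_t protocol;	/* outer EtherType, host byte order */
	uint16_t inner;		/* protocol carried inside a VLAN frame */
	unsigned int len;	/* packet length */
	unsigned int mtu;	/* MTU of the outgoing device */
	bool gso;		/* models skb_is_gso() */
};

/* Simplified model of IS_VLAN_IP(): 802.1Q frame that carries IPv4. */
static bool is_vlan_ip(const struct pkt *p)
{
	return p->protocol == ETH_P_8021Q && p->inner == ETH_P_IP;
}

/* Tag folded into a single test that still requires ETH_P_IP. */
static bool frag_single_check(const struct pkt *p)
{
	return p->protocol == ETH_P_IP &&
	       p->len > p->mtu - (is_vlan_ip(p) ? VLAN_HLEN : 0) &&
	       !p->gso;
}

/* Plain IP and VLAN-encapsulated IP tested separately. */
static bool frag_separate_checks(const struct pkt *p)
{
	return ((p->protocol == ETH_P_IP && p->len > p->mtu) ||
		(is_vlan_ip(p) && p->len > p->mtu - VLAN_HLEN)) &&
	       !p->gso;
}
```

For a 3000-byte VLAN frame on a device with MTU 1500, frag_single_check()
returns false because the protocol is ETH_P_8021Q, so the VLAN branch of the
ternary can never be taken. frag_separate_checks() returns true for the same
frame, and both return true for a plain 3000-byte IP packet.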


* Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
  2007-05-25 17:49   ` Adam Osuchowski
@ 2007-05-26  8:13     ` Patrick McHardy
  2007-05-26 14:20       ` Ingo Oeser
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick McHardy @ 2007-05-26  8:13 UTC (permalink / raw)
  To: Adam Osuchowski; +Cc: Stephen Hemminger, bridge, linux-kernel

Adam Osuchowski wrote:
> Stephen Hemminger wrote:
> 
>>It would be better to account for the tag in the length check.
>>Something like
>>	if (skb->protocol == htons(ETH_P_IP) &&
>>	    skb->len > skb->dev->mtu - (IS_VLAN_IP(skb) ? VLAN_HLEN : 0) &&
>>	    !skb_is_gso(skb))
>>		return ip_fragment ...
> 
> 
> That isn't a good solution because one of the necessary conditions of
> IS_VLAN_IP() is
> 
>     skb->protocol == htons(ETH_P_8021Q)
> 
> which is, of course, mutually exclusive with
> 
>     skb->protocol == htons(ETH_P_IP)
> 
> from br_nf_dev_queue_xmit(). IMHO, one should check the length of ETH_P_IP
> and ETH_P_8021Q frames separately:
> 
>     if (((skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu) ||
>         (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)) &&
> 	!skb_is_gso(skb))
> 	    return ip_fragment ...


net/8021q ignores the VLAN header overhead, so we should probably do the
same here for consistency. Using IS_VLAN_IP (and IS_PPPOE_IP for current
-rc) looks fine, additionally we should probably also check for
skb->nfct != NULL to make sure that at least without connection tracking
the bridge doesn't perform fragmentation.
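
A check along these lines could look roughly like the sketch below. It is a
userspace model under stated assumptions, not the kernel implementation:
struct frame and its field names are hypothetical, frame_is_vlan_ip() is a
simplified model of IS_VLAN_IP(), the PPPoE case is left out, and protocols
are in host byte order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP    0x0800
#define ETH_P_8021Q 0x8100

/* Hypothetical stand-in for the sk_buff fields used by the check. */
struct frame {
	uint16_t protocol;	/* outer EtherType, host byte order */
	uint16_t inner;		/* protocol carried inside a VLAN frame */
	unsigned int len;	/* packet length */
	unsigned int mtu;	/* MTU of the outgoing device */
	bool gso;		/* models skb_is_gso() */
	const void *nfct;	/* models skb->nfct, NULL if untracked */
};

/* Simplified model of IS_VLAN_IP(). */
static bool frame_is_vlan_ip(const struct frame *f)
{
	return f->protocol == ETH_P_8021Q && f->inner == ETH_P_IP;
}

/*
 * Refragment only packets that connection tracking has seen, and compare
 * against the plain MTU without subtracting the VLAN header.
 */
static bool needs_refragment(const struct frame *f)
{
	bool carries_ip = f->protocol == ETH_P_IP || frame_is_vlan_ip(f);

	return carries_ip &&
	       f->nfct != NULL &&
	       f->len > f->mtu &&
	       !f->gso;
}
```

With this model, an oversized VLAN frame that has a conntrack reference is
refragmented, while the same frame with nfct == NULL is left alone, and a
1498-byte VLAN frame on a 1500-byte MTU is not fragmented because the tag
overhead is ignored.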



* Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
  2007-05-26  8:13     ` Patrick McHardy
@ 2007-05-26 14:20       ` Ingo Oeser
  2007-05-26 15:05         ` Patrick McHardy
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Oeser @ 2007-05-26 14:20 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Adam Osuchowski, Stephen Hemminger, bridge, linux-kernel

On Saturday 26 May 2007, Patrick McHardy wrote:
> Adam Osuchowski wrote:
> >     if (((skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu) ||
> >         (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)) &&
> > 	!skb_is_gso(skb))
> > 	    return ip_fragment ...
> 
> 
> net/8021q ignores the VLAN header overhead, so we should probably do the
> same here for consistency. Using IS_VLAN_IP (and IS_PPPOE_IP for current
> -rc) looks fine, additionally we should probably also check for
> skb->nfct != NULL to make sure that at least without connection tracking
> the bridge doesn't perform fragmentation.

And could we separate the conditions into a static helper function that
explains each of them? Something like this:

static bool br_nf_need_fragment(struct sk_buff *skb)
{
	/* Plain IP packet does not fit in MTU */
	if (skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu)
		return true;

	/* VLAN encapsulated IP packet does not fit in MTU */
	if (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)
		return true;

	/* PPPoE encapsulated IP packet does not fit in MTU */
	if (IS_PPPOE_IP(skb) && skb->len > skb->dev->mtu - PPPOE_SES_HLEN)
		return true;

	return false;
}

and then br_nf_dev_queue_xmit() becomes:

static int br_nf_dev_queue_xmit(struct sk_buff *skb)
{
        if (br_nf_need_fragment(skb) &&  !skb_is_gso(skb))
                return ip_fragment(skb, br_dev_queue_push_xmit);
        else
                return br_dev_queue_push_xmit(skb);
}

which is much more readable, better documented, and doesn't contain a
condition monster :-)

@Patrick: Could you check whether the PPPoE case is correct?

What do you think? Should I submit a patch for that?


Best Regards

Ingo Oeser


* Re: [Bridge] [BUG] Dropping fragmented IP packets within VLAN frames on bridge
  2007-05-26 14:20       ` Ingo Oeser
@ 2007-05-26 15:05         ` Patrick McHardy
  0 siblings, 0 replies; 6+ messages in thread
From: Patrick McHardy @ 2007-05-26 15:05 UTC (permalink / raw)
  To: Ingo Oeser
  Cc: Adam Osuchowski, Stephen Hemminger, bridge, linux-kernel,
	Bart De Schuymer

Ingo Oeser wrote:
> On Saturday 26 May 2007, Patrick McHardy wrote:
> 
>>net/8021q ignores the VLAN header overhead, so we should probably do the
>>same here for consistency. Using IS_VLAN_IP (and IS_PPPOE_IP for current
>>-rc) looks fine, additionally we should probably also check for
>>skb->nfct != NULL to make sure that at least without connection tracking
>>the bridge doesn't perform fragmentation.
> 
> 
> And could we separate the conditions into a static helper function that
> explains each of them? Something like this:


The MTU checks are self-explanatory. Just a comment above the function
stating that it tries to find out whether a packet needs to be
refragmented because it was defragmented by IPv4 connection tracking
and exceeds the MTU should be enough.

> static bool br_nf_need_fragment(struct sk_buff *skb)
> {
> 	/* Plain IP packet does not fit in MTU */
> 	if (skb->protocol == htons(ETH_P_IP) && skb->len > skb->dev->mtu)
> 		return true;
> 
> 	/* VLAN encapsulated IP packet does not fit in MTU */
> 	if (IS_VLAN_IP(skb) && skb->len > skb->dev->mtu - VLAN_HLEN)
> 		return true;
> 
> 	/* PPPoE encapsulated IP packet does not fit in MTU */
> 	if (IS_PPPOE_IP(skb) && skb->len > skb->dev->mtu - PPPOE_SES_HLEN)
> 		return true;
> 
> 	return false;
> }


As I said, I don't think we should account for the VLAN header overhead;
the VLAN code itself doesn't even do it. And we should exclude packets
that don't have a connection tracking reference attached, since we are
only undoing the damage connection tracking did by defragmenting them
and should avoid fragmenting other packets as far as possible.

> and then br_nf_dev_queue_xmit() becomes:
> 
> static int br_nf_dev_queue_xmit(struct sk_buff *skb)
> {
>         if (br_nf_need_fragment(skb) &&  !skb_is_gso(skb))
>                 return ip_fragment(skb, br_dev_queue_push_xmit);
>         else
>                 return br_dev_queue_push_xmit(skb);
> }
> 
> which is much more readable, better documented, and doesn't contain a
> condition monster :-)
> 
> @Patrick: Could you check whether the PPPoE case is correct?


It looks OK. But there is another problem: ip_fragment doesn't care
about the PPPoE overhead and produces a packet that will be too large
after restoring the PPPoE header. A second __fake_rtable that accounts
for the PPPoE overhead could probably fix that.
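
The arithmetic behind the PPPoE problem can be sketched as follows. This is
only an illustration under stated assumptions: the helper names are
hypothetical, PPPOE_SES_HLEN is taken as 8 bytes (6-byte PPPoE header plus
2-byte PPP protocol field), and max_frag_len() models the usual rule that
every fragment except the last carries a payload that is a multiple of 8
bytes.

```c
#define PPPOE_SES_HLEN 8	/* PPPoE session header plus PPP protocol field */

/* Largest IP fragment (header included) that fits in the given MTU. */
static unsigned int max_frag_len(unsigned int mtu, unsigned int ip_hlen)
{
	/* Fragment payload must be a multiple of 8 bytes. */
	return ip_hlen + ((mtu - ip_hlen) & ~7u);
}

/* Length on the wire once the encapsulation header is put back. */
static unsigned int wire_len(unsigned int frag_len, unsigned int encap_hlen)
{
	return frag_len + encap_hlen;
}

/* MTU to fragment against so that the restored frame still fits. */
static unsigned int frag_mtu(unsigned int dev_mtu, unsigned int encap_hlen)
{
	return dev_mtu - encap_hlen;
}
```

With a device MTU of 1500 and a 20-byte IP header, fragmenting against the
full MTU yields 1500-byte fragments, which become 1508 bytes once the PPPoE
header is restored. Fragmenting against frag_mtu(1500, PPPOE_SES_HLEN), i.e.
1492, yields 1492-byte fragments that are exactly 1500 bytes after the header
is restored.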

> What do you think? Should I submit a patch for that?


Sure :)

