* Problems with fragments since gso skb forwarding changes in virtual environment
@ 2014-04-07 16:04 Tobias Brunner
2014-04-07 23:46 ` Florian Westphal
0 siblings, 1 reply; 7+ messages in thread
From: Tobias Brunner @ 2014-04-07 16:04 UTC (permalink / raw)
To: Florian Westphal
Cc: netdev, David S. Miller, Herbert Xu, Marcelo Ricardo Leitner
Hi Florian et al,
We noticed a problem with fragmented packets in the KVM/libvirt-based
strongSwan integration test environment [1] with guest kernels that
include the following commit:
net: ip, ipv6: handle gso skbs in forwarding path
fe6cc55f3a9a053482a76f5a6b2257cee51b4663
The network topology in test scenarios that trigger the problem is as
follows:
Host A - br1 - Router R - br2 - Host B
Where the two hosts and the router are virtual guests connected via
bridges all created via libvirt (see [2] for the XML config files). The
guest's network interfaces all use virtio.
If the router runs with a kernel that includes the commit above, packets
sent from A to B that exceed the MTU (1500 in this case) will be split
into fragments when leaving R. These fragment skbs get defragmented by
the host kernel's nf_defrag_ipv4 module while being forwarded on br2.
This poses a problem. Because the fragment skbs are not GSO, neither is
the defragmented skb. Hence the packets are dropped just before leaving
the bridge, as is_skb_forwardable() will return false unless a too large
skb is actually GSO (this is the same for older host kernels, where the
check is directly done in br_forward.c).
Without the commit, and between A and R even with it (because it only
affects forwarding), the skbs are GSO throughout and transmitted from A
to B without ever actually being fragmented.
Any ideas how to fix this properly? That is, without just reverting
parts of your commit for our guest kernels.
Regards,
Tobias
[1] http://wiki.strongswan.org/projects/strongswan/wiki/TestingEnvironment
[2] http://git.strongswan.org/?p=strongswan.git;a=tree;f=testing/config/kvm
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problems with fragments since gso skb forwarding changes in virtual environment
2014-04-07 16:04 Problems with fragments since gso skb forwarding changes in virtual environment Tobias Brunner
@ 2014-04-07 23:46 ` Florian Westphal
2014-04-08 0:05 ` David Miller
2014-04-08 12:24 ` Tobias Brunner
0 siblings, 2 replies; 7+ messages in thread
From: Florian Westphal @ 2014-04-07 23:46 UTC (permalink / raw)
To: Tobias Brunner
Cc: Florian Westphal, netdev, David S. Miller, Herbert Xu,
Marcelo Ricardo Leitner
Tobias Brunner <tobias@strongswan.org> wrote:
> We noticed a problem with fragmented packets in the KVM/libvirt-based
> strongSwan integration test environment [1] with guest kernels that
> include the following commit:
>
> net: ip, ipv6: handle gso skbs in forwarding path
> fe6cc55f3a9a053482a76f5a6b2257cee51b4663
>
> The network topology in test scenarios that trigger the problem is as
> follows:
>
> Host A - br1 - Router R - br2 - Host B
>
> Where the two hosts and the router are virtual guests connected via
> bridges all created via libvirt (see [2] for the XML config files). The
> guest's network interfaces all use virtio.
>
> If the router runs with a kernel that includes the commit above, packets
> sent from A to B that exceed the MTU (1500 in this case) will be split
> into fragments when leaving R. These fragment skbs get defragmented by
> the host kernel's nf_defrag_ipv4 module while being forwarded on br2.
Thanks for the detailed bug report, much appreciated.
Do I interpret this correctly:
Host A - br1 - Router R - br2 - Host B
Mtu >1500 Mtu 1500
1. host A sends GSO packet, DF not set
2. packet arrives at R, still GSO packet
3. forward on R fragments packet since it won't fit
outgoing interface (which is normal virtio ethernet) mtu
4. fragmented packets leave R
5. fragmented packets arrive on host system (not pictured above) br2
interface
6. packets are being bridged on host system, call_iptables sysctl on
7. packets are defragmented by netfilter on host due to call_iptables
sysctl on
8. packets are tossed on host in br_dev_queue_push_xmit because
is_skb_forwardable() returns false
Is that correct?
> Without the commit, and between A and R even with it (because it only
> affects forwarding), the skbs are GSO throughout and transmitted from A
> to B without ever actually being fragmented.
I see why this change makes it trip over GSO skbs, but I fail to
see why it would work with larger-than-1500-mtu-and-fragmentation-allowed
packets being sent from A to B. (or with fragments generated locally
on R).
To the host system it should make no difference at all if the fragments
came into existence in R's forwarding path, or being sent by A, or if
the fragments were generated locally on R (i.e. ping -s $bignum $hosta
on R with DF off).
> Any ideas how to fix this properly? That is, without just reverting
> parts of your commit for our guest kernels.
Looking at br_nf_dev_queue_xmit() in br_netfilter.c I see that it has
a bug (not related 'gso skbs in forwarding path' change): it assumes
that if skb->nfct is NULL no reassembly has taken place. Thats not
true (can load ipv4 defrag module without ipv4 conntrack one), or
netfilter defragmented the packet but then protocol tracker returned
error ('INVALID' conntrack state in netfilter speak).
I admit its rare condition, but afaics br_nf_dev_queue_xmit is
supposed to re-fragment packets that have been subject to defrag.
I'll look into this again tomorrow.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problems with fragments since gso skb forwarding changes in virtual environment
2014-04-07 23:46 ` Florian Westphal
@ 2014-04-08 0:05 ` David Miller
2014-04-08 0:26 ` Florian Westphal
2014-04-08 12:24 ` Tobias Brunner
1 sibling, 1 reply; 7+ messages in thread
From: David Miller @ 2014-04-08 0:05 UTC (permalink / raw)
To: fw; +Cc: tobias, netdev, herbert, mleitner
From: Florian Westphal <fw@strlen.de>
Date: Tue, 8 Apr 2014 01:46:40 +0200
> Looking at br_nf_dev_queue_xmit() in br_netfilter.c I see that it has
> a bug (not related 'gso skbs in forwarding path' change): it assumes
> that if skb->nfct is NULL no reassembly has taken place. Thats not
> true (can load ipv4 defrag module without ipv4 conntrack one), or
> netfilter defragmented the packet but then protocol tracker returned
> error ('INVALID' conntrack state in netfilter speak).
>
> I admit its rare condition, but afaics br_nf_dev_queue_xmit is
> supposed to re-fragment packets that have been subject to defrag.
In fact, judging by commits:
commit e179e6322ac334e21a3c6d669d95bc967e5d0a80
Author: Bart De Schuymer <bdschuym@pandora.be>
Date: Thu Apr 15 12:26:39 2010 +0200
netfilter: bridge-netfilter: Fix MAC header handling with IP DNAT
and subsequently:
commit c197facc8ea08062f8f949aade6a33649ee06771
Author: hummerbliss@gmail.com <hummerbliss@gmail.com>
Date: Mon Apr 20 17:12:35 2009 +0200
netfilter: bridge: allow fragmentation of VLAN packets traversing a bridge
I would say that we should simply remove the skb->nfct check
altogether and everything will work fine.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problems with fragments since gso skb forwarding changes in virtual environment
2014-04-08 0:05 ` David Miller
@ 2014-04-08 0:26 ` Florian Westphal
0 siblings, 0 replies; 7+ messages in thread
From: Florian Westphal @ 2014-04-08 0:26 UTC (permalink / raw)
To: David Miller; +Cc: fw, tobias, netdev, herbert, mleitner
David Miller <davem@davemloft.net> wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Tue, 8 Apr 2014 01:46:40 +0200
>
> > Looking at br_nf_dev_queue_xmit() in br_netfilter.c I see that it has
> > a bug (not related 'gso skbs in forwarding path' change): it assumes
> > that if skb->nfct is NULL no reassembly has taken place. Thats not
> > true (can load ipv4 defrag module without ipv4 conntrack one), or
> > netfilter defragmented the packet but then protocol tracker returned
> > error ('INVALID' conntrack state in netfilter speak).
> >
> > I admit its rare condition, but afaics br_nf_dev_queue_xmit is
> > supposed to re-fragment packets that have been subject to defrag.
>
> In fact, judging by commits:
>
> commit e179e6322ac334e21a3c6d669d95bc967e5d0a80
> Author: Bart De Schuymer <bdschuym@pandora.be>
> Date: Thu Apr 15 12:26:39 2010 +0200
>
> netfilter: bridge-netfilter: Fix MAC header handling with IP DNAT
>
> and subsequently:
>
> commit c197facc8ea08062f8f949aade6a33649ee06771
> Author: hummerbliss@gmail.com <hummerbliss@gmail.com>
> Date: Mon Apr 20 17:12:35 2009 +0200
>
> netfilter: bridge: allow fragmentation of VLAN packets traversing a bridge
>
> I would say that we should simply remove the skb->nfct check
> altogether and everything will work fine.
I was thinking about tracking defrag-on-top-of-bridge in
skb->nf_bridge. But after looking at the changes you're mentioning I
think you're right, I don't see how we can end up in
br_nf_dev_queue_xmit with packet-exceeding-MTU and said skb NOT being
a defragmented packet.
I am afraid IPv6 defrag also needs to be considered here 8-/
I'll look into it tomorrow.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problems with fragments since gso skb forwarding changes in virtual environment
2014-04-07 23:46 ` Florian Westphal
2014-04-08 0:05 ` David Miller
@ 2014-04-08 12:24 ` Tobias Brunner
2014-04-08 14:33 ` Florian Westphal
1 sibling, 1 reply; 7+ messages in thread
From: Tobias Brunner @ 2014-04-08 12:24 UTC (permalink / raw)
To: Florian Westphal
Cc: netdev, David S. Miller, Herbert Xu, Marcelo Ricardo Leitner
Hi Florian,
> Do I interpret this correctly:
>
> Host A - br1 - Router R - br2 - Host B
> Mtu >1500 Mtu 1500
>
> 1. host A sends GSO packet, DF not set
> 2. packet arrives at R, still GSO packet
> 3. forward on R fragments packet since it won't fit
> outgoing interface (which is normal virtio ethernet) mtu
> 4. fragmented packets leave R
> 5. fragmented packets arrive on host system (not pictured above) br2
> interface
>
> 6. packets are being bridged on host system, call_iptables sysctl on
> 7. packets are defragmented by netfilter on host due to call_iptables
> sysctl on
> 8. packets are tossed on host in br_dev_queue_push_xmit because
> is_skb_forwardable() returns false
>
> Is that correct?
Exactly. The MTU is 1500 on all interfaces though.
>> Without the commit, and between A and R even with it (because it only
>> affects forwarding), the skbs are GSO throughout and transmitted from A
>> to B without ever actually being fragmented.
>
> I see why this change makes it trip over GSO skbs, but I fail to
> see why it would work with larger-than-1500-mtu-and-fragmentation-allowed
> packets being sent from A to B. (or with fragments generated locally
> on R).
>
> To the host system it should make no difference at all if the fragments
> came into existence in R's forwarding path, or being sent by A, or if
> the fragments were generated locally on R (i.e. ping -s $bignum $hosta
> on R with DF off).
In our test scenarios the packets are UDP and GSO and without the commit
(or between A and R) they travel unchanged between guest and host
kernels without ever touching a physical interface that would actually
cause them to get fragmented (I wasn't aware of this, until I looked
into this issue).
For ICMP it's interesting to note that 'ping -s $bignum $hostb' from A
works even with the commit. The packet is already fragmented when it
leaves A and these fragments are forwarded properly by the host bridges.
They are defragmented by the nf_defrag_ipv4 module, but are fragmented
again in br_nf_dev_queue_xmit() because skb->nfct is non-null as pointed
out by you and David.
I tried removing the skb->nfct check, and while that fixes the
forwarding issue on the host, for some reason the UDP socket on B does
not receive the packet (the guest kernel does, even defragments it and
queues it to the socket, but the userland program never receives the
datagram).
Regards,
Tobias
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problems with fragments since gso skb forwarding changes in virtual environment
2014-04-08 12:24 ` Tobias Brunner
@ 2014-04-08 14:33 ` Florian Westphal
2014-04-08 15:36 ` Florian Westphal
0 siblings, 1 reply; 7+ messages in thread
From: Florian Westphal @ 2014-04-08 14:33 UTC (permalink / raw)
To: Tobias Brunner
Cc: Florian Westphal, netdev, David S. Miller, Herbert Xu,
Marcelo Ricardo Leitner
Tobias Brunner <tobias@strongswan.org> wrote:
> >
> > Host A - br1 - Router R - br2 - Host B
> > Mtu >1500 Mtu 1500
> >
> > 1. host A sends GSO packet, DF not set
> > 2. packet arrives at R, still GSO packet
> > 3. forward on R fragments packet since it won't fit
> > outgoing interface (which is normal virtio ethernet) mtu
> > 4. fragmented packets leave R
> > 5. fragmented packets arrive on host system (not pictured above) br2
> > interface
> >
> > 6. packets are being bridged on host system, call_iptables sysctl on
> > 7. packets are defragmented by netfilter on host due to call_iptables
> > sysctl on
> > 8. packets are tossed on host in br_dev_queue_push_xmit because
> > is_skb_forwardable() returns false
> >
> > Is that correct?
>
> Exactly. The MTU is 1500 on all interfaces though.
Thanks for clarifying. In this case there is another problem as well as
no fragments should be generated in the forwarding path if the outgoing mtu
is not reduced.
Most likely a problem with udp gso + skb_gso_network_seglen().
I'll report back, thanks for your feedback.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problems with fragments since gso skb forwarding changes in virtual environment
2014-04-08 14:33 ` Florian Westphal
@ 2014-04-08 15:36 ` Florian Westphal
0 siblings, 0 replies; 7+ messages in thread
From: Florian Westphal @ 2014-04-08 15:36 UTC (permalink / raw)
To: Florian Westphal
Cc: Tobias Brunner, netdev, David S. Miller, Herbert Xu,
Marcelo Ricardo Leitner
Florian Westphal <fw@strlen.de> wrote:
[ cc'd Eric ]
> Thanks for clarifying. In this case there is another problem as well as
> no fragments should be generated in the forwarding path if the outgoing mtu
> is not reduced.
>
> Most likely a problem with udp gso + skb_gso_network_seglen().
Indeed, the problem is here:
unsigned int skb_gso_transport_seglen(const struct sk_buff *skb)
{
const struct skb_shared_info *shinfo = skb_shinfo(skb);
unsigned int hdr_len;
if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
hdr_len = tcp_hdrlen(skb);
else
hdr_len = sizeof(struct udphdr);
return hdr_len + shinfo->gso_size;
}
For large GSO UDP packet received via virtio shinfo->gso_size is 1480 (mtu 1500).
Thus skb_gso_transport_seglen() claims transport seglen of 1488 which makes fwd
path think we're trying to send 1508 byte segments.
I am not familiar with UFO; setting hdr_len = 0 in else clause makes things
work for me. I'll see if the sizeof(struct udphdr) is bogus for ufo or if ->gso_size
is wrong.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-04-08 15:36 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-07 16:04 Problems with fragments since gso skb forwarding changes in virtual environment Tobias Brunner
2014-04-07 23:46 ` Florian Westphal
2014-04-08 0:05 ` David Miller
2014-04-08 0:26 ` Florian Westphal
2014-04-08 12:24 ` Tobias Brunner
2014-04-08 14:33 ` Florian Westphal
2014-04-08 15:36 ` Florian Westphal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).