* Sending undersized ARP packets with VXLAN L3 interface
From: Martin Rusko @ 2014-08-27 17:06 UTC
To: netdev

I tried to use a VXLAN interface as an L3 interface. Something like this:

    ip link add name vxln7 \
        type vxlan id 7007 group 232.1.42.7 \
        local 10.7.12.250 dev vlan482 \
        dstport 0 ageing 300
    ip ad ad 192.168.3.200/24 brd + dev vxln7
    ip li set vxln7 up

Now this doesn't work very well for small packets like those carrying ARP,
because the resulting Ethernet frames encapsulated in VXLAN are not padded
to the 64-byte minimum required for Ethernet. Once the inner frame
traverses any switch, it gets dropped as an undersized (runt) packet.

I'm wondering, where is the proper place to fix this. Should
arp_create() function allocate skb big enough to produce ethernet
frame with at least minimum size? Or is it somewhere in NIC drivers
where small packets are padded with zeros?

Regards,
Martin
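For reference, the size arithmetic behind this report can be sketched in userspace C. An ARP packet for IPv4 over Ethernet is 28 bytes, so with the 14-byte Ethernet header the frame is 42 bytes, which is 18 bytes short of the 60-byte minimum (64 bytes once the 4-byte FCS is counted). The ETH_* values match the kernel constants of the same names; ARP_IPV4_LEN and eth_pad_needed() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN      14  /* dst MAC + src MAC + EtherType */
#define ETH_ZLEN      60  /* minimum frame length, not counting the FCS */
#define ETH_FCS_LEN    4  /* frame check sequence appended on the wire */
#define ARP_IPV4_LEN  28  /* hypothetical name: 8-byte ARP header + 2 * (6 + 4) address bytes */

/* Number of zero bytes needed to bring a frame of frame_len bytes
 * (header + payload, FCS excluded) up to the Ethernet minimum. */
static size_t eth_pad_needed(size_t frame_len)
{
    assert(frame_len >= ETH_HLEN); /* anything shorter is not an Ethernet frame */
    return frame_len < ETH_ZLEN ? ETH_ZLEN - frame_len : 0;
}
```

Calling eth_pad_needed(ETH_HLEN + ARP_IPV4_LEN) gives the padding an unpadded ARP frame is missing.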
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Cong Wang @ 2014-08-27 17:28 UTC
To: Martin Rusko; +Cc: netdev

On Wed, Aug 27, 2014 at 10:06 AM, Martin Rusko <martin.rusko@gmail.com> wrote:
>
> I'm wondering, where is the proper place to fix this. Should
> arp_create() function allocate skb big enough to produce ethernet
> frame with at least minimum size? Or is it somewhere in NIC drivers
> where small packets are padded with zeros?

Drivers do that, for example e1000:

	/* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
	 * packets may get corrupted during padding by HW.
	 * To WA this issue, pad all small packets manually.
	 */
	if (skb->len < ETH_ZLEN) {
		if (skb_pad(skb, ETH_ZLEN - skb->len))
			return NETDEV_TX_OK;
		skb->len = ETH_ZLEN;
		skb_set_tail_pointer(skb, ETH_ZLEN);
	}
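The driver snippet above boils down to "zero-fill the tail up to ETH_ZLEN, then bump the length". A minimal userspace sketch of that step follows, assuming a fixed-capacity buffer in place of an skb; struct frame_buf and frame_pad_to_min() are hypothetical names, not kernel API. In the kernel, skb_pad() frees the skb when it fails, which is why the driver can simply return NETDEV_TX_OK on error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ZLEN 60 /* minimum Ethernet frame length, FCS excluded */

/* Hypothetical stand-in for an skb: fixed storage plus the current length. */
struct frame_buf {
    unsigned char data[128];
    size_t len;
};

/* Zero-pad the frame to ETH_ZLEN if it is shorter.
 * Returns 0 on success, -1 if the stored length is invalid. */
static int frame_pad_to_min(struct frame_buf *f)
{
    assert(f != NULL);
    if (f->len > sizeof(f->data))
        return -1;
    if (f->len < ETH_ZLEN) {
        /* Zero the tail so no stale buffer contents leak onto the wire. */
        memset(f->data + f->len, 0, ETH_ZLEN - f->len);
        f->len = ETH_ZLEN;
    }
    return 0;
}
```

Frames that are already at least ETH_ZLEN bytes long pass through unchanged.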
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Vlad Yasevich @ 2014-08-27 17:52 UTC
To: Cong Wang, Martin Rusko; +Cc: netdev

On 08/27/2014 01:28 PM, Cong Wang wrote:
> On Wed, Aug 27, 2014 at 10:06 AM, Martin Rusko <martin.rusko@gmail.com> wrote:
>>
>> I'm wondering, where is the proper place to fix this. Should
>> arp_create() function allocate skb big enough to produce ethernet
>> frame with at least minimum size? Or is it somewhere in NIC drivers
>> where small packets are padded with zeros?
>
> Drivers do that, for example e1000:
>
> 	/* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
> 	 * packets may get corrupted during padding by HW.
> 	 * To WA this issue, pad all small packets manually.
> 	 */
> 	if (skb->len < ETH_ZLEN) {
> 		if (skb_pad(skb, ETH_ZLEN - skb->len))
> 			return NETDEV_TX_OK;
> 		skb->len = ETH_ZLEN;
> 		skb_set_tail_pointer(skb, ETH_ZLEN);
> 	}

I think vxlan needs something like this:

From: Vladislav Yasevich <vyasevich@gmail.com>
Date: Wed, 27 Aug 2014 13:39:32 -0400
Subject: [PATCH] vxlan: Pad short ethernet frames.

If sending short ethernet frames from the vxlan device, pad
them to minimum size so they can be forwarded after decapsulation.

Reported-by: Martin Rusko <martin.rusko@gmail.com>
Signed-off-by: Vladislav Yasevich <vyasevich@gmail.com>
---
 drivers/net/vxlan.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 1fb7b37..48267d4 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1939,6 +1939,14 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
 #endif
 	}

+	/* Pad short frames so they can be forwarded after decapsulation */
+	if (skb->len < ETH_ZLEN) {
+		if (skb_pad(skb, ETH_ZLEN - skb->len))
+			return NETDEV_TX_OK;
+		skb->len = ETH_ZLEN;
+		skb_set_tail_pointer(skb, ETH_ZLEN);
+	}
+
 	f = vxlan_find_mac(vxlan, eth->h_dest);
 	did_rsc = false;

--
1.9.3
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Rick Jones @ 2014-08-27 18:16 UTC
To: Vlad Yasevich, Cong Wang, Martin Rusko; +Cc: netdev

On 08/27/2014 10:52 AM, Vlad Yasevich wrote:
> I think vxlan needs something like this:
>
> [...]
>
> +	/* Pad short frames so they can be forwarded after decapsulation */
> +	if (skb->len < ETH_ZLEN) {
> +		if (skb_pad(skb, ETH_ZLEN - skb->len))
> +			return NETDEV_TX_OK;
> +		skb->len = ETH_ZLEN;
> +		skb_set_tail_pointer(skb, ETH_ZLEN);
> +	}
> +
>  	f = vxlan_find_mac(vxlan, eth->h_dest);
>  	did_rsc = false;
>

It is perhaps putting a stripe on the bikeshed, but should that be an
"unlikely" on the length check? There seem to be examples both ways in
the handful of physical drivers I've checked.

rick jones
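The "unlikely" mentioned here is the kernel's branch-prediction hint. A minimal userspace sketch of how such an annotation wraps the length check, assuming GCC or Clang (the macro relies on the __builtin_expect builtin); frame_needs_pad() is a made-up helper name. The hint only influences code layout, so the result of the check is the same with or without it.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN 14 /* Ethernet header length */
#define ETH_ZLEN 60 /* minimum Ethernet frame length, FCS excluded */

/* Tell the compiler the condition is expected to be false most of the time. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Returns 1 if a frame of len bytes is shorter than the Ethernet minimum. */
static int frame_needs_pad(size_t len)
{
    assert(len >= ETH_HLEN); /* a frame always carries at least its header */
    if (unlikely(len < ETH_ZLEN))
        return 1; /* rare path: short frames such as ARP */
    return 0;     /* common path: normal-sized traffic */
}
```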
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Stephen Hemminger @ 2014-08-27 18:42 UTC
To: Vlad Yasevich; +Cc: Cong Wang, Martin Rusko, netdev

On Wed, 27 Aug 2014 13:52:03 -0400
Vlad Yasevich <vyasevich@gmail.com> wrote:

> I think vxlan needs something like this:
>
> From: Vladislav Yasevich <vyasevich@gmail.com>
> Date: Wed, 27 Aug 2014 13:39:32 -0400
> Subject: [PATCH] vxlan: Pad short ethernet frames.
>
> If sending short ethernet frames from the vxlan device, pad
> them to minimum size so they can be forwarded after decapsulation.
>
> [...]
>
> +	/* Pad short frames so they can be forwarded after decapsulation */
> +	if (skb->len < ETH_ZLEN) {
> +		if (skb_pad(skb, ETH_ZLEN - skb->len))
> +			return NETDEV_TX_OK;
> +		skb->len = ETH_ZLEN;
> +		skb_set_tail_pointer(skb, ETH_ZLEN);
> +	}
> +
>  	f = vxlan_find_mac(vxlan, eth->h_dest);
>  	did_rsc = false;
>

No. The short frame is perfectly valid, over the VXLAN.
The system doing the decap and forwarding should be where any padding is added if necessary.
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Vlad Yasevich @ 2014-08-27 18:45 UTC
To: Stephen Hemminger; +Cc: Cong Wang, Martin Rusko, netdev

On 08/27/2014 02:42 PM, Stephen Hemminger wrote:
> On Wed, 27 Aug 2014 13:52:03 -0400
> Vlad Yasevich <vyasevich@gmail.com> wrote:
>
>> I think vxlan needs something like this:
>>
>> [...]
>
> No. The short frame is perfectly valid, over the VXLAN.
> The system doing the decap and forwarding should be where any padding is added if necessary.
>

If that's the case, then Martin is most likely seeing a HW bug on the switch.
I wonder how common such a bug might be?

-vlad
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Martin Rusko @ 2014-08-27 20:01 UTC
To: Vlad Yasevich; +Cc: Stephen Hemminger, Cong Wang, netdev

On Wed, Aug 27, 2014 at 8:45 PM, Vlad Yasevich <vyasevich@gmail.com> wrote:
> On 08/27/2014 02:42 PM, Stephen Hemminger wrote:
>> [...]
>>
>> No. The short frame is perfectly valid, over the VXLAN.
>> The system doing the decap and forwarding should be where any padding is added if necessary.
>>

Well, RFC 7348 is not dealing with padding at all. Both deployment
scenarios listed in RFC, as well as most of the existing real life
deployments today (in my opinion) use VXLAN for bridged traffic. In
other words, frame encapsulated by VTEP is received first over some
ethernet interface (physical or virtual) which implies that the frame
is at least 64 bytes long already.

Perhaps we're going to see more VXLAN interfaces in L3 mode, yet it
might be safer not to count on receiving VTEP doing the right thing
(pad small packets with zeros).

>
> If that's the case, then Martin is most likely seeing a HW bug on the switch.
> I wonder how common such a bug might be?
>
> -vlad
>

I see this on Vmware distributed virtual switch. Perhaps soon I will
be able to test it against HP 5930 switch. I'm going to try how Linux
bridge copes with it, now.

Many thanks for the patch anyway!

Regards,
Martin
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Vlad Yasevich @ 2014-08-27 20:23 UTC
To: Martin Rusko; +Cc: Stephen Hemminger, Cong Wang, netdev

On 08/27/2014 04:01 PM, Martin Rusko wrote:
> [...]
>
> I see this on Vmware distributed virtual switch. Perhaps soon I will
> be able to test it against HP 5930 switch. I'm going to try how Linux
> bridge copes with it, now.

Linux bridge will do just fine as it will pass the frame off to the hw driver
which should pad things appropriately.

-vlad

>
> Many thanks for the patch anyway!
>
> Regards,
> Martin
>
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Martin Rusko @ 2014-08-27 21:00 UTC
To: Vlad Yasevich; +Cc: Stephen Hemminger, Cong Wang, netdev

On Wed, Aug 27, 2014 at 10:23 PM, Vlad Yasevich <vyasevich@gmail.com> wrote:
> On 08/27/2014 04:01 PM, Martin Rusko wrote:
>> [...]
>>
>> I see this on Vmware distributed virtual switch. Perhaps soon I will
>> be able to test it against HP 5930 switch. I'm going to try how Linux
>> bridge copes with it, now.
>
> Linux bridge will do just fine as it will pass the frame off to the hw driver
> which should pad things appropriately.
>
> -vlad
>

I can confirm that, now. After using namespaces to set up the following
topology:

    [main host] ~~~~~ [switch ns] ------ [host ns]

    ~~~  = vxlan (on top of veth link)
    ---- = veth link

    # namespace for the bridge with VTEP
    ip netns add switch
    # namespace for the remote host behind the bridge
    ip netns add host

    ip li add name veth0 type veth peer name veth1
    # second veth pair, switch ns to host ns (assumed; this step is not
    # shown in the commands as posted, but veth2/veth3 are used below)
    ip li add name veth2 type veth peer name veth3
    ip li set veth1 netns switch
    ip li set veth2 netns switch
    ip li set veth3 netns host

    ip ad add 192.0.2.1/30 brd + dev veth0
    ip li set veth0 up
    ip netns exec switch ip ad add 192.0.2.2/30 brd + dev veth1
    ip netns exec switch ip li set veth1 up

    ip li add name vxln0 type vxlan id 100 group 239.0.2.0 \
        local 192.0.2.1 dev veth0 dstport 0
    ip ad add 198.51.100.1/24 brd + dev vxln0
    ip li set vxln0 up

    ip netns exec switch ip li add name vxln1 type vxlan id 100 \
        group 239.0.2.0 local 192.0.2.2 dev veth1 dstport 0
    ip netns exec switch ip li add name vbr0 type bridge
    ip netns exec switch ip li set vxln1 master vbr0
    ip netns exec switch ip li set veth2 master vbr0
    ip netns exec switch ip li set vxln1 up
    ip netns exec switch ip li set veth2 up
    ip netns exec switch ip li set vbr0 up

    ip netns exec host ip ad add 198.51.100.2/24 brd + dev veth3
    ip netns exec host ip li set veth3 up

I was able to arping the remote host from the main host, and when I tapped
the veth0 and veth2 interfaces, I could see small packets being exchanged
without any issues.

Vlad, I'm going to recompile 3.16.1 kernel with your patch.

Regards,
Martin
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Martin Rusko @ 2014-09-01 14:26 UTC
To: Vlad Yasevich, Stephen Hemminger; +Cc: netdev

>>> [...]
>>>
>>> I see this on Vmware distributed virtual switch. Perhaps soon I will
>>> be able to test it against HP 5930 switch. I'm going to try how Linux
>>> bridge copes with it, now.
>>
>> Linux bridge will do just fine as it will pass the frame off to the hw driver
>> which should pad things appropriately.
>>
>> -vlad
>>
>
> I can confirm that, now.

I also tried a setup with Xen hypervisor and HVM guest connected to
bridge with vxlan interface. In any combination I tried, it pretty
much didn't care about the ethernet packet size.

Haven't had chance to test it with hardware switch yet, so it's only
causing problems with Vmware so far, when inner ethernet frame is not
padded.

>
> Vlad, I'm going to recompile 3.16.1 kernel with your patch.
>

Vlad's patch works perfectly. I asked the authors of RFC 7348 to
eventually clarify if any padding should be required for the inner
frames.

Shall the patch be included in the mainline kernel or not?

/Martin
* Re: Sending undersized ARP packets with VXLAN L3 interface
From: Martin Rusko @ 2014-09-11 16:16 UTC
To: Vlad Yasevich, Stephen Hemminger; +Cc: netdev

On Mon, Sep 1, 2014 at 4:26 PM, Martin Rusko <martin.rusko@gmail.com> wrote:
>>>>>> No. The short frame is perfectly valid, over the VXLAN.
>>>>>> The system doing the decap and forwarding should be where any padding is added if necessary.

[...]

> Haven't had chance to test it with hardware switch yet, so it's only
> causing problems with Vmware so far, when inner ethernet frame is not
> padded.

Perhaps nobody is reading this anymore, but at least it will make it
to the archive for reference. :-)

I have now also tested how the HP 5930 switch handles small inner frames.
They were padded. So it's in line with what Stephen said: the receiving
VTEP should pad frames to the minimum size before sending them to a
medium where the size check is enforced.

Regards,
Martin
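The behaviour the thread settles on, where the decapsulating VTEP pads short inner frames before forwarding them, can be sketched in userspace C as follows. This is an illustration only, not the kernel's or any switch's implementation; vxlan_decap_and_pad() and its parameters are hypothetical. The 8-byte header layout (flags byte with the I bit 0x08, 24-bit VNI in bytes 4 to 6) follows RFC 7348.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VXLAN_HLEN     8    /* flags(1) + reserved(3) + VNI(3) + reserved(1) */
#define VXLAN_FLAG_VNI 0x08 /* "I" flag: the VNI field is valid */
#define ETH_ZLEN       60   /* minimum Ethernet frame length, FCS excluded */

/* Strip the VXLAN header from a UDP payload and copy the inner Ethernet
 * frame into out, zero-padding it to ETH_ZLEN if it is shorter.
 * Returns the resulting frame length, or 0 on a malformed packet or
 * when out is too small. On success *vni holds the network identifier. */
static size_t vxlan_decap_and_pad(const uint8_t *udp_payload, size_t payload_len,
                                  uint8_t *out, size_t out_cap, uint32_t *vni)
{
    assert(udp_payload != NULL && out != NULL && vni != NULL);

    if (payload_len <= VXLAN_HLEN)
        return 0; /* no inner frame present */
    if (!(udp_payload[0] & VXLAN_FLAG_VNI))
        return 0; /* VNI not marked valid */

    size_t inner_len = payload_len - VXLAN_HLEN;
    size_t out_len = inner_len < ETH_ZLEN ? ETH_ZLEN : inner_len;
    if (out_len > out_cap)
        return 0;

    *vni = ((uint32_t)udp_payload[4] << 16) |
           ((uint32_t)udp_payload[5] << 8) |
           (uint32_t)udp_payload[6];

    memcpy(out, udp_payload + VXLAN_HLEN, inner_len);
    memset(out + inner_len, 0, out_len - inner_len); /* pad with zeros */
    return out_len;
}
```

With a 42-byte inner ARP frame the function returns 60, so the frame leaving the decapsulating side meets the Ethernet minimum even though it travelled unpadded inside the tunnel.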