From: Martin Rusko <martin.rusko@gmail.com>
To: Vlad Yasevich <vyasevich@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
Cong Wang <cwang@twopensource.com>,
netdev <netdev@vger.kernel.org>
Subject: Re: Sending undersized ARP packets with VXLAN L3 interface
Date: Wed, 27 Aug 2014 22:01:36 +0200
Message-ID: <CAMYYbY4KHTkkv6peT14S8mCzcQJ8wROnHWj5tMy563H-w79KMg@mail.gmail.com>
In-Reply-To: <53FE2738.30702@gmail.com>
On Wed, Aug 27, 2014 at 8:45 PM, Vlad Yasevich <vyasevich@gmail.com> wrote:
> On 08/27/2014 02:42 PM, Stephen Hemminger wrote:
>> On Wed, 27 Aug 2014 13:52:03 -0400
>> Vlad Yasevich <vyasevich@gmail.com> wrote:
>>
>>> On 08/27/2014 01:28 PM, Cong Wang wrote:
>>>> On Wed, Aug 27, 2014 at 10:06 AM, Martin Rusko <martin.rusko@gmail.com> wrote:
>>>>>
>>>>> I'm wondering, where is the proper place to fix this. Should
>>>>> arp_create() function allocate skb big enough to produce ethernet
>>>>> frame with at least minimum size? Or is it somewhere in NIC drivers
>>>>> where small packets are padded with zeros?
>>>>
>>>> Drivers do that, for example e1000:
>>>>
>>>> 	/* On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
>>>> 	 * packets may get corrupted during padding by HW.
>>>> 	 * To WA this issue, pad all small packets manually.
>>>> 	 */
>>>> 	if (skb->len < ETH_ZLEN) {
>>>> 		if (skb_pad(skb, ETH_ZLEN - skb->len))
>>>> 			return NETDEV_TX_OK;
>>>> 		skb->len = ETH_ZLEN;
>>>> 		skb_set_tail_pointer(skb, ETH_ZLEN);
>>>> 	}
>>>
>>>
>>> I think vxlan needs something like this:
>>>
>>> From: Vladislav Yasevich <vyasevich@gmail.com>
>>> Date: Wed, 27 Aug 2014 13:39:32 -0400
>>> Subject: [PATCH] vxlan: Pad short ethernet frames.
>>>
>>> If sending short ethernet frames from the vxlan device, pad
>>> them to minimum size so they can be forwarded after decapsulation.
>>>
>>> Reported-by: Martin Rusko <martin.rusko@gmail.com>
>>> Signed-off-by: Vladislav Yasevich <vyasevich@gmail.com>
>>> ---
>>> drivers/net/vxlan.c | 8 ++++++++
>>> 1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>>> index 1fb7b37..48267d4 100644
>>> --- a/drivers/net/vxlan.c
>>> +++ b/drivers/net/vxlan.c
>>> @@ -1939,6 +1939,14 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
>>>  #endif
>>>  	}
>>>
>>> +	/* Pad short frames so they can be forwarded after decapsulation */
>>> +	if (skb->len < ETH_ZLEN) {
>>> +		if (skb_pad(skb, ETH_ZLEN - skb->len))
>>> +			return NETDEV_TX_OK;
>>> +		skb->len = ETH_ZLEN;
>>> +		skb_set_tail_pointer(skb, ETH_ZLEN);
>>> +	}
>>> +
>>>  	f = vxlan_find_mac(vxlan, eth->h_dest);
>>>  	did_rsc = false;
>>>
>>
>> No. The short frame is perfectly valid, over the VXLAN.
>> The system doing the decap and forwarding should be where any padding is added if necessary.
>>
Well, RFC 7348 does not address padding at all. Both deployment
scenarios listed in the RFC, as well as (in my opinion) most existing
real-life deployments, use VXLAN for bridged traffic. In other words,
the frame encapsulated by the VTEP is first received over some
Ethernet interface (physical or virtual), which implies that the frame
is already at least 64 bytes long.
Perhaps we are going to see more VXLAN interfaces in L3 mode, yet it
might be safer not to count on the receiving VTEP doing the right thing
(padding small packets with zeros).
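
To make the expectation concrete, here is a minimal userspace sketch
(not kernel code, and not a proposed patch) of the zero-padding a
decapsulating VTEP would have to apply before forwarding a short frame
onto Ethernet. The helper name pad_short_frame() and the MIN_FRAME_LEN
constant are made up for illustration; 60 is the value of the kernel's
ETH_ZLEN (minimum frame length without the 4-byte FCS). An ARP frame is
42 bytes (14-byte Ethernet header plus 28-byte ARP payload), so it
would need 18 bytes of padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimum Ethernet frame length without the FCS (same value as ETH_ZLEN). */
#define MIN_FRAME_LEN 60

/*
 * Hypothetical userspace helper: zero-pad the frame stored in buf
 * (len bytes used, cap bytes available) up to MIN_FRAME_LEN.
 * Returns the resulting frame length, or 0 if buf has no room for padding.
 */
static size_t pad_short_frame(unsigned char *buf, size_t len, size_t cap)
{
	assert(buf != NULL);
	assert(len <= cap);

	if (len >= MIN_FRAME_LEN)
		return len;	/* already long enough, leave untouched */
	if (cap < MIN_FRAME_LEN)
		return 0;	/* cannot pad in place */

	/* Fill the gap between the payload end and the minimum with zeros. */
	memset(buf + len, 0, MIN_FRAME_LEN - len);
	return MIN_FRAME_LEN;
}
```

Calling pad_short_frame(buf, 42, sizeof(buf)) on a 42-byte ARP frame
returns 60 and leaves the original 42 bytes untouched; frames that are
already 60 bytes or longer are returned unchanged.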
>
> If that's the case, then Martin is most likely seeing a HW bug on the switch.
> I wonder how common such a bug might be?
>
> -vlad
>
I see this on a VMware distributed virtual switch. Perhaps I will soon
be able to test it against an HP 5930 switch. For now, I'm going to
check how the Linux bridge copes with it.
Many thanks for the patch anyway!
Regards,
Martin