From: Stephen Hemminger
Subject: Re: [RFC] Ethernet drivers to add padding on egress
Date: Tue, 20 Nov 2018 16:55:17 -0800
Message-ID: <20181120165517.28b21004@xeon-e3>
References: <98CBD80474FA8B44BF855DF32C47DC35B424A6@smartserver.smartshare.dk>
Cc: Morten Brørup, "dev@dpdk.org", Ferruh Yigit, Declan Doherty, Chas Williams, "John W. Linville", Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin, Ravi Kumar, Igor Russkikh, Pavel Belous, Shepard Siegel, Ed Czeck, John Miller, Ajit Khaparde, Somnath Kotur, Jerin Jacob, Maciej Czekaj, Shijith Thotton, Srisivasubrama
To: Shahaf Shuler

On Mon, 19 Nov 2018 08:02:02 +0000
Shahaf Shuler wrote:

> Thursday, November 15, 2018 6:57 PM, Morten Brørup:
> > Subject: [RFC] Ethernet drivers to add padding on egress
> >
> > Hi networking driver maintainers,
> >
> > I suggest that the TX functions of Ethernet interface drivers accept packets
> > with less than 60 byte payload, and transmit them on the medium as valid
> > Ethernet frames, i.e. by padding the packets up to the minimum Ethernet
> > packet size of 64 bytes incl. Ethernet FCS, instead of discarding them.
> >
> > This feature makes it easier for application developers who are using DPDK as
> > the lower layer in an IP stack, where lots of packets have less than 60 bytes
> > Ethernet payload, e.g.
> > TCP SYN and TCP ACK packets.
> >
> > This feature also makes it easier for application developers who are using
> > DPDK library functions that split, merge or otherwise transform packets into
> > packets of other sizes, e.g. Generic Segmentation Offload, IP Fragmentation
> > and various tunnel encapsulation/decapsulation functions.
> >
> > Currently (without this feature), it is required by the application to check if
> > packets originating from the IP stack or having passed through a
> > split/merge/transform function are about to egress on an Ethernet interface,
> > and in that case, if some of the packets are less than 60 bytes (excl. Ethernet
> > FCS), add padding before passing them on to the driver's TX function.
> >
> > E.g. when using Generic Segmentation Offload, a packet carrying 1461 byte
> > TCP payload (excl. 54 bytes Ethernet+IP+TCP headers) will be split into two
> > packets of respectively 1514 byte (incl. 54 bytes Ethernet+IP+TCP headers)
> > and 55 bytes (incl. 54 bytes Ethernet+IP+TCP headers), and the latter must
> > be padded before it is transmitted on an Ethernet interface.
> >
> >
> > In my opinion, it should be a requirement that the Ethernet interface drivers
> > ensure correct padding when egressing the packet on the medium.
> > Alternatively, it can be an optional feature, which could be exposed as an TX
> > Capabilities flag of the driver.
> >
> > What do you think?
>
> I think at the first stage it should be a Tx offload capability - the ability to pad (maybe in HW) the packets and avoid the cost of padding in SW.
> PMD vendors who wants to make an easier life for their customers can implement it in SW, however the gain here is only with simplicity of code for application. Performance wise it wouldn't matter.
>
> When the majority/all PMDs will have this feature we can discuss on making it a standard for each PMD (like the CRC strip we have today).
Yet another tx offload flag may look good to a vendor but doesn't add anything useful and hurts usability.

Every driver should take any size Ethernet packet and pad in hardware (or software) based on what it knows the NIC hardware can do.

For virtual devices where no minimum length is required, nothing needs to be done.

Packets shorter than the Ethernet header are obvious errors and should increment tx_output errors.
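
To make the proposed driver-side behaviour concrete, here is a rough sketch in plain C rather than actual PMD code. Everything in it is an assumption made for illustration: `tx_prepare_len`, `struct tx_stats` and the two constants are hypothetical names and not DPDK API, the frame is treated as a flat buffer instead of an mbuf chain, and the pad bytes are zero-filled by choice so that stale buffer contents are not sent on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN    14u  /* dst MAC + src MAC + EtherType */
#define ETH_MIN_NO_FCS 60u  /* 64-byte minimum frame minus the 4-byte FCS */

/* Hypothetical per-queue counters, for illustration only. */
struct tx_stats {
    uint64_t padded;  /* frames that were extended with pad bytes */
    uint64_t errors;  /* frames rejected before transmit */
};

/*
 * Decide what length to hand to the hardware.
 *
 * buf     - frame data, with room for cap bytes in total
 * len     - current frame length, excluding FCS
 * min_len - minimum length the device needs, excluding FCS
 *           (0 for a virtual device with no minimum)
 *
 * Returns the length to transmit, or 0 if the frame is rejected.
 */
size_t
tx_prepare_len(uint8_t *buf, size_t len, size_t cap, size_t min_len,
               struct tx_stats *st)
{
    assert(buf != NULL && st != NULL);

    /* Shorter than an Ethernet header (or larger than the buffer): error. */
    if (len < ETH_HDR_LEN || len > cap) {
        st->errors++;
        return 0;
    }

    /* Already long enough, or the device has no minimum: nothing to do. */
    if (len >= min_len)
        return len;

    /* No room to pad in place; a real driver might chain a pad buffer. */
    if (min_len > cap) {
        st->errors++;
        return 0;
    }

    /* Zero-fill the tail up to the minimum length. */
    memset(buf + len, 0, min_len - len);
    st->padded++;
    return min_len;
}
```

With the 55-byte GSO tail segment from the example earlier in the thread and `min_len` set to `ETH_MIN_NO_FCS`, this returns 60 and zeroes bytes 55 to 59. With `min_len` set to 0 (the virtual device case) the same frame is passed through unchanged, and a 10-byte frame is counted as an error in both cases.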