From: Damien Millescamps
Subject: Re: Best example for showing throughput?
Date: Wed, 29 May 2013 16:07:28 +0200
Message-ID: <51A60BA0.7000700@6wind.com>
References: <519F74F6.3000903@mahan.org> <201305241641.38896.thomas.monjalon@6wind.com> <201305241745.25844.thomas.monjalon@6wind.com> <5BBC85C7-B39F-4200-AB7B-CD5464BDA431@mahan.org> <51A10FC3.5050703@6wind.com> <51A12618.3040509@6wind.com>
To: "dev-VfR2kkLFssw@public.gmane.org"
List-Id: patches and discussions about DPDK

On 05/28/2013 09:15 PM, Patrick Mahan wrote:
> So the overhead cost is almost 70%?
>
> Can this ever do line rate? Under what conditions? It has been my
> experience that the industry standard is testing throughput using
> these 64 byte packets.

This overhead can actually be explained by looking at the PCIe 2.1
specification [1] and the 82599 datasheet [2].

To sum up: for each packet, the adapter first has to send a read
request for a 16-byte packet descriptor (cf. [2]), to which it
receives a read completion. The adapter must then issue either a read
or a write request to the packet's physical address, for the size of
the packet. A PCIe read or write request is framed with a
start-of-frame symbol, a sequence number, a header, the data, an LCRC
and an end-of-frame symbol (cf. [1]). That framing costs more than
16 bytes per PCIe message. On top of that, the PCIe physical layer
uses 8b/10b encoding (10 bits on the wire for every byte of data),
which adds another 25% of overhead.

Now, if you apply this to a 64-byte packet, you should find that the
overhead is way above 70%: four messages, plus the descriptor and data
sizes, all multiplied by the 10b/8b encoding, comes out to around 83%
if I didn't miss anything. If the overhead we observe in practice is
smaller, it is because the 82599 implements thresholds that let it
batch descriptor reads and write-backs (cf. WTHRESH in [2], for
example), which brings the overhead down to a little more than 70%
with the default DPDK parameters.

You can achieve line rate with 64-byte packets on each port
independently. When using both ports simultaneously, you can achieve
line rate with packet sizes above 64 bytes. In the post I redirected
you to, Alexander talked about 256-byte packets. But if you take the
time to compute the total throughput needed on the PCIe link as a
function of packet size, you will probably end up with a minimum
packet size lower than 256 bytes for achieving line rate on both
ports simultaneously.

[1] http://www.pcisig.com/members/downloads/specifications/pciexpress/PCI_Express_Base_r2_1_04Mar09.pdf
[2] http://www.intel.com/content/dam/doc/datasheet/82599-10-gbe-controller-datasheet.pdf

--
Damien Millescamps
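PS: for anyone who wants to redo the arithmetic above, here is a
back-of-the-envelope sketch in Python. The per-message framing cost
and the exact set of messages per packet are assumptions on my part
(3DW vs. 4DW headers, ECRC and DLLP traffic all move the number), so
with these particular values you get ~73% rather than 83%; the point
is the method, not the precise figure.

# Rough PCIe cost of moving one 64-byte packet through the NIC
# without descriptor batching. Framing sizes are assumptions.

TLP_FRAMING = 24        # assumed bytes of framing per message:
                        # STP(1) + seq(2) + 4DW header(16) + LCRC(4) + END(1)
ENCODING = 10.0 / 8.0   # 8b/10b: 10 bits on the wire per byte
DESC = 16               # descriptor size (cf. [2])
PKT = 64                # packet size

# Assumed four messages per packet without batching:
# descriptor read request (no payload), descriptor read completion,
# packet data read/write, descriptor write-back.
payloads = [0, DESC, PKT, DESC]
wire_bytes = sum(TLP_FRAMING + p for p in payloads) * ENCODING

print("wire bytes per 64B packet: %.0f" % wire_bytes)       # -> 240
print("overhead: %.0f%%" % (100 * (1 - PKT / wire_bytes)))  # -> ~73%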
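Along the same lines, here is what the "total PCIe throughput as a
function of the packet size" computation looks like for both ports at
line rate on the Gen2 x8 link. The amortized per-packet overhead
constant is again my assumption; note that with such a crude constant
even 64-byte packets appear to fit, so the real crossover comes from
the effects this first-order model ignores (completion granularity,
DLLP and flow-control traffic).

# First-order model: PCIe bits/s needed for both 10GbE ports at line
# rate, as a function of packet size, vs. a Gen2 x8 link's capacity.

ETH_GAP = 20                  # preamble + SFD + inter-frame gap, bytes
PCIE_BPS = 5e9 * 8 * 8 / 10   # Gen2 x8 after 8b/10b: 32 Gbit/s usable
PKT_OVH = 40                  # assumed amortized PCIe bytes per packet

for size in (64, 128, 256):
    pps = 2 * 10e9 / ((size + ETH_GAP) * 8.0)  # packets/s, both ports
    need = pps * (size + PKT_OVH) * 8          # PCIe bits/s required
    print("%3dB packets: %4.1f Gbit/s needed of %4.1f available"
          % (size, need / 1e9, PCIE_BPS / 1e9))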