From: Jesper Dangaard Brouer
Subject: Re: Kernel 4.19 network performance - forwarding/routing normal users traffic
Date: Mon, 5 Nov 2018 21:17:33 +0100
Message-ID: <20181105211733.7468cc61@redhat.com>
In-Reply-To: <394a0bf2-fa97-1085-2eda-98ddf476895c@itcare.pl>
References: <61697e49-e839-befc-8330-fc00187c48ee@itcare.pl>
 <3a88bb53-9d17-3e85-638e-a605f5bfe0fb@gmail.com>
 <20181101115522.10b0dd0a@redhat.com>
 <63198d68-6752-3695-f406-d86fb395c12b@itcare.pl>
 <7141e1e0-93e4-ab20-bce6-17f1e14682f1@gmail.com>
 <394a0bf2-fa97-1085-2eda-98ddf476895c@itcare.pl>
To: Paweł Staszewski
Cc: David Ahern, netdev, Yoel Caspersen, brouer@redhat.com

On Sun, 4 Nov 2018 01:24:03 +0100
Paweł Staszewski wrote:

> And today, again after applying the patch for the page allocator, I
> reached 64/64 Gbit/s again
>
> with only 50-60% CPU load

Great.

> Today no slowpath hit for networking :)
>
> But again dropped packets at 64 Gbit/s RX and 64 Gbit/s TX ...
> And as it should not be a PCI Express limit - I think something more is

Well, this does sound like a PCIe bandwidth limit to me.

See the PCIe bandwidth numbers here:
 https://en.wikipedia.org/wiki/PCI_Express

You likely have PCIe v3, where one lane has 984.6 MBytes/s or 7.87 Gbit/s.
Thus, x16 lanes have 15.75 GBytes/s or 126 Gbit/s.  It does say "in each
direction", but you are also forwarding this RX->TX across both ports of a
dual-port NIC that share the same PCIe slot.

> going on there - and hard to catch - because perf top doesn't change,
> besides there is no queued slowpath hit now
>
> I also ordered Intel cards to compare - but 3 weeks ETA.
> Faster - in 3 days - I will have a Mellanox ConnectX-5, so I can
> separate traffic onto two different x16 PCIe buses

I do think you need to separate the traffic onto two different x16 PCIe
slots.

I have found that the ConnectX-5 has significantly better packet-per-second
performance than the ConnectX-4, but that is not your use-case (max
bandwidth).  I have not tested these NICs for maximum _bidirectional_
bandwidth limits; I have only made sure I can do 100G unidirectional, which
can hit some funny motherboard memory limits (remember to equip the
motherboard with 4 RAM modules for full memory bandwidth).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
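
For reference, here is a small back-of-the-envelope Python sketch that just
redoes the PCIe v3 arithmetic above.  The 984.6 MBytes/s per-lane figure is
the one from the Wikipedia page linked in the mail, and the 64G RX / 64G TX
load is the rate Paweł reported; the variable names and the overhead remark
are only illustration, not measurements from the box in question.

#!/usr/bin/env python3
# Back-of-the-envelope PCIe v3 arithmetic for the numbers quoted above.
# Per-lane figure (984.6 MBytes/s, after 128b/130b encoding) is from the
# Wikipedia PCI Express page; 64G RX / 64G TX is the reported forwarding
# load.  Nothing here is measured.

PCIE_V3_MBYTES_PER_LANE = 984.6   # MBytes/s, one PCIe 3.0 lane, one direction
LANES = 16                        # x16 slot

lane_gbit = PCIE_V3_MBYTES_PER_LANE * 8 / 1000        # ~7.88 Gbit/s per lane
slot_gbytes = PCIE_V3_MBYTES_PER_LANE * LANES / 1000  # ~15.75 GBytes/s per direction
slot_gbit = lane_gbit * LANES                         # ~126 Gbit/s per direction

print(f"PCIe v3  x1: {lane_gbit:6.2f} Gbit/s per direction")
print(f"PCIe v3 x16: {slot_gbytes:6.2f} GBytes/s = {slot_gbit:5.1f} Gbit/s per direction")

# Forwarding between the two ports of a dual-port NIC in a single slot means
# every packet crosses that slot twice: RX DMA (NIC -> host) and TX DMA
# (host -> NIC).  At the reported 64 Gbit/s RX + 64 Gbit/s TX, each PCIe
# direction already carries ~64 Gbit/s of packet data, before counting
# descriptor, doorbell and completion overhead.
rx_gbit, tx_gbit = 64, 64
print(f"Reported load: {rx_gbit} Gbit/s NIC->host + {tx_gbit} Gbit/s host->NIC, "
      f"out of ~{slot_gbit:.0f} Gbit/s available in each direction")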