From mboxrd@z Thu Jan 1 00:00:00 1970
From: Willy Tarreau
Subject: Re: splice() performance for TCP socket forwarding
Date: Thu, 13 Dec 2018 14:57:20 +0100
Message-ID: <20181213135720.GB16149@1wt.eu>
References: <20181213125553.GA16149@1wt.eu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Marek Majkowski, netdev@vger.kernel.org
To: Eric Dumazet
Return-path:
Received: from wtarreau.pck.nerim.net ([62.212.114.60]:55474 "EHLO 1wt.eu"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1729430AbeLMN5Z (ORCPT ); Thu, 13 Dec 2018 08:57:25 -0500
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

Hi Eric!

On Thu, Dec 13, 2018 at 05:37:11AM -0800, Eric Dumazet wrote:
> Maybe mlx5 driver is in LRO mode, packing TCP payload in 4K pages ?

I could be wrong but I don't think so: I remember being used to LRO on
myri10ge a decade ago, which gave good performance that would degrade with
concurrent connections, until LRO got deprecated once GRO started to work
quite well. Thus this got me used to always disabling LRO to be sure to
measure something durable ;-)

> bnx2x GRO/LRO has this mode, meaning that around 8 pages are used for a
> GRO packet of ~32 KB, while mlx4 for instance would use one page frag
> for every ~1428 bytes of payload.

I remember it was the same on myri10ge (1 segment per page), making
splice() return roughly 21 or 22 kB per call for a 64 kB pipe.

BTW, I think I said bullshit, and that 3 years ago it was mlx4 and not
mlx5 that I've been using.

Willy