From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paolo Abeni
Subject: Re: [PATCH RFC net-next 00/11] udp gso
Date: Wed, 18 Apr 2018 13:17:54 +0200
Message-ID: <1524050274.2599.21.camel@redhat.com>
References: <20180417200059.30154-1-willemdebruijn.kernel@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Willem de Bruijn
To: Willem de Bruijn, netdev@vger.kernel.org
Return-path:
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45016 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752162AbeDRLR4 (ORCPT ); Wed, 18 Apr 2018 07:17:56 -0400
In-Reply-To: <20180417200059.30154-1-willemdebruijn.kernel@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Tue, 2018-04-17 at 16:00 -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn
>
> Segmentation offload reduces cycles/byte for large packets by
> amortizing the cost of protocol stack traversal.
>
> This patchset implements GSO for UDP. A process can concatenate and
> submit multiple datagrams to the same destination in one send call
> by setting socket option SOL_UDP/UDP_SEGMENT with the segment size,
> or passing an analogous cmsg at send time.
>
> The stack will send the entire large (up to network layer max size)
> datagram through the protocol layer. At the GSO layer, it is broken
> up in individual segments. All receive the same network layer header
> and UDP src and dst port. All but the last segment have the same UDP
> header, but the last may differ in length and checksum.

This is interesting, thanks for sharing!

I have some local patches somewhere implementing UDP GRO, but I never
tried to upstream them, since I lacked the associated GSO and I thought
that the use-case was not too relevant.

Given that your use-case is a connected socket - no per packet route
lookup - how does GSO perform compared to plain sendmmsg()? Have you
considered using and/or improving the latter?

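For reference, the two submission paths being compared could be sketched
roughly as below. This is only an illustration under stated assumptions:
the helper names connected_udp_socket(), send_burst_mmsg() and
send_burst_gso() and the MAX_BURST cap are made up here, and the numeric
fallback for UDP_SEGMENT is assumed from the uapi header as later merged,
not taken from this RFC.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef SOL_UDP
#define SOL_UDP 17		/* same value as IPPROTO_UDP */
#endif
#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103		/* assumed: value in linux/udp.h as later merged */
#endif

#define MAX_BURST 64		/* hypothetical cap, for this sketch only */

/* Connected UDP socket, so no per-packet route lookup is needed. */
static int connected_udp_socket(const struct sockaddr_in *dst)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	if (connect(fd, (const struct sockaddr *)dst, sizeof(*dst)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Baseline: one sendmmsg() call carrying `count` datagrams of `seg` bytes.
 * Returns the number of datagrams sent, or -1 on error.
 */
static int send_burst_mmsg(int fd, const char *buf, size_t seg,
			   unsigned int count)
{
	struct mmsghdr msgs[MAX_BURST];
	struct iovec iov[MAX_BURST];
	unsigned int i;

	assert(count <= MAX_BURST);
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < count; i++) {
		iov[i].iov_base = (void *)(buf + i * seg);
		iov[i].iov_len = seg;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}
	return sendmmsg(fd, msgs, count, 0);
}

/*
 * GSO: one send() of count * seg bytes; the stack splits it every `seg`
 * bytes. Returns bytes sent, or -1 (e.g. if the kernel lacks UDP_SEGMENT).
 */
static ssize_t send_burst_gso(int fd, const char *buf, size_t seg,
			      unsigned int count)
{
	int gso_size = (int)seg;

	if (setsockopt(fd, SOL_UDP, UDP_SEGMENT, &gso_size,
		       sizeof(gso_size)) < 0)
		return -1;
	return send(fd, buf, seg * count, 0);
}
```

Both helpers enter the kernel once per burst for the data itself (the GSO
helper also sets the socket option on every call, which a real sender
would do once). The difference is that sendmmsg() still traverses the
protocol stack once per datagram, while the GSO path traverses it once
for the whole buffer.
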
When testing with Spectre/Meltdown mitigations in place, I expect that
the most relevant part of the gain is due to the single syscall per
burst.

Cheers,

Paolo