From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: SO_REUSEPORT - can it be done in kernel?
Date: Wed, 02 Mar 2011 03:56:38 +0100
Message-ID: <1299034598.2930.47.camel@edumazet-laptop>
References: <20110301110708.GJ9763@canuck.infradead.org>
 <1298977984.3284.15.camel@edumazet-laptop>
 <20110301112759.GK9763@canuck.infradead.org>
 <1298979909.3284.28.camel@edumazet-laptop>
 <20110301115305.GA6984@gondor.apana.org.au>
 <1298984609.3284.98.camel@edumazet-laptop>
 <20110301131823.GB8028@gondor.apana.org.au>
 <1298997084.3284.119.camel@edumazet-laptop>
 <20110302002353.GA15009@gondor.apana.org.au>
 <1299031203.2930.26.camel@edumazet-laptop>
 <20110302023920.GA16072@gondor.apana.org.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Thomas Graf, David Miller, rick.jones2@hp.com, therbert@google.com,
 wsommerfeld@google.com, daniel.baluta@gmail.com, netdev@vger.kernel.org
To: Herbert Xu
Return-path: Received: from mail-wy0-f174.google.com ([74.125.82.174]:42467
 "EHLO mail-wy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1757031Ab1CBC4n (ORCPT); Tue, 1 Mar 2011 21:56:43 -0500
Received: by wyg36 with SMTP id 36so5360859wyg.19 for;
 Tue, 01 Mar 2011 18:56:42 -0800 (PST)
In-Reply-To: <20110302023920.GA16072@gondor.apana.org.au>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Wednesday, 02 March 2011 at 10:39 +0800, Herbert Xu wrote:
> UDP is a datagram protocol, TCP is not.
>
> Anyway, here is an alternate proposal. When a TCP socket transmits
> for the first time (SYN or SYN-ACK), we pick a queue based on CPU and
> store it in the socket. From then on we stick to that selection.
>

Many TCP applications I know use one thread to perform listen/accept
and a pool of threads to handle each new connection.

Anyway, the SYN-ACK is generated from softirq context, so it is not
really a user choice: which CPU handles it depends on whether the NIC
is RX multiqueue or RPS is set up.
All this discussion is about choosing between letting the process
scheduler decide the TX queue (because the user/admin set cpu affinity),
or letting the network stack drive the scheduler: "please migrate this
thread to this cpu". Both schemes should be allowed/configurable so that
the best results are available.

> We would only allow changes if we can ensure that all transmitted
> packets have left the queue. Or we just never change it like we
> do now.
>

We do change it in case of a dst/route change. Each device can have a
different number of TX queues.