From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [RFC,PATCH] loopback: calls netif_receive_skb() instead of netif_rx()
Date: Mon, 31 Mar 2008 12:44:03 +0200
Message-ID: <20080331104403.GA12681@elte.hu>
References: <20080323.032949.194309002.davem@davemloft.net> <47E6A5FD.6060407@cosmosbay.com> <20080331094823.GA11651@elte.hu> <20080331.030848.175668431.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: dada1@cosmosbay.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, a.p.zijlstra@chello.nl
To: David Miller
Return-path:
Received: from mx3.mail.elte.hu ([157.181.1.138]:60398 "EHLO mx3.mail.elte.hu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751484AbYCaKoR (ORCPT ); Mon, 31 Mar 2008 06:44:17 -0400
Content-Disposition: inline
In-Reply-To: <20080331.030848.175668431.davem@davemloft.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

* David Miller wrote:

> I don't think it's safe.
>
> Every packet you receive can result in a sent packet, which in turn
> can result in a full packet receive path being taken, and yet again
> another sent packet.
>
> And so on and so forth.
>
> Some cases like this would be stack bugs, but wouldn't you like that
> bug to be a very busy cpu instead of a crash from overrunning the
> current stack?

sure. But the core problem remains: our loopback networking scalability
is poor. For plain localhost<->localhost connected sockets we hit the
loopback device lock for every packet, and this very much shows up on
real workloads on a quad already: the lock instruction in netif_rx is
the most expensive instruction in a sysbench DB workload.

and it's not just about scalability, the plain algorithmic overhead is
way too high as well:

 $ taskset 1 ./bw_tcp -s
 $ taskset 1 ./bw_tcp localhost
 Socket bandwidth using localhost: 2607.09 MB/sec
 $ taskset 1 ./bw_pipe
 Pipe bandwidth: 3680.44 MB/sec

i don't think this is acceptable.
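For anyone who wants to repeat the comparison without lmbench at hand, here is a rough user-space sketch of the same idea: push a fixed number of bytes through an anonymous pipe and through a 127.0.0.1 TCP connection, and report the throughput of each. It is not bw_tcp or bw_pipe. The function names, the chunk size and the byte count are all made up for illustration, it uses a reader thread rather than a second process, and Python's own overhead means the absolute figures will not match the ones above. Only the relative gap is of interest.

```python
import os
import socket
import threading
import time

CHUNK = 64 * 1024          # bytes per write; arbitrary choice for this sketch
TOTAL = 16 * 1024 * 1024   # bytes moved per run; kept small so it finishes fast


def _write_all(fd, buf):
    """os.write() may write less than asked, so loop until buf is empty."""
    while buf:
        buf = buf[os.write(fd, buf):]


def _bandwidth(send, recv, close_send, total):
    """Push `total` bytes through send() while a thread drains recv().

    Returns (bytes_received, MB/sec), with MB taken as 10**6 bytes.
    """
    payload = memoryview(b"\0" * CHUNK)
    received = [0]

    def reader():
        while received[0] < total:
            data = recv(CHUNK)
            if not data:           # writer side closed early
                break
            received[0] += len(data)

    t = threading.Thread(target=reader)
    start = time.perf_counter()
    t.start()
    sent = 0
    while sent < total:
        n = min(CHUNK, total - sent)
        send(payload[:n])
        sent += n
    close_send()
    t.join()
    elapsed = time.perf_counter() - start
    return received[0], received[0] / elapsed / 1e6


def pipe_bandwidth(total=TOTAL):
    """Throughput of an anonymous pipe (hypothetical helper)."""
    rfd, wfd = os.pipe()
    try:
        return _bandwidth(lambda b: _write_all(wfd, b),
                          lambda n: os.read(rfd, n),
                          lambda: os.close(wfd),
                          total)
    finally:
        os.close(rfd)


def tcp_loopback_bandwidth(total=TOTAL):
    """Throughput of a TCP connection over 127.0.0.1 (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick one
        server.listen(1)
        client = socket.create_connection(server.getsockname())
        conn, _ = server.accept()
        with conn:
            return _bandwidth(client.sendall, conn.recv, client.close, total)


if __name__ == "__main__":
    for name, fn in (("Pipe", pipe_bandwidth),
                     ("Loopback TCP", tcp_loopback_bandwidth)):
        nbytes, rate = fn()
        print(f"{name} bandwidth: {rate:.2f} MB/sec ({nbytes} bytes)")
```

To pin it to one CPU as in the runs above, something like `taskset 1 python3 bw_sketch.py` should do (the file name is only a placeholder).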
Either we should fix loopback TCP performance or we should
transparently switch to VFS pipes as a transport method when an app
establishes a plain loopback connection (as long as there are no frills
like a content-modifying component in the delivery path of packets
after a connection has been established - which covers 99.9% of the
real-life loopback cases).

I'm not suggesting we shouldn't use TCP for connection establishment -
but if the TCP loopback packet transport is too slow we should use the
VFS transport, which is more scalable, less cache-intensive and has
lower straight overhead as well.

	Ingo