From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Bennieston
Subject: Re: [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and receive queues
Date: Thu, 16 Jan 2014 10:27:16 +0000
Message-ID: <52D7B404.9080007@citrix.com>
References: <1389803004-31812-1-git-send-email-andrew.bennieston@citrix.com>
 <9AAE0902D5BC7E449B7C8E4E778ABCD0208D6E@AMSPEX01CL01.citrite.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1W3kAS-0001Rb-U2 for xen-devel@lists.xenproject.org;
 Thu, 16 Jan 2014 10:27:21 +0000
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD0208D6E@AMSPEX01CL01.citrite.net>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Paul Durrant, "xen-devel@lists.xenproject.org"
Cc: Wei Liu, Ian Campbell
List-Id: xen-devel@lists.xenproject.org

On 16/01/14 10:04, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew J. Bennieston [mailto:andrew.bennieston@citrix.com]
>> Sent: 15 January 2014 16:23
>> To: xen-devel@lists.xenproject.org
>> Cc: Ian Campbell; Wei Liu; Paul Durrant
>> Subject: [PATCH RFC 0/4]: xen-net{back,front}: Multiple transmit and
>> receive queues
>>
>> This patch series implements multiple transmit and receive queues
>> (i.e. multiple shared rings) for the xen virtual network interfaces.
>>
>> The series is split up as follows:
>> - Patches 1 and 3 factor out the queue-specific data for netback and
>>   netfront respectively, and modify the rest of the code to use these
>>   as appropriate.
>> - Patches 2 and 4 introduce new XenStore keys to negotiate and use
>>   multiple shared rings and event channels, and code to connect these
>>   as appropriate.
>>
>> All other transmit and receive processing remains unchanged, i.e.
>> there is a kthread per queue and a NAPI context per queue.
>>
>> The performance of these patches has been analysed in detail, with
>> results available at:
>>
>> http://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing
>>
>
> Nice numbers!
>
>> To summarise:
>> * Using multiple queues allows a VM to transmit at line rate on a
>>   10 Gbit/s NIC, compared with a maximum aggregate throughput of
>>   6 Gbit/s with a single queue.
>> * For intra-host VM--VM traffic, eight queues provide 171% of the
>>   throughput of a single queue; almost 12 Gbit/s instead of 6 Gbit/s.
>> * There is a corresponding increase in total CPU usage, i.e. this is
>>   a scaling out over available resources, not an efficiency
>>   improvement.
>> * Results depend on the availability of sufficient CPUs, as well as
>>   the distribution of interrupts and the distribution of TCP streams
>>   across the queues.
>>
>> One open issue is how to deal with the tx_credit data for rate
>> limiting. This used to exist on a per-VIF basis, and these patches
>> move it to per-queue to avoid contention on concurrent access to the
>> tx_credit data from multiple threads. This has the side effect of
>> breaking the tx_credit accounting across the VIF as a whole. I cannot
>> see a situation in which people would want to use both rate limiting
>> and a high-performance multi-queue mode, but if this is problematic
>> then it can be brought back to the VIF level, with appropriate
>> protection. Obviously, it continues to work identically in the case
>> where there is only one queue.
>>
>> Queue selection is currently achieved via an L4 hash on the packet
>> (i.e.
>> TCP src/dst port, IP src/dst address) and is not negotiated
>> between the frontend and backend, since only one option exists.
>> Future patches to support other frontends (particularly Windows) will
>> need to add some capability to negotiate not only the hash algorithm
>> selection, but also allow the frontend to specify some parameters to
>> this.
>>
>
> Yes, Windows RSS stipulates a Toeplitz hash and specifies a hash key
> and mapping table. There's further awkwardness in the need to pass the
> actual hash value to the frontend too - but we could use an 'extra'
> seg for that, analogous to passing the GSO mss value through.

Yes, I was hoping we might be able to play tricks like that when it came
to implementing Toeplitz support.

Andrew

>
> Paul
>
>> Queue-specific XenStore entries for ring references and event
>> channels are stored hierarchically, i.e. under .../queue-N/... where
>> N varies from 0 to one less than the requested number of queues
>> (inclusive). If only one queue is requested, it falls back to the
>> flat structure where the ring references and event channels are
>> written at the same level as other vif information.
>>
>> --
>> Andrew J. Bennieston
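
For reference, the hash-then-map queue selection discussed above can be
sketched roughly as follows. This is only an illustration in Python, not
code from the series: it computes a 32-bit Toeplitz hash over the
TCP/IPv4 4-tuple and then looks the result up in a mapping table. The
function names, EXAMPLE_KEY (an arbitrary 40-byte value, not a key any
frontend would supply) and the 128-entry table are all assumptions made
for the example.

```python
import ipaddress
import struct
from typing import Sequence

# Arbitrary 40-byte key chosen for illustration only (an assumption).
EXAMPLE_KEY = bytes(range(1, 41))


def flow_bytes(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack an IPv4 4-tuple in network byte order: addresses, then ports."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key.

    For every set bit in the input (scanned MSB first), XOR in the 32-bit
    window of the key that starts at the same bit offset.
    """
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least len(data) + 4 bytes long")
    key_bits = int.from_bytes(key, "big")
    key_len_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        byte_index, bit_index = divmod(i, 8)
        if data[byte_index] & (0x80 >> bit_index):
            shift = key_len_bits - 32 - i
            result ^= (key_bits >> shift) & 0xFFFFFFFF
    return result


def select_queue(hash_value: int, mapping_table: Sequence[int]) -> int:
    """Map a hash value to a queue index through a mapping table."""
    return mapping_table[hash_value % len(mapping_table)]


if __name__ == "__main__":
    # Hypothetical setup: four queues spread evenly over a 128-entry table.
    table = [i % 4 for i in range(128)]
    flow = flow_bytes("10.0.0.1", "10.0.0.2", 1234, 80)
    h = toeplitz_hash(EXAMPLE_KEY, flow)
    print(f"hash=0x{h:08x} queue={select_queue(h, table)}")
```

Because the hash depends only on the 4-tuple, every packet of a given
TCP stream lands on the same queue, which is why the distribution of
streams across queues matters for the measured throughput. Negotiating
the hash would amount to the frontend supplying the key and the table
rather than both ends assuming fixed ones.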
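
The hierarchical versus flat XenStore layout described in the cover
letter can be illustrated with a small sketch. It is written in Python
purely to show the resulting key paths, and it is not the code from
patches 2 and 4. The key names (tx-ring-ref, rx-ring-ref,
event-channel-tx, event-channel-rx) and the QueueInfo type are
assumptions for the example, since the cover letter does not list the
exact keys.

```python
from typing import Dict, List, NamedTuple


class QueueInfo(NamedTuple):
    """Per-queue values a frontend would publish (hypothetical type)."""
    tx_ring_ref: int
    rx_ring_ref: int
    evtchn_tx: int
    evtchn_rx: int


def xenstore_entries(queues: List[QueueInfo]) -> Dict[str, str]:
    """Return relative XenStore paths and values for the given queues.

    With more than one queue, each queue's keys live under queue-N/.
    With exactly one queue, the keys stay at the top level (flat layout).
    """
    if not queues:
        raise ValueError("at least one queue is required")
    entries: Dict[str, str] = {}
    for n, q in enumerate(queues):
        prefix = "" if len(queues) == 1 else f"queue-{n}/"
        # Key names below are assumed for illustration.
        entries[prefix + "tx-ring-ref"] = str(q.tx_ring_ref)
        entries[prefix + "rx-ring-ref"] = str(q.rx_ring_ref)
        entries[prefix + "event-channel-tx"] = str(q.evtchn_tx)
        entries[prefix + "event-channel-rx"] = str(q.evtchn_rx)
    return entries


if __name__ == "__main__":
    two = [QueueInfo(8, 9, 20, 21), QueueInfo(10, 11, 22, 23)]
    for path, value in sorted(xenstore_entries(two).items()):
        print(f"{path} = {value}")
```

Keeping the single-queue case flat means a frontend or backend that
knows nothing about multiple queues sees the same keys it always has.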