From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Lutomirski
Subject: Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
Date: Fri, 14 Nov 2014 14:27:45 -0800
Message-ID:
References: <1415393472.13896.119.camel@edumazet-glaptop2.roam.corp.google.com>
 <1415993614.17262.55.camel@edumazet-glaptop2.roam.corp.google.com>
 <1416003883.17262.72.camel@edumazet-glaptop2.roam.corp.google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Return-path:
In-Reply-To: <1416003883.17262.72.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Eric Dumazet
Cc: Tom Herbert, Michael Kerrisk, David Miller, netdev, Ying Cai,
 Willem de Bruijn, Neal Cardwell, Linux API
List-Id: linux-api@vger.kernel.org

On Fri, Nov 14, 2014 at 2:24 PM, Eric Dumazet wrote:
> On Fri, 2014-11-14 at 14:10 -0800, Andy Lutomirski wrote:
>
>> I have a bunch of threads that are pinned to various CPUs or groups of
>> CPUs. Each thread is responsible for a fixed set of flows. I'd like
>> those flows to go to those CPUs.
>>
>> RFS will eventually do it, but it would be nice if I could
>> deterministically ask for a flow to be routed to the right CPU. Also,
>> if my thread bounces temporarily to another CPU, I don't really need
>> the flow to follow it -- I'd like it to stay put.
>>
>> This has a significant benefit over using automatic steering: with
>> automatic steering, I have to make all of the hash tables have a size
>> around the square of the total number of the flows in order to make it
>> reliable.
>>
>> Something like SO_STEER_TO_THIS_CPU would be fine, as long as it
>> reported whether it worked (for my diagnostics).
>
> This requires some kind of hardware support, and unfortunately this is
> not generic.
>
> With SO_INCOMING_CPU, you simply can pass fd of sockets around threads,
> so that a dumb RSS multiqueue NIC is OK (assuming you are not using some
> encapsulation that NIC is not able to parse to find L4 information)

I can't really do this. It means that the performance of my system
will be wildly different every time I restart it. I don't have enough
connections for everything to average out.

>
> Steering is a dream, I really think it's easier to build flows so that
> their RX queue matches your requirements.

I have supporting hardware :) I just want it to work without
programming the ntuple table myself.

>
> We usually can pick at least one element of the 4-tuple, so it's actually
> possible to get this before connect().
>

Hmm. An API for that would be quite nice :)

>
> Two cases:
>
> 1) Passive connections.
>
> After accept(), get SO_INCOMING_CPU, then pass the fd to appropriate
> thread of your pool.
>
> 2) Active connections.
> find a proper 4-tuple, bind() then connect(). Eventually check
> SO_INCOMING_CPU to verify your expectations.

The people at the other end will be really pissed if that results in
lots of reconnections.

--Andy

>
>
>

-- 
Andy Lutomirski
AMA Capital Management, LLC
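
For readers following the thread, the passive-connection case quoted above
(accept(), read SO_INCOMING_CPU, hand the fd to the matching thread) could be
sketched roughly as follows. This is only an illustration, not code from the
patch: the helper names incoming_cpu, worker_for_cpu and accept_and_route,
the worker_cpus layout (one set of CPU numbers per worker), and the fallback
to worker 0 are all assumptions. The numeric fallback 49 is the value from
the generic Linux socket header and is not valid on every architecture.

```python
import socket

# Newer Python versions expose socket.SO_INCOMING_CPU on Linux. The fallback
# 49 is the asm-generic value (assumption: wrong on some architectures).
SO_INCOMING_CPU = getattr(socket, "SO_INCOMING_CPU", 49)


def incoming_cpu(sock):
    """Return the CPU the kernel reports for this socket, or None if unknown."""
    try:
        cpu = sock.getsockopt(socket.SOL_SOCKET, SO_INCOMING_CPU)
    except OSError:
        return None  # kernel or platform without SO_INCOMING_CPU
    return cpu if cpu >= 0 else None


def worker_for_cpu(cpu, worker_cpus):
    """Return the index of the worker whose CPU set contains `cpu`.

    Falls back to worker 0 when the CPU is unknown or unclaimed
    (hypothetical policy, chosen only to keep the sketch short).
    """
    for index, cpus in enumerate(worker_cpus):
        if cpu in cpus:
            return index
    return 0


def accept_and_route(listener, worker_cpus):
    """Accept one connection and report which worker should own it."""
    conn, _addr = listener.accept()
    return conn, worker_for_cpu(incoming_cpu(conn), worker_cpus)
```

With worker_cpus = [{0, 1}, {2, 3}], a connection whose packets arrive on
CPU 2 would be routed to worker 1. As the thread notes, this only observes
where the flow landed; it does not steer it.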