netdev.vger.kernel.org archive mirror
From: Rick Jones <rick.jones2@hp.com>
To: David Miller <davem@davemloft.net>
Cc: therbert@google.com, shemminger@vyatta.com, dada1@cosmosbay.com,
	andi@firstfloor.org, netdev@vger.kernel.org
Subject: Re: [PATCH] Software receive packet steering
Date: Wed, 22 Apr 2009 11:49:50 -0700
Message-ID: <49EF66CE.10800@hp.com>
In-Reply-To: <20090422.022120.211323498.davem@davemloft.net>

David Miller wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Tue, 21 Apr 2009 11:52:07 -0700
> 
> 
>>That is possible and I don't think the design of our patch would
>>preclude it, but I am worried that each time the mapping from a
>>connection to a CPU changes this could cause out-of-order
>>packets.  I suppose this is a similar problem to changing the RSS hash
>>mappings in a device.
> 
> 
> Yes, out of order packet processing is a serious issue.
> 
> There are some things I've been brainstorming about.
> 
> One thought I keep coming back to is the hack the block layer
> is using right now.  It remembers which CPU a block I/O request
> comes in on, and it makes sure the completion runs on that
> cpu too.
> 
> We could remember the cpu that the last socket level operation
> occurred upon, and use that as a target for packets.  This requires a
> bit of work.
> 
> First we'd need some kind of pre-demux at netif_receive_skb()
> time to look up the cpu target, and reference this blob from
> the socket somehow, and keep it uptodate at various specific
> locations (read/write/poll, whatever...).

Does poll on the socket touch all that many cachelines, or are you thinking of it 
as being a predictor of where read/write will be called?

> 
> Or we could pre-demux the real socket.  That could be exciting.
> 
> But then we come back to the cpu number changing issue.  There is a
> cool way to handle this, because it seems that we can just keep
> queueing to the previous cpu and it can check the socket cpu cookie.
> If that changes, the old target can push the rest of its queue to
> that cpu and then update the cpu target blob.
> 
> Anyways, just some ideas.

For what it is worth, at the 5000-foot description level that is exactly what
HP-UX 11.X does and calls TOPS (Thread Optimized Packet Scheduling).  Where the
socket was last accessed is stashed away (in the socket/stream structure) and
that is looked up when the driver hands the packet up the stack.  It was done
that way in HP-UX 11.X because we found that simply hashing the headers (what
HP-UX 10.20 called "Inbound Packet Scheduling" or IPS), while fine for discrete
netperf TCP_RR tests, wasn't really what one wanted when a single thread of
execution was servicing more than one connection/flow.
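
To make the contrast between the two approaches concrete, here is a minimal
sketch in Python.  It is not HP-UX or Linux code: the names (Socket, last_cpu,
hash_steer, tops_steer), the CRC32 hash and the CPU count are all assumptions
chosen for illustration.  It only shows the idea of steering by a header hash
versus steering by the CPU that last touched the socket.

```python
import zlib
from dataclasses import dataclass
from typing import Optional

NUM_CPUS = 4  # assumed CPU count, for illustration only


@dataclass
class Socket:
    flow: tuple                      # (src_ip, src_port, dst_ip, dst_port)
    last_cpu: Optional[int] = None   # CPU of the last socket-level access


def hash_steer(flow):
    """IPS-like: choose the CPU purely from a hash of the header fields."""
    return zlib.crc32(repr(flow).encode()) % NUM_CPUS


def socket_access(sock, cpu):
    """Hypothetical hook on read/write/poll: remember where the app ran."""
    sock.last_cpu = cpu


def tops_steer(sock):
    """TOPS-like: prefer the last accessing CPU, else fall back to the hash."""
    if sock.last_cpu is not None:
        return sock.last_cpu
    return hash_steer(sock.flow)
```

With one thread on CPU 2 servicing several sockets, tops_steer() returns 2 for
every one of them once socket_access() has run, whereas hash_steer() places
each flow wherever its headers happen to hash.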

The TOPS patches were added to HP-UX 11.0 ca. 1998, and while there have been
some issues (as you surmise, and others thanks to Streams being involved :) it
appears to have worked rather well these last ten years.  So, in the abstract,
what is proposed above has at least a little pre-validation.  TOPS can be
disabled/enabled via an ndd (i.e. sysctl) setting for those cases when the
number of NICs (back then they were all single-queue), or now queues, is a
reasonable fraction of the number of cores and the administrator can/wants to
silo things.

rick jones
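
The queue handoff suggested in the quoted text (keep queueing to the previous
CPU, let it notice that the socket's CPU cookie changed, push the rest of its
queue across, then update the target) can be sketched as a single-threaded
Python model.  Everything here is hypothetical: the Sock and Steering classes,
the target_cpu/wanted_cpu fields and the per-CPU deques are assumptions for
illustration, and real code would also need locking or IPIs between CPUs.

```python
from collections import deque


class Sock:
    def __init__(self, name, cpu):
        self.name = name
        self.target_cpu = cpu   # where the receive path queues packets
        self.wanted_cpu = cpu   # "cookie": CPU of the last socket operation


class Steering:
    def __init__(self, ncpus):
        self.queues = [deque() for _ in range(ncpus)]
        self.delivered = []     # (cpu, socket name, seq) in delivery order

    def receive(self, sock, seq):
        # Pre-demux step: always queue to the currently published target.
        self.queues[sock.target_cpu].append((sock, seq))

    def run_cpu(self, cpu):
        q = self.queues[cpu]
        while q:
            sock, seq = q.popleft()
            if sock.wanted_cpu != cpu:
                # Cookie changed: hand this packet and every later packet
                # of the same socket to the new CPU, preserving order ...
                new = sock.wanted_cpu
                moved = [(sock, seq)] + [p for p in q if p[0] is sock]
                rest = [p for p in q if p[0] is not sock]
                q.clear()
                q.extend(rest)
                self.queues[new].extend(moved)
                # ... and only then publish the new target.
                sock.target_cpu = new
                continue
            self.delivered.append((cpu, sock.name, seq))
```

Because the target is updated only after the old CPU has moved its backlog,
packets that arrive later are appended behind the moved ones, so in this model
a flow is never reordered by a CPU change.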
