From: Rick Jones <rick.jones2@hp.com>
To: David Miller <davem@davemloft.net>
Cc: therbert@google.com, shemminger@vyatta.com, dada1@cosmosbay.com,
	andi@firstfloor.org, netdev@vger.kernel.org
Subject: Re: [PATCH] Software receive packet steering
Date: Wed, 22 Apr 2009 11:49:50 -0700
Message-ID: <49EF66CE.10800@hp.com>
In-Reply-To: <20090422.022120.211323498.davem@davemloft.net>

David Miller wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Tue, 21 Apr 2009 11:52:07 -0700
> 
> 
>>That is possible, and I don't think the design of our patch would
>>preclude it, but I am worried that each time the mapping from a
>>connection to a CPU changes this could cause out-of-order
>>packets.  I suppose this is a similar problem to changing the RSS
>>hash mappings in a device.
> 
> 
> Yes, out of order packet processing is a serious issue.
> 
> There are some things I've been brainstorming about.
> 
> One thought I keep coming back to is the hack the block layer
> is using right now.  It remembers which CPU a block I/O request
> comes in on, and it makes sure the completion runs on that
> cpu too.
> 
> We could remember the cpu that the last socket level operation
> occurred upon, and use that as a target for packets.  This requires a
> bit of work.
> 
> First we'd need some kind of pre-demux at netif_receive_skb()
> time to look up the cpu target, and reference this blob from
> the socket somehow, and keep it up to date at various specific
> locations (read/write/poll, whatever...).

Does poll on the socket touch all that many cachelines, or are you thinking of it 
as being a predictor of where read/write will be called?

> 
> Or we could pre-demux the real socket.  That could be exciting.
> 
> But then we come back to the cpu number changing issue.  There is a
> cool way to handle this, because it seems that we can just keep
> queueing to the previous cpu and it can check the socket cpu cookie.
> If that changes, the old target can push the rest of its queue to
> that cpu and then update the cpu target blob.
> 
> Anyways, just some ideas.
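
A rough single-threaded model of the handoff described above, purely as an illustration: the receive path keeps queueing to the old CPU, and the old CPU forwards what is left and only then repoints the target. Every name here (flow_target, backlog, steer_packet, process_backlog) is hypothetical, there is one flow and no locking, and none of it is taken from the patch or from any kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4   /* hypothetical CPU count for the sketch */
#define QLEN    16  /* hypothetical per-CPU queue depth */

struct pkt_queue {
    int    pkts[QLEN];  /* packets are just sequence numbers here */
    size_t len;
};

struct flow_target {
    int steer_cpu;   /* "cpu target blob": where the receive path queues */
    int socket_cpu;  /* "socket cpu cookie": where the socket was last used */
};

static struct pkt_queue backlog[NR_CPUS];

static void queue_push(struct pkt_queue *q, int pkt)
{
    assert(q->len < QLEN);
    q->pkts[q->len++] = pkt;
}

/* Receive path: always queue to the current steering target. */
static void steer_packet(const struct flow_target *ft, int pkt)
{
    queue_push(&backlog[ft->steer_cpu], pkt);
}

/*
 * Runs on `cpu`.  Delivers queued packets in order into out[] while the
 * cookie still names this CPU.  If the cookie has moved, the remaining
 * packets are pushed to the new CPU first, and only then is the steering
 * target updated, so later arrivals queue behind them.
 * Returns the number of packets delivered locally.
 */
static size_t process_backlog(int cpu, struct flow_target *ft, int *out)
{
    struct pkt_queue *q = &backlog[cpu];
    size_t i = 0, delivered = 0;

    while (i < q->len && ft->socket_cpu == cpu)
        out[delivered++] = q->pkts[i++];

    if (ft->steer_cpu == cpu && ft->socket_cpu != cpu) {
        for (; i < q->len; i++)
            queue_push(&backlog[ft->socket_cpu], q->pkts[i]);
        ft->steer_cpu = ft->socket_cpu;
    }
    q->len = 0;
    return delivered;
}
```

Ordering is kept in this model because nothing is queued directly to the new CPU until the old CPU has forwarded its leftovers. A real implementation would also have to deal with concurrent enqueue and with several flows sharing a queue, which this sketch ignores.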

For what it is worth, at the 5000-foot description level that is exactly what
HP-UX 11.X does, under the name TOPS (Thread Optimized Packet Scheduling).
Where the socket was last accessed is stashed away (in the socket/stream
structure) and looked up when the driver hands the packet up the stack.  It was
done that way in HP-UX 11.X because we found that simply hashing the headers
(what HP-UX 10.20 called "Inbound Packet Scheduling" or IPS), while fine for
discrete netperf TCP_RR tests, wasn't really what one wanted when a single
thread of execution was servicing more than one connection/flow.
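
To make the contrast concrete, here is a minimal sketch of the two steering policies, assuming a made-up flow key and socket structure. It is not HP-UX code and not the proposed patch; flow_key, sock_sketch, steer_by_hash, sock_note_access and steer_by_socket are all hypothetical names, and the hash is an arbitrary mixer chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS     8     /* hypothetical CPU count for the sketch */
#define CPU_UNKNOWN (-1)  /* socket has not been accessed yet */

/* Hypothetical flow key: the header fields a hash-based scheme looks at. */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Hypothetical socket: only the field that matters here. */
struct sock_sketch {
    int last_cpu;  /* CPU of the most recent socket-level operation */
};

/* IPS-style: pick a CPU purely from the packet headers. */
static int steer_by_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr ^ k->daddr ^ (((uint32_t)k->sport << 16) | k->dport);

    h ^= h >> 16;
    h *= 0x45d9f3bu;  /* arbitrary odd multiplier, for mixing only */
    h ^= h >> 16;
    return (int)(h % NR_CPUS);
}

/* Called from read/write (or similar) to record where the consumer runs. */
static void sock_note_access(struct sock_sketch *sk, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    sk->last_cpu = cpu;
}

/* TOPS-style: prefer the socket's last CPU, fall back to the header hash. */
static int steer_by_socket(const struct sock_sketch *sk,
                           const struct flow_key *k)
{
    if (sk != NULL && sk->last_cpu != CPU_UNKNOWN)
        return sk->last_cpu;
    return steer_by_hash(k);
}
```

With one thread servicing two connections from the same CPU, steer_by_socket() sends both flows to that CPU, whereas steer_by_hash() places each flow wherever its headers happen to hash, independent of where the thread runs.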

The TOPS patches were added to HP-UX 11.0 ca. 1998, and while there have been
some issues (as you surmise, and others thanks to Streams being involved :) it
appears to have worked rather well these last ten years.  So, in the abstract,
what is proposed above has at least a little pre-validation.  TOPS can be
disabled/enabled via an ndd (i.e. sysctl) setting for those cases when the
number of NICs (back then they were all single-queue), or now queues, is a
reasonable fraction of the number of cores and the administrator can/wants to
silo things.

rick jones
