From: Rick Jones <rick.jones2@hp.com>
To: Andi Kleen <ak@suse.de>
Cc: David Miller <davem@davemloft.net>,
	rdreier@cisco.com, tom@opengridcomputing.com,
	netdev@vger.kernel.org, akpm@osdl.org
Subject: Re: RDMA will be reverted
Date: Mon, 24 Jul 2006 17:29:05 -0700
Message-ID: <44C565D1.6070202@hp.com>
In-Reply-To: <200607250202.02913.ak@suse.de>

This all sounds like the discussions we had within HP-UX between 10.20 and 11.0 
concerning Inbound Packet Scheduling vs Thread Optimized Packet Scheduling.  IPS 
was done by the 10.20 stack at the handoff between the driver and netisr.  If 
the packet was not an IP datagram fragment, parts of the transport and IP 
headers would be hashed, and the result would select the netisr queue on which 
the packet was placed for further processing.
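
A rough sketch of that hashing step, in Python for brevity. It is not the HP-UX code: the hash function (CRC32), the queue count, and the choice to send fragments to one fixed queue are all assumptions made for illustration, as is the name `ips_pick_queue`.

```python
import socket
import struct
import zlib

NUM_NETISR_QUEUES = 4   # assumed: one netisr queue per CPU
FRAGMENT_QUEUE = 0      # assumed: fragments all go to one fixed queue

def ips_pick_queue(src_ip, dst_ip, src_port, dst_port, is_fragment=False):
    """Return the netisr queue index for an inbound packet (hypothetical helper)."""
    if is_fragment:
        # Non-first fragments carry no transport header, so ports cannot be hashed.
        return FRAGMENT_QUEUE
    # Hash parts of the IP and transport headers: addresses plus ports.
    key = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
           + struct.pack("!HH", src_port, dst_port))
    return zlib.crc32(key) % NUM_NETISR_QUEUES
```

Every packet of a given connection hashes to the same queue, which is why this works when each connection has its own process/thread.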

It worked fine and dandy for stuff like aggregate netperf TCP_RR tests because 
there was a 1-1 correspondence between a connection and a process/thread.  It 
was "OK" for the networking to dictate where the process should run.  That feels 
rather like a NIC that would hash packets and pick the MSI-X vector based on that.

However, as Andi discusses, when a process/thread is handling more than one 
connection, picking a CPU by hashing the addressing will be like Tweedledee 
and Tweedledum telling Alice to go in opposite directions.  Hence TOPS in 11.X. 
This time, at the point in the inbound path where the "normal" connection lookup 
happens, the stack determines where the application last accessed the socket, 
and processing shifts over to that CPU.  This then is the process (well, 
actually the scheduler) telling networking where it should do its work.
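
The TOPS idea can be sketched the same way. Again this is only an illustration under assumptions: the `Stack` and `Sock` classes, the `app_access` hook, and the fallback of staying on the arrival CPU when nothing is known yet are all invented here, not taken from the 11.X stack.

```python
class Sock:
    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.last_cpu = None  # CPU on which the application last touched this socket

class Stack:
    def __init__(self):
        self.conns = {}  # connection id -> Sock (stands in for the normal lookup table)

    def open(self, conn_id):
        self.conns[conn_id] = Sock(conn_id)

    def app_access(self, conn_id, cpu):
        """Called from the send/recv path: remember where the application ran."""
        self.conns[conn_id].last_cpu = cpu

    def tops_pick_cpu(self, conn_id, arrival_cpu):
        """Inbound path: after the connection lookup, follow the application."""
        sock = self.conns.get(conn_id)
        if sock is None or sock.last_cpu is None:
            return arrival_cpu  # assumed fallback: stay where the packet arrived
        return sock.last_cpu
```

If one thread on CPU 3 owns connections "a" and "b", both `tops_pick_cpu("a", 0)` and `tops_pick_cpu("b", 1)` return 3, regardless of how the addresses would have hashed. When the scheduler moves the thread, the next socket access updates `last_cpu` and the networking follows.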

That addresses the multiple connections per thread/process and still works just 
as well for 1-1.  There are still issues if you have multiple threads/processes 
concurrently accessing the same socket/connection, but that case is much rarer.

Nirvana I suppose would be the addition of a field in the header which could be 
used for the determination of where to process. A Transport Protocol option I 
suppose, maybe the IPv6 flow label, but Knuth only knows if anyone would go for 
something along those lines.  It does though mean that the "state" is per-packet 
without it having to be based on addressing information.  Almost like RDMA 
arriving saying where the data goes, but this thing says where the processing 
should happen :)
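
To make the per-packet idea concrete, here is a sketch that steers on the IPv6 flow label alone. The 20-bit flow label in the first 32-bit word of the IPv6 header is real; treating it as a CPU hint via a simple modulo is purely hypothetical, and `pick_cpu_from_label` and `make_header_word` are names made up for the example.

```python
import struct

FLOW_LABEL_MASK = 0xFFFFF  # flow label: low 20 bits of the first IPv6 header word

def flow_label(ipv6_header: bytes) -> int:
    """Extract the flow label from the start of an IPv6 header."""
    (first_word,) = struct.unpack("!I", ipv6_header[:4])
    return first_word & FLOW_LABEL_MASK

def pick_cpu_from_label(ipv6_header: bytes, num_cpus: int) -> int:
    # Hypothetical policy: the label itself says where processing should happen.
    return flow_label(ipv6_header) % num_cpus

def make_header_word(label: int, traffic_class: int = 0) -> bytes:
    """Build just the first header word: version 6, traffic class, flow label."""
    return struct.pack("!I", (6 << 28) | (traffic_class << 20) | (label & FLOW_LABEL_MASK))
```

No addresses or ports are consulted, so the steering decision travels with each packet rather than being derived from who is talking to whom.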

rick jones
