public inbox for netdev@vger.kernel.org
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Mitchell Erblich <erblichs@earthlink.net>
Cc: netdev@vger.kernel.org
Subject: Re: Proposed linux kernel changes : scaling  tcp/ip stack
Date: Thu, 03 Jun 2010 11:14:00 +0200
Message-ID: <1275556440.2456.19.camel@edumazet-laptop>
In-Reply-To: <FDFFEFAB-A741-4232-821E-17BFAE5CAFAC@earthlink.net>

On Thursday, 3 June 2010 at 01:16 -0700, Mitchell Erblich wrote:
> To whom it may concern,
> 
> First, my assumption is to keep this discussion local to just a few tcp/ip
> developers to see if there is any consensus that the below is a logical
> approach. Please also pass this email on to the owner(s) of this stack, if
> any, to determine whether a case exists for the possible changes below.
> 
> I am not currently on the linux kernel mail group.
> 			
> I have experience with modifications of the Linux tcp/ip stack, and have
> merged the changes into the company's local tree and left the possible 
> global integration to others.
> 
> I have been approached by a number of companies about scaling the
> stack with the assumption of a number of cpu cores. At present, I find extra
> time on my hands and am considering looking into this area on my own.
> 
> The first assumption is that if extra cores are available, that a single
> received homogeneous flow of a large number of packets/segments per
> second (pps) can be split into non-equal flows. This split can in effect
> allow a larger recv'd pps rate at the same core load while splitting off
> other workloads, such as xmit'ing pure ACKs.
> 
> Simply, again assuming Amdahl's law (and not looking to equalize the load
> between cores), the idea is to create logical separations so that, in a
> many-core system, different cores could run new kernel threads that
> operate in parallel within the tcp/ip stack. The initial separation points
> would be at the ip/tcp layer boundary and wherever a recv'd sk/pkt would
> generate some form of output.
> 
> The ip/tcp layer would be split like the vintage AT&T STREAMS protocol,
> so some form of queuing & scheduling would be needed. In addition,
> the queuing/scheduling of other kernel threads would occur within ip & tcp
> to separate the I/O.
> 
> A possible validation test is to measure the max recv'd pps rate within the
> tcp/ip modules for a normal flow in the TCP established state, with
> in-order, non-fragmented segments of, say, 64 bytes, before and after each
> incremental change. Or the same rate with fewer core/cpu cycles.
> 
> I am willing to host a private git Linux.org tree that collects the proposed
> changes and, if there is willingness and a perceived want/need, to then
> identify how to implement the merge.

Hi Mitchell

We work every day to improve the network stack, and the standard Linux tree
is pretty scalable; you don't need to set up a separate git tree for that.

Our beloved maintainer David S. Miller handles two trees, net-2.6 and
net-next-2.6 where we put all our changes.

http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git

I suggest you read the latest patches (say, about 10,000 of them) to
get an idea of what we have done over the last few years.

Keywords: RCU, multiqueue, RPS, per-CPU data, lockless algorithms, cache
line placement...
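
To make one of those keywords concrete: RPS (Receive Packet Steering)
spreads receive processing across CPUs by hashing each packet's flow and
picking a CPU from a configured set, so packets of one flow stay on one
CPU. Below is a minimal user-space sketch of that idea only. The
flow_key struct, the mix32() mixer, and the modulo selection are
illustrative assumptions, not the kernel's actual hash or data structures.

```c
#include <stdint.h>

/* Hypothetical flow key: the 4-tuple identifying a TCP flow. */
struct flow_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
};

/* Illustrative 32-bit mixer; the kernel uses its own flow hash. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Fold the 4-tuple into one 32-bit value. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = mix32(k->saddr);

    h = mix32(h ^ k->daddr);
    h = mix32(h ^ (((uint32_t)k->sport << 16) | k->dport));
    return h;
}

/*
 * Pick a CPU from the allowed set. The same flow always yields the same
 * hash, hence the same CPU, which keeps a flow's packets in order.
 * ncpus must be non-zero. Modulo is a simplification for this sketch.
 */
static unsigned int steer_cpu(const struct flow_key *k,
                              const unsigned int *cpus, unsigned int ncpus)
{
    return cpus[flow_hash(k) % ncpus];
}
```

Calling steer_cpu() twice with the same flow_key and CPU table returns
the same CPU, while different flows are spread over the table.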

It's nice to see another man joining the team!

Thanks


