From: Eric Dumazet <eric.dumazet@gmail.com>
To: Mitchell Erblich <erblichs@earthlink.net>
Cc: netdev@vger.kernel.org
Subject: Re: Proposed linux kernel changes : scaling tcp/ip stack
Date: Thu, 03 Jun 2010 11:14:00 +0200
Message-ID: <1275556440.2456.19.camel@edumazet-laptop>
In-Reply-To: <FDFFEFAB-A741-4232-821E-17BFAE5CAFAC@earthlink.net>
On Thursday, 03 June 2010 at 01:16 -0700, Mitchell Erblich wrote:
> To whom it may concern,
>
> First, my assumption is to keep this discussion local to just a few tcp/ip
> developers to see if there is any consensus that the below is a logical
> approach. Please also pass this email on if there is an "owner(s)" of this stack
> to identify if a case exists for the below possible changes.
>
> I am not currently on the linux kernel mail group.
>
> I have experience with modifications of the Linux tcp/ip stack, and have
> merged the changes into the company's local tree and left the possible
> global integration to others.
>
> I have been approached by a number of companies about scaling the
> stack with the assumption of a number of cpu cores. At present, I find extra
> time on my hands and am considering looking into this area on my own.
>
> The first assumption is that if extra cores are available, that a single
> received homogeneous flow of a large number of packets/segments per
> second (pps) can be split into non-equal flows. This split can in effect
> allow a larger recv'd pps rate at the same core load while splitting off
> other workloads, such as xmit'ing pure ACKs.
>
> Simply, again assuming Amdahl's law (and not looking to equalize the load
> between cores), and creating logical separations where in a many core
> system, different cores could have new kernel threads that operate in
> parallel within the tcp/ip stack. The initial separation points would be at
> the ip/tcp layer boundary and where any recv'd sk/pkt would generate some
> form of output.
>
> The ip/tcp layer would be split like the vintage AT&T STREAMS protocol,
> and some form of queuing & scheduling would be needed. In addition,
> the queuing/scheduling of other kernel threads would occur within ip & tcp
> to separate the I/O.
>
> A possible validation test is to identify the max recv'd pps rate within the
> tcp/ip modules within normal flow TCP established state with normal order
> of say 64byte non fragmented segments, before and after each
> incremental change. Or the same rate with fewer core/cpu cycles.
>
> I am willing to have a private git Linux.org tree that concentrates proposed
> changes into this tree and if there is willingness, a seen want/need then identify
> how to implement the merge.

Hi Mitchell,

We work every day to improve the network stack, and the standard Linux tree is
pretty scalable; you don't need to set up a separate git tree for that.

Our beloved maintainer David S. Miller handles two trees, net-2.6 and
net-next-2.6, where we put all our changes.

http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git

I suggest you read the latest patches (say, about 10,000 of them) to
get an idea of what we have done over the last few years.

Keywords: RCU, multiqueue, RPS, percpu data, lockless algorithms, cache line
placement...

It's nice to see another man joining the team!

Thanks
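
For readers unfamiliar with the RPS keyword above, here is a minimal user-space
sketch of the idea behind Receive Packet Steering: hash each flow's 4-tuple and
use the hash to pick one CPU from an allowed set, so every packet of a given
flow is processed on the same core. This is an illustration only, not kernel
code. The names flow_hash and steer, the use of CRC32 as the hash, and the
sample addresses are all assumptions made for the example.

```python
import zlib
from collections import Counter

def flow_hash(src_ip, src_port, dst_ip, dst_port):
    # Stand-in hash over the flow 4-tuple (assumption: CRC32 is used here
    # only for illustration; it is not what the kernel computes).
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)

def steer(flow, cpus):
    # Map a flow to one CPU from the allowed list. The same flow always
    # yields the same CPU, which keeps its packets in order on one core.
    return cpus[flow_hash(*flow) % len(cpus)]

# Hypothetical workload: 1000 distinct flows spread over 4 allowed CPUs.
cpus = [0, 1, 2, 3]
flows = [("10.0.0.1", 40000 + i, "10.0.0.2", 80) for i in range(1000)]
load = Counter(steer(f, cpus) for f in flows)

if __name__ == "__main__":
    for cpu in cpus:
        print(f"cpu {cpu}: {load[cpu]} flows")
```

Note that in this sketch a single flow is never split across cores; the
parallelism comes from spreading many flows over the available CPUs.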