From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: netdev@vger.kernel.org
Subject: [Announce] New netchannels implementation. Userspace network stack.
Date: Fri, 20 Oct 2006 13:53:05 +0400
Message-ID: <20061020095304.GA22445@2ka.mipt.ru>
Netchannel [1] is a pure bridge between low-level hardware and the user, without any
special protocol processing in between.
Users are not limited to userspace: I will use this netchannel
infrastructure for a fast NAT implementation, which is a purely kernelspace user
(NAT could also be implemented in userspace, but the price of crossing the
kernel/userspace boundary is too high for an operation that only needs to change
a few header fields and recalculate the checksum).
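As an illustration of that per-packet work (my own sketch, not the netchannel NAT
code), rewriting one IPv4 address and patching the header checksum incrementally
per RFC 1624 looks roughly like this; it is similar in spirit to the kernel's
csum_replace4() helper:

#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/ip.h>

/* Update a 16-bit one's-complement checksum after a 32-bit word in the
 * checksummed data changed from 'from' to 'to' (both network byte order). */
static void csum_update32(uint16_t *check, uint32_t from, uint32_t to)
{
	uint32_t sum = (~ntohs(*check)) & 0xffff;

	sum += (~ntohl(from) >> 16) & 0xffff;	/* subtract old high word */
	sum += ~ntohl(from) & 0xffff;		/* subtract old low word  */
	sum += ntohl(to) >> 16;			/* add new high word      */
	sum += ntohl(to) & 0xffff;		/* add new low word       */

	while (sum >> 16)			/* fold end-around carries */
		sum = (sum & 0xffff) + (sum >> 16);

	*check = htons((uint16_t)~sum);
}

/* SNAT-style rewrite of the source address in an IPv4 header. */
static void nat_rewrite_saddr(struct iphdr *iph, uint32_t new_saddr)
{
	uint32_t old_saddr = iph->saddr;

	iph->saddr = new_saddr;
	csum_update32(&iph->check, old_saddr, new_saddr);
	/* The TCP/UDP checksum covers a pseudo-header containing the
	 * addresses, so it needs the same incremental fix-up. */
}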
Userspace network stack [2] is another user of the new netchannel subsystem.
The current netchannel version supports data transfer using copy_*_user().
One could ask how it differs from netfilter's queue target.
There are three differences (read: advantages):
* it does not depend on netfilter (and thus does not introduce its slow path);
* it is very scalable, since it uses neither hash tables nor lists;
* it does not depend on netfilter (and thus does not introduce its slow path).
Yes, again: if we take the NAT implementation into account, it would also need a
dependency on connection tracking, which the existing netchannel implementation
does not need.
It is also much smaller and more scalable than tun/tap devices.
There are other small advantages as well: the possibility of zero-copy sending and
receiving using the network allocator's [3] facilities (not implemented in the
current netchannel version), a very small code base, and no locks in the very short
fast path (except RCU and the skb queue linking lock, which is held for only five
operations), and so on.
There are also some limitations: only one packet can be read per read() from the
netchannel's file descriptor (this could be extended to read several packets, but
for now I leave it as is), and it is IPv4 only (I'm lazy and only implemented the
tree comparison functions for IPv4 addresses).
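To illustrate that single-packet-per-read model, here is a minimal userspace
sketch; the "/dev/netchannel" path and the way the channel gets bound are
hypothetical placeholders, not the real control interface:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

int main(void)
{
	char packet[4096];	/* big enough for one full packet */
	int fd = open("/dev/netchannel", O_RDWR);	/* hypothetical path */

	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (;;) {
		/* Each read() returns exactly one packet. */
		ssize_t len = read(fd, packet, sizeof(packet));

		if (len <= 0)
			break;
		printf("got %zd byte packet\n", len);
		/* IPv4/TCP processing (e.g. the userspace stack [2])
		 * would start from the IP header here. */
	}

	close(fd);
	return 0;
}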
The first user of the netchannel subsystem is the userspace network stack [2],
which supports:
* TCP/UDP sending and receiving.
* Timestamp, window scaling, MSS TCP options.
* PAWS.
* Slow start and congestion control.
* Route table (including a static ARP cache).
* Socket-like interface.
* IP and Ethernet processing code.
* Complete retransmit algorithm.
* Fast retransmit support.
* Support for the TCP LISTEN state (point-to-point mode only, i.e. no new data
channels are created when a new client connects; instead, the existing channel's
state is changed according to the protocol, i.e. the TCP state moves to ESTABLISHED).
* Support for the new netchannel interface.
A speed/CPU usage graph for the socket code (which uses epoll and send()/recv())
is attached. At the same 100 Mbit speed, CPU usage for netchannels and the
userspace network stack is about 2-3 times lower than for sockets when
sending/receiving small (128 byte) packets.
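For reference, the socket-side pattern being compared against looks roughly like
this (a sketch of an epoll + recv() loop over an already-connected non-blocking
TCP socket, not the actual benchmark source):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

static void recv_loop(int sock)
{
	char buf[128];		/* small-packet test size */
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = sock };
	struct epoll_event events[16];
	int efd = epoll_create(16);

	if (efd < 0)
		return;
	if (epoll_ctl(efd, EPOLL_CTL_ADD, sock, &ev) < 0)
		goto out;

	for (;;) {
		int i, n = epoll_wait(efd, events, 16, -1);

		if (n < 0)
			break;
		for (i = 0; i < n; ++i) {
			/* one small message per readiness notification */
			if (recv(events[i].data.fd, buf, sizeof(buf), 0) <= 0)
				goto out;
		}
	}
out:
	close(efd);
}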
There is very strange behaviour of the userspace time() function: when it is used
heavily, kernel load becomes extremely high and the following functions start to
appear at the top of the profiles:
* get_offset_pmtmr() - 25%, second place, even higher than sysenter_past_esp();
* do_gettimeofday() - 0.6%, 4th place;
* delay_pmtmr() - 0.29%, 11th place.
First place is poll_idle().
The testing system, which runs either the netchannel or the socket tests, is an
HT-enabled Xeon:
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.40GHz
stepping : 7
with 1GB of RAM and an e100 network adapter, running Linux 2.6.17-rc3.
The main (vanilla) system is an amd64 box with 1GB of RAM and an 8169 gigabit
adapter running Linux 2.6.18-1.2200.fc5; the software there is either netcat
dumping data into /dev/null or a sendfile-based server.
All sources are available on the projects' homepages.
Thank you.
1. Netchannels subsystem.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=netchannel
2. Userspace network stack.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=unetstack
3. Network allocator.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=nta
If you have read up to here, then I want you to know that the advertisement is
over. Thanks again.
--
Evgeniy Polyakov
[-- Attachment #2: atcp_speed.png --]
[-- Type: image/png, Size: 6337 bytes --]