From: Eric Dumazet <eric.dumazet@gmail.com>
To: Changli Gao <xiaosuo@gmail.com>
Cc: Viral Mehta <Viral.Mehta@lntinfotech.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: zero copy for relay server
Date: Tue, 29 Mar 2011 06:23:10 +0200
Message-ID: <1301372590.2506.57.camel@edumazet-laptop>
In-Reply-To: <AANLkTimWf4kyi4HJFToXP=HH==hQgObQsHYyRfrSe0FS@mail.gmail.com>
On Tuesday, 29 March 2011 at 10:00 +0800, Changli Gao wrote:
> On Tue, Mar 29, 2011 at 2:34 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> I think he is concerned about the overhead of system calls. In order to
> omit a system call, I think you can implement something like this:
>
> splice2(infd, outfd, pipefd, ...)
>
Yes, but since no numbers have been given and no code has been written
yet, I ask the question.
Giving 4 file descriptors to a single syscall sounds convoluted.
> What you need to do is maintain the pipes yourself.
>
> >> 2. I believe the underlying pipe that we are using will also have some
> >> size limit (like 4K or 64K in user space, not sure)
> >
> > What kind of socket is able to deliver frames larger than 64K?
>
> You can enlarge the size with fcntl(pipefd, F_SETPIPE_SZ,...).
>
Not really useful, since splice() internals use automatic arrays sized
with PIPE_DEF_BUFFERS.
You can enlarge the pipe, but we are still limited to at most 64 KB in
skb_splice_bits(), for example (on x86 with its 4 KB pages).
This doesn't matter, since skbs are limited to 16 pages (64 KB) anyway.
F_SETPIPE_SZ can only increase the size of the pipe ring buffer (which
should be empty or contain at most one skb), thereby increasing dcache
needs.
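For reference, resizing a pipe as suggested in the quoted text might look
like the minimal sketch below. The helper name resize_pipe is
hypothetical; F_SETPIPE_SZ and F_GETPIPE_SZ are Linux-specific fcntl()
commands, and the kernel may round the requested size up. Per the point
above, a larger pipe does not by itself change the limits inside
splice().

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to resize the pipe behind pipefd
 * to at least wanted_bytes. Returns the capacity the kernel reports
 * afterwards (in bytes), or -1 on error.
 */
static long resize_pipe(int pipefd, int wanted_bytes)
{
    if (fcntl(pipefd, F_SETPIPE_SZ, wanted_bytes) < 0)
        return -1;
    /* Read back the capacity; it may be larger than requested. */
    return fcntl(pipefd, F_GETPIPE_SZ);
}
```

An unprivileged process can only grow a pipe up to the limit in
/proc/sys/fs/pipe-max-size.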
> >
> > sendfile() is based on top of splice(), but it's faster to use splice().
> >
> >
>
> Why? Thanks.
>
The real cost is not syscall overhead, but context switches and cache
misses. Adding a "super syscall" adds kernel text and increases icache
misses on real machines (I am not talking about machines used in
microbenchmarks).
Most likely, GRO can significantly speed up this workload, while
avoiding a syscall won't.
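For context, the userspace relay being discussed (two splice() calls
through a pipe that the application maintains itself) can be sketched
roughly as follows. relay_once is a hypothetical helper name and error
handling is kept to a minimum; this is an illustration under those
assumptions, not code from this thread.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from infd to outfd through
 * an application-owned pipe (pipefd[0] = read end, pipefd[1] = write end).
 * Returns the number of bytes forwarded, 0 on EOF, or -1 on error.
 */
static ssize_t relay_once(int infd, int outfd, int pipefd[2], size_t len)
{
    /* First syscall: input descriptor -> pipe. */
    ssize_t in = splice(infd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
    if (in <= 0)
        return in;

    /* Second syscall (repeated on short writes): pipe -> output descriptor. */
    ssize_t left = in;
    while (left > 0) {
        ssize_t out = splice(pipefd[0], NULL, outfd, NULL,
                             (size_t)left, SPLICE_F_MOVE);
        if (out <= 0)
            return -1;
        left -= out;
    }
    return in;
}
```

A relay server would call this in a loop per connection, which is where
the two-syscalls-per-chunk cost under discussion comes from.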
Thread overview: 8+ messages
2011-03-28 16:27 zero copy for relay server Viral Mehta
2011-03-28 16:52 ` Eric Dumazet
2011-03-28 18:18 ` Viral Mehta
2011-03-28 18:34 ` Eric Dumazet
2011-03-29 2:00 ` Changli Gao
2011-03-29 4:23 ` Eric Dumazet [this message]
2011-03-29 11:28 ` Changli Gao
2011-03-29 14:13 ` Eric Dumazet