From: Willy Tarreau <w@1wt.eu>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: chetan loke <loke.chetan@gmail.com>,
	Andreas Gruenbacher <agruen@linbit.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Herbert Xu <herbert@gondor.hengli.com.au>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
Date: Tue, 3 Jul 2012 02:02:02 +0200
Message-ID: <20120703000202.GC11039@1wt.eu>
In-Reply-To: <1341265024.22621.464.camel@edumazet-glaptop>

Hi Eric,

On Mon, Jul 02, 2012 at 11:37:04PM +0200, Eric Dumazet wrote:
> On Mon, 2012-07-02 at 15:41 -0400, chetan loke wrote:
> > On Mon, Jul 2, 2012 at 12:06 PM, Andreas Gruenbacher <agruen@linbit.com> wrote:
> > > On Mon, 2012-07-02 at 15:54 +0200, Eric Dumazet wrote:
> > >> So I will just say no to your patches, unless you demonstrate the
> > >> splice() problems, and how you can fix the alignment problem in a new
> > >> layer instead of in the existing zero copy standard one.
> > >
> > > Again, splice or not is not the issue here. It does not, by itself, allow zero
> > > copy from the network directly to disk but it could likely be made to support
> > > that if we can get the alignment right first.  The proposed MSG_NEW_PACKET flag
> > > helps with that, but maybe someone has a better idea.
> > >
> > 
> > Eric - by using splice do you mean something like:
> > 
> > int filedes[2];
> > PIPE_SIZE (64*1024)
> > pipe(filedes);
> > ret = splice (sock_fd_from, &from_offset, filedes [1], NULL, PIPE_SIZE,
> >                      SPLICE_F_MORE | SPLICE_F_MOVE);
> > 
> > 
> > ret = splice (filedes [0], NULL, file_fd_to,
> >                          &to_offset, ret,
> >                          SPLICE_F_MORE | SPLICE_F_MOVE);
> > 
> 
> Yes, thats more or less the plan. You also can play with bigger
> PIPE_SIZE if needed.

I confirm, a bigger pipe is recommended at high bit rates if you're
working with large TCP windows.
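
For reference, here is a minimal sketch in C of the socket -> pipe -> file
relay quoted above, with the pipe enlarged through F_SETPIPE_SZ. The name
relay_via_pipe() and its parameters are hypothetical, chosen only for this
illustration; this is not code from haproxy or from the patches under
discussion. It assumes blocking descriptors and passes NULL offsets, so each
descriptor's current position is used.

```c
#define _GNU_SOURCE           /* for splice() and F_SETPIPE_SZ */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: relay up to `len` bytes from in_fd (e.g. a TCP
 * socket) to out_fd (e.g. a file) through an intermediate pipe.
 * Returns the number of bytes relayed, or -1 on error.
 * Assumes blocking descriptors: EAGAIN is treated as an error.
 */
ssize_t relay_via_pipe(int in_fd, int out_fd, size_t len, int pipe_size)
{
    const unsigned int flags = SPLICE_F_MOVE | SPLICE_F_MORE;
    int p[2];
    size_t done = 0;
    ssize_t ret = -1;

    assert(in_fd >= 0 && out_fd >= 0);

    if (pipe(p) < 0)
        return -1;

    /* Best effort: ask for a bigger pipe, keep the default if refused. */
    if (pipe_size > 0)
        (void)fcntl(p[1], F_SETPIPE_SZ, pipe_size);

    while (done < len) {
        /* Step 1: source -> pipe. */
        ssize_t in = splice(in_fd, NULL, p[1], NULL, len - done, flags);
        if (in < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        if (in == 0)          /* end of stream */
            break;

        /* Step 2: pipe -> destination, until the pipe is drained. */
        for (ssize_t left = in; left > 0; ) {
            ssize_t n = splice(p[0], NULL, out_fd, NULL, (size_t)left, flags);
            if (n < 0 && errno == EINTR)
                continue;
            if (n <= 0)
                goto out;
            left -= n;
        }
        done += (size_t)in;
    }
    ret = (ssize_t)done;
out:
    close(p[0]);
    close(p[1]);
    return ret;
}
```

A caller would typically invoke it in a loop, for example
relay_via_pipe(sock_fd, file_fd, 1 << 20, 1 << 20), until it returns 0 or -1.
A real program would create the pipe once and reuse it across calls rather
than paying for pipe() and close() each time.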

> > i.e. splice-in from socket to pipe, and splice-out from pipe to destination?
> > 
> > Andreas - if the above assumption is true then can you apply the
> > 'MSG_NEW_PACKET' on the sender and see if the above pseudo-splice code
> > achieves something similar to what you expect on the receive side(you
> > can also play w/ F_SETPIPE_SZ -  although I found very little
> > reduction in CPU usage)? Note: My personal experience - using splice
> > from an input-file-A to output-file-B bought very minimal cpu
> > reduction(yes, both the files used O_DIRECT). Instead, a simple
> > read/write w/ O_DIRECT from file-A to file-B was much much faster.
> 
> splice() performance from socket to pipe have improved a lot in
> linux-3.5
> 
> It was not true zero copy, until very recent patches.

In fact it was true zero copy in 2.6.25, until we ran into a large
amount of data corruption and zero copy was disabled in 2.6.25.X.
It stayed that way until you brought your patches to reinstate it.

> (It was zero copy only on certain class of NIC, not on the ones found
> on appliances or cheap platforms)
> 
> Willy Tarreau mentioned a nice boost of performance with haproxy.

Yes definitely. The savings are more noticeable on small systems where
memory bandwidth is limited. On a small ARM system bound by RAM bandwidth,
the performance was basically doubled. But I also observed nice savings
on a core2duo equipped with 2 myricom 10Gig NICs forwarding at line rate.

> Willy wanted to work on a direct splice from socket to socket, but
> I am not sure it'll bring major speed improvement.

I'm not sure at all either. I'm betting on a few percent saved from the
reduced number of syscalls, not much more. This is why I'll probably
check this when I have enough time to kill.

Regards,
Willy

