netdev.vger.kernel.org archive mirror
From: Phillip Susi <psusi@cfl.rr.com>
To: Ashwini Kulkarni <ashwini.kulkarni@intel.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	christopher.leech@intel.com
Subject: Re: [RFC 0/6] TCP socket splice
Date: Fri, 22 Sep 2006 13:45:37 -0400
Message-ID: <45142141.3010802@cfl.rr.com>
In-Reply-To: <20060920210711.17480.92354.stgit@gitlost.site>

How is this different than just having the application mmap() the file 
and recv() into that buffer?
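
For reference, the alternative being asked about can be sketched roughly as
follows. This is only an illustration under stated assumptions: the helper
name recv_into_file() is hypothetical, the output file is opened read-write,
len is greater than zero, and the data is written starting at offset 0. Here
recv() copies from the socket buffer straight into the mapped page cache
pages, so the data is copied once rather than twice.

```c
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch series): receive up to len
 * bytes from sock directly into a shared mapping of fd, starting at file
 * offset 0. Returns the number of bytes received, or -1 on error.
 * Assumes fd is open read-write and len > 0.
 */
static ssize_t recv_into_file(int sock, int fd, size_t len)
{
	char *map;
	size_t done = 0;

	/* The file must cover the whole range before it is mapped. */
	if (ftruncate(fd, (off_t)len) < 0)
		return -1;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	while (done < len) {
		/* recv() copies from the socket buffer into the mapped pages. */
		ssize_t n = recv(sock, map + done, len - done, 0);

		if (n < 0) {
			munmap(map, len);
			return -1;
		}
		if (n == 0)	/* peer closed the connection */
			break;
		done += (size_t)n;
	}

	munmap(map, len);

	/* Trim the file to what was actually received. */
	if (ftruncate(fd, (off_t)done) < 0)
		return -1;
	return (ssize_t)done;
}
```

Note that this still involves one copy by the CPU (or by I/OAT) into the
mapping, and the application has to size the file up front.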

Ashwini Kulkarni wrote:
> My name is Ashwini Kulkarni and I have been working at Intel Corporation for
> the past 4 months as an engineering intern. I have been working on the 'TCP
> socket splice' project with Chris Leech. This is a work-in-progress version
> of the project with scope for further modifications.
> 
> TCP socket splicing:
> It allows a TCP socket to be spliced to a file via a pipe buffer. First, to
> splice data from a socket to a pipe buffer, up to 16 source page(s) are pulled
> into the pipe buffer. Then to splice data from the pipe buffer to a file,
> those pages are migrated into the address space of the target file. It takes
> place entirely within the kernel and thus results in zero memory copies. It is
> the receive side complement to sendfile() but unlike sendfile() it is
> possible to splice from a socket as well and not just to a socket.
> 
> Current Method:
>                          + >  Application Buffer +
>                          |                       |
>         _________________|_______________________|_____________
>                          |                       |
>               Receive or |                       | Write
>               I/OAT DMA  |                       |
>                          |                       |
>                          |                       V
>                        Network              File System
>                        Buffer                  Buffer
>                          ^                       |
>                          |                       |
>         _________________|_______________________|_____________
>                      DMA |                       | DMA
>                          |                       |
>        Hardware          |                       |
>                          |                       V
>                         NIC                     SATA
>                                                                     
> In the current method, the packet is DMA'd from the NIC into the network buffer.
> The application then reads from the socket, which copies the packet data from
> the network buffer to the application buffer in user space. A write operation
> then copies the data from the application buffer to the file system buffer,
> from which it is DMA'd to the disk. Thus, in the current method all of the
> data makes one full trip through user space.
> 
> Using TCP socket splice:
> 
>                     Application Control
>                          |
>         _________________|__________________________________
>                          |
>                          |   TCP socket splice
>                          | +---------------------+
>                          | |     Direct path     |
>                          V |                     V
>                        Network              File System
>                        Buffer                  Buffer
>                          ^                       |
>                          |                       |
>         _________________|_______________________|__________
>                      DMA |                       | DMA
>                          |                       |
>        Hardware          |                       |
>                          |                       V
>                         NIC                     SATA
>                                                                     
> In this method, the objective is to use TCP socket splicing to create a direct
> path in the kernel from the network buffer to the file system buffer via a pipe
> buffer. The pages will migrate from the network buffer (which is associated
> with the socket) into the pipe buffer for an optimized path. From the pipe
> buffer, the pages will then be migrated to the output file address space page
> cache. This makes it possible to create a LAN-to-file-system API that avoids
> the memcpy operations to and from user space, and thus creates a fast path
> from the network buffer to the storage buffer.
> 
> Open Issues (currently being addressed):
> There is a performance drop when transferring bigger files (usually larger than
> 65536 bytes in size). The performance drop increases with the size of the file.
> Work is in progress to identify the source of this issue.
> 
> We encourage the community to review our TCP socket splice project. Feedback
> would be greatly appreciated.
> 
> --
> Ashwini Kulkarni


Thread overview (9+ messages):
2006-09-20 21:07 [RFC 0/6] TCP socket splice Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 1/6] Make splice_to_pipe non-static and move structure definitions to a header file Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 2/6] Make sock_def_wakeup non-static Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 3/6] Add in TCP related part of splice read to ipv4 Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 4/6] Add TCP socket splicing (tcp_splice_read) support Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 5/6] Add skb_splice_bits to skbuff.c Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 6/6] Move i_size_read part from do_splice_to() to __generic_file_splice_read() in splice.c Ashwini Kulkarni
2006-09-21  6:00 ` [RFC 0/6] TCP socket splice Evgeniy Polyakov
2006-09-22 17:45 ` Phillip Susi [this message]
