netdev.vger.kernel.org archive mirror
From: Ashwini Kulkarni <ashwini.kulkarni@intel.com>
To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: christopher.leech@intel.com
Subject: [RFC 0/6] TCP socket splice
Date: Wed, 20 Sep 2006 14:07:11 -0700
Message-ID: <20060920210711.17480.92354.stgit@gitlost.site>


My name is Ashwini Kulkarni and I have been working at Intel Corporation for
the past 4 months as an engineering intern. I have been working on the 'TCP
socket splice' project with Chris Leech. This is a work-in-progress version
of the project with scope for further modifications.

TCP socket splicing:
It allows a TCP socket to be spliced to a file via a pipe buffer. First, to
splice data from a socket to a pipe buffer, up to 16 source pages are pulled
into the pipe buffer. Then, to splice data from the pipe buffer to a file,
those pages are migrated into the address space of the target file. This takes
place entirely within the kernel and thus results in zero memory copies. It is
the receive-side complement to sendfile(): unlike sendfile(), it makes it
possible to splice from a socket and not just to a socket.
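
As a rough worked example of the "up to 16 source pages" limit, the sketch
below computes how many bytes one pipe fill can carry and how many fills a
transfer of a given size would need. The names PIPE_SLOTS, pipe_fill_bytes()
and pipe_fills_needed() are made up for illustration, and the arithmetic
assumes the best case in which every page in the pipe is completely full.

```c
#include <stddef.h>

/* Assumption: 16 page slots per pipe buffer, taken from the text above. */
#define PIPE_SLOTS 16

/* Bytes carried by one pipe fill when every slot holds a full page. */
static size_t pipe_fill_bytes(size_t page_size)
{
    return (size_t)PIPE_SLOTS * page_size;
}

/* Number of socket-to-pipe fills needed to move `total` bytes, rounded up. */
static size_t pipe_fills_needed(size_t total, size_t page_size)
{
    size_t per_fill = pipe_fill_bytes(page_size);

    return (total + per_fill - 1) / per_fill;
}
```

With 4096-byte pages, one fill carries at most 16 * 4096 = 65536 bytes, so
any larger transfer needs more than one pass through the pipe.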

Current Method:
                         + >  Application Buffer +
                         |                       |
        _________________|_______________________|_____________
                         |                       |
              Receive or |                       | Write
              I/OAT DMA  |                       |
                         |                       |
                         |                       V
                       Network              File System
                       Buffer                  Buffer
                         ^                       |
                         |                       |
        _________________|_______________________|_____________
                     DMA |                       | DMA
                         |                       |
       Hardware          |                       |
                         |                       V
                        NIC                     SATA
                                                                    
In the current method, the packet is DMA'd from the NIC into the network buffer.
The application then reads from the socket, and the packet data is copied from
the network buffer to the application buffer. A write operation then copies the
data from the application buffer to the file system buffer, which is then DMA'd
to the disk. Thus, in the current method there is one full copy of
all the data into user space.
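
For reference, the current method corresponds roughly to the user-space loop
below. This is only a minimal sketch: copy_via_user_buffer() is a
hypothetical helper name, and the 64 KiB buffer size is an arbitrary choice.
Each read() copies data from the kernel into buf, and each write() copies it
back into the kernel.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy everything from in_fd to out_fd through an
 * application buffer. Returns the number of bytes copied, or -1 on error.
 */
static ssize_t copy_via_user_buffer(int in_fd, int out_fd)
{
    char buf[65536];            /* the "Application Buffer" in the diagram */
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof(buf));  /* kernel -> user copy */

        if (n == 0)
            break;              /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        ssize_t off = 0;
        while (off < n) {       /* handle short writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off)); /* user -> kernel copy */

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```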

Using TCP socket splice:

                    Application Control
                         |
        _________________|__________________________________
                         |
                         |   TCP socket splice
                         | +---------------------+
                         | |     Direct path     |
                         V |                     V
                       Network              File System
                       Buffer                  Buffer
                         ^                       |
                         |                       |
        _________________|_______________________|__________
                     DMA |                       | DMA
                         |                       |
       Hardware          |                       |
                         |                       V
                        NIC                     SATA
                                                                    
In this method, the objective is to use TCP socket splicing to create a direct
path in the kernel from the network buffer to the file system buffer via a pipe
buffer. The pages migrate from the network buffer (which is associated
with the socket) into the pipe buffer for an optimized path. From the pipe
buffer, the pages are then migrated into the page cache of the output file's
address space. This makes it possible to create a LAN to file-system API that
avoids the memcpy operations in user space and thus creates a fast path from
the network buffer to the storage buffer.

Open Issues (currently being addressed):
There is a performance drop when transferring bigger files (usually larger than
65536 bytes in size). The performance drop increases with the size of the file.
Work is in progress to identify the source of this issue.

We encourage the community to review our TCP socket splice project. Feedback
would be greatly appreciated.

--
Ashwini Kulkarni


Thread overview: 9+ messages
2006-09-20 21:07 Ashwini Kulkarni [this message]
2006-09-20 21:08 ` [RFC 1/6] Make splice_to_pipe non-static and move structure definitions to a header file Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 2/6] Make sock_def_wakeup non-static Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 3/6] Add in TCP related part of splice read to ipv4 Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 4/6] Add TCP socket splicing (tcp_splice_read) support Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 5/6] Add skb_splice_bits to skbuff.c Ashwini Kulkarni
2006-09-20 21:08 ` [RFC 6/6] Move i_size_read part from do_splice_to() to __generic_file_splice_read() in splice.c Ashwini Kulkarni
2006-09-21  6:00 ` [RFC 0/6] TCP socket splice Evgeniy Polyakov
2006-09-22 17:45 ` Phillip Susi
