From: Phillip Susi <psusi@cfl.rr.com>
To: Ashwini Kulkarni <ashwini.kulkarni@intel.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
christopher.leech@intel.com
Subject: Re: [RFC 0/6] TCP socket splice
Date: Fri, 22 Sep 2006 13:45:37 -0400
Message-ID: <45142141.3010802@cfl.rr.com>
In-Reply-To: <20060920210711.17480.92354.stgit@gitlost.site>
How is this different than just having the application mmap() the file
and recv() into that buffer?
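
For concreteness, the alternative asked about here might look like the sketch below. The helper name recv_into_mapped_file() is hypothetical, and the sketch assumes a file descriptor opened read-write and a transfer length known in advance. Note that recv() still copies each byte once, from the socket buffer into the mapped file pages; what this approach avoids is the separate write() step.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: receive exactly len bytes from sock directly into
 * a shared mapping of fd (fd must be open for reading and writing).
 * Returns 0 on success, -1 on error or early EOF. */
int recv_into_mapped_file(int sock, int fd, size_t len)
{
    if (len == 0)
        return 0;

    /* The file must be at least len bytes long before it is mapped. */
    if (ftruncate(fd, (off_t)len) < 0)
        return -1;

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED)
        return -1;

    size_t done = 0;
    while (done < len) {
        /* recv() copies socket data into the mapped file pages. */
        ssize_t n = recv(sock, buf + done, len - done, 0);
        if (n <= 0)
            break;              /* error or peer closed early */
        done += (size_t)n;
    }

    munmap(buf, len);
    return done == len ? 0 : -1;
}
```

The sketch does not retry on EINTR and does not handle streams of unknown length, which would need the mapping to be grown as data arrives.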
Ashwini Kulkarni wrote:
> My name is Ashwini Kulkarni and I have been working at Intel Corporation for
> the past 4 months as an engineering intern. I have been working on the 'TCP
> socket splice' project with Chris Leech. This is a work-in-progress version
> of the project with scope for further modifications.
>
> TCP socket splicing:
> It allows a TCP socket to be spliced to a file via a pipe buffer. First, to
> splice data from a socket to a pipe buffer, up to 16 source pages are pulled
> into the pipe buffer. Then to splice data from the pipe buffer to a file,
> those pages are migrated into the address space of the target file. It takes
> place entirely within the kernel and thus results in zero memory copies. It is
> the receive side complement to sendfile() but unlike sendfile() it is
> possible to splice from a socket as well and not just to a socket.
>
> Current Method:
>                  +-> Application Buffer -+
>                  |                       |
> _________________|_______________________|_____________
>                  |                       |
>    Receive or    |                       | Write
>    I/OAT DMA     |                       |
>                  |                       |
>                  |                       V
>               Network               File System
>               Buffer                  Buffer
>                  ^                       |
>                  |                       |
> _________________|_______________________|_____________
>             DMA  |                       | DMA
>                  |                       |
>    Hardware      |                       |
>                  |                       V
>                 NIC                     SATA
>
> In the current method, the packet is DMA'd from the NIC into the network buffer.
> A read on the socket then copies the packet data from the network buffer to the
> application buffer in user space. A write operation then moves the data from the
> application buffer to the file system buffer, which is then DMA'd to the disk.
> Thus, in the current method there is one full copy of all the data to user
> space.
>
> Using TCP socket splice:
>
>         Application Control
>                  |
> _________________|__________________________________
>                  |
>                  |    TCP socket splice
>                  | +---------------------+
>                  | |     Direct path     |
>                  V |                     V
>               Network               File System
>               Buffer                  Buffer
>                  ^                       |
>                  |                       |
> _________________|_______________________|__________
>             DMA  |                       | DMA
>                  |                       |
>    Hardware      |                       |
>                  |                       V
>                 NIC                     SATA
>
> In this method, the objective is to use TCP socket splicing to create a direct
> path in the kernel from the network buffer to the file system buffer via a pipe
> buffer. The pages will migrate from the network buffer (which is associated
> with the socket) into the pipe buffer for an optimized path. From the pipe
> buffer, the pages will then be migrated to the output file address space page
> cache. This makes it possible to create a LAN-to-file-system API that avoids the
> memcpy operations in user space and thus creates a fast path from the network
> buffer to the storage buffer.
>
> Open Issues (currently being addressed):
> There is a performance drop when transferring bigger files (usually larger than
> 65536 bytes in size), and the drop increases with the size of the file. Work is
> in progress to identify the source of this issue.
>
> We encourage the community to review our TCP socket splice project. Feedback
> would be greatly appreciated.
>
> --
> Ashwini Kulkarni