From: Robert Hancock <hancockrwd@gmail.com>
To: "Patrick J. LoPresti" <lopresti@gmail.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: sendfile() expert advice sought
Date: Wed, 17 Feb 2010 18:57:10 -0600
Message-ID: <4B7C9066.6010107@gmail.com>
In-Reply-To: <23986fd91002161153y516bb5e3i9e85f11469b9160e@mail.gmail.com>
On 02/16/2010 01:53 PM, Patrick J. LoPresti wrote:
> Executive summary: Can I get the benefits of sendfile() for anonymous pages?
>
> I have an application that generates hundreds of gigabytes of data per
> hour. I want to push that data out over a TCP socket. (The network
> connection will be fast; multiple bonded GigE lines or 10GigE.)
>
> I gather that sendfile() is pretty efficient, so I would like to use
> it. But I do not want to write all of my data to disk first. So I am
> considering an approach like this:
>
> int fd = shm_open("/foo", O_CREAT|O_RDWR|O_TRUNC, 0600);
> ftruncate(fd, length);
> void *p = mmap(NULL, length, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> // (fill memory block at p with some data)
> off_t offset = 0;
> sendfile(sock, fd, &offset, length);  // out_fd first, then in_fd
>
> Questions:
>
> 1) Will this work at all? (Some on-line sources suggest sendfile()
> does not work with tmpfs files. But I think this was fixed at some
> point...)
>
> 2) Will it provide zero-copy behavior, or does the fact that the pages
> are mapped in my process cause sendfile() to copy them?
>
> 3) If it is zero-copy, what happens if I overwrite the memory block
> after sendfile() returns? Do I risk corrupting my data? (In
> particular, suppose I have TCP_CORK set on the socket. Will
> sendfile() return before all of the data has actually been sent,
> giving me a window to corrupt my data? If so, how do I know when it
> is "safe" to re-use the memory?)
>
> 4) If sendfile() is not zero-copy in this example, would I expect a
> performance boost anyway, because sendfile() does not need to crawl
> page tables or something?
>
> Any responses or references will be appreciated.
I can't answer this definitively, but as far as I know you wouldn't
get any magic performance benefit from playing games with sendfile(),
splice(), etc. when the pages are already mapped into userspace. Those
calls only really help when the data being written out comes from
another file, pipe, etc., not when your application generates the
data internally.
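
For reference, here is a minimal sketch of the call sequence under
discussion, with the sendfile() arguments in the order the man page gives
(out_fd first, then in_fd) and a loop for short transfers. It only shows
that the sequence can be exercised end to end; it says nothing about
whether the kernel avoids a copy. The helper names and the
"/sendfile-demo" object name are made up for illustration, and a local
socketpair stands in for the TCP connection.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a POSIX shared-memory object of `length`
 * bytes and map it. Returns the fd and stores the mapping in *out, or
 * returns -1 on error. The name is unlinked right away, so the object
 * lives only as long as the fd and the mapping do. */
static int make_shm_block(const char *name, size_t length, void **out)
{
    int fd = shm_open(name, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    shm_unlink(name);
    if (ftruncate(fd, (off_t)length) < 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    *out = p;
    return fd;
}

/* Hypothetical helper: send the first `length` bytes of `fd` to `sock`.
 * sendfile() may transfer fewer bytes than requested, so loop until done.
 * Returns 0 on success, -1 on error. */
static int send_fd_range(int sock, int fd, size_t length)
{
    off_t offset = 0; /* advanced by sendfile() on each call */
    while ((size_t)offset < length) {
        ssize_t n = sendfile(sock, fd, &offset, length - (size_t)offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1; /* unexpected end of file */
    }
    return 0;
}

/* Round trip over a local socket pair (stand-in for the TCP socket).
 * Returns 0 if the bytes read back match what was written via the mapping. */
int demo_roundtrip(void)
{
    enum { LEN = 4096 }; /* small enough to fit in the socket buffer */
    char expect[LEN], got[LEN];
    void *p;
    int sv[2];

    int fd = make_shm_block("/sendfile-demo", LEN, &p);
    if (fd < 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        munmap(p, LEN);
        close(fd);
        return -1;
    }

    memset(expect, 'x', LEN);
    memcpy(p, expect, LEN); /* "fill memory block at p with some data" */

    int rc = send_fd_range(sv[0], fd, LEN);
    size_t have = 0;
    while (rc == 0 && have < LEN) {
        ssize_t n = read(sv[1], got + have, LEN - have);
        if (n <= 0) {
            rc = -1;
            break;
        }
        have += (size_t)n;
    }
    if (rc == 0 && memcmp(expect, got, LEN) != 0)
        rc = -1;

    munmap(p, LEN);
    close(fd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling demo_roundtrip() should return 0 where /dev/shm is available
(older glibc may need -lrt for shm_open). The plain alternative for data
the application generates itself is to write() or send() straight from
the buffer.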
Thread overview: 4+ messages
2010-02-16 19:53 sendfile() expert advice sought Patrick J. LoPresti
2010-02-18 0:57 ` Robert Hancock [this message]
2010-02-18 2:19 ` Bryan Donlan
2010-02-18 2:47 ` Patrick J. LoPresti