From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jamie Lokier <jamie@shareable.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
jens.axboe@oracle.com, akpm@linux-foundation.org,
nickpiggin@yahoo.com.au, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch v3] splice: fix race with page invalidation
Date: Thu, 31 Jul 2008 11:54:56 -0700 (PDT)
Message-ID: <alpine.LFD.1.10.0807311142510.3277@nehalem.linux-foundation.org>
In-Reply-To: <20080731172111.GA23644@shareable.org>
On Thu, 31 Jul 2008, Jamie Lokier wrote:
>
> But did you miss the bit where you DON'T COPY ANYTHING EVER*? COW is
> able to provide _correctness_ for the rare corner cases which you're not
> optimising for. You don't actually copy more than 0.0% (*approx).
The thing is, even just _marking_ things COW is the expensive part. If we
have to walk page tables - we're screwed.
> The cost of COW is TLB flushes*. But for splice, there ARE NO TLB
> FLUSHES because such files are not mapped writable!
For splice, there are also no flags to set, no extra tracking costs, etc
etc.
But yes, we could make splice (from a file) do something like
 - just fall back to copy if the page is already mapped (page->mapcount
   gives us that)
 - set a bit ("splicemapped") when we splice it in, and increment
   page->mapcount for each splice copy.
 - if a "splicemapped" page is ever mmap'ed or written to (either through
   write or truncate), we COW it then (and actually move the page cache
   page - it would be a "woc": a reverse cow, not a normal one).
 - do all of this with page lock held, to make sure that there are no
   writers or new mappers happening.
So it's probably doable.
(We could have a separate "splicecount", and actually allow non-writable
mappings, but I suspect we cannot afford the space in the "struct page"
for a whole new count).
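To make the scheme above concrete, here is a rough userspace sketch of the
state transitions. It is not kernel code and not a proposed patch: the names
sim_page, sim_slot, sim_splice_in, sim_write and sim_splice_put are all made
up for illustration, the "locked" flag only stands in for the page lock, and
the "in_cache" flag is an extra assumption added so the sketch knows when a
detached page can be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096

/* Hypothetical stand-in for struct page plus the page contents. */
struct sim_page {
    int mapcount;        /* user mappings plus splice references */
    bool splicemapped;   /* the proposed "splicemapped" bit */
    bool locked;         /* stands in for the page lock */
    bool in_cache;       /* assumption: still the page cache page? */
    unsigned char data[SIM_PAGE_SIZE];
};

/* One page cache slot: which page currently backs this file offset. */
struct sim_slot {
    struct sim_page *page;
};

/* Allocate a zeroed page, optionally copying the contents of src. */
struct sim_page *sim_page_alloc(const struct sim_page *src)
{
    struct sim_page *p = calloc(1, sizeof(*p));
    if (p && src)
        memcpy(p->data, src->data, SIM_PAGE_SIZE);
    return p;
}

void sim_lock(struct sim_page *p)   { assert(!p->locked); p->locked = true; }
void sim_unlock(struct sim_page *p) { assert(p->locked);  p->locked = false; }

/* Splice one page out of the slot; returns NULL if the copy cannot be allocated. */
struct sim_page *sim_splice_in(struct sim_slot *slot, bool *shared)
{
    struct sim_page *page = slot->page, *ret = page;

    sim_lock(page);
    if (page->mapcount > 0 && !page->splicemapped) {
        /* Already mapped by somebody else: fall back to a plain copy. */
        ret = sim_page_alloc(page);
        if (ret)
            ret->mapcount = 1;
        *shared = false;
    } else {
        /* Share the page cache page itself and account for the reference. */
        page->splicemapped = true;
        page->mapcount++;
        *shared = true;
    }
    sim_unlock(page);
    return ret;
}

/* Write to the cached page, doing the "woc" (reverse COW) first if needed. */
int sim_write(struct sim_slot *slot, size_t off, const void *buf, size_t len)
{
    struct sim_page *old = slot->page;

    if (off > SIM_PAGE_SIZE || len > SIM_PAGE_SIZE - off)
        return -1;
    sim_lock(old);
    if (old->splicemapped) {
        struct sim_page *fresh = sim_page_alloc(old);
        if (!fresh) {
            sim_unlock(old);
            return -1;
        }
        fresh->in_cache = true;
        old->in_cache = false;  /* splice holders keep the old page as-is */
        slot->page = fresh;     /* the page cache moves, not the readers */
    }
    memcpy(slot->page->data + off, buf, len);
    sim_unlock(old);
    return 0;
}

/* Drop one splice reference; free the page once nobody needs it. */
void sim_splice_put(struct sim_page *page)
{
    sim_lock(page);
    if (--page->mapcount == 0)
        page->splicemapped = false;
    if (page->mapcount == 0 && !page->in_cache) {
        free(page);
        return;
    }
    sim_unlock(page);
}
```

The point of the sketch is only the direction of the copy: a writer gets a
fresh page installed in the slot, while the splice references keep seeing
the old, unmodified data. The truncate and mmap cases are left out, and a
real implementation would have to deal with concurrency that a
single-threaded toy like this ignores.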
> You're missing the real point of network splice().
>
> It's not just for speed.
>
> It's for sharing data. Your TCP buffers can share data, when the same
> big lump is in flight to lots of clients. Think static file / web /
> FTP server, the kind with 80% of hits to 0.01% of the files, roughly
> the same size as your RAM.
Maybe. Does it really show up as a big thing?
Linus