From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Andrew Morton <akpm@osdl.org>,
linux@horizon.com, linux-kernel@vger.kernel.org, sct@redhat.com
Subject: Re: msync() behaviour broken for MS_ASYNC, revert patch?
Date: Sat, 11 Feb 2006 07:03:14 +1100
Message-ID: <43ECF182.9020505@yahoo.com.au>
In-Reply-To: <Pine.LNX.4.64.0602101056130.19172@g5.osdl.org>
Linus Torvalds wrote:
>
> On Sat, 11 Feb 2006, Nick Piggin wrote:
>
>> When MS_ASYNC is specified, msync() shall return immediately once all
>> the write operations are initiated or queued for servicing;
>>
>> It is talking about write operations, not dirtying. Actually the only
>> difference with MS_SYNC is that it waits for said write operations (of the
>> type queued up by MS_ASYNC) to complete.
>
>
> Right. And it's what we do. We queue them by moving the pages to the dirty
> lists (yeah, it's just a tag on the page index thing, whatever).
>
> And yes, you argue that we should move the queue closer to the actual
> disk, but I have used at least one app that really hated the "start IO
> now" approach. I can't talk about that app in any detail, but I can say
> that it was an in-memory checkpoint thing with the checkpoints easily
> being in the hundred-meg range.
>
Hey fix your damn broken proprietary app (nah just kidding)
> And moving a hundred megs to the IO layer is insane. It also makes the
> system pretty unusable.
>
> So we may have different expectations, because we've seen different
> patterns. Me, I've seen the "events are huge, and you stagger them", so
> that the previous event has time to flow out to disk while you generate
> the next one. There, MS_ASYNC starting IO is _wrong_, because the scale of
> the event is just huge, so trying to push it through the IO subsystem asap
> just makes everything suck.
>
> In contrast, you seem to be coming at it from a standpoint of "only one
> event ever outstanding at any particular time, and it's either small or
> it's the only thing the whole system is doing". In which case pushing it
> out to IO buffers is probably the right thing to do.
>
The way I see it, it simply stems from a different expectation of
MS_ASYNC semantics, rather than from exactly what the app is doing.
If there are no data integrity requirements, then the writing should
be left up to the VM. If there are, then there will be an MS_SYNC,
which *will* move those hundred megs to the IO layer, so there is no
reason for MS_ASYNC *not* to get it started earlier (and it will
be more efficient if it does).
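
A minimal sketch of the usage pattern being argued about, assuming an
application that dirties a shared file mapping, hints with MS_ASYNC, and
later needs an integrity point via MS_SYNC. The helper name
write_checkpoint() and its parameters are hypothetical and not from this
thread; whether the MS_ASYNC call starts IO or only marks pages dirty is
exactly the point in dispute, so the comments do not assume either.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write one checkpoint of `len`
 * bytes (len must be > 0) to `path` through a shared mapping.
 * Returns 0 on success, -1 on any error.
 */
int write_checkpoint(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by it. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Dirty the pages: this stands in for generating the checkpoint. */
    memset(map, 0xAB, len);

    int rc = 0;

    /*
     * Returns without waiting for the data to reach disk. POSIX words
     * this as the writes being "initiated or queued for servicing".
     */
    if (msync(map, len, MS_ASYNC) < 0)
        rc = -1;

    /* ... the application would do other work here ... */

    /* Integrity point: returns only once the writes have completed. */
    if (rc == 0 && msync(map, len, MS_SYNC) < 0)
        rc = -1;

    if (munmap(map, len) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Calling write_checkpoint("/tmp/ckpt.bin", 4096) should return 0 on a
writable filesystem. An app with no integrity requirement would drop the
MS_SYNC call and leave writeback to the VM.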
The semantics your app wants, in my interpretation, are provided
by MS_INVALIDATE. Which kind of says "bring mmap data into coherence
with system cache", which would presumably transfer dirty bits if
modified (though as an implementation detail, we are never actually
incoherent as far as the data goes, only dirty bits).
At this point the best I can do is agree to disagree if you are
still not convinced and I'll leave it to Linux to keep debating it.
We reached something of an agreement on the fadvise thing at least.
Thanks,
Nick
--
SUSE Labs, Novell Inc.