linux-ext4.vger.kernel.org archive mirror
From: Theodore Tso <tytso@mit.edu>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4@vger.kernel.org,
	Arjan van de Ven <arjan@infradead.org>
Subject: Re: [PATCH, RFC] Use WRITE_SYNC in __block_write_full_page() if WBC_SYNC_ALL
Date: Mon, 5 Jan 2009 16:16:07 -0500
Message-ID: <20090105211607.GF8939@mit.edu>
In-Reply-To: <20090105193820.GW32491@kernel.dk>

On Mon, Jan 05, 2009 at 08:38:20PM +0100, Jens Axboe wrote:
> On Mon, Jan 05 2009, Theodore Tso wrote:
> > So long-term, I suspect the heuristic which makes sense is that in the
> > case where there is an fsync() in progress, any writes which take
> > place as a result of that fsync (which includes the journal records as
> > well as ordered writes that are being forced out as a result of
> > data=ordered and which block the fsync from returning), should get a
> > hint which propagates down to the block layer that these writes *are*
> > synchronous in that someone is waiting for them to complete.  They
> 
> If someone is waiting for them, they are by definition sync!

Surely.  :-)

Andrew's argument is that someone *shouldn't* be waiting for them ---
and he's right, although in the case of fsync() in particular, there's
nothing we can do; there will be a userspace application waiting by
definition.

The bigger problem is that "unplug the I/O queue" and "mark the I/O as
synchronous" are currently tied together.  Until we split those two
meanings apart, the way data=ordered mode works is that all of the
data blocks get pushed out in 4k chunks.  So in the worst case, if the
user has just written some 200 megabytes of vmlinuz and kernel modules
and then calls fsync(), the block I/O layer might get flooded with
some 50,000+ 4k writes, and if they are all BIO_RW_SYNC, they might
not get coalesced properly, and the result would be badness.  One
could argue that the journal layer should be doing a better job of
coalescing the write requests, but historically the block layer has
done this for us, so why add duplicate functionality at the
journalling layer?
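
To put rough numbers on the worst case above, here is a minimal
userspace sketch, not kernel code.  It assumes "200 megabytes" means
200 MiB and that each write carries exactly one 4 KiB block; the
helper names writes_needed() and coalesced_requests() are hypothetical
and exist only for illustration.  The second helper shows what merging
adjacent blocks buys: runs of consecutive block numbers collapse into
a single request each.

```c
#include <assert.h>
#include <stddef.h>

/* Number of fixed-size writes needed to push out `bytes` of dirty
 * data when each write carries one block of `block_size` bytes.
 * Rounds up so a partial trailing block still costs one write. */
static unsigned long writes_needed(unsigned long long bytes,
                                   unsigned long block_size)
{
    assert(block_size > 0);
    return (unsigned long)((bytes + block_size - 1) / block_size);
}

/* Number of requests left after merging runs of consecutive block
 * numbers.  `blocks` must be sorted in ascending order; this is a toy
 * model of coalescing, not the block layer's actual merge logic. */
static size_t coalesced_requests(const unsigned long *blocks, size_t n)
{
    size_t requests = 0;

    for (size_t i = 0; i < n; i++) {
        /* A new request starts whenever the run of adjacent blocks breaks. */
        if (i == 0 || blocks[i] != blocks[i - 1] + 1)
            requests++;
    }
    return requests;
}
```

Under those assumptions, writes_needed(200ULL * 1024 * 1024, 4096)
comes to 51,200 individual writes, which matches the "50,000+" figure.
If the blocks are laid out contiguously, coalescing could in principle
reduce that to a far smaller number of requests, which is what is at
risk if the writes are not merged.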

In any case, that's why I'm really not convinced we can afford to use
BIO_RW_SYNC until we separate out the queue unplug functionality.
Maybe what makes sense is to have two flags, BIO_RW_UNPLUG and
BIO_RW_SYNCIO, and then make BIO_RW_SYNC be defined to be
(BIO_RW_UNPLUG|BIO_RW_SYNCIO)?
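
A minimal sketch of what that split could look like, as a standalone
C fragment rather than a patch.  The three flag names come from the
proposal above; the bit values, the use of bit masks, and the helpers
wants_unplug() and is_sync_io() are assumptions made for illustration
and do not reflect how the kernel actually defines or tests these
flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; not the kernel's definitions. */
#define BIO_RW_UNPLUG  (1u << 0)   /* kick (unplug) the queue now    */
#define BIO_RW_SYNCIO  (1u << 1)   /* someone is waiting on this I/O */

/* Existing callers that pass BIO_RW_SYNC keep both behaviours. */
#define BIO_RW_SYNC    (BIO_RW_UNPLUG | BIO_RW_SYNCIO)

static_assert((BIO_RW_UNPLUG & BIO_RW_SYNCIO) == 0,
              "the two hints must be independently settable");

/* Hypothetical helper: should this submission unplug the queue? */
static bool wants_unplug(unsigned int rw_flags)
{
    return (rw_flags & BIO_RW_UNPLUG) != 0;
}

/* Hypothetical helper: should this I/O be scheduled as synchronous? */
static bool is_sync_io(unsigned int rw_flags)
{
    return (rw_flags & BIO_RW_SYNCIO) != 0;
}
```

With a split like this, the data blocks forced out by an fsync()
could be submitted with BIO_RW_SYNCIO alone, so they are treated as
synchronous without each 4k write unplugging the queue, while
BIO_RW_SYNC would behave as it does today.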

> > shouldn't necessarily be prioritized ahead of other reads (unless they
> > are readahead operations that couldn't be combined with reads that
> > *are* synchronous that someone is waiting for completion), but they
> > should be prioritized ahead of asynchronous writes.
> 
> And that is *exactly* what flagging the write as sync will do...

Great, so once we separate out the queue unplug request, I think this
should be exactly what we need.

							- Ted


Thread overview: 13+ messages
2009-01-04 21:52 [PATCH, RFC] Use WRITE_SYNC in __block_write_full_page() if WBC_SYNC_ALL Theodore Ts'o
2009-01-04 22:23 ` Andrew Morton
2009-01-04 22:43   ` Theodore Tso
2009-01-04 23:19     ` Andrew Morton
2009-01-05  0:21       ` Theodore Tso
2009-01-05  8:02       ` Jens Axboe
2009-01-05 14:47         ` Theodore Tso
2009-01-05 15:58           ` Jens Axboe
2009-01-05 18:47           ` Andrew Morton
2009-01-05 19:35             ` Theodore Tso
2009-01-05 19:38               ` Jens Axboe
2009-01-05 21:16                 ` Theodore Tso [this message]
2009-01-06  7:34                   ` Jens Axboe
