public inbox for linux-kernel@vger.kernel.org
From: Jeff Garzik <jeff@garzik.org>
To: Jamie Lokier <jamie@shareable.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Chris Wedgwood <cw@f00f.org>
Subject: Re: Proposal for "proper" durable fsync() and fdatasync()
Date: Tue, 26 Feb 2008 12:54:47 -0500
Message-ID: <47C45267.4090105@garzik.org>
In-Reply-To: <20080226170011.GB21203@shareable.org>

Jamie Lokier wrote:
> Jeff Garzik wrote:
>> Nick Piggin wrote:
>>> Anyway, the idea of making fsync/fdatasync etc. safe by default is
>>> a good idea IMO, and is a bad bug that we don't do that :(
>> Agreed...  it's also disappointing that [unless I'm mistaken] you have 
>> to hack each filesystem to support barriers.
>>
>> It seems far easier to make sync_blkdev() Do The Right Thing, and 
>> magically make all filesystems data-safe.
> 
> Well, you need ordered metadata writes, barriers _and_ flushes with
> some filesystems.
> 
> Merely writing all the data pages then issuing a drive cache flush
> won't Do The Right Thing with those filesystems - someone already
> mentioned Btrfs, where it won't.

Oh certainly.  That's why we have a VFS :)  fsync for NFS will look 
quite different, too.


> But I agree that your suggestion would make a superb default, for
> filesystems which don't provide their own function.

Yep.  That would immediately cover a bunch of filesystems.
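
To illustrate the "default for filesystems which don't provide their
own function" idea, here is a rough userspace sketch. Every name in it
(toy_dev, toy_fs_ops, toy_vfs_fsync, and so on) is hypothetical and
invented for illustration; none of it is existing kernel API, and the
"device" is just a pair of counters standing in for a drive's volatile
write cache and its media.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are real kernel APIs. */
struct toy_dev {
	int dirty_in_cache;	/* blocks held in the drive's volatile write cache */
	int on_media;		/* blocks that have reached stable storage */
};

struct toy_file;

struct toy_fs_ops {
	/* Optional hook: a filesystem with special ordering needs sets this. */
	int (*fsync)(struct toy_file *f);
};

struct toy_file {
	const struct toy_fs_ops *ops;
	struct toy_dev *dev;
	int dirty_pages;	/* pages still dirty in the page cache */
};

/* Step 1: write dirty pages to the device (they may sit in its cache). */
static void toy_write_pages(struct toy_file *f)
{
	f->dev->dirty_in_cache += f->dirty_pages;
	f->dirty_pages = 0;
}

/* Step 2: ask the drive to empty its write cache onto the media. */
static void toy_flush_cache(struct toy_dev *d)
{
	d->on_media += d->dirty_in_cache;
	d->dirty_in_cache = 0;
}

/* Generic fallback: write the data, then flush the drive cache. */
static int toy_generic_fsync(struct toy_file *f)
{
	toy_write_pages(f);
	toy_flush_cache(f->dev);
	return 0;
}

/* VFS-style dispatch: prefer the filesystem's hook, else the default. */
int toy_vfs_fsync(struct toy_file *f)
{
	assert(f != NULL && f->dev != NULL && f->ops != NULL);
	if (f->ops->fsync)
		return f->ops->fsync(f);
	return toy_generic_fsync(f);
}
```

A filesystem that needs ordered metadata writes or barriers would fill
in its own fsync hook and never reach the generic path.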


> It's not optimal even then.
> 
>   Devices: On a software RAID, you ideally don't want to issue flushes
>   to all drives if your database did a 1 block commit entry.  (But they
>   probably use O_DIRECT anyway, changing the rules again).  But all that
>   can be optimised in generic VFS code eventually.  It doesn't need
>   filesystem assistance in most cases.

My own idea is that we create a FLUSH command for blkdev request queues,
to exist alongside READ, WRITE, and the current barrier implementation.
Then FLUSH could be passed down through MD or DM.
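
A minimal userspace sketch of what that could look like, assuming a
mirror-style stacked device. The names (toy_op, toy_bdev, toy_submit)
are made up for this example and are not block-layer identifiers; the
point is only that FLUSH is a request type of its own which a stacking
driver can forward to its members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; these are not real block-layer names. */
enum toy_op { TOY_READ, TOY_WRITE, TOY_FLUSH };

struct toy_bdev {
	/* Leaf disk state */
	int cached;		/* writes held in the volatile write cache */
	int stable;		/* writes that have reached the media */
	int flushes;		/* number of FLUSH requests this disk has seen */
	/* Stacked (MD/DM-like) state; members == NULL means a leaf disk */
	struct toy_bdev **members;
	size_t nr_members;
};

/* Submit one request; a stacked device passes it down to its members. */
void toy_submit(struct toy_bdev *bd, enum toy_op op)
{
	assert(bd != NULL);

	if (bd->members) {
		/* A read can be served by any one mirror member. */
		if (op == TOY_READ) {
			if (bd->nr_members > 0)
				toy_submit(bd->members[0], op);
			return;
		}
		/* WRITE and FLUSH fan out to every member. */
		for (size_t i = 0; i < bd->nr_members; i++)
			toy_submit(bd->members[i], op);
		return;
	}

	switch (op) {
	case TOY_READ:
		break;
	case TOY_WRITE:
		bd->cached++;		/* lands in the write cache first */
		break;
	case TOY_FLUSH:
		bd->stable += bd->cached;	/* cache emptied onto media */
		bd->cached = 0;
		bd->flushes++;
		break;
	}
}
```

This version flushes every member unconditionally. Tracking which
members actually received writes since the last FLUSH would be one way
to avoid flushing all drives for a 1 block commit, but that is left out
of the sketch.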

	Jeff


