From: Neil Brown <neilb@suse.de>
To: Olaf van der Spek <olafvdspek@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: Atomic non-durable file write API
Date: Sun, 26 Dec 2010 08:40:07 +1100
Message-ID: <20101226084007.7939aabc@notabene.brown>
In-Reply-To: <AANLkTikHzECjyNNJC=-0x+-WgNrY=-PjzJnVt=G2NHX_@mail.gmail.com>

On Fri, 24 Dec 2010 12:17:46 +0100 Olaf van der Spek <olafvdspek@gmail.com>
wrote:

> On Thu, Dec 23, 2010 at 10:51 PM, Neil Brown <neilb@suse.de> wrote:
> > You are asking for something that doesn't exist, which is why no-one can tell
> > you what the answer is.
> 
> It seems like a very common and basic operation. If it doesn't exist
> IMO it should be created.
> 
> > The only mechanism for synchronising different filesystem operations is
> > fsync.  You should use that.
> >
> > If it is too slow, use data journalling, and place your journal on a
> > small low-latency device (NVRAM??)
> 
> This isn't about some DB-like app, it's about normal file writes, like
> archive extractions, compiling, editors, etc.
> 

Yes, it might be nice to have a very low cost way to make those safer against
corruption during a crash.
It would have to be *very* low cost, as in most cases the cost of cleaning up
after the crash instead (e.g. 'make clean') is quite low.  But people do
sometimes edit /etc/init.d files with an ordinary editor, and it would be
rather embarrassing if a crash at just the wrong time left some critical file
incomplete.  Then again, maybe it would be easier to teach editors to fsync
before rename for files in /etc .....
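
For reference, the "fsync before rename" pattern an editor could use looks
roughly like the sketch below.  It is written in Python for brevity; the
helper name atomic_write and its sync_dir parameter are made up for
illustration and are not taken from any existing editor.  It assumes a POSIX
system where rename() atomically replaces the target.

```python
import os
import tempfile


def atomic_write(path, data, sync_dir=False):
    """Replace `path` with `data` so that a crash leaves either the old
    contents or the complete new contents, never a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the file data to stable storage first
        os.rename(tmp_path, path)  # then atomically switch the name over
    except BaseException:
        # Writing or renaming failed: do not leave the temporary file behind.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    if sync_dir:
        # Optional extra step: fsync the directory so the rename itself
        # is durable, not just ordered after the data.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The fsync here is exactly the cost being argued about: it forces the writeout
immediately rather than merely ordering it before the rename.  Note also that
mkstemp creates the file with restrictive permissions, so a real editor would
have to copy the original file's mode and ownership as well.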

So what would this mechanism really look like?  I think the proposal is to
delay committing the rename until the writeout of the file is complete,
without accelerating the writeout.
That would probably require delaying all updates to the directory until the
writeout was complete, as trying to reason about which changes were dependent
and which were independent is unlikely to be easy.

So as soon as you rename a file, you create a dependency between the file and
the directory such that no update for the directory may be written while any
page in the file is dirty.  Conversely, any fsync of the directory would
fsync the file as well.

Any write to the file should probably break the dependency as you can no
longer be sure what exactly the rename was supposed to protect.

I suspect that much of the infrastructure for this could be implemented in
the VFS/VM.  Certainly the dependency linkage between inodes, created on
rename, destroyed on write or fsync or when writeout on the inode completes,
and the fsync dependency could be common code.  Preventing writeout of
directories with dependent files would need some fs interaction. You could
probably prototype in ext2 quite easily to do some testing and collect
some numbers on overhead.
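
To make the intended semantics concrete, here is a toy user-space model of the
dependency rules described above.  It is only a sketch in Python, not kernel
code: the Inode class and the write, rename_into, writeout and fsync functions
are hypothetical names, and "writeout" simply flips a flag rather than doing
any I/O.

```python
class Inode:
    def __init__(self, name):
        self.name = name
        self.dirty = False     # has data or updates not yet written out
        self.blockers = set()  # for a directory: files that must be clean first
        self.blocks = set()    # for a file: directories waiting on this file


def _drop_dependencies(file):
    """Destroy every dependency that other inodes have on `file`."""
    for directory in file.blocks:
        directory.blockers.discard(file)
    file.blocks.clear()


def write(file):
    file.dirty = True
    # A write after the rename breaks the dependency: it is no longer
    # clear which contents the rename was meant to protect.
    _drop_dependencies(file)


def rename_into(file, directory):
    directory.dirty = True
    if file.dirty:
        # The directory update must not reach disk before the file data.
        directory.blockers.add(file)
        file.blocks.add(directory)


def writeout(inode):
    """Background writeout.  Returns True if the inode was written."""
    if inode.blockers:
        return False           # directory update is held back, not accelerated
    inode.dirty = False
    _drop_dependencies(inode)  # writeout complete: dependency goes away
    return True


def fsync(inode):
    # fsync of a directory implies fsync of the files it depends on.
    for file in list(inode.blockers):
        fsync(file)
    inode.dirty = False
    _drop_dependencies(inode)
```

In this model, after write(f) followed by rename_into(f, d), a call to
writeout(d) is refused until writeout(f) has completed, while fsync(d) writes
f first and then d.  A further write(f) after the rename removes the
dependency, so d can be written independently again.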

I think this would be an interesting project for someone to do and I would be
happy to review any patches.  Whether it ever got further than an interesting
project would depend very much on how intrusive it was to other filesystems,
how much overhead it caused, and what actual benefits resulted.
If anyone wanted to pursue this idea, they would certainly need to address
each of those in their final proposal.

I think there could be room for improved transactional semantics in Linux
filesystems.  This might be what they should look like ... don't know yet.

NeilBrown

