From: Ric Wheeler <ric@emc.com>
To: Bryan Henderson <hbryan@us.ibm.com>
Cc: Chris Wedgwood <cw@f00f.org>, Al Boldi <a1426z@gawab.com>,
	linux-fsdevel@vger.kernel.org, linux-xfs@oss.sgi.com,
	Steve Lord <lord@xfs.org>, "'Nathan Scott'" <nathans@sgi.com>,
	reiserfs-list@namesys.com
Subject: Re: XFS corruption during power-blackout
Date: Fri, 01 Jul 2005 08:53:40 -0400
Message-ID: <42C53CD4.4000205@emc.com>
In-Reply-To: <OFBC8F19C9.0A8B9C84-ON88257030.006EB89B-88257030.00725493@us.ibm.com>

Bryan Henderson wrote:

>
>It's because of the words before that:  "everything that was buffered when
>sync() started is hardened before the next sync() returns."  The point is that
>the second sync() is the one that waits (it actually waits for the 
>previous one to finish before it starts).  By the way, I'm not talking 
>about Linux at this point.  I'm talking about so-called POSIX systems in 
>general.
>
>But it does sound like Linux has a pretty firm philosophy of synchronous 
>sync (I see it documented in an old man page), so I guess it's OK to rely 
>on it.
>
>There are scenarios where you'd rather not have a process tied up while 
>syncing takes place.  Stepping back, I would guess the primary original 
>purpose of sync() was to allow you to make a sync daemon.  Early Unix 
>systems did not have in-kernel safety clean timers.  A user space process 
>did that.
>
>--
>Bryan Henderson                     IBM Almaden Research Center
>San Jose CA                         Filesystems
>  
>
We have been playing around with various sync techniques that allow you 
to get good data safety for a large batch of files (think of a restore 
of a file system or a migration of lots of files from one server to 
another).  You can always restart a restore if the box goes down in the 
middle, but once you are done, you want a hard promise that all files 
are safely on the disk platter.

Using the system-level sync() has all of the disadvantages that you
mention, and it also lacks a per-file-system barrier flush.

You can try to hack in a flush by issuing an fsync() call on one file 
per file system after the sync() completes, but whether or not the file 
system issues a barrier operation is file system dependent.
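
The hack above can be sketched roughly as follows. This is only an
illustration: the function name sync_then_fsync_one is made up, it
assumes the platform allows fsync() on a descriptor opened read-only
(Linux does), and it uses st_dev to tell file systems apart. Whether
the fsync() actually triggers a barrier is still up to the file system.

```python
import os


def sync_then_fsync_one(paths):
    """Call sync(), then fsync() one file per file system holding `paths`.

    Returns the list of paths that were fsync()'ed (one per st_dev).
    """
    os.sync()  # system-wide: push all dirty buffers towards the disks

    seen_devices = set()  # st_dev values that already got their fsync()
    flushed = []
    for path in paths:
        dev = os.stat(path).st_dev
        if dev in seen_devices:
            continue  # this file system was already covered
        seen_devices.add(dev)

        # Assumption: fsync() on a read-only descriptor is permitted.
        fd = os.open(path, os.O_RDONLY)
        try:
            os.fsync(fd)  # may or may not issue a barrier, per file system
        finally:
            os.close(fd)
        flushed.append(path)
    return flushed
```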

Doing an fsync() per file is slow but safe. Writing the files without
syncing and then reopening and fsync()'ing each one in reasonably sized
batches is much faster, but still kludgy.
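
A minimal sketch of the batched variant, for illustration only. The
name restore_in_batches, the dict-of-bytes input and the default batch
size are assumptions made for the example, and it again assumes fsync()
works on a descriptor opened read-only.

```python
import os


def restore_in_batches(directory, contents, batch_size=64):
    """Write files without syncing, then reopen and fsync() them per batch.

    `contents` maps file name -> bytes. Returns the number of batches.
    """
    names = list(contents)
    batches = 0
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]

        # Phase 1: plain buffered writes, no fsync() while writing.
        paths = []
        for name in batch:
            path = os.path.join(directory, name)
            with open(path, "wb") as f:
                f.write(contents[name])
            paths.append(path)

        # Phase 2: reopen each file of the batch and fsync() it.
        for path in paths:
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
        batches += 1
    return batches
```

By the time the second phase runs, the file system has usually had a
chance to write much of the batch on its own, so each fsync() has less
left to wait for than it would right after the write.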

An attractive feature, but one that is missing as far as I can see,
would be the ability to do a file-system-specific sync().  Another
option would be a batched, AIO-like fsync() that takes a bit vector of
descriptors to sync.  Not surprisingly, the best performance is reached
when you let the writing phase work asynchronously, let the underlying
file system do its thing, and wrap it up with a group cache-to-disk sync
and a single disk write cache flush (barrier) at the end.
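
The overall shape of that last approach, sketched with a system-wide
sync() standing in for the missing per-file-system call. The function
name restore_then_sync and its arguments are hypothetical, and whether
sync() ends with a disk write cache flush depends on the kernel and the
file system.

```python
import os


def restore_then_sync(directory, contents):
    """Write every file with no per-file syncing, then sync once at the end.

    `contents` maps file name -> bytes. Returns the list of written paths.
    """
    written = []
    for name, data in contents.items():
        path = os.path.join(directory, name)
        with open(path, "wb") as f:
            f.write(data)  # buffered; the file system schedules the I/O
        written.append(path)

    # One group sync for the whole batch. This is system-wide, not
    # limited to the file system that holds `directory`.
    os.sync()
    return written
```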
