linux-raid.vger.kernel.org archive mirror
From: Stan Hoeppner <stan@hardwarefreak.com>
To: Peter Grandi <pg@lxra2.to.sabi.co.UK>
Cc: Linux RAID <linux-raid@vger.kernel.org>, Linux fs XFS <xfs@OSS.SGI.com>
Subject: Re: raid10n2/xfs setup guidance on write-cache/barrier
Date: Fri, 16 Mar 2012 19:02:22 -0500
Message-ID: <4F63D48E.9080109@hardwarefreak.com>
In-Reply-To: <20323.37976.162802.876821@tree.ty.sabi.co.UK>

On 3/16/2012 2:28 PM, Peter Grandi wrote:
> [ ... ]
> 
>>>> write barriers will ensure journal and thus filesystem
>>>> integrity in a crash/power fail event.  They do NOT guarantee
>>>> file data integrity as file data isn't journaled.
> 
> Not well expressed, 

Given the audience, the OP, I was simply avoiding getting too deep in
the weeds, Peter.  This thread is on the linux-raid list, not xfs@oss.
You know I have a tendency to get too deep in the weeds.  I think I did
a nice job of balancing here. ;)

> as XFS barriers do ensure file data integrity,
> *if the applications uses them* (and uses them in exactly the
> right way).

How will the OP know which, if any, of his users' desktop applications
call fsync properly?  He won't.  Which is why I made the general
statement, which is correct, if neither elaborate nor down in the weeds.
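
For anyone wondering what "doing fsyncs properly" tends to look like in
practice, here is a minimal sketch of the usual write-to-temp, fsync,
rename, fsync-the-directory pattern.  The function name durable_write is
hypothetical, not from any library, and the sketch assumes a Linux system
where fsync on a directory descriptor is supported.  It only illustrates
the application's side of the bargain, not what any particular desktop
application actually does.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data` (bytes) and flush it toward stable storage.

    Hypothetical helper for illustration only.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # within one filesystem (mkstemp creates it with mode 0600).
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push file data to media
        os.replace(tmp, path)     # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the new directory entry is persisted too.
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

An application that skips the fsync calls gets no promise about its file
data after a power failure, whatever the filesystem does for its own
metadata.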

> The difference between metadata and data with XFS is that XFS
> itself will use barriers on metadata at the right times, because
> that's data to XFS, but it won't use barriers on data[1], leaving
> that entirely to the application.

[1] File data, just to be clear

>>>>  No filesystem (Linux anyway) journals data, only metadata.
> 
>>> That's not true, is it? ext3 and ext4 support journal=data.
> 
> They do, because they journal blocks, which is not generally a
> great choice, but gives the option to journal data blocks too more
> easily than other choices. But it is a very special case that few
> people use.

Few use it because the performance is absolutely horrible.  data=journal
disables delayed allocation (which seriously contributes to any modern
filesystem's performance--EXT devs stole/borrowed delayed allocation
from XFS BTW) and it disables O_DIRECT.  It also doubles the number of
data writes to media, once to the journal, once to the filesystem, for
every block of every file written.

> On a more general note, journaling and barriers are sort of
> distinct issues.
> 
> The real purpose of barriers is to ensure that updates are
> actually on the recording medium, whether in the journal or
> directly on final destination.
> That is barriers are used to ensure that data or metadata on the
> persistent layer is current.

Correct.  Again, trying to stay out of the weeds.  I'd established that
XFS uses barriers on journal writes for metadata consistency, which
prevents filesystem corruption after a crash, but not necessarily file
corruption.  Making the statement that XFS doesn't journal data gets the
point across more quickly, while staying out of the weeds.

[...]

> The 'freeze' features of XFS does not rely on snapshotting, it
> relies on suspending all processes that are writing to the
> filetree, so updates are avoided for the duration.

The freeze interface behind xfs_freeze was moved into the VFS in 2.6.29
and is called automatically when doing an LVM snapshot of any Linux FS
supporting such.  Thus, snapshotting relies on freezing, not the other
way round.  And xfs_freeze doesn't suspend all processes that are writing
to the filesystem.  Write system calls to the filesystem are simply
halted, and each calling process blocks on I/O until the filesystem is
unfrozen.
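
To make the freeze/thaw mechanics concrete, here is a rough sketch of
driving the VFS freeze interface directly through the FIFREEZE and FITHAW
ioctls.  The helper names _IOWR and frozen are hypothetical.  The request
numbers are computed with the asm-generic ioctl encoding used on x86 and
ARM (an assumption: some other architectures encode the direction bits
differently), and the calls need root privileges.

```python
import fcntl
import os
from contextlib import contextmanager


def _IOWR(type_char, nr, size):
    # asm-generic layout: direction (2 bits) | size (14) | type (8) | nr (8).
    # Direction 3 means read|write.
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# From linux/fs.h: _IOWR('X', 119, int) and _IOWR('X', 120, int).
FIFREEZE = _IOWR("X", 119, 4)
FITHAW = _IOWR("X", 120, 4)


@contextmanager
def frozen(mountpoint):
    """Freeze the filesystem mounted at `mountpoint` for the with-block.

    Writers block in the kernel until the thaw.  Do not freeze the
    filesystem this script itself needs to write to.
    """
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
        try:
            yield
        finally:
            # Always thaw, even if the body raised.
            fcntl.ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)
```

Usage would be something like "with frozen('/mnt/data'): take_snapshot()",
where both the mount point and take_snapshot are placeholders.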

> As the XFS team have been adding or planning to add various "new"
> features like checksums, maybe one day they will add COW to XFS
> too (not such an easy task when considering how large XFS extents
> can be, but the hole punching code can help there).

Not at all an easy rewrite of XFS.  And that's what COW would be, a
massive rewrite.  Copy on write definitely has some advantages for some
usage scenarios, but it has not yet been proven to be the holy grail of
filesystem design.

-- 
Stan

