linux-fsdevel.vger.kernel.org archive mirror
From: Eric Sandeen <sandeen@redhat.com>
To: Eric Sandeen <sandeen@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/4] (RESEND) ext3[34] barrier changes
Date: Fri, 16 May 2008 17:21:26 -0500
Message-ID: <482E08E6.4030507@redhat.com>
In-Reply-To: <20080516220315.GB15334@shareable.org>

Jamie Lokier wrote:
> Eric Sandeen wrote:
>>> If we were seeing a significant number of "hey, my disk got wrecked"
>>> reports which are attributable to this then yes, perhaps we should change
>>> the default.  But I've never seen _any_, although I've seen claims that
>>> others have seen reports.
>> Hm, how would we know, really?  What does it look like?  It'd totally
>> depend on what got lost...  When do you find out?  Again depends what
>> you're doing, I think.  I'll admit that I don't have any good evidence
>> of my own.  I'll go off and do some plug-pull-testing and a benchmark or
>> two.
> 
> You have to pull the plug quite a lot, while there is data in write
> cache, and when the data is something you will notice later.
> 
> Checking filesystem is hard.  Something systematic would be good - for
> which you will want an electronically controlled power switch.

Right, that was the plan.  I wasn't really going to stand there and pull
the plug.  :)  I'd like to get to "out of $NUMBER power-loss events
under this usage, I saw $THIS corruption $THISMANY times ..."

> I have seen corruption which I believe is from lack of barriers, and
> hasn't occurred since I implemented them (or disabled write cache).
> But it's hard to be sure that was the real cause.
> 
> If you just want to test the block I/O layer and drive itself, don't
> use the filesystem, but write a program which just accesses the block
> device, continuously writing with/without barriers every so often, and
> after power cycle read back to see what was and wasn't written.

Well, I think it is worth testing through the filesystem, different
journaling mechanisms will probably react^wcorrupt in different ways.
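
A minimal sketch of the raw read-back test Jamie describes above, in Python. Everything here is an illustrative assumption rather than anything posted in the thread: the block layout, the helper names (make_block, write_run, read_back), and the use of fsync() as a stand-in for a barrier, since userspace cannot issue block-layer barriers directly. Each block carries a sequence number and a CRC so that, after a power cycle, a reader can tell which writes reached stable storage intact.

```python
import os
import struct
import zlib

BLOCK_SIZE = 4096
HEADER = struct.Struct("<QI")            # sequence number, CRC32 of payload
PAYLOAD_LEN = ((BLOCK_SIZE - HEADER.size) // 8) * 8


def make_block(seq):
    """Build one self-describing block stamped with its sequence number."""
    payload = struct.pack("<Q", seq) * (PAYLOAD_LEN // 8)
    block = HEADER.pack(seq, zlib.crc32(payload)) + payload
    return block.ljust(BLOCK_SIZE, b"\0")


def write_run(path, positions, sync_every=0):
    """Write stamped blocks at the given block positions.

    sync_every > 0 calls fsync() after every that many writes (the
    hypothetical stand-in for a barrier); 0 never syncs.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for seq, block in enumerate(positions, start=1):
            os.pwrite(fd, make_block(seq), block * BLOCK_SIZE)
            if sync_every and seq % sync_every == 0:
                os.fsync(fd)
    finally:
        os.close(fd)


def read_back(path, positions):
    """Return the sequence numbers whose blocks are present and intact."""
    survived = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for seq, block in enumerate(positions, start=1):
            data = os.pread(fd, BLOCK_SIZE, block * BLOCK_SIZE)
            if len(data) < BLOCK_SIZE:
                continue
            stored_seq, crc = HEADER.unpack_from(data)
            payload = data[HEADER.size:HEADER.size + PAYLOAD_LEN]
            if stored_seq == seq and zlib.crc32(payload) == crc:
                survived.append(seq)
    finally:
        os.close(fd)
    return survived
```

In a real run the path would be a scratch block device, write_run() would be interrupted by the power switch, and read_back() would be called after reboot with the same position list. Any gap in the surviving sequence numbers below the last synced one would point at writes lost from the drive cache.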

> I think there may be drives which won't show any effect - if they have
> enough internal power (from platter inertia) to write everything in
> the cache before losing it.

... and those with flux capacitors.  ;)  I've heard of this mechanism
but I'm not sure I believe it is present in any modern drive.  Not sure
the Seagates of the world will tell us, either ....

> If you want to test, the worst case is to queue many small writes at
> seek positions across the disk, so that flushing the disk's write
> cache takes the longest time.  A good pattern might be to take numbers
> 0..2^N-1 (e.g. 0..255), for each number reverse the bit order (0, 128,
> 64, 192...) and do writes at those block positions, scaling up to the
> range of the whole disk.  The idea is if the disk just caches the last
> few queued, they will always be quite spread out.

I suppose we could go about it 2 ways; come up with something diabolical
and try very hard to break it (I think we know that we can) or do
something more realistic (like untarring & building a kernel?) and see
what happens in that case...
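
For reference, the bit-reversed seek pattern suggested in the quoted text can be sketched in a few lines of Python. The function names and the total_blocks parameter are assumptions made for illustration; only the 0, 128, 64, 192... ordering comes from the description above.

```python
def bit_reverse(value, bits):
    """Reverse the low `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def seek_pattern(bits, total_blocks):
    """Block positions in bit-reversed order, scaled to the disk size.

    Consecutive entries land far apart, so whatever tail of the queue
    the drive still holds in cache is spread across the platter.
    """
    count = 1 << bits
    return [bit_reverse(i, bits) * total_blocks // count
            for i in range(count)]


# With 8 bits on a 256-block range the order starts 0, 128, 64, 192.
print(seek_pattern(8, 256)[:4])
```

Passing the real block count of the device as total_blocks stretches the same ordering across the whole disk.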

> The MacOS X folks decided that speed is most important for fsync().
> fsync() does not guarantee commit to platter.  *But* they added an
> fcntl() for applications to request a commit to platter, which SQLite
> at least uses.  I don't know if MacOS X uses barriers for filesystem
> operations.

heh, reminds me of xfs's "osyncisosync" option ;)
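
The fcntl() mentioned in the quoted text is, as far as I know, F_FULLFSYNC. A hedged sketch of how an application might ask for a commit to platter where that command exists and fall back to plain fsync() elsewhere; the helper name commit_to_platter is hypothetical, and whether fsync() itself reaches the platter on a given system is exactly the open question in this thread.

```python
import fcntl
import os


def commit_to_platter(fd):
    """Request the strongest flush available for an open file descriptor.

    Uses F_FULLFSYNC where the platform's fcntl module defines it
    (MacOS X), otherwise falls back to fsync().  Returns the name of
    the mechanism used.
    """
    full_sync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_sync is not None:
        fcntl.fcntl(fd, full_sync)
        return "F_FULLFSYNC"
    os.fsync(fd)
    return "fsync"
```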

>> and install by default on lvm which won't pass barriers anyway.
> 
> Considering how many things depend on LVM not passing barriers, that
> is scary.  People use software RAID assuming integrity.  They are
> immune to many hardware faults.  But it turns out, on Linux, that a
> single disk can have higher integrity against power failure than a
> RAID.

FWIW...  md also only does it on raid1... but lvm with a single device
or mirror underneath really *should* IMHO...

>> So maybe it's hypocritical to send this patch from redhat.com :)
> 
> So send the patch to fix LVM too :-)

hehe, I'll try ... ;)

>> And as another "who uses barriers" datapoint, reiserfs & xfs both have
>> them on by default.
> 
> Are they noticeably slower than ext3?  If not, it suggests ext3 can be
> fixed to keep its performance with barriers enabled.

Well it all depends on what you're testing (and the hardware you're
testing it on).  Between ext3 & xfs you can find plenty of tests which
will show either one or the other as faster.  And most benchmark results
out there probably don't state whether barriers were in force or not.

> Specifically: under some workloads, batching larger changes into the
> journal between commit blocks might compensate.  Maybe the journal has
> been tuned for barriers off because they are by default?
> 
>> I suppose alternately I could send another patch to remove "remember
>> that ext3/4 by default offers higher data integrity guarantees than
>> most." from Documentation/filesystems/ext4.txt  ;)
> 
> It would be fair.  I suspect a fair number of people are under the
> impression ext3 uses barriers with no special options, prior to this
> thread.  It was advertised as a feature in development during the 2.5
> series.

It certainly does come into play in benchmarking scenarios...

-Eric
