linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rob Landley <rob@landley.net>
To: Theodore Tso <tytso@mit.edu>
Cc: Pavel Machek <pavel@ucw.cz>, Rik van Riel <riel@redhat.com>,
	Ric Wheeler <rwheeler@redhat.com>,
	Florian Weimer <fweimer@bfk.de>,
	Goswin von Brederlow <goswin-v-b@web.de>,
	kernel list <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>,
	mtk.manpages@gmail.com, rdunlap@xenotime.net,
	linux-doc@vger.kernel.org, linux-ext4@vger.kernel.org,
	corbet@lwn.net
Subject: Re: [patch] ext2/3: document conditions when reliable operation is possible
Date: Thu, 27 Aug 2009 01:06:13 -0500	[thread overview]
Message-ID: <200908270106.15032.rob@landley.net> (raw)
In-Reply-To: <20090826122813.GI32712@mit.edu>

On Wednesday 26 August 2009 07:28:13 Theodore Tso wrote:
> On Wed, Aug 26, 2009 at 01:17:52PM +0200, Pavel Machek wrote:
> > > Metadata takes up such a small part of the disk that fscking
> > > it and finding it to be OK is absolutely no guarantee that
> > > the data on the filesystem has not been horribly mangled.
> > >
> > > Personally, what I care about is my data.
> > >
> > > The metadata is just a way to get to my data, while the data
> > > is actually important.
> >
> > Personally, I care about metadata consistency, and ext3 documentation
> > suggests that journal protects its integrity. Except that it does not
> > on broken storage devices, and you still need to run fsck there.
>
> Caring about metadata consistency and not data is just weird, I'm
> sorry.  I can't imagine anyone who actually *cares* about what they
> have stored, whether it's digital photographs of child taking a first
> step, or their thesis research, caring about more about the metadata
> than the data.  Giving advice that pretends that most users have that
> priority is Just Wrong.

I thought the reason for that was that if your metadata is horked, further 
writes to the disk can trash unrelated existing data because it's lost track 
of what's allocated and what isn't.  So back when the assumption was "what's 
written stays written", then keeping the metadata sane was still darn 
important to prevent normal operation from overwriting unrelated existing 
data.

Then Pavel notified us of a situation where interrupted writes to the disk can 
trash unrelated existing data _anyway_, because the flash block size on the 16 
gig flash key I bought retail at Fry's is 2 megabytes, and the filesystem thinks 
it's 4k or smaller.  It seems like what _broke_ was the assumption that the 
filesystem block size >= the disk block size, and nobody noticed for a while.  
(Except the people making jffs2 and friends, anyway.)

Today we have cheap plentiful USB keys that act like hard drives, except that 
their write block size isn't remotely the same as hard drives', but they 
pretend it is, and then the block wear levelling algorithms fuzz things 
further.  (Gee, a drive controller lying about drive geometry, the scsi crowd 
should feel right at home.)

Now Pavel's coming back with a second situation where RAID stripes (under 
certain circumstances) seem to have similar granularity issues, again breaking 
what seems to be the same assumption.  Big media use big chunks for data, and 
media is getting bigger.  It doesn't seem like this problem is going to 
diminish in future.

I agree that it seems like a good idea to have BIG RED WARNING SIGNS about 
those kind of media and how _any_ journaling filesystem doesn't really help 
here.  So specifically documenting "These kinds of media lose unrelated random 
data if writes to them are interrupted, journaling filesystems can't help with 
this and may actually hide the problem, and even an fsck will only find 
corrupted metadata not lost file contents" seems kind of useful.

That said, ext3's assumption that filesystem block size always >= disk update 
block size _is_ a fundamental part of this problem, and one that isn't shared 
by things like jffs2, and which things like btrfs might be able to address if 
they try, by adding awareness of the real media update granularity to their 
node layout algorithms.  (Heck, ext2 has a stripe size parameter already.  
Does setting that appropriately for your raid make this suck less?  I haven't 
heard anybody comment on that one yet...) 

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

  reply	other threads:[~2009-08-27  6:06 UTC|newest]

Thread overview: 269+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-12  9:21 ext2/3: document conditions when reliable operation is possible Pavel Machek
2009-03-12 11:40 ` Jochen Voß
2009-03-21 11:24   ` Pavel Machek
2009-03-12 19:13 ` Rob Landley
2009-03-16 12:28   ` Pavel Machek
2009-03-16 19:26     ` Rob Landley
2009-03-23 10:45       ` Pavel Machek
2009-03-30 15:06         ` Goswin von Brederlow
     [not found]           ` <20090824093143.GD25591@elf.ucw.cz>
2009-08-24 11:19             ` [patch] " Florian Weimer
2009-08-24 13:01               ` Theodore Tso
2009-08-24 14:55                 ` Artem Bityutskiy
2009-08-24 22:30                   ` Rob Landley
2009-08-24 19:52                 ` Pavel Machek
2009-08-24 20:24                   ` Ric Wheeler
2009-08-24 20:52                     ` Pavel Machek
2009-08-24 21:08                       ` Ric Wheeler
2009-08-24 21:25                         ` Pavel Machek
2009-08-24 22:05                           ` Ric Wheeler
2009-08-24 22:22                             ` Zan Lynx
2009-08-24 22:44                               ` Pavel Machek
2009-08-25  0:34                                 ` Ric Wheeler
2009-08-24 23:42                               ` david
2009-08-24 22:41                             ` Pavel Machek
2009-08-24 22:39                           ` Theodore Tso
2009-08-24 23:00                             ` Pavel Machek
     [not found]                             ` <20090824230036.GK29763@elf.ucw.cz>
2009-08-25  0:02                               ` david
2009-08-25  9:32                                 ` Pavel Machek
2009-08-25  0:06                               ` Ric Wheeler
2009-08-25  9:34                                 ` Pavel Machek
2009-08-25 15:34                                   ` david
2009-08-26  3:32                                   ` Rik van Riel
2009-08-26 11:17                                     ` Pavel Machek
2009-08-26 11:29                                       ` david
2009-08-26 13:10                                         ` Pavel Machek
2009-08-26 13:43                                           ` david
2009-08-26 18:02                                             ` Theodore Tso
2009-08-27  6:28                                               ` Eric Sandeen
2009-11-09  8:53                                               ` periodic fsck was " Pavel Machek
     [not found]                                               ` <20091109085318.GE4818@elf.ucw.cz>
2009-11-09 14:05                                                 ` Theodore Tso
2009-11-09 15:58                                                   ` Andreas Dilger
2009-08-30  7:03                                             ` Pavel Machek
2009-08-26 12:28                                       ` Theodore Tso
2009-08-27  6:06                                         ` Rob Landley [this message]
2009-08-27  6:54                                           ` david
2009-08-27  7:34                                             ` Rob Landley
2009-08-28 14:37                                               ` david
2009-08-30  7:19                                             ` Pavel Machek
2009-08-30 12:48                                               ` david
2009-08-27  5:27                                     ` Rob Landley
2009-08-25  0:08                               ` Theodore Tso
2009-08-25  9:42                                 ` Pavel Machek
     [not found]                                 ` <20090825094244.GC15563@elf.ucw.cz>
2009-08-25 13:37                                   ` Ric Wheeler
2009-08-25 13:42                                     ` Alan Cox
2009-08-27  3:16                                       ` Rob Landley
2009-08-25 21:15                                     ` Pavel Machek
2009-08-25 22:42                                       ` Ric Wheeler
2009-08-25 22:51                                         ` Pavel Machek
2009-08-25 23:03                                           ` david
2009-08-25 23:29                                             ` Pavel Machek
2009-08-25 23:03                                           ` Ric Wheeler
2009-08-25 23:26                                             ` Pavel Machek
2009-08-25 23:40                                               ` Ric Wheeler
2009-08-25 23:48                                                 ` david
2009-08-25 23:53                                                 ` Pavel Machek
2009-08-26  0:11                                                   ` Ric Wheeler
2009-08-26  0:16                                                     ` Pavel Machek
2009-08-26  0:31                                                       ` Ric Wheeler
2009-08-26  1:00                                                         ` Theodore Tso
2009-08-26  1:15                                                           ` Ric Wheeler
2009-08-26  1:16                                                           ` Pavel Machek
2009-08-26  2:53                                                           ` Henrique de Moraes Holschuh
     [not found]                                                           ` <20090826011605.GS4300@elf.ucw.cz>
2009-08-26  2:55                                                             ` Theodore Tso
2009-08-26 13:37                                                               ` Ric Wheeler
     [not found]                                                           ` <4A948C94.7040103@redhat.com>
2009-08-26  2:58                                                             ` Theodore Tso
2009-08-26 10:39                                                               ` Ric Wheeler
     [not found]                                                               ` <4A9510D2.1090704@redhat.com>
2009-08-26 11:12                                                                 ` Pavel Machek
2009-08-26 11:28                                                                   ` david
2009-08-29  9:49                                                                     ` [testcase] test your fs/storage stack (was Re: [patch] ext2/3: document conditions when reliable operation is possible) Pavel Machek
2009-08-29 11:28                                                                       ` Ric Wheeler
2009-09-02 20:12                                                                         ` Pavel Machek
2009-09-02 20:42                                                                           ` Ric Wheeler
2009-09-02 23:00                                                                             ` Rob Landley
2009-09-02 23:09                                                                               ` david
2009-09-03  8:55                                                                                 ` Pavel Machek
2009-09-03  0:36                                                                               ` jim owens
2009-09-03  2:41                                                                                 ` Rob Landley
2009-09-03 14:14                                                                                   ` jim owens
2009-09-04  7:44                                                                                     ` Rob Landley
2009-09-04 11:49                                                                                       ` Ric Wheeler
2009-09-05 10:28                                                                                         ` Pavel Machek
2009-09-05 12:20                                                                                           ` Ric Wheeler
2009-09-05 13:54                                                                                           ` Jonathan Corbet
2009-09-05 21:27                                                                                             ` Pavel Machek
2009-09-05 21:56                                                                                               ` Theodore Tso
2009-09-02 22:45                                                                           ` Rob Landley
2009-09-02 22:49                                                                           ` [PATCH] Update Documentation/md.txt to mention journaling won't help dirty+degraded case Rob Landley
2009-09-03  9:08                                                                             ` Pavel Machek
2009-09-03 12:05                                                                             ` Ric Wheeler
2009-09-03 12:31                                                                               ` Pavel Machek
2009-08-29 16:35                                                                       ` [testcase] test your fs/storage stack (was Re: [patch] ext2/3: document conditions when reliable operation is possible) david
2009-08-30  7:07                                                                         ` Pavel Machek
2009-08-26 12:01                                                                   ` [patch] ext2/3: document conditions when reliable operation is possible Ric Wheeler
2009-08-26 12:23                                                                   ` Theodore Tso
2009-08-30  7:01                                                                     ` Pavel Machek
2009-08-27  5:19                                                               ` Rob Landley
2009-08-27 12:24                                                                 ` Theodore Tso
2009-08-27 13:10                                                                   ` Ric Wheeler
     [not found]                                                                   ` <4A9685D4.2070906@redhat.com>
2009-08-27 16:54                                                                     ` MD/DM and barriers (was Re: [patch] ext2/3: document conditions when reliable operation is possible) Jeff Garzik
2009-08-27 18:09                                                                       ` Alasdair G Kergon
2009-09-01 14:01                                                                       ` Pavel Machek
2009-09-02 16:17                                                                         ` Michael Tokarev
2009-08-29 10:02                                                                   ` [patch] ext2/3: document conditions when reliable operation is possible Pavel Machek
2009-09-03  9:47                                                           ` Pavel Machek
2009-08-26  3:50                                                   ` Rik van Riel
2009-08-27  3:53                                                 ` Rob Landley
2009-08-27 11:43                                                   ` Ric Wheeler
2009-08-27 20:51                                                     ` Rob Landley
2009-08-27 22:00                                                       ` Ric Wheeler
2009-08-28 14:49                                                       ` david
2009-08-29 10:05                                                         ` Pavel Machek
2009-08-29 20:22                                                           ` Rob Landley
2009-08-29 21:34                                                             ` Pavel Machek
2009-09-03 16:56                                                             ` what fsck can (and can't) do was " david
2009-09-03 19:27                                                               ` Theodore Tso
2009-08-27 22:13                                                     ` raid is dangerous but that's secret (was Re: [patch] ext2/3: document conditions when reliable operation is possible) Pavel Machek
2009-08-28  1:32                                                       ` Ric Wheeler
2009-08-28  6:44                                                         ` Pavel Machek
2009-08-28  7:31                                                           ` NeilBrown
2009-11-09 10:50                                                             ` Pavel Machek
2009-08-28 11:16                                                           ` Ric Wheeler
2009-09-01 13:58                                                             ` Pavel Machek
2009-08-28  7:11                                                         ` raid is dangerous but that's secret Florian Weimer
2009-08-28  7:23                                                           ` NeilBrown
2009-08-28 12:08                                                         ` raid is dangerous but that's secret (was Re: [patch] ext2/3: document conditions when reliable operation is possible) Theodore Tso
2009-08-30  7:51                                                           ` Pavel Machek
     [not found]                                                           ` <20090830075135.GA1874@ucw.cz>
2009-08-30  9:01                                                             ` Christian Kujau
2009-09-02 20:55                                                               ` Pavel Machek
2009-08-30 12:55                                                             ` david
2009-08-30 14:12                                                               ` Ric Wheeler
2009-08-30 14:44                                                                 ` Michael Tokarev
2009-08-30 16:10                                                                   ` Ric Wheeler
2009-08-30 16:35                                                                   ` Christoph Hellwig
2009-08-31 13:15                                                                     ` Ric Wheeler
2009-08-31 13:16                                                                       ` Christoph Hellwig
2009-08-31 13:19                                                                         ` Mark Lord
2009-08-31 13:21                                                                           ` Christoph Hellwig
2009-08-31 15:14                                                                             ` jim owens
2009-09-03  1:59                                                                             ` Ric Wheeler
2009-09-03 11:12                                                                               ` Krzysztof Halasa
2009-09-03 11:18                                                                                 ` Ric Wheeler
2009-09-03 13:34                                                                                   ` Krzysztof Halasa
2009-09-03 13:50                                                                                     ` Ric Wheeler
2009-09-03 13:59                                                                                       ` Krzysztof Halasa
2009-09-03 14:15                                                                                         ` wishful thinking about atomic, multi-sector or full MD stripe width, writes in storage Ric Wheeler
2009-09-03 14:26                                                                                           ` Florian Weimer
2009-09-03 15:09                                                                                             ` Ric Wheeler
2009-09-03 23:50                                                                                           ` Krzysztof Halasa
2009-09-04  0:39                                                                                             ` Ric Wheeler
2009-09-04 21:21                                                                                           ` Mark Lord
2009-09-04 21:29                                                                                             ` Ric Wheeler
2009-09-05 12:57                                                                                               ` Mark Lord
2009-09-05 13:40                                                                                                 ` Ric Wheeler
2009-09-05 21:43                                                                                                 ` NeilBrown
2009-09-07 11:45                                                                                           ` Pavel Machek
2009-09-07 13:10                                                                                             ` Theodore Tso
2010-04-04 13:47                                                                                               ` fsck more often when powerfail is detected (was Re: wishful thinking about atomic, multi-sector or full MD stripe width, writes in storage) Pavel Machek
2010-04-04 17:39                                                                                                 ` tytso
2010-04-04 17:59                                                                                                 ` Rob Landley
2010-04-04 18:45                                                                                                   ` Pavel Machek
2010-04-04 19:35                                                                                                     ` tytso
2010-04-04 19:29                                                                                                   ` tytso
2010-04-04 23:58                                                                                                     ` Rob Landley
2009-09-03 14:35                                                                                     ` raid is dangerous but that's secret (was Re: [patch] ext2/3: document conditions when reliable operation is possible) david
2009-08-31 13:22                                                                         ` Ric Wheeler
2009-08-31 15:50                                                                           ` david
2009-08-31 16:21                                                                             ` Ric Wheeler
2009-08-31 18:31                                                                             ` Christoph Hellwig
2009-08-31 19:11                                                                               ` david
2009-08-30 15:05                                                               ` Pavel Machek
2009-08-30 15:20                                                             ` Theodore Tso
2009-08-31 17:49                                                               ` Jesse Brandeburg
     [not found]                                                               ` <4807377b0908311049id9a2167r937bc8447c2b3546@mail.gmail.com>
2009-08-31 18:01                                                                 ` Ric Wheeler
2009-08-31 21:01                                                                   ` MD5/6? (was Re: raid is dangerous but that's secret ...) Ron Johnson
2009-08-31 18:07                                                                 ` raid is dangerous but that's secret (was Re: [patch] ext2/3: document conditions when reliable operation is possible) martin f krafft
2009-08-31 22:26                                                                   ` Jesse Brandeburg
2009-08-31 23:19                                                                     ` Ron Johnson
2009-09-01  5:45                                                                     ` martin f krafft
2009-09-05 10:34                                                               ` Pavel Machek
2009-08-25 23:46                                               ` [patch] ext2/3: document conditions when reliable operation is possible david
2009-08-25 23:08                                       ` Neil Brown
2009-08-25 23:44                                         ` Pavel Machek
2009-08-26  4:08                                           ` Rik van Riel
2009-08-26 11:15                                             ` Pavel Machek
2009-08-27  3:29                                               ` Rik van Riel
2009-08-25 16:11                                   ` Theodore Tso
2009-08-25 22:21                                     ` [patch] document flash/RAID dangers Pavel Machek
2009-08-25 22:33                                       ` david
2009-08-25 22:40                                         ` Pavel Machek
2009-08-25 22:59                                           ` david
2009-08-25 23:37                                             ` Pavel Machek
2009-08-25 23:48                                               ` Ric Wheeler
2009-08-26  0:06                                                 ` Pavel Machek
2009-08-26  0:12                                                   ` Ric Wheeler
2009-08-26  0:20                                                     ` Pavel Machek
2009-08-26  0:26                                                       ` david
2009-08-26  0:28                                                       ` Ric Wheeler
2009-08-26  0:38                                                         ` Pavel Machek
2009-08-26  0:45                                                           ` Ric Wheeler
2009-08-26 11:21                                                             ` Pavel Machek
2009-08-26 11:58                                                               ` Ric Wheeler
2009-08-26 12:40                                                                 ` Theodore Tso
2009-08-26 13:11                                                                   ` Ric Wheeler
     [not found]                                                                   ` <4A95349E.7010101@redhat.com>
2009-08-26 13:44                                                                     ` david
2009-08-29  9:38                                                                 ` Pavel Machek
2009-08-26  4:24                                                       ` Rik van Riel
2009-08-26 11:22                                                         ` Pavel Machek
2009-08-26 14:45                                                           ` Rik van Riel
2009-08-29  9:39                                                             ` Pavel Machek
2009-08-29 11:47                                                               ` Ron Johnson
2009-08-29 16:12                                                                 ` jim owens
2009-08-25 23:56                                               ` david
2009-08-26  0:12                                                 ` Pavel Machek
2009-08-26  0:20                                                   ` david
2009-08-26  0:39                                                     ` Pavel Machek
2009-08-26  1:17                                                       ` david
2009-08-26  0:26                                                   ` Ric Wheeler
2009-08-26  0:44                                                     ` Pavel Machek
2009-08-26  0:50                                                       ` Ric Wheeler
2009-08-26  1:19                                                       ` david
2009-08-26 11:25                                                         ` Pavel Machek
2009-08-26 12:37                                                           ` Theodore Tso
2009-08-30  6:49                                                             ` Pavel Machek
2009-08-26  4:20                                           ` Rik van Riel
2009-08-25 22:27                                     ` [patch] document that ext2 can't handle barriers Pavel Machek
2009-08-27  3:34                                 ` [patch] ext2/3: document conditions when reliable operation is possible Rob Landley
2009-08-27  8:46                                 ` David Woodhouse
2009-08-28 14:46                                   ` david
2009-08-29 10:09                                     ` Pavel Machek
2009-08-29 16:27                                       ` david
2009-08-29 21:33                                         ` Pavel Machek
2009-08-25 22:58                             ` Neil Brown
2009-08-25 23:10                               ` Ric Wheeler
2009-08-25 23:32                                 ` NeilBrown
2009-08-24 21:11                       ` Greg Freemyer
2009-08-25 20:56                         ` Rob Landley
2009-08-25 21:08                           ` david
2009-08-25 18:52                     ` Rob Landley
2009-08-25 14:43                 ` Florian Weimer
2009-08-24 13:50               ` Theodore Tso
2009-08-24 18:48                 ` Pavel Machek
2009-08-24 18:39               ` Pavel Machek
2009-08-24 13:21             ` Greg Freemyer
2009-08-24 18:44               ` Pavel Machek
2009-08-25 23:28               ` Neil Brown
2009-08-26  1:34                 ` david
2009-08-24 21:11             ` Rob Landley
2009-08-24 21:33               ` Pavel Machek
2009-08-25 18:45                 ` Jan Kara
2009-03-16 12:30   ` Pavel Machek
2009-03-16 19:03     ` Theodore Tso
2009-03-23 18:23       ` Pavel Machek
2009-03-16 19:40     ` Sitsofe Wheeler
2009-03-16 21:43       ` Rob Landley
2009-03-17  4:55         ` Kyle Moffett
2009-03-23 11:00       ` Pavel Machek
2009-08-29  1:33   ` Robert Hancock
2009-08-29 13:04     ` Alan Cox
2009-03-16 19:45 ` Greg Freemyer
2009-03-16 21:48   ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200908270106.15032.rob@landley.net \
    --to=rob@landley.net \
    --cc=akpm@osdl.org \
    --cc=corbet@lwn.net \
    --cc=fweimer@bfk.de \
    --cc=goswin-v-b@web.de \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=pavel@ucw.cz \
    --cc=rdunlap@xenotime.net \
    --cc=riel@redhat.com \
    --cc=rwheeler@redhat.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).