Dump corrupts ext2?

public inbox for linux-kernel@vger.kernel.org

* Dump corrupts ext2?
@ 2001-10-10 23:03 Lew Wolfgang
  2001-10-10 23:11 ` Doug McNaught
  2001-10-10 23:28 ` H. Peter Anvin
  0 siblings, 2 replies; 15+ messages in thread
From: Lew Wolfgang @ 2001-10-10 23:03 UTC
  To: linux-kernel

Hi Folks,

I was looking for some scripts to backup ext2 partitions
to multiple CDR's when I stumbled onto "cdbackup" at
http://www.cableone.net/ccondit/cdbackup/.

Alas, there is a warning saying:

"WARNING! When using this program under Linux, be sure not to use
 dump with kernels in the 2.4.x series. Using dump on an ext2
 filesystem has a very high potential for causing filesystem
 corruption.  As of kernel version 2.4.5, this has not been
 resolved, and it may not be for some time."

I don't recall any problems like this, does anyone have
additional comments?

Regards,
Lew Wolfgang


* Re: Dump corrupts ext2?
  2001-10-10 23:03 Dump corrupts ext2? Lew Wolfgang
@ 2001-10-10 23:11 ` Doug McNaught
  2001-10-10 23:34   ` Andreas Dilger
  2001-10-11  0:38   ` Mike Fedyk
  2001-10-10 23:28 ` H. Peter Anvin
  1 sibling, 2 replies; 15+ messages in thread
From: Doug McNaught @ 2001-10-10 23:11 UTC
  To: Lew Wolfgang; +Cc: linux-kernel

Lew Wolfgang <wolfgang@sweet-haven.com> writes:

> Hi Folks,
> 
> I was looking for some scripts to backup ext2 partitions
> to multiple CDR's when I stumbled onto "cdbackup" at
> http://www.cableone.net/ccondit/cdbackup/.
> 
> Alas, there is a warning saying:
> 
> "WARNING! When using this program under Linux, be sure not to use
>  dump with kernels in the 2.4.x series. Using dump on an ext2
>  filesystem has a very high potential for causing filesystem
>  corruption.  As of kernel version 2.4.5, this has not been
>  resolved, and it may not be for some time."
> 
> I don't recall any problems like this, does anyone have
> additional comments?

I'm pretty sure this is because dump reads the block device directly
(which is cached in the buffer cache), while the file data for cached
files lives in the page cache, and the two caches are no longer
coherent (as of 2.4).

If you can find it, Linus has ranted on this list at least once about
why you should never use 'dump'...

If you're doing backups under 2.4, use tar or cpio.
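
A minimal sketch of the file-level approach, in Python purely for
illustration.  Python's tarfile module stands in for the tar command, and
the function name and paths are made up; the point is only that the archive
is built by reading files through the mounted filesystem rather than from
the block device.

```python
import os
import tarfile
import tempfile


def backup_tree(src_dir, archive_path):
    """Archive src_dir through the filesystem API; return the member names."""
    top = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # Every file is opened and read via the VFS, so the data comes from
        # the same cache that applications writing the files use.
        tar.add(src_dir, arcname=top)
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    # Hypothetical demo tree; a real backup would point at a mount point.
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "home")
        os.mkdir(src)
        with open(os.path.join(src, "notes.txt"), "w") as f:
            f.write("hello\n")
        print(backup_tree(src, os.path.join(work, "home.tar.gz")))
```

Reading files this way may update their atime unless the filesystem is
mounted with noatime.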

-Doug
-- 
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863


* Re: Dump corrupts ext2?
  2001-10-10 23:11 ` Doug McNaught
@ 2001-10-10 23:34   ` Andreas Dilger
  2001-10-10 23:55     ` Doug McNaught
                       ` (2 more replies)
  2001-10-11  0:38   ` Mike Fedyk
  1 sibling, 3 replies; 15+ messages in thread
From: Andreas Dilger @ 2001-10-10 23:34 UTC
  To: Doug McNaught; +Cc: Lew Wolfgang, linux-kernel

On Oct 10, 2001  19:11 -0400, Doug McNaught wrote:
> Lew Wolfgang <wolfgang@sweet-haven.com> writes:
> > I was looking for some scripts to backup ext2 partitions
> > to multiple CDR's when I stumbled onto "cdbackup" at
> > http://www.cableone.net/ccondit/cdbackup/.
> > 
> > Alas, there is a warning saying:
> > 
> > "WARNING! When using this program under Linux, be sure not to use
> >  dump with kernels in the 2.4.x series. Using dump on an ext2
> >  filesystem has a very high potential for causing filesystem
> >  corruption.  As of kernel version 2.4.5, this has not been
> >  resolved, and it may not be for some time."
> 
> I'm pretty sure this is because dump reads the block device directly
> (which is cached in the buffer cache), while the file data for cached
> files lives in the page cache, and the two caches are no longer
> coherent (as of 2.4).

In Linus kernels 2.4.11+ the block devices and filesystems all use the
page cache, so no more coherency issues.

Also, I don't think this ever had the potential to corrupt the filesystem,
but maybe make a slightly bad backup.

> If you can find it, Linus has ranted on this list at least once about
> why you should never use 'dump'...

Yes, but the only issue is if the filesystem is busy, you may get
a bad backup for those files that have changed, but not for any files
that have not changed during the backup.

Reasons for not using tar or cpio include atime change and the fact
that an "incremental" tar can't record the deletion of a file (AFAIK).

Cheers, Andreas
--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert



* Re: Dump corrupts ext2?
  2001-10-10 23:34   ` Andreas Dilger
@ 2001-10-10 23:55     ` Doug McNaught
  2001-10-11  1:33     ` Richard Gooch
  2001-10-11  2:57     ` H. Peter Anvin
  2 siblings, 0 replies; 15+ messages in thread
From: Doug McNaught @ 2001-10-10 23:55 UTC
  To: Andreas Dilger; +Cc: Lew Wolfgang, linux-kernel

Andreas Dilger <adilger@turbolabs.com> writes:

> On Oct 10, 2001  19:11 -0400, Doug McNaught wrote:
> > I'm pretty sure this is because dump reads the block device directly
> > (which is cached in the buffer cache), while the file data for cached
> > files lives in the page cache, and the two caches are no longer
> > coherent (as of 2.4).
> 
> In Linus kernels 2.4.11+ the block devices and filesystems all use the
> page cache, so no more coherency issues.

You're right, of course.  But for most of the lifetime of 2.4 the
above was true...

> Also, I don't think this ever had the potential to corrupt the filesystem,
> but maybe make a slightly bad backup.

Right, might corrupt the dump, but shouldn't hurt the filesystem.  

-Doug
-- 
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863


* Re: Dump corrupts ext2?
  2001-10-10 23:34   ` Andreas Dilger
  2001-10-10 23:55     ` Doug McNaught
@ 2001-10-11  1:33     ` Richard Gooch
  2001-10-11  1:48       ` Chris Mason
  2001-10-11  2:57     ` H. Peter Anvin
  2 siblings, 1 reply; 15+ messages in thread
From: Richard Gooch @ 2001-10-11  1:33 UTC
  To: Andreas Dilger; +Cc: Doug McNaught, Lew Wolfgang, linux-kernel

Andreas Dilger writes:
> On Oct 10, 2001  19:11 -0400, Doug McNaught wrote:
> > Lew Wolfgang <wolfgang@sweet-haven.com> writes:
> > > I was looking for some scripts to backup ext2 partitions
> > > to multiple CDR's when I stumbled onto "cdbackup" at
> > > http://www.cableone.net/ccondit/cdbackup/.
> > > 
> > > Alas, there is a warning saying:
> > > 
> > > "WARNING! When using this program under Linux, be sure not to use
> > >  dump with kernels in the 2.4.x series. Using dump on an ext2
> > >  filesystem has a very high potential for causing filesystem
> > >  corruption.  As of kernel version 2.4.5, this has not been
> > >  resolved, and it may not be for some time."
> > 
> > I'm pretty sure this is because dump reads the block device directly
> > (which is cached in the buffer cache), while the file data for cached
> > files lives in the page cache, and the two caches are no longer
> > coherent (as of 2.4).
> 
> In Linus kernels 2.4.11+ the block devices and filesystems all use
> the page cache, so no more coherency issues.

Um, I thought that there wasn't going to be coherency? For example, if
you open /dev/sda and /dev/sda1, they each have a separate cache. I
remember some debate about this, and Linus pointed out how hard it was
to make things coherent.

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca


* Re: Dump corrupts ext2?
  2001-10-11  1:33     ` Richard Gooch
@ 2001-10-11  1:48       ` Chris Mason
  2001-10-11  4:16         ` Benjamin LaHaise
  2001-10-11  4:25         ` Richard Gooch
  0 siblings, 2 replies; 15+ messages in thread
From: Chris Mason @ 2001-10-11  1:48 UTC
  To: Richard Gooch, Andreas Dilger; +Cc: Doug McNaught, Lew Wolfgang, linux-kernel

On Wednesday, October 10, 2001 07:33:55 PM -0600 Richard Gooch <rgooch@ras.ucalgary.ca> wrote:

> Andreas Dilger writes:

>> In Linus kernels 2.4.11+ the block devices and filesystems all use
>> the page cache, so no more coherency issues.
> 
> Um, I thought that there wasn't going to be coherency? For example, if
> you open /dev/sda and /dev/sda1, they each have a separate cache. I
> remember some debate about this, and Linus pointed out how hard it was
> to make things coherent.

They all use the page cache, but they still use different address spaces.

The block device and getblk share the same address space, so the metadata
and the block device are on the same cache, except for ext2 directories,
which act like files do.  Each file has its own address space, so that
isn't coherent with the block device.

In other words, block device reads with the FS mounted will probably
never give consistent results.

The bug where dump could corrupt things was when getblk and the
block device both used the buffer cache.  That issue hasn't changed.
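
To make the address-space point concrete, here is a toy model in Python.
It is not kernel code: the class, the "bdev" and "inode-12" labels and the
block number are all invented for illustration.  It only shows that when
pages are cached per (address space, index), a reader going through the
block device's address space does not see what was written through a
file's address space.

```python
class ToyPageCache:
    """Pages are cached per (address_space, index); spaces never share pages."""

    def __init__(self, disk):
        self.disk = disk      # block number -> bytes, stands in for the platter
        self.pages = {}       # (address_space, index) -> cached bytes

    def read(self, space, index, block):
        """Return the cached page, filling it from disk on first access."""
        key = (space, index)
        if key not in self.pages:
            self.pages[key] = self.disk[block]
        return self.pages[key]

    def write(self, space, index, data):
        """Modify the page in one address space only; nothing hits the disk."""
        self.pages[(space, index)] = data

    def writeback(self, space, index, block):
        """Flush one cached page to its disk block."""
        self.disk[block] = self.pages[(space, index)]


disk = {7: b"old contents"}
cache = ToyPageCache(disk)

# dump-style read: block 7 via the block device's address space.
before = cache.read("bdev", 7, block=7)
# An application rewrites page 0 of a file whose data lives in block 7.
cache.write("inode-12", 0, b"new contents")
# The block device's page is a separate copy, so dump still sees old data.
after_write = cache.read("bdev", 7, block=7)
# Even after writeback, this toy model keeps serving the stale cached copy.
cache.writeback("inode-12", 0, block=7)
after_flush = cache.read("bdev", 7, block=7)

print(before, after_write, after_flush, disk[7])
```

Whether a real kernel keeps the stale block-device page after writeback is
not something this sketch tries to answer; it only illustrates why the two
views need not agree.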

-chris


* Re: Dump corrupts ext2?
  2001-10-11  1:48       ` Chris Mason
@ 2001-10-11  4:16         ` Benjamin LaHaise
  2001-10-11  4:29           ` Alexander Viro
  2001-10-11  4:25         ` Richard Gooch
  1 sibling, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2001-10-11  4:16 UTC
  To: Chris Mason
  Cc: Richard Gooch, Andreas Dilger, Doug McNaught, Lew Wolfgang,
	linux-kernel

On Wed, Oct 10, 2001 at 09:48:41PM -0400, Chris Mason wrote:
> The bug where dump could corrupt things was when getblk and the
> block device both used the buffer cache.  That issue hasn't changed.

Let me emphasize this: 2.4.11+ will still exhibit filesystem corruption if 
the block device is accessed.  The only way to avoid this is to use raw io, 
in which case you know you're not getting a coherent view of things, so...

		-ben


* Re: Dump corrupts ext2?
  2001-10-11  4:16         ` Benjamin LaHaise
@ 2001-10-11  4:29           ` Alexander Viro
  2001-10-11 11:47             ` Chris Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Alexander Viro @ 2001-10-11  4:29 UTC
  To: Benjamin LaHaise
  Cc: Chris Mason, Richard Gooch, Andreas Dilger, Doug McNaught,
	Lew Wolfgang, linux-kernel



On Thu, 11 Oct 2001, Benjamin LaHaise wrote:

> On Wed, Oct 10, 2001 at 09:48:41PM -0400, Chris Mason wrote:
> > The bug where dump could corrupt things was when getblk and the
> > block device both used the buffer cache.  That issue hasn't changed.
> 
> Let me emphasize this: 2.4.11+ will still exhibit filesystem corruption if 
> the block device is accessed.  The only way to avoid this is to use raw io, 

What?  Details, please.  If you are talking about read access I would
really like to know which filesystem it is.  ext2 used to have a bug
in that area, but it had been fixed months ago.



* Re: Dump corrupts ext2?
  2001-10-11  4:29           ` Alexander Viro
@ 2001-10-11 11:47             ` Chris Mason
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Mason @ 2001-10-11 11:47 UTC
  To: Alexander Viro, Benjamin LaHaise
  Cc: Richard Gooch, Andreas Dilger, Doug McNaught, Lew Wolfgang,
	linux-kernel



On Thursday, October 11, 2001 12:29:03 AM -0400 Alexander Viro <viro@math.psu.edu> wrote:
> On Thu, 11 Oct 2001, Benjamin LaHaise wrote:
> 
>> On Wed, Oct 10, 2001 at 09:48:41PM -0400, Chris Mason wrote:
>> > The bug where dump could corrupt things was when getblk and the
>> > block device both used the buffer cache.  That issue hasn't changed.
>> 
>> Let me emphasize this: 2.4.11+ will still exhibit filesystem corruption if 
>> the block device is accessed.  The only way to avoid this is to use raw io, 
> 
> What?  Details, please.  If you are talking about read access I would
> really like to know which filesystem it is.  ext2 used to have a bug
> in that area, but it had been fixed months ago.

Sorry, I wasn't very clear.  As far as I know, the specific ext2 bug
(race on up to date flag of newly allocated metadata) was found/fixed
by Al.  

The issues left are just dump getting inconsistent backups from
a rw mounted disk.  We'll have this bug regardless of page cache vs buffer
cache vs raw io in dump.

Now, what's interesting is the raw io dump + ext3 case.

-chris



* Re: Dump corrupts ext2?
  2001-10-11  1:48       ` Chris Mason
  2001-10-11  4:16         ` Benjamin LaHaise
@ 2001-10-11  4:25         ` Richard Gooch
  1 sibling, 0 replies; 15+ messages in thread
From: Richard Gooch @ 2001-10-11  4:25 UTC
  To: Chris Mason; +Cc: Andreas Dilger, Doug McNaught, Lew Wolfgang, linux-kernel

Chris Mason writes:
> 
> 
> On Wednesday, October 10, 2001 07:33:55 PM -0600 Richard Gooch <rgooch@ras.ucalgary.ca> wrote:
> 
> > Andreas Dilger writes:
> 
> >> In Linus kernels 2.4.11+ the block devices and filesystems all use
> >> the page cache, so no more coherency issues.
> > 
> > Um, I thought that there wasn't going to be coherency? For example, if
> > you open /dev/sda and /dev/sda1, they each have a separate cache. I
> > remember some debate about this, and Linus pointed out how hard it was
> > to make things coherent.
> 
> They all use the page cache, but they still use different address
> spaces.

OK, different "address spaces". I didn't recall the precise
terminology :-)

> The block device and getblk share the same address space, so the metadata
> and the block device are on the same cache, except for ext2 directories,
> which act like files do.  Each file has its own address space, so that
> isn't coherent with the block device.
> 
> In other words, block device reads with the FS mounted will probably
> never give consistent results.

Indeed.

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca


* Re: Dump corrupts ext2?
  2001-10-10 23:34   ` Andreas Dilger
  2001-10-10 23:55     ` Doug McNaught
  2001-10-11  1:33     ` Richard Gooch
@ 2001-10-11  2:57     ` H. Peter Anvin
  2001-10-11  3:13       ` Andreas Dilger
  2 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2001-10-11  2:57 UTC
  To: linux-kernel

Followup to:  <20011010173449.Q10443@turbolinux.com>
By author:    Andreas Dilger <adilger@turbolabs.com>
In newsgroup: linux.dev.kernel
> > 
> > I'm pretty sure this is because dump reads the block device directly
> > (which is cached in the buffer cache), while the file data for cached
> > files lives in the page cache, and the two caches are no longer
> > coherent (as of 2.4).
> 
> In Linus kernels 2.4.11+ the block devices and filesystems all use the
> page cache, so no more coherency issues.
> 

How do you find a random block in the page cache?  Last my
understanding was that the page cache is organized by inode/offset,
which wouldn't lend itself to looking up a random hardware block.

(Not to mention the fact that the filesystem is perfectly allowed not
to present anything like a coherent state to the disk while mounted,
which means that even if you did a snapshot in time you're not
guaranteed to have anything functional.  I understand this can be done
by sending a "quiet point" command to the filesystems, followed by an
LVM snapshot, but I doubt many people do that!)

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>


* Re: Dump corrupts ext2?
  2001-10-11  2:57     ` H. Peter Anvin
@ 2001-10-11  3:13       ` Andreas Dilger
  0 siblings, 0 replies; 15+ messages in thread
From: Andreas Dilger @ 2001-10-11  3:13 UTC
  To: H. Peter Anvin; +Cc: linux-kernel

H. Peter Anvin writes:
> By author:    Andreas Dilger <adilger@turbolabs.com>
> > In Linus kernels 2.4.11+ the block devices and filesystems all use the
> > page cache, so no more coherency issues.
> 
> How do you find a random block in the page cache?  Last my
> understanding was that the page cache is organized by inode/offset,
> which wouldn't lend itself to looking up a random hardware block.

Doh, you are right of course.  I was just thinking "buffer cache" vs.
"page cache", but of course the address space of /dev/hda1 is different
than that of any file inside the mounted filesystem.  However, at least
the ext2 metadata is coherent between user space and kernel space (which
is half the battle when doing a backup) and your file data can only be
a few seconds out of date.

> I understand this can be done by sending a "quiet point" command to the
> filesystems, followed by an LVM snapshot, but I doubt may people do that!

Yes, there is now a hook in the VFS to support a snapshot of the filesystem
for backups (or whatever), which LVM uses.  It is directly supported by
ext3, reiserfs and XFS.  Other filesystems will only have a fsync_dev() and
write_super done, but this should be enough to get things to disk.
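
For concreteness, a backup from a snapshot could be scripted roughly as
below.  This is only a sketch: the volume names, sizes and paths are
hypothetical, the function merely builds the command lines without running
them, and it assumes the LVM userland tools (lvcreate, lvremove) plus plain
mount and tar.

```python
def snapshot_backup_commands(origin, snap_name, snap_size, mount_point, archive):
    """Return the commands for: snapshot, mount read-only, tar, clean up."""
    # The snapshot appears next to the origin volume, e.g. /dev/vg0/<snap_name>.
    snap_dev = origin.rsplit("/", 1)[0] + "/" + snap_name
    return [
        # Creating the snapshot is where the filesystem gets quiesced.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # Back up from the frozen copy, never from the live device.
        ["mount", "-o", "ro", snap_dev, mount_point],
        ["tar", "-czf", archive, "-C", mount_point, "."],
        ["umount", mount_point],
        ["lvremove", "-f", snap_dev],
    ]


if __name__ == "__main__":
    # Hypothetical volume group and paths, for illustration only.
    for cmd in snapshot_backup_commands(
        "/dev/vg0/home", "home_snap", "1G", "/mnt/snap", "/backup/home.tar.gz"
    ):
        print(" ".join(cmd))
```

The commands are returned as lists so that they can be reviewed first and
then handed to something like subprocess.run one at a time.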

It turns out that this is also a handy thing for doing "live" fsck on an
ext2/ext3 filesystem for systems which don't get rebooted very often - you
can still verify that the disk/cables/kernel haven't corrupted anything.

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert


* Re: Dump corrupts ext2?
  2001-10-10 23:11 ` Doug McNaught
  2001-10-10 23:34   ` Andreas Dilger
@ 2001-10-11  0:38   ` Mike Fedyk
  2001-10-11  5:07     ` Eric W. Biederman
  1 sibling, 1 reply; 15+ messages in thread
From: Mike Fedyk @ 2001-10-11  0:38 UTC
  To: Doug McNaught; +Cc: Lew Wolfgang, linux-kernel

On Wed, Oct 10, 2001 at 07:11:43PM -0400, Doug McNaught wrote:
> Lew Wolfgang <wolfgang@sweet-haven.com> writes:
> 
> > Hi Folks,
> > 
> > I was looking for some scripts to backup ext2 partitions
> > to multiple CDR's when I stumbled onto "cdbackup" at
> > http://www.cableone.net/ccondit/cdbackup/.
> > 
> > Alas, there is a warning saying:
> > 
> > "WARNING! When using this program under Linux, be sure not to use
> >  dump with kernels in the 2.4.x series. Using dump on an ext2
> >  filesystem has a very high potential for causing filesystem
> >  corruption.  As of kernel version 2.4.5, this has not been
> >  resolved, and it may not be for some time."
> > 
> > I don't recall any problems like this, does anyone have
> > additional comments?
> 
> I'm pretty sure this is because dump reads the block device directly
> (which is cached in the buffer cache), while the file data for cached
> files lives in the page cache, and the two caches are no longer
> coherent (as of 2.4).
>

IIRC, 2.2 didn't have a coherent buffer and page cache either.

I.E. if you "cat /dev/hda > /dev/null" you wouldn't be able to expect any
speedup when reading through the mounted filesystem (except for meta-data?).

Am I wrong?  Has Linux ever had a coherent page and buffer cache?


* Re: Dump corrupts ext2?
  2001-10-11  0:38   ` Mike Fedyk
@ 2001-10-11  5:07     ` Eric W. Biederman
  0 siblings, 0 replies; 15+ messages in thread
From: Eric W. Biederman @ 2001-10-11  5:07 UTC
  To: Mike Fedyk; +Cc: Doug McNaught, Lew Wolfgang, linux-kernel

Mike Fedyk <mfedyk@matchmail.com> writes:

> IIRC, 2.2 didn't have a coherent buffer and page cache also.
> 
> I.E. if you "cat /dev/hda > /dev/null" you wouldn't be able to expect any
> speedup when reading through the mounted filesystem (except for meta-data?).
> 
> Am I wrong?  Has Linux ever had a coherent page and buffer cache?

In 2.2 all writes went through the buffer cache.  So the buffer cache
was coherent with the filesystem, but the filesystem wasn't coherent with
the buffer cache.

Eric


* Re: Dump corrupts ext2?
  2001-10-10 23:03 Dump corrupts ext2? Lew Wolfgang
  2001-10-10 23:11 ` Doug McNaught
@ 2001-10-10 23:28 ` H. Peter Anvin
  1 sibling, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2001-10-10 23:28 UTC
  To: linux-kernel

Followup to:  <Pine.LNX.4.33.0110101558210.7049-100000@train.sweet-haven.com>
By author:    Lew Wolfgang <wolfgang@sweet-haven.com>
In newsgroup: linux.dev.kernel
>
> Hi Folks,
> 
> I was looking for some scripts to backup ext2 partitions
> to multiple CDR's when I stumbled onto "cdbackup" at
> http://www.cableone.net/ccondit/cdbackup/.
> 
> Alas, there is a warning saying:
> 
> "WARNING! When using this program under Linux, be sure not to use
>  dump with kernels in the 2.4.x series. Using dump on an ext2
>  filesystem has a very high potential for causing filesystem
>  corruption.  As of kernel version 2.4.5, this has not been
>  resolved, and it may not be for some time."
> 
> I don't recall any problems like this, does anyone have
> additional comments?
> 

Not really surprising... doesn't dump expect to be able to read a rw
mounted filesystem by reading the raw device and get the data off it?
Doesn't work.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>
