* Filesystem corruption
@ 2001-01-31 14:20 Carsten Langgaard
2001-01-31 15:52 ` Florian Lohoff
2001-02-05 10:02 ` Ralf Baechle
0 siblings, 2 replies; 89+ messages in thread
From: Carsten Langgaard @ 2001-01-31 14:20 UTC (permalink / raw)
To: linux-mips
Has anyone seen problems with fsck on the latest 2.4.0 kernel ?
My filesystem gets corrupted from time to time when I use the latest
2.4.0 kernel.
/Carsten
--
_ _ ____ ___ Carsten Langgaard Mailto:carstenl@mips.com
|\ /|||___)(___ MIPS Denmark Direct: +45 4486 5527
| \/ ||| ____) Lautrupvang 4B Switch: +45 4486 5555
TECHNOLOGIES 2750 Ballerup Fax...: +45 4486 5556
Denmark http://www.mips.com
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2001-01-31 14:20 Carsten Langgaard
@ 2001-01-31 15:52 ` Florian Lohoff
2001-01-31 16:24 ` Carsten Langgaard
2001-02-05 10:02 ` Ralf Baechle
1 sibling, 1 reply; 89+ messages in thread
From: Florian Lohoff @ 2001-01-31 15:52 UTC (permalink / raw)
To: Carsten Langgaard; +Cc: linux-mips
On Wed, Jan 31, 2001 at 03:20:35PM +0100, Carsten Langgaard wrote:
>
> Has anyone seen problems with fsck on the latest 2.4.0 kernel ?
> My filesystem gets corrupted from time to time when I use the latest
> 2.4.0 kernel.
>
Hmm - nope - 2.4.0 Bigendian here
resume:~# uptime
3:50pm up 6 days, 10 min, 1 user, load average: 0.00, 0.00, 0.00
resume:~# uname -a
Linux resume.rfc822.org 2.4.0 #3 Thu Jan 25 16:25:23 CET 2001 mips unknown
Flo
--
Florian Lohoff flo@rfc822.org +49-5201-669912
Why is it called "common sense" when nobody seems to have any?
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2001-01-31 15:52 ` Florian Lohoff
@ 2001-01-31 16:24 ` Carsten Langgaard
2001-01-31 16:48 ` Florian Lohoff
0 siblings, 1 reply; 89+ messages in thread
From: Carsten Langgaard @ 2001-01-31 16:24 UTC (permalink / raw)
To: Florian Lohoff; +Cc: linux-mips
Try using fsck.
/Carsten
Florian Lohoff wrote:
> On Wed, Jan 31, 2001 at 03:20:35PM +0100, Carsten Langgaard wrote:
> >
> > Has anyone seen problems with fsck on the latest 2.4.0 kernel ?
> > My filesystem gets corrupted from time to time when I use the latest
> > 2.4.0 kernel.
> >
>
> Hmm - nope - 2.4.0 Bigendian here
>
> resume:~# uptime
> 3:50pm up 6 days, 10 min, 1 user, load average: 0.00, 0.00, 0.00
> resume:~# uname -a
> Linux resume.rfc822.org 2.4.0 #3 Thu Jan 25 16:25:23 CET 2001 mips unknown
>
> Flo
> --
> Florian Lohoff flo@rfc822.org +49-5201-669912
> Why is it called "common sense" when nobody seems to have any?
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2001-01-31 16:24 ` Carsten Langgaard
@ 2001-01-31 16:48 ` Florian Lohoff
0 siblings, 0 replies; 89+ messages in thread
From: Florian Lohoff @ 2001-01-31 16:48 UTC (permalink / raw)
To: Carsten Langgaard; +Cc: linux-mips
On Wed, Jan 31, 2001 at 05:24:58PM +0100, Carsten Langgaard wrote:
>
> Try using fsck.
>
*Urgs* Trouble ...
resume:~# df
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/sda1 2074328 1360061 607040 70% /
/dev/sde1 3839092 217476 3426600 6% /chroot
/dev/sdb1 4003992 3708044 89260 98% /home2
/dev/sdc1 4003992 449472 3347832 12% /home3
/dev/sdd1 4003992 1134620 2662684 30% /ftp.rfc822.org
resume:~# umount /ftp.rfc822.org/
resume:~# fsck -f /dev/sdd1
Parallelizing fsck version 1.18 (11-Nov-1999)
e2fsck 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
Pass 1: Checking inodes, blocks, and sizes
Inode 64654, i_blocks is 42696, should be 44744. Fix<y>? yes
Duplicate blocks found... invoking duplicate block passes.
Pass 1B: Rescan for duplicate/bad blocks
Duplicate/bad block(s) in inode 64654: 265881 ... ... ...
Duplicate/bad block(s) in inode 193927: 265881 ... ... ...
Pass 1C: Scan directories for inodes with dup blocks.
Pass 1D: Reconciling duplicate blocks
(There are 2 inodes containing duplicate/bad blocks.)
File /kernel/kernel-image-2.4.0-ip22-r4k.tgz (inode #193927, mod time Thu Jan 25 11:17:00 2001)
has 251 duplicate block(s), shared with 1 file(s):
/devel/gcc-20000822-mips.tar.gz (inode #64654, mod time Mon Aug 28 17:14:56 2000)
Clone duplicate/bad blocks<y>? yes
File /devel/gcc-20000822-mips.tar.gz (inode #64654, mod time Mon Aug 28 17:14:56 2000)
has 251 duplicate block(s), shared with 1 file(s):
/kernel/kernel-image-2.4.0-ip22-r4k.tgz (inode #193927, mod time Thu Jan 25 11:17:00 2001)
Duplicated blocks already reassigned or cloned.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +265876 +265877 +265878 +265879 +265880
Fix<y>? yes
Free blocks count wrong for group #0 (29960, counted=29709).
Fix<y>? yes
Free blocks count wrong for group #8 (5, counted=0).
Fix<y>? yes
Free blocks count wrong (717343, counted=717087).
Fix<y>? yes
/dev/sdd1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdd1: 6277/1034240 files (21.0% non-contiguous), 316359/1033446 blocks
I ran -test6 and -test9 before.
Flo
--
Florian Lohoff flo@rfc822.org +49-5201-669912
Why is it called "common sense" when nobody seems to have any?
^ permalink raw reply [flat|nested] 89+ messages in thread
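The counts in the e2fsck run above can be cross-checked with a little arithmetic. The following is a minimal sketch in Python that uses only figures copied from the transcript. The block-group size of 32768 (with the first data block at 0) is an assumption: it is the usual ext2 default for 4 KiB blocks, which the 1033446-block, roughly 4 GB filesystem suggests, but the transcript does not state it. The helper names are made up for illustration.

```python
# Figures copied from the e2fsck 1.18 transcript above.
CLONED_BLOCKS = 251                                   # "has 251 duplicate block(s)"
BITMAP_FIXES = [265876, 265877, 265878, 265879, 265880]  # pass 5 bitmap differences
GROUP0_FREE = (29960, 29709)                          # (recorded, counted) for group #0
GROUP8_FREE = (5, 0)                                  # (recorded, counted) for group #8
TOTAL_FREE = (717343, 717087)                         # (recorded, counted) overall
USED_BLOCKS, TOTAL_BLOCKS = 316359, 1033446           # final summary line

# Assumption (not in the transcript): 32768 blocks per group, first data block 0.
BLOCKS_PER_GROUP = 32768


def group_of(block, blocks_per_group=BLOCKS_PER_GROUP):
    """Return the block group a block number falls in, under the assumption above."""
    return block // blocks_per_group


def drop(pair):
    """Return how far the free count fell between the recorded and counted values."""
    recorded, counted = pair
    return recorded - counted


print("group #0 drop:", drop(GROUP0_FREE), "cloned blocks:", CLONED_BLOCKS)
print("group #8 drop:", drop(GROUP8_FREE), "bitmap fixes:", len(BITMAP_FIXES))
print("total drop:", drop(TOTAL_FREE))
print("free after repair:", TOTAL_BLOCKS - USED_BLOCKS)
print("groups of bitmap fixes:", sorted({group_of(b) for b in BITMAP_FIXES}))
```

The numbers line up: the drop in group #0 equals the number of cloned blocks, the drop in group #8 equals the number of bitmap fixes, the two together give the overall drop, and total minus used blocks equals the corrected free count. One plausible reading is that the clones were allocated from group #0 while the five blocks newly marked in the bitmap sit in group #8, but that is an interpretation, not something e2fsck reports.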
* Re: Filesystem corruption
2001-01-31 14:20 Carsten Langgaard
2001-01-31 15:52 ` Florian Lohoff
@ 2001-02-05 10:02 ` Ralf Baechle
2001-02-05 12:10 ` Alan Cox
1 sibling, 1 reply; 89+ messages in thread
From: Ralf Baechle @ 2001-02-05 10:02 UTC (permalink / raw)
To: Carsten Langgaard; +Cc: linux-mips
On Wed, Jan 31, 2001 at 03:20:35PM +0100, Carsten Langgaard wrote:
> Has anyone seen problems with fsck on the latest 2.4.0 kernel ?
> My filesystem gets corrupted from time to time when I use the latest
> 2.4.0 kernel.
2.4.1 is known to cause fs corruption for all architectures; 2.4.0 should
actually be fine. I just reached 8 days of uptime on a 32p Origin 2000,
so it can't be that bad.
Ralf
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2001-02-05 12:10 ` Alan Cox
0 siblings, 0 replies; 89+ messages in thread
From: Alan Cox @ 2001-02-05 12:10 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Carsten Langgaard, linux-mips
> 2.4.1 is known to cause fs corruption for all architectures; 2.4.0 should
> actually be fine. I just reached 8 days of uptime on a 32p Origin 2000,
> so it can't be that bad.
I'm tracking fs corruption and worse on 2.4.0 as well (zero page corruptions
since 2.4.0test10 for example)
I don't believe any 2.4 is currently 'safe'
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2001-02-05 12:10 ` Alan Cox
(?)
@ 2001-02-05 12:56 ` Geert Uytterhoeven
2001-02-05 13:01 ` Alan Cox
-1 siblings, 1 reply; 89+ messages in thread
From: Geert Uytterhoeven @ 2001-02-05 12:56 UTC (permalink / raw)
To: Alan Cox; +Cc: Ralf Baechle, Carsten Langgaard, linux-mips
On Mon, 5 Feb 2001, Alan Cox wrote:
> > 2.4.1 is known to cause fs corruption for all architectures; 2.4.0 should
> > actually be fine. I just reached 8 days of uptime on a 32p Origin 2000,
> > so it can't be that bad.
>
> I'm tracking fs corruption and worse on 2.4.0 as well (zero page corruptions
> since 2.4.0test10 for example)
Is the zero page mapped on non-m68k architectures?
> I don't believe any 2.4 is currently 'safe'
Ugh...
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2001-02-05 13:01 ` Alan Cox
0 siblings, 0 replies; 89+ messages in thread
From: Alan Cox @ 2001-02-05 13:01 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Alan Cox, Ralf Baechle, Carsten Langgaard, linux-mips
> Is the zero page mapped on non-m68k architectures?
It can certainly be hit by DMA and kernel memory ops
> > I don't believe any 2.4 is currently 'safe'
> Ugh...
We'll get there, it's doing pretty well for most folks
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2001-02-05 13:16 Ian Chilton
0 siblings, 0 replies; 89+ messages in thread
From: Ian Chilton @ 2001-02-05 13:16 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-mips
Hello,
> I don't believe any 2.4 is currently 'safe'
auchhh..
If Alan Cox himself (nearly as bad as Linus saying it..) is saying
that, I am glad I am still running 2.2.17/18 on my servers and am
wondering if I should have upgraded my workstations to 2.4.1 ;(
Bye for Now,
Ian
\|||/
(o o)
/---------------------------ooO-(_)-Ooo---------------------------\
| Ian Chilton (IRC Nick - GadgetMan) ICQ #: 16007717 |
|-----------------------------------------------------------------|
| E-Mail: ian@ichilton.co.uk Web: http://www.ichilton.co.uk |
|-----------------------------------------------------------------|
| Proofread carefully to see if you any words out. |
\-----------------------------------------------------------------/
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2001-02-05 16:00 Ian Chilton
0 siblings, 0 replies; 89+ messages in thread
From: Ian Chilton @ 2001-02-05 16:00 UTC (permalink / raw)
To: J. Scott Kasten; +Cc: linux-mips
Hello,
> If you're worried about it, do what I do. Pick one server that always
> runs a known stable release and keep your working/home directories on it
> as NFS exports. Run your development kernel/tools on an NFS client box.
> That way when it croaks, you don't waste a lot of your time fscking and
> possibly losing files.
That's basically what I do..
Bye for Now,
Ian
\|||/
(o o)
/---------------------------ooO-(_)-Ooo---------------------------\
| Ian Chilton (IRC Nick - GadgetMan) ICQ #: 16007717 |
|-----------------------------------------------------------------|
| E-Mail: ian@ichilton.co.uk Web: http://www.ichilton.co.uk |
|-----------------------------------------------------------------|
| Budget: A method for going broke methodically. |
\-----------------------------------------------------------------/
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2001-02-05 22:01 ` Ralf Baechle
0 siblings, 0 replies; 89+ messages in thread
From: Ralf Baechle @ 2001-02-05 22:01 UTC (permalink / raw)
To: Alan Cox; +Cc: Geert Uytterhoeven, Carsten Langgaard, linux-mips
On Mon, Feb 05, 2001 at 01:01:33PM +0000, Alan Cox wrote:
> > Is the zero page mapped on non-m68k architectures?
>
> It can certainly be hit by DMA and kernel memory ops
>
> > > I don't believe any 2.4 is currently 'safe'
> > Ugh...
>
> We'll get there, it's doing pretty well for most folks
I hope so. For many of us 2.2 is no longer an option. That is at least
without heavy patching to add support for hardware that isn't supported
by 2.2.
Ralf
^ permalink raw reply [flat|nested] 89+ messages in thread
* Filesystem Corruption
@ 2002-06-06 18:00 Kurt
2002-06-07 7:15 ` Oleg Drokin
0 siblings, 1 reply; 89+ messages in thread
From: Kurt @ 2002-06-06 18:00 UTC (permalink / raw)
To: reiserfs-list
Hello all,
I currently have a system configured as follows :-
1) LVM version 1.0.1-rc4(ish)(03/10/2001)
2) /dev/PROJ/proj on /proj type reiserfs (rw,noatime,notail)
3) /dev/PROJ/proj 239G 142G 97G 60% /proj
4) 2.4.17 with reiserfs tools 3.x.0k
5) Reiserfs compiled in (CONFIG_REISERFS_CHECK set to NO)
6) 256 MB RAM ("sar -r" shows memory usage is not abnormal for this box)
7) Tons of very small files based on log processing
I am told by my co-worker that the system became unresponsive and showed
reiserfs-related errors on the console.
Upon restart they noticed that the file
/proj/webtrends/receive/bama/www3/access.01Jun.r.gz was unreadable by root
(permission denied).
I did a reiserfsck on the drive and noticed that access.01Jun.r.gz returned an
error stating the file pointed to nowhere.
I was unable to complete a reiserfsck --fix-fixable because of the length of
time that this (fsck) process took since this was an unscheduled downtime.
During the weekend I will attempt to do the fsck again; however, I really
need to know if this problem has been observed by anyone else, and what
steps they took to fix the problem.
-Kurt
--
================================================
Kurt Palmer SysAdmin
kpalmer@advance.net Advance Internet
201-459-2846
^ permalink raw reply [flat|nested] 89+ messages in thread
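Since a full reiserfsck could not be completed during the unscheduled downtime, one way to gauge how far the damage reaches in the meantime is to walk the tree and note every regular file that cannot be opened and read. The following is a rough sketch in Python, not a substitute for reiserfsck. The function name `find_unreadable` and the one-byte probe are illustrative choices, and it assumes that merely reading from the mounted filesystem is acceptable in its current state.

```python
import os
import stat


def find_unreadable(root, probe_bytes=1):
    """Walk root and return (path, error) pairs for every directory that cannot
    be listed and every regular file that cannot be opened and read."""
    problems = []

    def on_walk_error(err):
        # os.walk reports directories it cannot list through this callback.
        problems.append((err.filename, err.strerror or str(err)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are not followed; skip FIFOs, devices, etc.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as handle:
                    handle.read(probe_bytes)
            except OSError as err:
                problems.append((path, err.strerror or str(err)))
    return problems
```

Called as `find_unreadable("/proj/webtrends")` by root, this would be expected to list access.01Jun.r.gz along with any other files in the same state. It only reads the first byte of each file, so it will not notice damage further into a file.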
* Re: Filesystem Corruption
2002-06-06 18:00 Kurt
@ 2002-06-07 7:15 ` Oleg Drokin
2002-06-11 16:49 ` Kurt
0 siblings, 1 reply; 89+ messages in thread
From: Oleg Drokin @ 2002-06-07 7:15 UTC (permalink / raw)
To: Kurt; +Cc: reiserfs-list
Hello!
On Thu, Jun 06, 2002 at 02:00:01PM -0400, Kurt wrote:
> error stating the file pointed to nowhere.
> I was unable to complete a reiserfsck --fix-fixable because of the length of
> time that this (fsck) process took since this was an unscheduled downtime.
> During the weekend i will attempt to do the fsck again, however i really
> needed to know if this problem has been observed by anyone else, and what
> steps they took to fix the problem.
We recommend that you upgrade your kernel to 2.4.18.
To know what the exact problem is, it would be very useful if you posted excerpts
from the kernel logs with the actual errors.
Thank you.
Bye,
Oleg
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem Corruption
2002-06-07 7:15 ` Oleg Drokin
@ 2002-06-11 16:49 ` Kurt
0 siblings, 0 replies; 89+ messages in thread
From: Kurt @ 2002-06-11 16:49 UTC (permalink / raw)
To: Oleg Drokin; +Cc: reiserfs-list
Thanks Oleg,
Sorry for the late response (I was out of the office). You may find the
following information on the last crash useful:
+++++++++++++++
Jun 3 04:32:37 devo kernel: vs-13075: reiserfs_read_inode2: dead inode read from
disk [854 1695654 0x0 SD]. This is likely to be race with knfsd. Ignore
Jun 3 04:32:39 devo kernel: vs-13060: reiserfs_update_sd: stat data of object
[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)
Jun 3 04:41:38 devo kernel: vs-13060: reiserfs_update_sd: stat data of object
[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)
Jun 3 04:41:43 devo kernel: vs-13060: reiserfs_update_sd: stat data of object
[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)
++++++++++++
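Excerpts like the one above are easier to triage when grouped by message ID. The following is a minimal Python sketch, assuming syslog-style lines in the format shown; the function name `summarize_reiserfs_messages` and the regular expression are illustrative assumptions, not part of any reiserfs tool.

```python
import re
from collections import Counter

# Matches a ReiserFS message ID such as "vs-13060" followed by the
# name of the kernel function that reported it (pattern is an assumption).
MESSAGE_RE = re.compile(r"\b(vs-\d+): (\w+):")

def summarize_reiserfs_messages(lines):
    """Count occurrences of each (message id, function) pair in log lines."""
    counts = Counter()
    for line in lines:
        match = MESSAGE_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts

sample = [
    "Jun 3 04:32:39 devo kernel: vs-13060: reiserfs_update_sd: stat data of object",
    "[854 1695654 0x0 SD] (nlink == 1) not found (pos 1)",
    "Jun 3 04:41:38 devo kernel: vs-13060: reiserfs_update_sd: stat data of object",
]

for (msg_id, func), n in summarize_reiserfs_messages(sample).items():
    print(msg_id, func, n)
```

Continuation lines without a message ID are skipped, so wrapped log lines do not inflate the counts.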
I will upgrade the kernel and reiserfs tools this week and inform you of the
result after a fsck.
-Kurt
On Friday 07 June 2002 3:15 am, Oleg Drokin wrote:
> Hello!
>
> On Thu, Jun 06, 2002 at 02:00:01PM -0400, Kurt wrote:
> > error stating the file pointed to nowhere.
> > I was unable to complete a reiserfsck --fix-fixable because of the length
> > of time that this (fsck) process took since this was an unscheduled
> > downtime. During the weekend i will attempt to do the fsck again, however
> > i really needed to know if this problem has been observed by anyone else,
> > and what steps they took to fix the problem.
>
> We recommend you to upgrade your kernel to 2.4.18.
> To know what exact problem is it would be very useful if you'd posted
> excerpts from kernel logs with actual errors.
> Thank you.
>
> Bye,
> Oleg
--
================================================
Kurt Palmer SysAdmin
kpalmer@advance.net Advance Internet
201-459-2846
^ permalink raw reply [flat|nested] 89+ messages in thread
* Filesystem Corruption
@ 2002-09-05 15:57 Brian Tinsley
0 siblings, 0 replies; 89+ messages in thread
From: Brian Tinsley @ 2002-09-05 15:57 UTC (permalink / raw)
To: reiserfs-list
We had problems on a production filesystem, apparently from a machine
crash. I ran reiserfsck (vers. 3.6.3) on this filesystem and received a
message that one corruption can only be fixed during --rebuild-tree. My
question is if I do this, is there any chance of data loss or will the
filesystem be safely repaired? I've never had to do anything but a
simple --fix-fixable before (without any problems).
--
Brian Tinsley
Chief Systems Engineer
Emageon
^ permalink raw reply [flat|nested] 89+ messages in thread
* Filesystem corruption
@ 2003-08-13 16:05 Locke
2003-08-14 7:49 ` Oleg Drokin
0 siblings, 1 reply; 89+ messages in thread
From: Locke @ 2003-08-13 16:05 UTC (permalink / raw)
To: reiserfs-list
[-- Attachment #1: Type: text/plain, Size: 4521 bytes --]
Hi,
I ran into a problem with reiserfs today while trying to access
my network files. I tried browsing my network drive and found that
some of my directories were empty. So I unmounted the partition and ran
reiserfsck (3.6.8); it said I had 4 corruptions and told me to run
--rebuild-tree. I did, and it recovered only 7.8GB of the 47.8GB of
files. I'm guessing it recovered so little because I was running a
7.8GB+40GB LVM and the 40GB physical volume wasn't working, which left
it with only 7.8GB.
Here are the specs of my system: linux-2.4.21, reiserfs-3.6.8, LVM-1.0.7
(7.8GB + 40GB)
Partitions:
/dev/hda (ext2) / 3.2GB
/dev/hdb+/dev/hdg => /dev/main_vg/storage_lv (reiserfs) /mnt/storage 47.8GB
Here's some output of dmesg at the point where I discovered the problem:
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 8838461. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 8838461. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 11534730. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 8838461. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 8838461. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 3412777. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 11534730. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 11604101. Fsck?
is_tree_node: node level 0 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 11534730. Fsck?
reiserfs: checking transaction log (device 3a:00) ...
is_tree_node: node level 32769 does not match to the expected one 1
vs-5150: search_by_key: invalid format found in block 5505049. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [1 2 0x0 SD]
Using r5 hash to sort names
is_tree_node: node level 0 does not match to the expected one 2
vs-5150: search_by_key: invalid format found in block 2412772. Fsck?
vs-2140: finish_unfinished: search_by_key returned -2
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 3a:00) ...
is_leaf: free space seems wrong: level=1, nr_items=1, free_space=778 rdkey
vs-5150: search_by_key: invalid format found in block 5505049. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [1 2 0x0 SD]
Using r5 hash to sort names
is_tree_node: node level 0 does not match to the expected one 2
vs-5150: search_by_key: invalid format found in block 2412772. Fsck?
vs-2140: finish_unfinished: search_by_key returned -2
ReiserFS version 3.6.25
VFS: Can't find ext3 filesystem on dev lvm(58,0).
reiserfs: checking transaction log (device 3a:00) ...
is_leaf: free space seems wrong: level=1, nr_items=1, free_space=778 rdkey
vs-5150: search_by_key: invalid format found in block 5505049. Fsck?
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
data of [1 2 0x0 SD]
Using r5 hash to sort names
is_tree_node: node level 0 does not match to the expected one 2
vs-5150: search_by_key: invalid format found in block 2412772. Fsck?
vs-2140: finish_unfinished: search_by_key returned -2
ReiserFS version 3.6.25
And also when rebooting after the corruption I saw several error
messages for all drives, hda, hdb and hdg
**
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
**The messages are copied from the FAQ on namesys.com because they
looked similar, so I'm not sure they're exactly the same.
I tried booting a previous kernel (2.4.20) and the error messages were
gone, so they were probably caused by mistakes I made when configuring
the 2.4.21 kernel. It was the first time I've compiled a kernel
without thoroughly checking the configuration, and now I suffer the
consequences.
Is there anything I can try to recover more data?
Regards,
Kent
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2003-08-13 16:05 Locke
@ 2003-08-14 7:49 ` Oleg Drokin
0 siblings, 0 replies; 89+ messages in thread
From: Oleg Drokin @ 2003-08-14 7:49 UTC (permalink / raw)
To: Locke; +Cc: reiserfs-list
Hello!
On Thu, Aug 14, 2003 at 12:05:28AM +0800, Locke wrote:
> the files. I'm guessing it recovered so little because I was running
> a 7.8GB+40GB LVM and the 40GB physical volume wasn't working, which
> left it with only 7.8GB.
Yes of course.
> is_tree_node: node level 0 does not match to the expected one 1
> vs-5150: search_by_key: invalid format found in block 8838461. Fsck?
So LVM substitutes zero-filled blocks for the data if a physical volume
is unavailable.
Of course reiserfsck happily threw all of those blocks out of the tree.
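To see how much of a volume reads back as zero-filled blocks, something like the following Python sketch could be run against an image or a read-only device. The function name `zero_filled_blocks` is illustrative, and the 4096-byte block size is an assumption (check the real block size of the filesystem before relying on the numbers).

```python
import os
import tempfile

def zero_filled_blocks(path, block_size=4096):
    """Return the indices of blocks in `path` that contain only zero bytes."""
    zero = bytes(block_size)
    found = []
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # Compare against a zero buffer of the same length, so a
            # short final block is handled as well.
            if block == zero[:len(block)]:
                found.append(index)
            index += 1
    return found

# Demo on a small temporary image: block 1 is zero-filled, blocks 0 and 2 are not.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x01" * 4096 + b"\x00" * 4096 + b"\x02" * 4096)
    image_path = tmp.name
print(zero_filled_blocks(image_path))
os.remove(image_path)
```

A zero-filled block where a tree node is expected would be consistent with the "node level 0 does not match" messages quoted above, but unused blocks can legitimately be zero too, so the list is only a hint.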
> And also when rebooting after the corruption I saw several error
> messages for all drives, hda, hdb and hdg
> **
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Also, you should consider replacing the noisy IDE cable on your primary IDE
controller with a less noisy one. Or just run in a lower UDMA mode.
> **The messages are copied from the FAQ in namesys.com because they
> looked similar so I'm not sure if they're the exactly same.
Well, if they are not the same, you'd better write them down on paper.
> Is there anything I can try to recover more data?
You might try to get LVM up again and run reiserfsck --rebuild-tree.
Some more data will be restored.
Even so, the content of many files will still be lost, and there is no way
to restore it anymore.
Also, use reiserfsck 3.6.11.
Bye,
Oleg
^ permalink raw reply [flat|nested] 89+ messages in thread
* Filesystem corruption
@ 2007-05-27 13:18 Laurent CARON
2007-05-28 12:23 ` Vladimir V. Saveliev
` (2 more replies)
0 siblings, 3 replies; 89+ messages in thread
From: Laurent CARON @ 2007-05-27 13:18 UTC (permalink / raw)
To: reiserfs-list, reiserfs-dev
Hi,
A few days ago, one of my procmail recipes suddenly stopped working.
I didn't care much since this only affected 1 or 2 mails.
Yesterday, I took the time to dig a bit further and looked at the
filesystem on my mail server.
Here is the output of ls -al in the Maildir where my mails are stored:
total 1341
drwx------ 6 lcaron mail 256 2007-05-24 10:35 ./
drwx------ 363 lcaron mail 12184 2007-05-25 21:52 ../
-rw-r--r-- 1 lcaron mail 17 2004-05-25 09:19 courierimapacl
drwx------ 2 lcaron mail 48 2004-05-25 09:20 courierimapkeywords/
-rw-r--r-- 1 lcaron lcaron 169365 2007-05-24 10:35 courierimapuiddb
drwx------ 2 lcaron mail 1185016 2007-05-24 10:26 cur/
-rw------- 1 lcaron mail 0 2004-05-25 09:19 maildirfolder
?--------- ? ? ? ? ? new
drwx------ 2 lcaron mail 48 2007-05-24 19:16 tmp/
The entry that scares me is
?--------- ? ? ? ? ? new
Seems to me this is filesystem corruption.
Is there any solution other than rebuild-tree?
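ls prints `?` in every column when it can read a directory entry's name but cannot stat the entry itself. A minimal Python sketch for finding such entries in a directory follows; the helper name `unstatable_entries` and its `stat_fn` parameter are illustrative assumptions, with `stat_fn` there only so the failure path can be exercised without a damaged filesystem.

```python
import os
import tempfile

def unstatable_entries(directory, stat_fn=os.lstat):
    """Return (name, error text) pairs for entries whose metadata cannot be read."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            stat_fn(os.path.join(directory, name))
        except OSError as exc:
            broken.append((name, exc.strerror))
    return broken

# Demo: a healthy temporary directory yields no broken entries.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "cur"), "w").close()
    print(unstatable_entries(d))
```

On the Maildir shown above, a call such as `unstatable_entries(".")` would be expected to report `new`, but that is a guess from the listing rather than something verified here.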
Thanks
Laurent
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-27 13:18 Laurent CARON
@ 2007-05-28 12:23 ` Vladimir V. Saveliev
2007-05-28 14:10 ` Laurent CARON
[not found] ` <Pine.LNX.4.64.0705280025570.10429@sheep.housecafe.de>
[not found] ` <465BA9AC.8040805@ultraviolet.org>
2 siblings, 1 reply; 89+ messages in thread
From: Vladimir V. Saveliev @ 2007-05-28 12:23 UTC (permalink / raw)
To: Laurent CARON; +Cc: reiserfs-dev, reiserfs-list
Hello
On Sunday 27 May 2007 17:18, Laurent CARON wrote:
> Hi,
>
> A few days ago, one of my procmail recipes suddenly stopped working.
>
> I didn't care much since this only was for 1 or 2 mails.
>
> Yesterday, i took time to dig it a bit further and looked at the
> filesystem on my mail server
>
> Here is the output of ls -al in the Maildir where my mails are stored
>
> total 1341
> drwx------ 6 lcaron mail 256 2007-05-24 10:35 ./
> drwx------ 363 lcaron mail 12184 2007-05-25 21:52 ../
> -rw-r--r-- 1 lcaron mail 17 2004-05-25 09:19 courierimapacl
> drwx------ 2 lcaron mail 48 2004-05-25 09:20 courierimapkeywords/
> -rw-r--r-- 1 lcaron lcaron 169365 2007-05-24 10:35 courierimapuiddb
> drwx------ 2 lcaron mail 1185016 2007-05-24 10:26 cur/
> -rw------- 1 lcaron mail 0 2004-05-25 09:19 maildirfolder
> ?--------- ? ? ? ? ? new
> drwx------ 2 lcaron mail 48 2007-05-24 19:16 tmp/
>
>
> The entry that scares me is
> ?--------- ? ? ? ? ? new
>
> Seems to me it is a filesystem corruption.
>
> Any other solution than rebuild-tree ?
>
Did you try "rm -rf new"?
> Thanks
>
> Laurent
>
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-28 12:23 ` Vladimir V. Saveliev
@ 2007-05-28 14:10 ` Laurent CARON
2007-05-28 17:13 ` Vladimir V. Saveliev
0 siblings, 1 reply; 89+ messages in thread
From: Laurent CARON @ 2007-05-28 14:10 UTC (permalink / raw)
To: reiserfs-list; +Cc: Vladimir V. Saveliev, reiserfs-dev
Vladimir V. Saveliev wrote:
> Did you try "rm -rf new"?
$ rm -rf new
rm: cannot lstat `new': Permission denied
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-28 14:10 ` Laurent CARON
@ 2007-05-28 17:13 ` Vladimir V. Saveliev
2007-05-28 17:27 ` Laurent CARON
0 siblings, 1 reply; 89+ messages in thread
From: Vladimir V. Saveliev @ 2007-05-28 17:13 UTC (permalink / raw)
To: Laurent CARON; +Cc: reiserfs-list
Hello
On Monday 28 May 2007 18:10, Laurent CARON wrote:
> Vladimir V. Saveliev wrote:
> > Did you try "rm -rf new"?
>
> $ rm -rf new
> rm: cannot lstat `new': Permission denied
>
>
Is there anything from reiserfs in system logs?
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-28 17:13 ` Vladimir V. Saveliev
@ 2007-05-28 17:27 ` Laurent CARON
0 siblings, 0 replies; 89+ messages in thread
From: Laurent CARON @ 2007-05-28 17:27 UTC (permalink / raw)
To: reiserfs-list; +Cc: Vladimir V. Saveliev
Vladimir V. Saveliev wrote:
> Is there anything from reiserfs in system logs?
>
Nothing from reiserfs/kernel in
I did experience a similar bug on another computer a while ago (this bug
was "fixed" by rebuilding the tree).
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
[not found] ` <Pine.LNX.4.64.0705280025570.10429@sheep.housecafe.de>
@ 2007-05-28 17:31 ` Christian Kujau
2007-05-28 18:16 ` Laurent CARON
0 siblings, 1 reply; 89+ messages in thread
From: Christian Kujau @ 2007-05-28 17:31 UTC (permalink / raw)
To: Christian Kujau; +Cc: reiserfs-list
[resending, because lncsa.com bounced my mail]
On Mon, 28 May 2007, Christian Kujau wrote:
> On Sun, 27 May 2007, Laurent CARON wrote:
>> The entry that scares me is
>> ?--------- ? ? ? ? ? new
>>
>> Seems to me it is a filesystem corruption.
>> Any other solution than rebuild-tree ?
>
> Please try to check the fs with a current version of reiserfsprogs first. As
> the manpage advises, try --check first and use --rebuild-tree only if you
> know what you're doing, IOW: have a current backup.
>
> Also, which kernel/machine is this running on? Do you know *why* this
> corruption may have occurred? Any recent hardware issues? Is there anything in
> the logs regarding fs/device errors?
>
> C.
> --
> BOFH excuse #448:
>
> vi needs to be upgraded to vii
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-28 17:31 ` Christian Kujau
@ 2007-05-28 18:16 ` Laurent CARON
2007-05-28 23:19 ` Christian Kujau
2007-05-29 8:39 ` Vladimir V. Saveliev
0 siblings, 2 replies; 89+ messages in thread
From: Laurent CARON @ 2007-05-28 18:16 UTC (permalink / raw)
To: reiserfs-list
Christian Kujau wrote:
>> Please try to check the fs with a current version of reiserfsprogs
>> first. As the manpage advises, try --check first and use
>> --rebuild-tree only if you know what you're doing, IOW: have a current
>> backup.
Over the past few years, I have experienced a few reiserfs corruptions on
various hardware (Dell, HP, Asus, SATA, SCSI, IDE...) with the same
symptoms (unreadable file/dir).
I always ran a check, which told me to run fix-fixable or rebuild-tree, which
I did after making sure the backups were reliable, and the error was corrected
(after possibly losing a few files that I fortunately had in the backups).
>>
>> Also, which kernel/machine is this running on? Do you know *why* this
>> corruption may have occured? Any recent hardware issues? Is ther
>> anything in the logs regarding fs/device errors?
Kernel is 2.6.19.
The machine does not seem to have any HW issue, nothing strange in the
logs..... :$
This is just a plain Dell 2650 server with a bunch of SCSI HDD, software
raid5 array, reiserfs on top of it.
Laurent
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-28 18:16 ` Laurent CARON
@ 2007-05-28 23:19 ` Christian Kujau
2007-05-29 8:39 ` Vladimir V. Saveliev
1 sibling, 0 replies; 89+ messages in thread
From: Christian Kujau @ 2007-05-28 23:19 UTC (permalink / raw)
To: reiserfs-list
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Mon, 28 May 2007, Laurent CARON wrote:
> Always ran check which told me to run fix-fixable or rebuild-tree, which I
> did after ensuring of backup reliability, and the error was corrected (after
> eventually losing a few files i fortunately had in the backups).
Well, lucky you :)
> The machine does not seem to have any HW issue, nothing strange in the
> logs..... :$
> This is just a plain Dell 2650 server with a bunch of SCSI HDD, software
> raid5 array, reiserfs on top of it.
...and no power failures or bad memory whatsoever?
Hm, too bad, since now it's unclear
what *caused* the corruption in the first place. You'll probably
(hopefully) be able to correct this corruption with --rebuild-tree, but
I'd keep a close eye on this filesystem for further corruption.
Christian.
- --
BOFH excuse #118:
the router thinks its a printer.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFGW2N/+A7rjkF8z0wRAg9yAJ9PgWYfv1KC1Z3o/cVXScqxTYDPfwCdHKDD
Wy3p1M9ODJFfuqn0JaCEu8U=
=uCAH
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
[not found] ` <465BA9AC.8040805@ultraviolet.org>
@ 2007-05-29 8:15 ` Vladimir V. Saveliev
2007-05-29 12:36 ` Toby Thain
0 siblings, 1 reply; 89+ messages in thread
From: Vladimir V. Saveliev @ 2007-05-29 8:15 UTC (permalink / raw)
To: Tracy R Reed; +Cc: Laurent CARON, reiserfs-list
Hello
On Tuesday 29 May 2007 08:18, Tracy R Reed wrote:
> Laurent CARON wrote:
> > Seems to me it is a filesystem corruption.
>
> Did I miss it or did not a single person ask you if this happened with
> reiserfs 3 or 4?
>
Laurent mentioned rebuild-tree mode of reiserfsck. So the problem happened with reiserfs 3.
> I would be quite surprised if this were reiser 3 and not so surprised if
> it were reiser 4 which is still beta afaik.
>
> Reiser has a nasty reputation for filesystem corruption more than any
> other fs. I have always found reiser3 to be rock solid but you can't
> mention using reiserfs in mixed company without someone accusing you of
> throwing your data away. You would think the developers would be doing
> more to counter this but I have been following reiserfs for years and
> nobody seems to really care all that much.
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-28 18:16 ` Laurent CARON
2007-05-28 23:19 ` Christian Kujau
@ 2007-05-29 8:39 ` Vladimir V. Saveliev
1 sibling, 0 replies; 89+ messages in thread
From: Vladimir V. Saveliev @ 2007-05-29 8:39 UTC (permalink / raw)
To: Laurent CARON; +Cc: reiserfs-list
Hello
On Monday 28 May 2007 22:16, Laurent CARON wrote:
> Christian Kujau wrote:
> >> Please try to check the fs with a current version of reiserfsprogs
> >> first. As the manpage advises, try --check first and use
> >> --rebuild-tree only if you know what you're doing, IOW: have a current
> >> backup.
>
> Over the past few years, i experienced a few reiser corruption on
> various hardware (dell, hp, asus, sata, scsi, ide...) with the same
> symptoms (unredable file/dir).
> Always ran check which told me to run fix-fixable or rebuild-tree, which
> I did after ensuring of backup reliability, and the error was corrected
> (after eventually losing a few files i fortunately had in the backups).
>
Would you run reiserfsck --check -l log and let us see the log?
That may give a hint about what kind of corruption you have.
> >>
> >> Also, which kernel/machine is this running on? Do you know *why* this
> >> corruption may have occured? Any recent hardware issues? Is ther
> >> anything in the logs regarding fs/device errors?
>
> Kernel is 2.6.19.
> The machine does not seem to have any HW issue, nothing strange in the
> logs..... :$
> This is just a plain Dell 2650 server with a bunch of SCSI HDD, software
> raid5 array, reiserfs on top of it.
>
> Laurent
>
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-29 8:15 ` Vladimir V. Saveliev
@ 2007-05-29 12:36 ` Toby Thain
2007-05-30 13:25 ` David Masover
2007-05-30 16:08 ` Vladimir V. Saveliev
0 siblings, 2 replies; 89+ messages in thread
From: Toby Thain @ 2007-05-29 12:36 UTC (permalink / raw)
To: Vladimir V. Saveliev; +Cc: ReiserFS List
>> I have always found reiser3 to be rock solid
My experience too, over many server years.
>> but you can't
>> mention using reiserfs in mixed company without someone accusing
>> you of
>> throwing your data away.
People who repeat this rarely have any direct experience of Reiser;
they repeat what they've heard; like all myths and legends they are
transmitted orally rather than based on scientific observation.
>> You would think the developers would be doing
>> more to counter this but I have been following reiserfs for years and
>> nobody seems to really care all that much.
>>
Can't do much about human nature. MySQL suffers from the same
baseless poisoned folk wisdom.
--Toby
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-29 12:36 ` Toby Thain
@ 2007-05-30 13:25 ` David Masover
2007-05-30 16:02 ` Vladimir V. Saveliev
2007-05-30 16:42 ` Toby Thain
2007-05-30 16:08 ` Vladimir V. Saveliev
1 sibling, 2 replies; 89+ messages in thread
From: David Masover @ 2007-05-30 13:25 UTC (permalink / raw)
To: reiserfs-list
[-- Attachment #1: Type: text/plain, Size: 1821 bytes --]
On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:
> >> but you can't
> >> mention using reiserfs in mixed company without someone accusing
> >> you of
> >> throwing your data away.
>
> People who repeat this rarely have any direct experience of Reiser;
> they repeat what they've heard; like all myths and legends they are
> transmitted orally rather than based on scientific observation.
Well, there is one problem I vaguely remember that I don't think has been
addressed, I think it was one of those lets-put-it-off-till-v4 things. It was
the fact that there are a limited number of inodes (or keys, or whatever you
call a unique file), and no way of knowing how many you have left until your
FS will suddenly, one day refuse to create another file.
(For comparison, ext3 seems to support not only telling you how many inodes
you have left, but tuning that on the fly.)
But, I haven't run into that, and the only problem I've had lately has been
Reiser4 losing data, and crashing occasionally. I switched most of my data
off of Reiser4 and onto XFS for that reason. I've also been using ext3 in
some places, and Reiser3 in others (one place in particular where space is
limited, but I will have tons of small files).
I later learned that XFS does out-of-order writes by default, making me think
I should give up and invest in UPS hardware. But, switching away from Reiser4
means I no longer see random files (including stuff in, for example, /sbin,
that I hadn't touched in months) go up in smoke.
Ordinarily I like to help debug things, but not at the risk of my data. Maybe
I'll try again later, and see if I can reproduce it in a VM or somewhere
safe...
I do still follow the list, though, in case something interesting happens. It
was fun while it lasted!
[-- Attachment #2: Type: application/pgp-signature, Size: 827 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 13:25 ` David Masover
@ 2007-05-30 16:02 ` Vladimir V. Saveliev
2007-05-30 20:06 ` David Masover
2007-05-30 16:42 ` Toby Thain
1 sibling, 1 reply; 89+ messages in thread
From: Vladimir V. Saveliev @ 2007-05-30 16:02 UTC (permalink / raw)
To: David Masover; +Cc: reiserfs-list
Hello
On Wednesday 30 May 2007 17:25, David Masover wrote:
> On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:
>
> > >> but you can't
> > >> mention using reiserfs in mixed company without someone accusing
> > >> you of
> > >> throwing your data away.
> >
> > People who repeat this rarely have any direct experience of Reiser;
> > they repeat what they've heard; like all myths and legends they are
> > transmitted orally rather than based on scientific observation.
>
> Well, there is one problem I vaguely remember that I don't think has been
> addressed, I think it was one of those lets-put-it-off-till-v4 things. It was
> the fact that there are a limited number of inodes (or keys, or whatever you
> call a unique file), and no way of knowing how many you have left until your
> FS will suddenly, one day refuse to create another file.
>
reiserfs is limited to ~2^32 file creations. It is possible to exhaust but I do not remember any reports about that.
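For a rough sense of scale, the arithmetic behind that limit can be sketched in Python. The 2^32 figure is the approximate one quoted above; the creation rates are made-up examples, and the names are illustrative.

```python
SECONDS_PER_DAY = 86_400
MAX_CREATIONS = 2 ** 32  # approximate limit quoted above

def days_until_exhausted(creations_per_second):
    """Days of sustained file creation at the given rate before the limit is hit."""
    return MAX_CREATIONS / creations_per_second / SECONDS_PER_DAY

# Hypothetical sustained creation rates, for illustration only.
for rate in (1, 100, 10_000):
    print(f"{rate:>6} creations/s -> {days_until_exhausted(rate):,.1f} days")
```

At a sustained 100 creations per second the limit would be reached in roughly 497 days, which suggests why it would mostly concern machines that churn through very large numbers of small files.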
> (For comparison, ext3 seems to support not only telling you how many inodes
> you have left, but tuning that on the fly.)
>
> But, I haven't run into that, and the only problem I've had lately has been
> Reiser4 losing data, and crashing occasionally. I switched most of my data
> off of Reiser4 and onto XFS for that reason. I've also been using ext3 in
> some places, and Reiser3 in others (one place in particular where space is
> limited, but I will have tons of small files).
>
> I later learned that XFS does out-of-order writes by default, making me think
> I should give up and invest in UPS hardware. But, switching away from Reiser4
> means I no longer see random files (including stuff in, for example, /sbin,
> that I hadn't touched in months) go up in smoke.
>
> Ordinarily I like to help debug things, but not at the risk of my data. Maybe
> I'll try again later, and see if I can reproduce it in a VM or somewhere
> safe...
>
that would be great, thanks
> I do still follow the list, though, in case something interesting happens. It
> was fun while it lasted!
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-29 12:36 ` Toby Thain
2007-05-30 13:25 ` David Masover
@ 2007-05-30 16:08 ` Vladimir V. Saveliev
1 sibling, 0 replies; 89+ messages in thread
From: Vladimir V. Saveliev @ 2007-05-30 16:08 UTC (permalink / raw)
To: Toby Thain; +Cc: reiserfs-list
Hello
On Tuesday 29 May 2007 16:36, Toby Thain wrote:
> >> I have always found reiser3 to be rock solid
>
> My experienced too, over many server years.
>
> >> but you can't
> >> mention using reiserfs in mixed company without someone accusing
> >> you of
> >> throwing your data away.
>
> People who repeat this rarely have any direct experience of Reiser;
> they repeat what they've heard; like all myths and legends they are
> transmitted orally rather than based on scientific observation.
>
Well, in the past there were several bad stories where reiserfsck was unable to restore filesystems because it could not find
any reiserfs metadata.
Later we found that sometimes (for an unknown reason, but probably not a reiserfs problem) the partition table changes so that
the beginning of a partition gets shifted by a few sectors. So now, when a user reports that reiserfs metadata has disappeared from a device completely, restoring the partition table to
its original state makes the data available again.
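One way to check for such a shift is to look for the reiserfs magic string a few sectors before or after where it should be. The Python sketch below assumes the 3.x on-disk layout as I understand it (superblock 64 KiB into the partition, magic string 52 bytes into the superblock, 512-byte sectors); these offsets and the name `find_partition_shift` are assumptions to verify against the reiserfsprogs sources, not an official procedure.

```python
import os
import tempfile

SECTOR = 512
SUPERBLOCK_OFFSET = 64 * 1024  # assumed superblock location from partition start
MAGIC_OFFSET = 52              # assumed offset of the magic inside the superblock
MAGICS = (b"ReIsEr2Fs", b"ReIsEr3Fs", b"ReIsErFs")

def find_partition_shift(path, max_sectors=64):
    """Return the sector shift at which a reiserfs magic is found, or None."""
    base = SUPERBLOCK_OFFSET + MAGIC_OFFSET
    with open(path, "rb") as f:
        # Try shift 0 first, then -1, +1, -2, +2, ... up to max_sectors.
        for shift in sorted(range(-max_sectors, max_sectors + 1), key=abs):
            pos = base + shift * SECTOR
            if pos < 0:
                continue
            f.seek(pos)
            if f.read(9).startswith(MAGICS):
                return shift
    return None

# Demo: a fake image whose magic sits 3 sectors later than expected.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(SUPERBLOCK_OFFSET + MAGIC_OFFSET + 3 * SECTOR) + b"ReIsEr2Fs")
    image_path = tmp.name
print(find_partition_shift(image_path))
os.remove(image_path)
```

A result of 0 means the magic is where it is expected; a non-zero result suggests the filesystem really starts that many sectors away from the current partition start, and `None` means nothing was found within the search window.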
> >> You would think the developers would be doing
> >> more to counter this but I have been following reiserfs for years and
> >> nobody seems to really care all that much.
> >>
>
> Can't do much about human nature. MySQL suffers from the same
> baseless poisoned folk wisdom.
>
> --Toby
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 13:25 ` David Masover
2007-05-30 16:02 ` Vladimir V. Saveliev
@ 2007-05-30 16:42 ` Toby Thain
2007-05-30 19:42 ` David Masover
1 sibling, 1 reply; 89+ messages in thread
From: Toby Thain @ 2007-05-30 16:42 UTC (permalink / raw)
To: David Masover; +Cc: ReiserFS List
On 30-May-07, at 10:25 AM, David Masover wrote:
> On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:
>
>>>> but you can't
>>>> mention using reiserfs in mixed company without someone accusing
>>>> you of
>>>> throwing your data away.
>>
>> People who repeat this rarely have any direct experience of Reiser;
>> they repeat what they've heard; like all myths and legends they are
>> transmitted orally rather than based on scientific observation.
>
> Well, there is one problem I vaguely remember that I don't think
> has been
> addressed, I think it was one of those lets-put-it-off-till-v4
> things. It was
> the fact that there are a limited number of inodes (or keys, or
> whatever you
> call a unique file),
But does it cause data loss? One usually sees claims that "reiserfs
ate my data", or "I heard reiserfs ate somebody's data", but without
supplying a root cause - bad memory? powerfail? bad disk? etc.
> and no way of knowing how many you have left until your
> FS will suddenly, one day refuse to create another file.
>
> ... switching away from Reiser4
> means I no longer see random files (including stuff in, for
> example, /sbin,
> that I hadn't touched in months) go up in smoke.
I only wish sanity had prevailed over kernel inclusion, then we'd
see it shaken down a lot quicker, like R3 was.
>
> Ordinarily I like to help debug things, but not at the risk of my
> data. Maybe
> I'll try again later, and see if I can reproduce it in a VM or
> somewhere
> safe...
>
> I do still follow the list, though, in case something interesting
> happens.
Yeah, R4 is "something interesting". :) I still hope it gets finished...
--Toby
> It
> was fun while it lasted!
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2007-05-30 17:22 devsk
2007-05-30 19:24 ` Toby Thain
2007-05-30 20:03 ` David Masover
0 siblings, 2 replies; 89+ messages in thread
From: devsk @ 2007-05-30 17:22 UTC (permalink / raw)
To: Toby Thain, David Masover; +Cc: ReiserFS List
[-- Attachment #1: Type: text/plain, Size: 3263 bytes --]
I think people just like to spread FUD without doing any analysis of what really caused the FS corruption. It can be anything from a bad 3rd party driver to bad hardware ('bad blocks', does anybody check for them before mkfs these days? I do). People also like to try those untested patchsets, containing every blah that's thrown out by so-called 'kernel hackers' which makes your system 10x faster. Reiser4 seems like an easy candidate to vent their anger on afterwards.
I have used R4 for a year now and I have had to reset my PC, troubleshooting problems with vmware/mythtv/cisco vpn client/nvidia, so many times that it's not even funny! And R4 didn't give me any problems even once. It boots right up, without any files lost and consistent FS as a subsequent livecd boot and fsck proved it every time. If I did that to ext or xfs, I would have lost big time. The only files I have ever lost were on ext3 during a sudden power failure. I don't trust the safety of my data on any FS but Reiserfs. I hope people don't leave this good piece of code to rot!!
-devsk
----- Original Message ----
From: Toby Thain <toby@smartgames.ca>
To: David Masover <ninja@slaphack.com>
Cc: ReiserFS List <reiserfs-list@namesys.com>
Sent: Wednesday, May 30, 2007 9:42:01 AM
Subject: Re: Filesystem corruption
On 30-May-07, at 10:25 AM, David Masover wrote:
> On Tuesday 29 May 2007 07:36:13 Toby Thain wrote:
>
>>>> but you can't
>>>> mention using reiserfs in mixed company without someone accusing
>>>> you of
>>>> throwing your data away.
>>
>> People who repeat this rarely have any direct experience of Reiser;
>> they repeat what they've heard; like all myths and legends they are
>> transmitted orally rather than based on scientific observation.
>
> Well, there is one problem I vaguely remember that I don't think
> has been
> addressed, I think it was one of those lets-put-it-off-till-v4
> things. It was
> the fact that there are a limited number of inodes (or keys, or
> whatever you
> call a unique file),
But does it cause data loss? One usually sees claims that "reiserfs
ate my data", or "I heard reiserfs ate somebody's data", but without
supplying a root cause - bad memory? powerfail? bad disk? etc.
> and no way of knowing how many you have left until your
> FS will suddenly, one day refuse to create another file.
>
> ... switching away from Reiser4
> means I no longer see random files (including stuff in, for
> example, /sbin,
> that I hadn't touched in months) go up in smoke.
I only wish sanity had prevailed over kernel inclusion, then we'd
see it shaken down a lot quicker, like R3 was.
>
> Ordinarily I like to help debug things, but not at the risk of my
> data. Maybe
> I'll try again later, and see if I can reproduce it in a VM or
> somewhere
> safe...
>
> I do still follow the list, though, in case something interesting
> happens.
Yeah, R4 is "something interesting". :) I still hope it gets finished...
--Toby
> It
> was fun while it lasted!
[-- Attachment #2: Type: text/html, Size: 4130 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 17:22 devsk
@ 2007-05-30 19:24 ` Toby Thain
2007-05-30 20:03 ` David Masover
1 sibling, 0 replies; 89+ messages in thread
From: Toby Thain @ 2007-05-30 19:24 UTC (permalink / raw)
To: devsk; +Cc: David Masover, ReiserFS List
On 30-May-07, at 2:22 PM, devsk wrote:
> I think people just like to spread FUD without doing any analysis
> of what really caused the FS corruption.
I fear you're right. OTOH, filesystem developers on this list (and
others including ZFS list) tend to be extremely meticulous.
--Toby
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 16:42 ` Toby Thain
@ 2007-05-30 19:42 ` David Masover
0 siblings, 0 replies; 89+ messages in thread
From: David Masover @ 2007-05-30 19:42 UTC (permalink / raw)
To: reiserfs-list
[-- Attachment #1: Type: text/plain, Size: 827 bytes --]
On Wednesday 30 May 2007 11:42:01 Toby Thain wrote:
> But does it cause data loss? One usually sees claims that "reiserfs
> ate my data", or "I heard reiserfs ate somebody's data", but without
> supplying a root cause - bad memory? powerfail? bad disk? etc.
Power failure shouldn't kill a filesystem, and generally shouldn't eat data
that was written to disk before the failure. (Although I could complain all
day here about why corruption happens anyway when you do any kind of
out-of-order operations... I am looking forward to that Reiser4 transaction
API, so we can finally get rid of the tmpfile+rename hack.)
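For readers unfamiliar with it, the tmpfile+rename hack is usually written roughly as below. This is a minimal sketch assuming a POSIX system where rename within one directory is atomic; the helper name atomic_write is made up for the example.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the new contents to disk first
        os.replace(tmp_path, path)   # then atomically swap the name
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The ordering is the whole point: data is forced out before the rename, so a crash leaves either the complete old file or the complete new one under the final name.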
But in any case, there were some kernels -- 2.4.16, I think? -- in which
reiserfs was unstable and did corrupt easily. I believe that was tracked down
to kernel bugs outside of reiserfs.
[-- Attachment #2: Type: application/pgp-signature, Size: 827 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 17:22 devsk
2007-05-30 19:24 ` Toby Thain
@ 2007-05-30 20:03 ` David Masover
2007-05-31 0:11 ` Ingo Bormuth
1 sibling, 1 reply; 89+ messages in thread
From: David Masover @ 2007-05-30 20:03 UTC (permalink / raw)
To: devsk; +Cc: Toby Thain, ReiserFS List
[-- Attachment #1: Type: text/plain, Size: 3186 bytes --]
On Wednesday 30 May 2007 12:22:17 devsk wrote:
> I have used R4 for a year now and I have had to reset my PC,
> troubleshooting problems with vmware/mythtv/cisco vpn client/nvidia, so
> many times that it's not even funny! And R4 didn't give me any problems even
> once. It boots right up, without any files lost and consistent FS as a
> subsequent livecd boot and fsck proved it every time.
That happened to me for maybe a year or so, I'm not sure. Then, slowly, I
started to get problems. The machine crashing due to some nvidia bug -- or
even a reiser-specific oops or something -- then I'd have to fsck it, which
would take an hour or more, then I'd boot, and apparently no problems.
Only, recently, these fsck-a-thons started happening more and more often, and
I started to lose random files. They'd just be silently truncated to 0 bytes.
And not files I was writing a lot -- I'm talking about things
like /bin/mount.
Now, maybe it's an amd64-specific bug. Or (somehow) a dmraid-specific bug, or
a dont_load_bitmap bug. (Who can blame me; without dont_load_bitmap, it takes
at least 30 seconds, maybe a minute to mount.) Could even be, somehow, a
Gentoo-specific bug. Could be a 350-gig-partition bug, or even a bug of the
it-hates-me variety. (My server ran Reiser4 for awhile longer, with no
problems, but I wasn't about to take chances there.)
But, I switched a friend over to Ubuntu, and he had the same kind of problems.
In fact, he had them first (I thought it was his computer, for awhile).
Finally, we switched to stock Ubuntu kernels and XFS, me on dmraid, him on
normal linux raid5 (md), and we now have no problems. It's even faster -- the
biggest gain for Reiser4 was /usr/portage, which doesn't exist on Ubuntu.
> If I did that to ext
> or xfs, I would have lost big time.
Well, I'm on XFS on my desktop now, and ext3 on my server. No problems at all
so far. Also much faster, because my desktop now has a repacker (xfs_fsr).
> I hope people don't leave this good piece of code to rot!!
Me too, but you know, I can no longer afford to spend a few hours running fsck
for no apparent reason. I no longer have a machine that can do anything but
just work.
The killer feature of Reiser4, as implemented, is small file performance that
makes ReiserFSv3 weep, and v3 makes XFS weep. All the other stuff we were
promised is either planned for a later release (repacker, pseudofiles,
transaction API) or barely working (cryptocompress).
And on just about any setup I work on today, small file performance is a small
enough priority that even the slightest hint of instability is a
deal-breaker. Enough people feel the same way that ext3 is still widely used.
And if it's ever really crucial, there's reiserfs3.
So, you can blame it on my hardware, or on not getting kernel inclusion, or
anything you want, but the only place I still use Reiser4 is on the
gameserver at our LAN party, and we're thinking of moving that to something
like ext3 or xfs, just so we don't need custom kernels. And after all, that's
a gameserver, it's not like the filesystem is the bottleneck anyway.
[-- Attachment #2: Type: application/pgp-signature, Size: 827 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 16:02 ` Vladimir V. Saveliev
@ 2007-05-30 20:06 ` David Masover
0 siblings, 0 replies; 89+ messages in thread
From: David Masover @ 2007-05-30 20:06 UTC (permalink / raw)
To: Vladimir V. Saveliev; +Cc: reiserfs-list
[-- Attachment #1: Type: text/plain, Size: 842 bytes --]
On Wednesday 30 May 2007 11:02:26 Vladimir V. Saveliev wrote:
> > Ordinarily I like to help debug things, but not at the risk of my data.
> > Maybe I'll try again later, and see if I can reproduce it in a VM or
> > somewhere safe...
>
> that would be great, thanks
Keep in mind, it's unlikely, given I don't have much resembling my original
setup left around. And it was fairly random, under fairly normal usage
patterns -- just I'd suddenly notice my movie had stopped playing, and I'd
hit ctrl+alt+f8 and find a bunch of reiser4 error messages.
Is it at all likely that this is an amd64 bug? (The only two places I've seen
it are on my box and my friend's, both amd64 on some sort of RAID.) If you
don't have enough testers or hardware for amd64, I can try (again) to setup a
working x86_64 VM for you to test on.
[-- Attachment #2: Type: application/pgp-signature, Size: 827 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2007-05-30 20:13 devsk
0 siblings, 0 replies; 89+ messages in thread
From: devsk @ 2007-05-30 20:13 UTC (permalink / raw)
To: David Masover; +Cc: Toby Thain, ReiserFS List
[-- Attachment #1: Type: text/plain, Size: 4131 bytes --]
David, it's funny how my setup is very similar to yours: gentoo, amd64, nvraid using dmraid. mount/mkfs is VERY fast (less than a second) here, and I don't use any specific mount options except noatime. My partition is about 16GB though, hosting '/' and /home.
What sources do you use? I use gentoo-sources (currently using 2.6.21-r2) with the latest stable patch (currently 2.6.21) from namesys, applied manually. Nothing else. I use suspend-to-ram (with a UPS) and the whole system is rock solid.
-devsk
----- Original Message ----
From: David Masover <ninja@slaphack.com>
To: devsk <funtoos@yahoo.com>
Cc: Toby Thain <toby@smartgames.ca>; ReiserFS List <reiserfs-list@namesys.com>
Sent: Wednesday, May 30, 2007 1:03:14 PM
Subject: Re: Filesystem corruption
On Wednesday 30 May 2007 12:22:17 devsk wrote:
> I have used R4 for a year now and I have had to reset my PC,
> troubleshooting problems with vmware/mythtv/cisco vpn client/nvidia, so
> many times that it's not even funny! And R4 didn't give me any problems even
> once. It boots right up, without any files lost and consistent FS as a
> subsequent livecd boot and fsck proved it every time.
That happened to me for maybe a year or so, I'm not sure. Then, slowly, I
started to get problems. The machine crashing due to some nvidia bug -- or
even a reiser-specific oops or something -- then I'd have to fsck it, which
would take an hour or more, then I'd boot, and apparently no problems.
Only, recently, these fsck-a-thons started happening more and more often, and
I started to lose random files. They'd just be silently truncated to 0 bytes.
And not files I was writing a lot -- I'm talking about things
like /bin/mount.
Now, maybe it's an amd64-specific bug. Or (somehow) a dmraid-specific bug, or
a dont_load_bitmap bug. (Who can blame me; without dont_load_bitmap, it takes
at least 30 seconds, maybe a minute to mount.) Could even be, somehow, a
Gentoo-specific bug. Could be a 350-gig-partition bug, or even a bug of the
it-hates-me variety. (My server ran Reiser4 for awhile longer, with no
problems, but I wasn't about to take chances there.)
But, I switched a friend over to Ubuntu, and he had the same kind of problems.
In fact, he had them first (I thought it was his computer, for awhile).
Finally, we switched to stock Ubuntu kernels and XFS, me on dmraid, him on
normal linux raid5 (md), and we now have no problems. It's even faster -- the
biggest gain for Reiser4 was /usr/portage, which doesn't exist on Ubuntu.
> If I did that to ext
> or xfs, I would have lost big time.
Well, I'm on XFS on my desktop now, and ext3 on my server. No problems at all
so far. Also much faster, because my desktop now has a repacker (xfs_fsr).
> I hope people don't leave this good piece of code to rot!!
Me too, but you know, I can no longer afford to spend a few hours running fsck
for no apparent reason. I no longer have a machine that can do anything but
just work.
The killer feature of Reiser4, as implemented, is small file performance that
makes ReiserFSv3 weep, and v3 makes XFS weep. All the other stuff we were
promised is either planned for a later release (repacker, pseudofiles,
transaction API) or barely working (cryptocompress).
And on just about any setup I work on today, small file performance is a small
enough priority that even the slightest hint of instability is a
deal-breaker. Enough people feel the same way that ext3 is still widely used.
And if it's ever really crucial, there's reiserfs3.
So, you can blame it on my hardware, or on not getting kernel inclusion, or
anything you want, but the only place I still use Reiser4 is on the
gameserver at our LAN party, and we're thinking of moving that to something
like ext3 or xfs, just so we don't need custom kernels. And after all, that's
a gameserver, it's not like the filesystem is the bottleneck anyway.
[-- Attachment #2: Type: text/html, Size: 4744 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-30 20:03 ` David Masover
@ 2007-05-31 0:11 ` Ingo Bormuth
2007-06-02 23:10 ` Edward Shishkin
0 siblings, 1 reply; 89+ messages in thread
From: Ingo Bormuth @ 2007-05-31 0:11 UTC (permalink / raw)
To: reiserfs-list
On 2007-05-30 15:03, David Masover wrote:
> Only, recently, these fsck-a-thons started happening more and more often, and
> I started to lose random files. They'd just be silently truncated to 0 bytes.
> And not files I was writing a lot -- I'm talking about things
> like /bin/mount.
Hm, same here. I lost /bin/sleep several times. I have a little script
printing status messages to the screen, sleeping two seconds and printing
again - you name it. The probability that /bin/sleep is accessed at the
same time the system crashes is quite high (this is _no_ write access,
the system is even mounted noatime).
How could pure execution of a file cause corruption of the file itself?
Any idea ?
Apart from that single file, I never had any serious problems with
reiser4 on three busy systems for years - fsck.reiser4 works like a charm.
--
Ingo Bormuth, voicebox & fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-05-31 0:11 ` Ingo Bormuth
@ 2007-06-02 23:10 ` Edward Shishkin
2007-06-04 2:55 ` Ingo Bormuth
0 siblings, 1 reply; 89+ messages in thread
From: Edward Shishkin @ 2007-06-02 23:10 UTC (permalink / raw)
To: Ingo Bormuth; +Cc: reiserfs-list
Ingo Bormuth wrote:
>On 2007-05-30 15:03, David Masover wrote:
>
>
>
>>Only, recently, these fsck-a-thons started happening more and more often, and
>>I started to lose random files. They'd just be silently truncated to 0 bytes.
>>And not files I was writing a lot -- I'm talking about things
>>like /bin/mount.
>>
>>
>
>Hm, same here. I lost /bin/sleep several times.
>
Would you please describe the problem in more detail?
What kernel version? What does "I lost /bin/sleep" mean?
Does it mean that:
1. /bin/sleep was truncated to 0 bytes, i.e. "ls -l /bin/sleep" shows
something like
-rwxr-xr-x 1 root root 0 2005-04-20 18:32 /bin/sleep
2. /bin/sleep disappeared ("ls -l /bin" doesn't show this file)
3. /bin/sleep exists, but is filled with zeros
etc...
Thanks,
Edward.
> I have a little script
>printing status messages to the screen, sleeping two seconds and printing
>again - you name it. The probability that /bin/sleep is accessed at the
>same time the system crashes is quite high (this is _no_ write access,
>the system is even mounted noatime).
>
>How could pure execution of a file cause corruption of the file itself?
>Any idea ?
>
>Apart from that single file, I never had any serious problems with
>reiser4 on three busy systems for years - fsck.reiser4 works like a charm.
>
>
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-06-02 23:10 ` Edward Shishkin
@ 2007-06-04 2:55 ` Ingo Bormuth
2007-06-04 9:41 ` Edward Shishkin
0 siblings, 1 reply; 89+ messages in thread
From: Ingo Bormuth @ 2007-06-04 2:55 UTC (permalink / raw)
To: reiserfs-list
On 2007-06-03 03:10, Edward Shishkin wrote:
> Ingo Bormuth wrote:
> >Hm, same here. I lost /bin/sleep several times.
> Would you please describe the problem in more detail?
> What kernel version? What does "I lost /bin/sleep" mean?
> Does it mean that:
> 1. /bin/sleep was truncated to 0 bytes, i.e. "ls -l /bin/sleep" shows
> something like
> -rwxr-xr-x 1 root root 0 2005-04-20 18:32 /bin/sleep
> 2. /bin/sleep disappeared ("ls -l /bin" doesn't show this file)
> 3. /bin/sleep exists, but is filled with zeros
> etc...
The file was removed by 'fsck.reiser4 --fix' which emitted a
message about deleting a corrupted file. (Case 2 in your list).
This always happened after a system freeze or power loss.
The machine freezes quite frequently - I think it has a DMA problem.
Nevertheless I don't see how a file that was not written to can
get corrupted.
Current kernel is 2.6.20.5 (the reiser4 patch I submitted to this
list on May 2nd).
Root is mounted rw,noatime,nodiratime,onerror=remount-ro,tmgr.atom_max_age=60
Hope that helps.
--
Ingo Bormuth, voicebox & fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-06-04 2:55 ` Ingo Bormuth
@ 2007-06-04 9:41 ` Edward Shishkin
2007-06-05 23:20 ` Ingo Bormuth
0 siblings, 1 reply; 89+ messages in thread
From: Edward Shishkin @ 2007-06-04 9:41 UTC (permalink / raw)
To: Ingo Bormuth; +Cc: reiserfs-list
Ingo Bormuth wrote:
>On 2007-06-03 03:10, Edward Shishkin wrote:
>
>
>>Ingo Bormuth wrote:
>>
>>
>>>Hm, same here. I lost /bin/sleep several times.
>>>
>>>
>
>
>
>>Would you please describe the problem in more detail?
>>What kernel version? What does "I lost /bin/sleep" mean?
>>Does it mean that:
>>1. /bin/sleep was truncated to 0 bytes, i.e. "ls -l /bin/sleep" shows
>>something like
>>-rwxr-xr-x 1 root root 0 2005-04-20 18:32 /bin/sleep
>>2. /bin/sleep disappeared ("ls -l /bin" doesn't show this file)
>>3. /bin/sleep exists, but is filled with zeros
>>etc...
>>
>>
>
>The file was removed by 'fsck.reiser4 --fix' which emitted a
>message about deleting a corrupted file. (Case 2 in your list).
>
>This always happened after a system freeze or power loss.
>The machine freezes quite frequently - I think it has a DMA problem.
>Nevertheless I don't see how a file that was not written to can
>get corrupted.
>
>
>
When performing a mapping read (needed for execution, etc.), reiser4
converts small files from tails to extents and back (your /bin/sleep
is less than 4 * blocksize, right?)
>Current kernel is 2.6.20.5 (the reiser4 patch I submitted to this
>list on May 2nd).
>
>
Please, rebuild your kernel with the official patch
http://ftp.namesys.com/pub/reiser4-for-2.6/2.6.20/
It contains a bugfix related to tail conversion (races when acquiring
exclusive access).
Please, report, if such data loss still takes place after upgrade.
Thanks,
Edward.
>Root is mounted rw,noatime,nodiratime,onerror=remount-ro,tmgr.atom_max_age=60
>
>Hope that helps.
>
>
>
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-06-04 9:41 ` Edward Shishkin
@ 2007-06-05 23:20 ` Ingo Bormuth
0 siblings, 0 replies; 89+ messages in thread
From: Ingo Bormuth @ 2007-06-05 23:20 UTC (permalink / raw)
To: reiserfs-list
On 2007-06-04 13:41, Edward Shishkin wrote:
> When performing a mapping read (needed for execution, etc.), reiser4
> converts small files from tails to extents and back (your /bin/sleep
> is less than 4 * blocksize, right?)
Yes, it's 15k.
Is the conversion done on disk, even when mounted read-only? I'd like
to see the logic in the code. In case you just know it by heart, it would
be nice if you could give me a little hint about where to start.
> Please, rebuild your kernel with the official patch
> [...]
> Please, report, if such data loss still takes place after upgrade.
I'll keep you informed ...
Thanks.
--
Ingo Bormuth, voicebox & fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
@ 2007-06-06 3:10 Xu CanHao
2007-06-06 12:16 ` Ingo Bormuth
0 siblings, 1 reply; 89+ messages in thread
From: Xu CanHao @ 2007-06-06 3:10 UTC (permalink / raw)
To: reiserfs-list
So maybe I'd suggest anybody take the _official_ reiser4 patch-set and
_vanilla_ kernel source, these things should provide the maximum
stability. My root filesystem with reiser4 never loses data.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem corruption
2007-06-06 3:10 Filesystem corruption Xu CanHao
@ 2007-06-06 12:16 ` Ingo Bormuth
0 siblings, 0 replies; 89+ messages in thread
From: Ingo Bormuth @ 2007-06-06 12:16 UTC (permalink / raw)
To: reiserfs-list
On 2007-06-06 11:10, Xu CanHao wrote:
> So maybe I'd suggest anybody take the _official_ reiser4 patch-set and
> _vanilla_ kernel source, these things should provide the maximum
> stability. My root filesystem with reiser4 never loses data.
I fully agree, as long as there _exists_ a current official patch.
That was not always the case in the recent past. No wonder people
started to get their own hands dirty from time to time.
Btw: It's also fun to read / mess with the code ...
--
Ingo Bormuth, voicebox & fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact
^ permalink raw reply [flat|nested] 89+ messages in thread
* filesystem corruption
@ 2011-01-03 1:58 Patrick H.
2011-01-03 3:16 ` Neil Brown
0 siblings, 1 reply; 89+ messages in thread
From: Patrick H. @ 2011-01-03 1:58 UTC (permalink / raw)
To: linux-raid
I've been trying to track down an issue for a while now and from digging
around it appears (though not certain) the issue lies with the md raid
device.
What's happening is that after improperly shutting down a raid-5 array,
upon reassembly, a few files on the filesystem will be corrupt. I don't
think this is normal filesystem corruption from files being modified
during the shut down because some of the files that end up corrupted are
several hours old.
The exact details of what I'm doing:
I have a 3-node test cluster I'm doing integrity testing on. Each node
in the cluster is exporting a couple of disks via ATAoE.
I have the first disk of all 3 nodes in a raid-1 that is holding the
journal data for the ext3 filesystem. The array is running with an
internal bitmap as well.
The second disk of all 3 nodes is a raid-5 array holding the ext3
filesystem itself. This is also running with an internal bitmap.
The ext3 filesystem is mounted with 'data=journal,barrier=1,sync'.
When I power down the node which is actively running both md raid
devices, another node in the cluster takes over and starts both arrays
up (in degraded mode of course).
Once the original node comes back up, the new master re-adds its disks
back into the raid arrays and re-syncs them.
During all this, the filesystem is exported through nfs (nfs also has
sync turned on) and a client is randomly creating, removing, and
verifying checksums on the files in the filesystem (nfs is hard mounted
so operations always retry). The client script averages about 30
creations/s, 30 deletes/s, and 30 checksums/s.
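A client of this kind can be sketched as follows. This is not the actual test script, only a minimal illustration of the create/verify idea; the function names and the default file size are assumptions made for the example.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_file(path, size=4096):
    """Write `size` random bytes to `path`, sync them, return their MD5."""
    data = os.urandom(size)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return hashlib.md5(data).hexdigest()


def find_corrupt(expected):
    """Given {path: md5}, return the paths whose current contents differ."""
    return [path for path, checksum in expected.items()
            if md5_of(path) != checksum]
```

Keeping the expected checksums on the client, outside the filesystem under test, is what allows corruption of files that were written hours earlier to be detected.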
So, as stated above, every now and then (1 in 50 chance or so), when the
master is hard-rebooted, the client will detect a few files with invalid
md5 checksums. These files could be hours old so they were not being
actively modified.
Another key point that leads me to believe it's an md raid issue is that
previously I had the ext3 journal running internally on the raid-5 array
(part of the filesystem itself). When I did this, there would
occasionally be massive corruption. As in file modification times in the
future, lots of corrupt files, thousands of files put in the
'lost+found' dir upon fsck, etc. After I put it on a separate raid-1,
there are no more invalid modification times, there hasn't been a single
file added to 'lost+found', and the number of corrupt files dropped
significantly. This would seem to indicate that the journal was getting
corrupted, and when it was played back, it went horribly wrong.
So it would seem there's something wrong with the raid-5 array, but I
don't know what it could be. Any ideas or input would be much
appreciated. I can modify the clustering scripts to obtain whatever
information is needed when they start the arrays.
-Patrick
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-03 1:58 Patrick H.
@ 2011-01-03 3:16 ` Neil Brown
[not found] ` <4D214B5C.3010103@feystorm.net>
` (2 more replies)
0 siblings, 3 replies; 89+ messages in thread
From: Neil Brown @ 2011-01-03 3:16 UTC (permalink / raw)
To: Patrick H.; +Cc: linux-raid
On Sun, 02 Jan 2011 18:58:34 -0700 "Patrick H." <linux-raid@feystorm.net>
wrote:
> I've been trying to track down an issue for a while now and from digging
> around it appears (though not certain) the issue lies with the md raid
> device.
> What's happening is that after improperly shutting down a raid-5 array,
> upon reassembly, a few files on the filesystem will be corrupt. I don't
> think this is normal filesystem corruption from files being modified
> during the shut down because some of the files that end up corrupted are
> several hours old.
>
> The exact details of what I'm doing:
> I have a 3-node test cluster I'm doing integrity testing on. Each node
> in the cluster is exporting a couple of disks via ATAoE.
> I have the first disk of all 3 nodes in a raid-1 that is holding the
> journal data for the ext3 filesystem. The array is running with an
> internal bitmap as well.
> The second disk of all 3 nodes is a raid-5 array holding the ext3
> filesystem itself. This is also running with an internal bitmap.
> The ext3 filesystem is mounted with 'data=journal,barrier=1,sync'.
> When I power down the node which is actively running both md raid
> devices, another node in the cluster takes over and starts both arrays
> up (in degraded mode of course).
> Once the original node comes back up, the new master re-adds its disks
> back into the raid arrays and re-syncs them.
> During all this, the filesystem is exported through nfs (nfs also has
> sync turned on) and a client is randomly creating, removing, and
> verifying checksums on the files in the filesystem (nfs is hard mounted
> so operations always retry). The client script averages about 30
> creations/s, 30 deletes/s, and 30 checksums/s.
>
> So, as stated above, every now and then (1 in 50 chance or so), when the
> master is hard-rebooted, the client will detect a few files with invalid
> md5 checksums. These files could be hours old so they were not being
> actively modified.
> Another key point that leads me to believe it's an md raid issue is that
> previously I had the ext3 journal running internally on the raid-5 array
> (part of the filesystem itself). When I did this, there would
> occasionally be massive corruption. As in file modification times in the
> future, lots of corrupt files, thousands of files put in the
> 'lost+found' dir upon fsck, etc. After I put it on a separate raid-1,
> there are no more invalid modification times, there hasn't been a single
> file added to 'lost+found', and the number of corrupt files dropped
> significantly. This would seem to indicate that the journal was getting
> corrupted, and when it was played back, it went horribly wrong.
>
> So it would seem there's something wrong with the raid-5 array, but I
> don't know what it could be. Any ideas or input would be much
> appreciated. I can modify the clustering scripts to obtain whatever
> information is needed when they start the arrays.
What you are doing cannot work reliably.
If a RAID5 suffers an unclean shutdown and is restarted without a full
complement of devices, then it can corrupt data that has not been changed
recently, just as you are seeing.
This is why mdadm will not assemble that array unless you provide the --force
flag which essentially says "I know what I am doing and accept the risk".
When md needs to update a block in your 3-drive RAID5, it will read the other
block in the same stripe (if that isn't in the cache or being written at the
same time) and then write out the data block (or blocks) and the newly
computed parity block.
If you crash after one of those writes has completed, but before all of the
writes have completed, then the parity block will not match the data blocks
on disk.
When you re-assemble the array with one device missing, md will compute the
data that was on the device using the other data block and the parity block.
As the parity and data blocks could be inconsistent, the result could easily
be wrong.
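The stripe arithmetic described above can be illustrated with a toy model in Python. This is only a sketch, not md's code: it models one stripe of a 3-drive RAID5 as two data blocks plus an XOR parity block, and the block contents and names are invented for the example.

```python
def xor_blocks(a, b):
    """XOR two equal-length blocks; RAID5 parity is the XOR of the data blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def reconstruct_missing(surviving_data, parity):
    """Rebuild the block of a missing device from the other data block and parity."""
    return xor_blocks(surviving_data, parity)


# One stripe: two data blocks and one parity block.
d0_old = b"AAAA"
d1 = b"BBBB"                       # old data, not modified for hours
parity_old = xor_blocks(d0_old, d1)

# Updating d0 requires two writes: the new data block and the new parity block.
d0_new = b"CCCC"
parity_new = xor_blocks(d0_new, d1)

# Clean case: both writes completed, so d1 can be rebuilt correctly.
clean_d1 = reconstruct_missing(d0_new, parity_new)

# Crash case: d0_new reached the disk but parity_new did not, and the array
# is then started degraded, without the device that holds d1.
crashed_d1 = reconstruct_missing(d0_new, parity_old)

print("clean rebuild matches d1:  ", clean_d1 == d1)
print("crashed rebuild matches d1:", crashed_d1 == d1)
```

In the crash case the rebuilt block differs from d1 even though d1 itself was never written, which is how files that are hours old can come back corrupt.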
With RAID1 there is no similar problem. When you read after a crash you will
always get "correct" data. It may be from before the last write that was
attempted, or after, but if the data was not written recently you will read
exactly the right data.
This is why the situation improved substantially when you moved the journal
to RAID1.
To get the full improvement, you need to move the data to RAID1 (or RAID10) as
well.
NeilBrown
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
[not found] ` <4D214B5C.3010103@feystorm.net>
@ 2011-01-03 4:56 ` Neil Brown
2011-01-03 5:05 ` Patrick H.
0 siblings, 1 reply; 89+ messages in thread
From: Neil Brown @ 2011-01-03 4:56 UTC (permalink / raw)
To: Patrick H.; +Cc: linux-raid
On Sun, 02 Jan 2011 21:06:52 -0700 "Patrick H." <linux-raid@feystorm.net>
wrote:
> That makes sense assuming that MD acknowleges the write once the data is
> written to the data disks but not necessarily the parity disk, which is
> what I gather you were saying is what happens. Is there any option that
> can change the behavior so that md wont ack the write until its been
> committed to all disks (I'm guessing no since you didnt mention it)?
> Also does raid6 suffer this problem? Is it smart enough to use both
> parity disks when calculating replacement, or will it just use one?
>
md/raid5 doesn't acknowledge the write until both the data and the parity
have been written. But that doesn't make any difference.
If you schedule a number of interdependent writes (data and parity) and then
allow some to complete but not all, then you have inconsistency.
Recovery from losing a single device requires consistency of parity and data.
RAID6 suffers equally from this problem. Even if it used both parity disks
to recover (which it doesn't) how would that help? It would then have two
possible values for the data and no way to know which was correct, and every
possibility that both are incorrect. This would happen if a single data
block was successfully written, but neither parity block was.
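The "two possible values" case can be illustrated with a small Python sketch. It assumes the usual RAID6 construction (P is the plain XOR of the data blocks, Q weights data block i by 2^i in GF(2^8) with polynomial 0x11d) and a stripe with only two data blocks; the helper names and block contents are hypothetical, and the recovery that consults both parities is exactly the thing md does not do.

```python
def gf_mul2(b: int) -> int:
    """Multiply one byte by 2 in GF(2^8), reduction polynomial 0x11d."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11D
    return b & 0xFF

def p_parity(d0: bytes, d1: bytes) -> bytes:
    """P = D0 xor D1."""
    return bytes(x ^ y for x, y in zip(d0, d1))

def q_parity(d0: bytes, d1: bytes) -> bytes:
    """Q = D0 xor 2*D1 (D0 has weight 2^0 = 1, D1 has weight 2^1 = 2)."""
    return bytes(x ^ gf_mul2(y) for x, y in zip(d0, d1))

def d0_from_p(p: bytes, d1: bytes) -> bytes:
    """Recover D0 using P and the surviving data block."""
    return bytes(x ^ y for x, y in zip(p, d1))

def d0_from_q(q: bytes, d1: bytes) -> bytes:
    """Recover D0 using Q and the surviving data block."""
    return bytes(x ^ gf_mul2(y) for x, y in zip(q, d1))

d0, d1 = b"DATA", b"AAAA"
p, q = p_parity(d0, d1), q_parity(d0, d1)

# Crash: the new D1 reached the disk, but neither P nor Q was updated.
new_d1 = b"BBBB"

# The device holding D0 is now missing; try both recovery paths.
via_p = d0_from_p(p, new_d1)
via_q = d0_from_q(q, new_d1)
print("P-based recovery correct:", via_p == d0)
print("Q-based recovery correct:", via_q == d0)
print("the two recoveries agree:", via_p == via_q)
```

Both candidates are wrong and they disagree with each other, so having a second parity block gives no way of telling which value (if either) to trust.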
The only way you can avoid this 'write hole' is by journalling in multiples
of whole stripes. No current filesystems that I know of can do this as they
journal in blocks, and the maximum block size is less than the minimum stripe
size. So you would need journalling integrated with md/raid, or you would
need a filesystem which was designed to understand this problem and write
whole stripes at a time, always to an area of the device which did not
contain live data.
NeilBrown
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-03 4:56 ` Neil Brown
@ 2011-01-03 5:05 ` Patrick H.
2011-01-04 5:33 ` NeilBrown
0 siblings, 1 reply; 89+ messages in thread
From: Patrick H. @ 2011-01-03 5:05 UTC (permalink / raw)
To: linux-raid
Sent: Sun Jan 02 2011 21:56:30 GMT-0700 (Mountain Standard Time)
From: Neil Brown <neilb@suse.de>
To: Patrick H. <linux-raid@feystorm.net> linux-raid@vger.kernel.org
Subject: Re: filesystem corruption
> On Sun, 02 Jan 2011 21:06:52 -0700 "Patrick H." <linux-raid@feystorm.net>
> wrote:
>
>
>
>> That makes sense assuming that MD acknowleges the write once the data is
>> written to the data disks but not necessarily the parity disk, which is
>> what I gather you were saying is what happens. Is there any option that
>> can change the behavior so that md wont ack the write until its been
>> committed to all disks (I'm guessing no since you didnt mention it)?
>> Also does raid6 suffer this problem? Is it smart enough to use both
>> parity disks when calculating replacement, or will it just use one?
>>
>>
>
> md/raid5 doesn't acknowledge the write until both the data and the parity
> have been written. But that doesn't make any difference.
> If you schedule a number of interdependent writes (data and parity) and then
> allow some to complete but not all, then you have inconsistency.
> Recovery from losing a single device requires consistency of parity and data.
>
> RAID6 suffers equally from this problem. Even if it used both parity disks
> to recover (which it doesn't) how would that help? It would then have two
> possible values for the data and no way to know which was correct, and every
> possibility that both are incorrect. This would happen if a single data
> block was successfully written, but neither parity block was.
>
> The only way you can avoid this 'write hole' is by journalling in multiples
> of whole stripes. No current filesystems that I know of can do this as they
> journal in blocks, and the maximum block size is less than the minimum stripe
> size. So you would need journalling integrated with md/raid, or you would
> need a filesystem which was designed to understand this problem and write
> whole stripes at a time, always to an area of the device which did not
> contain live data.
>
> NeilBrown
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Ok, thanks for the info.
I think I'll solve it by creating 2 dedicated hosts for running the
array, but not actually export any disks themselves. This way if a
master dies, all the raid disks are still there and can be picked up by
the other master.
-Patrick
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-03 5:05 ` Patrick H.
@ 2011-01-04 5:33 ` NeilBrown
2011-01-04 7:50 ` Patrick H.
0 siblings, 1 reply; 89+ messages in thread
From: NeilBrown @ 2011-01-04 5:33 UTC (permalink / raw)
To: Patrick H.; +Cc: linux-raid
On Sun, 02 Jan 2011 22:05:06 -0700 "Patrick H." <linux-raid@feystorm.net>
wrote:
> Ok, thanks for the info.
> I think I'll solve it by creating 2 dedicated hosts for running the
> array, but not actually export any disks themselves. This way if a
> master dies, all the raid disks are still there and can be picked up by
> the other master.
>
That sounds like it should work OK.
NeilBrown
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-04 5:33 ` NeilBrown
@ 2011-01-04 7:50 ` Patrick H.
2011-01-04 17:31 ` Patrick H.
0 siblings, 1 reply; 89+ messages in thread
From: Patrick H. @ 2011-01-04 7:50 UTC (permalink / raw)
To: linux-raid
Sent: Mon Jan 03 2011 22:33:24 GMT-0700 (Mountain Standard Time)
From: NeilBrown <neilb@suse.de>
To: Patrick H. <linux-raid@feystorm.net> linux-raid@vger.kernel.org
Subject: Re: filesystem corruption
> On Sun, 02 Jan 2011 22:05:06 -0700 "Patrick H." <linux-raid@feystorm.net>
> wrote:
>
>
>> Ok, thanks for the info.
>> I think I'll solve it by creating 2 dedicated hosts for running the
>> array, but not actually export any disks themselves. This way if a
>> master dies, all the raid disks are still there and can be picked up by
>> the other master.
>>
>>
>
> That sounds like it should work OK.
>
> NeilBrown
>
Well, it didn't solve it. If I power the entire cluster down and start it
back up, I still get corruption on old files that weren't being modified.
If I power off just a single node, it seems to handle it fine,
just not the whole cluster.
It also seems to happen fairly frequently now. In the previous setup
roughly 1 in 50 failures resulted in corruption. Now it's pretty
much guaranteed that there will be corruption if I kill it.
On the last failure I did, when it came back up, it re-assembled the
entire raid-5 array with all disks active and none of them needing any
sort of re-sync. The disk controller is battery backed, so even if it
was re-ordering the writes, the battery should ensure that it all gets
committed.
Any other ideas?
-Patrick
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-04 7:50 ` Patrick H.
@ 2011-01-04 17:31 ` Patrick H.
2011-01-05 1:22 ` Patrick H.
0 siblings, 1 reply; 89+ messages in thread
From: Patrick H. @ 2011-01-04 17:31 UTC (permalink / raw)
To: linux-raid
Sent: Tue Jan 04 2011 00:50:39 GMT-0700 (Mountain Standard Time)
From: Patrick H. <linux-raid@feystorm.net>
To: linux-raid@vger.kernel.org
Subject: Re: filesystem corruption
> Sent: Mon Jan 03 2011 22:33:24 GMT-0700 (Mountain Standard Time)
> From: NeilBrown <neilb@suse.de>
> To: Patrick H. <linux-raid@feystorm.net> linux-raid@vger.kernel.org
> Subject: Re: filesystem corruption
>> On Sun, 02 Jan 2011 22:05:06 -0700 "Patrick H."
>> <linux-raid@feystorm.net>
>> wrote:
>>
>>
>>> Ok, thanks for the info.
>>> I think I'll solve it by creating 2 dedicated hosts for running the
>>> array, but not actually export any disks themselves. This way if a
>>> master dies, all the raid disks are still there and can be picked up
>>> by the other master.
>>>
>>>
>>
>> That sounds like it should work OK.
>>
>> NeilBrown
>>
> Well, it didnt solve it. if I power the entire cluster down and start
> it back up, I get corruption, on old files that werent being modified
> still. If I power off just a single node, it seems to handle it fine,
> just not the whole cluster.
>
> It also seems to happen fairly frequently now. In the previous setup
> it was probably 1 in 50 failures that there was corruption. Now its
> pretty much a guarantee there will be corruption if I kill it.
> On the last failure I did, when it came back up, it re-assembled the
> entire raid-5 array with all disks active and none of them needing any
> sort of re-sync. The disk controller is battery backed, so even if it
> was re-ordering the writes, the battery should ensure that it all gets
> committed.
>
> Any other ideas?
>
> -Patrick
Here is some info from my most recent failure simulation. This one
resulted in about 50 corrupt files, another 40 or so that can't even be
opened, and one stale nfs file handle.
I had the cluster script dump out a bunch of info before and after
assembling the array.
= = = = = = = = = =
# mdadm -E /dev/etherd/e1.1p1
/dev/etherd/e1.1p1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Name : dm01:126 (local to host dm01)
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 2119520 (1035.10 MiB 1085.19 MB)
Array Size : 4238848 (2.02 GiB 2.17 GB)
Used Dev Size : 2119424 (1035.05 MiB 1085.15 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : a20adb76:af00f276:5be79a36:b4ff3a8b
Internal Bitmap : 2 sectors from superblock
Update Time : Tue Jan 4 16:45:56 2011
Checksum : 361041f6 - correct
Events : 486
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAA ('A' == active, '.' == missing)
# mdadm -X /dev/etherd/e1.1p1
Filename : /dev/etherd/e1.1p1
Magic : 6d746962
Version : 4
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 486
Events Cleared : 486
State : OK
Chunksize : 64 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 1059712 (1035.05 MiB 1085.15 MB)
Bitmap : 16558 bits (chunks), 189 dirty (1.1%)
= = = = = = = = = =
= = = = = = = = = =
# mdadm -E /dev/etherd/e2.1p1
/dev/etherd/e2.1p1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Name : dm01:126 (local to host dm01)
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 2119520 (1035.10 MiB 1085.19 MB)
Array Size : 4238848 (2.02 GiB 2.17 GB)
Used Dev Size : 2119424 (1035.05 MiB 1085.15 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : f9205ace:0796ecf5:2cca363c:c2873816
Internal Bitmap : 2 sectors from superblock
Update Time : Tue Jan 4 16:45:56 2011
Checksum : 9d235885 - correct
Events : 486
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : AAA ('A' == active, '.' == missing)
# mdadm -X /dev/etherd/e2.1p1
Filename : /dev/etherd/e2.1p1
Magic : 6d746962
Version : 4
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 486
Events Cleared : 486
State : OK
Chunksize : 64 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 1059712 (1035.05 MiB 1085.15 MB)
Bitmap : 16558 bits (chunks), 189 dirty (1.1%)
= = = = = = = = = =
= = = = = = = = = =
# mdadm -E /dev/etherd/e3.1p1
/dev/etherd/e3.1p1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Name : dm01:126 (local to host dm01)
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 2119520 (1035.10 MiB 1085.19 MB)
Array Size : 4238848 (2.02 GiB 2.17 GB)
Used Dev Size : 2119424 (1035.05 MiB 1085.15 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 7f90958d:22de5c08:88750ecb:5f376058
Internal Bitmap : 2 sectors from superblock
Update Time : Tue Jan 4 16:46:13 2011
Checksum : 3fce6b33 - correct
Events : 487
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing)
# mdadm -X /dev/etherd/e3.1p1
Filename : /dev/etherd/e3.1p1
Magic : 6d746962
Version : 4
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 487
Events Cleared : 486
State : OK
Chunksize : 64 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 1059712 (1035.05 MiB 1085.15 MB)
Bitmap : 16558 bits (chunks), 249 dirty (1.5%)
= = = = = = = = = =
- - - - - - - - - - -
# mdadm -D /dev/md/fs01
/dev/md/fs01:
Version : 1.2
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Array Size : 2119424 (2.02 GiB 2.17 GB)
Used Dev Size : 1059712 (1035.05 MiB 1085.15 MB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Jan 4 16:46:13 2011
State : active, resyncing
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 1% complete
Name : dm01:126 (local to host dm01)
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 486
Number Major Minor RaidDevice State
0 152 273 0 active sync /dev/block/152:273
1 152 529 1 active sync /dev/block/152:529
3 152 785 2 active sync /dev/block/152:785
- - - - - - - - - - -
The old method *never* resulted in this much corruption, and never
generated stale nfs file handles. Why is this so much worse now when it
was supposed to be better?
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-04 17:31 ` Patrick H.
@ 2011-01-05 1:22 ` Patrick H.
0 siblings, 0 replies; 89+ messages in thread
From: Patrick H. @ 2011-01-05 1:22 UTC (permalink / raw)
To: linux-raid
I think I may have found something on this. I was messing around with it
more (switched to iSCSI instead of ATAoE), and managed to create a
situation where 2 of the 3 raid-5 disks had failed, yet the MD device
was still active, and it was letting me use it. This is bad.
mdadm -D /dev/md/fs01
/dev/md/fs01:
Version : 1.2
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Array Size : 2119424 (2.02 GiB 2.17 GB)
Used Dev Size : 1059712 (1035.05 MiB 1085.15 MB)
Raid Devices : 3
Total Devices : 1
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Jan 4 22:58:44 2011
State : active, FAILED
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : dm01:125 (local to host dm01)
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 2980
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 80 1 active sync /dev/sdf
2 0 0 2 removed
Notice that there's only one disk in the array; the other 2 failed and were
removed. Yet the state still says active. The filesystem is still up
and running, and I can even read and write to it, though it spits out
tons of IO errors.
I then stopped the array and tried to reassemble it, and now it won't
reassemble.
# mdadm -A /dev/md/fs01 --uuid 9cd9ae9b:39454845:62f2b08d:a4a1ac6c -vv
mdadm: looking for devices for /dev/md/fs01
mdadm: no recogniseable superblock on /dev/md/fs01_journal
mdadm: /dev/md/fs01_journal has wrong uuid.
mdadm: cannot open device /dev/sdg: Device or resource busy
mdadm: /dev/sdg has wrong uuid.
mdadm: cannot open device /dev/sdd: Device or resource busy
mdadm: /dev/sdd has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 has wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sde is identified as a member of /dev/md/fs01, slot 2.
mdadm: /dev/sdc is identified as a member of /dev/md/fs01, slot 0.
mdadm: /dev/sdf is identified as a member of /dev/md/fs01, slot 1.
mdadm: added /dev/sdc to /dev/md/fs01 as 0
mdadm: added /dev/sde to /dev/md/fs01 as 2
mdadm: added /dev/sdf to /dev/md/fs01 as 1
mdadm: /dev/md/fs01 assembled from 1 drive - not enough to start the array.
# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md125 : inactive sdf[1](S) sde[3](S) sdc[0](S)
3179280 blocks super 1.2
md126 : active raid1 sdg[0] sdb[2] sdd[1]
265172 blocks super 1.2 [3/3] [UUU]
bitmap: 0/3 pages [0KB], 64KB chunk
unused devices: <none>
md126 is the ext3 journal for the filesystem
Below is mdadm info on all the devices in the array
# mdadm -E /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Name : dm01:125 (local to host dm01)
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 2119520 (1035.10 MiB 1085.19 MB)
Array Size : 4238848 (2.02 GiB 2.17 GB)
Used Dev Size : 2119424 (1035.05 MiB 1085.15 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : a20adb76:af00f276:5be79a36:b4ff3a8b
Internal Bitmap : 2 sectors from superblock
Update Time : Tue Jan 4 22:44:20 2011
Checksum : 350c988f - correct
Events : 1150
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AA. ('A' == active, '.' == missing)
# mdadm -X /dev/sdc
Filename : /dev/sdc
Magic : 6d746962
Version : 4
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 1150
Events Cleared : 1144
State : OK
Chunksize : 64 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 1059712 (1035.05 MiB 1085.15 MB)
Bitmap : 16558 bits (chunks), 93 dirty (0.6%)
# mdadm -E /dev/sdf
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Name : dm01:125 (local to host dm01)
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 2119520 (1035.10 MiB 1085.19 MB)
Array Size : 4238848 (2.02 GiB 2.17 GB)
Used Dev Size : 2119424 (1035.05 MiB 1085.15 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : f9205ace:0796ecf5:2cca363c:c2873816
Internal Bitmap : 2 sectors from superblock
Update Time : Tue Jan 4 23:00:49 2011
Checksum : 9c20ba71 - correct
Events : 3062
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : .A. ('A' == active, '.' == missing)
# mdadm -X /dev/sdf
Filename : /dev/sdf
Magic : 6d746962
Version : 4
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 3062
Events Cleared : 1144
State : OK
Chunksize : 64 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 1059712 (1035.05 MiB 1085.15 MB)
Bitmap : 16558 bits (chunks), 150 dirty (0.9%)
# mdadm -E /dev/sde
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Name : dm01:125 (local to host dm01)
Creation Time : Tue Jan 4 04:45:50 2011
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 2119520 (1035.10 MiB 1085.19 MB)
Array Size : 4238848 (2.02 GiB 2.17 GB)
Used Dev Size : 2119424 (1035.05 MiB 1085.15 MB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 7f90958d:22de5c08:88750ecb:5f376058
Internal Bitmap : 2 sectors from superblock
Update Time : Tue Jan 4 22:43:53 2011
Checksum : 3ecec198 - correct
Events : 1144
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing)
# mdadm -X /dev/sde
Filename : /dev/sde
Magic : 6d746962
Version : 4
UUID : 9cd9ae9b:39454845:62f2b08d:a4a1ac6c
Events : 1144
Events Cleared : 1143
State : OK
Chunksize : 64 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 1059712 (1035.05 MiB 1085.15 MB)
Bitmap : 16558 bits (chunks), 38 dirty (0.2%)
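A quick way to compare the dumps above is to pull the "Events" counter out of each `mdadm -E` block. The sketch below is a hypothetical helper, not part of mdadm; it embeds only the relevant lines copied from the output above and assumes, as far as I understand mdadm's behaviour, that members with an older event count than the newest one are treated as stale at assembly time, which would be consistent with "assembled from 1 drive".

```python
import re

# Excerpt of the `mdadm -E` output above (only the fields used here).
EXAMINE_OUTPUT = """\
/dev/sdc:
Events : 1150
Array State : AA. ('A' == active, '.' == missing)
/dev/sdf:
Events : 3062
Array State : .A. ('A' == active, '.' == missing)
/dev/sde:
Events : 1144
Array State : AAA ('A' == active, '.' == missing)
"""

def parse_events(text: str) -> dict:
    """Map each device header ('/dev/xxx:') to its Events counter."""
    events = {}
    device = None
    for line in text.splitlines():
        header = re.match(r"^\s*(/dev/\S+):\s*$", line)
        if header:
            device = header.group(1)
            continue
        counter = re.match(r"^\s*Events\s*:\s*(\d+)\s*$", line)
        if counter and device is not None:
            events[device] = int(counter.group(1))
    return events

events = parse_events(EXAMINE_OUTPUT)
newest = max(events.values())
for dev, count in sorted(events.items()):
    if count == newest:
        status = "current"
    else:
        status = f"behind by {newest - count} events"
    print(f"{dev}: events={count} ({status})")
```

Only /dev/sdf carries the newest counter here; /dev/sde stopped being updated first and /dev/sdc shortly after, while the array kept running on a single member.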
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-03 3:16 ` Neil Brown
[not found] ` <4D214B5C.3010103@feystorm.net>
@ 2011-01-05 7:02 ` CoolCold
[not found] ` <AANLkTinL_nz58f8rSPuhYvVwGY5jdu1XVkNLC1ky5A65@mail.gmail.com>
2 siblings, 0 replies; 89+ messages in thread
From: CoolCold @ 2011-01-05 7:02 UTC (permalink / raw)
To: Neil Brown; +Cc: Patrick H., linux-raid
On Mon, Jan 3, 2011 at 6:16 AM, Neil Brown <neilb@suse.de> wrote:
> On Sun, 02 Jan 2011 18:58:34 -0700 "Patrick H." <linux-raid@feystorm.net>
> wrote:
>
>> I've been trying to track down an issue for a while now and from digging
>> around it appears (though not certain) the issue lies with the md raid
>> device.
>> Whats happening is that after improperly shutting down a raid-5 array,
>> upon reassembly, a few files on the filesystem will be corrupt. I dont
>> think this is normal filesystem corruption from files being modified
>> during the shut down because some of the files that end up corrupted are
>> several hours old.
>>
>> The exact details of what I'm doing:
>> I have a 3-node test cluster I'm doing integrity testing on. Each node
>> in the cluster is exporting a couple of disks via ATAoE.
>> I have the first disk of all 3 nodes in a raid-1 that is holding the
>> journal data for the ext3 filesystem. The array is running with an
>> internal bitmap as well.
>> The second disk of all 3 nodes is a raid-5 array holding the ext3
>> filesystem itself. This is also running with an internal bitmap.
>> The ext3 filesystem is mounted with 'data=journal,barrier=1,sync'.
>> When I power down the node which is actively running both md raid
>> devices, another node in the cluster takes over and starts both arrays
>> up (in degraded mode of course).
>> Once the original node comes back up, the new master re-adds its disks
>> back into the raid arrays and re-syncs them.
>> During all this, the filesystem is exported through nfs (nfs also has
>> sync turned on) and a client is randomly creating, removing, and
>> verifying checksums on the files in the filesystem (nfs is hard mounted
>> so operations always retry). The client script averages about 30
>> creations/s, 30 deletes/s, and 30 checksums/s.
>>
>> So, as stated above, every now and then (1 in 50 chance or so), when the
>> master is hard-rebooted, the client will detect a few files with invalid
>> md5 checksums. These files could be hours old so they were not being
>> actively modified.
>> Another key point that leads me to believe its a md raid issue is that
>> before I had the ext3 journal running internally on the raid-5 array
>> (part of the filesystem itself). When I did this, there would
>> occasionally be massive corruption. As in file modification times in the
>> future, lots of corrupt files, thousands of files put in the
>> 'lost+found' dir upon fsck, etc. After I put it on a separate raid-1,
>> there are no more invalid modification times, there hasnt been a single
>> file added to 'lost+found', and the number of corrupt files dropped
>> significantly. This would seem to indicate that the journal was getting
>> corrupted, and when it was played back, it went horribly wrong.
>>
>> So it would seem there's something wrong with the raid-5 array, but I
>> dont know what it could be. Any ideas or input would be much
>> appreciated. I can modify the clustering scripts to obtain whatever
>> information is needed when they start the arrays.
>
> What you are doing cannot work reliably.
>
> If a RAID5 suffers an unclean shutdown and is restarted without a full
> complement of devices, then it can corrupt data that has not been changed
> recently, just as you are seeing.
> This is why mdadm will not assemble that array unless you provide the --force
> flag which essentially says "I know what I am doing and accept the risk".
>
> When md needs to update a block in your 3-drive RAID5, it will read the other
> block in the same stripe (if that isn't in the cache or being written at the
> same time) and then write out the data block (or blocks) and the newly
> computed parity block.
>
> If you crash after one of those writes has completed, but before all of the
> writes have completed, then the parity block will not match the data blocks
> on disk.
Am I understanding correctly that with a hardware controller with a
BBU, data and parity will be written properly (for locally connected
drives, of course) even in case of power loss, and that this is the only
thing hardware RAID controllers can do that softraid can't?
(Well, apart from some nice features like MaxIQ, i.e. cache on SSD for Adaptec
controllers, and the overall write performance gain from
RAM/BBU.)
>
> When you re-assemble the array with one device missing, md will compute the
> data that was on the device using the other data block and the parity block.
> As the parity and data blocks could be inconsistent, the result could easily
> be wrong.
>
> With RAID1 there is no similar problem. When you read after a crash you will
> always get "correct" data. It may be from before the last write that was
> attempted, or after, but if the data was not written recently you will read
> exactly the right data.
>
> This is why the situation improved substantially when you moved the journal
> to RAID1.
>
> To get the full improvement, you need to move the data to RAID1 (or RAID10) as
> well.
>
> NeilBrown
>
>
--
Best regards,
[COOLCOLD-RIPN]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
[not found] ` <AANLkTinL_nz58f8rSPuhYvVwGY5jdu1XVkNLC1ky5A65@mail.gmail.com>
@ 2011-01-05 14:28 ` Patrick H.
2011-01-05 15:52 ` Spelic
0 siblings, 1 reply; 89+ messages in thread
From: Patrick H. @ 2011-01-05 14:28 UTC (permalink / raw)
To: linux-raid
Sent: Wed Jan 05 2011 00:00:48 GMT-0700 (Mountain Standard Time)
From: CoolCold <coolthecold@gmail.com>
To: Neil Brown <neilb@suse.de> "Patrick H." <linux-raid@feystorm.net>,
linux-raid@vger.kernel.org
Subject: Re: filesystem corruption
>
> Am I understanding right, that in case of hardware controller with
> bbu, data and parity gonna be written properly ( for locally
> connected drives of course ) even in case of powerloss and this is
> the only feature which hardware raid controllers can do and softraid
> can't ? (well, except some nice features like maxiq - cache on ssd for
> adaptec controllers and overall write performance expansion because of
> ram/bbu)
>
>
No, my drives are battery backed as well.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-05 14:28 ` Patrick H.
@ 2011-01-05 15:52 ` Spelic
2011-01-05 15:55 ` Patrick H.
0 siblings, 1 reply; 89+ messages in thread
From: Spelic @ 2011-01-05 15:52 UTC (permalink / raw)
To: Patrick H.; +Cc: linux-raid
On 01/05/2011 03:28 PM, Patrick H. wrote:
> No, my drives are battery backed as well.
What drives are they, if I may ask? OCZ SSDs with a supercapacitor, maybe?
Do you know if they will really flush the whole write cache on sudden
power off? I read some vague statements about this for the OCZ drives. In
some places it seemed like the supercapacitor was only able to
provide the same guarantees as an HDD, that is, no further data loss due
to erase-then-rewrite-32K and flash wear levelling stuff, but was not
able to flush the write cache.
Did you try, e.g., a stream of simple database transactions and then
disconnecting the cable suddenly, like this test
http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/
?
Thank you
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2011-01-05 15:52 ` Spelic
@ 2011-01-05 15:55 ` Patrick H.
0 siblings, 0 replies; 89+ messages in thread
From: Patrick H. @ 2011-01-05 15:55 UTC (permalink / raw)
To: linux-raid
HP DL360-G6. SAS controller with battery backed write accelerator.
I haven't been focusing on the reliability of the drives as this is
proof-of-concept testing. If we decide to use it, the drives will be replaced
with 2TB SSD PCIe cards.
-Patrick
Sent: Wed Jan 05 2011 08:52:04 GMT-0700 (Mountain Standard Time)
From: Spelic <spelic@shiftmail.org>
To: Patrick H. <linux-raid@feystorm.net> linux-raid
<linux-raid@vger.kernel.org>
Subject: Re: filesystem corruption
> On 01/05/2011 03:28 PM, Patrick H. wrote:
>> No, my drives are battery backed as well.
>
> what drives are they, if I can ask? OCZ SSDs with supercapacitor maybe?
>
> Do you know if they will really flush the whole write cache on sudden
> power off? I read smoky sentences about this for the OCZ drives. In
> certain points it seemed like the supercapacitor was only able to
> provide the same guarantees of a HDD, that is, no further data loss
> due to erase-then-rewrite-32K and flash wear levelling stuff, but was
> not able to flush the write cache.
> Did you try with e.g. a stream of simple databases transactions then
> disconnecting the cable suddenly like this test
> http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/
>
> ?
>
> Thank you
^ permalink raw reply [flat|nested] 89+ messages in thread
* filesystem corruption
@ 2014-10-31 0:29 Tobias Holst
2014-10-31 1:02 ` Tobias Holst
0 siblings, 1 reply; 89+ messages in thread
From: Tobias Holst @ 2014-10-31 0:29 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Hi
I was using a btrfs RAID1 with two disks under Ubuntu 14.04, kernel
3.13 and btrfs-tools 3.14.1 for weeks without issues.
Now I updated to kernel 3.17.1 and btrfs-tools 3.17. After a reboot
everything looked fine and I started some tests. While running
duperemove (just scanning, not doing anything) and a balance at the
same time the load suddenly went up to >30 and the system was not
responding anymore. Everything working with the filesystem stopped
responding. So I did a hard reset.
I was able to reboot, but at the login prompt nothing appeared except a
kernel bug message. The same happened after going back to kernel 3.13.
Now I started a live system (Ubuntu 14.10, kernel 3.16.x, btrfs-tools
3.14.1), and mounted the btrfs filesystem. I can browse through the
files but sometimes, especially when accessing my snapshots or trying
to create a new snapshot, the kernel bug appears and the filesystem
hangs.
It shows this:
Oct 31 00:09:14 ubuntu kernel: [ 187.661731] ------------[ cut here
]------------
Oct 31 00:09:14 ubuntu kernel: [ 187.661770] WARNING: CPU: 1 PID:
4417 at /build/buildd/linux-3.16.0/fs/btrfs/relocation.c:924
build_backref_tree+0xcab/0x1240 [btrfs]()
Oct 31 00:09:14 ubuntu kernel: [ 187.661772] Modules linked in:
nls_iso8859_1 dm_crypt gpio_ich coretemp lpc_ich kvm_intel kvm
dm_multipath scsi_dh serio_raw xgifb(C) bnep rfcomm bluetooth
6lowpan_iphc i3000_edac edac_core parport_pc mac_hid ppdev shpchp lp
parport squashfs overlayfs nls_utf8 isofs btrfs xor raid6_pq dm_mirror
dm_region_hash dm_log hid_generic usbhid hid uas usb_storage ahci
e1000e libahci ptp pps_core
Oct 31 00:09:14 ubuntu kernel: [ 187.661800] CPU: 1 PID: 4417 Comm:
btrfs-balance Tainted: G C 3.16.0-23-generic #31-Ubuntu
Oct 31 00:09:14 ubuntu kernel: [ 187.661802] Hardware name:
Supermicro PDSML/PDSML+, BIOS 6.00 03/06/2009
Oct 31 00:09:14 ubuntu kernel: [ 187.661804] 0000000000000009
ffff8800a0ae7a00 ffffffff8177fcbc 0000000000000000
Oct 31 00:09:14 ubuntu kernel: [ 187.661807] ffff8800a0ae7a38
ffffffff8106fd8d ffff8800a1440750 ffff8800a1440b48
Oct 31 00:09:14 ubuntu kernel: [ 187.661809] ffff88020a8ce000
0000000000000001 ffff88020b6b0d00 ffff8800a0ae7a48
Oct 31 00:09:14 ubuntu kernel: [ 187.661812] Call Trace:
Oct 31 00:09:14 ubuntu kernel: [ 187.661820] [<ffffffff8177fcbc>]
dump_stack+0x45/0x56
Oct 31 00:09:14 ubuntu kernel: [ 187.661825] [<ffffffff8106fd8d>]
warn_slowpath_common+0x7d/0xa0
Oct 31 00:09:14 ubuntu kernel: [ 187.661827] [<ffffffff8106fe6a>]
warn_slowpath_null+0x1a/0x20
Oct 31 00:09:14 ubuntu kernel: [ 187.661842] [<ffffffffc01b734b>]
build_backref_tree+0xcab/0x1240 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661857] [<ffffffffc01b7ae1>]
relocate_tree_blocks+0x201/0x600 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661872] [<ffffffffc01b88d8>] ?
add_data_references+0x268/0x2a0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661887] [<ffffffffc01b96fd>]
relocate_block_group+0x25d/0x6b0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661902] [<ffffffffc01b9d36>]
btrfs_relocate_block_group+0x1e6/0x2f0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661916] [<ffffffffc0190988>]
btrfs_relocate_chunk.isra.27+0x58/0x720 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661926] [<ffffffffc0140dc1>] ?
btrfs_set_path_blocking+0x41/0x80 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661935] [<ffffffffc0145dfd>] ?
btrfs_search_slot+0x48d/0xa40 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661950] [<ffffffffc018b49b>] ?
release_extent_buffer+0x2b/0xd0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661964] [<ffffffffc018b95f>] ?
free_extent_buffer+0x4f/0xa0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661979] [<ffffffffc01936c3>]
__btrfs_balance+0x4d3/0x8d0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.661993] [<ffffffffc0193d48>]
btrfs_balance+0x288/0x600 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.662008] [<ffffffffc019411d>]
balance_kthread+0x5d/0x80 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.662022] [<ffffffffc01940c0>] ?
btrfs_balance+0x600/0x600 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.662026] [<ffffffff81094aeb>]
kthread+0xdb/0x100
Oct 31 00:09:14 ubuntu kernel: [ 187.662029] [<ffffffff81094a10>] ?
kthread_create_on_node+0x1c0/0x1c0
Oct 31 00:09:14 ubuntu kernel: [ 187.662032] [<ffffffff81787c3c>]
ret_from_fork+0x7c/0xb0
Oct 31 00:09:14 ubuntu kernel: [ 187.662035] [<ffffffff81094a10>] ?
kthread_create_on_node+0x1c0/0x1c0
Oct 31 00:09:14 ubuntu kernel: [ 187.662037] ---[ end trace
fb7849e4a6f20424 ]---
and this:
Oct 31 00:09:14 ubuntu kernel: [ 187.682629] ------------[ cut here
]------------
Oct 31 00:09:14 ubuntu kernel: [ 187.682635] kernel BUG at
/build/buildd/linux-3.16.0/fs/btrfs/extent-tree.c:868!
Oct 31 00:09:14 ubuntu kernel: [ 187.682638] invalid opcode: 0000 [#1] SMP
Oct 31 00:09:14 ubuntu kernel: [ 187.682642] Modules linked in:
nls_iso8859_1 dm_crypt gpio_ich coretemp lpc_ich kvm_intel kvm
dm_multipath scsi_dh serio_raw xgifb(C) bnep rfcomm bluetooth
6lowpan_iphc i3000_edac edac_core parport_pc mac_hid ppdev shpchp lp
parport squashfs overlayfs nls_utf8 isofs btrfs xor raid6_pq dm_mirror
dm_region_hash dm_log hid_generic usbhid hid uas usb_storage ahci
e1000e libahci ptp pps_core
Oct 31 00:09:14 ubuntu kernel: [ 187.682686] CPU: 1 PID: 4417 Comm:
btrfs-balance Tainted: G WC 3.16.0-23-generic #31-Ubuntu
Oct 31 00:09:14 ubuntu kernel: [ 187.682688] Hardware name:
Supermicro PDSML/PDSML+, BIOS 6.00 03/06/2009
Oct 31 00:09:14 ubuntu kernel: [ 187.682690] task: ffff8801bb5728c0
ti: ffff8800a0ae4000 task.ti: ffff8800a0ae4000
Oct 31 00:09:14 ubuntu kernel: [ 187.682691] RIP:
0010:[<ffffffffc0150609>] [<ffffffffc0150609>]
btrfs_lookup_extent_info+0x469/0x4a0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682704] RSP:
0018:ffff8800a0ae7810 EFLAGS: 00010246
Oct 31 00:09:14 ubuntu kernel: [ 187.682706] RAX: 0000000000000000
RBX: ffff8800a1440b40 RCX: 000000129457c000
Oct 31 00:09:14 ubuntu kernel: [ 187.682708] RDX: ffff8801ab1be3c0
RSI: 000000129457c000 RDI: ffff8801ab1be428
Oct 31 00:09:14 ubuntu kernel: [ 187.682709] RBP: ffff8800a0ae7898
R08: ffff8801ab1be3c0 R09: 0000160000000000
Oct 31 00:09:14 ubuntu kernel: [ 187.682711] R10: 0000000000000000
R11: 000000000000003a R12: ffff8801ab1be428
Oct 31 00:09:14 ubuntu kernel: [ 187.682713] R13: 000000129457c000
R14: ffff8801b8800be0 R15: 0000000000000000
Oct 31 00:09:14 ubuntu kernel: [ 187.682715] FS:
0000000000000000(0000) GS:ffff880217c80000(0000)
knlGS:0000000000000000
Oct 31 00:09:14 ubuntu kernel: [ 187.682717] CS: 0010 DS: 0000 ES:
0000 CR0: 000000008005003b
Oct 31 00:09:14 ubuntu kernel: [ 187.682718] CR2: 0000000000ed3970
CR3: 0000000208e63000 CR4: 00000000000007e0
Oct 31 00:09:14 ubuntu kernel: [ 187.682720] Stack:
Oct 31 00:09:14 ubuntu kernel: [ 187.682721] ffff8800a0ae78c0
0000000000000000 0000000000000000 ffff8801ab1be3c0
Oct 31 00:09:14 ubuntu kernel: [ 187.682724] ffff8801b88be1b0
ffff8801ab1be3c0 ffff8801ab1be400 c0008801b8a45720
Oct 31 00:09:14 ubuntu kernel: [ 187.682727] 00a8000000129457
ff00000000000040 ffffffffc01570d1 0000000000000001
Oct 31 00:09:14 ubuntu kernel: [ 187.682730] Call Trace:
Oct 31 00:09:14 ubuntu kernel: [ 187.682742] [<ffffffffc01570d1>] ?
btrfs_alloc_free_block+0x3a1/0x470 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682751] [<ffffffffc01416f4>]
update_ref_for_cow+0x174/0x360 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682761] [<ffffffffc0141afd>]
__btrfs_cow_block+0x21d/0x510 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682770] [<ffffffffc0141f86>]
btrfs_cow_block+0x116/0x1b0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682779] [<ffffffffc0145b44>]
btrfs_search_slot+0x1d4/0xa40 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682791] [<ffffffffc01677ad>] ?
record_root_in_trans+0xad/0x120 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682807] [<ffffffffc01b64f3>]
do_relocation+0x3c3/0x570 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682817] [<ffffffffc0152878>] ?
btrfs_block_rsv_refill+0x48/0xa0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682832] [<ffffffffc01b7e35>]
relocate_tree_blocks+0x555/0x600 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682847] [<ffffffffc01b88d8>] ?
add_data_references+0x268/0x2a0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682862] [<ffffffffc01b96fd>]
relocate_block_group+0x25d/0x6b0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682876] [<ffffffffc01b9d36>]
btrfs_relocate_block_group+0x1e6/0x2f0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682891] [<ffffffffc0190988>]
btrfs_relocate_chunk.isra.27+0x58/0x720 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682900] [<ffffffffc0140dc1>] ?
btrfs_set_path_blocking+0x41/0x80 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682909] [<ffffffffc0145dfd>] ?
btrfs_search_slot+0x48d/0xa40 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682924] [<ffffffffc018b49b>] ?
release_extent_buffer+0x2b/0xd0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682938] [<ffffffffc018b95f>] ?
free_extent_buffer+0x4f/0xa0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682953] [<ffffffffc01936c3>]
__btrfs_balance+0x4d3/0x8d0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682968] [<ffffffffc0193d48>]
btrfs_balance+0x288/0x600 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682982] [<ffffffffc019411d>]
balance_kthread+0x5d/0x80 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.682997] [<ffffffffc01940c0>] ?
btrfs_balance+0x600/0x600 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.683001] [<ffffffff81094aeb>]
kthread+0xdb/0x100
Oct 31 00:09:14 ubuntu kernel: [ 187.683004] [<ffffffff81094a10>] ?
kthread_create_on_node+0x1c0/0x1c0
Oct 31 00:09:14 ubuntu kernel: [ 187.683007] [<ffffffff81787c3c>]
ret_from_fork+0x7c/0xb0
Oct 31 00:09:14 ubuntu kernel: [ 187.683010] [<ffffffff81094a10>] ?
kthread_create_on_node+0x1c0/0x1c0
Oct 31 00:09:14 ubuntu kernel: [ 187.683011] Code: be b0 00 00 00 48
c7 c7 90 77 1e c0 48 89 55 a8 e8 5d f8 f1 c0 48 8b 55 a8 e9 2e fe ff
ff 0f 0b 48 83 7d 88 00 0f 85 8d fe ff ff <0f> 0b 31 c0 e9 de fe ff ff
be 6c 03 00 00 48 c7 c7 28 77 1e c0
Oct 31 00:09:14 ubuntu kernel: [ 187.683040] RIP
[<ffffffffc0150609>] btrfs_lookup_extent_info+0x469/0x4a0 [btrfs]
Oct 31 00:09:14 ubuntu kernel: [ 187.683050] RSP <ffff8800a0ae7810>
Oct 31 00:09:14 ubuntu kernel: [ 187.683052] ---[ end trace
fb7849e4a6f20425 ]---
Then it keeps repeating this:
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] BUG: soft lockup - CPU#2
stuck for 22s! [btrfs-transacti:4416]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] Modules linked in:
nls_iso8859_1 dm_crypt gpio_ich coretemp lpc_ich kvm_intel kvm
dm_multipath scsi_dh serio_raw xgifb(C) bnep rfcomm bluetooth
6lowpan_iphc i3000_edac edac_core parport_pc mac_hid ppdev shpchp lp
parport squashfs overlayfs nls_utf8 isofs btrfs xor raid6_pq dm_mirror
dm_region_hash dm_log hid_generic usbhid hid uas usb_storage ahci
e1000e libahci ptp pps_core
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] CPU: 2 PID: 4416 Comm:
btrfs-transacti Tainted: G D WC 3.16.0-23-generic #31-Ubuntu
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] Hardware name:
Supermicro PDSML/PDSML+, BIOS 6.00 03/06/2009
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] task: ffff8800a23b1460
ti: ffff8801ba8f8000 task.ti: ffff8801ba8f8000
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] RIP:
0010:[<ffffffff81787712>] [<ffffffff81787712>]
_raw_spin_lock+0x32/0x50
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] RSP:
0018:ffff8801ba8fbcc8 EFLAGS: 00000202
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] RAX: 0000000000004a52
RBX: 0000000000014800 RCX: 0000000000008c82
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] RDX: 0000000000008c84
RSI: 0000000000008c84 RDI: ffff8801b88be1b0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] RBP: ffff8801ba8fbcc8
R08: 00000000008dd0e4 R09: 000000002ac4f29b
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] R10: 000000929da8c524
R11: 0000000000000020 R12: ffff88020c32c800
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] R13: ffff88020c32c808
R14: 0000000200000003 R15: ffff880217d8e4e0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] FS:
0000000000000000(0000) GS:ffff880217d00000(0000)
knlGS:0000000000000000
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] CS: 0010 DS: 0000 ES:
0000 CR0: 000000008005003b
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] CR2: 00007fffa496afd8
CR3: 00000002084dd000 CR4: 00000000000007e0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] Stack:
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] ffff8801ba8fbdf0
ffffffffc0153e02 ffffffff810abb55 ffff8800e14532f0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] ffff8800e1453358
ffff8800a23b14c8 ffff8801ba8fbd60 ffff8801ba8fbd50
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] ffffffff81011661
0000000000014800 ffff880217d11c40 ffff8800a23b1a50
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] Call Trace:
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffffc0153e02>]
__btrfs_run_delayed_refs+0x1e2/0x11e0 [btrfs]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff810abb55>] ?
set_next_entity+0x95/0xb0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff81011661>] ?
__switch_to+0x191/0x5e0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff8107dd8a>] ?
del_timer_sync+0x4a/0x60
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffffc0158df3>]
btrfs_run_delayed_refs.part.64+0x73/0x270 [btrfs]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffffc0159007>]
btrfs_run_delayed_refs+0x17/0x20 [btrfs]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffffc0169269>]
btrfs_commit_transaction+0x29/0x80 [btrfs]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffffc016527d>]
transaction_kthread+0x1ed/0x260 [btrfs]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffffc0165090>] ?
btrfs_cleanup_transaction+0x540/0x540 [btrfs]
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff81094aeb>]
kthread+0xdb/0x100
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff81094a10>] ?
kthread_create_on_node+0x1c0/0x1c0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff81787c3c>]
ret_from_fork+0x7c/0xb0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] [<ffffffff81094a10>] ?
kthread_create_on_node+0x1c0/0x1c0
Oct 31 00:10:07 ubuntu kernel: [ 240.100001] Code: 89 e5 b8 00 00 02
00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 04 5d c3 66 90 83 e2 fe 0f
b7 f2 b8 00 80 00 00 eb 0a 0f 1f 00 f3 90 <83> e8 01 74 0a 0f b7 0f 66
39 ca 75 f1 5d c3 66 66 66 90 66 66
Any ideas how to fix this filesystem? I do have backups, but I am
interested in finding out what happened and what to do.
Regards
Tobias
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2014-10-31 0:29 filesystem corruption Tobias Holst
@ 2014-10-31 1:02 ` Tobias Holst
2014-10-31 2:41 ` Rich Freeman
0 siblings, 1 reply; 89+ messages in thread
From: Tobias Holst @ 2014-10-31 1:02 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Addition:
I found some posts here about a general file system corruption in 3.17
and 3.17.1 - is this the cause?
Additionally I am using ro-snapshots - maybe this is the cause, too?
Anyway: Can I fix that or do I have to reinstall? Haven't touched the
filesystem, just did a scrub (found 0 errors).
Regards
Tobias
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2014-10-31 1:02 ` Tobias Holst
@ 2014-10-31 2:41 ` Rich Freeman
2014-10-31 17:34 ` Tobias Holst
0 siblings, 1 reply; 89+ messages in thread
From: Rich Freeman @ 2014-10-31 2:41 UTC (permalink / raw)
To: Tobias Holst; +Cc: linux-btrfs@vger.kernel.org
On Thu, Oct 30, 2014 at 9:02 PM, Tobias Holst <tobby@tobby.eu> wrote:
> Addition:
> I found some posts here about a general file system corruption in 3.17
> and 3.17.1 - is this the cause?
> Additionally I am using ro-snapshots - maybe this is the cause, too?
>
> Anyway: Can I fix that or do I have to reinstall? Haven't touched the
> filesystem, just did a scrub (found 0 errors).
>
Yup - ro-snapshots is a big problem in 3.17. You can probably recover now by:
1. Update your kernel to 3.17.2 - that takes care of all the big
known 3.16/17 issues in general.
2. Run btrfs check using btrfs-tools 3.17. That can clean up the
broken snapshots in your filesystem.
That is fairly likely to get your filesystem working normally again.
It worked for me. I was getting some balance issues when trying to
add another device and I'm not sure if 3.17.2 totally fixed that - I
ended up cancelling the balance and it will be a while before I have
to balance this particular filesystem again, so I'll just hold off and
hope things stabilize.
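
As a rough illustration of the two version requirements above (kernel
3.17.2 or later, btrfs-tools 3.17 or later), here is a minimal Python
sketch of a pre-flight check to run before attempting btrfs check. The
function names are hypothetical and not part of any btrfs tooling, and
the version strings are assumed to be passed in by hand (for example
copied from `uname -r`); nothing here touches the filesystem.

```python
import re

# Minimum versions named in the message above; all names below are illustrative.
MIN_KERNEL = (3, 17, 2)
MIN_PROGS = (3, 17)


def parse_version(text):
    """Return the first dotted number found in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def at_least(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))

    def pad(version):
        return version + (0,) * (width - len(version))

    return pad(found) >= pad(minimum)


def ready_for_check(kernel_release, progs_version):
    """True if both the kernel and the userspace tools meet the minimums."""
    return (at_least(parse_version(kernel_release), MIN_KERNEL)
            and at_least(parse_version(progs_version), MIN_PROGS))


if __name__ == "__main__":
    # The live system from the original report: too old on both counts.
    print(ready_for_check("3.16.0-23-generic", "3.14.1"))
    # The combination recommended above.
    print(ready_for_check("3.17.2", "3.17"))
```

Distribution kernels may carry backported fixes, so a version string
is only a hint, not proof that a given fix is present.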
--
Rich
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2014-10-31 2:41 ` Rich Freeman
@ 2014-10-31 17:34 ` Tobias Holst
2014-11-02 4:49 ` Robert White
0 siblings, 1 reply; 89+ messages in thread
From: Tobias Holst @ 2014-10-31 17:34 UTC (permalink / raw)
To: Rich Freeman; +Cc: linux-btrfs@vger.kernel.org
I am now using another system with kernel 3.17.2 and btrfs-tools 3.17
and inserted one of the two HDDs of my btrfs-RAID1 to it. I can't add
the second one as there are only two slots in that server.
This is what I got:
tobby@ubuntu: sudo btrfs check /dev/sdb1
warning, device 2 is missing
warning devid 2 not found already
root item for root 1746, current bytenr 80450240512, current gen
163697, current level 2, new bytenr 40074067968, new gen 163707, new
level 2
Found 1 roots with an outdated root item.
Please run a filesystem check with the option --repair to fix them.
tobby@ubuntu: sudo btrfs check --repair /dev/sdb1
enabling repair mode
warning, device 2 is missing
warning devid 2 not found already
Unable to find block group for 0
extent-tree.c:289: find_search_start: Assertion `1` failed.
btrfs[0x42bd62]
btrfs[0x42ffe5]
btrfs[0x430211]
btrfs[0x4246ec]
btrfs[0x424d11]
btrfs[0x426af3]
btrfs[0x41b18c]
btrfs[0x40b46a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7ffca1119ec5]
btrfs[0x40b497]
This can be repeated as often as I want ;) Nothing changed.
Regards
Tobias
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2014-10-31 17:34 ` Tobias Holst
@ 2014-11-02 4:49 ` Robert White
2014-11-02 21:57 ` Chris Murphy
2014-11-03 2:55 ` Tobias Holst
0 siblings, 2 replies; 89+ messages in thread
From: Robert White @ 2014-11-02 4:49 UTC (permalink / raw)
To: Tobias Holst, Rich Freeman; +Cc: linux-btrfs@vger.kernel.org
On 10/31/2014 10:34 AM, Tobias Holst wrote:
> I am now using another system with kernel 3.17.2 and btrfs-tools 3.17
> and inserted one of the two HDDs of my btrfs-RAID1 to it. I can't add
> the second one as there are only two slots in that server.
>
> This is what I got:
>
> tobby@ubuntu: sudo btrfs check /dev/sdb1
> warning, device 2 is missing
> warning devid 2 not found already
> root item for root 1746, current bytenr 80450240512, current gen
> 163697, current level 2, new bytenr 40074067968, new gen 163707, new
> level 2
> Found 1 roots with an outdated root item.
> Please run a filesystem check with the option --repair to fix them.
>
> tobby@ubuntu: sudo btrfs check --repair /dev/sdb1
> enabling repair mode
> warning, device 2 is missing
> warning devid 2 not found already
> Unable to find block group for 0
> extent-tree.c:289: find_search_start: Assertion `1` failed.
The read-only snapshots taken under 3.17.1 are your core problem.
Now btrfsck is refusing to operate on the degraded RAID because degraded
RAID is degraded so it's read-only. (this is an educated guess). Since
btrfsck is _not_ a mount type of operation it's got no "degraded mode"
that would let you deal with half a RAID as far as I know.
In your case...
It is _known_ that you need to be _not_ running 3.17.0 or 3.17.1 if you
are going to make read-only snapshots safely.
It is _known_ that you need to be running 3.17.2 to get a number of
fixes that impact your circumstance.
It is _known_ that you need to be running btrfs-progs 3.17 to repair the
read-only snapshots that are borked up, and that you must _not_ have
previously tried to repair the problem with an older btrfsck.
Were I you, I would...
Put the two disks back in the same computer before something bad happens.
Upgrade that computer to kernel 3.17.2 and btrfs-progs 3.17 respectively.
Take a backup (because I am paranoid like that, though current threat
seems negligible).
btrfsck your raid with --repair.
Alternately, if you previously tried to btrfsck the raid with a version
prior to 3.17 tools after the read-only snapshot(s) problem, you will
need to resort to mkfs.btrfs to solve the problem. But Hey! you have two
disks, so break the RAID, then mkfs one of them, then copy the data,
then re-make the RAID such that the new FS rules.
Enjoy your system no longer taking racy read-only snapshots... 8-)
* Re: filesystem corruption
2014-11-02 4:49 ` Robert White
@ 2014-11-02 21:57 ` Chris Murphy
2014-11-03 3:43 ` Zygo Blaxell
2014-11-03 2:55 ` Tobias Holst
1 sibling, 1 reply; 89+ messages in thread
From: Chris Murphy @ 2014-11-02 21:57 UTC (permalink / raw)
Cc: Btrfs BTRFS
On Nov 1, 2014, at 10:49 PM, Robert White <rwhite@pobox.com> wrote:
> On 10/31/2014 10:34 AM, Tobias Holst wrote:
>> I am now using another system with kernel 3.17.2 and btrfs-tools 3.17
>> and inserted one of the two HDDs of my btrfs-RAID1 to it. I can't add
>> the second one as there are only two slots in that server.
>>
>> This is what I got:
>>
>> tobby@ubuntu: sudo btrfs check /dev/sdb1
>> warning, device 2 is missing
>> warning devid 2 not found already
>> root item for root 1746, current bytenr 80450240512, current gen
>> 163697, current level 2, new bytenr 40074067968, new gen 163707, new
>> level 2
>> Found 1 roots with an outdated root item.
>> Please run a filesystem check with the option --repair to fix them.
>>
>> tobby@ubuntu: sudo btrfs check --repair /dev/sdb1
>> enabling repair mode
>> warning, device 2 is missing
>> warning devid 2 not found already
>> Unable to find block group for 0
>> extent-tree.c:289: find_search_start: Assertion `1` failed.
>
> The read-only snapshots taken under 3.17.1 are your core problem.
>
> Now btrfsck is refusing to operate on the degraded RAID because degraded RAID is degraded so it's read-only. (this is an educated guess).
Degradedness and writability are orthogonal. If there's some problem with the fs that prevents it from being mountable rw, then that'd apply for both normal and degraded operation. If the fs is OK, it should permit writable degraded mounts.
> Since btrfsck is _not_ a mount type of operation it's got no "degraded mode" that would let you deal with half a RAID as far as I know.
That's a problem. I can see why a repair might need an additional flag (maybe force) to repair a volume that has the minimum number of devices for degraded mounting, but not all are present. Maybe we wouldn't want it to be easy to accidentally run a repair that changes the file system when a device happens to be missing inadvertently that could be found and connected later.
I think related to this is a btrfs equivalent of a bitmap. The metadata already has this information in it, but possibly right now btrfs lacks the equivalent behavior of mdadm readd when a previously missing device is reconnected. If it has a bitmap then it doesn't have to be completely rebuilt, the bitmap contains information telling md how to "catch up" the readded device, i.e. only that which is different needs to be written upon a readd.
For example if I have a two device Btrfs raid1 for both data and metadata, and one device is removed and I mount -o degraded,rw one of them and make some small changes, unmount, then reconnect the missing device and mount NOT degraded - what happens? I haven't tried this. And I also don't know if a full balance (hours) is needed to "catch up" the formerly missing device. With md this is very fast - seconds/minutes depending on how much has been changed.
Chris Murphy
* Re: filesystem corruption
2014-11-02 4:49 ` Robert White
2014-11-02 21:57 ` Chris Murphy
@ 2014-11-03 2:55 ` Tobias Holst
2014-11-03 3:49 ` Robert White
1 sibling, 1 reply; 89+ messages in thread
From: Tobias Holst @ 2014-11-03 2:55 UTC (permalink / raw)
To: Robert White; +Cc: Rich Freeman, linux-btrfs@vger.kernel.org
Thank you for your reply.
I'll answer in-line.
2014-11-02 5:49 GMT+01:00 Robert White <rwhite@pobox.com>:
> On 10/31/2014 10:34 AM, Tobias Holst wrote:
>>
>> I am now using another system with kernel 3.17.2 and btrfs-tools 3.17
>> and inserted one of the two HDDs of my btrfs-RAID1 to it. I can't add
>> the second one as there are only two slots in that server.
>>
>> This is what I got:
>>
>> tobby@ubuntu: sudo btrfs check /dev/sdb1
>> warning, device 2 is missing
>> warning devid 2 not found already
>> root item for root 1746, current bytenr 80450240512, current gen
>> 163697, current level 2, new bytenr 40074067968, new gen 163707, new
>> level 2
>> Found 1 roots with an outdated root item.
>> Please run a filesystem check with the option --repair to fix them.
>>
>> tobby@ubuntu: sudo btrfs check --repair /dev/sdb1
>> enabling repair mode
>> warning, device 2 is missing
>> warning devid 2 not found already
>> Unable to find block group for 0
>> extent-tree.c:289: find_search_start: Assertion `1` failed.
>
>
> The read-only snapshots taken under 3.17.1 are your core problem.
OK
>
> Now btrfsck is refusing to operate on the degraded RAID because degraded
> RAID is degraded so it's read-only. (this is an educated guess). Since
> btrfsck is _not_ a mount type of operation it's got no "degraded mode" that
> would let you deal with half a RAID as far as I know.
OK, good to know.
>
> In your case...
>
> It is _known_ that you need to be _not_ running 3.17.0 or 3.17.1 if you are
> going to make read-only snapshots safely.
> It is _known_ that you need to be running 3.17.2 to get a number of fixes
> that impact your circumstance.
> It is _known_ that you need to be running btrfs-progs 3.17 to repair the
> read-only snapshots that are borked up, and that you must _not_ have
> previously tried to repair the problem with an older btrfsck.
No, I didn't try to repair it with older kernels/btrfs-tools.
>
> Were I you, I would...
>
> Put the two disks back in the same computer before something bad happens.
>
> Upgrade that computer to 3.17.2 and 3.17 respectively.
As I mentioned before I only have two slots and my system on this
btrfs-raid1 is not working anymore. Not just when accessing
ro-snapshots - it crashes every time at the login prompt. So now I
installed Ubuntu 14.04 to a USB stick (so I can readd both btrfs
HDDs) and upgraded the kernel to 3.17.2 and btrfs-tools to 3.17.
>
> Take a backup (because I am paranoid like that, though current threat seems
> negligible).
I already have a backup. :)
>
> btrfsck your raid with --repair.
OK. And this is what I get now:
tobby@ubuntu: sudo btrfs check /dev/sda1
root item for root 1746, current bytenr 80450240512, current gen
163697, current level 2, new bytenr 40074067968, new gen 163707, new
level 2
Found 1 roots with an outdated root item.
Please run a filesystem check with the option --repair to fix them.
tobby@ubuntu: sudo btrfs check /dev/sda1 --repair
enabling repair mode
fixing root item for root 1746, current bytenr 80450240512, current
gen 163697, current level 2, new bytenr 40074067968, new gen 163707,
new level 2
Fixed 1 roots.
Checking filesystem on /dev/sda1
UUID: 3ad065be-2525-4547-87d3-0e195497f9cf
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 18446744073709551607 inode 258 errors 1000, some csum missing
found 36031450184 bytes used err is 1
total csum bytes: 59665716
total tree bytes: 3523330048
total fs tree bytes: 3234054144
total extent tree bytes: 202358784
btree space waste bytes: 755547262
file data blocks allocated: 122274091008
referenced 211741990912
Btrfs v3.17
>
> Alternately, if you previously tried to btrfsck the raid with a version
> prior to 3.17 tools after the read-only snapshot(s) problem, you will need
> to resort to mkfs.btrfs to solve the problem. But Hey! you have two disks,
> so break the RAID, then mkfs one of them, then copy the data, then re-make
> the RAID such that the new FS rules.
>
> Enjoy your system no longer taking racy read-only snapshots... 8-)
>
>
Aaaaand this worked! :) Server is back online without restoring any
files from the backup. Looks good to me!
But I can't do a balance anymore?
root@t-mon:~# btrfs balance start /dev/sda1
ERROR: can't access '/dev/sda1'
Regards
Tobias
* Re: filesystem corruption
2014-11-02 21:57 ` Chris Murphy
@ 2014-11-03 3:43 ` Zygo Blaxell
2014-11-03 17:11 ` Chris Murphy
0 siblings, 1 reply; 89+ messages in thread
From: Zygo Blaxell @ 2014-11-03 3:43 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Sun, Nov 02, 2014 at 02:57:22PM -0700, Chris Murphy wrote:
> On Nov 1, 2014, at 10:49 PM, Robert White <rwhite@pobox.com> wrote:
>
> > On 10/31/2014 10:34 AM, Tobias Holst wrote:
> >> I am now using another system with kernel 3.17.2 and btrfs-tools 3.17
> >> and inserted one of the two HDDs of my btrfs-RAID1 to it. I can't add
> >> the second one as there are only two slots in that server.
> >>
> >> This is what I got:
> >>
> >> tobby@ubuntu: sudo btrfs check /dev/sdb1
> >> warning, device 2 is missing
> >> warning devid 2 not found already
> >> root item for root 1746, current bytenr 80450240512, current gen
> >> 163697, current level 2, new bytenr 40074067968, new gen 163707, new
> >> level 2
> >> Found 1 roots with an outdated root item.
> >> Please run a filesystem check with the option --repair to fix them.
> >>
> >> tobby@ubuntu: sudo btrfs check --repair /dev/sdb1
> >> enabling repair mode
> >> warning, device 2 is missing
> >> warning devid 2 not found already
> >> Unable to find block group for 0
> >> extent-tree.c:289: find_search_start: Assertion `1` failed.
> >
> > The read-only snapshots taken under 3.17.1 are your core problem.
> >
> > Now btrfsck is refusing to operate on the degraded RAID because
> > degraded RAID is degraded so it's read-only. (this is an educated
> > guess).
>
> Degradedness and writability are orthogonal. If there's some problem
> with the fs that prevents it from being mountable rw, then that'd
> apply for both normal and degraded operation. If the fs is OK, it
> should permit writable degraded mounts.
>
> > Since btrfsck is _not_ a mount type of operation it's got no "degraded
> > mode" that would let you deal with half a RAID as far as I know.
>
> That's a problem. I can see why a repair might need an additional flag
> (maybe force) to repair a volume that has the minimum number of devices
> for degraded mounting, but not all are present. Maybe we wouldn't want
> it to be easy to accidentally run a repair that changes the file system
> when a device happens to be missing inadvertently that could be found
> and connected later.
>
> I think related to this is a btrfs equivalent of a bitmap. The metadata
> already has this information in it, but possibly right now btrfs
> lacks the equivalent behavior of mdadm readd when a previously missing
> device is reconnected. If it has a bitmap then it doesn't have to be
> completely rebuilt, the bitmap contains information telling md how to
> "catch up" the readded device, i.e. only that which is different needs
> to be written upon a readd.
>
> For example if I have a two device Btrfs raid1 for both data and
> metadata, and one device is removed and I mount -o degraded,rw one
> of them and make some small changes, unmount, then reconnect the
> missing device and mount NOT degraded - what happens? I haven't tried
> this.
I have. It's a filesystem-destroying disaster. Never do it, never let
it happen accidentally. Make sure that if a disk gets temporarily
disconnected, you either never mount it degraded, or never let it come
back (i.e. take the disk to another machine and wipefs it). Don't ever,
ever put 'degraded' in /etc/fstab mount options. Nope. No.
btrfs seems to assume the data is correct on both disks (the generation
numbers and checksums are OK) but gets confused by equally plausible but
different metadata on each disk. It doesn't take long before the
filesystem becomes data soup or crashes the kernel.
There is more than one way to get to this point. Take LVM snapshots of
the devices in a btrfs RAID1 array, and 'btrfs device scan' will see two
different versions of each btrfs device in a btrfs filesystem (one for
the origin LV and one for the snapshot). btrfs then assembles LVs of
different vintages randomly (e.g. one from the mount command line, one
from an earlier LVM snapshot of the second disk) with disastrous results
similar to the above. IMHO if btrfs sees multiple devices with the same
UUIDs, it should reject all of them and require an explicit device list;
however, mdadm has a way to deal with this that would also work.
mdadm puts event counters and timestamps in the device superblocks to
prevent any such accidental disjoint assembly and modification of members
of an array. If disks go temporarily offline with separate modifications
then mdadm refuses to accept disks with different counter+timestamp data
(so you'll get all the disks but one rejected, or only one disk with all
others rejected). The rejected disk(s) has to go through full device
recovery before rejoining the array--someone has to use mdadm to add
the rejected disk as if it was a new, blank one.
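The counter+timestamp check described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Member` tuple, its field names and the `assemble` function are invented for illustration and are not mdadm's actual logic or superblock layout. It shows the idea that only members agreeing with the freshest counter+timestamp pair are accepted, and the rest have to be re-added as if new.

```python
from collections import namedtuple

# Hypothetical summary of a member superblock; the field names are invented.
Member = namedtuple("Member", "name events update_time")


def assemble(members):
    """Split members into (accepted, rejected) by counter+timestamp.

    Only members whose (events, update_time) pair equals the freshest pair
    are accepted; everything else would need full recovery before rejoining.
    """
    freshest = max((m.events, m.update_time) for m in members)
    accepted = [m.name for m in members if (m.events, m.update_time) == freshest]
    rejected = [m.name for m in members if (m.events, m.update_time) != freshest]
    return accepted, rejected


# Both disks were modified separately: same event count, different timestamps.
members = [Member("sda", 9, 1000), Member("sdb", 9, 1500)]
accepted, rejected = assemble(members)
print(accepted, rejected)
```

With two members that were modified separately, one of them is rejected even though the event counters happen to be equal, because the timestamps differ.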
Currently btrfs won't mount a degraded array by default, which prevents
unrecoverable inconsistency. That's a safe behavior for now, but sooner
or later btrfs will need to be able to safely boot unattended on a
degraded RAID1 root filesystem.
> And I also don't know if a full balance (hours) is needed to
> "catch up" the formerly missing device. With md this is very fast -
> seconds/minutes depending on how much has been changed.
I schedule a scrub immediately after boot, assuming that it will resolve
any data differences (and also assuming that the reboot was caused by
a disk-related glitch, which it usually is for me). That might not
be enough for metadata differences, and it's certainly not enough for
modifications in degraded mode. Full balance is out of my reach--it
takes weeks on even my medium-sized filesystems, and mkfs + rsync from
backup is much faster.
* Re: filesystem corruption
2014-11-03 2:55 ` Tobias Holst
@ 2014-11-03 3:49 ` Robert White
0 siblings, 0 replies; 89+ messages in thread
From: Robert White @ 2014-11-03 3:49 UTC (permalink / raw)
To: Tobias Holst; +Cc: Rich Freeman, linux-btrfs@vger.kernel.org
On 11/02/2014 06:55 PM, Tobias Holst wrote:
> But I can't do a balance anymore?
>
> root@t-mon:~# btrfs balance start /dev/sda1
> ERROR: can't access '/dev/sda1'
Balance operates on a mounted filesystem, not on a bare block device.
So...
mount -t btrfs /dev/sda1 /some/path/somewhere
btrfs balance start /some/path/somewhere
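The same mistake can be caught before balance is ever invoked. The following is a minimal sketch in Python, assuming a hypothetical helper named `balance_path_problem`; it only looks at the file type of the path and does not verify that the directory actually lives on a btrfs filesystem.

```python
import os
import stat


def balance_path_problem(path):
    """Return None if `path` could be handed to balance, else a reason string."""
    try:
        mode = os.stat(path).st_mode
    except OSError as exc:
        return f"can't access {path}: {exc.strerror}"
    if stat.S_ISBLK(mode):
        # This is the /dev/sda1 case from the report above.
        return f"{path} is a block device; mount it and pass the mount point"
    if not stat.S_ISDIR(mode):
        return f"{path} is not a directory"
    return None


print(balance_path_problem("/"))
```

A wrapper script could call this first and print the reason instead of passing a device node through.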
* Re: filesystem corruption
2014-11-03 3:43 ` Zygo Blaxell
@ 2014-11-03 17:11 ` Chris Murphy
2014-11-04 4:31 ` Zygo Blaxell
0 siblings, 1 reply; 89+ messages in thread
From: Chris Murphy @ 2014-11-03 17:11 UTC (permalink / raw)
To: Zygo Blaxell; +Cc: Btrfs BTRFS
On Nov 2, 2014, at 8:43 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
> On Sun, Nov 02, 2014 at 02:57:22PM -0700, Chris Murphy wrote:
>>
>> For example if I have a two device Btrfs raid1 for both data and
>> metadata, and one device is removed and I mount -o degraded,rw one
>> of them and make some small changes, unmount, then reconnect the
>> missing device and mount NOT degraded - what happens? I haven't tried
>> this.
>
> I have. It's a filesystem-destroying disaster. Never do it, never let
> it happen accidentally. Make sure that if a disk gets temporarily
> disconnected, you either never mount it degraded, or never let it come
> back (i.e. take the disk to another machine and wipefs it). Don't ever,
> ever put 'degraded' in /etc/fstab mount options. Nope. No.
Well I guess I now see why opensuse's plan for Btrfs by default proscribes multiple device Btrfs volumes. The described scenario is really common with users, I see it often on linux-raid@. And md doesn't have this problem. The worst case scenario is if devices don't have bitmaps, and then a whole device rebuild has to happen rather than just a quick "catchup".
>
> btrfs seems to assume the data is correct on both disks (the generation
> numbers and checksums are OK) but gets confused by equally plausible but
> different metadata on each disk. It doesn't take long before the
> filesystem becomes data soup or crashes the kernel.
This is a pretty significant problem to still be present, honestly. I can understand the "catchup" mechanism is probably not built yet, but clearly the two devices don't have the same generation. The lower generation device should probably be booted/ignored or declared missing in the meantime to prevent trashing the file system.
Chris Murphy
* Re: filesystem corruption
2014-11-03 17:11 ` Chris Murphy
@ 2014-11-04 4:31 ` Zygo Blaxell
2014-11-04 8:25 ` Duncan
2014-11-04 18:28 ` Chris Murphy
0 siblings, 2 replies; 89+ messages in thread
From: Zygo Blaxell @ 2014-11-04 4:31 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Mon, Nov 03, 2014 at 10:11:18AM -0700, Chris Murphy wrote:
>
> On Nov 2, 2014, at 8:43 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
> > btrfs seems to assume the data is correct on both disks (the generation
> > numbers and checksums are OK) but gets confused by equally plausible but
> > different metadata on each disk. It doesn't take long before the
> > filesystem becomes data soup or crashes the kernel.
>
> This is a pretty significant problem to still be present, honestly. I
> can understand the "catchup" mechanism is probably not built yet,
> but clearly the two devices don't have the same generation. The lower
> generation device should probably be booted/ignored or declared missing
> in the meantime to prevent trashing the file system.
The problem with generation numbers is when both devices get divergent
generation numbers but we can't tell them apart, e.g.
1. sda generation = 5, sdb generation = 5
2. sdb temporarily disconnects, so we are degraded on just sda
3. sda gets more generations 6..9
4. sda temporarily disconnects, so we have no disks at all.
5. the machine reboots, gets sdb back but not sda
If we allow degraded here, then:
6. sdb gets more generations 6..9
7. sdb disconnects, no disks so no filesystem
8. the machine reboots again, this time with sda and sdb present
Now we have two disks with equal generation numbers. Generations 6..9
on sda are not the same as generations 6..9 on sdb, so if we mix the
two disks' metadata we get bad confusion.
It needs to be more than a sequential number. If one of the disks
disappears we need to record this fact on the surviving disks, and also
cope with _both_ disks claiming to be the "surviving" one.
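The sequence above can be replayed as a toy model in Python. Everything here is an assumption made for illustration: the `Disk` class, its `commit` method and the `history` list are invented and have nothing to do with the real on-disk format. The sketch only shows that comparing generation numbers cannot tell the two disks apart, while comparing a record of who wrote each generation can.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Toy stand-in for one raid1 member (hypothetical, not btrfs structures)."""
    name: str
    generation: int = 5
    # One entry per commit made while this disk was the only one present.
    history: list = field(default_factory=list)

    def commit(self, upto: int) -> None:
        """Advance to generation `upto`, recording who wrote each generation."""
        while self.generation < upto:
            self.generation += 1
            self.history.append((self.generation, self.name))


sda, sdb = Disk("sda"), Disk("sdb")   # step 1: both at generation 5
sda.commit(9)                          # steps 2-3: degraded on sda only
sdb.commit(9)                          # steps 5-6: degraded on sdb only

# Step 8: both disks are back. The generation check sees nothing wrong...
same_generation = sda.generation == sdb.generation
# ...but the generations were written independently and do not match.
same_history = sda.history == sdb.history

print(same_generation, same_history)
```

The generations compare equal while the histories do not, which is the extra information a bare sequential number cannot carry.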
>
> Chris Murphy
>
* Re: filesystem corruption
2014-11-04 4:31 ` Zygo Blaxell
@ 2014-11-04 8:25 ` Duncan
2014-11-04 18:28 ` Chris Murphy
1 sibling, 0 replies; 89+ messages in thread
From: Duncan @ 2014-11-04 8:25 UTC (permalink / raw)
To: linux-btrfs
Zygo Blaxell posted on Mon, 03 Nov 2014 23:31:45 -0500 as excerpted:
> On Mon, Nov 03, 2014 at 10:11:18AM -0700, Chris Murphy wrote:
>>
>> On Nov 2, 2014, at 8:43 PM, Zygo Blaxell <zblaxell@furryterror.org>
>> wrote:
>> > btrfs seems to assume the data is correct on both disks (the
>> > generation numbers and checksums are OK) but gets confused by equally
>> > plausible but different metadata on each disk. It doesn't take long
>> > before the filesystem becomes data soup or crashes the kernel.
>>
>> This is a pretty significant problem to still be present, honestly. I
>> can understand the "catchup" mechanism is probably not built yet,
>> but clearly the two devices don't have the same generation. The lower
>> generation device should probably be booted/ignored or declared missing
>> in the meantime to prevent trashing the file system.
>
> The problem with generation numbers is when both devices get divergent
> generation numbers but we can't tell them apart
[snip very reasonable scenario]
> Now we have two disks with equal generation numbers.
> Generations 6..9 on sda are not the same as generations 6..9 on sdb, so
> if we mix the two disks' metadata we get bad confusion.
>
> It needs to be more than a sequential number. If one of the disks
> disappears we need to record this fact on the surviving disks, and also
> cope with _both_ disks claiming to be the "surviving" one.
Zygo's absolutely correct. There is an existing catchup mechanism, but
the tracking is /purely/ sequential generation number based, and if the
two generation sequences diverge, "Welcome to the (data) Twilight Zone!"
I noted this in my own early pre-deployment raid1 mode testing as well,
except that I didn't at that point know about sequence numbers and never
got as far as letting the filesystem make data soup of itself.
What I did was this:
1) Create a two-device raid1 data and metadata filesystem, mount it and
stick some data on it.
2) Unmount, pull a device, mount degraded the remaining device.
3) Change a file.
4) Unmount, switch devices, mount degraded the other device.
5) Change the same file in a different/incompatible way.
6) Unmount, plug both devices in again, mount (not degraded).
7) Wait for the sync I was used to from mdraid, which of course didn't
occur.
8) Check the file to see which version showed up. I don't recall which
version it was, but it wasn't the common pre-change version.
9) Unmount, pull each device one at a time, mounting the other one
degraded and checking the file again.
10) The file on each device remained different, without a warning or
indication of any problem at all when I mounted undegraded in 6/7.
Had I initiated a scrub, presumably it would have seen the difference and
if one was a newer generation, it would have taken it, overwriting the
other. I don't know what it would have done if both were the same
generation, tho the file being small (just a few line text file, big
enough to test the effect of differing edits), I guess it would take one
version or the other. If the file was large enough to be multiple
extents, however, I've no idea whether it'd take one or the other, or
possibly combine the two, picking extents where they differed more or
less randomly.
By that time the lack of warning and absolute resolution to one version
or the other even after mounting undegraded and accessing the file with
incompatible versions on each of the two devices was bothering me
sufficiently that I didn't test any further.
Being just me I have to worry about (unlike a multi-admin corporate
scenario where you can never be /sure/ what the other admins will do
regardless of agreed procedure), I simply set myself a set of rules very
similar to what Zygo proposed:
1) If for whatever reason I ever split a btrfs raid1 with the intent or
even the possibility of bringing the pieces back together again, if at
all possible, never mount the split pieces writable -- mount read-only.
2) If a writable mount is required, keep the writable mounts to one
device of the split. As long as the other device is never mounted
writable, it will have an older generation when they're reunited and a
scrub should take care of things, reliably resolving to the updated
written device, rewriting the older generation on the other device.
What I'd do here is physically put the removed side of the raid1 in
storage, far enough from the remaining side that I couldn't possibly get
them mixed up. I'd clearly label it as well, creating a "defense in
depth" of at least two, the labeling and the physical separation and
storage of the read-only device.
3) If for whatever reason the originally read-only side must be mounted
writable, very clearly mark the originally mounted-writable device
POISONED/TOXIC!! *NEVER* *EVER* let such a POISONED device anywhere near
its original raid1 mate, until it is wiped, such that there's no
possibility of btrfs getting confused and contaminated with the poisoned
data.
Given how unimpressed I was with btrfs' ability to do the right thing in
such cases, I'd be tempted to wipefs the device, then dd from
/dev/zero to it, then badblocks write-pattern test a couple patterns,
then (if it was a full physical device not just a partition) hardware
secure-erase it, then mkfs it to ext4 or vfat, then dd from /dev/zero it
again and again hardware secure-erase it, then FINALLY mkfs.btrfs it
again. Of course being ssd, a single mkfs.btrfs would issue a trim and
that should suffice, but I was really REALLY not impressed with btrfs'
ability to reliably do the right thing, and would effectively be tearing
up the schoolbooks (at least the workbooks, since they couldn't be bought
back) and feeding them to the furnace at the end of the year, as I used
to do when I was a kid, not because it made a difference, but because it
was so emotionally rewarding! =:^)
Or maybe I'd make that an excuse to try dban[1].
But I'd probably just dd from /dev/zero or secure-erase it, or badblocks-
write-test a couple patterns if I wanted to badblocks-test it anyway, or
mkfs.btrfs it to get the trim from that.
But I'd have fun doing it. =:^)
And then I'd plug it back in and btrfs replace the missing device.
Anyway, the point is, either don't reintroduce absent devices once split
out of a btrfs raid1, or ensure they don't get written and immediately do
a scrub to update them when reintroduced, or if they were written and the
other device was too, separately, be sure the one is wiped (Destroy them
with Lasers![2]) before using a full btrfs replace, to keep the remaining
device(s) and the data on them healthy. =:^)
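The rules above reduce to a small decision table. As a rough sketch in Python (the function name `reunion_plan` and its return strings are made up for illustration, and it encodes only the rules of thumb from this message, not anything btrfs itself enforces):

```python
def reunion_plan(a_written, b_written):
    """Decide what to do when two halves of a split raid1 are reunited.

    Each argument says whether that half was ever mounted writable while
    the other half was absent.
    """
    if a_written and b_written:
        # Both sides diverged: one of them must never meet its mate again.
        return "wipe one half, then btrfs replace the missing device"
    if a_written or b_written:
        # Only one side moved on, so the other has an older generation.
        return "reunite, then scrub immediately"
    return "reunite"


for case in [(False, False), (True, False), (True, True)]:
    print(case, "->", reunion_plan(*case))
```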
---
[1] https://www.google.com/search?q=dban
[2] Destroy them with Lazers! by Knife Party
https://www.google.com/search?q=destroy+them+with+lazers
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: filesystem corruption
2014-11-04 4:31 ` Zygo Blaxell
2014-11-04 8:25 ` Duncan
@ 2014-11-04 18:28 ` Chris Murphy
2014-11-04 21:44 ` Duncan
` (2 more replies)
1 sibling, 3 replies; 89+ messages in thread
From: Chris Murphy @ 2014-11-04 18:28 UTC (permalink / raw)
To: Zygo Blaxell; +Cc: Btrfs BTRFS
On Nov 3, 2014, at 9:31 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
> On Mon, Nov 03, 2014 at 10:11:18AM -0700, Chris Murphy wrote:
>>
>> On Nov 2, 2014, at 8:43 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
>>> btrfs seems to assume the data is correct on both disks (the generation
>>> numbers and checksums are OK) but gets confused by equally plausible but
>>> different metadata on each disk. It doesn't take long before the
>>> filesystem becomes data soup or crashes the kernel.
>>
>> This is a pretty significant problem to still be present, honestly. I
>> can understand the "catchup" mechanism is probably not built yet,
>> but clearly the two devices don't have the same generation. The lower
>> generation device should probably be booted/ignored or declared missing
>> in the meantime to prevent trashing the file system.
>
> The problem with generation numbers is when both devices get divergent
> generation numbers but we can't tell them apart, e.g.
>
> 1. sda generation = 5, sdb generation = 5
>
> 2. sdb temporarily disconnects, so we are degraded on just sda
>
> 3. sda gets more generations 6..9
>
> 4. sda temporarily disconnects, so we have no disks at all.
>
> 5. the machine reboots, gets sdb back but not sda
>
> If we allow degraded here, then:
>
> 6. sdb gets more generations 6..9
>
> 7. sdb disconnects, no disks so no filesystem
>
> 8. the machine reboots again, this time with sda and sdb present
>
> Now we have two disks with equal generation numbers. Generations 6..9
> on sda are not the same as generations 6..9 on sdb, so if we mix the
> two disks' metadata we get bad confusion.
>
> It needs to be more than a sequential number. If one of the disks
> disappears we need to record this fact on the surviving disks, and also
> cope with _both_ disks claiming to be the "surviving" one.
I agree this is also a problem. But the most common case is where we know that sda generation is newer (larger value) and most recently modified, and sdb has not since been modified but needs to be caught up. As far as I know the only way to do that on Btrfs right now is a full balance; it doesn't catch up just by being reconnected with a normal mount.
Chris Murphy
* Re: filesystem corruption
2014-11-04 18:28 ` Chris Murphy
@ 2014-11-04 21:44 ` Duncan
2014-11-04 22:19 ` Robert White
2014-11-04 22:34 ` Zygo Blaxell
2 siblings, 0 replies; 89+ messages in thread
From: Duncan @ 2014-11-04 21:44 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Tue, 04 Nov 2014 11:28:39 -0700 as excerpted:
>> It needs to be more than a sequential number. If one of the disks
>> disappears we need to record this fact on the surviving disks, and also
>> cope with _both_ disks claiming to be the "surviving" one.
>
> I agree this is also a problem. But the most common case is where we
> know that sda generation is newer (larger value) and most recently
> modified, and sdb has not since been modified but needs to be caught up.
> As far as I know the only way to do that on Btrfs right now is a full
> balance, it doesn't catch up just be being reconnected with a normal
> mount.
I thought it was a scrub that would take care of that, not a balance?
(Maybe do both to be sure?)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2014-11-04 18:28 ` Chris Murphy
2014-11-04 21:44 ` Duncan
@ 2014-11-04 22:19 ` Robert White
2014-11-04 22:34 ` Zygo Blaxell
2 siblings, 0 replies; 89+ messages in thread
From: Robert White @ 2014-11-04 22:19 UTC (permalink / raw)
To: Chris Murphy, Zygo Blaxell; +Cc: Btrfs BTRFS
On 11/04/2014 10:28 AM, Chris Murphy wrote:
> On Nov 3, 2014, at 9:31 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
>> Now we have two disks with equal generation numbers. Generations 6..9
>> on sda are not the same as generations 6..9 on sdb, so if we mix the
>> two disks' metadata we get bad confusion.
>>
>> It needs to be more than a sequential number. If one of the disks
>> disappears we need to record this fact on the surviving disks, and also
>> cope with _both_ disks claiming to be the "surviving" one.
>
> I agree this is also a problem. But the most common case is where we know that sda generation is newer (larger value) and most recently modified, and sdb has not since been modified but needs to be caught up. As far as I know the only way to do that on Btrfs right now is a full balance, it doesn't catch up just be being reconnected with a normal mount.
I would think that any time any system or fraction thereof is mounted
with both "degraded" and "rw" status, a degraded flag should be set
somewhere/somehow in the superblock etc.
The only way to clear this flag would be to reach a "reconciled" state.
That state could be reached in one of several ways. Removing the missing
mirror element would be a fast reconcile; doing a balance or scrub would
be a slow reconcile for a filesystem where all the media are returned to
service (e.g. the missing volume of a RAID 1 is returned).
Generation numbers are pretty good, but I'd put on a rider that any
generation number or equivalent incremented while the system is degraded
should have a unique quantum (say a GUID) generated and stored along with
the generation number. The mere existence of this quantum would act as
the degraded flag.
Any check/compare/access related to the generation number would know to
notice that the GUID is in place and do the necessary resolution. If
successful the GUID would be discarded.
As to how this could be implemented, I'm not fully conversant with the
internal layout.
One possibility would be to add a block reference or, indeed, replace
the current storage for generation numbers completely with a block
reference to a block containing the generation number and the potential
GUID. The main value of having an out-of-structure reference is that its
content is less space constrained, and it could be shared by multiple
usages. In the case, for instance, where the block is added (as opposed
to replacing the generation number), only one such block would be needed
per degraded,rw mount, and it could be attached to as many filesystem
structures as needed.
Just as metadata under DUP is divergent after a degraded mount, a
generation block would be divergent, and likely in a different location
than its peers on a subsequent restored geometry.
A generation block could have other niceties like the date/time and the
devices present (or absent); such information could conceivably be used
to intelligently disambiguate references. For instance, if one degraded
mount had sda and sdb, and a second had sdb and sdc, then it'd be known
that sdb was dominant for having been present every time.
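A minimal sketch of the generation-plus-GUID idea described above, in
Python rather than anything resembling the on-disk format. The names
GenStamp, degraded_id and classify are hypothetical and exist only for
this illustration; the sketch assumes each device carries a generation
number plus an optional GUID that is set only while writes happened
during a degraded,rw mount.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenStamp:
    """Hypothetical per-device stamp: generation plus optional degraded GUID."""
    generation: int
    degraded_id: Optional[uuid.UUID] = None  # set only by degraded,rw writes


def classify(a: GenStamp, b: GenStamp) -> str:
    """Return 'in_sync', 'a_stale', 'b_stale' or 'diverged' for two devices."""
    a_marked = a.degraded_id is not None
    b_marked = b.degraded_id is not None

    if a_marked and b_marked and a.degraded_id != b.degraded_id:
        # Both devices claim to be the "surviving" one from different
        # degraded mounts: equal generation numbers prove nothing here.
        return "diverged"

    if a_marked != b_marked:
        marked, other = (a, b) if a_marked else (b, a)
        if other.generation >= marked.generation:
            # The unmarked device should be strictly behind; treat
            # anything else as divergence rather than guess.
            return "diverged"
        return "b_stale" if a_marked else "a_stale"

    # Neither marked, or both marked by the same degraded mount:
    # a plain generation comparison is enough.
    if a.generation == b.generation:
        return "in_sync"
    return "a_stale" if a.generation < b.generation else "b_stale"


# Both disks advanced 6..9 independently while the other was missing.
sda = GenStamp(9, uuid.uuid4())
sdb = GenStamp(9, uuid.uuid4())
print(classify(sda, sdb))

# The common case: sda advanced while degraded, sdb was never modified.
print(classify(GenStamp(9, uuid.uuid4()), GenStamp(5)))
```

With a bare sequential number the first pair would look identical; the
GUID is what lets the comparison report divergence instead of mixing the
two disks' metadata.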
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: filesystem corruption
2014-11-04 18:28 ` Chris Murphy
2014-11-04 21:44 ` Duncan
2014-11-04 22:19 ` Robert White
@ 2014-11-04 22:34 ` Zygo Blaxell
2 siblings, 0 replies; 89+ messages in thread
From: Zygo Blaxell @ 2014-11-04 22:34 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Tue, Nov 04, 2014 at 11:28:39AM -0700, Chris Murphy wrote:
> On Nov 3, 2014, at 9:31 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
> > It needs to be more than a sequential number. If one of the disks
> > disappears we need to record this fact on the surviving disks, and also
> > cope with _both_ disks claiming to be the "surviving" one.
>
> I agree this is also a problem. But the most common case is where we
> know that sda generation is newer (larger value) and most recently
> modified, and sdb has not since been modified but needs to be caught
> up. As far as I know the only way to do that on Btrfs right now is
> a full balance, it doesn't catch up just be being reconnected with a
> normal mount.
The data on the disks might be inconsistent, so resynchronization must
read from only the "good" copy. A balance could just spread corruption
around if it reads from two out-of-sync mirrors. (Maybe it already does
the right thing if sdb was not modified...?)
The full resync operation is more like btrfs device replace, except that
it's replacing a disk in-place (i.e. without removing it first), and it
would not read from the non-"good" disk.
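As a toy illustration of that resync rule (not how btrfs device replace
is implemented), the sketch below models each mirror as a dict mapping
block numbers to bytes. The function name resync_from_good and the dict
model are assumptions made for this example. The point is only that the
stale mirror is treated as write-only: its contents are never used as a
source.

```python
from typing import Dict


def resync_from_good(good: Dict[int, bytes], stale: Dict[int, bytes]) -> int:
    """Overwrite `stale` in place using only data read from `good`.

    Returns the number of blocks written. The old contents of `stale`
    are discarded without being consulted, so out-of-sync data on it
    cannot leak into the result.
    """
    stale.clear()
    written = 0
    for block, data in good.items():
        stale[block] = data  # the source is always the good mirror
        written += 1
    return written


sda = {0: b"super-gen9", 1: b"tree-gen9", 2: b"data-new"}
sdb = {0: b"super-gen5", 1: b"tree-gen5", 3: b"orphan"}

count = resync_from_good(sda, sdb)
print(count, sdb == sda)
```

A balance that picked its source copy from either mirror would have no
such guarantee, which is the concern raised above.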
^ permalink raw reply [flat|nested] 89+ messages in thread
* Filesystem Corruption
@ 2018-12-03 9:31 Stefan Malte Schumacher
2018-12-03 11:34 ` Qu Wenruo
2018-12-03 16:29 ` remi
0 siblings, 2 replies; 89+ messages in thread
From: Stefan Malte Schumacher @ 2018-12-03 9:31 UTC (permalink / raw)
To: Btrfs BTRFS
Hello,
I have noticed an unusual number of CRC errors in downloaded RARs,
beginning about a week ago. But let's start with the preliminaries. I
am using Debian Stretch.
Kernel: Linux mars 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4
(2018-08-21) x86_64 GNU/Linux
BTRFS-Tools btrfs-progs 4.7.3-1
Smartctl shows no errors for any of the drives in the filesystem.
btrfs dev stats shows zero errors, but dmesg gives me a lot of
filesystem-related error messages.
[5390748.884929] Buffer I/O error on dev dm-0, logical block
976701312, async page read
This error is shown many times in the log.
This seems to affect just newly written files. This is the output of
btrfs scrub status:
scrub status for 1609e4e1-4037-4d31-bf12-f84a691db5d8
scrub started at Tue Nov 27 06:02:04 2018 and finished after 07:34:16
total bytes scrubbed: 17.29TiB with 0 errors
What is the probable cause of these errors? How can I fix this?
Thanks in advance for your advice
Stefan
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem Corruption
2018-12-03 9:31 Stefan Malte Schumacher
@ 2018-12-03 11:34 ` Qu Wenruo
2018-12-03 16:29 ` remi
1 sibling, 0 replies; 89+ messages in thread
From: Qu Wenruo @ 2018-12-03 11:34 UTC (permalink / raw)
To: Stefan Malte Schumacher, Btrfs BTRFS
On 2018/12/3 5:31 PM, Stefan Malte Schumacher wrote:
> Hello,
>
> I have noticed an unusual amount of crc-errors in downloaded rars,
> beginning about a week ago. But lets start with the preliminaries. I
> am using Debian Stretch.
> Kernel: Linux mars 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4
> (2018-08-21) x86_64 GNU/Linux
> BTRFS-Tools btrfs-progs 4.7.3-1
> Smartctl shows no errors for any of the drives in the filesystem.
>
> Btrfs /dev/stats shows zero errors, but dmesg gives me a lot of
> filesystem related error messages.
>
> [5390748.884929] Buffer I/O error on dev dm-0, logical block
> 976701312, async page read
> This errors is shown a lot of time in the log.
No "btrfs:" prefix, so this looks more like an error message from the
block level; no wonder btrfs shows no errors at all.
What is the underlying device mapper?
Furthermore, is there any kernel message with "btrfs"
(case-insensitive) in it?
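One way to run that check, sketched in Python: save the kernel log to a
file (or pipe it in) and keep only lines mentioning "btrfs" in any case.
The helper name btrfs_lines is made up for this example, and the second
sample line is a hypothetical placeholder, not output from the reporter's
machine.

```python
import re
from typing import Iterable, List

BTRFS_RE = re.compile(r"btrfs", re.IGNORECASE)


def btrfs_lines(log_lines: Iterable[str]) -> List[str]:
    """Return only the log lines that mention btrfs, ignoring case."""
    return [line for line in log_lines if BTRFS_RE.search(line)]


sample = [
    # Line quoted in the report: no btrfs prefix, so it is filtered out.
    "[5390748.884929] Buffer I/O error on dev dm-0, logical block "
    "976701312, async page read",
    # Hypothetical line, included only to show what a match looks like.
    "[0000000.000000] BTRFS: hypothetical example message",
]

for line in btrfs_lines(sample):
    print(line)
```

If the filter returns nothing for the real log, that supports the
reading that the errors originate below the filesystem.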
Thanks,
Qu
>
> This seems to affect just newly written files. This is the output of
> btrfs scrub status:
> scrub status for 1609e4e1-4037-4d31-bf12-f84a691db5d8
> scrub started at Tue Nov 27 06:02:04 2018 and finished after 07:34:16
> total bytes scrubbed: 17.29TiB with 0 errors
>
> What is the probable cause of these errors? How can I fix this?
>
> Thanks in advance for your advice
> Stefan
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Filesystem Corruption
2018-12-03 9:31 Stefan Malte Schumacher
2018-12-03 11:34 ` Qu Wenruo
@ 2018-12-03 16:29 ` remi
1 sibling, 0 replies; 89+ messages in thread
From: remi @ 2018-12-03 16:29 UTC (permalink / raw)
To: Stefan Malte Schumacher, Btrfs BTRFS
On Mon, Dec 3, 2018, at 4:31 AM, Stefan Malte Schumacher wrote:
> I have noticed an unusual amount of crc-errors in downloaded rars,
> beginning about a week ago. But lets start with the preliminaries. I
> am using Debian Stretch.
> Kernel: Linux mars 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4
> (2018-08-21) x86_64 GNU/Linux
>
> [5390748.884929] Buffer I/O error on dev dm-0, logical block
> 976701312, async page read
Excuse me for butting in when there are *many* more qualified people on this list.
But assuming the RAR CRC errors are related to your unexplained buffer I/O errors (and not some weird coincidence of simply bad downloads), I would start, immediately, by testing the memory. RAM corruption can wreak havoc with Btrfs (or any filesystem, but I think Btrfs has special challenges in this regard), and this looks like a memory error to me.
^ permalink raw reply [flat|nested] 89+ messages in thread
Thread overview: 89+ messages
2002-06-06 18:00 Filesystem Corruption Kurt
-- strict thread matches above, loose matches on Subject: below --
2018-12-03 9:31 Stefan Malte Schumacher
2018-12-03 11:34 ` Qu Wenruo
2018-12-03 16:29 ` remi
2014-10-31 0:29 filesystem corruption Tobias Holst
2014-10-31 1:02 ` Tobias Holst
2014-10-31 2:41 ` Rich Freeman
2014-10-31 17:34 ` Tobias Holst
2014-11-02 4:49 ` Robert White
2014-11-02 21:57 ` Chris Murphy
2014-11-03 3:43 ` Zygo Blaxell
2014-11-03 17:11 ` Chris Murphy
2014-11-04 4:31 ` Zygo Blaxell
2014-11-04 8:25 ` Duncan
2014-11-04 18:28 ` Chris Murphy
2014-11-04 21:44 ` Duncan
2014-11-04 22:19 ` Robert White
2014-11-04 22:34 ` Zygo Blaxell
2014-11-03 2:55 ` Tobias Holst
2014-11-03 3:49 ` Robert White
2011-01-03 1:58 Patrick H.
2011-01-03 3:16 ` Neil Brown
[not found] ` <4D214B5C.3010103@feystorm.net>
2011-01-03 4:56 ` Neil Brown
2011-01-03 5:05 ` Patrick H.
2011-01-04 5:33 ` NeilBrown
2011-01-04 7:50 ` Patrick H.
2011-01-04 17:31 ` Patrick H.
2011-01-05 1:22 ` Patrick H.
2011-01-05 7:02 ` CoolCold
[not found] ` <AANLkTinL_nz58f8rSPuhYvVwGY5jdu1XVkNLC1ky5A65@mail.gmail.com>
2011-01-05 14:28 ` Patrick H.
2011-01-05 15:52 ` Spelic
2011-01-05 15:55 ` Patrick H.
2007-06-06 3:10 Filesystem corruption Xu CanHao
2007-06-06 12:16 ` Ingo Bormuth
2007-05-30 20:13 devsk
2007-05-30 17:22 devsk
2007-05-30 19:24 ` Toby Thain
2007-05-30 20:03 ` David Masover
2007-05-31 0:11 ` Ingo Bormuth
2007-06-02 23:10 ` Edward Shishkin
2007-06-04 2:55 ` Ingo Bormuth
2007-06-04 9:41 ` Edward Shishkin
2007-06-05 23:20 ` Ingo Bormuth
2007-05-27 13:18 Laurent CARON
2007-05-28 12:23 ` Vladimir V. Saveliev
2007-05-28 14:10 ` Laurent CARON
2007-05-28 17:13 ` Vladimir V. Saveliev
2007-05-28 17:27 ` Laurent CARON
[not found] ` <Pine.LNX.4.64.0705280025570.10429@sheep.housecafe.de>
2007-05-28 17:31 ` Christian Kujau
2007-05-28 18:16 ` Laurent CARON
2007-05-28 23:19 ` Christian Kujau
2007-05-29 8:39 ` Vladimir V. Saveliev
[not found] ` <465BA9AC.8040805@ultraviolet.org>
2007-05-29 8:15 ` Vladimir V. Saveliev
2007-05-29 12:36 ` Toby Thain
2007-05-30 13:25 ` David Masover
2007-05-30 16:02 ` Vladimir V. Saveliev
2007-05-30 20:06 ` David Masover
2007-05-30 16:42 ` Toby Thain
2007-05-30 19:42 ` David Masover
2007-05-30 16:08 ` Vladimir V. Saveliev
2003-08-13 16:05 Locke
2003-08-14 7:49 ` Oleg Drokin
2002-09-05 15:57 Filesystem Corruption Brian Tinsley
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2002-06-07 7:15 ` Oleg Drokin
2002-06-11 16:49 ` Kurt
2002-06-06 18:00 Kurt
2002-06-06 18:00 Kurt
2001-02-05 16:00 Filesystem corruption Ian Chilton
2001-02-05 13:16 Ian Chilton
2001-01-31 14:20 Carsten Langgaard
2001-01-31 15:52 ` Florian Lohoff
2001-01-31 16:24 ` Carsten Langgaard
2001-01-31 16:48 ` Florian Lohoff
2001-02-05 10:02 ` Ralf Baechle
2001-02-05 12:10 ` Alan Cox
2001-02-05 12:10 ` Alan Cox
2001-02-05 12:56 ` Geert Uytterhoeven
2001-02-05 13:01 ` Alan Cox
2001-02-05 13:01 ` Alan Cox
2001-02-05 22:01 ` Ralf Baechle
2001-02-05 22:01 ` Ralf Baechle