* Re: Error testing ext3 on brd ramdisk
[not found] ` <20090317094019.GA10360@smart.research.nokia.com>
@ 2009-03-18 12:11 ` Nick Piggin
2009-03-18 13:42 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Nick Piggin @ 2009-03-18 12:11 UTC
To: Denis Karpov
Cc: ext Jorge Boncompte [DTI2], Hunter Adrian (Nokia-D/Helsinki),
LKML, Jan Kara, linux-ext4
On Tue, Mar 17, 2009 at 11:40:19AM +0200, Denis Karpov wrote:
> Hello,
>
> first off, sorry if you are getting this email twice.
No problem. I haven't been able to reproduce it myself, but Jan Kara
has just fixed some issues which could explain it: they happen under
memory pressure, so I may not have triggered them because my system
was not under pressure.
Jan's fixes are here:
http://marc.info/?l=linux-ext4&m=123731584711382&w=2
It would be interesting to try them, and if they don't work maybe
he's also interested so I cc'ed him.
> I also tried to do ext3/ext4 fs smoketesting and used Adrian's
> script. I am consistently getting the same results - the filesystem gets
> corrupted.
> I tested on quad Xeon, with patches posted in this thread.
>
> 1. tests with brd:
> - ext3fs on brd
> corruption (see attached ext3fs.brd.corruption.txt)
> - ext4fs on brd
> corruption (see attached ext4fs.brd.corruption.txt)
>
> In both cases I saw some complaints from JBD/JBD2:
> JBD: Detected IO errors while flushing file data on
>
> 2. I enabled JBD debugging and re-ran the tests. Console was
> flooded with messages and in the end I got a soft lockup.
> I cannot consistently reproduce this (see attached
> brd.ext3fs.softlock.txt).
>
> Just to be sure I re-ran the tests on a real block device (usb stick)
>
> 3. tests with real block device (usb stick)
> - ext3fs
> no fs corruption (overnight run)
> - ext4fs
> no fs corruption (overnight run)
It's possible the real block device is not fast enough to trigger
it, or that its different timing doesn't trigger it (brd requests
complete immediately, whereas real devices tend to complete later,
from (soft)interrupt context).
Or it could be that brd is consuming some more memory to push
the system into reclaim and exposing those bugs Jan has fixed...
> Any ideas what else can be done here? I'd like to find out if this is
> filesystem or brd related fault.
Yes, thanks for persisting. Could you test the patches and see
if they help? If not, does ext2 show corruption? How about ext3
on a loop device (with a backing file on tmpfs/ramfs for speed)?
Thanks,
Nick
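The loop-device variant suggested above could be set up roughly as follows.
This is only a dry-run sketch: it builds the command lines and prints them
without running anything. The mount points, image name, loop device and
sizes are all assumptions chosen for illustration (the 4 MB default merely
echoes the 4096-block filesystems in the attached logs), not values from
the thread.

```python
import shlex


def loop_on_tmpfs_plan(tmpfs_dir="/mnt/tmpfs", image="ext3.img", size_mb=4,
                       loop_dev="/dev/loop0", mountpoint="/mnt/test",
                       fstype="ext3"):
    """Return argv lists that put a filesystem on a loop device whose
    backing file lives in tmpfs. Nothing is executed here; all paths and
    device names are hypothetical defaults."""
    backing = f"{tmpfs_dir}/{image}"
    return [
        # RAM-backed directory with some headroom for the image file.
        ["mount", "-t", "tmpfs", "-o", f"size={size_mb * 2}m", "tmpfs", tmpfs_dir],
        # Zero-filled backing file of the requested size.
        ["dd", "if=/dev/zero", f"of={backing}", "bs=1M", f"count={size_mb}"],
        # Attach the file to a loop device.
        ["losetup", loop_dev, backing],
        # Create and mount the filesystem under test.
        [f"mkfs.{fstype}", "-q", loop_dev],
        ["mount", "-t", fstype, loop_dev, mountpoint],
    ]


if __name__ == "__main__":
    for argv in loop_on_tmpfs_plan():
        print(shlex.join(argv))
```

Passing fstype="ext2" gives the ext2 comparison run from the same plan. The
commands need root and would have to be run by hand or through a wrapper.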
* Re: Error testing ext3 on brd ramdisk
2009-03-18 12:11 ` Error testing ext3 on brd ramdisk Nick Piggin
@ 2009-03-18 13:42 ` Jan Kara
2009-03-20 12:24 ` Denis Karpov
2009-03-20 13:35 ` Denis Karpov
0 siblings, 2 replies; 5+ messages in thread
From: Jan Kara @ 2009-03-18 13:42 UTC
To: Nick Piggin
Cc: Denis Karpov, ext Jorge Boncompte [DTI2],
Hunter Adrian (Nokia-D/Helsinki), LKML, linux-ext4
> On Tue, Mar 17, 2009 at 11:40:19AM +0200, Denis Karpov wrote:
> > Hello,
> >
> > first off, sorry if you are getting this email twice.
>
> No problem, I'm not exactly able to reproduce it myself, but Jan Kara
> has just fixed some issues which could explain it: they happen under
> memory pressure so I may not have triggered it if I didn't put it
> under pressure.
>
> Jan's fixes are here:
>
> http://marc.info/?l=linux-ext4&m=123731584711382&w=2
>
> It would be interesting to try them, and if they don't work maybe
> he's also interested so I cc'ed him.
>
>
> > I also tried to do ext3/ext4 fs smoketesting and used Adrian's
> > script. I am consistently getting the same results - the filesystem gets
> > corrupted.
> > I tested on quad Xeon, with patches posted in this thread.
> >
> > 1. tests with brd:
> > - ext3fs on brd
> > corruption (see attached ext3fs.brd.corruption.txt)
> > - ext4fs on brd
> > corruption (see attached ext4fs.brd.corruption.txt)
> >
> > In both cases I saw some complaints from JBD/JBD2:
> > JBD: Detected IO errors while flushing file data on
Yes, my patches fix exactly this problem. So please try running with
them. I'm not sure about that HTREE corruption you see during fsck. That
seems to be a separate issue.
> > 2. I enabled JBD debugging and re-ran the tests. Console was
> > flooded with messages and in the end I got a soft lockup.
> > I cannot consistently reproduce this (see attached
> > brd.ext3fs.softlock.txt).
Yes, this usually produces far too many messages. The soft lockup was
probably caused by the machine being too busy logging all the messages
(log files are synced which adds much more to the load of the
filesystem). I'd probably leave that aside for now and concentrate on
the corruption problem.
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
* Re: Error testing ext3 on brd ramdisk
2009-03-18 13:42 ` Jan Kara
@ 2009-03-20 12:24 ` Denis Karpov
2009-03-20 12:49 ` Denis Karpov
2009-03-20 13:35 ` Denis Karpov
1 sibling, 1 reply; 5+ messages in thread
From: Denis Karpov @ 2009-03-20 12:24 UTC
To: ext Jan Kara
Cc: Nick Piggin, Denis Karpov, ext Jorge Boncompte [DTI2],
Hunter Adrian (Nokia-D/Helsinki), LKML,
linux-ext4@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 729 bytes --]
> > Jan's fixes are here:
> > http://marc.info/?l=linux-ext4&m=123731584711382&w=2
> > It would be interesting to try them, and if they don't work maybe
> > he's also interested so I cc'ed him.
Hi,
thank you for the reply. I re-ran the tests with this patch.
> > >
> > > In both cases I saw some complaints from JBD/JBD2:
> > > JBD: Detected IO errors while flushing file data on
> Yes, my patches fix exactly this problem. So please try running with
> them. I'm not sure about that HTREE corruption you see during fsck. That
> seems to be a separate issue.
Unfortunately it looks like the problem is not fixed - JBD still complains
and in the end HTREE is getting damaged, in both ext3 and ext4 tests (see
attached logs).
Denis
[-- Attachment #2: ext3.htree.txt --]
[-- Type: text/plain, Size: 6376 bytes --]
-------------------------------------------------------------
Cycle 34
Fri Mar 20 08:06:06 EDT 2009
Mounting
[ 1128.228621] EXT3 FS on ram0, internal journal
[ 1128.232046] kjournald starting. Commit interval 5 seconds
[ 1128.238557] EXT3-fs: mounted filesystem with ordered data mode.
Removing old fsstress data
Starting fsstress
Sleeping 30 seconds
seed = 1238324876
[ 1139.101289] JBD: Detected IO errors while flushing file data on ram0
[ 1139.116050] JBD: Detected IO errors while flushing file data on ram0
[ 1139.124125] JBD: Detected IO errors while flushing file data on ram0
[ 1140.100787] JBD: Detected IO errors while flushing file data on ram0
[ 1140.375989] JBD: Detected IO errors while flushing file data on ram0
[ 1141.313458] JBD: Detected IO errors while flushing file data on ram0
[ 1142.387028] JBD: Detected IO errors while flushing file data on ram0
[ 1142.668047] JBD: Detected IO errors while flushing file data on ram0
[ 1143.035936] JBD: Detected IO errors while flushing file data on ram0
[ 1143.160422] JBD: Detected IO errors while flushing file data on ram0
Stopping fsstress
3974 ttyS0 00:00:00 fsstress_ext3
3977 ttyS0 00:00:20 fsstress_ext3
3978 ttyS0 00:00:20 fsstress_ext3
3979 ttyS0 00:00:20 fsstress_ext3
./brd_test_ext3.sh: line 37: 3974 Terminated `pwd`/fsstress_ext3 -d $TESTDIR/work -p 3 -l 0 -n 100000000
Unmounting
Checking
/dev/ram0: HTREE directory inode 42 has an invalid root node. HTREE INDEX CLEARED.
/dev/ram0: Entry 'c302b' in /work/p1/da (42) has an incorrect filetype (was 3, should be 1).
/dev/ram0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
# fsck /dev/ram0
fsck 1.41.3 (12-Oct-2008)
e2fsck 1.41.3 (12-Oct-2008)
/dev/ram0 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'c302b' in /work/p1/da (42) has an incorrect filetype (was 3, should be 1).
Fix<y>? yes
Entry 'd3699' in /work/p1/da (42) is a link to directory /work/p1/da/d18a/d3874/d278f/d2401/d3e6a/d4380/d33e8/d2a30/d3ad4/d45f3 (968).
Clear<y>? yes
Entry 'd3ad4' in /work/p1/da/d18a/d3874/d278f/d2401/d3e6a/d4380/d33e8/d2a30 (192) is a link to directory /work/p1/da/d207a (811).
Clear<y>? yes
Entry 'd18a' in /work/p1/da (42) is a link to directory /work/p1/da/d18a (491).
Clear<y>? yes
Pass 3: Checking directory connectivity
'..' in /work/p1/da/d207a (811) is /work/p1/da/d18a/d3874/d278f/d2401/d3e6a/d4380/d33e8/d2a30 (192), should be /work/p1/da (42).
Fix<y>? yes
Pass 4: Checking reference counts
Inode 53 ref count is 1, should be 2. Fix<y>? yes
Inode 67 ref count is 1, should be 2. Fix<y>? yes
Inode 86 ref count is 3, should be 4. Fix<y>? yes
...
Inode 945 ref count is 1, should be 2. Fix<y>? yes
Inode 993 ref count is 7, should be 8. Fix<y>? yes
Pass 5: Checking group summary information
/dev/ram0: ***** FILE SYSTEM WAS MODIFIED *****
/dev/ram0: 1024/1024 files (18.4% non-contiguous), 4094/4096 blocks
[-- Attachment #3: ext4.htree.txt --]
[-- Type: text/plain, Size: 6457 bytes --]
-------------------------------------------------------------
Cycle 9
Fri Mar 20 08:18:28 EDT 2009
Mounting
[ 1870.516310] EXT4-fs: barriers enabled
[ 1870.520359] kjournald2 starting: pid 5729, dev ram1:8, commit interval 5 seconds
Removing old fsstress data
[ 1870.520396] EXT4 FS on ram1, internal journal on ram1:8
[ 1870.520399] EXT4-fs: delayed allocation enabled
[ 1870.520401] EXT4-fs: file extents enabled
[ 1870.520553] EXT4-fs: mballoc enabled
[ 1870.520556] EXT4-fs: mounted filesystem ram1 with ordered data mode
[ 1870.570047] JBD: barrier-based sync failed on ram1:8 - disabling barriers
Starting fsstress
Sleeping 30 seconds
seed = 1237467072
Stopping fsstress
5732 ttyS0 00:00:00 fsstress_ext4
5735 ttyS0 00:00:28 fsstress_ext4
5736 ttyS0 00:00:28 fsstress_ext4
5737 ttyS0 00:00:28 fsstress_ext4
./brd_test_ext4.sh: line 36: 5732 Terminated `pwd`/fsstress_ext4 -d $TESTDIR/work -p 3 -l 0 -n 100000000
Unmounting
[ 1901.751383] EXT4-fs: mballoc: 7905 blocks 4022 reqs (13 success)
[ 1901.757443] EXT4-fs: mballoc: 8495 extents scanned, 341 goal hits, 6 2^N hits, 4 breaks, 38 lost
[ 1901.766253] EXT4-fs: mballoc: 1 generated and it took 2140
[ 1901.771755] EXT4-fs: mballoc: 18857 preallocated, 11619 discarded
Checking
/dev/ram1: HTREE directory inode 20 has an invalid root node.
HTREE INDEX CLEARED.
/dev/ram1: HTREE directory inode 40 has an invalid root node.
HTREE INDEX CLEARED.
/dev/ram1: Entry 'f15e2' in /work/p0/d5/db/d934 (20) has an incorrect filetype (was 1, should be 3).
/dev/ram1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
# fsck /dev/ram1
fsck 1.41.3 (12-Oct-2008)
e2fsck 1.41.3 (12-Oct-2008)
/dev/ram1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'f15e2' in /work/p0/d5/db/d934 (20) has an incorrect filetype (was 1, should be 3).
Fix<y>? yes
Entry 'd51' in /work/p0/d5/db/d934 (20) is a link to directory /work/p0/d5/db/d934/d51 (110).
Clear<y>? yes
Entry 'd1cba' in /work/p0/d5/db/d934 (20) is a link to directory /work/p0/d5/db/d934/d1cba (918).
Clear<y>? yes
Entry 'd17f8' in /work/p0/d5/db/d203c (40) is a link to directory /work/p0/d5/db/d203c/d17f8 (392).
Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 36 ref count is 2, should be 3. Fix<y>? yes
Inode 37 ref count is 2, should be 3. Fix<y>? yes
Inode 55 ref count is 2, should be 3. Fix<y>? yes
.....
Inode 655 ref count is 4, should be 5. Fix<y>? yes
Inode 1024 ref count is 4, should be 5. Fix<y>? yes
Pass 5: Checking group summary information
/dev/ram1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/ram1: 1024/1024 files (24.2% non-contiguous), 4096/4096 blocks
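The phases printed in each cycle of the attached logs (Mounting, Removing
old fsstress data, Starting fsstress, Sleeping 30 seconds, Stopping
fsstress, Unmounting, Checking) can be written down as a dry-run plan. The
actual brd_test_ext3.sh script is not included in this thread, so this is a
guess at its shape rather than a copy: only the fsstress arguments come
from the "Terminated" line in the logs, while the mount point, the use of
pkill, and `fsck -a` are assumptions.

```python
import os


def cycle_plan(device="/dev/ram0", fstype="ext3", testdir="/mnt/test",
               fsstress="./fsstress_ext3", run_seconds=30):
    """Return (phase label, argv) pairs for one test cycle.

    Nothing is executed. A real script would start fsstress in the
    background before sleeping. Paths and the kill/fsck commands are
    hypothetical; the fsstress flags are the ones visible in the logs.
    """
    work = f"{testdir}/work"
    return [
        ("Mounting", ["mount", "-t", fstype, device, testdir]),
        ("Removing old fsstress data", ["rm", "-rf", work]),
        ("Starting fsstress",
         [fsstress, "-d", work, "-p", "3", "-l", "0", "-n", "100000000"]),
        (f"Sleeping {run_seconds} seconds", ["sleep", str(run_seconds)]),
        # Assumption: stop the workers by process name.
        ("Stopping fsstress", ["pkill", os.path.basename(fsstress)]),
        ("Unmounting", ["umount", testdir]),
        # Assumption: automatic repair mode, as hinted at by the
        # "(i.e., without -a or -p options)" message in the logs.
        ("Checking", ["fsck", "-a", device]),
    ]


if __name__ == "__main__":
    for label, argv in cycle_plan():
        print(f"{label}: {' '.join(argv)}")
```

For the ext4 runs the same plan would be built with device="/dev/ram1",
fstype="ext4" and fsstress="./fsstress_ext4".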
* Re: Error testing ext3 on brd ramdisk
2009-03-20 12:24 ` Denis Karpov
@ 2009-03-20 12:49 ` Denis Karpov
0 siblings, 0 replies; 5+ messages in thread
From: Denis Karpov @ 2009-03-20 12:49 UTC
To: ext Jan Kara, Nick Piggin, Denis Karpov,
ext Jorge Boncompte [DTI2], "Hunter Adrian (Nokia-D/Helsink
On Fri, Mar 20, 2009 at 02:24:05PM +0200, Denis Karpov wrote:
> Unfortunately it looks like the problem is not fixed - JBD still complains
> and in the end HTREE is getting damaged, in both ext3 and ext4 tests (see
> attached logs).
>
> Denis
Please disregard the previous message; I tested with the wrong patchset.
Sorry for the hassle.
Denis
* Re: Error testing ext3 on brd ramdisk
2009-03-18 13:42 ` Jan Kara
2009-03-20 12:24 ` Denis Karpov
@ 2009-03-20 13:35 ` Denis Karpov
1 sibling, 0 replies; 5+ messages in thread
From: Denis Karpov @ 2009-03-20 13:35 UTC
To: ext Jan Kara
Cc: Nick Piggin, ext Jorge Boncompte [DTI2],
Hunter Adrian (Nokia-D/Helsinki), LKML,
linux-ext4@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 852 bytes --]
On Wed, Mar 18, 2009 at 02:42:02PM +0100, ext Jan Kara wrote:
> > On Tue, Mar 17, 2009 at 11:40:19AM +0200, Denis Karpov wrote:
> > Jan's fixes are here:
> > http://marc.info/?l=linux-ext4&m=123731584711382&w=2
> > It would be interesting to try them, and if they don't work maybe
> > he's also interested so I cc'ed him.
Hello,
I've re-run the tests (with Jan's patches and also Nick's "fs: new inode
i_state corruption fix" patch applied).
> > > In both cases I saw some complaints from JBD/JBD2:
> > > JBD: Detected IO errors while flushing file data on
> Yes, my patches fix exactly this problem. So please try running with
> them. I'm not sure about that HTREE corruption you see during fsck. That
> seems to be a separate issue.
The JBD issue seems to be gone, but the HTREE corruption problem
still remains (see attached logs).
Denis
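When comparing runs with and without the patches, it can help to reduce
each cycle log to two numbers: how many JBD flush errors were reported per
device, and which directory inodes e2fsck flagged with a bad HTREE root.
The following is a small sketch of such a summary. The message texts are
taken from the logs in this thread; accepting a "JBD2:" prefix as well is
an assumption, since only the "JBD:" form appears here.

```python
import re
from collections import Counter

# Message texts as they appear in the logs of this thread. The optional
# "2" (JBD2) is an assumption; only "JBD:" is shown in the attachments.
JBD_ERR = re.compile(
    r"JBD2?: Detected IO errors while flushing file data on (\S+)")
HTREE_ERR = re.compile(
    r"(\S+): HTREE directory inode (\d+) has an invalid root node")


def summarize_log(text):
    """Count JBD flush errors per device and list bad HTREE root inodes."""
    jbd = Counter()
    htree = []
    for line in text.splitlines():
        # finditer copes with two kernel messages fused onto one line.
        for m in JBD_ERR.finditer(line):
            jbd[m.group(1)] += 1
        for m in HTREE_ERR.finditer(line):
            htree.append((m.group(1), int(m.group(2))))
    return {"jbd_io_errors": dict(jbd), "htree_bad_roots": htree}


if __name__ == "__main__":
    sample = (
        "[ 1139.101289] JBD: Detected IO errors while flushing file data on ram0\n"
        "/dev/ram0: HTREE directory inode 42 has an invalid root node.\n"
    )
    print(summarize_log(sample))
```

A run where the JBD fix is effective but the HTREE problem persists would
show an empty "jbd_io_errors" dict alongside a non-empty "htree_bad_roots"
list.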
[-- Attachment #2: ext3.htree.1.txt --]
[-- Type: text/plain, Size: 5241 bytes --]
-------------------------------------------------------------
Cycle 26
Fri Mar 20 09:23:50 EDT 2009
Mounting
[ 907.443733] EXT3 FS on ram0, internal journal
[ 907.448199] EXT3-fs: mounted filesystem with ordered data mode.
[ 907.448529] kjournald starting. Commit interval 5 seconds
Removing old fsstress data
Starting fsstress
Sleeping 30 seconds
seed = 1237468251
Stopping fsstress
5656 ttyS0 00:00:00 fsstress_ext3
5659 ttyS0 00:00:26 fsstress_ext3
5660 ttyS0 00:00:25 fsstress_ext3
5661 ttyS0 00:00:25 fsstress_ext3
./brd_test_ext3.sh: line 37: 5656 Terminated `pwd`/fsstress_ext3 -d $TESTDIR/work -p 3 -l 0 -n 100000000
Unmounting
Checking
/dev/ram0: HTREE directory inode 82 has an invalid root node.
HTREE INDEX CLEARED.
/dev/ram0: Entry 'c17e8' in /work/p1/d2/d99 (82) has an incorrect filetype (was 3, should be 1).
/dev/ram0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
# fsck.ext3 /dev/ram0
e2fsck 1.41.3 (12-Oct-2008)
/dev/ram0 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'c17e8' in /work/p1/d2/d99 (82) has an incorrect filetype (was 3, should be 1).
Fix<y>? yes
Entry 'fdab' in /work/p1/d2/d99 (82) has an incorrect filetype (was 1, should be 7).
Fix<y>? yes
Entry 'c17fa' in /work/p1/d2/d99 (82) has an incorrect filetype (was 3, should be 1).
Fix<y>? yes
Entry 'da0' in /work/p1/d2/d99 (82) is a link to directory /work/p1/d2/d99/da0 (238).
Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 83 ref count is 9, should be 10. Fix<y>? yes
Inode 94 ref count is 3, should be 4. Fix<y>? yes
Inode 99 ref count is 9, should be 10. Fix<y>? yes
...
Inode 1018 ref count is 2, should be 3. Fix<y>? yes
Inode 1024 ref count is 1, should be 2. Fix<y>? yes
Pass 5: Checking group summary information
/dev/ram0: ***** FILE SYSTEM WAS MODIFIED *****
/dev/ram0: 1024/1024 files (12.0% non-contiguous), 4096/4096 blocks
[-- Attachment #3: ext4.htree.1.txt --]
[-- Type: text/plain, Size: 2178 bytes --]
-------------------------------------------------------------
Cycle 5
Fri Mar 20 09:17:50 EDT 2009
Mounting
Removing old fsstress data
Starting fsstress
Sleeping 30 seconds
seed = 1237604365
Stopping fsstress
5370 pts/0 00:00:00 fsstress_ext4
5373 pts/0 00:00:19 fsstress_ext4
5374 pts/0 00:00:20 fsstress_ext4
5375 pts/0 00:00:19 fsstress_ext4
./brd_test_ext4.sh: line 36: 5370 Terminated `pwd`/fsstress_ext4 -d $TESTDIR/work -p 3 -l 0 -n 100000000
Unmounting
Checking
/dev/ram1: HTREE directory inode 165 has an invalid root node.
HTREE INDEX CLEARED.
/dev/ram1: HTREE directory inode 272 has an invalid root node.
HTREE INDEX CLEARED.
/dev/ram1: Entry 'c363' in /work/p2/d2/de/d5f/d80/d616 (272) has an incorrect filetype (was 3, should be 1).
/dev/ram1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
# fsck.ext4 /dev/ram1
e2fsck 1.41.3 (12-Oct-2008)
/dev/ram1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'c363' in /work/p2/d2/de/d5f/d80/d616 (272) has an incorrect filetype (was 3, should be 1).
Fix<y>? yes
Entry 'c10b5' in /work/p2/d2/de/d37 (165) has an incorrect filetype (was 3, should be 1).
Fix<y>? yes
Entry 'f164' in /work/p2/d2/de/d37 (165) has an incorrect filetype (was 1, should be 3).
Fix<y>? yes
Entry 'f104e' in /work/p2/d2/de/d37 (165) has an incorrect filetype (was 1, should be 2).
Fix<y>? yes
Entry 'deaa' in /work/p2/d2/de/d5f/d80/d616 (272) is a link to directory /work/p2/d2/de/d5f/d80/d616/deaa (819).
Clear<y>? yes
Entry 'd155a' in /work/p2/d2/de/d37 (165) is a link to directory /work/p2/d2/de/d37/d155a (434).
Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 25 ref count is 6, should be 7. Fix<y>? yes
Inode 48 ref count is 1, should be 2. Fix<y>? yes
...
Inode 1003 ref count is 3, should be 4. Fix<y>? yes
Inode 1004 ref count is 1, should be 2. Fix<y>? yes
Pass 5: Checking group summary information
/dev/ram1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/ram1: 1024/1024 files (20.5% non-contiguous), 4096/4096 blocks
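The numeric codes in the "incorrect filetype (was N, should be M)" messages
are the ext2/3/4 directory-entry file type values (1 = regular file,
2 = directory, 3 = character device, 7 = symlink, and so on). A short
decoder makes the e2fsck output above easier to read. It is a sketch that
assumes each message sits on a single, unwrapped line; the function name is
made up for this example.

```python
import re

# ext2/3/4 directory-entry file type codes (EXT2_FT_* in the kernel headers).
EXT2_FT_NAMES = {
    0: "unknown",
    1: "regular file",
    2: "directory",
    3: "character device",
    4: "block device",
    5: "FIFO",
    6: "socket",
    7: "symlink",
}

FILETYPE_MSG = re.compile(
    r"Entry '([^']+)' in (\S+) \((\d+)\) has an incorrect filetype "
    r"\(was (\d+), should be (\d+)\)")


def decode_filetype_errors(text):
    """Return (entry, directory, dir inode, recorded type, expected type)
    tuples for every filetype complaint found in e2fsck output."""
    results = []
    for m in FILETYPE_MSG.finditer(text):
        entry, directory, inode, was, should = m.groups()
        results.append((
            entry,
            directory,
            int(inode),
            EXT2_FT_NAMES.get(int(was), f"code {was}"),
            EXT2_FT_NAMES.get(int(should), f"code {should}"),
        ))
    return results


if __name__ == "__main__":
    line = ("Entry 'f104e' in /work/p2/d2/de/d37 (165) has an incorrect "
            "filetype (was 1, should be 2).")
    for row in decode_filetype_errors(line):
        print(row)
```

Read this way, the logs show directory entries whose recorded type
disagrees with the inode they point to (for example, an entry recorded as a
regular file that points to a directory inode), all within the directories
whose HTREE root was reported invalid.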