* "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Nick Semenkovich @ 2012-07-31 2:53 UTC
To: linux-ext4

I'm trying to enable metadata_csum on an ext4 raid1 device, but end up
with a semi-cryptic error.

(This is probably the same thing Tomasz Chmielewski reported in
http://www.spinics.net/lists/linux-ext4/msg33139.html )

Is this issue being tracked someplace?

$ uname -ar
Linux dev 3.5.0-6-generic #6-Ubuntu SMP Mon Jul 23 19:52:14 UTC 2012
x86_64 x86_64 x86_64 GNU/Linux

$ git clone git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git
$ ./configure
[snip]
$ make progs
[snip]
$ sudo misc/tune2fs -O metadata_csum /dev/md1
tune2fs 1.42.3 (14-May-2012)
rewrite_directory: Unknown code Ijv 64 while rewriting directories

The error code changes when I re-run tune2fs (to [a-zA-Z]*3 64).

I've tried this on the master, "next", & "pu" git branches, all with errors.

$ debugfs -R 'stats' /dev/md1 >
http://web.mit.edu/semenko/Public/debugfs-md1.txt

Best,
Nick

--
Nick Semenkovich
Laboratory of Dr. Jeffrey I. Gordon
Medical Scientist Training Program
School of Medicine
Washington University in St. Louis
http://web.mit.edu/semenko/

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Zheng Liu @ 2012-08-01 7:19 UTC
To: semenko
Cc: linux-ext4, semenko, mangoo, tytso, djwong

On Mon, Jul 30, 2012 at 09:53:15PM -0500, Nick Semenkovich wrote:
> I'm trying to enable metadata_csum on an ext4 raid1 device, but end up
> with a semi-cryptic error.
>
> (This is probably the same thing Tomasz Chmielewski reported in
> http://www.spinics.net/lists/linux-ext4/msg33139.html )
>
> Is this issue being tracked someplace?
>
> $ uname -ar
> Linux dev 3.5.0-6-generic #6-Ubuntu SMP Mon Jul 23 19:52:14 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux
>
> $ git clone git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git
> $ ./configure
> [snip]
> $ make progs
> [snip]
> $ sudo misc/tune2fs -O metadata_csum /dev/md1
> tune2fs 1.42.3 (14-May-2012)
> rewrite_directory: Unknown code Ijv 64 while rewriting directories
>
> The error code changes when I re-run tune2fs (to [a-zA-Z]*3 64).
>
> I've tried this on the master, "next", & "pu" git branches, all with errors.
>
> $ debugfs -R 'stats' /dev/md1 >
> http://web.mit.edu/semenko/Public/debugfs-md1.txt

[CC to Tomasz, Ted, and Darrick]

Hi Nick and Tomasz,

Could you please try this patch? It seems that the problem is that the
error code is not cleared.

Regards,
Zheng

Subject: [PATCH] tune2fs: clear error code before rewriting directory when metadata_csum enabled

From: Zheng Liu <wenqing.lz@taobao.com>

When we enable the metadata_csum feature in tune2fs, all inodes need to be
rewritten to calculate their checksums. In this process, an inode that has
been removed also needs its checksum calculated, but the extent tree in such
inodes has been cleared. Thus, we cannot read any extents, and an
'EXT2_ET_EXTENT_NO_NEXT' error is returned. But in this case the error code
in rewrite_dir_context is not initialized, which causes an unknown error.

We can use this script to reproduce the bug:

  #!/bin/sh

  dev='/dev/sda1'
  mnt='/mnt/sda1'

  mkfs.ext4 "$dev"
  mount -t ext4 "$dev" "$mnt" # without metadata_csum feature
  mkdir -p "$mnt/test/1"
  mkdir "$mnt/test/2"
  echo "hello" > "$mnt/test/1/hello"
  rm -rf "$mnt"/*
  umount "$mnt"
  tune2fs -O metadata_csum "$dev"

CC: Nick Semenkovich <semenko@alum.mit.edu>
CC: Tomasz Chmielewski <mangoo@wpkg.org>
CC: "Theodore Ts'o" <tytso@mit.edu>
CC: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
---
 misc/tune2fs.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 6a48009..41a5529 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -592,6 +592,7 @@ errcode_t rewrite_directory(ext2_filsys fs, ext2_ino_t dir,
 	ctx.is_htree = (inode->i_flags & EXT2_INDEX_FL);
 	ctx.dir = dir;
 
+	ctx.errcode = 0;
 	retval = ext2fs_block_iterate3(fs, dir, BLOCK_FLAG_READ_ONLY |
 						BLOCK_FLAG_DATA_ONLY,
 				       0, rewrite_dir_block, &ctx);
--
1.7.4.1

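[Editorial sketch, not part of the thread] The failure mode described in the
patch can be illustrated with a small self-contained C model. Everything here
is hypothetical and simplified: rewrite_ctx, visit_block, iterate_blocks, and
rewrite_dir_sketch are stand-in names, not e2fsprogs code, and the toy iterator
simply reports success when there are no blocks, which is an assumption about
how the empty extent tree surfaces. The point is only that when the callback
never runs, nothing writes the context's error field, so the caller returns
whatever value was already there unless it is cleared first.

```c
#include <assert.h>
#include <stddef.h>

typedef long errcode_t;

/* Hypothetical stand-in for the per-directory context used in the patch. */
struct rewrite_ctx {
	errcode_t errcode;	/* written by the callback only on failure */
	int blocks_seen;
};

/* Callback: would rewrite one directory block; here it only counts. */
static int visit_block(int blk, void *priv)
{
	struct rewrite_ctx *ctx = priv;

	assert(ctx != NULL);
	(void)blk;
	ctx->blocks_seen++;
	return 0;
}

/*
 * Toy iterator: calls fn once per block and returns 0. With n == 0
 * (modelling a removed inode whose extent tree was cleared) the
 * callback never runs, so nothing ever writes ctx->errcode.
 */
static errcode_t iterate_blocks(const int *blocks, size_t n,
				int (*fn)(int, void *), void *priv)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (fn(blocks[i], priv))
			break;
	return 0;
}

/*
 * 'stale' models whatever garbage happened to be on the stack;
 * 'clear_first' models the one-line fix (ctx.errcode = 0).
 */
errcode_t rewrite_dir_sketch(const int *blocks, size_t n,
			     errcode_t stale, int clear_first)
{
	struct rewrite_ctx ctx;
	errcode_t retval;

	ctx.errcode = stale;
	ctx.blocks_seen = 0;
	if (clear_first)
		ctx.errcode = 0;

	retval = iterate_blocks(blocks, n, visit_block, &ctx);
	if (retval)
		return retval;
	return ctx.errcode;
}
```

Calling rewrite_dir_sketch(NULL, 0, 12345, 0) hands the stale value straight
back to the caller, while the same call with clear_first set to 1 returns 0.
A garbage value of this kind would also be consistent with the reported
message changing from one tune2fs run to the next.
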
* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Tomasz Chmielewski @ 2012-08-01 7:16 UTC
To: semenko, linux-ext4, semenko, tytso, djwong

On 08/01/2012 02:19 PM, Zheng Liu wrote:

> Hi Nick and Tomasz,
>
> Could you please try this patch? It seems that the problem is that the
> error code is not cleared.

Hi,

I haven't tried the patch yet, but I've noticed the following in dmesg since 3.5:

[69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
[69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
[69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
[69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
[69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.

Could this be related?

--
Tomasz Chmielewski
http://blog.wpkg.org

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Zheng Liu @ 2012-08-01 7:48 UTC
To: Tomasz Chmielewski
Cc: semenko, linux-ext4, semenko, tytso, djwong

On Wed, Aug 01, 2012 at 02:16:41PM +0700, Tomasz Chmielewski wrote:
> On 08/01/2012 02:19 PM, Zheng Liu wrote:
> > Hi Nick and Tomasz,
> >
> > Could you please try this patch? It seems that the problem is that the
> > error code is not cleared.
>
> Hi,
>
> I haven't tried the patch yet, but I've noticed the following in dmesg since 3.5:
>
> [69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> [69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> [69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> [69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> [69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>
> Could this be related?

Were these messages printed before you enabled the metadata_csum feature?

Regards,
Zheng

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Tomasz Chmielewski @ 2012-08-01 7:51 UTC
To: semenko, linux-ext4, semenko, tytso, djwong

On 08/01/2012 02:48 PM, Zheng Liu wrote:

>> I haven't tried the patch yet, but I've noticed the following in dmesg since 3.5:
>>
>> [69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> [69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> [69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> [69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> [69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>>
>> Could this be related?
>
> Were these messages printed before you enabled the metadata_csum feature?

I didn't notice them before trying to enable the metadata_csum feature.

On the other hand, enabling the metadata_csum feature was pretty much the
first thing I did after booting into the 3.5 kernel on this system, so it
could be that it changed something.

Also, when I do:

dumpe2fs -h /dev/sda1|grep metadata_csum

I don't see the metadata_csum feature anywhere.

--
Tomasz Chmielewski
http://blog.wpkg.org

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Zheng Liu @ 2012-08-01 8:17 UTC
To: Tomasz Chmielewski
Cc: semenko, linux-ext4, semenko, tytso, djwong

On Wed, Aug 01, 2012 at 02:51:43PM +0700, Tomasz Chmielewski wrote:
> On 08/01/2012 02:48 PM, Zheng Liu wrote:
>
> >> I haven't tried the patch yet, but I've noticed the following in dmesg since 3.5:
> >>
> >> [69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> >> [69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> >> [69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> >> [69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> >> [69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> >>
> >> Could this be related?
> >
> > Were these messages printed before you enabled the metadata_csum feature?
>
> I didn't notice them before trying to enable the metadata_csum feature.
>
> On the other hand, enabling the metadata_csum feature was pretty much the
> first thing I did after booting into the 3.5 kernel on this system, so it
> could be that it changed something.

Yes, trying to enable the metadata_csum feature in tune2fs does change
something. So IMHO you had better run e2fsck to check your filesystem.

> Also, when I do:
>
> dumpe2fs -h /dev/sda1|grep metadata_csum
>
> I don't see the metadata_csum feature anywhere.

You won't see this feature until you have enabled it successfully.

Regards,
Zheng

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Nick Semenkovich @ 2012-08-02 3:43 UTC
To: Tomasz Chmielewski, semenko, linux-ext4, semenko, tytso, djwong

On Wed, Aug 1, 2012 at 3:17 AM, Zheng Liu <gnehzuil.liu@gmail.com> wrote:
> On Wed, Aug 01, 2012 at 02:51:43PM +0700, Tomasz Chmielewski wrote:
>> On 08/01/2012 02:48 PM, Zheng Liu wrote:
>>
>> >> I haven't tried the patch yet, but I've noticed the following in dmesg since 3.5:
>> >>
>> >> [69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> >> [69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> >> [69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> >> [69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> >> [69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> >>
>> >> Could this be related?
>> >
>> > Were these messages printed before you enabled the metadata_csum feature?
>>
>> I didn't notice them before trying to enable the metadata_csum feature.
>>
>> On the other hand, enabling the metadata_csum feature was pretty much the
>> first thing I did after booting into the 3.5 kernel on this system, so it
>> could be that it changed something.
>
> Yes, trying to enable the metadata_csum feature in tune2fs does change
> something. So IMHO you had better run e2fsck to check your filesystem.
>

Sorry for the slow reply --

I hadn't seen any "Corrupt dir inode" errors until now.

Before running the one-line patch above, I resynced the MD array and
ran a quick fsck (via "touch /forcefsck" & reboot).

Then,
$ sudo misc/tune2fs -O metadata_csum /dev/md1

[says something about running e2fsck -D]

Then I got a few dmesg errors like:

[128700.816091] JBD2: Spotted dirty metadata buffer (dev = md1, blocknr = 5243385). There's a risk of filesystem corruption in case of system crash.
[128700.816106] JBD2: Spotted dirty metadata buffer (dev = md1, blocknr = 1057). There's a risk of filesystem corruption in case of system crash.

then a lot of

[128711.000677] EXT4-fs warning (device md1): dx_probe:647: dx entry: limit != root limit
[128711.000679] EXT4-fs warning (device md1): dx_probe:732: Corrupt dir inode 7733251, running e2fsck is recommended.

On my next command (sudo -s), I got an immediate kernel panic:

[128713.776475] EXT4-fs warning (device md1): dx_probe:732: Corrupt dir inode 7733251, running e2fsck is recommended.
[128761.137143] BUG: unable to handle kernel NULL pointer dereference at (null)
[128761.137195] IP: [<ffffffff8121d448>] ext4_iget+0x498/0xa50
[128761.137231] PGD 106651067 PUD 11cf41067 PMD 0
[128761.137258] Oops: 0000 [#1] SMP
[128761.137279] CPU 0
[snip...]

Full panic @ http://web.mit.edu/semenko/Public/panic.txt

--
Nick Semenkovich
Laboratory of Dr. Jeffrey I. Gordon
Medical Scientist Training Program
School of Medicine
Washington University in St. Louis
http://web.mit.edu/semenko/

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Zheng Liu @ 2012-08-02 9:58 UTC
To: semenko
Cc: Tomasz Chmielewski, linux-ext4, semenko, tytso, djwong

On Wed, Aug 01, 2012 at 10:43:05PM -0500, Nick Semenkovich wrote:
[-- snip --]
> Sorry for the slow reply --
>
> I hadn't seen any "Corrupt dir inode" errors until now.
>
> Before running the one-line patch above, I resynced the MD array and
> ran a quick fsck (via "touch /forcefsck" & reboot).
>
> Then,
> $ sudo misc/tune2fs -O metadata_csum /dev/md1
>
> [says something about running e2fsck -D]
>
> Then I got a few dmesg errors like:
>
> [128700.816091] JBD2: Spotted dirty metadata buffer (dev = md1,
> blocknr = 5243385). There's a risk of filesystem corruption in case of
> system crash.
> [128700.816106] JBD2: Spotted dirty metadata buffer (dev = md1,
> blocknr = 1057). There's a risk of filesystem corruption in case of
> system crash.
>
> then a lot of
>
> [128711.000677] EXT4-fs warning (device md1): dx_probe:647: dx entry:
> limit != root limit
> [128711.000679] EXT4-fs warning (device md1): dx_probe:732: Corrupt
> dir inode 7733251, running e2fsck is recommended.
>
> On my next command (sudo -s), I got an immediate kernel panic:
>
> [128713.776475] EXT4-fs warning (device md1): dx_probe:732: Corrupt
> dir inode 7733251, running e2fsck is recommended.
> [128761.137143] BUG: unable to handle kernel NULL pointer dereference
> at (null)
> [128761.137195] IP: [<ffffffff8121d448>] ext4_iget+0x498/0xa50
> [128761.137231] PGD 106651067 PUD 11cf41067 PMD 0
> [128761.137258] Oops: 0000 [#1] SMP
> [128761.137279] CPU 0
> [snip...]
>
> Full panic @ http://web.mit.edu/semenko/Public/panic.txt

Hi Nick,

Thanks for testing my patch. From what you describe above, it seems that
there are still some bugs when the metadata_csum feature is enabled. I tried
to reproduce this bug, but I couldn't reproduce it in my sandbox. I looked
at the full panic file, and it seems that the kernel is an Ubuntu
distribution kernel rather than a generic mainline kernel. So could you
please try the latest upstream kernel? Then, if the problem happens again,
it will at least be easier for me to find out what goes wrong.

Thanks for your patience.

Regards,
Zheng

* Re: "Unknown code" error when enabling metadata_csum on ext4 raid1 device
From: Theodore Ts'o @ 2012-08-03 4:01 UTC
To: semenko, linux-ext4, semenko, mangoo, djwong

On Wed, Aug 01, 2012 at 03:19:35PM +0800, Zheng Liu wrote:
> Subject: [PATCH] tune2fs: clear error code before rewriting directory when metadata_csum enabled
>
> From: Zheng Liu <wenqing.lz@taobao.com>
>
> When we enable the metadata_csum feature in tune2fs, all inodes need to be
> rewritten to calculate their checksums. In this process, an inode that has
> been removed also needs its checksum calculated, but the extent tree in such
> inodes has been cleared. Thus, we cannot read any extents, and an
> 'EXT2_ET_EXTENT_NO_NEXT' error is returned. But in this case the error code
> in rewrite_dir_context is not initialized, which causes an unknown error.

Thanks, I've merged this into my e2fsprogs checksum branch.

I've promoted all of the metadata checksum patches in e2fsprogs into
the next branch. At that point I'll strongly suggest that people use
the development branch (currently the next branch, but in the next or
two, the master branch) of e2fsprogs.

For the kernel, for now I suggest using the v3.5 kernel with the
ext4_for_linus (commit 03179fe92318) from the ext4.git tree merged in.
Hopefully the necessary bug fix commits will be in the v3.5.1 kernel,
but the 3.5.y series hasn't been released yet.

- Ted
