JBD2: Spotted dirty metadata buffer....

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* JBD2: Spotted dirty metadata buffer....
@ 2016-11-21 15:28 Wolfgang Walter
  2016-11-22  0:49 ` Andreas Dilger
       [not found] ` <11848279.gFnWiZMlh7@stwm.de>
  0 siblings, 2 replies; 7+ messages in thread
From: Wolfgang Walter @ 2016-11-21 15:28 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: LKML

Hello,

I'm testing EXT4 with an external journal (data=journal). When writing I rather often get

	JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1008028301). There's a risk of filesystem corruption in case of system crash.

Now other message is logged. If I unmount the filesystem an do an forced fsck, all seems fine.

The filesystem as it's journal are on a LV (which is again backed by DRBD). The journal, too.

Im using stable kernel 4.8.8.

I created the journal with:

	mkfs.ext4 -O journal_dev -v -b 4096  -L jdyn /dev/export/jdyn

I created the filesystem with:

	mkfs.ext4 -J device=UUID=625d871f-c278-4acb-916d-774dc78dbd8a -v -b 4096 -E stride=$((512/4)),stripe_width=$((512/4*3)),lazy_itable_init=0,nodiscard -O inline_data,mmp -L dyn /dev/export/dyn

I tested it also with data=writeback. I didn't see these errors then.

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: JBD2: Spotted dirty metadata buffer....
  2016-11-21 15:28 JBD2: Spotted dirty metadata buffer Wolfgang Walter
@ 2016-11-22  0:49 ` Andreas Dilger
  2016-11-22 13:56   ` Wolfgang Walter
       [not found] ` <11848279.gFnWiZMlh7@stwm.de>
  1 sibling, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2016-11-22  0:49 UTC (permalink / raw)
  To: Wolfgang Walter; +Cc: Ext4 Developers List, LKML

[-- Attachment #1: Type: text/plain, Size: 2062 bytes --]

On Nov 21, 2016, at 8:28 AM, Wolfgang Walter <linux@stwm.de> wrote:
> 
> Hello,
> 
> I'm testing EXT4 with an external journal (data=journal). When writing I rather often get
> 
> 	JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1008028301). There's a risk of filesystem corruption in case of system crash.

Can you please determine what file this block belongs to and/or what kind
of metadata block it is.  You can check "dumpe2fs /dev/dm-22" to list all
of the metadata blocks, and if it isn't listed as filesystem metadata, you
can run "debugfs -c -R 'icheck 1008028301' /dev/dm-22" to find which file
this block belongs to, then "debugfs -c -R 'ncheck <inum>' /dev/dm-22" after
you have the inode number.

> No other message is logged. If I unmount the filesystem an do an forced fsck, all seems fine.

This is likely a bug in the code, marking a block dirty without setting
up the journaling for it correctly.  A few bugs like this were fixed a
while ago by Eric, but those fixes should be in 4.8.8.

> The filesystem as it's journal are on a LV (which is again backed by DRBD). The journal, too.
> 
> I'm using stable kernel 4.8.8.
> 
> I created the journal with:
> 
> 	mkfs.ext4 -O journal_dev -v -b 4096  -L jdyn /dev/export/jdyn
> 
> I created the filesystem with:
> 
> 	mkfs.ext4 -J device=UUID=625d871f-c278-4acb-916d-774dc78dbd8a -v -b 4096 -E stride=$((512/4)),stripe_width=$((512/4*3)),lazy_itable_init=0,nodiscard -O inline_data,mmp -L dyn /dev/export/dyn

It may be that inline_data is the culprit, since this is a relatively new
feature that isn't widely used yet.

Cheers, Andreas

> I tested it also with data=writeback. I didn't see these errors then.
> 
> Regards,
> --
> Wolfgang Walter
> Studentenwerk München
> Anstalt des öffentlichen Rechts
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: JBD2: Spotted dirty metadata buffer....
  2016-11-22  0:49 ` Andreas Dilger
@ 2016-11-22 13:56   ` Wolfgang Walter
  2016-11-22 23:02     ` Andreas Dilger
  0 siblings, 1 reply; 7+ messages in thread
From: Wolfgang Walter @ 2016-11-22 13:56 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Ext4 Developers List, LKML

Am Montag, 21. November 2016, 17:49:36 schrieben Sie:
> On Nov 21, 2016, at 8:28 AM, Wolfgang Walter <linux@stwm.de> wrote:
> > Hello,
> > 
> > I'm testing EXT4 with an external journal (data=journal). When writing I
> > rather often get> 
> > 	JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1008028301).
> > 	There's a risk of filesystem corruption in case of system crash.
> Can you please determine what file this block belongs to and/or what kind
> of metadata block it is.  You can check "dumpe2fs /dev/dm-22" to list all
> of the metadata blocks, and if it isn't listed as filesystem metadata, you
> can run "debugfs -c -R 'icheck 1008028301' /dev/dm-22" to find which file
> this block belongs to, then "debugfs -c -R 'ncheck <inum>' /dev/dm-22" after
> you have the inode number.

This think this is the metadata block which covers 1008028301? :

Group 30762: (Blocks 1008009216-1008041983) csum 0xb3c0 [INODE_UNINIT, ITABLE_ZEROED]
  Block bitmap at 1007681546 (bg #30752 + 10), csum 0x94b3d052
  Inode bitmap at 1007681562 (bg #30752 + 26), csum 0x00000000
  Inode table at 1007684128-1007684383 (bg #30752 + 2592)
  8340 free blocks, 4096 free inodes, 0 directories, 4096 unused inodes
  Free blocks: 1008021210-1008021211, 1008021506-1008021511, 1008021520-1008021698, 1008021749-1008021887, 1008022528
-1008022655, 1008024576-1008024768, 1008024839-1008024959, 1008026624-1008026879, 1008028524-1008035839
  Free inodes: 126001153-126005248

I'm not sure if I read it correct but 1008028301 is part of the metadata?

I checked all of the other blocks logged so far and they are all alike, they are
in between <x> and <y> of a "(Blocks <x>-<y>)" and they are above <b> of the
"Inode table at <a>-<b>". 

Another thing I found was that no block has been mentioned more then twice.
If they are mentioned twice this is within a second or so. Maybe this is simply
because they only where touched when I did the rsync.

Here is an example:

[303709.144704] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342550). There's a risk of filesystem corruption in case of system crash.
[303709.145561] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342538). There's a risk of filesystem corruption in case of system crash.
[303709.145571] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342527). There's a risk of filesystem corruption in case of system crash.
[303709.145579] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342516). There's a risk of filesystem corruption in case of system crash.
[303709.145593] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342504). There's a risk of filesystem corruption in case of system crash.
[303709.145604] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342493). There's a risk of filesystem corruption in case of system crash.
[303709.145612] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342482). There's a risk of filesystem corruption in case of system crash.
[303709.145625] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342469). There's a risk of filesystem corruption in case of system crash.
[303719.119889] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343261). There's a risk of filesystem corruption in case of system crash.
[303719.119896] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343263). There's a risk of filesystem corruption in case of system crash.
[303719.120074] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343328). There's a risk of filesystem corruption in case of system crash.
[303719.120079] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343330). There's a risk of filesystem corruption in case of system crash.
[303719.120083] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343332). There's a risk of filesystem corruption in case of system crash.
[303719.121607] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936933). There's a risk of filesystem corruption in case of system crash.
[303719.121625] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936936). There's a risk of filesystem corruption in case of system crash.
[303719.121651] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936948). There's a risk of filesystem corruption in case of system crash.
[303719.121701] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936962). There's a risk of filesystem corruption in case of system crash.
[303719.121814] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 198776202). There's a risk of filesystem corruption in case of system crash.
[303719.121887] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 198776224). There's a risk of filesystem corruption in case of system crash.
[303719.149886] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 198776224). There's a risk of filesystem corruption in case of system crash.
[303719.149912] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 198776202). There's a risk of filesystem corruption in case of system crash.
[303719.167924] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936962). There's a risk of filesystem corruption in case of system crash.
[303719.167941] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936948). There's a risk of filesystem corruption in case of system crash.
[303719.167950] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936936). There's a risk of filesystem corruption in case of system crash.
[303719.167956] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 148936933). There's a risk of filesystem corruption in case of system crash.
[303719.167999] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343332). There's a risk of filesystem corruption in case of system crash.
[303719.168001] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343330). There's a risk of filesystem corruption in case of system crash.
[303719.168004] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343328). There's a risk of filesystem corruption in case of system crash.
[303719.168054] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343263). There's a risk of filesystem corruption in case of system crash.
[303719.168056] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085343261). There's a risk of filesystem corruption in case of system crash.

> 
> > No other message is logged. If I unmount the filesystem an do an forced
> > fsck, all seems fine.
> This is likely a bug in the code, marking a block dirty without setting
> up the journaling for it correctly.  A few bugs like this were fixed a
> while ago by Eric, but those fixes should be in 4.8.8.
> 
> > The filesystem as it's journal are on a LV (which is again backed by
> > DRBD). The journal, too.
> > 
> > I'm using stable kernel 4.8.8.
> > 
> > I created the journal with:
> > 	mkfs.ext4 -O journal_dev -v -b 4096  -L jdyn /dev/export/jdyn
> > 
> > I created the filesystem with:
> > 	mkfs.ext4 -J device=UUID=625d871f-c278-4acb-916d-774dc78dbd8a -v -b 4096
> > 	-E
> > 	stride=$((512/4)),stripe_width=$((512/4*3)),lazy_itable_init=0,nodiscard
> > 	-O inline_data,mmp -L dyn /dev/export/dyn
> It may be that inline_data is the culprit, since this is a relatively new
> feature that isn't widely used yet.
> 
> Cheers, Andreas
> 
> > I tested it also with data=writeback. I didn't see these errors then.
> > 
> > Regards,
> > --
> > Wolfgang Walter
> > Studentenwerk München
> > Anstalt des öffentlichen Rechts
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> Cheers, Andreas

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: JBD2: Spotted dirty metadata buffer....
  2016-11-22 13:56   ` Wolfgang Walter
@ 2016-11-22 23:02     ` Andreas Dilger
  0 siblings, 0 replies; 7+ messages in thread
From: Andreas Dilger @ 2016-11-22 23:02 UTC (permalink / raw)
  To: Wolfgang Walter; +Cc: Ext4 Developers List, LKML

[-- Attachment #1: Type: text/plain, Size: 4564 bytes --]

On Nov 22, 2016, at 6:56 AM, Wolfgang Walter <linux@stwm.de> wrote:
> 
> Am Montag, 21. November 2016, 17:49:36 schrieben Sie:
>> On Nov 21, 2016, at 8:28 AM, Wolfgang Walter <linux@stwm.de> wrote:
>>> Hello,
>>> 
>>> I'm testing EXT4 with an external journal (data=journal). When writing I
>>> rather often get>
>>> 	JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1008028301).
>>> 	There's a risk of filesystem corruption in case of system crash.
>> Can you please determine what file this block belongs to and/or what kind
>> of metadata block it is.  You can check "dumpe2fs /dev/dm-22" to list all
>> of the metadata blocks, and if it isn't listed as filesystem metadata, you
>> can run "debugfs -c -R 'icheck 1008028301' /dev/dm-22" to find which file
>> this block belongs to, then "debugfs -c -R 'ncheck <inum>' /dev/dm-22" after
>> you have the inode number.
> 
> This think this is the metadata block which covers 1008028301? :
> 
> Group 30762: (Blocks 1008009216-1008041983) csum 0xb3c0 [INODE_UNINIT, ITABLE_ZEROED]
>  Block bitmap at 1007681546 (bg #30752 + 10), csum 0x94b3d052
>  Inode bitmap at 1007681562 (bg #30752 + 26), csum 0x00000000
>  Inode table at 1007684128-1007684383 (bg #30752 + 2592)
>  8340 free blocks, 4096 free inodes, 0 directories, 4096 unused inodes
>  Free blocks: 1008021210-1008021211, 1008021506-1008021511, 1008021520-1008021698, 1008021749-1008021887, 1008022528
> -1008022655, 1008024576-1008024768, 1008024839-1008024959, 1008026624-1008026879, 1008028524-1008035839
>  Free inodes: 126001153-126005248
> 
> I'm not sure if I read it correct but 1008028301 is part of the metadata?
> 
> I checked all of the other blocks logged so far and they are all alike,
> they are in between <x> and <y> of a "(Blocks <x>-<y>)" and they are above
> <b> of the "Inode table at <a>-<b>".

This means that block 1008028301 is part of this block group 30762, but
it is not listed in the range of "Block bitmap", "Inode bitmap", or "Inode
table" so it is just a regular block that was allocated to a file.  You
can use "debugfs -c -R 'icheck 1008028301' /dev/dm-22" to find out the
inode this block belongs to, and "debugfs -c -R 'stat <inum>' /dev/dm-22"
to see what the block is used for (e.g. data block, indirect block, etc).

Unfortunately, I don't think "debugfs icheck" reports the type of block
that this is, even though it probably could.

Cheers, Andreas

> Another thing I found was that no block has been mentioned more then twice.
> If they are mentioned twice this is within a second or so. Maybe this is
> simply because they only where touched when I did the rsync.
> 
> Here is an example:
> 
> [303709.144704] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342550). There's a risk of filesystem corruption in case of system crash.
> [303709.145561] JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 1085342538). There's a risk of filesystem corruption in case of system crash.
> 
>> 
>>> No other message is logged. If I unmount the filesystem an do an forced
>>> fsck, all seems fine.
>> This is likely a bug in the code, marking a block dirty without setting
>> up the journaling for it correctly.  A few bugs like this were fixed a
>> while ago by Eric, but those fixes should be in 4.8.8.
>> 
>>> The filesystem as it's journal are on a LV (which is again backed by
>>> DRBD). The journal, too.
>>> 
>>> I'm using stable kernel 4.8.8.
>>> 
>>> I created the journal with:
>>> 	mkfs.ext4 -O journal_dev -v -b 4096  -L jdyn /dev/export/jdyn
>>> 
>>> I created the filesystem with:
>>> 	mkfs.ext4 -J device=UUID=625d871f-c278-4acb-916d-774dc78dbd8a -v -b 4096
>>> 	-E
>>> 	stride=$((512/4)),stripe_width=$((512/4*3)),lazy_itable_init=0,nodiscard
>>> 	-O inline_data,mmp -L dyn /dev/export/dyn
>> It may be that inline_data is the culprit, since this is a relatively new
>> feature that isn't widely used yet.
>> 
>> Cheers, Andreas
>> 
>>> I tested it also with data=writeback. I didn't see these errors then.
>>> 
>>> Regards,
>>> --
>>> Wolfgang Walter
>>> Studentenwerk München
>>> Anstalt des öffentlichen Rechts
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> Cheers, Andreas
> 
> Regards,
> --
> Wolfgang Walter
> Studentenwerk München
> Anstalt des öffentlichen Rechts


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

[parent not found: <11848279.gFnWiZMlh7@stwm.de>]

[parent not found: <6EBBE85C-8A87-4D8B-9897-3D67D4B6D732@dilger.ca>]

* Re: JBD2: Spotted dirty metadata buffer....
       [not found]   ` <6EBBE85C-8A87-4D8B-9897-3D67D4B6D732@dilger.ca>
@ 2016-11-28 11:26     ` Wolfgang Walter
  2017-09-06 12:46       ` Wolfgang Walter
  0 siblings, 1 reply; 7+ messages in thread
From: Wolfgang Walter @ 2016-11-28 11:26 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Ext4 Developers List, LKML

Am Mittwoch, 23. November 2016, 16:40:07 schrieb Andreas Dilger:
> On Nov 23, 2016, at 3:43 AM, Wolfgang Walter <linux@stwm.de> wrote:
> > Am Dienstag, 22. November 2016, 16:02:53 schrieben Sie:
> >> On Nov 22, 2016, at 6:56 AM, Wolfgang Walter <linux@stwm.de> wrote:
> >>> Am Montag, 21. November 2016, 17:49:36 schrieben Sie:
> >>>> On Nov 21, 2016, at 8:28 AM, Wolfgang Walter <linux@stwm.de> wrote:
> >>>>> Hello,

[snip]

> Stepping back a bit - does this problem only happen with an external
> journal device, or does it also happen with an internal journal?
> 

So I tried that this weekend. I got again these messages 

JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 241763277). There's a risk of filesystem corruption in case of system crash.

So this also happens with an internal journal.

I think it has something to do with data=journal + writing fast a lot of data and removing and/or moving a files at the same time.

I then tried it again with an internal journal but without the option journal_async_commit. I got these messages again. This time though, after some time running, the rsync got stuck in D state. I waited more than an hour but it did not recover. According to iostat there was no IO any more. I couldn't access the filesystem any more, ls -l oder touch got stuck in D state, too. I didn't get any kernel message, though. top reported the whole time one cpu (and always the same one) with about 100% wa time. No process but top itself seemed to be active.

I never had that with external journal + journal_async_commit. When I run bonnie++ and copied large amount of data with dd at the same time I got some problems. But then the filesystem recovered every time, and I got messages like that one:

INFO: task dd:16705 blocked for more than 120 seconds.
      Tainted: G            E   4.8.8-debian64.all+1.1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dd              D ffff9d70dfc18700     0 16705  16503 0x00000004
 ffff9d686ab53000 ffff9d70d2897080 ffff9d713fffbb80 024280ca8d8d1e9c
 ffff9d70af8ac000 ffff9d70af8aba58 ffff9d90a94840ec ffff9d686ab53000
 00000000ffffffff ffff9d90a94840f0 ffffffff8e7ee6f1 ffff9d90a94840e8
Call Trace:
 [<ffffffff8e7ee6f1>] ? schedule+0x31/0x80
 [<ffffffff8e7ee98a>] ? schedule_preempt_disabled+0xa/0x10
 [<ffffffff8e7f0574>] ? __mutex_lock_slowpath+0xb4/0x130
 [<ffffffff8e7f060b>] ? mutex_lock+0x1b/0x30
 [<ffffffffc042e32f>] ? __jbd2_log_wait_for_space+0x7f/0x1e0 [jbd2]
 [<ffffffffc0428384>] ? add_transaction_credits+0x284/0x2a0 [jbd2]
 [<ffffffff8e2eb31f>] ? add_timer+0x12f/0x220
 [<ffffffffc04284ff>] ? start_this_handle+0x10f/0x420 [jbd2]
 [<ffffffffc0428c99>] ? jbd2__journal_start+0xe9/0x1f0 [jbd2]
 [<ffffffffc0530595>] ? ext4_setattr+0x205/0x880 [ext4]
 [<ffffffff8e4226d9>] ? notify_change+0x229/0x430
 [<ffffffff8e401cf2>] ? do_truncate+0x72/0xc0
 [<ffffffff8e412fa8>] ? path_openat+0x388/0x14d0
 [<ffffffff8e415371>] ? do_filp_open+0x91/0x100
 [<ffffffff8e3b92cc>] ? handle_mm_fault+0xd1c/0x14b0
 [<ffffffff8e4031d7>] ? do_sys_open+0x127/0x210
 [<ffffffff8e7f2ab6>] ? entry_SYSCALL_64_fastpath+0x1e/0xa8

INFO: task bonnie++:19693 blocked for more than 120 seconds.
      Tainted: G            E   4.8.8-debian64.all+1.1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
bonnie++        D ffff9d70dfad8700     0 19693  16248 0x00000000
ffff9d70b0287040 ffff9d70d289c0c0 ffffffff8e2c7266 ffff9d70b18e4000
ffff9d70b18e4000 ffff9d70b18e3aa0 ffff9d90a94840ec ffff9d70b0287040
  00000000ffffffff ffff9d90a94840f0 ffffffff8e7ee6f1 ffff9d90a94840e8
 Call Trace:
  [<ffffffff8e2c7266>] ? mutex_optimistic_spin+0x176/0x1c0
  [<ffffffff8e7ee6f1>] ? schedule+0x31/0x80
  [<ffffffff8e7ee98a>] ? schedule_preempt_disabled+0xa/0x10
  [<ffffffff8e7f0574>] ? __mutex_lock_slowpath+0xb4/0x130
  [<ffffffff8e7f060b>] ? mutex_lock+0x1b/0x30
  [<ffffffffc042e32f>] ? __jbd2_log_wait_for_space+0x7f/0x1e0 [jbd2]
  [<ffffffffc0428384>] ? add_transaction_credits+0x284/0x2a0 [jbd2]
  [<ffffffffc04284ff>] ? start_this_handle+0x10f/0x420 [jbd2]
  [<ffffffffc0428c99>] ? jbd2__journal_start+0xe9/0x1f0 [jbd2]
  [<ffffffffc052e2b5>] ? ext4_write_begin+0x125/0x4f0 [ext4]
  [<ffffffff8e37f0f3>] ? generic_perform_write+0xd3/0x1d0
  [<ffffffff8e381a46>] ? __generic_file_write_iter+0x196/0x1e0
  [<ffffffffc05231b6>] ? ext4_file_write_iter+0x196/0x3d0 [ext4]
  [<ffffffff8e40365b>] ? __vfs_write+0xeb/0x160
  [<ffffffff8e4047f3>] ? vfs_write+0xb3/0x1a0
  [<ffffffff8e405bf2>] ? SyS_write+0x52/0xc0
  [<ffffffff8e7f2ab6>] ? entry_SYSCALL_64_fastpath+0x1e/0xa8

But there was IO both on the filesystem and the journal (the devices of the filesystem is slower then the device the journal sits on).


Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: JBD2: Spotted dirty metadata buffer....
  2016-11-28 11:26     ` Wolfgang Walter
@ 2017-09-06 12:46       ` Wolfgang Walter
  2017-09-06 18:07         ` Andreas Dilger
  0 siblings, 1 reply; 7+ messages in thread
From: Wolfgang Walter @ 2017-09-06 12:46 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Ext4 Developers List, LKML

Am Montag, 28. November 2016, 12:26:38 schrieb Wolfgang Walter:
> Am Mittwoch, 23. November 2016, 16:40:07 schrieb Andreas Dilger:
> > On Nov 23, 2016, at 3:43 AM, Wolfgang Walter <linux@stwm.de> wrote:
> > > Am Dienstag, 22. November 2016, 16:02:53 schrieben Sie:
> > >> On Nov 22, 2016, at 6:56 AM, Wolfgang Walter <linux@stwm.de> wrote:
> > >>> Am Montag, 21. November 2016, 17:49:36 schrieben Sie:
> > >>>> On Nov 21, 2016, at 8:28 AM, Wolfgang Walter <linux@stwm.de> wrote:
> > >>>>> Hello,
> 
> [snip]
> 
> > Stepping back a bit - does this problem only happen with an external
> > journal device, or does it also happen with an internal journal?
> > 
> 
> So I tried that this weekend. I got again these messages 
> 
> JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 241763277). There's a risk of filesystem corruption in case of system crash.
> 
> So this also happens with an internal journal.
> 

[snip]

I last tried with 4.9.46 and I still see that problem when rsyncing data to the filesystem: errors similar to

JBD2: Spotted dirty metadata buffer (dev = dm-25, blocknr = 1008028301). There's a risk of filesystem corruption in case of system 

A later filesystem check does not show any errors.


With 4.9.46 stable kernels I also sometimes get the following error:

EXT4-fs error (device dm-25): ext4_iget:4501: inode #74061557: comm rsync: checksum invalid
or
EXT4-fs error (device dm-25): ext4_iget:4501: inode #155844677: comm nfsd: checksum invalid

A filesystem check then says that the inode itselfs seems ok but the checksum is indeed wrong.

As these inodes are inodes of very small files.

So I finally copied all away and reinitialized the filesystem. But this time without -O inline_data

Since then all works fine. So I assume there is a proplem with inodes and inline-data (at least until 4.9.46), maybe only with data=journal.

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: JBD2: Spotted dirty metadata buffer....
  2017-09-06 12:46       ` Wolfgang Walter
@ 2017-09-06 18:07         ` Andreas Dilger
  0 siblings, 0 replies; 7+ messages in thread
From: Andreas Dilger @ 2017-09-06 18:07 UTC (permalink / raw)
  To: Wolfgang Walter, Tao Ma; +Cc: Ext4 Developers List, LKML

[-- Attachment #1: Type: text/plain, Size: 2247 bytes --]

On Sep 6, 2017, at 6:46 AM, Wolfgang Walter <linux@stwm.de> wrote:
> 
> Am Montag, 28. November 2016, 12:26:38 schrieb Wolfgang Walter:
>> Am Mittwoch, 23. November 2016, 16:40:07 schrieb Andreas Dilger:
>>> Stepping back a bit - does this problem only happen with an external
>>> journal device, or does it also happen with an internal journal?
>> 
>> So I tried that this weekend. I got again these messages
>> 
>> JBD2: Spotted dirty metadata buffer (dev = dm-22, blocknr = 241763277). There's a risk of filesystem corruption in case of system crash.
>> 
>> So this also happens with an internal journal.
>> 
> 
> [snip]
> 
> I last tried with 4.9.46 and I still see that problem when rsyncing data to the filesystem: errors similar to
> 
> JBD2: Spotted dirty metadata buffer (dev = dm-25, blocknr = 1008028301). There's a risk of filesystem corruption in case of system crash.
> 
> A later filesystem check does not show any errors.
> 
> 
> With 4.9.46 stable kernels I also sometimes get the following error:
> 
> EXT4-fs error (device dm-25): ext4_iget:4501: inode #74061557: comm rsync: checksum invalid
> or
> EXT4-fs error (device dm-25): ext4_iget:4501: inode #155844677: comm nfsd: checksum invalid
> 
> A filesystem check then says that the inode itselfs seems ok but the checksum is indeed wrong.
> 
> As these inodes are inodes of very small files.
> 
> So I finally copied all away and reinitialized the filesystem. But this time without -O inline_data
> 
> Since then all works fine. So I assume there is a proplem with inodes and inline-data (at least until 4.9.46), maybe only with data=journal.

Add Tao Ma, since he is the original inline data author, and it appears that Taobao
is using this feature so are probably interested in fixing any corruption.

Wolfgang, it would be useful if you could figure out what type of block the reported
blocknr=1008028301 is.  You can use "debugfs -c -R 'icheck 1008028301' /dev/dm-25".
I'd suspect, given that this is an inline data problem, that this is an inode table
block, but it is good to be sure.  It would also be useful to see if this inode
number correlates to one of the inodes that have bad checksums.

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-09-06 18:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-21 15:28 JBD2: Spotted dirty metadata buffer Wolfgang Walter
2016-11-22  0:49 ` Andreas Dilger
2016-11-22 13:56   ` Wolfgang Walter
2016-11-22 23:02     ` Andreas Dilger
     [not found] ` <11848279.gFnWiZMlh7@stwm.de>
     [not found]   ` <6EBBE85C-8A87-4D8B-9897-3D67D4B6D732@dilger.ca>
2016-11-28 11:26     ` Wolfgang Walter
2017-09-06 12:46       ` Wolfgang Walter
2017-09-06 18:07         ` Andreas Dilger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox