* Ext4 deadlock (+lockdep splat) during rsync
@ 2017-01-08 22:41 George Spelvin
2017-01-10 3:40 ` Theodore Ts'o
2017-01-19 17:37 ` Jan Kara
0 siblings, 2 replies; 14+ messages in thread
From: George Spelvin @ 2017-01-08 22:41 UTC (permalink / raw)
To: linux-ext4; +Cc: linux
After replacing a drive in a RAID array, I tried to bring some things
up to date with rsync and ran into an annoyingly repeatable deadlock.
So I found a chance to boot with a lockdep kernel and immediately turned up the following:
[ 755.740865] =============================================
[ 755.741072] [ INFO: possible recursive locking detected ]
[ 755.741279] 4.9.1-00126-gfbb9fcc9-dirty #576 Not tainted
[ 755.741489] ---------------------------------------------
[ 755.741699] rsync/14818 is trying to acquire lock:
[ 755.741907] (&ei->xattr_sem){++++..}, at: [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
[ 755.742145] but task is already holding lock:
[ 755.742742] (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
[ 755.743102] other info that might help us debug this:
[ 755.743802] Possible unsafe locking scenario:
[ 755.743802] CPU0
[ 755.743802] ----
[ 755.743802] lock(&ei->xattr_sem);
[ 755.743802] lock(&ei->xattr_sem);
[ 755.743802] *** DEADLOCK ***
[ 755.743802] May be due to missing lock nesting notation
[ 755.743802] 4 locks held by rsync/14818:
[ 755.743802] #0: (sb_writers#3){.+.+.+}, at: [<ffffffff811a0fef>] mnt_want_write+0x1f/0x50
[ 755.743802] #1: (&type->i_mutex_dir_key){++++++}, at: [<ffffffff8118b058>] path_openat+0x2f8/0x9f0
[ 755.743802] #2: (jbd2_handle){++++..}, at: [<ffffffff81239aa6>] start_this_handle+0x196/0x540
[ 755.743802] #3: (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
[ 755.743802] stack backtrace:
[ 755.743802] CPU: 0 PID: 14818 Comm: rsync Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
[ 755.743802] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
[ 755.743802] ffffc9000c273820 ffffffff812a6d05 ffffffff8253a080 ffffffff8253a080
[ 755.743802] ffffc9000c2738d8 ffffffff810c7eab ffffc9000c272000 ffffc90000000004
[ 755.743802] 0000000000000000 ffffffff81e0b100 1a883a7e30ec461a ffffffff8253a080
[ 755.743802] Call Trace:
[ 755.743802] [<ffffffff812a6d05>] dump_stack+0x68/0x93
[ 755.743802] [<ffffffff810c7eab>] __lock_acquire+0x7ab/0x1270
[ 755.743802] [<ffffffff810c8d10>] lock_acquire+0x60/0x80
[ 755.743802] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
[ 755.743802] [<ffffffff81793a44>] down_write+0x44/0x80
[ 755.743802] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
[ 755.743802] [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
[ 755.743802] [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
[ 755.743802] [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
[ 755.743802] [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
[ 755.743802] [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
[ 755.743802] [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
[ 755.743802] [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
[ 755.743802] [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
[ 755.743802] [<ffffffff8120b523>] ext4_create+0xc3/0x150
[ 755.743802] [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
[ 755.743802] [<ffffffff8118b072>] path_openat+0x312/0x9f0
[ 755.743802] [<ffffffff8118d309>] do_filp_open+0x79/0xd0
[ 755.743802] [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
[ 755.743802] [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
[ 755.743802] [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
[ 755.743802] [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
[ 755.743802] [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
[ 755.743802] [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60
Followed by repetitions of
[ 1105.549736] INFO: task rsync:14818 blocked for more than 120 seconds.
[ 1105.550089] Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
[ 1105.550436] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1105.550801] rsync D 0 14818 1 0x20020004
[ 1105.551173] ffff88040f7f1900 ffff88042d660000 0000000000000000 ffff8804082a2b00
[ 1105.551552] ffff88042f216298 ffffc9000c273820 ffffffff8178fc08 0000000000000046
[ 1105.551933] ffff8804082a3060 ffff88042f216298 ffff88042f216280 ffff8804082a2b00
[ 1105.552311] Call Trace:
[ 1105.552678] [<ffffffff8178fc08>] ? __schedule+0x2b8/0x840
[ 1105.553051] [<ffffffff817901c8>] schedule+0x38/0x90
[ 1105.553438] [<ffffffff817945b5>] ? rwsem_down_write_failed+0x2a5/0x450
[ 1105.553810] [<ffffffff817945ba>] rwsem_down_write_failed+0x2aa/0x450
[ 1105.554186] [<ffffffff81794360>] ? rwsem_down_write_failed+0x50/0x450
[ 1105.554571] [<ffffffff812b41d7>] call_rwsem_down_write_failed+0x17/0x30
[ 1105.554953] [<ffffffff812a6d21>] ? dump_stack+0x84/0x93
[ 1105.555335] [<ffffffff81793a5f>] down_write+0x5f/0x80
[ 1105.555720] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
[ 1105.556111] [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
[ 1105.556502] [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
[ 1105.556900] [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
[ 1105.557282] [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
[ 1105.557675] [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
[ 1105.558065] [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
[ 1105.558448] [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
[ 1105.558832] [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
[ 1105.559208] [<ffffffff8120b523>] ext4_create+0xc3/0x150
[ 1105.559575] [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
[ 1105.559943] [<ffffffff8118b072>] path_openat+0x312/0x9f0
[ 1105.560287] [<ffffffff8118d309>] do_filp_open+0x79/0xd0
[ 1105.560637] [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
[ 1105.560978] [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
[ 1105.561316] [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
[ 1105.561652] [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
[ 1105.561984] [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
[ 1105.562313] [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60
[ 1105.562639] INFO: lockdep is turned off.
Problem first observed with a 4.8 kernel, but the lockdep one is 4.9.
I have metadata checksums turned on,
Filesystem features: has_journal resize_inode dir_index sparse_super2 filetype needs_recovery extent flex_bg inline_data sparse_super large_file extra_isize metadata_csum
Journal features: journal_incompat_revoke journal_checksum_v3
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
It's awakward to reboot in the middle of sending e-mail, but when I do
I can strace the rsync and find the path it's trying to create, if that
is helpful to debugging.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-08 22:41 Ext4 deadlock (+lockdep splat) during rsync George Spelvin
@ 2017-01-10 3:40 ` Theodore Ts'o
2017-01-10 7:25 ` George Spelvin
2017-01-19 17:37 ` Jan Kara
1 sibling, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2017-01-10 3:40 UTC (permalink / raw)
To: George Spelvin; +Cc: linux-ext4
On Sun, Jan 08, 2017 at 05:41:14PM -0500, George Spelvin wrote:
> After replacing a drive in a RAID array, I tried to bring some things
> up to date with rsync and ran into an annoyingly repeatable deadlock.
Thanks for the bug report! I'm able to reproduce this, so it's a
matter of my tracking down the bug.
In the meantime, do you need a workaround? If you're not using the
new project quota feature, I can send you a quick patch to disable
attempts to move the extended attributes to resereve space for the
project ID --- this is what is causing the deadlock.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-10 3:40 ` Theodore Ts'o
@ 2017-01-10 7:25 ` George Spelvin
0 siblings, 0 replies; 14+ messages in thread
From: George Spelvin @ 2017-01-10 7:25 UTC (permalink / raw)
To: linux, tytso; +Cc: linux-ext4
> Thanks for the bug report! I'm able to reproduce this, so it's a
> matter of my tracking down the bug.
Hooray! And thank you; I don't know why I hit so many ext4 bugs.
> In the meantime, do you need a workaround? If you're not using the
> new project quota feature, I can send you a quick patch to disable
> attempts to move the extended attributes to resereve space for the
> project ID --- this is what is causing the deadlock.
If it's very little bother, I'd appreciate it, but if you expect to
have something more complete by the end of the week, I can wait
that long.
I didn't think I was using extended attributes at all, so the
limitation is fine.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-08 22:41 Ext4 deadlock (+lockdep splat) during rsync George Spelvin
2017-01-10 3:40 ` Theodore Ts'o
@ 2017-01-19 17:37 ` Jan Kara
2017-01-19 17:45 ` Jan Kara
2017-01-19 23:19 ` Theodore Ts'o
1 sibling, 2 replies; 14+ messages in thread
From: Jan Kara @ 2017-01-19 17:37 UTC (permalink / raw)
To: George Spelvin; +Cc: linux-ext4
On Sun 08-01-17 17:41:14, George Spelvin wrote:
> After replacing a drive in a RAID array, I tried to bring some things
> up to date with rsync and ran into an annoyingly repeatable deadlock.
>
> So I found a chance to boot with a lockdep kernel and immediately turned up the following:
>
> [ 755.740865] =============================================
> [ 755.741072] [ INFO: possible recursive locking detected ]
> [ 755.741279] 4.9.1-00126-gfbb9fcc9-dirty #576 Not tainted
> [ 755.741489] ---------------------------------------------
> [ 755.741699] rsync/14818 is trying to acquire lock:
> [ 755.741907] (&ei->xattr_sem){++++..}, at: [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> [ 755.742145] but task is already holding lock:
> [ 755.742742] (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> [ 755.743102] other info that might help us debug this:
> [ 755.743802] Possible unsafe locking scenario:
> [ 755.743802] CPU0
> [ 755.743802] ----
> [ 755.743802] lock(&ei->xattr_sem);
> [ 755.743802] lock(&ei->xattr_sem);
> [ 755.743802] *** DEADLOCK ***
> [ 755.743802] May be due to missing lock nesting notation
> [ 755.743802] 4 locks held by rsync/14818:
> [ 755.743802] #0: (sb_writers#3){.+.+.+}, at: [<ffffffff811a0fef>] mnt_want_write+0x1f/0x50
> [ 755.743802] #1: (&type->i_mutex_dir_key){++++++}, at: [<ffffffff8118b058>] path_openat+0x2f8/0x9f0
> [ 755.743802] #2: (jbd2_handle){++++..}, at: [<ffffffff81239aa6>] start_this_handle+0x196/0x540
> [ 755.743802] #3: (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> [ 755.743802] stack backtrace:
> [ 755.743802] CPU: 0 PID: 14818 Comm: rsync Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
> [ 755.743802] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
> [ 755.743802] ffffc9000c273820 ffffffff812a6d05 ffffffff8253a080 ffffffff8253a080
> [ 755.743802] ffffc9000c2738d8 ffffffff810c7eab ffffc9000c272000 ffffc90000000004
> [ 755.743802] 0000000000000000 ffffffff81e0b100 1a883a7e30ec461a ffffffff8253a080
> [ 755.743802] Call Trace:
> [ 755.743802] [<ffffffff812a6d05>] dump_stack+0x68/0x93
> [ 755.743802] [<ffffffff810c7eab>] __lock_acquire+0x7ab/0x1270
> [ 755.743802] [<ffffffff810c8d10>] lock_acquire+0x60/0x80
> [ 755.743802] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> [ 755.743802] [<ffffffff81793a44>] down_write+0x44/0x80
> [ 755.743802] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> [ 755.743802] [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> [ 755.743802] [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
> [ 755.743802] [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
> [ 755.743802] [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
> [ 755.743802] [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
> [ 755.743802] [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
> [ 755.743802] [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
> [ 755.743802] [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
> [ 755.743802] [<ffffffff8120b523>] ext4_create+0xc3/0x150
> [ 755.743802] [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
> [ 755.743802] [<ffffffff8118b072>] path_openat+0x312/0x9f0
> [ 755.743802] [<ffffffff8118d309>] do_filp_open+0x79/0xd0
> [ 755.743802] [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
> [ 755.743802] [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
> [ 755.743802] [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
> [ 755.743802] [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
> [ 755.743802] [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
> [ 755.743802] [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60
OK, the problem is that we call ext4_mark_inode_dirty() while holding
xattr_sem and that recurses into ext4_expand_extra_isize_ea() which tries
to grab it again. This may happen in several place in inline.c, generally
when handling inline directories. I'll try to craft a fix tomorrow...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-19 17:37 ` Jan Kara
@ 2017-01-19 17:45 ` Jan Kara
2017-01-19 17:59 ` George Spelvin
2017-01-19 23:19 ` Theodore Ts'o
1 sibling, 1 reply; 14+ messages in thread
From: Jan Kara @ 2017-01-19 17:45 UTC (permalink / raw)
To: George Spelvin; +Cc: linux-ext4
On Thu 19-01-17 18:37:07, Jan Kara wrote:
> On Sun 08-01-17 17:41:14, George Spelvin wrote:
> > After replacing a drive in a RAID array, I tried to bring some things
> > up to date with rsync and ran into an annoyingly repeatable deadlock.
> >
> > So I found a chance to boot with a lockdep kernel and immediately turned up the following:
> >
> > [ 755.740865] =============================================
> > [ 755.741072] [ INFO: possible recursive locking detected ]
> > [ 755.741279] 4.9.1-00126-gfbb9fcc9-dirty #576 Not tainted
> > [ 755.741489] ---------------------------------------------
> > [ 755.741699] rsync/14818 is trying to acquire lock:
> > [ 755.741907] (&ei->xattr_sem){++++..}, at: [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> > [ 755.742145] but task is already holding lock:
> > [ 755.742742] (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> > [ 755.743102] other info that might help us debug this:
> > [ 755.743802] Possible unsafe locking scenario:
> > [ 755.743802] CPU0
> > [ 755.743802] ----
> > [ 755.743802] lock(&ei->xattr_sem);
> > [ 755.743802] lock(&ei->xattr_sem);
> > [ 755.743802] *** DEADLOCK ***
> > [ 755.743802] May be due to missing lock nesting notation
> > [ 755.743802] 4 locks held by rsync/14818:
> > [ 755.743802] #0: (sb_writers#3){.+.+.+}, at: [<ffffffff811a0fef>] mnt_want_write+0x1f/0x50
> > [ 755.743802] #1: (&type->i_mutex_dir_key){++++++}, at: [<ffffffff8118b058>] path_openat+0x2f8/0x9f0
> > [ 755.743802] #2: (jbd2_handle){++++..}, at: [<ffffffff81239aa6>] start_this_handle+0x196/0x540
> > [ 755.743802] #3: (&ei->xattr_sem){++++..}, at: [<ffffffff81236f95>] ext4_try_add_inline_entry+0x55/0x1a0
> > [ 755.743802] stack backtrace:
> > [ 755.743802] CPU: 0 PID: 14818 Comm: rsync Not tainted 4.9.1-00126-gfbb9fcc9-dirty #576
> > [ 755.743802] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
> > [ 755.743802] ffffc9000c273820 ffffffff812a6d05 ffffffff8253a080 ffffffff8253a080
> > [ 755.743802] ffffc9000c2738d8 ffffffff810c7eab ffffc9000c272000 ffffc90000000004
> > [ 755.743802] 0000000000000000 ffffffff81e0b100 1a883a7e30ec461a ffffffff8253a080
> > [ 755.743802] Call Trace:
> > [ 755.743802] [<ffffffff812a6d05>] dump_stack+0x68/0x93
> > [ 755.743802] [<ffffffff810c7eab>] __lock_acquire+0x7ab/0x1270
> > [ 755.743802] [<ffffffff810c8d10>] lock_acquire+0x60/0x80
> > [ 755.743802] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> > [ 755.743802] [<ffffffff81793a44>] down_write+0x44/0x80
> > [ 755.743802] [<ffffffff81234603>] ? ext4_expand_extra_isize_ea+0x63/0x850
> > [ 755.743802] [<ffffffff81234603>] ext4_expand_extra_isize_ea+0x63/0x850
> > [ 755.743802] [<ffffffff81795ec2>] ? _raw_read_unlock+0x22/0x30
> > [ 755.743802] [<ffffffff8123a542>] ? jbd2_journal_extend+0x132/0x1b0
> > [ 755.743802] [<ffffffff812002c9>] ext4_mark_inode_dirty+0x129/0x180
> > [ 755.743802] [<ffffffff81235d64>] ext4_add_dirent_to_inline.isra.16+0xe4/0x100
> > [ 755.743802] [<ffffffff81236fd9>] ext4_try_add_inline_entry+0x99/0x1a0
> > [ 755.743802] [<ffffffff8120b102>] ext4_add_entry+0x1d2/0x370
> > [ 755.743802] [<ffffffff8120b2b9>] ext4_add_nondir+0x19/0x70
> > [ 755.743802] [<ffffffff8120b523>] ext4_create+0xc3/0x150
> > [ 755.743802] [<ffffffff8118aaf8>] lookup_open+0x3d8/0x640
> > [ 755.743802] [<ffffffff8118b072>] path_openat+0x312/0x9f0
> > [ 755.743802] [<ffffffff8118d309>] do_filp_open+0x79/0xd0
> > [ 755.743802] [<ffffffff81795bc2>] ? _raw_spin_unlock+0x22/0x30
> > [ 755.743802] [<ffffffff8119d9b3>] ? __alloc_fd+0xf3/0x200
> > [ 755.743802] [<ffffffff8117b4fe>] do_sys_open+0x11e/0x1f0
> > [ 755.743802] [<ffffffff811d2046>] compat_SyS_open+0x16/0x20
> > [ 755.743802] [<ffffffff810027e4>] do_fast_syscall_32+0x94/0x210
> > [ 755.743802] [<ffffffff81797da1>] entry_SYSENTER_compat+0x51/0x60
>
> OK, the problem is that we call ext4_mark_inode_dirty() while holding
> xattr_sem and that recurses into ext4_expand_extra_isize_ea() which tries
> to grab it again. This may happen in several place in inline.c, generally
> when handling inline directories. I'll try to craft a fix tomorrow...
Ah, I've noticed Ted had already beaten me to it...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-19 17:37 ` Jan Kara
2017-01-19 17:45 ` Jan Kara
@ 2017-01-19 23:19 ` Theodore Ts'o
2017-01-19 23:38 ` George Spelvin
1 sibling, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2017-01-19 23:19 UTC (permalink / raw)
To: Jan Kara; +Cc: George Spelvin, linux-ext4
On Thu, Jan 19, 2017 at 06:37:07PM +0100, Jan Kara wrote:
> OK, the problem is that we call ext4_mark_inode_dirty() while holding
> xattr_sem and that recurses into ext4_expand_extra_isize_ea() which tries
> to grab it again. This may happen in several place in inline.c, generally
> when handling inline directories. I'll try to craft a fix tomorrow...
I think the inline_data patches I posted should have taken care of
George's initial set of bug reports?
(And George, the reason why you're seeing lots of problems is because
inline_data isn't enabled by default yet, and as the old joke goes,
the Early Christians get the best Lions. :-)
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-19 23:19 ` Theodore Ts'o
@ 2017-01-19 23:38 ` George Spelvin
2017-01-20 14:40 ` Theodore Ts'o
0 siblings, 1 reply; 14+ messages in thread
From: George Spelvin @ 2017-01-19 23:38 UTC (permalink / raw)
To: tytso; +Cc: jack, linux-ext4, linux
> I think the inline_data patches I posted should have taken care of
> George's initial set of bug reports?
Er, the two I posted most recently were on a kernel with your 4-patch
deadlock series applied:
98a9e36a ext4: propagate error values from ext4_inline_data_truncate()
50c39f0d ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores
f321034b ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
5283ac14 ext4: add debug_want_extra_isize mount option
Are you referring to some other patches I should be looking for?
> (And George, the reason why you're seeing lots of problems is because
> inline_data isn't enabled by default yet, and as the old joke goes,
> the Early Christians get the best Lions. :-)
Yes, I know I'm tempting fate by keeping data I actaully care about (I
have backups of most of it, but reassembling it from those backups would
be most unpleasant) on a file system with experimental features enabled.
But *somebody* has to debug new features, and I've noticed that the fickle
goddess Glitch is most likely to make an appearance when She sees such
an offering.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-19 23:38 ` George Spelvin
@ 2017-01-20 14:40 ` Theodore Ts'o
2017-01-20 17:57 ` George Spelvin
0 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2017-01-20 14:40 UTC (permalink / raw)
To: George Spelvin; +Cc: jack, linux-ext4
On Thu, Jan 19, 2017 at 06:38:24PM -0500, George Spelvin wrote:
> > I think the inline_data patches I posted should have taken care of
> > George's initial set of bug reports?
>
> Er, the two I posted most recently were on a kernel with your 4-patch
> deadlock series applied:
>
> 98a9e36a ext4: propagate error values from ext4_inline_data_truncate()
> 50c39f0d ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores
> f321034b ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
> 5283ac14 ext4: add debug_want_extra_isize mount option
Yes, I know that's why I said, it takes care of the _initial_ set of
bug reports (not your recently reported ones). Jan's comments
indicated he was lookig at the initial set, and I to avoid him redoing
work that i had already done.
Those bugs look like they are very separate in that they have nothing
to do with locking, but rather in e2fsck and the kernel not properly
dealing with certain types of inconsistencies on disk. On my list to
deal with. As a workaround, you can just clri the offending corrupted
inode from lost+found and then run e2fsck.
> > (And George, the reason why you're seeing lots of problems is because
> > inline_data isn't enabled by default yet, and as the old joke goes,
> > the Early Christians get the best Lions. :-)
>
> Yes, I know I'm tempting fate by keeping data I actaully care about (I
> have backups of most of it, but reassembling it from those backups would
> be most unpleasant) on a file system with experimental features enabled.
>
> But *somebody* has to debug new features, and I've noticed that the fickle
> goddess Glitch is most likely to make an appearance when She sees such
> an offering.
Indeed, the reason why we didn't see this earlier is because we don't
have a test which tests the case of expanding i_extra_size with inline
data, and the introduction of project quota was the first time we've
expanded the extra inode fields in a while. Since you were using
inline_data, you very graciously exposed these bugs and a shortcoming
in our testing. :-)
On my todo list is to add some i_extra_isize testing to xfstsests.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-20 14:40 ` Theodore Ts'o
@ 2017-01-20 17:57 ` George Spelvin
2017-01-21 13:30 ` Theodore Ts'o
0 siblings, 1 reply; 14+ messages in thread
From: George Spelvin @ 2017-01-20 17:57 UTC (permalink / raw)
To: linux, tytso; +Cc: jack, linux-ext4
> Those bugs look like they are very separate in that they have nothing
> to do with locking, but rather in e2fsck and the kernel not properly
> dealing with certain types of inconsistencies on disk. On my list to
> deal with. As a workaround, you can just clri the offending corrupted
> inode from lost+found and then run e2fsck.
Yes, that's fairly obvious, but I've avoided that for a few reasons:
1) I was holding on to the broken inodes in case you needed more information
that I didn't include in the initial report,
2) I also thought they might be wanted to test an eventual fix, and
3) I was probably going to get a bit fancier to preserve the file names.
My latest report is a bit confusing because debugfs() reads the
directories without complaint, so I'm not sure it's a valid on-disk
situation that the kernel is improperly rejecting, or an invaid situation
that libext2fs is improperly accepting.
Since I don't currently understand the on-disk format of inline data,
(I need to RTFS), I can't figure that out or see if there's a way
to do a repair in situ.
Oh! Found it at
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inline_Data
So I'm guessing the problem is that the required empty system.data
attribute is missing (due to the preceding extra_isize changing mess),
and if I just created it, everything would magically come back to life.
Something like ea_set <inode> system.data ""
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Ext4 deadlock (+lockdep splat) during rsync
2017-01-20 17:57 ` George Spelvin
@ 2017-01-21 13:30 ` Theodore Ts'o
2017-01-21 16:45 ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
0 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2017-01-21 13:30 UTC (permalink / raw)
To: George Spelvin; +Cc: jack, linux-ext4
On Fri, Jan 20, 2017 at 12:57:23PM -0500, George Spelvin wrote:
> So I'm guessing the problem is that the required empty system.data
> attribute is missing (due to the preceding extra_isize changing mess),
> and if I just created it, everything would magically come back to life
>
> Something like ea_set <inode> system.data ""
Well, it would still a corrupted, innconsistent inode, in that there
would still be a block attached to the inode, and i_size would be
different from the size of the data xattr used by inline_data. So you
might as well just clri the inode and rerun fsck, or clri the inode
and then unlink the directory entry in lost+found.
You might get it to a state where the kernel isn't explicitly
complaining any more, but you might end up unmasking other bugs where
the kernel is further failing to handle an inconsistent inode relating
to inline_data. Which is fine if you want to potentially exposing
more problems, but if this is file system with Data You Care about, my
suggestsion to run debugfs's clri and then e2fsck -f really is the
most conservative advice to give.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
2017-01-21 13:30 ` Theodore Ts'o
@ 2017-01-21 16:45 ` George Spelvin
2017-01-23 0:08 ` Theodore Ts'o
0 siblings, 1 reply; 14+ messages in thread
From: George Spelvin @ 2017-01-21 16:45 UTC (permalink / raw)
To: linux, tytso; +Cc: jack, linux-ext4
Ted Ts'o wrote:
On Fri, Jan 20, 2017 at 12:57:23PM -0500, George Spelvin wrote:
>> So I'm guessing the problem is that the required empty system.data
>> attribute is missing (due to the preceding extra_isize changing mess),
>> and if I just created it, everything would magically come back to life
>>
>> Something like ea_set <inode> system.data ""
> Well, it would still a corrupted, innconsistent inode, in that there
> would still be a block attached to the inode, and i_size would be
> different from the size of the data xattr used by inline_data. So you
> might as well just clri the inode and rerun fsck, or clri the inode
> and then unlink the directory entry in lost+found.
Er, huh? I was referring to the following error, which is one of a
dozen inodes I have with this problem (sorry all the Subject: lines have
gotten tangled):
>> Subject: ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
>>
>> debugfs: stat <1171779>
>> Inode: 1171779 Type: directory Mode: 0775 Flags: 0x10000000
>> Generation: 2016668288 Version: 0x00000000:00000007
>> User: 1000 Group: 161 Project: 0 Size: 60
>> File ACL: 0 Directory ACL: 0
>> Links: 2 Blockcount: 0
>> Fragment: Address: 0 Number: 0 Size: 0
>> ctime: 0x587eff35:6e19953c -- Wed Jan 18 00:37:57 2017
>> atime: 0x56d5943f:bb6e1344 -- Tue Mar 1 08:08:15 2016
>> mtime: 0x587eff35:6e19953c -- Wed Jan 18 00:37:57 2017
>> crtime: 0x56d388b6:533e9e7c -- Sun Feb 28 18:54:30 2016
>> Size of extra inode fields: 32
>> Inode checksum: 0x82a01dca
>> Size of inline data: 60
>>
>> debugfs: ls <1171779>
>> 1171779 (12) . 1171778 (12) .. 1171954 (12) 0 1171955 (12) 1
>> 1171956 (12) 2 1171957 (20) 3
Zero blocks, data apparently safe and sound in the i_blocks field, just
missing (due to getting trashed by buggy i_extra_size changing code)
the system.data ea, and thus i_inline_off == 0 which upsets the kernel.
> You might get it to a state where the kernel isn't explicitly
> complaining any more, but you might end up unmasking other bugs where
> the kernel is further failing to handle an inconsistent inode relating
> to inline_data.
I just want to get it to a state where I can mv the contents into a
replacement directly and then rmdir this one, rather than having to
make a note of the name & inode number of each of the contained files
and then recreate it from the contents of lost+found (which is already
a bit of a swamp I'm wading through).
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
2017-01-21 16:45 ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
@ 2017-01-23 0:08 ` Theodore Ts'o
0 siblings, 0 replies; 14+ messages in thread
From: Theodore Ts'o @ 2017-01-23 0:08 UTC (permalink / raw)
To: George Spelvin; +Cc: jack, linux-ext4
On Sat, Jan 21, 2017 at 11:45:12AM -0500, George Spelvin wrote:
> Er, huh? I was referring to the following error, which is one of a
> dozen inodes I have with this problem (sorry all the Subject: lines have
> gotten tangled):
Yes, there are a bunch of different inodes, and I was referring to a
different inode dump you've sent out.
> I just want to get it to a state where I can mv the contents into a
> replacement directly and then rmdir this one, rather than having to
> make a note of the name & inode number of each of the contained files
> and then recreate it from the contents of lost+found (which is already
> a bit of a swamp I'm wading through).
My one recommendation would be try it out on a copy of the file
system, or try recreating the corruption on a small file system and
then experiment before you go ahead try to recover things. I tend to
be *very* caution when live data is concerned, especially when I'm not
in front of the system myself and I'm asked to give advice to a user.
Or if it's too hard to make a copy of the file system, you might make
one change to one inode, then run fsck, and then mount the file
system, and make sure the kernel doesn't do something untoward, etc.,
before you try making that same class of change on two or three
inodes, then fsck and mount the file system, and try accessing those
files, before you make all of the changes for that class of
corruption.
Good luck,
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* kernel BUG at fs/ext4/inline.c:1943!
@ 2017-01-18 8:21 George Spelvin
2017-01-19 17:50 ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
0 siblings, 1 reply; 14+ messages in thread
From: George Spelvin @ 2017-01-18 8:21 UTC (permalink / raw)
To: linux, tytso; +Cc: linux-ext4
I was trying to rmdir an empty directory in lost+found that accumulated
during my recent problems.
EXT4-fs (md3): mounted filesystem with writeback data mode. Opts: data=writeback,delalloc
$ cd /mountpoint/lost+found
$ rmdir \#1625089/
------------[ cut here ]------------
kernel BUG at fs/ext4/inline.c:1943!
invalid opcode: 0000 [#1] SMP
Modules linked in: nfsd lockd grace sunrpc ablk_helper via_velocity x86_pkg_temp_thermal crc32_pclmul crc32c_intel [last unloaded: twofish_common]
CPU: 0 PID: 16879 Comm: rmdir Not tainted 4.10.0-rc3-00322-g1aa1df43-dirty #591
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F7 03/19/2014
task: ffff88040e4cab00 task.stack: ffffc9000dd5c000
RIP: 0010:ext4_inline_data_truncate+0x3db/0x400
RSP: 0018:ffffc9000dd5dcd8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88042d287ea0 RCX: 0000021810000000
RDX: 00000000ffffffc3 RSI: ffffc9000dd5dcf8 RDI: ffff8801a41771f0
RBP: ffffc9000dd5dd80 R08: ffff88006123e9c0 R09: ffff8803fe3d60a0
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801a41771f0
R13: ffff8801a4177058 R14: ffff8801a41770f0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88042f200000(0063) knlGS:00000000f7787800
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000f765fbe0 CR3: 0000000179a5e000 CR4: 00000000001406f0
Call Trace:
ext4_truncate+0x1ea/0x300
ext4_evict_inode+0x2c7/0x3d0
evict+0xc0/0x190
iput+0x14c/0x1e0
dentry_unlink_inode+0xbe/0x160
d_delete+0x9c/0xb0
vfs_rmdir+0x102/0x130
do_rmdir+0x1a3/0x1e0
SyS_rmdir+0x11/0x20
do_fast_syscall_32+0x94/0x210
entry_SYSENTER_compat+0x51/0x60
RIP: 0023:0xf778aaf9
RSP: 002b:00000000ffda9e0c EFLAGS: 00000292 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00000000ffdab332 RCX: 0000000000000000
RDX: 00000000565f4000 RSI: 00000000565efb38 RDI: 00000000ffdab332
RBP: 00000000ffda9e68 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Code: ff ff 89 c1 48 89 d7 48 c7 44 0a f8 00 00 00 00 8d 48 ff 31 c0 c1 e9 03 f3 48 ab e9 31 fe ff ff 41 bf f4 ff ff ff e9 3c fe ff ff <0f> 0b 89 c0 c7 02 00 00 00 00 c7 44 02 fc 00 00 00 00 e9 0f fe
RIP: ext4_inline_data_truncate+0x3db/0x400 RSP: ffffc9000dd5dcd8
---[ end trace caa17f858299cde5 ]---
That's the "BUG_ON(is.s.not_found);"
debugfs says the directory entry is gone, but I can:
debugfs: stat <1625089>
Inode: 1625089 Type: directory Mode: 0775 Flags: 0x10000000
Generation: 927350643 Version: 0x00000000:00000004
User: 1000 Group: 161 Project: 0 Size: 132
File ACL: 1664090185 Directory ACL: 0
Links: 0 Blockcount: 8
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x587f2034:472c74ec -- Wed Jan 18 02:58:44 2017
atime: 0x56b9e2f8:b68a7658 -- Tue Feb 9 08:00:40 2016
mtime: 0x56c1bc4b:a7765de8 -- Mon Feb 15 06:53:47 2016
crtime: 0x56ba9eb4:a51d90ac -- Tue Feb 9 21:21:40 2016
Size of extra inode fields: 32
Extended attributes:
system.data (72)
Inode checksum: 0xe2f12fc5
Size of inline data: 132
debugfs: dump <1625089> /tmp/inode
$ xxd /tmp/inode
00000000: 0b00 0000 0000 0000 3800 0000 0000 0000 ........8.......
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000040: 4800 3101 xxxx xxxx xxxx xxxx xxxx xxxx H.1.xxxxxxxxxxxx
00000050: xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxxxxxxxxxxxxxx
00000060: xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxxxxxxxxxxxxxx
00000070: xx2e 7064 6600 0000 0000 0000 0000 0000 x.pdf...........
00000080: 0000 0000 ....
debugfs: ea_get -f /tmp/ea <1625089> system.data
$ xxd /tmp/ea
00000000: 0000 0000 4800 3101 xxxx xxxx xxxx xxxx ....H.1.xxxxxxxx
00000010: xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxxxxxxxxxxxxxx
00000020: xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxxxxxxxxxxxxxx
00000030: xxxx xxxx xx2e 7064 6600 0000 0000 0000 xxxxx.pdf.......
00000040: 0000 0000 0000 0000 ........
^ permalink raw reply [flat|nested] 14+ messages in thread
* ext4_iget:4740: inode #%ld: block 48: comm find: invalid block
2017-01-18 8:21 kernel BUG at fs/ext4/inline.c:1943! George Spelvin
@ 2017-01-19 17:50 ` George Spelvin
0 siblings, 0 replies; 14+ messages in thread
From: George Spelvin @ 2017-01-19 17:50 UTC (permalink / raw)
To: linux-ext4, tytso; +Cc: linux
This is part of the fallout of previous errors, so I don't know how
it happened, but repeated e2fsck runs (git master, e2fsck 1.43.3.1
(22-Dec-2016)) aren't fixing it, which is at least an error in e2fsck.
I think it's something to do with inline_data. The error is
} else if (!ext4_has_inline_data(inode)) {
if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
if ((S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
(S_ISLNK(inode->i_mode) &&
!ext4_inode_is_fast_symlink(inode))))
/* Validate extent which is part of inode */
ret = ext4_ext_check_inode(inode);
} else if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
(S_ISLNK(inode->i_mode) &&
!ext4_inode_is_fast_symlink(inode))) {
/* Validate block references which are part of inode */
4740--> ret = ext4_ind_check_inode(inode);
}
but I think the !ext4_has_inline_data() check is wrong. I.e. the inode
*does* have inline data but the test is not detecting it, leading
to -EFSCORRUPTED.
debugfs: stat <1171779>
Inode: 1171779 Type: directory Mode: 0775 Flags: 0x10000000
Generation: 2016668288 Version: 0x00000000:00000007
User: 1000 Group: 161 Project: 0 Size: 60
File ACL: 0 Directory ACL: 0
Links: 2 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x587eff35:6e19953c -- Wed Jan 18 00:37:57 2017
atime: 0x56d5943f:bb6e1344 -- Tue Mar 1 08:08:15 2016
mtime: 0x587eff35:6e19953c -- Wed Jan 18 00:37:57 2017
crtime: 0x56d388b6:533e9e7c -- Sun Feb 28 18:54:30 2016
Size of extra inode fields: 32
Inode checksum: 0x82a01dca
Size of inline data: 60
debugfs: ls <1171779>
1171779 (12) . 1171778 (12) .. 1171954 (12) 0 1171955 (12) 1
1171956 (12) 2 1171957 (20) 3
debugfs: ex <1171779>
<1171779>: does not uses extent block maps
debugfs: blocks <1171779>
(i.e. blank line)
It has no blocks allocated, but contains data... must be inline, no?
The EXT4_INODE_INLINE_DATA bit (bit 28) is set, so i_inline_off must
be zero. How could that arise?
Other info:
debugfs: ea_list <1171779>
debugfs: htree <1171779>
htree: Not a hash-indexed directory
debugfs: id <1171779>
0000 fd45 e803 3c00 0000 3f94 d556 35ff 7e58 .E..<...?..V5.~X
0020 35ff 7e58 0000 0000 a100 0200 0000 0000 5.~X............
0040 0000 0010 0700 0000 42e1 1100 f2e1 1100 ........B.......
0060 0c00 0101 3000 0000 f3e1 1100 0c00 0101 ....0...........
0100 3100 0000 f4e1 1100 0c00 0101 3200 0000 1...........2...
0120 f5e1 1100 1400 0101 3300 0000 0000 0000 ........3.......
0140 0000 0000 80ea 3378 0000 0000 0000 0000 ......3x........
0160 0000 0000 0000 0000 0000 0000 ca1d 0000 ................
0200 2000 a082 3c95 196e 3c95 196e 4413 6ebb ...<..n<..nD.n.
0220 b688 d356 7c9e 3e53 0000 0000 0000 0000 ...V|.>S........
0240 0000 0000 0000 0000 0000 0000 0000 0000 ................
*
debugfs: dump <1171779> /tmp/empty
$ xxd /tmp/empty
00000000: 42e1 1100 f2e1 1100 0c00 0101 3000 0000 B...........0...
00000010: f3e1 1100 0c00 0101 3100 0000 f4e1 1100 ........1.......
00000020: 0c00 0101 3200 0000 f5e1 1100 1400 0101 ....2...........
00000030: 3300 0000 0000 0000 0000 0000 3...........
That data is clearly visible in the i_block area of the inode
(starting at offset 0050).
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-01-23 0:08 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-08 22:41 Ext4 deadlock (+lockdep splat) during rsync George Spelvin
2017-01-10 3:40 ` Theodore Ts'o
2017-01-10 7:25 ` George Spelvin
2017-01-19 17:37 ` Jan Kara
2017-01-19 17:45 ` Jan Kara
2017-01-19 17:59 ` George Spelvin
2017-01-19 23:19 ` Theodore Ts'o
2017-01-19 23:38 ` George Spelvin
2017-01-20 14:40 ` Theodore Ts'o
2017-01-20 17:57 ` George Spelvin
2017-01-21 13:30 ` Theodore Ts'o
2017-01-21 16:45 ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
2017-01-23 0:08 ` Theodore Ts'o
-- strict thread matches above, loose matches on Subject: below --
2017-01-18 8:21 kernel BUG at fs/ext4/inline.c:1943! George Spelvin
2017-01-19 17:50 ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).