public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
* [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
@ 2024-09-16 13:30 bugzilla-daemon
  2024-09-16 13:31 ` [Bug 219283] " bugzilla-daemon
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 13:30 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

            Bug ID: 219283
           Summary: kernel regression with ext4 and ea_inode mount flags
                    and exercising xattrs (between Linux 6.8 and 6.11)
           Product: File System
           Version: 2.5
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: colin.i.king@gmail.com
        Regression: No

Created attachment 306882
  --> https://bugzilla.kernel.org/attachment.cgi?id=306882&action=edit
dmesg showing lockup

exercising xattr with stress-ng on ext4 when it's been created with ea_inode
option causes a kernel hang.

Kernel: 6.11.0-7, AMD64 virtual machine 8 thread virtual machine (important,
must be multiple CPU threads to trigger the regression)
20GB virtio drive on /dev/vdb, 1 partition /dev/vdb1

sudo mkfs.ext4 /dev/vdb1 -O ea_inode
sudo mount /dev/vdb1 /mnt

git clone https://github.com/ColinIanKing/stress-ng
cd stress-ng
make clean; make -j $(nproc)

sudo ./stress-ng --xattr 8 -t 120 --vmstat 1 --file-path /mnt

..wait a couple of minutes, you will see that the number of running processes
is not 8 as expected (from the --vmstat output of stress-ng)

cannot ^C stop stress-ng because of a kernel lockup, use another tty and check
dmesg, I get the following:

[ 247.028846] INFO: task jbd2/vdb1-8:1548 blocked for more than 122 seconds.
[ 247.030830] Not tainted 6.11.0-7-generic #7-Ubuntu
[ 247.032667] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 247.034170] task:jbd2/vdb1-8 state:D stack:0 pid:1548 tgid:1548 ppid:2
flags:0x00004000
[ 247.034176] Call Trace:
[ 247.034178] <TASK>
[ 247.034182] __schedule+0x277/0x6c0
[ 247.034199] schedule+0x29/0xd0
[ 247.034203] jbd2_journal_wait_updates+0x77/0xf0
[ 247.034207] ? __pfx_autoremove_wake_function+0x10/0x10
[ 247.034213] jbd2_journal_commit_transaction+0x290/0x1a10
[ 247.034223] kjournald2+0xa8/0x250
[ 247.034228] ? __pfx_autoremove_wake_function+0x10/0x10
[ 247.034233] ? __pfx_kjournald2+0x10/0x10
[ 247.034236] kthread+0xe1/0x110
[ 247.034241] ? __pfx_kthread+0x10/0x10
[ 247.034244] ret_from_fork+0x44/0x70
[ 247.034247] ? __pfx_kthread+0x10/0x10
[ 247.034251] ret_from_fork_asm+0x1a/0x30
[ 247.034257] </TASK>

NOTE: this works fine for Linux 6.8.0, so this looks like a regression for
6.11.0

Attached is the full kernel log.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
@ 2024-09-16 13:31 ` bugzilla-daemon
  2024-09-16 15:35 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 13:31 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #1 from Colin Ian King (colin.i.king@gmail.com) ---
Oops, typo, the reproducer should be:

sudo ./stress-ng --xattr 8 -t 120 --vmstat 1 --temp-path /mnt

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
  2024-09-16 13:31 ` [Bug 219283] " bugzilla-daemon
@ 2024-09-16 15:35 ` bugzilla-daemon
  2024-09-16 17:03 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 15:35 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #2 from Colin Ian King (colin.i.king@gmail.com) ---
Seems to occur on 6.10 but not 6.9, so it's a 6.10 regression.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
  2024-09-16 13:31 ` [Bug 219283] " bugzilla-daemon
  2024-09-16 15:35 ` bugzilla-daemon
@ 2024-09-16 17:03 ` bugzilla-daemon
  2024-09-16 17:13 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 17:03 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

Theodore Tso (tytso@mit.edu) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #3 from Theodore Tso (tytso@mit.edu) ---
OK, so you've isolated it to something between 6.9 and 6.10.   Any chance you
could do further bisection work?   Thanks!!

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (2 preceding siblings ...)
  2024-09-16 17:03 ` bugzilla-daemon
@ 2024-09-16 17:13 ` bugzilla-daemon
  2024-09-16 17:51 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 17:13 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #4 from Colin Ian King (colin.i.king@gmail.com) ---
I'm doing some bisect builds, but my system is a bit slow, so may take a few
days.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (3 preceding siblings ...)
  2024-09-16 17:13 ` bugzilla-daemon
@ 2024-09-16 17:51 ` bugzilla-daemon
  2024-09-16 18:09 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 17:51 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #5 from Theodore Tso (tytso@mit.edu) ---
Thanks!   I'm traveling for the next two weeks, so anything you can do to
narrow it down will be much appreciated.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (4 preceding siblings ...)
  2024-09-16 17:51 ` bugzilla-daemon
@ 2024-09-16 18:09 ` bugzilla-daemon
  2024-09-17 18:13 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-16 18:09 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #6 from Artem S. Tashkinov (aros@gmx.com) ---
(In reply to Colin Ian King from comment #4)
> I'm doing some bisect builds, but my system is a bit slow, so may take a few
> days.

Use ext4 in RAM, e.g. using tmpfs.

For building the kernel you could use ccache.

To avoid reboots, have your root fs on any other FS, e.g. xfs and build ext4 as
a module.

Or even perform everything in a VM.

It's not a hardware module, so the whole test could be performed in a VM.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (5 preceding siblings ...)
  2024-09-16 18:09 ` bugzilla-daemon
@ 2024-09-17 18:13 ` bugzilla-daemon
  2024-09-17 20:18 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-17 18:13 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #7 from Colin Ian King (colin.i.king@gmail.com) ---
$ git bisect bad
0a46ef234756dca04623b7591e8ebb3440622f0b is the first bad commit
commit 0a46ef234756dca04623b7591e8ebb3440622f0b (HEAD)
Author: Jan Kara <jack@suse.cz>
Date:   Thu Mar 21 17:26:50 2024 +0100

    ext4: do not create EA inode under buffer lock

    ext4_xattr_set_entry() creates new EA inodes while holding buffer lock
    on the external xattr block. This is problematic as it nests all the
    allocation locking (which acquires locks on other buffers) under the
    buffer lock. This can even deadlock when the filesystem is corrupted and
    e.g. quota file is setup to contain xattr block as data block. Move the
    allocation of EA inode out of ext4_xattr_set_entry() into the callers.

    Reported-by: syzbot+a43d4f48b8397d0e41a9@syzkaller.appspotmail.com
    Signed-off-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20240321162657.27420-2-jack@suse.cz
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

 fs/ext4/xattr.c | 113
++++++++++++++++++++++++++++++++++---------------------------------------
 1 file changed, 53 insertions(+), 60 deletions(-)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (6 preceding siblings ...)
  2024-09-17 18:13 ` bugzilla-daemon
@ 2024-09-17 20:18 ` bugzilla-daemon
  2024-09-18 13:43 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-17 20:18 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #8 from Colin Ian King (colin.i.king@gmail.com) ---
And reverting 0a46ef234756dca04623b7591e8ebb3440622f0b on 6.10.10 makes the
test run OK, without reverting it the kernel will hang.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (7 preceding siblings ...)
  2024-09-17 20:18 ` bugzilla-daemon
@ 2024-09-18 13:43 ` bugzilla-daemon
  2024-09-18 13:43 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-18 13:43 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Bisected commit-id|                            |0a46ef234756dca04623b7591e8
                   |                            |ebb3440622f0b
         Regression|No                          |Yes

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (8 preceding siblings ...)
  2024-09-18 13:43 ` bugzilla-daemon
@ 2024-09-18 13:43 ` bugzilla-daemon
  2024-09-18 15:35 ` bugzilla-daemon
  2024-09-18 16:08 ` bugzilla-daemon
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-18 13:43 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jack@suse.cz

--- Comment #9 from Artem S. Tashkinov (aros@gmx.com) ---
Jan,

Please take a look.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (9 preceding siblings ...)
  2024-09-18 13:43 ` bugzilla-daemon
@ 2024-09-18 15:35 ` bugzilla-daemon
  2024-09-18 16:08 ` bugzilla-daemon
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-18 15:35 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #10 from Jan Kara (jack@suse.cz) ---
Yeah, I'm already watching it because when you narrowed down the range to
6.9-6.10 I started suspecting this commit :) I'll try to reproduce to debug
what's going on.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug 219283] kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11)
  2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
                   ` (10 preceding siblings ...)
  2024-09-18 15:35 ` bugzilla-daemon
@ 2024-09-18 16:08 ` bugzilla-daemon
  11 siblings, 0 replies; 13+ messages in thread
From: bugzilla-daemon @ 2024-09-18 16:08 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=219283

--- Comment #11 from Jan Kara (jack@suse.cz) ---
I think I see the problem and AFAICT commit 0a46ef234756dca is just making the
latent deadlock easier to hit. The problem is shown by these two stacktraces
from your dmesg:

Task 1
[  247.045575]  __wait_on_freeing_inode+0xba/0x140
[  247.045584]  find_inode_fast+0xa4/0xe0
[  247.045588]  iget_locked+0x71/0x200
[  247.045597]  __ext4_iget+0x148/0x1080
[  247.045615]  ext4_xattr_inode_cache_find+0xe2/0x220
[  247.045621]  ext4_xattr_inode_lookup_create+0x122/0x240
[  247.045626]  ext4_xattr_block_set+0xc2/0xeb0
[  247.045633]  ext4_xattr_set_handle+0x4ba/0x650
[  247.045641]  ext4_xattr_set+0x80/0x160

Task 2
[  247.043719]  mb_cache_entry_wait_unused+0x9a/0xd0
[  247.043729]  ext4_evict_ea_inode+0x64/0xb0
[  247.043733]  ext4_evict_inode+0x35c/0x6d0
[  247.043739]  evict+0x108/0x2c0
[  247.043745]  iput+0x14a/0x260
[  247.043749]  ext4_xattr_ibody_set+0x175/0x1d0
[  247.043754]  ext4_xattr_set_handle+0x297/0x650
[  247.043762]  ext4_xattr_set+0x80/0x160

These two tasks are deadlocked against each other. One has dropped the last
reference to xattr inode and is trying to remove it from memory and waits for
corresponding mbcache entry to get unused while another task is holding the
mbcache entry reference and is waiting for inode to be evicted from memory.
Commit 0a46ef234756dca removed synchronization on buffer lock for one of the
hot paths and thus hitting this race is now much more likely.

I just have to make up my mind how to best fix this ABBA deadlock.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-09-18 16:08 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-16 13:30 [Bug 219283] New: kernel regression with ext4 and ea_inode mount flags and exercising xattrs (between Linux 6.8 and 6.11) bugzilla-daemon
2024-09-16 13:31 ` [Bug 219283] " bugzilla-daemon
2024-09-16 15:35 ` bugzilla-daemon
2024-09-16 17:03 ` bugzilla-daemon
2024-09-16 17:13 ` bugzilla-daemon
2024-09-16 17:51 ` bugzilla-daemon
2024-09-16 18:09 ` bugzilla-daemon
2024-09-17 18:13 ` bugzilla-daemon
2024-09-17 20:18 ` bugzilla-daemon
2024-09-18 13:43 ` bugzilla-daemon
2024-09-18 13:43 ` bugzilla-daemon
2024-09-18 15:35 ` bugzilla-daemon
2024-09-18 16:08 ` bugzilla-daemon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox