All of lore.kernel.org
 help / color / mirror / Atom feed
From: bugzilla-daemon@kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 211819] Processes stuck in D-state after accessing ext4+fast_commit+fscrypt
Date: Fri, 25 Mar 2022 22:36:16 +0000	[thread overview]
Message-ID: <bug-211819-13602-opkiWIhkdR@https.bugzilla.kernel.org/> (raw)
In-Reply-To: <bug-211819-13602@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=211819

AT (kernel@twinhelix.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kernel@twinhelix.com

--- Comment #1 from AT (kernel@twinhelix.com) ---
This affects kernels 5.16.15-arch1 and 5.16.16-arch1 at least (distro suggested
I add to this report). While running 5.16.16 with my home directory encrypted
with fscrypt (automatically unlocked on login), I enabled EXT4 fast_commit on
my root drive with:

sudo tune2fs -O fast_commit /dev/nvme0n1p2

( Other filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent flex_bg encrypt sparse_super large_file
huge_file dir_nlink extra_isize metadata_csum )

The system remained responsive but after that there were repeated episodes like
this in the log while running Firefox:

1 Mar 23 22:19:15 hostname kernel: INFO: task IndexedDB #19:3350 blocked for
more than 122 seconds.
1 Mar 23 22:19:15 hostname kernel: Not tainted 5.16.16-arch1-1 #1
2 Mar 23 22:19:15 hostname kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
3 Mar 23 22:19:15 hostname kernel: task:IndexedDB #19 state:D stack: 0 pid:
3350 ppid: 1393 flags:0x00000002
4 Mar 23 22:19:15 hostname kernel: Call Trace:
5 Mar 23 22:19:15 hostname kernel: <TASK>
6 Mar 23 22:19:15 hostname kernel: __schedule+0x2f6/0xf80
7 Mar 23 22:19:15 hostname kernel: ? touch_atime+0x44/0x190
8 Mar 23 22:19:15 hostname kernel: schedule+0x4b/0xc0
9 Mar 23 22:19:15 hostname kernel: ext4_fc_wait_committing_inode+0xad/0xe0
[ext4 ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
10 Mar 23 22:19:15 hostname kernel: ? var_wake_function+0x20/0x20
11 Mar 23 22:19:15 hostname kernel: ext4_fc_start_update+0x43/0x90 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
12 Mar 23 22:19:15 hostname kernel: ext4_buffered_write_iter+0x35/0x120 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
13 Mar 23 22:19:15 hostname kernel: new_sync_write+0x15c/0x1f0
14 Mar 23 22:19:15 hostname kernel: vfs_write+0x1eb/0x280
15 Mar 23 22:19:15 hostname kernel: ksys_write+0x67/0xe0
16 Mar 23 22:19:15 hostname kernel: do_syscall_64+0x5c/0x80
17 Mar 23 22:19:15 hostname kernel: ? do_syscall_64+0x69/0x80
18 Mar 23 22:19:15 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
19 Mar 23 22:19:15 hostname kernel: ? do_syscall_64+0x69/0x80
20 Mar 23 22:19:15 hostname kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
21 Mar 23 22:19:15 hostname kernel: RIP: 0033:0x7f5c0222928f
22 Mar 23 22:19:15 hostname kernel: RSP: 002b:00007f5becaac970 EFLAGS: 00000293
ORIG_RAX: 0000000000000001
23 Mar 23 22:19:15 hostname kernel: RAX: ffffffffffffffda RBX: 00007f5bbcfcffa0
RCX: 00007f5c0222928f
24 Mar 23 22:19:15 hostname kernel: RDX: 0000000000001000 RSI: 00007f5bb8e89000
RDI: 0000000000000038
25 Mar 23 22:19:15 hostname kernel: RBP: 0000000000000038 R08: 0000000000000000
R09: 0000000000000001
26 Mar 23 22:19:15 hostname kernel: R10: 00007ffea35cb080 R11: 0000000000000293
R12: 00007f5bb8e89000
27 Mar 23 22:19:15 hostname kernel: R13: 0000000000001000 R14: 00007f5bbcfcffa0
R15: 0000000000012000
28 Mar 23 22:19:15 hostname kernel: </TASK>

Shutdown was held up by Firefox/Thunderbird processes hanging, after an expired
timeout I had to forcefully reboot with Magic SysRq key. At one point the
system wouldn't boot with disk errors and I had to fsck.ext4 in the initrd,
luckily managed to get booting again. Log from the latter part of an attempted
suspend:

Mar 24 00:41:16 hostname kernel: PM: suspend entry (s2idle)
Mar 24 00:41:16 hostname kernel: Filesystems sync: 0.009 seconds
Mar 24 00:41:36 hostname kernel: Freezing user space processes ...
Mar 24 00:41:36 hostname kernel: Freezing of tasks failed after 20.009 seconds
(3 tasks refusing to freeze, wq_busy=0):
Mar 24 00:41:36 hostname kernel: task:IndexedDB #19 state:D stack: 0 pid: 3350
ppid: 1393 flags:0x00000006
Mar 24 00:41:36 hostname kernel: Call Trace:
Mar 24 00:41:36 hostname kernel: <TASK>
Mar 24 00:41:36 hostname kernel: __schedule+0x2f6/0xf80
Mar 24 00:41:36 hostname kernel: ? touch_atime+0x44/0x190
Mar 24 00:41:36 hostname kernel: schedule+0x4b/0xc0
Mar 24 00:41:36 hostname kernel: ext4_fc_wait_committing_inode+0xad/0xe0 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: ? var_wake_function+0x20/0x20
Mar 24 00:41:36 hostname kernel: ext4_fc_start_update+0x43/0x90 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: ext4_buffered_write_iter+0x35/0x120 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: new_sync_write+0x15c/0x1f0
Mar 24 00:41:36 hostname kernel: vfs_write+0x1eb/0x280
Mar 24 00:41:36 hostname kernel: ksys_write+0x67/0xe0
Mar 24 00:41:36 hostname kernel: do_syscall_64+0x5c/0x80
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 24 00:41:36 hostname kernel: RIP: 0033:0x7f5c0222928f
Mar 24 00:41:36 hostname kernel: RSP: 002b:00007f5becaac970 EFLAGS: 00000293
ORIG_RAX: 0000000000000001
Mar 24 00:41:36 hostname kernel: RAX: ffffffffffffffda RBX: 00007f5bbcfcffa0
RCX: 00007f5c0222928f
Mar 24 00:41:36 hostname kernel: RDX: 0000000000001000 RSI: 00007f5bb8e89000
RDI: 0000000000000038
Mar 24 00:41:36 hostname kernel: RBP: 0000000000000038 R08: 0000000000000000
R09: 0000000000000001
Mar 24 00:41:36 hostname kernel: R10: 00007ffea35cb080 R11: 0000000000000293
R12: 00007f5bb8e89000
Mar 24 00:41:36 hostname kernel: R13: 0000000000001000 R14: 00007f5bbcfcffa0
R15: 0000000000012000
Mar 24 00:41:36 hostname kernel: </TASK>
Mar 24 00:41:36 hostname kernel: task:Indexed~ Mnt #1 state:D stack: 0
pid:64999 ppid: 1393 flags:0x00000006
Mar 24 00:41:36 hostname kernel: Call Trace:
Mar 24 00:41:36 hostname kernel: <TASK>
Mar 24 00:41:36 hostname kernel: __schedule+0x2f6/0xf80
Mar 24 00:41:36 hostname kernel: schedule+0x4b/0xc0
Mar 24 00:41:36 hostname kernel: ext4_fc_wait_committing_inode+0xad/0xe0 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: ? var_wake_function+0x20/0x20
Mar 24 00:41:36 hostname kernel: ext4_fc_start_update+0x43/0x90 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: ext4_buffered_write_iter+0x35/0x120 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: new_sync_write+0x15c/0x1f0
Mar 24 00:41:36 hostname kernel: vfs_write+0x1eb/0x280
Mar 24 00:41:36 hostname kernel: ksys_write+0x67/0xe0
Mar 24 00:41:36 hostname kernel: do_syscall_64+0x5c/0x80
Mar 24 00:41:36 hostname kernel: ? __fget_light+0x8f/0x110
Mar 24 00:41:36 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 24 00:41:36 hostname kernel: RIP: 0033:0x7f5c0222928f
Mar 24 00:41:36 hostname kernel: RSP: 002b:00007f5bbc0ebcf0 EFLAGS: 00000293
ORIG_RAX: 0000000000000001
Mar 24 00:41:36 hostname kernel: RAX: ffffffffffffffda RBX: 00007f5ba7fa5608
RCX: 00007f5c0222928f
Mar 24 00:41:36 hostname kernel: RDX: 0000000000000018 RSI: 00007f5bbc0ebeb0
RDI: 00000000000000ce
Mar 24 00:41:36 hostname kernel: RBP: 00000000000000ce R08: 0000000000000000
R09: 0000000000000000
Mar 24 00:41:36 hostname kernel: R10: 00007ffea35cb080 R11: 0000000000000293
R12: 00007f5bbc0ebeb0
Mar 24 00:41:36 hostname kernel: R13: 0000000000000018 R14: 00007f5ba7fa5608
R15: 0000000000058860
Mar 24 00:41:36 hostname kernel: </TASK>
Mar 24 00:41:36 hostname kernel: task:Indexed~ Mnt #2 state:D stack: 0
pid:65000 ppid: 1393 flags:0x00000006
Mar 24 00:41:36 hostname kernel: Call Trace:
Mar 24 00:41:36 hostname kernel: <TASK>
Mar 24 00:41:36 hostname kernel: __schedule+0x2f6/0xf80
Mar 24 00:41:36 hostname kernel: ? atime_needs_update+0x82/0x100
Mar 24 00:41:36 hostname kernel: schedule+0x4b/0xc0
Mar 24 00:41:36 hostname kernel: ext4_fc_wait_committing_inode+0xad/0xe0 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: ? var_wake_function+0x20/0x20
Mar 24 00:41:36 hostname kernel: ext4_fc_start_update+0x43/0x90 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: ext4_buffered_write_iter+0x35/0x120 [ext4
ef9ba7b6856d0b378c7bc2df0cae9987df17f5c0]
Mar 24 00:41:36 hostname kernel: new_sync_write+0x15c/0x1f0
Mar 24 00:41:36 hostname kernel: vfs_write+0x1eb/0x280
Mar 24 00:41:36 hostname kernel: ksys_write+0x67/0xe0
Mar 24 00:41:36 hostname kernel: do_syscall_64+0x5c/0x80
Mar 24 00:41:36 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? syscall_exit_to_user_mode+0x23/0x40
Mar 24 00:41:36 hostname kernel: ? do_syscall_64+0x69/0x80
Mar 24 00:41:36 hostname kernel: ? native_flush_tlb_local+0x31/0x40
Mar 24 00:41:36 hostname kernel: ? flush_tlb_func+0xc8/0x1d0
Mar 24 00:41:36 hostname kernel: ? sched_clock_cpu+0x9/0xb0
Mar 24 00:41:36 hostname kernel: ? irqtime_account_irq+0x38/0xb0
Mar 24 00:41:36 hostname kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 24 00:41:36 hostname kernel: RIP: 0033:0x7f5c0222928f
Mar 24 00:41:36 hostname kernel: RSP: 002b:00007f5baf502130 EFLAGS: 00000293
ORIG_RAX: 0000000000000001
Mar 24 00:41:36 hostname kernel: RAX: ffffffffffffffda RBX: 00007f5b98511da0
RCX: 00007f5c0222928f
Mar 24 00:41:36 hostname kernel: RDX: 0000000000001000 RSI: 00007f5baaf4c000
RDI: 00000000000000a2
Mar 24 00:41:36 hostname kernel: RBP: 00000000000000a2 R08: 0000000000000000
R09: 0000000000000001
Mar 24 00:41:36 hostname kernel: R10: 00007ffea35cb080 R11: 0000000000000293
R12: 00007f5baaf4c000
Mar 24 00:41:36 hostname kernel: R13: 0000000000001000 R14: 00007f5b98511da0
R15: 0000000000007000
Mar 24 00:41:36 hostname kernel: </TASK>
Mar 24 00:41:36 hostname kernel:
Mar 24 00:41:36 hostname kernel: OOM killer enabled.
Mar 24 00:41:36 hostname kernel: Restarting tasks ... done.
Mar 24 00:41:36 hostname kernel: PM: suspend exit
Mar 24 00:41:36 hostname systemd-sleep[65047]: Failed to put system to sleep.
System resumed again: Device or resource busy
Mar 24 00:41:36 hostname systemd[1]: systemd-suspend.service: Main process
exited, code=exited, status=1/FAILURE
Mar 24 00:41:36 hostname audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295
ses=4294967295 subj==unconfined msg='unit=systemd-suspend comm="systemd"
exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Mar 24 00:41:36 hostname kernel: audit: type=1130 audit(1648035696.209:166):
pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined
msg='unit=systemd-suspend comm="systemd" exe="/usr/lib/systemd/systemd"
hostname=? addr=? terminal=? res=failed'
Mar 24 00:41:36 hostname systemd[1]: systemd-suspend.service: Failed with
result 'exit-code'.
Mar 24 00:41:36 hostname systemd[1]: Failed to start System Suspend.
Mar 24 00:41:36 hostname systemd[1]: Dependency failed for Suspend.
Mar 24 00:41:36 hostname systemd[1]: suspend.target: Job suspend.target/start
failed with result 'dependency'.
Mar 24 00:41:36 hostname systemd[1]: Stopped target Sleep.
Mar 24 00:41:36 hostname systemd-logind[595]: Operation 'sleep' finished.

I tried both kernels with the same issue. Then to remove it I ran:

sudo tune2fs -O ^fast_commit /dev/nvme0n1p2

which still resulted in errors until I rebooted and appended "fsck.mode=force"
to the Linux boot cmdline; the system now runs OK. Hope this helps debug the
fscrypt/fast_commit interaction.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

      reply	other threads:[~2022-03-25 22:36 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-18  6:51 [Bug 211819] New: Processes stuck in D-state after accessing ext4+fast_commit+fscrypt bugzilla-daemon
2022-03-25 22:36 ` bugzilla-daemon [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-211819-13602-opkiWIhkdR@https.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.