From: bugzilla-daemon@kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 219166] New: ext4 hang when setting echo noop > /sys/block/sda/queue/scheduler
Date: Fri, 16 Aug 2024 07:43:09 +0000
Message-ID: <bug-219166-13602@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=219166

            Bug ID: 219166
           Summary: ext4 hang when setting echo noop >
                    /sys/block/sda/queue/scheduler
           Product: File System
           Version: 2.5
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: rjones@redhat.com
        Regression: No

kernel-6.11.0-0.rc3.30.fc41.x86_64

This hang seems new in 6.11.0.

Downstream bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2303267

Writing to queue/scheduler for a device containing an ext4 filesystem
very rarely causes other processes which are using the filesystem to hang.

This is running inside qemu.  Using 'crash' I was able to get a stack trace
from a couple of hanging processes (from different VM instances):

crash> set 230
    PID: 230
COMMAND: "modprobe"
   TASK: ffff98ce03ca3040  [THREAD_INFO: ffff98ce03ca3040]
    CPU: 0
  STATE: TASK_UNINTERRUPTIBLE 
crash> bt
PID: 230      TASK: ffff98ce03ca3040  CPU: 0    COMMAND: "modprobe"
 #0 [ffffaf9940307840] __schedule at ffffffff9618f6d0
 #1 [ffffaf99403078f8] schedule at ffffffff96190a27
 #2 [ffffaf9940307908] __bio_queue_enter at ffffffff957e121c
 #3 [ffffaf9940307968] blk_mq_submit_bio at ffffffff957f358c
 #4 [ffffaf99403079f0] __submit_bio at ffffffff957e1e3c
 #5 [ffffaf9940307a58] submit_bio_noacct_nocheck at ffffffff957e2326
 #6 [ffffaf9940307ac0] ext4_mpage_readpages at ffffffff955ceafc
 #7 [ffffaf9940307be0] read_pages at ffffffff95381d1a
 #8 [ffffaf9940307c40] page_cache_ra_unbounded at ffffffff95381ff5
 #9 [ffffaf9940307ca8] filemap_fault at ffffffff953761b5
#10 [ffffaf9940307d48] __do_fault at ffffffff953d1895
#11 [ffffaf9940307d70] do_fault at ffffffff953d2425
#12 [ffffaf9940307da0] __handle_mm_fault at ffffffff953d8c6b
#13 [ffffaf9940307e88] handle_mm_fault at ffffffff953d95c2
#14 [ffffaf9940307ec8] do_user_addr_fault at ffffffff950b34ea
#15 [ffffaf9940307f28] exc_page_fault at ffffffff96186e4e
#16 [ffffaf9940307f50] asm_exc_page_fault at ffffffff962012a6
    RIP: 0000556b7a7468d8  RSP: 00007ffde2ffb560  RFLAGS: 00000206
    RAX: 00000000000bec82  RBX: 00007f5331a0dc82  RCX: 0000556b7a75b92a
    RDX: 00007ffde2ffd8d0  RSI: 00000000200bec82  RDI: 0000556ba8edf960
    RBP: 00007ffde2ffb7c0   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000202  R12: 00000000200bec82
    R13: 0000556ba8edf960  R14: 00007ffde2ffd8d0  R15: 0000556b7a760708
    ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b

crash> set 234
    PID: 234
COMMAND: "modprobe"
   TASK: ffff9f5ec3a22f40  [THREAD_INFO: ffff9f5ec3a22f40]
    CPU: 0
  STATE: TASK_UNINTERRUPTIBLE 
crash> bt
PID: 234      TASK: ffff9f5ec3a22f40  CPU: 0    COMMAND: "modprobe"
 #0 [ffffb21e002e7840] __schedule at ffffffffa718f6d0
 #1 [ffffb21e002e78f8] schedule at ffffffffa7190a27
 #2 [ffffb21e002e7908] __bio_queue_enter at ffffffffa67e121c
 #3 [ffffb21e002e7968] blk_mq_submit_bio at ffffffffa67f358c
 #4 [ffffb21e002e79f0] __submit_bio at ffffffffa67e1e3c
 #5 [ffffb21e002e7a58] submit_bio_noacct_nocheck at ffffffffa67e2326
 #6 [ffffb21e002e7ac0] ext4_mpage_readpages at ffffffffa65ceafc
 #7 [ffffb21e002e7be0] read_pages at ffffffffa6381d17
 #8 [ffffb21e002e7c40] page_cache_ra_unbounded at ffffffffa6381ff5
 #9 [ffffb21e002e7ca8] filemap_fault at ffffffffa63761b5
#10 [ffffb21e002e7d48] __do_fault at ffffffffa63d1892
#11 [ffffb21e002e7d70] do_fault at ffffffffa63d2425
#12 [ffffb21e002e7da0] __handle_mm_fault at ffffffffa63d8c6b
#13 [ffffb21e002e7e88] handle_mm_fault at ffffffffa63d95c2
#14 [ffffb21e002e7ec8] do_user_addr_fault at ffffffffa60b34ea
#15 [ffffb21e002e7f28] exc_page_fault at ffffffffa7186e4e
#16 [ffffb21e002e7f50] asm_exc_page_fault at ffffffffa72012a6
    RIP: 000055d16159f8d8  RSP: 00007ffdd4c1f340  RFLAGS: 00010206
    RAX: 00000000000bec82  RBX: 00007ff2fd00dc82  RCX: 000055d1615b492a
    RDX: 00007ffdd4c216b0  RSI: 00000000200bec82  RDI: 000055d185725960
    RBP: 00007ffdd4c1f5a0   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000202  R12: 00000000200bec82
    R13: 000055d185725960  R14: 00007ffdd4c216b0  R15: 000055d1615b9708
    ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b

The hanging process was modprobe both times.  I don't know if that is
significant.
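
To check whether two hung tasks are stuck in the same place, it can help to
compare the traces with the per-boot addresses stripped, since those differ
between VM instances. The following is only a sketch: bt_functions is a
hypothetical helper name, and it assumes frame lines in the layout shown
above ("#N [stack-addr] function at text-addr").

```shell
# Print only the function names from `crash> bt` frame lines read on stdin.
# Assumed frame layout: " #2 [ffffaf9940307908] __bio_queue_enter at ffffffff957e121c"
# Register dump lines (RIP:, RAX:, ...) do not match and are skipped.
bt_functions() {
    awk '$1 ~ /^#[0-9]+$/ && $4 == "at" { print $3 }'
}
```

Saving each backtrace to a file and diffing the output of bt_functions for the
two files would show whether the call chains match frame for frame; the two
traces above appear to do so, both ending in __bio_queue_enter.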

Note that "noop" has not actually been a valid scheduler for about 10 years!
We still observe this hang even though the write is rejected as invalid:

# echo noop > /sys/block/sda/queue/scheduler
/init: line 108: echo: write error: Invalid argument
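
For reference, the kernel lists the accepted scheduler names in the same sysfs
file, with the active one in square brackets. The sketch below checks a name
against such a listing before writing it. scheduler_is_listed is a
hypothetical helper, and the example listing in the comment is an assumption
about what a typical system reports, not output from the affected VM.

```shell
# Return 0 if scheduler name $1 appears in a scheduler listing $2, e.g. the
# contents of /sys/block/sda/queue/scheduler. An example listing (assumed,
# varies by kernel config): "none [mq-deadline] kyber bfq".
scheduler_is_listed() {
    name=$1
    listing=$2
    # Drop the brackets that mark the active scheduler, then compare word by word.
    for entry in $(printf '%s' "$listing" | tr -d '[]'); do
        if [ "$entry" = "$name" ]; then
            return 0
        fi
    done
    return 1
}
```

A caller could run scheduler_is_listed noop "$(cat /sys/block/sda/queue/scheduler)"
and skip the write when it fails. That would only sidestep the trigger
described here; it says nothing about the underlying hang.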

