public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Marc MERLIN <marc@merlins.org>
To: linux-btrfs@vger.kernel.org
Subject: 6.4 and 6.9 btrfs blocked and btrfs_work_helper workqueue lockup, is it an IO bug/hang though?
Date: Sun, 18 Feb 2024 22:24:04 -0800	[thread overview]
Message-ID: <ZdL0BJjuyhtS8vn1@merlins.org> (raw)

I've seen this with both 6.4.9 and 6.6.9 and had to sysrq reboot both
times to recover.
I'm trying to see if it's a hang of my raid card.

That's the more recent hang with 6.6.9:

The one with 6.4.9 is longer, so I put it here: https://pastebin.com/xz11JXWM

Here's the 6.6.9 one here. Can someone help me confirm that at least
this one is likely not btrfs' fault but just underlying I/O hang?

If so, idoes the 6.4.9 match the same symptom, or is it a different issue?

Thanks,
Marc

135577.600958] INFO: task md12_raid1:1276 blocked for more than 120 seconds.
[135577.621963]       Not tainted 6.6.9-amd64-volpre-sysrq-20240101 #19
[135577.641401] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[135577.665522] task:md12_raid1      state:D stack:0     pid:1276  ppid:2      flags:0x00004000
[135577.691381] Call Trace:
[135577.699494]  <TASK>
[135577.706627]  __schedule+0x6af/0x702
[135577.717853]  schedule+0x8b/0xbd
[135577.727933]  md_super_wait+0x5d/0x9c
[135577.739287]  ? __pfx_autoremove_wake_function+0x40/0x40
[135577.755602]  write_sb_page+0x242/0x25d
[135577.767482]  md_update_sb+0x4c1/0x679
[135577.779072]  md_check_recovery+0x276/0x484
[135577.791965]  raid1d+0x46/0x10db
[135577.802178]  ? raw_spin_rq_unlock_irq+0x5/0x10
[135577.816122]  ? finish_task_switch.isra.0+0x129/0x202
[135577.831629]  ? __schedule+0x6b7/0x702
[135577.843292]  ? lock_timer_base+0x38/0x5f
[135577.855662]  ? schedule+0x8b/0xbd
[135577.866222]  ? __list_add+0x12/0x2f
[135577.877341]  ? _raw_spin_unlock_irqrestore+0xe/0x2e
[135577.892618]  md_thread+0x113/0x140
[135577.903553]  ? __pfx_autoremove_wake_function+0x40/0x40
[135577.920016]  ? __pfx_md_thread+0x40/0x40
[135577.932413]  kthread+0xe8/0xf0
[135577.942221]  ? __pfx_kthread+0x40/0x40
[135577.954084]  ret_from_fork+0x24/0x36
[135577.965583]  ? __pfx_kthread+0x40/0x40
[135577.977475]  ret_from_fork_asm+0x1b/0x80
[135577.989877]  </TASK>
[135577.997078] INFO: task md13_raid1:1278 blocked for more than 121 seconds.
[135578.018044]       Not tainted 6.6.9-amd64-volpre-sysrq-20240101 #19
[135578.037405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[135578.061454] task:md13_raid1      state:D stack:0     pid:1278  ppid:2      flags:0x00004000
[135578.087065] Call Trace:
[135578.094972]  <TASK>
[135578.102074]  __schedule+0x6af/0x702
[135578.113302]  schedule+0x8b/0xbd
[135578.123297]  md_super_wait+0x5d/0x9c
[135578.134757]  ? __pfx_autoremove_wake_function+0x40/0x40
[135578.151023]  write_sb_page+0x242/0x25d
[135578.162869]  md_update_sb+0x4c1/0x679
[135578.174470]  md_check_recovery+0x276/0x484
[135578.187342]  raid1d+0x46/0x10db
[135578.197352]  ? raw_spin_rq_unlock_irq+0x5/0x10
[135578.211262]  ? finish_task_switch.isra.0+0x129/0x202
[135578.226731]  ? __schedule+0x6b7/0x702
[135578.238278]  ? lock_timer_base+0x38/0x5f
[135578.250688]  ? _raw_spin_unlock_irqrestore+0xe/0x2e
[135578.265933]  ? __try_to_del_timer_sync+0x64/0x8b
[135578.280350]  ? __timer_delete_sync+0x2e/0x3d
[135578.293706]  ? __list_add+0x12/0x2f
[135578.304886]  ? _raw_spin_unlock_irqrestore+0xe/0x2e
[135578.320238]  md_thread+0x113/0x140
[135578.331038]  ? __pfx_autoremove_wake_function+0x40/0x40
[135578.347265]  ? __pfx_md_thread+0x40/0x40
[135578.359599]  kthread+0xe8/0xf0
[135578.369323]  ? __pfx_kthread+0x40/0x40
[135578.381272]  ret_from_fork+0x24/0x36
[135578.392702]  ? __pfx_kthread+0x40/0x40
[135578.404534]  ret_from_fork_asm+0x1b/0x80
[135578.416877]  </TASK>
[135578.424012] INFO: task dmcrypt_write/2:2017 blocked for more than 121 seconds.
[135578.446256]       Not tainted 6.6.9-amd64-volpre-sysrq-20240101 #19
[135578.465619] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[135578.489691] task:dmcrypt_write/2 state:D stack:0     pid:2017  ppid:2      flags:0x00004000
[135578.515301] Call Trace:
[135578.523384]  <TASK>
[135578.530433]  __schedule+0x6af/0x702
[135578.541490]  schedule+0x8b/0xbd
[135578.551650]  md_write_start+0x160/0x1a7
[135578.563748]  ? __pfx_autoremove_wake_function+0x40/0x40
[135578.579993]  raid1_make_request+0x89/0x880
[135578.592962]  ? sugov_update_single_freq+0x20/0x106
[135578.607926]  ? update_load_avg+0x372/0x39b
[135578.620814]  ? get_sd_balance_interval+0xf/0x3d
[135578.635156]  md_handle_request+0x126/0x16d
[135578.648040]  ? __pfx_dmcrypt_write+0x40/0x40 [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135578.674973]  ? _raw_spin_unlock+0xa/0x1d
[135578.687325]  ? raw_spin_rq_unlock_irq+0x5/0x10
[135578.701386]  ? finish_task_switch.isra.0+0x129/0x202
[135578.716851]  __submit_bio+0x63/0x89
[135578.728043]  submit_bio_noacct_nocheck+0x181/0x258
[135578.743026]  ? __pfx_dmcrypt_write+0x40/0x40 [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135578.769994]  dmcrypt_write+0xd1/0xfd [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135578.794848]  kthread+0xe8/0xf0
[135578.804612]  ? __pfx_kthread+0x40/0x40
[135578.816438]  ret_from_fork+0x24/0x36
[135578.827741]  ? __pfx_kthread+0x40/0x40
[135578.839744]  ret_from_fork_asm+0x1b/0x80
[135578.852280]  </TASK>
[135578.859445] INFO: task btrfs-transacti:2415 blocked for more than 122 seconds.
[135578.881710]       Not tainted 6.6.9-amd64-volpre-sysrq-20240101 #19
[135578.901270] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[135578.925345] task:btrfs-transacti state:D stack:0     pid:2415  ppid:2      flags:0x00004000
[135578.950996] Call Trace:
[135578.958960]  <TASK>
[135578.965902]  __schedule+0x6af/0x702
[135578.976995]  schedule+0x8b/0xbd
[135578.987195]  io_schedule+0x12/0x38
[135578.998015]  folio_wait_bit_common+0x157/0x202
[135579.011950]  ? __pfx_wake_page_function+0x40/0x40
[135579.026658]  folio_wait_writeback+0x30/0x38
[135579.039897]  __filemap_fdatawait_range+0x74/0xbf
[135579.054353]  ? __update_freelist_fast+0x17/0x1e
[135579.068568]  ? __clear_extent_bit+0x323/0x338
[135579.082257]  filemap_fdatawait_range+0xf/0x19
[135579.096112]  __btrfs_wait_marked_extents.isra.0+0x98/0xf3
[135579.113089]  btrfs_write_and_wait_transaction+0x5d/0xbf
[135579.129372]  btrfs_commit_transaction+0x67c/0xa62
[135579.144094]  ? start_transaction+0x3f7/0x463
[135579.157540]  transaction_kthread+0x105/0x17a
[135579.170970]  ? __pfx_transaction_kthread+0x40/0x40
[135579.185951]  kthread+0xe8/0xf0
[135579.195732]  ? __pfx_kthread+0x40/0x40
[135579.207593]  ret_from_fork+0x24/0x36
[135579.218942]  ? __pfx_kthread+0x40/0x40
[135579.230778]  ret_from_fork_asm+0x1b/0x80
[135579.243146]  </TASK>
[135579.250316] INFO: task dmcrypt_write/2:5016 blocked for more than 122 seconds.
[135579.272593]       Not tainted 6.6.9-amd64-volpre-sysrq-20240101 #19
[135579.291991] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[135579.316244] task:dmcrypt_write/2 state:D stack:0     pid:5016  ppid:2      flags:0x00004000
[135579.341930] Call Trace:
[135579.349929]  <TASK>
[135579.356826]  __schedule+0x6af/0x702
[135579.367910]  schedule+0x8b/0xbd
[135579.377939]  md_write_start+0x160/0x1a7
[135579.390216]  ? __pfx_autoremove_wake_function+0x40/0x40
[135579.406491]  raid1_make_request+0x89/0x880
[135579.419401]  ? update_cfs_rq_load_avg+0x176/0x189
[135579.434131]  ? update_load_avg+0x46/0x39b
[135579.446738]  ? get_sd_balance_interval+0xf/0x3d
[135579.461129]  md_handle_request+0x126/0x16d
[135579.474037]  ? __pfx_dmcrypt_write+0x40/0x40 [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135579.500975]  ? _raw_spin_unlock+0xa/0x1d
[135579.513370]  ? raw_spin_rq_unlock_irq+0x5/0x10
[135579.527308]  ? finish_task_switch.isra.0+0x129/0x202
[135579.542831]  __submit_bio+0x63/0x89
[135579.553918]  submit_bio_noacct_nocheck+0x181/0x258
[135579.568943]  ? __pfx_dmcrypt_write+0x40/0x40 [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135579.595885]  dmcrypt_write+0xd1/0xfd [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135579.620760]  kthread+0xe8/0xf0
[135579.630682]  ? __pfx_kthread+0x40/0x40
[135579.642699]  ret_from_fork+0x24/0x36
[135579.654188]  ? __pfx_kthread+0x40/0x40
[135579.666045]  ret_from_fork_asm+0x1b/0x80
[135579.678418]  </TASK>
[135579.685606] INFO: task dmcrypt_write/2:5286 blocked for more than 122 seconds.
[135579.707870]       Not tainted 6.6.9-amd64-volpre-sysrq-20240101 #19
[135579.727445] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[135579.751591] task:dmcrypt_write/2 state:D stack:0     pid:5286  ppid:2      flags:0x00004000
[135579.777266] Call Trace:
[135579.785223]  <TASK>
[135579.792124]  __schedule+0x6af/0x702
[135579.803200]  ? __pfx_wbt_inflight_cb+0x40/0x40
[135579.817139]  ? __pfx_wbt_cleanup_cb+0x40/0x40
[135579.830818]  schedule+0x8b/0xbd
[135579.840951]  io_schedule+0x12/0x38
[135579.851745]  rq_qos_wait+0xe8/0x126
[135579.862795]  ? __pfx_rq_qos_wake_function+0x40/0x40
[135579.878021]  ? __pfx_wbt_inflight_cb+0x40/0x40
[135579.891951]  wbt_wait+0x95/0xe4
[135579.902022]  __rq_qos_throttle+0x23/0x33
[135579.914398]  blk_mq_submit_bio+0x2b6/0x4dd
[135579.927273]  __submit_bio+0x29/0x89
[135579.938356]  submit_bio_noacct_nocheck+0x121/0x258
[135579.953357]  ? __pfx_dmcrypt_write+0x40/0x40 [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135579.980432]  dmcrypt_write+0xd1/0xfd [dm_crypt 721219ef82f7f7c3ecde59f70e81b621d3b8f858]
[135580.005323]  kthread+0xe8/0xf0
[135580.015088]  ? __pfx_kthread+0x40/0x40
[135580.026929]  ret_from_fork+0x24/0x36
[135580.038259]  ? __pfx_kthread+0x40/0x40
[135580.050266]  ret_from_fork_asm+0x1b/0x80
[135580.062628]  </TASK>
[135580.069774] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings


-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

             reply	other threads:[~2024-02-19  6:23 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-19  6:24 Marc MERLIN [this message]
2024-02-27 23:14 ` 6.4 and 6.9 btrfs blocked and btrfs_work_helper workqueue lockup, is it an IO bug/hang though? Marc MERLIN
2024-02-28  8:18   ` Qu Wenruo
2024-02-28  8:24     ` Qu Wenruo
2024-02-29  1:41       ` Marc MERLIN
2024-02-29  2:01         ` Qu Wenruo
2024-02-29  3:15           ` Marc MERLIN

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZdL0BJjuyhtS8vn1@merlins.org \
    --to=marc@merlins.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox