Linux Btrfs filesystem development
From: Qu Wenruo <wqu@suse.com>
To: "Torbjörn Jansson" <torbjorn@jansson.tech>, linux-btrfs@vger.kernel.org
Subject: Re: Filesystem lockup during backup
Date: Mon, 20 Oct 2025 07:12:24 +1030
Message-ID: <974df153-cbdf-443a-aa3b-0a30c121928d@suse.com>
In-Reply-To: <4e2d3143-5383-491d-86c2-6b3eb7e21c3e@jansson.tech>



On 2025/10/19 20:43, Torbjörn Jansson wrote:
> Hello.
> 
> I have a btrfs filesystem on two 18 TB disks that I use as the backup
> destination for my Proxmox cluster.
> The filesystem uses btrfs raid1 mirroring and is exported over NFS
> to the other nodes.
> 
> Because it is used primarily for backups, there are periods of heavy
> writes (several backups running at the same time), and when this happens
> the filesystem and nfsd are very likely to lock up completely.
> This starts a chain reaction: the default hard mount blocks
> processes, then Ceph eventually becomes unhappy as well, and then the VMs
> go down.
> 
> Below is the hung task output from dmesg on the computer with the disks.
> 
> Any idea what's going on and what I can do about it?
> 
> 
> 
> [1560204.654347] INFO: task nfsd:5136 blocked for more than 122 seconds.
> [1560204.654351]       Tainted: P           O       6.14.11-2-pve #1

v6.14 is EOL, so you are entirely dependent on the vendor to provide any
fix or backport.

I recommend moving to either an LTS kernel or the latest upstream release.

> [1560204.654353] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [1560204.654355] task:nfsd            state:D stack:0     pid:5136  tgid:5136  ppid:2      task_flags:0x200040 flags:0x00004000

Is this the only message? Are no other hung tasks reported?

> [1560204.654361] Call Trace:
> [1560204.654363]  <TASK>
> [1560204.654366]  __schedule+0x466/0x1400
> [1560204.654376]  schedule+0x29/0x130
> [1560204.654381]  io_schedule+0x4c/0x80
> [1560204.654387]  folio_wait_bit_common+0x122/0x2e0
> [1560204.654393]  ? __pfx_wake_page_function+0x10/0x10
> [1560204.654400]  __folio_lock+0x17/0x30

The task hangs at folio locking, which normally means another process
holds the lock on the data folio, but without further hung task messages
there is nothing more to debug it with.
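
To check whether other blocked tasks are hiding in the full log, something
like the following could be used. It is only a minimal sketch: the function
name hung_tasks and both regular expressions are my own illustration, and it
assumes dmesg output with timestamps in the same layout as the report above.
It collects every "blocked for more than N seconds" report together with the
reliable frames of its call trace (frames marked "?" are skipped).

```python
import re

# Hung-task header, e.g.
# "INFO: task nfsd:5136 blocked for more than 122 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# One stack frame after the "[timestamp]" prefix, e.g.
# "__folio_lock+0x17/0x30"; a leading "? " marks an unreliable frame.
FRAME_RE = re.compile(
    r"\]\s+(?P<unreliable>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            # A new report starts; later frames belong to it.
            current = {
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "secs": int(header.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        if current is None:
            continue
        if "</TASK>" in line:
            # End of this task's call trace.
            current = None
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame.group("unreliable"):
            current["frames"].append(frame.group("func"))
    return reports
```

Saving the complete dmesg output to a file and passing its contents to
hung_tasks() gives one entry per blocked task. If a task other than nfsd
shows up, its frames are the place to look for whoever holds the folio lock.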

Thanks,
Qu

> [1560204.654404]  extent_write_cache_pages+0x36e/0x7f0 [btrfs]
> [1560204.654559]  btrfs_writepages+0x75/0x130 [btrfs]
> [1560204.654703]  do_writepages+0xde/0x280
> [1560204.654710]  ? __pfx_ip_finish_output+0x10/0x10
> [1560204.654715]  ? wbc_attach_and_unlock_inode+0xd1/0x130
> [1560204.654721]  filemap_fdatawrite_wbc+0x58/0x80
> [1560204.654726]  ? __ip_queue_xmit+0x19b/0x4e0
> [1560204.654731]  __filemap_fdatawrite_range+0x6d/0xa0
> [1560204.654744]  filemap_fdatawrite_range+0x13/0x30
> [1560204.654748]  btrfs_fdatawrite_range+0x28/0x70 [btrfs]
> [1560204.654889]  start_ordered_ops.constprop.0+0x4e/0x90 [btrfs]
> [1560204.655029]  btrfs_sync_file+0xa9/0x610 [btrfs]
> [1560204.655159]  ? list_lru_del_obj+0xad/0xe0
> [1560204.655168]  vfs_fsync_range+0x42/0xa0
> [1560204.655174]  nfsd_commit+0x9f/0x180 [nfsd]
> [1560204.655275]  nfsd4_commit+0x60/0xa0 [nfsd]
> [1560204.655367]  nfsd4_proc_compound+0x3ad/0x760 [nfsd]
> [1560204.655427]  nfsd_dispatch+0xce/0x220 [nfsd]
> [1560204.655486]  svc_process_common+0x464/0x6f0 [sunrpc]
> [1560204.655553]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
> [1560204.655611]  svc_process+0x136/0x1f0 [sunrpc]
> [1560204.655675]  svc_recv+0x7bb/0x9a0 [sunrpc]
> [1560204.655741]  ? __pfx_nfsd+0x10/0x10 [nfsd]
> [1560204.655798]  nfsd+0x90/0xf0 [nfsd]
> [1560204.655852]  kthread+0xf9/0x230
> [1560204.655855]  ? __pfx_kthread+0x10/0x10
> [1560204.655858]  ret_from_fork+0x44/0x70
> [1560204.655862]  ? __pfx_kthread+0x10/0x10
> [1560204.655864]  ret_from_fork_asm+0x1a/0x30
> [1560204.655871]  </TASK>
> 

