Linux Btrfs filesystem development
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: "Massimo B." <massimo.b@gmx.net>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: I/O blocked after booting
Date: Thu, 28 Mar 2024 20:40:33 +1030
Message-ID: <22650868-6777-41ae-a068-37821929be7c@gmx.com>
In-Reply-To: <238dc2b36f27838baf02425b364705c58fcc5de5.camel@gmx.net>



On 2024/3/21 23:43, Massimo B. wrote:
> Hello everybody,
>
> I have had this issue for years on all my desktop machines (though they run
> almost identical distributions and configurations):
>
> Sometimes when booting, the system comes up as far as the window manager's
> login screen, but no login is possible. When I try to log in via virtual
> terminals or SSH, or try to reboot, it appears that all I/O to the btrfs
> is blocked. Waiting for ~20 minutes doesn't help either; the filesystem
> stays blocked.
>
> I thought that might happen after unclean shutdowns or the like. But it's not
> reproducible, and clean shutdowns sometimes lead to the same issue too.
>
> At first I thought it was one of the btrfsmaintenance jobs, so I finally
> disabled all of them:
>
> # grep PERIOD /etc/default/btrfsmaintenance
> BTRFS_DEFRAG_PERIOD="none"
> BTRFS_BALANCE_PERIOD="none"
> BTRFS_SCRUB_PERIOD="monthly"
> BTRFS_TRIM_PERIOD="none"
>
> No success.
>
> What I can confirm: after forcing a reboot with the holy SysRq sequence
> R,E,I,S,U,B, the next startup is always fine and gets a working btrfs.
> The first lines on the screen before the reboot are:
>
> sysrq: Keyboard mode set to system default
> sysrq: Terminate All Tasks
> elogind-daemon[4481]: Received signal 15 [TERM]
>
> BTRFS info (device dm-2): first mount of filesystem 1d677-.....
> BTRFS info (device dm-2): using crc32c (crc32c-intel) checksum algorithm
> BTRFS info (device dm-2): force zstd compression, level 15
> BTRFS info (device dm-2): using free space tree
> BTRFS warning (device dm-0): failed to trim 30 block group(s), last error -512
> BTRFS warning (device dm-0): failed to trim 1 device(s), last error -512
>
> I guess this dm-0 is my main btrfs on PCIe NVMe.
>
> When mounted successfully, the mount looks like this:
>
> /dev/mapper/luks-801... on / type btrfs (rw,noatime,nodiratime,compress-force=zstd:3,ssd,discard=async,noacl,space_cache=v2,subvolid=524,subvol=/volumes/root)

Disable discard (nodiscard mount option), as it looks like there is
something wrong with the automatic discard, then retry.

This is most likely related to your NVMe device's discard implementation,
and I believe running fstrim manually may be a better and more reliable
solution.
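
A minimal sketch of what that change might look like, assuming the root
filesystem is the btrfs in question and that the option string is the one
from the mount output quoted above. The helper name to_nodiscard is
hypothetical and only rewrites an option string (for example, one taken from
/etc/fstab); the commands that actually change the running system need root
and are left commented out.

```sh
#!/bin/sh
# Hypothetical helper: rewrite a comma-separated mount option string so that
# any "discard", "discard=async" or "discard=sync" entry becomes "nodiscard".
to_nodiscard() {
    printf '%s\n' "$1" | sed -E 's/(^|,)discard(=[a-z]+)?(,|$)/\1nodiscard\3/'
}

# Option string taken from the mount output quoted above.
opts='rw,noatime,nodiratime,compress-force=zstd:3,ssd,discard=async,noacl,space_cache=v2,subvolid=524,subvol=/volumes/root'
new_opts=$(to_nodiscard "$opts")
echo "$new_opts"

# To try it on the running system (as root), without editing fstab yet:
#   mount -o remount,nodiscard /
# Then trim on demand, or from a periodic job, instead of relying on
# automatic discard:
#   fstrim -v /
```

If the hangs stop with nodiscard in place, the same substitution can be made
in the options field of the corresponding /etc/fstab entry so that it
persists across reboots.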

Thanks,
Qu

>
> The current kernel is 6.6.13-gentoo, though I don't think that matters, as I
> have had the issue for years with all previous kernels.
> I'm using not only the self-configured kernel from gentoo-sources but also a
> universal binary 6.6.16-gentoo-dist.
>
> I thought maybe my btrbk run by cron could be the culprit. In the syslogs,
> the very last lines before the blocked I/O show btrbk being started. Right
> after that, the next line is from the next boot:
>
> Mar 19 07:43:40 [chronyd] System clock wrong by -3.227396 seconds
> Mar 19 07:43:40 [chronyd] System clock was stepped by -3.227396 seconds
> Mar 19 07:44:00 [fcron] pam_unix(fcron:session): session opened for user clamav(uid=130) by (uid=0)
> Mar 19 07:44:00 [fcron] Job 'fangfrisch -c /etc/fangfrisch.conf refresh' started for user clamav (pid 4977)
> Mar 19 07:44:00 [fcron] Job 'ionice -c 3 schedtool -D -e btrbk -c /etc/btrbk/btrbk.conf run cron && /usr/local/bin/1update_btrbksnapshotlinks -c /etc/btrbk/btrbk.conf /mnt/archive/*/* / (truncated)
> Mar 19 07:44:00 [fcron] Job 'run-parts /etc/cron.daily' started for user systab (pid 4984)
> Mar 19 07:44:00 [fcron] Job 'run-parts /etc/cron.weekly' started for user systab (pid 4987)
> Mar 19 07:47:32 [kernel] Linux version 6.6.13-gentoo (root@gentoo) (gcc (Gentoo 13.2.1_p20230826 p7) 13.2.1 20230826, GNU ld (Gentoo 2.40 p7) 2.40.0) #1 SMP PREEMPT_DYNAMIC Mon Jan 22 11:11:15 CET 2024
>
> Actually btrbk works fine when the system is up and running, whether started
> manually or from the cron job. What could be blocking all btrfs I/O? How can
> I debug that?
>
> Best regards,
> Massimo
>

Thread overview: 9+ messages
2024-03-21 13:13 I/O blocked after booting Massimo B.
2024-03-28  8:36 ` HAN Yuwei
2024-03-28 10:10 ` Qu Wenruo [this message]
2024-03-28 10:39   ` Massimo B.
2024-03-28 14:55     ` Massimo B.
2024-03-28 17:24       ` Roman Mamedov
2024-03-28 20:23       ` Qu Wenruo
2024-06-14  8:32         ` Massimo B.
2024-06-14 21:57           ` Qu Wenruo
