From: bugzilla-daemon@kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 216322] Freezing of tasks failed after 60.004 seconds (1 tasks refusing to freeze... task:fstrim ext4_trim_fs - Dell XPS 13 9310
Date: Thu, 04 Aug 2022 11:47:47 +0000
Message-ID: <bug-216322-13602-2MvUDlAfJU@https.bugzilla.kernel.org/>
In-Reply-To: <bug-216322-13602@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=216322
--- Comment #4 from Lukas Czerner (lczerner@redhat.com) ---
On Thu, Aug 04, 2022 at 12:44:45AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216322
>
> Theodore Tso (tytso@mit.edu) changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |tytso@mit.edu
>
> --- Comment #2 from Theodore Tso (tytso@mit.edu) ---
> So the problem is that the FITRIM ioctl does not check if a signal is
> pending, and so if the fstrim program requests that the entire SSD be
> trimmed (len=ULLONG_MAX), then like the broomstick set off by Mickey
> Mouse in Fantasia's "Sorcerer's Apprentice", it will mindlessly send
> discard requests for any blocks not in use by the file system until it
> is done. Or to put it another way, "Neither rain, nor snow, nor a
> request to freeze the OS, shall stop the FITRIM ioctl from its
> appointed task." :-)
>
> The question is how to fix things. The problem is that the FITRIM ioctl
> interface is pretty horrible. The fstrim_range.len variable is an IN/OUT
> field where on input it is the number of bytes that should be trimmed
> (from start to start+len) and when the ioctl returns, fstrim_range.len
> is the number of bytes that were actually trimmed. So this is not really
> amenable to -ERESTARTSYS.
>
> Worse, the fstrim program in util-linux doesn't handle an EAGAIN error
> return code, so if it gets EAGAIN after try_to_freeze_tasks sends the
> fake signal to the process, fstrim will print "fstrim: FITRIM ioctl
> failed" to stderr and the rest of the file system trim operation will
> be aborted.
>
> It might be that the only way we can fix this is to have FITRIM return
> EAGAIN, which will stop fstrim in its tracks. This is... not great,
> but typically fstrim is run out of crontab or a systemd timer once a
> month, so if the user tries to suspend right as fstrim is running,
> hopefully we'll get lucky next month. We can then try to teach fstrim
> to do the right thing, and so this lossage mode would only happen with
> the combination of a new kernel and an older version of util-linux.
>
> I'm not happy with that solution, but the alternative of creating a new
> FITRIM2 ioctl that has a sane interface means that you need a new kernel
> and a new util-linux package, and if you don't have both, the user will
> have to deal with a hot laptop bag and a drained battery. And not
> changing FITRIM's behaviour will have the same potential end result, if
> the user gets unlucky and tries to suspend the laptop when there are
> more than 60 seconds left before FITRIM completes. :-/
>
> The other thing I'll note is that every file system has its own FITRIM
> implementation, and I suspect they all have this issue, because the FITRIM
> interface is fundamentally flawed.

I agree that the FITRIM interface is flawed in this way. But
ext4_try_to_trim_range() actually does check fatal_signal_pending() and
will return -ERESTARTSYS if that's true. Or did you have something else in
mind?

Also in that case, I see no reason why we would not be able to adjust
the fstrim_range to make it easier to restart where we left off if
we're going to return -ERESTARTSYS. Am I missing something?
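
To make the restart idea concrete, here is a rough userspace sketch in C. It is not code from the kernel or from util-linux. It assumes a hypothetical contract that FITRIM does not have today: when a trim call is interrupted it fails with EAGAIN (or EINTR) and leaves start/len describing the part of the range that has not been trimmed yet, while a successful call keeps the existing meaning of len (bytes trimmed by that call). The names trim_range, fake_trim() and trim_all() are made up for illustration, and fake_trim() only simulates a file system so the loop can run without a real device.

```c
#include <errno.h>
#include <stdint.h>

/* Same three fields as struct fstrim_range in <linux/fs.h>. */
struct trim_range {
    uint64_t start;   /* IN: first byte to trim */
    uint64_t len;     /* IN: bytes to consider; OUT: see the contract above */
    uint64_t minlen;  /* IN: minimum extent size to discard */
};

#define FAKE_FS_SIZE 1000ULL  /* pretend file system size in bytes */
#define FAKE_CHUNK    300ULL  /* pretend an interruption every 300 bytes */

static int fake_calls;        /* how many times the fake ioctl was entered */

/*
 * Hypothetical stand-in for ioctl(fd, FITRIM, r) under the ASSUMED
 * restartable contract. It trims at most FAKE_CHUNK bytes per call.
 */
static int fake_trim(struct trim_range *r)
{
    uint64_t end, todo;

    fake_calls++;
    if (r->start >= FAKE_FS_SIZE) {
        r->len = 0;
        return 0;
    }
    /* Clamp the range to the file system; avoids start + len overflow. */
    if (r->len > FAKE_FS_SIZE - r->start)
        end = FAKE_FS_SIZE;
    else
        end = r->start + r->len;
    todo = end - r->start;

    if (todo > FAKE_CHUNK) {
        /* "Interrupted": advance start, report what is still left. */
        r->start += FAKE_CHUNK;
        r->len = end - r->start;
        errno = EAGAIN;
        return -1;
    }
    /* Finished: len reports the bytes trimmed by this call. */
    r->start = end;
    r->len = todo;
    return 0;
}

/*
 * Retry loop a trim tool could use under the assumed contract.
 * Returns 0 and stores the total bytes trimmed, or -1 on a real error.
 */
static int trim_all(struct trim_range *r,
                    int (*trim)(struct trim_range *),
                    uint64_t *trimmed)
{
    *trimmed = 0;
    for (;;) {
        uint64_t before = r->start;

        if (trim(r) == 0) {
            *trimmed += r->len;
            return 0;
        }
        if (errno != EAGAIN && errno != EINTR)
            return -1;                   /* genuine failure, give up */
        *trimmed += r->start - before;   /* progress before interruption */
        /* A real tool would return to the kernel here so it can freeze. */
    }
}
```

Calling trim_all() with start = 0 and len = UINT64_MAX against fake_trim() walks the whole simulated file system in several interrupted steps and reports the total. With the FITRIM interface as it exists today the caller cannot do this, because on return len only says how many bytes were trimmed, not where to resume.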

I have not had time to look deeply into the traces, but are you actually
sure that we're not stuck in blkdev_issue_discard() instead?

-Lukas