All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH v2 4/4] bcachefs: initial freeze/unfreeze support
Date: Thu, 21 Sep 2023 10:55:54 -0400	[thread overview]
Message-ID: <ZQxZehDOmHjg064p@bfoster> (raw)
In-Reply-To: <ZQrzjG5T2jSthuB9@bfoster>

On Wed, Sep 20, 2023 at 09:28:44AM -0400, Brian Foster wrote:
> On Tue, Sep 19, 2023 at 09:21:16PM -0400, Kent Overstreet wrote:
> > Pulled this into the testing branch and then got
> > 
> > https://evilpiepirate.org/~testdashboard/c/de4ea1e2a9ceec5d55fffbc1acab89f0dc8f90b6/xfstests.generic.459/log.br
> > 
> > So I'll likely kick this patch back out for now, let me know when you
> > have a fixed version :)
> > 
> 
> Ah, sorry.. I should have mentioned this in the cover letter. I'm aware
> of this failure but my initial triage has it as an unrelated problem.
> That test basically induces I/O errors by explicitly overprovisioning a
> dm-thin volume for the fs. The original bug was a livelock issue on XFS
> related to metadata writeback failure/retry in this particular scenario.
> 
> The test relies on freeze in that it basically consumes all of the
> initially provisioned space, issues a freeze in the background (which
> will start off hanging because not everything can write back until more
> storage is available), and then grows available space so freeze can
> proceed to completion. It uses the success/failure of the freeze to
> determine pass/failure, and if the freeze fails it looks like it expects
> the filesystem to have been remounted ro (which I believe is ext4's way
> of dealing with this).
> 
> My notes say that freeze failed because the fs shutdown, which I think
> is due to the whole overprovision/flush thing leading to I/O errors
> (i.e. expected behavior, probably similar to ext4). But TBH I hadn't dug
> into it further than initial triage to rule out the core freeze
> mechanism itself. I'll dig more into that soon to see whether we need to
> change the test or something in the kernel, though I don't think it
> necessarily needs to gate freeze support..
> 

Yeah, so I can confirm that the only real difference in behavior between
ext4 and bcachefs wrt to this test is that the former sets SB_RDONLY in
its internal error remount read-only sequence, which in turn results in
"ro" text in the /proc/mounts output and thus satisfies the test.

bcachefs only does that in the ioctl shutdown paths for whatever reason.
Unfortunately it's not quite as simple as doing the same in the fatal
error path. If I do that, it looks like the async error handling worker
can race with a freeze, which does two different things wrt to internal
locks when sb_rdonly() is true vs. not.

This can possibly all be serialized via the right combination of
s_umount and freeze protection in the error handler to ensure we never
see the wrong combination of being freeze locked && sb_rdonly(), but
that requires a little more thought. Given the test could also easily
change to checking for -EROFS or some such on bcachefs vs. "ro" in the
mount status, this does strike me as more of a minor shutdown
inconsistency than a significant freeze or shutdown bug. What might be
more useful is at some point to audit the various read-only paths for
consistent behaviors regardless of how invoked (i.e. "ro" for remount,
shutdown) and proper serialization.

Brian


  reply	other threads:[~2023-09-21 17:51 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-15 12:51 [PATCH v2 0/4] bcachefs: journal bug fixes / freeze support Brian Foster
2023-09-15 12:51 ` [PATCH v2 1/4] bcachefs: refactor pin put helpers Brian Foster
2023-09-15 12:51 ` [PATCH v2 2/4] bcachefs: prepare journal buf put to handle pin put Brian Foster
2023-09-15 12:51 ` [PATCH v2 3/4] bcachefs: fix race between journal entry close and pin set Brian Foster
2023-09-15 12:51 ` [PATCH v2 4/4] bcachefs: initial freeze/unfreeze support Brian Foster
2023-09-19 21:32   ` Kent Overstreet
2023-09-20  1:21   ` Kent Overstreet
2023-09-20 13:28     ` Brian Foster
2023-09-21 14:55       ` Brian Foster [this message]
2023-09-21 15:46       ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZQxZehDOmHjg064p@bfoster \
    --to=bfoster@redhat.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.