From: David Sterba <dsterba@suse.cz>
To: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: fix possible infinite loop in data async reclaim
Date: Wed, 26 Aug 2020 11:13:22 +0200
Message-ID: <20200826091322.GA28318@twin.jikos.cz>
In-Reply-To: <24f846bc8860cab91ca134d0a337cc290589a092.1598389008.git.josef@toxicpanda.com>
On Tue, Aug 25, 2020 at 04:56:59PM -0400, Josef Bacik wrote:
> Dave reported an issue where generic/102 would sometimes hang. This
> turned out to be because we'd get into this spot where we were no longer
> making progress on data reservations because our exit condition was not
> met. The logic is basically
>
> 	while (!space_info->full && !list_empty(&space_info->tickets))
> 		flush_space(space_info, flush_state);
>
> where flush_state walks through our various flush states, which don't include
> ALLOC_CHUNK_FORCE. This is because we actually lead with allocating
> chunks, and so the assumption was that once you got to the actual
> flushing states you could no longer allocate chunks. This was a stupid
> assumption, because you could have deleted block groups that would be
> reclaimed by a transaction commit, thus unsetting space_info->full.
> This is essentially what happens with generic/102, and so sometimes
> you'd get stuck in the flushing loop because we weren't allocating
> chunks, but flushing space wasn't giving us what we needed to make
> progress.
>
> Fix this by adding ALLOC_CHUNK_FORCE to the end of our flushing states,
> that way we will eventually bail out because we did end up with
> space_info->full if we freed a chunk previously. Otherwise, as is the
> case for this test, we'll allocate our chunk and continue on our happy
> merry way.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Thanks. As the flushing states are added one by one at the end of the
series, I'll add this as a separate patch. Folding it into another patch
would lose some of the information in the changelog. This leaves a short
window in the series where the generic/102 hang could happen, but again,
the flushing sequence is not switched all at once.
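
For readers following along, the reasoning in the changelog can be sketched as a toy userspace model. This is not the btrfs code: every `toy_` name, the `unallocated` counter, the reduced set of states, and the `max_steps` guard are assumptions made up for illustration. It only shows why a state list without a forced chunk allocation can spin once `full` has been cleared, and why appending one gives the loop an exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the kernel's definitions. */
enum toy_flush_state {
	TOY_FLUSH_DELALLOC,
	TOY_COMMIT_TRANS,
	TOY_ALLOC_CHUNK_FORCE,
};

struct toy_space_info {
	bool full;        /* no further chunk can be allocated */
	int tickets;      /* pending reservation tickets */
	int unallocated;  /* chunks' worth of space freed by deleted block groups */
};

static void toy_flush_space(struct toy_space_info *si,
			    enum toy_flush_state state)
{
	switch (state) {
	case TOY_FLUSH_DELALLOC:
		/* In this scenario flushing reclaims nothing useful. */
		break;
	case TOY_COMMIT_TRANS:
		/* A commit reclaims deleted block groups and clears ->full. */
		if (si->unallocated > 0)
			si->full = false;
		break;
	case TOY_ALLOC_CHUNK_FORCE:
		if (si->unallocated > 0) {
			si->unallocated--;
			si->tickets = 0;  /* the new chunk satisfies the waiters */
		} else {
			si->full = true;  /* nothing left: the loop can exit */
		}
		break;
	}
}

/*
 * Cycle through the given states while the loop condition from the
 * changelog holds. Returns the number of flush calls made, or -1 if
 * max_steps calls were made without the condition ever becoming false
 * (the model's stand-in for a hang).
 */
static int toy_reclaim(struct toy_space_info *si,
		       const enum toy_flush_state *states, size_t nr_states,
		       int max_steps)
{
	int steps = 0;
	size_t i = 0;

	assert(nr_states > 0);
	while (!si->full && si->tickets > 0) {
		if (steps == max_steps)
			return -1;
		toy_flush_space(si, states[i]);
		i = (i + 1) % nr_states;
		steps++;
	}
	return steps;
}
```

With a state list of only `TOY_FLUSH_DELALLOC` and `TOY_COMMIT_TRANS`, a space_info that starts with `full == false` and one ticket never leaves the loop in this model. Appending `TOY_ALLOC_CHUNK_FORCE` ends it on the third call, either by allocating a chunk or by setting `full`.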