From: Amir Goldstein <amir73il@gmail.com>
To: Brian Foster <bfoster@redhat.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
"Luis R. Rodriguez" <mcgrof@kernel.org>
Subject: Re: [PATCH] xfs: don't reuse busy extents on extent trim
Date: Thu, 26 May 2022 18:28:23 +0300 [thread overview]
Message-ID: <CAOQ4uxjoLm_xwD1GBMNteHsd4zv_4vr+g7xF9_HoCquhq4yoFQ@mail.gmail.com> (raw)
In-Reply-To: <Yo+M6Jhjwt/ruOfi@bfoster>
> > Hi Brian,
> >
> > This patch was one of my selected fixes to backport for v5.10.y.
> > It has a very scary-looking commit message and the change seems
> > to be independent of any infrastructure changes(?).
> >
> > The problem is that applying this patch to v5.10.y reliably reproduces
> > this buffer corruption assertion [*] with test xfs/076.
> >
> > This happens on the kdevops system, which uses loop devices over
> > sparse files inside qemu images. It does not reproduce on my small
> > VM at home.
> >
> > Normally, I would just drop this patch from the stable candidates queue
> > and move on, but I thought you might be interested to investigate this
> > reliable reproducer, because maybe this system exercises an error
> > that is otherwise rare to hit.
> >
> > It seemed weird to me that NOT reusing the extent would result in
> > data corruption, but it could indicate that reusing the extent was masking
> > the assertion and hiding another bug(?).
> >
>
> Indeed, this does seem like an odd failure. The shutdown on transaction
> cancel implies cancellation of a dirty transaction. This is not
> necessarily corruption so much as the generic naming/messaging
> associated with shutdowns caused by unexpected in-core state.
> The patch in question removes some modifications to in-core busy extent
> state during extent allocation that are fundamentally unsafe in
> combination with how allocation works. This change doesn't appear to
> affect any transaction directly, so the correlation may be indirect.
>
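I think the check you're referring to is the one in xfs_trans_cancel().
A trimmed sketch from memory of fs/xfs/xfs_trans.c around v5.10 (not
verbatim, and details vary by kernel version):

void
xfs_trans_cancel(
	struct xfs_trans	*tp)
{
	struct xfs_mount	*mp = tp->t_mountp;
	bool			dirty = (tp->t_flags & XFS_TRANS_DIRTY);

	/*
	 * A transaction that has already dirtied the log cannot be
	 * backed out, so the only safe response to cancellation is
	 * to shut down and report in-core corruption.
	 */
	if (dirty && !XFS_FORCED_SHUTDOWN(mp)) {
		XFS_ERROR_REPORT("xfs_trans_cancel", XFS_ERRLEVEL_LOW, mp);
		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
	}
	/* ... release log items and the log reservation ... */
}

So the generic "corruption" wording does not by itself imply on-disk
damage.
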
> xfs/076 looks like it's a sparse inode allocation test, which certainly
> seems relevant in that it is stressing the ability to allocate inode
> chunks under free space fragmentation. If this patch further restricts
> extent allocation by removing availability of some set of (recently
> freed, busy) extents, then perhaps there is some allocation failure
> sequence, previously unlikely enough to mask poor error handling or
> transaction handling (an agfl fixup dirtying a transaction followed
> by an allocation failure, for example), that we're now running into.
>
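To make sure I follow the sequence you have in mind, is it roughly the
shape below? The XFS function names are real, but the caller is a
hypothetical sketch of mine, not a literal call chain:

/* hypothetical caller, e.g. inode chunk allocation under xfs/076 */
int
alloc_chunk_sketch(struct xfs_alloc_arg *args)
{
	int	error;

	/*
	 * Inside xfs_alloc_vextent(), xfs_alloc_fix_freelist() may
	 * allocate or free AGFL blocks and log the AGF/AGFL, dirtying
	 * args->tp before we know whether the allocation itself can
	 * succeed.
	 */
	error = xfs_alloc_vextent(args);

	/*
	 * With the patch, recently freed (still busy) extents are no
	 * longer trimmed for reuse, so under the free space
	 * fragmentation that xfs/076 creates the search can now fail
	 * where it previously succeeded.
	 */
	if (error || args->fsbno == NULLFSBLOCK)
		return error ? error : -ENOSPC;

	return 0;
}

If the caller then calls xfs_trans_cancel() on the already dirty
transaction in its failure path, that would produce exactly the forced
shutdown above.
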
> > Can you think of another reason to explain the regression this fix
> > introduces to 5.10.y?
> >
>
> Not off the top of my head. Something along the lines of the above seems
> plausible, but that's just speculation at this point.
>
> > Do you care to investigate this failure or shall I just move on?
> >
>
> I think it would be good to understand whether there's a regression
> introduced by this patch, a bug somewhere else, or just some impedance
> mismatch in logic between the combination of this change and whatever
> else happens to be in v5.10.y. Unfortunately I'm not able to reproduce
> it if I pull just this commit back into latest 5.10.y (5.10.118). I've
> tried with a traditional bdev as well as a preallocated and sparse
> loopback scratch dev.
I also failed to reproduce it on another VM, but it reproduces reliably
on this system, which is why I thought we'd better use this opportunity.
This system has plenty of RAM and disk to spare, so I have no problem
running this test in a VM in parallel to my work.
It is not actually my system; it's a system that Luis has set up for
stable XFS testing and gave me access to. If the need arises you could
get direct access to it, but for now I have no problem running the
test for you.
> Have you tested this patch (backport) in isolation
> in your reproducer env or only in combination with other pending
> backports?
>
I tested it on top of 5.10.109 + these 5 patches:
https://github.com/amir73il/linux/commits/xfs-5.10.y-1
I can test it in isolation if you like. Let me know if there are
other forensics that you would like me to collect.
Thanks,
Amir.