From: Boris Burkov <boris@bur.io>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v5 3/5] btrfs: return ordered_extent splits from bio extraction
Date: Thu, 23 Mar 2023 10:00:06 -0700 [thread overview]
Message-ID: <20230323170006.GA28317@zen> (raw)
In-Reply-To: <20230323161529.GA8070@zen>
On Thu, Mar 23, 2023 at 09:15:29AM -0700, Boris Burkov wrote:
> On Thu, Mar 23, 2023 at 01:47:28AM -0700, Christoph Hellwig wrote:
> > This is a bit of a mess. And the root cause of that is that
> > btrfs_extract_ordered_extent the way it is used right now does
> > the wrong thing in terms of splitting the ordered_extent. What
> > we want is to allocate a new one for the beginning of the range,
> > and leave the rest alone.
> >
> > I did run into this a while ago during my (nt yet submitted) work
> > to keep an ordered_extent pointer in the btrfs_bio, and I have some
> > patches to sort it out.
> >
> > I've rebased your fix on top of those, can you check if this tree
> > makes sense to you;
> >
> > http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-dio-fix-hch
> >
> > it passes basic testing so far.
>
> Nice, this is great!
>
> I actually also made the same changes as in your branch while working on
> my fix, but didn't know enough about the zoned use case to realize that
> the simpler "extract from beginning" constraint also applied to the
> zoned case. So what happened in my branch was I implemented the three
> way split as two "split at starts" which ultimately felt too messy and I
> opted for returning the new split objects from the the existing model.
>
> If it's true that we can always do a "split from front" then I'm all
> aboard and think this is the way forward. Given that I found what I
> think is a serious bug in the case where pre>0, I suspect you are right,
> and we aren't hitting that case.
>
> I will check that this passes my testing for the various dio cases (I
> have one modified xfstests case I haven't sent yet for the meanest
> version of the deadlock I have come up with so far) and the other tests
> that I saw races/flakiness on, but from a quick look, your branch looks
> correct to me. I believe the most non-obvious property my fix relies on
> is dio_data->ordered having the leftovers from the partial after
> submission so that it can be cancelled, which your branch looks to
> maintain.
Your branch as-is does not pass the existing tests, It's missing a fix
from my V5. We need to avoid splitting partial OEs when doing NOCOW dio
writes, because iomap_begin() does not create a fresh pinned em in that
case, since it reuses the existing extent.
e.g.,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8cb61f4daec0..bbc89a0872e7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7719,7 +7719,7 @@ static void btrfs_dio_submit_io(const struct iomap_iter *iter, struct bio *bio,
* cancelled in iomap_end to avoid a deadlock wherein faulting the
* remaining pages is blocked on the outstanding ordered extent.
*/
- if (iter->flags & IOMAP_WRITE) {
+ if (iter->flags & IOMAP_WRITE && !test_bit(BTRFS_ORDERED_NOCOW, &dio_data->ordered->flags)) {
int err;
err = btrfs_extract_ordered_extent(bbio, dio_data->ordered);
With that patch, I pass 10x of btrfs/250, so running the full suite next.
>
> Assuming the tests pass, I do want to get this in sooner than later,
> since downstream is still waiting on a fix. Would you be willing to send
> your stack soon for my fix to land atop? I don't mind if you just send a
> patch series with my patches mixed in, either. If, OTOH, your patches
> are still a while out, or depend on something else that's underway,
> maybe we could land mine, then gut them for your improvements. I'm fine
> with it either way.
>
> Thanks,
> Boris
next prev parent reply other threads:[~2023-03-23 17:00 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-22 19:11 [PATCH v5 0/5] btrfs: fix corruption caused by partial dio writes Boris Burkov
2023-03-22 19:11 ` [PATCH v5 1/5] btrfs: add function to create and return an ordered extent Boris Burkov
2023-03-22 19:11 ` [PATCH v5 2/5] btrfs: stash ordered extent in dio_data during iomap dio Boris Burkov
2023-03-22 19:11 ` [PATCH v5 3/5] btrfs: return ordered_extent splits from bio extraction Boris Burkov
2023-03-23 8:47 ` Christoph Hellwig
2023-03-23 16:15 ` Boris Burkov
2023-03-23 17:00 ` Boris Burkov [this message]
2023-03-23 17:45 ` Boris Burkov
2023-03-23 21:29 ` Christoph Hellwig
2023-03-23 22:43 ` Boris Burkov
2023-03-24 0:24 ` Christoph Hellwig
2023-03-22 19:11 ` [PATCH v5 4/5] btrfs: fix crash with non-zero pre in btrfs_split_ordered_extent Boris Burkov
2023-03-23 8:36 ` Naohiro Aota
2023-03-23 16:22 ` Boris Burkov
2023-03-22 19:11 ` [PATCH v5 5/5] btrfs: split partial dio bios before submit Boris Burkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230323170006.GA28317@zen \
--to=boris@bur.io \
--cc=hch@infradead.org \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox