public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v5 3/5] btrfs: return ordered_extent splits from bio extraction
Date: Thu, 23 Mar 2023 10:00:06 -0700	[thread overview]
Message-ID: <20230323170006.GA28317@zen> (raw)
In-Reply-To: <20230323161529.GA8070@zen>

On Thu, Mar 23, 2023 at 09:15:29AM -0700, Boris Burkov wrote:
> On Thu, Mar 23, 2023 at 01:47:28AM -0700, Christoph Hellwig wrote:
> > This is a bit of a mess.  And the root cause of that is that
> > btrfs_extract_ordered_extent the way it is used right now does
> > the wrong thing in terms of splitting the ordered_extent.  What
> > we want is to allocate a new one for the beginning of the range,
> > and leave the rest alone.
> > 
> > I did run into this a while ago during my (nt yet submitted) work
> > to keep an ordered_extent pointer in the btrfs_bio, and I have some
> > patches to sort it out.
> > 
> > I've rebased your fix on top of those, can you check if this tree
> > makes sense to you;
> > 
> >    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/btrfs-dio-fix-hch
> > 
> > it passes basic testing so far.
> 
> Nice, this is great!
> 
> I actually also made the same changes as in your branch while working on
> my fix, but didn't know enough about the zoned use case to realize that
> the simpler "extract from beginning" constraint also applied to the
> zoned case. So what happened in my branch was I implemented the three
> way split as two "split at starts" which ultimately felt too messy and I
> opted for returning the new split objects from the the existing model.
> 
> If it's true that we can always do a "split from front" then I'm all
> aboard and think this is the way forward. Given that I found what I
> think is a serious bug in the case where pre>0, I suspect you are right,
> and we aren't hitting that case.
> 
> I will check that this passes my testing for the various dio cases (I
> have one modified xfstests case I haven't sent yet for the meanest
> version of the deadlock I have come up with so far) and the other tests
> that I saw races/flakiness on, but from a quick look, your branch looks
> correct to me. I believe the most non-obvious property my fix relies on
> is dio_data->ordered having the leftovers from the partial after
> submission so that it can be cancelled, which your branch looks to
> maintain.

Your branch as-is does not pass the existing tests, It's missing a fix
from my V5. We need to avoid splitting partial OEs when doing NOCOW dio
writes, because iomap_begin() does not create a fresh pinned em in that
case, since it reuses the existing extent.

e.g.,

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8cb61f4daec0..bbc89a0872e7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7719,7 +7719,7 @@ static void btrfs_dio_submit_io(const struct iomap_iter *iter, struct bio *bio,
         * cancelled in iomap_end to avoid a deadlock wherein faulting the
         * remaining pages is blocked on the outstanding ordered extent.
         */
-       if (iter->flags & IOMAP_WRITE) {
+       if (iter->flags & IOMAP_WRITE && !test_bit(BTRFS_ORDERED_NOCOW, &dio_data->ordered->flags)) {
                int err;

                err = btrfs_extract_ordered_extent(bbio, dio_data->ordered);

With that patch, I pass 10x of btrfs/250, so running the full suite next.

> 
> Assuming the tests pass, I do want to get this in sooner than later,
> since downstream is still waiting on a fix. Would you be willing to send
> your stack soon for my fix to land atop? I don't mind if you just send a
> patch series with my patches mixed in, either. If, OTOH, your patches
> are still a while out, or depend on something else that's underway,
> maybe we could land mine, then gut them for your improvements. I'm fine
> with it either way.
> 
> Thanks,
> Boris

  reply	other threads:[~2023-03-23 17:00 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-22 19:11 [PATCH v5 0/5] btrfs: fix corruption caused by partial dio writes Boris Burkov
2023-03-22 19:11 ` [PATCH v5 1/5] btrfs: add function to create and return an ordered extent Boris Burkov
2023-03-22 19:11 ` [PATCH v5 2/5] btrfs: stash ordered extent in dio_data during iomap dio Boris Burkov
2023-03-22 19:11 ` [PATCH v5 3/5] btrfs: return ordered_extent splits from bio extraction Boris Burkov
2023-03-23  8:47   ` Christoph Hellwig
2023-03-23 16:15     ` Boris Burkov
2023-03-23 17:00       ` Boris Burkov [this message]
2023-03-23 17:45         ` Boris Burkov
2023-03-23 21:29         ` Christoph Hellwig
2023-03-23 22:43           ` Boris Burkov
2023-03-24  0:24             ` Christoph Hellwig
2023-03-22 19:11 ` [PATCH v5 4/5] btrfs: fix crash with non-zero pre in btrfs_split_ordered_extent Boris Burkov
2023-03-23  8:36   ` Naohiro Aota
2023-03-23 16:22     ` Boris Burkov
2023-03-22 19:11 ` [PATCH v5 5/5] btrfs: split partial dio bios before submit Boris Burkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230323170006.GA28317@zen \
    --to=boris@bur.io \
    --cc=hch@infradead.org \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox