public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: Filipe Manana <fdmanana@kernel.org>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/2] btrfs: warn on invalid slot in tree mod log rewind
Date: Wed, 31 May 2023 07:51:25 -0700	[thread overview]
Message-ID: <20230531145125.GA2249308@zen> (raw)
In-Reply-To: <CAL3q7H6Fage2sJLY6JEdzPVJGogOqHWWNnbpRL2gU3o24x9F6w@mail.gmail.com>

On Wed, May 31, 2023 at 01:37:32PM +0100, Filipe Manana wrote:
> On Tue, May 30, 2023 at 8:52 PM Boris Burkov <boris@bur.io> wrote:
> >
> > The way that tree mod log tracks the ultimate length of the eb, the
> > variable 'n', eventually turns up the correct value, but at intermediate
> > steps during the rewind, n can be inaccurate as a representation of the
> > end of the eb. For example, it doesn't get updated on move rewinds, and
> > it does get updated for add/remove in the middle of the eb.
> >
> > To detect cases with invalid moves, introduce a separate variable called
> > max_slot which tries to track the maximum valid slot in the rewind eb.
> > We can then warn if we do a move whose src range goes beyond the max
> > valid slot.
> >
> > There is a commented caveat that it is possible to have this value be an
> > overestimate due to the challenge of properly handling 'add' operations
> > in the middle of the eb, but in practice it doesn't cause enough of a
> > problem to throw out the max idea in favor of tracking every valid slot.
> >
> > Signed-off-by: Boris Burkov <boris@bur.io>
> > ---
> >  fs/btrfs/tree-mod-log.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c
> > index a555baa0143a..1b1ae9ce9d1e 100644
> > --- a/fs/btrfs/tree-mod-log.c
> > +++ b/fs/btrfs/tree-mod-log.c
> > @@ -664,8 +664,10 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
> >         unsigned long o_dst;
> >         unsigned long o_src;
> >         unsigned long p_size = sizeof(struct btrfs_key_ptr);
> > +       u32 max_slot;
> >
> >         n = btrfs_header_nritems(eb);
> > +       max_slot = n - 1;
> >         read_lock(&fs_info->tree_mod_log_lock);
> >         while (tm && tm->seq >= time_seq) {
> >                 /*
> > @@ -684,6 +686,8 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
> >                         btrfs_set_node_ptr_generation(eb, tm->slot,
> >                                                       tm->generation);
> >                         n++;
> > +                       if (max_slot == (u32)-1 || tm->slot > max_slot)
> > +                               max_slot = tm->slot;
> >                         break;
> >                 case BTRFS_MOD_LOG_KEY_REPLACE:
> >                         BUG_ON(tm->slot >= n);
> > @@ -693,6 +697,15 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
> >                                                       tm->generation);
> >                         break;
> >                 case BTRFS_MOD_LOG_KEY_ADD:
> > +                       /*
> > +                        * It is possible we could have already removed keys behind the known
> > +                        * max slot, so this will be an overestimate. In practice, the copy
> > +                        * operation inserts them in increasing order, and overestimating just
> > +                        * means we miss some warnings, so it's OK. It isn't worth carefully
> > +                        * tracking the full array of valid slots to check against when moving.
> > +                        */
> > +                       if (tm->slot == max_slot)
> > +                               max_slot--;
> >                         /* if a move operation is needed it's in the log */
> >                         n--;
> >                         break;
> > @@ -701,6 +714,12 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
> >                         o_src = btrfs_node_key_ptr_offset(eb, tm->move.dst_slot);
> >                         memmove_extent_buffer(eb, o_dst, o_src,
> >                                               tm->move.nr_items * p_size);
> > +                       WARN((tm->move.dst_slot + tm->move.nr_items - 1 > max_slot) ||
> > +                            (max_slot == (u32)-1 && tm->move.nr_items > 0),
> 
> Why the: "tm->move.nr_items > 0" ?
> It's actually a bug, or in the best case a pointless operation that
> should not have been logged, to have a move operation for 0 items.

My thinking was that max_slot models the maximum valid slot with
(u32)-1 meaning "no slot is valid". But a move of size 0 doesn't read
any slot so doesn't constitute a bug per-se, just a dumb no-op tree mod
log entry. I can still warn on it, if you think that's better.

> 
> > +                            "Move from invalid tree mod log slot eb %llu slot %d dst_slot %d nr_items %d seq %llu n %u max_slot %u\n",
> > +                            eb->start, tm->slot, tm->move.dst_slot, tm->move.nr_items,
> > +                            tm->seq, n, max_slot);
> 
> Why trigger the warning after doing the memmove, and not before?
> I would say it's preferable, because in case the bad memmove triggers
> a crash/panic, we get the warning logged with some useful information.
> 
> It's also possibly better to log using the btrfs_warn() helper, as it
> prints information about the affected filesystem, etc.
> So like a: if (WARN_ON(conditions)) btrfs_warn()

Good points, thanks for the review.

> 
> Thanks.
> 
> 
> > +                       max_slot = tm->slot + tm->move.nr_items - 1;
> >                         break;
> >                 case BTRFS_MOD_LOG_ROOT_REPLACE:
> >                         /*
> > --
> > 2.40.1
> >

  reply	other threads:[~2023-05-31 14:51 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-30 19:22 [PATCH 0/2] btrfs: fix logical_to_ino panic in btrfs_map_bio Boris Burkov
2023-05-30 19:22 ` [PATCH 1/2] btrfs: warn on invalid slot in tree mod log rewind Boris Burkov
2023-05-31 12:37   ` Filipe Manana
2023-05-31 14:51     ` Boris Burkov [this message]
2023-05-30 19:22 ` [PATCH 2/2] btrfs: insert tree mod log move in push_node_left Boris Burkov
2023-05-31 12:53   ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230531145125.GA2249308@zen \
    --to=boris@bur.io \
    --cc=fdmanana@kernel.org \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox