From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
To: Xiao Ni <xni@redhat.com>
Cc: Song Liu <song@kernel.org>,
Guoqing Jiang <guoqing.jiang@linux.dev>,
linux-raid <linux-raid@vger.kernel.org>
Subject: Re: fail_last_dev and FailFast/LastDev flag incompatibility
Date: Fri, 11 Feb 2022 09:51:44 +0100 [thread overview]
Message-ID: <20220211095144.0000258c@linux.intel.com> (raw)
In-Reply-To: <CALTww2-UxsgNBdUJ0EHrmPUyvnO+Q04DsxnOdfExN5dFmjMsfw@mail.gmail.com>
On Fri, 11 Feb 2022 15:53:42 +0800
Xiao Ni <xni@redhat.com> wrote:
> After thinking for a while, my words from my last email don't
> describe properly. For raid1/raid10, if fail_last_dev is true. The
> bios which are sent to member disks all have MD_FAILFAST. If there
> are no errors, failfast works well until the last device failure. It
> will not re-send the bio without MD_FAILFAST when fail_last_dev is
> true, because the last device has been set faulty. There is no
> meaning to send the bio again in this situation. So it should be
> right to only check faulty flag here.
Hi Xiao,
Thanks for clarification.
Mariusz
>
> On Fri, Feb 11, 2022 at 3:24 PM Xiao Ni <xni@redhat.com> wrote:
> >
> > And for raid1/raid10, it looks like fail_last_dev and FailFast want
> > to do opposite things.
> > It can fail the last and it doesn't send a rewrite bio when
> > fail_last_dev is true. Because the
> > last dev has been set faulty. There is no meaning to send the
> > rewrite bio. So FailFast only
> > works when fail_last_dev is false.
> >
> > On Fri, Feb 11, 2022 at 2:48 PM Xiao Ni <xni@redhat.com> wrote:
> > >
> > > Hi Marisuz
> > >
> > > We don't need to consider MD_FAILFAST for raid456. Because only
> > > raid1 and raid10 support it.
> > > MD_FAILFAST_SUPPORTED is only set in raid1_run/raid10_run. So
> > > LastDev only be useful for
> > > raid1/raid10. It should be good to only check Faulty here.
> > >
> > > Best Regards
> > > Xiao
> > >
> > > On Wed, Feb 9, 2022 at 5:40 PM Mariusz Tkaczyk
> > > <mariusz.tkaczyk@linux.intel.com> wrote:
> > > >
> > > > Hi All,
> > > > During my work under failed arrays handling[1] improvements, I
> > > > discovered potential issue with "failfast" and metadata writes.
> > > > In commit message[2] Neil mentioned that:
> > > > "If we get a failure writing metadata but the device doesn't
> > > > fail, it must be the last device so we re-write without
> > > > FAILFAST".
> > > >
> > > > Obviously, this is not true for RAID456 (again)[1] but it is
> > > > also not true for RAID1 and RAID10 with "fail_las_dev"[3]
> > > > functionality enabled.
> > > >
> > > > I did a quick check and can see that setter for "LastDev" flag
> > > > is called if "Faulty" on device is not set. I proposed some
> > > > changes in the area in my patchset[4] but after discussion we
> > > > decided to drop changes here. Current approach is not correct
> > > > for all branches, so my proposal is to change:
> > > >
> > > > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > > > index 7b024912f1eb..3daec14ef6b2 100644
> > > > --- a/drivers/md/md.c
> > > > +++ b/drivers/md/md.c
> > > > @@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
> > > > pr_err("md: %s gets error=%d\n", __func__,
> > > > blk_status_to_errno(bio->bi_status));
> > > > md_error(mddev, rdev);
> > > > - if (!test_bit(Faulty, &rdev->flags)
> > > > + if (test_bit(MD_BROKEN, mddev->flag)
> > > > && (bio->bi_opf & MD_FAILFAST)) {
> > > > set_bit(MD_SB_NEED_REWRITE,
> > > > &mddev->sb_flags); set_bit(LastDev, &rdev->flags);
> > > >
> > > >
> > > > It will force "LastDev" to be set on every metadata rewrite if
> > > > mddevice is known to be failed.
> > > > Do you have any other suggestions?
> > > >
> > > > + Guoqing - author of fail_last_dev.
> > > > + Xiao - you are familiarized with FailFast so please take a
> > > > look.
> > > >
> > > > [1]https://lore.kernel.org/linux-raid/CAPhsuW54_9CTR6sh7mnQ6O77F2HNArLHGWHYsUdbNGy7pXgipQ@mail.gmail.com/T/#m8cf7c57429b6fd332220157186151900ce23865d
> > > > [2]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=46533ff7fefb7e9e3539494f5873b00091caa8eb
> > > > [3]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=9a567843f7ce
> > > > [4]https://lore.kernel.org/linux-raid/CAPhsuW5bV+Bz=Od9jomNHoedaEMFAXymN11J80G62GVPwSp41g@mail.gmail.com/
> > > >
> > > > Thanks,
> > > > Mariusz
> > > >
>
next prev parent reply other threads:[~2022-02-11 8:51 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-09 9:40 fail_last_dev and FailFast/LastDev flag incompatibility Mariusz Tkaczyk
2022-02-11 6:48 ` Xiao Ni
2022-02-11 7:24 ` Xiao Ni
2022-02-11 7:53 ` Xiao Ni
2022-02-11 8:51 ` Mariusz Tkaczyk [this message]
2022-02-11 8:49 ` Guoqing Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220211095144.0000258c@linux.intel.com \
--to=mariusz.tkaczyk@linux.intel.com \
--cc=guoqing.jiang@linux.dev \
--cc=linux-raid@vger.kernel.org \
--cc=song@kernel.org \
--cc=xni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.