linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* fail_last_dev and FailFast/LastDev flag incompatibility
@ 2022-02-09  9:40 Mariusz Tkaczyk
  2022-02-11  6:48 ` Xiao Ni
  2022-02-11  8:49 ` Guoqing Jiang
  0 siblings, 2 replies; 6+ messages in thread
From: Mariusz Tkaczyk @ 2022-02-09  9:40 UTC (permalink / raw)
  To: Song Liu, Guoqing Jiang, Xiao Ni; +Cc: linux-raid

Hi All,
During my work under failed arrays handling[1] improvements, I
discovered potential issue with "failfast" and metadata writes. In
commit message[2] Neil mentioned that:
"If we get a failure writing metadata but the device doesn't
fail, it must be the last device so we re-write without
FAILFAST".

Obviously, this is not true for RAID456 (again)[1] but it is also not
true for RAID1 and RAID10 with "fail_las_dev"[3] functionality enabled.

I did a quick check and can see that setter for "LastDev" flag is
called if "Faulty" on device is not set. I proposed some changes in the
area in my patchset[4] but after discussion we decided to drop changes
here. Current approach is not correct for all branches, so my proposal
is to change:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7b024912f1eb..3daec14ef6b2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
                pr_err("md: %s gets error=%d\n", __func__,
                       blk_status_to_errno(bio->bi_status));
                md_error(mddev, rdev);
-               if (!test_bit(Faulty, &rdev->flags)
+               if (test_bit(MD_BROKEN, mddev->flag)
                    && (bio->bi_opf & MD_FAILFAST)) {
                        set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
                        set_bit(LastDev, &rdev->flags);


It will force "LastDev" to be set on every metadata rewrite if mddevice
is known to be failed.
Do you have any other suggestions?

+ Guoqing - author of fail_last_dev.
+ Xiao - you are familiarized with FailFast so please take a look.

[1]https://lore.kernel.org/linux-raid/CAPhsuW54_9CTR6sh7mnQ6O77F2HNArLHGWHYsUdbNGy7pXgipQ@mail.gmail.com/T/#m8cf7c57429b6fd332220157186151900ce23865d
[2]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=46533ff7fefb7e9e3539494f5873b00091caa8eb
[3]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=9a567843f7ce
[4]https://lore.kernel.org/linux-raid/CAPhsuW5bV+Bz=Od9jomNHoedaEMFAXymN11J80G62GVPwSp41g@mail.gmail.com/

Thanks,
Mariusz

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: fail_last_dev and FailFast/LastDev flag incompatibility
  2022-02-09  9:40 fail_last_dev and FailFast/LastDev flag incompatibility Mariusz Tkaczyk
@ 2022-02-11  6:48 ` Xiao Ni
  2022-02-11  7:24   ` Xiao Ni
  2022-02-11  8:49 ` Guoqing Jiang
  1 sibling, 1 reply; 6+ messages in thread
From: Xiao Ni @ 2022-02-11  6:48 UTC (permalink / raw)
  To: Mariusz Tkaczyk; +Cc: Song Liu, Guoqing Jiang, linux-raid

Hi Marisuz

We don't need to consider MD_FAILFAST for raid456. Because only raid1
and raid10 support it.
MD_FAILFAST_SUPPORTED is only set in raid1_run/raid10_run. So LastDev
only be useful for
raid1/raid10. It should be good to only check Faulty here.

Best Regards
Xiao

On Wed, Feb 9, 2022 at 5:40 PM Mariusz Tkaczyk
<mariusz.tkaczyk@linux.intel.com> wrote:
>
> Hi All,
> During my work under failed arrays handling[1] improvements, I
> discovered potential issue with "failfast" and metadata writes. In
> commit message[2] Neil mentioned that:
> "If we get a failure writing metadata but the device doesn't
> fail, it must be the last device so we re-write without
> FAILFAST".
>
> Obviously, this is not true for RAID456 (again)[1] but it is also not
> true for RAID1 and RAID10 with "fail_las_dev"[3] functionality enabled.
>
> I did a quick check and can see that setter for "LastDev" flag is
> called if "Faulty" on device is not set. I proposed some changes in the
> area in my patchset[4] but after discussion we decided to drop changes
> here. Current approach is not correct for all branches, so my proposal
> is to change:
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7b024912f1eb..3daec14ef6b2 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
>                 pr_err("md: %s gets error=%d\n", __func__,
>                        blk_status_to_errno(bio->bi_status));
>                 md_error(mddev, rdev);
> -               if (!test_bit(Faulty, &rdev->flags)
> +               if (test_bit(MD_BROKEN, mddev->flag)
>                     && (bio->bi_opf & MD_FAILFAST)) {
>                         set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
>                         set_bit(LastDev, &rdev->flags);
>
>
> It will force "LastDev" to be set on every metadata rewrite if mddevice
> is known to be failed.
> Do you have any other suggestions?
>
> + Guoqing - author of fail_last_dev.
> + Xiao - you are familiarized with FailFast so please take a look.
>
> [1]https://lore.kernel.org/linux-raid/CAPhsuW54_9CTR6sh7mnQ6O77F2HNArLHGWHYsUdbNGy7pXgipQ@mail.gmail.com/T/#m8cf7c57429b6fd332220157186151900ce23865d
> [2]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=46533ff7fefb7e9e3539494f5873b00091caa8eb
> [3]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=9a567843f7ce
> [4]https://lore.kernel.org/linux-raid/CAPhsuW5bV+Bz=Od9jomNHoedaEMFAXymN11J80G62GVPwSp41g@mail.gmail.com/
>
> Thanks,
> Mariusz
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: fail_last_dev and FailFast/LastDev flag incompatibility
  2022-02-11  6:48 ` Xiao Ni
@ 2022-02-11  7:24   ` Xiao Ni
  2022-02-11  7:53     ` Xiao Ni
  0 siblings, 1 reply; 6+ messages in thread
From: Xiao Ni @ 2022-02-11  7:24 UTC (permalink / raw)
  To: Mariusz Tkaczyk; +Cc: Song Liu, Guoqing Jiang, linux-raid

And for raid1/raid10, it looks like fail_last_dev and FailFast want to
do opposite things.
It can fail the last and it doesn't send a rewrite bio when
fail_last_dev is true. Because the
last dev has been set faulty. There is no meaning to send the rewrite
bio. So FailFast only
works when fail_last_dev is false.

On Fri, Feb 11, 2022 at 2:48 PM Xiao Ni <xni@redhat.com> wrote:
>
> Hi Marisuz
>
> We don't need to consider MD_FAILFAST for raid456. Because only raid1
> and raid10 support it.
> MD_FAILFAST_SUPPORTED is only set in raid1_run/raid10_run. So LastDev
> only be useful for
> raid1/raid10. It should be good to only check Faulty here.
>
> Best Regards
> Xiao
>
> On Wed, Feb 9, 2022 at 5:40 PM Mariusz Tkaczyk
> <mariusz.tkaczyk@linux.intel.com> wrote:
> >
> > Hi All,
> > During my work under failed arrays handling[1] improvements, I
> > discovered potential issue with "failfast" and metadata writes. In
> > commit message[2] Neil mentioned that:
> > "If we get a failure writing metadata but the device doesn't
> > fail, it must be the last device so we re-write without
> > FAILFAST".
> >
> > Obviously, this is not true for RAID456 (again)[1] but it is also not
> > true for RAID1 and RAID10 with "fail_las_dev"[3] functionality enabled.
> >
> > I did a quick check and can see that setter for "LastDev" flag is
> > called if "Faulty" on device is not set. I proposed some changes in the
> > area in my patchset[4] but after discussion we decided to drop changes
> > here. Current approach is not correct for all branches, so my proposal
> > is to change:
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 7b024912f1eb..3daec14ef6b2 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
> >                 pr_err("md: %s gets error=%d\n", __func__,
> >                        blk_status_to_errno(bio->bi_status));
> >                 md_error(mddev, rdev);
> > -               if (!test_bit(Faulty, &rdev->flags)
> > +               if (test_bit(MD_BROKEN, mddev->flag)
> >                     && (bio->bi_opf & MD_FAILFAST)) {
> >                         set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
> >                         set_bit(LastDev, &rdev->flags);
> >
> >
> > It will force "LastDev" to be set on every metadata rewrite if mddevice
> > is known to be failed.
> > Do you have any other suggestions?
> >
> > + Guoqing - author of fail_last_dev.
> > + Xiao - you are familiarized with FailFast so please take a look.
> >
> > [1]https://lore.kernel.org/linux-raid/CAPhsuW54_9CTR6sh7mnQ6O77F2HNArLHGWHYsUdbNGy7pXgipQ@mail.gmail.com/T/#m8cf7c57429b6fd332220157186151900ce23865d
> > [2]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=46533ff7fefb7e9e3539494f5873b00091caa8eb
> > [3]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=9a567843f7ce
> > [4]https://lore.kernel.org/linux-raid/CAPhsuW5bV+Bz=Od9jomNHoedaEMFAXymN11J80G62GVPwSp41g@mail.gmail.com/
> >
> > Thanks,
> > Mariusz
> >


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: fail_last_dev and FailFast/LastDev flag incompatibility
  2022-02-11  7:24   ` Xiao Ni
@ 2022-02-11  7:53     ` Xiao Ni
  2022-02-11  8:51       ` Mariusz Tkaczyk
  0 siblings, 1 reply; 6+ messages in thread
From: Xiao Ni @ 2022-02-11  7:53 UTC (permalink / raw)
  To: Mariusz Tkaczyk; +Cc: Song Liu, Guoqing Jiang, linux-raid

After thinking for a while, my words from my last email don't describe properly.
For raid1/raid10, if fail_last_dev is true. The bios which are sent to
member disks all have MD_FAILFAST. If there are no errors, failfast
works well until the last device failure. It will not re-send the bio
without MD_FAILFAST when fail_last_dev is true, because the last
device has been set faulty. There is no meaning to send the bio again
in this situation. So it should be right to only check faulty flag
here.

On Fri, Feb 11, 2022 at 3:24 PM Xiao Ni <xni@redhat.com> wrote:
>
> And for raid1/raid10, it looks like fail_last_dev and FailFast want to
> do opposite things.
> It can fail the last and it doesn't send a rewrite bio when
> fail_last_dev is true. Because the
> last dev has been set faulty. There is no meaning to send the rewrite
> bio. So FailFast only
> works when fail_last_dev is false.
>
> On Fri, Feb 11, 2022 at 2:48 PM Xiao Ni <xni@redhat.com> wrote:
> >
> > Hi Marisuz
> >
> > We don't need to consider MD_FAILFAST for raid456. Because only raid1
> > and raid10 support it.
> > MD_FAILFAST_SUPPORTED is only set in raid1_run/raid10_run. So LastDev
> > only be useful for
> > raid1/raid10. It should be good to only check Faulty here.
> >
> > Best Regards
> > Xiao
> >
> > On Wed, Feb 9, 2022 at 5:40 PM Mariusz Tkaczyk
> > <mariusz.tkaczyk@linux.intel.com> wrote:
> > >
> > > Hi All,
> > > During my work under failed arrays handling[1] improvements, I
> > > discovered potential issue with "failfast" and metadata writes. In
> > > commit message[2] Neil mentioned that:
> > > "If we get a failure writing metadata but the device doesn't
> > > fail, it must be the last device so we re-write without
> > > FAILFAST".
> > >
> > > Obviously, this is not true for RAID456 (again)[1] but it is also not
> > > true for RAID1 and RAID10 with "fail_las_dev"[3] functionality enabled.
> > >
> > > I did a quick check and can see that setter for "LastDev" flag is
> > > called if "Faulty" on device is not set. I proposed some changes in the
> > > area in my patchset[4] but after discussion we decided to drop changes
> > > here. Current approach is not correct for all branches, so my proposal
> > > is to change:
> > >
> > > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > > index 7b024912f1eb..3daec14ef6b2 100644
> > > --- a/drivers/md/md.c
> > > +++ b/drivers/md/md.c
> > > @@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
> > >                 pr_err("md: %s gets error=%d\n", __func__,
> > >                        blk_status_to_errno(bio->bi_status));
> > >                 md_error(mddev, rdev);
> > > -               if (!test_bit(Faulty, &rdev->flags)
> > > +               if (test_bit(MD_BROKEN, mddev->flag)
> > >                     && (bio->bi_opf & MD_FAILFAST)) {
> > >                         set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
> > >                         set_bit(LastDev, &rdev->flags);
> > >
> > >
> > > It will force "LastDev" to be set on every metadata rewrite if mddevice
> > > is known to be failed.
> > > Do you have any other suggestions?
> > >
> > > + Guoqing - author of fail_last_dev.
> > > + Xiao - you are familiarized with FailFast so please take a look.
> > >
> > > [1]https://lore.kernel.org/linux-raid/CAPhsuW54_9CTR6sh7mnQ6O77F2HNArLHGWHYsUdbNGy7pXgipQ@mail.gmail.com/T/#m8cf7c57429b6fd332220157186151900ce23865d
> > > [2]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=46533ff7fefb7e9e3539494f5873b00091caa8eb
> > > [3]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=9a567843f7ce
> > > [4]https://lore.kernel.org/linux-raid/CAPhsuW5bV+Bz=Od9jomNHoedaEMFAXymN11J80G62GVPwSp41g@mail.gmail.com/
> > >
> > > Thanks,
> > > Mariusz
> > >


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: fail_last_dev and FailFast/LastDev flag incompatibility
  2022-02-09  9:40 fail_last_dev and FailFast/LastDev flag incompatibility Mariusz Tkaczyk
  2022-02-11  6:48 ` Xiao Ni
@ 2022-02-11  8:49 ` Guoqing Jiang
  1 sibling, 0 replies; 6+ messages in thread
From: Guoqing Jiang @ 2022-02-11  8:49 UTC (permalink / raw)
  To: Mariusz Tkaczyk, Song Liu, Xiao Ni; +Cc: linux-raid



On 2/9/22 5:40 PM, Mariusz Tkaczyk wrote:
> Hi All,
> During my work under failed arrays handling[1]*improvements*,

Sorry, I disagree, will comment your new version later.

> I discovered potential issue with "failfast" and metadata writes. In
> commit message[2] Neil mentioned that:
> "If we get a failure writing metadata but the device doesn't
> fail, it must be the last device so we re-write without
> FAILFAST".
>
> Obviously, this is not true for RAID456 (again)[1] but it is also not
> true for RAID1 and RAID10 with "fail_las_dev"[3] functionality enabled.
>
> I did a quick check and can see that setter for "LastDev" flag is
> called if "Faulty" on device is not set. I proposed some changes in the
> area in my patchset[4] but after discussion we decided to drop changes
> here. Current approach is not correct for all branches, so my proposal
> is to change:
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7b024912f1eb..3daec14ef6b2 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
>                  pr_err("md: %s gets error=%d\n", __func__,
>                         blk_status_to_errno(bio->bi_status));
>                  md_error(mddev, rdev);
> -               if (!test_bit(Faulty, &rdev->flags)
> +               if (test_bit(MD_BROKEN, mddev->flag)
>                      && (bio->bi_opf & MD_FAILFAST)) {
>                          set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
>                          set_bit(LastDev, &rdev->flags);

IIUC, there is no problem with checking Faulty since super_written is
against rdev while MD_BROKEN is supposed to mean array is broken.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: fail_last_dev and FailFast/LastDev flag incompatibility
  2022-02-11  7:53     ` Xiao Ni
@ 2022-02-11  8:51       ` Mariusz Tkaczyk
  0 siblings, 0 replies; 6+ messages in thread
From: Mariusz Tkaczyk @ 2022-02-11  8:51 UTC (permalink / raw)
  To: Xiao Ni; +Cc: Song Liu, Guoqing Jiang, linux-raid

On Fri, 11 Feb 2022 15:53:42 +0800
Xiao Ni <xni@redhat.com> wrote:

> After thinking for a while, my words from my last email don't
> describe properly. For raid1/raid10, if fail_last_dev is true. The
> bios which are sent to member disks all have MD_FAILFAST. If there
> are no errors, failfast works well until the last device failure. It
> will not re-send the bio without MD_FAILFAST when fail_last_dev is
> true, because the last device has been set faulty. There is no
> meaning to send the bio again in this situation. So it should be
> right to only check faulty flag here.

Hi Xiao,
Thanks for clarification.

Mariusz

> 
> On Fri, Feb 11, 2022 at 3:24 PM Xiao Ni <xni@redhat.com> wrote:
> >
> > And for raid1/raid10, it looks like fail_last_dev and FailFast want
> > to do opposite things.
> > It can fail the last and it doesn't send a rewrite bio when
> > fail_last_dev is true. Because the
> > last dev has been set faulty. There is no meaning to send the
> > rewrite bio. So FailFast only
> > works when fail_last_dev is false.
> >
> > On Fri, Feb 11, 2022 at 2:48 PM Xiao Ni <xni@redhat.com> wrote:
> > >
> > > Hi Marisuz
> > >
> > > We don't need to consider MD_FAILFAST for raid456. Because only
> > > raid1 and raid10 support it.
> > > MD_FAILFAST_SUPPORTED is only set in raid1_run/raid10_run. So
> > > LastDev only be useful for
> > > raid1/raid10. It should be good to only check Faulty here.
> > >
> > > Best Regards
> > > Xiao
> > >
> > > On Wed, Feb 9, 2022 at 5:40 PM Mariusz Tkaczyk
> > > <mariusz.tkaczyk@linux.intel.com> wrote:
> > > >
> > > > Hi All,
> > > > During my work under failed arrays handling[1] improvements, I
> > > > discovered potential issue with "failfast" and metadata writes.
> > > > In commit message[2] Neil mentioned that:
> > > > "If we get a failure writing metadata but the device doesn't
> > > > fail, it must be the last device so we re-write without
> > > > FAILFAST".
> > > >
> > > > Obviously, this is not true for RAID456 (again)[1] but it is
> > > > also not true for RAID1 and RAID10 with "fail_las_dev"[3]
> > > > functionality enabled.
> > > >
> > > > I did a quick check and can see that setter for "LastDev" flag
> > > > is called if "Faulty" on device is not set. I proposed some
> > > > changes in the area in my patchset[4] but after discussion we
> > > > decided to drop changes here. Current approach is not correct
> > > > for all branches, so my proposal is to change:
> > > >
> > > > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > > > index 7b024912f1eb..3daec14ef6b2 100644
> > > > --- a/drivers/md/md.c
> > > > +++ b/drivers/md/md.c
> > > > @@ -931,7 +931,7 @@ static void super_written(struct bio *bio)
> > > >                 pr_err("md: %s gets error=%d\n", __func__,
> > > >                        blk_status_to_errno(bio->bi_status));
> > > >                 md_error(mddev, rdev);
> > > > -               if (!test_bit(Faulty, &rdev->flags)
> > > > +               if (test_bit(MD_BROKEN, mddev->flag)
> > > >                     && (bio->bi_opf & MD_FAILFAST)) {
> > > >                         set_bit(MD_SB_NEED_REWRITE,
> > > > &mddev->sb_flags); set_bit(LastDev, &rdev->flags);
> > > >
> > > >
> > > > It will force "LastDev" to be set on every metadata rewrite if
> > > > mddevice is known to be failed.
> > > > Do you have any other suggestions?
> > > >
> > > > + Guoqing - author of fail_last_dev.
> > > > + Xiao - you are familiarized with FailFast so please take a
> > > > look.
> > > >
> > > > [1]https://lore.kernel.org/linux-raid/CAPhsuW54_9CTR6sh7mnQ6O77F2HNArLHGWHYsUdbNGy7pXgipQ@mail.gmail.com/T/#m8cf7c57429b6fd332220157186151900ce23865d
> > > > [2]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=46533ff7fefb7e9e3539494f5873b00091caa8eb
> > > > [3]https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=9a567843f7ce
> > > > [4]https://lore.kernel.org/linux-raid/CAPhsuW5bV+Bz=Od9jomNHoedaEMFAXymN11J80G62GVPwSp41g@mail.gmail.com/
> > > >
> > > > Thanks,
> > > > Mariusz
> > > >
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-02-11  8:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-09  9:40 fail_last_dev and FailFast/LastDev flag incompatibility Mariusz Tkaczyk
2022-02-11  6:48 ` Xiao Ni
2022-02-11  7:24   ` Xiao Ni
2022-02-11  7:53     ` Xiao Ni
2022-02-11  8:51       ` Mariusz Tkaczyk
2022-02-11  8:49 ` Guoqing Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).