* [PATCH 1/2] md/r5cache: improve add-journal
@ 2017-03-15 18:28 Song Liu
2017-03-15 18:28 ` [PATCH 2/2] md/r5cache: journal remove support Song Liu
2017-03-15 22:51 ` [PATCH 1/2] md/r5cache: improve add-journal Shaohua Li
0 siblings, 2 replies; 5+ messages in thread
From: Song Liu @ 2017-03-15 18:28 UTC (permalink / raw)
To: linux-raid
Cc: shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen,
Song Liu
1. suspend the array before adding journal, so that we can add journal
when the array is not read-only;
2. allow recreate journal when existing journal is Faulty. So that we can
add-journal before removing failed journal.
Signed-off-by: Song Liu <songliubraving@fb.com>
---
drivers/md/md.c | 5 +++--
drivers/md/raid5.c | 6 ++----
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 42e68b2..ac3bd15 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6230,9 +6230,10 @@ static int add_new_disk(struct mddev *mddev, mdu_disk_info_t *info)
struct md_rdev *rdev2;
bool has_journal = false;
- /* make sure no existing journal disk */
+ /* make sure no active journal disk */
rdev_for_each(rdev2, mddev) {
- if (test_bit(Journal, &rdev2->flags)) {
+ if (test_bit(Journal, &rdev2->flags) &&
+ !test_bit(Faulty, &rdev2->flags)) {
has_journal = true;
break;
}
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 447d9dd..ee8648b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7758,11 +7758,9 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
return -EBUSY;
rdev->raid_disk = 0;
- /*
- * The array is in readonly mode if journal is missing, so no
- * write requests running. We should be safe
- */
+ mddev_suspend(mddev);
log_init(conf, rdev);
+ mddev_resume(mddev);
return 0;
}
if (mddev->recovery_disabled == conf->recovery_disabled)
--
2.9.3
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] md/r5cache: journal remove support
2017-03-15 18:28 [PATCH 1/2] md/r5cache: improve add-journal Song Liu
@ 2017-03-15 18:28 ` Song Liu
2017-03-15 22:52 ` Shaohua Li
2017-03-15 22:51 ` [PATCH 1/2] md/r5cache: improve add-journal Shaohua Li
1 sibling, 1 reply; 5+ messages in thread
From: Song Liu @ 2017-03-15 18:28 UTC (permalink / raw)
To: linux-raid
Cc: shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen,
Song Liu
When journal device of an array fails, the array is forced into read-only
mode. To make the array normal without adding another journal device, we
need to remove journal _feature_ from the array.
This patch allows remove journal _feature_ from an array, For journal
remove to work, existing journal should be either missing or faulty.
Two flags are added to GET_ARRAY_INFO for mdadm.
1. MD_SB_HAS_JOURNAL: meaning the array have journal feature;
2. MD_SB_JOURNAL_REMOVABLE: meaning the journal is faulty or missing
When both flags are set, mdadm can clear MD_SB_HAS_JOURNAL to remove
journal _feature_.
Signed-off-by: Song Liu <songliubraving@fb.com>
---
drivers/md/md.c | 42 ++++++++++++++++++++++++++++++++++++++++--
include/uapi/linux/raid/md_p.h | 11 ++++++++---
2 files changed, 48 insertions(+), 5 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ac3bd15..32ee994 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5981,6 +5981,25 @@ static void autorun_devices(int part)
}
#endif /* !MODULE */
+/*
+ * the journal _feature_ is removable when:
+ * the array has journal support &&
+ * (journal is missing || journal is faulty)
+ */
+static bool journal_removable(struct mddev *mddev)
+{
+ struct md_rdev *rdev;
+
+ if (!test_bit(MD_HAS_JOURNAL, &mddev->flags))
+ return false;
+
+ rdev_for_each_rcu(rdev, mddev)
+ if (test_bit(Journal, &rdev->flags) &&
+ !test_bit(Faulty, &rdev->flags))
+ return false;
+ return true;
+}
+
static int get_version(void __user *arg)
{
mdu_version_t ver;
@@ -6041,6 +6060,10 @@ static int get_array_info(struct mddev *mddev, void __user *arg)
info.state |= (1<<MD_SB_BITMAP_PRESENT);
if (mddev_is_clustered(mddev))
info.state |= (1<<MD_SB_CLUSTERED);
+ if (test_bit(MD_HAS_JOURNAL, &mddev->flags))
+ info.state |= (1<<MD_SB_HAS_JOURNAL);
+ if (journal_removable(mddev))
+ info.state |= (1<<MD_SB_JOURNAL_REMOVABLE);
info.active_disks = insync;
info.working_disks = working;
info.failed_disks = failed;
@@ -6721,6 +6744,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
/* calculate expected state,ignoring low bits */
if (mddev->bitmap && mddev->bitmap_info.offset)
state |= (1 << MD_SB_BITMAP_PRESENT);
+ if (journal_removable(mddev))
+ state |= (1 << MD_SB_JOURNAL_REMOVABLE);
if (mddev->major_version != info->major_version ||
mddev->minor_version != info->minor_version ||
@@ -6730,8 +6755,11 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
/* mddev->layout != info->layout || */
mddev->persistent != !info->not_persistent ||
mddev->chunk_sectors != info->chunk_size >> 9 ||
- /* ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT to change */
- ((state^info->state) & 0xfffffe00)
+ /*
+ * ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT
+ * and SB_HAS_JOURNAL to change
+ */
+ ((state^info->state) & 0xfffffc00)
)
return -EINVAL;
/* Check there is only one change */
@@ -6743,6 +6771,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
cnt++;
if ((state ^ info->state) & (1<<MD_SB_BITMAP_PRESENT))
cnt++;
+ if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL))
+ cnt++;
if (cnt == 0)
return 0;
if (cnt > 1)
@@ -6831,6 +6861,14 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
mddev->bitmap_info.offset = 0;
}
}
+ if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL)) {
+ if (!journal_removable(mddev)) {
+ rv = -EINVAL;
+ goto err;
+ }
+ clear_bit(MD_HAS_JOURNAL, &mddev->flags);
+ }
+
md_update_sb(mddev, 1);
return rv;
err:
diff --git a/include/uapi/linux/raid/md_p.h b/include/uapi/linux/raid/md_p.h
index d9a1ead..b1f2b63 100644
--- a/include/uapi/linux/raid/md_p.h
+++ b/include/uapi/linux/raid/md_p.h
@@ -1,15 +1,15 @@
/*
md_p.h : physical layout of Linux RAID devices
Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-
+
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
-
+
You should have received a copy of the GNU General Public License
(for example /usr/src/linux/COPYING); if not, write to the Free
- Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
#ifndef _MD_P_H
@@ -119,6 +119,11 @@ typedef struct mdp_device_descriptor_s {
#define MD_SB_CLUSTERED 5 /* MD is clustered */
#define MD_SB_BITMAP_PRESENT 8 /* bitmap may be present nearby */
+#define MD_SB_HAS_JOURNAL 9
+#define MD_SB_JOURNAL_REMOVABLE 10 /* journal _feature_ can be removed,
+ * which means the journal is either
+ * missing or Faulty
+ */
/*
* Notes:
--
2.9.3
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] md/r5cache: improve add-journal
2017-03-15 18:28 [PATCH 1/2] md/r5cache: improve add-journal Song Liu
2017-03-15 18:28 ` [PATCH 2/2] md/r5cache: journal remove support Song Liu
@ 2017-03-15 22:51 ` Shaohua Li
2017-03-15 23:48 ` Song Liu
1 sibling, 1 reply; 5+ messages in thread
From: Shaohua Li @ 2017-03-15 22:51 UTC (permalink / raw)
To: Song Liu
Cc: linux-raid, shli, neilb, kernel-team, dan.j.williams, hch,
jes.sorensen
On Wed, Mar 15, 2017 at 11:28:15AM -0700, Song Liu wrote:
> 1. suspend the array before adding journal, so that we can add journal
> when the array is not read-only;
we can't call mddev_suspend in raid5d, because there is deadlock.
raid5_add_disk can be called in raid5d.
> 2. allow recreate journal when existing journal is Faulty. So that we can
> add-journal before removing failed journal.
this is weird usage, why don't we remove the failed journal first?
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
> drivers/md/md.c | 5 +++--
> drivers/md/raid5.c | 6 ++----
> 2 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 42e68b2..ac3bd15 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6230,9 +6230,10 @@ static int add_new_disk(struct mddev *mddev, mdu_disk_info_t *info)
> struct md_rdev *rdev2;
> bool has_journal = false;
>
> - /* make sure no existing journal disk */
> + /* make sure no active journal disk */
> rdev_for_each(rdev2, mddev) {
> - if (test_bit(Journal, &rdev2->flags)) {
> + if (test_bit(Journal, &rdev2->flags) &&
> + !test_bit(Faulty, &rdev2->flags)) {
> has_journal = true;
> break;
> }
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 447d9dd..ee8648b 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -7758,11 +7758,9 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
> return -EBUSY;
>
> rdev->raid_disk = 0;
> - /*
> - * The array is in readonly mode if journal is missing, so no
> - * write requests running. We should be safe
> - */
> + mddev_suspend(mddev);
> log_init(conf, rdev);
> + mddev_resume(mddev);
> return 0;
> }
> if (mddev->recovery_disabled == conf->recovery_disabled)
> --
> 2.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] md/r5cache: journal remove support
2017-03-15 18:28 ` [PATCH 2/2] md/r5cache: journal remove support Song Liu
@ 2017-03-15 22:52 ` Shaohua Li
0 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2017-03-15 22:52 UTC (permalink / raw)
To: Song Liu
Cc: linux-raid, shli, neilb, kernel-team, dan.j.williams, hch,
jes.sorensen
On Wed, Mar 15, 2017 at 11:28:16AM -0700, Song Liu wrote:
> When journal device of an array fails, the array is forced into read-only
> mode. To make the array normal without adding another journal device, we
> need to remove journal _feature_ from the array.
>
> This patch allows remove journal _feature_ from an array, For journal
> remove to work, existing journal should be either missing or faulty.
>
> Two flags are added to GET_ARRAY_INFO for mdadm.
> 1. MD_SB_HAS_JOURNAL: meaning the array have journal feature;
> 2. MD_SB_JOURNAL_REMOVABLE: meaning the journal is faulty or missing
>
> When both flags are set, mdadm can clear MD_SB_HAS_JOURNAL to remove
> journal _feature_.
please use the new 'consistency_policy' interface to remove journal support
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
> drivers/md/md.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> include/uapi/linux/raid/md_p.h | 11 ++++++++---
> 2 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ac3bd15..32ee994 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5981,6 +5981,25 @@ static void autorun_devices(int part)
> }
> #endif /* !MODULE */
>
> +/*
> + * the journal _feature_ is removable when:
> + * the array has journal support &&
> + * (journal is missing || journal is faulty)
> + */
> +static bool journal_removable(struct mddev *mddev)
> +{
> + struct md_rdev *rdev;
> +
> + if (!test_bit(MD_HAS_JOURNAL, &mddev->flags))
> + return false;
> +
> + rdev_for_each_rcu(rdev, mddev)
> + if (test_bit(Journal, &rdev->flags) &&
> + !test_bit(Faulty, &rdev->flags))
> + return false;
> + return true;
> +}
> +
> static int get_version(void __user *arg)
> {
> mdu_version_t ver;
> @@ -6041,6 +6060,10 @@ static int get_array_info(struct mddev *mddev, void __user *arg)
> info.state |= (1<<MD_SB_BITMAP_PRESENT);
> if (mddev_is_clustered(mddev))
> info.state |= (1<<MD_SB_CLUSTERED);
> + if (test_bit(MD_HAS_JOURNAL, &mddev->flags))
> + info.state |= (1<<MD_SB_HAS_JOURNAL);
> + if (journal_removable(mddev))
> + info.state |= (1<<MD_SB_JOURNAL_REMOVABLE);
> info.active_disks = insync;
> info.working_disks = working;
> info.failed_disks = failed;
> @@ -6721,6 +6744,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
> /* calculate expected state,ignoring low bits */
> if (mddev->bitmap && mddev->bitmap_info.offset)
> state |= (1 << MD_SB_BITMAP_PRESENT);
> + if (journal_removable(mddev))
> + state |= (1 << MD_SB_JOURNAL_REMOVABLE);
>
> if (mddev->major_version != info->major_version ||
> mddev->minor_version != info->minor_version ||
> @@ -6730,8 +6755,11 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
> /* mddev->layout != info->layout || */
> mddev->persistent != !info->not_persistent ||
> mddev->chunk_sectors != info->chunk_size >> 9 ||
> - /* ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT to change */
> - ((state^info->state) & 0xfffffe00)
> + /*
> + * ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT
> + * and SB_HAS_JOURNAL to change
> + */
> + ((state^info->state) & 0xfffffc00)
> )
> return -EINVAL;
> /* Check there is only one change */
> @@ -6743,6 +6771,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
> cnt++;
> if ((state ^ info->state) & (1<<MD_SB_BITMAP_PRESENT))
> cnt++;
> + if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL))
> + cnt++;
> if (cnt == 0)
> return 0;
> if (cnt > 1)
> @@ -6831,6 +6861,14 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
> mddev->bitmap_info.offset = 0;
> }
> }
> + if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL)) {
> + if (!journal_removable(mddev)) {
> + rv = -EINVAL;
> + goto err;
> + }
> + clear_bit(MD_HAS_JOURNAL, &mddev->flags);
> + }
> +
> md_update_sb(mddev, 1);
> return rv;
> err:
> diff --git a/include/uapi/linux/raid/md_p.h b/include/uapi/linux/raid/md_p.h
> index d9a1ead..b1f2b63 100644
> --- a/include/uapi/linux/raid/md_p.h
> +++ b/include/uapi/linux/raid/md_p.h
> @@ -1,15 +1,15 @@
> /*
> md_p.h : physical layout of Linux RAID devices
> Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
> -
> +
> This program is free software; you can redistribute it and/or modify
> it under the terms of the GNU General Public License as published by
> the Free Software Foundation; either version 2, or (at your option)
> any later version.
> -
> +
> You should have received a copy of the GNU General Public License
> (for example /usr/src/linux/COPYING); if not, write to the Free
> - Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> + Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> */
>
> #ifndef _MD_P_H
> @@ -119,6 +119,11 @@ typedef struct mdp_device_descriptor_s {
>
> #define MD_SB_CLUSTERED 5 /* MD is clustered */
> #define MD_SB_BITMAP_PRESENT 8 /* bitmap may be present nearby */
> +#define MD_SB_HAS_JOURNAL 9
> +#define MD_SB_JOURNAL_REMOVABLE 10 /* journal _feature_ can be removed,
> + * which means the journal is either
> + * missing or Faulty
> + */
>
> /*
> * Notes:
> --
> 2.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] md/r5cache: improve add-journal
2017-03-15 22:51 ` [PATCH 1/2] md/r5cache: improve add-journal Shaohua Li
@ 2017-03-15 23:48 ` Song Liu
0 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2017-03-15 23:48 UTC (permalink / raw)
To: Shaohua Li
Cc: linux-raid, Shaohua Li, NeilBrown, Kernel Team, Dan Williams,
Christoph Hellwig, jes.sorensen@gmail.com
> On Mar 15, 2017, at 3:51 PM, Shaohua Li <shli@kernel.org> wrote:
>
> On Wed, Mar 15, 2017 at 11:28:15AM -0700, Song Liu wrote:
>> 1. suspend the array before adding journal, so that we can add journal
>> when the array is not read-only;
>
> we can't call mddev_suspend in raid5d, because there is deadlock.
> raid5_add_disk can be called in raid5d.
I see. I guess we can just require setting read only before adding journal.
>
>> 2. allow recreate journal when existing journal is Faulty. So that we can
>> add-journal before removing failed journal.
>
> this is weird usage, why don't we remove the failed journal first?
This is really not necessary.
I guess we can just drop this patch.
Thanks,
Song
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-03-15 23:48 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-15 18:28 [PATCH 1/2] md/r5cache: improve add-journal Song Liu
2017-03-15 18:28 ` [PATCH 2/2] md/r5cache: journal remove support Song Liu
2017-03-15 22:52 ` Shaohua Li
2017-03-15 22:51 ` [PATCH 1/2] md/r5cache: improve add-journal Shaohua Li
2017-03-15 23:48 ` Song Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).