* [PATCH 1/2] md/r5cache: improve add-journal
From: Song Liu @ 2017-03-15 18:28 UTC
To: linux-raid
Cc: shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen, Song Liu

1. suspend the array before adding journal, so that we can add journal
   when the array is not read-only;
2. allow recreate journal when existing journal is Faulty. So that we can
   add-journal before removing failed journal.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 drivers/md/md.c    | 5 +++--
 drivers/md/raid5.c | 6 ++----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 42e68b2..ac3bd15 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6230,9 +6230,10 @@ static int add_new_disk(struct mddev *mddev, mdu_disk_info_t *info)
 			struct md_rdev *rdev2;
 			bool has_journal = false;

-			/* make sure no existing journal disk */
+			/* make sure no active journal disk */
 			rdev_for_each(rdev2, mddev) {
-				if (test_bit(Journal, &rdev2->flags)) {
+				if (test_bit(Journal, &rdev2->flags) &&
+				    !test_bit(Faulty, &rdev2->flags)) {
 					has_journal = true;
 					break;
 				}
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 447d9dd..ee8648b 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7758,11 +7758,9 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 			return -EBUSY;

 		rdev->raid_disk = 0;
-		/*
-		 * The array is in readonly mode if journal is missing, so no
-		 * write requests running. We should be safe
-		 */
+		mddev_suspend(mddev);
 		log_init(conf, rdev);
+		mddev_resume(mddev);
 		return 0;
 	}
 	if (mddev->recovery_disabled == conf->recovery_disabled)
--
2.9.3
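
The tightened check in add_new_disk() counts a journal device as present only when it is not Faulty. A minimal user-space sketch of that predicate follows. The flag constants, struct fake_rdev, and has_active_journal() are hypothetical stand-ins for illustration; the kernel tests bits in rdev->flags with test_bit(), which is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's per-device flag bits. */
enum rdev_flag {
	RDEV_FAULTY  = 1 << 0,
	RDEV_JOURNAL = 1 << 1,
};

/* Hypothetical, simplified device record (not struct md_rdev). */
struct fake_rdev {
	unsigned int flags;
};

/*
 * Returns true if some device is a journal that has not failed.
 * This mirrors the patched condition: Journal && !Faulty.
 */
static bool has_active_journal(const struct fake_rdev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if ((devs[i].flags & RDEV_JOURNAL) &&
		    !(devs[i].flags & RDEV_FAULTY))
			return true;
	}
	return false;
}
```

With this predicate an array whose only journal is Faulty reports no active journal, so a replacement journal could be added before the failed one is removed. That is the usage the patch description argues for and the review later questions.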
* [PATCH 2/2] md/r5cache: journal remove support
From: Song Liu @ 2017-03-15 18:28 UTC
To: linux-raid
Cc: shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen, Song Liu

When journal device of an array fails, the array is forced into read-only
mode. To make the array normal without adding another journal device, we
need to remove journal _feature_ from the array.

This patch allows remove journal _feature_ from an array, For journal
remove to work, existing journal should be either missing or faulty.

Two flags are added to GET_ARRAY_INFO for mdadm.
1. MD_SB_HAS_JOURNAL: meaning the array have journal feature;
2. MD_SB_JOURNAL_REMOVABLE: meaning the journal is faulty or missing

When both flags are set, mdadm can clear MD_SB_HAS_JOURNAL to remove
journal _feature_.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 drivers/md/md.c                | 42 ++++++++++++++++++++++++++++++++++++++++--
 include/uapi/linux/raid/md_p.h | 11 ++++++++---
 2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index ac3bd15..32ee994 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5981,6 +5981,25 @@ static void autorun_devices(int part)
 }
 #endif /* !MODULE */

+/*
+ * the journal _feature_ is removable when:
+ * the array has journal support &&
+ * (journal is missing || journal is faulty)
+ */
+static bool journal_removable(struct mddev *mddev)
+{
+	struct md_rdev *rdev;
+
+	if (!test_bit(MD_HAS_JOURNAL, &mddev->flags))
+		return false;
+
+	rdev_for_each_rcu(rdev, mddev)
+		if (test_bit(Journal, &rdev->flags) &&
+		    !test_bit(Faulty, &rdev->flags))
+			return false;
+	return true;
+}
+
 static int get_version(void __user *arg)
 {
 	mdu_version_t ver;
@@ -6041,6 +6060,10 @@ static int get_array_info(struct mddev *mddev, void __user *arg)
 		info.state |= (1<<MD_SB_BITMAP_PRESENT);
 	if (mddev_is_clustered(mddev))
 		info.state |= (1<<MD_SB_CLUSTERED);
+	if (test_bit(MD_HAS_JOURNAL, &mddev->flags))
+		info.state |= (1<<MD_SB_HAS_JOURNAL);
+	if (journal_removable(mddev))
+		info.state |= (1<<MD_SB_JOURNAL_REMOVABLE);
 	info.active_disks = insync;
 	info.working_disks = working;
 	info.failed_disks = failed;
@@ -6721,6 +6744,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 	/* calculate expected state,ignoring low bits */
 	if (mddev->bitmap && mddev->bitmap_info.offset)
 		state |= (1 << MD_SB_BITMAP_PRESENT);
+	if (journal_removable(mddev))
+		state |= (1 << MD_SB_JOURNAL_REMOVABLE);

 	if (mddev->major_version != info->major_version ||
 	    mddev->minor_version != info->minor_version ||
@@ -6730,8 +6755,11 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 /*	    mddev->layout != info->layout || */
 	    mddev->persistent != !info->not_persistent ||
 	    mddev->chunk_sectors != info->chunk_size >> 9 ||
-	    /* ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT to change */
-	    ((state^info->state) & 0xfffffe00)
+	    /*
+	     * ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT
+	     * and SB_HAS_JOURNAL to change
+	     */
+	    ((state^info->state) & 0xfffffc00)
 		)
 		return -EINVAL;
 	/* Check there is only one change */
@@ -6743,6 +6771,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 		cnt++;
 	if ((state ^ info->state) & (1<<MD_SB_BITMAP_PRESENT))
 		cnt++;
+	if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL))
+		cnt++;
 	if (cnt == 0)
 		return 0;
 	if (cnt > 1)
@@ -6831,6 +6861,14 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 			mddev->bitmap_info.offset = 0;
 		}
 	}
+	if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL)) {
+		if (!journal_removable(mddev)) {
+			rv = -EINVAL;
+			goto err;
+		}
+		clear_bit(MD_HAS_JOURNAL, &mddev->flags);
+	}
+
 	md_update_sb(mddev, 1);
 	return rv;
 err:
diff --git a/include/uapi/linux/raid/md_p.h b/include/uapi/linux/raid/md_p.h
index d9a1ead..b1f2b63 100644
--- a/include/uapi/linux/raid/md_p.h
+++ b/include/uapi/linux/raid/md_p.h
@@ -1,15 +1,15 @@
 /*
    md_p.h : physical layout of Linux RAID devices
           Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
-
+
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2, or (at your option)
    any later version.
-
+
    You should have received a copy of the GNU General Public License
    (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

 #ifndef _MD_P_H
@@ -119,6 +119,11 @@ typedef struct mdp_device_descriptor_s {

 #define	MD_SB_CLUSTERED		5 /* MD is clustered */
 #define	MD_SB_BITMAP_PRESENT	8 /* bitmap may be present nearby */
+#define	MD_SB_HAS_JOURNAL	9
+#define	MD_SB_JOURNAL_REMOVABLE	10 /* journal _feature_ can be removed,
+				    * which means the journal is either
+				    * missing or Faulty
+				    */

 /*
  * Notes:
--
2.9.3
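
The mask change from 0xfffffe00 to 0xfffffc00 in update_array_info() can be checked with a little bit arithmetic. The sketch below is plain user-space C rather than kernel code. The bit positions are copied from the patch, while OLD_MASK, NEW_MASK, and change_rejected() are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in md_p.h by the patch above. */
#define MD_SB_BITMAP_PRESENT	8
#define MD_SB_HAS_JOURNAL	9
#define MD_SB_JOURNAL_REMOVABLE	10

/* Hypothetical names for the two masks used in the diff. */
#define OLD_MASK 0xfffffe00u	/* compares bits 9..31 */
#define NEW_MASK 0xfffffc00u	/* compares bits 10..31 */

/*
 * Returns true when the requested state differs from the expected state
 * in any bit covered by the mask, which is the condition that makes
 * update_array_info() return -EINVAL.
 */
static bool change_rejected(uint32_t expected, uint32_t requested,
			    uint32_t mask)
{
	return ((expected ^ requested) & mask) != 0;
}
```

Under the old mask a change to bit 9 (MD_SB_HAS_JOURNAL) is rejected; under the new mask bits 8 and 9 may both differ. Bit 10 (MD_SB_JOURNAL_REMOVABLE) is still compared, which appears to be why the patch ORs it into the expected state before the check: a caller that echoes back the value reported by GET_ARRAY_INFO then passes the comparison.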
* Re: [PATCH 2/2] md/r5cache: journal remove support
From: Shaohua Li @ 2017-03-15 22:52 UTC
To: Song Liu
Cc: linux-raid, shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen

On Wed, Mar 15, 2017 at 11:28:16AM -0700, Song Liu wrote:
> When journal device of an array fails, the array is forced into read-only
> mode. To make the array normal without adding another journal device, we
> need to remove journal _feature_ from the array.
>
> This patch allows remove journal _feature_ from an array, For journal
> remove to work, existing journal should be either missing or faulty.
>
> Two flags are added to GET_ARRAY_INFO for mdadm.
> 1. MD_SB_HAS_JOURNAL: meaning the array have journal feature;
> 2. MD_SB_JOURNAL_REMOVABLE: meaning the journal is faulty or missing
>
> When both flags are set, mdadm can clear MD_SB_HAS_JOURNAL to remove
> journal _feature_.

please use the new 'consistency_policy' interface to remove journal support

> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  drivers/md/md.c                | 42 ++++++++++++++++++++++++++++++++++++++++--
>  include/uapi/linux/raid/md_p.h | 11 ++++++++---
>  2 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ac3bd15..32ee994 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5981,6 +5981,25 @@ static void autorun_devices(int part)
>  }
>  #endif /* !MODULE */
>
> +/*
> + * the journal _feature_ is removable when:
> + * the array has journal support &&
> + * (journal is missing || journal is faulty)
> + */
> +static bool journal_removable(struct mddev *mddev)
> +{
> +	struct md_rdev *rdev;
> +
> +	if (!test_bit(MD_HAS_JOURNAL, &mddev->flags))
> +		return false;
> +
> +	rdev_for_each_rcu(rdev, mddev)
> +		if (test_bit(Journal, &rdev->flags) &&
> +		    !test_bit(Faulty, &rdev->flags))
> +			return false;
> +	return true;
> +}
> +
>  static int get_version(void __user *arg)
>  {
>  	mdu_version_t ver;
> @@ -6041,6 +6060,10 @@ static int get_array_info(struct mddev *mddev, void __user *arg)
>  		info.state |= (1<<MD_SB_BITMAP_PRESENT);
>  	if (mddev_is_clustered(mddev))
>  		info.state |= (1<<MD_SB_CLUSTERED);
> +	if (test_bit(MD_HAS_JOURNAL, &mddev->flags))
> +		info.state |= (1<<MD_SB_HAS_JOURNAL);
> +	if (journal_removable(mddev))
> +		info.state |= (1<<MD_SB_JOURNAL_REMOVABLE);
>  	info.active_disks = insync;
>  	info.working_disks = working;
>  	info.failed_disks = failed;
> @@ -6721,6 +6744,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
>  	/* calculate expected state,ignoring low bits */
>  	if (mddev->bitmap && mddev->bitmap_info.offset)
>  		state |= (1 << MD_SB_BITMAP_PRESENT);
> +	if (journal_removable(mddev))
> +		state |= (1 << MD_SB_JOURNAL_REMOVABLE);
>
>  	if (mddev->major_version != info->major_version ||
>  	    mddev->minor_version != info->minor_version ||
> @@ -6730,8 +6755,11 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
>  /*	    mddev->layout != info->layout || */
>  	    mddev->persistent != !info->not_persistent ||
>  	    mddev->chunk_sectors != info->chunk_size >> 9 ||
> -	    /* ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT to change */
> -	    ((state^info->state) & 0xfffffe00)
> +	    /*
> +	     * ignore bottom 8 bits of state, and allow SB_BITMAP_PRESENT
> +	     * and SB_HAS_JOURNAL to change
> +	     */
> +	    ((state^info->state) & 0xfffffc00)
>  		)
>  		return -EINVAL;
>  	/* Check there is only one change */
> @@ -6743,6 +6771,8 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
>  		cnt++;
>  	if ((state ^ info->state) & (1<<MD_SB_BITMAP_PRESENT))
>  		cnt++;
> +	if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL))
> +		cnt++;
>  	if (cnt == 0)
>  		return 0;
>  	if (cnt > 1)
> @@ -6831,6 +6861,14 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
>  			mddev->bitmap_info.offset = 0;
>  		}
>  	}
> +	if ((state ^ info->state) & (1<<MD_SB_HAS_JOURNAL)) {
> +		if (!journal_removable(mddev)) {
> +			rv = -EINVAL;
> +			goto err;
> +		}
> +		clear_bit(MD_HAS_JOURNAL, &mddev->flags);
> +	}
> +
>  	md_update_sb(mddev, 1);
>  	return rv;
>  err:
> diff --git a/include/uapi/linux/raid/md_p.h b/include/uapi/linux/raid/md_p.h
> index d9a1ead..b1f2b63 100644
> --- a/include/uapi/linux/raid/md_p.h
> +++ b/include/uapi/linux/raid/md_p.h
> @@ -1,15 +1,15 @@
>  /*
>     md_p.h : physical layout of Linux RAID devices
>            Copyright (C) 1996-98 Ingo Molnar, Gadi Oxman
> -
> +
>     This program is free software; you can redistribute it and/or modify
>     it under the terms of the GNU General Public License as published by
>     the Free Software Foundation; either version 2, or (at your option)
>     any later version.
> -
> +
>     You should have received a copy of the GNU General Public License
>     (for example /usr/src/linux/COPYING); if not, write to the Free
> -   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> +   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
>  */
>
>  #ifndef _MD_P_H
> @@ -119,6 +119,11 @@ typedef struct mdp_device_descriptor_s {
>
>  #define	MD_SB_CLUSTERED		5 /* MD is clustered */
>  #define	MD_SB_BITMAP_PRESENT	8 /* bitmap may be present nearby */
> +#define	MD_SB_HAS_JOURNAL	9
> +#define	MD_SB_JOURNAL_REMOVABLE	10 /* journal _feature_ can be removed,
> +				    * which means the journal is either
> +				    * missing or Faulty
> +				    */
>
>  /*
>   * Notes:
> --
> 2.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH 1/2] md/r5cache: improve add-journal
From: Shaohua Li @ 2017-03-15 22:51 UTC
To: Song Liu
Cc: linux-raid, shli, neilb, kernel-team, dan.j.williams, hch, jes.sorensen

On Wed, Mar 15, 2017 at 11:28:15AM -0700, Song Liu wrote:
> 1. suspend the array before adding journal, so that we can add journal
>    when the array is not read-only;

we can't call mddev_suspend in raid5d, because there is deadlock.
raid5_add_disk can be called in raid5d.

> 2. allow recreate journal when existing journal is Faulty. So that we can
>    add-journal before removing failed journal.

this is weird usage, why don't we remove the failed journal first?

> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  drivers/md/md.c    | 5 +++--
>  drivers/md/raid5.c | 6 ++----
>  2 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 42e68b2..ac3bd15 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6230,9 +6230,10 @@ static int add_new_disk(struct mddev *mddev, mdu_disk_info_t *info)
>  			struct md_rdev *rdev2;
>  			bool has_journal = false;
>
> -			/* make sure no existing journal disk */
> +			/* make sure no active journal disk */
>  			rdev_for_each(rdev2, mddev) {
> -				if (test_bit(Journal, &rdev2->flags)) {
> +				if (test_bit(Journal, &rdev2->flags) &&
> +				    !test_bit(Faulty, &rdev2->flags)) {
>  					has_journal = true;
>  					break;
>  				}
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 447d9dd..ee8648b 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -7758,11 +7758,9 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
>  			return -EBUSY;
>
>  		rdev->raid_disk = 0;
> -		/*
> -		 * The array is in readonly mode if journal is missing, so no
> -		 * write requests running. We should be safe
> -		 */
> +		mddev_suspend(mddev);
>  		log_init(conf, rdev);
> +		mddev_resume(mddev);
>  		return 0;
>  	}
>  	if (mddev->recovery_disabled == conf->recovery_disabled)
> --
> 2.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH 1/2] md/r5cache: improve add-journal
From: Song Liu @ 2017-03-15 23:48 UTC
To: Shaohua Li
Cc: linux-raid, Shaohua Li, NeilBrown, Kernel Team, Dan Williams, Christoph Hellwig, jes.sorensen@gmail.com

> On Mar 15, 2017, at 3:51 PM, Shaohua Li <shli@kernel.org> wrote:
>
> On Wed, Mar 15, 2017 at 11:28:15AM -0700, Song Liu wrote:
>> 1. suspend the array before adding journal, so that we can add journal
>> when the array is not read-only;
>
> we can't call mddev_suspend in raid5d, because there is deadlock.
> raid5_add_disk can be called in raid5d.

I see. I guess we can just require setting read only before adding journal.

>
>> 2. allow recreate journal when existing journal is Faulty. So that we can
>> add-journal before removing failed journal.
>
> this is weird usage, why don't we remove the failed journal first?

This is really not necessary. I guess we can just drop this patch.

Thanks,
Song