linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly
@ 2016-08-19 22:34 Song Liu
  2016-08-19 22:34 ` [PATCH 2/2] r5cache: remove journal support Song Liu
  2016-08-23  5:10 ` [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly Shaohua Li
  0 siblings, 2 replies; 5+ messages in thread
From: Song Liu @ 2016-08-19 22:34 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Song Liu

Currently, the code sets MD_JOURNAL_CLEAN when the array has
MD_FEATURE_JOURNAL and the recovery_cp is MaxSector. The array
will be MD_JOURNAL_CLEAN even if the journal device is missing.

With this patch, the MD_JOURNAL_CLEAN is only set when the journal
device presents.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 drivers/md/md.c    |  5 +----
 drivers/md/raid5.c | 13 ++++++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 51bc35f..46917f7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1604,11 +1604,8 @@ static int super_1_validate(struct mddev *mddev, struct md_rdev *rdev)
 			mddev->new_chunk_sectors = mddev->chunk_sectors;
 		}
 
-		if (le32_to_cpu(sb->feature_map) & MD_FEATURE_JOURNAL) {
+		if (le32_to_cpu(sb->feature_map) & MD_FEATURE_JOURNAL)
 			set_bit(MD_HAS_JOURNAL, &mddev->flags);
-			if (mddev->recovery_cp == MaxSector)
-				set_bit(MD_JOURNAL_CLEAN, &mddev->flags);
-		}
 	} else if (mddev->pers == NULL) {
 		/* Insist of good event counter while assembling, except for
 		 * spares (which don't need an event count) */
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 4f8f524..2119e09 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6840,11 +6840,14 @@ static int raid5_run(struct mddev *mddev)
 	if (IS_ERR(conf))
 		return PTR_ERR(conf);
 
-	if (test_bit(MD_HAS_JOURNAL, &mddev->flags) && !journal_dev) {
-		printk(KERN_ERR "md/raid:%s: journal disk is missing, force array readonly\n",
-		       mdname(mddev));
-		mddev->ro = 1;
-		set_disk_ro(mddev->gendisk, 1);
+	if (test_bit(MD_HAS_JOURNAL, &mddev->flags)) {
+		if (!journal_dev) {
+			pr_err("md/raid:%s: journal disk is missing, force array readonly\n",
+			       mdname(mddev));
+			mddev->ro = 1;
+			set_disk_ro(mddev->gendisk, 1);
+		} else if (mddev->recovery_cp == MaxSector)
+			set_bit(MD_JOURNAL_CLEAN, &mddev->flags);
 	}
 
 	conf->min_offset_diff = min_offset_diff;
-- 
2.8.0.rc2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] r5cache: remove journal support
  2016-08-19 22:34 [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly Song Liu
@ 2016-08-19 22:34 ` Song Liu
  2016-08-23  5:32   ` Shaohua Li
  2016-08-23  5:10 ` [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly Shaohua Li
  1 sibling, 1 reply; 5+ messages in thread
From: Song Liu @ 2016-08-19 22:34 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Song Liu

In current r5cache, when the journal device is broken, the raid
array is forced in readonly mode. There is no way to remove the
"journal feature", and thus make the array read-write without
journal.

This patch provides sysfs entry r5c_cache_mode that can be used
to remove journal feature.

r5c_cache_mode has 4 different values:
* no-cache;
* write-through (write journal only);
* write-back (w/ write cache feature, which will be added soon);
* broken-cache (journal missing or Faulty)

By writing into r5c_cache_mode, the array can transit from
broken-cache to no-cache, which removes journal feature for the
array.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 drivers/md/raid5-cache.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/md/raid5.c       |  5 ++++
 drivers/md/raid5.h       |  6 +++++
 3 files changed, 75 insertions(+)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 5504ce2..508d470 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -40,6 +40,16 @@
  */
 #define R5L_POOL_SIZE	4
 
+enum r5c_cache_mode {
+	R5C_MODE_NO_CACHE = 0,
+	R5C_MODE_WRITE_THROUGH = 1,
+	R5C_MODE_WRITE_BACK = 2,
+	R5C_MODE_BROKEN_CACHE = 3,
+};
+
+static char *r5c_cache_mode_str[] = {"no-cache", "write-through",
+				     "write-back", "broken-cache"};
+
 struct r5l_log {
 	struct md_rdev *rdev;
 
@@ -97,6 +107,8 @@ struct r5l_log {
 
 	bool need_cache_flush;
 	bool in_teardown;
+
+	enum r5c_cache_mode cache_mode;
 };
 
 /*
@@ -1193,6 +1205,56 @@ ioerr:
 	return ret;
 }
 
+ssize_t r5c_show_cache_mode(struct mddev *mddev, char *page)
+{
+	struct r5conf *conf = mddev->private;
+	int val = 0;
+	int ret = 0;
+
+	if (conf->log)
+		val = conf->log->cache_mode;
+	else if (test_bit(MD_HAS_JOURNAL, &mddev->flags))
+		val = R5C_MODE_BROKEN_CACHE;
+	ret += snprintf(page, PAGE_SIZE - ret, "%d: %s\n",
+			val, r5c_cache_mode_str[val]);
+	return ret;
+}
+
+ssize_t r5c_store_cache_mode(struct mddev *mddev, const char *page, size_t len)
+{
+	struct r5conf *conf = mddev->private;
+	struct r5l_log *log = conf->log;
+	int val;
+
+	if (kstrtoint(page, 10, &val))
+		return -EINVAL;
+	if (!log && val != R5C_MODE_NO_CACHE)
+		return -EINVAL;
+	/* currently only support write through (write journal) */
+	if (val < R5C_MODE_NO_CACHE || val > R5C_MODE_WRITE_THROUGH)
+		return -EINVAL;
+	if (val == R5C_MODE_NO_CACHE) {
+		if (conf->log &&
+		    !test_bit(Faulty, &log->rdev->flags)) {
+			pr_err("md/raid:%s: journal device is in use, cannot remove it\n",
+			       mdname(mddev));
+			return -EINVAL;
+		}
+	}
+
+	spin_lock_irq(&conf->device_lock);
+	if (log)
+		conf->log->cache_mode = val;
+	if (val == R5C_MODE_NO_CACHE) {
+		clear_bit(MD_HAS_JOURNAL, &mddev->flags);
+		set_bit(MD_UPDATE_SB_FLAGS, &mddev->flags);
+	}
+	spin_unlock_irq(&conf->device_lock);
+	pr_info("%s: setting r5c cache mode to %d: %s\n",
+		       mdname(mddev), val, r5c_cache_mode_str[val]);
+	return len;
+}
+
 int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 {
 	struct request_queue *q = bdev_get_queue(rdev->bdev);
@@ -1246,6 +1308,8 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 	INIT_LIST_HEAD(&log->no_space_stripes);
 	spin_lock_init(&log->no_space_stripes_lock);
 
+	log->cache_mode = R5C_MODE_WRITE_THROUGH;
+
 	if (r5l_load_log(log))
 		goto error;
 
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2119e09..665d853 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6230,6 +6230,10 @@ raid5_group_thread_cnt = __ATTR(group_thread_cnt, S_IRUGO | S_IWUSR,
 				raid5_show_group_thread_cnt,
 				raid5_store_group_thread_cnt);
 
+static struct md_sysfs_entry
+r5c_cache_mode = __ATTR(r5c_cache_mode, S_IRUGO | S_IWUSR,
+			r5c_show_cache_mode, r5c_store_cache_mode);
+
 static struct attribute *raid5_attrs[] =  {
 	&raid5_stripecache_size.attr,
 	&raid5_stripecache_active.attr,
@@ -6237,6 +6241,7 @@ static struct attribute *raid5_attrs[] =  {
 	&raid5_group_thread_cnt.attr,
 	&raid5_skip_copy.attr,
 	&raid5_rmw_level.attr,
+	&r5c_cache_mode.attr,
 	NULL,
 };
 static struct attribute_group raid5_attrs_group = {
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 517d4b6..ace9675 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -635,4 +635,10 @@ extern void r5l_stripe_write_finished(struct stripe_head *sh);
 extern int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
 extern void r5l_quiesce(struct r5l_log *log, int state);
 extern bool r5l_log_disk_error(struct r5conf *conf);
+
+
+extern ssize_t r5c_show_cache_mode(struct mddev *mddev, char *page);
+extern ssize_t
+r5c_store_cache_mode(struct mddev *mddev, const char *page, size_t len);
+
 #endif
-- 
2.8.0.rc2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly
  2016-08-19 22:34 [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly Song Liu
  2016-08-19 22:34 ` [PATCH 2/2] r5cache: remove journal support Song Liu
@ 2016-08-23  5:10 ` Shaohua Li
  1 sibling, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2016-08-23  5:10 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-raid, shli

On Fri, Aug 19, 2016 at 03:34:01PM -0700, Song Liu wrote:
> Currently, the code sets MD_JOURNAL_CLEAN when the array has
> MD_FEATURE_JOURNAL and the recovery_cp is MaxSector. The array
> will be MD_JOURNAL_CLEAN even if the journal device is missing.
> 
> With this patch, the MD_JOURNAL_CLEAN is only set when the journal
> device presents.

Applied this one, thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] r5cache: remove journal support
  2016-08-19 22:34 ` [PATCH 2/2] r5cache: remove journal support Song Liu
@ 2016-08-23  5:32   ` Shaohua Li
  2016-08-23  6:12     ` Song Liu
  0 siblings, 1 reply; 5+ messages in thread
From: Shaohua Li @ 2016-08-23  5:32 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-raid, shli

On Fri, Aug 19, 2016 at 03:34:02PM -0700, Song Liu wrote:
> In current r5cache, when the journal device is broken, the raid
> array is forced in readonly mode. There is no way to remove the
> "journal feature", and thus make the array read-write without
> journal.
> 
> This patch provides sysfs entry r5c_cache_mode that can be used
> to remove journal feature.
> 
> r5c_cache_mode has 4 different values:
> * no-cache;
> * write-through (write journal only);
> * write-back (w/ write cache feature, which will be added soon);
> * broken-cache (journal missing or Faulty)
> 
> By writing into r5c_cache_mode, the array can transit from
> broken-cache to no-cache, which removes journal feature for the
> array.
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  drivers/md/raid5-cache.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/md/raid5.c       |  5 ++++
>  drivers/md/raid5.h       |  6 +++++
>  3 files changed, 75 insertions(+)
> 
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 5504ce2..508d470 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c
> @@ -40,6 +40,16 @@
>   */
>  #define R5L_POOL_SIZE	4
>  
> +enum r5c_cache_mode {
> +	R5C_MODE_NO_CACHE = 0,
> +	R5C_MODE_WRITE_THROUGH = 1,
> +	R5C_MODE_WRITE_BACK = 2,
> +	R5C_MODE_BROKEN_CACHE = 3,
The idea of setting different modes makes sense.
But this is a little confusing. The first three are modes, the last one is status.
We can't set BROKEN_CACHE mode, right?

> +};
> +
> +static char *r5c_cache_mode_str[] = {"no-cache", "write-through",
> +				     "write-back", "broken-cache"};
> +
>  struct r5l_log {
>  	struct md_rdev *rdev;
>  
> @@ -97,6 +107,8 @@ struct r5l_log {
>  
>  	bool need_cache_flush;
>  	bool in_teardown;
> +
> +	enum r5c_cache_mode cache_mode;
>  };
>  
>  /*
> @@ -1193,6 +1205,56 @@ ioerr:
>  	return ret;
>  }
>  
> +ssize_t r5c_show_cache_mode(struct mddev *mddev, char *page)
> +{
> +	struct r5conf *conf = mddev->private;
> +	int val = 0;
> +	int ret = 0;
> +
> +	if (conf->log)
> +		val = conf->log->cache_mode;
> +	else if (test_bit(MD_HAS_JOURNAL, &mddev->flags))
> +		val = R5C_MODE_BROKEN_CACHE;
> +	ret += snprintf(page, PAGE_SIZE - ret, "%d: %s\n",
> +			val, r5c_cache_mode_str[val]);
> +	return ret;
> +}
> +
> +ssize_t r5c_store_cache_mode(struct mddev *mddev, const char *page, size_t len)
> +{
> +	struct r5conf *conf = mddev->private;
> +	struct r5l_log *log = conf->log;
> +	int val;
> +
> +	if (kstrtoint(page, 10, &val))
> +		return -EINVAL;
> +	if (!log && val != R5C_MODE_NO_CACHE)
> +		return -EINVAL;
> +	/* currently only support write through (write journal) */
> +	if (val < R5C_MODE_NO_CACHE || val > R5C_MODE_WRITE_THROUGH)
> +		return -EINVAL;
> +	if (val == R5C_MODE_NO_CACHE) {
> +		if (conf->log &&
> +		    !test_bit(Faulty, &log->rdev->flags)) {
> +			pr_err("md/raid:%s: journal device is in use, cannot remove it\n",
> +			       mdname(mddev));
> +			return -EINVAL;
> +		}
> +	}
> +
> +	spin_lock_irq(&conf->device_lock);
> +	if (log)
> +		conf->log->cache_mode = val;
> +	if (val == R5C_MODE_NO_CACHE) {
> +		clear_bit(MD_HAS_JOURNAL, &mddev->flags);
> +		set_bit(MD_UPDATE_SB_FLAGS, &mddev->flags);

If the journal disk is Faulty and we clear HAS_JOURNAL, what's role of journal
disk?

This sounds incomplete. If we assemble the array and journal disk is missing,
can we set the mode to NO_CACHE and allow the array writeable?

> +	}
> +	spin_unlock_irq(&conf->device_lock);
> +	pr_info("%s: setting r5c cache mode to %d: %s\n",
> +		       mdname(mddev), val, r5c_cache_mode_str[val]);
> +	return len;
> +}
> +
>  int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
>  {
>  	struct request_queue *q = bdev_get_queue(rdev->bdev);
> @@ -1246,6 +1308,8 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
>  	INIT_LIST_HEAD(&log->no_space_stripes);
>  	spin_lock_init(&log->no_space_stripes_lock);
>  
> +	log->cache_mode = R5C_MODE_WRITE_THROUGH;
> +
>  	if (r5l_load_log(log))
>  		goto error;
>  
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 2119e09..665d853 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6230,6 +6230,10 @@ raid5_group_thread_cnt = __ATTR(group_thread_cnt, S_IRUGO | S_IWUSR,
>  				raid5_show_group_thread_cnt,
>  				raid5_store_group_thread_cnt);
>  
> +static struct md_sysfs_entry
> +r5c_cache_mode = __ATTR(r5c_cache_mode, S_IRUGO | S_IWUSR,
> +			r5c_show_cache_mode, r5c_store_cache_mode);
> +
>  static struct attribute *raid5_attrs[] =  {
>  	&raid5_stripecache_size.attr,
>  	&raid5_stripecache_active.attr,
> @@ -6237,6 +6241,7 @@ static struct attribute *raid5_attrs[] =  {
>  	&raid5_group_thread_cnt.attr,
>  	&raid5_skip_copy.attr,
>  	&raid5_rmw_level.attr,
> +	&r5c_cache_mode.attr,
>  	NULL,
>  };
>  static struct attribute_group raid5_attrs_group = {
> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
> index 517d4b6..ace9675 100644
> --- a/drivers/md/raid5.h
> +++ b/drivers/md/raid5.h
> @@ -635,4 +635,10 @@ extern void r5l_stripe_write_finished(struct stripe_head *sh);
>  extern int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
>  extern void r5l_quiesce(struct r5l_log *log, int state);
>  extern bool r5l_log_disk_error(struct r5conf *conf);
> +
> +
> +extern ssize_t r5c_show_cache_mode(struct mddev *mddev, char *page);
> +extern ssize_t
> +r5c_store_cache_mode(struct mddev *mddev, const char *page, size_t len);

Instead of export two functions, you can move r5c_cache_mode sysfs entry to raid5-cache.c
and export it.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] r5cache: remove journal support
  2016-08-23  5:32   ` Shaohua Li
@ 2016-08-23  6:12     ` Song Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2016-08-23  6:12 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-raid@vger.kernel.org, Shaohua Li


>>> On 8/22/16, 10:32 PM, "Shaohua Li" <shli@kernel.org> wrote:
>>  
>    > +enum r5c_cache_mode {
>    > +	R5C_MODE_NO_CACHE = 0,
>    > +	R5C_MODE_WRITE_THROUGH = 1,
>    > +	R5C_MODE_WRITE_BACK = 2,
>    > +	R5C_MODE_BROKEN_CACHE = 3,
>    The idea of setting different modes makes sense.
>    But this is a little confusing. The first three are modes, the last one is status.
>    We can't set BROKEN_CACHE mode, right?

How about we name it as r5c_cache_state? Splitting it into two entries 
seems an overkill to me. 

>> +	spin_lock_irq(&conf->device_lock);
>    > +	if (log)
>    > +		conf->log->cache_mode = val;
>    > +	if (val == R5C_MODE_NO_CACHE) {
>    > +		clear_bit(MD_HAS_JOURNAL, &mddev->flags);
>    > +		set_bit(MD_UPDATE_SB_FLAGS, &mddev->flags);
    
>    If the journal disk is Faulty and we clear HAS_JOURNAL, what's role of journal
>    disk?

The on disk role is faulty (0xffff). The in memory data structure will show both 
Faulty and Journal. It works in my test (create -> fail journal -> remove journal -> 
write -> stop -> assemble).  
    
>    This sounds incomplete. If we assemble the array and journal disk is missing,
>    can we set the mode to NO_CACHE and allow the array writeable?

Do you mean setting to NO_CACHE automatically? I think it is not safe, as we 
may assemble and run the array accidentally without the journal. If the array 
is set to NO_CACHE, we will lost pending data on the journal. 
    
>> +extern ssize_t r5c_show_cache_mode(struct mddev *mddev, char *page);
>    > +extern ssize_t
>    > +r5c_store_cache_mode(struct mddev *mddev, const char *page, size_t len);
>    
>    Instead of export two functions, you can move r5c_cache_mode sysfs entry to raid5-cache.c
>    and export it.

Will fix it in the next version. 

Thanks, 
Song


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-23  6:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-19 22:34 [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly Song Liu
2016-08-19 22:34 ` [PATCH 2/2] r5cache: remove journal support Song Liu
2016-08-23  5:32   ` Shaohua Li
2016-08-23  6:12     ` Song Liu
2016-08-23  5:10 ` [PATCH 1/2] r5cache: set MD_JOURNAL_CLEAN correctly Shaohua Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).