Linux RAID subsystem development
 help / color / mirror / Atom feed
From: Shaohua Li <shli@kernel.org>
To: Song Liu <songliubraving@fb.com>
Cc: linux-raid@vger.kernel.org, neilb@suse.com, shli@fb.com,
	kernel-team@fb.com, dan.j.williams@intel.com, hch@infradead.org,
	liuzhengyuan@kylinos.cn, liuyun01@kylinos.cn,
	Jes.Sorensen@redhat.com
Subject: Re: [PATCH 1/2] md/r5cache: disable write back for degraded raid6
Date: Sat, 21 Jan 2017 10:42:38 -0800	[thread overview]
Message-ID: <20170121184238.ukp3i5f4i3wpz2tg@kernel.org> (raw)
In-Reply-To: <20170118235650.2430923-1-songliubraving@fb.com>

On Wed, Jan 18, 2017 at 03:56:49PM -0800, Song Liu wrote:
> raid6 handles write differently in degraded mode. Specifically,
> handle_stripe_fill() is called for writes. As a result, write
> back cache has very little performance benefit for degraded
> raid6. (On the other hand, write back cache does help sequential
> writes on degraded raid4 and raid5).

To be honest I really hate the idea of writeback for degraded array. It adds a
lot of complexity. It maybe improve write performance, but read performance is
always bad, so the disk performance is already bad. Improving write performance
doesn't change disk performance fully. Can't imagine who care about the
performance of degraded array. When a disk is broken, the first thing people
should do is to take action to avoid further disk loss. In other word, you are
optimizing the performance of a corner case. I hope we can delete the logic
later.

> Write back cache for degraded mode also introduces data integrity
> corner cases. This is mostly because handle_stripe_fill() is
> called on write. To avoid handling these corner cases, this patch
> disables write back cache for degraded raid6.
> 
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  drivers/md/raid5-cache.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 4957297..b31ae41 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c
> @@ -2371,6 +2371,16 @@ int r5c_try_caching_write(struct r5conf *conf,
>  		set_bit(STRIPE_R5C_CACHING, &sh->state);
>  	}
>  
> +	/*
> +	 * When raid6 array runs in degraded mode, handle_stripe_fill() is
> +	 * called on every write. So write back cache doesn't help the
> +	 * performance. To simplify the code, do write-through.
> +	 */
> +	if (conf->level == 6 && s->failed) {
> +		r5c_make_stripe_write_out(sh);
> +		return -EAGAIN;
> +	}

Instead of this adhoc handling, why don't we fully switch to writethrough mode
after disk is broken?

Thanks,
Shaohua

      parent reply	other threads:[~2017-01-21 18:42 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-18 23:56 [PATCH 1/2] md/r5cache: disable write back for degraded raid6 Song Liu
2017-01-18 23:56 ` [PATCH 2/2] md/r5cache: improve journal device efficiency Song Liu
2017-01-21 18:54   ` Shaohua Li
2017-01-21 18:42 ` Shaohua Li [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170121184238.ukp3i5f4i3wpz2tg@kernel.org \
    --to=shli@kernel.org \
    --cc=Jes.Sorensen@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=hch@infradead.org \
    --cc=kernel-team@fb.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=liuyun01@kylinos.cn \
    --cc=liuzhengyuan@kylinos.cn \
    --cc=neilb@suse.com \
    --cc=shli@fb.com \
    --cc=songliubraving@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox