From: NeilBrown <neilb@suse.de>
To: Eric Mei <meijia@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH] md/raid5: don't do chunk aligned read on degraded array.
Date: Thu, 19 Mar 2015 17:02:15 +1100
Message-ID: <20150319170215.7d6dfd60@notabene.brown>
In-Reply-To: <550A60FF.3050902@gmail.com>
On Wed, 18 Mar 2015 23:39:11 -0600 Eric Mei <meijia@gmail.com> wrote:
> From: Eric Mei <eric.mei@seagate.com>
>
> When array is degraded, read data landed on failed drives will result in
> reading rest of data in a stripe. So a single sequential read would
> result in same data being read twice.
>
> This patch is to avoid chunk aligned read for degraded array. The
> downside is to involve stripe cache which means associated CPU overhead
> and extra memory copy.
>
> Signed-off-by: Eric Mei <eric.mei@seagate.com>
> ---
> drivers/md/raid5.c | 15 ++++++++++++---
> 1 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cd2f96b..763c64a 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4180,8 +4180,12 @@ static int raid5_mergeable_bvec(struct mddev *mddev,
> unsigned int chunk_sectors = mddev->chunk_sectors;
> unsigned int bio_sectors = bvm->bi_size >> 9;
>
> - if ((bvm->bi_rw & 1) == WRITE)
> - return biovec->bv_len; /* always allow writes to be mergeable */
> + /*
> + * always allow writes to be mergeable, read as well if array
> + * is degraded as we'll go through stripe cache anyway.
> + */
> + if ((bvm->bi_rw & 1) == WRITE || mddev->degraded)
> + return biovec->bv_len;
>
> if (mddev->new_chunk_sectors < mddev->chunk_sectors)
> chunk_sectors = mddev->new_chunk_sectors;
> @@ -4656,7 +4660,12 @@ static void make_request(struct mddev *mddev, struct bio * bi)
>
> md_write_start(mddev, bi);
>
> - if (rw == READ &&
> + /*
> + * If array is degraded, better not do chunk aligned read because
> + * later we might have to read it again in order to reconstruct
> + * data on failed drives.
> + */
> + if (rw == READ && mddev->degraded == 0 &&
> mddev->reshape_position == MaxSector &&
> chunk_aligned_read(mddev,bi))
> return;

Thanks for the patch.

However this sort of patch really needs to come with some concrete
performance numbers. Preferably both sequential reads and random reads.

I agree that sequential reads are likely to be faster, but how much faster
are they?

I imagine that this might make random reads a little slower. Does it? By
how much?

Thanks,
NeilBrown