From: NeilBrown <neilb@suse.de>
To: Eric Mei <meijia@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH] md/raid5: don't do chunk aligned read on degraded array.
Date: Thu, 19 Mar 2015 17:02:15 +1100
Message-ID: <20150319170215.7d6dfd60@notabene.brown>
In-Reply-To: <550A60FF.3050902@gmail.com>
On Wed, 18 Mar 2015 23:39:11 -0600 Eric Mei <meijia@gmail.com> wrote:
> From: Eric Mei <eric.mei@seagate.com>
>
> When an array is degraded, a read that lands on the failed drive forces
> the rest of the data in that stripe to be read for reconstruction. So a
> single sequential read results in the same data being read twice.
>
> This patch avoids chunk aligned reads on a degraded array. The downside
> is that reads then go through the stripe cache, with the associated CPU
> overhead and an extra memory copy.
>
> Signed-off-by: Eric Mei <eric.mei@seagate.com>
> ---
> drivers/md/raid5.c | 15 ++++++++++++---
> 1 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cd2f96b..763c64a 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4180,8 +4180,12 @@ static int raid5_mergeable_bvec(struct mddev *mddev,
> unsigned int chunk_sectors = mddev->chunk_sectors;
> unsigned int bio_sectors = bvm->bi_size >> 9;
>
> - if ((bvm->bi_rw & 1) == WRITE)
> - return biovec->bv_len; /* always allow writes to be mergeable */
> + /*
> + * always allow writes to be mergeable, read as well if array
> + * is degraded as we'll go through stripe cache anyway.
> + */
> + if ((bvm->bi_rw & 1) == WRITE || mddev->degraded)
> + return biovec->bv_len;
>
> if (mddev->new_chunk_sectors < mddev->chunk_sectors)
> chunk_sectors = mddev->new_chunk_sectors;
> @@ -4656,7 +4660,12 @@ static void make_request(struct mddev *mddev, struct bio * bi)
>
> md_write_start(mddev, bi);
>
> - if (rw == READ &&
> + /*
> + * If array is degraded, better not do chunk aligned read because
> + * later we might have to read it again in order to reconstruct
> + * data on failed drives.
> + */
> + if (rw == READ && mddev->degraded == 0 &&
> mddev->reshape_position == MaxSector &&
> chunk_aligned_read(mddev,bi))
> return;
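
For reference, the double read described in the quoted commit message can be
put as a rough count of chunk-sized device reads. What follows is only a
back-of-the-envelope sketch, not code from drivers/md/raid5.c: all function
names are hypothetical, it assumes a RAID5 array with exactly one failed
device and a sequential read that covers every data chunk of a stripe, and it
assumes that the reads issued to reconstruct the missing chunk cannot reuse
the chunks already read directly.

```c
/*
 * Idealized count of chunk-sized device reads needed to return all data
 * of one stripe from a RAID5 array with a single failed device.
 * Hypothetical helper names; this is a model, not kernel code.
 */

/* Stripe cache path: each surviving chunk that is needed is read once.
 * If the failed device held data, that is (ndisks - 2) data chunks plus
 * parity; if it held parity, that is (ndisks - 1) data chunks. */
int reads_via_stripe_cache(int ndisks)
{
	return ndisks - 1;
}

/* Chunk aligned path on a degraded array. */
int reads_via_chunk_aligned(int ndisks, int failed_holds_parity)
{
	if (failed_holds_parity)
		return ndisks - 1;	/* every data chunk is read directly */

	/* (ndisks - 2) direct reads of surviving data chunks, then
	 * (ndisks - 1) more reads to rebuild the chunk on the failed device. */
	return (ndisks - 2) + (ndisks - 1);
}

/* Totals over one parity rotation, i.e. ndisks consecutive stripes, in
 * which the failed device holds parity exactly once. */
int rotation_reads_chunk_aligned(int ndisks)
{
	return (ndisks - 1) * reads_via_chunk_aligned(ndisks, 0)
		+ reads_via_chunk_aligned(ndisks, 1);
}

int rotation_reads_stripe_cache(int ndisks)
{
	return ndisks * reads_via_stripe_cache(ndisks);
}
```

For a 4-device array with one device missing, this model gives 18 chunk reads
over four stripes on the chunk aligned path against 12 through the stripe
cache. It ignores caching, request merging and CPU cost, so it only shows the
size of the expected effect and is no substitute for measurements.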
Thanks for the patch.

However this sort of patch really needs to come with some concrete
performance numbers. Preferably both sequential reads and random reads.

I agree that sequential reads are likely to be faster, but how much faster
are they?

I imagine that this might make random reads a little slower. Does it? By
how much?

Thanks,
NeilBrown