From: NeilBrown <neilb@suse.de>
To: stan@hardwarefreak.com
Cc: David Brown <david.brown@hesbynett.no>,
	Michael Tokarev <mjt@tls.msk.ru>,
	Miquel van Smoorenburg <mikevs@xs4all.net>,
	Linux RAID <linux-raid@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: O_DIRECT to md raid 6 is slow
Date: Mon, 20 Aug 2012 10:01:34 +1000
Message-ID: <20120820100134.22b2b056@notabene.brown>
In-Reply-To: <50317804.9010701@hardwarefreak.com>

On Sun, 19 Aug 2012 18:34:28 -0500 Stan Hoeppner <stan@hardwarefreak.com>
wrote:

> On 8/19/2012 9:01 AM, David Brown wrote:
> > I'm sort of jumping in to this thread, so my apologies if I repeat
> > things other people have said already.
> 
> I'm glad you jumped in David.  You made a critical statement of fact
> below which clears some things up.  If you had stated it early on,
> before Miquel stole the thread and moved it to LKML proper, it would
> have short circuited a lot of this discussion.  Which is:
> 
> > AFAIK, there is scope for a few performance optimisations in raid6.  One
> > is that for small writes which only need to change one block, raid5 uses
> > a "short-cut" RMW cycle (read the old data block, read the old parity
> > block, calculate the new parity block, write the new data and parity
> > blocks).  A similar short-cut could be implemented in raid6, though it
> > is not clear how much a difference it would really make.
> 
> Thus my original statement was correct, or at least half correct[1], as
> it pertained to md/RAID6.  Then Miquel switched the discussion to
> md/RAID5 and stated I was all wet.  I wasn't, and neither was Dave
> Chinner.  I was simply unaware of this md/RAID5 single block write RMW
> shortcut.  I'm copying lkml proper on this simply to set the record
> straight.  Not that anyone was paying attention, but it needs to be in
> the same thread in the archives.  The takeaway:
> 

Since we are trying to set the record straight....

> md/RAID6 must read all devices in a RMW cycle.

md/RAID6 must read all data devices (i.e. not parity devices) which it is not
going to write to, in an RMW cycle (which the code actually calls RCW -
reconstruct-write).
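
To make the read set concrete, here is a minimal sketch of which blocks a
reconstruct-write has to fetch under the rule above. It is an illustration
only, not the md driver's code; the function name rcw_reads and its
parameters are hypothetical.

```python
def rcw_reads(n_data, written):
    """Return the data-block indices a reconstruct-write must read.

    n_data  -- number of data blocks in the stripe (parity blocks excluded)
    written -- indices of the data blocks about to be overwritten

    Parity blocks are never read here: they are recomputed from the
    complete set of data blocks (new ones plus the ones read back).
    """
    written = set(written)
    return [i for i in range(n_data) if i not in written]


# A stripe with 4 data blocks where only block 1 is being rewritten:
print(rcw_reads(4, {1}))
```

With a single-block write the read set is every other data block, which is
why small writes are expensive on this path.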

> 
> md/RAID5 takes a shortcut for single block writes, and must only read
> one drive for the RMW cycle.

md/RAID5 uses an alternate mechanism when the number of data blocks that need
to be written is less than half the number of data blocks in a stripe.  In
this alternate mechanism (which the code calls RMW - read-modify-write),
md/RAID5 reads all the blocks that it is about to write to, plus the parity
block.  It then computes the new parity and writes it out along with the new
data.
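
The parity arithmetic behind that description can be sketched with plain XOR
parity. This is a toy model under stated assumptions, not the kernel
implementation: blocks are equal-length byte strings, and the names
xor_blocks, rmw_parity, rcw_parity and uses_rmw are hypothetical. uses_rmw
simply encodes the "less than half" rule as stated above; the real driver's
decision logic may differ in detail.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def uses_rmw(n_write, n_data):
    """The rule as described: RMW when fewer than half the data blocks change."""
    return n_write < n_data / 2


def rmw_parity(old_parity, old_data, new_data):
    """Read-modify-write: needs only the old parity and the old contents
    of the blocks being written (dicts mapping index -> bytes)."""
    parity = old_parity
    for i in new_data:
        # XOR the old data out of the parity, then XOR the new data in.
        parity = xor_blocks(parity, old_data[i], new_data[i])
    return parity


def rcw_parity(stripe):
    """Reconstruct-write: recompute parity from every data block."""
    return xor_blocks(*stripe)


# Both paths must agree on the resulting parity.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\x33\x44"]
parity = rcw_parity(stripe)
new = {1: b"\xaa\x55"}
updated = [new.get(i, blk) for i, blk in enumerate(stripe)]
assert rmw_parity(parity, {1: stripe[1]}, new) == rcw_parity(updated)
```

The RMW path here touches two blocks (one data, one parity) regardless of
stripe width, whereas the reconstruct path reads every data block it is not
writing.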

> 
> [1] The only thing that's not clear at this point is if md/RAID6 also
> always writes back all chunks during RMW, or only the chunk that has
> changed.

Do you seriously imagine anyone would write code to write out data which it
is known has not changed?  Sad. :-)

NeilBrown
