linux-raid.vger.kernel.org archive mirror
* raid5/raid6 write performance question
@ 2011-02-17 18:52 Patrick J. LoPresti
  2011-02-17 20:13 ` Piergiorgio Sartor
  2011-02-18  9:56 ` David Brown
From: Patrick J. LoPresti @ 2011-02-17 18:52 UTC (permalink / raw)
  To: linux-raid

I have a fair amount of experience with hardware RAID devices, but now
I am investigating Linux software RAID and I have a question.  Well, a
few questions.

The classic problem for RAID5/RAID6 write performance, especially when
striping across many drives, is that a single small write requires
reading in the entire stripe from all disks to calculate the new
syndrome block(s).
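
As a rough illustration of the arithmetic involved, here is a minimal sketch of a small write for RAID5-style XOR parity only (the RAID6 Q syndrome uses Galois-field arithmetic and is not shown). The function names are hypothetical and this is not how the md driver is written. Strictly speaking, a small write need not read the whole stripe: parity can be updated either from the old data chunk plus the old parity (read-modify-write) or from the other data chunks (reconstruct-write). Either way, something has to be read before the write can complete.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_parity(data_chunks):
    """Parity of a complete stripe: needs every data chunk, but no disk reads
    if all of them are already in memory."""
    return xor_blocks(*data_chunks)


def rmw_parity(old_parity: bytes, old_chunk: bytes, new_chunk: bytes) -> bytes:
    """Read-modify-write: two reads (old data chunk, old parity)."""
    return xor_blocks(old_parity, old_chunk, new_chunk)


def reconstruct_parity(data_chunks, index: int, new_chunk: bytes) -> bytes:
    """Reconstruct-write: read the other n-1 data chunks instead."""
    others = [c for i, c in enumerate(data_chunks) if i != index]
    return xor_blocks(new_chunk, *others)


# Hypothetical stripe: four data chunks of four bytes each.
stripe = [bytes([value] * 4) for value in (1, 2, 3, 4)]
parity = full_parity(stripe)

# Overwrite chunk 2 and update parity both ways; the results must agree.
new_chunk = b"\xff" * 4
p_rmw = rmw_parity(parity, stripe[2], new_chunk)
p_rcw = reconstruct_parity(stripe, 2, new_chunk)
assert p_rmw == p_rcw
```

With many drives, read-modify-write costs two reads regardless of stripe width, while reconstruct-write costs one read per untouched data chunk.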

Hardware RAID controllers typically mitigate this problem by using a
sizable (512 MiB to 4 GiB) non-volatile write-back cache, in the hopes
that enough blocks will be written in a short period of time to
populate an entire stripe.  Once an entire stripe is in the write-back
cache, it can be written out with its syndrome blocks without having
to read anything.
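
The idea can be sketched as follows, assuming XOR parity and a toy in-memory cache. The `StripeCache` class and its fields are hypothetical, invented for illustration, and do not describe any particular controller's firmware.

```python
from functools import reduce


class StripeCache:
    """Toy write-back cache that flushes a stripe once all its data chunks
    have been written, so parity is computed without reading from disk."""

    def __init__(self, data_disks: int):
        self.data_disks = data_disks
        self.pending = {}   # stripe number -> {chunk index: bytes}
        self.flushed = []   # (stripe number, data chunks, parity) tuples

    def write(self, stripe: int, chunk_index: int, data: bytes) -> bool:
        """Buffer one chunk; return True if this write completed the stripe."""
        chunks = self.pending.setdefault(stripe, {})
        chunks[chunk_index] = data
        if len(chunks) < self.data_disks:
            # Partial stripe: a real cache would eventually have to flush it
            # using reads from disk; that path is not modeled here.
            return False
        ordered = [chunks[i] for i in range(self.data_disks)]
        parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*ordered))
        self.flushed.append((stripe, ordered, parity))
        del self.pending[stripe]
        return True


cache = StripeCache(data_disks=3)
first = cache.write(0, 0, b"\x01\x01")
second = cache.write(0, 2, b"\x04\x04")
third = cache.write(0, 1, b"\x02\x02")   # completes stripe 0
```

The larger the cache, the longer a partial stripe can wait for its remaining chunks to arrive before it has to be flushed the expensive way.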

Of course, the cache has to be non-volatile (battery backed or solid
state), because the kernel expects data it has written to disk not to
vanish in a power failure.

My question is this:  How does Linux RAID5/RAID6 avoid reading an
entire stripe every time the kernel flushes a single page?  Does it
have a (volatile?) cache?  Or does it rely on the kernel flushing lots
of contiguous data in a single request?  Or something else?

Does Linux RAID keep track of which disk blocks have already been
written at least once, so that there is a difference between writing a
block for the first time and updating it later?  (But I guess that
would not make sense, since eventually all writes become updates as
files are created and deleted.)

Thanks.

 - Pat
