linux-raid.vger.kernel.org archive mirror
* [RFC] Hardware RAID offload
@ 2004-07-11 20:17 Jeff Garzik
  2004-07-12 12:09 ` Alan Cox
From: Jeff Garzik @ 2004-07-11 20:17 UTC
  To: linux-raid; +Cc: Device mapper devel list, Jens Axboe, Alan Cox


Food for comment, no specific issues or questions.

Some of the SATA controllers on the market are in a grey area between 
"completely non-RAID" and "completely hardware RAID".  These 
"in-between" SATA controllers provide some features which can be used to 
accelerate RAID in certain cases, and it would be nice to be able to 
make use of these features.  In one case, -not- making use of these 
"RAID offload" features causes a distinct performance loss.

Here's a rough description of the features provided.

1) Transaction sequencing.  Consider that N disk transactions comprise a 
single RAID1 write.  The hardware can be set up to wait until all N 
transactions are complete, before sending an interrupt.  This is 
applicable to Marvell and Promise SATA, among others.

Block layer comments:  Not really compatible with the way the Linux 
block layer works, but who knows, maybe some genius has ideas.
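
A minimal user-space sketch of the software analogue of this, assuming 
a per-write counter of outstanding member transactions.  All names here 
(raid1_write, raid1_member_complete, ...) are hypothetical and are not 
part of the Linux block layer or of any driver.  The hardware feature 
would collapse the N per-disk interrupts into one; in software, the 
same "notify only when all N are done" rule looks like this:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not a Linux block-layer API. */
struct raid1_write {
    atomic_int pending;                 /* member transactions still in flight */
    void (*done)(struct raid1_write *); /* runs once, after the last member */
    int completions;                    /* how often done() ran (illustration) */
};

static void raid1_write_init(struct raid1_write *w, int nr_members,
                             void (*done)(struct raid1_write *))
{
    assert(nr_members > 0);
    atomic_init(&w->pending, nr_members);
    w->done = done;
    w->completions = 0;
}

/* Called once per finished member transaction. */
static void raid1_member_complete(struct raid1_write *w)
{
    /* atomic_fetch_sub returns the previous value; 1 means we were last. */
    if (atomic_fetch_sub(&w->pending, 1) == 1 && w->done)
        w->done(w);
}

/* Example callback: just counts how many times the write was completed. */
static void count_completion(struct raid1_write *w)
{
    w->completions++;
}
```

With three members, the callback stays silent for the first two 
completions and fires exactly once on the third.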


2) Copy elimination.  All disk transactions on the Promise SX4 go 
through an on-board DIMM (128M - 2G), before being sent to the attached 
controllers.  I would love to use this to eliminate data duplication on 
RAID1 and RAID5 writes.
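
To put a number on the duplication, here is a small sketch of the bus 
arithmetic.  It assumes (this is an assumption, not something confirmed 
for the SX4) that data staged once in the on-board DIMM can be sent to 
every mirror without crossing the host bus again.  The helper name 
raid1_bus_bytes is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes crossing the host bus for one RAID1 write of `len` bytes to
 * `nr_mirrors` disks.  Hypothetical helper for illustration only.
 *
 * staged_on_card == 0: the host sends one copy per mirror.
 * staged_on_card != 0: the host sends one copy to the on-board DIMM
 *                      (assumed: the card fans it out to the disks).
 */
static uint64_t raid1_bus_bytes(uint64_t len, unsigned nr_mirrors,
                                int staged_on_card)
{
    assert(nr_mirrors > 0);
    return staged_on_card ? len : len * nr_mirrors;
}
```

For a 64 KiB write to a two-disk mirror this gives 128 KiB over the bus 
without staging and 64 KiB with it.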


3) RAID5 XOR offload.  Some Promise (and other) controllers support 
this.  Since modern CPUs are so fast, generally this isn't a useful 
feature by itself.  However, when combined with #2, you can offload 
quite a bit of RAID5 onto the hardware.
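
For reference, the operation being offloaded is a byte-wise XOR across 
the data blocks of a stripe.  A plain, unoptimized sketch (the function 
name raid5_xor_parity is hypothetical; real implementations use wider 
words or SIMD):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* parity[i] = data[0][i] ^ data[1][i] ^ ... ^ data[nr_data-1][i] */
static void raid5_xor_parity(uint8_t *parity, const uint8_t *const *data,
                             size_t nr_data, size_t len)
{
    assert(nr_data > 0);
    for (size_t i = 0; i < len; i++) {
        uint8_t p = 0;
        for (size_t d = 0; d < nr_data; d++)
            p ^= data[d][i];
        parity[i] = p;
    }
}
```

The same routine rebuilds a lost block: XOR the surviving data blocks 
with the parity block and the missing block falls out.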


4) Off-board RAID balancing.  With disk transactions funnelled through 
the bottleneck of an on-board DIMM, the hardware is actually in a better 
position to decide how to balance RAID1/RAID5 reads.  I know of only 
one hardware implementation of that, though.
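
As a point of comparison, a simple software heuristic for the same 
decision might look like the sketch below: prefer the mirror with the 
fewest requests in flight, and break ties by distance from the last 
sector that disk touched.  This is an illustrative policy with 
hypothetical names, not what md or any particular controller does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mirror {
    unsigned in_flight;   /* requests currently queued to this disk */
    uint64_t head_sector; /* last sector this disk accessed */
};

static uint64_t sector_distance(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}

/* Returns the index of the mirror that should service a read at `sector`. */
static size_t pick_read_mirror(const struct mirror *m, size_t nr,
                               uint64_t sector)
{
    assert(nr > 0);
    size_t best = 0;
    for (size_t i = 1; i < nr; i++) {
        uint64_t di = sector_distance(m[i].head_sector, sector);
        uint64_t db = sector_distance(m[best].head_sector, sector);
        if (m[i].in_flight < m[best].in_flight ||
            (m[i].in_flight == m[best].in_flight && di < db))
            best = i;
    }
    return best;
}
```

The hardware sees the real queue state behind the DIMM, which is the 
information this sketch can only approximate from the host side.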

There was a fifth feature, but I forget what it was.  :)

As some of you have no doubt already noted, these features are specific 
to a single controller, while a device-mapper or md RAID need not be. 
To facilitate this, I foresee needing to create a "hardware group" or 
"block device group", which would allow the necessary associations to 
be utilized where available, while being 100% software in all other cases.
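
One way to picture the "hardware group" test, as a sketch only: an 
array qualifies for offload when every member disk hangs off the same 
controller, and stays pure software otherwise.  The struct, the 
controller_id field and the function name are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct member_disk {
    int controller_id;  /* hypothetical: which host adapter owns this disk */
};

/* True if all members share one controller, i.e. form one hardware group. */
static bool array_in_one_hw_group(const struct member_disk *disks, size_t nr)
{
    assert(nr > 0);
    for (size_t i = 1; i < nr; i++)
        if (disks[i].controller_id != disks[0].controller_id)
            return false;
    return true;
}
```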

Or maybe, allow the user to set a flag that tells md to pass a request 
directly through to the low-level driver, in certain situations ("pass 
through all RAID1 writes, but handle everything else in software").  /me 
thinks out loud...
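
A rough sketch of what such a flag could look like, assuming one bit 
per operation type and a separate check that the hardware can actually 
take the request.  Nothing here exists in md; the enum, the OFFLOAD_BIT 
macro and should_pass_through are hypothetical names.

```c
#include <stdbool.h>

enum raid_op { OP_RAID1_READ, OP_RAID1_WRITE, OP_RAID5_READ, OP_RAID5_WRITE };

/* One bit per operation the user allows the low-level driver to handle. */
#define OFFLOAD_BIT(op) (1u << (op))

/*
 * Pass a request through only if the user asked for it AND the hardware
 * is able to take it; everything else stays in software.
 */
static bool should_pass_through(unsigned user_flags, enum raid_op op,
                                bool hw_capable)
{
    return hw_capable && (user_flags & OFFLOAD_BIT(op)) != 0;
}
```

Setting only OFFLOAD_BIT(OP_RAID1_WRITE) gives the "pass through all 
RAID1 writes, but handle everything else in software" policy.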

In general, storage hardware seems to be trending towards "put the fast 
path in hardware, let software handle the rest", which is OK with me...

	Jeff





* Re: [RFC] Hardware RAID offload
  2004-07-11 20:17 [RFC] Hardware RAID offload Jeff Garzik
@ 2004-07-12 12:09 ` Alan Cox
From: Alan Cox @ 2004-07-12 12:09 UTC
  To: Jeff Garzik; +Cc: linux-raid, Device mapper devel list, Jens Axboe

On Sun, 2004-07-11 at 21:17, Jeff Garzik wrote:
> 1) Transaction sequencing.  Consider that N disk transactions comprise a 
> single RAID1 write.  The hardware can be set up to wait until all N 
> transactions are complete, before sending an interrupt.  This is 
> applicable to Marvell and Promise SATA, among others.

So this is basically IRQ mitigation. Can you flip the "cause an IRQ"
status on the fly?

> 2) Copy elimination.  All disk transactions on the Promise SX4 go 
> through an on-board DIMM (128M - 2G), before being sent to the attached 
> controllers.  I would love to use this to eliminate data duplication on 
> RAID1 and RAID5 writes.

Big win for PCI cards because it cuts PCI bandwidth way down. It
doesn't seem like it should be that hard to add to the drivers either,
depending upon how the RAM is presented. Is it mapped into PCI space?

> Or maybe, allow the user to set a flag that tells md to pass a request 
> directly through to the low-level driver, in certain situations ("pass 
> through all RAID1 writes, but handle everything else in software").  /me 
> thinks out loud...

As some kind of driver "md_ops" or as a separate raid1-promise plugin?



