linux-raid.vger.kernel.org archive mirror
* NVRAM support
@ 2006-02-10  9:01 Mirko Benz
  2006-02-10 12:42 ` Erik Mouw
  2006-02-10 17:38 ` Paul Clements
  0 siblings, 2 replies; 14+ messages in thread
From: Mirko Benz @ 2006-02-10  9:01 UTC (permalink / raw)
  To: linux-raid

Hello,

Does a high-speed NVRAM device make sense for Linux SW RAID? E.g. a PCI 
card that exports battery-backed memory.
Could that significantly improve write speed for RAID 5/6 (e.g. via an 
external journal, asynchronous operation and write caching)?

What changes would be required?

Thanks,
Mirko


* Re: NVRAM support
  2006-02-10  9:01 NVRAM support Mirko Benz
@ 2006-02-10 12:42 ` Erik Mouw
  2006-02-10 15:43   ` Bill Davidsen
  2006-02-10 17:38 ` Paul Clements
  1 sibling, 1 reply; 14+ messages in thread
From: Erik Mouw @ 2006-02-10 12:42 UTC (permalink / raw)
  To: Mirko Benz; +Cc: linux-raid

On Fri, Feb 10, 2006 at 10:01:09AM +0100, Mirko Benz wrote:
> Does a high speed NVRAM device makes sense for Linux SW RAID? E.g. a PCI 
> card that exports battery backed memory.

Unless it's very large (i.e.: as large as one of your disks), it
doesn't make sense. It will probably break less often, but it doesn't
help you in case a disk really breaks. It also won't speed up an MD
device much.

> Could that significantly improve write speed for RAID 5/6 (e.g. via an 
> external journal, asynchronous operation and write caching)?

You could use it for an external journal, or you could use it as a swap
device.

> What changes would be required?

None, ext3 supports external journals. Look for the -O option in the
mke2fs manual page. Using the NVRAM device as swap is no different
from using a "normal" swap partition.
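
A minimal sketch of the two mke2fs invocations this refers to, written as
a Python helper that only builds the command lines and does not run them.
The device names /dev/nvdisk0 and /dev/md0 and the helper name are
hypothetical; the -O journal_dev and -J device= options are the ones
described in the mke2fs manual page, and both devices are assumed to use
the same block size.

```python
def external_journal_commands(journal_dev, fs_dev, block_size=4096):
    """Build argv lists for an ext3 filesystem with an external journal.

    Nothing is executed here. The journal device and the filesystem must
    be created with the same block size.
    """
    # Step 1: turn the fast device into a dedicated journal device.
    make_journal = ["mke2fs", "-b", str(block_size),
                    "-O", "journal_dev", journal_dev]
    # Step 2: create the filesystem and attach it to that journal.
    make_fs = ["mke2fs", "-b", str(block_size), "-t", "ext3",
               "-J", f"device={journal_dev}", fs_dev]
    return [make_journal, make_fs]


# Hypothetical names: /dev/nvdisk0 is the NVRAM card, /dev/md0 the array.
for argv in external_journal_commands("/dev/nvdisk0", "/dev/md0"):
    print(" ".join(argv))
```

Review the printed commands before running them by hand; mke2fs destroys
whatever is on the target devices.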


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


* Re: NVRAM support
  2006-02-10 12:42 ` Erik Mouw
@ 2006-02-10 15:43   ` Bill Davidsen
  2006-02-11  1:02     ` dean gaudet
  0 siblings, 1 reply; 14+ messages in thread
From: Bill Davidsen @ 2006-02-10 15:43 UTC (permalink / raw)
  To: Erik Mouw; +Cc: Mirko Benz, linux-raid

Erik Mouw wrote:

> On Fri, Feb 10, 2006 at 10:01:09AM +0100, Mirko Benz wrote:
> > Does a high speed NVRAM device makes sense for Linux SW RAID? E.g. a PCI 
> > card that exports battery backed memory.
>
> Unless it's very large (i.e.: as large as one of your disks), it
> doesn't make sense. It will probably break less often, but it doesn't
> help you in case a disk really breaks. It also won't speed up an MD
> device much.
>
> > Could that significantly improve write speed for RAID 5/6 (e.g. via an 
> > external journal, asynchronous operation and write caching)?
>
> You could use it for an external journal, or you could use it as a swap
> device.

Let me concur: I used an external journal on SSD a decade ago with JFS 
(AIX). If you do a lot of operations that generate journal entries 
(file create, delete, etc.), it will double your performance in some 
cases. Otherwise it really doesn't help much; use as a swap device might 
be more helpful, depending on your config.

> > What changes would be required?
>
> None, ext3 supports external journals. Look for the -O option in the
> mke2fs manual page. Using the NVRAM device as swap is not different
> from a using "normal" swap partition.
>
> Erik


-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979



* Re: NVRAM support
  2006-02-10  9:01 NVRAM support Mirko Benz
  2006-02-10 12:42 ` Erik Mouw
@ 2006-02-10 17:38 ` Paul Clements
  1 sibling, 0 replies; 14+ messages in thread
From: Paul Clements @ 2006-02-10 17:38 UTC (permalink / raw)
  To: Mirko Benz; +Cc: linux-raid

Mirko Benz wrote:

> Does a high speed NVRAM device makes sense for Linux SW RAID? E.g. a PCI 
> card that exports battery backed memory.

Sure. There are a couple of ways I can think of using such a thing:

1) put an md intent bitmap on the NVRAM device for faster resyncs

2) use the NVRAM as a write journal for md to make md raid4/5/6 reliable 
(if the system crashes while an md raid5 is degraded, i.e., missing a 
disk, there is a chance of silent data corruption). The md driver does 
not currently do write journalling, so this would require some code changes.
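
As a rough illustration of option 1, here is a toy write-intent bitmap in
Python. It is a sketch of the general idea only, not md's on-disk format
or code; the class name, the region size and the sector numbers are all
made up for the example.

```python
class IntentBitmap:
    """Toy write-intent bitmap: one flag per fixed-size region of the array."""

    def __init__(self, array_sectors, region_sectors):
        self.region_sectors = region_sectors
        n_regions = -(-array_sectors // region_sectors)  # ceiling division
        self.dirty = [False] * n_regions

    def begin_write(self, sector):
        # The flag must be stable (here: in NVRAM) before the data write starts.
        self.dirty[sector // self.region_sectors] = True

    def end_write(self, sector):
        # Simplification: a real implementation clears the flag lazily,
        # once no writes to the region are in flight.
        self.dirty[sector // self.region_sectors] = False

    def regions_to_resync(self):
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]


bitmap = IntentBitmap(array_sectors=1_000_000, region_sectors=65_536)
bitmap.begin_write(10)        # this write completes normally
bitmap.end_write(10)
bitmap.begin_write(200_000)   # crash before end_write: region stays dirty
print(bitmap.regions_to_resync(), "of", len(bitmap.dirty), "regions need resync")
# prints: [3] of 16 regions need resync
```

After a crash only the regions still flagged dirty have to be resynced,
which is where the speedup comes from; keeping the flags in NVRAM makes
the flag update itself cheap.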

--
Paul


* Re: NVRAM support
  2006-02-10 15:43   ` Bill Davidsen
@ 2006-02-11  1:02     ` dean gaudet
  2006-02-13  9:22       ` Erik Mouw
  0 siblings, 1 reply; 14+ messages in thread
From: dean gaudet @ 2006-02-11  1:02 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Erik Mouw, Mirko Benz, linux-raid

On Fri, 10 Feb 2006, Bill Davidsen wrote:

> Erik Mouw wrote:
> 
> > You could use it for an external journal, or you could use it as a swap
> > device.
> >  
> 
> Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
> you do a lot of operations which generate journal entries, file create,
> delete, etc, then it will double your performance in some cases. Otherwise it
> really doesn't help much, use as a swap device might be more helpful depending
> on your config.

it doesn't seem to make any sense at all to use a non-volatile external 
memory for swap... swap has no purpose past a power outage.

-dean


* Re: NVRAM support
  2006-02-11  1:02     ` dean gaudet
@ 2006-02-13  9:22       ` Erik Mouw
  2006-02-13 11:54         ` Andy Smith
  2006-02-15  8:24         ` Mirko Benz
  0 siblings, 2 replies; 14+ messages in thread
From: Erik Mouw @ 2006-02-13  9:22 UTC (permalink / raw)
  To: dean gaudet; +Cc: Bill Davidsen, Mirko Benz, linux-raid

On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> On Fri, 10 Feb 2006, Bill Davidsen wrote:
> > Erik Mouw wrote:
> > > You could use it for an external journal, or you could use it as a swap
> > > device.
> > >  
> > 
> > Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
> > you do a lot of operations which generate journal entries, file create,
> > delete, etc, then it will double your performance in some cases. Otherwise it
> > really doesn't help much, use as a swap device might be more helpful depending
> > on your config.
> 
> it doesn't seem to make any sense at all to use a non-volatile external 
> memory for swap... swap has no purpose past a power outage.

No, but it is a very fast swap device. Much faster than a hard drive.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
| Data lost? Stay calm and contact Harddisk-recovery.com


* Re: NVRAM support
  2006-02-13  9:22       ` Erik Mouw
@ 2006-02-13 11:54         ` Andy Smith
  2006-02-13 13:35           ` Guy
  2006-02-14 10:17           ` Erik Mouw
  2006-02-15  8:24         ` Mirko Benz
  1 sibling, 2 replies; 14+ messages in thread
From: Andy Smith @ 2006-02-13 11:54 UTC (permalink / raw)
  To: linux-raid

On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
> On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> > it doesn't seem to make any sense at all to use a non-volatile external 
> > memory for swap... swap has no purpose past a power outage.
> 
> No, but it is a very fast swap device. Much faster than a hard drive.

Wouldn't the same amount of money be better spent on RAM then?

-- 
http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby
Encrypted mail welcome - keyid 0x604DE5DB


* RE: NVRAM support
  2006-02-13 11:54         ` Andy Smith
@ 2006-02-13 13:35           ` Guy
  2006-02-14 10:17           ` Erik Mouw
  1 sibling, 0 replies; 14+ messages in thread
From: Guy @ 2006-02-13 13:35 UTC (permalink / raw)
  To: 'Andy Smith', linux-raid

Not the same amount!  Match the size of the NV RAM disk with RAM at a
fraction of the cost.  With the money saved, buy a computer for the kids.
:)

} -----Original Message-----
} From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
} owner@vger.kernel.org] On Behalf Of Andy Smith
} Sent: Monday, February 13, 2006 6:55 AM
} To: linux-raid@vger.kernel.org
} Subject: Re: NVRAM support
} 
} On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
} > On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
} > > it doesn't seem to make any sense at all to use a non-volatile external
} > > memory for swap... swap has no purpose past a power outage.
} >
} > No, but it is a very fast swap device. Much faster than a hard drive.
} 
} Wouldn't the same amount of money be better spent on RAM then?
} 
} --
} http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby
} Encrypted mail welcome - keyid 0x604DE5DB



* Re: NVRAM support
  2006-02-13 11:54         ` Andy Smith
  2006-02-13 13:35           ` Guy
@ 2006-02-14 10:17           ` Erik Mouw
  1 sibling, 0 replies; 14+ messages in thread
From: Erik Mouw @ 2006-02-14 10:17 UTC (permalink / raw)
  To: linux-raid

On Mon, Feb 13, 2006 at 11:54:44AM +0000, Andy Smith wrote:
> On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
> > On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> > > it doesn't seem to make any sense at all to use a non-volatile external 
> > > memory for swap... swap has no purpose past a power outage.
> > 
> > No, but it is a very fast swap device. Much faster than a hard drive.
> 
> Wouldn't the same amount of money be better spent on RAM then?

Sure, but when you happen to have such a device lying idle, this is a
way to use it.

(note that you can also use unused memory on your video adapter as a
fast swap device).


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


* Re: NVRAM support
  2006-02-13  9:22       ` Erik Mouw
  2006-02-13 11:54         ` Andy Smith
@ 2006-02-15  8:24         ` Mirko Benz
  2006-02-15 23:00           ` Neil Brown
  1 sibling, 1 reply; 14+ messages in thread
From: Mirko Benz @ 2006-02-15  8:24 UTC (permalink / raw)
  To: Erik Mouw; +Cc: dean gaudet, Bill Davidsen, linux-raid

Hi,

My intention was not to use an NVRAM device for swap.

Enterprise storage systems use NVRAM for better data protection and faster 
recovery in case of a crash.
Modern CPUs can do RAID calculations very fast. But Linux RAID is 
vulnerable if a crash occurs during a write operation.
E.g. data and parity write requests are issued in parallel but only one 
finishes. This will lead to inconsistent data. It will go undetected and 
cannot be repaired. Right?

How can journaling be implemented within linux-raid?

I have seen a paper that tries this in cooperation with a file system:
"Journal-guided Resynchronization for Software RAID"
www.cs.wisc.edu/adsl/Publications

But I would rather see a solution within md so that other file systems 
or LVM can be used on top of md.

Regards,
Mirko


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: NVRAM support
  2006-02-15  8:24         ` Mirko Benz
@ 2006-02-15 23:00           ` Neil Brown
  2006-02-16 10:05             ` Mario 'BitKoenig' Holbe
  2006-02-20  9:57             ` Mirko Benz
  0 siblings, 2 replies; 14+ messages in thread
From: Neil Brown @ 2006-02-15 23:00 UTC (permalink / raw)
  To: Mirko Benz; +Cc: Erik Mouw, dean gaudet, Bill Davidsen, linux-raid

On Wednesday February 15, mirko.benz@web.de wrote:
> Hi,
> 
> My intention was not to use a NVRAM device for swap.
> 
> Enterprise storage systems use NVRAM for better data protection/faster 
> recovery in case of a crash.
> Modern CPUs can do RAID calculation very fast. But Linux RAID is 
> vulnerable when a crash during a write operation occurs.
> E.g. Data and parity write requests are issued in parallel but only one 
> finishes. This will
> lead to inconsistent data. It will be undetected and can not be 
> repaired. Right?

Wrong.  Well, maybe 5% right.

If the array is degraded, then the inconsistency cannot be detected.
If the array is fully functioning, then any inconsistency will be
corrected by a 'resync'.
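
The two cases can be sketched with plain XOR parity. This is a toy model
of a single RAID 5 stripe in Python, not md code; the block contents and
the helper name xor_blocks are invented for the illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-disk RAID 5: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Crash: the new data block 0 reached its disk, the matching parity did not.
data[0] = b"ZZZZ"

# Array complete: the mismatch is visible, and a resync recomputes parity.
assert xor_blocks(data) != parity
resynced_parity = xor_blocks(data)

# Array degraded (the disk holding block 1 is gone): block 1 has to be
# rebuilt from the survivors and the stale parity, and comes out wrong.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == b"BBBB")   # prints False: corrupted, and nothing flags it
```

With the resynced parity the same rebuild would return the correct block,
which is why the problem only bites when the array is degraded before the
resync can run.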

> 
> How can journaling be implemented within linux-raid?

With a fair bit of work. :-)

> 
> I have seen a paper that tries this in cooperation with a file system:
> „Journal-guided Resynchronization for Software RAID“
> www.cs.wisc.edu/adsl/Publications

This is using the ext3 journal to make the 'resync' (mentioned above)
faster.  Write-intent bitmaps can achieve similar speedups with
different costs.

> 
> But I would rather see a solution within md so that other file systems 
> or LVM can be used on top of md.

Currently there is no solution to the "crash while writing and
degraded on restart means possible silent data corruption" problem.
However, it is, in reality, a very small problem (unless you regularly
run with a degraded array - don't do that).

The only practical fix at the filesystem level is, as you suggest,
journalling to NVRAM.  There is work underway to restructure md/raid5
to be able to off-load the xor and raid6 calculations to dedicated
hardware.  This restructure would also make it a lot easier to journal
raid5 updates thus closing this hole (and also improving write
latency).
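
To make the journalling idea concrete, here is a minimal sketch, assuming
a write-ahead log held in NVRAM that records the new data block and the
new parity before either disk is touched. It models one stripe only and
is in no way the planned md design; the class and its fields are
hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class JournaledStripe:
    """Toy single-stripe RAID 5 with a write-ahead journal standing in for NVRAM."""

    def __init__(self, data):
        self.disks = list(data) + [xor_blocks(data)]   # last slot holds parity
        self.journal = []                              # assumed to survive a crash

    def write(self, index, block, crash_after_data=False):
        new_data = self.disks[:-1]
        new_data[index] = block
        new_parity = xor_blocks(new_data)
        # 1. Log the new data block and the new parity before touching the disks.
        self.journal.append({index: block, len(self.disks) - 1: new_parity})
        # 2. Update the disks; a crash may land between the two writes.
        self.disks[index] = block
        if crash_after_data:
            return
        self.disks[-1] = new_parity
        # 3. Both writes are on disk, so the journal entry can be dropped.
        self.journal.pop()

    def recover(self):
        # Replay is idempotent: it rewrites the logged blocks in full.
        for entry in self.journal:
            for slot, block in entry.items():
                self.disks[slot] = block
        self.journal.clear()


stripe = JournaledStripe([b"AAAA", b"BBBB", b"CCCC"])
stripe.write(0, b"ZZZZ", crash_after_data=True)
stripe.recover()
print(xor_blocks(stripe.disks[:-1]) == stripe.disks[-1])   # prints True
```

Because the log holds complete blocks, replay does not depend on reading
any other disk, so it still works if the array comes back degraded.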

NeilBrown

* Re: NVRAM support
  2006-02-15 23:00           ` Neil Brown
@ 2006-02-16 10:05             ` Mario 'BitKoenig' Holbe
  2006-02-20  9:57             ` Mirko Benz
  1 sibling, 0 replies; 14+ messages in thread
From: Mario 'BitKoenig' Holbe @ 2006-02-16 10:05 UTC (permalink / raw)
  To: linux-raid

Neil Brown <neilb@suse.de> wrote:
> On Wednesday February 15, mirko.benz@web.de wrote:
>> E.g. Data and parity write requests are issued in parallel but only one 
>> finishes. This will
>> lead to inconsistent data. It will be undetected and can not be 
> If the array is degraded, that the inconsistency cannot be detected.

Hmm, if the array is degraded, then there is no redundancy at all, so
there is no chance for any inconsistency.

Btw., this reminds me... now that you have raid6 - when is a raid6
defined to be degraded? Perhaps you have similar issues there as with
raid1 with more than 2 mirrors some months ago (resync was not started
when the 3rd mirror failed and the 1st and 2nd were inconsistent)?

> If the array is fully functioning, then any inconsistency will be
> corrected by a 'resync'.

Yes, because the redundancy is ignored and rebuilt.


regards
   Mario
-- 
Why did the tachyon cross the road?
Because it was on the other side.



* Re: NVRAM support
  2006-02-15 23:00           ` Neil Brown
  2006-02-16 10:05             ` Mario 'BitKoenig' Holbe
@ 2006-02-20  9:57             ` Mirko Benz
  2006-02-20 23:16               ` Neil Brown
  1 sibling, 1 reply; 14+ messages in thread
From: Mirko Benz @ 2006-02-20  9:57 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Hello,

We have applications where large data sets (e.g. 100 MB) are written 
sequentially.
Software RAID could do a full stripe update (without reading/using 
existing data).
Does this happen in parallel? If yes, isn't that data vulnerable when a 
crash occurs?

Thanks,
Mirko


* Re: NVRAM support
  2006-02-20  9:57             ` Mirko Benz
@ 2006-02-20 23:16               ` Neil Brown
  0 siblings, 0 replies; 14+ messages in thread
From: Neil Brown @ 2006-02-20 23:16 UTC (permalink / raw)
  To: Mirko Benz; +Cc: linux-raid

On Monday February 20, mirko.benz@web.de wrote:
> Hello,
> 
> We have applications were large data sets (e.g. 100 MB) are sequentially 
> written.
> Software RAID could do a full stripe update (without reading/using 
> existing data).
> Does this happen in parallel? If yes, isn't that data vulnerable when a 
> crash occurs?

md/raid5 does full stripe writes about 80% of the time when I've
measured it while doing large writes.  I don't know why it is not
closer to 100%.  I suspect some subtle scheduling issue that I
haven't managed to get to the bottom of yet (I should get back to
that).
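
For readers unfamiliar with the distinction, the sketch below contrasts a
full stripe write with a partial (read-modify-write) update using plain
XOR parity. It is a toy Python model with invented block contents, not a
description of how md/raid5 schedules its writes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


old = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
old_parity = xor_blocks(old)

# Full stripe write: every data block is replaced, so parity comes from
# the new data alone and nothing has to be read back from the disks.
new = [b"\xaa\xbb", b"\xcc\xdd", b"\xee\xff"]
full_stripe_parity = xor_blocks(new)

# Partial write: only block 1 changes, so the old block and the old parity
# must be read first (read-modify-write) to derive the new parity.
rmw_parity = xor_blocks([old_parity, old[1], new[1]])

# Both routes agree with parity computed directly over the resulting stripe.
assert rmw_parity == xor_blocks([old[0], new[1], old[2]])
```

The full stripe path avoids the extra reads, which is why large sequential
writes benefit when most updates take that path.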

Data is only vulnerable if, after the crash, the array is degraded.
If the array is still complete after the crash, then there is no loss
of data.

NeilBrown
