* NVRAM support
From: Mirko Benz @ 2006-02-10 9:01 UTC
To: linux-raid

Hello,

Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI
card that exports battery backed memory.

Could that significantly improve write speed for RAID 5/6 (e.g. via an
external journal, asynchronous operation and write caching)?

What changes would be required?

Thanks,
Mirko

* Re: NVRAM support
From: Erik Mouw @ 2006-02-10 12:42 UTC
To: Mirko Benz; +Cc: linux-raid

On Fri, Feb 10, 2006 at 10:01:09AM +0100, Mirko Benz wrote:
> Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI
> card that exports battery backed memory.

Unless it's very large (i.e. as large as one of your disks), it
doesn't make sense. It will probably break less often, but it doesn't
help you in case a disk really breaks. It also won't speed up an MD
device much.

> Could that significantly improve write speed for RAID 5/6 (e.g. via an
> external journal, asynchronous operation and write caching)?

You could use it for an external journal, or you could use it as a swap
device.

> What changes would be required?

None, ext3 supports external journals. Look for the -O option in the
mke2fs manual page. Using the NVRAM device as swap is no different
from using a "normal" swap partition.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

* Re: NVRAM support
From: Bill Davidsen @ 2006-02-10 15:43 UTC
To: Erik Mouw; +Cc: Mirko Benz, linux-raid

Erik Mouw wrote:

>On Fri, Feb 10, 2006 at 10:01:09AM +0100, Mirko Benz wrote:
>
>>Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI
>>card that exports battery backed memory.
>
>Unless it's very large (i.e.: as large as one of your disks), it
>doesn't make sense. It will probably break less often, but it doesn't
>help you in case a disk really breaks. It also won't speed up an MD
>device much.
>
>>Could that significantly improve write speed for RAID 5/6 (e.g. via an
>>external journal, asynchronous operation and write caching)?
>
>You could use it for an external journal, or you could use it as a swap
>device.

Let me concur, I used external journal on SSD a decade ago with jfs
(AIX). If you do a lot of operations which generate journal entries,
file create, delete, etc, then it will double your performance in some
cases. Otherwise it really doesn't help much, use as a swap device might
be more helpful depending on your config.

>>What changes would be required?
>
>None, ext3 supports external journals. Look for the -O option in the
>mke2fs manual page. Using the NVRAM device as swap is no different
>from using a "normal" swap partition.
>
>
>Erik

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

* Re: NVRAM support
From: dean gaudet @ 2006-02-11 1:02 UTC
To: Bill Davidsen; +Cc: Erik Mouw, Mirko Benz, linux-raid

On Fri, 10 Feb 2006, Bill Davidsen wrote:

> Erik Mouw wrote:
>
> > You could use it for an external journal, or you could use it as a swap
> > device.
>
> Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
> you do a lot of operations which generate journal entries, file create,
> delete, etc, then it will double your performance in some cases. Otherwise it
> really doesn't help much, use as a swap device might be more helpful depending
> on your config.

it doesn't seem to make any sense at all to use a non-volatile external
memory for swap... swap has no purpose past a power outage.

-dean

* Re: NVRAM support
From: Erik Mouw @ 2006-02-13 9:22 UTC
To: dean gaudet; +Cc: Bill Davidsen, Mirko Benz, linux-raid

On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> On Fri, 10 Feb 2006, Bill Davidsen wrote:
> > Erik Mouw wrote:
> > > You could use it for an external journal, or you could use it as a swap
> > > device.
> >
> > Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
> > you do a lot of operations which generate journal entries, file create,
> > delete, etc, then it will double your performance in some cases. Otherwise it
> > really doesn't help much, use as a swap device might be more helpful depending
> > on your config.
>
> it doesn't seem to make any sense at all to use a non-volatile external
> memory for swap... swap has no purpose past a power outage.

No, but it is a very fast swap device. Much faster than a hard drive.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
| Data lost? Stay calm and contact Harddisk-recovery.com

* Re: NVRAM support
From: Andy Smith @ 2006-02-13 11:54 UTC
To: linux-raid

On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
> On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> > it doesn't seem to make any sense at all to use a non-volatile external
> > memory for swap... swap has no purpose past a power outage.
>
> No, but it is a very fast swap device. Much faster than a hard drive.

Wouldn't the same amount of money be better spent on RAM then?

-- 
http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby
Encrypted mail welcome - keyid 0x604DE5DB

* RE: NVRAM support
From: Guy @ 2006-02-13 13:35 UTC
To: 'Andy Smith', linux-raid

Not the same amount! Match the size of the NV RAM disk with RAM at a
fraction of the cost. With the money saved, buy a computer for the
kids. :)

} -----Original Message-----
} From: linux-raid-owner@vger.kernel.org
} [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Andy Smith
} Sent: Monday, February 13, 2006 6:55 AM
} To: linux-raid@vger.kernel.org
} Subject: Re: NVRAM support
}
} On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
} > On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
} > > it doesn't seem to make any sense at all to use a non-volatile external
} > > memory for swap... swap has no purpose past a power outage.
} >
} > No, but it is a very fast swap device. Much faster than a hard drive.
}
} Wouldn't the same amount of money be better spent on RAM then?
}
} --
} http://strugglers.net/wiki/Xen_hosting -- A Xen VPS hosting hobby
} Encrypted mail welcome - keyid 0x604DE5DB

* Re: NVRAM support
From: Erik Mouw @ 2006-02-14 10:17 UTC
To: linux-raid

On Mon, Feb 13, 2006 at 11:54:44AM +0000, Andy Smith wrote:
> On Mon, Feb 13, 2006 at 10:22:04AM +0100, Erik Mouw wrote:
> > On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
> > > it doesn't seem to make any sense at all to use a non-volatile external
> > > memory for swap... swap has no purpose past a power outage.
> >
> > No, but it is a very fast swap device. Much faster than a hard drive.
>
> Wouldn't the same amount of money be better spent on RAM then?

Sure, but when you happen to have such a device lying idle, this is a
way to use it. (Note that you can also use unused memory on your video
adapter as a fast swap device.)


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

* Re: NVRAM support
From: Mirko Benz @ 2006-02-15 8:24 UTC
To: Erik Mouw; +Cc: dean gaudet, Bill Davidsen, linux-raid

Hi,

My intention was not to use an NVRAM device for swap.

Enterprise storage systems use NVRAM for better data protection/faster
recovery in case of a crash.
Modern CPUs can do RAID calculation very fast. But Linux RAID is
vulnerable when a crash during a write operation occurs.
E.g. Data and parity write requests are issued in parallel but only one
finishes. This will lead to inconsistent data. It will be undetected and
can not be repaired. Right?

How can journaling be implemented within linux-raid?

I have seen a paper that tries this in cooperation with a file system:
"Journal-guided Resynchronization for Software RAID"
www.cs.wisc.edu/adsl/Publications

But I would rather see a solution within md so that other file systems
or LVM can be used on top of md.

Regards,
Mirko

Erik Mouw wrote:
> On Fri, Feb 10, 2006 at 05:02:02PM -0800, dean gaudet wrote:
>
>> On Fri, 10 Feb 2006, Bill Davidsen wrote:
>>
>>> Erik Mouw wrote:
>>>
>>>> You could use it for an external journal, or you could use it as a swap
>>>> device.
>>>>
>>> Let me concur, I used external journal on SSD a decade ago with jfs (AIX). If
>>> you do a lot of operations which generate journal entries, file create,
>>> delete, etc, then it will double your performance in some cases. Otherwise it
>>> really doesn't help much, use as a swap device might be more helpful depending
>>> on your config.
>>>
>> it doesn't seem to make any sense at all to use a non-volatile external
>> memory for swap... swap has no purpose past a power outage.
>>
>
> No, but it is a very fast swap device. Much faster than a hard drive.
>
>
> Erik
>

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

* Re: NVRAM support
From: Neil Brown @ 2006-02-15 23:00 UTC
To: Mirko Benz; +Cc: Erik Mouw, dean gaudet, Bill Davidsen, linux-raid

On Wednesday February 15, mirko.benz@web.de wrote:
> Hi,
>
> My intention was not to use an NVRAM device for swap.
>
> Enterprise storage systems use NVRAM for better data protection/faster
> recovery in case of a crash.
> Modern CPUs can do RAID calculation very fast. But Linux RAID is
> vulnerable when a crash during a write operation occurs.
> E.g. Data and parity write requests are issued in parallel but only one
> finishes. This will
> lead to inconsistent data. It will be undetected and can not be
> repaired. Right?

Wrong. Well, maybe 5% right.

If the array is degraded, then the inconsistency cannot be detected.
If the array is fully functioning, then any inconsistency will be
corrected by a 'resync'.

>
> How can journaling be implemented within linux-raid?

With a fair bit of work. :-)

>
> I have seen a paper that tries this in cooperation with a file system:
> "Journal-guided Resynchronization for Software RAID"
> www.cs.wisc.edu/adsl/Publications

This is using the ext3 journal to make the 'resync' (mentioned above)
faster. Write-intent bitmaps can achieve similar speedups with
different costs.

>
> But I would rather see a solution within md so that other file systems
> or LVM can be used on top of md.

Currently there is no solution to the "crash while writing and
degraded on restart means possible silent data corruption" problem.
However it is, in reality, a very small problem (unless you regularly
run with a degraded array - don't do that).

The only practical fix at the filesystem level is, as you suggest,
journalling to NVRAM. There is work underway to restructure md/raid5
to be able to off-load the xor and raid6 calculations to dedicated
hardware. This restructure would also make it a lot easier to journal
raid5 updates, thus closing this hole (and also improving write
latency).

NeilBrown

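The difference between "complete after the crash" and "degraded after the crash" can be seen on a single toy stripe. The following Python sketch is only an illustration under stated assumptions: the three-data-chunk layout, the chunk contents and the helper name `xor_blocks` are made up for the example and are not taken from md.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks (how RAID 5 parity is formed)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: three data chunks plus their parity (hypothetical contents).
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Crash in the middle of an update of d0: the new data reached its disk,
# the matching new parity did not, so the parity on disk is stale.
d0_new = b"ZZZZ"
stale_parity = parity

# Case 1: all disks are present after restart. A resync recomputes parity
# from the data chunks, so the stripe is consistent again.
resynced_parity = xor_blocks(d0_new, d1, d2)
d1_after_resync = xor_blocks(d0_new, d2, resynced_parity)

# Case 2: the array is degraded after restart (the disk holding d1 is gone).
# d1 must be reconstructed from the stale parity, and comes out wrong.
d1_degraded = xor_blocks(d0_new, d2, stale_parity)

print("after resync:", d1_after_resync == d1)
print("degraded:    ", d1_degraded == d1)
```

In the degraded case the damaged chunk is `d1`, which was not being written at all; nothing in the array records that the parity was stale, which is why the corruption is silent.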
* Re: NVRAM support
From: Mario 'BitKoenig' Holbe @ 2006-02-16 10:05 UTC
To: linux-raid

Neil Brown <neilb@suse.de> wrote:
> On Wednesday February 15, mirko.benz@web.de wrote:
>> E.g. Data and parity write requests are issued in parallel but only one
>> finishes. This will
>> lead to inconsistent data. It will be undetected and can not be
> If the array is degraded, then the inconsistency cannot be detected.

Hmm, if the array is degraded, then there is no redundancy at all, so
there is no chance for any inconsistency.

Btw., this reminds me... now that you have raid6: when is a raid6
defined to be degraded? Perhaps you have similar issues there as with
raid1 with more than 2 mirrors some months ago (resync was not started
when the 3rd mirror failed and the 1st and 2nd were inconsistent)?

> If the array is fully functioning, then any inconsistency will be
> corrected by a 'resync'.

Yes, because the redundancy is ignored and rebuilt.


regards
   Mario
-- 
Why did the tachyon cross the road?
Because it was on the other side.

* Re: NVRAM support
From: Mirko Benz @ 2006-02-20 9:57 UTC
To: Neil Brown; +Cc: linux-raid

Hello,

We have applications where large data sets (e.g. 100 MB) are sequentially
written.
Software RAID could do a full stripe update (without reading/using
existing data).
Does this happen in parallel? If yes, isn't that data vulnerable when a
crash occurs?

Thanks,
Mirko

Neil Brown wrote:
> On Wednesday February 15, mirko.benz@web.de wrote:
>
>> Hi,
>>
>> My intention was not to use an NVRAM device for swap.
>>
>> Enterprise storage systems use NVRAM for better data protection/faster
>> recovery in case of a crash.
>> Modern CPUs can do RAID calculation very fast. But Linux RAID is
>> vulnerable when a crash during a write operation occurs.
>> E.g. Data and parity write requests are issued in parallel but only one
>> finishes. This will
>> lead to inconsistent data. It will be undetected and can not be
>> repaired. Right?
>>
>
> Wrong. Well, maybe 5% right.
>
> If the array is degraded, then the inconsistency cannot be detected.
> If the array is fully functioning, then any inconsistency will be
> corrected by a 'resync'.
>
>
>> How can journaling be implemented within linux-raid?
>>
>
> With a fair bit of work. :-)
>
>
>> I have seen a paper that tries this in cooperation with a file system:
>> "Journal-guided Resynchronization for Software RAID"
>> www.cs.wisc.edu/adsl/Publications
>>
>
> This is using the ext3 journal to make the 'resync' (mentioned above)
> faster. Write-intent bitmaps can achieve similar speedups with
> different costs.
>
>
>> But I would rather see a solution within md so that other file systems
>> or LVM can be used on top of md.
>>
>
> Currently there is no solution to the "crash while writing and
> degraded on restart means possible silent data corruption" problem.
> However it is, in reality, a very small problem (unless you regularly
> run with a degraded array - don't do that).
>
> The only practical fix at the filesystem level is, as you suggest,
> journalling to NVRAM. There is work underway to restructure md/raid5
> to be able to off-load the xor and raid6 calculations to dedicated
> hardware. This restructure would also make it a lot easier to journal
> raid5 updates thus closing this hole (and also improving write
> latency).
>
> NeilBrown
>

* Re: NVRAM support
From: Neil Brown @ 2006-02-20 23:16 UTC
To: Mirko Benz; +Cc: linux-raid

On Monday February 20, mirko.benz@web.de wrote:
> Hello,
>
> We have applications where large data sets (e.g. 100 MB) are sequentially
> written.
> Software RAID could do a full stripe update (without reading/using
> existing data).
> Does this happen in parallel? If yes, isn't that data vulnerable when a
> crash occurs?

md/raid5 does full stripe writes about 80% of the time when I've
measured it while doing large writes. I don't know why it is not
closer to 100%. I suspect some subtle scheduling issue that I haven't
managed to get to the bottom of yet (I should get back to that).

Data is only vulnerable if, after the crash, the array is degraded.
If the array is still complete after the crash, then there is no loss
of data.

NeilBrown

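What separates a full stripe write from a partial update is where the parity comes from. The Python sketch below shows the arithmetic only. The function names and chunk values are hypothetical, and nothing here reflects how md/raid5 actually schedules or batches its I/O.

```python
from functools import reduce


def xor2(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(chunks: list[bytes]) -> bytes:
    """Full stripe write: parity comes from the new data alone, no reads needed."""
    return reduce(xor2, chunks)


def rmw_parity(old_parity: bytes, old_chunk: bytes, new_chunk: bytes) -> bytes:
    """Partial update (read-modify-write): the old chunk and the old parity
    must first be read back from disk before the new parity can be formed."""
    return xor2(xor2(old_parity, old_chunk), new_chunk)


# Hypothetical stripe with three data chunks.
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = full_stripe_parity(stripe)

# Rewrite only the first chunk and update parity the read-modify-write way.
new_chunk = b"\xff\x00"
parity_rmw = rmw_parity(parity, stripe[0], new_chunk)

# Both routes must agree on the resulting parity.
parity_full = full_stripe_parity([new_chunk, stripe[1], stripe[2]])
print(parity_rmw == parity_full)
```

Both routes give the same parity; the full stripe route simply avoids the extra reads, which is why large sequential writes benefit when they can be issued as whole stripes.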
* Re: NVRAM support
From: Paul Clements @ 2006-02-10 17:38 UTC
To: Mirko Benz; +Cc: linux-raid

Mirko Benz wrote:

> Does a high speed NVRAM device make sense for Linux SW RAID? E.g. a PCI
> card that exports battery backed memory.

Sure. There are a couple of ways I can think of using such a thing:

1) put an md intent bitmap on the NVRAM device for faster resyncs

2) use the NVRAM as a write journal for md to make md raid4/5/6
reliable (if the system crashes while an md raid5 is degraded, i.e.,
missing a disk, there is a chance of silent data corruption). The md
driver does not currently do write journalling, so this would require
some code changes.

--
Paul

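The idea behind option 1, an intent bitmap that limits how much has to be resynced after a crash, can be sketched as a toy model in Python. This is an assumption-laden illustration, not md's bitmap format or code: the class name `WriteIntentBitmap`, its methods and the sizes are all hypothetical, and a real implementation would also have to persist each bit to stable storage before the data write is issued.

```python
class WriteIntentBitmap:
    """Toy write-intent bitmap: one dirty flag per fixed-size region."""

    def __init__(self, array_size: int, region_size: int) -> None:
        self.region_size = region_size
        region_count = -(-array_size // region_size)  # ceiling division
        self.dirty = [False] * region_count

    def _regions(self, offset: int, length: int) -> range:
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        return range(first, last + 1)

    def begin_write(self, offset: int, length: int) -> None:
        # Mark the regions dirty *before* the data and parity writes start.
        for region in self._regions(offset, length):
            self.dirty[region] = True

    def end_write(self, offset: int, length: int) -> None:
        # Clear the flags once every disk has acknowledged the write.
        for region in self._regions(offset, length):
            self.dirty[region] = False

    def regions_to_resync(self) -> list[int]:
        # After a crash only these regions may be inconsistent.
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]


bitmap = WriteIntentBitmap(array_size=1024, region_size=64)

bitmap.begin_write(0, 64)    # a write that completes normally
bitmap.end_write(0, 64)

bitmap.begin_write(100, 50)  # a write that is still in flight at the crash

pending = bitmap.regions_to_resync()
print(pending)
```

Only the regions touched by the interrupted write remain flagged, so a resync can skip the rest of the array. Keeping the flags on a fast non-volatile device makes the mark-before-write step cheap.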