* Re: mode data=journal in ext3. Is it safe to use?
From: Petter Larsen @ 2004-06-15 18:09 UTC
To: ext3; +Cc: ext3, Nicolas.Kowalski, linux-kernel

Hello

I will try again.

Can any of you confirm whether mode data=journal in ext3 is safe to use
in Linux kernel 2.6.x?

We need to have very good consistency and integrity for our filesystem,
and it would then be desirable to journal both data and metadata.

But if this mode can corrupt the filesystem, as both Phil White and
Nicolas Kowalski have experienced, it may be more advisable to use mode
data=ordered instead.

Data integrity is much more important for us than speed.

What do you people out there say?

I am also posting this to the kernel mailing list. I am not subscribed
to the LKML, so if anybody there has advice about this I would be
pleased if you could CC me.

Petter

On Mon, 2004-06-07 at 10:21, Petter Larsen wrote:
> Hello
>
> I can see several postings on this mailing list from people who have
> problems mounting ext3 partitions with mode data=journal.
>
> See URLs:
> https://www.redhat.com/archives/ext3-users/2004-March/msg00000.html
> https://www.redhat.com/archives/ext3-users/2004-March/msg00050.html
>
> We are going to use ext3 on a Compact Flash disk in true IDE mode. We
> need this filesystem to be as safe and consistent as possible. We can
> not tolerate any garbage in the files after a crash or sudden power
> failure. We have therefore decided to use ext3 with mode data=journal.
>
> Can I rely on this?
> We use kernel 2.6.5 on PowerPC 8260, and may be using newer kernels
> later in the project.
>
>
> Best regards
> --
> Petter Larsen
> cand. scient.
> moreCom as
> 913 17 222

--
Petter Larsen
cand. scient.
moreCom as
913 17 222
* Re: mode data=journal in ext3. Is it safe to use?
From: Eugene Crosser @ 2004-06-15 18:20 UTC
To: Petter Larsen; +Cc: ext3, ext3, Nicolas.Kowalski, linux-kernel

On Tue, 2004-06-15 at 20:09 +0200, Petter Larsen wrote:
> Can any of you confirm whether mode data=journal in ext3 is
> safe to use in Linux kernel 2.6.x?
>
> We need to have very good consistency and integrity for our filesystem, and
> it would then be desirable to journal both data and metadata.
>
> But if this mode can corrupt the filesystem, as both Phil White and
> Nicolas Kowalski have experienced, it may be more advisable to use mode
> data=ordered instead.
>
> Data integrity is much more important for us than speed.

I ran ext3 with data=journal on 2.6.6smp for about a week on a heavily
loaded system (I mean it). I did not ever experience filesystem
corruption (related to the fs code). I did, however, hit a complete
system lockup once. It *may* have been unrelated to the fs code.

(If you use quota, it *will* lock. The author is working on a fix.
Above, I am referring to a lockup with quota off.)

Eugene
* Re: mode data=journal in ext3. Is it safe to use?
From: Petter Larsen @ 2004-06-17 8:36 UTC
To: Eugene Crosser; +Cc: ext3, linux-kernel

> > Data integrity is much more important for us than speed.
>
> I ran ext3 with data=journal on 2.6.6smp for about a week on a heavily
> loaded system (I mean it). I did not ever experience filesystem
> corruption (related to the fs code). I did, however, hit a complete
> system lockup once. It *may* have been unrelated to the fs code.
>
> (If you use quota, it *will* lock. The author is working on a fix.
> Above, I am referring to a lockup with quota off.)
>
> Eugene

Good to hear. But there may have been an ext3-related lockup once, since
you are not sure that the crash was unrelated to the ext3 fs code?

Are you going to test it more?

We are not going to use quota; we are using ext3 on a Compact Flash disk
in an embedded device.

--
Petter Larsen
cand. scient.
moreCom as
913 17 222
* Re: mode data=journal in ext3. Is it safe to use?
From: Oleg Drokin @ 2004-06-16 7:34 UTC
To: pla, linux-kernel

Hello!

Petter Larsen <pla@morecom.no> wrote:

PL> Can any of you confirm whether mode data=journal in ext3 is
PL> safe to use in Linux kernel 2.6.x?

PL> We need to have very good consistency and integrity for our filesystem, and
PL> it would then be desirable to journal both data and metadata.

Actually data=journal mode would gain you mostly zero extra consistency
compared to data=ordered mode (the only extra bit of consistency that you
get is correct mtime on files that have their pages overwritten, I think).
You have zero control over transaction boundaries in ext3, so you still need
to design your applications in such a way that they have their own
sort of transactions (if this is needed).

PL> Data integrity is much more important for us than speed.

It is not clear what sort of extra data integrity you expect from data
journaling mode and why you think it is there.
Garbage in files should not happen in data ordered mode, as data pages are
written first, before metadata updates are committed.

Bye,
    Oleg
* Re: mode data=journal in ext3. Is it safe to use?
From: Petter Larsen @ 2004-06-17 8:27 UTC
To: Oleg Drokin; +Cc: linux-kernel, ext3

Hello

I comment inline.

> PL> Can any of you confirm whether mode data=journal in ext3 is
> PL> safe to use in Linux kernel 2.6.x?
> PL> We need to have very good consistency and integrity for our filesystem, and
> PL> it would then be desirable to journal both data and metadata.
>
> OLEG> Actually data=journal mode would gain you mostly zero extra consistency
> compared to data=ordered mode (the only extra bit of consistency that you
> get is correct mtime on files that have their pages overwritten, I think).
> You have zero control over transaction boundaries in ext3, so you still need
> to design your applications in such a way that they have their own
> sort of transactions (if this is needed).

So your conclusion is that data=journal mode is useless if you do not
want a correct mtime?

There would be little sense in developing the data=journal mode if this
is the only benefit, don't you think?

From the Linux/Documentation/filesystems/ext3.txt:

data=journal        All data are committed into the journal prior
                    to being written into the main file system.

data=ordered (*)    All data are forced directly out to the main
                    file system prior to its metadata being committed to
                    the journal.

My problem is that ext3 in the latest kernels, 2.6.x and the latest
2.4.x, is not well documented around the web. Whitepapers and so on are
pretty old. Much has changed in ext3, I believe, since it was first
introduced by Dr. Tweedie. The first release was journaling both data
and metadata; see also the transcript of Dr. Tweedie's talk at the Ottawa
Linux Symposium, 20th July 2000.

http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html

There he says that they are journaling both metadata and data, but that
the design goal is not to do that. So can this be interpreted to mean
that mode data=journal is only there for historic reasons?

> PL> Data integrity is much more important for us than speed.
>
> OLEG> It is not clear what sort of extra data integrity you expect from data
> journaling mode and why you think it is there.

I would believe that the goal of a mode like data=journal is to gain
extra data integrity, because it also journals data. Why should it not? I
would believe that it makes sense to have these different modes so people
can choose the best mode for their applications.

> OLEG> Garbage in files should not happen in data ordered mode, as data pages are
> written first, before metadata updates are committed.

Are you sure?

Petter
* Re: mode data=journal in ext3. Is it safe to use?
From: Oleg Drokin @ 2004-06-17 17:09 UTC
To: Petter Larsen; +Cc: linux-kernel, ext3

Hello!

On Thu, Jun 17, 2004 at 10:27:17AM +0200, Petter Larsen wrote:

> > PL> Can any of you confirm whether mode data=journal in ext3 is
> > PL> safe to use in Linux kernel 2.6.x?
> > PL> We need to have very good consistency and integrity for our filesystem, and
> > PL> it would then be desirable to journal both data and metadata.
> > OLEG> Actually data=journal mode would gain you mostly zero extra consistency
> > compared to data=ordered mode (the only extra bit of consistency that you
> > get is correct mtime on files that have their pages overwritten, I think).
> > You have zero control over transaction boundaries in ext3, so you still need
> > to design your applications in such a way that they have their own
> > sort of transactions (if this is needed).
> So your conclusion is that data=journal mode is useless if you do not
> want a correct mtime?

Well, yes.

> There would be little sense in developing the data=journal mode if this
> is the only benefit, don't you think?
> From the Linux/Documentation/filesystems/ext3.txt:
> data=journal        All data are committed into the journal prior
>                     to being written into the main file system.
> data=ordered (*)    All data are forced directly out to the main
>                     file system prior to its metadata being committed to
>                     the journal.
> My problem is that ext3 in the latest kernels, 2.6.x and the latest
> 2.4.x, is not well documented around the web. Whitepapers and so on are
> pretty old. Much has changed in ext3, I believe, since it was first
> introduced by Dr. Tweedie. The first release was journaling both data
> and metadata; see also the transcript of Dr. Tweedie's talk at the Ottawa
> Linux Symposium, 20th July 2000.
> http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html
> There he says that they are journaling both metadata and data, but that
> the design goal is not to do that. So can this be interpreted to mean
> that mode data=journal is only there for historic reasons?

Maybe so. Also, fsync-heavy loads on real disk devices with large journals
tend to benefit from journaled data mode as well.

> > PL> Data integrity is much more important for us than speed.
> >
> > OLEG> It is not clear what sort of extra data integrity you expect from data
> > journaling mode and why you think it is there.
> I would believe that the goal of a mode like data=journal is to gain
> extra data integrity, because it also journals data. Why should it not? I

Well, actually I bet you do not care whether the data goes through the
journal or not, as long as it is not lost.
In ordered journaling mode, data is written first, before metadata
updates. Mostly the same happens in data journal mode, only in the latter
case the data is written into the journal, and if the transaction was not
committed, after a reboot it won't be copied to where it should be. The
same scenario in ordered journal mode will result in the data getting
where it should be, but due to the lack of metadata updates, you won't
see it. (This is the case for append; for overwrite it will be a little
bit different, but you still have no control over how much stuff will be
overwritten.)

> would believe that it makes sense to have these different modes so people
> can choose the best mode for their applications.

True.

> > OLEG> Garbage in files should not happen in data ordered mode, as data pages are
> > written first, before metadata updates are committed.
> Are you sure?

If you can reproduce garbage in files in ordered journal mode, that would
be a bug that should be fixed then.

Bye,
    Oleg
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 17:09 ` Oleg Drokin @ 2004-06-18 9:41 ` Helge Hafting 2004-06-18 10:15 ` Oleg Drokin 0 siblings, 1 reply; 32+ messages in thread From: Helge Hafting @ 2004-06-18 9:41 UTC (permalink / raw) To: Oleg Drokin; +Cc: Petter Larsen, linux-kernel, ext3 Oleg Drokin wrote: >Hello! > >On Thu, Jun 17, 2004 at 10:27:17AM +0200, Petter Larsen wrote: > > >>>PL> Can anybody of you acknowledge or not if mode data=journal in ext3 is >>>PL> safe to use in Linux kernel 2.6.x? >>>PL> Wee need to have a very consistent and integrity for our filesystem, and >>>PL> it would then be desired to journal both data and metadata. >>>OLEG> Actually data=journal mode would gain you mostly zero extra consistency compared >>>to data=ordered mode. (the only more consistency bit that you get is >>>correct mtime on files that have their pages overwritten, I think). >>>You have zero control over transaction boundaries in ext3, so you still need >>>to design your applications in such a way that they have their own >>>sort of transactions (if this is needed). >>> >>> >>So your conclusion is that data=journal mode is useless if you do not >>want a correct mtime? >> >> > >Well, yes. > > > >>It would be a littles sense in developing the data=journal mode if this >>is the only benefit, don't you think? >>>From the Linux/Documentation/filesystems/ext3.txt >>data=journal All data are committed into the journal prior >> to being written into the main file system. >>data=ordered (*) All data are forced directly out to the main >>file system prior to its metadata being committed to >> the journal. >>My problem is that ext3 in the latest kernel, 2.6.x and the latest >>2.4.x, are not well documented around the web. Whitepapers and so are >>pretty old. Much have changed I belive in ext3 since it was first >>introduced by Dr. Tweedie. The first release was journaling both data >>and metadata, se also the transcript from Dr. 
Tweedie from the Ottawa >>Linux Symposium 20th July 2000. >>http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html >>There he says that they are journaling both metadata and data, but that >>the design goal is not to do that. So can this be interpreted that mode >>data=journal is only there for historic reasons? >> >> > >May be so. Also fsync heavy loads on real disk devices with large journals >tend to benefit from journaled data mode as well. > > > >>>PL> Data integrity is much more important for us than speed. >>> >>>OLEG> It is not clear what sort of extra data integrity do you expect from data >>>journaling mode and why do you think it is there. >>> >>> >>I would belive that the goal for such a mode data=journal would gain >>extra data integrity because it also journals data. Why should it not? I >> >> > >Well, actually I bet you do not care if the data goes through journal or not >as long as it is not lost. >In case of ordered journaling mode, data is written first before metadata >updates, mostly the same happens with data journal mode, only with the latter >case date is written into journal and if transaction was not committed, after >a reboot it won't be copied to where it should be, same scenario in ordered >journal mode will result in data getting where it should be, but due to >lack of metadata updates, you won't see it. (this is in case of append, >for overwrite it will be a little bit different, but still you have no >control over how much of stuff will be overwritten). > > > >>would belive that it makes sense to have these different modes so people >>can choose the best mode for there applications. >> >> > >True. > > > >>>OLEG> Garbage in files should not happen in data ordered mode as data pages are >>>written first before metadata updates are committed. >>> >>> >>Are you sure? >> >> > >If you can reproduce a garbage in files in ordered journal mode, that would be a >bug that should be fixed then. 
>

Hard to _produce_, but consider:
1. Write data to an existing file
2. Sync metadata
3. Data is forced out because of ordered mode, and a power-out crash
   happens in the middle of this. The file now has a block with a mix
   of new and old data; it may even be unreadable due to a bad sector
   checksum.

With data journalling you either get the old data (because the crash
happened during a write to the journal) or new data (the crash happened
during the data write, and the data is restored from the good copy in
the journal).

Helge Hafting
* Re: mode data=journal in ext3. Is it safe to use?
From: Oleg Drokin @ 2004-06-18 10:15 UTC
To: Helge Hafting; +Cc: Petter Larsen, linux-kernel, ext3

Hello!

On Fri, Jun 18, 2004 at 11:41:23AM +0200, Helge Hafting wrote:

> >If you can reproduce garbage in files in ordered journal mode, that
> >would be a bug that should be fixed then.
> Hard to _produce_, but consider:
> 1. Write data to an existing file
> 2. Sync metadata
> 3. data is forced out because of ordered mode, a powerout crash happens
>    in the middle of this. The file now has a block with a mix of new
>    and old,

Well, this is not much worse than having two blocks, one from the old
file and one from the new, after a crash.

>    it may even be unreadable due to a bad sector checksum.

Well, in data journaled mode you may get an unreadable journal; is this
much better? (Also, the original question was about CF flash media, so no
bad sector problems, I presume.)

> With data journalling you either get the old data (because the crash
> happened during a write to the journal) or new data (crash happened
> during data write,

Well, while with data journaling mode your granularity is one block,
with data ordered it is one sector.

> the data is restored from the good copy in the journal.)

Bye,
    Oleg
* Re: mode data=journal in ext3. Is it safe to use?
From: Paulo Marques @ 2004-06-18 11:30 UTC
To: Oleg Drokin
Cc: Helge Hafting, Petter Larsen, linux-kernel@vger.kernel.org, ext3

On Fri, 2004-06-18 at 11:15, Oleg Drokin wrote:
> Hello!
>
> On Fri, Jun 18, 2004 at 11:41:23AM +0200, Helge Hafting wrote:
>
> > >If you can reproduce garbage in files in ordered journal mode, that
> > >would be a bug that should be fixed then.
> > Hard to _produce_, but consider:
> > 1. Write data to an existing file
> > 2. Sync metadata
> > 3. data is forced out because of ordered mode, a powerout crash happens
> >    in the middle of this. The file now has a block with a mix of new
> >    and old,
>
> Well, this is not much worse than having two blocks, one from the old
> file and one from the new, after a crash.

Agreed. If the application needs consistency it must do some journaling
itself. At least, until the time when an application can say "start
transaction" and "commit transaction" to the file system itself.

> >    it may even be unreadable due to a bad sector checksum.
>
> Well, in data journaled mode you may get an unreadable journal; is this
> much better? (Also, the original question was about CF flash media, so no
> bad sector problems, I presume.)

You got it wrong here. The sentence was "bad sector checksum", not "bad
sector". If the sector was "half written", then the checksum would not
match.

If the journal is "half written" then it is just discarded (or at least
it should be).

> > With data journalling you either get the old data (because the crash
> > happened during a write to the journal) or new data (crash happened
> > during data write,
>
> Well, while with data journaling mode your granularity is one block,
> with data ordered it is one sector.

Imagine that you request a 2 MB write to an ext3 filesystem with a 1 MB
journal. There is *no way* the filesystem can do the write as an atomic
operation. (There would be if the filesystem wrote the data to free
blocks and updated the metadata through the journal.)

The point is, there is no concept of "atomic operation" at the file
system level, so the application must do journaling itself if it wants
to have some concept of "transactions".

From my experience with CF cards, there are some brands that do
wear-leveling (I know that at least the TwinMOS ones do, and probably
SanDisk too) and others that don't (Kingmax).

With a bad CF card and an ext3 filesystem you can get bad sectors in a
couple of hours of intensive writing.

A good CF card will sustain "normal use" (2 writes per minute on average)
with an ext3 filesystem for months (maybe years, I haven't gone that
far in time yet :)

Just my two cents,

--
Paulo Marques - www.grupopie.com
"In a world without walls and fences who needs windows and gates?"
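
The application-level journaling that Paulo and Oleg both recommend can be sketched roughly as below. This is only an illustration in Python, not code from anyone in the thread: the record layout (length plus CRC32 header) and the names `append_record` and `read_valid_records` are assumptions made for the example. It mirrors the "half written journal is just discarded" idea: every record carries a checksum, and recovery stops at the first record that is truncated or fails its checksum.

```python
import os
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def append_record(path, payload):
    """Append one checksummed record and force it to the device."""
    record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
    with open(path, "ab") as journal:
        journal.write(record)
        journal.flush()
        os.fsync(journal.fileno())  # the record counts only after this returns


def read_valid_records(path):
    """Return all intact records; ignore a torn or corrupt tail."""
    with open(path, "rb") as journal:
        data = journal.read()
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # half-written record: discard it and anything after it
        records.append(payload)
        offset = start + length
    return records
```

After a crash, the application would call `read_valid_records` and replay only what it returns, so a write interrupted midway shows up as "not happened" rather than as garbage.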
* Re: mode data=journal in ext3. Is it safe to use?
From: Oleg Drokin @ 2004-06-18 12:05 UTC
To: Paulo Marques
Cc: Helge Hafting, Petter Larsen, linux-kernel@vger.kernel.org, ext3

Hello!

On Fri, Jun 18, 2004 at 12:30:55PM +0100, Paulo Marques wrote:

> > > Hard to _produce_, but consider:
> > > 1. Write data to an existing file
> > > 2. Sync metadata
> > > 3. data is forced out because of ordered mode, a powerout crash happens
> > >    in the middle of this. The file now has a block with a mix of new
> > >    and old,
> > Well, this is not much worse than having two blocks, one from the old
> > file and one from the new, after a crash.
> Agreed. If the application needs consistency it must do some journaling
> itself. At least, until the time when an application can say "start
> transaction" and "commit transaction" to the file system itself.

Right, this is my point.

> > >    it may even be unreadable due to a bad sector checksum.
> > Well, in data journaled mode you may get an unreadable journal; is this
> > much better? (Also, the original question was about CF flash media, so no
> > bad sector problems, I presume.)
> You got it wrong here. The sentence was "bad sector checksum", not "bad
> sector". If the sector was "half written", then the checksum would not
> match.

In any case a bad sector checksum is a hardware bug. A sector write is
supposed to be atomic: it either happens or not.

> If the journal is "half written" then it is just discarded (or at least
> it should be).

Well, if there is a bad sector checksum inside a journal block, ext3 won't
be all that happy about this for sure (and neither will most other
journaling filesystems, I am sure).

> > > With data journalling you either get the old data (because the crash
> > > happened during a write to the journal) or new data (crash happened
> > > during data write,
> > Well, while with data journaling mode your granularity is one block,
> > with data ordered it is one sector.
> Imagine that you request a 2 MB write to an ext3 filesystem with a 1 MB
> journal. There is *no way* the filesystem can do the write as an atomic
> operation. (There would be if the filesystem wrote the data to free
> blocks and updated the metadata through the journal.)

True. Even if you write 512K of data and have a 1 MB journal, there is
still no atomicity guarantee.

> The point is, there is no concept of "atomic operation" at the file
> system level, so the application must do journaling itself if it wants
> to have some concept of "transactions".

Well, if you go with updates of less than one block in size (that do not
cross block boundaries), this can be done atomically (with the help of
fsync and stuff).

> From my experience with CF cards, there are some brands that do
> wear-leveling (I know that at least the TwinMOS ones do, and probably
> SanDisk too) and others that don't (Kingmax).
> With a bad CF card and an ext3 filesystem you can get bad sectors in a
> couple of hours of intensive writing.

Well, for flash memory there is jffs2; it does (data) journalling and
supports compression. And it can even work over conventional block devices
via mtd block emulation, I think.
Basically jffs2 is one large fs-sized journal, as I understand it.

Bye,
    Oleg
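
Oleg's remark about updates smaller than one block that do not cross block boundaries can be illustrated with a sketch like the following. It is a hedged example only: the function name `overwrite_block` and the 4096-byte default are assumptions (the real block size could be read with `os.statvfs(path).f_bsize`), and the code does not by itself guarantee atomicity, which still depends on the journaling mode and the hardware as discussed above. It only shows how an application can keep each update inside a single aligned block and follow it with fsync.

```python
import os


def overwrite_block(path, block_index, payload, block_size=4096):
    """Overwrite exactly one aligned block of an existing file, then fsync.

    block_size is an assumed value; it should match the filesystem's
    block size for the alignment argument to hold.
    """
    if len(payload) > block_size:
        raise ValueError("payload would cross a block boundary")
    # Pad to a full block so the write starts and ends on block boundaries.
    block = payload.ljust(block_size, b"\0")
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.pwrite(fd, block, block_index * block_size)
        if written != block_size:
            raise OSError("short write")
        os.fsync(fd)  # wait until the block has been pushed to the device
    finally:
        os.close(fd)
```

An application using this would lay out its data so that each logical record occupies one block, for example a fixed-size table of 4096-byte slots.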
* Re: mode data=journal in ext3. Is it safe to use? Conclusion
From: Petter Larsen @ 2004-06-21 17:42 UTC
To: ext3, linux-kernel; +Cc: albertogli

I will summarise this thread and try to give a picture of what has been
discussed and concluded.

1. ext3 with mode data=journal in kernel 2.6.x is probably working as
   intended. One person reported using this mode heavily on 2.6.6
   without corruption related to the fs code. Since nobody has said that
   they have seen faults, we should believe that it is safe. It is in a
   stable kernel...

2. Mode data=journal will not gain much more than a correct mtime
   compared to mode data=ordered.

3. Applications that need a very consistent filesystem, e.g. consistent
   writes, need to achieve this by implementing their own
   transaction/journaling system. Alberto Bertogli has written a library
   that can assist with this. See
   http://users.auriga.wearlab.de/~alb/libjio/. I have not used it, so I
   cannot say for sure how good it is, but it seems like a nice start
   and worth a look.

4. Because mode data=journal does not gain much, it would be better to
   use mode data=ordered and do any transactions/journaling in the
   application itself. Mode data=ordered is the default in ext3 and
   probably the most used, and therefore also the best tested.

5. If, and only if, your updates are smaller than one block (and do not
   cross block boundaries), these write operations can be done
   atomically (with the help of fsync and stuff; from Oleg and others).

6. Wear leveling on a Compact Flash card:
   Wear leveling is an important task. SanDisk has Industrial Grade
   support for some of their CF cards; see these links.

   http://www.sandisk.com/pressrelease/020522_toughness.htm
   http://www.sandisk.com/pressrelease/021112_igapps.htm
   http://www.sandisk.com/pdf/oem/WPaperWearLevelv1.0.pdf

   We are in the telecommunications and networking business and need
   this kind of Compact Flash card.

   From their site:
   * Enhanced error correction and sophisticated wear leveling technology
   * Card level MTBF >3 million hours
   * 2 million program/erase cycle endurance per block

   We are not bound to SanDisk. We could use any supplier that meets
   these criteria.

   I do not know the wear leveling algorithm in detail, so how it
   shuffles read-only data around the disk (or whether it does), and how
   it behaves if we create partitions on this CF disk (partitions are
   probably transparent to the wear leveling algorithm), is something we
   need to find out.

Thanks for all your replies (there are 32 threads :-) spread across the
ext3 ML and the LKML, and a couple of private ones). It has helped me a
lot.

Best regards
--
Petter Larsen
cand. scient.
moreCom as
913 17 222
* Re: mode data=journal in ext3. Is it safe to use?
From: Bernd Eckenfels @ 2004-06-19 19:16 UTC
To: linux-kernel

In article <1087558255.25904.14.camel@pmarqueslinux> you wrote:
> The point is, there is no concept of "atomic operation" at the file
> system level, so the application must do journaling itself if it wants
> to have some concept of "transactions".

Well, there can be rules like "writes after flush with size less than X
are atomic", with X being something between sector size, block size, or
data journal size.

However, most unix programs which do not do journalling and rely on some
stable atomic behaviour work by generating new files and renaming them.
And for this the metadata journalling in ordered mode is fine. So only
append-only logfiles may need some special treatment; this looks like a
common source of null bytes in a file. And it is only a problem if the
file is not a temp file (e.g. syslog).

Greetings
Bernd
--
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/
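
The "generate a new file and rename it" pattern Bernd mentions is commonly written along the following lines. This is a minimal Python sketch for POSIX systems, not something posted in the thread; the function name `atomic_replace` and the `.tmp-` prefix are assumptions made for illustration. The new contents go to a temporary file in the same directory, are fsynced, and only then is the file renamed over the old name, so a reader sees either the complete old file or the complete new one.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace the file at `path` with `data` via write-new-then-rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new contents reach the device first
        os.replace(tmp_path, path)  # then the name is switched over
    except BaseException:
        # On failure the old file is untouched; remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory as well so the rename itself is recorded on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The fsync before the rename is a conservative choice: in data=ordered mode the data is forced out before the metadata commit anyway, but the explicit call keeps the code independent of the mount options.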
* Re: mode data=journal in ext3. Is it safe to use?
From: Timothy Miller @ 2004-06-16 15:49 UTC
To: Petter Larsen; +Cc: ext3, ext3, Nicolas.Kowalski, linux-kernel

Petter Larsen wrote:

> Data integrity is much more important for us than speed.

You might want to consider ReiserFS or one of the others which were
designed with journaling in mind. And I hope you're using RAID1 or RAID5.
* Re: mode data=journal in ext3. Is it safe to use?
From: Daniel Pittman @ 2004-06-17 0:51 UTC
To: linux-kernel; +Cc: Ext3-users

On 17 Jun 2004, Timothy Miller wrote:
> Petter Larsen wrote:
>
>> Data integrity is much more important for us than speed.
>
> You might want to consider ReiserFS or one of the others which were
> designed with journaling in mind. And I hope you're using RAID1 or
> RAID5.

I must admit, that isn't quite the response that I would have expected
for those requirements. :)

ReiserFS, XFS and (presumably) JFS all have considerably better
performance than ext3, for most tasks, because they were indeed designed
with journaling in mind.

OTOH, ReiserFS had an extremely long period of instability, and was
built by a group who felt that a working fsck was something you put
together after you got the filesystem working.

This, combined with the occasional "ReiserFS 3 ate my data" reports and
the reluctance of the developers to adapt to the 4K kernel stacks in
2.6.recent, would leave me hesitant to recommend it as "more
trustworthy" than ext3.

XFS, with the "null out data on recovery" mode, is less reliable than
ext3, full stop. It routinely destroys data in real world situations, a
secure, but irritating, choice.

ext3 remains the only journaling filesystem that I would, personally,
put any great degree of faith in, since it is still developed in a
cautious and safe fashion, and has a focus on getting the tools to
verify correctness in place before enabling kernel-side features.

Obviously, your mileage may vary on these topics, as presumably have
your experiences.

Regards,
Daniel
--
Advertising may be described as the science of arresting the human
intelligence long enough to get money from it.
        -- Stephen Leacock
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 0:51 ` Daniel Pittman @ 2004-06-17 3:02 ` Tim Connors 2004-06-17 5:35 ` Hans Reiser 1 sibling, 0 replies; 32+ messages in thread From: Tim Connors @ 2004-06-17 3:02 UTC (permalink / raw) To: Daniel Pittman; +Cc: linux-kernel, Ext3-users Daniel Pittman <daniel@rimspace.net> said on Thu, 17 Jun 2004 10:51:54 +1000: > XFS, with the "null out data on recovery" mode, is less reliable than > ext3, full stop. It routinely destroys data in real world situations, a > secure, but irritating, choice. And please tell me -- the point of journalling is to reduce fsck times upon failure - particularly important if you have 14TB of raid (yes, we had to fsck after a recent downtime, and it had been > 180 days - took half the day). What is the point of journalling if you have to compare and restore against backup everytime the power fails? This is slower than a mere fsckage. FYI, I think jfs has the same behaviour as xfs - I do notice a distinct lack of usage of a /lost+found, which has been important to me in the past. > ext3 remains the only journaling filesystem that I would, personally, > put any great degree of faith in, since it is still developed in a > cautious and safe fashion, and has a focus on getting the tools to > verify correctness in place before enabling kernel-side features. > > > Obviously, your millage may vary on these topics, as presumably have > your experiences. Sounds about right :) Next time I reformat/get a new drive, I'll be going back to ext3 - never caused any problems for me. -- TimC -- http://astronomy.swin.edu.au/staff/tconnors/ Single White Stick-Figure, L12, enjoys long walks by the shore, cooking up a nice menudo, and bashing small animals with sticks. My meat sword is enormous. Seeks female Accordian Thief for relationship and buffs. -- Riff @ some game forum ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 0:51 ` Daniel Pittman 2004-06-17 3:02 ` Tim Connors @ 2004-06-17 5:35 ` Hans Reiser 2004-06-17 10:08 ` Dave Jones 1 sibling, 1 reply; 32+ messages in thread From: Hans Reiser @ 2004-06-17 5:35 UTC (permalink / raw) To: Daniel Pittman; +Cc: linux-kernel, Ext3-users Daniel Pittman wrote: >OTOH, ReiserFS had an extremely long period of instability, > we were stable before ext3 was... >and was >build by a group who felt that a working fsck was something you put >together after you got the filesystem working. > > Well, if you have a total of two guys working on a filesystem, and plenty not working yet in the filesystem, why the hell would you start to work on fsck before the main body of code is working and performing well enough that anybody would want to use it? Surely my task ordering was correct for a two man team. With Reiser4 we had funding for an fsck guy, and as a result fsck is working at ship. With V3, we had no funding at all until it started to work. >This, combined with the occasional "ReiserFS 3 ate my data" reports and > > like ext2/ext3, we are now able to say that almost all such reports are hardware (for V3 not V4, V4 gained some bugs when we ported to -mm and its radix trees, and is still not shipped as a result). >the reluctance of the developers to adapt to the 4K kernel stacks in >2.6.recent, > do you use them? I don't know real users who do, or else I would be quicker to care. On the one hand, you complain about how we were unstable, and on the other hand you complain about how we aren't willing to destabilize the code to add new features to what is no longer the development branch. Seems pretty inconsistent logically to me. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use?
2004-06-17 5:35 ` Hans Reiser
@ 2004-06-17 10:08 ` Dave Jones
2004-06-17 16:55 ` Hans Reiser
0 siblings, 1 reply; 32+ messages in thread
From: Dave Jones @ 2004-06-17 10:08 UTC (permalink / raw)
To: Hans Reiser; +Cc: Daniel Pittman, linux-kernel, Ext3-users
On Wed, Jun 16, 2004 at 10:35:50PM -0700, Hans Reiser wrote:
> >the reluctance of the developers to adapt to the 4K kernel stacks in
> >2.6.recent,
> >
> do you use them? I don't know real users who do, or else I would be
> quicker to care.

The Fedora Core 2 kernel (and what will be RHEL4) is currently using 4K stacks. This makes up quite a large userbase.

> On the one hand, you complain about how we were unstable, and on the
> other hand you complain about how we aren't willing to destabilize the
> code to add new features to what is no longer the development branch.
> Seems pretty inconsistent logically to me.

If you really are reluctant to fix it, there's always the option of marking CONFIG_REISER4 as dependent on CONFIG_BROKEN if CONFIG_4KSTACKS is selected.

Dave

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 10:08 ` Dave Jones @ 2004-06-17 16:55 ` Hans Reiser 0 siblings, 0 replies; 32+ messages in thread From: Hans Reiser @ 2004-06-17 16:55 UTC (permalink / raw) To: Dave Jones; +Cc: Daniel Pittman, linux-kernel, Ext3-users, Chris Mason Dave Jones wrote: >On Wed, Jun 16, 2004 at 10:35:50PM -0700, Hans Reiser wrote: > > > >the reluctance of the developers to adapt to the 4K kernel stacks in > > >2.6.recent, > > > > > do you use them? I don't know real users who do, or else I would be > > quicker to care. > >The Fedora Core 2 kernel (and what will be RHEL4) is currently >using 4K stacks. This makes up quite a large userbase. > > Sigh. I guess we have to support it then. Chris, are you up to doing it? > > On the one hand, you complain about how we were unstable, and on the > > other hand you complain about how we aren't willing to destabilize the > > code to add new features to what is no longer the development branch. > > Seems pretty inconsistent logically to me. > >If you really are reluctant it fix it, there's always the option of >marking CONFIG_REISER4 as dependant on CONFIG_BROKEN if CONFIG_4KSTACKS >is selected. > > Dave > > > > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-16 15:49 ` Timothy Miller 2004-06-17 0:51 ` Daniel Pittman @ 2004-06-17 8:29 ` Petter Larsen 2004-06-17 19:30 ` Daniel Egger [not found] ` <87wu26mto2.fsf@enki.rimspace.net> [not found] ` <1805.216.148.213.196.1087426691.squirrel@www.code-visions.com> 2 siblings, 2 replies; 32+ messages in thread From: Petter Larsen @ 2004-06-17 8:29 UTC (permalink / raw) To: Timothy Miller; +Cc: ext3, linux-kernel > > > > Data integrity is much more important for us than speed. > > > > > You might want to consider ReiserFS or one of the others which were > designed with journaling in mind. And I hope you're using RAID1 or RAID5. We are using ext3 on a compact flash disk in an embedded device. So we are not using RAID systems. Best regards -- Petter Larsen cand. scient. moreCom as 913 17 222 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 8:29 ` Petter Larsen @ 2004-06-17 19:30 ` Daniel Egger [not found] ` <87wu26mto2.fsf@enki.rimspace.net> 1 sibling, 0 replies; 32+ messages in thread From: Daniel Egger @ 2004-06-17 19:30 UTC (permalink / raw) To: Petter Larsen; +Cc: Timothy Miller, ext3, linux-kernel [-- Attachment #1: Type: text/plain, Size: 227 bytes --] On 17.06.2004, at 10:29, Petter Larsen wrote: > We are using ext3 on a compact flash disk in an embedded device. So we > are not using RAID systems. An excellent way to kill such media. Hopefully YMMV. Servus, Daniel [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 478 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
[parent not found: <87wu26mto2.fsf@enki.rimspace.net>]
* Re: mode data=journal in ext3. Is it safe to use?
[not found] ` <87wu26mto2.fsf@enki.rimspace.net>
@ 2004-06-27 14:17 ` Petter Larsen
2004-06-28 0:22 ` Daniel Pittman
0 siblings, 1 reply; 32+ messages in thread
From: Petter Larsen @ 2004-06-27 14:17 UTC (permalink / raw)
To: Daniel Pittman; +Cc: ext3, linux-kernel
> > We are using ext3 on a compact flash disk in an embedded device. So we
> > are not using RAID systems.
>
> Watch out - even with the internal wear leveling the CF disk will do,
> ext3 is still a pretty heavy filesystem to use there.
>
> Daniel

Well, which filesystem would you then use for read-write on this CF?

--
Petter Larsen
cand. scient.
moreCom as
913 17 222

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-27 14:17 ` Petter Larsen @ 2004-06-28 0:22 ` Daniel Pittman 0 siblings, 0 replies; 32+ messages in thread From: Daniel Pittman @ 2004-06-28 0:22 UTC (permalink / raw) To: Petter Larsen; +Cc: ext3, linux-kernel On 28 Jun 2004, Petter Larsen wrote: >>> We are using ext3 on a compact flash disk in an embedded device. So we >>> are not using RAID systems. >> >> Watch out - even with the internal wear leveling the CF disk will do, >> ext3 is still a pretty heavy filesystem to use there. > > Well, which filesystem would you then used for read-write on this CF? My recommendation would be to look at running your system out of memory, and writing back to flash on a scheduled basis, and at shutdown. That way the write load is minimized, but you still have a persistent store. Daniel -- Anyone who goes to a psychiatrist ought to have his head examined. -- Samuel Goldwyn ^ permalink raw reply [flat|nested] 32+ messages in thread
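[Editorial sketch, not part of the original message.] The suggestion above, keeping the working state in memory and writing it back to flash on a schedule and at shutdown, could look roughly like the following in Python. The class name WriteBackStore, the JSON format, and the one-hour default interval are all assumptions made for illustration; nothing here comes from the thread. Note the trade-off this implies: anything changed since the last flush is lost on a sudden power failure.

```python
import atexit
import json
import os
import threading


class WriteBackStore:
    """Hypothetical helper: state lives in RAM, flash is written only on flush()."""

    def __init__(self, path, interval_s=3600.0):
        self.path = path
        self.interval_s = interval_s  # assumed schedule, tune for the device
        self.state = {}
        self.dirty = False
        self._lock = threading.Lock()
        # Load the last persisted state, if there is one.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.state = json.load(f)
        # Write back at (clean) shutdown.
        atexit.register(self.flush)

    def set(self, key, value):
        # Updates touch memory only; no flash write happens here.
        with self._lock:
            self.state[key] = value
            self.dirty = True

    def flush(self):
        """Persist the state if it changed; return True if a write happened."""
        with self._lock:
            if not self.dirty:
                return False
            tmp = self.path + ".new"
            with open(tmp, "w", encoding="utf-8") as f:
                json.dump(self.state, f)
                f.flush()
                os.fsync(f.fileno())
            # Rename the new file over the old name so a crash leaves
            # either the old or the new version in place.
            os.replace(tmp, self.path)
            self.dirty = False
            return True

    def start_timer(self):
        """Flush every interval_s seconds on a background timer."""
        def tick():
            self.flush()
            self.start_timer()
        timer = threading.Timer(self.interval_s, tick)
        timer.daemon = True
        timer.start()
        return timer
```

A caller would create one store per persistent file, call set() freely, and rely on start_timer() plus the shutdown hook for the actual flash writes.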
[parent not found: <1805.216.148.213.196.1087426691.squirrel@www.code-visions.com>]
* Re: mode data=journal in ext3. Is it safe to use?
[not found] ` <1805.216.148.213.196.1087426691.squirrel@www.code-visions.com>
@ 2004-06-17 11:23 ` Petter Larsen
2004-06-17 16:26 ` Andreas Dilger
0 siblings, 1 reply; 32+ messages in thread
From: Petter Larsen @ 2004-06-17 11:23 UTC (permalink / raw)
To: Phil White; +Cc: ext3, linux-kernel
> I was never able to resolve the problems I had with data=journal with the
> 2.4 kernel. I did *not* try the 2.6 kernel though, so I can't give you
> any data points there. In the end, I settled for data=ordered, and have
> never seen the problems I described in my original posts. Also, to give
> you some background, I had been using ReiserFS before switching to ext3,
> and I experienced a lot of corruption with Reiser (my company makes linux
> based appliances which sometimes get turned off while under heavy IO).
> Since ReiserFS doesn't do data journalling (metadata only), we
> consistently ended up with corrupt files. After this, I decided to try
> ext3 with data=journal, and I never even got far enough with load testing
> to try the 'hard reset' test. It would consistently crash in the fs code
> under heavy load.

This should be considered a serious bug, don't you think? Have you reported this to the kernel list? I have the list on the CC now, but it should probably be filed as a bug report.

> We have since had no problems with data=ordered, and since it writes data
> blocks before writing metadata to the journal, we don't see corrupt files
> anymore (even on hard resets).

Ok

> If data integrity (within the file) is important to you in the face of a
> crash or power loss, do NOT use ReiserFS or ext3 data=writeback. If your
> application never overwrites data in files, you will be just fine using
> data=ordered (appending to files or creating new files is pretty much
> guaranteed to never cause corruption). If you need to overwrite data in
> files, you need to use data=journal (and probably beg people to fix it) or
> rewrite your application to use some other method (i.e. copy the file,
> delete the old one) and just use data=ordered.

So data=journal would give better data integrity than data=ordered (if it works as intended). But if data=journal does not work correctly, we may be better off using data=ordered and designing our application around it. The problem is that we cannot do this consistently, because we have a mix of open source applications and applications we develop ourselves.

But think of your scenario of copy, delete and make a new file with the new content. First we copy the contents of the file, then we make our modifications. When we are done we delete the original file. Then we hit a crash. The contents of the file that we held in our process are gone, and the original file is deleted. This is not a good idea. But if we write the new file first as fileX.new and then delete fileX, and then hit a crash, we would at least have the correct file written as fileX.new.

But we would be best off if we could trust the filesystem. In practice there are probably many more systems out there which use data=ordered because it is the default, and it therefore gets the most testing. Journaling both data and metadata was what Dr. Tweedie did in the first public releases, but doing so was not the goal.

It is not easy to know what the best thing to do is. We use this ext3 filesystem on a compact flash in an embedded system.

Petter

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 11:23 ` Petter Larsen @ 2004-06-17 16:26 ` Andreas Dilger 0 siblings, 0 replies; 32+ messages in thread From: Andreas Dilger @ 2004-06-17 16:26 UTC (permalink / raw) To: Petter Larsen; +Cc: Phil White, ext3, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1244 bytes --] On Jun 17, 2004 13:23 +0200, Petter Larsen wrote: > But think of your scenario of copy, delete and make a new file with the > new content. First we copy the contents of the file, then we do our > modifications. When we are done we delete the original file. Then we hit > a crash. The content we had of the file in our process are gone, the > original file is deleted. This is not a good idea. But if we write the new > file first as fileX.new and den delete fileX, hit a crash then we would > have at least the correct file written as fileX.new. The rename operation is guaranteed to be atomic. You implement updates as: 1) create new file 2) write data to new file 3) rename new file over old filename If the system crashes at any time you are guaranteed that the old filename has valid data in it. Even if you use data=journal mode while overwriting the old filename directly you wouldn't be guaranteed to have valid data unless your application was only e.g. writing aligned records to fixed file offsets, and those records were <= 4kB in size. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/ [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
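[Editorial sketch, not part of the original message.] The three-step update described above (create a new file, write the data, rename it over the old name) can be sketched in Python as follows. The function name atomic_write and the temporary-file prefix are hypothetical. The fsync calls are an addition that the message does not mention: they are an assumption about how to make sure the data and the rename have reached the disk before the function returns, and the directory fsync is POSIX-specific.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` using write-new-then-rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # 1) Create the new file in the same directory, so the rename in
    #    step 3 stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # 2) Write the data to the new file and flush it to disk
            #    (the fsync is an assumption, not from the message).
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # 3) Rename the new file over the old filename.
        os.replace(tmp_path, path)
    except BaseException:
        # Nothing was renamed; remove the leftover temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: fsync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With this pattern a reader of `path` sees either the complete old contents or the complete new contents. One side effect worth knowing: mkstemp creates the file with owner-only permissions, so the replaced file does not inherit the old file's mode unless the caller restores it.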
* Re: mode data=journal in ext3. Is it safe to use?
@ 2004-06-17 14:56 Ken Ryan
2004-06-17 16:06 ` Timothy Miller
2004-06-19 14:49 ` Petter Larsen
0 siblings, 2 replies; 32+ messages in thread
From: Ken Ryan @ 2004-06-17 14:56 UTC (permalink / raw)
To: linux-kernel; +Cc: pla
> > >
> > > Data integrity is much more important for us than speed.
> > >
> >
> >
> > You might want to consider ReiserFS or one of the others which were
> > designed with journaling in mind. And I hope you're using RAID1 or RAID5.
>
> We are using ext3 on a compact flash disk in an embedded device. So we
> are not using RAID systems.
[I'm not subscribed, hopefully this threads]
Um, is this a new application or have you done this before?
It's my understanding that very few (or no) CF devices do wear-levelling internally.
Using a journal, especially a true data journal, seems like *the* way to wear out your
flash as quickly as possible.
If you've had success using ext2 in read/write mode on flash/CF in a shipping product,
I for one would like to know more details!
ken
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use?
2004-06-17 14:56 Ken Ryan
@ 2004-06-17 16:06 ` Timothy Miller
2004-06-17 17:20 ` Hans Reiser
2004-06-19 14:49 ` Petter Larsen
1 sibling, 1 reply; 32+ messages in thread
From: Timothy Miller @ 2004-06-17 16:06 UTC (permalink / raw)
To: Ken Ryan; +Cc: linux-kernel, pla
Doesn't Reiser4 do wear-leveling for flash?

Ken Ryan wrote:
>> > > Data integrity is much more important for us than speed.
>> >
>> > You might want to consider ReiserFS or one of the others which were
>> > designed with journaling in mind. And I hope you're using RAID1 or RAID5.
>>
>> We are using ext3 on a compact flash disk in an embedded device. So we
>> are not using RAID systems.
>
> [I'm not subscribed, hopefully this threads]
>
> Um, is this a new application or have you done this before?
>
> It's my understanding that very few (or no) CF devices do wear-levelling
> internally.
> Using a journal, especially a true data journal, seems like *the* way to
> wear out your flash as quickly as possible.
>
> If you've had success using ext2 in read/write mode on flash/CF in a
> shipping product, I for one would like to know more details!
>
> ken

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 16:06 ` Timothy Miller @ 2004-06-17 17:20 ` Hans Reiser 2004-06-17 19:15 ` Ken Ryan 2004-06-17 19:43 ` Daniel Egger 0 siblings, 2 replies; 32+ messages in thread From: Hans Reiser @ 2004-06-17 17:20 UTC (permalink / raw) To: Timothy Miller; +Cc: Ken Ryan, linux-kernel, pla Timothy Miller wrote: > Doesn't Reiser4 do wear-leveling for flash? No, we don't. We do have wandering logs, so it would be feasible to code, but bitmap blocks and super blocks get written to the same locations repeatedly. Actually, most compact flash devices DO do wear leveling, from what I have heard. > > Ken Ryan wrote: > >>> > > > > Data integrity is much more important for us than speed. >>> > > > > > You might want to consider ReiserFS or one of the others >>> which were > designed with journaling in mind. And I hope you're >>> using RAID1 or RAID5. >>> >>> We are using ext3 on a compact flash disk in an embedded device. So we >>> are not using RAID systems. >> >> >> >> [I'm not subscribed, hopefully this threads] >> >> Um, is this a new application or have you done this before? >> >> It's my understanding that very few (or no) CF devices do >> wear-levelling internally. >> Using a journal, especially a true data journal, seems like *the* way >> to wear out your >> flash as quickly as possible. >> >> If you've had success using ext2 in read/write mode on flash/CF in a >> shipping product, >> I for one would like to know more details! 
>> ken
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 17:20 ` Hans Reiser @ 2004-06-17 19:15 ` Ken Ryan 2004-06-18 6:18 ` Hans Reiser 2004-06-17 19:43 ` Daniel Egger 1 sibling, 1 reply; 32+ messages in thread From: Ken Ryan @ 2004-06-17 19:15 UTC (permalink / raw) To: Hans Reiser; +Cc: Timothy Miller, linux-kernel, pla Hans Reiser wrote: > Timothy Miller wrote: > >> Doesn't Reiser4 do wear-leveling for flash? > > > No, we don't. We do have wandering logs, so it would be feasible to > code, but bitmap blocks and super blocks get written to the same > locations repeatedly. > > Actually, most compact flash devices DO do wear leveling, from what I > have heard. The ones I've seen, only sort of. They'll allocate writes from available erased pages to try to distribute their use, but if you have a disk that's, say, 70% read-only data and 30% read-write then the wear-levelling will only happen on that 30% of the disk. True wear levelling will actually scrub read-only or rarely-written data, forcing it to get off its duff so the flash cells they're sitting on can get some exercise, and give the more worn cells a rest (that scrub helps ECC fix soft errors from weak cells too). True wear-levelling is really hard, and obviously requires budgeting extra bandwidth and storage devices for safely shuffling around data that the application has no intention of moving (picture losing power in the middle of a scrub). It's not worth it for the consumer CF usage model of "take photos until the card is full, then copy them all to the PC and wipe the card clean". [Yes, I tend to see this from the inside-out: I'm actually an FPGA/ASIC weenie not a kernel hacker. One of my current projects is part of a controller chip for a solid-state storage system with ${bignum} NAND flash chips. Alas, my specialty is video and graphics, so I'm still coming up the learning curve on storage systems]. ken ^ permalink raw reply [flat|nested] 32+ messages in thread
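[Editorial sketch, not part of the original message.] The difference described above, between levelling only across the erased-page pool and "true" wear levelling that also relocates rarely-written data, can be illustrated with a toy Python model. Everything in it is an assumption made for illustration: the block counts, the threshold-based relocation policy, and the function name simulate. It counts erase cycles only and does not model any real CF controller.

```python
def simulate(total_blocks=10, static_blocks=7, writes=3000,
             static_leveling=False, threshold=100):
    """Toy model: return per-block erase counts after `writes` rewrites."""
    erase = [0] * total_blocks
    # Blocks 0..static_blocks-1 start out holding read-only data;
    # the remaining blocks form the pool that absorbs all rewrites.
    holds_static = [i < static_blocks for i in range(total_blocks)]

    for _ in range(writes):
        pool = [i for i, s in enumerate(holds_static) if not s]
        if static_leveling:
            worn = max(pool, key=erase.__getitem__)
            cold = min((i for i, s in enumerate(holds_static) if s),
                       key=erase.__getitem__)
            # Assumed policy: once a pool block is much more worn than the
            # least-worn read-only block, move the read-only data onto the
            # worn block and release the fresh block into the pool.
            if erase[worn] - erase[cold] > threshold:
                erase[worn] += 1  # erase and reprogram with the cold data
                holds_static[worn], holds_static[cold] = True, False
                pool = [i for i, s in enumerate(holds_static) if not s]
        # Pool-only levelling: always rewrite the least-worn pool block.
        target = min(pool, key=erase.__getitem__)
        erase[target] += 1
    return erase


pool_only = simulate(static_leveling=False)
with_scrub = simulate(static_leveling=True)
print("pool-only levelling:", pool_only)
print("with relocation:    ", with_scrub)
```

In the pool-only run the 70% of blocks holding read-only data are never erased and the remaining 30% take every write, which is the situation the message describes; with relocation enabled the wear spreads over all blocks at the cost of some extra erase cycles.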
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 19:15 ` Ken Ryan @ 2004-06-18 6:18 ` Hans Reiser 0 siblings, 0 replies; 32+ messages in thread From: Hans Reiser @ 2004-06-18 6:18 UTC (permalink / raw) To: Ken Ryan; +Cc: Timothy Miller, linux-kernel, pla Ken Ryan wrote: > Hans Reiser wrote: > >> Timothy Miller wrote: >> >>> Doesn't Reiser4 do wear-leveling for flash? >> >> >> >> No, we don't. We do have wandering logs, so it would be feasible to >> code, but bitmap blocks and super blocks get written to the same >> locations repeatedly. >> >> Actually, most compact flash devices DO do wear leveling, from what I >> have heard. > > > > The ones I've seen, only sort of. They'll allocate writes from > available erased pages to try to distribute their use, but if you > have a disk that's, say, 70% read-only data and 30% read-write then > the wear-levelling will only happen on that > 30% of the disk. True wear levelling will actually scrub read-only or > rarely-written data, forcing it to get off its > duff so the flash cells they're sitting on can get some exercise, and > give the more worn cells a rest (that scrub > helps ECC fix soft errors from weak cells too). True wear-levelling > is really hard, and obviously requires > budgeting extra bandwidth and storage devices for safely shuffling > around data that the application has no > intention of moving (picture losing power in the middle of a scrub). > It's not worth it for the consumer CF > usage model of "take photos until the card is full, then copy them all > to the PC and wipe the card clean". > > [Yes, I tend to see this from the inside-out: I'm actually an > FPGA/ASIC weenie not a kernel hacker. One of my current > projects is part of a controller chip for a solid-state storage system > with ${bignum} NAND flash chips. Alas, my specialty > is video and graphics, so I'm still coming up the learning curve on > storage systems]. > > ken > > > > > Interesting. Thanks for educating me. 
No existing general purpose filesystem that I know of will address your needs. We could of course write one if someone paid for it.... ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 17:20 ` Hans Reiser 2004-06-17 19:15 ` Ken Ryan @ 2004-06-17 19:43 ` Daniel Egger 2004-06-17 19:59 ` Ken Ryan 1 sibling, 1 reply; 32+ messages in thread From: Daniel Egger @ 2004-06-17 19:43 UTC (permalink / raw) To: Hans Reiser; +Cc: Timothy Miller, linux-kernel, pla, Ken Ryan [-- Attachment #1: Type: text/plain, Size: 696 bytes --] On 17.06.2004, at 19:20, Hans Reiser wrote: > Actually, most compact flash devices DO do wear leveling, from what I > have heard. Care to mention sources? I'd be surprised if they did simply because it'll cost money that could be earned otherwise. Also I think you confuse bad block remapping with wear leveling and even the former I haven't experienced so far. CF disks were designed for simply the reason of having an empty disk, writing data onto it up to a certain level, reading it a few times and emptying the disk again. So except for the organizational blocks and "the end" of a disk which tends to get rarely hit there're a well distributed write utilization. Servus, Daniel [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 478 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use? 2004-06-17 19:43 ` Daniel Egger @ 2004-06-17 19:59 ` Ken Ryan 0 siblings, 0 replies; 32+ messages in thread From: Ken Ryan @ 2004-06-17 19:59 UTC (permalink / raw) To: Daniel Egger; +Cc: Hans Reiser, Timothy Miller, linux-kernel, pla Daniel Egger wrote: > On 17.06.2004, at 19:20, Hans Reiser wrote: > >> Actually, most compact flash devices DO do wear leveling, from what I >> have heard. > > > Care to mention sources? I'd be surprised if they did simply because > it'll cost money that could be earned otherwise. Also I think you > confuse bad block remapping with wear leveling and even the former > I haven't experienced so far. > > CF disks were designed for simply the reason of having an empty disk, > writing data onto it up to a certain level, reading it a few times > and emptying the disk again. So except for the organizational blocks > and "the end" of a disk which tends to get rarely hit there're a > well distributed write utilization. > > Servus, > Daniel For example: Just bop over to the Sandisk website, go the the OEM section, and download the manual/datasheet for CF devices. The wearlevel command itself isn't supported (I'm ignorant of flash on IDE, I assume it is intended to mean full scrub-style wear levelling) but they note they roll simplified wear levelling into the erased page pool. Doing that is an easy way to get part of the way there without needing a lot of infrastructure. And for the fill-read-empty usage model it's perfectly fine. ken ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: mode data=journal in ext3. Is it safe to use?
2004-06-17 14:56 Ken Ryan
2004-06-17 16:06 ` Timothy Miller
@ 2004-06-19 14:49 ` Petter Larsen
1 sibling, 0 replies; 32+ messages in thread
From: Petter Larsen @ 2004-06-19 14:49 UTC (permalink / raw)
To: Ken Ryan; +Cc: linux-kernel
> > We are using ext3 on a compact flash disk in an embedded device. So we
> > are not using RAID systems.
>
> Um, is this a new application or have you done this before?
>
> It's my understanding that very few (or no) CF devices do wear-levelling internally.
> Using a journal, especially a true data journal, seems like *the* way to wear out your
> flash as quickly as possible.
>
> If you've had success using ext2 in read/write mode on flash/CF in a shipping product,
> I for one would like to know more details!
>
> ken

From our data sheet: Wear Leveling is an intrinsic part of the operation of SanDisk products using NAND memory.

But for sure, we will use a Compact Flash that does wear leveling, including shuffling read-only data around the Compact Flash disk.

This will be for production, yes.

Petter

^ permalink raw reply [flat|nested] 32+ messages in thread
Thread overview: 32+ messages
[not found] <40FB8221D224C44393B0549DDB7A5CE83E31B1@tor.lokal.lan>
2004-06-15 18:09 ` mode data=journal in ext3. Is it safe to use? Petter Larsen
2004-06-15 18:20 ` Eugene Crosser
2004-06-17 8:36 ` Petter Larsen
2004-06-16 7:34 ` Oleg Drokin
2004-06-17 8:27 ` Petter Larsen
2004-06-17 17:09 ` Oleg Drokin
2004-06-18 9:41 ` Helge Hafting
2004-06-18 10:15 ` Oleg Drokin
2004-06-18 11:30 ` Paulo Marques
2004-06-18 12:05 ` Oleg Drokin
2004-06-21 17:42 ` mode data=journal in ext3. Is it safe to use? Conclusion Petter Larsen
2004-06-19 19:16 ` mode data=journal in ext3. Is it safe to use? Bernd Eckenfels
2004-06-16 15:49 ` Timothy Miller
2004-06-17 0:51 ` Daniel Pittman
2004-06-17 3:02 ` Tim Connors
2004-06-17 5:35 ` Hans Reiser
2004-06-17 10:08 ` Dave Jones
2004-06-17 16:55 ` Hans Reiser
2004-06-17 8:29 ` Petter Larsen
2004-06-17 19:30 ` Daniel Egger
[not found] ` <87wu26mto2.fsf@enki.rimspace.net>
2004-06-27 14:17 ` Petter Larsen
2004-06-28 0:22 ` Daniel Pittman
[not found] ` <1805.216.148.213.196.1087426691.squirrel@www.code-visions.com>
2004-06-17 11:23 ` Petter Larsen
2004-06-17 16:26 ` Andreas Dilger
2004-06-17 14:56 Ken Ryan
2004-06-17 16:06 ` Timothy Miller
2004-06-17 17:20 ` Hans Reiser
2004-06-17 19:15 ` Ken Ryan
2004-06-18 6:18 ` Hans Reiser
2004-06-17 19:43 ` Daniel Egger
2004-06-17 19:59 ` Ken Ryan
2004-06-19 14:49 ` Petter Larsen