* Re: [Yaffs] bit error rates --> a vendor speaks [not found] ` <d97046180602151153g23064424x9e1ddf83a1d7ae4f@mail.gmail.com> @ 2006-02-16 1:32 ` Charles Manning 2006-02-18 9:10 ` Thomas Gleixner 2006-02-23 8:31 ` Vitaly Wool 0 siblings, 2 replies; 34+ messages in thread From: Charles Manning @ 2006-02-16 1:32 UTC (permalink / raw) To: yaffs, linux-mtd; +Cc: William Watson To mtd-ers... Please forgive the cross-posting, but I think the content is sufficiently important to those who use NAND but are not plugged in to the YAFFS list. On Thursday 16 February 2006 08:53, William Watson wrote: > I will also note that a NAND vendor who paid us a visit at about that same > time said that we should expect WORSE soft error behaviour with succeeding > generations of NAND flash chips. The geometries would get smaller and > smaller, the chip dies would get larger and larger, and the amount of time > for production testing of each chip would not increase, or at least, not > increase as fast as the total storage of a chip. Thus, the testing per > page would only go down in subsequent generations of chips. These two > statements seemed to say that we would see both (1) increased rates of ECC > errors, and (2) an increase in the number of marginal blocks not marked bad > by the chip vendor. A vendor on the list contacted me off list, so I asked their permission before posting what they said on-list. I got that permission so long as their name was removed. As William states, it seems that reliability has peaked. NAND is expected to get worse, and a move to better correction schemes is encouraged. For the most part, this is not really a YAFFS2 issue since the ECC mechanism (on the data) is not part of YAFFS2 per se. This is now part of the mtd layer, or of the flash driver on non-Linux systems (e.g. William's case). For YAFFS2, the impact is probably limited to: 1) ECC on tags.... Tags are so small that a single-bit correction is probably enough. Multibit is probably a good thing to investigate. 
2) More OOB being used for multi-bit schemes will probably mean less space available for tags. 3) An emerging need for more forgiving block retirement. 4) Thinking about "spreading" to reduce write disturb. Thus far it has been quite hard to engage NAND vendors but it seems some are now willing to talk a bit more. I will try to discuss the issues to better understand them. -- Charles Without further blaah, the vendor's words, slightly edited: It's difficult to decide what the best block retirement policy should be. For a program or erase failure, block retirement should be mandatory because internally, the chip has already tried to program or erase multiple times already. For ECC (soft errors), it's more of an open question. Should the errors be scrubbed and the data written back (to the same location or a different location) or should the data be moved and the block permanently retired? One thing I can say with certainty, overall NAND flash reliability is degrading. At the time the application note was written, 0.16 micron based NAND flash had a block write/erase endurance of 250k-1M cycles with reasonable data retention. However, as lithography shrinks have continued (0.16um -> 0.13um -> 90nm -> 70nm), the physical cell area has been cut by roughly half at each die shrink. The physics of the materials don't change, therefore, less charge is being stored in the memory cell every generation. 90nm SLC (single level cell) NAND flash was nominally rated at 100K write/erase cycles per block, and it is very likely that 70nm SLC NAND flash will be significantly less. Better ECC is going to be necessary. The single bit correcting Hamming code that is currently used for SLC NAND was originally designed for SmartMedia over 10 years ago. Today, most memory cards using NAND flash implement Reed Solomon ECC capable of correcting 4 or more symbol errors (typically 8 or more bits per symbol) per 512 bytes. This enables the ability to correct 4 random errors per sector. 
However, for speed reasons, the ECC is usually done in hardware. If a multi-bit ECC were implemented, then it would be possible to implement a policy like: if there is the maximum number of correctable errors or an uncorrectable error in a page, then the block can be permanently retired, otherwise, scrub the data and move to a new location. But this kind of policy would be difficult to implement with single bit correcting Hamming code. Disturbance errors (read disturb, program disturb) will become more probable in the future because capacitive coupling between bit lines and word lines increases as their separation decreases at each lithography shrink. One fact that is somewhat underappreciated is that one cannot have both high block write/erase endurance and long data retention simultaneously. One can have better data retention if the block sees fewer write/erase cycles. The more the data is spread out across all the blocks, the better the overall data retention since every block is written and erased as few times as necessary. Journaling appears to be one of the best ways to spread out the writes but some kind of static data wear leveling could improve it further (however, there might be IP issues). Keep up the good work, Sincerely, [name scrubbed] ^ permalink raw reply [flat|nested] 34+ messages in thread
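The retirement policy the vendor describes in the message above can be sketched in C as follows. This is only an illustration, assuming a hypothetical multi-bit ECC that reports how many symbols it corrected on a read. The names `page_status`, `block_action` and `decide_block_action` are invented for the example and are not part of MTD or YAFFS.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status of one operation on a page or block. */
struct page_status {
    bool program_or_erase_failed; /* chip reported a program/erase failure */
    bool uncorrectable;           /* ECC could not repair the data */
    int  corrected;               /* symbols the ECC corrected on this read */
};

enum block_action {
    BLOCK_OK,     /* nothing to do */
    BLOCK_SCRUB,  /* move the data to a new location, keep the block */
    BLOCK_RETIRE  /* mark the block bad permanently */
};

/*
 * max_correctable is the correction capability of the ECC in use,
 * e.g. 4 symbols per 512 bytes for the Reed Solomon scheme mentioned
 * in the vendor's text (an assumption taken from that text).
 */
enum block_action decide_block_action(const struct page_status *st,
                                      int max_correctable)
{
    assert(max_correctable > 0);

    /* Program/erase failure: the chip already retried internally. */
    if (st->program_or_erase_failed)
        return BLOCK_RETIRE;

    /* Uncorrectable, or the ECC is at the limit of what it can fix. */
    if (st->uncorrectable || st->corrected >= max_correctable)
        return BLOCK_RETIRE;

    /* Some soft errors, but with margin left: scrub and move. */
    if (st->corrected > 0)
        return BLOCK_SCRUB;

    return BLOCK_OK;
}
```

With `max_correctable` set to 1, every corrected error lands in the retire branch and the scrub branch is never reached, which is one way to see why the vendor calls this policy difficult to apply with a single bit correcting Hamming code.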
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-16 1:32 ` [Yaffs] bit error rates --> a vendor speaks Charles Manning @ 2006-02-18 9:10 ` Thomas Gleixner 2006-02-18 16:31 ` Vitaly Wool 2006-02-18 18:11 ` Russ Dill 2006-02-23 8:31 ` Vitaly Wool 1 sibling, 2 replies; 34+ messages in thread From: Thomas Gleixner @ 2006-02-18 9:10 UTC (permalink / raw) To: Charles Manning; +Cc: William Watson, linux-mtd, yaffs Charles, On Thu, 2006-02-16 at 14:32 +1300, Charles Manning wrote: > 1) ECC on tags.... Tags are so small that a single-bit correction is probably > enough. Multibit is probably a good thing to investigate. > 2) More OOB being used for multi-bit schemes will probably mean less space > available for tags. We really should start to think seriously about oob usage for arbitrary data storage at all. I know that YAFFS(2) depends on that, but looking at the required mess in the code to keep this up for all the 9999 variants of ECC/RS whatever mechanisms, bad block marking schemes ... Some words about Reed Solomon. Reed Solomon needs hardware support for performance reasons. Efficient usage of Reed Solomon requires a different Data / RS-code layout: 512 Byte Data 8 Byte RS Code 512 Byte Data 8 Byte RS Code 512 Byte Data 8 Byte RS Code 512 Byte Data 8 Byte RS Code 32 Byte OOB This layout is supported already (see rtc_from4.c). It requires usage of flash based bad block tables. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
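The interleaved Data / RS-code layout listed in the message above can be written down as a few offset calculations. This is a minimal sketch, assuming the listed order is the physical byte order within one raw 2112-byte page. The constant and function names are invented for illustration and are not taken from rtc_from4.c.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SECTOR_DATA   = 512,  /* data bytes per sector */
    SECTOR_RS     = 8,    /* Reed Solomon code bytes per sector */
    SECTORS       = 4,    /* sectors per page */
    OOB_BYTES     = 32,   /* what is left at the end of the page */
    SECTOR_STRIDE = SECTOR_DATA + SECTOR_RS,
    RAW_PAGE_SIZE = SECTORS * SECTOR_STRIDE + OOB_BYTES
};

/* Byte offset of the data area of a sector within the raw page. */
size_t sector_data_offset(unsigned sector)
{
    assert(sector < SECTORS);
    return (size_t)sector * SECTOR_STRIDE;
}

/* Byte offset of the RS code that directly follows the sector's data. */
size_t sector_rs_offset(unsigned sector)
{
    return sector_data_offset(sector) + SECTOR_DATA;
}

/* Byte offset of the trailing 32-byte OOB area. */
size_t oob_offset(void)
{
    return (size_t)SECTORS * SECTOR_STRIDE;
}
```

Under these assumptions the fourth data area spans raw offsets 1560 to 2071, so raw column 2048, where the factory bad block marker commonly sits on 2048+64 byte pages, ends up holding ordinary data. That is presumably why this layout needs flash based bad block tables.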
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-18 9:10 ` Thomas Gleixner @ 2006-02-18 16:31 ` Vitaly Wool 2006-02-19 8:22 ` Thomas Gleixner 2006-02-18 18:11 ` Russ Dill 1 sibling, 1 reply; 34+ messages in thread From: Vitaly Wool @ 2006-02-18 16:31 UTC (permalink / raw) To: tglx; +Cc: William Watson, Charles Manning, linux-mtd, yaffs Hi folks, just two small notes from my side. First, FWIW, YAFFS2 never writes OOB w/o data and that looks more proper than JFFS2 style which means cleanmarkers for an empty page. YAFFS2 just needs a means to be agnostic about how OOB bytes are placed within a page. Next, I took a look at rtc_from4.c and I'm not sure how to follow this method if the NAND controller just doesn't have a means to give the calculated ECC back to the user. Thomas, could you please elaborate on that? Best regards, Vitaly Thomas Gleixner wrote: >Charles, > >On Thu, 2006-02-16 at 14:32 +1300, Charles Manning wrote: > > >>1) ECC on tags.... Tags are so small that a single-bit correction is probably >>enough. Multibit is probably a good thing to investigate. >>2) More OOB being used for multi-bit schemes will probably mean less space >>available for tags. >> >> > >We really should start to think seriously about oob usage for arbitrary >data storage at all. I know that YAFFS(2) depends on that, but looking >at the required mess in the code to keep this up for all the 9999 >variants of ECC/RS whatever mechanisms, bad block marking schemes ... > >Some words about Reed Solomon. > >Reed Solomon needs hardware support for performance reasons. Efficient >usage of Reed Solomon requires a different Data / RS-code layout: > >512 Byte Data >8 Byte RS Code >512 Byte Data >8 Byte RS Code >512 Byte Data >8 Byte RS Code >512 Byte Data >8 Byte RS Code >32 Byte OOB > >This layout is supported already (see rtc_from4.c). It requires usage of >flash based bad block tables. 
> > tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-18 16:31 ` Vitaly Wool @ 2006-02-19 8:22 ` Thomas Gleixner 2006-02-20 20:42 ` Charles Manning 0 siblings, 1 reply; 34+ messages in thread From: Thomas Gleixner @ 2006-02-19 8:22 UTC (permalink / raw) To: Vitaly Wool; +Cc: William Watson, Charles Manning, linux-mtd, yaffs Vitaly, On Sat, 2006-02-18 at 19:31 +0300, Vitaly Wool wrote: > Hi folks, Can you please stop top posting ? > just two small notes from my side. > First, FWIW, YAFFS2 never writes OOB w/o data and that looks more proper > than JFFS2 style which means cleanmarkers for an empty page. > YAFFS2 just needs means to be agnostic about how OOB bytes are placed > within a page. I'm not too happy about this JFFS2 oddity and I would remove it better today than tomorrow. The agnostic thing is fine, but it still is not solving the fundamental flaw of oob usage at all. The only guarantee of NAND is that you can store data size, but the size of the "free" bytes in OOB (due to ECC, bad block markers ...) is non constant and depends on hardware design, chip types ... So any self restriction of a filesystem, block emulation layer or whatever user of NAND flash to a small number of bytes it wants to put into OOB can not cover _all_ possible constellations. The kernel can only provide generic solutions and it makes absolutely no sense to rely on nifty tricks which require code bloat and can not cover every corner case. > Next, I took a look at rtc_from4.c and I'm not sure how to follow this > method if the NAND controller just doesn't have means to give the > caclulated ECC back to user. Thomas, could you please elaborate on that? -ENOPARSE. What does the NAND controller do with the calculated ECC, if it has no way to let the user read the ECC ? You mean those controllers which insert the ECC automatically at some point into the data stream ? A great example why OOB usage is not a good idea at all. These controllers guarantee data size and nothing else. 
I have one on my desk which does not let you use OOB, because it does all the magic itself. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-19 8:22 ` Thomas Gleixner @ 2006-02-20 20:42 ` Charles Manning 2006-02-20 21:37 ` Thomas Gleixner 0 siblings, 1 reply; 34+ messages in thread From: Charles Manning @ 2006-02-20 20:42 UTC (permalink / raw) To: tglx; +Cc: William Watson, Vitaly Wool, yaffs, linux-mtd > > just two small notes from my side. > > First, FWIW, YAFFS2 never writes OOB w/o data and that looks more proper > > than JFFS2 style which means cleanmarkers for an empty page. > > YAFFS2 just needs means to be agnostic about how OOB bytes are placed > > within a page. > > I'm not too happy about this JFFS2 oddity and I would remove it better > today than tomorrow. All the NAND fs (yaffs, jffs2 and [in the future] jffs3) use free oob. It is a very handy place to put tags/metadata/whatever you want to call it and allows for far faster and more efficient flash file systems. In systems that wrap up NAND as a block device you don't run a flash file system. If a system does have spare oob and can use it for ffs use, then why deny it? > > The agnostic thing is fine, but it still is not solving the fundamental > flaw of oob usage at all. The only guarantee of NAND is that you can > store data size, but the size of the "free" bytes in OOB (due to ECC, > bad block markers ...) is non constant and depends on hardware design, > chip types ... So any self restriction of a filesystem, block emulation > layer or whatever user of NAND flash to a small number of bytes it wants > to put into OOB can not cover _all_ possible constellations. > > The kernel can only provide generic solutions and it makes absolutely no > sense to rely on nifty tricks which require code bloat and can not cover > every corner case. I think you are looking at this in the wrong way. 
We should focus on **clean interfaces** rather than trying to put everything into nand_base.c. I think nand_base.c is a pretty good generic solution for certain classes of nand part, but it should not try to do all things for all NAND. Doing so will just slow it down and make it even more complex. I don't think anyone wants to make nand_base.c more complex. Analogy: in the serial driver world there are a lot of devices that fit the 16550 model of operation and it makes sense to treat them similarly and use some common code. However, if you want to support a weird UART like, say, the Atmel running in PDC mode then this no longer makes sense. You can however still use the same **interfaces**. Already, many people replace nand_base.c for performance reasons. People with really odd-ball hardware write theirs from scratch. I expect that this will increase with the proliferation of hardware assistance for ECC or DMA etc. Writing a nand driver is not a huge mission, and is easier with clean interfaces. Often people will treat nand_base.c as documentation. Adding the ability to read free oob data is trivial compared to the rest of the effort doing ECC placement etc. Of course it can't be done if the hw denies access, but that is another matter. > > > Next, I took a look at rtc_from4.c and I'm not sure how to follow this > > method if the NAND controller just doesn't have means to give the > > calculated ECC back to user. Thomas, could you please elaborate on that? > > -ENOPARSE. What does the NAND controller do with the calculated ECC, if > it has no way to let the user read the ECC ? > You mean those controllers which insert the ECC automatically at some > point into the data stream ? A great example why OOB usage is not a good > idea at all. These controllers guarantee data size and nothing else. I > have one on my desk which does not let you use OOB, because it does all > the magic itself. 
Fine, if the hw makes it into a block device or hides the OOB completely, then use it as a block device and it cannot be used with a fs that wants oob areas. However, this is no longer NAND even though it might have NAND inside. I lose no sleep that it does not work with YAFFS. I see no point in saying that because *some* systems cannot provide oob we should make an effort to deny this on *all* systems. To me this is as silly as suggesting we ban USB or ethernet or CAN from Linux because some systems don't support it. Speaking for YAFFS: For the foreseeable future, YAFFS will require and use OOB, or an adequate simulation (e.g. the way people do YAFFS on NOR). As NAND changes, there will be a need to support new encoding schemes, but that's largely separated off from YAFFS. YAFFS does not try to be all things for all people. Perhaps YAFFS4 or more will not need OOB. Sure, YAFFS **could** be redesigned to not need OOB but that would make it far less effective for its core user base. There are about 40 Linux filesystems out there, they each exist because they do something special. Having said that though, I guess there is one way to make YAFFS work with no OOB and that is to put the tags in the page. E.g. a 2048-byte page becomes 2032 bytes of data and 16 bytes of tags. Possible, but would drive up copying costs due to the loss of page alignment. -- Charles ^ permalink raw reply [flat|nested] 34+ messages in thread
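The alignment cost of putting the tags in the page, as suggested at the end of the message above, shows up in simple arithmetic. The sketch below uses the 2048-byte page and 16-byte tags from that message. The function `chunks_touched` is a hypothetical helper, not YAFFS code.

```c
#include <assert.h>
#include <stddef.h>

enum {
    PAGE_BYTES        = 2048,                   /* flash page size from the example */
    TAG_BYTES         = 16,                     /* tags stored inside the page */
    INBAND_DATA_BYTES = PAGE_BYTES - TAG_BYTES  /* 2032 bytes left for file data */
};

/*
 * Number of flash chunks covered by the file byte range
 * [offset, offset + len), given how many data bytes each chunk holds.
 */
size_t chunks_touched(size_t offset, size_t len, size_t chunk_data_bytes)
{
    size_t first, last;

    assert(chunk_data_bytes > 0);
    if (len == 0)
        return 0;

    first = offset / chunk_data_bytes;
    last = (offset + len - 1) / chunk_data_bytes;
    return last - first + 1;
}
```

An aligned 4096-byte transfer covers exactly two chunks when each chunk holds 2048 data bytes, but three chunks when each holds 2032, and the first and last of those are only partly used. That extra partial-chunk handling is the copying cost the message refers to.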
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-20 20:42 ` Charles Manning @ 2006-02-20 21:37 ` Thomas Gleixner 2006-02-20 22:40 ` Charles Manning 2006-02-21 11:59 ` Artem B. Bityutskiy 0 siblings, 2 replies; 34+ messages in thread From: Thomas Gleixner @ 2006-02-20 21:37 UTC (permalink / raw) To: Charles Manning; +Cc: William Watson, Vitaly Wool, yaffs, linux-mtd On Tue, 2006-02-21 at 09:42 +1300, Charles Manning wrote: > If a system does have spare oob and can use it for ffs use, then why deny it? Emphasis on "does have spare oob". Thats the whole point. You can not rely on that. Whats the size you need ? 1, 2, ... 16, 32 bytes? There is no guarantee. > Already, many people replace nand_base.c for performance reasons. People with > really odd-ball hardware write theirs from scratch. I expect that this will > increase with the proliferaation of hardware assitance for ECC or DMA etc. > Writing a nand driver is not a huge mission, and is easier with clean > interfaces. Often people will treat nand_base.c as documentation. Great. Whats the benefit for those who put effort into generic solutions and clean interfaces? > > You mean those controllers which insert the ECC automatically at some > > point into the data stream ? A great example why OOB usage is not a good > > idea at all. These controllers guarantee data size and nothing else. I > > have one on my desk which does not let you use OOB, because it does all > > the magic itself. > > Fine, if the hw makes it into a block device or hides the OOB completely, then The hardware does not make it a block device. Where did I say that ? > use it as a block device and it cannot be used with a fs that wants oob > areas. However, this is no longer NAND even though it might have NAND inside. Err. The controller hides oob. This is still NAND with all its restrictions. And it does not become a block device magically. > I lose no sleep that it does not work with YAFFS. Wake up please. 
Thats going to be reality for NAND based stuff in the future. The controllers will expose the raw FLASH but claim the OOB area for their own purpose - hardware based error correction. > YAFFS does not try to be all things for all people. Perhaps YAFFS4 or more > will not need OOB. Sure, YAFFS **could** be redesigned to not need OOB but > that would make it far less effective for its core user base. There are about > 40 Linux filesystems out there, they each exist because they do something > special. Right, but I doubt that the majority of them relies on non guaranteed hardware features. > Having said that though, I guess there is one way to make YAFFS work with no > OOB and that is to put the tags in the page. eg. 2048-byte page becomes 2032 > bytes of data and 16 bytes of tags. Possible, but would drive up copying > costs due to the loss of page alignment. Well, whether you like it or not. That topic has to be discussed in the near future. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-20 21:37 ` Thomas Gleixner @ 2006-02-20 22:40 ` Charles Manning 2006-02-20 23:18 ` Thomas Gleixner 2006-02-21 12:14 ` [Yaffs] bit error rates --> a vendor speaks Artem B. Bityutskiy 2006-02-21 11:59 ` Artem B. Bityutskiy 1 sibling, 2 replies; 34+ messages in thread From: Charles Manning @ 2006-02-20 22:40 UTC (permalink / raw) To: linux-mtd, tglx; +Cc: William Watson, Vitaly Wool, yaffs On Tuesday 21 February 2006 10:37, Thomas Gleixner wrote: > On Tue, 2006-02-21 at 09:42 +1300, Charles Manning wrote: > > If a system does have spare oob and can use it for ffs use, then why deny > > it? > > Emphasis on "does have spare oob". Thats the whole point. You can not > rely on that. Whats the size you need ? 1, 2, ... 16, 32 bytes? There is > no guarantee. Sorry Thomas I don't buy that argument. If a system has a NAND device that does have spare OOB available then it does have spare OOB and I can rely on that. If a NAND chip is soldered to a board, and the system exposes the OOB it is there. YAFFS (or whatever) can then be used on this device. Just going for the lowest common denominator all the time is like saying "run all serial links at 9600 and never use 115200 because 115200 might not be supported on all possible serial links", or "you can't run an ftdi USB serial port at 230k because most PC serial ports only go up to 115200". > > > Already, many people replace nand_base.c for performance reasons. People > > with really odd-ball hardware write theirs from scratch. I expect that > > this will increase with the proliferaation of hardware assitance for ECC > > or DMA etc. Writing a nand driver is not a huge mission, and is easier > > with clean interfaces. Often people will treat nand_base.c as > > documentation. > > Great. Whats the benefit for those who put effort into generic solutions > and clean interfaces? Generic solutions are good for those that it fits with. 
It does not mean that the generic solution is going to be good for all people all the time. It is impossible to write any body of code that is the best solution to all possible uses. It is however possible to write code that is good enough for many applications. However, good clean interfaces are even more valuable because they allow people to easily plug in alternative solutions that interact well with the rest of the system. The benefits of good clean interfaces are modularity. If you were to write a nand driver from scratch that obeys the interface then you can use it with mtdpart, mtdconcat, various fs etc. This is the major reason I don't like to have oobinfo hanging through the interface. > > > > You mean those controllers which insert the ECC automatically at some > > > point into the data stream ? A great example why OOB usage is not a > > > good idea at all. These controllers guarantee data size and nothing > > > else. I have one on my desk which does not let you use OOB, because it > > > does all the magic itself. > > > > Fine, if the hw makes it into a block device or hides the OOB completely, > > then > > The hardware does not make it a block device. Where did I say that ? > > > use it as a block device and it cannot be used with a fs that wants oob > > areas. However, this is no longer NAND even though it might have NAND > > inside. > > Err. The controller hides oob. This is still NAND with all its > restrictions. And it does not become a block device magically. OK I got that wrong. I don't have the specs for the device you're looking at so I can't see all the details. > > > I lose no sleep that it does not work with YAFFS. > > Wake up please. Thats going to be reality for NAND based stuff in the > future. The controllers will expose the raw FLASH but claim the OOB area > for their own purpose - hardware based error correction. Yes that is so for a few parts like OneNAND etc as well as some modules. 
However, there's still a lot of NAND out there that will provide OOB. So far I think I've only heard one person ask about YAFFS on OneNAND. So far I have not heard anyone ask for YAFFS to run on the board sitting on your desk. Perhaps opening up YAFFS to those people would be valuable. Would you like to see YAFFS run on your non-oob board? I guess it could also make YAFFS easier to use for NOR people. Currently NOR folks have to do quite a bit of trickery. Enough people do this for me to think this could be valuable. > > > Having said that though, I guess there is one way to make YAFFS work with > > no OOB and that is to put the tags in the page. eg. 2048-byte page > > becomes 2032 bytes of data and 16 bytes of tags. Possible, but would > > drive up copying costs due to the loss of page alignment. > > Well, whether you like it or not. That topic has to be discussed in the > near future. Perhaps this might be the way things go. Thinking about this more, I think it would not be so hard and it would be useful for people using OneNAND etc. YAFFS already has an internal cache which can take care of the page alignment with almost no work at all. However, if that happens I'd still like to be able to use OOB if it is there (just like I want to be able to use 115200 if it is there) because OOB can make a more efficient fs with page alignment etc. -- Charles ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-20 22:40 ` Charles Manning @ 2006-02-20 23:18 ` Thomas Gleixner 2006-02-21 0:29 ` Jon Masters 2006-02-21 1:08 ` [Yaffs] bit error rates --> YAFFS for devices with no OOB Charles Manning 2006-02-21 12:14 ` [Yaffs] bit error rates --> a vendor speaks Artem B. Bityutskiy 1 sibling, 2 replies; 34+ messages in thread From: Thomas Gleixner @ 2006-02-20 23:18 UTC (permalink / raw) To: Charles Manning; +Cc: William Watson, Vitaly Wool, linux-mtd, yaffs On Tue, 2006-02-21 at 11:40 +1300, Charles Manning wrote: > Just going for the lowest common denominator all the time is like saying "run > all serial links at 9600 and never use 115200 because 115200 might not be > supported on all possible serial links", or "you can't run an ftdi USB serial > port at 230k because most PC serial ports only go up to 115200". Again, the comparison is still flawed. The worst serial device still guarantees a baudrate > 0 and the effective baudrate has no impact on data storage size. > The benefits of good clean interfaces are modularity. If you were to write a > nand driver from scratch that obeys the interface then you can use it with > mtdpart, mtdconcat, various fs etc. This is the major reason I don't like to > have oobinfo hanging through the interface. I'm well aware of interface design, but I see no convincing argument not to remove oob access at all. At least there is no in-kernel user essentially depending on it. Making JFFS2 oob independent is a no-brainer patch. > OK I got that wrong. I don't have the specs for the device you're looking at > so I can't see all the details. Simply as I said. Standard NAND interface, no oob access. Well, I can read the raw device (including OOB) for diagnostic purposes in a special mode. > > > I lose no sleep that it does not work with YAFFS. > > > > Wake up please. Thats going to be reality for NAND based stuff in the > > future. 
The controllers will expose the raw FLASH but claim the OOB area > > for their own purpose - hardware based error correction. > > Yes that is so for a few parts like OneNAND etc as well as some modules. > However, there's still a lot of NAND out there that will provide OOB. That's no reason to keep a dead thing alive. There are still a lot of cars which need leaded fuel. I dont see any gas station providing it within a 100km range. > Perhaps opening up YAFFS to those people would be valuable. Would you like to > see YAFFS run on your non-oob board? At least for a test. > I guess it could also make YAFFS easier to use for NOR people. Currently NOR > folks have to do quite a bit of trickery. Enough people do this for me to > think this could be valuable. :) > However, if that happens I'd still like to be able to use OOB if it is there > (just like I want to be able to use 115200 if it is there) because OOB can > make a more efficient fs with page alignment etc. Go back to top - and as you said before: > > Already, many people replace nand_base.c for performance reasons. Nobody is going to prevent this, so they can expose any interface they want for their private performance gain. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-20 23:18 ` Thomas Gleixner @ 2006-02-21 0:29 ` Jon Masters 2006-02-21 8:26 ` Thomas Gleixner 2006-02-21 1:08 ` [Yaffs] bit error rates --> YAFFS for devices with no OOB Charles Manning 1 sibling, 1 reply; 34+ messages in thread From: Jon Masters @ 2006-02-21 0:29 UTC (permalink / raw) To: tglx; +Cc: linux-mtd, Charles Manning, Vitaly Wool, yaffs On 2/20/06, Thomas Gleixner <tglx@linutronix.de> wrote: > On Tue, 2006-02-21 at 11:40 +1300, Charles Manning wrote: > > Just going for the lowest common denominator all the time is like saying "run > > all serial links at 9600 and never use 115200 because 115200 might not be > > supported on all possible serial links", or "you can't run an ftdi USB serial > > port at 230k because most PC serial ports only go up to 115200". > > Again, the comparison is still flawed. > > The worst serial device still guarantees a baudrate > 0 and the > effective baudrate has no impact on data storage size. Just let me make sure I'm getting this right: 1). You don't have OOB available to you with your NAND part. 2). You want YAFFS changed to suit your special case. It's all very well arguing that relying on OOB in all cases is a bad idea - and indeed, it sounds like a good idea to support packing extra data into pages on flash - but you seem very keen on pushing the idea that it's always bad to use OOB. So long as logic is added such that you can have different behaviour, I fail to see the problem here. Cheers, Jon. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 0:29 ` Jon Masters @ 2006-02-21 8:26 ` Thomas Gleixner 2006-02-21 9:35 ` Jörn Engel 0 siblings, 1 reply; 34+ messages in thread From: Thomas Gleixner @ 2006-02-21 8:26 UTC (permalink / raw) To: jonathan; +Cc: linux-mtd, Charles Manning, Vitaly Wool, yaffs On Tue, 2006-02-21 at 00:29 +0000, Jon Masters wrote: > > The worst serial device still guarantees a baudrate > 0 and the > > effective baudrate has no impact on data storage size. > > Just let me make sure I'm getting this right: > > 1). You don't have OOB available to you with your NAND part. > 2). You want YAFFS changed to suit your special case. No, you get it wrong. Its not my personal problem at all. It's not about the board on my desk. It's about some piece of software relying on a non guaranteed hardware feature. I'm looking into the variety of hardware which evolves around NAND flash and I carefully look in which direction this is going. I see that the near future will require more complexity in the nand code and I'm looking for a sane solution for that. I'm not saying that its wrong, when something uses a nice hardware feature, but my point still stands that it is wrong to rely on a feature which is nowhere guaranteed. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 8:26 ` Thomas Gleixner @ 2006-02-21 9:35 ` Jörn Engel 0 siblings, 0 replies; 34+ messages in thread From: Jörn Engel @ 2006-02-21 9:35 UTC (permalink / raw) To: Thomas Gleixner; +Cc: jonathan, Vitaly Wool, Charles Manning, linux-mtd, yaffs On Tue, 21 February 2006 09:26:13 +0100, Thomas Gleixner wrote: > > It's about some piece of software relying on a non guaranteed hardware > feature. > > I'm looking into the variety of hardware which evolves around NAND flash > and I carefully look in which direction this is going. > > I see that the near future will require more complexity in the nand code > and I'm looking for a sane solution for that. I'm not saying that its > wrong, when something uses a nice hardware feature, but my point still > stands that it is wrong to rely on a feature which is nowhere > guaranteed. Ack. One of the better features of hard disks is that they have a standard interface. I can take an ext3 image from a 5 year old disk from one vendor and write it to a brand new disk from a different vendor and things just work. No conversion tool is needed. For flash, moving images between arbitrary chips is just a pipe dream. But things will get pushed into that direction. So for a filesystem to work on many chips, it should be prepared to work with different page- and block-sizes and not rely on OOB at all. Jörn ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> YAFFS for devices with no OOB 2006-02-20 23:18 ` Thomas Gleixner 2006-02-21 0:29 ` Jon Masters @ 2006-02-21 1:08 ` Charles Manning 2006-02-21 2:12 ` Jon Masters 2006-02-22 0:38 ` Jamie Lokier 1 sibling, 2 replies; 34+ messages in thread From: Charles Manning @ 2006-02-21 1:08 UTC (permalink / raw) To: tglx; +Cc: linux-mtd, yaffs On Tuesday 21 February 2006 12:18, Thomas Gleixner wrote: > > > Perhaps opening up YAFFS to those people would be valuable. Would you > > like to see YAFFS run on your non-oob board? > > At least for a test. OK, I'll start investigating an OOB-less YAFFS and give that a priority depending on the amount of interest. Hint to everyone: If this sounds interesting, respond (on or off list), especially if you'd like to be involved in some way. I think a final solution that exploits OOB if it is available but still works if OOB is not available would be best. That way: 1) People could use YAFFS on a wider variety of flash types (including OneNAND etc.) without using OOB. 2) If you have OOB, and drivers set up to use it, and wish to use it, then you will likely get some performance gains due to page alignment etc. -- Charles ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> YAFFS for devices with no OOB 2006-02-21 1:08 ` [Yaffs] bit error rates --> YAFFS for devices with no OOB Charles Manning @ 2006-02-21 2:12 ` Jon Masters 2006-02-22 0:38 ` Jamie Lokier 1 sibling, 0 replies; 34+ messages in thread From: Jon Masters @ 2006-02-21 2:12 UTC (permalink / raw) To: Charles Manning; +Cc: tglx, linux-mtd, yaffs On 2/21/06, Charles Manning <manningc2@actrix.gen.nz> wrote: > OK, I'll start investigating an OOB-less YAFFS and give that a priority > depending on the amount of interest. Hint to everyone: If this sounds > interesting, respond (on or off list), especially if you'd like to be > involved in some way. I'd like to play with this, but I need to pick up a board for myself before I do much more YAFFS work - all the existing stuff was done on boards in the Monta office. > I think a final solution that exploits OOB if it is available but will work if > OOB is not available would be best. Per my last mail, I think that's the best solution. Might take a bit of work on MTD too. Shove stuff my way please Charles and if I'm not frantically writing chapters then I'll take a look :-) Jon. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> YAFFS for devices with no OOB 2006-02-21 1:08 ` [Yaffs] bit error rates --> YAFFS for devices with no OOB Charles Manning 2006-02-21 2:12 ` Jon Masters @ 2006-02-22 0:38 ` Jamie Lokier 1 sibling, 0 replies; 34+ messages in thread From: Jamie Lokier @ 2006-02-22 0:38 UTC (permalink / raw) To: Charles Manning; +Cc: tglx, linux-mtd, yaffs Charles Manning wrote: > OK, I'll start investigating an OOB-less YAFFS and give that a priority > depending on the amount of interest. Hint to everyone: If this sounds > interesting, respond (on or off list), especially if you'd like to be > involved in some way. Would that make it more usable as a filesystem for Compact Flash cards? I've heard of people using JFFS2 on Compact Flash (using blkmtd), because it's power-fail-safe and is much better for wear than, say, ext3. -- Jamie ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-20 22:40 ` Charles Manning 2006-02-20 23:18 ` Thomas Gleixner @ 2006-02-21 12:14 ` Artem B. Bityutskiy 2006-02-21 13:50 ` Josh Boyer 1 sibling, 1 reply; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-21 12:14 UTC (permalink / raw) To: Charles Manning, tglx; +Cc: William Watson, Vitaly Wool, linux-mtd, yaffs Charles Manning wrote: > Sorry Thomas I don't buy that argument. If a system has a NAND device that > does have spare OOB available then it does have spare OOB and I can rely on > that. If a NAND chip is soldered to a board, and the system exposes the OOB > it is there. YAFFS (or whatever) can then be used on this device. I understand Thomas's point as a fight for generalization. Indeed, this OOB stuff introduces a lot of mess. Charles' point is - if OOB is there, why not let users use it? That also sounds reasonable. What I think would be nice to do is to get rid of OOB in the MTD stuff, but add a possibility to access OOB via some NAND-specific interfaces from nand_base. Indeed, if one wants to work with a generalized flash device - please use the MTD interface. If one still wants to access OOB, use the lower-layer NAND interfaces. It is all about having more than one layer of generalization. And I believe this is the right way to go. -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 12:14 ` [Yaffs] bit error rates --> a vendor speaks Artem B. Bityutskiy @ 2006-02-21 13:50 ` Josh Boyer 2006-02-21 14:36 ` Artem B. Bityutskiy 0 siblings, 1 reply; 34+ messages in thread From: Josh Boyer @ 2006-02-21 13:50 UTC (permalink / raw) To: Artem B. Bityutskiy Cc: Vitaly Wool, William Watson, yaffs, Charles Manning, linux-mtd, tglx On 2/21/06, Artem B. Bityutskiy <dedekind@yandex.ru> wrote: > Charles Manning wrote: > > Sorry Thomas I don't buy that argument. If a system has a NAND device that > > does have spare OOB available then it does have spare OOB and I can rely on > > that. If a NAND chip is soldered to a board, and the system exposes the OOB > > it is there. YAFFS (or whatever) can then be used on this device. > > I understand Thomas's point as a fight for generalization. > Indeed, this OOB stuff introduces a lot of mess. > > Charles' point is - if OOB is there, why not let users use it? That also > sounds reasonable. But Charles also wants clean interfaces. I agree that clean interfaces are definitely a good thing, but trying to come up with a clean interface for OOB access that won't get bastardized seems to be unattainable. > What I think would be nice to do is to get rid of OOB in the MTD stuff, but > add a possibility to access OOB via some NAND-specific interfaces from > nand_base. Indeed, if one wants to work with a generalized flash device > - please use the MTD interface. If one still wants to access OOB, use > the lower-layer NAND interfaces. It is all about having more than one > layer of generalization. And I believe this is the right way to go. I think at some time in the not so distant future this whole conversation will become a moot point. SLC NAND quality seems to be degrading as the die sizes go down, and MLC NAND is already of a degraded quality comparatively. 
Better ECC algorithms will be needed to provide the reliability that people want and I can see that consuming all of the available OOB area anyway. josh ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 13:50 ` Josh Boyer @ 2006-02-21 14:36 ` Artem B. Bityutskiy 2006-02-21 14:49 ` Artem B. Bityutskiy 0 siblings, 1 reply; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-21 14:36 UTC (permalink / raw) To: Josh Boyer Cc: Vitaly Wool, William Watson, yaffs, Charles Manning, linux-mtd, tglx Josh Boyer wrote: > But Charles also wants clean interfaces. I agree that clean > interfaces are definitely a good thing, but trying to come up with a > clean interface for OOB access that won't get bastardized seems to be > unattainable. But this does not mean that we should remove OOB support. Again, the right (IMO, of course) way is to make several levels of generalization. Just off the top of my head: 1. MTD level: a generic flash model with the notion of an eraseblock, a minimal I/O unit, read and write operations, nothing else. This level is for properly designed software. 2. NAND flash level: here you have OOB access. Work in terms of NAND pages, etc. Here YAFFS could be happy. 3. We could also have a DataFlash layer, where DataFlash guys could make use of their blocks and sectors, and use features not visible from the topmost generic MTD layer. > I think at some time in the not so distant future this whole > conversation will become a moot point. SLC NAND quality seems to be > degrading as the die sizes go down, and MLC NAND is already of a > degraded quality comparatively. Better ECC algorithms will be needed > to provide the reliability that people want and I can see that > consuming all of the available OOB area anyway. Still, the argument with CF cards makes me believe that vendors will provide some space for user data in OOB. Why can't they enlarge the OOB size? -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 14:36 ` Artem B. Bityutskiy @ 2006-02-21 14:49 ` Artem B. Bityutskiy 0 siblings, 0 replies; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-21 14:49 UTC (permalink / raw) To: Josh Boyer Cc: Vitaly Wool, William Watson, yaffs, Charles Manning, linux-mtd, tglx Artem B. Bityutskiy wrote: > But this does not mean that we should remove OOB support. Again, the > right (IMO, of course) way is to make several levels of generalization. > > Just off the top of my head: > > 1. MTD level: a generic flash model with the notion of an eraseblock, a > minimal I/O unit, read and write operations, nothing else. This level is > for properly designed software. > > 2. NAND flash level: here you have OOB access. Work in terms of NAND > pages, etc. Here YAFFS could be happy. > > 3. We could also have a DataFlash layer, where DataFlash guys could make > use of their blocks and sectors, and use features not visible from the > topmost generic MTD layer. We can put it this way: The topmost layer is MTD, which provides a uniform and generic way to access all flash devices: NOR, NAND, AG-AND, DataFlash, OneNAND, ECC-ed NORs, and "whatever the heck". Here you don't even notice the difference between them. Below this MTD layer we could have a NOR layer (generalizes all NOR flashes), a NAND layer (generalizes all NAND flashes), a DataFlash layer (generalizes all DataFlash flashes) and so on. YAFFS could then work with the NAND layer, not with the MTD layer, because it is NAND-oriented. Again, this is a question of proper MTD layering, and just saying that we don't support OOB is a lazy way to solve problems, IMO. -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
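The layering proposed above can be pictured with a small sketch. It is only an illustration under stated assumptions: `NandLayer` and `GenericMtd` are hypothetical names, not existing MTD or nand_base interfaces, and the generic layer simply leaves the OOB blank where a real one might store ECC there.

```python
class NandLayer:
    """Hypothetical NAND-specific layer: works in pages and exposes the OOB area."""

    def __init__(self, page_size=512, oob_size=16, pages_per_block=32, blocks=4):
        self.page_size = page_size
        self.oob_size = oob_size
        self.pages_per_block = pages_per_block
        self.blocks = blocks
        total = pages_per_block * blocks
        self._data = [b"\xff" * page_size for _ in range(total)]
        self._oob = [b"\xff" * oob_size for _ in range(total)]

    def read_page(self, page):
        return self._data[page], self._oob[page]

    def write_page(self, page, data, oob):
        if len(data) != self.page_size or len(oob) != self.oob_size:
            raise ValueError("need exactly one page of data and one OOB area")
        self._data[page] = bytes(data)
        self._oob[page] = bytes(oob)

    def erase_block(self, block):
        first = block * self.pages_per_block
        for page in range(first, first + self.pages_per_block):
            self._data[page] = b"\xff" * self.page_size
            self._oob[page] = b"\xff" * self.oob_size


class GenericMtd:
    """Hypothetical topmost layer: eraseblocks, a minimal I/O unit, read/write/erase."""

    def __init__(self, nand):
        self._nand = nand
        self.min_io_size = nand.page_size
        self.eraseblock_size = nand.page_size * nand.pages_per_block
        self.size = self.eraseblock_size * nand.blocks

    def _pages(self, offset, length):
        if offset % self.min_io_size or length % self.min_io_size:
            raise ValueError("access must be aligned to the minimal I/O unit")
        first = offset // self.min_io_size
        return range(first, first + length // self.min_io_size)

    def read(self, offset, length):
        return b"".join(self._nand.read_page(p)[0] for p in self._pages(offset, length))

    def write(self, offset, data):
        for i, page in enumerate(self._pages(offset, len(data))):
            chunk = data[i * self.min_io_size:(i + 1) * self.min_io_size]
            # The OOB belongs to this layer and is never shown to the caller.
            self._nand.write_page(page, chunk, b"\xff" * self._nand.oob_size)

    def erase(self, eraseblock):
        self._nand.erase_block(eraseblock)
```

In this picture a NAND-oriented user such as YAFFS would hold a `NandLayer` and see pages plus OOB, while a generic user would hold a `GenericMtd` and never learn that OOB exists.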
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-20 21:37 ` Thomas Gleixner 2006-02-20 22:40 ` Charles Manning @ 2006-02-21 11:59 ` Artem B. Bityutskiy 2006-02-21 12:06 ` Thomas Gleixner 1 sibling, 1 reply; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-21 11:59 UTC (permalink / raw) To: tglx; +Cc: William Watson, Charles Manning, Vitaly Wool, yaffs, linux-mtd Thomas Gleixner wrote: > Wake up please. That's going to be reality for NAND based stuff in the > future. The controllers will expose the raw FLASH but claim the OOB area > for their own purpose - hardware based error correction. One of my colleagues raised a very interesting argument against this. Look, consider all those CompactFlash cards. They are NAND flash based. They have a kind of block device emulation built-in. And I bet they use OOB to store the logical block number corresponding to this physical block. Block device emulation over flash devices is so widespread that vendors will never forbid OOB usage. From this point of view, OOB is not going to go away. -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 11:59 ` Artem B. Bityutskiy @ 2006-02-21 12:06 ` Thomas Gleixner 2006-02-25 11:58 ` Artem B. Bityutskiy 0 siblings, 1 reply; 34+ messages in thread From: Thomas Gleixner @ 2006-02-21 12:06 UTC (permalink / raw) To: Artem B. Bityutskiy Cc: William Watson, Charles Manning, Vitaly Wool, yaffs, linux-mtd On Tue, 2006-02-21 at 14:59 +0300, Artem B. Bityutskiy wrote: > Thomas Gleixner wrote: > > Wake up please. That's going to be reality for NAND based stuff in the > > future. The controllers will expose the raw FLASH but claim the OOB area > > for their own purpose - hardware based error correction. > One of my colleagues raised a very interesting argument against this. > > Look, consider all those CompactFlash cards. They are NAND flash based. > They have a kind of block device emulation built-in. And I bet they use > OOB to store the logical block number corresponding to this physical > block. Block device emulation over flash devices is so widespread that > vendors will never forbid OOB usage. I nowhere said that OOB usage will be forbidden. > From this point of view, OOB is not going to go away. The CF controller does its own closed proprietary magic and looking at the robustness of those cards I don't want to know what it does. The OOB usage of a closed device is in no way relevant for a discussion about a robust, sane and quite generic solution for handling NAND flash devices inside of Linux. I don't care what those chips do unless they run Linux inside. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-21 12:06 ` Thomas Gleixner @ 2006-02-25 11:58 ` Artem B. Bityutskiy 2006-02-27 13:27 ` Josh Boyer 0 siblings, 1 reply; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-25 11:58 UTC (permalink / raw) To: tglx; +Cc: linux-mtd BTW, talking about generalization, that mtd->eraseregions stuff is as bad as OOB. IMO, it is better to treat different regions as different partitions or as different MTD devices. -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-25 11:58 ` Artem B. Bityutskiy @ 2006-02-27 13:27 ` Josh Boyer 2006-02-27 16:01 ` Artem B. Bityutskiy 0 siblings, 1 reply; 34+ messages in thread From: Josh Boyer @ 2006-02-27 13:27 UTC (permalink / raw) To: Artem B. Bityutskiy; +Cc: tglx, linux-mtd On 2/25/06, Artem B. Bityutskiy <dedekind@yandex.ru> wrote: > BTW, talking about generalization, that mtd->eraseregions stuff is as > bad as OOB. IMO, it is better to treat different regions as different > partitions or as different MTD devices. Why is that? Often times the eraseregions have no practical use for the software involved. Take the Intel P30 for example. It has a few eraseblocks at either the top or bottom of the chip that are different eraseblock size. Some software may want to use this feature, but by having the MTD layer abstract them away it makes a lot of things simpler. If you have different eraseregions show up as different MTD devices (or partitions which are essentially the same thing in this discussion), then you have to manually concatenate them back into a single device. It takes a) more RAM overhead and b) more layers of complexity to do that. josh ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-27 13:27 ` Josh Boyer @ 2006-02-27 16:01 ` Artem B. Bityutskiy 2006-02-27 16:15 ` Josh Boyer 0 siblings, 1 reply; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-27 16:01 UTC (permalink / raw) To: Josh Boyer; +Cc: tglx, linux-mtd Josh Boyer wrote: > Often times the eraseregions have no practical use for the software > involved. Take the Intel P30 for example. It has a few eraseblocks > at either the top or bottom of the chip that are different eraseblock > size. This explains why putting this all together in one mtd-> structure is bad (they have different eraseblock sizes). > If you have different eraseregions show up as different MTD devices > (or partitions which are essentially the same thing in this > discussion), Err, actually partitions are better, because they are handled by the same flash driver. > then you have to manually concatenate them back into a > single device. Why do you need to concatenate them back? Different erase regions are used for completely different purposes, right? So if you work with the small "boot" region, you don't normally need the rest of the flash and vice versa. My point is that there should be a generalized flash model like this: 1. there are only 3 operations: read, write, erase; 2. erase is done in terms of eraseblocks, and all eraseblocks are the same size; 3. a minimal input/output unit exists; That's all. Erase regions and OOB are outside this simple flash model, do you see what I mean? Add your erase regions to this model and you'll end up with a mess. Organize these regions as different instances of the above generalized model, and you have a very nice picture. -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
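The three-point model above, with each erase region becoming its own instance, might look roughly like the following. This is a sketch only: `FlashDevice` and `devices_from_regions` are hypothetical names, and the rule that programming can only clear bits is an assumption added for illustration; it is not part of the three points in the message.

```python
class FlashDevice:
    """One instance of the generalized flash model (hypothetical class)."""

    def __init__(self, eraseblock_size, eraseblock_count, min_io_size):
        if eraseblock_size % min_io_size:
            raise ValueError("eraseblock size must be a multiple of the I/O unit")
        self.eraseblock_size = eraseblock_size
        self.eraseblock_count = eraseblock_count
        self.min_io_size = min_io_size
        self.size = eraseblock_size * eraseblock_count
        self._cells = bytearray(b"\xff" * self.size)  # erased flash reads as 0xff

    def _check(self, offset, length):
        if offset % self.min_io_size or length % self.min_io_size:
            raise ValueError("access must be aligned to the minimal I/O unit")
        if offset < 0 or offset + length > self.size:
            raise ValueError("access outside the device")

    def read(self, offset, length):
        self._check(offset, length)
        return bytes(self._cells[offset:offset + length])

    def write(self, offset, data):
        self._check(offset, len(data))
        for i, value in enumerate(data):
            self._cells[offset + i] &= value  # assumption: programming only clears bits

    def erase(self, eraseblock):
        if not 0 <= eraseblock < self.eraseblock_count:
            raise ValueError("no such eraseblock")
        start = eraseblock * self.eraseblock_size
        self._cells[start:start + self.eraseblock_size] = b"\xff" * self.eraseblock_size


def devices_from_regions(regions, min_io_size):
    """Turn a chip's (eraseblock_size, count) regions into one uniform device each."""
    return [FlashDevice(size, count, min_io_size) for size, count in regions]
```

For example, `devices_from_regions([(32 * 1024, 4), (128 * 1024, 8)], min_io_size=2)` yields a small "boot" device and a larger "data" device, each with a single eraseblock size. The region sizes here are illustrative numbers, not taken from any datasheet.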
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-27 16:01 ` Artem B. Bityutskiy @ 2006-02-27 16:15 ` Josh Boyer 2006-02-27 17:21 ` Artem B. Bityutskiy 0 siblings, 1 reply; 34+ messages in thread From: Josh Boyer @ 2006-02-27 16:15 UTC (permalink / raw) To: Artem B. Bityutskiy; +Cc: tglx, linux-mtd On 2/27/06, Artem B. Bityutskiy <dedekind@yandex.ru> wrote: > Josh Boyer wrote: > > Often times the eraseregions have no practical use for the software > > involved. Take the Intel P30 for example. It has a few eraseblocks > > at either the top or bottom of the chip that are different eraseblock > > size. > This explains why putting this all together in one mtd-> structure is > bad (they have different eraseblock sizes). The chip driver abstracts that away. The users of the device typically don't _know_ there are different eraseregions. > > > If you have different eraseregions show up as different MTD devices > > (or partitions which are essentially the same thing in this > > discussion), > Err, actually partitions are better, because they are handled by the > same flash driver. I meant that for all intents and purposes, partitions show up as completely different MTD devices. They aren't traditional partitions as on a hard drive. It's unrelated to this topic, so ignore that. > > > then you have to manually concatenate them back into a > > single device. > Why do you need to concatenate them back? Different erase regions > are used for completely different purposes, right? So if you work with > the small "boot" region, you don't normally need the rest of the flash > and vice versa. Different erase regions _can_ be used for completely different purposes. Often, they aren't used at all. Which means that if they show up as a separate device, you have to concatenate them back together. > My point is that there should be a generalized flash model like this: > > 1. there are only 3 operations: read, write, erase; > 2. 
erase is done in terms of eraseblocks, and all eraseblocks are the same > size; > 3. a minimal input/output unit exists; > > That's all. Erase regions and OOB are outside this simple flash model, do > you see what I mean? Add your erase regions to this model and you'll end > up with a mess. Organize these regions as different instances of the > above generalized model, and you have a very nice picture. A very nice picture that complicates things for end users and wastes DRAM when there is no intention to use the eraseregions differently. I can buy the OOB argument. I think we'll have to agree to disagree on the eraseregion stuff. josh ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-27 16:15 ` Josh Boyer @ 2006-02-27 17:21 ` Artem B. Bityutskiy 2006-02-27 17:40 ` Josh Boyer 0 siblings, 1 reply; 34+ messages in thread From: Artem B. Bityutskiy @ 2006-02-27 17:21 UTC (permalink / raw) To: Josh Boyer; +Cc: tglx, linux-mtd Josh, I believe you're only *partially* right. Please, switch to "architecture mode", leave this "low-level programmer" mode. Forget about a few bytes of DRAM for now, please! :-) Below are several points which should help me change your mind. 1. The region is an attribute of the *flash chip*. The region is *not* an attribute of the MTD device. An MTD device may describe the whole flash chip, it may describe a part of a flash chip (partitions), and it may describe many flash chips (concatenation). So, I state: the notion of a region has nothing to do with the notion of an MTD device. If you have a specific application which works with a *flash chip*, not an *MTD device*, please introduce another software object, a sort of "struct flash_chip". Let your applications work with these "struct flash_chip" objects. Putting references to regions into the MTD device structure (struct mtd_info) is *bad*. 2. Well, you're in ecstasy over the current approach (kidding :-) ). Then explain to me why, if, let's say, JFFS3 opens an MTD device, it has access to some weird "regions"? What on earth is going on? This is definitely error-prone. This is messy. This is bad design. JFFS3 does not want to see regions. 3. Regions... How do you look at them? We could look at them like this: a). They are different entities, used for completely different purposes, a rare program needs to use all of them simultaneously, they may have different eraseblock sizes and other properties, they are just different. Yes, it so happens that they are on one single chip, but they are still different. Or we could look at them more narrowly: b). 
A region is a kind of "appendix" to the flash chip, where the bootcode or whatever may be put. There is no more than one region (or very few) per flash chip. There is only one big and main "data" region, and one or a few small auxiliary regions. I reckon that the reality is more like b). And in this respect I see your point. Indeed, why should we spend more DRAM to describe this unworthy "appendix" (aka region) by a distinct software object? Let's push this all together! I can agree with you *partially*, provided that chips like a). are *not* going to appear. Partially means that I only agree that there *may be* no reason to create a distinct MTD device for this poor appendix. So I state: Ok, in b)-like chips, do not create a distinct MTD device. Let's invent something different. But this must not be in struct mtd_info anyway. Do you see my points? -- Best Regards, Artem B. Bityutskiy, St.-Petersburg, Russia. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-27 17:21 ` Artem B. Bityutskiy @ 2006-02-27 17:40 ` Josh Boyer 0 siblings, 0 replies; 34+ messages in thread From: Josh Boyer @ 2006-02-27 17:40 UTC (permalink / raw) To: Artem B. Bityutskiy; +Cc: tglx, linux-mtd On 2/27/06, Artem B. Bityutskiy <dedekind@yandex.ru> wrote: > Josh, I believe you're only *partially* right. I'm surprised I'm right at all :). Normally I just make an idiot of myself. > 1. The region is an attribute of the *flash chip*. The region is *not* an > attribute of the MTD device. An MTD device may describe the whole flash chip, > it may describe a part of a flash chip (partitions), and it may describe > many flash chips (concatenation). > > So, I state: the notion of a region has nothing to do with the notion of > an MTD device. Ok. I agree with that. Regions are indeed a chip thing. > If you have a specific application which works with a *flash chip*, not > an *MTD device*, please introduce another software object, a sort of > "struct flash_chip". Let your applications work with these "struct > flash_chip" objects. > > Putting references to regions into the MTD device structure (struct > mtd_info) is *bad*. Yes. And that is where we are getting our signals crossed. Or maybe it was just me. I was thinking that you wanted to take the region info and make it mandatory for a new MTD to be created. If it's still valid for the _chip_ driver to abstract away the region stuff, I'm all for that. > 2. Well, you're in ecstasy over the current approach (kidding :-) ). Then > explain to me why, if, let's say, JFFS3 opens an MTD device, it has access > to some weird "regions"? What on earth is going on? This is definitely > error-prone. This is messy. This is bad design. JFFS3 does not want to > see regions. Right. This is indeed not really wanted. Are there even users of the region info in the mtd_info structure? > 3. Regions... How do you look at them? > > We could look at them like this: > a). 
They are different entities, used for completely different purposes, > a rare program needs to use all of them simultaneously, they may have > different eraseblock size and other properties, they are just different. > Yes, it happened to be that they are on one single chip, but they are > still different. > > Or we could look at them narrower: > b). A region is a kind of "appendix" to the flash chip, where the > bootcode or whatever may be put. There is not more then one (or really > few) regions per flash chip. There is only one big and main "data" > region, and one or few small auxiliary regions. > > I reckon that the reality is more like b). And in this respect I see > your point. Indeed, what for should we spend more DRAM to describe this > unworthy "appendix" (aka region) by a distinct software object? Let's > push this all together! > > I can agree with you *partially*, providing that chips like a). are > *not* going to appear. Partially means that I only agree that there *may > be* no reason to create a distinct MTD device for this poor appendix. > > So I state: Ok, in b)-like chips, do not create a distinct MTD device. > Let's invent something different. But this must not be in struct > mtd_info anyway. Or not invent anything at all unless it's really, really, really needed :). > > Do you see my points? Yep. I think we agree for the most part now. josh ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-18 9:10 ` Thomas Gleixner 2006-02-18 16:31 ` Vitaly Wool @ 2006-02-18 18:11 ` Russ Dill 2006-02-19 0:29 ` Charles Manning 2006-02-19 8:29 ` Thomas Gleixner 1 sibling, 2 replies; 34+ messages in thread From: Russ Dill @ 2006-02-18 18:11 UTC (permalink / raw) To: tglx; +Cc: Charles Manning, linux-mtd, yaffs > Some words about Reed Solomon. > > Reed Solomon needs hardware support for performance reasons. Efficient > usage of Reed Solomon requires a different Data / RS-code layout: At what level is hardware support required? I'm involved in the design of a new system with a 466 MHz 80200. Should FPGA considerations be made for ECC correction? What sort of logic would be best to put in the FPGA? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-18 18:11 ` Russ Dill @ 2006-02-19 0:29 ` Charles Manning 2006-02-19 5:08 ` Jon Masters 2006-02-19 8:29 ` Thomas Gleixner 1 sibling, 1 reply; 34+ messages in thread From: Charles Manning @ 2006-02-19 0:29 UTC (permalink / raw) To: Russ Dill; +Cc: linux-mtd, yaffs On Sunday 19 February 2006 07:11, Russ Dill wrote: > > Some words about Reed Solomon. > > > > Reed Solomon needs hardware support for performance reasons. Efficient > > usage of Reed Solomon requires a different Data / RS-code layout: > > At what level is hardware support required? I'm involved in the design > of a new system with a 466 MHz 80200. Should FPGA considerations be > made for ECC correction? What sort of logic would be best to put in > the FPGA? If you have NAND going past or through the FPGA then thinking about RS or ECC is a GoodIdea. ECC is pretty expensive (yaffs_ecc.c is a bit faster than nand_ecc.c, but still requires quite a bit of computation). From what I've heard, RS is a lot more expensive. I have not looked at RS yet, but a while ago I "pencil designed" a fast and simple ECC scheme that needs approx 22 flipflops + some other gates. This mechanism is used pretty much as follows: 1) Clear flipflops. 2) Transfer data. 3) Read calculated ECC out of flipflops. So the actual correction is still done in software, but the expensive part - the calculation - is done in hw. This gets you most of the way there with limited hw costs. My hunch is that you could do something similar for RS. -- Charles ^ permalink raw reply [flat|nested] 34+ messages in thread
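The clear / transfer / read-out flow described above can be modelled in software. The sketch below is an assumption-laden illustration, not Charles's pencil design and not bit-compatible with nand_ecc.c or yaffs_ecc.c: it keeps 22 bits of state for a 256-byte block (two 11-bit registers), accumulates them while data streams past, and leaves the correction step to software. All names are hypothetical.

```python
ADDR_BITS = 11                 # 256 bytes * 8 bits = 2048 bit positions
MASK = (1 << ADDR_BITS) - 1


class StreamingEcc:
    """Software model of a 22-bit parity accumulator (hypothetical)."""

    def __init__(self):
        self.clear()

    def clear(self):               # step 1: clear the "flipflops"
        self.p = 0                 # XOR of the addresses of all 1 bits
        self.p_inv = 0             # XOR of the complemented addresses of all 1 bits
        self.count = 0

    def feed(self, byte):          # step 2: called once per transferred byte
        for bit in range(8):
            if (byte >> bit) & 1:
                addr = self.count * 8 + bit
                self.p ^= addr
                self.p_inv ^= addr ^ MASK
        self.count += 1

    def read(self):                # step 3: read the calculated ECC
        return (self.p_inv << ADDR_BITS) | self.p


def compute_ecc(data):
    if len(data) != 256:
        raise ValueError("this sketch covers exactly 256 bytes")
    ecc = StreamingEcc()
    for byte in data:
        ecc.feed(byte)
    return ecc.read()


def correct(data, stored_ecc):
    """Software correction: 0 = clean, 1 = single-bit error fixed, -1 = uncorrectable."""
    syndrome = stored_ecc ^ compute_ecc(data)
    if syndrome == 0:
        return 0
    low, high = syndrome & MASK, syndrome >> ADDR_BITS
    if low ^ high == MASK:         # every parity pair disagrees: one data bit flipped
        data[low >> 3] ^= 1 << (low & 7)
        return 1
    if bin(syndrome).count("1") == 1:
        return 1                   # the flipped bit is in the stored ECC, data is fine
    return -1
```

A single flipped data bit at address `a` changes `p` by `a` and `p_inv` by the complement of `a`, so the low half of the syndrome gives the failing bit position directly; two flipped data bits make both halves equal and are reported as uncorrectable.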
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-19 0:29 ` Charles Manning @ 2006-02-19 5:08 ` Jon Masters 0 siblings, 0 replies; 34+ messages in thread From: Jon Masters @ 2006-02-19 5:08 UTC (permalink / raw) To: Charles Manning; +Cc: Russ Dill, linux-mtd, yaffs On 2/19/06, Charles Manning <manningc2@actrix.gen.nz> wrote: > If you have NAND going past or through the FPGA then thinking about RS or ECC > is a GoodIdea. On a related note, does anyone know of a good Xilinx reference board that comes with a good amount of NAND already soldered on? Having just quit working with Montavista stuff, I'm looking to pick up a board for myself at home - something like an ML403 but with a good amount of NAND would be nice :-) Jon. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-18 18:11 ` Russ Dill 2006-02-19 0:29 ` Charles Manning @ 2006-02-19 8:29 ` Thomas Gleixner 2006-02-23 0:46 ` Russ Dill 1 sibling, 1 reply; 34+ messages in thread From: Thomas Gleixner @ 2006-02-19 8:29 UTC (permalink / raw) To: Russ Dill; +Cc: Charles Manning, linux-mtd, yaffs On Sat, 2006-02-18 at 10:11 -0800, Russ Dill wrote: > > Some words about Reed Solomon. > > > > Reed Solomon needs hardware support for performance reasons. Efficient > > usage of Reed Solomon requires a different Data / RS-code layout: > > At what level is hardware support required? I'm involved in the design > of a new system with a 466 MHz 80200. Should FPGA considerations be > made for ECC correction? What sort of logic would be best to put in > the FPGA? Depends. The 1-bit correction / 2-bit detection Hamming ECC algorithm found in the kernel is not too bad, but Reed Solomon is a quite computationally expensive algorithm. Look into the encoder / decoder code in lib/reed_solomon. In general you have to iterate over the data buffer and compute on each step. The performance penalty depends on the complexity of the algorithm. If you have enough space in your FPGA then it's definitely a good idea to put some ECC calculation mechanism into it. There are implementations for both ECC and Reed Solomon available. tglx ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-19 8:29 ` Thomas Gleixner @ 2006-02-23 0:46 ` Russ Dill 2006-02-23 7:36 ` Thomas Gleixner 0 siblings, 1 reply; 34+ messages in thread From: Russ Dill @ 2006-02-23 0:46 UTC (permalink / raw) To: tglx; +Cc: Charles Manning, linux-mtd, yaffs > Depends. The 1-bit correction / 2-bit detection Hamming ECC algorithm found > in the kernel is not too bad, but Reed Solomon is a quite computationally > expensive algorithm. Look into the encoder / decoder code in > lib/reed_solomon. Do NAND errors tend to clump in sequences of bits (i.e., something Reed Solomon is particularly good at)? > In general you have to iterate over the data buffer and compute on each > step. The performance penalty depends on the complexity of the > algorithm. If you have enough space in your FPGA then it's definitely a > good idea to put some ECC calculation mechanism into it. There are > implementations for both ECC and Reed Solomon available. Care to recommend any? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Yaffs] bit error rates --> a vendor speaks 2006-02-23 0:46 ` Russ Dill @ 2006-02-23 7:36 ` Thomas Gleixner 0 siblings, 0 replies; 34+ messages in thread From: Thomas Gleixner @ 2006-02-23 7:36 UTC (permalink / raw) To: Russ Dill; +Cc: Charles Manning, linux-mtd, yaffs On Wed, 2006-02-22 at 17:46 -0700, Russ Dill wrote: > > Depends. The 1-bit correction / 2-bit detection Hamming ECC algorithm found > > in the kernel is not too bad, but Reed Solomon is a quite computationally > > expensive algorithm. Look into the encoder / decoder code in > > lib/reed_solomon. > > > Do NAND errors tend to clump in sequences of bits (i.e., something Reed > Solomon is particularly good at)? In tests with AG-AND, which is known to be less reliable, I saw the bits flip randomly in several consecutive bytes. > > In general you have to iterate over the data buffer and compute on each > > step. The performance penalty depends on the complexity of the > > algorithm. If you have enough space in your FPGA then it's definitely a > > good idea to put some ECC calculation mechanism into it. There are > > implementations for both ECC and Reed Solomon available. > > Care to recommend any? I don't remember where the Hamming ECC was from, but a Reed Solomon encoder is available on opencores.org. And there are several generators out there which produce RS encoder VHDL code from the given parameters. Someone used this one with success http://home.arcor.de/christianschuler/software/genenc_v1_2_tar.gz You need to build the interfaces around it though. Parameters used back then: -b 10 data width -n 518 total length of code word -k 512 number of information words -m 10 symbol width -p 10000001001 polynomial Result below. Some hints: Expand the 8 bit data bus to 10 bits by setting bits 8/9 hard to 1. Before feeding the resulting 10 bit word into the generator, invert it. Also invert the resulting RS code. The reason is: code for all data 0x00 is 0 0 0 0 0 0. 
Empty flash (0xff) expanded with bit 8/9 feeds 0x3ff to the inverter, which results in 0x000 for the generator input. Now you invert the resulting code and get 6 words of 0x3ff. That way you need no special checking for all empty flash as the resulting code will be correct. tglx ------------------------------------------------------------------------------- -- Parallel Reed-Solomon Encoder (ENTITY) -- automatically generated by program 'genenc_solaris' -- ./genenc_solaris -e 2 -b 10 -n 518 -k 512 -m 10 -p 10000001001 > rs_ecc_nand1.vhdl -- -- Date Thu Jun 23 08:07:57 2005 -- ------------------------------------------------------------------------------- LIBRARY ieee; USE ieee.std_logic_1164.ALL; ENTITY rs_enc IS PORT ( clk : IN STD_LOGIC; enable : IN STD_LOGIC; reset : IN STD_LOGIC; out_enb : OUT STD_LOGIC; d_in : IN STD_LOGIC_VECTOR(9 DOWNTO 0); d_out : OUT STD_LOGIC_VECTOR(9 DOWNTO 0) ); END rs_enc; ------------------------------------------------------------------------------- -- Parallel Reed-Solomon Encoder (ARCHITECTURE) -- automatically generated by program 'genenc_solaris' -- -- Date Thu Jun 23 08:07:57 2005 -- ------------------------------------------------------------------------------- ARCHITECTURE rtl OF rs_enc IS CONSTANT nn: INTEGER := 518; -- Number of code symbols CONSTANT kk: INTEGER := 512; -- Number of information symbols CONSTANT bw: INTEGER := 10; -- Bussize in bits CONSTANT mm: INTEGER := 10; -- symbol size in bits CONSTANT last_in: INTEGER := 512; -- Last input clock cycle CONSTANT last_out: INTEGER := 518; -- Last output clock cycle ------------------------------------------------------------------------------- -- Galois field multiplier functions for Generator Polynomial -- RS(518,512) encoder: -- G(0): FUNCTION gf_mul_21 ( bb: STD_LOGIC_VECTOR(9 DOWNTO 0)) RETURN STD_LOGIC_VECTOR IS VARIABLE cc_next: STD_LOGIC_VECTOR(9 DOWNTO 0); BEGIN cc_next(0) := bb(3) XOR bb(9); cc_next(1) := bb(0) XOR bb(4); cc_next(2) := bb(1) XOR bb(5); cc_next(3) 
:= bb(2) XOR bb(3) XOR bb(6) XOR bb(9); cc_next(4) := bb(3) XOR bb(4) XOR bb(7); cc_next(5) := bb(4) XOR bb(5) XOR bb(8); cc_next(6) := bb(5) XOR bb(6) XOR bb(9); cc_next(7) := bb(0) XOR bb(6) XOR bb(7); cc_next(8) := bb(1) XOR bb(7) XOR bb(8); cc_next(9) := bb(2) XOR bb(8) XOR bb(9); RETURN cc_next; END gf_mul_21; -- G(1): FUNCTION gf_mul_981 ( bb: STD_LOGIC_VECTOR(9 DOWNTO 0)) RETURN STD_LOGIC_VECTOR IS VARIABLE cc_next: STD_LOGIC_VECTOR(9 DOWNTO 0); BEGIN cc_next(0) := bb(3) XOR bb(6) XOR bb(7) XOR bb(8) XOR bb(9); cc_next(1) := bb(0) XOR bb(4) XOR bb(7) XOR bb(8) XOR bb(9); cc_next(2) := bb(0) XOR bb(1) XOR bb(5) XOR bb(8) XOR bb(9); cc_next(3) := bb(0) XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(7) XOR bb(8); cc_next(4) := bb(0) XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(4) XOR bb(8) XOR bb(9); cc_next(5) := bb(1) XOR bb(2) XOR bb(3) XOR bb(4) XOR bb(5) XOR bb(9); cc_next(6) := bb(2) XOR bb(3) XOR bb(4) XOR bb(5) XOR bb(6); cc_next(7) := bb(0) XOR bb(3) XOR bb(4) XOR bb(5) XOR bb(6) XOR bb(7); cc_next(8) := bb(1) XOR bb(4) XOR bb(5) XOR bb(6) XOR bb(7) XOR bb(8); cc_next(9) := bb(2) XOR bb(5) XOR bb(6) XOR bb(7) XOR bb(8) XOR bb(9); RETURN cc_next; END gf_mul_981; -- G(2): FUNCTION gf_mul_312 ( bb: STD_LOGIC_VECTOR(9 DOWNTO 0)) RETURN STD_LOGIC_VECTOR IS VARIABLE cc_next: STD_LOGIC_VECTOR(9 DOWNTO 0); BEGIN cc_next(0) := bb(2) XOR bb(6) XOR bb(9); cc_next(1) := bb(3) XOR bb(7); cc_next(2) := bb(4) XOR bb(8); cc_next(3) := bb(2) XOR bb(5) XOR bb(6); cc_next(4) := bb(0) XOR bb(3) XOR bb(6) XOR bb(7); cc_next(5) := bb(1) XOR bb(4) XOR bb(7) XOR bb(8); cc_next(6) := bb(2) XOR bb(5) XOR bb(8) XOR bb(9); cc_next(7) := bb(3) XOR bb(6) XOR bb(9); cc_next(8) := bb(0) XOR bb(4) XOR bb(7); cc_next(9) := bb(1) XOR bb(5) XOR bb(8); RETURN cc_next; END gf_mul_312; -- G(3): FUNCTION gf_mul_606 ( bb: STD_LOGIC_VECTOR(9 DOWNTO 0)) RETURN STD_LOGIC_VECTOR IS VARIABLE cc_next: STD_LOGIC_VECTOR(9 DOWNTO 0); BEGIN cc_next(0) := bb(0) XOR bb(1) XOR bb(2) XOR bb(4) XOR bb(7); cc_next(1) := bb(0) 
XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(5) XOR bb(8); cc_next(2) := bb(0) XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(4) XOR bb(6) XOR bb(9); cc_next(3) := bb(0) XOR bb(3) XOR bb(5); cc_next(4) := bb(1) XOR bb(4) XOR bb(6); cc_next(5) := bb(2) XOR bb(5) XOR bb(7); cc_next(6) := bb(0) XOR bb(3) XOR bb(6) XOR bb(8); cc_next(7) := bb(1) XOR bb(4) XOR bb(7) XOR bb(9); cc_next(8) := bb(0) XOR bb(2) XOR bb(5) XOR bb(8); cc_next(9) := bb(0) XOR bb(1) XOR bb(3) XOR bb(6) XOR bb(9); RETURN cc_next; END gf_mul_606; -- G(4): FUNCTION gf_mul_305 ( bb: STD_LOGIC_VECTOR(9 DOWNTO 0)) RETURN STD_LOGIC_VECTOR IS VARIABLE cc_next: STD_LOGIC_VECTOR(9 DOWNTO 0); BEGIN cc_next(0) := bb(0) XOR bb(3) XOR bb(9); cc_next(1) := bb(0) XOR bb(1) XOR bb(4); cc_next(2) := bb(1) XOR bb(2) XOR bb(5); cc_next(3) := bb(2) XOR bb(6) XOR bb(9); cc_next(4) := bb(3) XOR bb(7); cc_next(5) := bb(4) XOR bb(8); cc_next(6) := bb(5) XOR bb(9); cc_next(7) := bb(0) XOR bb(6); cc_next(8) := bb(1) XOR bb(7); cc_next(9) := bb(2) XOR bb(8); RETURN cc_next; END gf_mul_305; -- G(5): FUNCTION gf_mul_967 ( bb: STD_LOGIC_VECTOR(9 DOWNTO 0)) RETURN STD_LOGIC_VECTOR IS VARIABLE cc_next: STD_LOGIC_VECTOR(9 DOWNTO 0); BEGIN cc_next(0) := bb(4) XOR bb(5) XOR bb(6) XOR bb(7) XOR bb(8) XOR bb(9); cc_next(1) := bb(0) XOR bb(5) XOR bb(6) XOR bb(7) XOR bb(8) XOR bb(9); cc_next(2) := bb(0) XOR bb(1) XOR bb(6) XOR bb(7) XOR bb(8) XOR bb(9); cc_next(3) := bb(0) XOR bb(1) XOR bb(2) XOR bb(4) XOR bb(5) XOR bb(6); cc_next(4) := bb(0) XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(5) XOR bb(6) XOR bb(7); cc_next(5) := bb(0) XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(4) XOR bb(6) XOR bb(7) XOR bb(8); cc_next(6) := bb(0) XOR bb(1) XOR bb(2) XOR bb(3) XOR bb(4) XOR bb(5) XOR bb(7) XOR bb(8) XOR bb(9); cc_next(7) := bb(1) XOR bb(2) XOR bb(3) XOR bb(4) XOR bb(5) XOR bb(6) XOR bb(8) XOR bb(9); cc_next(8) := bb(2) XOR bb(3) XOR bb(4) XOR bb(5) XOR bb(6) XOR bb(7) XOR bb(9); cc_next(9) := bb(3) XOR bb(4) XOR bb(5) XOR bb(6) XOR bb(7) XOR bb(8); RETURN cc_next; END 
gf_mul_967; -- G(6): input equals output ! ------------------------------------------------------------------------------- TYPE bb_t IS ARRAY (0 TO nn-kk-1) OF STD_LOGIC_VECTOR(mm-1 DOWNTO 0); TYPE prod_t IS ARRAY (0 TO nn-kk-1) OF STD_LOGIC_VECTOR(mm-1 DOWNTO 0); ------------------------------------------------------------------------------- SIGNAL bb: bb_t; SIGNAL rs_calc : STD_LOGIC; SIGNAL rs_ins : STD_LOGIC; BEGIN main: PROCESS (clk,bb,d_in,reset) VARIABLE feedback: STD_LOGIC_VECTOR(mm-1 DOWNTO 0); VARIABLE product: bb_t; BEGIN FOR i IN 0 TO mm - 1 LOOP feedback:= bb(0) XOR d_in; END LOOP; product(0) := gf_mul_967(feedback); product(1) := gf_mul_305(feedback); product(2) := gf_mul_606(feedback); product(3) := gf_mul_312(feedback); product(4) := gf_mul_981(feedback); product(5) := gf_mul_21(feedback); IF reset = '1' THEN FOR i IN 0 TO nn-kk-1 LOOP bb(i) <= (others => '0'); END LOOP; out_enb <= '0'; ELSIF clk = '1' AND clk'EVENT THEN IF enable = '1' THEN out_enb <= '1'; IF rs_ins = '0' AND rs_calc = '1' THEN -- calculate: FOR i IN 0 TO 4 LOOP bb(i) <= bb(i+1) XOR product(i); END LOOP; bb(5) <= product(5); d_out <= d_in; ELSIF rs_ins = '1' AND rs_calc = '0' THEN -- insert and shift: d_out <= bb(0); FOR i IN 0 TO nn-kk-2 LOOP bb(i) <= bb(i+1); END LOOP; bb(nn-kk-1) <= (others =>'0'); ELSIF rs_ins = '0' AND rs_calc = '0' THEN -- bypass: d_out <= d_in; END IF; ELSE out_enb <= '0'; END IF; END IF; END PROCESS; --main control: PROCESS (clk,reset) VARIABLE b_cnt : INTEGER RANGE 0 TO 518; BEGIN IF reset = '1' THEN rs_calc <= '1'; rs_ins <= '0'; b_cnt := 0; ELSE IF clk = '1' AND clk'EVENT THEN IF enable = '1' THEN IF b_cnt = 0 THEN rs_calc <= '1'; rs_ins <= '0'; b_cnt := b_cnt + 1; ELSIF b_cnt = last_in - 1 THEN rs_calc <= '0'; rs_ins <= '1'; b_cnt := b_cnt + 1; ELSIF b_cnt = last_out - 1 THEN rs_calc <= '1'; rs_ins <= '0'; b_cnt := 0; ELSE b_cnt := b_cnt + 1; END IF; END IF; END IF; END IF; END PROCESS; -- ctrl END rtl; -- rs_enc ^ permalink raw reply [flat|nested] 34+ 
messages in thread
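The expand-and-invert hint in the message above can be checked with a small software model. The following is a sketch, not the generated hardware: it assumes the generator polynomial has the roots alpha^1 through alpha^6 over GF(2^10) with the field polynomial x^10 + x^3 + 1 (the -p parameter). Under that assumption the coefficients come out equal to the constant multipliers in the VHDL (for example, gf_mul_21 multiplies by alpha^21 = x^7 + x). All function names here are made up for illustration, and the model says nothing about the timing of the rs_enc ports.

```python
PRIM_POLY = 0b10000001001   # x^10 + x^3 + 1, taken from the "-p" parameter
M = 10                      # symbol width in bits ("-m 10")
MASK = (1 << M) - 1         # 0x3ff
NPAR = 6                    # n - k = 518 - 512 parity symbols


def gf_mul(a, b):
    """Multiply two GF(2^10) elements (shift-and-add, reduced by PRIM_POLY)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= PRIM_POLY
    return result


def generator_poly():
    """g(x) = (x + a^1)(x + a^2)...(x + a^6), lowest-degree coefficient first.

    The choice of roots a^1..a^6 is an assumption of this sketch.
    """
    g = [1]
    root = 1
    for _ in range(NPAR):
        root = gf_mul(root, 2)              # next power of alpha (alpha = 2)
        nxt = [0] * (len(g) + 1)
        for i, coef in enumerate(g):
            nxt[i] ^= gf_mul(coef, root)    # coef * root
            nxt[i + 1] ^= coef              # coef * x
        g = nxt
    return g


def rs_parity(symbols, g):
    """Remainder of symbols(x) * x^6 divided by g(x), computed LFSR-style."""
    reg = [0] * NPAR                        # reg[NPAR-1] is the top register
    for s in symbols:
        fb = s ^ reg[NPAR - 1]
        for i in range(NPAR - 1, 0, -1):
            reg[i] = reg[i - 1] ^ gf_mul(fb, g[i])
        reg[0] = gf_mul(fb, g[0])
    return reg[::-1]                        # highest-order parity symbol first


def encode_page(data):
    """Parity for 512 data bytes using the expand/invert trick from the mail."""
    g = generator_poly()
    # Set bits 8 and 9, then invert: byte 0xff -> 0x3ff -> 0x000.
    symbols = [(b | 0x300) ^ MASK for b in data]
    # Invert the parity as well, so an erased page carries all-ones parity.
    return [p ^ MASK for p in rs_parity(symbols, g)]


if __name__ == "__main__":
    print([hex(p) for p in encode_page(bytes([0xff]) * 512)])
```

For an erased page, every generator input is 0x000, the linear encoder produces zero parity, and the final inversion turns that into six words of 0x3ff, which is what erased OOB bytes would read back as.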
* Re: [Yaffs] bit error rates --> a vendor speaks
  2006-02-16  1:32 ` [Yaffs] bit error rates --> a vendor speaks Charles Manning
  2006-02-18  9:10   ` Thomas Gleixner
@ 2006-02-23  8:31   ` Vitaly Wool
  2006-02-24  9:51     ` Artem B. Bityutskiy
  1 sibling, 1 reply; 34+ messages in thread
From: Vitaly Wool @ 2006-02-23  8:31 UTC
  To: Charles Manning; +Cc: William Watson, linux-mtd, yaffs

Hello folks,

my two cents...

The way out I'd like to suggest is to [re]design the NAND filesystems
(jffs{2|3}, yaffs[2]) in such a way that no essential information is stored
in the OOB area. That will mean no compatibility/applicability issues.

But how to deal with the speed decrease that comes with this approach? I
think it's reasonable to store speedup information (indexes, hash values,
etc.) in the OOB area. This speedup info will need to be designed in an
adaptable way, as the number of spare OOB bytes can be anything.

Also, the tools for writing the filesystem image onto the flash had better
be capable of generating this speedup info as they write the image;
otherwise the first mount will take a lot of time (as it happens only once,
that is probably acceptable too, but less convenient).

And... I do not like the idea to remove OOB-handling stuff from the MTD
layer. I also do not like the idea to make a separate interface for NAND
layer, I'm afraid this will lead to confusion and hard-to-track errors.

Hope that makes sense,

Best regards,
   Vitaly

Charles Manning wrote:

> To mtd-ers... Please forgive the cross-posting, but I think the content is
> sufficiently important to those who use NAND but are not plugged in to the
> YAFFS list.
>
> On Thursday 16 February 2006 08:53, William Watson wrote:
>
>> I will also note that a NAND vendor who paid us a visit at about that same
>> time said that we should expect WORSE soft error behaviour with succeeding
>> generations of NAND flash chips. The geometries would get smaller and
>> smaller, the chip dies would get larger and larger, and the amount of time
>> for production testing of each chip would not increase, or at least, not
>> increase as fast as the total storage of a chip. Thus, the testing per
>> page would only go down in subsequent generations of chips. These two
>> statements seemed to say that we would see both (1) increased rates of ECC
>> errors, and (2) an increase in the number of marginal blocks not marked bad
>> by the chip vendor.
>
> A vendor on the list contacted me off list, so I asked their permission before
> posting what they said on-list. I got that permission so long as their name
> was removed.
>
> As William states, it seems that the reliability has peaked. NAND is expected
> to get worse, and a move to better correction schemes is encouraged.
>
> For the most part, this is not really a YAFFS2 issue since the ECC mechanism
> (on the data) is not part of YAFFS2 per se. This is now part of the mtd, or
> flash driver for non-Linux (e.g. William's case). For YAFFS2, the impact is
> probably limited to:
> 1) ECC on tags.... Tags are so small that a single-bit correction is probably
> enough. Multibit is probably a good thing to investigate.
> 2) More OOB being used for multi-bit schemes will probably mean less space
> available for tags.
> 3) An emerging need for more forgiving block retirement.
> 4) Thinking about "spreading" to reduce write disturb.
>
> Thus far it has been quite hard to engage NAND vendors but it seems some are
> now willing to talk a bit more. I will try to discuss the issues to better
> understand them.
>
> -- Charles
>
> Without further blaah, the vendor's words, slightly edited:
>
> It's difficult to decide what the best block retirement policy should be.
> For a program or erase failure, block retirement should be mandatory
> because internally, the chip has already tried to program or erase multiple
> times already. For ECC (soft errors), it's more of an open question.
> Should the errors be scrubbed and the data written back (to the same
> location or a different location) or should the data be moved and the block
> permanently retired?
>
> One thing I can say with certainty, overall NAND flash reliability is
> degrading. At the time the application note was written, 0.16 micron based
> NAND flash had a block write/erase endurance of 250k-1M cycles with
> reasonable data retention. However, as lithography shrinks have continued
> (0.16um -> 0.13um -> 90nm -> 70nm), the physical cell area has been cut by
> roughly half at each die shrink. The physics of the materials don't
> change, therefore, less charge is being stored in the memory cell every
> generation. 90nm SLC (single level cell) NAND flash was nominally rated at
> 100K write/erase cycles per block, and it is very likely that 70nm SLC NAND
> flash will be significantly less. Better ECC is going to be necessary.
> The single bit correcting Hamming code that is currently used for SLC NAND
> was originally designed for SmartMedia over 10 years ago. Today, most
> memory cards using NAND flash implement Reed Solomon ECC capable of
> correcting 4 or more symbol errors (typically 8 or more bits per symbol)
> per 512 bytes. This enables the ability to correct 4 random errors per
> sector. However, for speed reasons, the ECC is usually done in hardware.
> If a multi-bit ECC was implemented, then it would be possible to implement
> a policy like: if there is the maximum number of correctable errors or an
> uncorrectable error in a page, then the block can be permanently retired,
> otherwise, scrub the data and move to a new location. But this kind of
> policy would be difficult to implement with single bit correcting Hamming
> code. Disturbance errors (read disturb, program disturb) will become more
> probable in the future due to increased capacitive coupling between bit
> lines and word lines due to decreased separation at each lithography
> shrink.
>
> One fact that is somewhat underappreciated is that one cannot have both
> high block write/erase endurance and long data retention simultaneously.
> One can have better data retention if the block sees fewer write/erase
> cycles. The more the data is spread out across all the blocks, the better
> the overall data retention since every block is written and erased as few
> times as necessary. Journaling appears to be one of the best ways to spread
> out the writes but some kind of static data wear leveling could improve it
> further (however, there might be IP issues).
>
> Keep up the good work,
>
> Sincerely,
> [name scrubbed]
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
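The retirement policy the vendor describes in the quoted text can be written down as a small decision function. This is only a sketch of that description: the names (`Action`, `block_action`) and the default threshold of 4 correctable symbols per sector are assumptions for illustration (the 4 is borrowed from the vendor's memory-card example), not anything YAFFS or MTD implements.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"        # page read back clean, nothing to do
    SCRUB = "scrub"      # rewrite the data elsewhere, keep the block in service
    RETIRE = "retire"    # move what can be saved, mark the block bad for good


def block_action(program_or_erase_failed, corrected_symbols,
                 uncorrectable, max_correctable=4):
    """Pick an action for a block after one operation on it.

    program_or_erase_failed: the chip reported a program or erase failure
    corrected_symbols:       symbol errors the ECC fixed in the worst sector
    uncorrectable:           the ECC gave up on at least one sector
    max_correctable:         correction limit of the ECC (assumed value)
    """
    # The chip has already retried internally, so retirement is mandatory.
    if program_or_erase_failed:
        return Action.RETIRE
    # At or beyond the ECC limit there is no margin left in this block.
    if uncorrectable or corrected_symbols >= max_correctable:
        return Action.RETIRE
    # Some errors, but comfortably correctable: scrub and relocate the data.
    if corrected_symbols > 0:
        return Action.SCRUB
    return Action.NONE


if __name__ == "__main__":
    print(block_action(False, 1, False))   # Action.SCRUB
    print(block_action(False, 4, False))   # Action.RETIRE
```

With a single-bit-correcting Hamming code, `max_correctable` would be 1, so every corrected error would already sit at the limit; that is one way to read the vendor's remark that such a policy is hard to implement without multi-bit ECC.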
* Re: [Yaffs] bit error rates --> a vendor speaks
  2006-02-23  8:31 ` Vitaly Wool
@ 2006-02-24  9:51 ` Artem B. Bityutskiy
  0 siblings, 0 replies; 34+ messages in thread
From: Artem B. Bityutskiy @ 2006-02-24  9:51 UTC
  To: Vitaly Wool; +Cc: William Watson, Charles Manning, linux-mtd, yaffs

Vitaly Wool wrote:
> And... I do not like the idea to remove OOB-handling stuff from the MTD
> layer. I also do not like the idea to make a separate interface for NAND
> layer, I'm afraid this will lead to confusion and hard-to-track errors.

Having a generalized MTD layer on top of the NAND, NOR, etc. layers cannot
lead to any confusion. What can confuse you? This is just nice layering.

-- 
Best Regards,
Artem B. Bityutskiy,
St.-Petersburg, Russia.
Thread overview: 34+ messages
[not found] <43EB96DC.3030900@eptar.com>
[not found] ` <35fb2e590602100558s2d868fa3o1752fbf3217439e4@mail.gmail.com>
[not found] ` <d97046180602151153g23064424x9e1ddf83a1d7ae4f@mail.gmail.com>
2006-02-16 1:32 ` [Yaffs] bit error rates --> a vendor speaks Charles Manning
2006-02-18 9:10 ` Thomas Gleixner
2006-02-18 16:31 ` Vitaly Wool
2006-02-19 8:22 ` Thomas Gleixner
2006-02-20 20:42 ` Charles Manning
2006-02-20 21:37 ` Thomas Gleixner
2006-02-20 22:40 ` Charles Manning
2006-02-20 23:18 ` Thomas Gleixner
2006-02-21 0:29 ` Jon Masters
2006-02-21 8:26 ` Thomas Gleixner
2006-02-21 9:35 ` Jörn Engel
2006-02-21 1:08 ` [Yaffs] bit error rates --> YAFFS for devices with no OOB Charles Manning
2006-02-21 2:12 ` Jon Masters
2006-02-22 0:38 ` Jamie Lokier
2006-02-21 12:14 ` [Yaffs] bit error rates --> a vendor speaks Artem B. Bityutskiy
2006-02-21 13:50 ` Josh Boyer
2006-02-21 14:36 ` Artem B. Bityutskiy
2006-02-21 14:49 ` Artem B. Bityutskiy
2006-02-21 11:59 ` Artem B. Bityutskiy
2006-02-21 12:06 ` Thomas Gleixner
2006-02-25 11:58 ` Artem B. Bityutskiy
2006-02-27 13:27 ` Josh Boyer
2006-02-27 16:01 ` Artem B. Bityutskiy
2006-02-27 16:15 ` Josh Boyer
2006-02-27 17:21 ` Artem B. Bityutskiy
2006-02-27 17:40 ` Josh Boyer
2006-02-18 18:11 ` Russ Dill
2006-02-19 0:29 ` Charles Manning
2006-02-19 5:08 ` Jon Masters
2006-02-19 8:29 ` Thomas Gleixner
2006-02-23 0:46 ` Russ Dill
2006-02-23 7:36 ` Thomas Gleixner
2006-02-23 8:31 ` Vitaly Wool
2006-02-24 9:51 ` Artem B. Bityutskiy