* Problem with clean markers/partial writes on Micron 4-bit ECC NAND
@ 2011-06-17 17:52 Peter Barada
2011-06-17 21:00 ` Ivan Djelic
2011-06-23 8:46 ` Artem Bityutskiy
0 siblings, 2 replies; 5+ messages in thread
From: Peter Barada @ 2011-06-17 17:52 UTC (permalink / raw)
To: linux-mtd, Eric Nelson, Peter Barada
I'm using a 2K page Micron NAND that has an internal 4-bit ECC engine.
The Micron NAND chip uses 8 bytes of ECC per 512 bytes of main data area
plus 4 bytes of the OOB. This allows the 32 bytes of ECC to protect 2048
bytes of the main data area and 16 bytes of the OOB area.
The problem I'm running into with JFFS2 is that empty flash is first
marked with a clean marker into the OOB, and then a 2nd write to the
main data area is done (w/o an intervening erase) to that page with data
which corrupts the ECCs that were first modified by writing the cleanmarker.
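A minimal sketch of why the second write damages the stored ECC, assuming the usual NAND behaviour that programming without an erase can only change bits from 1 to 0. The helper names (nand_program, count_mismatches) are made up for illustration and are not MTD functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of programming NAND cells without an intervening erase: a program
 * operation can only clear bits (1 -> 0), so the stored value ends up as
 * the bitwise AND of the old contents and the newly written data.
 */
static void nand_program(uint8_t *cells, const uint8_t *data, size_t len)
{
    assert(cells != NULL && data != NULL);
    for (size_t i = 0; i < len; i++)
        cells[i] &= data[i];
}

/* Count the bytes where the stored contents differ from what was intended. */
static size_t count_mismatches(const uint8_t *cells, const uint8_t *want,
                               size_t len)
{
    size_t n = 0;

    assert(cells != NULL && want != NULL);
    for (size_t i = 0; i < len; i++)
        if (cells[i] != want[i])
            n++;
    return n;
}
```

Programming the ECC for "cleanmarker only" into erased (0xFF) cells stores it intact. Programming a different ECC for "cleanmarker plus data" on top leaves the AND of the two values, which in general matches neither.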
The OOB layout I'm using (which allows ECC'ng 16 bytes of the OOB) is:
ecclayout = {
.eccbytes = 32,
.eccpos = { 8, 9, 10, 11, 12, 13, 14, 15,
24, 25, 26, 27, 28, 29, 30, 31,
40, 41, 42, 43, 44, 45, 46, 47,
56, 57, 58, 59, 60, 61, 62, 63},
.oobfree = {
{ .offset = 4,
.length = 4},
{ .offset = 20,
.length = 4},
{ .offset = 36,
.length = 4},
{ .offset = 52,
.length = 4},
},
};
After the cleanmarker is written into bytes 4-7 and 20-23 of the OOB,
nanddump shows:
OOB Data: ff ff ff ff 85 19 03 20 5a e3 da 69 01 40 f1 36
OOB Data: ff ff ff ff 08 00 00 00 91 99 3c 05 01 d0 5d b3
OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Note the ECCs in bytes 8-15 and 24-31 are no longer 0xFF, since the ECC
at bytes 8-15 covers data area bytes 0-511 as well as OOB bytes 4-7, and
the ECC at bytes 24-31 covers data area bytes 512-1023 as well as OOB
bytes 20-23.
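The coverage can be sketched as a small classifier over the 64-byte OOB. It assumes each 16-byte OOB section follows the same pattern: bytes 0-1 reserved (for example for the bad block marker, which is my assumption and is not stated above), bytes 2-3 free but not ECC'd, bytes 4-7 free and ECC'd, bytes 8-15 holding the ECC itself. The names are illustrative only:

```c
#include <assert.h>

enum oob_kind {
    OOB_RESERVED,     /* assumed: bytes 0-1 of each section */
    OOB_UNPROTECTED,  /* free bytes NOT covered by the on-die ECC */
    OOB_PROTECTED,    /* free bytes covered by the on-die ECC */
    OOB_ECC           /* the ECC bytes themselves */
};

#define OOB_SIZE     64u
#define SECTION_SIZE 16u

/* Classify one OOB byte position under the assumed per-section pattern. */
static enum oob_kind oob_classify(unsigned int pos)
{
    unsigned int off;

    assert(pos < OOB_SIZE);
    off = pos % SECTION_SIZE;
    if (off < 2)
        return OOB_RESERVED;
    if (off < 4)
        return OOB_UNPROTECTED;
    if (off < 8)
        return OOB_PROTECTED;
    return OOB_ECC;
}

/* Index (0-3) of the 512-byte data sector that shares an ECC with this byte. */
static unsigned int oob_section(unsigned int pos)
{
    assert(pos < OOB_SIZE);
    return pos / SECTION_SIZE;
}
```

Under these assumptions, any write that touches a position classified as OOB_PROTECTED changes the ECC of the matching data sector, which is what the nanddump output shows for the cleanmarker bytes.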
I believe I've figured out a workaround:
1) Modify the ecclayout to add the other 8 bytes of the OOB that are NOT
ECCd *after* the 16 bytes that are ECCd (so the ecc layout looks like):
ecclayout = {
.eccbytes = 32,
.eccpos = { 8, 9, 10, 11, 12, 13, 14, 15,
24, 25, 26, 27, 28, 29, 30, 31,
40, 41, 42, 43, 44, 45, 46, 47,
56, 57, 58, 59, 60, 61, 62, 63},
.oobfree = {
{ .offset = 4,
.length = 4},
{ .offset = 20,
.length = 4},
{ .offset = 36,
.length = 4},
{ .offset = 52,
.length = 4},
{ .offset = 2,
.length = 2},
{ .offset = 18,
.length = 2},
{ .offset = 34,
.length = 2},
{ .offset = 50,
.length = 2},
},
};
2) Then set ops.ooboffs to 16 in jffs2_write_nand_cleanmarker and
jffs2_check_nand_cleanmarker.
This "offsets" the read/writes by 16 bytes to move the cleanmarker into
OOB bytes that do not perturb the ECCs, and so far it looks to work.
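To illustrate why an offset of 16 lands on the unprotected bytes, here is a rough sketch of how I understand automatic OOB placement to walk the oobfree list: the logical offset is consumed region by region, in list order. The struct and function names are hypothetical, and the last four regions assume the unprotected bytes sit at offsets 2-3 of each 16-byte OOB section:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for one oobfree entry; not the kernel's type. */
struct oob_free_region {
    unsigned int offset;
    unsigned int length;
};

/* Free regions in list order: 16 ECC-protected bytes, then 8 unprotected. */
static const struct oob_free_region micron_free[] = {
    { 4, 4 }, { 20, 4 }, { 36, 4 }, { 52, 4 },  /* covered by ECC */
    { 2, 2 }, { 18, 2 }, { 34, 2 }, { 50, 2 },  /* assumed not covered */
};

/*
 * Translate a logical free-byte index into a physical OOB position by
 * walking the regions in order. Returns -1 if the index is out of range.
 */
static int oob_auto_to_phys(const struct oob_free_region *regions,
                            size_t nregions, unsigned int logical)
{
    assert(regions != NULL);
    for (size_t i = 0; i < nregions; i++) {
        if (logical < regions[i].length)
            return (int)(regions[i].offset + logical);
        logical -= regions[i].length;
    }
    return -1;
}
```

With this list, logical offsets 0-15 map onto the protected bytes, and logical offsets 16-23 map onto physical bytes 2, 3, 18, 19, 34, 35, 50 and 51, so an 8-byte cleanmarker written at offset 16 stays clear of the ECC-covered area.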
However I feel this is a hack as our product will use two different NAND
chips, the other being a more traditional SLC that can use 1-bit hamming
for ECC (which does not ECC any bytes in the OOB).
How can I best code this into the MTD layer such that JFFS2 (and other
NAND FSs that do partial writes including OOB bytes) can understand
that some OOB bytes perturb the data area ECC?
I think adding a "non_ecc_oob_offset" variable to the ecclayout could
capture this nuance of the OOB/ECC interaction for this chip and JFFS2
could set ops.ooboffs to non_ecc_oob_offset in
jffs2_write_nand_cleanmarker and jffs2_check_nand_cleanmarker.
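A rough sketch of what that could look like. Everything here is hypothetical: the struct is a trimmed-down stand-in rather than the kernel's ecclayout, and cleanmarker_ooboffs is an invented helper. The 8-byte marker length is taken from the nanddump output above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down layout description (not the kernel struct). */
struct ecclayout_sketch {
    unsigned int oobavail;            /* total free OOB bytes */
    unsigned int non_ecc_oob_offset;  /* proposed field: first logical free
                                         byte that no ECC covers */
};

/*
 * Pick the logical OOB offset a filesystem would use for its cleanmarker.
 * Returns -1 when fewer than marker_len unprotected bytes are available.
 */
static int cleanmarker_ooboffs(const struct ecclayout_sketch *layout,
                               unsigned int marker_len)
{
    assert(layout != NULL);
    if (layout->non_ecc_oob_offset > layout->oobavail)
        return -1;
    if (layout->oobavail - layout->non_ecc_oob_offset < marker_len)
        return -1;
    return (int)layout->non_ecc_oob_offset;
}
```

For the Micron layout this would be { .oobavail = 24, .non_ecc_oob_offset = 16 }, giving an offset of 16. A 1-bit Hamming layout that does not ECC any OOB bytes would set non_ecc_oob_offset to 0, so the JFFS2 code path would be the same for both chips.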
Any comments are appreciated!
--
Peter Barada
peter.barada@gmail.com
* Re: Problem with clean markers/partial writes on Micron 4-bit ECC NAND
From: Ivan Djelic @ 2011-06-17 21:00 UTC (permalink / raw)
To: Peter Barada; +Cc: Eric Nelson, linux-mtd@lists.infradead.org, Peter Barada
On Fri, Jun 17, 2011 at 06:52:24PM +0100, Peter Barada wrote:
(...)
> The problem I'm running into with JFFS2 is that empty flash is first
> marked with a clean marker into the OOB, and then a 2nd write to the
> main data area is done (w/o an intervening erase) to that page with data
> which corrupts the ECCs that were first modified by writing the cleanmarker.
>
(...)
>
> I believe I've figured out a workaround:
>
> 1) Modify the ecclayout to add the other 8 bytes of the OOB that are NOT
> ECCd *after* the 16 bytes that are ECCd (so the ecc layout looks like):
>
(...)
> 2) Then set ops.ooboffs to 16 in jffs2_write_nand_cleanmarker and
> jffs2_check_nand_cleanmarker.
>
> This "offsets" the read/writes by 16 bytes to move the cleanmarker into
> OOB bytes that do not perturb the ECCs, and so far it looks to work.
>
OK, so now the cleanmarker is in an unprotected area; did you also patch
jffs2_check_nand_cleanmarker() so that it does its pattern comparison in a
bitflip-robust way (instead of just doing a memcmp) ?
I think you may also need to modify jffs2_check_oob_empty() to take into
account the new offset of your cleanmarker.
> However I feel this is a hack as our product will use two different NAND
> chips, the other being a more traditional SLC that can use 1-bit hamming
> for ECC (which does not ECC any bytes in the OOB).
>
> How can I best code this into the MTD layer such that JFFS2 (and other
> NAND FSs that do partial writes including OOB bytes) can understand
> that some OOB bytes perturb the data area ECC?
>
> I think adding a "non_ecc_oob_offset" variable to the ecclayout could
> capture this nuance of the OOB/ECC interaction for this chip and JFFS2
> could set ops.ooboffs to non_ecc_oob_offset in
> jffs2_write_nand_cleanmarker and jffs2_check_nand_cleanmarker.
I believe JFFS2 only uses oob for its cleanmarker; then, maybe you could just
omit the ecc-protected bytes from the .oobfree list, like this:
ecclayout = {
.eccbytes = 32,
.eccpos = { 8, 9, 10, 11, 12, 13, 14, 15,
24, 25, 26, 27, 28, 29, 30, 31,
40, 41, 42, 43, 44, 45, 46, 47,
56, 57, 58, 59, 60, 61, 62, 63},
.oobfree = {
{ .offset = 2,
.length = 2},
{ .offset = 18,
.length = 2},
{ .offset = 34,
.length = 2},
{ .offset = 50,
.length = 2},
},
};
and only modify jffs2_check_nand_cleanmarker() and jffs2_check_oob_empty()
so that they are robust to bitflips in unprotected oob bytes ?
Or are you also using on the same mtd device another filesystem requiring
protected oob bytes, like YAFFS2 ?
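For the "OOB is empty" side of this, a bitflip-tolerant check could count the zero bits in the unprotected bytes instead of requiring every byte to be exactly 0xFF. This is only a sketch: oob_is_erased and the max_flips threshold are hypothetical and not part of JFFS2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of zero bits in one byte (erased NAND reads back as all ones). */
static unsigned int zero_bits(uint8_t b)
{
    unsigned int n = 0;

    for (unsigned int bit = 0; bit < 8; bit++)
        if (!(b & (1u << bit)))
            n++;
    return n;
}

/*
 * Treat the buffer as erased if at most max_flips bits read back as 0.
 * Returns 1 for "erased", 0 otherwise.
 */
static int oob_is_erased(const uint8_t *buf, size_t len, unsigned int max_flips)
{
    unsigned int flips = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        flips += zero_bits(buf[i]);
        if (flips > max_flips)
            return 0;
    }
    return 1;
}
```

A suitable max_flips depends on the chip's expected raw bit error rate. It has to stay well below the number of zero bits in a real cleanmarker so that a written marker is never mistaken for empty OOB.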
Regards,
Ivan
* Re: Problem with clean markers/partial writes on Micron 4-bit ECC NAND
From: Peter Barada @ 2011-06-18 16:37 UTC (permalink / raw)
To: Ivan Djelic; +Cc: Eric Nelson, linux-mtd@lists.infradead.org, Peter Barada
On 06/17/2011 05:00 PM, Ivan Djelic wrote:
> On Fri, Jun 17, 2011 at 06:52:24PM +0100, Peter Barada wrote:
> (...)
>> The problem I'm running into with JFFS2 is that empty flash is first
>> marked with a clean marker into the OOB, and then a 2nd write to the
>> main data area is done (w/o an intervening erase) to that page with data
>> which corrupts the ECCs that were first modified by writing the cleanmarker.
>>
> (...)
>> I believe I've figured out a workaround:
>>
>> 1) Modify the ecclayout to add the other 8 bytes of the OOB that are NOT
>> ECCd *after* the 16 bytes that are ECCd (so the ecc layout looks like):
>>
> (...)
>> 2) Then set ops.ooboffs to 16 in jffs2_write_nand_cleanmarker and
>> jffs2_check_nand_cleanmarker.
>>
>> This "offsets" the read/writes by 16 bytes to move the cleanmarker into
>> OOB bytes that do not perturb the ECCs, and so far it looks to work.
>>
> OK, so now the cleanmarker is in an unprotected area; did you also patch
> jffs2_check_nand_cleanmarker() so that it does its pattern comparison in a
> bitflip-robust way (instead of just doing a memcmp) ?
> I think you may also need to modify jffs2_check_oob_empty() to take into
> account the new offset of your cleanmarker.
No, I haven't yet created a bitflip-robust jffs2_check_nand_cleanmarker
- will code up a version that calculates the number of flips between the
expected and read markers and if less than some threshold accept it.
I'm surprised such code isn't already in JFFS2, unless JFFS2 assumes
that area of the OOB is already protected (and that protection is OOB-only).
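A minimal sketch of that comparison, counting differing bits between the expected and read markers and accepting the marker when the count is at or below a threshold. The function names and the threshold parameter are placeholders, not existing JFFS2 code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in one byte. */
static unsigned int popcount8(uint8_t v)
{
    unsigned int n = 0;

    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Hamming distance in bits between two equally sized buffers. */
static unsigned int bit_distance(const uint8_t *a, const uint8_t *b, size_t len)
{
    unsigned int d = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++)
        d += popcount8((uint8_t)(a[i] ^ b[i]));
    return d;
}

/* Accept the read marker if it differs from the expected one by at most
 * max_flips bits. Returns 1 on match, 0 otherwise. */
static int marker_matches(const uint8_t *read_buf, const uint8_t *expected,
                          size_t len, unsigned int max_flips)
{
    return bit_distance(read_buf, expected, len) <= max_flips;
}
```

With the 8-byte marker from the nanddump output (85 19 03 20 08 00 00 00), a single flipped bit still matches at a threshold of 1, while an erased (all 0xFF) OOB is far outside any small threshold.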
>> However I feel this is a hack as our product will use two different NAND
>> chips, the other being a more traditional SLC that can use 1-bit hamming
>> for ECC (which does not ECC any bytes in the OOB).
>>
>> How can I best code this into the MTD layer such that JFFS2 (and other
>> NAND FSs that do partial writes including OOB bytes) can understand
>> that some OOB bytes perturb the data area ECC?
>>
>> I think adding a "non_ecc_oob_offset" variable to the ecclayout could
>> capture this nuance of the OOB/ECC interaction for this chip and JFFS2
>> could set ops.ooboffs to non_ecc_oob_offset in
>> jffs2_write_nand_cleanmarker and jffs2_check_nand_cleanmarker.
> I believe JFFS2 only uses oob for its cleanmarker; then, maybe you could just
> omit the ecc-protected bytes from the .oobfree list, like this:
>
> ecclayout = {
> .eccbytes = 32,
> .eccpos = { 8, 9, 10, 11, 12, 13, 14, 15,
> 24, 25, 26, 27, 28, 29, 30, 31,
> 40, 41, 42, 43, 44, 45, 46, 47,
> 56, 57, 58, 59, 60, 61, 62, 63},
> .oobfree = {
> { .offset = 2,
> .length = 2},
> { .offset = 18,
> .length = 2},
> { .offset = 34,
> .length = 2},
> { .offset = 50,
> .length = 2},
> },
> };
>
> and only modify jffs2_check_nand_cleanmarker() and jffs2_check_oob_empty()
> so that they are robust to bitflips in unprotected oob bytes ?
> Or are you also using on the same mtd device another filesystem requiring
> protected oob bytes, like YAFFS2 ?
Yes, we are using YAFFS2 in our system, so I need not only protected OOB
bytes, but at least 16 of them for YAFFS to hold its metadata - hence
our OOB layout. What's nice is that the 16 currently available are ECCd,
so there's no need to ECC the YAFFS metadata (which saves time).
The issue I have is how can I best tell MTD (and FS layers on top of it)
that some of the OOB bytes can not be used in partial writes due to
those bytes perturbing the ECC, or how to change JFFS2 to erase the
block after writing the cleanmarker when it wants to write data into the
block.
Why does JFFS2 write a clean marker into the empty block? Is it to
cover some state transition where power could be interrupted?
> Regards,
>
> Ivan
--
Peter Barada
peter.barada@gmail.com
* Re: Problem with clean markers/partial writes on Micron 4-bit ECC NAND
From: Kevin Cernekee @ 2011-06-18 17:49 UTC (permalink / raw)
To: Peter Barada
Cc: Ivan Djelic, linux-mtd@lists.infradead.org, Eric Nelson,
Peter Barada
On Sat, Jun 18, 2011 at 9:37 AM, Peter Barada <peter.barada@gmail.com> wrote:
> The issue I have is how can I best tell MTD (and FS layers on top of it)
> that some of the OOB bytes can not be used in partial writes due to those
> bytes perturbing the ECC, or how to change JFFS2 to erase the block after
> writing the cleanmarker when it wants to write data into the block.
FWIW, the product I work on used to utilize JFFS2 and YAFFS2. We
needed to hack around many of the same OOB/NOP limitations you are
seeing. (As well as several cases where corrupted filesystem metadata
caused a kernel oops on mount.)
Once UBIFS became a viable alternative, we found that supporting
JFFS2/YAFFS2 on NAND flash was far more trouble than it was worth.
> Why does JFFS2 write a clean marker into the empty block? Is it to cover
> some state transition where power could be interrupted?
Here is a good explanation:
http://linux-mtd.infradead.org/faq/jffs2.html#L_clmarker
* Re: Problem with clean markers/partial writes on Micron 4-bit ECC NAND
From: Artem Bityutskiy @ 2011-06-23 8:46 UTC (permalink / raw)
To: Peter Barada; +Cc: Eric Nelson, linux-mtd, Peter Barada
On Fri, 2011-06-17 at 13:52 -0400, Peter Barada wrote:
> I'm using a 2K page Micron NAND that has an internal 4-bit ECC engine.
>
> The Micron NAND chip uses 8 bytes of ECC per 512 bytes of main data area
> plus 4 bytes of the OOB. This allows the 32 bytes of ECC to protect 2048
> bytes of the main data area and 16 bytes of the OOB area.
>
> The problem I'm running into with JFFS2 is that empty flash is first
> marked with a clean marker into the OOB, and then a 2nd write to the
> main data area is done (w/o an intervening erase) to that page with data
> which corrupts the ECCs that were first modified by writing the cleanmarker.
I remember someone sent patches to teach JFFS2 to avoid using clean
markers, but I do not remember the details and whether the patches made
it into the mainline.
--
Best Regards,
Artem Bityutskiy