* NAND/ HW ECC problem
@ 2005-09-19 13:40 Vitaly Wool
2005-09-19 20:40 ` Thomas Gleixner
0 siblings, 1 reply; 6+ messages in thread
From: Vitaly Wool @ 2005-09-19 13:40 UTC (permalink / raw)
To: linux-mtd
[-- Attachment #1: Type: text/plain, Size: 1033 bytes --]
Greetings,
we're currently working on NAND flash driver implementation for a new
ARM926-based board. The flash controller is SanDisk's MLC, and the the
requirement is to use hardware ECC capabilities provided by this controller.
We've come across several hardships with generic MTD NAND stuff carrying
out this task.
First, it turned to be necessary to add one more ECC type
(NAND_ECC_HW10_512): this controller stores 10 ECC data bytes after each
512-byte block. Also the need to change size of eccpos array off of
nand_oobinfo structure arose: for flashes with 2K-sector, this turned to
be 40 bytes for each sector.
The serious problem we came across also was that nand_write_page doesn't
follow the free bytes reference for OOB to write ECC data what was
obviously wrong. As far as I understand, the DoC flashes have specific
mechanism for handling that, so he legacy variant was left for the DoC,
dunno whether it's right.
Attached is the patch suggested to fix the problems described.
Best regards,
Vitaly
[-- Attachment #2: nand-ecc.patch --]
[-- Type: text/plain, Size: 3298 bytes --]
Index: linux-2.6.10/drivers/mtd/nand/nand_base.c
===================================================================
--- linux-2.6.10.orig/drivers/mtd/nand/nand_base.c
+++ linux-2.6.10/drivers/mtd/nand/nand_base.c
@@ -904,11 +904,17 @@ static int nand_write_page (struct mtd_i
}
break;
}
-
+
/* Write out OOB data */
- if (this->options & NAND_HWECC_SYNDROME)
- this->write_buf(mtd, &oob_buf[oobsel->eccbytes], mtd->oobsize - oobsel->eccbytes);
- else
+ if (this->options & NAND_HWECC_SYNDROME)
+ if (mtd->ecctype != MTD_ECC_RS_DiskOnChip )
+ for (i = 0; oobsel->oobfree[i][1]; i++ )
+ this->write_buf (mtd,
+ oob_buf+oobsel->oobfree[i][0],
+ oobsel->oobfree[i][1] );
+ else /* DiskOnChip legacy ECC */
+ this->write_buf(mtd, &oob_buf[oobsel->eccbytes], mtd->oobsize - oobsel->eccbytes);
+ else
this->write_buf(mtd, oob_buf, mtd->oobsize);
/* Send command to actually program the data */
@@ -1247,8 +1253,15 @@ int nand_do_read_ecc (struct mtd_info *m
break;
}
+
/* read oobdata */
- this->read_buf(mtd, &oob_data[mtd->oobsize - oobreadlen], oobreadlen);
+ if (mtd->ecctype != MTD_ECC_RS_DiskOnChip )
+ for (i = 0; oobsel->oobfree[i][1]; i++ )
+ this->read_buf (mtd,
+ oob_data+oobsel->oobfree[i][0],
+ oobsel->oobfree[i][1] );
+ else /* legacy DiskOnChip ECC */
+ this->read_buf(mtd, &oob_data[mtd->oobsize - oobreadlen], oobreadlen);
/* Skip ECC check, if not requested (ECC_NONE or HW_ECC with syndromes) */
if (!compareecc)
@@ -2542,6 +2555,7 @@ int nand_scan (struct mtd_info *mtd, int
this->eccsize = 2048;
break;
+ case NAND_ECC_HW10_512:
case NAND_ECC_HW3_512:
case NAND_ECC_HW6_512:
case NAND_ECC_HW8_512:
@@ -2578,7 +2592,9 @@ int nand_scan (struct mtd_info *mtd, int
switch (this->eccmode) {
case NAND_ECC_HW12_2048:
this->eccbytes += 4;
- case NAND_ECC_HW8_512:
+ case NAND_ECC_HW10_512:
+ this->eccbytes += 2;
+ case NAND_ECC_HW8_512:
this->eccbytes += 2;
case NAND_ECC_HW6_512:
this->eccbytes += 3;
@@ -2600,6 +2616,7 @@ int nand_scan (struct mtd_info *mtd, int
case NAND_ECC_HW3_512:
case NAND_ECC_HW6_512:
case NAND_ECC_HW8_512:
+ case NAND_ECC_HW10_512:
this->eccsteps = mtd->oobblock / 512;
break;
case NAND_ECC_HW3_256:
Index: linux-2.6.10/include/mtd/mtd-abi.h
===================================================================
--- linux-2.6.10.orig/include/mtd/mtd-abi.h
+++ linux-2.6.10/include/mtd/mtd-abi.h
@@ -115,7 +115,7 @@ struct nand_oobinfo {
uint32_t useecc;
uint32_t eccbytes;
uint32_t oobfree[8][2];
- uint32_t eccpos[32];
+ uint32_t eccpos[40];
};
#endif /* __MTD_ABI_H__ */
Index: linux-2.6.10/include/linux/mtd/nand.h
===================================================================
--- linux-2.6.10.orig/include/linux/mtd/nand.h
+++ linux-2.6.10/include/linux/mtd/nand.h
@@ -161,8 +161,10 @@ extern int nand_read_raw (struct mtd_inf
#define NAND_ECC_HW6_512 4
/* Hardware ECC 8 byte ECC per 512 Byte data */
#define NAND_ECC_HW8_512 6
+/* SanDisk MLC: 10 bytes ECC per 512 Byte data */
+#define NAND_ECC_HW10_512 7
/* Hardware ECC 12 byte ECC per 2048 Byte data */
-#define NAND_ECC_HW12_2048 7
+#define NAND_ECC_HW12_2048 8
/*
* Constants for Hardware ECC
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: NAND/ HW ECC problem
2005-09-19 13:40 NAND/ HW ECC problem Vitaly Wool
@ 2005-09-19 20:40 ` Thomas Gleixner
2005-09-19 21:40 ` Vitaly Wool
2005-09-20 9:19 ` Vitaly Wool
0 siblings, 2 replies; 6+ messages in thread
From: Thomas Gleixner @ 2005-09-19 20:40 UTC (permalink / raw)
To: Vitaly Wool; +Cc: linux-mtd
On Mon, 2005-09-19 at 17:40 +0400, Vitaly Wool wrote:
> First, it turned to be necessary to add one more ECC type
> (NAND_ECC_HW10_512): this controller stores 10 ECC data bytes after each
> 512-byte block. Also the need to change size of eccpos array off of
> nand_oobinfo structure arose: for flashes with 2K-sector, this turned to
> be 40 bytes for each sector.
Hmm, we really should think about making the ECC size per chunksize run
time configurable.
> The serious problem we came across also was that nand_write_page doesn't
> follow the free bytes reference for OOB to write ECC data what was
> obviously wrong. As far as I understand, the DoC flashes have specific
> mechanism for handling that, so he legacy variant was left for the DoC,
> dunno whether it's right.
Err, the oobfree reference is the place where file systems can put their
data in. The ECC is put into the byte positions given by eccpos.
http://www.linux-mtd.infradead.org/tech/mtdnand/x215.html
eccpos
The eccpos array holds the byte offsets in the spare area where the ecc
codes are placed.
* oobfree
The oobfree array defines the areas in the spare area which can be used
for automatic placement. The information is given in the format {offset,
size}. offset defines the start of the usable area, size the length in
bytes. More than one area can be defined. The list is terminated by an
{0, 0} entry.
I never bothered to implement the support for HW_ECC with a scattered
byte layout as I never seen a controller doing such nonsense. All I have
seen do
data - ecc - fsoobdata (512byte/page)
data - ecc -data - ecc -data - ecc -data - ecc - fsoobdata (2k/page)
as this is the most efficient way to handle it.
If your chip does it different, please use the correct parts of the data
structure to implement it.
tglx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: NAND/ HW ECC problem
2005-09-19 20:40 ` Thomas Gleixner
@ 2005-09-19 21:40 ` Vitaly Wool
2005-09-19 21:43 ` Thomas Gleixner
2005-09-20 9:19 ` Vitaly Wool
1 sibling, 1 reply; 6+ messages in thread
From: Vitaly Wool @ 2005-09-19 21:40 UTC (permalink / raw)
To: tglx; +Cc: linux-mtd
Thomas Gleixner wrote:
>On Mon, 2005-09-19 at 17:40 +0400, Vitaly Wool wrote:
>
>
>>First, it turned to be necessary to add one more ECC type
>>(NAND_ECC_HW10_512): this controller stores 10 ECC data bytes after each
>>512-byte block. Also the need to change size of eccpos array off of
>>nand_oobinfo structure arose: for flashes with 2K-sector, this turned to
>>be 40 bytes for each sector.
>>
>>
>
>Hmm, we really should think about making the ECC size per chunksize run
>time configurable.
>
>
What if have it in the same way as oobfree is stored, i. e. {offset,
length} pairs instead?
>
>
>>The serious problem we came across also was that nand_write_page doesn't
>>follow the free bytes reference for OOB to write ECC data what was
>>obviously wrong. As far as I understand, the DoC flashes have specific
>>mechanism for handling that, so he legacy variant was left for the DoC,
>>dunno whether it's right.
>>
>>
>
>Err, the oobfree reference is the place where file systems can put their
>data in. The ECC is put into the byte positions given by eccpos.
>
>
>
Oh yes, thanks for the clarifications.
>I never bothered to implement the support for HW_ECC with a scattered
>byte layout as I never seen a controller doing such nonsense. All I have
>seen do
>
>data - ecc - fsoobdata (512byte/page)
>data - ecc -data - ecc -data - ecc -data - ecc - fsoobdata (2k/page)
>
>as this is the most efficient way to handle it.
>
>If your chip does it different, please use the correct parts of the data
>structure to implement it.
>
>
>
Yes, it does and I'm not at all happy with that. However, I guess this
controller is likely to be used in other boards' designs...
So, the right way to handle this bizarre stuff currently is to
write/read ECC data byte-by-byte following the eccpos array, correct?
Best regards,
Vitaly
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: NAND/ HW ECC problem
2005-09-19 21:40 ` Vitaly Wool
@ 2005-09-19 21:43 ` Thomas Gleixner
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2005-09-19 21:43 UTC (permalink / raw)
To: Vitaly Wool; +Cc: linux-mtd
On Tue, 2005-09-20 at 01:40 +0400, Vitaly Wool wrote:
> >Hmm, we really should think about making the ECC size per chunksize run
> >time configurable.
> >
> What if have it in the same way as oobfree is stored, i. e. {offset,
> length} pairs instead?
Yes
> >If your chip does it different, please use the correct parts of the data
> >structure to implement it.
> >
> Yes, it does and I'm not at all happy with that. However, I guess this
> controller is likely to be used in other boards' designs...
> So, the right way to handle this bizarre stuff currently is to
> write/read ECC data byte-by-byte following the eccpos array, correct?
Yes
tglx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: NAND/ HW ECC problem
2005-09-19 20:40 ` Thomas Gleixner
2005-09-19 21:40 ` Vitaly Wool
@ 2005-09-20 9:19 ` Vitaly Wool
2005-09-20 10:04 ` Thomas Gleixner
1 sibling, 1 reply; 6+ messages in thread
From: Vitaly Wool @ 2005-09-20 9:19 UTC (permalink / raw)
To: tglx; +Cc: linux-mtd
Thomas Gleixner wrote:
>>The serious problem we came across also was that nand_write_page doesn't
>>follow the free bytes reference for OOB to write ECC data what was
>>obviously wrong. As far as I understand, the DoC flashes have specific
>>mechanism for handling that, so he legacy variant was left for the DoC,
>>dunno whether it's right.
>>
>>
>
>Err, the oobfree reference is the place where file systems can put their
>data in. The ECC is put into the byte positions given by eccpos.
>
>http://www.linux-mtd.infradead.org/tech/mtdnand/x215.html
>
>eccpos
>
>The eccpos array holds the byte offsets in the spare area where the ecc
>codes are placed.
>
> * oobfree
>
>The oobfree array defines the areas in the spare area which can be used
>for automatic placement. The information is given in the format {offset,
>size}. offset defines the start of the usable area, size the length in
>bytes. More than one area can be defined. The list is terminated by an
>{0, 0} entry.
>
>
>I never bothered to implement the support for HW_ECC with a scattered
>byte layout as I never seen a controller doing such nonsense. All I have
>seen do
>
>data - ecc - fsoobdata (512byte/page)
>data - ecc -data - ecc -data - ecc -data - ecc - fsoobdata (2k/page)
>
>as this is the most efficient way to handle it.
>
>If your chip does it different, please use the correct parts of the data
>structure to implement it.
>
Well, after having a fresh look at it I lost the feeling that I was
doing something wrong.
Let's look at the patch again:
<snip>
/* Write out OOB data */
- if (this->options & NAND_HWECC_SYNDROME)
- this->write_buf(mtd, &oob_buf[oobsel->eccbytes], mtd->oobsize - oobsel->eccbytes);
- else
+ if (this->options & NAND_HWECC_SYNDROME)
+ if (mtd->ecctype != MTD_ECC_RS_DiskOnChip )
+ for (i = 0; oobsel->oobfree[i][1]; i++ )
+ this->write_buf (mtd,
+ oob_buf+oobsel->oobfree[i][0],
+ oobsel->oobfree[i][1] );
+ else /* DiskOnChip legacy ECC */
+ this->write_buf(mtd, &oob_buf[oobsel->eccbytes], mtd->oobsize - oobsel->eccbytes);
+ else
this->write_buf(mtd, oob_buf, mtd->oobsize);
<snip>
So, the subject of change is exactly the areas in the spare area which
can be used for automatic placement (fsoobdata), i. e. oobfree. In my
case, oobfree is not at oob+buf + oobsel->eccbytes but spread across the
block in a way
data - ecc - fsoobdata - data - ecc - fsoobdata - data - ecc - fsoobdata - data - ecc - fsoobdata
Therefore the assumption that fsoobdata is at the end of the block
doesn't work, and that's what this part of the patch is about.
Please correct me if I'm wrong,
Thanks,
Vitaly
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: NAND/ HW ECC problem
2005-09-20 9:19 ` Vitaly Wool
@ 2005-09-20 10:04 ` Thomas Gleixner
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2005-09-20 10:04 UTC (permalink / raw)
To: Vitaly Wool; +Cc: linux-mtd
On Tue, 2005-09-20 at 13:19 +0400, Vitaly Wool wrote:
> So, the subject of change is exactly the areas in the spare area which
> can be used for automatic placement (fsoobdata), i. e. oobfree. In my
> case, oobfree is not at oob+buf + oobsel->eccbytes but spread across the
> block in a way
> data - ecc - fsoobdata - data - ecc - fsoobdata - data - ecc - fsoobdata - data - ecc - fsoobdata
>
>
> Therefore the assumption that fsoobdata is at the end of the block
> doesn't work, and that's what this part of the patch is about.
> Please correct me if I'm wrong,
ECC goes not into oobfree !
oobsel->eccpos is the array holding the byte positions of the ecc bytes
oobsel->oobfree is defining the places where filesystems can put data
in. i.e: all places which are not used by bad block markers and ECC
bytes.
The code for SW ECC does exactly the right thing.
The code for HW ECC makes the assumption that the ecc positions are
aligned only for the case that the chip options have NAND_HWECC_SYNDROME
set. If this option is not set, then the bytes are put to the correct
place as given by oobsel->eccpos.
Your patch makes another wrong assumption about MTD_ECC_RS_DiskOnChip.
Depending the non scattered placement of ECC bytes on this would break
other users e.g. rtc_from4.
These users rely on the NAND_HWECC_SYNDROM option, which enforces the
data-ecc layout. Your chip does not.
You need absolutely no change to the code except the new ECC mode
constant and the larger eccpos array. Anything else is supported
already. You must not set NAND_HWECC_SYNDROM option even if your chip
generates syndromes. The code in nand_base.c does know exactly nothing
about the contents of the ecc bytes. It just transfers the data from and
to the places defined by oobsel->eccpos.
The read code will do the following
for (i= 0; i < pagesize/512; i++)
enable_hwecc()
read512bytes()
calculate_ecc() (Read the ecc data into ecc_calc)
read_oob(16 or 64 byte)
transfer ecc data bytes from oob_data to ecc_code (linear array)
for (i= 0; i < pagesize/512; i++)
correct_data(&data[i*512], &ecc_code[i*ecclen], &ecc_calc[i*ecclen])
tglx
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2005-09-20 10:04 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-19 13:40 NAND/ HW ECC problem Vitaly Wool
2005-09-19 20:40 ` Thomas Gleixner
2005-09-19 21:40 ` Vitaly Wool
2005-09-19 21:43 ` Thomas Gleixner
2005-09-20 9:19 ` Vitaly Wool
2005-09-20 10:04 ` Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox