public inbox for u-boot@lists.denx.de
 help / color / mirror / Atom feed
* [U-Boot]  OMAP3 NAND ECC bug report
@ 2011-10-20  9:06 Arno Steffen
  2011-10-21  9:28 ` Arno Steffen
  2011-10-27 15:22 ` Ivan Nardi
  0 siblings, 2 replies; 9+ messages in thread
From: Arno Steffen @ 2011-10-20  9:06 UTC (permalink / raw)
  To: u-boot

I did tests with OMAP3 uboot. The SW-ECC (testet 1 bit, 4 bit BCH)
doesn't correct errors in environment (during power-up).
Compiling uboot for default HW-ECC - correction works fine.
Testet with TI's PSP 4.02.00.07 (almost like arago latest version).

I modified single bits by adding a patch (thanks Scott Wood) to uboot,
that allows write in raw mode to flash. Single bit error can corrected
with HW-ECC, not with SW-ECC.
This report is only about uboot environment, which is read in startup.
I didn't test it, but from my experience the correction works fine,
beside the initial reading of the environment, as for reading and
starting the kernel.

Imho this is very critical, as reliability is suffered if a device
isn't booting.

Best regards
Arno

PS:
The link below refers to a previous discussion.
http://lists.denx.de/pipermail/u-boot/2011-September/100486.html

in omap_gpmc.c I used this in board_nand_init()

...
	nand->chip_delay = 100;
	nand->ecc.mode = NAND_ECC_4BIT_SOFT;
	nand_curr_device = 0;
	omap_nand_switch_ecc(NAND_ECC_4BIT_SOFT, 1);
	return 0;
}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-20  9:06 [U-Boot] OMAP3 NAND ECC bug report Arno Steffen
@ 2011-10-21  9:28 ` Arno Steffen
  2011-10-25  7:41   ` Arno Steffen
  2011-10-27 15:22 ` Ivan Nardi
  1 sibling, 1 reply; 9+ messages in thread
From: Arno Steffen @ 2011-10-21  9:28 UTC (permalink / raw)
  To: u-boot

I think reason that SW_ECC is not working, is in omap_gpmc.c:

omap_nand_switch_ecc():

	  case NAND_ECC_SOFT:
		nand->ecc.mode = NAND_ECC_SOFT;
		/* Use mtd default settings */
		nand->ecc.layout = NULL;
		printf("SW ECC selected\n");

The ecc struct is not setup?!? Someone has an idea?

@TI stuff: if you are not the right person to address this, can you
please forward?
Thanks

 - Arno



2011/10/20 Arno Steffen <arno.steffen@googlemail.com>:
> I did tests with OMAP3 uboot. The SW-ECC (testet 1 bit, 4 bit BCH)
> doesn't correct errors in environment (during power-up).
> Compiling uboot for default HW-ECC - correction works fine.
> Testet with TI's PSP 4.02.00.07 (almost like arago latest version).
>
> I modified single bits by adding a patch (thanks Scott Wood) to uboot,
> that allows write in raw mode to flash. Single bit error can corrected
> with HW-ECC, not with SW-ECC.
> This report is only about uboot environment, which is read in startup.
> I didn't test it, but from my experience the correction works fine,
> beside the initial reading of the environment, as for reading and
> starting the kernel.
>
> Imho this is very critical, as reliability is suffered if a device
> isn't booting.
>
> Best regards
> Arno
>
> PS:
> The link below refers to a previous discussion.
> http://lists.denx.de/pipermail/u-boot/2011-September/100486.html
>
> in omap_gpmc.c I used this in board_nand_init()
>
> ...
> ? ? ? ?nand->chip_delay = 100;
> ? ? ? ?nand->ecc.mode = NAND_ECC_4BIT_SOFT;
> ? ? ? ?nand_curr_device = 0;
> ? ? ? ?omap_nand_switch_ecc(NAND_ECC_4BIT_SOFT, 1);
> ? ? ? ?return 0;
> }
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-21  9:28 ` Arno Steffen
@ 2011-10-25  7:41   ` Arno Steffen
  2011-10-26 20:26     ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Arno Steffen @ 2011-10-25  7:41 UTC (permalink / raw)
  To: u-boot

I am feeling to spam the board but found finally the reason for this behaviour.

nand_read returns with -EUCLEAN in case of correcting errors, and this
will later on reported as BAD NAND, although this error is corrected.

Correct me if I am wrong. What I did is chaning env_nand.c:
	while (amount_loaded < CONFIG_ENV_SIZE && offset < end) {
		if (nand_block_isbad(&nand_info[0], offset)) {
			offset += blocksize;
		} else {
			char_ptr = &buf[amount_loaded];
-			if (nand_read(&nand_info[0], offset, &len, char_ptr)) {
-				return 1;
+			err = nand_read(&nand_info[0], offset, &len, char_ptr);
+			if (err) {
+				if (err != -EUCLEAN) {   // Bad NAND has been corrected, so no problem
+					return 1;
+				}
			}
			offset += blocksize;
			amount_loaded += len;
		}


Correct me if I am wrong, but it seems that all platform are suffered,
not just OMAP.
There will be compared a corrected bits before and after correction,
and it send an error message, if bits are corrected (instead of only
if it can not be corrected).

Best regards
Arno


2011/10/21 Arno Steffen <arno.steffen@googlemail.com>:
> I think reason that SW_ECC is not working, is in omap_gpmc.c:
>
> omap_nand_switch_ecc():
>
> ? ? ? ? ?case NAND_ECC_SOFT:
> ? ? ? ? ? ? ? ?nand->ecc.mode = NAND_ECC_SOFT;
> ? ? ? ? ? ? ? ?/* Use mtd default settings */
> ? ? ? ? ? ? ? ?nand->ecc.layout = NULL;
> ? ? ? ? ? ? ? ?printf("SW ECC selected\n");
>
> The ecc struct is not setup?!? Someone has an idea?
>
> @TI stuff: if you are not the right person to address this, can you
> please forward?
> Thanks
>
> ?- Arno
>
>
>
> 2011/10/20 Arno Steffen <arno.steffen@googlemail.com>:
>> I did tests with OMAP3 uboot. The SW-ECC (testet 1 bit, 4 bit BCH)
>> doesn't correct errors in environment (during power-up).
>> Compiling uboot for default HW-ECC - correction works fine.
>> Testet with TI's PSP 4.02.00.07 (almost like arago latest version).
>>
>> I modified single bits by adding a patch (thanks Scott Wood) to uboot,
>> that allows write in raw mode to flash. Single bit error can corrected
>> with HW-ECC, not with SW-ECC.
>> This report is only about uboot environment, which is read in startup.
>> I didn't test it, but from my experience the correction works fine,
>> beside the initial reading of the environment, as for reading and
>> starting the kernel.
>>
>> Imho this is very critical, as reliability is suffered if a device
>> isn't booting.
>>
>> Best regards
>> Arno
>>
>> PS:
>> The link below refers to a previous discussion.
>> http://lists.denx.de/pipermail/u-boot/2011-September/100486.html
>>
>> in omap_gpmc.c I used this in board_nand_init()
>>
>> ...
>> ? ? ? ?nand->chip_delay = 100;
>> ? ? ? ?nand->ecc.mode = NAND_ECC_4BIT_SOFT;
>> ? ? ? ?nand_curr_device = 0;
>> ? ? ? ?omap_nand_switch_ecc(NAND_ECC_4BIT_SOFT, 1);
>> ? ? ? ?return 0;
>> }
>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-25  7:41   ` Arno Steffen
@ 2011-10-26 20:26     ` Scott Wood
  2011-10-28  6:10       ` Arno Steffen
  0 siblings, 1 reply; 9+ messages in thread
From: Scott Wood @ 2011-10-26 20:26 UTC (permalink / raw)
  To: u-boot

On 10/25/2011 02:41 AM, Arno Steffen wrote:
> I am feeling to spam the board but found finally the reason for this behaviour.
> 
> nand_read returns with -EUCLEAN in case of correcting errors, and this
> will later on reported as BAD NAND, although this error is corrected.
> 
> Correct me if I am wrong. What I did is chaning env_nand.c:
> 	while (amount_loaded < CONFIG_ENV_SIZE && offset < end) {
> 		if (nand_block_isbad(&nand_info[0], offset)) {
> 			offset += blocksize;
> 		} else {
> 			char_ptr = &buf[amount_loaded];
> -			if (nand_read(&nand_info[0], offset, &len, char_ptr)) {
> -				return 1;
> +			err = nand_read(&nand_info[0], offset, &len, char_ptr);
> +			if (err) {
> +				if (err != -EUCLEAN) {   // Bad NAND has been corrected, so no problem
> +					return 1;
> +				}
> 			}
> 			offset += blocksize;
> 			amount_loaded += len;
> 		}
> 
> 
> Correct me if I am wrong, but it seems that all platform are suffered,
> not just OMAP.
> There will be compared a corrected bits before and after correction,
> and it send an error message, if bits are corrected (instead of only
> if it can not be corrected).

What version of U-Boot are you looking at?

In readenv(), I see nand_read_skip_bad(), not nand_read().
nand_read_skip_bad() handles -EUCLEAN.

-Scott

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-20  9:06 [U-Boot] OMAP3 NAND ECC bug report Arno Steffen
  2011-10-21  9:28 ` Arno Steffen
@ 2011-10-27 15:22 ` Ivan Nardi
  2011-10-27 19:02   ` Scott Wood
  1 sibling, 1 reply; 9+ messages in thread
From: Ivan Nardi @ 2011-10-27 15:22 UTC (permalink / raw)
  To: u-boot

On 20 October 2011 11:06, Arno Steffen <arno.steffen@googlemail.com> wrote:
>
> I modified single bits by adding a patch (thanks Scott Wood) to uboot,
> that allows write in raw mode to flash. Single bit error can corrected
> with HW-ECC, not with SW-ECC.

Hi all,

I would be very interested in using that patch (it is very useful for
testing ecc algorithms)
In u-boot the command "nand biterr" isn't implemented yet and some
code found in the mailing list is pretty old and never merged (why?)
into mainline (http://www.mail-archive.com/u-boot at lists.denx.de/msg22963.html)

Could you share it?
Thanks a lot

Ivan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-27 15:22 ` Ivan Nardi
@ 2011-10-27 19:02   ` Scott Wood
  0 siblings, 0 replies; 9+ messages in thread
From: Scott Wood @ 2011-10-27 19:02 UTC (permalink / raw)
  To: u-boot

On 10/27/2011 10:22 AM, Ivan Nardi wrote:
> On 20 October 2011 11:06, Arno Steffen <arno.steffen@googlemail.com> wrote:
>>
>> I modified single bits by adding a patch (thanks Scott Wood) to uboot,
>> that allows write in raw mode to flash. Single bit error can corrected
>> with HW-ECC, not with SW-ECC.
> 
> Hi all,
> 
> I would be very interested in using that patch (it is very useful for
> testing ecc algorithms)
> In u-boot the command "nand biterr" isn't implemented yet and some
> code found in the mailing list is pretty old and never merged (why?)
> into mainline (http://www.mail-archive.com/u-boot at lists.denx.de/msg22963.html)
> 
> Could you share it?
> Thanks a lot

raw read/write was recently added to mainline U-Boot:

commit fb3659ac422801ea18f36ef62926df70beeada4c
Author: Marek Vasut <marek.vasut@gmail.com>
Date:   Fri Sep 23 15:43:10 2011 +0200

    NAND: Add nand read.raw and write.raw commands

    These commands should work around various "hardware" ECC and BCH
methods.

    Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
    Cc: Scott Wood <scottwood@freescale.com>
    Cc: Stefano Babic <sbabic@denx.de>
    Cc: Wolfgang Denk <wd@denx.de>
    Cc: Detlev Zundel <dzu@denx.de>
    [scottwood at freescale.com: s/write the page/access the page/]
    Signed-off-by: Scott Wood <scottwood@freescale.com>

As for why the biterr patch wasn't merged, it's because nobody ever
resubmitted with a version that fixed the issues raised in review and
applies to the current tree.

-Scott

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-26 20:26     ` Scott Wood
@ 2011-10-28  6:10       ` Arno Steffen
  2011-10-28 16:06         ` Scott Wood
  0 siblings, 1 reply; 9+ messages in thread
From: Arno Steffen @ 2011-10-28  6:10 UTC (permalink / raw)
  To: u-boot

I have been used u-boot supportet (or better say not suppported) by TI
in their latest PSP/SDK (Juli 2011).
Base is 2010.06. But this issue was also inside older versions.

- Arno

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-28  6:10       ` Arno Steffen
@ 2011-10-28 16:06         ` Scott Wood
  2011-10-31  9:36           ` Arno Steffen
  0 siblings, 1 reply; 9+ messages in thread
From: Scott Wood @ 2011-10-28 16:06 UTC (permalink / raw)
  To: u-boot

On 10/28/2011 01:10 AM, Arno Steffen wrote:
> I have been used u-boot supportet (or better say not suppported) by TI
> in their latest PSP/SDK (Juli 2011).
> Base is 2010.06. But this issue was also inside older versions.

It's not older versions I'm concerned about, but newer ones.  This bug
has been fixed in mainline U-Boot, about a year ago.  We don't have a
time machine to put the bug fix in older versions. :-)

-Scott

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [U-Boot] OMAP3 NAND ECC bug report
  2011-10-28 16:06         ` Scott Wood
@ 2011-10-31  9:36           ` Arno Steffen
  0 siblings, 0 replies; 9+ messages in thread
From: Arno Steffen @ 2011-10-31  9:36 UTC (permalink / raw)
  To: u-boot

Sure, Scott, not blame on you. Just frustating, doing bughunting for
issues that have been fixed long time ago. I would expect a bit more
from TI in this case. My fault obviously.
Ok, finally I got it, that's what matters.
- Arno


2011/10/28 Scott Wood <scottwood@freescale.com>:
> On 10/28/2011 01:10 AM, Arno Steffen wrote:
>> I have been used u-boot supportet (or better say not suppported) by TI
>> in their latest PSP/SDK (Juli 2011).
>> Base is 2010.06. But this issue was also inside older versions.
>
> It's not older versions I'm concerned about, but newer ones. ?This bug
> has been fixed in mainline U-Boot, about a year ago. ?We don't have a
> time machine to put the bug fix in older versions. :-)
>
> -Scott
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-10-31  9:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-20  9:06 [U-Boot] OMAP3 NAND ECC bug report Arno Steffen
2011-10-21  9:28 ` Arno Steffen
2011-10-25  7:41   ` Arno Steffen
2011-10-26 20:26     ` Scott Wood
2011-10-28  6:10       ` Arno Steffen
2011-10-28 16:06         ` Scott Wood
2011-10-31  9:36           ` Arno Steffen
2011-10-27 15:22 ` Ivan Nardi
2011-10-27 19:02   ` Scott Wood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox