linux-arm-kernel.lists.infradead.org archive mirror
* ARM: sunxi: Experiences NAND flash
@ 2015-08-11 12:16 Olliver Schinagl
  2015-08-17  7:34 ` Boris Brezillon
       [not found] ` <55CB4AA3.9050809@schinagl.nl>
  0 siblings, 2 replies; 6+ messages in thread
From: Olliver Schinagl @ 2015-08-11 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hello everybody,

We are working with Boris and Roy's patch series to get the NAND 
flash chip working on Olimex OLinuXino Lime2 boards. Initially, 
everything looked fine, but we noticed that occasionally (after a 
power cycle or power cut) UBI fails to mount the partition. It is not 
easy to reproduce, but it has failed on 5 of the 30 boards we have.

U-boot reports the following:
UBI: default fastmap pool size: 100
UBI: default fastmap WL pool size: 25
UBI: attaching mtd1 to ubi0
UBI: scanning is finished
UBI init error 22
Error reading superblock on volume 'ubi:boot' errno=-19!
ubifsmount - mount UBIFS volume

whereas the linux kernel booted from sd card gives:
ubiattach /dev/ubi_ctrl -m 0
[  100.560704] ubi0: default fastmap pool size: 8
[  100.565186] ubi0: default fastmap WL pool size: 4
[  100.570100] ubi0: attaching mtd0
[  100.590469] ubi0: scanning is finished
[  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was not found
[  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0, error -22
ubiattach: error!: cannot attach mtd0
            error 22 (Invalid argument)

The U-Boot version we are using is a few months out of date:
U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner Technology
arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
GNU ld (2.25-5+5+b1) 2.25

but the kernel is fairly up to date:
4.2.0-rc4-opinicus-g8ec3671


Now, I know that the MTD support is all very new and largely untested. 
What I am curious about is: a) have other people actually tried the MTD 
support on Allwinner hardware, and b) has anybody else encountered this 
issue?

It's not something very easily reproducible (toggling a machine on/off 
repeatedly did not trigger it yet) but it does happen.

Olliver


* ARM: sunxi: Experiences NAND flash
  2015-08-11 12:16 ARM: sunxi: Experiences NAND flash Olliver Schinagl
@ 2015-08-17  7:34 ` Boris Brezillon
  2015-08-17  7:51   ` [linux-sunxi] " Michal Suchanek
  2015-08-17  8:30   ` Roy Spliet
       [not found] ` <55CB4AA3.9050809@schinagl.nl>
  1 sibling, 2 replies; 6+ messages in thread
From: Boris Brezillon @ 2015-08-17  7:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Oliver,

Sorry for the late reply (I was on vacation for the last 2 weeks).

On Tue, 11 Aug 2015 14:16:52 +0200
Olliver Schinagl <oliver+list@schinagl.nl> wrote:

> Hello everybody,
> 
> We are working with Boris and Roy's patch series on getting the NAND 
> flash chip working on Olimex OLinuXino Lime2 boards. Initially, 
> everything looks fine, but we noticed that occasionally (after 
> power/cycle or power cut) ubi fails to mount the partition. It is not 
> something easily enough to reproduce, but it has failed on 5 boards out 
> of 30 we have.

I remember warning you about that problem before: MLC NANDs are not as
reliable as SLC ones (please read my presentation about MLC support in
Linux [1]). I also remember recommending using an SLC chip if you were
tight on time to avoid dealing with all these MLC related problems, but
you decided to go for the MLC solution.

Back to your problem now: what you're seeing here is probably caused by
interrupted PROGRAM operations on paired pages (see pages 17, 18 and 26
to 32 of my presentation for more information).
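
As a rough illustration of the paired-page failure mode described above, here is a toy Python model. It is only a sketch under stated assumptions: the `PAIRS` table, the `ToyMlcBlock` class and the 8-page block are hypothetical, and real pairing schemes are chip specific (the datasheet is the reference).

```python
# Hypothetical pairing table: upper page -> lower page sharing the same cells.
# Real chips use their own scheme; this one exists only for illustration.
PAIRS = {4: 0, 5: 1, 6: 2, 7: 3}


class ToyMlcBlock:
    """Toy model of one MLC erase block with paired pages."""

    def __init__(self, pages=8):
        self.pages = [None] * pages  # None means "still erased"
        self.corrupted = set()

    def program(self, page, data, interrupted=False):
        """Program a page; an interrupted program models a power cut."""
        if interrupted:
            # The page being written is lost...
            self.corrupted.add(page)
            # ...and so is the already-programmed lower page paired with it.
            lower = PAIRS.get(page)
            if lower is not None and self.pages[lower] is not None:
                self.corrupted.add(lower)
            return
        self.pages[page] = data

    def is_readable(self, page):
        """True if the page holds data that survived."""
        return self.pages[page] is not None and page not in self.corrupted


block = ToyMlcBlock()
block.program(0, b"old, fully written data")      # written long before the cut
block.program(4, b"new data", interrupted=True)   # power cut during this write
```

In this model, data on page 0 that was written successfully becomes unreadable because of a later, unrelated write to page 4. That is the property that makes a power cut on MLC NAND more damaging than on SLC NAND, where only the page being written is at risk.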

> 
> U-boot reports the following:
> UBI: default fastmap pool size: 100
> UBI: default fastmap WL pool size: 25
> UBI: attaching mtd1 to ubi0
> UBI: scanning is finished
> UBI init error 22
> Error reading superblock on volume 'ubi:boot' errno=-19!
> ubifsmount - mount UBIFS volume
> 
> whereas the linux kernel booted from sd card gives:
> ubiattach /dev/ubi_ctrl -m 0
> [  100.560704] ubi0: default fastmap pool size: 8
> [  100.565186] ubi0: default fastmap WL pool size: 4
> [  100.570100] ubi0: attaching mtd0
> [  100.590469] ubi0: scanning is finished
> [  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was 
> not found
> [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0, 
> error -22
> ubiattach: error!: cannot attach mtd0
>             error 22 (Invalid argument)
> 
> The u-boot version we are using is a few months out of date
> U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner 
> Technology
> arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
> GNU ld (2.25-5+5+b1) 2.25
> 
> but the kernel is fairly up to date:
> 4.2.0-rc4-opinicus-g8ec3671
> 
> 
> Now I know that the mtd stuff is all very new and all very untested, 
> what I am curious about is a) have other people actually tried the mtd 
> stuff on Allwinner hardware, and b) has anybody encountered this issue 
> as well?

Yes, we did. So far we're using the NAND in SLC mode to address this
problem. It seems to work, but you also lose half the NAND capacity.

> 
> It's not something very easily reproducible (toggling a machine on/off 
> repeatedly did not trigger it yet) but it does happen.

I managed to reproduce it by faking a power cut directly in the NAND
core code (by sending a RESET command to the NAND chip in the middle of
a program operation), and I can confirm that SLC mode addresses the problem.

Anyway, remember that MLC NANDs have other sources of unreliability
(e.g. the unstable bits problem).

Best Regards,

Boris


[1]http://events.linuxfoundation.org/sites/events/files/slides/brezillon-mlc-nand_0.pdf

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


* [linux-sunxi] ARM: sunxi: Experiences NAND flash
       [not found] ` <55CB4AA3.9050809@schinagl.nl>
@ 2015-08-17  7:48   ` Boris Brezillon
  0 siblings, 0 replies; 6+ messages in thread
From: Boris Brezillon @ 2015-08-17  7:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Oliver,

On Wed, 12 Aug 2015 15:31:15 +0200
Olliver Schinagl <oliver+list@schinagl.nl> wrote:

> Hey Yassin,
> 
> I'm afraid. The strange thing that seems very related here is that when 
> writing a file onto the flash, it alternately fails and succeeds. It 
> never fails or succeeds twice in a row! And this happens on any board and 
> any partition.

I don't know if you only pasted half your command sequence, but it
seems you are writing twice on the same memory region without erasing it,
and this is prohibited on NAND devices.

Try with:

# flash_erase /dev/mtd0 0 0 && nandwrite -p /dev/mtd0 u-boot-sunxi-with-spl.bin
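
To show why an erase has to come first, here is a minimal Python sketch of the underlying rule: programming NAND can only change bits from 1 to 0, and only an erase sets them back to 1. The helper names `erase` and `program` are made up for this illustration and ignore ECC, randomization and page boundaries.

```python
ERASED_BYTE = 0xFF  # an erased NAND cell reads back as all ones


def erase(length):
    """Return a freshly erased region of `length` bytes."""
    return bytes([ERASED_BYTE]) * length


def program(region, data):
    """Program `data` over `region`: bits can go 1 -> 0, never 0 -> 1."""
    return bytes(old & new for old, new in zip(region, data))


first_image = b"\x12\x34\x56\x78"
second_image = b"\xab\xcd\xef\x01"

region = program(erase(4), first_image)       # write after erase: data is intact
overwritten = program(region, second_image)   # write without erase: data is mangled
rewritten = program(erase(4), second_image)   # erase, then write: data is intact
```

In this model the second write without an erase leaves the bitwise AND of both images in flash, which matches neither of them. On a real chip the ECC bytes are mangled the same way, so the page typically reads back with uncorrectable errors.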

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


* [linux-sunxi] Re: ARM: sunxi: Experiences NAND flash
  2015-08-17  7:34 ` Boris Brezillon
@ 2015-08-17  7:51   ` Michal Suchanek
  2015-08-17  8:30   ` Roy Spliet
  1 sibling, 0 replies; 6+ messages in thread
From: Michal Suchanek @ 2015-08-17  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

Hello

On 17 August 2015 at 09:34, Boris Brezillon
<boris.brezillon@free-electrons.com> wrote:
> Hi Oliver,
>
> Sorry for the late reply (I was on vacation for the last 2 weeks).
>
> On Tue, 11 Aug 2015 14:16:52 +0200
> Olliver Schinagl <oliver+list@schinagl.nl> wrote:
>

>>
>> Now I know that the mtd stuff is all very new and all very untested,
>> what I am curious about is a) have other people actually tried the mtd
>> stuff on Allwinner hardware, and b) has anybody encountered this issue
>> as well?
>
> Yes, we did. So far we're using the NAND in SLC mode to address this
> problem. It seems to work, but you also lose half the NAND capacity.
>


What is needed to use the NAND in SLC mode?

Presumably you need to know something about its organization?

Is this data available for chips commonly used on sunxi devices?

Thanks

Michal


* ARM: sunxi: Experiences NAND flash
  2015-08-17  7:34 ` Boris Brezillon
  2015-08-17  7:51   ` [linux-sunxi] " Michal Suchanek
@ 2015-08-17  8:30   ` Roy Spliet
  2015-08-17  9:03     ` Boris Brezillon
  1 sibling, 1 reply; 6+ messages in thread
From: Roy Spliet @ 2015-08-17  8:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

Reply in-line

On 17-08-15 at 08:34, Boris Brezillon wrote:
> Hi Oliver,
>
> Sorry for the late reply (I was on vacation for the last 2 weeks).
>
> On Tue, 11 Aug 2015 14:16:52 +0200
> Olliver Schinagl <oliver+list@schinagl.nl> wrote:
>
>> Hello everybody,
>>
>> We are working with Boris and Roy's patch series on getting the NAND
>> flash chip working on Olimex OLinuXino Lime2 boards. Initially,
>> everything looks fine, but we noticed that occasionally (after
>> power/cycle or power cut) ubi fails to mount the partition. It is not
>> something easily enough to reproduce, but it has failed on 5 boards out
>> of 30 we have.
> I remember warning you about that problem before: MLC NANDs are not as
> reliable as SLC ones (please read my presentation about MLC support in
> Linux [1]). I also remember recommending using an SLC chip if you were
> tight on time to avoid dealing with all these MLC related problems, but
> you decided to go for the MLC solution.
>
> Back to your problem now: what you're seeing here is probably caused by
> interrupted PROGRAM operations on paired pages (see pages 17, 18 and 26
> to 32 of my presentation for more information).
In his defence: we looked at it, and from what we could tell it is not 
possible to find an affordable SLC chip that the Allwinner A10/A20 
BootROM would even boot from. In general, chips below 8K page size 
require 64-bit ECC strength to operate, which in turn requires more OOB 
area than any chip would provide. This limitation is in my opinion a 
design fault on Allwinner's side, and I hope that their future SoCs can 
boot with more relaxed ECC settings to accommodate cheap SLC chips, 
but right now there is nothing we can do to change that situation.
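
The OOB shortfall mentioned above can be estimated with a short Python sketch. It rests on assumptions that are not in the message: BCH codes over 1024-byte ECC chunks, parity bounded by m * t bits, and no extra metadata bytes in the OOB area. The function names are made up, and the real controller layout may differ.

```python
import math


def bch_parity_bytes(strength_bits, chunk_bytes):
    """Upper bound on BCH parity size for one ECC chunk.

    A BCH code correcting t bit errors needs at most m * t parity bits,
    where m is the smallest integer such that the codeword (data plus
    parity) fits in 2**m - 1 bits.
    """
    data_bits = chunk_bytes * 8
    m = 1
    while (1 << m) - 1 < data_bits + m * strength_bits:
        m += 1
    return math.ceil(m * strength_bits / 8)


def ecc_fits_in_oob(page_bytes, oob_bytes, strength_bits, chunk_bytes=1024):
    """True if the parity for every chunk of the page fits in the OOB area."""
    chunks = page_bytes // chunk_bytes
    return chunks * bch_parity_bytes(strength_bits, chunk_bytes) <= oob_bytes


# Assumed geometry for illustration: 2048-byte page with 64 bytes of OOB.
small_slc_ok = ecc_fits_in_oob(page_bytes=2048, oob_bytes=64, strength_bits=64)
```

Under these assumptions a 2048-byte page needs 2 x 112 = 224 bytes of parity at 64-bit strength, far more than a 64-byte OOB area can hold.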
>> U-boot reports the following:
>> UBI: default fastmap pool size: 100
>> UBI: default fastmap WL pool size: 25
>> UBI: attaching mtd1 to ubi0
>> UBI: scanning is finished
>> UBI init error 22
>> Error reading superblock on volume 'ubi:boot' errno=-19!
>> ubifsmount - mount UBIFS volume
>>
>> whereas the linux kernel booted from sd card gives:
>> ubiattach /dev/ubi_ctrl -m 0
>> [  100.560704] ubi0: default fastmap pool size: 8
>> [  100.565186] ubi0: default fastmap WL pool size: 4
>> [  100.570100] ubi0: attaching mtd0
>> [  100.590469] ubi0: scanning is finished
>> [  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was
>> not found
>> [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0,
>> error -22
>> ubiattach: error!: cannot attach mtd0
>>              error 22 (Invalid argument)
>>
>> The u-boot version we are using is a few months out of date
>> U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner
>> Technology
>> arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
>> GNU ld (2.25-5+5+b1) 2.25
>>
>> but the kernel is fairly up to date:
>> 4.2.0-rc4-opinicus-g8ec3671
>>
>>
>> Now I know that the mtd stuff is all very new and all very untested,
>> what I am curious about is a) have other people actually tried the mtd
>> stuff on Allwinner hardware, and b) has anybody encountered this issue
>> as well?
> Yes, we did. So far we're using the NAND in SLC mode to address this
> problem. It seems to work, but you also lose half the NAND capacity.
So as requested by someone else: how exactly does that work? Can we just 
give your NAND driver a mapping between shared pages and instruct it to 
ignore half, or does the driver require some serious patchery?
Cheers,

Roy
>
>> It's not something very easily reproducible (toggling a machine on/off
>> repeatedly did not trigger it yet) but it does happen.
> I managed to reproduce it by faking a power cut directly in the NAND
> core code (by sending a RESET command to the NAND chip in the middle of
> a program operation), and I can confirm that SLC mode addresses the problem.
>
> Anyway, remember that MLC NANDs have other sources of unreliability
> (e.g. the unstable bits problem).
>
> Best Regards,
>
> Boris
>
>
> [1]http://events.linuxfoundation.org/sites/events/files/slides/brezillon-mlc-nand_0.pdf
>


* ARM: sunxi: Experiences NAND flash
  2015-08-17  8:30   ` Roy Spliet
@ 2015-08-17  9:03     ` Boris Brezillon
  0 siblings, 0 replies; 6+ messages in thread
From: Boris Brezillon @ 2015-08-17  9:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Roy,

On Mon, 17 Aug 2015 09:30:38 +0100
Roy Spliet <seven@nimrod-online.com> wrote:

> Hello,
> 
> Reply in-line
> 
> On 17-08-15 at 08:34, Boris Brezillon wrote:
> > Hi Oliver,
> >
> > Sorry for the late reply (I was on vacation for the last 2 weeks).
> >
> > On Tue, 11 Aug 2015 14:16:52 +0200
> > Olliver Schinagl <oliver+list@schinagl.nl> wrote:
> >
> >> Hello everybody,
> >>
> >> We are working with Boris and Roy's patch series on getting the NAND
> >> flash chip working on Olimex OLinuXino Lime2 boards. Initially,
> >> everything looks fine, but we noticed that occasionally (after
> >> power/cycle or power cut) ubi fails to mount the partition. It is not
> >> something easily enough to reproduce, but it has failed on 5 boards out
> >> of 30 we have.
> > I remember warning you about that problem before: MLC NANDs are not as
> > reliable as SLC ones (please read my presentation about MLC support in
> > Linux [1]). I also remember recommending using an SLC chip if you were
> > tight on time to avoid dealing with all these MLC related problems, but
> > you decided to go for the MLC solution.
> >
> > Back to your problem now: what you're seeing here is probably caused by
> > interrupted PROGRAM operations on paired pages (see pages 17, 18 and 26
> > to 32 of my presentation for more information).
> In his defence: we looked at it, and from what we could tell it is not 
> possible to find an affordable SLC chip that the Allwinner A10/A20 
> BootROM would even boot from. In general, chips below 8K page size 
> require 64-bit ECC strength to operate, which in turn requires more OOB 
> area than any chip would provide. This limitation is in my opinion a 
> design fault on Allwinner's side, and I hope that their future SoCs can 
> boot with more relaxed ECC settings to accommodate cheap SLC chips, 
> but right now there is nothing we can do to change that situation.

Hm, according to this table [1], it also tries the 64-bit/512-byte
scheme, which should fit in most SLC NANDs (if you have a NAND with
2k + 64 byte pages and you only use 512 bytes per page, that leaves
1600 bytes for your ECC data).
That being said, supporting this kind of layout in Linux can be
complicated: I remember we (Roy and I) tried to patch the nand part code
to tweak the data/OOB split for this case, but we didn't manage
to get it to work.
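
The byte budget in the parenthesis above can be checked with a few lines of Python. The parity figure is an assumption, not something stated in the thread: it uses the common BCH bound of m * t bits with m = 13 for a 512-byte chunk and t = 64. The names are made up for this sketch.

```python
def spare_bytes(data_bytes, oob_bytes, payload_bytes):
    """Bytes left in the raw page once only `payload_bytes` hold user data."""
    return data_bytes + oob_bytes - payload_bytes


# Assumption: BCH parity bound of m * t bits, m = 13 for 512-byte chunks, t = 64.
ASSUMED_PARITY_BYTES = 13 * 64 // 8

spare = spare_bytes(data_bytes=2048, oob_bytes=64, payload_bytes=512)
scheme_fits = spare >= ASSUMED_PARITY_BYTES
```

With a 2048 + 64 byte page, 1600 bytes remain after the 512-byte payload, so the parity for one 64-bit/512-byte chunk fits with plenty of room. The cost is that only a quarter of the page's data area carries user data.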

> >> U-boot reports the following:
> >> UBI: default fastmap pool size: 100
> >> UBI: default fastmap WL pool size: 25
> >> UBI: attaching mtd1 to ubi0
> >> UBI: scanning is finished
> >> UBI init error 22
> >> Error reading superblock on volume 'ubi:boot' errno=-19!
> >> ubifsmount - mount UBIFS volume
> >>
> >> whereas the linux kernel booted from sd card gives:
> >> ubiattach /dev/ubi_ctrl -m 0
> >> [  100.560704] ubi0: default fastmap pool size: 8
> >> [  100.565186] ubi0: default fastmap WL pool size: 4
> >> [  100.570100] ubi0: attaching mtd0
> >> [  100.590469] ubi0: scanning is finished
> >> [  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was
> >> not found
> >> [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0,
> >> error -22
> >> ubiattach: error!: cannot attach mtd0
> >>              error 22 (Invalid argument)
> >>
> >> The u-boot version we are using is a few months out of date
> >> U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner
> >> Technology
> >> arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
> >> GNU ld (2.25-5+5+b1) 2.25
> >>
> >> but the kernel is fairly up to date:
> >> 4.2.0-rc4-opinicus-g8ec3671
> >>
> >>
> >> Now I know that the mtd stuff is all very new and all very untested,
> >> what I am curious about is a) have other people actually tried the mtd
> >> stuff on Allwinner hardware, and b) has anybody encountered this issue
> >> as well?
> > Yes, we did. So far we're using the NAND in SLC mode to address this
> > problem. It seems to work, but you also lose half the NAND capacity.
> So as requested by someone else: how exactly does that work? Can we just 
> give your NAND driver a mapping between shared pages and instruct it to 
> ignore half, or does the driver require some serious patchery?

I only have a prototype for this SLC mode, and the code is available
here [2].
In short, the NAND core layer checks for SLC mode activation, and if it
is activated it only exposes half the erase block capacity.
This also requires some chip specific code to enable/disable the SLC
mode and adjust the row/column addresses before passing them to the
controller driver.
Note that SLC mode can be enabled per partition, which lets us declare
the SPL partition in MLC mode so that the BROM can still load the SPL.
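
The idea of exposing half of each erase block and remapping page addresses can be sketched in Python as follows. This is not the prototype code from [2]: the `ToyBlockGeometry` class, the `lower_pages` tuple and the partition table are hypothetical, and the sketch assumes SLC mode simply means using only one page of each pair.

```python
from dataclasses import dataclass


@dataclass
class ToyBlockGeometry:
    """Toy erase-block description; all values are illustrative."""
    pages_per_block: int
    page_size: int
    lower_pages: tuple  # hypothetical list of physical "lower" page indexes

    def exposed_block_size(self, slc_mode):
        """Erase block size seen by upper layers such as UBI."""
        pages = len(self.lower_pages) if slc_mode else self.pages_per_block
        return pages * self.page_size

    def physical_page(self, logical_page, slc_mode):
        """Translate a page index inside the block to a physical page."""
        if not slc_mode:
            return logical_page
        return self.lower_pages[logical_page]


geometry = ToyBlockGeometry(pages_per_block=8, page_size=8192,
                            lower_pages=(0, 1, 3, 5))

# SLC mode chosen per partition: the SPL stays in MLC mode for the BROM.
partitions = {"spl": False, "rootfs": True}
block_sizes = {name: geometry.exposed_block_size(slc)
               for name, slc in partitions.items()}
```

Here the `rootfs` partition sees erase blocks half the size of those in the `spl` partition, and its logical page 2 lands on physical page 3. A real implementation would also need the chip-specific commands mentioned above.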

Best Regards,

Boris

[1]http://linux-sunxi.org/NAND#More_information_on_BROM_NAND
[2]https://github.com/NextThingCo/CHIP-linux/tree/nextthing/4.2/chip-nand-slc-mode

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

