* Wear-leveling peculiarities
@ 2015-05-18 13:53 Johannes Bauer
2015-05-18 14:09 ` Johannes Bauer
2015-05-19 9:17 ` valent.turkovic
0 siblings, 2 replies; 15+ messages in thread
From: Johannes Bauer @ 2015-05-18 13:53 UTC (permalink / raw)
To: linux-mtd
Hello list,
I keep track of some devices running an embedded ARM Linux which boots
from NAND flash. On there, ubifs is used. The deployed kernels are:
Linux version 3.0.59 (###@###) (gcc version 4.5.4 20120305 (prerelease)
(GCC) ) #1 Mon Apr 29 16:36:42 CEST 2013
Target is ARMv7 (omap2).
The units have been deployed for three years now. Recently, we've been
seeing units fail more often. This warranted some investigation. I
pulled dd images of the relevant /dev/mtd device (mtd4 in my case) and
wrote a small Python script that evaluated the UBIFS LEB headers, in
particular the erase count. I expected to see a uniform distribution of
erases all around the flash.
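The core of such an extractor is small. Here is a minimal sketch, assuming the standard UBI erase-counter header layout from the kernel's drivers/mtd/ubi/ubi-media.h (the function name erase_counts and the helper constant are mine, not from any existing tool):

```python
import struct

# UBI erase-counter (EC) header at the start of every physical eraseblock,
# per drivers/mtd/ubi/ubi-media.h:
#   magic "UBI#" (4 bytes), version (1 byte), 3 padding bytes,
#   ec: the erase counter, a big-endian 64-bit integer.
EC_HDR = struct.Struct(">4sB3xQ")
UBI_EC_HDR_MAGIC = b"UBI#"

def erase_counts(image, peb_size):
    """Return one erase count per physical eraseblock of a raw mtd dump.

    Blocks without a valid EC magic (erased or corrupt) are reported as None.
    """
    counts = []
    for peb in range(len(image) // peb_size):
        hdr = image[peb * peb_size : peb * peb_size + EC_HDR.size]
        magic, _version, ec = EC_HDR.unpack(hdr)
        counts.append(ec if magic == UBI_EC_HDR_MAGIC else None)
    return counts
```

From the resulting list, the histogram and the per-block map are each one plotting call away.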
But we see the very opposite:
http://imgur.com/a/d5Bhl
Here you see graphs of two units. You can see that the pattern is
identical: lots of pages which were written seldom, lots of pages
which were written frequently, and very little in between. This is what
the histograms show (erase count on the X axis and their occurrences on
the Y axis).
The other graphs are even more disturbing. They show the physical layout
of the NAND flash. Each pixel corresponds to one LEB. Everything upwards
of 100 erases is red (the scale is linear, shown at the very bottom).
You can see that in some areas, pages are erased very often while in
others they're virtually constant.
This is something I'd expect if the FS did not perform wear-leveling
(files that are written in-place cause page erases at the same locations
over and over). But ubifs should take care of this, shouldn't it? It
might well be that my understanding of ubifs is too limited so I don't
grasp the whole picture. In any case, any advice is greatly appreciated.
Thanks in advance,
Johannes
* Re: Wear-leveling peculiarities
2015-05-18 13:53 Wear-leveling peculiarities Johannes Bauer
@ 2015-05-18 14:09 ` Johannes Bauer
2015-05-18 17:58 ` Richard Weinberger
2015-05-19 9:17 ` valent.turkovic
1 sibling, 1 reply; 15+ messages in thread
From: Johannes Bauer @ 2015-05-18 14:09 UTC (permalink / raw)
To: linux-mtd
On 18.05.2015 15:53, Johannes Bauer wrote:
> I don't grasp the whole picture.
Obviously.
I've been thinking about this some more and had an epiphany pretty much
ten seconds after I hit the "send" button on my email.
Wear-leveling is working alright, but of course some parts (static
operating system components, for example) will never be written a lot of
times, because there is no need to change their contents. Wear-leveling
is only performed on the sectors which are actively written (and on
those which are unoccupied). This would explain the graphs, but is the
explanation also correct?
To remedy this, is it advisable/necessary to walk over the whole file
system every once in a while and read every LEB that has a small erase
count and write it back to its original location? So that an exchange
between the near-constant LEBs and the actively-written LEBs is
performed? Is there a tool to do this?
Best regards,
Johannes
* Re: Wear-leveling peculiarities
2015-05-18 14:09 ` Johannes Bauer
@ 2015-05-18 17:58 ` Richard Weinberger
2015-05-18 17:59 ` Richard Weinberger
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Richard Weinberger @ 2015-05-18 17:58 UTC (permalink / raw)
To: Johannes Bauer; +Cc: linux-mtd
On Mon, May 18, 2015 at 4:09 PM, Johannes Bauer
<weolanwaybqm@spornkuller.de> wrote:
> On 18.05.2015 15:53, Johannes Bauer wrote:
>
>> I don't grasp the whole picture.
>
>
> Obviously.
>
> I've been thinking about this some more and had an epiphany pretty much ten
> seconds after I hit the "send" button on my email.
>
> Wear-leveling is working alright, but of course some parts (static operating
> system components, for example) will never be written a lot of times,
> because there is no need to change their contents. Wear-leveling is only
> performed on the sectors which are actively written (and on those which are
> unoccupied). This would explain the graphs, but is the explanation also
> correct?
Wear-leveling is done on UBI and UBIFS.
What is CONFIG_MTD_UBI_WL_THRESHOLD set to?
From the Kconfig help:
This parameter defines the maximum difference between the highest
erase counter value and the lowest erase counter value of eraseblocks
of UBI devices. When this threshold is exceeded, UBI starts performing
wear leveling by means of moving data from eraseblock with low erase
counter to eraseblocks with high erase counter.
The default value should be OK for SLC NAND flashes, NOR flashes and
other flashes which have eraseblock life-cycle 100000 or more.
However, in case of MLC NAND flashes which typically have eraseblock
life-cycle less than 10000, the threshold should be lessened (e.g.,
to 128 or 256, although it does not have to be power of 2)
I suspect that your threshold was never reached.
From your provided graph it looks like all erase blocks have been
erased at most 300 times.
If your NAND starts dying after 300 erases you're in trouble.
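One quick way to check this on a running unit, assuming the kernel was built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists (the helper name wl_threshold is made up for illustration):

```python
import gzip

def wl_threshold(config_path="/proc/config.gz"):
    """Read CONFIG_MTD_UBI_WL_THRESHOLD from the running kernel's config.

    Returns the configured value, or None if the option is absent.
    """
    with gzip.open(config_path, "rt") as f:
        for line in f:
            if line.startswith("CONFIG_MTD_UBI_WL_THRESHOLD="):
                return int(line.split("=", 1)[1])
    return None
```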
--
Thanks,
//richard
* Re: Wear-leveling peculiarities
2015-05-18 17:58 ` Richard Weinberger
@ 2015-05-18 17:59 ` Richard Weinberger
2015-05-18 20:38 ` Johannes Bauer
[not found] ` <555A4D38.3070702@spornkuller.de>
2 siblings, 0 replies; 15+ messages in thread
From: Richard Weinberger @ 2015-05-18 17:59 UTC (permalink / raw)
To: Johannes Bauer; +Cc: linux-mtd
On Mon, May 18, 2015 at 7:58 PM, Richard Weinberger
<richard.weinberger@gmail.com> wrote:
> On Mon, May 18, 2015 at 4:09 PM, Johannes Bauer
> <weolanwaybqm@spornkuller.de> wrote:
>> On 18.05.2015 15:53, Johannes Bauer wrote:
>>
>>> I don't grasp the whole picture.
>>
>>
>> Obviously.
>>
>> I've been thinking about this some more and had an epiphany pretty much ten
>> seconds after I hit the "send" button on my email.
>>
>> Wear-leveling is working alright, but of course some parts (static operating
>> system components, for example) will never be written a lot of times,
>> because there is no need to change their contents. Wear-leveling is only
>> performed on the sectors which are actively written (and on those which are
>> unoccupied). This would explain the graphs, but is the explanation also
>> correct?
>
> Wear-leveling is done on UBI and UBIFS.
Should be read as *not* UBIFS.
--
Thanks,
//richard
* Re: Wear-leveling peculiarities
2015-05-18 17:58 ` Richard Weinberger
2015-05-18 17:59 ` Richard Weinberger
@ 2015-05-18 20:38 ` Johannes Bauer
[not found] ` <555A4D38.3070702@spornkuller.de>
2 siblings, 0 replies; 15+ messages in thread
From: Johannes Bauer @ 2015-05-18 20:38 UTC (permalink / raw)
To: linux-mtd
Sorry Richard, I meant to reply to the list.
On 18.05.2015 19:58, Richard Weinberger wrote:
> Wear-leveling is done on UBI and UBIFS.
> What is CONFIG_MTD_UBI_WL_THRESHOLD set to?
Ooops. I honestly don't know, will check this out tomorrow. I must admit
that I wasn't aware of this setting at all.
> I suspect that your threshold was never reached.
Yes, I suspect you're right here.
> From your provided graph it looks like all erase blocks have been
> erased at most 300 times.
> If your NAND starts dying after 300 erases you're in trouble.
And I fear you're right here as well :-(
Although there's no definitive way of saying how many page writes the
failed units had, because the defective sectors are so broken that the
kernel barfs out I/O errors. That means I can't even read the OOB metadata.
Cheers,
Johannes
* Re: Wear-leveling peculiarities
[not found] ` <555A4D38.3070702@spornkuller.de>
@ 2015-05-18 20:47 ` Richard Weinberger
2015-05-26 9:38 ` Johannes Bauer
0 siblings, 1 reply; 15+ messages in thread
From: Richard Weinberger @ 2015-05-18 20:47 UTC (permalink / raw)
To: Johannes Bauer; +Cc: linux-mtd@lists.infradead.org
On 18.05.2015 22:36, Johannes Bauer wrote:
> On 18.05.2015 19:58, Richard Weinberger wrote:
>
>> Wear-leveling is done on UBI and UBIFS.
>> What is CONFIG_MTD_UBI_WL_THRESHOLD set to?
>
> Ooops. I honestly don't know, will check this out tomorrow. I must admit
> that I wasn't aware of this setting at all.
>
>> From the Kconfig help:
>> This parameter defines the maximum difference between the highest
>> erase counter value and the lowest erase counter value of eraseblocks
>> of UBI devices. When this threshold is exceeded, UBI starts performing
>> wear leveling by means of moving data from eraseblock with low erase
>> counter to eraseblocks with high erase counter.
>>
>> The default value should be OK for SLC NAND flashes, NOR flashes and
>> other flashes which have eraseblock life-cycle 100000 or more.
>> However, in case of MLC NAND flashes which typically have eraseblock
>> life-cycle less than 10000, the threshold should be lessened (e.g.,
>> to 128 or 256, although it does not have to be power of 2)
>>
>> I suspect that your threshold was never reached.
>
> Yes, I suspect you're right here.
If you did not set CONFIG_MTD_UBI_WL_THRESHOLD it is 4096.
So regular wear-leveling never happened.
>> From your provided graph it looks like all erase blocks have been
>> erased at most 300 times.
>> If your NAND starts dying after 300 erases you're in trouble.
>
> And I fear you're right here as well :-(
>
> Although there's no definitive way of saying how many page writes the
> failed units had, because the defective sectors are so broken that the
> kernel barfs out I/O errors. That means I can't even read the OOB metadata.
The OOB-Data does not matter. UBI is not using OOB.
So, you have to figure out why UBIFS is dying. Maybe it is a NAND issue.
Thanks,
//richard
* Re: Wear-leveling peculiarities
2015-05-18 13:53 Wear-leveling peculiarities Johannes Bauer
2015-05-18 14:09 ` Johannes Bauer
@ 2015-05-19 9:17 ` valent.turkovic
2015-05-26 9:43 ` Johannes Bauer
1 sibling, 1 reply; 15+ messages in thread
From: valent.turkovic @ 2015-05-19 9:17 UTC (permalink / raw)
To: Johannes Bauer; +Cc: linux-mtd
On 18 May 2015 at 15:53, Johannes Bauer <weolanwaybqm@spornkuller.de> wrote:
> Hello list,
> The units have been deployed for three years now. Recently, we've been
> seeing units fail more often. This warranted some investigation. I pulled dd
> images of the relevant /dev/mtd device (mtd4 in my case) and wrote a small
> Python script that evaluated the UBIFS LEB headers, in particular the erase
> count. I expected to see a uniform distribution of erases all around the
> flash.
Can you please share your python script, it would probably be helpful
to other people as well.
Thanks.
* Re: Wear-leveling peculiarities
2015-05-18 20:47 ` Richard Weinberger
@ 2015-05-26 9:38 ` Johannes Bauer
2015-05-26 10:08 ` Richard Weinberger
0 siblings, 1 reply; 15+ messages in thread
From: Johannes Bauer @ 2015-05-26 9:38 UTC (permalink / raw)
To: linux-mtd; +Cc: richard
On 18.05.2015 22:47, Richard Weinberger wrote:
>>> I suspect that your threshold was never reached.
>>
>> Yes, I suspect you're right here.
>
> If you did not set CONFIG_MTD_UBI_WL_THRESHOLD it is 4096.
> So regular wear-leveling never happened.
Your initial assessment was correct. The WL_THRESHOLD is at 4096, whereas
we should have configured it to a lower value for our NAND flash.
> The OOB-Data does not matter. UBI is not using OOB.
>
> So, you have to figure out why UBIFS is dying. Maybe it is a NAND
> issue.
Yes, sorry, I meant the EC header metadata. Indeed it looks like our
NAND flash is dying because WL is not active. Determining the number of
erase cycles of dead NAND blocks is difficult (since the EC header
metadata is lost as well when the whole block doesn't respond anymore).
Thank you for getting me on the right track.
Best regards,
Johannes
* Re: Wear-leveling peculiarities
2015-05-19 9:17 ` valent.turkovic
@ 2015-05-26 9:43 ` Johannes Bauer
0 siblings, 0 replies; 15+ messages in thread
From: Johannes Bauer @ 2015-05-26 9:43 UTC (permalink / raw)
To: valent.turkovic; +Cc: linux-mtd
On 19.05.2015 11:17, valent.turkovic@gmail.com wrote:
> Can you please share your python script, it would probably be helpful
> to other people as well.
> Thanks.
The problem is that I've been doing that on company time, and the gears
do grind slowly (but steadily) around here. To publish any coding work
I'm doing, I have to have permission from my superior.
The whole thing is trivial at best, and be assured that it annoys me as
much as you that there's this bureaucratic burden. But I hope you
understand that I do like my job very much and would like to keep it
without getting in trouble :-)
Anyway, if I get a response, I'll get back here. In the meantime it
appears that Guido Martínez has published a very similar tool on this
mailing list (ubiecdump).
Best regards,
Johannes
* Re: Wear-leveling peculiarities
2015-05-26 9:38 ` Johannes Bauer
@ 2015-05-26 10:08 ` Richard Weinberger
2015-05-26 11:14 ` Johannes Bauer
0 siblings, 1 reply; 15+ messages in thread
From: Richard Weinberger @ 2015-05-26 10:08 UTC (permalink / raw)
To: Johannes Bauer, linux-mtd
On 26.05.2015 11:38, Johannes Bauer wrote:
> On 18.05.2015 22:47, Richard Weinberger wrote:
>
>>>> I suspect that your threshold was never reached.
>>>
>>> Yes, I suspect you're right here.
>>
>> If you did not set CONFIG_MTD_UBI_WL_THRESHOLD it is 4096.
>> So regular wear-leveling never happened.
>
> Your initial assessment was correct. The WL_THRESHOLD is at 4096, whereas we should have configured it to a lower value for our NAND flash.
>
>> The OOB-Data does not matter. UBI is not using OOB.
>>
>> So, you have to figure out why UBIFS is dying. Maybe it is a NAND issue.
>
> Yes, sorry, I meant the EC header metadata. Indeed it looks like our NAND flash is dying because WL is not active. Determining the number of erase cycles of dead NAND blocks is
> difficult (since the EC header metadata is lost as well when the whole block doesn't respond anymore).
What flash is this?
It should not start dying that early.
Thanks,
//richard
* Re: Wear-leveling peculiarities
2015-05-26 10:08 ` Richard Weinberger
@ 2015-05-26 11:14 ` Johannes Bauer
2015-05-26 16:19 ` Jeff Lauruhn (jlauruhn)
2015-05-26 18:20 ` Ezequiel Garcia
0 siblings, 2 replies; 15+ messages in thread
From: Johannes Bauer @ 2015-05-26 11:14 UTC (permalink / raw)
To: Richard Weinberger; +Cc: linux-mtd
On 26.05.2015 12:08, Richard Weinberger wrote:
> What flash is this?
> It should not start dying that early.
To be honest, I have no idea. The problem is that the CPU is stacked on
top of the NAND via package-on-package technology. The whole component
is supplied by a third party (and I've just scanned their documentation,
which does not give any hints about the origin of the NAND). So I have
no markings on the package (POP) and no documentation. The only thing I
do have is dmesg:
omap2-nand driver initializing
ONFI flash detected
NAND device: Manufacturer ID: 0x20, Chip ID: 0xaa (ST Micro NAND 256MiB
1,8V 8-bit)
Maybe there's a way to talk to the NAND directly and get some more
identification, but I don't know how. Any hints? I'd happily provide
clues.
Cheers,
Johannes
* RE: Wear-leveling peculiarities
2015-05-26 11:14 ` Johannes Bauer
@ 2015-05-26 16:19 ` Jeff Lauruhn (jlauruhn)
2015-05-26 16:24 ` Johannes Bauer
2015-05-26 18:20 ` Ezequiel Garcia
1 sibling, 1 reply; 15+ messages in thread
From: Jeff Lauruhn (jlauruhn) @ 2015-05-26 16:19 UTC (permalink / raw)
To: Johannes Bauer, Richard Weinberger; +Cc: linux-mtd@lists.infradead.org
If the part is ONFI compliant you can always dump the parameter page and any information you need.
Jeff Lauruhn
NAND Application Engineer
Embedded Business Unit
Micron Technology, Inc
* RE: Wear-leveling peculiarities
2015-05-26 16:19 ` Jeff Lauruhn (jlauruhn)
@ 2015-05-26 16:24 ` Johannes Bauer
2015-05-26 16:29 ` Richard Weinberger
0 siblings, 1 reply; 15+ messages in thread
From: Johannes Bauer @ 2015-05-26 16:24 UTC (permalink / raw)
To: Jeff Lauruhn (jlauruhn); +Cc: Richard Weinberger, linux-mtd
Hi Jeff,
On 26.05.2015 18:19, Jeff Lauruhn (jlauruhn) wrote:
> If the part is ONFI compliant you can always dump the parameter page
> and any information you need.
This sounds very much like something I'd like to do... only that I have
no clue how :-) Any hints?
Best regards,
Joe
* Re: Wear-leveling peculiarities
2015-05-26 16:24 ` Johannes Bauer
@ 2015-05-26 16:29 ` Richard Weinberger
0 siblings, 0 replies; 15+ messages in thread
From: Richard Weinberger @ 2015-05-26 16:29 UTC (permalink / raw)
To: Johannes Bauer, Jeff Lauruhn (jlauruhn); +Cc: linux-mtd
On 26.05.2015 18:24, Johannes Bauer wrote:
> Hi Jeff,
>
> On 26.05.2015 18:19, Jeff Lauruhn (jlauruhn) wrote:
>> If the part is ONFI compliant you can always dump the parameter page
>> and any information you need.
>
> This sounds very much like something I'd like to do... only that I have no clue how :-) Any hints?
Read nand_base.c; it does all the NAND probing.
I'd start with a plain NAND_CMD_READID.
Thanks,
//richard
* Re: Wear-leveling peculiarities
2015-05-26 11:14 ` Johannes Bauer
2015-05-26 16:19 ` Jeff Lauruhn (jlauruhn)
@ 2015-05-26 18:20 ` Ezequiel Garcia
1 sibling, 0 replies; 15+ messages in thread
From: Ezequiel Garcia @ 2015-05-26 18:20 UTC (permalink / raw)
To: Johannes Bauer, Richard Weinberger; +Cc: linux-mtd
On 05/26/2015 08:14 AM, Johannes Bauer wrote:
> On 26.05.2015 12:08, Richard Weinberger wrote:
>
>> What flash is this?
>> It should not start dying that early.
>
> To be honest, I have no idea. The problem is that the CPU is stacked on
> top of the NAND via package-on-package technology. The whole component
> is supplied by a third party (and I've just scanned their documentation,
> which does not give any hints about the origin of the NAND). So I have
> no markings on the package (POP) and no documentation. The only thing I
> do have is dmesg:
>
> omap2-nand driver initializing
> ONFI flash detected
> NAND device: Manufacturer ID: 0x20, Chip ID: 0xaa (ST Micro NAND 256MiB
> 1,8V 8-bit)
>
According to this:
http://www.linux-mtd.infradead.org/nand-data/nanddata.html
0x20 0xaa is the ID of a Numonyx/ST NAND02GXXX. Searching for the specs,
it seems the device is SLC, so it's kind of unexpected that the default WL
value was wrong.
> Maybe there's a way to talk to the NAND directly and get some more
> identification, but I don't know how. Any hints? I'd happily provide clues.
>
The ONFI specs talk about that (a quick "onfi spec pdf" search will do).
The ONFI spec defines a parameter page which contains all the information
about the device (5.7.1. Parameter Page Data Structure Definition).
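For reference, pulling the interesting identification fields out of a raw parameter page dump is straightforward. The byte offsets below are my reading of ONFI 1.0 section 5.7.1, and the function name parse_onfi is made up, so verify against the spec revision your part actually reports:

```python
def parse_onfi(page):
    """Decode identification fields from a raw 256-byte ONFI parameter page.

    Offsets per ONFI 1.0, section 5.7.1 (verify against your revision):
      0-3     signature "ONFI"
      32-43   device manufacturer (12 ASCII characters)
      44-63   device model (20 ASCII characters)
      105-106 block endurance, encoded as value * 10^exponent
    """
    if page[0:4] != b"ONFI":
        raise ValueError("no ONFI signature")
    return {
        "manufacturer": page[32:44].decode("ascii").rstrip(),
        "model": page[44:64].decode("ascii").rstrip(),
        "block_endurance": page[105] * 10 ** page[106],
    }
```

The block endurance field is the one that matters for this thread: it states how many erase cycles the vendor guarantees per block.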
Keep us updated if you find why the flashes are dying :)
--
Ezequiel Garcia, VanguardiaSur
www.vanguardiasur.com.ar