* ubi vol_size and lots of bad blocks
@ 2011-10-10 12:09 Daniel Drake
  2011-10-11 11:35 ` Atlant Schmidt
  2011-10-14 11:15 ` Artem Bityutskiy
  0 siblings, 2 replies; 8+ messages in thread
From: Daniel Drake @ 2011-10-10 12:09 UTC (permalink / raw)
  To: linux-mtd

Hi,

We're still working on getting ubifs shipped on OLPC XO-1.

One outstanding issue we have is that on some laptops, when switching
from jffs2 to ubifs, the laptop simply does not boot (root fs mounting
difficulties).

One case of this is when there are a large number of bad blocks on the
disk, during boot we get:

[ 76.855427] UBI error: vtbl_check: too large reserved_pebs 7850, good PEBs 7765
[ 76.867878] UBI error: vtbl_check: volume table check failed: record 0, error 9

With so many bad blocks, this is likely a problematic nand or a
corrupt BBT. However, jffs2 worked in this situation, and (with many
of our laptops in remote places) it would be nice for us to figure out
how to make ubifs handle it as well.

There are other cases of this error in the archive, and people have
generally solved it by using a smaller vol_size in the ubinize config.
Am I right in saying that reserved_pebs is computed from the vol_size
specified in the ubinize config?

I guess "good PEBs" is calculated from the amount of non-bad blocks
found during the boot process.

This suggests that using vol_size is unsafe for installations such as
ours, where while we do know the NAND size in advance, we also want to
support an unknown, high number of bad blocks which will vary
throughout the field.

I found a note in the UBI FAQ where it says vol_size can be excluded
and it will be computed to be the size of the input image, and then
the autoresize flag can be used to expand the partition later.
Excluding vol_size in this way indeed solves the problem and the
problematic laptop now boots.

So, am I right in saying that for an installation such as OLPC, where
resilience to strange NAND conditions involving high numbers of bad
blocks is desired, it is advisable to *not* specify vol_size in
ubinize.cfg?
(If so I'll send in a FAQ update for the website.)

The one bit I don't understand is what happens if another block goes
bad later. If the autoresize functionality has modified reserved_pebs
to represent the exact number of good blocks on the disk (i.e.
reserved_pebs==good_PEBs), next time a block goes bad the same
reserved_pebs>good_PEBs boot failure would be hit again. But I am
probably missing something.

cheers,
Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread
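The boot failure in the log above comes down to one comparison. What follows is a rough Python model of it, not the kernel code. It assumes, as the message itself guesses, that reserved_pebs is vol_size rounded up to whole LEBs. The function names are hypothetical; only the two numbers in the usage line come from the log.

```python
def reserved_pebs_for(vol_size_bytes: int, leb_size_bytes: int) -> int:
    """Assumed model: vol_size rounded up to a whole number of LEBs."""
    if leb_size_bytes <= 0:
        raise ValueError("leb_size_bytes must be positive")
    return -(-vol_size_bytes // leb_size_bytes)  # ceiling division


def volume_fits(reserved_pebs: int, good_pebs: int) -> bool:
    """Model of the failing check: a volume cannot reserve more PEBs
    than the number of good PEBs found on the device."""
    return reserved_pebs <= good_pebs


# Numbers taken from the log: reserved_pebs 7850, good PEBs 7765.
print(volume_fits(7850, 7765))  # False
```

With a fixed vol_size the left-hand side is baked into the image at build time, while the right-hand side depends on how many blocks each individual device has lost, which is why the same image can boot on one laptop and fail on another.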
* RE: ubi vol_size and lots of bad blocks
  2011-10-10 12:09 ubi vol_size and lots of bad blocks Daniel Drake
@ 2011-10-11 11:35 ` Atlant Schmidt
  2011-10-14 12:58   ` Artem Bityutskiy
  2011-10-14 11:15 ` Artem Bityutskiy
  1 sibling, 1 reply; 8+ messages in thread
From: Atlant Schmidt @ 2011-10-11 11:35 UTC (permalink / raw)
  To: 'Daniel Drake', linux-mtd@lists.infradead.org

Daniel:

> The one bit I don't understand is what happens if another block goes
> bad later. If the autoresize functionality has modified reserved_pebs
> to represent the exact number of good blocks on the disk (i.e.
> reserved_pebs==good_PEBs), next time a block goes bad the same
> reserved_pebs>good_PEBs boot failure would be hit again. But I am
> probably missing something.

  Be careful here -- the last time I looked, blocks that go
  bad *ARE NOT* actually permanently marked as bad; they may
  no longer be used during the current boot, but the next time
  you reboot, they're eligible for attempted-but-often-failing
  use once again.

  That is, once you've initialized the UBIfs, the number of
  bad PEBs never grows, no matter how many times the software
  discovers that (say) PEB #1234 is just atrociously bad.

  And again, this may have changed but was definitely true the
  last time I tested this (although I'd love to be told otherwise).

                               Atlant

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: ubi vol_size and lots of bad blocks
  2011-10-11 11:35 ` Atlant Schmidt
@ 2011-10-14 12:58   ` Artem Bityutskiy
  2011-10-14 13:03     ` Atlant Schmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Artem Bityutskiy @ 2011-10-14 12:58 UTC (permalink / raw)
  To: Atlant Schmidt; +Cc: linux-mtd@lists.infradead.org, 'Daniel Drake'

On Tue, 2011-10-11 at 07:35 -0400, Atlant Schmidt wrote:
>
>   Be careful here -- the last time I looked, blocks that go
>   bad *ARE NOT* actually permanently marked as bad; they may
>   no longer be used during the current boot, but the next time
>   you reboot, they're eligible for attempted-but-often-failing
>   use once again.

If this happened to you, this must be because of a bug in your MTD
driver. UBI simply calls the MTD mark_bad() function to mark blocks bad.

-- 
Best Regards,
Artem Bityutskiy

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: ubi vol_size and lots of bad blocks
  2011-10-14 12:58   ` Artem Bityutskiy
@ 2011-10-14 13:03     ` Atlant Schmidt
  2011-10-17 11:36       ` Atlant Schmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Atlant Schmidt @ 2011-10-14 13:03 UTC (permalink / raw)
  To: 'dedekind1@gmail.com'
  Cc: linux-mtd@lists.infradead.org, 'Daniel Drake'

Artem:

  Interesting! I'll ask our subject-matter expert;
  it would be great if this were resolved.

                               Atlant

^ permalink raw reply	[flat|nested] 8+ messages in thread
* RE: ubi vol_size and lots of bad blocks
  2011-10-14 13:03     ` Atlant Schmidt
@ 2011-10-17 11:36       ` Atlant Schmidt
  0 siblings, 0 replies; 8+ messages in thread
From: Atlant Schmidt @ 2011-10-17 11:36 UTC (permalink / raw)
  To: Atlant Schmidt, 'dedekind1@gmail.com'
  Cc: linux-mtd@lists.infradead.org, 'Daniel Drake'

Artem:

  Thanks for the prompt -- our subject-matter expert
  says that yes, this was a bug that has now been
  corrected!

                               Atlant

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: ubi vol_size and lots of bad blocks
  2011-10-10 12:09 ubi vol_size and lots of bad blocks Daniel Drake
  2011-10-11 11:35 ` Atlant Schmidt
@ 2011-10-14 11:15 ` Artem Bityutskiy
  2011-10-17 13:35   ` Daniel Drake
  1 sibling, 1 reply; 8+ messages in thread
From: Artem Bityutskiy @ 2011-10-14 11:15 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-mtd

Hi Daniel,

On Mon, 2011-10-10 at 13:09 +0100, Daniel Drake wrote:
> One outstanding issue we have is that on some laptops, when switching
> from jffs2 to ubifs, the laptop simply does not boot (root fs mounting
> difficulties).
>
> One case of this is when there are a large number of bad blocks on the
> disk, during boot we get:
> [ 76.855427] UBI error: vtbl_check: too large reserved_pebs 7850,
> good PEBs 7765
> [ 76.867878] UBI error: vtbl_check: volume table check failed:
> record 0, error 9

It would be great if you also attached the full kernel log with UBI
debugging enabled and probably build messages enabled. It just makes
things easier when you can see the UBI output about the flash
geometry, etc.

> With so many bad blocks, this is likely a problematic nand or a
> corrupt BBT. However, jffs2 worked in this situation, and (with many
> of our laptops in remote places) it would be nice for us to figure out
> how to make ubifs handle it as well.
>
>
> There are other cases of this error in the archive, and people have
> generally solved it by using a smaller vol_size in the ubinize config.
> Am I right in saying that reserved_pebs is computed from the vol_size
> specified in the ubinize config?
>
> I guess "good PEBs" is calculated from the amount of non-bad blocks
> found during the boot process.

Yes, I believe it is just the number of non-bad eraseblocks.

> This suggests that using vol_size is unsafe for installations such as
> ours, where while we do know the NAND size in advance, we also want to
> support an unknown, high number of bad blocks which will vary
> throughout the field.

But this is why the autoresize flag was introduced. When creating a UBI
image, you have to know how big your volume has to be. At least you
need to know the _minimum_ size. And you should use this minimum volume
size in your ubinize config file.

> I found a note in the UBI FAQ where it says vol_size can be excluded
> and it will be computed to be the size of the input image, and then
> the autoresize flag can be used to expand the partition later.
> Excluding vol_size in this way indeed solves the problem and the
> problematic laptop now boots.

Well, you probably need some free space as well. Just come up with
some minimum number, say 300MiB, use this number for the volume size in
ubinize, and use the autoresize flag.

In this case, when you flash this image to your device, UBI will
automatically resize this volume to the maximum possible size.

> So, am I right in saying that for an installation such as OLPC, where
> resilience to strange NAND conditions involving high numbers of bad
> blocks is desired, it is advisable to *not* specify vol_size in
> ubinize.cfg?

Yes, I think you can do this.

> (If so I'll send in a FAQ update for the website.)
>
> The one bit I don't understand is what happens if another block goes
> bad later. If the autoresize functionality has modified reserved_pebs
> to represent the exact number of good blocks on the disk (i.e.
> reserved_pebs==good_PEBs), next time a block goes bad the same
> reserved_pebs>good_PEBs boot failure would be hit again. But I am
> probably missing something.

Autoresize will not occupy the PEBs reserved for bad block handling.
Dunno how much you looked into the UBI code, but it works roughly like
this:

1. avail_pebs = good_pebs
2. Read the volume table, and avail_pebs -= reserved_pebs for each
   volume, i.e., we subtract the number of PEBs which all volumes
   absolutely require.
3. Initialize other subsystems, and subtract EBA_RESERVED_PEBS=1,
   WL_RESERVED_PEBS=1. IOW, every subsystem subtracts the number of
   PEBs it requires to operate. E.g., the wear-levelling (WL) subsystem
   requires one eraseblock for its purposes, etc.
4. In the 'ubi_eba_init_scan()' function we calculate the normal number
   of PEBs which we reserve for bad block handling (default is 1%), and
   subtract that amount from avail_pebs. If avail_pebs is already very
   small, it will become zero in this case.
5. At the very end, we increase the autoresize-marked volume by what is
   left in avail_pebs.

IOW, autoresize will not touch PEBs reserved for BB handling.

Remember, UBIFS also does autoresize automatically, but it is limited
by what you specified with the -c option to mkfs.ubifs. So specify a
large enough number, but not too large, because the larger it is, the
more space UBIFS will reserve for the LPT. But only power-of-2
boundaries make a difference for UBIFS. IOW, 4000 and 4095 LEBs in -c
are equivalent from the UBIFS point of view. But 4095 and 4096 make a
difference. So whatever you specify for -c (say -c X), you can make
that to be "-c roundup_pow_of_two(X) - 1" and this will not affect
anything. But "roundup_pow_of_two(X)" will make the UBIFS image a bit
larger.

I think this info is on the web site in a more readable form. Sorry if
my reply is very messy, feel free to ask questions.

-- 
Best Regards,
Artem Bityutskiy

^ permalink raw reply	[flat|nested] 8+ messages in thread
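The five accounting steps in the message above can be sketched in Python as follows. This is only a model of the description, not the kernel code: the function names are hypothetical, and the integer rounding of the 1% reserve is an assumption. The second helper is one reading of the -c remark. It returns the largest LEB count below the next power-of-two boundary, which matches "roundup_pow_of_two(X) - 1" except when X is itself an exact power of two, where the quoted formula would give X - 1.

```python
EBA_RESERVED_PEBS = 1  # value quoted in step 3
WL_RESERVED_PEBS = 1   # value quoted in step 3


def autoresize_gain(good_pebs, volume_reserved_pebs, bb_reserve_percent=1):
    """Return (bad_block_reserve, pebs_left_for_autoresize_volume)."""
    avail = good_pebs                                  # step 1
    avail -= sum(volume_reserved_pebs)                 # step 2
    if avail < 0:
        raise ValueError("volumes reserve more PEBs than are good")
    avail -= EBA_RESERVED_PEBS + WL_RESERVED_PEBS      # step 3
    if avail < 0:
        raise ValueError("not enough PEBs for the UBI subsystems")
    wanted = good_pebs * bb_reserve_percent // 100     # step 4 (rounding assumed)
    bb_reserve = min(wanted, avail)                    # avail may drop to zero
    avail -= bb_reserve
    return bb_reserve, avail                           # step 5: volume grows by avail


def largest_equivalent_leb_count(x):
    """Largest -c value in the same power-of-two bucket as x (assumed reading)."""
    if x < 1:
        raise ValueError("x must be at least 1")
    return (1 << x.bit_length()) - 1


print(autoresize_gain(7765, [2400]))        # (77, 5286)
print(largest_equivalent_leb_count(4000))   # 4095
```

The 2400-PEB volume in the usage line is a made-up figure; 7765 is the good-PEB count from the log earlier in the thread. Note that the bad-block reserve is taken out before the autoresize volume grows, which is the point of the explanation above.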
* Re: ubi vol_size and lots of bad blocks
  2011-10-14 11:15 ` Artem Bityutskiy
@ 2011-10-17 13:35   ` Daniel Drake
  2011-10-20 15:57     ` Artem Bityutskiy
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Drake @ 2011-10-17 13:35 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Hi Artem,

Thanks for the detailed response - I'll be sure to send another
documentation patch once we've got to the bottom of everything.

On Fri, Oct 14, 2011 at 12:15 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
>> I found a note in the UBI FAQ where it says vol_size can be excluded
>> and it will be computed to be the size of the input image, and then
>> the autoresize flag can be used to expand the partition later.
>> Excluding vol_size in this way indeed solves the problem and the
>> problematic laptop now boots.
>
> Well, you probably need some free space as well. Just come up with
> some minimum number, say 300MiB and use this number for volume size in
> ubinize, and use autoresize flag.

Regarding free space, is it really necessary? My understanding is that
the autoresize functionality will resize the volume *before* it gets
mounted for the first time, so it should be fine to not leave any free
space at image creation time. When it gets mounted for the first time,
it will be freshly resized and have free space available.

As for "some minimum number", I guess it goes without saying that
whatever number is chosen, it must be bigger than the amount of data
that is going to be written into the image. Our image building tool
will be used by different customers who will apply simple
customisations (e.g. with GNOME, with wikipedia) so the range of image
sizes varies. We need to do it based on some kind of calculation that
considers the size of the initial data to be written to the flash. If
we can do it with no free space initially, we can let ubinize do that
for us, with autoresize enabled (this was my train of thought).

> Autoresize will not occupy the PEBs reserved for bad block handling.

OK, thanks for clarifying.
One final question... What happens when the PEBs reserved for bad
block handling run out?

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 8+ messages in thread
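For an image building tool of the kind described in the message above, the two options under discussion (an explicit minimum vol_size, or no vol_size at all) differ by a single line in the ubinize config. Here is a minimal Python sketch that renders such a section. The function name, the section name and the file and volume names are made up for illustration; the keys are the usual ubinize.cfg ones.

```python
def ubinize_section(image, vol_name, vol_size=None, autoresize=True,
                    section="ubifs", vol_id=0):
    """Render one ubinize.cfg volume section as text.

    vol_size=None leaves the key out, so ubinize sizes the volume from
    the input image; a string such as "300MiB" sets an explicit minimum.
    """
    lines = [
        f"[{section}]",
        "mode=ubi",
        f"image={image}",
        f"vol_id={vol_id}",
        "vol_type=dynamic",
        f"vol_name={vol_name}",
    ]
    if vol_size is not None:
        lines.append(f"vol_size={vol_size}")
    if autoresize:
        lines.append("vol_flags=autoresize")
    return "\n".join(lines) + "\n"


# No vol_size: the volume starts at the image size and is grown on the device.
print(ubinize_section("rootfs.ubifs", "rootfs"))
```

Passing vol_size="300MiB" instead reproduces the alternative suggested earlier in the thread, where the volume has a guaranteed minimum and the device fails early if it cannot provide it.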
* Re: ubi vol_size and lots of bad blocks
  2011-10-17 13:35   ` Daniel Drake
@ 2011-10-20 15:57     ` Artem Bityutskiy
  0 siblings, 0 replies; 8+ messages in thread
From: Artem Bityutskiy @ 2011-10-20 15:57 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-mtd

On Mon, 2011-10-17 at 14:35 +0100, Daniel Drake wrote:
> Hi Artem,
>
> Thanks for the detailed response - I'll be sure to send another
> documentation patch once we've got to the bottom of everything.
>
> On Fri, Oct 14, 2011 at 12:15 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> >> I found a note in the UBI FAQ where it says vol_size can be excluded
> >> and it will be computed to be the size of the input image, and then
> >> the autoresize flag can be used to expand the partition later.
> >> Excluding vol_size in this way indeed solves the problem and the
> >> problematic laptop now boots.
> >
> > Well, you probably need some free space as well. Just come up with
> > some minimum number, say 300MiB and use this number for volume size in
> > ubinize, and use autoresize flag.
>
> Regarding free space, is it really necessary?

Well, I thought that if OLPC requires some free space to boot, it could
be necessary.

> My understanding is that
> the autoresize functionality will resize the volume *before* it gets
> mounted for the first time, so it should be fine to not leave any free
> space at image creation time. When it gets mounted for the first time,
> it will be freshly resized and have free space available.

Yes. I was just thinking about a situation where you have so many bad
blocks that the volume will be resized and there will be too little
space. In that case the device won't boot, with weird and unexpected
symptoms. I thought that if you reserve a minimum amount of free space,
then it will fail to boot with predictable symptoms - UBI will print a
message like "not enough eraseblocks" or something like that.

> As for "some minimum number", I guess it goes without saying that
> whatever number is chosen, it must be bigger than the amount of data
> that is going to be written into the image.

Frankly, I do not remember; it depends on the ubinize implementation.
Most probably yes: if you put a smaller number, ubinize will throw an
error back.

> Our image building tool
> will be used by different customers who will apply simple
> customisations (e.g. with GNOME, with wikipedia) so the range of image
> sizes varies. We need to do it based on some kind of calculation that
> considers the size of the initial data to be written to the flash. If
> we can do it with no free space initially, we can let ubinize do that
> for us, with autoresize enabled (this was my train of thought).

Yeah, you can forget about the free space stuff.

> > Autoresize will not occupy the PEBs reserved for bad block handling.
>
> OK, thanks for clarifying.
> One final question... What happens when the PEBs reserved for bad
> block handling run out?

Very good question. In this case you will get an error and UBI will
switch to R/O mode. UBI guarantees that there is a PEB for each LEB.
If you run out of good PEBs, then a write to the LEB may fail.

To recover from this error you could re-flash the device. The run-time
recovery would require deleting or shrinking one of the UBI volumes.

So you need to carefully select the number of PEBs reserved for bad
block handling. For Nokia phones like the N900, 1% was just fine. They
have Samsung OneNAND flash, 256MiB in size, with 128KiB PEBs.

-- 
Best Regards,
Artem Bityutskiy

^ permalink raw reply	[flat|nested] 8+ messages in thread
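To put the 1% figure above into numbers, here is a small Python sketch of the arithmetic for the flash geometry mentioned (256MiB, 128KiB PEBs). It assumes every PEB is good at the start and that the percentage is rounded down with integer division; both are simplifications, and the function name is hypothetical.

```python
MIB = 1024 * 1024
KIB = 1024


def bad_block_reserve(flash_bytes, peb_bytes, percent=1):
    """Return (total_pebs, reserved_pebs) for a given reserve percentage."""
    if peb_bytes <= 0:
        raise ValueError("peb_bytes must be positive")
    total_pebs = flash_bytes // peb_bytes
    reserved = total_pebs * percent // 100  # rounding down is an assumption
    return total_pebs, reserved


# 256MiB of flash with 128KiB PEBs and a 1% reserve.
print(bad_block_reserve(256 * MIB, 128 * KIB))  # (2048, 20)
```

Under these assumptions such a device can lose roughly 20 more blocks before the reserve is exhausted and the read-only behaviour described above would apply; a larger percentage buys more headroom at the cost of usable space.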