* Regression in handling of unsafe UBI shutdown
@ 2011-07-19 13:57 Daniel Mack
2011-07-19 15:02 ` Artem Bityutskiy
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Daniel Mack @ 2011-07-19 13:57 UTC (permalink / raw)
To: linux-mtd, linux-kernel; +Cc: Adrian Hunter, Sven Neumann, Artem Bityutskiy
Hey,
we're facing a new behaviour with 3.0-rc7 kernels wrt UBI file systems
that are not properly unmounted on shut down. The newest kernel we're
using in production is 2.6.36.4, and this version doesn't show the
effect.
When our devices boot up, the bootloader (U-Boot) initializes its UBI
code, reads in the uImage from the root partition and executes it.
This worked very well in thousands of installations previously (with
kernels up to version 2.6.36.4), and generally, it still does work
well with the latest cutting edge when the kernel is shut down
properly and unmounts the UBI partitions safely. However, if the power
is suddenly removed from a running system, U-Boot can no longer read
the FS on the next boot:
Creating 1 MTD partitions on "nand0":
0x00120000-0x08000000 : "mtd=3"
UBI: attaching mtd1 to ubi0
UBI: physical eraseblock size: 131072 bytes (128 KiB)
UBI: logical eraseblock size: 126976 bytes
UBI: smallest flash I/O unit: 2048
UBI: VID header offset: 2048 (aligned 2048)
UBI: data offset: 4096
UBI: attached mtd1 to ubi0
UBI: MTD device name: "mtd=3"
UBI: MTD device size: 126 MiB
UBI: number of good PEBs: 1014
UBI: number of bad PEBs: 1
UBI: max. allowed volumes: 128
UBI: wear-leveling threshold: 4096
UBI: number of internal volumes: 1
UBI: number of user volumes: 2
UBI: available PEBs: 8
UBI: total number of reserved PEBs: 1006
UBI: number of PEBs reserved for bad PEB handling: 10
UBI: max/mean erase counter: 4/3
UBIFS: recovery needed
Error reading superblock on volume 'ubi:RootFS'!
UBIFS not mounted, use ubifs mount to mount volume first!
Wrong Image Format for bootm command
ERROR: can't get kernel image!
Hence my question is: were there any radical changes in the UBI/UBIFS
code on the kernel side that make older code not like the new content
anymore?
Thanks,
Daniel
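As a cross-check, the geometry figures in the attach log above are
mutually consistent, so the flash layout itself does not look suspicious.
The following Python sketch only redoes the arithmetic on numbers copied
from the log; the variable names are illustrative and are not taken from
U-Boot or the kernel.

```python
# All constants below are copied from the U-Boot attach log in the report.
PEB_SIZE = 131072            # physical eraseblock size, bytes (128 KiB)
LEB_SIZE = 126976            # logical eraseblock size, bytes
MIN_IO = 2048                # smallest flash I/O unit, bytes
VID_HDR_OFFSET = 2048        # VID header offset, bytes
DATA_OFFSET = 4096           # data offset, bytes
PART_START = 0x00120000      # start of the "mtd=3" partition
PART_END = 0x08000000        # end of the "mtd=3" partition
GOOD_PEBS, BAD_PEBS = 1014, 1
AVAILABLE_PEBS, RESERVED_PEBS = 8, 1006

# In this log the VID header sits one I/O unit into the PEB and the
# data area starts one I/O unit after that.
assert VID_HDR_OFFSET == MIN_IO
assert DATA_OFFSET == VID_HDR_OFFSET + MIN_IO

# Usable bytes per eraseblock = physical size minus the header area.
leb_size = PEB_SIZE - DATA_OFFSET

# Number of eraseblocks covered by the partition.
part_bytes = PART_END - PART_START
total_pebs = part_bytes // PEB_SIZE
size_mib = part_bytes / (1024 * 1024)

if __name__ == "__main__":
    print("LEB size:", leb_size, "matches log:", leb_size == LEB_SIZE)
    print("PEBs in partition:", total_pebs,
          "good + bad:", GOOD_PEBS + BAD_PEBS)
    print("available + reserved:", AVAILABLE_PEBS + RESERVED_PEBS)
    print("partition size in MiB:", size_mib)
```

The log reports the device size as 126 MiB, which is this value with the
fractional part dropped.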
* Re: Regression in handling of unsafe UBI shutdown
2011-07-19 13:57 Regression in handling of unsafe UBI shutdown Daniel Mack
@ 2011-07-19 15:02 ` Artem Bityutskiy
2011-07-20 5:21 ` Artem Bityutskiy
2011-07-20 11:45 ` Mike Hench
2 siblings, 0 replies; 12+ messages in thread
From: Artem Bityutskiy @ 2011-07-19 15:02 UTC (permalink / raw)
To: Daniel Mack; +Cc: Sven Neumann, linux-mtd, linux-kernel
On Tue, 2011-07-19 at 15:57 +0200, Daniel Mack wrote:
> Hey,
>
> we're facing a new behaviour with 3.0-rc7 kernels wrt UBI file systems
> that are not properly unmounted on shut down. The newest kernel we're
> using in production is 2.6.36.4, and this version doesn't show the
> effect.
>
> When our devices boot up, the bootloader (U-Boot) initializes its UBI
> code, reads in the uImage from the root partition and executes it.
> This worked very well in thousands of installations previously (with
> kernels up to version 2.6.36.4), and generally, it still does work
> well with the latest cutting edge when the kernel is shut down
> properly and unmounts the UBI partitions safely. However, if the power
> is suddenly removed from a running system, U-Boot can no longer read
> the FS on the next boot:
>
>
> Creating 1 MTD partitions on "nand0":
> 0x00120000-0x08000000 : "mtd=3"
> UBI: attaching mtd1 to ubi0
> UBI: physical eraseblock size: 131072 bytes (128 KiB)
> UBI: logical eraseblock size: 126976 bytes
> UBI: smallest flash I/O unit: 2048
> UBI: VID header offset: 2048 (aligned 2048)
> UBI: data offset: 4096
> UBI: attached mtd1 to ubi0
> UBI: MTD device name: "mtd=3"
> UBI: MTD device size: 126 MiB
> UBI: number of good PEBs: 1014
> UBI: number of bad PEBs: 1
> UBI: max. allowed volumes: 128
> UBI: wear-leveling threshold: 4096
> UBI: number of internal volumes: 1
> UBI: number of user volumes: 2
> UBI: available PEBs: 8
> UBI: total number of reserved PEBs: 1006
> UBI: number of PEBs reserved for bad PEB handling: 10
> UBI: max/mean erase counter: 4/3
> UBIFS: recovery needed
> Error reading superblock on volume 'ubi:RootFS'!
> UBIFS not mounted, use ubifs mount to mount volume first!
> Wrong Image Format for bootm command
> ERROR: can't get kernel image!
>
>
> Hence my question is: were there any radical changes in the UBI/UBIFS
> code on the kernel side that make older code not like the new content
> anymore?
Not sure if there were radical changes, but please, enable UBIFS
debugging and provide UBIFS messages. Here are some hints:
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_how_send_bugreport
--
Best Regards,
Artem Bityutskiy
* Re: Regression in handling of unsafe UBI shutdown
2011-07-19 13:57 Regression in handling of unsafe UBI shutdown Daniel Mack
2011-07-19 15:02 ` Artem Bityutskiy
@ 2011-07-20 5:21 ` Artem Bityutskiy
2011-07-20 9:18 ` Daniel Mack
2011-07-20 11:45 ` Mike Hench
2 siblings, 1 reply; 12+ messages in thread
From: Artem Bityutskiy @ 2011-07-20 5:21 UTC (permalink / raw)
To: Daniel Mack; +Cc: Sven Neumann, linux-mtd, linux-kernel, Adrian Hunter
On Tue, 2011-07-19 at 15:57 +0200, Daniel Mack wrote:
> UBIFS: recovery needed
> Error reading superblock on volume 'ubi:RootFS'!
> UBIFS not mounted, use ubifs mount to mount volume first!
> Wrong Image Format for bootm command
> ERROR: can't get kernel image!
>
>
> Hence my question is: were there any radical changes in the UBI/UBIFS
> code on the kernel side that make older code not like the new content
> anymore?
Daniel, sorry, I have no time to look at this now, could you please try
to bisect the issue?
--
Best Regards,
Artem Bityutskiy
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 5:21 ` Artem Bityutskiy
@ 2011-07-20 9:18 ` Daniel Mack
2011-07-20 9:31 ` Daniel Mack
2011-07-20 12:32 ` Artem Bityutskiy
0 siblings, 2 replies; 12+ messages in thread
From: Daniel Mack @ 2011-07-20 9:18 UTC (permalink / raw)
To: dedekind1; +Cc: Sven Neumann, linux-mtd, linux-kernel, Adrian Hunter
On Wed, Jul 20, 2011 at 7:21 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Tue, 2011-07-19 at 15:57 +0200, Daniel Mack wrote:
>> UBIFS: recovery needed
>> Error reading superblock on volume 'ubi:RootFS'!
>> UBIFS not mounted, use ubifs mount to mount volume first!
>> Wrong Image Format for bootm command
>> ERROR: can't get kernel image!
>>
>>
>> Hence my question is: were there any radical changes in the UBI/UBIFS
>> code on the kernel side that make older code not like the new content
>> anymore?
>
> Daniel, sorry, I have no time to look at this now, could you please try
> to bisect the issue?
It's not really easy to bisect, as the issue is not always
reproducible, and the flash needs to be re-initialized each time
after it happens.
Also note that it's not the kernel itself that complains about the
state of the file system in this case but U-Boot. If we boot a 3.0-rc7
kernel in such a situation (via USB for example), the kernel will
recover the FS and continue.
I don't know how many people use the UBI code in U-Boot, and I don't
know either whether it was a good idea to go this way in the first
place, but we didn't want to waste much space on the NAND for a
fixed-size partition just for the kernel, and have a hard limit for it
in the future. And as I said, this approach has worked just fine in
the past.
So, let me re-phrase my question: is anyone aware of changes in the
UBIFS code between 2.6.36 and 3.0 that might cause trouble to U-Boot's
UBI code from 2009?
Thanks,
Daniel
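On the difficulty of bisecting a bug that does not always reproduce: one
common workaround is to repeat the power-cut test several times before
declaring a revision good. A minimal sketch, assuming each trial is
independent and that a bad revision fails with some fixed per-trial
probability. The 30% figure used below is purely hypothetical and was not
measured in this thread; trials_needed is an illustrative name.

```python
def trials_needed(p_repro: float, max_miss: float) -> int:
    """Return how many clean power-cut trials are needed before calling a
    revision good.

    p_repro  -- assumed chance that one trial triggers the failure on a
                bad revision (hypothetical, must be measured per setup)
    max_miss -- accepted chance of wrongly marking a bad revision good
    """
    if not 0.0 < p_repro < 1.0:
        raise ValueError("p_repro must be strictly between 0 and 1")
    if not 0.0 < max_miss < 1.0:
        raise ValueError("max_miss must be strictly between 0 and 1")
    n = 1
    # A bad revision survives n independent trials with probability
    # (1 - p_repro) ** n; keep adding trials until that is small enough.
    while (1.0 - p_repro) ** n > max_miss:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical numbers: failure seen in 30% of power cuts, and a 5%
    # risk of a wrong "good" verdict is acceptable.
    print(trials_needed(0.3, 0.05))
```

Since the flash has to be re-initialized after each failure, the cost per
bisect step grows with this count, which is part of why bisecting is
unattractive here.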
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 9:18 ` Daniel Mack
@ 2011-07-20 9:31 ` Daniel Mack
2011-07-22 7:58 ` Artem Bityutskiy
2011-07-20 12:32 ` Artem Bityutskiy
1 sibling, 1 reply; 12+ messages in thread
From: Daniel Mack @ 2011-07-20 9:31 UTC (permalink / raw)
To: dedekind1; +Cc: Sven Neumann, linux-mtd, linux-kernel
On Wed, Jul 20, 2011 at 11:18 AM, Daniel Mack <zonque@gmail.com> wrote:
> Also note that it's not the kernel itself that complains about the
> state of the file system in this case but U-Boot. If we boot a 3.0-rc7
> kernel in such a situation (via USB for example), the kernel will
> recover the FS and continue.
And I forgot to mention that the mtd-torturetest didn't show any
problems, so it's most likely not an issue with the block-device
layer.
Btw - Adrian Hunter's address <adrian.hunter@nokia.com> bounces - could
anyone update the UBI record in MAINTAINERS with a new
address? :)
Thanks,
Daniel
* RE: Regression in handling of unsafe UBI shutdown
2011-07-19 13:57 Regression in handling of unsafe UBI shutdown Daniel Mack
2011-07-19 15:02 ` Artem Bityutskiy
2011-07-20 5:21 ` Artem Bityutskiy
@ 2011-07-20 11:45 ` Mike Hench
2011-07-20 11:50 ` Daniel Mack
2011-07-24 14:52 ` Daniel Mack
2 siblings, 2 replies; 12+ messages in thread
From: Mike Hench @ 2011-07-20 11:45 UTC (permalink / raw)
To: Daniel Mack, linux-mtd, linux-kernel
Cc: Artem Bityutskiy, Sven Neumann, Adrian Hunter
> UBIFS: recovery needed
> Error reading superblock on volume 'ubi:RootFS'!
> UBIFS not mounted, use ubifs mount to mount volume first!
> Wrong Image Format for bootm command
> ERROR: can't get kernel image!
Be nice to know what error.
FWIW:
a) I do not see this. I have very limited experience with 3.0-rc7 though
so maybe I should not say anything :-)
b) sometimes UBIFS decides to do recovery in u-boot, sometimes not.
it seems to depend on the type of content in the journal.
Lots of deleted files seem to trigger u-boot to perform a recovery for
instance.
I was running out of memory in u-boot while replaying the journal
at one point with an older kernel. This produced the same top-level
error.
I run 12 megs now, with a 2-gigabyte flash. I think that is overkill.
Try debugging in u-boot with the 'bad' flash still in place.
There is debugging in there, but I forget how to turn it on.
Mike Hench
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 11:45 ` Mike Hench
@ 2011-07-20 11:50 ` Daniel Mack
2011-07-20 12:06 ` Mike Hench
2011-07-24 14:52 ` Daniel Mack
1 sibling, 1 reply; 12+ messages in thread
From: Daniel Mack @ 2011-07-20 11:50 UTC (permalink / raw)
To: Mike Hench
Cc: Sven Neumann, Artem Bityutskiy, linux-mtd, linux-kernel,
Adrian Hunter
Hi Mike,
On Wed, Jul 20, 2011 at 1:45 PM, Mike Hench <mhench@elutions.com> wrote:
>> UBIFS: recovery needed
>> Error reading superblock on volume 'ubi:RootFS'!
>> UBIFS not mounted, use ubifs mount to mount volume first!
>> Wrong Image Format for bootm command
>> ERROR: can't get kernel image!
>
>
> Be nice to know what error.
> FWIW:
> a) I do not see this. I have very limited experience with 3.0-rc7 though
> so maybe I should not say anything :-)
Thanks for this information. It's good to know there are similar
setups used by other people. Are you on ARM as well?
Did you try to cut the power from your device in the middle of full
operation? It would help a lot if you could try this.
> b) sometimes UBIFS decides to do recovery in u-boot, sometimes not.
> it seems to depend on the type of content in the journal.
> Lots of deleted files seem to trigger u-boot to perform a recovery for
> instance.
And my fear is that the possible types of content changed in UBIFS in
a way that the kernel itself would be able to deal with it, but legacy
code (in U-Boot and in our case) wouldn't. That would explain what I'm
seeing.
Thanks,
Daniel
* RE: Regression in handling of unsafe UBI shutdown
2011-07-20 11:50 ` Daniel Mack
@ 2011-07-20 12:06 ` Mike Hench
0 siblings, 0 replies; 12+ messages in thread
From: Mike Hench @ 2011-07-20 12:06 UTC (permalink / raw)
To: Daniel Mack
Cc: Sven Neumann, Artem Bityutskiy, linux-mtd, linux-kernel,
Adrian Hunter
> Are you on ARM as well?
We run powerpc.
> Did you try to cut the power from your device in the middle of full
> operation? It would help a lot if you could try this.
My test is automated random power cycling while doing a complicated file
copy/compare operation I use for abuse testing.
Oh, forgot to mention, we are using u-boot 2010-06, not that old,
not new either.
Mike.
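The kind of copy/compare workload described above could look roughly like
the following Python sketch. This is not Mike's actual test, whose details
are not given in this thread; the function names and the choice of SHA-256
are assumptions, and the random power cycling itself has to be driven by
external hardware.

```python
import hashlib
import os
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_compare(src_dir, dst_dir):
    """Copy every regular file under src_dir into dst_dir, flush it to the
    storage device, and return the relative paths whose content differs."""
    mismatches = []
    for src in sorted(Path(src_dir).rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_dir)
        dst = Path(dst_dir) / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        # Ask the kernel to push the copied data out to the device so a
        # power cut shortly afterwards exercises the on-flash state.
        with open(dst, "rb+") as f:
            os.fsync(f.fileno())
        if sha256_of(src) != sha256_of(dst):
            mismatches.append(str(rel))
    return mismatches
```

A comparison made right after the copy is probably served from the page
cache, so the meaningful check is to run the hash comparison again after
the next power cycle, on files that were fully written before the cut.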
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 9:18 ` Daniel Mack
2011-07-20 9:31 ` Daniel Mack
@ 2011-07-20 12:32 ` Artem Bityutskiy
2011-07-20 12:50 ` Daniel Mack
1 sibling, 1 reply; 12+ messages in thread
From: Artem Bityutskiy @ 2011-07-20 12:32 UTC (permalink / raw)
To: Daniel Mack; +Cc: Sven Neumann, linux-mtd, linux-kernel
On Wed, 2011-07-20 at 11:18 +0200, Daniel Mack wrote:
> On Wed, Jul 20, 2011 at 7:21 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> > On Tue, 2011-07-19 at 15:57 +0200, Daniel Mack wrote:
> >> UBIFS: recovery needed
> >> Error reading superblock on volume 'ubi:RootFS'!
> >> UBIFS not mounted, use ubifs mount to mount volume first!
> >> Wrong Image Format for bootm command
> >> ERROR: can't get kernel image!
> >>
> >>
> >> Hence my question is: were there any radical changes in the UBI/UBIFS
> >> code on the kernel side that make older code not like the new content
> >> anymore?
> >
> > Daniel, sorry, I have no time to look at this now, could you please try
> > to bisect the issue?
>
> It's not really easy to bisect, as the issue is not always
> reproducible, and the flash needs to be re-initialized each time
> after it happens.
>
> Also note that it's not the kernel itself that complains about the
> state of the file system in this case but U-Boot. If we boot a 3.0-rc7
> kernel in such a situation (via USB for example), the kernel will
> recover the FS and continue.
>
> I don't know how many people use the UBI code in U-Boot, and I don't
> know either whether it was a good idea to go this way in the first
> place, but we didn't want to waste much space on the NAND for a
> fixed-size partition just for the kernel, and have a hard limit for it
> in the future. And as I said, this approach has worked just fine in
> the past.
>
> So, let me re-phrase my question: is anyone aware of changes in the
> UBIFS code between 2.6.36 and 3.0 that might cause trouble to U-Boot's
> UBI code from 2009?
I guess that would be an on-flash format change? I am not aware of any
such changes, and if there were any, this is a big issue which we would
need to fix.
--
Best Regards,
Artem Bityutskiy
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 12:32 ` Artem Bityutskiy
@ 2011-07-20 12:50 ` Daniel Mack
0 siblings, 0 replies; 12+ messages in thread
From: Daniel Mack @ 2011-07-20 12:50 UTC (permalink / raw)
To: dedekind1; +Cc: Sven Neumann, linux-mtd, linux-kernel
On Wed, Jul 20, 2011 at 2:32 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Wed, 2011-07-20 at 11:18 +0200, Daniel Mack wrote:
>> So, let me re-phrase my question: is anyone aware of changes in the
>> UBIFS code between 2.6.36 and 3.0 that might cause trouble to U-Boot's
>> UBI code from 2009?
>
> I guess that would be an on-flash format change? I am not aware of any
> such changes, and if there were any, this is a big issue which we would
> need to fix.
Yes. And more tests show that we apparently also have a more severe
problem here. Some devices that have been power-cut during operation
even refuse to create a new UBI from Linux on the NAND after they're
toast (we use this procedure as a debricking option for customers).
The only thing that helps in such cases is erasing the NAND from
U-Boot. After that, the UBI creation works again. Any more ideas on
how to debug this would be appreciated.
Daniel
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 9:31 ` Daniel Mack
@ 2011-07-22 7:58 ` Artem Bityutskiy
0 siblings, 0 replies; 12+ messages in thread
From: Artem Bityutskiy @ 2011-07-22 7:58 UTC (permalink / raw)
To: Daniel Mack; +Cc: Sven Neumann, linux-mtd, linux-kernel
On Wed, 2011-07-20 at 11:31 +0200, Daniel Mack wrote:
> Btw - Adrian Hunter's address <adrian.hunter@nokia.com> bounces - could
> anyone update the UBI record in MAINTAINERS with a new
> address? :)
Fixed in the ubifs tree, thanks.
--
Best Regards,
Artem Bityutskiy
* Re: Regression in handling of unsafe UBI shutdown
2011-07-20 11:45 ` Mike Hench
2011-07-20 11:50 ` Daniel Mack
@ 2011-07-24 14:52 ` Daniel Mack
1 sibling, 0 replies; 12+ messages in thread
From: Daniel Mack @ 2011-07-24 14:52 UTC (permalink / raw)
To: Mike Hench
Cc: Sven Neumann, Artem Bityutskiy, linux-mtd, linux-kernel,
Adrian Hunter
On Wed, Jul 20, 2011 at 1:45 PM, Mike Hench <mhench@elutions.com> wrote:
>> UBIFS: recovery needed
>> Error reading superblock on volume 'ubi:RootFS'!
>> UBIFS not mounted, use ubifs mount to mount volume first!
>> Wrong Image Format for bootm command
>> ERROR: can't get kernel image!
>
>
> Be nice to know what error.
> FWIW:
> a) I do not see this. I have very limited experience with 3.0-rc7 though
> so maybe I should not say anything :-)
>
> b) sometimes UBIFS decides to do recovery in u-boot, sometimes not.
> it seems to depend on the type of content in the journal.
> Lots of deleted files seem to trigger u-boot to perform a recovery for
> instance.
> I was running out of memory in u-boot while replaying the journal
> at one point with an older kernel. This produced the same top-level
> error.
Thank you very much for mentioning this, as it pointed me in exactly
the right direction. Whatever really caused it, we ended up with
a larger journal produced by newer kernels in comparison to older ones,
and U-Boot indeed ran out of memory when trying to access the FS later.
Increasing the amount of available memory in U-Boot fixes the problem
for us.
Sorry for scaring you, but it looked like a major regression at
first. And thanks again to everyone for helping sort this out.
Daniel