Data integrity check after UBIFORMAT? Bad image sequence number error.

linux-mtd.lists.infradead.org archive mirror

* Data integrity check after UBIFORMAT? Bad image sequence number error.
@ 2015-06-09  8:02 t kevin
  2015-06-09  8:20 ` Richard Weinberger
  0 siblings, 1 reply; 4+ messages in thread
From: t kevin @ 2015-06-09  8:02 UTC (permalink / raw)
  To: linux-mtd

Hi,

We are using kernel 2.6.36 and mtd-utils 1.5.1 on our box.
During system upgrades, very occasionally (maybe 1 in 100), I get
this error from ubiattach after ubiformat:

[ 1632.520000] UBI: attaching mtd8 to ubi0
[ 1632.520000] UBI: physical eraseblock size: 131072 bytes (128 KiB)
[ 1632.530000] UBI: logical eraseblock size: 126976 bytes
[ 1632.530000] UBI: smallest flash I/O unit: 2048
[ 1632.540000] UBI: sub-page size: 512
[ 1632.540000] UBI: VID header offset: 2048 (aligned 2048)
[ 1632.550000] UBI: data offset: 4096
[ 1633.190000] UBI error: process_eb: bad image sequence number 559476870 in PEB 635, expected 139654706

As I understand it, ubiformat generates a random image sequence
number and writes it to every PEB. So it seems the expected sequence
number was somehow not written to the NAND flash correctly.

So I changed my upgrade sequence as follows:

ubiformat /dev/mtdx -f ubi.img
ubiattach -p /dev/mtdx

if [ "$?" != "0" ]; then
    # attach failed, so do ubiformat again
    ubiformat /dev/mtdx -f ubi.img
fi

This time the symptom changed. After the re-format:

[ 9.630000] UBI: attaching mtd8 to ubi0
[ 9.630000] UBI: physical eraseblock size: 131072 bytes (128 KiB)
[ 9.640000] UBI: logical eraseblock size: 126976 bytes
[ 9.640000] UBI: smallest flash I/O unit: 2048
[ 9.650000] UBI: sub-page size: 512
[ 9.650000] UBI: VID header offset: 2048 (aligned 2048)
[ 9.660000] UBI: data offset: 4096
[ 10.490000] UBI: max. sequence number: 0
[ 10.550000] UBI: volume 0 ("system") re-sized from 897 to 980 LEBs
[ 10.560000] UBI: attached mtd8 to ubi0
[ 10.560000] UBI: MTD device name: "rootfs"
[ 10.570000] UBI: MTD device size: 128 MiB
[ 10.570000] UBI: number of good PEBs: 1024
[ 10.580000] UBI: number of bad PEBs: 0
[ 10.580000] UBI: max. allowed volumes: 128
[ 10.590000] UBI: wear-leveling threshold: 4096
[ 10.590000] UBI: number of internal volumes: 1
[ 10.590000] UBI: number of user volumes: 1
[ 10.600000] UBI: available PEBs: 0
[ 10.600000] UBI: total number of reserved PEBs: 1024
[ 10.610000] UBI: number of PEBs reserved for bad PEB handling: 40
[ 10.610000] UBI: max/mean erase counter: 1/0
[ 10.620000] UBI: image sequence number: 356225242
[ 10.620000] UBI: background thread "ubi_bgt0d" started, PID 332
UBI device number 0, total 1024 LEBs (130023424 bytes, 124.0 MiB), available 0 LEBs (0 bytes), LEB size 126976 bytes (124.0 KiB)
[ 10.790000] UBIFS: mounted UBI device 0, volume 0, name "system"
[ 10.800000] UBIFS: mounted read-only
[ 10.800000] UBIFS: file system size: 123039744 bytes (120156 KiB, 117 MiB, 969 LEBs)
[ 10.810000] UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
[ 10.820000] UBIFS: media format: w4/r0 (latest is w4/r0)
[ 10.830000] UBIFS: default compressor: none
[ 10.830000] UBIFS: reserved for root: 0 bytes (0 KiB)
[ 10.830000] UBIFS error (pid 333): ubifs_read_node: bad node type (255 but expected 9)
[ 10.840000] UBIFS error (pid 333): ubifs_read_node: bad node at LEB 896:85496
[ 10.850000] UBIFS error (pid 333): ubifs_iget: failed to read inode 1, error -22


This time the sequence number check succeeded, but I ran into another
error: bad node type.
It seems something is fishy with the driver. Some expected data is not
written to flash correctly.

My questions are:
1. What could possibly be wrong that causes ubiformat to fail?
2. Is there a way to verify data integrity after ubiformat has run?
Something like an "mtd verify" function.

Thanks a lot
Kevin


* Re: Data integrity check after UBIFORMAT? Bad image sequence number error.
  2015-06-09  8:02 Data integrity check after UBIFORMAT? Bad image sequence number error t kevin
@ 2015-06-09  8:20 ` Richard Weinberger
  2015-06-09  8:52   ` t kevin
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Weinberger @ 2015-06-09  8:20 UTC (permalink / raw)
  To: t kevin; +Cc: linux-mtd@lists.infradead.org

On Tue, Jun 9, 2015 at 10:02 AM, t kevin <kevint324@gmail.com> wrote:
> Hi,
>
> We are using kernel 2.6.36 and mtd-util-1.5.1 on our box.
> During system upgrade, very very occasionally ( 1 in 100, maybe? ) I
> get this error at ubiattach after ubiformat.
>
> [ 1632.520000] UBI: attaching mtd8 to ubi0
> [ 1632.520000] UBI: physical eraseblock size: 131072 bytes (128 KiB)
> [ 1632.530000] UBI: logical eraseblock size: 126976 bytes
> [ 1632.530000] UBI: smallest flash I/O unit: 2048
> [ 1632.540000] UBI: sub-page size: 512
> [ 1632.540000] UBI: VID header offset: 2048 (aligned 2048)
> [ 1632.550000] UBI: data offset: 4096
> [ 1633.190000] UBI error: process_eb: bad image sequence number
> 559476870 in PEB 635, expected 139654706
>
> I understand ubiformat generate a random sequence number and write the
> sequence number to all PEB. So it seems an expected sequence number
> somehow is not written into nand flash correctly.

Are you sure about that?
Could it be that 559476870 is the seq number of the old image and the
new one is too small?
This is one of the main reasons why we have that number: so that
UBI can detect a partially written image.
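
For reference, the image sequence number is stored in the erase-counter
(EC) header at the start of every PEB, so the values on flash can be
inspected directly. A minimal sketch, assuming a raw dump of the
partition without OOB data is available as a file (for example taken
with nanddump), and assuming the usual 64-byte EC header layout with the
"UBI#" magic at offset 0 and the big-endian image_seq field at offset
24. The function name peb_image_seqs is made up for illustration:

```shell
# peb_image_seqs DUMP PEB_SIZE
# Print "PEB_INDEX IMAGE_SEQ" for every PEB in DUMP that starts with a
# UBI EC header. PEBs without the "UBI#" magic (erased, corrupted) are
# skipped. Assumption: image_seq is a big-endian u32 at header offset 24.
peb_image_seqs() {
    dump=$1
    peb=$2
    size=$(($(wc -c < "$dump")))
    i=0
    while [ $((i * peb)) -lt "$size" ]; do
        base=$((i * peb))
        # First four bytes of the PEB as a hex string.
        magic=$(dd if="$dump" bs=1 skip="$base" count=4 2>/dev/null |
                od -An -tx1 | tr -d ' \n')
        if [ "$magic" = "55424923" ]; then   # "UBI#"
            hex=$(dd if="$dump" bs=1 skip=$((base + 24)) count=4 2>/dev/null |
                  od -An -tx1 | tr -d ' \n')
            # Only print when all four bytes could be read.
            [ ${#hex} -eq 8 ] && echo "$i $((0x$hex))"
        fi
        i=$((i + 1))
    done
}
```

With the geometry from the log above the call would be
"peb_image_seqs dump.bin 131072" (dump.bin is a placeholder name).
Piping the output through "awk '{print $2}' | sort | uniq -c" shows at a
glance whether more than one sequence number is present on the flash.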

> So I changed my upgrade sequence like below
>
> ubiformat ubi.img /dev/mtdx
> ubiattach /dev/mtdx
>
> if [ "$?" != "0" ]
>     #do ubiformat again
>     ubiformat ubi.img /dev/mtdx

You format it while it is attached?

> My question are,
> 1. What could possibly be wrong that caused the ubiformat fail?

It can be a faulty MTD driver, a usage error, anything.

> 2. Is there a way to verify the data integrity after a UBIFORMAT
> process? Something like "mtd verify" function.

I fear the answer is "no".
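
There is no dedicated verify option, but a crude readback comparison
can be done before the first attach. A rough sketch, under several
assumptions that need checking against the mtd-utils version in use:
ubiformat rewrites only the EC header at the start of each PEB and
writes the rest of each image PEB unchanged; the partition has no bad
blocks, so image PEB n lands in flash PEB n; and a raw dump without OOB
data is available as a file. The helper name list_bad_pebs and its
arguments are hypothetical:

```shell
# list_bad_pebs IMAGE DUMP PEB_SIZE SKIP
# Print the index of every PEB whose contents differ between IMAGE and
# DUMP, ignoring the first SKIP bytes of each PEB (the EC header area,
# which ubiformat is assumed to rewrite).
list_bad_pebs() {
    img=$1
    dump=$2
    peb=$3
    skip=$4
    size=$(($(wc -c < "$img")))
    # Compare only as many bytes as the image holds; the partition is
    # usually larger than the image.
    head -c "$size" "$dump" | cmp -l "$img" - 2>/dev/null |
        awk -v peb="$peb" -v skip="$skip" '
            { off = $1 - 1 }    # cmp -l byte offsets are 1-based
            off % peb >= skip { bad[int(off / peb)] = 1 }
            END { for (p in bad) print p }' |
        sort -n
}
```

With the geometry from the log (128 KiB PEBs, VID header at offset 2048)
the call would be "list_bad_pebs ubi.img dump.bin 131072 2048". An empty
result means no differences were found outside the EC headers. The
comparison is only meaningful before attaching, because UBI may change
PEBs afterwards, for example when it re-sizes a volume.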

Thanks,
//richard


* Re: Data integrity check after UBIFORMAT? Bad image sequence number error.
  2015-06-09  8:20 ` Richard Weinberger
@ 2015-06-09  8:52   ` t kevin
  2015-06-09  9:05     ` Richard Weinberger
  0 siblings, 1 reply; 4+ messages in thread
From: t kevin @ 2015-06-09  8:52 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd@lists.infradead.org

Hi Richard

Thanks for the reply. See inline comments below.

2015-06-09 16:20 GMT+08:00 Richard Weinberger <richard.weinberger@gmail.com>:
> On Tue, Jun 9, 2015 at 10:02 AM, t kevin <kevint324@gmail.com> wrote:
>> Hi,
>>
>> We are using kernel 2.6.36 and mtd-util-1.5.1 on our box.
>> During system upgrade, very very occasionally ( 1 in 100, maybe? ) I
>> get this error at ubiattach after ubiformat.
>>
>> [ 1632.520000] UBI: attaching mtd8 to ubi0
>> [ 1632.520000] UBI: physical eraseblock size: 131072 bytes (128 KiB)
>> [ 1632.530000] UBI: logical eraseblock size: 126976 bytes
>> [ 1632.530000] UBI: smallest flash I/O unit: 2048
>> [ 1632.540000] UBI: sub-page size: 512
>> [ 1632.540000] UBI: VID header offset: 2048 (aligned 2048)
>> [ 1632.550000] UBI: data offset: 4096
>> [ 1633.190000] UBI error: process_eb: bad image sequence number
>> 559476870 in PEB 635, expected 139654706
>>
>> I understand ubiformat generate a random sequence number and write the
>> sequence number to all PEB. So it seems an expected sequence number
>> somehow is not written into nand flash correctly.
>
> Are you sure about that?
> Can it be that 559476870 is the seq number of the old image and the
> new one is too small?
> This is one of the main reasons why we have that number, such that
> UBI can detect a partial written image.
>
I don't really know what "559476870" is. We don't track image sequence
numbers :(

>> So I changed my upgrade sequence like below
>>
>> ubiformat ubi.img /dev/mtdx
>> ubiattach /dev/mtdx
>>
>> if [ "$?" != "0" ]
>>     #do ubiformat again
>>     ubiformat ubi.img /dev/mtdx
>
> You format it while it is attached?
>

I only re-format when ubiattach fails, which tells me that something
went wrong during ubiformat. So at that point the device is not
attached.
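
The retry described above can be made a little more defensive by also
checking the exit status of ubiformat itself and by bounding the number
of attempts. A minimal sketch, assuming ubiformat's -f (image file) and
-y (answer "yes" to prompts) options and ubiattach's -p (MTD device
path) option from mtd-utils. The function name flash_and_attach and the
default of two attempts are made up for illustration:

```shell
# flash_and_attach MTD_DEV IMAGE [TRIES]
# Format MTD_DEV with IMAGE and attach it, retrying the whole sequence
# up to TRIES times (default 2). A failed attach leaves the device
# detached, so it is safe to format it again.
flash_and_attach() {
    mtd=$1
    img=$2
    tries=${3:-2}
    n=1
    while [ "$n" -le "$tries" ]; do
        if ubiformat "$mtd" -y -f "$img" && ubiattach -p "$mtd"; then
            return 0
        fi
        echo "flash_and_attach: attempt $n failed" >&2
        n=$((n + 1))
    done
    return 1
}
```

A call such as "flash_and_attach /dev/mtdx ubi.img 3" returns non-zero
only if every attempt failed, which is the point at which the upgrade
should be reported as broken rather than retried silently.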

>> My question are,
>> 1. What could possibly be wrong that caused the ubiformat fail?
>
> It can be a faulty MTD driver, a usage error, everything.
>
>> 2. Is there a way to verify the data integrity after a UBIFORMAT
>> process? Something like "mtd verify" function.
>
> I fear the answer is "no".
>

As I mentioned, the error is very rare, but it has happened multiple
times. So we are considering a data integrity check.

> Thanks,
> //richard


* Re: Data integrity check after UBIFORMAT? Bad image sequence number error.
  2015-06-09  8:52   ` t kevin
@ 2015-06-09  9:05     ` Richard Weinberger
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Weinberger @ 2015-06-09  9:05 UTC (permalink / raw)
  To: t kevin; +Cc: linux-mtd@lists.infradead.org



On 09.06.2015 at 10:52, t kevin wrote:
> Hi Richard
> 
> Thanks for the reply. See inline comments below.
> 
> 2015-06-09 16:20 GMT+08:00 Richard Weinberger <richard.weinberger@gmail.com>:
>> On Tue, Jun 9, 2015 at 10:02 AM, t kevin <kevint324@gmail.com> wrote:
>>> Hi,
>>>
>>> We are using kernel 2.6.36 and mtd-util-1.5.1 on our box.
>>> During system upgrade, very very occasionally ( 1 in 100, maybe? ) I
>>> get this error at ubiattach after ubiformat.
>>>
>>> [ 1632.520000] UBI: attaching mtd8 to ubi0
>>> [ 1632.520000] UBI: physical eraseblock size: 131072 bytes (128 KiB)
>>> [ 1632.530000] UBI: logical eraseblock size: 126976 bytes
>>> [ 1632.530000] UBI: smallest flash I/O unit: 2048
>>> [ 1632.540000] UBI: sub-page size: 512
>>> [ 1632.540000] UBI: VID header offset: 2048 (aligned 2048)
>>> [ 1632.550000] UBI: data offset: 4096
>>> [ 1633.190000] UBI error: process_eb: bad image sequence number
>>> 559476870 in PEB 635, expected 139654706
>>>
>>> I understand ubiformat generate a random sequence number and write the
>>> sequence number to all PEB. So it seems an expected sequence number
>>> somehow is not written into nand flash correctly.
>>
>> Are you sure about that?
>> Can it be that 559476870 is the seq number of the old image and the
>> new one is too small?
>> This is one of the main reasons why we have that number, such that
>> UBI can detect a partial written image.
>>
> I don't really know what "559476870" is. We don't track image sequence
> number : (

Please start tracking them. UBI prints the number while attaching.
If the old number remains after an update, your update was most likely
not complete, and you can start investigating.
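
One way to track the numbers is to pull them out of the kernel log
after every attach and append them to an upgrade log. A minimal sketch
that matches the message format shown in this thread (other kernel
versions may word these lines differently, so the patterns are an
assumption). The function names are made up for illustration:

```shell
# attached_image_seq: read kernel log text on stdin and print the image
# sequence number of the most recent successful attach.
attached_image_seq() {
    sed -n 's/.*UBI: image sequence number: *\([0-9][0-9]*\).*/\1/p' |
        tail -n 1
}

# bad_image_seq: read kernel log text on stdin and print
# "PEB FOUND EXPECTED" for every "bad image sequence number" error.
bad_image_seq() {
    sed -n 's/.*bad image sequence number \([0-9][0-9]*\) in PEB \([0-9][0-9]*\), expected \([0-9][0-9]*\).*/\2 \1 \3/p'
}
```

For example, "dmesg | attached_image_seq" after each upgrade gives the
number to record. If a later attach fails, "dmesg | bad_image_seq" shows
whether the unexpected number matches one recorded for an earlier image.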

>>> So I changed my upgrade sequence like below
>>>
>>> ubiformat ubi.img /dev/mtdx
>>> ubiattach /dev/mtdx
>>>
>>> if [ "$?" != "0" ]
>>>     #do ubiformat again
>>>     ubiformat ubi.img /dev/mtdx
>>
>> You format it while it is attached?
>>
> 
> I'll do re-format only when ubiattach returns fail and then I know
> there is something wrong during ubiformat. So by that time it's not
> attached.

Right you are. :)

>>> My question are,
>>> 1. What could possibly be wrong that caused the ubiformat fail?
>>
>> It can be a faulty MTD driver, a usage error, everything.
>>
>>> 2. Is there a way to verify the data integrity after a UBIFORMAT
>>> process? Something like "mtd verify" function.
>>
>> I fear the answer is "no".
>>
> 
> As I mentioned, the error is very rare, but it did happen multiple
> times. So we are considering data integrity check.

I suspect that sometimes the MTD partition is not written completely.

Just in case, does your MTD driver pass all mtd and UBI tests?
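
The MTD tests are kernel modules (built with CONFIG_MTD_TESTS) that take
the MTD device number in a dev= parameter and report their results in
the kernel log. A minimal sketch of running a few of them in turn,
assuming the modules are available on the target and that the partition
holds nothing worth keeping, since most of these tests erase and rewrite
it. The function name run_mtd_tests is made up for illustration:

```shell
# run_mtd_tests MTD_NUMBER
# Load each test module against the given MTD device, then unload it.
# WARNING: apart from mtd_readtest, these tests erase and rewrite the
# partition.
run_mtd_tests() {
    mtdnum=$1
    for t in mtd_readtest mtd_pagetest mtd_subpagetest \
             mtd_oobtest mtd_stresstest; do
        echo "running $t on mtd$mtdnum"
        if ! modprobe "$t" dev="$mtdnum"; then
            echo "$t: modprobe failed, check dmesg" >&2
            return 1
        fi
        modprobe -r "$t"
    done
}
```

For the partition in this thread the call would be "run_mtd_tests 8".
A clean modprobe exit is not proof of success on its own, so the
per-test results in dmesg still need to be read afterwards.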

Thanks,
//richard

