public inbox for linux-mtd@lists.infradead.org
* Corrupted UBIFS, bad CRC
@ 2012-01-12 13:47 Karsten Jeppesen
  2012-01-15 12:24 ` Artem Bityutskiy
  0 siblings, 1 reply; 12+ messages in thread
From: Karsten Jeppesen @ 2012-01-12 13:47 UTC
  To: ubifs

Hi Guys,

Artem was the last one to respond back in November and I have been working hard on this ever since, but porting kernels takes a bit.
I am sorry not to have included the content of the earlier emails but I am attempting to answer all the outstanding questions.

Yes, Artem, I downloaded and adapted the backported tree (kernel 2.6.32 which was the closest to our 2.6.32.8) and it still showed the error.

I am painfully aware that you like to look at problems close to the current state, and I am *really* trying to accommodate that.
I have ported kernel 3.2.0 (rudimentary though) to test for this problem, and it still exists.
I am provoking the error by having 16 machines power-cycle (20 secs power-on, 3 secs power-off); within 24 hrs, 2 machines will fail.
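
To put the reproduction rate in numbers, here is a minimal Python sketch. The figures (16 machines, 20 s on, 3 s off, 24 hours, 2 failures) come from the message; the assumption that every cycle lasts exactly on + off seconds, and that failed machines keep counting as cycling, is a simplification made here for illustration.

```python
# Rough failure-rate estimate for the power-cycle test.
# Assumption: each cycle takes exactly ON_SECONDS + OFF_SECONDS.

MACHINES = 16
ON_SECONDS = 20
OFF_SECONDS = 3
TEST_HOURS = 24
FAILED_MACHINES = 2


def cycles_per_machine(hours: int, on_s: int, off_s: int) -> int:
    """Whole power cycles one machine completes in the test window."""
    return (hours * 3600) // (on_s + off_s)


def power_cuts_per_failure(machines: int, hours: int, on_s: int,
                           off_s: int, failures: int) -> float:
    """Average number of power cuts across all machines per observed failure."""
    total = machines * cycles_per_machine(hours, on_s, off_s)
    return total / failures


per_machine = cycles_per_machine(TEST_HOURS, ON_SECONDS, OFF_SECONDS)
per_failure = power_cuts_per_failure(
    MACHINES, TEST_HOURS, ON_SECONDS, OFF_SECONDS, FAILED_MACHINES)
print(f"{per_machine} cycles per machine, "
      f"about {per_failure:.0f} power cuts per failure")
```

Under these assumptions that is 3756 cycles per machine per day, or roughly one failure per 30000 power cuts, which suggests a narrow timing window rather than a systematic fault.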

I have run the speed test (see below if interested) and I will be running the stress test later today or over the weekend.
As I stated: this test was done on a stock 3.2.0 kernel patched for our ARM9263.

You stated last time that you were able to reclaim the blocks using a PC. Could this be an architectural problem (PC vs. ARM)?


(Structure needs cleaning - is there an fsck for that?)

Sincerely,
Dr. Karsten Jeppesen


Last time I submitted way too much debug output. This time I hope it is correct:
--- MOUNTING DEBUG OUTPUT (mount -t ubifs ubi0:rootfs /skov/mnt/rootfs)

# mount -t ubifs ubi0:rootfs /skov/mnt/rootfs
UBIFS: recovery needed
UBIFS error (pid 18479): ubifs_recover_leb: corrupt empty space LEB 4:0, corruption starts at 144
UBIFS error (pid 18479): ubifs_scanned_corruption: corruption at LEB 4:144
UBIFS error (pid 18479): ubifs_scanned_corruption: first 8192 bytes from LEB 4:144
00000000: 00000000 00000000 00000000 00000000 ffffffff ffffffff ffffffff ffffffff  ................................
00000020: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
00000040: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
... more lines with just fffffff
00001fc0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
00001fe0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
UBIFS error (pid 18479): ubifs_recover_leb: LEB 4 scanning failed
mount: mounting ubi0:rootfs on /skov/mnt/rootfs failed: Structure needs cleaning
#
---
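
The dump above shows 16 zero bytes at LEB offset 144 followed by nothing but 0xff, in a region the error message calls "empty space". The following Python sketch illustrates the kind of check involved: empty flash is expected to read back as all 0xff, so any other byte counts as corruption. It is not the kernel's code. The function name `non_ff_runs` and the synthetic buffer are hypothetical, and the filler used for the first 144 bytes is invented because the log does not show that part of the LEB.

```python
def non_ff_runs(buf: bytes, start: int = 0):
    """Return (offset, length) for each run of bytes != 0xFF at or after start."""
    runs = []
    run_start = None
    for off in range(start, len(buf)):
        if buf[off] != 0xFF:
            if run_start is None:
                run_start = off          # a new non-0xFF run begins here
        elif run_start is not None:
            runs.append((run_start, off - run_start))
            run_start = None
    if run_start is not None:            # run extends to the end of the buffer
        runs.append((run_start, len(buf) - run_start))
    return runs


# Synthetic stand-in for LEB 4: 144 bytes of (hypothetical) valid data,
# then the 16 zero bytes seen in the dump, then erased flash.
DATA_END = 144
leb = bytes([0x5A]) * DATA_END + bytes(16) + bytes([0xFF]) * (8192 - 16)

print(non_ff_runs(leb, start=DATA_END))
```

Run against a real image, the same scan would need the actual LEB contents and the offset where valid data ends, neither of which can be derived from the log alone.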

--- SPEEDTEST OUTPUT

# modprobe mtd_speedtest dev=4
 
=================================================
mtd_speedtest: MTD device: 4
mtd_speedtest: not NAND flash, assume page size is 512 bytes.
mtd_speedtest: MTD device size 63700992, eraseblock size 131072, page size 512, count of eraseblocks 486, pages per eraseblock 256, OOB size 0
mtd_speedtest: testing eraseblock write speed
mtd_speedtest: eraseblock write speed is 148 KiB/s
mtd_speedtest: testing eraseblock read speed
mtd_speedtest: eraseblock read speed is 1531 KiB/s
mtd_speedtest: testing page write speed
mtd_speedtest: page write speed is 149 KiB/s
mtd_speedtest: testing page read speed
mtd_speedtest: page read speed is 1475 KiB/s
mtd_speedtest: testing 2 page write speed
mtd_speedtest: 2 page write speed is 147 KiB/s
mtd_speedtest: testing 2 page read speed
mtd_speedtest: 2 page read speed is 1505 KiB/s
mtd_speedtest: Testing erase speed
mtd_speedtest: erase speed is 334 KiB/s
mtd_speedtest: Testing 2x multi-block erase speed
mtd_speedtest: 2x multi-block erase speed is 299 KiB/s
mtd_speedtest: Testing 4x multi-block erase speed
mtd_speedtest: 4x multi-block erase speed is 295 KiB/s
mtd_speedtest: Testing 8x multi-block erase speed
mtd_speedtest: 8x multi-block erase speed is 293 KiB/s
mtd_speedtest: Testing 16x multi-block erase speed
mtd_speedtest: 16x multi-block erase speed is 291 KiB/s
mtd_speedtest: Testing 32x multi-block erase speed
mtd_speedtest: 32x multi-block erase speed is 289 KiB/s
mtd_speedtest: Testing 64x multi-block erase speed
mtd_speedtest: 64x multi-block erase speed is 286 KiB/s
mtd_speedtest: finished
=================================================
#
---
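
The throughput figures are easier to relate to the power-cycle test when converted into time per eraseblock. The Python sketch below does that conversion and cross-checks the reported geometry. It assumes the averages reported by mtd_speedtest apply uniformly to every eraseblock, which is a simplification.

```python
# Values copied from the mtd_speedtest output.
DEVICE_SIZE = 63700992
ERASEBLOCK_SIZE = 131072
PAGE_SIZE = 512
ERASEBLOCK_KIB = ERASEBLOCK_SIZE // 1024   # 128 KiB


def seconds_per_eraseblock(kib_per_s: float) -> float:
    """Time to process one full eraseblock at the given throughput."""
    return ERASEBLOCK_KIB / kib_per_s


# Geometry cross-check against the reported counts (486 and 256).
eraseblocks = DEVICE_SIZE // ERASEBLOCK_SIZE
pages_per_eraseblock = ERASEBLOCK_SIZE // PAGE_SIZE

for label, speed in (("write", 148), ("read", 1531), ("erase", 334)):
    print(f"{label}: {seconds_per_eraseblock(speed):.2f} s per eraseblock")
```

At 148 KiB/s a full eraseblock write takes a little under a second, and an erase at 334 KiB/s about 0.4 s, so a 20-second power-on window leaves room for many such operations to be cut short.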


* Corrupted UBIFS, bad CRC
@ 2012-01-12 14:31 Karsten Jeppesen
  0 siblings, 0 replies; 12+ messages in thread
From: Karsten Jeppesen @ 2012-01-12 14:31 UTC
  To: ubifs

Hi again,

Maybe I should include all info you require, but sometimes the brain is slower than the right pinkie....

...
* run the MTD tests to validate your flash
Done - nothing to report

* enable the UBIFS extra self checks and try to reproduce the problem.
Not possible ??? 

# echo 3 > /sys/module/ubifs/
uevent   version
... no parameters folder

* make sure you use up-to-date UBIFS
Sorry - best I can do right now is the 3.2.0 kernel. Is that ok?

* make sure you have compiled the kernel symbols in 

# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_HOTPLUG=y


* mark the Enable debugging support 

CONFIG_UBIFS_FS=y
# CONFIG_UBIFS_FS_XATTR is not set
# CONFIG_UBIFS_FS_ADVANCED_COMPR is not set
CONFIG_UBIFS_FS_LZO=y
CONFIG_UBIFS_FS_ZLIB=y
CONFIG_UBIFS_FS_DEBUG=y


* include all the messages UBIFS prints
# dmesg -n8
# ubiattach /dev/ubi_ctrl -m 4
UBI: attaching mtd4 to ubi0
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    130944 bytes
UBI: smallest flash I/O unit:    1
UBI: VID header offset:          64 (aligned 64)
UBI: data offset:                128
UBI: max. sequence number:       17653
UBI: attached mtd4 to ubi0
UBI: MTD device name:            "User"
UBI: MTD device size:            28 MiB
UBI: number of good PEBs:        230
UBI: number of bad PEBs:         0
UBI: number of corrupted PEBs:   0
UBI: max. allowed volumes:       128
UBI: wear-leveling threshold:    4096
UBI: number of internal volumes: 1
UBI: number of user volumes:     1
UBI: available PEBs:             0
UBI: total number of reserved PEBs: 230
UBI: number of PEBs reserved for bad PEB handling: 0
UBI: max/mean erase counter: 31/5
UBI: image sequence number:  1704669600
UBI: background thread "ubi_bgt0d" started, PID 6374
UBI device number 0, total 230 LEBs (30117120 bytes, 28.7 MiB), available 0 LEBs (0 bytes), LEB size 130944 bytes (127.9 KiB)
# mkdir -p /skov/mnt/rootfs
# mount -t ubifs ubi0:rootfs /skov/mnt/rootfs
UBIFS: recovery needed
UBIFS error (pid 6700): ubifs_recover_leb: corrupt empty space LEB 4:0, corruption starts at 144
UBIFS error (pid 6700): ubifs_scanned_corruption: corruption at LEB 4:144
UBIFS error (pid 6700): ubifs_scanned_corruption: first 8192 bytes from LEB 4:144
00000000: 00000000 00000000 00000000 00000000 ffffffff ffffffff ffffffff ffffffff  ................................
00000020: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
...many lines of ffffff
00001fc0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
00001fe0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff  ................................
UBIFS error (pid 6700): ubifs_recover_leb: LEB 4 scanning failed
mount: mounting ubi0:rootfs on /skov/mnt/rootfs failed: Structure needs cleaning
#
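
The sizes in the UBI attach output fit together arithmetically, and the same arithmetic relates the "LEB 4:144" offset to a position inside a physical eraseblock. A minimal Python sketch follows, using only numbers from the log. The helper name `leb_to_peb_offset` is hypothetical, and which PEB currently holds LEB 4 is decided by UBI's mapping, which the log does not show.

```python
PEB_SIZE = 131072        # "physical eraseblock size"
DATA_OFFSET = 128        # "data offset"
GOOD_PEBS = 230          # "number of good PEBs"

# Each LEB is a PEB minus the space UBI keeps for its own headers.
LEB_SIZE = PEB_SIZE - DATA_OFFSET


def leb_to_peb_offset(leb_offset: int) -> int:
    """Offset inside the backing PEB for a given offset inside a LEB."""
    if not 0 <= leb_offset < LEB_SIZE:
        raise ValueError("offset outside the LEB")
    return DATA_OFFSET + leb_offset


print("LEB size:", LEB_SIZE)
print("total LEB bytes:", GOOD_PEBS * LEB_SIZE)
print("LEB offset 144 is PEB offset", leb_to_peb_offset(144))
```

The results match the log: 130944 bytes per LEB and 30117120 bytes in total. The reported corruption starts 272 bytes into whichever PEB backs LEB 4.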


* explicitly tell about whether you did any checking
The UBI config looks like this:
CONFIG_MTD_UBI=y
CONFIG_MTD_UBI_WL_THRESHOLD=4096
CONFIG_MTD_UBI_BEB_RESERVE=1
# CONFIG_MTD_UBI_GLUEBI is not set
# CONFIG_MTD_UBI_DEBUG is not set


* describe your flash device
# mtdinfo -a
Count of MTD devices:           5
Present MTD devices:            mtd0, mtd1, mtd2, mtd3, mtd4
Sysfs interface supported:      yes

mtd0
Name:                           u-boot
Type:                           nor
Eraseblock size:                131072 bytes, 128.0 KiB
Amount of eraseblocks:          3 (393216 bytes, 384.0 KiB)
Minimum input/output unit size: 1 byte
Sub-page size:                  1 byte
Character device major/minor:   90:0
Bad blocks are allowed:         false
Device is writable:             true

mtd1
Name:                           Env
Type:                           nor
Eraseblock size:                131072 bytes, 128.0 KiB
Amount of eraseblocks:          1 (131072 bytes, 128.0 KiB)
Minimum input/output unit size: 1 byte
Sub-page size:                  1 byte
Character device major/minor:   90:2
Bad blocks are allowed:         false
Device is writable:             true

mtd2
Name:                           Linux
Type:                           nor
Eraseblock size:                131072 bytes, 128.0 KiB
Amount of eraseblocks:          16 (2097152 bytes, 2.0 MiB)
Minimum input/output unit size: 1 byte
Sub-page size:                  1 byte
Character device major/minor:   90:4
Bad blocks are allowed:         false
Device is writable:             true

mtd3
Name:                           Bmp_Image
Type:                           nor
Eraseblock size:                131072 bytes, 128.0 KiB
Amount of eraseblocks:          6 (786432 bytes, 768.0 KiB)
Minimum input/output unit size: 1 byte
Sub-page size:                  1 byte
Character device major/minor:   90:6
Bad blocks are allowed:         false
Device is writable:             true

mtd4
Name:                           User
Type:                           nor
Eraseblock size:                131072 bytes, 128.0 KiB
Amount of eraseblocks:          230 (30146560 bytes, 28.8 MiB)
Minimum input/output unit size: 1 byte
Sub-page size:                  1 byte
Character device major/minor:   90:8
Bad blocks are allowed:         false
Device is writable:             true
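
As a quick consistency check on the partition table, the Python sketch below sums the eraseblock counts reported by mtdinfo. The start offsets it prints rest on an assumption made here: that the five partitions are laid out back to back in the listed order starting at offset 0. The mtdinfo output does not state that.

```python
ERASEBLOCK_SIZE = 131072

# (name, eraseblock count) in the order mtdinfo lists them.
PARTITIONS = [
    ("u-boot", 3),
    ("Env", 1),
    ("Linux", 16),
    ("Bmp_Image", 6),
    ("User", 230),
]


def layout(partitions, eraseblock_size):
    """Return (name, start, size) tuples, assuming contiguous partitions."""
    result = []
    start = 0
    for name, blocks in partitions:
        size = blocks * eraseblock_size
        result.append((name, start, size))
        start += size
    return result


table = layout(PARTITIONS, ERASEBLOCK_SIZE)
total_blocks = sum(blocks for _, blocks in PARTITIONS)
total_bytes = sum(size for _, _, size in table)

for name, start, size in table:
    print(f"{name:<10} start {start:#09x} size {size:>9} bytes")
print(f"total: {total_blocks} eraseblocks, {total_bytes // (1024 * 1024)} MiB")
```

The counts add up to 256 eraseblocks, or 32 MiB, so the five partitions together cover exactly that much flash.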


* describe how the problem can be reproduced
I have 16 machines running a power cycle of 20 secs on, 3 secs off. In 24 hrs roughly 2 will fail this way.


Now I think I have added all I could think of.

Sincerely,
Dr. Karsten Jeppesen

* Corrupted UBIFS, bad CRC
@ 2011-11-23 12:49 Karsten Jeppesen
  2011-11-29 21:58 ` Artem Bityutskiy
  0 siblings, 1 reply; 12+ messages in thread
From: Karsten Jeppesen @ 2011-11-23 12:49 UTC
  To: linux-mtd@lists.infradead.org

Hi Artem,

First: I love UBIFS. It performs really, really well.

Amongst many well-functioning targets (ARM 9263 based), this sucker had the nerve to act up:



Uncompressing Linux........... done, booting the kernel.
[    1.570000] UBIFS error (pid 1): ubifs_check_node: bad CRC: calculated 0x7d62d42c, read 0x1173c109
[    1.580000] UBIFS error (pid 1): ubifs_check_node: bad node at LEB 84:50696
[    1.580000] UBIFS error (pid 1): ubifs_read_node: expected node type 9
[    1.590000] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)



I am running kernel 2.6.32.8 with most patches applied, in particular the recovery.c patch and the MTD 8-byte write buffer patch. The target that showed this error does not have these patches applied.

Even so, I copied the flash content to a target with these patches and tried again, to see whether the patches would allow the kernel to rectify the problem. No cigar.
Of course I ran with debug enabled, so here is the output (but even better, I hope, here is the flash image for download: http://download.gnist.skov.com/corrupt_ubifs.img )

----
[    1.570000] atmel_usart.3: ttyS3 at MMIO 0xfff94000 (irq = 9) is a ATMEL_SERIAL
[    1.630000] brd: module loaded
[    1.660000] loop: module loaded
[    1.670000] physmap platform flash device: 04000000 at 10000000
[    1.680000] Number of erase regions: 1
[    1.680000] Warning:  Overriding MaxBufWriteSize from 2^6 to 2^3
[    1.690000] Primary Vendor Command Set: 0002 (AMD/Fujitsu Standard)
[    1.690000] Primary Algorithm Table at 0040
[    1.700000] Alternative Vendor Command Set: 0000 (None)
[    1.700000] No Alternate Algorithm Table
[    1.710000] Vcc Minimum:  2.7 V
[    1.710000] Vcc Maximum:  3.6 V
[    1.710000] No Vpp line
[    1.720000] Typical byte/word write timeout: 64 µs
[    1.720000] Maximum byte/word write timeout: 512 µs
[    1.730000] Typical full buffer write timeout: 64 µs
[    1.730000] Maximum full buffer write timeout: 2048 µs
[    1.740000] Typical block erase timeout: 512 ms
[    1.740000] Maximum block erase timeout: 4096 ms
[    1.750000] Typical chip erase timeout: 131072 ms
[    1.750000] Maximum chip erase timeout: 524288 ms
[    1.760000] Device size: 0x2000000 bytes (32 MiB)
[    1.760000] Flash Device Interface description: 0x0002
[    1.770000]   - supports x8 and x16 via BYTE# with asynchronous interface
[    1.770000] Max. bytes in buffer write: 0x8
[    1.780000] Number of Erase Block Regions: 1
[    1.780000]   Erase Region #0: BlockSize 0x20000 bytes, 256 blocks
[    1.790000] physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
[    1.790000]  Amd/Fujitsu Extended Query Table at 0x0040
[    1.800000] physmap-flash.0: CFI does not contain boot bank location. Assuming top.
[    1.810000] number of CFI chips: 1
[    1.810000] cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
[    1.820000] 5 cmdlinepart partitions found on MTD device physmap-flash.0

----
# skovsetup mountflash
[  165.940000] UBI: attaching mtd4 to ubi0
[  165.950000] UBI: physical eraseblock size:   131072 bytes (128 KiB)
[  165.950000] UBI: logical eraseblock size:    130944 bytes
[  165.960000] UBI: smallest flash I/O unit:    1
[  165.960000] UBI: VID header offset:          64 (aligned 64)
[  165.970000] UBI: data offset:                128
[  166.040000] UBI: attached mtd4 to ubi0
[  166.060000] UBI: MTD device name:            "User"
[  166.060000] UBI: MTD device size:            28 MiB
[  166.120000] UBI: number of good PEBs:        230
[  166.120000] UBI: number of bad PEBs:         0
[  166.120000] UBI: max. allowed volumes:       128
[  166.150000] UBI: wear-leveling threshold:    4096
[  166.150000] UBI: number of internal volumes: 1
[  166.150000] UBI: number of user volumes:     1
[  166.170000] UBI: available PEBs:             0
[  166.170000] UBI: total number of reserved PEBs: 230
[  166.190000] UBI: number of PEBs reserved for bad PEB handling: 0
[  166.230000] UBI: max/mean erase counter: 2/0
[  166.230000] UBI: image sequence number: 1748877991
[  166.260000] UBI: background thread "ubi_bgt0d" started, PID 2170
UBI device number 0, total 230 LEBs (30117120 bytes, 28.7 MiB), available 0 LEBs (0 bytes), LEB size 130944 bytes (127.9 KiB)
[  166.520000] UBIFS: recovery needed
[  166.680000] UBIFS error (pid 2177): ubifs_check_node: bad CRC: calculated 0x7d62d42c, read 0x1173c109
[  166.690000] UBIFS error (pid 2177): ubifs_check_node: bad node at LEB 84:50696
[  166.700000] UBIFS error (pid 2177): ubifs_read_node: expected node type 9
mount: mounting ubi0:rootfs on /skov/mnt/rootfs failed: Structure needs cleaning
#
---
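
For anyone inspecting the flash image offline, the Python sketch below shows the general shape of the check behind the "bad CRC" message. It assumes the UBIFS common node header is 24 little-endian bytes (magic, CRC, sequence number, length, node type, group type, 2 padding bytes), that the CRC is a CRC-32 seeded with 0xFFFFFFFF over everything after the first 8 bytes with no final inversion, and that node type 9 is an index node. Those details should be verified against ubifs-media.h for the kernel in use. The names `check_node` and `make_node` are hypothetical helpers, and the test node is synthetic. Finding the node at LEB 84:50696 in the image would additionally require UBI's LEB-to-PEB mapping, which is not covered here.

```python
import struct
import zlib

UBIFS_NODE_MAGIC = 0x06101831
CH_FORMAT = "<IIQIBB2x"   # magic, crc, sqnum, len, node_type, group_type, pad
CH_SIZE = struct.calcsize(CH_FORMAT)   # 24 bytes
UBIFS_IDX_NODE = 9        # assumed meaning of "node type 9"


def ubifs_crc(data: bytes) -> int:
    """CRC-32 seeded with 0xFFFFFFFF and without zlib's final inversion."""
    return zlib.crc32(data) ^ 0xFFFFFFFF


def check_node(buf: bytes, expected_type=None) -> str:
    """Validate the common header of one node; return 'ok' or a reason."""
    if len(buf) < CH_SIZE:
        return "short buffer"
    magic, crc, _sqnum, length, node_type, _group = struct.unpack_from(CH_FORMAT, buf)
    if magic != UBIFS_NODE_MAGIC:
        return "bad magic"
    if length < CH_SIZE or length > len(buf):
        return "bad length"
    calculated = ubifs_crc(buf[8:length])   # skip the magic and CRC fields
    if calculated != crc:
        return f"bad CRC: calculated {calculated:#010x}, read {crc:#010x}"
    if expected_type is not None and node_type != expected_type:
        return f"expected node type {expected_type}, got {node_type}"
    return "ok"


def make_node(node_type: int, payload: bytes, sqnum: int = 1) -> bytes:
    """Build a synthetic node with a valid header, for testing only."""
    length = CH_SIZE + len(payload)
    body = struct.pack("<QIBB2x", sqnum, length, node_type, 0) + payload
    return struct.pack("<II", UBIFS_NODE_MAGIC, ubifs_crc(body)) + body


good = make_node(UBIFS_IDX_NODE, b"x" * 40)
damaged = bytearray(good)
damaged[30] ^= 0x01                         # flip one payload bit
print(check_node(good, UBIFS_IDX_NODE))
print(check_node(bytes(damaged), UBIFS_IDX_NODE))
```

A single flipped bit in the payload is enough to make the check fail, so a CRC mismatch on its own does not say how much of the node is damaged.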

So the question is whether UBI can be made to recover from this situation.


Sincerely,

Dr. Karsten Jeppesen,
SKOV AS


end of thread, other threads:[~2012-01-18 14:41 UTC | newest]

Thread overview: 12+ messages
2012-01-12 13:47 Corrupted UBIFS, bad CRC Karsten Jeppesen
2012-01-15 12:24 ` Artem Bityutskiy
2012-01-16  8:18   ` Karsten Jeppesen
2012-01-16 10:24     ` Artem Bityutskiy
2012-01-16 12:40       ` Karsten Jeppesen
2012-01-16 12:46         ` Artem Bityutskiy
2012-01-16 12:50         ` Artem Bityutskiy
2012-01-17 12:23           ` Karsten Jeppesen
2012-01-18 14:43             ` Artem Bityutskiy
  -- strict thread matches above, loose matches on Subject: below --
2012-01-12 14:31 Karsten Jeppesen
2011-11-23 12:49 Karsten Jeppesen
2011-11-29 21:58 ` Artem Bityutskiy
