* corruption of JFFS2 filesystem, csize is set to 0 after moving a block
From: Hans-Christian Egtvedt @ 2007-04-26 14:54 UTC
To: linux-mtd

Hello,

When I stress the JFFS2 filesystem by copying files around on the root
(/), I end up with a corrupted filesystem after a reboot. The system just
hangs after the kernel is done booting:
Freeing init memory: 56K (90000000 - 9000e000)

Where I should get:
init started: BusyBox v1.4.2 (2007-04-17 15:34:55 CEST) multi-call
binary
etc...

I copy and remove files until I reach "cp: write error: No space left on
device".

I extracted the filesystem from my flash device (Atmel AT49BV642D) and
did a dump. Here I can see that some of the nodes have a csize set to 0
for vital files such as libdl-0.9.28.so.

Any pointers to where I should start debugging? What can go wrong?

I can provide jffs2dump output, logs or images if needed.

-- 
Best regards
Hans-Christian Egtvedt

* Re: corruption of JFFS2 filesystem, csize is set to 0 after moving a block
From: David Woodhouse @ 2007-04-26 15:43 UTC
To: Hans-Christian Egtvedt; +Cc: linux-mtd

On Thu, 2007-04-26 at 16:54 +0200, Hans-Christian Egtvedt wrote:
> Hello,
>
> When I stress the JFFS2 filesystem by copying files around on the root
> (/) I end up with a corrupted filesystem after a reboot. The system just
> hangs after the kernel is done booting:
> Freeing init memory: 56K (90000000 - 9000e000)
>
> Where I should get:
> init started: BusyBox v1.4.2 (2007-04-17 15:34:55 CEST) multi-call
> binary
> etc...
>
> I copy and remove files until I reach "cp: write error: No space left on
> device"
>
> I extracted the filesystem from my flash device (Atmel AT49BV642D) and
> did a dump. Here I can see that some of the nodes have a csize set to 0
> for vital files such as libdl-0.9.28.so.

There's not necessarily anything wrong with that.

> Any pointers to where I should start debugging, what can go wrong?
>
> I can provide jffs2dump's, logs or images if needed.

Take a copy of the image, then work out where the kernel is stuck. Use
SysRq-P and/or SysRq-T, and if it's in JFFS2 try running with
CONFIG_JFFS2_FS_DEBUG=1 (and with 'verbose' on the command line), and
capture all the output on a serial console.

-- 
dwmw2

* Re: corruption of JFFS2 filesystem, csize is set to 0 after moving a block
From: Hans-Christian Egtvedt @ 2007-04-27 9:13 UTC
To: David Woodhouse; +Cc: linux-mtd

On Thu, 2007-04-26 at 16:43 +0100, David Woodhouse wrote:
> On Thu, 2007-04-26 at 16:54 +0200, Hans-Christian Egtvedt wrote:
> > Hello,
> >
> > When I stress the JFFS2 filesystem by copying files around on the root
> > (/) I end up with a corrupted filesystem after a reboot. The system just
> > hangs after the kernel is done booting:
> > Freeing init memory: 56K (90000000 - 9000e000)
> >
> > Where I should get:
> > init started: BusyBox v1.4.2 (2007-04-17 15:34:55 CEST) multi-call
> > binary
> > etc...
> >
> > I copy and remove files until I reach "cp: write error: No space left on
> > device"
> >
> > I extracted the filesystem from my flash device (Atmel AT49BV642D) and
> > did a dump. Here I can see that some of the nodes have a csize set to 0
> > for vital files such as libdl-0.9.28.so.
>
> There's not necessarily anything wrong with that.

Some filesystem dump from before:
Dirent node at 0x0013c7e0, totlen 0x0000003b, #pino 7, version 148, #ino 150, nsize 19, name ld-uClibc-0.9.28.so
Inode node at 0x0013c81c, totlen 0x00000a14, #ino 150, version 1, isize 13108, csize 2512, dsize 4092, offset 0
Inode node at 0x0013d230, totlen 0x00000c57, #ino 150, version 2, isize 13108, csize 3091, dsize 4092, offset 4092
Inode node at 0x0013de88, totlen 0x00000b21, #ino 150, version 3, isize 13108, csize 2781, dsize 4092, offset 8184
Inode node at 0x0013e9ac, totlen 0x000001e0, #ino 150, version 4, isize 13108, csize 412, dsize 832, offset 12276

After:
Dirent node at 0x006c7bf0, totlen 0x0000003b, #pino 7, version 171, #ino 150, nsize 19, name ld-uClibc-0.9.28.so
Inode node at 0x006c7c2c, totlen 0x00000a14, #ino 150, version 5, isize 13108, csize 2512, dsize 4092, offset 0
Inode node at 0x006c8640, totlen 0x00000044, #ino 150, version 6, isize 13108, csize 0, dsize 4092, offset 4092
Inode node at 0x006c8684, totlen 0x00000044, #ino 150, version 7, isize 13108, csize 0, dsize 4092, offset 8184
Inode node at 0x006c86c8, totlen 0x00000044, #ino 150, version 8, isize 13108, csize 0, dsize 832, offset 12276

Is a csize of 0 correct for these nodes? If the node header is correct,
could it be that the node data has been corrupted in some way?

> > Any pointers to where I should start debugging, what can go wrong?
> >
> > I can provide jffs2dump's, logs or images if needed.
>
> Take a copy of the image, then work out where the kernel is stuck. Use
> SysRq-P and/or SysRq-T, and if it's in JFFS2 try running with
> CONFIG_JFFS2_FS_DEBUG=1 (and with 'verbose' on the command line), and
> capture all the output on a serial console.

The system is in do_signal, which is most likely a sign that the init
process has received an unexpected signal. I assume it is due to one of
the core libraries being corrupted.

JFFS2 log with debug=1:

jffs2_scan_dirent_node(): Node at 0x006c7bf0
[JFFS2 DBG] (1) jffs2_link_node_ref: Last node at 903008c4 is (006c7bac,902febd8)
[JFFS2 DBG] (1) jffs2_link_node_ref: New ref is 903008d0 (fffffffe becomes 006c7bf2,00000000) len 0x3c
[JFFS2 DBG] (1) jffs2_add_fd_to_list: add dirent "ld-uClibc-0.9.28.so", ino #150
jffs2_scan_inode_node(): Node at 0x006c7c2c
[JFFS2 DBG] (1) jffs2_add_ino_cache: add 902febc0 (ino #150)
[JFFS2 DBG] (1) jffs2_link_node_ref: Last node at 903008d0 is (006c7bf2,902e7704)
[JFFS2 DBG] (1) jffs2_link_node_ref: New ref is 903008dc (fffffffe becomes 006c7c2c,00000000) len 0xa14
Node is ino #150, version 5. Range 0x0-0xffc
Fewer than 68 bytes (inode node) left to end of buf. Reading 0x1000 at 0x006c8640
jffs2_scan_inode_node(): Node at 0x006c8640
[JFFS2 DBG] (1) jffs2_link_node_ref: Last node at 903008dc is (006c7c2c,902febc0)
[JFFS2 DBG] (1) jffs2_link_node_ref: New ref is 903008e8 (fffffffe becomes 006c8640,00000000) len 0x44
Node is ino #150, version 6. Range 0xffc-0x1ff8
jffs2_scan_inode_node(): Node at 0x006c8684
[JFFS2 DBG] (1) jffs2_link_node_ref: Last node at 903008e8 is (006c8640,903008dc)
[JFFS2 DBG] (1) jffs2_link_node_ref: New ref is 903008f4 (fffffffe becomes 006c8684,00000000) len 0x44
Node is ino #150, version 7. Range 0x1ff8-0x2ff4
jffs2_scan_inode_node(): Node at 0x006c86c8
[JFFS2 DBG] (1) jffs2_link_node_ref: Last node at 903008f4 is (006c8684,903008e8)
[JFFS2 DBG] (1) jffs2_link_node_ref: New ref is 90300900 (fffffffe becomes 006c86c8,00000000) len 0x44
Node is ino #150, version 8. Range 0x2ff4-0x3334

What else should I look for in the log file? It is a bit too big to
attach to this list (21 MB).

-- 
Best regards
Hans-Christian Egtvedt

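One check the dump itself allows is whether each inode node's totlen agrees with its csize. What follows is only a minimal sketch, assuming that the inode node header is 68 bytes (the figure the scan log prints in "Fewer than 68 bytes (inode node)") and that totlen is the header plus the compressed payload. The regular expression and the function names are made up for this illustration; they are not part of jffs2dump or the kernel.

```python
import re

# Assumption: 68-byte inode node header, taken from the scan log message
# "Fewer than 68 bytes (inode node) left to end of buf".
INODE_HEADER_BYTES = 68

# Hypothetical pattern matching the jffs2dump lines quoted in this thread.
NODE_RE = re.compile(
    r"Inode node at (0x[0-9a-f]+),\s*totlen (0x[0-9a-f]+),\s*#ino (\d+),\s*"
    r"version (\d+),\s*isize (\d+),\s*csize (\d+),\s*dsize (\d+),\s*offset (\d+)"
)


def parse_inode_nodes(dump_text):
    """Return one dict per 'Inode node' line found in dump_text."""
    nodes = []
    for match in NODE_RE.finditer(dump_text):
        addr, totlen, ino, version, isize, csize, dsize, offset = match.groups()
        nodes.append({
            "addr": int(addr, 16),
            "totlen": int(totlen, 16),
            "ino": int(ino),
            "version": int(version),
            "isize": int(isize),
            "csize": int(csize),
            "dsize": int(dsize),
            "offset": int(offset),
        })
    return nodes


def payload_bytes(node):
    """Bytes left in the node once the assumed header size is subtracted."""
    return node["totlen"] - INODE_HEADER_BYTES


# The "after" dump from this message, one node per line.
AFTER = """
Inode node at 0x006c7c2c, totlen 0x00000a14, #ino 150, version 5, isize 13108, csize 2512, dsize 4092, offset 0
Inode node at 0x006c8640, totlen 0x00000044, #ino 150, version 6, isize 13108, csize 0, dsize 4092, offset 4092
Inode node at 0x006c8684, totlen 0x00000044, #ino 150, version 7, isize 13108, csize 0, dsize 4092, offset 8184
Inode node at 0x006c86c8, totlen 0x00000044, #ino 150, version 8, isize 13108, csize 0, dsize 832, offset 12276
"""

for node in parse_inode_nodes(AFTER):
    consistent = payload_bytes(node) == node["csize"]
    print(f"version {node['version']}: payload {payload_bytes(node)} bytes, "
          f"csize {node['csize']}, header/payload consistent: {consistent}")
```

Under those assumptions the headers are self-consistent: the nodes with csize 0 are exactly 0x44 = 68 bytes long, so they carry a header and no payload. That only describes what is on flash; it does not say why the payload is empty.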
* Re: corruption of JFFS2 filesystem, csize is set to 0 after moving a block
From: Haavard Skinnemoen @ 2007-04-27 9:31 UTC
To: Hans-Christian Egtvedt; +Cc: linux-mtd, David Woodhouse

On Fri, 27 Apr 2007 11:13:49 +0200
Hans-Christian Egtvedt <hcegtvedt@norway.atmel.com> wrote:

> The system is in do_signal, which is most likely a sign of the init
> process has received an unexpected signal. I assume it is due to one of
> the core libraries being corrupted.

FWIW, the avr32 update I'm about to push out will change the behaviour
of the exception handling code to panic when this happens instead of
trying to deliver the signal forever. I can try to backport it to
whatever version you're running, but I'm pretty sure your analysis is
correct (more specifically, I think init got a SIGBUS signal it didn't
want.)

Haavard

* Re: corruption of JFFS2 filesystem, csize is set to 0 after moving a block
From: David Woodhouse @ 2007-04-27 9:46 UTC
To: Hans-Christian Egtvedt; +Cc: linux-mtd

On Fri, 2007-04-27 at 11:13 +0200, Hans-Christian Egtvedt wrote:
>
> Some filesystem dump from before:
> Dirent node at 0x0013c7e0, totlen 0x0000003b, #pino 7, version 148, #ino 150, nsize 19, name ld-uClibc-0.9.28.so
> Inode node at 0x0013c81c, totlen 0x00000a14, #ino 150, version 1, isize 13108, csize 2512, dsize 4092, offset 0
> Inode node at 0x0013d230, totlen 0x00000c57, #ino 150, version 2, isize 13108, csize 3091, dsize 4092, offset 4092
> Inode node at 0x0013de88, totlen 0x00000b21, #ino 150, version 3, isize 13108, csize 2781, dsize 4092, offset 8184
> Inode node at 0x0013e9ac, totlen 0x000001e0, #ino 150, version 4, isize 13108, csize 412, dsize 832, offset 12276

Those are suspect. Why 4092 bytes not 4096? The node with version 2
claims to be 4092 bytes starting from 4092, which is invalid because it
crosses a page boundary.

> After:
> Dirent node at 0x006c7bf0, totlen 0x0000003b, #pino 7, version 171, #ino 150, nsize 19, name ld-uClibc-0.9.28.so
> Inode node at 0x006c7c2c, totlen 0x00000a14, #ino 150, version 5, isize 13108, csize 2512, dsize 4092, offset 0
> Inode node at 0x006c8640, totlen 0x00000044, #ino 150, version 6, isize 13108, csize 0, dsize 4092, offset 4092
> Inode node at 0x006c8684, totlen 0x00000044, #ino 150, version 7, isize 13108, csize 0, dsize 4092, offset 8184
> Inode node at 0x006c86c8, totlen 0x00000044, #ino 150, version 8, isize 13108, csize 0, dsize 832, offset 12276

Ok, in that case I agree that a csize of zero also looks suspicious.
Matches the node 'totlen' though. What's the compression type?

Did you use 'mkfs.jffs2 -s 4092'?

-- 
dwmw2

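The page-boundary remark can be checked mechanically against the "before" dump. This is a rough sketch only, assuming a 4096-byte page as implied by "Why 4092 bytes not 4096?". The name crosses_page_boundary is made up here and is not a JFFS2 function, and the kernel's own validation may well differ.

```python
PAGE_SIZE = 4096  # assumption: the page size implied in the message above


def crosses_page_boundary(offset, dsize, page_size=PAGE_SIZE):
    """Return True if the bytes [offset, offset + dsize) span two or more pages."""
    if dsize <= 0:
        return False
    first_page = offset // page_size
    last_page = (offset + dsize - 1) // page_size
    return first_page != last_page


# (version, offset, dsize) copied from the "before" dump for ino #150.
BEFORE_NODES = [
    (1, 0, 4092),
    (2, 4092, 4092),
    (3, 8184, 4092),
    (4, 12276, 832),
]

for version, offset, dsize in BEFORE_NODES:
    if crosses_page_boundary(offset, dsize):
        verdict = "crosses a page boundary"
    else:
        verdict = "stays within one page"
    print(f"version {version}: offset {offset}, dsize {dsize}: {verdict}")
```

On this data only the version 1 node stays within a single page; versions 2, 3 and 4 each straddle a multiple of 4096. If the same 13108-byte file were split at multiples of 4096 instead (offsets 0, 4096, 8192 and 12288), the check would report no crossings.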
* Re: corruption of JFFS2 filesystem, csize is set to 0 after moving a block
From: Hans-Christian Egtvedt @ 2007-04-27 11:52 UTC
To: David Woodhouse; +Cc: linux-mtd

On Fri, 2007-04-27 at 10:46 +0100, David Woodhouse wrote:
> On Fri, 2007-04-27 at 11:13 +0200, Hans-Christian Egtvedt wrote:

<cut jffs2dump initial image>

> Those are suspect. Why 4092 bytes not 4096? The node with version 2
> claims to be 4092 bytes starting from 4092, which is invalid because it
> crosses a page boundary.

Let me quote Homer Jay Simpson, "DOH!".

<cut jffs2dump corrupted image>

> Ok, in that case I agree that a csize of zero also looks suspicious.
> Matches the node 'totlen' though. What's the compression type.
>
> Did you use 'mkfs.jffs2 -s 4092'?

I have no idea how I ended up with this number, but rebuilding the
image with pagesize=4096 gives a fully working image.

Many thanks for your help.

-- 
Best regards
Hans-Christian Egtvedt