jffs2 BUG() in gc.c:135 "Checked all inodes but still..."

public inbox for linux-mtd@lists.infradead.org

* jffs2 BUG() in gc.c:135 "Checked all inodes but still..."
@ 2002-09-19 10:54 Victor Tse
  2002-09-19 10:59 ` David Woodhouse
  0 siblings, 1 reply; 5+ messages in thread
From: Victor Tse @ 2002-09-19 10:54 UTC
  To: linux-mtd

Hi,

I am using the latest code updated from CVS with 2.4.18-rmk7 on a ~4.5MB
partition on an 8MB NAND flash.

We already have a running filesystem on the flash previously created
with code from CVS on Sep 6. Seconds after the partition is mounted, a
kernel BUG() appears:

Checked all inodes but still 0x5a8 bytes of unchecked space?
kernel BUG at gc.c:135!
Unable to handle kernel NULL pointer dereference at virtual address
00000000
...

The number of bytes reported is different every time.

Before updating to the latest code we had seen strange behavior, e.g.
seemingly corrupted executable files that restore themselves after a
reboot, and different bad CRC reports every time the system boots
up. However, the system is generally stable enough for our development.

Is the new code compatible with old data? Should I just rebuild the
whole filesystem with the new code by erasing it and copying all files
into it? Any help is greatly appreciated...


ksymoops output:

kernel BUG at gc.c:135!
Unable to handle kernel NULL pointer dereference at virtual address
00000000
*pgd = 00000000, *pmd = 00000000
Internal error: Oops: ffffffff
CPU: 0
pc : [<c0 0392c0>]    lr     Not tainted
sp : c09fbf08  ip : c09fbec0  fp : c09fbf18
r10: 00000000  r9 : 00000003  r8 : c0717904
r7 : c07178e0  r6 : c07178cc  r5 : c07178cc  r4 : 00000000
r3 : 00000000  r2 : 00000001  r1 : 00000001  r0 : 00000001
Flags: nZCv  IRQs on  FbIQs on  Mode SVControl: 217F  Table: C0614015 
DAC: 00000015
Process jffs2_gcd_mtd4 (pid: 11, stackpage=c09fb000)
Stack: (0xc09fbef8 to 0xc09fc000)
bee0:                                                       c003f2f8
c00392c0
bf00: 60000013 ffffffff c09fa000 c09fbf5c c09fbf1c c008d4cc4 c003928c
c07d4000 00000000 00000000 00000000 00000000 c09fa000 c07178cc 00000000
bf40: c07178e0 00000000 00000003 00000000 c09fbff4 c09fbf60 c0090bd4
c008d430
bf60: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff
bf80: ffffffff ffffffff 00000013 000c00000 00000000 000 00000000
bfa0: 00000000 c09fbfb0 c0034914 c003bb68 00000000 00000000 00000600
c00909e4
bfc0: c07178cc c0717800 00000000 c07178e0 00000000 00000003 00000000
c07dde1c
bfe0: 00000000 c0717800 00000000 c09fbff8 c0035c40 c00909f4 ffffffff
ffffffff
Backtrace:
Function entered at [<c0039270c>] from [<c008C09FA000
Function entered at [<c008d420>] from [<c0090bd4>]
Function entered at [<c00909e4>] from [<c0035c40>]
 r5 = C0717800  r4 = 00f000000
Code:
Kernel panic: Aiee, killing interrupt handler!
Warning (Oops_read): Code line not seen, dumping what data is available


Trace; 0000000c0039270c <END_OF_CODE+b40206fc0/????>
Trace; c008d420 <jffs2_garbage_collect_pass+0/7ac>
Trace; c0090bd4 <jffs2_garbage_collect_thread+1f0/208>
Trace; c00909e4 <jffs2_garbage_collect_thread+0/208>
Trace; c0035c40 <kernel_thread+38/48>



-- 
Victor Tse <victortse@avantwave.com>


* Re: jffs2 BUG() in gc.c:135 "Checked all inodes but still..."
  2002-09-19 10:54 jffs2 BUG() in gc.c:135 "Checked all inodes but still..." Victor Tse
@ 2002-09-19 10:59 ` David Woodhouse
  2002-09-20  4:08   ` Victor Tse
  0 siblings, 1 reply; 5+ messages in thread
From: David Woodhouse @ 2002-09-19 10:59 UTC
  To: Victor Tse; +Cc: linux-mtd

victortse@avantwave.com said:
>  Before updating to the latest code there had been strange behaviors
> e.g. seemingly corrupted executable files that restores itself after a
> reboot, and we get different bad CRC reports every time the system
> boots up. However generally the system is stable enough for our
> development.

Can you provide more details? These aren't expected, except the CRC errors, 
which shouldn't be common -- they should only happen on power loss.

> Is the new code compatible with old data? Should I just rebuild the
> whole filesystem with the new code by erasing it and copying all files
> into it? Any help is greatly appreciated... 

Yes, the new code is compatible. September 6th was a bad time to take a 
snapshot -- can you update and try again?

If you can reproduce it still, please provide a full log of the mount with 
CONFIG_JFFS2_FS_DEBUG=1.


--
dwmw2


* Re: jffs2 BUG() in gc.c:135 "Checked all inodes but still..."
  2002-09-19 10:59 ` David Woodhouse
@ 2002-09-20  4:08   ` Victor Tse
  2002-09-20 14:36     ` David Woodhouse
  0 siblings, 1 reply; 5+ messages in thread
From: Victor Tse @ 2002-09-20  4:08 UTC
  To: David Woodhouse; +Cc: linux-mtd

Hi,

I updated to the latest code and rebuilt the filesystem: eraseall,
mount, then copy the files into it. There is no problem if I umount it
before rebooting, but the BUG() appears if I do not umount it.

This log was taken after copying the files, umounting and then rebooting:
http://www.avantwave.com/~victortse/jffs2.log1.gz

This one was taken after rebooting without umounting:
http://www.avantwave.com/~victortse/jffs2.log2.gz

Many thanks.

Regards,
Victor Tse

-- 
Victor Tse <victortse@avantwave.com>


* Re: jffs2 BUG() in gc.c:135 "Checked all inodes but still..."
  2002-09-20  4:08   ` Victor Tse
@ 2002-09-20 14:36     ` David Woodhouse
  2002-09-20 14:56       ` David Woodhouse
  0 siblings, 1 reply; 5+ messages in thread
From: David Woodhouse @ 2002-09-20 14:36 UTC
  To: Victor Tse; +Cc: linux-mtd

victortse@avantwave.com said:
>  Then reboot without umounting:
> http://www.avantwave.com/~victortse/jffs2.log2.gz

Hmmm.

	jffs2_scan_inode_node(): Node at 0x001aa4a8
	Node is ino #46, version 1599. Range 0x3a1000-0x3a2000
	jffs2_scan_inode_node(): Node at 0x001aae44
	Node is ino #46, version 1600. Range 0x3a2000-0x3a25a8
	jffs2_scan_inode_node(): Node at 0x001ab250
	Node is ino #46, version 1601. Range 0x3a25a8-0x3a3000
		<...>
	Node at 001ab250 (0) is a data node
	version 1601, highest_version now 1602
	dnode @001ab250: ver 1601, offset 3a25a8, dsize 0a58
	Unknown INCOMPAT nodetype FFFF at 001AAE44
	Node at 001aa4a8 (0) is a data node
	version 1599, highest_version now 1602
	dnode @001aa4a8: ver 1599, offset 3a1000, dsize 1000
	Node at 001aa000 (0) is a data node
		<...>
	Checked all inodes but still 0x40c bytes of unchecked space?
		<...>
	clean_list: 001aa000 (used 00001bf0, dirty 00000000, wasted 00000004, unchecked 0000040c, free 00000000)
		<...>
	kernel BUG at gc.c:135!

So on scan we read a real node from 0x1aae44, but later on during the
readinode we find 0xFFFF as its nodetype. What's _really_ on the flash at
0x1aae44? Is this 100% repeatable with the failure being in the same place?
(And can you reduce the baud rate on your serial console until whatever
you're using for logging can actually keep up?)
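
The numbers in the log excerpt above can be cross-checked with a little
arithmetic. The sketch below is an illustration only, using values copied
from the excerpt. It assumes nodes are packed back to back, so that the gap
between two node offsets approximates the space the first node occupies,
and that the five clean_list counters together cover the whole eraseblock.
Neither assumption is stated in the thread.

```python
# Values copied from the debug log excerpt quoted above.
bad_node = 0x001AAE44   # node later read back with nodetype 0xFFFF
next_node = 0x001AB250  # next node offset reported by the scan

# Counters from the clean_list line for the eraseblock at 0x001aa000.
used, dirty, wasted, unchecked, free = 0x1BF0, 0x0, 0x4, 0x40C, 0x0

# Assumption: nodes are contiguous, so this gap is the space taken by
# the node at 0x001aae44.
gap = next_node - bad_node

# Assumption: the counters partition the eraseblock, so their sum is
# the eraseblock size.
block_total = used + dirty + wasted + unchecked + free

print(hex(gap))          # 0x40c
print(hex(block_total))  # 0x2000
```

Under those assumptions the 0x40c bytes still reported as unchecked are
exactly the space occupied by the node at 0x001aae44, and the counters add
up to an 8 KiB eraseblock.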

I can fix the fact that we screw up the accounting and hit the BUG() when we
get this error. But the error should never happen, and if your nodes are
turning into 0xFFFF you are going to lose data. What flash hardware, driver,
etc?

--
dwmw2


* Re: jffs2 BUG() in gc.c:135 "Checked all inodes but still..."
  2002-09-20 14:36     ` David Woodhouse
@ 2002-09-20 14:56       ` David Woodhouse
  0 siblings, 0 replies; 5+ messages in thread
From: David Woodhouse @ 2002-09-20 14:56 UTC
  Cc: Victor Tse, linux-mtd

dwmw2@infradead.org said:
>  I can fix the fact that we screw up the accounting and hit the BUG()
> when we get this error. 

OK, I think I've done this -- CVS update and try again. Now you should just 
lose data when you get flash errors like this, rather than hitting my 
accounting sanity checks and getting a BUG(). I'm still not wonderfully 
happy with what you're seeing though.
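
The actual CVS change is not shown in this thread. As a rough illustration
of the kind of accounting failure described, here is a toy model in which
every node is counted as "unchecked" at scan time and moved to another
counter when it is later checked. The class, function names and the
`account_errors` switch are all hypothetical, and the node sizes are
offset differences taken from the log quoted earlier in the thread; this
is not the JFFS2 implementation.

```python
from dataclasses import dataclass


@dataclass
class EraseBlock:
    """Toy per-eraseblock byte counters (hypothetical, not the kernel struct)."""
    used: int = 0
    dirty: int = 0
    unchecked: int = 0


def check_node(block, size, readable, account_errors):
    """Move one node's bytes out of 'unchecked' once it has been examined."""
    if readable:
        block.unchecked -= size
        block.used += size
    elif account_errors:
        # Assumed post-fix behaviour: an unreadable node is written off
        # as dirty space, i.e. its data is lost but the sums still balance.
        block.unchecked -= size
        block.dirty += size
    # Otherwise nothing moves and the bytes stay 'unchecked' forever.


def leftover_unchecked(account_errors):
    """Return the unchecked bytes remaining after every node was visited."""
    # (size, readable) pairs: 0x99c and 0x40c are the gaps between the
    # node offsets 0x1aa4a8, 0x1aae44 and 0x1ab250 in the log.
    nodes = [(0x99C, True), (0x40C, False)]
    block = EraseBlock(unchecked=sum(size for size, _ in nodes))
    for size, readable in nodes:
        check_node(block, size, readable, account_errors)
    return block.unchecked


print(hex(leftover_unchecked(account_errors=False)))  # 0x40c
print(hex(leftover_unchecked(account_errors=True)))   # 0x0
```

In the first case a sanity check of the form "all inodes checked, so
unchecked must be zero" would fail; in the second the space is accounted
for, at the cost of the node's data.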

--
dwmw2

