* JFFS2 is crashing the kernel
@ 2002-01-16 16:32 Sanjay Kumar
2002-01-16 18:22 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: Sanjay Kumar @ 2002-01-16 16:32 UTC (permalink / raw)
To: linux-mtd
Hi All,
I am using JFFS2 on AMD flash with a MIPS processor, running Linux 2.4.2.
Until now I had no problems, but the kernel suddenly crashed while reading
a file from flash.
I got the following trace:
5>inflate returned -4
Unable to handle kernel paging request at virtual address 07c04800, epc == 8098c2dc, ra == 8098c2a0
Oops in fault.c:do_page_fault, line 189:
$0 : 00000000 80ab0000 000006eb 00000000 000006eb 000006ea 07c04800 d7735738
$8 : 00000000 80a36aa0 000006eb 000223f4 000006eb 80b3a160 9393f884 07c04800
$16: 80d44000 80b53100 00000000 07c04800 80d44000 00001000 80b79c60 80d53b10
$24: 00000000 00000000 80d52000 80d53af0 80b0ecb0 8098c2a0
epc : 8098c2dc
Status: 10004304
Cause : 10000008
Process lash (pid: 145, stackpage=80d52000)
Stack: 00000001 80a71050 8091fbec 00000800 000006eb 80d53b10 07c04800 80aaaf18
000006eb 8091d80c 00001000 80b5a3c4 80b4eef8 00011000 80d44000 00012000
80af8630 fffffff7 80b0ecb0 8098a310 00000001 80a71068 00000008 00000000
00001000 2a2b2c2d 80b4ede0 80af8630 80b4eef8 80af8630 00000000 00000000
80b0bd30 00001000 80b4ede0 8098a4ec 80a78490 00000005 00000001 00001000
80b4ede0 ...
Call Trace: [<8091fbec>] [<8091d80c>] [<8098a310>] [<8098a4ec>] [<8098a654>]
[<8092dbdc>] [<8092dc98>] [<80a51664>] [<80a51664>] [<80a51650>] [<80a51648>]
[<8092dfec>] [<8092df0c>] [<80952d54>] [<80968bb4>] [<809d8854>] [<809f5e98>]
[<8090bba4>] [<809688d8>] [<80953754>] [<80953b84>] [<8092ce78>] [<80943524>]
[<809103c0>] [<80910de4>] [<8090b368>]
Code: 8e27003c 04a0000d 02603021 <90c20000> 24c60001 24a5ffff 00621026 304200ff 00021080
From the trace, it seems that the kernel crashed while JFFS2 was calling
zlib_decompress() to read the file.
Has anybody else seen this problem? Can someone help me understand it?
Thanks,
Sanjay
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: JFFS2 is crashing the kernel
2002-01-16 16:32 JFFS2 is crashing the kernel Sanjay Kumar
@ 2002-01-16 18:22 ` David Woodhouse
2002-01-16 20:02 ` different crash (was Re: JFFS2 is crashing the kernel) Adam Wozniak
0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2002-01-16 18:22 UTC (permalink / raw)
To: Sanjay Kumar; +Cc: linux-mtd
sankumar73@yahoo.com said:
> 5>inflate returned -4
That's Z_MEM_ERROR. Looks like a kmalloc failed.
The oops is impossible to interpret unless it's passed through ksymoops.
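
For reference, the mapping from inflate()'s numeric return value to its symbolic name can be sketched as a small lookup table. The values are the ones defined in zlib.h, copied here so the sketch builds without zlib; zlib_code_name() is a hypothetical helper name, not a zlib function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return codes as defined in zlib.h, reproduced so no zlib header is needed. */
struct zlib_code {
    int value;
    const char *name;
};

static const struct zlib_code zlib_codes[] = {
    {  2, "Z_NEED_DICT" },
    {  1, "Z_STREAM_END" },
    {  0, "Z_OK" },
    { -1, "Z_ERRNO" },
    { -2, "Z_STREAM_ERROR" },
    { -3, "Z_DATA_ERROR" },
    { -4, "Z_MEM_ERROR" },
    { -5, "Z_BUF_ERROR" },
    { -6, "Z_VERSION_ERROR" },
};

/* Hypothetical helper: translate a zlib return code into its name. */
static const char *zlib_code_name(int ret)
{
    size_t i;

    for (i = 0; i < sizeof(zlib_codes) / sizeof(zlib_codes[0]); i++)
        if (zlib_codes[i].value == ret)
            return zlib_codes[i].name;
    return "unknown";
}
```

Calling zlib_code_name(-4) returns "Z_MEM_ERROR", which matches the value in the log line quoted above.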
--
dwmw2
^ permalink raw reply [flat|nested] 10+ messages in thread
* different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 18:22 ` David Woodhouse
@ 2002-01-16 20:02 ` Adam Wozniak
2002-01-16 20:32 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: Adam Wozniak @ 2002-01-16 20:02 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
Got this crash yesterday. gc.c is rev 1.58, nodelist.h is rev 1.59
This is on a PPC system.
Haven't been able to get it to repeat. Any help is appreciated.
--Adam
--------
This one was also running the /work/awozniak/slo-15-jan-2002-B kernel:
Oops: kernel access of bad area, sig: 11
NIP: C0089FDC XER: 00000000 LR: C0089F88 SP: C3A21F10 REGS: c3a21e60
TRAP: 0300
Not tainted
MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 00001360, DSISR: 20000000
TASK = c3a20000[11] 'jffs2_gcd_mtd1' Last syscall: -1
last math c39ae000 last altivec 00000000
GPR00: 00001360 C3A21F10 C3A20000 0000002F 00001360 00000001 00000030 C0170000
GPR08: 01A7B414 C0155B1C 00000000 C3A21E30 30018000 100278F4 00000000 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
GPR24: 00000000 C02C32F8 C0180000 C02C3200 C02C32C4 C0179840 C3FCFED8 C3BFE9C0
Call backtrace:
C0089F88 C008CA94 C00064BC
Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
------- Additional Comments From XXXXX 2002-01-16 00:35 -------
Oh yeah, this was about 2 seconds after starting an FTP of [large file]
------- Additional Comments From Adam Wozniak 2002-01-16 10:27 -------
[awozniak@rangers linux]$ findmap C0089FDC C0089F88 C008CA94 C00064BC
closest to c0089fdc :: c0089da4 T jffs2_garbage_collect_pass (plus 568 of 1500)
closest to c0089f88 :: c0089da4 T jffs2_garbage_collect_pass (plus 484 of 1500)
closest to c008ca94 :: c008c864 t jffs2_garbage_collect_thread (plus 560 of 564)
closest to c00064bc :: c0006490 T kernel_thread (plus 44 of 56)
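
The lookup that findmap performs can be sketched roughly as follows: given a symbol table sorted by address, pick the last symbol that starts at or below the address and report the byte offset into it. This is only an illustration of the idea, not the findmap tool itself; struct ksym and closest_symbol() are hypothetical names, and the table holds just the three symbols that appear in the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One System.map-style entry: start address and symbol name. */
struct ksym {
    unsigned long addr;
    const char *name;
};

/* Entries taken from the findmap output; must stay sorted by address. */
static const struct ksym symtab[] = {
    { 0xc0006490UL, "kernel_thread" },
    { 0xc0089da4UL, "jffs2_garbage_collect_pass" },
    { 0xc008c864UL, "jffs2_garbage_collect_thread" },
};

/*
 * Hypothetical helper: return the last entry whose address is <= pc and
 * store the offset of pc within it. Returns NULL if pc lies below the
 * first entry.
 */
static const struct ksym *closest_symbol(unsigned long pc, unsigned long *offset)
{
    const struct ksym *best = NULL;
    size_t i;

    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (symtab[i].addr > pc)
            break;
        best = &symtab[i];
    }
    if (best && offset)
        *offset = pc - best->addr;
    return best;
}
```

For the NIP value C0089FDC this yields jffs2_garbage_collect_pass with offset 568 (0x238), the same figure findmap reports.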
------- Additional Comments From Adam Wozniak 2002-01-16 11:24 -------
316 0350 40820014 bc 4,2,.L1068
317 0354 801A0000 lwz 0,irq_stat@l(26)
318 0358 2C000000 cmpwi 0,0,0
319 035c 41820008 bc 12,2,.L1068
320 0360 48000001 bl do_softirq
321 .L1068:
322 0364 7FE4FB78 mr 4,31
323 0368 7F83E378 mr 3,28
324 036c 48000001 bl jffs2_mark_node_obsolete
325 0370 480002E0 b .L1080
326 .L1066:
327 0374 7FE4FB78 mr 4,31
328 0378 3B7CFF3C addi 27,28,-196
329 .L1083:
330 037c 80840000 lwz 4,0(4)
331 0380 80040000 lwz 0,0(4) // 0x148 + 568 == 0x380
332 0384 2C000000 cmpwi 0,0,0
333 0388 4082FFF4 bc 4,2,.L1083
334 038c 83A4000C lwz 29,12(4)
335 0390 3C600000 lis 3,.LC17@ha
336 0394 386303B4 la 3,.LC17@l(3)
337 0398 7FA4EB78 mr 4,29
338 039c 48000001 bl printk
------- Additional Comments From Adam Wozniak 2002-01-16 11:26 -------
        if (!raw->next_in_ino) {
                /* Inode-less node. Clean marker, snapshot or something like that */
                spin_unlock_bh(&c->erase_completion_lock);
                jffs2_mark_node_obsolete(c, raw);
                goto eraseit_lock;
        }
        inum = jffs2_raw_ref_to_inum(raw); // <-- HERE
        D1(printk(KERN_DEBUG "Inode number is #%u\n", inum));
        spin_unlock_bh(&c->erase_completion_lock);
        D1(printk(KERN_DEBUG "jffs2_garbage_collect_pass collecting from block @
        inode = iget(OFNI_BS_2SFFJ(c), inum);
------- Additional Comments From Adam Wozniak 2002-01-16 11:33 -------
static inline int jffs2_raw_ref_to_inum(struct jffs2_raw_node_ref *raw)
{
        while(raw->next_in_ino) { // <-- maybe HERE
                raw = raw->next_in_ino; // <-- maybe HERE
        }
        return ((struct jffs2_inode_cache *)raw)->ino;
}
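
The walk above only terminates correctly if every chain of next_in_ino links ends at an object whose first word is NULL. A userspace model of that invariant might look like the sketch below. The model_* types are simplified stand-ins invented for illustration; in particular, embedding a terminator link as the first member of the inode cache is an assumption of the model, not the kernel's actual structure layout.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct jffs2_raw_node_ref: only the link is modelled. */
struct model_raw_ref {
    struct model_raw_ref *next_in_ino;
};

/*
 * Simplified stand-in for struct jffs2_inode_cache. The first member is a
 * link that is always NULL, which is what stops the walk. (Assumption of
 * this model, chosen so the final cast is well defined in ISO C.)
 */
struct model_inocache {
    struct model_raw_ref terminator; /* terminator.next_in_ino == NULL */
    unsigned int ino;
};

/* Same loop shape as jffs2_raw_ref_to_inum(), on the model types. */
static unsigned int model_raw_ref_to_inum(struct model_raw_ref *raw)
{
    assert(raw != NULL);
    while (raw->next_in_ino)
        raw = raw->next_in_ino;
    /* raw now points at the first member of a model_inocache. */
    return ((struct model_inocache *)raw)->ino;
}
```

If any link in the chain is stale, the loop has no way to notice: it simply loads whatever word is stored there and follows it as a pointer, which is the kind of load fault shown in the oops.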
------- Additional Comments From Adam Wozniak 2002-01-16 11:54 -------
DSISR indicates this was a "load" operation, not a "store"
(obvious, and consistent with the crash as indicated).
Likely the result of a bad pointer dereference.
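
The load/store check can be sketched as a single bit test. This assumes the classic 32-bit PowerPC DSISR layout, in which mask 0x02000000 is set when the faulting access was a store; the macro and function names are hypothetical.

```c
#include <assert.h>

/* Assumed store bit of the DSISR on classic 32-bit PowerPC cores. */
#define MODEL_DSISR_ISSTORE 0x02000000UL

/* Hypothetical helper: returns 1 if the fault was a store, 0 if it was a load. */
static int dsisr_is_store(unsigned long dsisr)
{
    return (dsisr & MODEL_DSISR_ISSTORE) != 0;
}
```

With the DSISR value 20000000 from the oops, dsisr_is_store() returns 0, i.e. a load.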
--
Adam Wozniak (KG6GZR) COM DEV Wireless - Digital and Software Systems
awozniak@comdev.cc 3450 Broad St. 107, San Luis Obispo, CA 93401
http://www.comdev.cc
Voice: (805) 544-1089 Fax: (805) 544-2055
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 20:02 ` different crash (was Re: JFFS2 is crashing the kernel) Adam Wozniak
@ 2002-01-16 20:32 ` David Woodhouse
2002-01-16 20:35 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2002-01-16 20:32 UTC (permalink / raw)
To: Adam Wozniak; +Cc: linux-mtd
Can you look at the assembly and see precisely what line of code was
faulting? Recompiling the offending file with identical flags except for
adding '-g' can be useful.
--
dwmw2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 20:32 ` David Woodhouse
@ 2002-01-16 20:35 ` David Woodhouse
2002-01-16 21:44 ` Adam Wozniak
2002-01-16 22:04 ` Adam Wozniak
0 siblings, 2 replies; 10+ messages in thread
From: David Woodhouse @ 2002-01-16 20:35 UTC (permalink / raw)
To: Adam Wozniak, linux-mtd
dwmw2@infradead.org said:
> Can you look at the assembly and see precisely what line of code was
> faulting? Recompiling the offending file with identical flags except
> for adding '-g' can be useful.
Note to self: Read all of mail before replying. It makes you look less
stupid.
What version of build.c?
--
dwmw2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 20:35 ` David Woodhouse
@ 2002-01-16 21:44 ` Adam Wozniak
2002-01-16 22:06 ` David Woodhouse
2002-01-16 22:04 ` Adam Wozniak
1 sibling, 1 reply; 10+ messages in thread
From: Adam Wozniak @ 2002-01-16 21:44 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
David Woodhouse wrote:
> What version of build.c?
Id: build.c,v 1.22 2002/01/09 16:30:57 dwmw2 Exp
--
Adam Wozniak (KG6GZR) COM DEV Wireless - Digital and Software Systems
awozniak@comdev.cc 3450 Broad St. 107, San Luis Obispo, CA 93401
http://www.comdev.cc
Voice: (805) 544-1089 Fax: (805) 544-2055
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 20:35 ` David Woodhouse
2002-01-16 21:44 ` Adam Wozniak
@ 2002-01-16 22:04 ` Adam Wozniak
1 sibling, 0 replies; 10+ messages in thread
From: Adam Wozniak @ 2002-01-16 22:04 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
David Woodhouse wrote:
>
> What version of build.c?
[awozniak@rangers jffs2]$ grep Id: *.[ch]
background.c: * $Id: background.c,v 1.16 2001/10/08 09:22:38 dwmw2 Exp $
build.c: * $Id: build.c,v 1.22 2002/01/09 16:30:57 dwmw2 Exp $
compr.c: * $Id: compr.c,v 1.22 2002/01/11 10:10:49 dwmw2 Exp $
compr_rtime.c: * $Id: compr_rtime.c,v 1.7 2002/01/09 13:25:57 dwmw2 Exp $
compr_rubin.c: * $Id: compr_rubin.c,v 1.15 2002/01/09 13:25:57 dwmw2 Exp $
compr_rubin.h:/* $Id: compr_rubin.h,v 1.5 2001/02/26 13:50:01 dwmw2 Exp $ */
comprtest.c:/* $Id: comprtest.c,v 1.5 2002/01/03 15:20:44 dwmw2 Exp $ */
compr_zlib.c: * $Id: compr_zlib.c,v 1.11 2002/01/11 10:10:49 dwmw2 Exp $
crc32.c:/* $Id: crc32.c,v 1.4 2002/01/03 15:20:44 dwmw2 Exp $ */
crc32.h:/* $Id: crc32.h,v 1.4 2002/01/03 15:20:44 dwmw2 Exp $ */
dir.c: * $Id: dir.c,v 1.51 2002/01/13 22:11:00 dwmw2 Exp $
erase.c: * $Id: erase.c,v 1.30 2002/01/09 13:25:57 dwmw2 Exp $
file.c: * $Id: file.c,v 1.63 2002/01/07 11:18:05 dwmw2 Exp $
gc.c: * $Id: gc.c,v 1.58 2002/01/09 13:25:57 dwmw2 Exp $
ioctl.c: * $Id: ioctl.c,v 1.5 2001/03/15 15:38:24 dwmw2 Exp $
malloc.c: * $Id: malloc.c,v 1.18 2002/01/09 13:25:57 dwmw2 Exp $
nodelist.c: * $Id: nodelist.c,v 1.36 2002/01/09 16:52:59 dwmw2 Exp $
nodelist.h: * $Id: nodelist.h,v 1.59 2002/01/10 09:29:53 dwmw2 Exp $
nodemgmt.c: * $Id: nodemgmt.c,v 1.51 2002/01/09 13:25:58 dwmw2 Exp $
os-linux.h: * $Id: os-linux.h,v 1.3 2002/01/09 13:00:08 dwmw2 Exp $
pushpull.c: * $Id: pushpull.c,v 1.9 2002/01/09 13:25:58 dwmw2 Exp $
pushpull.h: * $Id: pushpull.h,v 1.5 2001/09/23 10:04:15 rmk Exp $
read.c: * $Id: read.c,v 1.19 2002/01/14 23:30:26 dwmw2 Exp $
readinode.c: * $Id: readinode.c,v 1.64 2002/01/09 13:25:58 dwmw2 Exp $
scan.c: * $Id: scan.c,v 1.57 2002/01/09 13:25:58 dwmw2 Exp $
super.c: * $Id: super.c,v 1.50 2002/01/09 13:25:58 dwmw2 Exp $
symlink.c: * $Id: symlink.c,v 1.9 2002/01/10 09:29:53 dwmw2 Exp $
write.c: * $Id: write.c,v 1.38 2002/01/09 13:25:58 dwmw2 Exp $
zlib.c: * $Id: zlib.c,v 1.3 1997/12/23 10:47:42 paulus Exp $
zlib.h:/* $Id: zlib.h,v 1.2 1997/12/23 10:47:44 paulus Exp $ */
--
Adam Wozniak (KG6GZR) COM DEV Wireless - Digital and Software Systems
awozniak@comdev.cc 3450 Broad St. 107, San Luis Obispo, CA 93401
http://www.comdev.cc
Voice: (805) 544-1089 Fax: (805) 544-2055
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 21:44 ` Adam Wozniak
@ 2002-01-16 22:06 ` David Woodhouse
2002-01-16 22:11 ` Adam Wozniak
0 siblings, 1 reply; 10+ messages in thread
From: David Woodhouse @ 2002-01-16 22:06 UTC (permalink / raw)
To: Adam Wozniak; +Cc: linux-mtd
adam.wozniak@comdev.cc said:
> Id: build.c,v 1.22 2002/01/09 16:30:57 dwmw2 Exp
Bugger. There was a bug in 1.21 which could cause the symptoms you saw -
but I fixed it in 1.22.
Now I'm confused. It's possible that the shuffling around of code for
portability, although it shouldn't have changed the behaviour in any
significant way, has introduced a bug.
I suspect you have a node or inocache on the list which has been freed and
shouldn't be there.
If this were reproducible, I'd want you to add debugging printks to
jffs2_{alloc,free}_raw_node_ref similar to the ones in the inocache
alloc/free functions. Also to jffs2_raw_ref_to_inum() to print raw and raw->
next_in_ino each time round the loop you identified.
Then boot with debugging (level 1) enabled and log _all_ messages, even the
KERN_DEBUG ones, from the time the filesystem is first mounted.
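
As a rough illustration of the instrumentation described above, here is a userspace sketch. fprintf(stderr, ...) stands in for printk(KERN_DEBUG ...), and the dbg_* names and the one-field structure are hypothetical simplifications; the real functions to instrument are jffs2_{alloc,free}_raw_node_ref and jffs2_raw_ref_to_inum().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified raw node ref: only the link followed by the walk is modelled. */
struct dbg_raw_ref {
    struct dbg_raw_ref *next_in_ino;
};

/* Stand-in for printk(KERN_DEBUG ...) so the sketch runs in userspace. */
#define DBG(...) fprintf(stderr, __VA_ARGS__)

/* Logged allocation, in the spirit of jffs2_alloc_raw_node_ref(). */
static struct dbg_raw_ref *dbg_alloc_raw_ref(void)
{
    struct dbg_raw_ref *ref = calloc(1, sizeof(*ref));

    DBG("alloc raw_node_ref %p\n", (void *)ref);
    return ref;
}

/* Logged free, in the spirit of jffs2_free_raw_node_ref(). */
static void dbg_free_raw_ref(struct dbg_raw_ref *ref)
{
    DBG("free raw_node_ref %p\n", (void *)ref);
    free(ref);
}

/*
 * Walk the chain, printing raw and raw->next_in_ino on every iteration.
 * Returns the number of links followed.
 */
static int dbg_walk(struct dbg_raw_ref *raw)
{
    int hops = 0;

    while (raw->next_in_ino) {
        DBG("walk: raw %p next_in_ino %p\n",
            (void *)raw, (void *)raw->next_in_ino);
        raw = raw->next_in_ino;
        hops++;
    }
    DBG("walk: ended at %p\n", (void *)raw);
    return hops;
}
```

With a complete log, a pointer that shows up in a "free" line and later in a "walk" line would point at an object that was freed while still linked.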
--
dwmw2
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 22:06 ` David Woodhouse
@ 2002-01-16 22:11 ` Adam Wozniak
2002-01-16 22:20 ` David Woodhouse
0 siblings, 1 reply; 10+ messages in thread
From: Adam Wozniak @ 2002-01-16 22:11 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd
David Woodhouse wrote:
>
> adam.wozniak@comdev.cc said:
> > Id: build.c,v 1.22 2002/01/09 16:30:57 dwmw2 Exp
>
> Bugger. There was a bug in 1.21 which could cause the symptoms you saw -
> but I fixed it in 1.22.
>
> Now I'm confused. It's possible that the shuffling around of code for
> portability, although it shouldn't have changed the behaviour in any
> significant way, has introduced a bug.
>
> I suspect you have a node or inocache on the list which has been freed and
> shouldn't be there.
>
> If this were reproducible, I'd want you to add debugging printks to
> jffs2_{alloc,free}_raw_node_ref similar to the ones in the inocache
> alloc/free functions. Also to jffs2_raw_ref_to_inum() to print raw and raw->
> next_in_ino each time round the loop you identified.
>
> Then boot with debugging (level 1) enabled and log _all_ messages, even the
> KERN_DEBUG ones, from the time the filesystem is first mounted.
>
> --
> dwmw2
Is it possible that I could have
had a kernel with build.c rev 1.21 which wrote bad data to flash,
then upgraded the kernel with a build.c rev 1.22
then got the crash?
I'm a little distant from the actual test lab, so I don't have the whole
history of the machine which crashed. I also have machines in the field
which are running with a build.c rev 1.21 or earlier. Are they likely
to be susceptible to this?
--Adam
--
Adam Wozniak (KG6GZR) COM DEV Wireless - Digital and Software Systems
awozniak@comdev.cc 3450 Broad St. 107, San Luis Obispo, CA 93401
http://www.comdev.cc
Voice: (805) 544-1089 Fax: (805) 544-2055
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: different crash (was Re: JFFS2 is crashing the kernel)
2002-01-16 22:11 ` Adam Wozniak
@ 2002-01-16 22:20 ` David Woodhouse
0 siblings, 0 replies; 10+ messages in thread
From: David Woodhouse @ 2002-01-16 22:20 UTC (permalink / raw)
To: Adam Wozniak; +Cc: linux-mtd
adam.wozniak@comdev.cc said:
> Is it possible that I could have had a kernel with build.c rev 1.21
> which wrote bad data to flash, then upgraded the kernel with a build.c
> rev 1.22 then got the crash?
No, it's an in-memory thing. Versions between 1.17 and 1.21 inclusive have
this bug - it's triggered if there are deleted inodes with still-valid nodes
on the flash. We free the inocaches too early, while they're still on the
lists.
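
The ordering problem can be illustrated with a generic singly linked list: an entry has to be unlinked before it is freed, otherwise a later walk follows a pointer into freed memory. This is a userspace sketch of the general pattern only, not the actual change made in build.c 1.22; all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal stand-in for an inode cache entry kept on a linked list. */
struct demo_inocache {
    struct demo_inocache *next;
    unsigned int ino;
};

/* Push a new entry on the front of the list; returns it, or NULL on failure. */
static struct demo_inocache *demo_add(struct demo_inocache **head, unsigned int ino)
{
    struct demo_inocache *ic = malloc(sizeof(*ic));

    if (!ic)
        return NULL;
    ic->ino = ino;
    ic->next = *head;
    *head = ic;
    return ic;
}

/* Unlink the entry for ino first, and only then free it. Returns 1 if found. */
static int demo_del(struct demo_inocache **head, unsigned int ino)
{
    struct demo_inocache **pp;

    for (pp = head; *pp; pp = &(*pp)->next) {
        if ((*pp)->ino == ino) {
            struct demo_inocache *victim = *pp;

            *pp = victim->next; /* no list pointer refers to victim now */
            free(victim);       /* safe: a later walk cannot reach it */
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if an entry with this inode number is still on the list. */
static int demo_contains(const struct demo_inocache *head, unsigned int ino)
{
    for (; head; head = head->next)
        if (head->ino == ino)
            return 1;
    return 0;
}
```

Swapping the two statements in demo_del(), or freeing without unlinking at all, would leave a dangling pointer on the list, which is the class of bug described above.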
> I'm a little distant from the actual test lab, so I don't have the
> whole history of the machine which crashed. I also have machines in
> the field which are running with a build.c rev 1.21 or earlier. Are
> they likely to be susceptible to this?
For production machines, I'd recommend using the jffs2-2_4-branch from CVS.
That has a couple of bugfixes which turned up during the eCos port, but not
the rest of the code-shuffling. The latter _shouldn't_ make any difference,
but there's no harm in being safe.
--
dwmw2
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2002-01-16 22:09 UTC | newest]
Thread overview: 10+ messages
2002-01-16 16:32 JFFS2 is crashing the kernel Sanjay Kumar
2002-01-16 18:22 ` David Woodhouse
2002-01-16 20:02 ` different crash (was Re: JFFS2 is crashing the kernel) Adam Wozniak
2002-01-16 20:32 ` David Woodhouse
2002-01-16 20:35 ` David Woodhouse
2002-01-16 21:44 ` Adam Wozniak
2002-01-16 22:06 ` David Woodhouse
2002-01-16 22:11 ` Adam Wozniak
2002-01-16 22:20 ` David Woodhouse
2002-01-16 22:04 ` Adam Wozniak