* simulate a bad NAND block cause kernel hang
@ 2005-08-25 13:27 ahgu
2005-08-25 13:44 ` Thomas Gleixner
0 siblings, 1 reply; 4+ messages in thread
From: ahgu @ 2005-08-25 13:27 UTC
To: ahgu, linux-mtd
I forced the flash_erase function to fail. I expected JFFS2 to pick up
the returned error, mark the block bad, and put it on a bad-block
list. Instead, I get a kernel failure (log below).
I get a similar error when I simulate a write error.
Am I simulating the bad block correctly? Is this the correct response?
What is supposed to happen when NAND flash grows a bad block?
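For reference, a minimal userspace sketch of this kind of fault injection, assuming the erase routine can be wrapped behind a function pointer. The names erase_fn, erase_injector, injected_erase and fake_erase_ok are hypothetical and are not part of MTD or JFFS2. The 16 KiB block size in the usage note is also an assumption. Kernel code conventionally reports failure with a negative errno value, so the sketch returns -EIO; the log below prints "errno 7", which suggests a positive value was injected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical erase hook: returns 0 on success or a negative errno. */
typedef int (*erase_fn)(uint32_t offset, uint32_t len);

/* Hypothetical fault injector wrapped around a real erase routine. */
struct erase_injector {
	erase_fn real_erase;	/* underlying erase routine */
	uint32_t bad_offset;	/* start of the block that should look bad */
	uint32_t block_size;	/* erase block size in bytes */
};

/* Stand-in for a driver erase routine that always succeeds. */
static int fake_erase_ok(uint32_t offset, uint32_t len)
{
	(void)offset;
	(void)len;
	return 0;
}

/*
 * Fail with -EIO when the requested range overlaps the simulated bad
 * block; otherwise pass the request through to the real routine.
 */
static int injected_erase(const struct erase_injector *inj,
			  uint32_t offset, uint32_t len)
{
	uint64_t end, bad_end;

	assert(inj != NULL && inj->real_erase != NULL);

	end = (uint64_t)offset + len;
	bad_end = (uint64_t)inj->bad_offset + inj->block_size;

	if (offset < bad_end && end > inj->bad_offset)
		return -EIO;	/* negative errno, as callers expect */

	return inj->real_erase(offset, len);
}
```

With an injector set up as { fake_erase_ok, 0x1c000, 0x4000 }, an erase of the block at 0x1c000 fails with -EIO while the neighbouring blocks still erase normally.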
-ahgu
erasing 615
Erase at 0x0001c000 failed immediately: errno 7
jffs2_erase_failedScheduling in interrupt
kernel BUG at sched.c:676!
Unable to handle kernel paging request at virtual address 00000000, epc ==
80112220, ra == 80112220
Oops in fault.c:do_page_fault, line 225:
$0 : 00000000 1000f800 0000001b 00000001 816ee000 00000000 00000001 00001893
$8 : 00001893 00000000 00000000 00000000 802a9459 fffffff9 0000000a 802edd0a
$16: 8028e260 802ec000 00000000 802ad928 818cbe2c 818cbe28 80106000 802ad924
$24: ffffffff 00000002 802ec000 802ede38 802ede38 80112220
Hi : 000247ff
Lo : befc0000
epc : 80112220 Not tainted
Status: 1000f803
Cause : 1080000c
Process kupdated (pid: 6, stackpage=802ec000)
Stack: 80245460 80245538 000002a4 00001875 802fe504 009a0000 818cbd10
802ad928
818cbe2c 818cbe28 802fe4b0 8029ae8c 00000000 801db1f8 00000010
802edeb8
802edea8 1000f801 00000000 802ec000 802fe508 802fe508 0000000a
00000266
8107c720 818cbd10 818cbd10 816af120 818cbe2c 818cbe28 00000001
8029ae8c
00000000 801d29b0 802fe400 8107c720 818cbd10 816af120 801852a0
801851fc
80253518 ...
Call Trace: [<80245460>] [<80245538>] [<801db1f8>] [<801d29b0>] [<801852a0>]
[<801851fc>]
[<80253518>] [<801854ac>] [<801853f8>] [<80186738>] [<80186730>]
[<8014004c>]
[<8013f11c>] [<8013f610>] [<8013f3d8>] [<8013f3d8>] [<801089e8>]
[<80140eb4>]
[<801089d8>]
Code: 24a55538 0c0458b5 240602a4 <ac000000> 0012a940 3c0a8029 254a2040
01555021 40016000
Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
* Re: simulate a bad NAND block cause kernel hang
From: Thomas Gleixner @ 2005-08-25 13:44 UTC
To: ahgu; +Cc: linux-mtd
On Thu, 2005-08-25 at 09:27 -0400, ahgu wrote:
> I forced the flash_erase function to fail. I expected JFFS2 to pick up
> the returned error, mark the block bad, and put it on a bad-block
> list. Instead, I get a kernel failure (log below).
> I get a similar error when I simulate a write error.
> Am I simulating the bad block correctly? Is this the correct response?
> What is supposed to happen when NAND flash grows a bad block?
JFFS2 should handle this.
The oops trace is worthless, as it does not show the stack trace in
human-readable form (function names decoded).
Make sure that CONFIG_KALLSYMS is set in your kernel .config file.
Also information about kernel version and possibly applied MTD/JFFS2
patches is missing.
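To illustrate what "function names decoded" means: each raw address in the call trace is mapped to the nearest preceding symbol in the kernel's symbol table (for example, the System.map file produced by a kernel build). Below is a minimal sketch of that lookup, assuming a table already sorted by address. The struct ksym type, the lookup_symbol() helper, and every address and name in example_syms are made up for illustration; none of them come from the oops above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One symbol table entry: start address and function name. */
struct ksym {
	uint32_t addr;
	const char *name;
};

/* Hypothetical table, sorted by address. All values are invented. */
static const struct ksym example_syms[] = {
	{ 0x80100000u, "example_func_a" },
	{ 0x80100400u, "example_func_b" },
	{ 0x80100a00u, "example_func_c" },
};

/*
 * Return the last symbol whose start address is <= addr, or NULL if
 * addr lies below the first symbol. If off is non-NULL, it receives
 * the distance from the symbol start, giving "name+0xoff" output.
 */
static const struct ksym *lookup_symbol(const struct ksym *tab, size_t n,
					uint32_t addr, uint32_t *off)
{
	size_t lo = 0, hi = n;

	assert(tab != NULL || n == 0);

	/* Binary search for the first entry with an address > addr. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].addr <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}

	if (lo == 0)
		return NULL;	/* address is below every known symbol */

	if (off != NULL)
		*off = addr - tab[lo - 1].addr;
	return &tab[lo - 1];
}
```

Looking up 0x80100410 in the example table resolves to example_func_b with an offset of 0x10, which is the form a decoded trace line takes.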
tglx
* Re: simulate a bad NAND block cause kernel hang
From: ahgu @ 2005-08-25 15:41 UTC
To: tglx; +Cc: linux-mtd
I am using the 2.4.18 kernel.
jffs2_erase_failed(c, jeb) is the last function called before the fault
is triggered.
Where can I find the diff between 2.4.18 and 2.4.20?
void jffs2_erase_block(struct jffs2_sb_info *c, struct jffs2_eraseblock *jeb)
{
	int ret;
#ifdef __ECOS
	ret = jffs2_flash_erase(c, jeb);
	if (!ret) {
		jffs2_erase_succeeded(c, jeb);
		return;
	}
#else /* Linux */
	struct erase_info *instr;

	instr = kmalloc(sizeof(struct erase_info) + sizeof(struct erase_priv_struct), GFP_KERNEL);
	if (!instr) {
		printk(KERN_WARNING "kmalloc for struct erase_info in jffs2_erase_block failed. Refiling block for later\n");
		spin_lock(&c->erase_completion_lock);
		list_del(&jeb->list);
		list_add(&jeb->list, &c->erase_pending_list);
		c->erasing_size -= c->sector_size;
		c->dirty_size += c->sector_size;
		jeb->dirty_size = c->sector_size;
		spin_unlock(&c->erase_completion_lock);
		return;
	}

	memset(instr, 0, sizeof(*instr));

	instr->mtd = c->mtd;
	instr->addr = jeb->offset;
	instr->len = c->sector_size;
	instr->callback = jffs2_erase_callback;
	instr->priv = (unsigned long)(&instr[1]);

	((struct erase_priv_struct *)instr->priv)->jeb = jeb;
	((struct erase_priv_struct *)instr->priv)->c = c;

	/* NAND , read out the fail counter, if possible */
	if (!jffs2_can_mark_obsolete(c))
		jffs2_nand_read_failcnt(c,jeb);

	ret = c->mtd->erase(c->mtd, instr);
	if (!ret)
		return;

	kfree(instr);
#endif /* __ECOS */

	if (ret == -ENOMEM || ret == -EAGAIN) {
		/* Erase failed immediately. Refile it on the list */
		D1(printk(KERN_DEBUG "Erase at 0x%08x failed: %d. Refiling on erase_pending_list\n", jeb->offset, ret));
		spin_lock(&c->erase_completion_lock);
		list_del(&jeb->list);
		list_add(&jeb->list, &c->erase_pending_list);
		c->erasing_size -= c->sector_size;
		c->dirty_size += c->sector_size;
		jeb->dirty_size = c->sector_size;
		spin_unlock(&c->erase_completion_lock);
		return;
	}

	if (ret == -EROFS)
		printk(KERN_WARNING "Erase at 0x%08x failed immediately: -EROFS. Is the sector locked?\n", jeb->offset);
	else
		printk(KERN_WARNING "Erase at 0x%08x failed immediately: errno %d\n", jeb->offset, ret);

	jffs2_erase_failed(c, jeb);
}
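The function above treats -ENOMEM and -EAGAIN as transient and refiles the block on erase_pending_list; any other non-zero result goes to jffs2_erase_failed(). A rough userspace model of that decision is sketched below. It assumes the failure path retires the block to a bad list, which is the behaviour expected in the first message; the thread does not confirm that 2.4.18 does this. The model_fs and model_block types, the block_list enum, the bad_size counter, and model_erase_result() are all hypothetical, and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Which list a block currently sits on in this simplified model. */
enum block_list { LIST_ERASING, LIST_ERASE_PENDING, LIST_BAD };

struct model_block {
	uint32_t offset;
	enum block_list list;
};

/* Size accounting, loosely mirroring the counters in the code above. */
struct model_fs {
	uint32_t sector_size;
	uint32_t erasing_size;
	uint32_t dirty_size;
	uint32_t bad_size;	/* hypothetical counter for retired blocks */
};

/*
 * Apply the immediate return value of an erase request to a block that
 * is on the erasing list. No locking here; the code above does its list
 * updates under c->erase_completion_lock.
 */
static void model_erase_result(struct model_fs *c, struct model_block *jeb,
			       int ret)
{
	assert(c != NULL && jeb != NULL);
	assert(jeb->list == LIST_ERASING);

	if (ret == 0)
		return;		/* request accepted; completion comes later */

	c->erasing_size -= c->sector_size;

	if (ret == -ENOMEM || ret == -EAGAIN) {
		/* Transient failure: refile the block and retry later. */
		jeb->list = LIST_ERASE_PENDING;
		c->dirty_size += c->sector_size;
		return;
	}

	/* Any other error: assume the block is retired as bad. */
	jeb->list = LIST_BAD;
	c->bad_size += c->sector_size;
}
```

In this model an -EIO result moves the block to LIST_BAD and grows bad_size by one sector, while -EAGAIN leaves the block eligible for another erase attempt.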
* Re: simulate a bad NAND block cause kernel hang
From: Thomas Gleixner @ 2005-08-25 15:47 UTC
To: ahgu; +Cc: linux-mtd
On Thu, 2005-08-25 at 11:41 -0400, ahgu wrote:
> I am using the 2.4.18 kernel.
Please read
http://www.linux-mtd.infradead.org/source.html#kernelversions
> jffs2_erase_failed(c, jeb) is the last function called before the fault
> is triggered.
> Where can I find the diff between 2.4.18 and 2.4.20?
diff -urN linux-2.4.18 linux-2.4.20 >veryvery_broken_vs_very_broken.diff
2.4 kernels have no working NAND support and will never get it.
tglx