* Re: Crash in ide_do_request() on card removal
[not found] <42EA1AB0.6070001@imc-berlin.de>
@ 2005-08-02 9:57 ` Steven Scholz
2005-08-02 10:48 ` Jens Axboe
0 siblings, 1 reply; 2+ messages in thread
From: Steven Scholz @ 2005-08-02 9:57 UTC (permalink / raw)
To: linux-ide, linux-kernel
Steven Scholz wrote:
> Hi there,
>
> when surprisingly removing a CF ATA card (without unmounting before) I
> sometimes get kernel crashes in ide_do_request() (linux-2.6.13-rc4 on ARM):
>
> cardmgr[194]: shutting down socket 0
> cardmgr[194]: executing: './ide stop hda'
> cardmgr[194]: + umount -v /dev/hda1
> Assertion '(hwgroup->drive)' failed in
> drivers/ide/ide-io.c:ide_do_request(1130)
> Assertion '(drive)' failed in drivers/ide/ide-io.c:choose_drive(1035)
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000010
> pgd = c0e34000
> [00000010] *pgd=20eb0031, *pte=00000000, *ppte=00000000
> Internal error: Oops: 17 [#1]
> Modules linked in: ide_cs pcmcia at91_cf pcmcia_core
> CPU: 0
> PC is at ide_do_request+0x100/0x480
> LR is at 0x1
> pc : [<c00f9980>] lr : [<00000001>] Not tainted
> ...
>
> As the assertions show "drive" is NULL (due to the card removal?) and
> thus the kernel crashes ...
>
> Upon card removal the pcmcia cardmgr tries to unmount the drive which
> disapeared.
>
> ("sometimes" above means that the rest of the time the kernel is not
> dumping core, but the umount process hangs forever.)
(I think) I found the reason for this behaviour:
Upon card removal the functions
~ # cardctl eject
ide_release(398)
ide_unregister(585): index=0
blk_unregister_queue(3603)
elv_unregister_queue(549)
ide_unregister(698)
ide_detach(164)
are called. Thus the request queue for the drive is discarded which is fair
enough. But disk->queue would still point to a (now invalid) request_queue_t
structure. Thus if I/O requests (e.g. "umount") are started _after_ the drive
was removed bad things can happen! So I think we should explicitly remove the
reference to that queue by doing
void blk_unregister_queue(struct gendisk *disk)
{
request_queue_t *q = disk->queue;
if (q && q->request_fn) {
elv_unregister_queue(q);
kobject_unregister(&q->kobj);
+ disk->queue = NULL;
kobject_put(&disk->kobj);
}
}
in drivers/block/ll_rw_blk.c
Then instead of a crash or hang one would get
~ # umount /mnt/pcmcia/
...
generic_shutdown_super(249) calling sop->put_super @ c00ac734
fat_clusters_flush(49)
generic_make_request: Trying to access nonexistent block-device hda1 (1)
FAT: bread failed in fat_clusters_flush
Thanks a million.
--
Steven
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: Crash in ide_do_request() on card removal
2005-08-02 9:57 ` Crash in ide_do_request() on card removal Steven Scholz
@ 2005-08-02 10:48 ` Jens Axboe
0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2005-08-02 10:48 UTC (permalink / raw)
To: Steven Scholz; +Cc: linux-ide, linux-kernel
On Tue, Aug 02 2005, Steven Scholz wrote:
> Steven Scholz wrote:
>
> >Hi there,
> >
> >when surprisingly removing a CF ATA card (without unmounting before) I
> >sometimes get kernel crashes in ide_do_request() (linux-2.6.13-rc4 on ARM):
> >
> >cardmgr[194]: shutting down socket 0
> >cardmgr[194]: executing: './ide stop hda'
> >cardmgr[194]: + umount -v /dev/hda1
> >Assertion '(hwgroup->drive)' failed in
> >drivers/ide/ide-io.c:ide_do_request(1130)
> >Assertion '(drive)' failed in drivers/ide/ide-io.c:choose_drive(1035)
> >Unable to handle kernel NULL pointer dereference at virtual address
> >00000010
> >pgd = c0e34000
> >[00000010] *pgd=20eb0031, *pte=00000000, *ppte=00000000
> >Internal error: Oops: 17 [#1]
> >Modules linked in: ide_cs pcmcia at91_cf pcmcia_core
> >CPU: 0
> >PC is at ide_do_request+0x100/0x480
> >LR is at 0x1
> >pc : [<c00f9980>] lr : [<00000001>] Not tainted
> >...
> >
> >As the assertions show "drive" is NULL (due to the card removal?) and
> >thus the kernel crashes ...
> >
> >Upon card removal the pcmcia cardmgr tries to unmount the drive which
> >disapeared.
> >
> >("sometimes" above means that the rest of the time the kernel is not
> >dumping core, but the umount process hangs forever.)
>
> (I think) I found the reason for this behaviour:
>
> Upon card removal the functions
>
> ~ # cardctl eject
> ide_release(398)
> ide_unregister(585): index=0
> blk_unregister_queue(3603)
> elv_unregister_queue(549)
> ide_unregister(698)
> ide_detach(164)
>
> are called. Thus the request queue for the drive is discarded which is fair
> enough. But disk->queue would still point to a (now invalid)
> request_queue_t structure. Thus if I/O requests (e.g. "umount") are started
> _after_ the drive was removed bad things can happen! So I think we should
> explicitly remove the reference to that queue by doing
>
> void blk_unregister_queue(struct gendisk *disk)
> {
> request_queue_t *q = disk->queue;
>
> if (q && q->request_fn) {
> elv_unregister_queue(q);
> kobject_unregister(&q->kobj);
> + disk->queue = NULL;
> kobject_put(&disk->kobj);
> }
> }
That's not quite true, q is not invalid after this call. It will only be
invalid when it is freed (which doesn't happen from here but rather from
the blk_cleanup_queue() call when the reference count drops to 0).
This is still not perfect, but a lot better. Does it work for you?
--- linux-2.6.12/drivers/ide/ide-disk.c~ 2005-08-02 12:48:16.000000000 +0200
+++ linux-2.6.12/drivers/ide/ide-disk.c 2005-08-02 12:48:32.000000000 +0200
@@ -1054,6 +1054,7 @@
drive->driver_data = NULL;
drive->devfs_name[0] = '\0';
g->private_data = NULL;
+ g->disk = NULL;
put_disk(g);
kfree(idkp);
}
--
Jens Axboe
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2005-08-02 10:48 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <42EA1AB0.6070001@imc-berlin.de>
2005-08-02 9:57 ` Crash in ide_do_request() on card removal Steven Scholz
2005-08-02 10:48 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox