* Re: Crash in ide_do_request() on card removal
       [not found] <42EA1AB0.6070001@imc-berlin.de>
From: Steven Scholz @ 2005-08-02  9:57 UTC
To: linux-ide, linux-kernel

Steven Scholz wrote:

> Hi there,
>
> when surprisingly removing a CF ATA card (without unmounting before) I
> sometimes get kernel crashes in ide_do_request() (linux-2.6.13-rc4 on ARM):
>
> cardmgr[194]: shutting down socket 0
> cardmgr[194]: executing: './ide stop hda'
> cardmgr[194]: + umount -v /dev/hda1
> Assertion '(hwgroup->drive)' failed in drivers/ide/ide-io.c:ide_do_request(1130)
> Assertion '(drive)' failed in drivers/ide/ide-io.c:choose_drive(1035)
> Unable to handle kernel NULL pointer dereference at virtual address 00000010
> pgd = c0e34000
> [00000010] *pgd=20eb0031, *pte=00000000, *ppte=00000000
> Internal error: Oops: 17 [#1]
> Modules linked in: ide_cs pcmcia at91_cf pcmcia_core
> CPU: 0
> PC is at ide_do_request+0x100/0x480
> LR is at 0x1
> pc : [<c00f9980>]    lr : [<00000001>]    Not tainted
> ...
>
> As the assertions show "drive" is NULL (due to the card removal?) and
> thus the kernel crashes ...
>
> Upon card removal the pcmcia cardmgr tries to unmount the drive which
> disappeared.
>
> ("sometimes" above means that the rest of the time the kernel is not
> dumping core, but the umount process hangs forever.)

(I think) I found the reason for this behaviour:

Upon card removal the functions

~ # cardctl eject
ide_release(398)
ide_unregister(585): index=0
blk_unregister_queue(3603)
elv_unregister_queue(549)
ide_unregister(698)
ide_detach(164)

are called. Thus the request queue for the drive is discarded, which is fair
enough. But disk->queue would still point to a (now invalid)
request_queue_t structure. Thus if I/O requests (e.g. "umount") are started
_after_ the drive was removed, bad things can happen!

So I think we should explicitly remove the reference to that queue by doing

void blk_unregister_queue(struct gendisk *disk)
{
	request_queue_t *q = disk->queue;

	if (q && q->request_fn) {
		elv_unregister_queue(q);
		kobject_unregister(&q->kobj);
+		disk->queue = NULL;
		kobject_put(&disk->kobj);
	}
}

in drivers/block/ll_rw_blk.c

Then instead of a crash or hang one would get

~ # umount /mnt/pcmcia/
...
generic_shutdown_super(249) calling sop->put_super @ c00ac734
fat_clusters_flush(49)
generic_make_request: Trying to access nonexistent block-device hda1 (1)
FAT: bread failed in fat_clusters_flush

Thanks a million.

--
Steven
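The idea in the message above (clear the disk's back-pointer during teardown so that the submit path can notice the missing queue and fail cleanly) can be sketched in plain userspace C. This is only an illustration, not kernel code: toy_queue, toy_disk, toy_unregister_queue() and toy_submit() are hypothetical stand-ins for request_queue_t, struct gendisk, blk_unregister_queue() and the submit path, and the -EIO return value is chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for request_queue_t. */
struct toy_queue {
	int (*request_fn)(struct toy_queue *q, int sector);
};

/* Hypothetical stand-in for struct gendisk. */
struct toy_disk {
	const char *name;
	struct toy_queue *queue;	/* back-pointer that must not dangle */
};

/* Trivial request handler: pretend every request succeeds. */
static int toy_request(struct toy_queue *q, int sector)
{
	(void)q;
	(void)sector;
	return 0;
}

/* Teardown: drop the disk's reference to the queue while unregistering. */
static void toy_unregister_queue(struct toy_disk *disk)
{
	struct toy_queue *q = disk->queue;

	if (q && q->request_fn)
		disk->queue = NULL;
}

/* Submit path: report an error instead of following a stale pointer. */
static int toy_submit(struct toy_disk *disk, int sector)
{
	struct toy_queue *q = disk->queue;

	if (!q) {
		fprintf(stderr, "toy_submit: nonexistent block-device %s (%d)\n",
			disk->name, sector);
		return -EIO;
	}
	return q->request_fn(q, sector);
}

/* Demo: submit once while registered, then once after teardown. */
static int toy_demo(void)
{
	struct toy_queue q = { toy_request };
	struct toy_disk d = { "hda1", &q };

	if (toy_submit(&d, 1) != 0)
		return 1;
	toy_unregister_queue(&d);
	return toy_submit(&d, 1);	/* fails cleanly after removal */
}
```

In this sketch a late request after teardown gets an error return, which corresponds to the "Trying to access nonexistent block-device" outcome reported above rather than a crash or hang.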
* Re: Crash in ide_do_request() on card removal
From: Jens Axboe @ 2005-08-02 10:48 UTC
To: Steven Scholz
Cc: linux-ide, linux-kernel

On Tue, Aug 02 2005, Steven Scholz wrote:
> Steven Scholz wrote:
>
> >Hi there,
> >
> >when surprisingly removing a CF ATA card (without unmounting before) I
> >sometimes get kernel crashes in ide_do_request() (linux-2.6.13-rc4 on ARM):
> >
> >cardmgr[194]: shutting down socket 0
> >cardmgr[194]: executing: './ide stop hda'
> >cardmgr[194]: + umount -v /dev/hda1
> >Assertion '(hwgroup->drive)' failed in drivers/ide/ide-io.c:ide_do_request(1130)
> >Assertion '(drive)' failed in drivers/ide/ide-io.c:choose_drive(1035)
> >Unable to handle kernel NULL pointer dereference at virtual address 00000010
> >pgd = c0e34000
> >[00000010] *pgd=20eb0031, *pte=00000000, *ppte=00000000
> >Internal error: Oops: 17 [#1]
> >Modules linked in: ide_cs pcmcia at91_cf pcmcia_core
> >CPU: 0
> >PC is at ide_do_request+0x100/0x480
> >LR is at 0x1
> >pc : [<c00f9980>]    lr : [<00000001>]    Not tainted
> >...
> >
> >As the assertions show "drive" is NULL (due to the card removal?) and
> >thus the kernel crashes ...
> >
> >Upon card removal the pcmcia cardmgr tries to unmount the drive which
> >disappeared.
> >
> >("sometimes" above means that the rest of the time the kernel is not
> >dumping core, but the umount process hangs forever.)
>
> (I think) I found the reason for this behaviour:
>
> Upon card removal the functions
>
> ~ # cardctl eject
> ide_release(398)
> ide_unregister(585): index=0
> blk_unregister_queue(3603)
> elv_unregister_queue(549)
> ide_unregister(698)
> ide_detach(164)
>
> are called. Thus the request queue for the drive is discarded which is fair
> enough. But disk->queue would still point to a (now invalid)
> request_queue_t structure. Thus if I/O requests (e.g. "umount") are started
> _after_ the drive was removed bad things can happen! So I think we should
> explicitly remove the reference to that queue by doing
>
> void blk_unregister_queue(struct gendisk *disk)
> {
> 	request_queue_t *q = disk->queue;
>
> 	if (q && q->request_fn) {
> 		elv_unregister_queue(q);
> 		kobject_unregister(&q->kobj);
> +		disk->queue = NULL;
> 		kobject_put(&disk->kobj);
> 	}
> }

That's not quite true, q is not invalid after this call. It will only be
invalid when it is freed (which doesn't happen from here but rather from
the blk_cleanup_queue() call when the reference count drops to 0).

This is still not perfect, but a lot better. Does it work for you?

--- linux-2.6.12/drivers/ide/ide-disk.c~	2005-08-02 12:48:16.000000000 +0200
+++ linux-2.6.12/drivers/ide/ide-disk.c	2005-08-02 12:48:32.000000000 +0200
@@ -1054,6 +1054,7 @@
 	drive->driver_data = NULL;
 	drive->devfs_name[0] = '\0';
 	g->private_data = NULL;
+	g->disk = NULL;
 	put_disk(g);
 	kfree(idkp);
 }

--
Jens Axboe
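The lifetime rule described in the reply (an object stays valid until the last reference is dropped, regardless of which call unregistered it) can be illustrated with a small userspace C sketch. All names here (toy_queue, toy_queue_get(), toy_queue_put()) are hypothetical and are not the kernel's kobject or block-layer API; the counter is a plain int, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a request queue. */
struct toy_queue {
	int refcount;	/* plain int: this sketch is not thread-safe */
	int *released;	/* demo-only flag, set to 1 when the object is freed */
};

/* Allocate a queue holding one initial reference. */
static struct toy_queue *toy_queue_alloc(int *released)
{
	struct toy_queue *q = malloc(sizeof(*q));

	if (!q)
		return NULL;
	q->refcount = 1;
	q->released = released;
	*released = 0;
	return q;
}

/* Take an additional reference. */
static void toy_queue_get(struct toy_queue *q)
{
	q->refcount++;
}

/* Drop a reference; the object is freed only when the count reaches 0. */
static void toy_queue_put(struct toy_queue *q)
{
	if (--q->refcount == 0) {
		*q->released = 1;
		free(q);
	}
}

/*
 * Demo: two holders share the queue. Returns 1 if the queue survived the
 * first put and was released by the second, -1 on allocation failure.
 */
static int toy_refcount_demo(void)
{
	int released;
	int alive_after_first;
	struct toy_queue *q = toy_queue_alloc(&released);

	if (!q)
		return -1;
	toy_queue_get(q);		/* second holder, e.g. the disk */
	toy_queue_put(q);		/* first holder lets go: still valid */
	alive_after_first = !released;
	toy_queue_put(q);		/* last holder lets go: freed here */
	return alive_after_first && released;
}
```

Under this model a pointer held by a remaining reference holder is still safe to use after the first put; it only becomes invalid once the final put has run.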