* Re: Crash in ide_do_request() on card removal
From: Steven Scholz @ 2005-08-02  9:57 UTC
  To: linux-ide, linux-kernel

Steven Scholz wrote:

> Hi there,
> 
> when surprise-removing a CF ATA card (without unmounting it first) I 
> sometimes get kernel crashes in ide_do_request() (linux-2.6.13-rc4 on ARM):
> 
> cardmgr[194]: shutting down socket 0
> cardmgr[194]: executing: './ide stop hda'
> cardmgr[194]: + umount -v /dev/hda1
> Assertion '(hwgroup->drive)' failed in 
> drivers/ide/ide-io.c:ide_do_request(1130)
> Assertion '(drive)' failed in drivers/ide/ide-io.c:choose_drive(1035)
> Unable to handle kernel NULL pointer dereference at virtual address 
> 00000010
> pgd = c0e34000
> [00000010] *pgd=20eb0031, *pte=00000000, *ppte=00000000
> Internal error: Oops: 17 [#1]
> Modules linked in: ide_cs pcmcia at91_cf pcmcia_core
> CPU: 0
> PC is at ide_do_request+0x100/0x480
> LR is at 0x1
> pc : [<c00f9980>]    lr : [<00000001>]    Not tainted
> ...
> 
> As the assertions show "drive" is NULL (due to the card removal?) and 
> thus the kernel crashes ...
> 
> Upon card removal the pcmcia cardmgr tries to unmount the drive, which 
> has already disappeared.
> 
> ("sometimes" above means that the rest of the time the kernel is not 
> dumping core, but the umount process hangs forever.)

(I think) I found the reason for this behaviour:

Upon card removal the functions

~ # cardctl eject
ide_release(398)
ide_unregister(585): index=0
blk_unregister_queue(3603)
elv_unregister_queue(549)
ide_unregister(698)
ide_detach(164)

are called. Thus the request queue for the drive is discarded, which is fair 
enough. But disk->queue still points to a (now invalid) request_queue_t 
structure. So if I/O requests (e.g. from "umount") are started _after_ the 
drive was removed, they go through that stale pointer and can crash or hang. 
I think we should explicitly remove the reference to that queue by doing

void blk_unregister_queue(struct gendisk *disk)
{
	request_queue_t *q = disk->queue;

	if (q && q->request_fn) {
		elv_unregister_queue(q);
		kobject_unregister(&q->kobj);
+		disk->queue = NULL;
		kobject_put(&disk->kobj);
	}
}

in drivers/block/ll_rw_blk.c

Then instead of a crash or hang one would get

~ # umount /mnt/pcmcia/
...
generic_shutdown_super(249) calling sop->put_super @ c00ac734
fat_clusters_flush(49)
generic_make_request: Trying to access nonexistent block-device hda1 (1)
FAT: bread failed in fat_clusters_flush
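
The idea behind the one-line change can be modelled in user space. What
follows is only a minimal sketch, not kernel code: struct model_queue,
struct model_disk, model_unregister_queue() and model_submit_io() are
hypothetical stand-ins invented for illustration. It assumes the submit path
checks the disk's queue pointer before using it, which is what the
"Trying to access nonexistent block-device" message above suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for request_queue_t; not the kernel definition. */
struct model_queue {
	int requests_seen;
};

/* Hypothetical stand-in for struct gendisk. */
struct model_disk {
	const char *name;
	struct model_queue *queue;	/* models disk->queue */
};

/* Models the proposed change: drop the disk's pointer when unregistering. */
void model_unregister_queue(struct model_disk *disk)
{
	if (disk->queue)
		disk->queue = NULL;
}

/*
 * Models the submit path: refuse I/O when the disk has no queue.
 * Returns 0 if the request was queued, -1 if the device is gone.
 */
int model_submit_io(struct model_disk *disk)
{
	struct model_queue *q = disk->queue;

	if (!q) {
		fprintf(stderr, "model: nonexistent block-device %s\n",
			disk->name);
		return -1;
	}
	q->requests_seen++;
	return 0;
}
```

With the pointer cleared, a late request fails with an error instead of
touching a queue that is being torn down. Whether clearing it in
blk_unregister_queue() is the right place is a separate question.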

Thanks a million.

--
Steven

* Re: Crash in ide_do_request() on card removal
From: Jens Axboe @ 2005-08-02 10:48 UTC
  To: Steven Scholz; +Cc: linux-ide, linux-kernel

On Tue, Aug 02 2005, Steven Scholz wrote:
> Steven Scholz wrote:
> 
> >Hi there,
> >
> >when surprise-removing a CF ATA card (without unmounting it first) I 
> >sometimes get kernel crashes in ide_do_request() (linux-2.6.13-rc4 on ARM):
> >
> >cardmgr[194]: shutting down socket 0
> >cardmgr[194]: executing: './ide stop hda'
> >cardmgr[194]: + umount -v /dev/hda1
> >Assertion '(hwgroup->drive)' failed in 
> >drivers/ide/ide-io.c:ide_do_request(1130)
> >Assertion '(drive)' failed in drivers/ide/ide-io.c:choose_drive(1035)
> >Unable to handle kernel NULL pointer dereference at virtual address 
> >00000010
> >pgd = c0e34000
> >[00000010] *pgd=20eb0031, *pte=00000000, *ppte=00000000
> >Internal error: Oops: 17 [#1]
> >Modules linked in: ide_cs pcmcia at91_cf pcmcia_core
> >CPU: 0
> >PC is at ide_do_request+0x100/0x480
> >LR is at 0x1
> >pc : [<c00f9980>]    lr : [<00000001>]    Not tainted
> >...
> >
> >As the assertions show "drive" is NULL (due to the card removal?) and 
> >thus the kernel crashes ...
> >
> >Upon card removal the pcmcia cardmgr tries to unmount the drive, which 
> >has already disappeared.
> >
> >("sometimes" above means that the rest of the time the kernel is not 
> >dumping core, but the umount process hangs forever.)
> 
> (I think) I found the reason for this behaviour:
> 
> Upon card removal the functions
> 
> ~ # cardctl eject
> ide_release(398)
> ide_unregister(585): index=0
> blk_unregister_queue(3603)
> elv_unregister_queue(549)
> ide_unregister(698)
> ide_detach(164)
> 
> are called. Thus the request queue for the drive is discarded which is fair 
> enough. But disk->queue would still point to a (now invalid) 
> request_queue_t structure. Thus if I/O requests (e.g. "umount") are started 
> _after_ the drive was removed bad things can happen! So I think we should 
> explicitly remove the reference to that queue by doing
> 
> void blk_unregister_queue(struct gendisk *disk)
> {
> 	request_queue_t *q = disk->queue;
> 
> 	if (q && q->request_fn) {
> 		elv_unregister_queue(q);
> 		kobject_unregister(&q->kobj);
> +		disk->queue = NULL;
> 		kobject_put(&disk->kobj);
> 	}
> }

That's not quite true: q is not invalid after this call. It only becomes
invalid when it is freed (which doesn't happen from here but rather from
the blk_cleanup_queue() call when the reference count drops to 0).

The patch below is still not perfect, but a lot better. Does it work for you?

--- linux-2.6.12/drivers/ide/ide-disk.c~	2005-08-02 12:48:16.000000000 +0200
+++ linux-2.6.12/drivers/ide/ide-disk.c	2005-08-02 12:48:32.000000000 +0200
@@ -1054,6 +1054,7 @@
 	drive->driver_data = NULL;
 	drive->devfs_name[0] = '\0';
 	g->private_data = NULL;
+	g->queue = NULL;
 	put_disk(g);
 	kfree(idkp);
 }
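
The lifetime rule described above (unregistering does not free the queue; it
goes away only when the last reference is dropped) can be illustrated with a
small user-space model. This is a sketch under stated assumptions, not the
block layer's code: struct rc_queue, rc_queue_alloc(), rc_queue_get() and
rc_queue_put() are hypothetical names, and the freed flag exists only so
that the release can be observed from outside.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted queue; not the kernel's request_queue_t. */
struct rc_queue {
	int refcount;	/* number of current holders */
	int *freed;	/* illustration only: set to 1 when memory is released */
};

/* Allocate a queue holding one reference, owned by the caller. */
struct rc_queue *rc_queue_alloc(int *freed)
{
	struct rc_queue *q = malloc(sizeof(*q));

	if (!q)
		return NULL;
	q->refcount = 1;
	q->freed = freed;
	*freed = 0;
	return q;
}

/* Take an additional reference, e.g. for a disk that points at the queue. */
void rc_queue_get(struct rc_queue *q)
{
	q->refcount++;
}

/* Drop one reference; the memory is released only when the count hits 0. */
void rc_queue_put(struct rc_queue *q)
{
	if (--q->refcount == 0) {
		*q->freed = 1;
		free(q);
	}
}
```

In this model a holder that still owns a reference can keep using the
pointer safely; the pointer becomes dangerous only for code that uses it
without holding a reference.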

-- 
Jens Axboe