* ide pmac breakage @ 2008-07-28 1:29 Benjamin Herrenschmidt 2008-07-28 2:02 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-28 1:29 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz; +Cc: list linux-ide, linuxppc-dev list The current ide-pmac upstream is broken. It calls media_bay_set_ide_infos() with an uninitialized "hwif" argument. It's not a trivial mistake, there's a chicken-and-egg problem in the init code in there. I've locally fixed it with this patch that i'll merge via the powerpc tree unless you have an objection. However, the machine crashes when removing the media-bay CD-ROM drive. Crash appears to be a NULL deref, possibly in elv_may_queue() though I don't have a clean backtrace yet, working on it... Cheers, Ben. diff --git a/drivers/ide/ppc/pmac.c b/drivers/ide/ppc/pmac.c index c521bf6..fa2be26 100644 --- a/drivers/ide/ppc/pmac.c +++ b/drivers/ide/ppc/pmac.c @@ -1086,6 +1086,11 @@ static int __devinit pmac_ide_setup_device(pmac_ide_hwif_t *pmif, hw_regs_t *hw) /* Make sure we have sane timings */ sanitize_timings(pmif); + host = ide_host_alloc(&d, hws); + if (host == NULL) + return -ENOMEM; + hwif = host->ports[0]; + #ifndef CONFIG_PPC64 /* XXX FIXME: Media bay stuff need re-organizing */ if (np->parent && np->parent->name @@ -1119,11 +1124,11 @@ static int __devinit pmac_ide_setup_device(pmac_ide_hwif_t *pmif, hw_regs_t *hw) pmif->mdev ? "macio" : "PCI", pmif->aapl_bus_id, pmif->mediabay ? " (mediabay)" : "", hw->irq); - rc = ide_host_add(&d, hws, &host); - if (rc) + rc = ide_host_register(host, &d, hws); + if (rc) { + ide_host_free(host); return rc; - - hwif = host->ports[0]; + } return 0; } ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-28 1:29 ide pmac breakage Benjamin Herrenschmidt @ 2008-07-28 2:02 ` Benjamin Herrenschmidt 2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-28 2:02 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz; +Cc: list linux-ide, linuxppc-dev list On Mon, 2008-07-28 at 11:29 +1000, Benjamin Herrenschmidt wrote: > The current ide-pmac upstream is broken. It calls > media_bay_set_ide_infos() with an uninitialized "hwif" argument. > > It's not a trivial mistake, there's a chicken-and-egg problem in the > init code in there. > > I've locally fixed it with this patch that i'll merge via the powerpc > tree unless you have an objection. > > However, the machine crashes when removing the media-bay CD-ROM drive. > > Crash appears to be a NULL deref, possibly in elv_may_queue() though > I don't have a clean backtrace yet, working on it... Here's a backtrace: Vector: 300 (Data Access) at [c58b7b80] pc: c014f264: elv_may_queue+0x10/0x44 lr: c0152750: get_request+0x2c/0x2c0 sp: c58b7c30 msr: 1032 dar: c dsisr: 40000000 current = 0xc58aaae0 pid = 854, comm = media-bay enter ? for help mon> t [c58b7c40] c0152750 get_request+0x2c/0x2c0 [c58b7c70] c0152a08 get_request_wait+0x24/0xec [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0 [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0 [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8 [c58b7e50] c022395c ide_cd_release+0x80/0x84 [c58b7e70] c0163650 kref_put+0x54/0x6c [c58b7e80] c0223884 ide_cd_put+0x40/0x5c [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4 [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44 [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8 [c58b7f00] c01e7424 device_del+0x104/0x198 [c58b7f20] c01e74d0 device_unregister+0x18/0x30 [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88 [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80 [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0 [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc [c58b7fd0] c00485c0 kthread+0x48/0x84 [c58b7ff0] c0011b20 kernel_thread+0x44/0x60 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-28 2:02 ` Benjamin Herrenschmidt @ 2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz 2008-07-29 5:17 ` FUJITA Tomonori 0 siblings, 1 reply; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-28 14:31 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, list linux-ide, Borislav Petkov, linuxppc-dev list On Monday 28 July 2008, Benjamin Herrenschmidt wrote: > On Mon, 2008-07-28 at 11:29 +1000, Benjamin Herrenschmidt wrote: > > The current ide-pmac upstream is broken. It calls > > media_bay_set_ide_infos() with an uninitialized "hwif" argument. > > > > It's not a trivial mistake, there's a chicken-and-egg problem in the > > init code in there. > > > > I've locally fixed it with this patch that i'll merge via the powerpc > > tree unless you have an objection. Fine with me (you may add my ACK) and thanks for fixing it. > > However, the machine crashes when removing the media-bay CD-ROM drive. > > > > Crash appears to be a NULL deref, possibly in elv_may_queue() though > > I don't have a clean backtrace yet, working on it... I wonder whether conversion from on-stack struct requests to allocated ones may have something to do with it (or not?)... > Here's a backtrace: > > Vector: 300 (Data Access) at [c58b7b80] > pc: c014f264: elv_may_queue+0x10/0x44 > lr: c0152750: get_request+0x2c/0x2c0 > sp: c58b7c30 > msr: 1032 > dar: c > dsisr: 40000000 > current = 0xc58aaae0 > pid = 854, comm = media-bay > enter ? for help > mon> t > [c58b7c40] c0152750 get_request+0x2c/0x2c0 > [c58b7c70] c0152a08 get_request_wait+0x24/0xec > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0 > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0 > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8 > [c58b7e50] c022395c ide_cd_release+0x80/0x84 > [c58b7e70] c0163650 kref_put+0x54/0x6c > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4 > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44 > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8 > [c58b7f00] c01e7424 device_del+0x104/0x198 > [c58b7f20] c01e74d0 device_unregister+0x18/0x30 > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88 > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80 > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0 > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc > [c58b7fd0] c00485c0 kthread+0x48/0x84 > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz @ 2008-07-29 5:17 ` FUJITA Tomonori 2008-07-29 5:20 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 26+ messages in thread From: FUJITA Tomonori @ 2008-07-29 5:17 UTC (permalink / raw) To: bzolnier; +Cc: linuxppc-dev, petkovbb, fujita.tomonori, linux-ide On Mon, 28 Jul 2008 16:31:56 +0200 Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote: > > > However, the machine crashes when removing the media-bay CD-ROM drive. > > > > > > Crash appears to be a NULL deref, possibly in elv_may_queue() though > > > I don't have a clean backtrace yet, working on it... > > I wonder whether conversion from on-stack struct requests to allocated > ones may have something to do with it (or not?)... It might be. q->elevator is NULL? I think that everyone goes through this path (generic_ide_remove -> ide_cd_release -> cdrom_get_disc_info ->...). With 2.6.27-rc1, I've just tried this path by removing ide-cd module, and it's fine. If q->elevator is NULL, the media-bay code might mess up the ref counting of the request queue... > > Here's a backtrace: > > > > Vector: 300 (Data Access) at [c58b7b80] > > pc: c014f264: elv_may_queue+0x10/0x44 > > lr: c0152750: get_request+0x2c/0x2c0 > > sp: c58b7c30 > > msr: 1032 > > dar: c > > dsisr: 40000000 > > current = 0xc58aaae0 > > pid = 854, comm = media-bay > > enter ? for help > > mon> t > > [c58b7c40] c0152750 get_request+0x2c/0x2c0 > > [c58b7c70] c0152a08 get_request_wait+0x24/0xec > > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0 > > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc > > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0 > > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c > > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8 > > [c58b7e50] c022395c ide_cd_release+0x80/0x84 > > [c58b7e70] c0163650 kref_put+0x54/0x6c > > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c > > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c > > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4 > > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44 > > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8 > > [c58b7f00] c01e7424 device_del+0x104/0x198 > > [c58b7f20] c01e74d0 device_unregister+0x18/0x30 > > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88 > > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80 > > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0 > > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc > > [c58b7fd0] c00485c0 kthread+0x48/0x84 > > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60 > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 5:17 ` FUJITA Tomonori @ 2008-07-29 5:20 ` Benjamin Herrenschmidt 2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-29 5:20 UTC (permalink / raw) To: FUJITA Tomonori; +Cc: linux-ide, petkovbb, bzolnier, linuxppc-dev On Tue, 2008-07-29 at 14:17 +0900, FUJITA Tomonori wrote: > If q->elevator is NULL, the media-bay code might mess up the ref > counting of the request queue... Well, all I do is call into Bart's new helpers to scan for or unregister devices ... Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 5:20 ` Benjamin Herrenschmidt @ 2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz 2008-07-29 12:02 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-29 11:41 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote: > On Tue, 2008-07-29 at 14:17 +0900, FUJITA Tomonori wrote: > > If q->elevator is NULL, the media-bay code might mess up the ref > > counting of the request queue... > > Well, all I do is call into Bart's new helpers to scan for or unregister > devices ... The switch to these helpers happened _before_ 2.6.26 and it shouldn't bring such behavior change (ditto for new IDE host addition/removal helpers)... Please try to git-bisect it when you have some time. Thanks, Bart ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz @ 2008-07-29 12:02 ` Benjamin Herrenschmidt 2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-29 12:02 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote: > > Well, all I do is call into Bart's new helpers to scan for or > unregister > > devices ... > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't > bring > such behavior change (ditto for new IDE host addition/removal > helpers)... > > Please try to git-bisect it when you have some time. Ok, I will. I worked fine when I last tried your patches. I'll see if I can track it down too. Been a bit too busy lately as you can imagine. Do you have something that exercise the same code path you can use ? Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 12:02 ` Benjamin Herrenschmidt @ 2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz 2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-29 12:04 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote: > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote: > > > Well, all I do is call into Bart's new helpers to scan for or > > unregister > > > devices ... > > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't > > bring > > such behavior change (ditto for new IDE host addition/removal > > helpers)... > > > > Please try to git-bisect it when you have some time. > > Ok, I will. I worked fine when I last tried your patches. I'll see if I > can track it down too. Been a bit too busy lately as you can imagine. > > Do you have something that exercise the same code path you can use ? I'll see if I can reproduce it with IDE warm-plug support later... Thanks, Bart ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz @ 2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-29 14:45 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote: > On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote: > > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote: > > > > Well, all I do is call into Bart's new helpers to scan for or > > > unregister > > > > devices ... > > > > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't > > > bring > > > such behavior change (ditto for new IDE host addition/removal > > > helpers)... > > > > > > Please try to git-bisect it when you have some time. > > > > Ok, I will. I worked fine when I last tried your patches. I'll see if I > > can track it down too. Been a bit too busy lately as you can imagine. > > > > Do you have something that exercise the same code path you can use ? > > I'll see if I can reproduce it with IDE warm-plug support later... OK, I reproduced it here with IDE warm-plug support (echo -n "1" > /sys/class/ide_port/ide*/delete_devices) for devices driven by ide-cd. It is also reproducible under qemu so I'm scripting it into git-bisect run now... Thanks, Bart ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz @ 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz 2008-07-29 21:30 ` Benjamin Herrenschmidt ` (2 more replies) 0 siblings, 3 replies; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-29 19:26 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote: > On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote: > > On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote: > > > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote: > > > > > Well, all I do is call into Bart's new helpers to scan for or > > > > unregister > > > > > devices ... > > > > > > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't > > > > bring > > > > such behavior change (ditto for new IDE host addition/removal > > > > helpers)... > > > > > > > > Please try to git-bisect it when you have some time. > > > > > > Ok, I will. I worked fine when I last tried your patches. I'll see if I > > > can track it down too. Been a bit too busy lately as you can imagine. > > > > > > Do you have something that exercise the same code path you can use ? > > > > I'll see if I can reproduce it with IDE warm-plug support later... > > OK, I reproduced it here with IDE warm-plug support > (echo -n "1" > /sys/class/ide_port/ide*/delete_devices) > for devices driven by ide-cd. > > It is also reproducible under qemu so I'm scripting it > into git-bisect run now... I WON!!! From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Subject: [PATCH] ide: fix regression caused by ide_device_{get,put}() addition On Monday 28 July 2008, Benjamin Herrenschmidt wrote: [...] > Vector: 300 (Data Access) at [c58b7b80] > pc: c014f264: elv_may_queue+0x10/0x44 > lr: c0152750: get_request+0x2c/0x2c0 > sp: c58b7c30 > msr: 1032 > dar: c > dsisr: 40000000 > current = 0xc58aaae0 > pid = 854, comm = media-bay > enter ? for help > mon> t > [c58b7c40] c0152750 get_request+0x2c/0x2c0 > [c58b7c70] c0152a08 get_request_wait+0x24/0xec > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0 > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0 > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8 > [c58b7e50] c022395c ide_cd_release+0x80/0x84 > [c58b7e70] c0163650 kref_put+0x54/0x6c > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4 > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44 > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8 > [c58b7f00] c01e7424 device_del+0x104/0x198 > [c58b7f20] c01e74d0 device_unregister+0x18/0x30 > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88 > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80 > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0 > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc > [c58b7fd0] c00485c0 kthread+0x48/0x84 > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60 The guilty commit turned out to be 08da591e14cf87247ec09b17c350235157a92fc3 ("ide: add ide_device_{get,put}() helpers"). ide_device_put() is called before kref_put() in ide_cd_put() so IDE device is already gone by the time ide_cd_release() is reached. Fix it by calling ide_device_get() before kref_get() and ide_device_put() after kref_put() in all affected device drivers. Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> --- drivers/ide/ide-cd.c | 10 +++++----- drivers/ide/ide-disk.c | 9 ++++----- drivers/ide/ide-floppy.c | 9 ++++----- drivers/ide/ide-tape.c | 9 ++++----- drivers/scsi/ide-scsi.c | 9 ++++----- 5 files changed, 21 insertions(+), 25 deletions(-) Index: b/drivers/ide/ide-cd.c =================================================================== --- a/drivers/ide/ide-cd.c +++ b/drivers/ide/ide-cd.c @@ -66,11 +66,11 @@ static struct cdrom_info *ide_cd_get(str mutex_lock(&idecd_ref_mutex); cd = ide_cd_g(disk); if (cd) { - kref_get(&cd->kref); - if (ide_device_get(cd->drive)) { - kref_put(&cd->kref, ide_cd_release); + if (ide_device_get(cd->drive)) cd = NULL; - } + else + kref_get(&cd->kref); + } mutex_unlock(&idecd_ref_mutex); return cd; @@ -79,8 +79,8 @@ static struct cdrom_info *ide_cd_get(str static void ide_cd_put(struct cdrom_info *cd) { mutex_lock(&idecd_ref_mutex); - ide_device_put(cd->drive); kref_put(&cd->kref, ide_cd_release); + ide_device_put(cd->drive); mutex_unlock(&idecd_ref_mutex); } Index: b/drivers/ide/ide-disk.c =================================================================== --- a/drivers/ide/ide-disk.c +++ b/drivers/ide/ide-disk.c @@ -65,11 +65,10 @@ static struct ide_disk_obj *ide_disk_get mutex_lock(&idedisk_ref_mutex); idkp = ide_disk_g(disk); if (idkp) { - kref_get(&idkp->kref); - if (ide_device_get(idkp->drive)) { - kref_put(&idkp->kref, ide_disk_release); + if (ide_device_get(idkp->drive)) idkp = NULL; - } + else + kref_get(&idkp->kref); } mutex_unlock(&idedisk_ref_mutex); return idkp; @@ -78,8 +77,8 @@ static struct ide_disk_obj *ide_disk_get static void ide_disk_put(struct ide_disk_obj *idkp) { mutex_lock(&idedisk_ref_mutex); - ide_device_put(idkp->drive); kref_put(&idkp->kref, ide_disk_release); + ide_device_put(idkp->drive); mutex_unlock(&idedisk_ref_mutex); } Index: b/drivers/ide/ide-floppy.c =================================================================== --- a/drivers/ide/ide-floppy.c +++ b/drivers/ide/ide-floppy.c @@ -167,11 +167,10 @@ static struct ide_floppy_obj *ide_floppy mutex_lock(&idefloppy_ref_mutex); floppy = ide_floppy_g(disk); if (floppy) { - kref_get(&floppy->kref); - if (ide_device_get(floppy->drive)) { - kref_put(&floppy->kref, idefloppy_cleanup_obj); + if (ide_device_get(floppy->drive)) floppy = NULL; - } + else + kref_get(&floppy->kref); } mutex_unlock(&idefloppy_ref_mutex); return floppy; @@ -180,8 +179,8 @@ static struct ide_floppy_obj *ide_floppy static void ide_floppy_put(struct ide_floppy_obj *floppy) { mutex_lock(&idefloppy_ref_mutex); - ide_device_put(floppy->drive); kref_put(&floppy->kref, idefloppy_cleanup_obj); + ide_device_put(floppy->drive); mutex_unlock(&idefloppy_ref_mutex); } Index: b/drivers/ide/ide-tape.c =================================================================== --- a/drivers/ide/ide-tape.c +++ b/drivers/ide/ide-tape.c @@ -331,11 +331,10 @@ static struct ide_tape_obj *ide_tape_get mutex_lock(&idetape_ref_mutex); tape = ide_tape_g(disk); if (tape) { - kref_get(&tape->kref); - if (ide_device_get(tape->drive)) { - kref_put(&tape->kref, ide_tape_release); + if (ide_device_get(tape->drive)) tape = NULL; - } + else + kref_get(&tape->kref); } mutex_unlock(&idetape_ref_mutex); return tape; @@ -344,8 +343,8 @@ static struct ide_tape_obj *ide_tape_get static void ide_tape_put(struct ide_tape_obj *tape) { mutex_lock(&idetape_ref_mutex); - ide_device_put(tape->drive); kref_put(&tape->kref, ide_tape_release); + ide_device_put(tape->drive); mutex_unlock(&idetape_ref_mutex); } Index: b/drivers/scsi/ide-scsi.c =================================================================== --- a/drivers/scsi/ide-scsi.c +++ b/drivers/scsi/ide-scsi.c @@ -102,11 +102,10 @@ static struct ide_scsi_obj *ide_scsi_get mutex_lock(&idescsi_ref_mutex); scsi = ide_scsi_g(disk); if (scsi) { - scsi_host_get(scsi->host); - if (ide_device_get(scsi->drive)) { - scsi_host_put(scsi->host); + if (ide_device_get(scsi->drive)) scsi = NULL; - } + else + scsi_host_get(scsi->host); } mutex_unlock(&idescsi_ref_mutex); return scsi; @@ -115,8 +114,8 @@ static struct ide_scsi_obj *ide_scsi_get static void ide_scsi_put(struct ide_scsi_obj *scsi) { mutex_lock(&idescsi_ref_mutex); - ide_device_put(scsi->drive); scsi_host_put(scsi->host); + ide_device_put(scsi->drive); mutex_unlock(&idescsi_ref_mutex); } ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz @ 2008-07-29 21:30 ` Benjamin Herrenschmidt 2008-07-30 1:23 ` FUJITA Tomonori 2008-07-30 6:57 ` Benjamin Herrenschmidt 2 siblings, 0 replies; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-29 21:30 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev > I WON!!! Heh, great :-) I'll give you patch a try, thanks ! Cheers, Ben. > From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > Subject: [PATCH] ide: fix regression caused by ide_device_{get,put}() addition > > On Monday 28 July 2008, Benjamin Herrenschmidt wrote: > > [...] > > > Vector: 300 (Data Access) at [c58b7b80] > > pc: c014f264: elv_may_queue+0x10/0x44 > > lr: c0152750: get_request+0x2c/0x2c0 > > sp: c58b7c30 > > msr: 1032 > > dar: c > > dsisr: 40000000 > > current = 0xc58aaae0 > > pid = 854, comm = media-bay > > enter ? for help > > mon> t > > [c58b7c40] c0152750 get_request+0x2c/0x2c0 > > [c58b7c70] c0152a08 get_request_wait+0x24/0xec > > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0 > > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc > > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0 > > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c > > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8 > > [c58b7e50] c022395c ide_cd_release+0x80/0x84 > > [c58b7e70] c0163650 kref_put+0x54/0x6c > > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c > > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c > > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4 > > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44 > > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8 > > [c58b7f00] c01e7424 device_del+0x104/0x198 > > [c58b7f20] c01e74d0 device_unregister+0x18/0x30 > > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88 > > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80 > > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0 > > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc > > [c58b7fd0] c00485c0 kthread+0x48/0x84 > > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60 > > The guilty commit turned out to be 08da591e14cf87247ec09b17c350235157a92fc3 > ("ide: add ide_device_{get,put}() helpers"). ide_device_put() is called > before kref_put() in ide_cd_put() so IDE device is already gone by the time > ide_cd_release() is reached. > > Fix it by calling ide_device_get() before kref_get() and ide_device_put() > after kref_put() in all affected device drivers. > > Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> > Cc: Borislav Petkov <petkovbb@gmail.com> > Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> > --- > drivers/ide/ide-cd.c | 10 +++++----- > drivers/ide/ide-disk.c | 9 ++++----- > drivers/ide/ide-floppy.c | 9 ++++----- > drivers/ide/ide-tape.c | 9 ++++----- > drivers/scsi/ide-scsi.c | 9 ++++----- > 5 files changed, 21 insertions(+), 25 deletions(-) > > Index: b/drivers/ide/ide-cd.c > =================================================================== > --- a/drivers/ide/ide-cd.c > +++ b/drivers/ide/ide-cd.c > @@ -66,11 +66,11 @@ static struct cdrom_info *ide_cd_get(str > mutex_lock(&idecd_ref_mutex); > cd = ide_cd_g(disk); > if (cd) { > - kref_get(&cd->kref); > - if (ide_device_get(cd->drive)) { > - kref_put(&cd->kref, ide_cd_release); > + if (ide_device_get(cd->drive)) > cd = NULL; > - } > + else > + kref_get(&cd->kref); > + > } > mutex_unlock(&idecd_ref_mutex); > return cd; > @@ -79,8 +79,8 @@ static struct cdrom_info *ide_cd_get(str > static void ide_cd_put(struct cdrom_info *cd) > { > mutex_lock(&idecd_ref_mutex); > - ide_device_put(cd->drive); > kref_put(&cd->kref, ide_cd_release); > + ide_device_put(cd->drive); > mutex_unlock(&idecd_ref_mutex); > } > > Index: b/drivers/ide/ide-disk.c > =================================================================== > --- a/drivers/ide/ide-disk.c > +++ b/drivers/ide/ide-disk.c > @@ -65,11 +65,10 @@ static struct ide_disk_obj *ide_disk_get > mutex_lock(&idedisk_ref_mutex); > idkp = ide_disk_g(disk); > if (idkp) { > - kref_get(&idkp->kref); > - if (ide_device_get(idkp->drive)) { > - kref_put(&idkp->kref, ide_disk_release); > + if (ide_device_get(idkp->drive)) > idkp = NULL; > - } > + else > + kref_get(&idkp->kref); > } > mutex_unlock(&idedisk_ref_mutex); > return idkp; > @@ -78,8 +77,8 @@ static struct ide_disk_obj *ide_disk_get > static void ide_disk_put(struct ide_disk_obj *idkp) > { > mutex_lock(&idedisk_ref_mutex); > - ide_device_put(idkp->drive); > kref_put(&idkp->kref, ide_disk_release); > + ide_device_put(idkp->drive); > mutex_unlock(&idedisk_ref_mutex); > } > > Index: b/drivers/ide/ide-floppy.c > =================================================================== > --- a/drivers/ide/ide-floppy.c > +++ b/drivers/ide/ide-floppy.c > @@ -167,11 +167,10 @@ static struct ide_floppy_obj *ide_floppy > mutex_lock(&idefloppy_ref_mutex); > floppy = ide_floppy_g(disk); > if (floppy) { > - kref_get(&floppy->kref); > - if (ide_device_get(floppy->drive)) { > - kref_put(&floppy->kref, idefloppy_cleanup_obj); > + if (ide_device_get(floppy->drive)) > floppy = NULL; > - } > + else > + kref_get(&floppy->kref); > } > mutex_unlock(&idefloppy_ref_mutex); > return floppy; > @@ -180,8 +179,8 @@ static struct ide_floppy_obj *ide_floppy > static void ide_floppy_put(struct ide_floppy_obj *floppy) > { > mutex_lock(&idefloppy_ref_mutex); > - ide_device_put(floppy->drive); > kref_put(&floppy->kref, idefloppy_cleanup_obj); > + ide_device_put(floppy->drive); > mutex_unlock(&idefloppy_ref_mutex); > } > > Index: b/drivers/ide/ide-tape.c > =================================================================== > --- a/drivers/ide/ide-tape.c > +++ b/drivers/ide/ide-tape.c > @@ -331,11 +331,10 @@ static struct ide_tape_obj *ide_tape_get > mutex_lock(&idetape_ref_mutex); > tape = ide_tape_g(disk); > if (tape) { > - kref_get(&tape->kref); > - if (ide_device_get(tape->drive)) { > - kref_put(&tape->kref, ide_tape_release); > + if (ide_device_get(tape->drive)) > tape = NULL; > - } > + else > + kref_get(&tape->kref); > } > mutex_unlock(&idetape_ref_mutex); > return tape; > @@ -344,8 +343,8 @@ static struct ide_tape_obj *ide_tape_get > static void ide_tape_put(struct ide_tape_obj *tape) > { > mutex_lock(&idetape_ref_mutex); > - ide_device_put(tape->drive); > kref_put(&tape->kref, ide_tape_release); > + ide_device_put(tape->drive); > mutex_unlock(&idetape_ref_mutex); > } > > Index: b/drivers/scsi/ide-scsi.c > =================================================================== > --- a/drivers/scsi/ide-scsi.c > +++ b/drivers/scsi/ide-scsi.c > @@ -102,11 +102,10 @@ static struct ide_scsi_obj *ide_scsi_get > mutex_lock(&idescsi_ref_mutex); > scsi = ide_scsi_g(disk); > if (scsi) { > - scsi_host_get(scsi->host); > - if (ide_device_get(scsi->drive)) { > - scsi_host_put(scsi->host); > + if (ide_device_get(scsi->drive)) > scsi = NULL; > - } > + else > + scsi_host_get(scsi->host); > } > mutex_unlock(&idescsi_ref_mutex); > return scsi; > @@ -115,8 +114,8 @@ static struct ide_scsi_obj *ide_scsi_get > static void ide_scsi_put(struct ide_scsi_obj *scsi) > { > mutex_lock(&idescsi_ref_mutex); > - ide_device_put(scsi->drive); > scsi_host_put(scsi->host); > + ide_device_put(scsi->drive); > mutex_unlock(&idescsi_ref_mutex); > } > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz 2008-07-29 21:30 ` Benjamin Herrenschmidt @ 2008-07-30 1:23 ` FUJITA Tomonori 2008-07-30 6:57 ` Benjamin Herrenschmidt 2 siblings, 0 replies; 26+ messages in thread From: FUJITA Tomonori @ 2008-07-30 1:23 UTC (permalink / raw) To: bzolnier; +Cc: fujita.tomonori, petkovbb, linuxppc-dev, linux-ide On Tue, 29 Jul 2008 21:26:11 +0200 Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote: > On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote: > > On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote: > > > On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote: > > > > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote: > > > > > > Well, all I do is call into Bart's new helpers to scan for or > > > > > unregister > > > > > > devices ... > > > > > > > > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't > > > > > bring > > > > > such behavior change (ditto for new IDE host addition/removal > > > > > helpers)... > > > > > > > > > > Please try to git-bisect it when you have some time. > > > > > > > > Ok, I will. I worked fine when I last tried your patches. I'll see if I > > > > can track it down too. Been a bit too busy lately as you can imagine. > > > > > > > > Do you have something that exercise the same code path you can use ? > > > > > > I'll see if I can reproduce it with IDE warm-plug support later... > > > > OK, I reproduced it here with IDE warm-plug support > > (echo -n "1" > /sys/class/ide_port/ide*/delete_devices) > > for devices driven by ide-cd. > > > > It is also reproducible under qemu so I'm scripting it > > into git-bisect run now... > > I WON!!! Great, seems that I don't need to learn how ide does ref counting. :) Thanks a lot! ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz 2008-07-29 21:30 ` Benjamin Herrenschmidt 2008-07-30 1:23 ` FUJITA Tomonori @ 2008-07-30 6:57 ` Benjamin Herrenschmidt 2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz 2 siblings, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-30 6:57 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Tue, 2008-07-29 at 21:26 +0200, Bartlomiej Zolnierkiewicz wrote: > I WON!!! Only half... It goes further and then blows up again. First problem is, this unregister interface doesn't quite convey the fact that the HW is gone and the IDE code seems to take it's sweet time figuring it out after trying some requests. Maybe something smarter can be done here ? ie, ide_set_interface_dead() :-) mediabay0: switching to 7 mediabay0: powering down media bay 0 is empty hdc: status error: status=0x00 { } ide: failed opcode was: unknown hdc: status error: status=0x00 { } ide: failed opcode was: unknown Then after this couple of failed attempts at sending commands, it crashes with the backtrace below. Another NULL dereference apparently, though the DAR value (the faulting address) has been slightly different between consecutive tests so it may be a use-after-free too. Note that there shouldn't be anything fundamentally different from ide-pmac here vs. something like pcmcia IDE cards... do you have one of these to test with ? Vector: 300 (Data Access) at [c59dfdc0] pc: c0211f78: ide_device_put+0x18/0x58 lr: c0223c34: ide_cd_put+0x40/0x5c sp: c59dfe70 msr: 9032 dar: 10 dsisr: 40000000 current = 0xc58a9880 pid = 843, comm = media-bay enter ? for help [c59dfe80] c0223c34 ide_cd_put+0x40/0x5c [c59dfea0] c02114d4 generic_ide_remove+0x28/0x3c [c59dfeb0] c01ea108 __device_release_driver+0x78/0xb4 [c59dfec0] c01ea218 device_release_driver+0x28/0x44 [c59dfee0] c01e9350 bus_remove_device+0xac/0xd8 [c59dff00] c01e77f8 device_del+0x104/0x198 [c59dff20] c01e78a4 device_unregister+0x18/0x30 [c59dff40] c0212598 __ide_port_unregister_devices+0x6c/0x88 [c59dff60] c021276c ide_port_unregister_devices+0x38/0x80 [c59dff80] c0209078 media_bay_step+0x1cc/0x5c0 [c59dffb0] c02094f8 media_bay_task+0x8c/0xcc [c59dffd0] c0048640 kthread+0x48/0x84 [c59dfff0] c0011b38 kernel_thread+0x44/0x60 ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-30 6:57 ` Benjamin Herrenschmidt @ 2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz 2008-07-30 22:49 ` Benjamin Herrenschmidt 2008-07-31 4:25 ` Benjamin Herrenschmidt 0 siblings, 2 replies; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-30 19:11 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Wednesday 30 July 2008, Benjamin Herrenschmidt wrote: > On Tue, 2008-07-29 at 21:26 +0200, Bartlomiej Zolnierkiewicz wrote: > > > I WON!!! > > Only half... Heh, I wasn't talking about fixing the issue... (hint: look up the author of the bad commit). > It goes further and then blows up again. First problem is, this > unregister interface doesn't quite convey the fact that the HW is gone > and the IDE code seems to take it's sweet time figuring it out after > trying some requests. Maybe something smarter can be done here ? ie, > ide_set_interface_dead() :-) Sure, it would be great to have controller hotplug working reliably one day but the recent patches shouldn't really be making things worse (quite the contrary) so I would prefer to find and learn more about the cause of regression first. [ However nothing prevents us from improving the hotplug support in parallel to fixing the issue so if you have some ideas that could be conveyed in form of patches please go ahead. ] > mediabay0: switching to 7 > mediabay0: powering down > media bay 0 is empty > hdc: status error: status=0x00 { } > ide: failed opcode was: unknown > hdc: status error: status=0x00 { } > ide: failed opcode was: unknown > > Then after this couple of failed attempts at sending commands, it > crashes with the backtrace below. Another NULL dereference apparently, > though the DAR value (the faulting address) has been slightly different > between consecutive tests so it may be a use-after-free too. Is it actually caused by additional reference counting on drive->gendev? IOW if you reverse the patch below instead of applying the previous fix do things work OK again? > Note that there shouldn't be anything fundamentally different from > ide-pmac here vs. something like pcmcia IDE cards... do you have one of > these to test with ? Nope and I really don't intend to have one. I count on other people to take some care of support for host drivers that they maintain/use. ;) Thanks, Bart diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c index 4e73aee..8f253e5 100644 --- a/drivers/ide/ide-cd.c +++ b/drivers/ide/ide-cd.c @@ -57,23 +57,29 @@ static DEFINE_MUTEX(idecd_ref_mutex); #define ide_cd_g(disk) \ container_of((disk)->private_data, struct cdrom_info, driver) +static void ide_cd_release(struct kref *); + static struct cdrom_info *ide_cd_get(struct gendisk *disk) { struct cdrom_info *cd = NULL; mutex_lock(&idecd_ref_mutex); cd = ide_cd_g(disk); - if (cd) + if (cd) { kref_get(&cd->kref); + if (ide_device_get(cd->drive)) { + kref_put(&cd->kref, ide_cd_release); + cd = NULL; + } + } mutex_unlock(&idecd_ref_mutex); return cd; } -static void ide_cd_release(struct kref *); - static void ide_cd_put(struct cdrom_info *cd) { mutex_lock(&idecd_ref_mutex); + ide_device_put(cd->drive); kref_put(&cd->kref, ide_cd_release); mutex_unlock(&idecd_ref_mutex); } ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz @ 2008-07-30 22:49 ` Benjamin Herrenschmidt 2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz 2008-07-31 4:25 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-30 22:49 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Wed, 2008-07-30 at 21:11 +0200, Bartlomiej Zolnierkiewicz wrote: > > > Note that there shouldn't be anything fundamentally different from > > ide-pmac here vs. something like pcmcia IDE cards... do you have one > of > > these to test with ? > > Nope and I really don't intend to have one. I count on other people > to take some care of support for host drivers that they > maintain/use. ;) Hrm... that's not a very sane approach. You have some infrastructure for adding / removing devices, in fact changes things in that area even, and not a single piece of HW to exercise those code path ? I don't ask you to get a powermac with a media-bay, but ide_cs seems to be a pretty important one that's part of what the ide maintainer should take care of... And I suspect it's going to exercise the same code path as mediabay. Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-30 22:49 ` Benjamin Herrenschmidt @ 2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz 2008-07-31 1:09 ` Benjamin Herrenschmidt 2008-07-31 8:49 ` Alan Cox 0 siblings, 2 replies; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-31 0:48 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Thursday 31 July 2008, Benjamin Herrenschmidt wrote: > On Wed, 2008-07-30 at 21:11 +0200, Bartlomiej Zolnierkiewicz wrote: > > > > > Note that there shouldn't be anything fundamentally different from > > > ide-pmac here vs. something like pcmcia IDE cards... do you have one > > of > > > these to test with ? > > > > Nope and I really don't intend to have one. I count on other people > > to take some care of support for host drivers that they > > maintain/use. ;) > > Hrm... that's not a very sane approach. You have some infrastructure for > adding / removing devices, in fact changes things in that area even, and There seems to be some confusion between warm-plugging of IDE devices and hot-plugging of IDE devices. > not a single piece of HW to exercise those code path ? I don't ask you > to get a powermac with a media-bay, but ide_cs seems to be a pretty > important one that's part of what the ide maintainer should take care > of... And I suspect it's going to exercise the same code path as > mediabay. I'm open for offers of co-maintaince of the code (which includes taking over particular host drivers) and since you seem to have pretty good insight into what ide maintainer should be doing I presume that you want to be added to MAINTAINERS? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz @ 2008-07-31 1:09 ` Benjamin Herrenschmidt 2008-07-31 3:17 ` Bartlomiej Zolnierkiewicz 2008-07-31 8:49 ` Alan Cox 1 sibling, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-31 1:09 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Thu, 2008-07-31 at 02:48 +0200, Bartlomiej Zolnierkiewicz wrote: > There seems to be some confusion between warm-plugging of IDE devices > and hot-plugging of IDE devices. > > > not a single piece of HW to exercise those code path ? I don't ask you > > to get a powermac with a media-bay, but ide_cs seems to be a pretty > > important one that's part of what the ide maintainer should take care > > of... And I suspect it's going to exercise the same code path as > > mediabay. > > I'm open for offers of co-maintaince of the code (which includes taking > over particular host drivers) and since you seem to have pretty good > insight into what ide maintainer should be doing I presume that you want > to be added to MAINTAINERS? Should I laugh ? Seriously here. Those things used to work, you broke them. It may be worthwile cleanup / improvements, but at the end of the day, if you are the maintainer for drivers/ide, and things as fundamental as supporting proper ide_cs seems to be totally part of your job. I'm not saying ide_cs is broken today (though I suspect it -may- be suffering from similar breakage to media-bay), however, I'm reacting to your apparent lack of interest in making sure these things work. Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 1:09 ` Benjamin Herrenschmidt @ 2008-07-31 3:17 ` Bartlomiej Zolnierkiewicz 0 siblings, 0 replies; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-31 3:17 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Thursday 31 July 2008, Benjamin Herrenschmidt wrote: > On Thu, 2008-07-31 at 02:48 +0200, Bartlomiej Zolnierkiewicz wrote: > > There seems to be some confusion between warm-plugging of IDE devices > > and hot-plugging of IDE devices. > > > > > not a single piece of HW to exercise those code path ? I don't ask you > > > to get a powermac with a media-bay, but ide_cs seems to be a pretty > > > important one that's part of what the ide maintainer should take care > > > of... And I suspect it's going to exercise the same code path as > > > mediabay. > > > > I'm open for offers of co-maintaince of the code (which includes taking > > over particular host drivers) and since you seem to have pretty good > > insight into what ide maintainer should be doing I presume that you want > > to be added to MAINTAINERS? > > Should I laugh ? > > Seriously here. Those things used to work, you broke them. That's jumping to conlusions as you haven't yet proven that it was really me. :) Which would be great because than I can finally start fixing the damage. > It may be worthwile cleanup / improvements, but at the end of the day, > if you are the maintainer for drivers/ide, and things as fundamental as > supporting proper ide_cs seems to be totally part of your job. I'm not Maybe I'm really completely unsuited for the job. I even didn't know that proper supporting of _one_ particular host driver is a _fundamental_ part of the _subsystem_ maintainer's job... > saying ide_cs is broken today (though I suspect it -may- be suffering > from similar breakage to media-bay), however, I'm reacting to your > apparent lack of interest in making sure these things work. If only all people showed so much interest as *IDE*PMAC* Maintainer in making sure that things work (instead of i.e. telling people what should they be doing in their _limited_ _private_ time) we would have nothing to worry about! ;) Can we now go back to fixing ide-pmac breakage? Pretty please. It is not unlikely that it in the time it took for the last four mails it would have been fixed already. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz 2008-07-31 1:09 ` Benjamin Herrenschmidt @ 2008-07-31 8:49 ` Alan Cox 2008-07-31 9:11 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 26+ messages in thread From: Alan Cox @ 2008-07-31 8:49 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, petkovbb, linuxppc-dev, linux-ide > There seems to be some confusion between warm-plugging of IDE devices > and hot-plugging of IDE devices. > > > not a single piece of HW to exercise those code path ? I don't ask you > > to get a powermac with a media-bay, but ide_cs seems to be a pretty > > important one that's part of what the ide maintainer should take care > > of... And I suspect it's going to exercise the same code path as > > mediabay. That confuses people sometimes - ide_cs is a controller hotplug not a device hotplug ... ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 8:49 ` Alan Cox @ 2008-07-31 9:11 ` Benjamin Herrenschmidt 2008-07-31 9:13 ` Alan Cox 0 siblings, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-31 9:11 UTC (permalink / raw) To: Alan Cox Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz, linuxppc-dev On Thu, 2008-07-31 at 09:49 +0100, Alan Cox wrote: > > There seems to be some confusion between warm-plugging of IDE devices > > and hot-plugging of IDE devices. > > > > > not a single piece of HW to exercise those code path ? I don't ask you > > > to get a powermac with a media-bay, but ide_cs seems to be a pretty > > > important one that's part of what the ide maintainer should take care > > > of... And I suspect it's going to exercise the same code path as > > > mediabay. > > That confuses people sometimes - ide_cs is a controller hotplug not a > device hotplug ... I could make the media-bay look like a controller hotplug if it was going to make things easier... Cheers, Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 9:11 ` Benjamin Herrenschmidt @ 2008-07-31 9:13 ` Alan Cox 2008-07-31 9:48 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 26+ messages in thread From: Alan Cox @ 2008-07-31 9:13 UTC (permalink / raw) To: benh Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz, linuxppc-dev > I could make the media-bay look like a controller hotplug if it was > going to make things easier... I'm not sure it will. It may do nowdays, but the older IDE code historically was fairly broken for both cases except in 2.4. Also faking it as controller hotplug is the wrong path for libata which does real drive hot plug. Alan ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 9:13 ` Alan Cox @ 2008-07-31 9:48 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-31 9:48 UTC (permalink / raw) To: Alan Cox Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz, linuxppc-dev On Thu, 2008-07-31 at 10:13 +0100, Alan Cox wrote: > > I could make the media-bay look like a controller hotplug if it was > > going to make things easier... > > I'm not sure it will. It may do nowdays, but the older IDE code > historically was fairly broken for both cases except in 2.4. Also faking > it as controller hotplug is the wrong path for libata which does real > drive hot plug. Yeah, that was my line of thinking initially, also the fact that it has the nice side effect of keeping the minor number stable. Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz 2008-07-30 22:49 ` Benjamin Herrenschmidt @ 2008-07-31 4:25 ` Benjamin Herrenschmidt 2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz 1 sibling, 1 reply; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-31 4:25 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev > Is it actually caused by additional reference counting on drive->gendev? > IOW if you reverse the patch below instead of applying the previous fix > do things work OK again? > > > Note that there shouldn't be anything fundamentally different from > > ide-pmac here vs. something like pcmcia IDE cards... do you have one of > > these to test with ? > > Nope and I really don't intend to have one. I count on other people > to take some care of support for host drivers that they maintain/use. ;) Reverting the patch below does the job. Thanks. Ben. > Thanks, > Bart > > diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c > index 4e73aee..8f253e5 100644 > --- a/drivers/ide/ide-cd.c > +++ b/drivers/ide/ide-cd.c > @@ -57,23 +57,29 @@ static DEFINE_MUTEX(idecd_ref_mutex); > #define ide_cd_g(disk) \ > container_of((disk)->private_data, struct cdrom_info, driver) > > +static void ide_cd_release(struct kref *); > + > static struct cdrom_info *ide_cd_get(struct gendisk *disk) > { > struct cdrom_info *cd = NULL; > > mutex_lock(&idecd_ref_mutex); > cd = ide_cd_g(disk); > - if (cd) > + if (cd) { > kref_get(&cd->kref); > + if (ide_device_get(cd->drive)) { > + kref_put(&cd->kref, ide_cd_release); > + cd = NULL; > + } > + } > mutex_unlock(&idecd_ref_mutex); > return cd; > } > > -static void ide_cd_release(struct kref *); > - > static void ide_cd_put(struct cdrom_info *cd) > { > mutex_lock(&idecd_ref_mutex); > + ide_device_put(cd->drive); > kref_put(&cd->kref, ide_cd_release); > mutex_unlock(&idecd_ref_mutex); > } ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 4:25 ` Benjamin Herrenschmidt @ 2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz 2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz 0 siblings, 1 reply; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-07-31 10:51 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Thursday 31 July 2008, Benjamin Herrenschmidt wrote: >=20 > > Is it actually caused by additional reference counting on drive->gendev? > > IOW if you reverse the patch below instead of applying the previous fix > > do things work OK again? > >=20 > > > Note that there shouldn't be anything fundamentally different from > > > ide-pmac here vs. something like pcmcia IDE cards... do you have one = of > > > these to test with ? > >=20 > > Nope and I really don't intend to have one. I count on other people > > to take some care of support for host drivers that they maintain/use. ;) >=20 > Reverting the patch below does the job. Thanks. Thanks, this narrows down the problem pretty nicely. [ Unfortunately we cannot revert the whole patch as it would break unloading of IDE host drivers modules so I still need you help on fixing this one. ] Lets get back to the oops: Vector: 300 (Data Access) at [c59dfdc0] =C2=A0 =C2=A0 pc: c0211f78: ide_device_put+0x18/0x58 =C2=A0 =C2=A0 lr: c0223c34: ide_cd_put+0x40/0x5c =C2=A0 =C2=A0 sp: c59dfe70 =C2=A0 =C2=A0msr: 9032 =C2=A0 =C2=A0dar: 10 =C2=A0dsisr: 40000000 =C2=A0 current =3D 0xc58a9880 =C2=A0 =C2=A0 pid =C2=A0 =3D 843, comm =3D media-bay enter ? for help [c59dfe80] c0223c34 ide_cd_put+0x40/0x5c [c59dfea0] c02114d4 generic_ide_remove+0x28/0x3c [c59dfeb0] c01ea108 __device_release_driver+0x78/0xb4 [c59dfec0] c01ea218 device_release_driver+0x28/0x44 [c59dfee0] c01e9350 bus_remove_device+0xac/0xd8 [c59dff00] c01e77f8 device_del+0x104/0x198 [c59dff20] c01e78a4 device_unregister+0x18/0x30 [c59dff40] c0212598 __ide_port_unregister_devices+0x6c/0x88 [c59dff60] c021276c ide_port_unregister_devices+0x38/0x80 [c59dff80] c0209078 media_bay_step+0x1cc/0x5c0 [c59dffb0] c02094f8 media_bay_task+0x8c/0xcc [c59dffd0] c0048640 kthread+0x48/0x84 [c59dfff0] c0011b38 kernel_thread+0x44/0x60 On a fresh look at ide_device_put(), ide_host_alloc() and pmac.c it may be that the above oops is actually media-bay specific. ide_device_put(): =2E.. struct device *host_dev =3D drive->hwif->host->dev[0]; struct module *module =3D host_dev ? host_dev->driver->owner : NULL; =2E.. ide_host_alloc(): =2E.. if (hws[0]) host->dev[0] =3D hws[0]->dev; =2E.. pmac.c: =2E.. pmac_ide_macio_attach(struct macio_dev *mdev, const struct of_device_id *ma= tch) =2E.. dev_set_drvdata(&mdev->ofdev.dev, pmif); memset(&hw, 0, sizeof(hw)); pmac_ide_init_ports(&hw, pmif->regbase); hw.irq =3D irq; hw.dev =3D &mdev->bus->pdev->dev; hw.parent =3D &mdev->ofdev.dev; =2E.. pmac macio is unique in using different devices for hwif->dev / host->dev (hw.dev) and hwif->gendev.parent / dev_set_drvdata() (hw.parent) [ I has been actually wondering why is it so for some time...? ] Thus we may be hitting oops in ide_device_put() on host_dev->driver because hw.dev is used as host->dev for pmac macio in ide_device_put() while we really want to use hw.parent. =46ix should be as simple as: =2D host->dev[0] =3D hws[0]->dev; + host->dev[0] =3D hws[0]->parent ? hws[0]->parent : hws[0]-= >dev; Could you please try it together with my previous patch for ide_device_{get,put}()? Bart ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz @ 2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz 2008-08-01 22:26 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 26+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-08-01 10:54 UTC (permalink / raw) To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Thursday 31 July 2008, Bartlomiej Zolnierkiewicz wrote: [...] Sorry if my mails were a bit harsh but nobody likes to be pushed around. [ It is not like I don't want to add proper hot-plugging support or do test on more hardware but my time schedule is already tight enough and there are still more fundamental things to address (which take priority). ] > - host->dev[0] = hws[0]->dev; > + host->dev[0] = hws[0]->parent ? hws[0]->parent : hws[0]->dev; > > Could you please try it together with my previous patch for > ide_device_{get,put}()? Please test it when you have some time. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage 2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz @ 2008-08-01 22:26 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 26+ messages in thread From: Benjamin Herrenschmidt @ 2008-08-01 22:26 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev On Fri, 2008-08-01 at 12:54 +0200, Bartlomiej Zolnierkiewicz wrote: > On Thursday 31 July 2008, Bartlomiej Zolnierkiewicz wrote: > > [...] > > Sorry if my mails were a bit harsh but nobody likes to be pushed around. > > [ It is not like I don't want to add proper hot-plugging support or do test > on more hardware but my time schedule is already tight enough and there are > still more fundamental things to address (which take priority). ] > > > - host->dev[0] = hws[0]->dev; > > + host->dev[0] = hws[0]->parent ? hws[0]->parent : hws[0]->dev; > > > > Could you please try it together with my previous patch for > > ide_device_{get,put}()? > > Please test it when you have some time. The problem in that case is access to the HW :-) I have plenty ide-pmac based machines but only one or two old laptop (ie. Paul has one too) with a media-bay and those are in the office. So I'll test on Monday when I get there, unless I get a chance to pop by on Sunday but that's unlikely. Ben. ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2008-08-01 22:26 UTC | newest] Thread overview: 26+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-07-28 1:29 ide pmac breakage Benjamin Herrenschmidt 2008-07-28 2:02 ` Benjamin Herrenschmidt 2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz 2008-07-29 5:17 ` FUJITA Tomonori 2008-07-29 5:20 ` Benjamin Herrenschmidt 2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz 2008-07-29 12:02 ` Benjamin Herrenschmidt 2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz 2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz 2008-07-29 21:30 ` Benjamin Herrenschmidt 2008-07-30 1:23 ` FUJITA Tomonori 2008-07-30 6:57 ` Benjamin Herrenschmidt 2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz 2008-07-30 22:49 ` Benjamin Herrenschmidt 2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz 2008-07-31 1:09 ` Benjamin Herrenschmidt 2008-07-31 3:17 ` Bartlomiej Zolnierkiewicz 2008-07-31 8:49 ` Alan Cox 2008-07-31 9:11 ` Benjamin Herrenschmidt 2008-07-31 9:13 ` Alan Cox 2008-07-31 9:48 ` Benjamin Herrenschmidt 2008-07-31 4:25 ` Benjamin Herrenschmidt 2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz 2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz 2008-08-01 22:26 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).