* ide pmac breakage
@ 2008-07-28 1:29 Benjamin Herrenschmidt
2008-07-28 2:02 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-28 1:29 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz; +Cc: list linux-ide, linuxppc-dev list
The current ide-pmac upstream is broken. It calls
media_bay_set_ide_infos() with an uninitialized "hwif" argument.
It's not a trivial mistake, there's a chicken-and-egg problem in the
init code in there.
I've locally fixed it with this patch that i'll merge via the powerpc
tree unless you have an objection.
However, the machine crashes when removing the media-bay CD-ROM drive.
Crash appears to be a NULL deref, possibly in elv_may_queue() though
I don't have a clean backtrace yet, working on it...
Cheers,
Ben.
diff --git a/drivers/ide/ppc/pmac.c b/drivers/ide/ppc/pmac.c
index c521bf6..fa2be26 100644
--- a/drivers/ide/ppc/pmac.c
+++ b/drivers/ide/ppc/pmac.c
@@ -1086,6 +1086,11 @@ static int __devinit pmac_ide_setup_device(pmac_ide_hwif_t *pmif, hw_regs_t *hw)
/* Make sure we have sane timings */
sanitize_timings(pmif);
+ host = ide_host_alloc(&d, hws);
+ if (host == NULL)
+ return -ENOMEM;
+ hwif = host->ports[0];
+
#ifndef CONFIG_PPC64
/* XXX FIXME: Media bay stuff need re-organizing */
if (np->parent && np->parent->name
@@ -1119,11 +1124,11 @@ static int __devinit pmac_ide_setup_device(pmac_ide_hwif_t *pmif, hw_regs_t *hw)
pmif->mdev ? "macio" : "PCI", pmif->aapl_bus_id,
pmif->mediabay ? " (mediabay)" : "", hw->irq);
- rc = ide_host_add(&d, hws, &host);
- if (rc)
+ rc = ide_host_register(host, &d, hws);
+ if (rc) {
+ ide_host_free(host);
return rc;
-
- hwif = host->ports[0];
+ }
return 0;
}
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-28 1:29 ide pmac breakage Benjamin Herrenschmidt
@ 2008-07-28 2:02 ` Benjamin Herrenschmidt
2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-28 2:02 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz; +Cc: list linux-ide, linuxppc-dev list
On Mon, 2008-07-28 at 11:29 +1000, Benjamin Herrenschmidt wrote:
> The current ide-pmac upstream is broken. It calls
> media_bay_set_ide_infos() with an uninitialized "hwif" argument.
>
> It's not a trivial mistake, there's a chicken-and-egg problem in the
> init code in there.
>
> I've locally fixed it with this patch that i'll merge via the powerpc
> tree unless you have an objection.
>
> However, the machine crashes when removing the media-bay CD-ROM drive.
>
> Crash appears to be a NULL deref, possibly in elv_may_queue() though
> I don't have a clean backtrace yet, working on it...
Here's a backtrace:
Vector: 300 (Data Access) at [c58b7b80]
pc: c014f264: elv_may_queue+0x10/0x44
lr: c0152750: get_request+0x2c/0x2c0
sp: c58b7c30
msr: 1032
dar: c
dsisr: 40000000
current = 0xc58aaae0
pid = 854, comm = media-bay
enter ? for help
mon> t
[c58b7c40] c0152750 get_request+0x2c/0x2c0
[c58b7c70] c0152a08 get_request_wait+0x24/0xec
[c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0
[c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc
[c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0
[c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c
[c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8
[c58b7e50] c022395c ide_cd_release+0x80/0x84
[c58b7e70] c0163650 kref_put+0x54/0x6c
[c58b7e80] c0223884 ide_cd_put+0x40/0x5c
[c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c
[c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4
[c58b7ec0] c01e9e44 device_release_driver+0x28/0x44
[c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8
[c58b7f00] c01e7424 device_del+0x104/0x198
[c58b7f20] c01e74d0 device_unregister+0x18/0x30
[c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88
[c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80
[c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0
[c58b7fb0] c0209124 media_bay_task+0x8c/0xcc
[c58b7fd0] c00485c0 kthread+0x48/0x84
[c58b7ff0] c0011b20 kernel_thread+0x44/0x60
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-28 2:02 ` Benjamin Herrenschmidt
@ 2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz
2008-07-29 5:17 ` FUJITA Tomonori
0 siblings, 1 reply; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-28 14:31 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, list linux-ide, Borislav Petkov,
linuxppc-dev list
On Monday 28 July 2008, Benjamin Herrenschmidt wrote:
> On Mon, 2008-07-28 at 11:29 +1000, Benjamin Herrenschmidt wrote:
> > The current ide-pmac upstream is broken. It calls
> > media_bay_set_ide_infos() with an uninitialized "hwif" argument.
> >
> > It's not a trivial mistake, there's a chicken-and-egg problem in the
> > init code in there.
> >
> > I've locally fixed it with this patch that i'll merge via the powerpc
> > tree unless you have an objection.
Fine with me (you may add my ACK) and thanks for fixing it.
> > However, the machine crashes when removing the media-bay CD-ROM drive.
> >
> > Crash appears to be a NULL deref, possibly in elv_may_queue() though
> > I don't have a clean backtrace yet, working on it...
I wonder whether conversion from on-stack struct requests to allocated
ones may have something to do with it (or not?)...
> Here's a backtrace:
>
> Vector: 300 (Data Access) at [c58b7b80]
> pc: c014f264: elv_may_queue+0x10/0x44
> lr: c0152750: get_request+0x2c/0x2c0
> sp: c58b7c30
> msr: 1032
> dar: c
> dsisr: 40000000
> current = 0xc58aaae0
> pid = 854, comm = media-bay
> enter ? for help
> mon> t
> [c58b7c40] c0152750 get_request+0x2c/0x2c0
> [c58b7c70] c0152a08 get_request_wait+0x24/0xec
> [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0
> [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc
> [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0
> [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c
> [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8
> [c58b7e50] c022395c ide_cd_release+0x80/0x84
> [c58b7e70] c0163650 kref_put+0x54/0x6c
> [c58b7e80] c0223884 ide_cd_put+0x40/0x5c
> [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c
> [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4
> [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44
> [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8
> [c58b7f00] c01e7424 device_del+0x104/0x198
> [c58b7f20] c01e74d0 device_unregister+0x18/0x30
> [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88
> [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80
> [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0
> [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc
> [c58b7fd0] c00485c0 kthread+0x48/0x84
> [c58b7ff0] c0011b20 kernel_thread+0x44/0x60
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz
@ 2008-07-29 5:17 ` FUJITA Tomonori
2008-07-29 5:20 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 26+ messages in thread
From: FUJITA Tomonori @ 2008-07-29 5:17 UTC (permalink / raw)
To: bzolnier; +Cc: linuxppc-dev, petkovbb, fujita.tomonori, linux-ide
On Mon, 28 Jul 2008 16:31:56 +0200
Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:
> > > However, the machine crashes when removing the media-bay CD-ROM drive.
> > >
> > > Crash appears to be a NULL deref, possibly in elv_may_queue() though
> > > I don't have a clean backtrace yet, working on it...
>
> I wonder whether conversion from on-stack struct requests to allocated
> ones may have something to do with it (or not?)...
It might be. q->elevator is NULL?
I think that everyone goes through this path (generic_ide_remove ->
ide_cd_release -> cdrom_get_disc_info ->...). With 2.6.27-rc1, I've
just tried this path by removing ide-cd module, and it's fine.
If q->elevator is NULL, the media-bay code might mess up the ref
counting of the request queue...
> > Here's a backtrace:
> >
> > Vector: 300 (Data Access) at [c58b7b80]
> > pc: c014f264: elv_may_queue+0x10/0x44
> > lr: c0152750: get_request+0x2c/0x2c0
> > sp: c58b7c30
> > msr: 1032
> > dar: c
> > dsisr: 40000000
> > current = 0xc58aaae0
> > pid = 854, comm = media-bay
> > enter ? for help
> > mon> t
> > [c58b7c40] c0152750 get_request+0x2c/0x2c0
> > [c58b7c70] c0152a08 get_request_wait+0x24/0xec
> > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0
> > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc
> > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0
> > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c
> > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8
> > [c58b7e50] c022395c ide_cd_release+0x80/0x84
> > [c58b7e70] c0163650 kref_put+0x54/0x6c
> > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c
> > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c
> > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4
> > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44
> > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8
> > [c58b7f00] c01e7424 device_del+0x104/0x198
> > [c58b7f20] c01e74d0 device_unregister+0x18/0x30
> > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88
> > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80
> > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0
> > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc
> > [c58b7fd0] c00485c0 kthread+0x48/0x84
> > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 5:17 ` FUJITA Tomonori
@ 2008-07-29 5:20 ` Benjamin Herrenschmidt
2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-29 5:20 UTC (permalink / raw)
To: FUJITA Tomonori; +Cc: linux-ide, petkovbb, bzolnier, linuxppc-dev
On Tue, 2008-07-29 at 14:17 +0900, FUJITA Tomonori wrote:
> If q->elevator is NULL, the media-bay code might mess up the ref
> counting of the request queue...
Well, all I do is call into Bart's new helpers to scan for or unregister
devices ...
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 5:20 ` Benjamin Herrenschmidt
@ 2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz
2008-07-29 12:02 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-29 11:41 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote:
> On Tue, 2008-07-29 at 14:17 +0900, FUJITA Tomonori wrote:
> > If q->elevator is NULL, the media-bay code might mess up the ref
> > counting of the request queue...
>
> Well, all I do is call into Bart's new helpers to scan for or unregister
> devices ...
The switch to these helpers happened _before_ 2.6.26 and it shouldn't bring
such behavior change (ditto for new IDE host addition/removal helpers)...
Please try to git-bisect it when you have some time.
Thanks,
Bart
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz
@ 2008-07-29 12:02 ` Benjamin Herrenschmidt
2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-29 12:02 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote:
> > Well, all I do is call into Bart's new helpers to scan for or
> unregister
> > devices ...
>
> The switch to these helpers happened _before_ 2.6.26 and it shouldn't
> bring
> such behavior change (ditto for new IDE host addition/removal
> helpers)...
>
> Please try to git-bisect it when you have some time.
Ok, I will. I worked fine when I last tried your patches. I'll see if I
can track it down too. Been a bit too busy lately as you can imagine.
Do you have something that exercise the same code path you can use ?
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 12:02 ` Benjamin Herrenschmidt
@ 2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz
2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-29 12:04 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote:
> On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > Well, all I do is call into Bart's new helpers to scan for or
> > unregister
> > > devices ...
> >
> > The switch to these helpers happened _before_ 2.6.26 and it shouldn't
> > bring
> > such behavior change (ditto for new IDE host addition/removal
> > helpers)...
> >
> > Please try to git-bisect it when you have some time.
>
> Ok, I will. I worked fine when I last tried your patches. I'll see if I
> can track it down too. Been a bit too busy lately as you can imagine.
>
> Do you have something that exercise the same code path you can use ?
I'll see if I can reproduce it with IDE warm-plug support later...
Thanks,
Bart
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz
@ 2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz
2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-29 14:45 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote:
> > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > Well, all I do is call into Bart's new helpers to scan for or
> > > unregister
> > > > devices ...
> > >
> > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't
> > > bring
> > > such behavior change (ditto for new IDE host addition/removal
> > > helpers)...
> > >
> > > Please try to git-bisect it when you have some time.
> >
> > Ok, I will. I worked fine when I last tried your patches. I'll see if I
> > can track it down too. Been a bit too busy lately as you can imagine.
> >
> > Do you have something that exercise the same code path you can use ?
>
> I'll see if I can reproduce it with IDE warm-plug support later...
OK, I reproduced it here with IDE warm-plug support
(echo -n "1" > /sys/class/ide_port/ide*/delete_devices)
for devices driven by ide-cd.
It is also reproducible under qemu so I'm scripting it
into git-bisect run now...
Thanks,
Bart
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz
@ 2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz
2008-07-29 21:30 ` Benjamin Herrenschmidt
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-29 19:26 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> > On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote:
> > > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > > Well, all I do is call into Bart's new helpers to scan for or
> > > > unregister
> > > > > devices ...
> > > >
> > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't
> > > > bring
> > > > such behavior change (ditto for new IDE host addition/removal
> > > > helpers)...
> > > >
> > > > Please try to git-bisect it when you have some time.
> > >
> > > Ok, I will. I worked fine when I last tried your patches. I'll see if I
> > > can track it down too. Been a bit too busy lately as you can imagine.
> > >
> > > Do you have something that exercise the same code path you can use ?
> >
> > I'll see if I can reproduce it with IDE warm-plug support later...
>
> OK, I reproduced it here with IDE warm-plug support
> (echo -n "1" > /sys/class/ide_port/ide*/delete_devices)
> for devices driven by ide-cd.
>
> It is also reproducible under qemu so I'm scripting it
> into git-bisect run now...
I WON!!!
From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Subject: [PATCH] ide: fix regression caused by ide_device_{get,put}() addition
On Monday 28 July 2008, Benjamin Herrenschmidt wrote:
[...]
> Vector: 300 (Data Access) at [c58b7b80]
> pc: c014f264: elv_may_queue+0x10/0x44
> lr: c0152750: get_request+0x2c/0x2c0
> sp: c58b7c30
> msr: 1032
> dar: c
> dsisr: 40000000
> current = 0xc58aaae0
> pid = 854, comm = media-bay
> enter ? for help
> mon> t
> [c58b7c40] c0152750 get_request+0x2c/0x2c0
> [c58b7c70] c0152a08 get_request_wait+0x24/0xec
> [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0
> [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc
> [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0
> [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c
> [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8
> [c58b7e50] c022395c ide_cd_release+0x80/0x84
> [c58b7e70] c0163650 kref_put+0x54/0x6c
> [c58b7e80] c0223884 ide_cd_put+0x40/0x5c
> [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c
> [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4
> [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44
> [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8
> [c58b7f00] c01e7424 device_del+0x104/0x198
> [c58b7f20] c01e74d0 device_unregister+0x18/0x30
> [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88
> [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80
> [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0
> [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc
> [c58b7fd0] c00485c0 kthread+0x48/0x84
> [c58b7ff0] c0011b20 kernel_thread+0x44/0x60
The guilty commit turned out to be 08da591e14cf87247ec09b17c350235157a92fc3
("ide: add ide_device_{get,put}() helpers"). ide_device_put() is called
before kref_put() in ide_cd_put() so IDE device is already gone by the time
ide_cd_release() is reached.
Fix it by calling ide_device_get() before kref_get() and ide_device_put()
after kref_put() in all affected device drivers.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
---
drivers/ide/ide-cd.c | 10 +++++-----
drivers/ide/ide-disk.c | 9 ++++-----
drivers/ide/ide-floppy.c | 9 ++++-----
drivers/ide/ide-tape.c | 9 ++++-----
drivers/scsi/ide-scsi.c | 9 ++++-----
5 files changed, 21 insertions(+), 25 deletions(-)
Index: b/drivers/ide/ide-cd.c
===================================================================
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -66,11 +66,11 @@ static struct cdrom_info *ide_cd_get(str
mutex_lock(&idecd_ref_mutex);
cd = ide_cd_g(disk);
if (cd) {
- kref_get(&cd->kref);
- if (ide_device_get(cd->drive)) {
- kref_put(&cd->kref, ide_cd_release);
+ if (ide_device_get(cd->drive))
cd = NULL;
- }
+ else
+ kref_get(&cd->kref);
+
}
mutex_unlock(&idecd_ref_mutex);
return cd;
@@ -79,8 +79,8 @@ static struct cdrom_info *ide_cd_get(str
static void ide_cd_put(struct cdrom_info *cd)
{
mutex_lock(&idecd_ref_mutex);
- ide_device_put(cd->drive);
kref_put(&cd->kref, ide_cd_release);
+ ide_device_put(cd->drive);
mutex_unlock(&idecd_ref_mutex);
}
Index: b/drivers/ide/ide-disk.c
===================================================================
--- a/drivers/ide/ide-disk.c
+++ b/drivers/ide/ide-disk.c
@@ -65,11 +65,10 @@ static struct ide_disk_obj *ide_disk_get
mutex_lock(&idedisk_ref_mutex);
idkp = ide_disk_g(disk);
if (idkp) {
- kref_get(&idkp->kref);
- if (ide_device_get(idkp->drive)) {
- kref_put(&idkp->kref, ide_disk_release);
+ if (ide_device_get(idkp->drive))
idkp = NULL;
- }
+ else
+ kref_get(&idkp->kref);
}
mutex_unlock(&idedisk_ref_mutex);
return idkp;
@@ -78,8 +77,8 @@ static struct ide_disk_obj *ide_disk_get
static void ide_disk_put(struct ide_disk_obj *idkp)
{
mutex_lock(&idedisk_ref_mutex);
- ide_device_put(idkp->drive);
kref_put(&idkp->kref, ide_disk_release);
+ ide_device_put(idkp->drive);
mutex_unlock(&idedisk_ref_mutex);
}
Index: b/drivers/ide/ide-floppy.c
===================================================================
--- a/drivers/ide/ide-floppy.c
+++ b/drivers/ide/ide-floppy.c
@@ -167,11 +167,10 @@ static struct ide_floppy_obj *ide_floppy
mutex_lock(&idefloppy_ref_mutex);
floppy = ide_floppy_g(disk);
if (floppy) {
- kref_get(&floppy->kref);
- if (ide_device_get(floppy->drive)) {
- kref_put(&floppy->kref, idefloppy_cleanup_obj);
+ if (ide_device_get(floppy->drive))
floppy = NULL;
- }
+ else
+ kref_get(&floppy->kref);
}
mutex_unlock(&idefloppy_ref_mutex);
return floppy;
@@ -180,8 +179,8 @@ static struct ide_floppy_obj *ide_floppy
static void ide_floppy_put(struct ide_floppy_obj *floppy)
{
mutex_lock(&idefloppy_ref_mutex);
- ide_device_put(floppy->drive);
kref_put(&floppy->kref, idefloppy_cleanup_obj);
+ ide_device_put(floppy->drive);
mutex_unlock(&idefloppy_ref_mutex);
}
Index: b/drivers/ide/ide-tape.c
===================================================================
--- a/drivers/ide/ide-tape.c
+++ b/drivers/ide/ide-tape.c
@@ -331,11 +331,10 @@ static struct ide_tape_obj *ide_tape_get
mutex_lock(&idetape_ref_mutex);
tape = ide_tape_g(disk);
if (tape) {
- kref_get(&tape->kref);
- if (ide_device_get(tape->drive)) {
- kref_put(&tape->kref, ide_tape_release);
+ if (ide_device_get(tape->drive))
tape = NULL;
- }
+ else
+ kref_get(&tape->kref);
}
mutex_unlock(&idetape_ref_mutex);
return tape;
@@ -344,8 +343,8 @@ static struct ide_tape_obj *ide_tape_get
static void ide_tape_put(struct ide_tape_obj *tape)
{
mutex_lock(&idetape_ref_mutex);
- ide_device_put(tape->drive);
kref_put(&tape->kref, ide_tape_release);
+ ide_device_put(tape->drive);
mutex_unlock(&idetape_ref_mutex);
}
Index: b/drivers/scsi/ide-scsi.c
===================================================================
--- a/drivers/scsi/ide-scsi.c
+++ b/drivers/scsi/ide-scsi.c
@@ -102,11 +102,10 @@ static struct ide_scsi_obj *ide_scsi_get
mutex_lock(&idescsi_ref_mutex);
scsi = ide_scsi_g(disk);
if (scsi) {
- scsi_host_get(scsi->host);
- if (ide_device_get(scsi->drive)) {
- scsi_host_put(scsi->host);
+ if (ide_device_get(scsi->drive))
scsi = NULL;
- }
+ else
+ scsi_host_get(scsi->host);
}
mutex_unlock(&idescsi_ref_mutex);
return scsi;
@@ -115,8 +114,8 @@ static struct ide_scsi_obj *ide_scsi_get
static void ide_scsi_put(struct ide_scsi_obj *scsi)
{
mutex_lock(&idescsi_ref_mutex);
- ide_device_put(scsi->drive);
scsi_host_put(scsi->host);
+ ide_device_put(scsi->drive);
mutex_unlock(&idescsi_ref_mutex);
}
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz
@ 2008-07-29 21:30 ` Benjamin Herrenschmidt
2008-07-30 1:23 ` FUJITA Tomonori
2008-07-30 6:57 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-29 21:30 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
> I WON!!!
Heh, great :-)
I'll give you patch a try, thanks !
Cheers,
Ben.
> From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Subject: [PATCH] ide: fix regression caused by ide_device_{get,put}() addition
>
> On Monday 28 July 2008, Benjamin Herrenschmidt wrote:
>
> [...]
>
> > Vector: 300 (Data Access) at [c58b7b80]
> > pc: c014f264: elv_may_queue+0x10/0x44
> > lr: c0152750: get_request+0x2c/0x2c0
> > sp: c58b7c30
> > msr: 1032
> > dar: c
> > dsisr: 40000000
> > current = 0xc58aaae0
> > pid = 854, comm = media-bay
> > enter ? for help
> > mon> t
> > [c58b7c40] c0152750 get_request+0x2c/0x2c0
> > [c58b7c70] c0152a08 get_request_wait+0x24/0xec
> > [c58b7cc0] c0225674 ide_cd_queue_pc+0x58/0x1a0
> > [c58b7d40] c022672c ide_cdrom_packet+0x9c/0xdc
> > [c58b7d70] c0261810 cdrom_get_disc_info+0x60/0xd0
> > [c58b7dc0] c026208c cdrom_mrw_exit+0x1c/0x11c
> > [c58b7e30] c0260f7c unregister_cdrom+0x84/0xe8
> > [c58b7e50] c022395c ide_cd_release+0x80/0x84
> > [c58b7e70] c0163650 kref_put+0x54/0x6c
> > [c58b7e80] c0223884 ide_cd_put+0x40/0x5c
> > [c58b7ea0] c0211100 generic_ide_remove+0x28/0x3c
> > [c58b7eb0] c01e9d34 __device_release_driver+0x78/0xb4
> > [c58b7ec0] c01e9e44 device_release_driver+0x28/0x44
> > [c58b7ee0] c01e8f7c bus_remove_device+0xac/0xd8
> > [c58b7f00] c01e7424 device_del+0x104/0x198
> > [c58b7f20] c01e74d0 device_unregister+0x18/0x30
> > [c58b7f40] c02121c4 __ide_port_unregister_devices+0x6c/0x88
> > [c58b7f60] c0212398 ide_port_unregister_devices+0x38/0x80
> > [c58b7f80] c0208ca4 media_bay_step+0x1cc/0x5c0
> > [c58b7fb0] c0209124 media_bay_task+0x8c/0xcc
> > [c58b7fd0] c00485c0 kthread+0x48/0x84
> > [c58b7ff0] c0011b20 kernel_thread+0x44/0x60
>
> The guilty commit turned out to be 08da591e14cf87247ec09b17c350235157a92fc3
> ("ide: add ide_device_{get,put}() helpers"). ide_device_put() is called
> before kref_put() in ide_cd_put() so IDE device is already gone by the time
> ide_cd_release() is reached.
>
> Fix it by calling ide_device_get() before kref_get() and ide_device_put()
> after kref_put() in all affected device drivers.
>
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Borislav Petkov <petkovbb@gmail.com>
> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> ---
> drivers/ide/ide-cd.c | 10 +++++-----
> drivers/ide/ide-disk.c | 9 ++++-----
> drivers/ide/ide-floppy.c | 9 ++++-----
> drivers/ide/ide-tape.c | 9 ++++-----
> drivers/scsi/ide-scsi.c | 9 ++++-----
> 5 files changed, 21 insertions(+), 25 deletions(-)
>
> Index: b/drivers/ide/ide-cd.c
> ===================================================================
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -66,11 +66,11 @@ static struct cdrom_info *ide_cd_get(str
> mutex_lock(&idecd_ref_mutex);
> cd = ide_cd_g(disk);
> if (cd) {
> - kref_get(&cd->kref);
> - if (ide_device_get(cd->drive)) {
> - kref_put(&cd->kref, ide_cd_release);
> + if (ide_device_get(cd->drive))
> cd = NULL;
> - }
> + else
> + kref_get(&cd->kref);
> +
> }
> mutex_unlock(&idecd_ref_mutex);
> return cd;
> @@ -79,8 +79,8 @@ static struct cdrom_info *ide_cd_get(str
> static void ide_cd_put(struct cdrom_info *cd)
> {
> mutex_lock(&idecd_ref_mutex);
> - ide_device_put(cd->drive);
> kref_put(&cd->kref, ide_cd_release);
> + ide_device_put(cd->drive);
> mutex_unlock(&idecd_ref_mutex);
> }
>
> Index: b/drivers/ide/ide-disk.c
> ===================================================================
> --- a/drivers/ide/ide-disk.c
> +++ b/drivers/ide/ide-disk.c
> @@ -65,11 +65,10 @@ static struct ide_disk_obj *ide_disk_get
> mutex_lock(&idedisk_ref_mutex);
> idkp = ide_disk_g(disk);
> if (idkp) {
> - kref_get(&idkp->kref);
> - if (ide_device_get(idkp->drive)) {
> - kref_put(&idkp->kref, ide_disk_release);
> + if (ide_device_get(idkp->drive))
> idkp = NULL;
> - }
> + else
> + kref_get(&idkp->kref);
> }
> mutex_unlock(&idedisk_ref_mutex);
> return idkp;
> @@ -78,8 +77,8 @@ static struct ide_disk_obj *ide_disk_get
> static void ide_disk_put(struct ide_disk_obj *idkp)
> {
> mutex_lock(&idedisk_ref_mutex);
> - ide_device_put(idkp->drive);
> kref_put(&idkp->kref, ide_disk_release);
> + ide_device_put(idkp->drive);
> mutex_unlock(&idedisk_ref_mutex);
> }
>
> Index: b/drivers/ide/ide-floppy.c
> ===================================================================
> --- a/drivers/ide/ide-floppy.c
> +++ b/drivers/ide/ide-floppy.c
> @@ -167,11 +167,10 @@ static struct ide_floppy_obj *ide_floppy
> mutex_lock(&idefloppy_ref_mutex);
> floppy = ide_floppy_g(disk);
> if (floppy) {
> - kref_get(&floppy->kref);
> - if (ide_device_get(floppy->drive)) {
> - kref_put(&floppy->kref, idefloppy_cleanup_obj);
> + if (ide_device_get(floppy->drive))
> floppy = NULL;
> - }
> + else
> + kref_get(&floppy->kref);
> }
> mutex_unlock(&idefloppy_ref_mutex);
> return floppy;
> @@ -180,8 +179,8 @@ static struct ide_floppy_obj *ide_floppy
> static void ide_floppy_put(struct ide_floppy_obj *floppy)
> {
> mutex_lock(&idefloppy_ref_mutex);
> - ide_device_put(floppy->drive);
> kref_put(&floppy->kref, idefloppy_cleanup_obj);
> + ide_device_put(floppy->drive);
> mutex_unlock(&idefloppy_ref_mutex);
> }
>
> Index: b/drivers/ide/ide-tape.c
> ===================================================================
> --- a/drivers/ide/ide-tape.c
> +++ b/drivers/ide/ide-tape.c
> @@ -331,11 +331,10 @@ static struct ide_tape_obj *ide_tape_get
> mutex_lock(&idetape_ref_mutex);
> tape = ide_tape_g(disk);
> if (tape) {
> - kref_get(&tape->kref);
> - if (ide_device_get(tape->drive)) {
> - kref_put(&tape->kref, ide_tape_release);
> + if (ide_device_get(tape->drive))
> tape = NULL;
> - }
> + else
> + kref_get(&tape->kref);
> }
> mutex_unlock(&idetape_ref_mutex);
> return tape;
> @@ -344,8 +343,8 @@ static struct ide_tape_obj *ide_tape_get
> static void ide_tape_put(struct ide_tape_obj *tape)
> {
> mutex_lock(&idetape_ref_mutex);
> - ide_device_put(tape->drive);
> kref_put(&tape->kref, ide_tape_release);
> + ide_device_put(tape->drive);
> mutex_unlock(&idetape_ref_mutex);
> }
>
> Index: b/drivers/scsi/ide-scsi.c
> ===================================================================
> --- a/drivers/scsi/ide-scsi.c
> +++ b/drivers/scsi/ide-scsi.c
> @@ -102,11 +102,10 @@ static struct ide_scsi_obj *ide_scsi_get
> mutex_lock(&idescsi_ref_mutex);
> scsi = ide_scsi_g(disk);
> if (scsi) {
> - scsi_host_get(scsi->host);
> - if (ide_device_get(scsi->drive)) {
> - scsi_host_put(scsi->host);
> + if (ide_device_get(scsi->drive))
> scsi = NULL;
> - }
> + else
> + scsi_host_get(scsi->host);
> }
> mutex_unlock(&idescsi_ref_mutex);
> return scsi;
> @@ -115,8 +114,8 @@ static struct ide_scsi_obj *ide_scsi_get
> static void ide_scsi_put(struct ide_scsi_obj *scsi)
> {
> mutex_lock(&idescsi_ref_mutex);
> - ide_device_put(scsi->drive);
> scsi_host_put(scsi->host);
> + ide_device_put(scsi->drive);
> mutex_unlock(&idescsi_ref_mutex);
> }
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz
2008-07-29 21:30 ` Benjamin Herrenschmidt
@ 2008-07-30 1:23 ` FUJITA Tomonori
2008-07-30 6:57 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 26+ messages in thread
From: FUJITA Tomonori @ 2008-07-30 1:23 UTC (permalink / raw)
To: bzolnier; +Cc: fujita.tomonori, petkovbb, linuxppc-dev, linux-ide
On Tue, 29 Jul 2008 21:26:11 +0200
Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:
> On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> > On Tuesday 29 July 2008, Bartlomiej Zolnierkiewicz wrote:
> > > On Tuesday 29 July 2008, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2008-07-29 at 13:41 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > > > Well, all I do is call into Bart's new helpers to scan for or
> > > > > unregister
> > > > > > devices ...
> > > > >
> > > > > The switch to these helpers happened _before_ 2.6.26 and it shouldn't
> > > > > bring
> > > > > such behavior change (ditto for new IDE host addition/removal
> > > > > helpers)...
> > > > >
> > > > > Please try to git-bisect it when you have some time.
> > > >
> > > > Ok, I will. I worked fine when I last tried your patches. I'll see if I
> > > > can track it down too. Been a bit too busy lately as you can imagine.
> > > >
> > > > Do you have something that exercise the same code path you can use ?
> > >
> > > I'll see if I can reproduce it with IDE warm-plug support later...
> >
> > OK, I reproduced it here with IDE warm-plug support
> > (echo -n "1" > /sys/class/ide_port/ide*/delete_devices)
> > for devices driven by ide-cd.
> >
> > It is also reproducible under qemu so I'm scripting it
> > into git-bisect run now...
>
> I WON!!!
Great, seems that I don't need to learn how ide does ref counting. :)
Thanks a lot!
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz
2008-07-29 21:30 ` Benjamin Herrenschmidt
2008-07-30 1:23 ` FUJITA Tomonori
@ 2008-07-30 6:57 ` Benjamin Herrenschmidt
2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz
2 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-30 6:57 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Tue, 2008-07-29 at 21:26 +0200, Bartlomiej Zolnierkiewicz wrote:
> I WON!!!
Only half...
It goes further and then blows up again. First problem is, this
unregister interface doesn't quite convey the fact that the HW is gone
and the IDE code seems to take it's sweet time figuring it out after
trying some requests. Maybe something smarter can be done here ? ie,
ide_set_interface_dead() :-)
mediabay0: switching to 7
mediabay0: powering down
media bay 0 is empty
hdc: status error: status=0x00 { }
ide: failed opcode was: unknown
hdc: status error: status=0x00 { }
ide: failed opcode was: unknown
Then after this couple of failed attempts at sending commands, it
crashes with the backtrace below. Another NULL dereference apparently,
though the DAR value (the faulting address) has been slightly different
between consecutive tests so it may be a use-after-free too.
Note that there shouldn't be anything fundamentally different from
ide-pmac here vs. something like pcmcia IDE cards... do you have one of
these to test with ?
Vector: 300 (Data Access) at [c59dfdc0]
pc: c0211f78: ide_device_put+0x18/0x58
lr: c0223c34: ide_cd_put+0x40/0x5c
sp: c59dfe70
msr: 9032
dar: 10
dsisr: 40000000
current = 0xc58a9880
pid = 843, comm = media-bay
enter ? for help
[c59dfe80] c0223c34 ide_cd_put+0x40/0x5c
[c59dfea0] c02114d4 generic_ide_remove+0x28/0x3c
[c59dfeb0] c01ea108 __device_release_driver+0x78/0xb4
[c59dfec0] c01ea218 device_release_driver+0x28/0x44
[c59dfee0] c01e9350 bus_remove_device+0xac/0xd8
[c59dff00] c01e77f8 device_del+0x104/0x198
[c59dff20] c01e78a4 device_unregister+0x18/0x30
[c59dff40] c0212598 __ide_port_unregister_devices+0x6c/0x88
[c59dff60] c021276c ide_port_unregister_devices+0x38/0x80
[c59dff80] c0209078 media_bay_step+0x1cc/0x5c0
[c59dffb0] c02094f8 media_bay_task+0x8c/0xcc
[c59dffd0] c0048640 kthread+0x48/0x84
[c59dfff0] c0011b38 kernel_thread+0x44/0x60
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-30 6:57 ` Benjamin Herrenschmidt
@ 2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz
2008-07-30 22:49 ` Benjamin Herrenschmidt
2008-07-31 4:25 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-30 19:11 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Wednesday 30 July 2008, Benjamin Herrenschmidt wrote:
> On Tue, 2008-07-29 at 21:26 +0200, Bartlomiej Zolnierkiewicz wrote:
>
> > I WON!!!
>
> Only half...
Heh, I wasn't talking about fixing the issue...
(hint: look up the author of the bad commit).
> It goes further and then blows up again. First problem is, this
> unregister interface doesn't quite convey the fact that the HW is gone
> and the IDE code seems to take it's sweet time figuring it out after
> trying some requests. Maybe something smarter can be done here ? ie,
> ide_set_interface_dead() :-)
Sure, it would be great to have controller hotplug working reliably
one day but the recent patches shouldn't really be making things worse
(quite the contrary) so I would prefer to find and learn more about
the cause of regression first.
[ However nothing prevents us from improving the hotplug support in
parallel to fixing the issue so if you have some ideas that could
be conveyed in form of patches please go ahead. ]
> mediabay0: switching to 7
> mediabay0: powering down
> media bay 0 is empty
> hdc: status error: status=0x00 { }
> ide: failed opcode was: unknown
> hdc: status error: status=0x00 { }
> ide: failed opcode was: unknown
>
> Then after this couple of failed attempts at sending commands, it
> crashes with the backtrace below. Another NULL dereference apparently,
> though the DAR value (the faulting address) has been slightly different
> between consecutive tests so it may be a use-after-free too.
Is it actually caused by additional reference counting on drive->gendev?
IOW if you reverse the patch below instead of applying the previous fix
do things work OK again?
> Note that there shouldn't be anything fundamentally different from
> ide-pmac here vs. something like pcmcia IDE cards... do you have one of
> these to test with ?
Nope and I really don't intend to have one. I count on other people
to take some care of support for host drivers that they maintain/use. ;)
Thanks,
Bart
diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 4e73aee..8f253e5 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -57,23 +57,29 @@ static DEFINE_MUTEX(idecd_ref_mutex);
#define ide_cd_g(disk) \
container_of((disk)->private_data, struct cdrom_info, driver)
+static void ide_cd_release(struct kref *);
+
static struct cdrom_info *ide_cd_get(struct gendisk *disk)
{
struct cdrom_info *cd = NULL;
mutex_lock(&idecd_ref_mutex);
cd = ide_cd_g(disk);
- if (cd)
+ if (cd) {
kref_get(&cd->kref);
+ if (ide_device_get(cd->drive)) {
+ kref_put(&cd->kref, ide_cd_release);
+ cd = NULL;
+ }
+ }
mutex_unlock(&idecd_ref_mutex);
return cd;
}
-static void ide_cd_release(struct kref *);
-
static void ide_cd_put(struct cdrom_info *cd)
{
mutex_lock(&idecd_ref_mutex);
+ ide_device_put(cd->drive);
kref_put(&cd->kref, ide_cd_release);
mutex_unlock(&idecd_ref_mutex);
}
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz
@ 2008-07-30 22:49 ` Benjamin Herrenschmidt
2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz
2008-07-31 4:25 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-30 22:49 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Wed, 2008-07-30 at 21:11 +0200, Bartlomiej Zolnierkiewicz wrote:
>
> > Note that there shouldn't be anything fundamentally different from
> > ide-pmac here vs. something like pcmcia IDE cards... do you have one
> of
> > these to test with ?
>
> Nope and I really don't intend to have one. I count on other people
> to take some care of support for host drivers that they
> maintain/use. ;)
Hrm... that's not a very sane approach. You have some infrastructure for
adding / removing devices, in fact changes things in that area even, and
not a single piece of HW to exercise those code path ? I don't ask you
to get a powermac with a media-bay, but ide_cs seems to be a pretty
important one that's part of what the ide maintainer should take care
of... And I suspect it's going to exercise the same code path as
mediabay.
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-30 22:49 ` Benjamin Herrenschmidt
@ 2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz
2008-07-31 1:09 ` Benjamin Herrenschmidt
2008-07-31 8:49 ` Alan Cox
0 siblings, 2 replies; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-31 0:48 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Thursday 31 July 2008, Benjamin Herrenschmidt wrote:
> On Wed, 2008-07-30 at 21:11 +0200, Bartlomiej Zolnierkiewicz wrote:
> >
> > > Note that there shouldn't be anything fundamentally different from
> > > ide-pmac here vs. something like pcmcia IDE cards... do you have one
> > of
> > > these to test with ?
> >
> > Nope and I really don't intend to have one. I count on other people
> > to take some care of support for host drivers that they
> > maintain/use. ;)
>
> Hrm... that's not a very sane approach. You have some infrastructure for
> adding / removing devices, in fact changes things in that area even, and
There seems to be some confusion between warm-plugging of IDE devices
and hot-plugging of IDE devices.
> not a single piece of HW to exercise those code path ? I don't ask you
> to get a powermac with a media-bay, but ide_cs seems to be a pretty
> important one that's part of what the ide maintainer should take care
> of... And I suspect it's going to exercise the same code path as
> mediabay.
I'm open for offers of co-maintaince of the code (which includes taking
over particular host drivers) and since you seem to have pretty good
insight into what ide maintainer should be doing I presume that you want
to be added to MAINTAINERS?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz
@ 2008-07-31 1:09 ` Benjamin Herrenschmidt
2008-07-31 3:17 ` Bartlomiej Zolnierkiewicz
2008-07-31 8:49 ` Alan Cox
1 sibling, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-31 1:09 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Thu, 2008-07-31 at 02:48 +0200, Bartlomiej Zolnierkiewicz wrote:
> There seems to be some confusion between warm-plugging of IDE devices
> and hot-plugging of IDE devices.
>
> > not a single piece of HW to exercise those code path ? I don't ask you
> > to get a powermac with a media-bay, but ide_cs seems to be a pretty
> > important one that's part of what the ide maintainer should take care
> > of... And I suspect it's going to exercise the same code path as
> > mediabay.
>
> I'm open for offers of co-maintaince of the code (which includes taking
> over particular host drivers) and since you seem to have pretty good
> insight into what ide maintainer should be doing I presume that you want
> to be added to MAINTAINERS?
Should I laugh ?
Seriously here. Those things used to work, you broke them.
It may be worthwile cleanup / improvements, but at the end of the day,
if you are the maintainer for drivers/ide, and things as fundamental as
supporting proper ide_cs seems to be totally part of your job. I'm not
saying ide_cs is broken today (though I suspect it -may- be suffering
from similar breakage to media-bay), however, I'm reacting to your
apparent lack of interest in making sure these things work.
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 1:09 ` Benjamin Herrenschmidt
@ 2008-07-31 3:17 ` Bartlomiej Zolnierkiewicz
0 siblings, 0 replies; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-31 3:17 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Thursday 31 July 2008, Benjamin Herrenschmidt wrote:
> On Thu, 2008-07-31 at 02:48 +0200, Bartlomiej Zolnierkiewicz wrote:
> > There seems to be some confusion between warm-plugging of IDE devices
> > and hot-plugging of IDE devices.
> >
> > > not a single piece of HW to exercise those code path ? I don't ask you
> > > to get a powermac with a media-bay, but ide_cs seems to be a pretty
> > > important one that's part of what the ide maintainer should take care
> > > of... And I suspect it's going to exercise the same code path as
> > > mediabay.
> >
> > I'm open for offers of co-maintaince of the code (which includes taking
> > over particular host drivers) and since you seem to have pretty good
> > insight into what ide maintainer should be doing I presume that you want
> > to be added to MAINTAINERS?
>
> Should I laugh ?
>
> Seriously here. Those things used to work, you broke them.
That's jumping to conlusions as you haven't yet proven that it was
really me. :)
Which would be great because than I can finally start fixing the damage.
> It may be worthwile cleanup / improvements, but at the end of the day,
> if you are the maintainer for drivers/ide, and things as fundamental as
> supporting proper ide_cs seems to be totally part of your job. I'm not
Maybe I'm really completely unsuited for the job. I even didn't know
that proper supporting of _one_ particular host driver is a _fundamental_
part of the _subsystem_ maintainer's job...
> saying ide_cs is broken today (though I suspect it -may- be suffering
> from similar breakage to media-bay), however, I'm reacting to your
> apparent lack of interest in making sure these things work.
If only all people showed so much interest as *IDE*PMAC* Maintainer in
making sure that things work (instead of i.e. telling people what should
they be doing in their _limited_ _private_ time) we would have nothing
to worry about! ;)
Can we now go back to fixing ide-pmac breakage? Pretty please.
It is not unlikely that it in the time it took for the last four
mails it would have been fixed already.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz
2008-07-30 22:49 ` Benjamin Herrenschmidt
@ 2008-07-31 4:25 ` Benjamin Herrenschmidt
2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz
1 sibling, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-31 4:25 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
> Is it actually caused by additional reference counting on drive->gendev?
> IOW if you reverse the patch below instead of applying the previous fix
> do things work OK again?
>
> > Note that there shouldn't be anything fundamentally different from
> > ide-pmac here vs. something like pcmcia IDE cards... do you have one of
> > these to test with ?
>
> Nope and I really don't intend to have one. I count on other people
> to take some care of support for host drivers that they maintain/use. ;)
Reverting the patch below does the job. Thanks.
Ben.
> Thanks,
> Bart
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 4e73aee..8f253e5 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -57,23 +57,29 @@ static DEFINE_MUTEX(idecd_ref_mutex);
> #define ide_cd_g(disk) \
> container_of((disk)->private_data, struct cdrom_info, driver)
>
> +static void ide_cd_release(struct kref *);
> +
> static struct cdrom_info *ide_cd_get(struct gendisk *disk)
> {
> struct cdrom_info *cd = NULL;
>
> mutex_lock(&idecd_ref_mutex);
> cd = ide_cd_g(disk);
> - if (cd)
> + if (cd) {
> kref_get(&cd->kref);
> + if (ide_device_get(cd->drive)) {
> + kref_put(&cd->kref, ide_cd_release);
> + cd = NULL;
> + }
> + }
> mutex_unlock(&idecd_ref_mutex);
> return cd;
> }
>
> -static void ide_cd_release(struct kref *);
> -
> static void ide_cd_put(struct cdrom_info *cd)
> {
> mutex_lock(&idecd_ref_mutex);
> + ide_device_put(cd->drive);
> kref_put(&cd->kref, ide_cd_release);
> mutex_unlock(&idecd_ref_mutex);
> }
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz
2008-07-31 1:09 ` Benjamin Herrenschmidt
@ 2008-07-31 8:49 ` Alan Cox
2008-07-31 9:11 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 26+ messages in thread
From: Alan Cox @ 2008-07-31 8:49 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, petkovbb, linuxppc-dev, linux-ide
> There seems to be some confusion between warm-plugging of IDE devices
> and hot-plugging of IDE devices.
>
> > not a single piece of HW to exercise those code path ? I don't ask you
> > to get a powermac with a media-bay, but ide_cs seems to be a pretty
> > important one that's part of what the ide maintainer should take care
> > of... And I suspect it's going to exercise the same code path as
> > mediabay.
That confuses people sometimes - ide_cs is a controller hotplug not a
device hotplug ...
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 8:49 ` Alan Cox
@ 2008-07-31 9:11 ` Benjamin Herrenschmidt
2008-07-31 9:13 ` Alan Cox
0 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-31 9:11 UTC (permalink / raw)
To: Alan Cox
Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz,
linuxppc-dev
On Thu, 2008-07-31 at 09:49 +0100, Alan Cox wrote:
> > There seems to be some confusion between warm-plugging of IDE devices
> > and hot-plugging of IDE devices.
> >
> > > not a single piece of HW to exercise those code path ? I don't ask you
> > > to get a powermac with a media-bay, but ide_cs seems to be a pretty
> > > important one that's part of what the ide maintainer should take care
> > > of... And I suspect it's going to exercise the same code path as
> > > mediabay.
>
> That confuses people sometimes - ide_cs is a controller hotplug not a
> device hotplug ...
I could make the media-bay look like a controller hotplug if it was
going to make things easier...
Cheers,
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 9:11 ` Benjamin Herrenschmidt
@ 2008-07-31 9:13 ` Alan Cox
2008-07-31 9:48 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 26+ messages in thread
From: Alan Cox @ 2008-07-31 9:13 UTC (permalink / raw)
To: benh
Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz,
linuxppc-dev
> I could make the media-bay look like a controller hotplug if it was
> going to make things easier...
I'm not sure it will. It may do nowdays, but the older IDE code
historically was fairly broken for both cases except in 2.4. Also faking
it as controller hotplug is the wrong path for libata which does real
drive hot plug.
Alan
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 9:13 ` Alan Cox
@ 2008-07-31 9:48 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-31 9:48 UTC (permalink / raw)
To: Alan Cox
Cc: FUJITA Tomonori, linux-ide, petkovbb, Bartlomiej Zolnierkiewicz,
linuxppc-dev
On Thu, 2008-07-31 at 10:13 +0100, Alan Cox wrote:
> > I could make the media-bay look like a controller hotplug if it was
> > going to make things easier...
>
> I'm not sure it will. It may do nowdays, but the older IDE code
> historically was fairly broken for both cases except in 2.4. Also faking
> it as controller hotplug is the wrong path for libata which does real
> drive hot plug.
Yeah, that was my line of thinking initially, also the fact that it has
the nice side effect of keeping the minor number stable.
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 4:25 ` Benjamin Herrenschmidt
@ 2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz
2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-07-31 10:51 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Thursday 31 July 2008, Benjamin Herrenschmidt wrote:
>=20
> > Is it actually caused by additional reference counting on drive->gendev?
> > IOW if you reverse the patch below instead of applying the previous fix
> > do things work OK again?
> >=20
> > > Note that there shouldn't be anything fundamentally different from
> > > ide-pmac here vs. something like pcmcia IDE cards... do you have one =
of
> > > these to test with ?
> >=20
> > Nope and I really don't intend to have one. I count on other people
> > to take some care of support for host drivers that they maintain/use. ;)
>=20
> Reverting the patch below does the job. Thanks.
Thanks, this narrows down the problem pretty nicely.
[ Unfortunately we cannot revert the whole patch as it would break
unloading of IDE host drivers modules so I still need you help on
fixing this one. ]
Lets get back to the oops:
Vector: 300 (Data Access) at [c59dfdc0]
=C2=A0 =C2=A0 pc: c0211f78: ide_device_put+0x18/0x58
=C2=A0 =C2=A0 lr: c0223c34: ide_cd_put+0x40/0x5c
=C2=A0 =C2=A0 sp: c59dfe70
=C2=A0 =C2=A0msr: 9032
=C2=A0 =C2=A0dar: 10
=C2=A0dsisr: 40000000
=C2=A0 current =3D 0xc58a9880
=C2=A0 =C2=A0 pid =C2=A0 =3D 843, comm =3D media-bay
enter ? for help
[c59dfe80] c0223c34 ide_cd_put+0x40/0x5c
[c59dfea0] c02114d4 generic_ide_remove+0x28/0x3c
[c59dfeb0] c01ea108 __device_release_driver+0x78/0xb4
[c59dfec0] c01ea218 device_release_driver+0x28/0x44
[c59dfee0] c01e9350 bus_remove_device+0xac/0xd8
[c59dff00] c01e77f8 device_del+0x104/0x198
[c59dff20] c01e78a4 device_unregister+0x18/0x30
[c59dff40] c0212598 __ide_port_unregister_devices+0x6c/0x88
[c59dff60] c021276c ide_port_unregister_devices+0x38/0x80
[c59dff80] c0209078 media_bay_step+0x1cc/0x5c0
[c59dffb0] c02094f8 media_bay_task+0x8c/0xcc
[c59dffd0] c0048640 kthread+0x48/0x84
[c59dfff0] c0011b38 kernel_thread+0x44/0x60
On a fresh look at ide_device_put(), ide_host_alloc() and pmac.c
it may be that the above oops is actually media-bay specific.
ide_device_put():
=2E..
struct device *host_dev =3D drive->hwif->host->dev[0];
struct module *module =3D host_dev ? host_dev->driver->owner : NULL;
=2E..
ide_host_alloc():
=2E..
if (hws[0])
host->dev[0] =3D hws[0]->dev;
=2E..
pmac.c:
=2E..
pmac_ide_macio_attach(struct macio_dev *mdev, const struct of_device_id *ma=
tch)
=2E..
dev_set_drvdata(&mdev->ofdev.dev, pmif);
memset(&hw, 0, sizeof(hw));
pmac_ide_init_ports(&hw, pmif->regbase);
hw.irq =3D irq;
hw.dev =3D &mdev->bus->pdev->dev;
hw.parent =3D &mdev->ofdev.dev;
=2E..
pmac macio is unique in using different devices for hwif->dev / host->dev
(hw.dev) and hwif->gendev.parent / dev_set_drvdata() (hw.parent)
[ I has been actually wondering why is it so for some time...? ]
Thus we may be hitting oops in ide_device_put() on host_dev->driver
because hw.dev is used as host->dev for pmac macio in ide_device_put()
while we really want to use hw.parent.
=46ix should be as simple as:
=2D host->dev[0] =3D hws[0]->dev;
+ host->dev[0] =3D hws[0]->parent ? hws[0]->parent : hws[0]-=
>dev;
Could you please try it together with my previous patch for
ide_device_{get,put}()?
Bart
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz
@ 2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz
2008-08-01 22:26 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 26+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-08-01 10:54 UTC (permalink / raw)
To: benh; +Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Thursday 31 July 2008, Bartlomiej Zolnierkiewicz wrote:
[...]
Sorry if my mails were a bit harsh but nobody likes to be pushed around.
[ It is not like I don't want to add proper hot-plugging support or do test
on more hardware but my time schedule is already tight enough and there are
still more fundamental things to address (which take priority). ]
> - host->dev[0] = hws[0]->dev;
> + host->dev[0] = hws[0]->parent ? hws[0]->parent : hws[0]->dev;
>
> Could you please try it together with my previous patch for
> ide_device_{get,put}()?
Please test it when you have some time.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: ide pmac breakage
2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz
@ 2008-08-01 22:26 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-01 22:26 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: FUJITA Tomonori, linux-ide, petkovbb, linuxppc-dev
On Fri, 2008-08-01 at 12:54 +0200, Bartlomiej Zolnierkiewicz wrote:
> On Thursday 31 July 2008, Bartlomiej Zolnierkiewicz wrote:
>
> [...]
>
> Sorry if my mails were a bit harsh but nobody likes to be pushed around.
>
> [ It is not like I don't want to add proper hot-plugging support or do test
> on more hardware but my time schedule is already tight enough and there are
> still more fundamental things to address (which take priority). ]
>
> > - host->dev[0] = hws[0]->dev;
> > + host->dev[0] = hws[0]->parent ? hws[0]->parent : hws[0]->dev;
> >
> > Could you please try it together with my previous patch for
> > ide_device_{get,put}()?
>
> Please test it when you have some time.
The problem in that case is access to the HW :-) I have plenty ide-pmac
based machines but only one or two old laptop (ie. Paul has one too)
with a media-bay and those are in the office. So I'll test on Monday
when I get there, unless I get a chance to pop by on Sunday but that's
unlikely.
Ben.
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2008-08-01 22:26 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-28 1:29 ide pmac breakage Benjamin Herrenschmidt
2008-07-28 2:02 ` Benjamin Herrenschmidt
2008-07-28 14:31 ` Bartlomiej Zolnierkiewicz
2008-07-29 5:17 ` FUJITA Tomonori
2008-07-29 5:20 ` Benjamin Herrenschmidt
2008-07-29 11:41 ` Bartlomiej Zolnierkiewicz
2008-07-29 12:02 ` Benjamin Herrenschmidt
2008-07-29 12:04 ` Bartlomiej Zolnierkiewicz
2008-07-29 14:45 ` Bartlomiej Zolnierkiewicz
2008-07-29 19:26 ` Bartlomiej Zolnierkiewicz
2008-07-29 21:30 ` Benjamin Herrenschmidt
2008-07-30 1:23 ` FUJITA Tomonori
2008-07-30 6:57 ` Benjamin Herrenschmidt
2008-07-30 19:11 ` Bartlomiej Zolnierkiewicz
2008-07-30 22:49 ` Benjamin Herrenschmidt
2008-07-31 0:48 ` Bartlomiej Zolnierkiewicz
2008-07-31 1:09 ` Benjamin Herrenschmidt
2008-07-31 3:17 ` Bartlomiej Zolnierkiewicz
2008-07-31 8:49 ` Alan Cox
2008-07-31 9:11 ` Benjamin Herrenschmidt
2008-07-31 9:13 ` Alan Cox
2008-07-31 9:48 ` Benjamin Herrenschmidt
2008-07-31 4:25 ` Benjamin Herrenschmidt
2008-07-31 10:51 ` Bartlomiej Zolnierkiewicz
2008-08-01 10:54 ` Bartlomiej Zolnierkiewicz
2008-08-01 22:26 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).