* [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
From: Kenji Kaneshige @ 2009-05-26 0:08 UTC
To: linux-pci@vger.kernel.org, Jesse Barnes; +Cc: Alex Chiang, linux acpi
Fix wrong struct pci_dev reference counter handling in acpi_pci_bind().
The 'dev' field of struct acpi_pci_data holds a pointer to struct
pci_dev, but acpi_pci_bind() stored it without incrementing the
device's reference counter. Because of this, I got the following
kernel oops while doing some PCI hotplug operations. This patch fixes
the bug by replacing the hand-made equivalent of pci_find_slot() with
pci_get_slot() in acpi_pci_bind(), and by dropping the reference with
pci_dev_put() when the binding fails or is removed.
[ 206.427004] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
[ 206.427076] IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
[ 206.427076] PGD 8225ad067 PUD 82258b067 PMD 0
[ 206.427076] Oops: 0000 [#1] SMP
[ 206.427076] last sysfs file: /sys/bus/pci/slots/1/power
[ 206.427076] CPU 2
[ 206.427076] Modules linked in: acpiphp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc pci_slot battery ac parport_pc lp parport sg mptspi mptscsih mptbase scsi_transport_spi sr_mod cdrom e1000e serio_raw button i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
[ 206.427076] Pid: 10367, comm: bash Not tainted 2.6.30-rc4-kk #10 PRIMERGY
[ 206.427076] RIP: 0010:[<ffffffff803f0e9b>] [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
[ 206.427076] RSP: 0018:ffff8808225a9d68 EFLAGS: 00010206
[ 206.427076] RAX: 0000000000000000 RBX: ffff88083c547800 RCX: 0000000000000006
[ 206.427076] RDX: ffff88083ce36ca0 RSI: ffff880822508768 RDI: 0000000000000000
[ 206.427076] RBP: ffff8808225a9d98 R08: 0000000000000000 R09: 0000000000000000
[ 206.427076] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[ 206.427076] R13: 0000000000000000 R14: 0000000000000001 R15: ffff8808225a9e28
[ 206.427076] FS: 00007f376c4ee6e0(0000) GS:ffff880054636000(0000) knlGS:0000000000000000
[ 206.427076] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 206.427076] CR2: 00000000000000e8 CR3: 0000000822590000 CR4: 00000000000006e0
[ 206.427076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 206.427076] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 206.427076] Process bash (pid: 10367, threadinfo ffff8808225a8000, task ffff880822508000)
[ 206.427076] Stack:
[ 206.427076] 000000000000001f ffff88083addd5a0 ffff8808225a9d98 ffff88083ce36ca0
[ 206.427076] ffff88083c547800 ffff88083c547800 ffff8808225a9db8 ffffffff803ecee4
[ 206.427076] ffff88083c547800 ffff88083c547000 ffff8808225a9e08 ffffffff803ecf6d
[ 206.427076] Call Trace:
[ 206.427076] [<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
[ 206.427076] [<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
[ 206.427076] [<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
[ 206.427076] [<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
[ 206.427076] [<ffffffff803cfc18>] power_write_file+0xc8/0x110
[ 206.427076] [<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
[ 206.427076] [<ffffffff803469ce>] sysfs_write_file+0xce/0x140
[ 206.427076] [<ffffffff802e94e7>] vfs_write+0xc7/0x170
[ 206.427076] [<ffffffff802e9aa0>] sys_write+0x50/0x90
[ 206.427076] [<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b
[ 206.427076] Code: be 2b 01 00 00 48 c7 c7 d0 c6 5a 80 31 c0 e8 c3 8f 01 00 eb 36 48 8b 55 e8 48 8b 42 10 48 83 78 18 00 74 13 48 8b 42 08 0f b7 3a <0f> b6 b0 e8 00 00 00 e8 ab f8 ff ff 48 8b 7d e8 e8 30 cf ee ff
[ 206.427076] RIP [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
[ 206.427076] RSP <ffff8808225a9d68>
[ 206.427076] CR2: 00000000000000e8
[ 206.440158] ---[ end trace 1ca3974fa717e665 ]---
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
drivers/acpi/pci_bind.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
Index: 20090521/drivers/acpi/pci_bind.c
===================================================================
--- 20090521.orig/drivers/acpi/pci_bind.c
+++ 20090521/drivers/acpi/pci_bind.c
@@ -116,9 +116,6 @@ int acpi_pci_bind(struct acpi_device *de
 	struct acpi_pci_data *pdata;
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	acpi_handle handle;
-	struct pci_dev *dev;
-	struct pci_bus *bus;
-
 
 	if (!device || !device->parent)
 		return -EINVAL;
@@ -180,16 +177,8 @@ int acpi_pci_bind(struct acpi_device *de
 	 * PCI devices are added to the global pci list when the root
 	 * bridge start ops are run, which may not have happened yet.
 	 */
-	bus = pci_find_bus(data->id.segment, data->id.bus);
-	if (bus) {
-		list_for_each_entry(dev, &bus->devices, bus_list) {
-			if (dev->devfn == PCI_DEVFN(data->id.device,
-						    data->id.function)) {
-				data->dev = dev;
-				break;
-			}
-		}
-	}
+	data->dev = pci_get_slot(pdata->bus,
+				PCI_DEVFN(data->id.device, data->id.function));
 	if (!data->dev) {
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Device %04x:%02x:%02x.%d not present in PCI namespace\n",
@@ -259,9 +248,10 @@ int acpi_pci_bind(struct acpi_device *de
 
 end:
 	kfree(buffer.pointer);
-	if (result)
+	if (result) {
+		pci_dev_put(data->dev);
 		kfree(data);
-
+	}
 	return result;
 }
 
@@ -303,6 +293,7 @@ static int acpi_pci_unbind(struct acpi_d
 	if (data->dev->subordinate) {
 		acpi_pci_irq_del_prt(data->id.segment, data->bus->number);
 	}
+	pci_dev_put(data->dev);
 	kfree(data);
 
 end:
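
To illustrate the lifetime problem this patch addresses, here is a
minimal user-space sketch in C. It is not kernel code: struct toy_dev,
struct toy_binding and the toy_* helpers are hypothetical stand-ins for
struct pci_dev, struct acpi_pci_data, pci_dev_get() and pci_dev_put(),
and 'freed' is a flag rather than a real free() so the effect can be
observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: an object with a refcount. */
struct toy_dev {
	int refcount;
	int freed;	/* set instead of calling free(), so callers can inspect it */
};

/* Take a reference (NULL-safe), in the spirit of pci_dev_get(). */
static struct toy_dev *toy_dev_get(struct toy_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Drop a reference (NULL-safe), in the spirit of pci_dev_put(). */
static void toy_dev_put(struct toy_dev *dev)
{
	if (dev && --dev->refcount == 0)
		dev->freed = 1;
}

/* Hypothetical stand-in for struct acpi_pci_data. */
struct toy_binding {
	struct toy_dev *dev;
};

/* Buggy bind: stores a borrowed pointer without taking a reference. */
static void toy_bind_borrowed(struct toy_binding *b, struct toy_dev *dev)
{
	b->dev = dev;
}

/* Fixed bind: the binding owns a reference for as long as it exists. */
static void toy_bind_counted(struct toy_binding *b, struct toy_dev *dev)
{
	b->dev = toy_dev_get(dev);
}

/* Unbind for the counted variant: give the reference back. */
static void toy_unbind(struct toy_binding *b)
{
	toy_dev_put(b->dev);
	b->dev = NULL;
}
```

With the borrowed variant, the device can reach refcount zero while the
binding still points at it, which is the kind of stale pointer that
acpi_pci_unbind() tripped over in the oops. With the counted variant,
the device only goes away after toy_unbind() has run.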
* Re: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
From: Bjorn Helgaas @ 2009-05-26 15:45 UTC
To: Kenji Kaneshige
Cc: linux-pci@vger.kernel.org, Jesse Barnes, Alex Chiang, linux acpi

On Monday 25 May 2009 06:08:03 pm Kenji Kaneshige wrote:
> Fix wrong struct pci_dev reference counter handling in acpi_pci_bind().
>
> The 'dev' field of struct acpi_pci_data is having a pointer to struct
> pci_dev without incrementing the reference counter. Because of this, I
> got the following kernel oops when I was doing some pci hotplug
> operations. This patch fixes this bug by replacing wrong hand-made
> pci_find_slot() with pci_get_slot() in acpi_pci_bind().

I don't like this ACPI/PCI bind thing in general because having the
extra .bind and .unbind methods is ugly and somewhat non-obvious, and
I'm nervous about object lifetime issues like this one.

But I don't have a better alternative to offer, and there's definitely
a problem here, so thanks for fixing and testing it. I do have one
question below about whether the comment in the existing code, which
seems to be an excuse for doing the hand-made pci_find_slot(), is
still relevant, or should just be removed.

Bjorn

> [ 206.427004] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
> [ 206.427076] IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
> [ 206.427076] PGD 8225ad067 PUD 82258b067 PMD 0
> [ 206.427076] Oops: 0000 [#1] SMP
> [ 206.427076] last sysfs file: /sys/bus/pci/slots/1/power
> [ 206.427076] CPU 2
> [ 206.427076] Modules linked in: acpiphp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc pci_slot battery ac parport_pc lp parport sg mptspi mptscsih mptbase scsi_transport_spi sr_mod cdrom e1000e serio_raw button i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
> [ 206.427076] Pid: 10367, comm: bash Not tainted 2.6.30-rc4-kk #10 PRIMERGY
> [ 206.427076] RIP: 0010:[<ffffffff803f0e9b>] [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
> [ 206.427076] RSP: 0018:ffff8808225a9d68 EFLAGS: 00010206
> [ 206.427076] RAX: 0000000000000000 RBX: ffff88083c547800 RCX: 0000000000000006
> [ 206.427076] RDX: ffff88083ce36ca0 RSI: ffff880822508768 RDI: 0000000000000000
> [ 206.427076] RBP: ffff8808225a9d98 R08: 0000000000000000 R09: 0000000000000000
> [ 206.427076] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
> [ 206.427076] R13: 0000000000000000 R14: 0000000000000001 R15: ffff8808225a9e28
> [ 206.427076] FS: 00007f376c4ee6e0(0000) GS:ffff880054636000(0000) knlGS:0000000000000000
> [ 206.427076] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 206.427076] CR2: 00000000000000e8 CR3: 0000000822590000 CR4: 00000000000006e0
> [ 206.427076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 206.427076] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 206.427076] Process bash (pid: 10367, threadinfo ffff8808225a8000, task ffff880822508000)
> [ 206.427076] Stack:
> [ 206.427076] 000000000000001f ffff88083addd5a0 ffff8808225a9d98 ffff88083ce36ca0
> [ 206.427076] ffff88083c547800 ffff88083c547800 ffff8808225a9db8 ffffffff803ecee4
> [ 206.427076] ffff88083c547800 ffff88083c547000 ffff8808225a9e08 ffffffff803ecf6d
> [ 206.427076] Call Trace:
> [ 206.427076] [<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
> [ 206.427076] [<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
> [ 206.427076] [<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
> [ 206.427076] [<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
> [ 206.427076] [<ffffffff803cfc18>] power_write_file+0xc8/0x110
> [ 206.427076] [<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
> [ 206.427076] [<ffffffff803469ce>] sysfs_write_file+0xce/0x140
> [ 206.427076] [<ffffffff802e94e7>] vfs_write+0xc7/0x170
> [ 206.427076] [<ffffffff802e9aa0>] sys_write+0x50/0x90
> [ 206.427076] [<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b
> [ 206.427076] Code: be 2b 01 00 00 48 c7 c7 d0 c6 5a 80 31 c0 e8 c3 8f 01 00 eb 36 48 8b 55 e8 48 8b 42 10 48 83 78 18 00 74 13 48 8b 42 08 0f b7 3a <0f> b6 b0 e8 00 00 00 e8 ab f8 ff ff 48 8b 7d e8 e8 30 cf ee ff
> [ 206.427076] RIP [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
> [ 206.427076] RSP <ffff8808225a9d68>
> [ 206.427076] CR2: 00000000000000e8
> [ 206.440158] ---[ end trace 1ca3974fa717e665 ]---
>
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>
> drivers/acpi/pci_bind.c | 21 ++++++---------------
> 1 file changed, 6 insertions(+), 15 deletions(-)
>
> Index: 20090521/drivers/acpi/pci_bind.c
> ===================================================================
> --- 20090521.orig/drivers/acpi/pci_bind.c
> +++ 20090521/drivers/acpi/pci_bind.c
> @@ -116,9 +116,6 @@ int acpi_pci_bind(struct acpi_device *de
>  	struct acpi_pci_data *pdata;
>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>  	acpi_handle handle;
> -	struct pci_dev *dev;
> -	struct pci_bus *bus;
> -
>
>  	if (!device || !device->parent)
>  		return -EINVAL;
> @@ -180,16 +177,8 @@ int acpi_pci_bind(struct acpi_device *de
>  	 * PCI devices are added to the global pci list when the root
>  	 * bridge start ops are run, which may not have happened yet.

Please update or remove this comment, which claims that "we cannot
simply search the global pci device list." I don't know whether the
comment (a) explains why we can't use pci_get_slot(), (b) explains
why we can't use pci_find_slot() or some other interface, (c) refers
to an ordering problem that doesn't exist on your system, or (d) is
just no longer applicable at all.

>  	 */
> -	bus = pci_find_bus(data->id.segment, data->id.bus);
> -	if (bus) {
> -		list_for_each_entry(dev, &bus->devices, bus_list) {
> -			if (dev->devfn == PCI_DEVFN(data->id.device,
> -						    data->id.function)) {
> -				data->dev = dev;
> -				break;
> -			}
> -		}
> -	}
> +	data->dev = pci_get_slot(pdata->bus,
> +				PCI_DEVFN(data->id.device, data->id.function));
>  	if (!data->dev) {
>  		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>  				  "Device %04x:%02x:%02x.%d not present in PCI namespace\n",
> @@ -259,9 +248,10 @@ int acpi_pci_bind(struct acpi_device *de
>
>  end:
>  	kfree(buffer.pointer);
> -	if (result)
> +	if (result) {
> +		pci_dev_put(data->dev);
>  		kfree(data);
> -
> +	}
>  	return result;
>  }
>
> @@ -303,6 +293,7 @@ static int acpi_pci_unbind(struct acpi_d
>  	if (data->dev->subordinate) {
>  		acpi_pci_irq_del_prt(data->id.segment, data->bus->number);
>  	}
> +	pci_dev_put(data->dev);
>  	kfree(data);
>
>  end:
* Re: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
From: Alex Chiang @ 2009-05-26 23:41 UTC
To: Bjorn Helgaas
Cc: Kenji Kaneshige, linux-pci@vger.kernel.org, Jesse Barnes, linux acpi, lenb

Adding Len because this should probably go through his tree, not
Jesse's.

Len, we're discussing a patch that I feel should go into 2.6.30,
because it fixes an oops that I introduced in the beginning of
the merge window, and that we've been working on since. We just
now have a patch to fix it, along with another patch that I
wrote:

	http://patchwork.kernel.org/patch/25296/

This current patch is here:

	http://patchwork.kernel.org/patch/25895/

Discussion follows.

* Bjorn Helgaas <bjorn.helgaas@hp.com>:
> On Monday 25 May 2009 06:08:03 pm Kenji Kaneshige wrote:
> > Fix wrong struct pci_dev reference counter handling in acpi_pci_bind().
> >
> > The 'dev' field of struct acpi_pci_data is having a pointer to struct
> > pci_dev without incrementing the reference counter. Because of this, I
> > got the following kernel oops when I was doing some pci hotplug
> > operations. This patch fixes this bug by replacing wrong hand-made
> > pci_find_slot() with pci_get_slot() in acpi_pci_bind().
>
> I don't like this ACPI/PCI bind thing in general because having the
> extra .bind and .unbind methods is ugly and somewhat non-obvious, and
> I'm nervous about object lifetime issues like this one.
>
> But I don't have a better alternative to offer, and there's definitely
> a problem here, so thanks for fixing and testing it. I do have one
> question below about whether the comment in the existing code, which
> seems to be an excuse for doing the hand-made pci_find_slot(), is
> still relevant, or should just be removed.

I reviewed and successfully tested this patch on our ia64 machine.

Reviewed-by: Alex Chiang <achiang@hp.com>
Tested-by: Alex Chiang <achiang@hp.com>

> > @@ -180,16 +177,8 @@ int acpi_pci_bind(struct acpi_device *de
> >  	 * PCI devices are added to the global pci list when the root
> >  	 * bridge start ops are run, which may not have happened yet.
>
> Please update or remove this comment, which claims that "we cannot
> simply search the global pci device list." I don't know whether the
> comment (a) explains why we can't use pci_get_slot(), (b) explains
> why we can't use pci_find_slot() or some other interface, (c) refers
> to an ordering problem that doesn't exist on your system, or (d) is
> just no longer applicable at all.

I think the comment is simply no longer applicable.

During boot, we exercise this path in acpi_pci_root_add():

	acpi_pci_root_add
	pci_acpi_scan_root
	acpi_pci_bind_root
	acpi_pci_bridge_scan
	acpi_pci_bind

pci_acpi_scan_root will create the PCI namespace for us before we
attempt to bind the devices, so we know we will find the pci_dev
on the pci_bus->devices list.

During hotplug, we exercise this path:

	acpiphp_enable_slot
	enable_device
	pci_scan_slot
	pci_scan_bridge
	acpiphp_bus_add
	acpi_bus_add
	acpi_add_single_object
	acpi_pci_bind

pci_scan_slot() will put the new pci_devs onto pci_bus->devices,
so by the time we get to acpi_pci_bind, the call to pci_get_slot
will be successful.

There is the case where we hotplug a bridge device:

	handle_hotplug_event_bridge
	handle_bridge_insertion
	acpi_bus_add
	acpi_add_single_object
	acpi_pci_bind

And this does confuse me a little bit, because I'm not seeing how
the bridge device gets added to the parent pci_bus->devices list
before we get to acpi_pci_bind, but...

Kenji's patch isn't changing the semantics on _how_ we find a
device:

-	bus = pci_find_bus(data->id.segment, data->id.bus);
-	if (bus) {
-		list_for_each_entry(dev, &bus->devices, bus_list) {
-			if (dev->devfn == PCI_DEVFN(data->id.device,
-						    data->id.function)) {
-				data->dev = dev;
-				break;
-			}
-		}
-	}
+	data->dev = pci_get_slot(pdata->bus,
+				PCI_DEVFN(data->id.device, data->id.function));
 	if (!data->dev) {
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Device %04x:%02x:%02x.%d not present in PCI namespace\n",

pci_get_slot() iterates across bus->devices too (except that it
correctly grabs the pci_bus_sem first) in addition to the obvious
refcount on the pci_dev.

Given the above, I feel pretty comfortable with Kenji-san's
change, and I'd recommend that he just get rid of that confusing
comment entirely.

The only other suggestion I have is that he could trim down the
oops output a bit to just get the stack trace:

	BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
	IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd

	Call Trace:
	[<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
	[<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
	[<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
	[<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
	[<ffffffff803cfc18>] power_write_file+0xc8/0x110
	[<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
	[<ffffffff803469ce>] sysfs_write_file+0xce/0x140
	[<ffffffff802e94e7>] vfs_write+0xc7/0x170
	[<ffffffff802e9aa0>] sys_write+0x50/0x90
	[<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b

Thanks.

/ac
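
The lookup discussed above can be sketched in user-space C, under
stated assumptions. TOY_DEVFN() uses the same bit layout as the
kernel's PCI_DEVFN() macro (slot in bits 7:3, function in bits 2:0).
Everything else (struct toy_dev, struct toy_bus, toy_get_slot(), and
the lock_held flag standing in for pci_bus_sem) is hypothetical and
only models the behaviour described in this thread, not the kernel
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define TOY_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical stand-in for struct pci_dev. */
struct toy_dev {
	unsigned int devfn;
	int refcount;
	struct toy_dev *next;	/* singly linked stand-in for bus->devices */
};

/* Hypothetical stand-in for struct pci_bus. */
struct toy_bus {
	struct toy_dev *devices;
	int lock_held;		/* models pci_bus_sem; no real locking here */
};

/*
 * Walk the bus's device list for a matching devfn and return the device
 * with an extra reference, or NULL if no device matches. The reference
 * is taken while the (modelled) lock is held, so the list cannot change
 * between finding the device and pinning it.
 */
static struct toy_dev *toy_get_slot(struct toy_bus *bus, unsigned int devfn)
{
	struct toy_dev *dev, *found = NULL;

	bus->lock_held = 1;
	for (dev = bus->devices; dev; dev = dev->next) {
		if (dev->devfn == devfn) {
			dev->refcount++;
			found = dev;
			break;
		}
	}
	bus->lock_held = 0;
	return found;
}
```

The walk is the same one the removed open-coded loop performed; what
the caller gains is a pointer it is allowed to keep, at the price of
having to drop the reference later.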
* Re: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
From: Kenji Kaneshige @ 2009-05-26 23:58 UTC
To: Alex Chiang
Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, Jesse Barnes, linux acpi, lenb

Alex Chiang wrote:
> Adding Len because this should probably go through his tree, not
> Jesse's.
>
> Len, we're discussing a patch that I feel should go into 2.6.30,
> because it fixes an oops that I introduced in the beginning of
> the merge window, and that we've been working on since. We just
> now have a patch to fix it, along with another patch that I
> wrote:
>
> 	http://patchwork.kernel.org/patch/25296/
>
> This current patch is here:
>
> 	http://patchwork.kernel.org/patch/25895/
>
> Discussion follows.
>
> * Bjorn Helgaas <bjorn.helgaas@hp.com>:
>> On Monday 25 May 2009 06:08:03 pm Kenji Kaneshige wrote:
>>> Fix wrong struct pci_dev reference counter handling in acpi_pci_bind().
>>>
>>> The 'dev' field of struct acpi_pci_data is having a pointer to struct
>>> pci_dev without incrementing the reference counter. Because of this, I
>>> got the following kernel oops when I was doing some pci hotplug
>>> operations. This patch fixes this bug by replacing wrong hand-made
>>> pci_find_slot() with pci_get_slot() in acpi_pci_bind().
>> I don't like this ACPI/PCI bind thing in general because having the
>> extra .bind and .unbind methods is ugly and somewhat non-obvious, and
>> I'm nervous about object lifetime issues like this one.
>>
>> But I don't have a better alternative to offer, and there's definitely
>> a problem here, so thanks for fixing and testing it. I do have one
>> question below about whether the comment in the existing code, which
>> seems to be an excuse for doing the hand-made pci_find_slot(), is
>> still relevant, or should just be removed.
>
> I reviewed and successfully tested this patch on our ia64 machine.
>
> Reviewed-by: Alex Chiang <achiang@hp.com>
> Tested-by: Alex Chiang <achiang@hp.com>
>
>>> @@ -180,16 +177,8 @@ int acpi_pci_bind(struct acpi_device *de
>>>  	 * PCI devices are added to the global pci list when the root
>>>  	 * bridge start ops are run, which may not have happened yet.
>> Please update or remove this comment, which claims that "we cannot
>> simply search the global pci device list." I don't know whether the
>> comment (a) explains why we can't use pci_get_slot(), (b) explains
>> why we can't use pci_find_slot() or some other interface, (c) refers
>> to an ordering problem that doesn't exist on your system, or (d) is
>> just no longer applicable at all.
>
> I think the comment is simply no longer applicable.
>
> During boot, we exercise this path in acpi_pci_root_add():
>
> 	acpi_pci_root_add
> 	pci_acpi_scan_root
> 	acpi_pci_bind_root
> 	acpi_pci_bridge_scan
> 	acpi_pci_bind
>
> pci_acpi_scan_root will create the PCI namespace for us before we
> attempt to bind the devices, so we know we will find the pci_dev
> on the pci_bus->devices list.
>
> During hotplug, we exercise this path:
>
> 	acpiphp_enable_slot
> 	enable_device
> 	pci_scan_slot
> 	pci_scan_bridge
> 	acpiphp_bus_add
> 	acpi_bus_add
> 	acpi_add_single_object
> 	acpi_pci_bind
>
> pci_scan_slot() will put the new pci_devs onto pci_bus->devices,
> so by the time we get to acpi_pci_bind, the call to pci_get_slot
> will be successful.
>
> There is the case where we hotplug a bridge device:
>
> 	handle_hotplug_event_bridge
> 	handle_bridge_insertion
> 	acpi_bus_add
> 	acpi_add_single_object
> 	acpi_pci_bind
>
> And this does confuse me a little bit, because I'm not seeing how
> the bridge device gets added to the parent pci_bus->devices list
> before we get to acpi_pci_bind, but...
>
> Kenji's patch isn't changing the semantics on _how_ we find a
> device:
>
> -	bus = pci_find_bus(data->id.segment, data->id.bus);
> -	if (bus) {
> -		list_for_each_entry(dev, &bus->devices, bus_list) {
> -			if (dev->devfn == PCI_DEVFN(data->id.device,
> -						    data->id.function)) {
> -				data->dev = dev;
> -				break;
> -			}
> -		}
> -	}
> +	data->dev = pci_get_slot(pdata->bus,
> +				PCI_DEVFN(data->id.device, data->id.function));
>  	if (!data->dev) {
>  		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>  				  "Device %04x:%02x:%02x.%d not present in PCI namespace\n",
>
> pci_get_slot() iterates across bus->devices too (except that it
> correctly grabs the pci_bus_sem first) in addition to the obvious
> refcount on the pci_dev.
>
> Given the above, I feel pretty comfortable with Kenji-san's
> change, and I'd recommend that he just get rid of that confusing
> comment entirely.
>
> The only other suggestion I have is that he could trim down the
> oops output a bit to just get the stack trace:
>
> 	BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
> 	IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
>
> 	Call Trace:
> 	[<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
> 	[<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
> 	[<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
> 	[<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
> 	[<ffffffff803cfc18>] power_write_file+0xc8/0x110
> 	[<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
> 	[<ffffffff803469ce>] sysfs_write_file+0xce/0x140
> 	[<ffffffff802e94e7>] vfs_write+0xc7/0x170
> 	[<ffffffff802e9aa0>] sys_write+0x50/0x90
> 	[<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b
>

Thank you very much for the detailed analysis. I agree with you.
I'll send an updated patch (instead of an additional patch to remove
the obsolete comment), but it will be tomorrow because I have the day
off today. Sorry...

Thanks,
Kenji Kaneshige
* Re: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
From: Len Brown @ 2009-05-27 21:56 UTC
To: Kenji Kaneshige
Cc: Alex Chiang, Bjorn Helgaas, linux-pci@vger.kernel.org, Jesse Barnes, linux acpi

From dacd2549ca61ddbdd1ed62a76ca95dea3f0e02c6 Mon Sep 17 00:00:00 2001
From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Date: Tue, 26 May 2009 09:08:03 +0900
Subject: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()

The 'dev' field of struct acpi_pci_data is having a pointer to struct
pci_dev without incrementing the reference counter. Because of this, I
got the following kernel oops when I was doing some pci hotplug
operations. This patch fixes this bug by replacing wrong hand-made
pci_find_slot() with pci_get_slot() in acpi_pci_bind().

BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd

Call Trace:
[<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
[<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
[<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
[<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
[<ffffffff803cfc18>] power_write_file+0xc8/0x110
[<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
[<ffffffff803469ce>] sysfs_write_file+0xce/0x140
[<ffffffff802e94e7>] vfs_write+0xc7/0x170
[<ffffffff802e9aa0>] sys_write+0x50/0x90
[<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Tested-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---

as applied to the acpi tree...

cheers,
-Len

 drivers/acpi/pci_bind.c | 24 ++++++------------------
 1 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/drivers/acpi/pci_bind.c b/drivers/acpi/pci_bind.c
index 95650f8..bc46de3 100644
--- a/drivers/acpi/pci_bind.c
+++ b/drivers/acpi/pci_bind.c
@@ -116,9 +116,6 @@ int acpi_pci_bind(struct acpi_device *device)
 	struct acpi_pci_data *pdata;
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	acpi_handle handle;
-	struct pci_dev *dev;
-	struct pci_bus *bus;
-
 
 	if (!device || !device->parent)
 		return -EINVAL;
@@ -176,20 +173,9 @@ int acpi_pci_bind(struct acpi_device *device)
 	 * Locate matching device in PCI namespace. If it doesn't exist
 	 * this typically means that the device isn't currently inserted
 	 * (e.g. docking station, port replicator, etc.).
-	 * We cannot simply search the global pci device list, since
-	 * PCI devices are added to the global pci list when the root
-	 * bridge start ops are run, which may not have happened yet.
 	 */
-	bus = pci_find_bus(data->id.segment, data->id.bus);
-	if (bus) {
-		list_for_each_entry(dev, &bus->devices, bus_list) {
-			if (dev->devfn == PCI_DEVFN(data->id.device,
-						    data->id.function)) {
-				data->dev = dev;
-				break;
-			}
-		}
-	}
+	data->dev = pci_get_slot(pdata->bus,
+				PCI_DEVFN(data->id.device, data->id.function));
 	if (!data->dev) {
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "Device %04x:%02x:%02x.%d not present in PCI namespace\n",
@@ -259,9 +245,10 @@ int acpi_pci_bind(struct acpi_device *device)
 
 end:
 	kfree(buffer.pointer);
-	if (result)
+	if (result) {
+		pci_dev_put(data->dev);
 		kfree(data);
-
+	}
 	return result;
 }
 
@@ -303,6 +290,7 @@ static int acpi_pci_unbind(struct acpi_device *device)
 	if (data->dev->subordinate) {
 		acpi_pci_irq_del_prt(data->id.segment, data->bus->number);
 	}
+	pci_dev_put(data->dev);
 	kfree(data);
 
 end:
-- 
1.6.3.1.169.g33fd
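
The error path touched by the third hunk of the applied patch can be
modelled outside the kernel as follows. This is a sketch only:
toy_bind(), its fail_late parameter and the negative return values are
made up for illustration. The one property borrowed from the real API
is that the put helper accepts NULL, as pci_dev_put() does, so the
cleanup code can call it unconditionally.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_dev. */
struct toy_dev {
	int refcount;
};

/* NULL-safe put, mirroring the NULL check in pci_dev_put(). */
static void toy_dev_put(struct toy_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/* Hypothetical stand-in for struct acpi_pci_data. */
struct toy_data {
	struct toy_dev *dev;
};

/*
 * 'found' is what a lookup returned: a device with a reference already
 * taken on the caller's behalf, or NULL. 'fail_late' forces a failure
 * after the lookup succeeded. On success *out owns both the data and
 * the reference; on failure both are released before returning.
 */
static int toy_bind(struct toy_dev *found, int fail_late, struct toy_data **out)
{
	int result = 0;
	struct toy_data *data = calloc(1, sizeof(*data));

	if (!data)
		return -1;		/* illustrative error value */

	data->dev = found;
	if (!data->dev) {
		result = -2;		/* device not present */
		goto end;
	}
	if (fail_late) {
		result = -3;		/* some later step failed */
		goto end;
	}
	*out = data;

end:
	if (result) {
		toy_dev_put(data->dev);	/* no-op when the lookup found nothing */
		free(data);
	}
	return result;
}
```

Both failure cases share one cleanup block: when the lookup found
nothing there is no reference to drop, and when a later step fails the
reference taken by the lookup is returned before the data is freed.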
* Re: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
From: Kenji Kaneshige @ 2009-05-28 0:09 UTC
To: Len Brown
Cc: Alex Chiang, Bjorn Helgaas, linux-pci@vger.kernel.org, Jesse Barnes, linux acpi

Thank you for updating the patch. It looks good to me.

Thanks,
Kenji Kaneshige

Len Brown wrote:
> From dacd2549ca61ddbdd1ed62a76ca95dea3f0e02c6 Mon Sep 17 00:00:00 2001
> From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Date: Tue, 26 May 2009 09:08:03 +0900
> Subject: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
>
> The 'dev' field of struct acpi_pci_data is having a pointer to struct
> pci_dev without incrementing the reference counter. Because of this, I
> got the following kernel oops when I was doing some pci hotplug
> operations. This patch fixes this bug by replacing wrong hand-made
> pci_find_slot() with pci_get_slot() in acpi_pci_bind().
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
> IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
>
> Call Trace:
> [<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
> [<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
> [<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
> [<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
> [<ffffffff803cfc18>] power_write_file+0xc8/0x110
> [<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
> [<ffffffff803469ce>] sysfs_write_file+0xce/0x140
> [<ffffffff802e94e7>] vfs_write+0xc7/0x170
> [<ffffffff802e9aa0>] sys_write+0x50/0x90
> [<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b
>
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
> Reviewed-by: Alex Chiang <achiang@hp.com>
> Tested-by: Alex Chiang <achiang@hp.com>
> Signed-off-by: Len Brown <len.brown@intel.com>
> ---
>
> as applied to the acpi tree...
>
> cheers,
> -Len
>
>  drivers/acpi/pci_bind.c | 24 ++++++------------------
>  1 files changed, 6 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/acpi/pci_bind.c b/drivers/acpi/pci_bind.c
> index 95650f8..bc46de3 100644
> --- a/drivers/acpi/pci_bind.c
> +++ b/drivers/acpi/pci_bind.c
> @@ -116,9 +116,6 @@ int acpi_pci_bind(struct acpi_device *device)
>  	struct acpi_pci_data *pdata;
>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>  	acpi_handle handle;
> -	struct pci_dev *dev;
> -	struct pci_bus *bus;
> -
>
>  	if (!device || !device->parent)
>  		return -EINVAL;
> @@ -176,20 +173,9 @@ int acpi_pci_bind(struct acpi_device *device)
>  	 * Locate matching device in PCI namespace. If it doesn't exist
>  	 * this typically means that the device isn't currently inserted
>  	 * (e.g. docking station, port replicator, etc.).
> -	 * We cannot simply search the global pci device list, since
> -	 * PCI devices are added to the global pci list when the root
> -	 * bridge start ops are run, which may not have happened yet.
>  	 */
> -	bus = pci_find_bus(data->id.segment, data->id.bus);
> -	if (bus) {
> -		list_for_each_entry(dev, &bus->devices, bus_list) {
> -			if (dev->devfn == PCI_DEVFN(data->id.device,
> -						    data->id.function)) {
> -				data->dev = dev;
> -				break;
> -			}
> -		}
> -	}
> +	data->dev = pci_get_slot(pdata->bus,
> +				PCI_DEVFN(data->id.device, data->id.function));
>  	if (!data->dev) {
>  		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>  				  "Device %04x:%02x:%02x.%d not present in PCI namespace\n",
> @@ -259,9 +245,10 @@ int acpi_pci_bind(struct acpi_device *device)
>
>  end:
>  	kfree(buffer.pointer);
> -	if (result)
> +	if (result) {
> +		pci_dev_put(data->dev);
>  		kfree(data);
> -
> +	}
>  	return result;
>  }
>
> @@ -303,6 +290,7 @@ static int acpi_pci_unbind(struct acpi_device *device)
>  	if (data->dev->subordinate) {
>  		acpi_pci_irq_del_prt(data->id.segment, data->bus->number);
>  	}
> +	pci_dev_put(data->dev);
>  	kfree(data);
>
>  end:
* Re: [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind() 2009-05-26 15:45 ` Bjorn Helgaas 2009-05-26 23:41 ` Alex Chiang @ 2009-05-26 23:43 ` Kenji Kaneshige 1 sibling, 0 replies; 7+ messages in thread From: Kenji Kaneshige @ 2009-05-26 23:43 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-pci@vger.kernel.org, Jesse Barnes, Alex Chiang, linux acpi Bjorn Helgaas wrote: > On Monday 25 May 2009 06:08:03 pm Kenji Kaneshige wrote: >> Fix wrong struct pci_dev reference counter handling in acpi_pci_bind(). >> >> The 'dev' field of struct acpi_pci_data is having a pointer to struct >> pci_dev without incrementing the reference counter. Because of this, I >> got the following kernel oops when I was doing some pci hotplug >> operations. This patch fixes this bug by replacing wrong hand-made >> pci_find_slot() with pci_get_slot() in acpi_pci_bind(). > > I don't like this ACPI/PCI bind thing in general because having the > extra .bind and .unbind methods is ugly and somewhat non-obvious, and > I'm nervous about object lifetime issues like this one. > > But I don't have a better alternative to offer, and there's definitely > a problem here, so thanks for fixing and testing it. I do have one > question below about whether the comment in the existing code, which > seems to be an excuse for doing the hand-made pci_find_slot(), is > still relevant, or should just be removed. 
>
> Bjorn
>
>> [ 206.427004] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
>> [ 206.427076] IP: [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
>> [ 206.427076] PGD 8225ad067 PUD 82258b067 PMD 0
>> [ 206.427076] Oops: 0000 [#1] SMP
>> [ 206.427076] last sysfs file: /sys/bus/pci/slots/1/power
>> [ 206.427076] CPU 2
>> [ 206.427076] Modules linked in: acpiphp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc pci_slot battery ac parport_pc lp parport sg mptspi mptscsih mptbase scsi_transport_spi sr_mod cdrom e1000e serio_raw button i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
>> [ 206.427076] Pid: 10367, comm: bash Not tainted 2.6.30-rc4-kk #10 PRIMERGY
>> [ 206.427076] RIP: 0010:[<ffffffff803f0e9b>] [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
>> [ 206.427076] RSP: 0018:ffff8808225a9d68 EFLAGS: 00010206
>> [ 206.427076] RAX: 0000000000000000 RBX: ffff88083c547800 RCX: 0000000000000006
>> [ 206.427076] RDX: ffff88083ce36ca0 RSI: ffff880822508768 RDI: 0000000000000000
>> [ 206.427076] RBP: ffff8808225a9d98 R08: 0000000000000000 R09: 0000000000000000
>> [ 206.427076] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
>> [ 206.427076] R13: 0000000000000000 R14: 0000000000000001 R15: ffff8808225a9e28
>> [ 206.427076] FS: 00007f376c4ee6e0(0000) GS:ffff880054636000(0000) knlGS:0000000000000000
>> [ 206.427076] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 206.427076] CR2: 00000000000000e8 CR3: 0000000822590000 CR4: 00000000000006e0
>> [ 206.427076] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 206.427076] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [ 206.427076] Process bash (pid: 10367, threadinfo ffff8808225a8000, task ffff880822508000)
>> [ 206.427076] Stack:
>> [ 206.427076] 000000000000001f ffff88083addd5a0 ffff8808225a9d98 ffff88083ce36ca0
>> [ 206.427076] ffff88083c547800 ffff88083c547800 ffff8808225a9db8 ffffffff803ecee4
>> [ 206.427076] ffff88083c547800 ffff88083c547000 ffff8808225a9e08 ffffffff803ecf6d
>> [ 206.427076] Call Trace:
>> [ 206.427076] [<ffffffff803ecee4>] acpi_bus_remove+0x54/0x68
>> [ 206.427076] [<ffffffff803ecf6d>] acpi_bus_trim+0x75/0xe3
>> [ 206.427076] [<ffffffffa0345ddd>] acpiphp_disable_slot+0x16d/0x1e0 [acpiphp]
>> [ 206.427076] [<ffffffffa03441f0>] disable_slot+0x20/0x60 [acpiphp]
>> [ 206.427076] [<ffffffff803cfc18>] power_write_file+0xc8/0x110
>> [ 206.427076] [<ffffffff803c6a54>] pci_slot_attr_store+0x24/0x30
>> [ 206.427076] [<ffffffff803469ce>] sysfs_write_file+0xce/0x140
>> [ 206.427076] [<ffffffff802e94e7>] vfs_write+0xc7/0x170
>> [ 206.427076] [<ffffffff802e9aa0>] sys_write+0x50/0x90
>> [ 206.427076] [<ffffffff8020bd6b>] system_call_fastpath+0x16/0x1b
>> [ 206.427076] Code: be 2b 01 00 00 48 c7 c7 d0 c6 5a 80 31 c0 e8 c3 8f 01 00 eb 36 48 8b 55 e8 48 8b 42 10 48 83 78 18 00 74 13 48 8b 42 08 0f b7 3a <0f> b6 b0 e8 00 00 00 e8 ab f8 ff ff 48 8b 7d e8 e8 30 cf ee ff
>> [ 206.427076] RIP [<ffffffff803f0e9b>] acpi_pci_unbind+0xb1/0xdd
>> [ 206.427076] RSP <ffff8808225a9d68>
>> [ 206.427076] CR2: 00000000000000e8
>> [ 206.440158] ---[ end trace 1ca3974fa717e665 ]---
>>
>> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>>
>>  drivers/acpi/pci_bind.c |   21 ++++++---------------
>>  1 file changed, 6 insertions(+), 15 deletions(-)
>>
>> Index: 20090521/drivers/acpi/pci_bind.c
>> ===================================================================
>> --- 20090521.orig/drivers/acpi/pci_bind.c
>> +++ 20090521/drivers/acpi/pci_bind.c
>> @@ -116,9 +116,6 @@ int acpi_pci_bind(struct acpi_device *de
>>  	struct acpi_pci_data *pdata;
>>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>>  	acpi_handle handle;
>> -	struct pci_dev *dev;
>> -	struct pci_bus *bus;
>> -
>>
>>  	if (!device || !device->parent)
>>  		return -EINVAL;
>> @@ -180,16 +177,8 @@ int acpi_pci_bind(struct acpi_device *de
>>  	 * PCI devices are added to the global pci list when the root
>>  	 * bridge start ops are run, which may not have happened yet.
>
> Please update or remove this comment, which claims that "we cannot
> simply search the global pci device list." I don't know whether the
> comment (a) explains why we can't use pci_get_slot(), (b) explains
> why we can't use pci_find_slot() or some other interface, (c) refers
> to an ordering problem that doesn't exist on your system, or (d) is
> just no longer applicable at all.
>

My answer is (d).

In the past, the PCI core had two linked lists on which struct pci_dev
was kept: a per-bus list and a global list. The global list no longer
exists; it was removed several months/years(?) ago.

At acpi_pci_bind() time, struct pci_dev is already on the per-bus
list, but it was not yet on the global list (struct pci_dev used to be
added to the global list at .start time of the ACPI PCI root driver).
So I think this comment just explains that we could not find struct
pci_dev on the global list at this point.

I'll make an additional patch to remove this obsolete comment.

Thanks,
Kenji Kaneshige

^ permalink raw reply [flat|nested] 7+ messages in thread
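
The rule this patch enforces, that a stored pointer must own a reference which is dropped on every exit path, can be illustrated outside the kernel. The following is only a minimal user-space sketch under stated assumptions: `struct toy_dev`, `struct toy_bus`, `toy_find_slot()`, `toy_get_slot()` and `toy_dev_put()` are hypothetical names made up for illustration, loosely modeled on the open-coded lookup the patch removes and on pci_get_slot()/pci_dev_put(). It does not reproduce the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: only what the sketch needs. */
struct toy_dev {
	unsigned int devfn;	/* device/function number */
	int refcount;		/* outstanding references */
};

/* Hypothetical stand-in for a per-bus device list. */
struct toy_bus {
	struct toy_dev *devices;
	size_t ndevices;
};

/*
 * Lookup that does NOT take a reference, like the open-coded
 * list walk removed by the patch. The caller's pointer can dangle
 * if the device is released while the pointer is still stored.
 */
static struct toy_dev *toy_find_slot(struct toy_bus *bus, unsigned int devfn)
{
	for (size_t i = 0; i < bus->ndevices; i++)
		if (bus->devices[i].devfn == devfn)
			return &bus->devices[i];
	return NULL;
}

/* Lookup that takes a reference on success (get-style helper). */
static struct toy_dev *toy_get_slot(struct toy_bus *bus, unsigned int devfn)
{
	struct toy_dev *dev = toy_find_slot(bus, devfn);

	if (dev)
		dev->refcount++;
	return dev;
}

/* Drop a reference; NULL is accepted so error paths can call it blindly. */
static void toy_dev_put(struct toy_dev *dev)
{
	if (!dev)
		return;
	assert(dev->refcount > 0);
	dev->refcount--;
}
```

In this sketch, a bind-style function would store the result of `toy_get_slot()` and call `toy_dev_put()` both on its error path and in the matching unbind-style function, which mirrors the two `pci_dev_put()` calls the patch adds. Accepting NULL in the put helper is what lets the error path run unconditionally even when the lookup found nothing.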
end of thread, other threads:[~2009-05-28 0:09 UTC | newest]

Thread overview: 7+ messages
2009-05-26 0:08 [PATCH] PCI/ACPI: fix wrong ref count handling in acpi_pci_bind() Kenji Kaneshige
2009-05-26 15:45 ` Bjorn Helgaas
2009-05-26 23:41 ` Alex Chiang
2009-05-26 23:58 ` Kenji Kaneshige
2009-05-27 21:56 ` Len Brown
2009-05-28 0:09 ` Kenji Kaneshige
2009-05-26 23:43 ` Kenji Kaneshige