* 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
@ 2008-04-04 17:53 Badari Pulavarty
2008-04-04 18:05 ` Alex Chiang
0 siblings, 1 reply; 23+ messages in thread
From: Badari Pulavarty @ 2008-04-04 17:53 UTC (permalink / raw)
To: achiang, Andrew Morton; +Cc: lkml
Hi Andrew & Alex,
I get following panic in rpaphp_register_slot() on
2.6.25-rc8-mm1. Known issue ?
Thanks,
Badari
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
Unable to handle kernel paging request for data at address 0x00000070
Faulting instruction address: 0xc00000000031e4b8
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA pSeries
Modules linked in:
NIP: c00000000031e4b8 LR: c00000000031dd60 CTR: 0000000000000000
REGS: c0000000700dba30 TRAP: 0300 Not tainted (2.6.25-rc8-mm1)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 42002022 XER: 20000009
DAR: 0000000000000070, DSISR: 0000000040000000
TASK = c0000000700d7990[1] 'swapper' THREAD: c0000000700d8000 CPU: 0
GPR00: c0000000006c6e90 c0000000700dbcb0 c000000000769140 c0000000703c1440
GPR04: c000000070226400 0000000000000000 0000000020000009 d000080080000004
GPR08: 0000000000000001 0000000000000000 c000000071fff1b8 0000000000000000
GPR12: 0000000000004000 c000000000690380 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 4000000001c00000 0000000000000000 0000000002247c38 0000000000000000
GPR24: 0000000002247ea8 c000000071fff1b8 0000000000000000 c0000000023e8f48
GPR28: c0000000703c1440 c000000070ff7780 c0000000006f7778 c0000000006c6e90
NIP [c00000000031e4b8] .rpaphp_register_slot+0xd8/0x164
LR [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260
Call Trace:
[c0000000700dbcb0] [c0000000700dbd70] 0xc0000000700dbd70 (unreliable)
[c0000000700dbd40] [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260
[c0000000700dbe20] [c000000000636864] .rpaphp_init+0x34/0x68
[c0000000700dbea0] [c0000000006103ec] .kernel_init+0x1f8/0x38c
[c0000000700dbf90] [c000000000023f80] .kernel_thread+0x4c/0x68
Instruction dump:
48000020 e8bd0020 e89e8010 e87e8048 4bd38c09 60000000 3860fff5 48000080
e93d0028 e89d0030 7f83e378 e9290038 <e9290070> 80a90004 78a5eee2 4bffee7d
---[ end trace 561bb236c800851f ]---
Kernel panic - not syncing: Attempted to kill init!
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-04 17:53 2.6.25-rc8-mm1 panic in rpaphp_register_slot() Badari Pulavarty
@ 2008-04-04 18:05 ` Alex Chiang
2008-04-04 20:19 ` Badari Pulavarty
0 siblings, 1 reply; 23+ messages in thread
From: Alex Chiang @ 2008-04-04 18:05 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: Andrew Morton, lkml
Hi Badari,
* Badari Pulavarty <pbadari@us.ibm.com>:
> Hi Andrew & Alex,
>
> I get following panic in rpaphp_register_slot() on
> 2.6.25-rc8-mm1. Known issue ?
I know about it now... :-/
Any chance you could turn debugging on for both rpaphp and
pci_hotplug?
Thanks.
/ac
> pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> Unable to handle kernel paging request for data at address 0x00000070
> Faulting instruction address: 0xc00000000031e4b8
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=32 NUMA pSeries
> Modules linked in:
> NIP: c00000000031e4b8 LR: c00000000031dd60 CTR: 0000000000000000
> REGS: c0000000700dba30 TRAP: 0300 Not tainted (2.6.25-rc8-mm1)
> MSR: 8000000000009032 <EE,ME,IR,DR> CR: 42002022 XER: 20000009
> DAR: 0000000000000070, DSISR: 0000000040000000
> TASK = c0000000700d7990[1] 'swapper' THREAD: c0000000700d8000 CPU: 0
> GPR00: c0000000006c6e90 c0000000700dbcb0 c000000000769140 c0000000703c1440
> GPR04: c000000070226400 0000000000000000 0000000020000009 d000080080000004
> GPR08: 0000000000000001 0000000000000000 c000000071fff1b8 0000000000000000
> GPR12: 0000000000004000 c000000000690380 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 4000000001c00000 0000000000000000 0000000002247c38 0000000000000000
> GPR24: 0000000002247ea8 c000000071fff1b8 0000000000000000 c0000000023e8f48
> GPR28: c0000000703c1440 c000000070ff7780 c0000000006f7778 c0000000006c6e90
> NIP [c00000000031e4b8] .rpaphp_register_slot+0xd8/0x164
> LR [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260
> Call Trace:
> [c0000000700dbcb0] [c0000000700dbd70] 0xc0000000700dbd70 (unreliable)
> [c0000000700dbd40] [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260
> [c0000000700dbe20] [c000000000636864] .rpaphp_init+0x34/0x68
> [c0000000700dbea0] [c0000000006103ec] .kernel_init+0x1f8/0x38c
> [c0000000700dbf90] [c000000000023f80] .kernel_thread+0x4c/0x68
> Instruction dump:
> 48000020 e8bd0020 e89e8010 e87e8048 4bd38c09 60000000 3860fff5 48000080
> e93d0028 e89d0030 7f83e378 e9290038 <e9290070> 80a90004 78a5eee2 4bffee7d
> ---[ end trace 561bb236c800851f ]---
> Kernel panic - not syncing: Attempted to kill init!
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-04 18:05 ` Alex Chiang
@ 2008-04-04 20:19 ` Badari Pulavarty
2008-04-04 22:42 ` Alex Chiang
0 siblings, 1 reply; 23+ messages in thread
From: Badari Pulavarty @ 2008-04-04 20:19 UTC (permalink / raw)
To: Alex Chiang; +Cc: Andrew Morton, lkml
On Fri, 2008-04-04 at 12:05 -0600, Alex Chiang wrote:
> Hi Badari,
>
> * Badari Pulavarty <pbadari@us.ibm.com>:
> > Hi Andrew & Alex,
> >
> > I get following panic in rpaphp_register_slot() on
> > 2.6.25-rc8-mm1. Known issue ?
>
> I know about it now... :-/
>
> Any chance you could turn debugging on for both rpaphp and
> pci_hotplug?
>
I did set "debug" variables in rpaphp_core.c and pci_hotplug_core.c
to 1. But it didn't give any more information. How do I enable
the debug ?
I just changed dbg() to printk() in rpaphp_register_slot() to
print out more information.
Thanks,
Badari
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
Unable to handle kernel paging request for data at address 0x00000070
Faulting instruction address: 0xc00000000031e4ac
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA pSeries
Modules linked in:
NIP: c00000000031e4ac LR: c00000000031e434 CTR: 800000000013f270
REGS: c0000000700dba30 TRAP: 0300 Not tainted (2.6.25-rc8-mm1)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 22002022 XER: 00000004
DAR: 0000000000000070, DSISR: 0000000040000000
TASK = c0000000700d7990[1] 'swapper' THREAD: c0000000700d8000 CPU: 0
GPR00: c0000000006c6f10 c0000000700dbcb0 c0000000007691c0 c000000070372440
GPR04: c000000070217400 0000000000000001 0000000000000000 0000000000000001
GPR08: c0000000007917cc 0000000000000000 00000000000050b3 c0000000007917c8
GPR12: 0000000000004000 c000000000690400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 4000000001c00000 0000000000000000 0000000002247c38 0000000000000000
GPR24: 0000000002247ea8 c000000071fff1b8 0000000000000000 c0000000023e8f48
GPR28: c000000070372440 c00000006e036780 c0000000006f77f8 c0000000006c6f10
NIP [c00000000031e4ac] .rpaphp_register_slot+0xcc/0x158
LR [c00000000031e434] .rpaphp_register_slot+0x54/0x158
Call Trace:
[c0000000700dbcb0] [c00000000031e434] .rpaphp_register_slot+0x54/0x158 (unreliable)
[c0000000700dbd40] [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260
[c0000000700dbe20] [c000000000636864] .rpaphp_init+0x34/0x68
[c0000000700dbea0] [c0000000006103ec] .kernel_init+0x1f8/0x38c
[c0000000700dbf90] [c000000000023f80] .kernel_thread+0x4c/0x68
Instruction dump:
48000020 e8bd0020 e89e8010 e87e8048 4bd38c15 60000000 3860fff5 48000080
e93d0028 e89d0030 7f83e378 e9290038 <e9290070> 80a90004 78a5eee2 4bffee89
---[ end trace 561bb236c800851f ]---
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-04 20:19 ` Badari Pulavarty
@ 2008-04-04 22:42 ` Alex Chiang
2008-04-04 23:35 ` Badari Pulavarty
0 siblings, 1 reply; 23+ messages in thread
From: Alex Chiang @ 2008-04-04 22:42 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: Andrew Morton, lkml, linas
* Badari Pulavarty <pbadari@us.ibm.com>:
>
> On Fri, 2008-04-04 at 12:05 -0600, Alex Chiang wrote:
> > Hi Badari,
> >
> > * Badari Pulavarty <pbadari@us.ibm.com>:
> > > Hi Andrew & Alex,
> > >
> > > I get following panic in rpaphp_register_slot() on
> > > 2.6.25-rc8-mm1. Known issue ?
> >
> > I know about it now... :-/
> >
> > Any chance you could turn debugging on for both rpaphp and
> > pci_hotplug?
> >
>
> I did set "debug" variables in rpaphp_core.c and pci_hotplug_core.c
> to 1. But it didn't give any more information. How do I enable
> the debug ?
If those modules are built into your kernel, then you'll need to
modify the kernel command line to add:
rpaphp.debug=1 pci_hotplug.debug=1
If you're manually modprobing them:
$ modprobe pci_hotplug debug=1
$ modprobe rpaphp debug=1
> I just changed dbg() to printk() in rpaphp_register_slot() to
> print out more information.
>
> Thanks,
> Badari
>
>
> pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> Unable to handle kernel paging request for data at address 0x00000070
Hrm, this is a little more information, but still not quite
enough. I'm going to take a stab in the dark and say I'm probably
doing something wrong on this line, maybe dereferencing a pointer
incorrectly:
retval = pci_hp_register(php_slot, slot->bus,
PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
But what the real answer is, I don't know, since I don't know
anything about POWER.
Linas -- can you help me out and tell me if I'm doing anything
obviously stupid?
Thanks.
/ac
> Faulting instruction address: 0xc00000000031e4ac
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=32 NUMA pSeries
> Modules linked in:
> NIP: c00000000031e4ac LR: c00000000031e434 CTR: 800000000013f270
> REGS: c0000000700dba30 TRAP: 0300 Not tainted (2.6.25-rc8-mm1)
> MSR: 8000000000009032 <EE,ME,IR,DR> CR: 22002022 XER: 00000004
> DAR: 0000000000000070, DSISR: 0000000040000000
> TASK = c0000000700d7990[1] 'swapper' THREAD: c0000000700d8000 CPU: 0
> GPR00: c0000000006c6f10 c0000000700dbcb0 c0000000007691c0 c000000070372440
> GPR04: c000000070217400 0000000000000001 0000000000000000 0000000000000001
> GPR08: c0000000007917cc 0000000000000000 00000000000050b3 c0000000007917c8
> GPR12: 0000000000004000 c000000000690400 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 4000000001c00000 0000000000000000 0000000002247c38 0000000000000000
> GPR24: 0000000002247ea8 c000000071fff1b8 0000000000000000 c0000000023e8f48
> GPR28: c000000070372440 c00000006e036780 c0000000006f77f8 c0000000006c6f10
> NIP [c00000000031e4ac] .rpaphp_register_slot+0xcc/0x158
> LR [c00000000031e434] .rpaphp_register_slot+0x54/0x158
> Call Trace:
> [c0000000700dbcb0] [c00000000031e434] .rpaphp_register_slot+0x54/0x158 (unreliable)
> [c0000000700dbd40] [c00000000031dd60] .rpaphp_add_slot+0x1a4/0x260
> [c0000000700dbe20] [c000000000636864] .rpaphp_init+0x34/0x68
> [c0000000700dbea0] [c0000000006103ec] .kernel_init+0x1f8/0x38c
> [c0000000700dbf90] [c000000000023f80] .kernel_thread+0x4c/0x68
> Instruction dump:
> 48000020 e8bd0020 e89e8010 e87e8048 4bd38c15 60000000 3860fff5 48000080
> e93d0028 e89d0030 7f83e378 e9290038 <e9290070> 80a90004 78a5eee2 4bffee89
> ---[ end trace 561bb236c800851f ]---
>
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-04 22:42 ` Alex Chiang
@ 2008-04-04 23:35 ` Badari Pulavarty
2008-04-07 23:42 ` Alex Chiang
0 siblings, 1 reply; 23+ messages in thread
From: Badari Pulavarty @ 2008-04-04 23:35 UTC (permalink / raw)
To: Alex Chiang; +Cc: Andrew Morton, lkml, linas
On Fri, 2008-04-04 at 16:42 -0600, Alex Chiang wrote:
> * Badari Pulavarty <pbadari@us.ibm.com>:
> >
> > On Fri, 2008-04-04 at 12:05 -0600, Alex Chiang wrote:
> > > Hi Badari,
> > >
> > > * Badari Pulavarty <pbadari@us.ibm.com>:
> > > > Hi Andrew & Alex,
> > > >
> > > > I get following panic in rpaphp_register_slot() on
> > > > 2.6.25-rc8-mm1. Known issue ?
> > >
> > > I know about it now... :-/
> > >
> > > Any chance you could turn debugging on for both rpaphp and
> > > pci_hotplug?
> > >
> >
> > I did set "debug" variables in rpaphp_core.c and pci_hotplug_core.c
> > to 1. But it didn't give any more information. How do I enable
> > the debug ?
>
> If those modules are built into your kernel, then you'll need to
> modify the kernel command line to add:
>
> rpaphp.debug=1 pci_hotplug.debug=1
>
> If you're manually modprobing them:
>
> $ modprobe pci_hotplug debug=1
> $ modprobe rpaphp debug=1
>
> > I just changed dbg() to printk() in rpaphp_register_slot() to
> > print out more information.
> >
> > Thanks,
> > Badari
> >
> >
> > pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> > rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> > Unable to handle kernel paging request for data at address 0x00000070
>
> Hrm, this is a little more information, but still not quite
> enough. I'm going to take a stab in the dark and say I'm probably
> doing something wrong on this line, maybe dereferencing a pointer
> incorrectly:
>
> retval = pci_hp_register(php_slot, slot->bus,
> PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
>
Sorry. I thought you knew this already. Disassembly clearly showed
that slot->dn->child is NULL.
I confirmed it by adding printk also.
Thanks,
Badari
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-04 23:35 ` Badari Pulavarty
@ 2008-04-07 23:42 ` Alex Chiang
2008-04-16 0:36 ` Andrew Morton
2008-04-16 3:17 ` Alex Chiang
0 siblings, 2 replies; 23+ messages in thread
From: Alex Chiang @ 2008-04-07 23:42 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: Andrew Morton, lkml, linas
Hi Badari,
> > > pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > > rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> > > rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> > > Unable to handle kernel paging request for data at address 0x00000070
> >
> > Hrm, this is a little more information, but still not quite
> > enough. I'm going to take a stab in the dark and say I'm probably
> > doing something wrong on this line, maybe dereferencing a pointer
> > incorrectly:
> >
> > retval = pci_hp_register(php_slot, slot->bus,
> > PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
>
> Sorry. I thought you knew this already. Disassembly clearly showed
> that slot->dn->child is NULL.
>
> I confirmed it by adding printk also.
This patch is a complete guess on my part (since I've not been
able to understand pseries architecture) but I think it should
fix your issue.
Can you give it a try and let me know? It applies on top of the
-mm tree that includes my physical pci_slot series.
Also, I'm hoping Linas will speak up and let me know what the
real answer might be. ;)
Thanks.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
Unpopulated device_node slots do not have children, and
attempting to dereference them will result in a panic.
Instead, attempt to derive the PCI slot number from the bus
itself, and failing that, default to 0.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
index 0d4cfc7..91ce6a6 100644
--- a/drivers/pci/hotplug/rpaphp_slot.c
+++ b/drivers/pci/hotplug/rpaphp_slot.c
@@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
{
struct hotplug_slot *php_slot = slot->hotplug_slot;
int retval;
+ int slot_nr;
dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
__FUNCTION__, slot->dn->full_name, slot->index, slot->name,
@@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
return -EAGAIN;
}
- retval = pci_hp_register(php_slot, slot->bus,
- PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
+ if (slot->bus->self)
+ slot_nr = PCI_SLOT(slot->bus->self->devfn);
+ else
+ slot_nr = 0;
+ retval = pci_hp_register(php_slot, slot->bus, slot_nr);
if (retval) {
err("pci_hp_register failed with error %d\n", retval);
return retval;
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-07 23:42 ` Alex Chiang
@ 2008-04-16 0:36 ` Andrew Morton
2008-04-16 2:13 ` Alex Chiang
2008-04-16 3:08 ` Badari Pulavarty
2008-04-16 3:17 ` Alex Chiang
1 sibling, 2 replies; 23+ messages in thread
From: Andrew Morton @ 2008-04-16 0:36 UTC (permalink / raw)
To: Alex Chiang; +Cc: pbadari, linux-kernel, linas
On Mon, 7 Apr 2008 17:42:55 -0600
Alex Chiang <achiang@hp.com> wrote:
> Hi Badari,
>
> > > > pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > > > rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> > > > rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> > > > Unable to handle kernel paging request for data at address 0x00000070
> > >
> > > Hrm, this is a little more information, but still not quite
> > > enough. I'm going to take a stab in the dark and say I'm probably
> > > doing something wrong on this line, maybe dereferencing a pointer
> > > incorrectly:
> > >
> > > retval = pci_hp_register(php_slot, slot->bus,
> > > PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> >
> > Sorry. I thought you knew this already. Disassembly clearly showed
> > that slot->dn->child is NULL.
> >
> > I confirmed it by adding printk also.
>
> This patch is a complete guess on my part (since I've not been
> able to understand pseries architecture) but I think it should
> fix your issue.
>
> Can you give it a try and let me know? It applies on top of the
> -mm tree that includes my physical pci_slot series.
>
> Also, I'm hoping Linas will speak up and let me know what the
> real answer might be. ;)
>
> Thanks.
>
> /ac
>
> From: Alex Chiang <achiang@hp.com>
> Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
>
> Unpopulated device_node slots do not have children, and
> attempting to dereference them will result in a panic.
>
> Instead, attempt to derive the PCI slot number from the bus
> itself, and failing that, default to 0.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
> index 0d4cfc7..91ce6a6 100644
> --- a/drivers/pci/hotplug/rpaphp_slot.c
> +++ b/drivers/pci/hotplug/rpaphp_slot.c
> @@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
> {
> struct hotplug_slot *php_slot = slot->hotplug_slot;
> int retval;
> + int slot_nr;
>
> dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
> __FUNCTION__, slot->dn->full_name, slot->index, slot->name,
> @@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
> return -EAGAIN;
> }
>
> - retval = pci_hp_register(php_slot, slot->bus,
> - PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> + if (slot->bus->self)
> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
> + else
> + slot_nr = 0;
> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
> if (retval) {
> err("pci_hp_register failed with error %d\n", retval);
> return retval;
Did we hear back from Badari on this?
Thanks.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 0:36 ` Andrew Morton
@ 2008-04-16 2:13 ` Alex Chiang
2008-04-16 3:08 ` Badari Pulavarty
1 sibling, 0 replies; 23+ messages in thread
From: Alex Chiang @ 2008-04-16 2:13 UTC (permalink / raw)
To: Andrew Morton; +Cc: pbadari, linux-kernel, linas
* Andrew Morton <akpm@linux-foundation.org>:
> Alex Chiang <achiang@hp.com> wrote:
>
> > This patch is a complete guess on my part (since I've not been
> > able to understand pseries architecture) but I think it should
> > fix your issue.
> >
> > Can you give it a try and let me know? It applies on top of the
> > -mm tree that includes my physical pci_slot series.
> >
> > Also, I'm hoping Linas will speak up and let me know what the
> > real answer might be. ;)
>
> Did we hear back from Badari on this?
Haven't heard a peep. :(
Badari? Did it help?
Thanks.
/ac
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 0:36 ` Andrew Morton
2008-04-16 2:13 ` Alex Chiang
@ 2008-04-16 3:08 ` Badari Pulavarty
2008-04-16 3:18 ` Alex Chiang
1 sibling, 1 reply; 23+ messages in thread
From: Badari Pulavarty @ 2008-04-16 3:08 UTC (permalink / raw)
To: Andrew Morton; +Cc: Alex Chiang, linux-kernel, anton
Andrew Morton wrote:
> On Mon, 7 Apr 2008 17:42:55 -0600
> Alex Chiang <achiang@hp.com> wrote:
>
>
>> Hi Badari,
>>
>>
>>>>> pci_hotplug: PCI Hot Plug PCI Core version: 0.5
>>>>> rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
>>>>> rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
>>>>> Unable to handle kernel paging request for data at address 0x00000070
>>>>>
>>>> Hrm, this is a little more information, but still not quite
>>>> enough. I'm going to take a stab in the dark and say I'm probably
>>>> doing something wrong on this line, maybe dereferencing a pointer
>>>> incorrectly:
>>>>
>>>> retval = pci_hp_register(php_slot, slot->bus,
>>>> PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
>>>>
>>> Sorry. I thought you knew this already. Disassembly clearly showed
>>> that slot->dn->child is NULL.
>>>
>>> I confirmed it by adding printk also.
>>>
>> This patch is a complete guess on my part (since I've not been
>> able to understand pseries architecture) but I think it should
>> fix your issue.
>>
>> Can you give it a try and let me know? It applies on top of the
>> -mm tree that includes my physical pci_slot series.
>>
>> Also, I'm hoping Linas will speak up and let me know what the
>> real answer might be. ;)
>>
>> Thanks.
>>
>> /ac
>>
>> From: Alex Chiang <achiang@hp.com>
>> Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
>>
>> Unpopulated device_node slots do not have children, and
>> attempting to dereference them will result in a panic.
>>
>> Instead, attempt to derive the PCI slot number from the bus
>> itself, and failing that, default to 0.
>>
>> Signed-off-by: Alex Chiang <achiang@hp.com>
>> ---
>> diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
>> index 0d4cfc7..91ce6a6 100644
>> --- a/drivers/pci/hotplug/rpaphp_slot.c
>> +++ b/drivers/pci/hotplug/rpaphp_slot.c
>> @@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
>> {
>> struct hotplug_slot *php_slot = slot->hotplug_slot;
>> int retval;
>> + int slot_nr;
>>
>> dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
>> __FUNCTION__, slot->dn->full_name, slot->index, slot->name,
>> @@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
>> return -EAGAIN;
>> }
>>
>> - retval = pci_hp_register(php_slot, slot->bus,
>> - PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
>> + if (slot->bus->self)
>> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
>> + else
>> + slot_nr = 0;
>> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
>> if (retval) {
>> err("pci_hp_register failed with error %d\n", retval);
>> return retval;
>>
>
> Did we hear back from Badari on this?
>
> Thanks.
>
No. It didn't help. I had to hack "slot_nr = 0" all the time to boot my
machine.
I was trying to get some ppc expert to review the patch, as Alex was not
comfortable with the fix/hack. Linas is no longer with IBM :(
Thanks,
Badari
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-07 23:42 ` Alex Chiang
2008-04-16 0:36 ` Andrew Morton
@ 2008-04-16 3:17 ` Alex Chiang
2008-04-16 7:45 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 23+ messages in thread
From: Alex Chiang @ 2008-04-16 3:17 UTC (permalink / raw)
To: benh; +Cc: Badari Pulavarty, Andrew Morton, lkml
Hi Ben,
Could you take a look at this patch and tell me what I'm doing
wrong?
Thanks.
/ac
* Alex Chiang <achiang@hp.com>:
> Hi Badari,
>
> > > > pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > > > rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> > > > rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> > > > Unable to handle kernel paging request for data at address 0x00000070
> > >
> > > Hrm, this is a little more information, but still not quite
> > > enough. I'm going to take a stab in the dark and say I'm probably
> > > doing something wrong on this line, maybe dereferencing a pointer
> > > incorrectly:
> > >
> > > retval = pci_hp_register(php_slot, slot->bus,
> > > PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> >
> > Sorry. I thought you knew this already. Disassembly clearly showed
> > that slot->dn->child is NULL.
> >
> > I confirmed it by adding printk also.
>
> This patch is a complete guess on my part (since I've not been
> able to understand pseries architecture) but I think it should
> fix your issue.
>
> Can you give it a try and let me know? It applies on top of the
> -mm tree that includes my physical pci_slot series.
>
> Also, I'm hoping Linas will speak up and let me know what the
> real answer might be. ;)
>
> Thanks.
>
> /ac
>
> From: Alex Chiang <achiang@hp.com>
> Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
>
> Unpopulated device_node slots do not have children, and
> attempting to dereference them will result in a panic.
>
> Instead, attempt to derive the PCI slot number from the bus
> itself, and failing that, default to 0.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
> index 0d4cfc7..91ce6a6 100644
> --- a/drivers/pci/hotplug/rpaphp_slot.c
> +++ b/drivers/pci/hotplug/rpaphp_slot.c
> @@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
> {
> struct hotplug_slot *php_slot = slot->hotplug_slot;
> int retval;
> + int slot_nr;
>
> dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
> __FUNCTION__, slot->dn->full_name, slot->index, slot->name,
> @@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
> return -EAGAIN;
> }
>
> - retval = pci_hp_register(php_slot, slot->bus,
> - PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> + if (slot->bus->self)
> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
> + else
> + slot_nr = 0;
> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
> if (retval) {
> err("pci_hp_register failed with error %d\n", retval);
> return retval;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 3:08 ` Badari Pulavarty
@ 2008-04-16 3:18 ` Alex Chiang
2008-04-16 3:20 ` Badari Pulavarty
2008-04-16 19:32 ` Badari Pulavarty
0 siblings, 2 replies; 23+ messages in thread
From: Alex Chiang @ 2008-04-16 3:18 UTC (permalink / raw)
To: Badari Pulavarty; +Cc: Andrew Morton, linux-kernel, anton
>>> + if (slot->bus->self)
>>> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
>>> + else
>>> + slot_nr = 0;
>>> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
>>> if (retval) {
>>> err("pci_hp_register failed with error %d\n", retval);
>>> return retval;
>
> No. It didn't help. I had to hack "slot_nr = 0" all the time to boot my
> machine.
It didn't help because you're still getting a NULL deref? Or are
you seeing some other failure mode?
Thanks.
/ac
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 3:18 ` Alex Chiang
@ 2008-04-16 3:20 ` Badari Pulavarty
2008-04-16 19:32 ` Badari Pulavarty
1 sibling, 0 replies; 23+ messages in thread
From: Badari Pulavarty @ 2008-04-16 3:20 UTC (permalink / raw)
To: Alex Chiang, Badari Pulavarty, Andrew Morton, linux-kernel, anton
Alex Chiang wrote:
>>>> + if (slot->bus->self)
>>>> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
>>>> + else
>>>> + slot_nr = 0;
>>>> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
>>>> if (retval) {
>>>> err("pci_hp_register failed with error %d\n", retval);
>>>> return retval;
>>>>
>> No. It didn't help. I had to hack "slot_nr = 0" all the time to boot my
>> machine.
>>
>
> It didn't help because you're still getting a NULL deref? Or are
> you seeing some other failure mode?
>
I am pretty sure it was NULL deref. I can double check again, but have
to wait
till tomorrow morning - don't have access to console right now.
Thanks,
Badari
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 3:17 ` Alex Chiang
@ 2008-04-16 7:45 ` Benjamin Herrenschmidt
2008-04-16 17:11 ` Alex Chiang
0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2008-04-16 7:45 UTC (permalink / raw)
To: Alex Chiang; +Cc: Badari Pulavarty, Andrew Morton, lkml
On Tue, 2008-04-15 at 21:17 -0600, Alex Chiang wrote:
> Hi Ben,
>
> Could you take a look at this patch and tell me what I'm doing
> wrong?
Hrm, I'll have a look but I'd need some context first... This is to fix
a problem introduced by another serie of patches ? Can you give me some
pointers here ?
The pSeries PCI stuff can be tricky so I'll need some time tomorrow to
context switch and remind myself of everything involved :-) So I'd like
to make sure by that time, I have all the elements.
Thanks,
Ben.
> Thanks.
>
> /ac
>
> * Alex Chiang <achiang@hp.com>:
> > Hi Badari,
> >
> > > > > pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > > > > rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> > > > > rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> > > > > Unable to handle kernel paging request for data at address 0x00000070
> > > >
> > > > Hrm, this is a little more information, but still not quite
> > > > enough. I'm going to take a stab in the dark and say I'm probably
> > > > doing something wrong on this line, maybe dereferencing a pointer
> > > > incorrectly:
> > > >
> > > > retval = pci_hp_register(php_slot, slot->bus,
> > > > PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> > >
> > > Sorry. I thought you knew this already. Disassembly clearly showed
> > > that slot->dn->child is NULL.
> > >
> > > I confirmed it by adding printk also.
> >
> > This patch is a complete guess on my part (since I've not been
> > able to understand pseries architecture) but I think it should
> > fix your issue.
> >
> > Can you give it a try and let me know? It applies on top of the
> > -mm tree that includes my physical pci_slot series.
> >
> > Also, I'm hoping Linas will speak up and let me know what the
> > real answer might be. ;)
> >
> > Thanks.
> >
> > /ac
> >
> > From: Alex Chiang <achiang@hp.com>
> > Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
> >
> > Unpopulated device_node slots do not have children, and
> > attempting to dereference them will result in a panic.
> >
> > Instead, attempt to derive the PCI slot number from the bus
> > itself, and failing that, default to 0.
> >
> > Signed-off-by: Alex Chiang <achiang@hp.com>
> > ---
> > diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
> > index 0d4cfc7..91ce6a6 100644
> > --- a/drivers/pci/hotplug/rpaphp_slot.c
> > +++ b/drivers/pci/hotplug/rpaphp_slot.c
> > @@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
> > {
> > struct hotplug_slot *php_slot = slot->hotplug_slot;
> > int retval;
> > + int slot_nr;
> >
> > dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
> > __FUNCTION__, slot->dn->full_name, slot->index, slot->name,
> > @@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
> > return -EAGAIN;
> > }
> >
> > - retval = pci_hp_register(php_slot, slot->bus,
> > - PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> > + if (slot->bus->self)
> > + slot_nr = PCI_SLOT(slot->bus->self->devfn);
> > + else
> > + slot_nr = 0;
> > + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
> > if (retval) {
> > err("pci_hp_register failed with error %d\n", retval);
> > return retval;
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> >
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 7:45 ` Benjamin Herrenschmidt
@ 2008-04-16 17:11 ` Alex Chiang
2008-04-16 20:03 ` Andrew Morton
2008-04-16 22:15 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 23+ messages in thread
From: Alex Chiang @ 2008-04-16 17:11 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Badari Pulavarty, Andrew Morton, lkml
Hi Ben,
* Benjamin Herrenschmidt <benh@kernel.crashing.org>:
>
> On Tue, 2008-04-15 at 21:17 -0600, Alex Chiang wrote:
> >
> > Could you take a look at this patch and tell me what I'm doing
> > wrong?
>
> Hrm, I'll have a look but I'd need some context first... This is to fix
> a problem introduced by another serie of patches ? Can you give me some
> pointers here ?
Thanks for taking a look. The patch series that got accepted into
-mm is here:
http://lkml.org/lkml/2008/3/25/228
There have been several fixes on top of that series. I'm not sure
what the canonical way to refer to -mm patches is, but if you
navigate to this URL:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm2/patch-list
You'll also need to apply:
pci-hotplug-introduce-pci_slot-fix.patch
pci-hotplug-introduce-pci_slot-fix-fix.patch
pci-hotplug-introduce-pci_slot-fix-2.patch
pci-hotplug-introduce-pci_slot-fix-99.patch
pci-hotplug-introduce-pci_slot-fix-3.patch
pci-hotplug-acpi-pci-slot-detection-driver-fix.patch
drivers-acpi-pci_slotc-fix-build-with-config_dmi=n.patch
[wow, that's bad on me :( ]
> The pSeries PCI stuff can be tricky so I'll need some time tomorrow to
> context switch and remind myself of everything involved :-) So I'd like
> to make sure by that time, I have all the elements.
The basic idea, which I keep botching on pSeries, is that when we
make a call to pci_hp_register, we now need to pass it:
pci_hp_register(struct hotplug_slot *slot, struct pci_bus *bus, int slot_nr)
I am having trouble figuring out the slot_nr argument. Basically,
I want to get the devfn of the slot we're looking at.
Thanks again.
/ac
> Thanks,
> Ben.
>
> > Thanks.
> >
> > /ac
> >
> > * Alex Chiang <achiang@hp.com>:
> > > Hi Badari,
> > >
> > > > > > pci_hotplug: PCI Hot Plug PCI Core version: 0.5
> > > > > > rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1
> > > > > > rpaphp_register_slot registering slot:path[/pci@800000020000003/pci@2,4] index[22010003], name[U787E.001.AAA3015-P2-C1] pdomain[22010003] type[16]
> > > > > > Unable to handle kernel paging request for data at address 0x00000070
> > > > >
> > > > > Hrm, this is a little more information, but still not quite
> > > > > enough. I'm going to take a stab in the dark and say I'm probably
> > > > > doing something wrong on this line, maybe dereferencing a pointer
> > > > > incorrectly:
> > > > >
> > > > > retval = pci_hp_register(php_slot, slot->bus,
> > > > > PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> > > >
> > > > Sorry. I thought you knew this already. Disassembly clearly showed
> > > > that slot->dn->child is NULL.
> > > >
> > > > I confirmed it by adding printk also.
> > >
> > > This patch is a complete guess on my part (since I've not been
> > > able to understand pseries architecture) but I think it should
> > > fix your issue.
> > >
> > > Can you give it a try and let me know? It applies on top of the
> > > -mm tree that includes my physical pci_slot series.
> > >
> > > Also, I'm hoping Linas will speak up and let me know what the
> > > real answer might be. ;)
> > >
> > > Thanks.
> > >
> > > /ac
> > >
> > > From: Alex Chiang <achiang@hp.com>
> > > Subject: rpaphp: correctly call pci_hp_register for empty PCI slots
> > >
> > > Unpopulated device_node slots do not have children, and
> > > attempting to dereference them will result in a panic.
> > >
> > > Instead, attempt to derive the PCI slot number from the bus
> > > itself, and failing that, default to 0.
> > >
> > > Signed-off-by: Alex Chiang <achiang@hp.com>
> > > ---
> > > diff --git a/drivers/pci/hotplug/rpaphp_slot.c b/drivers/pci/hotplug/rpaphp_slot.c
> > > index 0d4cfc7..91ce6a6 100644
> > > --- a/drivers/pci/hotplug/rpaphp_slot.c
> > > +++ b/drivers/pci/hotplug/rpaphp_slot.c
> > > @@ -121,6 +121,7 @@ int rpaphp_register_slot(struct slot *slot)
> > > {
> > > struct hotplug_slot *php_slot = slot->hotplug_slot;
> > > int retval;
> > > + int slot_nr;
> > >
> > > dbg("%s registering slot:path[%s] index[%x], name[%s] pdomain[%x] type[%d]\n",
> > > __FUNCTION__, slot->dn->full_name, slot->index, slot->name,
> > > @@ -132,8 +133,11 @@ int rpaphp_register_slot(struct slot *slot)
> > > return -EAGAIN;
> > > }
> > >
> > > - retval = pci_hp_register(php_slot, slot->bus,
> > > - PCI_SLOT(PCI_DN(slot->dn->child)->devfn));
> > > + if (slot->bus->self)
> > > + slot_nr = PCI_SLOT(slot->bus->self->devfn);
> > > + else
> > > + slot_nr = 0;
> > > + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
> > > if (retval) {
> > > err("pci_hp_register failed with error %d\n", retval);
> > > return retval;
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at http://www.tux.org/lkml/
> > >
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 3:18 ` Alex Chiang
2008-04-16 3:20 ` Badari Pulavarty
@ 2008-04-16 19:32 ` Badari Pulavarty
2008-04-16 19:59 ` Alex Chiang
1 sibling, 1 reply; 23+ messages in thread
From: Badari Pulavarty @ 2008-04-16 19:32 UTC (permalink / raw)
To: Alex Chiang; +Cc: Andrew Morton, linux-kernel, anton
On Tue, 2008-04-15 at 21:18 -0600, Alex Chiang wrote:
> >>> + if (slot->bus->self)
> >>> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
> >>> + else
> >>> + slot_nr = 0;
> >>> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
> >>> if (retval) {
> >>> err("pci_hp_register failed with error %d\n", retval);
> >>> return retval;
> >
> > No. It didn't help. I had to hack "slot_nr = 0" all the time to boot my
> > machine.
>
> It didn't help because you're still getting a NULL deref? Or are
> you seeing some other failure mode?
>
Alex,
I verified this patch again. It seem to have worked this time :(
I am not able to reproduce the problem with this problem (tried many
times).
Sorry for the false alarm :(
Thanks,
Badari
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 19:32 ` Badari Pulavarty
@ 2008-04-16 19:59 ` Alex Chiang
0 siblings, 0 replies; 23+ messages in thread
From: Alex Chiang @ 2008-04-16 19:59 UTC (permalink / raw)
To: ben; +Cc: Badari Pulavarty, Andrew Morton, linux-kernel, anton
Hi Badari,
* Badari Pulavarty <pbadari@us.ibm.com>:
>
> On Tue, 2008-04-15 at 21:18 -0600, Alex Chiang wrote:
> > >>> + if (slot->bus->self)
> > >>> + slot_nr = PCI_SLOT(slot->bus->self->devfn);
> > >>> + else
> > >>> + slot_nr = 0;
> > >>> + retval = pci_hp_register(php_slot, slot->bus, slot_nr);
> > >>> if (retval) {
> > >>> err("pci_hp_register failed with error %d\n", retval);
> > >>> return retval;
> > >
> > > No. It didn't help. I had to hack "slot_nr = 0" all the time to boot my
> > > machine.
> >
> > It didn't help because you're still getting a NULL deref? Or are
> > you seeing some other failure mode?
> >
>
> I verified this patch again. It seem to have worked this time :(
> I am not able to reproduce the problem with this problem (tried many
> times).
>
> Sorry for the false alarm :(
Thanks for testing. Please do let me know if you see any more
problems.
Ben, I would still appreciate a code review on:
http://lkml.org/lkml/2008/4/7/319
Thanks!
/ac
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 17:11 ` Alex Chiang
@ 2008-04-16 20:03 ` Andrew Morton
2008-04-16 22:16 ` Benjamin Herrenschmidt
2008-04-16 22:15 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2008-04-16 20:03 UTC (permalink / raw)
To: Alex Chiang; +Cc: benh, pbadari, linux-kernel
On Wed, 16 Apr 2008 11:11:27 -0600
Alex Chiang <achiang@hp.com> wrote:
> Hi Ben,
>
> * Benjamin Herrenschmidt <benh@kernel.crashing.org>:
> >
> > On Tue, 2008-04-15 at 21:17 -0600, Alex Chiang wrote:
> > >
> > > Could you take a look at this patch and tell me what I'm doing
> > > wrong?
> >
> > Hrm, I'll have a look but I'd need some context first... This is to fix
> > a problem introduced by another serie of patches ? Can you give me some
> > pointers here ?
>
> Thanks for taking a look. The patch series that got accepted into
> -mm is here:
>
> http://lkml.org/lkml/2008/3/25/228
>
> There have been several fixes on top of that series. I'm not sure
> what the canonical way to refer to -mm patches is, but if you
> navigate to this URL:
>
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm2/patch-list
>
> You'll also need to apply:
>
> pci-hotplug-introduce-pci_slot-fix.patch
> pci-hotplug-introduce-pci_slot-fix-fix.patch
> pci-hotplug-introduce-pci_slot-fix-2.patch
> pci-hotplug-introduce-pci_slot-fix-99.patch
> pci-hotplug-introduce-pci_slot-fix-3.patch
> pci-hotplug-acpi-pci-slot-detection-driver-fix.patch
> drivers-acpi-pci_slotc-fix-build-with-config_dmi=n.patch
I folded it all down to two patches. Unfortunately they don't apply very
well to mainline:
1 out of 4 hunks FAILED -- saving rejects to file drivers/pci/hotplug/acpiphp_core.c.rej
2 out of 3 hunks FAILED -- saving rejects to file drivers/pci/hotplug/acpiphp_ibm.c.rej
1 out of 6 hunks FAILED -- saving rejects to file drivers/pci/hotplug/pciehp_core.c.rej
1 out of 4 hunks FAILED -- saving rejects to file drivers/pci/hotplug/shpchp_core.c.rej
1 out of 5 hunks FAILED -- saving rejects to file include/linux/pci.h.rej
I guess because I reworked everything to fit on Greg's tree(s).
fwiw, http://userweb.kernel.org/~akpm/bh.gz (agains -rc9) contains all of
-mm up to and including pci-hotplug-acpi-pci-slot-detection-driver.patch
and is suitable for review/repair/etc.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 17:11 ` Alex Chiang
2008-04-16 20:03 ` Andrew Morton
@ 2008-04-16 22:15 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2008-04-16 22:15 UTC (permalink / raw)
To: Alex Chiang; +Cc: Badari Pulavarty, Andrew Morton, lkml
> The basic idea, which I keep botching on pSeries, is that when we
> make a call to pci_hp_register, we now need to pass it:
>
> pci_hp_register(struct hotplug_slot *slot, struct pci_bus *bus, int slot_nr)
>
> I am having trouble figuring out the slot_nr argument. Basically,
> I want to get the devfn of the slot we're looking at.
I haven't looked in details yet, but I can already tell that things on
pSeries aren't that simple because we don't necessarily know in advance
about "slots"...
When you add a physical PCI device to a partition, things can range from:
- A device gets added to an existing bridge (ie. slot)
- A whole P2P brigde gets added with that device below it (that's a slot too)
- A whole PCI host bridge gets added with that device below it (or with
a P2P bridge below it and the device below that).
In the later case, it's hard to have any concept of slot since you don't
know anything about the host bridge before it gets added to your partition :-)
So I'm not sure how much we can use of your slot infrastructure, I'll have
to look, I suspect it can cover some cases but not all of them.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 20:03 ` Andrew Morton
@ 2008-04-16 22:16 ` Benjamin Herrenschmidt
2008-04-19 6:38 ` Alex Chiang
0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2008-04-16 22:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: Alex Chiang, pbadari, linux-kernel
On Wed, 2008-04-16 at 13:03 -0700, Andrew Morton wrote:
>
> fwiw, http://userweb.kernel.org/~akpm/bh.gz (agains -rc9) contains all
> of
> -mm up to and including
> pci-hotplug-acpi-pci-slot-detection-driver.patch
> and is suitable for review/repair/etc.
Ok, I'll use that.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-16 22:16 ` Benjamin Herrenschmidt
@ 2008-04-19 6:38 ` Alex Chiang
2008-04-19 7:03 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 23+ messages in thread
From: Alex Chiang @ 2008-04-19 6:38 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Andrew Morton, pbadari, linux-kernel
Hi Ben,
* Benjamin Herrenschmidt <benh@kernel.crashing.org>:
>
> > The basic idea, which I keep botching on pSeries, is that when we
> > make a call to pci_hp_register, we now need to pass it:
> >
> > pci_hp_register(struct hotplug_slot *slot, struct pci_bus *bus, int slot_nr)
> >
> > I am having trouble figuring out the slot_nr argument. Basically,
> > I want to get the devfn of the slot we're looking at.
>
> I haven't looked in details yet, but I can already tell that things on
> pSeries aren't that simple because we don't necessarily know in advance
> about "slots"...
>
> When you add a physical PCI device to a partition, things can range from:
>
> - A device gets added to an existing bridge (ie. slot)
> - A whole P2P brigde gets added with that device below it (that's a slot too)
> - A whole PCI host bridge gets added with that device below it (or with
> a P2P bridge below it and the device below that).
>
> In the later case, it's hard to have any concept of slot since you don't
> know anything about the host bridge before it gets added to your partition :-)
>
> So I'm not sure how much we can use of your slot infrastructure, I'll have
> to look, I suspect it can cover some cases but not all of them.
*poke*
Any update on this?
Anything I can do to help?
> On Wed, 2008-04-16 at 13:03 -0700, Andrew Morton wrote:
> >
> > fwiw, http://userweb.kernel.org/~akpm/bh.gz (agains -rc9) contains all
> > of
> > -mm up to and including
> > pci-hotplug-acpi-pci-slot-detection-driver.patch
> > and is suitable for review/repair/etc.
>
> Ok, I'll use that.
Thanks,
/ac
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-19 6:38 ` Alex Chiang
@ 2008-04-19 7:03 ` Benjamin Herrenschmidt
2008-04-22 4:05 ` Alex Chiang
0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2008-04-19 7:03 UTC (permalink / raw)
To: Alex Chiang; +Cc: Andrew Morton, pbadari, linux-kernel
On Sat, 2008-04-19 at 00:38 -0600, Alex Chiang wrote:
> So I'm not sure how much we can use of your slot infrastructure, I'll
> have
> > to look, I suspect it can cover some cases but not all of them.
>
> *poke*
>
> Any update on this?
>
> Anything I can do to help?
Not yet no. I've looked a bit, but ran out of time, I'll look more this
week-end or next week. I'm also trying to figure out who in IBM is
responsible for that hotplug stuff so they can get involved too.
BTW. How would you do if the answer was we simply can't declare hotplug
"slots" ? ie. if I end up finding out (which is what I think will
happen) that there is simply no way for us to know in advance any
concept of "slot" with a devfn for hotplug ?
Basically, when doing hotplug, the hypervisor sends us new bits of
device-tree with things potentially ranging from host bridges, P2P
bridges to devices, but I'm not certain at this stage we can know in
advance where they'll hook up (in fact, for PHBs, we can't for sure) and
thus even if we end up supporting hotplug of actual slots, we don't even
know in advance the devfn where devices will appear. It's all hidden
from us by the hypervisor.
How would that fit in your infrastructure ? Can we just disable usage of
your slots abstraction in our case ?
Cheers,
Ben.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-19 7:03 ` Benjamin Herrenschmidt
@ 2008-04-22 4:05 ` Alex Chiang
2008-04-22 5:07 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 23+ messages in thread
From: Alex Chiang @ 2008-04-22 4:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Andrew Morton, pbadari, linux-kernel, matthew
* Benjamin Herrenschmidt <benh@kernel.crashing.org>:
>
> On Sat, 2008-04-19 at 00:38 -0600, Alex Chiang wrote:
> > So I'm not sure how much we can use of your slot infrastructure, I'll
> > have
> > > to look, I suspect it can cover some cases but not all of them.
> >
> > *poke*
> >
> > Any update on this?
> >
> > Anything I can do to help?
>
> Not yet no. I've looked a bit, but ran out of time, I'll look more this
> week-end or next week. I'm also trying to figure out who in IBM is
> responsible for that hotplug stuff so they can get involved too.
>
> BTW. How would you do if the answer was we simply can't declare hotplug
> "slots" ? ie. if I end up finding out (which is what I think will
> happen) that there is simply no way for us to know in advance any
> concept of "slot" with a devfn for hotplug ?
Hm, I may be getting lost in the twisty maze of pseries, but I
guess I don't really understand this statement.
Today, before calling rpaphp_register_slot, we first call
rpaphp_enable_slot. rpaphp_enable_slot makes a call to
pcibios_find_pci_bus which must succeed before we ever try to
register the slot.
So if there is a pci_bus, then it must have a ->self, which means
it has a ->devfn. Right?
Are you saying that it is not accurate to use this
pci_bus->self->devfn to keep track of slots?
> Basically, when doing hotplug, the hypervisor sends us new bits of
> device-tree with things potentially ranging from host bridges, P2P
> bridges to devices, but I'm not certain at this stage we can know in
> advance where they'll hook up (in fact, for PHBs, we can't for sure) and
> thus even if we end up supporting hotplug of actual slots, we don't even
> know in advance the devfn where devices will appear. It's all hidden
> from us by the hypervisor.
>
> How would that fit in your infrastructure ? Can we just disable usage of
> your slots abstraction in our case ?
I suppose you could just pass in 0 as slot_nr/devfn. That is what
my fixup patch did if it couldn't find a pci_bus->self. The
result would be that for a given pci_bus, you would only see the
first "slot" with this 0 slot_nr appear in sysfs, and it would
have whatever name originally associated with your dn.
I think if I were to understand more about this issue, we could
figure out a better solution...
thanks,
/ac
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: 2.6.25-rc8-mm1 panic in rpaphp_register_slot()
2008-04-22 4:05 ` Alex Chiang
@ 2008-04-22 5:07 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2008-04-22 5:07 UTC (permalink / raw)
To: Alex Chiang; +Cc: Andrew Morton, pbadari, linux-kernel, matthew
On Mon, 2008-04-21 at 22:05 -0600, Alex Chiang wrote:
>
> Are you saying that it is not accurate to use this
> pci_bus->self->devfn to keep track of slots?
Well, if it's a PHB, there's no self.
> I suppose you could just pass in 0 as slot_nr/devfn. That is what
> my fixup patch did if it couldn't find a pci_bus->self. The
> result would be that for a given pci_bus, you would only see the
> first "slot" with this 0 slot_nr appear in sysfs, and it would
> have whatever name originally associated with your dn.
Yes but we can have a PHB with no self and suddenly the HV brings in
multiple devices behind it. Oh well, I need to dig more, if I manage to
get the bloody hotplug stuff working here at all..
In some case, we don't even have the PHB, so nothing will appear in
sysfs but that's probably mostly harmless. The main thing is that those
hotplug operations are never driven that way, they are driven by the
management console which talks to a daemon which talks to the HV etc...
and the kernel is the last one to know about hotplug :-)
> I think if I were to understand more about this issue, we could
> figure out a better solution...
Ben.
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2008-04-22 5:10 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-04 17:53 2.6.25-rc8-mm1 panic in rpaphp_register_slot() Badari Pulavarty
2008-04-04 18:05 ` Alex Chiang
2008-04-04 20:19 ` Badari Pulavarty
2008-04-04 22:42 ` Alex Chiang
2008-04-04 23:35 ` Badari Pulavarty
2008-04-07 23:42 ` Alex Chiang
2008-04-16 0:36 ` Andrew Morton
2008-04-16 2:13 ` Alex Chiang
2008-04-16 3:08 ` Badari Pulavarty
2008-04-16 3:18 ` Alex Chiang
2008-04-16 3:20 ` Badari Pulavarty
2008-04-16 19:32 ` Badari Pulavarty
2008-04-16 19:59 ` Alex Chiang
2008-04-16 3:17 ` Alex Chiang
2008-04-16 7:45 ` Benjamin Herrenschmidt
2008-04-16 17:11 ` Alex Chiang
2008-04-16 20:03 ` Andrew Morton
2008-04-16 22:16 ` Benjamin Herrenschmidt
2008-04-19 6:38 ` Alex Chiang
2008-04-19 7:03 ` Benjamin Herrenschmidt
2008-04-22 4:05 ` Alex Chiang
2008-04-22 5:07 ` Benjamin Herrenschmidt
2008-04-16 22:15 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox