* [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier)
2014-04-21 2:25 [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform Wei Yang
@ 2014-04-21 2:25 ` Wei Yang
2014-04-21 3:37 ` Benjamin Herrenschmidt
2014-04-21 3:35 ` [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform Benjamin Herrenschmidt
2014-04-21 23:34 ` Gavin Shan
2 siblings, 1 reply; 14+ messages in thread
From: Wei Yang @ 2014-04-21 2:25 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Wei Yang, aik, gwshan
Here is a draft call flow:
pci_device_add
pcibios_add_device
pci_dma_dev_setup
pnv_pci_dma_dev_setup
pnv_pci_ioda_dma_dev_setup
set_iommu_table_base_and_group <--- here
device_add
When set_iommu_table_base_and_group() is invoked int
pnv_pci_ioda_dma_dev_setup(), the dev->kobj->sd is not initialized. The
dev->kobj->sd is initialized in device_add().
After applying this patch, the error
iommu_tce: 0003:05:00.0 has not been added, ret=-14
is cleared.
This patch revert the change for pnv_pci_ioda_dma_dev_setup() from commit
d905c5df(PPC: POWERNV: move iommu_add_device earlier).
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index fc4edda..b5d8f73 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1425,7 +1425,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
pe = &phb->ioda.pe_array[pdn->pe_number];
WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
- set_iommu_table_base_and_group(&pdev->dev, &pe->tce32_table);
+ set_iommu_table_base(&pdev->dev, &pe->tce32_table);
}
static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
--
1.7.9.5
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier)
2014-04-21 2:25 ` [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier) Wei Yang
@ 2014-04-21 3:37 ` Benjamin Herrenschmidt
2014-04-21 5:22 ` Wei Yang
0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2014-04-21 3:37 UTC (permalink / raw)
To: Wei Yang; +Cc: aik, linuxppc-dev, gwshan
On Mon, 2014-04-21 at 10:25 +0800, Wei Yang wrote:
> Here is a draft call flow:
>
> pci_device_add
> pcibios_add_device
> pci_dma_dev_setup
> pnv_pci_dma_dev_setup
> pnv_pci_ioda_dma_dev_setup
> set_iommu_table_base_and_group <--- here
> device_add
>
> When set_iommu_table_base_and_group() is invoked int
> pnv_pci_ioda_dma_dev_setup(), the dev->kobj->sd is not initialized. The
> dev->kobj->sd is initialized in device_add().
>
> After applying this patch, the error
> iommu_tce: 0003:05:00.0 has not been added, ret=-14
> is cleared.
>
> This patch revert the change for pnv_pci_ioda_dma_dev_setup() from commit
> d905c5df(PPC: POWERNV: move iommu_add_device earlier).
But in that case, is the group still set ?
Alexey, is that correct ?
Cheers,
Ben.
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index fc4edda..b5d8f73 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1425,7 +1425,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
>
> pe = &phb->ioda.pe_array[pdn->pe_number];
> WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
> - set_iommu_table_base_and_group(&pdev->dev, &pe->tce32_table);
> + set_iommu_table_base(&pdev->dev, &pe->tce32_table);
> }
>
> static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier)
2014-04-21 3:37 ` Benjamin Herrenschmidt
@ 2014-04-21 5:22 ` Wei Yang
2014-04-21 6:11 ` Wei Yang
0 siblings, 1 reply; 14+ messages in thread
From: Wei Yang @ 2014-04-21 5:22 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, linuxppc-dev, gwshan
On Mon, Apr 21, 2014 at 01:37:32PM +1000, Benjamin Herrenschmidt wrote:
>On Mon, 2014-04-21 at 10:25 +0800, Wei Yang wrote:
>> Here is a draft call flow:
>>
>> pci_device_add
>> pcibios_add_device
>> pci_dma_dev_setup
>> pnv_pci_dma_dev_setup
>> pnv_pci_ioda_dma_dev_setup
>> set_iommu_table_base_and_group <--- here
>> device_add
>>
>> When set_iommu_table_base_and_group() is invoked int
>> pnv_pci_ioda_dma_dev_setup(), the dev->kobj->sd is not initialized. The
>> dev->kobj->sd is initialized in device_add().
>>
>> After applying this patch, the error
>> iommu_tce: 0003:05:00.0 has not been added, ret=-14
>> is cleared.
>>
>> This patch revert the change for pnv_pci_ioda_dma_dev_setup() from commit
>> d905c5df(PPC: POWERNV: move iommu_add_device earlier).
>
>But in that case, is the group still set ?
I think so.
iommu_add_device() will be invoked when device_add() notify the pci bus.
>
>Alexey, is that correct ?
>
>Cheers,
>Ben.
>
>> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index fc4edda..b5d8f73 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -1425,7 +1425,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
>>
>> pe = &phb->ioda.pe_array[pdn->pe_number];
>> WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
>> - set_iommu_table_base_and_group(&pdev->dev, &pe->tce32_table);
>> + set_iommu_table_base(&pdev->dev, &pe->tce32_table);
>> }
>>
>> static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
>
--
Richard Yang
Help you, Help me
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier)
2014-04-21 5:22 ` Wei Yang
@ 2014-04-21 6:11 ` Wei Yang
0 siblings, 0 replies; 14+ messages in thread
From: Wei Yang @ 2014-04-21 6:11 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, linuxppc-dev, gwshan
On Mon, Apr 21, 2014 at 01:22:13PM +0800, Wei Yang wrote:
>On Mon, Apr 21, 2014 at 01:37:32PM +1000, Benjamin Herrenschmidt wrote:
>>On Mon, 2014-04-21 at 10:25 +0800, Wei Yang wrote:
>>> Here is a draft call flow:
>>>
>>> pci_device_add
>>> pcibios_add_device
>>> pci_dma_dev_setup
>>> pnv_pci_dma_dev_setup
>>> pnv_pci_ioda_dma_dev_setup
>>> set_iommu_table_base_and_group <--- here
>>> device_add
>>>
>>> When set_iommu_table_base_and_group() is invoked int
>>> pnv_pci_ioda_dma_dev_setup(), the dev->kobj->sd is not initialized. The
>>> dev->kobj->sd is initialized in device_add().
>>>
>>> After applying this patch, the error
>>> iommu_tce: 0003:05:00.0 has not been added, ret=-14
>>> is cleared.
>>>
>>> This patch revert the change for pnv_pci_ioda_dma_dev_setup() from commit
>>> d905c5df(PPC: POWERNV: move iommu_add_device earlier).
>>
>>But in that case, is the group still set ?
>
>I think so.
>
>iommu_add_device() will be invoked when device_add() notify the pci bus.
>
Did a quick test, after applying this patch, the logic would be the same for
VF and PF hotplug, the iommu_group is created with corresponding device.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-21 2:25 [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform Wei Yang
2014-04-21 2:25 ` [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier) Wei Yang
@ 2014-04-21 3:35 ` Benjamin Herrenschmidt
2014-04-21 6:03 ` Wei Yang
2014-04-21 23:34 ` Gavin Shan
2 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2014-04-21 3:35 UTC (permalink / raw)
To: Wei Yang; +Cc: aik, linuxppc-dev, gwshan, Gavin Shan
On Mon, 2014-04-21 at 10:25 +0800, Wei Yang wrote:
> When pcibios_remove_pci_devices() is removing pci devices, it will release
> pci device respectively. When the refcount of the device is 0, the pci_dev
> structure will be destroyed.
>
> On PowerNV platform, the pci_dev will not be destroyed since the refcount is
> not 0.
>
> After applying the patch, this warning is cleared during the EEH hotplug
> event.
You have to be careful here. We take a reference to the device in the
structure eeh_dev, that means we might access it after it's freed if
we don't increase the refcount.
Cheers,
Ben.
> [ 204.123609] ------------[ cut here ]------------
> [ 204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
> [ 204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
> [ 204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
> [ 204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
> [ 204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
> [ 204.124506] REGS: c0000027ed50f440 TRAP: 0700 Not tainted (3.14.0-rc5yw+)
> [ 204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 88008084 XER: 20000000
> [ 204.124682] CFAR: c00000000006c644 SOFTE: 1
> GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
> GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
> GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
> GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
> GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
> GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
> [ 204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
> [ 204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
> [ 204.125443] Call Trace:
> [ 204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
> [ 204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
> [ 204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
> [ 204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
> [ 204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
> [ 204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
> [ 204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
> [ 204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
> [ 204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
> [ 204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
> [ 204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
> [ 204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
> [ 204.126556] Instruction dump:
> [ 204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
> [ 204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218
> [ 204.126966] ---[ end trace 6e7aefd80add2973 ]---
>
> This patch clear the step to get reference.
>
> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index e2270f4..fc4edda 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -561,7 +561,6 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
> pci_name(dev));
> continue;
> }
> - pci_dev_get(dev);
> pdn->pcidev = dev;
> pdn->pe_number = pe->pe_number;
> pe->dma_weight += pnv_ioda_dma_weight(dev);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-21 3:35 ` [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform Benjamin Herrenschmidt
@ 2014-04-21 6:03 ` Wei Yang
0 siblings, 0 replies; 14+ messages in thread
From: Wei Yang @ 2014-04-21 6:03 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, linuxppc-dev, gwshan, Gavin Shan
On Mon, Apr 21, 2014 at 01:35:34PM +1000, Benjamin Herrenschmidt wrote:
>On Mon, 2014-04-21 at 10:25 +0800, Wei Yang wrote:
>> When pcibios_remove_pci_devices() is removing pci devices, it will release
>> pci device respectively. When the refcount of the device is 0, the pci_dev
>> structure will be destroyed.
>>
>> On PowerNV platform, the pci_dev will not be destroyed since the refcount is
>> not 0.
>>
>> After applying the patch, this warning is cleared during the EEH hotplug
>> event.
>
>You have to be careful here. We take a reference to the device in the
>structure eeh_dev, that means we might access it after it's freed if
>we don't increase the refcount.
Ben,
Thanks for reminding.
Hmm, I checked the eeh_dev structure, there is a field pdev in eeh_dev, which
is not a ref to the device.
Then I did a check in eeh hotplug code patch. The pci_dev remove/install
happens in eeh_reset_device(), between pcibios_remove_pci_devices() and
pcibios_add_pci_devices(). During this period, we will try to clear/set the PE
state and restore the BAR of a pci device. But the BAR is restored through the
device node instead of pci_dev structure. As my understanding the eeh code
here has the assumption that these pci devices in the same PE are removed.
Maybe I missed something, needs confirmation from Gavin.
The code I removed in this patch is introduced in commit
184cd4a3(powerpc/powernv: PCI support for p7IOC under OPAL v2).
Sounds the original purpose is to make sure a pci device is covered by a Bus
PE, so don't remove it. After booting up, Bus PE will always be there, which
means we will always hold a refcount to a pci device.
I didn't come up with a better idea, or leaving the warning is fine.
Any comments are welcome :)
>
>Cheers,
>Ben.
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-21 2:25 [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform Wei Yang
2014-04-21 2:25 ` [PATCH 2/2] powerpc/powernc: revert part of commit d905c5df(PPC: POWERNV: move iommu_add_device earlier) Wei Yang
2014-04-21 3:35 ` [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform Benjamin Herrenschmidt
@ 2014-04-21 23:34 ` Gavin Shan
2014-04-22 7:44 ` Wei Yang
2 siblings, 1 reply; 14+ messages in thread
From: Gavin Shan @ 2014-04-21 23:34 UTC (permalink / raw)
To: Wei Yang; +Cc: aik, linuxppc-dev, gwshan
On Mon, Apr 21, 2014 at 10:25:18AM +0800, Wei Yang wrote:
>When pcibios_remove_pci_devices() is removing pci devices, it will release
>pci device respectively. When the refcount of the device is 0, the pci_dev
>structure will be destroyed.
>
>On PowerNV platform, the pci_dev will not be destroyed since the refcount is
>not 0.
>
Richard, the above description is true. However, it's not relevant to the
issue (backtrace). I don't quite understand the scenario you had. You're
doing hotplug on VFs or PF? I guess it would be full-hotplug instead of
partial hotplug.
It seems that the IOMMU group wasn't detached correctly and then we tried
to attach it again. Or the IOMMU group was attached for towice? :-)
The IOMMU group is expected to be detached like this, please investigate
for more why it wasn't detached correctly.
pcibios_remove_pci_devices()
pci_stop_and_remove_bus_device()
pci_remove_bus_device()
pci_destroy_dev()
static void pci_destroy_dev(struct pci_dev *dev)
{
if (!dev->dev.kobj.parent)
return;
device_del(&dev->dev); /* It's calling iommu_del_device() */
down_write(&pci_bus_sem);
list_del(&dev->bus_list);
up_write(&pci_bus_sem);
pci_free_resources(dev);
put_device(&dev->dev); /* It's calling pcibios_release_device() */
}
Thanks,
Gavin
>After applying the patch, this warning is cleared during the EEH hotplug
>event.
>
>[ 204.123609] ------------[ cut here ]------------
>[ 204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
>[ 204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
>[ 204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
>[ 204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
>[ 204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
>[ 204.124506] REGS: c0000027ed50f440 TRAP: 0700 Not tainted (3.14.0-rc5yw+)
>[ 204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 88008084 XER: 20000000
>[ 204.124682] CFAR: c00000000006c644 SOFTE: 1
>GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
>GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
>GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
>GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
>GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
>GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
>GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
>[ 204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
>[ 204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
>[ 204.125443] Call Trace:
>[ 204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
>[ 204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
>[ 204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
>[ 204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
>[ 204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
>[ 204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
>[ 204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
>[ 204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
>[ 204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
>[ 204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
>[ 204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
>[ 204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
>[ 204.126556] Instruction dump:
>[ 204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
>[ 204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218
>[ 204.126966] ---[ end trace 6e7aefd80add2973 ]---
>
>This patch clear the step to get reference.
>
>Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>---
> arch/powerpc/platforms/powernv/pci-ioda.c | 1 -
> 1 file changed, 1 deletion(-)
>
>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>index e2270f4..fc4edda 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -561,7 +561,6 @@ static void pnv_ioda_setup_same_PE(struct pci_bus *bus, struct pnv_ioda_pe *pe)
> pci_name(dev));
> continue;
> }
>- pci_dev_get(dev);
> pdn->pcidev = dev;
> pdn->pe_number = pe->pe_number;
> pe->dma_weight += pnv_ioda_dma_weight(dev);
>--
>1.7.9.5
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-21 23:34 ` Gavin Shan
@ 2014-04-22 7:44 ` Wei Yang
2014-04-22 8:25 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 14+ messages in thread
From: Wei Yang @ 2014-04-22 7:44 UTC (permalink / raw)
To: Gavin Shan; +Cc: aik, linuxppc-dev
On Tue, Apr 22, 2014 at 09:34:23AM +1000, Gavin Shan wrote:
>On Mon, Apr 21, 2014 at 10:25:18AM +0800, Wei Yang wrote:
>>When pcibios_remove_pci_devices() is removing pci devices, it will release
>>pci device respectively. When the refcount of the device is 0, the pci_dev
>>structure will be destroyed.
>>
>>On PowerNV platform, the pci_dev will not be destroyed since the refcount is
>>not 0.
>>
>
>Richard, the above description is true. However, it's not relevant to the
>issue (backtrace). I don't quite understand the scenario you had. You're
>doing hotplug on VFs or PF? I guess it would be full-hotplug instead of
>partial hotplug.
I am doing hotplug on PF.
>
>It seems that the IOMMU group wasn't detached correctly and then we tried
>to attach it again. Or the IOMMU group was attached for towice? :-)
Did more tests and find it is a little out of expectation. The conclusion is:
1. The iommu group is detached correctly,
2. The iommu group is attached three times
3. The warning is cleared based on the first patch instead of this one.
The three times to attach the iommu_group is:
pci_device_add
...
set_iommu_table_base_and_group <- 1st time, fail
device_add
...
tce_iommu_bus_notifier <- 2nd time, succees
pcibios_add_pci_devices
...
pcibios_setup_bus_devices <- 3rd time, re-attach
The error in patch 1 happens at the first time, since the dev->kobj->sd is not
initialized.
The warning in patch 2 happens at the 3rd time, since iommu group is already
attached in the 2nd time.
So this patch(the 2nd one) doesn't contribute to clear the warning and error.
Only the first patch did it. Please ignore this one.
>
>The IOMMU group is expected to be detached like this, please investigate
>for more why it wasn't detached correctly.
>
>pcibios_remove_pci_devices()
>pci_stop_and_remove_bus_device()
>pci_remove_bus_device()
>pci_destroy_dev()
>
>static void pci_destroy_dev(struct pci_dev *dev)
>{
> if (!dev->dev.kobj.parent)
> return;
>
> device_del(&dev->dev); /* It's calling iommu_del_device() */
>
> down_write(&pci_bus_sem);
> list_del(&dev->bus_list);
> up_write(&pci_bus_sem);
>
> pci_free_resources(dev);
> put_device(&dev->dev); /* It's calling pcibios_release_device() */
>}
>
--
Richard Yang
Help you, Help me
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-22 7:44 ` Wei Yang
@ 2014-04-22 8:25 ` Benjamin Herrenschmidt
2014-04-22 9:44 ` Wei Yang
2014-04-22 23:00 ` Gavin Shan
0 siblings, 2 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2014-04-22 8:25 UTC (permalink / raw)
To: Wei Yang; +Cc: aik, linuxppc-dev, Gavin Shan
On Tue, 2014-04-22 at 15:44 +0800, Wei Yang wrote:
> So this patch(the 2nd one) doesn't contribute to clear the warning and
> error.
> Only the first patch did it. Please ignore this one.
But is it correct ? It's not right to keep a refcount elevated if we
don't have to.
Gavin, can you get to the bottom of that refcount business ?
Cheers,
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-22 8:25 ` Benjamin Herrenschmidt
@ 2014-04-22 9:44 ` Wei Yang
2014-04-22 23:00 ` Gavin Shan
1 sibling, 0 replies; 14+ messages in thread
From: Wei Yang @ 2014-04-22 9:44 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, linuxppc-dev, Gavin Shan
On Tue, Apr 22, 2014 at 06:25:09PM +1000, Benjamin Herrenschmidt wrote:
>On Tue, 2014-04-22 at 15:44 +0800, Wei Yang wrote:
>> So this patch(the 2nd one) doesn't contribute to clear the warning and
>> error.
>> Only the first patch did it. Please ignore this one.
>
>But is it correct ? It's not right to keep a refcount elevated if we
>don't have to.
The code refcount is introduced in commit 184cd4a3(powerpc/powernv: PCI
support for p7IOC under OPAL v2). To me, it looks not necessary.
>
>Gavin, can you get to the bottom of that refcount business ?
>
>Cheers,
>Ben.
>
--
Richard Yang
Help you, Help me
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-22 8:25 ` Benjamin Herrenschmidt
2014-04-22 9:44 ` Wei Yang
@ 2014-04-22 23:00 ` Gavin Shan
2014-04-23 0:26 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 14+ messages in thread
From: Gavin Shan @ 2014-04-22 23:00 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, Wei Yang, linuxppc-dev, Gavin Shan
On Tue, Apr 22, 2014 at 06:25:09PM +1000, Benjamin Herrenschmidt wrote:
>On Tue, 2014-04-22 at 15:44 +0800, Wei Yang wrote:
>> So this patch(the 2nd one) doesn't contribute to clear the warning and
>> error.
>> Only the first patch did it. Please ignore this one.
>
>But is it correct ? It's not right to keep a refcount elevated if we
>don't have to.
>
>Gavin, can you get to the bottom of that refcount business ?
>
Ben, "struct pci_dn::pcidev" was used by EEH originally. We don't
use it any more. So it can be removed.
Currently, EEH has following 4 functions to do conversion from
one to another. None of them relies on "struct pci_dn::pcidev".
of_node_to_eeh_dev() device_node -> pci_dn -> eeh_dev
pci_dev_to_eeh_dev() pci_dev -> device -> archdata -> eeh_dev
eeh_dev_to_of_node() eeh_dev -> device_node
eeh_dev_to_pci_dev() eeh_dev -> pci_dev
The side effect of holding pci_dev refcount is the pci_dev, eeh_dev,
eeh_pe instance can't be free'ed during fully hotplug though EEH can
survive. It's reasonable to remove it.
Thanks,
Gavin
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-22 23:00 ` Gavin Shan
@ 2014-04-23 0:26 ` Benjamin Herrenschmidt
2014-04-23 1:56 ` Wei Yang
0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2014-04-23 0:26 UTC (permalink / raw)
To: Gavin Shan; +Cc: aik, Wei Yang, linuxppc-dev
On Wed, 2014-04-23 at 09:00 +1000, Gavin Shan wrote:
> The side effect of holding pci_dev refcount is the pci_dev, eeh_dev,
> eeh_pe instance can't be free'ed during fully hotplug though EEH can
> survive. It's reasonable to remove it.
Allright. Can you guys refresh that patch with an updated cset
comment and shoot it upstream ?
Thanks !
Cheers,
Ben.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] powerpc/powernv: clear the refcount for pci_dev on powernv platform
2014-04-23 0:26 ` Benjamin Herrenschmidt
@ 2014-04-23 1:56 ` Wei Yang
0 siblings, 0 replies; 14+ messages in thread
From: Wei Yang @ 2014-04-23 1:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: aik, linuxppc-dev, Gavin Shan
On Wed, Apr 23, 2014 at 10:26:18AM +1000, Benjamin Herrenschmidt wrote:
>On Wed, 2014-04-23 at 09:00 +1000, Gavin Shan wrote:
>> The side effect of holding pci_dev refcount is the pci_dev, eeh_dev,
>> eeh_pe instance can't be free'ed during fully hotplug though EEH can
>> survive. It's reasonable to remove it.
>
>Allright. Can you guys refresh that patch with an updated cset
>comment and shoot it upstream ?
Glad to do so.
>
>Thanks !
>
>Cheers,
>Ben.
>
--
Richard Yang
Help you, Help me
^ permalink raw reply [flat|nested] 14+ messages in thread