From: Wei Yang <weiyang@linux.vnet.ibm.com>
To: Alexey Kardashevskiy <aik@au1.ibm.com>
Cc: Wei Yang <weiyang@linux.vnet.ibm.com>,
linuxppc-dev@lists.ozlabs.org, gwshan@linux.vnet.ibm.com
Subject: Re: [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
Date: Tue, 29 Apr 2014 17:37:20 +0800 [thread overview]
Message-ID: <20140429093719.GB8028@richard> (raw)
In-Reply-To: <535F5B04.5000102@au1.ibm.com>
On Tue, Apr 29, 2014 at 05:55:48PM +1000, Alexey Kardashevskiy wrote:
>On 04/29/2014 04:49 PM, Wei Yang wrote:
>> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>>> and two of them will trigger warning or error.
>>>>
>>>> The three times to invoke the iommu_add_device() are:
>>>>
>>>> pci_device_add
>>>> ...
>>>> set_iommu_table_base_and_group <- 1st time, fail
>>>> device_add
>>>> ...
>>>> tce_iommu_bus_notifier <- 2nd time, succees
>>>> pcibios_add_pci_devices
>>>> ...
>>>> pcibios_setup_bus_devices <- 3rd time, re-attach
>>>>
>>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>>> dev->kobj->sd is initialized in device_add().
>>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>>
>>>> After applying this patch, the error
>>>
>>> Nack.
>>>
>>> The patch still seems incorrect and we actually need to remove
>>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>>from another notifier anyway. Could you please test it?
>>>
>>>
>>
>> Hi, Alexey,
>>
>> Nice to see your comment. Let me show what I got fist.
>>
>> Generally, when kernel enumerate on the pci device, following functions will
>> be invoked.
>>
>> pci_device_add
>> pcibios_setup_bus_device
>> ...
>> set_iommu_table_base_and_group
>> device_add
>> ...
>> tce_iommu_bus_notifier
>> pcibios_fixp_bus
>> pcibios_add_pci_devices
>> ...
>> pcibios_setup_bus_devices
>>
>>>From the call flow, we see for a normall pci device, the
>> pcibios_setup_bus_device() will be invoked twice.
>
>
>tce_iommu_bus_notifier should not be called for devices during boot-time
>PCI discovery as the discovery is done from subsys_initcall handler and
>tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
>this is happening? You should see warnings as for PF's EEH but you do not
>say that.
>
Let me make it clearer. I list the call flow for general purpose to show the
sequence of the call flow.
The tce_iommu_bus_notifier is not invoked at the boot time. And none of them
do real thing at boot time.
I don't understand your last sentence. I see warning and error during EEH
hotplug. If necessary, I will add this in the commit log.
>
>> At the bootup time, none of them succeed to setup the dma, since the PE is not
>> assigned or the tbl is not set. The iommu tbl and group is setup in
>> pnv_pci_ioda_setup_DMA().
>>
>> This call flow maintains the same when EEH error happens on Bus PE, while the
>> behavior is a little different.
>>
>> pci_device_add
>> pcibios_setup_bus_device
>> ...
>> set_iommu_table_base_and_group <- fail, kobj->sd is not initialized
>
>
>Sorry, what is uninitialized? The only kobj here is the one in iommu_group
>and it must have been taken care of before adding a device.
pci_dev->dev->kobj->sd.
I have explained this in previous discussion. This kobj->sd is initialized in
device_add().
>
>
>> device_add
>> ...
>> tce_iommu_bus_notifier <- succeed
>> pcibios_fixp_bus
>> pcibios_add_pci_devices
>> ...
>> pcibios_setup_bus_devices <- warning, re-attach
>
>
>This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
>avoid the warning.
Then how about the first time's error?
>
>
>> While this call flow will change a little on a VF. For a VF,
>> pcibios_fixp_bus() will not be invoked. Current behavior is this.
>
>
>You meant pcibios_fixup_bus? I would expect it to get called (from
>pci_rescan_bus() or something like that?) and this seems to be the problem
>here.
When EEH happens and hotplug the pci bus, we need to do the pci_scan_bridge()
which will do the pcibios_fixup_bus().
So you suggest to remove it? This is the generic code.
>
>How are VFs added in terms of code flow? Is there any git tree to look at?
>
VF add code flow is a generic code in drivers/pci/iov.c, virtfn_add().
>
>
>> pci_device_add
>> pcibios_setup_bus_device
>> ...
>> set_iommu_table_base_and_group <- fail, kobj->sd is not initialized
>> device_add
>> ...
>> tce_iommu_bus_notifier <- succeed
>>
>> And if an EEH error happens just on a VF, I believe the same call flow should
>> go for recovery. (This is not set down, still under discussion with Gavin.)
>>
>> My conclusion is:
>> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
>> So I choose to revert the code and attach make the iommu group attachment
>> just happens in one place.
>>
>> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
>> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
>> PF, maybe some platform need to fixup some thing after the pci_bus is added.
>> So I don't remove one of them to solve the problem.
>>
>> If you have a better idea, I am glad to take it.
>>
>
>
>--
>Alexey Kardashevskiy
>IBM OzLabs, LTC Team
>
>e-mail: aik@au1.ibm.com
>notes: Alexey Kardashevskiy/Australia/IBM
--
Richard Yang
Help you, Help me
next prev parent reply other threads:[~2014-04-29 9:38 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-23 2:26 [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Wei Yang
2014-04-23 2:26 ` [PATCH 2/2] powerpc/powernv: release the refcount for pci_dev Wei Yang
2014-04-28 13:35 ` [PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device() Alexey Kardashevskiy
2014-04-29 6:49 ` Wei Yang
2014-04-29 7:55 ` Alexey Kardashevskiy
2014-04-29 9:37 ` Wei Yang [this message]
2014-04-29 13:11 ` Alexey Kardashevskiy
2014-04-30 1:31 ` Wei Yang
2014-04-30 0:28 ` Gavin Shan
2014-04-30 3:28 ` Wei Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140429093719.GB8028@richard \
--to=weiyang@linux.vnet.ibm.com \
--cc=aik@au1.ibm.com \
--cc=gwshan@linux.vnet.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).