From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
To: Alex Williamson <alex.williamson@redhat.com>,
Cao jin <caoj.fnst@cn.fujitsu.com>,
qemu-devel@nongnu.org
Cc: mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH v14 Resend 10/13] pci: add pci device pre-post reset callbacks for host bus reset
Date: Tue, 22 Dec 2015 15:18:45 +0800
Message-ID: <5678F955.6010703@cn.fujitsu.com>
In-Reply-To: <1450732026.3781.89.camel@redhat.com>
On 12/22/2015 05:07 AM, Alex Williamson wrote:
> On Fri, 2015-12-18 at 11:29 +0800, Chen Fan wrote:
>> On 12/18/2015 04:31 AM, Alex Williamson wrote:
>>> On Thu, 2015-12-17 at 09:41 +0800, Cao jin wrote:
>>>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>>
>>>> For vfio devices in particular, when devices must be recovered by a
>>>> bus reset (for example after an AER error), we always need to reset
>>>> the host bus to recover the devices under it. Add PCI device pre- and
>>>> post-reset callbacks so that a device can request a host bus reset.
>>>>
>>>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>> ---
>>>> hw/pci/pci.c | 18 ++++++++++++++++++
>>>> hw/pci/pci_bridge.c | 9 +++++++++
>>>> hw/vfio/pci.c | 26 ++++++++++++++++++++++++++
>>>> hw/vfio/pci.h | 2 ++
>>>> include/hw/pci/pci.h | 7 +++++++
>>>> 5 files changed, 62 insertions(+)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index f6ca6ef..64fa2cc 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -247,6 +247,24 @@ static void pci_do_device_reset(PCIDevice *dev)
>>>>      msix_reset(dev);
>>>>  }
>>>>
>>>> +void pci_device_pre_reset(PCIBus *bus, PCIDevice *dev, void *unused)
>>>> +{
>>>> +    PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(dev);
>>>> +
>>>> +    if (dc->pre_reset) {
>>>> +        dc->pre_reset(dev);
>>>> +    }
>>>> +}
>>>> +
>>>> +void pci_device_post_reset(PCIBus *bus, PCIDevice *dev, void *unused)
>>>> +{
>>>> +    PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(dev);
>>>> +
>>>> +    if (dc->post_reset) {
>>>> +        dc->post_reset(dev);
>>>> +    }
>>>> +}
>>>> +
>>>> /*
>>>> * This function is called on #RST and FLR.
>>>> * FLR if PCI_EXP_DEVCTL_BCR_FLR is set
>>>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>>>> index 40c97b1..ddb76ab 100644
>>>> --- a/hw/pci/pci_bridge.c
>>>> +++ b/hw/pci/pci_bridge.c
>>>> @@ -267,8 +267,17 @@ void pci_bridge_write_config(PCIDevice *d,
>>>>
>>>>      newctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
>>>>      if (~oldctl & newctl & PCI_BRIDGE_CTL_BUS_RESET) {
>>>> +        /*
>>>> +         * Notify all vfio-pci devices under the bus
>>>> +         * should do physical bus reset.
>>>> +         */
>>>> +        PCIBus *sec_bus = pci_bridge_get_sec_bus(s);
>>>> +        pci_for_each_device(sec_bus, pci_bus_num(sec_bus),
>>>> +                            pci_device_pre_reset, NULL);
>>>>          /* Trigger hot reset on 0->1 transition. */
>>>>          qbus_reset_all(&s->sec_bus.qbus);
>>>> +        pci_for_each_device(sec_bus, pci_bus_num(sec_bus),
>>>> +                            pci_device_post_reset, NULL);
>>>>      }
>>>>  }
>>>>
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index e17dc89..df32618 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -39,6 +39,7 @@
>>>>
>>>>  static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
>>>>  static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
>>>> +static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single);
>>>>
>>>>  /*
>>>>   * Disabling BAR mmaping can be slow, but toggling it around INTx can
>>>> @@ -1888,6 +1889,8 @@ static int vfio_check_host_bus_reset(VFIOPCIDevice *vdev)
>>>>      /* List all affected devices by bus reset */
>>>>      devices = &info->devices[0];
>>>>
>>>> +    vdev->single_depend_dev = (info->count == 1);
>>>> +
>>>>      /* Verify that we have all the groups required */
>>>>      for (i = 0; i < info->count; i++) {
>>>>          PCIHostDeviceAddress host;
>>>> @@ -2029,10 +2032,26 @@ static int vfio_check_bus_reset(NotifierWithReturn *n, void *opaque)
>>>>      return vfio_check_host_bus_reset(vdev);
>>>>  }
>>>>
>>>> +static void vfio_aer_pre_reset(PCIDevice *pdev)
>>>> +{
>>>> +    VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
>>>> +
>>>> +    vdev->aer_reset = true;
>>>> +    vfio_pci_hot_reset(vdev, vdev->single_depend_dev);
>>>> +}
>>> Doesn't this lead to multiple host bus resets per guest bus reset in
>>> many cases? It looks like we'll do it once per vfio-pci device, even
>>> if those devices are on the same host bus. That's a 1 second operation
>>> per device. Can we avoid that? Maybe some sort of sequence ID could
>>> help a device figure out whether it's already been reset as part of a
>>> dependent device for this particular guest bus reset. Thanks,
>> That's right, I missed this case, but I don't understand how a
>> sequence ID would be used to mark a device as already reset. Can you
>> elaborate?
> I don't really have a concrete idea for a sequence ID, it was just a
> thought that maybe if each bus reset had a sequence ID then devices
> could know whether they've already been reset for that sequence ID.
> The basic problem we have is that reset callbacks are per device and
> it's difficult to infer which individual resets are part of that bus
> reset. In fact, do we propagate resets correctly down secondary
> bridges? We're triggering off a VM write of the bridge control bus
> reset bit triggering from 0->1 and we then call qbus_reset_all() on
> that qbus, which I think is just going to call pci_bridge_reset() for
> any other bridges, which doesn't do anything about resetting deeper
> subordinate buses. I think that means that if we had a root port with
> a switch below it and endpoints below that, if the VM triggered a
> secondary bus reset at the root port, the endpoints would never see it,
> which is not how real hardware works.
Indeed, you're right. For resetting subordinate buses, we should have a
common mechanism for all bridges.
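
To make the sequence ID idea and the propagation through nested bridges
concrete, here is a minimal standalone sketch in plain C. It is not QEMU
code: Bus, Device, HostResetGroup and every function name below are
hypothetical stand-ins used only for illustration. It assumes that each
guest secondary bus reset gets a fresh sequence ID, that the reset walks
down through every subordinate bus, and that a device skips the slow host
bus reset when its group has already handled that ID.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical: the set of host devices affected by one host bus reset. */
typedef struct HostResetGroup {
    unsigned last_reset_seq;  /* sequence ID of the last guest reset handled */
    int host_resets;          /* number of (slow) host bus resets performed */
} HostResetGroup;

typedef struct Bus Bus;

typedef struct Device {
    HostResetGroup *group;    /* NULL for a purely emulated device */
    Bus *sec_bus;             /* non-NULL if this device is a bridge */
    int resets_seen;          /* how many guest resets reached this device */
} Device;

struct Bus {
    Device *devs[MAX_CHILDREN];
    size_t ndevs;
};

/* 0 is reserved to mean "never reset", so IDs start at 1. */
static unsigned next_reset_seq = 1;

/* Per-device reset: do the host bus reset only once per sequence ID. */
static void device_reset(Device *dev, unsigned seq)
{
    dev->resets_seen++;
    if (dev->group && dev->group->last_reset_seq != seq) {
        dev->group->last_reset_seq = seq;
        dev->group->host_resets++;  /* stands in for the real host reset */
    }
}

/* Reset every device on the bus, then descend into subordinate buses. */
static void bus_reset_recursive(Bus *bus, unsigned seq)
{
    assert(bus != NULL);
    for (size_t i = 0; i < bus->ndevs; i++) {
        Device *dev = bus->devs[i];

        device_reset(dev, seq);
        if (dev->sec_bus) {
            bus_reset_recursive(dev->sec_bus, seq);
        }
    }
}

/* Entry point for a 0->1 transition of the bridge control reset bit. */
static unsigned guest_secondary_bus_reset(Bus *sec_bus)
{
    unsigned seq = next_reset_seq++;

    if (next_reset_seq == 0) {
        next_reset_seq = 1;  /* skip the reserved value on wrap-around */
    }
    bus_reset_recursive(sec_bus, seq);
    return seq;
}
```

With a root port above a switch above two endpoints that share one host
bus, a single guest reset at the root port reaches both endpoints but
triggers only one host bus reset in this model.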
>
>> Additionally, there is already a mechanism to compute whether a device
>> needs to be reset by a hot reset, so I simply modified the code as
>> follows:
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index a9bc67e..42774ca 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -2063,13 +2063,19 @@ static void vfio_aer_pre_reset(PCIDevice *pdev)
>>      VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
>>
>>      vdev->aer_reset = true;
>> -    vfio_pci_hot_reset(vdev, vdev->single_depend_dev);
>> +    vdev->vbasedev.needs_reset = true;
>>  }
>>
>>  static void vfio_aer_post_reset(PCIDevice *pdev)
>>  {
>>      VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
>>
>> +    if (!vdev->single_depend_dev && vdev->vbasedev.needs_reset) {
>> +        vfio_pci_hot_reset(vdev, false);
>> +    } else {
>> +        vfio_pci_hot_reset(vdev, true);
>> +    }
>> +
>>      vdev->aer_reset = false;
>>  }
>>
>> What do you think of this?
> I think it might be a bigger problem than that subtle change. I wonder
> if we really need a better model of the reset line through the
> subordinate buses. When reset is asserted, we'd set a bus_in_reset
> flag on the bus and trigger downstream bridges to do the same. Then
> when the user de-asserts reset, we'd call qbus_reset_all() and
> propagate it through to downstream buses. That way the per device
> reset callback could check to see if the bus is in reset and aer
> devices can then know to do a bus reset. Finally, the bus_in_reset
> flag would be cleared on all the affected buses. I'm sure there are
> numerous details missing there, but it seems like it might be a
> reasonable approach. Thanks,
It should be. Let me think about it carefully. ;)
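
As a starting point, the reset-line model described above might look
roughly like the following standalone C sketch. Again, this is not QEMU
code: the types, the in_reset field and all function names are
hypothetical, and wants_host_bus_reset merely stands in for "an assigned
device with AER enabled". Asserting reset marks the secondary bus and all
buses below it; de-asserting runs the per-device resets while the flag is
still set, then clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

typedef struct Bus Bus;

typedef struct Device {
    Bus *bus;                   /* bus the device sits on */
    Bus *sec_bus;               /* non-NULL if this device is a bridge */
    bool wants_host_bus_reset;  /* hypothetical: AER-enabled assigned device */
    int host_bus_resets;        /* resets done as a host bus reset */
    int device_resets;          /* ordinary per-device resets */
} Device;

struct Bus {
    bool in_reset;              /* hypothetical bus_in_reset flag */
    Device *devs[MAX_CHILDREN];
    size_t ndevs;
};

/* Set or clear the flag on this bus and on every bus below it. */
static void bus_set_in_reset(Bus *bus, bool in_reset)
{
    bus->in_reset = in_reset;
    for (size_t i = 0; i < bus->ndevs; i++) {
        if (bus->devs[i]->sec_bus) {
            bus_set_in_reset(bus->devs[i]->sec_bus, in_reset);
        }
    }
}

/* Per-device reset callback: consult the bus state to pick the reset type. */
static void device_reset(Device *dev)
{
    if (dev->wants_host_bus_reset && dev->bus->in_reset) {
        dev->host_bus_resets++;  /* stands in for a real host bus reset */
    } else {
        dev->device_resets++;
    }
}

/* Reset all devices on the bus and on all subordinate buses. */
static void bus_reset_all(Bus *bus)
{
    for (size_t i = 0; i < bus->ndevs; i++) {
        device_reset(bus->devs[i]);
        if (bus->devs[i]->sec_bus) {
            bus_reset_all(bus->devs[i]->sec_bus);
        }
    }
}

/* Guest sets the bridge control bus reset bit (0->1). */
static void bridge_assert_reset(Bus *sec_bus)
{
    bus_set_in_reset(sec_bus, true);
}

/* Guest clears the bit (1->0): reset devices, then drop the flag. */
static void bridge_deassert_reset(Bus *sec_bus)
{
    assert(sec_bus->in_reset);
    bus_reset_all(sec_bus);
    bus_set_in_reset(sec_bus, false);
}
```

In this sketch an endpoint below a switch sees in_reset set on its own bus
when the guest resets the root port above it, so it can choose the host bus
reset only in that case and fall back to an ordinary reset otherwise.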
Thanks,
Chen
>
> Alex