All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Cao jin <caoj.fnst@cn.fujitsu.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v13 09/13] add check reset mechanism when hotplug vfio device
Date: Mon, 16 Nov 2015 18:18:07 +0800	[thread overview]
Message-ID: <5649AD5F.6020407@cn.fujitsu.com> (raw)
In-Reply-To: <1447448661.3946.147.camel@redhat.com>

Hi Alex,

   Thanks for your detailed explanation.
   during my test, I found that maybe there was another problem in vfio 
driver,
I use a dual-port NIC which address are: 06:00.0 and 06:00.1 two functions.
then I use aer-inject to inject one error to one function like following:
AER
ID 0000:06:00.0
UNCOR_STATUS DLP
HEADER_LOG 0 1 2 3

here I boot qemu with one enable aer, one disable aer:
./x86_64-softmmu/qemu-system-x86_64 -M q35 -device 
ioh3420,bus=pcie.0,addr=1c.0,port=1,id=bridge1,chassis=1
  -device vfio-pci,host=06:00.1,bus=bridge1,addr=00.1
  -device 
vfio-pci,host=06:00.0,bus=bridge1,addr=00.0,aer=true,multifunction=on

so we expected that the error only sent to the vfio device with host 
address is 06:00.0,
but I found that all devices (06:00.0 , 06:00.1) receive the signal in 
qemu, which sent by vfio driver
in vfio_pci_aer_err_detected. then qemu stopped by the device with 
06:00.1 received the signal.
is that right?

Thanks,
Chen


On 11/14/2015 05:04 AM, Alex Williamson wrote:
> On Fri, 2015-11-13 at 11:28 +0800, Cao jin wrote:
>> On 11/12/2015 07:51 PM, Michael S. Tsirkin wrote:
>>> On Wed, Nov 11, 2015 at 06:34:27PM +0800, Cao jin wrote:
>>>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>>
>>>> Since we support multi-function hotplug. the function 0 indicate
>>>> the closure of the slot, so we have the chance to do the check.
>>>>
>>>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>> ---
>>>>    hw/pci/pci.c             | 29 +++++++++++++++++++++++++++++
>>>>    hw/vfio/pci.c            | 19 +++++++++++++++++++
>>>>    hw/vfio/pci.h            |  2 ++
>>>>    include/hw/pci/pci_bus.h |  5 +++++
>>>>    4 files changed, 55 insertions(+)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 168b9cc..f6ca6ef 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -81,6 +81,7 @@ static void pci_bus_realize(BusState *qbus, Error **errp)
>>>>        PCIBus *bus = PCI_BUS(qbus);
>>>>
>>>>        vmstate_register(NULL, -1, &vmstate_pcibus, bus);
>>>> +    notifier_with_return_list_init(&bus->hotplug_notifiers);
>>>>    }
>>>>
>>>>    static void pci_bus_unrealize(BusState *qbus, Error **errp)
>>>> @@ -1835,6 +1836,22 @@ PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn)
>>>>        return bus->devices[devfn];
>>>>    }
>>>>
>>>> +void pci_bus_add_hotplug_notifier(PCIBus *bus, NotifierWithReturn *notify)
>>>> +{
>>>> +    notifier_with_return_list_add(&bus->hotplug_notifiers, notify);
>>>> +}
>>>> +
>>>> +void pci_bus_remove_hotplug_notifier(NotifierWithReturn *notifier)
>>>> +{
>>>> +    notifier_with_return_remove(notifier);
>>>> +}
>>>> +
>>>> +static int pci_bus_hotplug_notifier(PCIBus *bus, void *opaque)
>>>> +{
>>>> +    return notifier_with_return_list_notify(&bus->hotplug_notifiers,
>>>> +                                            opaque);
>>>> +}
>>>> +
>>>>    static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>>>>    {
>>>>        PCIDevice *pci_dev = (PCIDevice *)qdev;
>>>> @@ -1877,6 +1894,18 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>>>>            pci_qdev_unrealize(DEVICE(pci_dev), NULL);
>>>>            return;
>>>>        }
>>>> +
>>>> +    /*
>>>> +     *  If the function is func 0, indicate the closure of the slot.
>>>> +     *  signal the callback.
>>>> +     */
>>>> +    if (DEVICE(pci_dev)->hotplugged &&
>>>> +        pci_get_function_0(pci_dev) == pci_dev &&
>>>> +        pci_bus_hotplug_notifier(bus, pci_dev)) {
>>>> +        error_setg(errp, "failed to hotplug function 0");
>>>> +        pci_qdev_unrealize(DEVICE(pci_dev), NULL);
>>>> +        return;
>>>> +    }
>>> I don't understand why this is required in pci core.
>>> PCI Device is already constructed anyway.
>>> Just do the checks and call unrealize in vfio.
>> Because when do multi-function hotplug, the function 0 on the pcie bus
>> probably is not a vfio device. so we should trigger the check from pci
>> core.
>>
>>> I also don't see why you are tying this to hotplug.
>>> I would check each function as it's added.
>>> But that's a vfio thing, if both you and Alex think
>>> it's a good idea, fine by me.
>> The device is  initialized one by one no matter it is cold plugged or
>> hot plugged, but for the vfio with aer that need to get depended devices
>> required by bus reset, so need to make sure the reset depended devices
>> are assigned to qemu, in vfio, there is a machine done callback to check
>> the bus reset for boot up, so it also should be done in hotplug。
>>
>> it looks little complicated, Alex, any idea?
>
> So the problem is that to support AER we need to be able to do a bus
> reset of the device, both in the virtual and physical spaces.  A
> physical bus reset is likely to affect more than a single device since
> we're often dealing with multifunction endpoints.  Those functions may
> be considered isolated on the host due to ACS, but we cannot reset the
> bus without affecting all of the functions.  Therefore, we need to test
> whether we have a compatible setup, but it involves more than a single
> device.  We cannot test each device as it is initialized because any
> time more than one device is affected, and we haven't yet added the
> other devices, we'll fail the test.
>
> There are two separate cases where we need to solve this problem,
> coldplug and hotplug.  Coldplug can be resolved by using the
> machine-init done notifier to verify that our configuration is
> compatible.  We have no requirements for the ordering of devices during
> cold initialization.  For the hotplug case, we've defined that function
> 0 closes the slot, which provides an opportunity for us to do the same
> verification.  However, function 0 is not necessarily a vfio-pci device.
> We can create our own multifunction devices in the VM, where function 0
> could be any type of pci device.  Thus vfio-pci cannot notify itself
> when a slot is closed and due to the above mentioned problem, we cannot
> verify as each device is added.
>
> So, I don't really see a better way to solve the problem than what's
> being proposed here.  Thanks,
>
> Alex
>
> .
>

  reply	other threads:[~2015-11-16 10:22 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-11 10:34 [Qemu-devel] [PATCH v13 00/13] vfio-pci: pass the aer error to guest Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 01/13] vfio: extract vfio_get_hot_reset_info as a single function Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 02/13] vfio: squeeze out vfio_pci_do_hot_reset for support bus reset Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 03/13] pcie: modify the capability size assert Cao jin
2015-11-11 16:55   ` Michael S. Tsirkin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 04/13] vfio: make the 4 bytes aligned for capability size Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 05/13] vfio: add pcie extanded capability support Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 06/13] aer: impove pcie_aer_init to support vfio device Cao jin
2015-11-11 16:55   ` Michael S. Tsirkin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 07/13] vfio: add aer support for " Cao jin
2015-11-11 20:49   ` Alex Williamson
2015-11-12 11:54     ` Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 08/13] vfio: add check host bus reset is support or not Cao jin
2015-11-11 20:53   ` Alex Williamson
2015-11-12 11:56     ` Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 09/13] add check reset mechanism when hotplug vfio device Cao jin
2015-11-12 11:51   ` Michael S. Tsirkin
2015-11-13  3:28     ` Cao jin
2015-11-13 21:04       ` Alex Williamson
2015-11-16 10:18         ` Chen Fan [this message]
2015-11-16 16:05           ` Alex Williamson
2015-11-17  2:48             ` Chen Fan
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 10/13] pci: add pci device pre-post reset callbacks for host bus reset Cao jin
2015-11-11 16:56   ` Michael S. Tsirkin
2015-11-11 20:58   ` Alex Williamson
2015-11-12 11:58     ` Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 11/13] pcie_aer: expose pcie_aer_msg() interface Cao jin
2015-11-11 16:56   ` Michael S. Tsirkin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 12/13] vfio-pci: pass the aer error to guest Cao jin
2015-11-11 10:34 ` [Qemu-devel] [PATCH v13 13/13] vfio: add 'aer' property to expose aercap Cao jin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5649AD5F.6020407@cn.fujitsu.com \
    --to=chen.fan.fnst@cn.fujitsu.com \
    --cc=alex.williamson@redhat.com \
    --cc=caoj.fnst@cn.fujitsu.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.