From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40317) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YvFhm-0001jc-Eh for qemu-devel@nongnu.org; Wed, 20 May 2015 21:55:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YvFhg-0001eU-Tu for qemu-devel@nongnu.org; Wed, 20 May 2015 21:55:26 -0400 Received: from [59.151.112.132] (port=15943 helo=heian.cn.fujitsu.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YvFhf-0001dg-9I for qemu-devel@nongnu.org; Wed, 20 May 2015 21:55:20 -0400 Message-ID: <555D3AB0.7070300@cn.fujitsu.com> Date: Thu, 21 May 2015 09:53:52 +0800 From: Chen Fan MIME-Version: 1.0 References: <1432064065.11375.246.camel@redhat.com> <555C02D8.7060409@cn.fujitsu.com> <1432147650.11375.278.camel@redhat.com> In-Reply-To: <1432147650.11375.278.camel@redhat.com> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC v7 11/11] vfio: add 'aer' property to expose aercap List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alex Williamson Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org On 05/21/2015 02:47 AM, Alex Williamson wrote: > On Wed, 2015-05-20 at 11:43 +0800, Chen Fan wrote: >> On 05/20/2015 03:34 AM, Alex Williamson wrote: >>> On Tue, 2015-05-19 at 12:42 +0800, Chen Fan wrote: >>>> add 'aer' property to let user able to decide whether expose >>>> the aer capability. by default we should disable aer feature, >>>> because it needs configuration restrictions. >>> But the previous patch already enabled it, so a bisect might get it >>> enabled then disabled. >>> >>>> Signed-off-by: Chen Fan >>>> --- >>>> hw/i386/pc_q35.c | 4 +++ >>>> hw/vfio/pci.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++ >>>> include/hw/compat.h | 8 +++++ >>>> 3 files changed, 96 insertions(+) >>>> >>>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c >>>> index e67f2de..94a3a1c 100644 >>>> --- a/hw/i386/pc_q35.c >>>> +++ b/hw/i386/pc_q35.c >>>> @@ -439,6 +439,10 @@ static QEMUMachine pc_q35_machine_v2_3 = { >>>> PC_Q35_2_3_MACHINE_OPTIONS, >>>> .name = "pc-q35-2.3", >>>> .init = pc_q35_init_2_3, >>>> + .compat_props = (GlobalProperty[]) { >>>> + HW_COMPAT_2_3, >>>> + { /* end of list */ } >>>> + }, >>>> }; >>>> >>>> #define PC_Q35_2_2_MACHINE_OPTIONS PC_Q35_2_3_MACHINE_OPTIONS >>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>>> index f2fc5d6..b4b9495 100644 >>>> --- a/hw/vfio/pci.c >>>> +++ b/hw/vfio/pci.c >>>> @@ -42,6 +42,7 @@ >>>> #include "trace.h" >>>> #include "hw/vfio/vfio.h" >>>> #include "hw/vfio/vfio-common.h" >>>> +#include "hw/pci/pci_bridge.h" >>>> >>>> struct VFIOPCIDevice; >>>> >>>> @@ -161,6 +162,8 @@ typedef struct VFIOPCIDevice { >>>> #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT) >>>> #define VFIO_FEATURE_ENABLE_REQ_BIT 1 >>>> #define VFIO_FEATURE_ENABLE_REQ (1 << VFIO_FEATURE_ENABLE_REQ_BIT) >>>> +#define VFIO_FEATURE_ENABLE_AER_BIT 2 >>>> +#define VFIO_FEATURE_ENABLE_AER (1 << VFIO_FEATURE_ENABLE_AER_BIT) >>>> int32_t bootindex; >>>> uint8_t pm_cap; >>>> bool has_vga; >>>> @@ -2821,10 +2824,31 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) >>>> static int vfio_setup_aer(VFIOPCIDevice *vdev, int pos, uint16_t size) >>>> { >>>> PCIDevice *pdev = &vdev->pdev; >>>> + PCIDevice *dev_iter; >>>> uint8_t *exp_cap = pdev->config + pdev->exp.exp_cap; >>>> uint32_t severity, errcap; >>>> + uint8_t type; >>>> int ret; >>>> >>>> + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) { >>>> + return 0; >>>> + } >>>> + >>>> + dev_iter = pci_bridge_get_device(pdev->bus); >>>> + if (!dev_iter) { >>> So this is testing that it can't be on the root complex endpoint. >>> >>>> + goto error; >>>> + } >>>> + >>>> + while (dev_iter) { >>>> + type = pcie_cap_get_type(dev_iter); >>>> + if ((type != PCI_EXP_TYPE_ROOT_PORT && >>>> + type != PCI_EXP_TYPE_UPSTREAM && >>>> + type != PCI_EXP_TYPE_DOWNSTREAM)) { >>>> + goto error; >>>> + } >>> And this tests that we have PCIe devices up to the root complex, but do >>> do we need to check whether they support aer? >> Exactly, although the ioh3420 port and the xio3130 switch have >> been support aer, but in case new devices are unsupported in the future. > Yes, that's my concern, we don't want to make assumptions based on the > devices we have today that need to be revisited later. > >>>> + dev_iter = pci_bridge_get_device(dev_iter->bus); >>>> + } >>>> + >>>> errcap = vfio_pci_read_config(pdev, pdev->exp.aer_cap + PCI_ERR_CAP, 4); >>>> /* >>>> * The ability to record multiple headers is depending on >>>> @@ -2850,6 +2874,11 @@ static int vfio_setup_aer(VFIOPCIDevice *vdev, int pos, uint16_t size) >>>> pci_long_test_and_clear_mask(exp_cap + PCI_ERR_UNCOR_SEVER, ~severity); >>>> >>>> return 0; >>>> + >>>> +error: >>>> + error_report("vfio: Unable to enable AER for device %s, parent bus " >>>> + "do not support signal AER", vdev->vbasedev.name); >>>> + return -1; >>>> } >>>> >>>> static int vfio_add_ext_cap(VFIOPCIDevice *vdev, const uint8_t *config) >>>> @@ -3733,9 +3762,52 @@ static void vfio_check_host_bus_reset(VFIODevice *vbasedev) >>>> vdev->has_bus_reset = true; >>>> >>>> out: >>>> + if ((vdev->features & VFIO_FEATURE_ENABLE_AER) && >>>> + !vdev->has_bus_reset) { >>>> + error_report("vfio: Cannot enable AER for device %s," >>>> + "which is not support host bus reset.", >>>> + vdev->vbasedev.name); >>>> + exit(1); >>>> + } >>> And this is done via the machine init callback, so again we're not doing >>> anything to support hotplug. >> Exactly, I missed the case of hotplug. I think I should implement >> the bus reset capability at each initialize time and meanwhile update >> the other devices bus reset capability on the same bus. > Right, so it's probably better to do the validation from the initfn > rather than via a separate machine init callback so that we can fail a > device_add if it would compromise our ability to do an aer bus reset. > > The host device hotplug scenario is harder, maybe before injecting the > aer message to the guest, we need to verify whether a bus reset is still > available. Once injected, we wouldn't know whether the inability to do > a bus reset should be fatal or not. > >>>> g_free(info); >>>> } >>>> >>>> +static int vfio_pci_validate_aer_devices(DeviceState *dev, >>>> + void *opaque) >>>> +{ >>>> + VFIOPCIDevice *vdev; >>>> + >>>> + if (!object_dynamic_cast(OBJECT(dev), "vfio-pci")) { >>>> + return 0; >>>> + } >>>> + >>>> + vdev = DO_UPCAST(VFIOPCIDevice, pdev, PCI_DEVICE(dev)); >>>> + if (!vdev->has_bus_reset) { >>>> + error_report("vfio: device %s is not support host bus reset," >>>> + "but on the same bus has vfio device enable AER.", >>>> + vdev->vbasedev.name); >>>> + exit(1); >>>> + } >>>> + return 0; >>>> +} >>>> + >>>> +static void vfio_check_virtual_bus_reset(VFIODevice *vbasedev) >>>> +{ >>>> + VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev); >>>> + >>>> + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) { >>>> + return; >>>> + } >>>> + >>>> + /* >>>> + * Verify that we have all the vfio devices support host bus reset >>>> + * on the bus >>>> + */ >>>> + qbus_walk_children(BUS(vdev->pdev.bus), NULL, NULL, >>>> + vfio_pci_validate_aer_devices, >>>> + NULL, NULL); >>>> +} >>>> + >>>> static void vfio_pci_machine_done_notify(Notifier *notifier, void *unused) >>>> { >>>> VFIOGroup *group; >>>> @@ -3746,6 +3818,16 @@ static void vfio_pci_machine_done_notify(Notifier *notifier, void *unused) >>>> vfio_check_host_bus_reset(vbasedev); >>>> } >>>> } >>>> + >>>> + /* >>>> + * All devices need support host bus reset on each virtual bus >>>> + * if one device enable AER. >>>> + */ >>>> + QLIST_FOREACH(group, &vfio_group_list, next) { >>>> + QLIST_FOREACH(vbasedev, &group->device_list, next) { >>>> + vfio_check_virtual_bus_reset(vbasedev); >>>> + } >>>> + } >>> I'm not sure I see how this call path can catch anything missed by >>> vfio_check_host_bus_reset(). >> I think we should support the case that on the same virtual bus >> if one device enable AER, probably the other devices is on the >> separate bus. at least the other devices should support host >> bus reset. >> >> 1. on the same virtual bus and on the same host bus >> >> virtual bus ------------------------- >> | | >> deviceA deviceB >> | | >> host bus ------------------------- >> >> >> 2. on the same virtual bus but on the respective host bus >> >> virtual bus ------------------------- >> | | >> deviceA deviceB >> | | >> host bus ---------- ------------ >> > That's an interesting question, but I'm not sure I agree that collateral > devices need to support bus reset, or if it's sufficient that they just > need to support any reset mechanism. We don't really let the guest see > the post-reset state of the device anyway, the host will always have > done a save/restore of state already. So if aer is not supported on the > device and we therefore know that the guest isn't attempting to do link > renegotiation or fundamental reset specific to that device, isn't any > reset that quiesces the device sufficient? Thanks, I agree. Thanks, Chen > > Alex > >>>> } >>>> >>>> static Notifier machine_notifier = { >>>> @@ -4004,6 +4086,8 @@ static Property vfio_pci_dev_properties[] = { >>>> VFIO_FEATURE_ENABLE_VGA_BIT, false), >>>> DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features, >>>> VFIO_FEATURE_ENABLE_REQ_BIT, true), >>>> + DEFINE_PROP_BIT("aer", VFIOPCIDevice, features, >>>> + VFIO_FEATURE_ENABLE_AER_BIT, false), >>>> DEFINE_PROP_INT32("bootindex", VFIOPCIDevice, bootindex, -1), >>>> DEFINE_PROP_BOOL("x-mmap", VFIOPCIDevice, vbasedev.allow_mmap, true), >>>> /* >>>> diff --git a/include/hw/compat.h b/include/hw/compat.h >>>> index 313682a..f722919 100644 >>>> --- a/include/hw/compat.h >>>> +++ b/include/hw/compat.h >>>> @@ -1,7 +1,15 @@ >>>> #ifndef HW_COMPAT_H >>>> #define HW_COMPAT_H >>>> >>>> +#define HW_COMPAT_2_3 \ >>>> + {\ >>>> + .driver = "vfio-pci",\ >>>> + .property = "x-aer",\ >>> It's defined as "aer" above, not "x-aer". I'm not sure why we need this >>> though if the default value is off anyway. >>> >>>> + .value = "off",\ >>>> + } >>>> + >>>> #define HW_COMPAT_2_1 \ >>>> + HW_COMPAT_2_3, \ >>>> {\ >>>> .driver = "intel-hda",\ >>>> .property = "old_msi_addr",\ >>> >>> . >>> > > > > . >