From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: izumi.taku@jp.fujitsu.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC v7 11/11] vfio: add 'aer' property to expose aercap
Date: Thu, 21 May 2015 09:53:52 +0800 [thread overview]
Message-ID: <555D3AB0.7070300@cn.fujitsu.com> (raw)
In-Reply-To: <1432147650.11375.278.camel@redhat.com>
On 05/21/2015 02:47 AM, Alex Williamson wrote:
> On Wed, 2015-05-20 at 11:43 +0800, Chen Fan wrote:
>> On 05/20/2015 03:34 AM, Alex Williamson wrote:
>>> On Tue, 2015-05-19 at 12:42 +0800, Chen Fan wrote:
>>>> add 'aer' property to let user able to decide whether expose
>>>> the aer capability. by default we should disable aer feature,
>>>> because it needs configuration restrictions.
>>> But the previous patch already enabled it, so a bisect might get it
>>> enabled then disabled.
>>>
>>>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>> ---
>>>> hw/i386/pc_q35.c | 4 +++
>>>> hw/vfio/pci.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> include/hw/compat.h | 8 +++++
>>>> 3 files changed, 96 insertions(+)
>>>>
>>>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>>>> index e67f2de..94a3a1c 100644
>>>> --- a/hw/i386/pc_q35.c
>>>> +++ b/hw/i386/pc_q35.c
>>>> @@ -439,6 +439,10 @@ static QEMUMachine pc_q35_machine_v2_3 = {
>>>> PC_Q35_2_3_MACHINE_OPTIONS,
>>>> .name = "pc-q35-2.3",
>>>> .init = pc_q35_init_2_3,
>>>> + .compat_props = (GlobalProperty[]) {
>>>> + HW_COMPAT_2_3,
>>>> + { /* end of list */ }
>>>> + },
>>>> };
>>>>
>>>> #define PC_Q35_2_2_MACHINE_OPTIONS PC_Q35_2_3_MACHINE_OPTIONS
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index f2fc5d6..b4b9495 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -42,6 +42,7 @@
>>>> #include "trace.h"
>>>> #include "hw/vfio/vfio.h"
>>>> #include "hw/vfio/vfio-common.h"
>>>> +#include "hw/pci/pci_bridge.h"
>>>>
>>>> struct VFIOPCIDevice;
>>>>
>>>> @@ -161,6 +162,8 @@ typedef struct VFIOPCIDevice {
>>>> #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
>>>> #define VFIO_FEATURE_ENABLE_REQ_BIT 1
>>>> #define VFIO_FEATURE_ENABLE_REQ (1 << VFIO_FEATURE_ENABLE_REQ_BIT)
>>>> +#define VFIO_FEATURE_ENABLE_AER_BIT 2
>>>> +#define VFIO_FEATURE_ENABLE_AER (1 << VFIO_FEATURE_ENABLE_AER_BIT)
>>>> int32_t bootindex;
>>>> uint8_t pm_cap;
>>>> bool has_vga;
>>>> @@ -2821,10 +2824,31 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
>>>> static int vfio_setup_aer(VFIOPCIDevice *vdev, int pos, uint16_t size)
>>>> {
>>>> PCIDevice *pdev = &vdev->pdev;
>>>> + PCIDevice *dev_iter;
>>>> uint8_t *exp_cap = pdev->config + pdev->exp.exp_cap;
>>>> uint32_t severity, errcap;
>>>> + uint8_t type;
>>>> int ret;
>>>>
>>>> + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
>>>> + return 0;
>>>> + }
>>>> +
>>>> + dev_iter = pci_bridge_get_device(pdev->bus);
>>>> + if (!dev_iter) {
>>> So this is testing that it can't be on the root complex endpoint.
>>>
>>>> + goto error;
>>>> + }
>>>> +
>>>> + while (dev_iter) {
>>>> + type = pcie_cap_get_type(dev_iter);
>>>> + if ((type != PCI_EXP_TYPE_ROOT_PORT &&
>>>> + type != PCI_EXP_TYPE_UPSTREAM &&
>>>> + type != PCI_EXP_TYPE_DOWNSTREAM)) {
>>>> + goto error;
>>>> + }
>>> And this tests that we have PCIe devices up to the root complex, but do
>>> do we need to check whether they support aer?
>> Exactly, although the ioh3420 port and the xio3130 switch have
>> been support aer, but in case new devices are unsupported in the future.
> Yes, that's my concern, we don't want to make assumptions based on the
> devices we have today that need to be revisited later.
>
>>>> + dev_iter = pci_bridge_get_device(dev_iter->bus);
>>>> + }
>>>> +
>>>> errcap = vfio_pci_read_config(pdev, pdev->exp.aer_cap + PCI_ERR_CAP, 4);
>>>> /*
>>>> * The ability to record multiple headers is depending on
>>>> @@ -2850,6 +2874,11 @@ static int vfio_setup_aer(VFIOPCIDevice *vdev, int pos, uint16_t size)
>>>> pci_long_test_and_clear_mask(exp_cap + PCI_ERR_UNCOR_SEVER, ~severity);
>>>>
>>>> return 0;
>>>> +
>>>> +error:
>>>> + error_report("vfio: Unable to enable AER for device %s, parent bus "
>>>> + "do not support signal AER", vdev->vbasedev.name);
>>>> + return -1;
>>>> }
>>>>
>>>> static int vfio_add_ext_cap(VFIOPCIDevice *vdev, const uint8_t *config)
>>>> @@ -3733,9 +3762,52 @@ static void vfio_check_host_bus_reset(VFIODevice *vbasedev)
>>>> vdev->has_bus_reset = true;
>>>>
>>>> out:
>>>> + if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
>>>> + !vdev->has_bus_reset) {
>>>> + error_report("vfio: Cannot enable AER for device %s,"
>>>> + "which is not support host bus reset.",
>>>> + vdev->vbasedev.name);
>>>> + exit(1);
>>>> + }
>>> And this is done via the machine init callback, so again we're not doing
>>> anything to support hotplug.
>> Exactly, I missed the case of hotplug. I think I should implement
>> the bus reset capability at each initialize time and meanwhile update
>> the other devices bus reset capability on the same bus.
> Right, so it's probably better to do the validation from the initfn
> rather than via a separate machine init callback so that we can fail a
> device_add if it would compromise our ability to do an aer bus reset.
>
> The host device hotplug scenario is harder, maybe before injecting the
> aer message to the guest, we need to verify whether a bus reset is still
> available. Once injected, we wouldn't know whether the inability to do
> a bus reset should be fatal or not.
>
>>>> g_free(info);
>>>> }
>>>>
>>>> +static int vfio_pci_validate_aer_devices(DeviceState *dev,
>>>> + void *opaque)
>>>> +{
>>>> + VFIOPCIDevice *vdev;
>>>> +
>>>> + if (!object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
>>>> + return 0;
>>>> + }
>>>> +
>>>> + vdev = DO_UPCAST(VFIOPCIDevice, pdev, PCI_DEVICE(dev));
>>>> + if (!vdev->has_bus_reset) {
>>>> + error_report("vfio: device %s is not support host bus reset,"
>>>> + "but on the same bus has vfio device enable AER.",
>>>> + vdev->vbasedev.name);
>>>> + exit(1);
>>>> + }
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static void vfio_check_virtual_bus_reset(VFIODevice *vbasedev)
>>>> +{
>>>> + VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>>>> +
>>>> + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
>>>> + return;
>>>> + }
>>>> +
>>>> + /*
>>>> + * Verify that we have all the vfio devices support host bus reset
>>>> + * on the bus
>>>> + */
>>>> + qbus_walk_children(BUS(vdev->pdev.bus), NULL, NULL,
>>>> + vfio_pci_validate_aer_devices,
>>>> + NULL, NULL);
>>>> +}
>>>> +
>>>> static void vfio_pci_machine_done_notify(Notifier *notifier, void *unused)
>>>> {
>>>> VFIOGroup *group;
>>>> @@ -3746,6 +3818,16 @@ static void vfio_pci_machine_done_notify(Notifier *notifier, void *unused)
>>>> vfio_check_host_bus_reset(vbasedev);
>>>> }
>>>> }
>>>> +
>>>> + /*
>>>> + * All devices need support host bus reset on each virtual bus
>>>> + * if one device enable AER.
>>>> + */
>>>> + QLIST_FOREACH(group, &vfio_group_list, next) {
>>>> + QLIST_FOREACH(vbasedev, &group->device_list, next) {
>>>> + vfio_check_virtual_bus_reset(vbasedev);
>>>> + }
>>>> + }
>>> I'm not sure I see how this call path can catch anything missed by
>>> vfio_check_host_bus_reset().
>> I think we should support the case that on the same virtual bus
>> if one device enable AER, probably the other devices is on the
>> separate bus. at least the other devices should support host
>> bus reset.
>>
>> 1. on the same virtual bus and on the same host bus
>>
>> virtual bus -------------------------
>> | |
>> deviceA deviceB
>> | |
>> host bus -------------------------
>>
>>
>> 2. on the same virtual bus but on the respective host bus
>>
>> virtual bus -------------------------
>> | |
>> deviceA deviceB
>> | |
>> host bus ---------- ------------
>>
> That's an interesting question, but I'm not sure I agree that collateral
> devices need to support bus reset, or if it's sufficient that they just
> need to support any reset mechanism. We don't really let the guest see
> the post-reset state of the device anyway, the host will always have
> done a save/restore of state already. So if aer is not supported on the
> device and we therefore know that the guest isn't attempting to do link
> renegotiation or fundamental reset specific to that device, isn't any
> reset that quiesces the device sufficient? Thanks,
I agree.
Thanks,
Chen
>
> Alex
>
>>>> }
>>>>
>>>> static Notifier machine_notifier = {
>>>> @@ -4004,6 +4086,8 @@ static Property vfio_pci_dev_properties[] = {
>>>> VFIO_FEATURE_ENABLE_VGA_BIT, false),
>>>> DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features,
>>>> VFIO_FEATURE_ENABLE_REQ_BIT, true),
>>>> + DEFINE_PROP_BIT("aer", VFIOPCIDevice, features,
>>>> + VFIO_FEATURE_ENABLE_AER_BIT, false),
>>>> DEFINE_PROP_INT32("bootindex", VFIOPCIDevice, bootindex, -1),
>>>> DEFINE_PROP_BOOL("x-mmap", VFIOPCIDevice, vbasedev.allow_mmap, true),
>>>> /*
>>>> diff --git a/include/hw/compat.h b/include/hw/compat.h
>>>> index 313682a..f722919 100644
>>>> --- a/include/hw/compat.h
>>>> +++ b/include/hw/compat.h
>>>> @@ -1,7 +1,15 @@
>>>> #ifndef HW_COMPAT_H
>>>> #define HW_COMPAT_H
>>>>
>>>> +#define HW_COMPAT_2_3 \
>>>> + {\
>>>> + .driver = "vfio-pci",\
>>>> + .property = "x-aer",\
>>> It's defined as "aer" above, not "x-aer". I'm not sure why we need this
>>> though if the default value is off anyway.
>>>
>>>> + .value = "off",\
>>>> + }
>>>> +
>>>> #define HW_COMPAT_2_1 \
>>>> + HW_COMPAT_2_3, \
>>>> {\
>>>> .driver = "intel-hda",\
>>>> .property = "old_msi_addr",\
>>>
>>> .
>>>
>
>
>
> .
>
prev parent reply other threads:[~2015-05-21 1:55 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-19 4:42 [Qemu-devel] [RFC v7 00/11] vfio-pci: pass the aer error to guest Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 01/11] vfio: extract vfio_get_hot_reset_info as a single function Chen Fan
2015-05-19 19:34 ` Alex Williamson
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 02/11] vfio: squeeze out vfio_pci_do_hot_reset for support bus reset Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 03/11] qdev: add bus reset_notifiers callbacks for host " Chen Fan
2015-05-19 19:34 ` Alex Williamson
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 04/11] vfio: add check host bus reset is support or not Chen Fan
2015-05-19 19:34 ` Alex Williamson
2015-05-20 7:24 ` Chen Fan
2015-05-20 18:47 ` Alex Williamson
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 05/11] vfio: do hot bus reset when do virtual secondary bus reset Chen Fan
2015-05-19 19:34 ` Alex Williamson
2015-05-20 9:59 ` Chen Fan
2015-05-20 18:47 ` Alex Williamson
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 06/11] vfio: add pcie extanded capability support Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 07/11] aer: impove pcie_aer_init to support vfio device Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 08/11] vfio: add aer support for " Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 09/11] pcie_aer: expose pcie_aer_msg() interface Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 10/11] vfio-pci: pass the aer error to guest Chen Fan
2015-05-19 4:42 ` [Qemu-devel] [RFC v7 11/11] vfio: add 'aer' property to expose aercap Chen Fan
2015-05-19 19:34 ` Alex Williamson
2015-05-20 3:43 ` Chen Fan
2015-05-20 18:47 ` Alex Williamson
2015-05-21 1:53 ` Chen Fan [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=555D3AB0.7070300@cn.fujitsu.com \
--to=chen.fan.fnst@cn.fujitsu.com \
--cc=alex.williamson@redhat.com \
--cc=izumi.taku@jp.fujitsu.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).