From: Alexander Graf <agraf@suse.de>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: "aik@ozlabs.ru" <aik@ozlabs.ru>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
Alex Williamson <alex.williamson@redhat.com>,
"qiudayu@linux.vnet.ibm.com" <qiudayu@linux.vnet.ibm.com>,
"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>
Subject: Re: [PATCH v6 2/3] drivers/vfio: EEH support for VFIO PCI device
Date: Fri, 23 May 2014 12:49:13 +0000
Message-ID: <537F43C9.1080503@suse.de>
In-Reply-To: <20140523124326.GA4778@shangw>
On 23.05.14 14:43, Gavin Shan wrote:
> On Fri, May 23, 2014 at 01:58:50PM +0200, Alexander Graf wrote:
>> On 23.05.14 13:55, Gavin Shan wrote:
>>> On Fri, May 23, 2014 at 11:58:22AM +0200, Alexander Graf wrote:
>>>> On 23.05.14 09:37, Gavin Shan wrote:
>>>>> On Fri, May 23, 2014 at 08:55:15AM +0200, Alexander Graf wrote:
>>>>>>> On 23.05.2014 at 06:37, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>>>>>>>> On Thu, May 22, 2014 at 09:10:53PM -0600, Alex Williamson wrote:
>>>>>>>>> On Thu, 2014-05-22 at 18:23 +1000, Gavin Shan wrote:
>>>>>>>>> The patch adds new IOCTL commands for VFIO PCI device to support
>>>>>>>>> EEH functionality for PCI devices, which have been passed through
>>>>>>>>> from host to somebody else via VFIO.
>>>>> .../...
>>>>>
>>>>>>>>> +
>>>>>>>>> +/*
>>>>>>>>> + * Reset is the major step to recover problematic PE. The following
>>>>>>>>> + * command helps on that.
>>>>>>>>> + */
>>>>>>>>> +struct vfio_eeh_pe_reset {
>>>>>>>>> + __u32 argsz;
>>>>>>>>> + __u32 option;
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>> +#define VFIO_EEH_PE_RESET _IO(VFIO_TYPE, VFIO_BASE + 24)
>>>>>>>>> +
>>>>>>>>> +/*
>>>>>>>>> + * One of the steps for recovery after PE reset is to configure the
>>>>>>>>> + * PCI bridges affected by the PE reset.
>>>>>>>>> + */
>>>>>>>>> +#define VFIO_EEH_PE_CONFIGURE _IO(VFIO_TYPE, VFIO_BASE + 25)
>>>>>>>> What can the user do differently by making these separate ioctls?
>>>>>>> Hmm, I didn't understand that either. Alex G. may have the explanation.
>>>>>> Alex raised the same concern as me: why separate reset and configure? When we want to recover a device, we need a reset call anyway, right?
>>>>>>
>>>>> Ok. With the current ioctl commands, "reset+configure" is required to do
>>>>> error recovery. Before the recovery, we also need to call "configure"
>>>>> in order to retrieve the error log correctly.
>>>> Well, the "configure" ioctl (which is a really bad name for what it
>>>> does btw) currently only restores the BARs which doesn't sound like
>>>> error log retrieval to me.
>>>>
>>> Could you please suggest a better name? I had VFIO_EEH_PE_CONFIGURE because
>>> it's for RTAS call "ibm,configure-pe".
>> VFIO_RESTORE_BARS maybe?
>>
> Hmm, it's not better than the original one. Could we just
> keep VFIO_EEH_PE_CONFIGURE, since all the remaining ioctl command
> names stick to the RTAS call names?
>
> Also, I might add more logic to this function to improve
> reliability. For example, if there are multiple PCI bridges
> included in the PE, I need to reset them one by one and ensure
> their PCI links come up. It's obviously not restoring BARs,
> but configuring the PE :-)
VFIO_EEH_RECOVER?
>
>>>>> Also, they correspond to 2 separate RTAS services: "ibm,set-slot-reset"
>>>>> and "ibm,configure-pe".
>>>> Does a guest always issue both? What's the order it calls them in?
>>>>
>>> For one error, the following RTAS calls are generally made:
>>>
>>> < stop device drivers, no PCI traffic expected during recovery >
>>> ibm,set-eeh-option
>>> ibm,configure-pe
>>> < error log retrieval >
>> I see. So the guest retrieves the log via BARs from the device? I
>> guess I'm failing to see what "the log" is.
>>
> Well, it seems that I didn't describe it clearly enough. The ioctl
> command was introduced to do the work that should be done by the
> RTAS call "ibm,configure-pe", which is to configure the PCI bridges'
> config space correctly. Without that, it's possible that we can't
> access the config space of the PCI devices subordinate to those
> bridges. So we should restore config space for PCI bridges. However,
> we also need to restore config space for normal PCI devices, because
> userland has some config space registers masked off and can't access
> them all, so it's not reliable to restore config space for normal
> PCI devices from userland.
>
> So restoring the config space of PCI bridges is required, but restoring
> config space for normal devices is a trick here.
So what if user space accesses config space while the device is broken?
What if it accesses an mmap'ed BAR while the device is in broken state
and BARs haven't been recovered yet?
Alex
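
For readers following the thread, the "reset+configure" ordering that Gavin describes for error recovery could be sketched from user space roughly as below. This is only an illustration under stated assumptions: the two ioctl numbers and struct vfio_eeh_pe_reset are copied from the quoted patch (a proposal under review in this thread, not a settled interface), VFIO_TYPE and VFIO_BASE mirror the values in <linux/vfio.h>, and eeh_recover(), ioctl_fn, record_ioctl() and real_ioctl() are hypothetical names introduced here. The meaning of "option" is not given in the thread, so it is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Values mirror <linux/vfio.h>; the two commands are as proposed in the patch. */
#define VFIO_TYPE (';')
#define VFIO_BASE 100
#define VFIO_EEH_PE_RESET     _IO(VFIO_TYPE, VFIO_BASE + 24)
#define VFIO_EEH_PE_CONFIGURE _IO(VFIO_TYPE, VFIO_BASE + 25)

/* Argument layout as quoted in the patch. */
struct vfio_eeh_pe_reset {
    uint32_t argsz;
    uint32_t option;
};

/* Hypothetical indirection so the ordering can be exercised without a device. */
typedef int (*ioctl_fn)(int fd, unsigned long req, void *arg);

/* Thin wrapper around ioctl(2) for use against a real VFIO device fd. */
int real_ioctl(int fd, unsigned long req, void *arg)
{
    return ioctl(fd, req, arg);
}

/* Reset the PE first, then configure it; stop at the first failure. */
int eeh_recover(ioctl_fn do_ioctl, int fd, uint32_t reset_option)
{
    struct vfio_eeh_pe_reset reset = {
        .argsz = sizeof(reset),
        .option = reset_option, /* semantics not specified in this thread */
    };
    int ret;

    assert(do_ioctl != NULL);

    ret = do_ioctl(fd, VFIO_EEH_PE_RESET, &reset);
    if (ret < 0)
        return ret;

    return do_ioctl(fd, VFIO_EEH_PE_CONFIGURE, NULL);
}

/* Recording stub: remembers which requests were issued, in order. */
unsigned long seen[4];
int seen_count;

int record_ioctl(int fd, unsigned long req, void *arg)
{
    (void)fd;
    (void)arg;
    if (seen_count < 4)
        seen[seen_count++] = req;
    return 0;
}
```

Passing real_ioctl together with an open VFIO device fd would issue the two commands against a kernel carrying this patch; passing record_ioctl only records the order. Having to issue two calls back to back like this is what prompts the question raised above of whether they should be one ioctl.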