From: "Michael S. Tsirkin" <mst@redhat.com>
To: Cao jin <caoj.fnst@cn.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
izumi.taku@jp.fujitsu.com, alex.williamson@redhat.com
Subject: Re: [PATCH] vfio/pci: Support error recovery
Date: Mon, 28 Nov 2016 05:00:04 +0200 [thread overview]
Message-ID: <20161128045506-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <1480246457-10368-1-git-send-email-caoj.fnst@cn.fujitsu.com>
On Sun, Nov 27, 2016 at 07:34:17PM +0800, Cao jin wrote:
> It is user space driver's or device-specific driver's(in guest) responsbility
> to do a serious recovery when error happened. Link-reset is one part of
> recovery, when pci device is assigned to VM via vfio, link-reset will do
> twice in host & guest separately, which will cause many trouble for a
> successful recovery, so, disable the vfio-pci's link-reset in aer driver
> in host, this is a keypoint for guest to do error recovery successfully.
>
> CC: alex.williamson@redhat.com
> CC: mst@redhat.com
> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> ---
> This is actually a RFC version(has debug lines left), and has minor changes in
> aer driver, so I think maybe it is better not to CC pci guys in this round.
> Later will do.
>
> drivers/pci/pcie/aer/aerdrv_core.c | 12 ++++++-
> drivers/vfio/pci/vfio_pci.c | 63 +++++++++++++++++++++++++++++++++++--
> drivers/vfio/pci/vfio_pci_private.h | 2 ++
> 3 files changed, 74 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 521e39c..289fb8e 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -496,7 +496,17 @@ static void do_recovery(struct pci_dev *dev, int severity)
> "error_detected",
> report_error_detected);
>
> - if (severity == AER_FATAL) {
> + /* vfio-pci as a general meta driver, it actually couldn't do any real
> + * recovery for device. It is user space driver, or device-specific
> + * driver in guest who should take care of the serious error recovery,
> + * link reset actually is one part of whole recovery. Doing reset_link
> + * in aer driver of host kernel for vfio-pci devices will cause many
> + * trouble for user space driver or guest's device-specific driver,
> + * for example: the serious recovery often need to read register in
> + * config space, but if register reading happens during link-resetting,
> + * it is quite possible to return invalid value like all F's, which
> + * will result in unpredictable error. */
Fix multi-comment style please.
> + if (severity == AER_FATAL && strcmp(dev->driver->name, "vfio-pci")) {
You really want some flag in the device, or something similar.
Also, how do we know driver is not going away at this point?
> result = reset_link(dev);
> if (result != PCI_ERS_RESULT_RECOVERED)
> goto failed;
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 712a849..aefd751 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -535,6 +535,15 @@ static long vfio_pci_ioctl(void *device_data,
> struct vfio_pci_device *vdev = device_data;
> unsigned long minsz;
>
> + if (vdev->aer_recovering && (cmd == VFIO_DEVICE_SET_IRQS ||
> + cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) {
> + int ret;
> + ret = wait_for_completion_interruptible(
> + &vdev->aer_error_completion);
> + if (ret)
> + return ret;
> + }
> +
> if (cmd == VFIO_DEVICE_GET_INFO) {
> struct vfio_device_info info;
>
> @@ -1117,6 +1126,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> vdev->irq_type = VFIO_PCI_NUM_IRQS;
> mutex_init(&vdev->igate);
> spin_lock_init(&vdev->irqlock);
> + init_completion(&vdev->aer_error_completion);
>
> ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
> if (ret) {
> @@ -1176,6 +1186,9 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
> {
> struct vfio_pci_device *vdev;
> struct vfio_device *device;
> + u32 uncor_status = 0;
> + unsigned int aer_cap_offset = 0;
> + int ret;
>
> device = vfio_device_get_from_dev(&pdev->dev);
> if (device == NULL)
> @@ -1187,10 +1200,30 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
> return PCI_ERS_RESULT_DISCONNECT;
> }
>
> + /* get device's uncorrectable error status as soon as possible,
> + * and signal it to user space. The later we read it, the possibility
> + * the register value is mangled grows. */
> + aer_cap_offset = pci_find_ext_capability(vdev->pdev, PCI_EXT_CAP_ID_ERR);
> + ret = pci_read_config_dword(vdev->pdev, aer_cap_offset +
> + PCI_ERR_UNCOR_STATUS, &uncor_status);
> + if (ret)
> + return PCI_ERS_RESULT_DISCONNECT;
> +
> + pr_err("device %d got AER detect notification. uncorrectable error status = 0x%x\n", pdev->devfn, uncor_status);//to be removed
> mutex_lock(&vdev->igate);
> +
> + vdev->aer_recovering = true;
> + reinit_completion(&vdev->aer_error_completion);
> +
> + /* suspend config space access from user space,
> + * when vfio-pci's error recovery process is on */
what about access to memory etc? Do you need to suspend this as well?
> + pci_cfg_access_trylock(vdev->pdev);
If you trylock, you need to handle failure.
>
> - if (vdev->err_trigger)
> - eventfd_signal(vdev->err_trigger, 1);
> + if (vdev->err_trigger && uncor_status) {
> + pr_err("device %d signal uncor status to user space", pdev->devfn);//may be removed
> + /* signal uncorrectable error status to user space */
> + eventfd_signal(vdev->err_trigger, uncor_status);
> + }
>
> mutex_unlock(&vdev->igate);
>
> @@ -1199,8 +1232,34 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
> return PCI_ERS_RESULT_CAN_RECOVER;
> }
>
> +static void vfio_pci_aer_resume(struct pci_dev *pdev)
> +{
> + struct vfio_pci_device *vdev;
> + struct vfio_device *device;
> +
> + device = vfio_device_get_from_dev(&pdev->dev);
> + if (device == NULL)
> + return;
> +
> + vdev = vfio_device_data(device);
> + if (vdev == NULL) {
> + vfio_device_put(device);
> + return;
> + }
> +
> + /* vfio-pci's error recovery is done, time to
> + * resume pci config space's accesses */
> + pci_cfg_access_unlock(vdev->pdev);
pci_cfg_access blocks accesses through sysfs.
I would not bother, focus on accesses through vfio.
> +
> + vdev->aer_recovering = false;
don't you need locks when changing aer_recovering?
> + complete_all(&vdev->aer_error_completion);
> +
> + vfio_device_put(device);
> +}
> +
> static const struct pci_error_handlers vfio_err_handlers = {
> .error_detected = vfio_pci_aer_err_detected,
> + .resume = vfio_pci_aer_resume,
> };
>
> static struct pci_driver vfio_pci_driver = {
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 8a7d546..ebf1041 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -83,6 +83,8 @@ struct vfio_pci_device {
> bool bardirty;
> bool has_vga;
> bool needs_reset;
> + bool aer_recovering;
> + struct completion aer_error_completion;
> struct pci_saved_state *pci_saved_state;
> int refcnt;
> struct eventfd_ctx *err_trigger;
> --
> 1.8.3.1
>
>
next prev parent reply other threads:[~2016-11-28 3:00 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-27 11:34 [PATCH] vfio/pci: Support error recovery Cao jin
2016-11-28 3:00 ` Michael S. Tsirkin [this message]
2016-11-28 9:32 ` Cao jin
2016-11-30 1:46 ` Michael S. Tsirkin
2016-12-01 13:38 ` Cao jin
2016-12-01 4:04 ` Alex Williamson
2016-12-01 4:51 ` Michael S. Tsirkin
2016-12-01 13:40 ` Cao jin
2016-12-06 3:46 ` Michael S. Tsirkin
2016-12-06 6:47 ` Cao jin
2016-12-01 13:40 ` Cao jin
2016-12-01 14:55 ` Alex Williamson
2016-12-04 12:16 ` Cao jin
2016-12-04 15:30 ` Alex Williamson
2016-12-05 5:52 ` Cao jin
2016-12-05 16:17 ` Alex Williamson
2016-12-06 3:55 ` Michael S. Tsirkin
2016-12-06 4:59 ` Alex Williamson
2016-12-06 10:46 ` Cao jin
2016-12-06 15:35 ` Alex Williamson
2016-12-07 2:49 ` Cao jin
2016-12-08 14:46 ` Cao jin
2016-12-08 16:30 ` Michael S. Tsirkin
2016-12-09 3:40 ` Cao jin
2016-12-09 3:40 ` Cao jin
2016-12-06 6:11 ` Cao jin
2016-12-06 15:25 ` Alex Williamson
2016-12-07 2:58 ` Cao jin
2016-12-12 13:49 ` Cao jin
2016-12-12 19:12 ` Alex Williamson
2016-12-12 22:29 ` Michael S. Tsirkin
2016-12-12 22:43 ` Alex Williamson
2016-12-13 3:15 ` Michael S. Tsirkin
2016-12-13 3:39 ` Alex Williamson
2016-12-13 16:12 ` Michael S. Tsirkin
2016-12-13 16:27 ` Alex Williamson
2016-12-14 1:58 ` Michael S. Tsirkin
2016-12-14 3:00 ` Alex Williamson
2016-12-14 22:20 ` Michael S. Tsirkin
2016-12-14 22:47 ` Alex Williamson
2016-12-14 23:00 ` Michael S. Tsirkin
2016-12-14 23:32 ` Alex Williamson
2016-12-14 10:24 ` Cao jin
2016-12-14 22:16 ` Alex Williamson
2016-12-14 22:25 ` Michael S. Tsirkin
2016-12-14 22:49 ` Alex Williamson
2016-12-15 13:56 ` Cao jin
2016-12-15 14:50 ` Michael S. Tsirkin
2016-12-15 22:01 ` Alex Williamson
2016-12-16 10:15 ` Cao jin
2016-12-16 10:15 ` Cao jin
2016-12-15 17:02 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161128045506-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=caoj.fnst@cn.fujitsu.com \
--cc=izumi.taku@jp.fujitsu.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.