From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: Jason Wang <jasowang@redhat.com>, <mst@redhat.com>,
<kvm@vger.kernel.org>,
<virtualization@lists.linux-foundation.org>
Cc: <oren@nvidia.com>, <nitzanc@nvidia.com>, <cohuck@redhat.com>
Subject: Re: [PATCH v2 1/3] virtio: update reset callback to return status
Date: Thu, 8 Apr 2021 17:24:04 +0300 [thread overview]
Message-ID: <103ae6fe-1ffc-90a3-09cd-bcbbcbb8eee7@nvidia.com> (raw)
In-Reply-To: <a46cef9b-b8da-b748-c332-a3a05bd9135f@redhat.com>
On 4/8/2021 4:14 PM, Jason Wang wrote:
>
> 在 2021/4/8 下午5:56, Max Gurtovoy 写道:
>>
>> On 4/8/2021 11:58 AM, Jason Wang wrote:
>>>
>>> 在 2021/4/8 下午4:11, Max Gurtovoy 写道:
>>>> The reset device operation, usually is an operation that might fail
>>>> from
>>>> various reasons. For example, the controller might be in a bad
>>>> state and
>>>> can't answer to any request. Usually, the paravirt SW based virtio
>>>> devices always succeed in reset operation but this is not the case for
>>>> HW based virtio devices.
>>>
>>>
>>> I would like to know under what condition that the reset operation
>>> may fail (except for the case of a bugg guest).
>>
>> The controller might not be ready or stuck. This is a real use case
>> for many PCI devices.
>>
>> For real devices the FW might be in a bad state and it can happen
>> also for paravirt device if you have a bug in the controller code or
>> if you entered some error flow (Out of memory).
>>
>> You don't want to be stuck because of one bad device.
>
>
> So the buggy driver can damage the host through various ways, I'm not
> sure doing such workaround is worthwhile.
do you mean device ?
sometimes you need to replace device FW and it will work.
I don't think it's a workaround. Other protocols, such as NVMe, solved
this in the specification.
PCI config space and PCI controller are sometimes 2 different components
and sometimes the controller is not active, although the device is
plugged and seen by the PCI subsystem.
>
> Note that this driver has been used for real hardware PCI devices for
> many years. We don't receive any report of this before.
>
>
>>
>>
>>>
>>>
>>>>
>>>> This commit is also a preparation for adding a timeout mechanism for
>>>> resetting virtio devices.
>>>>
>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>> ---
>>>>
>>>> changes from v1:
>>>> - update virtio_ccw.c (Cornelia)
>>>> - update virtio_uml.c
>>>> - update mlxbf-tmfifo.c
>>>
>>>
>>> Note that virtio driver may call reset, so you probably need to
>>> convert them.
>>
>> I'm sure I understand.
>>
>> Convert to what ?
>
>
> Convert to deal with the possible rest failure. E.g in
> virtblk_freeze() we had:
>
> static int virtblk_freeze(struct virtio_device *vdev)
> {
> struct virtio_blk *vblk = vdev->priv;
>
> /* Ensure we don't receive any more interrupts */
> vdev->config->reset(vdev);
> ...
>
> We need fail the freeze here.
Agree.
>
> Another example is the driver remove which is not expected to be fail.
> A lot of virtio drivers tries to call reset there. I'm not sure how
> hard to deal with the failure in the path of remove (e.g
> __device_release_driver tends to ignore the return value of
> bus->remove().)
I think it can stay as-is and ignore the ->reset return value and
continue free the other resources to avoid leakage.
>
> Thanks
>
>
>>
>> Thanks.
>>
>>>
>>> Thanks
>>>
>>>
>>>>
>>>> ---
>>>> arch/um/drivers/virtio_uml.c | 4 +++-
>>>> drivers/platform/mellanox/mlxbf-tmfifo.c | 4 +++-
>>>> drivers/remoteproc/remoteproc_virtio.c | 4 +++-
>>>> drivers/s390/virtio/virtio_ccw.c | 9 ++++++---
>>>> drivers/virtio/virtio.c | 22 +++++++++++++++-------
>>>> drivers/virtio/virtio_mmio.c | 3 ++-
>>>> drivers/virtio/virtio_pci_legacy.c | 4 +++-
>>>> drivers/virtio/virtio_pci_modern.c | 3 ++-
>>>> drivers/virtio/virtio_vdpa.c | 4 +++-
>>>> include/linux/virtio_config.h | 5 +++--
>>>> 10 files changed, 43 insertions(+), 19 deletions(-)
>>>>
>>>> diff --git a/arch/um/drivers/virtio_uml.c
>>>> b/arch/um/drivers/virtio_uml.c
>>>> index 91ddf74ca888..b6e66265ed32 100644
>>>> --- a/arch/um/drivers/virtio_uml.c
>>>> +++ b/arch/um/drivers/virtio_uml.c
>>>> @@ -827,11 +827,13 @@ static void vu_set_status(struct
>>>> virtio_device *vdev, u8 status)
>>>> vu_dev->status = status;
>>>> }
>>>> -static void vu_reset(struct virtio_device *vdev)
>>>> +static int vu_reset(struct virtio_device *vdev)
>>>> {
>>>> struct virtio_uml_device *vu_dev = to_virtio_uml_device(vdev);
>>>> vu_dev->status = 0;
>>>> +
>>>> + return 0;
>>>> }
>>>> static void vu_del_vq(struct virtqueue *vq)
>>>> diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c
>>>> b/drivers/platform/mellanox/mlxbf-tmfifo.c
>>>> index bbc4e71a16ff..c192b8ac5d9e 100644
>>>> --- a/drivers/platform/mellanox/mlxbf-tmfifo.c
>>>> +++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
>>>> @@ -980,11 +980,13 @@ static void
>>>> mlxbf_tmfifo_virtio_set_status(struct virtio_device *vdev,
>>>> }
>>>> /* Reset the device. Not much here for now. */
>>>> -static void mlxbf_tmfifo_virtio_reset(struct virtio_device *vdev)
>>>> +static int mlxbf_tmfifo_virtio_reset(struct virtio_device *vdev)
>>>> {
>>>> struct mlxbf_tmfifo_vdev *tm_vdev = mlxbf_vdev_to_tmfifo(vdev);
>>>> tm_vdev->status = 0;
>>>> +
>>>> + return 0;
>>>> }
>>>> /* Read the value of a configuration field. */
>>>> diff --git a/drivers/remoteproc/remoteproc_virtio.c
>>>> b/drivers/remoteproc/remoteproc_virtio.c
>>>> index 0cc617f76068..ca9573c62c3d 100644
>>>> --- a/drivers/remoteproc/remoteproc_virtio.c
>>>> +++ b/drivers/remoteproc/remoteproc_virtio.c
>>>> @@ -191,7 +191,7 @@ static void rproc_virtio_set_status(struct
>>>> virtio_device *vdev, u8 status)
>>>> dev_dbg(&vdev->dev, "status: %d\n", status);
>>>> }
>>>> -static void rproc_virtio_reset(struct virtio_device *vdev)
>>>> +static int rproc_virtio_reset(struct virtio_device *vdev)
>>>> {
>>>> struct rproc_vdev *rvdev = vdev_to_rvdev(vdev);
>>>> struct fw_rsc_vdev *rsc;
>>>> @@ -200,6 +200,8 @@ static void rproc_virtio_reset(struct
>>>> virtio_device *vdev)
>>>> rsc->status = 0;
>>>> dev_dbg(&vdev->dev, "reset !\n");
>>>> +
>>>> + return 0;
>>>> }
>>>> /* provide the vdev features as retrieved from the firmware */
>>>> diff --git a/drivers/s390/virtio/virtio_ccw.c
>>>> b/drivers/s390/virtio/virtio_ccw.c
>>>> index 54e686dca6de..52b32555e746 100644
>>>> --- a/drivers/s390/virtio/virtio_ccw.c
>>>> +++ b/drivers/s390/virtio/virtio_ccw.c
>>>> @@ -732,14 +732,15 @@ static int virtio_ccw_find_vqs(struct
>>>> virtio_device *vdev, unsigned nvqs,
>>>> return ret;
>>>> }
>>>> -static void virtio_ccw_reset(struct virtio_device *vdev)
>>>> +static int virtio_ccw_reset(struct virtio_device *vdev)
>>>> {
>>>> struct virtio_ccw_device *vcdev = to_vc_device(vdev);
>>>> struct ccw1 *ccw;
>>>> + int ret;
>>>> ccw = ccw_device_dma_zalloc(vcdev->cdev, sizeof(*ccw));
>>>> if (!ccw)
>>>> - return;
>>>> + return -ENOMEM;
>>>> /* Zero status bits. */
>>>> vcdev->dma_area->status = 0;
>>>> @@ -749,8 +750,10 @@ static void virtio_ccw_reset(struct
>>>> virtio_device *vdev)
>>>> ccw->flags = 0;
>>>> ccw->count = 0;
>>>> ccw->cda = 0;
>>>> - ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET);
>>>> + ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET);
>>>> ccw_device_dma_free(vcdev->cdev, ccw, sizeof(*ccw));
>>>> +
>>>> + return ret;
>>>> }
>>>> static u64 virtio_ccw_get_features(struct virtio_device *vdev)
>>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>>> index 4b15c00c0a0a..ddbfd5b5f3bd 100644
>>>> --- a/drivers/virtio/virtio.c
>>>> +++ b/drivers/virtio/virtio.c
>>>> @@ -338,7 +338,7 @@ int register_virtio_device(struct virtio_device
>>>> *dev)
>>>> /* Assign a unique device index and hence name. */
>>>> err = ida_simple_get(&virtio_index_ida, 0, 0, GFP_KERNEL);
>>>> if (err < 0)
>>>> - goto out;
>>>> + goto out_err;
>>>> dev->index = err;
>>>> dev_set_name(&dev->dev, "virtio%u", dev->index);
>>>> @@ -349,7 +349,9 @@ int register_virtio_device(struct virtio_device
>>>> *dev)
>>>> /* We always start by resetting the device, in case a previous
>>>> * driver messed it up. This also tests that code path a
>>>> little. */
>>>> - dev->config->reset(dev);
>>>> + err = dev->config->reset(dev);
>>>> + if (err)
>>>> + goto out_ida;
>>>> /* Acknowledge that we've seen the device. */
>>>> virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
>>>> @@ -362,10 +364,14 @@ int register_virtio_device(struct
>>>> virtio_device *dev)
>>>> */
>>>> err = device_add(&dev->dev);
>>>> if (err)
>>>> - ida_simple_remove(&virtio_index_ida, dev->index);
>>>> -out:
>>>> - if (err)
>>>> - virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED);
>>>> + goto out_ida;
>>>> +
>>>> + return 0;
>>>> +
>>>> +out_ida:
>>>> + ida_simple_remove(&virtio_index_ida, dev->index);
>>>> +out_err:
>>>> + virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED);
>>>> return err;
>>>> }
>>>> EXPORT_SYMBOL_GPL(register_virtio_device);
>>>> @@ -408,7 +414,9 @@ int virtio_device_restore(struct virtio_device
>>>> *dev)
>>>> /* We always start by resetting the device, in case a previous
>>>> * driver messed it up. */
>>>> - dev->config->reset(dev);
>>>> + ret = dev->config->reset(dev);
>>>> + if (ret)
>>>> + goto err;
>>>> /* Acknowledge that we've seen the device. */
>>>> virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
>>>> diff --git a/drivers/virtio/virtio_mmio.c
>>>> b/drivers/virtio/virtio_mmio.c
>>>> index 56128b9c46eb..12b8f048c48d 100644
>>>> --- a/drivers/virtio/virtio_mmio.c
>>>> +++ b/drivers/virtio/virtio_mmio.c
>>>> @@ -256,12 +256,13 @@ static void vm_set_status(struct
>>>> virtio_device *vdev, u8 status)
>>>> writel(status, vm_dev->base + VIRTIO_MMIO_STATUS);
>>>> }
>>>> -static void vm_reset(struct virtio_device *vdev)
>>>> +static int vm_reset(struct virtio_device *vdev)
>>>> {
>>>> struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>>>> /* 0 status means a reset. */
>>>> writel(0, vm_dev->base + VIRTIO_MMIO_STATUS);
>>>> + return 0;
>>>> }
>>>> diff --git a/drivers/virtio/virtio_pci_legacy.c
>>>> b/drivers/virtio/virtio_pci_legacy.c
>>>> index d62e9835aeec..0b5d95e3efa1 100644
>>>> --- a/drivers/virtio/virtio_pci_legacy.c
>>>> +++ b/drivers/virtio/virtio_pci_legacy.c
>>>> @@ -89,7 +89,7 @@ static void vp_set_status(struct virtio_device
>>>> *vdev, u8 status)
>>>> iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS);
>>>> }
>>>> -static void vp_reset(struct virtio_device *vdev)
>>>> +static int vp_reset(struct virtio_device *vdev)
>>>> {
>>>> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>>> /* 0 status means a reset. */
>>>> @@ -99,6 +99,8 @@ static void vp_reset(struct virtio_device *vdev)
>>>> ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS);
>>>> /* Flush pending VQ/configuration callbacks. */
>>>> vp_synchronize_vectors(vdev);
>>>> +
>>>> + return 0;
>>>> }
>>>> static u16 vp_config_vector(struct virtio_pci_device *vp_dev,
>>>> u16 vector)
>>>> diff --git a/drivers/virtio/virtio_pci_modern.c
>>>> b/drivers/virtio/virtio_pci_modern.c
>>>> index fbd4ebc00eb6..cc3412a96a17 100644
>>>> --- a/drivers/virtio/virtio_pci_modern.c
>>>> +++ b/drivers/virtio/virtio_pci_modern.c
>>>> @@ -158,7 +158,7 @@ static void vp_set_status(struct virtio_device
>>>> *vdev, u8 status)
>>>> vp_modern_set_status(&vp_dev->mdev, status);
>>>> }
>>>> -static void vp_reset(struct virtio_device *vdev)
>>>> +static int vp_reset(struct virtio_device *vdev)
>>>> {
>>>> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>>> struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
>>>> @@ -174,6 +174,7 @@ static void vp_reset(struct virtio_device *vdev)
>>>> msleep(1);
>>>> /* Flush pending VQ/configuration callbacks. */
>>>> vp_synchronize_vectors(vdev);
>>>> + return 0;
>>>> }
>>>> static u16 vp_config_vector(struct virtio_pci_device *vp_dev,
>>>> u16 vector)
>>>> diff --git a/drivers/virtio/virtio_vdpa.c
>>>> b/drivers/virtio/virtio_vdpa.c
>>>> index e28acf482e0c..5fd4e627a9b0 100644
>>>> --- a/drivers/virtio/virtio_vdpa.c
>>>> +++ b/drivers/virtio/virtio_vdpa.c
>>>> @@ -97,11 +97,13 @@ static void virtio_vdpa_set_status(struct
>>>> virtio_device *vdev, u8 status)
>>>> return ops->set_status(vdpa, status);
>>>> }
>>>> -static void virtio_vdpa_reset(struct virtio_device *vdev)
>>>> +static int virtio_vdpa_reset(struct virtio_device *vdev)
>>>> {
>>>> struct vdpa_device *vdpa = vd_get_vdpa(vdev);
>>>> vdpa_reset(vdpa);
>>>> +
>>>> + return 0;
>>>> }
>>>> static bool virtio_vdpa_notify(struct virtqueue *vq)
>>>> diff --git a/include/linux/virtio_config.h
>>>> b/include/linux/virtio_config.h
>>>> index 8519b3ae5d52..d2b0f1699a75 100644
>>>> --- a/include/linux/virtio_config.h
>>>> +++ b/include/linux/virtio_config.h
>>>> @@ -44,9 +44,10 @@ struct virtio_shm_region {
>>>> * status: the new status byte
>>>> * @reset: reset the device
>>>> * vdev: the virtio device
>>>> - * After this, status and feature negotiation must be done again
>>>> + * Upon success, status and feature negotiation must be done again
>>>> * Device must not be reset from its vq/config callbacks, or in
>>>> * parallel with being added/removed.
>>>> + * Returns 0 on success or error status.
>>>> * @find_vqs: find virtqueues and instantiate them.
>>>> * vdev: the virtio_device
>>>> * nvqs: the number of virtqueues to find
>>>> @@ -82,7 +83,7 @@ struct virtio_config_ops {
>>>> u32 (*generation)(struct virtio_device *vdev);
>>>> u8 (*get_status)(struct virtio_device *vdev);
>>>> void (*set_status)(struct virtio_device *vdev, u8 status);
>>>> - void (*reset)(struct virtio_device *vdev);
>>>> + int (*reset)(struct virtio_device *vdev);
>>>> int (*find_vqs)(struct virtio_device *, unsigned nvqs,
>>>> struct virtqueue *vqs[], vq_callback_t *callbacks[],
>>>> const char * const names[], const bool *ctx,
>>>
>>
>
next prev parent reply other threads:[~2021-04-08 14:24 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-08 8:11 [PATCH v2 1/3] virtio: update reset callback to return status Max Gurtovoy
2021-04-08 8:11 ` [PATCH v2 2/3] virito_pci: add timeout to reset device operation Max Gurtovoy
2021-04-08 9:01 ` Jason Wang
2021-04-08 9:44 ` Max Gurtovoy
2021-04-08 12:45 ` Jason Wang
2021-04-08 12:57 ` Max Gurtovoy
2021-04-08 13:18 ` Jason Wang
2021-04-08 8:11 ` [PATCH v2 3/3] virtio_pci: add module param for reset_timeout Max Gurtovoy
2021-04-08 8:58 ` [PATCH v2 1/3] virtio: update reset callback to return status Jason Wang
2021-04-08 9:56 ` Max Gurtovoy
2021-04-08 13:14 ` Jason Wang
2021-04-08 14:24 ` Max Gurtovoy [this message]
2021-04-09 5:22 ` Jason Wang
2021-04-11 9:50 ` Max Gurtovoy
2021-04-12 9:07 ` Michael S. Tsirkin
2021-04-12 12:17 ` Max Gurtovoy
2021-04-12 21:29 ` Michael S. Tsirkin
2021-04-08 15:56 ` Michael S. Tsirkin
2021-04-12 11:55 ` Max Gurtovoy
2021-04-12 12:04 ` Michael S. Tsirkin
2021-04-12 13:03 ` Max Gurtovoy
2021-04-12 21:23 ` Michael S. Tsirkin
2021-04-12 22:53 ` Max Gurtovoy
2021-04-13 2:42 ` Jason Wang
2021-04-13 4:52 ` Michael S. Tsirkin
2021-04-13 4:42 ` Michael S. Tsirkin
2021-04-13 10:56 ` Max Gurtovoy
2021-04-08 16:18 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=103ae6fe-1ffc-90a3-09cd-bcbbcbb8eee7@nvidia.com \
--to=mgurtovoy@nvidia.com \
--cc=cohuck@redhat.com \
--cc=jasowang@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mst@redhat.com \
--cc=nitzanc@nvidia.com \
--cc=oren@nvidia.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox