* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
       [not found] <1429257573-7359-1-git-send-email-famz@redhat.com>
From: Michael S. Tsirkin @ 2015-04-20 17:36 UTC
To: Fam Zheng
Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> data with vring. That has drawbacks such as losing unsaved data (e.g. when
> guest user is writing a very long email), or possible denial of service in
> a nested vm use case where virtio device is passed through.
>
> virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> improve this by communicating the error state between virtio devices and
> drivers. The device notifies guest upon setting the bit, then the guest driver
> should detect this bit and report to userspace, or recover the device by
> resetting it.

Unfortunately, virtio 1 spec does not have a conformance statement
that requires driver to recover. We merely have a non-normative looking
text:
    Note: For example, the driver can’t assume requests in flight
    will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
    they have not been completed. A good implementation will try to recover
    by issuing a reset.

Implementing this reset for all devices in a race-free manner might also
be far from trivial. I think we'd need a feature bit for this.
OTOH as long as we make this a new feature, would an ability to
reset a single VQ be a better match for what you are trying to
achieve?

> This series makes necessary changes in virtio core code, based on which
> virtio-blk is converted. Other devices now keep the existing behavior by
> passing in "error_abort".
> They will be converted in following series. The Linux
> driver part will also be worked on.
>
> One concern with this behavior change is that it's now harder to notice the
> actual driver bug that caused the error, as the guest continues to run. To
> address that, we could probably add a new error action option to virtio
> devices, similar to the "read/write werror" in block layer, so the vm could be
> paused and the management will get an event in QMP like pvpanic. This work can
> be done on top.

At the architectural level, that's only one concern. Others would be
- workloads such as openstack handle guest crash better than
  a guest that's e.g. slow because of a memory leak
- it's easier for guests to probe host for security issues
  if guest isn't killed
- guest can flood host log with guest-triggered errors

At the implementation level, there's one big issue you seem to have
missed: DMA to invalid memory addresses causes a crash in memory core.
I'm not sure whether it makes sense to recover from virtio core bugs
when we can't recover from device bugs.
>
>
> Fam Zheng (18):
>   virtio: Return error from virtqueue_map_sg
>   virtio: Return error from virtqueue_num_heads
>   virtio: Return error from virtqueue_get_head
>   virtio: Return error from virtqueue_next_desc
>   virtio: Return error from virtqueue_get_avail_bytes
>   virtio: Return error from virtqueue_pop
>   virtio: Return error from virtqueue_avail_bytes
>   virtio: Return error from virtio_add_queue
>   virtio: Return error from virtio_del_queue
>   virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
>   virtio: Add "needs_reset" flag to virtio device
>   virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
>   virtio-blk: Graceful error handling of virtqueue_pop
>   qtest: Add "QTEST_FILTER" to filter test cases
>   qtest: virtio-blk: Extract "setup" for future reuse
>   libqos: Add qvirtio_needs_reset
>   qtest: Add test case for "needs reset" of virtio-blk
>   qtest: virtio-blk: Suppress virtio error messages in "make check"
>
>  hw/9pfs/virtio-9p-device.c                     |   2 +-
>  hw/9pfs/virtio-9p.c                            |   2 +-
>  hw/block/dataplane/virtio-blk.c                |   9 +-
>  hw/block/virtio-blk.c                          |  62 +++++--
>  hw/char/virtio-serial-bus.c                    |  30 ++--
>  hw/net/virtio-net.c                            |  36 +++--
>  hw/scsi/virtio-scsi.c                          |   8 +-
>  hw/virtio/virtio-balloon.c                     |  13 +-
>  hw/virtio/virtio-rng.c                         |   6 +-
>  hw/virtio/virtio.c                             | 214 ++++++++++++++++++-------
>  include/hw/virtio/virtio-blk.h                 |   3 +-
>  include/hw/virtio/virtio.h                     |  17 +-
>  include/standard-headers/linux/virtio_config.h |   2 +
>  tests/Makefile                                 |   6 +-
>  tests/libqos/virtio.c                          |   5 +
>  tests/libqos/virtio.h                          |   2 +
>  tests/virtio-blk-test.c                        | 196 ++++++++++++++++++++--
>  17 files changed, 482 insertions(+), 131 deletions(-)
>
> --
> 1.9.3
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
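For readers following the proposal, here is a minimal sketch of the behavior the cover letter describes: on invalid guest input the device sets the NEEDS_RESET status bit and returns an error instead of killing the process. The value 0x40 is the DEVICE_NEEDS_RESET bit from the virtio 1.0 spec. Everything else (struct toy_vdev and the toy_* functions) is hypothetical and only stands in for the QEMU code touched by the series; it is not taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Status bit value from the virtio 1.0 spec (DEVICE_NEEDS_RESET). */
#define VIRTIO_CONFIG_S_NEEDS_RESET 0x40

/* Hypothetical, minimal stand-in for a virtio device; not QEMU's VirtIODevice. */
struct toy_vdev {
    uint8_t status;       /* device status byte visible to the driver */
    bool needs_reset;     /* sticky flag checked before accepting work */
    unsigned queue_size;  /* number of descriptors in the ring */
};

/* Mark the device broken instead of calling abort(). */
static void toy_vdev_set_needs_reset(struct toy_vdev *vdev)
{
    vdev->needs_reset = true;
    vdev->status |= VIRTIO_CONFIG_S_NEEDS_RESET;
    /* A real device would also notify the guest of the status change here. */
}

/* Validate a guest-supplied head index. Returns 0 or a negative errno. */
static int toy_virtqueue_get_head(struct toy_vdev *vdev, unsigned head)
{
    assert(vdev->queue_size > 0);   /* host-side invariant, not guest input */

    if (vdev->needs_reset) {
        return -EINVAL;             /* refuse work until the driver resets */
    }
    if (head >= vdev->queue_size) { /* guest passed an out-of-range index */
        toy_vdev_set_needs_reset(vdev);
        return -EINVAL;
    }
    return 0;
}

/* A driver-initiated reset clears the error state. */
static void toy_vdev_reset(struct toy_vdev *vdev)
{
    vdev->status = 0;
    vdev->needs_reset = false;
}
```

The sketch keeps assert() for host-side invariants only; anything the guest controls goes through the error return, which is the distinction the series is trying to draw.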
* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Paolo Bonzini @ 2015-04-20 19:10 UTC
To: Michael S. Tsirkin, Fam Zheng
Cc: Aneesh Kumar K.V, Amit Shah, qemu-devel, Stefan Hajnoczi, virtualization

On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> At the implementation level, there's one big issue you seem to have
> missed: DMA to invalid memory addresses causes a crash in memory core.
> I'm not sure whether it makes sense to recover from virtio core bugs
> when we can't recover from device bugs.

What do you mean exactly? DMA to invalid memory addresses causes
address_space_map to return a "short read".

Paolo
* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Michael S. Tsirkin @ 2015-04-20 20:34 UTC
To: Paolo Bonzini
Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah

On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
>
>
> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > At the implementation level, there's one big issue you seem to have
> > missed: DMA to invalid memory addresses causes a crash in memory core.
> > I'm not sure whether it makes sense to recover from virtio core bugs
> > when we can't recover from device bugs.
>
> What do you mean exactly? DMA to invalid memory addresses causes
> address_space_map to return a "short read".
>
> Paolo

I mean, first of all, a bunch of virtio_XXX_phys calls.
These eventually call qemu_get_ram_ptr, which internally calls
qemu_get_ram_block and ramblock_ptr.
Both abort on errors.

--
MST
* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Fam Zheng @ 2015-04-21 2:39 UTC
To: Michael S. Tsirkin
Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Mon, 04/20 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> >
> >
> > On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > > At the implementation level, there's one big issue you seem to have
> > > missed: DMA to invalid memory addresses causes a crash in memory core.
> > > I'm not sure whether it makes sense to recover from virtio core bugs
> > > when we can't recover from device bugs.
> >
> > What do you mean exactly? DMA to invalid memory addresses causes
> > address_space_map to return a "short read".
> >
> > Paolo
>
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.
>

They are VQ manipulating operations, not DMA. Anyway, can we return errors from
memory core?

Fam
* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Paolo Bonzini @ 2015-04-21 6:52 UTC
To: Michael S. Tsirkin
Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah

On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
>>> At the implementation level, there's one big issue you seem to have
>>> missed: DMA to invalid memory addresses causes a crash in memory core.
>>> I'm not sure whether it makes sense to recover from virtio core bugs
>>> when we can't recover from device bugs.
>>
>> What do you mean exactly? DMA to invalid memory addresses causes
>> address_space_map to return a "short read".
>>
>> Paolo
>
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.

address_space_translate and memory_access_size should ensure they don't.

Paolo
* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Michael S. Tsirkin @ 2015-04-21 6:58 UTC
To: Paolo Bonzini
Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah

On Tue, Apr 21, 2015 at 08:52:36AM +0200, Paolo Bonzini wrote:
>
>
> On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> > On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> >>> At the implementation level, there's one big issue you seem to have
> >>> missed: DMA to invalid memory addresses causes a crash in memory core.
> >>> I'm not sure whether it makes sense to recover from virtio core bugs
> >>> when we can't recover from device bugs.
> >>
> >> What do you mean exactly? DMA to invalid memory addresses causes
> >> address_space_map to return a "short read".
> >>
> >> Paolo
> >
> > I mean, first of all, a bunch of virtio_XXX_phys calls.
> > These eventually call qemu_get_ram_ptr, which internally calls
> > qemu_get_ram_block and ramblock_ptr.
> > Both abort on errors.
>
> address_space_translate and memory_access_size should ensure they don't.
>
> Paolo

More comments in this code won't hurt. It *looks* as if we assume
we get a valid mr, and try to access it.

In any case, no error is reported.

--
MST
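The "short read" behavior Paolo refers to can be illustrated with a toy model: the mapping helper shrinks the requested length to what is actually backed by RAM, and the caller treats any shortfall as an error it can report. This is only a sketch under that assumption. struct toy_ram, toy_map and toy_map_sg_element are hypothetical names, and the code does not reproduce QEMU's address_space_map or its signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest RAM model: one contiguous region starting at base. */
struct toy_ram {
    uint8_t *host;   /* host buffer backing the region */
    uint64_t base;   /* guest-physical start address */
    uint64_t size;   /* region length in bytes */
};

/*
 * Map [addr, addr + *plen). On return *plen holds the length actually
 * mapped, which may be shorter than requested. Returns NULL if nothing
 * could be mapped.
 */
static void *toy_map(const struct toy_ram *ram, uint64_t addr, uint64_t *plen)
{
    assert(ram->host != NULL && ram->size > 0);  /* host-side invariant */

    if (addr < ram->base || addr - ram->base >= ram->size) {
        *plen = 0;
        return NULL;
    }
    uint64_t offset = addr - ram->base;
    uint64_t avail = ram->size - offset;
    if (*plen > avail) {
        *plen = avail;                           /* the "short" case */
    }
    return ram->host + offset;
}

/* Map one scatter-gather element. Returns 0 on success, -1 on a short map. */
static int toy_map_sg_element(const struct toy_ram *ram, uint64_t addr,
                              uint64_t len, void **out)
{
    uint64_t mapped = len;
    void *p = toy_map(ram, addr, &mapped);

    if (p == NULL || mapped != len) {
        return -1;   /* guest-triggered: report it, do not abort() */
    }
    *out = p;
    return 0;
}
```

Whether the real helpers below this layer can abort, as Michael argues, is exactly what the exchange above leaves open; the sketch only shows the caller-side check.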
* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Fam Zheng @ 2015-04-21 2:37 UTC
To: Michael S. Tsirkin
Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > guest user is writing a very long email), or possible denial of service in
> > a nested vm use case where virtio device is passed through.
> >
> > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > improve this by communicating the error state between virtio devices and
> > drivers. The device notifies guest upon setting the bit, then the guest driver
> > should detect this bit and report to userspace, or recover the device by
> > resetting it.
>
> Unfortunately, virtio 1 spec does not have a conformance statement
> that requires driver to recover. We merely have a non-normative looking
> text:
>     Note: For example, the driver can’t assume requests in flight
>     will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
>     they have not been completed. A good implementation will try to recover
>     by issuing a reset.
>
> Implementing this reset for all devices in a race-free manner might also
> be far from trivial. I think we'd need a feature bit for this.
> OTOH as long as we make this a new feature, would an ability to
> reset a single VQ be a better match for what you are trying to
> achieve?

I think that is too complicated as a recovery measure, a device level resetting
will be better to get to a deterministic state, at least.

>
> > This series makes necessary changes in virtio core code, based on which
> > virtio-blk is converted. Other devices now keep the existing behavior by
> > passing in "error_abort". They will be converted in following series. The Linux
> > driver part will also be worked on.
> >
> > One concern with this behavior change is that it's now harder to notice the
> > actual driver bug that caused the error, as the guest continues to run. To
> > address that, we could probably add a new error action option to virtio
> > devices, similar to the "read/write werror" in block layer, so the vm could be
> > paused and the management will get an event in QMP like pvpanic. This work can
> > be done on top.
>
> At the architectural level, that's only one concern. Others would be
> - workloads such as openstack handle guest crash better than
>   a guest that's e.g. slow because of a memory leak

What memory leak are you referring to?

> - it's easier for guests to probe host for security issues
>   if guest isn't killed
> - guest can flood host log with guest-triggered errors

We can still abort() if guest is triggering error too quickly.

Fam
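Fam's suggestion to still abort() when the guest triggers errors too quickly could be sketched as a fixed-window counter. This is an assumption about what such a limit might look like, not code from the series: struct toy_error_limiter, its fields and the threshold policy are all hypothetical, and the caller decides what escalation means (abort(), pausing the VM, or something else).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error budget: at most max_errors per window_ns. */
struct toy_error_limiter {
    uint64_t window_ns;        /* length of one accounting window */
    unsigned max_errors;       /* errors tolerated inside one window */
    uint64_t window_start_ns;  /* start of the current window */
    unsigned count;            /* errors seen in the current window */
};

/*
 * Record one guest-triggered error at monotonic time now_ns.
 * Returns true when the guest has exceeded its budget and the caller
 * should escalate instead of merely flagging the device.
 */
static bool toy_error_limiter_hit(struct toy_error_limiter *l, uint64_t now_ns)
{
    assert(l->max_errors > 0);  /* configuration invariant, not guest input */

    if (now_ns - l->window_start_ns >= l->window_ns) {
        /* The previous window expired: start a fresh one. */
        l->window_start_ns = now_ns;
        l->count = 0;
    }
    l->count++;
    return l->count > l->max_errors;
}
```

Passing the timestamp in, rather than reading a clock inside the function, keeps the sketch deterministic; a budget like this would also bound how much a guest can write to the host log, which is one of the concerns raised above.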
* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Michael S. Tsirkin @ 2015-04-21 5:22 UTC
To: Fam Zheng
Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > guest user is writing a very long email), or possible denial of service in
> > > a nested vm use case where virtio device is passed through.
> > >
> > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > improve this by communicating the error state between virtio devices and
> > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > should detect this bit and report to userspace, or recover the device by
> > > resetting it.
> >
> > Unfortunately, virtio 1 spec does not have a conformance statement
> > that requires driver to recover. We merely have a non-normative looking
> > text:
> >     Note: For example, the driver can’t assume requests in flight
> >     will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> >     they have not been completed. A good implementation will try to recover
> >     by issuing a reset.
> >
> > Implementing this reset for all devices in a race-free manner might also
> > be far from trivial. I think we'd need a feature bit for this.
> > OTOH as long as we make this a new feature, would an ability to
> > reset a single VQ be a better match for what you are trying to
> > achieve?
>
> I think that is too complicated as a recovery measure, a device level resetting
> will be better to get to a deterministic state, at least.

Question would be, how hard is it to stop host from using all queues,
retrieve all host OS state and re-program it into the device.
If we need to shadow all OS state within the driver, then that's a lot
of not well tested code with a possibility of introducing more bugs.

> >
> > > This series makes necessary changes in virtio core code, based on which
> > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > passing in "error_abort". They will be converted in following series. The Linux
> > > driver part will also be worked on.
> > >
> > > One concern with this behavior change is that it's now harder to notice the
> > > actual driver bug that caused the error, as the guest continues to run. To
> > > address that, we could probably add a new error action option to virtio
> > > devices, similar to the "read/write werror" in block layer, so the vm could be
> > > paused and the management will get an event in QMP like pvpanic. This work can
> > > be done on top.
> >
> > At the architectural level, that's only one concern. Others would be
> > - workloads such as openstack handle guest crash better than
> >   a guest that's e.g. slow because of a memory leak
>
> What memory leak are you referring to?

That was just an example. If host detects a malformed ring, it will
crash. But often it doesn't, result is buffers not being used, so guest
can't free them up.

> > - it's easier for guests to probe host for security issues
> >   if guest isn't killed
> > - guest can flood host log with guest-triggered errors
>
> We can still abort() if guest is triggering error too quickly.
>
> Fam

Absolutely, and if it looked like I'm against error detection and
recovery, this was not my intent.

I am merely saying we can't apply this patchset as is, deferring
addressing the issues to patches on top.

But I have an idea: refactor the code to use error_abort. This way we
can apply the patchset without making functional changes, and you can
make progress to complete this, on top.

--
MST
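The refactoring Michael proposes relies on the convention that a function reports failures through an error out-parameter, and that a caller who cannot handle failure passes a special sentinel so the error still terminates the process. A self-contained imitation of that convention is sketched below. ToyError, toy_error_abort, toy_error_set and toy_virtqueue_pop are hypothetical stand-ins written for this note; they mimic the idea behind QEMU's Error API and error_abort but are not its actual definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error object; QEMU's real Error type is richer. */
typedef struct ToyError {
    const char *msg;
} ToyError;

/* Sentinel: only its address matters, mirroring how error_abort is used. */
static ToyError *toy_error_abort;

/* Report an error through errp, or abort if the caller passed the sentinel. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    static ToyError err;  /* single static slot: enough for a sketch */

    assert(msg != NULL);
    if (errp == &toy_error_abort) {
        fprintf(stderr, "Unexpected error: %s\n", msg);
        abort();          /* old behavior, kept for unconverted callers */
    }
    if (errp != NULL) {
        err.msg = msg;
        *errp = &err;
    }
}

/* Toy pop: fails if the guest claims more heads than the queue holds. */
static int toy_virtqueue_pop(unsigned num_heads, unsigned queue_size,
                             ToyError **errp)
{
    if (num_heads > queue_size) {
        toy_error_set(errp, "guest-supplied index out of range");
        return -1;
    }
    return (int)num_heads;
}
```

An unconverted device would call toy_virtqueue_pop(n, size, &toy_error_abort) and keep today's behavior, while a converted one passes its own ToyError pointer and handles the failure. That is why the plumbing can be merged first without any functional change.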
* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Fam Zheng @ 2015-04-21 5:50 UTC
To: Michael S. Tsirkin
Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > > guest user is writing a very long email), or possible denial of service in
> > > > a nested vm use case where virtio device is passed through.
> > > >
> > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > improve this by communicating the error state between virtio devices and
> > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > should detect this bit and report to userspace, or recover the device by
> > > > resetting it.
> > >
> > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > that requires driver to recover. We merely have a non-normative looking
> > > text:
> > >     Note: For example, the driver can’t assume requests in flight
> > >     will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > >     they have not been completed. A good implementation will try to recover
> > >     by issuing a reset.
> > >
> > > Implementing this reset for all devices in a race-free manner might also
> > > be far from trivial. I think we'd need a feature bit for this.
> > > OTOH as long as we make this a new feature, would an ability to
> > > reset a single VQ be a better match for what you are trying to
> > > achieve?
> >
> > I think that is too complicated as a recovery measure, a device level resetting
> > will be better to get to a deterministic state, at least.
>
> Question would be, how hard is it to stop host from using all queues,
> retrieve all host OS state and re-program it into the device.
> If we need to shadow all OS state within the driver, then that's a lot
> of not well tested code with a possibility of introducing more bugs.

I don't understand the question. In this series the virtio-blk device will not
pop any more requests, and as long as the reset is properly handled, both guest
and host should go back to a good state.

>
> > >
> > > > This series makes necessary changes in virtio core code, based on which
> > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > driver part will also be worked on.
> > > >
> > > > One concern with this behavior change is that it's now harder to notice the
> > > > actual driver bug that caused the error, as the guest continues to run. To
> > > > address that, we could probably add a new error action option to virtio
> > > > devices, similar to the "read/write werror" in block layer, so the vm could be
> > > > paused and the management will get an event in QMP like pvpanic. This work can
> > > > be done on top.
> > >
> > > At the architectural level, that's only one concern. Others would be
> > > - workloads such as openstack handle guest crash better than
> > >   a guest that's e.g. slow because of a memory leak
> >
> > What memory leak are you referring to?
>
> That was just an example. If host detects a malformed ring, it will
> crash. But often it doesn't, result is buffers not being used, so guest
> can't free them up.
>
> > > - it's easier for guests to probe host for security issues
> > >   if guest isn't killed
> > > - guest can flood host log with guest-triggered errors
> >
> > We can still abort() if guest is triggering error too quickly.
>
>
> Absolutely, and if it looked like I'm against error detection and
> recovery, this was not my intent.
>
> I am merely saying we can't apply this patchset as is, deferring
> addressing the issues to patches on top.
>
> But I have an idea: refactor the code to use error_abort.

That is patch 1-9 of this series. Or do you mean also refactor and pass
error_abort to the memory core?

Fam

>This way we
> can apply the patchset without making functional changes, and you can
> make progress to complete this, on top.
* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
From: Michael S. Tsirkin @ 2015-04-21 6:09 UTC
To: Fam Zheng
Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi, Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 01:50:33PM +0800, Fam Zheng wrote:
> On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> > On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > data with vring. That has drawbacks such as losing unsaved data (e.g. when
> > > > > guest user is writing a very long email), or possible denial of service in
> > > > > a nested vm use case where virtio device is passed through.
> > > > >
> > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > improve this by communicating the error state between virtio devices and
> > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > should detect this bit and report to userspace, or recover the device by
> > > > > resetting it.
> > > >
> > > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > > that requires driver to recover. We merely have a non-normative looking
> > > > text:
> > > >     Note: For example, the driver can’t assume requests in flight
> > > >     will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > >     they have not been completed. A good implementation will try to recover
> > > >     by issuing a reset.
> > > >
> > > > Implementing this reset for all devices in a race-free manner might also
> > > > be far from trivial. I think we'd need a feature bit for this.
> > > > OTOH as long as we make this a new feature, would an ability to
> > > > reset a single VQ be a better match for what you are trying to
> > > > achieve?
> > >
> > > I think that is too complicated as a recovery measure, a device level resetting
> > > will be better to get to a deterministic state, at least.
> >
> > Question would be, how hard is it to stop host from using all queues,
> > retrieve all host OS state and re-program it into the device.
> > If we need to shadow all OS state within the driver, then that's a lot
> > of not well tested code with a possibility of introducing more bugs.
>
> I don't understand the question. In this series the virtio-blk device will not
> pop any more requests, and as long as the reset is properly handled, both guest
> and host should go back to a good state.
>
> >
> > > >
> > > > > This series makes necessary changes in virtio core code, based on which
> > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > driver part will also be worked on.
> > > > >
> > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > actual driver bug that caused the error, as the guest continues to run. To
> > > > > address that, we could probably add a new error action option to virtio
> > > > > devices, similar to the "read/write werror" in block layer, so the vm could be
> > > > > paused and the management will get an event in QMP like pvpanic. This work can
> > > > > be done on top.
> > > >
> > > > At the architectural level, that's only one concern. Others would be
> > > > - workloads such as openstack handle guest crash better than
> > > >   a guest that's e.g. slow because of a memory leak
> > >
> > > What memory leak are you referring to?
> >
> > That was just an example. If host detects a malformed ring, it will
> > crash.
> > But often it doesn't, result is buffers not being used, so guest
> > can't free them up.
> >
> > > > - it's easier for guests to probe host for security issues
> > > >   if guest isn't killed
> > > > - guest can flood host log with guest-triggered errors
> > >
> > > We can still abort() if guest is triggering error too quickly.
> >
> >
> > Absolutely, and if it looked like I'm against error detection and
> > recovery, this was not my intent.
> >
> > I am merely saying we can't apply this patchset as is, deferring
> > addressing the issues to patches on top.
> >
> > But I have an idea: refactor the code to use error_abort.
>
> That is patch 1-9 of this series. Or do you mean also refactor and pass
> error_abort to the memory core?
>
> Fam

So if you like just patches 1-9 applied, this sounds reasonable.
I'll provide review comments on the individual patches.

> >This way we
> > can apply the patchset without making functional changes, and you can
> > make progress to complete this, on top.