Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"

virtualization.lists.linux-foundation.org archive mirror

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
       [not found] <1429257573-7359-1-git-send-email-famz@redhat.com>
@ 2015-04-20 17:36 ` Michael S. Tsirkin
  2015-04-20 19:10   ` Paolo Bonzini
  2015-04-21  2:37   ` [Qemu-devel] " Fam Zheng
  0 siblings, 2 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2015-04-20 17:36 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> data with vring.
> That has drawbacks such as losing unsaved data (e.g. when
> guest user is writing a very long email), or possible denial of service in
> a nested vm use case where virtio device is passed through.
> 
> virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> improve this by communicating the error state between virtio devices and
> drivers. The device notifies guest upon setting the bit, then the guest driver
> should detect this bit and report to userspace, or recover the device by
> resetting it.

Unfortunately, virtio 1 spec does not have a conformance statement
that requires driver to recover. We merely have a non-normative looking
text:
	Note: For example, the driver can’t assume requests in flight
	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
	they have not been completed. A good implementation will try to recover
	by issuing a reset.

Implementing this reset for all devices in a race-free manner might also
be far from trivial.  I think we'd need a feature bit for this.
OTOH as long as we make this a new feature, would an ability to
reset a single VQ be a better match for what you are trying to
achieve?
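
For reference, the device-side half of the flow under discussion could be
sketched roughly as below. This is a toy model, not QEMU code: struct
toy_vdev and the toy_* functions are hypothetical names. Only the status
bit value (0x40) and the -EINVAL return from the pop path (patch 12 in the
series) come from the spec and the cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status bit value as defined for DEVICE_NEEDS_RESET in virtio 1.0. */
#define VIRTIO_CONFIG_S_NEEDS_RESET 0x40

/* Hypothetical, trimmed-down device model; not QEMU's VirtIODevice. */
struct toy_vdev {
    uint8_t status;     /* device status byte visible to the driver */
    bool needs_reset;   /* device-side latch, set on a ring error */
    unsigned pending;   /* requests still available in the toy queue */
};

/* Latch the error and expose it to the driver through the status byte. */
void toy_vdev_mark_broken(struct toy_vdev *vdev)
{
    assert(vdev != NULL);
    vdev->needs_reset = true;
    vdev->status |= VIRTIO_CONFIG_S_NEEDS_RESET;
}

/* Returns 1 if a request was popped, 0 if the queue is empty,
 * and -EINVAL once the device is waiting for a reset. */
int toy_virtqueue_pop(struct toy_vdev *vdev)
{
    assert(vdev != NULL);
    if (vdev->needs_reset) {
        return -EINVAL;
    }
    if (vdev->pending == 0) {
        return 0;
    }
    vdev->pending--;
    return 1;
}

/* Driver-initiated reset: clears the status byte and the latch,
 * and drops whatever was still queued. */
void toy_vdev_reset(struct toy_vdev *vdev)
{
    assert(vdev != NULL);
    vdev->status = 0;
    vdev->needs_reset = false;
    vdev->pending = 0;
}
```

The sketch deliberately leaves out the hard part raised above: what the
driver has to do to quiesce and re-program the device around the reset.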

> This series makes necessary changes in virtio core code, based on which
> virtio-blk is converted. Other devices now keep the existing behavior by
> passing in "error_abort". They will be converted in following series. The Linux
> driver part will also be worked on.
> 
> One concern with this behavior change is that it's now harder to notice the
> actual driver bug that caused the error, as the guest continues to run.  To
> address that, we could probably add a new error action option to virtio
> devices,  similar to the "read/write werror" in block layer, so the vm could be
> paused and the management will get an event in QMP like pvpanic.  This work can
> be done on top.

At the architectural level, that's only one concern. Others would be
- workloads such as openstack handle guest crash better than
  a guest that's e.g. slow because of a memory leak
- it's easier for guests to probe host for security issues
  if guest isn't killed
- guest can flood host log with guest-triggered errors


At the implementation level, there's one big issue you seem to have
missed: DMA to invalid memory addresses causes a crash in memory core.
I'm not sure whether it makes sense to recover from virtio core bugs
when we can't recover from device bugs.


> 
> 
> Fam Zheng (18):
>   virtio: Return error from virtqueue_map_sg
>   virtio: Return error from virtqueue_num_heads
>   virtio: Return error from virtqueue_get_head
>   virtio: Return error from virtqueue_next_desc
>   virtio: Return error from virtqueue_get_avail_bytes
>   virtio: Return error from virtqueue_pop
>   virtio: Return error from virtqueue_avail_bytes
>   virtio: Return error from virtio_add_queue
>   virtio: Return error from virtio_del_queue
>   virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
>   virtio: Add "needs_reset" flag to virtio device
>   virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
>   virtio-blk: Graceful error handling of virtqueue_pop
>   qtest: Add "QTEST_FILTER" to filter test cases
>   qtest: virtio-blk: Extract "setup" for future reuse
>   libqos: Add qvirtio_needs_reset
>   qtest: Add test case for "needs reset" of virtio-blk
>   qtest: virtio-blk: Suppress virtio error messages in "make check"
> 
>  hw/9pfs/virtio-9p-device.c                     |   2 +-
>  hw/9pfs/virtio-9p.c                            |   2 +-
>  hw/block/dataplane/virtio-blk.c                |   9 +-
>  hw/block/virtio-blk.c                          |  62 +++++--
>  hw/char/virtio-serial-bus.c                    |  30 ++--
>  hw/net/virtio-net.c                            |  36 +++--
>  hw/scsi/virtio-scsi.c                          |   8 +-
>  hw/virtio/virtio-balloon.c                     |  13 +-
>  hw/virtio/virtio-rng.c                         |   6 +-
>  hw/virtio/virtio.c                             | 214 ++++++++++++++++++-------
>  include/hw/virtio/virtio-blk.h                 |   3 +-
>  include/hw/virtio/virtio.h                     |  17 +-
>  include/standard-headers/linux/virtio_config.h |   2 +
>  tests/Makefile                                 |   6 +-
>  tests/libqos/virtio.c                          |   5 +
>  tests/libqos/virtio.h                          |   2 +
>  tests/virtio-blk-test.c                        | 196 ++++++++++++++++++++--
>  17 files changed, 482 insertions(+), 131 deletions(-)
> 
> -- 
> 1.9.3
> 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 17:36 ` [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Michael S. Tsirkin
@ 2015-04-20 19:10   ` Paolo Bonzini
  2015-04-20 20:34     ` Michael S. Tsirkin
  2015-04-21  2:37   ` [Qemu-devel] " Fam Zheng
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-04-20 19:10 UTC (permalink / raw)
  To: Michael S. Tsirkin, Fam Zheng
  Cc: Aneesh Kumar K.V, Amit Shah, qemu-devel, Stefan Hajnoczi,
	virtualization



On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> At the implementation level, there's one big issue you seem to have
> missed: DMA to invalid memory addresses causes a crash in memory core.
> I'm not sure whether it makes sense to recover from virtio core bugs
> when we can't recover from device bugs.

What do you mean exactly?  DMA to invalid memory addresses causes
address_space_map to return a "short read".
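
As an illustration of what a "short read" means here, a minimal sketch with
a toy mapping function. toy_map and toy_check_buffer are hypothetical and
only mimic the shape of a map call that shrinks the length it was asked
for; this is not the real address_space_map signature or implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RAM_SIZE 4096u
static uint8_t toy_ram[TOY_RAM_SIZE];

/* Map the guest range [addr, addr + *plen). On return, *plen holds the
 * length actually mapped, which may be shorter than requested. Returns
 * NULL if nothing could be mapped at all. */
void *toy_map(uint64_t addr, uint64_t *plen)
{
    assert(plen != NULL);
    if (addr >= TOY_RAM_SIZE) {
        *plen = 0;
        return NULL;
    }
    if (*plen > TOY_RAM_SIZE - addr) {
        *plen = TOY_RAM_SIZE - addr;   /* clamp: this is the short read */
    }
    return &toy_ram[addr];
}

/* A caller can treat a short mapping as a guest error instead of
 * crashing. Returns 0 if the whole buffer is valid, -1 otherwise. */
int toy_check_buffer(uint64_t addr, uint64_t len)
{
    uint64_t mapped = len;
    void *p = toy_map(addr, &mapped);
    return (p != NULL && mapped == len) ? 0 : -1;
}
```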

Paolo

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 19:10   ` Paolo Bonzini
@ 2015-04-20 20:34     ` Michael S. Tsirkin
  2015-04-21  2:39       ` Fam Zheng
  2015-04-21  6:52       ` Paolo Bonzini
  0 siblings, 2 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2015-04-20 20:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah

On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > At the implementation level, there's one big issue you seem to have
> > missed: DMA to invalid memory addresses causes a crash in memory core.
> > I'm not sure whether it makes sense to recover from virtio core bugs
> > when we can't recover from device bugs.
> 
> What do you mean exactly?  DMA to invalid memory addresses causes
> address_space_map to return a "short read".
> 
> Paolo

I mean, first of all, a bunch of virtio_XXX_phys calls.
These eventually call qemu_get_ram_ptr, which internally calls
qemu_get_ram_block and ramblock_ptr.
Both abort on errors.

-- 
MST

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 20:34     ` Michael S. Tsirkin
@ 2015-04-21  2:39       ` Fam Zheng
  2015-04-21  6:52       ` Paolo Bonzini
  1 sibling, 0 replies; 10+ messages in thread
From: Fam Zheng @ 2015-04-21  2:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Mon, 04/20 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> > 
> > 
> > On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> > > At the implementation level, there's one big issue you seem to have
> > > missed: DMA to invalid memory addresses causes a crash in memory core.
> > > I'm not sure whether it makes sense to recover from virtio core bugs
> > > when we can't recover from device bugs.
> > 
> > What do you mean exactly?  DMA to invalid memory addresses causes
> > address_space_map to return a "short read".
> > 
> > Paolo
> 
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.
> 

They are VQ manipulating operations, not DMA. Anyway, can we return errors from
memory core?

Fam

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 20:34     ` Michael S. Tsirkin
  2015-04-21  2:39       ` Fam Zheng
@ 2015-04-21  6:52       ` Paolo Bonzini
  2015-04-21  6:58         ` Michael S. Tsirkin
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-04-21  6:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah



On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
>>> At the implementation level, there's one big issue you seem to have
>>> missed: DMA to invalid memory addresses causes a crash in memory core.
>>> I'm not sure whether it makes sense to recover from virtio core bugs
>>> when we can't recover from device bugs.
>>
>> What do you mean exactly?  DMA to invalid memory addresses causes
>> address_space_map to return a "short read".
>>
>> Paolo
> 
> I mean, first of all, a bunch of virtio_XXX_phys calls.
> These eventually call qemu_get_ram_ptr, which internally calls
> qemu_get_ram_block and ramblock_ptr.
> Both abort on errors.

address_space_translate and memory_access_size should ensure they don't.

Paolo

* Re: [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  6:52       ` Paolo Bonzini
@ 2015-04-21  6:58         ` Michael S. Tsirkin
  0 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Fam Zheng, qemu-devel, virtualization, Aneesh Kumar K.V,
	Stefan Hajnoczi, Amit Shah

On Tue, Apr 21, 2015 at 08:52:36AM +0200, Paolo Bonzini wrote:
> 
> 
> On 20/04/2015 22:34, Michael S. Tsirkin wrote:
> > On Mon, Apr 20, 2015 at 09:10:02PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 20/04/2015 19:36, Michael S. Tsirkin wrote:
> >>> At the implementation level, there's one big issue you seem to have
> >>> missed: DMA to invalid memory addresses causes a crash in memory core.
> >>> I'm not sure whether it makes sense to recover from virtio core bugs
> >>> when we can't recover from device bugs.
> >>
> >> What do you mean exactly?  DMA to invalid memory addresses causes
> >> address_space_map to return a "short read".
> >>
> >> Paolo
> > 
> > I mean, first of all, a bunch of virtio_XXX_phys calls.
> > These eventually call qemu_get_ram_ptr, which internally calls
> > qemu_get_ram_block and ramblock_ptr.
> > Both abort on errors.
> 
> address_space_translate and memory_access_size should ensure they don't.
> 
> Paolo

More comments in this code won't hurt.
It *looks* as if we assume we get a valid mr, and try to
access it.
In any case, no error is reported.

-- 
MST

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-20 17:36 ` [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET" Michael S. Tsirkin
  2015-04-20 19:10   ` Paolo Bonzini
@ 2015-04-21  2:37   ` Fam Zheng
  2015-04-21  5:22     ` Michael S. Tsirkin
  1 sibling, 1 reply; 10+ messages in thread
From: Fam Zheng @ 2015-04-21  2:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > data with vring.
> > That has drawbacks such as losing unsaved data (e.g. when
> > guest user is writing a very long email), or possible denial of service in
> > a nested vm use case where virtio device is passed through.
> > 
> > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > improve this by communicating the error state between virtio devices and
> > drivers. The device notifies guest upon setting the bit, then the guest driver
> > should detect this bit and report to userspace, or recover the device by
> > resetting it.
> 
> Unfortunately, virtio 1 spec does not have a conformance statement
> that requires driver to recover. We merely have a non-normative looking
> text:
> 	Note: For example, the driver can’t assume requests in flight
> 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> 	they have not been completed. A good implementation will try to recover
> 	by issuing a reset.
> 
> Implementing this reset for all devices in a race-free manner might also
> be far from trivial.  I think we'd need a feature bit for this.
> OTOH as long as we make this a new feature, would an ability to
> reset a single VQ be a better match for what you are trying to
> achieve?

I think that is too complicated as a recovery measure. A device-level reset
would be better, at least for getting back to a deterministic state.

> 
> > This series makes necessary changes in virtio core code, based on which
> > virtio-blk is converted. Other devices now keep the existing behavior by
> > passing in "error_abort". They will be converted in following series. The Linux
> > driver part will also be worked on.
> > 
> > One concern with this behavior change is that it's now harder to notice the
> > actual driver bug that caused the error, as the guest continues to run.  To
> > address that, we could probably add a new error action option to virtio
> > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > paused and the management will get an event in QMP like pvpanic.  This work can
> > be done on top.
> 
> At the architectural level, that's only one concern. Others would be
> - workloads such as openstack handle guest crash better than
>   a guest that's e.g. slow because of a memory leak

What memory leak are you referring to?

> - it's easier for guests to probe host for security issues
>   if guest isn't killed
> - guest can flood host log with guest-triggered errors

We can still abort() if guest is triggering error too quickly.
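
One way this could look, as a rough sketch only: a fixed-window counter
that tells the caller when the guest has exceeded an error budget. The
struct, its field names and the thresholds are all hypothetical, and the
timestamps are assumed to come from a monotonic clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-window counter for guest-triggered errors. */
struct toy_err_limit {
    uint64_t window_ns;     /* length of one counting window */
    unsigned max_errors;    /* errors tolerated per window */
    uint64_t window_start;  /* start time of the current window */
    unsigned count;         /* errors seen in the current window */
};

/* Record one error at time now_ns (assumed monotonic). Returns true
 * when the budget for the current window is exceeded; the caller may
 * then decide to abort() instead of logging yet another error. */
bool toy_err_limit_hit(struct toy_err_limit *rl, uint64_t now_ns)
{
    assert(rl != NULL && rl->window_ns > 0);
    if (now_ns - rl->window_start >= rl->window_ns) {
        /* The previous window has expired: start a fresh one. */
        rl->window_start = now_ns;
        rl->count = 0;
    }
    rl->count++;
    return rl->count > rl->max_errors;
}
```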

Fam

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  2:37   ` [Qemu-devel] " Fam Zheng
@ 2015-04-21  5:22     ` Michael S. Tsirkin
  2015-04-21  5:50       ` Fam Zheng
  0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  5:22 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > data with vring.
> > > That has drawbacks such as losing unsaved data (e.g. when
> > > guest user is writing a very long email), or possible denial of service in
> > > a nested vm use case where virtio device is passed through.
> > > 
> > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > improve this by communicating the error state between virtio devices and
> > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > should detect this bit and report to userspace, or recover the device by
> > > resetting it.
> > 
> > Unfortunately, virtio 1 spec does not have a conformance statement
> > that requires driver to recover. We merely have a non-normative looking
> > text:
> > 	Note: For example, the driver can’t assume requests in flight
> > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > 	they have not been completed. A good implementation will try to recover
> > 	by issuing a reset.
> > 
> > Implementing this reset for all devices in a race-free manner might also
> > be far from trivial.  I think we'd need a feature bit for this.
> > OTOH as long as we make this a new feature, would an ability to
> > reset a single VQ be a better match for what you are trying to
> > achieve?
> 
> I think that is too complicated as a recovery measure, a device level resetting
> will be better to get to a deterministic state, at least.

The question would be: how hard is it to stop the host from using all queues,
retrieve all host OS state and re-program it into the device?
If we need to shadow all OS state within the driver, then that's a lot
of not-well-tested code, with the possibility of introducing more bugs.

> > 
> > > This series makes necessary changes in virtio core code, based on which
> > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > passing in "error_abort". They will be converted in following series. The Linux
> > > driver part will also be worked on.
> > > 
> > > One concern with this behavior change is that it's now harder to notice the
> > > actual driver bug that caused the error, as the guest continues to run.  To
> > > address that, we could probably add a new error action option to virtio
> > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > be done on top.
> > 
> > At the architectural level, that's only one concern. Others would be
> > - workloads such as openstack handle guest crash better than
> >   a guest that's e.g. slow because of a memory leak
> 
> What memory leak are you referring to?

That was just an example.  If the host detects a malformed ring, it will
crash.  But often it doesn't detect one; the result is that buffers are
never used, so the guest can't free them up.

> > - it's easier for guests to probe host for security issues
> >   if guest isn't killed
> > - guest can flood host log with guest-triggered errors
> 
> We can still abort() if guest is triggering error too quickly.
> 
> Fam


Absolutely, and if it looked like I'm against error detection and
recovery, this was not my intent.

I am merely saying we can't apply this patchset as is, deferring
addressing the issues to patches on top.

But I have an idea: refactor the code to use error_abort. This way we
can apply the patchset without making functional changes, and you can
make progress to complete this, on top.
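
To make the idea concrete, a minimal sketch of the pattern, using toy
stand-ins rather than QEMU's actual Error API: struct toy_error,
toy_error_abort, toy_error_set and toy_get_head are hypothetical names.
A helper reports failure through an error pointer, and callers that want
today's behaviour pass the abort sentinel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for an error object. */
struct toy_error {
    const char *msg;
};

/* Sentinel: callers pass &toy_error_abort to keep the old
 * "kill the process on any guest error" behaviour. */
struct toy_error *toy_error_abort;

/* Report an error through errp. With the sentinel this aborts; with a
 * real pointer the error is handed back; with NULL it is dropped. */
void toy_error_set(struct toy_error **errp, const char *msg)
{
    assert(msg != NULL);
    if (errp == &toy_error_abort) {
        fprintf(stderr, "fatal: %s\n", msg);
        abort();
    }
    if (errp != NULL) {
        struct toy_error *err = malloc(sizeof(*err));
        if (err == NULL) {
            abort();
        }
        err->msg = msg;
        *errp = err;
    }
}

/* Validate a descriptor head index against the queue size.
 * Returns the index on success, -1 on error (with *errp set). */
int toy_get_head(unsigned head, unsigned num, struct toy_error **errp)
{
    if (head >= num) {
        toy_error_set(errp, "guest supplied an out-of-range head index");
        return -1;
    }
    return (int)head;
}
```

A device that has not been converted would call
toy_get_head(head, num, &toy_error_abort) and behave exactly as before,
while a converted device passes its own error pointer and handles the
failure.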



-- 
MST

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  5:22     ` Michael S. Tsirkin
@ 2015-04-21  5:50       ` Fam Zheng
  2015-04-21  6:09         ` Michael S. Tsirkin
  0 siblings, 1 reply; 10+ messages in thread
From: Fam Zheng @ 2015-04-21  5:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > data with vring.
> > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > guest user is writing a very long email), or possible denial of service in
> > > > a nested vm use case where virtio device is passed through.
> > > > 
> > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > improve this by communicating the error state between virtio devices and
> > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > should detect this bit and report to userspace, or recover the device by
> > > > resetting it.
> > > 
> > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > that requires driver to recover. We merely have a non-normative looking
> > > text:
> > > 	Note: For example, the driver can’t assume requests in flight
> > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > 	they have not been completed. A good implementation will try to recover
> > > 	by issuing a reset.
> > > 
> > > Implementing this reset for all devices in a race-free manner might also
> > > be far from trivial.  I think we'd need a feature bit for this.
> > > OTOH as long as we make this a new feature, would an ability to
> > > reset a single VQ be a better match for what you are trying to
> > > achieve?
> > 
> > I think that is too complicated as a recovery measure, a device level resetting
> > will be better to get to a deterministic state, at least.
> 
> Question would be, how hard is it to stop host from using all queues,
> retrieve all host OS state and re-program it into the device.
> If we need to shadow all OS state within the driver, then that's a lot
> of not well tested code with a possibility of introducing more bugs.

I don't understand the question. In this series the virtio-blk device will not
pop any more requests, and as long as the reset is properly handled, both guest
and host should go back to a good state.
> 
> > > 
> > > > This series makes necessary changes in virtio core code, based on which
> > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > driver part will also be worked on.
> > > > 
> > > > One concern with this behavior change is that it's now harder to notice the
> > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > address that, we could probably add a new error action option to virtio
> > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > be done on top.
> > > 
> > > At the architectural level, that's only one concern. Others would be
> > > - workloads such as openstack handle guest crash better than
> > >   a guest that's e.g. slow because of a memory leak
> > 
> > What memory leak are you referring to?
> 
> That was just an example.  If host detects a malformed ring, it will
> crash.  But often it doesn't, result is buffers not being used, so guest
> can't free them up.
> 
> > > - it's easier for guests to probe host for security issues
> > >   if guest isn't killed
> > > - guest can flood host log with guest-triggered errors
> > 
> > We can still abort() if guest is triggering error too quickly.
> 
> 
> Absolutely, and if it looked like I'm against error detection and
> recovery, this was not my intent.
> 
> I am merely saying we can't apply this patchset as is, deferring
> addressing the issues to patches on top.
> 
> But I have an idea: refactor the code to use error_abort. 

That is patch 1-9 of this series. Or do you mean also refactor and pass
error_abort to the memory core?

Fam

> This way we
> can apply the patchset without making functional changes, and you can
> make progress to complete this, on top.

* Re: [Qemu-devel] [PATCH 00/18] virtio-blk: Support "VIRTIO_CONFIG_S_NEEDS_RESET"
  2015-04-21  5:50       ` Fam Zheng
@ 2015-04-21  6:09         ` Michael S. Tsirkin
  0 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2015-04-21  6:09 UTC (permalink / raw)
  To: Fam Zheng
  Cc: qemu-devel, virtualization, Aneesh Kumar K.V, Stefan Hajnoczi,
	Amit Shah, Paolo Bonzini

On Tue, Apr 21, 2015 at 01:50:33PM +0800, Fam Zheng wrote:
> On Tue, 04/21 07:22, Michael S. Tsirkin wrote:
> > On Tue, Apr 21, 2015 at 10:37:00AM +0800, Fam Zheng wrote:
> > > On Mon, 04/20 19:36, Michael S. Tsirkin wrote:
> > > > On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> > > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> > > > > data with vring.
> > > > > That has drawbacks such as losing unsaved data (e.g. when
> > > > > guest user is writing a very long email), or possible denial of service in
> > > > > a nested vm use case where virtio device is passed through.
> > > > > 
> > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> > > > > improve this by communicating the error state between virtio devices and
> > > > > drivers. The device notifies guest upon setting the bit, then the guest driver
> > > > > should detect this bit and report to userspace, or recover the device by
> > > > > resetting it.
> > > > 
> > > > Unfortunately, virtio 1 spec does not have a conformance statement
> > > > that requires driver to recover. We merely have a non-normative looking
> > > > text:
> > > > 	Note: For example, the driver can’t assume requests in flight
> > > > 	will be completed if DEVICE_NEEDS_RESET is set, nor can it assume that
> > > > 	they have not been completed. A good implementation will try to recover
> > > > 	by issuing a reset.
> > > > 
> > > > Implementing this reset for all devices in a race-free manner might also
> > > > be far from trivial.  I think we'd need a feature bit for this.
> > > > OTOH as long as we make this a new feature, would an ability to
> > > > reset a single VQ be a better match for what you are trying to
> > > > achieve?
> > > 
> > > I think that is too complicated as a recovery measure, a device level resetting
> > > will be better to get to a deterministic state, at least.
> > 
> > Question would be, how hard is it to stop host from using all queues,
> > retrieve all host OS state and re-program it into the device.
> > If we need to shadow all OS state within the driver, then that's a lot
> > of not well tested code with a possibility of introducing more bugs.
> 
> I don't understand the question. In this series the virtio-blk device will not
> pop any more requests, and as long as the reset is properly handled, both guest
> and host should go back to a good state.
> > 
> > > > 
> > > > > This series makes necessary changes in virtio core code, based on which
> > > > > virtio-blk is converted. Other devices now keep the existing behavior by
> > > > > passing in "error_abort". They will be converted in following series. The Linux
> > > > > driver part will also be worked on.
> > > > > 
> > > > > One concern with this behavior change is that it's now harder to notice the
> > > > > actual driver bug that caused the error, as the guest continues to run.  To
> > > > > address that, we could probably add a new error action option to virtio
> > > > > devices,  similar to the "read/write werror" in block layer, so the vm could be
> > > > > paused and the management will get an event in QMP like pvpanic.  This work can
> > > > > be done on top.
> > > > 
> > > > At the architectural level, that's only one concern. Others would be
> > > > - workloads such as openstack handle guest crash better than
> > > >   a guest that's e.g. slow because of a memory leak
> > > 
> > > What memory leak are you referring to?
> > 
> > That was just an example.  If host detects a malformed ring, it will
> > crash.  But often it doesn't, result is buffers not being used, so guest
> > can't free them up.
> > 
> > > > - it's easier for guests to probe host for security issues
> > > >   if guest isn't killed
> > > > - guest can flood host log with guest-triggered errors
> > > 
> > > We can still abort() if guest is triggering error too quickly.
> > 
> > 
> > Absolutely, and if it looked like I'm against error detection and
> > recovery, this was not my intent.
> > 
> > I am merely saying we can't apply this patchset as is, deferring
> > addressing the issues to patches on top.
> > 
> > But I have an idea: refactor the code to use error_abort. 
> 
> That is patch 1-9 of this series. Or do you mean also refactor and pass
> error_abort to the memory core?
> 
> Fam

So if you'd like just patches 1-9 applied, that sounds
reasonable. I'll provide review comments on the individual patches.



> > This way we
> > can apply the patchset without making functional changes, and you can
> > make progress to complete this, on top.