Re: [PATCH v2] virtio-pci: disable vring processing when bus-mastering is disabled

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Halil Pasic <pasic@linux.ibm.com>
To: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Alexey Kardashevskiy <aik@ozlabs.ru>,
	qemu-devel@nongnu.org,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v2] virtio-pci: disable vring processing when bus-mastering is disabled
Date: Thu, 28 Nov 2019 17:48:00 +0100	[thread overview]
Message-ID: <20191128174800.2d4a2cc2.pasic@linux.ibm.com> (raw)
In-Reply-To: <20191120005003.27035-1-mdroth@linux.vnet.ibm.com>

On Tue, 19 Nov 2019 18:50:03 -0600
Michael Roth <mdroth@linux.vnet.ibm.com> wrote:

[..]
> I.e. the calling code is only scheduling a one-shot BH for
> virtio_blk_data_plane_stop_bh, but somehow we end up trying to process
> an additional virtqueue entry before we get there. This is likely due
> to the following check in virtio_queue_host_notifier_aio_poll:
> 
>   static bool virtio_queue_host_notifier_aio_poll(void *opaque)
>   {
>       EventNotifier *n = opaque;
>       VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
>       bool progress;
> 
>       if (!vq->vring.desc || virtio_queue_empty(vq)) {
>           return false;
>       }
> 
>       progress = virtio_queue_notify_aio_vq(vq);
> 
> namely the call to virtio_queue_empty(). In this case, since no new
> requests have actually been issued, shadow_avail_idx == last_avail_idx,
> so we actually try to access the vring via vring_avail_idx() to get
> the latest non-shadowed idx:
> 
>   int virtio_queue_empty(VirtQueue *vq)
>   {
>       bool empty;
>       ...
> 
>       if (vq->shadow_avail_idx != vq->last_avail_idx) {
>           return 0;
>       }
> 
>       rcu_read_lock();
>       empty = vring_avail_idx(vq) == vq->last_avail_idx;
>       rcu_read_unlock();
>       return empty;
> 
> but since the IOMMU region has been disabled we get a bogus value (0
> usually), which causes virtio_queue_empty() to falsely report that
> there are entries to be processed, which causes errors such as:
> 
>   "virtio: zero sized buffers are not allowed"
> 
> or
> 
>   "virtio-blk missing headers"
> 
> and puts the device in an error state.
> 

I've seen something similar on s390x with virtio-ccw-blk under
protected virtualization, that made me wonder about how virtio-blk in
particular but also virtio in general handles shutdown and reset.

This makes me wonder if bus-mastering disabled is the only scenario when
a something like vdev->disabled should be used.

In particular I have the following mechanism in mind 

qemu_system_reset() --> ... --> qemu_devices_reset() --> ... --> 
--> virtio_[transport]_reset() --> ... --> virtio_bus_stop_ioeventfd()
--> virtio_blk_data_plane_stop()

which in turn triggesrs the following cascade:
virtio_blk_data_plane_stop_bh --> virtio_queue_aio_set_host_notifier_handler() -->
--> virtio_queue_host_notifier_aio_read() 
which however calls 
virtio_queue_notify_aio_vq() if the notifier tests as
positive. 

Since we still have vq->handle_aio_output that means we may
call virtqueue_pop() during the reset procedure.

This was a problem for us, because (due to a bug) the shared pages that
constitute the virtio ring weren't shared any more. And thus we got
the infamous  
virtio_error(vdev, "virtio: zero sized buffers are not allowed").

Now the bug is no more, and we can tolerate that somewhat late access
to the virtio ring.

But it keeps nagging me, is it really OK for the device to access the
virtio ring during reset? My intuition tells me that the device should
not look for new requests after it has been told to reset.

Opinions? (Michael, Connie)

Regards,
Halil

> This patch works around the issue by introducing virtio_set_disabled(),
> which sets a 'disabled' flag to bypass checks like virtio_queue_empty()
> when bus-mastering is disabled. Since we'd check this flag at all the
> same sites as vdev->broken, we replace those checks with an inline
> function which checks for either vdev->broken or vdev->disabled.
> 
> The 'disabled' flag is only migrated when set, which should be fairly
> rare, but to maintain migration compatibility we disable it's use for
> older machine types. Users requiring the use of the flag in conjunction
> with older machine types can set it explicitly as a virtio-device
> option.
>

next prev parent reply	other threads:[~2019-11-28 19:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20  0:50 [PATCH v2] virtio-pci: disable vring processing when bus-mastering is disabled Michael Roth
2019-11-20  6:12 ` no-reply
2019-11-20 16:23   ` Michael Roth
2019-11-20 16:40     ` Dr. David Alan Gilbert
2019-11-28 16:48 ` Halil Pasic [this message]
2019-11-28 17:03   ` Michael S. Tsirkin
2019-11-29 13:51     ` Halil Pasic
2019-12-05  6:04 ` Alexey Kardashevskiy
2019-12-11 16:02 ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191128174800.2d4a2cc2.pasic@linux.ibm.com \
    --to=pasic@linux.ibm.com \
    --cc=aik@ozlabs.ru \
    --cc=borntraeger@de.ibm.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=frankja@linux.ibm.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).