qemu-devel.nongnu.org archive mirror
* [Qemu-devel] commit virtio: recalculate vq->inuse after migration might cause last_avail_idx vs. used_idx failure
@ 2016-12-14 19:12 Halil Pasic
  2016-12-15  8:24 ` Stefan Hajnoczi
  2016-12-15 10:52 ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 8+ messages in thread
From: Halil Pasic @ 2016-12-14 19:12 UTC (permalink / raw)
  To: Christian Borntraeger, QEMU Developers, Stefan Hajnoczi

We have a migration problem which is, in my opinion, caused by a
deficiency in how vq->inuse is calculated after the migration (commit
bccdef6b "virtio: recalculate vq->inuse after migration" is to
blame).


We got a bug report with this log for a live migration target.

2016-12-13T18:59:03.647309Z qemu-system-s390x: VQ 1 size 0x100 < last_avail_idx 0x2f76 - used_idx 0x762f
2016-12-13T18:59:03.647385Z qemu-system-s390x: error while loading state for instance 0x0 of device '/fe.0.0001/virtio-net'
2016-12-13T18:59:03.647540Z qemu-system-s390x: load of migration failed: Operation not permitted
2016-12-13 18:59:03.796+0000: shutting down, reason=failed

They use QEMU version 2.7, but looking at the current git master,
I think this has not been fixed in the meantime.

So here goes the argument. The recalculation is done like this:

+            vdev->vq[i].inuse = vdev->vq[i].last_avail_idx -
+                                vdev->vq[i].used_idx;

This does not seem correct when last_avail_idx has already
wrapped around but used_idx has not yet. We see from the log that
last_avail_idx (0x2f76) is less than used_idx (0x762f), thus
inuse (of type int) ends up being negative.
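
To make the arithmetic concrete, here is a minimal standalone sketch
using the two values from the log. The helper name recalc_inuse is
hypothetical and only mirrors the quoted expression; it assumes, as
described above, that both indices are uint16_t, that inuse is an int,
and that int is wider than 16 bits.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the quoted recalculation; not QEMU code.
 * Both uint16_t operands are promoted to int before the subtraction,
 * so the result is NOT reduced modulo 2^16 and can be negative. */
static int recalc_inuse(uint16_t last_avail_idx, uint16_t used_idx)
{
    return last_avail_idx - used_idx;
}
```

With the logged values, recalc_inuse(0x2f76, 0x762f) computes
12150 - 30255, that is -18105.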

+            if (vdev->vq[i].inuse > vdev->vq[i].vring.num) {

Because vdev->vq[i].vring.num is an unsigned int, the usual arithmetic
conversions ("Otherwise, if the operand that has unsigned integer type
has rank greater or equal to the rank of the type of the other operand,
then the operand with signed integer type is converted to the type of
the operand with unsigned integer type." C99) apply, and inuse gets
converted to unsigned int for the comparison.

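Thus the negative inuse becomes a very large unsigned value, the
comparison is true, and the sanity check rejects the state, producing
the log cited above.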
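
The effect of the conversion can be sketched in isolation. The helper
below is a hypothetical stand-in for the quoted condition, not QEMU
code; the cast spells out what the compiler does implicitly when an int
is compared against an unsigned int.

```c
#include <stdint.h>

/* Hypothetical stand-in for "inuse > vring.num"; not QEMU code.
 * The explicit cast reproduces the implicit conversion performed by
 * the usual arithmetic conversions: a negative int is reduced modulo
 * UINT_MAX + 1 and therefore compares as a very large value. */
static int inuse_exceeds_ring(int inuse, unsigned int num)
{
    return (unsigned int)inuse > num;
}
```

For a negative inuse such as -18105 (0x2f76 - 0x762f computed as int)
and a ring size of 0x100, inuse_exceeds_ring() returns 1, so the error
path is taken. A small non-negative value such as 16 returns 0.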

+                error_report("VQ %d size 0x%x < last_avail_idx 0x%x - "
+                             "used_idx 0x%x",
+                             i, vdev->vq[i].vring.num,
+                             vdev->vq[i].last_avail_idx,
+                             vdev->vq[i].used_idx);
+                return -1;
+            }

Do we want to try to fix this for 2.8? I already have a small patch prepared.

Regards,
Halil

