qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] 9pfs: don't try to flush self and avoid QEMU hang on reset
@ 2017-03-16 16:33 Greg Kurz
  2017-03-21 14:01 ` Eric Blake
  0 siblings, 1 reply; 4+ messages in thread
From: Greg Kurz @ 2017-03-16 16:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Eric Blake, qemu-stable, Greg Kurz

According to the 9P spec [*], when a client wants to cancel a pending I/O
request identified by a given tag (uint16), it must send a Tflush message
and wait for the server to respond with a Rflush message before reusing this
tag for another I/O. The server may still send a completion message for the
I/O if it wasn't actually cancelled but the Rflush message must arrive after
that.

QEMU hence waits for the flushed PDU to complete before sending the Rflush
message back to the client.

If a client sends 'Tflush tag oldtag' and tag == oldtag, QEMU will then
allocate a PDU identified by tag, find it in the PDU list and wait for
this same PDU to complete... i.e. wait for a completion that will never
happen. This causes a tag and ring slot leak in the guest, and a PDU
leak in QEMU, all of them limited by the maximal number of PDUs (128).
But, worse, this causes QEMU to hang on device reset since v9fs_reset()
wants to drain all pending I/O.

This insane behavior is likely to denote a bug in the client, and it would
deserve an Rerror message to be sent back. Unfortunately, the protocol
allows it and requires all flush requests to suceed (only a Tflush response
is expected).

The only option is to detect when we have to handle a self-referencing
flush request and report success to the client right away.

[*] http://man.cat-v.org/plan_9/5/flush

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/9pfs/9p.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 76c9247c777d..e20417fbb0db 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2369,7 +2369,7 @@ static void coroutine_fn v9fs_flush(void *opaque)
             break;
         }
     }
-    if (cancel_pdu) {
+    if (cancel_pdu && cancel_pdu != pdu) {
         cancel_pdu->cancelled = 1;
         /*
          * Wait for pdu to complete.

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-03-21 15:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-16 16:33 [Qemu-devel] [PATCH] 9pfs: don't try to flush self and avoid QEMU hang on reset Greg Kurz
2017-03-21 14:01 ` Eric Blake
2017-03-21 14:42   ` Greg Kurz
2017-03-21 15:42     ` Eric Blake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).