qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends
@ 2014-10-24  8:13 Heinz Graalfs
  2014-10-24  8:13 ` [Qemu-devel] [RFC 1/3] char: Trigger timeouts on poll() when frontend is unready Heinz Graalfs
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Heinz Graalfs @ 2014-10-24  8:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: cornelia.huck, pbonzini, Heinz Graalfs, borntraeger

On s390 one can observe system hang situations wrt console input when
using 'dataplane=on'.

dataplane processing causes an inactive main thread and an active
dataplane thread.

When a character backend descriptor disappears from the main thread's
poll() descriptor array (when can_read() returns 0) it happens that it
will never reappear in the poll() array due to missing poll() interrupts.

The following patches fix observed hangs on s390 and provide a means
to avoid potential hangs in other backends/frontends.

The command line to reproduce the hang on s390:
qemu-system-s390x -machine s390-ccw-virtio,accel=kvm,usb=off,kernel_irqchip=on
 -m 1024 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -rtc base=utc
 -drive file=/dev/disk/by-path/ccw-0.0.e7ac,if=none,id=drive-virtio-disk0,
  format=raw,werror=report,rerror=report,cache=none,aio=native
 -device virtio-blk-ccw,ioeventfd=on,scsi=off,config-wce=off,devno=fe.0.0001,
  drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,x-data-plane=on
 -kernel /guest/vmlinux
 -append "root=/dev/vda1 console=ttyS0"
 -chardev stdio,id=sclpcon1,mux=off
 -device sclplmconsole,chardev=sclpcon1
 -nographic -nodefaults

Patch 1:
Adds timeout to poll() when a frontend returned -EAGAIN on 'can_read'.

Patch 2:
The SCLP line mode console now returns -EAGAIN on 'can_read' to fix a hang
after the backend descriptor disappeared forever from the poll() descriptor
array.

Patch 3:
The ASCII console now returns -EAGAIN on 'can_read' to avoid a potential hang.

Heinz Graalfs (3):
  char: Trigger timeouts on poll() when frontend is unready
  s390x: Fix hanging SCLP line mode console
  s390x: Avoid hanging SCLP ASCII console

 hw/char/sclpconsole-lm.c |  4 ++--
 hw/char/sclpconsole.c    |  7 ++++++-
 qemu-char.c              | 27 ++++++++++++++++++++++++++-
 3 files changed, 34 insertions(+), 4 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [RFC 1/3] char: Trigger timeouts on poll() when frontend is unready
  2014-10-24  8:13 [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Heinz Graalfs
@ 2014-10-24  8:13 ` Heinz Graalfs
  2014-10-24  8:13 ` [Qemu-devel] [RFC 2/3] s390x: Fix hanging SCLP line mode console Heinz Graalfs
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Heinz Graalfs @ 2014-10-24  8:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: cornelia.huck, pbonzini, Heinz Graalfs, borntraeger

When a character frontend returns zero on 'can_read' poll() goes on
without the backend's descriptor.

The remaining active descriptors may not be a part of the main
thread's poll; for example, if dataplane is active, activity may only
be ongoing on the dataplane thread.

This patch changes the character backend's io_watch_poll() loop logic.
If a frontend returns -EAGAIN on 'can_read' a poll() timeout
is triggered, thus the backend's descriptor has a chance to reappear within
the poll desriptor array. The timeout callback reinvokes 'can_read' to ask
the frontend if it is ready. Scheduling a timeout callback on poll() ends
when the frontend returns a non negative value on 'can_read'.

Frontends still returning zero on 'can_read' or without such callback are
not affected.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
---
 qemu-char.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/qemu-char.c b/qemu-char.c
index 8623c70..4adf7e4 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -697,6 +697,7 @@ typedef struct IOWatchPoll
     IOCanReadHandler *fd_can_read;
     GSourceFunc fd_read;
     void *opaque;
+    guint timer_tag;
 } IOWatchPoll;
 
 static IOWatchPoll *io_watch_poll_from_source(GSource *source)
@@ -704,19 +705,38 @@ static IOWatchPoll *io_watch_poll_from_source(GSource *source)
     return container_of(source, IOWatchPoll, parent);
 }
 
+static gboolean io_watch_timer(gpointer opaque)
+{
+    IOWatchPoll *iwp = io_watch_poll_from_source(opaque);
+
+    if (iwp->fd_can_read(iwp->opaque) == -EAGAIN) {
+        return TRUE;
+    }
+    iwp->timer_tag = 0;
+    return FALSE;
+}
+
 static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
 {
     IOWatchPoll *iwp = io_watch_poll_from_source(source);
-    bool now_active = iwp->fd_can_read(iwp->opaque) > 0;
+    int avail = iwp->fd_can_read(iwp->opaque);
+    bool now_active = avail > 0;
+    bool use_timeout = avail == -EAGAIN;
     bool was_active = iwp->src != NULL;
     if (was_active == now_active) {
         return FALSE;
     }
 
+    if (iwp->timer_tag) {
+        g_source_remove(iwp->timer_tag);
+        iwp->timer_tag = 0;
+    }
     if (now_active) {
         iwp->src = g_io_create_watch(iwp->channel, G_IO_IN | G_IO_ERR | G_IO_HUP);
         g_source_set_callback(iwp->src, iwp->fd_read, iwp->opaque, NULL);
         g_source_attach(iwp->src, NULL);
+    } else if (use_timeout) {
+        iwp->timer_tag = g_timeout_add(100, io_watch_timer, source);
     } else {
         g_source_destroy(iwp->src);
         g_source_unref(iwp->src);
@@ -774,6 +794,7 @@ static guint io_add_watch_poll(GIOChannel *channel,
     iwp->channel = channel;
     iwp->fd_read = (GSourceFunc) fd_read;
     iwp->src = NULL;
+    iwp->timer_tag = 0;
 
     tag = g_source_attach(&iwp->parent, NULL);
     g_source_unref(&iwp->parent);
@@ -796,6 +817,10 @@ static void io_remove_watch_poll(guint tag)
         g_source_unref(iwp->src);
         iwp->src = NULL;
     }
+    if (iwp->timer_tag) {
+        g_source_remove(iwp->timer_tag);
+        iwp->timer_tag = 0;
+    }
     g_source_destroy(&iwp->parent);
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [RFC 2/3] s390x: Fix hanging SCLP line mode console
  2014-10-24  8:13 [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Heinz Graalfs
  2014-10-24  8:13 ` [Qemu-devel] [RFC 1/3] char: Trigger timeouts on poll() when frontend is unready Heinz Graalfs
@ 2014-10-24  8:13 ` Heinz Graalfs
  2014-10-24  8:13 ` [Qemu-devel] [RFC 3/3] s390x: Avoid hanging SCLP ASCII console Heinz Graalfs
  2014-10-24  9:15 ` [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Paolo Bonzini
  3 siblings, 0 replies; 6+ messages in thread
From: Heinz Graalfs @ 2014-10-24  8:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: cornelia.huck, pbonzini, Heinz Graalfs, borntraeger

Exploit the new can_read timeout infrastructure in order to avoid
hangs when no further activity happens on the main thread.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
---
 hw/char/sclpconsole-lm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/char/sclpconsole-lm.c b/hw/char/sclpconsole-lm.c
index 80dd0a9..3575ef2 100644
--- a/hw/char/sclpconsole-lm.c
+++ b/hw/char/sclpconsole-lm.c
@@ -60,11 +60,11 @@ static int chr_can_read(void *opaque)
     SCLPConsoleLM *scon = opaque;
 
     if (scon->event.event_pending) {
-        return 0;
+        return -EAGAIN;
     } else if (SIZE_CONSOLE_BUFFER - scon->length) {
         return 1;
     }
-    return 0;
+    return -EAGAIN;
 }
 
 static void chr_read(void *opaque, const uint8_t *buf, int size)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Qemu-devel] [RFC 3/3] s390x: Avoid hanging SCLP ASCII console
  2014-10-24  8:13 [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Heinz Graalfs
  2014-10-24  8:13 ` [Qemu-devel] [RFC 1/3] char: Trigger timeouts on poll() when frontend is unready Heinz Graalfs
  2014-10-24  8:13 ` [Qemu-devel] [RFC 2/3] s390x: Fix hanging SCLP line mode console Heinz Graalfs
@ 2014-10-24  8:13 ` Heinz Graalfs
  2014-10-24  9:15 ` [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Paolo Bonzini
  3 siblings, 0 replies; 6+ messages in thread
From: Heinz Graalfs @ 2014-10-24  8:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: cornelia.huck, pbonzini, Heinz Graalfs, borntraeger

Exploit the new can_read timeout infrastructure in order to avoid
hangs when no further activity happens on the main thread.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
---
 hw/char/sclpconsole.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/char/sclpconsole.c b/hw/char/sclpconsole.c
index fca105d..a33aaa2 100644
--- a/hw/char/sclpconsole.c
+++ b/hw/char/sclpconsole.c
@@ -44,8 +44,13 @@ typedef struct SCLPConsole {
 static int chr_can_read(void *opaque)
 {
     SCLPConsole *scon = opaque;
+    int avail = SIZE_BUFFER_VT220 - scon->iov_data_len;
 
-    return SIZE_BUFFER_VT220 - scon->iov_data_len;
+    if (avail > 0) {
+        return avail;
+    }
+
+    return -EAGAIN;
 }
 
 /* Send data from a char device over to the guest */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends
  2014-10-24  8:13 [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Heinz Graalfs
                   ` (2 preceding siblings ...)
  2014-10-24  8:13 ` [Qemu-devel] [RFC 3/3] s390x: Avoid hanging SCLP ASCII console Heinz Graalfs
@ 2014-10-24  9:15 ` Paolo Bonzini
  2014-10-24 11:29   ` Heinz Graalfs
  3 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2014-10-24  9:15 UTC (permalink / raw)
  To: Heinz Graalfs, qemu-devel; +Cc: cornelia.huck, borntraeger

On 10/24/2014 10:13 AM, Heinz Graalfs wrote:
> On s390 one can observe system hang situations wrt console input when
> using 'dataplane=on'.
> 
> dataplane processing causes an inactive main thread and an active
> dataplane thread.
> 
> When a character backend descriptor disappears from the main thread's
> poll() descriptor array (when can_read() returns 0) it happens that it
> will never reappear in the poll() array due to missing poll() interrupts.
> 
> The following patches fix observed hangs on s390 and provide a means
> to avoid potential hangs in other backends/frontends.

I think all you need is a simple

    qemu_notify_event();

call when can_read can go from 0 to 1, for example just before
get_console_data returns (for hw/char/sclpconsole-lm.c).

By the way, for hw/char/sclpconsole-lm.c I'm not sure what happens if
scon->length == SIZE_CONSOLE_BUFFER.  You cannot read, so you cannot
generate an event, and you cannot reset scon->length because you cannot
generate an event.  I think something like this is needed:

diff --git a/hw/char/sclpconsole-lm.c b/hw/char/sclpconsole-lm.c
index 80dd0a9..c61b77b 100644
--- a/hw/char/sclpconsole-lm.c
+++ b/hw/char/sclpconsole-lm.c
@@ -61,10 +61,9 @@ static int chr_can_read(void *opaque)

     if (scon->event.event_pending) {
         return 0;
-    } else if (SIZE_CONSOLE_BUFFER - scon->length) {
+    } else {
         return 1;
     }
-    return 0;
 }

 static void chr_read(void *opaque, const uint8_t *buf, int size)
@@ -78,6 +77,10 @@ static void chr_read(void *opaque, const uint8_t
*buf, int size)
         sclp_service_interrupt(0);
         return;
     }
+    if (scon->length == SIZE_CONSOLE_BUFFER) {
+        /* Eat the character, but still process CR and LF.  */
+        return;
+    }
     scon->buf[scon->length] = *buf;
     scon->length += 1;
     if (scon->echo) {

Paolo

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends
  2014-10-24  9:15 ` [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Paolo Bonzini
@ 2014-10-24 11:29   ` Heinz Graalfs
  0 siblings, 0 replies; 6+ messages in thread
From: Heinz Graalfs @ 2014-10-24 11:29 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: cornelia.huck, borntraeger

On 24/10/14 11:15, Paolo Bonzini wrote:
> On 10/24/2014 10:13 AM, Heinz Graalfs wrote:
>> On s390 one can observe system hang situations wrt console input when
>> using 'dataplane=on'.
>>
>> dataplane processing causes an inactive main thread and an active
>> dataplane thread.
>>
>> When a character backend descriptor disappears from the main thread's
>> poll() descriptor array (when can_read() returns 0) it happens that it
>> will never reappear in the poll() array due to missing poll() interrupts.
>>
>> The following patches fix observed hangs on s390 and provide a means
>> to avoid potential hangs in other backends/frontends.
> 
> I think all you need is a simple
> 
>     qemu_notify_event();
> 
> call when can_read can go from 0 to 1, for example just before
> get_console_data returns (for hw/char/sclpconsole-lm.c).
> 

definitely, just that simple!

> By the way, for hw/char/sclpconsole-lm.c I'm not sure what happens if
> scon->length == SIZE_CONSOLE_BUFFER.  You cannot read, so you cannot
> generate an event, and you cannot reset scon->length because you cannot
> generate an event.  I think something like this is needed:
> 
> diff --git a/hw/char/sclpconsole-lm.c b/hw/char/sclpconsole-lm.c
> index 80dd0a9..c61b77b 100644
> --- a/hw/char/sclpconsole-lm.c
> +++ b/hw/char/sclpconsole-lm.c
> @@ -61,10 +61,9 @@ static int chr_can_read(void *opaque)
> 
>      if (scon->event.event_pending) {
>          return 0;
> -    } else if (SIZE_CONSOLE_BUFFER - scon->length) {
> +    } else {
>          return 1;
>      }
> -    return 0;
>  }
> 
>  static void chr_read(void *opaque, const uint8_t *buf, int size)
> @@ -78,6 +77,10 @@ static void chr_read(void *opaque, const uint8_t
> *buf, int size)
>          sclp_service_interrupt(0);
>          return;
>      }
> +    if (scon->length == SIZE_CONSOLE_BUFFER) {
> +        /* Eat the character, but still process CR and LF.  */
> +        return;
> +    }

yes, thanks a lot

>      scon->buf[scon->length] = *buf;
>      scon->length += 1;
>      if (scon->echo) {
> 
> Paolo
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-10-24 11:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-24  8:13 [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Heinz Graalfs
2014-10-24  8:13 ` [Qemu-devel] [RFC 1/3] char: Trigger timeouts on poll() when frontend is unready Heinz Graalfs
2014-10-24  8:13 ` [Qemu-devel] [RFC 2/3] s390x: Fix hanging SCLP line mode console Heinz Graalfs
2014-10-24  8:13 ` [Qemu-devel] [RFC 3/3] s390x: Avoid hanging SCLP ASCII console Heinz Graalfs
2014-10-24  9:15 ` [Qemu-devel] [RFC 0/3] qemu-char: Add poll timeouts for character backends Paolo Bonzini
2014-10-24 11:29   ` Heinz Graalfs

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).