* [Qemu-devel] [PATCH 0/3] Avoid retrying on serial_xmit with EPIPE @ 2018-05-31 7:45 Sergio Lopez 2018-05-31 7:45 ` [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN Sergio Lopez ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Sergio Lopez @ 2018-05-31 7:45 UTC (permalink / raw) To: berrange, pbonzini, marcandre.lureau, mst; +Cc: qemu-devel, Sergio Lopez EPIPE is not a recoverable error, so retrying on serial_xmit is a waste of CPU cycles, and can potentially comprosie the Guest OS stability if both the vCPU and the emulator thread are pinned to the same pCPU, with the first one actively polling the serial device and barely giving time to the second to make any actual progress. Sergio Lopez (3): io: Implement QIO_CHANNEL_ERR_BROKEN chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE hw/char/serial: Don't retry on serial_xmit if errno == EPIPE chardev/char-io.c | 3 +++ hw/char/serial.c | 1 + include/io/channel.h | 1 + io/channel-file.c | 3 +++ 4 files changed, 8 insertions(+) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN 2018-05-31 7:45 [Qemu-devel] [PATCH 0/3] Avoid retrying on serial_xmit with EPIPE Sergio Lopez @ 2018-05-31 7:45 ` Sergio Lopez 2018-06-01 11:50 ` Daniel P. Berrangé 2018-05-31 7:46 ` [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE Sergio Lopez 2018-05-31 7:46 ` [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE Sergio Lopez 2 siblings, 1 reply; 18+ messages in thread From: Sergio Lopez @ 2018-05-31 7:45 UTC (permalink / raw) To: berrange, pbonzini, marcandre.lureau, mst; +Cc: qemu-devel, Sergio Lopez QIO_CHANNEL_ERR_BROKEN is used to identify a potentially unrecoverable error in the channel, like EPIPE. --- include/io/channel.h | 1 + io/channel-file.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/include/io/channel.h b/include/io/channel.h index e8cdadb..bbe45f6 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -38,6 +38,7 @@ typedef struct QIOChannel QIOChannel; typedef struct QIOChannelClass QIOChannelClass; #define QIO_CHANNEL_ERR_BLOCK -2 +#define QIO_CHANNEL_ERR_BROKEN -3 typedef enum QIOChannelFeature QIOChannelFeature; diff --git a/io/channel-file.c b/io/channel-file.c index db948ab..a990f67 100644 --- a/io/channel-file.c +++ b/io/channel-file.c @@ -124,6 +124,9 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc, if (errno == EAGAIN) { return QIO_CHANNEL_ERR_BLOCK; } + if (errno == EPIPE) { + return QIO_CHANNEL_ERR_BROKEN; + } if (errno == EINTR) { goto retry; } -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN 2018-05-31 7:45 ` [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN Sergio Lopez @ 2018-06-01 11:50 ` Daniel P. Berrangé 0 siblings, 0 replies; 18+ messages in thread From: Daniel P. Berrangé @ 2018-06-01 11:50 UTC (permalink / raw) To: Sergio Lopez; +Cc: pbonzini, marcandre.lureau, mst, qemu-devel On Thu, May 31, 2018 at 09:45:59AM +0200, Sergio Lopez wrote: > QIO_CHANNEL_ERR_BROKEN is used to identify a potentially unrecoverable > error in the channel, like EPIPE. > --- > include/io/channel.h | 1 + > io/channel-file.c | 3 +++ > 2 files changed, 4 insertions(+) > > diff --git a/include/io/channel.h b/include/io/channel.h > index e8cdadb..bbe45f6 100644 > --- a/include/io/channel.h > +++ b/include/io/channel.h > @@ -38,6 +38,7 @@ typedef struct QIOChannel QIOChannel; > typedef struct QIOChannelClass QIOChannelClass; > > #define QIO_CHANNEL_ERR_BLOCK -2 > +#define QIO_CHANNEL_ERR_BROKEN -3 > > typedef enum QIOChannelFeature QIOChannelFeature; > > diff --git a/io/channel-file.c b/io/channel-file.c > index db948ab..a990f67 100644 > --- a/io/channel-file.c > +++ b/io/channel-file.c > @@ -124,6 +124,9 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc, > if (errno == EAGAIN) { > return QIO_CHANNEL_ERR_BLOCK; > } > + if (errno == EPIPE) { > + return QIO_CHANNEL_ERR_BROKEN; > + } > if (errno == EINTR) { > goto retry; > } I don't see a compelling reason to add this. EAGAIN needs special handling because we do not wish to set the "Error **errp" object. EPIPE can be just handled by callers in the same way as any other errno value, by closing the connection. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE 2018-05-31 7:45 [Qemu-devel] [PATCH 0/3] Avoid retrying on serial_xmit with EPIPE Sergio Lopez 2018-05-31 7:45 ` [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN Sergio Lopez @ 2018-05-31 7:46 ` Sergio Lopez 2018-05-31 9:48 ` Marc-André Lureau 2018-06-01 11:51 ` Daniel P. Berrangé 2018-05-31 7:46 ` [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE Sergio Lopez 2 siblings, 2 replies; 18+ messages in thread From: Sergio Lopez @ 2018-05-31 7:46 UTC (permalink / raw) To: berrange, pbonzini, marcandre.lureau, mst; +Cc: qemu-devel, Sergio Lopez This allows callers to identify this potentially unrecoverable error. --- chardev/char-io.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/chardev/char-io.c b/chardev/char-io.c index f810524..f934eb9 100644 --- a/chardev/char-io.c +++ b/chardev/char-io.c @@ -168,6 +168,9 @@ int io_channel_send_full(QIOChannel *ioc, errno = EAGAIN; return -1; + } else if (ret == QIO_CHANNEL_ERR_BROKEN) { + errno = EPIPE; + return -1; } else if (ret < 0) { errno = EINVAL; return -1; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE 2018-05-31 7:46 ` [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE Sergio Lopez @ 2018-05-31 9:48 ` Marc-André Lureau 2018-05-31 11:03 ` Sergio Lopez 2018-06-01 11:51 ` Daniel P. Berrangé 1 sibling, 1 reply; 18+ messages in thread From: Marc-André Lureau @ 2018-05-31 9:48 UTC (permalink / raw) To: Sergio Lopez; +Cc: Daniel P. Berrange, Paolo Bonzini, Michael S. Tsirkin, QEMU On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: > This allows callers to identify this potentially unrecoverable error. (missing signed-off) Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> > --- > chardev/char-io.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/chardev/char-io.c b/chardev/char-io.c > index f810524..f934eb9 100644 > --- a/chardev/char-io.c > +++ b/chardev/char-io.c > @@ -168,6 +168,9 @@ int io_channel_send_full(QIOChannel *ioc, > > errno = EAGAIN; > return -1; > + } else if (ret == QIO_CHANNEL_ERR_BROKEN) { > + errno = EPIPE; > + return -1; > } else if (ret < 0) { > errno = EINVAL; > return -1; > -- > 1.8.3.1 > > -- Marc-André Lureau ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE 2018-05-31 9:48 ` Marc-André Lureau @ 2018-05-31 11:03 ` Sergio Lopez 0 siblings, 0 replies; 18+ messages in thread From: Sergio Lopez @ 2018-05-31 11:03 UTC (permalink / raw) To: Marc-André Lureau Cc: Daniel P. Berrange, Paolo Bonzini, Michael S. Tsirkin, QEMU On Thu, May 31, 2018 at 11:48:25AM +0200, Marc-André Lureau wrote: > On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: > > This allows callers to identify this potentially unrecoverable error. > > (missing signed-off) Ouch, sorry, will add it in v2. Sergio. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE 2018-05-31 7:46 ` [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE Sergio Lopez 2018-05-31 9:48 ` Marc-André Lureau @ 2018-06-01 11:51 ` Daniel P. Berrangé 1 sibling, 0 replies; 18+ messages in thread From: Daniel P. Berrangé @ 2018-06-01 11:51 UTC (permalink / raw) To: Sergio Lopez; +Cc: pbonzini, marcandre.lureau, mst, qemu-devel On Thu, May 31, 2018 at 09:46:00AM +0200, Sergio Lopez wrote: > This allows callers to identify this potentially unrecoverable error. > --- > chardev/char-io.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/chardev/char-io.c b/chardev/char-io.c > index f810524..f934eb9 100644 > --- a/chardev/char-io.c > +++ b/chardev/char-io.c > @@ -168,6 +168,9 @@ int io_channel_send_full(QIOChannel *ioc, > > errno = EAGAIN; > return -1; > + } else if (ret == QIO_CHANNEL_ERR_BROKEN) { > + errno = EPIPE; > + return -1; > } else if (ret < 0) { > errno = EINVAL; > return -1; Again, I don't see any reason to add this special case for EPIPE Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-05-31 7:45 [Qemu-devel] [PATCH 0/3] Avoid retrying on serial_xmit with EPIPE Sergio Lopez 2018-05-31 7:45 ` [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN Sergio Lopez 2018-05-31 7:46 ` [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE Sergio Lopez @ 2018-05-31 7:46 ` Sergio Lopez 2018-05-31 9:52 ` Marc-André Lureau 2 siblings, 1 reply; 18+ messages in thread From: Sergio Lopez @ 2018-05-31 7:46 UTC (permalink / raw) To: berrange, pbonzini, marcandre.lureau, mst; +Cc: qemu-devel, Sergio Lopez If writing to the frontend channel failed with EPIPE, don't set up a retry. EPIPE is not a recoverable error, so trying again is a waste of CPU cycles. If the vCPU writing to the serial device and emulator thread are pinned to the same pCPU, it can also compromise the stability of the Guest OS, as both threads will be competing for pCPU's time, with the vCPU actively polling the serial device and barely giving time to the emulator thread to make actual progress. --- hw/char/serial.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/char/serial.c b/hw/char/serial.c index 2c080c9..f26e86b 100644 --- a/hw/char/serial.c +++ b/hw/char/serial.c @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s) /* in loopback mode, say that we just received a char */ serial_receive1(s, &s->tsr, 1); } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 && + errno != EPIPE && s->tsr_retry < MAX_XMIT_RETRY) { assert(s->watch_tag == 0); s->watch_tag = -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-05-31 7:46 ` [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE Sergio Lopez @ 2018-05-31 9:52 ` Marc-André Lureau 2018-05-31 11:03 ` Sergio Lopez 2018-06-01 11:52 ` Daniel P. Berrangé 0 siblings, 2 replies; 18+ messages in thread From: Marc-André Lureau @ 2018-05-31 9:52 UTC (permalink / raw) To: Sergio Lopez; +Cc: Daniel P. Berrange, Paolo Bonzini, Michael S. Tsirkin, QEMU Hi On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: > If writing to the frontend channel failed with EPIPE, don't set up a > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU > cycles. > > If the vCPU writing to the serial device and emulator thread are pinned > to the same pCPU, it can also compromise the stability of the Guest OS, > as both threads will be competing for pCPU's time, with the vCPU > actively polling the serial device and barely giving time to the > emulator thread to make actual progress. > --- > hw/char/serial.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/hw/char/serial.c b/hw/char/serial.c > index 2c080c9..f26e86b 100644 > --- a/hw/char/serial.c > +++ b/hw/char/serial.c > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s) > /* in loopback mode, say that we just received a char */ > serial_receive1(s, &s->tsr, 1); > } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 && > + errno != EPIPE && > s->tsr_retry < MAX_XMIT_RETRY) { Instead of adding explicit handling of EPIPE, shouldn't the code be rewritten to treat -1 return && errno != EAGAIN as fatal? > assert(s->watch_tag == 0); > s->watch_tag = > -- > 1.8.3.1 > > -- Marc-André Lureau ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-05-31 9:52 ` Marc-André Lureau @ 2018-05-31 11:03 ` Sergio Lopez 2018-05-31 11:07 ` Marc-André Lureau 2018-06-01 11:52 ` Daniel P. Berrangé 1 sibling, 1 reply; 18+ messages in thread From: Sergio Lopez @ 2018-05-31 11:03 UTC (permalink / raw) To: Marc-André Lureau Cc: Daniel P. Berrange, Paolo Bonzini, Michael S. Tsirkin, QEMU On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote: > On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: > > If writing to the frontend channel failed with EPIPE, don't set up a > > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU > > cycles. > > > > If the vCPU writing to the serial device and emulator thread are pinned > > to the same pCPU, it can also compromise the stability of the Guest OS, > > as both threads will be competing for pCPU's time, with the vCPU > > actively polling the serial device and barely giving time to the > > emulator thread to make actual progress. > > --- > > hw/char/serial.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/hw/char/serial.c b/hw/char/serial.c > > index 2c080c9..f26e86b 100644 > > --- a/hw/char/serial.c > > +++ b/hw/char/serial.c > > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s) > > /* in loopback mode, say that we just received a char */ > > serial_receive1(s, &s->tsr, 1); > > } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 && > > + errno != EPIPE && > > s->tsr_retry < MAX_XMIT_RETRY) { > > Instead of adding explicit handling of EPIPE, shouldn't the code be > rewritten to treat -1 return && errno != EAGAIN as fatal? What kind of action should it trigger if treating errno != EAGAIN as fatal? If you meant something like calling 'abort()', I think that, while this could be considered as the "correct" behavior, it's a bit unpractical for the users. The most common situation for this to happen is restarting virtlogd while having the console/serial redirected to it. The VM will stop writing to the serial device, but otherwise the Guest OS can probably keep working without any issues. Sergio. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-05-31 11:03 ` Sergio Lopez @ 2018-05-31 11:07 ` Marc-André Lureau 0 siblings, 0 replies; 18+ messages in thread From: Marc-André Lureau @ 2018-05-31 11:07 UTC (permalink / raw) To: Sergio Lopez; +Cc: Daniel P. Berrange, Paolo Bonzini, Michael S. Tsirkin, QEMU Hi On Thu, May 31, 2018 at 1:03 PM, Sergio Lopez <slp@redhat.com> wrote: > On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote: >> On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: >> > If writing to the frontend channel failed with EPIPE, don't set up a >> > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU >> > cycles. >> > >> > If the vCPU writing to the serial device and emulator thread are pinned >> > to the same pCPU, it can also compromise the stability of the Guest OS, >> > as both threads will be competing for pCPU's time, with the vCPU >> > actively polling the serial device and barely giving time to the >> > emulator thread to make actual progress. >> > --- >> > hw/char/serial.c | 1 + >> > 1 file changed, 1 insertion(+) >> > >> > diff --git a/hw/char/serial.c b/hw/char/serial.c >> > index 2c080c9..f26e86b 100644 >> > --- a/hw/char/serial.c >> > +++ b/hw/char/serial.c >> > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s) >> > /* in loopback mode, say that we just received a char */ >> > serial_receive1(s, &s->tsr, 1); >> > } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 && >> > + errno != EPIPE && >> > s->tsr_retry < MAX_XMIT_RETRY) { >> >> Instead of adding explicit handling of EPIPE, shouldn't the code be >> rewritten to treat -1 return && errno != EAGAIN as fatal? > > What kind of action should it trigger if treating errno != EAGAIN as fatal? "fatal" was perhaps the wrong word. I mean that we don't go into the retry loop and treat the chardev as disconnected (without explicit handling of EPIPE). > > If you meant something like calling 'abort()', I think that, while this > could be considered as the "correct" behavior, it's a bit unpractical > for the users. The most common situation for this to happen is > restarting virtlogd while having the console/serial redirected to it. > The VM will stop writing to the serial device, but otherwise the Guest > OS can probably keep working without any issues. > > Sergio. -- Marc-André Lureau ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-05-31 9:52 ` Marc-André Lureau 2018-05-31 11:03 ` Sergio Lopez @ 2018-06-01 11:52 ` Daniel P. Berrangé 2018-06-01 12:05 ` Sergio Lopez 1 sibling, 1 reply; 18+ messages in thread From: Daniel P. Berrangé @ 2018-06-01 11:52 UTC (permalink / raw) To: Marc-André Lureau Cc: Sergio Lopez, Paolo Bonzini, Michael S. Tsirkin, QEMU On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote: > Hi > > On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: > > If writing to the frontend channel failed with EPIPE, don't set up a > > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU > > cycles. > > > > If the vCPU writing to the serial device and emulator thread are pinned > > to the same pCPU, it can also compromise the stability of the Guest OS, > > as both threads will be competing for pCPU's time, with the vCPU > > actively polling the serial device and barely giving time to the > > emulator thread to make actual progress. > > --- > > hw/char/serial.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/hw/char/serial.c b/hw/char/serial.c > > index 2c080c9..f26e86b 100644 > > --- a/hw/char/serial.c > > +++ b/hw/char/serial.c > > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s) > > /* in loopback mode, say that we just received a char */ > > serial_receive1(s, &s->tsr, 1); > > } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 && > > + errno != EPIPE && > > s->tsr_retry < MAX_XMIT_RETRY) { > > Instead of adding explicit handling of EPIPE, shouldn't the code be > rewritten to treat -1 return && errno != EAGAIN as fatal? Yes, exactly this code is already broken for every single errno value, not simply EPIPE. It needs fixing to treat '-1' return code correctly instead of retrying everything. > > > assert(s->watch_tag == 0); > > s->watch_tag = Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-06-01 11:52 ` Daniel P. Berrangé @ 2018-06-01 12:05 ` Sergio Lopez 2018-06-01 12:16 ` Paolo Bonzini 0 siblings, 1 reply; 18+ messages in thread From: Sergio Lopez @ 2018-06-01 12:05 UTC (permalink / raw) To: Daniel P. Berrangé Cc: Marc-André Lureau, Paolo Bonzini, Michael S. Tsirkin, QEMU On Fri, Jun 01, 2018 at 12:52:12PM +0100, Daniel P. Berrangé wrote: > On Thu, May 31, 2018 at 11:52:01AM +0200, Marc-André Lureau wrote: > > Hi > > > > On Thu, May 31, 2018 at 9:46 AM, Sergio Lopez <slp@redhat.com> wrote: > > > If writing to the frontend channel failed with EPIPE, don't set up a > > > retry. EPIPE is not a recoverable error, so trying again is a waste of CPU > > > cycles. > > > > > > If the vCPU writing to the serial device and emulator thread are pinned > > > to the same pCPU, it can also compromise the stability of the Guest OS, > > > as both threads will be competing for pCPU's time, with the vCPU > > > actively polling the serial device and barely giving time to the > > > emulator thread to make actual progress. > > > --- > > > hw/char/serial.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/hw/char/serial.c b/hw/char/serial.c > > > index 2c080c9..f26e86b 100644 > > > --- a/hw/char/serial.c > > > +++ b/hw/char/serial.c > > > @@ -262,6 +262,7 @@ static void serial_xmit(SerialState *s) > > > /* in loopback mode, say that we just received a char */ > > > serial_receive1(s, &s->tsr, 1); > > > } else if (qemu_chr_fe_write(&s->chr, &s->tsr, 1) != 1 && > > > + errno != EPIPE && > > > s->tsr_retry < MAX_XMIT_RETRY) { > > > > Instead of adding explicit handling of EPIPE, shouldn't the code be > > rewritten to treat -1 return && errno != EAGAIN as fatal? > > Yes, exactly this code is already broken for every single errno > value, not simply EPIPE. It needs fixing to treat '-1' return code > correctly instead of retrying everything. Given that EAGAIN is already taken care of in chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or should we just drop all the tsr_retry logic? Sergio. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-06-01 12:05 ` Sergio Lopez @ 2018-06-01 12:16 ` Paolo Bonzini 2018-06-01 12:28 ` Sergio Lopez 0 siblings, 1 reply; 18+ messages in thread From: Paolo Bonzini @ 2018-06-01 12:16 UTC (permalink / raw) To: Sergio Lopez, Daniel P. Berrangé Cc: Marc-André Lureau, Michael S. Tsirkin, QEMU On 01/06/2018 14:05, Sergio Lopez wrote: >>> Instead of adding explicit handling of EPIPE, shouldn't the code be >>> rewritten to treat -1 return && errno != EAGAIN as fatal? >> Yes, exactly this code is already broken for every single errno >> value, not simply EPIPE. It needs fixing to treat '-1' return code >> correctly instead of retrying everything. > Given that EAGAIN is already taken care of in > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or > should we just drop all the tsr_retry logic? It's not handled there, the call from qemu_chr_fe_write has write_all == false. Paolo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-06-01 12:16 ` Paolo Bonzini @ 2018-06-01 12:28 ` Sergio Lopez 2018-06-01 12:31 ` Paolo Bonzini 2018-06-01 12:43 ` Peter Maydell 0 siblings, 2 replies; 18+ messages in thread From: Sergio Lopez @ 2018-06-01 12:28 UTC (permalink / raw) To: Paolo Bonzini Cc: Daniel P. Berrangé, Marc-André Lureau, Michael S. Tsirkin, QEMU On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote: > On 01/06/2018 14:05, Sergio Lopez wrote: > >>> Instead of adding explicit handling of EPIPE, shouldn't the code be > >>> rewritten to treat -1 return && errno != EAGAIN as fatal? > >> Yes, exactly this code is already broken for every single errno > >> value, not simply EPIPE. It needs fixing to treat '-1' return code > >> correctly instead of retrying everything. > > Given that EAGAIN is already taken care of in > > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or > > should we just drop all the tsr_retry logic? > > It's not handled there, the call from qemu_chr_fe_write has write_all == > false. You're right. So should we just retry only for -1 && errno == EAGAIN and just ignore the error otherwise? Sergio. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-06-01 12:28 ` Sergio Lopez @ 2018-06-01 12:31 ` Paolo Bonzini 2018-06-01 12:43 ` Peter Maydell 1 sibling, 0 replies; 18+ messages in thread From: Paolo Bonzini @ 2018-06-01 12:31 UTC (permalink / raw) To: Sergio Lopez Cc: Daniel P. Berrangé, Marc-André Lureau, Michael S. Tsirkin, QEMU On 01/06/2018 14:28, Sergio Lopez wrote: > On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote: >> On 01/06/2018 14:05, Sergio Lopez wrote: >>>>> Instead of adding explicit handling of EPIPE, shouldn't the code be >>>>> rewritten to treat -1 return && errno != EAGAIN as fatal? >>>> Yes, exactly this code is already broken for every single errno >>>> value, not simply EPIPE. It needs fixing to treat '-1' return code >>>> correctly instead of retrying everything. >>> Given that EAGAIN is already taken care of in >>> chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or >>> should we just drop all the tsr_retry logic? >> >> It's not handled there, the call from qemu_chr_fe_write has write_all == >> false. > > You're right. So should we just retry only for -1 && errno == EAGAIN and > just ignore the error otherwise? Probably, yes. Paolo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-06-01 12:28 ` Sergio Lopez 2018-06-01 12:31 ` Paolo Bonzini @ 2018-06-01 12:43 ` Peter Maydell 2018-06-01 16:20 ` Sergio Lopez 1 sibling, 1 reply; 18+ messages in thread From: Peter Maydell @ 2018-06-01 12:43 UTC (permalink / raw) To: Sergio Lopez Cc: Paolo Bonzini, Marc-André Lureau, QEMU, Michael S. Tsirkin On 1 June 2018 at 13:28, Sergio Lopez <slp@redhat.com> wrote: > On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote: >> On 01/06/2018 14:05, Sergio Lopez wrote: >> >>> Instead of adding explicit handling of EPIPE, shouldn't the code be >> >>> rewritten to treat -1 return && errno != EAGAIN as fatal? >> >> Yes, exactly this code is already broken for every single errno >> >> value, not simply EPIPE. It needs fixing to treat '-1' return code >> >> correctly instead of retrying everything. >> > Given that EAGAIN is already taken care of in >> > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or >> > should we just drop all the tsr_retry logic? >> >> It's not handled there, the call from qemu_chr_fe_write has write_all == >> false. > > You're right. So should we just retry only for -1 && errno == EAGAIN and > just ignore the error otherwise? I don't think we should make all of qemu_chr_fe_write()'s callers have to handle EAGAIN. That function already has a way to tell the caller that it didn't consume any data, which is to return 0. (EAGAIN is a curse and we should strive to avoid it spreading its poison into areas of the codebase that we can keep it out of.) In general there are three cases I think qemu_chr_fe_write() callers might care about: * data was consumed (return >0) * data wasn't consumed, try again later (return 0) * data wasn't consumed, and a later attempt won't work either (return -1) NB that hw/char/serial.c is not the only serial device using this pattern and probably needing adjustment (though I think it's the only one with the complicated tsr_retry logic). In the patch that started this thread the report is that we use lots of CPU. Can you explain why that happens? I was expecting that we would set up the qemu_chr_fe_add_watch() on the dead chr FE and it would just never fire since the FE can never accept further data... thanks -- PMM ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE 2018-06-01 12:43 ` Peter Maydell @ 2018-06-01 16:20 ` Sergio Lopez 0 siblings, 0 replies; 18+ messages in thread From: Sergio Lopez @ 2018-06-01 16:20 UTC (permalink / raw) To: Peter Maydell Cc: Paolo Bonzini, Marc-André Lureau, QEMU, Michael S. Tsirkin On Fri, Jun 01, 2018 at 01:43:56PM +0100, Peter Maydell wrote: > On 1 June 2018 at 13:28, Sergio Lopez <slp@redhat.com> wrote: > > On Fri, Jun 01, 2018 at 02:16:59PM +0200, Paolo Bonzini wrote: > >> On 01/06/2018 14:05, Sergio Lopez wrote: > >> >>> Instead of adding explicit handling of EPIPE, shouldn't the code be > >> >>> rewritten to treat -1 return && errno != EAGAIN as fatal? > >> >> Yes, exactly this code is already broken for every single errno > >> >> value, not simply EPIPE. It needs fixing to treat '-1' return code > >> >> correctly instead of retrying everything. > >> > Given that EAGAIN is already taken care of in > >> > chardev/char.c:qemu_chr_write_buffer, in which cases should we retry? Or > >> > should we just drop all the tsr_retry logic? > >> > >> It's not handled there, the call from qemu_chr_fe_write has write_all == > >> false. > > > > You're right. So should we just retry only for -1 && errno == EAGAIN and > > just ignore the error otherwise? > > I don't think we should make all of qemu_chr_fe_write()'s callers > have to handle EAGAIN. That function already has a way to tell > the caller that it didn't consume any data, which is to return 0. > > (EAGAIN is a curse and we should strive to avoid it spreading its > poison into areas of the codebase that we can keep it out of.) > > In general there are three cases I think qemu_chr_fe_write() callers > might care about: > * data was consumed (return >0) > * data wasn't consumed, try again later (return 0) > * data wasn't consumed, and a later attempt won't work either > (return -1) > > NB that hw/char/serial.c is not the only serial device using > this pattern and probably needing adjustment (though I think it's > the only one with the complicated tsr_retry logic). > > In the patch that started this thread the report is that we > use lots of CPU. Can you explain why that happens? I was expecting > that we would set up the qemu_chr_fe_add_watch() on the dead > chr FE and it would just never fire since the FE can never > accept further data... The callback is indeed triggered (I haven't looked why yet), and both the main thread and the vCPU that issued the write to the serial port (which is now polling the device to see if the transmitter is empty) are fighting for the qemu_global_mutex. This issue is heavily amplified if both threads are sharing the same pCPU (by chance or pinning). Some debugging info from a simulation: - vCPU thread waiting to acquire qemu_global_mutex: Thread 3 (Thread 0x7fb6c7fff700 (LWP 31641)): #0 0x00007fb6e591c4cd in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007fb6e5917dcb in _L_lock_812 () at /lib64/libpthread.so.0 #2 0x00007fb6e5917c98 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x556dfb7b6420 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:79 #3 0x0000556dfaebcfd7 in qemu_mutex_lock_impl (mutex=mutex@entry=0x556dfb7b6420 <qemu_global_mutex>, file=file@entry=0x556dfaf57477 "/root/Projects/qemu/cpus.c", line=line@entry=1765) at util/qemu-thread-posix.c:67 #4 0x0000556dfaacfe58 in qemu_mutex_lock_iothread () at /root/Projects/qemu/cpus.c:1765 #5 0x0000556dfaa90bcd in prepare_mmio_access (mr=0x556dfe71c0e0, mr=0x556dfe71c0e0) at /root/Projects/qemu/exec.c:3068 #6 0x0000556dfaa95f98 in flatview_read_continue (fv=fv@entry=0x7fb6c028aa80, addr=addr@entry=1021, attrs=..., attrs@entry=..., buf=buf@entry=0x7fb6e739f000 "", len=len@entry=1, addr1=5, l=1, mr=0x556dfe71c0e0) at /root/Projects/qemu/exec.c:3189 ---Type <return> to continue, or q <return> to quit--- #7 0x0000556dfaa9617c in flatview_read (fv=0x7fb6c028aa80, addr=1021, attrs=..., buf=0x7fb6e739f000 "", len=1) at /root/Projects/qemu/exec.c:3255 #8 0x0000556dfaa9629f in address_space_read_full (as=<optimized out>, addr=addr@entry=1021, attrs=..., buf=<optimized out>, len=len@entry=1) at /root/Projects/qemu/exec.c:3268 #9 0x0000556dfaa963fa in address_space_rw (as=<optimized out>, addr=addr@entry=1021, attrs=..., attrs@entry=..., buf=<optimized out>, len=len@entry=1, is_write=is_write@entry=false) at /root/Projects/qemu/exec.c:3298 #10 0x0000556dfaaf6536 in kvm_cpu_exec (count=1, size=1, direction=<optimized out>, data=<optimized out>, attrs=..., port=1021) at /root/Projects/qemu/accel/kvm/kvm-all.c:1730 #11 0x0000556dfaaf6536 in kvm_cpu_exec (cpu=cpu@entry=0x556dfd3d8020) at /root/Projects/qemu/accel/kvm/kvm-all.c:1970 #12 0x0000556dfaad0006 in qemu_kvm_cpu_thread_fn (arg=0x556dfd3d8020) at /root/Projects/qemu/cpus.c:1215 #13 0x00007fb6e5915dd5 in start_thread (arg=0x7fb6c7fff700) at pthread_create.c:308 #14 0x00007fb6d9e97b3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The owner is LWP 31634, the main thread (gdb) p qemu_global_mutex $1 = {lock = {__data = {__lock = 2, __count = 0, __owner = 31634, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\222{\000\000\001", '\000' <repeats 26 times>, __align = 2}, initialized = true} - The main thead is in the callback Thread 1 (Thread 0x7fb6e7355cc0 (LWP 31634)): ---Type <return> to continue, or q <return> to quit--- #0 0x0000556dfac59420 in serial_watch_cb (chan=0x556dfd321400, cond=G_IO_OUT, opaque=0x556dfe71c000) at hw/char/serial.c:233 #1 0x00007fb6e68638f9 in g_main_context_dispatch (context=0x556dfd314210) at gmain.c:3146 #2 0x00007fb6e68638f9 in g_main_context_dispatch (context=context@entry=0x556dfd314210) at gmain.c:3811 #3 0x0000556dfaeba126 in main_loop_wait () at util/main-loop.c:215 #4 0x0000556dfaeba126 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:263 #5 0x0000556dfaeba126 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:522 #6 0x0000556dfaa89a2f in main () at vl.c:1943 #7 0x0000556dfaa89a2f in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4679 Given your question, I was wondering what would happen if the callback never fired. The result is much, much worse, as QEMU doesn't set UART_LSR_TEMT, and the Guest OS (RHEL7.5) keeps polling the serial device for ~10 seconds. Of course this can be regarded as a issue if the Guest's serial driver, but I'm pretty sure we don't want to break it ;-) This is not the first time I'm fighting the retry logic in serial.c. Some versions back, serial_xmit would happily add up watchers with no limit, eventually exceeding FD_SETSIZE limit. I'm starting to thing that tsr_retry logic does more harm than good. In any case, I think the question on the table is when should serial_xmit retry writing to the FE, and we seem to have two options: 1. When ret != 1 and errno == EAGAIN. 2. When ret == 0. I'm good with either, so I leave the decision to you. Sergio. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2018-06-01 16:20 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-05-31 7:45 [Qemu-devel] [PATCH 0/3] Avoid retrying on serial_xmit with EPIPE Sergio Lopez 2018-05-31 7:45 ` [Qemu-devel] [PATCH 1/3] io: Implement QIO_CHANNEL_ERR_BROKEN Sergio Lopez 2018-06-01 11:50 ` Daniel P. Berrangé 2018-05-31 7:46 ` [Qemu-devel] [PATCH 2/3] chardev: On QIO_CHANNEL_ERR_BROKEN set errno to EPIPE Sergio Lopez 2018-05-31 9:48 ` Marc-André Lureau 2018-05-31 11:03 ` Sergio Lopez 2018-06-01 11:51 ` Daniel P. Berrangé 2018-05-31 7:46 ` [Qemu-devel] [PATCH 3/3] hw/char/serial: Don't retry on serial_xmit if errno == EPIPE Sergio Lopez 2018-05-31 9:52 ` Marc-André Lureau 2018-05-31 11:03 ` Sergio Lopez 2018-05-31 11:07 ` Marc-André Lureau 2018-06-01 11:52 ` Daniel P. Berrangé 2018-06-01 12:05 ` Sergio Lopez 2018-06-01 12:16 ` Paolo Bonzini 2018-06-01 12:28 ` Sergio Lopez 2018-06-01 12:31 ` Paolo Bonzini 2018-06-01 12:43 ` Peter Maydell 2018-06-01 16:20 ` Sergio Lopez
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).