* [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Luiz Capitulino @ 2011-09-01 19:35 UTC
To: qemu-devel; +Cc: Marian Krcmarik, Alon Levy

Sometimes, when having lots of VMs running on a RHEV host and the user
attempts to close a SPICE window, libvirt will get corrupted json from
QEMU.

After some investigation, I found out that the problem is that different
SPICE threads are calling monitor functions (such as
monitor_protocol_event()) in parallel which causes concurrent access
to the monitor's internal buffer outbuf[].

This fixes the problem by protecting accesses to outbuf[] with a mutex.

Honestly speaking, I'm not completely sure this is the best thing to do
because the monitor itself and other qemu subsystems are not thread safe,
so having subsystems like SPICE assuming the contrary seems a bit
catastrophic to me...

Anyways, this commit fixes the problem at hand.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 monitor.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/monitor.c b/monitor.c
index 04f465a..61d4d93 100644
--- a/monitor.c
+++ b/monitor.c
@@ -57,6 +57,7 @@
 #include "json-parser.h"
 #include "osdep.h"
 #include "cpu.h"
+#include "qemu-thread.h"
 #ifdef CONFIG_SIMPLE_TRACE
 #include "trace.h"
 #endif
@@ -144,6 +145,7 @@ struct Monitor {
     int suspend_cnt;
     uint8_t outbuf[1024];
     int outbuf_index;
+    QemuMutex mutex;
     ReadLineState *rs;
     MonitorControl *mc;
     CPUState *mon_cpu;
@@ -246,10 +248,14 @@ static int monitor_read_password(Monitor *mon, ReadLineFunc *readline_func,
 
 void monitor_flush(Monitor *mon)
 {
+    qemu_mutex_lock(&mon->mutex);
+
     if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
         qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
         mon->outbuf_index = 0;
     }
+
+    qemu_mutex_unlock(&mon->mutex);
 }
 
 /* flush at every end of line or if the buffer is full */
@@ -257,6 +263,8 @@ static void monitor_puts(Monitor *mon, const char *str)
 {
     char c;
 
+    qemu_mutex_lock(&mon->mutex);
+
     for(;;) {
         c = *str++;
         if (c == '\0')
@@ -265,9 +273,14 @@ static void monitor_puts(Monitor *mon, const char *str)
             mon->outbuf[mon->outbuf_index++] = '\r';
         mon->outbuf[mon->outbuf_index++] = c;
         if (mon->outbuf_index >= (sizeof(mon->outbuf) - 1)
-            || c == '\n')
+            || c == '\n') {
+            qemu_mutex_unlock(&mon->mutex);
             monitor_flush(mon);
+            qemu_mutex_lock(&mon->mutex);
+        }
     }
+
+    qemu_mutex_unlock(&mon->mutex);
 }
 
 void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
@@ -5275,6 +5288,7 @@ void monitor_init(CharDriverState *chr, int flags)
 
     mon = g_malloc0(sizeof(*mon));
 
+    qemu_mutex_init(&mon->mutex);
     mon->chr = chr;
     mon->flags = flags;
     if (flags & MONITOR_USE_READLINE) {
-- 
1.7.7.rc0.72.g4b5ea

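One detail of the patch above is that monitor_puts() drops the mutex before
calling monitor_flush() and takes it again afterwards, which leaves a window
in which another thread can append to the buffer. A minimal standalone sketch
of one alternative, keeping the lock held across both append and flush by
using a flush helper that expects the lock to be held, might look like this.
It uses plain pthreads instead of QemuMutex, and OutBuf, outbuf_puts() and
the sink array are hypothetical names for illustration only, not QEMU code.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for the monitor's outbuf[]; not QEMU code. */
typedef struct {
    char buf[64];           /* pending output, flushed line by line */
    size_t len;
    char sink[1024];        /* stands in for the chardev backend */
    size_t sink_len;
    pthread_mutex_t lock;   /* guards every field above */
} OutBuf;

static void outbuf_init(OutBuf *o)
{
    memset(o, 0, sizeof(*o));
    pthread_mutex_init(&o->lock, NULL);
}

/* Caller must already hold o->lock. */
static void outbuf_flush_locked(OutBuf *o)
{
    assert(o->sink_len + o->len <= sizeof(o->sink));
    memcpy(o->sink + o->sink_len, o->buf, o->len);
    o->sink_len += o->len;
    o->len = 0;
}

/* Append str, flushing at each newline or when the buffer is full. */
static void outbuf_puts(OutBuf *o, const char *str)
{
    pthread_mutex_lock(&o->lock);
    for (; *str != '\0'; str++) {
        o->buf[o->len++] = *str;
        if (o->len == sizeof(o->buf) || *str == '\n') {
            outbuf_flush_locked(o);
        }
    }
    pthread_mutex_unlock(&o->lock);
}
```

This only keeps each line intact inside the buffer itself. It says nothing
about whatever sits below the flush, which is the objection raised in the
replies that follow.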
* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Daniel P. Berrange @ 2011-09-01 19:47 UTC
To: Luiz Capitulino; +Cc: Marian Krcmarik, Alon Levy, qemu-devel

On Thu, Sep 01, 2011 at 04:35:45PM -0300, Luiz Capitulino wrote:
> Sometimes, when having lots of VMs running on a RHEV host and the user
> attempts to close a SPICE window, libvirt will get corrupted json from
> QEMU.
>
> After some investigation, I found out that the problem is that different
> SPICE threads are calling monitor functions (such as
> monitor_protocol_event()) in parallel which causes concurrent access
> to the monitor's internal buffer outbuf[].
>
> This fixes the problem by protecting accesses to outbuf[] with a mutex.
>
> Honestly speaking, I'm not completely sure this is the best thing to do
> because the monitor itself and other qemu subsystems are not thread safe,
> so having subsystems like SPICE assuming the contrary seems a bit
> catastrophic to me...
>
> Anyways, this commit fixes the problem at hand.

IMHO this patch should be applied to stable-0.15 as is, since it is
an important fix for SPICE, and this highly targeted mutex lock has
a low risk of regressions elsewhere.

I'd also apply it for master now, but at the same time perhaps start
work on adding broader locking that covers all APIs that monitor.c
exposes to internal QEMU code, so we're future proofed against other
surprises.

> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Jan Kiszka @ 2011-09-01 21:03 UTC
To: Luiz Capitulino; +Cc: Marian Krcmarik, Alon Levy, qemu-devel

On 2011-09-01 21:35, Luiz Capitulino wrote:
> Sometimes, when having lots of VMs running on a RHEV host and the user
> attempts to close a SPICE window, libvirt will get corrupted json from
> QEMU.
>
> After some investigation, I found out that the problem is that different
> SPICE threads are calling monitor functions (such as
> monitor_protocol_event()) in parallel which causes concurrent access
> to the monitor's internal buffer outbuf[].
>
> This fixes the problem by protecting accesses to outbuf[] with a mutex.
>
> Honestly speaking, I'm not completely sure this is the best thing to do
> because the monitor itself and other qemu subsystems are not thread safe,
> so having subsystems like SPICE assuming the contrary seems a bit
> catastrophic to me...

I fully agree.

...

> @@ -246,10 +248,14 @@ static int monitor_read_password(Monitor *mon, ReadLineFunc *readline_func,
>
>  void monitor_flush(Monitor *mon)
>  {
> +    qemu_mutex_lock(&mon->mutex);
> +
>      if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
>          qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
>          mon->outbuf_index = 0;
>      }
> +
> +    qemu_mutex_unlock(&mon->mutex);

Here is another example of things that can break due to "optimistic"
parallelization: What protects the chardev state that will be touched by
calling qemu_chr_fe_write? Even when ignoring mux'ed channels for now, I
bet there are code paths that modify the state without holding the
frontend lock (i.e. Monitor::mutex).

Jan

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Anthony Liguori @ 2011-09-02 1:34 UTC
To: Luiz Capitulino; +Cc: Marian Krcmarik, Alon Levy, qemu-devel

On 09/01/2011 02:35 PM, Luiz Capitulino wrote:
> Sometimes, when having lots of VMs running on a RHEV host and the user
> attempts to close a SPICE window, libvirt will get corrupted json from
> QEMU.
>
> After some investigation, I found out that the problem is that different
> SPICE threads are calling monitor functions (such as
> monitor_protocol_event()) in parallel which causes concurrent access
> to the monitor's internal buffer outbuf[].
>
> This fixes the problem by protecting accesses to outbuf[] with a mutex.
>
> Honestly speaking, I'm not completely sure this is the best thing to do
> because the monitor itself and other qemu subsystems are not thread safe,
> so having subsystems like SPICE assuming the contrary seems a bit
> catastrophic to me...
>
> Anyways, this commit fixes the problem at hand.

Nack.

This is absolutely a Spice bug. Spice should not be calling into QEMU
code from multiple threads. It should only call into QEMU code while
it's holding the qemu_mutex.

The right way to fix this is probably to make all of the
SpiceCoreInterface callbacks simply write to a file descriptor which can
then wake up QEMU to do the operation on behalf of it. It's ugly but
the libspice interface is far too tied to QEMU internals in the first
place which is the root of the problem.

Regards,

Anthony Liguori

>
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  monitor.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/monitor.c b/monitor.c
> index 04f465a..61d4d93 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -57,6 +57,7 @@
>  #include "json-parser.h"
>  #include "osdep.h"
>  #include "cpu.h"
> +#include "qemu-thread.h"
>  #ifdef CONFIG_SIMPLE_TRACE
>  #include "trace.h"
>  #endif
> @@ -144,6 +145,7 @@ struct Monitor {
>      int suspend_cnt;
>      uint8_t outbuf[1024];
>      int outbuf_index;
> +    QemuMutex mutex;
>      ReadLineState *rs;
>      MonitorControl *mc;
>      CPUState *mon_cpu;
> @@ -246,10 +248,14 @@ static int monitor_read_password(Monitor *mon, ReadLineFunc *readline_func,
>
>  void monitor_flush(Monitor *mon)
>  {
> +    qemu_mutex_lock(&mon->mutex);
> +
>      if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
>          qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
>          mon->outbuf_index = 0;
>      }
> +
> +    qemu_mutex_unlock(&mon->mutex);
>  }
>
>  /* flush at every end of line or if the buffer is full */
> @@ -257,6 +263,8 @@ static void monitor_puts(Monitor *mon, const char *str)
>  {
>      char c;
>
> +    qemu_mutex_lock(&mon->mutex);
> +
>      for(;;) {
>          c = *str++;
>          if (c == '\0')
> @@ -265,9 +273,14 @@ static void monitor_puts(Monitor *mon, const char *str)
>              mon->outbuf[mon->outbuf_index++] = '\r';
>          mon->outbuf[mon->outbuf_index++] = c;
>          if (mon->outbuf_index >= (sizeof(mon->outbuf) - 1)
> -            || c == '\n')
> +            || c == '\n') {
> +            qemu_mutex_unlock(&mon->mutex);
>              monitor_flush(mon);
> +            qemu_mutex_lock(&mon->mutex);
> +        }
>      }
> +
> +    qemu_mutex_unlock(&mon->mutex);
>  }
>
>  void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
> @@ -5275,6 +5288,7 @@ void monitor_init(CharDriverState *chr, int flags)
>
>      mon = g_malloc0(sizeof(*mon));
>
> +    qemu_mutex_init(&mon->mutex);
>      mon->chr = chr;
>      mon->flags = flags;
>      if (flags & MONITOR_USE_READLINE) {

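The file descriptor idea in this message can be sketched with a plain POSIX
pipe: worker threads only write a small event record, and the main loop,
which already holds the global lock, reads it back and does the real work.
This is a rough illustration under stated assumptions, not how QEMU or
libspice implement anything; EventPipe and its three functions are
hypothetical names, and the event is reduced to a bare int.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical event channel; names are illustrative, not QEMU or libspice API. */
typedef struct {
    int rfd;   /* read end, polled by the main loop */
    int wfd;   /* write end, shared by worker threads */
} EventPipe;

static int event_pipe_init(EventPipe *p)
{
    int fds[2];

    assert(p != NULL);
    if (pipe(fds) != 0) {
        return -1;
    }
    p->rfd = fds[0];
    p->wfd = fds[1];
    return 0;
}

/* Safe from any thread: a pipe write of at most PIPE_BUF bytes is atomic. */
static int event_pipe_post(EventPipe *p, int event)
{
    ssize_t n = write(p->wfd, &event, sizeof(event));
    return n == (ssize_t)sizeof(event) ? 0 : -1;
}

/* Main loop only: blocks until one event is available, then returns it. */
static int event_pipe_take(EventPipe *p, int *event)
{
    ssize_t n = read(p->rfd, event, sizeof(*event));
    return n == (ssize_t)sizeof(*event) ? 0 : -1;
}
```

The design relies on the POSIX guarantee that small pipe writes are not
interleaved, so the posting side needs no lock of its own. A real event
would carry more than an int, which is where ownership of the payload has
to be settled.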
* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Daniel P. Berrange @ 2011-09-02 9:41 UTC
To: Anthony Liguori; +Cc: Marian Krcmarik, Alon Levy, qemu-devel, Luiz Capitulino

On Thu, Sep 01, 2011 at 08:34:35PM -0500, Anthony Liguori wrote:
> On 09/01/2011 02:35 PM, Luiz Capitulino wrote:
> > Sometimes, when having lots of VMs running on a RHEV host and the user
> > attempts to close a SPICE window, libvirt will get corrupted json from
> > QEMU.
> >
> > After some investigation, I found out that the problem is that different
> > SPICE threads are calling monitor functions (such as
> > monitor_protocol_event()) in parallel which causes concurrent access
> > to the monitor's internal buffer outbuf[].
> >
> > This fixes the problem by protecting accesses to outbuf[] with a mutex.
> >
> > Honestly speaking, I'm not completely sure this is the best thing to do
> > because the monitor itself and other qemu subsystems are not thread safe,
> > so having subsystems like SPICE assuming the contrary seems a bit
> > catastrophic to me...
> >
> > Anyways, this commit fixes the problem at hand.
>
> Nack.
>
> This is absolutely a Spice bug. Spice should not be calling into
> QEMU code from multiple threads. It should only call into QEMU code
> while it's holding the qemu_mutex.
>
> The right way to fix this is probably to make all of the
> SpiceCoreInterface callbacks simply write to a file descriptor which
> can then wake up QEMU to do the operation on behalf of it. It's
> ugly but the libspice interface is far too tied to QEMU internals in
> the first place which is the root of the problem.

This feels like a rather short-term approach to fixing the problem
to me. As QEMU becomes increasingly multi-threaded, there is a high
likelihood that we'll get other code in QEMU which wants to use the
monitor from multiple threads. The monitor code in QEMU is fairly
well isolated & thus comparatively easy to make threadsafe, so I
don't see why we wouldn't want to do that & avoid any chance of this
type of problem recurring in the future.

IMHO, "fixing" SPICE is not fixing the bug at all, it is just removing
the trigger of the bug in the monitor.

Regards,
Daniel

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Jan Kiszka @ 2011-09-02 11:26 UTC
To: Daniel P. Berrange
Cc: Marian Krcmarik, Alon Levy, qemu-devel, Luiz Capitulino

On 2011-09-02 11:41, Daniel P. Berrange wrote:
> On Thu, Sep 01, 2011 at 08:34:35PM -0500, Anthony Liguori wrote:
>> On 09/01/2011 02:35 PM, Luiz Capitulino wrote:
>>> Sometimes, when having lots of VMs running on a RHEV host and the user
>>> attempts to close a SPICE window, libvirt will get corrupted json from
>>> QEMU.
>>>
>>> After some investigation, I found out that the problem is that different
>>> SPICE threads are calling monitor functions (such as
>>> monitor_protocol_event()) in parallel which causes concurrent access
>>> to the monitor's internal buffer outbuf[].
>>>
>>> This fixes the problem by protecting accesses to outbuf[] with a mutex.
>>>
>>> Honestly speaking, I'm not completely sure this is the best thing to do
>>> because the monitor itself and other qemu subsystems are not thread safe,
>>> so having subsystems like SPICE assuming the contrary seems a bit
>>> catastrophic to me...
>>>
>>> Anyways, this commit fixes the problem at hand.
>>
>> Nack.
>>
>> This is absolutely a Spice bug. Spice should not be calling into
>> QEMU code from multiple threads. It should only call into QEMU code
>> while it's holding the qemu_mutex.
>>
>> The right way to fix this is probably to make all of the
>> SpiceCoreInterface callbacks simply write to a file descriptor which
>> can then wake up QEMU to do the operation on behalf of it. It's
>> ugly but the libspice interface is far too tied to QEMU internals in
>> the first place which is the root of the problem.
>
> This feels like a rather short-term approach to fixing the problem
> to me. As QEMU becomes increasingly multi-threaded, there is a high
> likelihood that we'll get other code in QEMU which wants to use the
> monitor from multiple threads. The monitor code in QEMU is fairly
> well isolated & thus comparatively easy to make threadsafe, so I

As pointed out before, this assumption is not correct.

> don't see why we wouldn't want to do that & avoid any chance of this
> type of problem recurring in the future.
>
> IMHO, "fixing" SPICE is not fixing the bug at all, it is just removing
> the trigger of the bug in the monitor.

Until we have officially thread-safe subsystems, SPICE must take the
qemu_global_mutex before calling core services. This patch does not make
the monitor thread-safe as it does not address indirectly called services.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Gerd Hoffmann @ 2011-09-02 13:39 UTC
To: Anthony Liguori
Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

  Hi,

>> After some investigation, I found out that the problem is that different
>> SPICE threads are calling monitor functions (such as
>> monitor_protocol_event()) in parallel which causes concurrent access
>> to the monitor's internal buffer outbuf[].

[ adding spice-list to Cc, see qemu-devel for the rest of the thread ]

spice isn't supposed to do that.

/me just added an assert in channel_event() and saw it trigger in display
channel disconnects.

#0  0x0000003ceba32a45 in raise () from /lib64/libc.so.6
#1  0x0000003ceba34225 in abort () from /lib64/libc.so.6
#2  0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
#3  0x0000000000503759 in channel_event (event=3, info=0x35e9340)
    at /home/kraxel/projects/qemu/ui/spice-core.c:223
#4  0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
#5  reds_stream_free (s=0x35e92c0) at reds.c:4981
#6  0x00007f9a77aac8b0 in red_disconnect_channel (channel=0x7f9a24069a80)
    at red_worker.c:8489
#7  0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20,
    events=<value optimized out>) at red_worker.c:10062
#8  0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>)
    at red_worker.c:10304
#9  0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003cebae68ed in clone () from /lib64/libc.so.6

IMHO spice server should handle the display channel tear-down in the
dispatcher instead of the worker thread. Alon?

>> Anyways, this commit fixes the problem at hand.

Not really. channel_event() itself isn't thread-safe either, it does
unlocked list operations which can also blow up when called from
different threads.

A patch like the attached (warning: untested) should do as quick&dirty
fix for stable. But IMO we really should fix spice instead.

cheers,
  Gerd

[-- Attachment #2: 0001-spice-workaround-a-spice-server-bug.patch --]

From 7496e573ff6085d3c42d7e65b72c85fd2a7b4a78 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Fri, 2 Sep 2011 15:03:28 +0200
Subject: [PATCH] spice: workaround a spice server bug.

---
 ui/spice-core.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index dba11f0..c99cdc5 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -19,6 +19,7 @@
 #include <spice-experimental.h>
 
 #include <netdb.h>
+#include <pthread.h>
 
 #include "qemu-common.h"
 #include "qemu-spice.h"
@@ -44,6 +45,8 @@ static char *auth_passwd;
 static time_t auth_expires = TIME_MAX;
 int using_spice = 0;
 
+static pthread_t me;
+
 struct SpiceTimer {
     QEMUTimer *timer;
     QTAILQ_ENTRY(SpiceTimer) next;
@@ -216,6 +219,8 @@ static void channel_event(int event, SpiceChannelEventInfo *info)
     };
     QDict *server, *client;
     QObject *data;
+    bool need_lock = !pthread_equal(me, pthread_self());
+    static int first = 1;
 
     client = qdict_new();
     add_addr_info(client, &info->paddr, info->plen);
@@ -223,6 +228,14 @@ static void channel_event(int event, SpiceChannelEventInfo *info)
     server = qdict_new();
     add_addr_info(server, &info->laddr, info->llen);
 
+    if (need_lock) {
+        qemu_mutex_lock_iothread();
+        if (first) {
+            fprintf(stderr, "You are using a broken spice-server version\n");
+            first = 0;
+        }
+    }
+
     if (event == SPICE_CHANNEL_EVENT_INITIALIZED) {
         qdict_put(server, "auth", qstring_from_str(auth));
         add_channel_info(client, info);
@@ -236,6 +249,10 @@ static void channel_event(int event, SpiceChannelEventInfo *info)
                               QOBJECT(client), QOBJECT(server));
     monitor_protocol_event(qevent[event], data);
     qobject_decref(data);
+
+    if (need_lock) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 #else /* SPICE_INTERFACE_CORE_MINOR >= 3 */
@@ -482,7 +499,9 @@ void qemu_spice_init(void)
     spice_image_compression_t compression;
     spice_wan_compression_t wan_compr;
 
-    if (!opts) {
+    me = pthread_self();
+
+    if (!opts) {
         return;
     }
     port = qemu_opt_get_number(opts, "port", 0);
-- 
1.7.1

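The core of the workaround above is a thread identity check: remember which
thread is the main one, and take the global lock only when called from some
other thread. A stripped-down sketch of that pattern outside QEMU might look
like this. big_lock stands in for the iothread lock, and every name here is
hypothetical. It assumes, as the patch does, that the main thread already
holds the lock whenever it makes the call.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins: big_lock plays the role of QEMU's global mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t main_thread;
static int events_seen;
static int last_event;

/* Call once from the thread that normally holds big_lock. */
static void remember_main_thread(void)
{
    main_thread = pthread_self();
}

/* Returns 1 if the caller was a foreign thread and the lock was taken. */
static int emit_event(int event)
{
    int need_lock = !pthread_equal(main_thread, pthread_self());

    assert(event >= 0);
    if (need_lock) {
        pthread_mutex_lock(&big_lock);
    }
    last_event = event;     /* stands in for monitor_protocol_event() */
    events_seen++;
    if (need_lock) {
        pthread_mutex_unlock(&big_lock);
    }
    return need_lock;
}
```

The check keeps the main thread from locking a mutex it already owns, at the
cost of baking an assumption about the caller into the callee.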
* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Anthony Liguori @ 2011-09-02 14:03 UTC
To: Gerd Hoffmann
Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

On 09/02/2011 08:39 AM, Gerd Hoffmann wrote:
> Hi,
>
>>> After some investigation, I found out that the problem is that different
>>> SPICE threads are calling monitor functions (such as
>>> monitor_protocol_event()) in parallel which causes concurrent access
>>> to the monitor's internal buffer outbuf[].
>
> [ adding spice-list to Cc, see qemu-devel for the rest of the thread ]
>
> spice isn't supposed to do that.
>
> /me just added an assert in channel_event() and saw it trigger in display
> channel disconnects.
>
> #0 0x0000003ceba32a45 in raise () from /lib64/libc.so.6
> #1 0x0000003ceba34225 in abort () from /lib64/libc.so.6
> #2 0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
> #3 0x0000000000503759 in channel_event (event=3, info=0x35e9340)
> at /home/kraxel/projects/qemu/ui/spice-core.c:223
> #4 0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
> #5 reds_stream_free (s=0x35e92c0) at reds.c:4981
> #6 0x00007f9a77aac8b0 in red_disconnect_channel (channel=0x7f9a24069a80)
> at red_worker.c:8489
> #7 0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20,
> events=<value optimized out>)
> at red_worker.c:10062
> #8 0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at
> red_worker.c:10304
> #9 0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003cebae68ed in clone () from /lib64/libc.so.6
>
> IMHO spice server should handle the display channel tear-down in the
> dispatcher instead of the worker thread. Alon?
>
>>> Anyways, this commit fixes the problem at hand.
>
> Not really. channel_event() itself isn't thread-safe either, it does
> unlocked list operations which can also blow up when called from
> different threads.
>
> A patch like the attached (warning: untested) should do as quick&dirty
> fix for stable. But IMO we really should fix spice instead.

Spice should not be calling *any* QEMU code without holding the global
mutex. That includes all of the QObject interactions.

Regards,

Anthony Liguori

>
> cheers,
> Gerd
>

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Luiz Capitulino @ 2011-09-02 14:24 UTC
To: Gerd Hoffmann; +Cc: spice-devel, Marian Krcmarik, Alon Levy, qemu-devel

On Fri, 02 Sep 2011 15:39:03 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

> Hi,
>
> >> After some investigation, I found out that the problem is that different
> >> SPICE threads are calling monitor functions (such as
> >> monitor_protocol_event()) in parallel which causes concurrent access
> >> to the monitor's internal buffer outbuf[].
>
> [ adding spice-list to Cc, see qemu-devel for the rest of the thread ]
>
> spice isn't supposed to do that.
>
> /me just added an assert in channel_event() and saw it trigger in display
> channel disconnects.
>
> #0 0x0000003ceba32a45 in raise () from /lib64/libc.so.6
> #1 0x0000003ceba34225 in abort () from /lib64/libc.so.6
> #2 0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
> #3 0x0000000000503759 in channel_event (event=3, info=0x35e9340)
> at /home/kraxel/projects/qemu/ui/spice-core.c:223
> #4 0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
> #5 reds_stream_free (s=0x35e92c0) at reds.c:4981
> #6 0x00007f9a77aac8b0 in red_disconnect_channel
> (channel=0x7f9a24069a80) at red_worker.c:8489
> #7 0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20,
> events=<value optimized out>)
> at red_worker.c:10062
> #8 0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at
> red_worker.c:10304
> #9 0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003cebae68ed in clone () from /lib64/libc.so.6
>
> IMHO spice server should handle the display channel tear-down in the
> dispatcher instead of the worker thread. Alon?
>
> >> Anyways, this commit fixes the problem at hand.
>
> Not really. channel_event() itself isn't thread-safe either, it does
> unlocked list operations which can also blow up when called from
> different threads.

I thought my patch was at least a candidate for stable, but after this
thread I'm convinced the problem should be fixed in spice instead.

>
> A patch like the attached (warning: untested) should do as quick&dirty
> fix for stable. But IMO we really should fix spice instead.
>
> cheers,
> Gerd
>

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Anthony Liguori @ 2011-09-02 14:28 UTC
To: Gerd Hoffmann
Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

On 09/02/2011 08:39 AM, Gerd Hoffmann wrote:
> Hi,
>
>>> After some investigation, I found out that the problem is that different
>>> SPICE threads are calling monitor functions (such as
>>> monitor_protocol_event()) in parallel which causes concurrent access
>>> to the monitor's internal buffer outbuf[].
>
> [ adding spice-list to Cc, see qemu-devel for the rest of the thread ]
>
> spice isn't supposed to do that.
>
> /me just added an assert in channel_event() and saw it trigger in display
> channel disconnects.
>
> #0 0x0000003ceba32a45 in raise () from /lib64/libc.so.6
> #1 0x0000003ceba34225 in abort () from /lib64/libc.so.6
> #2 0x0000003ceba2b9d5 in __assert_fail () from /lib64/libc.so.6
> #3 0x0000000000503759 in channel_event (event=3, info=0x35e9340)
> at /home/kraxel/projects/qemu/ui/spice-core.c:223
> #4 0x00007f9a77a9921b in reds_channel_event (s=0x35e92c0) at reds.c:400
> #5 reds_stream_free (s=0x35e92c0) at reds.c:4981
> #6 0x00007f9a77aac8b0 in red_disconnect_channel (channel=0x7f9a24069a80)
> at red_worker.c:8489
> #7 0x00007f9a77ab53a8 in handle_dev_input (listener=0x7f9a3211ab20,
> events=<value optimized out>)
> at red_worker.c:10062
> #8 0x00007f9a77ab436d in red_worker_main (arg=<value optimized out>) at
> red_worker.c:10304
> #9 0x0000003cec2077e1 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003cebae68ed in clone () from /lib64/libc.so.6
>
> IMHO spice server should handle the display channel tear-down in the
> dispatcher instead of the worker thread. Alon?
>
>>> Anyways, this commit fixes the problem at hand.
>
> Not really. channel_event() itself isn't thread-safe either, it does
> unlocked list operations which can also blow up when called from
> different threads.
>
> A patch like the attached (warning: untested) should do as quick&dirty
> fix for stable. But IMO we really should fix spice instead.

I agree. I'm not sure I like the idea of still calling QEMU code without
holding the mutex (even the QObject code).

Can you just use a bottom half to defer this work to the I/O thread?
Bottom half scheduling has to be signal safe which means it will also be
thread safe.

Regards,

Anthony Liguori

>
> cheers,
> Gerd
>

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Gerd Hoffmann @ 2011-09-02 15:18 UTC
To: Anthony Liguori
Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

  Hi,

>> A patch like the attached (warning: untested) should do as quick&dirty
>> fix for stable. But IMO we really should fix spice instead.
>
> I agree. I'm not sure I like the idea of still calling QEMU code without
> holding the mutex (even the QObject code).

I thought just creating the objects isn't an issue, but if you disagree
we can just move up the lock to the head of the function.

> Can you just use a bottom half to defer this work to the I/O thread?
> Bottom half scheduling has to be signal safe which means it will also be
> thread safe.

Not that straightforward as I would have to pass arguments to the
bottom half.

cheers,
  Gerd

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Anthony Liguori @ 2011-09-02 15:20 UTC
To: Gerd Hoffmann
Cc: Marian Krcmarik, Alon Levy, qemu-devel, spice-devel, Luiz Capitulino

On 09/02/2011 10:18 AM, Gerd Hoffmann wrote:
> Hi,
>
>>> A patch like the attached (warning: untested) should do as quick&dirty
>>> fix for stable. But IMO we really should fix spice instead.
>>
>> I agree. I'm not sure I like the idea of still calling QEMU code without
>> holding the mutex (even the QObject code).
>
> I thought just creating the objects isn't an issue, but if you disagree
> we can just move up the lock to the head of the function.

What I fear is that Spice will assume something is thread safe, but then
someone will make a change that makes the subsystem non-reentrant.

I'd rather that we have very clear rules about what's thread safe and
not thread safe. If you want to audit the QObject subsystem, declare it
thread safe, and document it as such, that would be okay. But it needs
to be systematic, not ad-hoc.

Regards,

Anthony Liguori

>
>> Can you just use a bottom half to defer this work to the I/O thread?
>> Bottom half scheduling has to be signal safe which means it will also be
>> thread safe.
>
> Not that straightforward as I would have to pass arguments to the
> bottom half.
>
> cheers,
> Gerd
>
>

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Paolo Bonzini @ 2011-09-02 15:31 UTC
To: Gerd Hoffmann
Cc: qemu-devel, Luiz Capitulino, Marian Krcmarik, Alon Levy, spice-devel

On 09/02/2011 05:18 PM, Gerd Hoffmann wrote:
>
>> Can you just use a bottom half to defer this work to the I/O thread?
>> Bottom half scheduling has to be signal safe which means it will also be
>> thread safe.
>
> Not that straightforward as I would have to pass arguments to the
> bottom half.

Can you add a variant of qemu_bh_new that accepts a sizeof for the new
bottom half? Then the bottom half itself can be passed as the opaque and
used for the arguments.

Paolo

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Anthony Liguori @ 2011-09-02 15:37 UTC
To: Paolo Bonzini
Cc: qemu-devel, Luiz Capitulino, Marian Krcmarik, Alon Levy, Gerd Hoffmann, spice-devel

On 09/02/2011 10:31 AM, Paolo Bonzini wrote:
> On 09/02/2011 05:18 PM, Gerd Hoffmann wrote:
>>
>>> Can you just use a bottom half to defer this work to the I/O thread?
>>> Bottom half scheduling has to be signal safe which means it will also be
>>> thread safe.
>>
>> Not that straightforward as I would have to pass arguments to the
>> bottom half.
>
> Can you add a variant of qemu_bh_new that accepts a sizeof for the new
> bottom half? Then the bottom half itself can be passed as the opaque and
> used for the arguments.

Bottom halves are opaque to the caller. Passing arguments would require
careful consideration of locking too.

I think the best way to resolve this is to fix libspice and not try to
work around the problem in QEMU.

Regards,

Anthony Liguori

>
> Paolo

* Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access
From: Gerd Hoffmann @ 2011-09-05 7:48 UTC
To: Paolo Bonzini
Cc: qemu-devel, Luiz Capitulino, Marian Krcmarik, Alon Levy, spice-devel

On 09/02/11 17:31, Paolo Bonzini wrote:
> On 09/02/2011 05:18 PM, Gerd Hoffmann wrote:
>>
>>> Can you just use a bottom half to defer this work to the I/O thread?
>>> Bottom half scheduling has to be signal safe which means it will also be
>>> thread safe.
>>
>> Not that straightforward as I would have to pass arguments to the
>> bottom half.
>
> Can you add a variant of qemu_bh_new that accepts a sizeof for the new
> bottom half? Then the bottom half itself can be passed as the opaque and
> used for the arguments.

That wouldn't help. I would have to create some kind of job queue which
is then processed by the bottom half.

cheers,
  Gerd

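The job queue mentioned in this last message could be sketched roughly as
follows: any thread appends a small job under a mutex, and the main thread
later detaches the whole list and processes it without holding the queue
lock. This is only an illustration under assumptions. Job, JobQueue and the
functions below are hypothetical names, the payload is reduced to an int
event code, and the point where a real implementation would schedule the
bottom half is marked with a comment only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical deferred-event queue; not QEMU or libspice code. */
typedef struct Job {
    int event;
    struct Job *next;
} Job;

typedef struct {
    Job *head, *tail;
    pthread_mutex_t lock;   /* guards head and tail only */
} JobQueue;

static void job_queue_init(JobQueue *q)
{
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
}

/* Callable from any thread. Returns 0 on success, -1 if out of memory. */
static int job_queue_push(JobQueue *q, int event)
{
    Job *j = malloc(sizeof(*j));

    if (j == NULL) {
        return -1;
    }
    j->event = event;
    j->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL) {
        q->tail->next = j;
    } else {
        q->head = j;
    }
    q->tail = j;
    pthread_mutex_unlock(&q->lock);
    /* A real implementation would schedule the bottom half here. */
    return 0;
}

/* Main thread only: detach all jobs, then handle them in FIFO order. */
static int job_queue_drain(JobQueue *q,
                           void (*handle)(int event, void *opaque),
                           void *opaque)
{
    Job *j;
    int n = 0;

    assert(handle != NULL);
    pthread_mutex_lock(&q->lock);
    j = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    while (j != NULL) {
        Job *next = j->next;
        handle(j->event, opaque);
        free(j);
        j = next;
        n++;
    }
    return n;
}

/* Example handler: adds each event code to the int behind opaque. */
static void sum_events(int event, void *opaque)
{
    *(int *)opaque += event;
}
```

Detaching the list before handling keeps the queue lock hold time short, so
worker threads are never blocked behind event processing.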