From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:35237) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QzRtB-0005Ux-1c for qemu-devel@nongnu.org; Fri, 02 Sep 2011 07:26:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QzRt9-00048H-Oa for qemu-devel@nongnu.org; Fri, 02 Sep 2011 07:26:25 -0400 Received: from goliath.siemens.de ([192.35.17.28]:33229) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QzRt9-00048C-AB for qemu-devel@nongnu.org; Fri, 02 Sep 2011 07:26:23 -0400 Message-ID: <4E60BD5B.3070909@siemens.com> Date: Fri, 02 Sep 2011 13:26:19 +0200 From: Jan Kiszka MIME-Version: 1.0 References: <20110901163545.71ba1515@doriath> <4E6032AB.8080804@codemonkey.ws> <20110902094158.GA27508@redhat.com> In-Reply-To: <20110902094158.GA27508@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] monitor: Protect outbuf from concurrent access List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Daniel P. Berrange" Cc: Marian Krcmarik , Alon Levy , qemu-devel , Luiz Capitulino On 2011-09-02 11:41, Daniel P. Berrange wrote: > On Thu, Sep 01, 2011 at 08:34:35PM -0500, Anthony Liguori wrote: >> On 09/01/2011 02:35 PM, Luiz Capitulino wrote: >>> Sometimes, when having lots of VMs running on a RHEV host and the user >>> attempts to close a SPICE window, libvirt will get corrupted json from >>> QEMU. >>> >>> After some investigation, I found out that the problem is that different >>> SPICE threads are calling monitor functions (such as >>> monitor_protocol_event()) in parallel which causes concurrent access >>> to the monitor's internal buffer outbuf[]. >>> >>> This fixes the problem by protecting accesses to outbuf[] with a mutex. >>> >>> Honestly speaking, I'm not completely sure this the best thing to do >>> because the monitor itself and other qemu subsystems are not thread safe, >>> so having subsystems like SPICE assuming the contrary seems a bit >>> catastrophic to me... >>> >>> Anyways, this commit fixes the problem at hand. >> >> Nack. >> >> This is absolutely a Spice bug. Spice should not be calling into >> QEMU code from multiple threads. It should only call into QEMU code >> while it's holding the qemu_mutex. >> >> The right way to fix this is probably to make all of the >> SpiceCoreInterface callbacks simply write to a file descriptor which >> can then wake up QEMU to do the operation on behalf of it. It's >> ugly but the libspice interface is far too tied to QEMU internals in >> the first place which is the root of the problem. > > This feels like a rather short-term approach to fixing the problem > to me. As QEMU becomes increasingly multi-threaded, there is high > liklihood that we'll get other code in QEMU which wants to use the > monitor from multiple threads. The monitor code in QEMU is fairly > well isolated & thus comparatively easy to make threadsafe, so I As pointed out before, this assumption is not correct. > don't see why we wouldn't want todo that & avoid any chance of this > type of problem recurring in the future. > > IMHO, "fixing" SPICE is not fixing the bug at all, it is just removing > the trigger of the bug in the monitor. Until we have officially thread-safe subsystems, SPICE must take the qemu_global_mutex before calling core services. This patch does not make the monitor thread-safe as it does not address indirectly called services. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux