From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:58897) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RJ219-0006Cd-4C for qemu-devel@nongnu.org; Wed, 26 Oct 2011 07:51:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RJ217-0000tM-Bx for qemu-devel@nongnu.org; Wed, 26 Oct 2011 07:51:35 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56507) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RJ217-0000tE-2M for qemu-devel@nongnu.org; Wed, 26 Oct 2011 07:51:33 -0400 Date: Wed, 26 Oct 2011 13:48:39 +0200 From: Alon Levy Message-ID: <20111026113804.GB18048@bow> References: <1319457739-14562-1-git-send-email-alevy@redhat.com> <1319457739-14562-2-git-send-email-alevy@redhat.com> <4EA5808A.7050409@redhat.com> <20111024134516.57596735@doriath> <20111024172937.GC12513@bow.tlv.redhat.com> <20111024223148.47b0dec7@doriath> <20111025101309.GB3490@bow> <20111025105130.3e7ac8fa@doriath> <20111025132111.GG3490@bow> <20111025164118.45761b40@doriath> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20111025164118.45761b40@doriath> Subject: Re: [Qemu-devel] [PATCH 1/5] monitor: screen_dump async List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Luiz Capitulino Cc: qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, mlureau@redhat.com, Gerd Hoffmann , armbru@redhat.com On Tue, Oct 25, 2011 at 04:41:18PM -0200, Luiz Capitulino wrote: > On Tue, 25 Oct 2011 15:21:11 +0200 > Alon Levy wrote: > > > On Tue, Oct 25, 2011 at 10:51:30AM -0200, Luiz Capitulino wrote: > > > On Tue, 25 Oct 2011 12:13:09 +0200 > > > Alon Levy wrote: > > > > > > > On Mon, Oct 24, 2011 at 10:31:48PM -0200, Luiz Capitulino wrote: > > > > > On Mon, 24 Oct 2011 19:29:37 +0200 > > > > > Alon Levy wrote: > > > > > > > > > > > On Mon, Oct 24, 2011 at 01:45:16PM -0200, Luiz Capitulino wrote: > > > > > > > On Mon, 24 Oct 2011 17:13:14 +0200 > > > > > > > Gerd Hoffmann wrote: > > > > > > > > > > > > > > > On 10/24/11 14:02, Alon Levy wrote: > > > > > > > > > Make screen_dump monitor command an async command to allow next for qxl > > > > > > > > > to implement it as a initiating call to red_worker and completion on > > > > > > > > > callback, to fix a deadlock when issueing a screendump command via > > > > > > > > > libvirt while connected with a libvirt controlled spice-gtk client. > > > > > > > > > > > > > > > > Approach looks reasonable to me. Patch breaks the build though, you've > > > > > > > > missed a bunch of screen_dump functions in non-x86 targets. > > > > > > > > > > > > > > There are two problems actually. > > > > > > > > > > > > > > The first one is that changing an existing command from synchronous > > > > > > > to asynchronous is an incompatible change because asynchronous commands > > > > > > > semantics is different. For an example of possible problems please > > > > > > > check: https://bugzilla.redhat.com/show_bug.cgi?id=623903. > > > > > > > > > > > > > > The second problem is that the existing asynchronous interface in the > > > > > > > monitor is incomplete and has never been used for real. Our plan is to > > > > > > > use QAPI's async support, but that has not landed in master yet and iirc > > > > > > > there wasn't consensus about it. I also think it's a bit late for its > > > > > > > inclusion in 1.0 (and certainly not a candidate for stable). > > > > > > > > > > > > > > If all you need here is to delay sending the response, then maybe the > > > > > > > current interface could work (although I honestly don't trust it and > > > > > > > regret not having dropped it). Otherwise our only choice would be to > > > > > > > work on getting the QAPI async support merged. > > > > > > > > > > > > My problem is that the io thread keeps the global mutex during the wait, > > > > > > that's why the async monitor is perfect for what I want - this is > > > > > > exactly what it does. > > > > > > > > > > Let's not mix internal implementation details with what we want as > > > > > an external interface. > > > > > > > > > > Can't you just make a vga_hw_screen_dump() specific callback? > > > > > > > > > > > > > I don't understand how that would help - if the monitor command doesn't > > > > return (normal sync operation) then the mutex is never dropped, and any > > > > callback won't change that. > > > > > > I'm trying to figure out a different solution. > > > > > > Our primary motivation for making a QMP command asynchronous must be to > > > give clients the ability to keep issuing commands while "slow" commands > > > are running. If that's not what you want nor your primary motivation, > > > then you're probably taking the wrong approach. > > > > That sounds right, and it was what I assumed the async monitor > > implementation would do, boy was I surprised to discover it doesn't do > > any such thing, but what it does do is return early, allow *other* io > > related events (select returns) to be handled, and keeps the serialized > > only-one-command-ongoing monitor usage. > > Yes, if that turns out to be needed internally then we'll have to add > such functionality (but probably not as part of QMP's async support iiuc). > > > > If that is what you want, then you'll have to add a new command, because > > > changing from asynchronous to synchronous is an incompatible change _and_ > > > you shouldn't use the current interface, because it's botched (actually, > > > I think I'm going to drop it right now as my last series fixed its only user). > > > > > > > This is not what I want. I understand of course that it is what one > > would design an async monitor / api to allow (it was after all what I > > thought async monitor support meant). > > > > > Using a botched interface that doesn't do what's supposed to do but happens to > > > solve a bug as a side effect will very likely end in tears at some point in > > > the future. > > > > > > > Right, but it's the opposite of the current case. > > > > > Now, I did some research and found this description of the problem: > > > > > > """ > > > In testing my patches for 'add_client' support with SPICE, I noticed > > > deadlock in the QEMU/SPICE code. It only happens if I close a SPICE > > > client and then immediately reconnect within about 1 second. If I > > > wait a couple of seconds before reconnecting the SPICE client I don't > > > see the deadlock. > > > """ > > > (http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg02599.html) > > > > > > Is this accurate? Why does it _work_ after 1 second? > > > > > > > This is an unrelated bug, I know I said different on the thread but now > > rereading the callstack it is the channel_event locking workaround - and > > it is fixed by a spice-server only patch (which stops calling > > channel_event from the worker thread). > > So, can you try to describe the problem you're having to someone who is > not familiar with qlx and its locking? > I'm suddenly not sure this patchset even fixes the problem that I said it does, so I'll update once I made sure what it actually fixes. > > > > > > On the other hand, thinking a bit about the reference to 623903 baloon > > > > bug, I don't see a problem: the change doesn't affect the semantics for > > > > any other device except qxl, which I've tested. For any other device, > > > > the only difference is that instead of: > > > > > > > > do_screen_dump call > > > > device specific implementation > > > > return > > > > > > > > It becomes > > > > > > > > do_screen_dump call > > > > device specific implementation (not qxl) > > > > callback called (always - not conditional, no one stores it except > > > > qxl) > > > > return > > > > > > > > > > > > > > > > I haven't looked at QAPI async support, but I > > > > > > understand it's a bit in the future. > > > > > > > > > > Yes, it's not for the immediate term. > > > > > > > > > >