From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:38275)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Qm5ce-0005FN-DR for qemu-devel@nongnu.org;
 Wed, 27 Jul 2011 11:02:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1Qm5cY-0004Xo-HI for qemu-devel@nongnu.org;
 Wed, 27 Jul 2011 11:02:08 -0400
Received: from mail-gx0-f173.google.com ([209.85.161.173]:55090)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Qm5cY-0004Xd-96 for qemu-devel@nongnu.org;
 Wed, 27 Jul 2011 11:02:02 -0400
Received: by gxk26 with SMTP id 26so1321745gxk.4 for ;
 Wed, 27 Jul 2011 08:02:01 -0700 (PDT)
Message-ID: <4E302865.9030806@codemonkey.ws>
Date: Wed, 27 Jul 2011 10:01:57 -0500
From: Anthony Liguori
MIME-Version: 1.0
References: <1311685434-25282-1-git-send-email-alevy@redhat.com>
 <20110727070756.GZ2375@bow.redhat.com>
 <4E301DC1.30901@codemonkey.ws>
 <20110727144932.GH2799@bow.redhat.com>
In-Reply-To: <20110727144932.GH2799@bow.redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH] virtio-serial-bus: replay guest_open on migration
To: Markus Armbruster, qemu-devel@nongnu.org, amit.shah@redhat.com,
 hdegoede@redhat.com, yhalperi@redhat.com, quintela@redhat.com

On 07/27/2011 09:49 AM, Alon Levy wrote:
> On Wed, Jul 27, 2011 at 09:16:33AM -0500, Anthony Liguori wrote:
>> On 07/27/2011 02:07 AM, Alon Levy wrote:
>>> On Wed, Jul 27, 2011 at 07:45:25AM +0200, Markus Armbruster wrote:
>>>> Alon Levy writes:
>>>>
>>>>> Signed-off-by: Alon Levy
>>>>> ---
>>>>>  hw/virtio-serial-bus.c | 8 +++++++-
>>>>>  1 files changed, 7 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
>>>>> index c5eb931..7a652ff 100644
>>>>> --- a/hw/virtio-serial-bus.c
>>>>> +++ b/hw/virtio-serial-bus.c
>>>>> @@ -618,14 +618,20 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
>>>>>      for (i = 0; i < nr_active_ports; i++) {
>>>>>          uint32_t id;
>>>>>          bool host_connected;
>>>>> +        VirtIOSerialPortInfo *info;
>>>>>
>>>>>          id = qemu_get_be32(f);
>>>>>          port = find_port_by_id(s, id);
>>>>>          if (!port) {
>>>>>              return -EINVAL;
>>>>>          }
>>>>> -
>>>>>          port->guest_connected = qemu_get_byte(f);
>>>>> +        info = DO_UPCAST(VirtIOSerialPortInfo, qdev, port->dev.info);
>>>>> +        if (port->guest_connected && info->guest_open) {
>>>>> +            /* replay guest open */
>>>>> +            info->guest_open(port);
>>>>> +
>>>>> +        }
>>>>>          host_connected = qemu_get_byte(f);
>>>>>          if (host_connected != port->host_connected) {
>>>>>              /*
>>>>
>>>> The patch makes enough sense to me, but the commit message is
>>>> insufficient. Why do you have to replay? And what's being fixed?
>>>
>>> When migrating a host with a spice agent running, the mouse becomes
>>> non-operational after the migration. This is rhbz #718463, currently on
>>> spice-server but it seems this is a qemu-kvm issue. The problem is that
>>> after migration spice doesn't know the guest agent is open.
>>
>> The problem is that guest_open is a hack.
>>
>> You want connection semantics. You need the following information
>> from the backend and the client:
>>
>> 1) backend is associated with a transport. The transport may
>> disconnect at any point in time. The backend needs to have explicit
>> state transitions associated with the transport disconnecting and
>> connecting.
>>
>> 2) the client may disconnect and reconnect at any point in time. A
>> device model reset is a disconnect followed by a reconnect.
>>
>> This gives you the following matrix of states:
>>
>> A: backend-connected, client-connected
>> B: backend-disconnected, client-disconnected
>> C: backend-connected, client-disconnected
>> D: backend-disconnected, client-connected
>>
>> The state transition diagram looks like this:
>>
>> B: for some devices, immediately goto C. other devices, on accept()
>> goto D. if in B and client connects, goto D
>>
>> C: if transport disconnects, goto B. if client connects, goto A
>>
>> D: if transport connects, goto A. if client disconnects, goto B
>>
>> A: if transport disconnects, goto B, if client disconnects, goto C
>>
>> The problem is that guest_open() is a poor approximation of
>> 'client-connected' and it's not used universally. We need to
>> introduce proper state tracking to the character device layer and we
>> need to have a proper connection function that is used by all char
>> device clients.
>>
>> Semantically, write should only be allowed in states A and D. Read
>> should only be allowed in states A and C.
>>
>> C and D should have very well defined semantics about what happens
>> to the data that is written. Arguably, read/write should not be
>> allowed in states C/D.
>>
>> Device reset should always trigger a client reconnect. Migration
>> resets devices so migration would Just Work if we modelled the state
>> transitions appropriately.
> Are you saying currently on migration the guest (client above) always
> receives an event from virtio?

No. Everything is all over the place today. I'm saying this is how it
should work:

All character devices need to start in the same state. As it stands
today, everything but spicevmc starts in D. spicevmc starts in B.
guest_open()/guest_close() assume the device starts in D.

guest_open/guest_close is a fundamentally broken interface because it
only works for spicevmc. We need to generalize guest_open/guest_close
and make all backends start in D.

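As a minimal sketch of the state tracking proposed above, the four
states can be treated as two independent booleans (backend connected,
client connected), with each event changing one of them. All names here
(ConnState, ConnEvent, next_state, reset_state) are hypothetical and
are not existing QEMU interfaces. Under this strict two-boolean
reading, a transport disconnect in A lands in D rather than in B as the
quoted list has it, so treat the transitions as an assumption rather
than as the intended design.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of this is QEMU API. */
typedef enum {
    STATE_A, /* backend-connected,    client-connected    */
    STATE_B, /* backend-disconnected, client-disconnected */
    STATE_C, /* backend-connected,    client-disconnected */
    STATE_D  /* backend-disconnected, client-connected    */
} ConnState;

typedef enum {
    EV_TRANSPORT_CONNECT,
    EV_TRANSPORT_DISCONNECT,
    EV_CLIENT_CONNECT,
    EV_CLIENT_DISCONNECT
} ConnEvent;

/* Decompose a state into its two booleans. */
static bool backend_up(ConnState s) { return s == STATE_A || s == STATE_C; }
static bool client_up(ConnState s)  { return s == STATE_A || s == STATE_D; }

/* Recombine the two booleans into one of the four states. */
static ConnState make_state(bool backend, bool client)
{
    if (backend) {
        return client ? STATE_A : STATE_C;
    }
    return client ? STATE_D : STATE_B;
}

/* Apply one event; each event touches exactly one of the booleans. */
ConnState next_state(ConnState s, ConnEvent ev)
{
    bool backend = backend_up(s);
    bool client = client_up(s);

    assert((int)s >= (int)STATE_A && (int)s <= (int)STATE_D);

    switch (ev) {
    case EV_TRANSPORT_CONNECT:    backend = true;  break;
    case EV_TRANSPORT_DISCONNECT: backend = false; break;
    case EV_CLIENT_CONNECT:       client = true;   break;
    case EV_CLIENT_DISCONNECT:    client = false;  break;
    }
    return make_state(backend, client);
}

/* A device model reset is a client disconnect followed by a reconnect. */
ConnState reset_state(ConnState s)
{
    return next_state(next_state(s, EV_CLIENT_DISCONNECT), EV_CLIENT_CONNECT);
}
```

With this reading a reset always ends with the client connected
whatever the starting state, which matches "device reset should always
trigger a client reconnect".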
That means we need to add guest_open() calls to the initfns of any
device that uses a CharDriverState. And the natural place to add a
guest_close() is in the destroy functions of any device.

reset() is semantically a close followed by an open, so for all devices
that use CDSs, we should add a guest_close() followed by a
guest_open().

Now, for virtio-serial, since it supports multiple ports, in its
reset() path, it needs to do close() followed by open(). Since
vmstate_load() is basically an extension of reset(), you'll have to
open() any new ports that appear in that path.

> The guest_open callback happens when
> a guest operation happens, not when the device gets reset, unless the latter
> triggers the former, but I don't understand how that would happen since a
> reset can happen while the guest isn't ready to handle anything (guest is
> booting).
> I do see a virtio_pci_reset does a virtio_reset which sends the
> VIRTIO_NO_VECTOR interrupt, but I don't understand what happens after that.
>
> Besides, I understand the need to fix the connection semantics of chardevs,
> but the situation is broken right now and even if someone were to write this
> I don't believe you would just take it to 0.15.0, would you?

No, guest_open() is fundamentally broken right now. It's not that hard
to fix, but it's too much for 0.15.0, I suspect.

>
> Also, the conversation is still ongoing but Armbru mentioned some ''relevant-cases''
> in http://lists.nongnu.org/archive/html/qemu-devel/2011-07/msg03221.html
> backing the fix-the-hack approach (at least for 0.15.0).

The problem is this isn't the only place where this problem will occur.
libvirt just implemented reboot via shutdown and reset, which means that
if you don't do the right thing in reset(), you're going to have the
same problem there.

Fixing it right isn't hard; let's focus on doing it the right way.

Regards,

Anthony Liguori

>>
>> Regards,
>>
>> Anthony Liguori
>>
>
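
The reset and load handling described in this message could look
roughly like the following. This is a sketch under assumptions: Port,
port_guest_open, port_guest_close, bus_reset and bus_load are
hypothetical stand-ins rather than the virtio-serial code, and the
"present" flag is an invented way to model ports that first appear
while loading saved state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a serial port; not the QEMU structure. */
typedef struct {
    bool present;          /* port exists on the bus                */
    bool guest_connected;  /* what the backend currently believes   */
    int open_calls;        /* number of opens the backend has seen  */
    int close_calls;       /* number of closes the backend has seen */
} Port;

/* Tell the backend that the client side of this port is connected. */
void port_guest_open(Port *p)
{
    assert(p != NULL);
    p->guest_connected = true;
    p->open_calls++;
}

/* Tell the backend that the client side of this port went away. */
void port_guest_close(Port *p)
{
    assert(p != NULL);
    p->guest_connected = false;
    p->close_calls++;
}

/* Reset is a close followed by an open, applied to every existing port. */
void bus_reset(Port *ports, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ports[i].present) {
            port_guest_close(&ports[i]);
            port_guest_open(&ports[i]);
        }
    }
}

/*
 * Load runs after a reset: ports that only show up in the saved state
 * were not covered by the reset path, so they are opened here.
 */
void bus_load(Port *ports, size_t n, const bool *saved_present)
{
    for (size_t i = 0; i < n; i++) {
        if (saved_present[i] && !ports[i].present) {
            ports[i].present = true;
            port_guest_open(&ports[i]);
        }
    }
}
```

Calling bus_reset() and then bus_load() leaves every port that exists
after the load with exactly one outstanding open, so the backend never
has to guess whether the client is connected.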