From: Anthony Liguori
Date: Sun, 14 Dec 2008 20:03:39 -0600
Subject: Re: [Qemu-devel] [PATCH] Vmchannel PCI device.
Message-ID: <4945BAFB.3070804@codemonkey.ws>
In-Reply-To: <20081214233305.GA22151@redhat.com>
To: "Daniel P. Berrange"
Cc: qemu-devel@nongnu.org, Gleb Natapov, kvm@vger.kernel.org

Daniel P. Berrange wrote:
> On Sun, Dec 14, 2008 at 04:56:49PM -0600, Anthony Liguori wrote:
>
>> Daniel P. Berrange wrote:
>>
>>> On Sun, Dec 14, 2008 at 01:15:42PM -0600, Anthony Liguori wrote:
>>>
>>> One non-QEMU backend I can see being implemented is a DBus daemon,
>>> providing a simple bus for RPC calls between guests & host.
>>>
>> The main problem with "external" backends is that they cannot easily
>> participate in save/restore or live migration.  If you want to have an
>> RPC mechanism, I would suggest implementing the backend in QEMU and
>> hooking QEMU up to dbus.  Then you can implement proper save/restore.
>>
>
> DBus is a general purpose RPC service, which has little-to-no knowledge
> of the semantics of application services running over it. Simply pushing
> a backend into QEMU can't magically make sure all the application level
> state is preserved across save/restore/migrate. For some protocols the
> only viable option may be to explicitly give the equivalent of -EPIPE
> / POLLHUP to the guest and have it explicitly re-establish connectivity
> with the host backend and re-initialize necessary state if desired.
>

In the case of dbus, you actually have a shot at making save/restore
transparent.  If the guest sends its RPCs over the channel, you can
parse the D-Bus messages in QEMU and know when you have a complete
message.  You can then dispatch the RPC from QEMU (and BTW, this is a
perfect example of the security point: you want the RPCs to originate
from the QEMU process).  When you get the RPC response, you can marshal
it and make it available to the guest.  If a request or response is
still in flight at save time, you save the partial buffers as part of
save/restore.  You could use the live feature of savevm to try to wait
until there are no pending RPCs.  In fact, you have to handle pending
RPCs one way or the other, because otherwise save/restore would be
broken.
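To make that concrete, here's a rough, untested sketch of the sort of
thing I have in mind.  The first half just reads the fixed 16-byte D-Bus
header to work out how long the complete message is; the second half
dumps whatever partial buffers exist into the savevm stream.  The
DBusChanState struct and the function names are invented for the
example, and I'm writing the savevm calls and header path from memory,
so treat the details as approximate:

/*
 * Untested sketch, not against any real tree.
 *
 * Framing: the fixed 16-byte D-Bus header carries the body length
 * (bytes 4-7) and the length of the header-fields array (bytes 12-15);
 * the fields array is padded to an 8-byte boundary before the body, so
 * those two values tell you exactly how many bytes the whole message
 * occupies.
 */
#include <stdint.h>
#include <stddef.h>
#include "hw/hw.h"              /* QEMUFile, qemu_put_*, register_savevm */

static uint32_t dbus_read_u32(const uint8_t *p, int little_endian)
{
    if (little_endian) {
        return p[0] | (p[1] << 8) | (p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    return ((uint32_t)p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3];
}

/* Returns the total length of the first message in buf, 0 if more data
 * is needed, or -1 if the stream is garbage and we should hang up. */
static int dbus_message_length(const uint8_t *buf, size_t len)
{
    uint32_t body_len, fields_len, total;

    if (len < 16) {
        return 0;
    }
    if (buf[0] != 'l' && buf[0] != 'B') {
        return -1;                      /* bad endianness marker */
    }
    body_len   = dbus_read_u32(buf + 4,  buf[0] == 'l');
    fields_len = dbus_read_u32(buf + 12, buf[0] == 'l');
    if (body_len > (1u << 27) || fields_len > (1u << 27)) {
        return -1;                      /* exceeds D-Bus's 128MB cap */
    }

    total = 16 + ((fields_len + 7) & ~7u) + body_len;
    return len >= total ? (int)total : 0;
}

/*
 * Save/restore: whatever has been buffered but not yet dispatched to
 * the bus, or not yet read back by the guest, gets serialized along
 * with the rest of the device state.
 */
typedef struct DBusChanState {
    uint8_t  req_buf[65536];    /* partial request from the guest */
    uint32_t req_len;
    uint8_t  resp_buf[65536];   /* response not yet read by the guest */
    uint32_t resp_len;
} DBusChanState;

static void dbus_chan_save(QEMUFile *f, void *opaque)
{
    DBusChanState *s = opaque;

    qemu_put_be32(f, s->req_len);
    qemu_put_buffer(f, s->req_buf, s->req_len);
    qemu_put_be32(f, s->resp_len);
    qemu_put_buffer(f, s->resp_buf, s->resp_len);
}

static int dbus_chan_load(QEMUFile *f, void *opaque, int version_id)
{
    DBusChanState *s = opaque;

    s->req_len = qemu_get_be32(f);
    if (s->req_len > sizeof(s->req_buf)) {
        return -1;              /* corrupt stream */
    }
    qemu_get_buffer(f, s->req_buf, s->req_len);

    s->resp_len = qemu_get_be32(f);
    if (s->resp_len > sizeof(s->resp_buf)) {
        return -1;
    }
    qemu_get_buffer(f, s->resp_buf, s->resp_len);
    return 0;
}

/* ... and in the device init, something like:
 *     register_savevm("dbus-vmchannel", 0, 1,
 *                     dbus_chan_save, dbus_chan_load, s);
 */

The point being that the only state you have to carry across
save/restore is a couple of byte buffers inside QEMU; the guest never
notices that anything happened.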
This example is particularly bad for the EPIPE approach.  If the guest
sends an RPC and then gets EPIPE, what happened?  Has the RPC completed
or not?  That ambiguity makes it very difficult to program against this
model.  EPIPE is the model Xen used for guest save/restore and it's been
a huge hassle.  You don't want guests involved in save/restore because
it adds a combinatorial factor to your test matrix.  You now have to
test every host combination against every supported guest combination
to ensure that save/restore has not regressed.  It's a huge burden and
IMHO is never truly necessary.

> It imposes a configuration & authentication burden on the guest to
> use networking. When a virtual fence device is provided directly from
> the host OS, you can get zero-config deployment of clustering without
> the need to configure any authentication credentials in the guest.
> This is a big plus over the traditional setup for real machines.
>

If you just want vmchannel to get networking without the
"configuration" burden, then someone heavily involved with a distro
should just preconfigure, say, Fedora to create a private network on a
dedicated network interface as soon as the system starts.  Then you
have a dedicated, never-disappearing network interface you can use for
all of this stuff.  And it requires no application modification, to
boot.

> This really depends on what you define the semantics of the vmchannel
> protocol to be - specifically whether you want save/restore/migrate to
> be totally opaque to the guest or not. I could imagine one option is to
> have the guest end of the device be given -EPIPE when the backend is
> restarted for restore/migrate, and choose to re-establish its connection
> if so desired. This would not require QEMU to maintain any backend state.
> For stateless datagram (UDP-like) application protocols there's no
> special support required for save/restore.
>

It's a losing proposition because building anything even remotely
robust explodes the test matrix.

>> What's the argument to do these things external to QEMU?
>>
>
> There are many potential use cases for VMchannel, not all are going
> to be general-purpose things that everyone wants to use. Forcing a lot
> of application-specific backend code into QEMU is not a good way to
> approach this from a maintenance point of view. Some backends may well
> be well suited to living inside QEMU, while others may be better suited
> as external services.
>

I think VMchannel is a useful concept, but not for the same reasons you
do :-)

Regards,

Anthony Liguori

> Daniel
>