From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 25 Feb 2016 13:03:16 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20160225130315.GD2163@work-vm>
References: <87d1rlua15.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87d1rlua15.fsf@blackfin.pond.sub.org>
Subject: Re: [Qemu-devel] ivshmem migration restrictions and bugs
To: Markus Armbruster
Cc: Marc-André Lureau, Cam Macdonell, Claudio Fontana,
    qemu-devel@nongnu.org, David Marchand

* Markus Armbruster (armbru@redhat.com) wrote:
> TL;DR: I recommend to stay away from migration when using chardev=...
>
> ivshmem migration is messed up in several entertaining ways.
>
> = General lossage =
>
> G1. Migrating more than one peer doesn't work, but that's a (badly)
>     documented restriction, not a bug (see documentation of property
>     "role" in qemu-doc.texi). If you migrate more than one, the shared
>     memory can get messed up.
>
> G2. If peers connect on the destination before migration is complete,
>     the shared memory can get messed up. This isn't even badly
>     documented.
>
> Management applications can deal with this in principle.
>
> = Lossage with MSI-X (msi=on) =
>
> M1.
>     s->intrstatus and s->intrmask (registers INTRSTATUS and INTRMASK)
>     are not migrated, even though they have guest-visible contents.
>     They reset to zero instead. Wrong, but unlikely to cause trouble,
>     because the registers are inert in this configuration.
>
> There's nothing management applications can do about this.
>
> = Lossage with interrupts (chardev=...) =
>
> I1. s->vm_id (register IVPOSITION) is not migrated. It briefly changes
>     to -1, then to whatever ID the server on the destination assigns.
>     To get the same ID back, you must carefully control the order in
>     which devices connect to the server on the destination: if this
>     device was the n-th to connect on the source, it must also be the
>     n-th on the destination.
>
>     We can hope that the guest reads IVPOSITION rarely or not at all
>     after device driver initialization, so the temporary change to -1
>     will be overlooked most of the time.
>
> I2. If the shared memory's ramblock arrives at the destination before
>     shared memory setup completes, migration fails. Shared memory setup
>     completes shortly after the shared memory is received from the
>     server.
>
> I3. If migration completes before the shared memory setup completes on
>     the source, shared memory contents is lost (zeroed?). I don't yet
>     know what happens when shared memory setup completes during
>     migration.
>
> G2 + I1 implies that you can only migrate the peer with ID zero.
> Management applications need make sure the device with role=master
> connects first both on source and destination, which seems feasible.
>
> There's nothing management applications can do about the temporary
> IVPOSITION change (I1).
>
> There is no known way for a management application to wait for shared
> memory setup to complete.
>
> Migration failure due to I2 is recoverable: restart the server on the
> destination, and retry the migration with a bit more time between
> running the destination QEMU and the migrate command.
> The server restart is necessary to preserve ID zero.
>
> I'm not aware of a way to guard against or mitigate I3. Fortunately,
> shared memory setup should almost always win the race.
>
> = What can we do about it? =
>
> G1 and G2 are a matter of improving documentation.
>
> M1 is easy enough to fix, if we care.
>
> That leaves I1, I2 and I3. Common root cause: we don't finish setup in
> realize(), we merely arrange for messages from the server to be received
> and processed. This exposes both guest and migration to an incompletely
> set up device.
>
> Completing setup right in realize() would be simpler and race-free.
> However, it could also make realize() hang waiting for a hung server.
> Probably okay for -device, but what about hot plug?
>
> If it's not okay, we could split ivshmem into a frontend and a backend.
> Hot plug could create the backend asynchronously, wait for it to
> complete, then create the frontend / device model. Command line would
> have to create the backend synchronously, of course.

How can you tell when 'shared memory setup' is complete?
You could delay starting incoming migration on the destination
or starting a migration on the source until that setup is complete.

Dave

>
> Other ideas?
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
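
P.S. To make the ordering constraint in I1 concrete, here is a toy model,
not QEMU or ivshmem-server code. It assumes the server hands out IDs
0, 1, 2, ... strictly in connection order, which is how I read the
"n-th to connect" description above. The function names and peer names
are made up for illustration.

```python
def assign_ids(connect_order):
    """Toy model of the server: IDs 0, 1, 2, ... in connection order.

    Assumption for illustration only; the real server's allocation
    policy is not taken from its source code here.
    """
    return {peer: vm_id for vm_id, peer in enumerate(connect_order)}


def id_preserved(peer, source_order, dest_order):
    """True if 'peer' sees the same IVPOSITION on source and destination."""
    return assign_ids(source_order)[peer] == assign_ids(dest_order)[peer]


# Hypothetical peers; "master" stands for the device with role=master.
source = ["master", "peer-a", "peer-b"]
same_order = ["master", "peer-a", "peer-b"]
swapped = ["peer-a", "master", "peer-b"]

print(id_preserved("master", source, same_order))
print(id_preserved("master", source, swapped))
```

Under that assumption the migrated device keeps its ID only if it
connects in the same position on the destination. Together with G2,
that is why only the peer that connects first (ID zero) is a realistic
candidate for migration.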