Message-ID: <4A96B353.8070600@codemonkey.ws>
Date: Thu, 27 Aug 2009 11:24:51 -0500
From: Anthony Liguori
Subject: Re: [Qemu-devel] [BUG] Regression of exec migration
References: <4A969496.2070305@codemonkey.ws> <076B9FA6-C362-47C1-AA8B-70BF147843A6@irisa.fr>
In-Reply-To: <076B9FA6-C362-47C1-AA8B-70BF147843A6@irisa.fr>
List-Id: qemu-devel.nongnu.org
To: Pierre Riteau
Cc: Chris Lalancette, qemu-devel@nongnu.org

Pierre Riteau wrote:
> On 27 Aug 09, at 16:13, Anthony Liguori wrote:
>
>> Pierre Riteau wrote:
>>> [Sorry Chris, resending without the giant attachments.]
>>>
>>> Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec migration
>>> much slower than before.
>>> I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2.
>>>
>>> I'm migrating a fully booted Linux VM (also running Lenny) with
>>> 128MB of RAM to a file, using the following command:
>>> migrate "exec: cat > vmimage".
>>> The resulting file has a size of 57MB (because we
>>> save only what is allocated from the 128MB).
>>> With the current HEAD, it takes from 15 to 40 seconds (it's
>>> variable) to perform the migration to the file.
>>> With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or
>>> just commenting the "socket_set_nonblock(s->fd);" statement), it
>>> takes about 3 seconds.
>>
>> Without that changeset, it wasn't a live migration. The better way
>> to compare would be to issue stop before doing the migrate and
>> compare that time with the previous time.
>>
>> When a migration is live, it's iterative which means there's more
>> work to do.
>
> I tried with stop too, and I get the same results. It's an idle VM so
> only a small number of pages are being modified while the migration is
> going on.
> I agree that the changeset seems good, the code it replaces was
> obviously wrong.
> But I think there is something wrong somewhere else, unless it is
> considered normal that it takes so much time for an exec migration.
> To compare, using the same setup with one more machine and a Gigabit
> network, a tcp migration capped at 35m (the slowest speed I've
> measured from the disk, it can be way faster) takes about the same
> time, between 2 and 4 seconds.

I don't think the difference between 3 seconds and 15 seconds is
significant. Can you try a different workload that will result in a
migration that takes much longer (say multiple minutes)?

That is, I'd like to know whether exec: migration carries a fixed
additional cost or is slower by a factor of 5.

I expect exec: to be slower because there is more copying, but not by a
factor of 5. I expect that it's going to be a combination of a relatively
small constant factor plus a relatively small fixed cost.

Regards,

Anthony Liguori
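
The distinction above between a constant factor and a fixed cost can be sketched as a two-point fit. This is only an illustration, assuming the simple model exec_time = factor * base_time + fixed; the function name fit_overhead is hypothetical, the short-run figures are the rough 3 s and 15 s reported in this thread, and the long-run figures are invented to show the two possible outcomes.

```python
def fit_overhead(short, long):
    """Solve exec_time = factor * base_time + fixed from two (base, exec) pairs.

    Each pair is (baseline migration time, exec: migration time) in seconds.
    """
    (b1, e1), (b2, e2) = short, long
    if b1 == b2:
        raise ValueError("the two baseline times must differ")
    factor = (e2 - e1) / (b2 - b1)
    fixed = e1 - factor * b1
    return factor, fixed


# Short run: approximate figures from this thread (3 s baseline, 15 s exec:).
short_run = (3.0, 15.0)

# Long runs: hypothetical figures, not measurements.
long_run_if_fixed_cost = (120.0, 138.0)  # exec: only slightly slower overall
long_run_if_factor = (120.0, 600.0)      # exec: still five times slower

for label, long_run in [("mostly fixed cost", long_run_if_fixed_cost),
                        ("mostly constant factor", long_run_if_factor)]:
    factor, fixed = fit_overhead(short_run, long_run)
    print(f"{label}: factor={factor:.2f}, fixed={fixed:.2f} s")
```

With only the short run available, the two cases cannot be told apart, which is why a migration lasting multiple minutes is the more useful measurement.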