From: Paolo Bonzini
Date: Tue, 04 Feb 2014 15:00:10 +0100
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
To: Alexey Kardashevskiy, "qemu-devel@nongnu.org"
Cc: Alex Graf
Message-ID: <52F0F26A.5020304@redhat.com>
In-Reply-To: <52F0DA04.9040003@ozlabs.ru>

On 04/02/2014 13:16, Alexey Kardashevskiy wrote:
> On 02/04/2014 11:07 PM, Paolo Bonzini wrote:
>> On 04/02/2014 12:59, Alexey Kardashevskiy wrote:
>>>>> With the default throttling of 32 MiB/s, bandwidth must be something like
>>>>> 33000 (expressed in bytes/ms) with the default settings, and then max_size
>>>>> should be 33000*3*10^9 / 10^6 = 6000000. Where is my computation wrong?
>>>
>>> migrate_max_downtime() = 30000000 = 3*10^7.
>>
>> Oops, that's the mistake.
>
> Make a patch? :)

I mean, my mistake. :) I assumed 3000 ms = 3*10^9 ns. 30 ms is too little,
but 3000 ms is probably too much for a default.

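To make the unit mix-up concrete, here is a small sketch of the arithmetic. It assumes the formula implied by the figures quoted above, max_size = bandwidth * migrate_max_downtime() / 10^6, with bandwidth in bytes/ms and the downtime in nanoseconds. The Python names are made up for illustration and are not QEMU code.

```python
NS_PER_MS = 10**6  # nanoseconds per millisecond

def max_size(bandwidth_bytes_per_ms, max_downtime_ns):
    """Bytes that can be transferred within the allowed downtime.

    Hypothetical helper mirroring bandwidth * downtime / 10^6 from the thread.
    """
    return bandwidth_bytes_per_ms * max_downtime_ns // NS_PER_MS

THROTTLED_BANDWIDTH = 33_000         # bytes/ms, roughly 32 MiB/s
ACTUAL_DOWNTIME_NS = 30_000_000      # 3*10^7 ns = 30 ms (the real default)
ASSUMED_DOWNTIME_NS = 3_000_000_000  # 3*10^9 ns = 3000 ms (what I assumed)

if __name__ == "__main__":
    print("30 ms  :", max_size(THROTTLED_BANDWIDTH, ACTUAL_DOWNTIME_NS))
    print("3000 ms:", max_size(THROTTLED_BANDWIDTH, ASSUMED_DOWNTIME_NS))
```

The two results differ by a factor of 100, which is exactly the gap between 3*10^7 and 3*10^9.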
>>> When the migration is in iterating stage, bandwidth is a speed in last
>>> 100ms which is usually 5 blocks 250KB each so it is
>>> 1250000/100=12500bytes/s and max_size=12500*30000000/10^6=375000 which is
>>> less than the last chunk is.
>>
>> Perhaps our default maximum downtime is too low. 30 ms doesn't seem
>> achievable in practice with 32 MiB/s bandwidth. Just making it 300 ms or
>> so should fix your problem.
>
> Well, it will fix it in my particular case but in a long run this does not
> feel like a fix - there should be a way for migration_thread() to know that
> ram_save_iterate() sent all dirty pages it had to send, no?

No, because new pages might be dirtied while ram_save_iterate() was running.

Paolo
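A toy model may help show why "sent everything it had" is not the same as "nothing left to send". This is only a sketch under simplifying assumptions, not QEMU's migration code: each pass sends all pending bytes at a fixed bandwidth, and the guest dirties memory at a constant rate while that pass runs. The function name, the 1 GB of RAM and the dirty rates are invented; 12500 is the measured figure quoted above, taken as bytes per millisecond.

```python
def passes_until_converged(total_bytes, dirty_bytes_per_ms,
                           bandwidth_bytes_per_ms, max_downtime_ms,
                           max_passes=50):
    """Return the pass at which pending <= max_size, or None if it never is.

    Hypothetical toy model: every pass sends all pending bytes, and whatever
    the guest dirtied during that time becomes the pending set for the next
    pass.
    """
    # Same threshold as in the thread, with the downtime already in ms.
    max_size = bandwidth_bytes_per_ms * max_downtime_ms
    pending = total_bytes
    for n in range(1, max_passes + 1):
        if pending <= max_size:
            return n  # small enough to finish within the allowed downtime
        # Sending takes pending / bandwidth ms; the guest keeps dirtying
        # memory for that whole time (integer math, rounded down).
        pending = pending * dirty_bytes_per_ms // bandwidth_bytes_per_ms
    return None

if __name__ == "__main__":
    ram = 10**9  # assumed guest RAM size in bytes
    for downtime_ms in (30, 300):
        n = passes_until_converged(ram, 6_250, 12_500, downtime_ms)
        print(f"downtime {downtime_ms} ms: converged at pass {n}")
    # A guest that dirties memory as fast as we can send never converges.
    print(passes_until_converged(ram, 12_500, 12_500, 300))
```

In this model pending is still nonzero right after a complete pass, so the only meaningful question is whether what remains fits into max_size. Raising the downtime from 30 ms to 300 ms lets the loop stop a few passes earlier, and a guest that dirties as fast as the link can send never gets there at all.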