From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45604)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1WENJ2-00077L-1d for qemu-devel@nongnu.org;
    Fri, 14 Feb 2014 13:16:13 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1WENIx-0002TC-28 for qemu-devel@nongnu.org;
    Fri, 14 Feb 2014 13:16:07 -0500
Received: from mail-ph.de-nserver.de ([85.158.179.214]:57888)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1WENIw-0002T1-NR for qemu-devel@nongnu.org;
    Fri, 14 Feb 2014 13:16:02 -0500
Message-ID: <52FE5D71.6040704@profihost.ag>
Date: Fri, 14 Feb 2014 19:16:17 +0100
From: Stefan Priebe
MIME-Version: 1.0
References: <52FA399F.6070802@profihost.ag> <52FA6C5E.1060300@profihost.ag>
    <20140214150345.GL17391@stefanha-thinkpad.redhat.com>
In-Reply-To: <20140214150345.GL17391@stefanha-thinkpad.redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] memory allocation of migration changed?
To: Stefan Hajnoczi
Cc: Orit Wasserman , Peter Lieven , qemu-devel , Dave Gilbert , Juan Quintela

On 14.02.2014 16:03, Stefan Hajnoczi wrote:
> On Tue, Feb 11, 2014 at 07:30:54PM +0100, Stefan Priebe wrote:
>> On 11.02.2014 16:44, Stefan Hajnoczi wrote:
>>> On Tue, Feb 11, 2014 at 3:54 PM, Stefan Priebe - Profihost AG
>>> wrote:
>>>> in the past (Qemu 1.5) a migration failed if there was not enough memory
>>>> on the target host available directly at the beginning.
>>>>
>>>> Now with Qemu 1.7 I've seen succeeded migrations but the kernel OOM
>>>> memory killer killing qemu processes. So the migration seems to take
>>>> place without having enough memory on the target machine?
>>>
>>> How much memory is the guest configured with? How much memory does
>>> the host have?
>>
>> Guest: 48GB
>> Host: 192GB
>>
>>> I wonder if there are zero pages that can be migrated almost "for
>>> free" and the destination host doesn't touch. When they are touched
>>> for the first time after migration handover, they need to be allocated
>>> on the destination host. This can lead to OOM if you overcommitted
>>> memory.
>>
>> In the past the migration failed immediately with exit code 255.
>>
>>> Can you reproduce the OOM reliably? It should be possible to debug it
>>> and figure out whether it's just bad luck or a true regression.
>>
>> So there is no known patch changing this behaviour?
>>
>> What about these?
>> fc1c4a5d32e15a4c40c47945da85ef9c1e0c1b54
>> 211ea74022f51164a7729030b28eec90b6c99a08
>> f1c72795af573b24a7da5eb52375c9aba8a37972
>
> Yes, that's what I was referring to when I mentioned zero pages.
>
> The problem might just be that the destination host didn't have enough
> free memory. Migration succeeded due to memory overcommit on the host,
> but quickly ran out of memory after handover. The quick answer there is
> to reconsider your overcommitting memory and also checking memory
> availability before live migrating.

Yes, that makes sense. In the past it only worked because the qemu
process failed.

What is the right way to check for enough free memory on the target and
for the memory usage of a specific VM?

Stefan
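
To make the question concrete, here is a minimal sketch of the kind of pre-migration check I have in mind. It is not a QEMU feature. It assumes Linux hosts, compares the guest's configured RAM against the `MemAvailable` field of `/proc/meminfo` on the target (a field that only newer kernels provide, so the sketch falls back to `MemFree + Buffers + Cached`), and reads the resident size of one qemu process from `VmRSS` in `/proc/<pid>/status`. All function names and the 2 GiB headroom are made up for illustration.

```python
import re


def parse_kib(text, field):
    """Return the value of a 'Field:   123 kB' line in KiB, or None if absent."""
    match = re.search(rf"^{re.escape(field)}:\s+(\d+)\s+kB", text, re.MULTILINE)
    return int(match.group(1)) if match else None


def available_kib(meminfo_text):
    """Estimate how much memory could be handed out without swapping."""
    available = parse_kib(meminfo_text, "MemAvailable")
    if available is not None:
        return available
    # Fallback for kernels without MemAvailable: a rougher estimate.
    fields = ("MemFree", "Buffers", "Cached")
    return sum(parse_kib(meminfo_text, f) or 0 for f in fields)


def can_accept_guest(meminfo_text, guest_ram_kib, headroom_kib=2 * 1024 * 1024):
    """True if the target has room for the guest's full configured RAM.

    The headroom (2 GiB here) is an arbitrary assumption, not a QEMU value.
    """
    return available_kib(meminfo_text) >= guest_ram_kib + headroom_kib


def qemu_rss_kib(status_text):
    """Resident set size of one process, from /proc/<pid>/status content."""
    return parse_kib(status_text, "VmRSS")


def check_local_host(guest_ram_kib, qemu_pid=None):
    """Run the checks against this machine's /proc (Linux only)."""
    with open("/proc/meminfo") as f:
        ok = can_accept_guest(f.read(), guest_ram_kib)
    rss = None
    if qemu_pid is not None:
        with open(f"/proc/{qemu_pid}/status") as f:
            rss = qemu_rss_kib(f.read())
    return ok, rss
```

On the target one would call something like `check_local_host(48 * 1024 * 1024)` for the 48GB guest before starting the migration. Checking against the configured size rather than the current RSS on the source is deliberately conservative, since untouched (zero) pages can still be allocated after handover.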