From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=56738 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PaZik-0004lR-RY for qemu-devel@nongnu.org; Wed, 05 Jan 2011 15:12:35 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PaZij-0002l2-Lk for qemu-devel@nongnu.org; Wed, 05 Jan 2011 15:12:34 -0500
Received: from e5.ny.us.ibm.com ([32.97.182.145]:58115) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PaZij-0002kx-I3 for qemu-devel@nongnu.org; Wed, 05 Jan 2011 15:12:33 -0500
Received: from d01dlp02.pok.ibm.com (d01dlp02.pok.ibm.com [9.56.224.85]) by e5.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p05JnRML017876 for ; Wed, 5 Jan 2011 14:49:32 -0500
Received: from d01relay06.pok.ibm.com (d01relay06.pok.ibm.com [9.56.227.116]) by d01dlp02.pok.ibm.com (Postfix) with ESMTP id 615244DE803B for ; Wed, 5 Jan 2011 15:09:43 -0500 (EST)
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay06.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p05KCDPj1982666 for ; Wed, 5 Jan 2011 15:12:14 -0500
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p05KCDup028923 for ; Wed, 5 Jan 2011 18:12:13 -0200
Message-ID: <4D24D09A.9060204@linux.vnet.ibm.com>
Date: Wed, 05 Jan 2011 14:12:10 -0600
From: Anthony Liguori
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH] add MADV_DONTFORK to guest physical memory
References: <20100915170824.GL5981@random.random> <20110105151012.GC15823@random.random> <4D24B22C.4010302@linux.vnet.ibm.com> <20110105195430.GF15823@random.random>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Alexander Graf
Cc: Andrea Arcangeli, Blue Swirl, =?ISO-8859-1?Q?Andreas_F=E4rber?=, Michael Roth,
 "qemu-devel@nongnu.org Developers"

On 01/05/2011 02:00 PM, Alexander Graf wrote:
> On 05.01.2011, at 20:54, Andrea Arcangeli wrote:
>
>
>> Hello everyone,
>>
>> On Wed, Jan 05, 2011 at 08:44:38PM +0100, Alexander Graf wrote:
>>
>>> On 05.01.2011, at 19:02, Michael Roth wrote:
>>>
>>>
>>>> On 01/05/2011 09:10 AM, Andrea Arcangeli wrote:
>>>>
>>>>> The bug is still there so I rediffed the old patch against current
>>>>> code.
>>>>>
>>>>> On a related topic: could somebody give me advice on how to implement
>>>>> a command line (command line seems enough, the other option would be
>>>>> monitor command) to make the MADV_MERGEABLE conditional? I got KSM on
>>>>> THP working fine but KSM may decrease performance by increasing the
>>>>> number of copy on write and by splitting hugepages, so we'd like to be
>>>>> able to turn off KSM on a per-VM basis (not on the whole host, which
>>>>> of course we already can by setting /sys/kernel/mm/ksm/run to 0) so
>>>>> that high perf VMs will keep running at maximum speed with KSM off but
>>>>> others may still benefit from KSM. For that I need to make the below
>>>>> MADV_MERGEABLE madvise conditional on something and the code itself
>>>>> will be trivial, we've just to converge on a command line option
>>>>> (hopefully quickly ;).
>>>>>
>>>> There was a -mem-prealloc option added a while back to set MAP_POPULATE on memory mapped in via the -mem-path option. So an analogous -mem_nomerge option or something along that line seems reasonable for conditionally unsetting QEMU_MADV_MERGEABLE.
>>>>
>>>> And for consistency you should probably make both your proposed changes for -mem-path'd memory as well.
>>>>
>>> Why not clean up all that mess and introduce a new -mem option that would just take all of the several options as parameters?
>>>
>>> -mem size=512,populate=on,ksm=off
>>>
>>> and default -m to something reasonable with the new syntax.
>>>
>> I'm neutral... so feel free to decide what I should implement ;).
>>
>> One comment on combining -m ksm=off (or -mem_nomerge) with
>> -mem-path. It seems unnecessary because ksm can't be turned on on
>> VM_HUGETLB vmas (MADV_MERGEABLE will return -EINVAL) and mem-path only
>> makes sense if used in combination with hugetlbfs (which sets
>> VM_HUGETLB of course).
>>
> Sure, not all combinations make sense. But "-mem size=1G,path=/dev/shm/vm1.ram,populate=on" would make sense, no? THP should go along the same lines here too. It'd just be a flag "thp" that defaults to on if available.
>
> That way we could also do all the sanity checks in a single place. I really like the idea of combining memory management command line parameters into a single option :). In the end I'd assume it's Anthony's call though.
>

Where it's really helpful is in the ever elusive configuration file format.

-mem becomes:

[mem]
size=1G
path=/dev/shm/vm1.ram
populate=on

Which is nice from a grouping perspective.

Regards,

Anthony Liguori

> Alex
>
>
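
To make the grouping idea concrete, here is a rough sketch of how a combined "-mem size=...,path=...,populate=...,ksm=..." string might be parsed into one structure, so the sanity checks live in a single place. This is not QEMU code. The MemOpts struct, the function names, the defaults (populate off, ksm on), and the rule that a bare size is taken as megabytes (as with -m) are all assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>

/* Hypothetical container for the proposed -mem sub-options. */
typedef struct MemOpts {
    uint64_t size_bytes;  /* size=N[K|M|G]; a bare number is assumed to be MB */
    char path[256];       /* path=...; empty string when not given */
    bool populate;        /* populate=on|off; assumed default: off */
    bool ksm;             /* ksm=on|off; assumed default: on */
} MemOpts;

/* Parse "512", "512M", "1G", ... into bytes. Overflow is not checked. */
static int parse_size(const char *s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    if (*s < '0' || *s > '9') {
        return -1;                      /* reject empty values and signs */
    }
    v = strtoull(s, &end, 10);
    switch (*end) {
    case 'K': case 'k': *out = v << 10; end++; break;
    case 'M': case 'm': *out = v << 20; end++; break;
    case 'G': case 'g': *out = v << 30; end++; break;
    case '\0':          *out = v << 20; break;   /* bare number: MB */
    default:            return -1;
    }
    return *end == '\0' ? 0 : -1;       /* no trailing junk allowed */
}

/* Accept exactly "on" or "off". */
static int parse_bool(const char *s, bool *out)
{
    if (strcmp(s, "on") == 0)  { *out = true;  return 0; }
    if (strcmp(s, "off") == 0) { *out = false; return 0; }
    return -1;
}

/* Parse a comma-separated key=value list. Returns 0 on success, -1 on error. */
int parse_mem_opts(const char *arg, MemOpts *out)
{
    char buf[512];
    char *tok, *next;

    assert(arg != NULL && out != NULL);
    if (strlen(arg) >= sizeof(buf)) {
        return -1;
    }
    strcpy(buf, arg);                   /* work on a private, writable copy */

    memset(out, 0, sizeof(*out));
    out->ksm = true;                    /* assumed default */

    for (tok = buf; tok != NULL; tok = next) {
        char *comma = strchr(tok, ',');
        char *val;

        if (comma != NULL) {
            *comma = '\0';
            next = comma + 1;
        } else {
            next = NULL;
        }
        val = strchr(tok, '=');
        if (val == NULL) {
            return -1;                  /* every sub-option needs a value */
        }
        *val++ = '\0';

        if (strcmp(tok, "size") == 0) {
            if (parse_size(val, &out->size_bytes) < 0) return -1;
        } else if (strcmp(tok, "path") == 0) {
            if (strlen(val) >= sizeof(out->path)) return -1;
            strcpy(out->path, val);
        } else if (strcmp(tok, "populate") == 0) {
            if (parse_bool(val, &out->populate) < 0) return -1;
        } else if (strcmp(tok, "ksm") == 0) {
            if (parse_bool(val, &out->ksm) < 0) return -1;
        } else {
            return -1;                  /* unknown key */
        }
    }

    /* The one place for cross-option sanity checks. */
    return out->size_bytes != 0 ? 0 : -1;
}
```

With something of this shape, the allocation code would only look at the parsed fields, for example issuing the MADV_MERGEABLE madvise only when ksm is true. A "[mem]" config file section could presumably feed the same key/value pairs through the same checks.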