Message-ID: <4D262AC2.9000202@linux.vnet.ibm.com>
Date: Thu, 06 Jan 2011 14:49:06 -0600
From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH] add MADV_DONTFORK to guest physical memory
References: <20100915170824.GL5981@random.random>
 <20110105151012.GC15823@random.random>
 <4D24B22C.4010302@linux.vnet.ibm.com>
 <20110105195430.GF15823@random.random>
 <4D24D3EB.1080702@linux.vnet.ibm.com>
 <20110105203518.GH15823@random.random>
 <4D24E249.2040501@linux.vnet.ibm.com>
 <20110106174925.GK15823@random.random>
In-Reply-To: <20110106174925.GK15823@random.random>
List-Id: qemu-devel.nongnu.org
To: Andrea Arcangeli
Cc: Blue Swirl, Andreas Färber, "qemu-devel@nongnu.org Developers",
 Anthony Liguori, Alexander Graf

On 01/06/2011 11:49 AM, Andrea Arcangeli wrote:
> On Wed, Jan 05, 2011 at 03:27:37PM -0600, Michael Roth wrote:
>> On 01/05/2011 02:35 PM, Andrea Arcangeli wrote:
>>> On Wed, Jan 05, 2011 at 02:26:19PM -0600, Michael Roth wrote:
>>>> Yah you're right, but I've seen several discussions about using mempath
>>>> for tmpfs/ram-backed files for things like numa/zram/etc so tend to
>>>> think of it as something potentially more than just a hook for
>>>> hugetlbfs, which is becoming less and less useful. But the MADV_DONTFORK
>>>> stuff should still be immediately applicable.
>>>
>>> Yes, MADV_DONTFORK should be set all on all guest physical memory
>>> without options so I hope the new patch I just posted is fine to stop
>>> the spurious -ENOMEM failures in fork.
>>
>> The patch in this thread? A couple paths still aren't covered when using
>> -mem-path. Something like this should get them all:
>
> Well the reason of MADV_DONTFORK is to avoid accounting issues with
> anonymous memory, mem-path don't have that issue as hugetlbfs skips
> the accounting (it has to because hugetlbfs are not always taken from
> the regular page allocator). It could be however considered a minor
> performance optimization.

That's one case, but there's also a wonky fallback in that path that
defaults to the normal qemu_vmalloc():

    if (mem_path) {
#if defined (__linux__) && !defined(TARGET_S390X)
        new_block->host = file_ram_alloc(new_block, size, mem_path);
        if (!new_block->host) {
            new_block->host = qemu_vmalloc(size);
            qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE);
        }

May make sense to only add coverage for that specific case though. If
file_ram_alloc() is generalized we could deal with it then.

> Now you mention that you want to use -mem-path for other things too,
> so maybe that's why you need it there too.
> BTW, if you ever need it
> for more than hugetlbfs, I'm afraid this MAP_PRIVATE I see when
> mem_prealloc isn't set, is going to screw any other potential useful
> usage besides hugetlbfs, not exactly sure why it makes any sense to
> use MAP_PRIVATE there and not only MAP_SHARED.

Not sure either...but if it's another hugetlbfs-ism it shouldn't matter
since using mem-path for something other than hugetlbfs would ideally be
configurable with a -mem path=/dev/shm/vm0.mem,hugetlbfs=off or something
along that line, which wouldn't necessarily set MAP_PRIVATE.
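
For anyone following along, the MAP_PRIVATE concern can be seen outside
QEMU with a small standalone sketch. This is not QEMU code:
file_sees_write() and the /tmp scratch file are made up for illustration,
and it assumes Linux with a writable /tmp. It stores one byte through a
file-backed mapping and then reads the file back to check whether the
store reached it.

```c
/* Standalone illustration only; nothing here comes from the QEMU tree. */
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a scratch file with map_flag (MAP_SHARED or MAP_PRIVATE), store one
 * byte through the mapping, then read the file back with pread().
 * Returns 1 if the file saw the store, 0 if it did not, -1 on error.
 */
static int file_sees_write(int map_flag)
{
    char path[] = "/tmp/mempath-demo-XXXXXX";  /* hypothetical scratch file */
    const size_t len = 4096;
    char byte = 0;
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* the file stays alive until fd is closed */

    if (ftruncate(fd, (off_t)len) == 0) {
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, map_flag, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'x';                 /* store through the mapping */
            msync(p, len, MS_SYNC);     /* flush; has no effect on a private copy */
            if (pread(fd, &byte, 1, 0) == 1)
                result = (byte == 'x');
            munmap(p, len);
        }
    }
    close(fd);
    return result;
}

#ifdef DEMO_MAIN    /* build with: cc -DDEMO_MAIN demo.c */
int main(void)
{
    int shared = file_sees_write(MAP_SHARED);
    int priv = file_sees_write(MAP_PRIVATE);

    assert(shared >= 0 && priv >= 0);   /* both mappings must have succeeded */
    printf("MAP_SHARED:  file sees write = %d\n", shared);
    printf("MAP_PRIVATE: file sees write = %d\n", priv);
    return 0;
}
#endif
```

With MAP_SHARED the store lands in the backing file; with MAP_PRIVATE it
goes to a copy-on-write page that the file never sees. That would be the
problem for any mem-path use where something else is expected to look at
the backing file.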