From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4A3FBB68.3000708@redhat.com>
Date: Mon, 22 Jun 2009 20:12:08 +0300
From: Avi Kivity
MIME-Version: 1.0
Subject: Re: [Qemu-devel] Re: [Qemu-commits] [COMMIT 3086844] Instead of writing a zero page, madvise it away
References: <200906221549.n5MFn3Qd015389@d03av02.boulder.ibm.com> <4A3FAD69.60507@redhat.com> <4A3FB077.4040607@codemonkey.ws> <4A3FB390.4060809@redhat.com> <4A3FB829.10203@us.ibm.com>
In-Reply-To: <4A3FB829.10203@us.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: qemu-devel

On 06/22/2009 07:58 PM, Anthony Liguori wrote:
>> Note that the patch contains a small bug -- the kernel is allowed to
>> ignore the advice according to the manual page, so it's better to
>> memset() the memory before dropping it.
>
> Hrm, that's not quite how I interpreted the man page.
>
> "This call
> does not influence the semantics of the application (except in the case
> of MADV_DONTNEED), but may influence its performance. The kernel is
> free to ignore the advice."
>
> MADV_DONTNEED is called out as changing the application semantics.
> Specifically, I think the kernel has to zero-fill even if it chooses to
> ignore the advice.
>
> I limited the guard to Linux specifically because I was unsure about
> that behavior but it would be good to clarify if anyone knows how.

This is not POSIX (there is a POSIX_MADV_DONTNEED which appears to be
non-destructive (and a better complement to the other advices)), so the
only references are the manual page and the code.

I don't see Linux ever avoiding ... brake brake brake ... the kernel
certainly can ignore the advice:

    static long madvise_dontneed(struct vm_area_struct * vma,
                                 struct vm_area_struct ** prev,
                                 unsigned long start, unsigned long end)
    {
        *prev = vma;
        if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
            return -EINVAL;

        if (unlikely(vma->vm_flags & VM_NONLINEAR)) {
            struct zap_details details = {
                .nonlinear_vma = vma,
                .last_index = ULONG_MAX,
            };
            zap_page_range(vma, start, end - start, &details);
        } else
            zap_page_range(vma, start, end - start, NULL);
        return 0;
    }

it won't do it silently, but we don't check the return code either.
Let's take the safe path on this and zero the page.

-- 
error compiling committee.c: too many arguments to function
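
For illustration only, the "safe path" might look roughly like the sketch below. This is not the code from the commit under discussion: discard_zero_page() and page_is_zero() are hypothetical helper names, and the sketch assumes a page-aligned buffer inside a writable private mapping. The idea is to memset() first, so the page reads as zero even when madvise() refuses the range (for example with EINVAL on a locked or hugetlb mapping), and to hand the return code back to the caller instead of dropping it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: zero a page, then ask the kernel to drop it.
 * Zeroing first means the contents are correct even if the advice is
 * rejected. Returns 0 on success or a negative errno value from
 * madvise(); in both cases the page reads back as zero.
 */
static int discard_zero_page(void *page, size_t len)
{
    assert(page != NULL);

    memset(page, 0, len);
#ifdef __linux__
    /* The guard mirrors the Linux-only restriction mentioned in the thread. */
    if (madvise(page, len, MADV_DONTNEED) != 0)
        return -errno;
#endif
    return 0;
}

/* Hypothetical checker: returns 1 if every byte in the range is zero. */
static int page_is_zero(const unsigned char *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i] != 0)
            return 0;
    }
    return 1;
}
```

A caller can log a nonzero return purely as a missed memory optimization, since correctness no longer depends on the kernel honoring the advice.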