From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrea Arcangeli
Subject: Re: unit tests and get_user_pages_ptes_fast()
Date: Tue, 5 Oct 2010 16:32:46 +0200
Message-ID: <20101005143246.GW26357@random.random>
References: <4CA99FE0.7070709@redhat.com>
 <20101004134052.GQ26357@random.random>
 <20101004235953.GA1474@amt.cnet>
 <4CAAD59B.5020003@redhat.com>
 <20101005092217.GA15663@amt.cnet>
 <20101005141547.GV26357@random.random>
 <4CAB3541.3080505@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Marcelo Tosatti, KVM list
To: Avi Kivity
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:34131 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1750867Ab0JEOcr (ORCPT ); Tue, 5 Oct 2010 10:32:47 -0400
Received: from int-mx03.intmail.prod.int.phx2.redhat.com
 (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.16])
 by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o95EWlxt028419
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
 for ; Tue, 5 Oct 2010 10:32:47 -0400
Content-Disposition: inline
In-Reply-To: <4CAB3541.3080505@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Tue, Oct 05, 2010 at 04:25:05PM +0200, Avi Kivity wrote:
> On 10/05/2010 04:15 PM, Andrea Arcangeli wrote:
> > On Tue, Oct 05, 2010 at 06:22:17AM -0300, Marcelo Tosatti wrote:
> > > It'll not be so advantageous for ksm because there should be read-faults
> > > very rarely on that case.
> >
> > It'll also make all clean swapcache dirty for no good.
> >
> > > Will post.
> >
> > If we've to walk pagetables twice, why don't you do this:
> >
> > writable=1
> > get_user_pages_fast(write=write_fault)
> > if (!write_fault)
> >     writable = __get_user_pages_fast(write=1)
> >
> > That will solve the debugging knob and it'll solve ksm and it'll be
> > optimal for read swapins on exclusive clean swapcache too.
>
> But it means an extra vmexit in the following case:
>
> - read fault
> - page is present and writeable in the Linux page table
>
> which is very common. For this you need get_user_pages_ptes_fast().

With a read fault, the VM already sets the pte as writable if the VM
permissions allow that and the page isn't shared (i.e. if it's an
exclusive swap page). We just have to check whether it did that or not.
So when it's a read fault we have to run __get_user_pages_fast(write=1)
before we can assume the page is mapped writable in the pte.

So I don't see the problem... in terms of page faults it is optimal.
The only downside is having to walk the pagetables twice, the second
time to verify whether the first gup_fast has marked the host pte
writable or not.
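
To make the two-walk idea concrete, here is a rough userspace sketch in C. It is not kernel code: struct toy_pte, toy_gup_fast(), toy_gup_fast_nofault() and toy_fault() are all made-up stand-ins that only model the behaviour described above (a read fault maps the pte writable when the permissions allow it and the page is exclusive; the second, non-faulting walk merely checks the pte). Refcounting is left out entirely; in the real thing the second lookup would presumably take an extra page reference that has to be dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a host pte. Every field here is hypothetical. */
struct toy_pte {
    bool present;       /* page is mapped in the host page table */
    bool writable;      /* host pte currently allows writes */
    bool vma_writable;  /* VM permissions allow writing */
    bool exclusive;     /* page is not shared (e.g. exclusive swap page) */
};

/* Stand-in for get_user_pages_fast(): may fault the page in. */
static int toy_gup_fast(struct toy_pte *pte, bool write)
{
    assert(pte);
    if (write && !pte->vma_writable)
        return 0;
    if (!pte->present) {
        /* Fault in: even a read fault maps the pte writable when
         * the permissions allow it and the page is exclusive. */
        pte->present = true;
        pte->writable = pte->vma_writable && (write || pte->exclusive);
        if (write)
            pte->exclusive = true;
    } else if (write && !pte->writable) {
        /* Write fault on a read-only pte: model it as a COW break. */
        pte->writable = true;
        pte->exclusive = true;
    }
    return 1;
}

/* Stand-in for __get_user_pages_fast(): never faults, only checks. */
static int toy_gup_fast_nofault(const struct toy_pte *pte, bool write)
{
    assert(pte);
    if (!pte->present)
        return 0;
    if (write && !pte->writable)
        return 0;
    return 1;
}

/* The sequence from the quoted pseudocode: fault with the guest's
 * access type, then on a read fault walk again with write=1 to see
 * whether the first walk left the host pte writable. */
static bool toy_fault(struct toy_pte *pte, bool write_fault, bool *writable)
{
    *writable = true;
    if (toy_gup_fast(pte, write_fault) != 1)
        return false;
    if (!write_fault)
        *writable = toy_gup_fast_nofault(pte, true) == 1;
    return true;
}
```

In this model a read fault on a page that is already present and writable reports writable=1 after the second walk, while a read fault on a shared read-only page (the ksm case) reports writable=0 and leaves the pte untouched.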