From mboxrd@z Thu Jan 1 00:00:00 1970
From: Scott Wood
Date: Mon, 08 Aug 2011 23:13:41 +0000
Subject: Re: [PATCH v5 5/5] KVM: PPC: e500: MMU API
Message-Id: <4E406DA5.7070803@freescale.com>
List-Id:
References: <20110707234159.GE6646@schlenkerla.am.freescale.net>
In-Reply-To: <20110707234159.GE6646@schlenkerla.am.freescale.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: kvm-ppc@vger.kernel.org

On 08/08/2011 03:49 AM, Johannes Weiner wrote:
> On Mon, Jul 25, 2011 at 11:50:50PM +0200, Alexander Graf wrote:
>>
>> Well, alternatively we could simply bail out if the memory is not
>> anonymous, right? Then the pinning on get_user_pages_fast should be
>> enough. Johannes, would there be any downside to this approach?
>
> I don't see any correctness issues. Maybe Andrea does?
>
> While the userspace pages are never freed because of your reference,
> it does not prevent reclaim from writing them to swap and unmapping
> them from the user's page tables.

Being unmapped from the user's page tables isn't a problem, as long as
the mapping, if it is faulted back in before the I/O reference is
released, points at the same physical page. Anything else seems like
it would break using get_user_pages() to implement read() -- you could
be swapping out the wrong data.

I hope that the "there may even be a completely different page there in
some cases (eg. if mmapped pagecache has been invalidated and
subsequently re faulted)" in the __get_user_pages() comment is referring
to the !FOLL_WRITE case (or an explicit mapping change from userspace).

This usage of get_user_pages() is pretty similar to how the guest's
memory is dealt with. When the guest adds a TLB entry,
get_user_pages_fast() gets called. The guest page also doesn't get
marked dirty until just before release, and userspace may access the
memory before then (for debugging the guest, emulated DMA, etc). If
that's not a problem, it shouldn't be a problem here either.

-Scott