From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrea Arcangeli Subject: Re: [PATCH] Handle vma regions with no backing page Date: Wed, 4 Jun 2008 21:34:03 +0200 Message-ID: <20080604193402.GO21613@duo.random> References: <20080603113937.GE8158@duo.random> <1212592164.26322.10.camel@lnx-benami> <20080604161755.GC7089@il.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Ben-Ami Yassour1 , aliguori@us.ibm.com, Han Weidong , "Kay, Allen M" , Amit Shah , kvm@vger.kernel.org, avi@qumranet.com To: Muli Ben-Yehuda Return-path: Received: from host36-195-149-62.serverdedicati.aruba.it ([62.149.195.36]:56405 "EHLO mx.cpushare.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750987AbYFDTeH (ORCPT ); Wed, 4 Jun 2008 15:34:07 -0400 Content-Disposition: inline In-Reply-To: <20080604161755.GC7089@il.ibm.com> Sender: kvm-owner@vger.kernel.org List-ID: On Wed, Jun 04, 2008 at 07:17:55PM +0300, Muli Ben-Yehuda wrote: > On Wed, Jun 04, 2008 at 06:09:24PM +0300, Ben-Ami Yassour1 wrote: > > > > > We noticed that pfn_valid does not always works as expected by > > > > this patch to indicate that a pfn has a backing page. We have > > > > seen a case where CONFIG_NUMA was not set and then where > > > > pfn_valid returned 1 for an mmio pfn. We then changed the > > > > config file with CONFIG_NUMA set and it worked fine as expected > > > > (since a different implementation of pfn_valid was used). > > > > > > > > How should we overcome this issue? > > > > > > There's a page_is_ram() too, but that's the e820 map check and it > > > means it's RAM not that there's a page backing store. Certainly if > > > it's not ram we should go ahead with just the pfn but it'd be a > > > workaround. > > > > > > I really think it'd be better off to fix pfn_valid to work for > > > NUMA. > > > > It does work for NUMA, it does not work without the NUMA option. > > Andrea, how would you suggest to fix pfn_valid for the CONFIG_NUMA > disabled case? I'm very surprised that this is broken for non-numa. To be sure I understand, what exactly means that "pfn has not a backing page"? I think we must fix pfn_valid, so that when it returns true, pfn_to_page returns an entry that is contained inside some mem_map array allocated by mm/page_alloc.c. pfn_valid is totally right to return true on mmio regions, as long as a 'struct page' exists. Like for example the 640k-1M area. It'd be a waste to pay the price of a discontiguous array to save a few struct pages (or perhaps these days they're inefficiently using 4k tlb entries to map struct page dunno, I surely prefer to waste a few struct pages and stay with 2M tlb). As long as a 'struct page' exists and pfn_to_page returns a 'struct page' checking PageReserved() should be enough to know if it's pageable RAM owned by the VM or an mmio region. I don't know why but when I did reserved-ram patch, the PageReserved check wasn't returning true on the reserved-ram. Perhaps it's because it was ram, dunno, so I had to use the page_count() hack. But for mmio !pfn_valid || PageReserved should work. Said that this mem_map code tends to change all the time for whatever reason, so I wouldn't be shocked if PageReserved doesn't actually work either. But mixing pfn_valid and PageReserved sounds the right objective.