From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Aneesh Kumar K.V"
Date: Tue, 21 Jan 2014 03:52:47 +0000
Subject: Re: [PATCH 0/4] powernv: kvm: numa fault improvement
Message-Id: <87ob36ypc0.fsf@linux.vnet.ibm.com>
List-Id:
References: <1386751674-14136-1-git-send-email-pingfank@linux.vnet.ibm.com> <87d2jm7j3d.fsf@linux.vnet.ibm.com>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Liu ping fan
Cc: Paul Mackerras, linuxppc-dev@lists.ozlabs.org, Alexander Graf, kvm-ppc@vger.kernel.org

Liu ping fan writes:

> On Mon, Jan 20, 2014 at 11:45 PM, Aneesh Kumar K.V
> wrote:
>> Liu ping fan writes:
>>
>>> On Thu, Jan 9, 2014 at 8:08 PM, Alexander Graf wrote:
>>>>
>>>> On 11.12.2013, at 09:47, Liu Ping Fan wrote:
>>>>
>>>>> This series is based on Aneesh's series "[PATCH -V2 0/5] powerpc: mm: Numa faults support for ppc64"
>>>>>
>>>>> For this series, I apply the same idea from the previous thread "[PATCH 0/3] optimize for powerpc _PAGE_NUMA"
>>>>> (for which, I still try to get a machine to show nums)
>>>>>
>>>>> But for this series, I think that I have a good justification -- the fact of heavy cost when switching context between guest and host,
>>>>> which is well known.
>>>>
>>>> This cover letter isn't really telling me anything. Please put a proper description of what you're trying to achieve, why you're trying to achieve what you're trying and convince your readers that it's a good idea to do it the way you do it.
>>>>
>>> Sorry for the unclear message. After introducing the _PAGE_NUMA,
>>> kvmppc_do_h_enter() can not fill up the hpte for guest. Instead, it
>>> should rely on host's kvmppc_book3s_hv_page_fault() to call
>>> do_numa_page() to do the numa fault check. This incurs the overhead
>>> when exiting from rmode to vmode. My idea is that in
>>> kvmppc_do_h_enter(), we do a quick check, if the page is right placed,
>>> there is no need to exit to vmode (i.e saving htab, slab switching)
>>
>> Can you explain more. Are we looking at hcall from guest and
>> hypervisor handling them in real mode ? If so why would guest issue a
>> hcall on a pte entry that have PAGE_NUMA set. Or is this about
>> hypervisor handling a missing hpte, because of host swapping this page
>> out ? In that case how we end up in h_enter ? IIUC for that case we
>> should get to kvmppc_hpte_hv_fault.
>>
> After setting _PAGE_NUMA, we should flush out all hptes both in host's
> htab and guest's. So when guest tries to access memory, host finds
> that there is not hpte ready for guest in guest's htab. And host
> should raise dsi to guest.

Now the guest receives that fault, removes the PAGE_NUMA bit and does an
hpte_insert. So before we do an hpte_insert (or H_ENTER) we should have
cleared the PAGE_NUMA bit.

> This incurs that guest ends up in h_enter.
> And you can see in current code, we also try this quick path firstly.
> Only if fail, we will resort to slow path -- kvmppc_hpte_hv_fault.

hmm ? hpte_hv_fault is the hypervisor handling the fault.

-aneesh
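
The ordering argued in the reply above (the fault reaches the guest, the guest clears PAGE_NUMA, and only then does hpte_insert/H_ENTER) can be sketched as a toy model. This is not kernel code: the class, the method names, and the flag values are all hypothetical and chosen only for illustration, and the model covers only the guest-side sequence, not the host-side handling that the rest of the thread debates.

```python
from dataclasses import dataclass, field

# Hypothetical flag values, for illustration only (not the kernel's).
PAGE_PRESENT = 0x1
PAGE_NUMA = 0x2


@dataclass
class ToyGuest:
    """Toy model of a guest's PTE flags and its hashed page table (htab)."""
    ptes: dict = field(default_factory=dict)  # virtual page -> flag bits
    htab: set = field(default_factory=set)    # pages that have a valid hpte
    log: list = field(default_factory=list)   # ordered trace of events

    def mark_numa(self, vpage):
        # Setting the NUMA bit also flushes the hpte, so the next access faults.
        self.ptes[vpage] |= PAGE_NUMA
        self.htab.discard(vpage)
        self.log.append("mark_numa")

    def access(self, vpage):
        if vpage in self.htab:
            self.log.append("hit")
            return
        # No hpte found: the fault is delivered to the guest (the DSI step).
        self.log.append("dsi")
        self.handle_fault(vpage)

    def handle_fault(self, vpage):
        # The guest fault handler clears the NUMA bit first ...
        if self.ptes[vpage] & PAGE_NUMA:
            self.ptes[vpage] &= ~PAGE_NUMA
            self.log.append("clear_numa")
        # ... and only then inserts the hpte (the H_ENTER step).
        self.h_enter(vpage)

    def h_enter(self, vpage):
        # With this ordering, H_ENTER never sees a PTE with the NUMA bit set.
        assert not (self.ptes[vpage] & PAGE_NUMA)
        self.htab.add(vpage)
        self.log.append("h_enter")


guest = ToyGuest(ptes={0x10: PAGE_PRESENT}, htab={0x10})
guest.mark_numa(0x10)
guest.access(0x10)  # faults, clears the bit, re-inserts the hpte
guest.access(0x10)  # now hits
print(guest.log)    # ['mark_numa', 'dsi', 'clear_numa', 'h_enter', 'hit']
```

The assertion in `h_enter` encodes the claim being made: if the guest always clears the bit before inserting, a check for PAGE_NUMA at H_ENTER time would never fire in this sequence.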