Re: [PATCH 0/4] powernv: kvm: numa fault improvement

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Liu ping fan <kernelfans@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org, Alexander Graf <agraf@suse.de>,
	kvm-ppc@vger.kernel.org
Subject: Re: [PATCH 0/4] powernv: kvm: numa fault improvement
Date: Tue, 21 Jan 2014 03:52:47 +0000
Message-ID: <87ob36ypc0.fsf@linux.vnet.ibm.com>
In-Reply-To: <CAFgQCTt2eWHF4iZP6bx075ZFMx8yhqnZgHnZwm2uLLOOCar+XQ@mail.gmail.com>

Liu ping fan <kernelfans@gmail.com> writes:

> On Mon, Jan 20, 2014 at 11:45 PM, Aneesh Kumar K.V
> <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> Liu ping fan <kernelfans@gmail.com> writes:
>>
>>> On Thu, Jan 9, 2014 at 8:08 PM, Alexander Graf <agraf@suse.de> wrote:
>>>>
>>>> On 11.12.2013, at 09:47, Liu Ping Fan <kernelfans@gmail.com> wrote:
>>>>
>>>>> This series is based on Aneesh's series  "[PATCH -V2 0/5] powerpc: mm: Numa faults support for ppc64"
>>>>>
>>>>> For this series, I apply the same idea from the previous thread "[PATCH 0/3] optimize for powerpc _PAGE_NUMA"
>>>>> (for which, I still try to get a machine to show nums)
>>>>>
>>>>> But for this series, I think that I have a good justification -- the fact of heavy cost when switching context between guest and host,
>>>>> which is  well known.
>>>>
>>>> This cover letter isn't really telling me anything. Please put a proper description of what you're trying to achieve, why you're trying to achieve what you're trying and convince your readers that it's a good idea to do it the way you do it.
>>>>
>>> Sorry for the unclear message. After introducing the _PAGE_NUMA,
>>> kvmppc_do_h_enter() can not fill up the hpte for guest. Instead, it
>>> should rely on host's kvmppc_book3s_hv_page_fault() to call
>>> do_numa_page() to do the numa fault check. This incurs the overhead
>>> when exiting from rmode to vmode.  My idea is that in
>>> kvmppc_do_h_enter(), we do a quick check, if the page is right placed,
>>> there is no need to exit to vmode (i.e saving htab, slab switching)
>>
>> Can you explain more. Are we looking at hcall from guest  and
>> hypervisor handling them in real mode ? If so why would guest issue a
>> hcall on a pte entry that have PAGE_NUMA set. Or is this about
>> hypervisor handling a missing hpte, because of host swapping this page
>> out ? In that case how we end up in h_enter ? IIUC for that case we
>> should get to kvmppc_hpte_hv_fault.
>>
> After setting _PAGE_NUMA, we should flush out all hptes both in host's
> htab and guest's. So when guest tries to access memory, host finds
> that there is not hpte ready for guest in guest's htab. And host
> should raise dsi to guest.

Now the guest receives that fault, removes the PAGE_NUMA bit and does an
hpte_insert. So before we do an hpte_insert (or H_ENTER) we should have
cleared the PAGE_NUMA bit.

> This incurs that guest ends up in h_enter.
> And you can see in current code, we also try this quick path firstly.
> Only if fail, we will resort to slow path --  kvmppc_hpte_hv_fault.

Hmm? hpte_hv_fault is the hypervisor handling the fault.

-aneesh
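
The ordering argued for in the reply above (the guest takes the fault,
clears the PAGE_NUMA bit, and only then inserts the hpte) can be sketched
as a toy model. This is an illustration under stated assumptions, not
kernel code: ToyGuest, mark_numa, handle_fault, access and the flag values
are all hypothetical names invented for the sketch, and h_enter here is
only a stand-in for the H_ENTER hcall.

```python
# Toy model only: every name here is hypothetical, not a kernel symbol.
PAGE_PRESENT = 0x1
PAGE_NUMA = 0x2  # stand-in for _PAGE_NUMA; the real bit value differs


class ToyGuest:
    def __init__(self):
        self.pte = {}   # guest Linux PTEs: address -> flags
        self.htab = {}  # guest hash table: address -> flags seen at insert

    def mark_numa(self, addr):
        """Set the NUMA hint bit and flush the matching hpte."""
        self.pte[addr] |= PAGE_NUMA
        self.htab.pop(addr, None)

    def h_enter(self, addr):
        """Stand-in for H_ENTER: record the PTE flags it was handed."""
        flags = self.pte[addr]
        self.htab[addr] = flags
        return flags

    def handle_fault(self, addr):
        """Guest fault path: clear the NUMA bit first, then insert."""
        self.pte[addr] &= ~PAGE_NUMA
        return self.h_enter(addr)

    def access(self, addr):
        """Return True if the access had to take a fault first."""
        if addr in self.htab:
            return False
        self.handle_fault(addr)
        return True
```

In this model the flags recorded by h_enter never carry the NUMA bit,
which is the point being made: if the guest follows this order, H_ENTER
should not be called on an entry that still has PAGE_NUMA set.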
