Re: [RFC PATCH] KVM: PPC: BOOK3S: HV: THP support for guest

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Alexander Graf <agraf@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org, paulus@samba.org,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [RFC PATCH] KVM: PPC: BOOK3S: HV: THP support for guest
Date: Tue, 06 May 2014 19:26:40 +1000
Message-ID: <1399368400.18906.9.camel@pasglop>
In-Reply-To: <5368A78D.4070509@suse.de>

On Tue, 2014-05-06 at 11:12 +0200, Alexander Graf wrote:

> So if I understand this patch correctly, it simply introduces logic to 
> handle page sizes other than 4k, 64k, 16M by analyzing the actual page 
> size field in the HPTE. Mind to explain why exactly that enables us to 
> use THP?
>
> What exactly is the flow if the pages are not backed by huge pages? What 
> is the flow when they start to get backed by huge pages?

The hypervisor doesn't care about segments ... but it needs to properly
decode the page size requested by the guest, if anything, to issue the
right form of tlbie instruction.

The encoding in the HPTE for a 16M page inside a 64K segment differs
from the encoding for a 16M page in a 16M segment. This is done so that
the encoding carries both pieces of information, which, among other
things, allows a broadcast tlbie to find the right set in the TLB for
invalidations.

So from a KVM perspective, we don't know whether the guest is doing THP
or something else (Linux calls it THP, but all we care about here is
that this is MPSS; a guest other than Linux might exploit it
differently).

What we do know is that if we advertise MPSS, we need to decode the page
sizes encoded in the HPTE so that we know what we are dealing with in
H_ENTER and can do the appropriate TLB invalidations in H_REMOVE &
evictions.
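
To make the decoding step concrete, here is a minimal sketch, not the
actual KVM code. It assumes a table indexed by (segment base page size,
actual page size) that holds the HPTE page-size encoding. The enum, the
table name and the encoding values are hypothetical placeholders chosen
for illustration; real values come from the CPU's page size definitions.

```c
#include <stddef.h>

/* Hypothetical page-size indices, loosely modelled on MMU_PAGE_*. */
enum psize { PSIZE_4K, PSIZE_64K, PSIZE_16M, PSIZE_COUNT };

static const unsigned int psize_shift[PSIZE_COUNT] = { 12, 16, 24 };

#define PENC_INVALID (-1)

/*
 * penc_table[base][actual]: encoding stored in the HPTE for a large page
 * of size "actual" inside a segment whose base page size is "base".
 * The numbers are illustrative placeholders (an assumption of this
 * sketch). The point is only that (64K, 16M) and (16M, 16M) differ.
 */
static const int penc_table[PSIZE_COUNT][PSIZE_COUNT] = {
	[PSIZE_4K]  = { PENC_INVALID, 0x7,          0x38 },
	[PSIZE_64K] = { PENC_INVALID, 0x1,          0x8  },
	[PSIZE_16M] = { PENC_INVALID, PENC_INVALID, 0x0  },
};

/*
 * Given the segment base page size and the encoding found in the HPTE,
 * return the actual page size in bytes, or 0 if the combination is not
 * one we advertise.
 */
unsigned long decode_actual_page_size(enum psize base, int penc)
{
	for (int a = 0; a < PSIZE_COUNT; a++) {
		if (penc_table[base][a] == PENC_INVALID)
			continue;
		if (penc_table[base][a] == penc)
			return 1ul << psize_shift[a];
	}
	return 0;
}
```

With a table like this, H_ENTER can reject combinations that return 0,
and H_REMOVE can pick the tlbie form that matches the decoded size.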

> > +			if (a_size != -1)
> > +				return 1ul << mmu_psize_defs[a_size].shift;
> > +		}
> > +
> > +	}
> > +	return 0;
> >   }
> >   
> >   static inline unsigned long hpte_rpn(unsigned long ptel, unsigned long psize)
> > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > index 8227dba5af0f..a38d3289320a 100644
> > --- a/arch/powerpc/kvm/book3s_hv.c
> > +++ b/arch/powerpc/kvm/book3s_hv.c
> > @@ -1949,6 +1949,13 @@ static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
> >   	 * support pte_enc here
> >   	 */
> >   	(*sps)->enc[0].pte_enc = def->penc[linux_psize];
> > +	/*
> > +	 * Add 16MB MPSS support
> > +	 */
> > +	if (linux_psize != MMU_PAGE_16M) {
> > +		(*sps)->enc[1].page_shift = 24;
> > +		(*sps)->enc[1].pte_enc = def->penc[MMU_PAGE_16M];
> > +	}
> 
> So this basically indicates that every segment (except for the 16MB one) 
> can also handle 16MB MPSS page sizes? I suppose you want to remove the 
> comment in kvm_vm_ioctl_get_smmu_info_hv() that says we don't do MPSS here.

I haven't reviewed the code there. Make sure it does indeed produce a
different encoding for every combination of segment and actual page
size.

> Can we also ensure that every system we run on can do MPSS?

P7 and P8 are identical in that regard. However 970 doesn't do MPSS so
let's make sure we get that right.
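
One way to get that right is to gate the extra 16M entry on the CPU, as
in the rough sketch below. This is an illustration only: the structures
are simplified stand-ins for the ones in the quoted hunk, and
cpu_has_mpss(), enum cpu_family and fill_seg_page_size() are
hypothetical names, not existing kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENC 8

/* Simplified stand-ins for the structures used in the quoted hunk. */
struct page_size_enc {
	unsigned int page_shift;
	unsigned int pte_enc;
};

struct seg_page_size {
	unsigned int base_shift;
	struct page_size_enc enc[MAX_ENC];
};

/* Hypothetical CPU classification for this sketch. */
enum cpu_family { CPU_PPC970, CPU_POWER7, CPU_POWER8 };

/* P7 and P8 do MPSS, 970 does not. */
static bool cpu_has_mpss(enum cpu_family cpu)
{
	return cpu == CPU_POWER7 || cpu == CPU_POWER8;
}

/*
 * Fill one segment entry and return the number of encodings advertised.
 * The 16M MPSS entry is added only for non-16M segments on CPUs that
 * support MPSS.
 */
size_t fill_seg_page_size(struct seg_page_size *sps, enum cpu_family cpu,
			  unsigned int base_shift, unsigned int base_penc,
			  unsigned int mpss_16m_penc)
{
	size_t n = 0;

	sps->base_shift = base_shift;
	sps->enc[n].page_shift = base_shift;
	sps->enc[n].pte_enc = base_penc;
	n++;

	if (base_shift != 24 && cpu_has_mpss(cpu)) {
		sps->enc[n].page_shift = 24;
		sps->enc[n].pte_enc = mpss_16m_penc;
		n++;
	}
	return n;
}
```

On a 970 this advertises only the base page size for each segment, so
the guest never sees a 16M MPSS encoding it cannot use.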

Cheers,
Ben.
