kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <marc.zyngier@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Christoffer Dall <christoffer.dall@linaro.org>
Cc: "kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"jungseoklee85@gmail.com" <jungseoklee85@gmail.com>,
	Joel Schopp <joel.schopp@amd.com>
Subject: Re: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
Date: Tue, 07 Oct 2014 14:28:43 +0100	[thread overview]
Message-ID: <5433EA8B.6010502@arm.com> (raw)
In-Reply-To: <20141007104846.GB12675@e104818-lin.cambridge.arm.com>

On 07/10/14 11:48, Catalin Marinas wrote:
> On Mon, Oct 06, 2014 at 09:30:25PM +0100, Christoffer Dall wrote:
>> +/**
>> + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
>> + * @kvm:       The KVM struct pointer for the VM.
>> + * @pgd:       The kernel pseudo pgd
>> + *
>> + * When the kernel uses more levels of page tables than the guest, we allocate
>> + * a fake PGD and pre-populate it to point to the next-level page table, which
>> + * will be the real initial page table pointed to by the VTTBR.
>> + *
>> + * When KVM_PREALLOC_LEVEL==2, we allocate a single page for the PMD and
>> + * the kernel will use folded pud.  When KVM_PREALLOC_LEVEL==1, we
>> + * allocate 2 consecutive PUD pages.
>> + */
>> +#if defined(CONFIG_ARM64_64K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 3
>> +#define KVM_PREALLOC_LEVEL     2
>> +#define PTRS_PER_S2_PGD                1
>> +#define S2_PGD_ORDER           get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> I agree that my magic equation wasn't readable ;) (I had troubles
> re-understanding it as well), but you also have some constants here that
> are not immediately obvious where you got to them from. IIUC,
> KVM_PREALLOC_LEVEL == 2 here means that the hardware only understands
> stage 2 pmd and pte. I guess you could look into the ARM ARM tables but
> it's still not clear.
> 
> Let's look at PTRS_PER_S2_PGD as I think it's simpler. My proposal was:
> 
> #if PGDIR_SHIFT > KVM_PHYS_SHIFT
> #define PTRS_PER_S2_PGD			(1)
> #else
> #define PTRS_PER_S2_PGD			(1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))
> #endif
> 
> In this case PGDIR_SHIFT is 42, so we get PTRS_PER_S2_PGD == 1. The 4K
> and 4 levels case below is also correct.
> 
> The KVM start level calculation, we could assume that KVM needs either
> host levels or host levels - 1 (unless we go for some weirdly small
> KVM_PHYS_SHIFT). So we could define them KVM_PREALLOC_LEVEL as:
> 
> #if PTRS_PER_S2_PGD <= 16
> #define KVM_PREALLOC_LEVEL	(4 - CONFIG_ARM64_PGTABLE_LEVELS + 1)
> #else
> #define KVM_PREALLOC_LEVEL	(0)
> #endif
> 
> Basically if you can concatenate 16 or less pages at the level below the
> top, the architecture does not allow a small top level. In this case,
> (4 - CONFIG_ARM64_PGTABLE_LEVELS) represents the first level for the
> host and we add 1 to go to the next level for KVM stage 2 when
> PTRS_PER_S2_PGD is 16 or less. We use 0 when we don't need to
> preallocate.

I think this makes the whole thing clearer (at least for me), as it
makes the relationship between KVM_PREALLOC_LEVEL and
CONFIG_ARM64_PGTABLE_LEVELS explicit (it wasn't completely obvious to me
initially).

>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> +       pud_t *pud;
>> +       pmd_t *pmd;
>> +
>> +       pud = pud_offset(pgd, 0);
>> +       pmd = (pmd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
>> +
>> +       if (!pmd)
>> +               return -ENOMEM;
>> +       pud_populate(NULL, pud, pmd);
>> +
>> +       return 0;
>> +}
>> +
>> +static inline void kvm_free_hwpgd(struct kvm *kvm)
>> +{
>> +       pgd_t *pgd = kvm->arch.pgd;
>> +       pud_t *pud = pud_offset(pgd, 0);
>> +       pmd_t *pmd = pmd_offset(pud, 0);
>> +       free_pages((unsigned long)pmd, 0);
>> +}
>> +
>> +static inline phys_addr_t kvm_get_hwpgd(struct kvm *kvm)
>> +{
>> +       pgd_t *pgd = kvm->arch.pgd;
>> +       pud_t *pud = pud_offset(pgd, 0);
>> +       pmd_t *pmd = pmd_offset(pud, 0);
>> +       return virt_to_phys(pmd);
>> +
>> +}
>> +#elif defined(CONFIG_ARM64_4K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 4
>> +#define KVM_PREALLOC_LEVEL     1
>> +#define PTRS_PER_S2_PGD                2
>> +#define S2_PGD_ORDER           get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> Here PGDIR_SHIFT is 39, so we get PTRS_PER_S2_PGD == (1 << (40 - 39))
> which is 2 and KVM_PREALLOC_LEVEL == 1.
> 
>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> +       pud_t *pud;
>> +
>> +       pud = (pud_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>> +       if (!pud)
>> +               return -ENOMEM;
>> +       pgd_populate(NULL, pgd, pud);
>> +       pgd_populate(NULL, pgd + 1, pud + PTRS_PER_PUD);
>> +
>> +       return 0;
>> +}
> 
> You still need to define these functions but you can make their
> implementation dependent solely on the KVM_PREALLOC_LEVEL rather than
> 64K/4K and levels combinations. If it is KVM_PREALLOC_LEVEL is 1, you
> allocate pud and populate the pgds (in a loop based on the
> PTRS_PER_S2_PGD). If it is 2, you allocate the pmd and populate the pud
> (still in a loop though it would probably be 1 iteration). We know based
> on the assumption above that you can't get KVM_PREALLOC_LEVEL == 2 and
> CONFIG_ARM64_PGTABLE_LEVELS == 4.
> 

Also agreed. Most of what you wrote here could also be gathered as
comments in the patch.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


  reply	other threads:[~2014-10-07 13:28 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-06 20:30 [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 Christoffer Dall
2014-10-07 10:48   ` Catalin Marinas
2014-10-07 13:28     ` Marc Zyngier [this message]
2014-10-07 19:39       ` Christoffer Dall
2014-10-08  9:34         ` Marc Zyngier
2014-10-08  9:47           ` Christoffer Dall
2014-10-08 10:27             ` Marc Zyngier
2014-10-08  9:47         ` Catalin Marinas
2014-10-09 11:01           ` Christoffer Dall
2014-10-09 13:36             ` Catalin Marinas
2014-10-10  8:16               ` Christoffer Dall
2014-10-07 13:40   ` Marc Zyngier
2014-10-08  9:48     ` Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 2/3] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 3/3] arm64: Allow 48-bits VA space without ARM_SMMU Christoffer Dall
2014-10-07  9:24 ` [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits Catalin Marinas
2014-10-07  9:36   ` Christoffer Dall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5433EA8B.6010502@arm.com \
    --to=marc.zyngier@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=christoffer.dall@linaro.org \
    --cc=joel.schopp@amd.com \
    --cc=jungseoklee85@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).