[PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

All of lore.kernel.org
 help / color / mirror / Atom feed

From: marc.zyngier@arm.com (Marc Zyngier)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
Date: Tue, 07 Oct 2014 14:28:43 +0100	[thread overview]
Message-ID: <5433EA8B.6010502@arm.com> (raw)
In-Reply-To: <20141007104846.GB12675@e104818-lin.cambridge.arm.com>

On 07/10/14 11:48, Catalin Marinas wrote:
> On Mon, Oct 06, 2014 at 09:30:25PM +0100, Christoffer Dall wrote:
>> +/**
>> + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
>> + * @kvm:       The KVM struct pointer for the VM.
>> + * @pgd:       The kernel pseudo pgd
>> + *
>> + * When the kernel uses more levels of page tables than the guest, we allocate
>> + * a fake PGD and pre-populate it to point to the next-level page table, which
>> + * will be the real initial page table pointed to by the VTTBR.
>> + *
>> + * When KVM_PREALLOC_LEVEL==2, we allocate a single page for the PMD and
>> + * the kernel will use folded pud.  When KVM_PREALLOC_LEVEL==1, we
>> + * allocate 2 consecutive PUD pages.
>> + */
>> +#if defined(CONFIG_ARM64_64K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 3
>> +#define KVM_PREALLOC_LEVEL     2
>> +#define PTRS_PER_S2_PGD                1
>> +#define S2_PGD_ORDER           get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> I agree that my magic equation wasn't readable ;) (I had troubles
> re-understanding it as well), but you also have some constants here that
> are not immediately obvious where you got to them from. IIUC,
> KVM_PREALLOC_LEVEL == 2 here means that the hardware only understands
> stage 2 pmd and pte. I guess you could look into the ARM ARM tables but
> it's still not clear.
> 
> Let's look at PTRS_PER_S2_PGD as I think it's simpler. My proposal was:
> 
> #if PGDIR_SHIFT > KVM_PHYS_SHIFT
> #define PTRS_PER_S2_PGD			(1)
> #else
> #define PTRS_PER_S2_PGD			(1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))
> #endif
> 
> In this case PGDIR_SHIFT is 42, so we get PTRS_PER_S2_PGD == 1. The 4K
> and 4 levels case below is also correct.
> 
> The KVM start level calculation, we could assume that KVM needs either
> host levels or host levels - 1 (unless we go for some weirdly small
> KVM_PHYS_SHIFT). So we could define them KVM_PREALLOC_LEVEL as:
> 
> #if PTRS_PER_S2_PGD <= 16
> #define KVM_PREALLOC_LEVEL	(4 - CONFIG_ARM64_PGTABLE_LEVELS + 1)
> #else
> #define KVM_PREALLOC_LEVEL	(0)
> #endif
> 
> Basically if you can concatenate 16 or less pages at the level below the
> top, the architecture does not allow a small top level. In this case,
> (4 - CONFIG_ARM64_PGTABLE_LEVELS) represents the first level for the
> host and we add 1 to go to the next level for KVM stage 2 when
> PTRS_PER_S2_PGD is 16 or less. We use 0 when we don't need to
> preallocate.

I think this makes the whole thing clearer (at least for me), as it
makes the relationship between KVM_PREALLOC_LEVEL and
CONFIG_ARM64_PGTABLE_LEVELS explicit (it wasn't completely obvious to me
initially).

>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> +       pud_t *pud;
>> +       pmd_t *pmd;
>> +
>> +       pud = pud_offset(pgd, 0);
>> +       pmd = (pmd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
>> +
>> +       if (!pmd)
>> +               return -ENOMEM;
>> +       pud_populate(NULL, pud, pmd);
>> +
>> +       return 0;
>> +}
>> +
>> +static inline void kvm_free_hwpgd(struct kvm *kvm)
>> +{
>> +       pgd_t *pgd = kvm->arch.pgd;
>> +       pud_t *pud = pud_offset(pgd, 0);
>> +       pmd_t *pmd = pmd_offset(pud, 0);
>> +       free_pages((unsigned long)pmd, 0);
>> +}
>> +
>> +static inline phys_addr_t kvm_get_hwpgd(struct kvm *kvm)
>> +{
>> +       pgd_t *pgd = kvm->arch.pgd;
>> +       pud_t *pud = pud_offset(pgd, 0);
>> +       pmd_t *pmd = pmd_offset(pud, 0);
>> +       return virt_to_phys(pmd);
>> +
>> +}
>> +#elif defined(CONFIG_ARM64_4K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 4
>> +#define KVM_PREALLOC_LEVEL     1
>> +#define PTRS_PER_S2_PGD                2
>> +#define S2_PGD_ORDER           get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> Here PGDIR_SHIFT is 39, so we get PTRS_PER_S2_PGD == (1 << (40 - 39))
> which is 2 and KVM_PREALLOC_LEVEL == 1.
> 
>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> +       pud_t *pud;
>> +
>> +       pud = (pud_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>> +       if (!pud)
>> +               return -ENOMEM;
>> +       pgd_populate(NULL, pgd, pud);
>> +       pgd_populate(NULL, pgd + 1, pud + PTRS_PER_PUD);
>> +
>> +       return 0;
>> +}
> 
> You still need to define these functions but you can make their
> implementation dependent solely on the KVM_PREALLOC_LEVEL rather than
> 64K/4K and levels combinations. If it is KVM_PREALLOC_LEVEL is 1, you
> allocate pud and populate the pgds (in a loop based on the
> PTRS_PER_S2_PGD). If it is 2, you allocate the pmd and populate the pud
> (still in a loop though it would probably be 1 iteration). We know based
> on the assumption above that you can't get KVM_PREALLOC_LEVEL == 2 and
> CONFIG_ARM64_PGTABLE_LEVELS == 4.
> 

Also agreed. Most of what you wrote here could also be gathered as
comments in the patch.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

WARNING: multiple messages have this Message-ID (diff)

From: Marc Zyngier <marc.zyngier@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Christoffer Dall <christoffer.dall@linaro.org>
Cc: "kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"jungseoklee85@gmail.com" <jungseoklee85@gmail.com>,
	Joel Schopp <joel.schopp@amd.com>
Subject: Re: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
Date: Tue, 07 Oct 2014 14:28:43 +0100	[thread overview]
Message-ID: <5433EA8B.6010502@arm.com> (raw)
In-Reply-To: <20141007104846.GB12675@e104818-lin.cambridge.arm.com>

On 07/10/14 11:48, Catalin Marinas wrote:
> On Mon, Oct 06, 2014 at 09:30:25PM +0100, Christoffer Dall wrote:
>> +/**
>> + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
>> + * @kvm:       The KVM struct pointer for the VM.
>> + * @pgd:       The kernel pseudo pgd
>> + *
>> + * When the kernel uses more levels of page tables than the guest, we allocate
>> + * a fake PGD and pre-populate it to point to the next-level page table, which
>> + * will be the real initial page table pointed to by the VTTBR.
>> + *
>> + * When KVM_PREALLOC_LEVEL==2, we allocate a single page for the PMD and
>> + * the kernel will use folded pud.  When KVM_PREALLOC_LEVEL==1, we
>> + * allocate 2 consecutive PUD pages.
>> + */
>> +#if defined(CONFIG_ARM64_64K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 3
>> +#define KVM_PREALLOC_LEVEL     2
>> +#define PTRS_PER_S2_PGD                1
>> +#define S2_PGD_ORDER           get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> I agree that my magic equation wasn't readable ;) (I had troubles
> re-understanding it as well), but you also have some constants here that
> are not immediately obvious where you got to them from. IIUC,
> KVM_PREALLOC_LEVEL == 2 here means that the hardware only understands
> stage 2 pmd and pte. I guess you could look into the ARM ARM tables but
> it's still not clear.
> 
> Let's look at PTRS_PER_S2_PGD as I think it's simpler. My proposal was:
> 
> #if PGDIR_SHIFT > KVM_PHYS_SHIFT
> #define PTRS_PER_S2_PGD			(1)
> #else
> #define PTRS_PER_S2_PGD			(1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))
> #endif
> 
> In this case PGDIR_SHIFT is 42, so we get PTRS_PER_S2_PGD == 1. The 4K
> and 4 levels case below is also correct.
> 
> The KVM start level calculation, we could assume that KVM needs either
> host levels or host levels - 1 (unless we go for some weirdly small
> KVM_PHYS_SHIFT). So we could define them KVM_PREALLOC_LEVEL as:
> 
> #if PTRS_PER_S2_PGD <= 16
> #define KVM_PREALLOC_LEVEL	(4 - CONFIG_ARM64_PGTABLE_LEVELS + 1)
> #else
> #define KVM_PREALLOC_LEVEL	(0)
> #endif
> 
> Basically if you can concatenate 16 or less pages at the level below the
> top, the architecture does not allow a small top level. In this case,
> (4 - CONFIG_ARM64_PGTABLE_LEVELS) represents the first level for the
> host and we add 1 to go to the next level for KVM stage 2 when
> PTRS_PER_S2_PGD is 16 or less. We use 0 when we don't need to
> preallocate.

I think this makes the whole thing clearer (at least for me), as it
makes the relationship between KVM_PREALLOC_LEVEL and
CONFIG_ARM64_PGTABLE_LEVELS explicit (it wasn't completely obvious to me
initially).

>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> +       pud_t *pud;
>> +       pmd_t *pmd;
>> +
>> +       pud = pud_offset(pgd, 0);
>> +       pmd = (pmd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
>> +
>> +       if (!pmd)
>> +               return -ENOMEM;
>> +       pud_populate(NULL, pud, pmd);
>> +
>> +       return 0;
>> +}
>> +
>> +static inline void kvm_free_hwpgd(struct kvm *kvm)
>> +{
>> +       pgd_t *pgd = kvm->arch.pgd;
>> +       pud_t *pud = pud_offset(pgd, 0);
>> +       pmd_t *pmd = pmd_offset(pud, 0);
>> +       free_pages((unsigned long)pmd, 0);
>> +}
>> +
>> +static inline phys_addr_t kvm_get_hwpgd(struct kvm *kvm)
>> +{
>> +       pgd_t *pgd = kvm->arch.pgd;
>> +       pud_t *pud = pud_offset(pgd, 0);
>> +       pmd_t *pmd = pmd_offset(pud, 0);
>> +       return virt_to_phys(pmd);
>> +
>> +}
>> +#elif defined(CONFIG_ARM64_4K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 4
>> +#define KVM_PREALLOC_LEVEL     1
>> +#define PTRS_PER_S2_PGD                2
>> +#define S2_PGD_ORDER           get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> Here PGDIR_SHIFT is 39, so we get PTRS_PER_S2_PGD == (1 << (40 - 39))
> which is 2 and KVM_PREALLOC_LEVEL == 1.
> 
>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> +       pud_t *pud;
>> +
>> +       pud = (pud_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>> +       if (!pud)
>> +               return -ENOMEM;
>> +       pgd_populate(NULL, pgd, pud);
>> +       pgd_populate(NULL, pgd + 1, pud + PTRS_PER_PUD);
>> +
>> +       return 0;
>> +}
> 
> You still need to define these functions but you can make their
> implementation dependent solely on the KVM_PREALLOC_LEVEL rather than
> 64K/4K and levels combinations. If it is KVM_PREALLOC_LEVEL is 1, you
> allocate pud and populate the pgds (in a loop based on the
> PTRS_PER_S2_PGD). If it is 2, you allocate the pmd and populate the pud
> (still in a loop though it would probably be 1 iteration). We know based
> on the assumption above that you can't get KVM_PREALLOC_LEVEL == 2 and
> CONFIG_ARM64_PGTABLE_LEVELS == 4.
> 

Also agreed. Most of what you wrote here could also be gathered as
comments in the patch.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

next prev parent reply	other threads:[~2014-10-07 13:28 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-06 20:30 [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits Christoffer Dall
2014-10-06 20:30 ` Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 Christoffer Dall
2014-10-06 20:30   ` Christoffer Dall
2014-10-07 10:48   ` Catalin Marinas
2014-10-07 10:48     ` Catalin Marinas
2014-10-07 13:28     ` Marc Zyngier [this message]
2014-10-07 13:28       ` Marc Zyngier
2014-10-07 19:39       ` Christoffer Dall
2014-10-07 19:39         ` Christoffer Dall
2014-10-08  9:34         ` Marc Zyngier
2014-10-08  9:34           ` Marc Zyngier
2014-10-08  9:47           ` Christoffer Dall
2014-10-08  9:47             ` Christoffer Dall
2014-10-08 10:27             ` Marc Zyngier
2014-10-08 10:27               ` Marc Zyngier
2014-10-08  9:47         ` Catalin Marinas
2014-10-08  9:47           ` Catalin Marinas
2014-10-09 11:01           ` Christoffer Dall
2014-10-09 11:01             ` Christoffer Dall
2014-10-09 13:36             ` Catalin Marinas
2014-10-09 13:36               ` Catalin Marinas
2014-10-10  8:16               ` Christoffer Dall
2014-10-10  8:16                 ` Christoffer Dall
2014-10-07 13:40   ` Marc Zyngier
2014-10-07 13:40     ` Marc Zyngier
2014-10-08  9:48     ` Christoffer Dall
2014-10-08  9:48       ` Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 2/3] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE Christoffer Dall
2014-10-06 20:30   ` Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 3/3] arm64: Allow 48-bits VA space without ARM_SMMU Christoffer Dall
2014-10-06 20:30   ` Christoffer Dall
2014-10-07  9:24 ` [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits Catalin Marinas
2014-10-07  9:24   ` Catalin Marinas
2014-10-07  9:36   ` Christoffer Dall
2014-10-07  9:36     ` Christoffer Dall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5433EA8B.6010502@arm.com \
    --to=marc.zyngier@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.