From: Marc Zyngier <marc.zyngier@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
Christoffer Dall <christoffer.dall@linaro.org>
Cc: "kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"jungseoklee85@gmail.com" <jungseoklee85@gmail.com>,
Joel Schopp <joel.schopp@amd.com>
Subject: Re: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
Date: Tue, 07 Oct 2014 14:28:43 +0100 [thread overview]
Message-ID: <5433EA8B.6010502@arm.com> (raw)
In-Reply-To: <20141007104846.GB12675@e104818-lin.cambridge.arm.com>
On 07/10/14 11:48, Catalin Marinas wrote:
> On Mon, Oct 06, 2014 at 09:30:25PM +0100, Christoffer Dall wrote:
>> +/**
>> + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
>> + * @kvm: The KVM struct pointer for the VM.
>> + * @pgd: The kernel pseudo pgd
>> + *
>> + * When the kernel uses more levels of page tables than the guest, we allocate
>> + * a fake PGD and pre-populate it to point to the next-level page table, which
>> + * will be the real initial page table pointed to by the VTTBR.
>> + *
>> + * When KVM_PREALLOC_LEVEL==2, we allocate a single page for the PMD and
>> + * the kernel will use folded pud. When KVM_PREALLOC_LEVEL==1, we
>> + * allocate 2 consecutive PUD pages.
>> + */
>> +#if defined(CONFIG_ARM64_64K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 3
>> +#define KVM_PREALLOC_LEVEL 2
>> +#define PTRS_PER_S2_PGD 1
>> +#define S2_PGD_ORDER get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
>
> I agree that my magic equation wasn't readable ;) (I had troubles
> re-understanding it as well), but you also have some constants here that
> are not immediately obvious where you got to them from. IIUC,
> KVM_PREALLOC_LEVEL == 2 here means that the hardware only understands
> stage 2 pmd and pte. I guess you could look into the ARM ARM tables but
> it's still not clear.
>
> Let's look at PTRS_PER_S2_PGD as I think it's simpler. My proposal was:
>
> #if PGDIR_SHIFT > KVM_PHYS_SHIFT
> #define PTRS_PER_S2_PGD (1)
> #else
> #define PTRS_PER_S2_PGD (1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))
> #endif
>
> In this case PGDIR_SHIFT is 42, so we get PTRS_PER_S2_PGD == 1. The 4K
> and 4 levels case below is also correct.
>
> The KVM start level calculation, we could assume that KVM needs either
> host levels or host levels - 1 (unless we go for some weirdly small
> KVM_PHYS_SHIFT). So we could define them KVM_PREALLOC_LEVEL as:
>
> #if PTRS_PER_S2_PGD <= 16
> #define KVM_PREALLOC_LEVEL (4 - CONFIG_ARM64_PGTABLE_LEVELS + 1)
> #else
> #define KVM_PREALLOC_LEVEL (0)
> #endif
>
> Basically if you can concatenate 16 or less pages at the level below the
> top, the architecture does not allow a small top level. In this case,
> (4 - CONFIG_ARM64_PGTABLE_LEVELS) represents the first level for the
> host and we add 1 to go to the next level for KVM stage 2 when
> PTRS_PER_S2_PGD is 16 or less. We use 0 when we don't need to
> preallocate.
I think this makes the whole thing clearer (at least for me), as it
makes the relationship between KVM_PREALLOC_LEVEL and
CONFIG_ARM64_PGTABLE_LEVELS explicit (it wasn't completely obvious to me
initially).
>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> + pud_t *pud;
>> + pmd_t *pmd;
>> +
>> + pud = pud_offset(pgd, 0);
>> + pmd = (pmd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
>> +
>> + if (!pmd)
>> + return -ENOMEM;
>> + pud_populate(NULL, pud, pmd);
>> +
>> + return 0;
>> +}
>> +
>> +static inline void kvm_free_hwpgd(struct kvm *kvm)
>> +{
>> + pgd_t *pgd = kvm->arch.pgd;
>> + pud_t *pud = pud_offset(pgd, 0);
>> + pmd_t *pmd = pmd_offset(pud, 0);
>> + free_pages((unsigned long)pmd, 0);
>> +}
>> +
>> +static inline phys_addr_t kvm_get_hwpgd(struct kvm *kvm)
>> +{
>> + pgd_t *pgd = kvm->arch.pgd;
>> + pud_t *pud = pud_offset(pgd, 0);
>> + pmd_t *pmd = pmd_offset(pud, 0);
>> + return virt_to_phys(pmd);
>> +
>> +}
>> +#elif defined(CONFIG_ARM64_4K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 4
>> +#define KVM_PREALLOC_LEVEL 1
>> +#define PTRS_PER_S2_PGD 2
>> +#define S2_PGD_ORDER get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
>
> Here PGDIR_SHIFT is 39, so we get PTRS_PER_S2_PGD == (1 << (40 - 39))
> which is 2 and KVM_PREALLOC_LEVEL == 1.
>
>> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
>> +{
>> + pud_t *pud;
>> +
>> + pud = (pud_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>> + if (!pud)
>> + return -ENOMEM;
>> + pgd_populate(NULL, pgd, pud);
>> + pgd_populate(NULL, pgd + 1, pud + PTRS_PER_PUD);
>> +
>> + return 0;
>> +}
>
> You still need to define these functions but you can make their
> implementation dependent solely on the KVM_PREALLOC_LEVEL rather than
> 64K/4K and levels combinations. If it is KVM_PREALLOC_LEVEL is 1, you
> allocate pud and populate the pgds (in a loop based on the
> PTRS_PER_S2_PGD). If it is 2, you allocate the pmd and populate the pud
> (still in a loop though it would probably be 1 iteration). We know based
> on the assumption above that you can't get KVM_PREALLOC_LEVEL == 2 and
> CONFIG_ARM64_PGTABLE_LEVELS == 4.
>
Also agreed. Most of what you wrote here could also be gathered as
comments in the patch.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
next prev parent reply other threads:[~2014-10-07 13:28 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-06 20:30 [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 Christoffer Dall
2014-10-07 10:48 ` Catalin Marinas
2014-10-07 13:28 ` Marc Zyngier [this message]
2014-10-07 19:39 ` Christoffer Dall
2014-10-08 9:34 ` Marc Zyngier
2014-10-08 9:47 ` Christoffer Dall
2014-10-08 10:27 ` Marc Zyngier
2014-10-08 9:47 ` Catalin Marinas
2014-10-09 11:01 ` Christoffer Dall
2014-10-09 13:36 ` Catalin Marinas
2014-10-10 8:16 ` Christoffer Dall
2014-10-07 13:40 ` Marc Zyngier
2014-10-08 9:48 ` Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 2/3] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE Christoffer Dall
2014-10-06 20:30 ` [PATCH v2 3/3] arm64: Allow 48-bits VA space without ARM_SMMU Christoffer Dall
2014-10-07 9:24 ` [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits Catalin Marinas
2014-10-07 9:36 ` Christoffer Dall
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5433EA8B.6010502@arm.com \
--to=marc.zyngier@arm.com \
--cc=catalin.marinas@arm.com \
--cc=christoffer.dall@linaro.org \
--cc=joel.schopp@amd.com \
--cc=jungseoklee85@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).