public inbox for linux-arm-kernel@lists.infradead.org
From: Will Deacon <will@kernel.org>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>,
	kernel-team@android.com, kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: [PATCH v3 02/21] KVM: arm64: Add stand-alone page-table walker infrastructure
Date: Wed, 2 Sep 2020 11:36:49 +0100
Message-ID: <20200902103648.GC5567@willie-the-truck>
In-Reply-To: <9de812eb-1067-08bf-69cd-eb205dfbda35@arm.com>

On Thu, Aug 27, 2020 at 05:27:13PM +0100, Alexandru Elisei wrote:
> It looks to me like the code doesn't take into account the fact that we
> can have concatenated pages at the initial level of lookup. Am I missing
> something? Is it added in later patches and I missed it? I've commented below in a
> few places where I noticed that.

(seems like you figured some of this out in a later reply).

> On 8/25/20 10:39 AM, Will Deacon wrote:
> > The KVM page-table code is intricately tied into the kernel page-table
> > code and re-uses the pte/pmd/pud/p4d/pgd macros directly in an attempt
> > to reduce code duplication. Unfortunately, the reality is that there is
> > an awful lot of code required to make this work, and at the end of the
> > day you're limited to creating page-tables with the same configuration
> > as the host kernel. Furthermore, lifting the page-table code to run
> > directly at EL2 on a non-VHE system (as we plan to do in future
> > patches) is practically impossible due to the number of dependencies it
> > has on the core kernel.
> >
> > Introduce a framework for walking Armv8 page-tables configured
> > independently from the host kernel.
> >
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Quentin Perret <qperret@google.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/include/asm/kvm_pgtable.h | 101 ++++++++++
> >  arch/arm64/kvm/hyp/Makefile          |   2 +-
> >  arch/arm64/kvm/hyp/pgtable.c         | 290 +++++++++++++++++++++++++++
> >  3 files changed, 392 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/kvm_pgtable.h
> >  create mode 100644 arch/arm64/kvm/hyp/pgtable.c

[...]

> > +static u64 kvm_granule_shift(u32 level)
> > +{
> > +	return (KVM_PGTABLE_MAX_LEVELS - level) * (PAGE_SHIFT - 3) + 3;
> 
> Isn't that the same thing as the macro ARM64_HW_PGTABLE_LEVEL_SHIFT(n) from
> pgtable-hwdef.h? I think the header is already included, as this file uses
> PTRS_PER_PTE and that's the only place I found it defined.

Hmm, that's an interesting one. If we ever want to adjust KVM_PGTABLE_MAX_LEVELS
things will break, so we just need to take that into account should future
architecture extensions add an extra level. I suppose I can add a comment
to that effect and use ARM64_HW_PGTABLE_LEVEL_SHIFT() instead.
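
For reference, a small user-space sketch of the equivalence under discussion. It assumes 4K pages (PAGE_SHIFT of 12) and restates both formulas under hypothetical SKETCH_/sketch_ names instead of pulling in kernel headers. The second macro reflects my reading of ARM64_HW_PGTABLE_LEVEL_SHIFT() in pgtable-hwdef.h, so treat its exact form as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration: 4K granule, four levels (0..3). */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_MAX_LEVELS	4U

/* Same arithmetic as kvm_granule_shift() in the quoted patch. */
static uint64_t sketch_granule_shift(uint32_t level)
{
	assert(level < SKETCH_MAX_LEVELS);
	return (SKETCH_MAX_LEVELS - level) * (SKETCH_PAGE_SHIFT - 3) + 3;
}

/* Assumed shape of ARM64_HW_PGTABLE_LEVEL_SHIFT(n); note the hard-coded 4. */
#define SKETCH_HW_LEVEL_SHIFT(n)	((SKETCH_PAGE_SHIFT - 3) * (4 - (n)) + 3)

/* Returns 1 if both formulas agree for every level, 0 otherwise. */
static int sketch_shifts_agree(void)
{
	for (uint32_t level = 0; level < SKETCH_MAX_LEVELS; level++) {
		if (sketch_granule_shift(level) !=
		    (uint64_t)SKETCH_HW_LEVEL_SHIFT(level))
			return 0;
	}
	return 1;
}
```

Under these assumptions, levels 0 to 3 give shifts of 39, 30, 21 and 12. The two forms only agree while the maximum level count is 4, which is the caveat about future architecture extensions.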

> 
> > +}
> > +
> > +static u64 kvm_granule_size(u32 level)
> > +{
> > +	return BIT(kvm_granule_shift(level));
> > +}
> > +
> > +static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
> > +{
> > +	u64 granule = kvm_granule_size(level);
> > +
> > +	/*
> > +	 * Reject invalid block mappings and don't bother with 4TB mappings for
> > +	 * 52-bit PAs.
> > +	 */
> > +	if (level == 0 || (PAGE_SIZE != SZ_4K && level == 1))
> > +		return false;
> > +
> > +	if (granule > (end - addr))
> > +		return false;
> > +
> > +	return IS_ALIGNED(addr, granule) && IS_ALIGNED(phys, granule);
> > +}
> 
> This is a very nice rewrite of fault_supports_stage2_huge_mapping, definitely
> easier to understand.

Thanks!

> > +static u32 kvm_start_level(u64 ia_bits)
> > +{
> > +	u64 levels = DIV_ROUND_UP(ia_bits - PAGE_SHIFT, PAGE_SHIFT - 3);
> 
> Isn't that the same thing as the macro ARM64_HW_PGTABLE_LEVELS from
> pgtable-hwdef.h?

Yes, although this is slightly more idiomatic due to its use of
DIV_ROUND_UP imo. But happy to replace it.
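
To make the "same thing" claim concrete, here is a hedged user-space sketch comparing the two level-count formulas. The sketch_ names are hypothetical, and the second function encodes what I believe ARM64_HW_PGTABLE_LEVELS() expands to; that form is an assumption, not copied from the header.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Level count as computed in the quoted kvm_start_level(). */
static uint64_t sketch_levels_round_up(uint64_t ia_bits, uint64_t page_shift)
{
	assert(page_shift > 3 && ia_bits > page_shift);
	return SKETCH_DIV_ROUND_UP(ia_bits - page_shift, page_shift - 3);
}

/* Assumed shape of ARM64_HW_PGTABLE_LEVELS(va_bits). */
static uint64_t sketch_levels_hwdef(uint64_t va_bits, uint64_t page_shift)
{
	assert(page_shift > 3 && va_bits > 4);
	return (va_bits - 4) / (page_shift - 3);
}
```

Expanding the rounding shows why they match: (ia_bits - PAGE_SHIFT + PAGE_SHIFT - 3 - 1) / (PAGE_SHIFT - 3) simplifies to (ia_bits - 4) / (PAGE_SHIFT - 3), independent of the granule.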

> 
> > +	return KVM_PGTABLE_MAX_LEVELS - levels;
> 
> I tried to verify this formula and I think there's something that I don't
> understand or I'm missing. For the default KVM setup, where the user doesn't
> specify an IPA size different from the 40 bits default: ia_bits = 40 (IPA =
> [39:0]), 4KB pages, translation starting at level 1 with 2 concatenated level 1
> tables (VTCR_EL2.T0SZ = 24, VTCR_EL2.SL0 = 1, VTCR_EL2.TG0 = 0, starting level
> from table D5-13 at page D5-2566, ARM DDI 0487F.b), according to the formula I get:
> 
> levels = DIV_ROUND_UP(40 - 12, 12 - 3) = DIV_ROUND_UP(28, 9) = 4
> return 4 - 4 = 0
> 
> which means the resulting starting level is 0 instead of 1.

Yeah, this is fiddly. kvm_start_level() doesn't cater for concatenation at
all and it's only used to determine the start level for the hypervisor
stage-1 table. For the stage-2 page-tables, we actually extract the start
level back out of the vtcr, as that gets configured separately and so we
just parameterise ourselves around that.

I think I'll remove kvm_start_level() entirely, and just inline it into
its single call site (which will be neater using ARM64_HW_PGTABLE_LEVELS).
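
As a worked illustration of the 40-bit case, the following user-space sketch contrasts the plain calculation with one that allows concatenation at the initial level. Everything here is hypothetical naming and assumes a 4K granule. The "subtract 4 bits" shortcut relies on the architectural limit of 16 concatenated stage-2 tables and is my simplification, not something taken from this patch (which reads the start level back out of the vtcr).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12U	/* assumption: 4K granule */
#define SKETCH_MAX_LEVELS	4U
#define SKETCH_BITS_PER_LEVEL	(SKETCH_PAGE_SHIFT - 3)

/* DIV_ROUND_UP(bits - PAGE_SHIFT, PAGE_SHIFT - 3) */
static uint32_t sketch_levels(uint64_t bits)
{
	return (uint32_t)((bits - SKETCH_PAGE_SHIFT + SKETCH_BITS_PER_LEVEL - 1) /
			  SKETCH_BITS_PER_LEVEL);
}

/* What the quoted kvm_start_level() computes: no concatenation. */
static uint32_t sketch_start_level_plain(uint64_t ia_bits)
{
	assert(ia_bits > SKETCH_PAGE_SHIFT && ia_bits <= 48);
	return SKETCH_MAX_LEVELS - sketch_levels(ia_bits);
}

/*
 * Stage-2 style: up to 16 tables (4 extra address bits) may be
 * concatenated at the initial level, so a top level that would resolve
 * 4 bits or fewer can be dropped.
 */
static uint32_t sketch_start_level_concat(uint64_t ia_bits)
{
	assert(ia_bits > SKETCH_PAGE_SHIFT + 4 && ia_bits <= 48);
	return SKETCH_MAX_LEVELS - sketch_levels(ia_bits - 4);
}

/* Number of concatenated tables needed at the given start level. */
static uint32_t sketch_concat_tables(uint64_t ia_bits, uint32_t start_level)
{
	uint64_t shift = (SKETCH_MAX_LEVELS - start_level) *
			 SKETCH_BITS_PER_LEVEL + 3;
	uint64_t top_bits = ia_bits - shift;

	return top_bits > SKETCH_BITS_PER_LEVEL ?
	       1U << (top_bits - SKETCH_BITS_PER_LEVEL) : 1U;
}
```

For ia_bits = 40 the plain version yields start level 0, while the concatenating version yields start level 1 with 2 tables, matching the VTCR_EL2 configuration described in the review.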

> 
> > +}
> > +
> > +static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level)
> > +{
> > +	u64 shift = kvm_granule_shift(level);
> > +	u64 mask = BIT(PAGE_SHIFT - 3) - 1;
> 
> This doesn't seem to take into account the fact that we can have concatenated
> initial page tables.

This is ok, as we basically process the PGD one page at a time so that the
details of concatenation only really need to be exposed to the iterator.
See the use of kvm_pgd_page_idx() in _kvm_pgtable_walk().
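
A rough user-space sketch of that split, assuming a 4K granule: one helper picks which concatenated start-level page covers an address, and the other indexes within a single page using the same 9-bit mask as the quoted kvm_pgtable_idx(). The sketch_ helpers are hypothetical and are not the patch's kvm_pgd_page_idx().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12U	/* assumption: 4K granule */
#define SKETCH_MAX_LEVELS	4U

static uint64_t sketch_granule_shift(uint32_t level)
{
	return (uint64_t)(SKETCH_MAX_LEVELS - level) * (SKETCH_PAGE_SHIFT - 3) + 3;
}

/* Index within one table page: concatenation is not visible here. */
static uint32_t sketch_table_idx(uint64_t addr, uint32_t level)
{
	uint64_t mask = (1ULL << (SKETCH_PAGE_SHIFT - 3)) - 1;

	assert(level < SKETCH_MAX_LEVELS);
	return (uint32_t)((addr >> sketch_granule_shift(level)) & mask);
}

/* Which of the concatenated start-level pages covers addr. */
static uint32_t sketch_pgd_page_idx(uint64_t addr, uint64_t ia_bits,
				    uint32_t start_level)
{
	/* Address bits covered by one full page of start-level entries. */
	uint64_t page_span_shift = sketch_granule_shift(start_level) +
				   (SKETCH_PAGE_SHIFT - 3);
	uint64_t mask;

	assert(ia_bits < 64 && page_span_shift < 64);
	mask = (1ULL << ia_bits) - 1;
	return (uint32_t)((addr & mask) >> page_span_shift);
}
```

With ia_bits = 40 and start level 1, an address with bit 39 set lands in the second concatenated page, and the per-page index still only needs bits [38:30].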

> > +static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > +				      kvm_pte_t *ptep, u32 level)
> > +{
> > +	int ret = 0;
> > +	u64 addr = data->addr;
> > +	kvm_pte_t *childp, pte = *ptep;
> > +	bool table = kvm_pte_table(pte, level);
> > +	enum kvm_pgtable_walk_flags flags = data->walker->flags;
> > +
> > +	if (table && (flags & KVM_PGTABLE_WALK_TABLE_PRE)) {
> > +		ret = kvm_pgtable_visitor_cb(data, addr, level, ptep,
> > +					     KVM_PGTABLE_WALK_TABLE_PRE);
> 
> I see that below we check if the visitor modified the leaf entry and turned it into a
> table. Is it not allowed for a visitor to turn a table into a block mapping?

It is allowed, but in that case we don't revisit the block entry, as there's
really no need. Compare that with installing a table, where you may well
want to descend into the new table to initialise the new entries in there.
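
The asymmetry can be shown with a toy model. This is a deliberately simplified user-space sketch with hypothetical types (each "table" has a single child, and there is no address tracking or error handling); it models the behaviour described above and is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LAST_LEVEL	3U

enum sketch_flag {
	SKETCH_WALK_LEAF	= 1 << 0,
	SKETCH_WALK_TABLE_PRE	= 1 << 1,
};

/* Toy entry: a "table" has exactly one child, purely for brevity. */
struct sketch_pte {
	int is_table;
	struct sketch_pte *child;
};

struct sketch_ctx {
	struct sketch_pte *spare;	/* child to install when splitting a leaf */
	int collapse;			/* turn tables into blocks on TABLE_PRE */
	unsigned int leaf_visits;
	unsigned int pre_visits;
};

static void sketch_cb(struct sketch_pte *ptep, unsigned int level,
		      enum sketch_flag flag, struct sketch_ctx *ctx)
{
	if (flag == SKETCH_WALK_TABLE_PRE) {
		ctx->pre_visits++;
		if (ctx->collapse) {
			ptep->is_table = 0;
			ptep->child = NULL;
		}
		return;
	}

	ctx->leaf_visits++;
	if (level < SKETCH_LAST_LEVEL && ctx->spare) {
		ptep->is_table = 1;
		ptep->child = ctx->spare;
		ctx->spare = NULL;
	}
}

static void sketch_visit(struct sketch_pte *ptep, unsigned int level,
			 unsigned int flags, struct sketch_ctx *ctx)
{
	int table = ptep->is_table;

	assert(level <= SKETCH_LAST_LEVEL);

	if (table && (flags & SKETCH_WALK_TABLE_PRE)) {
		sketch_cb(ptep, level, SKETCH_WALK_TABLE_PRE, ctx);
		/* Table replaced by a block: nothing to descend into, no revisit. */
		if (!ptep->is_table)
			return;
	}

	if (!table && (flags & SKETCH_WALK_LEAF)) {
		sketch_cb(ptep, level, SKETCH_WALK_LEAF, ctx);
		/* Reload: the callback may have installed a table here. */
		table = ptep->is_table;
	}

	if (table)
		sketch_visit(ptep->child, level + 1, flags, ctx);
}

/* Leaf at level 2 is split by the callback; the new child is visited too. */
static unsigned int sketch_demo_split(void)
{
	struct sketch_pte child = { 0, NULL }, root = { 0, NULL };
	struct sketch_ctx ctx = { &child, 0, 0, 0 };

	sketch_visit(&root, 2, SKETCH_WALK_LEAF | SKETCH_WALK_TABLE_PRE, &ctx);
	return ctx.leaf_visits;
}

/* Table at level 2 is collapsed on TABLE_PRE; no leaf callback follows. */
static unsigned int sketch_demo_collapse(void)
{
	struct sketch_pte child = { 0, NULL }, root = { 1, &child };
	struct sketch_ctx ctx = { NULL, 1, 0, 0 };

	sketch_visit(&root, 2, SKETCH_WALK_LEAF | SKETCH_WALK_TABLE_PRE, &ctx);
	return ctx.leaf_visits;
}
```

In the split case the leaf callback runs twice (once for the original leaf, once for the entry in the freshly installed table); in the collapse case it never runs, since the new block entry is not revisited.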

The kerneldoc for kvm_pgtable_walk() talks a bit about this. (aside: that
function isn't actually used, but it felt useful to expose it as an
interface).

Thanks for the review,

Will
