From: Oliver Upton <oliver.upton@linux.dev>
To: Ricardo Koller <ricarkol@google.com>
Cc: pbonzini@redhat.com, maz@kernel.org, oupton@google.com,
yuzenghui@huawei.com, dmatlack@google.com, kvm@vger.kernel.org,
kvmarm@lists.linux.dev, qperret@google.com,
catalin.marinas@arm.com, andrew.jones@linux.dev,
seanjc@google.com, alexandru.elisei@arm.com,
suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com,
reijiw@google.com, rananta@google.com, bgardon@google.com,
ricarkol@gmail.com
Subject: Re: [PATCH v3 03/12] KVM: arm64: Add helper for creating unlinked stage2 subtrees
Date: Thu, 16 Feb 2023 00:13:18 +0000 [thread overview]
Message-ID: <Y+11HkcHMwUrEkPz@linux.dev> (raw)
In-Reply-To: <20230215174046.2201432-4-ricarkol@google.com>
On Wed, Feb 15, 2023 at 05:40:37PM +0000, Ricardo Koller wrote:
> Add a stage2 helper, kvm_pgtable_stage2_create_unlinked(), for creating
> unlinked tables (the opposite of kvm_pgtable_stage2_free_unlinked()).
You don't need to mention the free_unlinked() side of things, IMO.
> Creating an unlinked table is useful for splitting block PTEs into
> subtrees of 4K PTEs. For example, a 1G block PTE can be split into 4K
> PTEs by first creating a fully populated tree, and then use it to
> replace the 1G PTE in a single step. This will be used in a
> subsequent commit for eager huge-page splitting (a dirty-logging
> optimization).
This is all very PAGE_SIZE=4K centric :) Could you instead describe the
operation in terms of the Linux page table hierarchy?
i.e. '... useful for splitting hugepages into PTE-level mappings'
> No functional change intended. This new function will be used in a
> subsequent commit.
>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
> arch/arm64/include/asm/kvm_pgtable.h | 29 +++++++++++++++++
> arch/arm64/kvm/hyp/pgtable.c | 47 ++++++++++++++++++++++++++++
> 2 files changed, 76 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 7c45082e6c23..2ea397ad3e63 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -460,6 +460,35 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
> */
> void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level);
>
> +/**
> + * kvm_pgtable_stage2_create_unlinked() - Create an unlinked stage-2 paging structure.
> + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*().
> + * @new: Unlinked stage-2 paging structure to be created.
> + * @phys: Physical address of the memory to map.
> + * @level: Level of the stage-2 paging structure to be created.
I believe this is the starting level of the page table to be created,
right? Might be worth calling that out explicitly as level could also be
interpreted as mapping level.
> + * @prot: Permissions and attributes for the mapping.
> + * @mc: Cache of pre-allocated and zeroed memory from which to allocate
> + * page-table pages.
> + * @force_pte: Force mappings to PAGE_SIZE granularity.
> + *
> + * Create an unlinked page-table tree under @new. If @force_pte is
> + * true or @level is the PMD level, then the tree is mapped up to the
> + * PAGE_SIZE leaf PTE; the tree is mapped up one level otherwise. This
> + * new page-table tree is not reachable (i.e., it is removed) from the
'i.e. it is unlinked'
> + * root pgd and it's therefore unreachableby the hardware page-table
> + * walker. No TLB invalidation or CMOs are performed.
> + *
> + * If device attributes are not explicitly requested in @prot, then the
> + * mapping will be normal, cacheable.
> + *
> + * Return: 0 only if a fully populated tree was created (all memory
> + * under @level is mapped), negative error code on failure.
> + */
> +int kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
> + kvm_pte_t *new, u64 phys, u32 level,
> + enum kvm_pgtable_prot prot, void *mc,
> + bool force_pte);
> +
> /**
> * kvm_pgtable_stage2_map() - Install a mapping in a guest stage-2 page-table.
> * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*().
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 0a5ef9288371..fed314f2b320 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -1181,6 +1181,53 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
> return kvm_pgtable_walk(pgt, addr, size, &walker);
> }
>
> +int kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
> + kvm_pte_t *new, u64 phys, u32 level,
> + enum kvm_pgtable_prot prot, void *mc,
> + bool force_pte)
> +{
> + struct stage2_map_data map_data = {
> + .phys = phys,
> + .mmu = pgt->mmu,
> + .memcache = mc,
> + .force_pte = force_pte,
> + };
> + struct kvm_pgtable_walker walker = {
> + .cb = stage2_map_walker,
> + .flags = KVM_PGTABLE_WALK_LEAF |
> + KVM_PGTABLE_WALK_SKIP_BBM |
> + KVM_PGTABLE_WALK_SKIP_CMO,
> + .arg = &map_data,
> + };
> + /* .addr (the IPA) is irrelevant for a removed table */
> + struct kvm_pgtable_walk_data data = {
> + .walker = &walker,
> + .addr = 0,
> + .end = kvm_granule_size(level),
> + };
> + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops;
> + kvm_pte_t *pgtable;
> + int ret;
> +
> + ret = stage2_set_prot_attr(pgt, prot, &map_data.attr);
> + if (ret)
> + return ret;
> +
> + pgtable = mm_ops->zalloc_page(mc);
> + if (!pgtable)
> + return -ENOMEM;
> +
> + ret = __kvm_pgtable_walk(&data, mm_ops, (kvm_pteref_t)pgtable,
> + level + 1);
> + if (ret) {
> + kvm_pgtable_stage2_free_unlinked(mm_ops, pgtable, level);
> + mm_ops->put_page(pgtable);
> + return ret;
> + }
> +
> + *new = kvm_init_table_pte(pgtable, mm_ops);
> + return 0;
Hmm. I don't think you really need the 'out' pointer here. Either the
function fails and it returns a negative error, or it succeeds and
returns a pointer.
kvm_pte_t *kvm_pgtable_stage2_create_unlinked(...)
{
int ret;
[...]
ret = __kvm_pgtable_walk(...);
if (ret) {
kvm_pgtable_stage2_free_unlinked(...);
mm_ops->put_page(pgtable);
return ERR_PTR(ret);
}
return pgtable;
}
[...]
pgtable = kvm_pgtable_stage2_create_unlinked(...);
if (IS_ERR(pgtable))
return PTR_ERR(pgtable);
--
Thanks,
Oliver
next prev parent reply other threads:[~2023-02-16 0:32 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-15 17:40 [PATCH v3 00/12] Implement Eager Page Splitting for ARM Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 01/12] KVM: arm64: Add KVM_PGTABLE_WALK ctx->flags for skipping BBM and CMO Ricardo Koller
2023-02-16 2:56 ` Shaoqin Huang
2023-02-16 18:41 ` Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 02/12] KVM: arm64: Rename free_unlinked to free_removed Ricardo Koller
2023-02-15 23:51 ` Oliver Upton
2023-02-16 3:13 ` Shaoqin Huang
2023-02-16 18:44 ` Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 03/12] KVM: arm64: Add helper for creating unlinked stage2 subtrees Ricardo Koller
2023-02-16 0:13 ` Oliver Upton [this message]
2023-02-16 18:34 ` Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 04/12] KVM: arm64: Add kvm_pgtable_stage2_split() Ricardo Koller
2023-02-16 0:36 ` Oliver Upton
2023-02-16 18:07 ` Ricardo Koller
2023-02-16 18:14 ` Ricardo Koller
2023-02-16 12:22 ` Shaoqin Huang
2023-02-16 18:30 ` Ricardo Koller
2023-02-17 4:07 ` Shaoqin Huang
2023-02-15 17:40 ` [PATCH v3 05/12] KVM: arm64: Refactor kvm_arch_commit_memory_region() Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 06/12] KVM: arm64: Add kvm_uninit_stage2_mmu() Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 07/12] KVM: arm64: Export kvm_are_all_memslots_empty() Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 08/12] KVM: arm64: Add KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 09/12] KVM: arm64: Split huge pages when dirty logging is enabled Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 10/12] KVM: arm64: Open-code kvm_mmu_write_protect_pt_masked() Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 11/12] KVM: arm64: Split huge pages during KVM_CLEAR_DIRTY_LOG Ricardo Koller
2023-02-15 17:40 ` [PATCH v3 12/12] KVM: arm64: Use local TLBI on permission relaxation Ricardo Koller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y+11HkcHMwUrEkPz@linux.dev \
--to=oliver.upton@linux.dev \
--cc=alexandru.elisei@arm.com \
--cc=andrew.jones@linux.dev \
--cc=bgardon@google.com \
--cc=catalin.marinas@arm.com \
--cc=dmatlack@google.com \
--cc=eric.auger@redhat.com \
--cc=gshan@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=maz@kernel.org \
--cc=oupton@google.com \
--cc=pbonzini@redhat.com \
--cc=qperret@google.com \
--cc=rananta@google.com \
--cc=reijiw@google.com \
--cc=ricarkol@gmail.com \
--cc=ricarkol@google.com \
--cc=seanjc@google.com \
--cc=suzuki.poulose@arm.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.