From: Zenghui Yu <zenghui.yu@linux.dev>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oliver.upton@linux.dev>,
Zenghui Yu <yuzenghui@huawei.com>,
Joey Gouly <joey.gouly@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Christoffer Dall <christoffer.dall@arm.com>,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
wanghaibin.wang@huawei.com
Subject: Re: [PATCH v3 02/16] KVM: arm64: nv: Implement nested Stage-2 page table walk logic
Date: Wed, 4 Feb 2026 16:28:57 +0800 [thread overview]
Message-ID: <3f88cd49-68f1-4276-a067-b7c6beadb27c@linux.dev> (raw)
In-Reply-To: <20240614144552.2773592-3-maz@kernel.org>
Hi Marc,
[ chewing through the NV code.. ;-) ]
On 6/14/24 10:45 PM, Marc Zyngier wrote:
> From: Christoffer Dall <christoffer.dall@linaro.org>
>
> Based on the pseudo-code in the ARM ARM, implement a stage 2 software
> page table walker.
[...]
> +static u32 compute_fsc(int level, u32 fsc)
> +{
> + return fsc | (level & 0x3);
> +}
> +
> +static int get_ia_size(struct s2_walk_info *wi)
> +{
> + return 64 - wi->t0sz;
> +}
> +
> +static int check_base_s2_limits(struct s2_walk_info *wi,
> + int level, int input_size, int stride)
> +{
> + int start_size, ia_size;
> +
> + ia_size = get_ia_size(wi);
> +
> + /* Check translation limits */
> + switch (BIT(wi->pgshift)) {
> + case SZ_64K:
> + if (level == 0 || (level == 1 && ia_size <= 42))
This looks broken: the pseudocode checks these limits against the
*implemented PA size*, not against ia_size, which is essentially the
input address size (64 - T0SZ) programmed by the L1 hypervisor. The two
are different.
We can probably get the implemented PA size with something like:

	AArch64.PAMax()
	{
		parange = get_idreg_field_enum(kvm, ID_AA64MMFR0_EL1, PARANGE);
		return id_aa64mmfr0_parange_to_phys_shift(parange);
	}
Not sure if I've read the spec correctly.
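To make that concrete, here is a trivial userspace model of the helper
(the table is the architected PARange encoding, 0b0000=32 bits up to
0b0111=56 bits; the function name and the reserved-value handling are
made up for illustration):

```c
#include <assert.h>

/*
 * Illustrative stand-in for AArch64.PAMax(): map an
 * ID_AA64MMFR0_EL1.PARange field value to the implemented PA size
 * in bits, per the architected encoding.
 */
static int parange_to_pa_bits(unsigned int parange)
{
	static const int pa_bits[] = { 32, 36, 40, 42, 44, 48, 52, 56 };

	if (parange >= 8)	/* reserved encodings */
		return -1;
	return pa_bits[parange];
}
```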
> + return -EFAULT;
> + break;
> + case SZ_16K:
> + if (level == 0 || (level == 1 && ia_size <= 40))
> + return -EFAULT;
> + break;
> + case SZ_4K:
> + if (level < 0 || (level == 0 && ia_size <= 42))
> + return -EFAULT;
> + break;
> + }
> +
> + /* Check input size limits */
> + if (input_size > ia_size)
This is always false for the current code, as input_size is itself
get_ia_size(wi). ;-)
> + return -EFAULT;
> +
> + /* Check number of entries in starting level table */
> + start_size = input_size - ((3 - level) * stride + wi->pgshift);
> + if (start_size < 1 || start_size > stride + 4)
> + return -EFAULT;
> +
> + return 0;
> +}
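FWIW, a standalone model of how the start-level check could look with
the limits taken from the implemented PA size instead of ia_size
(thresholds kept as in the patch; worth double-checking them against
the pseudocode):

```c
/*
 * Userspace sketch, not the kernel code: same shape as
 * check_base_s2_limits(), but the translation limits are checked
 * against 'pa_max', the implemented PA size, as the pseudocode does.
 */
static int check_start_level(int granule_shift, int level, int pa_max)
{
	switch (1UL << granule_shift) {
	case 0x10000:	/* SZ_64K */
		if (level == 0 || (level == 1 && pa_max <= 42))
			return -1;
		break;
	case 0x4000:	/* SZ_16K */
		if (level == 0 || (level == 1 && pa_max <= 40))
			return -1;
		break;
	case 0x1000:	/* SZ_4K */
		if (level < 0 || (level == 0 && pa_max <= 42))
			return -1;
		break;
	}
	return 0;
}
```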
> +
> +/* Check if output is within boundaries */
> +static int check_output_size(struct s2_walk_info *wi, phys_addr_t output)
> +{
> + unsigned int output_size = wi->max_oa_bits;
> +
> + if (output_size != 48 && (output & GENMASK_ULL(47, output_size)))
> + return -1;
> +
> + return 0;
> +}
> +
> +/*
> + * This is essentially a C-version of the pseudo code from the ARM ARM
> + * AArch64.TranslationTableWalk function. I strongly recommend looking at
> + * that pseudocode in trying to understand this.
> + *
> + * Must be called with the kvm->srcu read lock held
> + */
> +static int walk_nested_s2_pgd(phys_addr_t ipa,
> + struct s2_walk_info *wi, struct kvm_s2_trans *out)
> +{
> + int first_block_level, level, stride, input_size, base_lower_bound;
> + phys_addr_t base_addr;
> + unsigned int addr_top, addr_bottom;
> + u64 desc; /* page table entry */
> + int ret;
> + phys_addr_t paddr;
> +
> + switch (BIT(wi->pgshift)) {
> + default:
> + case SZ_64K:
> + case SZ_16K:
> + level = 3 - wi->sl;
> + first_block_level = 2;
> + break;
> + case SZ_4K:
> + level = 2 - wi->sl;
> + first_block_level = 1;
> + break;
> + }
> +
> + stride = wi->pgshift - 3;
> + input_size = get_ia_size(wi);
> + if (input_size > 48 || input_size < 25)
> + return -EFAULT;
> +
> + ret = check_base_s2_limits(wi, level, input_size, stride);
> + if (WARN_ON(ret))
> + return ret;
> +
> + base_lower_bound = 3 + input_size - ((3 - level) * stride +
> + wi->pgshift);
> + base_addr = wi->baddr & GENMASK_ULL(47, base_lower_bound);
> +
> + if (check_output_size(wi, base_addr)) {
> + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ);
With a wrongly programmed base address, we should report the ADDRSZ
fault at level 0 (as per R_BFHQH and the pseudocode). It's easy to fix.
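I.e. report compute_fsc(0, ESR_ELx_FSC_ADDRSZ) here instead of using
the current walk level. A tiny model (ESR_ELx_FSC_ADDRSZ is 0x00 in
the ESR FSC encoding, so the reported level ends up in the low bits):

```c
#define FSC_ADDRSZ	0x00	/* address size fault, ESR FSC encoding */

/* Copy of compute_fsc() from the patch: the fault level lands in
 * the low bits of the FSC. */
static unsigned int compute_fsc(int level, unsigned int fsc)
{
	return fsc | (level & 0x3);
}
```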
> +static void vtcr_to_walk_info(u64 vtcr, struct s2_walk_info *wi)
> +{
> + wi->t0sz = vtcr & TCR_EL2_T0SZ_MASK;
> +
> + switch (vtcr & VTCR_EL2_TG0_MASK) {
> + case VTCR_EL2_TG0_4K:
> + wi->pgshift = 12; break;
> + case VTCR_EL2_TG0_16K:
> + wi->pgshift = 14; break;
> + case VTCR_EL2_TG0_64K:
> + default: /* IMPDEF: treat any other value as 64k */
> + wi->pgshift = 16; break;
> + }
> +
> + wi->sl = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
> + /* Global limit for now, should eventually be per-VM */
> + wi->max_oa_bits = min(get_kvm_ipa_limit(),
^^^
Should we use AArch64.PAMax() instead? The output address size can
never be larger than the implemented PA size.
Now I'm wondering if we can let kvm_get_pa_bits() just return PAMax
(based on the ID_AA64MMFR0_EL1.PARange value exposed to L1) and use it
where possible.
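By way of illustration only (standalone sketch; 'parange' stands for
the value exposed to L1 and 'ps' for the VTCR_EL2.PS field, both using
the same size encoding up to 0b110=52; the clamping of out-of-range
values is made up, and how this would plug into max_oa_bits is just my
guess):

```c
/* Sketch of the suggested computation: max_oa_bits as the minimum
 * of PAMax (from the PARange exposed to L1) and the output size
 * programmed by L1 in VTCR_EL2.PS. */
static unsigned int model_max_oa_bits(unsigned int parange, unsigned int ps)
{
	static const unsigned int bits[] = { 32, 36, 40, 42, 44, 48, 52 };
	unsigned int pamax = bits[parange < 7 ? parange : 6];
	unsigned int out = bits[ps < 7 ? ps : 6];

	return pamax < out ? pamax : out;
}
```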
Thanks,
Zenghui