Linux-ARM-Kernel Archive on lore.kernel.org
From: Suzuki K Poulose <suzuki.poulose@arm.com>
To: Gavin Shan <gshan@redhat.com>,
	Steven Price <steven.price@arm.com>,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>, Will Deacon <will@kernel.org>,
	James Morse <james.morse@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Zenghui Yu <yuzenghui@huawei.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Joey Gouly <joey.gouly@arm.com>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Fuad Tabba <tabba@google.com>,
	linux-coco@lists.linux.dev,
	Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
	Shanker Donthineni <sdonthineni@nvidia.com>,
	Alper Gun <alpergun@google.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
	Emi Kisanuki <fj0570is@fujitsu.com>,
	Vishal Annapurve <vannapurve@google.com>,
	WeiLin.Chang@arm.com, Lorenzo.Pieralisi2@arm.com
Subject: Re: [PATCH v14 29/44] arm64: RMI: Runtime faulting of memory
Date: Mon, 8 Jun 2026 10:30:13 +0100
Message-ID: <cecbd148-5d33-49c8-928f-572f71b3dd69@arm.com>
In-Reply-To: <3359f788-07fa-41a1-9ac7-45c58577c1fa@redhat.com>

On 05/06/2026 07:23, Gavin Shan wrote:
> Hi Steve,
> 
> On 5/13/26 11:17 PM, Steven Price wrote:
>> At runtime if the realm guest accesses memory which hasn't yet been
>> mapped then KVM needs to either populate the region or fault the guest.
>>
>> For memory in the lower (protected) region of IPA a fresh page is
>> provided to the RMM which will zero the contents. For memory in the
>> upper (shared) region of IPA, the memory from the memslot is mapped
>> into the realm VM non secure.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>> Changes since v13:
>>   * Numerous changes due to rebasing.
>>   * Fix addr_range_desc() to encode the correct block size.
>> Changes since v12:
>>   * Switch to RMM v2.0 range based APIs.
>> Changes since v11:
>>   * Adapt to upstream changes.
>> Changes since v10:
>>   * RME->RMI renaming.
>>   * Adapt to upstream gmem changes.
>> Changes since v9:
>>   * Fix call to kvm_stage2_unmap_range() in kvm_free_stage2_pgd() to set
>>     may_block to avoid stall warnings.
>>   * Minor coding style fixes.
>> Changes since v8:
>>   * Propagate the may_block flag.
>>   * Minor comments and coding style changes.
>> Changes since v7:
>>   * Remove redundant WARN_ONs for realm_create_rtt_levels() - it will
>>     internally WARN when necessary.
>> Changes since v6:
>>   * Handle PAGE_SIZE being larger than RMM granule size.
>>   * Some minor renaming following review comments.
>> Changes since v5:
>>   * Reduce use of struct page in preparation for supporting the RMM
>>     having a different page size to the host.
>>   * Handle a race when delegating a page where another CPU has faulted on
>>     the same page (and already delegated the physical page) but not yet
>>     mapped it. In this case simply return to the guest to either use the
>>     mapping from the other CPU (or refault if the race is lost).
>>   * The changes to populate_par_region() are moved into the previous
>>     patch where they belong.
>> Changes since v4:
>>   * Code cleanup following review feedback.
>>   * Drop the PTE_SHARED bit when creating unprotected page table entries.
>>     This is now set by the RMM and the host has no control of it and the
>>     spec requires the bit to be set to zero.
>> Changes since v2:
>>   * Avoid leaking memory if failing to map it in the realm.
>>   * Correctly mask RTT based on LPA2 flag (see rtt_get_phys()).
>>   * Adapt to changes in previous patches.
>> ---
>>   arch/arm64/include/asm/kvm_emulate.h |   8 ++
>>   arch/arm64/include/asm/kvm_rmi.h     |  12 ++
>>   arch/arm64/kvm/mmu.c                 | 128 ++++++++++++++++----
>>   arch/arm64/kvm/rmi.c                 | 173 +++++++++++++++++++++++++++
>>   4 files changed, 301 insertions(+), 20 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index 2e69fe494716..8b6f9d26b5d8 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -712,6 +712,14 @@ static inline bool kvm_realm_is_created(struct kvm *kvm)
>>       return kvm_is_realm(kvm) && kvm_realm_state(kvm) != REALM_STATE_NONE;
>>   }
>> +static inline gpa_t kvm_gpa_from_fault(struct kvm *kvm, phys_addr_t ipa)
>> +{
>> +    if (!kvm_is_realm(kvm))
>> +        return ipa;
>> +
>> +    return ipa & ~BIT(kvm->arch.realm.ia_bits - 1);
>> +}
>> +
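
For illustration, the shared/private aliasing that kvm_gpa_from_fault()
relies on can be modelled in user space. This is only a sketch: the toy_*
names are hypothetical, and the 40-bit IPA width used in the note after the
block is an assumed example value, not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy version of the quoted kvm_gpa_from_fault(): clear the top bit of the
 * realm's IPA space to turn a shared-alias IPA back into the GPA.
 * ia_bits is the width of the realm's IPA space.
 */
static uint64_t toy_gpa_from_fault(uint64_t ipa, unsigned int ia_bits)
{
	assert(ia_bits >= 1 && ia_bits <= 64);
	return ipa & ~(UINT64_C(1) << (ia_bits - 1));
}

/* Toy version of shared_ipa_fault(): true if the top IPA bit was set. */
static bool toy_shared_ipa_fault(uint64_t ipa, unsigned int ia_bits)
{
	return toy_gpa_from_fault(ipa, ia_bits) != ipa;
}
```

With ia_bits = 40, a fault at 0x8080000000 (bit 39 set) maps back to GPA
0x80000000 and is treated as a shared fault, while a fault at 0x80000000 is
left unchanged and treated as private.
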
>>   static inline bool vcpu_is_rec(const struct kvm_vcpu *vcpu)
>>   {
>>       return kvm_is_realm(vcpu->kvm);
>> diff --git a/arch/arm64/include/asm/kvm_rmi.h b/arch/arm64/include/asm/kvm_rmi.h
>> index a2b6bc412a22..b65cfec10dee 100644
>> --- a/arch/arm64/include/asm/kvm_rmi.h
>> +++ b/arch/arm64/include/asm/kvm_rmi.h
>> @@ -6,6 +6,7 @@
>>   #ifndef __ASM_KVM_RMI_H
>>   #define __ASM_KVM_RMI_H
>> +#include <asm/kvm_pgtable.h>
>>   #include <asm/rmi_smc.h>
>>   /**
>> @@ -97,6 +98,17 @@ void kvm_realm_unmap_range(struct kvm *kvm,
>>                  unsigned long size,
>>                  bool unmap_private,
>>                  bool may_block);
>> +int realm_map_protected(struct kvm *kvm,
>> +            unsigned long base_ipa,
>> +            kvm_pfn_t pfn,
>> +            unsigned long size,
>> +            struct kvm_mmu_memory_cache *memcache);
>> +int realm_map_non_secure(struct realm *realm,
>> +             unsigned long ipa,
>> +             kvm_pfn_t pfn,
>> +             unsigned long size,
>> +             enum kvm_pgtable_prot prot,
>> +             struct kvm_mmu_memory_cache *memcache);
>>   static inline bool kvm_realm_is_private_address(struct realm *realm,
>>                           unsigned long addr)
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index ac2a0f0106b0..776ffe56d17e 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -334,8 +334,15 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
>>       lockdep_assert_held_write(&kvm->mmu_lock);
>>       WARN_ON(size & ~PAGE_MASK);
>> -    WARN_ON(stage2_apply_range(mmu, start, end, KVM_PGT_FN(kvm_pgtable_stage2_unmap),
>> -                   may_block));
>> +
>> +    if (kvm_is_realm(kvm)) {
>> +        kvm_realm_unmap_range(kvm, start, size, !only_shared,
>> +                      may_block);
>> +    } else {
>> +        WARN_ON(stage2_apply_range(mmu, start, end,
>> +                       KVM_PGT_FN(kvm_pgtable_stage2_unmap),
>> +                       may_block));
>> +    }
>>   }
>>   void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,
>> @@ -358,7 +365,10 @@ static void stage2_flush_memslot(struct kvm *kvm,
>>       phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
>>       phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
>> -    kvm_stage2_flush_range(&kvm->arch.mmu, addr, end);
>> +    if (kvm_is_realm(kvm))
>> +        kvm_realm_unmap_range(kvm, addr, end - addr, false, true);
>> +    else
>> +        kvm_stage2_flush_range(&kvm->arch.mmu, addr, end);
>>   }
>>   /**
>> @@ -1103,6 +1113,10 @@ void stage2_unmap_vm(struct kvm *kvm)
>>       struct kvm_memory_slot *memslot;
>>       int idx, bkt;
>> +    /* For realms this is handled by the RMM so nothing to do here */
>> +    if (kvm_is_realm(kvm))
>> +        return;
>> +
>>       idx = srcu_read_lock(&kvm->srcu);
>>       mmap_read_lock(current->mm);
>>       write_lock(&kvm->mmu_lock);
>> @@ -1528,6 +1542,29 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
>>       return vma->vm_flags & VM_MTE_ALLOWED;
>>   }
>> +static int realm_map_ipa(struct kvm *kvm, phys_addr_t ipa,
>> +             kvm_pfn_t pfn, unsigned long map_size,
>> +             enum kvm_pgtable_prot prot,
>> +             struct kvm_mmu_memory_cache *memcache)
>> +{
>> +    struct realm *realm = &kvm->arch.realm;
>> +
>> +    /*
>> +     * Write permission is required for now even though it's possible to
>> +     * map unprotected pages (granules) as read-only. It's impossible to
>> +     * map protected pages (granules) as read-only.
>> +     */
>> +    if (WARN_ON(!(prot & KVM_PGTABLE_PROT_W)))
>> +        return -EFAULT;
>> +
> 
> I'm a bit concerned about this. KVM_PGTABLE_PROT_W isn't set in @prot if
> the stage-2 fault was raised by a memory read. With -EFAULT returned to
> the VMM (e.g. QEMU), the vCPU stops executing and the system won't work
> any more.
> 
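
To make the concern concrete, here is a toy user-space model. It is not the
kernel's fault path: it simply assumes that @prot starts out read-only and
gains write permission only when the host page was faulted in writable.
Whether a read fault can really reach this check depends on how @prot is
built earlier in the fault path, which is not visible in this hunk. All
toy_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_prot {
	TOY_PROT_R = 1 << 0,
	TOY_PROT_W = 1 << 1,
};

/* Assumed derivation of the protection bits: always R, W only if writable. */
static unsigned int toy_fault_prot(bool host_page_writable)
{
	unsigned int prot = TOY_PROT_R;

	if (host_page_writable)
		prot |= TOY_PROT_W;
	return prot;
}

/* Mirrors the quoted check: a mapping without write permission is refused. */
static int toy_realm_map_ipa(unsigned int prot)
{
	assert(prot & TOY_PROT_R);
	if (!(prot & TOY_PROT_W))
		return -EFAULT;
	return 0;
}
```

Under that assumption, toy_realm_map_ipa(toy_fault_prot(false)) returns
-EFAULT, which is the case that would be propagated to the VMM.
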
>> +    ipa = ALIGN_DOWN(ipa, PAGE_SIZE);
>> +    if (!kvm_realm_is_private_address(realm, ipa))
>> +        return realm_map_non_secure(realm, ipa, pfn, map_size, prot,
>> +                        memcache);
>> +
>> +    return realm_map_protected(kvm, ipa, pfn, map_size, memcache);
>> +}
>> +
>>   static bool kvm_vma_is_cacheable(struct vm_area_struct *vma)
>>   {
>>       switch (FIELD_GET(PTE_ATTRINDX_MASK, pgprot_val(vma->vm_page_prot))) {
>> @@ -1604,27 +1641,52 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
>>       bool write_fault, exec_fault;
>>       enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
>>       enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
>> -    struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt;
>> +    struct kvm_vcpu *vcpu = s2fd->vcpu;
>> +    struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
>> +    gpa_t gpa = kvm_gpa_from_fault(vcpu->kvm, s2fd->fault_ipa);
>>       unsigned long mmu_seq;
>>       struct page *page;
>> -    struct kvm *kvm = s2fd->vcpu->kvm;
>> +    struct kvm *kvm = vcpu->kvm;
>>       void *memcache;
>>       kvm_pfn_t pfn;
>>       gfn_t gfn;
>>       int ret;
>> -    memcache = get_mmu_memcache(s2fd->vcpu);
>> -    ret = topup_mmu_memcache(s2fd->vcpu, memcache);
>> +    if (kvm_is_realm(vcpu->kvm)) {
>> +        /* check for memory attribute mismatch */
>> +        bool is_priv_gfn = kvm_mem_is_private(kvm, gpa >> PAGE_SHIFT);
>> +        /*
>> +         * For Realms, the shared address is an alias of the private
>> +         * PA with the top bit set. Thus if the fault address matches
>> +         * the GPA then it is the private alias.
>> +         */
>> +        bool is_priv_fault = (gpa == s2fd->fault_ipa);
>> +
>> +        if (is_priv_gfn != is_priv_fault) {
>> +            kvm_prepare_memory_fault_exit(vcpu, gpa, PAGE_SIZE,
>> +                              kvm_is_write_fault(vcpu),
>> +                              false,
>> +                              is_priv_fault);
>> +            /*
>> +             * KVM_EXIT_MEMORY_FAULT requires a return code of
>> +             * -EFAULT, see the API documentation
>> +             */
>> +            return -EFAULT;
>> +        }
>> +    }
>> +
>> +    memcache = get_mmu_memcache(vcpu);
>> +    ret = topup_mmu_memcache(vcpu, memcache);
>>       if (ret)
>>           return ret;
>>       if (s2fd->nested)
>>           gfn = kvm_s2_trans_output(s2fd->nested) >> PAGE_SHIFT;
>>       else
>> -        gfn = s2fd->fault_ipa >> PAGE_SHIFT;
>> +        gfn = gpa >> PAGE_SHIFT;
>> -    write_fault = kvm_is_write_fault(s2fd->vcpu);
>> -    exec_fault = kvm_vcpu_trap_is_exec_fault(s2fd->vcpu);
>> +    write_fault = kvm_is_write_fault(vcpu);
>> +    exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
>>       VM_WARN_ON_ONCE(write_fault && exec_fault);
>> @@ -1634,7 +1696,7 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
>>       ret = kvm_gmem_get_pfn(kvm, s2fd->memslot, gfn, &pfn, &page, NULL);
>>       if (ret) {
>> -        kvm_prepare_memory_fault_exit(s2fd->vcpu, s2fd->fault_ipa, PAGE_SIZE,
>> +        kvm_prepare_memory_fault_exit(vcpu, gpa, PAGE_SIZE,
>>                             write_fault, exec_fault, false);
>>           return ret;
>>       }
>> @@ -1654,14 +1716,20 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
>>       kvm_fault_lock(kvm);
>>       if (mmu_invalidate_retry(kvm, mmu_seq)) {
>>           ret = -EAGAIN;
>> -        goto out_unlock;
>> +        goto out_release_page;
>> +    }
>> +
>> +    if (kvm_is_realm(kvm)) {
>> +        ret = realm_map_ipa(kvm, s2fd->fault_ipa, pfn,
>> +                    PAGE_SIZE, KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W, memcache);
>> +        goto out_release_page;
>>       }
>>       ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
>>                            __pfn_to_phys(pfn), prot,
>>                            memcache, flags);
>> -out_unlock:
>> +out_release_page:
>>       kvm_release_faultin_page(kvm, page, !!ret, prot & KVM_PGTABLE_PROT_W);
>>       kvm_fault_unlock(kvm);
>> @@ -1847,7 +1915,7 @@ static int kvm_s2_fault_get_vma_info(const struct kvm_s2_fault_desc *s2fd,
>>        * mapping size to ensure we find the right PFN and lay down the
>>        * mapping in the right place.
>>        */
>> -    s2vi->gfn = ALIGN_DOWN(s2fd->fault_ipa, s2vi->vma_pagesize) >> PAGE_SHIFT;
>> +    s2vi->gfn = kvm_gpa_from_fault(kvm, ALIGN_DOWN(s2fd->fault_ipa, s2vi->vma_pagesize)) >> PAGE_SHIFT;
>>       s2vi->mte_allowed = kvm_vma_mte_allowed(vma);
>> @@ -2056,6 +2124,9 @@ static int kvm_s2_fault_map(const struct kvm_s2_fault_desc *s2fd,
>>           prot &= ~KVM_NV_GUEST_MAP_SZ;
>>           ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, gfn_to_gpa(gfn),
>>                                    prot, flags);
>> +    } else if (kvm_is_realm(kvm)) {
>> +        ret = realm_map_ipa(kvm, s2fd->fault_ipa, pfn, mapping_size,
>> +                    prot, memcache);
>>       } else {
>>           ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, gfn_to_gpa(gfn), mapping_size,
>>                                __pfn_to_phys(pfn), prot,
> 
> For the kvm_is_realm() case, do we need to adjust 's2fd->fault_ipa' for
> the sake of huge pages? In kvm_s2_fault_map(), @gfn and @pfn may have been
> adjusted by transparent_hugepage_adjust() to be aligned to the huge page
> size. If that adjustment happened, we also need to align s2fd->fault_ipa
> down to the huge page size.
> 
> 
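
A small user-space sketch of the alignment in question. It assumes 4K pages
and a 2M huge page size purely as example values, and the toy_* names are
hypothetical. The second helper mirrors the IS_ALIGNED(ipa, map_size) check
that the quoted realm_map_protected() and realm_map_non_secure() apply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	(UINT64_C(1) << 12)	/* assumed 4K page */
#define TOY_PMD_SIZE	(UINT64_C(1) << 21)	/* assumed 2M huge page */

/* Round addr down to a power-of-two size, like the kernel's ALIGN_DOWN(). */
static uint64_t toy_align_down(uint64_t addr, uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return addr & ~(size - 1);
}

/* True if ipa is aligned to the size being mapped. */
static bool toy_ipa_ok_for_mapping(uint64_t ipa, uint64_t mapping_size)
{
	return (ipa & (mapping_size - 1)) == 0;
}
```

For a fault at 0x40201000, aligning down to TOY_PAGE_SIZE leaves the address
unchanged, and that address does not pass the check for a TOY_PMD_SIZE
mapping. Aligning down to TOY_PMD_SIZE instead gives 0x40200000, which does.
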
>> @@ -2214,6 +2285,13 @@ int kvm_handle_guest_sea(struct kvm_vcpu *vcpu)
>>       return 0;
>>   }
>> +static bool shared_ipa_fault(struct kvm *kvm, phys_addr_t fault_ipa)
>> +{
>> +    gpa_t gpa = kvm_gpa_from_fault(kvm, fault_ipa);
>> +
>> +    return (gpa != fault_ipa);
>> +}
>> +
>>   /**
>>    * kvm_handle_guest_abort - handles all 2nd stage aborts
>>    * @vcpu:    the VCPU pointer
>> @@ -2324,8 +2402,9 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>           nested = &nested_trans;
>>       }
>> -    gfn = ipa >> PAGE_SHIFT;
>> +    gfn = kvm_gpa_from_fault(vcpu->kvm, ipa) >> PAGE_SHIFT;
>>       memslot = gfn_to_memslot(vcpu->kvm, gfn);
>> +
>>       hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
>>       write_fault = kvm_is_write_fault(vcpu);
>>       if (kvm_is_error_hva(hva) || (write_fault && !writable)) {
>> @@ -2368,7 +2447,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>            * of the page size.
>>            */
>>           ipa |= FAR_TO_FIPA_OFFSET(kvm_vcpu_get_hfar(vcpu));
>> -        ret = io_mem_abort(vcpu, ipa);
>> +        ret = io_mem_abort(vcpu, kvm_gpa_from_fault(vcpu->kvm, ipa));
>>           goto out_unlock;
>>       }
>> @@ -2396,7 +2475,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>                   !write_fault &&
>>                   !kvm_vcpu_trap_is_exec_fault(vcpu));
>> -        if (kvm_slot_has_gmem(memslot))
>> +        if (kvm_slot_has_gmem(memslot) && !shared_ipa_fault(vcpu->kvm, fault_ipa))
>>               ret = gmem_abort(&s2fd);
>>           else
>>               ret = user_mem_abort(&s2fd);
>> @@ -2433,6 +2512,10 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>>       if (!kvm->arch.mmu.pgt || kvm_vm_is_protected(kvm))
>>           return false;
>> +    /* We don't support aging for Realms */
>> +    if (kvm_is_realm(kvm))
>> +        return true;
>> +
>>       return KVM_PGT_FN(kvm_pgtable_stage2_test_clear_young)(kvm->arch.mmu.pgt,
>>                              range->start << PAGE_SHIFT,
>>                              size, true);
>> @@ -2449,6 +2532,10 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>>       if (!kvm->arch.mmu.pgt || kvm_vm_is_protected(kvm))
>>           return false;
>> +    /* We don't support aging for Realms */
>> +    if (kvm_is_realm(kvm))
>> +        return true;
>> +
>>       return KVM_PGT_FN(kvm_pgtable_stage2_test_clear_young)(kvm->arch.mmu.pgt,
>>                              range->start << PAGE_SHIFT,
>>                              size, false);
>> @@ -2628,10 +2715,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>>           return -EFAULT;
>>       /*
>> -     * Only support guest_memfd backed memslots with mappable memory, since
>> -     * there aren't any CoCo VMs that support only private memory on arm64.
>> +     * Only support guest_memfd backed memslots with mappable memory,
>> +     * unless the guest is a CCA realm guest.
>>        */
>> -    if (kvm_slot_has_gmem(new) && !kvm_memslot_is_gmem_only(new))
>> +    if (kvm_slot_has_gmem(new) && !kvm_memslot_is_gmem_only(new) &&
>> +        !kvm_is_realm(kvm))
>>           return -EINVAL;
>>       hva = new->userspace_addr;
>> diff --git a/arch/arm64/kvm/rmi.c b/arch/arm64/kvm/rmi.c
>> index cae29fd3353c..761b38a4071c 100644
>> --- a/arch/arm64/kvm/rmi.c
>> +++ b/arch/arm64/kvm/rmi.c
>> @@ -597,6 +597,179 @@ static int realm_data_map_init(struct kvm *kvm, unsigned long ipa,
>>       return ret;
>>   }
>> +static unsigned long addr_range_desc(unsigned long phys, unsigned long size)
>> +{
>> +    unsigned long out = 0;
>> +
>> +    switch (size) {
>> +    case P4D_SIZE:
>> +        out = 3 | (1 << 2);
>> +        break;
>> +    case PUD_SIZE:
>> +        out = 2 | (1 << 2);
>> +        break;
>> +    case PMD_SIZE:
>> +        out = 1 | (1 << 2);
>> +        break;
>> +    case PAGE_SIZE:
>> +        out = 0 | (1 << 2);
>> +        break;
>> +    default:
>> +        /*
>> +         * Only support mapping at the page level granularity when
>> +         * it's an unusual length. This should get us back onto a larger
>> +         * block size for the subsequent mappings.
>> +         */
>> +        out = 0 | ((MIN(size >> PAGE_SHIFT, PTRS_PER_PTE - 1)) << 2);
>> +        break;
>> +    }
>> +
>> +    WARN_ON(phys & ~PAGE_MASK);
>> +
>> +    out |= phys & PAGE_MASK;
>> +
>> +    return out;
>> +}
>> +
>> +int realm_map_protected(struct kvm *kvm,
>> +            unsigned long ipa,
>> +            kvm_pfn_t pfn,
>> +            unsigned long map_size,
>> +            struct kvm_mmu_memory_cache *memcache)
>> +{
>> +    struct realm *realm = &kvm->arch.realm;
>> +    phys_addr_t phys = __pfn_to_phys(pfn);
>> +    phys_addr_t base_phys = phys;
>> +    phys_addr_t rd = virt_to_phys(realm->rd);
>> +    unsigned long base_ipa = ipa;
>> +    unsigned long ipa_top = ipa + map_size;
>> +    int ret = 0;
>> +
>> +    if (WARN_ON(!IS_ALIGNED(map_size, PAGE_SIZE) ||
>> +            !IS_ALIGNED(ipa, map_size)))
>> +        return -EINVAL;
>> +
>> +    if (rmi_delegate_range(phys, map_size)) {
>> +        /*
>> +         * It's likely we raced with another VCPU on the same
>> +         * fault. Assume the other VCPU has handled the fault
>> +         * and return to the guest.
>> +         */
>> +        return 0;
>> +    }
>> +
>> +    while (ipa < ipa_top) {
>> +        unsigned long flags = RMI_ADDR_TYPE_SINGLE;
>> +        unsigned long range_desc = addr_range_desc(phys, ipa_top - ipa);
>> +        unsigned long out_top;
>> +
>> +        ret = rmi_rtt_data_map(rd, ipa, ipa_top, flags, range_desc,
>> +                       &out_top);
>> +
>> +        if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +            /* Create missing RTTs and retry */
>> +            int level = RMI_RETURN_INDEX(ret);
>> +
>> +            WARN_ON(level == KVM_PGTABLE_LAST_LEVEL);
>> +            ret = realm_create_rtt_levels(realm, ipa, level,
>> +                              KVM_PGTABLE_LAST_LEVEL,
>> +                              memcache);

Could we give the RMM a chance to make use of block mappings by
creating the missing RTTs only down to the level that may work for the
current range_desc? I.e., if the range_desc is a 2M block size, we could
create tables up to L2 in the first go and, if the RMM still needs an RTT,
go further down to KVM_PGTABLE_LAST_LEVEL. I understand this is
kind of an optimisation, so maybe we could defer it. (The same applies to
the non_secure map below.)
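
A rough user-space sketch of that two-stage idea, just to show the control
flow. It assumes 4K granules (level 1 = 1G blocks, level 2 = 2M blocks,
level 3 = pages). The toy_rmm model and every toy_* helper are hypothetical
stand-ins; they do not reflect how the RMM or realm_create_rtt_levels()
actually behave.

```c
#include <assert.h>

#define TOY_LAST_LEVEL	3	/* assumed: level 3 maps 4K pages */

/* Deepest RTT level needed to hold a mapping of the given size. */
static int toy_rtt_level_for_size(unsigned long long size)
{
	assert(size != 0);
	if (size >= (1ULL << 30))
		return 1;
	if (size >= (1ULL << 21))
		return 2;
	return TOY_LAST_LEVEL;
}

struct toy_rmm {
	int table_level;	/* deepest RTT level that currently exists */
	int allow_block;	/* whether this toy RMM accepts block mappings */
	int tables_created;	/* number of RTTs created so far */
};

/* Returns 0 on success, -1 when a deeper RTT is needed (toy RMI_ERROR_RTT). */
static int toy_map(const struct toy_rmm *rmm, unsigned long long size)
{
	int need = rmm->allow_block ? toy_rtt_level_for_size(size)
				    : TOY_LAST_LEVEL;

	return rmm->table_level >= need ? 0 : -1;
}

/* Create any missing tables down to max_level. */
static void toy_create_rtt_levels(struct toy_rmm *rmm, int max_level)
{
	while (rmm->table_level < max_level) {
		rmm->table_level++;
		rmm->tables_created++;
	}
}

static int toy_map_staged(struct toy_rmm *rmm, unsigned long long size)
{
	int target = toy_rtt_level_for_size(size);
	int ret = toy_map(rmm, size);

	if (ret) {
		/* First go: only build tables down to the block's level. */
		toy_create_rtt_levels(rmm, target);
		ret = toy_map(rmm, size);
	}
	if (ret && target < TOY_LAST_LEVEL) {
		/* Still short of RTTs: fall back to the last level. */
		toy_create_rtt_levels(rmm, TOY_LAST_LEVEL);
		ret = toy_map(rmm, size);
	}
	return ret;
}
```

In this model, a 2M request starting with tables down to level 1 creates one
table when block mappings are accepted, and two when they are not, so the
fallback costs nothing extra compared with going to the last level up front.
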


>> +            if (ret)
>> +                goto err_undelegate;
>> +
>> +            ret = rmi_rtt_data_map(rd, ipa, ipa_top, flags,
>> +                           range_desc, &out_top);
>> +        }
>> +
>> +        if (WARN_ON(ret))
>> +            goto err_undelegate;
>> +
>> +        phys += out_top - ipa;
>> +        ipa = out_top;
>> +    }
>> +
>> +    return 0;
>> +
>> +err_undelegate:
>> +    realm_unmap_private_range(kvm, base_ipa, ipa, true);
>> +    if (WARN_ON(rmi_undelegate_range(base_phys, map_size))) {
>> +        /* Page can't be returned to NS world so is lost */
>> +        get_page(phys_to_page(base_phys));
>> +    }
>> +    return -ENXIO;
>> +}
>> +
>> +int realm_map_non_secure(struct realm *realm,
>> +             unsigned long ipa,
>> +             kvm_pfn_t pfn,
>> +             unsigned long size,
>> +             enum kvm_pgtable_prot prot,
>> +             struct kvm_mmu_memory_cache *memcache)
>> +{
>> +    unsigned long attr, flags = 0;
>> +    phys_addr_t rd = virt_to_phys(realm->rd);
>> +    phys_addr_t phys = __pfn_to_phys(pfn);
>> +    unsigned long ipa_top = ipa + size;
>> +    int ret;
>> +
>> +    if (WARN_ON(!IS_ALIGNED(size, PAGE_SIZE) ||
>> +            !IS_ALIGNED(ipa, size)))
>> +        return -EINVAL;
>> +
>> +    switch (prot & (KVM_PGTABLE_PROT_DEVICE | KVM_PGTABLE_PROT_NORMAL_NC)) {
>> +    case KVM_PGTABLE_PROT_DEVICE | KVM_PGTABLE_PROT_NORMAL_NC:
>> +        return -EINVAL;
>> +    case KVM_PGTABLE_PROT_DEVICE:
>> +        attr = MT_S2_FWB_DEVICE_nGnRE;
>> +        break;
>> +    case KVM_PGTABLE_PROT_NORMAL_NC:
>> +        attr = MT_S2_FWB_NORMAL_NC;
>> +        break;
>> +    default:
>> +        attr = MT_S2_FWB_NORMAL;
>> +    }
>> +
>> +    flags |= FIELD_PREP(RMI_RTT_UNPROT_MAP_FLAGS_MEMATTR, attr);
>> +
>> +    if (prot & KVM_PGTABLE_PROT_R)
>> +        flags |= FIELD_PREP(RMI_RTT_UNPROT_MAP_FLAGS_S2AP, RMI_S2AP_DIRECT_READ);
>> +    if (prot & KVM_PGTABLE_PROT_W)
>> +        flags |= FIELD_PREP(RMI_RTT_UNPROT_MAP_FLAGS_S2AP, RMI_S2AP_DIRECT_WRITE);
>> +
>> +    flags |= RMI_ADDR_TYPE_SINGLE;
>> +
>> +    while (ipa < ipa_top) {
>> +        unsigned long range_desc = addr_range_desc(phys, ipa_top - ipa);
>> +        unsigned long out_top;
>> +
>> +        ret = rmi_rtt_unprot_map(rd, ipa, ipa_top, flags, range_desc,
>> +                     &out_top);
>> +
>> +        if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +            /* Create missing RTTs and retry */
>> +            int level = RMI_RETURN_INDEX(ret);
>> +
>> +            WARN_ON(level == KVM_PGTABLE_LAST_LEVEL);
>> +            ret = realm_create_rtt_levels(realm, ipa, level,
>> +                              KVM_PGTABLE_LAST_LEVEL,

^^ Same as above.

Suzuki


>> +                              memcache);
>> +            if (ret)
>> +                return ret;
>> +
>> +            ret = rmi_rtt_unprot_map(rd, ipa, ipa_top, flags,
>> +                         range_desc, &out_top);
>> +        }
>> +
>> +        if (WARN_ON(ret))
>> +            return ret;
>> +
>> +        phys += out_top - ipa;
>> +        ipa = out_top;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static int populate_region_cb(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
>>                     struct page *src_page, void *opaque)
>>   {
> 
> Thanks,
> Gavin
> 



