From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 6 Mar 2026 14:02:21 +0000
In-Reply-To: <20260306140232.2193802-1-tabba@google.com>
Mime-Version: 1.0
References: <20260306140232.2193802-1-tabba@google.com>
X-Mailer: git-send-email 2.53.0.473.g4a7958ca14-goog
Message-ID:
 <20260306140232.2193802-3-tabba@google.com>
Subject: [PATCH v1 02/13] KVM: arm64: Introduce struct kvm_s2_fault to user_mem_abort()
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org
Cc: maz@kernel.org, oliver.upton@linux.dev, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com,
	will@kernel.org, qperret@google.com, vdonnefort@google.com, tabba@google.com
Content-Type: text/plain; charset="UTF-8"

The user_mem_abort() function takes many arguments and defines a large
number of local variables. Passing all of these to helper functions would
result in functions with too many arguments.

Introduce struct kvm_s2_fault to encapsulate the stage-2 fault state. This
structure holds both the input parameters and the intermediate state
required while handling the fault.

Update user_mem_abort() to initialize this structure and to replace the
local variables with fields of the new structure. This prepares the ground
for extracting parts of user_mem_abort() into smaller helper functions that
simply take a pointer to the fault state structure.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 222 +++++++++++++++++++++++++------------------
 1 file changed, 128 insertions(+), 94 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f8064b2d3204..419b793c5891 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1710,38 +1710,68 @@ static short kvm_s2_resolve_vma_size(struct vm_area_struct *vma,
 	return vma_shift;
 }
 
+struct kvm_s2_fault {
+	struct kvm_vcpu *vcpu;
+	phys_addr_t fault_ipa;
+	struct kvm_s2_trans *nested;
+	struct kvm_memory_slot *memslot;
+	unsigned long hva;
+	bool fault_is_perm;
+
+	bool write_fault;
+	bool exec_fault;
+	bool writable;
+	bool topup_memcache;
+	bool mte_allowed;
+	bool is_vma_cacheable;
+	bool s2_force_noncacheable;
+	bool vfio_allow_any_uc;
+	unsigned long mmu_seq;
+	phys_addr_t ipa;
+	short vma_shift;
+	gfn_t gfn;
+	kvm_pfn_t pfn;
+	bool logging_active;
+	bool force_pte;
+	long vma_pagesize;
+	long fault_granule;
+	enum kvm_pgtable_prot prot;
+	struct page *page;
+	vm_flags_t vm_flags;
+};
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  bool fault_is_perm)
 {
 	int ret = 0;
-	bool topup_memcache;
-	bool write_fault, writable;
-	bool exec_fault, mte_allowed, is_vma_cacheable;
-	bool s2_force_noncacheable = false, vfio_allow_any_uc = false;
-	unsigned long mmu_seq;
-	phys_addr_t ipa = fault_ipa;
+	struct kvm_s2_fault fault_data = {
+		.vcpu = vcpu,
+		.fault_ipa = fault_ipa,
+		.nested = nested,
+		.memslot = memslot,
+		.hva = hva,
+		.fault_is_perm = fault_is_perm,
+		.ipa = fault_ipa,
+		.logging_active = memslot_is_logging(memslot),
+		.force_pte = memslot_is_logging(memslot),
+		.s2_force_noncacheable = false,
+		.vfio_allow_any_uc = false,
+		.prot = KVM_PGTABLE_PROT_R,
+	};
+	struct kvm_s2_fault *fault = &fault_data;
 	struct kvm *kvm = vcpu->kvm;
 	struct vm_area_struct *vma;
-	short vma_shift;
 	void *memcache;
-	gfn_t gfn;
-	kvm_pfn_t pfn;
-	bool logging_active = memslot_is_logging(memslot);
-	bool force_pte = logging_active;
-	long vma_pagesize, fault_granule;
-	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
-	struct page *page;
-	vm_flags_t vm_flags;
 	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
 
-	if (fault_is_perm)
-		fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
-	write_fault = kvm_is_write_fault(vcpu);
-	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
-	VM_WARN_ON_ONCE(write_fault && exec_fault);
+	if (fault->fault_is_perm)
+		fault->fault_granule = kvm_vcpu_trap_get_perm_fault_granule(fault->vcpu);
+	fault->write_fault = kvm_is_write_fault(fault->vcpu);
+	fault->exec_fault = kvm_vcpu_trap_is_exec_fault(fault->vcpu);
+	VM_WARN_ON_ONCE(fault->write_fault && fault->exec_fault);
 
 	/*
 	 * Permission faults just need to update the existing leaf entry,
@@ -1749,43 +1779,44 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * only exception to this is when dirty logging is enabled at runtime
 	 * and a write fault needs to collapse a block entry into a table.
 	 */
-	topup_memcache = !fault_is_perm || (logging_active && write_fault);
-	ret = prepare_mmu_memcache(vcpu, topup_memcache, &memcache);
+	fault->topup_memcache = !fault->fault_is_perm ||
+				(fault->logging_active && fault->write_fault);
+	ret = prepare_mmu_memcache(fault->vcpu, fault->topup_memcache, &memcache);
 	if (ret)
 		return ret;
 
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
 	mmap_read_lock(current->mm);
-	vma = vma_lookup(current->mm, hva);
+	vma = vma_lookup(current->mm, fault->hva);
 	if (unlikely(!vma)) {
-		kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+		kvm_err("Failed to find VMA for hva 0x%lx\n", fault->hva);
 		mmap_read_unlock(current->mm);
 		return -EFAULT;
 	}
 
-	vma_shift = kvm_s2_resolve_vma_size(vma, hva, memslot, nested,
-					    &force_pte, &ipa);
-	vma_pagesize = 1UL << vma_shift;
+	fault->vma_shift = kvm_s2_resolve_vma_size(vma, fault->hva, fault->memslot, fault->nested,
+						   &fault->force_pte, &fault->ipa);
+	fault->vma_pagesize = 1UL << fault->vma_shift;
 
 	/*
 	 * Both the canonical IPA and fault IPA must be aligned to the
 	 * mapping size to ensure we find the right PFN and lay down the
 	 * mapping in the right place.
 	 */
-	fault_ipa = ALIGN_DOWN(fault_ipa, vma_pagesize);
-	ipa = ALIGN_DOWN(ipa, vma_pagesize);
+	fault->fault_ipa = ALIGN_DOWN(fault->fault_ipa, fault->vma_pagesize);
+	fault->ipa = ALIGN_DOWN(fault->ipa, fault->vma_pagesize);
 
-	gfn = ipa >> PAGE_SHIFT;
-	mte_allowed = kvm_vma_mte_allowed(vma);
+	fault->gfn = fault->ipa >> PAGE_SHIFT;
+	fault->mte_allowed = kvm_vma_mte_allowed(vma);
 
-	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
+	fault->vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
 
-	vm_flags = vma->vm_flags;
+	fault->vm_flags = vma->vm_flags;
 
-	is_vma_cacheable = kvm_vma_is_cacheable(vma);
+	fault->is_vma_cacheable = kvm_vma_is_cacheable(vma);
 
 	/* Don't use the VMA after the unlock -- it may have vanished */
 	vma = NULL;
@@ -1798,24 +1829,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
 	 * with the smp_wmb() in kvm_mmu_invalidate_end().
 	 */
-	mmu_seq = kvm->mmu_invalidate_seq;
+	fault->mmu_seq = kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
-	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
-				&writable, &page);
-	if (pfn == KVM_PFN_ERR_HWPOISON) {
-		kvm_send_hwpoison_signal(hva, vma_shift);
+	fault->pfn = __kvm_faultin_pfn(fault->memslot, fault->gfn,
+				       fault->write_fault ? FOLL_WRITE : 0,
+				       &fault->writable, &fault->page);
+	if (fault->pfn == KVM_PFN_ERR_HWPOISON) {
+		kvm_send_hwpoison_signal(fault->hva, fault->vma_shift);
 		return 0;
 	}
 
-	if (is_error_noslot_pfn(pfn))
+	if (is_error_noslot_pfn(fault->pfn))
 		return -EFAULT;
 
 	/*
 	 * Check if this is non-struct page memory PFN, and cannot support
 	 * CMOs. It could potentially be unsafe to access as cacheable.
 	 */
-	if (vm_flags & (VM_PFNMAP | VM_MIXEDMAP) && !pfn_is_map_memory(pfn)) {
-		if (is_vma_cacheable) {
+	if (fault->vm_flags & (VM_PFNMAP | VM_MIXEDMAP) && !pfn_is_map_memory(fault->pfn)) {
+		if (fault->is_vma_cacheable) {
 			/*
 			 * Whilst the VMA owner expects cacheable mapping to this
 			 * PFN, hardware also has to support the FWB and CACHE DIC
@@ -1832,26 +1864,26 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			ret = -EFAULT;
 		} else {
 			/*
 			 * If the page was identified as device early by looking at
-			 * the VMA flags, vma_pagesize is already representing the
+			 * the VMA flags, fault->vma_pagesize is already representing the
 			 * largest quantity we can map.  If instead it was mapped
-			 * via __kvm_faultin_pfn(), vma_pagesize is set to PAGE_SIZE
+			 * via __kvm_faultin_pfn(), fault->vma_pagesize is set to PAGE_SIZE
 			 * and must not be upgraded.
 			 *
 			 * In both cases, we don't let transparent_hugepage_adjust()
 			 * change things at the last minute.
 			 */
-			s2_force_noncacheable = true;
+			fault->s2_force_noncacheable = true;
 		}
-	} else if (logging_active && !write_fault) {
+	} else if (fault->logging_active && !fault->write_fault) {
 		/*
 		 * Only actually map the page as writable if this was a write
 		 * fault.
 		 */
-		writable = false;
+		fault->writable = false;
 	}
 
-	if (exec_fault && s2_force_noncacheable)
+	if (fault->exec_fault && fault->s2_force_noncacheable)
 		ret = -ENOEXEC;
 
 	if (ret)
@@ -1860,101 +1892,103 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	/*
 	 * Guest performs atomic/exclusive operations on memory with unsupported
 	 * attributes (e.g. ld64b/st64b on normal memory when no FEAT_LS64WB)
 	 * and trigger the exception here. Since the memslot is valid, inject
 	 * the fault back to the guest.
 	 */
-	if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(vcpu))) {
-		kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
+	if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(fault->vcpu))) {
+		kvm_inject_dabt_excl_atomic(fault->vcpu, kvm_vcpu_get_hfar(fault->vcpu));
 		ret = 1;
 		goto out_put_page;
 	}
 
-	if (nested)
-		adjust_nested_fault_perms(nested, &prot, &writable);
+	if (fault->nested)
+		adjust_nested_fault_perms(fault->nested, &fault->prot, &fault->writable);
 
 	kvm_fault_lock(kvm);
-	pgt = vcpu->arch.hw_mmu->pgt;
-	if (mmu_invalidate_retry(kvm, mmu_seq)) {
+	pgt = fault->vcpu->arch.hw_mmu->pgt;
+	if (mmu_invalidate_retry(kvm, fault->mmu_seq)) {
 		ret = -EAGAIN;
 		goto out_unlock;
 	}
 
 	/*
 	 * If we are not forced to use page mapping, check if we are
 	 * backed by a THP and thus use block mapping if possible.
 	 */
-	if (vma_pagesize == PAGE_SIZE && !(force_pte || s2_force_noncacheable)) {
-		if (fault_is_perm && fault_granule > PAGE_SIZE)
-			vma_pagesize = fault_granule;
+	if (fault->vma_pagesize == PAGE_SIZE &&
+	    !(fault->force_pte || fault->s2_force_noncacheable)) {
+		if (fault->fault_is_perm && fault->fault_granule > PAGE_SIZE)
+			fault->vma_pagesize = fault->fault_granule;
 		else
-			vma_pagesize = transparent_hugepage_adjust(kvm, memslot,
-								   hva, &pfn,
-								   &fault_ipa);
+			fault->vma_pagesize = transparent_hugepage_adjust(kvm, fault->memslot,
									  fault->hva, &fault->pfn,
									  &fault->fault_ipa);
 
-		if (vma_pagesize < 0) {
-			ret = vma_pagesize;
+		if (fault->vma_pagesize < 0) {
+			ret = fault->vma_pagesize;
 			goto out_unlock;
 		}
 	}
 
-	if (!fault_is_perm && !s2_force_noncacheable && kvm_has_mte(kvm)) {
+	if (!fault->fault_is_perm && !fault->s2_force_noncacheable && kvm_has_mte(kvm)) {
 		/* Check the VMM hasn't introduced a new disallowed VMA */
-		if (mte_allowed) {
-			sanitise_mte_tags(kvm, pfn, vma_pagesize);
+		if (fault->mte_allowed) {
+			sanitise_mte_tags(kvm, fault->pfn, fault->vma_pagesize);
 		} else {
 			ret = -EFAULT;
 			goto out_unlock;
 		}
 	}
 
-	if (writable)
-		prot |= KVM_PGTABLE_PROT_W;
+	if (fault->writable)
+		fault->prot |= KVM_PGTABLE_PROT_W;
 
-	if (exec_fault)
-		prot |= KVM_PGTABLE_PROT_X;
+	if (fault->exec_fault)
+		fault->prot |= KVM_PGTABLE_PROT_X;
 
-	if (s2_force_noncacheable) {
-		if (vfio_allow_any_uc)
-			prot |= KVM_PGTABLE_PROT_NORMAL_NC;
+	if (fault->s2_force_noncacheable) {
+		if (fault->vfio_allow_any_uc)
+			fault->prot |= KVM_PGTABLE_PROT_NORMAL_NC;
 		else
-			prot |= KVM_PGTABLE_PROT_DEVICE;
+			fault->prot |= KVM_PGTABLE_PROT_DEVICE;
 	} else if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC)) {
-		prot |= KVM_PGTABLE_PROT_X;
+		fault->prot |= KVM_PGTABLE_PROT_X;
 	}
 
-	if (nested)
-		adjust_nested_exec_perms(kvm, nested, &prot);
+	if (fault->nested)
+		adjust_nested_exec_perms(kvm, fault->nested, &fault->prot);
 
 	/*
 	 * Under the premise of getting a FSC_PERM fault, we just need to relax
-	 * permissions only if vma_pagesize equals fault_granule. Otherwise,
+	 * permissions only if fault->vma_pagesize equals fault->fault_granule. Otherwise,
 	 * kvm_pgtable_stage2_map() should be called to change block size.
 	 */
-	if (fault_is_perm && vma_pagesize == fault_granule) {
+	if (fault->fault_is_perm && fault->vma_pagesize == fault->fault_granule) {
		/*
 		 * Drop the SW bits in favour of those stored in the
 		 * PTE, which will be preserved.
 		 */
-		prot &= ~KVM_NV_GUEST_MAP_SZ;
-		ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, fault_ipa, prot, flags);
+		fault->prot &= ~KVM_NV_GUEST_MAP_SZ;
+		ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, fault->fault_ipa, fault->prot,
+								 flags);
 	} else {
-		ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, fault_ipa, vma_pagesize,
-							 __pfn_to_phys(pfn), prot,
-							 memcache, flags);
+		ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, fault->fault_ipa, fault->vma_pagesize,
+							 __pfn_to_phys(fault->pfn), fault->prot,
+							 memcache, flags);
 	}
 
 out_unlock:
-	kvm_release_faultin_page(kvm, page, !!ret, writable);
+	kvm_release_faultin_page(kvm, fault->page, !!ret, fault->writable);
 	kvm_fault_unlock(kvm);
 
 	/* Mark the page dirty only if the fault is handled successfully */
-	if (writable && !ret)
-		mark_page_dirty_in_slot(kvm, memslot, gfn);
+	if (fault->writable && !ret)
+		mark_page_dirty_in_slot(kvm, fault->memslot, fault->gfn);
 
 	return ret != -EAGAIN ? ret : 0;
 
 out_put_page:
-	kvm_release_page_unused(page);
+	kvm_release_page_unused(fault->page);
 	return ret;
 }
-- 
2.53.0.473.g4a7958ca14-goog