From: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
To: fangyu.yu@linux.alibaba.com, anup@brainfault.org,
atish.patra@linux.dev, pjw@kernel.org, palmer@dabbelt.com,
aou@eecs.berkeley.edu, alex@ghiti.fr, pbonzini@redhat.com,
jiangyifei@huawei.com
Cc: guoren@kernel.org, kvm@vger.kernel.org,
kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] RISC-V: KVM: Remove automatic I/O mapping for VM_PFNMAP
Date: Mon, 20 Oct 2025 16:43:14 -0300
Message-ID: <ca6a8e5a-f14d-4017-90dc-be566d594eee@ventanamicro.com>
In-Reply-To: <20251020130801.68356-1-fangyu.yu@linux.alibaba.com>
On 10/20/25 10:08 AM, fangyu.yu@linux.alibaba.com wrote:
> From: Fangyu Yu <fangyu.yu@linux.alibaba.com>
>
> As of commit aac6db75a9fc ("vfio/pci: Use unmap_mapping_range()"),
> vm_pgoff is no longer guaranteed to hold the PFN for VM_PFNMAP
> regions. Using vma->vm_pgoff to derive the HPA here may therefore
> produce incorrect mappings.
>
> Instead, I/O mappings for such regions can be established on-demand
> during g-stage page faults, making the upfront ioremap in this path
> unnecessary.
>
> Fixes: 9d05c1fee837 ("RISC-V: KVM: Implement stage2 page table programming")
> Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
> ---
Hi,
This patch fixes the issue observed by Drew in [1]. I was helping Drew
debug it using a QEMU guest inside an emulated RISC-V host with the
'virt' machine + IOMMU enabled.
Using the patches from [2], without the workaround patch (18), booting a
guest with a passed-through PCI device fails with a store/AMO access fault
and a kernel oops:
[ 3.304776] Oops - store (or AMO) access fault [#1]
[ 3.305159] Modules linked in:
[ 3.305603] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc4 #39
[ 3.305988] Hardware name: riscv-virtio,qemu (DT)
[ 3.306140] epc : __ew32+0x34/0xba
[ 3.307910] ra : e1000_irq_disable+0x1e/0x9a
[ 3.307984] epc : ffffffff806ebfbe ra : ffffffff806ee3f8 sp : ff2000000000baf0
[ 3.308022] gp : ffffffff81719938 tp : ff600000018b8000 t0 : ff60000002c3b480
[ 3.308055] t1 : 0000000000000065 t2 : 3030206530303031 s0 : ff2000000000bb30
[ 3.308086] s1 : ff60000002a50a00 a0 : ff60000002a50fb8 a1 : 00000000000000d8
[ 3.308118] a2 : ffffffffffffffff a3 : 0000000000000002 a4 : 0000000000003000
[ 3.308161] a5 : ff200000001e00d8 a6 : 0000000000000008 a7 : 0000000000000038
[ 3.308195] s2 : ff60000002a50fb8 s3 : ff60000001865000 s4 : 00000000000000d8
[ 3.308226] s5 : ffffffffffffffff s6 : ff60000002a50a00 s7 : ffffffff812d2760
[ 3.308258] s8 : 0000000000000a00 s9 : 0000000000001000 s10: ff60000002a51000
[ 3.308288] s11: ff60000002a54000 t3 : ffffffff8172ec4f t4 : ffffffff8172ec4f
[ 3.308475] t5 : ffffffff8172ec50 t6 : ff2000000000b848
[ 3.308763] status: 0000000200000120 badaddr: ff200000001e00d8 cause: 0000000000000007
[ 3.308975] [<ffffffff806ebfbe>] __ew32+0x34/0xba
[ 3.309196] [<ffffffff806ee3f8>] e1000_irq_disable+0x1e/0x9a
[ 3.309241] [<ffffffff806f1e12>] e1000_probe+0x3b6/0xb50
[ 3.309279] [<ffffffff80510554>] pci_device_probe+0x7e/0xf8
[ 3.310001] [<ffffffff80610344>] really_probe+0x82/0x202
[ 3.310409] [<ffffffff80610520>] __driver_probe_device+0x5c/0xd0
[ 3.310622] [<ffffffff806105c0>] driver_probe_device+0x2c/0xb0
(...)
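For anyone reading the trace: cause 7 is a store/AMO access fault in the
RISC-V privileged spec, not a page fault (15) or a guest-page fault (23).
A minimal sketch of that distinction, with hypothetical names (the
sketch_* identifiers are mine and are not kernel code):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification, for illustration only. */
enum sketch_fault_kind {
	SKETCH_ACCESS_FAULT,	/* the physical access itself was rejected */
	SKETCH_PAGE_FAULT,	/* address translation failed */
	SKETCH_OTHER,
};

/* Classify an scause value using the privileged-spec exception codes. */
static enum sketch_fault_kind sketch_classify_scause(uint64_t scause)
{
	/* The top bit marks interrupts; this sketch only handles exceptions. */
	assert(!(scause >> 63));

	switch (scause) {
	case 1: case 5: case 7:		/* instruction, load, store/AMO access fault */
		return SKETCH_ACCESS_FAULT;
	case 12: case 13: case 15:	/* instruction, load, store/AMO page fault */
	case 20: case 21: case 23:	/* guest-page faults (H extension) */
		return SKETCH_PAGE_FAULT;
	default:
		return SKETCH_OTHER;
	}
}
```

With cause 0x7 from the oops this returns SKETCH_ACCESS_FAULT, which
suggests the store was translated but then hit a physical address that
nothing accepts.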
Further debugging showed that, as far as QEMU is concerned, the store fault
happens in an "unassigned io region", i.e. a region where no device has an IO
memory region mapped. No IOMMU faults are logged and, at least as far as I've
observed, there are no IOMMU translation bugs on the QEMU side either.
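That would be consistent with the commit message: the removed code treats
vma->vm_pgoff as a PFN, so if vm_pgoff holds something else the derived
HPA points at the wrong place. A rough userspace sketch of that
arithmetic, assuming 4 KiB pages; the struct and the sketch_* names are
hypothetical stand-ins, and the addresses are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct sketch_vma {
	uint64_t vm_start;	/* first user virtual address of the VMA */
	uint64_t vm_pgoff;	/* a PFN only under the old assumption */
};

/* Same arithmetic as the removed lines: treat vm_pgoff as a PFN. */
static uint64_t sketch_pa_from_pgoff(const struct sketch_vma *vma, uint64_t hva)
{
	assert(hva >= vma->vm_start);
	return (vma->vm_pgoff << SKETCH_PAGE_SHIFT) + (hva - vma->vm_start);
}

/* Nonzero when the derived address equals the real backing address. */
static int sketch_pgoff_matches(const struct sketch_vma *vma, uint64_t hva,
				uint64_t real_pa)
{
	return sketch_pa_from_pgoff(vma, hva) == real_pa;
}
```

If vm_pgoff is 0x40000 the derived address for the VMA start is
0x40000000, which is only correct when the region really is backed there.
If vm_pgoff is an unrelated offset, the upfront ioremap maps the guest
onto whatever happens to sit at the derived address.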
Thanks for the fix!
Tested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
[1] https://lore.kernel.org/all/20250920203851.2205115-38-ajones@ventanamicro.com/
[2] https://lore.kernel.org/all/20250920203851.2205115-20-ajones@ventanamicro.com/
> arch/riscv/kvm/mmu.c | 20 +-------------------
> 1 file changed, 1 insertion(+), 19 deletions(-)
>
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 525fb5a330c0..84c04c8f0892 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -197,8 +197,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>
> /*
> * A memory region could potentially cover multiple VMAs, and
> - * any holes between them, so iterate over all of them to find
> - * out if we can map any of them right now.
> + * any holes between them, so iterate over all of them.
> *
> * +--------------------------------------------+
> * +---------------+----------------+ +----------------+
> @@ -229,32 +228,15 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
> vm_end = min(reg_end, vma->vm_end);
>
> if (vma->vm_flags & VM_PFNMAP) {
> - gpa_t gpa = base_gpa + (vm_start - hva);
> - phys_addr_t pa;
> -
> - pa = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT;
> - pa += vm_start - vma->vm_start;
> -
> /* IO region dirty page logging not allowed */
> if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> ret = -EINVAL;
> goto out;
> }
> -
> - ret = kvm_riscv_mmu_ioremap(kvm, gpa, pa, vm_end - vm_start,
> - writable, false);
> - if (ret)
> - break;
> }
> hva = vm_end;
> } while (hva < reg_end);
>
> - if (change == KVM_MR_FLAGS_ONLY)
> - goto out;
> -
> - if (ret)
> - kvm_riscv_mmu_iounmap(kvm, base_gpa, size);
> -
> out:
> mmap_read_unlock(current->mm);
> return ret;