From: Sean Christopherson <seanjc@google.com>
To: Anup Patel <apatel@ventanamicro.com>
Cc: Andrew Jones <ajones@ventanamicro.com>,
zhouquan@iscas.ac.cn, anup@brainfault.org,
atishp@atishpatra.org, paul.walmsley@sifive.com,
palmer@dabbelt.com, linux-kernel@vger.kernel.org,
linux-riscv@lists.infradead.org, kvm@vger.kernel.org,
kvm-riscv@lists.infradead.org
Subject: Re: [PATCH] RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()
Date: Tue, 17 Jun 2025 07:36:22 -0700
Message-ID: <aFF9ZqbvZZtbUnGt@google.com>
In-Reply-To: <CAK9=C2WFA+SDt4MCLj0reQnkkA2kxUmfWhT8HZxjT_DdW8W_rQ@mail.gmail.com>
On Sun, Jun 15, 2025, Anup Patel wrote:
> On Sat, Jun 14, 2025 at 3:59 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Thu, Jun 12, 2025, Andrew Jones wrote:
> > > On Wed, Jun 11, 2025 at 09:17:36AM -0700, Sean Christopherson wrote:
> > > > Looks like y'all also have a bug where an -EEXIST will be returned to userspace,
> > > > and will generate what's probably a spurious kvm_err() message.
> > >
> > > On 32-bit riscv, due to losing the upper bits of the physical address? Or
> > > is there yet another thing to fix?
> >
> > Another bug, I think. gstage_set_pte() returns -EEXIST if a PTE exists, and I
> > _assume_ that's supposed to be benign? But this code returns it blindly:
>
> gstage_set_pte() returns -EEXIST only when it was expecting a non-leaf
> PTE at a particular level but got a leaf PTE

Right, but isn't returning -EEXIST all the way to userspace undesirable behavior?
E.g. in this sequence, KVM will return -EEXIST and incorrectly terminate the VM
(assuming the VMM doesn't miraculously recover somehow):

  1. Back the VM with HugeTLBFS
  2. Fault-in memory, i.e. create hugepage mappings
  3. Enable KVM_MEM_LOG_DIRTY_PAGES
  4. Write-protection fault, kvm_riscv_gstage_map() tries to create a writable
     non-huge mapping.
  5. gstage_set_pte() encounters the huge leaf PTE before reaching the target
     level, and returns -EEXIST.

AFAICT, gstage_wp_memory_region() doesn't split/shatter/demote hugepages, it
simply clears _PAGE_WRITE.

It's entirely possible I'm missing something that makes the above scenario
impossible in practice, but at this point I'm genuinely curious :-)
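
For illustration only, the five steps above can be modeled with a small
userspace sketch. This is not KVM code: every name (toy_pte, toy_set_pte(),
toy_write_protect(), toy_scenario()) is hypothetical, the table has just two
levels, and the only behavior borrowed from the discussion is "return -EEXIST
when a leaf is found above the target level" and "write-protect by clearing the
writable bit without splitting".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ENTRIES 4

/* Hypothetical PTE: level 1 leaves model hugepages, level 0 leaves base pages. */
struct toy_pte {
	bool valid;
	bool leaf;
	bool writable;
	struct toy_pte *child;	/* next-level table when valid && !leaf */
};

static struct toy_pte toy_root[TOY_ENTRIES];
static struct toy_pte toy_tables[TOY_ENTRIES][TOY_ENTRIES];

/*
 * Install a leaf at target_level (1 = huge, 0 = base page). Like the behavior
 * described for gstage_set_pte(), fail with -EEXIST if the walk hits a leaf
 * where a non-leaf entry was expected.
 */
static int toy_set_pte(size_t idx1, size_t idx0, int target_level, bool writable)
{
	struct toy_pte *pte;

	assert(idx1 < TOY_ENTRIES && idx0 < TOY_ENTRIES);
	assert(target_level == 0 || target_level == 1);
	pte = &toy_root[idx1];

	if (target_level == 0) {
		if (pte->valid && pte->leaf)
			return -EEXIST;		/* huge leaf is in the way */
		if (!pte->valid) {
			pte->valid = true;
			pte->leaf = false;
			pte->child = toy_tables[idx1];
		}
		pte = &pte->child[idx0];
	}

	pte->valid = true;
	pte->leaf = true;
	pte->writable = writable;
	return 0;
}

/* Model of write-protecting a region: clear the writable bit, never split. */
static void toy_write_protect(void)
{
	for (size_t i = 0; i < TOY_ENTRIES; i++) {
		if (toy_root[i].valid && toy_root[i].leaf)
			toy_root[i].writable = false;
		for (size_t j = 0; j < TOY_ENTRIES; j++)
			toy_tables[i][j].writable = false;
	}
}

/* Steps 2-5: huge mapping, enable dirty logging, then a base-page write fault. */
int toy_scenario(void)
{
	int ret = toy_set_pte(0, 0, 1, true);	/* step 2: hugepage mapping */

	if (ret)
		return ret;
	toy_write_protect();			/* step 3: dirty logging on */
	return toy_set_pte(0, 0, 0, true);	/* steps 4-5: base-page fault */
}
```

In this model toy_scenario() returns -EEXIST because nothing demotes the huge
leaf between the write-protect pass and the base-page fault. Whether the real
code can reach that state is exactly the open question above.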