From: Binbin Wu <binbin.wu@linux.intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>
Cc: Zhao Liu <zhao1.liu@intel.com>,
xiaoyao.li@intel.com, qemu-devel@nongnu.org,
michael.roth@amd.com, rick.p.edgecombe@intel.com,
isaku.yamahata@intel.com, farrah.chen@intel.com,
kvm@vger.kernel.org
Subject: Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL
Date: Fri, 13 Dec 2024 09:52:34 +0800 [thread overview]
Message-ID: <2cf8bf60-b36d-47ca-9aef-d477a841cbed@linux.intel.com> (raw)
In-Reply-To: <745b2b6e-7dd0-4437-bbbf-673ddc0df014@linux.intel.com>
On 12/13/2024 9:46 AM, Binbin Wu wrote:
>
>
>
> On 12/13/2024 5:28 AM, Paolo Bonzini wrote:
>> On 12/12/24 20:13, Sean Christopherson wrote:
>>> On Thu, Dec 12, 2024, Paolo Bonzini wrote:
>>>> On 12/12/24 09:07, Zhao Liu wrote:
>>>>> On Thu, Dec 12, 2024 at 11:26:28AM +0800, Binbin Wu wrote:
>>>>>> Date: Thu, 12 Dec 2024 11:26:28 +0800
>>>>>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>>>>> Subject: [PATCH] i386/kvm: Set return value after handling
>>>>>> KVM_EXIT_HYPERCALL
>>>>>> X-Mailer: git-send-email 2.46.0
>>>>>>
>>>>>> Userspace should set the ret field of hypercall after handling
>>>>>> KVM_EXIT_HYPERCALL. Otherwise, a stale value could be returned to KVM.
>>>>>>
>>>>>> Fixes: 47e76d03b15 ("i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE")
>>>>>> Reported-by: Farrah Chen <farrah.chen@intel.com>
>>>>>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>>>>>> Tested-by: Farrah Chen <farrah.chen@intel.com>
>>>>>> ---
>>>>>> To test the TDX code in kvm-coco-queue, please apply the patch to the QEMU,
>>>>>> otherwise, TDX guest boot could fail.
>>>>>> A matching QEMU tree including this patch is here:
>>>>>> https://github.com/intel-staging/qemu-tdx/releases/tag/tdx-qemu-upstream-v6.1-fix_kvm_hypercall_return_value
>>>>>>
>>>>>> Previously, the issue was not triggered because no one would modify the ret
>>>>>> value. But with the refactor patch for __kvm_emulate_hypercall() in KVM,
>>>>>> https://lore.kernel.org/kvm/20241128004344.4072099-7-seanjc@google.com/, the
>>>>>> value could be modified.
>>>>>
>>>>> Could you explain the specific reasons here in detail? It would be
>>>>> helpful with debugging or reproducing the issue.
>>>>>
>>>>>> ---
>>>>>> target/i386/kvm/kvm.c | 8 ++++++--
>>>>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>>>>>> index 8e17942c3b..4bcccb48d1 100644
>>>>>> --- a/target/i386/kvm/kvm.c
>>>>>> +++ b/target/i386/kvm/kvm.c
>>>>>> @@ -6005,10 +6005,14 @@ static int kvm_handle_hc_map_gpa_range(struct kvm_run *run)
>>>>>> static int kvm_handle_hypercall(struct kvm_run *run)
>>>>>> {
>>>>>> + int ret = -EINVAL;
>>>>>> +
>>>>>> if (run->hypercall.nr == KVM_HC_MAP_GPA_RANGE)
>>>>>> - return kvm_handle_hc_map_gpa_range(run);
>>>>>> + ret = kvm_handle_hc_map_gpa_range(run);
>>>>>> +
>>>>>> + run->hypercall.ret = ret;
>>>>>
>>>>> ret may be negative but hypercall.ret is u64. Do we need to set it to
>>>>> -ret?
>>>>
>>>> If ret is less than zero, will stop the VM anyway as
>>>> RUN_STATE_INTERNAL_ERROR.
>>>>
>>>> If this has to be fixed in QEMU, I think there's no need to set anything
>>>> if ret != 0; also because kvm_convert_memory() returns -1 on error and
>>>> that's not how the error would be passed to the guest.
>>>>
>>>> However, I think the right fix should simply be this in KVM:
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 83fe0a78146f..e2118ba93ef6 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -10066,6 +10066,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>>>> }
>>>> vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
>>>> + vcpu->run->ret = 0;
>>>
>>> vcpu->run->hypercall.ret
>>>
>>>> vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
>>>> vcpu->run->hypercall.args[0] = gpa;
>>>> vcpu->run->hypercall.args[1] = npages;
>>>>
>>>> While there is arguably a change in behavior of the kernel both with
>>>> the patches in kvm-coco-queue and with the above one, _in practice_
>>>> the above change is one that userspace will not notice.
>>>
>>> I agree that KVM should initialize "ret", but I don't think '0' is the right
>>> value. KVM shouldn't assume userspace will successfully handle the hypercall.
>>> What happens if KVM sets vcpu->run->hypercall.ret to a non-zero value, e.g. -KVM_ENOSYS?
>>
>> Unfortunately QEMU is never writing vcpu->run->hypercall.ret, so the guest sees -KVM_ENOSYS; this is basically the same bug that Binbin is fixing, just with a different value passed to the guest.
>>
>> In other words, the above one-liner is pulling the "don't break userspace" card.
>>
>> Paolo
>>
>>
> If the change need to be done in KVM, there are other 3 functions that use
> KVM_EXIT_HYPERCALL based on the code in kvm-coco-queue.
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 40fe7258843e..a624f7289282 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3633,6 +3633,7 @@ static int snp_begin_psc_msr(struct vcpu_svm *svm, u64 ghcb_msr)
> }
>
> vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> + vcpu->run->ret = 0;
> vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
> vcpu->run->hypercall.args[0] = gpa;
> vcpu->run->hypercall.args[1] = 1;
> @@ -3796,6 +3797,7 @@ static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
> case VMGEXIT_PSC_OP_PRIVATE:
> case VMGEXIT_PSC_OP_SHARED:
> vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> + vcpu->run->ret = 0;
> vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
> vcpu->run->hypercall.args[0] = gfn_to_gpa(gfn);
> vcpu->run->hypercall.args[1] = npages;
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 85c8aee263c1..c50c2edc8c56 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1161,6 +1161,7 @@ static void __tdx_map_gpa(struct vcpu_tdx * tdx)
> pr_err("%s: gpa = 0x%llx, size = 0x%llx", __func__, gpa, size);
>
> tdx->vcpu.run->exit_reason = KVM_EXIT_HYPERCALL;
> + tdx->vcpu->run->ret = 0;
> tdx->vcpu.run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
> tdx->vcpu.run->hypercall.args[0] = gpa & ~gfn_to_gpa(kvm_gfn_direct_bits(tdx->vcpu.kvm));
> tdx->vcpu.run->hypercall.args[1] = size / PAGE_SIZE;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4f94b1e24eae..3f82bb2357e3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10070,6 +10070,7 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
> }
>
> vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> + vcpu->run->ret = 0;
> vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
> vcpu->run->hypercall.args[0] = gpa;
> vcpu->run->hypercall.args[1] = npages;
>
Maybe we could add a helper to fill the vcpu->run?
next prev parent reply other threads:[~2024-12-13 1:58 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-12 3:26 [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL Binbin Wu
2024-12-12 3:41 ` Yao Yuan
2024-12-12 3:44 ` Xiaoyao Li
2024-12-12 5:18 ` Binbin Wu
2024-12-12 7:09 ` Xiaoyao Li
2024-12-12 7:24 ` Binbin Wu
2024-12-12 8:07 ` Zhao Liu
2024-12-12 16:03 ` Paolo Bonzini
2024-12-12 19:13 ` Sean Christopherson
2024-12-12 21:28 ` Paolo Bonzini
2024-12-12 22:11 ` Sean Christopherson
2024-12-13 1:46 ` Binbin Wu
2024-12-13 1:52 ` Binbin Wu [this message]
2024-12-13 1:56 ` Binbin Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2cf8bf60-b36d-47ca-9aef-d477a841cbed@linux.intel.com \
--to=binbin.wu@linux.intel.com \
--cc=farrah.chen@intel.com \
--cc=isaku.yamahata@intel.com \
--cc=kvm@vger.kernel.org \
--cc=michael.roth@amd.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rick.p.edgecombe@intel.com \
--cc=seanjc@google.com \
--cc=xiaoyao.li@intel.com \
--cc=zhao1.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.