public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Binbin Wu <binbin.wu@linux.intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>
Cc: Zhao Liu <zhao1.liu@intel.com>,
	xiaoyao.li@intel.com, qemu-devel@nongnu.org,
	michael.roth@amd.com, rick.p.edgecombe@intel.com,
	isaku.yamahata@intel.com, farrah.chen@intel.com,
	kvm@vger.kernel.org
Subject: Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL
Date: Fri, 13 Dec 2024 09:52:34 +0800	[thread overview]
Message-ID: <2cf8bf60-b36d-47ca-9aef-d477a841cbed@linux.intel.com> (raw)
In-Reply-To: <745b2b6e-7dd0-4437-bbbf-673ddc0df014@linux.intel.com>




On 12/13/2024 9:46 AM, Binbin Wu wrote:
>
>
>
> On 12/13/2024 5:28 AM, Paolo Bonzini wrote:
>> On 12/12/24 20:13, Sean Christopherson wrote:
>>> On Thu, Dec 12, 2024, Paolo Bonzini wrote:
>>>> On 12/12/24 09:07, Zhao Liu wrote:
>>>>> On Thu, Dec 12, 2024 at 11:26:28AM +0800, Binbin Wu wrote:
>>>>>> Date: Thu, 12 Dec 2024 11:26:28 +0800
>>>>>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>>>>> Subject: [PATCH] i386/kvm: Set return value after handling
>>>>>>    KVM_EXIT_HYPERCALL
>>>>>> X-Mailer: git-send-email 2.46.0
>>>>>>
>>>>>> Userspace should set the ret field of hypercall after handling
>>>>>> KVM_EXIT_HYPERCALL.  Otherwise, a stale value could be returned to KVM.
>>>>>>
>>>>>> Fixes: 47e76d03b15 ("i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE")
>>>>>> Reported-by: Farrah Chen <farrah.chen@intel.com>
>>>>>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>>>>>> Tested-by: Farrah Chen <farrah.chen@intel.com>
>>>>>> ---
>>>>>> To test the TDX code in kvm-coco-queue, please apply the patch to the QEMU,
>>>>>> otherwise, TDX guest boot could fail.
>>>>>> A matching QEMU tree including this patch is here:
>>>>>> https://github.com/intel-staging/qemu-tdx/releases/tag/tdx-qemu-upstream-v6.1-fix_kvm_hypercall_return_value
>>>>>>
>>>>>> Previously, the issue was not triggered because no one would modify the ret
>>>>>> value. But with the refactor patch for __kvm_emulate_hypercall() in KVM,
>>>>>> https://lore.kernel.org/kvm/20241128004344.4072099-7-seanjc@google.com/, the
>>>>>> value could be modified.
>>>>>
>>>>> Could you explain the specific reasons here in detail? It would be
>>>>> helpful with debugging or reproducing the issue.
>>>>>
>>>>>> ---
>>>>>>    target/i386/kvm/kvm.c | 8 ++++++--
>>>>>>    1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>>>>>> index 8e17942c3b..4bcccb48d1 100644
>>>>>> --- a/target/i386/kvm/kvm.c
>>>>>> +++ b/target/i386/kvm/kvm.c
>>>>>> @@ -6005,10 +6005,14 @@ static int kvm_handle_hc_map_gpa_range(struct kvm_run *run)
>>>>>>    static int kvm_handle_hypercall(struct kvm_run *run)
>>>>>>    {
>>>>>> +    int ret = -EINVAL;
>>>>>> +
>>>>>>        if (run->hypercall.nr == KVM_HC_MAP_GPA_RANGE)
>>>>>> -        return kvm_handle_hc_map_gpa_range(run);
>>>>>> +        ret = kvm_handle_hc_map_gpa_range(run);
>>>>>> +
>>>>>> +    run->hypercall.ret = ret;
>>>>>
>>>>> ret may be negative but hypercall.ret is u64. Do we need to set it to
>>>>> -ret?
>>>>
>>>> If ret is less than zero, will stop the VM anyway as
>>>> RUN_STATE_INTERNAL_ERROR.
>>>>
>>>> If this has to be fixed in QEMU, I think there's no need to set anything
>>>> if ret != 0; also because kvm_convert_memory() returns -1 on error and
>>>> that's not how the error would be passed to the guest.
>>>>
>>>> However, I think the right fix should simply be this in KVM:
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 83fe0a78146f..e2118ba93ef6 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -10066,6 +10066,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>>>>           }
>>>>           vcpu->run->exit_reason        = KVM_EXIT_HYPERCALL;
>>>> +        vcpu->run->ret                = 0;
>>>
>>>         vcpu->run->hypercall.ret
>>>
>>>> vcpu->run->hypercall.nr       = KVM_HC_MAP_GPA_RANGE;
>>>>           vcpu->run->hypercall.args[0]  = gpa;
>>>>           vcpu->run->hypercall.args[1]  = npages;
>>>>
>>>> While there is arguably a change in behavior of the kernel both with
>>>> the patches in kvm-coco-queue and with the above one, _in practice_
>>>> the above change is one that userspace will not notice.
>>>
>>> I agree that KVM should initialize "ret", but I don't think '0' is the right
>>> value.  KVM shouldn't assume userspace will successfully handle the hypercall.
>>> What happens if KVM sets vcpu->run->hypercall.ret to a non-zero value, e.g. -KVM_ENOSYS?
>>
>> Unfortunately QEMU is never writing vcpu->run->hypercall.ret, so the guest sees -KVM_ENOSYS; this is basically the same bug that Binbin is fixing, just with a different value passed to the guest.
>>
>> In other words, the above one-liner is pulling the "don't break userspace" card.
>>
>> Paolo
>>
>>
> If the change need to be done in KVM, there are other 3 functions that use
> KVM_EXIT_HYPERCALL based on the code in kvm-coco-queue.
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 40fe7258843e..a624f7289282 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3633,6 +3633,7 @@ static int snp_begin_psc_msr(struct vcpu_svm *svm, u64 ghcb_msr)
>         }
>
>         vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> +       vcpu->run->ret         = 0;
>         vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
>         vcpu->run->hypercall.args[0] = gpa;
>         vcpu->run->hypercall.args[1] = 1;
> @@ -3796,6 +3797,7 @@ static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
>         case VMGEXIT_PSC_OP_PRIVATE:
>         case VMGEXIT_PSC_OP_SHARED:
>                 vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> +               vcpu->run->ret         = 0;
>                 vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
>                 vcpu->run->hypercall.args[0] = gfn_to_gpa(gfn);
>                 vcpu->run->hypercall.args[1] = npages;
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 85c8aee263c1..c50c2edc8c56 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1161,6 +1161,7 @@ static void __tdx_map_gpa(struct vcpu_tdx * tdx)
>         pr_err("%s: gpa = 0x%llx, size = 0x%llx", __func__, gpa, size);
>
>         tdx->vcpu.run->exit_reason       = KVM_EXIT_HYPERCALL;
> +       tdx->vcpu->run->ret              = 0;
>         tdx->vcpu.run->hypercall.nr      = KVM_HC_MAP_GPA_RANGE;
>         tdx->vcpu.run->hypercall.args[0] = gpa & ~gfn_to_gpa(kvm_gfn_direct_bits(tdx->vcpu.kvm));
>         tdx->vcpu.run->hypercall.args[1] = size / PAGE_SIZE;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4f94b1e24eae..3f82bb2357e3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10070,6 +10070,7 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
>                 }
>
>                 vcpu->run->exit_reason        = KVM_EXIT_HYPERCALL;
> +               vcpu->run->ret                = 0;
>                 vcpu->run->hypercall.nr       = KVM_HC_MAP_GPA_RANGE;
>                 vcpu->run->hypercall.args[0]  = gpa;
>                 vcpu->run->hypercall.args[1]  = npages;
>
Maybe we could add a helper to fill the vcpu->run?


  reply	other threads:[~2024-12-13  1:58 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12  3:26 [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL Binbin Wu
2024-12-12  3:41 ` Yao Yuan
2024-12-12  3:44 ` Xiaoyao Li
2024-12-12  5:18   ` Binbin Wu
2024-12-12  7:09     ` Xiaoyao Li
2024-12-12  7:24       ` Binbin Wu
2024-12-12  8:07 ` Zhao Liu
2024-12-12 16:03   ` Paolo Bonzini
2024-12-12 19:13     ` Sean Christopherson
2024-12-12 21:28       ` Paolo Bonzini
2024-12-12 22:11         ` Sean Christopherson
2024-12-13  1:46         ` Binbin Wu
2024-12-13  1:52           ` Binbin Wu [this message]
2024-12-13  1:56           ` Binbin Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2cf8bf60-b36d-47ca-9aef-d477a841cbed@linux.intel.com \
    --to=binbin.wu@linux.intel.com \
    --cc=farrah.chen@intel.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    --cc=xiaoyao.li@intel.com \
    --cc=zhao1.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox