public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>, kvm@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] KVM: use signals to abort enter_guest/blocking and retry
Date: Mon, 24 Oct 2022 09:49:33 +0200	[thread overview]
Message-ID: <2bd40960-e65e-db2b-ae8f-1109cb9f3fa0@redhat.com> (raw)
In-Reply-To: <4ef882c2-1535-d7df-d474-e5fab2975f53@redhat.com>



Am 24/10/2022 um 09:43 schrieb Emanuele Giuseppe Esposito:
> 
> 
> Am 23/10/2022 um 19:48 schrieb Paolo Bonzini:
>> On 10/22/22 17:48, Emanuele Giuseppe Esposito wrote:
>>> Once a vcpu exectues KVM_RUN, it could enter two states:
>>> enter guest mode, or block/halt.
>>> Use a signal to allow a vcpu to exit the guest state or unblock,
>>> so that it can exit KVM_RUN and release the read semaphore,
>>> allowing a pending KVM_KICK_ALL_RUNNING_VCPUS to continue.
>>>
>>> Note that the signal is not deleted and used to propagate the
>>> exit reason till vcpu_run(). It will be clearead only by
>>> KVM_RESUME_ALL_KICKED_VCPUS. This allows the vcpu to keep try
>>> entering KVM_RUN and perform again all checks done in
>>> kvm_arch_vcpu_ioctl_run() before entering the guest state,
>>> where it will return again if the request is still set.
>>>
>>> However, the userspace hypervisor should also try to avoid
>>> continuously calling KVM_RUN after invoking KVM_KICK_ALL_RUNNING_VCPUS,
>>> because such call will just translate in a back-to-back down_read()
>>> and up_read() (thanks to the signal).
>>
>> Since the userspace should anyway avoid going into this effectively-busy
>> wait, what about clearing the request after the first exit?  The
>> cancellation ioctl can be kept for vCPUs that are never entered after
>> KVM_KICK_ALL_RUNNING_VCPUS.  Alternatively, kvm_clear_all_cpus_request
>> could be done right before up_write().
> 
> Clearing makes sense, but should we "trust" the userspace not to go into
> busy wait?

Actually since this change is just a s/test/check, I would rather keep
test. If userspace does things wrong, this mechanism would still work
properly.

> What's the typical "contract" between KVM and the userspace? Meaning,
> should we cover the basic usage mistakes like forgetting to busy wait on
> KVM_RUN?
> 
> If we don't, I can add a comment when clearing and of course also
> mention it in the API documentation (that I forgot to update, sorry :D)
> 
> Emanuele
> 
>>
>> Paolo
>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   arch/x86/include/asm/kvm_host.h |  2 ++
>>>   arch/x86/kvm/x86.c              |  8 ++++++++
>>>   virt/kvm/kvm_main.c             | 21 +++++++++++++++++++++
>>>   3 files changed, 31 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h
>>> b/arch/x86/include/asm/kvm_host.h
>>> index aa381ab69a19..d5c37f344d65 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -108,6 +108,8 @@
>>>       KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>>   #define KVM_REQ_MMU_FREE_OBSOLETE_ROOTS \
>>>       KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>> +#define KVM_REQ_USERSPACE_KICK        \
>>> +    KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT)
>>>     #define
>>> CR0_RESERVED_BITS                                               \
>>>       (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM |
>>> X86_CR0_TS \
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index b0c47b41c264..2af5f427b4e9 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -10270,6 +10270,10 @@ static int vcpu_enter_guest(struct kvm_vcpu
>>> *vcpu)
>>>       }
>>>         if (kvm_request_pending(vcpu)) {
>>> +        if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu)) {
>>> +            r = -EINTR;
>>> +            goto out;
>>> +        }
>>>           if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu)) {
>>>               r = -EIO;
>>>               goto out;
>>> @@ -10701,6 +10705,10 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
>>>               r = vcpu_block(vcpu);
>>>           }
>>>   +        /* vcpu exited guest/unblocked because of this request */
>>> +        if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu))
>>> +            return -EINTR;
>>> +
>>>           if (r <= 0)
>>>               break;
>>>   diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index ae0240928a4a..13fa7229b85d 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -3431,6 +3431,8 @@ static int kvm_vcpu_check_block(struct kvm_vcpu
>>> *vcpu)
>>>           goto out;
>>>       if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu))
>>>           goto out;
>>> +    if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu))
>>> +        goto out;
>>>         ret = 0;
>>>   out:
>>> @@ -4668,6 +4670,25 @@ static long kvm_vm_ioctl(struct file *filp,
>>>           r = kvm_vm_ioctl_enable_cap_generic(kvm, &cap);
>>>           break;
>>>       }
>>> +    case KVM_KICK_ALL_RUNNING_VCPUS: {
>>> +        /*
>>> +         * Notify all running vcpus that they have to stop.
>>> +         * Caught in kvm_arch_vcpu_ioctl_run()
>>> +         */
>>> +        kvm_make_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
>>> +
>>> +        /*
>>> +         * Use wr semaphore to wait for all vcpus to exit from KVM_RUN.
>>> +         */
>>> +        down_write(&memory_transaction);
>>> +        up_write(&memory_transaction);
>>> +        break;
>>> +    }
>>> +    case KVM_RESUME_ALL_KICKED_VCPUS: {
>>> +        /* Remove all requests sent with KVM_KICK_ALL_RUNNING_VCPUS */
>>> +        kvm_clear_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
>>> +        break;
>>> +    }
>>>       case KVM_SET_USER_MEMORY_REGION: {
>>>           struct kvm_userspace_memory_region kvm_userspace_mem;
>>>   
>>


  reply	other threads:[~2022-10-24  7:49 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-22 15:48 [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm Emanuele Giuseppe Esposito
2022-10-22 15:48 ` [PATCH 1/4] linux-headers/linux/kvm.h: introduce kvm_userspace_memory_region_list ioctl Emanuele Giuseppe Esposito
2022-10-22 15:48 ` [PATCH 2/4] KVM: introduce kvm_clear_all_cpus_request Emanuele Giuseppe Esposito
2022-10-22 15:48 ` [PATCH 3/4] KVM: introduce memory transaction semaphore Emanuele Giuseppe Esposito
2022-10-23 17:50   ` Paolo Bonzini
2022-10-24 12:57     ` Emanuele Giuseppe Esposito
2022-10-25 10:01       ` Paolo Bonzini
2022-10-22 15:48 ` [PATCH 4/4] KVM: use signals to abort enter_guest/blocking and retry Emanuele Giuseppe Esposito
2022-10-23 17:48   ` Paolo Bonzini
2022-10-24  7:43     ` Emanuele Giuseppe Esposito
2022-10-24  7:49       ` Emanuele Giuseppe Esposito [this message]
2022-10-25 10:05       ` Paolo Bonzini
2022-10-24  7:56 ` [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm Christian Borntraeger
2022-10-24  8:33   ` Emanuele Giuseppe Esposito
2022-10-24  9:09     ` Christian Borntraeger
2022-10-24 22:45       ` Sean Christopherson
2022-10-25  9:33         ` Paolo Bonzini
2022-10-25 15:55           ` Sean Christopherson
2022-10-25 21:34             ` Paolo Bonzini
2022-10-25 23:07               ` Sean Christopherson
2022-10-26 17:52                 ` Hyper-V VTLs, permission bitmaps and userspace exits (was Re: [PATCH 0/4] KVM: API to block and resume all running vcpus in a vm) Paolo Bonzini
2022-10-26 19:33                   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2bd40960-e65e-db2b-ae8f-1109cb9f3fa0@redhat.com \
    --to=eesposit@redhat.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox