From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
David Matlack <dmatlack@google.com>,
Xu Yilun <yilun.xu@linux.intel.com>,
Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH 3/4] KVM: Get reference to VM's address space in the async #PF worker
Date: Fri, 26 Jan 2024 17:21:48 +0100 [thread overview]
Message-ID: <87sf2k83qb.fsf@redhat.com> (raw)
In-Reply-To: <20240110011533.503302-4-seanjc@google.com>
Sean Christopherson <seanjc@google.com> writes:
> Get a reference to the target VM's address space in async_pf_execute()
> instead of gifting a reference from kvm_setup_async_pf(). Keeping the
> address space alive just to service an async #PF is counter-productive,
> i.e. if the process is exiting and all vCPUs are dead, then NOT doing
> get_user_pages_remote() and freeing the address space asap is
> desirable.
It took me a while to realize why all vCPU fds are managed by the same
mm which did KVM_CREATE_VM as (AFAIU) fds can be passed around. Turns
out, we explicitly forbid this in kvm_vcpu_ioctl():
if (vcpu->kvm->mm != current->mm || vcpu->kvm->vm_dead)
return -EIO;
so this indeed means that grabbing current->mm in kvm_setup_async_pf()
can be avoided. I'm not sure whether it's just me or a "all vCPUs are
quired to be managed by the same mm" comment somewhere would be helpful.
>
> Handling the mm reference entirely within async_pf_execute() also
> simplifies the async #PF flows as a whole, e.g. it's not immediately
> obvious when the worker task vs. the vCPU task is responsible for putting
> the gifted mm reference.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> include/linux/kvm_host.h | 1 -
> virt/kvm/async_pf.c | 32 ++++++++++++++++++--------------
> 2 files changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 7e7fd25b09b3..bbfefd7e612f 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -238,7 +238,6 @@ struct kvm_async_pf {
> struct list_head link;
> struct list_head queue;
> struct kvm_vcpu *vcpu;
> - struct mm_struct *mm;
> gpa_t cr2_or_gpa;
> unsigned long addr;
> struct kvm_arch_async_pf arch;
> diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
> index d5dc50318aa6..c3f4f351a2ae 100644
> --- a/virt/kvm/async_pf.c
> +++ b/virt/kvm/async_pf.c
> @@ -46,8 +46,8 @@ static void async_pf_execute(struct work_struct *work)
> {
> struct kvm_async_pf *apf =
> container_of(work, struct kvm_async_pf, work);
> - struct mm_struct *mm = apf->mm;
> struct kvm_vcpu *vcpu = apf->vcpu;
> + struct mm_struct *mm = vcpu->kvm->mm;
> unsigned long addr = apf->addr;
> gpa_t cr2_or_gpa = apf->cr2_or_gpa;
> int locked = 1;
> @@ -56,16 +56,24 @@ static void async_pf_execute(struct work_struct *work)
> might_sleep();
>
> /*
> - * This work is run asynchronously to the task which owns
> - * mm and might be done in another context, so we must
> - * access remotely.
> + * Attempt to pin the VM's host address space, and simply skip gup() if
> + * acquiring a pin fail, i.e. if the process is exiting. Note, KVM
> + * holds a reference to its associated mm_struct until the very end of
> + * kvm_destroy_vm(), i.e. the struct itself won't be freed before this
> + * work item is fully processed.
> */
> - mmap_read_lock(mm);
> - get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, &locked);
> - if (locked)
> - mmap_read_unlock(mm);
> - mmput(mm);
> + if (mmget_not_zero(mm)) {
> + mmap_read_lock(mm);
> + get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, &locked);
> + if (locked)
> + mmap_read_unlock(mm);
> + mmput(mm);
> + }
>
> + /*
> + * Notify and kick the vCPU even if faulting in the page failed, e.g.
> + * so that the vCPU can retry the fault synchronously.
> + */
> if (IS_ENABLED(CONFIG_KVM_ASYNC_PF_SYNC))
> kvm_arch_async_page_present(vcpu, apf);
>
> @@ -129,10 +137,8 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> #ifdef CONFIG_KVM_ASYNC_PF_SYNC
> flush_work(&work->work);
> #else
> - if (cancel_work_sync(&work->work)) {
> - mmput(work->mm);
> + if (cancel_work_sync(&work->work))
> kmem_cache_free(async_pf_cache, work);
> - }
> #endif
> spin_lock(&vcpu->async_pf.lock);
> }
> @@ -211,8 +217,6 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> work->cr2_or_gpa = cr2_or_gpa;
> work->addr = hva;
> work->arch = *arch;
> - work->mm = current->mm;
> - mmget(work->mm);
>
> INIT_WORK(&work->work, async_pf_execute);
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
--
Vitaly
next prev parent reply other threads:[~2024-01-26 16:21 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-10 1:15 [PATCH 0/4] KVM: Async #PF fixes and cleanups Sean Christopherson
2024-01-10 1:15 ` [PATCH 1/4] KVM: Always flush async #PF workqueue when vCPU is being destroyed Sean Christopherson
2024-01-20 12:40 ` Xu Yilun
2024-01-24 19:04 ` Sean Christopherson
2024-01-26 7:36 ` Xu Yilun
2024-02-06 19:06 ` Sean Christopherson
2024-01-26 16:51 ` Vitaly Kuznetsov
2024-01-26 17:19 ` Sean Christopherson
2024-01-29 9:02 ` Vitaly Kuznetsov
2024-02-19 13:59 ` Xu Yilun
2024-02-19 15:51 ` Sean Christopherson
2024-02-20 3:02 ` Xu Yilun
2024-01-10 1:15 ` [PATCH 2/4] KVM: Put mm immediately after async #PF worker completes remote gup() Sean Christopherson
2024-01-20 15:24 ` Xu Yilun
2024-01-26 16:23 ` Vitaly Kuznetsov
2024-01-10 1:15 ` [PATCH 3/4] KVM: Get reference to VM's address space in the async #PF worker Sean Christopherson
2024-01-20 15:16 ` Xu Yilun
2024-01-24 18:52 ` Sean Christopherson
2024-01-26 8:06 ` Xu Yilun
2024-01-26 16:21 ` Vitaly Kuznetsov [this message]
2024-01-26 16:39 ` Sean Christopherson
2024-01-10 1:15 ` [PATCH 4/4] KVM: Nullify async #PF worker's "apf" pointer as soon as it might be freed Sean Christopherson
2024-01-20 15:24 ` Xu Yilun
2024-01-26 16:30 ` Vitaly Kuznetsov
2024-02-06 21:36 ` [PATCH 0/4] KVM: Async #PF fixes and cleanups Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87sf2k83qb.fsf@redhat.com \
--to=vkuznets@redhat.com \
--cc=dmatlack@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=yilun.xu@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.