* [PATCHv2 0/2] kvm/x86: vhost task creation failure handling
@ 2025-02-26 21:38 Keith Busch
2025-02-26 21:38 ` [PATCHv2 1/2] vhost: return task creation error instead of NULL Keith Busch
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Keith Busch @ 2025-02-26 21:38 UTC (permalink / raw)
To: pbonzini, seanjc, kvm; +Cc: x86, virtualization, linux-kernel, Keith Busch
From: Keith Busch <kbusch@kernel.org>
The suggestion from Sean appears to be successful, so sending out a new
version for consideration.
Background:
The crosvm VMM might send signals to its threads that have entered
KVM_RUN. The signal specifically is SIGRTMIN from here:
https://github.com/google/crosvm/blob/main/src/crosvm/sys/linux/vcpu.rs#L651
If this happens to occur when the huge page recovery is trying to create
its vhost task, that will fail with ERESTARTNOINTR. Once this happens,
all KVM_RUN calls will fail with ENOMEM despite memory not being the
problem.
This series propagates the error up so we can distinguish it from the
current default of ENOMEM, and updates call_once so it can be retried
when this condition occurs.
Changes from the v1 (prefixed as an "RFC", really) patch:
Instead of using a VM-wide mutex, update the call_once pattern to
complete only if what it calls is successful (from Sean).
Keith Busch (1):
vhost: return task creation error instead of NULL
Sean Christopherson (1):
kvm: retry nx_huge_page_recovery_thread creation
arch/x86/kvm/mmu/mmu.c | 12 +++++-------
drivers/vhost/vhost.c | 2 +-
include/linux/call_once.h | 16 +++++++++++-----
kernel/vhost_task.c | 4 ++--
4 files changed, 19 insertions(+), 15 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCHv2 1/2] vhost: return task creation error instead of NULL
2025-02-26 21:38 [PATCHv2 0/2] kvm/x86: vhost task creation failure handling Keith Busch
@ 2025-02-26 21:38 ` Keith Busch
2025-02-26 21:38 ` [PATCHv2 2/2] kvm: retry nx_huge_page_recovery_thread creation Keith Busch
2025-02-27 10:21 ` [PATCHv2 0/2] kvm/x86: vhost task creation failure handling Lei Yang
2 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2025-02-26 21:38 UTC (permalink / raw)
To: pbonzini, seanjc, kvm; +Cc: x86, virtualization, linux-kernel, Keith Busch
From: Keith Busch <kbusch@kernel.org>
Lets callers distinguish why the vhost task creation failed. No one
currently cares why it failed, so no runtime change from this patch, but
that will not be the case for long.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
arch/x86/kvm/mmu/mmu.c | 2 +-
drivers/vhost/vhost.c | 2 +-
kernel/vhost_task.c | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d4ac4a1f8b81b..18ca1ea6dc240 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -7471,7 +7471,7 @@ static void kvm_mmu_start_lpage_recovery(struct once *once)
kvm_nx_huge_page_recovery_worker_kill,
kvm, "kvm-nx-lpage-recovery");
- if (!nx_thread)
+ if (IS_ERR(nx_thread))
return;
vhost_task_start(nx_thread);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9ac25d08f473e..61dd19c7f99f1 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -666,7 +666,7 @@ static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev)
vtsk = vhost_task_create(vhost_run_work_list, vhost_worker_killed,
worker, name);
- if (!vtsk)
+ if (!IS_ERR(vtsk))
goto free_worker;
mutex_init(&worker->mutex);
diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
index 8800f5acc0071..2ef2e1b800916 100644
--- a/kernel/vhost_task.c
+++ b/kernel/vhost_task.c
@@ -133,7 +133,7 @@ struct vhost_task *vhost_task_create(bool (*fn)(void *),
vtsk = kzalloc(sizeof(*vtsk), GFP_KERNEL);
if (!vtsk)
- return NULL;
+ return ERR_PTR(-ENOMEM);
init_completion(&vtsk->exited);
mutex_init(&vtsk->exit_mutex);
vtsk->data = arg;
@@ -145,7 +145,7 @@ struct vhost_task *vhost_task_create(bool (*fn)(void *),
tsk = copy_process(NULL, 0, NUMA_NO_NODE, &args);
if (IS_ERR(tsk)) {
kfree(vtsk);
- return NULL;
+ return ERR_PTR(PTR_ERR(tsk));
}
vtsk->task = tsk;
--
2.43.5
* [PATCHv2 2/2] kvm: retry nx_huge_page_recovery_thread creation
2025-02-26 21:38 [PATCHv2 0/2] kvm/x86: vhost task creation failure handling Keith Busch
2025-02-26 21:38 ` [PATCHv2 1/2] vhost: return task creation error instead of NULL Keith Busch
@ 2025-02-26 21:38 ` Keith Busch
2025-02-27 10:21 ` [PATCHv2 0/2] kvm/x86: vhost task creation failure handling Lei Yang
2 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2025-02-26 21:38 UTC (permalink / raw)
To: pbonzini, seanjc, kvm; +Cc: x86, virtualization, linux-kernel, Keith Busch
From: Sean Christopherson <seanjc@google.com>
A VMM may send a signal to its threads while they've entered KVM_RUN. If
that thread happens to be creating the huge page recovery vhost task,
creation will fail with -ERESTARTNOINTR. We need to retry if that
happens, so make call_once complete only if its callback succeeds.
Based-on-a-patch-by: Sean Christopherson <seanjc@google.com>
[implemented caller side, commit log]
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
arch/x86/kvm/mmu/mmu.c | 10 ++++------
include/linux/call_once.h | 16 +++++++++++-----
2 files changed, 15 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 18ca1ea6dc240..8160870398b90 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -7460,7 +7460,7 @@ static bool kvm_nx_huge_page_recovery_worker(void *data)
return true;
}
-static void kvm_mmu_start_lpage_recovery(struct once *once)
+static int kvm_mmu_start_lpage_recovery(struct once *once)
{
struct kvm_arch *ka = container_of(once, struct kvm_arch, nx_once);
struct kvm *kvm = container_of(ka, struct kvm, arch);
@@ -7472,12 +7472,13 @@ static void kvm_mmu_start_lpage_recovery(struct once *once)
kvm, "kvm-nx-lpage-recovery");
if (IS_ERR(nx_thread))
- return;
+ return PTR_ERR(nx_thread);
vhost_task_start(nx_thread);
/* Make the task visible only once it is fully started. */
WRITE_ONCE(kvm->arch.nx_huge_page_recovery_thread, nx_thread);
+ return 0;
}
int kvm_mmu_post_init_vm(struct kvm *kvm)
@@ -7485,10 +7486,7 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
if (nx_hugepage_mitigation_hard_disabled)
return 0;
- call_once(&kvm->arch.nx_once, kvm_mmu_start_lpage_recovery);
- if (!kvm->arch.nx_huge_page_recovery_thread)
- return -ENOMEM;
- return 0;
+ return call_once(&kvm->arch.nx_once, kvm_mmu_start_lpage_recovery);
}
void kvm_mmu_pre_destroy_vm(struct kvm *kvm)
diff --git a/include/linux/call_once.h b/include/linux/call_once.h
index 6261aa0b3fb00..ddcfd91493eaa 100644
--- a/include/linux/call_once.h
+++ b/include/linux/call_once.h
@@ -26,20 +26,26 @@ do { \
__once_init((once), #once, &__key); \
} while (0)
-static inline void call_once(struct once *once, void (*cb)(struct once *))
+static inline int call_once(struct once *once, int (*cb)(struct once *))
{
+ int r;
+
/* Pairs with atomic_set_release() below. */
if (atomic_read_acquire(&once->state) == ONCE_COMPLETED)
- return;
+ return 0;
guard(mutex)(&once->lock);
WARN_ON(atomic_read(&once->state) == ONCE_RUNNING);
if (atomic_read(&once->state) != ONCE_NOT_STARTED)
- return;
+ return -EINVAL;
atomic_set(&once->state, ONCE_RUNNING);
- cb(once);
- atomic_set_release(&once->state, ONCE_COMPLETED);
+ r = cb(once);
+ if (r)
+ atomic_set(&once->state, ONCE_NOT_STARTED);
+ else
+ atomic_set_release(&once->state, ONCE_COMPLETED);
+ return r;
}
#endif /* _LINUX_CALL_ONCE_H */
--
2.43.5
* Re: [PATCHv2 0/2] kvm/x86: vhost task creation failure handling
2025-02-26 21:38 [PATCHv2 0/2] kvm/x86: vhost task creation failure handling Keith Busch
2025-02-26 21:38 ` [PATCHv2 1/2] vhost: return task creation error instead of NULL Keith Busch
2025-02-26 21:38 ` [PATCHv2 2/2] kvm: retry nx_huge_page_recovery_thread creation Keith Busch
@ 2025-02-27 10:21 ` Lei Yang
2025-02-27 15:09 ` Keith Busch
2 siblings, 1 reply; 5+ messages in thread
From: Lei Yang @ 2025-02-27 10:21 UTC (permalink / raw)
To: Keith Busch
Cc: pbonzini, seanjc, kvm, x86, virtualization, linux-kernel,
Keith Busch
Hi Keith
There are some error messages in the qemu output when I tested this
series of patches with the virtio-net regression test. It can be
reproduced by booting up a guest with a vhost device after applying your
patches.
Error messages:
Qemu output:
qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true,
"vhostfd": "16", "fd": "10"}: vhost_set_owner failed: Cannot allocate
memory
qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true,
"vhostfd": "16", "fd": "10"}: vhost-net requested but could not be
initialized
My tests based on this commit:
# git log -1
commit dd83757f6e686a2188997cb58b5975f744bb7786 (HEAD -> master,
origin/master, origin/HEAD)
Merge: 102c16a1f9a9 eb54d2695b57
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed Feb 26 16:55:30 2025 -0800
Merge tag 'bcachefs-2025-02-26' of git://evilpiepirate.org/bcachefs
Thanks
Lei
On Thu, Feb 27, 2025 at 5:40 AM Keith Busch <kbusch@meta.com> wrote:
>
> From: Keith Busch <kbusch@kernel.org>
>
> The suggestion from Sean appears to be successful, so sending out a new
> version for consideration.
>
> Background:
>
> The crosvm VMM might send signals to its threads that have entered
> KVM_RUN. The signal specifically is SIGRTMIN from here:
>
> https://github.com/google/crosvm/blob/main/src/crosvm/sys/linux/vcpu.rs#L651
>
> If this happens to occur when the huge page recovery is trying to create
> its vhost task, that will fail with ERESTARTNOINTR. Once this happens,
> all KVM_RUN calls will fail with ENOMEM despite memory not being the
> problem.
>
> This series propagates the error up so we can distinguish it from the
> current default of ENOMEM, and updates call_once so it can be retried
> when this condition occurs.
>
> Changes from the v1 (prefixed as an "RFC", really) patch:
>
> Instead of using a VM-wide mutex, update the call_once pattern to
> complete only if what it calls is successful (from Sean).
>
> Keith Busch (1):
> vhost: return task creation error instead of NULL
>
> Sean Christopherson (1):
> kvm: retry nx_huge_page_recovery_thread creation
>
> arch/x86/kvm/mmu/mmu.c | 12 +++++-------
> drivers/vhost/vhost.c | 2 +-
> include/linux/call_once.h | 16 +++++++++++-----
> kernel/vhost_task.c | 4 ++--
> 4 files changed, 19 insertions(+), 15 deletions(-)
>
> --
> 2.43.5
>
>
* Re: [PATCHv2 0/2] kvm/x86: vhost task creation failure handling
2025-02-27 10:21 ` [PATCHv2 0/2] kvm/x86: vhost task creation failure handling Lei Yang
@ 2025-02-27 15:09 ` Keith Busch
0 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2025-02-27 15:09 UTC (permalink / raw)
To: Lei Yang
Cc: Keith Busch, pbonzini, seanjc, kvm, x86, virtualization,
linux-kernel
On Thu, Feb 27, 2025 at 06:21:34PM +0800, Lei Yang wrote:
> Hi Keith
>
> There are some error messages in the qemu output when I tested this
> series of patches with the virtio-net regression test. It can be
> reproduced by booting up a guest with a vhost device after applying your
> patches.
> Error messages:
> Qemu output:
> qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true,
> "vhostfd": "16", "fd": "10"}: vhost_set_owner failed: Cannot allocate
> memory
> qemu-kvm: -netdev {"id": "idoejzv8", "type": "tap", "vhost": true,
> "vhostfd": "16", "fd": "10"}: vhost-net requested but could not be
> initialized
*facepalm*
I accidentally left the "!" in the condition:
@@ -666,7 +666,7 @@ static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev)
vtsk = vhost_task_create(vhost_run_work_list, vhost_worker_killed,
worker, name);
- if (!vtsk)
+ if (!IS_ERR(vtsk))
goto free_worker;