* [PATCH 0/2] powerpc: kvm: generic framework and run posix timers in task context
@ 2025-04-21 10:28 Shrikanth Hegde
2025-04-21 10:28 ` [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work Shrikanth Hegde
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-21 10:28 UTC (permalink / raw)
To: maddy, linuxppc-dev
Cc: npiggin, christophe.leroy, sshegde, gautam, vaibhav, bigeasy,
linux-kernel
From: Gautam Menghani <gautam@linux.ibm.com>
This is an effort to use the generic kvm infra which handles check for
need_resched, handling signals etc. i.e xfer_to_guest_mode_handle_work.
kvm guests boots and runs stress-ng CPU stressor on PowerVM and on PowerNV.
preempt=full and preempt=lazy was tested on PowerNV and in both cases the
KVM guest boots and runs stress-ng CPU stressor.
Please let me know if any specific testing to be done.
Kept the patches separate since they differ functionally, but kept them
is a series since 2nd patch depends on functionality of 1st. Also this
could help in git bisect.
This is based on tip/master
Shrikanth Hegde (2):
powerpc: kvm: use generic transfer to guest mode work
powerpc: enable to run posix cpu timers in task context
arch/powerpc/Kconfig | 2 ++
arch/powerpc/kvm/book3s_hv.c | 13 +++++++------
arch/powerpc/kvm/powerpc.c | 22 ++++++++--------------
3 files changed, 17 insertions(+), 20 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-21 10:28 [PATCH 0/2] powerpc: kvm: generic framework and run posix timers in task context Shrikanth Hegde
@ 2025-04-21 10:28 ` Shrikanth Hegde
2025-04-24 14:42 ` Sebastian Andrzej Siewior
2025-04-21 10:28 ` [PATCH 2/2] powerpc: enable to run posix cpu timers in task context Shrikanth Hegde
2025-04-21 12:34 ` [PATCH 0/2] powerpc: kvm: generic framework and run posix " Shrikanth Hegde
2 siblings, 1 reply; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-21 10:28 UTC (permalink / raw)
To: maddy, linuxppc-dev
Cc: npiggin, christophe.leroy, sshegde, gautam, vaibhav, bigeasy,
linux-kernel
There is generic entry framework to handle signals and check for
reschedule etc before entering the guest. Use that framework for
powerpc.
Advantages:
- Less code duplication.
- powerpc kvm now understands NR and NR_lazy bits.
- powerpc can enable HAVE_POSIX_CPU_TIMERS_TASK_WORK, currently the
powerpc/kvm code doesn't handle TIF_NOTIFY_RESUME.
Testing: No splats seen in below cases.
- Booted KVM guest on PowerVM and PowerNV systems.
- Ran stress-ng CPU stressors in each above case.
- On PowerNV host, tried preempt=lazy/full and run stress-ng CPU stressor
in the KVM guest.
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/kvm/book3s_hv.c | 13 +++++++------
arch/powerpc/kvm/powerpc.c | 22 ++++++++--------------
3 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 6722625a4..83807ae44 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -299,6 +299,7 @@ config PPC
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN && EXECMEM
+ select KVM_XFER_TO_GUEST_WORK
select LOCK_MM_AND_FIND_VMA
select MMU_GATHER_PAGE_SIZE
select MMU_GATHER_RCU_TABLE_FREE
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 19f4d298d..123539642 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -80,8 +80,8 @@
#include <asm/ultravisor.h>
#include <asm/dtl.h>
#include <asm/plpar_wrappers.h>
-
#include <trace/events/ipi.h>
+#include <linux/entry-kvm.h>
#include "book3s.h"
#include "book3s_hv.h"
@@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
}
if (need_resched())
- cond_resched();
+ schedule();
kvmppc_update_vpas(vcpu);
@@ -5097,10 +5097,11 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
return -EINVAL;
}
- /* No need to go into the guest when all we'll do is come back out */
- if (signal_pending(current)) {
- run->exit_reason = KVM_EXIT_INTR;
- return -EINTR;
+ /* use generic frameworks to handle signals, need_resched */
+ if (__xfer_to_guest_mode_work_pending()) {
+ r = xfer_to_guest_mode_handle_work(vcpu);
+ if (r)
+ return r;
}
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 153587741..4ff334532 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -34,6 +34,7 @@
#endif
#include <asm/ultravisor.h>
#include <asm/setup.h>
+#include <linux/entry-kvm.h>
#include "timing.h"
#include "../mm/mmu_decl.h"
@@ -80,24 +81,17 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
{
int r;
+ /* use generic framework to handle need resched and signals */
+ if (__xfer_to_guest_mode_work_pending()) {
+ r = xfer_to_guest_mode_handle_work(vcpu);
+ if (r)
+ return r;
+ }
+
WARN_ON(irqs_disabled());
hard_irq_disable();
while (true) {
- if (need_resched()) {
- local_irq_enable();
- cond_resched();
- hard_irq_disable();
- continue;
- }
-
- if (signal_pending(current)) {
- kvmppc_account_exit(vcpu, SIGNAL_EXITS);
- vcpu->run->exit_reason = KVM_EXIT_INTR;
- r = -EINTR;
- break;
- }
-
vcpu->mode = IN_GUEST_MODE;
/*
--
2.48.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] powerpc: enable to run posix cpu timers in task context
2025-04-21 10:28 [PATCH 0/2] powerpc: kvm: generic framework and run posix timers in task context Shrikanth Hegde
2025-04-21 10:28 ` [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work Shrikanth Hegde
@ 2025-04-21 10:28 ` Shrikanth Hegde
2025-04-21 12:34 ` [PATCH 0/2] powerpc: kvm: generic framework and run posix " Shrikanth Hegde
2 siblings, 0 replies; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-21 10:28 UTC (permalink / raw)
To: maddy, linuxppc-dev
Cc: npiggin, christophe.leroy, sshegde, gautam, vaibhav, bigeasy,
linux-kernel
Now that all kvm entry to guest paths handle the task work
using the generic framework, enable HAVE_POSIX_CPU_TIMERS_TASK_WORK
which allows running posix cpu timers in task context instead of running
them in hardirq. This would is a necessary step towards enabling
PREEMPT_RT on powerNV systems.
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
arch/powerpc/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 83807ae44..f42fa4181 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -277,6 +277,7 @@ config PPC
select HAVE_PERF_EVENTS_NMI if PPC64
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
+ select HAVE_POSIX_CPU_TIMERS_TASK_WORK
select HAVE_RETHOOK if KPROBES
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE
--
2.48.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/2] powerpc: kvm: generic framework and run posix timers in task context
2025-04-21 10:28 [PATCH 0/2] powerpc: kvm: generic framework and run posix timers in task context Shrikanth Hegde
2025-04-21 10:28 ` [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work Shrikanth Hegde
2025-04-21 10:28 ` [PATCH 2/2] powerpc: enable to run posix cpu timers in task context Shrikanth Hegde
@ 2025-04-21 12:34 ` Shrikanth Hegde
2 siblings, 0 replies; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-21 12:34 UTC (permalink / raw)
To: maddy, linuxppc-dev, gautam
Cc: npiggin, christophe.leroy, vaibhav, bigeasy, linux-kernel
On 4/21/25 15:58, Shrikanth Hegde wrote:
> From: Gautam Menghani <gautam@linux.ibm.com>
>
I made a mistake while generating the patch. Sorry about that. i will
fix it up in next version.
Please consider the above as:
From: Shrikanth Hegde <sshegde@linux.ibm.com>
> This is an effort to use the generic kvm infra which handles check for
> need_resched, handling signals etc. i.e xfer_to_guest_mode_handle_work.
>
> kvm guests boots and runs stress-ng CPU stressor on PowerVM and on PowerNV.
> preempt=full and preempt=lazy was tested on PowerNV and in both cases the
> KVM guest boots and runs stress-ng CPU stressor.
>
> Please let me know if any specific testing to be done.
>
> Kept the patches separate since they differ functionally, but kept them
> is a series since 2nd patch depends on functionality of 1st. Also this
> could help in git bisect.
>
> This is based on tip/master
>
> Shrikanth Hegde (2):
> powerpc: kvm: use generic transfer to guest mode work
> powerpc: enable to run posix cpu timers in task context
>
> arch/powerpc/Kconfig | 2 ++
> arch/powerpc/kvm/book3s_hv.c | 13 +++++++------
> arch/powerpc/kvm/powerpc.c | 22 ++++++++--------------
> 3 files changed, 17 insertions(+), 20 deletions(-)
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-21 10:28 ` [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work Shrikanth Hegde
@ 2025-04-24 14:42 ` Sebastian Andrzej Siewior
2025-04-24 15:57 ` Shrikanth Hegde
0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-24 14:42 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: maddy, linuxppc-dev, npiggin, christophe.leroy, gautam, vaibhav,
linux-kernel
On 2025-04-21 15:58:36 [+0530], Shrikanth Hegde wrote:
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 19f4d298d..123539642 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -80,8 +80,8 @@
> #include <asm/ultravisor.h>
> #include <asm/dtl.h>
> #include <asm/plpar_wrappers.h>
> -
> #include <trace/events/ipi.h>
> +#include <linux/entry-kvm.h>
>
> #include "book3s.h"
> #include "book3s_hv.h"
> @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> }
>
> if (need_resched())
> - cond_resched();
> + schedule();
This looks unrelated and odd. I don't why but this should be a
cond_resched() so it can be optimized away on PREEMPT kernels.
> kvmppc_update_vpas(vcpu);
>
> @@ -5097,10 +5097,11 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
> return -EINVAL;
> }
>
> - /* No need to go into the guest when all we'll do is come back out */
> - if (signal_pending(current)) {
> - run->exit_reason = KVM_EXIT_INTR;
> - return -EINTR;
> + /* use generic frameworks to handle signals, need_resched */
> + if (__xfer_to_guest_mode_work_pending()) {
> + r = xfer_to_guest_mode_handle_work(vcpu);
This could be unconditional.
> + if (r)
> + return r;
> }
>
> #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 153587741..4ff334532 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -34,6 +34,7 @@
> #endif
> #include <asm/ultravisor.h>
> #include <asm/setup.h>
> +#include <linux/entry-kvm.h>
>
> #include "timing.h"
> #include "../mm/mmu_decl.h"
> @@ -80,24 +81,17 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
> {
> int r;
>
> + /* use generic framework to handle need resched and signals */
> + if (__xfer_to_guest_mode_work_pending()) {
> + r = xfer_to_guest_mode_handle_work(vcpu);
there is nothing special you do checking and handling the work. Couldn't
you invoke xfer_to_guest_mode_handle_work() unconditionally?
> + if (r)
> + return r;
> + }
> +
> WARN_ON(irqs_disabled());
> hard_irq_disable();
>
> while (true) {
> - if (need_resched()) {
> - local_irq_enable();
> - cond_resched();
> - hard_irq_disable();
> - continue;
> - }
> -
> - if (signal_pending(current)) {
> - kvmppc_account_exit(vcpu, SIGNAL_EXITS);
> - vcpu->run->exit_reason = KVM_EXIT_INTR;
> - r = -EINTR;
> - break;
I don't how this works but couldn't SIGNAL_EXITS vanish now that it
isn't updated anymore? The stat itself moves in kvm_handle_signal_exit()
to a different counter so it is not lost. The reader just needs to look
somewhere else for it.
> - }
> -
> vcpu->mode = IN_GUEST_MODE;
>
> /*
Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-24 14:42 ` Sebastian Andrzej Siewior
@ 2025-04-24 15:57 ` Shrikanth Hegde
2025-04-24 18:38 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-24 15:57 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: maddy, linuxppc-dev, npiggin, christophe.leroy, gautam, vaibhav,
linux-kernel
On 4/24/25 20:12, Sebastian Andrzej Siewior wrote:
Thanks Sebastian for taking a look.
> On 2025-04-21 15:58:36 [+0530], Shrikanth Hegde wrote:
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 19f4d298d..123539642 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -80,8 +80,8 @@
>> #include <asm/ultravisor.h>
>> #include <asm/dtl.h>
>> #include <asm/plpar_wrappers.h>
>> -
>> #include <trace/events/ipi.h>
>> +#include <linux/entry-kvm.h>
>>
>> #include "book3s.h"
>> #include "book3s_hv.h"
>> @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>> }
>>
>> if (need_resched())
>> - cond_resched();
>> + schedule();
>
> This looks unrelated and odd. I don't why but this should be a
> cond_resched() so it can be optimized away on PREEMPT kernels.
This is needed, otherwise KVM on powerVM setup gets stuck on preempt=full/lazy.
>
>> kvmppc_update_vpas(vcpu);
>>
>> @@ -5097,10 +5097,11 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
>> return -EINVAL;
>> }
>>
>> - /* No need to go into the guest when all we'll do is come back out */
>> - if (signal_pending(current)) {
>> - run->exit_reason = KVM_EXIT_INTR;
>> - return -EINTR;
>> + /* use generic frameworks to handle signals, need_resched */
>> + if (__xfer_to_guest_mode_work_pending()) {
>> + r = xfer_to_guest_mode_handle_work(vcpu);
> This could be unconditional.
>
>> + if (r)
>> + return r;
>> }
>>
>> #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index 153587741..4ff334532 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -34,6 +34,7 @@
>> #endif
>> #include <asm/ultravisor.h>
>> #include <asm/setup.h>
>> +#include <linux/entry-kvm.h>
>>
>> #include "timing.h"
>> #include "../mm/mmu_decl.h"
>> @@ -80,24 +81,17 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
>> {
>> int r;
>>
>> + /* use generic framework to handle need resched and signals */
>> + if (__xfer_to_guest_mode_work_pending()) {
>> + r = xfer_to_guest_mode_handle_work(vcpu);
>
> there is nothing special you do checking and handling the work. Couldn't
> you invoke xfer_to_guest_mode_handle_work() unconditionally?
>
I followed what was in arch/x86/kvm/x86.c. Since xfer_to_guest_mode_handle_work does the same check
it makes sense to call it without checks too.
Will update in v2.
>> + if (r)
>> + return r;
>> + }
>> +
>> WARN_ON(irqs_disabled());
>> hard_irq_disable();
>>
>> while (true) {
>> - if (need_resched()) {
>> - local_irq_enable();
>> - cond_resched();
>> - hard_irq_disable();
>> - continue;
>> - }
>> -
>> - if (signal_pending(current)) {
>> - kvmppc_account_exit(vcpu, SIGNAL_EXITS);
>> - vcpu->run->exit_reason = KVM_EXIT_INTR;
>> - r = -EINTR;
>> - break;
>
> I don't how this works but couldn't SIGNAL_EXITS vanish now that it
> isn't updated anymore? The stat itself moves in kvm_handle_signal_exit()
> to a different counter so it is not lost. The reader just needs to look
> somewhere else for it.
ok. thanks for pointing out.
AFAIU it is updating the stats mostly. But below could keep the stats happy.
I will update that in v2.
if (__xfer_to_guest_mode_work_pending()) {
r = xfer_to_guest_mode_handle_work(vcpu);
+ /* generic framework doesn't update ppc specific stats*/
+ if (r == -EINTR)
+ kvmppc_account_exit(vcpu, SIGNAL_EXITS);
if (r)
return r;
>
>> - }
>> -
>> vcpu->mode = IN_GUEST_MODE;
>>
>> /*
>
> Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-24 15:57 ` Shrikanth Hegde
@ 2025-04-24 18:38 ` Sebastian Andrzej Siewior
2025-04-25 11:19 ` Shrikanth Hegde
0 siblings, 1 reply; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-24 18:38 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: maddy, linuxppc-dev, npiggin, christophe.leroy, gautam, vaibhav,
linux-kernel
On 2025-04-24 21:27:59 [+0530], Shrikanth Hegde wrote:
> > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > > index 19f4d298d..123539642 100644
> > > --- a/arch/powerpc/kvm/book3s_hv.c
> > > +++ b/arch/powerpc/kvm/book3s_hv.c
> > > @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> > > }
> > > if (need_resched())
> > > - cond_resched();
> > > + schedule();
> >
>
>
> > This looks unrelated and odd. I don't why but this should be a
> > cond_resched() so it can be optimized away on PREEMPT kernels.
>
> This is needed, otherwise KVM on powerVM setup gets stuck on preempt=full/lazy.
But this makes no sense. On preempt=full the cond_resched() gets patched
out while schedule() doesn't. Okay, this explains the stuck.
On preempt=full need_resched() should not return true because the
preemption happens right away. Unless you are in a preempt-disabled
or interrupt disabled section. But any of the conditions can't be true
because in both cases you can't invoke schedule(). So you must have had
a wake up on the local CPU which sets need-resched but the schedule()
was delayed for some reason. Once that need-resched bit is observed by
a remote CPU then it won't send an interrupt for a scheduling request
because it should happen any time soon… This should be fixed.
If you replace the above with preempt_disable(); preempt_enable() then it
should also work…
…
> > > --- a/arch/powerpc/kvm/powerpc.c
> > > +++ b/arch/powerpc/kvm/powerpc.c
> > > @@ -34,6 +34,7 @@
> > > #endif
> > > #include <asm/ultravisor.h>
> > > #include <asm/setup.h>
> > > +#include <linux/entry-kvm.h>
> > > #include "timing.h"
> > > #include "../mm/mmu_decl.h"
> > > @@ -80,24 +81,17 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
> > > {
> > > int r;
> > > + /* use generic framework to handle need resched and signals */
> > > + if (__xfer_to_guest_mode_work_pending()) {
> > > + r = xfer_to_guest_mode_handle_work(vcpu);
> >
> > there is nothing special you do checking and handling the work. Couldn't
> > you invoke xfer_to_guest_mode_handle_work() unconditionally?
> >
>
> I followed what was in arch/x86/kvm/x86.c. Since xfer_to_guest_mode_handle_work does the same check
> it makes sense to call it without checks too.
Yeah but I guess x86 did some other updates, too.
> Will update in v2.
>
…
> > > -
> > > - if (signal_pending(current)) {
> > > - kvmppc_account_exit(vcpu, SIGNAL_EXITS);
> > > - vcpu->run->exit_reason = KVM_EXIT_INTR;
> > > - r = -EINTR;
> > > - break;
> >
> > I don't how this works but couldn't SIGNAL_EXITS vanish now that it
> > isn't updated anymore? The stat itself moves in kvm_handle_signal_exit()
> > to a different counter so it is not lost. The reader just needs to look
> > somewhere else for it.
>
> ok. thanks for pointing out.
>
> AFAIU it is updating the stats mostly. But below could keep the stats happy.
> I will update that in v2.
>
> if (__xfer_to_guest_mode_work_pending()) {
> r = xfer_to_guest_mode_handle_work(vcpu);
> + /* generic framework doesn't update ppc specific stats*/
> + if (r == -EINTR)
> + kvmppc_account_exit(vcpu, SIGNAL_EXITS);
> if (r)
> return r;
Either that or you rip it out entirely but that is not my call.
Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-24 18:38 ` Sebastian Andrzej Siewior
@ 2025-04-25 11:19 ` Shrikanth Hegde
2025-04-25 13:31 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-25 11:19 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, npiggin, vaibhav
Cc: maddy, linuxppc-dev, christophe.leroy, gautam, linux-kernel
On 4/25/25 00:08, Sebastian Andrzej Siewior wrote:
> On 2025-04-24 21:27:59 [+0530], Shrikanth Hegde wrote:
>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>>> index 19f4d298d..123539642 100644
>>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>>> @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>>> }
>>>> if (need_resched())
>>>> - cond_resched();
>>>> + schedule();
>>>
>>
>>
>>> This looks unrelated and odd. I don't why but this should be a
>>> cond_resched() so it can be optimized away on PREEMPT kernels.
>>
>> This is needed, otherwise KVM on powerVM setup gets stuck on preempt=full/lazy.
>
> But this makes no sense. On preempt=full the cond_resched() gets patched
> out while schedule() doesn't. Okay, this explains the stuck.
cond_resched works. What you said is right about schedule and preemption models.
Initially I had some other code changes and they were causing it get stuck. i retested it.
But looking at the semantics of usage of xfer_to_guest_mode_work
I think using schedule is probably right over here.
Correct me if i got it all wrong.
on x86:
kvm_arch_vcpu_ioctl_run
vcpu_run
for () {
.. run guest..
xfer_to_guest_mode_handle_work
schedule
}
on Powerpc: ( taking book3s_hv flavour):
kvm_arch_vcpu_ioctl_run
kvmppc_vcpu_run_hv *1
do while() {
kvmhv_run_single_vcpu or kvmppc_run_vcpu
-- checking for need_resched and signals and bails out *2
}
*1 - checks for need resched and signals before entering guest
*2 - checks for need resched and signals while running the guest
This patch is addressing only *1 but it needs to address *2 as well using generic framework.
I think it is doable for books3s_hv atleast. (though might need rewrite)
__kvmppc_vcpu_run is a block box to me yet. I think it first makes sense
to move it C and then try use the xfer_to_guest_mode_handle_work.
nick, vaibhav, any idea on __kvmppc_vcpu_run on how is it handling signal pending, and need_resched.
So this is going to need more work specially on *2 and doing that is also key for preempt=lazy/full to work
for kvm on powepc. will try to figure out.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-25 11:19 ` Shrikanth Hegde
@ 2025-04-25 13:31 ` Sebastian Andrzej Siewior
2025-04-25 14:24 ` Shrikanth Hegde
2025-04-29 6:19 ` Shrikanth Hegde
0 siblings, 2 replies; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-25 13:31 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: npiggin, vaibhav, maddy, linuxppc-dev, christophe.leroy, gautam,
linux-kernel
On 2025-04-25 16:49:19 [+0530], Shrikanth Hegde wrote:
> On 4/25/25 00:08, Sebastian Andrzej Siewior wrote:
> > On 2025-04-24 21:27:59 [+0530], Shrikanth Hegde wrote:
> > > > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > > > > index 19f4d298d..123539642 100644
> > > > > --- a/arch/powerpc/kvm/book3s_hv.c
> > > > > +++ b/arch/powerpc/kvm/book3s_hv.c
> > > > > @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> > > > > }
> > > > > if (need_resched())
> > > > > - cond_resched();
> > > > > + schedule();
> > > >
> > >
> > >
> > > > This looks unrelated and odd. I don't why but this should be a
> > > > cond_resched() so it can be optimized away on PREEMPT kernels.
> > >
> > > This is needed, otherwise KVM on powerVM setup gets stuck on preempt=full/lazy.
> >
> > But this makes no sense. On preempt=full the cond_resched() gets patched
> > out while schedule() doesn't. Okay, this explains the stuck.
>
> cond_resched works. What you said is right about schedule and preemption models.
> Initially I had some other code changes and they were causing it get stuck. i retested it.
so it is unrelated then ;)
> But looking at the semantics of usage of xfer_to_guest_mode_work
> I think using schedule is probably right over here.
> Correct me if i got it all wrong.
No, if you do xfer_to_guest_mode_work() then it will invoke schedule()
when appropriate. It just the thing in kvmhv_run_single_vcpu() looks odd
and might have been duct tape or an accident and could probably be
removed.
> on x86:
> kvm_arch_vcpu_ioctl_run
> vcpu_run
> for () {
> .. run guest..
> xfer_to_guest_mode_handle_work
> schedule
> }
>
>
> on Powerpc: ( taking book3s_hv flavour):
> kvm_arch_vcpu_ioctl_run
> kvmppc_vcpu_run_hv *1
> do while() {
> kvmhv_run_single_vcpu or kvmppc_run_vcpu
> -- checking for need_resched and signals and bails out *2
> }
>
>
> *1 - checks for need resched and signals before entering guest
I don't see the need_resched() check here.
> *2 - checks for need resched and signals while running the guest
>
>
> This patch is addressing only *1 but it needs to address *2 as well using generic framework.
> I think it is doable for books3s_hv atleast. (though might need rewrite)
>
> __kvmppc_vcpu_run is a block box to me yet. I think it first makes sense
> to move it C and then try use the xfer_to_guest_mode_handle_work.
> nick, vaibhav, any idea on __kvmppc_vcpu_run on how is it handling signal pending, and need_resched.
>
>
> So this is going to need more work specially on *2 and doing that is also key for preempt=lazy/full to work
> for kvm on powepc. will try to figure out.
Okay.
Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-25 13:31 ` Sebastian Andrzej Siewior
@ 2025-04-25 14:24 ` Shrikanth Hegde
2025-04-25 16:23 ` Sebastian Andrzej Siewior
2025-04-29 6:19 ` Shrikanth Hegde
1 sibling, 1 reply; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-25 14:24 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: npiggin, vaibhav, maddy, linuxppc-dev, christophe.leroy, gautam,
linux-kernel
On 4/25/25 19:01, Sebastian Andrzej Siewior wrote:
> On 2025-04-25 16:49:19 [+0530], Shrikanth Hegde wrote:
>> On 4/25/25 00:08, Sebastian Andrzej Siewior wrote:
>>> On 2025-04-24 21:27:59 [+0530], Shrikanth Hegde wrote:
>>>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>>>>> index 19f4d298d..123539642 100644
>>>>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>>>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>>>>> @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>>>>> }
>>>>>> if (need_resched())
>>>>>> - cond_resched();
>>>>>> + schedule();
>>>>>
>>>>
>>>>
>>>>> This looks unrelated and odd. I don't why but this should be a
>>>>> cond_resched() so it can be optimized away on PREEMPT kernels.
>>>>
>>>> This is needed, otherwise KVM on powerVM setup gets stuck on preempt=full/lazy.
>>>
>>> But this makes no sense. On preempt=full the cond_resched() gets patched
>>> out while schedule() doesn't. Okay, this explains the stuck.
>>
>> cond_resched works. What you said is right about schedule and preemption models.
>> Initially I had some other code changes and they were causing it get stuck. i retested it.
>
> so it is unrelated then ;)
>
>> But looking at the semantics of usage of xfer_to_guest_mode_work
>> I think using schedule is probably right over here.
>> Correct me if i got it all wrong.
>
> No, if you do xfer_to_guest_mode_work() then it will invoke schedule()
> when appropriate. It just the thing in kvmhv_run_single_vcpu() looks odd
> and might have been duct tape or an accident and could probably be
> removed.
>
I was wondering xfer_to_guest_mode_work could also call cond_resched
instead of schedule since for preempt=full/lazy is preemptible
as early as possible right?
>> on x86:
>> kvm_arch_vcpu_ioctl_run
>> vcpu_run
>> for () {
>> .. run guest..
>> xfer_to_guest_mode_handle_work
>> schedule
>> }
>>
>>
>> on Powerpc: ( taking book3s_hv flavour):
>> kvm_arch_vcpu_ioctl_run
>> kvmppc_vcpu_run_hv *1
>> do while() {
>> kvmhv_run_single_vcpu or kvmppc_run_vcpu
>> -- checking for need_resched and signals and bails out *2
>> }
>>
>>
>> *1 - checks for need resched and signals before entering guest
> I don't see the need_resched() check here.
>
right, i think *2 is critical since it is in a loop.
*1 is probably an optimization to skip a few cycles.
>> *2 - checks for need resched and signals while running the guest
>>
>>
>> This patch is addressing only *1 but it needs to address *2 as well using generic framework.
>> I think it is doable for books3s_hv atleast. (though might need rewrite)
>>
>> __kvmppc_vcpu_run is a block box to me yet. I think it first makes sense
>> to move it C and then try use the xfer_to_guest_mode_handle_work.
>> nick, vaibhav, any idea on __kvmppc_vcpu_run on how is it handling signal pending, and need_resched.
>>
>>
>> So this is going to need more work specially on *2 and doing that is also key for preempt=lazy/full to work
>> for kvm on powepc. will try to figure out.
>
> Okay.
>
> Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-25 14:24 ` Shrikanth Hegde
@ 2025-04-25 16:23 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 12+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-25 16:23 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: npiggin, vaibhav, maddy, linuxppc-dev, christophe.leroy, gautam,
linux-kernel
On 2025-04-25 19:54:56 [+0530], Shrikanth Hegde wrote:
> > > But looking at the semantics of usage of xfer_to_guest_mode_work
> > > I think using schedule is probably right over here.
> > > Correct me if i got it all wrong.
> >
> > No, if you do xfer_to_guest_mode_work() then it will invoke schedule()
> > when appropriate. It just the thing in kvmhv_run_single_vcpu() looks odd
> > and might have been duct tape or an accident and could probably be
> > removed.
> >
>
> I was wondering xfer_to_guest_mode_work could also call cond_resched
> instead of schedule since for preempt=full/lazy is preemptible
> as early as possible right?
No, I think it is okay. For preempt=full you shouldn't observe the
flag in xfer_to_guest_mode_work() so it does not matter.
Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work
2025-04-25 13:31 ` Sebastian Andrzej Siewior
2025-04-25 14:24 ` Shrikanth Hegde
@ 2025-04-29 6:19 ` Shrikanth Hegde
1 sibling, 0 replies; 12+ messages in thread
From: Shrikanth Hegde @ 2025-04-29 6:19 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: npiggin, vaibhav, maddy, linuxppc-dev, christophe.leroy, gautam,
linux-kernel
On 4/25/25 19:01, Sebastian Andrzej Siewior wrote:
> On 2025-04-25 16:49:19 [+0530], Shrikanth Hegde wrote:
>> On 4/25/25 00:08, Sebastian Andrzej Siewior wrote:
>>> On 2025-04-24 21:27:59 [+0530], Shrikanth Hegde wrote:
>>>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>>>>> index 19f4d298d..123539642 100644
>>>>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>>>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>>>>> @@ -4901,7 +4901,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>>>>> }
>>>>>> if (need_resched())
>>>>>> - cond_resched();
>>>>>> + schedule();
>>>>>
>>>>
>>>>
>>>>> This looks unrelated and odd. I don't why but this should be a
>>>>> cond_resched() so it can be optimized away on PREEMPT kernels.
>>>>
>>>> This is needed, otherwise KVM on powerVM setup gets stuck on preempt=full/lazy.
>>>
When i look back, i think below makes sense, since this is what even the
generic framework will do and it would work for all preemption models.
Since this happens in a loop, it is necessary to check for reschedule.
if need_resched()
schedule()
Ideal would be call xfer_to_guest_mode_handle_work in the loop, which
would handle the above too.
>>> But this makes no sense. On preempt=full the cond_resched() gets patched
>>> out while schedule() doesn't. Okay, this explains the stuck.
>>
>> cond_resched works. What you said is right about schedule and preemption models.
>> Initially I had some other code changes and they were causing it get stuck. i retested it.
>
> so it is unrelated then ;)
>
>> But looking at the semantics of usage of xfer_to_guest_mode_work
>> I think using schedule is probably right over here.
>> Correct me if i got it all wrong.
>
> No, if you do xfer_to_guest_mode_work() then it will invoke schedule()
> when appropriate. It just the thing in kvmhv_run_single_vcpu() looks odd
> and might have been duct tape or an accident and could probably be
> removed.
>
>> on x86:
>> kvm_arch_vcpu_ioctl_run
>> vcpu_run
>> for () {
>> .. run guest..
>> xfer_to_guest_mode_handle_work
>> schedule
>> }
>>
>>
>> on Powerpc: ( taking book3s_hv flavour):
>> kvm_arch_vcpu_ioctl_run
>> kvmppc_vcpu_run_hv *1
>> do while() {
>> kvmhv_run_single_vcpu or kvmppc_run_vcpu
>> -- checking for need_resched and signals and bails out *2
>> }
>>
>>
>> *1 - checks for need resched and signals before entering guest
> I don't see the need_resched() check here.
>
>> *2 - checks for need resched and signals while running the guest
>>
>>
>> This patch is addressing only *1 but it needs to address *2 as well using generic framework.
>> I think it is doable for books3s_hv atleast. (though might need rewrite)
>>
>> __kvmppc_vcpu_run is a block box to me yet. I think it first makes sense
>> to move it C and then try use the xfer_to_guest_mode_handle_work.
>> nick, vaibhav, any idea on __kvmppc_vcpu_run on how is it handling signal pending, and need_resched.
>>
>>
>> So this is going to need more work specially on *2 and doing that is also key for preempt=lazy/full to work
>> for kvm on powepc. will try to figure out.
>
> Okay.
>
> Sebastian
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-04-29 6:19 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-21 10:28 [PATCH 0/2] powerpc: kvm: generic framework and run posix timers in task context Shrikanth Hegde
2025-04-21 10:28 ` [PATCH 1/2] powerpc: kvm: use generic transfer to guest mode work Shrikanth Hegde
2025-04-24 14:42 ` Sebastian Andrzej Siewior
2025-04-24 15:57 ` Shrikanth Hegde
2025-04-24 18:38 ` Sebastian Andrzej Siewior
2025-04-25 11:19 ` Shrikanth Hegde
2025-04-25 13:31 ` Sebastian Andrzej Siewior
2025-04-25 14:24 ` Shrikanth Hegde
2025-04-25 16:23 ` Sebastian Andrzej Siewior
2025-04-29 6:19 ` Shrikanth Hegde
2025-04-21 10:28 ` [PATCH 2/2] powerpc: enable to run posix cpu timers in task context Shrikanth Hegde
2025-04-21 12:34 ` [PATCH 0/2] powerpc: kvm: generic framework and run posix " Shrikanth Hegde
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).