public inbox for kvm@vger.kernel.org
* [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
@ 2024-01-03  7:53 Prasad Pandit
  2024-01-08 10:45 ` Prasad Pandit
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Prasad Pandit @ 2024-01-03  7:53 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, linux-kernel, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
request for a vcpu even when its 'events->nmi.pending' is zero.
Ex:
    qemu_thread_start
     kvm_vcpu_thread_fn
      qemu_wait_io_event
       qemu_wait_io_event_common
        process_queued_cpu_work
         do_kvm_cpu_synchronize_post_init/_reset
          kvm_arch_put_registers
           kvm_put_vcpu_events (cpu, level=[2|3])

This leads vCPU threads in QEMU to constantly acquire & release the
global mutex lock, delaying the guest boot due to lock contention.
Add check to make KVM_REQ_NMI request only if vcpu has NMI pending.

Fixes: bdedff263132 ("KVM: x86: Route pending NMIs from userspace through process_nmi()")
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 arch/x86/kvm/x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1a3aaa7dafae..468870450b8b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5405,7 +5405,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 	if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING) {
 		vcpu->arch.nmi_pending = 0;
 		atomic_set(&vcpu->arch.nmi_queued, events->nmi.pending);
-		kvm_make_request(KVM_REQ_NMI, vcpu);
+		if (events->nmi.pending)
+			kvm_make_request(KVM_REQ_NMI, vcpu);
 	}
 	static_call(kvm_x86_set_nmi_mask)(vcpu, events->nmi.masked);

--
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-01-03  7:53 [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu Prasad Pandit
@ 2024-01-08 10:45 ` Prasad Pandit
  2024-01-08 17:38   ` Sean Christopherson
  2024-02-03  0:11 ` Sean Christopherson
  2024-02-06  8:10 ` Dongli Zhang
  2 siblings, 1 reply; 8+ messages in thread
From: Prasad Pandit @ 2024-01-08 10:45 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, linux-kernel, Prasad Pandit

On Wed, 3 Jan 2024 at 13:24, Prasad Pandit <ppandit@redhat.com> wrote:
> kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
> request for a vcpu even when its 'events->nmi.pending' is zero.
> Ex:
>     qemu_thread_start
>      kvm_vcpu_thread_fn
>       qemu_wait_io_event
>        qemu_wait_io_event_common
>         process_queued_cpu_work
>          do_kvm_cpu_synchronize_post_init/_reset
>           kvm_arch_put_registers
>            kvm_put_vcpu_events (cpu, level=[2|3])
>
> This leads vCPU threads in QEMU to constantly acquire & release the
> global mutex lock, delaying the guest boot due to lock contention.
> Add check to make KVM_REQ_NMI request only if vcpu has NMI pending.
>
> Fixes: bdedff263132 ("KVM: x86: Route pending NMIs from userspace through process_nmi()")
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  arch/x86/kvm/x86.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1a3aaa7dafae..468870450b8b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5405,7 +5405,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
>         if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING) {
>                 vcpu->arch.nmi_pending = 0;
>                 atomic_set(&vcpu->arch.nmi_queued, events->nmi.pending);
> -               kvm_make_request(KVM_REQ_NMI, vcpu);
> +               if (events->nmi.pending)
> +                       kvm_make_request(KVM_REQ_NMI, vcpu);
>         }
>         static_call(kvm_x86_set_nmi_mask)(vcpu, events->nmi.masked);
> --
> 2.43.0

Ping...!
---
  - Prasad



* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-01-08 10:45 ` Prasad Pandit
@ 2024-01-08 17:38   ` Sean Christopherson
  2024-01-09  6:26     ` Prasad Pandit
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2024-01-08 17:38 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: kvm, linux-kernel, Prasad Pandit

On Mon, Jan 08, 2024, Prasad Pandit wrote:
> On Wed, 3 Jan 2024 at 13:24, Prasad Pandit <ppandit@redhat.com> wrote:
> > kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
> > request for a vcpu even when its 'events->nmi.pending' is zero.
> > Ex:
> >     qemu_thread_start
> >      kvm_vcpu_thread_fn
> >       qemu_wait_io_event
> >        qemu_wait_io_event_common
> >         process_queued_cpu_work
> >          do_kvm_cpu_synchronize_post_init/_reset
> >           kvm_arch_put_registers
> >            kvm_put_vcpu_events (cpu, level=[2|3])
> >
> > This leads vCPU threads in QEMU to constantly acquire & release the
> > global mutex lock, delaying the guest boot due to lock contention.
> > Add check to make KVM_REQ_NMI request only if vcpu has NMI pending.
> >
> > Fixes: bdedff263132 ("KVM: x86: Route pending NMIs from userspace through process_nmi()")
> > Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> > ---
> >  arch/x86/kvm/x86.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 1a3aaa7dafae..468870450b8b 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5405,7 +5405,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
> >         if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING) {
> >                 vcpu->arch.nmi_pending = 0;
> >                 atomic_set(&vcpu->arch.nmi_queued, events->nmi.pending);
> > -               kvm_make_request(KVM_REQ_NMI, vcpu);
> > +               if (events->nmi.pending)
> > +                       kvm_make_request(KVM_REQ_NMI, vcpu);
> >         }
> >         static_call(kvm_x86_set_nmi_mask)(vcpu, events->nmi.masked);
> > --
> > 2.43.0
> 
> Ping...!

This is on my list of things to grab for 6.8, I'm just waiting for various pull
requests to fully land in order to simplify my branch management.


* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-01-08 17:38   ` Sean Christopherson
@ 2024-01-09  6:26     ` Prasad Pandit
  0 siblings, 0 replies; 8+ messages in thread
From: Prasad Pandit @ 2024-01-09  6:26 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, linux-kernel

On Mon, 8 Jan 2024 at 23:08, Sean Christopherson <seanjc@google.com> wrote:
> This is on my list of things to grab for 6.8, I'm just waiting for various pull
> requests to fully land in order to simplify my branch management.

* Okay, cool.

Thank you.
---
  - Prasad



* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-01-03  7:53 [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu Prasad Pandit
  2024-01-08 10:45 ` Prasad Pandit
@ 2024-02-03  0:11 ` Sean Christopherson
  2024-02-06  8:10 ` Dongli Zhang
  2 siblings, 0 replies; 8+ messages in thread
From: Sean Christopherson @ 2024-02-03  0:11 UTC (permalink / raw)
  To: Sean Christopherson, Prasad Pandit; +Cc: kvm, linux-kernel, Prasad Pandit

On Wed, 03 Jan 2024 13:23:43 +0530, Prasad Pandit wrote:
> kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
> request for a vcpu even when its 'events->nmi.pending' is zero.
> Ex:
>     qemu_thread_start
>      kvm_vcpu_thread_fn
>       qemu_wait_io_event
>        qemu_wait_io_event_common
>         process_queued_cpu_work
>          do_kvm_cpu_synchronize_post_init/_reset
>           kvm_arch_put_registers
>            kvm_put_vcpu_events (cpu, level=[2|3])
> 
> [...]

Applied to kvm-x86 fixes, thanks!

[1/1] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
      https://github.com/kvm-x86/linux/commit/6231c9e1a9f3

--
https://github.com/kvm-x86/linux/tree/next


* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-01-03  7:53 [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu Prasad Pandit
  2024-01-08 10:45 ` Prasad Pandit
  2024-02-03  0:11 ` Sean Christopherson
@ 2024-02-06  8:10 ` Dongli Zhang
  2024-02-06 20:58   ` Sean Christopherson
  2 siblings, 1 reply; 8+ messages in thread
From: Dongli Zhang @ 2024-02-06  8:10 UTC (permalink / raw)
  To: Prasad Pandit, Prasad Pandit; +Cc: kvm, linux-kernel, Sean Christopherson

Hi Prasad,

On 1/2/24 23:53, Prasad Pandit wrote:
> From: Prasad Pandit <pjp@fedoraproject.org>
> 
> kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
> request for a vcpu even when its 'events->nmi.pending' is zero.
> Ex:
>     qemu_thread_start
>      kvm_vcpu_thread_fn
>       qemu_wait_io_event
>        qemu_wait_io_event_common
>         process_queued_cpu_work
>          do_kvm_cpu_synchronize_post_init/_reset
>           kvm_arch_put_registers
>            kvm_put_vcpu_events (cpu, level=[2|3])
> 
> This leads vCPU threads in QEMU to constantly acquire & release the
> global mutex lock, delaying the guest boot due to lock contention.

Would you mind sharing where and how the lock contention occurs in QEMU
userspace? That is, how is the QEMU mutex lock impacted by KVM's
KVM_REQ_NMI?

Or did you mean line 3031 on the QEMU side?

2858 int kvm_cpu_exec(CPUState *cpu)
2859 {
2860     struct kvm_run *run = cpu->kvm_run;
2861     int ret, run_ret;
... ...
3023         default:
3024             DPRINTF("kvm_arch_handle_exit\n");
3025             ret = kvm_arch_handle_exit(cpu, run);
3026             break;
3027         }
3028     } while (ret == 0);
3029
3030     cpu_exec_end(cpu);
3031     qemu_mutex_lock_iothread();
3032
3033     if (ret < 0) {
3034         cpu_dump_state(cpu, stderr, CPU_DUMP_CODE);
3035         vm_stop(RUN_STATE_INTERNAL_ERROR);
3036     }
3037
3038     qatomic_set(&cpu->exit_request, 0);
3039     return ret;
3040 }

Thank you very much!

Dongli Zhang

> Add check to make KVM_REQ_NMI request only if vcpu has NMI pending.
> 
> Fixes: bdedff263132 ("KVM: x86: Route pending NMIs from userspace through process_nmi()")
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  arch/x86/kvm/x86.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1a3aaa7dafae..468870450b8b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5405,7 +5405,8 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
>  	if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING) {
>  		vcpu->arch.nmi_pending = 0;
>  		atomic_set(&vcpu->arch.nmi_queued, events->nmi.pending);
> -		kvm_make_request(KVM_REQ_NMI, vcpu);
> +		if (events->nmi.pending)
> +			kvm_make_request(KVM_REQ_NMI, vcpu);
>  	}
>  	static_call(kvm_x86_set_nmi_mask)(vcpu, events->nmi.masked);
> 
> --
> 2.43.0
> 
> 


* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-02-06  8:10 ` Dongli Zhang
@ 2024-02-06 20:58   ` Sean Christopherson
  2024-02-09  0:34     ` Dongli Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2024-02-06 20:58 UTC (permalink / raw)
  To: Dongli Zhang; +Cc: Prasad Pandit, Prasad Pandit, kvm, linux-kernel

On Tue, Feb 06, 2024, Dongli Zhang wrote:
> Hi Prasad,
> 
> On 1/2/24 23:53, Prasad Pandit wrote:
> > From: Prasad Pandit <pjp@fedoraproject.org>
> > 
> > kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
> > request for a vcpu even when its 'events->nmi.pending' is zero.
> > Ex:
> >     qemu_thread_start
> >      kvm_vcpu_thread_fn
> >       qemu_wait_io_event
> >        qemu_wait_io_event_common
> >         process_queued_cpu_work
> >          do_kvm_cpu_synchronize_post_init/_reset
> >           kvm_arch_put_registers
> >            kvm_put_vcpu_events (cpu, level=[2|3])
> > 
> > This leads vCPU threads in QEMU to constantly acquire & release the
> > global mutex lock, delaying the guest boot due to lock contention.
> 
> Would you mind sharing where and how the lock contention occurs in QEMU
> userspace? That is, how is the QEMU mutex lock impacted by KVM's
> KVM_REQ_NMI?
> 
> Or did you mean line 3031 on the QEMU side?

Yeah, something like that.  Details in this thread.

https://lore.kernel.org/all/CAE8KmOyffXD4k69vRAFwesaqrBGzFY3i+kefbkHcQf4=jNYzOA@mail.gmail.com

> 2858 int kvm_cpu_exec(CPUState *cpu)
> 2859 {
> 2860     struct kvm_run *run = cpu->kvm_run;
> 2861     int ret, run_ret;
> ... ...
> 3023         default:
> 3024             DPRINTF("kvm_arch_handle_exit\n");
> 3025             ret = kvm_arch_handle_exit(cpu, run);
> 3026             break;
> 3027         }
> 3028     } while (ret == 0);
> 3029
> 3030     cpu_exec_end(cpu);
> 3031     qemu_mutex_lock_iothread();
> 3032
> 3033     if (ret < 0) {
> 3034         cpu_dump_state(cpu, stderr, CPU_DUMP_CODE);
> 3035         vm_stop(RUN_STATE_INTERNAL_ERROR);
> 3036     }
> 3037
> 3038     qatomic_set(&cpu->exit_request, 0);
> 3039     return ret;
> 3040 }


* Re: [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu
  2024-02-06 20:58   ` Sean Christopherson
@ 2024-02-09  0:34     ` Dongli Zhang
  0 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2024-02-09  0:34 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Prasad Pandit, Prasad Pandit, kvm, linux-kernel



On 2/6/24 12:58, Sean Christopherson wrote:
> On Tue, Feb 06, 2024, Dongli Zhang wrote:
>> Hi Prasad,
>>
>> On 1/2/24 23:53, Prasad Pandit wrote:
>>> From: Prasad Pandit <pjp@fedoraproject.org>
>>>
>>> kvm_vcpu_ioctl_x86_set_vcpu_events() routine makes 'KVM_REQ_NMI'
>>> request for a vcpu even when its 'events->nmi.pending' is zero.
>>> Ex:
>>>     qemu_thread_start
>>>      kvm_vcpu_thread_fn
>>>       qemu_wait_io_event
>>>        qemu_wait_io_event_common
>>>         process_queued_cpu_work
>>>          do_kvm_cpu_synchronize_post_init/_reset
>>>           kvm_arch_put_registers
>>>            kvm_put_vcpu_events (cpu, level=[2|3])
>>>
>>> This leads vCPU threads in QEMU to constantly acquire & release the
>>> global mutex lock, delaying the guest boot due to lock contention.
>>
>> Would you mind sharing where and how the lock contention occurs in QEMU
>> userspace? That is, how is the QEMU mutex lock impacted by KVM's
>> KVM_REQ_NMI?
>>
>> Or did you mean line 3031 on the QEMU side?
> 
> Yeah, something like that.  Details in this thread.
> 
> https://lore.kernel.org/all/CAE8KmOyffXD4k69vRAFwesaqrBGzFY3i+kefbkHcQf4=jNYzOA@mail.gmail.com

Thank you very much for pointing to the discussion. I should have found it :)

Here is my understanding.

1. During VM creation, the mp_state of an AP (non-BSP vCPU) is always
KVM_MP_STATE_UNINITIALIZED, until INIT/SIPI.

2. Ideally, the AP should block below; that is, the condition at line 3775
is always false.

3760 bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
3761 {
3762         struct rcuwait *wait = kvm_arch_vcpu_get_wait(vcpu);
3763         bool waited = false;
3764
3765         vcpu->stat.generic.blocking = 1;
3766
3767         preempt_disable();
3768         kvm_arch_vcpu_blocking(vcpu);
3769         prepare_to_rcuwait(wait);
3770         preempt_enable();
3771
3772         for (;;) {
3773                 set_current_state(TASK_INTERRUPTIBLE);
3774
3775                 if (kvm_vcpu_check_block(vcpu) < 0)
3776                         break;
3777
3778                 waited = true;
3779                 schedule();
3780         }

3. Unfortunately, due to this issue, KVM_REQ_NMI may be set for the AP.

4. This causes kvm_vcpu_check_block() to return.

kvm_arch_vcpu_ioctl_run()
-> kvm_vcpu_block()
   -> kvm_vcpu_check_block()
      -> kvm_arch_vcpu_runnable()
         -> kvm_vcpu_has_events()
            -> kvm_test_request(KVM_REQ_NMI, vcpu)


5. kvm_arch_vcpu_ioctl_run() returns to QEMU with -EAGAIN.

6. The QEMU side cannot do anything with -EAGAIN except go to line 2984
and return.

It acquires the global mutex at line 2976 (and releases it before entering
the guest again). KVM_REQ_NMI is never cleared until INIT/SIPI.


2808 int kvm_cpu_exec(CPUState *cpu)
2809 {
... ...
2868         if (run_ret < 0) {
2869             if (run_ret == -EINTR || run_ret == -EAGAIN) {
2870                 trace_kvm_io_window_exit();
2871                 kvm_eat_signals(cpu);
2872                 ret = EXCP_INTERRUPT;
2873                 break;
2874             }
... ...
2973     } while (ret == 0);
2974
2975     cpu_exec_end(cpu);
2976     bql_lock();
2977
2978     if (ret < 0) {
2979         cpu_dump_state(cpu, stderr, CPU_DUMP_CODE);
2980         vm_stop(RUN_STATE_INTERNAL_ERROR);
2981     }
2982
2983     qatomic_set(&cpu->exit_request, 0);
2984     return ret;
2985 }


7. The QEMU AP vCPU thread enters KVM_RUN again, and the same flow repeats
from step 4, again and again.

The lock is thus frequently acquired and released. vCPU 0 is unhappy with
this, especially when the number of APs is large!

I guess it is not an issue after a VM reboot (without re-creating the QEMU
instance), because the mp_state is no longer KVM_MP_STATE_UNINITIALIZED.


Thank you very much!

Dongli Zhang

> 
>> 2858 int kvm_cpu_exec(CPUState *cpu)
>> 2859 {
>> 2860     struct kvm_run *run = cpu->kvm_run;
>> 2861     int ret, run_ret;
>> ... ...
>> 3023         default:
>> 3024             DPRINTF("kvm_arch_handle_exit\n");
>> 3025             ret = kvm_arch_handle_exit(cpu, run);
>> 3026             break;
>> 3027         }
>> 3028     } while (ret == 0);
>> 3029
>> 3030     cpu_exec_end(cpu);
>> 3031     qemu_mutex_lock_iothread();
>> 3032
>> 3033     if (ret < 0) {
>> 3034         cpu_dump_state(cpu, stderr, CPU_DUMP_CODE);
>> 3035         vm_stop(RUN_STATE_INTERNAL_ERROR);
>> 3036     }
>> 3037
>> 3038     qatomic_set(&cpu->exit_request, 0);
>> 3039     return ret;
>> 3040 }


end of thread

Thread overview: 8+ messages
2024-01-03  7:53 [PATCH] KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu Prasad Pandit
2024-01-08 10:45 ` Prasad Pandit
2024-01-08 17:38   ` Sean Christopherson
2024-01-09  6:26     ` Prasad Pandit
2024-02-03  0:11 ` Sean Christopherson
2024-02-06  8:10 ` Dongli Zhang
2024-02-06 20:58   ` Sean Christopherson
2024-02-09  0:34     ` Dongli Zhang
