Kernel KVM virtualization development
 help / color / mirror / Atom feed
* Re: [v5 1/3] KVM: setup empty irq routing when create vm
       [not found] ` <20240506101751.3145407-2-foxywang@tencent.com>
@ 2026-04-17  8:19   ` Christian Borntraeger
  2026-04-17  9:36     ` Christian Borntraeger
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Borntraeger @ 2026-04-17  8:19 UTC (permalink / raw)
  To: Yi Wang, seanjc, pbonzini, tglx, mingo, bp, dave.hansen, x86, hpa,
	kvm, linux-kernel, wanpengli, foxywang, oliver.upton, maz, anup,
	atishp, frankja, imbrenda, weijiang.yang

Am 06.05.24 um 12:17 schrieb Yi Wang:
> From: Yi Wang <foxywang@tencent.com>
> 
> Add a new function to setup empty irq routing in kvm path, which
> can be invoded in non-architecture-specific functions. The difference
> compared to the kvm_setup_empty_irq_routing() is this function just
> alloc the empty irq routing and does not need synchronize srcu, as
> we will call it in kvm_create_vm().
> 
> Using the new adding function, we can setup empty irq routing when
> kvm_create_vm(), so that x86 and s390 no longer need to set
> empty/dummy irq routing when creating an IRQCHIP 'cause it avoid
> an synchronize_srcu.
> 
> Signed-off-by: Yi Wang <foxywang@tencent.com>

We have recently looked into cpu consumption for virtio.
So interestingly enough, this increases cpu consumption for things like uperf
ping pong on s390.
Bisect points to this commit.
I originally thought that this is a no-op for s390, but it is not.

The reasons seems to be that nr_rt_entries is now 1 instead of 0 making every
interrupt inject more expensive as we no longer drop out in

int kvm_irq_map_gsi(struct kvm *kvm,
                     struct kvm_kernel_irq_routing_entry *entries, int gsi)
{
         struct kvm_irq_routing_table *irq_rt;
         struct kvm_kernel_irq_routing_entry *e;
         int n = 0;

         irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm->irq_srcu,
                                         lockdep_is_held(&kvm->irq_lock));
         if (irq_rt && gsi < irq_rt->nr_rt_entries) {  <---------


[...]

> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 1e567d1f6d3d..ec1fda7fffea 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -237,3 +237,26 @@ int kvm_set_irq_routing(struct kvm *kvm,
>   
>   	return r;
>   }
> +
> +/*
> + * Alloc empty irq routing.
> + * Called only during vm creation, because we don't synchronize_srcu here.
> + */
> +int kvm_init_irq_routing(struct kvm *kvm)
> +{
> +	struct kvm_irq_routing_table *new;
> +	int chip_size;
> +
> +	new = kzalloc(struct_size(new, map, 1), GFP_KERNEL_ACCOUNT);
> +	if (!new)
> +		return -ENOMEM;
> +
> +	new->nr_rt_entries = 1;

Does anyone see a problem with changing that to

	new->nr_rt_entries = 0;

?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [v5 1/3] KVM: setup empty irq routing when create vm
  2026-04-17  8:19   ` [v5 1/3] KVM: setup empty irq routing when create vm Christian Borntraeger
@ 2026-04-17  9:36     ` Christian Borntraeger
  2026-04-17 10:19       ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Borntraeger @ 2026-04-17  9:36 UTC (permalink / raw)
  To: Yi Wang, seanjc, pbonzini, tglx, mingo, bp, dave.hansen, x86, hpa,
	kvm, linux-kernel, wanpengli, foxywang, oliver.upton, maz, anup,
	atishp, frankja, imbrenda, weijiang.yang

Am 17.04.26 um 10:19 schrieb Christian Borntraeger:
> Am 06.05.24 um 12:17 schrieb Yi Wang:
>> From: Yi Wang <foxywang@tencent.com>
>>
>> Add a new function to setup empty irq routing in kvm path, which
>> can be invoded in non-architecture-specific functions. The difference
>> compared to the kvm_setup_empty_irq_routing() is this function just
>> alloc the empty irq routing and does not need synchronize srcu, as
>> we will call it in kvm_create_vm().
>>
>> Using the new adding function, we can setup empty irq routing when
>> kvm_create_vm(), so that x86 and s390 no longer need to set
>> empty/dummy irq routing when creating an IRQCHIP 'cause it avoid
>> an synchronize_srcu.
>>
>> Signed-off-by: Yi Wang <foxywang@tencent.com>
> 
> We have recently looked into cpu consumption for virtio.
> So interestingly enough, this increases cpu consumption for things like uperf
> ping pong on s390.
> Bisect points to this commit.
> I originally thought that this is a no-op for s390, but it is not.
> 
> The reasons seems to be that nr_rt_entries is now 1 instead of 0 making every
> interrupt inject more expensive as we no longer drop out in
> 
> int kvm_irq_map_gsi(struct kvm *kvm,
>                      struct kvm_kernel_irq_routing_entry *entries, int gsi)
> {
>          struct kvm_irq_routing_table *irq_rt;
>          struct kvm_kernel_irq_routing_entry *e;
>          int n = 0;
> 
>          irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm->irq_srcu,
>                                          lockdep_is_held(&kvm->irq_lock));
>          if (irq_rt && gsi < irq_rt->nr_rt_entries) {  <---------
> 
Hmm, I guess I misread the code and the problem is likely not this.
Let me have another look.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [v5 1/3] KVM: setup empty irq routing when create vm
  2026-04-17  9:36     ` Christian Borntraeger
@ 2026-04-17 10:19       ` Paolo Bonzini
  2026-04-17 10:26         ` Christian Borntraeger
  2026-05-04 12:48         ` Christian Borntraeger
  0 siblings, 2 replies; 5+ messages in thread
From: Paolo Bonzini @ 2026-04-17 10:19 UTC (permalink / raw)
  To: Christian Borntraeger, Yi Wang, seanjc, tglx, mingo, bp,
	dave.hansen, x86, hpa, kvm, linux-kernel, wanpengli, foxywang,
	oliver.upton, maz, anup, atishp, frankja, imbrenda, weijiang.yang

On 4/17/26 11:36, Christian Borntraeger wrote:
>>
>>
>>          irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm- 
>> >irq_srcu,
>>                                          lockdep_is_held(&kvm- 
>> >irq_lock));
>>          if (irq_rt && gsi < irq_rt->nr_rt_entries) {  <---------
>>
> Hmm, I guess I misread the code and the problem is likely not this.
> Let me have another look.

It makes sense anyway, together with a similar extra element in
kvm_set_irq_routing():

diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 462c70621247..c3e4fbbfed94 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -178,11 +178,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
  	for (i = 0; i < nr; ++i) {
  		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
  			return -EINVAL;
-		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
+		nr_rt_entries = max(nr_rt_entries + 1, ue[i].gsi);
  	}
  
-	nr_rt_entries += 1;
-
  	new = kzalloc_flex(*new, map, nr_rt_entries, GFP_KERNEL_ACCOUNT);
  	if (!new)
  		return -ENOMEM;
@@ -246,11 +244,11 @@ int kvm_init_irq_routing(struct kvm *kvm)
  	struct kvm_irq_routing_table *new;
  	int chip_size;
  
-	new = kzalloc_flex(*new, map, 1, GFP_KERNEL_ACCOUNT);
+	new = kzalloc_flex(*new, map, 0, GFP_KERNEL_ACCOUNT);
  	if (!new)
  		return -ENOMEM;
  
-	new->nr_rt_entries = 1;
+	new->nr_rt_entries = 0;
  
  	chip_size = sizeof(int) * KVM_NR_IRQCHIPS * KVM_IRQCHIP_NUM_PINS;
  	memset(new->chip, -1, chip_size);

Paolo


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [v5 1/3] KVM: setup empty irq routing when create vm
  2026-04-17 10:19       ` Paolo Bonzini
@ 2026-04-17 10:26         ` Christian Borntraeger
  2026-05-04 12:48         ` Christian Borntraeger
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Borntraeger @ 2026-04-17 10:26 UTC (permalink / raw)
  To: Paolo Bonzini, Yi Wang, seanjc, tglx, mingo, bp, dave.hansen, x86,
	hpa, kvm, linux-kernel, wanpengli, foxywang, oliver.upton, maz,
	anup, atishp, frankja, imbrenda, weijiang.yang


Am 17.04.26 um 12:19 schrieb Paolo Bonzini:
> On 4/17/26 11:36, Christian Borntraeger wrote:
>>>
>>>
>>>          irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm- >irq_srcu,
>>>                                          lockdep_is_held(&kvm- >irq_lock));
>>>          if (irq_rt && gsi < irq_rt->nr_rt_entries) {  <---------
>>>
>> Hmm, I guess I misread the code and the problem is likely not this.
>> Let me have another look.
> 
> It makes sense anyway, together with a similar extra element in
> kvm_set_irq_routing():

Yes, it should certainly avoid some cycles in some cases.
Let me try to understand what exactly triggered the increases system time after
the original patch and then I will send some patches.


> 
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 462c70621247..c3e4fbbfed94 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -178,11 +178,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
>       for (i = 0; i < nr; ++i) {
>           if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
>               return -EINVAL;
> -        nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
> +        nr_rt_entries = max(nr_rt_entries + 1, ue[i].gsi);
>       }
> 
> -    nr_rt_entries += 1;
> -
>       new = kzalloc_flex(*new, map, nr_rt_entries, GFP_KERNEL_ACCOUNT);
>       if (!new)
>           return -ENOMEM;
> @@ -246,11 +244,11 @@ int kvm_init_irq_routing(struct kvm *kvm)
>       struct kvm_irq_routing_table *new;
>       int chip_size;
> 
> -    new = kzalloc_flex(*new, map, 1, GFP_KERNEL_ACCOUNT);
> +    new = kzalloc_flex(*new, map, 0, GFP_KERNEL_ACCOUNT);
>       if (!new)
>           return -ENOMEM;
> 
> -    new->nr_rt_entries = 1;
> +    new->nr_rt_entries = 0;
> 
>       chip_size = sizeof(int) * KVM_NR_IRQCHIPS * KVM_IRQCHIP_NUM_PINS;
>       memset(new->chip, -1, chip_size);
> 
> Paolo
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [v5 1/3] KVM: setup empty irq routing when create vm
  2026-04-17 10:19       ` Paolo Bonzini
  2026-04-17 10:26         ` Christian Borntraeger
@ 2026-05-04 12:48         ` Christian Borntraeger
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Borntraeger @ 2026-05-04 12:48 UTC (permalink / raw)
  To: Paolo Bonzini, Yi Wang, seanjc, tglx, mingo, bp, dave.hansen, x86,
	hpa, kvm, linux-kernel, wanpengli, foxywang, oliver.upton, maz,
	anup, atishp, frankja, imbrenda, weijiang.yang, Parshuram Sangle,
	Sean Christopherson

Am 17.04.26 um 12:19 schrieb Paolo Bonzini:
> On 4/17/26 11:36, Christian Borntraeger wrote:
>>>
>>>
>>>          irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm- >irq_srcu,
>>>                                          lockdep_is_held(&kvm- >irq_lock));
>>>          if (irq_rt && gsi < irq_rt->nr_rt_entries) {  <---------
>>>
>> Hmm, I guess I misread the code and the problem is likely not this.
>> Let me have another look.
> 
> It makes sense anyway, together with a similar extra element in
> kvm_set_irq_routing():


Turns out that the bisect was wrong. the increased cpu consumption came from
commit aeb1b22a3ac8e94c791f06f16e921384794771fa ("KVM: Enable halt polling shrink parameter by default")
which kind of makes sense.


@Paolo, shall I do a patch with 0 nr_rt_entries anyway as discussed or shall we
drop this?



> 
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 462c70621247..c3e4fbbfed94 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -178,11 +178,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
>       for (i = 0; i < nr; ++i) {
>           if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
>               return -EINVAL;
> -        nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
> +        nr_rt_entries = max(nr_rt_entries + 1, ue[i].gsi);
>       }
> 
> -    nr_rt_entries += 1;
> -
>       new = kzalloc_flex(*new, map, nr_rt_entries, GFP_KERNEL_ACCOUNT);
>       if (!new)
>           return -ENOMEM;
> @@ -246,11 +244,11 @@ int kvm_init_irq_routing(struct kvm *kvm)
>       struct kvm_irq_routing_table *new;
>       int chip_size;
> 
> -    new = kzalloc_flex(*new, map, 1, GFP_KERNEL_ACCOUNT);
> +    new = kzalloc_flex(*new, map, 0, GFP_KERNEL_ACCOUNT);
>       if (!new)
>           return -ENOMEM;
> 
> -    new->nr_rt_entries = 1;
> +    new->nr_rt_entries = 0;
> 
>       chip_size = sizeof(int) * KVM_NR_IRQCHIPS * KVM_IRQCHIP_NUM_PINS;
>       memset(new->chip, -1, chip_size);
> 
> Paolo
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-05-04 12:49 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20240506101751.3145407-1-foxywang@tencent.com>
     [not found] ` <20240506101751.3145407-2-foxywang@tencent.com>
2026-04-17  8:19   ` [v5 1/3] KVM: setup empty irq routing when create vm Christian Borntraeger
2026-04-17  9:36     ` Christian Borntraeger
2026-04-17 10:19       ` Paolo Bonzini
2026-04-17 10:26         ` Christian Borntraeger
2026-05-04 12:48         ` Christian Borntraeger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox