* [PATCH v2 0/3] KVM: x86: Add a module param for device posted IRQs
@ 2025-03-20 14:20 Sean Christopherson
2025-03-20 14:20 ` [PATCH v2 1/3] KVM: VMX: Don't send UNBLOCK when starting device assignment without APICv Sean Christopherson
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Sean Christopherson @ 2025-03-20 14:20 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Yosry Ahmed
Add a module param, enable_device_posted_irqs, to control and enumerate
KVM support for device posted IRQs.
v2:
- Add prep patches to use kvm_arch_has_irq_bypass() in vendor code when
querying support IRQ bypass, a.k.a. device posted IRQs, so as not to
unexpectedly introduce a (desired) dependency on enable_apicv. [Yosry]
- Use "&=" when constraining enable_device_posted_irqs based on APICv
and IOMMU posting support. [Yosry]
v1: https://lore.kernel.org/all/20250315025615.2367411-1-seanjc@google.com
Sean Christopherson (3):
KVM: VMX: Don't send UNBLOCK when starting device assignment without
APICv
KVM: SVM: Don't update IRTEs if APICv/AVIC is disable
KVM: x86: Add a module param to control and enumerate device posted
IRQs
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm/avic.c | 3 +--
arch/x86/kvm/vmx/posted_intr.c | 7 +++----
arch/x86/kvm/x86.c | 10 +++++++++-
4 files changed, 14 insertions(+), 7 deletions(-)
base-commit: c9ea48bb6ee6b28bbc956c1e8af98044618fed5e
--
2.49.0.395.g12beb8f557-goog
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH v2 1/3] KVM: VMX: Don't send UNBLOCK when starting device assignment without APICv 2025-03-20 14:20 [PATCH v2 0/3] KVM: x86: Add a module param for device posted IRQs Sean Christopherson @ 2025-03-20 14:20 ` Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 2/3] KVM: SVM: Don't update IRTEs if APICv/AVIC is disable Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs Sean Christopherson 2 siblings, 0 replies; 13+ messages in thread From: Sean Christopherson @ 2025-03-20 14:20 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Yosry Ahmed When starting device assignment, i.e. potential IRQ bypass, don't blast KVM_REQ_UNBLOCK if APICv is disabled/unsupported. There is no need to wake vCPUs if they can never use VT-d posted IRQs (sending UNBLOCK guards against races being vCPUs blocking and devices starting IRQ bypass). Opportunistically use kvm_arch_has_irq_bypass() for all relevant checks in the VMX Posted Interrupt code so that all checks in KVM x86 incorporate the same information (once AMD/AVIC is given similar treatment). Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/vmx/posted_intr.c | 7 +++---- arch/x86/kvm/x86.c | 1 + 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c index ec08fa3caf43..16121d29dfd9 100644 --- a/arch/x86/kvm/vmx/posted_intr.c +++ b/arch/x86/kvm/vmx/posted_intr.c @@ -134,9 +134,8 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) static bool vmx_can_use_vtd_pi(struct kvm *kvm) { - return irqchip_in_kernel(kvm) && enable_apicv && - kvm_arch_has_assigned_device(kvm) && - irq_remapping_cap(IRQ_POSTING_CAP); + return irqchip_in_kernel(kvm) && kvm_arch_has_irq_bypass() && + kvm_arch_has_assigned_device(kvm); } /* @@ -254,7 +253,7 @@ bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu) */ void vmx_pi_start_assignment(struct kvm *kvm) { - if (!irq_remapping_cap(IRQ_POSTING_CAP)) + if (!kvm_arch_has_irq_bypass()) return; kvm_make_all_cpus_request(kvm, KVM_REQ_UNBLOCK); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 69c20a68a3f0..f76d655dc9a8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -13554,6 +13554,7 @@ bool kvm_arch_has_irq_bypass(void) { return enable_apicv && irq_remapping_cap(IRQ_POSTING_CAP); } +EXPORT_SYMBOL_GPL(kvm_arch_has_irq_bypass); int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons, struct irq_bypass_producer *prod) -- 2.49.0.395.g12beb8f557-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/3] KVM: SVM: Don't update IRTEs if APICv/AVIC is disable 2025-03-20 14:20 [PATCH v2 0/3] KVM: x86: Add a module param for device posted IRQs Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 1/3] KVM: VMX: Don't send UNBLOCK when starting device assignment without APICv Sean Christopherson @ 2025-03-20 14:20 ` Sean Christopherson 2025-03-20 16:08 ` Jim Mattson 2025-03-20 14:20 ` [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs Sean Christopherson 2 siblings, 1 reply; 13+ messages in thread From: Sean Christopherson @ 2025-03-20 14:20 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Yosry Ahmed Skip IRTE updates if AVIC is disabled/unsupported, as forcing the IRTE into remapped mode (kvm_vcpu_apicv_active() will never be true) is unnecessary and wasteful. The IOMMU driver is responsible for putting IRTEs into remapped mode when an IRQ is allocated by a device, long before that device is assigned to a VM. I.e. the kernel as a whole has major issues if the IRTE isn't already in remapped mode. Opportunsitically kvm_arch_has_irq_bypass() to query for APICv/AVIC, so so that all checks in KVM x86 incorporate the same information. Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/svm/avic.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 65fd245a9953..901d8d2dc169 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -898,8 +898,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq, struct kvm_irq_routing_table *irq_rt; int idx, ret = 0; - if (!kvm_arch_has_assigned_device(kvm) || - !irq_remapping_cap(IRQ_POSTING_CAP)) + if (!kvm_arch_has_assigned_device(kvm) || !kvm_arch_has_irq_bypass()) return 0; pr_debug("SVM: %s: host_irq=%#x, guest_irq=%#x, set=%#x\n", -- 2.49.0.395.g12beb8f557-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/3] KVM: SVM: Don't update IRTEs if APICv/AVIC is disable 2025-03-20 14:20 ` [PATCH v2 2/3] KVM: SVM: Don't update IRTEs if APICv/AVIC is disable Sean Christopherson @ 2025-03-20 16:08 ` Jim Mattson 0 siblings, 0 replies; 13+ messages in thread From: Jim Mattson @ 2025-03-20 16:08 UTC (permalink / raw) To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Thu, Mar 20, 2025 at 7:31 AM Sean Christopherson <seanjc@google.com> wrote: > > Skip IRTE updates if AVIC is disabled/unsupported, as forcing the IRTE > into remapped mode (kvm_vcpu_apicv_active() will never be true) is > unnecessary and wasteful. The IOMMU driver is responsible for putting > IRTEs into remapped mode when an IRQ is allocated by a device, long before > that device is assigned to a VM. I.e. the kernel as a whole has major > issues if the IRTE isn't already in remapped mode. > > Opportunsitically kvm_arch_has_irq_bypass() to query for APICv/AVIC, so > so that all checks in KVM x86 incorporate the same information. > > Cc: Yosry Ahmed <yosry.ahmed@linux.dev> > Signed-off-by: Sean Christopherson <seanjc@google.com> Nit: "disable" -> "disabled" in the shortlog. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-20 14:20 [PATCH v2 0/3] KVM: x86: Add a module param for device posted IRQs Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 1/3] KVM: VMX: Don't send UNBLOCK when starting device assignment without APICv Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 2/3] KVM: SVM: Don't update IRTEs if APICv/AVIC is disable Sean Christopherson @ 2025-03-20 14:20 ` Sean Christopherson 2025-03-20 16:02 ` Jim Mattson 2025-03-20 17:59 ` Sean Christopherson 2 siblings, 2 replies; 13+ messages in thread From: Sean Christopherson @ 2025-03-20 14:20 UTC (permalink / raw) To: Sean Christopherson, Paolo Bonzini; +Cc: kvm, linux-kernel, Yosry Ahmed Add a module param to allow disabling device posted interrupts without having to sacrifice all of APICv/AVIC, and to also effectively enumerate to userspace whether or not KVM may be utilizing device posted IRQs. Disabling device posted interrupts is very desirable for testing, and can even be desirable for production environments, e.g. if the host kernel wants to interpose on device interrupts. Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 9 ++++++++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index d881e7d276b1..bf11c5ee50cb 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1922,6 +1922,7 @@ struct kvm_arch_async_pf { extern u32 __read_mostly kvm_nr_uret_msrs; extern bool __read_mostly allow_smaller_maxphyaddr; extern bool __read_mostly enable_apicv; +extern bool __read_mostly enable_device_posted_irqs; extern struct kvm_x86_ops kvm_x86_ops; #define kvm_x86_call(func) static_call(kvm_x86_##func) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f76d655dc9a8..e7eb2198db26 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -227,6 +227,10 @@ EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr); bool __read_mostly enable_apicv = true; EXPORT_SYMBOL_GPL(enable_apicv); +bool __read_mostly enable_device_posted_irqs = true; +module_param(enable_device_posted_irqs, bool, 0444); +EXPORT_SYMBOL_GPL(enable_device_posted_irqs); + const struct _kvm_stats_desc kvm_vm_stats_desc[] = { KVM_GENERIC_VM_STATS(), STATS_DESC_COUNTER(VM, mmu_shadow_zapped), @@ -9772,6 +9776,9 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) if (r != 0) goto out_mmu_exit; + enable_device_posted_irqs &= enable_apicv && + irq_remapping_cap(IRQ_POSTING_CAP); + kvm_ops_update(ops); for_each_online_cpu(cpu) { @@ -13552,7 +13559,7 @@ EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma); bool kvm_arch_has_irq_bypass(void) { - return enable_apicv && irq_remapping_cap(IRQ_POSTING_CAP); + return enable_device_posted_irqs; } EXPORT_SYMBOL_GPL(kvm_arch_has_irq_bypass); -- 2.49.0.395.g12beb8f557-goog ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-20 14:20 ` [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs Sean Christopherson @ 2025-03-20 16:02 ` Jim Mattson 2025-03-20 17:54 ` Sean Christopherson 2025-03-20 17:59 ` Sean Christopherson 1 sibling, 1 reply; 13+ messages in thread From: Jim Mattson @ 2025-03-20 16:02 UTC (permalink / raw) To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Thu, Mar 20, 2025 at 7:31 AM Sean Christopherson <seanjc@google.com> wrote: > > Add a module param to allow disabling device posted interrupts without > having to sacrifice all of APICv/AVIC, and to also effectively enumerate > to userspace whether or not KVM may be utilizing device posted IRQs. > Disabling device posted interrupts is very desirable for testing, and can > even be desirable for production environments, e.g. if the host kernel > wants to interpose on device interrupts. Are you referring to CONFIG_X86_POSTED_MSI, or something else that doesn't exist yet? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-20 16:02 ` Jim Mattson @ 2025-03-20 17:54 ` Sean Christopherson 0 siblings, 0 replies; 13+ messages in thread From: Sean Christopherson @ 2025-03-20 17:54 UTC (permalink / raw) To: Jim Mattson; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Thu, Mar 20, 2025, Jim Mattson wrote: > On Thu, Mar 20, 2025 at 7:31 AM Sean Christopherson <seanjc@google.com> wrote: > > > > Add a module param to allow disabling device posted interrupts without > > having to sacrifice all of APICv/AVIC, and to also effectively enumerate > > to userspace whether or not KVM may be utilizing device posted IRQs. > > Disabling device posted interrupts is very desirable for testing, and can > > even be desirable for production environments, e.g. if the host kernel > > wants to interpose on device interrupts. > > Are you referring to CONFIG_X86_POSTED_MSI, or something else that > doesn't exist yet? Yeah, that, and/or out-of-tree hackery to do similar coalescing (or ratelimiting). ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-20 14:20 ` [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs Sean Christopherson 2025-03-20 16:02 ` Jim Mattson @ 2025-03-20 17:59 ` Sean Christopherson 2025-03-20 18:14 ` Yosry Ahmed 2025-03-21 1:57 ` Chao Gao 1 sibling, 2 replies; 13+ messages in thread From: Sean Christopherson @ 2025-03-20 17:59 UTC (permalink / raw) To: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Thu, Mar 20, 2025, Sean Christopherson wrote: > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index f76d655dc9a8..e7eb2198db26 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -227,6 +227,10 @@ EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr); > bool __read_mostly enable_apicv = true; > EXPORT_SYMBOL_GPL(enable_apicv); > > +bool __read_mostly enable_device_posted_irqs = true; > +module_param(enable_device_posted_irqs, bool, 0444); > +EXPORT_SYMBOL_GPL(enable_device_posted_irqs); > + > const struct _kvm_stats_desc kvm_vm_stats_desc[] = { > KVM_GENERIC_VM_STATS(), > STATS_DESC_COUNTER(VM, mmu_shadow_zapped), > @@ -9772,6 +9776,9 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) > if (r != 0) > goto out_mmu_exit; > > + enable_device_posted_irqs &= enable_apicv && > + irq_remapping_cap(IRQ_POSTING_CAP); Drat, this is flawed. Putting the module param in kvm.ko means that loading kvm.ko with enable_device_posted_irqs=true, but a vendor module with APICv/AVIC disabled, leaves enable_device_posted_irqs disabled for the lifetime of kvm.ko. I.e. reloading the vendor module with APICv/AVIC enabled can't enable device posted IRQs. Option #1 is to do what we do for enable_mmio_caching, and snapshot userspace's desire. diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e7eb2198db26..c84ad9109108 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -228,6 +228,7 @@ bool __read_mostly enable_apicv = true; EXPORT_SYMBOL_GPL(enable_apicv); bool __read_mostly enable_device_posted_irqs = true; +bool __ro_after_init allow_device_posted_irqs; module_param(enable_device_posted_irqs, bool, 0444); EXPORT_SYMBOL_GPL(enable_device_posted_irqs); @@ -9776,8 +9777,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) if (r != 0) goto out_mmu_exit; - enable_device_posted_irqs &= enable_apicv && - irq_remapping_cap(IRQ_POSTING_CAP); + enable_device_posted_irqs = allow_device_posted_irqs && enable_apicv && + irq_remapping_cap(IRQ_POSTING_CAP); kvm_ops_update(ops); @@ -14033,6 +14034,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); static int __init kvm_x86_init(void) { + allow_device_posted_irqs = enable_device_posted_irqs; + kvm_init_xstate_sizes(); kvm_mmu_x86_module_init(); Option #2 is to shove the module param into vendor code, but leave the variable in kvm.ko, like we do for enable_apicv. I'm leaning toward option #2, as it's more flexible, arguably more intuitive, and doesn't prevent putting the logic in kvm_x86_vendor_init(). ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-20 17:59 ` Sean Christopherson @ 2025-03-20 18:14 ` Yosry Ahmed 2025-03-21 1:57 ` Chao Gao 1 sibling, 0 replies; 13+ messages in thread From: Yosry Ahmed @ 2025-03-20 18:14 UTC (permalink / raw) To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel On Thu, Mar 20, 2025 at 10:59:19AM -0700, Sean Christopherson wrote: > On Thu, Mar 20, 2025, Sean Christopherson wrote: > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index f76d655dc9a8..e7eb2198db26 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -227,6 +227,10 @@ EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr); > > bool __read_mostly enable_apicv = true; > > EXPORT_SYMBOL_GPL(enable_apicv); > > > > +bool __read_mostly enable_device_posted_irqs = true; > > +module_param(enable_device_posted_irqs, bool, 0444); > > +EXPORT_SYMBOL_GPL(enable_device_posted_irqs); > > + > > const struct _kvm_stats_desc kvm_vm_stats_desc[] = { > > KVM_GENERIC_VM_STATS(), > > STATS_DESC_COUNTER(VM, mmu_shadow_zapped), > > @@ -9772,6 +9776,9 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) > > if (r != 0) > > goto out_mmu_exit; > > > > + enable_device_posted_irqs &= enable_apicv && > > + irq_remapping_cap(IRQ_POSTING_CAP); > > Drat, this is flawed. Putting the module param in kvm.ko means that loading > kvm.ko with enable_device_posted_irqs=true, but a vendor module with APICv/AVIC > disabled, leaves enable_device_posted_irqs disabled for the lifetime of kvm.ko. > I.e. reloading the vendor module with APICv/AVIC enabled can't enable device > posted IRQs. > > Option #1 is to do what we do for enable_mmio_caching, and snapshot userspace's > desire. > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index e7eb2198db26..c84ad9109108 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -228,6 +228,7 @@ bool __read_mostly enable_apicv = true; > EXPORT_SYMBOL_GPL(enable_apicv); > > bool __read_mostly enable_device_posted_irqs = true; > +bool __ro_after_init allow_device_posted_irqs; > module_param(enable_device_posted_irqs, bool, 0444); > EXPORT_SYMBOL_GPL(enable_device_posted_irqs); > > @@ -9776,8 +9777,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) > if (r != 0) > goto out_mmu_exit; > > - enable_device_posted_irqs &= enable_apicv && > - irq_remapping_cap(IRQ_POSTING_CAP); > + enable_device_posted_irqs = allow_device_posted_irqs && enable_apicv && > + irq_remapping_cap(IRQ_POSTING_CAP); > > kvm_ops_update(ops); > > @@ -14033,6 +14034,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); > > static int __init kvm_x86_init(void) > { > + allow_device_posted_irqs = enable_device_posted_irqs; > + > kvm_init_xstate_sizes(); > > kvm_mmu_x86_module_init(); > > > Option #2 is to shove the module param into vendor code, but leave the variable > in kvm.ko, like we do for enable_apicv. > > I'm leaning toward option #2, as it's more flexible, arguably more intuitive, and > doesn't prevent putting the logic in kvm_x86_vendor_init(). +1, option #1 seems a bit confusing to me. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-20 17:59 ` Sean Christopherson 2025-03-20 18:14 ` Yosry Ahmed @ 2025-03-21 1:57 ` Chao Gao 2025-03-21 20:44 ` Sean Christopherson 1 sibling, 1 reply; 13+ messages in thread From: Chao Gao @ 2025-03-21 1:57 UTC (permalink / raw) To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Thu, Mar 20, 2025 at 10:59:19AM -0700, Sean Christopherson wrote: >On Thu, Mar 20, 2025, Sean Christopherson wrote: >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index f76d655dc9a8..e7eb2198db26 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -227,6 +227,10 @@ EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr); >> bool __read_mostly enable_apicv = true; >> EXPORT_SYMBOL_GPL(enable_apicv); >> >> +bool __read_mostly enable_device_posted_irqs = true; >> +module_param(enable_device_posted_irqs, bool, 0444); >> +EXPORT_SYMBOL_GPL(enable_device_posted_irqs); can this variable be declared as static? >> + >> const struct _kvm_stats_desc kvm_vm_stats_desc[] = { >> KVM_GENERIC_VM_STATS(), >> STATS_DESC_COUNTER(VM, mmu_shadow_zapped), >> @@ -9772,6 +9776,9 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) >> if (r != 0) >> goto out_mmu_exit; >> >> + enable_device_posted_irqs &= enable_apicv && >> + irq_remapping_cap(IRQ_POSTING_CAP); > >Drat, this is flawed. Putting the module param in kvm.ko means that loading >kvm.ko with enable_device_posted_irqs=true, but a vendor module with APICv/AVIC >disabled, leaves enable_device_posted_irqs disabled for the lifetime of kvm.ko. >I.e. reloading the vendor module with APICv/AVIC enabled can't enable device >posted IRQs. > >Option #1 is to do what we do for enable_mmio_caching, and snapshot userspace's >desire. > >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >index e7eb2198db26..c84ad9109108 100644 >--- a/arch/x86/kvm/x86.c >+++ b/arch/x86/kvm/x86.c >@@ -228,6 +228,7 @@ bool __read_mostly enable_apicv = true; > EXPORT_SYMBOL_GPL(enable_apicv); > > bool __read_mostly enable_device_posted_irqs = true; >+bool __ro_after_init allow_device_posted_irqs; > module_param(enable_device_posted_irqs, bool, 0444); > EXPORT_SYMBOL_GPL(enable_device_posted_irqs); > >@@ -9776,8 +9777,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) > if (r != 0) > goto out_mmu_exit; > >- enable_device_posted_irqs &= enable_apicv && >- irq_remapping_cap(IRQ_POSTING_CAP); >+ enable_device_posted_irqs = allow_device_posted_irqs && enable_apicv && >+ irq_remapping_cap(IRQ_POSTING_CAP); Can we simply drop this ... > > kvm_ops_update(ops); > >@@ -14033,6 +14034,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); > > static int __init kvm_x86_init(void) > { >+ allow_device_posted_irqs = enable_device_posted_irqs; >+ > kvm_init_xstate_sizes(); > > kvm_mmu_x86_module_init(); > > >Option #2 is to shove the module param into vendor code, but leave the variable >in kvm.ko, like we do for enable_apicv. > >I'm leaning toward option #2, as it's more flexible, arguably more intuitive, and >doesn't prevent putting the logic in kvm_x86_vendor_init(). > and do bool kvm_arch_has_irq_bypass(void) { return enable_device_posted_irqs && enable_apicv && irq_remapping_cap(IRQ_POSTING_CAP); } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-21 1:57 ` Chao Gao @ 2025-03-21 20:44 ` Sean Christopherson 2025-03-24 9:26 ` Chao Gao 0 siblings, 1 reply; 13+ messages in thread From: Sean Christopherson @ 2025-03-21 20:44 UTC (permalink / raw) To: Chao Gao; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Fri, Mar 21, 2025, Chao Gao wrote: > On Thu, Mar 20, 2025 at 10:59:19AM -0700, Sean Christopherson wrote: > >@@ -9776,8 +9777,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) > > if (r != 0) > > goto out_mmu_exit; > > > >- enable_device_posted_irqs &= enable_apicv && > >- irq_remapping_cap(IRQ_POSTING_CAP); > >+ enable_device_posted_irqs = allow_device_posted_irqs && enable_apicv && > >+ irq_remapping_cap(IRQ_POSTING_CAP); > > Can we simply drop this ... > > > > > kvm_ops_update(ops); > > > >@@ -14033,6 +14034,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); > > > > static int __init kvm_x86_init(void) > > { > >+ allow_device_posted_irqs = enable_device_posted_irqs; > >+ > > kvm_init_xstate_sizes(); > > > > kvm_mmu_x86_module_init(); > > > > > >Option #2 is to shove the module param into vendor code, but leave the variable > >in kvm.ko, like we do for enable_apicv. > > > >I'm leaning toward option #2, as it's more flexible, arguably more intuitive, and > >doesn't prevent putting the logic in kvm_x86_vendor_init(). > > > > and do > > bool kvm_arch_has_irq_bypass(void) > { > return enable_device_posted_irqs && enable_apicv && > irq_remapping_cap(IRQ_POSTING_CAP); > } That would avoid the vendor module issues, but it would result in allow_device_posted_irqs not reflecting the state of KVM. We could partially address that by having the variable incorporate irq_remapping_cap(IRQ_POSTING_CAP) but not enable_apicv, but that's still a bit funky. Given that enable_apicv already has the "variable in kvm.ko, module param in kvm-{amd,intel}.ko" behavior, and that I am planning on giving enable_ipiv the same treatment (long story), my strong vote is to go with option #2 as it's the most flexibile, most accurate, and consistent with existing knobs. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-21 20:44 ` Sean Christopherson @ 2025-03-24 9:26 ` Chao Gao 2025-03-24 13:41 ` Sean Christopherson 0 siblings, 1 reply; 13+ messages in thread From: Chao Gao @ 2025-03-24 9:26 UTC (permalink / raw) To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Fri, Mar 21, 2025 at 01:44:47PM -0700, Sean Christopherson wrote: >On Fri, Mar 21, 2025, Chao Gao wrote: >> On Thu, Mar 20, 2025 at 10:59:19AM -0700, Sean Christopherson wrote: >> >@@ -9776,8 +9777,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) >> > if (r != 0) >> > goto out_mmu_exit; >> > >> >- enable_device_posted_irqs &= enable_apicv && >> >- irq_remapping_cap(IRQ_POSTING_CAP); >> >+ enable_device_posted_irqs = allow_device_posted_irqs && enable_apicv && >> >+ irq_remapping_cap(IRQ_POSTING_CAP); >> >> Can we simply drop this ... >> >> > >> > kvm_ops_update(ops); >> > >> >@@ -14033,6 +14034,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); >> > >> > static int __init kvm_x86_init(void) >> > { >> >+ allow_device_posted_irqs = enable_device_posted_irqs; >> >+ >> > kvm_init_xstate_sizes(); >> > >> > kvm_mmu_x86_module_init(); >> > >> > >> >Option #2 is to shove the module param into vendor code, but leave the variable >> >in kvm.ko, like we do for enable_apicv. >> > >> >I'm leaning toward option #2, as it's more flexible, arguably more intuitive, and >> >doesn't prevent putting the logic in kvm_x86_vendor_init(). >> > >> >> and do >> >> bool kvm_arch_has_irq_bypass(void) >> { >> return enable_device_posted_irqs && enable_apicv && >> irq_remapping_cap(IRQ_POSTING_CAP); >> } > >That would avoid the vendor module issues, but it would result in >allow_device_posted_irqs not reflecting the state of KVM. We could partially Ok. I missed that. btw, is using module_param_cb() a bad idea? like: module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644); with a proper .get callback, we can reflect the state of KVM to userspace accurately. >address that by having the variable incorporate irq_remapping_cap(IRQ_POSTING_CAP) >but not enable_apicv, but that's still a bit funky. > >Given that enable_apicv already has the "variable in kvm.ko, module param in >kvm-{amd,intel}.ko" behavior, and that I am planning on giving enable_ipiv the >same treatment (long story), my strong vote is to go with option #2 as it's the >most flexibile, most accurate, and consistent with existing knobs. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs 2025-03-24 9:26 ` Chao Gao @ 2025-03-24 13:41 ` Sean Christopherson 0 siblings, 0 replies; 13+ messages in thread From: Sean Christopherson @ 2025-03-24 13:41 UTC (permalink / raw) To: Chao Gao; +Cc: Paolo Bonzini, kvm, linux-kernel, Yosry Ahmed On Mon, Mar 24, 2025, Chao Gao wrote: > On Fri, Mar 21, 2025 at 01:44:47PM -0700, Sean Christopherson wrote: > >On Fri, Mar 21, 2025, Chao Gao wrote: > >> On Thu, Mar 20, 2025 at 10:59:19AM -0700, Sean Christopherson wrote: > >> >@@ -9776,8 +9777,8 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) > >> > if (r != 0) > >> > goto out_mmu_exit; > >> > > >> >- enable_device_posted_irqs &= enable_apicv && > >> >- irq_remapping_cap(IRQ_POSTING_CAP); > >> >+ enable_device_posted_irqs = allow_device_posted_irqs && enable_apicv && > >> >+ irq_remapping_cap(IRQ_POSTING_CAP); > >> > >> Can we simply drop this ... > >> > >> > > >> > kvm_ops_update(ops); > >> > > >> >@@ -14033,6 +14034,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_rmp_fault); > >> > > >> > static int __init kvm_x86_init(void) > >> > { > >> >+ allow_device_posted_irqs = enable_device_posted_irqs; > >> >+ > >> > kvm_init_xstate_sizes(); > >> > > >> > kvm_mmu_x86_module_init(); > >> > > >> > > >> >Option #2 is to shove the module param into vendor code, but leave the variable > >> >in kvm.ko, like we do for enable_apicv. > >> > > >> >I'm leaning toward option #2, as it's more flexible, arguably more intuitive, and > >> >doesn't prevent putting the logic in kvm_x86_vendor_init(). > >> > > >> > >> and do > >> > >> bool kvm_arch_has_irq_bypass(void) > >> { > >> return enable_device_posted_irqs && enable_apicv && > >> irq_remapping_cap(IRQ_POSTING_CAP); > >> } > > > >That would avoid the vendor module issues, but it would result in > >allow_device_posted_irqs not reflecting the state of KVM. We could partially > > Ok. I missed that. > > btw, is using module_param_cb() a bad idea? like: > > module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644); > > with a proper .get callback, we can reflect the state of KVM to userspace > accurately. It's not a bad idea, but it comes with tradeoffs too. A little bit more code, but more importantly enable_device_posted_irqs wouldn't reflect KVM's internal state, which could result in bugs if KVM were to check the module param directly. I don't think that'd be likely to happen, but given that pretty much every other "simple" param in KVM reflects KVM's state directly, it'd be an easy mistake to make. That, and being able to set toggle the param when reloading the vendor module is actually valuable, as there are setups where kvm.ko is built-in, but the vendor modules are not. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-03-24 13:41 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-03-20 14:20 [PATCH v2 0/3] KVM: x86: Add a module param for device posted IRQs Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 1/3] KVM: VMX: Don't send UNBLOCK when starting device assignment without APICv Sean Christopherson 2025-03-20 14:20 ` [PATCH v2 2/3] KVM: SVM: Don't update IRTEs if APICv/AVIC is disable Sean Christopherson 2025-03-20 16:08 ` Jim Mattson 2025-03-20 14:20 ` [PATCH v2 3/3] KVM: x86: Add a module param to control and enumerate device posted IRQs Sean Christopherson 2025-03-20 16:02 ` Jim Mattson 2025-03-20 17:54 ` Sean Christopherson 2025-03-20 17:59 ` Sean Christopherson 2025-03-20 18:14 ` Yosry Ahmed 2025-03-21 1:57 ` Chao Gao 2025-03-21 20:44 ` Sean Christopherson 2025-03-24 9:26 ` Chao Gao 2025-03-24 13:41 ` Sean Christopherson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox