* [PATCH v7 00/17] Add VT-d Posted-Interrupts support
@ 2015-08-25  8:50 Feng Wu
  2015-08-25  8:50 ` [PATCH v7 04/17] KVM: Get Posted-Interrupts descriptor address from 'struct kvm_vcpu' Feng Wu
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

v7:
* Define two weak irq bypass callbacks:
  - kvm_arch_irq_bypass_start()
  - kvm_arch_irq_bypass_stop()
* Remove the x86 dummy implementation of the above two functions.
* Print some useful information instead of WARN_ON() when the
  irq bypass consumer unregistration fails.
* Fix an issue when calling pi_pre_block and pi_post_block.

v6:
* Rebase on 4.2.0-rc6
* Rebase on https://lkml.org/lkml/2015/8/6/526 and http://www.gossamer-threads.com/lists/linux/kernel/2235623
* Make the add_consumer and del_consumer callbacks static
* Remove the pointless INIT_LIST_HEAD of 'vdev->ctx[vector].producer.node'
* Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
* Remove optional dummy callbacks for irq producer

v4:
* For lowest-priority interrupts, only support single-CPU destination
  interrupts at the current stage; more general lowest-priority support
  will be added later.
* According to Marcelo's suggestion, when the vCPU is blocked, we handle
  the posted-interrupts in the HLT emulation path.
* Some small changes (coding style, typos, additional code comments)

v3:
* Adjust the Posted-interrupts Descriptor updating logic when vCPU is
  preempted or blocked.
* KVM_DEV_VFIO_DEVICE_POSTING_IRQ --> KVM_DEV_VFIO_DEVICE_POST_IRQ
* __KVM_HAVE_ARCH_KVM_VFIO_POSTING --> __KVM_HAVE_ARCH_KVM_VFIO_POST
* Add KVM_DEV_VFIO_DEVICE_UNPOST_IRQ attribute for VFIO irq, which
  can be used to change back to remapping mode.
* Fix typo

v2:
* Use the VFIO framework to enable this feature; the VFIO part of this series is
  based on Eric's patch "[PATCH v3 0/8] KVM-VFIO IRQ forward control"
* Rebase this patchset on git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git,
  then revise some irq logic based on the new hierarchical irqdomain patches provided
  by Jiang Liu <jiang.liu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>

Feng Wu (17):
  KVM: Extend struct pi_desc for VT-d Posted-Interrupts
  KVM: Add some helper functions for Posted-Interrupts
  KVM: Define a new interface kvm_intr_is_single_vcpu()
  KVM: Get Posted-Interrupts descriptor address from 'struct kvm_vcpu'
  KVM: Add interfaces to control PI outside vmx
  KVM: Make struct kvm_irq_routing_table accessible
  KVM: make kvm_set_msi_irq() public
  vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices
  vfio: Register/unregister irq_bypass_producer
  KVM: x86: Update IRTE for posted-interrupts
  KVM: Define two weak arch callbacks for irq bypass manager
  KVM: Implement IRQ bypass consumer callbacks for x86
  KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
  KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
  KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
  KVM: Warn if 'SN' is set during posting interrupts by software
  iommu/vt-d: Add a command line parameter for VT-d posted-interrupts

 Documentation/kernel-parameters.txt |   1 +
 arch/x86/include/asm/kvm_host.h     |  20 +++
 arch/x86/kvm/Kconfig                |   1 +
 arch/x86/kvm/irq_comm.c             |  28 +++-
 arch/x86/kvm/vmx.c                  | 288 +++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.c                  | 167 +++++++++++++++++++--
 drivers/iommu/irq_remapping.c       |  12 +-
 drivers/vfio/pci/Kconfig            |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c   |   9 ++
 drivers/vfio/pci/vfio_pci_private.h |   2 +
 include/linux/kvm_host.h            |  28 ++++
 include/linux/kvm_irqfd.h           |   2 +
 virt/kvm/eventfd.c                  |  22 ++-
 virt/kvm/irqchip.c                  |  10 --
 virt/kvm/kvm_main.c                 |   3 +
 15 files changed, 565 insertions(+), 29 deletions(-)

-- 
2.1.0


* [PATCH v7 01/17] KVM: Extend struct pi_desc for VT-d Posted-Interrupts
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-08-25  8:50   ` Feng Wu
  2015-08-25  8:50   ` [PATCH v7 02/17] KVM: Add some helper functions for Posted-Interrupts Feng Wu
                     ` (13 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Extend struct pi_desc for VT-d Posted-Interrupts.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/kvm/vmx.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 83b7b5c..271dd70 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -446,8 +446,24 @@ struct nested_vmx {
 /* Posted-Interrupt Descriptor */
 struct pi_desc {
 	u32 pir[8];     /* Posted interrupt requested */
-	u32 control;	/* bit 0 of control is outstanding notification bit */
-	u32 rsvd[7];
+	union {
+		struct {
+				/* bit 256 - Outstanding Notification */
+			u16	on	: 1,
+				/* bit 257 - Suppress Notification */
+				sn	: 1,
+				/* bit 271:258 - Reserved */
+				rsvd_1	: 14;
+				/* bit 279:272 - Notification Vector */
+			u8	nv;
+				/* bit 287:280 - Reserved */
+			u8	rsvd_2;
+				/* bit 319:288 - Notification Destination */
+			u32	ndst;
+		};
+		u64 control;
+	};
+	u32 rsvd[6];
 } __aligned(64);
 
 static bool pi_test_and_set_on(struct pi_desc *pi_desc)
-- 
2.1.0
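
For readers who want to check the bit positions in the comments above, the
layout can be mirrored in userspace. This is only a sketch, assuming GCC or
Clang on little-endian x86-64 (bit-field placement is implementation-defined
elsewhere). The names pi_desc_model and pi_control_word() are hypothetical
and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mirror of the extended descriptor. */
struct pi_desc_model {
	uint32_t pir[8];			/* bits 255:0 - posted interrupt requests */
	union {
		struct {
			uint16_t on	: 1,	/* bit 256 - Outstanding Notification */
				 sn	: 1,	/* bit 257 - Suppress Notification */
				 rsvd_1	: 14;	/* bits 271:258 - Reserved */
			uint8_t  nv;		/* bits 279:272 - Notification Vector */
			uint8_t  rsvd_2;	/* bits 287:280 - Reserved */
			uint32_t ndst;		/* bits 319:288 - Notification Destination */
		};
		uint64_t control;
	};
	uint32_t rsvd[6];
} __attribute__((aligned(64)));

/* The descriptor must stay exactly one 64-byte cache line. */
static_assert(sizeof(struct pi_desc_model) == 64, "descriptor is 64 bytes");
static_assert(offsetof(struct pi_desc_model, control) == 32, "control follows PIR");

/* Fill the named fields and return the resulting 64-bit control word. */
static uint64_t pi_control_word(unsigned int on, unsigned int sn,
				uint8_t nv, uint32_t ndst)
{
	struct pi_desc_model d;

	memset(&d, 0, sizeof(d));
	d.on = on;
	d.sn = sn;
	d.nv = nv;
	d.ndst = ndst;
	return d.control;
}
```

Relative to the start of the control word, ON and SN are bits 0 and 1, NV
occupies bits 23:16 and NDST bits 63:32, which is why rsvd shrinks from 7 to
6 words once control grows from u32 to u64.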


* [PATCH v7 02/17] KVM: Add some helper functions for Posted-Interrupts
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 01/17] KVM: Extend struct pi_desc for VT-d Posted-Interrupts Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-09-11 10:16     ` Paolo Bonzini
  2015-08-25  8:50   ` [PATCH v7 03/17] KVM: Define a new interface kvm_intr_is_single_vcpu() Feng Wu
                     ` (12 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch adds some helper functions to manipulate the
Posted-Interrupts Descriptor.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/kvm/vmx.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 271dd70..316f9bf 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -443,6 +443,8 @@ struct nested_vmx {
 };
 
 #define POSTED_INTR_ON  0
+#define POSTED_INTR_SN  1
+
 /* Posted-Interrupt Descriptor */
 struct pi_desc {
 	u32 pir[8];     /* Posted interrupt requested */
@@ -483,6 +485,30 @@ static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
 	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }
 
+static void pi_clear_sn(struct pi_desc *pi_desc)
+{
+	clear_bit(POSTED_INTR_SN,
+			(unsigned long *)&pi_desc->control);
+}
+
+static void pi_set_sn(struct pi_desc *pi_desc)
+{
+	set_bit(POSTED_INTR_SN,
+			(unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_on(struct pi_desc *pi_desc)
+{
+	return test_bit(POSTED_INTR_ON,
+			(unsigned long *)&pi_desc->control);
+}
+
+static int pi_test_sn(struct pi_desc *pi_desc)
+{
+	return test_bit(POSTED_INTR_SN,
+			(unsigned long *)&pi_desc->control);
+}
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
-- 
2.1.0
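
The helpers above rely on the kernel's atomic bitops. As a rough userspace
analogue (a sketch only, using C11 atomics instead of set_bit()/clear_bit();
the model_ names and struct pi_control_model are hypothetical), the same
operations on the control word look like this:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define POSTED_INTR_ON  0
#define POSTED_INTR_SN  1

/* Hypothetical stand-in for the 64-bit control word of the descriptor. */
struct pi_control_model {
	_Atomic uint64_t control;
};

/* Atomically set SN so that hardware suppresses notification events. */
static void model_pi_set_sn(struct pi_control_model *pi)
{
	atomic_fetch_or(&pi->control, UINT64_C(1) << POSTED_INTR_SN);
}

/* Atomically clear SN without disturbing the other control bits. */
static void model_pi_clear_sn(struct pi_control_model *pi)
{
	atomic_fetch_and(&pi->control, ~(UINT64_C(1) << POSTED_INTR_SN));
}

/* Report whether a notification is outstanding (ON set). */
static bool model_pi_test_on(struct pi_control_model *pi)
{
	return (atomic_load(&pi->control) & (UINT64_C(1) << POSTED_INTR_ON)) != 0;
}

/* Report whether notifications are currently suppressed (SN set). */
static bool model_pi_test_sn(struct pi_control_model *pi)
{
	return (atomic_load(&pi->control) & (UINT64_C(1) << POSTED_INTR_SN)) != 0;
}
```

The read-modify-write has to be atomic because the IOMMU hardware may update
the same control word concurrently with software.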


* [PATCH v7 03/17] KVM: Define a new interface kvm_intr_is_single_vcpu()
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 01/17] KVM: Extend struct pi_desc for VT-d Posted-Interrupts Feng Wu
  2015-08-25  8:50   ` [PATCH v7 02/17] KVM: Add some helper functions for Posted-Interrupts Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
       [not found]     ` <1440492620-15934-4-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 05/17] KVM: Add interfaces to control PI outside vmx Feng Wu
                     ` (11 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch defines a new interface kvm_intr_is_single_vcpu(),
which returns whether the interrupt targets a single vCPU.

It is used by VT-d PI, since for now we only support single-CPU
interrupts. For lowest-priority interrupts, if the user configures
the interrupt via /proc/irq or uses irqbalance to make it single-CPU,
we can use PI to deliver it. Full lowest-priority support
will be added later.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/irq_comm.c         | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 49ec903..af11bca 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1204,4 +1204,7 @@ int __x86_set_memory_region(struct kvm *kvm,
 int x86_set_memory_region(struct kvm *kvm,
 			  const struct kvm_userspace_memory_region *mem);
 
+bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
+			     struct kvm_vcpu **dest_vcpu);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 9efff9e..a9572a13 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -297,6 +297,30 @@ out:
 	return r;
 }
 
+bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
+			     struct kvm_vcpu **dest_vcpu)
+{
+	int i, r = 0;
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (!kvm_apic_present(vcpu))
+			continue;
+
+		if (!kvm_apic_match_dest(vcpu, NULL, irq->shorthand,
+					irq->dest_id, irq->dest_mode))
+			continue;
+
+		r++;
+		*dest_vcpu = vcpu;
+	}
+
+	if (r == 1)
+		return true;
+	else
+		return false;
+}
+
 #define IOAPIC_ROUTING_ENTRY(irq) \
 	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
 	  .u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } }
-- 
2.1.0
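
The counting logic in kvm_intr_is_single_vcpu() can be tried outside the
kernel with a simplified model. This is a sketch under stated assumptions:
it only models physical destination mode with 0xff as the broadcast ID, and
struct model_vcpu, model_match_dest() and model_intr_is_single_vcpu() are
hypothetical names standing in for the KVM structures and
kvm_apic_match_dest().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BROADCAST 0xffu	/* assumed physical-mode broadcast ID */

/* Hypothetical, minimal stand-in for a vCPU and its local APIC. */
struct model_vcpu {
	uint32_t apic_id;
	bool apic_present;
};

/* Simplified destination match: exact APIC ID or broadcast. */
static bool model_match_dest(const struct model_vcpu *v, uint32_t dest_id)
{
	return dest_id == MODEL_BROADCAST || v->apic_id == dest_id;
}

/*
 * Return true only if exactly one present vCPU matches dest_id.
 * *dest is written for every match, so it is only meaningful when
 * the function returns true.
 */
static bool model_intr_is_single_vcpu(struct model_vcpu *vcpus, size_t n,
				      uint32_t dest_id,
				      struct model_vcpu **dest)
{
	size_t i, matches = 0;

	for (i = 0; i < n; i++) {
		if (!vcpus[i].apic_present)
			continue;
		if (!model_match_dest(&vcpus[i], dest_id))
			continue;
		matches++;
		*dest = &vcpus[i];
	}

	return matches == 1;
}
```

As in the patch, a caller should only use the returned vCPU pointer when the
result is true; with zero matches it is untouched, and with several matches
it points at the last one found.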


* [PATCH v7 04/17] KVM: Get Posted-Interrupts descriptor address from 'struct kvm_vcpu'
  2015-08-25  8:50 [PATCH v7 00/17] Add VT-d Posted-Interrupts support Feng Wu
@ 2015-08-25  8:50 ` Feng Wu
       [not found]   ` <1440492620-15934-5-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50 ` [PATCH v7 17/17] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts Feng Wu
  2 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel, Feng Wu

Define an interface to get the PI descriptor address from the vCPU structure.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/vmx.c              | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index af11bca..d50c1d3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -858,6 +858,8 @@ struct kvm_x86_ops {
 	void (*enable_log_dirty_pt_masked)(struct kvm *kvm,
 					   struct kvm_memory_slot *slot,
 					   gfn_t offset, unsigned long mask);
+
+	u64 (*get_pi_desc_addr)(struct kvm_vcpu *vcpu);
 	/* pmu operations of sub-arch */
 	const struct kvm_pmu_ops *pmu_ops;
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 316f9bf..81a995c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -610,6 +610,10 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
 #define FIELD64(number, name)	[number] = VMCS12_OFFSET(name), \
 				[number##_HIGH] = VMCS12_OFFSET(name)+4
 
+struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
+{
+	return &(to_vmx(vcpu)->pi_desc);
+}
 
 static unsigned long shadow_read_only_fields[] = {
 	/*
@@ -4487,6 +4491,11 @@ static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu)
 	return;
 }
 
+static u64 vmx_get_pi_desc_addr(struct kvm_vcpu *vcpu)
+{
+	return __pa((u64)vcpu_to_pi_desc(vcpu));
+}
+
 /*
  * Set up the vmcs's constant host-state fields, i.e., host-state fields that
  * will not change in the lifetime of the guest.
@@ -10460,6 +10469,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.flush_log_dirty = vmx_flush_log_dirty,
 	.enable_log_dirty_pt_masked = vmx_enable_log_dirty_pt_masked,
 
+	.get_pi_desc_addr = vmx_get_pi_desc_addr,
+
 	.pmu_ops = &intel_pmu_ops,
 };
 
-- 
2.1.0


* [PATCH v7 05/17] KVM: Add interfaces to control PI outside vmx
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 03/17] KVM: Define a new interface kvm_intr_is_single_vcpu() Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-08-25  8:50   ` [PATCH v7 06/17] KVM: Make struct kvm_irq_routing_table accessible Feng Wu
                     ` (10 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch adds pi_clear_sn and pi_set_sn to struct kvm_x86_ops,
so we can set/clear SN outside vmx.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/vmx.c              | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d50c1d3..c4f99f1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -860,6 +860,9 @@ struct kvm_x86_ops {
 					   gfn_t offset, unsigned long mask);
 
 	u64 (*get_pi_desc_addr)(struct kvm_vcpu *vcpu);
+
+	void (*pi_clear_sn)(struct kvm_vcpu *vcpu);
+	void (*pi_set_sn)(struct kvm_vcpu *vcpu);
 	/* pmu operations of sub-arch */
 	const struct kvm_pmu_ops *pmu_ops;
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 81a995c..234f720 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -615,6 +615,16 @@ struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
 	return &(to_vmx(vcpu)->pi_desc);
 }
 
+static void vmx_pi_clear_sn(struct kvm_vcpu *vcpu)
+{
+	pi_clear_sn(vcpu_to_pi_desc(vcpu));
+}
+
+static void vmx_pi_set_sn(struct kvm_vcpu *vcpu)
+{
+	pi_set_sn(vcpu_to_pi_desc(vcpu));
+}
+
 static unsigned long shadow_read_only_fields[] = {
 	/*
 	 * We do NOT shadow fields that are modified when L0
@@ -10471,6 +10481,9 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
 	.get_pi_desc_addr = vmx_get_pi_desc_addr,
 
+	.pi_clear_sn = vmx_pi_clear_sn,
+	.pi_set_sn = vmx_pi_set_sn,
+
 	.pmu_ops = &intel_pmu_ops,
 };
 
-- 
2.1.0


* [PATCH v7 06/17] KVM: Make struct kvm_irq_routing_table accessible
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 05/17] KVM: Add interfaces to control PI outside vmx Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-09-11 10:50     ` Paolo Bonzini
  2015-08-25  8:50   ` [PATCH v7 07/17] KVM: make kvm_set_msi_irq() public Feng Wu
                     ` (9 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Move struct kvm_irq_routing_table from irqchip.c to kvm_host.h,
so we can use it outside of irqchip.c.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 include/linux/kvm_host.h | 14 ++++++++++++++
 virt/kvm/irqchip.c       | 10 ----------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5ac8d21..5f183fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -328,6 +328,20 @@ struct kvm_kernel_irq_routing_entry {
 	struct hlist_node link;
 };
 
+#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
+
+struct kvm_irq_routing_table {
+	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
+	u32 nr_rt_entries;
+	/*
+	 * Array indexed by gsi. Each entry contains list of irq chips
+	 * the gsi is connected to.
+	 */
+	struct hlist_head map[0];
+};
+
+#endif
+
 #ifndef KVM_PRIVATE_MEM_SLOTS
 #define KVM_PRIVATE_MEM_SLOTS 0
 #endif
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 21c1424..2cf45d3 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -31,16 +31,6 @@
 #include <trace/events/kvm.h>
 #include "irq.h"
 
-struct kvm_irq_routing_table {
-	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
-	u32 nr_rt_entries;
-	/*
-	 * Array indexed by gsi. Each entry contains list of irq chips
-	 * the gsi is connected to.
-	 */
-	struct hlist_head map[0];
-};
-
 int kvm_irq_map_gsi(struct kvm *kvm,
 		    struct kvm_kernel_irq_routing_entry *entries, int gsi)
 {
-- 
2.1.0


* [PATCH v7 07/17] KVM: make kvm_set_msi_irq() public
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 06/17] KVM: Make struct kvm_irq_routing_table accessible Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
       [not found]     ` <1440492620-15934-8-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices Feng Wu
                     ` (8 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Make kvm_set_msi_irq() public so that it can be used outside irq_comm.c.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 4 ++++
 arch/x86/kvm/irq_comm.c         | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c4f99f1..82d0709 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -175,6 +175,8 @@ enum {
  */
 #define KVM_APIC_PV_EOI_PENDING	1
 
+struct kvm_kernel_irq_routing_entry;
+
 /*
  * We don't want allocation failures within the mmu code, so we preallocate
  * enough memory for a single page fault in a cache.
@@ -1212,4 +1214,6 @@ int x86_set_memory_region(struct kvm *kvm,
 bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 			     struct kvm_vcpu **dest_vcpu);
 
+void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+		     struct kvm_lapic_irq *irq);
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index a9572a13..1319c60 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -91,8 +91,8 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 	return r;
 }
 
-static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
-				   struct kvm_lapic_irq *irq)
+void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
+		     struct kvm_lapic_irq *irq)
 {
 	trace_kvm_msi_set_irq(e->msi.address_lo, e->msi.data);
 
-- 
2.1.0


* [PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 07/17] KVM: make kvm_set_msi_irq() public Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
       [not found]     ` <1440492620-15934-9-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 09/17] vfio: Register/unregister irq_bypass_producer Feng Wu
                     ` (7 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Enable irq bypass manager for vfio PCI devices.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 drivers/vfio/pci/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 579d83b..02912f1 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PCI
 	tristate "VFIO support for PCI devices"
 	depends on VFIO && PCI && EVENTFD
 	select VFIO_VIRQFD
+	select IRQ_BYPASS_MANAGER
 	help
 	  Support for the PCI VFIO bus driver.  This is required to make
 	  use of PCI drivers using the VFIO framework.
-- 
2.1.0


* [PATCH v7 09/17] vfio: Register/unregister irq_bypass_producer
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-08-25  8:50   ` [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts Feng Wu
                     ` (6 subsequent siblings)
  14 siblings, 0 replies; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch adds the registration/unregistration of an
irq_bypass_producer for MSI/MSI-X on vfio pci devices.

v6:
- Make the add_consumer and del_consumer callbacks static
- Remove the pointless INIT_LIST_HEAD of 'vdev->ctx[vector].producer.node'
- Use dev_info instead of WARN_ON() when irq_bypass_register_producer fails
- Remove optional dummy callbacks for irq producer

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c   | 9 +++++++++
 drivers/vfio/pci/vfio_pci_private.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 1f577b4..c65299d 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -319,6 +319,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 
 	if (vdev->ctx[vector].trigger) {
 		free_irq(irq, vdev->ctx[vector].trigger);
+		irq_bypass_unregister_producer(&vdev->ctx[vector].producer);
 		kfree(vdev->ctx[vector].name);
 		eventfd_ctx_put(vdev->ctx[vector].trigger);
 		vdev->ctx[vector].trigger = NULL;
@@ -360,6 +361,14 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 		return ret;
 	}
 
+	vdev->ctx[vector].producer.token = trigger;
+	vdev->ctx[vector].producer.irq = irq;
+	ret = irq_bypass_register_producer(&vdev->ctx[vector].producer);
+	if (unlikely(ret))
+		dev_info(&pdev->dev,
+		"irq bypass producer (token %p) registration failed: %d\n",
+		vdev->ctx[vector].producer.token, ret);
+
 	vdev->ctx[vector].trigger = trigger;
 
 	return 0;
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index ae0e1b4..0e7394f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -13,6 +13,7 @@
 
 #include <linux/mutex.h>
 #include <linux/pci.h>
+#include <linux/irqbypass.h>
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -29,6 +30,7 @@ struct vfio_pci_irq_ctx {
 	struct virqfd		*mask;
 	char			*name;
 	bool			masked;
+	struct irq_bypass_producer	producer;
 };
 
 struct vfio_pci_device {
-- 
2.1.0


* [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 09/17] vfio: Register/unregister irq_bypass_producer Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-08-25 19:57     ` Alex Williamson
  2015-09-11 10:29     ` Paolo Bonzini
  2015-08-25  8:50   ` [PATCH v7 11/17] KVM: Define two weak arch callbacks for irq bypass manager Feng Wu
                     ` (5 subsequent siblings)
  14 siblings, 2 replies; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch adds the routine to update the IRTE for posted-interrupts
when the guest changes the interrupt configuration.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/kvm/x86.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ef2560..8f09a76 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -63,6 +63,7 @@
 #include <asm/fpu/internal.h> /* Ugh! */
 #include <asm/pvclock.h>
 #include <asm/div64.h>
+#include <asm/irq_remapping.h>
 
 #define MAX_IO_MSRS 256
 #define KVM_MAX_MCE_BANKS 32
@@ -8248,6 +8249,78 @@ bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
 }
 EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
 
+/*
+ * kvm_arch_update_pi_irte - set IRTE for Posted-Interrupts
+ *
+ * @kvm: kvm
+ * @host_irq: host irq of the interrupt
+ * @guest_irq: gsi of the interrupt
+ * @set: set or unset PI
+ * returns 0 on success, < 0 on failure
+ */
+int kvm_arch_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
+			    uint32_t guest_irq, bool set)
+{
+	struct kvm_kernel_irq_routing_entry *e;
+	struct kvm_irq_routing_table *irq_rt;
+	struct kvm_lapic_irq irq;
+	struct kvm_vcpu *vcpu;
+	struct vcpu_data vcpu_info;
+	int idx, ret = -EINVAL;
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return 0;
+
+	idx = srcu_read_lock(&kvm->irq_srcu);
+	irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
+	BUG_ON(guest_irq >= irq_rt->nr_rt_entries);
+
+	hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
+		if (e->type != KVM_IRQ_ROUTING_MSI)
+			continue;
+		/*
+		 * VT-d PI cannot support posting multicast/broadcast
+		 * interrupts to a VCPU, we still use interrupt remapping
+		 * for these kind of interrupts.
+		 *
+		 * For lowest-priority interrupts, we only support
+		 * those with single CPU as the destination, e.g. user
+		 * configures the interrupts via /proc/irq or uses
+		 * irqbalance to make the interrupts single-CPU.
+		 *
+		 * We will support full lowest-priority interrupt later.
+		 *
+		 */
+
+		kvm_set_msi_irq(e, &irq);
+		if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu))
+			continue;
+
+		vcpu_info.pi_desc_addr = kvm_x86_ops->get_pi_desc_addr(vcpu);
+		vcpu_info.vector = irq.vector;
+
+		if (set)
+			ret = irq_set_vcpu_affinity(host_irq, &vcpu_info);
+		else {
+			/* suppress notification event before unposting */
+			kvm_x86_ops->pi_set_sn(vcpu);
+			ret = irq_set_vcpu_affinity(host_irq, NULL);
+			kvm_x86_ops->pi_clear_sn(vcpu);
+		}
+
+		if (ret < 0) {
+			printk(KERN_INFO "%s: failed to update PI IRTE\n",
+					__func__);
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+	return ret;
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
-- 
2.1.0
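
The set/unset branches in kvm_arch_update_pi_irte() can be illustrated with a
toy model. This is a sketch under loud assumptions: in the patch the IRTE is
changed by irq_set_vcpu_affinity(), while here struct model_irte,
struct model_vcpu and model_update_pi_irte() are hypothetical stand-ins that
only record the resulting state and the order of the SN toggling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum model_irte_mode { MODEL_IRTE_REMAPPED, MODEL_IRTE_POSTED };

/* Hypothetical view of one interrupt remapping table entry. */
struct model_irte {
	enum model_irte_mode mode;
	uint64_t pi_desc_addr;	/* only meaningful in posted mode */
	uint8_t vector;		/* guest vector in posted mode */
};

/* Hypothetical per-vCPU posted-interrupt state. */
struct model_vcpu {
	uint64_t pi_desc_addr;
	bool sn;			/* Suppress Notification */
	unsigned int sn_windows;	/* how often SN was raised and dropped */
};

/* Switch one entry to posted mode (set) or back to remapped mode (!set). */
static int model_update_pi_irte(struct model_irte *irte,
				struct model_vcpu *vcpu,
				uint8_t vector, bool set)
{
	if (set) {
		irte->mode = MODEL_IRTE_POSTED;
		irte->pi_desc_addr = vcpu->pi_desc_addr;
		irte->vector = vector;
		return 0;
	}

	/* Suppress notification events while the entry is being unposted. */
	vcpu->sn = true;
	irte->mode = MODEL_IRTE_REMAPPED;
	irte->pi_desc_addr = 0;
	vcpu->sn = false;
	vcpu->sn_windows++;

	return 0;
}
```

The point of the ordering is that SN is set before the entry leaves posted
mode and cleared only afterwards, mirroring the pi_set_sn()/pi_clear_sn()
pair around irq_set_vcpu_affinity(host_irq, NULL) in the patch.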


* [PATCH v7 11/17] KVM: Define two weak arch callbacks for irq bypass manager
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
       [not found]     ` <1440492620-15934-12-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 12/17] KVM: Implement IRQ bypass consumer callbacks for x86 Feng Wu
                     ` (4 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Define two weak arch callbacks so that architectures that don't
need them don't have to define them.

Signed-off-by: Feng Wu <feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 virt/kvm/eventfd.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index d7a230f..f3050b9 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -256,6 +256,16 @@ static void irqfd_update(struct kvm *kvm, struct kvm_kernel_irqfd *irqfd)
 	write_seqcount_end(&irqfd->irq_entry_sc);
 }
 
+void __attribute__((weak)) kvm_arch_irq_bypass_stop(
+				struct irq_bypass_consumer *cons)
+{
+}
+
+void __attribute__((weak)) kvm_arch_irq_bypass_start(
+				struct irq_bypass_consumer *cons)
+{
+}
+
 static int
 kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 {
-- 
2.1.0
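
For readers unfamiliar with weak symbols, here is a minimal userspace
sketch of the pattern this patch relies on, assuming GCC or Clang on an
ELF target. Every *_sketch name is hypothetical and not part of the
series. The generic file supplies an empty weak default; an architecture
that wants the hook provides a normal (strong) definition in another
object file, and the linker then prefers that one.

```c
/* Hypothetical stand-in for struct irq_bypass_consumer. */
struct consumer_sketch {
	int stop_calls;	/* how many times the generic stop path ran */
};

/*
 * Weak default: does nothing. A strong definition of the same symbol
 * in another object file would replace it at link time.
 */
void __attribute__((weak)) arch_bypass_stop_sketch(struct consumer_sketch *cons)
{
	(void)cons;
}

/* Generic code can call the hook unconditionally, with no #ifdef. */
void generic_bypass_stop_sketch(struct consumer_sketch *cons)
{
	arch_bypass_stop_sketch(cons);
	cons->stop_calls++;
}
```

The design choice is that callers never test whether the hook exists;
the empty default makes the call a no-op on architectures that do not
override it.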


* [PATCH v7 12/17] KVM: Implement IRQ bypass consumer callbacks for x86
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (9 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 11/17] KVM: Define two weak arch callbacks for irq bypass manager Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-09-11 10:31     ` Paolo Bonzini
  2015-08-25  8:50   ` [PATCH v7 13/17] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd' Feng Wu
                     ` (3 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Implement the following callbacks for x86:

- kvm_arch_irq_bypass_add_producer
- kvm_arch_irq_bypass_del_producer

and set CONFIG_HAVE_KVM_IRQ_BYPASS for x86.

Signed-off-by: Feng Wu <feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/Kconfig            |  1 +
 arch/x86/kvm/x86.c              | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 82d0709..3038c1b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -24,6 +24,7 @@
 #include <linux/perf_event.h>
 #include <linux/pvclock_gtod.h>
 #include <linux/clocksource.h>
+#include <linux/irqbypass.h>
 
 #include <asm/pvclock-abi.h>
 #include <asm/desc.h>
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index c951d44..b90776f 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -30,6 +30,7 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQFD
 	select IRQ_BYPASS_MANAGER
+	select HAVE_KVM_IRQ_BYPASS
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_EVENTFD
 	select KVM_APIC_ARCHITECTURE
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f09a76..be4b561 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -50,6 +50,8 @@
 #include <linux/pci.h>
 #include <linux/timekeeper_internal.h>
 #include <linux/pvclock_gtod.h>
+#include <linux/kvm_irqfd.h>
+#include <linux/irqbypass.h>
 #include <trace/events/kvm.h>
 
 #define CREATE_TRACE_POINTS
@@ -8321,6 +8323,38 @@ out:
 	return ret;
 }
 
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	irqfd->producer = prod;
+
+	return kvm_arch_update_pi_irte(irqfd->kvm, prod->irq, irqfd->gsi, 1);
+}
+
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+				      struct irq_bypass_producer *prod)
+{
+	int ret;
+	struct kvm_kernel_irqfd *irqfd =
+		container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+	irqfd->producer = NULL;
+
+	/*
+	 * When the producer of a consumer is unregistered, we change
+	 * back to remapped mode, so we can re-use the current
+	 * implementation when the irq is masked/disabled or the
+	 * consumer side (KVM in this case) doesn't want to receive
+	 * the interrupts.
+	 */
+	ret = kvm_arch_update_pi_irte(irqfd->kvm, prod->irq, irqfd->gsi, 0);
+	if (ret)
+		printk(KERN_INFO "irq bypass consumer (token %p) unregistration"
+		       " fails: %d\n", irqfd->consumer.token, ret);
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
-- 
2.1.0
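
Both callbacks recover the enclosing irqfd from the embedded consumer
with container_of(). A minimal userspace sketch of that idiom follows,
assuming a simplified macro without the kernel's type checking. The
structs and function names are hypothetical stand-ins, and the IRTE
update itself is left out.

```c
#include <stddef.h>

/* Simplified container_of; the kernel macro adds type checking. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct producer_sketch { int irq; };
struct consumer_sketch { void *token; };

struct irqfd_sketch {
	int gsi;
	struct consumer_sketch consumer;	/* embedded, not a pointer */
	struct producer_sketch *producer;	/* NULL when no producer */
};

/* Same shape as add_producer: recover the irqfd, record the producer. */
int add_producer_sketch(struct consumer_sketch *cons,
			struct producer_sketch *prod)
{
	struct irqfd_sketch *irqfd =
		container_of_sketch(cons, struct irqfd_sketch, consumer);

	irqfd->producer = prod;
	return 0;
}

/* Same shape as del_producer: forget the producer again. */
void del_producer_sketch(struct consumer_sketch *cons,
			 struct producer_sketch *prod)
{
	struct irqfd_sketch *irqfd =
		container_of_sketch(cons, struct irqfd_sketch, consumer);

	(void)prod;
	irqfd->producer = NULL;
}
```

This only works because the consumer is embedded by value; a pointer
member would not let the callback find its irqfd.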


* [PATCH v7 13/17] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (10 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 12/17] KVM: Implement IRQ bypass consumer callbacks for x86 Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-09-11 10:39     ` Paolo Bonzini
  2015-08-25  8:50   ` [PATCH v7 14/17] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted Feng Wu
                     ` (2 subsequent siblings)
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch adds an arch-specific hook 'arch_update' to
'struct kvm_kernel_irqfd'. On Intel, it is used to
update the IRTE when VT-d posted-interrupts is used.

Signed-off-by: Feng Wu <feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/x86.c              |  5 +++++
 include/linux/kvm_host.h        | 11 +++++++++++
 include/linux/kvm_irqfd.h       |  2 ++
 virt/kvm/eventfd.c              | 12 +++++++++++-
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3038c1b..22269b4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -176,6 +176,8 @@ enum {
  */
 #define KVM_APIC_PV_EOI_PENDING	1
 
+#define __KVM_HAVE_ARCH_IRQFD_INIT 1
+
 struct kvm_kernel_irq_routing_entry;
 
 /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index be4b561..ef93fdc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8355,6 +8355,11 @@ void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
 		       " fails: %d\n", irqfd->consumer.token, ret);
 }
 
+void kvm_arch_irqfd_init(struct kvm_kernel_irqfd *irqfd)
+{
+	irqfd->arch_update = kvm_arch_update_pi_irte;
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5f183fb..f4005dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -34,6 +34,8 @@
 
 #include <asm/kvm_host.h>
 
+struct kvm_kernel_irqfd;
+
 /*
  * The bit 16 ~ bit 31 of kvm_memory_region::flags are internally used
  * in kvm, other bits are visible for userspace which are defined in
@@ -1145,6 +1147,15 @@ extern struct kvm_device_ops kvm_xics_ops;
 extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
 extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
 
+#ifdef __KVM_HAVE_ARCH_IRQFD_INIT
+void kvm_arch_irqfd_init(struct kvm_kernel_irqfd *irqfd);
+#else
+static inline void kvm_arch_irqfd_init(struct kvm_kernel_irqfd *irqfd)
+{
+	irqfd->arch_update = NULL;
+}
+#endif
+
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
 static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
index 0c1de05..b7aab52 100644
--- a/include/linux/kvm_irqfd.h
+++ b/include/linux/kvm_irqfd.h
@@ -66,6 +66,8 @@ struct kvm_kernel_irqfd {
 	struct work_struct shutdown;
 	struct irq_bypass_consumer consumer;
 	struct irq_bypass_producer *producer;
+	int (*arch_update)(struct kvm *kvm, unsigned int host_irq,
+			   uint32_t guest_irq, bool set);
 };
 
 #endif /* __LINUX_KVM_IRQFD_H */
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index f3050b9..b2d9066 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -288,6 +288,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 	INIT_LIST_HEAD(&irqfd->list);
 	INIT_WORK(&irqfd->inject, irqfd_inject);
 	INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
+	kvm_arch_irqfd_init(irqfd);
 	seqcount_init(&irqfd->irq_entry_sc);
 
 	f = fdget(args->fd);
@@ -580,13 +581,22 @@ kvm_irqfd_release(struct kvm *kvm)
  */
 void kvm_irq_routing_update(struct kvm *kvm)
 {
+	int ret;
 	struct kvm_kernel_irqfd *irqfd;
 
 	spin_lock_irq(&kvm->irqfds.lock);
 
-	list_for_each_entry(irqfd, &kvm->irqfds.items, list)
+	list_for_each_entry(irqfd, &kvm->irqfds.items, list) {
 		irqfd_update(kvm, irqfd);
 
+		if (irqfd->arch_update && irqfd->producer) {
+			ret = irqfd->arch_update(
+					irqfd->kvm, irqfd->producer->irq,
+					irqfd->gsi, 1);
+			WARN_ON(ret);
+		}
+	}
+
 	spin_unlock_irq(&kvm->irqfds.lock);
 }
 
-- 
2.1.0
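
The routing-update loop above only invokes the hook when both the hook
and a producer are present. A minimal userspace sketch of that guard,
using an array instead of the kernel list and no locking, is given
below. All names are hypothetical, and the counting hook exists only so
the behaviour can be observed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct producer_sketch { unsigned int irq; };

struct irqfd_sketch {
	uint32_t gsi;
	struct producer_sketch *producer;	/* NULL until registered */
	int (*arch_update)(unsigned int host_irq, uint32_t guest_irq,
			   bool set);		/* NULL if arch has no hook */
};

/* Example hook that just records what it was called with. */
unsigned int last_host_irq_sketch;
int hook_calls_sketch;

int counting_hook_sketch(unsigned int host_irq, uint32_t guest_irq, bool set)
{
	(void)guest_irq;
	(void)set;
	last_host_irq_sketch = host_irq;
	hook_calls_sketch++;
	return 0;
}

/* Returns how many hook invocations reported failure. */
int routing_update_sketch(struct irqfd_sketch *fds, size_t n)
{
	int failures = 0;

	for (size_t i = 0; i < n; i++) {
		struct irqfd_sketch *fd = &fds[i];

		/* Skip entries with no hook or no registered producer. */
		if (fd->arch_update && fd->producer &&
		    fd->arch_update(fd->producer->irq, fd->gsi, true))
			failures++;
	}
	return failures;
}
```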


* [PATCH v7 14/17] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (11 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 13/17] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd' Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
       [not found]     ` <1440492620-15934-15-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2015-08-25  8:50   ` [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked Feng Wu
  2015-08-25  8:50   ` [PATCH v7 16/17] KVM: Warn if 'SN' is set during posting interrupts by software Feng Wu
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch updates the Posted-Interrupts Descriptor when vCPU
is preempted.

sched out:
- Set 'SN' to suppress future non-urgent interrupts posted for
the vCPU.

sched in:
- Clear 'SN'
- Change NDST if vCPU is scheduled to a different CPU
- Set 'NV' to POSTED_INTR_VECTOR

Signed-off-by: Feng Wu <feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 arch/x86/kvm/vmx.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 234f720..9c87064 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -45,6 +45,7 @@
 #include <asm/debugreg.h>
 #include <asm/kexec.h>
 #include <asm/apic.h>
+#include <asm/irq_remapping.h>
 
 #include "trace.h"
 #include "pmu.h"
@@ -2001,10 +2002,60 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
 		vmx->loaded_vmcs->cpu = cpu;
 	}
+
+	if (irq_remapping_cap(IRQ_POSTING_CAP)) {
+		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+		struct pi_desc old, new;
+		unsigned int dest;
+
+		do {
+			old.control = new.control = pi_desc->control;
+
+			/*
+			 * If 'nv' field is POSTED_INTR_WAKEUP_VECTOR, there
+			 * are two possible cases:
+			 * 1. After running 'pi_pre_block', context switch
+			 *    happened. For this case, 'sn' was set in
+			 *    vmx_vcpu_put(), so we need to clear it here.
+			 * 2. After running 'pi_pre_block', we were blocked,
+			 *    and woken up by some other guy. For this case,
+			 *    we don't need to do anything, 'pi_post_block'
+			 *    will do everything for us. However, we cannot
+			 *    check whether it is case #1 or case #2 here
+			 *    (maybe, not needed), so we also clear sn here,
+			 *    I think it is not a big deal.
+			 */
+			if (pi_desc->nv != POSTED_INTR_WAKEUP_VECTOR) {
+				if (vcpu->cpu != cpu) {
+					dest = cpu_physical_id(cpu);
+
+					if (x2apic_enabled())
+						new.ndst = dest;
+					else
+						new.ndst = (dest << 8) & 0xFF00;
+				}
+
+				/* set 'NV' to 'notification vector' */
+				new.nv = POSTED_INTR_VECTOR;
+			}
+
+			/* Allow posting non-urgent interrupts */
+			new.sn = 0;
+		} while (cmpxchg(&pi_desc->control, old.control,
+				new.control) != old.control);
+	}
 }
 
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	if (irq_remapping_cap(IRQ_POSTING_CAP)) {
+		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+		/* Set SN when the vCPU is preempted */
+		if (vcpu->preempted)
+			pi_set_sn(pi_desc);
+	}
+
 	__vmx_load_host_state(to_vmx(vcpu));
 	if (!vmm_exclusive) {
 		__loaded_vmcs_clear(to_vmx(vcpu)->loaded_vmcs);
-- 
2.1.0
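
The sched-in path is a read-modify-write retry loop on the 64-bit
control word of the descriptor. Below is a userspace sketch of the same
loop with C11 atomics in place of the kernel's cmpxchg(). The bit
positions of ON, SN, NV and NDST and the two vector numbers are
assumptions made for illustration, since this patch does not show the
descriptor layout; all *_sketch names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the control word (not taken from this patch). */
#define PI_SN_SKETCH		(1ULL << 1)
#define PI_NV_SHIFT_SKETCH	16
#define PI_NV_MASK_SKETCH	(0xFFULL << PI_NV_SHIFT_SKETCH)
#define PI_NDST_SHIFT_SKETCH	32
#define PI_NDST_MASK_SKETCH	(0xFFFFFFFFULL << PI_NDST_SHIFT_SKETCH)

/* Placeholder vector numbers. */
#define NOTIFY_VECTOR_SKETCH	0xf2u
#define WAKEUP_VECTOR_SKETCH	0xf1u

static uint64_t set_field_sketch(uint64_t word, uint64_t mask,
				 unsigned int shift, uint64_t val)
{
	return (word & ~mask) | ((val << shift) & mask);
}

/* xAPIC mode keeps the APIC ID in bits 15:8 of NDST, as in the patch. */
uint32_t ndst_for_sketch(uint32_t apic_id, bool x2apic)
{
	return x2apic ? apic_id : (apic_id << 8) & 0xFF00;
}

/* sched out: suppress notifications while the vCPU is not running. */
void pi_sched_out_sketch(_Atomic uint64_t *control)
{
	atomic_fetch_or(control, PI_SN_SKETCH);
}

/* sched in: retarget NDST/NV unless a wakeup is armed, then clear SN. */
void pi_sched_in_sketch(_Atomic uint64_t *control, uint32_t apic_id,
			bool x2apic, bool cpu_changed)
{
	uint64_t old = atomic_load(control);
	uint64_t new;

	do {
		new = old;
		if (((old & PI_NV_MASK_SKETCH) >> PI_NV_SHIFT_SKETCH) !=
		    WAKEUP_VECTOR_SKETCH) {
			if (cpu_changed)
				new = set_field_sketch(new, PI_NDST_MASK_SKETCH,
						PI_NDST_SHIFT_SKETCH,
						ndst_for_sketch(apic_id, x2apic));
			new = set_field_sketch(new, PI_NV_MASK_SKETCH,
					PI_NV_SHIFT_SKETCH,
					NOTIFY_VECTOR_SKETCH);
		}
		new &= ~PI_SN_SKETCH;	/* allow non-urgent posting again */
		/* On failure, old is reloaded and the loop recomputes new. */
	} while (!atomic_compare_exchange_weak(control, &old, new));
}
```

The retry loop matters because hardware may set bits in the same word
concurrently; a plain store could lose such an update.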


* [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (12 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 14/17] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
  2015-09-11 11:21     ` Paolo Bonzini
  2015-08-25  8:50   ` [PATCH v7 16/17] KVM: Warn if 'SN' is set during posting interrupts by software Feng Wu
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

This patch updates the Posted-Interrupts Descriptor when vCPU
is blocked.

pre-block:
- Add the vCPU to the blocked per-CPU list
- Set 'NV' to POSTED_INTR_WAKEUP_VECTOR

post-block:
- Remove the vCPU from the per-CPU list

Signed-off-by: Feng Wu <feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 arch/x86/include/asm/kvm_host.h |   5 ++
 arch/x86/kvm/vmx.c              | 151 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |  55 ++++++++++++---
 include/linux/kvm_host.h        |   3 +
 virt/kvm/kvm_main.c             |   3 +
 5 files changed, 207 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 22269b4..32af275 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -554,6 +554,8 @@ struct kvm_vcpu_arch {
 	 */
 	bool write_fault_to_shadow_pgtable;
 
+	bool halted;
+
 	/* set at EPT violation at this point */
 	unsigned long exit_qualification;
 
@@ -868,6 +870,9 @@ struct kvm_x86_ops {
 
 	void (*pi_clear_sn)(struct kvm_vcpu *vcpu);
 	void (*pi_set_sn)(struct kvm_vcpu *vcpu);
+
+	int (*pi_pre_block)(struct kvm_vcpu *vcpu);
+	void (*pi_post_block)(struct kvm_vcpu *vcpu);
 	/* pmu operations of sub-arch */
 	const struct kvm_pmu_ops *pmu_ops;
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9c87064..64e35ea 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -888,6 +888,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
 static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
 static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
 
+/*
+ * We maintain a per-CPU linked list of vCPUs, so in wakeup_handler() we
+ * can find which vCPU should be woken up.
+ */
+static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu);
+static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock);
+
 static unsigned long *vmx_io_bitmap_a;
 static unsigned long *vmx_io_bitmap_b;
 static unsigned long *vmx_msr_bitmap_legacy;
@@ -2981,6 +2988,8 @@ static int hardware_enable(void)
 		return -EBUSY;
 
 	INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
+	INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
+	spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
 
 	/*
 	 * Now we can enable the vmclear operation in kdump
@@ -6106,6 +6115,25 @@ static void update_ple_window_actual_max(void)
 			                    ple_window_grow, INT_MIN);
 }
 
+/*
+ * Handler for POSTED_INTR_WAKEUP_VECTOR.
+ */
+static void wakeup_handler(void)
+{
+	struct kvm_vcpu *vcpu;
+	int cpu = smp_processor_id();
+
+	spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
+	list_for_each_entry(vcpu, &per_cpu(blocked_vcpu_on_cpu, cpu),
+			blocked_vcpu_list) {
+		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+		if (pi_test_on(pi_desc) == 1)
+			kvm_vcpu_kick(vcpu);
+	}
+	spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
+}
+
 static __init int hardware_setup(void)
 {
 	int r = -ENOMEM, i, msr;
@@ -6290,6 +6318,8 @@ static __init int hardware_setup(void)
 		kvm_x86_ops->enable_log_dirty_pt_masked = NULL;
 	}
 
+	kvm_set_posted_intr_wakeup_handler(wakeup_handler);
+
 	return alloc_kvm_area();
 
 out8:
@@ -10414,6 +10444,124 @@ static void vmx_enable_log_dirty_pt_masked(struct kvm *kvm,
 	kvm_mmu_clear_dirty_pt_masked(kvm, memslot, offset, mask);
 }
 
+/*
+ * This routine does the following things for vCPU which is going
+ * to be blocked if VT-d PI is enabled.
+ * - Store the vCPU to the wakeup list, so when interrupts happen
+ *   we can find the right vCPU to wake up.
+ * - Change the Posted-interrupt descriptor as below:
+ *      'NDST' <-- vcpu->pre_pcpu
+ *      'NV' <-- POSTED_INTR_WAKEUP_VECTOR
+ * - If 'ON' is set during this process, which means at least one
+ *   interrupt is posted for this vCPU, we cannot block it, in
+ *   this case, return 1, otherwise, return 0.
+ *
+ */
+static int vmx_pi_pre_block(struct kvm_vcpu *vcpu)
+{
+	unsigned long flags;
+	unsigned int dest;
+	struct pi_desc old, new;
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return 0;
+
+	vcpu->pre_pcpu = vcpu->cpu;
+	spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
+			  vcpu->pre_pcpu), flags);
+	list_add_tail(&vcpu->blocked_vcpu_list,
+		      &per_cpu(blocked_vcpu_on_cpu,
+		      vcpu->pre_pcpu));
+	spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock,
+			       vcpu->pre_pcpu), flags);
+
+	do {
+		old.control = new.control = pi_desc->control;
+
+		/*
+		 * We should not block the vCPU if
+		 * an interrupt is posted for it.
+		 */
+		if (pi_test_on(pi_desc) == 1) {
+			spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
+					  vcpu->pre_pcpu), flags);
+			list_del(&vcpu->blocked_vcpu_list);
+			spin_unlock_irqrestore(
+					&per_cpu(blocked_vcpu_on_cpu_lock,
+					vcpu->pre_pcpu), flags);
+			vcpu->pre_pcpu = -1;
+
+			return 1;
+		}
+
+		WARN((pi_desc->sn == 1),
+		     "Warning: SN field of posted-interrupts "
+		     "is set before blocking\n");
+
+		/*
+		 * Since the vCPU can be preempted during this process,
+		 * vcpu->cpu could differ from pre_pcpu. We need to set
+		 * pre_pcpu as the destination of the wakeup notification
+		 * event, so that the wakeup handler can find the right
+		 * vCPU to wake up if interrupts arrive while the vCPU
+		 * is blocked.
+		 */
+		dest = cpu_physical_id(vcpu->pre_pcpu);
+
+		if (x2apic_enabled())
+			new.ndst = dest;
+		else
+			new.ndst = (dest << 8) & 0xFF00;
+
+		/* set 'NV' to 'wakeup vector' */
+		new.nv = POSTED_INTR_WAKEUP_VECTOR;
+	} while (cmpxchg(&pi_desc->control, old.control,
+			new.control) != old.control);
+
+	return 0;
+}
+
+static void vmx_pi_post_block(struct kvm_vcpu *vcpu)
+{
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+	struct pi_desc old, new;
+	unsigned int dest;
+	unsigned long flags;
+
+	if (!irq_remapping_cap(IRQ_POSTING_CAP))
+		return;
+
+	do {
+		old.control = new.control = pi_desc->control;
+
+		dest = cpu_physical_id(vcpu->cpu);
+
+		if (x2apic_enabled())
+			new.ndst = dest;
+		else
+			new.ndst = (dest << 8) & 0xFF00;
+
+		/* Allow posting non-urgent interrupts */
+		new.sn = 0;
+
+		/* set 'NV' to 'notification vector' */
+		new.nv = POSTED_INTR_VECTOR;
+	} while (cmpxchg(&pi_desc->control, old.control,
+			new.control) != old.control);
+
+	if (vcpu->pre_pcpu != -1) {
+		spin_lock_irqsave(
+			&per_cpu(blocked_vcpu_on_cpu_lock,
+			vcpu->pre_pcpu), flags);
+		list_del(&vcpu->blocked_vcpu_list);
+		spin_unlock_irqrestore(
+			&per_cpu(blocked_vcpu_on_cpu_lock,
+			vcpu->pre_pcpu), flags);
+		vcpu->pre_pcpu = -1;
+	}
+}
+
 static struct kvm_x86_ops vmx_x86_ops = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -10535,6 +10683,9 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.pi_clear_sn = vmx_pi_clear_sn,
 	.pi_set_sn = vmx_pi_set_sn,
 
+	.pi_pre_block = vmx_pi_pre_block,
+	.pi_post_block = vmx_pi_post_block,
+
 	.pmu_ops = &intel_pmu_ops,
 };
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ef93fdc..fc7f222 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5869,7 +5869,13 @@ int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.halt_exits;
 	if (irqchip_in_kernel(vcpu->kvm)) {
-		vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
+		/* Handle posted-interrupt when vCPU is to be halted */
+		if (!kvm_x86_ops->pi_pre_block ||
+		    (kvm_x86_ops->pi_pre_block &&
+		    kvm_x86_ops->pi_pre_block(vcpu) == 0)) {
+			vcpu->arch.halted = true;
+			vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
+		}
 		return 1;
 	} else {
 		vcpu->run->exit_reason = KVM_EXIT_HLT;
@@ -6518,6 +6524,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_vcpu_reload_apic_access_page(vcpu);
 	}
 
+	/*
+	 * Since posted-interrupts can be set by VT-d HW now, in this
+	 * case, KVM_REQ_EVENT is not set. We move the following
+	 * operations out of the if statement.
+	 */
+	if (kvm_lapic_enabled(vcpu)) {
+		/*
+		 * Update architecture specific hints for APIC
+		 * virtual interrupt delivery.
+		 */
+		if (kvm_x86_ops->hwapic_irr_update)
+			kvm_x86_ops->hwapic_irr_update(vcpu,
+				kvm_lapic_find_highest_irr(vcpu));
+	}
+
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
 		kvm_apic_accept_events(vcpu);
 		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
@@ -6534,13 +6555,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_x86_ops->enable_irq_window(vcpu);
 
 		if (kvm_lapic_enabled(vcpu)) {
-			/*
-			 * Update architecture specific hints for APIC
-			 * virtual interrupt delivery.
-			 */
-			if (kvm_x86_ops->hwapic_irr_update)
-				kvm_x86_ops->hwapic_irr_update(vcpu,
-					kvm_lapic_find_highest_irr(vcpu));
 			update_cr8_intercept(vcpu);
 			kvm_lapic_sync_to_vapic(vcpu);
 		}
@@ -6711,10 +6725,31 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 
 	for (;;) {
 		if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
-		    !vcpu->arch.apf.halted)
+		    !vcpu->arch.apf.halted) {
+			/*
+			 * For some cases, we can get here with
+			 * vcpu->arch.halted being true.
+			 */
+			if (kvm_x86_ops->pi_post_block && vcpu->arch.halted) {
+				kvm_x86_ops->pi_post_block(vcpu);
+				vcpu->arch.halted = false;
+			}
+
 			r = vcpu_enter_guest(vcpu);
-		else
+		} else {
 			r = vcpu_block(kvm, vcpu);
+
+			/*
+			 * pi_post_block() must be called after
+			 * pi_pre_block() which is called in
+			 * kvm_vcpu_halt().
+			 */
+			if (kvm_x86_ops->pi_post_block && vcpu->arch.halted) {
+				kvm_x86_ops->pi_post_block(vcpu);
+				vcpu->arch.halted = false;
+			}
+		}
+
 		if (r <= 0)
 			break;
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4005dc..6aa69f4 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -233,6 +233,9 @@ struct kvm_vcpu {
 	unsigned long requests;
 	unsigned long guest_debug;
 
+	int pre_pcpu;
+	struct list_head blocked_vcpu_list;
+
 	struct mutex mutex;
 	struct kvm_run *run;
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b8a444..191c7eb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -220,6 +220,9 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	init_waitqueue_head(&vcpu->wq);
 	kvm_async_pf_vcpu_init(vcpu);
 
+	vcpu->pre_pcpu = -1;
+	INIT_LIST_HEAD(&vcpu->blocked_vcpu_list);
+
 	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 	if (!page) {
 		r = -ENOMEM;
-- 
2.1.0
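
A userspace sketch of the pre-block/post-block protocol described in
this patch, under loud simplifications: the per-CPU list and its
spinlock are reduced to the pre_pcpu field, C11 atomics stand in for
cmpxchg(), and the bit positions and vector numbers are assumptions
for illustration only. All *_sketch names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the control word (not taken from this patch). */
#define PI_ON_SKETCH		(1ULL << 0)
#define PI_SN_SKETCH		(1ULL << 1)
#define PI_NV_SHIFT_SKETCH	16
#define PI_NV_MASK_SKETCH	(0xFFULL << PI_NV_SHIFT_SKETCH)
#define PI_NDST_SHIFT_SKETCH	32
#define PI_NDST_MASK_SKETCH	(0xFFFFFFFFULL << PI_NDST_SHIFT_SKETCH)
#define NOTIFY_VECTOR_SKETCH	0xf2ULL	/* placeholder value */
#define WAKEUP_VECTOR_SKETCH	0xf1ULL	/* placeholder value */

struct vcpu_sketch {
	_Atomic uint64_t control;
	int pre_pcpu;	/* -1 when not on any wakeup list */
};

static uint64_t retarget_sketch(uint64_t word, uint64_t nv, uint32_t ndst)
{
	word &= ~(PI_NV_MASK_SKETCH | PI_NDST_MASK_SKETCH);
	return word | (nv << PI_NV_SHIFT_SKETCH) |
	       ((uint64_t)ndst << PI_NDST_SHIFT_SKETCH);
}

/* Returns 1 if an interrupt is already posted and we must not block. */
int pi_pre_block_sketch(struct vcpu_sketch *v, int cpu, uint32_t ndst)
{
	uint64_t old = atomic_load(&v->control);
	uint64_t new;

	v->pre_pcpu = cpu;		/* stands in for list_add_tail() */
	do {
		if (old & PI_ON_SKETCH) {
			v->pre_pcpu = -1;	/* stands in for list_del() */
			return 1;
		}
		new = retarget_sketch(old, WAKEUP_VECTOR_SKETCH, ndst);
	} while (!atomic_compare_exchange_weak(&v->control, &old, new));

	return 0;
}

/* Restore the normal notification vector and leave the wakeup list. */
void pi_post_block_sketch(struct vcpu_sketch *v, uint32_t ndst)
{
	uint64_t old = atomic_load(&v->control);
	uint64_t new;

	do {
		new = retarget_sketch(old, NOTIFY_VECTOR_SKETCH, ndst);
		new &= ~PI_SN_SKETCH;
	} while (!atomic_compare_exchange_weak(&v->control, &old, new));

	v->pre_pcpu = -1;
}

/* What the wakeup handler checks for each vCPU queued on this CPU. */
bool should_kick_sketch(struct vcpu_sketch *v, int cpu)
{
	return v->pre_pcpu == cpu &&
	       (atomic_load(&v->control) & PI_ON_SKETCH);
}
```

Checking ON inside the retry loop is what closes the race: if an
interrupt is posted between queuing the vCPU and arming the wakeup
vector, pre-block backs out instead of sleeping.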


* [PATCH v7 16/17] KVM: Warn if 'SN' is set during posting interrupts by software
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (13 preceding siblings ...)
  2015-08-25  8:50   ` [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked Feng Wu
@ 2015-08-25  8:50   ` Feng Wu
       [not found]     ` <1440492620-15934-17-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  14 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini-H+wXaHxf7aLQT0dZR+AlfA,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A

Currently, we don't support urgent interrupts; all interrupts
are treated as non-urgent, so we cannot post interrupts when
'SN' is set.

If the vcpu is in guest mode, it cannot have been scheduled out,
which is currently the only case in which SN is set, so warn if
SN is set.

Signed-off-by: Feng Wu <feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 arch/x86/kvm/vmx.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 64e35ea..eb640a1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4494,6 +4494,22 @@ static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_SMP
 	if (vcpu->mode == IN_GUEST_MODE) {
+		struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+		/*
+		 * Currently, we don't support urgent interrupts;
+		 * all interrupts are treated as non-urgent
+		 * interrupts, so we cannot post interrupts when
+		 * 'SN' is set.
+		 *
+		 * If the vcpu is in guest mode, it is running
+		 * rather than being scheduled out and waiting in
+		 * the run queue. Being scheduled out is currently
+		 * the only case in which 'SN' is set, so warn if
+		 * 'SN' is set here.
+		 */
+		WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc));
+
 		apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
 				POSTED_INTR_VECTOR);
 		return true;
-- 
2.1.0


* [PATCH v7 17/17] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
  2015-08-25  8:50 [PATCH v7 00/17] Add VT-d Posted-Interrupts support Feng Wu
  2015-08-25  8:50 ` [PATCH v7 04/17] KVM: Get Posted-Interrupts descriptor address from 'struct kvm_vcpu' Feng Wu
       [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-08-25  8:50 ` Feng Wu
  2015-09-11 11:05   ` Paolo Bonzini
  2 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2015-08-25  8:50 UTC (permalink / raw)
  To: pbonzini, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel, Feng Wu

Enable VT-d Posted-Interrupts and add a command line
parameter for it.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 Documentation/kernel-parameters.txt |  1 +
 drivers/iommu/irq_remapping.c       | 12 ++++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1d6f045..52aca36 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1547,6 +1547,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			nosid	disable Source ID checking
 			no_x2apic_optout
 				BIOS x2APIC opt-out request will be ignored
+			nopost	disable Interrupt Posting
 
 	iomem=		Disable strict checking of access to MMIO memory
 		strict	regions from userspace.
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 2d99930..d8c3997 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -22,7 +22,7 @@ int irq_remap_broken;
 int disable_sourceid_checking;
 int no_x2apic_optout;
 
-int disable_irq_post = 1;
+int disable_irq_post = 0;
 
 static int disable_irq_remap;
 static struct irq_remap_ops *remap_ops;
@@ -58,14 +58,18 @@ static __init int setup_irqremap(char *str)
 		return -EINVAL;
 
 	while (*str) {
-		if (!strncmp(str, "on", 2))
+		if (!strncmp(str, "on", 2)) {
 			disable_irq_remap = 0;
-		else if (!strncmp(str, "off", 3))
+			disable_irq_post = 0;
+		} else if (!strncmp(str, "off", 3)) {
 			disable_irq_remap = 1;
-		else if (!strncmp(str, "nosid", 5))
+			disable_irq_post = 1;
+		} else if (!strncmp(str, "nosid", 5))
 			disable_sourceid_checking = 1;
 		else if (!strncmp(str, "no_x2apic_optout", 16))
 			no_x2apic_optout = 1;
+		else if (!strncmp(str, "nopost", 6))
+			disable_irq_post = 1;
 
 		str += strcspn(str, ",");
 		while (*str == ',')
-- 
2.1.0
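
To see how the option tokens interact, here is a userspace sketch of
the parsing loop from setup_irqremap() as modified by this patch. The
struct and function names are hypothetical; the token matching and the
comma skipping follow the diff above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical bundle of the flags the real code keeps as globals. */
struct intremap_opts_sketch {
	bool disable_remap;
	bool disable_post;
	bool disable_sid_check;
	bool no_x2apic_optout;
};

/* Parse a comma-separated option string; returns -1 on NULL input. */
int parse_intremap_sketch(const char *str, struct intremap_opts_sketch *o)
{
	if (!str)
		return -1;

	while (*str) {
		if (!strncmp(str, "on", 2)) {
			o->disable_remap = false;
			o->disable_post = false;
		} else if (!strncmp(str, "off", 3)) {
			o->disable_remap = true;
			o->disable_post = true;
		} else if (!strncmp(str, "nosid", 5)) {
			o->disable_sid_check = true;
		} else if (!strncmp(str, "no_x2apic_optout", 16)) {
			o->no_x2apic_optout = true;
		} else if (!strncmp(str, "nopost", 6)) {
			o->disable_post = true;
		}

		/* Advance to the next token, skipping repeated commas. */
		str += strcspn(str, ",");
		while (*str == ',')
			str++;
	}
	return 0;
}
```

Because tokens are applied left to right, order matters in this sketch:
"on,nopost" leaves remapping enabled with posting disabled, while
"nopost,on" ends with posting enabled again.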


* Re: [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts
  2015-08-25  8:50   ` [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts Feng Wu
@ 2015-08-25 19:57     ` Alex Williamson
       [not found]       ` <1440532664.20355.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2015-09-11 10:29     ` Paolo Bonzini
  1 sibling, 1 reply; 35+ messages in thread
From: Alex Williamson @ 2015-08-25 19:57 UTC (permalink / raw)
  To: Feng Wu; +Cc: pbonzini, joro, mtosatti, eric.auger, kvm, iommu, linux-kernel

On Tue, 2015-08-25 at 16:50 +0800, Feng Wu wrote:
> This patch adds the routine to update IRTE for posted-interrupts
> when guest changes the interrupt configuration.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/kvm/x86.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5ef2560..8f09a76 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -63,6 +63,7 @@
>  #include <asm/fpu/internal.h> /* Ugh! */
>  #include <asm/pvclock.h>
>  #include <asm/div64.h>
> +#include <asm/irq_remapping.h>
>  
>  #define MAX_IO_MSRS 256
>  #define KVM_MAX_MCE_BANKS 32
> @@ -8248,6 +8249,78 @@ bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
>  }
>  EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
>  
> +/*
> + * kvm_arch_update_pi_irte - set IRTE for Posted-Interrupts
> + *
> + * @kvm: kvm
> + * @host_irq: host irq of the interrupt
> + * @guest_irq: gsi of the interrupt
> + * @set: set or unset PI
> + * returns 0 on success, < 0 on failure
> + */
> +int kvm_arch_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
> +			    uint32_t guest_irq, bool set)
> +{
> +	struct kvm_kernel_irq_routing_entry *e;
> +	struct kvm_irq_routing_table *irq_rt;
> +	struct kvm_lapic_irq irq;
> +	struct kvm_vcpu *vcpu;
> +	struct vcpu_data vcpu_info;
> +	int idx, ret = -EINVAL;
> +
> +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> +		return 0;
> +
> +	idx = srcu_read_lock(&kvm->irq_srcu);
> +	irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
> +	BUG_ON(guest_irq >= irq_rt->nr_rt_entries);
> +
> +	hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
> +		if (e->type != KVM_IRQ_ROUTING_MSI)
> +			continue;
> +		/*
> +		 * VT-d PI cannot support posting multicast/broadcast
> +		 * interrupts to a VCPU, we still use interrupt remapping
> +		 * for these kind of interrupts.
> +		 *
> +		 * For lowest-priority interrupts, we only support
> +		 * those with single CPU as the destination, e.g. user
> +		 * configures the interrupts via /proc/irq or uses
> +		 * irqbalance to make the interrupts single-CPU.
> +		 *
> +		 * We will support full lowest-priority interrupt later.
> +		 *
> +		 */
> +
> +		kvm_set_msi_irq(e, &irq);
> +		if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu))
> +			continue;
> +
> +		vcpu_info.pi_desc_addr = kvm_x86_ops->get_pi_desc_addr(vcpu);
> +		vcpu_info.vector = irq.vector;
> +
> +		if (set)
> +			ret = irq_set_vcpu_affinity(host_irq, &vcpu_info);
> +		else {
> +			/* suppress notification event before unposting */
> +			kvm_x86_ops->pi_set_sn(vcpu);
> +			ret = irq_set_vcpu_affinity(host_irq, NULL);
> +			kvm_x86_ops->pi_clear_sn(vcpu);
> +		}

Can we add trace events so that we have a way to tell when PI is being
enabled/disabled other than performance heuristics?  Thanks,

Alex

> +
> +		if (ret < 0) {
> +			printk(KERN_INFO "%s: failed to update PI IRTE\n",
> +					__func__);
> +			goto out;
> +		}
> +	}
> +
> +	ret = 0;
> +out:
> +	srcu_read_unlock(&kvm->irq_srcu, idx);
> +	return ret;
> +}
> +
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts
       [not found]       ` <1440532664.20355.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-08-26  0:36         ` Wu, Feng
  0 siblings, 0 replies; 35+ messages in thread
From: Wu, Feng @ 2015-08-26  0:36 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org,
	mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org



> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> Sent: Wednesday, August 26, 2015 3:58 AM
> To: Wu, Feng
> Cc: pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org; mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org;
> eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org; kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
> iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org; linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: Re: [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts
> 
> On Tue, 2015-08-25 at 16:50 +0800, Feng Wu wrote:
> > This patch adds the routine to update IRTE for posted-interrupts
> > when guest changes the interrupt configuration.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > ---
> >  arch/x86/kvm/x86.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 73 insertions(+)
> > +		kvm_set_msi_irq(e, &irq);
> > +		if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu))
> > +			continue;
> > +
> > +		vcpu_info.pi_desc_addr = kvm_x86_ops->get_pi_desc_addr(vcpu);
> > +		vcpu_info.vector = irq.vector;
> > +
> > +		if (set)
> > +			ret = irq_set_vcpu_affinity(host_irq, &vcpu_info);
> > +		else {
> > +			/* suppress notification event before unposting */
> > +			kvm_x86_ops->pi_set_sn(vcpu);
> > +			ret = irq_set_vcpu_affinity(host_irq, NULL);
> > +			kvm_x86_ops->pi_clear_sn(vcpu);
> > +		}
> 
> Can we add trace events so that we have a way to tell when PI is being
> enabled/disabled other than performance heuristics?  Thanks,

Sure, I will add it.

Thanks,
Feng

> 
> Alex
> > 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 02/17] KVM: Add some helper functions for Posted-Interrupts
  2015-08-25  8:50   ` [PATCH v7 02/17] KVM: Add some helper functions for Posted-Interrupts Feng Wu
@ 2015-09-11 10:16     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:16 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> This patch adds some helper functions to manipulate the
> Posted-Interrupts Descriptor.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 271dd70..316f9bf 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -443,6 +443,8 @@ struct nested_vmx {
>  };
>  
>  #define POSTED_INTR_ON  0
> +#define POSTED_INTR_SN  1
> +
>  /* Posted-Interrupt Descriptor */
>  struct pi_desc {
>  	u32 pir[8];     /* Posted interrupt requested */
> @@ -483,6 +485,30 @@ static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
>  	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
>  }
>  
> +static void pi_clear_sn(struct pi_desc *pi_desc)
> +{
> +	return clear_bit(POSTED_INTR_SN,
> +			(unsigned long *)&pi_desc->control);
> +}
> +
> +static void pi_set_sn(struct pi_desc *pi_desc)
> +{
> +	return set_bit(POSTED_INTR_SN,
> +			(unsigned long *)&pi_desc->control);
> +}
> +
> +static int pi_test_on(struct pi_desc *pi_desc)
> +{
> +	return test_bit(POSTED_INTR_ON,
> +			(unsigned long *)&pi_desc->control);
> +}
> +
> +static int pi_test_sn(struct pi_desc *pi_desc)
> +{
> +	return test_bit(POSTED_INTR_SN,
> +			(unsigned long *)&pi_desc->control);
> +}
> +
>  struct vcpu_vmx {
>  	struct kvm_vcpu       vcpu;
>  	unsigned long         host_rsp;
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 03/17] KVM: Define a new interface kvm_intr_is_single_vcpu()
       [not found]     ` <1440492620-15934-4-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:20       ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:20 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> This patch defines a new interface, kvm_intr_is_single_vcpu(),
> which returns whether the interrupt targets a single vCPU.
> 
> It is used by VT-d PI, which currently supports only single-CPU
> interrupts. A lowest-priority interrupt can still be posted if the
> user makes it single-CPU, either via /proc/irq or with irqbalance.
> Full lowest-priority support will be added later.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  3 +++
>  arch/x86/kvm/irq_comm.c         | 24 ++++++++++++++++++++++++
>  2 files changed, 27 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 49ec903..af11bca 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1204,4 +1204,7 @@ int __x86_set_memory_region(struct kvm *kvm,
>  int x86_set_memory_region(struct kvm *kvm,
>  			  const struct kvm_userspace_memory_region *mem);
>  
> +bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
> +			     struct kvm_vcpu **dest_vcpu);
> +
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
> index 9efff9e..a9572a13 100644
> --- a/arch/x86/kvm/irq_comm.c
> +++ b/arch/x86/kvm/irq_comm.c
> @@ -297,6 +297,30 @@ out:
>  	return r;
>  }
>  
> +bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
> +			     struct kvm_vcpu **dest_vcpu)
> +{
> +	int i, r = 0;
> +	struct kvm_vcpu *vcpu;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		if (!kvm_apic_present(vcpu))
> +			continue;
> +
> +		if (!kvm_apic_match_dest(vcpu, NULL, irq->shorthand,
> +					irq->dest_id, irq->dest_mode))
> +			continue;
> +
> +		r++;

if (++r == 2)
	return false;

> +		*dest_vcpu = vcpu;
> +	}
> +
> +	if (r == 1)
> +		return true;
> +	else
> +		return false;

... then just "return r == 1;" is enough here.

This could also be optimized to treat APIC_DEST_NOSHORT specially.  Get
the mda, and if it has a single physical CPU check that it has
kvm_apic_present(vcpu) set.  Otherwise fall back to the slow path.

Paolo
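
[Editor's sketch, not part of the original message.] The early-exit
counting suggested above can be modelled in plain userspace C. This is
only an illustration under stated assumptions: is_single_match() and the
"matches" array are hypothetical stand-ins for the per-vCPU
kvm_apic_present()/kvm_apic_match_dest() checks, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only: matches[i] stands in for the result of the
 * kvm_apic_present() and kvm_apic_match_dest() checks on vCPU i.
 * Returns true and stores the index in *dest when exactly one entry
 * matches. *dest is left untouched when nothing matches.
 */
static bool is_single_match(const bool *matches, size_t n, size_t *dest)
{
	size_t i;
	int r = 0;

	for (i = 0; i < n; i++) {
		if (!matches[i])
			continue;

		/* A second match means more than one target: stop scanning. */
		if (++r == 2)
			return false;

		*dest = i;
	}

	return r == 1;
}
```

Bailing out on the second match avoids walking the remaining entries
once the answer is already known, and the final "return r == 1"
distinguishes the zero-match case from the single-match case.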

> +}
> +
>  #define IOAPIC_ROUTING_ENTRY(irq) \
>  	{ .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP,	\
>  	  .u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } }
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts
  2015-08-25  8:50   ` [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts Feng Wu
  2015-08-25 19:57     ` Alex Williamson
@ 2015-09-11 10:29     ` Paolo Bonzini
  1 sibling, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:29 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> +int kvm_arch_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
> +			    uint32_t guest_irq, bool set)

Please move all of this code to a vmx.c callback instead of adding
get_pi_desc_addr. Check if this makes the pi_set_sn and pi_clear_sn
callbacks superfluous.

Paolo

> +	if (!irq_remapping_cap(IRQ_POSTING_CAP))
> +		return 0;


> +	idx = srcu_read_lock(&kvm->irq_srcu);
> +	irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
> +	BUG_ON(guest_irq >= irq_rt->nr_rt_entries);
> +
> +	hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
> +		if (e->type != KVM_IRQ_ROUTING_MSI)
> +			continue;
> +		/*
> +		 * VT-d PI cannot support posting multicast/broadcast
> +		 * interrupts to a VCPU, we still use interrupt remapping
> +		 * for these kind of interrupts.
> +		 *
> +		 * For lowest-priority interrupts, we only support
> +		 * those with single CPU as the destination, e.g. user
> +		 * configures the interrupts via /proc/irq or uses
> +		 * irqbalance to make the interrupts single-CPU.
> +		 *
> +		 * We will support full lowest-priority interrupt later.
> +		 *
> +		 */
> +
> +		kvm_set_msi_irq(e, &irq);
> +		if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu))
> +			continue;
> +
> +		vcpu_info.pi_desc_addr = kvm_x86_ops->get_pi_desc_addr(vcpu);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 12/17] KVM: Implement IRQ bypass consumer callbacks for x86
  2015-08-25  8:50   ` [PATCH v7 12/17] KVM: Implement IRQ bypass consumer callbacks for x86 Feng Wu
@ 2015-09-11 10:31     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:31 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> +	struct kvm_kernel_irqfd *irqfd =
> +		container_of(cons, struct kvm_kernel_irqfd, consumer);
> +
> +	irqfd->producer = prod;

This assignment should be under "if (kvm_x86_ops->update_pi_irte)".

> +	return kvm_arch_update_pi_irte(irqfd->kvm, prod->irq, irqfd->gsi, 1);
> +}
> +
> +void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
> +				      struct irq_bypass_producer *prod)
> +{
> +	int ret;
> +	struct kvm_kernel_irqfd *irqfd =
> +		container_of(cons, struct kvm_kernel_irqfd, consumer);
> +
> +	irqfd->producer = NULL;

And here it should be like:

	if (!kvm_x86_ops->update_pi_irte) {
		WARN_ON(irqfd->producer != NULL);
		return;
	}

	WARN_ON(irqfd->producer != prod);
	irqfd->producer = NULL;

Paolo

> +

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 13/17] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'
  2015-08-25  8:50   ` [PATCH v7 13/17] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd' Feng Wu
@ 2015-09-11 10:39     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:39 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> +void kvm_arch_irqfd_init(struct kvm_kernel_irqfd *irqfd)
> +{
> +	irqfd->arch_update = kvm_arch_update_pi_irte;
> +}
> +
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 5f183fb..f4005dc 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -34,6 +34,8 @@
>  
>  #include <asm/kvm_host.h>
>  
> +struct kvm_kernel_irqfd;
> +
>  /*
>   * The bit 16 ~ bit 31 of kvm_memory_region::flags are internally used
>   * in kvm, other bits are visible for userspace which are defined in
> @@ -1145,6 +1147,15 @@ extern struct kvm_device_ops kvm_xics_ops;
>  extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
>  extern struct kvm_device_ops kvm_arm_vgic_v3_ops;
>  
> +#ifdef __KVM_HAVE_ARCH_IRQFD_INIT
> +void kvm_arch_irqfd_init(struct kvm_kernel_irqfd *irqfd);
> +#else
> +static inline void kvm_arch_irqfd_init(struct kvm_kernel_irqfd *irqfd)
> +{
> +	irqfd->arch_update = NULL;
> +}
> +#endif
> +
>  #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
>  
>  static inline void kvm_vcpu_set_in_spin_loop(struct kvm_vcpu *vcpu, bool val)
> diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h
> index 0c1de05..b7aab52 100644
> --- a/include/linux/kvm_irqfd.h
> +++ b/include/linux/kvm_irqfd.h
> @@ -66,6 +66,8 @@ struct kvm_kernel_irqfd {
>  	struct work_struct shutdown;
>  	struct irq_bypass_consumer consumer;
>  	struct irq_bypass_producer *producer;
> +	int (*arch_update)(struct kvm *kvm, unsigned int host_irq,
> +			   uint32_t guest_irq, bool set);
>  };
>  
>  #endif /* __LINUX_KVM_IRQFD_H */
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index f3050b9..b2d9066 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -288,6 +288,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>  	INIT_LIST_HEAD(&irqfd->list);
>  	INIT_WORK(&irqfd->inject, irqfd_inject);
>  	INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
> +	kvm_arch_irqfd_init(irqfd);
>  	seqcount_init(&irqfd->irq_entry_sc);
>  
>  	f = fdget(args->fd);
> @@ -580,13 +581,22 @@ kvm_irqfd_release(struct kvm *kvm)
>   */
>  void kvm_irq_routing_update(struct kvm *kvm)
>  {
> +	int ret;
>  	struct kvm_kernel_irqfd *irqfd;
>  
>  	spin_lock_irq(&kvm->irqfds.lock);
>  
> -	list_for_each_entry(irqfd, &kvm->irqfds.items, list)
> +	list_for_each_entry(irqfd, &kvm->irqfds.items, list) {
>  		irqfd_update(kvm, irqfd);
>  
> +		if (irqfd->arch_update && irqfd->producer) {

With the changes I suggested in the previous message, you only need to
check "if (irqfd->producer)" here.  Then you can remove
kvm_arch_irqfd_init and just put the new "if (irqfd->producer)" under
"#ifdef CONFIG_KVM_HAVE_IRQ_BYPASS".

Just rename kvm_arch_update_pi_irte to kvm_arch_update_irqfd_routing.

Paolo

> +			ret = irqfd->arch_update(
> +					irqfd->kvm, irqfd->producer->irq,
> +					irqfd->gsi, 1);
> +			WARN_ON(ret);
> +		}
> +	}

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 11/17] KVM: Define two weak arch callbacks for irq bypass manager
       [not found]     ` <1440492620-15934-12-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:41       ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:41 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> Define two weak arch callbacks so that architectures that don't
> need them don't have to define them.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  virt/kvm/eventfd.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index d7a230f..f3050b9 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -256,6 +256,16 @@ static void irqfd_update(struct kvm *kvm, struct kvm_kernel_irqfd *irqfd)
>  	write_seqcount_end(&irqfd->irq_entry_sc);
>  }
>  
> +void __attribute__((weak)) kvm_arch_irq_bypass_stop(
> +				struct irq_bypass_consumer *cons)
> +{
> +}
> +
> +void __attribute__((weak)) kvm_arch_irq_bypass_start(
> +				struct irq_bypass_consumer *cons)
> +{
> +}
> +
>  static int
>  kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>  {
> 

This belongs in the patch that adds CONFIG_KVM_HAVE_IRQ_BYPASS
(and the functions should be under "#ifdef CONFIG_KVM_HAVE_IRQ_BYPASS").

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 04/17] KVM: Get Posted-Interrupts descriptor address from 'struct kvm_vcpu'
       [not found]   ` <1440492620-15934-5-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:50     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:50 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> Define an interface to get PI descriptor address from the vCPU structure.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>

See the later review; this interface and the one in patch 5 are too
low-level.

Paolo

> ---
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/vmx.c              | 11 +++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index af11bca..d50c1d3 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -858,6 +858,8 @@ struct kvm_x86_ops {
>  	void (*enable_log_dirty_pt_masked)(struct kvm *kvm,
>  					   struct kvm_memory_slot *slot,
>  					   gfn_t offset, unsigned long mask);
> +
> +	u64 (*get_pi_desc_addr)(struct kvm_vcpu *vcpu);
>  	/* pmu operations of sub-arch */
>  	const struct kvm_pmu_ops *pmu_ops;
>  };
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 316f9bf..81a995c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -610,6 +610,10 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
>  #define FIELD64(number, name)	[number] = VMCS12_OFFSET(name), \
>  				[number##_HIGH] = VMCS12_OFFSET(name)+4
>  
> +struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
> +{
> +	return &(to_vmx(vcpu)->pi_desc);
> +}
>  
>  static unsigned long shadow_read_only_fields[] = {
>  	/*
> @@ -4487,6 +4491,11 @@ static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu)
>  	return;
>  }
>  
> +static u64 vmx_get_pi_desc_addr(struct kvm_vcpu *vcpu)
> +{
> +	return __pa((u64)vcpu_to_pi_desc(vcpu));
> +}
> +
>  /*
>   * Set up the vmcs's constant host-state fields, i.e., host-state fields that
>   * will not change in the lifetime of the guest.
> @@ -10460,6 +10469,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
>  	.flush_log_dirty = vmx_flush_log_dirty,
>  	.enable_log_dirty_pt_masked = vmx_enable_log_dirty_pt_masked,
>  
> +	.get_pi_desc_addr = vmx_get_pi_desc_addr,
> +
>  	.pmu_ops = &intel_pmu_ops,
>  };
>  
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 06/17] KVM: Make struct kvm_irq_routing_table accessible
  2015-08-25  8:50   ` [PATCH v7 06/17] KVM: Make struct kvm_irq_routing_table accessible Feng Wu
@ 2015-09-11 10:50     ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:50 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> Move struct kvm_irq_routing_table from irqchip.c to kvm_host.h,
> so we can use it outside of irqchip.c.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  include/linux/kvm_host.h | 14 ++++++++++++++
>  virt/kvm/irqchip.c       | 10 ----------
>  2 files changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 5ac8d21..5f183fb 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -328,6 +328,20 @@ struct kvm_kernel_irq_routing_entry {
>  	struct hlist_node link;
>  };
>  
> +#ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
> +
> +struct kvm_irq_routing_table {
> +	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
> +	u32 nr_rt_entries;
> +	/*
> +	 * Array indexed by gsi. Each entry contains list of irq chips
> +	 * the gsi is connected to.
> +	 */
> +	struct hlist_head map[0];
> +};
> +
> +#endif
> +
>  #ifndef KVM_PRIVATE_MEM_SLOTS
>  #define KVM_PRIVATE_MEM_SLOTS 0
>  #endif
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 21c1424..2cf45d3 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -31,16 +31,6 @@
>  #include <trace/events/kvm.h>
>  #include "irq.h"
>  
> -struct kvm_irq_routing_table {
> -	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
> -	u32 nr_rt_entries;
> -	/*
> -	 * Array indexed by gsi. Each entry contains list of irq chips
> -	 * the gsi is connected to.
> -	 */
> -	struct hlist_head map[0];
> -};
> -
>  int kvm_irq_map_gsi(struct kvm *kvm,
>  		    struct kvm_kernel_irq_routing_entry *entries, int gsi)
>  {
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 07/17] KVM: make kvm_set_msi_irq() public
       [not found]     ` <1440492620-15934-8-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:53       ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:53 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> Make kvm_set_msi_irq() public so that it can be used outside
> irq_comm.c.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 4 ++++
>  arch/x86/kvm/irq_comm.c         | 4 ++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c4f99f1..82d0709 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -175,6 +175,8 @@ enum {
>   */
>  #define KVM_APIC_PV_EOI_PENDING	1
>  
> +struct kvm_kernel_irq_routing_entry;
> +
>  /*
>   * We don't want allocation failures within the mmu code, so we preallocate
>   * enough memory for a single page fault in a cache.
> @@ -1212,4 +1214,6 @@ int x86_set_memory_region(struct kvm *kvm,
>  bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>  			     struct kvm_vcpu **dest_vcpu);
>  
> +void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
> +		     struct kvm_lapic_irq *irq);
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
> index a9572a13..1319c60 100644
> --- a/arch/x86/kvm/irq_comm.c
> +++ b/arch/x86/kvm/irq_comm.c
> @@ -91,8 +91,8 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
>  	return r;
>  }
>  
> -static inline void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
> -				   struct kvm_lapic_irq *irq)
> +void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
> +		     struct kvm_lapic_irq *irq)
>  {
>  	trace_kvm_msi_set_irq(e->msi.address_lo, e->msi.data);
>  
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices
       [not found]     ` <1440492620-15934-9-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:54       ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:54 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> Enable irq bypass manager for vfio PCI devices.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  drivers/vfio/pci/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 579d83b..02912f1 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -2,6 +2,7 @@ config VFIO_PCI
>  	tristate "VFIO support for PCI devices"
>  	depends on VFIO && PCI && EVENTFD
>  	select VFIO_VIRQFD
> +	select IRQ_BYPASS_MANAGER
>  	help
>  	  Support for the PCI VFIO bus driver.  This is required to make
>  	  use of PCI drivers using the VFIO framework.
> 

Might as well squash it into patch 9.

Paolo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 14/17] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted
       [not found]     ` <1440492620-15934-15-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:56       ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:56 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> This patch updates the Posted-Interrupts Descriptor when vCPU
> is preempted.
> 
> sched out:
> - Set 'SN' to suppress future non-urgent interrupts posted for
> the vCPU.
> 
> sched in:
> - Clear 'SN'
> - Change NDST if vCPU is scheduled to a different CPU
> - Set 'NV' to POSTED_INTR_VECTOR
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 234f720..9c87064 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -45,6 +45,7 @@
>  #include <asm/debugreg.h>
>  #include <asm/kexec.h>
>  #include <asm/apic.h>
> +#include <asm/irq_remapping.h>
>  
>  #include "trace.h"
>  #include "pmu.h"
> @@ -2001,10 +2002,60 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  		vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
>  		vmx->loaded_vmcs->cpu = cpu;
>  	}
> +
> +	if (irq_remapping_cap(IRQ_POSTING_CAP)) {
> +		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> +		struct pi_desc old, new;
> +		unsigned int dest;
> +
> +		do {
> +			old.control = new.control = pi_desc->control;
> +
> +			/*
> +			 * If 'nv' field is POSTED_INTR_WAKEUP_VECTOR, there
> +			 * are two possible cases:
> +			 * 1. After running 'pi_pre_block', context switch
> +			 *    happened. For this case, 'sn' was set in
> +			 *    vmx_vcpu_put(), so we need to clear it here.
> +			 * 2. After running 'pi_pre_block', we were blocked,
> +			 *    and woken up by some other guy. For this case,
> +			 *    we don't need to do anything, 'pi_post_block'
> +			 *    will do everything for us. However, we cannot
> +			 *    check whether it is case #1 or case #2 here
> +			 *    (maybe, not needed), so we also clear sn here,
> +			 *    I think it is not a big deal.
> +			 */
> +			if (pi_desc->nv != POSTED_INTR_WAKEUP_VECTOR) {
> +				if (vcpu->cpu != cpu) {
> +					dest = cpu_physical_id(cpu);
> +
> +					if (x2apic_enabled())
> +						new.ndst = dest;
> +					else
> +						new.ndst = (dest << 8) & 0xFF00;
> +				}
> +
> +				/* set 'NV' to 'notification vector' */
> +				new.nv = POSTED_INTR_VECTOR;
> +			}
> +
> +			/* Allow posting non-urgent interrupts */
> +			new.sn = 0;
> +		} while (cmpxchg(&pi_desc->control, old.control,
> +				new.control) != old.control);
> +	}
>  }
>  
>  static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
>  {
> +	if (irq_remapping_cap(IRQ_POSTING_CAP)) {
> +		struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> +
> +		/* Set SN when the vCPU is preempted */
> +		if (vcpu->preempted)
> +			pi_set_sn(pi_desc);
> +	}
> +
>  	__vmx_load_host_state(to_vmx(vcpu));
>  	if (!vmm_exclusive) {
>  		__loaded_vmcs_clear(to_vmx(vcpu)->loaded_vmcs);
> 

Please split this into separate functions, vmx_vcpu_pi_load and
vmx_vcpu_pi_put.

Paolo
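
[Editor's sketch, not part of the original message.] The sched-in and
sched-out updates described in the commit message can be modelled in
userspace with C11 atomics standing in for the kernel's cmpxchg() and
set_bit(). Everything below is an assumption made for illustration: the
bit positions of the control word, the vector numbers, and the names
pi_model_load() and pi_model_put() are hypothetical, and "dest" is taken
to be already encoded for the APIC mode in use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 64-bit control word; for illustration only. */
#define PI_SN		(1ULL << 1)		/* suppress notification */
#define PI_NV_SHIFT	16
#define PI_NV_MASK	(0xffULL << PI_NV_SHIFT)
#define PI_NDST_SHIFT	32
#define PI_NDST_MASK	(0xffffffffULL << PI_NDST_SHIFT)

/* Placeholder vector numbers for the model. */
#define NOTIFY_VECTOR	0xf2u
#define WAKEUP_VECTOR	0xf1u

struct pi_model {
	_Atomic uint64_t control;
};

/* sched in: clear SN and, unless blocked, retarget NDST and reset NV. */
static void pi_model_load(struct pi_model *pi, uint32_t dest, bool migrated)
{
	uint64_t old = atomic_load(&pi->control);
	uint64_t new;

	do {
		new = old;
		if (((old & PI_NV_MASK) >> PI_NV_SHIFT) != WAKEUP_VECTOR) {
			if (migrated) {
				new &= ~PI_NDST_MASK;
				new |= (uint64_t)dest << PI_NDST_SHIFT;
			}
			new &= ~PI_NV_MASK;
			new |= (uint64_t)NOTIFY_VECTOR << PI_NV_SHIFT;
		}
		/* Allow posting non-urgent interrupts again. */
		new &= ~PI_SN;
		/* On failure, "old" is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&pi->control, &old, new));
}

/* sched out: set SN only when the vCPU was preempted. */
static void pi_model_put(struct pi_model *pi, bool preempted)
{
	if (preempted)
		atomic_fetch_or(&pi->control, PI_SN);
}
```

Keeping the two halves in their own functions means the load and put
paths each reduce to a single call, which is the shape of the split
requested above.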

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 16/17] KVM: Warn if 'SN' is set during posting interrupts by software
       [not found]     ` <1440492620-15934-17-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-09-11 10:59       ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 10:59 UTC (permalink / raw)
  To: Feng Wu, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	joro-zLv9SwRftAIdnm+yROfE0A, mtosatti-H+wXaHxf7aLQT0dZR+AlfA
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kvm-u79uwXL29TY76Z2rM5mHXA,
	eric.auger-QSEj5FYQhm4dnm+yROfE0A



On 25/08/2015 10:50, Feng Wu wrote:
> Currently, urgent interrupts are not supported: all interrupts
> are treated as non-urgent, so we cannot post interrupts when
> 'SN' is set.
> 
> A vCPU in guest mode cannot have been scheduled out, and being
> scheduled out is currently the only case in which 'SN' is set,
> so warn if 'SN' is set here.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 64e35ea..eb640a1 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4494,6 +4494,22 @@ static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
>  {
>  #ifdef CONFIG_SMP
>  	if (vcpu->mode == IN_GUEST_MODE) {
> +		struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +		/*
> +		 * Currently, we don't support urgent interrupt,
> +		 * all interrupts are recognized as non-urgent
> +		 * interrupt, so we cannot post interrupts when
> +		 * 'SN' is set.
> +		 *
> +		 * If the vcpu is in guest mode, it means it is
> +		 * running instead of being scheduled out and
> +		 * waiting in the run queue, and that's the only
> +		 * case when 'SN' is set currently, warning if
> +		 * 'SN' is set.
> +		 */
> +		WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc));
> +
>  		apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
>  				POSTED_INTR_VECTOR);
>  		return true;
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 17/17] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts
  2015-08-25  8:50 ` [PATCH v7 17/17] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts Feng Wu
@ 2015-09-11 11:05   ` Paolo Bonzini
  0 siblings, 0 replies; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 11:05 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> Enable VT-d Posted-Interrupts and add a command line
> parameter for it.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  Documentation/kernel-parameters.txt |  1 +
>  drivers/iommu/irq_remapping.c       | 12 ++++++++----
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 1d6f045..52aca36 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -1547,6 +1547,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			nosid	disable Source ID checking
>  			no_x2apic_optout
>  				BIOS x2APIC opt-out request will be ignored
> +			nopost	disable Interrupt Posting
>  
>  	iomem=		Disable strict checking of access to MMIO memory
>  		strict	regions from userspace.
> diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
> index 2d99930..d8c3997 100644
> --- a/drivers/iommu/irq_remapping.c
> +++ b/drivers/iommu/irq_remapping.c
> @@ -22,7 +22,7 @@ int irq_remap_broken;
>  int disable_sourceid_checking;
>  int no_x2apic_optout;
>  
> -int disable_irq_post = 1;
> +int disable_irq_post = 0;
>  
>  static int disable_irq_remap;
>  static struct irq_remap_ops *remap_ops;
> @@ -58,14 +58,18 @@ static __init int setup_irqremap(char *str)
>  		return -EINVAL;
>  
>  	while (*str) {
> -		if (!strncmp(str, "on", 2))
> +		if (!strncmp(str, "on", 2)) {
>  			disable_irq_remap = 0;
> -		else if (!strncmp(str, "off", 3))
> +			disable_irq_post = 0;
> +		} else if (!strncmp(str, "off", 3)) {
>  			disable_irq_remap = 1;
> -		else if (!strncmp(str, "nosid", 5))
> +			disable_irq_post = 1;
> +		} else if (!strncmp(str, "nosid", 5))
>  			disable_sourceid_checking = 1;
>  		else if (!strncmp(str, "no_x2apic_optout", 16))
>  			no_x2apic_optout = 1;
> +		else if (!strncmp(str, "nopost", 6))
> +			disable_irq_post = 1;
>  
>  		str += strcspn(str, ",");
>  		while (*str == ',')
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
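
For readers who want to try the option handling outside the kernel, here is a
minimal user-space sketch of the parsing loop after this patch. The struct
irqremap_opts and the function name parse_irqremap_sketch() are hypothetical
stand-ins for the kernel globals and setup_irqremap(); the final str++ is
assumed, since the quoted hunk ends before that line.

```c
#include <string.h>

/* Hypothetical container for the flags that are globals in the kernel. */
struct irqremap_opts {
	int disable_irq_remap;
	int disable_irq_post;
	int disable_sourceid_checking;
	int no_x2apic_optout;
};

/* Walks a comma-separated option string, mirroring the patched loop. */
static void parse_irqremap_sketch(const char *str, struct irqremap_opts *o)
{
	while (*str) {
		if (!strncmp(str, "on", 2)) {
			o->disable_irq_remap = 0;
			o->disable_irq_post = 0;
		} else if (!strncmp(str, "off", 3)) {
			o->disable_irq_remap = 1;
			o->disable_irq_post = 1;
		} else if (!strncmp(str, "nosid", 5))
			o->disable_sourceid_checking = 1;
		else if (!strncmp(str, "no_x2apic_optout", 16))
			o->no_x2apic_optout = 1;
		else if (!strncmp(str, "nopost", 6))
			o->disable_irq_post = 1;

		/* Skip to the next comma-separated token. */
		str += strcspn(str, ",");
		while (*str == ',')
			str++;	/* assumed: not visible in the quoted hunk */
	}
}
```

Tokens are applied left to right, so in this sketch "on,nopost" keeps
remapping enabled with posting disabled, while "nopost,on" re-enables posting.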


* Re: [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
  2015-08-25  8:50   ` [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked Feng Wu
@ 2015-09-11 11:21     ` Paolo Bonzini
  2015-09-14  7:57       ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Paolo Bonzini @ 2015-09-11 11:21 UTC (permalink / raw)
  To: Feng Wu, alex.williamson, joro, mtosatti
  Cc: eric.auger, kvm, iommu, linux-kernel



On 25/08/2015 10:50, Feng Wu wrote:
> This patch updates the Posted-Interrupts Descriptor when vCPU
> is blocked.
> 
> pre-block:
> - Add the vCPU to the blocked per-CPU list
> - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> 
> post-block:
> - Remove the vCPU from the per-CPU list
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |   5 ++
>  arch/x86/kvm/vmx.c              | 151 ++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c              |  55 ++++++++++++---
>  include/linux/kvm_host.h        |   3 +
>  virt/kvm/kvm_main.c             |   3 +
>  5 files changed, 207 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 22269b4..32af275 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -554,6 +554,8 @@ struct kvm_vcpu_arch {
>  	 */
>  	bool write_fault_to_shadow_pgtable;
>  
> +	bool halted;
> +
>  	/* set at EPT violation at this point */
>  	unsigned long exit_qualification;
>  
> @@ -868,6 +870,9 @@ struct kvm_x86_ops {
>  
>  	void (*pi_clear_sn)(struct kvm_vcpu *vcpu);
>  	void (*pi_set_sn)(struct kvm_vcpu *vcpu);
> +
> +	int (*pi_pre_block)(struct kvm_vcpu *vcpu);
> +	void (*pi_post_block)(struct kvm_vcpu *vcpu);

Just pre_block/post_block please.  Also, please document the return
value of pre_block.
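
A sketch of what the renamed and documented hooks might look like. The struct
name x86_ops_sketch is hypothetical, and the meaning of the return value is
inferred only from the caller quoted below, which halts the vCPU when the hook
returns 0.

```c
#include <stddef.h>

struct kvm_vcpu;	/* opaque here; the real definition lives in the kernel */

/* Hypothetical, trimmed-down stand-in for struct kvm_x86_ops. */
struct x86_ops_sketch {
	/*
	 * pre_block: called before the vCPU is halted.
	 * Returns 0 if the vCPU may go ahead and block, non-zero if it
	 * must not (inferred from the "== 0" test in kvm_vcpu_halt()).
	 */
	int (*pre_block)(struct kvm_vcpu *vcpu);

	/* post_block: undoes pre_block once the vCPU stops blocking. */
	void (*post_block)(struct kvm_vcpu *vcpu);
};
```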

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ef93fdc..fc7f222 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5869,7 +5869,13 @@ int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>  {
>  	++vcpu->stat.halt_exits;
>  	if (irqchip_in_kernel(vcpu->kvm)) {
> -		vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
> +		/* Handle posted-interrupt when vCPU is to be halted */
> +		if (!kvm_x86_ops->pi_pre_block ||
> +		    (kvm_x86_ops->pi_pre_block &&

No need to test kvm_x86_ops->pi_pre_block again.
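
To illustrate: because || short-circuits, the right-hand side only runs when
the pointer is non-NULL, so the second test is redundant. A small stand-alone
sketch (the typedef, function names and stub hooks are made up for this
example):

```c
#include <stdbool.h>
#include <stddef.h>

struct kvm_vcpu;	/* opaque; never dereferenced in this sketch */

typedef int (*pre_block_fn)(struct kvm_vcpu *vcpu);

/* Condition as written in the patch: the hook pointer is tested twice. */
static bool may_halt_original(pre_block_fn hook, struct kvm_vcpu *vcpu)
{
	return !hook || (hook && hook(vcpu) == 0);
}

/* Equivalent form: hook is known to be non-NULL right of the ||. */
static bool may_halt_simplified(pre_block_fn hook, struct kvm_vcpu *vcpu)
{
	return !hook || hook(vcpu) == 0;
}

/* Stub hooks for illustration only. */
static int hook_allows(struct kvm_vcpu *vcpu)  { (void)vcpu; return 0; }
static int hook_refuses(struct kvm_vcpu *vcpu) { (void)vcpu; return 1; }
```

Both functions agree for a NULL hook, a hook returning 0, and a hook
returning non-zero.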

> +		    kvm_x86_ops->pi_pre_block(vcpu) == 0)) {
> +			vcpu->arch.halted = true;
> +			vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
> +		}
>  		return 1;
>  	} else {
>  		vcpu->run->exit_reason = KVM_EXIT_HLT;
> @@ -6518,6 +6524,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			kvm_vcpu_reload_apic_access_page(vcpu);
>  	}
>  
> +	/*
> +	 * Since posted-interrupts can be set by VT-d HW now, in this
> +	 * case, KVM_REQ_EVENT is not set. We move the following
> +	 * operations out of the if statement.
> +	 */

Just "KVM_REQ_EVENT is not set when posted interrupts are set by VT-d
hardware, so we have to update RVI unconditionally", please.

Could we skip this (in a future patch) if PI.ON=0?

> +	if (kvm_lapic_enabled(vcpu)) {
> +		/*
> +		 * Update architecture specific hints for APIC
> +		 * virtual interrupt delivery.
> +		 */
> +		if (kvm_x86_ops->hwapic_irr_update)
> +			kvm_x86_ops->hwapic_irr_update(vcpu,
> +				kvm_lapic_find_highest_irr(vcpu));
> +	}
> +
>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>  		kvm_apic_accept_events(vcpu);
>  		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> @@ -6534,13 +6555,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			kvm_x86_ops->enable_irq_window(vcpu);
>  
>  		if (kvm_lapic_enabled(vcpu)) {
> -			/*
> -			 * Update architecture specific hints for APIC
> -			 * virtual interrupt delivery.
> -			 */
> -			if (kvm_x86_ops->hwapic_irr_update)
> -				kvm_x86_ops->hwapic_irr_update(vcpu,
> -					kvm_lapic_find_highest_irr(vcpu));
>  			update_cr8_intercept(vcpu);
>  			kvm_lapic_sync_to_vapic(vcpu);
>  		}
> @@ -6711,10 +6725,31 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	for (;;) {
>  		if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
> -		    !vcpu->arch.apf.halted)
> +		    !vcpu->arch.apf.halted) {
> +			/*
> +			 * For some cases, we can get here with
> +			 * vcpu->arch.halted being true.
> +			 */

Which cases?

Paolo

> +			if (kvm_x86_ops->pi_post_block && vcpu->arch.halted) {
> +				kvm_x86_ops->pi_post_block(vcpu);
> +				vcpu->arch.halted = false;
> +			}
> +
>  			r = vcpu_enter_guest(vcpu);
> -		else
> +		} else {
>  			r = vcpu_block(kvm, vcpu);
> +
> +			/*
> +			 * pi_post_block() must be called after
> +			 * pi_pre_block() which is called in
> +			 * kvm_vcpu_halt().
> +			 */
> +			if (kvm_x86_ops->pi_post_block && vcpu->arch.halted) {
> +				kvm_x86_ops->pi_post_block(vcpu);
> +				vcpu->arch.halted = false;
> +			}
> +		}
> +
>  		if (r <= 0)
>  			break;
>  
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f4005dc..6aa69f4 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -233,6 +233,9 @@ struct kvm_vcpu {
>  	unsigned long requests;
>  	unsigned long guest_debug;
>  
> +	int pre_pcpu;
> +	struct list_head blocked_vcpu_list;
> +
>  	struct mutex mutex;
>  	struct kvm_run *run;
>  
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b8a444..191c7eb 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -220,6 +220,9 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>  	init_waitqueue_head(&vcpu->wq);
>  	kvm_async_pf_vcpu_init(vcpu);
>  
> +	vcpu->pre_pcpu = -1;
> +	INIT_LIST_HEAD(&vcpu->blocked_vcpu_list);
> +
>  	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>  	if (!page) {
>  		r = -ENOMEM;
> 


* RE: [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
  2015-09-11 11:21     ` Paolo Bonzini
@ 2015-09-14  7:57       ` Wu, Feng
  0 siblings, 0 replies; 35+ messages in thread
From: Wu, Feng @ 2015-09-14  7:57 UTC (permalink / raw)
  To: Paolo Bonzini, alex.williamson@redhat.com, joro@8bytes.org,
	mtosatti@redhat.com
  Cc: eric.auger@linaro.org, kvm@vger.kernel.org,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	Wu, Feng

First of all, Paolo, thanks a lot for reviewing this series; it really means a lot! :)

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Paolo Bonzini
> Sent: Friday, September 11, 2015 7:21 PM
> To: Wu, Feng; alex.williamson@redhat.com; joro@8bytes.org;
> mtosatti@redhat.com
> Cc: eric.auger@linaro.org; kvm@vger.kernel.org;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked
> 
> 
> 
> On 25/08/2015 10:50, Feng Wu wrote:
> > This patch updates the Posted-Interrupts Descriptor when vCPU
> > is blocked.
> >
> > pre-block:
> > - Add the vCPU to the blocked per-CPU list
> > - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> >
> > post-block:
> > - Remove the vCPU from the per-CPU list
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |   5 ++
> >  arch/x86/kvm/vmx.c              | 151 ++++++++++++++++++++++++++++++++++++++++
> >  arch/x86/kvm/x86.c              |  55 ++++++++++++---
> >  include/linux/kvm_host.h        |   3 +
> >  virt/kvm/kvm_main.c             |   3 +
> >  5 files changed, 207 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 22269b4..32af275 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -554,6 +554,8 @@ struct kvm_vcpu_arch {
> >  	 */
> >  	bool write_fault_to_shadow_pgtable;
> >
> > +	bool halted;
> > +
> >  	/* set at EPT violation at this point */
> >  	unsigned long exit_qualification;
> >
> > @@ -868,6 +870,9 @@ struct kvm_x86_ops {
> >
> >  	void (*pi_clear_sn)(struct kvm_vcpu *vcpu);
> >  	void (*pi_set_sn)(struct kvm_vcpu *vcpu);
> > +
> > +	int (*pi_pre_block)(struct kvm_vcpu *vcpu);
> > +	void (*pi_post_block)(struct kvm_vcpu *vcpu);
> 
> Just pre_block/post_block please.  Also, please document the return
> value of pre_block.
> 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index ef93fdc..fc7f222 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5869,7 +5869,13 @@ int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >  {
> >  	++vcpu->stat.halt_exits;
> >  	if (irqchip_in_kernel(vcpu->kvm)) {
> > -		vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
> > +		/* Handle posted-interrupt when vCPU is to be halted */
> > +		if (!kvm_x86_ops->pi_pre_block ||
> > +		    (kvm_x86_ops->pi_pre_block &&
> 
> No need to test kvm_x86_ops->pi_pre_block again.
> 
> > +		    kvm_x86_ops->pi_pre_block(vcpu) == 0)) {
> > +			vcpu->arch.halted = true;
> > +			vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
> > +		}
> >  		return 1;
> >  	} else {
> >  		vcpu->run->exit_reason = KVM_EXIT_HLT;
> > @@ -6518,6 +6524,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  			kvm_vcpu_reload_apic_access_page(vcpu);
> >  	}
> >
> > +	/*
> > +	 * Since posted-interrupts can be set by VT-d HW now, in this
> > +	 * case, KVM_REQ_EVENT is not set. We move the following
> > +	 * operations out of the if statement.
> > +	 */
> 
> Just "KVM_REQ_EVENT is not set when posted interrupts are set by VT-d
> hardware, so we have to update RVI unconditionally", please.
> 
> Could we skip this (in a future patch) if PI.ON=0?

Do you mean executing the following code only when PI.ON == 1?
We may not be able to do that, since 'ON' can be cleared by the
hypervisor in many places.

> 
> > +	if (kvm_lapic_enabled(vcpu)) {
> > +		/*
> > +		 * Update architecture specific hints for APIC
> > +		 * virtual interrupt delivery.
> > +		 */
> > +		if (kvm_x86_ops->hwapic_irr_update)
> > +			kvm_x86_ops->hwapic_irr_update(vcpu,
> > +				kvm_lapic_find_highest_irr(vcpu));
> > +	}
> > +
> >  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> >  		kvm_apic_accept_events(vcpu);
> >  		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> > @@ -6534,13 +6555,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  			kvm_x86_ops->enable_irq_window(vcpu);
> >
> >  		if (kvm_lapic_enabled(vcpu)) {
> > -			/*
> > -			 * Update architecture specific hints for APIC
> > -			 * virtual interrupt delivery.
> > -			 */
> > -			if (kvm_x86_ops->hwapic_irr_update)
> > -				kvm_x86_ops->hwapic_irr_update(vcpu,
> > -					kvm_lapic_find_highest_irr(vcpu));
> >  			update_cr8_intercept(vcpu);
> >  			kvm_lapic_sync_to_vapic(vcpu);
> >  		}
> > @@ -6711,10 +6725,31 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
> >
> >  	for (;;) {
> >  		if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
> > -		    !vcpu->arch.apf.halted)
> > +		    !vcpu->arch.apf.halted) {
> > +			/*
> > +			 * For some cases, we can get here with
> > +			 * vcpu->arch.halted being true.
> > +			 */
> 
> Which cases?

See the following scenario:
vcpu_run()
{
......

	vcpu_enter_guest() --> VM_EXIT -> kvm_vcpu_halt() --> vcpu->arch.halted = true;

........

	kvm_check_async_pf_completion()
		--> ...... --> kvm_arch_async_page_present(), which sets
				vcpu->arch.apf.halted to false and vcpu->arch.mp_state
				to KVM_MP_STATE_RUNNABLE. The next time we re-enter
				the for (;;) loop, we take the runnable branch with
				vcpu->arch.halted still being true.
}
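
The scenario above can be modelled with a toy state machine. This is only a
sketch: struct vcpu_sketch and the sketch_*() helpers are made-up names, the
halt path assumes an in-kernel irqchip with the pre-block hook returning 0,
and the hook-presence check from the patch is left out.

```c
#include <stdbool.h>

enum mp_state_sketch { MP_RUNNABLE, MP_HALTED };

/* Hypothetical model of the few vCPU fields involved. */
struct vcpu_sketch {
	enum mp_state_sketch mp_state;
	bool apf_halted;	/* models vcpu->arch.apf.halted */
	bool halted;		/* models vcpu->arch.halted from this patch */
	int post_block_calls;
};

/* Models kvm_vcpu_halt() when the pre-block hook allows halting. */
static void sketch_halt(struct vcpu_sketch *v)
{
	v->halted = true;
	v->mp_state = MP_HALTED;
}

/* Models kvm_arch_async_page_present(): runnable again, halted untouched. */
static void sketch_async_page_present(struct vcpu_sketch *v)
{
	v->apf_halted = false;
	v->mp_state = MP_RUNNABLE;
}

/*
 * Models the top of the for (;;) loop in vcpu_run(). Returns true if the
 * runnable branch was taken with halted still set, in which case the
 * post-block step runs and clears the flag.
 */
static bool sketch_loop_top(struct vcpu_sketch *v)
{
	if (v->mp_state == MP_RUNNABLE && !v->apf_halted && v->halted) {
		v->post_block_calls++;
		v->halted = false;
		return true;
	}
	return false;
}
```

Calling sketch_halt(), then sketch_async_page_present(), then
sketch_loop_top() reaches the runnable branch with halted set, which is the
case the comment in the patch refers to.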

Thanks,
Feng

> 
> Paolo
> 
> > +			if (kvm_x86_ops->pi_post_block && vcpu->arch.halted) {
> > +				kvm_x86_ops->pi_post_block(vcpu);
> > +				vcpu->arch.halted = false;
> > +			}
> > +
> >  			r = vcpu_enter_guest(vcpu);
> > -		else
> > +		} else {
> >  			r = vcpu_block(kvm, vcpu);
> > +
> > +			/*
> > +			 * pi_post_block() must be called after
> > +			 * pi_pre_block() which is called in
> > +			 * kvm_vcpu_halt().
> > +			 */
> > +			if (kvm_x86_ops->pi_post_block && vcpu->arch.halted) {
> > +				kvm_x86_ops->pi_post_block(vcpu);
> > +				vcpu->arch.halted = false;
> > +			}
> > +		}
> > +
> >  		if (r <= 0)
> >  			break;
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index f4005dc..6aa69f4 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -233,6 +233,9 @@ struct kvm_vcpu {
> >  	unsigned long requests;
> >  	unsigned long guest_debug;
> >
> > +	int pre_pcpu;
> > +	struct list_head blocked_vcpu_list;
> > +
> >  	struct mutex mutex;
> >  	struct kvm_run *run;
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 8b8a444..191c7eb 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -220,6 +220,9 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
> >  	init_waitqueue_head(&vcpu->wq);
> >  	kvm_async_pf_vcpu_init(vcpu);
> >
> > +	vcpu->pre_pcpu = -1;
> > +	INIT_LIST_HEAD(&vcpu->blocked_vcpu_list);
> > +
> >  	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> >  	if (!page) {
> >  		r = -ENOMEM;
> >


end of thread, other threads:[~2015-09-14  7:57 UTC | newest]

Thread overview: 35+ messages
2015-08-25  8:50 [PATCH v7 00/17] Add VT-d Posted-Interrupts support Feng Wu
2015-08-25  8:50 ` [PATCH v7 04/17] KVM: Get Posted-Interrupts descriptor address from 'struct kvm_vcpu' Feng Wu
     [not found]   ` <1440492620-15934-5-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:50     ` Paolo Bonzini
     [not found] ` <1440492620-15934-1-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-08-25  8:50   ` [PATCH v7 01/17] KVM: Extend struct pi_desc for VT-d Posted-Interrupts Feng Wu
2015-08-25  8:50   ` [PATCH v7 02/17] KVM: Add some helper functions for Posted-Interrupts Feng Wu
2015-09-11 10:16     ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 03/17] KVM: Define a new interface kvm_intr_is_single_vcpu() Feng Wu
     [not found]     ` <1440492620-15934-4-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:20       ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 05/17] KVM: Add interfaces to control PI outside vmx Feng Wu
2015-08-25  8:50   ` [PATCH v7 06/17] KVM: Make struct kvm_irq_routing_table accessible Feng Wu
2015-09-11 10:50     ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 07/17] KVM: make kvm_set_msi_irq() public Feng Wu
     [not found]     ` <1440492620-15934-8-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:53       ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 08/17] vfio: Select IRQ_BYPASS_MANAGER for vfio PCI devices Feng Wu
     [not found]     ` <1440492620-15934-9-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:54       ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 09/17] vfio: Register/unregister irq_bypass_producer Feng Wu
2015-08-25  8:50   ` [PATCH v7 10/17] KVM: x86: Update IRTE for posted-interrupts Feng Wu
2015-08-25 19:57     ` Alex Williamson
     [not found]       ` <1440532664.20355.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-08-26  0:36         ` Wu, Feng
2015-09-11 10:29     ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 11/17] KVM: Define two weak arch callbacks for irq bypass manager Feng Wu
     [not found]     ` <1440492620-15934-12-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:41       ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 12/17] KVM: Implement IRQ bypass consumer callbacks for x86 Feng Wu
2015-09-11 10:31     ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 13/17] KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd' Feng Wu
2015-09-11 10:39     ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 14/17] KVM: Update Posted-Interrupts Descriptor when vCPU is preempted Feng Wu
     [not found]     ` <1440492620-15934-15-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:56       ` Paolo Bonzini
2015-08-25  8:50   ` [PATCH v7 15/17] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked Feng Wu
2015-09-11 11:21     ` Paolo Bonzini
2015-09-14  7:57       ` Wu, Feng
2015-08-25  8:50   ` [PATCH v7 16/17] KVM: Warn if 'SN' is set during posting interrupts by software Feng Wu
     [not found]     ` <1440492620-15934-17-git-send-email-feng.wu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-09-11 10:59       ` Paolo Bonzini
2015-08-25  8:50 ` [PATCH v7 17/17] iommu/vt-d: Add a command line parameter for VT-d posted-interrupts Feng Wu
2015-09-11 11:05   ` Paolo Bonzini
