* [v7 0/4] prerequisite changes for VT-d posted-interrupts
@ 2015-05-19  9:07 Feng Wu
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This series implements some prerequisite parts for VT-d posted-interrupts. It was part of
http://thread.gmane.org/gmane.linux.kernel.iommu/7708. To make things clear, I will divide
the whole series, which contains multiple components, into three parts:
- prerequisite changes (included in this series)
- IOMMU part (v4 was reviewed, some comments need to be addressed)
- KVM and VFIO parts (will send out this part once the first two parts are accepted)

This series is rebased on the x86-apic branch of tip tree.

v6 --> v7:
[1/4]:
- Add a KernelDoc comment for function irq_set_vcpu_affinity().

v5 --> v6:
[3/4]:
- Avoid the conditional in the exception handler smp_kvm_posted_intr_wakeup_ipi().
- Rename "wakeup_handler_callback" to "kvm_posted_intr_wakeup_handler".

[4/4]
- Newly added in this series, show the statistics information for posted-interrupts.


v4 --> v5:
- Move the declaration of "irq_chip_set_vcpu_affinity_parent()" to [1/3].
- Use the accessor to get "struct irq_data", "struct irq_chip".
- Use "irq_get_desc_lock()" instead of "irq_to_desc()".
- Declare "wakeup_handler_callback" in "asm/irq.h".
- Use entering_ack_irq()/exiting_irq() in smp_kvm_posted_intr_wakeup_ipi().


Feng Wu (3):
  x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
  x86/irq: Define a global vector for VT-d Posted-Interrupts
  x86/irq: Show statistics information for posted-interrupts

Jiang Liu (1):
  genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a
    VCPU

 arch/x86/include/asm/entry_arch.h  |  2 ++
 arch/x86/include/asm/hardirq.h     |  1 +
 arch/x86/include/asm/hw_irq.h      |  2 ++
 arch/x86/include/asm/irq.h         |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  1 +
 arch/x86/kernel/apic/msi.c         |  1 +
 arch/x86/kernel/entry_64.S         |  2 ++
 arch/x86/kernel/irq.c              | 43 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |  2 ++
 include/linux/irq.h                |  6 ++++++
 kernel/irq/chip.c                  | 14 +++++++++++++
 kernel/irq/manage.c                | 31 +++++++++++++++++++++++++++
 12 files changed, 109 insertions(+)

-- 
2.1.0



* [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:45   ` [tip:irq/core] " tip-bot for Jiang Liu
  2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

From: Jiang Liu <jiang.liu@linux.intel.com>

With Posted-Interrupts support in Intel CPU and IOMMU, an external
interrupt from assigned-devices could be directly delivered to a
virtual CPU in a virtual machine. Instead of hacking KVM and Intel
IOMMU drivers, we propose a platform independent interface to target
an interrupt to a specific virtual CPU in a virtual machine, or set
virtual CPU affinity for an interrupt.

By adopting this new interface and the hierarchy irqdomain, we could
easily support posted-interrupts on Intel platforms, and also provide
flexible enough interfaces for other platforms to support similar
features.

Here is the usage scenario for this interface:
Guest updates MSI/MSI-X interrupt configuration
        -->QEMU and KVM handle this
        -->KVM calls this interface (passing the posted-interrupts
           descriptor and guest vector)
        -->irq core transfers control to the IOMMU
        -->IOMMU does the real work of updating the IRTE (the IRTE has a
           new format for VT-d Posted-Interrupts)

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 include/linux/irq.h |  6 ++++++
 kernel/irq/chip.c   | 14 ++++++++++++++
 kernel/irq/manage.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62c6901..48cb7d1 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -327,6 +327,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  * @irq_write_msi_msg:	optional to write message content for MSI
  * @irq_get_irqchip_state:	return the internal state of an interrupt
  * @irq_set_irqchip_state:	set the internal state of a interrupt
+ * @irq_set_vcpu_affinity:	optional to target a vCPU in a virtual machine
  * @flags:		chip specific flags
  */
 struct irq_chip {
@@ -369,6 +370,8 @@ struct irq_chip {
 	int		(*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
 	int		(*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);
 
+	int		(*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);
+
 	unsigned long	flags;
 };
 
@@ -422,6 +425,7 @@ extern void irq_cpu_online(void);
 extern void irq_cpu_offline(void);
 extern int irq_set_affinity_locked(struct irq_data *data,
 				   const struct cpumask *cpumask, bool force);
+extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
 void irq_move_irq(struct irq_data *data);
@@ -467,6 +471,8 @@ extern int irq_chip_set_affinity_parent(struct irq_data *data,
 					const struct cpumask *dest,
 					bool force);
 extern int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on);
+extern int irq_chip_set_vcpu_affinity_parent(struct irq_data *data,
+					     void *vcpu_info);
 #endif
 
 /* Handling of unhandled and spurious interrupts: */
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index eb9a4ea..55016b2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -950,6 +950,20 @@ int irq_chip_retrigger_hierarchy(struct irq_data *data)
 }
 
 /**
+ * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
+ * @data:	Pointer to interrupt specific data
+ * @vcpu_info:	The vcpu affinity information
+ */
+int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
+{
+	data = data->parent_data;
+	if (data->chip->irq_set_vcpu_affinity)
+		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
+
+	return -ENOSYS;
+}
+
+/**
  * irq_chip_set_wake_parent - Set/reset wake-up on the parent interrupt
  * @data:	Pointer to interrupt specific data
  * @on:		Whether to set or reset the wake-up capability of this irq
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e68932b..b1c7e8f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -256,6 +256,37 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
 }
 EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
 
+/**
+ *	irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
+ *	@irq: interrupt number to set affinity
+ *	@vcpu_info: vCPU specific data
+ *
+ *	This function uses the vCPU specific data to set the vCPU
+ *	affinity for an irq. The vCPU specific data is passed from
+ *	outside, such as KVM. One example code path is as below:
+ *	KVM -> IOMMU -> irq_set_vcpu_affinity().
+ */
+int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
+{
+	unsigned long flags;
+	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
+	struct irq_data *data;
+	struct irq_chip *chip;
+	int ret = -ENOSYS;
+
+	if (!desc)
+		return -EINVAL;
+
+	data = irq_desc_get_irq_data(desc);
+	chip = irq_data_get_irq_chip(data);
+	if (chip && chip->irq_set_vcpu_affinity)
+		ret = chip->irq_set_vcpu_affinity(data, vcpu_info);
+	irq_put_desc_unlock(desc, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
+
 static void irq_affinity_notify(struct work_struct *work)
 {
 	struct irq_affinity_notify *notify =
-- 
2.1.0
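
The dispatch done by irq_set_vcpu_affinity() in the patch above can be
modelled in user space. This is only a sketch, not kernel code: the
mock_* types and functions are hypothetical stand-ins, and descriptor
locking (irq_get_desc_lock()/irq_put_desc_unlock()) is left out. It shows
the three outcomes a caller has to handle: -EINVAL when there is no
descriptor, -ENOSYS when the chip has no callback, and the chip's own
return value otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct irq_chip / struct irq_data. */
struct mock_irq_data;

struct mock_irq_chip {
	/* Optional callback, mirroring irq_chip::irq_set_vcpu_affinity. */
	int (*irq_set_vcpu_affinity)(struct mock_irq_data *data, void *vcpu_info);
};

struct mock_irq_data {
	struct mock_irq_chip *chip;
	void *last_vcpu_info;	/* records what the chip was asked to target */
};

#define MOCK_NR_IRQS 4
struct mock_irq_data *mock_irq_table[MOCK_NR_IRQS];

/* Model of irq_set_vcpu_affinity(); locking is omitted in this sketch. */
int mock_irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
{
	struct mock_irq_data *data;
	int ret = -ENOSYS;

	if (irq >= MOCK_NR_IRQS || !mock_irq_table[irq])
		return -EINVAL;		/* no descriptor for this irq */

	data = mock_irq_table[irq];
	if (data->chip && data->chip->irq_set_vcpu_affinity)
		ret = data->chip->irq_set_vcpu_affinity(data, vcpu_info);

	return ret;
}

/* Example chip callback: just remember the opaque vCPU data. */
int mock_chip_set_vcpu(struct mock_irq_data *data, void *vcpu_info)
{
	data->last_vcpu_info = vcpu_info;
	return 0;
}

/* Usage: irq 0 has a chip with the callback, irq 1 has one without it. */
void mock_usage(void)
{
	static struct mock_irq_chip with_cb = { mock_chip_set_vcpu };
	static struct mock_irq_chip without_cb = { NULL };
	static struct mock_irq_data d0 = { &with_cb, NULL };
	static struct mock_irq_data d1 = { &without_cb, NULL };
	int token = 0;

	mock_irq_table[0] = &d0;
	mock_irq_table[1] = &d1;
	assert(mock_irq_set_vcpu_affinity(0, &token) == 0);
	assert(mock_irq_set_vcpu_affinity(1, &token) == -ENOSYS);
	assert(mock_irq_set_vcpu_affinity(3, &token) == -EINVAL);
}
```

Keeping vcpu_info as an opaque void pointer is what lets the interface
stay platform independent: only the chip that implements the callback
needs to know its layout.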



* [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:54   ` [tip:x86/apic] x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs tip-bot for Feng Wu
  2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
  2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

Implement irq_set_vcpu_affinity for pci_msi_ir_controller.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/apic/msi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 58fde66..d2d95e2 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -152,6 +152,7 @@ static struct irq_chip pci_msi_ir_controller = {
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
 
-- 
2.1.0
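
The one-line change above relies on the hierarchical irqdomain model: the
MSI-level chip does no work itself and forwards the request to its
parent. The user-space sketch below illustrates that delegation. The h_*
names are hypothetical, and, like the kernel helper in patch 1/4, the
sketch assumes parent_data is non-NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified two-level interrupt hierarchy. */
struct h_irq_data;

struct h_irq_chip {
	int (*irq_set_vcpu_affinity)(struct h_irq_data *data, void *vcpu_info);
};

struct h_irq_data {
	struct h_irq_chip *chip;
	struct h_irq_data *parent_data;	/* next level up in the hierarchy */
	void *seen;			/* vcpu_info observed at this level */
};

/* Model of irq_chip_set_vcpu_affinity_parent(): forward to the parent. */
int h_set_vcpu_affinity_parent(struct h_irq_data *data, void *vcpu_info)
{
	data = data->parent_data;
	if (data->chip->irq_set_vcpu_affinity)
		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);

	return -ENOSYS;
}

/* Stand-in for the remapping level that would do the real work. */
int h_remap_set_vcpu(struct h_irq_data *data, void *vcpu_info)
{
	data->seen = vcpu_info;
	return 0;
}

/* Usage: the child (MSI) level delegates, the parent level handles. */
void h_usage(void)
{
	static struct h_irq_chip remap_chip = { h_remap_set_vcpu };
	static struct h_irq_chip msi_chip = { h_set_vcpu_affinity_parent };
	static struct h_irq_data parent = { &remap_chip, NULL, NULL };
	static struct h_irq_data child = { &msi_chip, &parent, NULL };
	int token = 0;

	assert(child.chip->irq_set_vcpu_affinity(&child, &token) == 0);
	assert(parent.seen == &token);
}
```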



* [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
  2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:54   ` [tip:x86/apic] " tip-bot for Feng Wu
  2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

Currently, we use a global vector as the Posted-Interrupts
Notification Event for all the vCPUs in the system. We need
to introduce another global vector for VT-d Posted-Interrupts,
which will be used to wake up the sleeping vCPU when an external
interrupt from a direct-assigned device happens for that vCPU.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/entry_arch.h  |  2 ++
 arch/x86/include/asm/hardirq.h     |  1 +
 arch/x86/include/asm/hw_irq.h      |  2 ++
 arch/x86/include/asm/irq.h         |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  1 +
 arch/x86/kernel/entry_64.S         |  2 ++
 arch/x86/kernel/irq.c              | 31 +++++++++++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |  2 ++
 8 files changed, 45 insertions(+)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..27ca0af 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -23,6 +23,8 @@ BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 #ifdef CONFIG_HAVE_KVM
 BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR,
 		 smp_kvm_posted_intr_ipi)
+BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR,
+		 smp_kvm_posted_intr_wakeup_ipi)
 #endif
 
 /*
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..9866065 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -14,6 +14,7 @@ typedef struct {
 #endif
 #ifdef CONFIG_HAVE_KVM
 	unsigned int kvm_posted_intr_ipis;
+	unsigned int kvm_posted_intr_wakeup_ipis;
 #endif
 	unsigned int x86_platform_ipis;	/* arch dependent */
 	unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 1f88e71..6ffc847 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -29,6 +29,7 @@
 extern asmlinkage void apic_timer_interrupt(void);
 extern asmlinkage void x86_platform_ipi(void);
 extern asmlinkage void kvm_posted_intr_ipi(void);
+extern asmlinkage void kvm_posted_intr_wakeup_ipi(void);
 extern asmlinkage void error_interrupt(void);
 extern asmlinkage void irq_work_interrupt(void);
 
@@ -92,6 +93,7 @@ extern void trace_call_function_single_interrupt(void);
 #define trace_irq_move_cleanup_interrupt  irq_move_cleanup_interrupt
 #define trace_reboot_interrupt  reboot_interrupt
 #define trace_kvm_posted_intr_ipi kvm_posted_intr_ipi
+#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi
 #endif /* CONFIG_TRACING */
 
 #ifdef	CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index a80cbb8..8008d06 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -30,6 +30,10 @@ extern void fixup_irqs(void);
 extern void irq_force_complete_move(int);
 #endif
 
+#ifdef CONFIG_HAVE_KVM
+extern void kvm_set_posted_intr_wakeup_handler(void (*handler)(void));
+#endif
+
 extern void (*x86_platform_ipi_callback)(void);
 extern void native_init_IRQ(void);
 extern bool handle_irq(unsigned irq, struct pt_regs *regs);
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index b26cb12..dca94f2 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -105,6 +105,7 @@
 /* Vector for KVM to deliver posted interrupt IPI */
 #ifdef CONFIG_HAVE_KVM
 #define POSTED_INTR_VECTOR		0xf2
+#define POSTED_INTR_WAKEUP_VECTOR	0xf1
 #endif
 
 /*
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c7b2384..177feec 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -919,6 +919,8 @@ apicinterrupt X86_PLATFORM_IPI_VECTOR \
 #ifdef CONFIG_HAVE_KVM
 apicinterrupt3 POSTED_INTR_VECTOR \
 	kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \
+	kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
 #endif
 
 #ifdef CONFIG_X86_MCE_THRESHOLD
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e5952c2..2ec339a 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -237,6 +237,18 @@ __visible void smp_x86_platform_ipi(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_HAVE_KVM
+static void dummy_handler(void) {}
+static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler;
+
+void kvm_set_posted_intr_wakeup_handler(void (*handler)(void))
+{
+	if (handler)
+		kvm_posted_intr_wakeup_handler = handler;
+	else
+		kvm_posted_intr_wakeup_handler = dummy_handler;
+}
+EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wakeup_handler);
+
 /*
  * Handler for POSTED_INTERRUPT_VECTOR.
  */
@@ -256,6 +268,25 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
 
 	set_irq_regs(old_regs);
 }
+
+/*
+ * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
+ */
+__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	entering_ack_irq();
+
+	inc_irq_stat(kvm_posted_intr_wakeup_ipis);
+
+	kvm_posted_intr_wakeup_handler();
+
+	exiting_irq();
+
+	set_irq_regs(old_regs);
+}
+
 #endif
 
 __visible void smp_trace_x86_platform_ipi(struct pt_regs *regs)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index cd10a64..895941d 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -144,6 +144,8 @@ static void __init apic_intr_init(void)
 #ifdef CONFIG_HAVE_KVM
 	/* IPI for KVM to deliver posted interrupt */
 	alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+	/* IPI for KVM to deliver interrupt to wake up tasks */
+	alloc_intr_gate(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi);
 #endif
 
 	/* IPI vectors for APIC spurious and error interrupts */
-- 
2.1.0
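
The handler registration in the patch above keeps the interrupt path free
of conditionals by always pointing at a callable function, with a no-op
installed when nothing is registered. A user-space sketch of that pattern
follows. All demo_* names are hypothetical, and the real handler also
does entering_ack_irq()/exiting_irq() and saves and restores the irq
regs, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* No-op default, so the "interrupt" path never tests for NULL. */
static void demo_noop_handler(void) {}
static void (*demo_wakeup_handler)(void) = demo_noop_handler;

unsigned int demo_wakeup_ipis;		/* models the per-CPU statistic */
unsigned int demo_handler_calls;	/* counts calls into the consumer */

/* Model of kvm_set_posted_intr_wakeup_handler(): NULL restores the no-op. */
void demo_set_wakeup_handler(void (*handler)(void))
{
	if (handler)
		demo_wakeup_handler = handler;
	else
		demo_wakeup_handler = demo_noop_handler;
}

/* Model of the wakeup IPI body: count the event, then call the handler. */
void demo_wakeup_ipi(void)
{
	demo_wakeup_ipis++;
	demo_wakeup_handler();
}

/* Stand-in for the consumer (KVM in the real series). */
void demo_kvm_handler(void)
{
	demo_handler_calls++;
}

/* Usage: register, take one "interrupt", then unregister again. */
void demo_usage(void)
{
	unsigned int before = demo_handler_calls;

	demo_set_wakeup_handler(demo_kvm_handler);
	demo_wakeup_ipi();
	assert(demo_handler_calls == before + 1);
	demo_set_wakeup_handler(NULL);
}
```

The design choice is that the check for a missing handler happens once,
at registration time, instead of on every interrupt.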



* [v7 4/4] x86/irq: Show statistics information for posted-interrupts
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
                   ` (2 preceding siblings ...)
  2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:55   ` [tip:x86/apic] " tip-bot for Feng Wu
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

Show the statistics for the posted-interrupt notification event
and wakeup event in /proc/interrupts.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/kernel/irq.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 2ec339a..be466ff 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -136,6 +136,18 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 #if defined(CONFIG_X86_IO_APIC)
 	seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
 #endif
+#ifdef CONFIG_HAVE_KVM
+	seq_printf(p, "%*s: ", prec, "NEV");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_ipis);
+	seq_puts(p, "  Posted-interrupt notification event\n");
+
+	seq_printf(p, "%*s: ", prec, "WEV");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ",
+			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
+	seq_puts(p, "  Posted-interrupt wakeup event\n");
+#endif
 	return 0;
 }
 
-- 
2.1.0
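
The row layout produced by the seq_printf() calls above (a right-aligned
tag, one "%10u " column per online CPU, then a description) can be
reproduced in user space with snprintf(). The helper below is a
hypothetical sketch for illustration: demo_format_row() and its
parameters are not kernel API, and a plain array stands in for the
per-CPU irq_stats() counters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one /proc/interrupts-style row into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int demo_format_row(char *buf, size_t len, int prec, const char *tag,
		    const unsigned int *counts, int ncpus, const char *desc)
{
	size_t off;
	int n, i;

	/* Tag column, right-aligned to the given width. */
	n = snprintf(buf, len, "%*s: ", prec, tag);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	/* One counter column per CPU. */
	for (i = 0; i < ncpus; i++) {
		n = snprintf(buf + off, len - off, "%10u ", counts[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	/* Trailing human-readable description. */
	n = snprintf(buf + off, len - off, "  %s\n", desc);
	if (n < 0 || (size_t)n >= len - off)
		return -1;

	return (int)(off + (size_t)n);
}

/* Usage: two CPUs, formatted like the "NEV" row added by the patch. */
void demo_row_usage(void)
{
	const unsigned int counts[2] = { 3, 0 };
	char row[128];
	int n = demo_format_row(row, sizeof(row), 3, "NEV", counts, 2,
				"Posted-interrupt notification event");

	assert(n > 0 && strncmp(row, "NEV: ", 5) == 0);
}
```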



* [tip:irq/core] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
@ 2015-05-19 13:45   ` tip-bot for Jiang Liu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Jiang Liu @ 2015-05-19 13:45 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: feng.wu, hpa, tglx, mingo, linux-kernel, jiang.liu

Commit-ID:  0a4377de305684c883bf90ad21e3cbdeead70f5c
Gitweb:     http://git.kernel.org/tip/0a4377de305684c883bf90ad21e3cbdeead70f5c
Author:     Jiang Liu <jiang.liu@linux.intel.com>
AuthorDate: Tue, 19 May 2015 17:07:14 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:41:19 +0200

genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

With Posted-Interrupts support in Intel CPU and IOMMU, an external
interrupt from assigned-devices could be directly delivered to a
virtual CPU in a virtual machine. Instead of hacking KVM and Intel
IOMMU drivers, we propose a platform independent interface to target
an interrupt to a specific virtual CPU in a virtual machine, or set
virtual CPU affinity for an interrupt.

By adopting this new interface and the hierarchy irqdomain, we could
easily support posted-interrupts on Intel platforms, and also provide
flexible enough interfaces for other platforms to support similar
features.

Here is the usage scenario for this interface:
Guest updates MSI/MSI-X interrupt configuration
        -->QEMU and KVM handle this
        -->KVM calls this interface (passing the posted-interrupts
           descriptor and guest vector)
        -->irq core transfers control to the IOMMU
        -->IOMMU does the real work of updating the IRTE (the IRTE has a
           new format for VT-d Posted-Interrupts)

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Link: http://lkml.kernel.org/r/1432026437-16560-2-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irq.h |  6 ++++++
 kernel/irq/chip.c   | 14 ++++++++++++++
 kernel/irq/manage.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62c6901..48cb7d1 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -327,6 +327,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  * @irq_write_msi_msg:	optional to write message content for MSI
  * @irq_get_irqchip_state:	return the internal state of an interrupt
  * @irq_set_irqchip_state:	set the internal state of a interrupt
+ * @irq_set_vcpu_affinity:	optional to target a vCPU in a virtual machine
  * @flags:		chip specific flags
  */
 struct irq_chip {
@@ -369,6 +370,8 @@ struct irq_chip {
 	int		(*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
 	int		(*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);
 
+	int		(*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);
+
 	unsigned long	flags;
 };
 
@@ -422,6 +425,7 @@ extern void irq_cpu_online(void);
 extern void irq_cpu_offline(void);
 extern int irq_set_affinity_locked(struct irq_data *data,
 				   const struct cpumask *cpumask, bool force);
+extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
 void irq_move_irq(struct irq_data *data);
@@ -467,6 +471,8 @@ extern int irq_chip_set_affinity_parent(struct irq_data *data,
 					const struct cpumask *dest,
 					bool force);
 extern int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on);
+extern int irq_chip_set_vcpu_affinity_parent(struct irq_data *data,
+					     void *vcpu_info);
 #endif
 
 /* Handling of unhandled and spurious interrupts: */
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index eb9a4ea..55016b2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -950,6 +950,20 @@ int irq_chip_retrigger_hierarchy(struct irq_data *data)
 }
 
 /**
+ * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
+ * @data:	Pointer to interrupt specific data
+ * @vcpu_info:	The vcpu affinity information
+ */
+int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
+{
+	data = data->parent_data;
+	if (data->chip->irq_set_vcpu_affinity)
+		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
+
+	return -ENOSYS;
+}
+
+/**
  * irq_chip_set_wake_parent - Set/reset wake-up on the parent interrupt
  * @data:	Pointer to interrupt specific data
  * @on:		Whether to set or reset the wake-up capability of this irq
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e68932b..b1c7e8f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -256,6 +256,37 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
 }
 EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
 
+/**
+ *	irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
+ *	@irq: interrupt number to set affinity
+ *	@vcpu_info: vCPU specific data
+ *
+ *	This function uses the vCPU specific data to set the vCPU
+ *	affinity for an irq. The vCPU specific data is passed from
+ *	outside, such as KVM. One example code path is as below:
+ *	KVM -> IOMMU -> irq_set_vcpu_affinity().
+ */
+int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
+{
+	unsigned long flags;
+	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
+	struct irq_data *data;
+	struct irq_chip *chip;
+	int ret = -ENOSYS;
+
+	if (!desc)
+		return -EINVAL;
+
+	data = irq_desc_get_irq_data(desc);
+	chip = irq_data_get_irq_chip(data);
+	if (chip && chip->irq_set_vcpu_affinity)
+		ret = chip->irq_set_vcpu_affinity(data, vcpu_info);
+	irq_put_desc_unlock(desc, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
+
 static void irq_affinity_notify(struct work_struct *work)
 {
 	struct irq_affinity_notify *notify =


* [tip:x86/apic] x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs
  2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
@ 2015-05-19 13:54   ` tip-bot for Feng Wu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Feng Wu @ 2015-05-19 13:54 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: hpa, feng.wu, jiang.liu, tglx, linux-kernel, mingo

Commit-ID:  a2f1c8bdc02bfcaa5a658283b883fdb54e328b36
Gitweb:     http://git.kernel.org/tip/a2f1c8bdc02bfcaa5a658283b883fdb54e328b36
Author:     Feng Wu <feng.wu@intel.com>
AuthorDate: Tue, 19 May 2015 17:07:15 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:51:17 +0200

x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs

Implement irq_set_vcpu_affinity for pci_msi_ir_controller.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1432026437-16560-3-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/msi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index ef516af..1a9d735 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -152,6 +152,7 @@ static struct irq_chip pci_msi_ir_controller = {
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
 


* [tip:x86/apic] x86/irq: Define a global vector for VT-d Posted-Interrupts
  2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
@ 2015-05-19 13:54   ` tip-bot for Feng Wu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Feng Wu @ 2015-05-19 13:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, hpa, linux-kernel, yang.z.zhang, mingo, feng.wu, hpa

Commit-ID:  f6b3c72c23661e5534cd2eede16e9bac7ebb761c
Gitweb:     http://git.kernel.org/tip/f6b3c72c23661e5534cd2eede16e9bac7ebb761c
Author:     Feng Wu <feng.wu@intel.com>
AuthorDate: Tue, 19 May 2015 17:07:16 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:51:17 +0200

x86/irq: Define a global vector for VT-d Posted-Interrupts

Currently, we use a global vector as the Posted-Interrupts
Notification Event for all the vCPUs in the system. We need
to introduce another global vector for VT-d Posted-Interrupts,
which will be used to wake up the sleeping vCPU when an external
interrupt from a direct-assigned device happens for that vCPU.

[ tglx: Removed a gazillion of extra newlines ]

Signed-off-by: Feng Wu <feng.wu@intel.com>
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1432026437-16560-4-git-send-email-feng.wu@intel.com
Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/entry_arch.h  |  2 ++
 arch/x86/include/asm/hardirq.h     |  1 +
 arch/x86/include/asm/hw_irq.h      |  2 ++
 arch/x86/include/asm/irq.h         |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  1 +
 arch/x86/kernel/entry_64.S         |  2 ++
 arch/x86/kernel/irq.c              | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |  2 ++
 8 files changed, 40 insertions(+)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..27ca0af 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -23,6 +23,8 @@ BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 #ifdef CONFIG_HAVE_KVM
 BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR,
 		 smp_kvm_posted_intr_ipi)
+BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR,
+		 smp_kvm_posted_intr_wakeup_ipi)
 #endif
 
 /*
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..9866065 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -14,6 +14,7 @@ typedef struct {
 #endif
 #ifdef CONFIG_HAVE_KVM
 	unsigned int kvm_posted_intr_ipis;
+	unsigned int kvm_posted_intr_wakeup_ipis;
 #endif
 	unsigned int x86_platform_ipis;	/* arch dependent */
 	unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 9ec5d37..10c80d4 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -29,6 +29,7 @@
 extern asmlinkage void apic_timer_interrupt(void);
 extern asmlinkage void x86_platform_ipi(void);
 extern asmlinkage void kvm_posted_intr_ipi(void);
+extern asmlinkage void kvm_posted_intr_wakeup_ipi(void);
 extern asmlinkage void error_interrupt(void);
 extern asmlinkage void irq_work_interrupt(void);
 
@@ -58,6 +59,7 @@ extern void trace_call_function_single_interrupt(void);
 #define trace_irq_move_cleanup_interrupt  irq_move_cleanup_interrupt
 #define trace_reboot_interrupt  reboot_interrupt
 #define trace_kvm_posted_intr_ipi kvm_posted_intr_ipi
+#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi
 #endif /* CONFIG_TRACING */
 
 #ifdef	CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index a80cbb8..8008d06 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -30,6 +30,10 @@ extern void fixup_irqs(void);
 extern void irq_force_complete_move(int);
 #endif
 
+#ifdef CONFIG_HAVE_KVM
+extern void kvm_set_posted_intr_wakeup_handler(void (*handler)(void));
+#endif
+
 extern void (*x86_platform_ipi_callback)(void);
 extern void native_init_IRQ(void);
 extern bool handle_irq(unsigned irq, struct pt_regs *regs);
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index bf55235..0ed29ac 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -86,6 +86,7 @@
 /* Vector for KVM to deliver posted interrupt IPI */
 #ifdef CONFIG_HAVE_KVM
 #define POSTED_INTR_VECTOR		0xf2
+#define POSTED_INTR_WAKEUP_VECTOR	0xf1
 #endif
 
 /*
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 47b9581..22aadc9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -916,6 +916,8 @@ apicinterrupt X86_PLATFORM_IPI_VECTOR \
 #ifdef CONFIG_HAVE_KVM
 apicinterrupt3 POSTED_INTR_VECTOR \
 	kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \
+	kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
 #endif
 
 #ifdef CONFIG_X86_MCE_THRESHOLD
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index be38945..90b2f705 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -242,6 +242,18 @@ __visible void smp_x86_platform_ipi(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_HAVE_KVM
+static void dummy_handler(void) {}
+static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler;
+
+void kvm_set_posted_intr_wakeup_handler(void (*handler)(void))
+{
+	if (handler)
+		kvm_posted_intr_wakeup_handler = handler;
+	else
+		kvm_posted_intr_wakeup_handler = dummy_handler;
+}
+EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wakeup_handler);
+
 /*
  * Handler for POSTED_INTERRUPT_VECTOR.
  */
@@ -254,6 +266,20 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
 	exiting_irq();
 	set_irq_regs(old_regs);
 }
+
+/*
+ * Handler for POSTED_INTR_WAKEUP_VECTOR.
+ */
+__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	entering_ack_irq();
+	inc_irq_stat(kvm_posted_intr_wakeup_ipis);
+	kvm_posted_intr_wakeup_handler();
+	exiting_irq();
+	set_irq_regs(old_regs);
+}
 #endif
 
 __visible void smp_trace_x86_platform_ipi(struct pt_regs *regs)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index dc1e08d..680723a 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -144,6 +144,8 @@ static void __init apic_intr_init(void)
 #ifdef CONFIG_HAVE_KVM
 	/* IPI for KVM to deliver posted interrupt */
 	alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+	/* IPI for KVM to deliver interrupt to wake up tasks */
+	alloc_intr_gate(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi);
 #endif
 
 	/* IPI vectors for APIC spurious and error interrupts */
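
The handler slot added to arch/x86/kernel/irq.c above defaults to a no-op
function instead of NULL, so the interrupt path can call through the pointer
without a conditional. Below is a minimal userspace sketch of that pattern,
not the kernel code itself. All names in it (noop_handler, set_wakeup_handler,
simulate_wakeup_ipi, example_handler, and the two counters) are hypothetical
stand-ins chosen for illustration.

```c
#include <stddef.h>

/* Default target for the slot, so the slot is never NULL. */
void noop_handler(void) {}

/* Hypothetical stand-in for kvm_posted_intr_wakeup_handler. */
void (*wakeup_handler)(void) = noop_handler;

/* Hypothetical stand-in for the per-CPU statistics counter. */
unsigned int wakeup_count;

/* Bumped by example_handler so callers can observe that it ran. */
unsigned int handled_count;

/* Install a handler; passing NULL restores the no-op default. */
void set_wakeup_handler(void (*handler)(void))
{
	wakeup_handler = handler ? handler : noop_handler;
}

/*
 * Stand-in for the interrupt entry point: account for the event,
 * then call through the slot unconditionally.
 */
void simulate_wakeup_ipi(void)
{
	wakeup_count++;
	wakeup_handler();
}

/* Example handler that a module might register. */
void example_handler(void)
{
	handled_count++;
}
```

In this sketch a caller would register with set_wakeup_handler(example_handler)
and unregister with set_wakeup_handler(NULL). The event counter advances in
both states, while handled_count only advances when a real handler is
installed.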


* [tip:x86/apic] x86/irq: Show statistics information for posted-interrupts
  2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
@ 2015-05-19 13:55   ` tip-bot for Feng Wu
From: tip-bot for Feng Wu @ 2015-05-19 13:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, mingo, feng.wu, tglx, hpa

Commit-ID:  501b32653ebf49114cccb9afbf9150cf18fd8700
Gitweb:     http://git.kernel.org/tip/501b32653ebf49114cccb9afbf9150cf18fd8700
Author:     Feng Wu <feng.wu@intel.com>
AuthorDate: Tue, 19 May 2015 17:07:17 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:51:17 +0200

x86/irq: Show statistics information for posted-interrupts

Show statistics for posted-interrupt notification events and wakeup
events in /proc/interrupts.

[ tglx: Named the short identifiers PIN and PIW to match the long
  identifiers ]

Signed-off-by: Feng Wu <feng.wu@intel.com>
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1432026437-16560-5-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/irq.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 90b2f705..7e10c8b 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -142,6 +142,18 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 #if defined(CONFIG_X86_IO_APIC)
 	seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
 #endif
+#ifdef CONFIG_HAVE_KVM
+	seq_printf(p, "%*s: ", prec, "PIN");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_ipis);
+	seq_puts(p, "  Posted-interrupt notification event\n");
+
+	seq_printf(p, "%*s: ", prec, "PIW");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ",
+			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
+	seq_puts(p, "  Posted-interrupt wakeup event\n");
+#endif
 	return 0;
 }
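
The hunk above emits one /proc/interrupts row per counter: a right-aligned
label, one "%10u " column per online CPU, then a description. Below is a
rough userspace sketch of that row layout using snprintf() in place of
seq_printf(). The function name format_stat_row and its parameters are
hypothetical. The counts array stands in for the per-CPU irq_stats() values.

```c
#include <stdio.h>

/*
 * Format one statistics row into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the row does not fit in len bytes.
 */
int format_stat_row(char *buf, size_t len, int prec, const char *label,
		    const unsigned int *counts, int ncpus, const char *desc)
{
	size_t off = 0;
	int n, cpu;

	/* Label, right-aligned to prec characters, as in "%*s: ". */
	n = snprintf(buf, len, "%*s: ", prec, label);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off += (size_t)n;

	/* One 10-wide column plus a separating space per CPU. */
	for (cpu = 0; cpu < ncpus; cpu++) {
		n = snprintf(buf + off, len - off, "%10u ", counts[cpu]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	/* Two spaces, the description, and a newline close the row. */
	n = snprintf(buf + off, len - off, "  %s\n", desc);
	if (n < 0 || (size_t)n >= len - off)
		return -1;

	return (int)(off + (size_t)n);
}
```

Called with prec 3, label "PIW", two counts, and the description
"Posted-interrupt wakeup event", this produces a single line with the label,
two right-aligned numeric columns, and the description text.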
 

