[PATCH 0 of 12] PV on HVM Xen

xen-devel.lists.xenproject.org archive mirror

* [PATCH 0 of 12] PV on HVM Xen
@ 2010-05-18 10:22 Stefano Stabellini
  2010-05-18 10:22 ` [PATCH 01/12] Add support for hvm_op Stefano Stabellini
                   ` (12 more replies)
  0 siblings, 13 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeremy Fitzhardinge, xen-devel, Stabellini, Stefano, Don Dutile,
	Sheng Yang

Hi all,
this is the fixed, updated and rebased version of the PV on HVM series:
the series is now based on 2.6.34 and supports Xen PV frontends running
in an HVM domain, including netfront, blkfront and the VIRQ_TIMER.

The bugs fixed in this update include xenbus driver crashes when
xenbus is not properly initialized and a memory corruption bug in
suspend/resume. In addition, the test for the xen platform pci version
and protocol has been moved to enlighten.c (before unplugging emulated
devices).

In order to use VIRQ_TIMER and to improve performance you need a patch
to Xen that implements the vector callback mechanism for event channel
delivery.

A git tree is also available here:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git

branch name 2.6.34-pvhvm.

Cheers,

Stefano


* [PATCH 01/12] Add support for hvm_op
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
@ 2010-05-18 10:22 ` Stefano Stabellini
  2010-05-18 10:22 ` [PATCH 02/12] early PV on HVM Stefano Stabellini
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeremy Fitzhardinge, xen-devel, Stefano Stabellini,
	Jeremy Fitzhardinge, Don Dutile, Sheng Yang

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/include/asm/xen/hypercall.h |    6 ++
 include/xen/hvm.h                    |   24 +++++++++
 include/xen/interface/hvm/hvm_op.h   |   35 ++++++++++++
 include/xen/interface/hvm/params.h   |   95 ++++++++++++++++++++++++++++++++++
 4 files changed, 160 insertions(+), 0 deletions(-)
 create mode 100644 include/xen/hvm.h
 create mode 100644 include/xen/interface/hvm/hvm_op.h
 create mode 100644 include/xen/interface/hvm/params.h

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 9c371e4..7fda040 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -417,6 +417,12 @@ HYPERVISOR_nmi_op(unsigned long op, unsigned long arg)
 	return _hypercall2(int, nmi_op, op, arg);
 }
 
+static inline unsigned long __must_check
+HYPERVISOR_hvm_op(int op, void *arg)
+{
+       return _hypercall2(unsigned long, hvm_op, op, arg);
+}
+
 static inline void
 MULTI_fpu_taskswitch(struct multicall_entry *mcl, int set)
 {
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
new file mode 100644
index 0000000..6b0d418
--- /dev/null
+++ b/include/xen/hvm.h
@@ -0,0 +1,24 @@
+/* Simple wrappers around HVM functions */
+#ifndef XEN_HVM_H__
+#define XEN_HVM_H__
+
+#include <xen/interface/hvm/params.h>
+
+static inline int hvm_get_parameter(int idx, uint64_t *value)
+{
+       struct xen_hvm_param xhv;
+       int r;
+
+       xhv.domid = DOMID_SELF;
+       xhv.index = idx;
+       r = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
+       if (r < 0) {
+               printk(KERN_ERR "cannot get hvm parameter %d: %d.\n",
+                      idx, r);
+               return r;
+       }
+       *value = xhv.value;
+       return r;
+}
+
+#endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
new file mode 100644
index 0000000..73c8c7e
--- /dev/null
+++ b/include/xen/interface/hvm/hvm_op.h
@@ -0,0 +1,35 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_HVM_OP_H__
+#define __XEN_PUBLIC_HVM_HVM_OP_H__
+
+/* Get/set subcommands: the second argument of the hypercall is a
+ * pointer to a xen_hvm_param struct. */
+#define HVMOP_set_param           0
+#define HVMOP_get_param           1
+struct xen_hvm_param {
+    domid_t  domid;    /* IN */
+    uint32_t index;    /* IN */
+    uint64_t value;    /* IN/OUT */
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param);
+
+#endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
new file mode 100644
index 0000000..1888d8c
--- /dev/null
+++ b/include/xen/interface/hvm/params.h
@@ -0,0 +1,95 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_PARAMS_H__
+#define __XEN_PUBLIC_HVM_PARAMS_H__
+
+#include "hvm_op.h"
+
+/*
+ * Parameter space for HVMOP_{set,get}_param.
+ */
+
+/*
+ * How should CPU0 event-channel notifications be delivered?
+ * val[63:56] == 0: val[55:0] is a delivery GSI (Global System Interrupt).
+ * val[63:56] == 1: val[55:0] is a delivery PCI INTx line, as follows:
+ *                  Domain = val[47:32], Bus  = val[31:16],
+ *                  DevFn  = val[15: 8], IntX = val[ 1: 0]
+ * val[63:56] == 2: val[7:0] is a vector number.
+ * If val == 0 then CPU0 event-channel notifications are not delivered.
+ */
+#define HVM_PARAM_CALLBACK_IRQ 0
+
+#define HVM_PARAM_STORE_PFN    1
+#define HVM_PARAM_STORE_EVTCHN 2
+
+#define HVM_PARAM_PAE_ENABLED  4
+
+#define HVM_PARAM_IOREQ_PFN    5
+
+#define HVM_PARAM_BUFIOREQ_PFN 6
+
+/*
+ * Set mode for virtual timers (currently x86 only):
+ *  delay_for_missed_ticks (default):
+ *   Do not advance a vcpu's time beyond the correct delivery time for
+ *   interrupts that have been missed due to preemption. Deliver missed
+ *   interrupts when the vcpu is rescheduled and advance the vcpu's virtual
+ *   time stepwise for each one.
+ *  no_delay_for_missed_ticks:
+ *   As above, missed interrupts are delivered, but guest time always tracks
+ *   wallclock (i.e., real) time while doing so.
+ *  no_missed_ticks_pending:
+ *   No missed interrupts are held pending. Instead, to ensure ticks are
+ *   delivered at some non-zero rate, if we detect missed ticks then the
+ *   internal tick alarm is not disabled if the VCPU is preempted during the
+ *   next tick period.
+ *  one_missed_tick_pending:
+ *   Missed interrupts are collapsed together and delivered as one 'late tick'.
+ *   Guest time always tracks wallclock (i.e., real) time.
+ */
+#define HVM_PARAM_TIMER_MODE   10
+#define HVMPTM_delay_for_missed_ticks    0
+#define HVMPTM_no_delay_for_missed_ticks 1
+#define HVMPTM_no_missed_ticks_pending   2
+#define HVMPTM_one_missed_tick_pending   3
+
+/* Boolean: Enable virtual HPET (high-precision event timer)? (x86-only) */
+#define HVM_PARAM_HPET_ENABLED 11
+
+/* Identity-map page directory used by Intel EPT when CR0.PG=0. */
+#define HVM_PARAM_IDENT_PT     12
+
+/* Device Model domain, defaults to 0. */
+#define HVM_PARAM_DM_DOMAIN    13
+
+/* ACPI S state: currently support S0 and S3 on x86. */
+#define HVM_PARAM_ACPI_S_STATE 14
+
+/* TSS used on Intel when CR0.PE=0. */
+#define HVM_PARAM_VM86_TSS     15
+
+/* Boolean: Enable aligning all periodic vpts to reduce interrupts */
+#define HVM_PARAM_VPT_ALIGN    16
+
+#define HVM_NR_PARAMS          17
+
+#endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.5.4.3
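
A note on the HVM_PARAM_CALLBACK_IRQ layout documented in params.h
above: the bit fields can be illustrated with a small user-space decoder.
This is only a sketch under stated assumptions. struct callback_via, the
enum names and decode_callback_via() are made up for illustration and are
not part of this series; only the bit positions come from the comment in
the patch.

```c
#include <stdint.h>

/* Delivery types, taken from val[63:56] in the params.h comment. */
enum callback_via_type {
	CALLBACK_VIA_GSI      = 0,
	CALLBACK_VIA_PCI_INTX = 1,
	CALLBACK_VIA_VECTOR   = 2,
};

/* Hypothetical container for the decoded fields (not in the patch). */
struct callback_via {
	unsigned int type;
	uint64_t gsi;		/* type 0: val[55:0] */
	unsigned int domain;	/* type 1: val[47:32] */
	unsigned int bus;	/* type 1: val[31:16] */
	unsigned int devfn;	/* type 1: val[15:8] */
	unsigned int intx;	/* type 1: val[1:0] */
	unsigned int vector;	/* type 2: val[7:0] */
};

/* Split a raw parameter value into its documented fields. */
static struct callback_via decode_callback_via(uint64_t val)
{
	struct callback_via cv = {0};

	cv.type = (unsigned int)(val >> 56);
	switch (cv.type) {
	case CALLBACK_VIA_GSI:
		/* val == 0 means notifications are not delivered at all. */
		cv.gsi = val & ((UINT64_C(1) << 56) - 1);
		break;
	case CALLBACK_VIA_PCI_INTX:
		cv.domain = (unsigned int)((val >> 32) & 0xffff);
		cv.bus    = (unsigned int)((val >> 16) & 0xffff);
		cv.devfn  = (unsigned int)((val >> 8) & 0xff);
		cv.intx   = (unsigned int)(val & 0x3);
		break;
	case CALLBACK_VIA_VECTOR:
		cv.vector = (unsigned int)(val & 0xff);
		break;
	default:
		/* Unknown type: leave every field zeroed. */
		break;
	}
	return cv;
}
```

Calling decode_callback_via() on a value whose top byte is 2 fills in
only the vector field; the other fields stay zero.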


* [PATCH 02/12] early PV on HVM
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
  2010-05-18 10:22 ` [PATCH 01/12] Add support for hvm_op Stefano Stabellini
@ 2010-05-18 10:22 ` Stefano Stabellini
  2010-05-18 10:22 ` [PATCH 03/12] evtchn delivery " Stefano Stabellini
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeremy Fitzhardinge, xen-devel, Stefano Stabellini,
	Yaozu (Eddie) Dong, Don Dutile, Sheng Yang

From: Sheng Yang <sheng@linux.intel.com>

Initialize basic pv on hvm features in xen_guest_init.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
---
 arch/x86/kernel/setup.c           |    2 +
 arch/x86/xen/enlighten.c          |   87 +++++++++++++++++++++++++++++++++++++
 drivers/input/xen-kbdfront.c      |    2 +-
 drivers/video/xen-fbfront.c       |    2 +-
 drivers/xen/xenbus/xenbus_probe.c |   21 ++++++++-
 include/xen/xen.h                 |    2 +
 6 files changed, 111 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index c4851ef..ae9b6cb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -69,6 +69,7 @@
 #include <linux/tboot.h>
 
 #include <video/edid.h>
+#include <xen/xen.h>
 
 #include <asm/mtrr.h>
 #include <asm/apic.h>
@@ -1032,6 +1033,7 @@ void __init setup_arch(char **cmdline_p)
 	probe_nr_irqs_gsi();
 
 	kvm_guest_init();
+	xen_guest_init();
 
 	e820_reserve_resources();
 	e820_mark_nosave_regions(max_low_pfn);
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 65d8d79..87a3b10 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -35,6 +35,8 @@
 #include <xen/interface/version.h>
 #include <xen/interface/physdev.h>
 #include <xen/interface/vcpu.h>
+#include <xen/interface/memory.h>
+#include <xen/interface/hvm/hvm_op.h>
 #include <xen/features.h>
 #include <xen/page.h>
 #include <xen/hvc-console.h>
@@ -56,6 +58,7 @@
 #include <asm/tlbflush.h>
 #include <asm/reboot.h>
 #include <asm/stackprotector.h>
+#include <asm/hypervisor.h>
 
 #include "xen-ops.h"
 #include "mmu.h"
@@ -1206,3 +1209,87 @@ asmlinkage void __init xen_start_kernel(void)
 	x86_64_start_reservations((char *)__pa_symbol(&boot_params));
 #endif
 }
+
+static uint32_t xen_cpuid_base(void)
+{
+	uint32_t base, eax, ebx, ecx, edx;
+	char signature[13];
+
+	for (base = 0x40000000; base < 0x40010000; base += 0x100) {
+		cpuid(base, &eax, &ebx, &ecx, &edx);
+		*(uint32_t*)(signature + 0) = ebx;
+		*(uint32_t*)(signature + 4) = ecx;
+		*(uint32_t*)(signature + 8) = edx;
+		signature[12] = 0;
+
+		if (!strcmp("XenVMMXenVMM", signature) && ((eax - base) >= 2))
+			return base;
+	}
+
+	return 0;
+}
+
+static int init_hvm_pv_info(int *major, int *minor)
+{
+	uint32_t eax, ebx, ecx, edx, pages, msr, base;
+	u64 pfn;
+
+	base = xen_cpuid_base();
+	if (!base)
+		return -EINVAL;
+
+	cpuid(base + 1, &eax, &ebx, &ecx, &edx);
+
+	*major = eax >> 16;
+	*minor = eax & 0xffff;
+	printk(KERN_INFO "Xen version %d.%d.\n", *major, *minor);
+
+	cpuid(base + 2, &pages, &msr, &ecx, &edx);
+
+	pfn = __pa(hypercall_page);
+	wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
+
+	xen_setup_features();
+
+	pv_info = xen_info;
+	pv_info.kernel_rpl = 0;
+
+	xen_domain_type = XEN_HVM_DOMAIN;
+
+	return 0;
+}
+
+static void __init init_shared_info(void)
+{
+	struct xen_add_to_physmap xatp;
+	struct shared_info *shared_info_page;
+
+	shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
+	xatp.domid = DOMID_SELF;
+	xatp.idx = 0;
+	xatp.space = XENMAPSPACE_shared_info;
+	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
+		BUG();
+
+	HYPERVISOR_shared_info = (struct shared_info *)shared_info_page;
+
+	/* Don't do the full vcpu_info placement stuff until we have a
+	   possible map and a non-dummy shared_info. */
+	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
+}
+
+void __init xen_guest_init(void)
+{
+	int r;
+	int major, minor;
+
+	if (xen_pv_domain())
+		return;
+
+	r = init_hvm_pv_info(&major, &minor);
+	if (r < 0)
+		return;
+
+	init_shared_info();
+}
diff --git a/drivers/input/xen-kbdfront.c b/drivers/input/xen-kbdfront.c
index e140816..7451d78 100644
--- a/drivers/input/xen-kbdfront.c
+++ b/drivers/input/xen-kbdfront.c
@@ -339,7 +339,7 @@ static struct xenbus_driver xenkbd_driver = {
 
 static int __init xenkbd_init(void)
 {
-	if (!xen_domain())
+	if (!xen_domain() || xen_hvm_domain())
 		return -ENODEV;
 
 	/* Nothing to do if running in dom0. */
diff --git a/drivers/video/xen-fbfront.c b/drivers/video/xen-fbfront.c
index fa97d3e..a105a19 100644
--- a/drivers/video/xen-fbfront.c
+++ b/drivers/video/xen-fbfront.c
@@ -684,7 +684,7 @@ static struct xenbus_driver xenfb_driver = {
 
 static int __init xenfb_init(void)
 {
-	if (!xen_domain())
+	if (!xen_domain() || xen_hvm_domain())
 		return -ENODEV;
 
 	/* Nothing to do if running in dom0. */
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 3479332..0b05b62 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -56,6 +56,8 @@
 #include <xen/events.h>
 #include <xen/page.h>
 
+#include <xen/hvm.h>
+
 #include "xenbus_comms.h"
 #include "xenbus_probe.h"
 
@@ -806,10 +808,23 @@ static int __init xenbus_probe_init(void)
 		/* dom0 not yet supported */
 	} else {
 		xenstored_ready = 1;
-		xen_store_evtchn = xen_start_info->store_evtchn;
-		xen_store_mfn = xen_start_info->store_mfn;
+		if (xen_hvm_domain()) {
+			uint64_t v = 0;
+			err = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v);
+			if (err)
+				goto out_error;
+			xen_store_evtchn = (int)v;
+			err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
+			if (err)
+				goto out_error;
+			xen_store_mfn = (unsigned long)v;
+			xen_store_interface = ioremap(xen_store_mfn << PAGE_SHIFT, PAGE_SIZE);
+		} else {
+			xen_store_evtchn = xen_start_info->store_evtchn;
+			xen_store_mfn = xen_start_info->store_mfn;
+			xen_store_interface = mfn_to_virt(xen_store_mfn);
+		}
 	}
-	xen_store_interface = mfn_to_virt(xen_store_mfn);
 
 	/* Initialize the interface to xenstore. */
 	err = xs_init();
diff --git a/include/xen/xen.h b/include/xen/xen.h
index a164024..cb8c48b 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -9,8 +9,10 @@ enum xen_domain_type {
 
 #ifdef CONFIG_XEN
 extern enum xen_domain_type xen_domain_type;
+extern void xen_guest_init(void);
 #else
 #define xen_domain_type		XEN_NATIVE
+#define xen_guest_init() do { } while (0)
 #endif
 
 #define xen_domain()		(xen_domain_type != XEN_NATIVE)
-- 
1.5.4.3
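
For readers following the CPUID probing in xen_cpuid_base() and
init_hvm_pv_info() above, the two decoding steps can be tried outside the
kernel. The sketch below is an assumption-laden illustration:
cpuid_signature() and xen_version_from_eax() are hypothetical helper names,
and memcpy is used instead of the pointer casts in the patch. The register
order (EBX, ECX, EDX) and the major/minor split of EAX follow the patch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Rebuild the 12-byte hypervisor signature from EBX, ECX and EDX, in
 * that order, and NUL-terminate it so it can be compared with strcmp().
 */
static void cpuid_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
			    char out[13])
{
	memcpy(out + 0, &ebx, sizeof(ebx));
	memcpy(out + 4, &ecx, sizeof(ecx));
	memcpy(out + 8, &edx, sizeof(edx));
	out[12] = '\0';
}

/*
 * Split EAX of leaf base + 1 into the version numbers: the major number
 * is in the upper 16 bits and the minor number in the lower 16 bits.
 */
static void xen_version_from_eax(uint32_t eax, int *major, int *minor)
{
	*major = (int)(eax >> 16);
	*minor = (int)(eax & 0xffff);
}
```

On a Xen HVM guest the signature compares equal to "XenVMMXenVMM" at the
first matching leaf between 0x40000000 and 0x40010000.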


* [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
  2010-05-18 10:22 ` [PATCH 01/12] Add support for hvm_op Stefano Stabellini
  2010-05-18 10:22 ` [PATCH 02/12] early PV on HVM Stefano Stabellini
@ 2010-05-18 10:22 ` Stefano Stabellini
  2010-05-18 17:17   ` Jeremy Fitzhardinge
                     ` (2 more replies)
  2010-05-18 10:22 ` [PATCH 04/12] Fix find_unbound_irq in presence of ioapic irqs Stefano Stabellini
                   ` (9 subsequent siblings)
  12 siblings, 3 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeremy Fitzhardinge, xen-devel, Don Dutile, Stefano Stabellini,
	Sheng Yang

From: Sheng Yang <sheng@linux.intel.com>

Set the callback to receive evtchns from Xen, using the
callback vector delivery mechanism.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/xen/enlighten.c         |   35 +++++++++++++++++++++++++++++++++++
 drivers/xen/events.c             |   31 ++++++++++++++++++++++++-------
 include/xen/events.h             |    3 +++
 include/xen/hvm.h                |    9 +++++++++
 include/xen/interface/features.h |    3 +++
 5 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 87a3b10..502c4f8 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -37,8 +37,11 @@
 #include <xen/interface/vcpu.h>
 #include <xen/interface/memory.h>
 #include <xen/interface/hvm/hvm_op.h>
+#include <xen/interface/hvm/params.h>
 #include <xen/features.h>
 #include <xen/page.h>
+#include <xen/hvm.h>
+#include <xen/events.h>
 #include <xen/hvc-console.h>
 
 #include <asm/paravirt.h>
@@ -79,6 +82,8 @@ struct shared_info xen_dummy_shared_info;
 
 void *xen_initial_gdt;
 
+int xen_have_vector_callback;
+
 /*
  * Point at some empty memory to start with. We map the real shared_info
  * page as soon as fixmap is up and running.
@@ -1279,6 +1284,31 @@ static void __init init_shared_info(void)
 	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
 }
 
+int xen_set_callback_via(uint64_t via)
+{
+	struct xen_hvm_param a;
+	a.domid = DOMID_SELF;
+	a.index = HVM_PARAM_CALLBACK_IRQ;
+	a.value = via;
+	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
+}
+
+void do_hvm_pv_evtchn_intr(void)
+{
+	xen_hvm_evtchn_do_upcall(get_irq_regs());
+}
+
+static void xen_callback_vector(void)
+{
+	uint64_t callback_via;
+	if (xen_feature(XENFEAT_hvm_callback_vector)) {
+		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
+		xen_set_callback_via(callback_via);
+		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
+		xen_have_vector_callback = 1;
+	}
+}
+
 void __init xen_guest_init(void)
 {
 	int r;
@@ -1292,4 +1322,9 @@ void __init xen_guest_init(void)
 		return;
 
 	init_shared_info();
+
+	xen_callback_vector();
+
+	have_vcpu_info_placement = 0;
+	x86_init.irqs.intr_init = xen_init_IRQ;
 }
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index db8f506..3523dbb 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -36,6 +36,8 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
 
+#include <xen/xen.h>
+#include <xen/hvm.h>
 #include <xen/xen-ops.h>
 #include <xen/events.h>
 #include <xen/interface/xen.h>
@@ -617,17 +619,13 @@ static DEFINE_PER_CPU(unsigned, xed_nesting_count);
  * a bitset of words which contain pending event bits.  The second
  * level is a bitset of pending events themselves.
  */
-void xen_evtchn_do_upcall(struct pt_regs *regs)
+void __xen_evtchn_do_upcall(struct pt_regs *regs)
 {
 	int cpu = get_cpu();
-	struct pt_regs *old_regs = set_irq_regs(regs);
 	struct shared_info *s = HYPERVISOR_shared_info;
 	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
  	unsigned count;
 
-	exit_idle();
-	irq_enter();
-
 	do {
 		unsigned long pending_words;
 
@@ -667,10 +665,26 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	} while(count != 1);
 
 out:
+
+	put_cpu();
+}
+
+void xen_evtchn_do_upcall(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	exit_idle();
+	irq_enter();
+
+	__xen_evtchn_do_upcall(regs);
+
 	irq_exit();
 	set_irq_regs(old_regs);
+}
 
-	put_cpu();
+void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
+{
+	__xen_evtchn_do_upcall(regs);
 }
 
 /* Rebind a new event channel to an existing irq. */
@@ -947,5 +961,8 @@ void __init xen_init_IRQ(void)
 	for (i = 0; i < NR_EVENT_CHANNELS; i++)
 		mask_evtchn(i);
 
-	irq_ctx_init(smp_processor_id());
+	if (xen_hvm_domain())
+		native_init_IRQ();
+	else
+		irq_ctx_init(smp_processor_id());
 }
diff --git a/include/xen/events.h b/include/xen/events.h
index e68d59a..868e5d6 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -56,4 +56,7 @@ void xen_poll_irq(int irq);
 /* Determine the IRQ which is bound to an event channel */
 unsigned irq_from_evtchn(unsigned int evtchn);
 
+void xen_evtchn_do_upcall(struct pt_regs *regs);
+void xen_hvm_evtchn_do_upcall(struct pt_regs *regs);
+
 #endif	/* _XEN_EVENTS_H */
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
index 6b0d418..5940ee5 100644
--- a/include/xen/hvm.h
+++ b/include/xen/hvm.h
@@ -3,6 +3,7 @@
 #define XEN_HVM_H__
 
 #include <xen/interface/hvm/params.h>
+#include <asm/xen/hypercall.h>
 
 static inline int hvm_get_parameter(int idx, uint64_t *value)
 {
@@ -21,4 +22,12 @@ static inline int hvm_get_parameter(int idx, uint64_t *value)
        return r;
 }
 
+int xen_set_callback_via(uint64_t via);
+extern int xen_have_vector_callback;
+
+#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
+#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
+#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
+                               HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
+
 #endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
index f51b641..8ab08b9 100644
--- a/include/xen/interface/features.h
+++ b/include/xen/interface/features.h
@@ -41,6 +41,9 @@
 /* x86: Does this Xen host support the MMU_PT_UPDATE_PRESERVE_AD hypercall? */
 #define XENFEAT_mmu_pt_update_preserve_ad  5
 
+/* x86: Does this Xen host support the HVM callback vector type? */
+#define XENFEAT_hvm_callback_vector        8
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
-- 
1.5.4.3
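
To make the value passed to xen_set_callback_via() concrete, here is a
small user-space sketch that reuses the HVM_CALLBACK_VECTOR macro from the
include/xen/hvm.h hunk above. The helper functions are hypothetical and the
vector number used when trying it out (0xf3 below) is an arbitrary example,
not the value of X86_PLATFORM_IPI_VECTOR.

```c
#include <stdint.h>

/* Copied from the include/xen/hvm.h hunk in this patch. */
#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
                               HVM_CALLBACK_VIA_TYPE_SHIFT | (x))

/* Hypothetical helper: build the parameter value for a given vector. */
static uint64_t callback_via_for_vector(uint8_t vector)
{
	/* The shift binds tighter than '|', so the type lands in bits 63:56. */
	return HVM_CALLBACK_VECTOR(vector);
}

/* Hypothetical helper: read the delivery type back out of the value. */
static unsigned int callback_via_type(uint64_t via)
{
	return (unsigned int)(via >> HVM_CALLBACK_VIA_TYPE_SHIFT);
}
```

For vector 0xf3 this gives 0x02000000000000f3: type 2 in the top byte and
the vector number in the low byte, matching the params.h description.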


* [PATCH 04/12] Fix find_unbound_irq in presence of ioapic irqs.
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (2 preceding siblings ...)
  2010-05-18 10:22 ` [PATCH 03/12] evtchn delivery " Stefano Stabellini
@ 2010-05-18 10:22 ` Stefano Stabellini
  2010-05-18 10:23 ` [PATCH 05/12] unplug emulated disks and nics Stefano Stabellini
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeremy Fitzhardinge, xen-devel, Don Dutile, Stefano Stabellini,
	Sheng Yang

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 drivers/xen/events.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 3523dbb..a137a2f 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -337,9 +337,18 @@ static int find_unbound_irq(void)
 	int irq;
 	struct irq_desc *desc;
 
-	for (irq = 0; irq < nr_irqs; irq++)
+	for (irq = 0; irq < nr_irqs; irq++) {
+		desc = irq_to_desc(irq);
+		/* only 0->15 have init'd desc; handle irq > 16 */
+		if (desc == NULL)
+			break;
+		if (desc->chip == &no_irq_chip)
+			break;
+		if (desc->chip != &xen_dynamic_chip)
+			continue;
 		if (irq_info[irq].type == IRQT_UNBOUND)
 			break;
+	}
 
 	if (irq == nr_irqs)
 		panic("No available IRQ to bind to: increase nr_irqs!\n");
-- 
1.5.4.3
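
The search order introduced by this patch can be modelled in user space.
The following is only a sketch: struct irq_slot, the chip_kind values and
find_unbound_slot() are stand-ins invented for illustration, replacing
irq_to_desc(), the chip pointers and irq_info[] used by the real code.

```c
#include <stddef.h>

/* Stand-ins for &no_irq_chip, any other chip, and &xen_dynamic_chip. */
enum chip_kind { CHIP_NONE, CHIP_OTHER, CHIP_XEN_DYNAMIC };

/* Stand-ins for the irq_info[].type values relevant here. */
enum irq_type { IRQT_UNBOUND, IRQT_BOUND };

struct irq_slot {
	int has_desc;		/* 0 models irq_to_desc() returning NULL */
	enum chip_kind chip;
	enum irq_type type;
};

/*
 * Return the first irq that can be used for a new binding, or n when
 * every slot is taken (the kernel code panics in that case).
 */
static size_t find_unbound_slot(const struct irq_slot *slots, size_t n)
{
	size_t irq;

	for (irq = 0; irq < n; irq++) {
		if (!slots[irq].has_desc)
			break;		/* no descriptor yet: free */
		if (slots[irq].chip == CHIP_NONE)
			break;		/* descriptor without a chip: free */
		if (slots[irq].chip != CHIP_XEN_DYNAMIC)
			continue;	/* owned by e.g. the ioapic: skip */
		if (slots[irq].type == IRQT_UNBOUND)
			break;		/* Xen irq that is not bound */
	}
	return irq;
}
```

The important case is the third check: an irq owned by another chip is
skipped even if its irq_info type still reads as unbound.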


* [PATCH 05/12] unplug emulated disks and nics
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (3 preceding siblings ...)
  2010-05-18 10:22 ` [PATCH 04/12] Fix find_unbound_irq in presence of ioapic irqs Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 17:27   ` Jeremy Fitzhardinge
  2010-05-18 10:23 ` [PATCH 06/12] xen pci platform device driver Stefano Stabellini
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jeremy Fitzhardinge, xen-devel, Don Dutile, Stefano Stabellini,
	Sheng Yang

Add a xen_unplug command line option to the kernel to unplug
Xen emulated disks and NICs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/xen/enlighten.c             |   66 ++++++++++++++++++++++++++++++++++
 include/xen/hvm.h                    |    2 +
 include/xen/interface/platform_pci.h |   46 +++++++++++++++++++++++
 include/xen/platform_pci.h           |   32 ++++++++++++++++
 4 files changed, 146 insertions(+), 0 deletions(-)
 create mode 100644 include/xen/interface/platform_pci.h
 create mode 100644 include/xen/platform_pci.h

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 502c4f8..aac47b0 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -31,6 +31,7 @@
 #include <linux/gfp.h>
 
 #include <xen/xen.h>
+#include <xen/platform_pci.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/version.h>
 #include <xen/interface/physdev.h>
@@ -38,6 +39,7 @@
 #include <xen/interface/memory.h>
 #include <xen/interface/hvm/hvm_op.h>
 #include <xen/interface/hvm/params.h>
+#include <xen/interface/platform_pci.h>
 #include <xen/features.h>
 #include <xen/page.h>
 #include <xen/hvm.h>
@@ -83,6 +85,8 @@ struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 int xen_have_vector_callback;
+int xen_platform_pci;
+static int unplug;
 
 /*
  * Point at some empty memory to start with. We map the real shared_info
@@ -1309,6 +1313,39 @@ static void xen_callback_vector(void)
 	}
 }
 
+static int __init check_platform_magic(void)
+{
+	short magic;
+	char protocol;
+
+	magic = inw(XEN_IOPORT_MAGIC);
+	if (magic != XEN_IOPORT_MAGIC_VAL) {
+		printk(KERN_ERR "Xen Platform Pci: unrecognised magic value\n");
+		return -1;
+	}
+
+	protocol = inb(XEN_IOPORT_PROTOVER);
+
+	printk(KERN_DEBUG "Xen Platform Pci: I/O protocol version %d\n",
+			protocol);
+
+	switch (protocol) {
+	case 1:
+		outw(XEN_IOPORT_LINUX_PRODNUM, XEN_IOPORT_PRODNUM);
+		outl(XEN_IOPORT_LINUX_DRVVER, XEN_IOPORT_DRVVER);
+		if (inw(XEN_IOPORT_MAGIC) != XEN_IOPORT_MAGIC_VAL) {
+			printk(KERN_ERR "Xen Platform: blacklisted by host\n");
+			return -3;
+		}
+		break;
+	default:
+		printk(KERN_WARNING "Xen Platform Pci: unknown I/O protocol version\n");
+		return -2;
+	}
+
+	return 0;
+}
+
 void __init xen_guest_init(void)
 {
 	int r;
@@ -1325,6 +1362,35 @@ void __init xen_guest_init(void)
 
 	xen_callback_vector();
 
+	r = check_platform_magic();
+	if (!r || (r == -1 && (unplug & UNPLUG_IGNORE)))
+		xen_platform_pci = 1;
+	if (xen_platform_pci && !(unplug & UNPLUG_IGNORE))
+		outw(unplug, XEN_IOPORT_UNPLUG);
 	have_vcpu_info_placement = 0;
 	x86_init.irqs.intr_init = xen_init_IRQ;
 }
+
+static int __init parse_unplug(char *arg)
+{
+	char *p, *q;
+
+	for (p = arg; p; p = q) {
+		q = strchr(p, ',');
+		if (q)
+			*q++ = '\0';
+		if (!strcmp(p, "all"))
+			unplug |= UNPLUG_ALL;
+		else if (!strcmp(p, "ide-disks"))
+			unplug |= UNPLUG_ALL_IDE_DISKS;
+		else if (!strcmp(p, "aux-ide-disks"))
+			unplug |= UNPLUG_AUX_IDE_DISKS;
+		else if (!strcmp(p, "nics"))
+			unplug |= UNPLUG_ALL_NICS;
+		else
+			printk(KERN_WARNING "unrecognised option '%s' "
+				 "in parameter 'xen_unplug'\n", p);
+	}
+	return 0;
+}
+early_param("xen_unplug", parse_unplug);
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
index 5940ee5..777d2ce 100644
--- a/include/xen/hvm.h
+++ b/include/xen/hvm.h
@@ -30,4 +30,6 @@ extern int xen_have_vector_callback;
 #define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
                                HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
 
+extern int xen_platform_pci;
+
 #endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/platform_pci.h b/include/xen/interface/platform_pci.h
new file mode 100644
index 0000000..720eaf5
--- /dev/null
+++ b/include/xen/interface/platform_pci.h
@@ -0,0 +1,46 @@
+/******************************************************************************
+ * platform_pci.h
+ *
+ * I/O port interface of the Xen platform PCI device, used to identify
+ * the device and to unplug emulated disks and NICs.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_PLATFORM_PCI_H__
+#define __XEN_PUBLIC_PLATFORM_PCI_H__
+
+#define XEN_IOPORT_BASE 0x10
+
+#define XEN_IOPORT_PLATFLAGS	(XEN_IOPORT_BASE + 0) /* 1 byte access (R/W) */
+#define XEN_IOPORT_MAGIC	(XEN_IOPORT_BASE + 0) /* 2 byte access (R) */
+#define XEN_IOPORT_UNPLUG	(XEN_IOPORT_BASE + 0) /* 2 byte access (W) */
+#define XEN_IOPORT_DRVVER	(XEN_IOPORT_BASE + 0) /* 4 byte access (W) */
+
+#define XEN_IOPORT_SYSLOG	(XEN_IOPORT_BASE + 2) /* 1 byte access (W) */
+#define XEN_IOPORT_PROTOVER	(XEN_IOPORT_BASE + 2) /* 1 byte access (R) */
+#define XEN_IOPORT_PRODNUM	(XEN_IOPORT_BASE + 2) /* 2 byte access (W) */
+
+#define UNPLUG_ALL_IDE_DISKS 1
+#define UNPLUG_ALL_NICS 2
+#define UNPLUG_AUX_IDE_DISKS 4
+#define UNPLUG_ALL 7
+#define UNPLUG_IGNORE 8
+
+#endif /* __XEN_PUBLIC_PLATFORM_PCI_H__ */
diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
new file mode 100644
index 0000000..f39f4d3
--- /dev/null
+++ b/include/xen/platform_pci.h
@@ -0,0 +1,32 @@
+/******************************************************************************
+ * platform-pci.h
+ *
+ * Xen platform PCI device driver
+ * Copyright (c) 2004, Intel Corporation. <xiaofeng.ling@intel.com>
+ * Copyright (c) 2007, XenSource Inc.
+ * Copyright (c) 2010, Citrix
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef _XEN_PLATFORM_PCI_H
+#define _XEN_PLATFORM_PCI_H
+
+#include <linux/version.h>
+
+#define XEN_IOPORT_MAGIC_VAL 0x49d2
+#define XEN_IOPORT_LINUX_PRODNUM 0xffff
+#define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
+
+#endif /* _XEN_PLATFORM_PCI_H */
-- 
1.5.4.3

* [PATCH 06/12] xen pci platform device driver
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (4 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 05/12] unplug emulated disks and nics Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 18:11   ` Jeremy Fitzhardinge
  2010-05-18 10:23 ` [PATCH 07/12] Add suspend\resume support for PV on HVM guests Stefano Stabellini
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

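Add the Xen PCI platform device driver, which is responsible for
initializing the grant table and xenbus in PV on HVM mode.
A few changes to xenbus and the grant table are necessary to allow
their initialization to be delayed until the driver probes in HVM mode.
The grant table also needs a few additional modifications to work in
HVM mode: its frames are mapped with XENMEM_add_to_physmap into MMIO
space allocated from the platform device.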

When running on HVM, the event channel upcall is a normal Linux irq
handler, so it is never re-entered while it is in progress. Therefore
we cannot be sure that evtchn_upcall_pending is 0 when returning.
For this reason, if evtchn_upcall_pending has been set by Xen in the
meantime, we need to loop again over the pending event channels,
otherwise we might lose some event channel deliveries.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
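To illustrate the evtchn_upcall_pending reasoning in the description
above, here is a minimal user-space sketch. It is not the kernel code:
struct fake_vcpu, scan_events() and do_upcall() are hypothetical names,
the "late" field only simulates Xen raising new events right after a
scan, and the xed_nesting_count handling of the real loop is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vCPU state shared with Xen. */
struct fake_vcpu {
	bool upcall_pending;	/* plays the role of evtchn_upcall_pending */
	unsigned long pending;	/* one bit per simulated event channel */
	unsigned long late;	/* events "Xen" raises right after the first scan */
};

/* Handle every event that is pending right now; return how many. */
static unsigned int scan_events(struct fake_vcpu *v)
{
	unsigned int n = 0;

	while (v->pending) {
		v->pending &= v->pending - 1;	/* clear the lowest set bit */
		n++;
	}

	/* Simulate Xen raising events between the scan and the loop test. */
	if (v->late) {
		v->pending |= v->late;
		v->late = 0;
		v->upcall_pending = true;
	}

	return n;
}

/*
 * With recheck_pending == false the loop runs once, as it effectively did
 * before this patch on HVM. With recheck_pending == true it rescans for as
 * long as the pending flag was set again during the scan.
 */
unsigned int do_upcall(struct fake_vcpu *v, bool recheck_pending)
{
	unsigned int handled = 0;

	assert(v != NULL);

	do {
		v->upcall_pending = false;
		handled += scan_events(v);
	} while (recheck_pending && v->upcall_pending);

	return handled;
}
```

Without the re-check, the event raised during the scan stays pending
until some later interrupt arrives; with it, the same upcall picks it up.
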
 drivers/xen/Kconfig                 |    8 ++
 drivers/xen/Makefile                |    3 +-
 drivers/xen/events.c                |    5 +-
 drivers/xen/grant-table.c           |   70 +++++++++++--
 drivers/xen/platform-pci.c          |  198 +++++++++++++++++++++++++++++++++++
 drivers/xen/xenbus/xenbus_probe.c   |   20 +++-
 include/xen/grant_table.h           |    1 +
 include/xen/interface/grant_table.h |    1 +
 include/xen/platform_pci.h          |    9 ++
 include/xen/xenbus.h                |    1 +
 10 files changed, 300 insertions(+), 16 deletions(-)
 create mode 100644 drivers/xen/platform-pci.c
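
The delayed initialization mentioned in the description follows one
pattern for both the grant table and xenbus: the boot-time initcall
becomes a thin wrapper that does nothing on HVM, and the platform PCI
probe calls the real init function later. A rough user-space sketch of
that control flow, assuming hypothetical names (enum domain_type,
subsystem_init(), early_initcall_wrapper(), platform_probe()) that do
not exist in the kernel:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for xen_pv_domain() / xen_hvm_domain(). */
enum domain_type { NATIVE, PV_DOMAIN, HVM_DOMAIN };

static enum domain_type domain;
static bool subsystem_ready;

/* Plays the role of gnttab_init() or xenbus_probe_init(). */
int subsystem_init(void)
{
	assert(!subsystem_ready);	/* must run exactly once */
	subsystem_ready = true;
	return 0;
}

/* Plays the role of __gnttab_init() or __xenbus_probe_init(). */
int early_initcall_wrapper(void)
{
	/* On HVM, defer: the platform PCI probe initializes us later. */
	if (domain == HVM_DOMAIN)
		return 0;

	if (domain != PV_DOMAIN)
		return -ENODEV;

	return subsystem_init();
}

/* Plays the role of the platform PCI probe on HVM. */
int platform_probe(void)
{
	return subsystem_init();
}
```

On PV the wrapper initializes the subsystem at boot as before; on HVM
nothing happens until the platform device has been probed.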

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index fad3df2..da312e2 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -62,4 +62,12 @@ config XEN_SYS_HYPERVISOR
 	 virtual environment, /sys/hypervisor will still be present,
 	 but will have no xen contents.
 
+config XEN_PLATFORM_PCI
+	tristate "xen platform pci device driver"
+	depends on XEN
+	help
+	  Driver for the Xen PCI Platform device: it is responsible for
+	  initializing xenbus and grant_table when running in a Xen HVM
+	  domain. As a consequence this driver is required to run any Xen PV
+	  frontend on Xen HVM.
 endmenu
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 7c28434..e392fb7 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -9,4 +9,5 @@ obj-$(CONFIG_XEN_XENCOMM)	+= xencomm.o
 obj-$(CONFIG_XEN_BALLOON)	+= balloon.o
 obj-$(CONFIG_XEN_DEV_EVTCHN)	+= evtchn.o
 obj-$(CONFIG_XENFS)		+= xenfs/
-obj-$(CONFIG_XEN_SYS_HYPERVISOR)	+= sys-hypervisor.o
\ No newline at end of file
+obj-$(CONFIG_XEN_SYS_HYPERVISOR)	+= sys-hypervisor.o
+obj-$(CONFIG_XEN_PLATFORM_PCI)	+= platform-pci.o
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index a137a2f..cfc6d96 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -671,7 +671,7 @@ void __xen_evtchn_do_upcall(struct pt_regs *regs)
 
 		count = __get_cpu_var(xed_nesting_count);
 		__get_cpu_var(xed_nesting_count) = 0;
-	} while(count != 1);
+	} while(count != 1 || vcpu_info->evtchn_upcall_pending);
 
 out:
 
@@ -731,7 +731,8 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 	struct evtchn_bind_vcpu bind_vcpu;
 	int evtchn = evtchn_from_irq(irq);
 
-	if (!VALID_EVTCHN(evtchn))
+	if (!VALID_EVTCHN(evtchn) ||
+		(xen_hvm_domain() && !xen_have_vector_callback))
 		return -1;
 
 	/* Send future instances of this interrupt to other vcpu. */
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index f66db3b..6f5f3ba 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -37,11 +37,14 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/uaccess.h>
+#include <linux/io.h>
 
 #include <xen/xen.h>
 #include <xen/interface/xen.h>
 #include <xen/page.h>
 #include <xen/grant_table.h>
+#include <xen/platform_pci.h>
+#include <xen/interface/memory.h>
 #include <asm/xen/hypercall.h>
 
 #include <asm/pgtable.h>
@@ -59,6 +62,7 @@ static unsigned int boot_max_nr_grant_frames;
 static int gnttab_free_count;
 static grant_ref_t gnttab_free_head;
 static DEFINE_SPINLOCK(gnttab_list_lock);
+static unsigned long hvm_pv_resume_frames;
 
 static struct grant_entry *shared;
 
@@ -449,6 +453,30 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 	unsigned int nr_gframes = end_idx + 1;
 	int rc;
 
+	if (xen_hvm_domain()) {
+		struct xen_add_to_physmap xatp;
+		unsigned int i = end_idx;
+		rc = 0;
+		/*
+		 * Loop backwards, so that the first hypercall has the largest
+		 * index, ensuring that the table will grow only once.
+		 */
+		do {
+			xatp.domid = DOMID_SELF;
+			xatp.idx = i;
+			xatp.space = XENMAPSPACE_grant_table;
+			xatp.gpfn = (hvm_pv_resume_frames >> PAGE_SHIFT) + i;
+			rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
+			if (rc != 0) {
+				printk(KERN_WARNING
+						"grant table add_to_physmap failed, err=%d\n", rc);
+				break;
+			}
+		} while (i-- > start_idx);
+
+		return rc;
+	}
+
 	frames = kmalloc(nr_gframes * sizeof(unsigned long), GFP_ATOMIC);
 	if (!frames)
 		return -ENOMEM;
@@ -476,9 +504,28 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 
 int gnttab_resume(void)
 {
-	if (max_nr_grant_frames() < nr_grant_frames)
+	unsigned int max_nr_gframes;
+
+	max_nr_gframes = max_nr_grant_frames();
+	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
-	return gnttab_map(0, nr_grant_frames - 1);
+
+	if (xen_pv_domain())
+		return gnttab_map(0, nr_grant_frames - 1);
+
+	if (!hvm_pv_resume_frames) {
+		hvm_pv_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
+		shared = ioremap(hvm_pv_resume_frames, PAGE_SIZE * max_nr_gframes);
+		if (shared == NULL) {
+			printk(KERN_WARNING
+					"Fail to ioremap gnttab share frames\n");
+			return -ENOMEM;
+		}
+	}
+
+	gnttab_map(0, nr_grant_frames - 1);
+
+	return 0;
 }
 
 int gnttab_suspend(void)
@@ -505,15 +552,12 @@ static int gnttab_expand(unsigned int req_entries)
 	return rc;
 }
 
-static int __devinit gnttab_init(void)
+int gnttab_init(void)
 {
 	int i;
 	unsigned int max_nr_glist_frames, nr_glist_frames;
 	unsigned int nr_init_grefs;
 
-	if (!xen_domain())
-		return -ENODEV;
-
 	nr_grant_frames = 1;
 	boot_max_nr_grant_frames = __max_nr_grant_frames();
 
@@ -557,4 +601,16 @@ static int __devinit gnttab_init(void)
 	return -ENOMEM;
 }
 
-core_initcall(gnttab_init);
+static int __devinit __gnttab_init(void)
+{
+	/* Delay grant-table initialization in the PV on HVM case */
+	if (xen_hvm_domain())
+		return 0;
+
+	if (!xen_pv_domain())
+		return -ENODEV;
+
+	return gnttab_init();
+}
+
+core_initcall(__gnttab_init);
diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
new file mode 100644
index 0000000..7a8da66
--- /dev/null
+++ b/drivers/xen/platform-pci.c
@@ -0,0 +1,198 @@
+/******************************************************************************
+ * platform-pci.c
+ *
+ * Xen platform PCI device driver
+ * Copyright (c) 2005, Intel Corporation.
+ * Copyright (c) 2007, XenSource Inc.
+ * Copyright (c) 2010, Citrix
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <asm/io.h>
+
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include <xen/grant_table.h>
+#include <xen/platform_pci.h>
+#include <xen/interface/platform_pci.h>
+#include <xen/xenbus.h>
+#include <xen/events.h>
+#include <xen/hvm.h>
+
+#define DRV_NAME    "xen-platform-pci"
+
+MODULE_AUTHOR("ssmith@xensource.com and stefano.stabellini@eu.citrix.com");
+MODULE_DESCRIPTION("Xen platform PCI device");
+MODULE_LICENSE("GPL");
+
+static unsigned long platform_mmio;
+static unsigned long platform_mmio_alloc;
+static unsigned long platform_mmiolen;
+
+unsigned long alloc_xen_mmio(unsigned long len)
+{
+	unsigned long addr;
+
+	addr = platform_mmio + platform_mmio_alloc;
+	platform_mmio_alloc += len;
+	BUG_ON(platform_mmio_alloc > platform_mmiolen);
+
+	return addr;
+}
+
+static uint64_t get_callback_via(struct pci_dev *pdev)
+{
+	u8 pin;
+	int irq;
+
+	irq = pdev->irq;
+	if (irq < 16)
+		return irq; /* ISA IRQ */
+
+	pin = pdev->pin;
+
+	/* We don't know the GSI. Specify the PCI INTx line instead. */
+	return ((uint64_t)0x01 << 56) | /* PCI INTx identifier */
+		((uint64_t)pci_domain_nr(pdev->bus) << 32) |
+		((uint64_t)pdev->bus->number << 16) |
+		((uint64_t)(pdev->devfn & 0xff) << 8) |
+		((uint64_t)(pin - 1) & 3);
+}
+
+static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
+{
+	xen_hvm_evtchn_do_upcall(get_irq_regs());
+	return IRQ_HANDLED;
+}
+
+static int xen_allocate_irq(struct pci_dev *pdev)
+{
+	return request_irq(pdev->irq, do_hvm_evtchn_intr,
+			IRQF_DISABLED | IRQF_NOBALANCING | IRQF_TRIGGER_RISING,
+			"xen-platform-pci", pdev);
+}
+
+static int __devinit platform_pci_init(struct pci_dev *pdev,
+				       const struct pci_device_id *ent)
+{
+	int i, ret;
+	long ioaddr, iolen;
+	long mmio_addr, mmio_len;
+	uint64_t callback_via;
+
+	i = pci_enable_device(pdev);
+	if (i)
+		return i;
+
+	ioaddr = pci_resource_start(pdev, 0);
+	iolen = pci_resource_len(pdev, 0);
+
+	mmio_addr = pci_resource_start(pdev, 1);
+	mmio_len = pci_resource_len(pdev, 1);
+
+	if (mmio_addr == 0 || ioaddr == 0) {
+		dev_err(&pdev->dev, "no resources found\n");
+		ret = -ENOENT;
+	}
+
+	if (request_mem_region(mmio_addr, mmio_len, DRV_NAME) == NULL) {
+		dev_err(&pdev->dev, "MEM I/O resource 0x%lx @ 0x%lx busy\n",
+		       mmio_addr, mmio_len);
+		ret = -EBUSY;
+	}
+
+	if (request_region(ioaddr, iolen, DRV_NAME) == NULL) {
+		dev_err(&pdev->dev, "I/O resource 0x%lx @ 0x%lx busy\n",
+		       iolen, ioaddr);
+		ret = -EBUSY;
+		goto out;
+	}
+
+	platform_mmio = mmio_addr;
+	platform_mmiolen = mmio_len;
+
+	if (!xen_have_vector_callback) {
+		ret = xen_allocate_irq(pdev);
+		if (ret) {
+			printk(KERN_WARNING "request_irq failed err=%d\n", ret);
+			goto out;
+		}
+		callback_via = get_callback_via(pdev);
+		ret = xen_set_callback_via(callback_via);
+		if (ret) {
+			printk(KERN_WARNING
+					"Unable to set the evtchn callback err=%d\n", ret);
+			goto out;
+		}
+	}
+
+	alloc_xen_mmio_hook = alloc_xen_mmio;
+	platform_pci_resume_hook = platform_pci_resume;
+	platform_pci_disable_irq_hook = platform_pci_disable_irq;
+	platform_pci_enable_irq_hook = platform_pci_enable_irq;
+
+	ret = gnttab_init();
+	if (ret)
+		goto out;
+	ret = xenbus_probe_init();
+	if (ret)
+		goto out;
+
+out:
+	if (ret) {
+		release_mem_region(mmio_addr, mmio_len);
+		release_region(ioaddr, iolen);
+		pci_disable_device(pdev);
+	}
+
+	return ret;
+}
+
+#define XEN_PLATFORM_VENDOR_ID 0x5853
+#define XEN_PLATFORM_DEVICE_ID 0x0001
+static struct pci_device_id platform_pci_tbl[] __devinitdata = {
+	{XEN_PLATFORM_VENDOR_ID, XEN_PLATFORM_DEVICE_ID,
+	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
+	{0,}
+};
+
+MODULE_DEVICE_TABLE(pci, platform_pci_tbl);
+
+static struct pci_driver platform_driver = {
+	name:     DRV_NAME,
+	probe :    platform_pci_init,
+	id_table : platform_pci_tbl,
+};
+
+static int __init platform_pci_module_init(void)
+{
+	int rc;
+
+	if (!xen_platform_pci)
+		return -ENODEV;
+
+	rc = pci_register_driver(&platform_driver);
+	if (rc) {
+		printk(KERN_INFO DRV_NAME
+		       ": No platform pci device model found\n");
+		return rc;
+	}
+	return 0;
+}
+
+module_init(platform_pci_module_init);
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 0b05b62..dc6ed06 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -782,16 +782,24 @@ void xenbus_probe(struct work_struct *unused)
 	blocking_notifier_call_chain(&xenstore_chain, 0, NULL);
 }
 
-static int __init xenbus_probe_init(void)
+static int __init __xenbus_probe_init(void)
+{
+	/* Delay initialization in the PV on HVM case */
+	if (xen_hvm_domain())
+		return 0;
+
+	if (!xen_pv_domain())
+		return -ENODEV;
+
+	return xenbus_probe_init();
+}
+
+int xenbus_probe_init(void)
 {
 	int err = 0;
 
 	DPRINTK("");
 
-	err = -ENODEV;
-	if (!xen_domain())
-		goto out_error;
-
 	/* Register ourselves with the kernel bus subsystem */
 	err = bus_register(&xenbus_frontend.bus);
 	if (err)
@@ -857,7 +865,7 @@ static int __init xenbus_probe_init(void)
 	return err;
 }
 
-postcore_initcall(xenbus_probe_init);
+postcore_initcall(__xenbus_probe_init);
 
 MODULE_LICENSE("GPL");
 
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index a40f1cd..811cda5 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -51,6 +51,7 @@ struct gnttab_free_callback {
 	u16 count;
 };
 
+int gnttab_init(void);
 int gnttab_suspend(void);
 int gnttab_resume(void);
 
diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
index 39da93c..39e5717 100644
--- a/include/xen/interface/grant_table.h
+++ b/include/xen/interface/grant_table.h
@@ -28,6 +28,7 @@
 #ifndef __XEN_PUBLIC_GRANT_TABLE_H__
 #define __XEN_PUBLIC_GRANT_TABLE_H__
 
+#include <xen/interface/xen.h>
 
 /***********************************
  * GRANT TABLE REPRESENTATION
diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
index f39f4d3..59a120c 100644
--- a/include/xen/platform_pci.h
+++ b/include/xen/platform_pci.h
@@ -29,4 +29,13 @@
 #define XEN_IOPORT_LINUX_PRODNUM 0xffff
 #define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
 
+#ifdef CONFIG_XEN_PLATFORM_PCI
+unsigned long alloc_xen_mmio(unsigned long len);
+#else
+static inline unsigned long alloc_xen_mmio(unsigned long len)
+{
+	return ~0UL;
+}
+#endif
+
 #endif /* _XEN_PLATFORM_PCI_H */
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 43e2d7d..ffa97de 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -174,6 +174,7 @@ void unregister_xenbus_watch(struct xenbus_watch *watch);
 void xs_suspend(void);
 void xs_resume(void);
 void xs_suspend_cancel(void);
+int xenbus_probe_init(void);
 
 /* Used by xenbus_dev to borrow kernel's store connection. */
 void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg);
-- 
1.5.4.3

* [PATCH 07/12] Add suspend\resume support for PV on HVM guests.
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (5 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 06/12] xen pci platform device driver Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 18:11   ` Jeremy Fitzhardinge
  2010-05-18 10:23 ` [PATCH 08/12] Allow xen platform pci device to be compiled as a module Stefano Stabellini
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/xen/enlighten.c          |    9 ++--
 arch/x86/xen/suspend.c            |    6 ++
 arch/x86/xen/xen-ops.h            |    3 +
 drivers/xen/manage.c              |   95 +++++++++++++++++++++++++++++++++++--
 drivers/xen/platform-pci.c        |   29 +++++++++++-
 drivers/xen/xenbus/xenbus_probe.c |   28 +++++++++++
 include/xen/platform_pci.h        |    6 ++
 include/xen/xen-ops.h             |    3 +
 8 files changed, 170 insertions(+), 9 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index aac47b0..23b8200 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1268,12 +1268,13 @@ static int init_hvm_pv_info(int *major, int *minor)
 	return 0;
 }
 
-static void __init init_shared_info(void)
+void init_shared_info(void)
 {
 	struct xen_add_to_physmap xatp;
-	struct shared_info *shared_info_page;
+	static struct shared_info *shared_info_page = 0;
 
-	shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
+	if (!shared_info_page)
+		shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
 	xatp.domid = DOMID_SELF;
 	xatp.idx = 0;
 	xatp.space = XENMAPSPACE_shared_info;
@@ -1302,7 +1303,7 @@ void do_hvm_pv_evtchn_intr(void)
 	xen_hvm_evtchn_do_upcall(get_irq_regs());
 }
 
-static void xen_callback_vector(void)
+void xen_callback_vector(void)
 {
 	uint64_t callback_via;
 	if (xen_feature(XENFEAT_hvm_callback_vector)) {
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 987267f..86f3b45 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -26,6 +26,12 @@ void xen_pre_suspend(void)
 		BUG();
 }
 
+void xen_hvm_post_suspend(int suspend_cancelled)
+{
+		init_shared_info();
+		xen_callback_vector();
+}
+
 void xen_post_suspend(int suspend_cancelled)
 {
 	xen_build_mfn_list_list();
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index f9153a3..caf89ee 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -38,6 +38,9 @@ void xen_enable_sysenter(void);
 void xen_enable_syscall(void);
 void xen_vcpu_restore(void);
 
+void xen_callback_vector(void);
+void init_shared_info(void);
+
 void __init xen_build_dynamic_phys_to_machine(void);
 
 void xen_init_irq_ops(void);
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 2ac4440..a73edd8 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -8,15 +8,20 @@
 #include <linux/sysrq.h>
 #include <linux/stop_machine.h>
 #include <linux/freezer.h>
+#include <linux/pci.h>
+#include <linux/cpumask.h>
 
+#include <xen/xen.h>
 #include <xen/xenbus.h>
 #include <xen/grant_table.h>
 #include <xen/events.h>
 #include <xen/hvc-console.h>
 #include <xen/xen-ops.h>
+#include <xen/platform_pci.h>
 
 #include <asm/xen/hypercall.h>
 #include <asm/xen/page.h>
+#include <asm/xen/hypervisor.h>
 
 enum shutdown_state {
 	SHUTDOWN_INVALID = -1,
@@ -33,10 +38,30 @@ enum shutdown_state {
 static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
 
 #ifdef CONFIG_PM_SLEEP
-static int xen_suspend(void *data)
+static int xen_hvm_suspend(void *data)
 {
+	struct sched_shutdown r = { .reason = SHUTDOWN_suspend };
 	int *cancelled = data;
+
+	BUG_ON(!irqs_disabled());
+
+	*cancelled = HYPERVISOR_sched_op(SCHEDOP_shutdown, &r);
+
+	xen_hvm_post_suspend(*cancelled);
+	gnttab_resume();
+
+	if (!*cancelled) {
+		xen_irq_resume();
+		platform_pci_resume();
+	}
+
+	return 0;
+}
+
+static int xen_suspend(void *data)
+{
 	int err;
+	int *cancelled = data;
 
 	BUG_ON(!irqs_disabled());
 
@@ -73,6 +98,53 @@ static int xen_suspend(void *data)
 	return 0;
 }
 
+static void do_hvm_suspend(void)
+{
+	int err;
+	int cancelled = 1;
+
+	shutting_down = SHUTDOWN_SUSPEND;
+
+#ifdef CONFIG_PREEMPT
+	/* If the kernel is preemptible, we need to freeze all the processes
+	   to prevent them from being in the middle of a pagetable update
+	   during suspend. */
+	err = freeze_processes();
+	if (err) {
+		printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
+		goto out;
+	}
+#endif
+
+	printk(KERN_DEBUG "suspending xenstore... ");
+	xenbus_suspend();
+	printk(KERN_DEBUG "xenstore suspended\n");
+	platform_pci_disable_irq();
+	
+	err = stop_machine(xen_hvm_suspend, &cancelled, cpumask_of(0));
+	if (err) {
+		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
+		cancelled = 1;
+	}
+
+	platform_pci_enable_irq();
+
+	if (!cancelled) {
+		xen_arch_resume();
+		xenbus_resume();
+	} else
+		xs_suspend_cancel();
+
+	/* Make sure timer events get retriggered on all CPUs */
+	clock_was_set();
+
+#ifdef CONFIG_PREEMPT
+	thaw_processes();
+out:
+#endif
+	shutting_down = SHUTDOWN_INVALID;
+}
+
 static void do_suspend(void)
 {
 	int err;
@@ -185,7 +257,10 @@ static void shutdown_handler(struct xenbus_watch *watch,
 		ctrl_alt_del();
 #ifdef CONFIG_PM_SLEEP
 	} else if (strcmp(str, "suspend") == 0) {
-		do_suspend();
+		if (xen_hvm_domain())
+			do_hvm_suspend();
+		else
+			do_suspend();
 #endif
 	} else {
 		printk(KERN_INFO "Ignoring shutdown request: %s\n", str);
@@ -261,7 +336,19 @@ static int shutdown_event(struct notifier_block *notifier,
 	return NOTIFY_DONE;
 }
 
-static int __init setup_shutdown_event(void)
+static int __init __setup_shutdown_event(void)
+{
+	/* Delay initialization in the PV on HVM case */
+	if (xen_hvm_domain())
+		return 0;
+
+	if (!xen_pv_domain())
+		return -ENODEV;
+
+	return xen_setup_shutdown_event();
+}
+
+int xen_setup_shutdown_event(void)
 {
 	static struct notifier_block xenstore_notifier = {
 		.notifier_call = shutdown_event
@@ -271,4 +358,4 @@ static int __init setup_shutdown_event(void)
 	return 0;
 }
 
-subsys_initcall(setup_shutdown_event);
+subsys_initcall(__setup_shutdown_event);
diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
index 7a8da66..b15f809 100644
--- a/drivers/xen/platform-pci.c
+++ b/drivers/xen/platform-pci.c
@@ -33,6 +33,7 @@
 #include <xen/xenbus.h>
 #include <xen/events.h>
 #include <xen/hvm.h>
+#include <xen/xen-ops.h>
 
 #define DRV_NAME    "xen-platform-pci"
 
@@ -43,6 +44,8 @@ MODULE_LICENSE("GPL");
 static unsigned long platform_mmio;
 static unsigned long platform_mmio_alloc;
 static unsigned long platform_mmiolen;
+static uint64_t callback_via;
+struct pci_dev *xen_platform_pdev;
 
 unsigned long alloc_xen_mmio(unsigned long len)
 {
@@ -87,13 +90,33 @@ static int xen_allocate_irq(struct pci_dev *pdev)
 			"xen-platform-pci", pdev);
 }
 
+void platform_pci_disable_irq(void)
+{
+	printk(KERN_DEBUG "platform_pci_disable_irq\n");
+	disable_irq(xen_platform_pdev->irq);
+}
+
+void platform_pci_enable_irq(void)
+{
+	printk(KERN_DEBUG "platform_pci_enable_irq\n");
+	enable_irq(xen_platform_pdev->irq);
+}
+
+void platform_pci_resume(void)
+{
+	if (!xen_have_vector_callback && xen_set_callback_via(callback_via)) {
+		printk(KERN_WARNING "platform_pci_resume failure!\n");
+		return;
+	}
+}
+
 static int __devinit platform_pci_init(struct pci_dev *pdev,
 				       const struct pci_device_id *ent)
 {
 	int i, ret;
 	long ioaddr, iolen;
 	long mmio_addr, mmio_len;
-	uint64_t callback_via;
+	xen_platform_pdev = pdev;
 
 	i = pci_enable_device(pdev);
 	if (i)
@@ -152,6 +175,10 @@ static int __devinit platform_pci_init(struct pci_dev *pdev,
 	ret = xenbus_probe_init();
 	if (ret)
 		goto out;
+	ret = xen_setup_shutdown_event();
+	if (ret)
+		goto out;
+
 
 out:
 	if (ret) {
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index dc6ed06..a679205 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -746,6 +746,34 @@ static int xenbus_dev_resume(struct device *dev)
 	return 0;
 }
 
+static int dev_suspend(struct device *dev, void *data)
+{
+	return xenbus_dev_suspend(dev, PMSG_SUSPEND);
+}
+
+void xenbus_suspend(void)
+{
+	DPRINTK("");
+
+	bus_for_each_dev(&xenbus_frontend.bus, NULL, NULL, dev_suspend);
+	xs_suspend();
+}
+EXPORT_SYMBOL_GPL(xenbus_suspend);
+
+static int dev_resume(struct device *dev, void *data)
+{
+	return xenbus_dev_resume(dev);
+}
+
+void xenbus_resume(void)
+{
+	DPRINTK("");
+
+	xs_resume();
+	bus_for_each_dev(&xenbus_frontend.bus, NULL, NULL, dev_resume);
+}
+EXPORT_SYMBOL_GPL(xenbus_resume);
+
 /* A flag to determine if xenstored is 'ready' (i.e. has started) */
 int xenstored_ready = 0;
 
diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
index 59a120c..ced434d 100644
--- a/include/xen/platform_pci.h
+++ b/include/xen/platform_pci.h
@@ -31,11 +31,17 @@
 
 #ifdef CONFIG_XEN_PLATFORM_PCI
 unsigned long alloc_xen_mmio(unsigned long len);
+void platform_pci_resume(void);
+void platform_pci_disable_irq(void);
+void platform_pci_enable_irq(void);
 #else
 static inline unsigned long alloc_xen_mmio(unsigned long len)
 {
 	return ~0UL;
 }
+static inline void platform_pci_resume(void) {}
+static inline void platform_pci_disable_irq(void) {}
+static inline void platform_pci_enable_irq(void) {}
 #endif
 
 #endif /* _XEN_PLATFORM_PCI_H */
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 883a21b..46bc81e 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -7,6 +7,7 @@ DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 
 void xen_pre_suspend(void);
 void xen_post_suspend(int suspend_cancelled);
+void xen_hvm_post_suspend(int suspend_cancelled);
 
 void xen_mm_pin_all(void);
 void xen_mm_unpin_all(void);
@@ -14,4 +15,6 @@ void xen_mm_unpin_all(void);
 void xen_timer_resume(void);
 void xen_arch_resume(void);
 
+int xen_setup_shutdown_event(void);
+
 #endif /* INCLUDE_XEN_OPS_H */
-- 
1.5.4.3

* [PATCH 08/12] Allow xen platform pci device to be compiled as a module
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (6 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 07/12] Add suspend\resume support for PV on HVM guests Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 18:15   ` Jeremy Fitzhardinge
  2010-05-18 10:23 ` [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC Stefano Stabellini
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/xen/enlighten.c          |    3 +++
 drivers/xen/events.c              |    1 +
 drivers/xen/grant-table.c         |    6 +++++-
 drivers/xen/manage.c              |   14 +++++++++++---
 drivers/xen/xenbus/xenbus_probe.c |    1 +
 include/xen/platform_pci.h        |   18 ++++--------------
 6 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 23b8200..77ba321 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -85,7 +85,9 @@ struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 int xen_have_vector_callback;
+EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 int xen_platform_pci;
+EXPORT_SYMBOL_GPL(xen_platform_pci);
 static int unplug;
 
 /*
@@ -1297,6 +1299,7 @@ int xen_set_callback_via(uint64_t via)
 	a.value = via;
 	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
 }
+EXPORT_SYMBOL_GPL(xen_set_callback_via);
 
 void do_hvm_pv_evtchn_intr(void)
 {
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index cfc6d96..4840a03 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -695,6 +695,7 @@ void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
 {
 	__xen_evtchn_do_upcall(regs);
 }
+EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall);
 
 /* Rebind a new event channel to an existing irq. */
 void rebind_evtchn_irq(int evtchn, int irq)
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 6f5f3ba..f936d30 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -56,6 +56,9 @@
 #define GNTTAB_LIST_END 0xffffffff
 #define GREFS_PER_GRANT_FRAME (PAGE_SIZE / sizeof(struct grant_entry))
 
+unsigned long (*alloc_xen_mmio_hook)(unsigned long len);
+EXPORT_SYMBOL_GPL(alloc_xen_mmio_hook);
+
 static grant_ref_t **gnttab_list;
 static unsigned int nr_grant_frames;
 static unsigned int boot_max_nr_grant_frames;
@@ -514,7 +517,7 @@ int gnttab_resume(void)
 		return gnttab_map(0, nr_grant_frames - 1);
 
 	if (!hvm_pv_resume_frames) {
-		hvm_pv_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
+		hvm_pv_resume_frames = alloc_xen_mmio_hook(PAGE_SIZE * max_nr_gframes);
 		shared = ioremap(hvm_pv_resume_frames, PAGE_SIZE * max_nr_gframes);
 		if (shared == NULL) {
 			printk(KERN_WARNING
@@ -600,6 +603,7 @@ int gnttab_init(void)
 	kfree(gnttab_list);
 	return -ENOMEM;
 }
+EXPORT_SYMBOL_GPL(gnttab_init);
 
 static int __devinit __gnttab_init(void)
 {
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index a73edd8..49ee52d 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -34,6 +34,13 @@ enum shutdown_state {
 	 SHUTDOWN_HALT = 4,
 };
 
+void (*platform_pci_resume_hook)(void);
+EXPORT_SYMBOL_GPL(platform_pci_resume_hook);
+void (*platform_pci_disable_irq_hook)(void);
+EXPORT_SYMBOL_GPL(platform_pci_disable_irq_hook);
+void (*platform_pci_enable_irq_hook)(void);
+EXPORT_SYMBOL_GPL(platform_pci_enable_irq_hook);
+
 /* Ignore multiple shutdown requests. */
 static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
 
@@ -52,7 +59,7 @@ static int xen_hvm_suspend(void *data)
 
 	if (!*cancelled) {
 		xen_irq_resume();
-		platform_pci_resume();
+		platform_pci_resume_hook();
 	}
 
 	return 0;
@@ -119,7 +126,7 @@ static void do_hvm_suspend(void)
 	printk(KERN_DEBUG "suspending xenstore... ");
 	xenbus_suspend();
 	printk(KERN_DEBUG "xenstore suspended\n");
-	platform_pci_disable_irq();
+	platform_pci_disable_irq_hook();
 	
 	err = stop_machine(xen_hvm_suspend, &cancelled, cpumask_of(0));
 	if (err) {
@@ -127,7 +134,7 @@ static void do_hvm_suspend(void)
 		cancelled = 1;
 	}
 
-	platform_pci_enable_irq();
+	platform_pci_enable_irq_hook();
 
 	if (!cancelled) {
 		xen_arch_resume();
@@ -357,5 +364,6 @@ int xen_setup_shutdown_event(void)
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(xen_setup_shutdown_event);
 
 subsys_initcall(__setup_shutdown_event);
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index a679205..f83e083 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -892,6 +892,7 @@ int xenbus_probe_init(void)
   out_error:
 	return err;
 }
+EXPORT_SYMBOL_GPL(xenbus_probe_init);
 
 postcore_initcall(__xenbus_probe_init);
 
diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
index ced434d..c3c2527 100644
--- a/include/xen/platform_pci.h
+++ b/include/xen/platform_pci.h
@@ -29,19 +29,9 @@
 #define XEN_IOPORT_LINUX_PRODNUM 0xffff
 #define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
 
-#ifdef CONFIG_XEN_PLATFORM_PCI
-unsigned long alloc_xen_mmio(unsigned long len);
-void platform_pci_resume(void);
-void platform_pci_disable_irq(void);
-void platform_pci_enable_irq(void);
-#else
-static inline unsigned long alloc_xen_mmio(unsigned long len)
-{
-	return ~0UL;
-}
-static inline void platform_pci_resume(void) {}
-static inline void platform_pci_disable_irq(void) {}
-static inline void platform_pci_enable_irq(void) {}
-#endif
+extern unsigned long (*alloc_xen_mmio_hook)(unsigned long len);
+extern void (*platform_pci_resume_hook)(void);
+extern void (*platform_pci_disable_irq_hook)(void);
+extern void (*platform_pci_enable_irq_hook)(void);
 
 #endif /* _XEN_PLATFORM_PCI_H */
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (7 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 08/12] Allow xen platform pci device to be compiled as a module Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 18:15   ` Jeremy Fitzhardinge
  2010-05-18 10:23 ` [PATCH 10/12] __setup_vector_irq: handle NULL chip_data Stefano Stabellini
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

Make sure chip_data is not NULL before accessing it.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/kernel/apic/io_apic.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index eb2789c..c64499c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1732,6 +1732,8 @@ __apicdebuginit(void) print_IO_APIC(void)
 		struct irq_pin_list *entry;
 
 		cfg = desc->chip_data;
+		if (!cfg)
+			continue;
 		entry = cfg->irq_2_pin;
 		if (!entry)
 			continue;
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/12] __setup_vector_irq: handle NULL chip_data
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (8 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 10:23 ` [PATCH 11/12] Support VIRQ_TIMER and pvclock on HVM Stefano Stabellini
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/kernel/apic/io_apic.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index c64499c..4d3d391 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1269,6 +1269,8 @@ void __setup_vector_irq(int cpu)
 	/* Mark the inuse vectors */
 	for_each_irq_desc(irq, desc) {
 		cfg = desc->chip_data;
+		if (!cfg)
+			continue;
 
 		/*
 		 * If it is a legacy IRQ handled by the legacy PIC, this cpu
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/12] Support VIRQ_TIMER and pvclock on HVM
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (9 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 10/12] __setup_vector_irq: handle NULL chip_data Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 18:23   ` Jeremy Fitzhardinge
  2010-05-18 10:23 ` [PATCH 12/12] Initialize xenbus device structs with ENODEV as default state Stefano Stabellini
  2010-05-18 10:55 ` [PATCH 0 of 12] PV on HVM Xen Christian Tramnitz
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/x86/xen/enlighten.c         |   39 +++++++++++++++++++++++++++++++++++++-
 arch/x86/xen/time.c              |    3 ++
 drivers/xen/manage.c             |    1 +
 include/xen/interface/features.h |    3 ++
 4 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 77ba321..41677fe 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1274,6 +1274,7 @@ void init_shared_info(void)
 {
 	struct xen_add_to_physmap xatp;
 	static struct shared_info *shared_info_page = 0;
+	int cpu;
 
 	if (!shared_info_page)
 		shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
@@ -1288,7 +1289,42 @@ void init_shared_info(void)
 
 	/* Don't do the full vcpu_info placement stuff until we have a
 	   possible map and a non-dummy shared_info. */
-	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
+	/* This code is run at resume time so make sure all the online cpus
+	 * have xen_vcpu properly set */
+	for_each_online_cpu(cpu)
+		per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
+}
+
+static void xen_hvm_setup_cpu_clockevents(void)
+{
+	int cpu = smp_processor_id();
+	xen_setup_timer(cpu);
+	per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
+	xen_setup_cpu_clockevents();
+}
+
+static void init_hvm_time(void)
+{
+#ifdef CONFIG_SMP
+	/* vector callback is needed otherwise we cannot receive interrupts
+	 * on cpu > 0 */
+	if (!xen_have_vector_callback)
+		return;
+#endif
+	if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
+		printk(KERN_WARNING "Xen doesn't support pvclock on HVM, "
+				"disable pv timer\n");
+		return;
+	}
+
+	pv_time_ops = xen_time_ops;
+	x86_init.timers.timer_init = xen_time_init;
+	x86_init.timers.setup_percpu_clockev = x86_init_noop;
+	x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
+
+	x86_platform.calibrate_tsc = xen_tsc_khz;
+	x86_platform.get_wallclock = xen_get_wallclock;
+	x86_platform.set_wallclock = xen_set_wallclock;
 }
 
 int xen_set_callback_via(uint64_t via)
@@ -1373,6 +1409,7 @@ void __init xen_guest_init(void)
 		outw(unplug, XEN_IOPORT_UNPLUG);
 	have_vcpu_info_placement = 0;
 	x86_init.irqs.intr_init = xen_init_IRQ;
+	init_hvm_time();
 }
 
 static int __init parse_unplug(char *arg)
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 32764b8..620e68f 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -19,6 +19,7 @@
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/hypercall.h>
 
+#include <xen/xen.h>
 #include <xen/events.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/vcpu.h>
@@ -470,6 +471,8 @@ void xen_timer_resume(void)
 	for_each_online_cpu(cpu) {
 		if (HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL))
 			BUG();
+		if (xen_hvm_domain())
+			xen_setup_runstate_info(cpu);
 	}
 }
 
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 49ee52d..4a8af22 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -60,6 +60,7 @@ static int xen_hvm_suspend(void *data)
 	if (!*cancelled) {
 		xen_irq_resume();
 		platform_pci_resume_hook();
+		xen_timer_resume();
 	}
 
 	return 0;
diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
index 8ab08b9..70d2563 100644
--- a/include/xen/interface/features.h
+++ b/include/xen/interface/features.h
@@ -44,6 +44,9 @@
 /* x86: Does this Xen host support the HVM callback vector type? */
 #define XENFEAT_hvm_callback_vector        8
 
+/* x86: pvclock algorithm is safe to use on HVM */
+#define XENFEAT_hvm_safe_pvclock           9
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/12] Initialize xenbus device structs with ENODEV as default state
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (10 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 11/12] Support VIRQ_TIMER and pvclock on HVM Stefano Stabellini
@ 2010-05-18 10:23 ` Stefano Stabellini
  2010-05-18 18:28   ` Jeremy Fitzhardinge
  2010-05-18 10:55 ` [PATCH 0 of 12] PV on HVM Xen Christian Tramnitz
  12 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-18 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: xen-devel, Jeremy Fitzhardinge, Don Dutile, Sheng Yang,
	Stefano Stabellini

From: Don Dutile <ddutile@redhat.com>

This way, if xenbus isn't configured in a fully virtualized (FV) Xen guest,
loading PV drivers (like netfront) won't crash the guest.
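
A minimal userspace sketch of the guard pattern described above, not the
kernel code itself: the bus starts out with an error of -ENODEV, the init
path overwrites it with the registration result, and every entry point
checks it before touching the bus. The names demo_bus, demo_bus_init and
demo_register_driver are hypothetical stand-ins for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xen_bus_type; only the relevant field. */
struct demo_bus {
	const char *name;
	int error;	/* 0 once the bus is registered, -ENODEV until then */
};

/* Starts out unusable, in the spirit of ".error = -ENODEV". */
static struct demo_bus demo_frontend = {
	.name = "xen",
	.error = -ENODEV,
};

/* Hypothetical init path: records the registration result in bus->error. */
static int demo_bus_init(struct demo_bus *bus, int register_result)
{
	bus->error = register_result;
	return bus->error;
}

/* Hypothetical driver registration: bail out before touching the bus. */
static int demo_register_driver(struct demo_bus *bus, const char *drv_name)
{
	if (bus->error)
		return bus->error;

	(void)drv_name;	/* real code would attach the driver to the bus here */
	return 0;
}
```

With this shape, a driver loaded before (or without) a successful bus
registration gets -ENODEV back instead of dereferencing uninitialized state.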

Signed-off-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 drivers/xen/xenbus/xenbus_probe.c |   29 +++++++++++++++++++++++++----
 drivers/xen/xenbus/xenbus_probe.h |    1 +
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index f83e083..5e8dae6 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -188,6 +188,11 @@ static struct xen_bus_type xenbus_frontend = {
 	.levels = 2, 		/* device/type/<id> */
 	.get_bus_id = frontend_bus_id,
 	.probe = xenbus_probe_frontend,
+	/* 
+	 * to ensure loading pv-on-hvm drivers on FV guest
+	 * doesn't blow up trying to use uninit'd xenbus.
+	 */
+	.error = -ENODEV,
 	.bus = {
 		.name      = "xen",
 		.match     = xenbus_match,
@@ -352,6 +357,9 @@ int xenbus_register_driver_common(struct xenbus_driver *drv,
 				  struct module *owner,
 				  const char *mod_name)
 {
+	if (bus->error)
+		return bus->error;
+
 	drv->driver.name = drv->name;
 	drv->driver.bus = &bus->bus;
 	drv->driver.owner = owner;
@@ -484,8 +492,12 @@ int xenbus_probe_node(struct xen_bus_type *bus,
 	struct xenbus_device *xendev;
 	size_t stringlen;
 	char *tmpstring;
+	enum xenbus_state state;
+
+	if (bus->error)
+		return bus->error;
 
-	enum xenbus_state state = xenbus_read_driver_state(nodename);
+	state = xenbus_read_driver_state(nodename);
 
 	if (state != XenbusStateInitialising) {
 		/* Device is not new, so ignore it.  This can happen if a
@@ -593,6 +605,9 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
 	char **dir;
 	unsigned int i, dir_n;
 
+	if (bus->error)
+		return bus->error;
+
 	dir = xenbus_directory(XBT_NIL, bus->root, "", &dir_n);
 	if (IS_ERR(dir))
 		return PTR_ERR(dir);
@@ -636,7 +651,7 @@ void xenbus_dev_changed(const char *node, struct xen_bus_type *bus)
 	char type[XEN_BUS_ID_SIZE];
 	const char *p, *root;
 
-	if (char_count(node, '/') < 2)
+	if (bus->error || char_count(node, '/') < 2)
 		return;
 
 	exists = xenbus_exists(XBT_NIL, node, "");
@@ -829,8 +844,8 @@ int xenbus_probe_init(void)
 	DPRINTK("");
 
 	/* Register ourselves with the kernel bus subsystem */
-	err = bus_register(&xenbus_frontend.bus);
-	if (err)
+	xenbus_frontend.error = bus_register(&xenbus_frontend.bus);
+	if (xenbus_frontend.error)
 		goto out_error;
 
 	err = xenbus_backend_bus_register();
@@ -923,6 +938,9 @@ static int is_device_connecting(struct device *dev, void *data)
 
 static int exists_connecting_device(struct device_driver *drv)
 {
+	if (xenbus_frontend.error)
+		return xenbus_frontend.error;
+
 	return bus_for_each_dev(&xenbus_frontend.bus, NULL, drv,
 				is_device_connecting);
 }
@@ -1002,6 +1020,9 @@ static void wait_for_devices(struct xenbus_driver *xendrv)
 #ifndef MODULE
 static int __init boot_wait_for_devices(void)
 {
+	if (xenbus_frontend.error)
+		return xenbus_frontend.error;
+
 	ready_to_wait_for_devices = 1;
 	wait_for_devices(NULL);
 	return 0;
diff --git a/drivers/xen/xenbus/xenbus_probe.h b/drivers/xen/xenbus/xenbus_probe.h
index 6c5e318..15febe4 100644
--- a/drivers/xen/xenbus/xenbus_probe.h
+++ b/drivers/xen/xenbus/xenbus_probe.h
@@ -53,6 +53,7 @@ static inline void xenbus_backend_bus_unregister(void) {}
 struct xen_bus_type
 {
 	char *root;
+	int error;
 	unsigned int levels;
 	int (*get_bus_id)(char bus_id[XEN_BUS_ID_SIZE], const char *nodename);
 	int (*probe)(const char *type, const char *dir);
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
                   ` (11 preceding siblings ...)
  2010-05-18 10:23 ` [PATCH 12/12] Initialize xenbus device structs with ENODEV as default state Stefano Stabellini
@ 2010-05-18 10:55 ` Christian Tramnitz
  2010-05-18 18:41   ` Jeremy Fitzhardinge
  2010-05-24 17:28   ` Stefano Stabellini
  12 siblings, 2 replies; 46+ messages in thread
From: Christian Tramnitz @ 2010-05-18 10:55 UTC (permalink / raw)
  To: xen-devel

Hi Stefano,

what are the particular advantages of running PVonHVM vs. traditional PV 
(vs pure HVM)?
I'd like to update the wiki with some info about it...



Thanks,
    Christian

On 18.05.2010 12:22, Stefano Stabellini wrote:
> Hi all,
> this is the fixed, updated and rebased version of the PV on HVM series:
> the series is based on 2.6.34 now and supports Xen PV frontends running
> in a HVM domain, including netfront, blkfront and the VIRQ_TIMER.
>
> The list of bugs fixed in this update includes: xenbus drivers crashes
> when xenbus is not properly initialized, a memory corruption bug in
> suspend/resume and testing for the xen platform pci version and protocol
> has been moved to enlighten.c (before unplugging emulated devices).
>
> In order to be able to use VIRQ_TIMER and to improve performances you
> need a patch to Xen to implement the vector callback mechanism
> for event channel delivery.
>
> A git tree is also available here:
>
> git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git
>
> branch name 2.6.34-pvhvm.
>
> Cheers,
>
> Stefano

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 10:22 ` [PATCH 03/12] evtchn delivery " Stefano Stabellini
@ 2010-05-18 17:17   ` Jeremy Fitzhardinge
  2010-05-19 12:24     ` Stefano Stabellini
  2010-05-18 17:43   ` Jeremy Fitzhardinge
  2010-05-18 18:10   ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 17:17 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Don Dutile, linux-kernel, Sheng Yang

On 05/18/2010 03:22 AM, Stefano Stabellini wrote:
> From: Sheng Yang <sheng@linux.intel.com>
>
> Set the callback to receive evtchns from Xen, using the
> callback vector delivery mechanism.
>   

Could you expand on this a little?  Like, why is this desirable?  What
functional difference does it make?  Is this patch useful in its own
right, or is it just laying the groundwork for something else?

Thanks,
    J
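
For readers following along, the value handed to xen_set_callback_via() in
the quoted patch can be looked at in isolation. The sketch below reuses the
two HVM_CALLBACK_VIA_* constants from the patch and packs a vector number
into the low bits with the delivery type in the top byte. The helper names
(demo_callback_via, demo_via_type, demo_via_vector) are hypothetical, and
the decoding helpers are an assumption about how the fields would be read
back, not something taken from Xen.

```c
#include <stdint.h>

#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
#define HVM_CALLBACK_VIA_TYPE_SHIFT  56

/* Hypothetical helper: builds the 64-bit "via" value for a given vector. */
static uint64_t demo_callback_via(uint8_t vector)
{
	return ((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR
		<< HVM_CALLBACK_VIA_TYPE_SHIFT) | vector;
}

/* Hypothetical helper: extracts the delivery type from the top byte. */
static unsigned int demo_via_type(uint64_t via)
{
	return (unsigned int)(via >> HVM_CALLBACK_VIA_TYPE_SHIFT);
}

/* Hypothetical helper: extracts the vector number from the low byte. */
static uint8_t demo_via_vector(uint64_t via)
{
	return (uint8_t)(via & 0xff);
}
```

The vector passed in is whatever the guest reserved for the purpose; any
concrete number used when trying this out is just a placeholder.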

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> ---
>  arch/x86/xen/enlighten.c         |   35 +++++++++++++++++++++++++++++++++++
>  drivers/xen/events.c             |   31 ++++++++++++++++++++++++-------
>  include/xen/events.h             |    3 +++
>  include/xen/hvm.h                |    9 +++++++++
>  include/xen/interface/features.h |    3 +++
>  5 files changed, 74 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 87a3b10..502c4f8 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -37,8 +37,11 @@
>  #include <xen/interface/vcpu.h>
>  #include <xen/interface/memory.h>
>  #include <xen/interface/hvm/hvm_op.h>
> +#include <xen/interface/hvm/params.h>
>  #include <xen/features.h>
>  #include <xen/page.h>
> +#include <xen/hvm.h>
> +#include <xen/events.h>
>  #include <xen/hvc-console.h>
>  
>  #include <asm/paravirt.h>
> @@ -79,6 +82,8 @@ struct shared_info xen_dummy_shared_info;
>  
>  void *xen_initial_gdt;
>  
> +int xen_have_vector_callback;
> +
>  /*
>   * Point at some empty memory to start with. We map the real shared_info
>   * page as soon as fixmap is up and running.
> @@ -1279,6 +1284,31 @@ static void __init init_shared_info(void)
>  	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
>  }
>  
> +int xen_set_callback_via(uint64_t via)
> +{
> +	struct xen_hvm_param a;
> +	a.domid = DOMID_SELF;
> +	a.index = HVM_PARAM_CALLBACK_IRQ;
> +	a.value = via;
> +	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
> +}
> +
> +void do_hvm_pv_evtchn_intr(void)
> +{
> +	xen_hvm_evtchn_do_upcall(get_irq_regs());
> +}
> +
> +static void xen_callback_vector(void)
> +{
> +	uint64_t callback_via;
> +	if (xen_feature(XENFEAT_hvm_callback_vector)) {
> +		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
> +		xen_set_callback_via(callback_via);
> +		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
> +		xen_have_vector_callback = 1;
> +	}
> +}
> +
>  void __init xen_guest_init(void)
>  {
>  	int r;
> @@ -1292,4 +1322,9 @@ void __init xen_guest_init(void)
>  		return;
>  
>  	init_shared_info();
> +
> +	xen_callback_vector();
> +
> +	have_vcpu_info_placement = 0;
> +	x86_init.irqs.intr_init = xen_init_IRQ;
>  }
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index db8f506..3523dbb 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -36,6 +36,8 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/hypervisor.h>
>  
> +#include <xen/xen.h>
> +#include <xen/hvm.h>
>  #include <xen/xen-ops.h>
>  #include <xen/events.h>
>  #include <xen/interface/xen.h>
> @@ -617,17 +619,13 @@ static DEFINE_PER_CPU(unsigned, xed_nesting_count);
>   * a bitset of words which contain pending event bits.  The second
>   * level is a bitset of pending events themselves.
>   */
> -void xen_evtchn_do_upcall(struct pt_regs *regs)
> +void __xen_evtchn_do_upcall(struct pt_regs *regs)
>  {
>  	int cpu = get_cpu();
> -	struct pt_regs *old_regs = set_irq_regs(regs);
>  	struct shared_info *s = HYPERVISOR_shared_info;
>  	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
>   	unsigned count;
>  
> -	exit_idle();
> -	irq_enter();
> -
>  	do {
>  		unsigned long pending_words;
>  
> @@ -667,10 +665,26 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>  	} while(count != 1);
>  
>  out:
> +
> +	put_cpu();
> +}
> +
> +void xen_evtchn_do_upcall(struct pt_regs *regs)
> +{
> +	struct pt_regs *old_regs = set_irq_regs(regs);
> +
> +	exit_idle();
> +	irq_enter();
> +
> +	__xen_evtchn_do_upcall(regs);
> +
>  	irq_exit();
>  	set_irq_regs(old_regs);
> +}
>  
> -	put_cpu();
> +void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
> +{
> +	__xen_evtchn_do_upcall(regs);
>  }
>  
>  /* Rebind a new event channel to an existing irq. */
> @@ -947,5 +961,8 @@ void __init xen_init_IRQ(void)
>  	for (i = 0; i < NR_EVENT_CHANNELS; i++)
>  		mask_evtchn(i);
>  
> -	irq_ctx_init(smp_processor_id());
> +	if (xen_hvm_domain())
> +		native_init_IRQ();
> +	else
> +		irq_ctx_init(smp_processor_id());
>  }
> diff --git a/include/xen/events.h b/include/xen/events.h
> index e68d59a..868e5d6 100644
> --- a/include/xen/events.h
> +++ b/include/xen/events.h
> @@ -56,4 +56,7 @@ void xen_poll_irq(int irq);
>  /* Determine the IRQ which is bound to an event channel */
>  unsigned irq_from_evtchn(unsigned int evtchn);
>  
> +void xen_evtchn_do_upcall(struct pt_regs *regs);
> +void xen_hvm_evtchn_do_upcall(struct pt_regs *regs);
> +
>  #endif	/* _XEN_EVENTS_H */
> diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> index 6b0d418..5940ee5 100644
> --- a/include/xen/hvm.h
> +++ b/include/xen/hvm.h
> @@ -3,6 +3,7 @@
>  #define XEN_HVM_H__
>  
>  #include <xen/interface/hvm/params.h>
> +#include <asm/xen/hypercall.h>
>  
>  static inline int hvm_get_parameter(int idx, uint64_t *value)
>  {
> @@ -21,4 +22,12 @@ static inline int hvm_get_parameter(int idx, uint64_t *value)
>         return r;
>  }
>  
> +int xen_set_callback_via(uint64_t via);
> +extern int xen_have_vector_callback;
> +
> +#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
> +#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
> +#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
> +                               HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
> +
>  #endif /* XEN_HVM_H__ */
> diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
> index f51b641..8ab08b9 100644
> --- a/include/xen/interface/features.h
> +++ b/include/xen/interface/features.h
> @@ -41,6 +41,9 @@
>  /* x86: Does this Xen host support the MMU_PT_UPDATE_PRESERVE_AD hypercall? */
>  #define XENFEAT_mmu_pt_update_preserve_ad  5
>  
> +/* x86: Does this Xen host support the HVM callback vector type? */
> +#define XENFEAT_hvm_callback_vector        8
> +
>  #define XENFEAT_NR_SUBMAPS 1
>  
>  #endif /* __XEN_PUBLIC_FEATURES_H__ */
>   

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 05/12] unplug emulated disks and nics
  2010-05-18 10:23 ` [PATCH 05/12] unplug emulated disks and nics Stefano Stabellini
@ 2010-05-18 17:27   ` Jeremy Fitzhardinge
  2010-05-19 13:00     ` Stefano Stabellini
  0 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 17:27 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Don Dutile, linux-kernel, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> add a xen_unplug command line option to the kernel to unplug
> xen emulated disks and nics.
>   

I think it would be nice to call it something like "xen_emul_unplug" to
clarify what this actually means.  And is it really necessary to make it
a command-line option?  Can't we unplug these things once the pv drivers
are brought up?  What happens if the user doesn't specify this?
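
As a rough illustration of what such an option has to do, here is a
standalone sketch that turns a comma-separated string into the unplug
bitmask, using the UNPLUG_* values from the quoted patch. The function name
demo_parse_unplug is hypothetical, the input must be a writable buffer
because tokens are terminated in place, and the search for the next comma
starts from the current token.

```c
#include <stdio.h>
#include <string.h>

#define UNPLUG_ALL_IDE_DISKS 1
#define UNPLUG_ALL_NICS      2
#define UNPLUG_AUX_IDE_DISKS 4
#define UNPLUG_ALL           7

/* Parses a writable, comma-separated option string into a bitmask. */
static int demo_parse_unplug(char *arg)
{
	int mask = 0;
	char *p, *q;

	for (p = arg; p; p = q) {
		/* Find the end of the current token and terminate it. */
		q = strchr(p, ',');
		if (q)
			*q++ = '\0';

		if (!strcmp(p, "all"))
			mask |= UNPLUG_ALL;
		else if (!strcmp(p, "ide-disks"))
			mask |= UNPLUG_ALL_IDE_DISKS;
		else if (!strcmp(p, "aux-ide-disks"))
			mask |= UNPLUG_AUX_IDE_DISKS;
		else if (!strcmp(p, "nics"))
			mask |= UNPLUG_ALL_NICS;
		else
			fprintf(stderr, "unrecognised option '%s'\n", p);
	}
	return mask;
}
```

Unknown tokens are reported and skipped, so a typo in one entry does not
discard the rest of the list.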

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  arch/x86/xen/enlighten.c             |   66 ++++++++++++++++++++++++++++++++++
>  include/xen/hvm.h                    |    2 +
>  include/xen/interface/platform_pci.h |   46 +++++++++++++++++++++++
>  include/xen/platform_pci.h           |   32 ++++++++++++++++
>  4 files changed, 146 insertions(+), 0 deletions(-)
>  create mode 100644 include/xen/interface/platform_pci.h
>  create mode 100644 include/xen/platform_pci.h
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 502c4f8..aac47b0 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -31,6 +31,7 @@
>  #include <linux/gfp.h>
>  
>  #include <xen/xen.h>
> +#include <xen/platform_pci.h>
>  #include <xen/interface/xen.h>
>  #include <xen/interface/version.h>
>  #include <xen/interface/physdev.h>
> @@ -38,6 +39,7 @@
>  #include <xen/interface/memory.h>
>  #include <xen/interface/hvm/hvm_op.h>
>  #include <xen/interface/hvm/params.h>
> +#include <xen/interface/platform_pci.h>
>  #include <xen/features.h>
>  #include <xen/page.h>
>  #include <xen/hvm.h>
> @@ -83,6 +85,8 @@ struct shared_info xen_dummy_shared_info;
>  void *xen_initial_gdt;
>  
>  int xen_have_vector_callback;
> +int xen_platform_pci;
> +static int unplug;
>  
>  /*
>   * Point at some empty memory to start with. We map the real shared_info
> @@ -1309,6 +1313,39 @@ static void xen_callback_vector(void)
>  	}
>  }
>  
> +static int __init check_platform_magic(void)
>   

I'd prefer not to put all this in enlighten.c unless it really needs to
be here.  Given that all this is dependent on the Xen platform PCI
device being enabled, it would probably be happy in a separate
conditionally compiled file.

> +{
> +	short magic;
> +	char protocol;
> +
> +	magic = inw(XEN_IOPORT_MAGIC);
>   

Does this get run only once we've established we're running on Xen, or
could this be run in an arbitrary environment?

> +	if (magic != XEN_IOPORT_MAGIC_VAL) {
> +		printk(KERN_ERR "Xen Platform Pci: unrecognised magic value\n");
> +		return -1;
> +	}
> +
> +	protocol = inb(XEN_IOPORT_PROTOVER);
> +
> +	printk(KERN_DEBUG "Xen Platform Pci: I/O protocol version %d\n",
>   

"PCI" please.  Also, is that really accurate since we're doing random IO
port stuff with no obvious connection to PCI?  "Xen Platform Device"
perhaps?  (Though given that we have a proper fake PCI device, why all
this random IO port hackery anyway?)


> +			protocol);
> +
> +	switch (protocol) {
> +	case 1:
> +		outw(XEN_IOPORT_LINUX_PRODNUM, XEN_IOPORT_PRODNUM);
> +		outl(XEN_IOPORT_LINUX_DRVVER, XEN_IOPORT_DRVVER);
> +		if (inw(XEN_IOPORT_MAGIC) != XEN_IOPORT_MAGIC_VAL) {
> +			printk(KERN_ERR "Xen Platform: blacklisted by host\n");
> +			return -3;
> +		}
> +		break;
> +	default:
> +		printk(KERN_WARNING "Xen Platform Pci: unknown I/O protocol version");
> +		return -2;
> +	}
> +
> +	return 0;
> +}
> +
>  void __init xen_guest_init(void)
>  {
>  	int r;
> @@ -1325,6 +1362,35 @@ void __init xen_guest_init(void)
>  
>  	xen_callback_vector();
>  
> +	r = check_platform_magic();
> +	if (!r || (r == -1 && (unplug & UNPLUG_IGNORE)))
> +		xen_platform_pci = 1;
> +	if (xen_platform_pci && !(unplug & UNPLUG_IGNORE))
> +		outw(unplug, XEN_IOPORT_UNPLUG);
>   

What does all this do?  A comment would be nice.

>  	have_vcpu_info_placement = 0;
>  	x86_init.irqs.intr_init = xen_init_IRQ;
>  }
> +
> +static int __init parse_unplug(char *arg)
> +{
> +	char *p, *q;
> +
> +	for (p = arg; p; p = q) {
> +		q = strchr(p, ',');
> +		if (q)
> +			*q++ = '\0';
> +		if (!strcmp(p, "all"))
> +			unplug |= UNPLUG_ALL;
> +		else if (!strcmp(p, "ide-disks"))
> +			unplug |= UNPLUG_ALL_IDE_DISKS;
> +		else if (!strcmp(p, "aux-ide-disks"))
> +			unplug |= UNPLUG_AUX_IDE_DISKS;
> +		else if (!strcmp(p, "nics"))
> +			unplug |= UNPLUG_ALL_NICS;
> +		else
> +			printk(KERN_WARNING "unrecognised option '%s' "
> +				 "in module parameter 'dev_unplug'\n", p);
>   

"xen_unplug" (or whatever it becomes).

> +	}
> +	return 0;
> +}
> +early_param("xen_unplug", parse_unplug);
>   

If we must have this kernel command line parameter, make sure you update
Documentation/kernel-parameters.txt.

> diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> index 5940ee5..777d2ce 100644
> --- a/include/xen/hvm.h
> +++ b/include/xen/hvm.h
> @@ -30,4 +30,6 @@ extern int xen_have_vector_callback;
>  #define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
>                                 HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
>  
> +extern int xen_platform_pci;
> +
>  #endif /* XEN_HVM_H__ */
> diff --git a/include/xen/interface/platform_pci.h b/include/xen/interface/platform_pci.h
> new file mode 100644
> index 0000000..720eaf5
> --- /dev/null
> +++ b/include/xen/interface/platform_pci.h
> @@ -0,0 +1,46 @@
> +/******************************************************************************
> + * platform_pci.h
> + *
> + * Interface for granting foreign access to page frames, and receiving
> + * page-ownership transfers.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef __XEN_PUBLIC_PLATFORM_PCI_H__
> +#define __XEN_PUBLIC_PLATFORM_PCI_H__
> +
> +#define XEN_IOPORT_BASE 0x10
> +
> +#define XEN_IOPORT_PLATFLAGS	(XEN_IOPORT_BASE + 0) /* 1 byte access (R/W) */
> +#define XEN_IOPORT_MAGIC	(XEN_IOPORT_BASE + 0) /* 2 byte access (R) */
> +#define XEN_IOPORT_UNPLUG	(XEN_IOPORT_BASE + 0) /* 2 byte access (W) */
> +#define XEN_IOPORT_DRVVER	(XEN_IOPORT_BASE + 0) /* 4 byte access (W) */
> +
> +#define XEN_IOPORT_SYSLOG	(XEN_IOPORT_BASE + 2) /* 1 byte access (W) */
> +#define XEN_IOPORT_PROTOVER	(XEN_IOPORT_BASE + 2) /* 1 byte access (R) */
> +#define XEN_IOPORT_PRODNUM	(XEN_IOPORT_BASE + 2) /* 2 byte access (W) */
> +
> +#define UNPLUG_ALL_IDE_DISKS 1
> +#define UNPLUG_ALL_NICS 2
> +#define UNPLUG_AUX_IDE_DISKS 4
> +#define UNPLUG_ALL 7
> +#define UNPLUG_IGNORE 8
> +
> +#endif /* __XEN_PUBLIC_PLATFORM_PCI_H__ */
> diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> new file mode 100644
> index 0000000..f39f4d3
> --- /dev/null
> +++ b/include/xen/platform_pci.h
> @@ -0,0 +1,32 @@
> +/******************************************************************************
> + * platform-pci.h
> + *
> + * Xen platform PCI device driver
> + * Copyright (c) 2004, Intel Corporation. <xiaofeng.ling@intel.com>
> + * Copyright (c) 2007, XenSource Inc.
> + * Copyright (c) 2010, Citrix
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#ifndef _XEN_PLATFORM_PCI_H
> +#define _XEN_PLATFORM_PCI_H
> +
> +#include <linux/version.h>
> +
> +#define XEN_IOPORT_MAGIC_VAL 0x49d2
> +#define XEN_IOPORT_LINUX_PRODNUM 0xffff
> +#define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
>   

Can't these two headers be folded together?  There doesn't seem much
point in splitting these XEN_IOPORT definitions across two files.

    J

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 10:22 ` [PATCH 03/12] evtchn delivery " Stefano Stabellini
  2010-05-18 17:17   ` Jeremy Fitzhardinge
@ 2010-05-18 17:43   ` Jeremy Fitzhardinge
  2010-05-19 13:01     ` Stefano Stabellini
  2010-05-18 18:10   ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 17:43 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Don Dutile, linux-kernel, Sheng Yang

On 05/18/2010 03:22 AM, Stefano Stabellini wrote:
> From: Sheng Yang <sheng@linux.intel.com>
>
> Set the callback to receive evtchns from Xen, using the
> callback vector delivery mechanism.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> ---
>  arch/x86/xen/enlighten.c         |   35 +++++++++++++++++++++++++++++++++++
>  drivers/xen/events.c             |   31 ++++++++++++++++++++++++-------
>  include/xen/events.h             |    3 +++
>  include/xen/hvm.h                |    9 +++++++++
>  include/xen/interface/features.h |    3 +++
>  5 files changed, 74 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 87a3b10..502c4f8 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -37,8 +37,11 @@
>  #include <xen/interface/vcpu.h>
>  #include <xen/interface/memory.h>
>  #include <xen/interface/hvm/hvm_op.h>
> +#include <xen/interface/hvm/params.h>
>  #include <xen/features.h>
>  #include <xen/page.h>
> +#include <xen/hvm.h>
> +#include <xen/events.h>
>  #include <xen/hvc-console.h>
>  
>  #include <asm/paravirt.h>
> @@ -79,6 +82,8 @@ struct shared_info xen_dummy_shared_info;
>  
>  void *xen_initial_gdt;
>  
> +int xen_have_vector_callback;
>   

BTW, this can be a __read_mostly.

    J

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 10:22 ` [PATCH 03/12] evtchn delivery " Stefano Stabellini
  2010-05-18 17:17   ` Jeremy Fitzhardinge
  2010-05-18 17:43   ` Jeremy Fitzhardinge
@ 2010-05-18 18:10   ` Jeremy Fitzhardinge
  2010-05-19 13:08     ` Stefano Stabellini
  2 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:10 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Don Dutile, linux-kernel, Sheng Yang

On 05/18/2010 03:22 AM, Stefano Stabellini wrote:
> From: Sheng Yang <sheng@linux.intel.com>
>
> Set the callback to receive evtchns from Xen, using the
> callback vector delivery mechanism.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> ---
>  arch/x86/xen/enlighten.c         |   35 +++++++++++++++++++++++++++++++++++
>  drivers/xen/events.c             |   31 ++++++++++++++++++++++++-------
>  include/xen/events.h             |    3 +++
>  include/xen/hvm.h                |    9 +++++++++
>  include/xen/interface/features.h |    3 +++
>  5 files changed, 74 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 87a3b10..502c4f8 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -37,8 +37,11 @@
>  #include <xen/interface/vcpu.h>
>  #include <xen/interface/memory.h>
>  #include <xen/interface/hvm/hvm_op.h>
> +#include <xen/interface/hvm/params.h>
>  #include <xen/features.h>
>  #include <xen/page.h>
> +#include <xen/hvm.h>
> +#include <xen/events.h>
>  #include <xen/hvc-console.h>
>  
>  #include <asm/paravirt.h>
> @@ -79,6 +82,8 @@ struct shared_info xen_dummy_shared_info;
>  
>  void *xen_initial_gdt;
>  
> +int xen_have_vector_callback;
> +
>  /*
>   * Point at some empty memory to start with. We map the real shared_info
>   * page as soon as fixmap is up and running.
> @@ -1279,6 +1284,31 @@ static void __init init_shared_info(void)
>  	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
>  }
>  
> +int xen_set_callback_via(uint64_t via)
> +{
> +	struct xen_hvm_param a;
> +	a.domid = DOMID_SELF;
> +	a.index = HVM_PARAM_CALLBACK_IRQ;
> +	a.value = via;
> +	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
>   

Does this implicitly set the vector delivery on all vcpus, current and
future?

> +}
> +
> +void do_hvm_pv_evtchn_intr(void)
> +{
> +	xen_hvm_evtchn_do_upcall(get_irq_regs());
> +}
> +
> +static void xen_callback_vector(void)
>   

All this callback vector stuff should be in drivers/xen/events.c.  It
would also be good to give it a more descriptive name
("xen_set_callback_vector"?), and make it an init function.

> +{
> +	uint64_t callback_via;
> +	if (xen_feature(XENFEAT_hvm_callback_vector)) {
> +		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
> +		xen_set_callback_via(callback_via);
>   

Do you need to check the return value here?  Can it possibly fail?
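
For illustration, a minimal userspace sketch of what checking the result could look like. It assumes the hypercall can return a non-zero error; the mock xen_set_callback_via(), the fake_hypercall_result knob and the vector number used by callers are all stand-ins, not part of the patch. The name xen_set_callback_vector is the one suggested above.

```c
#include <stdint.h>
#include <stdio.h>

#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
#define HVM_CALLBACK_VIA_TYPE_SHIFT  56
#define HVM_CALLBACK_VECTOR(x) \
	((((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR) << HVM_CALLBACK_VIA_TYPE_SHIFT) | (x))

static int fake_hypercall_result;	/* assumption: what the mock hypercall returns */
static int xen_have_vector_callback;

/* Mock of xen_set_callback_via(); the real one issues HVMOP_set_param. */
static int xen_set_callback_via(uint64_t via)
{
	(void)via;
	return fake_hypercall_result;
}

/* Only advertise the vector callback once the request was accepted. */
static int xen_set_callback_vector(unsigned int vector)
{
	int rc = xen_set_callback_via(HVM_CALLBACK_VECTOR(vector));

	if (rc) {
		fprintf(stderr, "vector callback request failed, err=%d\n", rc);
		xen_have_vector_callback = 0;
		return rc;
	}

	xen_have_vector_callback = 1;
	return 0;
}
```

With this shape, xen_have_vector_callback stays 0 on failure, so the platform PCI driver would fall back to its own IRQ as it already does when the feature is absent.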

> +		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
> +		xen_have_vector_callback = 1;
> +	}
> +}
> +
>  void __init xen_guest_init(void)
>  {
>  	int r;
> @@ -1292,4 +1322,9 @@ void __init xen_guest_init(void)
>  		return;
>  
>  	init_shared_info();
> +
> +	xen_callback_vector();
> +
> +	have_vcpu_info_placement = 0;
> +	x86_init.irqs.intr_init = xen_init_IRQ;
>  }
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index db8f506..3523dbb 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -36,6 +36,8 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/hypervisor.h>
>  
> +#include <xen/xen.h>
> +#include <xen/hvm.h>
>  #include <xen/xen-ops.h>
>  #include <xen/events.h>
>  #include <xen/interface/xen.h>
> @@ -617,17 +619,13 @@ static DEFINE_PER_CPU(unsigned, xed_nesting_count);
>   * a bitset of words which contain pending event bits.  The second
>   * level is a bitset of pending events themselves.
>   */
> -void xen_evtchn_do_upcall(struct pt_regs *regs)
> +void __xen_evtchn_do_upcall(struct pt_regs *regs)
>   

Given that the regs arg is completely unused, you should drop it.

>  {
>  	int cpu = get_cpu();
> -	struct pt_regs *old_regs = set_irq_regs(regs);
>  	struct shared_info *s = HYPERVISOR_shared_info;
>  	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
>   	unsigned count;
>  
> -	exit_idle();
> -	irq_enter();
> -
>  	do {
>  		unsigned long pending_words;
>  
> @@ -667,10 +665,26 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>  	} while(count != 1);
>  
>  out:
> +
> +	put_cpu();
> +}
> +
> +void xen_evtchn_do_upcall(struct pt_regs *regs)
> +{
> +	struct pt_regs *old_regs = set_irq_regs(regs);
> +
> +	exit_idle();
> +	irq_enter();
> +
> +	__xen_evtchn_do_upcall(regs);
> +
>  	irq_exit();
>  	set_irq_regs(old_regs);
> +}
>  
> -	put_cpu();
> +void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
> +{
> +	__xen_evtchn_do_upcall(regs);
>   

Don't you need to set_irq_regs here?
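
A rough sketch of one possible shape if both suggestions were taken: the core loop loses its unused regs argument, and the HVM entry point saves and restores the irq regs around it. This is a userspace mock, not kernel code; set_irq_regs() is reimplemented here as a stand-in, and xen_evtchn_do_upcall_core is a hypothetical name for __xen_evtchn_do_upcall. Whether the HVM path really needs set_irq_regs() depends on what its caller has already done, which the sketch does not settle.

```c
#include <stddef.h>

struct pt_regs { int dummy; };		/* placeholder for the real structure */

static struct pt_regs *current_irq_regs;	/* models the per-cpu irq regs pointer */
static struct pt_regs *regs_seen_in_upcall;
static int upcalls_run;

/* Stand-in for the kernel's set_irq_regs(): swap and return the old value. */
static struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
{
	struct pt_regs *old_regs = current_irq_regs;

	current_irq_regs = new_regs;
	return old_regs;
}

/* Core event scan, with no regs parameter since it never used one. */
static void xen_evtchn_do_upcall_core(void)
{
	upcalls_run++;
	regs_seen_in_upcall = current_irq_regs;
}

/* HVM entry point: make regs visible to handlers for the duration. */
static void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
{
	struct pt_regs *old_regs = set_irq_regs(regs);

	xen_evtchn_do_upcall_core();
	set_irq_regs(old_regs);
}
```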

>  }
>  
>  /* Rebind a new event channel to an existing irq. */
> @@ -947,5 +961,8 @@ void __init xen_init_IRQ(void)
>  	for (i = 0; i < NR_EVENT_CHANNELS; i++)
>  		mask_evtchn(i);
>  
> -	irq_ctx_init(smp_processor_id());
> +	if (xen_hvm_domain())
> +		native_init_IRQ();
> +	else
> +		irq_ctx_init(smp_processor_id());
>  }
> diff --git a/include/xen/events.h b/include/xen/events.h
> index e68d59a..868e5d6 100644
> --- a/include/xen/events.h
> +++ b/include/xen/events.h
> @@ -56,4 +56,7 @@ void xen_poll_irq(int irq);
>  /* Determine the IRQ which is bound to an event channel */
>  unsigned irq_from_evtchn(unsigned int evtchn);
>  
> +void xen_evtchn_do_upcall(struct pt_regs *regs);
> +void xen_hvm_evtchn_do_upcall(struct pt_regs *regs);
> +
>  #endif	/* _XEN_EVENTS_H */
> diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> index 6b0d418..5940ee5 100644
> --- a/include/xen/hvm.h
> +++ b/include/xen/hvm.h
> @@ -3,6 +3,7 @@
>  #define XEN_HVM_H__
>  
>  #include <xen/interface/hvm/params.h>
> +#include <asm/xen/hypercall.h>
>  
>  static inline int hvm_get_parameter(int idx, uint64_t *value)
>  {
> @@ -21,4 +22,12 @@ static inline int hvm_get_parameter(int idx, uint64_t *value)
>         return r;
>  }
>  
> +int xen_set_callback_via(uint64_t via);
> +extern int xen_have_vector_callback;
> +
> +#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
> +#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
> +#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
> +                               HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
> +
>  #endif /* XEN_HVM_H__ */
> diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
> index f51b641..8ab08b9 100644
> --- a/include/xen/interface/features.h
> +++ b/include/xen/interface/features.h
> @@ -41,6 +41,9 @@
>  /* x86: Does this Xen host support the MMU_PT_UPDATE_PRESERVE_AD hypercall? */
>  #define XENFEAT_mmu_pt_update_preserve_ad  5
>  
> +/* x86: Does this Xen host support the HVM callback vector type? */
> +#define XENFEAT_hvm_callback_vector        8
> +
>  #define XENFEAT_NR_SUBMAPS 1
>  
>  #endif /* __XEN_PUBLIC_FEATURES_H__ */
>   
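
As a worked example of the callback "via" encoding added to include/xen/hvm.h: the delivery type sits in the top byte (bits 56-63) and, for the vector type, the vector number in the low byte. The sketch below reuses the two constants from the patch; HVM_CALLBACK_VECTOR_SAFE and the two decoding helpers are hypothetical names, and 0xf3 is just an example vector number.

```c
#include <stdint.h>

#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
#define HVM_CALLBACK_VIA_TYPE_SHIFT  56

/* Same value as the patch's macro, with the whole expansion parenthesized. */
#define HVM_CALLBACK_VECTOR_SAFE(x) \
	((((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR) << HVM_CALLBACK_VIA_TYPE_SHIFT) | (x))

/* Hypothetical helper: extract the delivery type from a via value. */
static inline unsigned int callback_via_type(uint64_t via)
{
	return (unsigned int)(via >> HVM_CALLBACK_VIA_TYPE_SHIFT);
}

/* Hypothetical helper: extract the vector number from a via value. */
static inline unsigned int callback_via_vector(uint64_t via)
{
	return (unsigned int)(via & 0xff);
}
```

For vector 0xf3 the encoded value is 0x02000000000000f3. The outer parentheses matter only when the macro is used inside a larger expression: the macro as posted expands without them, so something like HVM_CALLBACK_VECTOR(v) & 0xff would apply the mask to v alone.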

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 06/12] xen pci platform device driver
  2010-05-18 10:23 ` [PATCH 06/12] xen pci platform device driver Stefano Stabellini
@ 2010-05-18 18:11   ` Jeremy Fitzhardinge
  2010-05-19 13:50     ` Stefano Stabellini
  0 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:11 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: linux-kernel, xen-devel, Don Dutile, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> Add the xen pci platform device driver that is responsible
> for initializing the grant table and xenbus in PV on HVM mode.
> A few changes to xenbus and the grant table are necessary to allow the
> delayed initialization in HVM mode.
> The grant table needs a few additional modifications to work in HVM mode.
>   

This needs a description of how event and interrupt handling work in
this environment.

> When running on HVM the event channel upcall is never called while in
> progress because it is a normal Linux irq handler, therefore we cannot
> be sure that evtchn_upcall_pending is 0 when returning.
>   

Is that because the interrupt raised by a pending event is
edge-triggered, so that even if the event is still pending on return,
the corresponding interrupt isn't still asserted?

> For this reason if evtchn_upcall_pending is set by Xen we need to loop
> again on the event channels set pending, otherwise we might lose some
> event channel deliveries.
>   

So if the event is raised after the event processing loop but before the
handler returns, the corresponding interrupt is still asserted so the
interrupt handler will be re-entered immediately?  But if that's true,
then why is an event occurring during the loop liable to get missed?
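
To make the scenario under discussion concrete, here is a toy model of it, not the kernel loop. It assumes the case the commit message describes: an event is raised after the scan has finished but before the handler returns, and no new interrupt edge follows. All names (toy_vcpu, toy_raise, toy_upcall) are hypothetical.

```c
#include <stdbool.h>

struct toy_vcpu {
	bool upcall_pending;	/* models vcpu_info->evtchn_upcall_pending */
	unsigned long pending;	/* one bit per toy event channel */
	int inject_after_scan;	/* port raised once, right after a scan; -1 for none */
	unsigned int handled;	/* number of events processed so far */
};

/* Models Xen marking a port pending and flagging the vcpu. */
static void toy_raise(struct toy_vcpu *v, int port)
{
	v->pending |= 1UL << port;
	v->upcall_pending = true;
}

/* One upcall. With recheck set, scan again while the flag is set. */
static void toy_upcall(struct toy_vcpu *v, bool recheck)
{
	do {
		v->upcall_pending = false;

		while (v->pending) {
			v->pending &= v->pending - 1;	/* handle lowest pending port */
			v->handled++;
		}

		if (v->inject_after_scan >= 0) {	/* the late event arrives here */
			toy_raise(v, v->inject_after_scan);
			v->inject_after_scan = -1;
		}
	} while (recheck && v->upcall_pending);
}
```

Without the re-check the late event stays flagged but unhandled when the handler returns; with it, the loop picks the event up. Whether the interrupt would in fact be re-asserted in that window, which is the question raised above, is exactly what the model leaves as an assumption.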


> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> ---
>  drivers/xen/Kconfig                 |    8 ++
>  drivers/xen/Makefile                |    3 +-
>  drivers/xen/events.c                |    5 +-
>  drivers/xen/grant-table.c           |   70 +++++++++++--
>  drivers/xen/platform-pci.c          |  198 +++++++++++++++++++++++++++++++++++
>  drivers/xen/xenbus/xenbus_probe.c   |   20 +++-
>  include/xen/grant_table.h           |    1 +
>  include/xen/interface/grant_table.h |    1 +
>  include/xen/platform_pci.h          |    9 ++
>  include/xen/xenbus.h                |    1 +
>  10 files changed, 300 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/xen/platform-pci.c
>
> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> index fad3df2..da312e2 100644
> --- a/drivers/xen/Kconfig
> +++ b/drivers/xen/Kconfig
> @@ -62,4 +62,12 @@ config XEN_SYS_HYPERVISOR
>  	 virtual environment, /sys/hypervisor will still be present,
>  	 but will have no xen contents.
>  
> +config XEN_PLATFORM_PCI
> +	tristate "xen platform pci device driver"
> +	depends on XEN
> +	help
> +	  Driver for the Xen PCI Platform device: it is responsible for
> +	  initializing xenbus and grant_table when running in a Xen HVM
> +	  domain. As a consequence this driver is required to run any Xen PV
> +	  frontend on Xen HVM.
>  endmenu
> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> index 7c28434..e392fb7 100644
> --- a/drivers/xen/Makefile
> +++ b/drivers/xen/Makefile
> @@ -9,4 +9,5 @@ obj-$(CONFIG_XEN_XENCOMM)	+= xencomm.o
>  obj-$(CONFIG_XEN_BALLOON)	+= balloon.o
>  obj-$(CONFIG_XEN_DEV_EVTCHN)	+= evtchn.o
>  obj-$(CONFIG_XENFS)		+= xenfs/
> -obj-$(CONFIG_XEN_SYS_HYPERVISOR)	+= sys-hypervisor.o
> \ No newline at end of file
> +obj-$(CONFIG_XEN_SYS_HYPERVISOR)	+= sys-hypervisor.o
> +obj-$(CONFIG_XEN_PLATFORM_PCI)	+= platform-pci.o
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index a137a2f..cfc6d96 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -671,7 +671,7 @@ void __xen_evtchn_do_upcall(struct pt_regs *regs)
>  
>  		count = __get_cpu_var(xed_nesting_count);
>  		__get_cpu_var(xed_nesting_count) = 0;
> -	} while(count != 1);
> +	} while(count != 1 || vcpu_info->evtchn_upcall_pending);
>   

I still don't think I understand the need for this (or if it's needed,
why it's correct).

>  
>  out:
>  
> @@ -731,7 +731,8 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
>  	struct evtchn_bind_vcpu bind_vcpu;
>  	int evtchn = evtchn_from_irq(irq);
>  
> -	if (!VALID_EVTCHN(evtchn))
> +	if (!VALID_EVTCHN(evtchn) ||
> +		(xen_hvm_domain() && !xen_have_vector_callback))
>   

A comment would be useful here.  Is it that events delivered via IO APIC
are always routed to vcpu 0, but PV events and vectored HVM events can
be delivered to any vcpu?

>  		return -1;
>  
>  	/* Send future instances of this interrupt to other vcpu. */
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index f66db3b..6f5f3ba 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -37,11 +37,14 @@
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
>  #include <linux/uaccess.h>
> +#include <linux/io.h>
>  
>  #include <xen/xen.h>
>  #include <xen/interface/xen.h>
>  #include <xen/page.h>
>  #include <xen/grant_table.h>
> +#include <xen/platform_pci.h>
> +#include <xen/interface/memory.h>
>  #include <asm/xen/hypercall.h>
>  
>  #include <asm/pgtable.h>
> @@ -59,6 +62,7 @@ static unsigned int boot_max_nr_grant_frames;
>  static int gnttab_free_count;
>  static grant_ref_t gnttab_free_head;
>  static DEFINE_SPINLOCK(gnttab_list_lock);
> +static unsigned long hvm_pv_resume_frames;
>  
>  static struct grant_entry *shared;
>  
> @@ -449,6 +453,30 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
>  	unsigned int nr_gframes = end_idx + 1;
>  	int rc;
>  
> +	if (xen_hvm_domain()) {
> +		struct xen_add_to_physmap xatp;
> +		unsigned int i = end_idx;
> +		rc = 0;
> +		/*
> +		 * Loop backwards, so that the first hypercall has the largest
> +		 * index, ensuring that the table will grow only once.
> +		 */
> +		do {
> +			xatp.domid = DOMID_SELF;
> +			xatp.idx = i;
> +			xatp.space = XENMAPSPACE_grant_table;
> +			xatp.gpfn = (hvm_pv_resume_frames >> PAGE_SHIFT) + i;
> +			rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
> +			if (rc != 0) {
> +				printk(KERN_WARNING
> +						"grant table add_to_physmap failed, err=%d\n", rc);
> +				break;
> +			}
> +		} while (i-- > start_idx);
> +
> +		return rc;
> +	}
> +
>  	frames = kmalloc(nr_gframes * sizeof(unsigned long), GFP_ATOMIC);
>  	if (!frames)
>  		return -ENOMEM;
> @@ -476,9 +504,28 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
>  
>  int gnttab_resume(void)
>  {
> -	if (max_nr_grant_frames() < nr_grant_frames)
> +	unsigned int max_nr_gframes;
> +
> +	max_nr_gframes = max_nr_grant_frames();
> +	if (max_nr_gframes < nr_grant_frames)
>  		return -ENOSYS;
> -	return gnttab_map(0, nr_grant_frames - 1);
> +
> +	if (xen_pv_domain())
> +		return gnttab_map(0, nr_grant_frames - 1);
> +
> +	if (!hvm_pv_resume_frames) {
> +		hvm_pv_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
>   

Can alloc_xen_mmio fail?  Can this possibly get called with the stub
version in place (returning ~0UL), and what does ioremap do if you pass
that into it?
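
A minimal sketch of guarding against the stub's return value before mapping, written as a userspace mock. XEN_MMIO_NONE and gnttab_setup_hvm_window are hypothetical names, and alloc_xen_mmio() and ioremap() here are stand-ins for the real functions. Note that the non-stub allocator in this patch hits BUG_ON when the window is exhausted rather than returning an error, so the sentinel would only ever come from the stub.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical name for the value returned by the stub alloc_xen_mmio(). */
#define XEN_MMIO_NONE (~0UL)

/* Mocks standing in for the platform driver and the kernel's ioremap(). */
static unsigned long mock_mmio_addr = XEN_MMIO_NONE;
static char mock_window[64];

static unsigned long alloc_xen_mmio(unsigned long len)
{
	(void)len;
	return mock_mmio_addr;
}

static void *ioremap(unsigned long addr, unsigned long len)
{
	(void)addr;
	return len <= sizeof(mock_window) ? mock_window : NULL;
}

/* Reserve and map the grant-table window, failing early on a bad address. */
static int gnttab_setup_hvm_window(unsigned long len, void **shared)
{
	unsigned long addr = alloc_xen_mmio(len);

	if (addr == XEN_MMIO_NONE)
		return -ENODEV;

	*shared = ioremap(addr, len);
	if (*shared == NULL)
		return -ENOMEM;

	return 0;
}
```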

> +		shared = ioremap(hvm_pv_resume_frames, PAGE_SIZE * max_nr_gframes);
> +		if (shared == NULL) {
> +			printk(KERN_WARNING
> +					"Fail to ioremap gnttab share frames\n");
> +			return -ENOMEM;
> +		}
> +	}
> +
> +	gnttab_map(0, nr_grant_frames - 1);
> +
> +	return 0;
>  }
>  
>  int gnttab_suspend(void)
> @@ -505,15 +552,12 @@ static int gnttab_expand(unsigned int req_entries)
>  	return rc;
>  }
>  
> -static int __devinit gnttab_init(void)
> +int gnttab_init(void)
>  {
>  	int i;
>  	unsigned int max_nr_glist_frames, nr_glist_frames;
>  	unsigned int nr_init_grefs;
>  
> -	if (!xen_domain())
> -		return -ENODEV;
> -
>  	nr_grant_frames = 1;
>  	boot_max_nr_grant_frames = __max_nr_grant_frames();
>  
> @@ -557,4 +601,16 @@ static int __devinit gnttab_init(void)
>  	return -ENOMEM;
>  }
>  
> -core_initcall(gnttab_init);
> +static int __devinit __gnttab_init(void)
> +{
> +	/* Delay grant-table initialization in the PV on HVM case */
> +	if (xen_hvm_domain())
> +		return 0;
> +
> +	if (!xen_pv_domain())
> +		return -ENODEV;
> +
> +	return gnttab_init();
> +}
> +
> +core_initcall(__gnttab_init);
> diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> new file mode 100644
> index 0000000..7a8da66
> --- /dev/null
> +++ b/drivers/xen/platform-pci.c
> @@ -0,0 +1,198 @@
> +/******************************************************************************
> + * platform-pci.c
> + *
> + * Xen platform PCI device driver
> + * Copyright (c) 2005, Intel Corporation.
> + * Copyright (c) 2007, XenSource Inc.
> + * Copyright (c) 2010, Citrix
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + *
> + */
> +
> +#include <asm/io.h>
> +
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +
> +#include <xen/grant_table.h>
> +#include <xen/platform_pci.h>
> +#include <xen/interface/platform_pci.h>
> +#include <xen/xenbus.h>
> +#include <xen/events.h>
> +#include <xen/hvm.h>
> +
> +#define DRV_NAME    "xen-platform-pci"
> +
> +MODULE_AUTHOR("ssmith@xensource.com and stefano.stabellini@eu.citrix.com");
> +MODULE_DESCRIPTION("Xen platform PCI device");
> +MODULE_LICENSE("GPL");
> +
> +static unsigned long platform_mmio;
> +static unsigned long platform_mmio_alloc;
> +static unsigned long platform_mmiolen;
> +
> +unsigned long alloc_xen_mmio(unsigned long len)
> +{
> +	unsigned long addr;
> +
> +	addr = platform_mmio + platform_mmio_alloc;
> +	platform_mmio_alloc += len;
> +	BUG_ON(platform_mmio_alloc > platform_mmiolen);
> +
> +	return addr;
> +}
> +
> +static uint64_t get_callback_via(struct pci_dev *pdev)
> +{
> +	u8 pin;
> +	int irq;
> +
> +	irq = pdev->irq;
> +	if (irq < 16)
> +		return irq; /* ISA IRQ */
> +
> +	pin = pdev->pin;
> +
> +	/* We don't know the GSI. Specify the PCI INTx line instead. */
> +	return ((uint64_t)0x01 << 56) | /* PCI INTx identifier */
> +		((uint64_t)pci_domain_nr(pdev->bus) << 32) |
> +		((uint64_t)pdev->bus->number << 16) |
> +		((uint64_t)(pdev->devfn & 0xff) << 8) |
> +		((uint64_t)(pin - 1) & 3);
> +}
> +
> +static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
> +{
> +	xen_hvm_evtchn_do_upcall(get_irq_regs());
> +	return IRQ_HANDLED;
> +}
> +
> +static int xen_allocate_irq(struct pci_dev *pdev)
> +{
> +	return request_irq(pdev->irq, do_hvm_evtchn_intr,
> +			IRQF_DISABLED | IRQF_NOBALANCING | IRQF_TRIGGER_RISING,
> +			"xen-platform-pci", pdev);
> +}
> +
> +static int __devinit platform_pci_init(struct pci_dev *pdev,
> +				       const struct pci_device_id *ent)
> +{
> +	int i, ret;
> +	long ioaddr, iolen;
> +	long mmio_addr, mmio_len;
> +	uint64_t callback_via;
> +
> +	i = pci_enable_device(pdev);
> +	if (i)
> +		return i;
> +
> +	ioaddr = pci_resource_start(pdev, 0);
> +	iolen = pci_resource_len(pdev, 0);
> +
> +	mmio_addr = pci_resource_start(pdev, 1);
> +	mmio_len = pci_resource_len(pdev, 1);
> +
> +	if (mmio_addr == 0 || ioaddr == 0) {
> +		dev_err(&pdev->dev, "no resources found\n");
> +		ret = -ENOENT;
> +	}
> +
> +	if (request_mem_region(mmio_addr, mmio_len, DRV_NAME) == NULL) {
> +		dev_err(&pdev->dev, "MEM I/O resource 0x%lx @ 0x%lx busy\n",
> +		       mmio_addr, mmio_len);
> +		ret = -EBUSY;
> +	}
> +
> +	if (request_region(ioaddr, iolen, DRV_NAME) == NULL) {
> +		dev_err(&pdev->dev, "I/O resource 0x%lx @ 0x%lx busy\n",
> +		       iolen, ioaddr);
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	platform_mmio = mmio_addr;
> +	platform_mmiolen = mmio_len;
> +
> +	if (!xen_have_vector_callback) {
> +		ret = xen_allocate_irq(pdev);
> +		if (ret) {
> +			printk(KERN_WARNING "request_irq failed err=%d\n", ret);
> +			goto out;
> +		}
> +		callback_via = get_callback_via(pdev);
> +		ret = xen_set_callback_via(callback_via);
> +		if (ret) {
> +			printk(KERN_WARNING
> +					"Unable to set the evtchn callback err=%d\n", ret);
> +			goto out;
> +		}
> +	}
> +
> +	alloc_xen_mmio_hook = alloc_xen_mmio;
> +	platform_pci_resume_hook = platform_pci_resume;
> +	platform_pci_disable_irq_hook = platform_pci_disable_irq;
> +	platform_pci_enable_irq_hook = platform_pci_enable_irq;
>   

What's this _hook stuff for?

> +
> +	ret = gnttab_init();
> +	if (ret)
> +		goto out;
> +	ret = xenbus_probe_init();
> +	if (ret)
> +		goto out;
> +
> +out:
> +	if (ret) {
> +		release_mem_region(mmio_addr, mmio_len);
> +		release_region(ioaddr, iolen);
> +		pci_disable_device(pdev);
> +	}
> +
> +	return ret;
> +}
> +
> +#define XEN_PLATFORM_VENDOR_ID 0x5853
> +#define XEN_PLATFORM_DEVICE_ID 0x0001
> +static struct pci_device_id platform_pci_tbl[] __devinitdata = {
> +	{XEN_PLATFORM_VENDOR_ID, XEN_PLATFORM_DEVICE_ID,
> +	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> +	{0,}
> +};
> +
> +MODULE_DEVICE_TABLE(pci, platform_pci_tbl);
> +
> +static struct pci_driver platform_driver = {
> +	name:     DRV_NAME,
> +	probe :    platform_pci_init,
> +	id_table : platform_pci_tbl,
> +};
> +
> +static int __init platform_pci_module_init(void)
> +{
> +	int rc;
> +
> +	if (!xen_platform_pci)
> +		return -ENODEV;
> +
> +	rc = pci_register_driver(&platform_driver);
> +	if (rc) {
> +		printk(KERN_INFO DRV_NAME
> +		       ": No platform pci device model found\n");
> +		return rc;
> +	}
> +	return 0;
> +}
> +
> +module_init(platform_pci_module_init);
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index 0b05b62..dc6ed06 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -782,16 +782,24 @@ void xenbus_probe(struct work_struct *unused)
>  	blocking_notifier_call_chain(&xenstore_chain, 0, NULL);
>  }
>  
> -static int __init xenbus_probe_init(void)
> +static int __init __xenbus_probe_init(void)
> +{
> +	/* Delay initialization in the PV on HVM case */
> +	if (xen_hvm_domain())
> +		return 0;
> +
> +	if (!xen_pv_domain())
> +		return -ENODEV;
> +
> +	return xenbus_probe_init();
> +}
> +
> +int xenbus_probe_init(void)
>  {
>  	int err = 0;
>  
>  	DPRINTK("");
>  
> -	err = -ENODEV;
> -	if (!xen_domain())
> -		goto out_error;
> -
>  	/* Register ourselves with the kernel bus subsystem */
>  	err = bus_register(&xenbus_frontend.bus);
>  	if (err)
> @@ -857,7 +865,7 @@ static int __init xenbus_probe_init(void)
>  	return err;
>  }
>  
> -postcore_initcall(xenbus_probe_init);
> +postcore_initcall(__xenbus_probe_init);
>  
>  MODULE_LICENSE("GPL");
>  
> diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
> index a40f1cd..811cda5 100644
> --- a/include/xen/grant_table.h
> +++ b/include/xen/grant_table.h
> @@ -51,6 +51,7 @@ struct gnttab_free_callback {
>  	u16 count;
>  };
>  
> +int gnttab_init(void);
>  int gnttab_suspend(void);
>  int gnttab_resume(void);
>  
> diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
> index 39da93c..39e5717 100644
> --- a/include/xen/interface/grant_table.h
> +++ b/include/xen/interface/grant_table.h
> @@ -28,6 +28,7 @@
>  #ifndef __XEN_PUBLIC_GRANT_TABLE_H__
>  #define __XEN_PUBLIC_GRANT_TABLE_H__
>  
> +#include <xen/interface/xen.h>
>  
>  /***********************************
>   * GRANT TABLE REPRESENTATION
> diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> index f39f4d3..59a120c 100644
> --- a/include/xen/platform_pci.h
> +++ b/include/xen/platform_pci.h
> @@ -29,4 +29,13 @@
>  #define XEN_IOPORT_LINUX_PRODNUM 0xffff
>  #define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
>  
> +#ifdef CONFIG_XEN_PLATFORM_PCI
> +unsigned long alloc_xen_mmio(unsigned long len);
> +#else
> +static inline unsigned long alloc_xen_mmio(unsigned long len)
> +{
> +	return ~0UL;
> +}
>   

Why is this stub needed?

> +#endif
> +
>  #endif /* _XEN_PLATFORM_PCI_H */
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index 43e2d7d..ffa97de 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -174,6 +174,7 @@ void unregister_xenbus_watch(struct xenbus_watch *watch);
>  void xs_suspend(void);
>  void xs_resume(void);
>  void xs_suspend_cancel(void);
> +int xenbus_probe_init(void);
>  
>  /* Used by xenbus_dev to borrow kernel's store connection. */
>  void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg);
>   

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 07/12] Add suspend\resume support for PV on HVM guests.
  2010-05-18 10:23 ` [PATCH 07/12] Add suspend\resume support for PV on HVM guests Stefano Stabellini
@ 2010-05-18 18:11   ` Jeremy Fitzhardinge
  2010-05-19 14:18     ` Stefano Stabellini
  0 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:11 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: linux-kernel, xen-devel, Don Dutile, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:

"/"

Please describe what's needed to support suspend/resume.  Is this a
normal x86 ACPI suspend/resume, or a Xen save/restore?

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  arch/x86/xen/enlighten.c          |    9 ++--
>  arch/x86/xen/suspend.c            |    6 ++
>  arch/x86/xen/xen-ops.h            |    3 +
>  drivers/xen/manage.c              |   95 +++++++++++++++++++++++++++++++++++--
>  drivers/xen/platform-pci.c        |   29 +++++++++++-
>  drivers/xen/xenbus/xenbus_probe.c |   28 +++++++++++
>  include/xen/platform_pci.h        |    6 ++
>  include/xen/xen-ops.h             |    3 +
>  8 files changed, 170 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index aac47b0..23b8200 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1268,12 +1268,13 @@ static int init_hvm_pv_info(int *major, int *minor)
>  	return 0;
>  }
>  
> -static void __init init_shared_info(void)
> +void init_shared_info(void)
>  {
>  	struct xen_add_to_physmap xatp;
> -	struct shared_info *shared_info_page;
> +	static struct shared_info *shared_info_page = 0;
>  
> -	shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
> +	if (!shared_info_page)
> +		shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
>  	xatp.domid = DOMID_SELF;
>  	xatp.idx = 0;
>  	xatp.space = XENMAPSPACE_shared_info;
> @@ -1302,7 +1303,7 @@ void do_hvm_pv_evtchn_intr(void)
>  	xen_hvm_evtchn_do_upcall(get_irq_regs());
>  }
>  
> -static void xen_callback_vector(void)
> +void xen_callback_vector(void)
>  {
>  	uint64_t callback_via;
>  	if (xen_feature(XENFEAT_hvm_callback_vector)) {
> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
> index 987267f..86f3b45 100644
> --- a/arch/x86/xen/suspend.c
> +++ b/arch/x86/xen/suspend.c
> @@ -26,6 +26,12 @@ void xen_pre_suspend(void)
>  		BUG();
>  }
>  
> +void xen_hvm_post_suspend(int suspend_cancelled)
> +{
> +		init_shared_info();
> +		xen_callback_vector();
> +}
> +
>  void xen_post_suspend(int suspend_cancelled)
>  {
>  	xen_build_mfn_list_list();
> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> index f9153a3..caf89ee 100644
> --- a/arch/x86/xen/xen-ops.h
> +++ b/arch/x86/xen/xen-ops.h
> @@ -38,6 +38,9 @@ void xen_enable_sysenter(void);
>  void xen_enable_syscall(void);
>  void xen_vcpu_restore(void);
>  
> +void xen_callback_vector(void);
> +void init_shared_info(void);
> +
>  void __init xen_build_dynamic_phys_to_machine(void);
>  
>  void xen_init_irq_ops(void);
> diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> index 2ac4440..a73edd8 100644
> --- a/drivers/xen/manage.c
> +++ b/drivers/xen/manage.c
> @@ -8,15 +8,20 @@
>  #include <linux/sysrq.h>
>  #include <linux/stop_machine.h>
>  #include <linux/freezer.h>
> +#include <linux/pci.h>
> +#include <linux/cpumask.h>
>  
> +#include <xen/xen.h>
>  #include <xen/xenbus.h>
>  #include <xen/grant_table.h>
>  #include <xen/events.h>
>  #include <xen/hvc-console.h>
>  #include <xen/xen-ops.h>
> +#include <xen/platform_pci.h>
>  
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/page.h>
> +#include <asm/xen/hypervisor.h>
>  
>  enum shutdown_state {
>  	SHUTDOWN_INVALID = -1,
> @@ -33,10 +38,30 @@ enum shutdown_state {
>  static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
>  
>  #ifdef CONFIG_PM_SLEEP
> -static int xen_suspend(void *data)
> +static int xen_hvm_suspend(void *data)
>  {
> +	struct sched_shutdown r = { .reason = SHUTDOWN_suspend };
>  	int *cancelled = data;
> +
> +	BUG_ON(!irqs_disabled());
> +
> +	*cancelled = HYPERVISOR_sched_op(SCHEDOP_shutdown, &r);
> +
> +	xen_hvm_post_suspend(*cancelled);
> +	gnttab_resume();
> +
> +	if (!*cancelled) {
> +		xen_irq_resume();
> +		platform_pci_resume();
> +	}
> +
> +	return 0;
> +}
> +
> +static int xen_suspend(void *data)
> +{
>  	int err;
> +	int *cancelled = data;
>  
>  	BUG_ON(!irqs_disabled());
>  
> @@ -73,6 +98,53 @@ static int xen_suspend(void *data)
>  	return 0;
>  }
>  
> +static void do_hvm_suspend(void)
> +{
> +	int err;
> +	int cancelled = 1;
> +
> +	shutting_down = SHUTDOWN_SUSPEND;
> +
> +#ifdef CONFIG_PREEMPT
> +	/* If the kernel is preemptible, we need to freeze all the processes
> +	   to prevent them from being in the middle of a pagetable update
> +	   during suspend. */
> +	err = freeze_processes();
> +	if (err) {
> +		printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
> +		goto out;
> +	}
> +#endif
> +
> +	printk(KERN_DEBUG "suspending xenstore... ");
> +	xenbus_suspend();
> +	printk(KERN_DEBUG "xenstore suspended\n");
> +	platform_pci_disable_irq();
> +	
> +	err = stop_machine(xen_hvm_suspend, &cancelled, cpumask_of(0));
> +	if (err) {
> +		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
> +		cancelled = 1;
> +	}
> +
> +	platform_pci_enable_irq();
> +
> +	if (!cancelled) {
> +		xen_arch_resume();
> +		xenbus_resume();
> +	} else
> +		xs_suspend_cancel();
> +
> +	/* Make sure timer events get retriggered on all CPUs */
> +	clock_was_set();
> +
> +#ifdef CONFIG_PREEMPT
> +	thaw_processes();
> +out:
> +#endif
> +	shutting_down = SHUTDOWN_INVALID;
> +}
> +
>  static void do_suspend(void)
>  {
>  	int err;
> @@ -185,7 +257,10 @@ static void shutdown_handler(struct xenbus_watch *watch,
>  		ctrl_alt_del();
>  #ifdef CONFIG_PM_SLEEP
>  	} else if (strcmp(str, "suspend") == 0) {
> -		do_suspend();
> +		if (xen_hvm_domain())
> +			do_hvm_suspend();
>   

Why does HVM come via this path?  Wouldn't ACPI S3 be a better match for
HVM?  Does this make sure the full device model suspend/resume callbacks
get called?  Previously I think we cut corners because we knew there
wouldn't be any PCI devices in the system...

And if the full device model is being used properly, then can't all this
hvm-specific stuff be done in the platform pci driver itself, rather
than here?  Is checkpoint the issue?  (Is checkpointing hvm domains
supported?)

> +		else
> +			do_suspend();
>  #endif
>  	} else {
>  		printk(KERN_INFO "Ignoring shutdown request: %s\n", str);
> @@ -261,7 +336,19 @@ static int shutdown_event(struct notifier_block *notifier,
>  	return NOTIFY_DONE;
>  }
>  
> -static int __init setup_shutdown_event(void)
> +static int __init __setup_shutdown_event(void)
> +{
> +	/* Delay initialization in the PV on HVM case */
> +	if (xen_hvm_domain())
> +		return 0;
> +
> +	if (!xen_pv_domain())
> +		return -ENODEV;
> +
> +	return xen_setup_shutdown_event();
> +}
> +
> +int xen_setup_shutdown_event(void)
>  {
>  	static struct notifier_block xenstore_notifier = {
>  		.notifier_call = shutdown_event
> @@ -271,4 +358,4 @@ static int __init setup_shutdown_event(void)
>  	return 0;
>  }
>  
> -subsys_initcall(setup_shutdown_event);
> +subsys_initcall(__setup_shutdown_event);
> diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> index 7a8da66..b15f809 100644
> --- a/drivers/xen/platform-pci.c
> +++ b/drivers/xen/platform-pci.c
> @@ -33,6 +33,7 @@
>  #include <xen/xenbus.h>
>  #include <xen/events.h>
>  #include <xen/hvm.h>
> +#include <xen/xen-ops.h>
>  
>  #define DRV_NAME    "xen-platform-pci"
>  
> @@ -43,6 +44,8 @@ MODULE_LICENSE("GPL");
>  static unsigned long platform_mmio;
>  static unsigned long platform_mmio_alloc;
>  static unsigned long platform_mmiolen;
> +static uint64_t callback_via;
> +struct pci_dev *xen_platform_pdev;
>  
>  unsigned long alloc_xen_mmio(unsigned long len)
>  {
> @@ -87,13 +90,33 @@ static int xen_allocate_irq(struct pci_dev *pdev)
>  			"xen-platform-pci", pdev);
>  }
>  
> +void platform_pci_disable_irq(void)
>   

If these are non-static they need a xen_ prefix.  In fact
"platform_pci_" is too generic anyway, and they should all have xen_
prefixes.

Aside from that, why do they need to be externally callable?  Can't the
pci device's own suspend/resume handlers do this?

> +{
> +	printk(KERN_DEBUG "platform_pci_disable_irq\n");
> +	disable_irq(xen_platform_pdev->irq);
> +}
> +
> +void platform_pci_enable_irq(void)
> +{
> +	printk(KERN_DEBUG "platform_pci_enable_irq\n");
> +	enable_irq(xen_platform_pdev->irq);
> +}
> +
> +void platform_pci_resume(void)
> +{
> +	if (!xen_have_vector_callback && xen_set_callback_via(callback_via)) {
> +		printk("platform_pci_resume failure!\n");
> +		return;
> +	}
> +}
> +
>  static int __devinit platform_pci_init(struct pci_dev *pdev,
>  				       const struct pci_device_id *ent)
>  {
>  	int i, ret;
>  	long ioaddr, iolen;
>  	long mmio_addr, mmio_len;
> -	uint64_t callback_via;
> +	xen_platform_pdev = pdev;
>  
>  	i = pci_enable_device(pdev);
>  	if (i)
> @@ -152,6 +175,10 @@ static int __devinit platform_pci_init(struct pci_dev *pdev,
>  	ret = xenbus_probe_init();
>  	if (ret)
>  		goto out;
> +	ret = xen_setup_shutdown_event();
> +	if (ret)
> +		goto out;
> +
>  
>  out:
>  	if (ret) {
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index dc6ed06..a679205 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -746,6 +746,34 @@ static int xenbus_dev_resume(struct device *dev)
>  	return 0;
>  }
>  
> +static int dev_suspend(struct device *dev, void *data)
> +{
> +	return xenbus_dev_suspend(dev, PMSG_SUSPEND);
> +}
> +
> +void xenbus_suspend(void)
> +{
> +	DPRINTK("");
> +
> +	bus_for_each_dev(&xenbus_frontend.bus, NULL, NULL, dev_suspend);
> +	xs_suspend();
> +}
> +EXPORT_SYMBOL_GPL(xenbus_suspend);
> +
> +static int dev_resume(struct device *dev, void *data)
> +{
> +	return xenbus_dev_resume(dev);
> +}
> +
> +void xenbus_resume(void)
> +{
> +	DPRINTK("");
> +
> +	xs_resume();
> +	bus_for_each_dev(&xenbus_frontend.bus, NULL, NULL, dev_resume);
> +}
> +EXPORT_SYMBOL_GPL(xenbus_resume);
> +
>  /* A flag to determine if xenstored is 'ready' (i.e. has started) */
>  int xenstored_ready = 0;
>  
> diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> index 59a120c..ced434d 100644
> --- a/include/xen/platform_pci.h
> +++ b/include/xen/platform_pci.h
> @@ -31,11 +31,17 @@
>  
>  #ifdef CONFIG_XEN_PLATFORM_PCI
>  unsigned long alloc_xen_mmio(unsigned long len);
> +void platform_pci_resume(void);
> +void platform_pci_disable_irq(void);
> +void platform_pci_enable_irq(void);
>  #else
>  static inline unsigned long alloc_xen_mmio(unsigned long len)
>  {
>  	return ~0UL;
>  }
> +static inline void platform_pci_resume(void) {}
> +static inline void platform_pci_disable_irq(void) {}
> +static inline void platform_pci_enable_irq(void) {}
>  #endif
>  
>  #endif /* _XEN_PLATFORM_PCI_H */
> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
> index 883a21b..46bc81e 100644
> --- a/include/xen/xen-ops.h
> +++ b/include/xen/xen-ops.h
> @@ -7,6 +7,7 @@ DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
>  
>  void xen_pre_suspend(void);
>  void xen_post_suspend(int suspend_cancelled);
> +void xen_hvm_post_suspend(int suspend_cancelled);
>  
>  void xen_mm_pin_all(void);
>  void xen_mm_unpin_all(void);
> @@ -14,4 +15,6 @@ void xen_mm_unpin_all(void);
>  void xen_timer_resume(void);
>  void xen_arch_resume(void);
>  
> +int xen_setup_shutdown_event(void);
> +
>  #endif /* INCLUDE_XEN_OPS_H */
>   


* Re: [PATCH 08/12] Allow xen platform pci device to be compiled as a module
  2010-05-18 10:23 ` [PATCH 08/12] Allow xen platform pci device to be compiled as a module Stefano Stabellini
@ 2010-05-18 18:15   ` Jeremy Fitzhardinge
  2010-05-19 14:19     ` Stefano Stabellini
  0 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:15 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: linux-kernel, xen-devel, Don Dutile, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:

All the _hook stuff looks wrong. (below)

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  arch/x86/xen/enlighten.c          |    3 +++
>  drivers/xen/events.c              |    1 +
>  drivers/xen/grant-table.c         |    6 +++++-
>  drivers/xen/manage.c              |   14 +++++++++++---
>  drivers/xen/xenbus/xenbus_probe.c |    1 +
>  include/xen/platform_pci.h        |   18 ++++--------------
>  6 files changed, 25 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 23b8200..77ba321 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -85,7 +85,9 @@ struct shared_info xen_dummy_shared_info;
>  void *xen_initial_gdt;
>  
>  int xen_have_vector_callback;
> +EXPORT_SYMBOL_GPL(xen_have_vector_callback);
>  int xen_platform_pci;
> +EXPORT_SYMBOL_GPL(xen_platform_pci);
>  static int unplug;
>  
>  /*
> @@ -1297,6 +1299,7 @@ int xen_set_callback_via(uint64_t via)
>  	a.value = via;
>  	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
>  }
> +EXPORT_SYMBOL_GPL(xen_set_callback_via);
>  
>  void do_hvm_pv_evtchn_intr(void)
>  {
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index cfc6d96..4840a03 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -695,6 +695,7 @@ void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
>  {
>  	__xen_evtchn_do_upcall(regs);
>  }
> +EXPORT_SYMBOL_GPL(xen_hvm_evtchn_do_upcall);
>  
>  /* Rebind a new event channel to an existing irq. */
>  void rebind_evtchn_irq(int evtchn, int irq)
> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> index 6f5f3ba..f936d30 100644
> --- a/drivers/xen/grant-table.c
> +++ b/drivers/xen/grant-table.c
> @@ -56,6 +56,9 @@
>  #define GNTTAB_LIST_END 0xffffffff
>  #define GREFS_PER_GRANT_FRAME (PAGE_SIZE / sizeof(struct grant_entry))
>  
> +unsigned long (*alloc_xen_mmio_hook)(unsigned long len);
> +EXPORT_SYMBOL_GPL(alloc_xen_mmio_hook);
> +
>  static grant_ref_t **gnttab_list;
>  static unsigned int nr_grant_frames;
>  static unsigned int boot_max_nr_grant_frames;
> @@ -514,7 +517,7 @@ int gnttab_resume(void)
>  		return gnttab_map(0, nr_grant_frames - 1);
>  
>  	if (!hvm_pv_resume_frames) {
> -		hvm_pv_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
> +		hvm_pv_resume_frames = alloc_xen_mmio_hook(PAGE_SIZE * max_nr_gframes);
>   

This looks like it should be restructured so the pci device driver
itself is doing this mapping, then calling into the grant subsystem to
tell it where the mapping is.

>  		shared = ioremap(hvm_pv_resume_frames, PAGE_SIZE * max_nr_gframes);
>  		if (shared == NULL) {
>  			printk(KERN_WARNING
> @@ -600,6 +603,7 @@ int gnttab_init(void)
>  	kfree(gnttab_list);
>  	return -ENOMEM;
>  }
> +EXPORT_SYMBOL_GPL(gnttab_init);
>  
>  static int __devinit __gnttab_init(void)
>  {
> diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> index a73edd8..49ee52d 100644
> --- a/drivers/xen/manage.c
> +++ b/drivers/xen/manage.c
> @@ -34,6 +34,13 @@ enum shutdown_state {
>  	 SHUTDOWN_HALT = 4,
>  };
>  
> +void (*platform_pci_resume_hook)(void);
> +EXPORT_SYMBOL_GPL(platform_pci_resume_hook);
> +void (*platform_pci_disable_irq_hook)(void);
> +EXPORT_SYMBOL_GPL(platform_pci_disable_irq_hook);
> +void (*platform_pci_enable_irq_hook)(void);
> +EXPORT_SYMBOL_GPL(platform_pci_enable_irq_hook);
>   

If all this _hook stuff is here to support a modular xen platform pci
device, then something has gone wrong.  The device should be doing this
via its own suspend/resume handlers.
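
A minimal user-space sketch of that alternative, in which the device
driver owns its suspend/resume work instead of manage.c calling through
exported function pointers.  All names here (pm_ops, model_device,
core_suspend_all, platform_state) are hypothetical stand-ins, not kernel
APIs.  A device with no driver bound is simply skipped, so there is no
NULL hook to call by mistake.

```c
#include <stddef.h>

/* Hypothetical stand-in for a driver's power-management callbacks. */
struct pm_ops {
	int (*suspend)(void *priv);
	int (*resume)(void *priv);
};

/* Hypothetical device record: the core only sees the ops table. */
struct model_device {
	const struct pm_ops *ops;	/* NULL while no driver is bound */
	void *priv;
};

/* Core code: suspend every device that has a bound driver. */
int core_suspend_all(struct model_device *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].ops && devs[i].ops->suspend) {
			int err = devs[i].ops->suspend(devs[i].priv);
			if (err)
				return err;
		}
	}
	return 0;
}

/* Core code: resume in the reverse order of suspend. */
int core_resume_all(struct model_device *devs, size_t n)
{
	for (size_t i = n; i-- > 0; ) {
		if (devs[i].ops && devs[i].ops->resume) {
			int err = devs[i].ops->resume(devs[i].priv);
			if (err)
				return err;
		}
	}
	return 0;
}

/* Model of the platform device's private state. */
struct platform_state {
	int irq_enabled;
	int callback_via_set;
};

/* The driver masks its own interrupt on suspend... */
static int platform_suspend(void *priv)
{
	struct platform_state *s = priv;

	s->irq_enabled = 0;
	return 0;
}

/* ...and re-registers the callback and unmasks it on resume. */
static int platform_resume(void *priv)
{
	struct platform_state *s = priv;

	s->callback_via_set = 1;
	s->irq_enabled = 1;
	return 0;
}

static const struct pm_ops platform_pm_ops = {
	.suspend = platform_suspend,
	.resume  = platform_resume,
};
```

With this shape the core never needs to know that a platform device
exists, which is what makes a modular driver straightforward.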

> +
>  /* Ignore multiple shutdown requests. */
>  static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
>  
> @@ -52,7 +59,7 @@ static int xen_hvm_suspend(void *data)
>  
>  	if (!*cancelled) {
>  		xen_irq_resume();
> -		platform_pci_resume();
> +		platform_pci_resume_hook();
>  	}
>  
>  	return 0;
> @@ -119,7 +126,7 @@ static void do_hvm_suspend(void)
>  	printk(KERN_DEBUG "suspending xenstore... ");
>  	xenbus_suspend();
>  	printk(KERN_DEBUG "xenstore suspended\n");
> -	platform_pci_disable_irq();
> +	platform_pci_disable_irq_hook();
>  	
>  	err = stop_machine(xen_hvm_suspend, &cancelled, cpumask_of(0));
>  	if (err) {
> @@ -127,7 +134,7 @@ static void do_hvm_suspend(void)
>  		cancelled = 1;
>  	}
>  
> -	platform_pci_enable_irq();
> +	platform_pci_enable_irq_hook();
>  
>  	if (!cancelled) {
>  		xen_arch_resume();
> @@ -357,5 +364,6 @@ int xen_setup_shutdown_event(void)
>  
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(xen_setup_shutdown_event);
>  
>  subsys_initcall(__setup_shutdown_event);
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index a679205..f83e083 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -892,6 +892,7 @@ int xenbus_probe_init(void)
>    out_error:
>  	return err;
>  }
> +EXPORT_SYMBOL_GPL(xenbus_probe_init);
>  
>  postcore_initcall(__xenbus_probe_init);
>  
> diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> index ced434d..c3c2527 100644
> --- a/include/xen/platform_pci.h
> +++ b/include/xen/platform_pci.h
> @@ -29,19 +29,9 @@
>  #define XEN_IOPORT_LINUX_PRODNUM 0xffff
>  #define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
>  
> -#ifdef CONFIG_XEN_PLATFORM_PCI
> -unsigned long alloc_xen_mmio(unsigned long len);
> -void platform_pci_resume(void);
> -void platform_pci_disable_irq(void);
> -void platform_pci_enable_irq(void);
> -#else
> -static inline unsigned long alloc_xen_mmio(unsigned long len)
> -{
> -	return ~0UL;
> -}
> -static inline void platform_pci_resume(void) {}
> -static inline void platform_pci_disable_irq(void) {}
> -static inline void platform_pci_enable_irq(void) {}
> -#endif
> +extern unsigned long (*alloc_xen_mmio_hook)(unsigned long len);
> +extern void (*platform_pci_resume_hook)(void);
> +extern void (*platform_pci_disable_irq_hook)(void);
> +extern void (*platform_pci_enable_irq_hook)(void);
>  
>  #endif /* _XEN_PLATFORM_PCI_H */
>   


* Re: [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC
  2010-05-18 10:23 ` [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC Stefano Stabellini
@ 2010-05-18 18:15   ` Jeremy Fitzhardinge
  2010-05-19 14:25     ` Stefano Stabellini
  0 siblings, 1 reply; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:15 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: linux-kernel, xen-devel, Don Dutile, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> Make sure chip_data is not NULL before accessing it.
>   

You should clarify under what circumstances it can be legitimately NULL.

    J

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  arch/x86/kernel/apic/io_apic.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index eb2789c..c64499c 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -1732,6 +1732,8 @@ __apicdebuginit(void) print_IO_APIC(void)
>  		struct irq_pin_list *entry;
>  
>  		cfg = desc->chip_data;
> +		if (!cfg)
> +			continue;
>  		entry = cfg->irq_2_pin;
>  		if (!entry)
>  			continue;
>   


* Re: [PATCH 11/12] Support VIRQ_TIMER and pvclock on HVM
  2010-05-18 10:23 ` [PATCH 11/12] Support VIRQ_TIMER and pvclock on HVM Stefano Stabellini
@ 2010-05-18 18:23   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:23 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: linux-kernel, xen-devel, Don Dutile, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:

Please describe what you're doing, why it's useful and how it works.

> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  arch/x86/xen/enlighten.c         |   39 +++++++++++++++++++++++++++++++++++++-
>  arch/x86/xen/time.c              |    3 ++
>  drivers/xen/manage.c             |    1 +
>  include/xen/interface/features.h |    3 ++
>  4 files changed, 45 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 77ba321..41677fe 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1274,6 +1274,7 @@ void init_shared_info(void)
>  {
>  	struct xen_add_to_physmap xatp;
>  	static struct shared_info *shared_info_page = 0;
> +	int cpu;
>  
>  	if (!shared_info_page)
>  		shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
> @@ -1288,7 +1289,42 @@ void init_shared_info(void)
>  
>  	/* Don't do the full vcpu_info placement stuff until we have a
>  	   possible map and a non-dummy shared_info. */
> -	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
> +	/* This code is run at resume time so make sure all the online cpus
> +	 * have xen_vcpu properly set */
>   

Why is this necessary? Can the vcpu structures move on resume?

> +	for_each_online_cpu(cpu)
> +		per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
> +}
> +
> +static void xen_hvm_setup_cpu_clockevents(void)
> +{
> +	int cpu = smp_processor_id();
> +	xen_setup_timer(cpu);
> +	per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
> +	xen_setup_cpu_clockevents();
> +}
> +
> +static void init_hvm_time(void)
> +{
> +#ifdef CONFIG_SMP
> +	/* vector callback is needed otherwise we cannot receive interrupts
> +	 * on cpu > 0 */
> +	if (!xen_have_vector_callback)
> +		return;
>   

Putting this check in CONFIG_SMP is more or less pointless, since
effectively every kernel is built with SMP enabled now.  Can you either
check whether the max CPUs is 1, or just make it always depend on vector
callback, even on a UP kernel/domain?
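
A minimal sketch of the first option, written as plain C so it can be
tried outside the kernel.  The helper name pv_timer_usable and the
nr_possible_cpus parameter are hypothetical; have_vector_callback stands
in for xen_have_vector_callback.  It assumes, as the #ifdef in the patch
implies, that a single-CPU guest can manage without the vector callback.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide at run time whether the PV timer can be
 * used, instead of deciding at build time with CONFIG_SMP.
 */
bool pv_timer_usable(bool have_vector_callback, unsigned int nr_possible_cpus)
{
	/* A single-CPU guest only ever takes interrupts on CPU 0. */
	if (nr_possible_cpus <= 1)
		return true;

	/* With more CPUs, per-CPU delivery needs the vector callback. */
	return have_vector_callback;
}
```

The second option is simpler still: drop the CPU count and return
have_vector_callback unconditionally.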

> +#endif
> +	if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
> +		printk(KERN_WARNING "Xen doesn't support pvclock on HVM, "
> +				"disable pv timer\n");
>   

What does this mean?  Is it asking the user to do something?

> +		return;
> +	}
> +
> +	pv_time_ops = xen_time_ops;
> +	x86_init.timers.timer_init = xen_time_init;
> +	x86_init.timers.setup_percpu_clockev = x86_init_noop;
> +	x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
> +
> +	x86_platform.calibrate_tsc = xen_tsc_khz;
> +	x86_platform.get_wallclock = xen_get_wallclock;
> +	x86_platform.set_wallclock = xen_set_wallclock;
>  }
>  
>  int xen_set_callback_via(uint64_t via)
> @@ -1373,6 +1409,7 @@ void __init xen_guest_init(void)
>  		outw(unplug, XEN_IOPORT_UNPLUG);
>  	have_vcpu_info_placement = 0;
>  	x86_init.irqs.intr_init = xen_init_IRQ;
> +	init_hvm_time();
>  }
>  
>  static int __init parse_unplug(char *arg)
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index 32764b8..620e68f 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> @@ -19,6 +19,7 @@
>  #include <asm/xen/hypervisor.h>
>  #include <asm/xen/hypercall.h>
>  
> +#include <xen/xen.h>
>  #include <xen/events.h>
>  #include <xen/interface/xen.h>
>  #include <xen/interface/vcpu.h>
> @@ -470,6 +471,8 @@ void xen_timer_resume(void)
>  	for_each_online_cpu(cpu) {
>  		if (HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL))
>  			BUG();
> +		if (xen_hvm_domain())
> +			xen_setup_runstate_info(cpu);
>  	}
>  }
>  
> diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> index 49ee52d..4a8af22 100644
> --- a/drivers/xen/manage.c
> +++ b/drivers/xen/manage.c
> @@ -60,6 +60,7 @@ static int xen_hvm_suspend(void *data)
>  	if (!*cancelled) {
>  		xen_irq_resume();
>  		platform_pci_resume_hook();
> +		xen_timer_resume();
>  	}
>  
>  	return 0;
> diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h
> index 8ab08b9..70d2563 100644
> --- a/include/xen/interface/features.h
> +++ b/include/xen/interface/features.h
> @@ -44,6 +44,9 @@
>  /* x86: Does this Xen host support the HVM callback vector type? */
>  #define XENFEAT_hvm_callback_vector        8
>  
> +/* x86: pvclock algorithm is safe to use on HVM */
> +#define XENFEAT_hvm_safe_pvclock           9
>   

Why is this needed?  When is it not safe?

> +
>  #define XENFEAT_NR_SUBMAPS 1
>  
>  #endif /* __XEN_PUBLIC_FEATURES_H__ */
>   


* Re: [PATCH 12/12] Initialize xenbus device structs with ENODEV as default state
  2010-05-18 10:23 ` [PATCH 12/12] Initialize xenbus device structs with ENODEV as default state Stefano Stabellini
@ 2010-05-18 18:28   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:28 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: linux-kernel, xen-devel, Don Dutile, Sheng Yang

On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> From: Don Dutile <ddutile@redhat.com>
>
> this way if xenbus isn't configured in a FV xen guest,
>   

What does "FV" mean here?  Do you mean HVM?

> loading pv drivers (like netfront) won't crash the guest.
>   

No, this is way too hacky.  Is the issue that xenbus can't handle
drivers registering with it before it has been initialized?  How do the
PCI, USB, etc. bus implementations deal with this?

    J

> Signed-off-by: Don Dutile <ddutile@redhat.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  drivers/xen/xenbus/xenbus_probe.c |   29 +++++++++++++++++++++++++----
>  drivers/xen/xenbus/xenbus_probe.h |    1 +
>  2 files changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index f83e083..5e8dae6 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -188,6 +188,11 @@ static struct xen_bus_type xenbus_frontend = {
>  	.levels = 2, 		/* device/type/<id> */
>  	.get_bus_id = frontend_bus_id,
>  	.probe = xenbus_probe_frontend,
> +	/* 
> +	 * to ensure loading pv-on-hvm drivers on FV guest
> +	 * doesn't blow up trying to use uninit'd xenbus.
> +	 */
> +	.error = -ENODEV,
>  	.bus = {
>  		.name      = "xen",
>  		.match     = xenbus_match,
> @@ -352,6 +357,9 @@ int xenbus_register_driver_common(struct xenbus_driver *drv,
>  				  struct module *owner,
>  				  const char *mod_name)
>  {
> +	if (bus->error)
> +		return bus->error;
> +
>  	drv->driver.name = drv->name;
>  	drv->driver.bus = &bus->bus;
>  	drv->driver.owner = owner;
> @@ -484,8 +492,12 @@ int xenbus_probe_node(struct xen_bus_type *bus,
>  	struct xenbus_device *xendev;
>  	size_t stringlen;
>  	char *tmpstring;
> +	enum xenbus_state state;
> +
> +	if (bus->error)
> +		return bus->error;
>  
> -	enum xenbus_state state = xenbus_read_driver_state(nodename);
> +	state = xenbus_read_driver_state(nodename);
>  
>  	if (state != XenbusStateInitialising) {
>  		/* Device is not new, so ignore it.  This can happen if a
> @@ -593,6 +605,9 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>  	char **dir;
>  	unsigned int i, dir_n;
>  
> +	if (bus->error)
> +		return bus->error;
> +
>  	dir = xenbus_directory(XBT_NIL, bus->root, "", &dir_n);
>  	if (IS_ERR(dir))
>  		return PTR_ERR(dir);
> @@ -636,7 +651,7 @@ void xenbus_dev_changed(const char *node, struct xen_bus_type *bus)
>  	char type[XEN_BUS_ID_SIZE];
>  	const char *p, *root;
>  
> -	if (char_count(node, '/') < 2)
> +	if (bus->error || char_count(node, '/') < 2)
>  		return;
>  
>  	exists = xenbus_exists(XBT_NIL, node, "");
> @@ -829,8 +844,8 @@ int xenbus_probe_init(void)
>  	DPRINTK("");
>  
>  	/* Register ourselves with the kernel bus subsystem */
> -	err = bus_register(&xenbus_frontend.bus);
> -	if (err)
> +	xenbus_frontend.error = bus_register(&xenbus_frontend.bus);
> +	if (xenbus_frontend.error)
>  		goto out_error;
>  
>  	err = xenbus_backend_bus_register();
> @@ -923,6 +938,9 @@ static int is_device_connecting(struct device *dev, void *data)
>  
>  static int exists_connecting_device(struct device_driver *drv)
>  {
> +	if (xenbus_frontend.error)
> +		return xenbus_frontend.error;
> +
>  	return bus_for_each_dev(&xenbus_frontend.bus, NULL, drv,
>  				is_device_connecting);
>  }
> @@ -1002,6 +1020,9 @@ static void wait_for_devices(struct xenbus_driver *xendrv)
>  #ifndef MODULE
>  static int __init boot_wait_for_devices(void)
>  {
> +	if (xenbus_frontend.error)
> +		return xenbus_frontend.error;
> +
>  	ready_to_wait_for_devices = 1;
>  	wait_for_devices(NULL);
>  	return 0;
> diff --git a/drivers/xen/xenbus/xenbus_probe.h b/drivers/xen/xenbus/xenbus_probe.h
> index 6c5e318..15febe4 100644
> --- a/drivers/xen/xenbus/xenbus_probe.h
> +++ b/drivers/xen/xenbus/xenbus_probe.h
> @@ -53,6 +53,7 @@ static inline void xenbus_backend_bus_unregister(void) {}
>  struct xen_bus_type
>  {
>  	char *root;
> +	int error;
>  	unsigned int levels;
>  	int (*get_bus_id)(char bus_id[XEN_BUS_ID_SIZE], const char *nodename);
>  	int (*probe)(const char *type, const char *dir);
>   
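
The gating approach taken by the patch above can be modelled in a few
lines of user-space C.  This is only a sketch: model_bus,
model_bus_init and model_register_driver are hypothetical names, and the
register_result argument stands in for the return value of
bus_register().

```c
#include <errno.h>

/* Hypothetical model of a bus that is unusable until initialised. */
struct model_bus {
	int error;	/* 0 once the bus is registered, -errno otherwise */
	int nr_drivers;
};

/* Mirrors ".error = -ENODEV" in the static initialiser of the patch. */
#define MODEL_BUS_INIT { .error = -ENODEV, .nr_drivers = 0 }

/* Record the outcome of bus registration in the gate. */
int model_bus_init(struct model_bus *bus, int register_result)
{
	bus->error = register_result;
	return bus->error;
}

/* Every entry point checks the gate before touching the bus. */
int model_register_driver(struct model_bus *bus)
{
	if (bus->error)
		return bus->error;

	bus->nr_drivers++;
	return 0;
}
```

The cost of this design is that every entry point has to remember the
check, which is the kind of scattering the review objects to.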


* Re: Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-18 10:55 ` [PATCH 0 of 12] PV on HVM Xen Christian Tramnitz
@ 2010-05-18 18:41   ` Jeremy Fitzhardinge
  2010-05-24 17:28   ` Stefano Stabellini
  1 sibling, 0 replies; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-18 18:41 UTC (permalink / raw)
  To: Christian Tramnitz; +Cc: xen-devel

On 05/18/2010 03:55 AM, Christian Tramnitz wrote:
> Hi Stefano,
>
> what are the particular advantages of running PVonHVM vs. traditional
> PV (vs pure HVM)?
> I'd like to update the wiki with some info about it...

That's an interesting question.  It probably has two dimensions: ease of
deployment and performance.  Some users may just find it easier to
deploy a guest image in pure HVM mode since it's most similar to the
native environment (though with pvgrub these differences are fairly
small).  Performance is more complex, as it depends on both your
workload and the capabilities of the processor you're running on.

So, try it and see ;)

    J

>
>
>
> Thanks,
>    Christian
>
> Am 18.05.2010 12:22, schrieb Stefano Stabellini:
>> Hi all,
>> this is the fixed, updated and rebased version of the PV on HVM series:
>> the series is based on 2.6.34 now and supports Xen PV frontends running
>> in a HVM domain, including netfront, blkfront and the VIRQ_TIMER.
>>
>> The list of bugs fixed in this update includes: xenbus drivers crashes
>> when xenbus is not properly initialized, a memory corruption bug in
>> suspend/resume and testing for the xen platform pci version and protocol
>> has been moved to enlighten.c (before unplugging emulated devices).
>>
>> In order to be able to use VIRQ_TIMER and to improve performances you
>> need a patch to Xen to implement the vector callback mechanism
>> for event channel delivery.
>>
>> A git tree is also available here:
>>
>> git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git
>>
>> branch name 2.6.34-pvhvm.
>>
>> Cheers,
>>
>> Stefano
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>


* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 17:17   ` Jeremy Fitzhardinge
@ 2010-05-19 12:24     ` Stefano Stabellini
  2010-05-19 18:19       ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 12:24 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> On 05/18/2010 03:22 AM, Stefano Stabellini wrote:
> > From: Sheng Yang <sheng@linux.intel.com>
> >
> > Set the callback to receive evtchns from Xen, using the
> > callback vector delivery mechanism.
> >   
> 
> > Could you expand on this a little?  Like, why is this desirable?  What
> functional difference does it make?  Is this patch useful in its own
> right, or is it just laying the groundwork for something else?
> 

In order to use PV frontends on HVM we need some way to receive
notifications of event channel deliveries.
Using the callback vector is the preferred way because it does not
depend on any (emulated) PCI device, all the vcpus can receive these
callbacks, and in theory there is no need to interact with the emulated
lapic (even though at the moment we do so anyway because we are using
the IPI vector).
The other way is to receive interrupts from the xen platform pci device,
but in that case interaction with the emulated lapic is unavoidable and
we are limited to receiving interrupts on vcpu 0.
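
As a rough illustration of how the vector is handed to Xen, the value
passed to xen_set_callback_via() packs a delivery type into the top
byte and the vector number into the low bits, following the
HVM_CALLBACK_VECTOR macro in include/xen/hvm.h.  The two constants
below are assumptions and should be checked against the headers; the
helper names are made up for this sketch.

```c
#include <stdint.h>

/* Assumed values; verify against include/xen/hvm.h. */
#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
#define HVM_CALLBACK_VIA_TYPE_SHIFT  56

/* Build the 64-bit "callback via" value for a given interrupt vector. */
uint64_t hvm_callback_vector(uint8_t vector)
{
	return ((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR
			<< HVM_CALLBACK_VIA_TYPE_SHIFT) | vector;
}

/* Extract the delivery type from a "callback via" value. */
unsigned int hvm_callback_via_type(uint64_t via)
{
	return (unsigned int)(via >> HVM_CALLBACK_VIA_TYPE_SHIFT);
}
```

Because the type lives in the top byte, Xen can tell a vector callback
apart from the older delivery methods by inspecting a single parameter.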


* Re: [PATCH 05/12] unplug emulated disks and nics
  2010-05-18 17:27   ` Jeremy Fitzhardinge
@ 2010-05-19 13:00     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 13:00 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> > add a xen_unplug command line option to the kernel to unplug
> > xen emulated disks and nics.
> >   
> 
> I think it would be nice to call it something like "xen_emul_unplug" to
> clarify what this actually means.  And is it really necessary to make it
> a command-line option?  Can't we unplug these things once the pv drivers
> are brought up?  What happens if the user doesn't specify this?
> 

We have to do it before PCI is initialized to make sure the unplug
doesn't create any issues; for this reason we cannot wait for the PV
frontends to come up.
Also, the PV frontends might be compiled as modules or be missing
entirely, so we cannot just unplug every emulated device as soon as we
detect that we are running on Xen.
We therefore concluded that leaving the decision to the user is the
safest approach.

If the user doesn't specify this, both the PV frontends and the drivers
for the emulated network card and disk could come up, and this might
lead to data corruption (especially in the disk case).

Probably I should prevent the xen platform pci driver from loading
(that in turn would prevent the PV frontends from loading) if the
option is not specified.

s/xen_unplug/xen_emul_unplug is fine by me.



> > +static int __init check_platform_magic(void)
> >   
> 
> I'd prefer not to put all this in enlighten.c unless it really needs to
> be here.  Given that all this is dependent on the Xen platform PCI
> device being enabled, it would probably be happy in a separate
> conditionally compiled file.

OK


> 
> > +{
> > +	short magic;
> > +	char protocol;
> > +
> > +	magic = inw(XEN_IOPORT_MAGIC);
> >   
> 
> Does this get run only once we've established we're running on Xen, or
> could this be run in an arbitrary environment?
> 

Only once we know we are running on Xen.


> > +	if (magic != XEN_IOPORT_MAGIC_VAL) {
> > +		printk(KERN_ERR "Xen Platform Pci: unrecognised magic value\n");
> > +		return -1;
> > +	}
> > +
> > +	protocol = inb(XEN_IOPORT_PROTOVER);
> > +
> > +	printk(KERN_DEBUG "Xen Platform Pci: I/O protocol version %d\n",
> >   
> 
> "PCI" please.  Also, is that really accurate since we're doing random IO
> port stuff with no obvious connection to PCI?  "Xen Platform Device"
> perhaps?  (Though given that we have a proper fake PCI device, why all
> this random IO port hackery anyway?)
> 
> 

That's because the ioports, even though they are static, still belong
to the Xen platform PCI device.
The ioports are static so that they can be probed before the PCI bus is
even initialized.
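
The handshake on those fixed ports can be sketched in user space by
replacing the port I/O with a small model of the device.  Everything
here is hypothetical apart from the return codes, which follow
check_platform_magic() in the patch: the struct, the function name, the
blacklist flag and the magic value 0x49d2 are assumptions made for the
example.

```c
#include <stdint.h>

/* Assumed value of XEN_IOPORT_MAGIC_VAL. */
#define MODEL_MAGIC_VAL 0x49d2

/* Hypothetical model of what the fixed I/O ports would return. */
struct model_ports {
	uint16_t magic;		/* read from the magic port */
	uint8_t  protocol;	/* read from the protocol version port */
	int      blacklist;	/* nonzero: host rejects this driver */
};

/*
 * Same decision structure as check_platform_magic():
 *   -1 unrecognised magic, -2 unknown protocol, -3 blacklisted, 0 ok.
 */
int model_check_platform_magic(struct model_ports *p,
			       uint16_t prodnum, uint32_t drvver)
{
	if (p->magic != MODEL_MAGIC_VAL)
		return -1;

	switch (p->protocol) {
	case 1:
		/* The guest announces its product number and version... */
		(void)prodnum;
		(void)drvver;
		/* ...and a host that rejects them withdraws the magic. */
		if (p->blacklist)
			p->magic = 0;
		if (p->magic != MODEL_MAGIC_VAL)
			return -3;
		break;
	default:
		return -2;
	}

	return 0;
}
```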


> > +			protocol);
> > +
> > +	switch (protocol) {
> > +	case 1:
> > +		outw(XEN_IOPORT_LINUX_PRODNUM, XEN_IOPORT_PRODNUM);
> > +		outl(XEN_IOPORT_LINUX_DRVVER, XEN_IOPORT_DRVVER);
> > +		if (inw(XEN_IOPORT_MAGIC) != XEN_IOPORT_MAGIC_VAL) {
> > +			printk(KERN_ERR "Xen Platform: blacklisted by host\n");
> > +			return -3;
> > +		}
> > +		break;
> > +	default:
> > +		printk(KERN_WARNING "Xen Platform Pci: unknown I/O protocol version\n");
> > +		return -2;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  void __init xen_guest_init(void)
> >  {
> >  	int r;
> > @@ -1325,6 +1362,35 @@ void __init xen_guest_init(void)
> >  
> >  	xen_callback_vector();
> >  
> > +	r = check_platform_magic();
> > +	if (!r || (r == -1 && (unplug & UNPLUG_IGNORE)))
> > +		xen_platform_pci = 1;
> > +	if (xen_platform_pci && !(unplug & UNPLUG_IGNORE))
> > +		outw(unplug, XEN_IOPORT_UNPLUG);
> >   
> 
> What does all this do?  A comment would be nice.
>

OK.
The idea is that xen_platform_pci is set to 1 (meaning: the xen platform
pci device driver is enabled) if the xen platform protocol
version matches our check or if it doesn't match because we are running
on a very old version of Xen and the user told us to continue anyway.

After my comment above on preventing the PV frontends from loading if no
unplug is done, I think the whole thing should be changed to:

/* check the version of the xen platform PCI device */
r = check_platform_magic();
/* If the version matches and the user told us to unplug the emulated
 * devices, enable the Xen platform PCI driver. 
 * Also enable the Xen platform PCI driver if the version is really old
 * and the user told us to ignore it.
 */
if ((!r && unplug) || (r == -1 && (unplug & UNPLUG_IGNORE)))
    xen_platform_pci = 1;
/* Now unplug the emulated devices */
if (xen_platform_pci && !(unplug & UNPLUG_IGNORE))
    outw(unplug, XEN_IOPORT_UNPLUG);
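
To make the intended behaviour easy to check, here is the same decision
as a small pure function that can be compiled outside the kernel.  It
is only a sketch: the struct and function names are made up, and the
value of UNPLUG_IGNORE is an assumption.

```c
#define UNPLUG_ALL_IDE_DISKS 1
#define UNPLUG_ALL_NICS      2
#define UNPLUG_AUX_IDE_DISKS 4
#define UNPLUG_IGNORE        8	/* assumed value */

/* Hypothetical result type: what xen_guest_init() would end up doing. */
struct unplug_decision {
	int enable_platform_pci;	/* would set xen_platform_pci = 1 */
	int do_unplug;			/* would write to the unplug port */
};

/*
 * r is the result of the platform magic check (0 ok, -1 bad magic),
 * unplug is the flag mask built from the command line.
 */
struct unplug_decision decide_unplug(int r, int unplug)
{
	struct unplug_decision d = { 0, 0 };

	/* Enable on a version match plus an explicit request, or on a
	 * very old Xen when the user asked to ignore the check. */
	if ((r == 0 && unplug) || (r == -1 && (unplug & UNPLUG_IGNORE)))
		d.enable_platform_pci = 1;

	/* Only touch the emulated devices if not told to ignore. */
	if (d.enable_platform_pci && !(unplug & UNPLUG_IGNORE))
		d.do_unplug = 1;

	return d;
}
```

In this model a guest booted without the option never enables the
platform driver, so the PV frontends and the emulated drivers cannot
both claim the same disk.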


> >  	have_vcpu_info_placement = 0;
> >  	x86_init.irqs.intr_init = xen_init_IRQ;
> >  }
> > +
> > +static int __init parse_unplug(char *arg)
> > +{
> > +	char *p, *q;
> > +
> > +	for (p = arg; p; p = q) {
> > +		q = strchr(p, ',');
> > +		if (q)
> > +			*q++ = '\0';
> > +		if (!strcmp(p, "all"))
> > +			unplug |= UNPLUG_ALL;
> > +		else if (!strcmp(p, "ide-disks"))
> > +			unplug |= UNPLUG_ALL_IDE_DISKS;
> > +		else if (!strcmp(p, "aux-ide-disks"))
> > +			unplug |= UNPLUG_AUX_IDE_DISKS;
> > +		else if (!strcmp(p, "nics"))
> > +			unplug |= UNPLUG_ALL_NICS;
> > +		else
> > +			printk(KERN_WARNING "unrecognised option '%s' "
> > +				 "in module parameter 'dev_unplug'\n", p);
> >   
> 
> "xen_unplug" (or whatever it becomes).

OK


> 
> > +	}
> > +	return 0;
> > +}
> > +early_param("xen_unplug", parse_unplug);
> >   
> 
> If we must have this kernel command line parameter, make sure you update
> Documentation/kernel-parameters.txt.
> 

Sure
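
For the documentation it may help to have the option syntax in a form
that can be tested in user space.  The sketch below parses the same
comma-separated list as parse_unplug; the function name, the unknown
counter and the value of UNPLUG_ALL are assumptions made for the
example.  Note that the search for the next comma starts at the current
token (p), so lists with three or more entries are split correctly.

```c
#include <string.h>

#define UNPLUG_ALL_IDE_DISKS 1
#define UNPLUG_ALL_NICS      2
#define UNPLUG_AUX_IDE_DISKS 4
#define UNPLUG_ALL           7	/* assumed: all of the above */

/*
 * Parse a writable, comma-separated option string in place.
 * Returns the flag mask; *unknown counts unrecognised tokens.
 */
int parse_unplug_options(char *arg, int *unknown)
{
	char *p, *q;
	int flags = 0;

	*unknown = 0;
	for (p = arg; p; p = q) {
		/* Split off the current token at the next comma. */
		q = strchr(p, ',');
		if (q)
			*q++ = '\0';

		if (!strcmp(p, "all"))
			flags |= UNPLUG_ALL;
		else if (!strcmp(p, "ide-disks"))
			flags |= UNPLUG_ALL_IDE_DISKS;
		else if (!strcmp(p, "aux-ide-disks"))
			flags |= UNPLUG_AUX_IDE_DISKS;
		else if (!strcmp(p, "nics"))
			flags |= UNPLUG_ALL_NICS;
		else
			(*unknown)++;
	}

	return flags;
}
```

The string is modified while it is parsed, so callers must pass a
writable buffer rather than a string literal.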


> > diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> > index 5940ee5..777d2ce 100644
> > --- a/include/xen/hvm.h
> > +++ b/include/xen/hvm.h
> > @@ -30,4 +30,6 @@ extern int xen_have_vector_callback;
> >  #define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
> >                                 HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
> >  
> > +extern int xen_platform_pci;
> > +
> >  #endif /* XEN_HVM_H__ */
> > diff --git a/include/xen/interface/platform_pci.h b/include/xen/interface/platform_pci.h
> > new file mode 100644
> > index 0000000..720eaf5
> > --- /dev/null
> > +++ b/include/xen/interface/platform_pci.h
> > @@ -0,0 +1,46 @@
> > +/******************************************************************************
> > + * platform_pci.h
> > + *
> > + * Interface for granting foreign access to page frames, and receiving
> > + * page-ownership transfers.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_PLATFORM_PCI_H__
> > +#define __XEN_PUBLIC_PLATFORM_PCI_H__
> > +
> > +#define XEN_IOPORT_BASE 0x10
> > +
> > +#define XEN_IOPORT_PLATFLAGS	(XEN_IOPORT_BASE + 0) /* 1 byte access (R/W) */
> > +#define XEN_IOPORT_MAGIC	(XEN_IOPORT_BASE + 0) /* 2 byte access (R) */
> > +#define XEN_IOPORT_UNPLUG	(XEN_IOPORT_BASE + 0) /* 2 byte access (W) */
> > +#define XEN_IOPORT_DRVVER	(XEN_IOPORT_BASE + 0) /* 4 byte access (W) */
> > +
> > +#define XEN_IOPORT_SYSLOG	(XEN_IOPORT_BASE + 2) /* 1 byte access (W) */
> > +#define XEN_IOPORT_PROTOVER	(XEN_IOPORT_BASE + 2) /* 1 byte access (R) */
> > +#define XEN_IOPORT_PRODNUM	(XEN_IOPORT_BASE + 2) /* 2 byte access (W) */
> > +
> > +#define UNPLUG_ALL_IDE_DISKS 1
> > +#define UNPLUG_ALL_NICS 2
> > +#define UNPLUG_AUX_IDE_DISKS 4
> > +#define UNPLUG_ALL 7
> > +#define UNPLUG_IGNORE 8
> > +
> > +#endif /* __XEN_PUBLIC_PLATFORM_PCI_H__ */
> > diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> > new file mode 100644
> > index 0000000..f39f4d3
> > --- /dev/null
> > +++ b/include/xen/platform_pci.h
> > @@ -0,0 +1,32 @@
> > +/******************************************************************************
> > + * platform-pci.h
> > + *
> > + * Xen platform PCI device driver
> > + * Copyright (c) 2004, Intel Corporation. <xiaofeng.ling@intel.com>
> > + * Copyright (c) 2007, XenSource Inc.
> > + * Copyright (c) 2010, Citrix
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along with
> > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> > + * Place - Suite 330, Boston, MA 02111-1307 USA.
> > + */
> > +
> > +#ifndef _XEN_PLATFORM_PCI_H
> > +#define _XEN_PLATFORM_PCI_H
> > +
> > +#include <linux/version.h>
> > +
> > +#define XEN_IOPORT_MAGIC_VAL 0x49d2
> > +#define XEN_IOPORT_LINUX_PRODNUM 0xffff
> > +#define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
> >   
> 
> Can't these two headers be folded together?  There doesn't seem much
> point in splitting these XEN_IOPORT definitions across two files.
> 

One is supposed to be a common header file shared with the hypervisor
(actually the device model in this case); the other is the Xen platform
PCI driver header file.
I don't think they should be merged.


* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 17:43   ` Jeremy Fitzhardinge
@ 2010-05-19 13:01     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 13:01 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> >  void *xen_initial_gdt;
> >  
> > +int xen_have_vector_callback;
> >   
> 
> BTW, this can be a __read_mostly.
> 
 
I'll do that, thanks.


* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-18 18:10   ` Jeremy Fitzhardinge
@ 2010-05-19 13:08     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 13:08 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> On 05/18/2010 03:22 AM, Stefano Stabellini wrote:
> > From: Sheng Yang <sheng@linux.intel.com>
> >
> > Set the callback to receive evtchns from Xen, using the
> > callback vector delivery mechanism.
> >
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> > ---
> >  arch/x86/xen/enlighten.c         |   35 +++++++++++++++++++++++++++++++++++
> >  drivers/xen/events.c             |   31 ++++++++++++++++++++++++-------
> >  include/xen/events.h             |    3 +++
> >  include/xen/hvm.h                |    9 +++++++++
> >  include/xen/interface/features.h |    3 +++
> >  5 files changed, 74 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 87a3b10..502c4f8 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -37,8 +37,11 @@
> >  #include <xen/interface/vcpu.h>
> >  #include <xen/interface/memory.h>
> >  #include <xen/interface/hvm/hvm_op.h>
> > +#include <xen/interface/hvm/params.h>
> >  #include <xen/features.h>
> >  #include <xen/page.h>
> > +#include <xen/hvm.h>
> > +#include <xen/events.h>
> >  #include <xen/hvc-console.h>
> >  
> >  #include <asm/paravirt.h>
> > @@ -79,6 +82,8 @@ struct shared_info xen_dummy_shared_info;
> >  
> >  void *xen_initial_gdt;
> >  
> > +int xen_have_vector_callback;
> > +
> >  /*
> >   * Point at some empty memory to start with. We map the real shared_info
> >   * page as soon as fixmap is up and running.
> > @@ -1279,6 +1284,31 @@ static void __init init_shared_info(void)
> >  	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
> >  }
> >  
> > +int xen_set_callback_via(uint64_t via)
> > +{
> > +	struct xen_hvm_param a;
> > +	a.domid = DOMID_SELF;
> > +	a.index = HVM_PARAM_CALLBACK_IRQ;
> > +	a.value = via;
> > +	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
> >   
> 
> Does this implicitly set the vector delivery on all vcpus, current and
> future?
> 

Yes.

> > +}
> > +
> > +void do_hvm_pv_evtchn_intr(void)
> > +{
> > +	xen_hvm_evtchn_do_upcall(get_irq_regs());
> > +}
> > +
> > +static void xen_callback_vector(void)
> >   
> 
> All this callback vector stuff should be in drivers/xen/events.c.  It
> would also be good to give it a more descriptive name
> ("xen_set_callback_vector"?), and make it an init function.
> 

I could move it to events.c and call it at the beginning of xen_init_IRQ;
is that OK?


> > +{
> > +	uint64_t callback_via;
> > +	if (xen_feature(XENFEAT_hvm_callback_vector)) {
> > +		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
> > +		xen_set_callback_via(callback_via);
> >   
> 
> Do you need to check the return value here?  Can it possibly fail?
> 

Yes, it can fail. The vector delivery mechanism hasn't been checked into
Xen yet (I sent the patch right after this patch series).


> > +		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
> > +		xen_have_vector_callback = 1;
> > +	}
> > +}
> > +
> >  void __init xen_guest_init(void)
> >  {
> >  	int r;
> > @@ -1292,4 +1322,9 @@ void __init xen_guest_init(void)
> >  		return;
> >  
> >  	init_shared_info();
> > +
> > +	xen_callback_vector();
> > +
> > +	have_vcpu_info_placement = 0;
> > +	x86_init.irqs.intr_init = xen_init_IRQ;
> >  }
> > diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> > index db8f506..3523dbb 100644
> > --- a/drivers/xen/events.c
> > +++ b/drivers/xen/events.c
> > @@ -36,6 +36,8 @@
> >  #include <asm/xen/hypercall.h>
> >  #include <asm/xen/hypervisor.h>
> >  
> > +#include <xen/xen.h>
> > +#include <xen/hvm.h>
> >  #include <xen/xen-ops.h>
> >  #include <xen/events.h>
> >  #include <xen/interface/xen.h>
> > @@ -617,17 +619,13 @@ static DEFINE_PER_CPU(unsigned, xed_nesting_count);
> >   * a bitset of words which contain pending event bits.  The second
> >   * level is a bitset of pending events themselves.
> >   */
> > -void xen_evtchn_do_upcall(struct pt_regs *regs)
> > +void __xen_evtchn_do_upcall(struct pt_regs *regs)
> >   
> 
> Given that the regs arg is completely unused, you should drop it.
> 

OK

> >  {
> >  	int cpu = get_cpu();
> > -	struct pt_regs *old_regs = set_irq_regs(regs);
> >  	struct shared_info *s = HYPERVISOR_shared_info;
> >  	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> >   	unsigned count;
> >  
> > -	exit_idle();
> > -	irq_enter();
> > -
> >  	do {
> >  		unsigned long pending_words;
> >  
> > @@ -667,10 +665,26 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
> >  	} while(count != 1);
> >  
> >  out:
> > +
> > +	put_cpu();
> > +}
> > +
> > +void xen_evtchn_do_upcall(struct pt_regs *regs)
> > +{
> > +	struct pt_regs *old_regs = set_irq_regs(regs);
> > +
> > +	exit_idle();
> > +	irq_enter();
> > +
> > +	__xen_evtchn_do_upcall(regs);
> > +
> >  	irq_exit();
> >  	set_irq_regs(old_regs);
> > +}
> >  
> > -	put_cpu();
> > +void xen_hvm_evtchn_do_upcall(struct pt_regs *regs)
> > +{
> > +	__xen_evtchn_do_upcall(regs);
> >   
> 
> Don't you need to set_irq_regs here?

No, that was done by smp_x86_platform_ipi or do_IRQ.


* Re: [PATCH 06/12] xen pci platform device driver
  2010-05-18 18:11   ` Jeremy Fitzhardinge
@ 2010-05-19 13:50     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 13:50 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> > Add the xen pci platform device driver that is responsible
> > for initializing the grant table and xenbus in PV on HVM mode.
> > Few changes to xenbus and grant table are necessary to allow the delayed
> > initialization in HVM mode.
> > Grant table needs few additional modifications to work in HVM mode.
> >
> 
> This needs a description of how event and interrupt handling work in
> this environment.
> 

Event channel handling is still done by __xen_evtchn_do_upcall; however,
every time Xen raises an event channel it also notifies us by injecting
an interrupt (from the Xen platform PCI device) or by using the callback
vector.
Both the callback handler and the Xen platform PCI interrupt handler
call __xen_evtchn_do_upcall.


> > When running on HVM the event channel upcall is never called while in
> > progress because it is a normal Linux irq handler, therefore we cannot
> > be sure that evtchn_upcall_pending is 0 when returning.
> >
> 
> Is that because the interrupt raised by a pending event is
> edge-triggered, so that even if the event is still pending on return,
> the corresponding interrupt isn't still asserted?
> 

The interrupt is level triggered and it is handled by the fasteoi handler
in Linux, but edge or level doesn't mean much here because it is just an
emulated interrupt.
If you look at handle_fasteoi_irq you'll notice that if the same
interrupt is IRQ_INPROGRESS it just returns.


> > For this reason if evtchn_upcall_pending is set by Xen we need to loop
> > again on the event channels set pending otherwise we might lose some
> > event channel deliveries.
> >
> 
> So if the event is raised after the event processing loop but before the
> handler returns, the corresponding interrupt is still asserted so the
> interrupt handler will be re-entered immediately?  But if that's true,
> then why is an event occurring during the loop liable to get missed?
>

If an event is raised after the event processing loop but before the
handler returns, we have interrupts disabled at the time so we never
even get to handle_fasteoi_irq.
Once we re-enable interrupts we'll process the interrupt and the event as
expected.

The problem is when we receive an interrupt while we are processing an
event with interrupts enabled: in that case we receive another
interrupt, go to handle_fasteoi_irq and straight out again without
resetting evtchn_upcall_pending. Xen will then not raise any further
interrupts, and we will not read evtchn_upcall_pending again (because we
expect it to be 0), so events would be lost.


> 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> > ---
> >  drivers/xen/Kconfig                 |    8 ++
> >  drivers/xen/Makefile                |    3 +-
> >  drivers/xen/events.c                |    5 +-
> >  drivers/xen/grant-table.c           |   70 +++++++++++--
> >  drivers/xen/platform-pci.c          |  198 +++++++++++++++++++++++++++++++++++
> >  drivers/xen/xenbus/xenbus_probe.c   |   20 +++-
> >  include/xen/grant_table.h           |    1 +
> >  include/xen/interface/grant_table.h |    1 +
> >  include/xen/platform_pci.h          |    9 ++
> >  include/xen/xenbus.h                |    1 +
> >  10 files changed, 300 insertions(+), 16 deletions(-)
> >  create mode 100644 drivers/xen/platform-pci.c
> >
> > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> > index fad3df2..da312e2 100644
> > --- a/drivers/xen/Kconfig
> > +++ b/drivers/xen/Kconfig
> > @@ -62,4 +62,12 @@ config XEN_SYS_HYPERVISOR
> >        virtual environment, /sys/hypervisor will still be present,
> >        but will have no xen contents.
> >
> > +config XEN_PLATFORM_PCI
> > +     tristate "xen platform pci device driver"
> > +     depends on XEN
> > +     help
> > +       Driver for the Xen PCI Platform device: it is responsible for
> > +       initializing xenbus and grant_table when running in a Xen HVM
> > +       domain. As a consequence this driver is required to run any Xen PV
> > +       frontend on Xen HVM.
> >  endmenu
> > diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
> > index 7c28434..e392fb7 100644
> > --- a/drivers/xen/Makefile
> > +++ b/drivers/xen/Makefile
> > @@ -9,4 +9,5 @@ obj-$(CONFIG_XEN_XENCOMM)     += xencomm.o
> >  obj-$(CONFIG_XEN_BALLOON)    += balloon.o
> >  obj-$(CONFIG_XEN_DEV_EVTCHN) += evtchn.o
> >  obj-$(CONFIG_XENFS)          += xenfs/
> > -obj-$(CONFIG_XEN_SYS_HYPERVISOR)     += sys-hypervisor.o
> > \ No newline at end of file
> > +obj-$(CONFIG_XEN_SYS_HYPERVISOR)     += sys-hypervisor.o
> > +obj-$(CONFIG_XEN_PLATFORM_PCI)       += platform-pci.o
> > diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> > index a137a2f..cfc6d96 100644
> > --- a/drivers/xen/events.c
> > +++ b/drivers/xen/events.c
> > @@ -671,7 +671,7 @@ void __xen_evtchn_do_upcall(struct pt_regs *regs)
> >
> >               count = __get_cpu_var(xed_nesting_count);
> >               __get_cpu_var(xed_nesting_count) = 0;
> > -     } while(count != 1);
> > +     } while(count != 1 || vcpu_info->evtchn_upcall_pending);
> >
> 
> I still don't think I understand the need for this (or if it's needed,
> why it's correct).
> 

see above

> >
> >  out:
> >
> > @@ -731,7 +731,8 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
> >       struct evtchn_bind_vcpu bind_vcpu;
> >       int evtchn = evtchn_from_irq(irq);
> >
> > -     if (!VALID_EVTCHN(evtchn))
> > +     if (!VALID_EVTCHN(evtchn) ||
> > +             (xen_hvm_domain() && !xen_have_vector_callback))
> >
> 
> A comment would be useful here.  Is it that events delivered via IO APIC
> are always routed to vcpu 0, but PV events and vectored HVM events can
> be delivered to any vcpu?
> 

Yes, exactly.

> >               return -1;
> >
> >       /* Send future instances of this interrupt to other vcpu. */
> > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> > index f66db3b..6f5f3ba 100644
> > --- a/drivers/xen/grant-table.c
> > +++ b/drivers/xen/grant-table.c
> > @@ -37,11 +37,14 @@
> >  #include <linux/slab.h>
> >  #include <linux/vmalloc.h>
> >  #include <linux/uaccess.h>
> > +#include <linux/io.h>
> >
> >  #include <xen/xen.h>
> >  #include <xen/interface/xen.h>
> >  #include <xen/page.h>
> >  #include <xen/grant_table.h>
> > +#include <xen/platform_pci.h>
> > +#include <xen/interface/memory.h>
> >  #include <asm/xen/hypercall.h>
> >
> >  #include <asm/pgtable.h>
> > @@ -59,6 +62,7 @@ static unsigned int boot_max_nr_grant_frames;
> >  static int gnttab_free_count;
> >  static grant_ref_t gnttab_free_head;
> >  static DEFINE_SPINLOCK(gnttab_list_lock);
> > +static unsigned long hvm_pv_resume_frames;
> >
> >  static struct grant_entry *shared;
> >
> > @@ -449,6 +453,30 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
> >       unsigned int nr_gframes = end_idx + 1;
> >       int rc;
> >
> > +     if (xen_hvm_domain()) {
> > +             struct xen_add_to_physmap xatp;
> > +             unsigned int i = end_idx;
> > +             rc = 0;
> > +             /*
> > +              * Loop backwards, so that the first hypercall has the largest
> > +              * index, ensuring that the table will grow only once.
> > +              */
> > +             do {
> > +                     xatp.domid = DOMID_SELF;
> > +                     xatp.idx = i;
> > +                     xatp.space = XENMAPSPACE_grant_table;
> > +                     xatp.gpfn = (hvm_pv_resume_frames >> PAGE_SHIFT) + i;
> > +                     rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
> > +                     if (rc != 0) {
> > +                             printk(KERN_WARNING
> > +                                             "grant table add_to_physmap failed, err=%d\n", rc);
> > +                             break;
> > +                     }
> > +             } while (i-- > start_idx);
> > +
> > +             return rc;
> > +     }
> > +
> >       frames = kmalloc(nr_gframes * sizeof(unsigned long), GFP_ATOMIC);
> >       if (!frames)
> >               return -ENOMEM;
> > @@ -476,9 +504,28 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
> >
> >  int gnttab_resume(void)
> >  {
> > -     if (max_nr_grant_frames() < nr_grant_frames)
> > +     unsigned int max_nr_gframes;
> > +
> > +     max_nr_gframes = max_nr_grant_frames();
> > +     if (max_nr_gframes < nr_grant_frames)
> >               return -ENOSYS;
> > -     return gnttab_map(0, nr_grant_frames - 1);
> > +
> > +     if (xen_pv_domain())
> > +             return gnttab_map(0, nr_grant_frames - 1);
> > +
> > +     if (!hvm_pv_resume_frames) {
> > +             hvm_pv_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
> >
> 
> Can alloc_xen_mmio fail?
> Can this possibly get called with the stub
> version in place (returning ~0UL), and what does ioremap do if you pass
> that into it?
> 

No, alloc_xen_mmio cannot fail if the Xen platform PCI device driver
loaded successfully, and we wouldn't be here if it didn't. For this
reason this cannot get called with the stub version in place.


> > +             shared = ioremap(hvm_pv_resume_frames, PAGE_SIZE * max_nr_gframes);
> > +             if (shared == NULL) {
> > +                     printk(KERN_WARNING
> > +                                     "Fail to ioremap gnttab share frames\n");
> > +                     return -ENOMEM;
> > +             }
> > +     }
> > +
> > +     gnttab_map(0, nr_grant_frames - 1);
> > +
> > +     return 0;
> >  }
> >
> >  int gnttab_suspend(void)
> > @@ -505,15 +552,12 @@ static int gnttab_expand(unsigned int req_entries)
> >       return rc;
> >  }
> >
> > -static int __devinit gnttab_init(void)
> > +int gnttab_init(void)
> >  {
> >       int i;
> >       unsigned int max_nr_glist_frames, nr_glist_frames;
> >       unsigned int nr_init_grefs;
> >
> > -     if (!xen_domain())
> > -             return -ENODEV;
> > -
> >       nr_grant_frames = 1;
> >       boot_max_nr_grant_frames = __max_nr_grant_frames();
> >
> > @@ -557,4 +601,16 @@ static int __devinit gnttab_init(void)
> >       return -ENOMEM;
> >  }
> >
> > -core_initcall(gnttab_init);
> > +static int __devinit __gnttab_init(void)
> > +{
> > +     /* Delay grant-table initialization in the PV on HVM case */
> > +     if (xen_hvm_domain())
> > +             return 0;
> > +
> > +     if (!xen_pv_domain())
> > +             return -ENODEV;
> > +
> > +     return gnttab_init();
> > +}
> > +
> > +core_initcall(__gnttab_init);
> > diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> > new file mode 100644
> > index 0000000..7a8da66
> > --- /dev/null
> > +++ b/drivers/xen/platform-pci.c
> > @@ -0,0 +1,198 @@
> > +/******************************************************************************
> > + * platform-pci.c
> > + *
> > + * Xen platform PCI device driver
> > + * Copyright (c) 2005, Intel Corporation.
> > + * Copyright (c) 2007, XenSource Inc.
> > + * Copyright (c) 2010, Citrix
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along with
> > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> > + * Place - Suite 330, Boston, MA 02111-1307 USA.
> > + *
> > + */
> > +
> > +#include <asm/io.h>
> > +
> > +#include <linux/interrupt.h>
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +
> > +#include <xen/grant_table.h>
> > +#include <xen/platform_pci.h>
> > +#include <xen/interface/platform_pci.h>
> > +#include <xen/xenbus.h>
> > +#include <xen/events.h>
> > +#include <xen/hvm.h>
> > +
> > +#define DRV_NAME    "xen-platform-pci"
> > +
> > +MODULE_AUTHOR("ssmith@xensource.com and stefano.stabellini@eu.citrix.com");
> > +MODULE_DESCRIPTION("Xen platform PCI device");
> > +MODULE_LICENSE("GPL");
> > +
> > +static unsigned long platform_mmio;
> > +static unsigned long platform_mmio_alloc;
> > +static unsigned long platform_mmiolen;
> > +
> > +unsigned long alloc_xen_mmio(unsigned long len)
> > +{
> > +     unsigned long addr;
> > +
> > +     addr = platform_mmio + platform_mmio_alloc;
> > +     platform_mmio_alloc += len;
> > +     BUG_ON(platform_mmio_alloc > platform_mmiolen);
> > +
> > +     return addr;
> > +}
> > +
> > +static uint64_t get_callback_via(struct pci_dev *pdev)
> > +{
> > +     u8 pin;
> > +     int irq;
> > +
> > +     irq = pdev->irq;
> > +     if (irq < 16)
> > +             return irq; /* ISA IRQ */
> > +
> > +     pin = pdev->pin;
> > +
> > +     /* We don't know the GSI. Specify the PCI INTx line instead. */
> > +     return ((uint64_t)0x01 << 56) | /* PCI INTx identifier */
> > +             ((uint64_t)pci_domain_nr(pdev->bus) << 32) |
> > +             ((uint64_t)pdev->bus->number << 16) |
> > +             ((uint64_t)(pdev->devfn & 0xff) << 8) |
> > +             ((uint64_t)(pin - 1) & 3);
> > +}
> > +
> > +static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
> > +{
> > +     xen_hvm_evtchn_do_upcall(get_irq_regs());
> > +     return IRQ_HANDLED;
> > +}
> > +
> > +static int xen_allocate_irq(struct pci_dev *pdev)
> > +{
> > +     return request_irq(pdev->irq, do_hvm_evtchn_intr,
> > +                     IRQF_DISABLED | IRQF_NOBALANCING | IRQF_TRIGGER_RISING,
> > +                     "xen-platform-pci", pdev);
> > +}
> > +
> > +static int __devinit platform_pci_init(struct pci_dev *pdev,
> > +                                    const struct pci_device_id *ent)
> > +{
> > +     int i, ret;
> > +     long ioaddr, iolen;
> > +     long mmio_addr, mmio_len;
> > +     uint64_t callback_via;
> > +
> > +     i = pci_enable_device(pdev);
> > +     if (i)
> > +             return i;
> > +
> > +     ioaddr = pci_resource_start(pdev, 0);
> > +     iolen = pci_resource_len(pdev, 0);
> > +
> > +     mmio_addr = pci_resource_start(pdev, 1);
> > +     mmio_len = pci_resource_len(pdev, 1);
> > +
> > +     if (mmio_addr == 0 || ioaddr == 0) {
> > +             dev_err(&pdev->dev, "no resources found\n");
> > +             ret = -ENOENT;
> > +     }
> > +
> > +     if (request_mem_region(mmio_addr, mmio_len, DRV_NAME) == NULL) {
> > +             dev_err(&pdev->dev, "MEM I/O resource 0x%lx @ 0x%lx busy\n",
> > +                    mmio_addr, mmio_len);
> > +             ret = -EBUSY;
> > +     }
> > +
> > +     if (request_region(ioaddr, iolen, DRV_NAME) == NULL) {
> > +             dev_err(&pdev->dev, "I/O resource 0x%lx @ 0x%lx busy\n",
> > +                    iolen, ioaddr);
> > +             ret = -EBUSY;
> > +             goto out;
> > +     }
> > +
> > +     platform_mmio = mmio_addr;
> > +     platform_mmiolen = mmio_len;
> > +
> > +     if (!xen_have_vector_callback) {
> > +             ret = xen_allocate_irq(pdev);
> > +             if (ret) {
> > +                     printk(KERN_WARNING "request_irq failed err=%d\n", ret);
> > +                     goto out;
> > +             }
> > +             callback_via = get_callback_via(pdev);
> > +             ret = xen_set_callback_via(callback_via);
> > +             if (ret) {
> > +                     printk(KERN_WARNING
> > +                                     "Unable to set the evtchn callback err=%d\n", ret);
> > +                     goto out;
> > +             }
> > +     }
> > +
> > +     alloc_xen_mmio_hook = alloc_xen_mmio;
> > +     platform_pci_resume_hook = platform_pci_resume;
> > +     platform_pci_disable_irq_hook = platform_pci_disable_irq;
> > +     platform_pci_enable_irq_hook = platform_pci_enable_irq;
> >
> 
> What's this _hook stuff for?
> 

The hooks are needed to allow the xen platform device to be compiled and
used as a module successfully.
See also PATCH number 8.

The problem is that grant_table.c and manage.c depend on functions
provided by the Xen platform PCI driver if running on HVM, and this
could be compiled as a module. In that case we need to provide stubs
until the Xen platform PCI module is loaded.
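
The pattern is roughly the following. This is a user-space sketch, not
the code from PATCH 8: only the name alloc_xen_mmio_hook comes from the
series, everything else (the stub, the _real and _sketch functions, the
0x1000 start address, the ~0UL stub return value) is assumed for the
sake of the example.

```c
#include <assert.h>

/* Built-in side: stub used until the platform driver is loaded. */
static unsigned long alloc_xen_mmio_stub(unsigned long len)
{
	(void)len;
	return ~0UL; /* assumed "not available" value */
}

/* Built-in side: the hook starts out pointing at the stub. */
static unsigned long (*alloc_xen_mmio_hook)(unsigned long len) =
	alloc_xen_mmio_stub;

/* Module side: hypothetical real allocator with made-up state. */
static unsigned long platform_mmio_next = 0x1000UL;

static unsigned long alloc_xen_mmio_real(unsigned long len)
{
	unsigned long addr = platform_mmio_next;

	platform_mmio_next += len;
	return addr;
}

/* Module side: the probe function installs the real implementation. */
static void platform_probe_sketch(void)
{
	alloc_xen_mmio_hook = alloc_xen_mmio_real;
}

/* Built-in caller: only ever goes through the hook. */
static unsigned long grant_frames_base_sketch(unsigned long len)
{
	return alloc_xen_mmio_hook(len);
}
```

This way the built-in code links without any symbol from the module, and
it starts getting real addresses as soon as the probe function has run.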


> > +
> > +     ret = gnttab_init();
> > +     if (ret)
> > +             goto out;
> > +     ret = xenbus_probe_init();
> > +     if (ret)
> > +             goto out;
> > +
> > +out:
> > +     if (ret) {
> > +             release_mem_region(mmio_addr, mmio_len);
> > +             release_region(ioaddr, iolen);
> > +             pci_disable_device(pdev);
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> > +#define XEN_PLATFORM_VENDOR_ID 0x5853
> > +#define XEN_PLATFORM_DEVICE_ID 0x0001
> > +static struct pci_device_id platform_pci_tbl[] __devinitdata = {
> > +     {XEN_PLATFORM_VENDOR_ID, XEN_PLATFORM_DEVICE_ID,
> > +      PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> > +     {0,}
> > +};
> > +
> > +MODULE_DEVICE_TABLE(pci, platform_pci_tbl);
> > +
> > +static struct pci_driver platform_driver = {
> > +     name:     DRV_NAME,
> > +     probe :    platform_pci_init,
> > +     id_table : platform_pci_tbl,
> > +};
> > +
> > +static int __init platform_pci_module_init(void)
> > +{
> > +     int rc;
> > +
> > +     if (!xen_platform_pci)
> > +             return -ENODEV;
> > +
> > +     rc = pci_register_driver(&platform_driver);
> > +     if (rc) {
> > +             printk(KERN_INFO DRV_NAME
> > +                    ": No platform pci device model found\n");
> > +             return rc;
> > +     }
> > +     return 0;
> > +}
> > +
> > +module_init(platform_pci_module_init);
> > diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> > index 0b05b62..dc6ed06 100644
> > --- a/drivers/xen/xenbus/xenbus_probe.c
> > +++ b/drivers/xen/xenbus/xenbus_probe.c
> > @@ -782,16 +782,24 @@ void xenbus_probe(struct work_struct *unused)
> >       blocking_notifier_call_chain(&xenstore_chain, 0, NULL);
> >  }
> >
> > -static int __init xenbus_probe_init(void)
> > +static int __init __xenbus_probe_init(void)
> > +{
> > +     /* Delay initialization in the PV on HVM case */
> > +     if (xen_hvm_domain())
> > +             return 0;
> > +
> > +     if (!xen_pv_domain())
> > +             return -ENODEV;
> > +
> > +     return xenbus_probe_init();
> > +}
> > +
> > +int xenbus_probe_init(void)
> >  {
> >       int err = 0;
> >
> >       DPRINTK("");
> >
> > -     err = -ENODEV;
> > -     if (!xen_domain())
> > -             goto out_error;
> > -
> >       /* Register ourselves with the kernel bus subsystem */
> >       err = bus_register(&xenbus_frontend.bus);
> >       if (err)
> > @@ -857,7 +865,7 @@ static int __init xenbus_probe_init(void)
> >       return err;
> >  }
> >
> > -postcore_initcall(xenbus_probe_init);
> > +postcore_initcall(__xenbus_probe_init);
> >
> >  MODULE_LICENSE("GPL");
> >
> > diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
> > index a40f1cd..811cda5 100644
> > --- a/include/xen/grant_table.h
> > +++ b/include/xen/grant_table.h
> > @@ -51,6 +51,7 @@ struct gnttab_free_callback {
> >       u16 count;
> >  };
> >
> > +int gnttab_init(void);
> >  int gnttab_suspend(void);
> >  int gnttab_resume(void);
> >
> > diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h
> > index 39da93c..39e5717 100644
> > --- a/include/xen/interface/grant_table.h
> > +++ b/include/xen/interface/grant_table.h
> > @@ -28,6 +28,7 @@
> >  #ifndef __XEN_PUBLIC_GRANT_TABLE_H__
> >  #define __XEN_PUBLIC_GRANT_TABLE_H__
> >
> > +#include <xen/interface/xen.h>
> >
> >  /***********************************
> >   * GRANT TABLE REPRESENTATION
> > diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> > index f39f4d3..59a120c 100644
> > --- a/include/xen/platform_pci.h
> > +++ b/include/xen/platform_pci.h
> > @@ -29,4 +29,13 @@
> >  #define XEN_IOPORT_LINUX_PRODNUM 0xffff
> >  #define XEN_IOPORT_LINUX_DRVVER  ((LINUX_VERSION_CODE << 8) + 0x0)
> >
> > +#ifdef CONFIG_XEN_PLATFORM_PCI
> > +unsigned long alloc_xen_mmio(unsigned long len);
> > +#else
> > +static inline unsigned long alloc_xen_mmio(unsigned long len)
> > +{
> > +     return ~0UL;
> > +}
> >
> 
> Why is this stub needed?
> 

Actually after introducing the corresponding hook it is not needed
anymore. PATCH 8 removes it.


> > +#endif
> > +
> >  #endif /* _XEN_PLATFORM_PCI_H */
> > diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> > index 43e2d7d..ffa97de 100644
> > --- a/include/xen/xenbus.h
> > +++ b/include/xen/xenbus.h
> > @@ -174,6 +174,7 @@ void unregister_xenbus_watch(struct xenbus_watch *watch);
> >  void xs_suspend(void);
> >  void xs_resume(void);
> >  void xs_suspend_cancel(void);
> > +int xenbus_probe_init(void);
> >
> >  /* Used by xenbus_dev to borrow kernel's store connection. */
> >  void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg);
> >
> 


* Re: [PATCH 07/12] Add suspend\resume support for PV on HVM guests.
  2010-05-18 18:11   ` Jeremy Fitzhardinge
@ 2010-05-19 14:18     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 14:18 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: xen-devel@lists.xensource.com, Don Dutile,
	linux-kernel@vger.kernel.org, Sheng Yang, Stefano Stabellini

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> 
> "/"
> 
> Please describe what's needed to support suspend/resume.  Is this a
> normal x86 ACPI suspend/resume, or a Xen save/restore?
> 

This is Xen save/restore; it has nothing to do with ACPI.
In order to support it, we need to listen to xenbus for the suspend
event, freeze all the processes, suspend the PV frontends and call a
hypercall.
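
The ordering of those steps can be sketched in plain user-space C. This
is only an illustration of the sequence, not the kernel code: the enum,
the trace struct and sketch_suspend() are hypothetical names, and the
hypercall result is passed in as a parameter instead of being issued.

```c
#include <assert.h>

/* Hypothetical step names, mirroring the prose above. */
enum step {
	STEP_FREEZE,
	STEP_SUSPEND_FRONTENDS,
	STEP_HYPERCALL,
	STEP_RESUME_FRONTENDS,
	STEP_CANCEL,
	STEP_THAW,
};

struct trace {
	enum step steps[8];
	int n;
};

static void record(struct trace *t, enum step s)
{
	assert(t->n < 8);
	t->steps[t->n++] = s;
}

/*
 * hypercall_result stands in for the value returned by the suspend
 * hypercall: 0 means the domain was saved and restored, nonzero means
 * the suspend was cancelled.
 */
static int sketch_suspend(struct trace *t, int hypercall_result)
{
	record(t, STEP_FREEZE);			/* freeze all the processes */
	record(t, STEP_SUSPEND_FRONTENDS);	/* suspend the PV frontends */
	record(t, STEP_HYPERCALL);		/* ask Xen to suspend us */

	if (hypercall_result == 0)
		record(t, STEP_RESUME_FRONTENDS);
	else
		record(t, STEP_CANCEL);

	record(t, STEP_THAW);			/* let the processes run again */
	return hypercall_result;
}
```

Whatever the hypercall returns, the processes are thawed at the end;
only the frontend handling differs between a real restore and a
cancelled suspend.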


> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  arch/x86/xen/enlighten.c          |    9 ++--
> >  arch/x86/xen/suspend.c            |    6 ++
> >  arch/x86/xen/xen-ops.h            |    3 +
> >  drivers/xen/manage.c              |   95 +++++++++++++++++++++++++++++++++++--
> >  drivers/xen/platform-pci.c        |   29 +++++++++++-
> >  drivers/xen/xenbus/xenbus_probe.c |   28 +++++++++++
> >  include/xen/platform_pci.h        |    6 ++
> >  include/xen/xen-ops.h             |    3 +
> >  8 files changed, 170 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index aac47b0..23b8200 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -1268,12 +1268,13 @@ static int init_hvm_pv_info(int *major, int *minor)
> >  	return 0;
> >  }
> >  
> > -static void __init init_shared_info(void)
> > +void init_shared_info(void)
> >  {
> >  	struct xen_add_to_physmap xatp;
> > -	struct shared_info *shared_info_page;
> > +	static struct shared_info *shared_info_page = 0;
> >  
> > -	shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
> > +	if (!shared_info_page)
> > +		shared_info_page = (struct shared_info *) alloc_bootmem_pages(PAGE_SIZE);
> >  	xatp.domid = DOMID_SELF;
> >  	xatp.idx = 0;
> >  	xatp.space = XENMAPSPACE_shared_info;
> > @@ -1302,7 +1303,7 @@ void do_hvm_pv_evtchn_intr(void)
> >  	xen_hvm_evtchn_do_upcall(get_irq_regs());
> >  }
> >  
> > -static void xen_callback_vector(void)
> > +void xen_callback_vector(void)
> >  {
> >  	uint64_t callback_via;
> >  	if (xen_feature(XENFEAT_hvm_callback_vector)) {
> > diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
> > index 987267f..86f3b45 100644
> > --- a/arch/x86/xen/suspend.c
> > +++ b/arch/x86/xen/suspend.c
> > @@ -26,6 +26,12 @@ void xen_pre_suspend(void)
> >  		BUG();
> >  }
> >  
> > +void xen_hvm_post_suspend(int suspend_cancelled)
> > +{
> > +		init_shared_info();
> > +		xen_callback_vector();
> > +}
> > +
> >  void xen_post_suspend(int suspend_cancelled)
> >  {
> >  	xen_build_mfn_list_list();
> > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> > index f9153a3..caf89ee 100644
> > --- a/arch/x86/xen/xen-ops.h
> > +++ b/arch/x86/xen/xen-ops.h
> > @@ -38,6 +38,9 @@ void xen_enable_sysenter(void);
> >  void xen_enable_syscall(void);
> >  void xen_vcpu_restore(void);
> >  
> > +void xen_callback_vector(void);
> > +void init_shared_info(void);
> > +
> >  void __init xen_build_dynamic_phys_to_machine(void);
> >  
> >  void xen_init_irq_ops(void);
> > diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> > index 2ac4440..a73edd8 100644
> > --- a/drivers/xen/manage.c
> > +++ b/drivers/xen/manage.c
> > @@ -8,15 +8,20 @@
> >  #include <linux/sysrq.h>
> >  #include <linux/stop_machine.h>
> >  #include <linux/freezer.h>
> > +#include <linux/pci.h>
> > +#include <linux/cpumask.h>
> >  
> > +#include <xen/xen.h>
> >  #include <xen/xenbus.h>
> >  #include <xen/grant_table.h>
> >  #include <xen/events.h>
> >  #include <xen/hvc-console.h>
> >  #include <xen/xen-ops.h>
> > +#include <xen/platform_pci.h>
> >  
> >  #include <asm/xen/hypercall.h>
> >  #include <asm/xen/page.h>
> > +#include <asm/xen/hypervisor.h>
> >  
> >  enum shutdown_state {
> >  	SHUTDOWN_INVALID = -1,
> > @@ -33,10 +38,30 @@ enum shutdown_state {
> >  static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
> >  
> >  #ifdef CONFIG_PM_SLEEP
> > -static int xen_suspend(void *data)
> > +static int xen_hvm_suspend(void *data)
> >  {
> > +	struct sched_shutdown r = { .reason = SHUTDOWN_suspend };
> >  	int *cancelled = data;
> > +
> > +	BUG_ON(!irqs_disabled());
> > +
> > +	*cancelled = HYPERVISOR_sched_op(SCHEDOP_shutdown, &r);
> > +
> > +	xen_hvm_post_suspend(*cancelled);
> > +	gnttab_resume();
> > +
> > +	if (!*cancelled) {
> > +		xen_irq_resume();
> > +		platform_pci_resume();
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int xen_suspend(void *data)
> > +{
> >  	int err;
> > +	int *cancelled = data;
> >  
> >  	BUG_ON(!irqs_disabled());
> >  
> > @@ -73,6 +98,53 @@ static int xen_suspend(void *data)
> >  	return 0;
> >  }
> >  
> > +static void do_hvm_suspend(void)
> > +{
> > +	int err;
> > +	int cancelled = 1;
> > +
> > +	shutting_down = SHUTDOWN_SUSPEND;
> > +
> > +#ifdef CONFIG_PREEMPT
> > +	/* If the kernel is preemptible, we need to freeze all the processes
> > +	   to prevent them from being in the middle of a pagetable update
> > +	   during suspend. */
> > +	err = freeze_processes();
> > +	if (err) {
> > +		printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
> > +		goto out;
> > +	}
> > +#endif
> > +
> > +	printk(KERN_DEBUG "suspending xenstore... ");
> > +	xenbus_suspend();
> > +	printk(KERN_DEBUG "xenstore suspended\n");
> > +	platform_pci_disable_irq();
> > +	
> > +	err = stop_machine(xen_hvm_suspend, &cancelled, cpumask_of(0));
> > +	if (err) {
> > +		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
> > +		cancelled = 1;
> > +	}
> > +
> > +	platform_pci_enable_irq();
> > +
> > +	if (!cancelled) {
> > +		xen_arch_resume();
> > +		xenbus_resume();
> > +	} else
> > +		xs_suspend_cancel();
> > +
> > +	/* Make sure timer events get retriggered on all CPUs */
> > +	clock_was_set();
> > +
> > +#ifdef CONFIG_PREEMPT
> > +	thaw_processes();
> > +out:
> > +#endif
> > +	shutting_down = SHUTDOWN_INVALID;
> > +}
> > +
> >  static void do_suspend(void)
> >  {
> >  	int err;
> > @@ -185,7 +257,10 @@ static void shutdown_handler(struct xenbus_watch *watch,
> >  		ctrl_alt_del();
> >  #ifdef CONFIG_PM_SLEEP
> >  	} else if (strcmp(str, "suspend") == 0) {
> > -		do_suspend();
> > +		if (xen_hvm_domain())
> > +			do_hvm_suspend();
> >   
> 
> Why does HVM come via this path?  Wouldn't ACPI S3 be a better match for
> HVM?  Does this make sure the full device model suspend/resume callbacks
> get called?  Previously I think we cut corners because we knew there
> wouldn't be any PCI devices in the system...
> 
> And if the full device model is being used properly, then can't all this
> hvm-specific stuff be done in the platform pci driver itself, rather
> than here?  Is checkpoint the issue?  (Is checkpointing hvm domains
> supported?)
> 

I don't think we want to handle this with ACPI S3. Xen is capable of
issuing ACPI S3 requests if it wants to. This is a different case.

My first attempt was to use the PV suspend handler, but I ended up with
too many if (xen_hvm_domain()) checks, so I decided to write a new one.
Now that the code is much more stable and has no known bugs left, I
might be able to refactor it in a better way, either by moving it to the
platform pci driver or by using the PV suspend handler.


> > +		else
> > +			do_suspend();
> >  #endif
> >  	} else {
> >  		printk(KERN_INFO "Ignoring shutdown request: %s\n", str);
> > @@ -261,7 +336,19 @@ static int shutdown_event(struct notifier_block *notifier,
> >  	return NOTIFY_DONE;
> >  }
> >  
> > -static int __init setup_shutdown_event(void)
> > +static int __init __setup_shutdown_event(void)
> > +{
> > +	/* Delay initialization in the PV on HVM case */
> > +	if (xen_hvm_domain())
> > +		return 0;
> > +
> > +	if (!xen_pv_domain())
> > +		return -ENODEV;
> > +
> > +	return xen_setup_shutdown_event();
> > +}
> > +
> > +int xen_setup_shutdown_event(void)
> >  {
> >  	static struct notifier_block xenstore_notifier = {
> >  		.notifier_call = shutdown_event
> > @@ -271,4 +358,4 @@ static int __init setup_shutdown_event(void)
> >  	return 0;
> >  }
> >  
> > -subsys_initcall(setup_shutdown_event);
> > +subsys_initcall(__setup_shutdown_event);
> > diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
> > index 7a8da66..b15f809 100644
> > --- a/drivers/xen/platform-pci.c
> > +++ b/drivers/xen/platform-pci.c
> > @@ -33,6 +33,7 @@
> >  #include <xen/xenbus.h>
> >  #include <xen/events.h>
> >  #include <xen/hvm.h>
> > +#include <xen/xen-ops.h>
> >  
> >  #define DRV_NAME    "xen-platform-pci"
> >  
> > @@ -43,6 +44,8 @@ MODULE_LICENSE("GPL");
> >  static unsigned long platform_mmio;
> >  static unsigned long platform_mmio_alloc;
> >  static unsigned long platform_mmiolen;
> > +static uint64_t callback_via;
> > +struct pci_dev *xen_platform_pdev;
> >  
> >  unsigned long alloc_xen_mmio(unsigned long len)
> >  {
> > @@ -87,13 +90,33 @@ static int xen_allocate_irq(struct pci_dev *pdev)
> >  			"xen-platform-pci", pdev);
> >  }
> >  
> > +void platform_pci_disable_irq(void)
> >   
> 
> If these are non-static they need a xen_ prefix.  In fact
> "platform_pci_" is too generic anyway, and they should all have xen_
> prefixes.
> 
> Aside from that, why do they need to be externally callable?  Can't the
> pci device's own suspend/resume handlers do this?

I had serious problems the first time I tried to do that, but it is true
that at that point I had many other serious bugs in the hvm
suspend/resume code. I'll give it another try now that it is stable.


> 
> > +{
> > +	printk(KERN_DEBUG "platform_pci_disable_irq\n");
> > +	disable_irq(xen_platform_pdev->irq);
> > +}
> > +
> > +void platform_pci_enable_irq(void)
> > +{
> > +	printk(KERN_DEBUG "platform_pci_enable_irq\n");
> > +	enable_irq(xen_platform_pdev->irq);
> > +}
> > +
> > +void platform_pci_resume(void)
> > +{
> > +	if (!xen_have_vector_callback && xen_set_callback_via(callback_via)) {
> > +		printk("platform_pci_resume failure!\n");
> > +		return;
> > +	}
> > +}
> > +
> >  static int __devinit platform_pci_init(struct pci_dev *pdev,
> >  				       const struct pci_device_id *ent)
> >  {
> >  	int i, ret;
> >  	long ioaddr, iolen;
> >  	long mmio_addr, mmio_len;
> > -	uint64_t callback_via;
> > +	xen_platform_pdev = pdev;
> >  
> >  	i = pci_enable_device(pdev);
> >  	if (i)
> > @@ -152,6 +175,10 @@ static int __devinit platform_pci_init(struct pci_dev *pdev,
> >  	ret = xenbus_probe_init();
> >  	if (ret)
> >  		goto out;
> > +	ret = xen_setup_shutdown_event();
> > +	if (ret)
> > +		goto out;
> > +
> >  
> >  out:
> >  	if (ret) {
> > diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> > index dc6ed06..a679205 100644
> > --- a/drivers/xen/xenbus/xenbus_probe.c
> > +++ b/drivers/xen/xenbus/xenbus_probe.c
> > @@ -746,6 +746,34 @@ static int xenbus_dev_resume(struct device *dev)
> >  	return 0;
> >  }
> >  
> > +static int dev_suspend(struct device *dev, void *data)
> > +{
> > +	return xenbus_dev_suspend(dev, PMSG_SUSPEND);
> > +}
> > +
> > +void xenbus_suspend(void)
> > +{
> > +	DPRINTK("");
> > +
> > +	bus_for_each_dev(&xenbus_frontend.bus, NULL, NULL, dev_suspend);
> > +	xs_suspend();
> > +}
> > +EXPORT_SYMBOL_GPL(xenbus_suspend);
> > +
> > +static int dev_resume(struct device *dev, void *data)
> > +{
> > +	return xenbus_dev_resume(dev);
> > +}
> > +
> > +void xenbus_resume(void)
> > +{
> > +	DPRINTK("");
> > +
> > +	xs_resume();
> > +	bus_for_each_dev(&xenbus_frontend.bus, NULL, NULL, dev_resume);
> > +}
> > +EXPORT_SYMBOL_GPL(xenbus_resume);
> > +
> >  /* A flag to determine if xenstored is 'ready' (i.e. has started) */
> >  int xenstored_ready = 0;
> >  
> > diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> > index 59a120c..ced434d 100644
> > --- a/include/xen/platform_pci.h
> > +++ b/include/xen/platform_pci.h
> > @@ -31,11 +31,17 @@
> >  
> >  #ifdef CONFIG_XEN_PLATFORM_PCI
> >  unsigned long alloc_xen_mmio(unsigned long len);
> > +void platform_pci_resume(void);
> > +void platform_pci_disable_irq(void);
> > +void platform_pci_enable_irq(void);
> >  #else
> >  static inline unsigned long alloc_xen_mmio(unsigned long len)
> >  {
> >  	return ~0UL;
> >  }
> > +static inline void platform_pci_resume(void) {}
> > +static inline void platform_pci_disable_irq(void) {}
> > +static inline void platform_pci_enable_irq(void) {}
> >  #endif
> >  
> >  #endif /* _XEN_PLATFORM_PCI_H */
> > diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
> > index 883a21b..46bc81e 100644
> > --- a/include/xen/xen-ops.h
> > +++ b/include/xen/xen-ops.h
> > @@ -7,6 +7,7 @@ DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
> >  
> >  void xen_pre_suspend(void);
> >  void xen_post_suspend(int suspend_cancelled);
> > +void xen_hvm_post_suspend(int suspend_cancelled);
> >  
> >  void xen_mm_pin_all(void);
> >  void xen_mm_unpin_all(void);
> > @@ -14,4 +15,6 @@ void xen_mm_unpin_all(void);
> >  void xen_timer_resume(void);
> >  void xen_arch_resume(void);
> >  
> > +int xen_setup_shutdown_event(void);
> > +
> >  #endif /* INCLUDE_XEN_OPS_H */
> >   
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 08/12] Allow xen platform pci device to be compiled as a module
  2010-05-18 18:15   ` Jeremy Fitzhardinge
@ 2010-05-19 14:19     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 14:19 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
> > index 6f5f3ba..f936d30 100644
> > --- a/drivers/xen/grant-table.c
> > +++ b/drivers/xen/grant-table.c
> > @@ -56,6 +56,9 @@
> >  #define GNTTAB_LIST_END 0xffffffff
> >  #define GREFS_PER_GRANT_FRAME (PAGE_SIZE / sizeof(struct grant_entry))
> >  
> > +unsigned long (*alloc_xen_mmio_hook)(unsigned long len);
> > +EXPORT_SYMBOL_GPL(alloc_xen_mmio_hook);
> > +
> >  static grant_ref_t **gnttab_list;
> >  static unsigned int nr_grant_frames;
> >  static unsigned int boot_max_nr_grant_frames;
> > @@ -514,7 +517,7 @@ int gnttab_resume(void)
> >  		return gnttab_map(0, nr_grant_frames - 1);
> >  
> >  	if (!hvm_pv_resume_frames) {
> > -		hvm_pv_resume_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
> > +		hvm_pv_resume_frames = alloc_xen_mmio_hook(PAGE_SIZE * max_nr_gframes);
> >   
> 
> This looks like it should be restructured so the pci device driver
> itself is doing this mapping, then calling into the grant subsystem to
> tell it where the mapping is.
> 

OK


> >  		shared = ioremap(hvm_pv_resume_frames, PAGE_SIZE * max_nr_gframes);
> >  		if (shared == NULL) {
> >  			printk(KERN_WARNING
> > @@ -600,6 +603,7 @@ int gnttab_init(void)
> >  	kfree(gnttab_list);
> >  	return -ENOMEM;
> >  }
> > +EXPORT_SYMBOL_GPL(gnttab_init);
> >  
> >  static int __devinit __gnttab_init(void)
> >  {
> > diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> > index a73edd8..49ee52d 100644
> > --- a/drivers/xen/manage.c
> > +++ b/drivers/xen/manage.c
> > @@ -34,6 +34,13 @@ enum shutdown_state {
> >  	 SHUTDOWN_HALT = 4,
> >  };
> >  
> > +void (*platform_pci_resume_hook)(void);
> > +EXPORT_SYMBOL_GPL(platform_pci_resume_hook);
> > +void (*platform_pci_disable_irq_hook)(void);
> > +EXPORT_SYMBOL_GPL(platform_pci_disable_irq_hook);
> > +void (*platform_pci_enable_irq_hook)(void);
> > +EXPORT_SYMBOL_GPL(platform_pci_enable_irq_hook);
> >   
> 
> If all this _hook stuff is here to support a modular xen platform pci
> device, then something has gone wrong.  The device should be doing this
> via its own suspend/resume handlers.
> 

I'll give it another try.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC
  2010-05-18 18:15   ` Jeremy Fitzhardinge
@ 2010-05-19 14:25     ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-19 14:25 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
> On 05/18/2010 03:23 AM, Stefano Stabellini wrote:
> > Make sure chip_data is not NULL before accessing it.
> >   
> 
> You should clarify under what circumstances it can be legitimately NULL.
> 
> 

The VIRQ_TIMER handler and virq handlers in general don't have any
chip_data.
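
As a stand-alone illustration of the guard (a sketch only: sketch_desc,
sketch_cfg and count_pins() are made-up names, not the io_apic.c
structures), a walk over descriptors that skips the ones without
chip_data could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the per-irq data. */
struct sketch_cfg {
	int pins;
};

struct sketch_desc {
	struct sketch_cfg *chip_data;	/* NULL for e.g. a virq */
};

/* Sum the pin counts, skipping descriptors that carry no chip_data. */
static int count_pins(const struct sketch_desc *descs, size_t n)
{
	int total = 0;
	size_t i;

	assert(descs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct sketch_cfg *cfg = descs[i].chip_data;

		if (!cfg)
			continue;	/* nothing to dereference here */
		total += cfg->pins;
	}
	return total;
}
```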


> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  arch/x86/kernel/apic/io_apic.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> > index eb2789c..c64499c 100644
> > --- a/arch/x86/kernel/apic/io_apic.c
> > +++ b/arch/x86/kernel/apic/io_apic.c
> > @@ -1732,6 +1732,8 @@ __apicdebuginit(void) print_IO_APIC(void)
> >  		struct irq_pin_list *entry;
> >  
> >  		cfg = desc->chip_data;
> > +		if (!cfg)
> > +			continue;
> >  		entry = cfg->irq_2_pin;
> >  		if (!entry)
> >  			continue;
> >   
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 03/12] evtchn delivery on HVM
  2010-05-19 12:24     ` Stefano Stabellini
@ 2010-05-19 18:19       ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 46+ messages in thread
From: Jeremy Fitzhardinge @ 2010-05-19 18:19 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel@lists.xensource.com, Don Dutile,
	linux-kernel@vger.kernel.org, Sheng Yang

On 05/19/2010 05:24 AM, Stefano Stabellini wrote:
> On Tue, 18 May 2010, Jeremy Fitzhardinge wrote:
>   
>> On 05/18/2010 03:22 AM, Stefano Stabellini wrote:
>>     
>>> From: Sheng Yang <sheng@linux.intel.com>
>>>
>>> Set the callback to receive evtchns from Xen, using the
>>> callback vector delivery mechanism.
>>>   
>>>       
>> Could you expand on this a little?  Like, why is this desireable?  What
>> functional difference does it make?  Is this patch useful in its own
>> right, or is it just laying the groundwork for something else?
>>
>>     
> In order to use PV frontends on HVM we need to receive notifications on
> event channel deliveries somehow.
>   
(OK, but I just meant update the commit comment on the patch itself.)

> Using the callback vector is the preferred way, because it is available
> independently from any (emulated) PCI device, all the vcpus can receive
> these callbacks and theoretically there is no need to interact with the
> emulated lapic (even though at the moment we are doing it anyway because
> we are using the IPI vector).
>   
> The other way is to receive interrupts from the xen platform pci device,
> but in that case interaction with the emulated lapic is unavoidable and
> we are limited to receive interrupts on vcpu 0.
>   

Perhaps you should mention this first, since it is the historical way of
doing it, and then talk about its limitations, and then talk about the
replacement to address those limitations.
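
The difference between the two delivery paths described in the quoted
text can be sketched as follows. This is a user-space illustration with
hypothetical names (evtchn_delivery, pick_delivery, vcpu_gets_upcalls);
it only encodes the two properties stated above, namely that the vector
callback reaches every vcpu while the platform pci interrupt is limited
to vcpu 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the two ways of receiving evtchn upcalls. */
enum evtchn_delivery {
	DELIVERY_VECTOR_CALLBACK,	/* needs hypervisor support */
	DELIVERY_PLATFORM_PCI_IRQ,	/* historical way, via emulated device */
};

/* Prefer the vector callback whenever the hypervisor offers it. */
static enum evtchn_delivery pick_delivery(bool have_vector_callback)
{
	return have_vector_callback ? DELIVERY_VECTOR_CALLBACK
				    : DELIVERY_PLATFORM_PCI_IRQ;
}

/* Can the given vcpu receive event channel upcalls in this mode? */
static bool vcpu_gets_upcalls(enum evtchn_delivery mode,
			      unsigned int vcpu, unsigned int nr_vcpus)
{
	assert(vcpu < nr_vcpus);

	if (mode == DELIVERY_VECTOR_CALLBACK)
		return true;		/* all the vcpus can receive them */
	return vcpu == 0;		/* platform pci irq: vcpu 0 only */
}
```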

    J

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-18 10:55 ` [PATCH 0 of 12] PV on HVM Xen Christian Tramnitz
  2010-05-18 18:41   ` Jeremy Fitzhardinge
@ 2010-05-24 17:28   ` Stefano Stabellini
  2010-05-24 19:32     ` Pasi Kärkkäinen
  1 sibling, 1 reply; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-24 17:28 UTC (permalink / raw)
  To: Christian Tramnitz; +Cc: xen-devel@lists.xensource.com

On Tue, 18 May 2010, Christian Tramnitz wrote:
> Hi Stefano,
> 
> what are the particular advantages of running PVonHVM vs. traditional PV 
> (vs pure HVM)?
> I'd like to update the wiki with some info about it...
> 

Sorry for the late reply, but I wanted to have some numbers to compare
the three cases...

PVonHVM guests are as easy to install, use and maintain as pure HVM
guests, but provide performance comparable to, or even better than,
traditional PV guests.

In particular you'll find that PVonHVM guests are always faster than
pure HVM guests, most of the time faster than 64 bit PV guests, but
probably slower than 32 bit PV guests.


The following are the results of kernbench run in optimal mode on a 4
vcpu linux guest with 512MB of ram.
The host is a Dell Precision T3500 (HAP enabled):
testbox:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 2266.806
cache size      : 8192 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
bogomips        : 4538.00
clflush size    : 64
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 2266.806
cache size      : 8192 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
bogomips        : 4538.00
clflush size    : 64
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 2266.806
cache size      : 8192 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
bogomips        : 4538.00
clflush size    : 64
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping        : 5
cpu MHz         : 2266.806
cache size      : 8192 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
bogomips        : 4538.00
clflush size    : 64
power management:



64 bit PV on HVM
----------------
Average Optimal load -j 4 Run (std deviation):
Elapsed Time 215.307 (10.1294)
User Time 632.503 (6.4785)
System Time 115.497 (4.53905)
Percent CPU 347.333 (15.885)
Context Switches 43319.7 (2088.39)
Sleeps 48950 (3140.18)

64 bit pure HVM
---------------
Average Optimal load -j 4 Run (std deviation):
Elapsed Time 235.25 (4.10598)
User Time 512.48 (1.69714)
System Time 73.5967 (0.65041)
Percent CPU 248.667 (4.93288)
Context Switches 35930.3 (342.837)
Sleeps 56249.7 (2784.42)

64 bit traditional PV
---------------------
Average Optimal load -j 4 Run (std deviation):
Elapsed Time 248.187 (12.3954)
User Time 535.283 (0.818189)
System Time 127.497 (0.342685)
Percent CPU 266.667 (12.8582)
Context Switches 32978 (2968.78)
Sleeps 54391.3 (3141.44)


The results show that the 64 bit PV on HVM case is the fastest, but it
is also the one that uses the most CPU and user time.
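
To put the elapsed times in relative terms, a small helper (a sketch;
percent_faster() is just an illustrative name) can compute how much
shorter one run is than another:

```c
#include <assert.h>

/* Percentage by which elapsed time t is shorter than baseline. */
static double percent_faster(double t, double baseline)
{
	assert(baseline > 0.0);
	return (baseline - t) / baseline * 100.0;
}
```

With the numbers above, percent_faster(215.307, 235.25) is about 8.5
(PV on HVM versus pure HVM) and percent_faster(215.307, 248.187) is
about 13.2 (PV on HVM versus traditional PV). Keep in mind that the
standard deviations of the elapsed times (10.1 and 12.4 seconds for the
PV on HVM and PV runs) are of the same order as these differences.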

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 0 of 12] PV on HVM Xen
@ 2010-05-24 18:25 Stefano Stabellini
  2010-05-24 18:29 ` Stefano Stabellini
                   ` (3 more replies)
  0 siblings, 4 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-24 18:25 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org
  Cc: Stefano Stabellini, Jeremy Fitzhardinge,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

Hi all,
this is another update of the PV on HVM Xen series that addresses
Jeremy's comments.
The platform_pci hooks have been removed, suspend/resume for HVM
domains is now much more similar to the PV case and shares the same
do_suspend function.
Alloc_xen_mmio_hook has been removed as well; the memory allocation for
the grant table is now done by the xen platform pci driver directly.
The per_cpu xen_vcpu variable is set by a cpu_notifier function so that
secondary vcpus have the variable set correctly no matter what the xen
features are on the host.
The kernel command line option xen_unplug has been renamed to
xen_emul_unplug and the code that makes use of it has been moved to a
separate file (arch/x86/xen/platform-pci-unplug.c).
Xen_unplug_emulated_devices is now able to detect if blkfront, netfront
and the Xen platform PCI driver have been compiled, and set the default
value of xen_emul_unplug accordingly.
The patch "Initialize xenbus device structs with ENODEV as
default" has been removed from the series and it will be sent
separately.
Finally the comments on most of the patches have been improved.
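
As a rough idea of what "set the default value of xen_emul_unplug
accordingly" means, here is a user-space sketch. The flag names and
default_unplug() are hypothetical and the exact policy is an
assumption; the only point taken from the description above is that
the default depends on which of blkfront, netfront and the platform
PCI driver have been compiled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_UNPLUG_DISKS	0x1u
#define SKETCH_UNPLUG_NICS	0x2u

/*
 * Assumed policy: only unplug an emulated device class when the
 * matching PV frontend is available to take over, and unplug nothing
 * if the platform PCI driver is missing.
 */
static unsigned int default_unplug(bool have_platform_pci,
				   bool have_blkfront, bool have_netfront)
{
	unsigned int mask = 0;

	if (!have_platform_pci)
		return 0;	/* keep every emulated device */

	if (have_blkfront)
		mask |= SKETCH_UNPLUG_DISKS;
	if (have_netfront)
		mask |= SKETCH_UNPLUG_NICS;

	assert((mask & ~(SKETCH_UNPLUG_DISKS | SKETCH_UNPLUG_NICS)) == 0);
	return mask;
}
```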

The series is based on 2.6.34 and supports Xen PV frontends running
in a HVM domain, including netfront, blkfront and the VIRQ_TIMER.

In order to be able to use VIRQ_TIMER and to improve performance you
need a patch to Xen to implement the vector callback mechanism
for event channel delivery.

A git tree is also available here:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git

branch name 2.6.34-pvhvm-v2.

Cheers,

Stefano

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 18:25 Stefano Stabellini
@ 2010-05-24 18:29 ` Stefano Stabellini
  2010-05-24 18:30 ` Boris Derzhavets
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-24 18:29 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: linux-kernel@vger.kernel.org, Jeremy Fitzhardinge,
	xen-devel@lists.xensource.com, Don Dutile, Sheng Yang

The subject should be [PATCH 0 of 11] PV on HVM Xen, sorry about that.

On Mon, 24 May 2010, Stefano Stabellini wrote:
> Hi all,
> this is another update of the PV on HVM Xen series that addresses
> Jeremy's comments.
> The platform_pci hooks have been removed, suspend/resume for HVM
> domains is now much more similar to the PV case and shares the same
> do_suspend function.
> Alloc_xen_mmio_hook has been removed has well, now the memory allocation for
> the grant table is done by the xen platform pci driver directly.
> The per_cpu xen_vcpu variable is set by a cpu_notifier function so that
> secondary vcpus have the variable set correctly no matter what the xen
> features are on the host.
> The kernel command line option xen_unplug has been renamed to
> xen_emul_unplug and the code that makes use of it has been moved to a
> separate file (arch/x86/xen/platform-pci-unplug.c).
> Xen_unplug_emulated_devices is now able to detect if blkfront, netfront
> and the Xen platform PCI driver have been compiled, and set the default
> value of xen_emul_unplug accordingly.
> The patch "Initialize xenbus device structs with ENODEV as
> default" has been removed from the series and it will be sent
> separately.
> Finally the comments on most of the patches have been improved.
> 
> The series is based on 2.6.34 and supports Xen PV frontends running
> in a HVM domain, including netfront, blkfront and the VIRQ_TIMER.
> 
> In order to be able to use VIRQ_TIMER and to improve performances you
> need a patch to Xen to implement the vector callback mechanism
> for event channel delivery.
> 
> A git tree is also available here:
> 
> git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git
> 
> branch name 2.6.34-pvhvm-v2.
> 
> Cheers,
> 
> Stefano
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 18:25 Stefano Stabellini
  2010-05-24 18:29 ` Stefano Stabellini
@ 2010-05-24 18:30 ` Boris Derzhavets
  2010-05-24 18:36 ` Boris Derzhavets
  2010-05-28 10:25 ` Boris Derzhavets
  3 siblings, 0 replies; 46+ messages in thread
From: Boris Derzhavets @ 2010-05-24 18:30 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Don Dutile,
	Sheng Yang, Stefano Stabellini

[-- Attachment #1.1: Type: text/plain, Size: 2442 bytes --]

> In order to be able to use VIRQ_TIMER and to improve performances you
> need a patch to Xen to implement the vector callback mechanism
> for event channel delivery.

Where can I get it?

Boris.

--- On Mon, 5/24/10, Stefano Stabellini <Stefano.Stabellini@eu.citrix.com> wrote:

From: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Subject: [Xen-devel] [PATCH 0 of 12] PV on HVM Xen
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>, "Jeremy Fitzhardinge" <jeremy@goop.org>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "Don Dutile" <ddutile@redhat.com>, "Sheng Yang" <sheng@linux.intel.com>
Date: Monday, May 24, 2010, 2:25 PM

Hi all,
this is another update of the PV on HVM Xen series that addresses
Jeremy's comments.
The platform_pci hooks have been removed, suspend/resume for HVM
domains is now much more similar to the PV case and shares the same
do_suspend function.
Alloc_xen_mmio_hook has been removed has well, now the memory allocation for
the grant table is done by the xen platform pci driver directly.
The per_cpu xen_vcpu variable is set by a cpu_notifier function so that
secondary vcpus have the variable set correctly no matter what the xen
features are on the host.
The kernel command line option xen_unplug has been renamed to
xen_emul_unplug and the code that makes use of it has been moved to a
separate file (arch/x86/xen/platform-pci-unplug.c).
Xen_unplug_emulated_devices is now able to detect if blkfront, netfront
and the Xen platform PCI driver have been compiled, and set the default
value of xen_emul_unplug accordingly.
The patch "Initialize xenbus device structs with ENODEV as
default" has been removed from the series and it will be sent
separately.
Finally the comments on most of the patches have been improved.

The series is based on 2.6.34 and supports Xen PV frontends running
in a HVM domain, including netfront, blkfront and the VIRQ_TIMER.

In order to be able to use VIRQ_TIMER and to improve performances you
need a patch to Xen to implement the vector callback mechanism
for event channel delivery.

A git tree is also available here:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git

branch name 2.6.34-pvhvm-v2.

Cheers,

Stefano

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 18:25 Stefano Stabellini
  2010-05-24 18:29 ` Stefano Stabellini
  2010-05-24 18:30 ` Boris Derzhavets
@ 2010-05-24 18:36 ` Boris Derzhavets
  2010-05-28 10:25 ` Boris Derzhavets
  3 siblings, 0 replies; 46+ messages in thread
From: Boris Derzhavets @ 2010-05-24 18:36 UTC (permalink / raw)
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Don Dutile,
	Sheng Yang, Stefano Stabellini

> In order to be able to use VIRQ_TIMER and to improve performance you
> need a patch to Xen to implement the vector callback mechanism
> for event channel delivery.

Where can I get it?

Boris.

--- On Mon, 5/24/10, Stefano Stabellini <Stefano.Stabellini@eu.citrix.com> wrote:

From: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Subject: [Xen-devel] [PATCH 0 of 12] PV on HVM Xen
To: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>, "Jeremy Fitzhardinge" <jeremy@goop.org>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "Don Dutile" <ddutile@redhat.com>, "Sheng Yang" <sheng@linux.intel.com>
Date: Monday, May 24, 2010, 2:25 PM

Hi all,
this is another update of the PV on HVM Xen series that addresses
Jeremy's comments.
The platform_pci hooks have been removed, suspend/resume for HVM
domains is now much more similar to the PV case and shares the same
do_suspend function.
Alloc_xen_mmio_hook has been removed as well; the memory allocation for
the grant table is now done by the xen platform pci driver directly.
The per_cpu xen_vcpu variable is set by a cpu_notifier function so that
secondary vcpus have the variable set correctly no matter what the xen
features are on the host.
The kernel command line option xen_unplug has been renamed to
xen_emul_unplug and the code that makes use of it has been moved to a
separate file (arch/x86/xen/platform-pci-unplug.c).
Xen_unplug_emulated_devices is now able to detect if blkfront, netfront
and the Xen platform PCI driver have been compiled, and set the default
value of xen_emul_unplug accordingly.
The patch "Initialize xenbus device structs with ENODEV as
default" has been removed from the series and it will be sent
separately.
Finally the comments on most of the patches have been improved.
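
The default selection for xen_emul_unplug described above can be pictured
roughly as follows. This is only a sketch under assumptions, not the code
from the patch: the Unplug flags and the default_unplug helper are
hypothetical names, and the rule "unplug a device class only when a PV
driver can take over" is one reading of the description.

```python
from enum import Flag, auto


class Unplug(Flag):
    """Hypothetical flags naming which emulated devices to unplug."""
    NONE = 0
    DISKS = auto()
    NICS = auto()


def default_unplug(has_blkfront: bool, has_netfront: bool,
                   has_platform_pci: bool) -> Unplug:
    """Pick a default: unplug a device class only if a PV driver can replace it."""
    result = Unplug.NONE
    if not has_platform_pci:
        # Assumption: without the platform PCI driver no PV frontend
        # can attach, so every emulated device is kept.
        return result
    if has_blkfront:
        result |= Unplug.DISKS
    if has_netfront:
        result |= Unplug.NICS
    return result


if __name__ == "__main__":
    # Example: blkfront and the platform PCI driver built, netfront missing.
    print(default_unplug(True, False, True))
```

The point of such a rule is that a guest never loses an emulated disk or
NIC unless the matching frontend is there to replace it.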

The series is based on 2.6.34 and supports Xen PV frontends running
in a HVM domain, including netfront, blkfront and the VIRQ_TIMER.

In order to be able to use VIRQ_TIMER and to improve performance you
need a patch to Xen to implement the vector callback mechanism
for event channel delivery.

A git tree is also available here:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git

branch name 2.6.34-pvhvm-v2.

Cheers,

Stefano

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 17:28   ` Stefano Stabellini
@ 2010-05-24 19:32     ` Pasi Kärkkäinen
  2010-05-24 19:51       ` Stefano Stabellini
  0 siblings, 1 reply; 46+ messages in thread
From: Pasi Kärkkäinen @ 2010-05-24 19:32 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel@lists.xensource.com, Christian Tramnitz

On Mon, May 24, 2010 at 06:28:42PM +0100, Stefano Stabellini wrote:
> On Tue, 18 May 2010, Christian Tramnitz wrote:
> > Hi Stefano,
> > 
> > what are the particular advantages of running PVonHVM vs. traditional PV 
> > (vs pure HVM)?
> > I'd like to update the wiki with some info about it...
> > 
> 
> Sorry for the late reply, but I wanted to have some numbers to compare
> the three cases...
> 
> PVonHVM guests are as easy to install, use and maintain as pure HVM
> guests but provide performance comparable to, or even better than,
> traditional PV guests.
> 
> In particular you'll find that PVonHVM guests are always faster than
> pure HVM guests, most of the time faster than 64 bit PV guests but
> probably slower than 32 bit PV guests.
> 

Hmm.. are you sure kernbench is a proper benchmark for disk IO? 
Kernel compilations fire up a lot of new processes, and that favours HVM guests.

Basically it seems really weird to me that in the results HVM (without PV drivers) 
is faster than pure PV..

If you compare pure disk-IO PV vs. HVM the difference is usually big.. 
PV beating HVM hands down.

-- Pasi

> 
> The following are the results of kernbench run in optimal mode on a 4
> vcpu linux guest with 512MB of ram.
> The host is a Dell Precision T3500 (HAP enabled):
> testbox:~# cat /proc/cpuinfo
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 26
> model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
> stepping        : 5
> cpu MHz         : 2266.806
> cache size      : 8192 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 11
> wp              : yes
> flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
> bogomips        : 4538.00
> clflush size    : 64
> power management:
> 
> processor       : 1
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 26
> model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
> stepping        : 5
> cpu MHz         : 2266.806
> cache size      : 8192 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 11
> wp              : yes
> flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
> bogomips        : 4538.00
> clflush size    : 64
> power management:
> 
> processor       : 2
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 26
> model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
> stepping        : 5
> cpu MHz         : 2266.806
> cache size      : 8192 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 11
> wp              : yes
> flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
> bogomips        : 4538.00
> clflush size    : 64
> power management:
> 
> processor       : 3
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 26
> model name      : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
> stepping        : 5
> cpu MHz         : 2266.806
> cache size      : 8192 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 11
> wp              : yes
> flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est ssse3 sse4_1 sse4_2 popcnt ida
> bogomips        : 4538.00
> clflush size    : 64
> power management:
> 
> 
> 
> 64 bit PV on HVM
> ----------------
> Average Optimal load -j 4 Run (std deviation):
> Elapsed Time 215.307 (10.1294)
> User Time 632.503 (6.4785)
> System Time 115.497 (4.53905)
> Percent CPU 347.333 (15.885)
> Context Switches 43319.7 (2088.39)
> Sleeps 48950 (3140.18)
> 
> 64 bit pure HVM
> ---------------
> Average Optimal load -j 4 Run (std deviation):
> Elapsed Time 235.25 (4.10598)
> User Time 512.48 (1.69714)
> System Time 73.5967 (0.65041)
> Percent CPU 248.667 (4.93288)
> Context Switches 35930.3 (342.837)
> Sleeps 56249.7 (2784.42)
> 
> 64 bit traditional PV
> ---------------------
> Average Optimal load -j 4 Run (std deviation):
> Elapsed Time 248.187 (12.3954)
> User Time 535.283 (0.818189)
> System Time 127.497 (0.342685)
> Percent CPU 266.667 (12.8582)
> Context Switches 32978 (2968.78)
> Sleeps 54391.3 (3141.44)
> 
> 
> The results show that the 64 bit PV on HVM case is the fastest, but it is
> also the one that uses the most CPU and User time.
> 
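
For reference, the gaps between the three kernbench runs quoted above can
be worked out directly from the reported means. The sketch below only
copies the Elapsed, User and System figures from the results; the
percent_faster and cpu_seconds helpers are illustrative names, not part of
kernbench.

```python
# Mean values (seconds) copied from the kernbench results quoted above.
RESULTS = {
    "PV on HVM":      {"elapsed": 215.307, "user": 632.503, "system": 115.497},
    "pure HVM":       {"elapsed": 235.25,  "user": 512.48,  "system": 73.5967},
    "traditional PV": {"elapsed": 248.187, "user": 535.283, "system": 127.497},
}


def percent_faster(candidate: float, baseline: float) -> float:
    """Return how much shorter `candidate` is than `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


def cpu_seconds(run: dict) -> float:
    """Total CPU time consumed by a run: user plus system time."""
    return run["user"] + run["system"]


if __name__ == "__main__":
    pvhvm = RESULTS["PV on HVM"]
    for name, run in RESULTS.items():
        if name == "PV on HVM":
            continue
        gain = percent_faster(pvhvm["elapsed"], run["elapsed"])
        extra = cpu_seconds(pvhvm) - cpu_seconds(run)
        print(f"PV on HVM vs {name}: {gain:.1f}% less elapsed time, "
              f"{extra:.1f}s more CPU time")
```

The reported standard deviations on elapsed time (roughly 4 to 12 seconds)
are not small next to these gaps, so the ordering should be read with some
caution.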

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 19:32     ` Pasi Kärkkäinen
@ 2010-05-24 19:51       ` Stefano Stabellini
  0 siblings, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-24 19:51 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel@lists.xensource.com, Christian Tramnitz,
	Stefano Stabellini

On Mon, 24 May 2010, Pasi Kärkkäinen wrote:
> Hmm.. are you sure kernbench is a proper benchmark for disk IO? 
> Kernel compilations fire up a lot of new processes, and that favours HVM guests.
> 
> Basically it seems really weird to me that in the results HVM (without PV drivers) 
> is faster than pure PV..
> 
> If you compare pure disk-IO PV vs. HVM the difference is usually big.. 
> PV beating HVM hands down.
> 

32 bit PV certainly would, even on kernbench, but keep in mind that 64
bit PV is not as fast.
In any case, as Jeremy previously said, it all depends on the
workload...

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 19:06 [Xen-devel] " Stefano Stabellini
@ 2010-05-25  6:14 ` Boris Derzhavets
  0 siblings, 0 replies; 46+ messages in thread
From: Boris Derzhavets @ 2010-05-25  6:14 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Stabellini,
	linux-kernel@vger.kernel.org, Stefano, Don Dutile, Sheng Yang


Which Xen version is this patch supposed to be applied to?
It does not seem to be 4.0.

Boris.

--- On Mon, 5/24/10, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:

From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: [Xen-devel] [PATCH 0 of 12] PV on HVM Xen
To: "Boris Derzhavets" <bderzhavets@yahoo.com>
Cc: "Jeremy Fitzhardinge" <jeremy@goop.org>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "Stabellini" <Stefano.Stabellini@eu.citrix.com>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, Stefano@yahoo.com, "Don Dutile" <ddutile@redhat.com>, "Sheng Yang" <sheng@linux.intel.com>
Date: Monday, May 24, 2010, 3:06 PM

On Mon, 24 May 2010, Boris Derzhavets wrote:
> > In order to be able to use VIRQ_TIMER and to improve performance you
> > need a patch to Xen to implement the vector callback mechanism
> > for event channel delivery.
> 
> Where can I get it?
> 

It is this one:

http://lists.xensource.com/archives/html/xen-devel/2010-05/msg00875.html


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-25  9:55 [Xen-devel] " Stefano Stabellini
@ 2010-05-25 11:15 ` Boris Derzhavets
  0 siblings, 0 replies; 46+ messages in thread
From: Boris Derzhavets @ 2010-05-25 11:15 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com,
	Stefano Stabellini, Stefano@yahoo.com, Don Dutile, Sheng Yang


Could you please resubmit it as a raw attachment?

Boris.

--- On Tue, 5/25/10, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:

From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: [Xen-devel] [PATCH 0 of 12] PV on HVM Xen
To: "Boris Derzhavets" <bderzhavets@yahoo.com>
Cc: "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com>, "Jeremy Fitzhardinge" <jeremy@goop.org>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "Stefano@yahoo.com" <Stefano@yahoo.com>, "Don Dutile" <ddutile@redhat.com>, "Sheng Yang" <sheng@linux.intel.com>
Date: Tuesday, May 25, 2010, 5:55 AM

On Tue, 25 May 2010, Boris Derzhavets wrote:
> Which Xen version is this patch supposed to be applied to?
> It does not seem to be 4.0.
> 

xen-unstable, but it shouldn't be difficult to port to 4.0.
BTW I just sent an updated version of the patch to the list.



      

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-24 18:25 Stefano Stabellini
                   ` (2 preceding siblings ...)
  2010-05-24 18:36 ` Boris Derzhavets
@ 2010-05-28 10:25 ` Boris Derzhavets
  2010-05-28 10:45   ` Pasi Kärkkäinen
  2010-05-28 11:06   ` Stefano Stabellini
  3 siblings, 2 replies; 46+ messages in thread
From: Boris Derzhavets @ 2010-05-28 10:25 UTC (permalink / raw)
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Don Dutile,
	Sheng Yang, Stefano Stabellini

What is the advantage of PV on HVM?
Kernel 2.6.34 with Stefano's patches may be built, I believe, only on a Linux HVM DomU.
At the same time, any recent Linux ( >=24 or >=26) supports PV guest install (it has been in
mainline for a while).
What am I missing here?

Boris.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-28 10:25 ` Boris Derzhavets
@ 2010-05-28 10:45   ` Pasi Kärkkäinen
  2010-05-28 11:06   ` Stefano Stabellini
  1 sibling, 0 replies; 46+ messages in thread
From: Pasi Kärkkäinen @ 2010-05-28 10:45 UTC (permalink / raw)
  To: Boris Derzhavets
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Don Dutile,
	Sheng Yang, Stefano Stabellini

On Fri, May 28, 2010 at 03:25:34AM -0700, Boris Derzhavets wrote:
>    What is the advantage of PV on HVM?
>

Pure HVM guests using the Qemu emulated disk/network devices are slow.
PV-on-HVM drivers make disk and network IO fast for HVM guests.

>    Kernel 2.6.34 with Stefano's patches may be built, I believe, only on
>    a Linux HVM DomU.
>

Exactly. They're meant for an upstream kernel running as a Xen HVM guest.

>    At the same time, any recent Linux ( >=24 or >=26) supports PV guest
>    install (it has been in mainline for a while).
>    What am I missing here?
> 

HVM guests might be faster for some workloads compared to PV guests.
Kernel compilation could be one example, i.e. workloads spawning a lot
of new processes all the time.

-- Pasi

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0 of 12] PV on HVM Xen
  2010-05-28 10:25 ` Boris Derzhavets
  2010-05-28 10:45   ` Pasi Kärkkäinen
@ 2010-05-28 11:06   ` Stefano Stabellini
  1 sibling, 0 replies; 46+ messages in thread
From: Stefano Stabellini @ 2010-05-28 11:06 UTC (permalink / raw)
  To: Boris Derzhavets
  Cc: Jeremy Fitzhardinge, xen-devel@lists.xensource.com, Don Dutile,
	Sheng Yang, Stefano Stabellini

On Fri, 28 May 2010, Boris Derzhavets wrote:
> What is the advantage of PV on HVM?
> Kernel 2.6.34 with Stefano's patches may be built, I believe, only on a Linux HVM DomU.
> At the same time, any recent Linux ( >=24 or >=26) supports PV guest install (it has been in
> mainline for a while).
> What am I missing here?
> 

A PV on HVM kernel doesn't need to be a special kernel: a pvops
kernel that can be used as a native kernel or a PV kernel can also be
used as a PV on HVM kernel just by adding CONFIG_XEN_PLATFORM_PCI.

PV on HVM support is good mainly for the performance of 64 bit guests
and simplicity of installation.
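
As a rough illustration of the point about CONFIG_XEN_PLATFORM_PCI, the
sketch below scans the text of a kernel .config and reports which of the
relevant options are not enabled. It is a minimal example under
assumptions: CONFIG_XEN_PLATFORM_PCI is the option named above, while
CONFIG_XEN_BLKDEV_FRONTEND and CONFIG_XEN_NETDEV_FRONTEND are assumed here
to be the Kconfig symbols for blkfront and netfront, and the helper names
are made up for this example.

```python
# Option named in the message above.
REQUIRED = ("CONFIG_XEN_PLATFORM_PCI",)
# Assumed Kconfig symbols for the blkfront and netfront drivers.
FRONTENDS = ("CONFIG_XEN_BLKDEV_FRONTEND", "CONFIG_XEN_NETDEV_FRONTEND")


def enabled_options(config_text: str) -> set:
    """Return the names of options set to 'y' or 'm' in a .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def missing_for_pvhvm(config_text: str) -> list:
    """List the options from REQUIRED and FRONTENDS that are not enabled."""
    enabled = enabled_options(config_text)
    return [opt for opt in REQUIRED + FRONTENDS if opt not in enabled]


if __name__ == "__main__":
    sample = (
        "CONFIG_XEN=y\n"
        "# CONFIG_XEN_PLATFORM_PCI is not set\n"
        "CONFIG_XEN_BLKDEV_FRONTEND=m\n"
    )
    print(missing_for_pvhvm(sample))
```

An empty list would mean the same pvops kernel configuration already has
everything this check looks for.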

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2010-05-28 11:06 UTC | newest]

Thread overview: 46+ messages
2010-05-18 10:22 [PATCH 0 of 12] PV on HVM Xen Stefano Stabellini
2010-05-18 10:22 ` [PATCH 01/12] Add support for hvm_op Stefano Stabellini
2010-05-18 10:22 ` [PATCH 02/12] early PV on HVM Stefano Stabellini
2010-05-18 10:22 ` [PATCH 03/12] evtchn delivery " Stefano Stabellini
2010-05-18 17:17   ` Jeremy Fitzhardinge
2010-05-19 12:24     ` Stefano Stabellini
2010-05-19 18:19       ` Jeremy Fitzhardinge
2010-05-18 17:43   ` Jeremy Fitzhardinge
2010-05-19 13:01     ` Stefano Stabellini
2010-05-18 18:10   ` Jeremy Fitzhardinge
2010-05-19 13:08     ` Stefano Stabellini
2010-05-18 10:22 ` [PATCH 04/12] Fix find_unbound_irq in presence of ioapic irqs Stefano Stabellini
2010-05-18 10:23 ` [PATCH 05/12] unplug emulated disks and nics Stefano Stabellini
2010-05-18 17:27   ` Jeremy Fitzhardinge
2010-05-19 13:00     ` Stefano Stabellini
2010-05-18 10:23 ` [PATCH 06/12] xen pci platform device driver Stefano Stabellini
2010-05-18 18:11   ` Jeremy Fitzhardinge
2010-05-19 13:50     ` Stefano Stabellini
2010-05-18 10:23 ` [PATCH 07/12] Add suspend\resume support for PV on HVM guests Stefano Stabellini
2010-05-18 18:11   ` Jeremy Fitzhardinge
2010-05-19 14:18     ` Stefano Stabellini
2010-05-18 10:23 ` [PATCH 08/12] Allow xen platform pci device to be compiled as a module Stefano Stabellini
2010-05-18 18:15   ` Jeremy Fitzhardinge
2010-05-19 14:19     ` Stefano Stabellini
2010-05-18 10:23 ` [PATCH 09/12] Fix possible NULL pointer dereference in print_IO_APIC Stefano Stabellini
2010-05-18 18:15   ` Jeremy Fitzhardinge
2010-05-19 14:25     ` Stefano Stabellini
2010-05-18 10:23 ` [PATCH 10/12] __setup_vector_irq: handle NULL chip_data Stefano Stabellini
2010-05-18 10:23 ` [PATCH 11/12] Support VIRQ_TIMER and pvclock on HVM Stefano Stabellini
2010-05-18 18:23   ` Jeremy Fitzhardinge
2010-05-18 10:23 ` [PATCH 12/12] Initialize xenbus device structs with ENODEV as default state Stefano Stabellini
2010-05-18 18:28   ` Jeremy Fitzhardinge
2010-05-18 10:55 ` [PATCH 0 of 12] PV on HVM Xen Christian Tramnitz
2010-05-18 18:41   ` Jeremy Fitzhardinge
2010-05-24 17:28   ` Stefano Stabellini
2010-05-24 19:32     ` Pasi Kärkkäinen
2010-05-24 19:51       ` Stefano Stabellini
  -- strict thread matches above, loose matches on Subject: below --
2010-05-24 18:25 Stefano Stabellini
2010-05-24 18:29 ` Stefano Stabellini
2010-05-24 18:30 ` Boris Derzhavets
2010-05-24 18:36 ` Boris Derzhavets
2010-05-28 10:25 ` Boris Derzhavets
2010-05-28 10:45   ` Pasi Kärkkäinen
2010-05-28 11:06   ` Stefano Stabellini
2010-05-24 19:06 [Xen-devel] " Stefano Stabellini
2010-05-25  6:14 ` Boris Derzhavets
2010-05-25  9:55 [Xen-devel] " Stefano Stabellini
2010-05-25 11:15 ` Boris Derzhavets
