xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code
@ 2016-11-14 17:17 Vitaly Kuznetsov
  2016-11-14 17:17 ` [RFC PATCH KERNEL 1/4] x86/xen: start untangling " Vitaly Kuznetsov
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-14 17:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

Hi,

I have a long-standing idea to separate PV and PVHVM code in kernel and 
introduce Kconfig options to make it possible to enable the required
parts only breaking the current 'all or nothing' approach.

Motivation:
- Xen related x86 code in kernel is rather big and it is unclear which
  parts of it are required for PV, for HVM or for both. With PVH coming
  into picture is becomes even more tangled. It makes it hard to
  understand/audit the code.

- In some case we may want to avoid bloating kernel by supporting Xen
  guests we don't need. In particular, 90% of the code in arch/x86/xen/ is
  required to support PV guests and one may require PVHVM support only.

- PV guests are supposed to go away one day and such code separation would
  help us to get ready.

This RFC adds XEN_PV Kconfig option and makes it possible to build PV-only
and PVHVM-only kernels. It also makes it possible to disable Dom0 support.
The series is incomplete and probably dirty in some places, I didn't pay
much attention to the current PVH implementation as (as far as I
understand) it is supposed to be replaced with PVHv2 but before investing
more I'd like to get opinions whether such refactoring will be welcomed.

Patches are rather big but this is mostly just moving code around, no
functional changes intended. I smoke tested it with PV-only and PVHVM-only
builds.

Vitaly Kuznetsov (4):
  x86/xen: start untangling PV and PVHVM guest support code
  x86/xen: split smp.c for PV and PVHVM guests
  x86/xen: put setup.c, mmu.c and p2m.c under CONFIG_XEN_PV
  x86/xen: put setup.c, pmu.c and apic.c under CONFIG_XEN_PV

 arch/x86/include/asm/hypervisor.h |   3 +-
 arch/x86/include/asm/xen/page.h   |  44 ++++-
 arch/x86/kernel/cpu/hypervisor.c  |   7 +-
 arch/x86/kernel/process_64.c      |   2 +-
 arch/x86/xen/Kconfig              |  35 +++-
 arch/x86/xen/Makefile             |  14 +-
 arch/x86/xen/enlighten.c          | 396 +++-----------------------------------
 arch/x86/xen/enlighten_common.c   | 216 +++++++++++++++++++++
 arch/x86/xen/enlighten_hvm.c      | 202 +++++++++++++++++++
 arch/x86/xen/mmu.c                | 286 +--------------------------
 arch/x86/xen/mmu_common.c         | 219 +++++++++++++++++++++
 arch/x86/xen/mmu_hvm.c            |  77 ++++++++
 arch/x86/xen/pmu.h                |   5 +
 arch/x86/xen/smp.c                | 301 +++--------------------------
 arch/x86/xen/smp.h                |  23 +++
 arch/x86/xen/smp_common.c         | 246 +++++++++++++++++++++++
 arch/x86/xen/smp_hvm.c            |  54 ++++++
 arch/x86/xen/suspend.c            |   4 +
 arch/x86/xen/xen-head.S           |   4 +
 arch/x86/xen/xen-ops.h            |   2 +
 drivers/xen/balloon.c             |  30 ++-
 include/xen/xen-ops.h             |  20 ++
 22 files changed, 1230 insertions(+), 960 deletions(-)
 create mode 100644 arch/x86/xen/enlighten_common.c
 create mode 100644 arch/x86/xen/enlighten_hvm.c
 create mode 100644 arch/x86/xen/mmu_common.c
 create mode 100644 arch/x86/xen/mmu_hvm.c
 create mode 100644 arch/x86/xen/smp_common.c
 create mode 100644 arch/x86/xen/smp_hvm.c

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH KERNEL 1/4] x86/xen: start untangling PV and PVHVM guest support code
  2016-11-14 17:17 [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code Vitaly Kuznetsov
@ 2016-11-14 17:17 ` Vitaly Kuznetsov
  2016-11-15 11:26   ` David Vrabel
  2016-11-15 15:50   ` Boris Ostrovsky
  2016-11-14 17:17 ` [RFC PATCH KERNEL 2/4] x86/xen: split smp.c for PV and PVHVM guests Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-14 17:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

Introduce CONFIG_XEN_PV config option and split enlighten.c into
3 files. Temporary add #ifdef CONFIG_XEN_PV to smp.c and mmu.c to
not break the build and not make the patch even bigger.

xen_cpu_up_prepare*/xen_cpu_die hooks require separation to support
future xen_smp_intr_init() split.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hypervisor.h |   3 +-
 arch/x86/kernel/cpu/hypervisor.c  |   7 +-
 arch/x86/kernel/process_64.c      |   2 +-
 arch/x86/xen/Kconfig              |  25 ++-
 arch/x86/xen/Makefile             |   7 +-
 arch/x86/xen/enlighten.c          | 388 ++------------------------------------
 arch/x86/xen/enlighten_common.c   | 216 +++++++++++++++++++++
 arch/x86/xen/enlighten_hvm.c      | 202 ++++++++++++++++++++
 arch/x86/xen/mmu.c                |   2 +
 arch/x86/xen/smp.c                |  40 ++--
 arch/x86/xen/suspend.c            |   4 +
 arch/x86/xen/xen-head.S           |   4 +
 arch/x86/xen/xen-ops.h            |   2 +
 include/xen/xen-ops.h             |   7 +
 14 files changed, 510 insertions(+), 399 deletions(-)
 create mode 100644 arch/x86/xen/enlighten_common.c
 create mode 100644 arch/x86/xen/enlighten_hvm.c

diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h
index 67942b6..4faa12d 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -53,7 +53,8 @@ extern const struct hypervisor_x86 *x86_hyper;
 /* Recognized hypervisors */
 extern const struct hypervisor_x86 x86_hyper_vmware;
 extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
-extern const struct hypervisor_x86 x86_hyper_xen;
+extern const struct hypervisor_x86 x86_hyper_xen_pv;
+extern const struct hypervisor_x86 x86_hyper_xen_pvhvm;
 extern const struct hypervisor_x86 x86_hyper_kvm;
 
 extern void init_hypervisor(struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 35691a6..dd68b03 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -28,8 +28,11 @@
 
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
-#ifdef CONFIG_XEN
-	&x86_hyper_xen,
+#ifdef CONFIG_XEN_PV
+	&x86_hyper_xen_pv,
+#endif
+#ifdef CONFIG_XEN_PVHVM
+	&x86_hyper_xen_pvhvm,
 #endif
 	&x86_hyper_vmware,
 	&x86_hyper_ms_hyperv,
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index b3760b3..4e91a02 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -434,7 +434,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 		     task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV))
 		__switch_to_xtra(prev_p, next_p, tss);
 
-#ifdef CONFIG_XEN
+#ifdef CONFIG_XEN_PV
 	/*
 	 * On Xen PV, IOPL bits in pt_regs->flags have no effect, and
 	 * current_pt_regs()->flags may not match the current task's
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index c7b15f3..8298378 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -6,7 +6,6 @@ config XEN
 	bool "Xen guest support"
 	depends on PARAVIRT
 	select PARAVIRT_CLOCK
-	select XEN_HAVE_PVMMU
 	select XEN_HAVE_VPMU
 	depends on X86_64 || (X86_32 && X86_PAE)
 	depends on X86_LOCAL_APIC && X86_TSC
@@ -15,18 +14,32 @@ config XEN
 	  kernel to boot in a paravirtualized environment under the
 	  Xen hypervisor.
 
+config XEN_PV
+	bool "Xen PV guest support"
+	default y
+	depends on XEN
+	help
+	  Support running as a Xen PV guest.
+
 config XEN_DOM0
-	def_bool y
-	depends on XEN && PCI_XEN && SWIOTLB_XEN
+	bool "Xen PV Dom0 support"
+	default y
+	depends on XEN_PV && PCI_XEN && SWIOTLB_XEN
 	depends on X86_IO_APIC && ACPI && PCI
+	select XEN_HAVE_PVMMU
+	help
+	  Support running as a Xen PV Dom0 guest.
 
 config XEN_PVHVM
-	def_bool y
+	bool "Xen PVHVM guest support"
+	default y
 	depends on XEN && PCI && X86_LOCAL_APIC
+	help
+	  Support running as a Xen PVHVM guest.
 
 config XEN_512GB
 	bool "Limit Xen pv-domain memory to 512GB"
-	depends on XEN && X86_64
+	depends on XEN_PV && X86_64
 	default y
 	help
 	  Limit paravirtualized user domains to 512GB of RAM.
@@ -53,5 +66,5 @@ config XEN_DEBUG_FS
 
 config XEN_PVH
 	bool "Support for running as a PVH guest"
-	depends on X86_64 && XEN && XEN_PVHVM
+	depends on X86_64 && XEN_PV && XEN_PVHVM
 	def_bool n
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index e47e527..e60fc93 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -10,11 +10,14 @@ nostackp := $(call cc-option, -fno-stack-protector)
 CFLAGS_enlighten.o		:= $(nostackp)
 CFLAGS_mmu.o			:= $(nostackp)
 
-obj-y		:= enlighten.o setup.o multicalls.o mmu.o irq.o \
-			time.o xen-asm.o xen-asm_$(BITS).o \
+obj-y		:= enlighten_common.o setup.o multicalls.o \
+			mmu.o irq.o time.o xen-asm.o xen-asm_$(BITS).o \
 			grant-table.o suspend.o platform-pci-unplug.o \
 			p2m.o apic.o pmu.o
 
+obj-$(CONFIG_XEN_PV)		+= enlighten.o
+obj-$(CONFIG_XEN_PVHVM)		+= enlighten_hvm.o
+
 obj-$(CONFIG_EVENT_TRACING) += trace.o
 
 obj-$(CONFIG_SMP)		+= smp.o
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index bdd8556..086c339 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -90,78 +90,13 @@
 #include "multicalls.h"
 #include "pmu.h"
 
-EXPORT_SYMBOL_GPL(hypercall_page);
-
-/*
- * Pointer to the xen_vcpu_info structure or
- * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
- * and xen_vcpu_setup for details. By default it points to share_info->vcpu_info
- * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
- * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
- * acknowledge pending events.
- * Also more subtly it is used by the patched version of irq enable/disable
- * e.g. xen_irq_enable_direct and xen_iret in PV mode.
- *
- * The desire to be able to do those mask/unmask operations as a single
- * instruction by using the per-cpu offset held in %gs is the real reason
- * vcpu info is in a per-cpu pointer and the original reason for this
- * hypercall.
- *
- */
-DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
-
-/*
- * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
- * hypercall. This can be used both in PV and PVHVM mode. The structure
- * overrides the default per_cpu(xen_vcpu, cpu) value.
- */
-DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
-
-/* Linux <-> Xen vCPU id mapping */
-DEFINE_PER_CPU(uint32_t, xen_vcpu_id);
-EXPORT_PER_CPU_SYMBOL(xen_vcpu_id);
-
-enum xen_domain_type xen_domain_type = XEN_NATIVE;
-EXPORT_SYMBOL_GPL(xen_domain_type);
-
-unsigned long *machine_to_phys_mapping = (void *)MACH2PHYS_VIRT_START;
-EXPORT_SYMBOL(machine_to_phys_mapping);
-unsigned long  machine_to_phys_nr;
-EXPORT_SYMBOL(machine_to_phys_nr);
-
-struct start_info *xen_start_info;
-EXPORT_SYMBOL_GPL(xen_start_info);
-
-struct shared_info xen_dummy_shared_info;
 
 void *xen_initial_gdt;
 
 RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
 
-static int xen_cpu_up_prepare(unsigned int cpu);
-static int xen_cpu_up_online(unsigned int cpu);
-static int xen_cpu_dead(unsigned int cpu);
-
-/*
- * Point at some empty memory to start with. We map the real shared_info
- * page as soon as fixmap is up and running.
- */
-struct shared_info *HYPERVISOR_shared_info = &xen_dummy_shared_info;
-
-/*
- * Flag to determine whether vcpu info placement is available on all
- * VCPUs.  We assume it is to start with, and then set it to zero on
- * the first failure.  This is because it can succeed on some VCPUs
- * and not others, since it can involve hypervisor memory allocation,
- * or because the guest failed to guarantee all the appropriate
- * constraints on all VCPUs (ie buffer can't cross a page boundary).
- *
- * Note that any particular CPU may be using a placed vcpu structure,
- * but we can only optimise if the all are.
- *
- * 0: not available, 1: available
- */
-static int have_vcpu_info_placement = 1;
+static int xen_cpu_up_prepare_pv(unsigned int cpu);
+static int xen_cpu_dead_pv(unsigned int cpu);
 
 struct tls_descs {
 	struct desc_struct desc[3];
@@ -176,73 +111,6 @@ struct tls_descs {
  */
 static DEFINE_PER_CPU(struct tls_descs, shadow_tls_desc);
 
-static void clamp_max_cpus(void)
-{
-#ifdef CONFIG_SMP
-	if (setup_max_cpus > MAX_VIRT_CPUS)
-		setup_max_cpus = MAX_VIRT_CPUS;
-#endif
-}
-
-void xen_vcpu_setup(int cpu)
-{
-	struct vcpu_register_vcpu_info info;
-	int err;
-	struct vcpu_info *vcpup;
-
-	BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
-
-	/*
-	 * This path is called twice on PVHVM - first during bootup via
-	 * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
-	 * hotplugged: cpu_up -> xen_hvm_cpu_notify.
-	 * As we can only do the VCPUOP_register_vcpu_info once lets
-	 * not over-write its result.
-	 *
-	 * For PV it is called during restore (xen_vcpu_restore) and bootup
-	 * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
-	 * use this function.
-	 */
-	if (xen_hvm_domain()) {
-		if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
-			return;
-	}
-	if (xen_vcpu_nr(cpu) < MAX_VIRT_CPUS)
-		per_cpu(xen_vcpu, cpu) =
-			&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
-
-	if (!have_vcpu_info_placement) {
-		if (cpu >= MAX_VIRT_CPUS)
-			clamp_max_cpus();
-		return;
-	}
-
-	vcpup = &per_cpu(xen_vcpu_info, cpu);
-	info.mfn = arbitrary_virt_to_mfn(vcpup);
-	info.offset = offset_in_page(vcpup);
-
-	/* Check to see if the hypervisor will put the vcpu_info
-	   structure where we want it, which allows direct access via
-	   a percpu-variable.
-	   N.B. This hypercall can _only_ be called once per CPU. Subsequent
-	   calls will error out with -EINVAL. This is due to the fact that
-	   hypervisor has no unregister variant and this hypercall does not
-	   allow to over-write info.mfn and info.offset.
-	 */
-	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, xen_vcpu_nr(cpu),
-				 &info);
-
-	if (err) {
-		printk(KERN_DEBUG "register_vcpu_info failed: err=%d\n", err);
-		have_vcpu_info_placement = 0;
-		clamp_max_cpus();
-	} else {
-		/* This cpu is using the registered vcpu info, even if
-		   later ones fail to. */
-		per_cpu(xen_vcpu, cpu) = vcpup;
-	}
-}
-
 /*
  * On restore, set the vcpu placement up again.
  * If it fails, then we're in a bad state, since
@@ -1291,18 +1159,6 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.end_context_switch = xen_end_context_switch,
 };
 
-static void xen_reboot(int reason)
-{
-	struct sched_shutdown r = { .reason = reason };
-	int cpu;
-
-	for_each_online_cpu(cpu)
-		xen_pmu_finish(cpu);
-
-	if (HYPERVISOR_sched_op(SCHEDOP_shutdown, &r))
-		BUG();
-}
-
 static void xen_restart(char *msg)
 {
 	xen_reboot(SHUTDOWN_reboot);
@@ -1330,25 +1186,6 @@ static void xen_crash_shutdown(struct pt_regs *regs)
 	xen_reboot(SHUTDOWN_crash);
 }
 
-static int
-xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
-{
-	if (!kexec_crash_loaded())
-		xen_reboot(SHUTDOWN_crash);
-	return NOTIFY_DONE;
-}
-
-static struct notifier_block xen_panic_block = {
-	.notifier_call= xen_panic_event,
-	.priority = INT_MIN
-};
-
-int xen_panic_handler_init(void)
-{
-	atomic_notifier_chain_register(&panic_notifier_list, &xen_panic_block);
-	return 0;
-}
-
 static const struct machine_ops xen_machine_ops __initconst = {
 	.restart = xen_restart,
 	.halt = xen_machine_halt,
@@ -1537,24 +1374,6 @@ static void __init xen_dom0_set_legacy_features(void)
 	x86_platform.legacy.rtc = 1;
 }
 
-static int xen_cpuhp_setup(void)
-{
-	int rc;
-
-	rc = cpuhp_setup_state_nocalls(CPUHP_XEN_PREPARE,
-				       "XEN_HVM_GUEST_PREPARE",
-				       xen_cpu_up_prepare, xen_cpu_dead);
-	if (rc >= 0) {
-		rc = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
-					       "XEN_HVM_GUEST_ONLINE",
-					       xen_cpu_up_online, NULL);
-		if (rc < 0)
-			cpuhp_remove_state_nocalls(CPUHP_XEN_PREPARE);
-	}
-
-	return rc >= 0 ? 0 : rc;
-}
-
 /* First C function to be called on Xen boot */
 asmlinkage __visible void __init xen_start_kernel(void)
 {
@@ -1656,7 +1475,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	   possible map and a non-dummy shared_info. */
 	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
 
-	WARN_ON(xen_cpuhp_setup());
+	WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_pv, xen_cpu_dead_pv));
 
 	local_irq_disable();
 	early_boot_irqs_disabled = true;
@@ -1771,95 +1590,10 @@ asmlinkage __visible void __init xen_start_kernel(void)
 #endif
 }
 
-void __ref xen_hvm_init_shared_info(void)
-{
-	int cpu;
-	struct xen_add_to_physmap xatp;
-	static struct shared_info *shared_info_page = 0;
-
-	if (!shared_info_page)
-		shared_info_page = (struct shared_info *)
-			extend_brk(PAGE_SIZE, PAGE_SIZE);
-	xatp.domid = DOMID_SELF;
-	xatp.idx = 0;
-	xatp.space = XENMAPSPACE_shared_info;
-	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
-	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
-		BUG();
-
-	HYPERVISOR_shared_info = (struct shared_info *)shared_info_page;
-
-	/* xen_vcpu is a pointer to the vcpu_info struct in the shared_info
-	 * page, we use it in the event channel upcall and in some pvclock
-	 * related functions. We don't need the vcpu_info placement
-	 * optimizations because we don't use any pv_mmu or pv_irq op on
-	 * HVM.
-	 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is
-	 * online but xen_hvm_init_shared_info is run at resume time too and
-	 * in that case multiple vcpus might be online. */
-	for_each_online_cpu(cpu) {
-		/* Leave it to be NULL. */
-		if (xen_vcpu_nr(cpu) >= MAX_VIRT_CPUS)
-			continue;
-		per_cpu(xen_vcpu, cpu) =
-			&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
-	}
-}
-
-#ifdef CONFIG_XEN_PVHVM
-static void __init init_hvm_pv_info(void)
-{
-	int major, minor;
-	uint32_t eax, ebx, ecx, edx, pages, msr, base;
-	u64 pfn;
-
-	base = xen_cpuid_base();
-	cpuid(base + 1, &eax, &ebx, &ecx, &edx);
-
-	major = eax >> 16;
-	minor = eax & 0xffff;
-	printk(KERN_INFO "Xen version %d.%d.\n", major, minor);
-
-	cpuid(base + 2, &pages, &msr, &ecx, &edx);
-
-	pfn = __pa(hypercall_page);
-	wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
-
-	xen_setup_features();
-
-	cpuid(base + 4, &eax, &ebx, &ecx, &edx);
-	if (eax & XEN_HVM_CPUID_VCPU_ID_PRESENT)
-		this_cpu_write(xen_vcpu_id, ebx);
-	else
-		this_cpu_write(xen_vcpu_id, smp_processor_id());
-
-	pv_info.name = "Xen HVM";
-
-	xen_domain_type = XEN_HVM_DOMAIN;
-}
-#endif
-
-static int xen_cpu_up_prepare(unsigned int cpu)
+static int xen_cpu_up_prepare_pv(unsigned int cpu)
 {
 	int rc;
 
-	if (xen_hvm_domain()) {
-		/*
-		 * This can happen if CPU was offlined earlier and
-		 * offlining timed out in common_cpu_die().
-		 */
-		if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) {
-			xen_smp_intr_free(cpu);
-			xen_uninit_lock_cpu(cpu);
-		}
-
-		if (cpu_acpi_id(cpu) != U32_MAX)
-			per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu);
-		else
-			per_cpu(xen_vcpu_id, cpu) = cpu;
-		xen_vcpu_setup(cpu);
-	}
-
 	if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
 		xen_setup_timer(cpu);
 
@@ -1872,7 +1606,7 @@ static int xen_cpu_up_prepare(unsigned int cpu)
 	return 0;
 }
 
-static int xen_cpu_dead(unsigned int cpu)
+static int xen_cpu_dead_pv(unsigned int cpu)
 {
 	xen_smp_intr_free(cpu);
 
@@ -1882,84 +1616,6 @@ static int xen_cpu_dead(unsigned int cpu)
 	return 0;
 }
 
-static int xen_cpu_up_online(unsigned int cpu)
-{
-	xen_init_lock_cpu(cpu);
-	return 0;
-}
-
-#ifdef CONFIG_XEN_PVHVM
-#ifdef CONFIG_KEXEC_CORE
-static void xen_hvm_shutdown(void)
-{
-	native_machine_shutdown();
-	if (kexec_in_progress)
-		xen_reboot(SHUTDOWN_soft_reset);
-}
-
-static void xen_hvm_crash_shutdown(struct pt_regs *regs)
-{
-	native_machine_crash_shutdown(regs);
-	xen_reboot(SHUTDOWN_soft_reset);
-}
-#endif
-
-static void __init xen_hvm_guest_init(void)
-{
-	if (xen_pv_domain())
-		return;
-
-	init_hvm_pv_info();
-
-	xen_hvm_init_shared_info();
-
-	xen_panic_handler_init();
-
-	BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
-
-	xen_hvm_smp_init();
-	WARN_ON(xen_cpuhp_setup());
-	xen_unplug_emulated_devices();
-	x86_init.irqs.intr_init = xen_init_IRQ;
-	xen_hvm_init_time_ops();
-	xen_hvm_init_mmu_ops();
-#ifdef CONFIG_KEXEC_CORE
-	machine_ops.shutdown = xen_hvm_shutdown;
-	machine_ops.crash_shutdown = xen_hvm_crash_shutdown;
-#endif
-}
-#endif
-
-static bool xen_nopv = false;
-static __init int xen_parse_nopv(char *arg)
-{
-       xen_nopv = true;
-       return 0;
-}
-early_param("xen_nopv", xen_parse_nopv);
-
-static uint32_t __init xen_platform(void)
-{
-	if (xen_nopv)
-		return 0;
-
-	return xen_cpuid_base();
-}
-
-bool xen_hvm_need_lapic(void)
-{
-	if (xen_nopv)
-		return false;
-	if (xen_pv_domain())
-		return false;
-	if (!xen_hvm_domain())
-		return false;
-	if (xen_feature(XENFEAT_hvm_pirqs))
-		return false;
-	return true;
-}
-EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
-
 static void xen_set_cpu_features(struct cpuinfo_x86 *c)
 {
 	if (xen_pv_domain()) {
@@ -2007,28 +1663,18 @@ static void xen_pin_vcpu(int cpu)
 	}
 }
 
-const struct hypervisor_x86 x86_hyper_xen = {
-	.name			= "Xen",
-	.detect			= xen_platform,
-#ifdef CONFIG_XEN_PVHVM
-	.init_platform		= xen_hvm_guest_init,
-#endif
-	.x2apic_available	= xen_x2apic_para_available,
-	.set_cpu_features       = xen_set_cpu_features,
-	.pin_vcpu               = xen_pin_vcpu,
-};
-EXPORT_SYMBOL(x86_hyper_xen);
-
-#ifdef CONFIG_HOTPLUG_CPU
-void xen_arch_register_cpu(int num)
+uint32_t __init xen_platform_pv(void)
 {
-	arch_register_cpu(num);
-}
-EXPORT_SYMBOL(xen_arch_register_cpu);
+	if (xen_pv_domain())
+		return xen_cpuid_base();
 
-void xen_arch_unregister_cpu(int num)
-{
-	arch_unregister_cpu(num);
+	return 0;
 }
-EXPORT_SYMBOL(xen_arch_unregister_cpu);
-#endif
+
+const struct hypervisor_x86 x86_hyper_xen_pv = {
+	.name			= "Xen PV",
+	.detect			= xen_platform_pv,
+	.set_cpu_features       = xen_set_cpu_features,
+	.pin_vcpu               = xen_pin_vcpu,
+};
+EXPORT_SYMBOL(x86_hyper_xen_pv);
diff --git a/arch/x86/xen/enlighten_common.c b/arch/x86/xen/enlighten_common.c
new file mode 100644
index 0000000..b8a7be2
--- /dev/null
+++ b/arch/x86/xen/enlighten_common.c
@@ -0,0 +1,216 @@
+#include <linux/cpu.h>
+#include <linux/kexec.h>
+#include <linux/interrupt.h>
+
+#include <xen/features.h>
+
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+#include <asm/xen/page.h>
+#include <asm/cpu.h>
+
+#include "xen-ops.h"
+#include "pmu.h"
+#include "smp.h"
+
+EXPORT_SYMBOL_GPL(hypercall_page);
+
+/*
+ * Pointer to the xen_vcpu_info structure or
+ * &HYPERVISOR_shared_info->vcpu_info[cpu]. See xen_hvm_init_shared_info
+ * and xen_vcpu_setup for details. By default it points to share_info->vcpu_info
+ * but if the hypervisor supports VCPUOP_register_vcpu_info then it can point
+ * to xen_vcpu_info. The pointer is used in __xen_evtchn_do_upcall to
+ * acknowledge pending events.
+ * Also more subtly it is used by the patched version of irq enable/disable
+ * e.g. xen_irq_enable_direct and xen_iret in PV mode.
+ *
+ * The desire to be able to do those mask/unmask operations as a single
+ * instruction by using the per-cpu offset held in %gs is the real reason
+ * vcpu info is in a per-cpu pointer and the original reason for this
+ * hypercall.
+ *
+ */
+DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
+
+/*
+ * Per CPU pages used if hypervisor supports VCPUOP_register_vcpu_info
+ * hypercall. This can be used both in PV and PVHVM mode. The structure
+ * overrides the default per_cpu(xen_vcpu, cpu) value.
+ */
+DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
+
+/* Linux <-> Xen vCPU id mapping */
+DEFINE_PER_CPU(uint32_t, xen_vcpu_id);
+EXPORT_PER_CPU_SYMBOL(xen_vcpu_id);
+
+enum xen_domain_type xen_domain_type = XEN_NATIVE;
+EXPORT_SYMBOL_GPL(xen_domain_type);
+
+unsigned long *machine_to_phys_mapping = (void *)MACH2PHYS_VIRT_START;
+EXPORT_SYMBOL(machine_to_phys_mapping);
+unsigned long  machine_to_phys_nr;
+EXPORT_SYMBOL(machine_to_phys_nr);
+
+struct start_info *xen_start_info;
+EXPORT_SYMBOL_GPL(xen_start_info);
+
+struct shared_info xen_dummy_shared_info;
+
+/*
+ * Point at some empty memory to start with. We map the real shared_info
+ * page as soon as fixmap is up and running.
+ */
+struct shared_info *HYPERVISOR_shared_info = &xen_dummy_shared_info;
+
+/*
+ * Flag to determine whether vcpu info placement is available on all
+ * VCPUs.  We assume it is to start with, and then set it to zero on
+ * the first failure.  This is because it can succeed on some VCPUs
+ * and not others, since it can involve hypervisor memory allocation,
+ * or because the guest failed to guarantee all the appropriate
+ * constraints on all VCPUs (ie buffer can't cross a page boundary).
+ *
+ * Note that any particular CPU may be using a placed vcpu structure,
+ * but we can only optimise if the all are.
+ *
+ * 0: not available, 1: available
+ */
+int have_vcpu_info_placement = 1;
+
+static int xen_cpu_up_online(unsigned int cpu)
+{
+	xen_init_lock_cpu(cpu);
+	return 0;
+}
+
+int xen_cpuhp_setup(int (*cpu_up_prepare_cb)(unsigned int),
+		    int (*cpu_dead_cb)(unsigned int))
+{
+	int rc;
+
+	rc = cpuhp_setup_state_nocalls(CPUHP_XEN_PREPARE,
+				       "XEN_HVM_GUEST_PREPARE",
+				       cpu_up_prepare_cb, cpu_dead_cb);
+	if (rc >= 0) {
+		rc = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+					       "XEN_HVM_GUEST_ONLINE",
+					       xen_cpu_up_online, NULL);
+		if (rc < 0)
+			cpuhp_remove_state_nocalls(CPUHP_XEN_PREPARE);
+	}
+
+	return rc >= 0 ? 0 : rc;
+}
+
+static void clamp_max_cpus(void)
+{
+#ifdef CONFIG_SMP
+	if (setup_max_cpus > MAX_VIRT_CPUS)
+		setup_max_cpus = MAX_VIRT_CPUS;
+#endif
+}
+
+void xen_vcpu_setup(int cpu)
+{
+	struct vcpu_register_vcpu_info info;
+	int err;
+	struct vcpu_info *vcpup;
+
+	BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
+
+	/*
+	 * This path is called twice on PVHVM - first during bootup via
+	 * smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
+	 * hotplugged: cpu_up -> xen_hvm_cpu_notify.
+	 * As we can only do the VCPUOP_register_vcpu_info once lets
+	 * not over-write its result.
+	 *
+	 * For PV it is called during restore (xen_vcpu_restore) and bootup
+	 * (xen_setup_vcpu_info_placement). The hotplug mechanism does not
+	 * use this function.
+	 */
+	if (xen_hvm_domain()) {
+		if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
+			return;
+	}
+	if (xen_vcpu_nr(cpu) < MAX_VIRT_CPUS)
+		per_cpu(xen_vcpu, cpu) =
+			&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
+
+	if (!have_vcpu_info_placement) {
+		if (cpu >= MAX_VIRT_CPUS)
+			clamp_max_cpus();
+		return;
+	}
+
+	vcpup = &per_cpu(xen_vcpu_info, cpu);
+	info.mfn = arbitrary_virt_to_mfn(vcpup);
+	info.offset = offset_in_page(vcpup);
+
+	/* Check to see if the hypervisor will put the vcpu_info
+	   structure where we want it, which allows direct access via
+	   a percpu-variable.
+	   N.B. This hypercall can _only_ be called once per CPU. Subsequent
+	   calls will error out with -EINVAL. This is due to the fact that
+	   hypervisor has no unregister variant and this hypercall does not
+	   allow to over-write info.mfn and info.offset.
+	 */
+	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, xen_vcpu_nr(cpu),
+				 &info);
+
+	if (err) {
+		printk(KERN_DEBUG "register_vcpu_info failed: err=%d\n", err);
+		have_vcpu_info_placement = 0;
+		clamp_max_cpus();
+	} else {
+		/* This cpu is using the registered vcpu info, even if
+		   later ones fail to. */
+		per_cpu(xen_vcpu, cpu) = vcpup;
+	}
+}
+
+void xen_reboot(int reason)
+{
+	struct sched_shutdown r = { .reason = reason };
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		xen_pmu_finish(cpu);
+
+	if (HYPERVISOR_sched_op(SCHEDOP_shutdown, &r))
+		BUG();
+}
+
+static int
+xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
+{
+	if (!kexec_crash_loaded())
+		xen_reboot(SHUTDOWN_crash);
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block xen_panic_block = {
+	.notifier_call = xen_panic_event,
+	.priority = INT_MIN
+};
+
+int xen_panic_handler_init(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list, &xen_panic_block);
+	return 0;
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+void xen_arch_register_cpu(int num)
+{
+	arch_register_cpu(num);
+}
+EXPORT_SYMBOL(xen_arch_register_cpu);
+
+void xen_arch_unregister_cpu(int num)
+{
+	arch_unregister_cpu(num);
+}
+EXPORT_SYMBOL(xen_arch_unregister_cpu);
+#endif
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
new file mode 100644
index 0000000..58b9e44
--- /dev/null
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -0,0 +1,202 @@
+#include <linux/cpu.h>
+#include <linux/kexec.h>
+
+#include <xen/features.h>
+#include <xen/events.h>
+#include <xen/interface/memory.h>
+
+#include <asm/reboot.h>
+#include <asm/setup.h>
+#include <asm/hypervisor.h>
+
+#include <asm/xen/cpuid.h>
+#include <asm/xen/hypervisor.h>
+
+#include "xen-ops.h"
+#include "mmu.h"
+#include "smp.h"
+
+void __ref xen_hvm_init_shared_info(void)
+{
+	int cpu;
+	struct xen_add_to_physmap xatp;
+	static struct shared_info *shared_info_page;
+
+	if (!shared_info_page)
+		shared_info_page = (struct shared_info *)
+			extend_brk(PAGE_SIZE, PAGE_SIZE);
+	xatp.domid = DOMID_SELF;
+	xatp.idx = 0;
+	xatp.space = XENMAPSPACE_shared_info;
+	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
+		BUG();
+
+	HYPERVISOR_shared_info = (struct shared_info *)shared_info_page;
+
+	/* xen_vcpu is a pointer to the vcpu_info struct in the shared_info
+	 * page, we use it in the event channel upcall and in some pvclock
+	 * related functions. We don't need the vcpu_info placement
+	 * optimizations because we don't use any pv_mmu or pv_irq op on
+	 * HVM.
+	 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is
+	 * online but xen_hvm_init_shared_info is run at resume time too and
+	 * in that case multiple vcpus might be online. */
+	for_each_online_cpu(cpu) {
+		/* Leave it to be NULL. */
+		if (xen_vcpu_nr(cpu) >= MAX_VIRT_CPUS)
+			continue;
+		per_cpu(xen_vcpu, cpu) =
+			&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
+	}
+}
+
+static void __init init_hvm_pv_info(void)
+{
+	int major, minor;
+	uint32_t eax, ebx, ecx, edx, pages, msr, base;
+	u64 pfn;
+
+	base = xen_cpuid_base();
+	cpuid(base + 1, &eax, &ebx, &ecx, &edx);
+
+	major = eax >> 16;
+	minor = eax & 0xffff;
+	printk(KERN_INFO "Xen version %d.%d.\n", major, minor);
+
+	cpuid(base + 2, &pages, &msr, &ecx, &edx);
+
+	pfn = __pa(hypercall_page);
+	wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
+
+	xen_setup_features();
+
+	cpuid(base + 4, &eax, &ebx, &ecx, &edx);
+	if (eax & XEN_HVM_CPUID_VCPU_ID_PRESENT)
+		this_cpu_write(xen_vcpu_id, ebx);
+	else
+		this_cpu_write(xen_vcpu_id, smp_processor_id());
+
+	pv_info.name = "Xen HVM";
+
+	xen_domain_type = XEN_HVM_DOMAIN;
+}
+
+#ifdef CONFIG_KEXEC_CORE
+static void xen_hvm_shutdown(void)
+{
+	native_machine_shutdown();
+	if (kexec_in_progress)
+		xen_reboot(SHUTDOWN_soft_reset);
+}
+
+static void xen_hvm_crash_shutdown(struct pt_regs *regs)
+{
+	native_machine_crash_shutdown(regs);
+	xen_reboot(SHUTDOWN_soft_reset);
+}
+#endif
+
+static int xen_cpu_up_prepare_hvm(unsigned int cpu)
+{
+	int rc;
+
+	/*
+	 * This can happen if CPU was offlined earlier and
+	 * offlining timed out in common_cpu_die().
+	 */
+	if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) {
+		xen_smp_intr_free(cpu);
+		xen_uninit_lock_cpu(cpu);
+	}
+
+	if (cpu_acpi_id(cpu) != U32_MAX)
+		per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu);
+	else
+		per_cpu(xen_vcpu_id, cpu) = cpu;
+	xen_vcpu_setup(cpu);
+
+	if (xen_feature(XENFEAT_hvm_safe_pvclock))
+		xen_setup_timer(cpu);
+
+	rc = xen_smp_intr_init(cpu);
+	if (rc) {
+		WARN(1, "xen_smp_intr_init() for CPU %d failed: %d\n",
+		     cpu, rc);
+		return rc;
+	}
+	return 0;
+}
+
+static int xen_cpu_dead_hvm(unsigned int cpu)
+{
+	xen_smp_intr_free(cpu);
+
+	if (xen_feature(XENFEAT_hvm_safe_pvclock))
+		xen_teardown_timer(cpu);
+
+	return 0;
+}
+
+void __init xen_hvm_guest_init(void)
+{
+	if (xen_pv_domain())
+		return;
+
+	init_hvm_pv_info();
+
+	xen_hvm_init_shared_info();
+
+	xen_panic_handler_init();
+
+	BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
+
+	xen_hvm_smp_init();
+	WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
+	xen_unplug_emulated_devices();
+	x86_init.irqs.intr_init = xen_init_IRQ;
+	xen_hvm_init_time_ops();
+	xen_hvm_init_mmu_ops();
+#ifdef CONFIG_KEXEC_CORE
+	machine_ops.shutdown = xen_hvm_shutdown;
+	machine_ops.crash_shutdown = xen_hvm_crash_shutdown;
+#endif
+}
+
+bool xen_hvm_need_lapic(void)
+{
+	if (xen_nopv)
+		return false;
+	if (xen_pv_domain())
+		return false;
+	if (!xen_hvm_domain())
+		return false;
+	if (xen_feature(XENFEAT_hvm_pirqs))
+		return false;
+	return true;
+}
+EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
+
+bool xen_nopv;
+static __init int xen_parse_nopv(char *arg)
+{
+	xen_nopv = true;
+	return 0;
+}
+early_param("xen_nopv", xen_parse_nopv);
+
+uint32_t __init xen_platform_hvm(void)
+{
+	if (xen_pv_domain() || xen_nopv)
+		return 0;
+
+	return xen_cpuid_base();
+}
+
+const struct hypervisor_x86 x86_hyper_xen_pvhvm = {
+	.name			= "Xen",
+	.detect			= xen_platform_hvm,
+	.init_platform		= xen_hvm_guest_init,
+	.x2apic_available	= xen_x2apic_para_available,
+};
+EXPORT_SYMBOL(x86_hyper_xen_pvhvm);
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 7d5afdb..65e184b 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1295,7 +1295,9 @@ static void __init xen_pagetable_init(void)
 	if (!xen_feature(XENFEAT_auto_translated_physmap))
 		xen_remap_memory();
 
+#ifdef CONFIG_XEN_PV
 	xen_setup_shared_info();
+#endif
 }
 static void xen_write_cr2(unsigned long cr2)
 {
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 9fa27ce..bdb0d9c 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -253,6 +253,7 @@ int xen_smp_intr_init(unsigned int cpu)
 	return rc;
 }
 
+#ifdef CONFIG_XEN_PV
 static void __init xen_fill_possible_map(void)
 {
 	int i, rc;
@@ -304,6 +305,7 @@ static void __init xen_filter_cpu_maps(void)
 #endif
 
 }
+#endif
 
 static void __init xen_smp_prepare_boot_cpu(void)
 {
@@ -311,6 +313,7 @@ static void __init xen_smp_prepare_boot_cpu(void)
 	native_smp_prepare_boot_cpu();
 
 	if (xen_pv_domain()) {
+#ifdef CONFIG_XEN_PV
 		if (!xen_feature(XENFEAT_writable_page_tables))
 			/* We've switched to the "real" per-cpu gdt, so make
 			 * sure the old memory can be recycled. */
@@ -327,6 +330,7 @@ static void __init xen_smp_prepare_boot_cpu(void)
 
 		xen_filter_cpu_maps();
 		xen_setup_vcpu_info_placement();
+#endif
 	}
 
 	/*
@@ -344,6 +348,7 @@ static void __init xen_smp_prepare_boot_cpu(void)
 	xen_init_spinlocks();
 }
 
+#ifdef CONFIG_XEN_PV
 static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 {
 	unsigned cpu;
@@ -525,22 +530,6 @@ static int xen_cpu_disable(void)
 	return 0;
 }
 
-static void xen_cpu_die(unsigned int cpu)
-{
-	while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up,
-						     xen_vcpu_nr(cpu), NULL)) {
-		__set_current_state(TASK_UNINTERRUPTIBLE);
-		schedule_timeout(HZ/10);
-	}
-
-	if (common_cpu_die(cpu) == 0) {
-		xen_smp_intr_free(cpu);
-		xen_uninit_lock_cpu(cpu);
-		xen_teardown_timer(cpu);
-		xen_pmu_finish(cpu);
-	}
-}
-
 static void xen_play_dead(void) /* used only with HOTPLUG_CPU */
 {
 	play_dead_common();
@@ -592,6 +581,23 @@ static void xen_stop_other_cpus(int wait)
 {
 	smp_call_function(stop_self, NULL, wait);
 }
+#endif /* CONFIG_XEN_PV */
+
+static void xen_cpu_die(unsigned int cpu)
+{
+	while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up,
+						     xen_vcpu_nr(cpu), NULL)) {
+		__set_current_state(TASK_UNINTERRUPTIBLE);
+		schedule_timeout(HZ/10);
+	}
+
+	if (common_cpu_die(cpu) == 0) {
+		xen_smp_intr_free(cpu);
+		xen_uninit_lock_cpu(cpu);
+		xen_teardown_timer(cpu);
+		xen_pmu_finish(cpu);
+	}
+}
 
 static void xen_smp_send_reschedule(int cpu)
 {
@@ -738,6 +744,7 @@ static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_XEN_PV
 static const struct smp_ops xen_smp_ops __initconst = {
 	.smp_prepare_boot_cpu = xen_smp_prepare_boot_cpu,
 	.smp_prepare_cpus = xen_smp_prepare_cpus,
@@ -760,6 +767,7 @@ void __init xen_smp_init(void)
 	smp_ops = xen_smp_ops;
 	xen_fill_possible_map();
 }
+#endif
 
 static void __init xen_hvm_smp_prepare_cpus(unsigned int max_cpus)
 {
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 7f664c4..37f634f 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -16,6 +16,7 @@
 
 static void xen_pv_pre_suspend(void)
 {
+#ifdef CONFIG_XEN_PV
 	xen_mm_pin_all();
 
 	xen_start_info->store_mfn = mfn_to_pfn(xen_start_info->store_mfn);
@@ -28,6 +29,7 @@ static void xen_pv_pre_suspend(void)
 	if (HYPERVISOR_update_va_mapping(fix_to_virt(FIX_PARAVIRT_BOOTMAP),
 					 __pte_ma(0), 0))
 		BUG();
+#endif
 }
 
 static void xen_hvm_post_suspend(int suspend_cancelled)
@@ -48,6 +50,7 @@ static void xen_hvm_post_suspend(int suspend_cancelled)
 
 static void xen_pv_post_suspend(int suspend_cancelled)
 {
+#ifdef CONFIG_XEN_PV
 	xen_build_mfn_list_list();
 
 	xen_setup_shared_info();
@@ -66,6 +69,7 @@ static void xen_pv_post_suspend(int suspend_cancelled)
 	}
 
 	xen_mm_unpin_all();
+#endif
 }
 
 void xen_arch_pre_suspend(void)
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 7f8d8ab..b0016b3 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -35,6 +35,7 @@
 #define PVH_FEATURES (0)
 #endif
 
+#ifdef CONFIG_XEN_PV
 	__INIT
 ENTRY(startup_xen)
 	cld
@@ -53,6 +54,7 @@ ENTRY(startup_xen)
 	jmp xen_start_kernel
 
 	__FINIT
+#endif
 
 #ifdef CONFIG_XEN_PVH
 /*
@@ -112,7 +114,9 @@ ENTRY(hypercall_page)
 	/* Map the p2m table to a 512GB-aligned user address. */
 	ELFNOTE(Xen, XEN_ELFNOTE_INIT_P2M,       .quad PGDIR_SIZE)
 #endif
+#ifdef CONFIG_XEN_PV
 	ELFNOTE(Xen, XEN_ELFNOTE_ENTRY,          _ASM_PTR startup_xen)
+#endif
 	ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, _ASM_PTR hypercall_page)
 	ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,       .ascii "!writable_page_tables|pae_pgdir_above_4gb"; .asciz PVH_FEATURES_STR)
 	ELFNOTE(Xen, XEN_ELFNOTE_SUPPORTED_FEATURES, .long (PVH_FEATURES) |
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 3cbce3b..b4e1d35 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -76,6 +76,8 @@ irqreturn_t xen_debug_interrupt(int irq, void *dev_id);
 
 bool xen_vcpu_stolen(int vcpu);
 
+extern int have_vcpu_info_placement;
+
 void xen_vcpu_setup(int cpu);
 void xen_setup_vcpu_info_placement(void);
 
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index b5486e6..bcf90ed 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -15,6 +15,8 @@ static inline uint32_t xen_vcpu_nr(int cpu)
 	return per_cpu(xen_vcpu_id, cpu);
 }
 
+extern bool xen_nopv;
+
 void xen_arch_pre_suspend(void);
 void xen_arch_post_suspend(int suspend_cancelled);
 
@@ -33,6 +35,11 @@ u64 xen_steal_clock(int cpu);
 
 int xen_setup_shutdown_event(void);
 
+int xen_cpuhp_setup(int (*cpu_up_prepare_cb)(unsigned int),
+		    int (*cpu_dead_cb)(unsigned int));
+
+void xen_reboot(int reason);
+
 extern unsigned long *xen_contiguous_bitmap;
 int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				unsigned int address_bits,
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH KERNEL 2/4] x86/xen: split smp.c for PV and PVHVM guests
  2016-11-14 17:17 [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code Vitaly Kuznetsov
  2016-11-14 17:17 ` [RFC PATCH KERNEL 1/4] x86/xen: start untangling " Vitaly Kuznetsov
@ 2016-11-14 17:17 ` Vitaly Kuznetsov
  2016-11-14 17:17 ` [RFC PATCH KERNEL 3/4] x86/xen: put setup.c, mmu.c and p2m.c under CONFIG_XEN_PV Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-14 17:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

More or less mechanically split smp.c into 3 files. XEN_PV_SMP and
XEN_PVHVM_SMP config options added to support the change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/xen/Kconfig      |   8 ++
 arch/x86/xen/Makefile     |   5 +-
 arch/x86/xen/enlighten.c  |   8 ++
 arch/x86/xen/smp.c        | 295 +++-------------------------------------------
 arch/x86/xen/smp.h        |  23 ++++
 arch/x86/xen/smp_common.c | 246 ++++++++++++++++++++++++++++++++++++++
 arch/x86/xen/smp_hvm.c    |  54 +++++++++
 7 files changed, 359 insertions(+), 280 deletions(-)
 create mode 100644 arch/x86/xen/smp_common.c
 create mode 100644 arch/x86/xen/smp_hvm.c

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 8298378..e25c93e 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -21,6 +21,10 @@ config XEN_PV
 	help
 	  Support running as a Xen PV guest.
 
+config XEN_PV_SMP
+	def_bool y
+	depends on XEN_PV && SMP
+
 config XEN_DOM0
 	bool "Xen PV Dom0 support"
 	default y
@@ -37,6 +41,10 @@ config XEN_PVHVM
 	help
 	  Support running as a Xen PVHVM guest.
 
+config XEN_PVHVM_SMP
+	def_bool y
+	depends on XEN_PVHVM && SMP
+
 config XEN_512GB
 	bool "Limit Xen pv-domain memory to 512GB"
 	depends on XEN_PV && X86_64
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index e60fc93..aa6cd5e 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -20,7 +20,10 @@ obj-$(CONFIG_XEN_PVHVM)		+= enlighten_hvm.o
 
 obj-$(CONFIG_EVENT_TRACING) += trace.o
 
-obj-$(CONFIG_SMP)		+= smp.o
+obj-$(CONFIG_SMP)		+= smp_common.o
+obj-$(CONFIG_XEN_PV_SMP)	+= smp.o
+obj-$(CONFIG_XEN_PVHVM_SMP)	+= smp_hvm.o
+
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= spinlock.o
 obj-$(CONFIG_XEN_DEBUG_FS)	+= debugfs.o
 obj-$(CONFIG_XEN_DOM0)		+= vga.o
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 086c339..583221c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1603,12 +1603,20 @@ static int xen_cpu_up_prepare_pv(unsigned int cpu)
 		     cpu, rc);
 		return rc;
 	}
+
+	rc = xen_smp_intr_init_pv(cpu);
+	if (rc) {
+		WARN(1, "xen_smp_intr_init_pv() for CPU %d failed: %d\n",
+		     cpu, rc);
+		return rc;
+	}
 	return 0;
 }
 
 static int xen_cpu_dead_pv(unsigned int cpu)
 {
 	xen_smp_intr_free(cpu);
+	xen_smp_intr_free_pv(cpu);
 
 	if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
 		xen_teardown_timer(cpu);
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index bdb0d9c..0cb6d99 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -43,32 +43,11 @@
 
 cpumask_var_t xen_cpu_initialized_map;
 
-struct xen_common_irq {
-	int irq;
-	char *name;
-};
-static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq) = { .irq = -1 };
-static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq) = { .irq = -1 };
-static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq) = { .irq = -1 };
 static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work) = { .irq = -1 };
-static DEFINE_PER_CPU(struct xen_common_irq, xen_debug_irq) = { .irq = -1 };
 static DEFINE_PER_CPU(struct xen_common_irq, xen_pmu_irq) = { .irq = -1 };
 
-static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id);
-static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id);
 static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id);
 
-/*
- * Reschedule call back.
- */
-static irqreturn_t xen_reschedule_interrupt(int irq, void *dev_id)
-{
-	inc_irq_stat(irq_resched_count);
-	scheduler_ipi();
-
-	return IRQ_HANDLED;
-}
-
 static void cpu_bringup(void)
 {
 	int cpu;
@@ -121,36 +100,8 @@ asmlinkage __visible void cpu_bringup_and_idle(int cpu)
 	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }
 
-void xen_smp_intr_free(unsigned int cpu)
+void xen_smp_intr_free_pv(unsigned int cpu)
 {
-	if (per_cpu(xen_resched_irq, cpu).irq >= 0) {
-		unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu).irq, NULL);
-		per_cpu(xen_resched_irq, cpu).irq = -1;
-		kfree(per_cpu(xen_resched_irq, cpu).name);
-		per_cpu(xen_resched_irq, cpu).name = NULL;
-	}
-	if (per_cpu(xen_callfunc_irq, cpu).irq >= 0) {
-		unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu).irq, NULL);
-		per_cpu(xen_callfunc_irq, cpu).irq = -1;
-		kfree(per_cpu(xen_callfunc_irq, cpu).name);
-		per_cpu(xen_callfunc_irq, cpu).name = NULL;
-	}
-	if (per_cpu(xen_debug_irq, cpu).irq >= 0) {
-		unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu).irq, NULL);
-		per_cpu(xen_debug_irq, cpu).irq = -1;
-		kfree(per_cpu(xen_debug_irq, cpu).name);
-		per_cpu(xen_debug_irq, cpu).name = NULL;
-	}
-	if (per_cpu(xen_callfuncsingle_irq, cpu).irq >= 0) {
-		unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu).irq,
-				       NULL);
-		per_cpu(xen_callfuncsingle_irq, cpu).irq = -1;
-		kfree(per_cpu(xen_callfuncsingle_irq, cpu).name);
-		per_cpu(xen_callfuncsingle_irq, cpu).name = NULL;
-	}
-	if (xen_hvm_domain())
-		return;
-
 	if (per_cpu(xen_irq_work, cpu).irq >= 0) {
 		unbind_from_irqhandler(per_cpu(xen_irq_work, cpu).irq, NULL);
 		per_cpu(xen_irq_work, cpu).irq = -1;
@@ -164,63 +115,12 @@ void xen_smp_intr_free(unsigned int cpu)
 		kfree(per_cpu(xen_pmu_irq, cpu).name);
 		per_cpu(xen_pmu_irq, cpu).name = NULL;
 	}
-};
-int xen_smp_intr_init(unsigned int cpu)
+}
+
+int xen_smp_intr_init_pv(unsigned int cpu)
 {
 	int rc;
-	char *resched_name, *callfunc_name, *debug_name, *pmu_name;
-
-	resched_name = kasprintf(GFP_KERNEL, "resched%d", cpu);
-	rc = bind_ipi_to_irqhandler(XEN_RESCHEDULE_VECTOR,
-				    cpu,
-				    xen_reschedule_interrupt,
-				    IRQF_PERCPU|IRQF_NOBALANCING,
-				    resched_name,
-				    NULL);
-	if (rc < 0)
-		goto fail;
-	per_cpu(xen_resched_irq, cpu).irq = rc;
-	per_cpu(xen_resched_irq, cpu).name = resched_name;
-
-	callfunc_name = kasprintf(GFP_KERNEL, "callfunc%d", cpu);
-	rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_VECTOR,
-				    cpu,
-				    xen_call_function_interrupt,
-				    IRQF_PERCPU|IRQF_NOBALANCING,
-				    callfunc_name,
-				    NULL);
-	if (rc < 0)
-		goto fail;
-	per_cpu(xen_callfunc_irq, cpu).irq = rc;
-	per_cpu(xen_callfunc_irq, cpu).name = callfunc_name;
-
-	debug_name = kasprintf(GFP_KERNEL, "debug%d", cpu);
-	rc = bind_virq_to_irqhandler(VIRQ_DEBUG, cpu, xen_debug_interrupt,
-				     IRQF_PERCPU | IRQF_NOBALANCING,
-				     debug_name, NULL);
-	if (rc < 0)
-		goto fail;
-	per_cpu(xen_debug_irq, cpu).irq = rc;
-	per_cpu(xen_debug_irq, cpu).name = debug_name;
-
-	callfunc_name = kasprintf(GFP_KERNEL, "callfuncsingle%d", cpu);
-	rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_SINGLE_VECTOR,
-				    cpu,
-				    xen_call_function_single_interrupt,
-				    IRQF_PERCPU|IRQF_NOBALANCING,
-				    callfunc_name,
-				    NULL);
-	if (rc < 0)
-		goto fail;
-	per_cpu(xen_callfuncsingle_irq, cpu).irq = rc;
-	per_cpu(xen_callfuncsingle_irq, cpu).name = callfunc_name;
-
-	/*
-	 * The IRQ worker on PVHVM goes through the native path and uses the
-	 * IPI mechanism.
-	 */
-	if (xen_hvm_domain())
-		return 0;
+	char *callfunc_name, *pmu_name;
 
 	callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu);
 	rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR,
@@ -249,11 +149,10 @@ int xen_smp_intr_init(unsigned int cpu)
 	return 0;
 
  fail:
-	xen_smp_intr_free(cpu);
+	xen_smp_intr_free_pv(cpu);
 	return rc;
 }
 
-#ifdef CONFIG_XEN_PV
 static void __init xen_fill_possible_map(void)
 {
 	int i, rc;
@@ -305,7 +204,6 @@ static void __init xen_filter_cpu_maps(void)
 #endif
 
 }
-#endif
 
 static void __init xen_smp_prepare_boot_cpu(void)
 {
@@ -313,7 +211,6 @@ static void __init xen_smp_prepare_boot_cpu(void)
 	native_smp_prepare_boot_cpu();
 
 	if (xen_pv_domain()) {
-#ifdef CONFIG_XEN_PV
 		if (!xen_feature(XENFEAT_writable_page_tables))
 			/* We've switched to the "real" per-cpu gdt, so make
 			 * sure the old memory can be recycled. */
@@ -330,16 +227,9 @@ static void __init xen_smp_prepare_boot_cpu(void)
 
 		xen_filter_cpu_maps();
 		xen_setup_vcpu_info_placement();
-#endif
 	}
 
 	/*
-	 * Setup vcpu_info for boot CPU.
-	 */
-	if (xen_hvm_domain())
-		xen_vcpu_setup(0);
-
-	/*
 	 * The alternative logic (which patches the unlock/lock) runs before
 	 * the smp bootup up code is activated. Hence we need to set this up
 	 * the core kernel is being patched. Otherwise we will have only
@@ -348,7 +238,6 @@ static void __init xen_smp_prepare_boot_cpu(void)
 	xen_init_spinlocks();
 }
 
-#ifdef CONFIG_XEN_PV
 static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 {
 	unsigned cpu;
@@ -380,6 +269,9 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 	if (xen_smp_intr_init(0))
 		BUG();
 
+	if (xen_smp_intr_init_pv(0))
+		BUG();
+
 	if (!alloc_cpumask_var(&xen_cpu_initialized_map, GFP_KERNEL))
 		panic("could not allocate xen_cpu_initialized_map\n");
 
@@ -489,7 +381,7 @@ static int xen_cpu_up(unsigned int cpu, struct task_struct *idle)
 
 	/*
 	 * PV VCPUs are always successfully taken down (see 'while' loop
-	 * in xen_cpu_die()), so -EBUSY is an error.
+	 * in xen_cpu_die_pv()), so -EBUSY is an error.
 	 */
 	rc = cpu_check_up_prepare(cpu);
 	if (rc)
@@ -552,7 +444,7 @@ static int xen_cpu_disable(void)
 	return -ENOSYS;
 }
 
-static void xen_cpu_die(unsigned int cpu)
+static void xen_cpu_die_pv(unsigned int cpu)
 {
 	BUG();
 }
@@ -581,159 +473,24 @@ static void xen_stop_other_cpus(int wait)
 {
 	smp_call_function(stop_self, NULL, wait);
 }
-#endif /* CONFIG_XEN_PV */
 
-static void xen_cpu_die(unsigned int cpu)
+void xen_cpu_die_pv(unsigned int cpu)
 {
-	while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up,
-						     xen_vcpu_nr(cpu), NULL)) {
+	while (HYPERVISOR_vcpu_op(VCPUOP_is_up,
+				  xen_vcpu_nr(cpu), NULL)) {
 		__set_current_state(TASK_UNINTERRUPTIBLE);
 		schedule_timeout(HZ/10);
 	}
 
 	if (common_cpu_die(cpu) == 0) {
 		xen_smp_intr_free(cpu);
+		xen_smp_intr_free_pv(cpu);
 		xen_uninit_lock_cpu(cpu);
 		xen_teardown_timer(cpu);
 		xen_pmu_finish(cpu);
 	}
 }
 
-static void xen_smp_send_reschedule(int cpu)
-{
-	xen_send_IPI_one(cpu, XEN_RESCHEDULE_VECTOR);
-}
-
-static void __xen_send_IPI_mask(const struct cpumask *mask,
-			      int vector)
-{
-	unsigned cpu;
-
-	for_each_cpu_and(cpu, mask, cpu_online_mask)
-		xen_send_IPI_one(cpu, vector);
-}
-
-static void xen_smp_send_call_function_ipi(const struct cpumask *mask)
-{
-	int cpu;
-
-	__xen_send_IPI_mask(mask, XEN_CALL_FUNCTION_VECTOR);
-
-	/* Make sure other vcpus get a chance to run if they need to. */
-	for_each_cpu(cpu, mask) {
-		if (xen_vcpu_stolen(cpu)) {
-			HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
-			break;
-		}
-	}
-}
-
-static void xen_smp_send_call_function_single_ipi(int cpu)
-{
-	__xen_send_IPI_mask(cpumask_of(cpu),
-			  XEN_CALL_FUNCTION_SINGLE_VECTOR);
-}
-
-static inline int xen_map_vector(int vector)
-{
-	int xen_vector;
-
-	switch (vector) {
-	case RESCHEDULE_VECTOR:
-		xen_vector = XEN_RESCHEDULE_VECTOR;
-		break;
-	case CALL_FUNCTION_VECTOR:
-		xen_vector = XEN_CALL_FUNCTION_VECTOR;
-		break;
-	case CALL_FUNCTION_SINGLE_VECTOR:
-		xen_vector = XEN_CALL_FUNCTION_SINGLE_VECTOR;
-		break;
-	case IRQ_WORK_VECTOR:
-		xen_vector = XEN_IRQ_WORK_VECTOR;
-		break;
-#ifdef CONFIG_X86_64
-	case NMI_VECTOR:
-	case APIC_DM_NMI: /* Some use that instead of NMI_VECTOR */
-		xen_vector = XEN_NMI_VECTOR;
-		break;
-#endif
-	default:
-		xen_vector = -1;
-		printk(KERN_ERR "xen: vector 0x%x is not implemented\n",
-			vector);
-	}
-
-	return xen_vector;
-}
-
-void xen_send_IPI_mask(const struct cpumask *mask,
-			      int vector)
-{
-	int xen_vector = xen_map_vector(vector);
-
-	if (xen_vector >= 0)
-		__xen_send_IPI_mask(mask, xen_vector);
-}
-
-void xen_send_IPI_all(int vector)
-{
-	int xen_vector = xen_map_vector(vector);
-
-	if (xen_vector >= 0)
-		__xen_send_IPI_mask(cpu_online_mask, xen_vector);
-}
-
-void xen_send_IPI_self(int vector)
-{
-	int xen_vector = xen_map_vector(vector);
-
-	if (xen_vector >= 0)
-		xen_send_IPI_one(smp_processor_id(), xen_vector);
-}
-
-void xen_send_IPI_mask_allbutself(const struct cpumask *mask,
-				int vector)
-{
-	unsigned cpu;
-	unsigned int this_cpu = smp_processor_id();
-	int xen_vector = xen_map_vector(vector);
-
-	if (!(num_online_cpus() > 1) || (xen_vector < 0))
-		return;
-
-	for_each_cpu_and(cpu, mask, cpu_online_mask) {
-		if (this_cpu == cpu)
-			continue;
-
-		xen_send_IPI_one(cpu, xen_vector);
-	}
-}
-
-void xen_send_IPI_allbutself(int vector)
-{
-	xen_send_IPI_mask_allbutself(cpu_online_mask, vector);
-}
-
-static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id)
-{
-	irq_enter();
-	generic_smp_call_function_interrupt();
-	inc_irq_stat(irq_call_count);
-	irq_exit();
-
-	return IRQ_HANDLED;
-}
-
-static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id)
-{
-	irq_enter();
-	generic_smp_call_function_single_interrupt();
-	inc_irq_stat(irq_call_count);
-	irq_exit();
-
-	return IRQ_HANDLED;
-}
-
 static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id)
 {
 	irq_enter();
@@ -744,14 +501,13 @@ static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-#ifdef CONFIG_XEN_PV
 static const struct smp_ops xen_smp_ops __initconst = {
 	.smp_prepare_boot_cpu = xen_smp_prepare_boot_cpu,
 	.smp_prepare_cpus = xen_smp_prepare_cpus,
 	.smp_cpus_done = xen_smp_cpus_done,
 
 	.cpu_up = xen_cpu_up,
-	.cpu_die = xen_cpu_die,
+	.cpu_die = xen_cpu_die_pv,
 	.cpu_disable = xen_cpu_disable,
 	.play_dead = xen_play_dead,
 
@@ -767,22 +523,3 @@ void __init xen_smp_init(void)
 	smp_ops = xen_smp_ops;
 	xen_fill_possible_map();
 }
-#endif
-
-static void __init xen_hvm_smp_prepare_cpus(unsigned int max_cpus)
-{
-	native_smp_prepare_cpus(max_cpus);
-	WARN_ON(xen_smp_intr_init(0));
-
-	xen_init_lock_cpu(0);
-}
-
-void __init xen_hvm_smp_init(void)
-{
-	smp_ops.smp_prepare_cpus = xen_hvm_smp_prepare_cpus;
-	smp_ops.smp_send_reschedule = xen_smp_send_reschedule;
-	smp_ops.cpu_die = xen_cpu_die;
-	smp_ops.send_call_func_ipi = xen_smp_send_call_function_ipi;
-	smp_ops.send_call_func_single_ipi = xen_smp_send_call_function_single_ipi;
-	smp_ops.smp_prepare_boot_cpu = xen_smp_prepare_boot_cpu;
-}
diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index c5c16dc..94ed5cc 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -8,9 +8,22 @@ extern void xen_send_IPI_mask_allbutself(const struct cpumask *mask,
 extern void xen_send_IPI_allbutself(int vector);
 extern void xen_send_IPI_all(int vector);
 extern void xen_send_IPI_self(int vector);
+extern void xen_send_IPI_mask(const struct cpumask *mask, int vector);
 
 extern int xen_smp_intr_init(unsigned int cpu);
 extern void xen_smp_intr_free(unsigned int cpu);
+#ifdef CONFIG_XEN_PV
+extern int xen_smp_intr_init_pv(unsigned int cpu);
+extern void xen_smp_intr_free_pv(unsigned int cpu);
+#endif
+extern void xen_smp_send_reschedule(int cpu);
+extern void xen_smp_send_call_function_ipi(const struct cpumask *mask);
+extern void xen_smp_send_call_function_single_ipi(int cpu);
+
+struct xen_common_irq {
+	int irq;
+	char *name;
+};
 
 #else /* CONFIG_SMP */
 
@@ -18,6 +31,16 @@ static inline int xen_smp_intr_init(unsigned int cpu)
 {
 	return 0;
 }
+
+#ifdef CONFIG_XEN_PV
+static inline int xen_smp_intr_init_pv(unsigned int cpu)
+{
+	return 0;
+}
+
+static inline void xen_smp_intr_free_pv(unsigned int cpu) {}
+#endif
+
 static inline void xen_smp_intr_free(unsigned int cpu) {}
 #endif /* CONFIG_SMP */
 
diff --git a/arch/x86/xen/smp_common.c b/arch/x86/xen/smp_common.c
new file mode 100644
index 0000000..7a39cad
--- /dev/null
+++ b/arch/x86/xen/smp_common.c
@@ -0,0 +1,246 @@
+#include <linux/smp.h>
+#include <linux/slab.h>
+#include <linux/cpumask.h>
+#include <linux/percpu.h>
+
+#include <xen/events.h>
+
+#include "xen-ops.h"
+#include "pmu.h"
+#include "smp.h"
+
+static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq) = { .irq = -1 };
+static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq) = { .irq = -1 };
+static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq) = { .irq = -1 };
+static DEFINE_PER_CPU(struct xen_common_irq, xen_debug_irq) = { .irq = -1 };
+
+/*
+ * Reschedule call back.
+ */
+static irqreturn_t xen_reschedule_interrupt(int irq, void *dev_id)
+{
+	inc_irq_stat(irq_resched_count);
+	scheduler_ipi();
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id)
+{
+	irq_enter();
+	generic_smp_call_function_interrupt();
+	inc_irq_stat(irq_call_count);
+	irq_exit();
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id)
+{
+	irq_enter();
+	generic_smp_call_function_single_interrupt();
+	inc_irq_stat(irq_call_count);
+	irq_exit();
+
+	return IRQ_HANDLED;
+}
+
+void xen_smp_intr_free(unsigned int cpu)
+{
+	if (per_cpu(xen_resched_irq, cpu).irq >= 0) {
+		unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu).irq, NULL);
+		per_cpu(xen_resched_irq, cpu).irq = -1;
+		kfree(per_cpu(xen_resched_irq, cpu).name);
+		per_cpu(xen_resched_irq, cpu).name = NULL;
+	}
+	if (per_cpu(xen_callfunc_irq, cpu).irq >= 0) {
+		unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu).irq, NULL);
+		per_cpu(xen_callfunc_irq, cpu).irq = -1;
+		kfree(per_cpu(xen_callfunc_irq, cpu).name);
+		per_cpu(xen_callfunc_irq, cpu).name = NULL;
+	}
+	if (per_cpu(xen_debug_irq, cpu).irq >= 0) {
+		unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu).irq, NULL);
+		per_cpu(xen_debug_irq, cpu).irq = -1;
+		kfree(per_cpu(xen_debug_irq, cpu).name);
+		per_cpu(xen_debug_irq, cpu).name = NULL;
+	}
+	if (per_cpu(xen_callfuncsingle_irq, cpu).irq >= 0) {
+		unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu).irq,
+				       NULL);
+		per_cpu(xen_callfuncsingle_irq, cpu).irq = -1;
+		kfree(per_cpu(xen_callfuncsingle_irq, cpu).name);
+		per_cpu(xen_callfuncsingle_irq, cpu).name = NULL;
+	}
+}
+
+int xen_smp_intr_init(unsigned int cpu)
+{
+	int rc;
+	char *resched_name, *callfunc_name, *debug_name;
+
+	resched_name = kasprintf(GFP_KERNEL, "resched%d", cpu);
+	rc = bind_ipi_to_irqhandler(XEN_RESCHEDULE_VECTOR,
+				    cpu,
+				    xen_reschedule_interrupt,
+				    IRQF_PERCPU|IRQF_NOBALANCING,
+				    resched_name,
+				    NULL);
+	if (rc < 0)
+		goto fail;
+	per_cpu(xen_resched_irq, cpu).irq = rc;
+	per_cpu(xen_resched_irq, cpu).name = resched_name;
+
+	callfunc_name = kasprintf(GFP_KERNEL, "callfunc%d", cpu);
+	rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_VECTOR,
+				    cpu,
+				    xen_call_function_interrupt,
+				    IRQF_PERCPU|IRQF_NOBALANCING,
+				    callfunc_name,
+				    NULL);
+	if (rc < 0)
+		goto fail;
+	per_cpu(xen_callfunc_irq, cpu).irq = rc;
+	per_cpu(xen_callfunc_irq, cpu).name = callfunc_name;
+
+	debug_name = kasprintf(GFP_KERNEL, "debug%d", cpu);
+	rc = bind_virq_to_irqhandler(VIRQ_DEBUG, cpu, xen_debug_interrupt,
+				     IRQF_PERCPU | IRQF_NOBALANCING,
+				     debug_name, NULL);
+	if (rc < 0)
+		goto fail;
+	per_cpu(xen_debug_irq, cpu).irq = rc;
+	per_cpu(xen_debug_irq, cpu).name = debug_name;
+
+	callfunc_name = kasprintf(GFP_KERNEL, "callfuncsingle%d", cpu);
+	rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_SINGLE_VECTOR,
+				    cpu,
+				    xen_call_function_single_interrupt,
+				    IRQF_PERCPU|IRQF_NOBALANCING,
+				    callfunc_name,
+				    NULL);
+	if (rc < 0)
+		goto fail;
+	per_cpu(xen_callfuncsingle_irq, cpu).irq = rc;
+	per_cpu(xen_callfuncsingle_irq, cpu).name = callfunc_name;
+
+	return 0;
+
+ fail:
+	xen_smp_intr_free(cpu);
+	return rc;
+}
+
+void xen_smp_send_reschedule(int cpu)
+{
+	xen_send_IPI_one(cpu, XEN_RESCHEDULE_VECTOR);
+}
+
+static void __xen_send_IPI_mask(const struct cpumask *mask,
+				int vector)
+{
+	unsigned cpu;
+
+	for_each_cpu_and(cpu, mask, cpu_online_mask)
+		xen_send_IPI_one(cpu, vector);
+}
+
+void xen_smp_send_call_function_ipi(const struct cpumask *mask)
+{
+	int cpu;
+
+	__xen_send_IPI_mask(mask, XEN_CALL_FUNCTION_VECTOR);
+
+	/* Make sure other vcpus get a chance to run if they need to. */
+	for_each_cpu(cpu, mask) {
+		if (xen_vcpu_stolen(cpu)) {
+			HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
+			break;
+		}
+	}
+}
+
+void xen_smp_send_call_function_single_ipi(int cpu)
+{
+	__xen_send_IPI_mask(cpumask_of(cpu),
+			    XEN_CALL_FUNCTION_SINGLE_VECTOR);
+}
+
+static inline int xen_map_vector(int vector)
+{
+	int xen_vector;
+
+	switch (vector) {
+	case RESCHEDULE_VECTOR:
+		xen_vector = XEN_RESCHEDULE_VECTOR;
+		break;
+	case CALL_FUNCTION_VECTOR:
+		xen_vector = XEN_CALL_FUNCTION_VECTOR;
+		break;
+	case CALL_FUNCTION_SINGLE_VECTOR:
+		xen_vector = XEN_CALL_FUNCTION_SINGLE_VECTOR;
+		break;
+	case IRQ_WORK_VECTOR:
+		xen_vector = XEN_IRQ_WORK_VECTOR;
+		break;
+#ifdef CONFIG_X86_64
+	case NMI_VECTOR:
+	case APIC_DM_NMI: /* Some use that instead of NMI_VECTOR */
+		xen_vector = XEN_NMI_VECTOR;
+		break;
+#endif
+	default:
+		xen_vector = -1;
+		printk(KERN_ERR "xen: vector 0x%x is not implemented\n",
+			vector);
+	}
+
+	return xen_vector;
+}
+
+void xen_send_IPI_mask(const struct cpumask *mask, int vector)
+{
+	int xen_vector = xen_map_vector(vector);
+
+	if (xen_vector >= 0)
+		__xen_send_IPI_mask(mask, xen_vector);
+}
+
+void xen_send_IPI_all(int vector)
+{
+	int xen_vector = xen_map_vector(vector);
+
+	if (xen_vector >= 0)
+		__xen_send_IPI_mask(cpu_online_mask, xen_vector);
+}
+
+void xen_send_IPI_self(int vector)
+{
+	int xen_vector = xen_map_vector(vector);
+
+	if (xen_vector >= 0)
+		xen_send_IPI_one(smp_processor_id(), xen_vector);
+}
+
+void xen_send_IPI_mask_allbutself(const struct cpumask *mask,
+				int vector)
+{
+	unsigned cpu;
+	unsigned int this_cpu = smp_processor_id();
+	int xen_vector = xen_map_vector(vector);
+
+	if (!(num_online_cpus() > 1) || (xen_vector < 0))
+		return;
+
+	for_each_cpu_and(cpu, mask, cpu_online_mask) {
+		if (this_cpu == cpu)
+			continue;
+
+		xen_send_IPI_one(cpu, xen_vector);
+	}
+}
+
+void xen_send_IPI_allbutself(int vector)
+{
+	xen_send_IPI_mask_allbutself(cpu_online_mask, vector);
+}
diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
new file mode 100644
index 0000000..ce4ff59
--- /dev/null
+++ b/arch/x86/xen/smp_hvm.c
@@ -0,0 +1,54 @@
+#include <asm/smp.h>
+
+#include "xen-ops.h"
+#include "smp.h"
+
+static void __init xen_smp_prepare_boot_cpu_hvm(void)
+{
+	BUG_ON(smp_processor_id() != 0);
+	native_smp_prepare_boot_cpu();
+
+	xen_vcpu_setup(0);
+
+	/*
+	 * The alternative logic (which patches the unlock/lock) runs before
+	 * the smp bootup up code is activated. Hence we need to set this up
+	 * the core kernel is being patched. Otherwise we will have only
+	 * modules patched but not core code.
+	 */
+	xen_init_spinlocks();
+}
+
+static void __init xen_hvm_smp_prepare_cpus(unsigned int max_cpus)
+{
+	native_smp_prepare_cpus(max_cpus);
+	WARN_ON(xen_smp_intr_init(0));
+
+	xen_init_lock_cpu(0);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+void xen_cpu_die_hvm(unsigned int cpu)
+{
+	if (common_cpu_die(cpu) == 0) {
+		xen_smp_intr_free(cpu);
+		xen_uninit_lock_cpu(cpu);
+		xen_teardown_timer(cpu);
+	}
+}
+#else
+static void xen_cpu_die_hvm(unsigned int cpu)
+{
+	BUG();
+}
+#endif
+
+void __init xen_hvm_smp_init(void)
+{
+	smp_ops.smp_prepare_cpus = xen_hvm_smp_prepare_cpus;
+	smp_ops.smp_send_reschedule = xen_smp_send_reschedule;
+	smp_ops.cpu_die = xen_cpu_die_hvm;
+	smp_ops.send_call_func_ipi = xen_smp_send_call_function_ipi;
+	smp_ops.send_call_func_single_ipi = xen_smp_send_call_function_single_ipi;
+	smp_ops.smp_prepare_boot_cpu = xen_smp_prepare_boot_cpu_hvm;
+}
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH KERNEL 3/4] x86/xen: put setup.c, mmu.c and p2m.c under CONFIG_XEN_PV
  2016-11-14 17:17 [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code Vitaly Kuznetsov
  2016-11-14 17:17 ` [RFC PATCH KERNEL 1/4] x86/xen: start untangling " Vitaly Kuznetsov
  2016-11-14 17:17 ` [RFC PATCH KERNEL 2/4] x86/xen: split smp.c for PV and PVHVM guests Vitaly Kuznetsov
@ 2016-11-14 17:17 ` Vitaly Kuznetsov
  2016-11-14 17:17 ` [RFC PATCH KERNEL 4/4] x86/xen: put setup.c, pmu.c and apic.c " Vitaly Kuznetsov
  2016-11-14 18:21 ` [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code David Vrabel
  4 siblings, 0 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-14 17:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

These three files (mmu.c, p2m.c, setup.c) are mostly required to
support PV guests, in fact p2m.c and setup.c have no code for PVHVM
at all. mmu.c has some, move it to mmu_common.c and mmu_hvm.c.

Some additional changes are required:
- In the balloon driver we can't use xen_start_info, xen_released_pages
  and xen_extra_mem it is PV-only. Decorate it with #ifdef CONFIG_XEN_PV

- Some PV-only functions are used by drivers and for PVHVM guests these
  functions have 'if (xen_feature(XENFEAT_auto_translated_physmap))' check
  in the beginning. Create required stubs for PVHVM-only builds.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/xen/page.h |  44 ++++++-
 arch/x86/xen/Makefile           |  10 +-
 arch/x86/xen/mmu.c              | 284 ----------------------------------------
 arch/x86/xen/mmu_common.c       | 219 +++++++++++++++++++++++++++++++
 arch/x86/xen/mmu_hvm.c          |  77 +++++++++++
 drivers/xen/balloon.c           |  30 +++--
 include/xen/xen-ops.h           |  13 ++
 7 files changed, 376 insertions(+), 301 deletions(-)
 create mode 100644 arch/x86/xen/mmu_common.c
 create mode 100644 arch/x86/xen/mmu_hvm.c

diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index f5fb840..3244ecf 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -43,9 +43,10 @@ extern unsigned long *xen_p2m_addr;
 extern unsigned long  xen_p2m_size;
 extern unsigned long  xen_max_p2m_pfn;
 
-extern int xen_alloc_p2m_entry(unsigned long pfn);
-
 extern unsigned long get_phys_to_machine(unsigned long pfn);
+
+#ifdef CONFIG_XEN_PV
+extern int xen_alloc_p2m_entry(unsigned long pfn);
 extern bool set_phys_to_machine(unsigned long pfn, unsigned long mfn);
 extern bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn);
 extern unsigned long __init set_phys_range_identity(unsigned long pfn_s,
@@ -57,6 +58,38 @@ extern int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 extern int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
 				     struct gnttab_unmap_grant_ref *kunmap_ops,
 				     struct page **pages, unsigned int count);
+#else /* CONFIG_XEN_PV */
+static inline int xen_alloc_p2m_entry(unsigned long pfn)
+{
+	return 0;
+}
+
+static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
+{
+	return true;
+}
+
+static inline bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn)
+{
+	return true;
+}
+
+static inline int
+set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
+			struct gnttab_map_grant_ref *kmap_ops,
+			struct page **pages, unsigned int count)
+{
+	return 0;
+}
+
+static inline int
+clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
+			  struct gnttab_unmap_grant_ref *kunmap_ops,
+			  struct page **pages, unsigned int count)
+{
+	return 0;
+}
+#endif /* CONFIG_XEN_PV */
 
 /*
  * Helper functions to write or read unsigned long values to/from
@@ -82,6 +115,7 @@ static inline int xen_safe_read_ulong(unsigned long *addr, unsigned long *val)
  * - get_phys_to_machine() is to be called by __pfn_to_mfn() only in special
  *   cases needing an extended handling.
  */
+#ifdef CONFIG_XEN_PV
 static inline unsigned long __pfn_to_mfn(unsigned long pfn)
 {
 	unsigned long mfn;
@@ -98,6 +132,12 @@ static inline unsigned long __pfn_to_mfn(unsigned long pfn)
 
 	return mfn;
 }
+#else
+static inline unsigned long __pfn_to_mfn(unsigned long pfn)
+{
+	return pfn;
+}
+#endif
 
 static inline unsigned long pfn_to_mfn(unsigned long pfn)
 {
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index aa6cd5e..121a368 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -10,13 +10,13 @@ nostackp := $(call cc-option, -fno-stack-protector)
 CFLAGS_enlighten.o		:= $(nostackp)
 CFLAGS_mmu.o			:= $(nostackp)
 
-obj-y		:= enlighten_common.o setup.o multicalls.o \
-			mmu.o irq.o time.o xen-asm.o xen-asm_$(BITS).o \
+obj-y		:= enlighten_common.o multicalls.o \
+			irq.o time.o xen-asm.o xen-asm_$(BITS).o \
 			grant-table.o suspend.o platform-pci-unplug.o \
-			p2m.o apic.o pmu.o
+			apic.o pmu.o mmu_common.o
 
-obj-$(CONFIG_XEN_PV)		+= enlighten.o
-obj-$(CONFIG_XEN_PVHVM)		+= enlighten_hvm.o
+obj-$(CONFIG_XEN_PV)		+= enlighten.o setup.o mmu.o p2m.o
+obj-$(CONFIG_XEN_PVHVM)		+= enlighten_hvm.o mmu_hvm.o
 
 obj-$(CONFIG_EVENT_TRACING) += trace.o
 
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 65e184b..94b1806 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -80,12 +80,6 @@
 #include "mmu.h"
 #include "debugfs.h"
 
-/*
- * Protects atomic reservation decrease/increase against concurrent increases.
- * Also protects non-atomic updates of current_pages and balloon lists.
- */
-DEFINE_SPINLOCK(xen_reservation_lock);
-
 #ifdef CONFIG_X86_32
 /*
  * Identity map, in addition to plain kernel map.  This needs to be
@@ -125,36 +119,6 @@ static phys_addr_t xen_pt_base, xen_pt_size __initdata;
  */
 #define USER_LIMIT	((STACK_TOP_MAX + PGDIR_SIZE - 1) & PGDIR_MASK)
 
-unsigned long arbitrary_virt_to_mfn(void *vaddr)
-{
-	xmaddr_t maddr = arbitrary_virt_to_machine(vaddr);
-
-	return PFN_DOWN(maddr.maddr);
-}
-
-xmaddr_t arbitrary_virt_to_machine(void *vaddr)
-{
-	unsigned long address = (unsigned long)vaddr;
-	unsigned int level;
-	pte_t *pte;
-	unsigned offset;
-
-	/*
-	 * if the PFN is in the linear mapped vaddr range, we can just use
-	 * the (quick) virt_to_machine() p2m lookup
-	 */
-	if (virt_addr_valid(vaddr))
-		return virt_to_machine(vaddr);
-
-	/* otherwise we have to do a (slower) full page-table walk */
-
-	pte = lookup_address(address, &level);
-	BUG_ON(pte == NULL);
-	offset = address & ~PAGE_MASK;
-	return XMADDR(((phys_addr_t)pte_mfn(*pte) << PAGE_SHIFT) + offset);
-}
-EXPORT_SYMBOL_GPL(arbitrary_virt_to_machine);
-
 void make_lowmem_page_readonly(void *vaddr)
 {
 	pte_t *pte, ptev;
@@ -1314,25 +1278,6 @@ unsigned long xen_read_cr2_direct(void)
 	return this_cpu_read(xen_vcpu_info.arch.cr2);
 }
 
-void xen_flush_tlb_all(void)
-{
-	struct mmuext_op *op;
-	struct multicall_space mcs;
-
-	trace_xen_mmu_flush_tlb_all(0);
-
-	preempt_disable();
-
-	mcs = xen_mc_entry(sizeof(*op));
-
-	op = mcs.args;
-	op->cmd = MMUEXT_TLB_FLUSH_ALL;
-	MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
-
-	xen_mc_issue(PARAVIRT_LAZY_MMU);
-
-	preempt_enable();
-}
 static void xen_flush_tlb(void)
 {
 	struct mmuext_op *op;
@@ -2695,232 +2640,3 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
 	spin_unlock_irqrestore(&xen_reservation_lock, flags);
 }
 EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
-
-#ifdef CONFIG_XEN_PVHVM
-#ifdef CONFIG_PROC_VMCORE
-/*
- * This function is used in two contexts:
- * - the kdump kernel has to check whether a pfn of the crashed kernel
- *   was a ballooned page. vmcore is using this function to decide
- *   whether to access a pfn of the crashed kernel.
- * - the kexec kernel has to check whether a pfn was ballooned by the
- *   previous kernel. If the pfn is ballooned, handle it properly.
- * Returns 0 if the pfn is not backed by a RAM page, the caller may
- * handle the pfn special in this case.
- */
-static int xen_oldmem_pfn_is_ram(unsigned long pfn)
-{
-	struct xen_hvm_get_mem_type a = {
-		.domid = DOMID_SELF,
-		.pfn = pfn,
-	};
-	int ram;
-
-	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
-		return -ENXIO;
-
-	switch (a.mem_type) {
-		case HVMMEM_mmio_dm:
-			ram = 0;
-			break;
-		case HVMMEM_ram_rw:
-		case HVMMEM_ram_ro:
-		default:
-			ram = 1;
-			break;
-	}
-
-	return ram;
-}
-#endif
-
-static void xen_hvm_exit_mmap(struct mm_struct *mm)
-{
-	struct xen_hvm_pagetable_dying a;
-	int rc;
-
-	a.domid = DOMID_SELF;
-	a.gpa = __pa(mm->pgd);
-	rc = HYPERVISOR_hvm_op(HVMOP_pagetable_dying, &a);
-	WARN_ON_ONCE(rc < 0);
-}
-
-static int is_pagetable_dying_supported(void)
-{
-	struct xen_hvm_pagetable_dying a;
-	int rc = 0;
-
-	a.domid = DOMID_SELF;
-	a.gpa = 0x00;
-	rc = HYPERVISOR_hvm_op(HVMOP_pagetable_dying, &a);
-	if (rc < 0) {
-		printk(KERN_DEBUG "HVMOP_pagetable_dying not supported\n");
-		return 0;
-	}
-	return 1;
-}
-
-void __init xen_hvm_init_mmu_ops(void)
-{
-	if (is_pagetable_dying_supported())
-		pv_mmu_ops.exit_mmap = xen_hvm_exit_mmap;
-#ifdef CONFIG_PROC_VMCORE
-	register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram);
-#endif
-}
-#endif
-
-#define REMAP_BATCH_SIZE 16
-
-struct remap_data {
-	xen_pfn_t *mfn;
-	bool contiguous;
-	pgprot_t prot;
-	struct mmu_update *mmu_update;
-};
-
-static int remap_area_mfn_pte_fn(pte_t *ptep, pgtable_t token,
-				 unsigned long addr, void *data)
-{
-	struct remap_data *rmd = data;
-	pte_t pte = pte_mkspecial(mfn_pte(*rmd->mfn, rmd->prot));
-
-	/* If we have a contiguous range, just update the mfn itself,
-	   else update pointer to be "next mfn". */
-	if (rmd->contiguous)
-		(*rmd->mfn)++;
-	else
-		rmd->mfn++;
-
-	rmd->mmu_update->ptr = virt_to_machine(ptep).maddr;
-	rmd->mmu_update->val = pte_val_ma(pte);
-	rmd->mmu_update++;
-
-	return 0;
-}
-
-static int do_remap_gfn(struct vm_area_struct *vma,
-			unsigned long addr,
-			xen_pfn_t *gfn, int nr,
-			int *err_ptr, pgprot_t prot,
-			unsigned domid,
-			struct page **pages)
-{
-	int err = 0;
-	struct remap_data rmd;
-	struct mmu_update mmu_update[REMAP_BATCH_SIZE];
-	unsigned long range;
-	int mapped = 0;
-
-	BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
-
-	if (xen_feature(XENFEAT_auto_translated_physmap)) {
-#ifdef CONFIG_XEN_PVH
-		/* We need to update the local page tables and the xen HAP */
-		return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
-						 prot, domid, pages);
-#else
-		return -EINVAL;
-#endif
-        }
-
-	rmd.mfn = gfn;
-	rmd.prot = prot;
-	/* We use the err_ptr to indicate if there we are doing a contiguous
-	 * mapping or a discontigious mapping. */
-	rmd.contiguous = !err_ptr;
-
-	while (nr) {
-		int index = 0;
-		int done = 0;
-		int batch = min(REMAP_BATCH_SIZE, nr);
-		int batch_left = batch;
-		range = (unsigned long)batch << PAGE_SHIFT;
-
-		rmd.mmu_update = mmu_update;
-		err = apply_to_page_range(vma->vm_mm, addr, range,
-					  remap_area_mfn_pte_fn, &rmd);
-		if (err)
-			goto out;
-
-		/* We record the error for each page that gives an error, but
-		 * continue mapping until the whole set is done */
-		do {
-			int i;
-
-			err = HYPERVISOR_mmu_update(&mmu_update[index],
-						    batch_left, &done, domid);
-
-			/*
-			 * @err_ptr may be the same buffer as @gfn, so
-			 * only clear it after each chunk of @gfn is
-			 * used.
-			 */
-			if (err_ptr) {
-				for (i = index; i < index + done; i++)
-					err_ptr[i] = 0;
-			}
-			if (err < 0) {
-				if (!err_ptr)
-					goto out;
-				err_ptr[i] = err;
-				done++; /* Skip failed frame. */
-			} else
-				mapped += done;
-			batch_left -= done;
-			index += done;
-		} while (batch_left);
-
-		nr -= batch;
-		addr += range;
-		if (err_ptr)
-			err_ptr += batch;
-		cond_resched();
-	}
-out:
-
-	xen_flush_tlb_all();
-
-	return err < 0 ? err : mapped;
-}
-
-int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
-			       unsigned long addr,
-			       xen_pfn_t gfn, int nr,
-			       pgprot_t prot, unsigned domid,
-			       struct page **pages)
-{
-	return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
-}
-EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
-
-int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
-			       unsigned long addr,
-			       xen_pfn_t *gfn, int nr,
-			       int *err_ptr, pgprot_t prot,
-			       unsigned domid, struct page **pages)
-{
-	/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
-	 * and the consequences later is quite hard to detect what the actual
-	 * cause of "wrong memory was mapped in".
-	 */
-	BUG_ON(err_ptr == NULL);
-	return do_remap_gfn(vma, addr, gfn, nr, err_ptr, prot, domid, pages);
-}
-EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
-
-
-/* Returns: 0 success */
-int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
-			       int numpgs, struct page **pages)
-{
-	if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
-		return 0;
-
-#ifdef CONFIG_XEN_PVH
-	return xen_xlate_unmap_gfn_range(vma, numpgs, pages);
-#else
-	return -EINVAL;
-#endif
-}
-EXPORT_SYMBOL_GPL(xen_unmap_domain_gfn_range);
diff --git a/arch/x86/xen/mmu_common.c b/arch/x86/xen/mmu_common.c
new file mode 100644
index 0000000..d134cd9
--- /dev/null
+++ b/arch/x86/xen/mmu_common.c
@@ -0,0 +1,219 @@
+#include <linux/pfn.h>
+
+#include <asm/xen/page.h>
+#include <asm/xen/hypercall.h>
+
+#include <xen/interface/memory.h>
+
+#include "multicalls.h"
+#include "mmu.h"
+
+/*
+ * Protects atomic reservation decrease/increase against concurrent increases.
+ * Also protects non-atomic updates of current_pages and balloon lists.
+ */
+DEFINE_SPINLOCK(xen_reservation_lock);
+
+#define REMAP_BATCH_SIZE 16
+
+struct remap_data {
+	xen_pfn_t *mfn;
+	bool contiguous;
+	pgprot_t prot;
+	struct mmu_update *mmu_update;
+};
+
+void xen_flush_tlb_all(void)
+{
+	struct mmuext_op *op;
+	struct multicall_space mcs;
+
+	trace_xen_mmu_flush_tlb_all(0);
+
+	preempt_disable();
+
+	mcs = xen_mc_entry(sizeof(*op));
+
+	op = mcs.args;
+	op->cmd = MMUEXT_TLB_FLUSH_ALL;
+	MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
+
+	xen_mc_issue(PARAVIRT_LAZY_MMU);
+
+	preempt_enable();
+}
+
+static int remap_area_mfn_pte_fn(pte_t *ptep, pgtable_t token,
+				 unsigned long addr, void *data)
+{
+	struct remap_data *rmd = data;
+	pte_t pte = pte_mkspecial(mfn_pte(*rmd->mfn, rmd->prot));
+
+	/* If we have a contiguous range, just update the mfn itself,
+	   else update pointer to be "next mfn". */
+	if (rmd->contiguous)
+		(*rmd->mfn)++;
+	else
+		rmd->mfn++;
+
+	rmd->mmu_update->ptr = virt_to_machine(ptep).maddr;
+	rmd->mmu_update->val = pte_val_ma(pte);
+	rmd->mmu_update++;
+
+	return 0;
+}
+
+static int do_remap_gfn(struct vm_area_struct *vma,
+			unsigned long addr,
+			xen_pfn_t *gfn, int nr,
+			int *err_ptr, pgprot_t prot,
+			unsigned domid,
+			struct page **pages)
+{
+	int err = 0;
+	struct remap_data rmd;
+	struct mmu_update mmu_update[REMAP_BATCH_SIZE];
+	unsigned long range;
+	int mapped = 0;
+
+	BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
+
+	if (xen_feature(XENFEAT_auto_translated_physmap)) {
+#ifdef CONFIG_XEN_PVH
+		/* We need to update the local page tables and the xen HAP */
+		return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
+						 prot, domid, pages);
+#else
+		return -EINVAL;
+#endif
+	}
+
+	rmd.mfn = gfn;
+	rmd.prot = prot;
+	/* We use the err_ptr to indicate if there we are doing a contiguous
+	 * mapping or a discontigious mapping. */
+	rmd.contiguous = !err_ptr;
+
+	while (nr) {
+		int index = 0;
+		int done = 0;
+		int batch = min(REMAP_BATCH_SIZE, nr);
+		int batch_left = batch;
+		range = (unsigned long)batch << PAGE_SHIFT;
+
+		rmd.mmu_update = mmu_update;
+		err = apply_to_page_range(vma->vm_mm, addr, range,
+					  remap_area_mfn_pte_fn, &rmd);
+		if (err)
+			goto out;
+
+		/* We record the error for each page that gives an error, but
+		 * continue mapping until the whole set is done */
+		do {
+			int i;
+
+			err = HYPERVISOR_mmu_update(&mmu_update[index],
+						    batch_left, &done, domid);
+
+			/*
+			 * @err_ptr may be the same buffer as @gfn, so
+			 * only clear it after each chunk of @gfn is
+			 * used.
+			 */
+			if (err_ptr) {
+				for (i = index; i < index + done; i++)
+					err_ptr[i] = 0;
+			}
+			if (err < 0) {
+				if (!err_ptr)
+					goto out;
+				err_ptr[i] = err;
+				done++; /* Skip failed frame. */
+			} else
+				mapped += done;
+			batch_left -= done;
+			index += done;
+		} while (batch_left);
+
+		nr -= batch;
+		addr += range;
+		if (err_ptr)
+			err_ptr += batch;
+		cond_resched();
+	}
+out:
+
+	xen_flush_tlb_all();
+
+	return err < 0 ? err : mapped;
+}
+
+int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
+			       unsigned long addr,
+			       xen_pfn_t gfn, int nr,
+			       pgprot_t prot, unsigned domid,
+			       struct page **pages)
+{
+	return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
+}
+EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
+
+int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
+			       unsigned long addr,
+			       xen_pfn_t *gfn, int nr,
+			       int *err_ptr, pgprot_t prot,
+			       unsigned domid, struct page **pages)
+{
+	/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
+	 * and the consequences later is quite hard to detect what the actual
+	 * cause of "wrong memory was mapped in".
+	 */
+	BUG_ON(err_ptr == NULL);
+	return do_remap_gfn(vma, addr, gfn, nr, err_ptr, prot, domid, pages);
+}
+EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
+
+/* Returns: 0 success */
+int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
+			       int numpgs, struct page **pages)
+{
+	if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
+		return 0;
+
+#ifdef CONFIG_XEN_PVH
+	return xen_xlate_unmap_gfn_range(vma, numpgs, pages);
+#else
+	return -EINVAL;
+#endif
+}
+EXPORT_SYMBOL_GPL(xen_unmap_domain_gfn_range);
+
+unsigned long arbitrary_virt_to_mfn(void *vaddr)
+{
+	xmaddr_t maddr = arbitrary_virt_to_machine(vaddr);
+
+	return PFN_DOWN(maddr.maddr);
+}
+
+xmaddr_t arbitrary_virt_to_machine(void *vaddr)
+{
+	unsigned long address = (unsigned long)vaddr;
+	unsigned int level;
+	pte_t *pte;
+	unsigned offset;
+
+	/*
+	 * if the PFN is in the linear mapped vaddr range, we can just use
+	 * the (quick) virt_to_machine() p2m lookup
+	 */
+	if (virt_addr_valid(vaddr))
+		return virt_to_machine(vaddr);
+
+	/* otherwise we have to do a (slower) full page-table walk */
+
+	pte = lookup_address(address, &level);
+	BUG_ON(pte == NULL);
+	offset = address & ~PAGE_MASK;
+	return XMADDR(((phys_addr_t)pte_mfn(*pte) << PAGE_SHIFT) + offset);
+}
+EXPORT_SYMBOL_GPL(arbitrary_virt_to_machine);
diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c
new file mode 100644
index 0000000..c0ecb92
--- /dev/null
+++ b/arch/x86/xen/mmu_hvm.c
@@ -0,0 +1,77 @@
+#include <linux/types.h>
+#include <linux/crash_dump.h>
+
+#include <xen/interface/xen.h>
+#include <xen/hvm.h>
+
+#ifdef CONFIG_PROC_VMCORE
+/*
+ * This function is used in two contexts:
+ * - the kdump kernel has to check whether a pfn of the crashed kernel
+ *   was a ballooned page. vmcore is using this function to decide
+ *   whether to access a pfn of the crashed kernel.
+ * - the kexec kernel has to check whether a pfn was ballooned by the
+ *   previous kernel. If the pfn is ballooned, handle it properly.
+ * Returns 0 if the pfn is not backed by a RAM page, the caller may
+ * handle the pfn special in this case.
+ */
+static int xen_oldmem_pfn_is_ram(unsigned long pfn)
+{
+	struct xen_hvm_get_mem_type a = {
+		.domid = DOMID_SELF,
+		.pfn = pfn,
+	};
+	int ram;
+
+	if (HYPERVISOR_hvm_op(HVMOP_get_mem_type, &a))
+		return -ENXIO;
+
+	switch (a.mem_type) {
+	case HVMMEM_mmio_dm:
+		ram = 0;
+		break;
+	case HVMMEM_ram_rw:
+	case HVMMEM_ram_ro:
+	default:
+		ram = 1;
+		break;
+	}
+
+	return ram;
+}
+#endif
+
+static void xen_hvm_exit_mmap(struct mm_struct *mm)
+{
+	struct xen_hvm_pagetable_dying a;
+	int rc;
+
+	a.domid = DOMID_SELF;
+	a.gpa = __pa(mm->pgd);
+	rc = HYPERVISOR_hvm_op(HVMOP_pagetable_dying, &a);
+	WARN_ON_ONCE(rc < 0);
+}
+
+static int is_pagetable_dying_supported(void)
+{
+	struct xen_hvm_pagetable_dying a;
+	int rc = 0;
+
+	a.domid = DOMID_SELF;
+	a.gpa = 0x00;
+	rc = HYPERVISOR_hvm_op(HVMOP_pagetable_dying, &a);
+	if (rc < 0) {
+		printk(KERN_DEBUG "HVMOP_pagetable_dying not supported\n");
+		return 0;
+	}
+	return 1;
+}
+
+void __init xen_hvm_init_mmu_ops(void)
+{
+	if (is_pagetable_dying_supported())
+		pv_mmu_ops.exit_mmap = xen_hvm_exit_mmap;
+#ifdef CONFIG_PROC_VMCORE
+	register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram);
+#endif
+}
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index e4db19e..390168d 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -710,6 +710,7 @@ void free_xenballooned_pages(int nr_pages, struct page **pages)
 }
 EXPORT_SYMBOL(free_xenballooned_pages);
 
+#ifdef CONFIG_XEN_PV
 static void __init balloon_add_region(unsigned long start_pfn,
 				      unsigned long pages)
 {
@@ -733,19 +734,22 @@ static void __init balloon_add_region(unsigned long start_pfn,
 
 	balloon_stats.total_pages += extra_pfn_end - start_pfn;
 }
+#endif
 
 static int __init balloon_init(void)
 {
-	int i;
-
 	if (!xen_domain())
 		return -ENODEV;
 
 	pr_info("Initialising balloon driver\n");
 
+#ifdef CONFIG_XEN_PV
 	balloon_stats.current_pages = xen_pv_domain()
 		? min(xen_start_info->nr_pages - xen_released_pages, max_pfn)
 		: get_num_physpages();
+#else
+	balloon_stats.current_pages = get_num_physpages();
+#endif
 	balloon_stats.target_pages  = balloon_stats.current_pages;
 	balloon_stats.balloon_low   = 0;
 	balloon_stats.balloon_high  = 0;
@@ -762,14 +766,20 @@ static int __init balloon_init(void)
 	register_sysctl_table(xen_root);
 #endif
 
-	/*
-	 * Initialize the balloon with pages from the extra memory
-	 * regions (see arch/x86/xen/setup.c).
-	 */
-	for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++)
-		if (xen_extra_mem[i].n_pfns)
-			balloon_add_region(xen_extra_mem[i].start_pfn,
-					   xen_extra_mem[i].n_pfns);
+#ifdef CONFIG_XEN_PV
+	{
+		int i;
+
+		/*
+		 * Initialize the balloon with pages from the extra memory
+		 * regions (see arch/x86/xen/setup.c).
+		 */
+		for (i = 0; i < XEN_EXTRA_MEM_MAX_REGIONS; i++)
+			if (xen_extra_mem[i].n_pfns)
+				balloon_add_region(xen_extra_mem[i].start_pfn,
+						   xen_extra_mem[i].n_pfns);
+	}
+#endif
 
 	return 0;
 }
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index bcf90ed..e20bbba 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -41,11 +41,24 @@ int xen_cpuhp_setup(int (*cpu_up_prepare_cb)(unsigned int),
 void xen_reboot(int reason);
 
 extern unsigned long *xen_contiguous_bitmap;
+#ifdef CONFIG_XEN_PV
 int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				unsigned int address_bits,
 				dma_addr_t *dma_handle);
 
 void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
+#else
+static inline int xen_create_contiguous_region(phys_addr_t pstart,
+					       unsigned int order,
+					       unsigned int address_bits,
+					       dma_addr_t *dma_handle)
+{
+	return 0;
+}
+
+static inline void xen_destroy_contiguous_region(phys_addr_t pstart,
+						 unsigned int order) { }
+#endif
 
 struct vm_area_struct;
 
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH KERNEL 4/4] x86/xen: put setup.c, pmu.c and apic.c under CONFIG_XEN_PV
  2016-11-14 17:17 [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2016-11-14 17:17 ` [RFC PATCH KERNEL 3/4] x86/xen: put setup.c, mmu.c and p2m.c under CONFIG_XEN_PV Vitaly Kuznetsov
@ 2016-11-14 17:17 ` Vitaly Kuznetsov
  2016-11-14 18:21 ` [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code David Vrabel
  4 siblings, 0 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-14 17:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

xen_pmu_init/finish() functions are used in suspend.c and
enlighten_common.c, add stubs for now.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/xen/Kconfig  | 2 +-
 arch/x86/xen/Makefile | 4 ++--
 arch/x86/xen/pmu.h    | 5 +++++
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index e25c93e..484d6b8 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -6,7 +6,6 @@ config XEN
 	bool "Xen guest support"
 	depends on PARAVIRT
 	select PARAVIRT_CLOCK
-	select XEN_HAVE_VPMU
 	depends on X86_64 || (X86_32 && X86_PAE)
 	depends on X86_LOCAL_APIC && X86_TSC
 	help
@@ -18,6 +17,7 @@ config XEN_PV
 	bool "Xen PV guest support"
 	default y
 	depends on XEN
+	select XEN_HAVE_VPMU
 	help
 	  Support running as a Xen PV guest.
 
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 121a368..8ec52f0 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -13,9 +13,9 @@ CFLAGS_mmu.o			:= $(nostackp)
 obj-y		:= enlighten_common.o multicalls.o \
 			irq.o time.o xen-asm.o xen-asm_$(BITS).o \
 			grant-table.o suspend.o platform-pci-unplug.o \
-			apic.o pmu.o mmu_common.o
+			mmu_common.o
 
-obj-$(CONFIG_XEN_PV)		+= enlighten.o setup.o mmu.o p2m.o
+obj-$(CONFIG_XEN_PV)		+= enlighten.o setup.o mmu.o p2m.o pmu.o apic.o
 obj-$(CONFIG_XEN_PVHVM)		+= enlighten_hvm.o mmu_hvm.o
 
 obj-$(CONFIG_EVENT_TRACING) += trace.o
diff --git a/arch/x86/xen/pmu.h b/arch/x86/xen/pmu.h
index af5f0ad..4be5355 100644
--- a/arch/x86/xen/pmu.h
+++ b/arch/x86/xen/pmu.h
@@ -4,8 +4,13 @@
 #include <xen/interface/xenpmu.h>
 
 irqreturn_t xen_pmu_irq_handler(int irq, void *dev_id);
+#ifdef CONFIG_XEN_HAVE_VPMU
 void xen_pmu_init(int cpu);
 void xen_pmu_finish(int cpu);
+#else
+static inline void xen_pmu_init(int cpu) {}
+static inline void xen_pmu_finish(int cpu) {}
+#endif
 bool is_xen_pmu(int cpu);
 bool pmu_msr_read(unsigned int msr, uint64_t *val, int *err);
 bool pmu_msr_write(unsigned int msr, uint32_t low, uint32_t high, int *err);
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code
  2016-11-14 17:17 [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2016-11-14 17:17 ` [RFC PATCH KERNEL 4/4] x86/xen: put setup.c, pmu.c and apic.c " Vitaly Kuznetsov
@ 2016-11-14 18:21 ` David Vrabel
  2016-11-14 18:47   ` Boris Ostrovsky
  4 siblings, 1 reply; 11+ messages in thread
From: David Vrabel @ 2016-11-14 18:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov, xen-devel
  Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

On 14/11/16 17:17, Vitaly Kuznetsov wrote:
> Hi,
> 
> I have a long-standing idea to separate PV and PVHVM code in kernel and 
> introduce Kconfig options to make it possible to enable the required
> parts only breaking the current 'all or nothing' approach.
> 
> Motivation:
> - Xen related x86 code in kernel is rather big and it is unclear which
>   parts of it are required for PV, for HVM or for both. With PVH coming
>   into picture is becomes even more tangled. It makes it hard to
>   understand/audit the code.
> 
> - In some case we may want to avoid bloating kernel by supporting Xen
>   guests we don't need. In particular, 90% of the code in arch/x86/xen/ is
>   required to support PV guests and one may require PVHVM support only.
> 
> - PV guests are supposed to go away one day and such code separation would
>   help us to get ready.

All good reasons.

> This RFC adds XEN_PV Kconfig option and makes it possible to build PV-only
> and PVHVM-only kernels. It also makes it possible to disable Dom0 support.
> The series is incomplete and probably dirty in some places, I didn't pay
> much attention to the current PVH implementation as (as far as I
> understand) it is supposed to be replaced with PVHv2 but before investing
> more I'd like to get opinions whether such refactoring will be welcomed.

This series might be best done after PVHv1 is removed.  Boris, any
thoughts on the best approach here?

David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code
  2016-11-14 18:21 ` [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code David Vrabel
@ 2016-11-14 18:47   ` Boris Ostrovsky
  2016-11-15 11:14     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2016-11-14 18:47 UTC (permalink / raw)
  To: David Vrabel, Vitaly Kuznetsov, xen-devel
  Cc: Juergen Gross, Andrew Jones, x86

On 11/14/2016 01:21 PM, David Vrabel wrote:
> On 14/11/16 17:17, Vitaly Kuznetsov wrote:
>> Hi,
>>
>> I have a long-standing idea to separate PV and PVHVM code in kernel and 
>> introduce Kconfig options to make it possible to enable the required
>> parts only breaking the current 'all or nothing' approach.
>>
>> Motivation:
>> - Xen related x86 code in kernel is rather big and it is unclear which
>>   parts of it are required for PV, for HVM or for both. With PVH coming
>>   into picture is becomes even more tangled. It makes it hard to
>>   understand/audit the code.
>>
>> - In some case we may want to avoid bloating kernel by supporting Xen
>>   guests we don't need. In particular, 90% of the code in arch/x86/xen/ is
>>   required to support PV guests and one may require PVHVM support only.
>>
>> - PV guests are supposed to go away one day and such code separation would
>>   help us to get ready.
> All good reasons.
>
>> This RFC adds XEN_PV Kconfig option and makes it possible to build PV-only
>> and PVHVM-only kernels. It also makes it possible to disable Dom0 support.
>> The series is incomplete and probably dirty in some places, I didn't pay
>> much attention to the current PVH implementation as (as far as I
>> understand) it is supposed to be replaced with PVHv2 but before investing
>> more I'd like to get opinions whether such refactoring will be welcomed.
> This series might be best done after PVHv1 is removed.  Boris, any
> thoughts on the best approach here?

I would prefer to wait until at least domU PVHv2 (together with removal
of v1) happens. As soon as I am done with ACPI hotplug on the hypervisor
side I will post the new version.

-boris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code
  2016-11-14 18:47   ` Boris Ostrovsky
@ 2016-11-15 11:14     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-15 11:14 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: Juergen Gross, xen-devel, Andrew Jones, x86, David Vrabel

Boris Ostrovsky <boris.ostrovsky@oracle.com> writes:

> On 11/14/2016 01:21 PM, David Vrabel wrote:
>> On 14/11/16 17:17, Vitaly Kuznetsov wrote:
>>> Hi,
>>>
>>> I have a long-standing idea to separate PV and PVHVM code in kernel and 
>>> introduce Kconfig options to make it possible to enable the required
>>> parts only breaking the current 'all or nothing' approach.
>>>
>>> Motivation:
>>> - Xen related x86 code in kernel is rather big and it is unclear which
>>>   parts of it are required for PV, for HVM or for both. With PVH coming
>>>   into picture is becomes even more tangled. It makes it hard to
>>>   understand/audit the code.
>>>
>>> - In some case we may want to avoid bloating kernel by supporting Xen
>>>   guests we don't need. In particular, 90% of the code in arch/x86/xen/ is
>>>   required to support PV guests and one may require PVHVM support only.
>>>
>>> - PV guests are supposed to go away one day and such code separation would
>>>   help us to get ready.
>> All good reasons.

Good, let's do it then)

>>> This RFC adds XEN_PV Kconfig option and makes it possible to build PV-only
>>> and PVHVM-only kernels. It also makes it possible to disable Dom0 support.
>>> The series is incomplete and probably dirty in some places, I didn't pay
>>> much attention to the current PVH implementation as (as far as I
>>> understand) it is supposed to be replaced with PVHv2 but before investing
>>> more I'd like to get opinions whether such refactoring will be welcomed.
>> This series might be best done after PVHv1 is removed.  Boris, any
>> thoughts on the best approach here?
>
> I would prefer to wait until at least domU PVHv2 (together with removal
> of v1) happens. As soon as I am done with ACPI hotplug on the hypervisor
> side I will post the new version.

Sure, there is no rush here.

-- 
  Vitaly

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH KERNEL 1/4] x86/xen: start untangling PV and PVHVM guest support code
  2016-11-14 17:17 ` [RFC PATCH KERNEL 1/4] x86/xen: start untangling " Vitaly Kuznetsov
@ 2016-11-15 11:26   ` David Vrabel
  2016-11-15 12:08     ` Vitaly Kuznetsov
  2016-11-15 15:50   ` Boris Ostrovsky
  1 sibling, 1 reply; 11+ messages in thread
From: David Vrabel @ 2016-11-15 11:26 UTC (permalink / raw)
  To: Vitaly Kuznetsov, xen-devel
  Cc: Juergen Gross, Boris Ostrovsky, x86, Andrew Jones, David Vrabel

On 14/11/16 17:17, Vitaly Kuznetsov wrote:
> Introduce CONFIG_XEN_PV config option and split enlighten.c into
> 3 files. Temporary add #ifdef CONFIG_XEN_PV to smp.c and mmu.c to
> not break the build and not make the patch even bigger.
> 
> xen_cpu_up_prepare*/xen_cpu_die hooks require separation to support
> future xen_smp_intr_init() split.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/hypervisor.h |   3 +-
>  arch/x86/kernel/cpu/hypervisor.c  |   7 +-
>  arch/x86/kernel/process_64.c      |   2 +-
>  arch/x86/xen/Kconfig              |  25 ++-
>  arch/x86/xen/Makefile             |   7 +-
>  arch/x86/xen/enlighten.c          | 388 ++------------------------------------
>  arch/x86/xen/enlighten_common.c   | 216 +++++++++++++++++++++
>  arch/x86/xen/enlighten_hvm.c      | 202 ++++++++++++++++++++

I think I'd prefer:

enlighten.c, enlighten_pv.c and enlighten_hvm.c

And similarly for the other files you split.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH KERNEL 1/4] x86/xen: start untangling PV and PVHVM guest support code
  2016-11-15 11:26   ` David Vrabel
@ 2016-11-15 12:08     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 11+ messages in thread
From: Vitaly Kuznetsov @ 2016-11-15 12:08 UTC (permalink / raw)
  To: David Vrabel; +Cc: Juergen Gross, xen-devel, Boris Ostrovsky, x86, Andrew Jones

David Vrabel <david.vrabel@citrix.com> writes:

> On 14/11/16 17:17, Vitaly Kuznetsov wrote:
>> Introduce CONFIG_XEN_PV config option and split enlighten.c into
>> 3 files. Temporary add #ifdef CONFIG_XEN_PV to smp.c and mmu.c to
>> not break the build and not make the patch even bigger.
>> 
>> xen_cpu_up_prepare*/xen_cpu_die hooks require separation to support
>> future xen_smp_intr_init() split.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/asm/hypervisor.h |   3 +-
>>  arch/x86/kernel/cpu/hypervisor.c  |   7 +-
>>  arch/x86/kernel/process_64.c      |   2 +-
>>  arch/x86/xen/Kconfig              |  25 ++-
>>  arch/x86/xen/Makefile             |   7 +-
>>  arch/x86/xen/enlighten.c          | 388 ++------------------------------------
>>  arch/x86/xen/enlighten_common.c   | 216 +++++++++++++++++++++
>>  arch/x86/xen/enlighten_hvm.c      | 202 ++++++++++++++++++++
>
> I think I'd prefer:
>
> enlighten.c, enlighten_pv.c and enlighten_hvm.c
>
> And similarly for the other files you split.

Sure,

in this case we'll basically be renaming enlighten.c -> enlighten_pv.c
and enlighten_common.c will become new enlighten.c. I'll now wait till
PVHv2 stuff lands, do some cleanups and send v1.

Thanks!

-- 
  Vitaly

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH KERNEL 1/4] x86/xen: start untangling PV and PVHVM guest support code
  2016-11-14 17:17 ` [RFC PATCH KERNEL 1/4] x86/xen: start untangling " Vitaly Kuznetsov
  2016-11-15 11:26   ` David Vrabel
@ 2016-11-15 15:50   ` Boris Ostrovsky
  1 sibling, 0 replies; 11+ messages in thread
From: Boris Ostrovsky @ 2016-11-15 15:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov, xen-devel
  Cc: Juergen Gross, Andrew Jones, x86, David Vrabel

On 11/14/2016 12:17 PM, Vitaly Kuznetsov wrote:
>  
> +config XEN_PV
> +	bool "Xen PV guest support"
> +	default y
> +	depends on XEN
> +	help
> +	  Support running as a Xen PV guest.
> +
>  config XEN_DOM0

We might consider renaming this to XEN_PV_DOM0 to distinguish it from
eventual XEN_PVH_DOM0.

-boris

> -	def_bool y
> -	depends on XEN && PCI_XEN && SWIOTLB_XEN
> +	bool "Xen PV Dom0 support"
> +	default y
> +	depends on XEN_PV && PCI_XEN && SWIOTLB_XEN
>  	depends on X86_IO_APIC && ACPI && PCI
> +	select XEN_HAVE_PVMMU
> +	help
> +	  Support running as a Xen PV Dom0 guest.
>  
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-11-15 15:47 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-14 17:17 [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code Vitaly Kuznetsov
2016-11-14 17:17 ` [RFC PATCH KERNEL 1/4] x86/xen: start untangling " Vitaly Kuznetsov
2016-11-15 11:26   ` David Vrabel
2016-11-15 12:08     ` Vitaly Kuznetsov
2016-11-15 15:50   ` Boris Ostrovsky
2016-11-14 17:17 ` [RFC PATCH KERNEL 2/4] x86/xen: split smp.c for PV and PVHVM guests Vitaly Kuznetsov
2016-11-14 17:17 ` [RFC PATCH KERNEL 3/4] x86/xen: put setup.c, mmu.c and p2m.c under CONFIG_XEN_PV Vitaly Kuznetsov
2016-11-14 17:17 ` [RFC PATCH KERNEL 4/4] x86/xen: put setup.c, pmu.c and apic.c " Vitaly Kuznetsov
2016-11-14 18:21 ` [RFC PATCH KERNEL 0/4] x86/xen: untangle PV and PVHVM guest support code David Vrabel
2016-11-14 18:47   ` Boris Ostrovsky
2016-11-15 11:14     ` Vitaly Kuznetsov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).