[RFC PATCH 0/2] Add support for a fake, para-virtualised machine

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
@ 2012-12-03 17:52 Will Deacon
  2012-12-03 17:52 ` [RFC PATCH 1/2] ARM: Dummy Virtual Machine platform support Will Deacon
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-03 17:52 UTC
  To: linux-arm-kernel

Hello,

When running Linux on a para-virtualised platform (that is, one where
the guest is aware that it is dealing with virtual devices sitting on
things like virtio or xenbus) we require very little in the way of
platform code and piggy-backing on top of an existing platform can
require a lot of device emulation for very little gain.

These two patches introduce mach-virt: a very simple, DT-based machine
which can be used with kvmtool in conjunction with virtio-based devices.
It's not hard to imagine the same machine being targeted by Xen, which
currently emulates a minimal variant of the vexpress platform.

Note that this patch series depends on the timer rework from Mark
Rutland, posted on Friday:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html

All feedback welcome. We suspect that most controversy will be around
the name of the thing :)

Will

Marc Zyngier (2):
  ARM: Dummy Virtual Machine platform support
  ARM: SMP support for mach-virt

 arch/arm/Kconfig             |   2 +
 arch/arm/Makefile            |   1 +
 arch/arm/mach-virt/Kconfig   |   9 ++
 arch/arm/mach-virt/Makefile  |   6 ++
 arch/arm/mach-virt/headsmp.S |  38 ++++++++
 arch/arm/mach-virt/platsmp.c | 205 +++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mach-virt/virt.c    |  71 +++++++++++++++
 7 files changed, 332 insertions(+)
 create mode 100644 arch/arm/mach-virt/Kconfig
 create mode 100644 arch/arm/mach-virt/Makefile
 create mode 100644 arch/arm/mach-virt/headsmp.S
 create mode 100644 arch/arm/mach-virt/platsmp.c
 create mode 100644 arch/arm/mach-virt/virt.c

-- 
1.8.0

* [RFC PATCH 1/2] ARM: Dummy Virtual Machine platform support
  2012-12-03 17:52 [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Will Deacon
@ 2012-12-03 17:52 ` Will Deacon
  2012-12-03 17:52 ` [RFC PATCH 2/2] ARM: SMP support for mach-virt Will Deacon
  2012-12-03 21:54 ` [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Rob Herring
  2 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-03 17:52 UTC
  To: linux-arm-kernel

From: Marc Zyngier <marc.zyngier@arm.com>

Add support for the smallest, dumbest possible platform, to be
used as a guest for KVM or other hypervisors.

It only mandates a GIC and architected timers. Fits nicely with
a multiplatform zImage. Uses very little silicon area.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/Kconfig            |  2 ++
 arch/arm/Makefile           |  1 +
 arch/arm/mach-virt/Kconfig  |  9 +++++++
 arch/arm/mach-virt/Makefile |  5 ++++
 arch/arm/mach-virt/virt.c   | 65 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 82 insertions(+)
 create mode 100644 arch/arm/mach-virt/Kconfig
 create mode 100644 arch/arm/mach-virt/Makefile
 create mode 100644 arch/arm/mach-virt/virt.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 781725e..ba0dca7 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1126,6 +1126,8 @@ source "arch/arm/mach-versatile/Kconfig"
 source "arch/arm/mach-vexpress/Kconfig"
 source "arch/arm/plat-versatile/Kconfig"
 
+source "arch/arm/mach-virt/Kconfig"
+
 source "arch/arm/mach-w90x900/Kconfig"
 
 # Definitions to make life easier
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 5f914fc..e8232ad 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -192,6 +192,7 @@ machine-$(CONFIG_ARCH_SOCFPGA)		+= socfpga
 machine-$(CONFIG_ARCH_SPEAR13XX)	+= spear13xx
 machine-$(CONFIG_ARCH_SPEAR3XX)		+= spear3xx
 machine-$(CONFIG_MACH_SPEAR600)		+= spear6xx
+machine-$(CONFIG_ARCH_VIRT)		+= virt
 machine-$(CONFIG_ARCH_ZYNQ)		+= zynq
 
 # Platform directory name.  This list is sorted alphanumerically
diff --git a/arch/arm/mach-virt/Kconfig b/arch/arm/mach-virt/Kconfig
new file mode 100644
index 0000000..a568a2a
--- /dev/null
+++ b/arch/arm/mach-virt/Kconfig
@@ -0,0 +1,9 @@
+config ARCH_VIRT
+	bool "Dummy Virtual Machine" if ARCH_MULTI_V7
+	select ARCH_WANT_OPTIONAL_GPIOLIB
+	select ARM_GIC
+	select ARM_ARCH_TIMER
+	select HAVE_SMP
+	select CPU_V7
+	select SPARSE_IRQ
+	select USE_OF
diff --git a/arch/arm/mach-virt/Makefile b/arch/arm/mach-virt/Makefile
new file mode 100644
index 0000000..7ddbfa6
--- /dev/null
+++ b/arch/arm/mach-virt/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the linux kernel.
+#
+
+obj-y					:= virt.o
diff --git a/arch/arm/mach-virt/virt.c b/arch/arm/mach-virt/virt.c
new file mode 100644
index 0000000..174b9da
--- /dev/null
+++ b/arch/arm/mach-virt/virt.c
@@ -0,0 +1,65 @@
+/*
+ * Dummy Virtual Machine - does what it says on the tin.
+ *
+ * Copyright (C) 2012 ARM Ltd
+ * Authors: Will Deacon <will.deacon@arm.com>,
+ *          Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/of_irq.h>
+#include <linux/of_platform.h>
+
+#include <asm/arch_timer.h>
+#include <asm/hardware/gic.h>
+#include <asm/mach/arch.h>
+#include <asm/mach/time.h>
+
+static const struct of_device_id irq_match[] = {
+	{ .compatible = "arm,cortex-a15-gic", .data = gic_of_init, },
+	{}
+};
+
+static void __init gic_init_irq(void)
+{
+	of_irq_init(irq_match);
+}
+
+static void __init virt_init(void)
+{
+	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
+}
+
+static void __init virt_timer_init(void)
+{
+	WARN_ON(arch_timer_of_register() != 0);
+	WARN_ON(arch_timer_sched_clock_init() != 0);
+}
+
+static const char *virt_dt_match[] = {
+	"linux,dummy-virt",
+	NULL
+};
+
+static struct sys_timer virt_timer = {
+	.init = virt_timer_init,
+};
+
+DT_MACHINE_START(VIRT, "Dummy Virtual Machine")
+	.init_irq	= gic_init_irq,
+	.handle_irq     = gic_handle_irq,
+	.timer		= &virt_timer,
+	.init_machine	= virt_init,
+	.dt_compat	= virt_dt_match,
+MACHINE_END
-- 
1.8.0

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-03 17:52 [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Will Deacon
  2012-12-03 17:52 ` [RFC PATCH 1/2] ARM: Dummy Virtual Machine platform support Will Deacon
@ 2012-12-03 17:52 ` Will Deacon
  2012-12-03 21:55   ` Rob Herring
  2012-12-03 21:54 ` [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Rob Herring
  2 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-03 17:52 UTC
  To: linux-arm-kernel

From: Marc Zyngier <marc.zyngier@arm.com>

This patch adds support for SMP to mach-virt.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mach-virt/Makefile  |   1 +
 arch/arm/mach-virt/headsmp.S |  38 ++++++++
 arch/arm/mach-virt/platsmp.c | 205 +++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mach-virt/virt.c    |   6 ++
 4 files changed, 250 insertions(+)
 create mode 100644 arch/arm/mach-virt/headsmp.S
 create mode 100644 arch/arm/mach-virt/platsmp.c

diff --git a/arch/arm/mach-virt/Makefile b/arch/arm/mach-virt/Makefile
index 7ddbfa6..9ce8a28 100644
--- a/arch/arm/mach-virt/Makefile
+++ b/arch/arm/mach-virt/Makefile
@@ -3,3 +3,4 @@
 #
 
 obj-y					:= virt.o
+obj-$(CONFIG_SMP)			+= platsmp.o headsmp.o
diff --git a/arch/arm/mach-virt/headsmp.S b/arch/arm/mach-virt/headsmp.S
new file mode 100644
index 0000000..e27afb0
--- /dev/null
+++ b/arch/arm/mach-virt/headsmp.S
@@ -0,0 +1,38 @@
+/*
+ *  Copyright (c) 2012 ARM Limited
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <linux/linkage.h>
+#include <linux/init.h>
+
+	__INIT
+
+/*
+ * This provides a "holding pen" into which all secondary cores are held
+ * until we're ready for them to initialise.
+ */
+ENTRY(virt_secondary_startup)
+	mrc	p15, 0, r0, c0, c0, 5
+	and	r0, r0, #15
+	adr	r4, 1f
+	ldmia	r4, {r5, r6}
+	sub	r4, r4, r5
+	add	r6, r6, r4
+pen:	ldr	r7, [r6]
+	cmp	r7, r0
+	bne	pen
+
+	/*
+	 * we've been released from the holding pen: secondary_stack
+	 * should now contain the SVC stack for this core
+	 */
+	b	secondary_startup
+
+	.align
+1:	.long	.
+	.long	pen_release
+ENDPROC(virt_secondary_startup)
diff --git a/arch/arm/mach-virt/platsmp.c b/arch/arm/mach-virt/platsmp.c
new file mode 100644
index 0000000..fe02f51
--- /dev/null
+++ b/arch/arm/mach-virt/platsmp.c
@@ -0,0 +1,205 @@
+/*
+ * Dummy Virtual Machine - does what it says on the tin.
+ *
+ * SMP operations, shamelessly stolen from:
+ * arch/arm64/kernel/smp.c
+ *
+ * Copyright (C) 2012 ARM Ltd
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ * Author: Will Deacon <will.deacon@arm.com>
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/smp.h>
+#include <linux/errno.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/jiffies.h>
+#include <linux/of.h>
+
+#include <asm/cacheflush.h>
+#include <asm/smp_plat.h>
+#include <asm/hardware/gic.h>
+
+extern void virt_secondary_startup(void);
+
+static DEFINE_RAW_SPINLOCK(boot_lock);
+static phys_addr_t cpu_release_addr[NR_CPUS];
+
+/*
+ * Write pen_release in a way that is guaranteed to be
+ * visible to all observers, irrespective of whether they're taking part
+ * in coherency or not.  This is necessary for the hotplug code to work
+ * reliably.
+ */
+static void __cpuinit write_pen_release(int val)
+{
+	void *start = (void *)&pen_release;
+	unsigned long size = sizeof(pen_release);
+
+	pen_release = val;
+	smp_wmb();
+	__cpuc_flush_dcache_area(start, size);
+	outer_clean_range(__pa(&pen_release), __pa(&pen_release + 1));
+}
+
+/*
+ * Enumerate the possible CPU set from the device tree.
+ */
+static void __init virt_smp_init_cpus(void)
+{
+	const char *enable_method;
+	struct device_node *dn = NULL;
+	int cpu = 0;
+	u32 release_addr;
+
+	while ((dn = of_find_node_by_type(dn, "cpu"))) {
+		if (cpu >= NR_CPUS)
+			goto next;
+
+		/*
+		 * We currently support only the "spin-table" enable-method.
+		 */
+		enable_method = of_get_property(dn, "enable-method", NULL);
+		if (!enable_method || strcmp(enable_method, "spin-table")) {
+			pr_err("CPU %d: missing or invalid enable-method property: %s\n",
+			       cpu, enable_method);
+			goto next;
+		}
+
+		/*
+		 * Determine the address from which the CPU is polling.
+		 */
+		if (of_property_read_u32(dn, "cpu-release-addr", &release_addr)) {
+			pr_err("CPU %d: missing or invalid cpu-release-addr property\n",
+			       cpu);
+			goto next;
+		}
+
+		cpu_release_addr[cpu] = release_addr;
+		set_cpu_possible(cpu, true);
+next:
+		cpu++;
+	}
+
+	/* sanity check */
+	if (cpu > NR_CPUS)
+		pr_warning("no. of cores (%d) greater than configured maximum of %d - clipping\n",
+			   cpu, NR_CPUS);
+
+	set_smp_cross_call(gic_raise_softirq);
+}
+
+static void __init virt_smp_prepare_cpus(unsigned int max_cpus)
+{
+	int cpu;
+	void **release_addr;
+	unsigned int ncores = num_possible_cpus();
+
+	/*
+	 * are we trying to boot more cores than exist?
+	 */
+	if (max_cpus > ncores)
+		max_cpus = ncores;
+
+	/*
+	 * Initialise the present map (which describes the set of CPUs
+	 * actually populated at the present time) and release the
+	 * secondaries from the bootloader.
+	 */
+	for_each_possible_cpu(cpu) {
+		if (max_cpus == 0)
+			break;
+
+		if (!cpu_release_addr[cpu])
+			continue;
+
+		release_addr = __va(cpu_release_addr[cpu]);
+		release_addr[0] = (void *)__pa(virt_secondary_startup);
+		smp_wmb();
+		__cpuc_flush_dcache_area(release_addr, sizeof(release_addr[0]));
+		outer_clean_range(__pa(release_addr), __pa(release_addr+1));
+
+		set_cpu_present(cpu, true);
+		max_cpus--;
+	}
+}
+
+static int __cpuinit virt_boot_secondary(unsigned int cpu,
+					 struct task_struct *idle)
+{
+	unsigned long timeout;
+
+	/*
+	 * Set synchronisation state between this boot processor
+	 * and the secondary one
+	 */
+	raw_spin_lock(&boot_lock);
+
+	/*
+	 * Update the pen release flag.
+	 */
+	write_pen_release(cpu);
+
+	/*
+	 * Send the secondary CPU a soft interrupt, causing the
+	 * secondaries to read pen_release.
+	 */
+	gic_raise_softirq(cpumask_of(cpu), 0);
+
+	timeout = jiffies + (1 * HZ);
+	while (time_before(jiffies, timeout)) {
+		if (pen_release == -1)
+			break;
+		udelay(10);
+	}
+
+	/*
+	 * Now the secondary core is starting up let it run its
+	 * calibrations, then wait for it to finish
+	 */
+	raw_spin_unlock(&boot_lock);
+
+	return pen_release != -1 ? -ENOSYS : 0;
+}
+
+static void __cpuinit virt_secondary_init(unsigned int cpu)
+{
+	/*
+	 * if any interrupts are already enabled for the primary
+	 * core (e.g. timer irq), then they will not have been enabled
+	 * for us: do so
+	 */
+	gic_secondary_init(0);
+
+	/*
+	 * let the primary processor know we're out of the
+	 * pen, then head off into the C entry point
+	 */
+	write_pen_release(-1);
+
+	/*
+	 * Synchronise with the boot thread.
+	 */
+	raw_spin_lock(&boot_lock);
+	raw_spin_unlock(&boot_lock);
+}
+
+struct smp_operations __initdata virt_smp_ops = {
+	.smp_init_cpus		= virt_smp_init_cpus,
+	.smp_prepare_cpus	= virt_smp_prepare_cpus,
+	.smp_secondary_init	= virt_secondary_init,
+	.smp_boot_secondary	= virt_boot_secondary,
+};
diff --git a/arch/arm/mach-virt/virt.c b/arch/arm/mach-virt/virt.c
index 174b9da..d764835 100644
--- a/arch/arm/mach-virt/virt.c
+++ b/arch/arm/mach-virt/virt.c
@@ -20,6 +20,7 @@
 
 #include <linux/of_irq.h>
 #include <linux/of_platform.h>
+#include <linux/smp.h>
 
 #include <asm/arch_timer.h>
 #include <asm/hardware/gic.h>
@@ -56,10 +57,15 @@ static struct sys_timer virt_timer = {
 	.init = virt_timer_init,
 };
 
+#ifdef CONFIG_SMP
+extern struct smp_operations virt_smp_ops;
+#endif
+
 DT_MACHINE_START(VIRT, "Dummy Virtual Machine")
 	.init_irq	= gic_init_irq,
 	.handle_irq     = gic_handle_irq,
 	.timer		= &virt_timer,
 	.init_machine	= virt_init,
+	.smp		= smp_ops(virt_smp_ops),
 	.dt_compat	= virt_dt_match,
 MACHINE_END
-- 
1.8.0

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-03 17:52 [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Will Deacon
  2012-12-03 17:52 ` [RFC PATCH 1/2] ARM: Dummy Virtual Machine platform support Will Deacon
  2012-12-03 17:52 ` [RFC PATCH 2/2] ARM: SMP support for mach-virt Will Deacon
@ 2012-12-03 21:54 ` Rob Herring
  2012-12-04 12:30   ` Will Deacon
  2 siblings, 1 reply; 38+ messages in thread
From: Rob Herring @ 2012-12-03 21:54 UTC
  To: linux-arm-kernel

On 12/03/2012 11:52 AM, Will Deacon wrote:
> Hello,
> 
> When running Linux on a para-virtualised platform (that is, one where
> the guest is aware that it is dealing with virtual devices sitting on
> things like virtio or xenbus) we require very little in the way of
> platform code and piggy-backing on top of an existing platform can
> require a lot of device emulation for very little gain.
> 
> These two patches introduce mach-virt: a very simple, DT-based machine
> which can be used with kvmtool in conjunction with virtio-based devices.
> It's not hard to imagine the same machine being targeted by Xen, which
> currently emulates a minimal variant of the vexpress platform.
> 
> Note that this patch series depends on the timer rework from Mark
> Rutland, posted on Friday:
> 
>   http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html
> 
> All feedback welcome. We suspect that most controversy will be around
> the name of the thing :)

We've discussed this before at conferences. I don't know that we
concluded this wasn't needed, but it certainly leaned that direction. So
what has changed? You're not going to save code space because we're
building multiple platforms together. You'll save some boot time, but a
stripped down dtb with only the minimal peripherals would probably save
nearly as much time. However, I do have concerns with using VExpress as
the guest. For example, you can't support a non-PAE guest with 4GB of
RAM on VExpress (maybe if the vexpress code gets all memory map info
from DT).

Is this really complete? Will we need reset, poweroff, hotplug, and
suspend/resume support for example? Unlike most initial platform
submissions which are minimal, I think seeing full support would be
useful here. Then we can better gauge how much we are really saving.

Rob

> 
> Will
> 
> 
> Marc Zyngier (2):
>   ARM: Dummy Virtual Machine platform support
>   ARM: SMP support for mach-virt
> 
>  arch/arm/Kconfig             |   2 +
>  arch/arm/Makefile            |   1 +
>  arch/arm/mach-virt/Kconfig   |   9 ++
>  arch/arm/mach-virt/Makefile  |   6 ++
>  arch/arm/mach-virt/headsmp.S |  38 ++++++++
>  arch/arm/mach-virt/platsmp.c | 205 +++++++++++++++++++++++++++++++++++++++++++
>  arch/arm/mach-virt/virt.c    |  71 +++++++++++++++
>  7 files changed, 332 insertions(+)
>  create mode 100644 arch/arm/mach-virt/Kconfig
>  create mode 100644 arch/arm/mach-virt/Makefile
>  create mode 100644 arch/arm/mach-virt/headsmp.S
>  create mode 100644 arch/arm/mach-virt/platsmp.c
>  create mode 100644 arch/arm/mach-virt/virt.c
> 

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-03 17:52 ` [RFC PATCH 2/2] ARM: SMP support for mach-virt Will Deacon
@ 2012-12-03 21:55   ` Rob Herring
  2012-12-04 12:40     ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Rob Herring @ 2012-12-03 21:55 UTC
  To: linux-arm-kernel

On 12/03/2012 11:52 AM, Will Deacon wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> This patch adds support for SMP to mach-virt.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/mach-virt/Makefile  |   1 +
>  arch/arm/mach-virt/headsmp.S |  38 ++++++++
>  arch/arm/mach-virt/platsmp.c | 205 +++++++++++++++++++++++++++++++++++++++++++
>  arch/arm/mach-virt/virt.c    |   6 ++
>  4 files changed, 250 insertions(+)
>  create mode 100644 arch/arm/mach-virt/headsmp.S
>  create mode 100644 arch/arm/mach-virt/platsmp.c
> 
> diff --git a/arch/arm/mach-virt/Makefile b/arch/arm/mach-virt/Makefile
> index 7ddbfa6..9ce8a28 100644
> --- a/arch/arm/mach-virt/Makefile
> +++ b/arch/arm/mach-virt/Makefile
> @@ -3,3 +3,4 @@
>  #
>  
>  obj-y					:= virt.o
> +obj-$(CONFIG_SMP)			+= platsmp.o headsmp.o
> diff --git a/arch/arm/mach-virt/headsmp.S b/arch/arm/mach-virt/headsmp.S
> new file mode 100644
> index 0000000..e27afb0
> --- /dev/null
> +++ b/arch/arm/mach-virt/headsmp.S
> @@ -0,0 +1,38 @@
> +/*
> + *  Copyright (c) 2012 ARM Limited
> + *  All Rights Reserved
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +#include <linux/linkage.h>
> +#include <linux/init.h>
> +
> +	__INIT
> +
> +/*
> + * This provides a "holding pen" into which all secondary cores are held
> + * until we're ready for them to initialise.
> + */
> +ENTRY(virt_secondary_startup)
> +	mrc	p15, 0, r0, c0, c0, 5
> +	and	r0, r0, #15
> +	adr	r4, 1f
> +	ldmia	r4, {r5, r6}
> +	sub	r4, r4, r5
> +	add	r6, r6, r4
> +pen:	ldr	r7, [r6]
> +	cmp	r7, r0
> +	bne	pen

Why is the pen needed? It should only be needed for hotplug on
systems that can't reset their cores. I'd hope you could design good
virtual h/w.
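
For anyone following the quoted assembly, the arithmetic done before the
spin loop can be modelled in plain C roughly as follows. This is only an
illustrative sketch under stated assumptions: the helper names
(pen_cpu_id, pen_runtime_addr, pen_released) are hypothetical and do not
appear in the patch, and the addresses used to exercise them are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of "mrc p15, 0, r0, c0, c0, 5; and r0, r0, #15":
 * keep only the low four bits of the MPIDR value as the CPU number.
 */
static inline uint32_t pen_cpu_id(uint32_t mpidr)
{
	return mpidr & 15u;
}

/*
 * Hypothetical model of the adr/ldmia/sub/add sequence. The code runs
 * at an address that differs from its link-time address, so the offset
 * between the two is computed from a local label and then applied to
 * the link-time address of pen_release. Unsigned wrap-around is
 * intended here, as it is in the 32-bit register arithmetic.
 */
static inline uint32_t pen_runtime_addr(uint32_t label_runtime,
					uint32_t label_link,
					uint32_t sym_link)
{
	uint32_t delta = label_runtime - label_link;	/* sub r4, r4, r5 */

	return sym_link + delta;			/* add r6, r6, r4 */
}

/*
 * Hypothetical model of "cmp r7, r0; bne pen": a core leaves the pen
 * only when pen_release holds its own CPU number. A value of -1 never
 * matches, because pen_cpu_id() only returns 0..15.
 */
static inline bool pen_released(int32_t pen_release, uint32_t mpidr)
{
	return pen_release == (int32_t)pen_cpu_id(mpidr);
}
```

With a label linked at 0xC0008100 but running at 0x80008100, a symbol
linked at 0xC0500000 would be read from 0x80500000 in this model.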

> +
> +	/*
> +	 * we've been released from the holding pen: secondary_stack
> +	 * should now contain the SVC stack for this core
> +	 */
> +	b	secondary_startup
> +
> +	.align
> +1:	.long	.
> +	.long	pen_release
> +ENDPROC(virt_secondary_startup)
> diff --git a/arch/arm/mach-virt/platsmp.c b/arch/arm/mach-virt/platsmp.c
> new file mode 100644
> index 0000000..fe02f51
> --- /dev/null
> +++ b/arch/arm/mach-virt/platsmp.c
> @@ -0,0 +1,205 @@
> +/*
> + * Dummy Virtual Machine - does what it says on the tin.
> + *
> + * SMP operations, shamelessly stolen from:
> + * arch/arm64/kernel/smp.c
> + *
> + * Copyright (C) 2012 ARM Ltd
> + * Author: Catalin Marinas <catalin.marinas@arm.com>
> + * Author: Will Deacon <will.deacon@arm.com>
> + * Author: Marc Zyngier <marc.zyngier@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/smp.h>
> +#include <linux/errno.h>
> +#include <linux/delay.h>
> +#include <linux/device.h>
> +#include <linux/jiffies.h>
> +#include <linux/of.h>
> +
> +#include <asm/cacheflush.h>
> +#include <asm/smp_plat.h>
> +#include <asm/hardware/gic.h>
> +
> +extern void virt_secondary_startup(void);
> +
> +static DEFINE_RAW_SPINLOCK(boot_lock);
> +static phys_addr_t cpu_release_addr[NR_CPUS];
> +
> +/*
> + * Write pen_release in a way that is guaranteed to be
> + * visible to all observers, irrespective of whether they're taking part
> + * in coherency or not.  This is necessary for the hotplug code to work
> + * reliably.
> + */
> +static void __cpuinit write_pen_release(int val)
> +{
> +	void *start = (void *)&pen_release;
> +	unsigned long size = sizeof(pen_release);
> +
> +	pen_release = val;
> +	smp_wmb();
> +	__cpuc_flush_dcache_area(start, size);
> +	outer_clean_range(__pa(&pen_release), __pa(&pen_release + 1));
> +}
> +
> +/*
> + * Enumerate the possible CPU set from the device tree.
> + */
> +static void __init virt_smp_init_cpus(void)
> +{
> +	const char *enable_method;
> +	struct device_node *dn = NULL;
> +	int cpu = 0;
> +	u32 release_addr;
> +
> +	while ((dn = of_find_node_by_type(dn, "cpu"))) {
> +		if (cpu >= NR_CPUS)
> +			goto next;
> +
> +		/*
> +		 * We currently support only the "spin-table" enable-method.
> +		 */
> +		enable_method = of_get_property(dn, "enable-method", NULL);
> +		if (!enable_method || strcmp(enable_method, "spin-table")) {

Are these documented?
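
For reference while reading the rest of the function, the way the loop
treats these two properties can be modelled outside the kernel roughly as
below. This is a sketch, not the patch itself: struct cpu_node,
struct cpu_table, SKETCH_NR_CPUS and enumerate_cpus are hypothetical
stand-ins for the device-tree accessors and NR_CPUS, and only the
"spin-table" string and the skip-on-error behaviour are taken from the
quoted code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_CPUS 4	/* stand-in for the kernel's NR_CPUS */

/* Hypothetical flattened view of one "cpu" node; not a kernel type. */
struct cpu_node {
	const char *enable_method;	/* NULL if the property is absent */
	int has_release_addr;		/* 0 if "cpu-release-addr" is absent */
	uint32_t release_addr;
};

/* Hypothetical result: per-CPU release address and "possible" flag. */
struct cpu_table {
	uint32_t release_addr[SKETCH_NR_CPUS];
	int possible[SKETCH_NR_CPUS];
};

/*
 * Walk the nodes in order. A node is skipped (but still counted) when
 * its index is beyond the configured maximum, when enable-method is
 * missing or is not "spin-table", or when cpu-release-addr is missing.
 * Returns the number of nodes seen, which may exceed SKETCH_NR_CPUS.
 */
static int enumerate_cpus(const struct cpu_node *nodes, size_t n,
			  struct cpu_table *out)
{
	int cpu = 0;

	memset(out, 0, sizeof(*out));

	for (size_t i = 0; i < n; i++, cpu++) {
		const struct cpu_node *dn = &nodes[i];

		if (cpu >= SKETCH_NR_CPUS)
			continue;
		if (!dn->enable_method ||
		    strcmp(dn->enable_method, "spin-table"))
			continue;
		if (!dn->has_release_addr)
			continue;

		out->release_addr[cpu] = dn->release_addr;
		out->possible[cpu] = 1;
	}

	return cpu;
}
```

A caller comparing the return value against SKETCH_NR_CPUS can tell
whether nodes were clipped, which mirrors the "sanity check" warning in
the quoted function.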

> +			pr_err("CPU %d: missing or invalid enable-method property: %s\n",
> +			       cpu, enable_method);
> +			goto next;
> +		}
> +
> +		/*
> +		 * Determine the address from which the CPU is polling.
> +		 */
> +		if (of_property_read_u32(dn, "cpu-release-addr", &release_addr)) {
> +			pr_err("CPU %d: missing or invalid cpu-release-addr property\n",
> +			       cpu);
> +			goto next;
> +		}
> +
> +		cpu_release_addr[cpu] = release_addr;
> +		set_cpu_possible(cpu, true);
> +next:
> +		cpu++;
> +	}
> +
> +	/* sanity check */
> +	if (cpu > NR_CPUS)
> +		pr_warning("no. of cores (%d) greater than configured maximum of %d - clipping\n",
> +			   cpu, NR_CPUS);
> +
> +	set_smp_cross_call(gic_raise_softirq);
> +}
> +
> +static void __init virt_smp_prepare_cpus(unsigned int max_cpus)
> +{
> +	int cpu;
> +	void **release_addr;
> +	unsigned int ncores = num_possible_cpus();
> +
> +	/*
> +	 * are we trying to boot more cores than exist?
> +	 */
> +	if (max_cpus > ncores)
> +		max_cpus = ncores;
> +
> +	/*
> +	 * Initialise the present map (which describes the set of CPUs
> +	 * actually populated at the present time) and release the
> +	 * secondaries from the bootloader.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		if (max_cpus == 0)
> +			break;
> +
> +		if (!cpu_release_addr[cpu])
> +			continue;
> +
> +		release_addr = __va(cpu_release_addr[cpu]);
> +		release_addr[0] = (void *)__pa(virt_secondary_startup);
> +		smp_wmb();
> +		__cpuc_flush_dcache_area(release_addr, sizeof(release_addr[0]));
> +		outer_clean_range(__pa(release_addr), __pa(release_addr+1));
> +
> +		set_cpu_present(cpu, true);
> +		max_cpus--;
> +	}
> +}
> +
> +static int __cpuinit virt_boot_secondary(unsigned int cpu,
> +					 struct task_struct *idle)
> +{
> +	unsigned long timeout;
> +
> +	/*
> +	 * Set synchronisation state between this boot processor
> +	 * and the secondary one
> +	 */
> +	raw_spin_lock(&boot_lock);
> +
> +	/*
> +	 * Update the pen release flag.
> +	 */
> +	write_pen_release(cpu);
> +
> +	/*
> +	 * Send the secondary CPU a soft interrupt, causing the
> +	 * secondaries to read pen_release.
> +	 */
> +	gic_raise_softirq(cpumask_of(cpu), 0);
> +
> +	timeout = jiffies + (1 * HZ);
> +	while (time_before(jiffies, timeout)) {
> +		if (pen_release == -1)
> +			break;
> +		udelay(10);
> +	}
> +
> +	/*
> +	 * Now the secondary core is starting up let it run its
> +	 * calibrations, then wait for it to finish
> +	 */
> +	raw_spin_unlock(&boot_lock);
> +
> +	return pen_release != -1 ? -ENOSYS : 0;
> +}
> +
> +static void __cpuinit virt_secondary_init(unsigned int cpu)
> +{
> +	/*
> +	 * if any interrupts are already enabled for the primary
> +	 * core (e.g. timer irq), then they will not have been enabled
> +	 * for us: do so
> +	 */
> +	gic_secondary_init(0);
> +
> +	/*
> +	 * let the primary processor know we're out of the
> +	 * pen, then head off into the C entry point
> +	 */
> +	write_pen_release(-1);
> +
> +	/*
> +	 * Synchronise with the boot thread.
> +	 */
> +	raw_spin_lock(&boot_lock);
> +	raw_spin_unlock(&boot_lock);
> +}
> +
> +struct smp_operations __initdata virt_smp_ops = {
> +	.smp_init_cpus		= virt_smp_init_cpus,
> +	.smp_prepare_cpus	= virt_smp_prepare_cpus,
> +	.smp_secondary_init	= virt_secondary_init,
> +	.smp_boot_secondary	= virt_boot_secondary,
> +};
> diff --git a/arch/arm/mach-virt/virt.c b/arch/arm/mach-virt/virt.c
> index 174b9da..d764835 100644
> --- a/arch/arm/mach-virt/virt.c
> +++ b/arch/arm/mach-virt/virt.c
> @@ -20,6 +20,7 @@
>  
>  #include <linux/of_irq.h>
>  #include <linux/of_platform.h>
> +#include <linux/smp.h>
>  
>  #include <asm/arch_timer.h>
>  #include <asm/hardware/gic.h>
> @@ -56,10 +57,15 @@ static struct sys_timer virt_timer = {
>  	.init = virt_timer_init,
>  };
>  
> +#ifdef CONFIG_SMP
> +extern struct smp_operations virt_smp_ops;
> +#endif
> +
>  DT_MACHINE_START(VIRT, "Dummy Virtual Machine")
>  	.init_irq	= gic_init_irq,
>  	.handle_irq     = gic_handle_irq,
>  	.timer		= &virt_timer,
>  	.init_machine	= virt_init,
> +	.smp		= smp_ops(virt_smp_ops),
>  	.dt_compat	= virt_dt_match,
>  MACHINE_END
> 

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-03 21:54 ` [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Rob Herring
@ 2012-12-04 12:30   ` Will Deacon
  2012-12-04 14:12     ` Rob Herring
  2012-12-11 16:19     ` Stefano Stabellini
  0 siblings, 2 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-04 12:30 UTC
  To: linux-arm-kernel

Hi Rob,

[fixing Arnd's address, as I apparently can't spell]

On Mon, Dec 03, 2012 at 09:54:09PM +0000, Rob Herring wrote:
> On 12/03/2012 11:52 AM, Will Deacon wrote:
> > When running Linux on a para-virtualised platform (that is, one where
> > the guest is aware that it is dealing with virtual devices sitting on
> > things like virtio or xenbus) we require very little in the way of
> > platform code and piggy-backing on top of an existing platform can
> > require a lot of device emulation for very little gain.
> > 
> > These two patches introduce mach-virt: a very simple, DT-based machine
> > which can be used with kvmtool in conjunction with virtio-based devices.
> > It's not hard to imagine the same machine being targeted by Xen, which
> > currently emulates a minimal variant of the vexpress platform.
> > 
> > Note that this patch series depends on the timer rework from Mark
> > Rutland, posted on Friday:
> > 
> >   http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html
> > 
> > All feedback welcome. We suspect that most controversy will be around
> > the name of the thing :)
> 
> We've discussed this before at conferences. I don't know that we
> concluded this wasn't needed, but it certainly leaned that direction.

I too leaned that direction before I started looking at kvm in detail and,
since then, I've changed my mind when it comes to para-virtualisation.

The reason for this is that there is absolutely no reason to emulate some
components of a real platform and then bolt virtual devices onto it once
you've got enough to get it going. It leads to a right royal mess in
userspace, where you have to write a load of non-reusable emulation code and
it leads to churn in the kernel because you're constantly at odds with
people trying to develop the platform code based on the actual hardware
they have.

With a virtualisation-capable ARMv7 system, all you *need* to boot SMP
Debian is:

	- A v7 CPU with virt extensions
	- vGIC
	- architected timers

*everything* else can be described using virtio devices in the device-tree,
essentially allowing you to generate platforms based on the above and boot
the same kernel on them.
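
As a rough illustration of how one kernel image can pick this machine for
any generated platform, the selection comes down to comparing the
device-tree root "compatible" strings against the machine's table. The
sketch below is a simplified stand-alone model, not the kernel's matching
code: machine_matches is a hypothetical helper, and only the
"linux,dummy-virt" string is taken from patch 1/2.

```c
#include <stddef.h>
#include <string.h>

/* Same string as virt_dt_match in patch 1/2. */
static const char *const virt_dt_match[] = {
	"linux,dummy-virt",
	NULL
};

/*
 * Hypothetical helper: return 1 if any string in the root node's
 * "compatible" list equals any entry in the machine's NULL-terminated
 * table, and 0 otherwise.
 */
static int machine_matches(const char *const *machine_compat,
			   const char *const *root_compat, size_t n_root)
{
	for (size_t i = 0; i < n_root; i++)
		for (const char *const *m = machine_compat; *m; m++)
			if (strcmp(root_compat[i], *m) == 0)
				return 1;

	return 0;
}
```

In this model a generated tree only has to carry the one compatible
string at its root; the virtio devices described below the root do not
take part in machine selection.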

> So what has changed? You're not going to save code space because we're
> building multiple platforms together. You'll save some boot time, but a
> stripped down dtb with only the minimal peripherals would probably save
> nearly as much time. 

It's really got nothing to do with code space or boot speed. What it *is*
about is avoiding the tight coupling with a real platform and suffering as a
result. Yes, you can strip down the DT for a real platform but you'll likely
still have to emulate things like the SP804 in order to boot. That's not to
mention any platform-specific system register interfaces which are required
early on.

We can't even re-use the socfpga code (which is incredibly minimal) without
emulating the dw_apb_timer.

> However, I do have concerns with using VExpress as
> the guest. For example, you can't support a non-PAE guest with 4GB of
> RAM on VExpress (maybe if the vexpress code gets all memory map info
> from DT).

Yes, vexpress is even less suitable for this.

> Is this really complete? Will we need reset, poweroff, hotplug, and
> suspend/resume support for example? Unlike most initial platform
> submissions which are minimal, I think seeing full support would be
> useful here. Then we can better gauge how much we are really saving.

The code is complete in the sense that you can boot an SMP guest running
Debian with console, network, block etc. etc. but you're right to point out
the absence of power-management support.

However, power-management in KVM guests is a *much* larger problem and not
one that has been solved adequately as of yet. There are suggestions that it
should be handled entirely in firmware, with the guest making smc calls to
request power-management operations but this is yet to materialise and, as
such, we can't yet use it here.

We could look at building a virtio-based power controller but that's going
to come up too late for SMP booting (although it will give us hotplug,
reset etc).

Will

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-03 21:55   ` Rob Herring
@ 2012-12-04 12:40     ` Will Deacon
  2012-12-04 13:33       ` Russell King - ARM Linux
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-04 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 03, 2012 at 09:55:35PM +0000, Rob Herring wrote:
> On 12/03/2012 11:52 AM, Will Deacon wrote:
> > From: Marc Zyngier <marc.zyngier@arm.com>
> > 
> > This patch adds support for SMP to mach-virt.
> > 
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > 

[...]

> > +/*
> > + * This provides a "holding pen" into which all secondary cores are held
> > + * until we're ready for them to initialise.
> > + */
> > +ENTRY(virt_secondary_startup)
> > +	mrc	p15, 0, r0, c0, c0, 5
> > +	and	r0, r0, #15
> > +	adr	r4, 1f
> > +	ldmia	r4, {r5, r6}
> > +	sub	r4, r4, r5
> > +	add	r6, r6, r4
> > +pen:	ldr	r7, [r6]
> > +	cmp	r7, r0
> > +	bne	pen
> 
> Why is the pen is needed? It should only be needed for hotplug on
> systems that can't reset their cores. I'd hope you could design good
> virtual h/w.

It's not so much about designing good virtual h/w as it is avoiding tying
the platform to it. What we don't want is to mandate that in order to boot
this machine, you *must* implement an emulation of some virtual
power-controller or SMP booting device. If we go down that route, there's
less advantage from having the virtual platform in the first place.

There's also less of a problem with the pen approach to booting because
ultimately the virtual CPUs executing there are just pthreads and will be
scheduled appropriately by the hypervisor (in contrast to a real system
where there may be concerns about power consumption and memory bandwidth).

For hotplug, sure, we could have an *optional* virtio-based device for
dealing with that if we want to. We could even have some early probing code
for it and use it for SMP boot if we find a matching DT node, but we'd still
need to keep the pen code lying around as a fallback.
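
To make the pen protocol concrete, here is a minimal user-space C model of
the loop in virt_secondary_startup. It is only a sketch: the names
pen_release, pen_is_open_for and pen_open are made up for illustration, and
the initial value of -1 ("nobody may leave yet") is an assumption rather
than something taken from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the word the assembly loads through r6.
 * -1 is assumed to mean "no CPU may leave the pen yet".
 */
static _Atomic int32_t pen_release = -1;

/* Mirrors "and r0, r0, #15": keep the low four bits of MPIDR. */
static uint32_t cpu_number(uint32_t mpidr)
{
	return mpidr & 15u;
}

/* One iteration of the "pen:" loop: non-zero once this CPU may leave. */
static int pen_is_open_for(uint32_t mpidr)
{
	return atomic_load(&pen_release) == (int32_t)cpu_number(mpidr);
}

/* Boot CPU side: let exactly one secondary out of the pen. */
static void pen_open(uint32_t cpu)
{
	atomic_store(&pen_release, (int32_t)cpu);
}
```

A secondary would call pen_is_open_for() in a loop with its own MPIDR value
and only proceed once it returns non-zero; every other CPU keeps spinning.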

> > +/*
> > + * Enumerate the possible CPU set from the device tree.
> > + */
> > +static void __init virt_smp_init_cpus(void)
> > +{
> > +	const char *enable_method;
> > +	struct device_node *dn = NULL;
> > +	int cpu = 0;
> > +	u32 release_addr;
> > +
> > +	while ((dn = of_find_node_by_type(dn, "cpu"))) {
> > +		if (cpu >= NR_CPUS)
> > +			goto next;
> > +
> > +		/*
> > +		 * We currently support only the "spin-table" enable-method.
> > +		 */
> > +		enable_method = of_get_property(dn, "enable-method", NULL);
> > +		if (!enable_method || strcmp(enable_method, "spin-table")) {
> 
> Are these documented?

It's part of the ePAPR spec, iirc, and follows the booting protocol used by
arm64.
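
Roughly, the spin-table handshake looks like the C model below. This is a
sketch under assumptions, not the code from the patch: struct release_slot,
slot_release and slot_poll are hypothetical names, and the slot is assumed
to be zero while the secondary has to keep waiting.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical model of the word that a "cpu-release-addr" property
 * points at. Zero is assumed to mean "keep waiting".
 */
struct release_slot {
	_Atomic uintptr_t entry;
};

/* Primary CPU: publish the entry point for one secondary. */
static void slot_release(struct release_slot *slot, uintptr_t entry)
{
	atomic_store_explicit(&slot->entry, entry, memory_order_release);
}

/*
 * Secondary CPU: a single poll. Returns 0 while the CPU must keep
 * spinning, otherwise the address it should jump to.
 */
static uintptr_t slot_poll(struct release_slot *slot)
{
	return atomic_load_explicit(&slot->entry, memory_order_acquire);
}
```

The release/acquire pairing stands in for whatever barriers the real boot
code needs so that the secondary sees a fully written entry address.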

Will

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 12:40     ` Will Deacon
@ 2012-12-04 13:33       ` Russell King - ARM Linux
  2012-12-04 13:40         ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Russell King - ARM Linux @ 2012-12-04 13:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 12:40:47PM +0000, Will Deacon wrote:
> On Mon, Dec 03, 2012 at 09:55:35PM +0000, Rob Herring wrote:
> > Why is the pen is needed? It should only be needed for hotplug on
> > systems that can't reset their cores. I'd hope you could design good
> > virtual h/w.
> 
> It's not so much about designing good virtual h/w as it is avoiding tying
> the platform to it. What we don't want is to mandate that in order to boot
> this machine, you *must* implement an emulation of some virtual
> power-controller or SMP booting device. If we go down that route, there's
> less advantage from having the virtual platform in the first place.

There is actually a bigger problem here.  Let's say that you have a
quad SMP platform.  You've arranged for your kernel to boot and only
bring one of those cores online.

You then kexec() or reboot.  As far as the kernel is concerned, those
other three CPUs are not online and are not running any kernel code;
however in reality they could be sitting in this 'pen'.

The memory that these 'offline' CPUs are executing from then gets
overwritten, and that's game over for those CPUs.

So, the 'pen' approach in the kernel is fragile, I'd much rather not
have it.  It was fine in the beginning for the initial ARM Ltd SMP
platforms but in this modern age it has no place in real platforms where
there is proper control of the secondary CPUs.

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 13:33       ` Russell King - ARM Linux
@ 2012-12-04 13:40         ` Will Deacon
  2012-12-04 14:37           ` Russell King - ARM Linux
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-04 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 01:33:26PM +0000, Russell King - ARM Linux wrote:
> On Tue, Dec 04, 2012 at 12:40:47PM +0000, Will Deacon wrote:
> > On Mon, Dec 03, 2012 at 09:55:35PM +0000, Rob Herring wrote:
> > > Why is the pen is needed? It should only be needed for hotplug on
> > > systems that can't reset their cores. I'd hope you could design good
> > > virtual h/w.
> > 
> > It's not so much about designing good virtual h/w as it is avoiding tying
> > the platform to it. What we don't want is to mandate that in order to boot
> > this machine, you *must* implement an emulation of some virtual
> > power-controller or SMP booting device. If we go down that route, there's
> > less advantage from having the virtual platform in the first place.
> 
> There is actually a bigger problem here.  Let's say that you have a
> quad SMP platform.  You've arranged for your kernel to boot and only
> bring one of those cores online.
> 
> You then kexec() or reboot.  As far as the kernel is concerned, those
> other two CPUs are not online and are not running any kernel code;
> however in reality they could be sitting in this 'pen'.
> 
> The memory that these 'offline' CPUs is executing then gets overwritten,
> and that's game over for those CPUs.

That's not strictly true. The device-tree passed to the kernel should have a
/memreserve/ entry for the SMP pen to avoid exactly this scenario. In real
hardware, this still sucks because you have spinning CPUs burning up power
but that's not such a problem with a virtual platform.

> So, the 'pen' approach in the kernel is fragile, I'd much rather not
> have it.  It was fine in the beginning for the initial ARM Ltd SMP
> platforms but in this modern age it has no place in real platforms where
> there is proper control of the secondary CPUs.

We could have an (optional) virtual device for booting secondary CPUs but I
think the pen should still be there as a default method. Otherwise, we're
forcing a component of the platform to be emulated unnecessarily (the CPU,
vGIC and timers all have hardware-assisted virtualisation and require no
emulation).

Will

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 12:30   ` Will Deacon
@ 2012-12-04 14:12     ` Rob Herring
  2012-12-04 17:00       ` Nicolas Pitre
  2012-12-11 16:19     ` Stefano Stabellini
  1 sibling, 1 reply; 38+ messages in thread
From: Rob Herring @ 2012-12-04 14:12 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/04/2012 06:30 AM, Will Deacon wrote:
> Hi Rob,
> 
> [fixing Arnd's address, as I apparently can't spell]
> 
> On Mon, Dec 03, 2012 at 09:54:09PM +0000, Rob Herring wrote:
>> On 12/03/2012 11:52 AM, Will Deacon wrote:
>>> When running Linux on a para-virtualised platform (that is, one where
>>> the guest is aware that it is dealing with virtual devices sitting on
>>> things like virtio or xenbus) we require very little in the way of
>>> platform code and piggy-backing on top of an existing platform can
>>> require a lot of device emulation for very little gain.
>>>
>>> These two patches introduce mach-virt: a very simple, DT-based machine
>>> which can be used with kvmtool in conjunction with virtio-based devices.
>>> It's not hard to imagine the same machine being targetted by Xen, which
>>> currently emulates a minimal variant of the vexpress platform.
>>>
>>> Note that this patch series depends on the timer rework from Mark
>>> Rutland, posted on Friday:
>>>
>>>   http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html
>>>
>>> All feedback welcome. We suspect that most controversy will be around
>>> the name of the thing :)
>>
>> We've discussed this before at conferences. I don't know that we
>> concluded this wasn't needed, but it certainly leaned that direction.
> 
> I too leaned that direction before I started looking at kvm in detail and,
> since then, I've changed my mind when it comes to para-virtualisation.
> 
> The reason for this is that there is absolutely no reason to emulate some
> components of a real platform and then bolt virtual devices onto it once
> you've got enough to get it going. It leads to a right royal mess in
> userspace, where you have to write a load of non-reusable emulation code and
> it leads to churn in the kernel because you're constantly at odds with
> people trying to develop the platform code based on the actual hardware
> they have.
> 
> With a virtualisation-capable ARMv7 system, all you *need* to boot SMP
> Debian is:
> 
> 	- A v7 CPU with virt extensions
> 	- vGIC
> 	- architected timers
> 
> *everything* else can be described using virtio devices in the device-tree,
> essentially allowing you to generate platforms based on the above and boot
> the same kernel on them.
> 
>> So what has changed? You're not going to save code space because we're
>> building multiple platforms together. You'll save some boot time, but a
>> stripped down dtb with only the minimal peripherals would probably save
>> nearly as much time. 
> 
> It's really got nothing to do with code space or boot speed. What it *is*
> about is avoiding the tight coupling with a real platform and suffering as a
> result. Yes, you can strip down the DT for a real platform but you'll likely
> still have to emulate things like the SP804 in order to boot. That's not to
> mention any platform-specific system register interfaces which are required
> early on.
> 
> We can't even re-use the socfpga code (which is incredibly minimal) without
> emulating the dw_apb_timer.

That to me is highlighting where we need to do more work on DT driving
the initialization. The platforms are still aware of what kind of timers
and interrupt controllers are present. They should not be. There's work
in progress for both of those.

Lorenzo's DT MPIDR patches should trim down the SMP code somewhat. The DT
spin table code could probably be common. I think I could use it on
highbank as well. If we decide the pen code stays, then it should be common
rather than creating yet another copy.

Rob

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 13:40         ` Will Deacon
@ 2012-12-04 14:37           ` Russell King - ARM Linux
  2012-12-04 16:11             ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Russell King - ARM Linux @ 2012-12-04 14:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 01:40:10PM +0000, Will Deacon wrote:
> On Tue, Dec 04, 2012 at 01:33:26PM +0000, Russell King - ARM Linux wrote:
> > The memory that these 'offline' CPUs is executing then gets overwritten,
> > and that's game over for those CPUs.
> 
> That's not strictly true. The device-tree passed to the kernel should have a
> /memreserve/ entry for the SMP pen to avoid exactly this scenario. In real
> hardware, this still sucks because you have spinning CPUs burning up power
> but that's not such a problem with a virtual platform.

Umm.  So let's see.  Say I'm running a stock v3.6 kernel and want to kexec
into a stock v3.7 kernel.  The SMP pen is part of the v3.6 kernel, which
will be located at 32K into the RAM.  The v3.7 kernel will also want to
occupy the same place.  At some point you have to overwrite the v3.6
kernel with the v3.7 kernel image.

That happens _before_ the DT has been parsed, so any memreserve stuff
will be ignored.  And it's at that point that your "offline" secondary
CPUs will have their instructions overwritten.

That's fine if the pen ends up being at the same place but that's not
something we guarantee.
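
The overlap in question can be written down as a simple range check. The
sketch below is illustrative only: struct region and regions_overlap are
made-up names, and apart from the 32K load offset mentioned above, any
addresses and sizes fed to it would be invented numbers.

```c
#include <stdint.h>

/* Hypothetical description of a chunk of physical memory. */
struct region {
	uint32_t start;
	uint32_t size;
};

/*
 * Non-zero if the two regions share at least one byte. The sums are
 * done in 64 bits so a region ending at 4GB does not wrap around.
 */
static int regions_overlap(struct region a, struct region b)
{
	uint64_t a_end = (uint64_t)a.start + a.size;
	uint64_t b_end = (uint64_t)b.start + b.size;

	return a.start < b_end && b.start < a_end;
}
```

A pen that lives inside the old kernel image will overlap with a new image
loaded at the same 32K offset, whereas a pen placed below the kernel load
address will not.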

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 14:37           ` Russell King - ARM Linux
@ 2012-12-04 16:11             ` Will Deacon
  2012-12-04 16:35               ` Russell King - ARM Linux
  2012-12-04 16:45               ` Rob Herring
  0 siblings, 2 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-04 16:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 02:37:25PM +0000, Russell King - ARM Linux wrote:
> On Tue, Dec 04, 2012 at 01:40:10PM +0000, Will Deacon wrote:
> > On Tue, Dec 04, 2012 at 01:33:26PM +0000, Russell King - ARM Linux wrote:
> > > The memory that these 'offline' CPUs is executing then gets overwritten,
> > > and that's game over for those CPUs.
> > 
> > That's not strictly true. The device-tree passed to the kernel should have a
> > /memreserve/ entry for the SMP pen to avoid exactly this scenario. In real
> > hardware, this still sucks because you have spinning CPUs burning up power
> > but that's not such a problem with a virtual platform.
> 
> Umm.  So let's see.  If I'm running v3.6 stock kernel and want to kexec
> into a v3.7 stock kernel.  The SMP pen is part of the v3.6 kernel, which
> will be located at 32K into the RAM.  The v3.7 kernel will also want to
> occupy the same place.  At some point you have to overwrite the v3.6
> kernel with the v3.7 kernel image.

If the 3.6 kernel didn't bring those CPUs online, they will sit in the
bootloader pen (out of the way of the kernel image) rather than the kernel
pen so I don't think there will be a problem.

The problem you're describing actually happens when the 3.6 kernel onlines
all of the CPUs, because now it has no way to hotplug them off safely. This
is also an issue with non-virtualised hardware but we could solve it for the
virtual platform by having a para-virtualised device for doing CPU hotplug.

> That happens _before_ the DT has been parsed, so any memreserve stuff
> will be ignored.  And it's at that point that your "offline" secondary
> CPUs will have their instructions overwritten.
> 
> That's fine if the pen ends up being at the same place but that's not
> something we guarantee.

Having CPUs in limbo between the bootloader and being online in the kernel
is something we should just avoid. Isn't that pen __init anyway?

Will

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 16:11             ` Will Deacon
@ 2012-12-04 16:35               ` Russell King - ARM Linux
  2012-12-04 17:24                 ` Will Deacon
  2012-12-04 16:45               ` Rob Herring
  1 sibling, 1 reply; 38+ messages in thread
From: Russell King - ARM Linux @ 2012-12-04 16:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 04:11:13PM +0000, Will Deacon wrote:
> On Tue, Dec 04, 2012 at 02:37:25PM +0000, Russell King - ARM Linux wrote:
> > On Tue, Dec 04, 2012 at 01:40:10PM +0000, Will Deacon wrote:
> > > On Tue, Dec 04, 2012 at 01:33:26PM +0000, Russell King - ARM Linux wrote:
> > > > The memory that these 'offline' CPUs is executing then gets overwritten,
> > > > and that's game over for those CPUs.
> > > 
> > > That's not strictly true. The device-tree passed to the kernel should have a
> > > /memreserve/ entry for the SMP pen to avoid exactly this scenario. In real
> > > hardware, this still sucks because you have spinning CPUs burning up power
> > > but that's not such a problem with a virtual platform.
> > 
> > Umm.  So let's see.  If I'm running v3.6 stock kernel and want to kexec
> > into a v3.7 stock kernel.  The SMP pen is part of the v3.6 kernel, which
> > will be located at 32K into the RAM.  The v3.7 kernel will also want to
> > occupy the same place.  At some point you have to overwrite the v3.6
> > kernel with the v3.7 kernel image.
> 
> If the 3.6 kernel didn't bring those CPUs online, they will sit in the
> bootloader pen (out of the way of the kernel image) rather than the kernel
> pen so I don't think there will be a problem.

... or in the case of sane hardware, the CPUs will be powered down.

> The problem you're describing actually happens when the 3.6 kernel onlines
> all of the CPUs, because now it has no way to hotplug them off safely. This
> is also an issue with non-virtualised hardware but we could solve it for the
> virtual platform by having a para-virtualised device for doing CPU hotplug.

That situation exists on ARM Ltd platforms where there's no way to
properly return them back to the boot loader.  We should not be forcing
this ARM Ltd platform deficiency onto other platforms as part of a
"design", even virtual platforms.

Most other real-world platforms out there have a way to power off the
unused secondary CPUs - Tegra and OMAP both do.

As far as virtual platforms go, how secondary CPUs are dealt with should
already have been solved; I really can't imagine that KVM and XEN on
other architectures end up with CPUs spinning in a loop inside the guest
kernel waiting for the guest OS to ask them to boot.  Neither can I imagine
that KVM and XEN end up with CPUs spinning in the guest OS when CPUs are
asked to be hot-unplugged.

> > That happens _before_ the DT has been parsed, so any memreserve stuff
> > will be ignored.  And it's at that point that your "offline" secondary
> > CPUs will have their instructions overwritten.
> > 
> > That's fine if the pen ends up being at the same place but that's not
> > something we guarantee.
> 
> Having CPUs in limbo between the bootloader the being online in the kernel
> is something we should just avoid. Isn't that pen __init anyway?

If you have hotplug enabled, all the secondary bringup code should be
in the __cpuinit and __cpuinitdata sections.

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 16:11             ` Will Deacon
  2012-12-04 16:35               ` Russell King - ARM Linux
@ 2012-12-04 16:45               ` Rob Herring
  2012-12-04 17:16                 ` Will Deacon
  1 sibling, 1 reply; 38+ messages in thread
From: Rob Herring @ 2012-12-04 16:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/04/2012 10:11 AM, Will Deacon wrote:
> On Tue, Dec 04, 2012 at 02:37:25PM +0000, Russell King - ARM Linux wrote:
>> On Tue, Dec 04, 2012 at 01:40:10PM +0000, Will Deacon wrote:
>>> On Tue, Dec 04, 2012 at 01:33:26PM +0000, Russell King - ARM Linux wrote:
>>>> The memory that these 'offline' CPUs is executing then gets overwritten,
>>>> and that's game over for those CPUs.
>>>
>>> That's not strictly true. The device-tree passed to the kernel should have a
>>> /memreserve/ entry for the SMP pen to avoid exactly this scenario. In real
>>> hardware, this still sucks because you have spinning CPUs burning up power
>>> but that's not such a problem with a virtual platform.
>>
>> Umm.  So let's see.  If I'm running v3.6 stock kernel and want to kexec
>> into a v3.7 stock kernel.  The SMP pen is part of the v3.6 kernel, which
>> will be located at 32K into the RAM.  The v3.7 kernel will also want to
>> occupy the same place.  At some point you have to overwrite the v3.6
>> kernel with the v3.7 kernel image.
> 
> If the 3.6 kernel didn't bring those CPUs online, they will sit in the
> bootloader pen (out of the way of the kernel image) rather than the kernel
> pen so I don't think there will be a problem.
> 
> The problem you're describing actually happens when the 3.6 kernel onlines
> all of the CPUs, because now it has no way to hotplug them off safely. This
> is also an issue with non-virtualised hardware but we could solve it for the
> virtual platform by having a para-virtualised device for doing CPU hotplug.
> 
>> That happens _before_ the DT has been parsed, so any memreserve stuff
>> will be ignored.  And it's at that point that your "offline" secondary
>> CPUs will have their instructions overwritten.
>>
>> That's fine if the pen ends up being at the same place but that's not
>> something we guarantee.
> 
> Having CPUs in limbo between the bootloader the being online in the kernel
> is something we should just avoid. Isn't that pen __init anyway?

Aren't we mixing 2 pens here? You must have some simple bootloader
containing vector table and a pen that the dtb points to, right? The pen
you have in the kernel is only needed when hotplug only does a wfi. As
you don't yet support hotplug, then you can drop all the kernel pen code.

If there is no way to reset the core, then couldn't the hotplug code
tear down the cpu setup and just jump back to 0x0 which then returns to
the bootloader's pen?

Rob

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 14:12     ` Rob Herring
@ 2012-12-04 17:00       ` Nicolas Pitre
  2012-12-04 17:11         ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Nicolas Pitre @ 2012-12-04 17:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 4 Dec 2012, Rob Herring wrote:

> That to me is highlighting where we need to do more work on DT driving
> the initialization. The platforms are still aware of what kind of timers
> and interrupt controllers are present. They should not be. There's work
> in progress for both of those.
> 
> Lorenzo's DT MPIDR patches should trim down smp code some. The DT spin
> table code could probably be common. I think I could use it on highbank
> as well. If we decide the pen code stays, then it should be common
> rather than creating yet another copy.

I don't want to rain on the "everything should be common" parade here.  
However, for the best part of last year I've been working on kernel 
support for big.LITTLE systems, and the handling of CPU hotplug 
(including SMP secondary boot) is far from being a trivial task.  
Managing the simple bringing up or down of a CPU in such an environment 
required hundreds of new lines of code.  That is far from a simple 
holding pen or spinning table to say the least.

[ For the curious, I'll post this code here soon for review. ]

So my point of view is: if you do not need a holding pen because you can 
hold individual CPUs in reset, then don't.  Many platforms with support 
in the kernel can do that, yet they copied the holding pen code just 
because it is there.  And that is total crap.

On the topic of a para-virtualised machine, I think that it should
simply implement the PSCI calls to bring up CPUs _without_ any holding
pen or spin tables.  You issue the appropriate PSCI call with the
physical address of secondary_startup() as argument and you're done.
The host intercepts that call and frees a new CPU instance in response.
That's all.
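
As a rough illustration of the host side, the model below shows what
intercepting such a call could look like. Everything in it is hypothetical:
the function ID, the error codes, struct example_vcpu and host_handle_call
are invented for the sketch and are not taken from the PSCI document or
from kvm.

```c
#include <stdint.h>

#define EXAMPLE_FN_CPU_ON	0x1u	/* hypothetical ID, not from any spec */
#define EXAMPLE_MAX_VCPUS	4u

/* Hypothetical per-vcpu state kept by the host. */
struct example_vcpu {
	int running;
	uint32_t pc;
};

/* vcpu 0 is the boot CPU and is already running. */
static struct example_vcpu vcpus[EXAMPLE_MAX_VCPUS] = { { 1, 0 } };

/* Host-side handler for a trapped "CPU on" request from the guest. */
static int host_handle_call(uint32_t fn, uint32_t cpu, uint32_t entry)
{
	if (fn != EXAMPLE_FN_CPU_ON)
		return -1;	/* unknown call */
	if (cpu >= EXAMPLE_MAX_VCPUS)
		return -2;	/* no such vcpu */
	if (vcpus[cpu].running)
		return -3;	/* already on */

	/* Start the vcpu at the address supplied by the guest. */
	vcpus[cpu].pc = entry;
	vcpus[cpu].running = 1;
	return 0;
}
```

The guest never spins anywhere in this scheme: a secondary simply does not
exist as a running vcpu until the host has handled the request.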

Nicolas

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 17:00       ` Nicolas Pitre
@ 2012-12-04 17:11         ` Will Deacon
  2012-12-04 18:02           ` Nicolas Pitre
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-04 17:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Nicolas,

On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> On Tue, 4 Dec 2012, Rob Herring wrote:
> 
> > That to me is highlighting where we need to do more work on DT driving
> > the initialization. The platforms are still aware of what kind of timers
> > and interrupt controllers are present. They should not be. There's work
> > in progress for both of those.
> > 
> > Lorenzo's DT MPIDR patches should trim down smp code some. The DT spin
> > table code could probably be common. I think I could use it on highbank
> > as well. If we decide the pen code stays, then it should be common
> > rather than creating yet another copy.
> 
> I don't want to rain on the "everything should be common" parade here.  
> However, for the best part of last year I've been working on kernel 
> support for big.LITTLE systems, and the handling of CPU hotplug 
> (including SMP secondary boot) is far from being a trivial task.  
> Managing the simple bringing up or down of a CPU in such an environment 
> required hundreds of new lines of code.  That is far from a simple 
> holding pen or spinning table to say the least.
> 
> [ For the curious, I'll post this code here soon for review. ]
> 
> So my point of view is: if you do not need a holding pen because you can 
> hold individual CPUs in reset, then don't.  Many platforms with support 
> in the kernel can do that, yet they copied the holding pen code just 
> because it is there.  And that is total crap.

Agreed, but it's also total crap forcing emulation of a made-up power
controller on the host in the case of a virtual platform.

> on the topic of a para-virtualised machine, I think that it should 
> simply implement the PSCI calls to bring up CPUs _without_ any holding 
> pen nor spinning tables.  You issue the appropriate PSCI call with the 
> physical address for secondary_startup() as argument and you're done.  
> The host intercepts that call and free a new CPU instance in response.  
> That's all.

I'd be happy to go with this suggestion if it wasn't for one thing:
platforms that do not implement a secure mode. For these platforms, smc will
be an undefined instruction at the exception level where it is executed and
therefore cannot be trapped by the hypervisor.

If that situation requires a pen, I see no benefit from having two boot
schemes where one of them would work in every case.

Will

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 16:45               ` Rob Herring
@ 2012-12-04 17:16                 ` Will Deacon
  2012-12-04 17:23                   ` Rob Herring
                                     ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-04 17:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 04:45:58PM +0000, Rob Herring wrote:
> On 12/04/2012 10:11 AM, Will Deacon wrote:
> > On Tue, Dec 04, 2012 at 02:37:25PM +0000, Russell King - ARM Linux wrote:
> >> Umm.  So let's see.  If I'm running v3.6 stock kernel and want to kexec
> >> into a v3.7 stock kernel.  The SMP pen is part of the v3.6 kernel, which
> >> will be located at 32K into the RAM.  The v3.7 kernel will also want to
> >> occupy the same place.  At some point you have to overwrite the v3.6
> >> kernel with the v3.7 kernel image.
> > 
> > If the 3.6 kernel didn't bring those CPUs online, they will sit in the
> > bootloader pen (out of the way of the kernel image) rather than the kernel
> > pen so I don't think there will be a problem.
> > 
> > The problem you're describing actually happens when the 3.6 kernel onlines
> > all of the CPUs, because now it has no way to hotplug them off safely. This
> > is also an issue with non-virtualised hardware but we could solve it for the
> > virtual platform by having a para-virtualised device for doing CPU hotplug.
> > 
> >> That happens _before_ the DT has been parsed, so any memreserve stuff
> >> will be ignored.  And it's at that point that your "offline" secondary
> >> CPUs will have their instructions overwritten.
> >>
> >> That's fine if the pen ends up being at the same place but that's not
> >> something we guarantee.
> > 
> > Having CPUs in limbo between the bootloader the being online in the kernel
> > is something we should just avoid. Isn't that pen __init anyway?
> 
> Aren't we mixing 2 pens here? You must have some simple bootloader
> containing vector table and a pen that the dtb points to, right? The pen
> you have in the kernel is only needed when hotplug only does a wfi. As
> you don't yet support hotplug, then you can drop all the kernel pen code.

Yes, both qemu and kvmtool have bootloader pens outside of the kernel but
since wfi is not trapped by kvm, the secondaries can be released early due
to a spurious wakeup so we need the second pen.

> If there is no way to reset the core, then couldn't the hotplug code
> tear down the cpu setup and just jump back to 0x0 which then returns to
> the bootloader's pen?

I think hotplug really should be implemented with a virtio device. We just
trap back to the emulation and kill the vcpu thread.

Will

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 17:16                 ` Will Deacon
@ 2012-12-04 17:23                   ` Rob Herring
  2012-12-04 17:24                   ` Marc Zyngier
  2012-12-04 18:10                   ` Nicolas Pitre
  2 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2012-12-04 17:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/04/2012 11:16 AM, Will Deacon wrote:
> On Tue, Dec 04, 2012 at 04:45:58PM +0000, Rob Herring wrote:
>> On 12/04/2012 10:11 AM, Will Deacon wrote:
>>> On Tue, Dec 04, 2012 at 02:37:25PM +0000, Russell King - ARM Linux wrote:
>>>> Umm.  So let's see.  If I'm running v3.6 stock kernel and want to kexec
>>>> into a v3.7 stock kernel.  The SMP pen is part of the v3.6 kernel, which
>>>> will be located at 32K into the RAM.  The v3.7 kernel will also want to
>>>> occupy the same place.  At some point you have to overwrite the v3.6
>>>> kernel with the v3.7 kernel image.
>>>
>>> If the 3.6 kernel didn't bring those CPUs online, they will sit in the
>>> bootloader pen (out of the way of the kernel image) rather than the kernel
>>> pen so I don't think there will be a problem.
>>>
>>> The problem you're describing actually happens when the 3.6 kernel onlines
>>> all of the CPUs, because now it has no way to hotplug them off safely. This
>>> is also an issue with non-virtualised hardware but we could solve it for the
>>> virtual platform by having a para-virtualised device for doing CPU hotplug.
>>>
>>>> That happens _before_ the DT has been parsed, so any memreserve stuff
>>>> will be ignored.  And it's at that point that your "offline" secondary
>>>> CPUs will have their instructions overwritten.
>>>>
>>>> That's fine if the pen ends up being at the same place but that's not
>>>> something we guarantee.
>>>
>>> Having CPUs in limbo between the bootloader the being online in the kernel
>>> is something we should just avoid. Isn't that pen __init anyway?
>>
>> Aren't we mixing 2 pens here? You must have some simple bootloader
>> containing vector table and a pen that the dtb points to, right? The pen
>> you have in the kernel is only needed when hotplug only does a wfi. As
>> you don't yet support hotplug, then you can drop all the kernel pen code.
> 
> Yes, both qemu and kvmtool have bootloader pens outside of the kernel but
> since wfi is not trapped by kvm, the secondaries can be released early due
> to a spurious wakeup so we need the second pen.

Wouldn't the pen requiring both a valid address (perhaps !0 or !-1) and
a wake-up fix this?
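
A minimal sketch of what I mean, assuming a release mailbox written by the
boot CPU and treating 0 and all-ones as "not released yet". The names
pen_addr_valid() and pen_wait() are hypothetical, and the array stands in
for the mailbox value seen on each return from wfi:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: the mailbox is valid only if it is neither 0 nor ~0. */
bool pen_addr_valid(uintptr_t addr)
{
	return addr != 0 && addr != UINTPTR_MAX;
}

/*
 * Simulated pen. seen[n] is the mailbox value observed after the n-th
 * wake-up; real code would execute wfi and re-read the mailbox instead.
 * A spurious wake-up finds an invalid value and goes back to sleep.
 * Returns the 1-based wake-up that released the CPU, or -1 if none did.
 */
int pen_wait(const uintptr_t *seen, int wakeups)
{
	for (int n = 0; n < wakeups; n++) {
		if (pen_addr_valid(seen[n]))
			return n + 1;
	}
	return -1;
}
```

With that, a wake-up alone never releases a secondary; it also needs a
valid address in the mailbox.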

Rob

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 16:35               ` Russell King - ARM Linux
@ 2012-12-04 17:24                 ` Will Deacon
  2012-12-04 19:37                   ` Arnd Bergmann
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-04 17:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 04:35:55PM +0000, Russell King - ARM Linux wrote:
> On Tue, Dec 04, 2012 at 04:11:13PM +0000, Will Deacon wrote:
> > The problem you're describing actually happens when the 3.6 kernel onlines
> > all of the CPUs, because now it has no way to hotplug them off safely. This
> > is also an issue with non-virtualised hardware but we could solve it for the
> > virtual platform by having a para-virtualised device for doing CPU hotplug.
> 
> That situation exists on ARM Ltd platforms where there's no way to
> properly return them back to the boot loader.  We should not be forcing
> this ARM Ltd platform deficiency onto other platforms as part of a
> "design", even virtual platforms.
> 
> Most other real-world platforms out there have a way to power off the
> unused secondary CPUs - Tegra and OMAP both do.

If a virtual machine powers off a virtual CPU, I doubt we want to power off
its corresponding CPU -- that logic can remain in the host. All we need to
do is kill the virtual CPU thread, which we can do easily enough. Booting is
the more difficult problem because we introduce a reliance on a virtual
device being ready incredibly early, essentially hardcoding part of the
virtual machine.

> As far as virtual platforms go, how secondary CPUs are dealt with should
> already have been solved; I really can't imagine that KVM and XEN on
> other architectures end up with CPUs spinning in a loop inside the guest
> kernel waiting for the guest OS to ask them to boot.  Neither can I imagine
> that KVM and XEN end up with CPUs spinning in the guest OS when CPUs are
> asked to be hot-unplugged.

So neither kvmtool nor qemu currently supports hotplug for kvm guests on any
architecture from what I can tell. Furthermore, kvmtool on ppc (at least)
uses a secondary spinning loop at a fixed offset into the kernel image. I
don't think we should really pay much attention to those other architectures
in this regard!

> > > That happens _before_ the DT has been parsed, so any memreserve stuff
> > > will be ignored.  And it's at that point that your "offline" secondary
> > > CPUs will have their instructions overwritten.
> > > 
> > > That's fine if the pen ends up being at the same place but that's not
> > > something we guarantee.
> > 
> > Having CPUs in limbo between the bootloader and being online in the kernel
> > is something we should just avoid. Isn't that pen __init anyway?
> 
> If you have hotplug enabled, all the secondary bringup code should be
> in the __cpuinit and __cpuinitdata sections.

Right, but if booting a !HOTPLUG kernel via kexec, surely we'd have to clear
that pen of CPUs?

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 17:16                 ` Will Deacon
  2012-12-04 17:23                   ` Rob Herring
@ 2012-12-04 17:24                   ` Marc Zyngier
  2012-12-04 17:30                     ` Will Deacon
  2012-12-04 18:10                   ` Nicolas Pitre
  2 siblings, 1 reply; 38+ messages in thread
From: Marc Zyngier @ 2012-12-04 17:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/12/12 17:16, Will Deacon wrote:
> On Tue, Dec 04, 2012 at 04:45:58PM +0000, Rob Herring wrote:
>> On 12/04/2012 10:11 AM, Will Deacon wrote:
>>> On Tue, Dec 04, 2012 at 02:37:25PM +0000, Russell King - ARM Linux wrote:
>>>> Umm.  So let's see.  If I'm running v3.6 stock kernel and want to kexec
>>>> into a v3.7 stock kernel.  The SMP pen is part of the v3.6 kernel, which
>>>> will be located at 32K into the RAM.  The v3.7 kernel will also want to
>>>> occupy the same place.  At some point you have to overwrite the v3.6
>>>> kernel with the v3.7 kernel image.
>>>
>>> If the 3.6 kernel didn't bring those CPUs online, they will sit in the
>>> bootloader pen (out of the way of the kernel image) rather than the kernel
>>> pen so I don't think there will be a problem.
>>>
>>> The problem you're describing actually happens when the 3.6 kernel onlines
>>> all of the CPUs, because now it has no way to hotplug them off safely. This
>>> is also an issue with non-virtualised hardware but we could solve it for the
>>> virtual platform by having a para-virtualised device for doing CPU hotplug.
>>>
>>>> That happens _before_ the DT has been parsed, so any memreserve stuff
>>>> will be ignored.  And it's at that point that your "offline" secondary
>>>> CPUs will have their instructions overwritten.
>>>>
>>>> That's fine if the pen ends up being at the same place but that's not
>>>> something we guarantee.
>>>
>>> Having CPUs in limbo between the bootloader and being online in the kernel
>>> is something we should just avoid. Isn't that pen __init anyway?
>>
>> Aren't we mixing 2 pens here? You must have some simple bootloader
>> containing vector table and a pen that the dtb points to, right? The pen
>> you have in the kernel is only needed when hotplug only does a wfi. As
>> you don't yet support hotplug, then you can drop all the kernel pen code.
> 
> Yes, both qemu and kvmtool have bootloader pens outside of the kernel but
> since wfi is not trapped by kvm, the secondaries can be released early due
> to a spurious wakeup so we need the second pen.

Actually, KVM traps WFI and puts the vcpu thread on a wait queue (we are
actually giving more guarantees than the architecture offers here).

We should be able to remove the loop and rely on WFI.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 17:24                   ` Marc Zyngier
@ 2012-12-04 17:30                     ` Will Deacon
  2012-12-11 16:04                       ` Stefano Stabellini
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-04 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 05:24:57PM +0000, Marc Zyngier wrote:
> On 04/12/12 17:16, Will Deacon wrote:
> > On Tue, Dec 04, 2012 at 04:45:58PM +0000, Rob Herring wrote:
> >> Aren't we mixing 2 pens here? You must have some simple bootloader
> >> containing vector table and a pen that the dtb points to, right? The pen
> >> you have in the kernel is only needed when hotplug only does a wfi. As
> >> you don't yet support hotplug, then you can drop all the kernel pen code.
> > 
> > Yes, both qemu and kvmtool have bootloader pens outside of the kernel but
> > since wfi is not trapped by kvm, the secondaries can be released early due
> > to a spurious wakeup so we need the second pen.
> 
> Actually, KVM traps WFI and puts the vcpu thread on a wait queue (we are
> actually giving more guarantees than the architecture offers here).

Ok, if we can rely on this behaviour in the future this sounds promising.
Can somebody comment from the Xen side of things please?

> We should be able to remove the loop and rely on WFI.

Yes, we should be able to remove the kernel-side loop. We still won't have
hotplug and friends, but that can come later because there's a discussion
around PSCI vs virtual power controller to be had there.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 17:11         ` Will Deacon
@ 2012-12-04 18:02           ` Nicolas Pitre
  2012-12-04 18:14             ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Nicolas Pitre @ 2012-12-04 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 4 Dec 2012, Will Deacon wrote:

> Hi Nicolas,
> 
> On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> > on the topic of a para-virtualised machine, I think that it should 
> > simply implement the PSCI calls to bring up CPUs _without_ any holding 
> > pen nor spinning tables.  You issue the appropriate PSCI call with the 
> > physical address for secondary_startup() as argument and you're done.  
> > The host intercepts that call and frees a new CPU instance in response.
> > That's all.
> 
> I'd be happy to go with this suggestion if it wasn't for one thing:
> platforms that do not implement a secure mode. For these platforms, smc will
> be an undefined instruction at the exception level where it is executed and
> therefore cannot be trapped by the hypervisor.

Really?  I thought the hypervisor could virtualize SMC calls.  Or is 
that considered a security hazard?

I don't remember all the PSCI spec details, but I think there was some 
provision for this case i.e. the SMC call could be a HYP call instead.  
And if that's not in the spec, then it probably should be added and 
implemented as if it was.

> If that situation requires a pen, I see no benefit from having two boot
> schemes where one of them would work in every case.

We always have the choice between several schemes in device drivers for 
example, depending on the hardware generation.  Yet we always implement 
the better scheme for the newest hardware for performance reasons, even 
if an older one could work in all cases.

A holding pen is a rather stupid scheme.  Please let's try to do without 
it if possible.

Nicolas

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 17:16                 ` Will Deacon
  2012-12-04 17:23                   ` Rob Herring
  2012-12-04 17:24                   ` Marc Zyngier
@ 2012-12-04 18:10                   ` Nicolas Pitre
  2 siblings, 0 replies; 38+ messages in thread
From: Nicolas Pitre @ 2012-12-04 18:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 4 Dec 2012, Will Deacon wrote:

> On Tue, Dec 04, 2012 at 04:45:58PM +0000, Rob Herring wrote:
> > If there is no way to reset the core, then couldn't the hotplug code
> > tear down the cpu setup and just jump back to 0x0 which then returns to
> > the bootloader's pen?
> 
> I think hotplug really should be implemented with a virtio device. We just
> trap back to the emulation and kill the vcpu thread.

Hotplug and secondary boot should really be considered the same and use 
the same methods.  So if you envision a virtio device for hotplug, then 
just use that for secondary boot as well.  PSCI might be simpler to 
implement though.
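
Roughly what I have in mind, as a sketch only: one set of operations
serving both secondary boot and hotplug. struct fake_host and the
host_cpu_*() helpers are made up for illustration and merely simulate
what the host would do behind a PSCI call or a virtio device:

```c
#include <stdbool.h>

#define NR_VCPUS 4

/* Hypothetical host-side state: which vcpus run and where they started. */
struct fake_host {
	bool online[NR_VCPUS];
	unsigned long entry[NR_VCPUS];
};

/* The same two operations cover secondary boot and hotplug. */
struct cpu_power_ops {
	int (*cpu_on)(struct fake_host *h, unsigned int cpu, unsigned long entry);
	int (*cpu_off)(struct fake_host *h, unsigned int cpu);
};

int host_cpu_on(struct fake_host *h, unsigned int cpu, unsigned long entry)
{
	if (cpu >= NR_VCPUS || h->online[cpu])
		return -1;		/* bad cpu number or already running */
	h->entry[cpu] = entry;		/* e.g. address of secondary_startup() */
	h->online[cpu] = true;		/* host releases a vcpu at that entry */
	return 0;
}

int host_cpu_off(struct fake_host *h, unsigned int cpu)
{
	if (cpu >= NR_VCPUS || !h->online[cpu])
		return -1;		/* bad cpu number or not running */
	h->online[cpu] = false;		/* host kills the vcpu thread */
	return 0;
}

const struct cpu_power_ops virt_power_ops = {
	.cpu_on  = host_cpu_on,
	.cpu_off = host_cpu_off,
};
```

Bringing a CPU back after hot-unplug is then the same cpu_on call that
was used at boot, with no pen involved in either direction.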


Nicolas

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 18:02           ` Nicolas Pitre
@ 2012-12-04 18:14             ` Will Deacon
  2012-12-05 14:52               ` Catalin Marinas
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-04 18:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 04, 2012 at 06:02:13PM +0000, Nicolas Pitre wrote:
> On Tue, 4 Dec 2012, Will Deacon wrote:
> 
> > Hi Nicolas,
> > 
> > On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> > > on the topic of a para-virtualised machine, I think that it should 
> > > simply implement the PSCI calls to bring up CPUs _without_ any holding 
> > > pen nor spinning tables.  You issue the appropriate PSCI call with the 
> > > physical address for secondary_startup() as argument and you're done.  
> > > The host intercepts that call and frees a new CPU instance in response.
> > > That's all.
> > 
> > I'd be happy to go with this suggestion if it wasn't for one thing:
> > platforms that do not implement a secure mode. For these platforms, smc will
> > be an undefined instruction at the exception level where it is executed and
> > therefore cannot be trapped by the hypervisor.
> 
> Really?  I thought the hypervisor could virtualize SMC calls.  Or is 
> that considered a security hazard?

If the security extensions aren't implemented, the hypervisor can't trap the
smc instruction.

> I don't remember all the PSCI spec details, but I think there was some 
> provision for this case i.e. the SMC call could be a HYP call instead.  
> And if that's not in the spec, then it probably should be added and 
> implemented as if it was.

Well, this depends on the guest taking an undefined instruction exception on
the smc, then deciding to issue an hvc instead and *then* having the
hypervisor somehow translate that into a PSCI invocation. It could work, but
it sounds easy to mess up and relies on the PSCI firmware co-existing with
things like kvm.

> > If that situation requires a pen, I see no benefit from having two boot
> > schemes where one of them would work in every case.
> 
> We always have the choice between several schemes in device drivers for 
> example, depending on the hardware generation.  Yet we always implement 
> the better scheme for the newest hardware for performance reasons, even 
> if an older one could work in all cases.

Again, I totally agree when it comes to things like poweroff and hotplug but
for booting I don't think we gain much from having multiple implementations
for a single platform. Hopefully this is moot -- see below.

> A holding pen is a rather stupid scheme.  Please let's try to do without 
> it if possible.

I've just hacked up Rob's suggestion and it seems to be working, so I'll
post a pen-less v2 tomorrow. The hotplug/reboot code can come later when we
have something host-side that we can use (could be PSCI).

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 17:24                 ` Will Deacon
@ 2012-12-04 19:37                   ` Arnd Bergmann
  0 siblings, 0 replies; 38+ messages in thread
From: Arnd Bergmann @ 2012-12-04 19:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 04 December 2012, Will Deacon wrote:
> > 
> > Most other real-world platforms out there have a way to power off the
> > unused secondary CPUs - Tegra and OMAP both do.
> 
> If a virtual machine powers off a virtual CPU, I doubt we want to power off
> its corresponding CPU -- that logic can remain in the host. All we need to
> do is kill the virtual CPU thread, which we can do easily enough. Booting is
> the more difficult problem because we introduce a reliance on a virtual
> device being ready incredibly early, essentially hardcoding part of the
> virtual machine.

Powering off a virtual CPU is the same as killing the virtual CPU thread.
Not powering it off would imply that we schedule a host CPU to run an
endless loop on it.

	Arnd

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 18:14             ` Will Deacon
@ 2012-12-05 14:52               ` Catalin Marinas
  2012-12-05 15:07                 ` Nicolas Pitre
  2012-12-05 15:07                 ` Will Deacon
  0 siblings, 2 replies; 38+ messages in thread
From: Catalin Marinas @ 2012-12-05 14:52 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Dec 04, 2012 at 06:02:13PM +0000, Nicolas Pitre wrote:
>> On Tue, 4 Dec 2012, Will Deacon wrote:
>> > On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
>> > > on the topic of a para-virtualised machine, I think that it should
>> > > simply implement the PSCI calls to bring up CPUs _without_ any holding
>> > > pen nor spinning tables.  You issue the appropriate PSCI call with the
>> > > physical address for secondary_startup() as argument and you're done.
>> > > The host intercepts that call and frees a new CPU instance in response.
>> > > That's all.
>> >
>> > I'd be happy to go with this suggestion if it wasn't for one thing:
>> > platforms that do not implement a secure mode. For these platforms, smc will
>> > be an undefined instruction at the exception level where it is executed and
>> > therefore cannot be trapped by the hypervisor.
>>
>> Really?  I thought the hypervisor could virtualize SMC calls.  Or is
>> that considered a security hazard?
>
> If the security extensions aren't implemented, the hypervisor can't trap the
> smc instruction.
>
>> I don't remember all the PSCI spec details, but I think there was some
>> provision for this case i.e. the SMC call could be a HYP call instead.
>> And if that's not in the spec, then it probably should be added and
>> implemented as if it was.
>
> Well, this depends on the guest taking an undefined instruction exception on
> the smc, then deciding to issue an hvc instead and *then* having the
> hypervisor somehow translate that into a PSCI invocation. It could work, but
> it sounds easy to mess up and relies on the PSCI firmware co-existing with
> things like kvm.

We can have enable-method DT entries independent of the SoC and one of
them can be psci-hvc.

Just for clarification, AArch32 with virtualisation mandates the
security extensions, so the SMC can be trapped. On AArch64 it is a bit
tricky since the presence of EL3 is not mandated, in which case SMC
would undef (don't ask why ;). That's where we can have different
enable methods specified via the DT.
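
As a rough sketch of the selection, assuming the enable-method string has
already been read from the DT: only "psci-hvc" comes from the proposal
above, while "psci-smc", the enum and the function name are hypothetical:

```c
#include <string.h>

enum psci_conduit { CONDUIT_NONE, CONDUIT_SMC, CONDUIT_HVC };

/* Map an enable-method string to the instruction used to reach PSCI. */
enum psci_conduit conduit_from_enable_method(const char *method)
{
	if (!method)
		return CONDUIT_NONE;
	if (strcmp(method, "psci-hvc") == 0)
		return CONDUIT_HVC;	/* works even when SMC would undef */
	if (strcmp(method, "psci-smc") == 0)
		return CONDUIT_SMC;	/* hypothetical counterpart, needs EL3 */
	return CONDUIT_NONE;		/* unknown method: no PSCI conduit */
}
```

The guest then never has to probe by taking an undefined instruction
exception; the DT tells it which instruction to issue.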

-- 
Catalin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-05 14:52               ` Catalin Marinas
@ 2012-12-05 15:07                 ` Nicolas Pitre
  2012-12-05 15:10                   ` Will Deacon
  2012-12-05 15:07                 ` Will Deacon
  1 sibling, 1 reply; 38+ messages in thread
From: Nicolas Pitre @ 2012-12-05 15:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 5 Dec 2012, Catalin Marinas wrote:

> On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> > On Tue, Dec 04, 2012 at 06:02:13PM +0000, Nicolas Pitre wrote:
> >> On Tue, 4 Dec 2012, Will Deacon wrote:
> >> > On Tue, Dec 04, 2012 at 05:00:07PM +0000, Nicolas Pitre wrote:
> >> > > on the topic of a para-virtualised machine, I think that it should
> >> > > simply implement the PSCI calls to bring up CPUs _without_ any holding
> >> > > pen nor spinning tables.  You issue the appropriate PSCI call with the
> >> > > physical address for secondary_startup() as argument and you're done.
> >> > > The host intercepts that call and frees a new CPU instance in response.
> >> > > That's all.
> >> >
> >> > I'd be happy to go with this suggestion if it wasn't for one thing:
> >> > platforms that do not implement a secure mode. For these platforms, smc will
> >> > be an undefined instruction at the exception level where it is executed and
> >> > therefore cannot be trapped by the hypervisor.
> >>
> >> Really?  I thought the hypervisor could virtualize SMC calls.  Or is
> >> that considered a security hazard?
> >
> > If the security extensions aren't implemented, the hypervisor can't trap the
> > smc instruction.
> >
> >> I don't remember all the PSCI spec details, but I think there was some
> >> provision for this case i.e. the SMC call could be a HYP call instead.
> >> And if that's not in the spec, then it probably should be added and
> >> implemented as if it was.
> >
> > Well, this depends on the guest taking an undefined instruction exception on
> > the smc, then deciding to issue an hvc instead and *then* having the
> > hypervisor somehow translate that into a PSCI invocation. It could work, but
> > it sounds easy to mess up and relies on the PSCI firmware co-existing with
> > things like kvm.
> 
> We can have enable-method DT entries independent of the SoC and one of
> them can be psci-hvc.
> 
> Just for clarification, AArch32 with virtualisation mandates the
> security extensions, so the SMC can be trapped.

Good. Therefore this one is settled.

> On AArch64 it is a bit
> tricky since the presence of EL3 is not mandated, in which case SMC
> would undef (don't ask why ;). That's where we can have different
> enable methods specified via the DT.

In that case, sure.  But do you expect such a configuration to be 
common?  Especially with all this secure booting and co. being enforced
across the board?  I bet it won't.

So it is probably best to presume PSCI by default, and have a DT 
specified method only when it is necessary to override the default.


Nicolas

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-05 14:52               ` Catalin Marinas
  2012-12-05 15:07                 ` Nicolas Pitre
@ 2012-12-05 15:07                 ` Will Deacon
  2012-12-05 15:15                   ` Catalin Marinas
  1 sibling, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-05 15:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 05, 2012 at 02:52:57PM +0000, Catalin Marinas wrote:
> On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> > Well, this depends on the guest taking an undefined instruction exception on
> > the smc, then deciding to issue an hvc instead and *then* having the
> > hypervisor somehow translate that into a PSCI invocation. It could work, but
> > it sounds easy to mess up and relies on the PSCI firmware co-existing with
> > things like kvm.
> 
> We can have enable-method DT entries independent of the SoC and one of
> them can be psci-hvc.

As soon as the support is there in the upper layers, we can do that.

> Just for clarification, AArch32 with virtualisation mandates the
> security extensions, so the SMC can be trapped. On AArch64 it is a bit
> tricky since the presence of EL3 is not mandated, in which case SMC
> would undef (don't ask why ;). That's where we can have different
> enable methods specified via the DT.

Not entirely true: only ARMv7 mandates the security extensions in this
manner. You can still have ARMv8 CPUs running AArch32 code without the
security extensions.

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-05 15:07                 ` Nicolas Pitre
@ 2012-12-05 15:10                   ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-05 15:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 05, 2012 at 03:07:05PM +0000, Nicolas Pitre wrote:
> On Wed, 5 Dec 2012, Catalin Marinas wrote:
> > On 4 December 2012 18:14, Will Deacon <will.deacon@arm.com> wrote:
> > > Well, this depends on the guest taking an undefined instruction exception on
> > > the smc, then deciding to issue an hvc instead and *then* having the
> > > hypervisor somehow translate that into a PSCI invocation. It could work, but
> > > it sounds easy to mess up and relies on the PSCI firmware co-existing with
> > > things like kvm.
> > 
> > We can have enable-method DT entries independent of the SoC and one of
> > them can be psci-hvc.
> > 
> > Just for clarification, AArch32 with virtualisation mandates the
> > security extensions, so the SMC can be trapped.
> 
> Good. Therefore this one is settled.

Looks like we replied at the same time! Please see my other mail...

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-05 15:07                 ` Will Deacon
@ 2012-12-05 15:15                   ` Catalin Marinas
  0 siblings, 0 replies; 38+ messages in thread
From: Catalin Marinas @ 2012-12-05 15:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 05, 2012 at 03:07:32PM +0000, Will Deacon wrote:
> On Wed, Dec 05, 2012 at 02:52:57PM +0000, Catalin Marinas wrote:
> > Just for clarification, AArch32 with virtualisation mandates the
> > security extensions, so the SMC can be trapped. On AArch64 it is a bit
> > tricky since the presence of EL3 is not mandated, in which case SMC
> > would undef (don't ask why ;). That's where we can have different
> > enable methods specified via the DT.
>
> Not entirely true: only ARMv7 mandates the security extensions in this
> manner. You can still have ARMv8 CPUs running AArch32 code without the
> security extensions.

Yes, I pretty much had the 32-bit and 64-bit ARM ports in mind. An
AArch32 guest OS running on an AArch64 KVM would indeed have this issue.

But HVC PSCI would still work for the mach-virt in all cases.

--
Catalin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-04 17:30                     ` Will Deacon
@ 2012-12-11 16:04                       ` Stefano Stabellini
  2012-12-11 16:09                         ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Stefano Stabellini @ 2012-12-11 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 4 Dec 2012, Will Deacon wrote:
> On Tue, Dec 04, 2012 at 05:24:57PM +0000, Marc Zyngier wrote:
> > On 04/12/12 17:16, Will Deacon wrote:
> > > On Tue, Dec 04, 2012 at 04:45:58PM +0000, Rob Herring wrote:
> > >> Aren't we mixing 2 pens here? You must have some simple bootloader
> > >> containing vector table and a pen that the dtb points to, right? The pen
> > >> you have in the kernel is only needed when hotplug only does a wfi. As
> > >> you don't yet support hotplug, then you can drop all the kernel pen code.
> > > 
> > > Yes, both qemu and kvmtool have bootloader pens outside of the kernel but
> > > since wfi is not trapped by kvm, the secondaries can be released early due
> > > to a spurious wakeup so we need the second pen.
> > 
> > Actually, KVM traps WFI and puts the vcpu thread on a wait queue (we are
> > > actually giving more guarantees than the architecture offers here).
> 
> Ok, if we can rely on this behaviour in the future this sounds promising.
> Can somebody comment from the Xen side of things please?

Hi!
Sorry for taking so long to reply but I admit I missed the thread
entirely until now. I would appreciate it if you could CC
xen-devel at lists.xen.org and/or me for virtualization related
discussions.

On the Xen side there are not going to be any problems related to SMP
and holding pens, because secondary CPUs only come into existence after
the first CPU calls the HYPERVISOR_vcpu_op hypercall (still missing on
ARM).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-11 16:04                       ` Stefano Stabellini
@ 2012-12-11 16:09                         ` Will Deacon
  2012-12-11 16:34                           ` Stefano Stabellini
  0 siblings, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-11 16:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 11, 2012 at 04:04:20PM +0000, Stefano Stabellini wrote:
> Hi!

Hi Stefano,

> Sorry for taking so long to reply but I admit I missed the thread
> entirely until now. I would appreciate it if you could CC
> xen-devel at lists.xen.org and/or me for virtualization related
> discussions.

I did CC xen-arm at lists.xen.org, but you've dropped that list in your
reply...

> On the Xen side there are not going to be any problems related to SMP
> and holding pens, because secondary CPUs only come into existence after
> the first CPU calls the HYPERVISOR_vcpu_op hypercall (still missing on
> ARM).

Ok. Would you be willing/able to wrap this in a PSCI interface when you do
add SMP support for ARM?

Will

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC PATCH 0/2] Add support for a fake, para-virtualised machine
  2012-12-04 12:30   ` Will Deacon
  2012-12-04 14:12     ` Rob Herring
@ 2012-12-11 16:19     ` Stefano Stabellini
  1 sibling, 0 replies; 38+ messages in thread
From: Stefano Stabellini @ 2012-12-11 16:19 UTC (permalink / raw)
  To: linux-arm-kernel

sorry if I missed anybody from CC, I don't have the original list

On Tue, 4 Dec 2012, Will Deacon wrote:
> On Mon, Dec 03, 2012 at 09:54:09PM +0000, Rob Herring wrote:
> > On 12/03/2012 11:52 AM, Will Deacon wrote:
> > > When running Linux on a para-virtualised platform (that is, one where
> > > the guest is aware that it is dealing with virtual devices sitting on
> > > things like virtio or xenbus) we require very little in the way of
> > > platform code and piggy-backing on top of an existing platform can
> > > require a lot of device emulation for very little gain.
> > > 
> > > These two patches introduce mach-virt: a very simple, DT-based machine
> > > which can be used with kvmtool in conjunction with virtio-based devices.
> > > It's not hard to imagine the same machine being targeted by Xen, which
> > > currently emulates a minimal variant of the vexpress platform.
> > > 
> > > Note that this patch series depends on the timer rework from Mark
> > > Rutland, posted on Friday:
> > > 
> > >   http://lists.infradead.org/pipermail/linux-arm-kernel/2012-November/135651.html
> > > 
> > > All feedback welcome. We suspect that most controversy will be around
> > > the name of the thing :)
> > 
> > We've discussed this before at conferences. I don't know that we
> > concluded this wasn't needed, but it certainly leaned that direction.
> 
> I too leaned that direction before I started looking at kvm in detail and,
> since then, I've changed my mind when it comes to para-virtualisation.
> 
> The reason for this is that there is absolutely no reason to emulate some
> components of a real platform and then bolt virtual devices onto it once
> you've got enough to get it going. It leads to a right royal mess in
> userspace, where you have to write a load of non-reusable emulation code and
> it leads to churn in the kernel because you're constantly at odds with
> people trying to develop the platform code based on the actual hardware
> they have.
> 
> With a virtualisation-capable ARMv7 system, all you *need* to boot SMP
> Debian is:
> 
> 	- A v7 CPU with virt extensions
> 	- vGIC
> 	- architected timers
> 
> *everything* else can be described using virtio devices in the device-tree,
> essentially allowing you to generate platforms based on the above and boot
> the same kernel on them.

That's right, however for Xen there would be a single hypervisor node
and no virtio devices: everything else is going to be on xenbus.


> > So what has changed? You're not going to save code space because we're
> > building multiple platforms together. You'll save some boot time, but a
> > stripped down dtb with only the minimal peripherals would probably save
> > nearly as much time. 
> 
> It's really got nothing to do with code space or boot speed. What it *is*
> about is avoiding the tight coupling with a real platform and suffering as a
> result. Yes, you can strip down the DT for a real platform but you'll likely
> still have to emulate things like the SP804 in order to boot. That's not to
> mention any platform-specific system register interfaces which are required
> early on.
> 
> We can't even re-use the socfpga code (which is incredibly minimal) without
> emulating the dw_apb_timer.
> 
> > However, I do have concerns with using VExpress as
> > the guest. For example, you can't support a non-PAE guest with 4GB of
> > RAM on VExpress (maybe if the vexpress code gets all memory map info
> > from DT).
> 
> Yes, vexpress is even less suitable for this.

My understanding was that vexpress was supposed to become a machine
fully discoverable via DT, and that of course includes the memory map.
Actually I think that we are pretty close to that already.

However if you prefer to introduce a new machine for that purpose, I am
OK with that too. I am sure we can base xenvm on that.


> > Is this really complete? Will we need reset, poweroff, hotplug, and
> > suspend/resume support for example? Unlike most initial platform
> > submissions which are minimal, I think seeing full support would be
> > useful here. Then we can better gauge how much we are really saving.
> 
> The code is complete in the sense that you can boot an SMP guest running
> Debian with console, network, block etc. etc. but you're right to point out
> the absence of power-management support.
> 
> However, power-management in KVM guests is a *much* larger problem and not
> one that has been solved adequately as of yet. There are suggestions that it
> should be handled entirely in firmware, with the guest making smc calls to
> request power-management operations but this is yet to materialise and, as
> such, we can't yet use it here.
> 
> We could look at building a virtio-based power controller but that's going
> to come up too late for SMP booting (although will give us hotplug, reset
> etc).

In Xen x86 we implement hotplug via xenbus and power management via
hypercalls. So we should be able to get away with it without doing any
emulation or virtio.


* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-11 16:09                         ` Will Deacon
@ 2012-12-11 16:34                           ` Stefano Stabellini
  2012-12-11 16:41                             ` Ian Campbell
  2012-12-11 16:43                             ` Will Deacon
  0 siblings, 2 replies; 38+ messages in thread
From: Stefano Stabellini @ 2012-12-11 16:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 11 Dec 2012, Will Deacon wrote:
> On Tue, Dec 11, 2012 at 04:04:20PM +0000, Stefano Stabellini wrote:
> > Hi!
> 
> Hi Stefano,
> 
> > Sorry for taking so long to reply but I admit I missed the thread
> > entirely until now. I would appreciate if you could CC
> > xen-devel at lists.xen.org and/or me for virtualization related
> > discussions.
> 
> I did CC xen-arm at lists.xen.org,
  
Thanks!
I don't know why I didn't receive it then :-\

In any case xen-devel is the right place for upstream related
discussion.


> but you've dropped that list in your
> reply...

Yeah.. that's because I didn't receive the original email and I had to
download the mailing list archives, sorry about that.


> > On the Xen side there are not going to be any problems related to SMP
> > and holding pens, because secondary CPUs only come into existence after
> > the first CPU calls the HYPERVISOR_vcpu_op hypercall (still missing on
> > ARM).
> 
> Ok. Would you be willing/able to wrap this in a PSCI interface when you do
> add SMP support for ARM?

I grepped for PSCI in the Linux tree but I couldn't find anything.
Also I googled for ARM PSCI and nothing of value came up. Can I find a
doc that explains what PSCI is anywhere?


* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-11 16:34                           ` Stefano Stabellini
@ 2012-12-11 16:41                             ` Ian Campbell
  2012-12-11 16:43                             ` Will Deacon
  1 sibling, 0 replies; 38+ messages in thread
From: Ian Campbell @ 2012-12-11 16:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2012-12-11 at 16:34 +0000, Stefano Stabellini wrote:
> On Tue, 11 Dec 2012, Will Deacon wrote:
> > On Tue, Dec 11, 2012 at 04:04:20PM +0000, Stefano Stabellini wrote:
> > > Hi!
> > 
> > Hi Stefano,
> > 
> > > Sorry for taking so long to reply but I admit I missed the thread
> > > entirely until now. I would appreciate if you could CC
> > > xen-devel at lists.xen.org and/or me for virtualization related
> > > discussions.
> > 
> > I did CC xen-arm at lists.xen.org,
>   
> Thanks!
> I don't know why I didn't receive it then :-\

xen-arm@ is the list for the armv6 pv Xen port but those developers
aren't terribly active these days and I wouldn't be surprised if no one
is moderating that particular list. I assume Will isn't subscribed so
his post will be in the queue somewhere.

> In any case xen-devel is the right place for upstream related
> discussion.

Right, any discussion about Xen on v7 and v8 ARM processors with the
virt extensions should happen on xen-devel@ not xen-arm@, the
MAINTAINERS file reflects this.

Ian.


* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-11 16:34                           ` Stefano Stabellini
  2012-12-11 16:41                             ` Ian Campbell
@ 2012-12-11 16:43                             ` Will Deacon
  2012-12-11 17:14                               ` Stefano Stabellini
  1 sibling, 1 reply; 38+ messages in thread
From: Will Deacon @ 2012-12-11 16:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 11, 2012 at 04:34:27PM +0000, Stefano Stabellini wrote:
> On Tue, 11 Dec 2012, Will Deacon wrote:
> > On Tue, Dec 11, 2012 at 04:04:20PM +0000, Stefano Stabellini wrote:
> > > Hi!
> > 
> > Hi Stefano,
> > 
> > > Sorry for taking so long to reply but I admit I missed the thread
> > > entirely until now. I would appreciate if you could CC
> > > xen-devel at lists.xen.org and/or me for virtualization related
> > > discussions.
> > 
> > I did CC xen-arm at lists.xen.org,
>   
> Thanks!
> I don't know why I didn't receive it then :-\
> 
> In any case xen-devel is the right place for upstream related
> discussion.
> 
> 
> > but you've dropped that list in your
> > reply...
> 
> Yeah.. that's because I didn't receive the original email and I had to
> download the mailing list archives, sorry about that.

Ok, although note that I'm now getting all of your replies twice!

> 
> > > On the Xen side there are not going to be any problems related to SMP
> > > and holding pens, because secondary CPUs only come into existence after
> > > the first CPU calls the HYPERVISOR_vcpu_op hypercall (still missing on
> > > ARM).
> > 
> > Ok. Would you be willing/able to wrap this in a PSCI interface when you do
> > add SMP support for ARM?
> 
> I grepped for PSCI in the Linux tree but I couldn't find anything.
> Also I googled for ARM PSCI and nothing of value came up. Can I find a
> doc that explains what PSCI is anywhere?

There is an initial version of the document here (requires registration):

  https://silver.arm.com/download/download.tm?pv=1303201

Note that there are some updates due to clarify things a bit in areas such
as function ID numbers and the PCS. Marc and I also plan to hack something up
in kvm/mach-virt for booting secondary CPUs.

Will


* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-11 16:43                             ` Will Deacon
@ 2012-12-11 17:14                               ` Stefano Stabellini
  2012-12-11 17:24                                 ` Will Deacon
  0 siblings, 1 reply; 38+ messages in thread
From: Stefano Stabellini @ 2012-12-11 17:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 11 Dec 2012, Will Deacon wrote:
> > > > On the Xen side there are not going to be any problems related to SMP
> > > > and holding pens, because secondary CPUs only come into existence after
> > > > the first CPU calls the HYPERVISOR_vcpu_op hypercall (still missing on
> > > > ARM).
> > > 
> > > Ok. Would you be willing/able to wrap this in a PSCI interface when you do
> > > add SMP support for ARM?
> > 
> > I grepped for PSCI in the Linux tree but I couldn't find anything.
> > Also I googled for ARM PSCI and nothing of value came up. Can I find a
> > doc that explains what PSCI is anywhere?
> 
> There is an initial version of the document here (requires registration):
> 
>   https://silver.arm.com/download/download.tm?pv=1303201
> 
> Note that there are some updates due to clarify things a bit in areas such
> as function ID numbers and the PCS. Marc and I also plan to hack something up
> in kvm/mach-virt for booting secondary CPUs.

Ideally the functionalities would be discoverable via DT, so that PSCI,
PSCI-HVC and a paravirtualized Xen interface based on hypercalls and
xenbus could all coexist without issues, reusing the same internal Linux
APIs.
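
A rough sketch of the shape this could take, written as plain userspace C
rather than kernel code. Everything named here is an assumption made up for
illustration: the cpu_boot_ops structure, the "psci-smc" / "psci-hvc" / "xen"
method strings and the stub backends are hypothetical, and the string lookup
only stands in for whatever property a future DT binding would define. Real
backends would issue an smc, an hvc or a Xen hypercall instead of recording
their name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical common interface that each bring-up backend implements. */
struct cpu_boot_ops {
	const char *method;	/* assumed DT-provided method string */
	int (*cpu_on)(unsigned int cpu, unsigned long entry);
};

/* Records which stub ran last, purely so the sketch can be checked. */
static const char *last_backend;

/* Stub: a real version would make a PSCI call using smc. */
static int psci_smc_cpu_on(unsigned int cpu, unsigned long entry)
{
	(void)cpu; (void)entry;
	last_backend = "psci-smc";
	return 0;
}

/* Stub: a real version would make the same PSCI call using hvc. */
static int psci_hvc_cpu_on(unsigned int cpu, unsigned long entry)
{
	(void)cpu; (void)entry;
	last_backend = "psci-hvc";
	return 0;
}

/* Stub: a real version would use a Xen hypercall to bring up the vcpu. */
static int xen_cpu_on(unsigned int cpu, unsigned long entry)
{
	(void)cpu; (void)entry;
	last_backend = "xen";
	return 0;
}

static const struct cpu_boot_ops backends[] = {
	{ "psci-smc", psci_smc_cpu_on },
	{ "psci-hvc", psci_hvc_cpu_on },
	{ "xen",      xen_cpu_on },
};

/* Stand-in for the DT lookup: map a method string to a backend. */
static const struct cpu_boot_ops *select_backend(const char *method)
{
	size_t i;

	for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
		if (strcmp(backends[i].method, method) == 0)
			return &backends[i];
	return NULL;	/* unknown method: no backend available */
}

/* Callers only see this one entry point, whatever the backend is. */
static int boot_secondary(const struct cpu_boot_ops *ops,
			  unsigned int cpu, unsigned long entry)
{
	if (!ops)
		return -1;
	return ops->cpu_on(cpu, entry);
}
```

The point is only that the SMP code would call boot_secondary() and never
needs to know which of the three mechanisms sits behind it.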


* [RFC PATCH 2/2] ARM: SMP support for mach-virt
  2012-12-11 17:14                               ` Stefano Stabellini
@ 2012-12-11 17:24                                 ` Will Deacon
  0 siblings, 0 replies; 38+ messages in thread
From: Will Deacon @ 2012-12-11 17:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 11, 2012 at 05:14:11PM +0000, Stefano Stabellini wrote:
> On Tue, 11 Dec 2012, Will Deacon wrote:
> > > > > On the Xen side there are not going to be any problems related to SMP
> > > > > and holding pens, because secondary CPUs only come into existence after
> > > > > the first CPU calls the HYPERVISOR_vcpu_op hypercall (still missing on
> > > > > ARM).
> > > > 
> > > > Ok. Would you be willing/able to wrap this in a PSCI interface when you do
> > > > add SMP support for ARM?
> > > 
> > > I grepped for PSCI in the Linux tree but I couldn't find anything.
> > > Also I googled for ARM PSCI and nothing of value came up. Can I find a
> > > doc that explains what PSCI is anywhere?
> > 
> > There is an initial version of the document here (requires registration):
> > 
> >   https://silver.arm.com/download/download.tm?pv=1303201
> > 
> > Note that there are some updates due to clarify things a bit in areas such
> > as function ID numbers and the PCS. Marc and I also plan to hack something up
> > in kvm/mach-virt for booting secondary CPUs.
> 
> Ideally the functionalities would be discoverable via DT, so that PSCI,
> PSCI-HVC and a paravirtualized Xen interface based on hypercalls and
> xenbus could all coexist without issues, reusing the same internal Linux
> APIs.

Agreed, I'll propose a binding once we have something off the ground.

Will



Thread overview: 38+ messages
2012-12-03 17:52 [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Will Deacon
2012-12-03 17:52 ` [RFC PATCH 1/2] ARM: Dummy Virtual Machine platform support Will Deacon
2012-12-03 17:52 ` [RFC PATCH 2/2] ARM: SMP support for mach-virt Will Deacon
2012-12-03 21:55   ` Rob Herring
2012-12-04 12:40     ` Will Deacon
2012-12-04 13:33       ` Russell King - ARM Linux
2012-12-04 13:40         ` Will Deacon
2012-12-04 14:37           ` Russell King - ARM Linux
2012-12-04 16:11             ` Will Deacon
2012-12-04 16:35               ` Russell King - ARM Linux
2012-12-04 17:24                 ` Will Deacon
2012-12-04 19:37                   ` Arnd Bergmann
2012-12-04 16:45               ` Rob Herring
2012-12-04 17:16                 ` Will Deacon
2012-12-04 17:23                   ` Rob Herring
2012-12-04 17:24                   ` Marc Zyngier
2012-12-04 17:30                     ` Will Deacon
2012-12-11 16:04                       ` Stefano Stabellini
2012-12-11 16:09                         ` Will Deacon
2012-12-11 16:34                           ` Stefano Stabellini
2012-12-11 16:41                             ` Ian Campbell
2012-12-11 16:43                             ` Will Deacon
2012-12-11 17:14                               ` Stefano Stabellini
2012-12-11 17:24                                 ` Will Deacon
2012-12-04 18:10                   ` Nicolas Pitre
2012-12-03 21:54 ` [RFC PATCH 0/2] Add support for a fake, para-virtualised machine Rob Herring
2012-12-04 12:30   ` Will Deacon
2012-12-04 14:12     ` Rob Herring
2012-12-04 17:00       ` Nicolas Pitre
2012-12-04 17:11         ` Will Deacon
2012-12-04 18:02           ` Nicolas Pitre
2012-12-04 18:14             ` Will Deacon
2012-12-05 14:52               ` Catalin Marinas
2012-12-05 15:07                 ` Nicolas Pitre
2012-12-05 15:10                   ` Will Deacon
2012-12-05 15:07                 ` Will Deacon
2012-12-05 15:15                   ` Catalin Marinas
2012-12-11 16:19     ` Stefano Stabellini
