linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3)
@ 2016-08-19 16:13 Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization Daniel Thompson
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

This patchset provides a pseudo-NMI for arm64 kernels by reimplementing
the irqflags macros to modify the GIC PMR (the priority mask register
accessible as a system register on GICv3 and later) rather than the
PSR. The patchset includes an implementation of
arch_trigger_all_cpu_backtrace() for arm64 allowing the new code to be
exercised.

The code works-for-me (tm) and is more "real" than the last time I
shared these patches. However, a couple of limitations and caveats
remain:

 1. Requires GICv3+ hardware to be effective, meaning it has mostly been
    tested using FVP. The alternatives runtime patching system is
    employed, so systems with earlier GIC architectures are still
    bootable but will not benefit from NMI simulation.

 2. FVP needs a bit of hacking to be able to run <SysRq-L> from an ISR.
    That's a shame because <SysRq-L> is a great way to observe an NMI
    preempting an IRQ handler. Testers are welcome to ping me offline
    and I will share the hacks I have been using to test with.

v3:

 * Rebased on v4.8-rc2
 * Restrict apply_alternatives_early() to consider only the
   capabilities needed for the kernel to function correctly (Will
   Deacon).
 * Added a bunch of notrace qualifiers (Will Deacon)
 * Used byte addressing to configure the interrupt priorities (Marc
   Zyngier)
 * Fixed implementation of interrupts_enabled()
 * Eliminated the over-zealous inclusion of <linux/irqchip/arm-gic-v3.h>.
 * Pushed a few symbol definitions towards the end of the patch series.

v2:

 * Removed the isb instructions. The PMR is self-synchronizing so
   these are not needed (Dave Martin)
 * Use alternative runtime patching to allow the same kernel binary
   to boot systems with and without GICv3+ (Dave Martin).
 * Added code to properly distinguish between NMI and normal IRQ and to
   call into NMI handling code where needed.
 * Replaced the IPI backtrace logic with a newer version (from Russell
   King).


Daniel Thompson (7):
  irqchip: gic-v3: Reset BPR during initialization
  arm64: Add support for on-demand backtrace of other CPUs
  arm64: cpufeature: Allow early detect of specific features
  arm64: alternative: Apply alternatives early in boot process
  arm64: irqflags: Reorder the fiq & async macros
  arm64: irqflags: Use ICC sysregs to implement IRQ masking
  arm64: Implement IPI_CPU_BACKTRACE using pseudo-NMIs

 arch/arm/include/asm/arch_gicv3.h      |   6 ++
 arch/arm64/Kconfig                     |  16 ++++
 arch/arm64/include/asm/alternative.h   |   1 +
 arch/arm64/include/asm/arch_gicv3.h    |  57 +++++++++++++
 arch/arm64/include/asm/assembler.h     |  55 +++++++++++-
 arch/arm64/include/asm/hardirq.h       |   2 +-
 arch/arm64/include/asm/irq.h           |   3 +
 arch/arm64/include/asm/irqflags.h      | 118 ++++++++++++++++++++++++--
 arch/arm64/include/asm/ptrace.h        |  23 +++++
 arch/arm64/include/asm/smp.h           |   2 +
 arch/arm64/kernel/alternative.c        |  36 +++++++-
 arch/arm64/kernel/cpufeature.c         |  92 +++++++++++---------
 arch/arm64/kernel/entry.S              | 151 +++++++++++++++++++++++++++------
 arch/arm64/kernel/head.S               |  35 ++++++++
 arch/arm64/kernel/smp.c                |  68 ++++++++++++++-
 arch/arm64/mm/proc.S                   |  23 +++++
 drivers/irqchip/irq-gic-v3.c           |  72 ++++++++++++++++
 include/linux/irqchip/arm-gic-common.h |   8 ++
 include/linux/irqchip/arm-gic.h        |   5 --
 lib/nmi_backtrace.c                    |   9 +-
 20 files changed, 697 insertions(+), 85 deletions(-)

--
2.7.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  2016-08-22 12:33   ` Marc Zyngier
  2016-08-19 16:13 ` [RFC PATCH v3 2/7] arm64: Add support for on-demand backtrace of other CPUs Daniel Thompson
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, when running on FVP, CPU 0 boots up with its BPR changed from
the reset value. This renders it impossible to (preemptively) prioritize
interrupts on CPU 0.

This is harmless on normal systems since Linux typically does not
support preemptive interrupts. It does, however, cause problems in
systems with additional changes (such as patches for NMI simulation).

Many thanks to Andrew Thoelke for suggesting that the BPR had the
potential to harm preemption.

Suggested-by: Andrew Thoelke <andrew.thoelke@arm.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/include/asm/arch_gicv3.h   | 6 ++++++
 arch/arm64/include/asm/arch_gicv3.h | 6 ++++++
 drivers/irqchip/irq-gic-v3.c        | 8 ++++++++
 3 files changed, 20 insertions(+)

diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h
index e08d15184056..dfe4002812da 100644
--- a/arch/arm/include/asm/arch_gicv3.h
+++ b/arch/arm/include/asm/arch_gicv3.h
@@ -34,6 +34,7 @@
 #define ICC_CTLR			__ACCESS_CP15(c12, 0, c12, 4)
 #define ICC_SRE				__ACCESS_CP15(c12, 0, c12, 5)
 #define ICC_IGRPEN1			__ACCESS_CP15(c12, 0, c12, 7)
+#define ICC_BPR1			__ACCESS_CP15(c12, 0, c12, 3)
 
 #define ICC_HSRE			__ACCESS_CP15(c12, 4, c9, 5)
 
@@ -157,6 +158,11 @@ static inline void gic_write_sre(u32 val)
 	isb();
 }
 
+static inline void gic_write_bpr1(u32 val)
+{
+	asm volatile("mcr " __stringify(ICC_BPR1) : : "r" (val));
+}
+
 /*
  * Even in 32bit systems that use LPAE, there is no guarantee that the I/O
  * interface provides true 64bit atomic accesses, so using strd/ldrd doesn't
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index 8ec88e5b290f..fc2a0cb47b2c 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -28,6 +28,7 @@
 #define ICC_CTLR_EL1			sys_reg(3, 0, 12, 12, 4)
 #define ICC_SRE_EL1			sys_reg(3, 0, 12, 12, 5)
 #define ICC_GRPEN1_EL1			sys_reg(3, 0, 12, 12, 7)
+#define ICC_BPR1_EL1			sys_reg(3, 0, 12, 12, 3)
 
 #define ICC_SRE_EL2			sys_reg(3, 4, 12, 9, 5)
 
@@ -165,6 +166,11 @@ static inline void gic_write_sre(u32 val)
 	isb();
 }
 
+static inline void gic_write_bpr1(u32 val)
+{
+	asm volatile("msr_s " __stringify(ICC_BPR1_EL1) ", %0" : : "r" (val));
+}
+
 #define gic_read_typer(c)		readq_relaxed(c)
 #define gic_write_irouter(v, c)		writeq_relaxed(v, c)
 
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 6fc56c3466b0..fedcdd09b9b2 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -495,6 +495,14 @@ static void gic_cpu_sys_reg_init(void)
 	/* Set priority mask register */
 	gic_write_pmr(DEFAULT_PMR_VALUE);
 
+	/*
+	 * Some firmwares hand over to the kernel with the BPR changed from
+	 * its reset value (and with a value large enough to prevent
+	 * any pre-emptive interrupts from working at all). Writing a zero
+	 * to BPR restores its reset value.
+	 */
+	gic_write_bpr1(0);
+
 	if (static_key_true(&supports_deactivate)) {
 		/* EOI drops priority only (mode 1) */
 		gic_write_ctlr(ICC_CTLR_EL1_EOImode_drop);
-- 
2.7.4


* [RFC PATCH v3 2/7] arm64: Add support for on-demand backtrace of other CPUs
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 3/7] arm64: cpufeature: Allow early detect of specific features Daniel Thompson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Currently arm64 has no implementation of arch_trigger_all_cpu_backtrace.
This patch provides one, using library code recently added by Russell
King for the majority of the implementation. Currently this is realized
using regular irqs but could, in the future, be implemented using
NMI-like mechanisms.

Note: There is a small (and nasty) change to the generic code to ensure
      good stack traces. The generic code currently assumes that
      show_regs() will include a stack trace but arch/arm64 does not do
      this, so we must add extra code here. Ideas on a better approach
      would be very welcome (is there any appetite to change arm64
      show_regs() or should we just tease out the dump code into a
      callback?).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm64/include/asm/hardirq.h |  2 +-
 arch/arm64/include/asm/irq.h     |  3 +++
 arch/arm64/kernel/smp.c          | 30 +++++++++++++++++++++++++++++-
 lib/nmi_backtrace.c              |  9 +++++++--
 4 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index 8740297dac77..1473fc2f7ab7 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -20,7 +20,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
-#define NR_IPI	6
+#define NR_IPI	7
 
 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index b77197d941fc..67dc130ae517 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -56,5 +56,8 @@ static inline bool on_irq_stack(unsigned long sp, int cpu)
 	return (low <= sp && sp <= high);
 }
 
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+
 #endif /* !__ASSEMBLER__ */
 #endif
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d93d43352504..99f607f0fa97 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -37,6 +37,7 @@
 #include <linux/completion.h>
 #include <linux/of.h>
 #include <linux/irq_work.h>
+#include <linux/nmi.h>
 
 #include <asm/alternative.h>
 #include <asm/atomic.h>
@@ -73,7 +74,8 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_TIMER,
 	IPI_IRQ_WORK,
-	IPI_WAKEUP
+	IPI_WAKEUP,
+	IPI_CPU_BACKTRACE,
 };
 
 #ifdef CONFIG_ARM64_VHE
@@ -737,6 +739,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_TIMER, "Timer broadcast interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_WAKEUP, "CPU wake-up interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -862,6 +865,14 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		break;
 #endif
 
+	case IPI_CPU_BACKTRACE:
+		printk_nmi_enter();
+		irq_enter();
+		nmi_cpu_backtrace(regs);
+		irq_exit();
+		printk_nmi_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
 		break;
@@ -935,3 +946,20 @@ bool cpus_are_stuck_in_kernel(void)
 
 	return !!cpus_stuck_in_kernel || smp_spin_tables;
 }
+
+static void raise_nmi(cpumask_t *mask)
+{
+	/*
+	 * Generate the backtrace directly if we are running in a
+	 * calling context that is not preemptible by the backtrace IPI.
+	 */
+	if (cpumask_test_cpu(smp_processor_id(), mask) && irqs_disabled())
+		nmi_cpu_backtrace(NULL);
+
+	smp_cross_call(mask, IPI_CPU_BACKTRACE);
+}
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	nmi_trigger_all_cpu_backtrace(include_self, raise_nmi);
+}
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 26caf51cc238..3dada8487477 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -78,10 +78,15 @@ bool nmi_cpu_backtrace(struct pt_regs *regs)
 
 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
 		pr_warn("NMI backtrace for cpu %d\n", cpu);
-		if (regs)
+		if (regs) {
 			show_regs(regs);
-		else
+#ifdef CONFIG_ARM64
+			show_stack(NULL, NULL);
+#endif
+		} else {
 			dump_stack();
+		}
+
 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
 		return true;
 	}
-- 
2.7.4


* [RFC PATCH v3 3/7] arm64: cpufeature: Allow early detect of specific features
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 2/7] arm64: Add support for on-demand backtrace of other CPUs Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 4/7] arm64: alternative: Apply alternatives early in boot process Daniel Thompson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Currently it is not possible to detect features of the boot CPU
until the other CPUs have been brought up.

This prevents us from reacting to features of the boot CPU until
fairly late in the boot process. To solve this we allow a subset
of features (that are likely to be common to all clusters) to be
detected based on the boot CPU alone.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm64/kernel/cpufeature.c | 92 ++++++++++++++++++++++++------------------
 1 file changed, 53 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 62272eac1352..e5af761c80b8 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -423,45 +423,6 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
 	reg->strict_mask = strict_mask;
 }
 
-void __init init_cpu_features(struct cpuinfo_arm64 *info)
-{
-	/* Before we start using the tables, make sure it is sorted */
-	sort_ftr_regs();
-
-	init_cpu_ftr_reg(SYS_CTR_EL0, info->reg_ctr);
-	init_cpu_ftr_reg(SYS_DCZID_EL0, info->reg_dczid);
-	init_cpu_ftr_reg(SYS_CNTFRQ_EL0, info->reg_cntfrq);
-	init_cpu_ftr_reg(SYS_ID_AA64DFR0_EL1, info->reg_id_aa64dfr0);
-	init_cpu_ftr_reg(SYS_ID_AA64DFR1_EL1, info->reg_id_aa64dfr1);
-	init_cpu_ftr_reg(SYS_ID_AA64ISAR0_EL1, info->reg_id_aa64isar0);
-	init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1);
-	init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0);
-	init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1);
-	init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
-	init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
-	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
-
-	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
-		init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
-		init_cpu_ftr_reg(SYS_ID_ISAR0_EL1, info->reg_id_isar0);
-		init_cpu_ftr_reg(SYS_ID_ISAR1_EL1, info->reg_id_isar1);
-		init_cpu_ftr_reg(SYS_ID_ISAR2_EL1, info->reg_id_isar2);
-		init_cpu_ftr_reg(SYS_ID_ISAR3_EL1, info->reg_id_isar3);
-		init_cpu_ftr_reg(SYS_ID_ISAR4_EL1, info->reg_id_isar4);
-		init_cpu_ftr_reg(SYS_ID_ISAR5_EL1, info->reg_id_isar5);
-		init_cpu_ftr_reg(SYS_ID_MMFR0_EL1, info->reg_id_mmfr0);
-		init_cpu_ftr_reg(SYS_ID_MMFR1_EL1, info->reg_id_mmfr1);
-		init_cpu_ftr_reg(SYS_ID_MMFR2_EL1, info->reg_id_mmfr2);
-		init_cpu_ftr_reg(SYS_ID_MMFR3_EL1, info->reg_id_mmfr3);
-		init_cpu_ftr_reg(SYS_ID_PFR0_EL1, info->reg_id_pfr0);
-		init_cpu_ftr_reg(SYS_ID_PFR1_EL1, info->reg_id_pfr1);
-		init_cpu_ftr_reg(SYS_MVFR0_EL1, info->reg_mvfr0);
-		init_cpu_ftr_reg(SYS_MVFR1_EL1, info->reg_mvfr1);
-		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
-	}
-
-}
-
 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
 {
 	struct arm64_ftr_bits *ftrp;
@@ -739,6 +700,18 @@ static bool hyp_offset_low(const struct arm64_cpu_capabilities *entry,
 	return idmap_addr > GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
 }
 
+static const struct arm64_cpu_capabilities arm64_early_features[] = {
+	{
+		.desc = "GIC system register CPU interface",
+		.capability = ARM64_HAS_SYSREG_GIC_CPUIF,
+		.matches = has_useable_gicv3_cpuif,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.field_pos = ID_AA64PFR0_GIC_SHIFT,
+		.min_field_value = 1,
+	},
+	{}
+};
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -1023,6 +996,47 @@ void verify_local_cpu_capabilities(void)
 		verify_local_elf_hwcaps(compat_elf_hwcaps);
 }
 
+void __init init_cpu_features(struct cpuinfo_arm64 *info)
+{
+	/* Before we start using the tables, make sure it is sorted */
+	sort_ftr_regs();
+
+	init_cpu_ftr_reg(SYS_CTR_EL0, info->reg_ctr);
+	init_cpu_ftr_reg(SYS_DCZID_EL0, info->reg_dczid);
+	init_cpu_ftr_reg(SYS_CNTFRQ_EL0, info->reg_cntfrq);
+	init_cpu_ftr_reg(SYS_ID_AA64DFR0_EL1, info->reg_id_aa64dfr0);
+	init_cpu_ftr_reg(SYS_ID_AA64DFR1_EL1, info->reg_id_aa64dfr1);
+	init_cpu_ftr_reg(SYS_ID_AA64ISAR0_EL1, info->reg_id_aa64isar0);
+	init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1);
+	init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0);
+	init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1);
+	init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
+	init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
+	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
+
+	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
+		init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
+		init_cpu_ftr_reg(SYS_ID_ISAR0_EL1, info->reg_id_isar0);
+		init_cpu_ftr_reg(SYS_ID_ISAR1_EL1, info->reg_id_isar1);
+		init_cpu_ftr_reg(SYS_ID_ISAR2_EL1, info->reg_id_isar2);
+		init_cpu_ftr_reg(SYS_ID_ISAR3_EL1, info->reg_id_isar3);
+		init_cpu_ftr_reg(SYS_ID_ISAR4_EL1, info->reg_id_isar4);
+		init_cpu_ftr_reg(SYS_ID_ISAR5_EL1, info->reg_id_isar5);
+		init_cpu_ftr_reg(SYS_ID_MMFR0_EL1, info->reg_id_mmfr0);
+		init_cpu_ftr_reg(SYS_ID_MMFR1_EL1, info->reg_id_mmfr1);
+		init_cpu_ftr_reg(SYS_ID_MMFR2_EL1, info->reg_id_mmfr2);
+		init_cpu_ftr_reg(SYS_ID_MMFR3_EL1, info->reg_id_mmfr3);
+		init_cpu_ftr_reg(SYS_ID_PFR0_EL1, info->reg_id_pfr0);
+		init_cpu_ftr_reg(SYS_ID_PFR1_EL1, info->reg_id_pfr1);
+		init_cpu_ftr_reg(SYS_MVFR0_EL1, info->reg_mvfr0);
+		init_cpu_ftr_reg(SYS_MVFR1_EL1, info->reg_mvfr1);
+		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
+	}
+
+	update_cpu_capabilities(arm64_early_features,
+				"early detected feature:");
+}
+
 static void __init setup_feature_capabilities(void)
 {
 	update_cpu_capabilities(arm64_features, "detected feature:");
-- 
2.7.4


* [RFC PATCH v3 4/7] arm64: alternative: Apply alternatives early in boot process
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
                   ` (2 preceding siblings ...)
  2016-08-19 16:13 ` [RFC PATCH v3 3/7] arm64: cpufeature: Allow early detect of specific features Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 5/7] arm64: irqflags: Reorder the fiq & async macros Daniel Thompson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Currently alternatives are applied very late in the boot process (and
a long time after we enable scheduling). Some alternative sequences,
such as those that alter the way CPU context is stored, must be applied
much earlier in the boot sequence.

Introduce apply_alternatives_early() to allow some alternatives to be
applied immediately after we detect the CPU features of the boot CPU.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm64/include/asm/alternative.h |  1 +
 arch/arm64/kernel/alternative.c      | 36 +++++++++++++++++++++++++++++++++---
 arch/arm64/kernel/smp.c              |  7 +++++++
 3 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 8746ff6abd77..2eee073668eb 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -19,6 +19,7 @@ struct alt_instr {
 	u8  alt_len;		/* size of new instruction(s), <= orig_len */
 };
 
+void __init apply_alternatives_early(void);
 void __init apply_alternatives_all(void);
 void apply_alternatives(void *start, size_t length);
 
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index d2ee1b21a10d..9c623b7f69f8 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -27,6 +27,18 @@
 #include <asm/insn.h>
 #include <linux/stop_machine.h>
 
+/*
+ * early-apply features can be detected using only the boot CPU (i.e.
+ * no need to check capability of any secondary CPUs) and, even then,
+ * should only include features where we must patch the kernel very
+ * early in the boot process.
+ *
+ * Note that the cpufeature logic *must* be made aware of early-apply
+ * features to ensure they are reported as enabled without waiting
+ * for other CPUs to boot.
+ */
+#define EARLY_APPLY_FEATURE_MASK BIT(ARM64_HAS_SYSREG_GIC_CPUIF)
+
 #define __ALT_PTR(a,f)		(u32 *)((void *)&(a)->f + (a)->f)
 #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
@@ -85,7 +97,7 @@ static u32 get_alt_insn(struct alt_instr *alt, u32 *insnptr, u32 *altinsnptr)
 	return insn;
 }
 
-static void __apply_alternatives(void *alt_region)
+static void __apply_alternatives(void *alt_region, unsigned long feature_mask)
 {
 	struct alt_instr *alt;
 	struct alt_region *region = alt_region;
@@ -95,6 +107,9 @@ static void __apply_alternatives(void *alt_region)
 		u32 insn;
 		int i, nr_inst;
 
+		if ((BIT(alt->cpufeature) & feature_mask) == 0)
+			continue;
+
 		if (!cpus_have_cap(alt->cpufeature))
 			continue;
 
@@ -117,6 +132,21 @@ static void __apply_alternatives(void *alt_region)
 }
 
 /*
+ * This is called very early in the boot process (directly after we run
+ * a feature detect on the boot CPU). No need to worry about other CPUs
+ * here.
+ */
+void apply_alternatives_early(void)
+{
+	struct alt_region region = {
+		.begin	= __alt_instructions,
+		.end	= __alt_instructions_end,
+	};
+
+	__apply_alternatives(&region, EARLY_APPLY_FEATURE_MASK);
+}
+
+/*
  * We might be patching the stop_machine state machine, so implement a
  * really simple polling protocol here.
  */
@@ -135,7 +165,7 @@ static int __apply_alternatives_multi_stop(void *unused)
 		isb();
 	} else {
 		BUG_ON(patched);
-		__apply_alternatives(&region);
+		__apply_alternatives(&region, ~EARLY_APPLY_FEATURE_MASK);
 		/* Barriers provided by the cache flushing */
 		WRITE_ONCE(patched, 1);
 	}
@@ -156,5 +186,5 @@ void apply_alternatives(void *start, size_t length)
 		.end	= start + length,
 	};
 
-	__apply_alternatives(&region);
+	__apply_alternatives(&region, -1);
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 99f607f0fa97..c49e8874fba8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -441,6 +441,13 @@ void __init smp_prepare_boot_cpu(void)
 	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 	cpuinfo_store_boot_cpu();
 	save_boot_cpu_run_el();
+
+	/*
+	 * We now know enough about the boot CPU to apply the
+	 * alternatives that cannot wait until interrupt handling
+	 * and/or scheduling is enabled.
+	 */
+	apply_alternatives_early();
 }
 
 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
-- 
2.7.4


* [RFC PATCH v3 5/7] arm64: irqflags: Reorder the fiq & async macros
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
                   ` (3 preceding siblings ...)
  2016-08-19 16:13 ` [RFC PATCH v3 4/7] arm64: alternative: Apply alternatives early in boot process Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Daniel Thompson
  2016-08-19 16:13 ` [RFC PATCH v3 7/7] arm64: Implement IPI_CPU_BACKTRACE using pseudo-NMIs Daniel Thompson
  6 siblings, 0 replies; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Separate out the local fiq & async macros from the various arch inlines.
This makes it easier for us (in a later patch) to provide an alternative
implementation of these inlines.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm64/include/asm/irqflags.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 8c581281fa12..f367cbd92ff0 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -53,12 +53,6 @@ static inline void arch_local_irq_disable(void)
 		: "memory");
 }
 
-#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
-#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
-
-#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
-#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
-
 /*
  * Save the current interrupt enable state.
  */
@@ -90,6 +84,12 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 	return flags & PSR_I_BIT;
 }
 
+#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
+#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
+
+#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
+#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
+
 /*
  * save and restore debug state
  */
-- 
2.7.4


* [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
                   ` (4 preceding siblings ...)
  2016-08-19 16:13 ` [RFC PATCH v3 5/7] arm64: irqflags: Reorder the fiq & async macros Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  2016-08-22 10:12   ` Marc Zyngier
  2016-08-19 16:13 ` [RFC PATCH v3 7/7] arm64: Implement IPI_CPU_BACKTRACE using pseudo-NMIs Daniel Thompson
  6 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Currently irqflags is implemented using the PSR's I bit. It is possible
to implement irqflags by using the co-processor interface to the GIC.
Using the co-processor interface makes it feasible to simulate NMIs
using GIC interrupt prioritization.

This patch changes the irqflags macros to modify, save and restore
ICC_PMR_EL1. This has a substantial knock-on effect for the rest of
the kernel. There are four reasons for this:

1. The state of the ICC_PMR_EL1_G_BIT becomes part of the CPU context
   and must be saved and restored during traps. To simplify the
   additional context management the ICC_PMR_EL1_G_BIT is converted
   into a fake (reserved) bit within the PSR (PSR_G_BIT). Naturally
   this approach will need to be changed if future ARM architecture
   extensions make use of this bit.

2. The hardware automatically masks the I bit (at boot, during traps, etc).
   When the I bit is set by hardware we must add code to switch from I
   bit masking to PMR masking.

3. Some instructions, notably wfi, require that the PMR not be used
   for interrupt masking. Before calling these instructions we must
   switch from PMR masking to I bit masking.

4. We use the alternatives system to allow a single kernel to boot and
   switch between the different masking approaches.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm64/Kconfig                  |  15 +++++
 arch/arm64/include/asm/arch_gicv3.h |  51 +++++++++++++++++
 arch/arm64/include/asm/assembler.h  |  32 ++++++++++-
 arch/arm64/include/asm/irqflags.h   | 106 ++++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/ptrace.h     |  23 ++++++++
 arch/arm64/kernel/entry.S           |  77 ++++++++++++++++++++++----
 arch/arm64/kernel/head.S            |  35 ++++++++++++
 arch/arm64/kernel/smp.c             |   6 ++
 arch/arm64/mm/proc.S                |  23 ++++++++
 drivers/irqchip/irq-gic-v3.c        |   2 +
 include/linux/irqchip/arm-gic.h     |   2 +-
 11 files changed, 356 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index bc3f00f586f1..56846724d2af 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -718,6 +718,21 @@ config FORCE_MAX_ZONEORDER
 	  However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
 	  4M allocations matching the default size used by generic code.
 
+config USE_ICC_SYSREGS_FOR_IRQFLAGS
+	bool "Use ICC system registers for IRQ masking"
+	select ARM_GIC_V3
+	help
+	  Using the ICC system registers for IRQ masking makes it possible
+	  to simulate NMI on ARM64 systems. This allows several interesting
+	  features (especially debug features) to be used on these systems.
+
+	  Say Y here to implement IRQ masking using ICC system
+	  registers. This will result in an unbootable kernel if these
+	  registers are not implemented or made inaccessible by the
+	  EL3 firmare or EL2 hypervisor (if present).
+
+	  If unsure, say N
+
 menuconfig ARMV8_DEPRECATED
 	bool "Emulate deprecated/obsolete ARMv8 instructions"
 	depends on COMPAT
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index fc2a0cb47b2c..a4ad69e2831b 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -80,6 +80,9 @@
 #include <linux/stringify.h>
 #include <asm/barrier.h>
 
+/* Our default, arbitrary priority value. Linux only uses one anyway. */
+#define DEFAULT_PMR_VALUE	0xf0
+
 /*
  * Low-level accessors
  *
@@ -102,9 +105,35 @@ static inline void gic_write_dir(u32 irq)
 static inline u64 gic_read_iar_common(void)
 {
 	u64 irqstat;
+	u64 __maybe_unused daif;
+	u64 __maybe_unused pmr;
+	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
 
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
 	dsb(sy);
+#else
+	/*
+	 * The PMR may be configured to mask interrupts when this code is
+	 * called, thus in order to acknowledge interrupts we must set the
+	 * PMR to its default value before reading from the IAR.
+	 *
+	 * To do this without taking an interrupt we also ensure the I bit
+	 * is set whilst we are interfering with the value of the PMR.
+	 */
+	asm volatile(
+		"mrs	%1, daif\n\t"			     /* save I bit  */
+		"msr	daifset, #2\n\t"		     /* set I bit   */
+		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
+		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
+		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
+		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
+		"isb\n\t"
+		"msr	daif, %1"			     /* restore I   */
+		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
+		: "r" (default_pmr_value));
+#endif
+
 	return irqstat;
 }
 
@@ -118,7 +147,11 @@ static inline u64 gic_read_iar_common(void)
 static inline u64 gic_read_iar_cavium_thunderx(void)
 {
 	u64 irqstat;
+	u64 __maybe_unused daif;
+	u64 __maybe_unused pmr;
+	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
 
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 	asm volatile(
 		"nop;nop;nop;nop\n\t"
 		"nop;nop;nop;nop\n\t"
@@ -126,6 +159,24 @@ static inline u64 gic_read_iar_cavium_thunderx(void)
 		"nop;nop;nop;nop"
 		: "=r" (irqstat));
 	mb();
+#else
+	/* Refer to comment in gic_read_iar_common() for more details */
+	asm volatile(
+		"mrs	%1, daif\n\t"			     /* save I bit  */
+		"msr	daifset, #2\n\t"		     /* set I bit   */
+		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
+		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
+		"nop;nop;nop;nop\n\t"
+		"nop;nop;nop;nop\n\t"
+		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
+		"nop;nop;nop;nop\n\t"
+		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
+		"isb\n\t"
+		"msr	daif, %1"			     /* restore I   */
+		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
+		: "r" (default_pmr_value));
+
+#endif
 
 	return irqstat;
 }
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d5025c69ca81..0adea5807ef0 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/alternative.h>
+#include <asm/arch_gicv3.h>
 #include <asm/asm-offsets.h>
 #include <asm/cpufeature.h>
 #include <asm/page.h>
@@ -33,12 +35,30 @@
 /*
  * Enable and disable interrupts.
  */
-	.macro	disable_irq
+	.macro	disable_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	mov	\tmp, #ICC_PMR_EL1_MASKED
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
 	msr	daifset, #2
+alternative_else
+	msr_s	ICC_PMR_EL1, \tmp
+alternative_endif
+#else
+	msr	daifset, #2
+#endif
 	.endm
 
-	.macro	enable_irq
+	.macro	enable_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	mov     \tmp, #ICC_PMR_EL1_UNMASKED
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
 	msr	daifclr, #2
+alternative_else
+	msr_s	ICC_PMR_EL1, \tmp
+alternative_endif
+#else
+	msr	daifclr, #2
+#endif
 	.endm
 
 /*
@@ -70,13 +90,19 @@
 9990:
 	.endm
 
+
 /*
  * Enable both debug exceptions and interrupts. This is likely to be
  * faster than two daifclr operations, since writes to this register
  * are self-synchronising.
  */
-	.macro	enable_dbg_and_irq
+	.macro	enable_dbg_and_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	enable_dbg
+	enable_irq \tmp
+#else
 	msr	daifclr, #(8 | 2)
+#endif
 	.endm
 
 /*
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index f367cbd92ff0..b14ee24bc3a7 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -18,8 +18,13 @@
 
 #ifdef __KERNEL__
 
+#include <asm/alternative.h>
+#include <asm/arch_gicv3.h>
+#include <asm/cpufeature.h>
 #include <asm/ptrace.h>
 
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+
 /*
  * CPU interrupt mask handling.
  */
@@ -84,6 +89,107 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 	return flags & PSR_I_BIT;
 }
 
+static inline void maybe_switch_to_sysreg_gic_cpuif(void) {}
+
+#else /* CONFIG_IRQFLAGS_GIC_MASKING */
+
+/*
+ * CPU interrupt mask handling.
+ */
+static inline unsigned long arch_local_irq_save(void)
+{
+	unsigned long flags, masked = ICC_PMR_EL1_MASKED;
+
+	asm volatile(ALTERNATIVE(
+		"mrs	%0, daif		// arch_local_irq_save\n"
+		"msr	daifset, #2",
+		/* --- */
+		"mrs_s  %0, " __stringify(ICC_PMR_EL1) "\n"
+		"msr_s	" __stringify(ICC_PMR_EL1) ",%1",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		: "=&r" (flags)
+		: "r" (masked)
+		: "memory");
+
+	return flags;
+}
+
+static inline void arch_local_irq_enable(void)
+{
+	unsigned long unmasked = ICC_PMR_EL1_UNMASKED;
+
+	asm volatile(ALTERNATIVE(
+		"msr	daifclr, #2		// arch_local_irq_enable",
+		"msr_s  " __stringify(ICC_PMR_EL1) ",%0",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		:
+		: "r" (unmasked)
+		: "memory");
+}
+
+static inline void arch_local_irq_disable(void)
+{
+	unsigned long masked = ICC_PMR_EL1_MASKED;
+
+	asm volatile(ALTERNATIVE(
+		"msr	daifset, #2		// arch_local_irq_disable",
+		"msr_s  " __stringify(ICC_PMR_EL1) ",%0",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		:
+		: "r" (masked)
+		: "memory");
+}
+
+/*
+ * Save the current interrupt enable state.
+ */
+static inline unsigned long arch_local_save_flags(void)
+{
+	unsigned long flags;
+
+	asm volatile(ALTERNATIVE(
+		"mrs	%0, daif		// arch_local_save_flags",
+		"mrs_s  %0, " __stringify(ICC_PMR_EL1),
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		: "=r" (flags)
+		:
+		: "memory");
+
+	return flags;
+}
+
+/*
+ * restore saved IRQ state
+ */
+static inline void arch_local_irq_restore(unsigned long flags)
+{
+	asm volatile(ALTERNATIVE(
+		"msr	daif, %0		// arch_local_irq_restore",
+		"msr_s  " __stringify(ICC_PMR_EL1) ",%0",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+	:
+	: "r" (flags)
+	: "memory");
+}
+
+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+	asm volatile(ALTERNATIVE(
+		"and	%0, %0, #" __stringify(PSR_I_BIT) "\n"
+		"nop",
+		/* --- */
+		"and	%0, %0, # " __stringify(ICC_PMR_EL1_G_BIT) "\n"
+		"eor	%0, %0, # " __stringify(ICC_PMR_EL1_G_BIT),
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		: "+r" (flags));
+
+	return flags;
+}
+
+void maybe_switch_to_sysreg_gic_cpuif(void);
+
+#endif /* CONFIG_IRQFLAGS_GIC_MASKING */
+
 #define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
 #define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
 
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index ada08b5b036d..24154d6228d7 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -25,6 +25,24 @@
 #define CurrentEL_EL1		(1 << 2)
 #define CurrentEL_EL2		(2 << 2)
 
+/* PMR values used to mask/unmask interrupts */
+#define ICC_PMR_EL1_G_SHIFT     6
+#define ICC_PMR_EL1_G_BIT       (1 << ICC_PMR_EL1_G_SHIFT)
+#define ICC_PMR_EL1_UNMASKED    0xf0
+#define ICC_PMR_EL1_MASKED      (ICC_PMR_EL1_UNMASKED ^ ICC_PMR_EL1_G_BIT)
+
+/*
+ * This is the GIC interrupt mask bit. It is not actually part of the
+ * PSR and so does not appear in the user API; we are simply using some
+ * reserved bits in the PSR to store some state from the interrupt
+ * controller. The context save/restore functions will extract the
+ * ICC_PMR_EL1_G_BIT and save it as the PSR_G_BIT.
+ */
+#define PSR_G_BIT		0x00400000
+#define PSR_G_SHIFT             22
+#define PSR_G_PMR_G_SHIFT	(PSR_G_SHIFT - ICC_PMR_EL1_G_SHIFT)
+#define PSR_I_PMR_G_SHIFT	(7 - ICC_PMR_EL1_G_SHIFT)
+
 /* AArch32-specific ptrace requests */
 #define COMPAT_PTRACE_GETREGS		12
 #define COMPAT_PTRACE_SETREGS		13
@@ -142,8 +160,13 @@ struct pt_regs {
 #define processor_mode(regs) \
 	((regs)->pstate & PSR_MODE_MASK)
 
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 #define interrupts_enabled(regs) \
 	(!((regs)->pstate & PSR_I_BIT))
+#else
+#define interrupts_enabled(regs) \
+	((!((regs)->pstate & PSR_I_BIT)) && (!((regs)->pstate & PSR_G_BIT)))
+#endif
 
 #define fast_interrupts_enabled(regs) \
 	(!((regs)->pstate & PSR_F_BIT))
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 441420ca7d08..1712b344cbf3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -20,6 +20,7 @@
 
 #include <linux/init.h>
 #include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/alternative.h>
 #include <asm/assembler.h>
@@ -108,6 +109,26 @@
 	.endif /* \el == 0 */
 	mrs	x22, elr_el1
 	mrs	x23, spsr_el1
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/*
+	 * Save the context held in the PMR register and copy the current
+	 * I bit state to the PMR. The I bit is re-enabled later by
+	 * code that knows what type of trap we are handling.
+	 */
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	b	1f
+alternative_else
+	mrs_s	x20, ICC_PMR_EL1		// Get PMR
+alternative_endif
+	and	x20, x20, #ICC_PMR_EL1_G_BIT	// Extract mask bit
+	lsl	x20, x20, #PSR_G_PMR_G_SHIFT	// Shift to a PSTATE RES0 bit
+	eor	x20, x20, #PSR_G_BIT		// Invert bit
+	orr	x23, x20, x23			// Store PMR within PSTATE
+	mov	x20, #ICC_PMR_EL1_MASKED
+	msr_s	ICC_PMR_EL1, x20		// Mask normal interrupts at PMR
+1:
+#endif
+
 	stp	lr, x21, [sp, #S_LR]
 	stp	x22, x23, [sp, #S_PC]
 
@@ -168,6 +189,22 @@ alternative_else
 alternative_endif
 #endif
 	.endif
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/*
+	 * Restore the context to the PMR (and ensure the reserved bit is
+	 * restored to zero before being copied back to the PSR).
+	 */
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	b	1f
+alternative_else
+	and	x20, x22, #PSR_G_BIT		// Get stolen PSTATE bit
+alternative_endif
+	and	x22, x22, #~PSR_G_BIT		// Clear stolen bit
+	lsr	x20, x20, #PSR_G_PMR_G_SHIFT	// Shift back to PMR mask
+	eor	x20, x20, #ICC_PMR_EL1_UNMASKED	// x20 gets 0xf0 or 0xb0
+	msr_s	ICC_PMR_EL1, x20		// Write to PMR
+1:
+#endif
 	msr	elr_el1, x21			// set up the return data
 	msr	spsr_el1, x22
 	ldp	x0, x1, [sp, #16 * 0]
@@ -378,14 +415,30 @@ el1_da:
 	mrs	x0, far_el1
 	enable_dbg
 	// re-enable interrupts if they were enabled in the aborted context
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
 	tbnz	x23, #7, 1f			// PSR_I_BIT
-	enable_irq
+	nop
+	nop
+	msr     daifclr, #2
+1:
+alternative_else
+	tbnz	x23, #PSR_G_SHIFT, 1f		// PSR_G_BIT
+	mov     x2, #ICC_PMR_EL1_UNMASKED
+	msr_s   ICC_PMR_EL1, x2
+	msr     daifclr, #2
 1:
+alternative_endif
+#else
+	tbnz	x23, #7, 1f			// PSR_I_BIT
+	enable_irq x2
+1:
+#endif
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort
 
 	// disable interrupts before pulling preserved data off the stack
-	disable_irq
+	disable_irq x21
 	kernel_exit 1
 el1_sp_pc:
 	/*
@@ -539,7 +592,7 @@ el0_da:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	bic	x0, x26, #(0xff << 56)
 	mov	x1, x25
@@ -552,7 +605,7 @@ el0_ia:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
@@ -585,7 +638,7 @@ el0_sp_pc:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
@@ -597,7 +650,7 @@ el0_undef:
 	 * Undefined instruction
 	 */
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, sp
 	bl	do_undefinstr
@@ -606,7 +659,7 @@ el0_sys:
 	/*
 	 * System instructions, for trapped cache maintenance instructions
 	 */
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
@@ -690,7 +743,7 @@ ENDPROC(cpu_switch_to)
  * and this includes saving x0 back into the kernel stack.
  */
 ret_fast_syscall:
-	disable_irq				// disable interrupts
+	disable_irq x21				// disable interrupts
 	str	x0, [sp, #S_X0]			// returned x0
 	ldr	x1, [tsk, #TI_FLAGS]		// re-check for syscall tracing
 	and	x2, x1, #_TIF_SYSCALL_WORK
@@ -700,7 +753,7 @@ ret_fast_syscall:
 	enable_step_tsk x1, x2
 	kernel_exit 0
 ret_fast_syscall_trace:
-	enable_irq				// enable interrupts
+	enable_irq x0				// enable interrupts
 	b	__sys_trace_return_skipped	// we already saved x0
 
 /*
@@ -710,7 +763,7 @@ work_pending:
 	tbnz	x1, #TIF_NEED_RESCHED, work_resched
 	/* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
 	mov	x0, sp				// 'regs'
-	enable_irq				// enable interrupts for do_notify_resume()
+	enable_irq x21				// enable interrupts for do_notify_resume()
 	bl	do_notify_resume
 	b	ret_to_user
 work_resched:
@@ -723,7 +776,7 @@ work_resched:
  * "slow" syscall return path.
  */
 ret_to_user:
-	disable_irq				// disable interrupts
+	disable_irq x21				// disable interrupts
 	ldr	x1, [tsk, #TI_FLAGS]
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
@@ -753,7 +806,7 @@ el0_svc:
 	mov	sc_nr, #__NR_syscalls
 el0_svc_naked:					// compat entry point
 	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
-	enable_dbg_and_irq
+	enable_dbg_and_irq x16
 	ct_user_exit 1
 
 	ldr	x16, [tsk, #TI_FLAGS]		// check for syscall hooks
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b77f58355da1..27f04f03a179 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -620,6 +620,38 @@ set_cpu_boot_mode_flag:
 	ret
 ENDPROC(set_cpu_boot_mode_flag)
 
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * void maybe_switch_to_sysreg_gic_cpuif(void)
+ *
+ * Enable interrupt controller system register access if this feature
+ * has been detected by the alternatives system.
+ *
+ * Before we jump into generic code we must enable interrupt controller system
+ * register access because this is required by the irqflags macros.  We must
+ * also mask interrupts at the PMR and unmask them within the PSR. That leaves
+ * us set up and ready for the kernel to make its first call to
+ * arch_local_irq_enable().
+ */
+ENTRY(maybe_switch_to_sysreg_gic_cpuif)
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	b	1f
+alternative_else
+	mrs_s	x0, ICC_SRE_EL1
+alternative_endif
+	orr	x0, x0, #1
+	msr_s	ICC_SRE_EL1, x0			// Set ICC_SRE_EL1.SRE==1
+	isb					// Make sure SRE is now set
+	mov	x0, ICC_PMR_EL1_MASKED
+	msr_s	ICC_PMR_EL1, x0			// Prepare for unmask of I bit
+	msr	daifclr, #2			// Clear the I bit
+1:
+	ret
+ENDPROC(maybe_switch_to_sysreg_gic_cpuif)
+#endif
+
 /*
  * We need to find out the CPU boot mode long after boot, so we need to
  * store it in a writable variable.
@@ -684,6 +716,9 @@ __secondary_switched:
 	mov	sp, x0
 	and	x0, x0, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x0			// save thread_info
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	bl	maybe_switch_to_sysreg_gic_cpuif
+#endif
 	mov	x29, #0
 	b	secondary_start_kernel
 ENDPROC(__secondary_switched)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index c49e8874fba8..2c118b2db7b4 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -448,6 +448,12 @@ void __init smp_prepare_boot_cpu(void)
 	 * and/or scheduling is enabled.
 	 */
 	apply_alternatives_early();
+
+	/*
+	 * Conditionally switch to GIC PMR for interrupt masking (this
+	 * will be a nop if we are using normal interrupt masking)
+	 */
+	maybe_switch_to_sysreg_gic_cpuif();
 }
 
 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 5bb61de23201..dba725eece86 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -20,6 +20,7 @@
 
 #include <linux/init.h>
 #include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
@@ -47,11 +48,33 @@
  *	cpu_do_idle()
  *
  *	Idle the processor (wait for interrupt).
+ *
+ *	If CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS is set we must do additional
+ *	work to ensure that interrupts are not masked at the PMR (because the
+ *	core will not wake up if we block the wake-up signal in the interrupt
+ *	controller).
  */
 ENTRY(cpu_do_idle)
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+#endif
+	dsb	sy				// WFI may enter a low-power mode
+	wfi
+	ret
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_else
+	mrs	x0, daif			// save I bit
+	msr	daifset, #2			// set I bit
+	mrs_s	x1, ICC_PMR_EL1			// save PMR
+alternative_endif
+	mov	x2, #ICC_PMR_EL1_UNMASKED
+	msr_s	ICC_PMR_EL1, x2			// unmask at PMR
 	dsb	sy				// WFI may enter a low-power mode
 	wfi
+	msr_s	ICC_PMR_EL1, x1			// restore PMR
+	msr	daif, x0			// restore I bit
 	ret
+#endif
 ENDPROC(cpu_do_idle)
 
 #ifdef CONFIG_CPU_PM
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index fedcdd09b9b2..14b2fb5e81ef 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -492,8 +492,10 @@ static void gic_cpu_sys_reg_init(void)
 	if (!gic_enable_sre())
 		pr_err("GIC: unable to set SRE (disabled at EL2), panic ahead\n");
 
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 	/* Set priority mask register */
 	gic_write_pmr(DEFAULT_PMR_VALUE);
+#endif
 
 	/*
 	 * Some firmwares hand over to the kernel with the BPR changed from
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index eafc965b3eb8..ba9ed0a2ac09 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -54,7 +54,7 @@
 #define GICD_INT_EN_CLR_X32		0xffffffff
 #define GICD_INT_EN_SET_SGI		0x0000ffff
 #define GICD_INT_EN_CLR_PPI		0xffff0000
-#define GICD_INT_DEF_PRI		0xa0
+#define GICD_INT_DEF_PRI		0xc0
 #define GICD_INT_DEF_PRI_X4		((GICD_INT_DEF_PRI << 24) |\
 					(GICD_INT_DEF_PRI << 16) |\
 					(GICD_INT_DEF_PRI << 8) |\
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 7/7] arm64: Implement IPI_CPU_BACKTRACE using pseudo-NMIs
  2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
                   ` (5 preceding siblings ...)
  2016-08-19 16:13 ` [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Daniel Thompson
@ 2016-08-19 16:13 ` Daniel Thompson
  6 siblings, 0 replies; 14+ messages in thread
From: Daniel Thompson @ 2016-08-19 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

Recently arm64 gained the capability to (optionally) mask interrupts
using the GIC PMR rather than the CPU PSR. That allows us to introduce
an NMI-like means to handle backtrace requests.

This provides a useful debug aid by allowing the kernel to robustly show
a backtrace for every processor in the system when, for example, we hang
trying to acquire a spin lock.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm64/Kconfig                     |  1 +
 arch/arm64/include/asm/assembler.h     | 23 ++++++++++
 arch/arm64/include/asm/smp.h           |  2 +
 arch/arm64/kernel/entry.S              | 78 +++++++++++++++++++++++++++-------
 arch/arm64/kernel/smp.c                | 27 +++++++++++-
 drivers/irqchip/irq-gic-v3.c           | 62 +++++++++++++++++++++++++++
 include/linux/irqchip/arm-gic-common.h |  8 ++++
 include/linux/irqchip/arm-gic.h        |  5 ---
 8 files changed, 185 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 56846724d2af..7421941b1036 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -86,6 +86,7 @@ config ARM64
 	select HAVE_IRQ_TIME_ACCOUNTING
 	select HAVE_MEMBLOCK
 	select HAVE_MEMBLOCK_NODE_MAP if NUMA
+	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_REGS
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0adea5807ef0..840f2f01c60a 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -33,6 +33,29 @@
 #include <asm/thread_info.h>
 
 /*
+ * Enable and disable pseudo NMI.
+ */
+	.macro disable_nmi
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	nop
+alternative_else
+	msr	daifset, #2
+alternative_endif
+#endif
+	.endm
+
+	.macro enable_nmi
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	nop
+alternative_else
+	msr	daifclr, #2
+alternative_endif
+#endif
+	.endm
+
+/*
  * Enable and disable interrupts.
  */
 	.macro	disable_irq, tmp
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 022644704a93..75e0be7ad306 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -33,6 +33,8 @@
 #include <linux/cpumask.h>
 #include <linux/thread_info.h>
 
+#define SMP_IPI_NMI_MASK (1 << 6)
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 1712b344cbf3..8f51e23cb05d 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -268,6 +268,40 @@ alternative_endif
 	mov	sp, x19
 	.endm
 
+	.macro	trace_hardirqs_off, pstate
+#ifdef CONFIG_TRACE_IRQFLAGS
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	bl	trace_hardirqs_off
+	nop
+alternative_else
+	tbnz	\pstate, #PSR_G_SHIFT, 1f		// PSR_G_BIT
+	bl	trace_hardirqs_off
+1:
+alternative_endif
+#else
+	bl	trace_hardirqs_off
+#endif
+#endif
+	.endm
+
+	.macro	trace_hardirqs_on, pstate
+#ifdef CONFIG_TRACE_IRQFLAGS
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	bl	trace_hardirqs_on
+	nop
+alternative_else
+	tbnz	\pstate, #PSR_G_SHIFT, 1f		// PSR_G_BIT
+	bl	trace_hardirqs_on
+1:
+alternative_endif
+#else
+	bl	trace_hardirqs_on
+#endif
+#endif
+	.endm
+
 /*
  * These are the registers used in the syscall handler, and allow us to
  * have in theory up to 7 arguments to a function - x0 to x6.
@@ -413,20 +447,19 @@ el1_da:
 	 * Data abort handling
 	 */
 	mrs	x0, far_el1
+	enable_nmi
 	enable_dbg
 	// re-enable interrupts if they were enabled in the aborted context
 #ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
 	tbnz	x23, #7, 1f			// PSR_I_BIT
 	nop
-	nop
 	msr     daifclr, #2
 1:
 alternative_else
 	tbnz	x23, #PSR_G_SHIFT, 1f		// PSR_G_BIT
 	mov     x2, #ICC_PMR_EL1_UNMASKED
 	msr_s   ICC_PMR_EL1, x2
-	msr     daifclr, #2
 1:
 alternative_endif
 #else
@@ -439,6 +472,7 @@ alternative_endif
 
 	// disable interrupts before pulling preserved data off the stack
 	disable_irq x21
+	disable_nmi
 	kernel_exit 1
 el1_sp_pc:
 	/*
@@ -479,10 +513,14 @@ ENDPROC(el1_sync)
 el1_irq:
 	kernel_entry 1
 	enable_dbg
-#ifdef CONFIG_TRACE_IRQFLAGS
-	bl	trace_hardirqs_off
-#endif
+	trace_hardirqs_off x23
 
+	/*
+	 * On systems with CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS we do
+	 * not yet know if this IRQ is a pseudo-NMI or a normal
+	 * interrupt. For that reason we must rely on the irq_handler to
+	 * enable the NMI once the interrupt type is determined.
+	 */
 	irq_handler
 
 #ifdef CONFIG_PREEMPT
@@ -493,9 +531,9 @@ el1_irq:
 	bl	el1_preempt
 1:
 #endif
-#ifdef CONFIG_TRACE_IRQFLAGS
-	bl	trace_hardirqs_on
-#endif
+
+	disable_nmi
+	trace_hardirqs_on x23
 	kernel_exit 1
 ENDPROC(el1_irq)
 
@@ -592,6 +630,7 @@ el0_da:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
+	enable_nmi
 	enable_dbg_and_irq x0
 	ct_user_exit
 	bic	x0, x26, #(0xff << 56)
@@ -605,6 +644,7 @@ el0_ia:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
+	enable_nmi
 	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x26
@@ -638,6 +678,7 @@ el0_sp_pc:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
+	enable_nmi
 	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x26
@@ -650,6 +691,7 @@ el0_undef:
 	 * Undefined instruction
 	 */
 	// enable interrupts before calling the main handler
+	enable_nmi
 	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, sp
@@ -692,16 +734,18 @@ el0_irq:
 	kernel_entry 0
 el0_irq_naked:
 	enable_dbg
-#ifdef CONFIG_TRACE_IRQFLAGS
-	bl	trace_hardirqs_off
-#endif
-
+	trace_hardirqs_off x23
 	ct_user_exit
+
+	/*
+	 * On systems with CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS we do
+	 * not yet know if this IRQ is a pseudo-NMI or a normal
+	 * interrupt. For that reason we must rely on the irq_handler to
+	 * enable the NMI once the interrupt type is determined.
+	 */
 	irq_handler
 
-#ifdef CONFIG_TRACE_IRQFLAGS
-	bl	trace_hardirqs_on
-#endif
+	trace_hardirqs_on x23
 	b	ret_to_user
 ENDPROC(el0_irq)
 
@@ -751,6 +795,7 @@ ret_fast_syscall:
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
 	enable_step_tsk x1, x2
+	disable_nmi
 	kernel_exit 0
 ret_fast_syscall_trace:
 	enable_irq x0				// enable interrupts
@@ -763,6 +808,7 @@ work_pending:
 	tbnz	x1, #TIF_NEED_RESCHED, work_resched
 	/* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
 	mov	x0, sp				// 'regs'
+	enable_nmi
 	enable_irq x21				// enable interrupts for do_notify_resume()
 	bl	do_notify_resume
 	b	ret_to_user
@@ -781,6 +827,7 @@ ret_to_user:
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
 	enable_step_tsk x1, x2
+	disable_nmi
 	kernel_exit 0
 ENDPROC(ret_to_user)
 
@@ -806,6 +853,7 @@ el0_svc:
 	mov	sc_nr, #__NR_syscalls
 el0_svc_naked:					// compat entry point
 	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
+	enable_nmi
 	enable_dbg_and_irq x16
 	ct_user_exit 1
 
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 2c118b2db7b4..309e988b00a3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -879,6 +879,13 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 #endif
 
 	case IPI_CPU_BACKTRACE:
+		BUILD_BUG_ON(SMP_IPI_NMI_MASK != BIT(IPI_CPU_BACKTRACE));
+
+		if (in_nmi()) {
+			nmi_cpu_backtrace(regs);
+			break;
+		}
+
 		printk_nmi_enter();
 		irq_enter();
 		nmi_cpu_backtrace(regs);
@@ -960,13 +967,31 @@ bool cpus_are_stuck_in_kernel(void)
 	return !!cpus_stuck_in_kernel || smp_spin_tables;
 }
 
+/*
+ * IPI_CPU_BACKTRACE is implemented either as a normal IRQ or,
+ * if the hardware supports it, using a pseudo-NMI.
+ *
+ * The mechanism used to implement the pseudo-NMI means that in both
+ * cases testing whether the backtrace IPI is disabled requires us to
+ * check the PSR I bit. However, in the latter case we cannot use
+ * irqs_disabled() to check the I bit because, when the pseudo-NMI is
+ * active, that function examines the GIC PMR instead.
+ */
+static unsigned long nmi_disabled(void)
+{
+	unsigned long flags;
+
+	asm volatile("mrs %0, daif" : "=r"(flags) :: "memory");
+	return flags & PSR_I_BIT;
+}
+
 static void raise_nmi(cpumask_t *mask)
 {
 	/*
 	 * Generate the backtrace directly if we are running in a
 	 * calling context that is not preemptible by the backtrace IPI.
 	 */
-	if (cpumask_test_cpu(smp_processor_id(), mask) && irqs_disabled())
+	if (cpumask_test_cpu(smp_processor_id(), mask) && nmi_disabled())
 		nmi_cpu_backtrace(NULL);
 
 	smp_cross_call(mask, IPI_CPU_BACKTRACE);
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 14b2fb5e81ef..2b01bfdf0716 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -23,6 +23,7 @@
 #include <linux/delay.h>
 #include <linux/interrupt.h>
 #include <linux/irqdomain.h>
+#include <linux/nmi.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
@@ -342,10 +343,60 @@ static u64 gic_mpidr_to_affinity(unsigned long mpidr)
 	return aff;
 }
 
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+static bool gic_handle_nmi(struct pt_regs *regs)
+{
+	u64 irqnr;
+	struct pt_regs *old_regs;
+
+	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r"(irqnr));
+
+	/*
+	 * If no IRQ is acknowledged at this point then we have entered the
+	 * handler due to a normal interrupt (rather than a pseudo-NMI).
+	 * If so, unmask the I bit and return to normal handling.
+	 */
+	if (irqnr == ICC_IAR1_EL1_SPURIOUS) {
+		asm volatile("msr daifclr, #2" : : : "memory");
+		return false;
+	}
+
+	old_regs = set_irq_regs(regs);
+	nmi_enter();
+
+	do {
+		if (SMP_IPI_NMI_MASK & (1 << irqnr)) {
+			gic_write_eoir(irqnr);
+			if (static_key_true(&supports_deactivate))
+				gic_write_dir(irqnr);
+			nmi_cpu_backtrace(regs);
+		} else if (unlikely(irqnr != ICC_IAR1_EL1_SPURIOUS)) {
+			gic_write_eoir(irqnr);
+			if (static_key_true(&supports_deactivate))
+				gic_write_dir(irqnr);
+			WARN_ONCE(true, "Unexpected NMI received!\n");
+		}
+
+		asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1)
+			     : "=r"(irqnr));
+	} while (irqnr != ICC_IAR1_EL1_SPURIOUS);
+
+	nmi_exit();
+	set_irq_regs(old_regs);
+
+	return true;
+}
+#else
+static bool gic_handle_nmi(struct pt_regs *regs) { return false; }
+#endif
+
 static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
 	u32 irqnr;
 
+	if (gic_handle_nmi(regs))
+		return;
+
 	do {
 		irqnr = gic_read_iar();
 
@@ -525,6 +576,7 @@ static int gic_dist_supports_lpis(void)
 static void gic_cpu_init(void)
 {
 	void __iomem *rbase;
+	unsigned long __maybe_unused nmimask, hwirq;
 
 	/* Register ourselves with the rest of the world */
 	if (gic_populate_rdist())
@@ -545,6 +597,16 @@ static void gic_cpu_init(void)
 
 	/* initialise system registers */
 	gic_cpu_sys_reg_init();
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/* Boost the priority of any IPI in the mask */
+	nmimask = SMP_IPI_NMI_MASK;
+	for_each_set_bit(hwirq, &nmimask, 16)
+		writeb_relaxed(GICD_INT_DEF_PRI ^ BIT(7),
+			       rbase + GIC_DIST_PRI + hwirq);
+	gic_dist_wait_for_rwp();
+	gic_redist_wait_for_rwp();
+#endif
 }
 
 #ifdef CONFIG_SMP
diff --git a/include/linux/irqchip/arm-gic-common.h b/include/linux/irqchip/arm-gic-common.h
index c647b0547bcd..e25e1818f163 100644
--- a/include/linux/irqchip/arm-gic-common.h
+++ b/include/linux/irqchip/arm-gic-common.h
@@ -13,6 +13,14 @@
 #include <linux/types.h>
 #include <linux/ioport.h>
 
+#define GIC_DIST_PRI			0x400
+
+#define GICD_INT_DEF_PRI		0xc0
+#define GICD_INT_DEF_PRI_X4		((GICD_INT_DEF_PRI << 24) |\
+					(GICD_INT_DEF_PRI << 16) |\
+					(GICD_INT_DEF_PRI << 8) |\
+					GICD_INT_DEF_PRI)
+
 enum gic_type {
 	GIC_V2,
 	GIC_V3,
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index ba9ed0a2ac09..a5cc1c768e24 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -54,11 +54,6 @@
 #define GICD_INT_EN_CLR_X32		0xffffffff
 #define GICD_INT_EN_SET_SGI		0x0000ffff
 #define GICD_INT_EN_CLR_PPI		0xffff0000
-#define GICD_INT_DEF_PRI		0xc0
-#define GICD_INT_DEF_PRI_X4		((GICD_INT_DEF_PRI << 24) |\
-					(GICD_INT_DEF_PRI << 16) |\
-					(GICD_INT_DEF_PRI << 8) |\
-					GICD_INT_DEF_PRI)
 
 #define GICH_HCR			0x0
 #define GICH_VTR			0x4
-- 
2.7.4


* [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2016-08-19 16:13 ` [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Daniel Thompson
@ 2016-08-22 10:12   ` Marc Zyngier
  2016-08-22 14:24     ` Daniel Thompson
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2016-08-22 10:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Daniel,

On 19/08/16 17:13, Daniel Thompson wrote:
> Currently irqflags is implemented using the PSR's I bit. It is possible
> to implement irqflags by using the co-processor interface to the GIC.
> Using the co-processor interface makes it feasible to simulate NMIs
> using GIC interrupt prioritization.
> 
> This patch changes the irqflags macros to modify, save and restore
> ICC_PMR_EL1. This has a substantial knock-on effect for the rest of
> the kernel. There are four reasons for this:
> 
> 1. The state of the ICC_PMR_EL1_G_BIT becomes part of the CPU context

What is this G bit? I can't spot anything like this in the GICv3 spec.

>    and must be saved and restored during traps. To simplify the
>    additional context management the ICC_PMR_EL1_G_BIT is converted
>    into a fake (reserved) bit within the PSR (PSR_G_BIT). Naturally
>    this approach will need to be changed if future ARM architecture
>    extensions make use of this bit.
> 
> 2. The hardware automatically masks the I bit (at boot, during traps, etc).
>    When the I bit is set by hardware we must add code to switch from I
>    bit masking to PMR masking.
> 
> 3. Some instructions, notably wfi, require that the PMR not be used
>    for interrupt masking. Before calling these instructions we must
>    switch from PMR masking to I bit masking.
> 
> 4. We use the alternatives system to allow a single kernel to boot and
>    switch between the different masking approaches.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> ---
>  arch/arm64/Kconfig                  |  15 +++++
>  arch/arm64/include/asm/arch_gicv3.h |  51 +++++++++++++++++
>  arch/arm64/include/asm/assembler.h  |  32 ++++++++++-
>  arch/arm64/include/asm/irqflags.h   | 106 ++++++++++++++++++++++++++++++++++++
>  arch/arm64/include/asm/ptrace.h     |  23 ++++++++
>  arch/arm64/kernel/entry.S           |  77 ++++++++++++++++++++++----
>  arch/arm64/kernel/head.S            |  35 ++++++++++++
>  arch/arm64/kernel/smp.c             |   6 ++
>  arch/arm64/mm/proc.S                |  23 ++++++++
>  drivers/irqchip/irq-gic-v3.c        |   2 +
>  include/linux/irqchip/arm-gic.h     |   2 +-
>  11 files changed, 356 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index bc3f00f586f1..56846724d2af 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -718,6 +718,21 @@ config FORCE_MAX_ZONEORDER
>  	  However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
>  	  4M allocations matching the default size used by generic code.
>  
> +config USE_ICC_SYSREGS_FOR_IRQFLAGS
> +	bool "Use ICC system registers for IRQ masking"
> +	select ARM_GIC_V3
> +	help
> +	  Using the ICC system registers for IRQ masking makes it possible
> +	  to simulate NMI on ARM64 systems. This allows several interesting
> +	  features (especially debug features) to be used on these systems.
> +
> +	  Say Y here to implement IRQ masking using ICC system
> +	  registers. This will result in an unbootable kernel if these

Is that a true statement? If so, this is not acceptable. We want a
kernel that has this feature enabled to boot on anything, whether it has
GICv3 or not. It should disable itself on a GICv2 system.

> +	  registers are not implemented or made inaccessible by the
> +	  EL3 firmware or EL2 hypervisor (if present).
> +
> +	  If unsure, say N
> +
>  menuconfig ARMV8_DEPRECATED
>  	bool "Emulate deprecated/obsolete ARMv8 instructions"
>  	depends on COMPAT
> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
> index fc2a0cb47b2c..a4ad69e2831b 100644
> --- a/arch/arm64/include/asm/arch_gicv3.h
> +++ b/arch/arm64/include/asm/arch_gicv3.h
> @@ -80,6 +80,9 @@
>  #include <linux/stringify.h>
>  #include <asm/barrier.h>
>  
> +/* Our default, arbitrary priority value. Linux only uses one anyway. */
> +#define DEFAULT_PMR_VALUE	0xf0
> +
>  /*
>   * Low-level accessors
>   *
> @@ -102,9 +105,35 @@ static inline void gic_write_dir(u32 irq)
>  static inline u64 gic_read_iar_common(void)
>  {
>  	u64 irqstat;
> +	u64 __maybe_unused daif;
> +	u64 __maybe_unused pmr;
> +	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
>  
> +#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>  	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
>  	dsb(sy);
> +#else
> +	/*
> +	 * The PMR may be configured to mask interrupts when this code is
> +	 * called, thus in order to acknowledge interrupts we must set the
> +	 * PMR to its default value before reading from the IAR.
> +	 *
> +	 * To do this without taking an interrupt we also ensure the I bit
> +	 * is set whilst we are interfering with the value of the PMR.
> +	 */
> +	asm volatile(
> +		"mrs	%1, daif\n\t"			     /* save I bit  */
> +		"msr	daifset, #2\n\t"		     /* set I bit   */
> +		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
> +		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
> +		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
> +		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
> +		"isb\n\t"
> +		"msr	daif, %1"			     /* restore I   */
> +		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
> +		: "r" (default_pmr_value));

I'm a bit puzzled by this. Why would you have to mess with PMR or DAIF
at this stage? Why can't we just let the GIC priority mechanism do its
thing? I'd expect the initial setting of PMR (indicating whether or not
we have normal interrupts enabled or not) to be enough to perform the
filtering.

You may get an NMI while you're already handling an interrupt, but
that's fair game, and that's not something we should prevent.

What am I missing?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization
  2016-08-19 16:13 ` [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization Daniel Thompson
@ 2016-08-22 12:33   ` Marc Zyngier
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2016-08-22 12:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 19/08/16 17:13, Daniel Thompson wrote:
> Currently, when running on FVP, CPU 0 boots up with its BPR changed from
> the reset value. This renders it impossible to (preemptively) prioritize
> interrupts on CPU 0.
> 
> This is harmless on normal systems since Linux typically does not
> support preemptive interrupts. It does however cause problems in
> systems with additional changes (such as patches for NMI simulation).
> 
> Many thanks to Andrew Thoelke for suggesting the BPR as having the
> potential to harm preemption.
> 
> Suggested-by: Andrew Thoelke <andrew.thoelke@arm.com>
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2016-08-22 10:12   ` Marc Zyngier
@ 2016-08-22 14:24     ` Daniel Thompson
  2016-08-22 15:06       ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2016-08-22 14:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/08/16 11:12, Marc Zyngier wrote:
> Hi Daniel,
>
> On 19/08/16 17:13, Daniel Thompson wrote:
>> Currently irqflags is implemented using the PSR's I bit. It is possible
>> to implement irqflags by using the co-processor interface to the GIC.
>> Using the co-processor interface makes it feasible to simulate NMIs
>> using GIC interrupt prioritization.
>>
>> This patch changes the irqflags macros to modify, save and restore
>> ICC_PMR_EL1. This has a substantial knock-on effect for the rest of
>> the kernel. There are four reasons for this:
>>
>> 1. The state of the ICC_PMR_EL1_G_BIT becomes part of the CPU context
>
> What is this G bit? I can't spot anything like this in the GICv3 spec.

Good point. This is jargon that hasn't yet been defined. It's just an 
arbitrary bit in the PMR (we don't need to save the whole PMR, just the 
bit we toggle to jump between high and low priority).

I'll fix the description (and perhaps also add deeper comments to 
asm/ptrace.h ).


>>    and must be saved and restored during traps. To simplify the
>>    additional context management the ICC_PMR_EL1_G_BIT is converted
>>    into a fake (reserved) bit within the PSR (PSR_G_BIT). Naturally
>>    this approach will need to be changed if future ARM architecture
>>    extensions make use of this bit.
>>
>> 2. The hardware automatically masks the I bit (at boot, during traps, etc).
>>    When the I bit is set by hardware we must add code to switch from I
>>    bit masking to PMR masking.
>>
>> 3. Some instructions, notably wfi, require that the PMR not be used
>>    for interrupt masking. Before calling these instructions we must
>>    switch from PMR masking to I bit masking.
>>
>> 4. We use the alternatives system to allow a single kernel to boot and
>>    switch between the different masking approaches.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> ---
>>  arch/arm64/Kconfig                  |  15 +++++
>>  arch/arm64/include/asm/arch_gicv3.h |  51 +++++++++++++++++
>>  arch/arm64/include/asm/assembler.h  |  32 ++++++++++-
>>  arch/arm64/include/asm/irqflags.h   | 106 ++++++++++++++++++++++++++++++++++++
>>  arch/arm64/include/asm/ptrace.h     |  23 ++++++++
>>  arch/arm64/kernel/entry.S           |  77 ++++++++++++++++++++++----
>>  arch/arm64/kernel/head.S            |  35 ++++++++++++
>>  arch/arm64/kernel/smp.c             |   6 ++
>>  arch/arm64/mm/proc.S                |  23 ++++++++
>>  drivers/irqchip/irq-gic-v3.c        |   2 +
>>  include/linux/irqchip/arm-gic.h     |   2 +-
>>  11 files changed, 356 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index bc3f00f586f1..56846724d2af 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -718,6 +718,21 @@ config FORCE_MAX_ZONEORDER
>>  	  However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
>>  	  4M allocations matching the default size used by generic code.
>>
>> +config USE_ICC_SYSREGS_FOR_IRQFLAGS
>> +	bool "Use ICC system registers for IRQ masking"
>> +	select ARM_GIC_V3
>> +	help
>> +	  Using the ICC system registers for IRQ masking makes it possible
>> +	  to simulate NMI on ARM64 systems. This allows several interesting
>> +	  features (especially debug features) to be used on these systems.
>> +
>> +	  Say Y here to implement IRQ masking using ICC system
>> +	  registers. This will result in an unbootable kernel if these
>
> Is that a true statement? If so, this is not acceptable. We want a
> kernel that has this feature enabled to boot on anything, whether it has
> GICv3 or not. It should disable itself on a GICv2 system.

It *was* a true statement in v2 but now it's completely wrong. The code 
works just fine on Hikey (where the feature does not deploy).

Will fix.



>> +	  registers are not implemented or made inaccessible by the
>> +	  EL3 firmware or EL2 hypervisor (if present).
>> +
>> +	  If unsure, say N
>> +
>>  menuconfig ARMV8_DEPRECATED
>>  	bool "Emulate deprecated/obsolete ARMv8 instructions"
>>  	depends on COMPAT
>> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
>> index fc2a0cb47b2c..a4ad69e2831b 100644
>> --- a/arch/arm64/include/asm/arch_gicv3.h
>> +++ b/arch/arm64/include/asm/arch_gicv3.h
>> @@ -80,6 +80,9 @@
>>  #include <linux/stringify.h>
>>  #include <asm/barrier.h>
>>
>> +/* Our default, arbitrary priority value. Linux only uses one anyway. */
>> +#define DEFAULT_PMR_VALUE	0xf0
>> +
>>  /*
>>   * Low-level accessors
>>   *
>> @@ -102,9 +105,35 @@ static inline void gic_write_dir(u32 irq)
>>  static inline u64 gic_read_iar_common(void)
>>  {
>>  	u64 irqstat;
>> +	u64 __maybe_unused daif;
>> +	u64 __maybe_unused pmr;
>> +	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
>>
>> +#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>  	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
>>  	dsb(sy);
>> +#else
>> +	/*
>> +	 * The PMR may be configured to mask interrupts when this code is
>> +	 * called, thus in order to acknowledge interrupts we must set the
>> +	 * PMR to its default value before reading from the IAR.
>> +	 *
>> +	 * To do this without taking an interrupt we also ensure the I bit
>> +	 * is set whilst we are interfering with the value of the PMR.
>> +	 */
>> +	asm volatile(
>> +		"mrs	%1, daif\n\t"			     /* save I bit  */
>> +		"msr	daifset, #2\n\t"		     /* set I bit   */
>> +		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
>> +		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
>> +		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
>> +		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
>> +		"isb\n\t"
>> +		"msr	daif, %1"			     /* restore I   */
>> +		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
>> +		: "r" (default_pmr_value));
>
> I'm a bit puzzled by this. Why would you have to mess with PMR or DAIF
> at this stage? Why can't we just let the GIC priority mechanism do its
> thing? I'd expect the initial setting of PMR (indicating whether or not
> we have normal interrupts enabled or not) to be enough to perform the
> filtering.
>
> You may get an NMI while you're already handling an interrupt, but
> that's fair game, and that's not something we should prevent.
>
> What am I missing?

This code *always* executes with interrupts masked at the PMR and, when 
interrupts are masked in this way, they cannot be acknowledged.

So to acknowledge the interrupt we must first modify the PMR to unmask 
it. If we did that without first ensuring the I-bit is masked we would 
end up constantly re-trapping.
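
The constraint Daniel describes can be sketched with a toy model of the 
GIC CPU interface (all names and priority values below are illustrative, 
not taken from the patch): while the PMR is masking, a read of the IAR 
acknowledges nothing and yields the spurious INTID, so the PMR must be 
raised first, behind the I bit, for the acknowledge to succeed.

```c
#include <assert.h>

/* Toy model: lower numeric value = higher priority, as on the GIC.
 * A read of ICC_IAR1_EL1 only acknowledges a pending interrupt whose
 * priority beats the PMR; otherwise it returns the spurious INTID.
 */
#define SPURIOUS_INTID 1023u
#define PMR_MASKED     0x80u   /* "interrupts disabled" via the PMR  */
#define PMR_UNMASKED   0xf0u   /* cf. DEFAULT_PMR_VALUE in the patch */

static unsigned pmr = PMR_MASKED;
static unsigned pending_prio  = 0xa0u; /* a normal IRQ is pending */
static unsigned pending_intid = 42u;   /* hypothetical INTID      */

static unsigned read_iar(void)
{
	if (pending_prio < pmr)        /* beats the mask: acknowledge */
		return pending_intid;
	return SPURIOUS_INTID;         /* nothing to acknowledge      */
}
```

With pmr left at PMR_MASKED, read_iar() yields the spurious INTID; 
raising pmr to PMR_UNMASKED (with the I bit set, to avoid the re-trap 
loop) lets the same read return the pending INTID.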

Note, we could probably avoid saving the DAIF bits and PMR since they 
both have a known state at this stage in gic_handle_irq(). However it 
would be such a nasty potential side-effect I think I'd have to refactor 
things a bit so the mask/unmask dance doesn't happen in the register 
access functions.


Daniel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2016-08-22 14:24     ` Daniel Thompson
@ 2016-08-22 15:06       ` Marc Zyngier
  2016-08-25 13:23         ` Daniel Thompson
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2016-08-22 15:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/08/16 15:24, Daniel Thompson wrote:
> On 22/08/16 11:12, Marc Zyngier wrote:
>> Hi Daniel,
>>
>> On 19/08/16 17:13, Daniel Thompson wrote:
>>> Currently irqflags is implemented using the PSR's I bit. It is possible
>>> to implement irqflags by using the co-processor interface to the GIC.
>>> Using the co-processor interface makes it feasible to simulate NMIs
>>> using GIC interrupt prioritization.
>>>
>>> This patch changes the irqflags macros to modify, save and restore
>>> ICC_PMR_EL1. This has a substantial knock-on effect for the rest of
>>> the kernel. There are four reasons for this:
>>>
>>> 1. The state of the ICC_PMR_EL1_G_BIT becomes part of the CPU context
>>
>> What is this G bit? I can't spot anything like this in the GICv3 spec.
> 
> Good point. This is jargon that hasn't yet been defined. It's just an 
> arbitrary bit in the PMR (we don't need to save the whole PMR, just the 
> bit we toggle to jump between high and low priority).
> 
> I'll fix the description (and perhaps also add deeper comments to 
> asm/ptrace.h ).
> 
> 
>>>    and must be saved and restored during traps. To simplify the
>>>    additional context management the ICC_PMR_EL1_G_BIT is converted
>>>    into a fake (reserved) bit within the PSR (PSR_G_BIT). Naturally
>>>    this approach will need to be changed if future ARM architecture
>>>    extensions make use of this bit.
>>>
>>> 2. The hardware automatically masks the I bit (at boot, during traps, etc).
>>>    When the I bit is set by hardware we must add code to switch from I
>>>    bit masking to PMR masking.
>>>
>>> 3. Some instructions, notably wfi, require that the PMR not be used
>>>    for interrupt masking. Before calling these instructions we must
>>>    switch from PMR masking to I bit masking.
>>>
>>> 4. We use the alternatives system to allow a single kernel to boot and
>>>    switch between the different masking approaches.
>>>
>>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>>> ---
>>>  arch/arm64/Kconfig                  |  15 +++++
>>>  arch/arm64/include/asm/arch_gicv3.h |  51 +++++++++++++++++
>>>  arch/arm64/include/asm/assembler.h  |  32 ++++++++++-
>>>  arch/arm64/include/asm/irqflags.h   | 106 ++++++++++++++++++++++++++++++++++++
>>>  arch/arm64/include/asm/ptrace.h     |  23 ++++++++
>>>  arch/arm64/kernel/entry.S           |  77 ++++++++++++++++++++++----
>>>  arch/arm64/kernel/head.S            |  35 ++++++++++++
>>>  arch/arm64/kernel/smp.c             |   6 ++
>>>  arch/arm64/mm/proc.S                |  23 ++++++++
>>>  drivers/irqchip/irq-gic-v3.c        |   2 +
>>>  include/linux/irqchip/arm-gic.h     |   2 +-
>>>  11 files changed, 356 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index bc3f00f586f1..56846724d2af 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -718,6 +718,21 @@ config FORCE_MAX_ZONEORDER
>>>  	  However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
>>>  	  4M allocations matching the default size used by generic code.
>>>
>>> +config USE_ICC_SYSREGS_FOR_IRQFLAGS
>>> +	bool "Use ICC system registers for IRQ masking"
>>> +	select ARM_GIC_V3
>>> +	help
>>> +	  Using the ICC system registers for IRQ masking makes it possible
>>> +	  to simulate NMI on ARM64 systems. This allows several interesting
>>> +	  features (especially debug features) to be used on these systems.
>>> +
>>> +	  Say Y here to implement IRQ masking using ICC system
>>> +	  registers. This will result in an unbootable kernel if these
>>
>> Is that a true statement? If so, this is not acceptable. We want a
>> kernel that has this feature enabled to boot on anything, whether it has
>> GICv3 or not. It should disable itself on a GICv2 system.
> 
> It *was* a true statement in v2 but now it's completely wrong. The code 
> works just fine on Hikey (where the feature does not deploy).
> 
> Will fix.
> 
> 
> 
>>> +	  registers are not implemented or made inaccessible by the
>>> +	  EL3 firmware or EL2 hypervisor (if present).
>>> +
>>> +	  If unsure, say N
>>> +
>>>  menuconfig ARMV8_DEPRECATED
>>>  	bool "Emulate deprecated/obsolete ARMv8 instructions"
>>>  	depends on COMPAT
>>> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
>>> index fc2a0cb47b2c..a4ad69e2831b 100644
>>> --- a/arch/arm64/include/asm/arch_gicv3.h
>>> +++ b/arch/arm64/include/asm/arch_gicv3.h
>>> @@ -80,6 +80,9 @@
>>>  #include <linux/stringify.h>
>>>  #include <asm/barrier.h>
>>>
>>> +/* Our default, arbitrary priority value. Linux only uses one anyway. */
>>> +#define DEFAULT_PMR_VALUE	0xf0
>>> +
>>>  /*
>>>   * Low-level accessors
>>>   *
>>> @@ -102,9 +105,35 @@ static inline void gic_write_dir(u32 irq)
>>>  static inline u64 gic_read_iar_common(void)
>>>  {
>>>  	u64 irqstat;
>>> +	u64 __maybe_unused daif;
>>> +	u64 __maybe_unused pmr;
>>> +	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
>>>
>>> +#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>  	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
>>>  	dsb(sy);
>>> +#else
>>> +	/*
>>> +	 * The PMR may be configured to mask interrupts when this code is
>>> +	 * called, thus in order to acknowledge interrupts we must set the
>>> +	 * PMR to its default value before reading from the IAR.
>>> +	 *
>>> +	 * To do this without taking an interrupt we also ensure the I bit
>>> +	 * is set whilst we are interfering with the value of the PMR.
>>> +	 */
>>> +	asm volatile(
>>> +		"mrs	%1, daif\n\t"			     /* save I bit  */
>>> +		"msr	daifset, #2\n\t"		     /* set I bit   */
>>> +		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
>>> +		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
>>> +		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
>>> +		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
>>> +		"isb\n\t"
>>> +		"msr	daif, %1"			     /* restore I   */
>>> +		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
>>> +		: "r" (default_pmr_value));
>>
>> I'm a bit puzzled by this. Why would you have to mess with PMR or DAIF
>> at this stage? Why can't we just let the GIC priority mechanism do its
>> thing? I'd expect the initial setting of PMR (indicating whether or not
>> we have normal interrupts enabled or not) to be enough to perform the
>> filtering.
>>
>> You may get an NMI while you're already handling an interrupt, but
>> that's fair game, and that's not something we should prevent.
>>
>> What am I missing?
> 
> This code *always* executes with interrupts masked at the PMR and, when 
> interrupts are masked in this way, they cannot be acknowledged.
> 
> So to acknowledge the interrupt we must first modify the PMR to unmask 
> it. If we did that without first ensuring the I-bit is masked we would 
> end up constantly re-trapping.

Right, I see your problem now. I guess that I had a slightly different
view of how to implement it. My own take is that we'd be better off
leaving the I bit alone until we reach the point where we've accessed
the IAR register (hence making the interrupt active and this particular
priority unable to fire):

	irqstat = read_iar();
	clear_i_bit();

At that stage, the only interrupt that can fire is an NMI (or nothing at
all if we're already handling one). The I bit must also be set again on
priority drop (before writing to the EOI register).

This will save you most, if not all, of the faffing with PMR (which
requires an ISB/DSB pair on each update to be guaranteed to have an
effect, and is a good way to kill your performance).

Of course, the drawback is that we still cover a large section of code
using the I bit, but that'd be a good start.
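
Marc's proposed ordering can be sketched the same way (again, all names 
and priority values are illustrative): once the IAR read has activated 
the interrupt, the running priority filters everything at the same or 
lower priority, so clearing the I bit only re-opens the door to 
NMI-priority interrupts.

```c
#include <assert.h>

/* Toy model: the IAR read makes the interrupt active, so its priority
 * becomes the running priority, and anything at the same or lower
 * priority cannot preempt. Lower value = higher priority. */
#define PRIO_IDLE 0xffu
#define PRIO_IRQ  0xa0u   /* normal interrupt priority */
#define PRIO_NMI  0x20u   /* pseudo-NMI priority       */

static unsigned running_prio = PRIO_IDLE;
static int i_bit = 1;              /* set by hardware on entry     */

static void ack_irq(void)          /* models the ICC_IAR1_EL1 read */
{
	running_prio = PRIO_IRQ;   /* interrupt is now active      */
}

static void clear_i_bit(void)
{
	i_bit = 0;
}

static int can_preempt(unsigned prio)
{
	return !i_bit && prio < running_prio;
}
```

After ack_irq() followed by clear_i_bit(), only PRIO_NMI can still fire; 
another PRIO_IRQ interrupt cannot, which is why the I bit can safely be 
dropped at that point.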

> Note, we could probably avoid saving the DAIF bits and PMR since they 
> both have a known state at this stage in gic_handle_irq(). However it 
> would be such a nasty potential side-effect I think I'd have to refactor 
> things a bit so the mask/unmask dance doesn't happen in the register 
> access functions.

Yeah, there is clearly some redundancy here. And to be completely
honest, I think you should try to split this series in two parts:
- one that implements interrupt masking using PMR
- one that takes care of NMIs and possibly backtracing

At the moment, things are a bit mixed and it is hard to pinpoint what is
really required for what.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2016-08-22 15:06       ` Marc Zyngier
@ 2016-08-25 13:23         ` Daniel Thompson
  2016-08-25 13:54           ` Marc Zyngier
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2016-08-25 13:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 22/08/16 16:06, Marc Zyngier wrote:
>>>> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
>>>> index fc2a0cb47b2c..a4ad69e2831b 100644
>>>> --- a/arch/arm64/include/asm/arch_gicv3.h
>>>> +++ b/arch/arm64/include/asm/arch_gicv3.h
>>>> @@ -80,6 +80,9 @@
>>>>  #include <linux/stringify.h>
>>>>  #include <asm/barrier.h>
>>>>
>>>> +/* Our default, arbitrary priority value. Linux only uses one anyway. */
>>>> +#define DEFAULT_PMR_VALUE	0xf0
>>>> +
>>>>  /*
>>>>   * Low-level accessors
>>>>   *
>>>> @@ -102,9 +105,35 @@ static inline void gic_write_dir(u32 irq)
>>>>  static inline u64 gic_read_iar_common(void)
>>>>  {
>>>>  	u64 irqstat;
>>>> +	u64 __maybe_unused daif;
>>>> +	u64 __maybe_unused pmr;
>>>> +	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
>>>>
>>>> +#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>>  	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
>>>>  	dsb(sy);
>>>> +#else
>>>> +	/*
>>>> +	 * The PMR may be configured to mask interrupts when this code is
>>>> +	 * called, thus in order to acknowledge interrupts we must set the
>>>> +	 * PMR to its default value before reading from the IAR.
>>>> +	 *
>>>> +	 * To do this without taking an interrupt we also ensure the I bit
>>>> +	 * is set whilst we are interfering with the value of the PMR.
>>>> +	 */
>>>> +	asm volatile(
>>>> +		"mrs	%1, daif\n\t"			     /* save I bit  */
>>>> +		"msr	daifset, #2\n\t"		     /* set I bit   */
>>>> +		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
>>>> +		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
>>>> +		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
>>>> +		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
>>>> +		"isb\n\t"
>>>> +		"msr	daif, %1"			     /* restore I   */
>>>> +		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
>>>> +		: "r" (default_pmr_value));
>>>
>>> I'm a bit puzzled by this. Why would you have to mess with PMR or DAIF
>>> at this stage? Why can't we just let the GIC priority mechanism do its
>>> thing? I'd expect the initial setting of PMR (indicating whether or not
>>> we have normal interrupts enabled or not) to be enough to perform the
>>> filtering.
>>>
>>> You may get an NMI while you're already handling an interrupt, but
>>> that's fair game, and that's not something we should prevent.
>>>
>>> What am I missing?
>>
>> This code *always* executes with interrupts masked at the PMR and, when
>> interrupts are masked in this way, they cannot be acknowledged.
>>
>> So to acknowledge the interrupt we must first modify the PMR to unmask
>> it. If we did that without first ensuring the I-bit is masked we would
>> end up constantly re-trapping.
>
> Right, I see your problem now. I guess that I had a slightly different
> view of how to implement it. My own take is that we'd be better off
> leaving the I bit alone until we reach the point where we've accessed
> the IAR register (hence making the interrupt active and this particular
> priority unable to fire):
>
> 	irqstat = read_iar();
> 	clear_i_bit();
>
> At that stage, the only interrupt that can fire is an NMI (or nothing at
> all if we're already handling one). The I bit must also be set again on
> priority drop (before writing to the EOI register).

Relying on intack to raise the priority does sound a lot simpler.

The forceful mask of PMR inside kernel_entry was so we model, as closely 
as possible, the way the I-bit is set by hardware when we take a trap.

However the update of the PMR can easily be shifted into the enable_nmi 
macro, leaving the interrupt controller (which cannot enable_nmi before 
calling the handler) to look after itself.


> This will save you most, if not all, of the faffing with PMR (which
> requires an ISB/DSB pair on each update to be guaranteed to have an
> effect, and is a good way to kill your performance).

Removing the code is a good thing although I thought the PMR was 
self-synchronizing (the code used to have barriers all over the shop 
until Dave Martin pointed that out).


> Of course, the drawback is that we still cover a large section of code
> using the I bit, but that'd be a good start.

To be honest I think that is forced on us anyway. We cannot know if the 
IRQ trap is a pseudo-NMI or an IRQ until we read the intack register.


>> Note, we could probably avoid saving the DAIF bits and PMR since they
>> both have a known state at this stage in gic_handle_irq(). However it
>> would be such a nasty potential side-effect I think I'd have to refactor
>> things a bit so the mask/unmask dance doesn't happen in the register
>> access functions.
>
> Yeah, there is clearly some redundancy here. And to be completely
> honest, I think you should try to split this series in two parts:
> - one that implements interrupt masking using PMR
> - one that takes care of NMIs and possibly backtracing
>
> At the moment, things are a bit mixed and it is hard to pinpoint what is
> really required for what.

Well... the patches are already teased out just as you say but the 
ordering *is* a bit weird now you mention it (patches 3, 4, 5 & 6 do the 
interrupt masking, whilst 1, 2 and 7 do the NMI).

Do you think it's acceptable to reorder the patch set and put a clear 
description in the covering letter describing when the code related to 
NMI starts? I find it hard to see the point of PMR masking without 
example code to drive it.


Daniel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2016-08-25 13:23         ` Daniel Thompson
@ 2016-08-25 13:54           ` Marc Zyngier
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2016-08-25 13:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 25 Aug 2016 14:23:42 +0100
Daniel Thompson <daniel.thompson@linaro.org> wrote:

> On 22/08/16 16:06, Marc Zyngier wrote:
> >>>> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
> >>>> index fc2a0cb47b2c..a4ad69e2831b 100644
> >>>> --- a/arch/arm64/include/asm/arch_gicv3.h
> >>>> +++ b/arch/arm64/include/asm/arch_gicv3.h
> >>>> @@ -80,6 +80,9 @@
> >>>>  #include <linux/stringify.h>
> >>>>  #include <asm/barrier.h>
> >>>>
> >>>> +/* Our default, arbitrary priority value. Linux only uses one anyway. */
> >>>> +#define DEFAULT_PMR_VALUE	0xf0
> >>>> +
> >>>>  /*
> >>>>   * Low-level accessors
> >>>>   *
> >>>> @@ -102,9 +105,35 @@ static inline void gic_write_dir(u32 irq)
> >>>>  static inline u64 gic_read_iar_common(void)
> >>>>  {
> >>>>  	u64 irqstat;
> >>>> +	u64 __maybe_unused daif;
> >>>> +	u64 __maybe_unused pmr;
> >>>> +	u64 __maybe_unused default_pmr_value = DEFAULT_PMR_VALUE;
> >>>>
> >>>> +#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> >>>>  	asm volatile("mrs_s %0, " __stringify(ICC_IAR1_EL1) : "=r" (irqstat));
> >>>>  	dsb(sy);
> >>>> +#else
> >>>> +	/*
> >>>> +	 * The PMR may be configured to mask interrupts when this code is
> >>>> +	 * called, thus in order to acknowledge interrupts we must set the
> >>>> +	 * PMR to its default value before reading from the IAR.
> >>>> +	 *
> >>>> +	 * To do this without taking an interrupt we also ensure the I bit
> >>>> +	 * is set whilst we are interfering with the value of the PMR.
> >>>> +	 */
> >>>> +	asm volatile(
> >>>> +		"mrs	%1, daif\n\t"			     /* save I bit  */
> >>>> +		"msr	daifset, #2\n\t"		     /* set I bit   */
> >>>> +		"mrs_s  %2, " __stringify(ICC_PMR_EL1) "\n\t"/* save PMR    */
> >>>> +		"msr_s	" __stringify(ICC_PMR_EL1) ",%3\n\t" /* set PMR     */
> >>>> +		"mrs_s	%0, " __stringify(ICC_IAR1_EL1) "\n\t" /* ack int   */
> >>>> +		"msr_s  " __stringify(ICC_PMR_EL1) ",%2\n\t" /* restore PMR */
> >>>> +		"isb\n\t"
> >>>> +		"msr	daif, %1"			     /* restore I   */
> >>>> +		: "=r" (irqstat), "=&r" (daif), "=&r" (pmr)
> >>>> +		: "r" (default_pmr_value));  
> >>>
> >>> I'm a bit puzzled by this. Why would you have to mess with PMR or DAIF
> >>> at this stage? Why can't we just let the GIC priority mechanism do its
> >>> thing? I'd expect the initial setting of PMR (indicating whether or not
> >>> we have normal interrupts enabled or not) to be enough to perform the
> >>> filtering.
> >>>
> >>> You may get an NMI while you're already handling an interrupt, but
> >>> that's fair game, and that's not something we should prevent.
> >>>
> >>> What am I missing?  
> >>
> >> This code *always* executes with interrupts masked at the PMR and, when
> >> interrupts are masked in this way, they cannot be acknowledged.
> >>
> >> So to acknowledge the interrupt we must first modify the PMR to unmask
> >> it. If we did that without first ensuring the I-bit is masked we would
> >> end up constantly re-trapping.  
> >
> > Right, I see your problem now. I guess that I had a slightly different
> > view of how to implement it. My own take is that we'd be better off
> > leaving the I bit alone until we reach the point where we've accessed
> > the IAR register (hence making the interrupt active and this particular
> > priority unable to fire):
> >
> > 	irqstat = read_iar();
> > 	clear_i_bit();
> >
> > At that stage, the only interrupt that can fire is an NMI (or nothing at
> > all if we're already handling one). The I bit must also be set again on
> > priority drop (before writing to the EOI register).  
> 
> Relying on intack to raise the priority does sound a lot simpler.
> 
> The forceful mask of PMR inside kernel_entry was so we model, as closely 
> as possible, the way the I-bit is set by hardware when we take a trap.

I think we shouldn't confuse two things:
- an explicit call to local_irq_disable() which should translate into an
  increase of the priority to only allow NMIs,
- an exception taken by the CPU, which should mask interrupts
  completely until you're in a situation where you can safely clear the
  I bit.
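
The distinction might be sketched as follows (a toy model with made-up 
names and priority values, not the kernel's actual code): an explicit 
local_irq_disable() merely raises the priority mask, so NMIs remain 
deliverable, whereas exception entry masks everything via the I bit.

```c
#include <assert.h>

/* Toy model: irqflags implemented as a priority mask. An explicit
 * local_irq_disable() raises the PMR so only NMI priority passes;
 * exception entry sets the I bit, which blocks everything.
 * Lower value = higher priority. */
#define PMR_ENABLED  0xf0u   /* normal interrupts allowed */
#define PMR_NMI_ONLY 0x80u   /* only NMI priority passes  */
#define PRIO_IRQ     0xa0u
#define PRIO_NMI     0x20u

static unsigned pmr = PMR_ENABLED;
static int i_bit;

static void local_irq_disable_model(void) { pmr = PMR_NMI_ONLY; }
static void exception_entry_model(void)   { i_bit = 1; }

static int deliverable(unsigned prio)
{
	return !i_bit && prio < pmr;
}
```

In this model a normal IRQ stops being deliverable after 
local_irq_disable_model() while an NMI still is, and nothing is 
deliverable once exception_entry_model() has set the I bit.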

> However the update of the PMR can easily be shifted into the enable_nmi 
> macro, leaving the interrupt controller (which cannot enable_nmi before 
> calling the handler) to look after itself.
> 
> 
> > This will save you most, if not all, of the faffing with PMR (which
> > requires an ISB/DSB pair on each update to be guaranteed to have an
> > effect, and is a good way to kill your performance).  
> 
> Removing the code is a good thing, although I thought the PMR was 
> self-synchronizing (the code used to have barriers all over the shop 
> until Dave Martin pointed that out).

The architecture document seems to suggest otherwise (8.1.6,
Observability of the effects of accesses to the GIC registers),
explicitly calling out that:

"- Architectural execution of a DSB instruction guarantees that
	- The last value written to ICC_PMR_EL1 or GICC_PMR is observed
	  by the associated Redistributor.
[...]
-- Note --
An ISB or other context synchronization operation must precede the DSB
to ensure visibility of System register writes.
----------"

> > Of course, the drawback is that we still cover a large section of code
> > using the I bit, but that'd be a good start.  
> 
> To be honest I think that is forced on us anyway. We cannot know if the 
> IRQ trap is a pseudo-NMI or an IRQ until we read the intack register.

Exactly. This is the only way you can be sure of what you're actually
dealing with (anything else is inherently racy).

> >> Note, we could probably avoid saving the DAIF bits and PMR since they
> >> both have a known state at this stage in gic_handle_irq(). However it
> >> would be such a nasty potential side-effect I think I'd have to refactor
> >> things a bit so the mask/unmask dance doesn't happen in the register
> >> access functions.  
> >
> > Yeah, there is clearly some redundancy here. And to be completely
> > honest, I think you should try to split this series in two parts:
> > - one that implements interrupt masking using PMR
> > - one that takes care of NMIs and possibly backtracing
> >
> > At the moment, things are a bit mixed and it is hard to pinpoint what is
> > really required for what.  
> 
> Well... the patches are already teased out just as you say but the 
> ordering *is* a bit weird now you mention it (patches 3, 4, 5 & 6 do the 
> interrupt masking, whilst 1, 2 and 7 do the NMI).
> 
> Do you think it's acceptable to reorder the patch set and put a clear 
> description in the covering letter describing where the code related to 
> NMI starts? I find it hard to see the point of PMR masking without 
> example code to drive it.

I see it under a slightly different light. PMR usage as a replacement
for the I bit shouldn't introduce any change in behaviour, and I'm
eager to get that part right before we start introducing more features.

Of course, we should keep the NMI/backtracing goal in mind, but I'd
rather take one step at a time and plug the PMR infrastructure in place
first.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.


Thread overview: 14+ messages
2016-08-19 16:13 [RFC PATCH v3 0/7] Pseudo-NMI for arm64 using ICC_PMR_EL1 (GICv3) Daniel Thompson
2016-08-19 16:13 ` [RFC PATCH v3 1/7] irqchip: gic-v3: Reset BPR during initialization Daniel Thompson
2016-08-22 12:33   ` Marc Zyngier
2016-08-19 16:13 ` [RFC PATCH v3 2/7] arm64: Add support for on-demand backtrace of other CPUs Daniel Thompson
2016-08-19 16:13 ` [RFC PATCH v3 3/7] arm64: cpufeature: Allow early detect of specific features Daniel Thompson
2016-08-19 16:13 ` [RFC PATCH v3 4/7] arm64: alternative: Apply alternatives early in boot process Daniel Thompson
2016-08-19 16:13 ` [RFC PATCH v3 5/7] arm64: irqflags: Reorder the fiq & async macros Daniel Thompson
2016-08-19 16:13 ` [RFC PATCH v3 6/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Daniel Thompson
2016-08-22 10:12   ` Marc Zyngier
2016-08-22 14:24     ` Daniel Thompson
2016-08-22 15:06       ` Marc Zyngier
2016-08-25 13:23         ` Daniel Thompson
2016-08-25 13:54           ` Marc Zyngier
2016-08-19 16:13 ` [RFC PATCH v3 7/7] arm64: Implement IPI_CPU_BACKTRACE using pseudo-NMIs Daniel Thompson
