* [PATCH 0 of 7] Fix kexec in Xen (take 4)
@ 2011-06-13 17:02 Andrew Cooper
2011-06-13 17:02 ` [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown Andrew Cooper
` (7 more replies)
0 siblings, 8 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
This set of patches are partly bugfixed to problems I have noticed while working on this area of the codebase, and targeted fixes to get the kexec path working again on Xen 4.x
The first three patches are small bugfixes which are not directly related to the kexec problems, but are on the codepath.
The next four patches are directly related to fixing the kexec path.
These patches cause xen to track the BSP local APIC boot state and return to it before kexec'ing to a new kernel. This prevents the kdump kernel falling over itself when booting in x2apic mode while expecting to be in xapic mode.
It also makes sure to disable IO virtualisation, along with fixing the current codepath related to disabling Interrup Remapping on Intel VTD boxes.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-14 8:44 ` Jan Beulich
2011-06-13 17:02 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
` (6 subsequent siblings)
7 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
This is a rare case, but if the BIOS is set to uniprocessor, and Xen
is booted with 'lapic x2apic', Xen will switch into x2apic mode, which
will cause a protection fault when disabling the local APIC. This
leads to a general protection fault as this code is also in the fault
handler.
When x2apic mode is enabled, the only tranlsation which does
not result in a protection fault is to clear both the EN and EXTD
bits, which is safe to do in all cases, even if you are in xapic
mode rather than x2apic mode.
The linux code from which this is derrived is protected by an
if ( ! x2apic_mode ...) clause which is how they get away with it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 37c77bacb52a -r 076c3034c8c7 xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100
+++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
@@ -340,7 +340,8 @@ void disable_local_APIC(void)
if (enabled_via_apicbase) {
uint64_t msr_content;
rdmsrl(MSR_IA32_APICBASE, msr_content);
- wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE);
+ wrmsrl(MSR_IA32_APICBASE, msr_content &
+ ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
}
}
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
2011-06-13 17:02 ` [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-14 8:46 ` Jan Beulich
2011-06-13 17:02 ` [PATCH 3 of 7] IOMMU: Sanitise pointer work Andrew Cooper
` (5 subsequent siblings)
7 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
nmi_shootdown_cpus is part of the kexec path, coming from a panic, and
as such can be called both with interrupts enabled or disabled. We
really dont want to accidentally set IF.
Therefore, use save/restore in preference to disable/enable.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
@@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu
static void nmi_shootdown_cpus(void)
{
- unsigned long msecs;
+ unsigned long msecs, flags;
- local_irq_disable();
+ local_irq_save(flags);
crashing_cpu = smp_processor_id();
local_irq_count(crashing_cpu) = 0;
@@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void)
__stop_this_cpu();
disable_IO_APIC();
- local_irq_enable();
+ local_irq_restore(flags);
}
void machine_crash_shutdown(void)
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 3 of 7] IOMMU: Sanitise pointer work
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
2011-06-13 17:02 ` [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown Andrew Cooper
2011-06-13 17:02 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-13 18:13 ` Keir Fraser
2011-06-13 17:02 ` [PATCH 4 of 7] APIC: record local APIC state on boot Andrew Cooper
` (4 subsequent siblings)
7 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
Check for null pointers before calling function pointers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c
--- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
@@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un
return ops->read_apic_from_ire(apic, reg);
}
-void iommu_resume()
+void iommu_resume(void)
{
const struct iommu_ops *ops = iommu_get_ops();
- if ( iommu_enabled )
+ if ( iommu_enabled && ops && ops->resume )
ops->resume();
}
-void iommu_suspend()
+void iommu_suspend(void)
{
const struct iommu_ops *ops = iommu_get_ops();
- if ( iommu_enabled )
+ if ( iommu_enabled && ops && ops->suspend )
ops->suspend();
}
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 4 of 7] APIC: record local APIC state on boot
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
` (2 preceding siblings ...)
2011-06-13 17:02 ` [PATCH 3 of 7] IOMMU: Sanitise pointer work Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-14 8:57 ` Jan Beulich
2011-06-13 17:02 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Andrew Cooper
` (3 subsequent siblings)
7 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
Xen does not store the boot local APIC state which leads to problems
when shutting down for a kexec jump. This patch records the boot
state so we can return to the boot state when kexecing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
@@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity;
static bool_t __initdata opt_x2apic = 1;
boolean_param("x2apic", opt_x2apic);
+enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
+
bool_t __read_mostly x2apic_enabled = 0;
bool_t __read_mostly directed_eoi_enabled = 0;
@@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void)
return 0;
}
+/* Needs to be called during startup. It records the state the BIOS
+ * leaves the local APIC so we can undo upon kexec.
+ */
+void __init record_boot_APIC_mode(void)
+{
+ /* Sanity check - we should only ever run once, but could possibly
+ * be called several times */
+ if ( APIC_MODE_INVALID != apic_boot_mode )
+ return;
+
+ apic_boot_mode = current_local_apic_mode();
+
+ apic_printk(APIC_DEBUG, "APIC boot state is '%s'\n",
+ apic_mode_to_str(apic_boot_mode));
+}
+
+/* Look at the bits in MSR_IA32_APICBASE and work out which
+ * APIC mode we are in */
+enum apic_mode current_local_apic_mode(void)
+{
+ u64 msr_contents;
+
+ rdmsrl(MSR_IA32_APICBASE, msr_contents);
+
+ /* Reading EXTD bit from the MSR is only valid if CPUID
+ * says so, else reserved */
+ if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
+ && (msr_contents & MSR_IA32_APICBASE_EXTD) )
+ return APIC_MODE_X2APIC;
+
+ /* EN bit should always be valid as long as we can read the MSR
+ */
+ if ( msr_contents & MSR_IA32_APICBASE_ENABLE )
+ return APIC_MODE_XAPIC;
+
+ return APIC_MODE_DISABLED;
+}
+
+
+const char * apic_mode_to_str(const enum apic_mode mode)
+{
+ switch ( mode )
+ {
+ case APIC_MODE_INVALID:
+ return "invalid";
+ case APIC_MODE_DISABLED:
+ return "disabled";
+ case APIC_MODE_XAPIC:
+ return "xapic";
+ case APIC_MODE_X2APIC:
+ return "x2apic";
+ default:
+ return "unrecognised";
+ }
+}
+
void check_for_unexpected_msi(unsigned int vector)
{
unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c
--- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
@@ -60,6 +60,8 @@ void __init generic_apic_probe(void)
{
int i, changed;
+ record_boot_APIC_mode();
+
check_x2apic_preenabled();
cmdline_apic = changed = (genapic != NULL);
diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h
--- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
@@ -21,6 +21,23 @@
#define IO_APIC_REDIR_DEST_LOGICAL 0x00800
#define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
+/* Possible APIC states */
+enum apic_mode { APIC_MODE_INVALID, /* Not set yet */
+ APIC_MODE_DISABLED, /* If uniprocessor, or smp in
+ * uniprocessor mode */
+ APIC_MODE_XAPIC, /* xAPIC mode - default upon
+ * chipset reset */
+ APIC_MODE_X2APIC /* x2APIC mode - common for large
+ * smp machines */
+};
+
+/* Bootstrap processor local APIC boot mode - so we can undo our changes
+ * to the APIC state */
+extern enum apic_mode apic_boot_mode;
+
+/* enum apic_mode -> str function for logging/debug */
+const char * apic_mode_to_str(const enum apic_mode);
+
extern u8 apic_verbosity;
extern bool_t x2apic_enabled;
extern bool_t directed_eoi_enabled;
@@ -203,6 +220,8 @@ extern void disable_APIC_timer(void);
extern void enable_APIC_timer(void);
extern int lapic_suspend(void);
extern int lapic_resume(void);
+extern void record_boot_APIC_mode(void);
+extern enum apic_mode current_local_apic_mode(void);
extern int check_nmi_watchdog (void);
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
` (3 preceding siblings ...)
2011-06-13 17:02 ` [PATCH 4 of 7] APIC: record local APIC state on boot Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-14 9:02 ` Jan Beulich
2011-06-14 21:45 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Kay, Allen M
2011-06-13 17:02 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op Andrew Cooper
` (2 subsequent siblings)
7 siblings, 2 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
Experimental evidence shows that Extended Interrupt Mode remains in
effect even after Interrupt Remapping is disabled in each DMAR Global
Command Register. A consiquence of this is that when we switch from
x2apic mode back to xapic mode, and disable interrupt remapping for
the kdump kernel, interrupts passing through the IO APICs are in
x2apic format as opposed xapic. This causes a triple fault in the
kexec kernel.
As EIM is explicitly set up each time Interrup Remapping is enabled,
it is safe for us to clobber this when taring down.
Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
verbose and error-prone code, and was only used in 1 place before. We
now have IRTA_EIME which is the specific bit in the register.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
--- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
@@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
#ifdef CONFIG_X86
/* set extended interrupt mode bit */
- ir_ctrl->iremap_maddr |=
- eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
+ ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
#endif
spin_lock_irqsave(&iommu->register_lock, flags);
@@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
if ( !ecap_intr_remap(iommu->ecap) )
return;
+ /* If we are disabling Interrupt Remapping, make sure we dont stay in
+ * Extended Interrupt Mode, as this is unaffected by the Interrupt
+ * Remapping flag in each DMAR Global Control Register.
+ * Specifically, local apics in xapic mode do not like interrupts delivered
+ * in x2apic mode. Any code turning interrupt remapping back on will set
+ * EIME back correctly.
+ */
+ if ( iommu_supports_eim() )
+ {
+ u64 irta;
+ irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
+ dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
+ IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
+ !(irta & IRTA_EIME), irta);
+ }
+
spin_lock_irqsave(&iommu->register_lock, flags);
sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
if ( !(sts & DMA_GSTS_IRES) )
diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
--- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
@@ -471,7 +471,7 @@ struct qinval_entry {
#define IEC_GLOBAL_INVL 0
#define IEC_INDEX_INVL 1
-#define IRTA_REG_EIME_SHIFT 11
+#define IRTA_EIME (1 << 11)
/* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
#define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
` (4 preceding siblings ...)
2011-06-13 17:02 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-14 12:10 ` Keir Fraser
2011-06-14 22:15 ` Kay, Allen M
2011-06-13 17:02 ` [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing Andrew Cooper
2011-06-13 18:15 ` [PATCH 0 of 7] Fix kexec in Xen (take 4) Keir Fraser
7 siblings, 2 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
The kdump kernel has problems booting with interrupt/dma
remapping enabled, so we need a new iommu_ops called
crash_shutdown which is basically suspend but doesn't
need to bother saving state.
Make sure that crash_shutdown is called on the kexec
path.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 2635774346c4 -r 3ad737eb0a8d xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
@@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void)
msecs--;
}
+ /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
+ * happy when booting if interrupt/dma remapping is still enabled */
+ iommu_crash_shutdown();
+
__stop_this_cpu();
disable_IO_APIC();
diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/iommu_init.c
--- a/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100
@@ -921,6 +921,14 @@ void amd_iommu_suspend(void)
disable_iommu(iommu);
}
+void amd_iommu_crash_shutdown(void)
+{
+ struct amd_iommu *iommu;
+
+ for_each_amd_iommu ( iommu )
+ disable_iommu(iommu);
+}
+
void amd_iommu_resume(void)
{
struct amd_iommu *iommu;
diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/pci_amd_iommu.c
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 +0100
@@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = {
.suspend = amd_iommu_suspend,
.resume = amd_iommu_resume,
.share_p2m = amd_iommu_share_p2m,
+ .crash_shutdown = amd_iommu_crash_shutdown,
};
diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/iommu.c
--- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
@@ -429,6 +429,14 @@ void iommu_share_p2m_table(struct domain
ops->share_p2m(d);
}
+void iommu_crash_shutdown(void)
+{
+ const struct iommu_ops *ops = iommu_get_ops();
+ if ( ops && ops->crash_shutdown )
+ ops->crash_shutdown();
+ iommu_enabled = 0;
+}
+
/*
* Local variables:
* mode: C
diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100
@@ -2238,6 +2238,25 @@ static void vtd_suspend(void)
}
}
+static void vtd_crash_shutdown(void)
+{
+ struct acpi_drhd_unit *drhd;
+ struct iommu *iommu;
+
+ if ( !iommu_enabled )
+ return;
+
+ iommu_flush_all();
+
+ for_each_drhd_unit ( drhd )
+ {
+ iommu = drhd->iommu;
+ iommu_disable_translation(iommu);
+ }
+
+ iommu_disable_x2apic_IR();
+}
+
static void vtd_resume(void)
{
struct acpi_drhd_unit *drhd;
@@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops =
.suspend = vtd_suspend,
.resume = vtd_resume,
.share_p2m = iommu_set_pgd,
+ .crash_shutdown = vtd_crash_shutdown,
};
/*
diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 +0100
@@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_
/* power management support */
void amd_iommu_resume(void);
void amd_iommu_suspend(void);
+void amd_iommu_crash_shutdown(void);
static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift)
{
diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/xen/iommu.h
--- a/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100
@@ -133,6 +133,7 @@ struct iommu_ops {
void (*suspend)(void);
void (*resume)(void);
void (*share_p2m)(struct domain *d);
+ void (*crash_shutdown)(void);
};
void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
@@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un
void iommu_suspend(void);
void iommu_resume(void);
+void iommu_crash_shutdown(void);
void iommu_set_dom0_mapping(struct domain *d);
void iommu_share_p2m_table(struct domain *d);
^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
` (5 preceding siblings ...)
2011-06-13 17:02 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op Andrew Cooper
@ 2011-06-13 17:02 ` Andrew Cooper
2011-06-14 12:11 ` Keir Fraser
2011-06-13 18:15 ` [PATCH 0 of 7] Fix kexec in Xen (take 4) Keir Fraser
7 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-13 17:02 UTC (permalink / raw)
To: xen-devel; +Cc: Andrew Cooper
Introduce the boolean variable 'kexecing' which indicates to functions
whether we are on the kexec path or not. This is used by
disable_local_APIC() to try and revert the APIC mode back to how it
was found on boot.
We also need some fudging of the x2apic_enabled variable. It is used
in multiple places over the codebase to mean multiple things,
including:
What did the user specifify on the command line?
Did the BIOS boot me in x2apic mode?
Is the BSP Local APIC in x2apic mode?
What mode is my Local APIC in?
Therefore, set it up to prevent a protection fault when disabling the
IOAPICs. (In this case, it is used in the "What mode is my Local APIC
in?" case, so the processor doesnt suffer a protection fault because
of trying to use x2apic MSRs when it should be using xapic MMIO)
Finally, make sure that interrupts are disabled when jumping into the
purgatory code. It would be bad to service interrupts in the Xen
context when the next kernel is booting.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
@@ -37,6 +37,7 @@
#include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */
#include <mach_apic.h>
#include <io_ports.h>
+#include <xen/kexec.h>
static bool_t tdt_enabled __read_mostly;
static bool_t tdt_enable __initdata = 1;
@@ -345,6 +346,33 @@ void disable_local_APIC(void)
wrmsrl(MSR_IA32_APICBASE, msr_content &
~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
}
+
+ if ( kexecing )
+ {
+ uint64_t msr_content;
+ rdmsrl(MSR_IA32_APICBASE, msr_content);
+ msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
+ wrmsrl(MSR_IA32_APICBASE, msr_content);
+
+ switch ( apic_boot_mode )
+ {
+ case APIC_MODE_DISABLED:
+ break; /* Nothing to do - we did this above */
+ case APIC_MODE_XAPIC:
+ msr_content |= MSR_IA32_APICBASE_ENABLE;
+ wrmsrl(MSR_IA32_APICBASE, msr_content);
+ break;
+ case APIC_MODE_X2APIC:
+ msr_content |= (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
+ wrmsrl(MSR_IA32_APICBASE, msr_content);
+ break;
+ default:
+ printk("Default case when reverting #%d lapic to boot state\n",
+ smp_processor_id());
+ break;
+ }
+ }
+
}
/*
diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
@@ -27,6 +27,7 @@
#include <asm/hvm/support.h>
#include <asm/apic.h>
#include <asm/io_apic.h>
+#include <xen/iommu.h>
static atomic_t waiting_for_crash_ipi;
static unsigned int crashing_cpu;
@@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void)
iommu_crash_shutdown();
__stop_this_cpu();
+
+ /* This is a bit of a hack due to the problems with the x2apic_enabled
+ * variable, but we can't do any better without a significant refactoring
+ * of the APIC code */
+ if ( current_local_apic_mode() == APIC_MODE_X2APIC )
+ x2apic_enabled = 1;
+ else
+ x2apic_enabled = 0;
+
disable_IO_APIC();
local_irq_restore(flags);
diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c
--- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
@@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im
if ( hpet_broadcast_is_available() )
hpet_disable_legacy_broadcast();
+ /* We are about to permenantly jump out of the Xen context into the kexec
+ * purgatory code. We really dont want to be still servicing interupts.
+ */
+ local_irq_disable();
+
/*
* compat_machine_kexec() returns to idle pagetables, which requires us
* to be running on a static GDT mapping (idle pagetables have no GDT
diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c
--- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
@@ -29,6 +29,8 @@
#include <compat/kexec.h>
#endif
+bool_t kexecing = FALSE;
+
static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes);
static Elf_Note *xen_crash_note;
@@ -220,6 +222,8 @@ void kexec_crash(void)
if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) )
return;
+ kexecing = TRUE;
+
kexec_common_shutdown();
kexec_crash_save_cpu();
machine_crash_shutdown();
@@ -232,6 +236,8 @@ static long kexec_reboot(void *_image)
{
xen_kexec_image_t *image = _image;
+ kexecing = TRUE;
+
kexec_common_shutdown();
machine_reboot_kexec(image);
diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h
--- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
+++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
@@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve {
extern xen_kexec_reserve_t kexec_crash_area;
+extern bool_t kexecing;
+
void set_kexec_crash_area_size(u64 system_ram);
/* We have space for 4 images to support atomic update
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3 of 7] IOMMU: Sanitise pointer work
2011-06-13 17:02 ` [PATCH 3 of 7] IOMMU: Sanitise pointer work Andrew Cooper
@ 2011-06-13 18:13 ` Keir Fraser
2011-06-14 9:53 ` Andrew Cooper
0 siblings, 1 reply; 43+ messages in thread
From: Keir Fraser @ 2011-06-13 18:13 UTC (permalink / raw)
To: Andrew Cooper, xen-devel
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> Check for null pointers before calling function pointers.
But they're never NULL? This one's a bit pointless.
-- Keir
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
> @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un
> return ops->read_apic_from_ire(apic, reg);
> }
>
> -void iommu_resume()
> +void iommu_resume(void)
> {
> const struct iommu_ops *ops = iommu_get_ops();
> - if ( iommu_enabled )
> + if ( iommu_enabled && ops && ops->resume )
> ops->resume();
> }
>
> -void iommu_suspend()
> +void iommu_suspend(void)
> {
> const struct iommu_ops *ops = iommu_get_ops();
> - if ( iommu_enabled )
> + if ( iommu_enabled && ops && ops->suspend )
> ops->suspend();
> }
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0 of 7] Fix kexec in Xen (take 4)
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
` (6 preceding siblings ...)
2011-06-13 17:02 ` [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing Andrew Cooper
@ 2011-06-13 18:15 ` Keir Fraser
2011-06-16 13:05 ` Andrew Cooper
7 siblings, 1 reply; 43+ messages in thread
From: Keir Fraser @ 2011-06-13 18:15 UTC (permalink / raw)
To: Andrew Cooper, xen-devel
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> This set of patches are partly bugfixed to problems I have noticed while
> working on this area of the codebase, and targeted fixes to get the kexec path
> working again on Xen 4.x
>
> The first three patches are small bugfixes which are not directly related to
> the kexec problems, but are on the codepath.
>
> The next four patches are directly related to fixing the kexec path.
>
> These patches cause xen to track the BSP local APIC boot state and return to
> it before kexec'ing to a new kernel. This prevents the kdump kernel falling
> over itself when booting in x2apic mode while expecting to be in xapic mode.
>
> It also makes sure to disable IO virtualisation, along with fixing the current
> codepath related to disabling Interrup Remapping on Intel VTD boxes.
Overall looking much better! I will take a look at the patches individually,
and I expect Jan will be looking too. I think we may have a largely
check-in-able series here. :-)
-- Keir
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown
2011-06-13 17:02 ` [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown Andrew Cooper
@ 2011-06-14 8:44 ` Jan Beulich
2011-06-14 9:44 ` Andrew Cooper
0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2011-06-14 8:44 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> This is a rare case, but if the BIOS is set to uniprocessor, and Xen
> is booted with 'lapic x2apic', Xen will switch into x2apic mode, which
> will cause a protection fault when disabling the local APIC. This
> leads to a general protection fault as this code is also in the fault
> handler.
>
> When x2apic mode is enabled, the only tranlsation which does
> not result in a protection fault is to clear both the EN and EXTD
> bits, which is safe to do in all cases, even if you are in xapic
> mode rather than x2apic mode.
>
> The linux code from which this is derrived is protected by an
> if ( ! x2apic_mode ...) clause which is how they get away with it.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
You may want to submit a similar patch to Linux (which is what
this code got derived from), so that in the future no-one will get
surprised that this is different in Xen and Linux.
Otoh, interestingly this is being done only for x86-32 in Linux, and
I highly doubt any X2APIC capable machine would boot with APIC
disabled.
Jan
> diff -r 37c77bacb52a -r 076c3034c8c7 xen/arch/x86/apic.c
> --- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100
> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> @@ -340,7 +340,8 @@ void disable_local_APIC(void)
> if (enabled_via_apicbase) {
> uint64_t msr_content;
> rdmsrl(MSR_IA32_APICBASE, msr_content);
> - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE);
> + wrmsrl(MSR_IA32_APICBASE, msr_content &
> + ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
> }
> }
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag
2011-06-13 17:02 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
@ 2011-06-14 8:46 ` Jan Beulich
2011-06-14 9:46 ` Keir Fraser
2011-06-14 9:51 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
0 siblings, 2 replies; 43+ messages in thread
From: Jan Beulich @ 2011-06-14 8:46 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> nmi_shootdown_cpus is part of the kexec path, coming from a panic, and
> as such can be called both with interrupts enabled or disabled. We
> really dont want to accidentally set IF.
Can interrupts really be enabled when entering this function?
> Therefore, use save/restore in preference to disable/enable.
I.e. wouldn't just removing the stray local_irq_enable() suffice?
Jan
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c
> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
> @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu
>
> static void nmi_shootdown_cpus(void)
> {
> - unsigned long msecs;
> + unsigned long msecs, flags;
>
> - local_irq_disable();
> + local_irq_save(flags);
>
> crashing_cpu = smp_processor_id();
> local_irq_count(crashing_cpu) = 0;
> @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void)
> __stop_this_cpu();
> disable_IO_APIC();
>
> - local_irq_enable();
> + local_irq_restore(flags);
> }
>
> void machine_crash_shutdown(void)
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot
2011-06-13 17:02 ` [PATCH 4 of 7] APIC: record local APIC state on boot Andrew Cooper
@ 2011-06-14 8:57 ` Jan Beulich
2011-06-14 10:48 ` Ian Campbell
0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2011-06-14 8:57 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Xen does not store the boot local APIC state which leads to problems
> when shutting down for a kexec jump. This patch records the boot
> state so we can return to the boot state when kexecing.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
With a couple of minor adjustments (see inline comments),
Acked-by: Jan Beulich <jbeulich@novell.com>
> diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c
> --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity;
> static bool_t __initdata opt_x2apic = 1;
> boolean_param("x2apic", opt_x2apic);
>
> +enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
static ... __read_mostly ...
> +
> bool_t __read_mostly x2apic_enabled = 0;
> bool_t __read_mostly directed_eoi_enabled = 0;
>
> @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void)
> return 0;
> }
>
> +/* Needs to be called during startup. It records the state the BIOS
> + * leaves the local APIC so we can undo upon kexec.
> + */
> +void __init record_boot_APIC_mode(void)
> +{
> + /* Sanity check - we should only ever run once, but could possibly
> + * be called several times */
> + if ( APIC_MODE_INVALID != apic_boot_mode )
> + return;
> +
> + apic_boot_mode = current_local_apic_mode();
> +
> + apic_printk(APIC_DEBUG, "APIC boot state is '%s'\n",
> + apic_mode_to_str(apic_boot_mode));
> +}
> +
> +/* Look at the bits in MSR_IA32_APICBASE and work out which
> + * APIC mode we are in */
> +enum apic_mode current_local_apic_mode(void)
Logically (within this patch) this function should also be static, but
I see it is being referenced by a later patch. Read on there.
> +{
> + u64 msr_contents;
> +
> + rdmsrl(MSR_IA32_APICBASE, msr_contents);
> +
> + /* Reading EXTD bit from the MSR is only valid if CPUID
> + * says so, else reserved */
> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
> + && (msr_contents & MSR_IA32_APICBASE_EXTD) )
> + return APIC_MODE_X2APIC;
> +
> + /* EN bit should always be valid as long as we can read the MSR
> + */
> + if ( msr_contents & MSR_IA32_APICBASE_ENABLE )
> + return APIC_MODE_XAPIC;
> +
> + return APIC_MODE_DISABLED;
> +}
> +
> +
> +const char * apic_mode_to_str(const enum apic_mode mode)
static ... __init ...
Jan
> +{
> + switch ( mode )
> + {
> + case APIC_MODE_INVALID:
> + return "invalid";
> + case APIC_MODE_DISABLED:
> + return "disabled";
> + case APIC_MODE_XAPIC:
> + return "xapic";
> + case APIC_MODE_X2APIC:
> + return "x2apic";
> + default:
> + return "unrecognised";
> + }
> +}
> +
> void check_for_unexpected_msi(unsigned int vector)
> {
> unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
> diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c
> --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
> @@ -60,6 +60,8 @@ void __init generic_apic_probe(void)
> {
> int i, changed;
>
> + record_boot_APIC_mode();
> +
> check_x2apic_preenabled();
> cmdline_apic = changed = (genapic != NULL);
>
> diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h
> --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
> @@ -21,6 +21,23 @@
> #define IO_APIC_REDIR_DEST_LOGICAL 0x00800
> #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
>
> +/* Possible APIC states */
> +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */
> + APIC_MODE_DISABLED, /* If uniprocessor, or smp in
> + * uniprocessor mode */
> + APIC_MODE_XAPIC, /* xAPIC mode - default upon
> + * chipset reset */
> + APIC_MODE_X2APIC /* x2APIC mode - common for large
> + * smp machines */
> +};
> +
> +/* Bootstrap processor local APIC boot mode - so we can undo our changes
> + * to the APIC state */
> +extern enum apic_mode apic_boot_mode;
> +
> +/* enum apic_mode -> str function for logging/debug */
> +const char * apic_mode_to_str(const enum apic_mode);
> +
> extern u8 apic_verbosity;
> extern bool_t x2apic_enabled;
> extern bool_t directed_eoi_enabled;
> @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void);
> extern void enable_APIC_timer(void);
> extern int lapic_suspend(void);
> extern int lapic_resume(void);
> +extern void record_boot_APIC_mode(void);
> +extern enum apic_mode current_local_apic_mode(void);
>
> extern int check_nmi_watchdog (void);
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-13 17:02 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Andrew Cooper
@ 2011-06-14 9:02 ` Jan Beulich
2011-06-14 9:59 ` Andrew Cooper
2011-06-14 21:20 ` Kay, Allen M
2011-06-14 21:45 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Kay, Allen M
1 sibling, 2 replies; 43+ messages in thread
From: Jan Beulich @ 2011-06-14 9:02 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Allen M Kay
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Experimental evidence shows that Extended Interrupt Mode remains in
> effect even after Interrupt Remapping is disabled in each DMAR Global
> Command Register. A consiquence of this is that when we switch from
> x2apic mode back to xapic mode, and disable interrupt remapping for
> the kdump kernel, interrupts passing through the IO APICs are in
> x2apic format as opposed xapic. This causes a triple fault in the
> kexec kernel.
>
> As EIM is explicitly set up each time Interrup Remapping is enabled,
> it is safe for us to clobber this when taring down.
>
> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
> verbose and error-prone code, and was only used in 1 place before. We
> now have IRTA_EIME which is the specific bit in the register.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
>
> #ifdef CONFIG_X86
> /* set extended interrupt mode bit */
> - ir_ctrl->iremap_maddr |=
> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
> #endif
> spin_lock_irqsave(&iommu->register_lock, flags);
>
> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
> if ( !ecap_intr_remap(iommu->ecap) )
> return;
>
> + /* If we are disabling Interrupt Remapping, make sure we dont stay in
> + * Extended Interrupt Mode, as this is unaffected by the Interrupt
> + * Remapping flag in each DMAR Global Control Register.
> + * Specifically, local apics in xapic mode do not like interrupts
> delivered
> + * in x2apic mode. Any code turning interrupt remapping back on will
> set
> + * EIME back correctly.
> + */
> + if ( iommu_supports_eim() )
> + {
> + u64 irta;
> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
> + !(irta & IRTA_EIME), irta);
Hmm, this of course can be problematic in the context of a crash:
I realize that you want it cleared, but does system state guarantee
that this WAIT_OP will always be able to complete? You'd get a
hung system instead if it won't. Admittedly it's not clear which one
is better, but a best effort approach would probably skip the wait
loop. Let's see what our friends at Intel think...
Jan
> + }
> +
> spin_lock_irqsave(&iommu->register_lock, flags);
> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> if ( !(sts & DMA_GSTS_IRES) )
> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
> @@ -471,7 +471,7 @@ struct qinval_entry {
>
> #define IEC_GLOBAL_INVL 0
> #define IEC_INDEX_INVL 1
> -#define IRTA_REG_EIME_SHIFT 11
> +#define IRTA_EIME (1 << 11)
>
> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown
2011-06-14 8:44 ` Jan Beulich
@ 2011-06-14 9:44 ` Andrew Cooper
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-14 9:44 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xensource.com
On 14/06/11 09:44, Jan Beulich wrote:
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> This is a rare case, but if the BIOS is set to uniprocessor, and Xen
>> is booted with 'lapic x2apic', Xen will switch into x2apic mode, which
>> will cause a protection fault when disabling the local APIC. This
>> leads to a general protection fault as this code is also in the fault
>> handler.
>>
>> When x2apic mode is enabled, the only tranlsation which does
>> not result in a protection fault is to clear both the EN and EXTD
>> bits, which is safe to do in all cases, even if you are in xapic
>> mode rather than x2apic mode.
>>
>> The linux code from which this is derrived is protected by an
>> if ( ! x2apic_mode ...) clause which is how they get away with it.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Jan Beulich <jbeulich@novell.com>
>
> You may want to submit a similar patch to Linux (which is what
> this code got derived from), so that in the future no-one will get
> surprised that this is different in Xen and Linux.
>
> Otoh, interestingly this is being done only for x86-32 in Linux, and
> I highly doubt any X2APIC capable machine would boot with APIC
> disabled.
>
> Jan
>
As I said, it is an edge case and shouldn't occur under any normal
circumstances, but given the nature of the fix, we might as well help
the odd setups. I considered upstreaming it to Linux but I doubt It
will be taken because there is no way to force their code to have a
protection fault.
~Andrew
>> diff -r 37c77bacb52a -r 076c3034c8c7 xen/arch/x86/apic.c
>> --- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100
>> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -340,7 +340,8 @@ void disable_local_APIC(void)
>> if (enabled_via_apicbase) {
>> uint64_t msr_content;
>> rdmsrl(MSR_IA32_APICBASE, msr_content);
>> - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE);
>> + wrmsrl(MSR_IA32_APICBASE, msr_content &
>> + ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
>> }
>> }
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag
2011-06-14 8:46 ` Jan Beulich
@ 2011-06-14 9:46 ` Keir Fraser
2011-06-15 11:01 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag [Reformatted] Andrew Cooper
2011-06-14 9:51 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
1 sibling, 1 reply; 43+ messages in thread
From: Keir Fraser @ 2011-06-14 9:46 UTC (permalink / raw)
To: Jan Beulich, Andrew Cooper; +Cc: xen-devel
On 14/06/2011 09:46, "Jan Beulich" <JBeulich@novell.com> wrote:
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> nmi_shootdown_cpus is part of the kexec path, coming from a panic, and
>> as such can be called both with interrupts enabled or disabled. We
>> really dont want to accidentally set IF.
>
> Can interrupts really be enabled when entering this function?
>
>> Therefore, use save/restore in preference to disable/enable.
>
> I.e. wouldn't just removing the stray local_irq_enable() suffice?
I'd say that the caller should local_irq_disable(). The function itself
should at most BUG_ON(local_irq_is_enabled()).
-- Keir
> Jan
>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu
>>
>> static void nmi_shootdown_cpus(void)
>> {
>> - unsigned long msecs;
>> + unsigned long msecs, flags;
>>
>> - local_irq_disable();
>> + local_irq_save(flags);
>>
>> crashing_cpu = smp_processor_id();
>> local_irq_count(crashing_cpu) = 0;
>> @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void)
>> __stop_this_cpu();
>> disable_IO_APIC();
>>
>> - local_irq_enable();
>> + local_irq_restore(flags);
>> }
>>
>> void machine_crash_shutdown(void)
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag
2011-06-14 8:46 ` Jan Beulich
2011-06-14 9:46 ` Keir Fraser
@ 2011-06-14 9:51 ` Andrew Cooper
1 sibling, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-14 9:51 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xensource.com
On 14/06/11 09:46, Jan Beulich wrote:
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> nmi_shootdown_cpus is part of the kexec path, coming from a panic, and
>> as such can be called both with interrupts enabled or disabled. We
>> really dont want to accidentally set IF.
> Can interrupts really be enabled when entering this function?
It is unlikely, but I believe there are a few codepaths which could
enter this function with interrupts enabled. Either way, the last thing
we need is the usual occurrence of leaving this function with interrupts
enabled
>> Therefore, use save/restore in preference to disable/enable.
> I.e. wouldn't just removing the stray local_irq_enable() suffice?
>
> Jan
I was trying to go for minimal changes to the Linux code. I would agree
that a single disable right at the top of the crash path might be a
better solution.
~Andrew
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu
>>
>> static void nmi_shootdown_cpus(void)
>> {
>> - unsigned long msecs;
>> + unsigned long msecs, flags;
>>
>> - local_irq_disable();
>> + local_irq_save(flags);
>>
>> crashing_cpu = smp_processor_id();
>> local_irq_count(crashing_cpu) = 0;
>> @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void)
>> __stop_this_cpu();
>> disable_IO_APIC();
>>
>> - local_irq_enable();
>> + local_irq_restore(flags);
>> }
>>
>> void machine_crash_shutdown(void)
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3 of 7] IOMMU: Sanitise pointer work
2011-06-13 18:13 ` Keir Fraser
@ 2011-06-14 9:53 ` Andrew Cooper
2011-06-14 11:51 ` Keir Fraser
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-14 9:53 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com
On 13/06/11 19:13, Keir Fraser wrote:
> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
>> Check for null pointers before calling function pointers.
> But they're never NULL? This one's a bit pointless.
>
> -- Keir
>
This patch was a direct result of comments regarding my previous attempt
to make a crash_shutdown iommu_op. Are we happy to assume that noone in
the future is going to partially implement an iommu_ops structure?
~Andrew
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c
>> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un
>> return ops->read_apic_from_ire(apic, reg);
>> }
>>
>> -void iommu_resume()
>> +void iommu_resume(void)
>> {
>> const struct iommu_ops *ops = iommu_get_ops();
>> - if ( iommu_enabled )
>> + if ( iommu_enabled && ops && ops->resume )
>> ops->resume();
>> }
>>
>> -void iommu_suspend()
>> +void iommu_suspend(void)
>> {
>> const struct iommu_ops *ops = iommu_get_ops();
>> - if ( iommu_enabled )
>> + if ( iommu_enabled && ops && ops->suspend )
>> ops->suspend();
>> }
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-14 9:02 ` Jan Beulich
@ 2011-06-14 9:59 ` Andrew Cooper
2011-06-14 21:20 ` Kay, Allen M
1 sibling, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-14 9:59 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xensource.com, Allen M Kay
On 14/06/11 10:02, Jan Beulich wrote:
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Experimental evidence shows that Extended Interrupt Mode remains in
>> effect even after Interrupt Remapping is disabled in each DMAR Global
>> Command Register. A consiquence of this is that when we switch from
>> x2apic mode back to xapic mode, and disable interrupt remapping for
>> the kdump kernel, interrupts passing through the IO APICs are in
>> x2apic format as opposed xapic. This causes a triple fault in the
>> kexec kernel.
>>
>> As EIM is explicitly set up each time Interrup Remapping is enabled,
>> it is safe for us to clobber this when taring down.
>>
>> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
>> verbose and error-prone code, and was only used in 1 place before. We
>> now have IRTA_EIME which is the specific bit in the register.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
>> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
>>
>> #ifdef CONFIG_X86
>> /* set extended interrupt mode bit */
>> - ir_ctrl->iremap_maddr |=
>> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
>> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
>> #endif
>> spin_lock_irqsave(&iommu->register_lock, flags);
>>
>> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
>> if ( !ecap_intr_remap(iommu->ecap) )
>> return;
>>
>> + /* If we are disabling Interrupt Remapping, make sure we dont stay in
>> + * Extended Interrupt Mode, as this is unaffected by the Interrupt
>> + * Remapping flag in each DMAR Global Control Register.
>> + * Specifically, local apics in xapic mode do not like interrupts
>> delivered
>> + * in x2apic mode. Any code turning interrupt remapping back on will
>> set
>> + * EIME back correctly.
>> + */
>> + if ( iommu_supports_eim() )
>> + {
>> + u64 irta;
>> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
>> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
>> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
>> + !(irta & IRTA_EIME), irta);
> Hmm, this of course can be problematic in the context of a crash:
> I realize that you want it cleared, but does system state guarantee
> that this WAIT_OP will always be able to complete? You'd get a
> hung system instead if it won't. Admittedly it's not clear which one
> is better, but a best effort approach would probably skip the wait
> loop. Let's see what our friends at Intel think...
>
> Jan
>From what I can see, IOMMU_WAIT_OP just polls memory mapped registers
until they change. As we are doing multiple dmar_read/writes and the
spec warns that we must wait.
I cant see anything inherently dangerous doing this in the crash
context, because if the wait times out, then the chances are that the
hardware is broken, at which point you have bigger problems that a kexec
not happening.
~Andrew
>> + }
>> +
>> spin_lock_irqsave(&iommu->register_lock, flags);
>> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
>> if ( !(sts & DMA_GSTS_IRES) )
>> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
>> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
>> @@ -471,7 +471,7 @@ struct qinval_entry {
>>
>> #define IEC_GLOBAL_INVL 0
>> #define IEC_INDEX_INVL 1
>> -#define IRTA_REG_EIME_SHIFT 11
>> +#define IRTA_EIME (1 << 11)
>>
>> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
>> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot
2011-06-14 8:57 ` Jan Beulich
@ 2011-06-14 10:48 ` Ian Campbell
2011-06-14 11:21 ` Jan Beulich
0 siblings, 1 reply; 43+ messages in thread
From: Ian Campbell @ 2011-06-14 10:48 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel@lists.xensource.com
On Tue, 2011-06-14 at 09:57 +0100, Jan Beulich wrote:
> >>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > Xen does not store the boot local APIC state which leads to problems
> > when shutting down for a kexec jump. This patch records the boot
> > state so we can return to the boot state when kexecing.
> >
> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> With a couple of minor adjustments (see inline comments),
>
> Acked-by: Jan Beulich <jbeulich@novell.com>
>
> > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c
> > --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity;
> > static bool_t __initdata opt_x2apic = 1;
> > boolean_param("x2apic", opt_x2apic);
> >
> > +enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
>
> static ... __read_mostly ...
I think I suggested that Andrew remove the read mostly annotiation since
this variable isn't really read mostly -- more of an access almost never
variable (it's written on boot and only read on the kexec path).
Does optimising for this variable really help anything? I suspect
(without evidence to backup my feeling) that over using this annotation
will actually push actual frequently read __read_mostly variables into
taking up more cache lines and actually hurt...
Ian.
> > +
> > bool_t __read_mostly x2apic_enabled = 0;
> > bool_t __read_mostly directed_eoi_enabled = 0;
> >
> > @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void)
> > return 0;
> > }
> >
> > +/* Needs to be called during startup. It records the state the BIOS
> > + * leaves the local APIC so we can undo upon kexec.
> > + */
> > +void __init record_boot_APIC_mode(void)
> > +{
> > + /* Sanity check - we should only ever run once, but could possibly
> > + * be called several times */
> > + if ( APIC_MODE_INVALID != apic_boot_mode )
> > + return;
> > +
> > + apic_boot_mode = current_local_apic_mode();
> > +
> > + apic_printk(APIC_DEBUG, "APIC boot state is '%s'\n",
> > + apic_mode_to_str(apic_boot_mode));
> > +}
> > +
> > +/* Look at the bits in MSR_IA32_APICBASE and work out which
> > + * APIC mode we are in */
> > +enum apic_mode current_local_apic_mode(void)
>
> Logically (within this patch) this function should also be static, but
> I see it is being referenced by a later patch. Read on there.
>
> > +{
> > + u64 msr_contents;
> > +
> > + rdmsrl(MSR_IA32_APICBASE, msr_contents);
> > +
> > + /* Reading EXTD bit from the MSR is only valid if CPUID
> > + * says so, else reserved */
> > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
> > + && (msr_contents & MSR_IA32_APICBASE_EXTD) )
> > + return APIC_MODE_X2APIC;
> > +
> > + /* EN bit should always be valid as long as we can read the MSR
> > + */
> > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE )
> > + return APIC_MODE_XAPIC;
> > +
> > + return APIC_MODE_DISABLED;
> > +}
> > +
> > +
> > +const char * apic_mode_to_str(const enum apic_mode mode)
>
> static ... __init ...
>
> Jan
>
> > +{
> > + switch ( mode )
> > + {
> > + case APIC_MODE_INVALID:
> > + return "invalid";
> > + case APIC_MODE_DISABLED:
> > + return "disabled";
> > + case APIC_MODE_XAPIC:
> > + return "xapic";
> > + case APIC_MODE_X2APIC:
> > + return "x2apic";
> > + default:
> > + return "unrecognised";
> > + }
> > +}
> > +
> > void check_for_unexpected_msi(unsigned int vector)
> > {
> > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
> > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c
> > --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
> > +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
> > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void)
> > {
> > int i, changed;
> >
> > + record_boot_APIC_mode();
> > +
> > check_x2apic_preenabled();
> > cmdline_apic = changed = (genapic != NULL);
> >
> > diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h
> > --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
> > +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
> > @@ -21,6 +21,23 @@
> > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800
> > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
> >
> > +/* Possible APIC states */
> > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */
> > + APIC_MODE_DISABLED, /* If uniprocessor, or smp in
> > + * uniprocessor mode */
> > + APIC_MODE_XAPIC, /* xAPIC mode - default upon
> > + * chipset reset */
> > + APIC_MODE_X2APIC /* x2APIC mode - common for large
> > + * smp machines */
> > +};
> > +
> > +/* Bootstrap processor local APIC boot mode - so we can undo our changes
> > + * to the APIC state */
> > +extern enum apic_mode apic_boot_mode;
> > +
> > +/* enum apic_mode -> str function for logging/debug */
> > +const char * apic_mode_to_str(const enum apic_mode);
> > +
> > extern u8 apic_verbosity;
> > extern bool_t x2apic_enabled;
> > extern bool_t directed_eoi_enabled;
> > @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void);
> > extern void enable_APIC_timer(void);
> > extern int lapic_suspend(void);
> > extern int lapic_resume(void);
> > +extern void record_boot_APIC_mode(void);
> > +extern enum apic_mode current_local_apic_mode(void);
> >
> > extern int check_nmi_watchdog (void);
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot
2011-06-14 10:48 ` Ian Campbell
@ 2011-06-14 11:21 ` Jan Beulich
2011-06-15 12:33 ` [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted] Andrew Cooper
0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2011-06-14 11:21 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, xen-devel@lists.xensource.com
>>> On 14.06.11 at 12:48, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2011-06-14 at 09:57 +0100, Jan Beulich wrote:
>> >>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> > Xen does not store the boot local APIC state which leads to problems
>> > when shutting down for a kexec jump. This patch records the boot
>> > state so we can return to the boot state when kexecing.
>> >
>> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> With a couple of minor adjustments (see inline comments),
>>
>> Acked-by: Jan Beulich <jbeulich@novell.com>
>>
>> > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c
>> > --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity;
>> > static bool_t __initdata opt_x2apic = 1;
>> > boolean_param("x2apic", opt_x2apic);
>> >
>> > +enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
>>
>> static ... __read_mostly ...
>
> I think I suggested that Andrew remove the read mostly annotiation since
> this variable isn't really read mostly -- more of an access almost never
> variable (it's written on boot and only read on the kexec path).
Indeed, the "static" part is the more important one.
> Does optimising for this variable really help anything? I suspect
> (without evidence to backup my feeling) that over using this annotation
> will actually push actual frequently read __read_mostly variables into
> taking up more cache lines and actually hurt...
Agreed.
Jan
> Ian.
>
>> > +
>> > bool_t __read_mostly x2apic_enabled = 0;
>> > bool_t __read_mostly directed_eoi_enabled = 0;
>> >
>> > @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void)
>> > return 0;
>> > }
>> >
>> > +/* Needs to be called during startup. It records the state the BIOS
>> > + * leaves the local APIC so we can undo upon kexec.
>> > + */
>> > +void __init record_boot_APIC_mode(void)
>> > +{
>> > + /* Sanity check - we should only ever run once, but could possibly
>> > + * be called several times */
>> > + if ( APIC_MODE_INVALID != apic_boot_mode )
>> > + return;
>> > +
>> > + apic_boot_mode = current_local_apic_mode();
>> > +
>> > + apic_printk(APIC_DEBUG, "APIC boot state is '%s'\n",
>> > + apic_mode_to_str(apic_boot_mode));
>> > +}
>> > +
>> > +/* Look at the bits in MSR_IA32_APICBASE and work out which
>> > + * APIC mode we are in */
>> > +enum apic_mode current_local_apic_mode(void)
>>
>> Logically (within this patch) this function should also be static, but
>> I see it is being referenced by a later patch. Read on there.
>>
>> > +{
>> > + u64 msr_contents;
>> > +
>> > + rdmsrl(MSR_IA32_APICBASE, msr_contents);
>> > +
>> > + /* Reading EXTD bit from the MSR is only valid if CPUID
>> > + * says so, else reserved */
>> > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
>> > + && (msr_contents & MSR_IA32_APICBASE_EXTD) )
>> > + return APIC_MODE_X2APIC;
>> > +
>> > + /* EN bit should always be valid as long as we can read the MSR
>> > + */
>> > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE )
>> > + return APIC_MODE_XAPIC;
>> > +
>> > + return APIC_MODE_DISABLED;
>> > +}
>> > +
>> > +
>> > +const char * apic_mode_to_str(const enum apic_mode mode)
>>
>> static ... __init ...
>>
>> Jan
>>
>> > +{
>> > + switch ( mode )
>> > + {
>> > + case APIC_MODE_INVALID:
>> > + return "invalid";
>> > + case APIC_MODE_DISABLED:
>> > + return "disabled";
>> > + case APIC_MODE_XAPIC:
>> > + return "xapic";
>> > + case APIC_MODE_X2APIC:
>> > + return "x2apic";
>> > + default:
>> > + return "unrecognised";
>> > + }
>> > +}
>> > +
>> > void check_for_unexpected_msi(unsigned int vector)
>> > {
>> > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
>> > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c
>> > --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
>> > +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100
>> > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void)
>> > {
>> > int i, changed;
>> >
>> > + record_boot_APIC_mode();
>> > +
>> > check_x2apic_preenabled();
>> > cmdline_apic = changed = (genapic != NULL);
>> >
>> > diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h
>> > --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
>> > +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100
>> > @@ -21,6 +21,23 @@
>> > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800
>> > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
>> >
>> > +/* Possible APIC states */
>> > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */
>> > + APIC_MODE_DISABLED, /* If uniprocessor, or smp in
>> > + * uniprocessor mode */
>> > + APIC_MODE_XAPIC, /* xAPIC mode - default upon
>> > + * chipset reset */
>> > + APIC_MODE_X2APIC /* x2APIC mode - common for large
>> > + * smp machines */
>> > +};
>> > +
>> > +/* Bootstrap processor local APIC boot mode - so we can undo our changes
>> > + * to the APIC state */
>> > +extern enum apic_mode apic_boot_mode;
>> > +
>> > +/* enum apic_mode -> str function for logging/debug */
>> > +const char * apic_mode_to_str(const enum apic_mode);
>> > +
>> > extern u8 apic_verbosity;
>> > extern bool_t x2apic_enabled;
>> > extern bool_t directed_eoi_enabled;
>> > @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void);
>> > extern void enable_APIC_timer(void);
>> > extern int lapic_suspend(void);
>> > extern int lapic_resume(void);
>> > +extern void record_boot_APIC_mode(void);
>> > +extern enum apic_mode current_local_apic_mode(void);
>> >
>> > extern int check_nmi_watchdog (void);
>> >
>> >
>> > _______________________________________________
>> > Xen-devel mailing list
>> > Xen-devel@lists.xensource.com
>> > http://lists.xensource.com/xen-devel
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 3 of 7] IOMMU: Sanitise pointer work
2011-06-14 9:53 ` Andrew Cooper
@ 2011-06-14 11:51 ` Keir Fraser
0 siblings, 0 replies; 43+ messages in thread
From: Keir Fraser @ 2011-06-14 11:51 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel@lists.xensource.com
On 14/06/2011 10:53, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> On 13/06/11 19:13, Keir Fraser wrote:
>> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>>
>>> Check for null pointers before calling function pointers.
>> But they're never NULL? This one's a bit pointless.
>>
>> -- Keir
>>
> This patch was a direct result of comments regarding my previous attempt
> to make a crash_shutdown iommu_op. Are we happy to assume that noone in
> the future is going to partially implement an iommu_ops structure?
We'll cross that bridge if we come to it. I can't see anyone implementing
iommu_ops that don't need a call back on suspend and resume.
-- Keir
> ~Andrew
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c
>>> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
>>> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
>>> @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un
>>> return ops->read_apic_from_ire(apic, reg);
>>> }
>>>
>>> -void iommu_resume()
>>> +void iommu_resume(void)
>>> {
>>> const struct iommu_ops *ops = iommu_get_ops();
>>> - if ( iommu_enabled )
>>> + if ( iommu_enabled && ops && ops->resume )
>>> ops->resume();
>>> }
>>>
>>> -void iommu_suspend()
>>> +void iommu_suspend(void)
>>> {
>>> const struct iommu_ops *ops = iommu_get_ops();
>>> - if ( iommu_enabled )
>>> + if ( iommu_enabled && ops && ops->suspend )
>>> ops->suspend();
>>> }
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xensource.com
>>> http://lists.xensource.com/xen-devel
>>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
2011-06-13 17:02 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op Andrew Cooper
@ 2011-06-14 12:10 ` Keir Fraser
2011-06-15 12:50 ` Andrew Cooper
2011-06-14 22:15 ` Kay, Allen M
1 sibling, 1 reply; 43+ messages in thread
From: Keir Fraser @ 2011-06-14 12:10 UTC (permalink / raw)
To: Andrew Cooper, xen-devel
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> The kdump kernel has problems booting with interrupt/dma
> remapping enabled, so we need a new iommu_ops called
> crash_shutdown which is basically suspend but doesn't
> need to bother saving state.
>
> Make sure that crash_shutdown is called on the kexec
> path.
What's wrong with the existing .suspend iommu callbacks? They look very
similar (identical in the AMD case in fact). The extra
iommu_disable_x2apic_IR() could be avoided in favour of the eisting one in
the lapic_suspend() path, perhaps?
-- Keir
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> diff -r 2635774346c4 -r 3ad737eb0a8d xen/arch/x86/crash.c
> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
> @@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void)
> msecs--;
> }
>
> + /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
> + * happy when booting if interrupt/dma remapping is still enabled */
> + iommu_crash_shutdown();
> +
> __stop_this_cpu();
> disable_IO_APIC();
>
> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/iommu_init.c
> --- a/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100
> @@ -921,6 +921,14 @@ void amd_iommu_suspend(void)
> disable_iommu(iommu);
> }
>
> +void amd_iommu_crash_shutdown(void)
> +{
> + struct amd_iommu *iommu;
> +
> + for_each_amd_iommu ( iommu )
> + disable_iommu(iommu);
> +}
> +
> void amd_iommu_resume(void)
> {
> struct amd_iommu *iommu;
> diff -r 2635774346c4 -r 3ad737eb0a8d
> xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011
> +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011
> +0100
> @@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = {
> .suspend = amd_iommu_suspend,
> .resume = amd_iommu_resume,
> .share_p2m = amd_iommu_share_p2m,
> + .crash_shutdown = amd_iommu_crash_shutdown,
> };
> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
> @@ -429,6 +429,14 @@ void iommu_share_p2m_table(struct domain
> ops->share_p2m(d);
> }
>
> +void iommu_crash_shutdown(void)
> +{
> + const struct iommu_ops *ops = iommu_get_ops();
> + if ( ops && ops->crash_shutdown )
> + ops->crash_shutdown();
> + iommu_enabled = 0;
> +}
> +
> /*
> * Local variables:
> * mode: C
> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/vtd/iommu.c
> --- a/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100
> @@ -2238,6 +2238,25 @@ static void vtd_suspend(void)
> }
> }
>
> +static void vtd_crash_shutdown(void)
> +{
> + struct acpi_drhd_unit *drhd;
> + struct iommu *iommu;
> +
> + if ( !iommu_enabled )
> + return;
> +
> + iommu_flush_all();
> +
> + for_each_drhd_unit ( drhd )
> + {
> + iommu = drhd->iommu;
> + iommu_disable_translation(iommu);
> + }
> +
> + iommu_disable_x2apic_IR();
> +}
> +
> static void vtd_resume(void)
> {
> struct acpi_drhd_unit *drhd;
> @@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops =
> .suspend = vtd_suspend,
> .resume = vtd_resume,
> .share_p2m = iommu_set_pgd,
> + .crash_shutdown = vtd_crash_shutdown,
> };
>
> /*
> diff -r 2635774346c4 -r 3ad737eb0a8d
> xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011
> +0100
> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011
> +0100
> @@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_
> /* power management support */
> void amd_iommu_resume(void);
> void amd_iommu_suspend(void);
> +void amd_iommu_crash_shutdown(void);
>
> static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift)
> {
> diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/xen/iommu.h
> --- a/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100
> @@ -133,6 +133,7 @@ struct iommu_ops {
> void (*suspend)(void);
> void (*resume)(void);
> void (*share_p2m)(struct domain *d);
> + void (*crash_shutdown)(void);
> };
>
> void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned
> int value);
> @@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un
>
> void iommu_suspend(void);
> void iommu_resume(void);
> +void iommu_crash_shutdown(void);
>
> void iommu_set_dom0_mapping(struct domain *d);
> void iommu_share_p2m_table(struct domain *d);
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
2011-06-13 17:02 ` [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing Andrew Cooper
@ 2011-06-14 12:11 ` Keir Fraser
2011-06-14 13:05 ` Andrew Cooper
0 siblings, 1 reply; 43+ messages in thread
From: Keir Fraser @ 2011-06-14 12:11 UTC (permalink / raw)
To: Andrew Cooper, xen-devel
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> Introduce the boolean variable 'kexecing' which indicates to functions
> whether we are on the kexec path or not. This is used by
> disable_local_APIC() to try and revert the APIC mode back to how it
> was found on boot.
Does this need to be done only on the kexec path, rather than
unconditionally in disable_local_APIC()?
-- Keir
> We also need some fudging of the x2apic_enabled variable. It is used
> in multiple places over the codebase to mean multiple things,
> including:
> What did the user specifify on the command line?
> Did the BIOS boot me in x2apic mode?
> Is the BSP Local APIC in x2apic mode?
> What mode is my Local APIC in?
>
> Therefore, set it up to prevent a protection fault when disabling the
> IOAPICs. (In this case, it is used in the "What mode is my Local APIC
> in?" case, so the processor doesnt suffer a protection fault because
> of trying to use x2apic MSRs when it should be using xapic MMIO)
>
> Finally, make sure that interrupts are disabled when jumping into the
> purgatory code. It would be bad to service interrupts in the Xen
> context when the next kernel is booting.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c
> --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
> @@ -37,6 +37,7 @@
> #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */
> #include <mach_apic.h>
> #include <io_ports.h>
> +#include <xen/kexec.h>
>
> static bool_t tdt_enabled __read_mostly;
> static bool_t tdt_enable __initdata = 1;
> @@ -345,6 +346,33 @@ void disable_local_APIC(void)
> wrmsrl(MSR_IA32_APICBASE, msr_content &
> ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
> }
> +
> + if ( kexecing )
> + {
> + uint64_t msr_content;
> + rdmsrl(MSR_IA32_APICBASE, msr_content);
> + msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
> + wrmsrl(MSR_IA32_APICBASE, msr_content);
> +
> + switch ( apic_boot_mode )
> + {
> + case APIC_MODE_DISABLED:
> + break; /* Nothing to do - we did this above */
> + case APIC_MODE_XAPIC:
> + msr_content |= MSR_IA32_APICBASE_ENABLE;
> + wrmsrl(MSR_IA32_APICBASE, msr_content);
> + break;
> + case APIC_MODE_X2APIC:
> + msr_content |= (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
> + wrmsrl(MSR_IA32_APICBASE, msr_content);
> + break;
> + default:
> + printk("Default case when reverting #%d lapic to boot state\n",
> + smp_processor_id());
> + break;
> + }
> + }
> +
> }
>
> /*
> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c
> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
> @@ -27,6 +27,7 @@
> #include <asm/hvm/support.h>
> #include <asm/apic.h>
> #include <asm/io_apic.h>
> +#include <xen/iommu.h>
>
> static atomic_t waiting_for_crash_ipi;
> static unsigned int crashing_cpu;
> @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void)
> iommu_crash_shutdown();
>
> __stop_this_cpu();
> +
> + /* This is a bit of a hack due to the problems with the x2apic_enabled
> + * variable, but we can't do any better without a significant refactoring
> + * of the APIC code */
> + if ( current_local_apic_mode() == APIC_MODE_X2APIC )
> + x2apic_enabled = 1;
> + else
> + x2apic_enabled = 0;
> +
> disable_IO_APIC();
>
> local_irq_restore(flags);
> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c
> --- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
> @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im
> if ( hpet_broadcast_is_available() )
> hpet_disable_legacy_broadcast();
>
> + /* We are about to permenantly jump out of the Xen context into the kexec
> + * purgatory code. We really dont want to be still servicing interupts.
> + */
> + local_irq_disable();
> +
> /*
> * compat_machine_kexec() returns to idle pagetables, which requires us
> * to be running on a static GDT mapping (idle pagetables have no GDT
> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c
> --- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
> @@ -29,6 +29,8 @@
> #include <compat/kexec.h>
> #endif
>
> +bool_t kexecing = FALSE;
> +
> static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes);
>
> static Elf_Note *xen_crash_note;
> @@ -220,6 +222,8 @@ void kexec_crash(void)
> if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) )
> return;
>
> + kexecing = TRUE;
> +
> kexec_common_shutdown();
> kexec_crash_save_cpu();
> machine_crash_shutdown();
> @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image)
> {
> xen_kexec_image_t *image = _image;
>
> + kexecing = TRUE;
> +
> kexec_common_shutdown();
> machine_reboot_kexec(image);
>
> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h
> --- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
> @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve {
>
> extern xen_kexec_reserve_t kexec_crash_area;
>
> +extern bool_t kexecing;
> +
> void set_kexec_crash_area_size(u64 system_ram);
>
> /* We have space for 4 images to support atomic update
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
2011-06-14 12:11 ` Keir Fraser
@ 2011-06-14 13:05 ` Andrew Cooper
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-14 13:05 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com
On 14/06/11 13:11, Keir Fraser wrote:
> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
>> Introduce the boolean variable 'kexecing' which indicates to functions
>> whether we are on the kexec path or not. This is used by
>> disable_local_APIC() to try and revert the APIC mode back to how it
>> was found on boot.
> Does this need to be done only on the kexec path, rather than
> unconditionally in disable_local_APIC()?
>
> -- Keir
I tried doing it unconditionally, but because of the interaction of the
local apic mode and the x2apic_enabled variable, it was causing
protection faults all over the place in the regular reboot code.
Specifically: the apic code switches between MSRs and MMIO based on the
single global x2apic_enabled variable. Whether you should use MSRs or
MMIO depends on whether your local apic is in xapic or x2apic mode, and
using MSRs when you should be using MMIO results in a protection fault,
which cascades as disable_local_APIC() is on the panic codepath.
The function is also caused as part of __stop_this_cpu() which is on
several codepaths. While none of those codepaths immediately jump out
as being problematic, I have a feeling that the interaction with
x2apic_enabled will cause problems there as well.
I can't think of any architectural reason for it not to happen, but it
wont work without the original refactoring of the apic code which was
rejected on terms of straying too far from Linux.
~Andrew
>> We also need some fudging of the x2apic_enabled variable. It is used
>> in multiple places over the codebase to mean multiple things,
>> including:
>> What did the user specifify on the command line?
>> Did the BIOS boot me in x2apic mode?
>> Is the BSP Local APIC in x2apic mode?
>> What mode is my Local APIC in?
>>
>> Therefore, set it up to prevent a protection fault when disabling the
>> IOAPICs. (In this case, it is used in the "What mode is my Local APIC
>> in?" case, so the processor doesnt suffer a protection fault because
>> of trying to use x2apic MSRs when it should be using xapic MMIO)
>>
>> Finally, make sure that interrupts are disabled when jumping into the
>> purgatory code. It would be bad to service interrupts in the Xen
>> context when the next kernel is booting.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c
>> --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -37,6 +37,7 @@
>> #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */
>> #include <mach_apic.h>
>> #include <io_ports.h>
>> +#include <xen/kexec.h>
>>
>> static bool_t tdt_enabled __read_mostly;
>> static bool_t tdt_enable __initdata = 1;
>> @@ -345,6 +346,33 @@ void disable_local_APIC(void)
>> wrmsrl(MSR_IA32_APICBASE, msr_content &
>> ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
>> }
>> +
>> + if ( kexecing )
>> + {
>> + uint64_t msr_content;
>> + rdmsrl(MSR_IA32_APICBASE, msr_content);
>> + msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> + wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +
>> + switch ( apic_boot_mode )
>> + {
>> + case APIC_MODE_DISABLED:
>> + break; /* Nothing to do - we did this above */
>> + case APIC_MODE_XAPIC:
>> + msr_content |= MSR_IA32_APICBASE_ENABLE;
>> + wrmsrl(MSR_IA32_APICBASE, msr_content);
>> + break;
>> + case APIC_MODE_X2APIC:
>> + msr_content |= (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> + wrmsrl(MSR_IA32_APICBASE, msr_content);
>> + break;
>> + default:
>> + printk("Default case when reverting #%d lapic to boot state\n",
>> + smp_processor_id());
>> + break;
>> + }
>> + }
>> +
>> }
>>
>> /*
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -27,6 +27,7 @@
>> #include <asm/hvm/support.h>
>> #include <asm/apic.h>
>> #include <asm/io_apic.h>
>> +#include <xen/iommu.h>
>>
>> static atomic_t waiting_for_crash_ipi;
>> static unsigned int crashing_cpu;
>> @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void)
>> iommu_crash_shutdown();
>>
>> __stop_this_cpu();
>> +
>> + /* This is a bit of a hack due to the problems with the x2apic_enabled
>> + * variable, but we can't do any better without a significant refactoring
>> + * of the APIC code */
>> + if ( current_local_apic_mode() == APIC_MODE_X2APIC )
>> + x2apic_enabled = 1;
>> + else
>> + x2apic_enabled = 0;
>> +
>> disable_IO_APIC();
>>
>> local_irq_restore(flags);
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c
>> --- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im
>> if ( hpet_broadcast_is_available() )
>> hpet_disable_legacy_broadcast();
>>
>> + /* We are about to permenantly jump out of the Xen context into the kexec
>> + * purgatory code. We really dont want to be still servicing interupts.
>> + */
>> + local_irq_disable();
>> +
>> /*
>> * compat_machine_kexec() returns to idle pagetables, which requires us
>> * to be running on a static GDT mapping (idle pagetables have no GDT
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c
>> --- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -29,6 +29,8 @@
>> #include <compat/kexec.h>
>> #endif
>>
>> +bool_t kexecing = FALSE;
>> +
>> static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes);
>>
>> static Elf_Note *xen_crash_note;
>> @@ -220,6 +222,8 @@ void kexec_crash(void)
>> if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) )
>> return;
>>
>> + kexecing = TRUE;
>> +
>> kexec_common_shutdown();
>> kexec_crash_save_cpu();
>> machine_crash_shutdown();
>> @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image)
>> {
>> xen_kexec_image_t *image = _image;
>>
>> + kexecing = TRUE;
>> +
>> kexec_common_shutdown();
>> machine_reboot_kexec(image);
>>
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h
>> --- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
>> @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve {
>>
>> extern xen_kexec_reserve_t kexec_crash_area;
>>
>> +extern bool_t kexecing;
>> +
>> void set_kexec_crash_area_size(u64 system_ram);
>>
>> /* We have space for 4 images to support atomic update
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* RE: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-14 9:02 ` Jan Beulich
2011-06-14 9:59 ` Andrew Cooper
@ 2011-06-14 21:20 ` Kay, Allen M
2011-06-15 6:48 ` Jan Beulich
1 sibling, 1 reply; 43+ messages in thread
From: Kay, Allen M @ 2011-06-14 21:20 UTC (permalink / raw)
To: Jan Beulich, Andrew Cooper; +Cc: xen-devel@lists.xensource.com
> Let's see what our friends at Intel think...
IOMMU_WAIT_OP() has a timeout value of 1 second. When it times out, it will cause a system panic.
-----Original Message-----
From: Jan Beulich [mailto:JBeulich@novell.com]
Sent: Tuesday, June 14, 2011 2:02 AM
To: Andrew Cooper
Cc: Kay, Allen M; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Experimental evidence shows that Extended Interrupt Mode remains in
> effect even after Interrupt Remapping is disabled in each DMAR Global
> Command Register. A consiquence of this is that when we switch from
> x2apic mode back to xapic mode, and disable interrupt remapping for
> the kdump kernel, interrupts passing through the IO APICs are in
> x2apic format as opposed xapic. This causes a triple fault in the
> kexec kernel.
>
> As EIM is explicitly set up each time Interrup Remapping is enabled,
> it is safe for us to clobber this when taring down.
>
> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
> verbose and error-prone code, and was only used in 1 place before. We
> now have IRTA_EIME which is the specific bit in the register.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
>
> #ifdef CONFIG_X86
> /* set extended interrupt mode bit */
> - ir_ctrl->iremap_maddr |=
> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
> #endif
> spin_lock_irqsave(&iommu->register_lock, flags);
>
> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
> if ( !ecap_intr_remap(iommu->ecap) )
> return;
>
> + /* If we are disabling Interrupt Remapping, make sure we dont stay in
> + * Extended Interrupt Mode, as this is unaffected by the Interrupt
> + * Remapping flag in each DMAR Global Control Register.
> + * Specifically, local apics in xapic mode do not like interrupts
> delivered
> + * in x2apic mode. Any code turning interrupt remapping back on will
> set
> + * EIME back correctly.
> + */
> + if ( iommu_supports_eim() )
> + {
> + u64 irta;
> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
> + !(irta & IRTA_EIME), irta);
Hmm, this of course can be problematic in the context of a crash:
I realize that you want it cleared, but does system state guarantee
that this WAIT_OP will always be able to complete? You'd get a
hung system instead if it won't. Admittedly it's not clear which one
is better, but a best effort approach would probably skip the wait
loop. Let's see what our friends at Intel think...
Jan
> + }
> +
> spin_lock_irqsave(&iommu->register_lock, flags);
> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> if ( !(sts & DMA_GSTS_IRES) )
> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
> @@ -471,7 +471,7 @@ struct qinval_entry {
>
> #define IEC_GLOBAL_INVL 0
> #define IEC_INDEX_INVL 1
> -#define IRTA_REG_EIME_SHIFT 11
> +#define IRTA_EIME (1 << 11)
>
> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* RE: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-13 17:02 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Andrew Cooper
2011-06-14 9:02 ` Jan Beulich
@ 2011-06-14 21:45 ` Kay, Allen M
1 sibling, 0 replies; 43+ messages in thread
From: Kay, Allen M @ 2011-06-14 21:45 UTC (permalink / raw)
To: Andrew Cooper, xen-devel@lists.xensource.com; +Cc: Keir Fraser, Jan Beulich
> + /* If we are disabling Interrupt Remapping, make sure we dont stay in
> + * Extended Interrupt Mode, as this is unaffected by the Interrupt
> + * Remapping flag in each DMAR Global Control Register.
> + * Specifically, local apics in xapic mode do not like interrupts delivered
> + * in x2apic mode. Any code turning interrupt remapping back on will set
> + * EIME back correctly.
> + */
> + if ( iommu_supports_eim() )
> + {
> + u64 irta;
> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
> + !(irta & IRTA_EIME), irta);
> + }
> +
> spin_lock_irqsave(&iommu->register_lock, flags);
> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> if ( !(sts & DMA_GSTS_IRES) )
This new code should go after spin_lock_irqsave() call. You should also model your code after existing code that disables interrupt remapping in the same function, by checking whether EIME is enabled after the read before proceed to disable it. It's good to keep the software consistent in case we encounter any hardware bugs in the future.
> #define IEC_GLOBAL_INVL 0
> #define IEC_INDEX_INVL 1
> -#define IRTA_REG_EIME_SHIFT 11
> +#define IRTA_EIME (1 << 11)
Cast with u64 as follow like other defines in iommu.h.
#define IRTA_EIME (((u64)1) << 11)
Allen
^ permalink raw reply [flat|nested] 43+ messages in thread
* RE: [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
2011-06-13 17:02 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op Andrew Cooper
2011-06-14 12:10 ` Keir Fraser
@ 2011-06-14 22:15 ` Kay, Allen M
2011-06-15 13:06 ` Andrew Cooper
2011-06-15 15:00 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op [Reformatted] Andrew Cooper
1 sibling, 2 replies; 43+ messages in thread
From: Kay, Allen M @ 2011-06-14 22:15 UTC (permalink / raw)
To: Andrew Cooper, xen-devel@lists.xensource.com; +Cc: Keir Fraser, Jan Beulich
+static void vtd_crash_shutdown(void)
+{
+ struct acpi_drhd_unit *drhd;
+ struct iommu *iommu;
+
+ if ( !iommu_enabled )
+ return;
+
+ iommu_flush_all();
+
+ for_each_drhd_unit ( drhd )
+ {
+ iommu = drhd->iommu;
+ iommu_disable_translation(iommu);
+ }
+
+ iommu_disable_x2apic_IR();
+}
+
Iommu_disable_x2apic_IR() check for iommu_supports_eim() before entering. What happens when x2apic is not enabled but interrupt remapping is enabled?
Maybe you should just create disable_intremap() and disable_qi() functions and call from vtd_crash_shutdown() and iommu_disable_x2apic_IR().
Allen
^ permalink raw reply [flat|nested] 43+ messages in thread
* RE: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-14 21:20 ` Kay, Allen M
@ 2011-06-15 6:48 ` Jan Beulich
2011-06-15 7:45 ` Ian Campbell
0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2011-06-15 6:48 UTC (permalink / raw)
To: Andrew Cooper, Allen M Kay; +Cc: xen-devel@lists.xensource.com
>>> On 14.06.11 at 23:20, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
>> Let's see what our friends at Intel think...
>
> IOMMU_WAIT_OP() has a timeout value of 1 second. When it times out, it will
> cause a system panic.
Sort of bad when we're already bringing down the system possibly
because of a panic.
Jan
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@novell.com]
> Sent: Tuesday, June 14, 2011 2:02 AM
> To: Andrew Cooper
> Cc: Kay, Allen M; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended
> Interrupt Mode when disabling Interupt Remapping
>
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Experimental evidence shows that Extended Interrupt Mode remains in
>> effect even after Interrupt Remapping is disabled in each DMAR Global
>> Command Register. A consiquence of this is that when we switch from
>> x2apic mode back to xapic mode, and disable interrupt remapping for
>> the kdump kernel, interrupts passing through the IO APICs are in
>> x2apic format as opposed xapic. This causes a triple fault in the
>> kexec kernel.
>>
>> As EIM is explicitly set up each time Interrup Remapping is enabled,
>> it is safe for us to clobber this when taring down.
>>
>> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
>> verbose and error-prone code, and was only used in 1 place before. We
>> now have IRTA_EIME which is the specific bit in the register.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
>> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
>>
>> #ifdef CONFIG_X86
>> /* set extended interrupt mode bit */
>> - ir_ctrl->iremap_maddr |=
>> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
>> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
>> #endif
>> spin_lock_irqsave(&iommu->register_lock, flags);
>>
>> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
>> if ( !ecap_intr_remap(iommu->ecap) )
>> return;
>>
>> + /* If we are disabling Interrupt Remapping, make sure we dont stay in
>> + * Extended Interrupt Mode, as this is unaffected by the Interrupt
>> + * Remapping flag in each DMAR Global Control Register.
>> + * Specifically, local apics in xapic mode do not like interrupts
>> delivered
>> + * in x2apic mode. Any code turning interrupt remapping back on will
>> set
>> + * EIME back correctly.
>> + */
>> + if ( iommu_supports_eim() )
>> + {
>> + u64 irta;
>> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
>> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
>> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
>> + !(irta & IRTA_EIME), irta);
>
> Hmm, this of course can be problematic in the context of a crash:
> I realize that you want it cleared, but does system state guarantee
> that this WAIT_OP will always be able to complete? You'd get a
> hung system instead if it won't. Admittedly it's not clear which one
> is better, but a best effort approach would probably skip the wait
> loop. Let's see what our friends at Intel think...
>
> Jan
>
>> + }
>> +
>> spin_lock_irqsave(&iommu->register_lock, flags);
>> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
>> if ( !(sts & DMA_GSTS_IRES) )
>> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
>> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
>> @@ -471,7 +471,7 @@ struct qinval_entry {
>>
>> #define IEC_GLOBAL_INVL 0
>> #define IEC_INDEX_INVL 1
>> -#define IRTA_REG_EIME_SHIFT 11
>> +#define IRTA_EIME (1 << 11)
>>
>> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
>> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* RE: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
2011-06-15 6:48 ` Jan Beulich
@ 2011-06-15 7:45 ` Ian Campbell
2011-06-15 14:49 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping [Reformatted] Andrew Cooper
0 siblings, 1 reply; 43+ messages in thread
From: Ian Campbell @ 2011-06-15 7:45 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel@lists.xensource.com, Allen M Kay
On Wed, 2011-06-15 at 07:48 +0100, Jan Beulich wrote:
> >>> On 14.06.11 at 23:20, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
> >> Let's see what our friends at Intel think...
> >
> > IOMMU_WAIT_OP() has a timeout value of 1 second. When it times out, it will
> > cause a system panic.
>
> Sort of bad when we're already bringing down the system possibly
> because of a panic.
Right. I think we could do with an __IOMMU_WAIT_OP (if one doesn't
already exist) which just waits for the timeout and returns. There's no
harm in trying to continue with the kexec if it fails IMHO.
Ian.
>
> Jan
>
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@novell.com]
> > Sent: Tuesday, June 14, 2011 2:02 AM
> > To: Andrew Cooper
> > Cc: Kay, Allen M; xen-devel@lists.xensource.com
> > Subject: Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended
> > Interrupt Mode when disabling Interupt Remapping
> >
> >>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >> Experimental evidence shows that Extended Interrupt Mode remains in
> >> effect even after Interrupt Remapping is disabled in each DMAR Global
> >> Command Register. A consiquence of this is that when we switch from
> >> x2apic mode back to xapic mode, and disable interrupt remapping for
> >> the kdump kernel, interrupts passing through the IO APICs are in
> >> x2apic format as opposed xapic. This causes a triple fault in the
> >> kexec kernel.
> >>
> >> As EIM is explicitly set up each time Interrup Remapping is enabled,
> >> it is safe for us to clobber this when taring down.
> >>
> >> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
> >> verbose and error-prone code, and was only used in 1 place before. We
> >> now have IRTA_EIME which is the specific bit in the register.
> >>
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >>
> >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
> >> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
> >> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
> >> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
> >>
> >> #ifdef CONFIG_X86
> >> /* set extended interrupt mode bit */
> >> - ir_ctrl->iremap_maddr |=
> >> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
> >> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
> >> #endif
> >> spin_lock_irqsave(&iommu->register_lock, flags);
> >>
> >> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
> >> if ( !ecap_intr_remap(iommu->ecap) )
> >> return;
> >>
> >> + /* If we are disabling Interrupt Remapping, make sure we dont stay in
> >> + * Extended Interrupt Mode, as this is unaffected by the Interrupt
> >> + * Remapping flag in each DMAR Global Control Register.
> >> + * Specifically, local apics in xapic mode do not like interrupts
> >> delivered
> >> + * in x2apic mode. Any code turning interrupt remapping back on will
> >> set
> >> + * EIME back correctly.
> >> + */
> >> + if ( iommu_supports_eim() )
> >> + {
> >> + u64 irta;
> >> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
> >> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
> >> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
> >> + !(irta & IRTA_EIME), irta);
> >
> > Hmm, this of course can be problematic in the context of a crash:
> > I realize that you want it cleared, but does system state guarantee
> > that this WAIT_OP will always be able to complete? You'd get a
> > hung system instead if it won't. Admittedly it's not clear which one
> > is better, but a best effort approach would probably skip the wait
> > loop. Let's see what our friends at Intel think...
> >
> > Jan
> >
> >> + }
> >> +
> >> spin_lock_irqsave(&iommu->register_lock, flags);
> >> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> >> if ( !(sts & DMA_GSTS_IRES) )
> >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
> >> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
> >> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100
> >> @@ -471,7 +471,7 @@ struct qinval_entry {
> >>
> >> #define IEC_GLOBAL_INVL 0
> >> #define IEC_INDEX_INVL 1
> >> -#define IRTA_REG_EIME_SHIFT 11
> >> +#define IRTA_EIME (1 << 11)
> >>
> >> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
> >> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xensource.com
> >> http://lists.xensource.com/xen-devel
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag [Reformatted]
2011-06-14 9:46 ` Keir Fraser
@ 2011-06-15 11:01 ` Andrew Cooper
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 11:01 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 47 bytes --]
Reformatted as per Keir's suggestion.
~Andrew
[-- Attachment #2: kexec-fix-interrupts.patch --]
[-- Type: text/x-patch, Size: 1060 bytes --]
KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag
nmi_shootdown_cpus is part of the kexec path, coming from a panic, and
as such can be called both with interrupts enabled or disabled. We
really dont want to accidentally set IF.
Therefore, use save/restore in preference to disable/enable.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r a6f5c3a474e3 xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Wed Jun 15 11:47:12 2011 +0100
+++ b/xen/arch/x86/crash.c Wed Jun 15 11:57:12 2011 +0100
@@ -57,7 +57,7 @@ static void nmi_shootdown_cpus(void)
{
unsigned long msecs;
- local_irq_disable();
+ BUG_ON(local_irq_is_enabled());
crashing_cpu = smp_processor_id();
local_irq_count(crashing_cpu) = 0;
@@ -79,14 +79,14 @@ static void nmi_shootdown_cpus(void)
__stop_this_cpu();
disable_IO_APIC();
-
- local_irq_enable();
}
void machine_crash_shutdown(void)
{
crash_xen_info_t *info;
+ local_irq_disable();
+
nmi_shootdown_cpus();
info = kexec_crash_save_info();
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
2011-06-14 11:21 ` Jan Beulich
@ 2011-06-15 12:33 ` Andrew Cooper
2011-06-15 12:42 ` Keir Fraser
2011-06-15 12:50 ` Jan Beulich
0 siblings, 2 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 12:33 UTC (permalink / raw)
To: Jan Beulich; +Cc: Ian Campbell, xen-devel@lists.xensource.com
[-- Attachment #1: Type: text/plain, Size: 382 bytes --]
It turns out that we do require apic_boot_mode to be accessible outside
in a later reformatted patch so that cant be static any more.
Suggested changes for acpi_mode_to_str have been taken, along with
removing its declaration from apic.h and putting it at the top of apic.c
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
[-- Attachment #2: apic-record-boot-mode.patch --]
[-- Type: text/x-patch, Size: 4304 bytes --]
APIC: record local APIC state on boot
Xen does not store the boot local APIC state which leads to problems
when shutting down for a kexec jump. This patch records the boot
state so we can return to the boot state when kexecing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r e32752440713 xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c Wed Jun 15 12:02:13 2011 +0100
+++ b/xen/arch/x86/apic.c Wed Jun 15 13:30:08 2011 +0100
@@ -74,6 +74,11 @@ u8 __read_mostly apic_verbosity;
static bool_t __initdata opt_x2apic = 1;
boolean_param("x2apic", opt_x2apic);
+enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
+
+/* enum apic_mode -> str function for logging/debug */
+static const char * apic_mode_to_str(const enum apic_mode);
+
bool_t __read_mostly x2apic_enabled = 0;
bool_t __read_mostly directed_eoi_enabled = 0;
@@ -1438,6 +1443,62 @@ int __init APIC_init_uniprocessor (void)
return 0;
}
+/* Needs to be called during startup. It records the state the BIOS
+ * leaves the local APIC so we can undo upon kexec.
+ */
+void __init record_boot_APIC_mode(void)
+{
+ /* Sanity check - we should only ever run once, but could possibly
+ * be called several times */
+ if ( APIC_MODE_INVALID != apic_boot_mode )
+ return;
+
+ apic_boot_mode = current_local_apic_mode();
+
+ apic_printk(APIC_DEBUG, "APIC boot state is '%s'\n",
+ apic_mode_to_str(apic_boot_mode));
+}
+
+/* Look at the bits in MSR_IA32_APICBASE and work out which
+ * APIC mode we are in */
+enum apic_mode current_local_apic_mode(void)
+{
+ u64 msr_contents;
+
+ rdmsrl(MSR_IA32_APICBASE, msr_contents);
+
+ /* Reading EXTD bit from the MSR is only valid if CPUID
+ * says so, else reserved */
+ if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
+ && (msr_contents & MSR_IA32_APICBASE_EXTD) )
+ return APIC_MODE_X2APIC;
+
+ /* EN bit should always be valid as long as we can read the MSR
+ */
+ if ( msr_contents & MSR_IA32_APICBASE_ENABLE )
+ return APIC_MODE_XAPIC;
+
+ return APIC_MODE_DISABLED;
+}
+
+
+static const char * __init apic_mode_to_str(const enum apic_mode mode)
+{
+ switch ( mode )
+ {
+ case APIC_MODE_INVALID:
+ return "invalid";
+ case APIC_MODE_DISABLED:
+ return "disabled";
+ case APIC_MODE_XAPIC:
+ return "xapic";
+ case APIC_MODE_X2APIC:
+ return "x2apic";
+ default:
+ return "unrecognised";
+ }
+}
+
void check_for_unexpected_msi(unsigned int vector)
{
unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
diff -r e32752440713 xen/arch/x86/genapic/probe.c
--- a/xen/arch/x86/genapic/probe.c Wed Jun 15 12:02:13 2011 +0100
+++ b/xen/arch/x86/genapic/probe.c Wed Jun 15 13:30:08 2011 +0100
@@ -60,6 +60,8 @@ void __init generic_apic_probe(void)
{
int i, changed;
+ record_boot_APIC_mode();
+
check_x2apic_preenabled();
cmdline_apic = changed = (genapic != NULL);
diff -r e32752440713 xen/include/asm-x86/apic.h
--- a/xen/include/asm-x86/apic.h Wed Jun 15 12:02:13 2011 +0100
+++ b/xen/include/asm-x86/apic.h Wed Jun 15 13:30:08 2011 +0100
@@ -21,6 +21,20 @@
#define IO_APIC_REDIR_DEST_LOGICAL 0x00800
#define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
+/* Possible APIC states */
+enum apic_mode { APIC_MODE_INVALID, /* Not set yet */
+ APIC_MODE_DISABLED, /* If uniprocessor, or smp in
+ * uniprocessor mode */
+ APIC_MODE_XAPIC, /* xAPIC mode - default upon
+ * chipset reset */
+ APIC_MODE_X2APIC /* x2APIC mode - common for large
+ * smp machines */
+};
+
+/* Bootstrap processor local APIC boot mode - so we can undo our changes
+ * to the APIC state */
+extern enum apic_mode apic_boot_mode;
+
extern u8 apic_verbosity;
extern bool_t x2apic_enabled;
extern bool_t directed_eoi_enabled;
@@ -203,6 +217,8 @@ extern void disable_APIC_timer(void);
extern void enable_APIC_timer(void);
extern int lapic_suspend(void);
extern int lapic_resume(void);
+extern void record_boot_APIC_mode(void);
+extern enum apic_mode current_local_apic_mode(void);
extern int check_nmi_watchdog (void);
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
2011-06-15 12:33 ` [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted] Andrew Cooper
@ 2011-06-15 12:42 ` Keir Fraser
2011-06-15 13:38 ` Andrew Cooper
2011-06-15 12:50 ` Jan Beulich
1 sibling, 1 reply; 43+ messages in thread
From: Keir Fraser @ 2011-06-15 12:42 UTC (permalink / raw)
To: Andrew Cooper, Jan Beulich; +Cc: Ian Campbell, xen-devel@lists.xensource.com
On 15/06/2011 13:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> It turns out that we do require apic_boot_mode to be accessible outside
> in a later reformatted patch so that cant be static any more.
>
> Suggested changes for acpi_mode_to_str have been taken, along with
> removing its declaration from apic.h and putting it at the top of apic.c
I applied some of you series already to xen-unstable tip. Please re-send the
remaining bits you need as a new series against tip.
-- Keir
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
2011-06-15 12:33 ` [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted] Andrew Cooper
2011-06-15 12:42 ` Keir Fraser
@ 2011-06-15 12:50 ` Jan Beulich
1 sibling, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2011-06-15 12:50 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Ian Campbell, xen-devel@lists.xensource.com
>>> On 15.06.11 at 14:33, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> It turns out that we do require apic_boot_mode to be accessible outside
> in a later reformatted patch so that cant be static any more.
As said earlier, I think the code that accesses this variable really
belongs into apic.c (possibly into a new function).
> Suggested changes for acpi_mode_to_str have been taken, along with
> removing its declaration from apic.h and putting it at the top of apic.c
Thanks.
Jan
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
2011-06-14 12:10 ` Keir Fraser
@ 2011-06-15 12:50 ` Andrew Cooper
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 12:50 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com
On 14/06/11 13:10, Keir Fraser wrote:
> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
>> The kdump kernel has problems booting with interrupt/dma
>> remapping enabled, so we need a new iommu_ops called
>> crash_shutdown which is basically suspend but doesn't
>> need to bother saving state.
>>
>> Make sure that crash_shutdown is called on the kexec
>> path.
> What's wrong with the existing .suspend iommu callbacks? They look very
> similar (identical in the AMD case in fact). The extra
> iommu_disable_x2apic_IR() could be avoided in favour of the eisting one in
> the lapic_suspend() path, perhaps?
>
> -- Keir
>
For the Intel case, we don't care whether the user has forced iommu on
the comandline - we still need to disable it. Also, saving state is
pointless in the case of a crash. It also gives us a clean way to set
iommu_enabled = 0; which prevents unhelpful panics later on when
disable_IO_APICs times out because it is polling the wrong register.
For AMD, that is simply because AMD doesn't actually save state. (I
assume that the state is retained in the hardware, but I don't have any
AMD boxes I can test with)
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> diff -r 2635774346c4 -r 3ad737eb0a8d xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void)
>> msecs--;
>> }
>>
>> + /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
>> + * happy when booting if interrupt/dma remapping is still enabled */
>> + iommu_crash_shutdown();
>> +
>> __stop_this_cpu();
>> disable_IO_APIC();
>>
>> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/iommu_init.c
>> --- a/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -921,6 +921,14 @@ void amd_iommu_suspend(void)
>> disable_iommu(iommu);
>> }
>>
>> +void amd_iommu_crash_shutdown(void)
>> +{
>> + struct amd_iommu *iommu;
>> +
>> + for_each_amd_iommu ( iommu )
>> + disable_iommu(iommu);
>> +}
>> +
>> void amd_iommu_resume(void)
>> {
>> struct amd_iommu *iommu;
>> diff -r 2635774346c4 -r 3ad737eb0a8d
>> xen/drivers/passthrough/amd/pci_amd_iommu.c
>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011
>> +0100
>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011
>> +0100
>> @@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = {
>> .suspend = amd_iommu_suspend,
>> .resume = amd_iommu_resume,
>> .share_p2m = amd_iommu_share_p2m,
>> + .crash_shutdown = amd_iommu_crash_shutdown,
>> };
>> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/iommu.c
>> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -429,6 +429,14 @@ void iommu_share_p2m_table(struct domain
>> ops->share_p2m(d);
>> }
>>
>> +void iommu_crash_shutdown(void)
>> +{
>> + const struct iommu_ops *ops = iommu_get_ops();
>> + if ( ops && ops->crash_shutdown )
>> + ops->crash_shutdown();
>> + iommu_enabled = 0;
>> +}
>> +
>> /*
>> * Local variables:
>> * mode: C
>> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/vtd/iommu.c
>> --- a/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -2238,6 +2238,25 @@ static void vtd_suspend(void)
>> }
>> }
>>
>> +static void vtd_crash_shutdown(void)
>> +{
>> + struct acpi_drhd_unit *drhd;
>> + struct iommu *iommu;
>> +
>> + if ( !iommu_enabled )
>> + return;
>> +
>> + iommu_flush_all();
>> +
>> + for_each_drhd_unit ( drhd )
>> + {
>> + iommu = drhd->iommu;
>> + iommu_disable_translation(iommu);
>> + }
>> +
>> + iommu_disable_x2apic_IR();
>> +}
>> +
>> static void vtd_resume(void)
>> {
>> struct acpi_drhd_unit *drhd;
>> @@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops =
>> .suspend = vtd_suspend,
>> .resume = vtd_resume,
>> .share_p2m = iommu_set_pgd,
>> + .crash_shutdown = vtd_crash_shutdown,
>> };
>>
>> /*
>> diff -r 2635774346c4 -r 3ad737eb0a8d
>> xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
>> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011
>> +0100
>> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011
>> +0100
>> @@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_
>> /* power management support */
>> void amd_iommu_resume(void);
>> void amd_iommu_suspend(void);
>> +void amd_iommu_crash_shutdown(void);
>>
>> static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift)
>> {
>> diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/xen/iommu.h
>> --- a/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100
>> @@ -133,6 +133,7 @@ struct iommu_ops {
>> void (*suspend)(void);
>> void (*resume)(void);
>> void (*share_p2m)(struct domain *d);
>> + void (*crash_shutdown)(void);
>> };
>>
>> void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned
>> int value);
>> @@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un
>>
>> void iommu_suspend(void);
>> void iommu_resume(void);
>> +void iommu_crash_shutdown(void);
>>
>> void iommu_set_dom0_mapping(struct domain *d);
>> void iommu_share_p2m_table(struct domain *d);
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
2011-06-14 22:15 ` Kay, Allen M
@ 2011-06-15 13:06 ` Andrew Cooper
2011-06-15 16:39 ` Kay, Allen M
2011-06-15 15:00 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op [Reformatted] Andrew Cooper
1 sibling, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 13:06 UTC (permalink / raw)
To: Kay, Allen M; +Cc: Jan, xen-devel@lists.xensource.com, Keir Fraser, Beulich
On 14/06/11 23:15, Kay, Allen M wrote:
> +static void vtd_crash_shutdown(void)
> +{
> + struct acpi_drhd_unit *drhd;
> + struct iommu *iommu;
> +
> + if ( !iommu_enabled )
> + return;
> +
> + iommu_flush_all();
> +
> + for_each_drhd_unit ( drhd )
> + {
> + iommu = drhd->iommu;
> + iommu_disable_translation(iommu);
> + }
> +
> + iommu_disable_x2apic_IR();
> +}
> +
>
> Iommu_disable_x2apic_IR() check for iommu_supports_eim() before entering. What happens when x2apic is not enabled but interrupt remapping is enabled?
>
> Maybe you should just create disable_intremap() and disable_qi() functions and call from vtd_crash_shutdown() and iommu_disable_x2apic_IR().
>
> Allen
Well spotted - I missed that. My suggestion would be to remove the
check for eim and deal with it in the relevant disable_intremap and
disable_qi functions. My feeling is that a call to "iommu_disable_IR"
should be able to deal whether or not you have eim.
If there are no objections, I will go ahead and try this and integrate
it into the patch 5 of the series which is already dealing with eim, and
needs some refactoring following my chat with Ian Campbell this morning.
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
2011-06-15 12:42 ` Keir Fraser
@ 2011-06-15 13:38 ` Andrew Cooper
2011-06-15 14:49 ` Andrew Cooper
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 13:38 UTC (permalink / raw)
To: Keir Fraser; +Cc: Ian Campbell, xen-devel@lists.xensource.com, Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 888 bytes --]
On 15/06/11 13:42, Keir Fraser wrote:
> On 15/06/2011 13:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
>> It turns out that we do require apic_boot_mode to be accessible outside
>> in a later reformatted patch so that cant be static any more.
>>
>> Suggested changes for acpi_mode_to_str have been taken, along with
>> removing its declaration from apic.h and putting it at the top of apic.c
> I applied some of you series already to xen-unstable tip. Please re-send the
> remaining bits you need as a new series against tip.
>
> -- Keir
Ah - I am working against the wrong unstable. Here is the change
against staging unstable.
Also, is it wise having extern enum apic_mode
current_local_apic_mode(void); is apic.h if the function itself is
static inside apic.c?
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
[-- Attachment #2: apic-record-boot-mode.patch --]
[-- Type: text/x-patch, Size: 4304 bytes --]
APIC: record local APIC state on boot
Xen does not store the boot local APIC state which leads to problems
when shutting down for a kexec jump. This patch records the boot
state so we can return to the boot state when kexecing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r e32752440713 xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c Wed Jun 15 12:02:13 2011 +0100
+++ b/xen/arch/x86/apic.c Wed Jun 15 13:30:08 2011 +0100
@@ -74,6 +74,11 @@ u8 __read_mostly apic_verbosity;
static bool_t __initdata opt_x2apic = 1;
boolean_param("x2apic", opt_x2apic);
+enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
+
+/* enum apic_mode -> str function for logging/debug */
+static const char * apic_mode_to_str(const enum apic_mode);
+
bool_t __read_mostly x2apic_enabled = 0;
bool_t __read_mostly directed_eoi_enabled = 0;
@@ -1438,6 +1443,62 @@ int __init APIC_init_uniprocessor (void)
return 0;
}
+/* Needs to be called during startup. It records the state the BIOS
+ * leaves the local APIC so we can undo upon kexec.
+ */
+void __init record_boot_APIC_mode(void)
+{
+ /* Sanity check - we should only ever run once, but could possibly
+ * be called several times */
+ if ( APIC_MODE_INVALID != apic_boot_mode )
+ return;
+
+ apic_boot_mode = current_local_apic_mode();
+
+ apic_printk(APIC_DEBUG, "APIC boot state is '%s'\n",
+ apic_mode_to_str(apic_boot_mode));
+}
+
+/* Look at the bits in MSR_IA32_APICBASE and work out which
+ * APIC mode we are in */
+enum apic_mode current_local_apic_mode(void)
+{
+ u64 msr_contents;
+
+ rdmsrl(MSR_IA32_APICBASE, msr_contents);
+
+ /* Reading EXTD bit from the MSR is only valid if CPUID
+ * says so, else reserved */
+ if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
+ && (msr_contents & MSR_IA32_APICBASE_EXTD) )
+ return APIC_MODE_X2APIC;
+
+ /* EN bit should always be valid as long as we can read the MSR
+ */
+ if ( msr_contents & MSR_IA32_APICBASE_ENABLE )
+ return APIC_MODE_XAPIC;
+
+ return APIC_MODE_DISABLED;
+}
+
+
+static const char * __init apic_mode_to_str(const enum apic_mode mode)
+{
+ switch ( mode )
+ {
+ case APIC_MODE_INVALID:
+ return "invalid";
+ case APIC_MODE_DISABLED:
+ return "disabled";
+ case APIC_MODE_XAPIC:
+ return "xapic";
+ case APIC_MODE_X2APIC:
+ return "x2apic";
+ default:
+ return "unrecognised";
+ }
+}
+
void check_for_unexpected_msi(unsigned int vector)
{
unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
diff -r e32752440713 xen/arch/x86/genapic/probe.c
--- a/xen/arch/x86/genapic/probe.c Wed Jun 15 12:02:13 2011 +0100
+++ b/xen/arch/x86/genapic/probe.c Wed Jun 15 13:30:08 2011 +0100
@@ -60,6 +60,8 @@ void __init generic_apic_probe(void)
{
int i, changed;
+ record_boot_APIC_mode();
+
check_x2apic_preenabled();
cmdline_apic = changed = (genapic != NULL);
diff -r e32752440713 xen/include/asm-x86/apic.h
--- a/xen/include/asm-x86/apic.h Wed Jun 15 12:02:13 2011 +0100
+++ b/xen/include/asm-x86/apic.h Wed Jun 15 13:30:08 2011 +0100
@@ -21,6 +21,20 @@
#define IO_APIC_REDIR_DEST_LOGICAL 0x00800
#define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
+/* Possible APIC states */
+enum apic_mode { APIC_MODE_INVALID, /* Not set yet */
+ APIC_MODE_DISABLED, /* If uniprocessor, or smp in
+ * uniprocessor mode */
+ APIC_MODE_XAPIC, /* xAPIC mode - default upon
+ * chipset reset */
+ APIC_MODE_X2APIC /* x2APIC mode - common for large
+ * smp machines */
+};
+
+/* Bootstrap processor local APIC boot mode - so we can undo our changes
+ * to the APIC state */
+extern enum apic_mode apic_boot_mode;
+
extern u8 apic_verbosity;
extern bool_t x2apic_enabled;
extern bool_t directed_eoi_enabled;
@@ -203,6 +217,8 @@ extern void disable_APIC_timer(void);
extern void enable_APIC_timer(void);
extern int lapic_suspend(void);
extern int lapic_resume(void);
+extern void record_boot_APIC_mode(void);
+extern enum apic_mode current_local_apic_mode(void);
extern int check_nmi_watchdog (void);
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping [Reformatted]
2011-06-15 7:45 ` Ian Campbell
@ 2011-06-15 14:49 ` Andrew Cooper
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 14:49 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Allen M Kay, Jan Beulich
[-- Attachment #1: Type: text/plain, Size: 262 bytes --]
Reformatted for staging unstable following Allen's suggestions.
The problem about panicking on the kexec path will be fixed in a later
reformatted patch.
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
[-- Attachment #2: iommu-fix-disable-IR.patch --]
[-- Type: text/x-patch, Size: 3729 bytes --]
IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
Experimental evidence shows that Extended Interrupt Mode remains in
effect even after Interrupt Remapping is disabled in each DMAR Global
Command Register. A consiquence of this is that when we switch from
x2apic mode back to xapic mode, and disable interrupt remapping for
the kdump kernel, interrupts passing through the IO APICs are in
x2apic format as opposed xapic. This causes a triple fault in the
kexec kernel.
As EIM is explicitly set up each time Interrup Remapping is enabled,
it is safe for us to clobber this when taring down.
Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused
verbose and error-prone code, and was only used in 1 place before. We
now have IRTA_EIME which is the specific bit in the register.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 7925f452afb7 xen/drivers/passthrough/vtd/intremap.c
--- a/xen/drivers/passthrough/vtd/intremap.c Wed Jun 15 14:36:03 2011 +0100
+++ b/xen/drivers/passthrough/vtd/intremap.c Wed Jun 15 15:45:57 2011 +0100
@@ -811,27 +811,12 @@ int enable_intremap(struct iommu *iommu,
void disable_intremap(struct iommu *iommu)
{
u32 sts;
+ u64 irta;
unsigned long flags;
if ( !ecap_intr_remap(iommu->ecap) )
return;
- /* If we are disabling Interrupt Remapping, make sure we dont stay in
- * Extended Interrupt Mode, as this is unaffected by the Interrupt
- * Remapping flag in each DMAR Global Control Register.
- * Specifically, local apics in xapic mode do not like interrupts delivered
- * in x2apic mode. Any code turning interrupt remapping back on will set
- * EIME back correctly.
- */
- if ( iommu_supports_eim() )
- {
- u64 irta;
- irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
- dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
- IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
- !(irta & IRTA_EIME), irta);
- }
-
spin_lock_irqsave(&iommu->register_lock, flags);
sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
if ( !(sts & DMA_GSTS_IRES) )
@@ -841,6 +826,26 @@ void disable_intremap(struct iommu *iomm
IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, dmar_readl,
!(sts & DMA_GSTS_IRES), sts);
+
+ /* If we are disabling Interrupt Remapping, make sure we dont stay in
+ * Extended Interrupt Mode, as this is unaffected by the Interrupt
+ * Remapping flag in each DMAR Global Control Register.
+ * Specifically, local apics in xapic mode do not like interrupts delivered
+ * in x2apic mode. Any code turning interrupt remapping back on will set
+ * EIME back correctly.
+ */
+ if ( !ecap_eim(iommu->ecap) )
+ goto out;
+
+ /* Can't read the register unless we ecaps says we can */
+ irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
+ if ( !(irta & IRTA_EIME) )
+ goto out;
+
+ dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
+ IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
+ !(irta & IRTA_EIME), irta);
+
out:
spin_unlock_irqrestore(&iommu->register_lock, flags);
}
diff -r 7925f452afb7 xen/drivers/passthrough/vtd/iommu.h
--- a/xen/drivers/passthrough/vtd/iommu.h Wed Jun 15 14:36:03 2011 +0100
+++ b/xen/drivers/passthrough/vtd/iommu.h Wed Jun 15 15:45:57 2011 +0100
@@ -471,7 +471,7 @@ struct qinval_entry {
#define IEC_GLOBAL_INVL 0
#define IEC_INDEX_INVL 1
-#define IRTA_EIME (1 << 11)
+#define IRTA_EIME (((u64)1) << 11)
/* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
#define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 )
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
2011-06-15 13:38 ` Andrew Cooper
@ 2011-06-15 14:49 ` Andrew Cooper
0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 14:49 UTC (permalink / raw)
To: xen-devel
[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]
On 15/06/11 14:38, Andrew Cooper wrote:
>
> On 15/06/11 13:42, Keir Fraser wrote:
>> On 15/06/2011 13:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>>
>>> It turns out that we do require apic_boot_mode to be accessible outside
>>> in a later reformatted patch so that cant be static any more.
>>>
>>> Suggested changes for acpi_mode_to_str have been taken, along with
>>> removing its declaration from apic.h and putting it at the top of apic.c
>> I applied some of you series already to xen-unstable tip. Please re-send the
>> remaining bits you need as a new series against tip.
>>
>> -- Keir
> Ah - I am working against the wrong unstable. Here is the change
> against staging unstable.
>
> Also, is it wise having extern enum apic_mode
> current_local_apic_mode(void); is apic.h if the function itself is
> static inside apic.c?
>
Sorry - previous patch was still against the wrong repo. This is
correctly against staging.
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
[-- Attachment #2: apic-record-boot-mode.patch --]
[-- Type: text/x-patch, Size: 804 bytes --]
APIC: record local APIC state on boot
Xen does not store the boot local APIC state which leads to problems
when shutting down for a kexec jump. This patch records the boot
state so we can return to the boot state when kexecing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r cac82bc1ea23 xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c Wed Jun 15 13:33:58 2011 +0100
+++ b/xen/arch/x86/apic.c Wed Jun 15 14:36:02 2011 +0100
@@ -78,7 +78,7 @@ boolean_param("x2apic", opt_x2apic);
* Bootstrap processor local APIC boot mode - so we can undo our changes
* to the APIC state.
*/
-static enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
+enum apic_mode apic_boot_mode = APIC_MODE_INVALID;
bool_t __read_mostly x2apic_enabled = 0;
bool_t __read_mostly directed_eoi_enabled = 0;
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op [Reformatted]
2011-06-14 22:15 ` Kay, Allen M
2011-06-15 13:06 ` Andrew Cooper
@ 2011-06-15 15:00 ` Andrew Cooper
1 sibling, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2011-06-15 15:00 UTC (permalink / raw)
To: Kay, Allen M; +Cc: Jan, xen-devel@lists.xensource.com, Keir Fraser, Beulich
[-- Attachment #1: Type: text/plain, Size: 279 bytes --]
Reformatted patch which removes the crash_shutdown call to
iommu_disable_x2apic_IR and merges disable_intremap and disable_qinval
into the single for_each_dhrd_unit loop.
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
[-- Attachment #2: crash_shutdown_iommu_ops.patch --]
[-- Type: text/x-patch, Size: 4393 bytes --]
IOMMU: add crash_shutdown iommu_op
The kdump kernel has problems booting with interrupt/dma
remapping enabled, so we need a new iommu_ops called
crash_shutdown which is basically suspend but doesn't
need to bother saving state.
Make sure that crash_shutdown is called on the kexec
path.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 3b440fc31409 xen/arch/x86/crash.c
--- a/xen/arch/x86/crash.c Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/arch/x86/crash.c Wed Jun 15 15:55:20 2011 +0100
@@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void)
msecs--;
}
+ /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
+ * happy when booting if interrupt/dma remapping is still enabled */
+ iommu_crash_shutdown();
+
__stop_this_cpu();
disable_IO_APIC();
}
diff -r 3b440fc31409 xen/drivers/passthrough/amd/iommu_init.c
--- a/xen/drivers/passthrough/amd/iommu_init.c Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/drivers/passthrough/amd/iommu_init.c Wed Jun 15 15:55:20 2011 +0100
@@ -921,6 +921,14 @@ void amd_iommu_suspend(void)
disable_iommu(iommu);
}
+void amd_iommu_crash_shutdown(void)
+{
+ struct amd_iommu *iommu;
+
+ for_each_amd_iommu ( iommu )
+ disable_iommu(iommu);
+}
+
void amd_iommu_resume(void)
{
struct amd_iommu *iommu;
diff -r 3b440fc31409 xen/drivers/passthrough/amd/pci_amd_iommu.c
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Wed Jun 15 15:55:20 2011 +0100
@@ -456,4 +456,5 @@ const struct iommu_ops amd_iommu_ops = {
.suspend = amd_iommu_suspend,
.resume = amd_iommu_resume,
.share_p2m = amd_iommu_share_p2m,
+ .crash_shutdown = amd_iommu_crash_shutdown,
};
diff -r 3b440fc31409 xen/drivers/passthrough/iommu.c
--- a/xen/drivers/passthrough/iommu.c Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/drivers/passthrough/iommu.c Wed Jun 15 15:55:20 2011 +0100
@@ -432,6 +432,14 @@ void iommu_share_p2m_table(struct domain
ops->share_p2m(d);
}
+void iommu_crash_shutdown(void)
+{
+ const struct iommu_ops *ops = iommu_get_ops();
+ if ( ops && ops->crash_shutdown )
+ ops->crash_shutdown();
+ iommu_enabled = 0;
+}
+
/*
* Local variables:
* mode: C
diff -r 3b440fc31409 xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/drivers/passthrough/vtd/iommu.c Wed Jun 15 15:55:20 2011 +0100
@@ -2235,6 +2235,25 @@ static void vtd_suspend(void)
}
}
+static void vtd_crash_shutdown(void)
+{
+ struct acpi_drhd_unit *drhd;
+ struct iommu *iommu;
+
+ if ( !iommu_enabled )
+ return;
+
+ iommu_flush_all();
+
+ for_each_drhd_unit ( drhd )
+ {
+ iommu = drhd->iommu;
+ iommu_disable_translation(iommu);
+ disable_intremap(drhd->iommu);
+ disable_qinval(drhd->iommu);
+ }
+}
+
static void vtd_resume(void)
{
struct acpi_drhd_unit *drhd;
@@ -2286,6 +2305,7 @@ const struct iommu_ops intel_iommu_ops =
.suspend = vtd_suspend,
.resume = vtd_resume,
.share_p2m = iommu_set_pgd,
+ .crash_shutdown = vtd_crash_shutdown,
};
/*
diff -r 3b440fc31409 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Wed Jun 15 15:55:20 2011 +0100
@@ -102,6 +102,7 @@ extern void *shared_intremap_table;
/* power management support */
void amd_iommu_resume(void);
void amd_iommu_suspend(void);
+void amd_iommu_crash_shutdown(void);
static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift)
{
diff -r 3b440fc31409 xen/include/xen/iommu.h
--- a/xen/include/xen/iommu.h Wed Jun 15 15:45:58 2011 +0100
+++ b/xen/include/xen/iommu.h Wed Jun 15 15:55:20 2011 +0100
@@ -133,6 +133,7 @@ struct iommu_ops {
void (*suspend)(void);
void (*resume)(void);
void (*share_p2m)(struct domain *d);
+ void (*crash_shutdown)(void);
};
void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
@@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un
void iommu_suspend(void);
void iommu_resume(void);
+void iommu_crash_shutdown(void);
void iommu_set_dom0_mapping(struct domain *d);
void iommu_share_p2m_table(struct domain *d);
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 43+ messages in thread
* RE: [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
2011-06-15 13:06 ` Andrew Cooper
@ 2011-06-15 16:39 ` Kay, Allen M
0 siblings, 0 replies; 43+ messages in thread
From: Kay, Allen M @ 2011-06-15 16:39 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Jan, xen-devel@lists.xensource.com, Keir Fraser, Beulich
> Well spotted - I missed that. My suggestion would be to remove the
> check for eim and deal with it in the relevant disable_intremap and
> disable_qi functions. My feeling is that a call to "iommu_disable_IR"
> should be able to deal whether or not you have eim.
>
I agree with your suggestion of not checking for EIM in iommu_disable_IR().
Allen
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0 of 7] Fix kexec in Xen (take 4)
2011-06-13 18:15 ` [PATCH 0 of 7] Fix kexec in Xen (take 4) Keir Fraser
@ 2011-06-16 13:05 ` Andrew Cooper
2011-06-16 13:13 ` Keir Fraser
0 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2011-06-16 13:05 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com
One final thing to say, These fixes are candidates for backport to 4.1
as the kexec path is still broken there.
~Andrew
On 13/06/11 19:15, Keir Fraser wrote:
> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>
>> This set of patches are partly bugfixed to problems I have noticed while
>> working on this area of the codebase, and targeted fixes to get the kexec path
>> working again on Xen 4.x
>>
>> The first three patches are small bugfixes which are not directly related to
>> the kexec problems, but are on the codepath.
>>
>> The next four patches are directly related to fixing the kexec path.
>>
>> These patches cause xen to track the BSP local APIC boot state and return to
>> it before kexec'ing to a new kernel. This prevents the kdump kernel falling
>> over itself when booting in x2apic mode while expecting to be in xapic mode.
>>
>> It also makes sure to disable IO virtualisation, along with fixing the current
>> codepath related to disabling Interrup Remapping on Intel VTD boxes.
> Overall looking much better! I will take a look at the patches individually,
> and I expect Jan will be looking too. I think we may have a largely
> check-in-able series here. :-)
>
> -- Keir
>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH 0 of 7] Fix kexec in Xen (take 4)
2011-06-16 13:05 ` Andrew Cooper
@ 2011-06-16 13:13 ` Keir Fraser
0 siblings, 0 replies; 43+ messages in thread
From: Keir Fraser @ 2011-06-16 13:13 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel@lists.xensource.com
On 16/06/2011 14:05, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> One final thing to say, These fixes are candidates for backport to 4.1
> as the kexec path is still broken there.
Since I think you ended up with their scope limited to the kexec paths only,
that should be doable.
-- Keir
> ~Andrew
>
> On 13/06/11 19:15, Keir Fraser wrote:
>> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
>>
>>> This set of patches are partly bugfixed to problems I have noticed while
>>> working on this area of the codebase, and targeted fixes to get the kexec
>>> path
>>> working again on Xen 4.x
>>>
>>> The first three patches are small bugfixes which are not directly related to
>>> the kexec problems, but are on the codepath.
>>>
>>> The next four patches are directly related to fixing the kexec path.
>>>
>>> These patches cause xen to track the BSP local APIC boot state and return to
>>> it before kexec'ing to a new kernel. This prevents the kdump kernel falling
>>> over itself when booting in x2apic mode while expecting to be in xapic mode.
>>>
>>> It also makes sure to disable IO virtualisation, along with fixing the
>>> current
>>> codepath related to disabling Interrup Remapping on Intel VTD boxes.
>> Overall looking much better! I will take a look at the patches individually,
>> and I expect Jan will be looking too. I think we may have a largely
>> check-in-able series here. :-)
>>
>> -- Keir
>>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xensource.com
>>> http://lists.xensource.com/xen-devel
>>
^ permalink raw reply [flat|nested] 43+ messages in thread
end of thread, other threads:[~2011-06-16 13:13 UTC | newest]
Thread overview: 43+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-13 17:02 [PATCH 0 of 7] Fix kexec in Xen (take 4) Andrew Cooper
2011-06-13 17:02 ` [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown Andrew Cooper
2011-06-14 8:44 ` Jan Beulich
2011-06-14 9:44 ` Andrew Cooper
2011-06-13 17:02 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
2011-06-14 8:46 ` Jan Beulich
2011-06-14 9:46 ` Keir Fraser
2011-06-15 11:01 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag [Reformatted] Andrew Cooper
2011-06-14 9:51 ` [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn't look after the interrupt flag Andrew Cooper
2011-06-13 17:02 ` [PATCH 3 of 7] IOMMU: Sanitise pointer work Andrew Cooper
2011-06-13 18:13 ` Keir Fraser
2011-06-14 9:53 ` Andrew Cooper
2011-06-14 11:51 ` Keir Fraser
2011-06-13 17:02 ` [PATCH 4 of 7] APIC: record local APIC state on boot Andrew Cooper
2011-06-14 8:57 ` Jan Beulich
2011-06-14 10:48 ` Ian Campbell
2011-06-14 11:21 ` Jan Beulich
2011-06-15 12:33 ` [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted] Andrew Cooper
2011-06-15 12:42 ` Keir Fraser
2011-06-15 13:38 ` Andrew Cooper
2011-06-15 14:49 ` Andrew Cooper
2011-06-15 12:50 ` Jan Beulich
2011-06-13 17:02 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Andrew Cooper
2011-06-14 9:02 ` Jan Beulich
2011-06-14 9:59 ` Andrew Cooper
2011-06-14 21:20 ` Kay, Allen M
2011-06-15 6:48 ` Jan Beulich
2011-06-15 7:45 ` Ian Campbell
2011-06-15 14:49 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping [Reformatted] Andrew Cooper
2011-06-14 21:45 ` [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping Kay, Allen M
2011-06-13 17:02 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op Andrew Cooper
2011-06-14 12:10 ` Keir Fraser
2011-06-15 12:50 ` Andrew Cooper
2011-06-14 22:15 ` Kay, Allen M
2011-06-15 13:06 ` Andrew Cooper
2011-06-15 16:39 ` Kay, Allen M
2011-06-15 15:00 ` [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op [Reformatted] Andrew Cooper
2011-06-13 17:02 ` [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing Andrew Cooper
2011-06-14 12:11 ` Keir Fraser
2011-06-14 13:05 ` Andrew Cooper
2011-06-13 18:15 ` [PATCH 0 of 7] Fix kexec in Xen (take 4) Keir Fraser
2011-06-16 13:05 ` Andrew Cooper
2011-06-16 13:13 ` Keir Fraser
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.