From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
Date: Wed, 18 May 2011 21:48:36 +0100 [thread overview]
Message-ID: <4DD430A4.5000305@citrix.com> (raw)
In-Reply-To: <20110518184946.GB14013@dumpdata.com>
On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote:
> On Wed, May 18, 2011 at 07:08:16PM +0100, Andrew Cooper wrote:
>> kdump kernels are unable to boot with IOMMU enabled,
>> this patch disabled IOMMU mode and removes some of the generic
>> code from the shutdown path which doesnt work after other
>> CPUs have been shot down.
>>
>> Also, leave local interrupts disabled when jumping into pugatory
> purgatory?
purgatory is the bit of code which the a crashing kernel jumps into,
which pretends to do minimal bootloader things to book the kdump
kernel. It is part of the kexec-tools package.
>> as we have no idea whats in there and really dont want to be
>> servicing interrupts when our entire state is invalid.
>>
>> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com>
>>
>> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100
>> +++ b/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100
>> @@ -27,6 +27,8 @@
>> #include<asm/hvm/support.h>
>> #include<asm/apic.h>
>> #include<asm/io_apic.h>
>> +#include<xen/iommu.h>
>> +#include<asm/hvm/iommu.h>
>>
>> static atomic_t waiting_for_crash_ipi;
>> static unsigned int crashing_cpu;
>> @@ -43,7 +45,10 @@ static int crash_nmi_callback(struct cpu
>>
>> kexec_crash_save_cpu();
>>
>> - __stop_this_cpu();
>> + disable_local_APIC();
>> + hvm_cpu_down();
>> + clts();
>> + asm volatile ( "fninit" );
> Can you provide a comment why you are using fninit and clt?
> Is this what the Linux kernel does too when it goes through the kexec path?
I was replacing __stop_this_cpu() with the safe subset of its contents -
it was a verbatim copy minus the SMP stuff which the regular
__stop_this_cpu() does. I suppose I could have split __stop_this_cpu()
to __crash_stop_this_cpu() but it didn't seem worth making such a
trivially small function.
>>
>> atomic_dec(&waiting_for_crash_ipi);
>>
>> @@ -56,6 +61,7 @@ static int crash_nmi_callback(struct cpu
>> static void nmi_shootdown_cpus(void)
>> {
>> unsigned long msecs;
>> + u64 msr_contents;
>>
>> local_irq_disable();
>>
>> @@ -77,18 +83,43 @@ static void nmi_shootdown_cpus(void)
>> msecs--;
>> }
>>
>> - __stop_this_cpu();
>> + disable_local_APIC();
>> + hvm_cpu_down();
>> + clts();
>> + asm volatile ( "fninit" );
>> +
>> + /* This is a bit of a hack but there is no other way to shutdown correctly
>> + * without a significant refactoring of the APIC code */
>> + rdmsrl(MSR_IA32_APICBASE, msr_contents);
>> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
>> +&& (msr_contents& MSR_IA32_APICBASE_EXTD) )
>> + x2apic_enabled = 1;
>> + else
>> + x2apic_enabled = 0;
>> +
>> disable_IO_APIC();
>> -
>> - local_irq_enable();
> Why?
Because that local_irq_enable() results in the interrupt flag being set
when jumping into purgatory, which (at the moment) doesn't touch
interrupts at all. The result is that interrupts from PCI devices which
are unaware of the crash are (potentially) being serviced by the xen
handlers even though we have left the Xen context for good.
>> }
>>
>> void machine_crash_shutdown(void)
>> {
>> crash_xen_info_t *info;
>> + const struct iommu_ops * ops;
>>
>> nmi_shootdown_cpus();
>>
>> + /* Yes i know this is hacky but it is the easiest solution. I should add an iommu_ops
>> + * function called crash() or so which just disables the iommu 'fun' without saving state
>> + */
>> + ops = iommu_get_ops();
>> + if(ops)
>> + ops->suspend();
> Uh, no checking if ops->suspend exists?
>
True - at the moment both intel and amd iommu_ops are fully implemented
but I will add an extra condition to the if statement.
>> +
>> + /* Yes i know this is from driver/passthrough/vtd/ but it appears to be architecture
>> + * independant, and also bears little/no relation to x2apic. Needs cleaning up
> What about AMD VI IOMMUs? Does it work when that IOMMU is used?
>
It worked on the AMD box I tested the code on. Like the comment says -
as far as I can tell, it is architecture independent code.
>> + */
>> + iommu_disable_x2apic_IR();
> Can't that function be done in the suspend code of the IOMMU?
There is a comment in iommu suspend stating that it cant and isn't done,
but rather is left for the local/ioapic_suspend functions which dont
properly work in the kexec path.
>> +
>> +
>> info = kexec_crash_save_info();
>> info->xen_phys_start = xen_phys_start;
>> info->dom0_pfn_to_mfn_frame_list_list =
>> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/hpet.c
>> --- a/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100
>> +++ b/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100
>> @@ -670,6 +670,33 @@ void hpet_disable_legacy_broadcast(void)
>> smp_send_event_check_mask(&cpu_online_map);
>> }
>>
>> +/* This function is similar to the regular
>> + * hpet_disable_legacy_broadcast function, except it is called
>> + * on the crash path with only the current processor up, so we
>> + * can forget the locks and really cant send an event check IPI
>> + * to the other processors */
>> +void crash_hpet_disable_legacy_broadcast(void)
>> +{
>> + u32 cfg;
>> +
>> + if ( !hpet_events || !(hpet_events->flags& HPET_EVT_LEGACY) )
>> + return;
>> +
>> + hpet_events->flags |= HPET_EVT_DISABLE;
>> +
>> + /* disable HPET T0 */
>> + cfg = hpet_read32(HPET_Tn_CFG(0));
>> + cfg&= ~HPET_TN_ENABLE;
>> + hpet_write32(cfg, HPET_Tn_CFG(0));
>> +
>> + /* Stop HPET legacy interrupts */
>> + cfg = hpet_read32(HPET_CFG);
>> + cfg&= ~HPET_CFG_LEGACY;
>> + hpet_write32(cfg, HPET_CFG);
>> +
>> +}
>> +
>> +
>> void hpet_broadcast_enter(void)
>> {
>> unsigned int cpu = smp_processor_id();
>> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/machine_kexec.c
>> --- a/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100
>> +++ b/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100
>> @@ -97,7 +97,7 @@ void machine_kexec(xen_kexec_image_t *im
>> };
>>
>> if ( hpet_broadcast_is_available() )
>> - hpet_disable_legacy_broadcast();
>> + crash_hpet_disable_legacy_broadcast();
>>
>> /*
>> * compat_machine_kexec() returns to idle pagetables, which requires us
>> diff -r e80b5280fe2f -r aaf44d1a903d xen/include/asm-x86/hpet.h
>> --- a/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100
>> +++ b/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100
>> @@ -73,5 +73,6 @@ void hpet_broadcast_enter(void);
>> void hpet_broadcast_exit(void);
>> int hpet_broadcast_is_available(void);
>> void hpet_disable_legacy_broadcast(void);
>> +void crash_hpet_disable_legacy_broadcast(void);
>>
>> #endif /* __X86_HPET_H__ */
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
next prev parent reply other threads:[~2011-05-18 20:48 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-05-18 18:08 [PATCH 0 of 3] Fix kexec path in xen (take 2) Andrew Cooper
2011-05-18 18:08 ` [PATCH 1 of 3] apic: record local apic state on boot Andrew Cooper
2011-05-18 18:49 ` Konrad Rzeszutek Wilk
2011-05-18 20:27 ` Andrew Cooper
2011-05-18 20:43 ` Konrad Rzeszutek Wilk
2011-05-19 0:56 ` Tian, Kevin
2011-05-19 8:34 ` Ian Campbell
2011-05-19 16:21 ` Konrad Rzeszutek Wilk
2011-05-19 0:54 ` Tian, Kevin
2011-05-18 23:40 ` Keir Fraser
2011-05-19 11:14 ` Andrew Cooper
2011-05-19 14:26 ` Konrad Rzeszutek Wilk
2011-05-19 14:43 ` Andrew Cooper
2011-05-18 18:08 ` [PATCH 2 of 3] apic: remove 'enabled_via_apicbase' variable Andrew Cooper
2011-05-18 18:53 ` Konrad Rzeszutek Wilk
2011-05-18 20:35 ` Andrew Cooper
2011-05-18 20:45 ` Konrad Rzeszutek Wilk
2011-05-19 3:31 ` Tian, Kevin
2011-05-18 18:08 ` [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel Andrew Cooper
2011-05-18 18:49 ` Konrad Rzeszutek Wilk
2011-05-18 20:48 ` Andrew Cooper [this message]
2011-05-18 20:57 ` Konrad Rzeszutek Wilk
2011-05-18 21:24 ` Andrew Cooper
2011-05-19 14:32 ` Konrad Rzeszutek Wilk
2011-05-20 0:33 ` Kay, Allen M
2011-05-20 21:55 ` Kay, Allen M
2011-05-18 18:39 ` [PATCH 0 of 3] Fix kexec path in xen (take 2) Konrad Rzeszutek Wilk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DD430A4.5000305@citrix.com \
--to=andrew.cooper3@citrix.com \
--cc=konrad.wilk@oracle.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).