* [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2)
@ 2010-04-03 1:27 Chris Wright
2010-04-03 1:27 ` [PATCH 1/4] x86/amd-iommu: enable iommu before attaching devices Chris Wright
` (4 more replies)
0 siblings, 5 replies; 27+ messages in thread
From: Chris Wright @ 2010-04-03 1:27 UTC (permalink / raw)
To: Joerg Roedel
Cc: nhorman, nhorman, vgoyal, hbabu, ebiederm, iommu, kexec,
linux-kernel
Series includes the following patches:
x86/amd-iommu: enable iommu before attaching devices
x86/amd-iommu: warn when issuing command to uninitialized cmd buffer
Revert "x86: disable IOMMUs on kernel crash"
x86/amd-iommu: use for_each_pci_dev
First one is the primary bug fix.
v2
- add disable_iommus on failure path
- move removal of CMD_BUFFER_UNINITIALIZED to iommu_enable_command_buffer()
- fix ";;" typo in patch 2
- add Revert "x86: disable IOMMUs on kernel crash"
- add x86/amd-iommu: use for_each_pci_dev
thanks,
-chris
^ permalink raw reply [flat|nested] 27+ messages in thread* [PATCH 1/4] x86/amd-iommu: enable iommu before attaching devices 2010-04-03 1:27 [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Chris Wright @ 2010-04-03 1:27 ` Chris Wright 2010-04-03 1:27 ` [PATCH 2/4] x86/amd-iommu: warn when issuing command to uninitialized cmd buffer Chris Wright ` (3 subsequent siblings) 4 siblings, 0 replies; 27+ messages in thread From: Chris Wright @ 2010-04-03 1:27 UTC (permalink / raw) To: Joerg Roedel Cc: nhorman, nhorman, vgoyal, hbabu, ebiederm, iommu, kexec, linux-kernel [-- Attachment #1: amd-enable-iommu-earlier.patch --] [-- Type: text/plain, Size: 1364 bytes --] Hit another kdump problem as reported by Neil Horman. When initializaing the IOMMU, we attach devices to their domains before the IOMMU is fully (re)initialized. Attaching a device will issue some important invalidations. In the context of the newly kexec'd kdump kernel, the IOMMU may have stale cached data from the original kernel. Because we do the attach too early, the invalidation commands are placed in the new command buffer before the IOMMU is updated w/ that buffer. This leaves the stale entries in the kdump context and can renders device unusable. Simply enable the IOMMU before we do the attach. Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> --- arch/x86/kernel/amd_iommu_init.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/arch/x86/kernel/amd_iommu_init.c +++ b/arch/x86/kernel/amd_iommu_init.c @@ -1288,6 +1288,8 @@ static int __init amd_iommu_init(void) if (ret) goto free; + enable_iommus(); + if (iommu_pass_through) ret = amd_iommu_init_passthrough(); else @@ -1300,8 +1302,6 @@ static int __init amd_iommu_init(void) amd_iommu_init_notifier(); - enable_iommus(); - if (iommu_pass_through) goto out; @@ -1315,6 +1315,7 @@ out: return ret; free: + disable_iommus(); amd_iommu_uninit_devices(); ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 2/4] x86/amd-iommu: warn when issuing command to uninitialized cmd buffer 2010-04-03 1:27 [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Chris Wright 2010-04-03 1:27 ` [PATCH 1/4] x86/amd-iommu: enable iommu before attaching devices Chris Wright @ 2010-04-03 1:27 ` Chris Wright 2010-04-03 1:27 ` [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" Chris Wright ` (2 subsequent siblings) 4 siblings, 0 replies; 27+ messages in thread From: Chris Wright @ 2010-04-03 1:27 UTC (permalink / raw) To: Joerg Roedel Cc: nhorman, nhorman, vgoyal, hbabu, ebiederm, iommu, kexec, linux-kernel [-- Attachment #1: amd-warn-on-early-cmd-buffer.patch --] [-- Type: text/plain, Size: 1944 bytes --] To catch future potential issues we can add a warning whenever we issue a command before the command buffer is fully initialized. Signed-off-by: Chris Wright <chrisw@sous-sol.org> --- Or not...this is just if you think it's useful ;-) arch/x86/include/asm/amd_iommu_types.h | 1 + arch/x86/kernel/amd_iommu.c | 1 + arch/x86/kernel/amd_iommu_init.c | 5 +++-- 3 files changed, 5 insertions(+), 2 deletions(-) --- a/arch/x86/include/asm/amd_iommu_types.h +++ b/arch/x86/include/asm/amd_iommu_types.h @@ -140,6 +140,7 @@ /* constants to configure the command buffer */ #define CMD_BUFFER_SIZE 8192 +#define CMD_BUFFER_UNINITIALIZED 1 #define CMD_BUFFER_ENTRIES 512 #define MMIO_CMD_SIZE_SHIFT 56 #define MMIO_CMD_SIZE_512 (0x9ULL << MMIO_CMD_SIZE_SHIFT) --- a/arch/x86/kernel/amd_iommu.c +++ b/arch/x86/kernel/amd_iommu.c @@ -392,6 +392,7 @@ static int __iommu_queue_command(struct u32 tail, head; u8 *target; + WARN_ON(iommu->cmd_buf_size & CMD_BUFFER_UNINITIALIZED); tail = readl(iommu->mmio_base + MMIO_CMD_TAIL_OFFSET); target = iommu->cmd_buf + tail; memcpy_toio(target, cmd, sizeof(*cmd)); --- a/arch/x86/kernel/amd_iommu_init.c +++ b/arch/x86/kernel/amd_iommu_init.c @@ -436,7 +436,7 @@ static u8 * __init alloc_command_buffer( if (cmd_buf == NULL) return NULL; - iommu->cmd_buf_size = CMD_BUFFER_SIZE; + iommu->cmd_buf_size = CMD_BUFFER_SIZE | CMD_BUFFER_UNINITIALIZED; return cmd_buf; } @@ -472,12 +472,13 @@ static void iommu_enable_command_buffer( &entry, sizeof(entry)); amd_iommu_reset_cmd_buffer(iommu); + iommu->cmd_buf_size &= ~(CMD_BUFFER_UNINITIALIZED); } static void __init free_command_buffer(struct amd_iommu *iommu) { free_pages((unsigned long)iommu->cmd_buf, - get_order(iommu->cmd_buf_size)); + get_order(iommu->cmd_buf_size & ~(CMD_BUFFER_UNINITIALIZED))); } /* allocates the memory where the IOMMU will log its events to */ ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 1:27 [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Chris Wright 2010-04-03 1:27 ` [PATCH 1/4] x86/amd-iommu: enable iommu before attaching devices Chris Wright 2010-04-03 1:27 ` [PATCH 2/4] x86/amd-iommu: warn when issuing command to uninitialized cmd buffer Chris Wright @ 2010-04-03 1:27 ` Chris Wright 2010-04-03 17:22 ` Joerg Roedel 2010-04-03 17:41 ` Joerg Roedel 2010-04-03 1:27 ` [PATCH 4/4] x86/amd-iommu: use for_each_pci_dev Chris Wright 2010-04-07 10:05 ` [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Joerg Roedel 4 siblings, 2 replies; 27+ messages in thread From: Chris Wright @ 2010-04-03 1:27 UTC (permalink / raw) To: Joerg Roedel Cc: nhorman, nhorman, vgoyal, hbabu, ebiederm, iommu, kexec, linux-kernel [-- Attachment #1: crash-leave-iommu-enabled.patch --] [-- Type: text/plain, Size: 1017 bytes --] This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488. Disabling the IOMMU can potetially allow DMA transactions to complete without being translated. Leave it enabled, and allow crash kernel to do the IOMMU reinitialization properly. Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> --- arch/x86/kernel/crash.c | 6 ------ 1 file changed, 6 deletions(-) --- a/arch/x86/kernel/crash.c +++ b/arch/x86/kernel/crash.c @@ -27,7 +27,6 @@ #include <asm/cpu.h> #include <asm/reboot.h> #include <asm/virtext.h> -#include <asm/x86_init.h> #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC) @@ -103,10 +102,5 @@ void native_machine_crash_shutdown(struc #ifdef CONFIG_HPET_TIMER hpet_disable(); #endif - -#ifdef CONFIG_X86_64 - x86_platform.iommu_shutdown(); -#endif - crash_save_cpu(regs, safe_smp_processor_id()); } ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 1:27 ` [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" Chris Wright @ 2010-04-03 17:22 ` Joerg Roedel 2010-04-03 17:44 ` Eric W. Biederman 2010-04-03 17:41 ` Joerg Roedel 1 sibling, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2010-04-03 17:22 UTC (permalink / raw) To: Chris Wright Cc: Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, ebiederm, vgoyal On Fri, Apr 02, 2010 at 06:27:54PM -0700, Chris Wright wrote: > This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488. > > Disabling the IOMMU can potetially allow DMA transactions to > complete without being translated. Leave it enabled, and allow > crash kernel to do the IOMMU reinitialization properly. > > Cc: Joerg Roedel <joerg.roedel@amd.com> > Cc: Eric Biederman <ebiederm@xmission.com> > Cc: Neil Horman <nhorman@tuxdriver.com> > Cc: Vivek Goyal <vgoyal@redhat.com> > Signed-off-by: Chris Wright <chrisw@sous-sol.org> > --- > arch/x86/kernel/crash.c | 6 ------ > 1 file changed, 6 deletions(-) > > --- a/arch/x86/kernel/crash.c > +++ b/arch/x86/kernel/crash.c > @@ -27,7 +27,6 @@ > #include <asm/cpu.h> > #include <asm/reboot.h> > #include <asm/virtext.h> > -#include <asm/x86_init.h> > > #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC) > > @@ -103,10 +102,5 @@ void native_machine_crash_shutdown(struc > #ifdef CONFIG_HPET_TIMER > hpet_disable(); > #endif > - > -#ifdef CONFIG_X86_64 > - x86_platform.iommu_shutdown(); > -#endif > - > crash_save_cpu(regs, safe_smp_processor_id()); Hmm, I think for this we need to change the gart code too and disable the gart before its initialization runs to not re-introduce issues fixed in commit bc2cea6a34fdb30f118ec75db39a46a191870607, no? Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 17:22 ` Joerg Roedel @ 2010-04-03 17:44 ` Eric W. Biederman 2010-04-04 8:44 ` Joerg Roedel 0 siblings, 1 reply; 27+ messages in thread From: Eric W. Biederman @ 2010-04-03 17:44 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal Joerg Roedel <joro@8bytes.org> writes: > On Fri, Apr 02, 2010 at 06:27:54PM -0700, Chris Wright wrote: >> This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488. >> >> Disabling the IOMMU can potetially allow DMA transactions to >> complete without being translated. Leave it enabled, and allow >> crash kernel to do the IOMMU reinitialization properly. >> >> Cc: Joerg Roedel <joerg.roedel@amd.com> >> Cc: Eric Biederman <ebiederm@xmission.com> >> Cc: Neil Horman <nhorman@tuxdriver.com> >> Cc: Vivek Goyal <vgoyal@redhat.com> >> Signed-off-by: Chris Wright <chrisw@sous-sol.org> >> --- >> arch/x86/kernel/crash.c | 6 ------ >> 1 file changed, 6 deletions(-) >> >> --- a/arch/x86/kernel/crash.c >> +++ b/arch/x86/kernel/crash.c >> @@ -27,7 +27,6 @@ >> #include <asm/cpu.h> >> #include <asm/reboot.h> >> #include <asm/virtext.h> >> -#include <asm/x86_init.h> >> >> #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC) >> >> @@ -103,10 +102,5 @@ void native_machine_crash_shutdown(struc >> #ifdef CONFIG_HPET_TIMER >> hpet_disable(); >> #endif >> - >> -#ifdef CONFIG_X86_64 >> - x86_platform.iommu_shutdown(); >> -#endif >> - >> crash_save_cpu(regs, safe_smp_processor_id()); > > Hmm, I think for this we need to change the gart code too and disable > the gart before its initialization runs to not re-introduce issues fixed > in commit bc2cea6a34fdb30f118ec75db39a46a191870607, no? That is a different code path with a different set of assumptions and restrictions. On a normal kexec of course we want to do an orderly shutdown. For the gart with a little luck we can just ignore it on kexec on panic. Unlike a virtualization capable iommu it doesn't prevent access to devices, when it is enabled. Worst case is that we have to start including iommu=off for gart systems. The best case is that we can figure out how to have the gart code reinitialize itself sanely, starting from some arbitrary point. machine_crash_shutdown is about doing those things that we can not do in any other way. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 17:44 ` Eric W. Biederman @ 2010-04-04 8:44 ` Joerg Roedel 2010-04-04 9:16 ` Eric W. Biederman 0 siblings, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2010-04-04 8:44 UTC (permalink / raw) To: Eric W. Biederman Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal On Sat, Apr 03, 2010 at 10:44:22AM -0700, Eric W. Biederman wrote: > Joerg Roedel <joro@8bytes.org> writes: > > Hmm, I think for this we need to change the gart code too and disable > > the gart before its initialization runs to not re-introduce issues fixed > > in commit bc2cea6a34fdb30f118ec75db39a46a191870607, no? > > That is a different code path with a different set of assumptions and > restrictions. On a normal kexec of course we want to do an orderly shutdown. Thats another problem with this patch. It introduces a difference between the panic-shutdown kexec and the ordinary kexec. > For the gart with a little luck we can just ignore it on kexec on > panic. The commit I mentioned above already proves this assumption wrong. > Unlike a virtualization capable iommu it doesn't prevent access > to devices, when it is enabled. Worst case is that we have to start > including iommu=off for gart systems. No no no. This is a maintenance nightmare for almost everybody. Where do you want to Document this special cases that 'if kernel uses gart then and only then boot the kexec kernel with iommu=off'. Always passing iommu=off to the kexec kernel doesn't work too for obvious reasons. > The best case is that we can figure out how to have the gart code > reinitialize itself sanely, starting from some arbitrary point. Yes, that is missing in this patch. But to keep changes small and don't bother with the gart code at all I suggest to remove the shutdown routine from the amd-iommu code only and not the whole shutdown call in the machine_crash_shutdown path. Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 8:44 ` Joerg Roedel @ 2010-04-04 9:16 ` Eric W. Biederman 2010-04-04 9:19 ` Eric W. Biederman 0 siblings, 1 reply; 27+ messages in thread From: Eric W. Biederman @ 2010-04-04 9:16 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal Joerg Roedel <joro@8bytes.org> writes: > On Sat, Apr 03, 2010 at 10:44:22AM -0700, Eric W. Biederman wrote: >> Joerg Roedel <joro@8bytes.org> writes: > >> > Hmm, I think for this we need to change the gart code too and disable >> > the gart before its initialization runs to not re-introduce issues fixed >> > in commit bc2cea6a34fdb30f118ec75db39a46a191870607, no? >> >> That is a different code path with a different set of assumptions and >> restrictions. On a normal kexec of course we want to do an orderly shutdown. > > Thats another problem with this patch. It introduces a difference > between the panic-shutdown kexec and the ordinary kexec. Of course there is. kexec on panic does no shutdown, and should not. That is fundamental. That is documented. This make them symmetric crap is a bad idea, because we can not reliably do it. 999 times out of 1000 we can actually handle everything we need to in the kdump kernel in the initialization path and when we succeed it makes linux much more robust. >> For the gart with a little luck we can just ignore it on kexec on >> panic. > > The commit I mentioned above already proves this assumption wrong. Now that I look at that commit again I am stunned. That commit already says it is doing things wrong. What I meant by ignore it is that we should be able to not use it. >> Unlike a virtualization capable iommu it doesn't prevent access >> to devices, when it is enabled. Worst case is that we have to start >> including iommu=off for gart systems. > > No no no. This is a maintenance nightmare for almost everybody. Where do > you want to Document this special cases that 'if kernel uses gart then > and only then boot the kexec kernel with iommu=off'. > Always passing iommu=off to the kexec kernel doesn't work too for > obvious reasons. I agree. I would like to see something better, but the situation with the current set of patches is workable. Getting autodetection in there an automatically doing the right thing would be much better. Do you happen to have a patch you would like to submit to handle this? >> The best case is that we can figure out how to have the gart code >> reinitialize itself sanely, starting from some arbitrary point. > > Yes, that is missing in this patch. But to keep changes small and don't > bother with the gart code at all I suggest to remove the shutdown > routine from the amd-iommu code only and not the whole shutdown call in > the machine_crash_shutdown path. Which will only encourage further abuse of that code path. The reliability has been continually decreasing lately and I believe this is one of those reasons. The hpet junk in there appears to be an even bigger reason. I have machines right now that can not reliably crash dump because someone tried papering over problems by putting code in the machine_crash_shutdown path which must have worked for their test cast but does not work for real world failures, and right now I am very grumpy about it all. I guess what I am saying is that I do not believe shutting down the gart in machine_crash_shutdown actually works. It is definitely not the right place to do that work. So I don't see why we should keep any of that code there. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 9:16 ` Eric W. Biederman @ 2010-04-04 9:19 ` Eric W. Biederman 0 siblings, 0 replies; 27+ messages in thread From: Eric W. Biederman @ 2010-04-04 9:19 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal I guess I should add the place where I have serious test results for with a gart is on 2.6.29. Where there isn't that iommu shutdown and I don't have problems with it. So my experience is that touching the gart in machine_crash_shutdown is unnecessary. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 1:27 ` [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" Chris Wright 2010-04-03 17:22 ` Joerg Roedel @ 2010-04-03 17:41 ` Joerg Roedel 2010-04-03 17:49 ` Eric W. Biederman 2010-04-04 11:54 ` David Woodhouse 1 sibling, 2 replies; 27+ messages in thread From: Joerg Roedel @ 2010-04-03 17:41 UTC (permalink / raw) To: Chris Wright Cc: Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, ebiederm, vgoyal On Fri, Apr 02, 2010 at 06:27:54PM -0700, Chris Wright wrote: > This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488. > > Disabling the IOMMU can potetially allow DMA transactions to > complete without being translated. Leave it enabled, and allow > crash kernel to do the IOMMU reinitialization properly. > > Cc: Joerg Roedel <joerg.roedel@amd.com> > Cc: Eric Biederman <ebiederm@xmission.com> > Cc: Neil Horman <nhorman@tuxdriver.com> > Cc: Vivek Goyal <vgoyal@redhat.com> > Signed-off-by: Chris Wright <chrisw@sous-sol.org> > --- > arch/x86/kernel/crash.c | 6 ------ > 1 file changed, 6 deletions(-) > > --- a/arch/x86/kernel/crash.c > +++ b/arch/x86/kernel/crash.c > @@ -27,7 +27,6 @@ > #include <asm/cpu.h> > #include <asm/reboot.h> > #include <asm/virtext.h> > -#include <asm/x86_init.h> > > #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC) > > @@ -103,10 +102,5 @@ void native_machine_crash_shutdown(struc > #ifdef CONFIG_HPET_TIMER > hpet_disable(); > #endif > - > -#ifdef CONFIG_X86_64 > - x86_platform.iommu_shutdown(); > -#endif > - > crash_save_cpu(regs, safe_smp_processor_id()); Another problem: This also breaks if the kdump kernel has no iommu-support. Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 17:41 ` Joerg Roedel @ 2010-04-03 17:49 ` Eric W. Biederman 2010-04-03 19:13 ` Joerg Roedel 2010-04-04 7:24 ` Bernhard Walle 2010-04-04 11:54 ` David Woodhouse 1 sibling, 2 replies; 27+ messages in thread From: Eric W. Biederman @ 2010-04-03 17:49 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal Joerg Roedel <joro@8bytes.org> writes: > On Fri, Apr 02, 2010 at 06:27:54PM -0700, Chris Wright wrote: >> This effectively reverts commit 61d047be99757fd9b0af900d7abce9a13a337488. >> >> Disabling the IOMMU can potetially allow DMA transactions to >> complete without being translated. Leave it enabled, and allow >> crash kernel to do the IOMMU reinitialization properly. >> >> Cc: Joerg Roedel <joerg.roedel@amd.com> >> Cc: Eric Biederman <ebiederm@xmission.com> >> Cc: Neil Horman <nhorman@tuxdriver.com> >> Cc: Vivek Goyal <vgoyal@redhat.com> >> Signed-off-by: Chris Wright <chrisw@sous-sol.org> >> --- >> arch/x86/kernel/crash.c | 6 ------ >> 1 file changed, 6 deletions(-) >> >> --- a/arch/x86/kernel/crash.c >> +++ b/arch/x86/kernel/crash.c >> @@ -27,7 +27,6 @@ >> #include <asm/cpu.h> >> #include <asm/reboot.h> >> #include <asm/virtext.h> >> -#include <asm/x86_init.h> >> >> #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC) >> >> @@ -103,10 +102,5 @@ void native_machine_crash_shutdown(struc >> #ifdef CONFIG_HPET_TIMER >> hpet_disable(); >> #endif >> - >> -#ifdef CONFIG_X86_64 >> - x86_platform.iommu_shutdown(); >> -#endif >> - >> crash_save_cpu(regs, safe_smp_processor_id()); > > Another problem: This also breaks if the kdump kernel has no > iommu-support. Not a problem. We require a lot of things of the kdump kernel, and it is immediately apparent in a basic sanity test. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 17:49 ` Eric W. Biederman @ 2010-04-03 19:13 ` Joerg Roedel 2010-04-03 19:41 ` Eric W. Biederman 2010-04-04 7:24 ` Bernhard Walle 1 sibling, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2010-04-03 19:13 UTC (permalink / raw) To: Eric W. Biederman Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal On Sat, Apr 03, 2010 at 10:49:09AM -0700, Eric W. Biederman wrote: > Joerg Roedel <joro@8bytes.org> writes: > > Another problem: This also breaks if the kdump kernel has no > > iommu-support. > > Not a problem. We require a lot of things of the kdump kernel, > and it is immediately apparent in a basic sanity test. Only if the sanity test is done on an iommu machine which I don't want to rely on. Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 19:13 ` Joerg Roedel @ 2010-04-03 19:41 ` Eric W. Biederman 0 siblings, 0 replies; 27+ messages in thread From: Eric W. Biederman @ 2010-04-03 19:41 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal Joerg Roedel <joro@8bytes.org> writes: > On Sat, Apr 03, 2010 at 10:49:09AM -0700, Eric W. Biederman wrote: >> Joerg Roedel <joro@8bytes.org> writes: > >> > Another problem: This also breaks if the kdump kernel has no >> > iommu-support. >> >> Not a problem. We require a lot of things of the kdump kernel, >> and it is immediately apparent in a basic sanity test. > > Only if the sanity test is done on an iommu machine which I don't want > to rely on. That makes no sense. The requirements on the kdump kernel has always been that it somehow figure out to recover a machine that is in a random hardware state. That requires drivers for the hardware, that is critical to the machines operation. The easy test for sysadmins is to do: echo > /proc/sysrq-trigger Anyone who thinks the result from one piece of hardware applies to another is deluded. We have been down the path of doing lots of things in the crashing kernel with lkcd, in practice it was worthless in the event of real world crashes. kexec on panic isn't perfect but it at least is an architecture that works often enough to be usable. It does require testing to make certain the basic code paths don't regress, but even so it is a lot easier to maintain and keep useful than any alternative I know of. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 17:49 ` Eric W. Biederman 2010-04-03 19:13 ` Joerg Roedel @ 2010-04-04 7:24 ` Bernhard Walle 2010-04-04 7:51 ` Eric W. Biederman 2010-04-04 8:53 ` Joerg Roedel 1 sibling, 2 replies; 27+ messages in thread From: Bernhard Walle @ 2010-04-04 7:24 UTC (permalink / raw) To: Eric W. Biederman Cc: Joerg Roedel, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, Chris Wright, hbabu, iommu, vgoyal Am 03.04.10 19:49, schrieb Eric W. Biederman: > Not a problem. We require a lot of things of the kdump kernel, > and it is immediately apparent in a basic sanity test. Also, in most cases (for example: distribution kernels), the kdump kernel nowadays is identical to the running kernel. So, if the running kernel has IOMMU support, the kdump kernel also has. Regards, Bernhard ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 7:24 ` Bernhard Walle @ 2010-04-04 7:51 ` Eric W. Biederman 2010-04-04 8:53 ` Joerg Roedel 1 sibling, 0 replies; 27+ messages in thread From: Eric W. Biederman @ 2010-04-04 7:51 UTC (permalink / raw) To: Bernhard Walle Cc: Joerg Roedel, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, Chris Wright, hbabu, iommu, vgoyal Bernhard Walle <bernhard@bwalle.de> writes: > Am 03.04.10 19:49, schrieb Eric W. Biederman: >> Not a problem. We require a lot of things of the kdump kernel, >> and it is immediately apparent in a basic sanity test. > > Also, in most cases (for example: distribution kernels), the kdump > kernel nowadays is identical to the running kernel. So, if the running > kernel has IOMMU support, the kdump kernel also has. I have been rather pleasantly surprised at how well that works. Although I am still glad I insisted that using the same kernel version not be a hard requirement. It allowed us to specify a good solid interface. How to cope with GART in that situation without having to manually specify iommu=off in the configuration I find a more compelling question. I expect I will have a look at that in the coming months if someone doesn't get there before I do. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 7:24 ` Bernhard Walle 2010-04-04 7:51 ` Eric W. Biederman @ 2010-04-04 8:53 ` Joerg Roedel 2010-04-04 9:44 ` Eric W. Biederman 1 sibling, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2010-04-04 8:53 UTC (permalink / raw) To: Bernhard Walle Cc: Eric W. Biederman, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, Chris Wright, hbabu, iommu, vgoyal On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: > Am 03.04.10 19:49, schrieb Eric W. Biederman: > > Not a problem. We require a lot of things of the kdump kernel, > > and it is immediately apparent in a basic sanity test. > > Also, in most cases (for example: distribution kernels), the kdump > kernel nowadays is identical to the running kernel. So, if the running > kernel has IOMMU support, the kdump kernel also has. Yes, I know. But is that a requirement for kexec? If not we still potentially have this problem. It is a smaller problem than data-corruption caused by in-flight dma because most people^Wdistributions indeed use the same kernel for normal boot and kexec, thats true. Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 8:53 ` Joerg Roedel @ 2010-04-04 9:44 ` Eric W. Biederman 2010-04-04 10:01 ` Joerg Roedel 0 siblings, 1 reply; 27+ messages in thread From: Eric W. Biederman @ 2010-04-04 9:44 UTC (permalink / raw) To: Joerg Roedel Cc: Bernhard Walle, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, Chris Wright, hbabu, iommu, vgoyal Joerg Roedel <joro@8bytes.org> writes: > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: >> Am 03.04.10 19:49, schrieb Eric W. Biederman: >> > Not a problem. We require a lot of things of the kdump kernel, >> > and it is immediately apparent in a basic sanity test. >> >> Also, in most cases (for example: distribution kernels), the kdump >> kernel nowadays is identical to the running kernel. So, if the running >> kernel has IOMMU support, the kdump kernel also has. > > Yes, I know. But is that a requirement for kexec? For normal kexec no. That path is expected to do a clean hardware shutdown. For kexec on panic aka kdump the requirement is that your your crash kernel be able to initialize your hardware from any state it can be put in. Currently we have no kernels that properly recover and amd-iommu and it is not a requirement that we cater to broken kernels. > If not we still > potentially have this problem. It is a smaller problem than > data-corruption caused by in-flight dma because most > people^Wdistributions indeed use the same kernel for normal boot and > kexec, thats true. There are exceptions made for practical reality. The fact that typically the kdump kernel is the same as the running kernel mean that we don't even need to consider one of those exceptions. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 9:44 ` Eric W. Biederman @ 2010-04-04 10:01 ` Joerg Roedel 2010-04-06 17:42 ` Chris Wright 0 siblings, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2010-04-04 10:01 UTC (permalink / raw) To: Eric W. Biederman Cc: Bernhard Walle, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, Chris Wright, hbabu, iommu, vgoyal On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote: > Joerg Roedel <joro@8bytes.org> writes: > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: > >> Am 03.04.10 19:49, schrieb Eric W. Biederman: > >> > Not a problem. We require a lot of things of the kdump kernel, > >> > and it is immediately apparent in a basic sanity test. > >> > >> Also, in most cases (for example: distribution kernels), the kdump > >> kernel nowadays is identical to the running kernel. So, if the running > >> kernel has IOMMU support, the kdump kernel also has. > > > > Yes, I know. But is that a requirement for kexec? > > For normal kexec no. That path is expected to do a clean hardware > shutdown. > > For kexec on panic aka kdump the requirement is that your your crash > kernel be able to initialize your hardware from any state it can be > put in. Ok, if you show me where this is documented for everybody then I am probably convinced :-) We should fixup the gart initialization anyway. Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-04 10:01 ` Joerg Roedel @ 2010-04-06 17:42 ` Chris Wright 2010-04-06 17:51 ` Joerg Roedel 0 siblings, 1 reply; 27+ messages in thread From: Chris Wright @ 2010-04-06 17:42 UTC (permalink / raw) To: Joerg Roedel Cc: Eric W. Biederman, Bernhard Walle, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, Chris Wright, hbabu, iommu, vgoyal * Joerg Roedel (joro@8bytes.org) wrote: > On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote: > > Joerg Roedel <joro@8bytes.org> writes: > > > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: > > >> Am 03.04.10 19:49, schrieb Eric W. Biederman: > > >> > Not a problem. We require a lot of things of the kdump kernel, > > >> > and it is immediately apparent in a basic sanity test. > > >> > > >> Also, in most cases (for example: distribution kernels), the kdump > > >> kernel nowadays is identical to the running kernel. So, if the running > > >> kernel has IOMMU support, the kdump kernel also has. > > > > > > Yes, I know. But is that a requirement for kexec? > > > > For normal kexec no. That path is expected to do a clean hardware > > shutdown. > > > > For kexec on panic aka kdump the requirement is that your your crash > > kernel be able to initialize your hardware from any state it can be > > put in. > > Ok, if you show me where this is documented for everybody then I am > probably convinced :-) > We should fixup the gart initialization anyway. So, you planning to pull in all 4 patches then? thanks, -chris ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-06 17:42 ` Chris Wright @ 2010-04-06 17:51 ` Joerg Roedel 2010-04-06 20:39 ` Vivek Goyal 0 siblings, 1 reply; 27+ messages in thread From: Joerg Roedel @ 2010-04-06 17:51 UTC (permalink / raw) To: Chris Wright Cc: Joerg Roedel, Eric W. Biederman, Bernhard Walle, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu, vgoyal On Tue, Apr 06, 2010 at 10:42:57AM -0700, Chris Wright wrote: > * Joerg Roedel (joro@8bytes.org) wrote: > > On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote: > > > Joerg Roedel <joro@8bytes.org> writes: > > > > > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: > > > >> Am 03.04.10 19:49, schrieb Eric W. Biederman: > > > >> > Not a problem. We require a lot of things of the kdump kernel, > > > >> > and it is immediately apparent in a basic sanity test. > > > >> > > > >> Also, in most cases (for example: distribution kernels), the kdump > > > >> kernel nowadays is identical to the running kernel. So, if the running > > > >> kernel has IOMMU support, the kdump kernel also has. > > > > > > > > Yes, I know. But is that a requirement for kexec? > > > > > > For normal kexec no. That path is expected to do a clean hardware > > > shutdown. > > > > > > For kexec on panic aka kdump the requirement is that your your crash > > > kernel be able to initialize your hardware from any state it can be > > > put in. > > > > Ok, if you show me where this is documented for everybody then I am > > probably convinced :-) > > We should fixup the gart initialization anyway. > > So, you planning to pull in all 4 patches then? Yes, I will apply them tomorrow and write a fix for the GART issue this may introduce. Joerg ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-06 17:51 ` Joerg Roedel @ 2010-04-06 20:39 ` Vivek Goyal 2010-04-06 21:13 ` Vivek Goyal 0 siblings, 1 reply; 27+ messages in thread From: Vivek Goyal @ 2010-04-06 20:39 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, Eric W. Biederman, Bernhard Walle, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu On Tue, Apr 06, 2010 at 07:51:06PM +0200, Joerg Roedel wrote: > On Tue, Apr 06, 2010 at 10:42:57AM -0700, Chris Wright wrote: > > * Joerg Roedel (joro@8bytes.org) wrote: > > > On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote: > > > > Joerg Roedel <joro@8bytes.org> writes: > > > > > > > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: > > > > >> Am 03.04.10 19:49, schrieb Eric W. Biederman: > > > > >> > Not a problem. We require a lot of things of the kdump kernel, > > > > >> > and it is immediately apparent in a basic sanity test. > > > > >> > > > > >> Also, in most cases (for example: distribution kernels), the kdump > > > > >> kernel nowadays is identical to the running kernel. So, if the running > > > > >> kernel has IOMMU support, the kdump kernel also has. > > > > > > > > > > Yes, I know. But is that a requirement for kexec? > > > > > > > > For normal kexec no. That path is expected to do a clean hardware > > > > shutdown. > > > > > > > > For kexec on panic aka kdump the requirement is that your your crash > > > > kernel be able to initialize your hardware from any state it can be > > > > put in. > > > > > > Ok, if you show me where this is documented for everybody then I am > > > probably convinced :-) > > > We should fixup the gart initialization anyway. > > > > So, you planning to pull in all 4 patches then? > > Yes, I will apply them tomorrow and write a fix for the GART issue this > may introduce. > Hi Joerg, Going through the old mail thread, I think the commit you pointed to was primarily introduced to solve kexec + GART issue and not necessarily kdump issue. In fact disabling IOMMU patch was introduced by you. Author: Joerg Roedel <joerg.roedel@amd.com> Date: Tue Jun 9 17:56:09 2009 +0200 x86: disable IOMMUs on kernel crash If the IOMMUs are still enabled when the kexec kernel boots access to the disk is not possible. This is bad for tools like kdump or anything else which wants to use PCI devices. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> I am assuming you introduced this patch because you faced issues with amd-iommu and not GART. So basically GART should have been working with kdump even before you introduced disabling iommu patch in kdump path. Thanks Vivek ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-06 20:39 ` Vivek Goyal @ 2010-04-06 21:13 ` Vivek Goyal 2010-04-06 21:45 ` Yinghai Lu 0 siblings, 1 reply; 27+ messages in thread From: Vivek Goyal @ 2010-04-06 21:13 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, Joerg Roedel, Eric W. Biederman, Bernhard Walle, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu On Tue, Apr 06, 2010 at 04:39:56PM -0400, Vivek Goyal wrote: > On Tue, Apr 06, 2010 at 07:51:06PM +0200, Joerg Roedel wrote: > > On Tue, Apr 06, 2010 at 10:42:57AM -0700, Chris Wright wrote: > > > * Joerg Roedel (joro@8bytes.org) wrote: > > > > On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote: > > > > > Joerg Roedel <joro@8bytes.org> writes: > > > > > > > > > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: > > > > > >> Am 03.04.10 19:49, schrieb Eric W. Biederman: > > > > > >> > Not a problem. We require a lot of things of the kdump kernel, > > > > > >> > and it is immediately apparent in a basic sanity test. > > > > > >> > > > > > >> Also, in most cases (for example: distribution kernels), the kdump > > > > > >> kernel nowadays is identical to the running kernel. So, if the running > > > > > >> kernel has IOMMU support, the kdump kernel also has. > > > > > > > > > > > > Yes, I know. But is that a requirement for kexec? > > > > > > > > > > For normal kexec no. That path is expected to do a clean hardware > > > > > shutdown. > > > > > > > > > > For kexec on panic aka kdump the requirement is that your your crash > > > > > kernel be able to initialize your hardware from any state it can be > > > > > put in. > > > > > > > > Ok, if you show me where this is documented for everybody then I am > > > > probably convinced :-) > > > > We should fixup the gart initialization anyway. > > > > > > So, you planning to pull in all 4 patches then? > > > > Yes, I will apply them tomorrow and write a fix for the GART issue this > > may introduce. > > > > Hi Joerg, > > Going through the old mail thread, I think the commit you pointed to was > primarily introduced to solve kexec + GART issue and not necessarily kdump > issue. > > In fact disabling IOMMU patch was introduced by you. > > Author: Joerg Roedel <joerg.roedel@amd.com> > Date: Tue Jun 9 17:56:09 2009 +0200 > > x86: disable IOMMUs on kernel crash > > If the IOMMUs are still enabled when the kexec kernel boots access to > the disk is not possible. This is bad for tools like kdump or anything > else which wants to use PCI devices. > > Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> > > I am assuming you introduced this patch because you faced issues with > amd-iommu and not GART. > > So basically GART should have been working with kdump even before you > introduced disabling iommu patch in kdump path. Looking at following commit, we were still not shutting down GART and fixing issues like second kernel accessing the GART aperture set by first kernel. commit aaf230424204864e2833dcc1da23e2cb0b9f39cd Author: Yinghai Lu <Yinghai.Lu@Sun.COM> Date: Wed Jan 30 13:33:09 2008 +0100 x86: disable the GART early, 64-bit For K8 system: 4G RAM with memory hole remapping enabled, or more than 4G RAM installed. So I guess it should be fine to not shutdown GART in crashing kernel and then look at the fresh issues which crop up and figure out how to fix those. Vivek ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-06 21:13 ` Vivek Goyal @ 2010-04-06 21:45 ` Yinghai Lu 2010-04-06 22:10 ` Eric W. Biederman 0 siblings, 1 reply; 27+ messages in thread From: Yinghai Lu @ 2010-04-06 21:45 UTC (permalink / raw) To: Vivek Goyal Cc: Joerg Roedel, Chris Wright, Joerg Roedel, Eric W. Biederman, Bernhard Walle, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu On Tue, Apr 6, 2010 at 2:13 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Tue, Apr 06, 2010 at 04:39:56PM -0400, Vivek Goyal wrote: >> On Tue, Apr 06, 2010 at 07:51:06PM +0200, Joerg Roedel wrote: >> > On Tue, Apr 06, 2010 at 10:42:57AM -0700, Chris Wright wrote: >> > > * Joerg Roedel (joro@8bytes.org) wrote: >> > > > On Sun, Apr 04, 2010 at 02:44:36AM -0700, Eric W. Biederman wrote: >> > > > > Joerg Roedel <joro@8bytes.org> writes: >> > > > > >> > > > > > On Sun, Apr 04, 2010 at 09:24:30AM +0200, Bernhard Walle wrote: >> > > > > >> Am 03.04.10 19:49, schrieb Eric W. Biederman: >> > > > > >> > Not a problem. We require a lot of things of the kdump kernel, >> > > > > >> > and it is immediately apparent in a basic sanity test. >> > > > > >> >> > > > > >> Also, in most cases (for example: distribution kernels), the kdump >> > > > > >> kernel nowadays is identical to the running kernel. So, if the running >> > > > > >> kernel has IOMMU support, the kdump kernel also has. >> > > > > > >> > > > > > Yes, I know. But is that a requirement for kexec? >> > > > > >> > > > > For normal kexec no. That path is expected to do a clean hardware >> > > > > shutdown. >> > > > > >> > > > > For kexec on panic aka kdump the requirement is that your your crash >> > > > > kernel be able to initialize your hardware from any state it can be >> > > > > put in. >> > > > >> > > > Ok, if you show me where this is documented for everybody then I am >> > > > probably convinced :-) >> > > > We should fixup the gart initialization anyway. >> > > >> > > So, you planning to pull in all 4 patches then? >> > >> > Yes, I will apply them tomorrow and write a fix for the GART issue this >> > may introduce. >> > >> >> Hi Joerg, >> >> Going through the old mail thread, I think the commit you pointed to was >> primarily introduced to solve kexec + GART issue and not necessarily kdump >> issue. >> >> In fact disabling IOMMU patch was introduced by you. >> >> Author: Joerg Roedel <joerg.roedel@amd.com> >> Date: Tue Jun 9 17:56:09 2009 +0200 >> >> x86: disable IOMMUs on kernel crash >> >> If the IOMMUs are still enabled when the kexec kernel boots access to >> the disk is not possible. This is bad for tools like kdump or anything >> else which wants to use PCI devices. >> >> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> >> >> I am assuming you introduced this patch because you faced issues with >> amd-iommu and not GART. >> >> So basically GART should have been working with kdump even before you >> introduced disabling iommu patch in kdump path. > > Looking at following commit, we were still not shutting down GART and > fixing issues like second kernel accessing the GART aperture set by first > kernel. > > commit aaf230424204864e2833dcc1da23e2cb0b9f39cd > Author: Yinghai Lu <Yinghai.Lu@Sun.COM> > Date: Wed Jan 30 13:33:09 2008 +0100 > > x86: disable the GART early, 64-bit > > For K8 system: 4G RAM with memory hole remapping enabled, or more than > 4G RAM installed. > > So I guess it should be fine to not shutdown GART in crashing kernel and > then look at the fresh issues which crop up and figure out how to fix > those. not sure if it is related: for crashing kernel, it could do early_memtest to check if some device are still do dma operation. When I use kexec to start second kernel, if enable the early_memtest in second kernel, it will find some pages RAM are BAD, and it will mark them and not use them. memtest=1 should be good enough. Fresh restart will not report there is any BAD ram in the same system. YH ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-06 21:45 ` Yinghai Lu @ 2010-04-06 22:10 ` Eric W. Biederman 0 siblings, 0 replies; 27+ messages in thread From: Eric W. Biederman @ 2010-04-06 22:10 UTC (permalink / raw) To: Yinghai Lu Cc: Vivek Goyal, Joerg Roedel, Chris Wright, Joerg Roedel, Bernhard Walle, nhorman, nhorman, kexec, linux-kernel, hbabu, iommu Yinghai Lu <yinghai@kernel.org> writes: > not sure if it is related: I don't think it is. > for crashing kernel, it could do early_memtest to check if some device > are still do dma operation. Devices doing DMA in general are not a problem in the kdump kernel because we are using an area of memory that has been reserved since the beginning of time and no DMA's should be targeting it. The challenge is how to regain control of the IOMMU. > When I use kexec to start second kernel, if enable the early_memtest > in second kernel, it will find some pages RAM are BAD, > and it will mark them and not use them. memtest=1 should be good enough. > Fresh restart will not report there is any BAD ram in the same system. I assume you are not talking kdump here. On-going DMA in the case of kexec indicates some device driver isn't shutting itself down when it's shutdown method is called. Odds are it is a network controller that doesn't stop DMA when it is brought down or it is, possibly a really weird disk driver. If you are seeing this with the kdump kernel this may indeed indicate an IOMMU reinitialization problem. Eric ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" 2010-04-03 17:41 ` Joerg Roedel 2010-04-03 17:49 ` Eric W. Biederman @ 2010-04-04 11:54 ` David Woodhouse 1 sibling, 0 replies; 27+ messages in thread From: David Woodhouse @ 2010-04-04 11:54 UTC (permalink / raw) To: Joerg Roedel Cc: Chris Wright, nhorman, nhorman, Joerg Roedel, kexec, linux-kernel, hbabu, iommu, ebiederm, vgoyal On Sat, 2010-04-03 at 19:41 +0200, Joerg Roedel wrote: > Another problem: This also breaks if the kdump kernel has no > iommu-support. The kdump kernel has to support the hardware it's running on. Film at 11. -- David Woodhouse Open Source Technology Centre David.Woodhouse@intel.com Intel Corporation ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 4/4] x86/amd-iommu: use for_each_pci_dev 2010-04-03 1:27 [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Chris Wright ` (2 preceding siblings ...) 2010-04-03 1:27 ` [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" Chris Wright @ 2010-04-03 1:27 ` Chris Wright 2010-04-07 10:05 ` [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Joerg Roedel 4 siblings, 0 replies; 27+ messages in thread From: Chris Wright @ 2010-04-03 1:27 UTC (permalink / raw) To: Joerg Roedel Cc: nhorman, nhorman, vgoyal, hbabu, ebiederm, iommu, kexec, linux-kernel [-- Attachment #1: amd-iommu-use-for_each_pci_dev.patch --] [-- Type: text/plain, Size: 540 bytes --] Replace open coded version with for_each_pci_dev Signed-off-by: Chris Wright <chrisw@sous-sol.org> --- arch/x86/kernel/amd_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/kernel/amd_iommu.c +++ b/arch/x86/kernel/amd_iommu.c @@ -2187,7 +2187,7 @@ static void prealloc_protection_domains( struct dma_ops_domain *dma_dom; u16 devid; - while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) { + for_each_pci_dev(dev) { /* Do we handle this device? */ if (!check_device(&dev->dev)) ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) 2010-04-03 1:27 [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Chris Wright ` (3 preceding siblings ...) 2010-04-03 1:27 ` [PATCH 4/4] x86/amd-iommu: use for_each_pci_dev Chris Wright @ 2010-04-07 10:05 ` Joerg Roedel 4 siblings, 0 replies; 27+ messages in thread From: Joerg Roedel @ 2010-04-07 10:05 UTC (permalink / raw) To: Chris Wright Cc: nhorman, nhorman, vgoyal, hbabu, ebiederm, iommu, kexec, linux-kernel On Fri, Apr 02, 2010 at 06:27:51PM -0700, Chris Wright wrote: > Series includes the following patches: > > x86/amd-iommu: enable iommu before attaching devices > x86/amd-iommu: warn when issuing command to uninitialized cmd buffer > Revert "x86: disable IOMMUs on kernel crash" > x86/amd-iommu: use for_each_pci_dev > > First one is the primary bug fix. > > v2 > - add disable_iommus on failure path > - move removal of CMD_BUFFER_UNINITIALIZED to iommu_enable_command_buffer() > - fix ";;" typo in patch 2 > - add Revert "x86: disable IOMMUs on kernel crash" > - add x86/amd-iommu: use for_each_pci_dev Applied all, thanks. ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2010-04-07 10:06 UTC | newest] Thread overview: 27+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-04-03 1:27 [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Chris Wright 2010-04-03 1:27 ` [PATCH 1/4] x86/amd-iommu: enable iommu before attaching devices Chris Wright 2010-04-03 1:27 ` [PATCH 2/4] x86/amd-iommu: warn when issuing command to uninitialized cmd buffer Chris Wright 2010-04-03 1:27 ` [PATCH 3/4] Revert "x86: disable IOMMUs on kernel crash" Chris Wright 2010-04-03 17:22 ` Joerg Roedel 2010-04-03 17:44 ` Eric W. Biederman 2010-04-04 8:44 ` Joerg Roedel 2010-04-04 9:16 ` Eric W. Biederman 2010-04-04 9:19 ` Eric W. Biederman 2010-04-03 17:41 ` Joerg Roedel 2010-04-03 17:49 ` Eric W. Biederman 2010-04-03 19:13 ` Joerg Roedel 2010-04-03 19:41 ` Eric W. Biederman 2010-04-04 7:24 ` Bernhard Walle 2010-04-04 7:51 ` Eric W. Biederman 2010-04-04 8:53 ` Joerg Roedel 2010-04-04 9:44 ` Eric W. Biederman 2010-04-04 10:01 ` Joerg Roedel 2010-04-06 17:42 ` Chris Wright 2010-04-06 17:51 ` Joerg Roedel 2010-04-06 20:39 ` Vivek Goyal 2010-04-06 21:13 ` Vivek Goyal 2010-04-06 21:45 ` Yinghai Lu 2010-04-06 22:10 ` Eric W. Biederman 2010-04-04 11:54 ` David Woodhouse 2010-04-03 1:27 ` [PATCH 4/4] x86/amd-iommu: use for_each_pci_dev Chris Wright 2010-04-07 10:05 ` [PATCH 0/4] AMD IOMMU kdump fix plus cleanups (v2) Joerg Roedel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox