* [PATCH] kdump: Fix for machine checkstop on DMA fault
@ 2006-03-23 4:30 Haren Myneni
2006-03-23 5:38 ` Olof Johansson
0 siblings, 1 reply; 9+ messages in thread
From: Haren Myneni @ 2006-03-23 4:30 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev, ellerman, Milton Miller, Olaf Hering
[-- Attachment #1: Type: text/plain, Size: 320 bytes --]
Paul, If you are OK with this fix, please send it upstream.
Thanks
Haren
- Some machines checkstop on a DMA protection fault caused by ongoing DMA
left over from the first kernel. Since we do not shut down devices before
the kdump boot, let them continue DMA to the old kernel's memory.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
[-- Attachment #2: kdump-dma-fault-fix.patch --]
[-- Type: text/x-patch, Size: 557 bytes --]
--- 2616-git5-k1/arch/powerpc/kernel/iommu.c.orig 2006-04-04 19:08:02.000000000 -0700
+++ 2616-git5-k1/arch/powerpc/kernel/iommu.c 2006-04-04 10:50:45.000000000 -0700
@@ -427,8 +427,10 @@ struct iommu_table *iommu_init_table(str
 	tbl->it_largehint = tbl->it_halfpoint;
 	spin_lock_init(&tbl->it_lock);
 
+#ifndef CONFIG_CRASH_DUMP
 	/* Clear the hardware table in case firmware left allocations in it */
 	ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
+#endif
 
 	if (!welcomed) {
 		printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 4:30 [PATCH] kdump: Fix for machine checkstop on DMA fault Haren Myneni
@ 2006-03-23 5:38 ` Olof Johansson
2006-03-23 6:06 ` Michael Ellerman
0 siblings, 1 reply; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 5:38 UTC (permalink / raw)
To: Haren Myneni
Cc: linuxppc-dev, ellerman, Paul Mackerras, Milton Miller, Olaf Hering

On Wed, Mar 22, 2006 at 08:30:26PM -0800, Haren Myneni wrote:
> Paul, If you are OK with this fix, please send it upstream.
>
> Thanks
> Haren
>
> - Some machines checkstop on dma protection fault for ongoing DMA left
> in the first kernel. Since, we do not shutdown devices before the kdump
> boot, let them continue DMA to old kernel space.

How is this solved for regular kexec, doesn't the same problem exist
there?

-Olof

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 5:38 ` Olof Johansson
@ 2006-03-23 6:06 ` Michael Ellerman
2006-03-23 6:19 ` Olof Johansson
0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2006-03-23 6:06 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Milton Miller, Paul Mackerras, Olaf Hering, ellerman
[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

On Thu, 23 Mar 2006 16:38, Olof Johansson wrote:
> On Wed, Mar 22, 2006 at 08:30:26PM -0800, Haren Myneni wrote:
> > Paul, If you are OK with this fix, please send it upstream.
> >
> > Thanks
> > Haren
> >
> > - Some machines checkstop on dma protection fault for ongoing DMA left
> > in the first kernel. Since, we do not shutdown devices before the kdump
> > boot, let them continue DMA to old kernel space.
>
> How is this solved for regular kexec, doesn't the same problem exist
> there?

The idea for normal kexec is that the kernel should have shut everything
down properly. It's a bug if there are still DMAs going on. Hopefully.

cheers

--
Michael Ellerman
IBM OzLabs
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 6:06 ` Michael Ellerman
@ 2006-03-23 6:19 ` Olof Johansson
2006-03-23 20:12 ` Olof Johansson
0 siblings, 1 reply; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 6:19 UTC (permalink / raw)
To: Michael Ellerman
Cc: Milton Miller, linuxppc-dev, Paul Mackerras, Olaf Hering, ellerman

On Thu, Mar 23, 2006 at 05:06:27PM +1100, Michael Ellerman wrote:
> On Thu, 23 Mar 2006 16:38, Olof Johansson wrote:
> > On Wed, Mar 22, 2006 at 08:30:26PM -0800, Haren Myneni wrote:
> > > Paul, If you are OK with this fix, please send it upstream.
> > >
> > > Thanks
> > > Haren
> > >
> > > - Some machines checkstop on dma protection fault for ongoing DMA left
> > > in the first kernel. Since, we do not shutdown devices before the kdump
> > > boot, let them continue DMA to old kernel space.
> >
> > How is this solved for regular kexec, doesn't the same problem exist
> > there?
>
> The idea for normal kexec is that the kernel should have shut everything down
> properly. It's a bug if there are still DMAs going on. Hopefully.

Thanks Michael.

In that case, I have to NACK the original patch. Out of luck, it'll
probably work in most cases, but there's always the risk of a DMA still
going on, the crash kernel remapping an entry, and getting memory
scribbled over.

The crash kernel needs to be even more careful, and instead read out
the entries that are mapped and reserve them. This would require a bit
more plumbing since there's no way to read an entry right now, but it'd
remove that hole.

-Olof

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 6:19 ` Olof Johansson
@ 2006-03-23 20:12 ` Olof Johansson
2006-03-23 23:06 ` Haren Myneni
2006-03-27 5:04 ` Michael Ellerman
0 siblings, 2 replies; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 20:12 UTC (permalink / raw)
To: Olof Johansson
Cc: Milton Miller, Michael Ellerman, linuxppc-dev, Paul Mackerras, Olaf Hering, ellerman

On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:

> The crash kernel needs to be even more careful, and instead read out
> the entries that are mapped and reserve them. This would require a bit
> more plumbing since there's no way to read an entry right now, but it'd
> remove that hole.

Actually, what's probably easier is to allocate some entries when the
purgatory is set up, and make the crash kernel only use those by modifying
the device tree accordingly. Sort of how regular memory is handled right
now. That'd be a cleaner solution with less changes needed.

The trick will be to get a decent size contiguous allocation, but the
same applies for the memory reserve.

-Olof

^ permalink raw reply [flat|nested] 9+ messages in thread
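The reserved-window idea in the message above can be sketched as a toy allocator. This is only a user-space illustration, not kernel code: `struct tce_window` and `window_alloc()` are hypothetical names, and the sketch assumes the first kernel has already set aside one contiguous range of TCE entries and passed its bounds to the crash kernel (for example through the device tree, as suggested).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the range set aside for the crash kernel. */
struct tce_window {
	size_t start;	/* first table entry the crash kernel may use */
	size_t count;	/* number of entries in the window */
	size_t next;	/* next unallocated offset within the window */
};

/*
 * Hand out npages contiguous entries from the window.
 * Returns the first table index, or -1 if the request does not fit.
 * Entries outside the window are never touched, so mappings left
 * behind by the first kernel stay exactly as they were.
 */
static long window_alloc(struct tce_window *w, size_t npages)
{
	size_t idx;

	assert(w->next <= w->count);
	if (npages == 0 || npages > w->count - w->next)
		return -1;

	idx = w->start + w->next;
	w->next += npages;
	return (long)idx;
}
```

With a window of 8 entries starting at index 128, two 4-entry requests are served from 128 and 132, and any further request fails. The cost is that those 8 entries are unavailable to the first kernel for as long as the window is reserved.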
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 20:12 ` Olof Johansson
@ 2006-03-23 23:06 ` Haren Myneni
2006-03-23 23:11 ` Olof Johansson
2006-03-27 5:04 ` Michael Ellerman
1 sibling, 1 reply; 9+ messages in thread
From: Haren Myneni @ 2006-03-23 23:06 UTC (permalink / raw)
To: Olof Johansson
Cc: Milton Miller, Michael Ellerman, linuxppc-dev, Paul Mackerras, Olaf Hering, ellerman
[-- Attachment #1: Type: text/plain, Size: 1733 bytes --]

linuxppc-dev-bounces+hbabu=us.ibm.com@ozlabs.org wrote on 03/23/2006 12:12:58 PM:

> On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:
>
> > The crash kernel needs to be even more careful, and instead read out
> > the entries that are mapped and reserve them. This would require a bit
> > more plumbing since there's no way to read an entry right now, but it'd
> > remove that hole.
>
> Actually, what's probably easier is to allocate some entries when the
> purgatory is set up, and make the crash kernel only use those by modifying
> the device tree accordingly. Sort of how regular memory is handled right
> now. That'd be a cleaner solution with less changes needed.
>
> The trick will be to get a decent size contiguous allocation, but the
> same applies for the memory reserve.

Olof, thanks for your comments/suggestions.

On JS21, immediately after the TCE entries are initialized, the machine
checkstops with an error "Internal CPU 1 Fault Error" on the BladeCenter MM.
If we do not initialize the TCE entries for the crash kernel, the ongoing
DMA is allowed to continue to the old kernel memory. I thought that ongoing
DMA would be stopped when the drivers reset the devices later. I think some
hardening is already included in some drivers to take care of this
behavior, but I might be wrong. So far, I have had an e100 issue after
testing on p5, p4, js20 and js21. It could just be a lucky scenario.

So, we will keep the same change (posted here) plus your suggestion,
right? Can we apply the same approach even for power-4?

Thanks
Haren

>
>
> -Olof
[-- Attachment #2: Type: text/html, Size: 2160 bytes --]

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 23:06 ` Haren Myneni
@ 2006-03-23 23:11 ` Olof Johansson
0 siblings, 0 replies; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 23:11 UTC (permalink / raw)
To: Haren Myneni
Cc: Milton Miller, Michael Ellerman, linuxppc-dev, Paul Mackerras, Olaf Hering, ellerman

Hi,

On Thu, Mar 23, 2006 at 03:06:22PM -0800, Haren Myneni wrote:

> On JS21, immediately after the tce entries are initialized, the machine
> checkstops with an error "Internal CPU 1 Fault Error" on bladecenter MM.
> If we do not initialize tce entries for crash kernel, allows the ongoing
> DMA continue to the old kernel memory. I though that, ongoing DMA will be

The problem isn't when DMA is going to the old kernel memory. The
problem is when that TCE entry gets reused by the crashdump kernel, and
some other memory gets overwritten instead.

> stopped when the device reset happens later by the drivers. I think, some
> hardening is already included in some drivers to take care of this
> behavior. I might be wrong. So far, I had e100 issue after testing on p5,

What assures that the crash kernel has drivers for all hardware in the
system? If there's no driver, what will then be used to quiesce the
device?

> p4, js20 and js21. Probably, it could be lucky scenario.
> So, will be keeping the same change (posted here) plus your suggestion.
> Right? Can we apply same approach even for power-4?

What you have now might be a 99%-of-the-time-it-works solution, but is
that really good enough? The last things you want from a crash kernel
are:

1. Have it crash on its own because of something getting overwritten
(small chance, since most mappings are probably for writing out data
for later analysis)

or:

2. Have it write corrupted data to the crash dump. This makes it more
or less useless, since you can't trust what it wrote out: Did the
machine go down because of the memory corruption you're spotting, or
did that happen after the crash, while dumping it, etc?

Either way, a proper solution is needed, not a 99% one.

-Olof

^ permalink raw reply [flat|nested] 9+ messages in thread
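The hazard described in the message above can be made concrete with a small model. This is a sketch under stated assumptions, not a description of real hardware: `struct toy_machine`, `toy_init()` and `toy_dma_write()` are hypothetical names, the table is a single level, and an invalid entry simply makes the write fail (standing in for the DMA protection fault that triggers the checkstop).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE	16u		/* tiny pages keep the model readable */
#define TOY_NUM_PAGES	4u
#define TOY_NUM_TCES	2u
#define TOY_TCE_INVALID	UINT32_MAX	/* assumption: marks an unmapped entry */

/* Toy machine: physical memory plus a one-level DMA translation table. */
struct toy_machine {
	uint8_t  mem[TOY_NUM_PAGES][TOY_PAGE_SIZE];
	uint32_t tce[TOY_NUM_TCES];	/* table index -> physical page number */
};

static void toy_init(struct toy_machine *m)
{
	memset(m->mem, 0, sizeof(m->mem));
	for (uint32_t i = 0; i < TOY_NUM_TCES; i++)
		m->tce[i] = TOY_TCE_INVALID;
}

/*
 * A device writes one byte through table entry idx.
 * Returns 0 on success, -1 if the access would fault.
 */
static int toy_dma_write(struct toy_machine *m, uint32_t idx,
			 uint32_t off, uint8_t val)
{
	if (idx >= TOY_NUM_TCES || off >= TOY_PAGE_SIZE)
		return -1;
	if (m->tce[idx] >= TOY_NUM_PAGES)	/* also catches TOY_TCE_INVALID */
		return -1;

	m->mem[m->tce[idx]][off] = val;
	return 0;
}
```

In this model, a device that still holds index 0 from the first kernel faults once the entry is cleared, which corresponds to the checkstop the patch tries to avoid. If the crash kernel then points index 0 at a different page, the same stale write succeeds and lands in the new page, which is the silent corruption the message warns about.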
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 20:12 ` Olof Johansson
2006-03-23 23:06 ` Haren Myneni
@ 2006-03-27 5:04 ` Michael Ellerman
2006-03-27 14:06 ` Olof Johansson
1 sibling, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2006-03-27 5:04 UTC (permalink / raw)
To: Olof Johansson, Haren Myneni
Cc: linuxppc-dev, Paul Mackerras, Milton Miller, Olaf Hering
[-- Attachment #1: Type: text/plain, Size: 1803 bytes --]

On Thu, 2006-03-23 at 14:12 -0600, Olof Johansson wrote:
> On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:
>
> > The crash kernel needs to be even more careful, and instead read out
> > the entries that are mapped and reserve them. This would require a bit
> > more plumbing since there's no way to read an entry right now, but it'd
> > remove that hole.
>
> Actually, what's probably easier is to allocate some entries when the
> purgatory is set up, and make the crash kernel only use those by modifying
> the device tree accordingly. Sort of how regular memory is handled right
> now. That'd be a cleaner solution with less changes needed.
>
> The trick will be to get a decent size contiguous allocation, but the
> same applies for the memory reserve.

I disagree. In most cases the kdump kernel will be loaded by the boot
scripts, so reserving TCE space then is ~= reserving it for the life of
the first kernel. Given that TCE space is a scarce commodity I don't
think reserving it in the first kernel is a viable option.

What we should do is modify the second kernel so that instead of
clearing the TCE tables it instead walks the tables and detects existing
mappings, and then marks those as reserved so they're not overwritten.

This should give us 100% safety from the second kernel reusing a mapping
and copping a rogue DMA, and doesn't inflict any penalty on the first
kernel. It does fall down if there's no TCE space left for a device when
the second kernel comes up, but I think that's the best trade off.

cheers

--
Michael Ellerman
IBM OzLabs
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 9+ messages in thread
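The walk-and-reserve approach proposed in the message above might look roughly like the following. It is a user-space sketch, not the kernel's implementation: `reserve_inherited_tces()` and `alloc_tce()` are hypothetical names, the allocator map uses one byte per entry for readability, and it assumes a free entry reads back as zero, which may not hold on every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCE_FREE 0u	/* assumption: a free entry reads back as zero */

/*
 * Walk the table inherited from the first kernel and mark every live
 * entry as in use, so the crash kernel's allocator never hands it out.
 * Returns the number of entries still free for the crash kernel.
 */
static size_t reserve_inherited_tces(const uint64_t *tce, uint8_t *in_use,
				     size_t n)
{
	size_t free_count = 0;

	for (size_t i = 0; i < n; i++) {
		if (tce[i] != TCE_FREE) {
			in_use[i] = 1;	/* left mapped by the first kernel */
		} else {
			in_use[i] = 0;
			free_count++;
		}
	}
	return free_count;
}

/*
 * First-fit allocation of a single entry.
 * Returns the table index, or -1 when no entry is left.
 */
static long alloc_tce(uint8_t *in_use, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!in_use[i]) {
			in_use[i] = 1;
			return (long)i;
		}
	}
	return -1;
}
```

For a four-entry table in which entries 0 and 2 are still mapped, the walk reports two free entries; the allocator then returns 1 and 3 and fails on the third request. That failure is the trade-off mentioned above: if the first kernel left the table nearly full, the crash kernel may have no space left for a device.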
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-27 5:04 ` Michael Ellerman
@ 2006-03-27 14:06 ` Olof Johansson
0 siblings, 0 replies; 9+ messages in thread
From: Olof Johansson @ 2006-03-27 14:06 UTC (permalink / raw)
To: Michael Ellerman; +Cc: Olaf Hering, Paul Mackerras, Milton Miller, linuxppc-dev

On Mon, Mar 27, 2006 at 04:04:58PM +1100, Michael Ellerman wrote:

> I disagree. In most cases the kdump kernel will be loaded by the boot
> scripts, so reserving TCE space then is ~= reserving it for the life of
> the first kernel. Given that TCE space is a scarce commodity I don't
> think reserving it in the first kernel is a viable option.

Well, hopefully the kdump kernel doesn't need as much table space as a
regular kernel, so the loss would be limited, but if you're willing to
do the reserve instead; that'd be better.

> What we should do is modify the second kernel so that instead of
> clearing the TCE tables it instead walks the tables and detects existing
> mappings, and then marks those as reserved so they're not overwritten.

Yep, that's exactly what my first proposal was.

> This should give us 100% safety from the second kernel reusing a mapping
> and copping a rogue DMA, and doesn't inflict any penalty on the first
> kernel. It does fall down if there's no TCE space left for a device when
> the second kernel comes up, but I think that's the best trade off.

Correct. The time it could be a disadvantage is when there's been a
driver bug that leaks mappings that causes the machine to go down
(i.e. into the kdump kernel). Whatever device has been leaking might not
be usable since the table will be 100% full.

-Olof

^ permalink raw reply [flat|nested] 9+ messages in thread