* [PATCH] kdump: Fix for machine checkstop on DMA fault
@ 2006-03-23 4:30 Haren Myneni
2006-03-23 5:38 ` Olof Johansson
0 siblings, 1 reply; 9+ messages in thread
From: Haren Myneni @ 2006-03-23 4:30 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev, ellerman, Milton Miller, Olaf Hering
[-- Attachment #1: Type: text/plain, Size: 320 bytes --]
Paul, If you are OK with this fix, please send it upstream.
Thanks
Haren
- Some machines checkstop on a DMA protection fault caused by ongoing DMA
left over from the first kernel. Since we do not shut down devices before
the kdump boot, let them continue DMA to the old kernel space.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
[-- Attachment #2: kdump-dma-fault-fix.patch --]
[-- Type: text/x-patch, Size: 557 bytes --]
--- 2616-git5-k1/arch/powerpc/kernel/iommu.c.orig 2006-04-04 19:08:02.000000000 -0700
+++ 2616-git5-k1/arch/powerpc/kernel/iommu.c 2006-04-04 10:50:45.000000000 -0700
@@ -427,8 +427,10 @@ struct iommu_table *iommu_init_table(str
 	tbl->it_largehint = tbl->it_halfpoint;
 	spin_lock_init(&tbl->it_lock);
 
+#ifndef CONFIG_CRASH_DUMP
 	/* Clear the hardware table in case firmware left allocations in it */
 	ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
+#endif
 
 	if (!welcomed) {
 		printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 4:30 [PATCH] kdump: Fix for machine checkstop on DMA fault Haren Myneni
@ 2006-03-23 5:38 ` Olof Johansson
2006-03-23 6:06 ` Michael Ellerman
0 siblings, 1 reply; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 5:38 UTC (permalink / raw)
To: Haren Myneni
Cc: linuxppc-dev, ellerman, Paul Mackerras, Milton Miller,
Olaf Hering
On Wed, Mar 22, 2006 at 08:30:26PM -0800, Haren Myneni wrote:
> Paul, If you are OK with this fix, please send it upstream.
>
> Thanks
> Haren
>
> - Some machines checkstop on dma protection fault for ongoing DMA left
> in the first kernel. Since, we do not shutdown devices before the kdump
> boot, let them continue DMA to old kernel space.
How is this solved for regular kexec, doesn't the same problem exist
there?
-Olof
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 5:38 ` Olof Johansson
@ 2006-03-23 6:06 ` Michael Ellerman
2006-03-23 6:19 ` Olof Johansson
0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2006-03-23 6:06 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Milton Miller, Paul Mackerras, Olaf Hering, ellerman
[-- Attachment #1: Type: text/plain, Size: 888 bytes --]
On Thu, 23 Mar 2006 16:38, Olof Johansson wrote:
> On Wed, Mar 22, 2006 at 08:30:26PM -0800, Haren Myneni wrote:
> > Paul, If you are OK with this fix, please send it upstream.
> >
> > Thanks
> > Haren
> >
> > - Some machines checkstop on dma protection fault for ongoing DMA left
> > in the first kernel. Since, we do not shutdown devices before the kdump
> > boot, let them continue DMA to old kernel space.
>
> How is this solved for regular kexec, doesn't the same problem exist
> there?
The idea for normal kexec is that the kernel should have shut everything down
properly. It's a bug if there are still DMAs going on. Hopefully.
cheers
--
Michael Ellerman
IBM OzLabs
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 6:06 ` Michael Ellerman
@ 2006-03-23 6:19 ` Olof Johansson
2006-03-23 20:12 ` Olof Johansson
0 siblings, 1 reply; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 6:19 UTC (permalink / raw)
To: Michael Ellerman
Cc: Milton Miller, linuxppc-dev, Paul Mackerras, Olaf Hering,
ellerman
On Thu, Mar 23, 2006 at 05:06:27PM +1100, Michael Ellerman wrote:
> On Thu, 23 Mar 2006 16:38, Olof Johansson wrote:
> > On Wed, Mar 22, 2006 at 08:30:26PM -0800, Haren Myneni wrote:
> > > Paul, If you are OK with this fix, please send it upstream.
> > >
> > > Thanks
> > > Haren
> > >
> > > - Some machines checkstop on dma protection fault for ongoing DMA left
> > > in the first kernel. Since, we do not shutdown devices before the kdump
> > > boot, let them continue DMA to old kernel space.
> >
> > How is this solved for regular kexec, doesn't the same problem exist
> > there?
>
> The idea for normal kexec is that the kernel should have shut everything down
> properly. It's a bug if there are still DMAs going on. Hopefully.
Thanks Michael.
In that case, I have to NACK the original patch.
With luck, it'll probably work in most cases, but there's always
the risk of a DMA still going on, the crash kernel remapping an entry,
and memory getting scribbled over.
The crash kernel needs to be even more careful, and instead read out
the entries that are mapped and reserve them. This would require a bit
more plumbing since there's no way to read an entry right now, but it'd
remove that hole.
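
A minimal sketch of that idea, assuming a toy table model. The names
toy_iommu_table, toy_tce_get(), toy_reserve_inherited() and toy_alloc() are
hypothetical and only for illustration, not existing kernel interfaces, and a
zero entry is assumed to mean "unmapped". The read accessor stands in for the
plumbing that does not exist yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TBL_ENTRIES 16  /* hypothetical table size for this sketch */

/* Toy stand-in for an IOMMU table: hardware entries plus allocator state. */
struct toy_iommu_table {
	uint64_t tce[TBL_ENTRIES];    /* 0 is assumed to mean "not mapped" */
	uint8_t  in_use[TBL_ENTRIES]; /* 1: the allocator must not hand it out */
};

/* Hypothetical read accessor; the thread notes no such hook exists yet. */
static uint64_t toy_tce_get(const struct toy_iommu_table *tbl, size_t idx)
{
	assert(idx < TBL_ENTRIES);
	return tbl->tce[idx];
}

/* Instead of clearing the table, reserve every entry that is still mapped.
 * Returns the number of inherited entries that were reserved. */
static size_t toy_reserve_inherited(struct toy_iommu_table *tbl)
{
	size_t reserved = 0;

	for (size_t i = 0; i < TBL_ENTRIES; i++) {
		if (toy_tce_get(tbl, i) != 0) {
			tbl->in_use[i] = 1;
			reserved++;
		} else {
			tbl->in_use[i] = 0;
		}
	}
	return reserved;
}

/* First-fit allocation that skips reserved entries; -1 when the table is full. */
static int toy_alloc(struct toy_iommu_table *tbl, uint64_t tce)
{
	for (size_t i = 0; i < TBL_ENTRIES; i++) {
		if (!tbl->in_use[i]) {
			tbl->in_use[i] = 1;
			tbl->tce[i] = tce;
			return (int)i;
		}
	}
	return -1;
}
```

In this model the crash kernel would call toy_reserve_inherited() once at
table init, in place of the clear, and every later allocation lands on an
entry the first kernel was not using.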
-Olof
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 6:19 ` Olof Johansson
@ 2006-03-23 20:12 ` Olof Johansson
2006-03-23 23:06 ` Haren Myneni
2006-03-27 5:04 ` Michael Ellerman
0 siblings, 2 replies; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 20:12 UTC (permalink / raw)
To: Olof Johansson
Cc: Milton Miller, Michael Ellerman, linuxppc-dev, Paul Mackerras,
Olaf Hering, ellerman
On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:
> The crash kernel needs to be even more careful, and instead read out
> the entries that are mapped and reserve them. This would require a bit
> more plumbing since there's no way to read an entry right now, but it'd
> remove that hole.
Actually, what's probably easier is to allocate some entries when the
purgatory is set up, and make the crash kernel only use those by modifying
the device tree accordingly. Sort of how regular memory is handled right
now. That'd be a cleaner solution with fewer changes needed.
The trick will be to get a decent size contiguous allocation, but the
same applies for the memory reserve.
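
Roughly, and only as a sketch under assumptions: the window below is assumed
to be handed over through the device tree, and toy_window, toy_crash_table,
toy_crash_init() and toy_crash_alloc() are made-up names for illustration.
The allocator is a simple bump allocator with no free path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TBL_ENTRIES 32  /* hypothetical table size for this sketch */

/* Contiguous range of entries set aside for the crash kernel (assumed to be
 * described to it by the device tree). */
struct toy_window {
	size_t base;
	size_t count;
};

struct toy_crash_table {
	uint64_t tce[TBL_ENTRIES];
	struct toy_window win;
	size_t next; /* next unused offset inside the window */
};

/* Clear only the reserved window; entries outside it are left untouched so
 * DMA set up by the first kernel keeps its old translations. */
static void toy_crash_init(struct toy_crash_table *tbl, struct toy_window win)
{
	assert(win.base + win.count <= TBL_ENTRIES);
	tbl->win = win;
	tbl->next = 0;
	for (size_t i = 0; i < win.count; i++)
		tbl->tce[win.base + i] = 0;
}

/* Hand out entries from the window only; -1 once the window is used up. */
static int toy_crash_alloc(struct toy_crash_table *tbl, uint64_t tce)
{
	if (tbl->next >= tbl->win.count)
		return -1;

	size_t idx = tbl->win.base + tbl->next++;
	tbl->tce[idx] = tce;
	return (int)idx;
}
```

The cost is visible in the model too: whatever goes into the window is
unavailable to the first kernel for as long as the reservation exists.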
-Olof
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 20:12 ` Olof Johansson
@ 2006-03-23 23:06 ` Haren Myneni
2006-03-23 23:11 ` Olof Johansson
2006-03-27 5:04 ` Michael Ellerman
1 sibling, 1 reply; 9+ messages in thread
From: Haren Myneni @ 2006-03-23 23:06 UTC (permalink / raw)
To: Olof Johansson
Cc: Milton Miller, Michael Ellerman, linuxppc-dev, Paul Mackerras,
Olaf Hering, ellerman
[-- Attachment #1: Type: text/plain, Size: 1733 bytes --]
linuxppc-dev-bounces+hbabu=us.ibm.com@ozlabs.org wrote on 03/23/2006
12:12:58 PM:
> On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:
>
> > The crash kernel needs to be even more careful, and instead read out
> > the entries that are mapped and reserve them. This would require a bit
> > more plumbing since there's no way to read an entry right now, but it'd
> > remove that hole.
>
> Actually, what's probably easier is to allocate some entries when the
> purgatory is set up, and make the crash kernel only use those by modifying
> the device tree accordingly. Sort of how regular memory is handled right
> now. That'd be a cleaner solution with less changes needed.
>
> The trick will be to get a decent size contiguous allocation, but the
> same applies for the memory reserve.
Olof, Thanks for your comments/suggestions.
On JS21, immediately after the TCE entries are initialized, the machine
checkstops with the error "Internal CPU 1 Fault Error" on the bladecenter MM.
If we do not initialize the TCE entries for the crash kernel, the ongoing
DMA is allowed to continue to the old kernel memory. I thought that ongoing
DMA would be stopped when the drivers reset the devices later. I think some
hardening is already included in some drivers to take care of this
behavior, but I might be wrong. So far, I have had an e100 issue after testing
on p5, p4, js20 and js21. Probably that could be a lucky scenario.
So, I will be keeping the same change (posted here) plus your suggestion,
right? Can we apply the same approach even for power-4?
Thanks
Haren
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 23:06 ` Haren Myneni
@ 2006-03-23 23:11 ` Olof Johansson
0 siblings, 0 replies; 9+ messages in thread
From: Olof Johansson @ 2006-03-23 23:11 UTC (permalink / raw)
To: Haren Myneni
Cc: Milton Miller, Michael Ellerman, linuxppc-dev, Paul Mackerras,
Olaf Hering, ellerman
Hi,
On Thu, Mar 23, 2006 at 03:06:22PM -0800, Haren Myneni wrote:
> On JS21, immediately after the tce entries are initialized, the machine
> checkstops with an error "Internal CPU 1 Fault Error" on bladecenter MM.
> If we do not initialize tce entries for crash kernel, allows the ongoing
> DMA continue to the old kernel memory. I though that, ongoing DMA will be
The problem isn't when DMA is going to the old kernel memory. The
problem is when that TCE entry gets reused by the crashdump kernel, and
some other memory gets overwritten instead.
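
To make that concrete, here is a toy model of the hazard, not real IOMMU
code. toy_machine, toy_map() and toy_dma_write() are hypothetical names; a
"TCE" here is just an index into a small array that says which page a device
write ends up in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGES      4
#define PAGE_BYTES 8
#define TCES       2

struct toy_machine {
	uint8_t mem[PAGES][PAGE_BYTES];
	int tce[TCES]; /* page index each translation entry points at */
};

/* Point a translation entry at a page (what a kernel does when mapping). */
static void toy_map(struct toy_machine *m, int tce_idx, int page)
{
	assert(tce_idx >= 0 && tce_idx < TCES);
	assert(page >= 0 && page < PAGES);
	m->tce[tce_idx] = page;
}

/* A device writes through an entry it was given earlier. The device does
 * not know or care whether the entry has been remapped since. */
static void toy_dma_write(struct toy_machine *m, int tce_idx, uint8_t val)
{
	assert(tce_idx >= 0 && tce_idx < TCES);
	memset(m->mem[m->tce[tce_idx]], val, PAGE_BYTES);
}
```

If the first kernel maps entry 0 to page 1 and the crash kernel later reuses
entry 0 for page 3, the next toy_dma_write() through entry 0 fills page 3,
even though the device was never told about the new mapping.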
> stopped when the device reset happens later by the drivers. I think, some
> hardening is already included in some drivers to take care of this
> behavior. I might be wrong. So far, I had e100 issue after testing on p5,
What assures that the crash kernel has drivers for all hardware in the
system? If there's no driver, what will then be used to quiesce the
device?
> p4, js20 and js21. Probably, it could be lucky scenario.
> So, will be keeping the same change (posted here) plus your suggestion.
> Right? Can we apply same approach even for power-4?
What you have now might be a 99%-of-the-time-it-works solution, but is
that really good enough?
The last things you want from a crash kernel are:
1. Have it crash on its own because of something getting overwritten
(small chance, since most mappings are probably for writing out data
for later analysis)
or:
2. Have it write corrupted data to the crash dump. This makes it more or
less useless, since you can't trust what it wrote out: Did the machine
go down because of the memory corruption you're spotting, or did that
happen after the crash, while dumping it, etc?
Either way, a proper solution is needed, not a 99% one.
-Olof
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-23 20:12 ` Olof Johansson
2006-03-23 23:06 ` Haren Myneni
@ 2006-03-27 5:04 ` Michael Ellerman
2006-03-27 14:06 ` Olof Johansson
1 sibling, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2006-03-27 5:04 UTC (permalink / raw)
To: Olof Johansson, Haren Myneni
Cc: linuxppc-dev, Paul Mackerras, Milton Miller, Olaf Hering
[-- Attachment #1: Type: text/plain, Size: 1803 bytes --]
On Thu, 2006-03-23 at 14:12 -0600, Olof Johansson wrote:
> On Thu, Mar 23, 2006 at 12:19:04AM -0600, Olof Johansson wrote:
>
> > The crash kernel needs to be even more careful, and instead read out
> > the entries that are mapped and reserve them. This would require a bit
> > more plumbing since there's no way to read an entry right now, but it'd
> > remove that hole.
>
> Actually, what's probably easier is to allocate some entries when the
> purgatory is set up, and make the crash kernel only use those by modifying
> the device tree accordingly. Sort of how regular memory is handled right
> now. That'd be a cleaner solution with less changes needed.
>
> The trick will be to get a decent size contiguous allocation, but the
> same applies for the memory reserve.
I disagree. In most cases the kdump kernel will be loaded by the boot
scripts, so reserving TCE space then is ~= reserving it for the life of
the first kernel. Given that TCE space is a scarce commodity I don't
think reserving it in the first kernel is a viable option.
What we should do is modify the second kernel so that instead of
clearing the TCE tables it instead walks the tables and detects existing
mappings, and then marks those as reserved so they're not overwritten.
This should give us 100% safety from the second kernel reusing a mapping
and copping a rogue DMA, and doesn't inflict any penalty on the first
kernel. It does fall down if there's no TCE space left for a device when
the second kernel comes up, but I think that's the best trade off.
cheers
--
Michael Ellerman
IBM OzLabs
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH] kdump: Fix for machine checkstop on DMA fault
2006-03-27 5:04 ` Michael Ellerman
@ 2006-03-27 14:06 ` Olof Johansson
0 siblings, 0 replies; 9+ messages in thread
From: Olof Johansson @ 2006-03-27 14:06 UTC (permalink / raw)
To: Michael Ellerman; +Cc: Olaf Hering, Paul Mackerras, Milton Miller, linuxppc-dev
On Mon, Mar 27, 2006 at 04:04:58PM +1100, Michael Ellerman wrote:
> I disagree. In most cases the kdump kernel will be loaded by the boot
> scripts, so reserving TCE space then is ~= reserving it for the life of
> the first kernel. Given that TCE space is a scarce commodity I don't
> think reserving it in the first kernel is a viable option.
Well, hopefully the kdump kernel doesn't need as much table space as a
regular kernel, so the loss would be limited, but if you're willing to
do the reserve instead; that'd be better.
> What we should do is modify the second kernel so that instead of
> clearing the TCE tables it instead walks the tables and detects existing
> mappings, and then marks those as reserved so they're not overwritten.
Yep, that's exactly what my first proposal was.
> This should give us 100% safety from the second kernel reusing a mapping
> and copping a rogue DMA, and doesn't inflict any penalty on the first
> kernel. It does fall down if there's no TCE space left for a device when
> the second kernel comes up, but I think that's the best trade off.
Correct. The time it could be a disadvantage is when there's been a
driver bug that leaks mappings that causes the machine to go down (i.e.
into the kdump kernel). Whatever device has been leaking might not be
usable since the table will be 100% full.
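
As a small illustration of that corner case, assuming again that a zero
entry means "unmapped": toy_free_after_reserve() is a hypothetical helper
that counts what would be left for the second kernel once every inherited
mapping is treated as reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the entries the second kernel could still allocate if every nonzero
 * (inherited) entry is reserved rather than cleared. */
static size_t toy_free_after_reserve(const uint64_t *tce, size_t n)
{
	size_t free_entries = 0;

	assert(tce != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (tce[i] == 0)
			free_entries++;
	}
	return free_entries;
}
```

A table filled up by leaked mappings gives 0 here, so a device behind that
table gets no DMA space in the kdump kernel.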
-Olof