* Re: visible memory seems wrong in kexec crash dump kernel
@ 2013-07-13 6:30 ` Chris Friesen
0 siblings, 0 replies; 28+ messages in thread
From: Chris Friesen @ 2013-07-13 6:30 UTC (permalink / raw)
To: Michael Ellerman; +Cc: kexec, Paul Mackerras, linuxppc-dev, Vivek Goyal
On 07/12/2013 04:59 PM, Chris Friesen wrote:
> On 07/12/2013 03:08 PM, Chris Friesen wrote:
>
>> I turned on the instrumentation in early_init_dt_scan_memory() and got
>> the following when jumping to the capture kernel:
>>
>> memory scan node memory, reg size 16, data: 0 0 2 0,
>> - 0 , 200000000
>>
>> That 0x200000000 matches the fact that I'm seeing 8GB of memory
>> available in the recovery kernel.
>>
>> If I boot the original kernel with "crashkernel=224M@32M", should I
>> expect that only 224MB is marked as "linux,usable-memory" in the
>> recovery kernel?
>
> I started looking at the kexec side of things, and I noticed something a
> bit odd. In most places dealing with the device tree in kexec it accepts
> either "memory" or "memory@" for the memory node name. In
> add_usable_mem_property() in arch/ppc64/fs2dt.c it seems to only accept
> "memory@".
>
> Is this expected behaviour? It seems to be the same in current git
> versions of kexec-tools.
>
> On my system I see "/proc/device-tree/memory".
>
> If I modify add_usable_mem_property() to also accept "/memory" then my
> recovery kernel boots up with
>
> physicalMemorySize = 0x10000000
>
> which is 256MB (which is still a bit odd since I specified 224MB for the
> crashkernel).
>
> However, it then hits the BUG() call at the end of mark_bootmem() in
> mm/bootmem.c.
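The numbers quoted above can be checked with a small sketch. It assumes the memory node uses two 32-bit address cells and two 32-bit size cells (which the 16-byte reg suggests), so the printed data "0 0 2 0" decodes to base 0 and size 0x200000000. The helper names are hypothetical and are not taken from the kernel or kexec-tools.

```c
#include <stdint.h>

/* Combine two 32-bit device-tree cells (high word first) into one
 * 64-bit value. Assumes #address-cells = 2 and #size-cells = 2. */
uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* End address in bytes of a crashkernel=SIZE@OFFSET reservation,
 * with both values given in MiB. */
uint64_t crash_region_end(uint64_t size_mib, uint64_t offset_mib)
{
    return (offset_mib + size_mib) << 20;
}
```

With these, cells_to_u64(2, 0) is 0x200000000 (8GB), and crash_region_end(224, 32) is 0x10000000 (256MB). One plausible reading, not confirmed in this thread, is that the 256MB figure is the end address of the 224M@32M region rather than its size.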
One final thing and I'll stop replying to myself. :)
It looks like the problem is that some board-specific freescale code was
calling lmb_reserve() with a base address in the 4GB range. It seems
odd that lmb_reserve() didn't throw some kind of error when the recovery
kernel was supposed to be limited to 224MB.
Rather than try and fix the bug, I turned off the (unneeded) config
options related to the above lmb_reserve() calls and was able to
successfully access the information I needed via /dev/oldmem.
The upshot is that there seem to be a number of things that could be
improved:
1) kexec should accept "/memory" and not just "/memory@"
2) lmb_reserve() should really respect the crashkernel memory limit
3) the freescale stuff really shouldn't assume it can map things
wherever it feels like
Chris
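For item 1, a minimal sketch of a name check that accepts both forms might look like the following. The function is_memory_node_name() is a hypothetical helper written for illustration; it is not the actual code in add_usable_mem_property() or anywhere else in kexec-tools.

```c
#include <string.h>

/* Return 1 if a device-tree node name denotes a memory node, i.e. it is
 * either the bare name "memory" or "memory@<unit-address>"; 0 otherwise.
 * Hypothetical helper, not part of kexec-tools. */
int is_memory_node_name(const char *name)
{
    static const char base[] = "memory";
    size_t n = sizeof(base) - 1;

    if (strncmp(name, base, n) != 0)
        return 0;
    /* Reject names that merely start with "memory", e.g. "memory-controller". */
    return name[n] == '\0' || name[n] == '@';
}
```

Checking the character after the prefix keeps unrelated nodes whose names only begin with "memory" from being treated as memory nodes.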
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
2013-07-13 6:30 ` Chris Friesen
@ 2013-07-14 4:36 ` Michael Ellerman
-1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2013-07-14 4:36 UTC (permalink / raw)
To: Chris Friesen
Cc: Benjamin Herrenschmidt, kexec, Haren Myneni, Paul Mackerras,
linuxppc-dev, Vivek Goyal
On Sat, Jul 13, 2013 at 12:30:50AM -0600, Chris Friesen wrote:
> On 07/12/2013 04:59 PM, Chris Friesen wrote:
> >On 07/12/2013 03:08 PM, Chris Friesen wrote:
> >
> >>I turned on the instrumentation in early_init_dt_scan_memory() and got
> >>the following when jumping to the capture kernel:
> >>
> >>memory scan node memory, reg size 16, data: 0 0 2 0,
> >>- 0 , 200000000
> >>
> >>That 0x200000000 matches the fact that I'm seeing 8GB of memory
> >>available in the recovery kernel.
> >>
> >>If I boot the original kernel with "crashkernel=224M@32M", should I
> >>expect that only 224MB is marked as "linux,usable-memory" in the
> >>recovery kernel?
> >
> >I started looking at the kexec side of things, and I noticed something a
> >bit odd. In most places dealing with the device tree in kexec it accepts
> >either "memory" or "memory@" for the memory node name. In
> >add_usable_mem_property() in arch/ppc64/fs2dt.c it seems to only accept
> >"memory@".
> >
> >Is this expected behaviour? It seems to be the same in current git
> >versions of kexec-tools.
> >
> >On my system I see "/proc/device-tree/memory".
> >
> >If I modify add_usable_mem_property() to also accept "/memory" then my
> >recovery kernel boots up with
> >
> >physicalMemorySize = 0x10000000
> >
> >which is 256MB (which is still a bit odd since I specified 224MB for the
> >crashkernel).
> >
> >However, it then hits the BUG() call at the end of mark_bootmem() in
> >mm/bootmem.c.
>
> One final thing and I'll stop replying to myself. :)
>
> It looks like the problem is that some board-specific freescale code
> was calling lmb_reserve() with a base address in the 4GB range. It
> seems odd that lmb_reserve() didn't throw some kind of error when
> the recovery kernel was supposed to be limited to 224MB.
>
> Rather than try and fix the bug, I turned off the (unneeded) config
> options related to the above lmb_reserve() calls and was able to
> successfully access the information I needed via /dev/oldmem.
>
> The upshot is that there seems to be a number of things that could
> be improved:
>
> 1) kexec should accept "/memory" and not just "/memory@"
Yeah probably, I think folks tend to use "/memory@0" even if they only
have a single memory node. But that does sound like a bug in kexec.
> 2) lmb_reserve() should really respect the crashkernel memory limit
It's been replaced in mainline, so you'd have to check what the current
code does in that situation.
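The kind of check being asked for in item 2 can be sketched as below. This is an illustration only: reservation_within_limit() is a hypothetical helper, and nothing here describes what lmb_reserve() or its mainline replacement actually does.

```c
#include <stdint.h>

/* Return 1 if the range [base, base + size) lies entirely inside
 * [0, limit), 0 otherwise. Written to avoid overflow in base + size.
 * Hypothetical helper, not kernel code. */
int reservation_within_limit(uint64_t base, uint64_t size, uint64_t limit)
{
    if (size == 0)
        return base <= limit;
    return base < limit && size <= limit - base;
}
```

With a 256MB limit (0x10000000), a reservation based in the 4GB range such as 0x100000000 fails this check, which is the situation described above.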
> 3) the freescale stuff really shouldn't assume it can map things
> wherever it feels like
Agreed :)
cheers
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
@ 2013-07-14 4:36 ` Michael Ellerman
0 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2013-07-14 4:36 UTC (permalink / raw)
To: Chris Friesen; +Cc: kexec, Paul Mackerras, linuxppc-dev, Vivek Goyal
On Sat, Jul 13, 2013 at 12:30:50AM -0600, Chris Friesen wrote:
> On 07/12/2013 04:59 PM, Chris Friesen wrote:
> >On 07/12/2013 03:08 PM, Chris Friesen wrote:
> >
> >>I turned on the instrumentation in early_init_dt_scan_memory() and got
> >>the following when jumping to the capture kernel:
> >>
> >>memory scan node memory, reg size 16, data: 0 0 2 0,
> >>- 0 , 200000000
> >>
> >>That 0x200000000 matches the fact that I'm seeing 8GB of memory
> >>available in the recovery kernel.
> >>
> >>If I boot the original kernel with "crashkernel=224M@32M", should I
> >>expect that only 224MB is marked as "linux,usable-memory" in the
> >>recovery kernel?
> >
> >I started looking at the kexec side of things, and I noticed something a
> >bit odd. In most places dealing with the device tree in kexec it accepts
> >either "memory" or "memory@" for the memory node name. In
> >add_usable_mem_property() in arch/ppc64/fs2dt.c it seems to only accept
> >"memory@".
> >
> >Is this expected behaviour? It seems to be the same in current git
> >versions of kexec-tools.
> >
> >On my system I see "/proc/device-tree/memory".
> >
> >If I modify add_usable_mem_property() to also accept "/memory" then my
> >recovery kernel boots up with
> >
> >physicalMemorySize = 0x10000000
> >
> >which is 256MB (which is still a bit odd since I specified 224MB for the
> >crashkernel).
> >
> >However, it then hits the BUG() call at the end of mark_bootmem() in
> >mm/bootmem.c.
>
> One final thing and I'll stop replying to myself. :)
>
> It looks like the problem is that some board-specific freescale code
> was calling lmb_reserve() with a base address in the 4GB range. It
> seems odd that lmb_reserve() didn't throw some kind of error when
> the recovery kernel was supposed to be limited to 224MB.
>
> Rather than try and fix the bug, I turned off the (unneeded) config
> options related to the above lmb_reserve() calls and was able to
> successfully access the information I needed via /dev/oldmem.
>
> The upshot is that there seems to be a number of things that could
> be improved:
>
> 1) kexec should accept "/memory" and not just "/memory@"
Yeah probably, I think folks tend to use "/memory@0" even if they only
have a single memory node. But that does sound like a bug in kexec.
> 2) lmb_reserve() should really respect the crashkernel memory limit
It's been replaced in mainline, so you'd have to check what the current
code does in that situation.
> 3) the freescale stuff really shouldn't assume it can map things
> wherever it feels like
Agreed :)
cheers
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
2013-07-14 4:36 ` Michael Ellerman
@ 2013-07-14 5:26 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 28+ messages in thread
From: Benjamin Herrenschmidt @ 2013-07-14 5:26 UTC (permalink / raw)
To: Michael Ellerman
Cc: Chris Friesen, kexec, Haren Myneni, Paul Mackerras, linuxppc-dev,
Vivek Goyal
On Sun, 2013-07-14 at 14:36 +1000, Michael Ellerman wrote:
> > >Is this expected behaviour? It seems to be the same in current git
> > >versions of kexec-tools.
> > >
> > >On my system I see "/proc/device-tree/memory".
> > >
> > >If I modify add_usable_mem_property() to also accept "/memory" then my
This is a bug in your device-tree. The memory node should have a unit
address which corresponds to its reg property. I know people tend to
skip it for "0" but it's bad practice.
So for memory starting at 0 it should be memory@0
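As a sketch of that naming rule, the node name can be derived from the base address in the reg property. The function memory_node_name() is a hypothetical helper, and the lowercase-hex form without a "0x" prefix is the usual devicetree convention assumed here.

```c
#include <stdio.h>
#include <stdint.h>

/* Format the conventional node name for a memory node whose reg property
 * starts at 'base': "memory@" followed by the base in lowercase hex.
 * Returns the snprintf() result. Hypothetical helper for illustration. */
int memory_node_name(char *buf, size_t len, uint64_t base)
{
    return snprintf(buf, len, "memory@%llx", (unsigned long long)base);
}
```

For a base of 0 this yields "memory@0", and for a base of 0x80000000 it yields "memory@80000000".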
Cheers,
Ben.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
2013-07-14 5:26 ` Benjamin Herrenschmidt
@ 2013-07-14 23:08 ` Chris Friesen
-1 siblings, 0 replies; 28+ messages in thread
From: Chris Friesen @ 2013-07-14 23:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: kexec, Michael Ellerman, Haren Myneni, Paul Mackerras,
linuxppc-dev, Vivek Goyal
On 07/13/2013 11:26 PM, Benjamin Herrenschmidt wrote:
> On Sun, 2013-07-14 at 14:36 +1000, Michael Ellerman wrote:
>>>> Is this expected behaviour? It seems to be the same in current git
>>>> versions of kexec-tools.
>>>>
>>>> On my system I see "/proc/device-tree/memory".
>>>>
>>>> If I modify add_usable_mem_property() to also accept "/memory" then my
>
> This is a bug in your device-tree. The memory node should have a unit
> address which corresponds to its reg property. I know people tend to
> skip it for "0" but it's bad practice.
>
> So for memory starting at 0 it should be memory@0
There are a fair number of dts files in the kernel tree that don't
specify an address for the memory node.
If the kernel accepts it without an address, it seems logical that kexec
should as well.
Or maybe the kernel should just implicitly assume an address of zero and
export it as such in /proc/device-tree?
Chris
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
2013-07-14 23:08 ` Chris Friesen
@ 2013-07-14 23:11 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 28+ messages in thread
From: Benjamin Herrenschmidt @ 2013-07-14 23:11 UTC (permalink / raw)
To: Chris Friesen
Cc: kexec, Michael Ellerman, Haren Myneni, Paul Mackerras,
linuxppc-dev, Vivek Goyal
On Sun, 2013-07-14 at 17:08 -0600, Chris Friesen wrote:
> > So for memory starting at 0 it should be memory@0
>
> There are a fair number of dts files in the kernel tree that don't
> specify an address for the memory node.
>
> If the kernel accepts it without an address, it seems logical that kexec
> should as well.
As long as kexec doesn't start being stupid when there are several nodes
and doesn't pick up the "first one in device-tree order" instead of the
one at 0...
I've been hit by that sort of bug before (though not specifically in
kexec).
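To illustrate the concern, a sketch of selecting the memory node by lowest base address, rather than by position in the tree, could look like this. The struct mem_node type and lowest_memory_node() are hypothetical names for illustration and do not correspond to kexec-tools data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory node: its name plus the base
 * and size decoded from its reg property. */
struct mem_node {
    const char *name;
    uint64_t base;
    uint64_t size;
};

/* Return the node with the lowest base address, or NULL if count is 0.
 * The order of the array (device-tree order) does not affect the result. */
const struct mem_node *lowest_memory_node(const struct mem_node *nodes,
                                          size_t count)
{
    const struct mem_node *best = NULL;

    for (size_t i = 0; i < count; i++) {
        if (best == NULL || nodes[i].base < best->base)
            best = &nodes[i];
    }
    return best;
}
```

Given a tree that lists "memory@80000000" before a bare "memory" node at base 0, this returns the node at 0 even though it is not first.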
> Or maybe the kernel should just implicitly assume an address of zero and
> export it as such in /proc/device-tree?
I don't want /proc/device-tree to expose something different than what's in
the actual device-tree, that would be the source for endless horrors.
We already are borderline with the occasional renaming we do in the case of
duplicate name+property...
Cheers,
Ben.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
2013-07-13 6:30 ` Chris Friesen
@ 2013-07-29 23:10 ` Scott Wood
-1 siblings, 0 replies; 28+ messages in thread
From: Scott Wood @ 2013-07-29 23:10 UTC (permalink / raw)
To: Chris Friesen
Cc: Michael Ellerman, Paul Mackerras, kexec, linuxppc-dev,
Vivek Goyal
On 07/13/2013 01:30:50 AM, Chris Friesen wrote:
> On 07/12/2013 04:59 PM, Chris Friesen wrote:
>> On 07/12/2013 03:08 PM, Chris Friesen wrote:
>>
>>> I turned on the instrumentation in early_init_dt_scan_memory() and got
>>> the following when jumping to the capture kernel:
>>>
>>> memory scan node memory, reg size 16, data: 0 0 2 0,
>>> - 0 , 200000000
>>>
>>> That 0x200000000 matches the fact that I'm seeing 8GB of memory
>>> available in the recovery kernel.
>>>
>>> If I boot the original kernel with "crashkernel=224M@32M", should I
>>> expect that only 224MB is marked as "linux,usable-memory" in the
>>> recovery kernel?
>>
>> I started looking at the kexec side of things, and I noticed something a
>> bit odd. In most places dealing with the device tree in kexec it accepts
>> either "memory" or "memory@" for the memory node name. In
>> add_usable_mem_property() in arch/ppc64/fs2dt.c it seems to only accept
>> "memory@".
>>
>> Is this expected behaviour? It seems to be the same in current git
>> versions of kexec-tools.
>>
>> On my system I see "/proc/device-tree/memory".
>>
>> If I modify add_usable_mem_property() to also accept "/memory" then my
>> recovery kernel boots up with
>>
>> physicalMemorySize = 0x10000000
>>
>> which is 256MB (which is still a bit odd since I specified 224MB for the
>> crashkernel).
>>
>> However, it then hits the BUG() call at the end of mark_bootmem() in
>> mm/bootmem.c.
>
> One final thing and I'll stop replying to myself. :)
>
> It looks like the problem is that some board-specific freescale code
> was calling lmb_reserve() with a base address in the 4GB range. It
> seems odd that lmb_reserve() didn't throw some kind of error when the
> recovery kernel was supposed to be limited to 224MB.
>
> Rather than try and fix the bug, I turned off the (unneeded) config
> options related to the above lmb_reserve() calls and was able to
> successfully access the information I needed via /dev/oldmem.
>
> The upshot is that there seems to be a number of things that could be
> improved:
>
> 1) kexec should accept "/memory" and not just "/memory@"
> 2) lmb_reserve() should really respect the crashkernel memory limit
> 3) the freescale stuff really shouldn't assume it can map things
> wherever it feels like
What "board-specific freescale code" are you referring to?
-Scott
^ permalink raw reply [flat|nested] 28+ messages in thread
* RE: visible memory seems wrong in kexec crash dump kernel
2013-07-29 23:10 ` Scott Wood
@ 2013-07-31 16:40 ` Friesen, Christopher
-1 siblings, 0 replies; 28+ messages in thread
From: Friesen, Christopher @ 2013-07-31 16:40 UTC (permalink / raw)
To: Scott Wood
Cc: Michael Ellerman, Paul Mackerras, kexec@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, Vivek Goyal
From: Scott Wood [scottwood@freescale.com]
Sent: Monday, July 29, 2013 5:10 PM
To: Friesen, Christopher
Cc: Michael Ellerman; kexec@lists.infradead.org; Paul Mackerras; linuxppc-dev@lists.ozlabs.org; Vivek Goyal
Subject: Re: visible memory seems wrong in kexec crash dump kernel
On 07/13/2013 01:30:50 AM, Chris Friesen wrote:
> The upshot is that there seems to be a number of things that could be
> improved:
>
> 1) kexec should accept "/memory" and not just "/memory@"
> 2) lmb_reserve() should really respect the crashkernel memory limit
> 3) the freescale stuff really shouldn't assume it can map things
> wherever it feels like
What "board-specific freescale code" are you referring to?
-Scott
Sorry for the crappy quoting, I'm using a web outlook portal.
I've switched employers so I don't have access to the exact details any more. The system in question was a Kontron AM4150 which uses the P5020. As I recall, one of the Freescale drivers (I think it was the buffer or queue manager that the network driver makes use of) was attempting to call lmb_reserve() with a base address in the 4GB range even though the recovery kernel was limited to 224MB of memory.
While I've got your attention, the other thing that I found was that the "dpa" network driver didn't properly work in a kexec'd kernel even when given lots of memory. It would work for a little bit and then hang.
This was all on a Wind River Linux 4.3 system, which is based on 2.6.34. It's possible it's been fixed in more recent kernels. The testcase to see if current kernels are still affected should be fairly simple:
1) Take a P5020 board and boot it with NFS rootfs, reserving 224MB for crashkernel.
2) Load a crashkernel into it, configured to use NFS rootfs.
3) Trigger a panic on the main kernel.
4) Ensure it loads the recovery kernel properly, that it is limited to the correct amount of memory, that it has visibility of the original kernel's memory, that it has network connectivity and is stable when copying a few gigs of data over the network.
Chris
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: visible memory seems wrong in kexec crash dump kernel
2013-07-31 16:40 ` Friesen, Christopher
@ 2013-07-31 16:50 ` Scott Wood
-1 siblings, 0 replies; 28+ messages in thread
From: Scott Wood @ 2013-07-31 16:50 UTC (permalink / raw)
To: Friesen, Christopher
Cc: Michael Ellerman, Paul Mackerras, kexec@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, Vivek Goyal
On 07/31/2013 11:40:05 AM, Friesen, Christopher wrote:
>
> From: Scott Wood [scottwood@freescale.com]
> Sent: Monday, July 29, 2013 5:10 PM
> To: Friesen, Christopher
> Cc: Michael Ellerman; kexec@lists.infradead.org; Paul Mackerras;
> linuxppc-dev@lists.ozlabs.org; Vivek Goyal
> Subject: Re: visible memory seems wrong in kexec crash dump kernel
>
> On 07/13/2013 01:30:50 AM, Chris Friesen wrote:
> > The upshot is that there seems to be a number of things that could be
> > improved:
> >
> > 1) kexec should accept "/memory" and not just "/memory@"
> > 2) lmb_reserve() should really respect the crashkernel memory limit
> > 3) the freescale stuff really shouldn't assume it can map things
> > wherever it feels like
>
> What "board-specific freescale code" are you referring to?
>
> -Scott
>
>
> Sorry for the crappy quoting, I'm using a web outlook portal.
>
> I've switched employers so I don't have access to the exact details
> any more. The system in question was a Kontron AM4150 which uses the
> P5020. As I recall, one of the Freescale drivers (I think it was the
> buffer or queue manager that the network driver makes use of) was
> attempting to call lmb_reserve() with a base address in the 4GB range
> even though the recovery kernel was limited to 224MB of memory.
That's not "board specific" code, and it's not even mainline Linux
code. Unfortunately none of the datapath stuff is upstream, still.
> While I've got your attention, the other thing that I found was that
> the "dpa" network driver didn't properly work in a kexec'd kernel
> even when given lots of memory. It would work for a little bit and
> then hang.
I'm not particularly surprised by this. It doesn't help that there's
no way to do a device reset. :-(
Issues with Freescale SDK code should be reported on
https://community.freescale.com/, to support@freescale.com, or to your
FAE.
-Scott
^ permalink raw reply [flat|nested] 28+ messages in thread