Re: Maple PPC970 kexec crash-dump problems

linuxppc-dev.lists.ozlabs.org archive mirror

From: Milton Miller <miltonm@bga.com>
To: Benjamin Walsh <walsh.benj@gmail.com>
Cc: linuxppc-dev list <linuxppc-dev@ozlabs.org>
Subject: Re: Maple PPC970 kexec crash-dump problems
Date: Fri, 6 Feb 2009 10:53:14 -0600
Message-ID: <94c75807ec390be91bde1452536bae51@bga.com>
In-Reply-To: <cb298a870902041048v57391b62l7701933465b7f43c@mail.gmail.com>


On Feb 4, 2009, at 12:48 PM, Benjamin Walsh wrote:

> Hi Milton,
>
> I've tracked it down to the device tree passed to the second kernel
> being screwed up when patched by kexec-tools. Namely, it was creating
> linux,usable-memory entries that were wrong, and the MMU
> initialization hung when it failed allocating for the page tables. I
> hacked the tool, and got past that point in the init sequence, but
> the very first IO mapped access fails, so the MMU doesn't seem to be
> set up correctly.

I would need more details on exactly what you think is wrong.

How does it fail?

If the first IO mapped access fails, then I would ask if you are using
the IOMMU.  It is quite possible that the dart iommu code needs to be
modified to use the existing mapping table instead of allocating a new
table; otherwise any existing mappings being used by inflight dma would
fail, and that might cause mmio loads to wait for uncompletable dma
writes.  Just a theory, given the lack of information you gave me.


>
> Anyway, on to my question: is the crash dump (kdump) kernel supposed
> to use the memory reserved for it by the first kernel for its working
> memory? E.g. on that board, I have 0->2GB and 4->6GB for a total of
> 4GB of RAM. Let's say I reserve 128M@32M, that's 0x2000000->0xa000000.
> Is the second kernel supposed to use
>
> (0x2000000+<kernel size>) -> 0xa000000
>
> for its memory pool and leave everything else:
>
> 0->0x2000000, 0xa000000 -> 0x80000000, 0x100000000 -> 0x180000000
>
> as memory that is from the first kernel, used to debug it?


Yes, but that is not quite how the device tree is formed.

The second kernel will also use the interrupt vector area at address 0.
Therefore that area is saved by purgatory as the backup region, at an
address allocated in the kdump region.  The device tree is then
created with linux,usable-memory regions extending the kdump region
back to 0, and a reserve entry marking the area as reserved.
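
As a cross-check of the arithmetic in the question, here is a minimal
Python sketch that splits the board's two memory banks around a
128M@32M reservation. The helper names (parse_size, parse_crashkernel,
subtract) are hypothetical and are not taken from kexec-tools or the
kernel. The sketch does not model the backup region or the extra
usable range back to 0 described above.

```python
# Hypothetical helpers for illustration only; not kexec-tools code.

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '128M' into a byte count."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1], 0) * UNITS[text[-1]]
    return int(text, 0)


def parse_crashkernel(arg):
    """Split 'size@offset' (e.g. '128M@32M') into a (start, end) pair."""
    size, offset = arg.split("@")
    start = parse_size(offset)
    return start, start + parse_size(size)


def subtract(banks, hole):
    """Return the parts of each (start, end) bank that lie outside hole."""
    hole_start, hole_end = hole
    left = []
    for start, end in banks:
        if hole_end <= start or end <= hole_start:
            left.append((start, end))  # bank does not touch the hole
            continue
        if start < hole_start:
            left.append((start, hole_start))  # piece below the hole
        if hole_end < end:
            left.append((hole_end, end))  # piece above the hole
    return left


GB = 1 << 30
banks = [(0, 2 * GB), (4 * GB, 6 * GB)]  # 0->2GB and 4->6GB, as in the question
crash = parse_crashkernel("128M@32M")
first_kernel = subtract(banks, crash)

for name, ranges in (("crash region", [crash]), ("first kernel", first_kernel)):
    print(name, ", ".join(f"{s:#x}->{e:#x}" for s, e in ranges))
```

Under these assumptions the crash region comes out as
0x2000000->0xa000000 and the remaining three ranges match the ones
listed in the question.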

In addition, the device tree gets the memory backing tce tables for
pseries smp mode.  It may need the page with the dart table, marked
reserved, so that the table gets added to the linear map -- except it
should be mapped cache inhibited so that may not work either.

>
> Basically, I am trying to figure out if I patched the tool correctly.
>
> Thanks,
> Ben
>
> On Sat, Jan 24, 2009 at 2:52 AM, Milton Miller <miltonm@bga.com> wrote:
>> On Sat Jan 24 at 07:59:47 EST in 2009, Benjamin Walsh wrote:
>>> I am trying to use kexec with a crash dump kernel on a Maple board
>>> (Motorola ATCA6101 to be precise). This board is running a two-CPU
>>> PPC970FX. I am running a 2.6.27-10 kernel and have tried both older
>>> kexec-tools and the newest ones. I have tried SMP and non-SMP
>>> kernels.
>>
>>  Once you start the second cpu it is likely executing instructions
>> somewhere.
>>
>>  Prior to 2.6.27 you had to compile a fixed offset kernel to run
>> kdump.  With 2.6.27 that option was removed and replaced with the
>> relocatable kernel.  However, because of the way linux interacts with
>> open firmware, the kernel will still move itself to 0 unless a
>> specific flag is set.  The location of the flag was changed twice
>> during the merge process, and the patches for kexec-tools were not
>> made until early this year.
>>
>>
>>> Using kexec -l to fast boot works correctly. However, loading a
>>> crash dump kernel and triggering a crash via
>>> echo c > /proc/sysrq-trigger simply hangs the board. I have traced
>>> the sequence down to after the call to kexec_copy_flush(), when the
>>> CPU returns to real-address mode (bl real_mode). At this point I
>>> have no further debugging information.
>>
>>
>>> Two things could help me:
>>>
>>>  - Getting the fix if this is a known issue and a fix exists. I have
>>> looked at recent patches and nothing leapt to mind, mostly
>>> relocatable kernel support.
>>
>>  That is a major change.
>>
>>  That said, I don't know if anyone has tested kexec panic beyond
>> pseries for 64 bit powerpc.
>>
>>  I know Paul originally prototyped the relocatable patch on a
>> powermac, but I don't know what if any smp testing he performed.
>> And you said you are actually on maple not a powermac, so the startup
>> issues are different.
>>
>>
>>> - Obtaining the address of the serial port @3f8 in real mode. The
>>> init sequence with udbg ON says that the physical address of the
>>> port is 0xf40003f8; however, setting it up in poll mode and trying
>>> to stuff characters in the tx buffer doesn't produce anything.
>>
>>  Ah yes.  In real mode you can only talk to cacheable memory without
>> implementation specific assistance.  However, if you look in the
>> kernel for the maple early udbg support, you will find the code you
>> need to talk to that serial port in real mode.
>>
>>
>>>
>>>  Has anyone recently tried to use the serial port in real mode ?
>>>
>>>  Thanks for any help.
>>>
>>>  Ben
>>
>>  Hope this gets you started.  I wrote a lot of the kernel code, but I
>> had the advantage of external jtag access to the processor to see
>> where it ended up when it went astray.
>>
>>  milton
>>

