linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* 3.5+: yaboot, Invalid memory access
@ 2012-07-31  5:46 Christian Kujau
  2012-08-01  2:02 ` Tony Breeds
  2012-09-04  6:18 ` Christian Kujau
  0 siblings, 2 replies; 12+ messages in thread
From: Christian Kujau @ 2012-07-31  5:46 UTC (permalink / raw)
  To: linuxppc-dev

Hi,

when trying to upgrade from 3.5 (final) to today's git checkout from 
Linus' tree, yaboot cannot boot and the following is printed:

  [...]
  returning from prom_init
  Invalid memory access at %SRR0: 00c62fd4  %SRR1: 00003030

The whole message: http://nerdbynature.de/bits/3.5.0/yaboot/

The Yaboot version is 1.3.16 (from Debian/testing), I haven't tried 1.3.17 
yet, its changelog does not say anything about "newer kernel support" so 
I'm not sure if this would help here.

Any other ideas?

Thanks,
Christian.
-- 
BOFH excuse #389:

/dev/clue was linked to /dev/null

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-07-31  5:46 3.5+: yaboot, Invalid memory access Christian Kujau
@ 2012-08-01  2:02 ` Tony Breeds
  2012-08-01  7:30   ` Christian Kujau
  2012-09-04  6:18 ` Christian Kujau
  1 sibling, 1 reply; 12+ messages in thread
From: Tony Breeds @ 2012-08-01  2:02 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 945 bytes --]

On Mon, Jul 30, 2012 at 10:46:39PM -0700, Christian Kujau wrote:
> Hi,
> 
> when trying to upgrade from 3.5 (final) to today's git checkout from 
> Linus' tree, yaboot cannot boot and the following is printed:
> 
>   [...]
>   returning from prom_init
>   Invalid memory access at %SRR0: 00c62fd4  %SRR1: 00003030
> 
> The whole message: http://nerdbynature.de/bits/3.5.0/yaboot/
> 
> The Yaboot version is 1.3.16 (from Debian/testing), I haven't tried 1.3.17 
> yet, its changelog does not say anything about "newer kernel support" so 
> I'm not sure if this would help here.

Well the "returning from prom_init" message is from the kernel so yaboot
has done something.  We'll work out if it was the right thing ;P

I had a look at the URL you provided.

1) can you also upload you vmlinux so we can poke at it.
2) What is the dmseg.txt file there?  If you're unable to boot 3.5 how
did you get the dmesg?

Yours Tony

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-08-01  2:02 ` Tony Breeds
@ 2012-08-01  7:30   ` Christian Kujau
  0 siblings, 0 replies; 12+ messages in thread
From: Christian Kujau @ 2012-08-01  7:30 UTC (permalink / raw)
  To: Tony Breeds; +Cc: linuxppc-dev

On Wed, 1 Aug 2012 at 12:02, Tony Breeds wrote:
> 1) can you also upload you vmlinux so we can poke at it.

I've uploaded the vmlinx (and System.map) to the same location:

  http://nerdbynature.de/bits/3.5.0/yaboot/

> 2) What is the dmseg.txt file there?  If you're unable to boot 3.5 how
> did you get the dmesg?

This is a dmesg from the working 3.5.0 (final), i.e. the release version 
which is working. The version that is not booting is "yesterday's git 
tree" or 37cd9600a9e20359b0283983c9e3a55d84347168 to be specific.

Sorry for the confusion. I only put the dmesg there so that people can 
have a look and see what type of machine this is.

Christian.
-- 
BOFH excuse #54:

Evil dogs hypnotised the night shift

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-07-31  5:46 3.5+: yaboot, Invalid memory access Christian Kujau
  2012-08-01  2:02 ` Tony Breeds
@ 2012-09-04  6:18 ` Christian Kujau
  2012-09-04  6:51   ` Michael Ellerman
  1 sibling, 1 reply; 12+ messages in thread
From: Christian Kujau @ 2012-09-04  6:18 UTC (permalink / raw)
  To: Steven Rostedt, Benjamin Herrenschmidt; +Cc: linuxppc-dev

On Mon, 30 Jul 2012 at 22:46, Christian Kujau wrote:
> when trying to upgrade from 3.5 (final) to today's git checkout from 
> Linus' tree, yaboot cannot boot and the following is printed:
> 
>   [...]
>   returning from prom_init
>   Invalid memory access at %SRR0: 00c62fd4  %SRR1: 00003030

Finally got around to do a bisection:

-----------------------------
b6e3796834faefe4b6e9a2aedfe12665cd51fbc5 is the first bad commit
commit b6e3796834faefe4b6e9a2aedfe12665cd51fbc5
Author: Steven Rostedt <srostedt@redhat.com>
Date:   Thu Apr 26 08:31:18 2012 +0000

    powerpc: Have patch_instruction detect faults
    
    For ftrace to use the patch_instruction code, it needs to check for
    faults on write. Ftrace updates code all over the kernel, and we need 
    to know if code is updated or not due to protections that are placed 
    on some portions of the kernel. If ftrace does not detect a fault, it 
    will error later on, and it will be much more difficult to find the 
    problem.
    
    By changing patch_instruction() to detect faults, then ftrace will be
    able to make use of it too.
    
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

:040000 040000 83fb0e420524db452c07425e4dc402041428e696 0d1a01acd863237cf388045946ad4446a28df50c M      arch
-----------------------------

The changeset looked pretty harmless to me but I tested with a current 
3.6+ git checkout and the kernel would only boot when this changeset was
reverted.

Thoughts?

Thanks,
Christian.

> The whole message: http://nerdbynature.de/bits/3.5.0/yaboot/
> 
> The Yaboot version is 1.3.16 (from Debian/testing), I haven't tried 1.3.17 
> yet, its changelog does not say anything about "newer kernel support" so 
> I'm not sure if this would help here.
> 
> Any other ideas?
> 
> Thanks,
> Christian.

-- 
BOFH excuse #154:

You can tune a file system, but you can't tune a fish (from most tunefs man pages)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-04  6:18 ` Christian Kujau
@ 2012-09-04  6:51   ` Michael Ellerman
  2012-09-04  9:32     ` Christian Kujau
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Michael Ellerman @ 2012-09-04  6:51 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Steven Rostedt, linuxppc-dev

On Mon, 2012-09-03 at 23:18 -0700, Christian Kujau wrote:
> On Mon, 30 Jul 2012 at 22:46, Christian Kujau wrote:
> > when trying to upgrade from 3.5 (final) to today's git checkout from 
> > Linus' tree, yaboot cannot boot and the following is printed:
> > 
> >   [...]
> >   returning from prom_init
> >   Invalid memory access at %SRR0: 00c62fd4  %SRR1: 00003030
> 
> Finally got around to do a bisection:
> 
> -----------------------------
> b6e3796834faefe4b6e9a2aedfe12665cd51fbc5 is the first bad commit
> commit b6e3796834faefe4b6e9a2aedfe12665cd51fbc5
> Author: Steven Rostedt <srostedt@redhat.com>
> Date:   Thu Apr 26 08:31:18 2012 +0000
> 
>     powerpc: Have patch_instruction detect faults
>     
>     For ftrace to use the patch_instruction code, it needs to check for
>     faults on write. Ftrace updates code all over the kernel, and we need 
>     to know if code is updated or not due to protections that are placed 
>     on some portions of the kernel. If ftrace does not detect a fault, it 
>     will error later on, and it will be much more difficult to find the 
>     problem.
>     
>     By changing patch_instruction() to detect faults, then ftrace will be
>     able to make use of it too.
>     
>     Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
>     Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> 
> :040000 040000 83fb0e420524db452c07425e4dc402041428e696 0d1a01acd863237cf388045946ad4446a28df50c M      arch
> -----------------------------
> 
> The changeset looked pretty harmless to me but I tested with a current 
> 3.6+ git checkout and the kernel would only boot when this changeset was
> reverted.
> 
> Thoughts?

My guess would be we're calling that quite early and the __put_user()
check is getting confused and failing. That means we'll have left some
code unpatched, which then fails.

Can you try with the patch applied, but instead of returning if the
__put_user() fails, just continue on anyway.

That will isolate if it's something in the __put_user() (I doubt it), or
just that the __put_user() is failing and leaving the code unpatched.

cheers

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-04  6:51   ` Michael Ellerman
@ 2012-09-04  9:32     ` Christian Kujau
  2012-09-05  1:08       ` Benjamin Herrenschmidt
  2012-09-04 14:27     ` Steven Rostedt
  2012-09-04 19:40     ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 12+ messages in thread
From: Christian Kujau @ 2012-09-04  9:32 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Steven Rostedt, linuxppc-dev

On Tue, 4 Sep 2012 at 16:51, Michael Ellerman wrote:
> My guess would be we're calling that quite early and the __put_user()
> check is getting confused and failing. That means we'll have left some
> code unpatched, which then fails.
> 
> Can you try with the patch applied, but instead of returning if the
> __put_user() fails, just continue on anyway.

You mean, like this?

------
diff --git a/arch/powerpc/lib/code-patching.c 
b/arch/powerpc/lib/code-patching.c
index dd223b3..755b623 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -21,8 +21,8 @@ int patch_instruction(unsigned int *addr, unsigned int 
instr)
        int err;
 
        err = __put_user(instr, addr);
-       if (err)
-               return err;
+//     if (err)
+//             return err;
        asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));
        return 0;
 }
------


Thanks,
Christian.

> 
> That will isolate if it's something in the __put_user() (I doubt it), or
> just that the __put_user() is failing and leaving the code unpatched.
> 
> cheers
-- 
BOFH excuse #361:

Communist revolutionaries taking over the server room and demanding all the computers in the building or they shoot the sysadmin. Poor misguided fools.

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-04  6:51   ` Michael Ellerman
  2012-09-04  9:32     ` Christian Kujau
@ 2012-09-04 14:27     ` Steven Rostedt
  2012-09-04 19:49       ` Benjamin Herrenschmidt
  2012-09-04 19:40     ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2012-09-04 14:27 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Christian Kujau, linuxppc-dev

On Tue, 2012-09-04 at 16:51 +1000, Michael Ellerman wrote:

> My guess would be we're calling that quite early and the __put_user()
> check is getting confused and failing. That means we'll have left some
> code unpatched, which then fails.
> 
> Can you try with the patch applied, but instead of returning if the
> __put_user() fails, just continue on anyway.
> 
> That will isolate if it's something in the __put_user() (I doubt it), or
> just that the __put_user() is failing and leaving the code unpatched.

As a test case, continuing on error is fine, but I would not recommend
that as a fix. If it fails, but still does the patch, that could be
harmful, and confusing of a result.

Need to figure out why put_user is failing.

-- Steve

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-04  6:51   ` Michael Ellerman
  2012-09-04  9:32     ` Christian Kujau
  2012-09-04 14:27     ` Steven Rostedt
@ 2012-09-04 19:40     ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2012-09-04 19:40 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Christian Kujau, linuxppc-dev, Steven Rostedt

On Tue, 2012-09-04 at 16:51 +1000, Michael Ellerman wrote:
> 
> My guess would be we're calling that quite early and the __put_user()
> check is getting confused and failing. That means we'll have left some
> code unpatched, which then fails.
> 
> Can you try with the patch applied, but instead of returning if the
> __put_user() fails, just continue on anyway.
> 
> That will isolate if it's something in the __put_user() (I doubt it),
> or
> just that the __put_user() is failing and leaving the code unpatched.

Bah, worse.... we do the feature fixups on 32-bit while still running
unrelocated, so we need RELOC() macros everywhere. That's probably
busted.

We might be able to try replacing __put_user() with __put_user_size() to
avoid the kernel address check, which is I think what's breaking it.

That stuff has long been in my list of things to rework, ie, we do that
fixup way too early on ppc32, in fact earlier than ppc64, ie, before the
platform is probed which means the platform cannot adjust the bits.

However, iirc, we have some early setup code that relies on the fixup
being done that early so it's a bit of a chicken & egg problem that
needs to be untangled first.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-04 14:27     ` Steven Rostedt
@ 2012-09-04 19:49       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2012-09-04 19:49 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Christian Kujau, linuxppc-dev

On Tue, 2012-09-04 at 10:27 -0400, Steven Rostedt wrote:
> As a test case, continuing on error is fine, but I would not recommend
> that as a fix. If it fails, but still does the patch, that could be
> harmful, and confusing of a result.
> 
> Need to figure out why put_user is failing. 

It's probably not failiing... it's just called in a context in very
early boot on ppc32 where the might_sleep() test in there will trip
(because we are running before we are operating at our link address) and
will crash horribly.

I'll try to figure out a good way to fix this today.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-04  9:32     ` Christian Kujau
@ 2012-09-05  1:08       ` Benjamin Herrenschmidt
  2012-09-05  5:25         ` Christian Kujau
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2012-09-05  1:08 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linuxppc-dev, Steven Rostedt

On Tue, 2012-09-04 at 02:32 -0700, Christian Kujau wrote:
> On Tue, 4 Sep 2012 at 16:51, Michael Ellerman wrote:
> > My guess would be we're calling that quite early and the __put_user()
> > check is getting confused and failing. That means we'll have left some
> > code unpatched, which then fails.
> > 
> > Can you try with the patch applied, but instead of returning if the
> > __put_user() fails, just continue on anyway.
> 
> You mean, like this?

Try this:

powerpc: Don't use __put_user() in patch_instruction

patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.

Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index dd223b3..17e5b23 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -20,7 +20,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
 {
 	int err;
 
-	err = __put_user(instr, addr);
+	__put_user_size(instr, addr, 4, err);
 	if (err)
 		return err;
 	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-05  1:08       ` Benjamin Herrenschmidt
@ 2012-09-05  5:25         ` Christian Kujau
  2012-09-05  5:29           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Christian Kujau @ 2012-09-05  5:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Steven Rostedt

On Wed, 5 Sep 2012 at 11:08, Benjamin Herrenschmidt wrote:
> Try this:
> 
> powerpc: Don't use __put_user() in patch_instruction

Perfect! With this patch applied, the machine boots again.

  Tested-by: Christian Kujau <lists@nerdbynature.de>

I sure hope that other people will benefit from this as well. I can't 
stand the thought that you guys are always putting out fixes for this ol' 
PowerBook of mine :-\

Thanks so much,
Christian.

> 
> patch_instruction() can be called very early on ppc32, when the kernel
> isn't yet running at it's linked address. That can cause the !
> is_kernel_addr() test in __put_user() to trip and call might_sleep()
> which is very bad at that point during boot.
> 
> Use a lower level function instead for now, at least until we get to
> rework ppc32 boot process to do the code patching later, like ppc64
> does.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> 
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index dd223b3..17e5b23 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -20,7 +20,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>  {
>  	int err;
>  
> -	err = __put_user(instr, addr);
> +	__put_user_size(instr, addr, 4, err);
>  	if (err)
>  		return err;
>  	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));
> 
> 
> 

-- 
BOFH excuse #101:

Collapsed Backbone

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 3.5+: yaboot, Invalid memory access
  2012-09-05  5:25         ` Christian Kujau
@ 2012-09-05  5:29           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2012-09-05  5:29 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linuxppc-dev, Steven Rostedt

On Tue, 2012-09-04 at 22:25 -0700, Christian Kujau wrote:
> I sure hope that other people will benefit from this as well. I can't 
> stand the thought that you guys are always putting out fixes for this
> ol' 
> PowerBook of mine :-\

Well, for some strange reason I didn't observe the problem on a couple
of old macs here, might have got lucky or something (could depend on
whether might_sleep() is compiled to do something or not... depends
on .config debug options).

Anyways, I'll send that fix to Linus asap. Thanks for reporting !

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-09-05  5:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-31  5:46 3.5+: yaboot, Invalid memory access Christian Kujau
2012-08-01  2:02 ` Tony Breeds
2012-08-01  7:30   ` Christian Kujau
2012-09-04  6:18 ` Christian Kujau
2012-09-04  6:51   ` Michael Ellerman
2012-09-04  9:32     ` Christian Kujau
2012-09-05  1:08       ` Benjamin Herrenschmidt
2012-09-05  5:25         ` Christian Kujau
2012-09-05  5:29           ` Benjamin Herrenschmidt
2012-09-04 14:27     ` Steven Rostedt
2012-09-04 19:49       ` Benjamin Herrenschmidt
2012-09-04 19:40     ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).