linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* 2.6.23-rc7-mm1 -- powerpc rtas panic
       [not found] <20070924021716.9bfe7dfb.akpm@linux-foundation.org>
@ 2007-09-24 12:35 ` Andy Whitcroft
  2007-10-02 23:28   ` Linas Vepstas
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Whitcroft @ 2007-09-24 12:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linuxppc-dev, linux-kernel

Seeing the following from an older power LPAR, pretty sure we had
this in the previous -mm also:

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000047ac8
cpu 0x0: Vector: 300 (Data Access) at [c00000000058f750]
    pc: c000000000047ac8: .pSeries_log_error+0x364/0x420
    lr: c000000000047a4c: .pSeries_log_error+0x2e8/0x420
    sp: c00000000058f9d0
   msr: 8000000000001032
   dar: 0
 dsisr: 42000000
  current = 0xc0000000004a9b30
  paca    = 0xc0000000004aa700
    pid   = 0, comm = swapper
enter ? for help
[c00000000058faf0] c000000000021164 .rtas_call+0x200/0x250
[c00000000058fba0] c000000000049cd0 .early_enable_eeh+0x168/0x360
[c00000000058fc70] c00000000002f674 .traverse_pci_devices+0x8c/0x138
[c00000000058fd10] c000000000460ce8 .eeh_init+0x1a8/0x200
[c00000000058fdb0] c00000000045fb70 .pSeries_setup_arch+0x128/0x234
[c00000000058fe40] c00000000044f830 .setup_arch+0x214/0x24c
[c00000000058fee0] c000000000446a38 .start_kernel+0xd4/0x3e4
[c00000000058ff90] c000000000373194 .start_here_common+0x54/0x58

This machine is a:

processor       : 0
cpu             : POWER4+ (gq)
clock           : 1703.965296MHz
revision        : 19.0

[...]

timebase        : 212995662
machine         : CHRP IBM,7040-681

-apw

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-09-24 12:35 ` 2.6.23-rc7-mm1 -- powerpc rtas panic Andy Whitcroft
@ 2007-10-02 23:28   ` Linas Vepstas
  2007-10-03  0:26     ` Tony Breeds
  0 siblings, 1 reply; 10+ messages in thread
From: Linas Vepstas @ 2007-10-02 23:28 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On Mon, Sep 24, 2007 at 01:35:31PM +0100, Andy Whitcroft wrote:
> Seeing the following from an older power LPAR, pretty sure we had
> this in the previous -mm also:

I haven't forgetten about this ... and am looking at it now.
Seems that whenever I go to reserve the machine pSeries-102,
someone else is using it :-)

--linas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-02 23:28   ` Linas Vepstas
@ 2007-10-03  0:26     ` Tony Breeds
  2007-10-03  0:30       ` Michael Ellerman
  0 siblings, 1 reply; 10+ messages in thread
From: Tony Breeds @ 2007-10-03  0:26 UTC (permalink / raw)
  To: Linas Vepstas; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On Tue, Oct 02, 2007 at 06:28:19PM -0500, Linas Vepstas wrote:
> On Mon, Sep 24, 2007 at 01:35:31PM +0100, Andy Whitcroft wrote:
> > Seeing the following from an older power LPAR, pretty sure we had
> > this in the previous -mm also:
> 
> I haven't forgetten about this ... and am looking at it now.
> Seems that whenever I go to reserve the machine pSeries-102,
> someone else is using it :-)

This panic is caused by "[POWERPC] pseries: Fix jumbled no_logging flag."
(79c0108d1b9db4864ab77b2a95dfa04f2dcf264c), in the powerpc/for-2.6.24
branch.  It looks to me that we have logging enabled too early now.

I think the following is a reasonable fix?

---
Explicitly enable RTAS error logging, when it should be ready.


Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>

---

 arch/powerpc/platforms/pseries/rtasd.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
index 30925d2..0df5d0d 100644
--- a/arch/powerpc/platforms/pseries/rtasd.c
+++ b/arch/powerpc/platforms/pseries/rtasd.c
@@ -54,7 +54,10 @@ static unsigned int rtas_event_scan_rate;
 static int full_rtas_msgs = 0;
 
 /* Stop logging to nvram after first fatal error */
-static int no_more_logging;
+static int no_more_logging = 1; /* Until we initialize everything,
+                                 * make sure we don't try logging
+                                 * anything */
+
 
 static int error_log_cnt;
 
@@ -414,6 +417,8 @@ static int rtasd(void *unused)
 	memset(logdata, 0, rtas_error_log_max);
 	rc = nvram_read_error_log(logdata, rtas_error_log_max,
 	                          &err_type, &error_log_cnt);
+	/* We can use rtas_log_buf now */
+	no_more_logging = 0;
 
 	if (!rc) {
 		if (err_type != ERR_FLAG_ALREADY_LOGGED) {

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-03  0:26     ` Tony Breeds
@ 2007-10-03  0:30       ` Michael Ellerman
  2007-10-03  1:19         ` Tony Breeds
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2007-10-03  0:30 UTC (permalink / raw)
  To: Tony Breeds; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2015 bytes --]

On Wed, 2007-10-03 at 10:26 +1000, Tony Breeds wrote:
> On Tue, Oct 02, 2007 at 06:28:19PM -0500, Linas Vepstas wrote:
> > On Mon, Sep 24, 2007 at 01:35:31PM +0100, Andy Whitcroft wrote:
> > > Seeing the following from an older power LPAR, pretty sure we had
> > > this in the previous -mm also:
> > 
> > I haven't forgetten about this ... and am looking at it now.
> > Seems that whenever I go to reserve the machine pSeries-102,
> > someone else is using it :-)
> 
> This panic is caused by "[POWERPC] pseries: Fix jumbled no_logging flag."
> (79c0108d1b9db4864ab77b2a95dfa04f2dcf264c), in the powerpc/for-2.6.24
> branch.  It looks to me that we have logging enabled too early now.
> 
> I think the following is a reasonable fix?
> 
> ---
> Explicitly enable RTAS error logging, when it should be ready.
> 
> 
> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> 
> ---
> 
>  arch/powerpc/platforms/pseries/rtasd.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
> index 30925d2..0df5d0d 100644
> --- a/arch/powerpc/platforms/pseries/rtasd.c
> +++ b/arch/powerpc/platforms/pseries/rtasd.c
> @@ -54,7 +54,10 @@ static unsigned int rtas_event_scan_rate;
>  static int full_rtas_msgs = 0;
>  
>  /* Stop logging to nvram after first fatal error */
> -static int no_more_logging;
> +static int no_more_logging = 1; /* Until we initialize everything,
> +                                 * make sure we don't try logging
> +                                 * anything */
> +

I realise it'll make the patch bigger, but this doesn't seem like a
particularly good name for the variable anymore.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-03  0:30       ` Michael Ellerman
@ 2007-10-03  1:19         ` Tony Breeds
  2007-10-03  4:09           ` Michael Ellerman
  2007-10-05  0:01           ` Nish Aravamudan
  0 siblings, 2 replies; 10+ messages in thread
From: Tony Breeds @ 2007-10-03  1:19 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote:
 
> I realise it'll make the patch bigger, but this doesn't seem like a
> particularly good name for the variable anymore.

Sure, what about?

Clarify when RTAS logging is enabled.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>

---
 arch/powerpc/platforms/pseries/rtasd.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
index 30925d2..73401c8 100644
--- a/arch/powerpc/platforms/pseries/rtasd.c
+++ b/arch/powerpc/platforms/pseries/rtasd.c
@@ -54,8 +54,9 @@ static unsigned int rtas_event_scan_rate;
 static int full_rtas_msgs = 0;
 
 /* Stop logging to nvram after first fatal error */
-static int no_more_logging;
-
+static int logging_enabled; /* Until we initialize everything,
+                             * make sure we don't try logging
+                             * anything */
 static int error_log_cnt;
 
 /*
@@ -217,7 +218,7 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
 	}
 
 	/* Write error to NVRAM */
-	if (!no_more_logging && !(err_type & ERR_FLAG_BOOT))
+	if (logging_enabled && !(err_type & ERR_FLAG_BOOT))
 		nvram_write_error_log(buf, len, err_type, error_log_cnt);
 
 	/*
@@ -229,8 +230,8 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
 		printk_log_rtas(buf, len);
 
 	/* Check to see if we need to or have stopped logging */
-	if (fatal || no_more_logging) {
-		no_more_logging = 1;
+	if (fatal || !logging_enabled) {
+		logging_enabled = 0;
 		spin_unlock_irqrestore(&rtasd_log_lock, s);
 		return;
 	}
@@ -302,7 +303,7 @@ static ssize_t rtas_log_read(struct file * file, char __user * buf,
 
 	spin_lock_irqsave(&rtasd_log_lock, s);
 	/* if it's 0, then we know we got the last one (the one in NVRAM) */
-	if (rtas_log_size == 0 && !no_more_logging)
+	if (rtas_log_size == 0 && logging_enabled)
 		nvram_clear_error_log();
 	spin_unlock_irqrestore(&rtasd_log_lock, s);
 
@@ -414,6 +415,8 @@ static int rtasd(void *unused)
 	memset(logdata, 0, rtas_error_log_max);
 	rc = nvram_read_error_log(logdata, rtas_error_log_max,
 	                          &err_type, &error_log_cnt);
+	/* We can use rtas_log_buf now */
+	logging_enabled = 1;
 
 	if (!rc) {
 		if (err_type != ERR_FLAG_ALREADY_LOGGED) {

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-03  1:19         ` Tony Breeds
@ 2007-10-03  4:09           ` Michael Ellerman
  2007-10-03 18:50             ` Linas Vepstas
  2007-10-05  0:01           ` Nish Aravamudan
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2007-10-03  4:09 UTC (permalink / raw)
  To: Tony Breeds; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2975 bytes --]

On Wed, 2007-10-03 at 11:19 +1000, Tony Breeds wrote:
> On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote:
>  
> > I realise it'll make the patch bigger, but this doesn't seem like a
> > particularly good name for the variable anymore.
> 
> Sure, what about?

Better .. but  ..   :D

> diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
> index 30925d2..73401c8 100644
> --- a/arch/powerpc/platforms/pseries/rtasd.c
> +++ b/arch/powerpc/platforms/pseries/rtasd.c
> @@ -54,8 +54,9 @@ static unsigned int rtas_event_scan_rate;
>  static int full_rtas_msgs = 0;
>  
>  /* Stop logging to nvram after first fatal error */
> -static int no_more_logging;
> -
> +static int logging_enabled; /* Until we initialize everything,
> +                             * make sure we don't try logging
> +                             * anything */

Until we initialise what exactly?

>  static int error_log_cnt;
>  
>  /*
> @@ -217,7 +218,7 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
>  	}
>  
>  	/* Write error to NVRAM */
> -	if (!no_more_logging && !(err_type & ERR_FLAG_BOOT))
> +	if (logging_enabled && !(err_type & ERR_FLAG_BOOT))
>  		nvram_write_error_log(buf, len, err_type, error_log_cnt);
>  
>  	/*
> @@ -229,8 +230,8 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
>  		printk_log_rtas(buf, len);
>  
>  	/* Check to see if we need to or have stopped logging */
> -	if (fatal || no_more_logging) {
> -		no_more_logging = 1;
> +	if (fatal || !logging_enabled) {
> +		logging_enabled = 0;
>  		spin_unlock_irqrestore(&rtasd_log_lock, s);
>  		return;
>  	}

Hmmm, this routine has 4 separate lock-dropping exit paths ..

> @@ -302,7 +303,7 @@ static ssize_t rtas_log_read(struct file * file, char __user * buf,
>  
>  	spin_lock_irqsave(&rtasd_log_lock, s);
>  	/* if it's 0, then we know we got the last one (the one in NVRAM) */
> -	if (rtas_log_size == 0 && !no_more_logging)
> +	if (rtas_log_size == 0 && logging_enabled)
>  		nvram_clear_error_log();
>  	spin_unlock_irqrestore(&rtasd_log_lock, s);
>  
> @@ -414,6 +415,8 @@ static int rtasd(void *unused)
>  	memset(logdata, 0, rtas_error_log_max);
>  	rc = nvram_read_error_log(logdata, rtas_error_log_max,
>  	                          &err_type, &error_log_cnt);
> +	/* We can use rtas_log_buf now */
> +	logging_enabled = 1;
>  
>  	if (!rc) {
>  		if (err_type != ERR_FLAG_ALREADY_LOGGED) {

What exactly happens that allows us to do logging? I don't see any
ordering between anything else and the setting of the flag, and AFAICT
we're not inside a spinlock or anything here.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-03  4:09           ` Michael Ellerman
@ 2007-10-03 18:50             ` Linas Vepstas
  0 siblings, 0 replies; 10+ messages in thread
From: Linas Vepstas @ 2007-10-03 18:50 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On Wed, Oct 03, 2007 at 02:09:46PM +1000, Michael Ellerman wrote:
> 
> Until we initialise what exactly?

Until we allocate the error log buffer. The original crash was 
for a null-pointer deref of the unallocated buffer. I just sent 
out a patch to fix this; its a bit simpler than the below.

In that email, I remarked:

Andy Whitcroft's crash was appearently due to firmware complaining
about lost power, (actually, lost power supply redundancy!), which
occurred very early during boot.

    Type                00000040 (EPOW)
    Status:             bypassed new
    Residual error from previous boot.
    EPOW Sensor Value:  00000002
    EPOW warning due to loss of redundancy.
    EPOW general power fault.

I've no clue why firmware thought it was OK to report this
during one of the earliest calls to RTAS; I'm still investiigating
that.

--linas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-03  1:19         ` Tony Breeds
  2007-10-03  4:09           ` Michael Ellerman
@ 2007-10-05  0:01           ` Nish Aravamudan
  2007-10-05 16:03             ` Linas Vepstas
  1 sibling, 1 reply; 10+ messages in thread
From: Nish Aravamudan @ 2007-10-05  0:01 UTC (permalink / raw)
  To: Tony Breeds; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On 10/2/07, Tony Breeds <tony@bakeyournoodle.com> wrote:
> On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote:
>
> > I realise it'll make the patch bigger, but this doesn't seem like a
> > particularly good name for the variable anymore.
>
> Sure, what about?
>
> Clarify when RTAS logging is enabled.
>
> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>

For what it's worth, on a different ppc64 box, this resolves a similar
panic for me.

Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>

Thanks,
Nish

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-05  0:01           ` Nish Aravamudan
@ 2007-10-05 16:03             ` Linas Vepstas
  2007-10-08  3:47               ` Nish Aravamudan
  0 siblings, 1 reply; 10+ messages in thread
From: Linas Vepstas @ 2007-10-05 16:03 UTC (permalink / raw)
  To: Nish Aravamudan; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On Thu, Oct 04, 2007 at 05:01:47PM -0700, Nish Aravamudan wrote:
> On 10/2/07, Tony Breeds <tony@bakeyournoodle.com> wrote:
> > On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote:
> >
> > > I realise it'll make the patch bigger, but this doesn't seem like a
> > > particularly good name for the variable anymore.
> >
> > Sure, what about?
> >
> > Clarify when RTAS logging is enabled.
> >
> > Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> 
> For what it's worth, on a different ppc64 box, this resolves a similar
> panic for me.
> 
> Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>

For the reasons explained, I'd really like to nack Tony's patch.

--linas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.23-rc7-mm1 -- powerpc rtas panic
  2007-10-05 16:03             ` Linas Vepstas
@ 2007-10-08  3:47               ` Nish Aravamudan
  0 siblings, 0 replies; 10+ messages in thread
From: Nish Aravamudan @ 2007-10-08  3:47 UTC (permalink / raw)
  To: Linas Vepstas; +Cc: linuxppc-dev, Andrew Morton, linux-kernel

On 10/5/07, Linas Vepstas <linas@austin.ibm.com> wrote:
> On Thu, Oct 04, 2007 at 05:01:47PM -0700, Nish Aravamudan wrote:
> > On 10/2/07, Tony Breeds <tony@bakeyournoodle.com> wrote:
> > > On Wed, Oct 03, 2007 at 10:30:16AM +1000, Michael Ellerman wrote:
> > >
> > > > I realise it'll make the patch bigger, but this doesn't seem like a
> > > > particularly good name for the variable anymore.
> > >
> > > Sure, what about?
> > >
> > > Clarify when RTAS logging is enabled.
> > >
> > > Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> >
> > For what it's worth, on a different ppc64 box, this resolves a similar
> > panic for me.
> >
> > Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>
>
> For the reasons explained, I'd really like to nack Tony's patch.

I see. Can you reply in this thread with the patch you mentioned in
your other reply? (or point me to a copy of it)

Thanks,
Nish

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-10-08  3:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20070924021716.9bfe7dfb.akpm@linux-foundation.org>
2007-09-24 12:35 ` 2.6.23-rc7-mm1 -- powerpc rtas panic Andy Whitcroft
2007-10-02 23:28   ` Linas Vepstas
2007-10-03  0:26     ` Tony Breeds
2007-10-03  0:30       ` Michael Ellerman
2007-10-03  1:19         ` Tony Breeds
2007-10-03  4:09           ` Michael Ellerman
2007-10-03 18:50             ` Linas Vepstas
2007-10-05  0:01           ` Nish Aravamudan
2007-10-05 16:03             ` Linas Vepstas
2007-10-08  3:47               ` Nish Aravamudan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).