linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] Fix Oops in rtas_stop_self()
@ 2014-04-25  9:33 Li Zhong
  2014-04-25 12:18 ` Anton Blanchard
  0 siblings, 1 reply; 6+ messages in thread
From: Li Zhong @ 2014-04-25  9:33 UTC (permalink / raw)
  To: PowerPC email list; +Cc: Paul Mackerras

When trying offline cpus, I noticed following Oops in rtas_stop_self(),
and it seems caused by commit 41dd03a9. The Oops disappears after
reverting this commit.

After reading the code, I guess it might be caused by moving the
rtas_args to stack. Still need some more time to read enter_rtas to
understand why it happens, but the problem seems could be solved by
moving the rtas_args away from stack by adding static before it.

[  247.194623] cpu 28 (hwid 28) Ready to die...
[  247.194641] Unable to handle kernel paging request for data at address 0xecb213c50
[  247.194642] Faulting instruction address: 0x0f34157c
[  247.194645] Oops: Kernel access of bad area, sig: 11 [#1]
[  247.194648] SMP NR_CPUS=1024 NUMA pSeries
[  247.194664] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables ehea ibmvscsi scsi_transport_srp scsi_tgt
[  247.194667] CPU: 28 PID: 0 Comm: swapper/28 Not tainted 3.15.0-rc1+ #10
[  247.194670] CPU: 27 PID: 1171 Comm: drmgr Not tainted 3.15.0-rc1+ #10
[  247.194673] task: c0000007ca2f0ba0 ti: c000000ecb210000 task.ti: c000000ecb210000
[  247.194679] NIP: 000000000f34157c LR: 000000000000a84c CTR: 0000000000000151
[  247.194680] REGS: c000000ecb213780 TRAP: 0300   Not tainted  (3.15.0-rc1+)
[  247.194684] MSR: 8000000000001000 <SF,ME>  CR: 00000000  XER: 00000000
[  247.194707] CFAR: c00000000000a844 DAR: 0000000ecb213c50 DSISR: 40000000 SOFTE: 0 
[  247.194707] GPR00: 8000000000001030 c000000ecb213a00 c0000000014c7880 0000000ecb213c50 
[  247.194707] GPR04: 000000000f340000 0000000ecb213c50 0000000000000000 8000000000001032 
[  247.194707] GPR08: 0000000000001000 c000000ecb213a00 0000000000000000 0000000000000000 
[  247.194707] GPR12: 0000000000000000 c00000000f227e00 c00000000091f5b0 000000000f394e2c 
[  247.194707] GPR16: 000000000000001c 0000000000000001 000000000000001c c0000000013ad5b0 
[  247.194707] GPR20: 0000000000000001 0000000000000000 c0000000013bfb89 c000000000d57688 
[  247.194707] GPR24: 000000000000001c c000000000d44e2c c000000ecb210000 c0000000013c0f48 
[  247.194707] GPR28: c000000000d44e28 c00000000150ca68 00000000000000e0 c000000ecb210000 
[  247.194710] NIP [000000000f34157c] 0xf34157c
[  247.194712] LR [000000000000a84c] 0xa84c
[  247.194714] Call Trace:
[  247.194716] [c000000ecb213a00] [000000000000001c] 0x1c (unreliable)
[  247.194722] [c000000ecb213be0] [c00000000007ffe0] .pseries_mach_cpu_die+0x150/0x350
[  247.194725] [c000000ecb213cf0] [c0000000000420cc] .cpu_die+0x3c/0x60
[  247.194727] [c000000ecb213d60] [c000000000017e58] .arch_cpu_idle_dead+0x28/0x40
[  247.194732] [c000000ecb213dd0] [c000000000104f40] .cpu_startup_entry+0x570/0x590
[  247.194735] [c000000ecb213ed0] [c000000000041d4c] .start_secondary+0x2dc/0x310
[  247.194738] [c000000ecb213f90] [c00000000000996c] .start_secondary_prolog+0x10/0x14
[  247.194740] Instruction dump:
[  247.194744] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[  247.194748] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[  247.194751] ---[ end trace d1d21584135396ba ]---

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 9b8e050..20d6297 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -88,13 +88,14 @@ void set_default_offline_state(int cpu)
 
 static void rtas_stop_self(void)
 {
-	struct rtas_args args = {
-		.token = cpu_to_be32(rtas_stop_self_token),
+	static struct rtas_args args = {
 		.nargs = 0,
 		.nret = 1,
 		.rets = &args.args[0],
 	};
 
+	args.token = cpu_to_be32(rtas_stop_self_token);
+
 	local_irq_disable();
 
 	BUG_ON(rtas_stop_self_token == RTAS_UNKNOWN_SERVICE);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] Fix Oops in rtas_stop_self()
  2014-04-25  9:33 [RFC PATCH] Fix Oops in rtas_stop_self() Li Zhong
@ 2014-04-25 12:18 ` Anton Blanchard
  2014-04-28  0:26   ` Li Zhong
  2014-04-28  0:29   ` [PATCH] powerpc: " Li Zhong
  0 siblings, 2 replies; 6+ messages in thread
From: Anton Blanchard @ 2014-04-25 12:18 UTC (permalink / raw)
  To: Li Zhong; +Cc: Paul Mackerras, Benjamin Herrenschmidt, PowerPC email list


Hi,

> When trying offline cpus, I noticed following Oops in
> rtas_stop_self(), and it seems caused by commit 41dd03a9. The Oops
> disappears after reverting this commit.
> 
> After reading the code, I guess it might be caused by moving the
> rtas_args to stack. Still need some more time to read enter_rtas to
> understand why it happens, but the problem seems could be solved by
> moving the rtas_args away from stack by adding static before it.

Nice catch. RTAS is 32bit and if your box has more than 4GB RAM then
your stack could easily be outside 32bit range.

You can add:

Signed-off-by: Anton Blanchard <anton@samba.org>

And also:

Cc: stable@vger.kernel.org # 3.14+

> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 9b8e050..20d6297
> 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -88,13 +88,14 @@ void set_default_offline_state(int cpu)
>  
>  static void rtas_stop_self(void)
>  {
> -	struct rtas_args args = {
> -		.token = cpu_to_be32(rtas_stop_self_token),
> +	static struct rtas_args args = {
>  		.nargs = 0,
>  		.nret = 1,
>  		.rets = &args.args[0],
>  	};
>  
> +	args.token = cpu_to_be32(rtas_stop_self_token);
> +
>  	local_irq_disable();
>  
>  	BUG_ON(rtas_stop_self_token == RTAS_UNKNOWN_SERVICE);
> 
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] Fix Oops in rtas_stop_self()
  2014-04-25 12:18 ` Anton Blanchard
@ 2014-04-28  0:26   ` Li Zhong
  2014-04-28  9:06     ` David Laight
  2014-04-28  0:29   ` [PATCH] powerpc: " Li Zhong
  1 sibling, 1 reply; 6+ messages in thread
From: Li Zhong @ 2014-04-28  0:26 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Paul Mackerras, Benjamin Herrenschmidt, PowerPC email list

On Fri, 2014-04-25 at 22:18 +1000, Anton Blanchard wrote:
> Hi,
> 
> > When trying offline cpus, I noticed following Oops in
> > rtas_stop_self(), and it seems caused by commit 41dd03a9. The Oops
> > disappears after reverting this commit.
> > 
> > After reading the code, I guess it might be caused by moving the
> > rtas_args to stack. Still need some more time to read enter_rtas to
> > understand why it happens, but the problem seems could be solved by
> > moving the rtas_args away from stack by adding static before it.
> 
> Nice catch. RTAS is 32bit and if your box has more than 4GB RAM then
> your stack could easily be outside 32bit range.

Ah, yes, the stack here is obviously at a much higher address than 4GB.

> 
> You can add:
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> 
> And also:
> 
> Cc: stable@vger.kernel.org # 3.14+

OK, 

Thanks, Zhong
> 
> > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> > b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 9b8e050..20d6297
> > 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> > @@ -88,13 +88,14 @@ void set_default_offline_state(int cpu)
> >  
> >  static void rtas_stop_self(void)
> >  {
> > -	struct rtas_args args = {
> > -		.token = cpu_to_be32(rtas_stop_self_token),
> > +	static struct rtas_args args = {
> >  		.nargs = 0,
> >  		.nret = 1,
> >  		.rets = &args.args[0],
> >  	};
> >  
> > +	args.token = cpu_to_be32(rtas_stop_self_token);
> > +
> >  	local_irq_disable();
> >  
> >  	BUG_ON(rtas_stop_self_token == RTAS_UNKNOWN_SERVICE);
> > 
> > 
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] powerpc: Fix Oops in rtas_stop_self()
  2014-04-25 12:18 ` Anton Blanchard
  2014-04-28  0:26   ` Li Zhong
@ 2014-04-28  0:29   ` Li Zhong
  1 sibling, 0 replies; 6+ messages in thread
From: Li Zhong @ 2014-04-28  0:29 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Paul Mackerras, Benjamin Herrenschmidt, PowerPC email list

commit 41dd03a9 may cause Oops in rtas_stop_self().

The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.

So the patch moves rtas_args away from stack by adding static before
it. 

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org # 3.14+
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 9b8e050..20d6297 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -88,13 +88,14 @@ void set_default_offline_state(int cpu)
 
 static void rtas_stop_self(void)
 {
-	struct rtas_args args = {
-		.token = cpu_to_be32(rtas_stop_self_token),
+	static struct rtas_args args = {
 		.nargs = 0,
 		.nret = 1,
 		.rets = &args.args[0],
 	};
 
+	args.token = cpu_to_be32(rtas_stop_self_token);
+
 	local_irq_disable();
 
 	BUG_ON(rtas_stop_self_token == RTAS_UNKNOWN_SERVICE);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [RFC PATCH] Fix Oops in rtas_stop_self()
  2014-04-28  0:26   ` Li Zhong
@ 2014-04-28  9:06     ` David Laight
  2014-04-28  9:17       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 6+ messages in thread
From: David Laight @ 2014-04-28  9:06 UTC (permalink / raw)
  To: 'Li Zhong', Anton Blanchard
  Cc: PowerPC email list, Benjamin Herrenschmidt, Paul Mackerras

RnJvbTogTGkgWmhvbmcNCj4gT24gRnJpLCAyMDE0LTA0LTI1IGF0IDIyOjE4ICsxMDAwLCBBbnRv
biBCbGFuY2hhcmQgd3JvdGU6DQo+ID4gSGksDQo+ID4NCj4gPiA+IFdoZW4gdHJ5aW5nIG9mZmxp
bmUgY3B1cywgSSBub3RpY2VkIGZvbGxvd2luZyBPb3BzIGluDQo+ID4gPiBydGFzX3N0b3Bfc2Vs
ZigpLCBhbmQgaXQgc2VlbXMgY2F1c2VkIGJ5IGNvbW1pdCA0MWRkMDNhOS4gVGhlIE9vcHMNCj4g
PiA+IGRpc2FwcGVhcnMgYWZ0ZXIgcmV2ZXJ0aW5nIHRoaXMgY29tbWl0Lg0KPiA+ID4NCj4gPiA+
IEFmdGVyIHJlYWRpbmcgdGhlIGNvZGUsIEkgZ3Vlc3MgaXQgbWlnaHQgYmUgY2F1c2VkIGJ5IG1v
dmluZyB0aGUNCj4gPiA+IHJ0YXNfYXJncyB0byBzdGFjay4gU3RpbGwgbmVlZCBzb21lIG1vcmUg
dGltZSB0byByZWFkIGVudGVyX3J0YXMgdG8NCj4gPiA+IHVuZGVyc3RhbmQgd2h5IGl0IGhhcHBl
bnMsIGJ1dCB0aGUgcHJvYmxlbSBzZWVtcyBjb3VsZCBiZSBzb2x2ZWQgYnkNCj4gPiA+IG1vdmlu
ZyB0aGUgcnRhc19hcmdzIGF3YXkgZnJvbSBzdGFjayBieSBhZGRpbmcgc3RhdGljIGJlZm9yZSBp
dC4NCj4gPg0KPiA+IE5pY2UgY2F0Y2guIFJUQVMgaXMgMzJiaXQgYW5kIGlmIHlvdXIgYm94IGhh
cyBtb3JlIHRoYW4gNEdCIFJBTSB0aGVuDQo+ID4geW91ciBzdGFjayBjb3VsZCBlYXNpbHkgYmUg
b3V0c2lkZSAzMmJpdCByYW5nZS4NCj4gDQo+IEFoLCB5ZXMsIHRoZSBzdGFjayBoZXJlIGlzIG9i
dmlvdXNseSBhdCBhIG11Y2ggaGlnaGVyIGFkZHJlc3MgdGhhbiA0R0IuDQoNCkFyZSB3ZSB0YWxr
aW5nIG9mIHBoeXNpY2FsIG9yIHZpcnR1YWwgYWRkcmVzc2VzIGhlcmU/DQoob3IgZXZlbiB1c2Vy
PykNCg0KSXMgdGhlcmUgYSByZS1lbnRyYW5jeSBwcm9ibGVtIHVzaW5nIGtlcm5lbCBzdGF0aWMg
ZGF0YT8NCg0KCURhdmlkDQoNCg==

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] Fix Oops in rtas_stop_self()
  2014-04-28  9:06     ` David Laight
@ 2014-04-28  9:17       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2014-04-28  9:17 UTC (permalink / raw)
  To: David Laight
  Cc: PowerPC email list, Paul Mackerras, Anton Blanchard,
	'Li Zhong'

On Mon, 2014-04-28 at 09:06 +0000, David Laight wrote:
> > Ah, yes, the stack here is obviously at a much higher address than
> 4GB.
> 
> Are we talking of physical or virtual addresses here?
> (or even user?)

Real.

> Is there a re-entrancy problem using kernel static data?

Not for this code. This is called as the last thing a CPU does
before returning to firmware.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-04-28  9:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-25  9:33 [RFC PATCH] Fix Oops in rtas_stop_self() Li Zhong
2014-04-25 12:18 ` Anton Blanchard
2014-04-28  0:26   ` Li Zhong
2014-04-28  9:06     ` David Laight
2014-04-28  9:17       ` Benjamin Herrenschmidt
2014-04-28  0:29   ` [PATCH] powerpc: " Li Zhong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).