linuxppc-dev.lists.ozlabs.org archive mirror
* Unable to handle kernel paging request in show_instructions
@ 2006-06-20  0:42 David Wilder
  2006-06-20  0:44 ` Arnd Bergmann
  2006-06-20 13:47 ` Anton Blanchard
  0 siblings, 2 replies; 3+ messages in thread
From: David Wilder @ 2006-06-20  0:42 UTC
  To: Linuxppc-dev

I ran into the following problem during Oops processing:

Oops: Exception in kernel mode, sig: 4 [#1]
SMP NR_CPUS=128 NUMA PSERIES LPAR
Modules linked in: pitrace sg scsi_mod nfs lockd nfs_acl sunrpc ipv6 apparmor aa match_pcre loop dm_mod tg3
NIP: D000000000022014 LR: C000000000018FE4 CTR: C00000000036C718
REGS: c00000000043faf0 TRAP: 0700   Tainted: G     U  (2.6.16.16-1.6-ppc64-wilder)
MSR: 8000000000089432 <EE,ME,IR,DR>  CR: 24000088  XER: 000FFFFF
TASK = c00000000048a660[0] 'swapper' THREAD: c00000000043c000 CPU: 0
GPR00: C000000000018FE4 C00000000043FD70 C000000000624420 0000000000000000
GPR04: C00000000048A990 0000000000006DFF 0000000024000082 C00000000000F0B0
GPR08: 0000000000000000 C0000000004351C0 0000000001021A00 C000000001456BC0
GPR12: D0000000004CC2B8 C00000000048AE80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 4000000000400000 C00000000042BA00 C00000000042BEA8 C00000000042BC70
GPR24: C00000000048AE80 000000000082BA00 0000000000000000 C000000000000000
GPR28: 00000000FFFFFFFF C000000000629070 C0000000004C7BC8 C00000000F9F89D0
NIP [D000000000022014] 0xd000000000022014
LR [C000000000018FE4] .default_idle+0x98/0xcc
Call Trace:
[C00000000043FD70] [C000000000018FE4] .default_idle+0x98/0xcc (unreliable)
[C00000000043FE00] [C000000000018F38] .cpu_idle+0x40/0x54
[C00000000043FE70] [C000000000009274] .rest_init+0x44/0x5c
[C00000000043FEF0] [C0000000003FC75C] .start_kernel+0x270/0x288
[C00000000043FF90] [C000000000008594] .start_here_common+0x88/0x8c
Instruction dump:
 >>>Unable to handle kernel paging request for data at address 0xd000000000021fe4
 >>>Faulting instruction address: 0xc00000000036b960

I don't care about the original oops, only the second fault because it 
prevents kdump from starting.

The problem occurs in show_instructions().  show_instructions() takes 
the NIP (0xD000000000022014) and subtracts an offset so that it points 
several instructions before the failing instruction.  In this case the 
new address is on the previous page, and that page is not mapped.  
When the new address is dereferenced we get a second fault.  
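
The arithmetic can be checked against the log above. The following is
a minimal user-space sketch, not the kernel code. It assumes the
look-back is three quarters of instructions_to_print (16) at 4 bytes
per instruction, and that pages are 4 KiB. dump_start() and
same_page() are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions, not taken from the thread: 4 KiB pages, and a look-back
 * of 3/4 of the dump length. The 48-byte result matches the distance
 * between the NIP and the faulting data address in the log. */
#define ASSUMED_PAGE_SHIFT    12
#define INSTRUCTIONS_TO_PRINT 16

/* First address the instruction dump would read for a given NIP. */
static uint64_t dump_start(uint64_t nip)
{
    uint64_t lookback = INSTRUCTIONS_TO_PRINT * 3 / 4 * sizeof(uint32_t);

    assert(nip >= lookback);    /* guard against wrapping below zero */
    return nip - lookback;
}

/* Nonzero if both addresses fall on the same (assumed 4 KiB) page. */
static int same_page(uint64_t a, uint64_t b)
{
    return (a >> ASSUMED_PAGE_SHIFT) == (b >> ASSUMED_PAGE_SHIFT);
}
```

With the NIP from the oops, dump_start(0xd000000000022014) gives
0xd000000000021fe4, which is the address reported in the second fault,
and same_page() reports that the two addresses are on different pages.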

show_instructions() tries to validate the address by checking whether 
it is in the kernel segment (0xC...) or the first vmalloc segment 
(0xD...).  But in this case the validation passes even though the 
address is invalid.  Any ideas how to fix this?  Is there an easy way 
to check whether a page is valid before accessing it?
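
To illustrate why the segment check is too coarse, here is a
user-space model of it. It is a sketch that assumes the region ID is
simply the top four bits of the address, with 0xc for the kernel
region and 0xd for the vmalloc region. The macro names mirror the
kernel's, but the definitions here are stand-ins written for this
example.

```c
#include <stdint.h>

/* Stand-in definitions (assumed): the region ID is taken to be the top
 * four bits of a 64-bit effective address. */
#define REGION_SHIFT      60
#define REGION_ID(ea)     ((uint64_t)(ea) >> REGION_SHIFT)
#define KERNEL_REGION_ID  0xcULL
#define VMALLOC_REGION_ID 0xdULL

/* Nonzero if pc is in neither the kernel nor the vmalloc region.
 * Only the region is inspected, never whether the page is mapped. */
static int bad_pc(uint64_t pc)
{
    return REGION_ID(pc) != KERNEL_REGION_ID &&
           REGION_ID(pc) != VMALLOC_REGION_ID;
}
```

bad_pc(0xd000000000021fe4) returns 0, so the unmapped address from the
second fault is accepted, because every address that starts with 0xd
looks the same to this test.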

-- 
David Wilder
IBM Linux Technology Center
Beaverton, Oregon, USA 
dwilder@us.ibm.com
(503)578-3789


* Re: Unable to handle kernel paging request in show_instructions
  2006-06-20  0:42 Unable to handle kernel paging request in show_instructions David Wilder
@ 2006-06-20  0:44 ` Arnd Bergmann
  2006-06-20 13:47 ` Anton Blanchard
  1 sibling, 0 replies; 3+ messages in thread
From: Arnd Bergmann @ 2006-06-20  0:44 UTC
  To: linuxppc-dev

On Tuesday 20 June 2006 02:42, David Wilder wrote:

> But in this case the validation passes even though the 
> address is invalid.  Any ideas how to fix this?  Is there an easy way 
> to check whether a page is valid before accessing it?

The check that kallsyms_lookup does should find this, but it may
be overkill to call that function just to verify an address.

You might also have success calling vmalloc_to_page for a 0xd000...
address to check if that is really mapped.

	Arnd <><


* Re: Unable to handle kernel paging request in show_instructions
  2006-06-20  0:42 Unable to handle kernel paging request in show_instructions David Wilder
  2006-06-20  0:44 ` Arnd Bergmann
@ 2006-06-20 13:47 ` Anton Blanchard
  1 sibling, 0 replies; 3+ messages in thread
From: Anton Blanchard @ 2006-06-20 13:47 UTC
  To: David Wilder; +Cc: Linuxppc-dev


Hi David,

> The problem occurs in show_instructions().  show_instructions() takes 
> the NIP (0xD000000000022014) and subtracts an offset so that it points 
> several instructions before the failing instruction.  In this case the 
> new address is on the previous page, and that page is not mapped.  
> When the new address is dereferenced we get a second fault.  
> 
> show_instructions() tries to validate the address by checking whether 
> it is in the kernel segment (0xC...) or the first vmalloc segment 
> (0xD...).  But in this case the validation passes even though the 
> address is invalid.  Any ideas how to fix this?  Is there an easy way 
> to check whether a page is valid before accessing it?

What's interesting is that bad_page_fault() should have walked the
exception tables and recovered, since show_instructions() reads the
instruction with __get_user().

We should still work out why that failed, but I suspect we should also
be tighter with our checking. It looks like __kernel_text_address()
does what we want. Untested patch below.
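
As an illustration of the difference, a range-based check can be
modelled in user space. This is a sketch only: it assumes that a
text-address check amounts to testing the address against known text
ranges (core kernel plus loaded modules). The ranges, struct
text_range and is_text_address() are hypothetical, made up for this
example, and are not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open range [start, end) of executable text. */
struct text_range {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical ranges chosen for illustration only. */
static const struct text_range text_ranges[] = {
    { 0xc000000000000000ULL, 0xc000000000400000ULL }, /* core kernel text */
    { 0xd000000000022000ULL, 0xd000000000024000ULL }, /* one module's text */
};

/* Nonzero only if addr lies inside one of the known text ranges. */
static int is_text_address(uint64_t addr)
{
    size_t i;

    for (i = 0; i < sizeof(text_ranges) / sizeof(text_ranges[0]); i++) {
        if (addr >= text_ranges[i].start && addr < text_ranges[i].end)
            return 1;
    }
    return 0;
}
```

Under these assumed ranges the original NIP (0xd000000000022014) is
accepted, while the look-back address 0xd000000000021fe4 is rejected
before anything tries to read it.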

Anton


Use __kernel_text_address when validating instruction addresses in the
Oops code.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -342,13 +342,6 @@ #endif
 
 static int instructions_to_print = 16;
 
-#ifdef CONFIG_PPC64
-#define BAD_PC(pc)	((REGION_ID(pc) != KERNEL_REGION_ID) && \
-		         (REGION_ID(pc) != VMALLOC_REGION_ID))
-#else
-#define BAD_PC(pc)	((pc) < KERNELBASE)
-#endif
-
 static void show_instructions(struct pt_regs *regs)
 {
 	int i;
@@ -367,7 +360,8 @@ static void show_instructions(struct pt_
 		 * bad address because the pc *should* only be a
 		 * kernel address.
 		 */
-		if (BAD_PC(pc) || __get_user(instr, (unsigned int __user *)pc)) {
+		if (!__kernel_text_address(pc) ||
+		     __get_user(instr, (unsigned int __user *)pc)) {
 			printk("XXXXXXXX ");
 		} else {
 			if (regs->nip == pc)
