public inbox for linux-kernel@vger.kernel.org
* trap int3 problem while porting a user space application and small cleanup patch
@ 2006-02-12 17:08 Roberto Nibali
  2006-02-13  0:57 ` [discuss] " Andi Kleen
  0 siblings, 1 reply; 4+ messages in thread
From: Roberto Nibali @ 2006-02-12 17:08 UTC
  To: linux-kernel; +Cc: ak, discuss

[-- Attachment #1: Type: text/plain, Size: 1506 bytes --]

Hello,

For a while I've been working on a little tool called mpt-status to be 
able to monitor LSI based controllers. The source can be found here:

     http://www.drugphish.ch/~ratz/mpt-status/

The issue I'm trying to track down now is why I cannot get it to work on 
an x86_64 kernel (Sun Fire V20z with AMD Opteron(tm) Processor 252 on 
SLES 9 PL3). I suspect 32/64 bit issues in my ioctl message passing 
between user space and kernel space. Unfortunately, when I strace it, 
the kernel spits out tons of the following entries:

mpt-status[16045] trap int3 rip:400acf rsp:7fbfff70b0 error:0
mpt-status[16045] trap int3 rip:4008f1 rsp:7fbfff70a8 error:0
mpt-status[16045] trap int3 rip:400b86 rsp:7fbfff70b0 error:0

I can only guess at what is happening because I'm not well versed in 
x86_64 trap handling, so my question is: How can I best debug and 
address this issue in my tool?

I'm pretty sure it has something to do with me including kernel headers 
in a user space tool, but no one has done the sanitizing for the LSI 
related headers residing in drivers/message/fusion. It works on all 
32-bit machines I've tested so far.

Attached is a small code style cleanup patch that resulted from my 
skimming through the arch/x86_64/kernel/traps.c code to figure out what 
went haywire. If Andi is ok with it, please consider applying.

Signed-off-by: Roberto Nibali <ratz@drugphish.ch>

Best regards,
Roberto Nibali, ratz
-- 
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc

[-- Attachment #2: x86_64_kernel_traps_cleanup-1.diff --]
[-- Type: text/x-patch, Size: 1862 bytes --]

diff --git a/arch/x86_64/kernel/traps.c b/arch/x86_64/kernel/traps.c
index ee1b2da..3442a56 100644
--- a/arch/x86_64/kernel/traps.c
+++ b/arch/x86_64/kernel/traps.c
@@ -108,7 +108,7 @@ int printk_address(unsigned long address
 	if (!modname) 
 		modname = delim = ""; 		
         return printk("<%016lx>{%s%s%s%s%+ld}",
-		      address,delim,modname,delim,symname,offset); 
+		      address, delim, modname, delim, symname, offset); 
 } 
 #else
 int printk_address(unsigned long address)
@@ -320,13 +320,12 @@ void show_registers(struct pt_regs *regs
 		show_stack(NULL, (unsigned long*)rsp);
 
 		printk("\nCode: ");
-		if(regs->rip < PAGE_OFFSET)
+		if (regs->rip < PAGE_OFFSET)
 			goto bad;
 
-		for(i=0;i<20;i++)
-		{
+		for (i=0; i<20; i++) {
 			unsigned char c;
-			if(__get_user(c, &((unsigned char*)regs->rip)[i])) {
+			if (__get_user(c, &((unsigned char*)regs->rip)[i])) {
 bad:
 				printk(" Bad RIP value.");
 				break;
@@ -465,7 +464,7 @@ static void __kprobes do_trap(int trapnr
 			printk(KERN_INFO
 			       "%s[%d] trap %s rip:%lx rsp:%lx error:%lx\n",
 			       tsk->comm, tsk->pid, str,
-			       regs->rip,regs->rsp,error_code); 
+			       regs->rip, regs->rsp, error_code); 
 
 		if (info)
 			force_sig_info(signr, info, tsk);
@@ -479,9 +478,9 @@ static void __kprobes do_trap(int trapnr
 	{	     
 		const struct exception_table_entry *fixup;
 		fixup = search_exception_tables(regs->rip);
-		if (fixup) {
+		if (fixup)
 			regs->rip = fixup->fixup;
-		} else	
+		else	
 			die(str, regs, error_code);
 		return;
 	}
@@ -554,7 +553,7 @@ asmlinkage void __kprobes do_general_pro
 			printk(KERN_INFO
 		       "%s[%d] general protection rip:%lx rsp:%lx error:%lx\n",
 			       tsk->comm, tsk->pid,
-			       regs->rip,regs->rsp,error_code); 
+			       regs->rip, regs->rsp, error_code); 
 
 		force_sig(SIGSEGV, tsk);
 		return;


* Re: [discuss] trap int3 problem while porting a user space application and small cleanup patch
  2006-02-12 17:08 trap int3 problem while porting a user space application and small cleanup patch Roberto Nibali
@ 2006-02-13  0:57 ` Andi Kleen
  2006-02-13  7:55   ` Roberto Nibali
  0 siblings, 1 reply; 4+ messages in thread
From: Andi Kleen @ 2006-02-13  0:57 UTC
  To: discuss; +Cc: Roberto Nibali, linux-kernel

On Sunday 12 February 2006 18:08, Roberto Nibali wrote:
> Hello,
> 
> For a while I've been working on a little tool called mpt-status to be 
> able to monitor LSI based controllers. The source can be found here:
> 
>      http://www.drugphish.ch/~ratz/mpt-status/
> 
> The issue I'm trying to track down now is why I cannot get it to work on 
> a x86_64 kernel (Sun Fire V20z with AMD Opteron(tm) Processor 252 on 
> SLES 9 PL3). I suspect 32/64 bit issues in my ioctl message 
> passing between user space and kernel space.

Quite possible. The mpt ioctls would need an ioctl conversion handler
to allow a 32bit program to use the 64bit ioctls. Or just use a 64bit
executable.

> Unfortunately when I strace  
> the kernel spits out tons of following entries:

Some kernel versions printed that with strace. I think I fixed it in
mainline, but I can't remember if it was fixed in SLES9 too (apparently not).
It's fairly harmless, just ignore it. If it really bothers you, you can
turn it off with echo 0 > /proc/sys/debug/exception-trace


> 
> Attached is a small code style cleanup patch that resulted from my 
> skimming through the arch/x86_64/kernel/traps.c code to figure out what 
> went haywire. If Andi is ok with it, please consider applying.

Hmm, ok applied.
-Andi


* Re: [discuss] trap int3 problem while porting a user space application and small cleanup patch
  2006-02-13  0:57 ` [discuss] " Andi Kleen
@ 2006-02-13  7:55   ` Roberto Nibali
  2006-02-13  9:25     ` Andi Kleen
  0 siblings, 1 reply; 4+ messages in thread
From: Roberto Nibali @ 2006-02-13  7:55 UTC
  To: Andi Kleen; +Cc: discuss, linux-kernel

Hello Andi,

Thanks for your comments.

>> The issue I'm trying to track down now is why I cannot get it to work on 
>> a x86_64 kernel (Sun Fire V20z with AMD Opteron(tm) Processor 252 on 
>> SLES 9 PL3). I suspect 32/64 bit issues in my ioctl message 
>> passing between user space and kernel space.
> 
> Quite possible. The mpt ioctls would need a ioctl conversion handler
> to allow a 32bit program to use the 64bit ioctls. Or just use a 64bit
> executable.

It is a 64bit executable:

ratz@cpp9:~/mpt-status-1.1.5-RC3> readelf -h ./mpt-status | grep 64
ELF Header:
   Class:                             ELF64
   Machine:                           Advanced Micro Devices X86-64
   Start of program headers:          64 (bytes into file)
   Size of this header:               64 (bytes)
   Size of section headers:           64 (bytes)
ratz@cpp9:~/mpt-status-1.1.5-RC3> ldd ./mpt-status
         libc.so.6 => /lib64/tls/libc.so.6 (0x0000002a9566d000)
         /lib64/ld-linux-x86-64.so.2 (0x0000002a95556000)

The strace looks ok with regard to the ioctl though:

cpp9:/home/ratz/mpt-status-1.1.5-RC3 # strace ./mpt-status
execve("./mpt-status", ["./mpt-status"], [/* 44 vars */]) = 0
uname({sys="Linux", node="cpp9", ...})  = 0
brk(0)                                  = 0x503000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x2a9556b000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or 
directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26878, ...}) = 0
mmap(NULL, 26878, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2a9556c000
close(3)                                = 0
open("/lib64/tls/libc.so.6", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\313\1\0"..., 
640) = 640
lseek(3, 624, SEEK_SET)                 = 624
read(3, "\4\0\0\0\20\0\0\0\1\0\0\0GNU\0\0\0\0\0\2\0\0\0\6\0\0\0"..., 32) 
= 32
fstat(3, {st_mode=S_IFREG|0755, st_size=1424617, ...}) = 0
mmap(NULL, 2254664, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 
0) = 0x2a9566d000
madvise(0x2a9566d000, 2254664, MADV_SEQUENTIAL|0x1) = 0
mprotect(0x2a95778000, 1161032, PROT_NONE) = 0
mmap(0x2a95877000, 102400, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x10a000) = 0x2a95877000
mmap(0x2a95890000, 14152, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2a95890000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x2a95894000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x2a95895000
arch_prctl(ARCH_SET_FS, 0x2a95894b00)   = 0
munmap(0x2a9556c000, 26878)             = 0
open("/dev/mptctl", O_RDWR)             = 3
brk(0)                                  = 0x503000
brk(0x526000)                           = 0x526000
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x2a9556c000
write(1, "SGE ptr: 0x7fbfffc144\n", 22SGE ptr: 0x7fbfffc144
) = 22
write(1, "conf ptr: 0x7fbfffc124\n", 23conf ptr: 0x7fbfffc124
) = 23
write(1, "dataSgeOffset: 4\n", 17dataSgeOffset: 4
)      = 17
ioctl(3, 0xc0486d14, 0x7fbfffc0e0)      = 0
ioctl(3, 0xc0486d14, 0x7fbfffc0e0)      = 0
write(1, "\nYou seem to have no SCSI disks "..., 139
You seem to have no SCSI disks attached to your HBA or you have
them on a different scsi_id. To get your SCSI id, run:

     mpt-status -p
) = 139
write(1, "\n", 1
)                       = 1
munmap(0x2a9556c000, 4096)              = 0
exit_group(1)                           = ?

My next steps will involve enabling full debug of the mptctl driver to 
find out where it gets stuck, and sprinkling in a few printk's to see 
whether the structs have the wrong size or have been packed wrongly. 
Even the SuSE provided mpt-status (including the patches) does not work 
correctly on this machine. So I reckon I'll try to get my hands on a 
SLES support contract and/or maybe ping LSIL.

From the looks of the MPI headers, one can see that LSIL carefully 
thought about the 64bit case, and thus I'm really astonished it does not 
work.

>> Unfortunately when I strace  
>> the kernel spits out tons of following entries:
> 
> Some kernel versions printed that with strace. I think I fixed it in
> mainline, but I can't remember if it was fixed in SLES9 too (apparently not)
> It's fairly harmless, just ignore it. If it really bothers you you can
> turn it off with echo 0 > /proc/sys/debug/exception-trace

Nice.

> Hmm, ok applied.

:) I know, not exactly fixing anything, just creating more work for you.

Best regards,
Roberto Nibali, ratz
-- 
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc


* Re: [discuss] trap int3 problem while porting a user space application and small cleanup patch
  2006-02-13  7:55   ` Roberto Nibali
@ 2006-02-13  9:25     ` Andi Kleen
  0 siblings, 0 replies; 4+ messages in thread
From: Andi Kleen @ 2006-02-13  9:25 UTC
  To: Roberto Nibali; +Cc: discuss, linux-kernel

On Monday 13 February 2006 08:55, Roberto Nibali wrote:
> Hello Andi,
> 
> Thanks for your comments.
> 
> >> The issue I'm trying to track down now is why I cannot get it to work on 
> >> a x86_64 kernel (Sun Fire V20z with AMD Opteron(tm) Processor 252 on 
> >> SLES 9 PL3). I suspect 32/64 bit issues in my ioctl message 
> >> passing between user space and kernel space.
> > 
> > Quite possible. The mpt ioctls would need a ioctl conversion handler
> > to allow a 32bit program to use the 64bit ioctls. Or just use a 64bit
> > executable.
> 
> It is a 64bit executable:

Then whatever problem the program has is not related to 32bit ioctl emulation.
Maybe it has some generic 64bit issues.

Thanks for looking into it.

-Andi

