* Re: Crash in vm86() on SMP boxes with vesa driver?
From: Petr Vandrovec @ 2003-04-29 6:03 UTC
To: Kendall Bennett; +Cc: linux-kernel
On 28 Apr 03 at 16:12, Kendall Bennett wrote:
> 8.0 box with the latest 2.4.20 kernel on it (but the problem happened
> with the stock kernel and kernels lower than .20 as well). Unfortunately
> I don't have access to the box (it is in Australia), but I have access to
> the bug report information (and will try to configure a box soon to
> reproduce it here). Anyway the following is the error log produced by
> XFree86 when the crash occurs:
We told you before that you cannot trust VESA BIOS.
> (II) VESA(0): initializing int10
> (WW) VESA(0): Bad V_BIOS checksum
> (II) VESA(0): Primary V_BIOS segment is: 0xc000
Bad checksum? Sorry, your BIOS is not usable. Either XFree86 gets the checksum
wrong, or there is something there that I would not want in my computer...
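For reference, the check behind a warning like "Bad V_BIOS checksum" is conventionally a byte sum over the ROM image that must come out to zero modulo 256. The following is a minimal sketch of that convention, not the XFree86 int10 code itself; the function name `rom_checksum_ok` is hypothetical, and the caller is assumed to pass the length the ROM declares for itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: an option ROM image is conventionally considered intact
 * when all of its bytes sum to zero modulo 256. The name is hypothetical
 * and this is not taken from the XFree86 sources. */
static int rom_checksum_ok(const uint8_t *rom, size_t len)
{
    uint8_t sum = 0;

    assert(rom != NULL);
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + rom[i]);   /* wraps modulo 256 */
    return sum == 0;
}
```

If the length passed in is wrong (for example because the card under-reports its ROM size), a perfectly good image can still fail this test, so a bad checksum alone does not say which side is at fault.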
> (II) VESA(0): virtual address = 0x402d7000,
> physical address = 0xf0000000, size = 33554432
> (II) VESA(0): VBESetVBEMode failed
> (EE) VESA(0): vm86() syscall generated signal 11.
> (II) VESA(0): EAX=0x00000150, EBX=0x00000ba0, ECX=0x00000000,
> EDX=0x00000000
> (II) VESA(0): ESP=0x00000fba, EBP=0x00000001, ESI=0x00000bc3,
> EDI=0x00003ad7
> (II) VESA(0): CS=0xc000, SS=0x0100, DS=0x0000, ES=0xc000, FS=0x0000,
> GS=0x0000
> (II) VESA(0): EIP=0x0000800f, EFLAGS=0x00033006
> (II) VESA(0): code at 0x000c800f:
> 62 18 91 60 09 fa 03 85 27 11 27 11 9d 0f f4 81
> fe 06 d0 1a 68 74 99 a9 c6 39 f9 6d 04 b4 d6 6b
> Also from debugging our own code we have a bit more information about
> where the problem occurs, and it occurs on the return from the vm86()
> system call when the code tries to pop the EBX register from the stack.
> Which kind of indicates that the kernel screwed up the return stack of
> the program for some reason:
No. The crash happened inside the VM; it was only reported as happening on
return from int $0x80. The real problem is that in the VM you are executing
code at 0xC000:0x800F, and there is no code there. It is garbage
(bound bx,[bx+si]; xchg cx,ax; pusha; or dx,di ???) which generated a
bounds check interrupt.
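The addresses in the log follow from the usual real-mode rule, linear = segment * 16 + offset. A minimal sketch of that arithmetic, using the CS:EIP and SS:ESP values from the register dump (the helper name `vm86_linear` is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: real-mode / vm86 address translation.
 * linear = segment * 16 + offset. The helper name is hypothetical. */
static uint32_t vm86_linear(uint16_t seg, uint32_t off)
{
    /* 16-bit BIOS code should only use the low 16 bits of EIP/ESP. */
    assert(off <= 0xFFFFu);
    return ((uint32_t)seg << 4) + off;
}
```

With CS=0xc000 and EIP=0x800f this gives 0xc800f, matching "code at 0x000c800f", and with SS=0x0100 and ESP=0x0fba it gives 0x1fba, matching "stack at 0x00001fba" in the full log.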
> Any ideas? I am not sure how to start debugging this (assuming I can get
> my SMP machine up and running and reproduce it) in the kernel. Also the
> machine that the problem occurs on goes to the customer tomorrow, so we
> won't be able to debug this much ourselves until I can get a new machine
> to reproduce it. But, it would seem to me that others may well have seen
> this problem already?
Make sure that the video card properly reports that it uses more than 32kB
of BIOS. Maybe the card reports only 32kB while it uses 48kB. The system is free
to do anything it wants with the 32-48kB range, including mapping another BIOS
there, or writing zeroes or garbage there... Also make sure that
you have properly set up the VM, so that 0xC8000 is mapped to physical address
0xC8000...
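The size a legacy option ROM claims for itself is stored in its header: bytes 0-1 hold the signature 0x55 0xAA and byte 2 holds the length in 512-byte blocks. The following is a minimal sketch, assuming the ROM has already been read into a buffer, of how one might check whether the faulting address lies inside the declared image; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch only: read the size a legacy option ROM declares for itself.
 * Bytes 0-1 are the signature 0x55 0xAA, byte 2 is the image length
 * in 512-byte blocks. Returns 0 if the signature is missing. */
static size_t rom_declared_size(const uint8_t *rom)
{
    assert(rom != NULL);
    if (rom[0] != 0x55 || rom[1] != 0xAA)
        return 0;
    return (size_t)rom[2] * 512u;
}

/* Returns 1 if a linear address lies inside the region that the ROM
 * based at rom_base claims, 0 otherwise. */
static int addr_in_declared_rom(uint32_t rom_base, size_t declared,
                                uint32_t linear)
{
    return linear >= rom_base && (size_t)(linear - rom_base) < declared;
}
```

A ROM at 0xC0000 that declares 64 blocks (32kB) ends at 0xC7FFF, so the faulting address 0xC800F would be 15 bytes into the range the system is free to reuse; with a declared size of 48kB it would be inside the image.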
Best regards,
Petr Vandrovec
vandrove@vc.cvut.cz
* Re: Crash in vm86() on SMP boxes with vesa driver?
From: Kendall Bennett @ 2003-04-29 19:41 UTC
To: linux-kernel
"Petr Vandrovec" <VANDROVE@vc.cvut.cz> wrote:
> > 8.0 box with the latest 2.4.20 kernel on it (but the problem happened
> > with the stock kernel and kernels lower than .20 as well). Unfortunately
> > I don't have access to the box (it is in Australia), but I have access to
> > the bug report information (and will try to configure a box soon to
> > reproduce it here). Anyway the following is the error log produced by
> > XFree86 when the crash occurs:
>
> We told you before that you cannot trust VESA BIOS.
No, I do not agree with that statement at all. At present I would say
that there is a problem with the vm86() services, or perhaps something
wrong with the way we (and XFree86) are setting up the vm86 state for the
BIOS. The reason I say that is because we use the BIOS all the time using
vm86() services on OS/2 and we have not had any of these problems.
Essentially what I am saying is that this problem is fixable somewhere
(either in the kernel or in our/XFree86's vm86() code).
> > (II) VESA(0): initializing int10
> > (WW) VESA(0): Bad V_BIOS checksum
> > (II) VESA(0): Primary V_BIOS segment is: 0xc000
>
> Bad checksum? Sorry, your BIOS is not usable. Either XFree gets
> checksum wrong, or there is something I would not want in my
> computer there...
I wish we still had access to the machine so I could debug this. It is
plausible that the BIOS has a bad checksum, but if it did, the system
BIOS would have failed to POST the card. Hence I think there is something
else going on here.
> > Also from debugging our own code we have a bit more information about
> > where the problem occurs, and it occurs on the return from the vm86()
> > system call when the code tries to pop the EBX register from the stack.
> > Which kind of indicates that the kernel screwed up the return stack of
> > the program for some reason:
>
> No. Crash happened inside VM, and it was shown as happening on
> return from int $0x80. But real problem is that in the VM you are
> executing code at 0xC000:0x800F. But there is no code there, it is
> garbage (bound bx,[bx+si]; xchg cx,ax; pusha; or dx,di ???) which
> generated bounds check interrupt.
Ok, that makes sense. From experience with OS/2 and virtual machines,
this generally happens when the video BIOS is confused by the state of
the hardware, especially if I/O port access to certain registers has been
incorrectly virtualised. ATI cards have been notorious for us on OS/2 for
these types of problems, but the problem is solvable (most of the OS/2
problems are specific to running in a window, where access to
the hardware registers has to be restricted and correctly emulated).
> > Any ideas? I am not sure how to start debugging this (assuming I can get
> > my SMP machine up and running and reproduce it) in the kernel. Also the
> > machine that the problem occurs on goes to the customer tomorrow, so we
> > won't be able to debug this much ourselves until I can get a new machine
> > to reproduce it. But, it would seem to me that others may well have seen
> > this problem already?
>
> Make sure that videocard properly reports that it uses more than
> 32kB BIOS. Maybe card reports only 32kB, while it uses 48kB. System
> is free to do anything it wants with 32-48kB range including
> mapping another BIOS there, or writing zeroes, or garbage there...
> Also make sure that you have properly set up VM, that 0xC8000 is
> mapped to physical address 0xC8000...
I will check into this. I am running into some strange problems on an
NVIDIA GeForce4 integrated system right now, yet that same BIOS works
perfectly in DOS and OS/2 so there is something up with the way the
vm86() services are being handled. I will try to solve the problem I am
seeing on this NVIDIA machine, and perhaps that will lead to a solution
for both problems (assuming they are actually related of course ;-).
Regards,
---
Kendall Bennett
Chief Executive Officer
SciTech Software, Inc.
Phone: (530) 894 8400
http://www.scitechsoft.com
~ SciTech SNAP - The future of device driver technology! ~
* Re: Crash in vm86() on SMP boxes with vesa driver?
From: Alan Cox @ 2003-04-29 20:49 UTC
To: Kendall Bennett; +Cc: Linux Kernel Mailing List
On Tue, 2003-04-29 at 20:41, Kendall Bennett wrote:
> I will check into this. I am running into some strange problems on an
> NVIDIA GeForce4 integrated system right now, yet that same BIOS works
> perfectly in DOS and OS/2 so there is something up with the way the
> vm86() services are being handled. I will try to solve the problem I am
> seeing on this NVIDIA machine, and perhaps that will lead to a solution
> for both problems (assuming they are actually related of course ;-).
Another thing to do is to run it under the x86 emulator in
XFree86 and see what behaves differently from vm86 proper. And then
there are lots of fun things that confuse video hardware: the fact that
Linux is a PnP OS, the PCI configuration problems and so on. In
particular, nobody right now (including X11 proper) virtualises
conf1/conf2 etc. properly, so you can crash the box easily.
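For context, "conf1" refers to PCI configuration mechanism #1: a 32-bit address is written to I/O port 0xCF8 and the selected register is then accessed through port 0xCFC. The following is a minimal sketch of the address encoding and of the port range a vm86 monitor would have to trap in order to virtualise it; the function and macro names are hypothetical and nothing here is taken from XFree86 or the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CONF1_ADDR_PORT 0xCF8u   /* CONFIG_ADDRESS */
#define PCI_CONF1_DATA_PORT 0xCFCu   /* CONFIG_DATA    */

/* Sketch only: build a configuration mechanism #1 address.
 * Bit 31 is the enable bit, bits 23-16 the bus, 15-11 the device,
 * 10-8 the function, 7-2 the dword-aligned register offset. */
static uint32_t pci_conf1_address(unsigned bus, unsigned dev,
                                  unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)dev << 11)
         | ((uint32_t)fn << 8)
         | ((uint32_t)reg & 0xFCu);
}

/* Ports 0xCF8-0xCFF are the ones a monitor would need to intercept
 * if BIOS code is not to reprogram PCI devices behind the OS's back. */
static int is_pci_conf1_port(uint16_t port)
{
    return port >= PCI_CONF1_ADDR_PORT && port <= PCI_CONF1_DATA_PORT + 3u;
}
```

A video BIOS that writes to these ports from inside vm86 with I/O access granted talks to the real chipset, which is one way the box can be crashed.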
* Crash in vm86() on SMP boxes with vesa driver?
From: Kendall Bennett @ 2003-04-28 23:12 UTC
To: linux-kernel
Hi Guys,
We ran into a problem with the VESA services crashing in our code on SMP
machines, and as a test we checked the XFree86 vesa driver module and
found that it has the same problem. The machine in question is a Red Hat
8.0 box with the latest 2.4.20 kernel on it (but the problem happened
with the stock kernel and kernels lower than .20 as well). Unfortunately
I don't have access to the box (it is in Australia), but I have access to
the bug report information (and will try to configure a box soon to
reproduce it here). Anyway the following is the error log produced by
XFree86 when the crash occurs:
(II) VESA(0): VBESetVBEMode failed
(EE) VESA(0): vm86() syscall generated signal 11.
I guess this answers your question ... ;)
(II) VESA(0): initializing int10
(WW) VESA(0): Bad V_BIOS checksum
(II) VESA(0): Primary V_BIOS segment is: 0xc000
(II) VESA(0): VESA BIOS detected
(II) VESA(0): VESA VBE Version 2.0
(II) VESA(0): VESA VBE Total Mem: 32768 kB
(II) VESA(0): VESA VBE OEM: ATI RAGE128
(II) VESA(0): VESA VBE OEM Software Rev: 1.0
(II) VESA(0): VESA VBE OEM Vendor: ATI Technologies Inc.
(II) VESA(0): VESA VBE OEM Product: R128
(II) VESA(0): VESA VBE OEM Product Rev: 01.00
(==) VESA(0): Write-combining range (0xf0000000,0x2000000)
(II) VESA(0): virtual address = 0x402d7000,
physical address = 0xf0000000, size = 33554432
(II) VESA(0): VBESetVBEMode failed
(EE) VESA(0): vm86() syscall generated signal 11.
(II) VESA(0): EAX=0x00000150, EBX=0x00000ba0, ECX=0x00000000,
EDX=0x00000000
(II) VESA(0): ESP=0x00000fba, EBP=0x00000001, ESI=0x00000bc3,
EDI=0x00003ad7
(II) VESA(0): CS=0xc000, SS=0x0100, DS=0x0000, ES=0xc000, FS=0x0000,
GS=0x0000
(II) VESA(0): EIP=0x0000800f, EFLAGS=0x00033006
(II) VESA(0): code at 0x000c800f:
62 18 91 60 09 fa 03 85 27 11 27 11 9d 0f f4 81
fe 06 d0 1a 68 74 99 a9 c6 39 f9 6d 04 b4 d6 6b
(II) stack at 0x00001fba:
df 15 00 20 00 00 37 3e 00 00 b0 0f 00 00 dc 0f
00 00 00 04 00 00 9f 00 00 00 1b 80 00 00 e2 00
00 00 00 00 00 c0 87 4b 20 14 00 c0 1a 42 1b c1
00 00 00 01 00 00 00 00 00 20 40 00 00 00 86 4b
00 06 00 00 00 32
(EE) VESA(0): Set VBE Mode failed!
Also from debugging our own code we have a bit more information about
where the problem occurs, and it occurs on the return from the vm86()
system call when the code tries to pop the EBX register from the stack.
Which kind of indicates that the kernel screwed up the return stack of
the program for some reason:
Program received signal SIGSEGV, Segmentation fault.
0x4007cafb in vm86 (vm=0x4016d9cc) at linux/pm.c:102
(gdb) bt
#0 0x4007cafb in vm86 (vm=0x4016d9cc) at linux/pm.c:102
#1 0x4007fb0f in run_vm86 () at linux/pm.c:1610
#2 0x4007fd0a in PM_int86 (intno=16, in=0xbffff260, out=0xbffff260) at
linux/pm.c:1653
#3 0x400afcaa in DRV_configure (forcedMem=0) at config.c:210
#4 0x400af176 in DRV_initDriver () at
/a/scitech/private/src/snap/graphics/drivers/gdetect.c:524
#5 0x400944f4 in _GA_initInternal (dc=0x4019fd40, device=0)
at /a/scitech/private/src/snap/graphics/gainit.c:1231
#6 0x4009651d in LoadDriver (deviceIndex=0, shared=0, info=0x8072550,
drivername=0xbffff780 "null.drv", busType=5) at
/a/scitech/private/src/snap/graphics/gainit.c:2255
#7 0x40096d56 in __GA_loadDriver (deviceIndex=0, shared=0)
at /a/scitech/private/src/snap/graphics/gainit.c:2808
#8 0x4009712e in GA_loadDriver (deviceIndex=0, shared=0)
at /a/scitech/private/src/snap/graphics/gainit.c:2919
#9 0x0804c9d0 in main ()
#10 0x401fc907 in __libc_start_main () from /lib/libc.so.6
Program received signal SIGSEGV, Segmentation fault.
0x4007cafb in vm86 (vm=0x4016d9cc) at linux/pm.c:102
102 asm volatile (
(gdb) list
97 static int
98 vm86(struct vm86_struct *vm)
99 {
100 int r;
101 #ifdef __PIC__
102 asm volatile (
103 "pushl %%ebx\n\t"
104 "movl %2, %%ebx\n\t"
105 "int $0x80\n\t"
106 "popl %%ebx"
Dump of assembler code for function vm86:
0x4007cae0 <vm86>: push %ebp
0x4007cae1 <vm86+1>: mov %esp,%ebp
0x4007cae3 <vm86+3>: sub $0x8,%esp
0x4007cae6 <vm86+6>: mov $0x71,%edx
0x4007caeb <vm86+11>: mov 0x8(%ebp),%eax
0x4007caee <vm86+14>: mov %eax,0xfffffff8(%ebp)
0x4007caf1 <vm86+17>: mov %edx,%eax
0x4007caf3 <vm86+19>: mov 0xfffffff8(%ebp),%ecx
0x4007caf6 <vm86+22>: push %ebx
0x4007caf7 <vm86+23>: mov %ecx,%ebx
0x4007caf9 <vm86+25>: int $0x80
0x4007cafb <vm86+27>: pop %ebx
0x4007cafc <vm86+28>: mov %eax,0xfffffff8(%ebp)
0x4007caff <vm86+31>: mov 0xfffffff8(%ebp),%eax
0x4007cb02 <vm86+34>: mov %eax,0xfffffffc(%ebp)
0x4007cb05 <vm86+37>: mov 0xfffffffc(%ebp),%eax
0x4007cb08 <vm86+40>: mov %eax,%eax
0x4007cb0a <vm86+42>: mov %ebp,%esp
0x4007cb0c <vm86+44>: pop %ebp
0x4007cb0d <vm86+45>: ret
End of assembler dump.
(gdb) info frame
Stack level 0, frame at 0xbffff208:
eip = 0x4007cafb in vm86 (linux/pm.c:102); saved eip 0x4007fb0f
called by frame at 0xbffff238
source language c.
Arglist at 0xbffff208, args: vm=0x4016d9cc
Locals at 0xbffff208, Previous frame's sp is 0x0
Saved registers:
ebp at 0xbffff208, eip at 0xbffff20c
(gdb) info args
vm = (struct vm86_struct *) 0x4016d9cc
(gdb) info locals
r = 1
(gdb) info reg
eax 0x0 0
ecx 0x4016d9cc 1075239372
edx 0x71 113
ebx 0x4016d9cc 1075239372
esp 0xbffff1fc 0xbffff1fc
ebp 0xbffff208 0xbffff208
esi 0x4019c4e0 1075430624
edi 0x17c 380
eip 0x4007cafb 0x4007cafb
eflags 0x203286 2110086
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x0 0
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
xmm0 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm1 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm2 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm3 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm4 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm5 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm6 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
xmm7 {f = {0x0, 0x0, 0x0, 0x0}} {f = {0, 0, 0, 0}}
mxcsr 0x1f80 8064
orig_eax 0x71 113
(gdb) x/32w $esp-64
0xbffff1bc: 0x401c4598 0x00000001 0x00000000 0x400355e4
0xbffff1cc: 0x00004000 0x40321da0 0x40321cfc 0x00000089
0xbffff1dc: 0x08075198 0x000000e9 0x042801bf 0xbffff218
0xbffff1ec: 0x4007f6d0 0x40259010 0x40259004 0xbffff218
0xbffff1fc: 0x4017dc54 0x4016d9cc 0x00000001 0xbffff238
0xbffff20c: 0x4007fb0f 0x4016d9cc 0x4017dc54 0xbffff238
0xbffff21c: 0x4007fbd5 0x4016d9cc 0x4017dc54 0xbffff248
0xbffff22c: 0x4007fd02 0x00000001 0x4017dc54 0xbffff248
Any ideas? I am not sure how to start debugging this (assuming I can get
my SMP machine up and running and reproduce it) in the kernel. Also the
machine that the problem occurs on goes to the customer tomorrow, so we
won't be able to debug this much ourselves until I can get a new machine
to reproduce it. But, it would seem to me that others may well have seen
this problem already?
Note that I have also posted this message to the XFree86 developer
mailing list since I am not sure if this is a Linux kernel issue or some
other issue.
Regards,
---
Kendall Bennett
Chief Executive Officer
SciTech Software, Inc.
Phone: (530) 894 8400
http://www.scitechsoft.com
~ SciTech SNAP - The future of device driver technology! ~