From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261580AbTD2TaA (ORCPT ); Tue, 29 Apr 2003 15:30:00 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261582AbTD2TaA (ORCPT ); Tue, 29 Apr 2003 15:30:00 -0400 Received: from adsl-63-195-13-70.dsl.chic01.pacbell.net ([63.195.13.70]:51881 "EHLO mail.scitechsoft.com") by vger.kernel.org with ESMTP id S261580AbTD2T36 (ORCPT ); Tue, 29 Apr 2003 15:29:58 -0400 From: "Kendall Bennett" Organization: SciTech Software, Inc. To: linux-kernel@vger.kernel.org Date: Tue, 29 Apr 2003 12:41:12 -0700 MIME-Version: 1.0 Subject: Re: Crash in vm86() on SMP boxes with vesa driver? Message-ID: <3EAE72E8.7176.1E3DFDF8@localhost> In-reply-to: X-mailer: Pegasus Mail for Windows (v4.02) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org "Petr Vandrovec" wrote: > > 8.0 box with the latest 2.4.20 kernel on it (but the problem happened > > with the stock kernel and kernels lower then .20 as well). Unfortunately > > I don't have access to the box (it is in Australia), but I have access to > > the bug report information (and will try to configure a box soon to > > reproduce it here). Anyway the folowing is the error log produced by > > XFree86 when the crash occurs: > > We told you before that you cannot trust VESA BIOS. No, I do not agree with that statement at all. At present I would say that there is a problem with the vm86() services, or perhaps something wrong with the way we (and XFree86) are setting up the vm86 state for the BIOS. The reason I say that is because we use the BIOS all the time using vm86() services on OS/2 and we have not had any of these problems. Essentially what I am saying is that this problem is fixable somewhere (either in the kernel or in our/XFree86's vm86() code). > > (II) VESA(0): initializing int10 > > (WW) VESA(0): Bad V_BIOS checksum > > (II) VESA(0): Primary V_BIOS segment is: 0xc000 > > Bad checksum? Sorry, your BIOS is not usable. Either XFree gets > checksum wrong, or there is something I would not want in my > computer there... I wish we stll had access to the machine so I could debug this. It is plausible that the BIOS has a bad checksum, but if it did, the system BIOS would have failed to POST the card. Hence I think there is something else going on here. > > Also from debugging our own code we have a bit more information about > > where the problem occurs, and it occurs on the return from the vm86() > > system call when the code tries to pop the EBX register from the stack. > > Which kind of indicates that the kernel screwed up the return stack of > > the program for some reason: > > No. Crash happened inside VM, and it was shown as happening on > return from int $0x80. But real problem is that in the VM you are > executing code at 0xC000:0x800F. But there is no code there, it is > garbage (bound bx,[bx+si]; xchg cx,ax; pusha; or dx,di ???) which > generated bounds check interrupt. Ok, that makes sense. From experience with OS/2 and virtual machines, this generally happens when the video BIOS is confused by the state of the hardware, especially of I/O port access to certain registers has been incorrectly virtualised. ATI cards have been notorious for us on OS/2 for these types of problems, but the problem is solveable (most of the OS/2 related problems are all specific to running in a window, where access to the hardware registers has to be restricted and correctly emulated). > > Any ideas? I am not sure how to start debuging this (assuming I can get > > my SMP machine up and running and reproduce it) in the kernel. Also the > > machine that the problem occurs on goes to the customer tomorrow, so we > > won't be able to debug this much ourselves until I can get a new machine > > to reproduce it. But, it would seem to me that others may well have seen > > this problem already? > > Make sure that videocard properly reports that it uses more than > 32kB BIOS. Maybe card reports only 32kB, while it uses 48kB. System > is free to do anything it wants with 32-48kB range including > mapping another BIOS there, or writting zeroes, or garbage there... > Also make sure that you have properly setup VM, that 0xC8000 is > mapped to physical address 0xC8000... I will check into this. I am running into some strange problems on an NVIDIA GeForce4 integrated system right now, yet that same BIOS works perfectly in DOS and OS/2 so there is something up with the way the vm86() services are being handled. I will try to solve the problem I am seeing on this NVIDIA machine, and perhaps that will lead to a solution for both problems (assuming they are actually related of course ;-). Regards, --- Kendall Bennett Chief Executive Officer SciTech Software, Inc. Phone: (530) 894 8400 http://www.scitechsoft.com ~ SciTech SNAP - The future of device driver technology! ~