From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [long] MINIX 3.1.6 works in QEMU-0.12.3 only with KVM disabled
Date: Mon, 15 Mar 2010 14:48:02 +0200
Message-ID: <4B9E2C82.5030801@redhat.com>
References: <4B9773C4.7010001@gmail.com> <4B9E11E3.5050000@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org
To: Antoine Leca
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:26682 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933709Ab0COMsF (ORCPT ); Mon, 15 Mar 2010 08:48:05 -0400
In-Reply-To: <4B9E11E3.5050000@gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 03/15/2010 12:54 PM, Antoine Leca wrote:
>
>> When doing the switch, the cached segment selectors are preserved,
>> which allows one to use protected mode segments in real-address mode
>> (this is called unreal mode).
>>
> Now this is a by-product of the implementation inside the BIOS.
> In fact, even if the BIOS enters unreal mode (or the similar big real,
> more useful with segmentation-less architectures), before turning back
> to the client it (should) reset things to normal real mode, as service
> 15/87 is not a usual way to enter unreal mode (for example, this effect
> is not even mentioned in Ralf Brown's list).
>

The entry into unreal mode is unintentional; the BIOS is transitioning
to protected mode and 'unreal mode' only exists for a few instructions,
IIRC.

> As a result (and also and foremost because of 80286 compatibility),
> instead of directly using unreal or big real mode if possible (as done
> e.g. in himem.sys), the Minix monitor goes to the great pain of going
> back to square one, and since blocks are at most 64 KB in size and
> several iterations are needed, on the next block Minix sets up the (very
> similar) GDT then does another call to the same BIOS service 15/87.
>
>> I knew these parts before, but this is where Avi's answer came in: KVM
>> on Intel does not yet support unreal mode and requires the cached
>> segment descriptors to be valid in real-address mode.
>>
> I do not know which virtual BIOS KVM is using, but I noticed this while
> reading http://bochs.sourceforge.net/cgi-bin/lxr/source/bios/rombios.c:
> [ Slightly edited to fit the width of my post. AL. ]
> 3555     case 0x87:
> 3556 #if BX_CPU < 3
> 3557 #  error "Int15 function 87h not supported on < 80386"
> 3558 #endif
> 3559       // +++ should probably have descriptor checks
> 3560       // +++ should have exception handlers
> ...
> 3640       mov  eax, cr0
> 3641       or   al, #0x01
> 3642       mov  cr0, eax
> 3643       ;; far jump to flush CPU queue after transition to prot. mode
> 3644       JMP_AP(0x0020, protected_mode)
> 3645
> 3646 protected_mode:
> 3647       ;; GDT points to valid descriptor table, now load SS, DS, ES
> 3648       mov  ax, #0x28 ;; 101 000 = 5th desc.in table, TI=GDT,RPL=00
> 3649       mov  ss, ax
> 3650       mov  ax, #0x10 ;; 010 000 = 2nd desc.in table, TI=GDT,RPL=00
> 3651       mov  ds, ax
> 3652       mov  ax, #0x18 ;; 011 000 = 3rd desc.in table, TI=GDT,RPL=00
> 3653       mov  es, ax
> 3654       xor  si, si
> 3655       xor  di, di
> 3656       cld
> 3657       rep
> 3658         movsw  ;; move CX words from DS:SI to ES:DI
> 3659
> 3660       ;; make sure DS and ES limits are 64KB
> 3661       mov  ax, #0x28
> 3662       mov  ds, ax
> 3663       mov  es, ax
> 3664
> 3665       ;; reset PG bit in CR0 ???
> 3666       mov  eax, cr0
> 3667       and  al, #0xFE
> 3668       mov  cr0, eax
>
> I must be missing something here... There is no unreal mode at any
> moment, is there?
>
> [ ... some web browsing occurring meanwhile ... Later: ]
>
> Okay, now I got another picture. 8-|
> Until recently, KVM (and qemu) used the Bochs BIOS, shown above; but
> they switched recently to SeaBIOS...
> where the applicable code is in src/system.c, and looks like this
> (now in AT&T assembly):
> static void
> handle_1587(struct bregs *regs)
> {
>     // +++ should probably have descriptor checks
>     // +++ should have exception handlers
> ....
>         // Enable protected mode
>         "  movl %%cr0, %%eax\n"
>         "  orl $" __stringify(CR0_PE) ", %%eax\n"
>         "  movl %%eax, %%cr0\n"
>
>         // far jump to flush CPU queue after transition to prot. mode
>         "  ljmpw $(4<<3), $1f\n"
>
>         // GDT points to valid descriptor table, now load DS, ES
>         "1:movw $(2<<3), %%ax\n"
>                 // 2nd descriptor in table, TI=GDT, RPL=00
>         "  movw %%ax, %%ds\n"
>         "  movw $(3<<3), %%ax\n"
>                 // 3rd descriptor in table, TI=GDT, RPL=00
>         "  movw %%ax, %%es\n"
>
>         // move CX words from DS:SI to ES:DI
>         "  xorw %%si, %%si\n"
>         "  xorw %%di, %%di\n"
>         "  rep movsw\n"
>
>         // Disable protected mode
>         "  movl %%cr0, %%eax\n"
>         "  andl $~" __stringify(CR0_PE) ", %%eax\n"
>         "  movl %%eax, %%cr0\n"
>
> Note that while the basic scheme is the same, the "cleaning up" in
> Bochs lines 3660-3663, "make sure DS and ES limits are 64KB", is not
> present. IIUC, the virtualized CPU goes back to real mode with those
> segments set as they are in protected mode, and yes, with the Minix
> boot monitor they happened to NOT be paragraph-aligned.
>
>
> Is it possible to add back this "cleaning up" to the BIOS used in KVM?
>

I think so.  This is a longstanding kvm bug, but I can't see any
downsides to a workaround in the BIOS.

--
error compiling committee.c: too many arguments to function