From: Bandan Das
Subject: Re: x86: Question regarding the reset value of LINT0
Date: Thu, 09 Apr 2015 15:17:41 -0400
To: Nadav Amit
Cc: Avi Kivity, Jan Kiszka, Paolo Bonzini, kvm list

Nadav Amit writes:

> Avi Kivity wrote:
>
>> On 04/09/2015 09:21 PM, Nadav Amit wrote:
>>> Bandan Das wrote:
>>>
>>>> Nadav Amit writes:
>>>>
>>>>> Jan Kiszka wrote:
>>>>>
>>>>>> On 2015-04-08 19:40, Nadav Amit wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>
>>>>>>>> On 2015-04-08 18:59, Nadav Amit wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>
>>>>>>>>>> On 2015-04-08 18:40, Nadav Amit wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I would appreciate if someone explains the reason for enabling LINT0 during
>>>>>>>>>>> APIC reset. This does not correspond with Intel SDM Figure 10-8: “Local
>>>>>>>>>>> Vector Table” that says all LVT registers are reset to 0x10000.
>>>>>>>>>>>
>>>>>>>>>>> In kvm_lapic_reset, I see:
>>>>>>>>>>>
>>>>>>>>>>> apic_set_reg(apic, APIC_LVT0,
>>>>>>>>>>>              SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
>>>>>>>>>>>
>>>>>>>>>>> Which is actually pretty similar to QEMU’s apic_reset_common:
>>>>>>>>>>>
>>>>>>>>>>> if (bsp) {
>>>>>>>>>>>     /*
>>>>>>>>>>>      * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
>>>>>>>>>>>      * time typically by BIOS, so PIC interrupt can be delivered to the
>>>>>>>>>>>      * processor when local APIC is enabled.
>>>>>>>>>>>      */
>>>>>>>>>>>     s->lvt[APIC_LVT_LINT0] = 0x700;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Yet, in both cases, I miss the point - if it is typically done by the BIOS,
>>>>>>>>>>> why does QEMU or KVM enable it?
>>>>>>>>>>>
>>>>>>>>>>> BTW: KVM seems to run fine without it, and I think setting it causes me
>>>>>>>>>>> problems in certain cases.
>>>>>>>>>> I suspect it has some historic BIOS backgrounds. Already tried to find
>>>>>>>>>> more information in the git logs of both code bases? Or something that
>>>>>>>>>> indicates of SeaBIOS or BochsBIOS once didn't do this initialization?
>>>>>>>>> Thanks. I found no indication of such thing.
>>>>>>>>>
>>>>>>>>> QEMU’s commit message (0e21e12bb311c4c1095d0269dc2ef81196ccb60a) says:
>>>>>>>>>
>>>>>>>>> Don't route PIC interrupts through the local APIC if the local APIC
>>>>>>>>> config says so. By Ari Kivity.
>>>>>>>>>
>>>>>>>>> Maybe Avi Kivity knows this guy.
>>>>>>>> ths? That should have been Thiemo Seufer (IIRC), but he just committed
>>>>>>>> the code back then (and is no longer with us, sadly).
>>>>>>> Oh… I am sorry - I didn’t know about that.. (I tried to make an unfunny joke
>>>>>>> about Avi knowing “Ari”).
>>>>>> Ah. No problem.
>>>>>> My brain apparently fixed that typo up unnoticed.
>>>>>>
>>>>>>>> But if that commit went in without any BIOS changes around it, QEMU
>>>>>>>> simply had to do the job of the latter to keep things working.
>>>>>>> So should I leave it as is? Can I at least disable in KVM during INIT (and
>>>>>>> leave it as is for RESET)?
>>>>>> No, I don't think there is a need to leave this inaccurate for QEMU if
>>>>>> our included BIOS gets it right. I don't know what the backward
>>>>>> bug-compatibility of KVM is, though. Maybe you can identify since when
>>>>>> our BIOS is fine so that we can discuss time frames.
>>>>> I think that it was addressed in commit
>>>>> 19c1a7692bf65fc40e56f93ad00cc3eefaad22a4 ("Initialize the LINT LVTs on the
>>>>> local APIC of the BSP.") So it should be included in seabios 0.5.0, which
>>>>> means qemu 0.12 - so we are talking about the end of 2009 or start of 2010.
>>>> The probability that someone will use a newer version of kernel with something
>>>> as old as 0.12 is probably minimal. I think it's ok to change it with a comment
>>>> indicating the reason. To be on the safe side, however, a user changeable switch
>>>> is something worth considering.
>>> I don’t see any existing mechanism for KVM to be aware of its user type and
>>> version. I do see another case of KVM hacks that are intended for fixing
>>> very old QEMU bugs (see 3a624e29c75 changes in vmx_set_segment, which are
>>> from pretty much the same time-frame of the issue I try to fix).
>>>
>>> Since this is something which would follow around, please advise what would
>>> be the format. A new ioctl that would supply the userspace “type” (according
>>> to predefined constants) and version?
>>
>> That would be madness. KVM shouldn't even know that qemu exists, let alone
>> track its versions.
>>
>> Simply add a new toggle KVM_USE_STANDARD_LAPIC_LVT_INIT and document that
>> userspace MUST use it.
>> Old userspace won't, and will get the old buggy
>> behavior.
>
> I fully agree it would be madness. Yet it appears to be a recurring problem.
> Here are similar problems found from a short search:
>
> 1. vmx_set_segment setting segment accessed (3a624e29c75)
> 2. svm_set_cr0 clearing CD and NW (709ddebf81c)
> 3. Limited number of MTRRs due to Seabios bus (0d234daf7e0a)
>
> Excluding (1) all of the other issues are related to the VM BIOS. Perhaps
> KVM should somehow realize which VM BIOS runs? (yes, it sounds just as bad.)

How about renaming the toggle Avi mentioned above to something more generic
(KVM_DISABLE_LEGACY_QUIRKS?) and grouping all the issues together? Modern
userspace will always enable it and get the new correct behavior. When more
cases are discovered, KVM can just add them to the list.

> Nadav
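
The grouped toggle suggested above could be sketched roughly as follows. This
is only an illustration under stated assumptions: the struct, the helper names,
and the quirk bit names are hypothetical and are not existing KVM interfaces.
Only the two LVT values come from the thread (0x700 for ExtInt delivery on
LINT0, 0x10000 for the masked reset value the SDM describes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values quoted in the thread. */
#define LVT_MASKED          0x10000u  /* mask bit set: SDM reset value */
#define LVT_DELIVERY_EXTINT 0x700u    /* ExtInt delivery mode, unmasked */

/* Hypothetical quirk bits, one per legacy behavior (names are made up). */
#define QUIRK_LINT0_REENABLED (1u << 0)
#define QUIRK_CD_NW_CLEARED   (1u << 1)

/* Hypothetical per-VM state: the set of quirks userspace turned off. */
struct vm_quirks {
    uint32_t disabled;
};

/* A quirk stays active unless userspace explicitly disabled it. */
static bool quirk_active(const struct vm_quirks *q, uint32_t quirk)
{
    assert(q != NULL);
    return (q->disabled & quirk) == 0;
}

/* LVT0 value to program on APIC reset, depending on the quirk setting. */
static uint32_t lvt0_reset_value(const struct vm_quirks *q)
{
    if (quirk_active(q, QUIRK_LINT0_REENABLED))
        return LVT_DELIVERY_EXTINT;  /* old behavior, for old userspace */
    return LVT_MASKED;               /* behavior matching the SDM */
}
```

With this shape, old userspace that never touches the toggle keeps the legacy
value, and adding a newly discovered case only means defining one more bit.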