All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bandan Das <bsd@redhat.com>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Avi Kivity <avi.kivity@gmail.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm list <kvm@vger.kernel.org>
Subject: Re: x86: Question regarding the reset value of LINT0
Date: Thu, 09 Apr 2015 15:17:41 -0400	[thread overview]
Message-ID: <jpg7ftl2jsa.fsf@redhat.com> (raw)
In-Reply-To: <8E93BDA9-CCB5-4D7C-9FF0-0CDBCDB78051@gmail.com> (Nadav Amit's message of "Thu, 9 Apr 2015 21:49:12 +0300")

Nadav Amit <nadav.amit@gmail.com> writes:

> Avi Kivity <avi.kivity@gmail.com> wrote:
>
>> On 04/09/2015 09:21 PM, Nadav Amit wrote:
>>> Bandan Das <bsd@redhat.com> wrote:
>>> 
>>>> Nadav Amit <nadav.amit@gmail.com> writes:
>>>> 
>>>>> Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>> 
>>>>>> On 2015-04-08 19:40, Nadav Amit wrote:
>>>>>>> Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>> 
>>>>>>>> On 2015-04-08 18:59, Nadav Amit wrote:
>>>>>>>>> Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>>>>>> 
>>>>>>>>>> On 2015-04-08 18:40, Nadav Amit wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I would appreciate if someone explains the reason for enabling LINT0 during
>>>>>>>>>>> APIC reset. This does not correspond with Intel SDM Figure 10-8: “Local
>>>>>>>>>>> Vector Table” that says all LVT registers are reset to 0x10000.
>>>>>>>>>>> 
>>>>>>>>>>> In kvm_lapic_reset, I see:
>>>>>>>>>>> 
>>>>>>>>>>> 	apic_set_reg(apic, APIC_LVT0,
>>>>>>>>>>> 		SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
>>>>>>>>>>> 
>>>>>>>>>>> Which is actually pretty similar to QEMU’s apic_reset_common:
>>>>>>>>>>> 
>>>>>>>>>>> if (bsp) {
>>>>>>>>>>>     /*
>>>>>>>>>>>      * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
>>>>>>>>>>>      * time typically by BIOS, so PIC interrupt can be delivered to the
>>>>>>>>>>>      * processor when local APIC is enabled.
>>>>>>>>>>>      */
>>>>>>>>>>>     s->lvt[APIC_LVT_LINT0] = 0x700;
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> Yet, in both cases, I miss the point - if it is typically done by the BIOS,
>>>>>>>>>>> why does QEMU or KVM enable it?
>>>>>>>>>>> 
>>>>>>>>>>> BTW: KVM seems to run fine without it, and I think setting it causes me
>>>>>>>>>>> problems in certain cases.
>>>>>>>>>> I suspect it has some historic BIOS backgrounds. Already tried to find
>>>>>>>>>> more information in the git logs of both code bases? Or something that
>>>>>>>>>> indicates of SeaBIOS or BochsBIOS once didn't do this initialization?
>>>>>>>>> Thanks. I found no indication of such thing.
>>>>>>>>> 
>>>>>>>>> QEMU’s commit message (0e21e12bb311c4c1095d0269dc2ef81196ccb60a) says:
>>>>>>>>> 
>>>>>>>>>  Don't route PIC interrupts through the local APIC if the local APIC
>>>>>>>>>  config says so. By Ari Kivity.
>>>>>>>>> 
>>>>>>>>> Maybe Avi Kivity knows this guy.
>>>>>>>> ths? That should have been Thiemo Seufer (IIRC), but he just committed
>>>>>>>> the code back then (and is no longer with us, sadly).
>>>>>>> Oh… I am sorry - I didn’t know about that.. (I tried to make an unfunny joke
>>>>>>> about Avi knowing “Ari”).
>>>>>> Ah. No problem. My brain apparently fixed that typo up unnoticed.
>>>>>> 
>>>>>>>> But if that commit went in without any BIOS changes around it, QEMU
>>>>>>>> simply had to do the job of the latter to keep things working.
>>>>>>> So should I leave it as is? Can I at least disable in KVM during INIT (and
>>>>>>> leave it as is for RESET)?
>>>>>> No, I don't think there is a need to leave this inaccurate for QEMU if
>>>>>> our included BIOS gets it right. I don't know what the backward
>>>>>> bug-compatibility of KVM is, though. Maybe you can identify since when
>>>>>> our BIOS is fine so that we can discuss time frames.
>>>>> I think that it was addressed in commit
>>>>> 19c1a7692bf65fc40e56f93ad00cc3eefaad22a4 ("Initialize the LINT LVTs on the
>>>>> local APIC of the BSP.”) So it should be included in seabios 0.5.0, which
>>>>> means qemu 0.12 - so we are talking about the end of 2009 or start of 2010.
>>>> The probability that someone will use a newer version of kernel with something
>>>> as old as 0.12 is probably minimal. I think it's ok to change it with a comment
>>>> indicating the reason. To be on the safe side, however, a user changeable switch
>>>> is something worth considering.
>>> I don’t see any existing mechanism for KVM to be aware of its user type and
>>> version. I do see another case of KVM hacks that are intended for fixing
>>> very old QEMU bugs (see 3a624e29c75 changes in vmx_set_segment, which are
>>> from pretty much the same time-frame of the issue I try to fix).
>>> 
>>> Since this is something which would follow around, please advise what would
>>> be the format. A new ioctl that would supply the userspace “type” (according
>>> to predefined constants) and version?
>> 
>> That would be madness. KVM shouldn't even know that qemu exists, let alone
>> track its versions.
>> 
>> Simply add a new toggle KVM_USE_STANDARD_LAPIC_LVT_INIT and document that
>> userspace MUST use it. Old userspace won't, and will get the old buggy
>> behavior.
>
> I fully agree it would be madness. Yet it appears to be a recurring problem.
> Here are similar problems  found from a short search:
>
> 1. vmx_set_segment setting segment accessed (3a624e29c75)
> 2. svm_set_cr0 clearing CD and NW (709ddebf81c)
> 3. Limited number of MTRRs due to Seabios bus (0d234daf7e0a)
>
> Excluding (1) all of the other issues are related to the VM BIOS. Perhaps
> KVM should somehow realize which VM BIOS runs? (yes, it sounds just as bad.)

How about renaming the toggle Avi mentioned above to something more generic
(KVM_DISABLE_LEGACY_QUIRKS ?) and grouping all the issues together ? Modern userspace
will always enable it and get the new correct behavior. When more cases are discovered,
KVM can just add them to the list.


> Nadav
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2015-04-09 19:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-08 16:40 x86: Question regarding the reset value of LINT0 Nadav Amit
2015-04-08 16:44 ` Jan Kiszka
2015-04-08 16:59   ` Nadav Amit
2015-04-08 17:06     ` Jan Kiszka
2015-04-08 17:40       ` Nadav Amit
2015-04-08 17:51         ` Jan Kiszka
2015-04-08 21:49           ` Nadav Amit
2015-04-08 22:11             ` Bandan Das
2015-04-09 18:21               ` Nadav Amit
2015-04-09 18:28                 ` Avi Kivity
2015-04-09 18:49                   ` Nadav Amit
2015-04-09 19:17                     ` Bandan Das [this message]
2015-04-10  9:12                       ` Paolo Bonzini
2015-04-12 18:29                         ` Nadav Amit
2015-04-12 22:53                           ` [PATCH] KVM: x86: Support for disabling quirks Nadav Amit
2015-04-13 10:34                             ` Paolo Bonzini
2015-04-13 12:02                               ` Nadav Amit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jpg7ftl2jsa.fsf@redhat.com \
    --to=bsd@redhat.com \
    --cc=avi.kivity@gmail.com \
    --cc=jan.kiszka@siemens.com \
    --cc=kvm@vger.kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.