kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: "Nadav Har'El" <nyh@math.technion.ac.il>
Cc: kvm@vger.kernel.org, Sheng Yang <sheng@linux.intel.com>
Subject: Re: [PATCH 18/24] Exiting from L2 to L1
Date: Sun, 12 Sep 2010 19:21:29 +0200	[thread overview]
Message-ID: <4C8D0C19.4050003@redhat.com> (raw)
In-Reply-To: <20100912170503.GA7828@fermat.math.technion.ac.il>

  On 09/12/2010 07:05 PM, Nadav Har'El wrote:
>
>>>>> +	vmcs12->guest_ia32_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
>>>>>
>>>> Without msr bitmaps, cannot change.
>>> I added a TODO before this (and a couple of others) for future
>>> optimization.
>>> I'm not even convinced how much quicker it is to check the MSR bitmap
>>> before
>>> doing vmcs_read64, vs just to going ahead and vmreading it in any case.
>> IIRC we don't support msr bitmaps now, so no need to check anything.
> I do think we support msr bitmaps... E.g., we have
> nested_vmx_exit_handled_msr() to check whether L1 requires an exit for a
> certain MSR access.  Where don't we support them? (but I'm not denying the
> possiblity that this support still has holes or bugs).

I was just talking from memory.  If you do support them, that's great.

Note that kvm itself doesn't support give the guest control of 
DEBUGCTLMSR, so you should just be able to read it from the shadow value 
(which strangely doesn't exist - I'll post a fix).

>> In general, avoid vmcs reads as much as possible.  Just think of your
>> code running in a guest!
> Yes. On the other hand, I don't want to be sorry in the future when I want
> to support some feature, but because I wanted to shave off 1% of the L2->L1
> switching time, and 0.01% of the total run time (and I'm just making
> numbers up...), I now need to find a dozen places where things need to change
> to support this feature. On the other hand, this will likely happen anyway ;-)

Well, with msrs you have two cases: those which the guest controls and 
those which are shadowed.  So all we need is a systematic way for 
dealing with the two types.

>>>>> +	vmcs12->vm_entry_intr_info_field =
>>>>> +		vmcs_read32(VM_ENTRY_INTR_INFO_FIELD);
>>>>>
>>>> Autocleared, no need to read.
>>> Well, we need to clear the "valid bit" on exit, so we don't mistakenly
>>> inject
>>> the same interrupt twice.
>> I don't think so.  We write this field as part of guest entry (that is,
>> after the decision re which vmcs to use, yes?), so guest entry will
>> always follow writing this field.  Since guest entry clears the field,
>> reading it after an exit will necessarily return 0.
> Well, you obviously know the KVM code much better than I do, but from what
> I saw, I thought (maybe I misunderstood) that in normal (non-nested) KVM,
> this field only gets written on injection, not on every entry, so the code
> relies on the fact that the processor turns off the "valid" bit during exit,
> to avoid the same event being injected when the same field value is used for
> another entry.

Correct.

> I can only find code which resets this field in vmx_vcpu_reset(),
> but that doesn't get called on every entry, right? Or am I missing something?

prepare_vmcs12() is called in response for a 2->1 vmexit which is first 
trapped by 0, yes?  Because it's called immediately after a vmexit, 
VM_ENTRY_INTR_INFO_FIELD is guaranteed to have been cleared by the 
processor.

There are two cases where VM_ENTRY_INTR_INFO_FIELD can potentially not 
be cleared by hardware:

1. if we call prepare_vmcs12() between injection and entry.  This cannot 
happen AFAICT.
2. if the vmexit was really a failed 1->2 vmentry, and if the processor 
doesn't clear VM_ENTRY_INTR_INFO_FIELD in response to vm entry failures 
(need to check scripture)

If neither of these are valid, the code can be removed.  If only the 
second, we might make it conditional.

>> What can happen is that the contents of the field is transferred to the
>> IDT_VECTORING_INFO field or VM_EXIT_INTR_INFO field.
>>
>> (question: on a failed vmentry, is this field cleared?)
> I don't know the answer :-)

Sheng?

>>> There were two ways to do it: 1. clear it ourselves,
>>> or 2. copy the value from vmcs02 where the processor already cleared it.
>>> There are pros and cons for each approach, but I'll change like you
>>> suggest,
>>> to clear it ourselves:
>>>
>>> 	vmcs12->vm_entry_intr_info_field&= ~INTR_INFO_VALID_MASK;
>> That's really a temporary variable, I don't think you need to touch it.
> But we need to emulate the correct VMX behavior. According to the spec, the
> "valid" bit on this field needs to be cleared on vmexit, so we need to do it
> also on emulated exits from L2 to L1. If we're sure that we already cleared it
> earlier, then fine, but if not (and like I said, I failed to find this code),
> we need to do it now, on exit - either by explictly clearing the bit or by
> copying a value where the processor cleared this bit (arguably, the former is
> more correct emulation).

Sorry, I misread it as vmx->idt_vectoring_info which is a temporary 
variable used to cache IDT_VECTORING_INFO.  Ignore my remark.

>> I didn't mean register independent helper; one function for cr0 and one
>> function for cr4 so the reader can just see the name and pretend to
>> understand what it does, instead of seeing a bunch of incomprehensible
>> bitwise operations.
> Ok, done:
>
> /*
>   * On a nested exit from L2 to L1, vmcs12.guest_cr0 might not be up-to-date
>   * because L2 may have changed some cr0 bits directly (see CRO_GUEST_HOST_MASK)
>   * without L0 trapping the change and updating vmcs12.
>   * This function returns the value we should put in vmcs12.guest_cr0. It's not
>   * enough to just return the current (vmcs02) GUEST_CR0. This may not be the
>   * guest cr0 that L1 thought it was giving its L2 guest - it is possible that
>   * L1 wished to allow its guest to set a cr0 bit directly, but we (L0) asked
>   * to trap this change and instead set just the read shadow. If this is the
>   * case, we need to copy these read-shadow bits back to vmcs12.guest_cr0, where
>   * L1 believes they already are.
>   */
> static inline unsigned long
> vmcs12_guest_cr0(struct kvm_vcpu *vcpu, struct vmcs_fields *vmcs12)

newline...

> {
> 	unsigned long guest_cr0_bits =
> 		vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask;
> 	return (vmcs_readl(GUEST_CR0)&  guest_cr0_bits) |
> 		(vmcs_readl(CR0_READ_SHADOW)&  ~guest_cr0_bits);
> }
>
> and the call becomes just:
>
> 	vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
>
> which is easy to glance over (but doesn't say much about what it is doing).

It's a little easier to digest, at least.

-- 
error compiling committee.c: too many arguments to function


  reply	other threads:[~2010-09-12 17:21 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-13 12:22 [PATCH 0/24] Nested VMX, v5 Nadav Har'El
2010-06-13 12:23 ` [PATCH 1/24] Move nested option from svm.c to x86.c Nadav Har'El
2010-06-14  8:11   ` Avi Kivity
2010-06-15 14:27     ` Nadav Har'El
2010-06-13 12:23 ` [PATCH 2/24] Add VMX and SVM to list of supported cpuid features Nadav Har'El
2010-06-14  8:13   ` Avi Kivity
2010-06-15 14:31     ` Nadav Har'El
2010-06-13 12:24 ` [PATCH 3/24] Implement VMXON and VMXOFF Nadav Har'El
2010-06-14  8:21   ` Avi Kivity
2010-06-16 11:14     ` Nadav Har'El
2010-06-16 11:26       ` Avi Kivity
2010-06-15 20:18   ` Marcelo Tosatti
2010-06-16  7:50     ` Nadav Har'El
2010-06-13 12:24 ` [PATCH 4/24] Allow setting the VMXE bit in CR4 Nadav Har'El
2010-06-15 11:09   ` Gleb Natapov
2010-06-15 14:44     ` Nadav Har'El
2010-06-13 12:25 ` [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1 Nadav Har'El
2010-06-14  8:33   ` Avi Kivity
2010-06-14  8:49     ` Nadav Har'El
2010-06-14 12:35       ` Avi Kivity
2010-06-16 12:24     ` Nadav Har'El
2010-06-16 13:10       ` Avi Kivity
2010-06-22 14:54     ` Nadav Har'El
2010-06-22 16:53       ` Nadav Har'El
2010-06-23  8:07         ` Avi Kivity
2010-08-08 15:09           ` Nadav Har'El
2010-08-10  3:24             ` Avi Kivity
2010-06-23  7:57       ` Avi Kivity
2010-06-23  9:15         ` Alexander Graf
2010-06-23  9:24           ` Avi Kivity
2010-06-23 12:07         ` Nadav Har'El
2010-06-23 12:13           ` Avi Kivity
2010-06-13 12:25 ` [PATCH 6/24] Implement reading and writing of VMX MSRs Nadav Har'El
2010-06-14  8:42   ` Avi Kivity
2010-06-23  8:13     ` Nadav Har'El
2010-06-23  8:24       ` Avi Kivity
2010-06-13 12:26 ` [PATCH 7/24] Understanding guest pointers to vmcs12 structures Nadav Har'El
2010-06-14  8:48   ` Avi Kivity
2010-08-02 12:25     ` Nadav Har'El
2010-08-02 13:38       ` Avi Kivity
2010-06-15 12:14   ` Gleb Natapov
2010-08-01 15:16     ` Nadav Har'El
2010-08-01 15:25       ` Gleb Natapov
2010-08-02  8:57         ` Nadav Har'El
2010-06-13 12:26 ` [PATCH 8/24] Hold a vmcs02 for each vmcs12 Nadav Har'El
2010-06-14  8:57   ` Avi Kivity
2010-07-06  9:50   ` Dong, Eddie
2010-08-02 13:38     ` Nadav Har'El
2010-06-13 12:27 ` [PATCH 9/24] Implement VMCLEAR Nadav Har'El
2010-06-14  9:03   ` Avi Kivity
2010-06-15 13:47   ` Gleb Natapov
2010-06-15 13:50     ` Avi Kivity
2010-06-15 13:54       ` Gleb Natapov
2010-08-05 11:50         ` Nadav Har'El
2010-08-05 11:53           ` Gleb Natapov
2010-08-05 12:01             ` Nadav Har'El
2010-08-05 12:05               ` Avi Kivity
2010-08-05 12:10                 ` Nadav Har'El
2010-08-05 12:13                   ` Avi Kivity
2010-08-05 12:29                     ` Nadav Har'El
2010-08-05 12:03           ` Avi Kivity
2010-07-06  2:56   ` Dong, Eddie
2010-08-03 12:12     ` Nadav Har'El
2010-06-13 12:27 ` [PATCH 10/24] Implement VMPTRLD Nadav Har'El
2010-06-14  9:07   ` Avi Kivity
2010-08-05 11:13     ` Nadav Har'El
2010-06-16 13:36   ` Gleb Natapov
2010-07-06  3:09   ` Dong, Eddie
2010-08-05 11:35     ` Nadav Har'El
2010-06-13 12:28 ` [PATCH 11/24] Implement VMPTRST Nadav Har'El
2010-06-14  9:15   ` Avi Kivity
2010-06-16 13:53     ` Gleb Natapov
2010-06-16 15:33       ` Nadav Har'El
2010-06-13 12:28 ` [PATCH 12/24] Add VMCS fields to the vmcs12 Nadav Har'El
2010-06-14  9:24   ` Avi Kivity
2010-06-16 14:18   ` Gleb Natapov
2010-06-13 12:29 ` [PATCH 13/24] Implement VMREAD and VMWRITE Nadav Har'El
2010-06-14  9:36   ` Avi Kivity
2010-06-16 14:48     ` Gleb Natapov
2010-08-04 13:42       ` Nadav Har'El
2010-08-04 16:09     ` Nadav Har'El
2010-08-04 16:41       ` Avi Kivity
2010-06-16 15:03   ` Gleb Natapov
2010-08-04 11:46     ` Nadav Har'El
2010-06-13 12:29 ` [PATCH 14/24] Prepare vmcs02 from vmcs01 and vmcs12 Nadav Har'El
2010-06-14 11:11   ` Avi Kivity
2010-06-17  8:50   ` Gleb Natapov
2010-07-06  6:25   ` Dong, Eddie
2010-06-13 12:30 ` [PATCH 15/24] Move register-syncing to a function Nadav Har'El
2010-06-13 12:30 ` [PATCH 16/24] Implement VMLAUNCH and VMRESUME Nadav Har'El
2010-06-14 11:41   ` Avi Kivity
2010-09-26 11:14     ` Nadav Har'El
2010-09-26 12:56       ` Avi Kivity
2010-09-26 13:06         ` Nadav Har'El
2010-09-26 13:51           ` Avi Kivity
2010-06-17 10:59   ` Gleb Natapov
2010-09-16 16:06     ` Nadav Har'El
2010-06-13 12:31 ` [PATCH 17/24] No need for handle_vmx_insn function any more Nadav Har'El
2010-06-13 12:31 ` [PATCH 18/24] Exiting from L2 to L1 Nadav Har'El
2010-06-14 12:04   ` Avi Kivity
2010-09-12 14:05     ` Nadav Har'El
2010-09-12 14:29       ` Avi Kivity
2010-09-12 17:05         ` Nadav Har'El
2010-09-12 17:21           ` Avi Kivity [this message]
2010-09-12 19:51             ` Nadav Har'El
2010-09-13  8:48               ` Avi Kivity
2010-09-13  5:53             ` Sheng Yang
2010-09-13  8:52               ` Avi Kivity
2010-09-13  9:01                 ` Nadav Har'El
2010-09-13  9:34                   ` Avi Kivity
2010-09-14 13:07     ` Nadav Har'El
2010-06-13 12:32 ` [PATCH 19/24] Deciding if L0 or L1 should handle an L2 exit Nadav Har'El
2010-06-14 12:24   ` Avi Kivity
2010-09-16 14:42     ` Nadav Har'El
2010-06-13 12:32 ` [PATCH 20/24] Correct handling of interrupt injection Nadav Har'El
2010-06-14 12:29   ` Avi Kivity
2010-06-14 12:48     ` Avi Kivity
2010-09-16 15:25     ` Nadav Har'El
2010-06-13 12:33 ` [PATCH 21/24] Correct handling of exception injection Nadav Har'El
2010-06-13 12:33 ` [PATCH 22/24] Correct handling of idt vectoring info Nadav Har'El
2010-06-17 11:58   ` Gleb Natapov
2010-09-20  6:37     ` Nadav Har'El
2010-09-20  9:34       ` Gleb Natapov
2010-09-20 10:03         ` Nadav Har'El
2010-09-20 10:11           ` Avi Kivity
2010-09-22 23:15             ` Nadav Har'El
2010-09-26 15:14               ` Avi Kivity
2010-09-26 15:18                 ` Gleb Natapov
2010-09-20 10:20           ` Gleb Natapov
2010-06-13 12:34 ` [PATCH 23/24] Handling of CR0.TS and #NM for Lazy FPU loading Nadav Har'El
2010-06-13 12:34 ` [PATCH 24/24] Miscellenous small corrections Nadav Har'El
2010-06-14 12:34 ` [PATCH 0/24] Nested VMX, v5 Avi Kivity
2010-06-14 13:03   ` Nadav Har'El
2010-06-15 10:00     ` Avi Kivity
2010-10-17 12:03       ` Nadav Har'El
2010-10-17 12:10         ` Avi Kivity
2010-10-17 12:39           ` Nadav Har'El
2010-10-17 13:35             ` Avi Kivity
2010-07-09  8:59 ` Dong, Eddie
2010-07-11  8:27   ` Nadav Har'El
2010-07-11 11:05     ` Alexander Graf
2010-07-11 12:49       ` Nadav Har'El
2010-07-11 13:12         ` Avi Kivity
2010-07-11 15:39           ` Nadav Har'El
2010-07-11 15:45             ` Avi Kivity
2010-07-11 13:20     ` Avi Kivity
2010-07-15  3:27 ` Sheng Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C8D0C19.4050003@redhat.com \
    --to=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=nyh@math.technion.ac.il \
    --cc=sheng@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).