Re: [PATCH 6/6] KVM: VMX: Add PML support in VMX

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Kai Huang <kai.huang@linux.intel.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: pbonzini@redhat.com, gleb@kernel.org, linux@arm.linux.org.uk,
	kvm@vger.kernel.org
Subject: Re: [PATCH 6/6] KVM: VMX: Add PML support in VMX
Date: Thu, 05 Feb 2015 14:23:25 +0800	[thread overview]
Message-ID: <54D30C5D.1000402@linux.intel.com> (raw)
In-Reply-To: <20150203151833.GD19731@potion.redhat.com>


On 02/03/2015 11:18 PM, Radim Krčmář wrote:
> 2015-01-28 10:54+0800, Kai Huang:
>> This patch adds PML support in VMX. A new module parameter 'enable_pml' is added
> (+module_param_named(pml, enable_pml, bool, S_IRUGO);)
>
>> to allow user to enable/disable it manually.
>>
>> Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
>> ---
>> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
>> index 587149b..a139977 100644
>> --- a/arch/x86/kvm/trace.h
>> +++ b/arch/x86/kvm/trace.h
>> @@ -846,6 +846,24 @@ TRACE_EVENT(kvm_track_tsc,
>> +TRACE_EVENT(kvm_pml_full,
>> +	TP_PROTO(unsigned int vcpu_id),
>> +	TP_ARGS(vcpu_id),
>> +
>> +	TP_STRUCT__entry(
>> +		__field(	unsigned int,	vcpu_id			)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__entry->vcpu_id		= vcpu_id;
>> +	),
>> +
>> +	TP_printk("vcpu %d: PML full", __entry->vcpu_id)
>> +);
>> +
>>   #endif /* CONFIG_X86_64 */
> You have it protected by CONFIG_X86_64, but use it unconditionally.
Thanks for catching. This has been fixed by another patch, and the fix 
has also been merged by Paolo.

Thanks,
-Kai
>
> (Also, we can get all this information from kvm_exit tracepoint;
>   I'd just drop it.)
>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c987374..de5ce82 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -516,6 +519,10 @@ struct vcpu_vmx {
>> +	/* Support for PML */
>> +#define PML_ENTITY_NUM		512
> (Readers of this struct likely aren't interested in this number, so it
>   would be nicer to define it outside.  I thought it is here to hint the
>   amount of allocated pages, but PML_ENTITY_NUM / 512 isn't obvious.)
>
>> +	struct page *pml_pg;
>> @@ -4355,6 +4368,9 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
>>   	   a current VMCS12
>>   	*/
>>   	exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
>> +	/* PML is enabled/disabled in creating/destorying vcpu */
>> +	exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
> What is the harm of enabling it here?
>
> (SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES seems similar and does it.)
Because the PML feature detection is unconditional (meaning 
SECONDARY_EXEC_ENABLE_PML is always in 
vmcs_config.cpu_based_2nd_exec_ctrl), but the PML buffer is only created 
when vcpu is created, and it is controlled by 'enable_pml' module 
parameter, if we always enable SECONDARY_EXEC_ENABLE_PML here, no PML 
buffer will be created if PML is disabled by 'enable_pml' parameter, so 
it's better to enable it along with creating PML buffer.

>
>> +
>>   	return exec_control;
>>   }
>> @@ -6971,6 +7001,31 @@ static bool vmx_test_pir(struct kvm_vcpu *vcpu, int vector)
>> +static int handle_pml_full(struct kvm_vcpu *vcpu)
>> +{
>> +	unsigned long exit_qualification;
>> +
>> +	trace_kvm_pml_full(vcpu->vcpu_id);
>> +
>> +	exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
>> +
>> +	/*
>> +	 * PML buffer FULL happened while executing iret from NMI,
>> +	 * "blocked by NMI" bit has to be set before next VM entry.
>> +	 */
>> +	if (!(to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
>> +			cpu_has_virtual_nmis() &&
>> +			(exit_qualification & INTR_INFO_UNBLOCK_NMI))
>> +		vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
>> +				GUEST_INTR_STATE_NMI);
> Relevant part of the specification is pasted from 27.2.2 (Information
> for VM Exits Due to Vectored Events), which makes me think that this bit
> is mirrored to IDT-vectoring information field ...
>
> Isn't vmx_recover_nmi_blocking() enough?
>
> (I see the same code in handle_ept_violation(), but wasn't that needed
>   just because of a hardware error?)
This needs to be handled in both EPT violation and PML. If you look at 
the PML specification (the link is in cover letter), you can see the 
definition of bit 12 of exit_qualification (section 1.5), which explains 
above code. The same definition of exit_qualification is in EPT 
violation part in Intel's SDM.

>> +
>> +	/*
>> +	 * PML buffer already flushed at beginning of VMEXIT. Nothing to do
>> +	 * here.., and there's no userspace involvement needed for PML.
>> +	 */
>> +	return 1;
>> +}
>> @@ -7325,6 +7381,89 @@ static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2)
>> +static int vmx_enable_pml(struct vcpu_vmx *vmx)
>> +{
>> +	struct page *pml_pg;
>> +	u32 exec_control;
>> +
>> +	pml_pg = alloc_page(GFP_KERNEL | __GFP_ZERO);
>> +	if (!pml_pg)
>> +		return -ENOMEM;
>> +
>> +	vmx->pml_pg = pml_pg;
> (It's safe to use vmx->pml_pg directly.)
>
>> +
>> +	vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
>> +	vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
>> +
>> +	exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
>> +	exec_control |= SECONDARY_EXEC_ENABLE_PML;
>> +	vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
>> +
>> +	return 0;
>> +}
>> +static void vmx_flush_pml_buffer(struct vcpu_vmx *vmx)
>> +{
>> +	struct kvm *kvm = vmx->vcpu.kvm;
>> +	u64 *pml_buf;
>> +	u16 pml_idx;
>> +
>> +	pml_idx = vmcs_read16(GUEST_PML_INDEX);
>> +
>> +	/* Do nothing if PML buffer is empty */
>> +	if (pml_idx == (PML_ENTITY_NUM - 1))
>> +		return;
>> +
>> +	/* PML index always points to next available PML buffer entity */
>> +	if (pml_idx >= PML_ENTITY_NUM)
>> +		pml_idx = 0;
>> +	else
>> +		pml_idx++;
> If the pml_idx is >= PML_ENTITY_NUM and < 0xffff, the log is empty,
In this case, the log is full, actually. PML index is set to 
PML_ENTITY_NUM - 1 initially, and hardware decrease it after GPA is logged.

Below is the pseudocode of PML hardware logic.

IF (PML Index[15:9] ≠ 0)
     THEN VM exit;
FI;
set accessed and dirty flags for EPT;
IF (a dirty flag was updated from 0 to 1)
THEN
     PML address[PML index] ← 4-KByte-aligned guest-physical address;
     PML index is decremented;
FI;

Thanks,
-Kai
> so it would be better to use just 'pml_idx++' and unsigned modulo.
>
> (Could also 'assert(pml_idx < PML_ENTITY_NUM)' then.)
>
>> +
>> +	pml_buf = page_address(vmx->pml_pg);
>> +	for (; pml_idx < PML_ENTITY_NUM; pml_idx++) {
>> +		u64 gpa;
>> +
>> +		gpa = pml_buf[pml_idx];
>> +		WARN_ON(gpa & (PAGE_SIZE - 1));
>> +		mark_page_dirty(kvm, gpa >> PAGE_SHIFT);
>> +	}
>> +
>> +	/* reset PML index */
>> +	vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
>> +}
>> @@ -9492,6 +9655,31 @@ static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
>> +static void vmx_slot_enable_log_dirty(struct kvm *kvm,
>> +				     struct kvm_memory_slot *slot)
>> +{
>> +	kvm_mmu_slot_leaf_clear_dirty(kvm, slot);
> (New slot contains dirty pages?)
>
>> +	kvm_mmu_slot_largepage_remove_write_access(kvm, slot);
>> +}
> Thanks.

next prev parent reply	other threads:[~2015-02-05  6:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-28  2:54 [PATCH 0/6] KVM: VMX: Page Modification Logging (PML) support Kai Huang
2015-01-28  2:54 ` [PATCH 1/6] KVM: Rename kvm_arch_mmu_write_protect_pt_masked to be more generic for log dirty Kai Huang
2015-01-28  2:54 ` [PATCH 2/6] KVM: MMU: Add mmu help functions to support PML Kai Huang
2015-02-03 17:34   ` Radim Krčmář
2015-02-05  5:59     ` Kai Huang
2015-02-05 14:51       ` Radim Krčmář
2015-01-28  2:54 ` [PATCH 3/6] KVM: MMU: Explicitly set D-bit for writable spte Kai Huang
2015-01-28  2:54 ` [PATCH 4/6] KVM: x86: Change parameter of kvm_mmu_slot_remove_write_access Kai Huang
2015-02-03 16:28   ` Radim Krčmář
2015-01-28  2:54 ` [PATCH 5/6] KVM: x86: Add new dirty logging kvm_x86_ops for PML Kai Huang
2015-02-03 15:53   ` Radim Krčmář
2015-02-05  6:29     ` Kai Huang
2015-02-05 14:52       ` Radim Krčmář
2015-01-28  2:54 ` [PATCH 6/6] KVM: VMX: Add PML support in VMX Kai Huang
2015-02-03 15:18   ` Radim Krčmář
2015-02-03 15:39     ` Paolo Bonzini
2015-02-03 16:02       ` Radim Krčmář
2015-02-05  6:23     ` Kai Huang [this message]
2015-02-05 15:04       ` Radim Krčmář
2015-02-06  0:22         ` Kai Huang
2015-02-06  0:28         ` Kai Huang
2015-02-06 16:00       ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54D30C5D.1000402@linux.intel.com \
    --to=kai.huang@linux.intel.com \
    --cc=gleb@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=pbonzini@redhat.com \
    --cc=rkrcmar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).