All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tang Chen <tangchen@cn.fujitsu.com>
To: Gleb Natapov <gleb@kernel.org>
Cc: <mtosatti@redhat.com>, <nadav.amit@gmail.com>,
	<kvm@vger.kernel.org>, <laijs@cn.fujitsu.com>,
	<isimatu.yasuaki@jp.fujitsu.com>, <guz.fnst@cn.fujitsu.com>,
	<linux-kernel@vger.kernel.org>,
	Tang Chen <tangchen@cn.fujitsu.com>
Subject: Re: [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory.
Date: Mon, 14 Jul 2014 15:57:09 +0800	[thread overview]
Message-ID: <53C38D55.2040307@cn.fujitsu.com> (raw)
In-Reply-To: <20140712080442.GH4399@minantech.com>

Hi Gleb,

Thanks for the reply. Please see below.

On 07/12/2014 04:04 PM, Gleb Natapov wrote:
> On Tue, Jul 08, 2014 at 09:01:32PM +0800, Tang Chen wrote:
>> apic access page is pinned in memory. As a result, it cannot be migrated/hot-removed.
>> Actually, it is not necessary to be pinned.
>>
>> The hpa of apic access page is stored in VMCS APIC_ACCESS_ADDR pointer. When
>> the page is migrated, kvm_mmu_notifier_invalidate_page() will invalidate the
>> corresponding ept entry. This patch introduces a new vcpu request named
>> KVM_REQ_APIC_PAGE_RELOAD, and makes this request to all the vcpus at this time,
>> and force all the vcpus exit guest, and re-enter guest till they updates the VMCS
>> APIC_ACCESS_ADDR pointer to the new apic access page address, and updates
>> kvm->arch.apic_access_page to the new page.
>>
> By default kvm Linux guest uses x2apic, so APIC_ACCESS_ADDR mechanism
> is not used since no MMIO access to APIC is ever done. Have you tested
> this with "-cpu modelname,-x2apic" qemu flag?

I used the following commandline to test the patches.

# /usr/libexec/qemu-kvm -m 512M -hda /home/tangchen/xxx.img -enable-kvm 
-smp 2

And I think the guest used APIC_ACCESS_ADDR mechanism because the previous
patch-set has some problem which will happen when the apic page is accessed.
And it did happen.

I'll test this patch-set with "-cpu modelname,-x2apic" flag.

>
>> Signed-off-by: Tang Chen<tangchen@cn.fujitsu.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h |  1 +
>>   arch/x86/kvm/mmu.c              | 11 +++++++++++
>>   arch/x86/kvm/svm.c              |  6 ++++++
>>   arch/x86/kvm/vmx.c              |  8 +++++++-
>>   arch/x86/kvm/x86.c              | 14 ++++++++++++++
>>   include/linux/kvm_host.h        |  2 ++
>>   virt/kvm/kvm_main.c             | 12 ++++++++++++
>>   7 files changed, 53 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 62f973e..9ce6bfd 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -737,6 +737,7 @@ struct kvm_x86_ops {
>>   	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>>   	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>>   	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>> +	void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>>   	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>>   	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>>   	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 9314678..551693d 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3427,6 +3427,17 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>>   			 level, gfn, pfn, prefault);
>>   	spin_unlock(&vcpu->kvm->mmu_lock);
>>
>> +	/*
>> +	 * apic access page could be migrated. When the guest tries to access
>> +	 * the apic access page, ept violation will occur, and we can use GUP
>> +	 * to find the new page.
>> +	 *
>> +	 * GUP will wait till the migrate entry be replaced with the new page.
>> +	 */
>> +	if (gpa == APIC_DEFAULT_PHYS_BASE)
>> +		vcpu->kvm->arch.apic_access_page = gfn_to_page_no_pin(vcpu->kvm,
>> +					APIC_DEFAULT_PHYS_BASE>>  PAGE_SHIFT);
> Shouldn't you make KVM_REQ_APIC_PAGE_RELOAD request here?

I don't think we need to make KVM_REQ_APIC_PAGE_RELOAD request here.

In kvm_mmu_notifier_invalidate_page() I made the request. And the 
handler called
gfn_to_page_no_pin() to get the new page, which will wait till the 
migration
finished. And then updated the VMCS APIC_ACCESS_ADDR pointer. So, when 
the vcpus
were forced to exit the guest mode, they would wait till the VMCS 
APIC_ACCESS_ADDR
pointer was updated.

As a result, we don't need to make the request here.


>
>> +
>>   	return r;
>>
>>   out_unlock:
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 576b525..dc76f29 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -3612,6 +3612,11 @@ static void svm_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>>   	return;
>>   }
>>
>> +static void svm_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> +	return;
>> +}
>> +
>>   static int svm_vm_has_apicv(struct kvm *kvm)
>>   {
>>   	return 0;
>> @@ -4365,6 +4370,7 @@ static struct kvm_x86_ops svm_x86_ops = {
>>   	.enable_irq_window = enable_irq_window,
>>   	.update_cr8_intercept = update_cr8_intercept,
>>   	.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
>> +	.set_apic_access_page_addr = svm_set_apic_access_page_addr,
>>   	.vm_has_apicv = svm_vm_has_apicv,
>>   	.load_eoi_exitmap = svm_load_eoi_exitmap,
>>   	.hwapic_isr_update = svm_hwapic_isr_update,
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 5532ac8..f7c6313 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3992,7 +3992,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>>   	if (r)
>>   		goto out;
>>
>> -	page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE>>  PAGE_SHIFT);
>> +	page = gfn_to_page_no_pin(kvm, APIC_DEFAULT_PHYS_BASE>>  PAGE_SHIFT);
>>   	if (is_error_page(page)) {
>>   		r = -EFAULT;
>>   		goto out;
>> @@ -7073,6 +7073,11 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>>   	vmx_set_msr_bitmap(vcpu);
>>   }
>>
>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> +	vmcs_write64(APIC_ACCESS_ADDR, hpa);
>> +}
>> +
>>   static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>>   {
>>   	u16 status;
>> @@ -8842,6 +8847,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>>   	.enable_irq_window = enable_irq_window,
>>   	.update_cr8_intercept = update_cr8_intercept,
>>   	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>> +	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
>>   	.vm_has_apicv = vmx_vm_has_apicv,
>>   	.load_eoi_exitmap = vmx_load_eoi_exitmap,
>>   	.hwapic_irr_update = vmx_hwapic_irr_update,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index ffbe557..7080eda 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5929,6 +5929,18 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>>   	kvm_apic_update_tmr(vcpu, tmr);
>>   }
>>
>> +static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
>> +{
>> +	/*
>> +	 * When the page is being migrated, GUP will wait till the migrate
>> +	 * entry is replaced with the new pte entry pointing to the new page.
>> +	 */
>> +	struct page *page = gfn_to_page_no_pin(vcpu->kvm,
>> +					APIC_DEFAULT_PHYS_BASE>>  PAGE_SHIFT);
> If you do not use kvm->arch.apic_access_page to get current address why not drop it entirely?
>

I should also update kvm->arch.apic_access_page here. It is used in 
other places
in kvm, so I don't think we should drop it. Will update the patch.

>> +	kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
>> +					       page_to_phys(page));
>> +}
>> +
>>   /*
>>    * Returns 1 to let __vcpu_run() continue the guest execution loop without
>>    * exiting to the userspace.  Otherwise, the value will be returned to the
>> @@ -5989,6 +6001,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>   			kvm_deliver_pmi(vcpu);
>>   		if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>>   			vcpu_scan_ioapic(vcpu);
>> +		if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
>> +			vcpu_reload_apic_access_page(vcpu);
>>   	}
>>
>>   	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 7c58d9d..f49be86 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -136,6 +136,7 @@ static inline bool is_error_page(struct page *page)
>>   #define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
>>   #define KVM_REQ_ENABLE_IBS        23
>>   #define KVM_REQ_DISABLE_IBS       24
>> +#define KVM_REQ_APIC_PAGE_RELOAD  25
>>
>>   #define KVM_USERSPACE_IRQ_SOURCE_ID		0
>>   #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
>> @@ -596,6 +597,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm);
>>   void kvm_reload_remote_mmus(struct kvm *kvm);
>>   void kvm_make_mclock_inprogress_request(struct kvm *kvm);
>>   void kvm_make_scan_ioapic_request(struct kvm *kvm);
>> +void kvm_reload_apic_access_page(struct kvm *kvm);
>>
>>   long kvm_arch_dev_ioctl(struct file *filp,
>>   			unsigned int ioctl, unsigned long arg);
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 6091849..965b702 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -210,6 +210,11 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>>   	make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>>   }
>>
>> +void kvm_reload_apic_access_page(struct kvm *kvm)
>> +{
>> +	make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
>> +}
>> +
>>   int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>>   {
>>   	struct page *page;
>> @@ -294,6 +299,13 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>>   	if (need_tlb_flush)
>>   		kvm_flush_remote_tlbs(kvm);
>>
>> +	/*
>> +	 * The physical address of apic access page is stroed in VMCS.
>> +	 * So need to update it when it becomes invalid.
>> +	 */
>> +	if (address == gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE>>  PAGE_SHIFT))
>> +		kvm_reload_apic_access_page(kvm);
>> +
>>   	spin_unlock(&kvm->mmu_lock);
>>   	srcu_read_unlock(&kvm->srcu, idx);
>>   }
>
> You forgot to drop put_page(kvm->arch.apic_access_page); from x86.c again.
>

Yes...will update the patch.

Thanks.


  reply	other threads:[~2014-07-14  7:56 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-08 13:01 [PATCH v2 0/5] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-08 13:01 ` [PATCH v2 1/5] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning Tang Chen
2014-07-08 13:01 ` [PATCH v2 2/5] kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address Tang Chen
2014-07-08 13:01 ` [PATCH v2 3/5] kvm, mem-hotplug: Do not pin ept identity pagetable in memory Tang Chen
2014-07-08 13:01 ` [PATCH v2 4/5] kvm: Remove ept_identity_pagetable from struct kvm_arch Tang Chen
2014-07-09  2:06   ` Tang Chen
2014-07-09  2:08   ` [RESEND PATCH " Tang Chen
2014-07-12  7:44     ` Gleb Natapov
2014-07-14  9:17       ` Tang Chen
2014-07-14 14:27         ` Gleb Natapov
2014-07-15 10:39           ` Tang Chen
2014-07-08 13:01 ` [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory Tang Chen
2014-07-12  8:04   ` Gleb Natapov
2014-07-14  7:57     ` Tang Chen [this message]
2014-07-14 14:58       ` Gleb Natapov
2014-07-15 11:52         ` Jan Kiszka
2014-07-15 12:09           ` Gleb Natapov
2014-07-15 12:28             ` Tang Chen
2014-07-15 12:40               ` Gleb Natapov
2014-07-15 12:54                 ` Tang Chen
2014-07-15 14:40                   ` Gleb Natapov
2014-07-17  9:22                     ` Tang Chen
2014-07-15 13:10                 ` Jan Kiszka
2014-07-15 14:04                   ` Gleb Natapov
2014-07-17 13:34                 ` Tang Chen
2014-07-17 13:57                   ` Gleb Natapov
2014-07-18  9:05                     ` Tang Chen
2014-07-18 11:21                       ` Gleb Natapov
2014-07-15 12:11           ` Tang Chen
2014-07-09  1:20 ` [PATCH v2 0/5] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-11  6:23 ` Tang Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53C38D55.2040307@cn.fujitsu.com \
    --to=tangchen@cn.fujitsu.com \
    --cc=gleb@kernel.org \
    --cc=guz.fnst@cn.fujitsu.com \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=kvm@vger.kernel.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=nadav.amit@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.