From: Tang Chen <tangchen@cn.fujitsu.com>
To: Gleb Natapov <gleb@kernel.org>
Cc: <mtosatti@redhat.com>, <nadav.amit@gmail.com>,
<kvm@vger.kernel.org>, <laijs@cn.fujitsu.com>,
<isimatu.yasuaki@jp.fujitsu.com>, <guz.fnst@cn.fujitsu.com>,
<linux-kernel@vger.kernel.org>,
Tang Chen <tangchen@cn.fujitsu.com>
Subject: Re: [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory.
Date: Mon, 14 Jul 2014 15:57:09 +0800 [thread overview]
Message-ID: <53C38D55.2040307@cn.fujitsu.com> (raw)
In-Reply-To: <20140712080442.GH4399@minantech.com>
Hi Gleb,
Thanks for the reply. Please see below.
On 07/12/2014 04:04 PM, Gleb Natapov wrote:
> On Tue, Jul 08, 2014 at 09:01:32PM +0800, Tang Chen wrote:
>> apic access page is pinned in memory. As a result, it cannot be migrated/hot-removed.
>> Actually, it is not necessary to be pinned.
>>
>> The hpa of apic access page is stored in VMCS APIC_ACCESS_ADDR pointer. When
>> the page is migrated, kvm_mmu_notifier_invalidate_page() will invalidate the
>> corresponding ept entry. This patch introduces a new vcpu request named
>> KVM_REQ_APIC_PAGE_RELOAD, and makes this request to all the vcpus at this time,
>> and force all the vcpus exit guest, and re-enter guest till they updates the VMCS
>> APIC_ACCESS_ADDR pointer to the new apic access page address, and updates
>> kvm->arch.apic_access_page to the new page.
>>
> By default kvm Linux guest uses x2apic, so APIC_ACCESS_ADDR mechanism
> is not used since no MMIO access to APIC is ever done. Have you tested
> this with "-cpu modelname,-x2apic" qemu flag?
I used the following commandline to test the patches.
# /usr/libexec/qemu-kvm -m 512M -hda /home/tangchen/xxx.img -enable-kvm
-smp 2
And I think the guest used APIC_ACCESS_ADDR mechanism because the previous
patch-set has some problem which will happen when the apic page is accessed.
And it did happen.
I'll test this patch-set with "-cpu modelname,-x2apic" flag.
>
>> Signed-off-by: Tang Chen<tangchen@cn.fujitsu.com>
>> ---
>> arch/x86/include/asm/kvm_host.h | 1 +
>> arch/x86/kvm/mmu.c | 11 +++++++++++
>> arch/x86/kvm/svm.c | 6 ++++++
>> arch/x86/kvm/vmx.c | 8 +++++++-
>> arch/x86/kvm/x86.c | 14 ++++++++++++++
>> include/linux/kvm_host.h | 2 ++
>> virt/kvm/kvm_main.c | 12 ++++++++++++
>> 7 files changed, 53 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 62f973e..9ce6bfd 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -737,6 +737,7 @@ struct kvm_x86_ops {
>> void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>> void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>> void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>> + void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
>> void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>> void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>> int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 9314678..551693d 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3427,6 +3427,17 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>> level, gfn, pfn, prefault);
>> spin_unlock(&vcpu->kvm->mmu_lock);
>>
>> + /*
>> + * apic access page could be migrated. When the guest tries to access
>> + * the apic access page, ept violation will occur, and we can use GUP
>> + * to find the new page.
>> + *
>> + * GUP will wait till the migrate entry be replaced with the new page.
>> + */
>> + if (gpa == APIC_DEFAULT_PHYS_BASE)
>> + vcpu->kvm->arch.apic_access_page = gfn_to_page_no_pin(vcpu->kvm,
>> + APIC_DEFAULT_PHYS_BASE>> PAGE_SHIFT);
> Shouldn't you make KVM_REQ_APIC_PAGE_RELOAD request here?
I don't think we need to make KVM_REQ_APIC_PAGE_RELOAD request here.
In kvm_mmu_notifier_invalidate_page() I made the request. And the
handler called
gfn_to_page_no_pin() to get the new page, which will wait till the
migration
finished. And then updated the VMCS APIC_ACCESS_ADDR pointer. So, when
the vcpus
were forced to exit the guest mode, they would wait till the VMCS
APIC_ACCESS_ADDR
pointer was updated.
As a result, we don't need to make the request here.
>
>> +
>> return r;
>>
>> out_unlock:
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 576b525..dc76f29 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -3612,6 +3612,11 @@ static void svm_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>> return;
>> }
>>
>> +static void svm_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> + return;
>> +}
>> +
>> static int svm_vm_has_apicv(struct kvm *kvm)
>> {
>> return 0;
>> @@ -4365,6 +4370,7 @@ static struct kvm_x86_ops svm_x86_ops = {
>> .enable_irq_window = enable_irq_window,
>> .update_cr8_intercept = update_cr8_intercept,
>> .set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
>> + .set_apic_access_page_addr = svm_set_apic_access_page_addr,
>> .vm_has_apicv = svm_vm_has_apicv,
>> .load_eoi_exitmap = svm_load_eoi_exitmap,
>> .hwapic_isr_update = svm_hwapic_isr_update,
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 5532ac8..f7c6313 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3992,7 +3992,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
>> if (r)
>> goto out;
>>
>> - page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE>> PAGE_SHIFT);
>> + page = gfn_to_page_no_pin(kvm, APIC_DEFAULT_PHYS_BASE>> PAGE_SHIFT);
>> if (is_error_page(page)) {
>> r = -EFAULT;
>> goto out;
>> @@ -7073,6 +7073,11 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
>> vmx_set_msr_bitmap(vcpu);
>> }
>>
>> +static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
>> +{
>> + vmcs_write64(APIC_ACCESS_ADDR, hpa);
>> +}
>> +
>> static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
>> {
>> u16 status;
>> @@ -8842,6 +8847,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
>> .enable_irq_window = enable_irq_window,
>> .update_cr8_intercept = update_cr8_intercept,
>> .set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
>> + .set_apic_access_page_addr = vmx_set_apic_access_page_addr,
>> .vm_has_apicv = vmx_vm_has_apicv,
>> .load_eoi_exitmap = vmx_load_eoi_exitmap,
>> .hwapic_irr_update = vmx_hwapic_irr_update,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index ffbe557..7080eda 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5929,6 +5929,18 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>> kvm_apic_update_tmr(vcpu, tmr);
>> }
>>
>> +static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
>> +{
>> + /*
>> + * When the page is being migrated, GUP will wait till the migrate
>> + * entry is replaced with the new pte entry pointing to the new page.
>> + */
>> + struct page *page = gfn_to_page_no_pin(vcpu->kvm,
>> + APIC_DEFAULT_PHYS_BASE>> PAGE_SHIFT);
> If you do not use kvm->arch.apic_access_page to get current address why not drop it entirely?
>
I should also update kvm->arch.apic_access_page here. It is used in
other places
in kvm, so I don't think we should drop it. Will update the patch.
>> + kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
>> + page_to_phys(page));
>> +}
>> +
>> /*
>> * Returns 1 to let __vcpu_run() continue the guest execution loop without
>> * exiting to the userspace. Otherwise, the value will be returned to the
>> @@ -5989,6 +6001,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>> kvm_deliver_pmi(vcpu);
>> if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>> vcpu_scan_ioapic(vcpu);
>> + if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
>> + vcpu_reload_apic_access_page(vcpu);
>> }
>>
>> if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 7c58d9d..f49be86 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -136,6 +136,7 @@ static inline bool is_error_page(struct page *page)
>> #define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
>> #define KVM_REQ_ENABLE_IBS 23
>> #define KVM_REQ_DISABLE_IBS 24
>> +#define KVM_REQ_APIC_PAGE_RELOAD 25
>>
>> #define KVM_USERSPACE_IRQ_SOURCE_ID 0
>> #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
>> @@ -596,6 +597,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm);
>> void kvm_reload_remote_mmus(struct kvm *kvm);
>> void kvm_make_mclock_inprogress_request(struct kvm *kvm);
>> void kvm_make_scan_ioapic_request(struct kvm *kvm);
>> +void kvm_reload_apic_access_page(struct kvm *kvm);
>>
>> long kvm_arch_dev_ioctl(struct file *filp,
>> unsigned int ioctl, unsigned long arg);
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 6091849..965b702 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -210,6 +210,11 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
>> make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
>> }
>>
>> +void kvm_reload_apic_access_page(struct kvm *kvm)
>> +{
>> + make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
>> +}
>> +
>> int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>> {
>> struct page *page;
>> @@ -294,6 +299,13 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>> if (need_tlb_flush)
>> kvm_flush_remote_tlbs(kvm);
>>
>> + /*
>> + * The physical address of apic access page is stroed in VMCS.
>> + * So need to update it when it becomes invalid.
>> + */
>> + if (address == gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE>> PAGE_SHIFT))
>> + kvm_reload_apic_access_page(kvm);
>> +
>> spin_unlock(&kvm->mmu_lock);
>> srcu_read_unlock(&kvm->srcu, idx);
>> }
>
> You forgot to drop put_page(kvm->arch.apic_access_page); from x86.c again.
>
Yes...will update the patch.
Thanks.
next prev parent reply other threads:[~2014-07-14 7:56 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-08 13:01 [PATCH v2 0/5] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-08 13:01 ` [PATCH v2 1/5] kvm: Add gfn_to_page_no_pin() to translate gfn to page without pinning Tang Chen
2014-07-08 13:01 ` [PATCH v2 2/5] kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address Tang Chen
2014-07-08 13:01 ` [PATCH v2 3/5] kvm, mem-hotplug: Do not pin ept identity pagetable in memory Tang Chen
2014-07-08 13:01 ` [PATCH v2 4/5] kvm: Remove ept_identity_pagetable from struct kvm_arch Tang Chen
2014-07-09 2:06 ` Tang Chen
2014-07-09 2:08 ` [RESEND PATCH " Tang Chen
2014-07-12 7:44 ` Gleb Natapov
2014-07-14 9:17 ` Tang Chen
2014-07-14 14:27 ` Gleb Natapov
2014-07-15 10:39 ` Tang Chen
2014-07-08 13:01 ` [PATCH v2 5/5] kvm, mem-hotplug: Do not pin apic access page in memory Tang Chen
2014-07-12 8:04 ` Gleb Natapov
2014-07-14 7:57 ` Tang Chen [this message]
2014-07-14 14:58 ` Gleb Natapov
2014-07-15 11:52 ` Jan Kiszka
2014-07-15 12:09 ` Gleb Natapov
2014-07-15 12:28 ` Tang Chen
2014-07-15 12:40 ` Gleb Natapov
2014-07-15 12:54 ` Tang Chen
2014-07-15 14:40 ` Gleb Natapov
2014-07-17 9:22 ` Tang Chen
2014-07-15 13:10 ` Jan Kiszka
2014-07-15 14:04 ` Gleb Natapov
2014-07-17 13:34 ` Tang Chen
2014-07-17 13:57 ` Gleb Natapov
2014-07-18 9:05 ` Tang Chen
2014-07-18 11:21 ` Gleb Natapov
2014-07-15 12:11 ` Tang Chen
2014-07-09 1:20 ` [PATCH v2 0/5] kvm, mem-hotplug: Do not pin ept identity pagetable and apic access page Tang Chen
2014-07-11 6:23 ` Tang Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53C38D55.2040307@cn.fujitsu.com \
--to=tangchen@cn.fujitsu.com \
--cc=gleb@kernel.org \
--cc=guz.fnst@cn.fujitsu.com \
--cc=isimatu.yasuaki@jp.fujitsu.com \
--cc=kvm@vger.kernel.org \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mtosatti@redhat.com \
--cc=nadav.amit@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox