linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Gleb Natapov <gleb@redhat.com>
Cc: kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, avi@redhat.com, mingo@elte.hu,
	a.p.zijlstra@chello.nl, tglx@linutronix.de, hpa@zytor.com,
	riel@redhat.com, cl@linux-foundation.org
Subject: Re: [PATCH v7 00/12] KVM: Add host swap event notifications for PV guest
Date: Mon, 18 Oct 2010 13:34:03 -0200	[thread overview]
Message-ID: <20101018153402.GA2067@amt.cnet> (raw)
In-Reply-To: <1287048176-2563-1-git-send-email-gleb@redhat.com>

On Thu, Oct 14, 2010 at 11:22:44AM +0200, Gleb Natapov wrote:
> KVM virtualizes guest memory by means of shadow pages or HW assistance
> like NPT/EPT. Not all memory used by a guest is mapped into the guest
> address space or even present in a host memory at any given time.
> When vcpu tries to access memory page that is not mapped into the guest
> address space KVM is notified about it. KVM maps the page into the guest
> address space and resumes vcpu execution. If the page is swapped out from
> the host memory vcpu execution is suspended till the page is swapped
> into the memory again. This is inefficient since vcpu can do other work
> (run other task or serve interrupts) while page gets swapped in.
> 
> The patch series tries to mitigate this problem by introducing two
> mechanisms. The first one is used with non-PV guest and it works like
> this: when vcpu tries to access swapped out page it is halted and
> requested page is swapped in by another thread. That way vcpu can still
> process interrupts while io is happening in parallel and, with any luck,
> interrupt will cause the guest to schedule another task on the vcpu, so
> it will have work to do instead of waiting for the page to be swapped in.
> 
> The second mechanism introduces PV notification about swapped page state to
> a guest (asynchronous page fault). Instead of halting vcpu upon access to
> swapped out page and hoping that some interrupt will cause reschedule we
> immediately inject asynchronous page fault to the vcpu.  PV aware guest
> knows that upon receiving such exception it should schedule another task
> to run on the vcpu. Current task is put to sleep until another kind of
> asynchronous page fault is received that notifies the guest that page
> is now in the host memory, so task that waits for it can run again.
> 
> To measure performance benefits I use a simple benchmark program (below)
> that starts number of threads. Some of them do work (increment counter),
> others access huge array in random location trying to generate host page
> faults. The size of the array is smaller then guest memory bug bigger
> then host memory so we are guarantied that host will swap out part of
> the array.
> 
> I ran the benchmark on three setups: with current kvm.git (master),
> with my patch series + non-pv guest (nonpv) and with my patch series +
> pv guest (pv).
> 
> Each guest had 4 cpus and 2G memory and was launched inside 512M memory
> container. The command line was "./bm -f 4 -w 4 -t 60" (run 4 faulting
> threads and 4 working threads for a minute).
> 
> Below is the total amount of "work" each guest managed to do
> (average of 10 runs):
>          total work    std error
> master: 122789420615 (3818565029)
> nonpv:  138455939001 (773774299)
> pv:     234351846135 (10461117116)
> 
> Changes:
>  v1->v2
>    Use MSR instead of hypercall.
>    Move most of the code into arch independent place.
>    halt inside a guest instead of doing "wait for page" hypercall if
>     preemption is disabled.
>  v2->v3
>    Use MSR from range 0x4b564dxx.
>    Add slot version tracking.
>    Support migration by restarting all guest processes after migration.
>    Drop patch that tract preemptability for non-preemptable kernels
>     due to performance concerns. Send async PF to non-preemptable
>     guests only when vcpu is executing userspace code.
>  v3->v4
>   Provide alternative page fault handler in PV guest instead of adding hook to
>    standard page fault handler and patch it out on non-PV guests.
>   Allow only limited number of outstanding async page fault per vcpu.
>   Unify  gfn_to_pfn and gfn_to_pfn_async code.
>   Cancel outstanding slow work on reset.
>  v4->v5
>   Move async pv cpu initialization into cpu hotplug notifier.
>   Use GFP_NOWAIT instead of GFP_ATOMIC for allocation that shouldn't sleep
>   Process KVM_REQ_MMU_SYNC even in page_fault_other_cr3() before changing
>    cr3 back
>  v5->v6
>   To many. Will list only major changes here.
>   Replace slow work with work queues.
>   Halt vcpu for non-pv guests.
>   Handle async PF in nested SVM mode.
>   Do not prefault swapped in page for non tdp case.
>  v6->v7
>   Fix "GUP fail in work thread" problem
>   Do prefault only if mmu is in direct map mode
>   Use cpu->request to ask for vcpu halt (drop optimization that tried to
>    skip non-present apf injection if page is swapped in before next vmentry)
>   Keep track of synthetic halt in separate state to prevent it from leaking
>    during migration.
>   Fix memslot tracking problems.
>   More documentation.
>   Other small comments are addressed
> 
> Gleb Natapov (12):
>   Add get_user_pages() variant that fails if major fault is required.
>   Halt vcpu if page it tries to access is swapped out.
>   Retry fault before vmentry
>   Add memory slot versioning and use it to provide fast guest write interface
>   Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c.
>   Add PV MSR to enable asynchronous page faults delivery.
>   Add async PF initialization to PV guest.
>   Handle async PF in a guest.
>   Inject asynchronous page fault into a PV guest if page is swapped out.
>   Handle async PF in non preemptable context
>   Let host know whether the guest can handle async PF in non-userspace context.
>   Send async PF when guest is not in userspace too.
> 
>  Documentation/kernel-parameters.txt |    3 +
>  Documentation/kvm/cpuid.txt         |    3 +
>  Documentation/kvm/msr.txt           |   36 ++++-
>  arch/x86/include/asm/kvm_host.h     |   28 +++-
>  arch/x86/include/asm/kvm_para.h     |   24 +++
>  arch/x86/include/asm/traps.h        |    1 +
>  arch/x86/kernel/entry_32.S          |   10 +
>  arch/x86/kernel/entry_64.S          |    3 +
>  arch/x86/kernel/kvm.c               |  315 +++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/kvmclock.c          |   13 +--
>  arch/x86/kvm/Kconfig                |    1 +
>  arch/x86/kvm/Makefile               |    1 +
>  arch/x86/kvm/mmu.c                  |   61 ++++++-
>  arch/x86/kvm/paging_tmpl.h          |    8 +-
>  arch/x86/kvm/svm.c                  |   45 ++++-
>  arch/x86/kvm/x86.c                  |  192 +++++++++++++++++++++-
>  fs/ncpfs/mmap.c                     |    2 +
>  include/linux/kvm.h                 |    1 +
>  include/linux/kvm_host.h            |   39 +++++
>  include/linux/kvm_types.h           |    7 +
>  include/linux/mm.h                  |    5 +
>  include/trace/events/kvm.h          |   95 +++++++++++
>  mm/filemap.c                        |    3 +
>  mm/memory.c                         |   31 +++-
>  mm/shmem.c                          |    8 +-
>  virt/kvm/Kconfig                    |    3 +
>  virt/kvm/async_pf.c                 |  213 +++++++++++++++++++++++
>  virt/kvm/async_pf.h                 |   36 ++++
>  virt/kvm/kvm_main.c                 |  132 ++++++++++++---
>  29 files changed, 1255 insertions(+), 64 deletions(-)
>  create mode 100644 virt/kvm/async_pf.c
>  create mode 100644 virt/kvm/async_pf.h

Applied, thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-10-18 16:09 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-14  9:22 [PATCH v7 00/12] KVM: Add host swap event notifications for PV guest Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 01/12] Add get_user_pages() variant that fails if major fault is required Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 02/12] Halt vcpu if page it tries to access is swapped out Gleb Natapov
2010-10-20 11:28   ` Jan Kiszka
2010-10-20 11:33     ` Gleb Natapov
2010-10-20 11:35       ` Jan Kiszka
2010-10-20 11:39         ` Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 03/12] Retry fault before vmentry Gleb Natapov
2010-10-17 10:33   ` Avi Kivity
2010-10-17 10:43     ` Avi Kivity
2010-10-17 16:13   ` Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 04/12] Add memory slot versioning and use it to provide fast guest write interface Gleb Natapov
2010-10-18 13:22   ` Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 05/12] Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 06/12] Add PV MSR to enable asynchronous page faults delivery Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 07/12] Add async PF initialization to PV guest Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 08/12] Handle async PF in a guest Gleb Natapov
2010-10-20 11:48   ` Jan Kiszka
2010-10-20 11:50     ` Jan Kiszka
2010-10-20 11:53     ` Peter Zijlstra
2010-10-14  9:22 ` [PATCH v7 09/12] Inject asynchronous page fault into a PV guest if page is swapped out Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 10/12] Handle async PF in non preemptable context Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 11/12] Let host know whether the guest can handle async PF in non-userspace context Gleb Natapov
2010-10-14  9:22 ` [PATCH v7 12/12] Send async PF when guest is not in userspace too Gleb Natapov
2010-10-18 15:34 ` Marcelo Tosatti [this message]
  -- strict thread matches above, loose matches on Subject: below --
2010-10-14  9:16 [PATCH v7 00/12] KVM: Add host swap event notifications for PV guest y
2010-10-14  9:21 ` Gleb Natapov
2010-10-14  9:16 y
2010-10-14  9:16 y
2010-10-14  9:16 y

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101018153402.GA2067@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=avi@redhat.com \
    --cc=cl@linux-foundation.org \
    --cc=gleb@redhat.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).