From: "Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>
To: peterz@infradead.org, mtosatti@redhat.com, avi@redhat.com
Cc: raghukt@linux.vnet.ibm.com, alex.shi@intel.com, mingo@elte.hu,
kvm@vger.kernel.org, hpa@zytor.com
Subject: [PATCH v3 0/8] KVM paravirt remote flush tlb
Date: Tue, 31 Jul 2012 16:17:23 +0530
Message-ID: <20120731104312.16662.27889.stgit@abhimanyu.in.ibm.com>
The remote TLB flush APIs busy-wait for their targets, which is fine on
bare metal. Within a guest, however, the target vcpus might have been
pre-empted or blocked. In this scenario, the initiator vcpu would end up
busy-waiting for a long time.
This was discovered in our gang scheduling test. Another way to solve
it is to para-virtualize flush_tlb_others_ipi() (which now shows up as
smp_call_function_many() after Alex Shi's TLB optimization).
This patch set implements a para-virt TLB flush that does not wait for
vcpus that are sleeping. Instead, each sleeping vcpu flushes its TLB on
guest entry. The idea was discussed here:
https://lkml.org/lkml/2012/2/20/157
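
The flow described above can be sketched in plain C11 as a userspace
model. This is only an illustration under stated assumptions, not the
code from the patches: struct vcpu_state, the VS_* bit names and the
three functions are hypothetical, and real guest/host code would share
the state word through a pinned page rather than an array.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu state word, shared between guest and host. */
#define VS_RUNNING        (1u << 0) /* vcpu is currently executing guest code */
#define VS_FLUSH_ON_ENTER (1u << 1) /* host must flush the TLB before re-entry */

struct vcpu_state {
	_Atomic unsigned int state;
};

/*
 * Guest side (assumed shape): for every other vcpu, either mark it for
 * an IPI because it is running, or set VS_FLUSH_ON_ENTER because it is
 * not, so the initiator never waits on a sleeping vcpu.
 * Returns how many vcpus still need an IPI.
 */
static size_t pv_flush_tlb_others(struct vcpu_state *vs, size_t n,
				  size_t self, bool *needs_ipi)
{
	size_t ipis = 0;

	for (size_t i = 0; i < n; i++) {
		needs_ipi[i] = false;
		if (i == self)
			continue;

		unsigned int old = atomic_load(&vs[i].state);
		for (;;) {
			if (old & VS_RUNNING) {
				needs_ipi[i] = true;
				ipis++;
				break;
			}
			/* Not running: defer the flush. On failure 'old' is
			 * reloaded and the running check is repeated. */
			if (atomic_compare_exchange_weak(&vs[i].state, &old,
							 old | VS_FLUSH_ON_ENTER))
				break;
		}
	}
	return ipis;
}

/* Host side (assumed shape): returns true if a deferred flush is pending. */
static bool host_guest_enter(struct vcpu_state *vs)
{
	unsigned int old = atomic_exchange(&vs->state, VS_RUNNING);

	return (old & VS_FLUSH_ON_ENTER) != 0;
}

/* Host side (assumed shape): vcpu stops running guest code. */
static void host_guest_exit(struct vcpu_state *vs)
{
	atomic_fetch_and(&vs->state, ~VS_RUNNING);
}
```

In this model the running check and the setting of the flush bit happen
in one compare-and-exchange on a single word, so a vcpu that starts
running in between is noticed and gets an IPI instead.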
This also introduces a dependency for the lock-less page walk
performed by get_user_pages_fast() (gup_fast). gup_fast disables
interrupts and assumes that the pages will not be freed during that
period. This was fine as long as flush_tlb_others_ipi() waited for
all the IPIs to be processed before returning. With the new approach of
not waiting for the sleeping vcpus, this assumption is no longer
valid. So HAVE_RCU_TABLE_FREE is now used to free the pages. This
makes sure that all the cpus have at least processed the smp_callback
before the pages are freed.
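
The invariant behind this can be modelled with a small single-threaded
toy in C. It is not RCU and not the HAVE_RCU_TABLE_FREE code: all names
here (table_batch, walk_begin, table_free_defer, and so on) are
hypothetical, and the grace period is simplified to "no cpu is inside a
lock-less walk right now".

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS     4
#define MAX_PENDING 8

/* Page-table pages queued for freeing instead of being freed at once. */
struct table_batch {
	void  *pending[MAX_PENDING];
	size_t nr;
};

/* True while a cpu is inside an interrupts-disabled, gup_fast-style walk. */
static bool cpu_in_walk[NR_CPUS];

static void walk_begin(int cpu) { cpu_in_walk[cpu] = true; }
static void walk_end(int cpu)   { cpu_in_walk[cpu] = false; }

/* Queue a table for later release; returns false if the batch is full. */
static bool table_free_defer(struct table_batch *b, void *table)
{
	if (b->nr == MAX_PENDING)
		return false;
	b->pending[b->nr++] = table;
	return true;
}

/*
 * Release the queued tables only when no cpu is in a lock-less walk.
 * Returns the number of tables released (0 if a walker is still active).
 */
static size_t table_free_try_release(struct table_batch *b)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_in_walk[cpu])
			return 0;

	size_t released = b->nr;

	b->nr = 0; /* a real implementation would free the pages here */
	return released;
}
```

The point of the model is that freeing no longer relies on the flush
IPI having completed on every cpu; it is gated on the walkers instead.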
Changelog from v2:
• Rebase to Linus's 3.5-based kernel (commit f7da9cd)
• Port PV-Flush to new TLB-Optimization code by Alex Shi
• Use pinned pages to avoid overhead during guest enter/exit (Marcelo)
• Remove the kick, as it did not improve much
• Use bit fields in the state flag (flush_on_enter and vcpu_running) to
  avoid smp barriers (Marcelo)
• Add documentation for Paravirt TLB Flush (Marcelo)
Changelog from v1:
• Race fixes reported by Vatsa
• Address gup_fast dependency using PeterZ's rcu table free patch
• Fix rcu_table_free for hw pagetable walkers
Here are the results from PLE hardware. Setup details:
• 32 CPUs (HT disabled)
• 64-bit VM
• 32vcpus
• 8GB RAM
base: f7da9cd (based on the 3.5 kernel, includes Rik's changes and Alex
Shi's changes)
ple-opt: Raghu's PLE improvements [1] (in kvm:auto-next now)
pv3flsh: ple-opt + paravirt flush v3
Lower is better
kbench - 1VM
============
            Avg          Stddev
base        16.714089    1.2471967
pleopt      12.527411    0.15261886
pv3flsh     12.955556    0.5041832
kbench - 2VM
============
            Avg          Stddev
base        28.565933    3.0167804
pleopt      22.7613      1.9046476
pv3flsh     23.034083    2.2192968
Higher is better
ebizzy - 1VM
============
            Avg          Stddev
base        1091         21.674358
pleopt      2239         45.188494
pv3flsh     2170.7       44.592102
ebizzy - 2VM
============
            Avg          Stddev
base        1824.7       63.708299
pleopt      2383.2       107.46779
pv3flsh     2328.2       69.359172
Observations:
-------------
Looking at the results above, the ple-opt [1] patches have already
addressed the remote-flush-tlb issue that we were trying to address
using the paravirt-tlb-flush approach: pv3flsh shows no further
improvement over pleopt in any of the four runs.
[1] http://article.gmane.org/gmane.linux.kernel/1329752
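
To put numbers on the comparison, the averages from the tables can be
turned into relative changes. The helper below is a minimal sketch; the
struct and function names are made up for illustration and the values
are copied from the Avg columns above.

```c
#include <stddef.h>

/* Average results per benchmark, copied from the tables above. */
struct result {
	const char *bench;
	double base, pleopt, pv3flsh;
};

static const struct result results[] = {
	{ "kbench 1VM", 16.714089, 12.527411, 12.955556 }, /* lower is better  */
	{ "kbench 2VM", 28.565933, 22.7613,   23.034083 }, /* lower is better  */
	{ "ebizzy 1VM", 1091.0,    2239.0,    2170.7    }, /* higher is better */
	{ "ebizzy 2VM", 1824.7,    2383.2,    2328.2    }, /* higher is better */
};

/* Relative change going from 'from' to 'to', in percent. */
static double pct_change(double from, double to)
{
	return (to - from) / from * 100.0;
}
```

For example, pct_change(results[0].pleopt, results[0].pv3flsh) comes out
at roughly +3.4 (kbench 1VM, slightly slower), and the same call on the
ebizzy 1VM row gives roughly -3.1 (slightly lower throughput).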
---
Nikunj A. Dadhania (6):
      KVM Guest: Add VCPU running/pre-empted state for guest
      KVM-HV: Add VCPU running/pre-empted state for guest
      KVM Guest: Add paravirt kvm_flush_tlb_others
      KVM-HV: Add flush_on_enter before guest enter
      Enable HAVE_RCU_TABLE_FREE for kvm when PARAVIRT_TLB_FLUSH is enabled
      KVM-doc: Add paravirt tlb flush document
Peter Zijlstra (2):
      mm, x86: Add HAVE_RCU_TABLE_FREE support
      mm: Add missing TLB invalidate to RCU page-table freeing
 Documentation/virtual/kvm/msr.txt                |  4 +
 Documentation/virtual/kvm/paravirt-tlb-flush.txt | 53 +++++++++++++++++++
 arch/Kconfig                                     |  3 +
 arch/powerpc/Kconfig                             |  1
 arch/sparc/Kconfig                               |  1
 arch/x86/Kconfig                                 | 11 ++++
 arch/x86/include/asm/kvm_host.h                  |  7 ++
 arch/x86/include/asm/kvm_para.h                  | 13 +++++
 arch/x86/include/asm/tlb.h                       |  1
 arch/x86/include/asm/tlbflush.h                  | 11 ++++
 arch/x86/kernel/kvm.c                            | 38 +++++++++++++
 arch/x86/kvm/cpuid.c                             |  1
 arch/x86/kvm/x86.c                               | 62 +++++++++++++++++++++-
 arch/x86/mm/pgtable.c                            |  6 +-
 arch/x86/mm/tlb.c                                | 37 +++++++++++++
 include/asm-generic/tlb.h                        |  9 +++
 mm/memory.c                                      | 43 +++++++++++++--
 17 files changed, 290 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/virtual/kvm/paravirt-tlb-flush.txt