From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcelo Tosatti
Subject: [patch 00/13] RFC: out of sync shadow
Date: Sat, 06 Sep 2008 15:48:22 -0300
Message-ID: <20080906184822.560099087@localhost.localdomain>
Cc: kvm@vger.kernel.org
To: Avi Kivity
Received: from mx1.redhat.com ([66.187.233.31]:60539 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752084AbYIFT0u (ORCPT ); Sat, 6 Sep 2008 15:26:50 -0400
Sender: kvm-owner@vger.kernel.org

Keep shadow pages temporarily out of sync, allowing more efficient guest
PTE updates in comparison to trap-emulate + unprotect heuristics. Stolen
from Xen :)

This version only allows leaf pagetables to go out of sync, for
simplicity, but can be enhanced.

The VMX "bypass_guest_pf" feature on prefetch_page breaks it (since new
PTE writes need no TLB flush, I assume). Not sure if it's worthwhile to
convert notrap_nonpresent -> trap_nonpresent on unshadow, or just go for
an unconditional nonpaging_prefetch_page.

* Kernel builds on 4-way 64-bit guest improve 10% (+ 3.7% for
  get_user_pages_fast).

* lmbench's "lat_proc fork" microbenchmark latency is 40% lower (a
  worst-case scenario test for shadow paging).

* The RHEL3 highpte kscand hangs go from 5+ seconds to < 1 second.

* Windows 2003 Server, 32-bit PAE, DDK build (build -cPzM 3):

Windows 2003 Checked 64 Bit Build Environment, 256M RAM

1-vcpu:
vanilla + gup_fast      oos
0:04:37.375             0:03:28.047 (- 25%)

2-vcpus:
vanilla + gup_fast      oos
0:02:32.000             0:01:56.031 (- 23%)

Windows 2003 Checked Build Environment, 1GB RAM

2-vcpus:
vanilla + fast_gup      oos
0:02:26.078             0:01:50.110 (- 24%)

4-vcpus:
vanilla + fast_gup      oos
0:01:59.266             0:01:29.625 (- 25%)

And I think other optimizations are possible now; for example, the guest
can be responsible for remote TLB flushing on kvm_mmu_pte_write().

Please review.

--
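
For readers new to the idea, here is a rough toy model of why letting a
leaf pagetable go out of sync saves traps compared to trap-emulate. It
is only a sketch under assumptions: every name in it (toy_shadow_page,
toy_guest_write_pte, toy_tlb_flush, ...) is hypothetical and none of it
is code from this series or from KVM. It assumes the shadow copy is
brought back in sync when the guest flushes its TLB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NPTES 8 /* toy leaf pagetable size (assumption) */

/* Hypothetical model of one shadowed leaf pagetable. */
struct toy_shadow_page {
	uint64_t guest_pte[NPTES];  /* the guest's own pagetable */
	uint64_t shadow_pte[NPTES]; /* the hypervisor's shadow copy */
	bool allow_unsync;          /* false: trap-emulate every write */
	bool write_protected;       /* guest writes to the table fault */
	bool unsync;                /* shadow copy may be stale */
	unsigned traps;             /* write faults taken so far */
	unsigned resyncs;           /* resynchronizations done so far */
};

static void toy_init(struct toy_shadow_page *sp, bool allow_unsync)
{
	memset(sp, 0, sizeof(*sp));
	sp->allow_unsync = allow_unsync;
	sp->write_protected = true;
}

/* Guest stores a new PTE value into its pagetable. */
static void toy_guest_write_pte(struct toy_shadow_page *sp, size_t idx,
				uint64_t val)
{
	assert(idx < NPTES);

	if (sp->write_protected) {
		sp->traps++;
		if (sp->allow_unsync) {
			/* Drop write protection; later writes run freely. */
			sp->write_protected = false;
			sp->unsync = true;
		} else {
			/* Trap-emulate: update the shadow on every write. */
			sp->shadow_pte[idx] = val;
		}
	}
	sp->guest_pte[idx] = val;
}

/* Guest TLB flush: the point where a stale shadow must be fixed up. */
static void toy_tlb_flush(struct toy_shadow_page *sp)
{
	if (!sp->unsync)
		return;
	memcpy(sp->shadow_pte, sp->guest_pte, sizeof(sp->shadow_pte));
	sp->write_protected = true;
	sp->unsync = false;
	sp->resyncs++;
}

static bool toy_in_sync(const struct toy_shadow_page *sp)
{
	return memcmp(sp->shadow_pte, sp->guest_pte,
		      sizeof(sp->shadow_pte)) == 0;
}
```

In this model, filling all NPTES entries costs NPTES traps with
allow_unsync == false, but a single trap plus one resync at the next
toy_tlb_flush() with allow_unsync == true. The real series has to deal
with much more (multiple vcpus, non-leaf tables, rmap), which the toy
ignores.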