From: Paolo Bonzini <pbonzini@redhat.com>
To: Nadav Amit <nadav.amit@gmail.com>, kvm@vger.kernel.org
Subject: Re: x86: strange behavior of invlpg
Date: Mon, 16 May 2016 11:28:23 +0200 [thread overview]
Message-ID: <573992B7.20300@redhat.com> (raw)
In-Reply-To: <2C79F043-EF77-4C44-BE36-1CEDE16E788F@gmail.com>
On 14/05/2016 11:35, Nadav Amit wrote:
> I encountered a strange phenomenon and I would appreciate your sanity check
> and opinion. It looks as if 'invlpg' that runs in a VM causes a very broad
> flush.
>
> I created a small kvm-unit-test (below) to demonstrate it. The test
> touches 50 pages, and then either: (1) runs a full flush, (2) runs invlpg on
> an arbitrary (other) address, or (3) runs a memory barrier.
>
> It appears that the execution time of the test is indeed determined by TLB
> misses, since the runtime of the memory barrier flavor is considerably lower.
Did you check the performance counters? Another explanation is that
there are no TLB misses at all, i.e. CR3 writes are optimized in such
a way that they do not incur TLB misses either. (Disclaimer: I didn't
check the performance counters to prove this alternative theory ;)).
> What I find strange is that if I compute the net access time for tests 1 & 2,
> by deducting the time of the flushes, the time is almost identical. I am aware
> that invlpg flushes the page-walk caches, but I would still expect the invlpg
> flavor to run considerably faster than the full-flush flavor.
That's interesting. I guess you're using EPT, because I get very
similar numbers on an Ivy Bridge laptop:
with invlpg: 902,224,568
with full flush: 880,103,513
invlpg only 113,186,461
full flushes only 100,236,620
access net 104,454,125
w/full flush net 779,866,893
w/invlpg net 789,038,107
(commas added for readability).
Out of curiosity I tried making all pages global (patch after my
signature). Both the invlpg and the full-flush flavors become much
faster, but invlpg is now faster than the full flush, even though in
theory it should be the opposite...
with invlpg: 223,079,661
with full flush: 294,280,788
invlpg only 126,236,334
full flushes only 107,614,525
access net 90,830,503
w/full flush net 186,666,263
w/invlpg net 96,843,327
Thanks for the interesting test!
Paolo
diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index 7ce7bbc..3b9b81a 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -2,6 +2,7 @@
 #include "vm.h"
 #include "libcflat.h"
+#define PTE_GLOBAL 256
 #define PAGE_SIZE 4096ul
 #ifdef __x86_64__
 #define LARGE_PAGE_SIZE (512 * PAGE_SIZE)
@@ -106,14 +107,14 @@ unsigned long *install_large_page(unsigned long *cr3,
 				  void *virt)
 {
 	return install_pte(cr3, 2, virt,
-		phys | PTE_PRESENT | PTE_WRITE | PTE_USER | PTE_PSE, 0);
+		phys | PTE_PRESENT | PTE_WRITE | PTE_USER | PTE_PSE | PTE_GLOBAL, 0);
 }
 
 unsigned long *install_page(unsigned long *cr3,
 			    unsigned long phys,
 			    void *virt)
 {
-	return install_pte(cr3, 1, virt, phys | PTE_PRESENT | PTE_WRITE | PTE_USER, 0);
+	return install_pte(cr3, 1, virt, phys | PTE_PRESENT | PTE_WRITE | PTE_USER | PTE_GLOBAL, 0);
 }
> Am I missing something?
>
>
> On my Haswell EP I get the following results:
>
> with invlpg: 948965249
> with full flush: 1047927009
> invlpg only 127682028
> full flushes only 224055273
> access net 107691277 --> considerably lower than w/flushes
> w/full flush net 823871736
> w/invlpg net 821283221 --> almost identical to full-flush net
>
> ---
>
>
> #include "libcflat.h"
> #include "fwcfg.h"
> #include "vm.h"
> #include "smp.h"
>
> #define N_PAGES (50)
> #define ITERATIONS (500000)
> volatile char buf[N_PAGES * PAGE_SIZE] __attribute__ ((aligned (PAGE_SIZE)));
>
> int main(void)
> {
> void *another_addr = (void*)0x50f9000;
> int i, j;
> unsigned long t_start, t_single, t_full, t_single_only, t_full_only,
> t_access;
> unsigned long cr3;
> char v = 0;
>
> setup_vm();
>
> cr3 = read_cr3();
>
> t_start = rdtsc();
> for (i = 0; i < ITERATIONS; i++) {
> invlpg(another_addr);
> for (j = 0; j < N_PAGES; j++)
> v = buf[PAGE_SIZE * j];
> }
> t_single = rdtsc() - t_start;
> printf("with invlpg: %lu\n", t_single);
>
> t_start = rdtsc();
> for (i = 0; i < ITERATIONS; i++) {
> write_cr3(cr3);
> for (j = 0; j < N_PAGES; j++)
> v = buf[PAGE_SIZE * j];
> }
> t_full = rdtsc() - t_start;
> printf("with full flush: %lu\n", t_full);
>
> t_start = rdtsc();
> for (i = 0; i < ITERATIONS; i++)
> invlpg(another_addr);
> t_single_only = rdtsc() - t_start;
> printf("invlpg only %lu\n", t_single_only);
>
> t_start = rdtsc();
> for (i = 0; i < ITERATIONS; i++)
> write_cr3(cr3);
> t_full_only = rdtsc() - t_start;
> printf("full flushes only %lu\n", t_full_only);
>
> t_start = rdtsc();
> for (i = 0; i < ITERATIONS; i++) {
> for (j = 0; j < N_PAGES; j++)
> v = buf[PAGE_SIZE * j];
> mb();
> }
> t_access = rdtsc()-t_start;
> printf("access net %lu\n", t_access);
> printf("w/full flush net %lu\n", t_full - t_full_only);
> printf("w/invlpg net %lu\n", t_single - t_single_only);
>
> (void)v;
> return 0;
> }
> --
Thread overview: 7+ messages
2016-05-14 9:35 x86: strange behavior of invlpg Nadav Amit
2016-05-16 9:28 ` Paolo Bonzini [this message]
2016-05-16 16:51 ` Nadav Amit
2016-05-16 16:56 ` Paolo Bonzini
2016-05-16 19:39 ` Nadav Amit
2016-05-17 4:27 ` Nadav Amit
2018-02-15 22:43 ` Nadav Amit