qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH] Huge TLB performance improvement
@ 2006-03-06 14:59 Thiemo Seufer
  2006-11-05 15:38 ` Daniel Jacobowitz
  0 siblings, 1 reply; 16+ messages in thread
From: Thiemo Seufer @ 2006-03-06 14:59 UTC (permalink / raw)
  To: qemu-devel

Hello All,

This patch vastly improves TLB performance on MIPS, and probably also
on other architectures. I measured a Linux boot-shutdown cycle,
including userland init.

With minimal jump cache invalidation:

real    11m43.429s
user    9m51.975s
sys     0m1.375s

 64.19   1476.81  1476.81 20551904     0.00     0.00  tlb_flush_page
  6.72   1631.36   154.55   184346     0.00     0.00  cpu_mips_exec
  4.35   1731.46   100.10  3550500     0.00     0.00  dyngen_code
  3.66   1815.77    84.31 90897893     0.00     0.00  decode_opc
  2.89   1882.21    66.44 11170487     0.00     0.00  gen_intermediate_code_internal
  1.72   1921.80    39.59 29919267     0.00     0.00  map_address
  1.52   1956.66    34.86  7619987     0.00     0.00  tb_find_pc
  0.96   1978.85    22.19 26361969     0.00     0.00  tlb_set_page_exec
  0.96   2000.84    21.99                             __ldl_mmu
  0.90   2021.59    20.75 27279747     0.00     0.00  gen_arith_imm


With global jump cache kill:

real    6m19.811s
user    4m23.650s
sys     0m0.617s

 21.67    188.78   188.78   146571     0.00     0.00  cpu_mips_exec
 11.37    287.88    99.10  3393051     0.00     0.00  dyngen_code
  9.59    371.45    83.57 89839869     0.00     0.00  decode_opc
  7.68    438.33    66.88 10989930     0.00     0.00  gen_intermediate_code_internal
  4.24    475.26    36.93 30124659     0.00     0.00  map_address
  3.80    508.33    33.07  7596879     0.00     0.00  tb_find_pc
  2.74    532.22    23.89 27781692     0.00     0.00  tlb_set_page_exec
  2.62    555.02    22.80 39891573     0.00     0.00  cpu_mips_handle_mmu_fault
  2.55    577.25    22.23                             __ldl_mmu
  2.30    597.26    20.01 26968709     0.00     0.00  gen_arith_imm


Thiemo


Index: qemu-work/exec.c
===================================================================
--- qemu-work.orig/exec.c	2006-03-06 01:30:09.000000000 +0000
+++ qemu-work/exec.c	2006-03-06 01:30:28.000000000 +0000
@@ -1247,7 +1247,6 @@
 void tlb_flush_page(CPUState *env, target_ulong addr)
 {
     int i;
-    TranslationBlock *tb;
 
 #if defined(DEBUG_TLB)
     printf("tlb_flush_page: " TARGET_FMT_lx "\n", addr);
@@ -1261,14 +1260,10 @@
     tlb_flush_entry(&env->tlb_table[0][i], addr);
     tlb_flush_entry(&env->tlb_table[1][i], addr);
 
-    for(i = 0; i < TB_JMP_CACHE_SIZE; i++) {
-        tb = env->tb_jmp_cache[i];
-        if (tb && 
-            ((tb->pc & TARGET_PAGE_MASK) == addr ||
-             ((tb->pc + tb->size - 1) & TARGET_PAGE_MASK) == addr)) {
-            env->tb_jmp_cache[i] = NULL;
-        }
-    }
+    /* We throw away the jump cache altogether. This is cheaper than
+       trying to be smart by invalidating only the entries in the
+       affected address range. */
+    memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
 
 #if !defined(CONFIG_SOFTMMU)
     if (addr < MMAP_AREA_END)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-03-06 14:59 [Qemu-devel] [PATCH] Huge TLB performance improvement Thiemo Seufer
@ 2006-11-05 15:38 ` Daniel Jacobowitz
  2006-11-12  1:10   ` Daniel Jacobowitz
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-05 15:38 UTC (permalink / raw)
  To: qemu-devel

On Mon, Mar 06, 2006 at 02:59:29PM +0000, Thiemo Seufer wrote:
> Hello All,
> 
> This patch vastly improves TLB performance on MIPS, and probably also
> on other architectures. I measured a Linux boot-shutdown cycle,
> including userland init.

Quoting the whole message since this is from March...

I don't remember seeing any followup discussion of this patch, but I
may have missed it.  Thiemo's definitely right about "vastly".  Is this
patch appropriate, or would anyone care to suggest a more
sophisticated data structure to avoid the full cache invalidate?

> 
> With minimal jump cache invalidation:
> 
> real    11m43.429s
> user    9m51.975s
> sys     0m1.375s
> 
>  64.19   1476.81  1476.81 20551904     0.00     0.00  tlb_flush_page
>   6.72   1631.36   154.55   184346     0.00     0.00  cpu_mips_exec
>   4.35   1731.46   100.10  3550500     0.00     0.00  dyngen_code
>   3.66   1815.77    84.31 90897893     0.00     0.00  decode_opc
>   2.89   1882.21    66.44 11170487     0.00     0.00  gen_intermediate_code_internal
>   1.72   1921.80    39.59 29919267     0.00     0.00  map_address
>   1.52   1956.66    34.86  7619987     0.00     0.00  tb_find_pc
>   0.96   1978.85    22.19 26361969     0.00     0.00  tlb_set_page_exec
>   0.96   2000.84    21.99                             __ldl_mmu
>   0.90   2021.59    20.75 27279747     0.00     0.00  gen_arith_imm
> 
> 
> With global jump cache kill:
> 
> real    6m19.811s
> user    4m23.650s
> sys     0m0.617s
> 
>  21.67    188.78   188.78   146571     0.00     0.00  cpu_mips_exec
>  11.37    287.88    99.10  3393051     0.00     0.00  dyngen_code
>   9.59    371.45    83.57 89839869     0.00     0.00  decode_opc
>   7.68    438.33    66.88 10989930     0.00     0.00  gen_intermediate_code_internal
>   4.24    475.26    36.93 30124659     0.00     0.00  map_address
>   3.80    508.33    33.07  7596879     0.00     0.00  tb_find_pc
>   2.74    532.22    23.89 27781692     0.00     0.00  tlb_set_page_exec
>   2.62    555.02    22.80 39891573     0.00     0.00  cpu_mips_handle_mmu_fault
>   2.55    577.25    22.23                             __ldl_mmu
>   2.30    597.26    20.01 26968709     0.00     0.00  gen_arith_imm
> 
> 
> Thiemo
> 
> 
> Index: qemu-work/exec.c
> ===================================================================
> --- qemu-work.orig/exec.c	2006-03-06 01:30:09.000000000 +0000
> +++ qemu-work/exec.c	2006-03-06 01:30:28.000000000 +0000
> @@ -1247,7 +1247,6 @@
>  void tlb_flush_page(CPUState *env, target_ulong addr)
>  {
>      int i;
> -    TranslationBlock *tb;
>  
>  #if defined(DEBUG_TLB)
>      printf("tlb_flush_page: " TARGET_FMT_lx "\n", addr);
> @@ -1261,14 +1260,10 @@
>      tlb_flush_entry(&env->tlb_table[0][i], addr);
>      tlb_flush_entry(&env->tlb_table[1][i], addr);
>  
> -    for(i = 0; i < TB_JMP_CACHE_SIZE; i++) {
> -        tb = env->tb_jmp_cache[i];
> -        if (tb && 
> -            ((tb->pc & TARGET_PAGE_MASK) == addr ||
> -             ((tb->pc + tb->size - 1) & TARGET_PAGE_MASK) == addr)) {
> -            env->tb_jmp_cache[i] = NULL;
> -        }
> -    }
> +    /* We throw away the jump cache altogether. This is cheaper than
> +       trying to be smart by invalidating only the entries in the
> +       affected address range. */
> +    memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
>  
>  #if !defined(CONFIG_SOFTMMU)
>      if (addr < MMAP_AREA_END)
> 
> 
> _______________________________________________
> Qemu-devel mailing list
> Qemu-devel@nongnu.org
> http://lists.nongnu.org/mailman/listinfo/qemu-devel
> 

-- 
Daniel Jacobowitz
CodeSourcery


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-05 15:38 ` Daniel Jacobowitz
@ 2006-11-12  1:10   ` Daniel Jacobowitz
  2006-11-12 11:49     ` Laurent Desnogues
  2006-11-12 20:42     ` Paul Brook
  0 siblings, 2 replies; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-12  1:10 UTC (permalink / raw)
  To: qemu-devel

On Sun, Nov 05, 2006 at 10:38:20AM -0500, Daniel Jacobowitz wrote:
> On Mon, Mar 06, 2006 at 02:59:29PM +0000, Thiemo Seufer wrote:
> > Hello All,
> > 
> > This patch vastly improves TLB performance on MIPS, and probably also
> > on other architectures. I measured a Linux boot-shutdown cycle,
> > including userland init.
> 
> Quoting the whole message since this is from March...
> 
> I don't remember seeing any followup discussion of this patch, but I
> may have missed it.  Thiemo's definitely right about "vastly".  Is this
> patch appropriate, or would anyone care to suggest a more
> sophisticated data structure to avoid the full cache invalidate?

This patch is an even nicer alternative, I think.  I benchmarked four
alternatives (several times each):

Straight qemu with my previously posted MIPS patches takes 6:13 to
start and reboot a MIPS userspace (through init, so lots of fork/exec).

Thiemo's patch, which flushes the whole jump buffer, cuts it to 1:40.

A patch which finds the entries which need to be flushed more
efficiently cuts it to 1:21.

A patch which flushes up to 1/32nd of the jump buffer indiscriminately
cuts it to 1:11-1:13.

Here's that last patch.  It changes the hash function so that entries
from a particular page are always grouped together in tb_jmp_cache,
then finds the possibly two affected ranges and memsets them clear.
Thoughts?  Is this acceptable, and where else should it be tested besides
MIPS?  I haven't fine-tuned the numbers; it currently allows for at most 64
cached jump targets per target page, but that could be made higher or
lower.

-- 
Daniel Jacobowitz
CodeSourcery

---
 cpu-defs.h |    5 +++++
 exec-all.h |   12 +++++++++++-
 exec.c     |   15 +++++++--------
 3 files changed, 23 insertions(+), 9 deletions(-)

Index: qemu/cpu-defs.h
===================================================================
--- qemu.orig/cpu-defs.h	2006-11-11 15:12:26.000000000 -0500
+++ qemu/cpu-defs.h	2006-11-11 15:12:33.000000000 -0500
@@ -80,6 +80,11 @@ typedef unsigned long ram_addr_t;
 #define TB_JMP_CACHE_BITS 12
 #define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
 
+#define TB_JMP_PAGE_BITS (TB_JMP_CACHE_BITS / 2)
+#define TB_JMP_PAGE_SIZE (1 << TB_JMP_PAGE_BITS)
+#define TB_JMP_ADDR_MASK (TB_JMP_PAGE_SIZE - 1)
+#define TB_JMP_PAGE_MASK (TB_JMP_ADDR_MASK << TB_JMP_PAGE_BITS)
+
 #define CPU_TLB_BITS 8
 #define CPU_TLB_SIZE (1 << CPU_TLB_BITS)
 
Index: qemu/exec-all.h
===================================================================
--- qemu.orig/exec-all.h	2006-11-11 15:12:26.000000000 -0500
+++ qemu/exec-all.h	2006-11-11 19:56:36.000000000 -0500
@@ -196,9 +196,19 @@ typedef struct TranslationBlock {
     struct TranslationBlock *jmp_first;
 } TranslationBlock;
 
+static inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
+{
+    target_ulong tmp;
+    tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
+    return (tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK;
+}
+
 static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
 {
-    return (pc ^ (pc >> TB_JMP_CACHE_BITS)) & (TB_JMP_CACHE_SIZE - 1);
+    target_ulong tmp;
+    tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
+    return (((tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK) |
+	    (tmp & TB_JMP_ADDR_MASK));
 }
 
 static inline unsigned int tb_phys_hash_func(unsigned long pc)
Index: qemu/exec.c
===================================================================
--- qemu.orig/exec.c	2006-11-11 15:12:26.000000000 -0500
+++ qemu/exec.c	2006-11-11 19:39:45.000000000 -0500
@@ -1299,14 +1299,13 @@ void tlb_flush_page(CPUState *env, targe
     tlb_flush_entry(&env->tlb_table[0][i], addr);
     tlb_flush_entry(&env->tlb_table[1][i], addr);
 
-    for(i = 0; i < TB_JMP_CACHE_SIZE; i++) {
-        tb = env->tb_jmp_cache[i];
-        if (tb && 
-            ((tb->pc & TARGET_PAGE_MASK) == addr ||
-             ((tb->pc + tb->size - 1) & TARGET_PAGE_MASK) == addr)) {
-            env->tb_jmp_cache[i] = NULL;
-        }
-    }
+    /* Discard jump cache entries for any tb which might potentially
+       overlap the flushed page.  */
+    i = tb_jmp_cache_hash_page(addr - TARGET_PAGE_SIZE);
+    memset (&env->tb_jmp_cache[i], 0, TB_JMP_PAGE_SIZE * sizeof(tb));
+
+    i = tb_jmp_cache_hash_page(addr);
+    memset (&env->tb_jmp_cache[i], 0, TB_JMP_PAGE_SIZE * sizeof(tb));
 
 #if !defined(CONFIG_SOFTMMU)
     if (addr < MMAP_AREA_END)


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12  1:10   ` Daniel Jacobowitz
@ 2006-11-12 11:49     ` Laurent Desnogues
  2006-11-12 13:52       ` Thiemo Seufer
  2006-11-12 14:08       ` Paul Brook
  2006-11-12 20:42     ` Paul Brook
  1 sibling, 2 replies; 16+ messages in thread
From: Laurent Desnogues @ 2006-11-12 11:49 UTC (permalink / raw)
  To: qemu-devel

Daniel Jacobowitz wrote:
> 
> Straight qemu with my previously posted MIPS patches takes 6:13 to
> start and reboot a MIPS userspace (through init, so lots of fork/exec).
> 
> Thiemo's patch, which flushes the whole jump buffer, cuts it to 1:40.
> 
> A patch which finds the entries which need to be flushed more
> efficiently cuts it to 1:21.
> 
> A patch which flushes up to 1/32nd of the jump buffer indiscriminately
> cuts it to 1:11-1:13.

Warning:  I don't know anything about the Qemu MMU implementation
so this question is perhaps stupid :)

Did you try to benchmark some user space applications with the
various implementations you propose?  The boot of a Linux kernel
is quite heavy on various kinds of flushes and so is very
different from "standard" applications.


			Laurent


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 11:49     ` Laurent Desnogues
@ 2006-11-12 13:52       ` Thiemo Seufer
  2006-11-12 14:08       ` Paul Brook
  1 sibling, 0 replies; 16+ messages in thread
From: Thiemo Seufer @ 2006-11-12 13:52 UTC (permalink / raw)
  To: qemu-devel

Laurent Desnogues wrote:
> Daniel Jacobowitz wrote:
> >
> >Straight qemu with my previously posted MIPS patches takes 6:13 to
> >start and reboot a MIPS userspace (through init, so lots of fork/exec).
> >
> >Thiemo's patch, which flushes the whole jump buffer, cuts it to 1:40.
> >
> >A patch which finds the entries which need to be flushed more
> >efficiently cuts it to 1:21.
> >
> >A patch which flushes up to 1/32nd of the jump buffer indiscriminately
> >cuts it to 1:11-1:13.
> 
> Warning:  I don't know anything about the Qemu MMU implementation
> so this question is perhaps stupid :)
> 
> Did you try to benchmark some user space applications with the
> various implementations you propose?

A "benchmark" I did was compiling lmbench, which became more than twice
as fast. I didn't bother to do real measurements.

> The boot of a Linux kernel
> is quite heavy on various kinds of flushes and so is very
> different from "standard" applications.

At least the MIPS kernel is indeed different in that it uses non-trivial
TLB mappings nearly exclusively for modules and userland. IOW, the
kernel-side MMU overhead at boot time is negligible. My patch made no
significant difference for that case.


Thiemo


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 11:49     ` Laurent Desnogues
  2006-11-12 13:52       ` Thiemo Seufer
@ 2006-11-12 14:08       ` Paul Brook
  2006-11-12 14:29         ` Thiemo Seufer
  1 sibling, 1 reply; 16+ messages in thread
From: Paul Brook @ 2006-11-12 14:08 UTC (permalink / raw)
  To: qemu-devel

On Sunday 12 November 2006 11:49, Laurent Desnogues wrote:
> Daniel Jacobowitz wrote:
> > Straight qemu with my previously posted MIPS patches takes 6:13 to
> > start and reboot a MIPS userspace (through init, so lots of fork/exec).
> >
> > Thiemo's patch, which flushes the whole jump buffer, cuts it to 1:40.
> >
> > A patch which finds the entries which need to be flushed more
> > efficiently cuts it to 1:21.
> >
> > A patch which flushes up to 1/32nd of the jump buffer indiscriminately
> > cuts it to 1:11-1:13.
>
> Warning:  I don't know anything about the Qemu MMU implementation
> so this question is perhaps stupid :)
>
> Did you try to benchmark some user space applications with the
> various implementations you propose?  The boot of a Linux kernel
> is quite heavy on various kinds of flushes and so is very
> different from "standard" applications.

MIPS is different because it has a relatively small software-managed TLB. 
Other targets have a hardware-managed TLB. On a hardware-managed TLB the OS 
treats it as if it were of infinite size, and invalidation only occurs when the 
OS changes the mappings. On a software-managed TLB, "flushes" are more likely to 
occur during normal operation as TLB slots are reused.

Paul


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 14:08       ` Paul Brook
@ 2006-11-12 14:29         ` Thiemo Seufer
  2006-11-12 14:44           ` Paul Brook
  2006-11-12 16:56           ` Daniel Jacobowitz
  0 siblings, 2 replies; 16+ messages in thread
From: Thiemo Seufer @ 2006-11-12 14:29 UTC (permalink / raw)
  To: qemu-devel

Paul Brook wrote:
> On Sunday 12 November 2006 11:49, Laurent Desnogues wrote:
> > Daniel Jacobowitz wrote:
> > > Straight qemu with my previously posted MIPS patches takes 6:13 to
> > > start and reboot a MIPS userspace (through init, so lots of fork/exec).
> > >
> > > Thiemo's patch, which flushes the whole jump buffer, cuts it to 1:40.
> > >
> > > A patch which finds the entries which need to be flushed more
> > > efficiently cuts it to 1:21.
> > >
> > > A patch which flushes up to 1/32nd of the jump buffer indiscriminately
> > > cuts it to 1:11-1:13.
> >
> > Warning:  I don't know anything about the Qemu MMU implementation
> > so this question is perhaps stupid :)
> >
> > Did you try to benchmark some user space applications with the
> > various implementations you propose?  The boot of a Linux kernel
> > is quite heavy on various kinds of flushes and so is very
> > different from "standard" applications.
> 
> MIPS is different because it has a relatively small software managed TLB. 

JFTR, increasing the TLB size from 16 to 64 entries made no performance
difference whatsoever.

> Other targets have a hardware managed TLB. On a hardware managed TLB the OS 
> treats it as if it were infinite size, and invalidation only occurs when the OS 
> changes the mappings. On a software managed TLB "flushes" are more likely to 
> occur during normal operation as TLB slots are reused.

The excessive flushing for mips happens because Qemu doesn't properly
model the hardware's ASID handling.


Thiemo


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 14:29         ` Thiemo Seufer
@ 2006-11-12 14:44           ` Paul Brook
  2006-11-12 15:07             ` Daniel Jacobowitz
  2006-11-12 15:26             ` Thiemo Seufer
  2006-11-12 16:56           ` Daniel Jacobowitz
  1 sibling, 2 replies; 16+ messages in thread
From: Paul Brook @ 2006-11-12 14:44 UTC (permalink / raw)
  To: qemu-devel

> > Other targets have a hardware managed TLB. On a hardware managed TLB the
> > OS treats it as if it were infinite size, and invalidation only occurs
> > when the OS changes the mappings. On a software managed TLB "flushes" are
> > more likely to occur during normal operation as TLB slots are reused.
>
> The excessive flushing for mips happens because Qemu doesn't properly
> model the hardware's ASID handling.

Are you sure? IIUC changing the ASID causes a full qemu TLB flush. The code 
we're tweaking here is for single-page flushes.

Actually that gives me an idea. When a TLB entry with a different ASID gets 
evicted we currently flush that page. This should be a no-op because we 
already did a full flush when the ASID changed.

The other explanation is that the guest OS is doing a full TLB flush 
by manually evicting all the TLB entries. I'd hope that a sane guest OS would 
only do that as a last resort, though.

Paul


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 14:44           ` Paul Brook
@ 2006-11-12 15:07             ` Daniel Jacobowitz
  2006-11-12 15:24               ` Daniel Jacobowitz
  2006-11-12 15:26             ` Thiemo Seufer
  1 sibling, 1 reply; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-12 15:07 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On Sun, Nov 12, 2006 at 02:44:46PM +0000, Paul Brook wrote:
> > > Other targets have a hardware managed TLB. On a hardware managed TLB the
> > > OS treats it as if it were infinite size, and invalidation only occurs
> > > when the OS changes the mappings. On a software managed TLB "flushes" are
> > > more likely to occur during normal operation as TLB slots are reused.
> >
> > The excessive flushing for mips happens because Qemu doesn't properly
> > model the hardware's ASID handling.
> 
> Are you sure? IIUC changing the ASID causes a full qemu TLB flush. The code 
> we're tweaking here is for single page flush.

The brutal performance problem that Thiemo and I have been working on
involves single-page flushes.  With that solved, though, there's still
rather more memory management overhead than I'd like.

I was also thinking about pretending there were more TLB entries than
there actually are.  We have a certain leeway to do this, because there
are two TLB write instructions: indexed and random.  So there is a window
during which the OS can't say for sure that an entry has been evicted.  That
might cut down on flushes, or it might not.  My hope would be to hold
on to the evicted entries until a global flush.

But what Thiemo is getting at, I think, is that we have to flush qemu's
TLB at every ASID switch.  That's pretty lousy!  It makes the ASID a
net performance penalty, instead of a boost.  I don't see anything
obvious that I could do about it, though.  The qemu tlb table only has
room for is_user and the virtual address.

> Actually that gives me an idea. When a TLB entry with a different ASID gets 
> evicted we currently flush that page. This should be a no-op because we 
> already did a full flush when the ASID changed.

Let me see if this makes any difference.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 15:07             ` Daniel Jacobowitz
@ 2006-11-12 15:24               ` Daniel Jacobowitz
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-12 15:24 UTC (permalink / raw)
  To: qemu-devel

On Sun, Nov 12, 2006 at 10:07:15AM -0500, Daniel Jacobowitz wrote:
> > Actually that gives me an idea. When a TLB entry with a different ASID gets 
> > evicted we currently flush that page. This should be a no-op because we 
> > already did a full flush when the ASID changed.
> 
> Let me see if this makes any difference.

Saves about 2% of invalidate_tlb calls, no measurable time change, but
we might as well.  Attached.

-- 
Daniel Jacobowitz
CodeSourcery

---
 target-mips/op_helper.c |    7 +++++++
 1 file changed, 7 insertions(+)

Index: qemu/target-mips/op_helper.c
===================================================================
--- qemu.orig/target-mips/op_helper.c	2006-11-12 10:09:44.000000000 -0500
+++ qemu/target-mips/op_helper.c	2006-11-12 10:21:16.000000000 -0500
@@ -573,8 +573,15 @@ static void invalidate_tlb (int idx)
 {
     tlb_t *tlb;
     target_ulong addr;
+    uint8_t ASID;
+
+    ASID = env->CP0_EntryHi & 0xFF;
 
     tlb = &env->tlb[idx];
+    if (tlb->G == 0 && tlb->ASID != ASID) {
+        return;
+    }
+
     if (tlb->V0) {
         tb_invalidate_page_range(tlb->PFN[0], tlb->end - tlb->VPN);
         addr = tlb->VPN;


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 14:44           ` Paul Brook
  2006-11-12 15:07             ` Daniel Jacobowitz
@ 2006-11-12 15:26             ` Thiemo Seufer
  1 sibling, 0 replies; 16+ messages in thread
From: Thiemo Seufer @ 2006-11-12 15:26 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

Paul Brook wrote:
> > > Other targets have a hardware managed TLB. On a hardware managed TLB the
> > > OS treats it as if it were infinite size, and invalidation only occurs
> > > when the OS changes the mappings. On a software managed TLB "flushes" are
> > > more likely to occur during normal operation as TLB slots are reused.
> >
> > The excessive flushing for mips happens because Qemu doesn't properly
> > model the hardware's ASID handling.
> 
> Are you sure? IIUC changing the ASID causes a full qemu TLB flush. The code 
> we're tweaking here is for single page flush.

With that comment I was referring to the general problem of emulating an MMU
with ASIDs in the current qemu framework.

> Actually that gives me an idea. When a TLB entry with a different ASID gets 
> evicted we currently flush that page. This should be a no-op because we 
> already did a full flush when the ASID changed.

That's the way the MIPS MMU is supposed to work.

> The other explanation is that the guest OS is manually doing a full TLB flush 
> by manually evicting all the TLB entries. I'd hope that a sane guest OS would 
> only do that as a last resort though.

Linux/MIPS does this only at an ASID wraparound, which doesn't happen
that often.


Thiemo


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 14:29         ` Thiemo Seufer
  2006-11-12 14:44           ` Paul Brook
@ 2006-11-12 16:56           ` Daniel Jacobowitz
  2006-11-12 17:49             ` Daniel Jacobowitz
  2006-11-12 18:02             ` Dirk Behme
  1 sibling, 2 replies; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-12 16:56 UTC (permalink / raw)
  To: qemu-devel

On Sun, Nov 12, 2006 at 02:29:38PM +0000, Thiemo Seufer wrote:
> JFTR, increasing the TLB size from 16 to 64 entries made no performance
> difference whatsoever.

I suspect that's because we do about as much eviction.  Here's a
different approach.  Whenever an entry is evicted by tlbwr, the guest
can't predict which existing entry will be removed.  So, let's evict
none of them.  This takes the "evicted" entry and swaps it out to
a second set of TLB entries, avoiding the qemu internal TLB flush.

I'm trying for a complete as-if implementation, so tlbp only searches
the "real" entries (I don't know if it should cause a flush of the
shadowed entries, but things seem to work OK without it).  tlbwi and
tlbr both discard the shadowed entries.

This appears to cut single page flushes by 90%.

My best time for boot/runlevel-2/halt yesterday was 73 seconds.  This
runs at about 51 seconds.  apt-get update finishes in a reasonable
amount of time.  This is with all of the patches I've posted to the
list applied, including the improved tb_jmp_cache handling - we still
do a non-trivial number of single page cache flushes so I think it's
a good idea.

> The excessive flushing for mips happens because Qemu doesn't properly
> model the hardware's ASID handling.

We still do flushes at ASID switches, by the way, so it might be
possible to get further gains here.  But we're down to under about 15%
of CPU time for soft-mmu routines and tb management routines, which
is very good.  Then there's about 65% executing guest code, and the rest
in translation, virtual hardware, and other overhead.

-- 
Daniel Jacobowitz
CodeSourcery

---
 target-mips/cpu.h       |    3 ++-
 target-mips/exec.h      |    1 +
 target-mips/helper.c    |    2 +-
 target-mips/mips-defs.h |    1 +
 target-mips/op_helper.c |   43 +++++++++++++++++++++++++++++++++++++------
 target-mips/translate.c |    1 +
 6 files changed, 43 insertions(+), 8 deletions(-)

Index: qemu/target-mips/cpu.h
===================================================================
--- qemu.orig/target-mips/cpu.h	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/cpu.h	2006-11-12 11:34:24.000000000 -0500
@@ -94,7 +94,8 @@ struct CPUMIPSState {
 		
 #endif
 #if defined(MIPS_USES_R4K_TLB)
-    tlb_t tlb[MIPS_TLB_NB];
+    tlb_t tlb[MIPS_TLB_MAX];
+    uint32_t tlb_in_use;
 #endif
     uint32_t CP0_index;
     uint32_t CP0_random;
Index: qemu/target-mips/exec.h
===================================================================
--- qemu.orig/target-mips/exec.h	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/exec.h	2006-11-12 11:34:24.000000000 -0500
@@ -115,5 +115,6 @@ uint32_t cpu_mips_get_count (CPUState *e
 void cpu_mips_store_count (CPUState *env, uint32_t value);
 void cpu_mips_store_compare (CPUState *env, uint32_t value);
 void cpu_mips_clock_init (CPUState *env);
+void cpu_mips_tlb_flush (CPUState *env, int flush_global);
 
 #endif /* !defined(__QEMU_MIPS_EXEC_H__) */
Index: qemu/target-mips/helper.c
===================================================================
--- qemu.orig/target-mips/helper.c	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/helper.c	2006-11-12 11:34:24.000000000 -0500
@@ -46,7 +46,7 @@ static int map_address (CPUState *env, t
     tlb_t *tlb;
     int i, n;
 
-    for (i = 0; i < MIPS_TLB_NB; i++) {
+    for (i = 0; i < env->tlb_in_use; i++) {
         tlb = &env->tlb[i];
         /* Check ASID, virtual page number & size */
         if ((tlb->G == 1 || tlb->ASID == ASID) &&
Index: qemu/target-mips/mips-defs.h
===================================================================
--- qemu.orig/target-mips/mips-defs.h	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/mips-defs.h	2006-11-12 11:34:24.000000000 -0500
@@ -22,6 +22,7 @@
 /* Uses MIPS R4Kc TLB model */
 #define MIPS_USES_R4K_TLB
 #define MIPS_TLB_NB 16
+#define MIPS_TLB_MAX 128
 /* basic FPU register support */
 #define MIPS_USES_FPU 1
 /* Define a implementation number of 1.
Index: qemu/target-mips/op_helper.c
===================================================================
--- qemu.orig/target-mips/op_helper.c	2006-11-12 11:34:02.000000000 -0500
+++ qemu/target-mips/op_helper.c	2006-11-12 11:42:44.000000000 -0500
@@ -367,7 +367,7 @@ void do_mtc0 (int reg, int sel)
         env->CP0_EntryHi = val;
 	/* If the ASID changes, flush qemu's TLB.  */
 	if ((old & 0xFF) != (val & 0xFF))
-	  tlb_flush (env, 1);
+	  cpu_mips_tlb_flush (env, 1);
         rn = "EntryHi";
         break;
     case 11:
@@ -569,7 +569,14 @@ void fpu_handle_exception(void)
 
 /* TLB management */
 #if defined(MIPS_USES_R4K_TLB)
-static void invalidate_tlb (int idx)
+void cpu_mips_tlb_flush (CPUState *env, int flush_global)
+{
+    /* Flush qemu's TLB and discard all shadowed entries.  */
+    tlb_flush (env, flush_global);
+    env->tlb_in_use = MIPS_TLB_NB;
+}
+
+static void invalidate_tlb (int idx, int use_extra)
 {
     tlb_t *tlb;
     target_ulong addr;
@@ -582,6 +589,15 @@ static void invalidate_tlb (int idx)
         return;
     }
 
+    if (use_extra && env->tlb_in_use < MIPS_TLB_MAX) {
+        /* For tlbwr, we can shadow the discarded entry into
+	   a new (fake) TLB entry, as long as the guest can not
+	   tell that it's there.  */
+        memcpy (&env->tlb[env->tlb_in_use], tlb, sizeof (*tlb));
+        env->tlb_in_use++;
+        return;
+    }
+
     if (tlb->V0) {
         tb_invalidate_page_range(tlb->PFN[0], tlb->end - tlb->VPN);
         addr = tlb->VPN;
@@ -600,6 +616,14 @@ static void invalidate_tlb (int idx)
     }
 }
 
+static void mips_tlb_flush_extra (CPUState *env)
+{
+    tlb_random = 2;
+    while (env->tlb_in_use > MIPS_TLB_NB) {
+        invalidate_tlb(--env->tlb_in_use, 0);
+    }
+}
+
 static void fill_tlb (int idx)
 {
     tlb_t *tlb;
@@ -626,9 +650,14 @@ static void fill_tlb (int idx)
 
 void do_tlbwi (void)
 {
+    /* Discard cached TLB entries.  We could avoid doing this if the
+       tlbwi is just upgrading access permissions on the current entry;
+       that might be a further win.  */
+    mips_tlb_flush_extra (env);
+
     /* Wildly undefined effects for CP0_index containing a too high value and
        MIPS_TLB_NB not being a power of two.  But so does real silicon.  */
-    invalidate_tlb(env->CP0_index & (MIPS_TLB_NB - 1));
+    invalidate_tlb(env->CP0_index & (MIPS_TLB_NB - 1), 0);
     fill_tlb(env->CP0_index & (MIPS_TLB_NB - 1));
 }
 
@@ -636,7 +665,7 @@ void do_tlbwr (void)
 {
     int r = cpu_mips_get_random(env);
 
-    invalidate_tlb(r);
+    invalidate_tlb(r, 1);
     fill_tlb(r);
 }
 
@@ -673,8 +702,10 @@ void do_tlbr (void)
     tlb = &env->tlb[env->CP0_index & (MIPS_TLB_NB - 1)];
 
     /* If this will change the current ASID, flush qemu's TLB.  */
-    if (ASID != tlb->ASID && tlb->G != 1)
-      tlb_flush (env, 1);
+    if (ASID != tlb->ASID)
+        cpu_mips_tlb_flush (env, 1);
+
+    mips_tlb_flush_extra(env);
 
     env->CP0_EntryHi = tlb->VPN | tlb->ASID;
     size = (tlb->end - tlb->VPN) >> 12;
Index: qemu/target-mips/translate.c
===================================================================
--- qemu.orig/target-mips/translate.c	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/translate.c	2006-11-12 11:34:24.000000000 -0500
@@ -2450,6 +2450,7 @@ void cpu_reset (CPUMIPSState *env)
     env->PC = 0xBFC00000;
 #if defined (MIPS_USES_R4K_TLB)
     env->CP0_random = MIPS_TLB_NB - 1;
+    env->tlb_in_use = MIPS_TLB_NB;
 #endif
     env->CP0_Wired = 0;
     env->CP0_Config0 = MIPS_CONFIG0;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 16:56           ` Daniel Jacobowitz
@ 2006-11-12 17:49             ` Daniel Jacobowitz
  2006-11-12 18:02             ` Dirk Behme
  1 sibling, 0 replies; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-12 17:49 UTC (permalink / raw)
  To: qemu-devel

On Sun, Nov 12, 2006 at 11:56:35AM -0500, Daniel Jacobowitz wrote:
> ---
>  target-mips/cpu.h       |    3 ++-
>  target-mips/exec.h      |    1 +
>  target-mips/helper.c    |    2 +-
>  target-mips/mips-defs.h |    1 +
>  target-mips/op_helper.c |   43 +++++++++++++++++++++++++++++++++++++------
>  target-mips/translate.c |    1 +
>  6 files changed, 43 insertions(+), 8 deletions(-)

Let's try that again.

-- 
Daniel Jacobowitz
CodeSourcery

---
 target-mips/cpu.h       |    3 ++-
 target-mips/exec.h      |    1 +
 target-mips/helper.c    |    2 +-
 target-mips/mips-defs.h |    1 +
 target-mips/op_helper.c |   43 +++++++++++++++++++++++++++++++++++++------
 target-mips/translate.c |    1 +
 6 files changed, 43 insertions(+), 8 deletions(-)

Index: qemu/target-mips/cpu.h
===================================================================
--- qemu.orig/target-mips/cpu.h	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/cpu.h	2006-11-12 11:34:24.000000000 -0500
@@ -94,7 +94,8 @@ struct CPUMIPSState {
 		
 #endif
 #if defined(MIPS_USES_R4K_TLB)
-    tlb_t tlb[MIPS_TLB_NB];
+    tlb_t tlb[MIPS_TLB_MAX];
+    uint32_t tlb_in_use;
 #endif
     uint32_t CP0_index;
     uint32_t CP0_random;
Index: qemu/target-mips/exec.h
===================================================================
--- qemu.orig/target-mips/exec.h	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/exec.h	2006-11-12 11:34:24.000000000 -0500
@@ -115,5 +115,6 @@ uint32_t cpu_mips_get_count (CPUState *e
 void cpu_mips_store_count (CPUState *env, uint32_t value);
 void cpu_mips_store_compare (CPUState *env, uint32_t value);
 void cpu_mips_clock_init (CPUState *env);
+void cpu_mips_tlb_flush (CPUState *env, int flush_global);
 
 #endif /* !defined(__QEMU_MIPS_EXEC_H__) */
Index: qemu/target-mips/helper.c
===================================================================
--- qemu.orig/target-mips/helper.c	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/helper.c	2006-11-12 11:34:24.000000000 -0500
@@ -46,7 +46,7 @@ static int map_address (CPUState *env, t
     tlb_t *tlb;
     int i, n;
 
-    for (i = 0; i < MIPS_TLB_NB; i++) {
+    for (i = 0; i < env->tlb_in_use; i++) {
         tlb = &env->tlb[i];
         /* Check ASID, virtual page number & size */
         if ((tlb->G == 1 || tlb->ASID == ASID) &&
Index: qemu/target-mips/mips-defs.h
===================================================================
--- qemu.orig/target-mips/mips-defs.h	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/mips-defs.h	2006-11-12 11:34:24.000000000 -0500
@@ -22,6 +22,7 @@
 /* Uses MIPS R4Kc TLB model */
 #define MIPS_USES_R4K_TLB
 #define MIPS_TLB_NB 16
+#define MIPS_TLB_MAX 128
 /* basic FPU register support */
 #define MIPS_USES_FPU 1
 /* Define a implementation number of 1.
Index: qemu/target-mips/op_helper.c
===================================================================
--- qemu.orig/target-mips/op_helper.c	2006-11-12 11:34:02.000000000 -0500
+++ qemu/target-mips/op_helper.c	2006-11-12 11:49:32.000000000 -0500
@@ -18,6 +18,7 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 #include "exec.h"
+#include <string.h>
 
 #define MIPS_DEBUG_DISAS
 
@@ -367,7 +368,7 @@ void do_mtc0 (int reg, int sel)
         env->CP0_EntryHi = val;
 	/* If the ASID changes, flush qemu's TLB.  */
 	if ((old & 0xFF) != (val & 0xFF))
-	  tlb_flush (env, 1);
+	  cpu_mips_tlb_flush (env, 1);
         rn = "EntryHi";
         break;
     case 11:
@@ -569,7 +570,14 @@ void fpu_handle_exception(void)
 
 /* TLB management */
 #if defined(MIPS_USES_R4K_TLB)
-static void invalidate_tlb (int idx)
+void cpu_mips_tlb_flush (CPUState *env, int flush_global)
+{
+    /* Flush qemu's TLB and discard all shadowed entries.  */
+    tlb_flush (env, flush_global);
+    env->tlb_in_use = MIPS_TLB_NB;
+}
+
+static void invalidate_tlb (int idx, int use_extra)
 {
     tlb_t *tlb;
     target_ulong addr;
@@ -582,6 +590,15 @@ static void invalidate_tlb (int idx)
         return;
     }
 
+    if (use_extra && env->tlb_in_use < MIPS_TLB_MAX) {
+        /* For tlbwr, we can shadow the discarded entry into
+	   a new (fake) TLB entry, as long as the guest can not
+	   tell that it's there.  */
+        memcpy (&env->tlb[env->tlb_in_use], tlb, sizeof (*tlb));
+        env->tlb_in_use++;
+        return;
+    }
+
     if (tlb->V0) {
         tb_invalidate_page_range(tlb->PFN[0], tlb->end - tlb->VPN);
         addr = tlb->VPN;
@@ -600,6 +617,13 @@ static void invalidate_tlb (int idx)
     }
 }
 
+static void mips_tlb_flush_extra (CPUState *env)
+{
+    while (env->tlb_in_use > MIPS_TLB_NB) {
+        invalidate_tlb(--env->tlb_in_use, 0);
+    }
+}
+
 static void fill_tlb (int idx)
 {
     tlb_t *tlb;
@@ -626,9 +650,14 @@ static void fill_tlb (int idx)
 
 void do_tlbwi (void)
 {
+    /* Discard cached TLB entries.  We could avoid doing this if the
+       tlbwi is just upgrading access permissions on the current entry;
+       that might be a further win.  */
+    mips_tlb_flush_extra (env);
+
     /* Wildly undefined effects for CP0_index containing a too high value and
        MIPS_TLB_NB not being a power of two.  But so does real silicon.  */
-    invalidate_tlb(env->CP0_index & (MIPS_TLB_NB - 1));
+    invalidate_tlb(env->CP0_index & (MIPS_TLB_NB - 1), 0);
     fill_tlb(env->CP0_index & (MIPS_TLB_NB - 1));
 }
 
@@ -636,7 +665,7 @@ void do_tlbwr (void)
 {
     int r = cpu_mips_get_random(env);
 
-    invalidate_tlb(r);
+    invalidate_tlb(r, 1);
     fill_tlb(r);
 }
 
@@ -673,8 +702,10 @@ void do_tlbr (void)
     tlb = &env->tlb[env->CP0_index & (MIPS_TLB_NB - 1)];
 
     /* If this will change the current ASID, flush qemu's TLB.  */
-    if (ASID != tlb->ASID && tlb->G != 1)
-      tlb_flush (env, 1);
+    if (ASID != tlb->ASID)
+        cpu_mips_tlb_flush (env, 1);
+
+    mips_tlb_flush_extra(env);
 
     env->CP0_EntryHi = tlb->VPN | tlb->ASID;
     size = (tlb->end - tlb->VPN) >> 12;
Index: qemu/target-mips/translate.c
===================================================================
--- qemu.orig/target-mips/translate.c	2006-11-12 11:34:01.000000000 -0500
+++ qemu/target-mips/translate.c	2006-11-12 11:34:24.000000000 -0500
@@ -2450,6 +2450,7 @@ void cpu_reset (CPUMIPSState *env)
     env->PC = 0xBFC00000;
 #if defined (MIPS_USES_R4K_TLB)
     env->CP0_random = MIPS_TLB_NB - 1;
+    env->tlb_in_use = MIPS_TLB_NB;
 #endif
     env->CP0_Wired = 0;
     env->CP0_Config0 = MIPS_CONFIG0;

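To make the shadow-entry trick in the patch concrete, here is a small, self-contained model (illustrative only, not QEMU code: the struct and helper names are simplified stand-ins for tlb_t, map_address, do_tlbwr and mips_tlb_flush_extra). A tlbwr-style replacement parks the evicted entry in a shadow slot past MIPS_TLB_NB instead of flushing its translations, lookups scan up to tlb_in_use, and any guest-visible TLB operation discards the shadows so the guest can never observe them:

```c
#include <assert.h>
#include <string.h>

#define MIPS_TLB_NB  16   /* architected entries the guest can see */
#define MIPS_TLB_MAX 128  /* architected + shadow entries */

typedef struct {
    unsigned vpn;   /* virtual page number */
    unsigned pfn;   /* physical frame number */
    int      valid;
} tlb_entry;

typedef struct {
    tlb_entry tlb[MIPS_TLB_MAX];
    unsigned  tlb_in_use;    /* first free shadow slot */
} mips_env;

static void env_reset(mips_env *env)
{
    memset(env, 0, sizeof *env);
    env->tlb_in_use = MIPS_TLB_NB;
}

/* map_address analogue: scan architected *and* shadow entries. */
static int lookup(mips_env *env, unsigned vpn, unsigned *pfn)
{
    unsigned i;
    for (i = 0; i < env->tlb_in_use; i++) {
        if (env->tlb[i].valid && env->tlb[i].vpn == vpn) {
            *pfn = env->tlb[i].pfn;
            return 1;
        }
    }
    return 0;
}

/* tlbwr analogue: before overwriting the chosen slot, shadow the old
   entry so already-translated code keeps hitting it. */
static void tlbwr(mips_env *env, unsigned idx, tlb_entry new_entry)
{
    if (env->tlb[idx].valid && env->tlb_in_use < MIPS_TLB_MAX)
        env->tlb[env->tlb_in_use++] = env->tlb[idx];
    env->tlb[idx] = new_entry;
}

/* Guest-visible TLB op (tlbwi, tlbr, ASID change): shadows must go. */
static void flush_extra(mips_env *env)
{
    while (env->tlb_in_use > MIPS_TLB_NB)
        env->tlb[--env->tlb_in_use].valid = 0;
}
```

The win is that a Linux-style tlbwr of a new mapping no longer forces tlb_flush_page on the entry it happens to evict; the expensive flush is deferred until the guest does something that could expose the shadow.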

* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 16:56           ` Daniel Jacobowitz
  2006-11-12 17:49             ` Daniel Jacobowitz
@ 2006-11-12 18:02             ` Dirk Behme
  2006-11-12 22:13               ` Daniel Jacobowitz
  1 sibling, 1 reply; 16+ messages in thread
From: Dirk Behme @ 2006-11-12 18:02 UTC (permalink / raw)
  To: qemu-devel

Daniel Jacobowitz wrote:
> This is with all of the patches I've posted to the
> list applied

Once the patches settle down, it would be nice to get a list of 
the patches (or one summary patch) saying in which order they 
apply and against which base. It seems I mixed up applying the 
correct ones in the correct order ;)

Many thanks

Dirk


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12  1:10   ` Daniel Jacobowitz
  2006-11-12 11:49     ` Laurent Desnogues
@ 2006-11-12 20:42     ` Paul Brook
  1 sibling, 0 replies; 16+ messages in thread
From: Paul Brook @ 2006-11-12 20:42 UTC (permalink / raw)
  To: qemu-devel

> A patch which flushes up to 1/32nd of the jump buffer indiscriminately
> cuts it to 1:11-1:13.
>
> Here's that last patch.  It changes the hash function so that entries
> from a particular page are always grouped together in tb_jmp_cache,
> then finds the possibly two affected ranges and memsets them clear.
> Thoughts?  Is this acceptable, and where else should it be tested besides
> MIPS?  I haven't fine-tuned the numbers; it currently allows for max 64
> cached jump targets per target page, but that could be made higher or
> lower.
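A sketch of that hashing scheme follows (sizes and identifier names here are illustrative, not necessarily those of the applied patch). The upper index bits are derived only from the page address, so all cached jump targets of one guest page occupy one contiguous run of slots; clearing a page then takes two memsets, one for the page itself and one for the preceding page, since a translation block may start there and spill over:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TARGET_PAGE_BITS 12
#define JMP_CACHE_BITS   12                    /* 4096 slots total   */
#define JMP_PAGE_BITS    (JMP_CACHE_BITS / 2)  /* 64 slots per page  */
#define JMP_CACHE_SIZE   (1u << JMP_CACHE_BITS)
#define JMP_PAGE_SIZE    (1u << JMP_PAGE_BITS)
#define JMP_ADDR_MASK    (JMP_PAGE_SIZE - 1)
#define JMP_PAGE_MASK    (JMP_CACHE_SIZE - JMP_PAGE_SIZE)

static void *jmp_cache[JMP_CACHE_SIZE];

/* First slot of the 64-slot run belonging to pc's page; always a
   multiple of JMP_PAGE_SIZE, and identical for every pc in the page. */
static unsigned hash_page(uint32_t pc)
{
    uint32_t tmp = pc ^ (pc >> (TARGET_PAGE_BITS - JMP_PAGE_BITS));
    return (tmp >> JMP_PAGE_BITS) & JMP_PAGE_MASK;
}

/* Full cache index: page bits on top, offset-derived bits below. */
static unsigned hash_func(uint32_t pc)
{
    uint32_t tmp = pc ^ (pc >> (TARGET_PAGE_BITS - JMP_PAGE_BITS));
    return ((tmp >> JMP_PAGE_BITS) & JMP_PAGE_MASK) | (tmp & JMP_ADDR_MASK);
}

/* tlb_flush_page analogue: wipe the page's run and the previous
   page's run, instead of scanning or clearing the whole cache. */
static void flush_jmp_cache_page(uint32_t addr)
{
    memset(&jmp_cache[hash_page(addr)], 0,
           JMP_PAGE_SIZE * sizeof jmp_cache[0]);
    memset(&jmp_cache[hash_page(addr - (1u << TARGET_PAGE_BITS))], 0,
           JMP_PAGE_SIZE * sizeof jmp_cache[0]);
}
```

With 6 bits per page this caps the cache at 64 jump targets per guest page, which is the tuning knob mentioned above.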

I've applied this patch.  It seems a reasonable compromise solution.

I tested a couple of different x86 guests, and couldn't measure any 
significant difference in performance.

Paul


* Re: [Qemu-devel] [PATCH] Huge TLB performance improvement
  2006-11-12 18:02             ` Dirk Behme
@ 2006-11-12 22:13               ` Daniel Jacobowitz
  0 siblings, 0 replies; 16+ messages in thread
From: Daniel Jacobowitz @ 2006-11-12 22:13 UTC (permalink / raw)
  To: qemu-devel

On Sun, Nov 12, 2006 at 07:02:55PM +0100, Dirk Behme wrote:
> Daniel Jacobowitz wrote:
> >This is with all of the patches I've posted to the
> >list applied
> 
> Once the patches settle down, it would be nice to get a list of 
> the patches (or one summary patch) saying in which order they 
> apply and against which base. It seems I mixed up applying the 
> correct ones in the correct order ;)

I'll do that, though not until Paul is finished applying whichever ones
he has time for :-)

-- 
Daniel Jacobowitz
CodeSourcery


end of thread, other threads:[~2006-11-12 22:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-03-06 14:59 [Qemu-devel] [PATCH] Huge TLB performance improvement Thiemo Seufer
2006-11-05 15:38 ` Daniel Jacobowitz
2006-11-12  1:10   ` Daniel Jacobowitz
2006-11-12 11:49     ` Laurent Desnogues
2006-11-12 13:52       ` Thiemo Seufer
2006-11-12 14:08       ` Paul Brook
2006-11-12 14:29         ` Thiemo Seufer
2006-11-12 14:44           ` Paul Brook
2006-11-12 15:07             ` Daniel Jacobowitz
2006-11-12 15:24               ` Daniel Jacobowitz
2006-11-12 15:26             ` Thiemo Seufer
2006-11-12 16:56           ` Daniel Jacobowitz
2006-11-12 17:49             ` Daniel Jacobowitz
2006-11-12 18:02             ` Dirk Behme
2006-11-12 22:13               ` Daniel Jacobowitz
2006-11-12 20:42     ` Paul Brook

This is a public inbox; see the mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for its NNTP newsgroup(s).