All of lore.kernel.org
 help / color / mirror / Atom feed
From: Richard Henderson <rth@twiddle.net>
To: "Emilio G. Cota" <cota@braap.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	MTTCG Devel <mttcg@greensocs.com>
Cc: "Alex Bennée" <alex.bennee@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Sergey Fedorov" <serge.fdrv@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump
Date: Fri, 22 Apr 2016 12:59:52 -0700	[thread overview]
Message-ID: <571A82B8.5080908@twiddle.net> (raw)
In-Reply-To: <571A6245.4070209@twiddle.net>

[-- Attachment #1: Type: text/plain, Size: 2222 bytes --]

On 04/22/2016 10:41 AM, Richard Henderson wrote:
> On 04/19/2016 04:07 PM, Emilio G. Cota wrote:
>> +    ht_avg_len = qht_avg_bucket_chain_length(&tcg_ctx.tb_ctx.htable, &ht_heads);
>> +    cpu_fprintf(f, "TB hash avg chain   %0.5f buckets\n", ht_avg_len);
>> +    cpu_fprintf(f, "TB hash size        %zu head buckets\n", ht_heads);
> 
> 
> I think the accounting is questionable here.
> 
> Consider the following data:
> 
> TB count             230467/671088
> TB invalidate count  25915
> TB hash avg chain    1.03073 buckets
> TB hash size         131072 head buckets
> 
> This means that we've got 230467 - 25915 = 204552 active TB's, installed into a
> hash table with 131072 heads.  For a perfectly uniform distribution of TBs,
> that would be an average chain length of 204552 / 131072 = 1.56.
> 
> In order to get the average down to 1.03, there need to be a substantial number
> of heads with zero entries.
> 
> I think perhaps it might be more enlightening to separately account for empty
> and non-empty heads.  E.g.
> 
> TB hash buckets used  xxxx/131072
> TB hash avg chain     yyyy
> 
> where xxxx is the number of non-empty heads, and yyyy = |TBs| / xxxx.
> 
> I also wonder if it wouldn't be better to size the hash table as appropriate
> for the maximum number of allowable TBs.

FWIW, so that I could get an idea of how the stats change as we improve the
hashing, I inserted the attachment 1 patch between patches 5 and 6, and with
attachment 2 attempting to fix the accounting for patches 9 and 10.

For booting an alpha kernel to login prompt:

Before hashing changes (@5/11)

TB count             175363/671088
TB invalidate count  3996
TB hash buckets      31731/32768
TB hash avg chain    5.289 max=59

After xxhash patch (@7/11)

TB hash buckets      32582/32768
TB hash avg chain    5.260 max=18

So far so good!

After qht patches (@11/11)

TB hash buckets      94360/131072
TB hash avg chain    1.774 max=8

Do note that those last numbers are off: 1.774 avg * 94360 used buckets =
167394 total entries, which is far from 171367, the correct number of total
entries.

I'm tempted to pull over gcc's non-chaining hash table implementation
(libiberty/hashtab.c, still gplv2+) and compare...



r~

[-- Attachment #2: z1.txt --]
[-- Type: text/plain, Size: 4375 bytes --]

>From cdc7b3631fd78bd2e31d2823f7543e2a56681149 Mon Sep 17 00:00:00 2001
From: Richard Henderson <rth@twiddle.net>
Date: Fri, 22 Apr 2016 11:28:52 -0700
Subject: translate-all: Add hashtable accounting to info jit

Dump hash table occupancy numbers with "info jit".

Signed-off-by: Richard Henderson <rth@twiddle.net>

diff --git a/translate-all.c b/translate-all.c
index 1a8f68b..ed296d5 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1671,39 +1671,55 @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
 
 void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
 {
-    int i, target_code_size, max_target_code_size;
-    int direct_jmp_count, direct_jmp2_count, cross_page;
+    size_t target_code_size, max_target_code_size;
+    unsigned direct_jmp_count, direct_jmp2_count, cross_page;
+    unsigned used_buckets, max_chain, hash_tbs;
     TranslationBlock *tb;
+    int i;
 
     target_code_size = 0;
     max_target_code_size = 0;
     cross_page = 0;
     direct_jmp_count = 0;
     direct_jmp2_count = 0;
-    for (i = 0; i < tcg_ctx.tb_ctx.nb_tbs; i++) {
-        tb = &tcg_ctx.tb_ctx.tbs[i];
-        target_code_size += tb->size;
-        if (tb->size > max_target_code_size) {
-            max_target_code_size = tb->size;
-        }
-        if (tb->page_addr[1] != -1) {
-            cross_page++;
-        }
-        if (tb->tb_next_offset[0] != 0xffff) {
-            direct_jmp_count++;
-            if (tb->tb_next_offset[1] != 0xffff) {
-                direct_jmp2_count++;
+    used_buckets = 0;
+    hash_tbs = 0;
+    max_chain = 0;
+
+    for (i = 0; i < CODE_GEN_PHYS_HASH_SIZE; i++) {
+        if (tcg_ctx.tb_ctx.tb_phys_hash[i]) {
+            unsigned this_chain = 0;
+            for (tb = tcg_ctx.tb_ctx.tb_phys_hash[i]; tb != NULL;
+                 tb = tb->phys_hash_next) {
+                this_chain++;
+                hash_tbs++;
+                target_code_size += tb->size;
+                if (tb->page_addr[1] != -1) {
+                    cross_page++;
+                }
+                if (tb->tb_next_offset[0] != 0xffff) {
+                    direct_jmp_count++;
+                    if (tb->tb_next_offset[1] != 0xffff) {
+                        direct_jmp2_count++;
+                    }
+                }
             }
+            if (this_chain > max_chain) {
+                max_chain = this_chain;
+            }
+            used_buckets++;
         }
     }
-    /* XXX: avoid using doubles ? */
+    assert(hash_tbs ==
+           tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count);
+
     cpu_fprintf(f, "Translation buffer state:\n");
     cpu_fprintf(f, "gen code size       %td/%zd\n",
                 tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer,
                 tcg_ctx.code_gen_highwater - tcg_ctx.code_gen_buffer);
     cpu_fprintf(f, "TB count            %d/%d\n",
             tcg_ctx.tb_ctx.nb_tbs, tcg_ctx.code_gen_max_blocks);
-    cpu_fprintf(f, "TB avg target size  %d max=%d bytes\n",
+    cpu_fprintf(f, "TB avg target size  %zd max=%zd bytes\n",
             tcg_ctx.tb_ctx.nb_tbs ? target_code_size /
                     tcg_ctx.tb_ctx.nb_tbs : 0,
             max_target_code_size);
@@ -1717,13 +1733,18 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cpu_fprintf(f, "cross page TB count %d (%d%%)\n", cross_page,
             tcg_ctx.tb_ctx.nb_tbs ? (cross_page * 100) /
                                     tcg_ctx.tb_ctx.nb_tbs : 0);
-    cpu_fprintf(f, "direct jump count   %d (%d%%) (2 jumps=%d %d%%)\n",
+    cpu_fprintf(f, "direct jump count   %u (%u%%) (2 jumps=%u %u%%)\n",
                 direct_jmp_count,
                 tcg_ctx.tb_ctx.nb_tbs ? (direct_jmp_count * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0,
                 direct_jmp2_count,
                 tcg_ctx.tb_ctx.nb_tbs ? (direct_jmp2_count * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0);
+    cpu_fprintf(f, "TB hash buckets     %u/%d\n",
+                used_buckets, CODE_GEN_PHYS_HASH_SIZE);
+    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%u\n",
+                used_buckets ? (double)hash_tbs / used_buckets : 0.0,
+                max_chain);
     cpu_fprintf(f, "\nStatistics:\n");
     cpu_fprintf(f, "TB flush count      %d\n", tcg_ctx.tb_ctx.tb_flush_count);
     cpu_fprintf(f, "TB invalidate count %d\n",

[-- Attachment #3: z2.txt --]
[-- Type: text/plain, Size: 5743 bytes --]

>From 7f1d677f3d085b5891e1adbd5f602185d68ba81a Mon Sep 17 00:00:00 2001
From: Richard Henderson <rth@twiddle.net>
Date: Fri, 22 Apr 2016 12:50:00 -0700
Subject: fixup to {09,10,11}/11


diff --git a/include/qemu/qht.h b/include/qemu/qht.h
index a0a1aa8..2d0b58f 100644
--- a/include/qemu/qht.h
+++ b/include/qemu/qht.h
@@ -49,6 +49,14 @@ void qht_grow(struct qht *ht);
 
 void *qht_lookup(struct qht *ht, qht_lookup_func_t func, const void *userp,
                  uint32_t hash);
-double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets);
+
+struct qht_stats {
+    size_t used_buckets;
+    size_t max_buckets;
+    size_t used_entries;
+    size_t max_chain;
+};
+
+struct qht_stats qht_statistics(struct qht *ht);
 
 #endif /* QHT_H */
diff --git a/translate-all.c b/translate-all.c
index 3b73b46..a9ceb0a 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1664,8 +1664,7 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
 {
     size_t target_code_size, max_target_code_size;
     unsigned direct_jmp_count, direct_jmp2_count, cross_page;
-    unsigned used_buckets, max_chain, hash_tbs;
-    TranslationBlock *tb;
+    struct qht_stats hinfo;
     int i;
 
     target_code_size = 0;
@@ -1673,35 +1672,23 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cross_page = 0;
     direct_jmp_count = 0;
     direct_jmp2_count = 0;
-    used_buckets = 0;
-    hash_tbs = 0;
-    max_chain = 0;
-
-    for (i = 0; i < CODE_GEN_PHYS_HASH_SIZE; i++) {
-        if (tcg_ctx.tb_ctx.tb_phys_hash[i]) {
-            unsigned this_chain = 0;
-            for (tb = tcg_ctx.tb_ctx.tb_phys_hash[i]; tb != NULL;
-                 tb = tb->phys_hash_next) {
-                this_chain++;
-                hash_tbs++;
-                target_code_size += tb->size;
-                if (tb->page_addr[1] != -1) {
-                    cross_page++;
-                }
-                if (tb->tb_next_offset[0] != 0xffff) {
-                    direct_jmp_count++;
-                    if (tb->tb_next_offset[1] != 0xffff) {
-                        direct_jmp2_count++;
-                    }
-                }
-            }
-            if (this_chain > max_chain) {
-                max_chain = this_chain;
+
+    for (i = 0; i < tcg_ctx.tb_ctx.nb_tbs; i++) {
+        const TranslationBlock *tb = &tcg_ctx.tb_ctx.tbs[i];
+        target_code_size += tb->size;
+        if (tb->page_addr[1] != -1) {
+            cross_page++;
+        }
+        if (tb->tb_next_offset[0] != 0xffff) {
+            direct_jmp_count++;
+            if (tb->tb_next_offset[1] != 0xffff) {
+                direct_jmp2_count++;
             }
-            used_buckets++;
         }
     }
-    assert(hash_tbs ==
+
+    hinfo = qht_statistics(&tcg_ctx.tb_ctx.htable);
+    assert(hinfo.used_entries ==
            tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count);
 
     cpu_fprintf(f, "Translation buffer state:\n");
@@ -1731,11 +1718,12 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
                 direct_jmp2_count,
                 tcg_ctx.tb_ctx.nb_tbs ? (direct_jmp2_count * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0);
-    cpu_fprintf(f, "TB hash buckets     %u/%d\n",
-                used_buckets, CODE_GEN_PHYS_HASH_SIZE);
-    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%u\n",
-                used_buckets ? (double)hash_tbs / used_buckets : 0.0,
-                max_chain);
+    cpu_fprintf(f, "TB hash buckets     %zu/%zu\n",
+                hinfo.used_buckets, hinfo.max_buckets);
+    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%zu\n",
+                hinfo.used_buckets
+                ? (double)hinfo.used_entries / hinfo.used_buckets : 0.0,
+                hinfo.max_chain);
     cpu_fprintf(f, "\nStatistics:\n");
     cpu_fprintf(f, "TB flush count      %d\n", tcg_ctx.tb_ctx.tb_flush_count);
     cpu_fprintf(f, "TB invalidate count %d\n",
diff --git a/util/qht.c b/util/qht.c
index 05ea5e8..535057b 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -556,35 +556,45 @@ void qht_grow(struct qht *ht)
  * value should be close to 1.
  * Note that each bucket tracks up to QHT_BUCKET_ENTRIES items.
  */
-double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets)
+struct qht_stats qht_statistics(struct qht *ht)
 {
     struct qht_map *map;
-    size_t count = 0;
-    size_t i;
+    struct qht_stats s = {};
+    size_t i, n;
 
     map = atomic_read(&ht->map);
     /* paired with smp_wmb() before setting ht->map */
     smp_rmb();
+    s.max_buckets = n = map->n;
 
-    for (i = 0; i < map->n; i++) {
+    for (i = 0; i < n; i++) {
         struct qht_bucket *head = &map->buckets[i];
         struct qht_bucket *b;
-        size_t bucket_count;
+        size_t this_chain;
         uint32_t version;
 
         do {
             version = seqlock_read_begin(&head->sequence);
-            bucket_count = 0;
+            this_chain = 0;
             b = head;
             do {
-                bucket_count++;
+                int j;
+                for (j = 0; j < QHT_BUCKET_ENTRIES; j++) {
+                    if (b->hashes[j]) {
+                        this_chain++;
+                    }
+                }
                 b = b->next;
             } while (b);
         } while (seqlock_read_retry(&head->sequence, version));
-        count += bucket_count;
-    }
-    if (n_head_buckets) {
-        *n_head_buckets = map->n;
+        if (this_chain != 0) {
+            s.used_entries += this_chain;
+            if (s.max_chain < this_chain) {
+                s.max_chain = this_chain;
+            }
+            s.used_buckets++;
+        }
     }
-    return (double)count / map->n;
+
+    return s;
 }

  parent reply	other threads:[~2016-04-22 20:00 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-19 23:07 [Qemu-devel] [PATCH v3 00/11] tb hash improvements Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 01/11] compiler.h: add QEMU_ALIGNED() to enforce struct alignment Emilio G. Cota
2016-04-22  9:32   ` Alex Bennée
2016-04-22  9:35   ` Peter Maydell
2016-04-22 15:50     ` Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 02/11] seqlock: remove optional mutex Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 03/11] seqlock: rename write_lock/unlock to write_begin/end Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 04/11] include/processor.h: define cpu_relax() Emilio G. Cota
2016-04-20 12:15   ` KONRAD Frederic
2016-04-20 17:16     ` Emilio G. Cota
2016-04-20 17:18       ` Peter Maydell
2016-04-20 17:32   ` [Qemu-devel] [UPDATED " Emilio G. Cota
2016-04-22  9:35   ` [Qemu-devel] [PATCH " Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 05/11] qemu-thread: add simple test-and-set spinlock Emilio G. Cota
2016-04-20 15:18   ` Richard Henderson
2016-04-20 17:17     ` Emilio G. Cota
2016-04-20 17:55       ` Richard Henderson
2016-04-20 18:11         ` Emilio G. Cota
2016-04-20 19:39           ` Richard Henderson
2016-04-21 16:24             ` Emilio G. Cota
2016-04-20 17:35   ` [Qemu-devel] [UPDATED " Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 06/11] exec: add tb_hash_func5, derived from xxhash Emilio G. Cota
2016-04-20 15:19   ` Richard Henderson
2016-04-22 12:58   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 07/11] tb hash: hash phys_pc, pc, and flags with xxhash Emilio G. Cota
2016-04-22 12:58   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 08/11] qht: QEMU's fast, resizable and scalable Hash Table Emilio G. Cota
2016-04-22 14:04   ` Alex Bennée
2016-04-24 20:01   ` Richard Henderson
2016-04-24 21:58     ` Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 09/11] qht: add test program Emilio G. Cota
2016-04-22 14:35   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 10/11] tb hash: track translated blocks with qht Emilio G. Cota
2016-04-28 13:27   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump Emilio G. Cota
2016-04-20 15:21   ` Richard Henderson
2016-04-22 14:38   ` Alex Bennée
2016-04-22 17:41   ` Richard Henderson
2016-04-22 19:23     ` Emilio G. Cota
2016-04-22 19:59     ` Richard Henderson [this message]
2016-04-22 23:57       ` Emilio G. Cota
2016-04-24 19:46         ` Richard Henderson
2016-04-24 22:06           ` Emilio G. Cota
2016-04-27  2:43             ` Emilio G. Cota
2016-04-28 16:37               ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=571A82B8.5080908@twiddle.net \
    --to=rth@twiddle.net \
    --cc=alex.bennee@linaro.org \
    --cc=cota@braap.org \
    --cc=crosthwaite.peter@gmail.com \
    --cc=mttcg@greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=serge.fdrv@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.