Re: [Qemu-devel] [PATCH v5 12/18] qht: QEMU's fast, resizable and scalable Hash Table

From: Sergey Fedorov <serge.fdrv@gmail.com>
To: "Emilio G. Cota" <cota@braap.org>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
	"MTTCG Devel" <mttcg@listserver.greensocs.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
	"Richard Henderson" <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v5 12/18] qht: QEMU's fast, resizable and scalable Hash Table
Date: Mon, 23 May 2016 22:29:51 +0300
Message-ID: <57435A2F.9040708@gmail.com>
In-Reply-To: <20160521024811.GA29744@flamenco>

On 21/05/16 05:48, Emilio G. Cota wrote:
> On Sat, May 21, 2016 at 01:13:20 +0300, Sergey Fedorov wrote:
>> Although the API is mostly intuitive some kernel-doc-style comments
>> wouldn’t hurt, I think. ;-)
> The nit that bothered me is the "external lock needed" bit, but it's
> removed by the subsequent patch (which once it gets reviewed should be merged
> onto this patch); I think the interface is simple enough that comments
> would just add noise and maintenance burden. Plus, there are tests under
> tests/.

The interface is simple enough, but e.g. the return value convention for
some of the functions may not be clear at first glance. As for the
maintenance burden, once we have a good stable API, keeping the comments
up to date shouldn't be painful.

> (snip)
>>> +/* define these to keep sizeof(qht_bucket) within QHT_BUCKET_ALIGN */
>>> +#if HOST_LONG_BITS == 32
>>> +#define QHT_BUCKET_ENTRIES 6
>>> +#else /* 64-bit */
>>> +#define QHT_BUCKET_ENTRIES 4
>>> +#endif
>>> +
>>> +struct qht_bucket {
>>> +    QemuSpin lock;
>>> +    QemuSeqLock sequence;
>>> +    uint32_t hashes[QHT_BUCKET_ENTRIES];
>>> +    void *pointers[QHT_BUCKET_ENTRIES];
>>> +    struct qht_bucket *next;
>>> +} QEMU_ALIGNED(QHT_BUCKET_ALIGN);
>>> +
>>> +QEMU_BUILD_BUG_ON(sizeof(struct qht_bucket) > QHT_BUCKET_ALIGN);
>> Have you considered using separate structures for head buckets and
>> non-head buckets, e.g. "struct qht_head_bucket" and "struct
>> qht_added_bucket"? This would give us a few more entries per cache line.
> I considered it. Note however that the gain would only apply to
> 32-bit hosts, since on 64-bit we'd only save 8 bytes but we'd
> need 12 to store hash+pointer. (lock+sequence=8, hashes=4*4=16,
> pointers=4*8=32, next=8, that is 8+16+32+8=32+32=64).
>
> On 32-bits with 6 entries we have 4 bytes of waste; we could squeeze in
> an extra entry. I'm reluctant to do this because (1) it would complicate
> code and (2) I don't think we should care too much about performance on
> 32-bit hosts.

Fair enough.
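
For reference, the size arithmetic quoted above can be checked with a small
layout model. This is only a sketch under stated assumptions: it takes
QemuSpin and QemuSeqLock to be 4 bytes each (per the lock+sequence=8 figure
above), uses fixed-width integers in place of pointers to model 32- and
64-bit hosts, and the names bucket32, bucket64 and spare_bytes are made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUCKET_ALIGN 64 /* assumed cache-line size, stands in for QHT_BUCKET_ALIGN */

/* 64-bit host model: 4 entries, pointers are 8 bytes wide. */
struct bucket64 {
    uint32_t lock;          /* assumed 4-byte spinlock */
    uint32_t sequence;      /* assumed 4-byte seqlock counter */
    uint32_t hashes[4];
    uint64_t pointers[4];
    uint64_t next;
};

/* 32-bit host model: 6 entries, pointers are 4 bytes wide. */
struct bucket32 {
    uint32_t lock;
    uint32_t sequence;
    uint32_t hashes[6];
    uint32_t pointers[6];
    uint32_t next;
};

_Static_assert(sizeof(struct bucket64) <= BUCKET_ALIGN, "64-bit bucket too big");
_Static_assert(sizeof(struct bucket32) <= BUCKET_ALIGN, "32-bit bucket too big");

/* Bytes left unused in the cache line by a bucket of the given size. */
static size_t spare_bytes(size_t bucket_size)
{
    assert(bucket_size <= BUCKET_ALIGN);
    return BUCKET_ALIGN - bucket_size;
}
```

With these models, spare_bytes(sizeof(struct bucket64)) is 0 and
spare_bytes(sizeof(struct bucket32)) is 4. Dropping the 8-byte next pointer
on a 64-bit host would free only 8 bytes, less than the 12 needed for one
more hash+pointer pair.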

> (snip)
>>> +static inline
>>> +void *qht_do_lookup(struct qht_bucket *head, qht_lookup_func_t func,
>>> +                    const void *userp, uint32_t hash)
>>> +{
>>> +    struct qht_bucket *b = head;
>>> +    int i;
>>> +
>>> +    do {
>>> +        for (i = 0; i < QHT_BUCKET_ENTRIES; i++) {
>>> +            if (atomic_read(&b->hashes[i]) == hash) {
>>> +                void *p = atomic_read(&b->pointers[i]);
>> Why do we need this atomic_read() and the other (somewhat inconsistent-
>> looking) atomic operations on 'b->pointers' and 'b->hashes' if we always
>> have to access them properly protected by a seqlock and a spinlock?
> [ There should be consistency: read accesses use the atomic ops to read,
>   while write accesses have acquired the bucket lock so don't need them.
>   Well, they need care when they write, since there may be concurrent
>   readers. ]

Well, I see the consistency now =)
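
To spell out the reader/writer split for my own benefit, here is a generic
seqlock sketch in C11 atomics. It is not the QemuSeqLock API and not the
patch's code: struct slot, slot_write and slot_read are hypothetical names,
the bucket spinlock is assumed to be held by the writer and is not modeled,
and every access is made atomic because plain C11 requires that to avoid
data races (the patch uses plain writes under the lock instead).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-entry "bucket"; the real qht_bucket holds several. */
struct slot {
    atomic_uint sequence;       /* even: stable, odd: write in progress */
    _Atomic uint32_t hash;
    void *_Atomic pointer;
};

/* Writer side. The caller is assumed to hold the bucket lock (not modeled). */
static void slot_write(struct slot *s, uint32_t hash, void *p)
{
    unsigned seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);

    atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->hash, hash, memory_order_relaxed);
    atomic_store_explicit(&s->pointer, p, memory_order_relaxed);
    atomic_store_explicit(&s->sequence, seq + 2, memory_order_release);
}

/* Reader side: takes no lock, retries if a writer was active meanwhile. */
static void *slot_read(struct slot *s, uint32_t hash)
{
    unsigned begin;
    void *p;

    do {
        do {
            begin = atomic_load_explicit(&s->sequence, memory_order_acquire);
        } while (begin & 1);    /* wait for any in-flight write */

        p = NULL;
        if (atomic_load_explicit(&s->hash, memory_order_relaxed) == hash) {
            p = atomic_load_explicit(&s->pointer, memory_order_relaxed);
        }
        atomic_thread_fence(memory_order_acquire);
    } while (atomic_load_explicit(&s->sequence, memory_order_relaxed) != begin);

    return p;
}
```

The reader may observe a half-updated entry, but the sequence check at the
end of the loop makes it discard that result and try again.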

> I'm using atomic_read but what I really want is ACCESS_ONCE. That is:
> (1) Make sure that the accesses are done in a single instruction (even
>     though gcc no longer explicitly guarantees this, even for aligned
>     addresses[1])
> (2) Make sure the pointer value is only read once, and never refetched.
>     This is what comes right after the pointer is read:
>> +                if (likely(p) && likely(func(p, userp))) {
>> +                    return p;
>> +                }
>     Refetching the pointer value might result in us passing a NULL p
>     value to the comparison function (since there may be concurrent
>     updaters!), with an immediate segfault. See [2] for a
>     discussion on this (essentially the compiler assumes that there's
>     only a single thread).
>
> Given that even reading a garbled hash is OK (we don't really need (1),
> since the seqlock will make us retry anyway), I've changed the code to:
>
>          for (i = 0; i < QHT_BUCKET_ENTRIES; i++) {
> -            if (atomic_read(&b->hashes[i]) == hash) {
> +            if (b->hashes[i] == hash) {
> +                /* make sure the pointer is read only once */
>                  void *p = atomic_read(&b->pointers[i]);
>
>                  if (likely(p) && likely(func(p, userp))) {
>
> Performance-wise, this is the impact over 10 runs of:
> 	$ taskset -c 0 tests/qht-bench \
> 	  -d 5 -n 1 -u 0 -k 4096 -K 4096 -l 4096 -r 4096 -s 4096
> On my Haswell machine I get, in Mops/s:
> 	atomic_read() for all		40.389 +- 0.20888327415622
> 	atomic_read(p) only		40.759 +- 0.212835356294224
> 	no atomic_read(p) (unsafe)	40.559 +- 0.121422128680622
>
> Note that the unsafe version is slightly slower; I guess the CPU is trying
> to speculate too much and is gaining little from it.
>
> [1] "Linux-Kernel Memory Model" by Paul McKenney
>     http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4374.html
> [2] https://lwn.net/Articles/508991/

Okay.
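
Just to make the "read the pointer only once" point concrete, a minimal
standalone sketch of the lookup loop follows. It assumes a volatile cast as
a stand-in for ACCESS_ONCE/atomic_read, leaves out the seqlock retry
entirely, and the names struct bucket, read_once, match_int and
bucket_lookup are hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4   /* stands in for QHT_BUCKET_ENTRIES on a 64-bit host */

struct bucket {
    uint32_t hashes[ENTRIES];
    void *pointers[ENTRIES];
    struct bucket *next;
};

typedef int (*lookup_func_t)(const void *obj, const void *userp);

/* Force exactly one load of *pp; the compiler may not refetch it later. */
static void *read_once(void *const *pp)
{
    return *(void *const volatile *)pp;
}

/* Example comparison callback: both arguments point to ints. */
static int match_int(const void *obj, const void *userp)
{
    return *(const int *)obj == *(const int *)userp;
}

static void *bucket_lookup(const struct bucket *head, lookup_func_t func,
                           const void *userp, uint32_t hash)
{
    const struct bucket *b = head;

    do {
        for (int i = 0; i < ENTRIES; i++) {
            if (b->hashes[i] == hash) {     /* a stale hash is harmless here */
                /* p is a private copy: the NULL check and the call to
                 * func() are guaranteed to see the same value. */
                void *p = read_once(&b->pointers[i]);

                if (p && func(p, userp)) {
                    return p;
                }
            }
        }
        b = b->next;
    } while (b);

    return NULL;
}
```

Without read_once(), the compiler would be free to test b->pointers[i]
against NULL and then load it again for the call, which is exactly the
window a concurrent updater could hit.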

> (snip)
>>> +/*
>>> + * Find the last valid entry in @head, and swap it with @orig[pos], which has
>>> + * just been invalidated.
>>> + */
>>> +static inline void qht_bucket_fill_hole(struct qht_bucket *orig, int pos)
>>> +{
>>> +    struct qht_bucket *b = orig;
>>> +    struct qht_bucket *prev = NULL;
>>> +    int i;
>>> +
>>> +    if (qht_entry_is_last(orig, pos)) {
>>> +        return;
>>> +    }
>>> +    do {
>>> +        for (i = 0; i < QHT_BUCKET_ENTRIES; i++) {
>> We could iterate in the opposite direction: from the last entry in a
>> qht_bucket to the first. That would let us fast-forward to the next
>> qht_bucket in the chain when the last entry is non-NULL, and so speed
>> up the search.
> But it would slow us down if--say--only the first entry is set. Also
> it would complicate the code a bit.
>
> Note that with the resizing threshold that we have, we're guaranteed to
> have only up to 1/8 of the head buckets full. We should therefore optimize
> for the case where the head bucket isn't full.

Okay.
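
For the record, the forward scan being defended here can be sketched as
below. It assumes entries are kept compact (no NULL holes before the last
valid entry, which is what filling the hole is meant to preserve), and
struct chain_bucket and last_valid_entry are hypothetical names, not the
patch's.

```c
#include <assert.h>
#include <stddef.h>

#define ENTRIES 4   /* stands in for QHT_BUCKET_ENTRIES */

struct chain_bucket {
    void *pointers[ENTRIES];
    struct chain_bucket *next;
};

/*
 * Scan forward from @head and report the last non-NULL entry in the chain.
 * Returns its index and stores its bucket in *last_b, or returns -1 (with
 * *last_b set to NULL) if the chain is empty. Stops at the first NULL
 * entry, so a sparsely filled head bucket costs only a few comparisons.
 */
static int last_valid_entry(struct chain_bucket *head,
                            struct chain_bucket **last_b)
{
    struct chain_bucket *b = head;
    int last_i = -1;

    *last_b = NULL;
    do {
        for (int i = 0; i < ENTRIES; i++) {
            if (b->pointers[i] == NULL) {
                return last_i;
            }
            *last_b = b;
            last_i = i;
        }
        b = b->next;
    } while (b);

    return last_i;
}
```

With a mostly empty head bucket this returns after one or two iterations,
whereas a backward scan would have to walk all the NULL slots first.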


Kind regards,
Sergey
