Re: [Qemu-devel] [PATCH 06/10] include: add xxhash.h

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Emilio G. Cota" <cota@braap.org>
Cc: MTTCG Devel <mttcg@listserver.greensocs.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Peter Crosthwaite <crosthwaite.peter@gmail.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Sergey Fedorov <serge.fdrv@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH 06/10] include: add xxhash.h
Date: Wed, 06 Apr 2016 12:39:42 +0100	[thread overview]
Message-ID: <87lh4rarup.fsf@linaro.org> (raw)
In-Reply-To: <1459834253-8291-7-git-send-email-cota@braap.org>


Emilio G. Cota <cota@braap.org> writes:

> xxhash is a fast, high-quality hashing function. The appended
> brings in the 32-bit version of it, with the small modification that
> it assumes the data to be hashed is made of 32-bit chunks; this increases
> speed slightly for the use-case we care about, i.e. tb-hash.
>
> The original algorithm, as well as a 64-bit implementation, can be found at:
>   https://github.com/Cyan4973/xxHash
>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
>  include/qemu/xxhash.h | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 106 insertions(+)
>  create mode 100644 include/qemu/xxhash.h
>
> diff --git a/include/qemu/xxhash.h b/include/qemu/xxhash.h
> new file mode 100644
> index 0000000..a13a665
> --- /dev/null
> +++ b/include/qemu/xxhash.h
> @@ -0,0 +1,106 @@
<snip>
> +
> +/* u32 hash of @n contiguous chunks of u32's */
> +static inline uint32_t qemu_xxh32(const uint32_t *p, size_t n, uint32_t seed)
> +{

What is the point of seed here? I looked on the original site to see if
there was any guidance on tuning seed but couldn't find anything. I
appreciate the compiler can inline the constant away but perhaps we
should #define it and drop the parameter if we are not intending to
modify it?

Also it might be helpful to wrap the call to avoid getting the
boilerplate sizing wrong:

  #define qemu_xxh32(s) qemu_xxh32_impl((const uint32_t *)s, sizeof(*s)/sizeof(uint32_t), 1)

Then calls become a little simpler for the user:

  return qemu_xxh32(&k);

Do we need to include a compile time check for structures that don't
neatly divide into uint32_t chunks?

> +    const uint32_t *end = p + n;
> +    uint32_t h32;
> +
> +    if (n >= 4) {
> +        const uint32_t * const limit = end - 4;
> +        uint32_t v1 = seed + PRIME32_1 + PRIME32_2;
> +        uint32_t v2 = seed + PRIME32_2;
> +        uint32_t v3 = seed + 0;
> +        uint32_t v4 = seed - PRIME32_1;
> +
> +        do {
> +            v1 += *p * PRIME32_2;
> +            v1 = XXH_rotl32(v1, 13);
> +            v1 *= PRIME32_1;
> +            p++;
> +            v2 += *p * PRIME32_2;
> +            v2 = XXH_rotl32(v2, 13);
> +            v2 *= PRIME32_1;
> +            p++;
> +            v3 += *p * PRIME32_2;
> +            v3 = XXH_rotl32(v3, 13);
> +            v3 *= PRIME32_1;
> +            p++;
> +            v4 += *p * PRIME32_2;
> +            v4 = XXH_rotl32(v4, 13);
> +            v4 *= PRIME32_1;
> +            p++;
> +        } while (p <= limit);
> +        h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) +
> +            XXH_rotl32(v4, 18);
> +    } else {
> +        h32  = seed + PRIME32_5;
> +    }

I don't plead any particular knowledge of hashing codes but I note the
test cases we add only exercise the n == 1 path but in actual usage we
exercise n = 5 (at least for arm32, I guess aarch64 would be more).

> +
> +    h32 += n * sizeof(uint32_t);
> +
> +    while (p < end) {
> +        h32 += *p * PRIME32_3;
> +        h32  = XXH_rotl32(h32, 17) * PRIME32_4 ;
> +        p++;
> +    }
> +
> +    h32 ^= h32 >> 15;
> +    h32 *= PRIME32_2;
> +    h32 ^= h32 >> 13;
> +    h32 *= PRIME32_3;
> +    h32 ^= h32 >> 16;
> +
> +    return h32;
> +}
> +
> +#endif /* QEMU_XXHASH_H */


--
Alex Bennée

next prev parent reply	other threads:[~2016-04-06 11:39 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-05  5:30 [Qemu-devel] [PATCH 00/10] tb hash improvements Emilio G. Cota
2016-04-05  5:30 ` [Qemu-devel] [PATCH 01/10] translate-all: add missing fold of tb_ctx into tcg_ctx Emilio G. Cota
2016-04-05  8:49   ` Paolo Bonzini
2016-04-05  5:30 ` [Qemu-devel] [PATCH 02/10] compiler.h: add QEMU_CACHELINE + QEMU_ALIGN() + QEMU_CACHELINE_ALIGNED Emilio G. Cota
2016-04-05  7:57   ` Peter Maydell
2016-04-05 17:24     ` Emilio G. Cota
2016-04-05 18:01       ` Peter Maydell
2016-04-05 19:13         ` Emilio G. Cota
2016-04-05  8:49   ` Paolo Bonzini
2016-04-05 12:57   ` Lluís Vilanova
2016-04-05 12:58     ` Peter Maydell
2016-04-05 15:29       ` Paolo Bonzini
2016-04-05 16:23       ` Lluís Vilanova
2016-04-05 16:31         ` Richard Henderson
2016-04-05 16:56           ` Peter Maydell
2016-04-05 19:02             ` Lluís Vilanova
2016-04-05 19:15               ` Richard Henderson
2016-04-05 20:09                 ` Lluís Vilanova
2016-04-06 11:44                   ` Paolo Bonzini
2016-04-06 12:02                     ` Laurent Desnogues
2016-04-05  5:30 ` [Qemu-devel] [PATCH 03/10] seqlock: remove optional mutex Emilio G. Cota
2016-04-06  8:38   ` Alex Bennée
2016-04-05  5:30 ` [Qemu-devel] [PATCH 04/10] seqlock: rename write_lock/unlock to write_begin/end Emilio G. Cota
2016-04-06  8:42   ` Alex Bennée
2016-04-05  5:30 ` [Qemu-devel] [PATCH 05/10] include: add spinlock wrapper Emilio G. Cota
2016-04-05  8:51   ` Paolo Bonzini
2016-04-06 15:51     ` Alex Bennée
2016-04-05  5:30 ` [Qemu-devel] [PATCH 06/10] include: add xxhash.h Emilio G. Cota
2016-04-06 11:39   ` Alex Bennée [this message]
2016-04-06 22:59     ` Emilio G. Cota
2016-04-05  5:30 ` [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash Emilio G. Cota
2016-04-05 15:41   ` Richard Henderson
2016-04-05 15:48     ` Paolo Bonzini
2016-04-05 16:07       ` Richard Henderson
2016-04-05 19:40         ` Emilio G. Cota
2016-04-05 21:08           ` Richard Henderson
2016-04-06  0:52             ` Emilio G. Cota
2016-04-06 11:52               ` Paolo Bonzini
2016-04-06 17:44                 ` Emilio G. Cota
2016-04-06 18:23                   ` Paolo Bonzini
2016-04-06 18:27                     ` Richard Henderson
2016-04-07  0:37                     ` Emilio G. Cota
2016-04-07  8:46                       ` Paolo Bonzini
2016-04-05 16:33     ` Laurent Desnogues
2016-04-05 17:19       ` Richard Henderson
2016-04-06  6:06         ` Laurent Desnogues
2016-04-06 17:32           ` Emilio G. Cota
2016-04-06 17:42             ` Richard Henderson
2016-04-07  8:12               ` Laurent Desnogues
2016-04-05  5:30 ` [Qemu-devel] [PATCH 08/10] qht: QEMU's fast, resizable and scalable Hash Table Emilio G. Cota
2016-04-05  9:01   ` Paolo Bonzini
2016-04-05 15:50   ` Richard Henderson
2016-04-08 10:27   ` Alex Bennée
2016-04-19 23:03     ` Emilio G. Cota
2016-04-05  5:30 ` [Qemu-devel] [PATCH 09/10] qht: add test program Emilio G. Cota
2016-04-08 10:45   ` Alex Bennée
2016-04-19 23:06     ` Emilio G. Cota
2016-04-20  7:50       ` Alex Bennée
2016-04-05  5:30 ` [Qemu-devel] [PATCH 10/10] tb hash: track translated blocks with qht Emilio G. Cota
2016-04-08 12:39   ` Alex Bennée
2016-04-05  8:47 ` [Qemu-devel] [PATCH 00/10] tb hash improvements Alex Bennée
2016-04-05  9:01 ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lh4rarup.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=cota@braap.org \
    --cc=crosthwaite.peter@gmail.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=serge.fdrv@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.