git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Phillip Wood <phillip.wood123@gmail.com>
To: Alexander Monakov <amonakov@ispras.ru>, git@vger.kernel.org
Cc: Phillip Wood <phillip.wood@dunelm.org.uk>
Subject: Re: [PATCH 2/2] xdiff: optimize xdl_hash_record_verbatim
Date: Mon, 4 Aug 2025 14:49:41 +0100	[thread overview]
Message-ID: <aedb1be1-3151-421e-94ce-27bc77d80b83@gmail.com> (raw)
In-Reply-To: <20250728190520.10962-3-amonakov@ispras.ru>

Hi Alexander

On 28/07/2025 20:05, Alexander Monakov wrote:
> xdl_hash_record_verbatim uses modified djb2 hash with XOR instead of ADD
> for combining. The ADD-based variant is used as the basis of the modern
> ("GNU") symbol lookup scheme in ELF. Glibc dynamic loader received an
> optimized version of this hash function thanks to Noah Goldstein [1].

Interesting

> Switch xdl_hash_record_verbatim to additive hashing and implement
> an optimized loop following the scheme suggested by Noah.
> 
> Timing 'git log --oneline --shortstat v2.0.0..v2.5.0' under perf, I got
> 
> version | cycles, bn | instructions, bn
> ---------------------------------------
> A         6.38         11.3
> B         6.21         10.89
> C         5.80          9.95
> D         5.83          8.74
> ---------------------------------------
> 
> A: baseline (git master at e4ef0485fd78)
> B: plus 'xdiff: refactor xdl_hash_record()'
> C: and plus this patch
> D: with 'xdiff: use xxhash' by Phillip Wood

I think it would be helpful to say that B is the previous patch and 
provide a link for D.
> The resulting speedup for xdl_hash_record_verbatim itself is about 1.5x.

While that's interesting it does not tell us how much this speeds up 
diff generation. Running the command above under hyperfine it is 1.02 ± 
0.01 times faster than the previous patch and 1.11 ± 0.01 times faster 
than master. Using xxhash (D above) is 1.03 ± 0.01 times faster than 
this patch. How do the changes below affect compilers other than gcc and 
clang than do not see the re-association barrier? We'd want to make sure 
that it does not result in slower diffs. Can we use 
atomic_signal_fence() on compilers that support C11? (we don't require 
C11 so we'd have to make it optional but it is supported by things like 
MSVC)

Thanks

Phillip
> 
> [1] https://inbox.sourceware.org/libc-alpha/20220519221803.57957-6-goldstein.w.n@gmail.com/
> 
> Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
> ---
>   xdiff/xutils.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 55 insertions(+), 4 deletions(-)
> 
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index e070ed649f..b1f8273f0f 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -294,16 +294,67 @@ unsigned long xdl_hash_record_with_whitespace(char const **data,
>   	return ha;
>   }
>   
> +/*
> + * Compiler reassociation barrier: pretend to modify X and Y to disallow
> + * changing evaluation order with respect to following uses of X and Y.
> + */
> +#ifdef __GNUC__
> +#define REASSOC_FENCE(x, y) asm("" : "+r"(x), "+r"(y))
> +#else
> +#define REASSOC_FENCE(x, y)
> +#endif
> +
>   unsigned long xdl_hash_record_verbatim(char const **data, char const *top) {
> -	unsigned long ha = 5381;
> +	unsigned long ha = 5381, c0, c1;
>   	char const *ptr = *data;
> -
> +#if 0
> +	/*
> +	 * The baseline form of the optimized loop below. This is the djb2
> +	 * hash (the above function uses a variant with XOR instead of ADD).
> +	 */
>   	for (; ptr < top && *ptr != '\n'; ptr++) {
>   		ha += (ha << 5);
> -		ha ^= (unsigned long) *ptr;
> +		ha += (unsigned long) *ptr;
>   	}
>   	*data = ptr < top ? ptr + 1: ptr;
> -
> +#else
> +	/* Process two characters per iteration. */
> +	if (top - ptr >= 2) do {
> +		if ((c0 = ptr[0]) == '\n') {
> +			*data = ptr + 1;
> +			return ha;
> +		}
> +		if ((c1 = ptr[1]) == '\n') {
> +			*data = ptr + 2;
> +			c0 += ha;
> +			REASSOC_FENCE(c0, ha);
> +			ha = ha * 32 + c0;
> +			return ha;
> +		}
> +		/*
> +		 * Combine characters C0 and C1 into the hash HA. We have
> +		 * HA = (HA * 33 + C0) * 33 + C1, and we want to ensure
> +		 * that dependency chain over HA is just one multiplication
> +		 * and one addition, i.e. we want to evaluate this as
> +		 * HA = HA * 33 * 33 + (C0 * 33 + C1), and likewise prefer
> +		 * (C0 * 32 + (C0 + C1)) for the expression in parenthesis.
> +		 */
> +		ha *= 33 * 33;
> +		c1 += c0;
> +		REASSOC_FENCE(c1, c0);
> +		c1 += c0 * 32;
> +		REASSOC_FENCE(c1, ha);
> +		ha += c1;
> +
> +		ptr += 2;
> +	} while (ptr < top - 1);
> +	*data = top;
> +	if (ptr < top && (c0 = ptr[0]) != '\n') {
> +		c0 += ha;
> +		REASSOC_FENCE(c0, ha);
> +		ha = ha * 32 + c0;
> +	}
> +#endif
>   	return ha;
>   }
>   


  parent reply	other threads:[~2025-08-04 13:49 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-28 19:05 [PATCH 0/2] optimize string hashing in xdiff Alexander Monakov
2025-07-28 19:05 ` [PATCH 1/2] xdiff: refactor xdl_hash_record() Alexander Monakov
2025-07-28 19:05 ` [PATCH 2/2] xdiff: optimize xdl_hash_record_verbatim Alexander Monakov
2025-07-28 20:50   ` Junio C Hamano
2025-07-28 20:57     ` Alexander Monakov
2025-08-04 13:49   ` Phillip Wood [this message]
2025-08-04 14:39     ` Alexander Monakov
2025-08-11 13:13       ` Phillip Wood
2025-08-11 14:14         ` Alexander Monakov
2025-08-12 17:56           ` Alexander Monakov
2025-08-20 21:34             ` Junio C Hamano
2025-09-08 19:06               ` Alexander Monakov
2025-09-08 21:04                 ` Junio C Hamano
2025-08-13 13:10           ` Phillip Wood
2025-07-28 19:32 ` [PATCH 0/2] optimize string hashing in xdiff Junio C Hamano
2025-07-28 19:56   ` Eli Schwartz
2025-07-28 20:54     ` Junio C Hamano
2025-07-28 20:25   ` Alexander Monakov
2025-08-14 15:01     ` Junio C Hamano
2025-08-28 23:40     ` Junio C Hamano
2025-08-29  1:13       ` Jacob Keller
2025-08-29  3:09       ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aedb1be1-3151-421e-94ce-27bc77d80b83@gmail.com \
    --to=phillip.wood123@gmail.com \
    --cc=amonakov@ispras.ru \
    --cc=git@vger.kernel.org \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).