git.vger.kernel.org archive mirror
From: Taylor Blau <me@ttaylorr.com>
To: Toon Claes <toon@iotcl.com>
Cc: git@vger.kernel.org, Karthik Nayak <karthik.188@gmail.com>,
	Justin Tobler <jltobler@gmail.com>,
	Derrick Stolee <stolee@gmail.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH] last-modified: implement faster algorithm
Date: Thu, 23 Oct 2025 19:59:28 -0400
Message-ID: <aPrBYGa6HWUtTI4V@nand.local>
In-Reply-To: <87cy6gtym2.fsf@iotcl.com>

On Tue, Oct 21, 2025 at 11:04:05AM +0200, Toon Claes wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
>
> >> Nice, I am glad to see that we are using a bitmap here rather than the
> >> hacky 'char *' that we had originally written. I seem to remember that
> >> there was a tiny slow-down when using bitmaps, but can't find the
> >> discussion anymore. (It wasn't in the internal PR that I originally
> >> opened, and I no longer can read messages that far back in history.)
> >>
> >> It might be worth benchmarking here to see if using a 'char *' is
> >> faster. Of course, that's 8x worse in terms of memory usage, but not a
> >> huge deal given both the magnitude and typical number of directory
> >> elements (you'd need 1024^2 entries in a single tree to occupy even a
> >> single MiB of heap).
>
> Using ewah bitmaps is slightly faster, although the difference is almost
> negligible.
>
>     Benchmark 1: bitmap-ewah
>       Time (mean ± σ):     793.1 ms ±   6.2 ms    [User: 755.1 ms, System: 35.2 ms]
>       Range (min … max):   784.7 ms … 804.8 ms    10 runs
>
>     Benchmark 2: bitmap-chars
>       Time (mean ± σ):     808.9 ms ±  11.2 ms    [User: 770.8 ms, System: 35.4 ms]
>       Range (min … max):   800.2 ms … 830.5 ms    10 runs
>
>     Summary
>       bitmap-ewah ran
>         1.02 ± 0.02 times faster than bitmap-chars

OK, makes sense. Just to clarify: "bitmap-ewah" is a bog-standard
"struct bitmap", right? It happens to come from the EWAH implementation,
but the bitmap itself is not being EWAH compressed.
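
To illustrate what I mean by an uncompressed bitmap, and to check the
arithmetic from my earlier message, here is a rough sketch. It is not
the code from ewah/; 'struct plain_bitmap' and every function below are
hypothetical names made up for this example. It stores one bit per
entry in 64-bit words, with no compression, and compares that against
one byte per entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an uncompressed bitmap; not Git's actual code. */
struct plain_bitmap {
	uint64_t *words;   /* one bit per entry, packed into 64-bit words */
	size_t word_alloc; /* number of words currently allocated */
};

/* Bytes needed to track n entries with one 'char' per entry. */
static size_t chars_bytes(size_t n)
{
	return n;
}

/* Bytes needed with one bit per entry, rounded up to whole words. */
static size_t bits_bytes(size_t n)
{
	return ((n + 63) / 64) * sizeof(uint64_t);
}

/* Set bit 'pos', growing the word array as needed. Returns 0 on success. */
static int plain_bitmap_set(struct plain_bitmap *b, size_t pos)
{
	size_t word = pos / 64;

	assert(b);
	if (word >= b->word_alloc) {
		size_t i, new_alloc = word + 1;
		uint64_t *w = realloc(b->words, new_alloc * sizeof(*w));

		if (!w)
			return -1;
		/* Newly added words start out with all bits clear. */
		for (i = b->word_alloc; i < new_alloc; i++)
			w[i] = 0;
		b->words = w;
		b->word_alloc = new_alloc;
	}
	b->words[word] |= (uint64_t)1 << (pos % 64);
	return 0;
}

/* Return 1 if bit 'pos' is set, 0 otherwise (including out of range). */
static int plain_bitmap_get(const struct plain_bitmap *b, size_t pos)
{
	size_t word = pos / 64;

	if (word >= b->word_alloc)
		return 0;
	return (int)((b->words[word] >> (pos % 64)) & 1);
}
```

For 1024^2 entries, chars_bytes() gives 1 MiB and bits_bytes() gives
128 KiB, which is the 8x difference mentioned above.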

> And with ewah bitmaps being more memory efficient, it makes more sense
> to keep using those.
>
> >> Likewise, I wonder if we should have elemtype here be just 'struct
> >> bitmap'. Unfortunately I don't think the EWAH code has a function like:
> >>
> >>     void bitmap_init(struct bitmap *);
> >>
> >> and only has ones that allocate for us. So we may consider adding one,
> >> or creating a dummy bitmap and copying its contents, or otherwise.
>
> I've done some testing, and to do so I've made bitmap_grow() public.
>
>     Benchmark 1: bitmap-as-pointers
>       Time (mean ± σ):     783.7 ms ±   8.9 ms    [User: 744.1 ms, System: 37.5 ms]
>       Range (min … max):   774.4 ms … 803.4 ms    10 runs
>
>     Benchmark 2: bitmap-as-values
>       Time (mean ± σ):     856.7 ms ±  10.5 ms    [User: 816.0 ms, System: 38.1 ms]
>       Range (min … max):   845.7 ms … 872.5 ms    10 runs
>
>     Summary
>       bitmap-as-pointers ran
>         1.09 ± 0.02 times faster than bitmap-as-values
>
> It seems using ewah bitmaps as pointers is faster than using bitmaps as
> values. I must admit I'm surprised as well, but in case you want to
> double check, here's the patch:

I think this makes sense; a pointer is half as wide as a struct bitmap.
Even though we're going through another layer of indirection, the
smaller slab footprint likely results in better cache locality, and
ultimately faster code. Thanks for testing it out.
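
As a back-of-the-envelope check on the "half as wide" claim, here is a
small sketch. 'struct bitmap_like' is a hypothetical stand-in that
assumes a bitmap header made of one pointer plus one size_t; the helper
names are made up as well. It only measures the per-slot footprint in
the slab, not the heap allocations behind each bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a words pointer plus an allocation count. */
struct bitmap_like {
	uint64_t *words;
	size_t word_alloc;
};

/* Slab bytes for n slots when each slot embeds the struct by value. */
static size_t slab_bytes_as_values(size_t n)
{
	return n * sizeof(struct bitmap_like);
}

/* Slab bytes for n slots when each slot holds only a pointer. */
static size_t slab_bytes_as_pointers(size_t n)
{
	return n * sizeof(struct bitmap_like *);
}

/*
 * Assumes size_t is as wide as a pointer and the struct has no padding,
 * which holds on common 32-bit and 64-bit platforms.
 */
static void check_layout(void)
{
	assert(sizeof(struct bitmap_like) == 2 * sizeof(struct bitmap_like *));
}
```

On a typical 64-bit machine that is 16 bytes per slot by value versus 8
bytes per slot by pointer, so twice as many slots fit in each cache line
when the slab holds pointers.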

Thanks,
Taylor
