From: Lidong Yan <yldhome2d2@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Kai Koponen <kaikoponen@google.com>, git@vger.kernel.org
Subject: Re: Perf bug: rev-list w/ 2+ paths relatively slow with commit-graph
Date: Tue, 24 Jun 2025 11:16:09 +0800
Message-ID: <E6A4C972-9675-47AE-B5CE-75103DB1D153@gmail.com>
In-Reply-To: <xmqq34bq5g29.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:
> 
> Kai Koponen <kaikoponen@google.com> writes:
> 
>> I see, more of a perf FR than a bug then.
>> I don't have much expertise here, but on the surface of it, it doesn't
>> seem to me like there would be any reason the algorithm couldn't check
>> each path's bloom filter in turn while searching, other than that this
>> would be a large and annoying change.
> 
> It looks like the necessary changes are probably fairly well
> isolated to two functions, i.e., prepare_to_use_bloom_filter() and
> forbid_bloom_filters().  Right now, for a pathspec that has one
> element "dir/file", the code uses two bloom keys for "dir" and
> "dir/file", but if we have "dir1/file1" as well, then it does look
> like a matter of using two more (and the bloom_keys[] array is
> designed to be variable length).

I believe the issue here is that revs->bloom_keys[] represents an
AND condition, whereas what we actually want is an OR. In Kai’s example,
we’re trying to identify commits that modified either src/Make.dist or
src/clean.bash. However, if we add the keys for src, src/Make.dist, and
src/clean.bash to bloom_keys[], we end up keeping only commits whose Bloom
filter matches all of these keys, rather than commits that match either path.
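
To make the AND-versus-OR point concrete, here is a minimal sketch in Python rather than Git's C. It is a toy model, not Git's implementation: the filter size, the hash scheme, and every function name below are assumptions made up for illustration. The only thing it takes from the thread is that a path contributes one key per leading directory plus one for the full path.

```python
import hashlib

M_BITS = 4096   # toy filter size (assumption, not Git's setting)
K_HASHES = 7    # toy number of hash functions (assumption)

def bit_positions(key):
    """Map a key to K_HASHES bit positions using a toy hash scheme."""
    positions = []
    for i in range(K_HASHES):
        digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % M_BITS)
    return positions

def path_keys(path):
    """'src/Make.dist' -> ['src', 'src/Make.dist'] (each prefix, then the path)."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]

def build_filter(changed_paths):
    """Toy per-commit filter: the set of bits set by every changed path's keys."""
    bits = set()
    for path in changed_paths:
        for key in path_keys(path):
            bits.update(bit_positions(key))
    return bits

def maybe_contains(bits, key):
    """Bloom lookup: False means 'definitely not', True means 'maybe'."""
    return all(p in bits for p in bit_positions(key))

def match_all_keys(bits, pathspecs):
    """AND over one flat key list, as if every key sat in a single array."""
    keys = [k for p in pathspecs for k in path_keys(p)]
    return all(maybe_contains(bits, k) for k in keys)

def match_any_pathspec(bits, pathspecs):
    """OR across pathspecs, AND only within the keys of each pathspec."""
    return any(
        all(maybe_contains(bits, k) for k in path_keys(p)) for p in pathspecs
    )

pathspecs = ["src/Make.dist", "src/clean.bash"]
commit_filter = build_filter(["src/Make.dist"])  # commit touched one path only
print(match_all_keys(commit_filter, pathspecs))
print(match_any_pathspec(commit_filter, pathspecs))
```

For a commit that touched only src/Make.dist, the flat AND check is expected to reject it, because the src/clean.bash key is absent from the filter, while the per-pathspec OR check keeps it as a candidate. A real change would presumably need one key group per pathspec element rather than a single flat array.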

> But those who have more intimate knowledge in the area than I do may
> point out what is missing in my "it looks like" gut feeling.
> 



Thread overview: 7+ messages
2025-06-23 17:58 Perf bug: rev-list w/ 2+ paths relatively slow with commit-graph Kai Koponen
2025-06-23 18:04 ` Kai Koponen
2025-06-23 19:36 ` Junio C Hamano
2025-06-23 20:19   ` Kai Koponen
2025-06-23 21:00     ` Junio C Hamano
2025-06-24  3:16       ` Lidong Yan [this message]
2025-06-24 13:32         ` Junio C Hamano
