From: Derrick Stolee <stolee@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Lars Schneider" <larsxschneider@gmail.com>
Cc: Jeff Hostetler <jeffhost@microsoft.com>,
Johannes Schindelin <johannes.schindelin@gmx.de>,
Derrick Stolee <dstolee@microsoft.com>,
Junio C Hamano <gitster@pobox.com>,
Jonathan Nieder <jrnieder@gmail.com>,
Jeff Hostetler <git@jeffhostetler.com>,
Git Mailing List <git@vger.kernel.org>, Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 1/5] core.aheadbehind: add new config setting
Date: Tue, 3 Apr 2018 07:39:20 -0400 [thread overview]
Message-ID: <d63b54e9-5ec6-f523-d882-756ac38b882b@gmail.com> (raw)
In-Reply-To: <87vad8vbid.fsf@evledraar.gmail.com>
On 4/3/2018 6:18 AM, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Apr 03 2018, Lars Schneider wrote:
>> What is the state of this series? I can't find it in git/git nor in
>> git-for-windows/git. I think Stolee mentioned the config in
>> his Git Merge talk [1] and I was about to test it/roll it out :-)
> It's in the gvfs branch of git@github.com:Microsoft/git.git, i.e. it's
> not in Git for Windows, but used in Microsoft's own in-house version
> used for Windows.git.
Thanks for adding me to CC. I mentioned it in my talk because that was
one thing we shipped internally as a "quick fix" until we could do the
right thing.
If I remember correctly, Jeff abandoned shipping this upstream because
it did have the feel of a hack and we wanted to see if users used the
config setting or really cared about the output values. We saw fast
adoption of the feature and even turned the config setting on
automatically in the following version of GVFS.
> I may be misunderstanding this feature, but my impression was that it
> was a kludge as a workaround until the commit graph code landed, because
> once we have that then surely we can just cheaply report the actual (or
> approximate?) number in the common case, but of course it may still be
> slow if your commit graph file is out of date.
You are correct that the commit-graph file may be out of date, causing
slower performance. Even worse: the current graph patch only provides a
constant-multiple speedup (still walking the same number of commits, but
each commit is parsed much faster).
Speaking of our GVFS-specific fork [0], the 'gvfs' branch was updated
just yesterday with a couple of changes that I am prepping for
submission upstream:
* Lazy-load trees when parsing commits from commit-graph [1]
* Compute and consume generation numbers [2]
Each of these will speed up this ahead/behind calculation in different
ways. [1] makes the cost of loading each commit a bit faster, saving up
to 20% overall. [2] uses generation numbers in paint_down_to_common() to
make the while() condition O(1) instead of O(Q) where Q is the size of
the priority queue. The Windows repo is particularly "wide" with many
parallel branches being merged in complicated ways, so the queue becomes
quite large. This use of generation numbers saves about 4% on some
ahead/behind calculations. This speedup is modest, but the existing code
already made good use of limiting the commit walk to be mostly the
"important" commits.
The real benefit of generation numbers will manifest in a way to make
--topo-order much faster when rendering a small number of commits.
The generation numbers _could_ be used to approximate the ahead/behind
calculation in the following way: When comparing A and B, and gen(A) <
gen(B), then A is at least (gen(B) - gen(A)) behind. That's the only
information that can be gathered directly from those values, but may be
enough to short circuit an exact count.
To truly accelerate these ahead/behind calculations to be sub-linear* in
the ahead/behind counts, we would need a bitmap-based approach. The
object-reachability bitmap is a non-starter for client machines in the
Windows repo, but perhaps a commit-reachability bitmap could be
interesting. Performing set operations on the bitmaps could more quickly
answer these questions. Just thinking about it makes me want to go down
a deep rabbit hole, investigating ways to compute, store, and use these
bitmaps. However: let's wait and see how necessary it is as the
commit-graph feature stabilizes. (*These bitmap approaches are not
guaranteed to be sub-linear, because it may include iterating through a
list of O(N) bits, but good run-length encodings will likely make the
count operation very fast, even with a set-difference operation included.)
There are too many fun things to work on, not enough time!
Thanks,
-Stolee
[0] https://github.com/microsoft/git
Fork of GitForWindows that ships to Windows developers
[1]
https://github.com/Microsoft/git/commit/29114bf86f591f5c87075f779a1faa2d0f17b92f
Lazy-load trees when parsing commits from commit-graph
(accidentally squashed to one commit)
[2]
https://github.com/microsoft/git/compare/879b7d3b1bddea2587b28cdd656c9c655018683a...a0731ca93a35fd042560c4b30e8e0edbdfa4bf9f
Compute and consume generation numbers
next prev parent reply other threads:[~2018-04-03 11:39 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-21 19:09 [PATCH v2 0/5] Add --no-ahead-behind to status Jeff Hostetler
2017-12-21 19:09 ` [PATCH v2 1/5] core.aheadbehind: add new config setting Jeff Hostetler
2017-12-21 20:21 ` Igor Djordjevic
2017-12-21 20:43 ` Jonathan Nieder
2017-12-22 18:21 ` Junio C Hamano
2017-12-24 14:33 ` Jeff King
2017-12-27 17:41 ` Junio C Hamano
2018-01-04 19:26 ` Jeff King
2018-04-03 9:54 ` Lars Schneider
2018-04-03 10:18 ` Ævar Arnfjörð Bjarmason
2018-04-03 11:39 ` Derrick Stolee [this message]
2018-04-03 13:47 ` Jeff Hostetler
2018-01-02 21:54 ` Jeff Hostetler
2018-01-02 22:17 ` Jonathan Nieder
2017-12-21 19:09 ` [PATCH v2 2/5] stat_tracking_info: return +1 when branches are not equal Jeff Hostetler
2017-12-21 20:48 ` Jonathan Nieder
2017-12-21 19:09 ` [PATCH v2 3/5] status: add --[no-]ahead-behind to porcelain V2 output Jeff Hostetler
2017-12-21 20:57 ` Jonathan Nieder
2017-12-21 19:09 ` [PATCH v2 4/5] status: update short status to use --no-ahead-behind Jeff Hostetler
2017-12-21 19:09 ` [PATCH v2 5/5] status: support --no-ahead-behind in long format Jeff Hostetler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d63b54e9-5ec6-f523-d882-756ac38b882b@gmail.com \
--to=stolee@gmail.com \
--cc=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@jeffhostetler.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jeffhost@microsoft.com \
--cc=johannes.schindelin@gmx.de \
--cc=jrnieder@gmail.com \
--cc=larsxschneider@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).