From: Derrick Stolee <stolee@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee <dstolee@microsoft.com>,
git@vger.kernel.org, git@jeffhostetler.com, sbeller@google.com
Subject: Re: [PATCH v3 3/5] sha1_name: Unroll len loop in find_unique_abbrev_r
Date: Wed, 4 Oct 2017 09:06:32 -0400
Message-ID: <ff54de0c-cd7c-bc3d-dd18-5fc248f3f573@gmail.com>
In-Reply-To: <xmqqpoa3cujz.fsf@gitster.mtv.corp.google.com>
On 10/4/2017 2:10 AM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> ...
>> I understand that this patch on its own does not have good numbers. I
>> split the patches 3 and 4 specifically to highlight two distinct changes:
>>
>> Patch 3: Unroll the len loop that may inspect all files multiple times.
>> Patch 4: Parse less while disambiguating.
>>
>> Patch 4 more than makes up for the performance hits in this patch.
> Now you confused me even more. When we read the similar table that
> appears in [Patch 4/5], what does the "Base Time" column mean?
> Vanilla Git with [Patch 3/5] applied? Vanilla Git with [Patch 4/5]
> alone applied? Something else?
In PATCH 3, 4, and 5, I used the commit-by-commit diff for the perf
numbers, so the "Base Time" for PATCH 4 is the time measured with
PATCH 3 applied. The table in the [PATCH 0/5] message includes the
relative change for all commits.

I recalculated the change for each patch relative to the baseline
(PATCH 2). Looking again, it appears I misspoke: PATCH 4 does include
a +8% change for a fully-repacked Linux repo relative to PATCH 2.
Since PATCH 5 includes an optimization targeted directly at large
packfiles, the final performance gain is significant in the
fully-packed cases.

It is also worth looking at the absolute times for these cases, since
the fully-packed case is significantly faster than the multiple-packfile
case, so the relative change impacts users less.

One final note: the improvement was clearer in test p0008.1 when the
test included "sort -R" to shuffle the known OIDs. Providing OIDs in
lexicographic order has a significant effect on the performance,
which does not reflect real-world usage. I removed the "sort -R" because
it is a GNU-ism, but if there is a good cross-platform alternative I
would be happy to replace it.
p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------
For 10 repeated tests, each checking 100,000 known objects, we find the
following results when running in a Linux VM:
| Repo | Baseline | Patch 3 | Rel % | Patch 4 | Rel % | Patch 5 | Rel % |
|-------|----------|---------|-------|---------|-------|---------|-------|
| Git | 0.09 | 0.06 | -33% | 0.05 | -44% | 0.05 | -44% |
| Git | 0.11 | 0.08 | -27% | 0.08 | -27% | 0.08 | -27% |
| Git | 0.09 | 0.07 | -22% | 0.06 | -33% | 0.06 | -33% |
| Linux | 0.13 | 0.32 | +146% | 0.14 | + 8% | 0.05 | -62% |
| Linux | 1.13 | 1.12 | - 1% | 0.94 | -17% | 0.88 | -22% |
| Linux | 1.08 | 1.05 | - 3% | 0.86 | -20% | 0.80 | -26% |
| VSTS | 0.12 | 0.23 | +92% | 0.11 | - 8% | 0.05 | -58% |
| VSTS | 1.02 | 1.08 | + 6% | 0.95 | - 7% | 0.95 | - 7% |
| VSTS | 2.25 | 2.08 | - 8% | 1.82 | -19% | 1.93 | -14% |
(Each repo appears in three configurations, in row order: one packfile,
multiple packfiles, and multiple packfiles plus loose objects.)
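
To illustrate the two ways of reporting a relative change discussed
above (commit-by-commit versus relative to the PATCH 2 baseline), here
is a minimal Python sketch. It is not part of the patch series; the
helper name rel_change is hypothetical, and the inputs are the
single-packfile Linux times copied from the table above.

```python
def rel_change(base, new):
    """Percent change of `new` relative to `base`, rounded to an integer."""
    return round((new - base) / base * 100)

# Single-packfile Linux repo times, copied from the table in this message.
baseline, patch3, patch4, patch5 = 0.13, 0.32, 0.14, 0.05

# Relative to the baseline (PATCH 2), as in the "Rel %" columns above.
vs_baseline = [rel_change(baseline, t) for t in (patch3, patch4, patch5)]
print(vs_baseline)  # [146, 8, -62]

# Commit-by-commit: each patch compared with the patch before it.
vs_previous = [
    rel_change(baseline, patch3),
    rel_change(patch3, patch4),
    rel_change(patch4, patch5),
]
print(vs_previous)  # [146, -56, -64]
```

With the commit-by-commit view, PATCH 4 shows as a 56% improvement
because its base already contains the PATCH 3 slowdown; against the
baseline the same measurement is the +8% noted above.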
Thanks,
-Stolee