git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: "René Scharfe" <l.s.r@web.de>,
	"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: newren@gmail.com, Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 8/8] cache-tree: avoid path comparison loop when silent
Date: Thu, 31 Dec 2020 11:46:19 -0500	[thread overview]
Message-ID: <3ed37ae6-1f09-bd6b-c9d9-c8089da1af92@gmail.com> (raw)
In-Reply-To: <daeb59ee-861d-9c8d-3601-6aef1ac3fc94@web.de>

On 12/31/2020 7:34 AM, René Scharfe wrote:
> Am 30.12.20 um 20:26 schrieb Derrick Stolee via GitGitGadget:
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The verify_cache() method is used both for debugging issues with the
>> cached tree extension but also to avoid using the extension when there
>> are unmerged entries. It also checks for cache entries out of order with
>> respect to file-versus-directory sorting.
>>
>> In 996277c (Refactor cache_tree_update idiom from commit, 2011-12-06),
>> the silent option was added to remove the progress indicators from the
>> initial loop looking for unmerged entries. This was changed to be a flag
>> in e859c69 (cache-tree: update API to take abitrary flags, 2012-01-16).
>>
>> In both cases, the silent option is ignored for the second loop that
>> checks for file-versus-directory sorting. It must be that this loop is
>> intended only for debugging purposes and is not actually helpful in
>> practice.
> 
> So you're saying that the directory/file conflict check not honoring the
> WRITE_TREE_SILENT flag would have been noticed as a bug and therefore
> doesn't happen?
> 
> I'm not sure I can follow that logic.  I don't know how important that
> check is, how often it found a conflict and how likely such a find is
> overlooked or ignored, but disabling a check in silent mode that
> affects the return code instead of only suppressing its messages seems
> risky.
> 
> If we are sure that the check cannot trigger then we should remove it.
> If we are not so sure, but a conflict would be Git's fault (and not the
> user's) then we should always do the check and BUG out.  And otherwise
> we should keep it.

I think this method was originally designed for debugging issues with
the index, and it still serves that purpose when someone is working on
it. The check for out-of-order directory/file conflicts probably exists
only for that case.

However, this method also got repurposed to (silently) check for the
real scenario of unmerged entries. Perhaps it would be better to split
these into two cases and move the silent case into a new method,
"has_unmerged_entries()".

>> Since verify_cache() is called in cache_tree_update(), which is run
>> during 'git commit', we could speed up 'git commit' operations by not
>> iterating through this loop another time. Of course, noticing this loop
>> requires an incredibly large index.
>>
>> To get a measurable difference, I needed to run this test on the
>> Microsoft Office monorepo, which has over three million entries in its
>> index. I used 'git commit --amend --no-edit' as my command and saw the
>> performance go from 752ms to 739ms with this change.>> Could you please check the performance impact of the following code
> simplification?

Thank you for sending this. I started comparing the performance and
discovered that the performance difference I had measured before was
NOT consistent! I went back and debugged and found that 'git commit'
doesn't actually pass the silent flag, and my performance tests were
skewed by the filesystem cache. (A warmup of 3 runs was not sufficient
to get consistent numbers, it seemed. Perhaps the threaded index reads
are too inconsistent.)

> diff --git a/cache-tree.c b/cache-tree.c
> index a537a806c1..1105cfe6b7 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -187,10 +187,8 @@ static int verify_cache(struct cache_entry **cache,
>  		 */
>  		const char *this_name = cache[i]->name;
>  		const char *next_name = cache[i+1]->name;
> -		int this_len = strlen(this_name);
> -		if (this_len < strlen(next_name) &&
> -		    strncmp(this_name, next_name, this_len) == 0 &&
> -		    next_name[this_len] == '/') {
> +		const char *p;
> +		if (skip_prefix(next_name, this_name, &p) && *p == '/') {
>  			if (10 < ++funny) {
>  				fprintf(stderr, "...\n");
>  				break;
 
This performs consistently worse than both cases (see performance test
later in this message). I think the strlen is saving us some time here.

In fact, I think the compiler is doing some magic to save our strlen
results, as I get identical performance results when doing this:

diff --git a/cache-tree.c b/cache-tree.c
index c6c084463b..a076e7cec5 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -156,6 +156,8 @@ static int verify_cache(struct cache_entry **cache,
 {
        int i, funny;
        int silent = flags & WRITE_TREE_SILENT;
+       const char *this_name;
+       size_t this_len;
 
        /* Verify that the tree is merged */
        funny = 0;
@@ -182,18 +184,21 @@ static int verify_cache(struct cache_entry **cache,
         * and they should not exist unless a bug was introduced along
         * the way.
         */
-       if (silent)
-               return 0;
        funny = 0;
-       for (i = 0; i < entries - 1; i++) {
+
+       if (!entries)
+               return 0;
+       this_name = cache[0]->name;
+       this_len = strlen(this_name);
+
+       for (i = 1; i < entries; i++) {
                /* path/file always comes after path because of the way
                 * the cache is sorted.  Also path can appear only once,
                 * which means conflicting one would immediately follow.
                 */
-               const char *this_name = cache[i]->name;
-               const char *next_name = cache[i+1]->name;
-               int this_len = strlen(this_name);
-               if (this_len < strlen(next_name) &&
+               const char *next_name = cache[i]->name;
+               size_t next_len = strlen(next_name);
+               if (this_len < next_len &&
                    strncmp(this_name, next_name, this_len) == 0 &&
                    next_name[this_len] == '/') {
                        if (10 < ++funny) {
@@ -203,6 +208,8 @@ static int verify_cache(struct cache_entry **cache,
                        fprintf(stderr, "You have both %s and %s\n",
                                this_name, next_name);
                }
+               this_name = next_name;
+               this_len = next_len;
        }
        if (funny)
                return -1;

To get more consistent results, I modified my benchmark to be the following:

	git -c index.threads=1 commit --amend --allow-empty --no-edit

I also adjusted the test suite to use 5 warmup rounds. Here are the numbers:

  Benchmark #1: v2.30.0
    Time (mean ± σ):     856.6 ms ±  18.0 ms    [User: 807.9 ms, System: 198.4 ms]
    Range (min … max):   832.1 ms … 882.2 ms    10 runs
 
So, 756.5ms average for v2.30.0.

  Benchmark #2: skipping the second loop
    Time (mean ± σ):     805.5 ms ±  15.8 ms    [User: 758.0 ms, System: 197.1 ms]
    Range (min … max):   783.4 ms … 835.2 ms    10 runs
 
Here, I just deleted the second loop to see what is the potential minimum.

  Benchmark #3: storing this_name during second loop
    Time (mean ± σ):     855.6 ms ±  14.1 ms    [User: 804.5 ms, System: 194.6 ms]
    Range (min … max):   835.9 ms … 875.2 ms    10 runs

The diff I provided earlier reduces the number of strlen() computations
by storing them across the loop iterations. There is no effect, which
makes me think the compiler is being smart.

  Benchmark #4: check for '/' before strncmp()
    Time (mean ± σ):     834.1 ms ±  18.0 ms    [User: 786.6 ms, System: 196.7 ms]
    Range (min … max):   803.5 ms … 860.4 ms    10 runs 

HOWEVER, if we swap the order of the conditionals to be

		if (this_len < next_len &&
		    strncmp(this_name, next_name, this_len) == 0 &&
		    next_name[this_len] == '/') {

then we _do_ get a measurable, consistent speedup.

  Benchmark #5: using skip_prefix
    Time (mean ± σ):     877.3 ms ±  16.1 ms    [User: 839.1 ms, System: 187.5 ms]
    Range (min … max):   847.7 ms … 900.4 ms    10 runs

This final benchmark stores the results for your skip_prefix diff.

Including the full ranges of the 10 runs hopefully assists in demonstrating
that the performance changes are (mostly) consistent.

To wrap up, I'm going to eject this patch from my v2 since more investigation
must be done to see if this second loop _can_ be dropped. At minimum, it isn't
properly silent when requested and there _is_ a performance boost possible,
even if we keep the check.

Thanks,
-Stolee

  reply	other threads:[~2020-12-31 16:47 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-30 19:26 [PATCH 0/8] Cleanups around index operations Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 1/8] tree-walk: report recursion counts Derrick Stolee via GitGitGadget
2020-12-30 19:42   ` Elijah Newren
2020-12-30 19:51     ` Derrick Stolee
2020-12-30 19:26 ` [PATCH 2/8] unpack-trees: add trace2 regions Derrick Stolee via GitGitGadget
2020-12-30 19:45   ` Elijah Newren
2020-12-30 19:26 ` [PATCH 3/8] cache-tree: use trace2 in cache_tree_update() Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 4/8] cache-tree: trace regions for I/O Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 5/8] cache-tree: trace regions for prime_cache_tree Derrick Stolee via GitGitGadget
2020-12-30 19:48   ` Elijah Newren
2020-12-30 19:53     ` Derrick Stolee
2020-12-30 19:26 ` [PATCH 6/8] index-format: update preamble to cached tree extension Derrick Stolee via GitGitGadget
2020-12-30 20:00   ` Elijah Newren
2020-12-30 19:26 ` [PATCH 7/8] index-format: discuss recursion of cached-tree better Derrick Stolee via GitGitGadget
2020-12-30 19:26 ` [PATCH 8/8] cache-tree: avoid path comparison loop when silent Derrick Stolee via GitGitGadget
2020-12-30 20:14   ` Elijah Newren
2021-01-06  8:55     ` Junio C Hamano
2021-01-06 12:08       ` Derrick Stolee
2020-12-31 12:34   ` René Scharfe
2020-12-31 16:46     ` Derrick Stolee [this message]
2021-01-01 13:30       ` René Scharfe
2021-01-02 15:19       ` [PATCH] cache-tree: use ce_namelen() instead of strlen() René Scharfe
2021-01-04  1:26         ` Derrick Stolee
2021-01-05 12:05         ` Junio C Hamano
2021-01-02 15:31       ` [PATCH 8/8] cache-tree: avoid path comparison loop when silent René Scharfe
2020-12-30 20:19 ` [PATCH 0/8] Cleanups around index operations Elijah Newren
2020-12-30 20:24   ` Derrick Stolee
2021-01-04  3:09 ` [PATCH v2 0/9] " Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 1/9] tree-walk: report recursion counts Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 2/9] unpack-trees: add trace2 regions Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 3/9] cache-tree: use trace2 in cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 4/9] cache-tree: trace regions for I/O Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 5/9] cache-tree: trace regions for prime_cache_tree Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 6/9] index-format: update preamble to cached tree extension Derrick Stolee via GitGitGadget
2021-01-07  2:10     ` Junio C Hamano
2021-01-07 11:51       ` Derrick Stolee
2021-01-07 20:12         ` Junio C Hamano
2021-01-07 21:26         ` Junio C Hamano
2021-01-04  3:09   ` [PATCH v2 7/9] index-format: discuss recursion of cached-tree better Derrick Stolee via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 8/9] cache-tree: use ce_namelen() instead of strlen() René Scharfe via GitGitGadget
2021-01-04  3:09   ` [PATCH v2 9/9] cache-tree: speed up consecutive path comparisons Derrick Stolee via GitGitGadget
2021-01-07 16:32   ` [PATCH v3 00/10] Cleanups around index operations Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 01/10] tree-walk: report recursion counts Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 02/10] unpack-trees: add trace2 regions Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 03/10] cache-tree: use trace2 in cache_tree_update() Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 04/10] cache-tree: trace regions for I/O Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 05/10] cache-tree: trace regions for prime_cache_tree Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 06/10] index-format: use 'cache tree' over 'cached tree' Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 07/10] index-format: update preamble to cache tree extension Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 08/10] index-format: discuss recursion of cached-tree better Derrick Stolee via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 09/10] cache-tree: use ce_namelen() instead of strlen() René Scharfe via GitGitGadget
2021-01-07 16:32     ` [PATCH v3 10/10] cache-tree: speed up consecutive path comparisons Derrick Stolee via GitGitGadget
2021-01-16  6:58     ` [PATCH v3 00/10] Cleanups around index operations Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3ed37ae6-1f09-bd6b-c9d9-c8089da1af92@gmail.com \
    --to=stolee@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=l.s.r@web.de \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).