From: Ben Peart <peartben@gmail.com>
To: Junio C Hamano <gitster@pobox.com>, Ben Peart <Ben.Peart@microsoft.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"
Date: Tue, 24 Jul 2018 16:47:37 -0400 [thread overview]
Message-ID: <ab8ee481-54fa-a014-69d9-8f621b136766@gmail.com> (raw)
In-Reply-To: <xmqqh8koxwwi.fsf@gitster-ct.c.googlers.com>
On 7/24/2018 3:21 PM, Junio C Hamano wrote:
> Ben Peart <Ben.Peart@microsoft.com> writes:
>
>> From: Ben Peart <Ben.Peart@microsoft.com>
>>
>> If the new core.optimizecheckout config setting is set to true, speed up
>> "git checkout -b foo" by avoiding the work to merge the working tree. This
>> is valid because no merge needs to occur - only creating the new branch/
>> updating the refs. Any other options force it through the old code path.
>>
>> This change in behavior is off by default and behind the config setting so
>> that users have to opt-in to the optimized behavior.
>
>
>
>
>> We've been running with this patch internally for a long time but it was
>> rejected when I submitted it to the mailing list before because it
>> implicitly changes the behavior of checkout -b. Trying it again configured
>> behind a config setting as a potential solution for other optimizations to
>> checkout that could change the behavior as well.
>>
>> https://public-inbox.org/git/20180724042740.GB13248@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469
>
> An incorrect link? It does not look like a thread that explains
> what was previously submitted but failed. The last paragraph looks
> like a fine material below the three-dash line.
>
See my earlier reply about this section in:
https://public-inbox.org/git/xmqqh8koxwwi.fsf@gitster-ct.c.googlers.com/T/#mb31136a09dbc1a963a5a62e840b118ac33043edf
>
>> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
>> ---
>>
>> Notes:
>> Base Ref: master
>> Web-Diff: https://github.com/benpeart/git/commit/f43d934ce7
>> Checkout: git fetch https://github.com/benpeart/git checkout-b-v1 && git checkout f43d934ce7
>>
>> Documentation/config.txt | 6 +++
>> builtin/checkout.c | 94 ++++++++++++++++++++++++++++++++++++++++
>> cache.h | 1 +
>> config.c | 5 +++
>> environment.c | 1 +
>> 5 files changed, 107 insertions(+)
>>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index a32172a43c..2c4f513bf1 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -911,6 +911,12 @@ core.commitGraph::
>> Enable git commit graph feature. Allows reading from the
>> commit-graph file.
>>
>> +core.optimizedCheckout
>> + Speed up "git checkout -b foo" by skipping much of the work of a
>> + full checkout command. This changs the behavior as it will skip
>> + merging the trees and updating the index and instead only create
>> + and switch to the new ref.
>
> By the way, why is it a core.* thing, not checkout.* thing?
>
I followed the naming convention used by core.sparsecheckout but I'm
happy to call it whatever people want.
> If a new feature is not necessarily recommendable for normal users
> and it needs to be hidden behind an opt-in knob (I do not have a
> strong opinion if that is or is not the case for this particular
> feature at this point), the documentation for the knob should give a
> bit more than "This chang(e)s the behavior" to the readers, I would
> think, to be intellectually honest ;-). Let's tell them what bad
> things happen if we pretend that we switched the branch without
> twoway merge and the index update to help them make an informed
> decision.
>
I attempted to explain what the change in behavior was in the same
sentence by saying what it skips and what it still does. If that isn't
intellectually honest, help me know how to better explain it so that it is.
Is this better?
Speed up "git checkout -b <new_branch>" by skipping the twoway merge and
index update. Instead, just create a new branch named <new_branch> and
switch to it. The working directory and index are left unchanged.
>> +static int needs_working_tree_merge(const struct checkout_opts *opts,
>> + const struct branch_info *old_branch_info,
>> + const struct branch_info *new_branch_info)
>> +{
>> + /*
>> + * We must do the merge if we are actually moving to a new
>> + * commit tree.
>
> What's a "commit tree"? Shouldn't it be just a "commit"?
>
Sure
>> + */
>> + if (!old_branch_info->commit || !new_branch_info->commit ||
>> + oidcmp(&old_branch_info->commit->object.oid, &new_branch_info->commit->object.oid))
>> + return 1;
>> +
>> + /*
>> + * opts->patch_mode cannot be used with switching branches so is
>> + * not tested here
>> + */
>> +
>> + /*
>> + * opts->quiet only impacts output so doesn't require a merge
>> + */
>> +
>> + /*
>> + * Honor the explicit request for a three-way merge or to throw away
>> + * local changes
>> + */
>> + if (opts->merge || opts->force)
>> + return 1;
>> +
>> + /*
>> + * --detach is documented as "updating the index and the files in the
>> + * working tree" but this optimization skips those steps so fall through
>> + * to the regular code path.
>> + */
>> + if (opts->force_detach)
>> + return 1;
>> +
>> + /*
>> + * opts->writeout_stage cannot be used with switching branches so is
>> + * not tested here
>> + */
>> +
>> + /*
>> + * Honor the explicit ignore requests
>> + */
>> + if (!opts->overwrite_ignore || opts->ignore_skipworktree ||
>> + opts->ignore_other_worktrees)
>> + return 1;
>> +
>> + /*
>> + * opts->show_progress only impacts output so doesn't require a merge
>> + */
>> +
>> + /*
>> + * If we aren't creating a new branch any changes or updates will
>> + * happen in the existing branch. Since that could only be updating
>> + * the index and working directory, we don't want to skip those steps
>> + * or we've defeated any purpose in running the command.
>> + */
>> + if (!opts->new_branch)
>> + return 1;
>> +
>> + /*
>> + * new_branch_force is defined to "create/reset and checkout a branch"
>> + * so needs to go through the merge to do the reset
>> + */
>> + if (opts->new_branch_force)
>> + return 1;
>> +
>> + /*
>> + * A new orphaned branch requrires the index and the working tree to be
>> + * adjusted to <start_point>
>> + */
>> + if (opts->new_orphan_branch)
>> + return 1;
>> +
>> + /*
>> + * Remaining variables are not checkout options but used to track state
>> + */
>> +
>> + return 0;
>> +}
>
> This helper function alone looks like we are creating a maintenance
> nightmare from a quick scan. How are we going to keep this up to
> date?
>
> I offhand do not know how "git checkout -b foo" would behave
> differently if we do not do a two-way merge between HEAD and HEAD to
> update the index. We'd still need to list the local modifications
> and say "Switched to a new branch 'foo'", but that would be a minor
> thing compared to the two-way merge machinery.
>
With the patch, the index and working directory aren't modified, it does
still list the local modifications and says "Switched to..."
The changes in behavior come from other inputs. For example, if you
turned on sparse checkout and then did a "git checkout -b foo" the old
code path would update the skip-worktree bit in the index and remove all
the files from the working directory that were no longer specified in
the sparse-checkout file. The new behavior would not do that as it no
longer updates the index or working directory.
The checkout documentation doesn't _say_ that "git checkout -b foo" will
also update the working directory after changing the sparse-checkout
settings but it happens to do that. Is that a common scenario? No - but
it is an undocumented behavior that will change if they opt in to this
new behavior. At least the documentation for the new flag _does_ let
the user know the working directory and index won't be changed so if
they opt in, they shouldn't be too surprised.
> Was the primary reason why the patch "changes the behaviour" because
> nobody could prove that needs_working_tree_merge() helper reliably
> detects that "checkout -b foo" case and that case alone, and show a
> way to make sure it will keep doing so in the future when other new
> features are added to the command?
>
My concern isn't ensuring that the patch reliably detects "checkout -b",
the challenge is ensuring it will keep doing so as features are added
that impact the "-b" option. The problem of adding a new option and
having to ensure it behaves properly with all the other options isn't
new but I agree this does add one more case that has to be handled.
>> @@ -479,6 +565,14 @@ static int merge_working_tree(const struct checkout_opts *opts,
>> int ret;
>> struct lock_file lock_file = LOCK_INIT;
>>
>> + /*
>> + * Skip merging the trees, updating the index, and work tree only if we
>> + * are simply creating a new branch via "git checkout -b foo." Any
>> + * other options or usage will continue to do all these steps.
>> + */
>> + if (core_optimize_checkout && !needs_working_tree_merge(opts, old_branch_info, new_branch_info))
>> + return 0;
>> +
>> hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
>> if (read_cache_preload(NULL) < 0)
>> return error(_("index file corrupt"));
>
> Thanks.
>
next prev parent reply other threads:[~2018-07-24 20:47 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-24 18:01 [PATCH v1] checkout: optionally speed up "git checkout -b foo" Ben Peart
2018-07-24 18:42 ` Eric Sunshine
2018-07-24 19:45 ` Ben Peart
2018-07-26 15:04 ` Junio C Hamano
2018-07-26 18:59 ` Eric Sunshine
2018-07-26 19:08 ` Eric Sunshine
2018-07-24 19:21 ` Junio C Hamano
2018-07-24 20:47 ` Ben Peart [this message]
2018-07-31 16:39 ` [PATCH v2] checkout: optimize "git checkout -b <new_branch>" Ben Peart
2018-07-31 20:01 ` Junio C Hamano
2018-08-01 15:10 ` Duy Nguyen
2018-08-02 18:02 ` Ben Peart
2018-08-03 15:58 ` Duy Nguyen
2018-08-06 14:25 ` Ben Peart
2018-08-15 21:05 ` Ben Peart
2018-08-05 8:57 ` Duy Nguyen
2018-08-16 18:27 ` [PATCH v3] " Ben Peart
2018-08-16 18:37 ` Duy Nguyen
2018-08-17 12:37 ` Ben Peart
2018-08-19 1:44 ` Elijah Newren
2018-08-20 13:40 ` Ben Peart
2018-08-20 18:16 ` Elijah Newren
2018-08-21 14:51 ` Duy Nguyen
2018-08-30 17:22 ` Elijah Newren
2018-09-04 16:46 ` Duy Nguyen
2018-08-20 18:31 ` Junio C Hamano
2018-09-18 5:34 ` [PATCH] config doc: add missing list separator for checkout.optimizeNewBranch Ævar Arnfjörð Bjarmason
2018-09-18 16:57 ` Taylor Blau
2018-09-18 17:16 ` Jeff King
2018-09-18 17:20 ` Taylor Blau
2018-09-18 17:13 ` Jeff King
2018-09-19 4:41 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ab8ee481-54fa-a014-69d9-8f621b136766@gmail.com \
--to=peartben@gmail.com \
--cc=Ben.Peart@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).