All of lore.kernel.org
 help / color / mirror / Atom feed
From: Victoria Dye <vdye@github.com>
To: Elijah Newren <newren@gmail.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH v2 5/7] read-tree: narrow scope of index expansion for '--prefix'
Date: Mon, 28 Feb 2022 10:44:51 -0800	[thread overview]
Message-ID: <51bd953d-7a57-9643-f4d3-c0e7c0dc9bf6@github.com> (raw)
In-Reply-To: <CABPp-BE6n2-v7-91b-aXOsMod6d3dA4zZE4nNyzxac+tVpaaCw@mail.gmail.com>

Elijah Newren wrote:
> Hi Victoria,
> 
> On Fri, Feb 25, 2022 at 12:25 PM Victoria Dye <vdye@github.com> wrote:
>>
>> Elijah Newren wrote:
>>> On Thu, Feb 24, 2022 at 2:34 PM Victoria Dye via GitGitGadget
>>> <gitgitgadget@gmail.com> wrote:
>>>>
>>>> From: Victoria Dye <vdye@github.com>
>>>>
>>>> When 'git read-tree' is provided with a prefix, expand the index only if the
>>>> prefix is equivalent to a sparse directory or contained within one. If the
>>>> index is not expanded in these cases, 'ce_in_traverse_path' will indicate
>>>> that the relevant sparse directory is not in the prefix/traverse path,
>>>> skipping past it and not unpacking the appropriate tree(s).
>>>>
>>>> If the prefix is in-cone, its sparse subdirectories (if any) will be
>>>> traversed correctly without index expansion.
>>>>
>>>> Signed-off-by: Victoria Dye <vdye@github.com>
>>>> ---
>>>>  builtin/read-tree.c                      |  3 +--
>>>>  t/t1092-sparse-checkout-compatibility.sh |  8 ++++++-
>>>>  unpack-trees.c                           | 30 ++++++++++++++++++++++++
>>>>  3 files changed, 38 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
>>>> index c2fdbc2657f..a7b7f822281 100644
>>>> --- a/builtin/read-tree.c
>>>> +++ b/builtin/read-tree.c
>>>> @@ -213,8 +213,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>>>>         if (opts.merge && !opts.index_only)
>>>>                 setup_work_tree();
>>>>
>>>> -       /* TODO: audit sparse index behavior in unpack_trees */
>>>> -       if (opts.skip_sparse_checkout || opts.prefix)
>>>> +       if (opts.skip_sparse_checkout)
>>>>                 ensure_full_index(&the_index);
>>>>
>>>>         if (opts.merge) {
>>>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>>>> index ae44451a0a9..a404be0a10f 100755
>>>> --- a/t/t1092-sparse-checkout-compatibility.sh
>>>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>>>> @@ -1415,7 +1415,13 @@ test_expect_success 'sparse index is not expanded: read-tree' '
>>>>         do
>>>>                 ensure_not_expanded read-tree -mu $MERGE_TREES &&
>>>>                 ensure_not_expanded reset --hard HEAD || return 1
>>>> -       done
>>>> +       done &&
>>>> +
>>>> +       rm -rf sparse-index/deep/deeper2 &&
>>>> +       ensure_not_expanded add . &&
>>>> +       ensure_not_expanded commit -m "test" &&
>>>> +
>>>> +       ensure_not_expanded read-tree --prefix=deep/deeper2 -u deepest
>>>>  '
>>>>
>>>>  test_expect_success 'ls-files' '
>>>> diff --git a/unpack-trees.c b/unpack-trees.c
>>>> index 360844bda3a..dba122a02bb 100644
>>>> --- a/unpack-trees.c
>>>> +++ b/unpack-trees.c
>>>> @@ -1739,6 +1739,36 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>>>>                 setup_standard_excludes(o->dir);
>>>>         }
>>>>
>>>> +       /*
>>>> +        * If the prefix is equal to or contained within a sparse directory, the
>>>> +        * index needs to be expanded to traverse with the specified prefix.
>>>> +        */
>>>> +       if (o->prefix && o->src_index->sparse_index) {
>>>> +               int prefix_len = strlen(o->prefix);
>>>> +
>>>> +               while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/')
>>>> +                       prefix_len--;
>>>> +
>>>> +               if (prefix_len > 0) {
>>>
>>> Is this condition check necessary?  If we want some safety check here,
>>> could it instead be something like
>>>
>>>    if (prefix_len <= 0)
>>>        BUG("Broken prefix passed to unpack_trees");
>>>
>>
>> This condition was intended to skip unnecessary computation for the
>> (probably unlikely, but still technically valid) case where the prefix is
>> the repo root (e.g., '--prefix=/') - because the repo root is represented
>> with only directory separator(s), `prefix_len` would be 0 after removing
>> trailing '/'. In that scenario, the index won't need to be expanded, so we
>> don't need to go looking in the index for that path.
> 
> Wow, that doesn't result in an error?!?  That surprises me.  I never
> even thought to test such a thing.  Clearly, the following related
> command does give an error:
> 
>   git read-tree --prefix=/toplevel/ $TREE
> 
> (namely, "error: invalid path '/toplevel/subpath'")
> 
> whereas, the reason a single slash is accepted it because git is
> trying to be forgiving and treat the following two commands the same:
> 
>   git read-tree --prefix=subdir/ $TREE
>   git read-tree --prefix=subdir $TREE
> 
> i.e. it tries to allow the trailing slash to be optional.  And, by
> implementation quirk, making a trailing slash be optional turns out to
> mean that --prefix=/ is treated the same as no prefix at all, because
> empty string prefix just happens to give the same behavior as NULL
> prefix.
> 

If by "NULL prefix" you mean "the --prefix option isn't specified", the
behavior isn't *quite* the same as with an empty prefix, since `git
read-tree $TREE` will completely overwrite the index without regard for
what's already there, whereas `git read-tree --prefix= $TREE` "merges"
`$TREE` into the current index (failing if any entries in `$TREE` already
exist in the index). I still think it's an extremely niche usage, but can
also imagine a couple instances where it might be useful (e.g., reading from
an orphan branch).

> I think we should just throw an error if prefix starts with '/'.
> unpack_trees() can make it be a BUG() (and at the beginning of the
> function rather than down at this point and only inside some
> conditional). builtin/read-tree.c, the only thing that ever sets
> prefix in unpack_trees_options, should die() if it's passed something
> that starts with a '/'.  Having paths start with a '/' is antithetical
> to how "prefix" is used throughout the codebase, though I guess I can
> see people making that mistake if they are used to gitignore-style
> patterns instead.
> 

I think this makes sense (especially because it doesn't cause a loss of
functionality - a user can still specify '--prefix=' for the repo root if
they really want to). I'll include a patch for it in my next re-roll.

>> None of that is particularly clear from reading the patch, though, so I'll
>> add a comment & test covering it explicitly.
>>
>>> and then dedent the following code?  (Or are callers allowed to not
>>> sanitize their input before passing to unpack_trees(), meaning that we
>>> should use a die() rather than a BUG()?)
>>>
>>> To test this idea, near the top of unpack_trees(), I added:
>>>     if (o->prefix)
>>>         assert(*o->prefix && *o->prefix != '/');
>>> and reran all tests.  They all ran without hitting that assertion.  FWIW.
>>>
>>>> +                       struct strbuf ce_prefix = STRBUF_INIT;
>>>> +                       strbuf_grow(&ce_prefix, prefix_len + 1);
>>>> +                       strbuf_add(&ce_prefix, o->prefix, prefix_len);
>>>> +                       strbuf_addch(&ce_prefix, '/');
>>>> +
>>>> +                       /*
>>>> +                        * If the prefix is not inside the sparse cone, then the
>>>> +                        * index is explicitly expanded if it is found as a sparse
>>>> +                        * directory, or implicitly expanded (by 'index_name_pos')
>>>> +                        * if the path is inside a sparse directory.
>>>> +                        */
>>>> +                       if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) &&
>>>> +                           index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0)
>>>
>>> style nit: Can you rewrap both the comments and the code at 80 characters?
>>>
>>
>> I couldn't think of a way to wrap the condition that wouldn't make it more
>> difficult to read. The best I could come up with was:
>>
>>                         if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf,
>>                                                                o->src_index) &&
>>                             index_name_pos(o->src_index,
>>                                            ce_prefix.buf,
>>                                            ce_prefix.len) >= 0)
>>                                 ensure_full_index(o->src_index);
>>
>>
>> which, to me, is a bit hard to parse. Alternatively, though, I can move the
>> prefix-checking logic into its own function (kind of like
>> 'pathspec_needs_expanded_index(...)' in [1]), in which case I won't need to
>> change the current wrapping to keep it under 80 characters.
>>
>> [1] https://lore.kernel.org/git/822d7344587f698e73abba1ca726c3a905f7b403.1638201164.git.gitgitgadget@gmail.com/
>>
>>> It took me a bit of playing and testing to understand these two lines.
>>> The comment helps, but it's still a bit dense to unpack; somehow I
>>> didn't understand that the comment was referring to index_name_pos()'s
>>> call to ensure_full_index().  Once I understood that, it all looks
>>> good.
>>>
>>
>> Sorry about that, I'll revise to make that clearer.
> 
> Thanks.  :-)
> 
>>>
>>>> +                               ensure_full_index(o->src_index);
>>>> +
>>>> +                       strbuf_release(&ce_prefix);
>>>> +               }
>>>> +       }
>>>> +
>>>>         if (!core_apply_sparse_checkout || !o->update)
>>>>                 o->skip_sparse_checkout = 1;
>>>>         if (!o->skip_sparse_checkout && !o->pl) {
>>>> --
>>>> gitgitgadget


  reply	other threads:[~2022-02-28 18:45 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-23 18:25 [PATCH 0/7] Sparse index: integrate with 'read-tree' Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 1/7] sparse-index: prevent repo root from becoming sparse Victoria Dye via GitGitGadget
2022-02-24 16:48   ` Derrick Stolee
2022-02-24 21:42     ` Victoria Dye
2022-02-23 18:25 ` [PATCH 2/7] status: fix nested sparse directory diff in sparse index Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 3/7] read-tree: expand sparse checkout test coverage Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 4/7] read-tree: integrate with sparse index Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 5/7] read-tree: narrow scope of index expansion for '--prefix' Victoria Dye via GitGitGadget
2022-02-23 18:25 ` [PATCH 6/7] read-tree: make two-way merge sparse-aware Victoria Dye via GitGitGadget
2022-02-26  8:05   ` Elijah Newren
2022-02-28 18:04     ` Victoria Dye
2022-03-01  2:56       ` Elijah Newren
2022-02-23 18:25 ` [PATCH 7/7] read-tree: make three-way " Victoria Dye via GitGitGadget
2022-02-24 16:59 ` [PATCH 0/7] Sparse index: integrate with 'read-tree' Derrick Stolee
2022-02-24 22:34 ` [PATCH v2 " Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 1/7] sparse-index: prevent repo root from becoming sparse Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 2/7] status: fix nested sparse directory diff in sparse index Victoria Dye via GitGitGadget
2022-02-25  7:45     ` Elijah Newren
2022-02-28 23:17       ` Victoria Dye
2022-02-24 22:34   ` [PATCH v2 3/7] read-tree: expand sparse checkout test coverage Victoria Dye via GitGitGadget
2022-02-26  8:41     ` Elijah Newren
2022-02-28 18:14       ` Victoria Dye
2022-02-28 23:09     ` Ævar Arnfjörð Bjarmason
2022-02-28 23:27       ` Victoria Dye
2022-02-28 23:46         ` Ævar Arnfjörð Bjarmason
2022-02-24 22:34   ` [PATCH v2 4/7] read-tree: integrate with sparse index Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 5/7] read-tree: narrow scope of index expansion for '--prefix' Victoria Dye via GitGitGadget
2022-02-25  8:38     ` Elijah Newren
2022-02-25 20:25       ` Victoria Dye
2022-02-26  7:52         ` Elijah Newren
2022-02-28 18:44           ` Victoria Dye [this message]
2022-02-24 22:34   ` [PATCH v2 6/7] read-tree: make two-way merge sparse-aware Victoria Dye via GitGitGadget
2022-02-24 22:34   ` [PATCH v2 7/7] read-tree: make three-way " Victoria Dye via GitGitGadget
2022-02-26  8:46   ` [PATCH v2 0/7] Sparse index: integrate with 'read-tree' Elijah Newren
2022-03-01 20:24   ` [PATCH v3 0/8] " Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 1/8] sparse-index: prevent repo root from becoming sparse Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 2/8] status: fix nested sparse directory diff in sparse index Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 3/8] read-tree: explicitly disallow prefixes with a leading '/' Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 4/8] read-tree: expand sparse checkout test coverage Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 5/8] read-tree: integrate with sparse index Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 6/8] read-tree: narrow scope of index expansion for '--prefix' Victoria Dye via GitGitGadget
2022-03-03 17:54       ` Glen Choo
2022-03-03 21:19         ` Victoria Dye
2022-03-04 18:47           ` Glen Choo
2022-03-01 20:24     ` [PATCH v3 7/8] read-tree: make two-way merge sparse-aware Victoria Dye via GitGitGadget
2022-03-01 20:24     ` [PATCH v3 8/8] read-tree: make three-way " Victoria Dye via GitGitGadget
2022-03-02  7:22     ` [PATCH v3 0/8] Sparse index: integrate with 'read-tree' Elijah Newren
2022-03-02 13:40       ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51bd953d-7a57-9643-f4d3-c0e7c0dc9bf6@github.com \
    --to=vdye@github.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.