From: Victoria Dye <vdye@github.com>
To: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH 1/1] mv: integrate with sparse-index
Date: Thu, 17 Mar 2022 14:57:32 -0700
Message-ID: <97a665fe-07c9-c4f6-4ab6-b6c0e1397c31@github.com>
In-Reply-To: <CAJyCBORfAV_TV6DrOxgim4KtU9T-uTibOaQCsJZsi5_FQfci1Q@mail.gmail.com>

Shaoxuan Yuan wrote:
> Hi Derrick,
> 
> On Wed, Mar 16, 2022 at 9:34 PM Derrick Stolee <derrickstolee@github.com> wrote:
>> The issue here is that this file is "untracked", not just outside
>> of the sparse-checkout cone.
> 
> Thanks for the succinct explanation, it makes much more sense now :)
> 
>> Instead, what about
>>
>>         git mv folder2/a deep/new
>>
>> since folder2/a is a tracked file, just not in the working tree
>> since it is outside the sparse-checkout cone.
>>
>> (If it fails, then it should fail the same with and without the
>> sparse index, which is what "test_sparse_match" is for.)
> 
> I tested this and it fails as expected with:
> "fatal: bad source, source=folder2/a, destination=deep/new"
> 

Great! This should then probably be turned into a "test_expect_fail" test in
't1092' - that'll make sure we get both the right behavior and right error
message with sparse index after it's enabled.

However, I also get the same result when I add the '--sparse' option. I
would expect the behavior to be "move 'folder2/a' to 'deep/new' and check it
out in the worktree" - this may be a good candidate for improving the
existing integration with sparse *checkout* before enabling sparse *index*
(e.g., like when 'git add' was updated to not add sparse files by default
[1]).

[1] https://lore.kernel.org/git/2c5c834bc9fb42aeaff7befbba477aec727184c0.1632497954.git.gitgitgadget@gmail.com/

>> Thanks,
>> -Stolee
> 
> Thanks for the reply above!
> 
> Other than that, I also have found another issue (probably), with
> 
> $ mkdir folder2
> $ git mv folder2 deep
> 
> After these I do:
> 
> $ git status
> 
> And the output indicates that the index is updated with the following changes:
> 
>         renamed:    folder2/0/0/0 -> deep/folder2/0/0/0
>         renamed:    folder2/0/1 -> deep/folder2/0/1
>         renamed:    folder2/a -> deep/folder2/a
> 
> Nothing fails, which is not what I expected. What I expect is `git mv` will
> fail because it is being told to update a sparse-directory (which as I read the
> blogs and sparse-index.txt is taken as a sparse-directory entry) outside of the
> sparse-checkout cone. Unless `git mv` is supplied with `--sparse`, the command
> will do nothing but fail, no?
> 

I think you're right that this is a bug. This appears to come from the fact
that 'mv' decides whether a directory is sparse only *after* it sees that it
doesn't exist on-disk. 

> What confuses me more is that the `folder2`, which is present in the index but
> not in the working tree (due to sparse-checkout cone), seems to be "unlocked"
> and re-picked up by Git after `mkdir folder2` and move `folder2` into the cone
> area. And still, the files under `deep/folder2` are not present in the working
> tree (might be relevant to the previous context).
> 

This is a consequence of how sparse-checkout is implemented. The files in
'folder2/' aren't on-disk (and aren't correspondingly shown as "deleted" in
'git status') because the index entries of 'folder2/' files are marked with
the "SKIP_WORKTREE" flag. This flag basically indicates to git "this file
shouldn't be in the worktree (on-disk), but it is part of the repository (in
the index)". In other words, it's what makes a file "sparse", and
sparse-checkout (when you run 'set' or 'add') assigns that flag based on the
user-specified patterns. However, the flag isn't constantly being
re-evaluated - only certain commands change SKIP_WORKTREE, and only because
they do so explicitly (e.g., 'git reset --mixed' [2]) - leading to
situations like what you're seeing, where 'folder2/' files (which have
SKIP_WORKTREE enabled) are moved into the sparse cone, but still have
SKIP_WORKTREE enabled.

So I think there are three potential things to fix here: 

1. When empty folder2/ is on-disk, 'git mv' (without '--sparse') doesn't
   fail with "bad source", even though it should.
2. When you try to move a sparse file with 'git mv --sparse', it still
   fails.
3. SKIP_WORKTREE is not removed from out-of-cone files moved into the sparse
   cone.
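
The three scenarios above can be reproduced with something like the
following sketch. The repository layout and the 'make_sparse_repo' helper
are invented for illustration. The script only records what happens and
asserts nothing, because the outcome depends on the Git version. On the
version discussed in this thread, scenario 1 succeeds, scenario 2 fails,
and the moved entries keep SKIP_WORKTREE.

```shell
#!/bin/sh
# Sketch only: records the outcome of each scenario without judging it.
# Layout and helper names are hypothetical.

make_sparse_repo () {
	git init -q "$1" &&
	(
		cd "$1" &&
		mkdir -p deep folder2 &&
		echo a >deep/a &&
		echo a >folder2/a &&
		git add . &&
		git -c user.name=demo -c user.email=demo@example.com \
			commit -q -m initial &&
		git sparse-checkout init --cone &&
		git sparse-checkout set deep
	)
}

tmp=$(mktemp -d)

# 1. An empty folder2/ exists on disk; move it without --sparse.
make_sparse_repo "$tmp/one" &&
mkdir -p "$tmp/one/folder2" &&
git -C "$tmp/one" mv folder2 deep >/dev/null 2>&1
status_dir=$?

# 2. Move a single out-of-cone file with --sparse.
make_sparse_repo "$tmp/two" &&
git -C "$tmp/two" mv --sparse folder2/a deep/new >/dev/null 2>&1
status_file=$?

# 3. Look at the tags of everything now under deep/ in the first repository
#    ('S' means the entry still has SKIP_WORKTREE set).
tags=$(git -C "$tmp/one" ls-files -t deep 2>/dev/null)

echo "1. 'git mv folder2 deep' exit status: $status_dir"
echo "2. 'git mv --sparse folder2/a deep/new' exit status: $status_file"
echo "3. tags under deep/ after scenario 1:"
echo "$tags"
```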

On a related note, there is precedent for needing to make fixes like this
before integrating with sparse index. For example: in addition to the
earlier examples in 'add' and 'reset', 'checkout-index' was changed to no
longer check out SKIP_WORKTREE files by default [3]. It's a somewhat expected
part of this process because sparse-checkout is still "experimental", and
one of our secondary goals with this sparse index work is to improve the
behavior of sparse-checkout in the commands we integrate. All of this
combined will, ideally, make the experience of using sparse-checkout much
nicer for users (both from usability and performance perspectives).

[2] https://lore.kernel.org/git/b221b00b7e06a3b135b9f68ce87cffaa7d782581.1638201164.git.gitgitgadget@gmail.com/
[3] https://lore.kernel.org/git/601888606d1cf7d7752844dbdbc7fac20d4be8c4.1641924306.git.gitgitgadget@gmail.com/

> I haven't run the gdb to see into the process, I just get somehow confused by
> these discrepancies (seemingly to me). I think I should gdb into it though,
> getting some info here from people can also be really helpful :)
> 

Another tool that may help you here is 'git ls-files --sparse -t'. It lists
the files in the index and their "tags" ('H' is "normal" tracked files, 'S'
is SKIP_WORKTREE, etc. [4]), which can help identify when a file you'd
expect to be SKIP_WORKTREE is not and vice versa.

[4] https://git-scm.com/docs/git-ls-files#Documentation/git-ls-files.txt--t 
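
For a concrete picture of those tags, here is a small sketch on a throwaway
repository (the layout is made up for this example). It uses plain
'git ls-files -t'. The '--sparse' option only changes the output when the
index itself is sparse, in which case sparse directories are listed as
single entries instead of being expanded into files.

```shell
#!/bin/sh
# Sketch only: shows the 'H' and 'S' tags in a cone-mode sparse checkout.
# The repository layout is hypothetical.

tmp=$(mktemp -d) &&
git init -q "$tmp/repo" &&
(
	cd "$tmp/repo" &&
	mkdir -p deep folder2/0 &&
	echo a >deep/a &&
	echo a >folder2/a &&
	echo 1 >folder2/0/1 &&
	git add . &&
	git -c user.name=demo -c user.email=demo@example.com \
		commit -q -m initial &&
	git sparse-checkout init --cone &&
	git sparse-checkout set deep
)

# One line per index entry, prefixed with its tag.
tags=$(git -C "$tmp/repo" ls-files -t)
echo "$tags"
# H deep/a
# S folder2/0/1
# S folder2/a

# Keep only the paths that are marked SKIP_WORKTREE.
sparse_paths=$(printf '%s\n' "$tags" | sed -n 's/^S //p')
echo "$sparse_paths"
# folder2/0/1
# folder2/a
```

Running the same listing before and after a 'git mv' makes it easy to spot
entries that moved into the cone but still carry the 'S' tag.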

