From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren <newren@gmail.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v2 2/8] sparse-checkout: add basics of 'clean' command
Date: Thu, 11 Sep 2025 09:37:46 -0400
Message-ID: <4ce92ef9-61ef-491e-80a3-370e92fd10fd@gmail.com>
In-Reply-To: <CABPp-BFzMLGJwz4QqYtvw3zRYgmC=Mb8T8GCOsrLZqT2z+8H7A@mail.gmail.com>

On 8/5/25 5:32 PM, Elijah Newren wrote:
> On Wed, Jul 16, 2025 at 6:34 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> 
> Sorry for the long delay in responding...
> 
> [...]
>> Add a new subcommand to 'git sparse-checkout' that removes these
>> tracked-but-sparse directories. This necessarily removes all files
>> contained within, including tracked and untracked files. Of particular
> 
> Nice to see tracked files also being addressed in v2.
> 
>> importance are ignored and excluded files which would normally be
>> ignored even by 'git clean -f' unless the '-x' or '-X' option is
>> provided. This is the most extreme method for doing this, but it works
>> when the sparse-checkout is in cone mode and is expected to rescope
>> based on directories, not files.
>>
>> The current implementation always deletes these sparse directories
>> without warning. This is unacceptable for a released version, but those
>> features will be added in changes coming immediately after this one.
>>
>> Note that untracked directories within the sparse-checkout remain.
> 
> You've changed the wording here relative to v1, but you haven't
> addressed the part that was ambiguous/misleading in v1.  In fact, you
> may have made a different part ambiguous as well, and made readers
> think that this sentence contradicts your above claims that this
> command is meant to clean out untracked directories underneath sparse
> directories.  Perhaps something like:
> 
> "Note that untracked directories in the sparse-checkout that are not
> within sparse directories will not be removed by this command; it only
> cleans up paths under directories that are supposed to be sparse."

Both here and in the documentation, things can get a bit confusing.
In the v3 I'm preparing, I'm taking the following approach:

  * In the commit message, focus on the implementation details and how
    that impacts the behavior of the tool.

  * In the documentation, focus on the list of files that will be
    "considered for removal". Use the broadest definition there:
    any file in a tracked directory that is outside of the
    sparse-checkout. Add pointers that explain the exceptions and
    how to resolve them, but don't attempt to explain every special
    case.

>> +test_expect_success 'clean with staged sparse change' '
>> +       git -C repo sparse-checkout set --cone deep/deeper1 &&
>> +       mkdir repo/deep/deeper2 repo/folder1 repo/folder2 &&
>> +       touch repo/deep/deeper2/file &&
>> +       touch repo/folder1/file &&
>> +       echo dirty >repo/folder2/a &&
>> +
>> +       git -C repo add --sparse folder1/file &&
>> +
>> +       # deletes deep/deeper2/ but leaves folder1/ and folder2/
>> +       cat >expect <<-\EOF &&
>> +       Removing deep/deeper2/
>> +       EOF
>> +
>> +       git -C repo sparse-checkout clean >out &&
>> +       test_cmp expect out &&
>> +
>> +       test_path_is_missing repo/deep/deeper2 &&
>> +       test_path_exists repo/folder1
> 
> What about repo/folder2/ ?
> 
> Anyway, this test shows that neither staged nor unstaged changes are
> cleaned up (which at least resolves the conflicting documentation you
> provided on the matter) -- or would if you also checked repo/folder2.
> 
> What it doesn't show is that tracked files with neither staged nor
> unstaged changes are not cleaned up either:
> 
> $ mkdir repo/folder2
> $ echo dirty >repo/folder2/a
> $ touch repo/folder2/untracked
> $ cd repo
> $ git status --porcelain
>   M folder2/a
> ?? folder2/untracked
> 
> # So, we have both an unstaged change and an untracked file; let's undo
> the unstaged change
> 
> $ git checkout HEAD folder2/a
> Updated 1 path from 8cc814f
> $ git status --porcelain
> ?? folder2/untracked
> $ ls folder2/
> a  untracked
> 
> # Both files are still present -- the untracked file, and the
> tracked file with no changes either staged or unstaged -- what does
> `git sparse-checkout clean` do?

It seems that an unstaged modification to a tracked, sparse file
is enough to prevent the sparse directory from being collapsed.
This is similar to how 'git sparse-checkout reapply' refuses to
remove modified files. I'll be sure to update my advice around
special cases to include this (and lock it in with a test case).
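
For illustration, here is a minimal sketch of the 'reapply' behavior
mentioned above. It is not the test case planned for v3. The throwaway
repository, file names, and identity settings are all hypothetical, and
the script deliberately does not depend on the exact warning text.

```shell
#!/bin/sh
set -e

# Throwaway repository; every path below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir folder1 folder2
echo one >folder1/a
echo two >folder2/a
git add .
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m initial

# Restrict the cone to folder1/; folder2/ leaves the worktree.
git sparse-checkout set --cone folder1

# Recreate the sparse file with different content (an unstaged change).
mkdir -p folder2
echo dirty >folder2/a

# Re-apply the sparsity patterns. The exit status and the warning text
# are intentionally not relied upon in this sketch.
git sparse-checkout reapply 2>/dev/null || true

# The modified file is expected to survive the reapply.
content=$(cat folder2/a)
echo "folder2/a still contains: $content"
```

The only thing this checks is that the modified file is left in place,
which is the property that also blocks the directory collapse.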

> $ git sparse-checkout clean
> $ ls folder2/
> a  untracked
> $ git status --porcelain
> ?? folder2/untracked
> 
> # Absolutely nothing.  Not only does it not clean anything up, it
> gives no warnings about not cleaning up what should be cleaned up.

At this point, the SKIP_WORKTREE bit on folder2/a is still cleared
because we've staged the change.
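
As a rough aid for following the SKIP_WORKTREE state, 'git ls-files -t'
prints a status tag before each path: 'S' for entries with the bit set
and 'H' for ordinary cached entries. The sketch below only shows the
state right after 'sparse-checkout set'. The repository layout and
identity settings are hypothetical.

```shell
#!/bin/sh
set -e

# Throwaway repository; every path below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir folder1 folder2
echo one >folder1/a
echo two >folder2/a
git add .
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m initial

# Only folder1/ stays in the cone, so folder2/a gets SKIP_WORKTREE.
git sparse-checkout set --cone folder1

# Show the per-entry status tags: H = cached, S = skip-worktree.
tags=$(git ls-files -t)
echo "$tags"
```

Running the same 'git ls-files -t' at each step of the transcript is one
way to see when the bit is set or cleared for folder2/a.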

> Let's try sparse-checkout reapply:
> 
> $ git sparse-checkout reapply
> warning: directory 'folder2/' contains untracked files, but is not in
> the sparse-checkout cone
> $ git status --porcelain
> ?? folder2/untracked
> $ ls folder2/
> untracked
>
> # So `sparse-checkout reapply` does correctly remove folder2/a for us,
> while warning about the untracked file.  (If folder2/a would have
> still had changes, it would have warned about it instead of
> removing.).  Let's try `sparse-checkout clean` now...
> 
> $ git sparse-checkout clean
> Removing folder2/
> $ git status --porcelain
> $ ls folder2/
> ls: cannot access 'folder2/': No such file or directory
> $
> 
> I think these cases either need to be a new testcase or part of this
> last testcase, and the commit message and documentation should be
> clearer about tracked-and-staged, tracked-with-unstaged-changes, and
> tracked-with-no-changes files...or at least comment that they'll be
> discussed later in the patch series.  (I have a feeling I just did a
> lot of work to discover as I read your next patches that you cover
> these later...)

No! You found interesting ways to test special cases. Thanks!

Describing the lifecycle of a sparse file (with a sibling untracked
file) as it goes from modified to staged to sparse, which finally
unlocks the cleaning, would be helpful documentation.

I do think there is an interesting piece of extra functionality that
we should consider for the future: "What files are in my worktree
that _should_ be sparse? Why is 'clean' not removing them?"
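
Until something like that exists, the first half of the question can be
approximated outside of Git. The sketch below is not part of this
series. The helper names 'in_cone' and 'list_should_be_sparse' are made
up, it only considers tracked files, and it re-implements a simplified
reading of cone-mode matching (root files, files directly inside parents
of cone directories, and everything under cone directories), so treat
its output as a hint only.

```shell
#!/bin/sh
set -e

# in_cone PATH: succeed if PATH looks like it belongs to the cone
# described by 'git sparse-checkout list' (simplified matching).
in_cone () {
	dir=$(dirname "$1")
	test "$dir" = . && return 0	# root files are always included
	while IFS= read -r cone
	do
		case "$dir/" in "$cone"/*) return 0 ;; esac	# under a cone dir
		case "$cone/" in "$dir"/*) return 0 ;; esac	# parent of a cone dir
	done <<EOF
$(git sparse-checkout list)
EOF
	return 1
}

# Print tracked paths that are outside the cone but present on disk.
list_should_be_sparse () {
	git ls-files | while IFS= read -r path
	do
		if ! in_cone "$path" && test -e "$path"
		then
			echo "$path"
		fi
	done
}

# Hypothetical demo repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p deep/deeper1 deep/deeper2 folder1 folder2
for f in deep/a deep/deeper1/a deep/deeper2/a folder1/a folder2/a
do
	echo content >"$f"
done
git add .
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m initial
git sparse-checkout set --cone deep/deeper1

# Bring back one file that is supposed to be sparse.
mkdir -p folder2
echo dirty >folder2/a

leftover=$(list_should_be_sparse)
echo "$leftover"
```

Answering the second half ("why is 'clean' not removing them?") would
still need Git's own knowledge of modified, staged, and conflicted
entries, which this sketch does not attempt.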

Thanks,
-Stolee

