From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren <newren@gmail.com>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v2 2/8] sparse-checkout: add basics of 'clean' command
Date: Thu, 11 Sep 2025 09:37:46 -0400
Message-ID: <4ce92ef9-61ef-491e-80a3-370e92fd10fd@gmail.com>
In-Reply-To: <CABPp-BFzMLGJwz4QqYtvw3zRYgmC=Mb8T8GCOsrLZqT2z+8H7A@mail.gmail.com>
On 8/5/25 5:32 PM, Elijah Newren wrote:
> On Wed, Jul 16, 2025 at 6:34 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>
> Sorry for the long delay in responding...
>
> [...]
>> Add a new subcommand to 'git sparse-checkout' that removes these
>> tracked-but-sparse directories. This necessarily removes all files
>> contained within, including tracked and untracked files. Of particular
>
> Nice to see tracked files also being addressed in v2.
>
>> importance are ignored and excluded files which would normally be
>> ignored even by 'git clean -f' unless the '-x' or '-X' option is
>> provided. This is the most extreme method for doing this, but it works
>> when the sparse-checkout is in cone mode and is expected to rescope
>> based on directories, not files.
>>
>> The current implementation always deletes these sparse directories
>> without warning. This is unacceptable for a released version, but those
>> features will be added in changes coming immediately after this one.
>>
>> Note that untracked directories within the sparse-checkout remain.
>
> You've changed the wording here relative to v1, but you haven't
> addressed the part that was ambiguous/misleading in v1. In fact, you
> may have made a different part ambiguous as well, and made readers
> think that this sentence contradicts your above claims that this
> command is meant to clean out untracked directories underneath sparse
> directories. Perhaps something like:
>
> "Note that untracked directories in the sparse-checkout that are not
> within sparse directories will not be removed by this command; it only
> cleans up paths under directories that are supposed to be sparse."

Both here and in the documentation, things can get a bit confusing.
In the v3 I'm preparing, I'm taking the following approach:

 * In the commit message, focus on the implementation details and how
   they affect the behavior of the tool.

 * In the documentation, focus on the list of files that will be
   "considered for removal". Use the broadest definition there: any
   file in a tracked directory that is outside of the sparse-checkout.
   Add pointers that explain the exceptions and how to remove these
   exceptions, but don't attempt to explain every special case.

>> +test_expect_success 'clean with staged sparse change' '
>> + git -C repo sparse-checkout set --cone deep/deeper1 &&
>> + mkdir repo/deep/deeper2 repo/folder1 repo/folder2 &&
>> + touch repo/deep/deeper2/file &&
>> + touch repo/folder1/file &&
>> + echo dirty >repo/folder2/a &&
>> +
>> + git -C repo add --sparse folder1/file &&
>> +
>> + # deletes deep/deeper2/ but leaves folder1/ and folder2/
>> + cat >expect <<-\EOF &&
>> + Removing deep/deeper2/
>> + EOF
>> +
>> + git -C repo sparse-checkout clean >out &&
>> + test_cmp expect out &&
>> +
>> + test_path_is_missing repo/deep/deeper2 &&
>> + test_path_exists repo/folder1
>
> What about repo/folder2/ ?
>
> Anyway, this test shows that neither staged nor unstaged changes are
> cleaned up (which at least resolves the conflicting documentation you
> provided on the matter) -- or would if you also checked repo/folder2.
>
> What it doesn't show is that tracked files with neither staged nor
> unstaged changes are not cleaned up either:
>
> $ mkdir repo/folder2
> $ echo dirty >repo/folder2/a
> $ touch repo/folder2/untracked
> $ cd repo
> $ git status --porcelain
> M folder2/a
> ?? folder2/untracked
>
> # So, we have both an unstaged change and an untracked file; let's undo
> the unstaged change
>
> $ git checkout HEAD folder2/a
> Updated 1 path from 8cc814f
> $ git status --porcelain
> ?? folder2/untracked
> $ ls folder2/
> a untracked
>
> # Both files are still present -- the untracked file, and the
> tracked file with no changes either staged or unstaged -- what does
> `git sparse-checkout clean` do?

It seems that an unstaged modification to a tracked, sparse file
is enough to prevent the sparse directory from collapsing. This is
similar to how 'git sparse-checkout reapply' refuses to remove
modified files. I'll be sure to update my advice around special
cases to include this (and lock it in with a test case).

> $ git sparse-checkout clean
> $ ls folder2/
> a untracked
> $ git status --porcelain
> ?? folder2/untracked
>
> # Absolutely nothing. Not only does it not clean anything up, it
> gives no warnings about not cleaning up what should be cleaned up.

At this point, the SKIP_WORKTREE bit is still removed because
we've staged the change.

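
For anyone following along, here is a minimal sketch of one way to
watch the SKIP_WORKTREE bit on a single path. It assumes a throwaway
repository (the paths deep/a and folder2/a are made up for
illustration) and relies only on 'git ls-files -t', which tags index
entries with S when the bit is set and H otherwise. It is not taken
from the patch series.

```shell
#!/bin/sh
# Sketch: watch the SKIP_WORKTREE bit on one path using 'git ls-files -t'.
# Tags: "S" means SKIP_WORKTREE is set, "H" is an ordinary cached entry.
# The repository layout (deep/a, folder2/a) is hypothetical.
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example"
git config user.email "example@example.com"

mkdir deep folder2
echo one >deep/a
echo one >folder2/a
git add .
git commit -q -m "initial"

tag_before=$(git ls-files -t folder2/a)

# Restrict the cone to deep/. Fall back to 'init --cone' for Git
# versions whose 'set' subcommand does not accept --cone.
git sparse-checkout set --cone deep 2>/dev/null || {
	git sparse-checkout init --cone &&
	git sparse-checkout set deep
}
tag_sparse=$(git ls-files -t folder2/a)

# Recreate the sparse file on disk. Recent Git versions clear the bit
# for files that are present in the worktree, so the tag may change here.
mkdir -p folder2
echo dirty >folder2/a
tag_present=$(git ls-files -t folder2/a)

echo "before sparse-checkout: $tag_before"
echo "after sparse-checkout:  $tag_sparse"
echo "after recreating file:  $tag_present"
```

Comparing the second and third tags shows whether the bit was dropped
once the file reappeared on disk.
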
> Let's try sparse-checkout reapply:
>
> $ git sparse-checkout reapply
> warning: directory 'folder2/' contains untracked files, but is not in
> the sparse-checkout cone
> $ git status --porcelain
> ?? folder2/untracked
> $ ls folder2/
> untracked
>
> # So `sparse-checkout reapply` does correctly remove folder2/a for us,
> while warning about the untracked file. (If folder2/a had still had
> changes, it would have warned about it instead of removing it.)
> Let's try `sparse-checkout clean` now...
>
> $ git sparse-checkout clean
> Removing folder2/
> $ git status --porcelain
> $ ls folder2/
> ls: cannot access 'folder2/': No such file or directory
> $
>
> I think these cases either need to be a new testcase or part of this
> last testcase, and the commit message and documentation should be
> clearer about tracked-and-staged, tracked-with-unstaged-changes, and
> tracked-with-no-changes files...or at least comment that they'll be
> discussed later in the patch series. (I have a feeling I just did a
> lot of work to discover as I read your next patches that you cover
> these later...)

No! You found interesting ways to test special cases. Thanks!

Describing the lifecycle of a sparse file (with a sibling untracked
file) as it goes from modified to staged to sparse, which finally
unlocks the cleaning, would be helpful documentation.

I do think there is interesting extra functionality that we
should consider for the future: "What files are in my worktree
that _should_ be sparse? Why is 'clean' not removing them?"

Thanks,
-Stolee