git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  newren@gmail.com,
	 Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH 2/3] sparse-checkout: add 'clean' command
Date: Tue, 08 Jul 2025 14:20:27 -0700	[thread overview]
Message-ID: <xmqqa55etm5g.fsf@gitster.g> (raw)
In-Reply-To: <49418e8ec8a4c3e0ce9c65aa700042b6f3f3f4d7.1751973594.git.gitgitgadget@gmail.com> (Derrick Stolee via GitGitGadget's message of "Tue, 08 Jul 2025 11:19:52 +0000")

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <stolee@gmail.com>
>
> When users change their sparse-checkout definitions to add new
> directories and remove old ones, there may be a few reasons why
> directories no longer in scope remain (ignored or excluded files still
> exist, Windows handles are still open, etc.). When these files still
> exist, the sparse index feature notices that a tracked, but sparse,
> directory still exists on disk and thus the index expands. This causes a
> performance hit _and_ the advice printed isn't very helpful. Using 'git
> clean' isn't enough (generally '-dfx' may be needed) but also this may
> not be sufficient.
>
> Add a new subcommand to 'git sparse-checkout' that removes these
> tracked-but-sparse directories, including any excluded or ignored files

Are excluded files and ignored files form two separate sets, or are
they one and the same?  Do files that users forgot to add (e.g. new
source file that would not match any patterns listed in .gitignore)
and object files left over from the previous compilation (most
likely match *.o in .gitignore) treated the same way for the purpose
of determining if the directory that is no longer in the cone can be
removed?

> underneath. This is the most extreme method for doing this, but it works
> when the sparse-checkout is in cone mode and is expected to rescope
> based on directories, not files.
>
> Be sure to add a --dry-run option so users can predict what will be
> deleted. In general, output the directories that are being removed so
> users can know what was removed.

Hmph.  It would be safer to show not just the directories but which
excluded files are about to be lost, wouldn't it, especially when
the user is trying to play safe and see what potential damage they
are looking at?

Also even though ignored files are "ignored and expendable", nobody
marks their temporary file as "ignored but precious" (yet), so "it
is listed in .gitignore so we can safely remove it" may not be a
safe assumption for us to be making (yet).  Shouldn't we at least be
listing these ignored files in --dry-run output, next to those files
that the user may have forgotten to add?

> Note that untracked directories remain. Further, directories that
> contain staged changes are not deleted. This is a detail that is partly
> hidden by the implementation which relies on collapsing the index to a
> sparse index in-memory and only deleting directories that are listed as
> sparse in the index. If a staged change exists, then that entry is not
> stored as a sparse tree entry and thus remains on-disk until committed
> or reset.

Removing untracked directories is a job for "clean -d", so it makes
sense for this new command not to touch them.  Not losing changes
that have already been added is just a bad as losing new files that
the user forgot to add, so it does make sense not to remove them.

I wonder if we need "-x" and/or "-X" options "clean" has (and
perhaps "-d" that is a no-op, as the whole point of this subcommand
is about removing directories from the working tree) to control its
operation a bit finer-grained way.

> +	for (size_t i = 0; i < repo->index->cache_nr; i++) {
> +		DIR* dir;

The asterisk sticks to the variable, not the type, i.e.

		DIR *dir;

Thanks.

  parent reply	other threads:[~2025-07-08 21:20 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-08 11:19 [PATCH 0/3] sparse-checkout: add 'clean' command Derrick Stolee via GitGitGadget
2025-07-08 11:19 ` [PATCH 1/3] sparse-checkout: remove use of the_repository Derrick Stolee via GitGitGadget
2025-07-08 20:49   ` Elijah Newren
2025-07-08 20:59   ` Junio C Hamano
2025-07-08 11:19 ` [PATCH 2/3] sparse-checkout: add 'clean' command Derrick Stolee via GitGitGadget
2025-07-08 12:15   ` Patrick Steinhardt
2025-07-08 20:30     ` Junio C Hamano
2025-07-08 21:20   ` Junio C Hamano [this message]
2025-07-09 14:39     ` Derrick Stolee
2025-07-09 16:46       ` Junio C Hamano
2025-07-08 21:43   ` Elijah Newren
2025-07-09 16:13     ` Derrick Stolee
2025-07-09 17:35       ` Elijah Newren
2025-07-15 13:38         ` Derrick Stolee
2025-07-15 17:17           ` Elijah Newren
2025-07-08 11:19 ` [PATCH 3/3] sparse-index: point users to new 'clean' action Derrick Stolee via GitGitGadget
2025-07-08 21:45   ` Elijah Newren
2025-07-08 12:15 ` [PATCH 0/3] sparse-checkout: add 'clean' command Patrick Steinhardt
2025-07-08 20:36 ` Elijah Newren
2025-07-08 22:01   ` Elijah Newren
2025-07-08 23:41 ` Junio C Hamano
2025-07-09 15:41   ` Derrick Stolee
2025-07-17  1:34 ` [PATCH v2 0/8] " Derrick Stolee via GitGitGadget
2025-07-17  1:34   ` [PATCH v2 1/8] sparse-checkout: remove use of the_repository Derrick Stolee via GitGitGadget
2025-07-17  1:34   ` [PATCH v2 2/8] sparse-checkout: add basics of 'clean' command Derrick Stolee via GitGitGadget
2025-08-05 21:32     ` Elijah Newren
2025-09-11 13:37       ` Derrick Stolee
2025-07-17  1:34   ` [PATCH v2 3/8] sparse-checkout: match some 'clean' behavior Derrick Stolee via GitGitGadget
2025-08-05 22:06     ` Elijah Newren
2025-09-11 13:52       ` Derrick Stolee
2025-07-17  1:34   ` [PATCH v2 4/8] dir: add generic "walk all files" helper Derrick Stolee via GitGitGadget
2025-08-05 22:22     ` Elijah Newren
2025-07-17  1:34   ` [PATCH v2 5/8] sparse-checkout: add --verbose option to 'clean' Derrick Stolee via GitGitGadget
2025-08-05 22:22     ` Elijah Newren
2025-09-11 14:06       ` Derrick Stolee
2025-07-17  1:34   ` [PATCH v2 6/8] sparse-index: point users to new 'clean' action Derrick Stolee via GitGitGadget
2025-07-17  1:34   ` [PATCH v2 7/8] t: expand tests around sparse merges and clean Derrick Stolee via GitGitGadget
2025-07-17  1:34   ` [PATCH v2 8/8] sparse-checkout: make 'clean' clear more files Derrick Stolee via GitGitGadget
2025-08-06  0:21     ` Elijah Newren
2025-09-11 15:26       ` Derrick Stolee
2025-09-11 16:21         ` Derrick Stolee
2025-08-28 23:22   ` [PATCH v2 0/8] sparse-checkout: add 'clean' command Junio C Hamano
2025-08-29  0:15     ` Elijah Newren
2025-08-29  0:27       ` Junio C Hamano
2025-08-29 21:03         ` Junio C Hamano
2025-08-30 13:41           ` Derrick Stolee
2025-09-12 10:30   ` [PATCH v3 0/7] " Derrick Stolee via GitGitGadget
2025-09-12 10:30     ` [PATCH v3 1/7] sparse-checkout: remove use of the_repository Derrick Stolee via GitGitGadget
2025-09-12 10:30     ` [PATCH v3 2/7] sparse-checkout: add basics of 'clean' command Derrick Stolee via GitGitGadget
2025-10-07 22:49       ` Elijah Newren
2025-10-20 14:16         ` Derrick Stolee
2025-09-12 10:30     ` [PATCH v3 3/7] sparse-checkout: match some 'clean' behavior Derrick Stolee via GitGitGadget
2025-09-12 10:30     ` [PATCH v3 4/7] dir: add generic "walk all files" helper Derrick Stolee via GitGitGadget
2025-09-12 10:30     ` [PATCH v3 5/7] sparse-checkout: add --verbose option to 'clean' Derrick Stolee via GitGitGadget
2025-09-15 18:09       ` Derrick Stolee
2025-09-15 19:12         ` Junio C Hamano
2025-09-16  2:00           ` Derrick Stolee
2025-09-12 10:30     ` [PATCH v3 6/7] sparse-index: point users to new 'clean' action Derrick Stolee via GitGitGadget
2025-10-07 22:53       ` Elijah Newren
2025-10-20 14:17         ` Derrick Stolee
2025-09-12 10:30     ` [PATCH v3 7/7] t: expand tests around sparse merges and clean Derrick Stolee via GitGitGadget
2025-09-12 16:12     ` [PATCH v3 0/7] sparse-checkout: add 'clean' command Junio C Hamano
2025-09-26 13:40       ` Derrick Stolee
2025-09-26 18:58         ` Elijah Newren
2025-10-07 23:07     ` Elijah Newren
2025-10-20 14:25       ` Derrick Stolee
2025-10-20 14:24     ` [PATCH 8/8] sparse-index: improve advice message instructions Derrick Stolee
2025-10-20 16:29       ` Junio C Hamano
2025-10-24  2:22       ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqa55etm5g.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).