git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Don't delete untracked submodule's .git dirs by default
@ 2009-06-30  2:10 Jason Holden
  2009-06-30  2:10 ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Jason Holden
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Holden @ 2009-06-30  2:10 UTC (permalink / raw)
  To: git; +Cc: Jason Holden

Git-clean is not safe when there is a submodule tracked in a local branch that
is not tracked in the mainline branch. Running git-clean from the mainline
branch when we have unpushed changes in a submodule tracked only in a local
branch can lose local changes to that submodule permanentely.

I have accidentally lost changes in this way when working with very 
large projects where a git-clean is more reliable than a makefile's 
"make clean". 

By changing the default behavior of git-clean to not delete the .git
directories allows the history of the submodules to be recovered.

# Example of issue:
#
# Clone mainline project
git clone git://github.com/thoughtbot/paperclip.git           
cd paperclip/

# Add a submodule not tracked by mainline
git checkout -b test_submodule_clean
git submodule add git://github.com/technoweenie/attachment_fu.git attachement_fu
git commit -m "add submodule"

# Make a modification to the submodule.  Note that we haven't pushed the change
cd attachement_fu/
git checkout -b mod_readme_in_submodule
vi README 
git add README
git commit -m "Small change in submodule"

# Go back to mainline's master branch and do a clean
cd ..
git checkout master
git clean -f -d

# Our change to the submodule, that was never pushed, is now gone forever 
# because all the history stored in the submodule's .git direct is deleted.
# There is no recovering from this.
# This breaks the "git must be safe" rule, as we've lost potentially a lot of
# changes to any submodule projects that didn't get pushed yet. Solve
# this issue by not deleting any .git directories we come across during a
# git-clean, unless the "-m" option is passed to git-clean.

This is my first email submittal using git, so apologies in advance for any
formatting issues

Jason Holden (2):
  Add option to not delete a .git directory in remove_dir_recursively()
  Don't clean any untracked submodule's .git dir by default in
    git-clean

 Documentation/git-clean.txt |    6 +++++-
 builtin-clean.c             |   15 +++++++++++++--
 builtin-clone.c             |    4 ++--
 dir.c                       |   17 +++++++++++++++--
 dir.h                       |    2 +-
 refs.c                      |    2 +-
 transport.c                 |    4 ++--
 7 files changed, 39 insertions(+), 11 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively()
  2009-06-30  2:10 [PATCH 0/2] Don't delete untracked submodule's .git dirs by default Jason Holden
@ 2009-06-30  2:10 ` Jason Holden
  2009-06-30  2:10   ` [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean Jason Holden
  2009-06-30  6:48   ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Johannes Sixt
  0 siblings, 2 replies; 10+ messages in thread
From: Jason Holden @ 2009-06-30  2:10 UTC (permalink / raw)
  To: git; +Cc: Jason Holden

Because all existing calls to remove_dir_recursively() do not
currently have this protection, default all existing calls
to 0 (to not keep .git directories)

Signed-off-by: Jason Holden <jason.k.holden@gmail.com>
---
 builtin-clean.c |    2 +-
 builtin-clone.c |    4 ++--
 dir.c           |   17 +++++++++++++++--
 dir.h           |    2 +-
 refs.c          |    2 +-
 transport.c     |    4 ++--
 6 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/builtin-clean.c b/builtin-clean.c
index 1c1b6d2..cd82407 100644
--- a/builtin-clean.c
+++ b/builtin-clean.c
@@ -141,7 +141,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				   (matches == MATCHED_EXACTLY)) {
 				if (!quiet)
 					printf("Removing %s\n", qname);
-				if (remove_dir_recursively(&directory, 0) != 0) {
+				if (remove_dir_recursively(&directory, 0, 0) != 0) {
 					warning("failed to remove '%s'", qname);
 					errors++;
 				}
diff --git a/builtin-clone.c b/builtin-clone.c
index 2ceacb7..0c00c87 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -304,12 +304,12 @@ static void remove_junk(void)
 		return;
 	if (junk_git_dir) {
 		strbuf_addstr(&sb, junk_git_dir);
-		remove_dir_recursively(&sb, 0);
+		remove_dir_recursively(&sb, 0, 0);
 		strbuf_reset(&sb);
 	}
 	if (junk_work_tree) {
 		strbuf_addstr(&sb, junk_work_tree);
-		remove_dir_recursively(&sb, 0);
+		remove_dir_recursively(&sb, 0, 0);
 		strbuf_reset(&sb);
 	}
 }
diff --git a/dir.c b/dir.c
index bbfcb56..eadcddd 100644
--- a/dir.c
+++ b/dir.c
@@ -800,7 +800,7 @@ int is_empty_dir(const char *path)
 	return ret;
 }
 
-int remove_dir_recursively(struct strbuf *path, int only_empty)
+int remove_dir_recursively(struct strbuf *path, int only_empty, int keep_dot_git)
 {
 	DIR *dir = opendir(path->buf);
 	struct dirent *e;
@@ -812,6 +812,19 @@ int remove_dir_recursively(struct strbuf *path, int only_empty)
 		strbuf_addch(path, '/');
 
 	len = path->len;
+
+	if (keep_dot_git) {
+		char end_of_path[6]; /* enough space for ".git/"*/
+		memset(end_of_path, '\0', 6);
+		if (len >= 5) {
+			strncpy(end_of_path, path->buf + len - 5, 5);
+			if (strcmp(end_of_path, ".git/") == 0) {
+				printf("********Found .git!!!!  Skipping delete\n");
+				return 0;
+			}
+		}
+	}
+
 	while ((e = readdir(dir)) != NULL) {
 		struct stat st;
 		if (is_dot_or_dotdot(e->d_name))
@@ -822,7 +835,7 @@ int remove_dir_recursively(struct strbuf *path, int only_empty)
 		if (lstat(path->buf, &st))
 			; /* fall thru */
 		else if (S_ISDIR(st.st_mode)) {
-			if (!remove_dir_recursively(path, only_empty))
+			if (!remove_dir_recursively(path, only_empty, keep_dot_git))
 				continue; /* happy */
 		} else if (!only_empty && !unlink(path->buf))
 			continue; /* happy, too */
diff --git a/dir.h b/dir.h
index 541286a..8273bb9 100644
--- a/dir.h
+++ b/dir.h
@@ -89,7 +89,7 @@ static inline int is_dot_or_dotdot(const char *name)
 extern int is_empty_dir(const char *dir);
 
 extern void setup_standard_excludes(struct dir_struct *dir);
-extern int remove_dir_recursively(struct strbuf *path, int only_empty);
+extern int remove_dir_recursively(struct strbuf *path, int only_empty, int keep_dot_git);
 
 /* tries to remove the path with empty directories along it, ignores ENOENT */
 extern int remove_path(const char *path);
diff --git a/refs.c b/refs.c
index 24438c6..4ddb361 100644
--- a/refs.c
+++ b/refs.c
@@ -820,7 +820,7 @@ static int remove_empty_directories(const char *file)
 	strbuf_init(&path, 20);
 	strbuf_addstr(&path, file);
 
-	result = remove_dir_recursively(&path, 1);
+	result = remove_dir_recursively(&path, 1, 0);
 
 	strbuf_release(&path);
 
diff --git a/transport.c b/transport.c
index 501a77b..067d6b1 100644
--- a/transport.c
+++ b/transport.c
@@ -196,7 +196,7 @@ static struct ref *get_refs_via_rsync(struct transport *transport, int for_push)
 	insert_packed_refs(temp_dir.buf, &tail);
 	strbuf_setlen(&temp_dir, temp_dir_len);
 
-	if (remove_dir_recursively(&temp_dir, 0))
+	if (remove_dir_recursively(&temp_dir, 0, 0))
 		warning ("Error removing temporary directory %s.",
 				temp_dir.buf);
 
@@ -342,7 +342,7 @@ static int rsync_transport_push(struct transport *transport,
 		result = error("Could not push to %s",
 				rsync_url(transport->url));
 
-	if (remove_dir_recursively(&temp_dir, 0))
+	if (remove_dir_recursively(&temp_dir, 0, 0))
 		warning ("Could not remove temporary directory %s.",
 				temp_dir.buf);
 
-- 
1.6.3.2.207.ga8208

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-06-30  2:10 ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Jason Holden
@ 2009-06-30  2:10   ` Jason Holden
  2009-06-30  6:07     ` Paolo Bonzini
  2009-06-30  6:40     ` Johannes Sixt
  2009-06-30  6:48   ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Johannes Sixt
  1 sibling, 2 replies; 10+ messages in thread
From: Jason Holden @ 2009-06-30  2:10 UTC (permalink / raw)
  To: git; +Cc: Jason Holden

Git-clean is not safe when the submodules are not tracked in mainline.  If
we run git-clean on the mainline branch, when we have a submodule that only
exists on a local branch, the entire .git directory of the untracked
submodule will get deleted, possibly losing any un-pushed local changes to
the submodule.

This change doesn't delete any untracked submodule's .git directories during
the recursive-delete (unless forced with the -m option to git-clean), so that
the submodule history can be restored w/ the proper git commands.

# Example illustrating problem:
# Clone mainline project
git clone git://github.com/thoughtbot/paperclip.git
cd paperclip/

# Add a submodule not tracked by mainline
git checkout -b test_submodule_clean
git submodule add git://github.com/technoweenie/attachment_fu.git attachement_fu
git commit -m "add submodule"

# Make a modification to the submodule.  Note that we haven't pushed the change
cd attachement_fu/
git checkout -b mod_readme_in_submodule
vi README
git add README
git commit -m "Small change in submodule"

# Go back to mainline's master branch and do a clean
cd ..
git checkout master
git clean -f -d

# Our change to the submodule, that was never pushed, is now gone forever
# because all the history stored in the submodule's .git direct is deleted.
# There is no recovering from this.
# This breaks the "git must be safe" rule, as we've lost potentially a lot of
# changes to any submodule projects that didn't get pushed yet. Solve
# this issue by not deleting any .git directories we come across during a
# git-clean, unless the "-m" option is passed to git-clean.

Signed-off-by: Jason Holden <jason.k.holden@gmail.com>
---
 Documentation/git-clean.txt |    6 +++++-
 builtin-clean.c             |   15 +++++++++++++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index be894af..04a5a65 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -8,7 +8,7 @@ git-clean - Remove untracked files from the working tree
 SYNOPSIS
 --------
 [verse]
-'git clean' [-d] [-f] [-n] [-q] [-x | -X] [--] <path>...
+'git clean' [-d] [-f] [-n] [-q] [-m] [-x | -X] [--] <path>...
 
 DESCRIPTION
 -----------
@@ -41,6 +41,10 @@ OPTIONS
 	Be quiet, only report errors, but not the files that are
 	successfully removed.
 
+-m::
+	Clean any .git directories that may be left-over, untracked
+	submodules.
+
 -x::
 	Don't use the ignore rules.  This allows removing all untracked
 	files, including build products.  This can be used (possibly in
diff --git a/builtin-clean.c b/builtin-clean.c
index cd82407..60d78dc 100644
--- a/builtin-clean.c
+++ b/builtin-clean.c
@@ -15,7 +15,7 @@
 static int force = -1; /* unset */
 
 static const char *const builtin_clean_usage[] = {
-	"git clean [-d] [-f] [-n] [-q] [-x | -X] [--] <paths>...",
+	"git clean [-d] [-f] [-n] [-q] [-m] [-x | -X] [--] <paths>...",
 	NULL
 };
 
@@ -31,6 +31,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 	int i;
 	int show_only = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, baselen = 0, config_set = 0, errors = 0;
+	int rm_untracked_submodules = 0;
 	struct strbuf directory = STRBUF_INIT;
 	struct dir_struct dir;
 	const char *path, *base;
@@ -44,6 +45,8 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		OPT_BOOLEAN('f', NULL, &force, "force"),
 		OPT_BOOLEAN('d', NULL, &remove_directories,
 				"remove whole directories"),
+		OPT_BOOLEAN('m', NULL, &rm_untracked_submodules,
+				"remove untracked submodules"),
 		OPT_BOOLEAN('x', NULL, &ignored, "remove ignored files, too"),
 		OPT_BOOLEAN('X', NULL, &ignored_only,
 				"remove only ignored files"),
@@ -59,6 +62,14 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, builtin_clean_usage,
 			     0);
 
+
+	int keep_dot_git = 0;
+	if (rm_untracked_submodules == 0)
+		keep_dot_git = 1;
+	else
+		printf("Any untracked .git directories will be deleted (abandoned submodules)\n");
+
+
 	memset(&dir, 0, sizeof(dir));
 	if (ignored_only)
 		dir.flags |= DIR_SHOW_IGNORED;
@@ -141,7 +152,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				   (matches == MATCHED_EXACTLY)) {
 				if (!quiet)
 					printf("Removing %s\n", qname);
-				if (remove_dir_recursively(&directory, 0, 0) != 0) {
+				if (remove_dir_recursively(&directory, 0, keep_dot_git) != 0) {
 					warning("failed to remove '%s'", qname);
 					errors++;
 				}
-- 
1.6.3.2.207.ga8208

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-06-30  2:10   ` [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean Jason Holden
@ 2009-06-30  6:07     ` Paolo Bonzini
  2009-06-30  6:40     ` Johannes Sixt
  1 sibling, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2009-06-30  6:07 UTC (permalink / raw)
  To: Jason Holden; +Cc: git

Useful indeed.

Note that 'git clean -n -d ' however will still report the directory as 
being removed.  Also, I'm not sure what happens (and what should happen) 
if an untracked directory foo.git is found.

Probably the best way to fix this is to add an is_dot_git_path function 
to dir.c like this

int
is_dot_git_path (const char *s, int len);
{
   while (len && s[len - 1] == '/')
     len--;
   return len >= 4 && !memcmp (s + len - 4, ".git", 5) &&
          (len == 4 || s[len - 5] == '/');
}

This function is safer if the directory does not have a trailing slash, 
as it might be for the paths in builtin-clean.c for the -n case.  You 
can have some adjustments if you decide do keep foo.git (just removing 
the last && of course).

Thanks!

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-06-30  2:10   ` [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean Jason Holden
  2009-06-30  6:07     ` Paolo Bonzini
@ 2009-06-30  6:40     ` Johannes Sixt
  2009-06-30  7:34       ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Johannes Sixt @ 2009-06-30  6:40 UTC (permalink / raw)
  To: Jason Holden; +Cc: git

Jason Holden schrieb:
> Git-clean is not safe when the submodules are not tracked in mainline.

Generally, I think you are addressing a real issue. On the other hand, it
also changes the behavior substantially. Nevertheless IMO it is better to
be safe than sorry, even if existing 'git clean' users may now observe
that directories are left over that previously weren't.

>  If
> we run git-clean on the mainline branch, when we have a submodule that only
> exists on a local branch, the entire .git directory of the untracked
> submodule will get deleted, possibly losing any un-pushed local changes to
> the submodule.

This is not about "mainline" and "local branch"; it is about switching
from one branch that tracks the submodule to another one that doesn't
track it.

> This change doesn't delete any untracked submodule's .git directories during
> the recursive-delete (unless forced with the -m option to git-clean), so that
> the submodule history can be restored w/ the proper git commands.
> 
> # Example illustrating problem:
> # Clone mainline project
> git clone git://github.com/thoughtbot/paperclip.git
> cd paperclip/
> 
> # Add a submodule not tracked by mainline
> git checkout -b test_submodule_clean

# Add a submodule to a different branch
# git checkout -b has-submodule

> git submodule add git://github.com/technoweenie/attachment_fu.git attachement_fu
> git commit -m "add submodule"
> 
> # Make a modification to the submodule.  Note that we haven't pushed the change
> cd attachement_fu/
> git checkout -b mod_readme_in_submodule
> vi README
> git add README
> git commit -m "Small change in submodule"
> 
> # Go back to mainline's master branch and do a clean
> cd ..
> git checkout master
> git clean -f -d
> 
> # Our change to the submodule, that was never pushed, is now gone forever
> # because all the history stored in the submodule's .git direct is deleted.
> # There is no recovering from this.
> # This breaks the "git must be safe" rule, as we've lost potentially a lot of
> # changes to any submodule projects that didn't get pushed yet. Solve
> # this issue by not deleting any .git directories we come across during a
> # git-clean, unless the "-m" option is passed to git-clean.

If you indent the example script by some spaces, you won't have to mark
the surrounding text like shell script comments (surrounding text is the
line '# Example...' and the paragraph '# Our change...'. But the
interspersed comments are very helpful.

> +-m::
> +	Clean any .git directories that may be left-over, untracked
> +	submodules.

	Remove .git directories from subdirectories (i.e.
	untracked submodules).

Please address (here and in the code later) that -m makes sense only in
combination with -d.

There is one in-tree user of git-clean (git-filter-branch). Did you check
whether it needs this new flag?

> @@ -44,6 +45,8 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>  		OPT_BOOLEAN('f', NULL, &force, "force"),
>  		OPT_BOOLEAN('d', NULL, &remove_directories,
>  				"remove whole directories"),
> +		OPT_BOOLEAN('m', NULL, &rm_untracked_submodules,
> +				"remove untracked submodules"),
>  		OPT_BOOLEAN('x', NULL, &ignored, "remove ignored files, too"),
>  		OPT_BOOLEAN('X', NULL, &ignored_only,
>  				"remove only ignored files"),
> @@ -59,6 +62,14 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix, options, builtin_clean_usage,
>  			     0);
>  
> +
> +	int keep_dot_git = 0;
> +	if (rm_untracked_submodules == 0)
> +		keep_dot_git = 1;
> +	else
> +		printf("Any untracked .git directories will be deleted (abandoned submodules)\n");
> +
> +

I can share your feelings about lost work, and that you want to be extra
verbose about .git directories.

But step back a bit. This warning is absolutely useless: It just repeats
the user's instruction: After passing -m, we *expect* 'git clean' to
remove .git directories.

BTW, for what reason are you using a new variable keep_dot_git if there is
already rm_untracked_submodules?

Please add a test to the test suite.

-- Hannes

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively()
  2009-06-30  2:10 ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Jason Holden
  2009-06-30  2:10   ` [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean Jason Holden
@ 2009-06-30  6:48   ` Johannes Sixt
  1 sibling, 0 replies; 10+ messages in thread
From: Johannes Sixt @ 2009-06-30  6:48 UTC (permalink / raw)
  To: Jason Holden; +Cc: git

Jason Holden schrieb:
> @@ -812,6 +812,19 @@ int remove_dir_recursively(struct strbuf *path, int only_empty)
>  		strbuf_addch(path, '/');
>  
>  	len = path->len;
> +
> +	if (keep_dot_git) {
> +		char end_of_path[6]; /* enough space for ".git/"*/
> +		memset(end_of_path, '\0', 6);
> +		if (len >= 5) {
> +			strncpy(end_of_path, path->buf + len - 5, 5);
> +			if (strcmp(end_of_path, ".git/") == 0) {
> +				printf("********Found .git!!!!  Skipping delete\n");

I see no reason to ***shout!!!*** here. IOW:

				warning("not removing %s", dir);

is enough. This also sends the text to stderr.

> +				return 0;
> +			}
> +		}
> +	}
> +
>  	while ((e = readdir(dir)) != NULL) {
>  		struct stat st;
>  		if (is_dot_or_dotdot(e->d_name))

I think it is even better to move the check for ".git" below this 'if'. It
should not make a difference in practice.

-- Hannes

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-06-30  6:40     ` Johannes Sixt
@ 2009-06-30  7:34       ` Junio C Hamano
  2009-06-30 23:05         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2009-06-30  7:34 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jason Holden, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Jason Holden schrieb:
> ...
>>  If
>> we run git-clean on the mainline branch, when we have a submodule that only
>> exists on a local branch, the entire .git directory of the untracked
>> submodule will get deleted, possibly losing any un-pushed local changes to
>> the submodule.
>
> This is not about "mainline" and "local branch"; it is about switching
> from one branch that tracks the submodule to another one that doesn't
> track it.

I do not think it is even about that.

If you have an old-style nested git work tree, i.e. you have an
independent git repository in some subdirectory of a work tree of a git
work tree, you will have exactly the same issue.  There is no need for any
submodule to get involved.

For example, I have a clone of git.git repository at Meta/ and have the
'todo' branch checked out, so that I can say "Meta/Make", "Meta/Dothem",
etc.  In such a set-up, if you do not have Meta/ in .gitignore (or even if
you do, if you said "git clean -f -x -d"), you will lose that directory
(and arguably that is by design).

I think protecting users from mistakes is a very good idea, but I see at
least two small problems with the patch.  For brevity I'll name the "not a
submodule in the HEAD commit of the superproject" directory "Meta/" in the
following.

 (1) Protecting Meta/.git is not goot enough. If it were, and if this is
     only about submodules, then you can use the "gitdir:" facility to
     relocate Meta/.git directory to somewhere under superproject's .git
     and be done with it.

     You _must_ protect the checked out files, their uncommitted contents
     and untracked but unignored files.  After all, Meta/ is a valid git
     repository in its own right.  Noticing that "rm -r" is about to
     remove Meta/.git after it has already touched many other files in
     Meta/ is one recursion level too late.

 (2) Naming the option to force removal "-m" is wrong; this is not about
     submodule at all.  Can we use double-force "-f -f", perhaps?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-06-30  7:34       ` Junio C Hamano
@ 2009-06-30 23:05         ` Junio C Hamano
  2009-07-01  1:44           ` Jason Holden
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2009-06-30 23:05 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jason Holden, git

Junio C Hamano <gitster@pobox.com> writes:

> I think protecting users from mistakes is a very good idea, but I see at
> least two small problems with the patch.  For brevity I'll name the "not a
> submodule in the HEAD commit of the superproject" directory "Meta/" in the
> following.
>
>  (1) Protecting Meta/.git is not goot enough. If it were, and if this is
>      only about submodules, then you can use the "gitdir:" facility to
>      relocate Meta/.git directory to somewhere under superproject's .git
>      and be done with it.
>
>      You _must_ protect the checked out files, their uncommitted contents
>      and untracked but unignored files.  After all, Meta/ is a valid git
>      repository in its own right.  Noticing that "rm -r" is about to
>      remove Meta/.git after it has already touched many other files in
>      Meta/ is one recursion level too late.
>
>  (2) Naming the option to force removal "-m" is wrong; this is not about
>      submodule at all.  Can we use double-force "-f -f", perhaps?

Perhaps like this.

Untested, as I never use "git clean" myself.

-- >8 --
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 30 Jun 2009 15:33:45 -0700
Subject: [PATCH] clean: require double -f options to nuke nested git repository and work tree

When you have an embedded git work tree in your work tree (be it
an orphaned submodule, or an independent checkout of an unrelated
project), "git clean -d -f" blindly descended into it and removed
everything.  This is rarely what the user wants.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-clean.c  |    7 ++++++-
 dir.c            |   12 ++++++++++--
 dir.h            |    5 ++++-
 refs.c           |    2 +-
 t/t7300-clean.sh |   39 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/builtin-clean.c b/builtin-clean.c
index 1c1b6d2..04ea181 100644
--- a/builtin-clean.c
+++ b/builtin-clean.c
@@ -31,6 +31,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 	int i;
 	int show_only = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, baselen = 0, config_set = 0, errors = 0;
+	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
 	struct strbuf directory = STRBUF_INIT;
 	struct dir_struct dir;
 	const char *path, *base;
@@ -70,6 +71,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		die("clean.requireForce%s set and -n or -f not given; "
 		    "refusing to clean", config_set ? "" : " not");
 
+	if (force > 1)
+		rm_flags = 0;
+
 	dir.flags |= DIR_SHOW_OTHER_DIRECTORIES;
 
 	if (!ignored)
@@ -141,7 +145,8 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				   (matches == MATCHED_EXACTLY)) {
 				if (!quiet)
 					printf("Removing %s\n", qname);
-				if (remove_dir_recursively(&directory, 0) != 0) {
+				if (remove_dir_recursively(&directory,
+							   rm_flags) != 0) {
 					warning("failed to remove '%s'", qname);
 					errors++;
 				}
diff --git a/dir.c b/dir.c
index bbfcb56..d0cfe74 100644
--- a/dir.c
+++ b/dir.c
@@ -800,12 +800,20 @@ int is_empty_dir(const char *path)
 	return ret;
 }
 
-int remove_dir_recursively(struct strbuf *path, int only_empty)
+int remove_dir_recursively(struct strbuf *path, int flag)
 {
-	DIR *dir = opendir(path->buf);
+	DIR *dir;
 	struct dirent *e;
 	int ret = 0, original_len = path->len, len;
+	int only_empty = (flag & REMOVE_DIR_EMPTY_ONLY);
+	unsigned char submodule_head[20];
 
+	if ((flag & REMOVE_DIR_KEEP_NESTED_GIT) &&
+	    !resolve_gitlink_ref(path->buf, "HEAD", submodule_head))
+		/* Do not descend and nuke a nested git work tree. */
+		return 0;
+
+	dir = opendir(path->buf);
 	if (!dir)
 		return -1;
 	if (path->buf[original_len - 1] != '/')
diff --git a/dir.h b/dir.h
index 541286a..8c69bdd 100644
--- a/dir.h
+++ b/dir.h
@@ -89,7 +89,10 @@ static inline int is_dot_or_dotdot(const char *name)
 extern int is_empty_dir(const char *dir);
 
 extern void setup_standard_excludes(struct dir_struct *dir);
-extern int remove_dir_recursively(struct strbuf *path, int only_empty);
+
+#define REMOVE_DIR_EMPTY_ONLY 01
+#define REMOVE_DIR_KEEP_NESTED_GIT 02
+extern int remove_dir_recursively(struct strbuf *path, int flag);
 
 /* tries to remove the path with empty directories along it, ignores ENOENT */
 extern int remove_path(const char *path);
diff --git a/refs.c b/refs.c
index 24438c6..a7dd5ae 100644
--- a/refs.c
+++ b/refs.c
@@ -820,7 +820,7 @@ static int remove_empty_directories(const char *file)
 	strbuf_init(&path, 20);
 	strbuf_addstr(&path, file);
 
-	result = remove_dir_recursively(&path, 1);
+	result = remove_dir_recursively(&path, REMOVE_DIR_EMPTY_ONLY);
 
 	strbuf_release(&path);
 
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 929d5d4..118c6eb 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -380,4 +380,43 @@ test_expect_success 'removal failure' '
 '
 chmod 755 foo
 
+test_expect_success 'nested git work tree' '
+	rm -fr foo bar &&
+	mkdir foo bar &&
+	(
+		cd foo &&
+		git init &&
+		>hello.world
+		git add . &&
+		git commit -a -m nested
+	) &&
+	(
+		cd bar &&
+		>goodbye.people
+	) &&
+	git clean -f -d &&
+	test -f foo/.git/index &&
+	test -f foo/hello.world &&
+	! test -d bar
+'
+
+test_expect_success 'force removal of nested git work tree' '
+	rm -fr foo bar &&
+	mkdir foo bar &&
+	(
+		cd foo &&
+		git init &&
+		>hello.world
+		git add . &&
+		git commit -a -m nested
+	) &&
+	(
+		cd bar &&
+		>goodbye.people
+	) &&
+	git clean -f -f -d &&
+	! test -d foo &&
+	! test -d bar
+'
+
 test_done
-- 
1.6.3.3.362.g3c77e

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-06-30 23:05         ` Junio C Hamano
@ 2009-07-01  1:44           ` Jason Holden
  2009-07-01  2:13             ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Holden @ 2009-07-01  1:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> diff --git a/dir.c b/dir.c
> index bbfcb56..d0cfe74 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -800,12 +800,20 @@ int is_empty_dir(const char *path)
>  	return ret;
>  }
>  
> -int remove_dir_recursively(struct strbuf *path, int only_empty)
> +int remove_dir_recursively(struct strbuf *path, int flag)
>  {
> -	DIR *dir = opendir(path->buf);
> +	DIR *dir;
>  	struct dirent *e;
>  	int ret = 0, original_len = path->len, len;
> +	int only_empty = (flag & REMOVE_DIR_EMPTY_ONLY);
> +	unsigned char submodule_head[20];
>  
> +	if ((flag & REMOVE_DIR_KEEP_NESTED_GIT) &&
> +	    !resolve_gitlink_ref(path->buf, "HEAD", submodule_head))
> +		/* Do not descend and nuke a nested git work tree. */

Add a printout here to indicate that we didn't end up removing the
directory.  Otherwise when you run git clean -f -d you just end up
with something like:
"Removing attachement_fu/"
even when we didn't actually end up removing it.  Paolo noticed this
same issue in my original patch.

> +		return 0;
> +
> +	dir = opendir(path->buf);
>  	if (!dir)

I was able to test this patch and everything seems to behave as
expected.

If this becomes the final fix, don't forget to update
Documentation/git-clean.txt

I'm leaving for vacation tomorrow, so you won't hear any more from
me until ~July 8th.  Thanks for the quick responses.

-- 
Regards,
Jason Holden
Get my PGP Key at
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x6B7FBC8D

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean
  2009-07-01  1:44           ` Jason Holden
@ 2009-07-01  2:13             ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2009-07-01  2:13 UTC (permalink / raw)
  To: Jason Holden; +Cc: Johannes Sixt, git

Jason Holden <jason.k.holden@gmail.com> writes:

> If this becomes the final fix, don't forget to update
> Documentation/git-clean.txt

That's a note to yourself and other people who are intereseted ;-).

My patch was, as with many other patches I send to this list, no more than
"if you wanted to do that, you would do it like this.".  It definitely
wasn't meant to be the final shape of the resolution of this issue.

This is not my itch with a particularly high priority, and I do not have
infinite amount of time right now to scratch it.

There shouldn't be any output from dir.[ch] recursive removal function
(unless it is reporting an error).  Instead, the caller should say "removed"
only after it actually removed it, and it needs some reorganizing of the
call sequence.

I think the loop in builtin_clean.c should first be refactored into
smaller helper functions before any of these changes happen.  It has got
unmanageably large and ugly over time (or perhaps it was large and ugly
from the beginning. I do not even remember who did it initially).  

Anyway, enjoy your vacation.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-07-01  2:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-06-30  2:10 [PATCH 0/2] Don't delete untracked submodule's .git dirs by default Jason Holden
2009-06-30  2:10 ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Jason Holden
2009-06-30  2:10   ` [PATCH 2/2] Don't clean any untracked submodule's .git dir by default in git-clean Jason Holden
2009-06-30  6:07     ` Paolo Bonzini
2009-06-30  6:40     ` Johannes Sixt
2009-06-30  7:34       ` Junio C Hamano
2009-06-30 23:05         ` Junio C Hamano
2009-07-01  1:44           ` Jason Holden
2009-07-01  2:13             ` Junio C Hamano
2009-06-30  6:48   ` [PATCH 1/2] Add option to not delete a .git directory in remove_dir_recursively() Johannes Sixt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).