git.vger.kernel.org archive mirror
* [PATCH v4 0/2] for-each-repo: new command for multi-repo operations
@ 2013-01-27 12:46 Lars Hjemli
  2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
  2013-01-27 12:46 ` [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command Lars Hjemli
  0 siblings, 2 replies; 18+ messages in thread
From: Lars Hjemli @ 2013-01-27 12:46 UTC (permalink / raw)
  To: git; +Cc: Lars Hjemli

Changes since v3:
* option -x used to execute non-git commands
* option -z used to NUL-terminate paths
* write_name_quoted() used to print repo paths
* repos are handled in sorted order (as defined by strcmp(3)) to get
  predictable output from the command
* unsetenv() reintroduced to avoid problems from GIT_DIR/WORK_TREE
* more tests

Lars Hjemli (2):
  for-each-repo: new command used for multi-repo operations
  git: rewrite `git -a` to become a git-for-each-repo command

 .gitignore                          |   1 +
 Documentation/git-for-each-repo.txt |  71 ++++++++++++
 Makefile                            |   1 +
 builtin.h                           |   1 +
 builtin/for-each-repo.c             | 145 ++++++++++++++++++++++++
 git.c                               |  37 +++++++
 t/t6400-for-each-repo.sh            | 213 ++++++++++++++++++++++++++++++++++++
 7 files changed, 469 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t6400-for-each-repo.sh

-- 
1.8.1.1.349.g4cdd23e


* [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 12:46 [PATCH v4 0/2] for-each-repo: new command for multi-repo operations Lars Hjemli
@ 2013-01-27 12:46 ` Lars Hjemli
  2013-01-27 19:04   ` Junio C Hamano
  2013-01-27 12:46 ` [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command Lars Hjemli
  1 sibling, 1 reply; 18+ messages in thread
From: Lars Hjemli @ 2013-01-27 12:46 UTC (permalink / raw)
  To: git; +Cc: Lars Hjemli

When working with multiple, unrelated (or loosely related) git repos,
there is often a need to locate all repos with uncommitted work and
perform some action on them (say, commit and push). Before this patch,
such tasks would require manually visiting all repositories, running
`git status` within each one and then deciding what to do next.

This mundane task can now be automated by e.g. `git for-each-repo --dirty
status`, which will find all non-bare git repositories below the current
directory (even nested ones), check if they are dirty (as defined by
`git diff --quiet && git diff --cached --quiet`), and for each dirty repo
print the path to the repo and then execute `git status` within the repo.
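
For illustration only, the dirty check described above can be approximated
from the shell. This is a sketch, not the code in the patch: `is_dirty` is a
hypothetical helper name, and it assumes that a directory it cannot enter
should count as dirty.

```sh
#!/bin/sh
# is_dirty DIR (hypothetical helper, not part of the patch):
# succeed when the repository at DIR has uncommitted changes.
# Both diffs exit 0 only when they find no difference, so the
# repository is clean when both succeed and dirty otherwise.
is_dirty() {
	! (
		cd "$1" &&
		git diff --quiet &&          # worktree vs. index
		git diff --cached --quiet    # index vs. HEAD
	)
}
```

Something like `is_dirty some/repo && echo "needs attention"` would then
report a repository with pending work. Untracked files do not make either
diff fail, so they do not count as dirty here.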

The command also honours the option '--clean' which restricts the set of
repos to those which '--dirty' would skip, and '-x' which is used to
execute non-git commands.

Finally, the command to execute within each repo is optional. If none is
given, git-for-each-repo will just print the path to each repo found. And
since the command supports -z, this can be used for more advanced scripting
needs.

Note: since git-for-each-repo can execute both git and non-git commands, it
must cd into the worktree of each repository before executing the command.
There is then no need for the environment variables $GIT_WORK_TREE and
$GIT_DIR to be specified, so git-for-each-repo will instead unset these
variables to stop them from interfering with the executed commands.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 .gitignore                          |   1 +
 Documentation/git-for-each-repo.txt |  71 +++++++++++++++++
 Makefile                            |   1 +
 builtin.h                           |   1 +
 builtin/for-each-repo.c             | 145 ++++++++++++++++++++++++++++++++++
 git.c                               |   1 +
 t/t6400-for-each-repo.sh            | 150 ++++++++++++++++++++++++++++++++++++
 7 files changed, 370 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t6400-for-each-repo.sh

diff --git a/.gitignore b/.gitignore
index aa258a6..0c27981 100644
--- a/.gitignore
+++ b/.gitignore
@@ -56,6 +56,7 @@
 /git-filter-branch
 /git-fmt-merge-msg
 /git-for-each-ref
+/git-for-each-repo
 /git-format-patch
 /git-fsck
 /git-fsck-objects
diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt
new file mode 100644
index 0000000..fb12b3f
--- /dev/null
+++ b/Documentation/git-for-each-repo.txt
@@ -0,0 +1,71 @@
+git-for-each-repo(1)
+====================
+
+NAME
+----
+git-for-each-repo - Execute a git command in multiple non-bare repositories
+
+SYNOPSIS
+--------
+[verse]
+'git for-each-repo' [-acdxz] [command]
+
+DESCRIPTION
+-----------
+The git-for-each-repo command is used to locate all non-bare git
+repositories within the current directory tree, and optionally
+execute a git command in each of the found repos.
+
+OPTIONS
+-------
+-a::
+--all::
+	Include both clean and dirty repositories (this is the default
+	behaviour of `git-for-each-repo`).
+
+-c::
+--clean::
+	Only include repositories with a clean worktree.
+
+-d::
+--dirty::
+	Only include repositories with a dirty worktree.
+
+-x::
+	Execute a generic (non-git) command in each repo.
+
+-z::
+	Terminate each path name with the NUL character.
+
+EXAMPLES
+--------
+
+Various ways to exploit this command::
++
+------------
+$ git for-each-repo            <1>
+$ git for-each-repo fetch      <2>
+$ git for-each-repo -d gui     <3>
+$ git for-each-repo -c push    <4>
+$ git for-each-repo -x du -sh  <5>
+------------
++
+<1> Print the path to all repos found below the current directory.
+
+<2> Fetch updates from default remote in all repos.
+
+<3> Start linkgit:git-gui[1] in each repo containing uncommitted changes.
+
+<4> Push the current branch in each repo with no uncommitted changes.
+
+<5> Print disk-usage for each repository.
+
+NOTES
+-----
+
+For the purpose of `git-for-each-repo`, a dirty worktree is defined as a
+worktree with uncommitted changes.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index a786d4c..8c42c17 100644
--- a/Makefile
+++ b/Makefile
@@ -870,6 +870,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o
 BUILTIN_OBJS += builtin/fetch.o
 BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
+BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/grep.o
diff --git a/builtin.h b/builtin.h
index 7e7bbd6..02fc712 100644
--- a/builtin.h
+++ b/builtin.h
@@ -73,6 +73,7 @@ extern int cmd_fetch(int argc, const char **argv, const char *prefix);
 extern int cmd_fetch_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix);
 extern int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
+extern int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 extern int cmd_format_patch(int argc, const char **argv, const char *prefix);
 extern int cmd_fsck(int argc, const char **argv, const char *prefix);
 extern int cmd_gc(int argc, const char **argv, const char *prefix);
diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
new file mode 100644
index 0000000..9333ae0
--- /dev/null
+++ b/builtin/for-each-repo.c
@@ -0,0 +1,145 @@
+/*
+ * "git for-each-repo" builtin command.
+ *
+ * Copyright (c) 2013 Lars Hjemli <hjemli@gmail.com>
+ */
+#include "cache.h"
+#include "color.h"
+#include "quote.h"
+#include "builtin.h"
+#include "run-command.h"
+#include "parse-options.h"
+
+#define ALL 0
+#define DIRTY 1
+#define CLEAN 2
+
+static char *color = GIT_COLOR_NORMAL;
+static int eol = '\n';
+static int match;
+static int runopt = RUN_GIT_CMD;
+
+static const char * const builtin_foreachrepo_usage[] = {
+	N_("git for-each-repo [-acdxz] [cmd]"),
+	NULL
+};
+
+static struct option builtin_foreachrepo_options[] = {
+	OPT_SET_INT('a', "all", &match, N_("match both clean and dirty repositories"), ALL),
+	OPT_SET_INT('c', "clean", &match, N_("only show clean repositories"), CLEAN),
+	OPT_SET_INT('d', "dirty", &match, N_("only show dirty repositories"), DIRTY),
+	OPT_SET_INT('x', NULL, &runopt, N_("execute generic (non-git) command"), 0),
+	OPT_SET_INT('z', NULL, &eol, N_("terminate each repo path with NUL character"), 0),
+	OPT_END(),
+};
+
+static int get_repo_state(const char *dir)
+{
+	const char *diffidx[] = {"diff", "--quiet", "--cached", NULL};
+	const char *diffwd[] = {"diff", "--quiet", NULL};
+
+	if (run_command_v_opt_cd_env(diffidx, RUN_GIT_CMD, dir, NULL) != 0)
+		return DIRTY;
+	if (run_command_v_opt_cd_env(diffwd, RUN_GIT_CMD, dir, NULL) != 0)
+		return DIRTY;
+	return CLEAN;
+}
+
+static void print_repo_path(const char *path, unsigned pretty)
+{
+	if (path[0] == '.' && path[1] == '/')
+		path += 2;
+	if (pretty)
+		color_fprintf_ln(stdout, color, "[%s]", path);
+	else
+		write_name_quoted(path, stdout, eol);
+}
+
+static void handle_repo(struct strbuf *path, const char **argv)
+{
+	const char *gitdir;
+	int len;
+
+	len = path->len;
+	strbuf_addstr(path, ".git");
+	gitdir = resolve_gitdir(path->buf);
+	strbuf_setlen(path, len - 1);
+	if (!gitdir)
+		goto done;
+	if (match != ALL && match != get_repo_state(path->buf))
+		goto done;
+	print_repo_path(path->buf, *argv != NULL);
+	if (*argv)
+		run_command_v_opt_cd_env(argv, runopt, path->buf, NULL);
+done:
+	strbuf_addstr(path, "/");
+}
+
+static int walk(struct strbuf *path, int argc, const char **argv)
+{
+	DIR *dir;
+	struct dirent *ent;
+	struct stat st;
+	size_t len;
+	int has_dotgit = 0;
+	struct string_list list = STRING_LIST_INIT_DUP;
+	struct string_list_item *item;
+
+	dir = opendir(path->buf);
+	if (!dir)
+		return errno;
+	strbuf_addstr(path, "/");
+	len = path->len;
+	while ((ent = readdir(dir))) {
+		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
+			continue;
+		if (!strcmp(ent->d_name, ".git")) {
+			has_dotgit = 1;
+			continue;
+		}
+		switch (DTYPE(ent)) {
+		case DT_UNKNOWN:
+		case DT_LNK:
+			/* Use stat() to figure out if this path leads
+			 * to a directory - it's  not important if it's
+			 * a symlink which gets us there.
+			 */
+			strbuf_setlen(path, len);
+			strbuf_addstr(path, ent->d_name);
+			if (stat(path->buf, &st) || !S_ISDIR(st.st_mode))
+				break;
+			/* fallthrough */
+		case DT_DIR:
+			string_list_append(&list, ent->d_name);
+			break;
+		}
+	}
+	closedir(dir);
+	strbuf_setlen(path, len);
+	if (has_dotgit)
+		handle_repo(path, argv);
+	sort_string_list(&list);
+	for_each_string_list_item(item, &list) {
+		strbuf_setlen(path, len);
+		strbuf_addstr(path, item->string);
+		walk(path, argc, argv);
+	}
+	string_list_clear(&list, 0);
+	return 0;
+}
+
+int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf path = STRBUF_INIT;
+
+	unsetenv(GIT_DIR_ENVIRONMENT);
+	unsetenv(GIT_WORK_TREE_ENVIRONMENT);
+	argc = parse_options(argc, argv, prefix,
+			     builtin_foreachrepo_options,
+			     builtin_foreachrepo_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+	if (want_color(GIT_COLOR_AUTO))
+		color = GIT_COLOR_YELLOW;
+	strbuf_addstr(&path, ".");
+	return walk(&path, argc, argv);
+}
diff --git a/git.c b/git.c
index ed66c66..6b53169 100644
--- a/git.c
+++ b/git.c
@@ -337,6 +337,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "fetch-pack", cmd_fetch_pack, RUN_SETUP },
 		{ "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP },
 		{ "for-each-ref", cmd_for_each_ref, RUN_SETUP },
+		{ "for-each-repo", cmd_for_each_repo },
 		{ "format-patch", cmd_format_patch, RUN_SETUP },
 		{ "fsck", cmd_fsck, RUN_SETUP },
 		{ "fsck-objects", cmd_fsck, RUN_SETUP },
diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh
new file mode 100755
index 0000000..af02c0c
--- /dev/null
+++ b/t/t6400-for-each-repo.sh
@@ -0,0 +1,150 @@
+#!/bin/sh
+#
+# Copyright (c) 2013 Lars Hjemli
+#
+
+test_description='Test the git-for-each-repo command'
+
+. ./test-lib.sh
+
+qname="with\"quote"
+qqname="\"with\\\"quote\""
+
+test_expect_success "setup" '
+	test_create_repo clean &&
+	(cd clean && test_commit foo1) &&
+	git init --separate-git-dir=.cleansub clean/gitfile &&
+	(cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) &&
+	test_create_repo dirty-idx &&
+	(cd dirty-idx && test_commit foo3 && git rm foo3.t) &&
+	test_create_repo dirty-wt &&
+	(cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git &&
+	  test_commit foo4 && rm foo4.t) &&
+	test_create_repo "$qname" &&
+	(cd "$qname" && test_commit foo5) &&
+	mkdir fakedir && mkdir fakedir/.git
+'
+
+test_expect_success "without filtering, all repos are included" '
+	echo "." >expect &&
+	echo "clean" >>expect &&
+	echo "clean/gitfile" >>expect &&
+	echo "dirty-idx" >>expect &&
+	echo "dirty-wt" >>expect &&
+	echo "$qqname" >>expect &&
+	git for-each-repo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "-z NUL-terminates each path" '
+	echo "(.)" >expect &&
+	echo "(clean)" >>expect &&
+	echo "(clean/gitfile)" >>expect &&
+	echo "(dirty-idx)" >>expect &&
+	echo "(dirty-wt)" >>expect &&
+	echo "($qname)" >>expect &&
+	git for-each-repo -z | xargs -0 printf "(%s)\n"  >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "--dirty only includes dirty repos" '
+	echo "clean/gitfile" >expect &&
+	echo "dirty-idx" >>expect &&
+	echo "dirty-wt" >>expect &&
+	git for-each-repo --dirty >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "--clean only includes clean repos" '
+	echo "." >expect &&
+	echo "clean" >>expect &&
+	echo "$qqname" >>expect &&
+	git for-each-repo --clean >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "run a git-command in all repos" '
+	echo "[.]" >expect &&
+	echo "[clean]" >>expect &&
+	echo "[clean/gitfile]" >>expect &&
+	echo " M foo2.t" >>expect &&
+	echo "[dirty-idx]" >>expect &&
+	echo "D  foo3.t" >>expect &&
+	echo "[dirty-wt]" >>expect &&
+	echo " D foo4.t" >>expect &&
+	echo "[$qname]" >>expect &&
+	git for-each-repo status -suno >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "run a git-command in dirty repos only" '
+	echo "[clean/gitfile]" >expect &&
+	echo " M foo2.t" >>expect &&
+	echo "[dirty-idx]" >>expect &&
+	echo "D  foo3.t" >>expect &&
+	echo "[dirty-wt]" >>expect &&
+	echo " D foo4.t" >>expect &&
+	git for-each-repo -d status -suno >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "run a git-command in clean repos only" '
+	echo "[.]" >expect &&
+	echo "[clean]" >>expect &&
+	echo "foo1.t" >>expect &&
+	echo "[$qname]" >>expect &&
+	echo "foo5.t" >>expect &&
+	git for-each-repo -c ls-files >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "-z is disabled when a command is run" '
+	echo "[.]" >expect &&
+	echo "[clean]" >>expect &&
+	echo "foo1.t" >>expect &&
+	echo "[$qname]" >>expect &&
+	echo "foo5.t" >>expect &&
+	git for-each-repo -cz ls-files >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "-x executes any command in each repo" '
+	echo "[.]" >expect &&
+	echo "$HOME" >>expect &&
+	echo "[clean]" >>expect &&
+	echo "$HOME/clean" >>expect &&
+	echo "[clean/gitfile]" >>expect &&
+	echo "$HOME/clean/gitfile" >>expect &&
+	echo "[dirty-idx]" >>expect &&
+	echo "$HOME/dirty-idx" >>expect &&
+	echo "[dirty-wt]" >>expect &&
+	echo "$HOME/dirty-wt" >>expect &&
+	echo "[$qname]" >>expect &&
+	echo "$HOME/$qname" >>expect &&
+	git for-each-repo -x pwd >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "-cx executes any command in clean repos" '
+	echo "[.]" >expect &&
+	echo "$HOME" >>expect &&
+	echo "[clean]" >>expect &&
+	echo "$HOME/clean" >>expect &&
+	echo "[$qname]" >>expect &&
+	echo "$HOME/$qname" >>expect &&
+	git for-each-repo -cx pwd >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "-dx executes any command in dirty repos" '
+	echo "[clean/gitfile]" >expect &&
+	echo "$HOME/clean/gitfile" >>expect &&
+	echo "[dirty-idx]" >>expect &&
+	echo "$HOME/dirty-idx" >>expect &&
+	echo "[dirty-wt]" >>expect &&
+	echo "$HOME/dirty-wt" >>expect &&
+	git for-each-repo -dx pwd >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
1.8.1.1.349.g4cdd23e


* [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command
  2013-01-27 12:46 [PATCH v4 0/2] for-each-repo: new command for multi-repo operations Lars Hjemli
  2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
@ 2013-01-27 12:46 ` Lars Hjemli
  1 sibling, 0 replies; 18+ messages in thread
From: Lars Hjemli @ 2013-01-27 12:46 UTC (permalink / raw)
  To: git; +Cc: Lars Hjemli

With this rewrite, it is now possible to run e.g. `git -ad gui` to
start up git-gui in each repo within the current directory which
contains uncommitted work.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 git.c                    | 36 +++++++++++++++++++++++++++
 t/t6400-for-each-repo.sh | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/git.c b/git.c
index 6b53169..f933b5d 100644
--- a/git.c
+++ b/git.c
@@ -31,8 +31,42 @@ static void commit_pager_choice(void) {
 	}
 }
 
+/*
+ * Rewrite 'git -ad status' to 'git for-each-repo -d status'
+ */
+static int rewrite_foreach_repo(const char ***orig_argv,
+				const char **curr_argv,
+				int *curr_argc)
+{
+	const char **new_argv;
+	char *tmp;
+	int new_argc, curr_pos, i, j;
+
+	curr_pos = curr_argv - *orig_argv;
+	if (strlen(curr_argv[0]) == 2) {
+		curr_argv[0] = "for-each-repo";
+		return curr_pos - 1;
+	}
+
+	new_argc = curr_pos + *curr_argc + 1;
+	new_argv = xmalloc(new_argc * sizeof(void *));
+	for (i = j = 0; j < new_argc; i++, j++) {
+		if (i == curr_pos) {
+			asprintf(&tmp, "-%s", (*orig_argv)[i] + 2);
+			new_argv[j] = "for-each-repo";
+			new_argv[++j] = tmp;
+		} else {
+			new_argv[j] = (*orig_argv)[i];
+		}
+	}
+	*orig_argv = new_argv;
+	(*curr_argc)++;
+	return curr_pos;
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
+	const char ***pargv = argv;
 	const char **orig_argv = *argv;
 
 	while (*argc > 0) {
@@ -143,6 +177,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "0", 1);
 			if (envchanged)
 				*envchanged = 1;
+		} else if (!strncmp(cmd, "-a", 2)) {
+			return rewrite_foreach_repo(pargv, *argv, argc);
 		} else {
 			fprintf(stderr, "Unknown option: %s\n", cmd);
 			usage(git_usage_string);
diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh
index af02c0c..eaa4518 100755
--- a/t/t6400-for-each-repo.sh
+++ b/t/t6400-for-each-repo.sh
@@ -147,4 +147,67 @@ test_expect_success "-dx executes any command in dirty repos" '
 	test_cmp expect actual
 '
 
+test_expect_success "rewrite 'git -a'" '
+	echo "." >expect &&
+	echo "clean" >>expect &&
+	echo "clean/gitfile" >>expect &&
+	echo "dirty-idx" >>expect &&
+	echo "dirty-wt" >>expect &&
+	echo "$qqname" >>expect &&
+	git -a >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -az'" '
+	echo "(.)" >expect &&
+	echo "(clean)" >>expect &&
+	echo "(clean/gitfile)" >>expect &&
+	echo "(dirty-idx)" >>expect &&
+	echo "(dirty-wt)" >>expect &&
+	echo "($qname)" >>expect &&
+	git -az | xargs -0 printf "(%s)\n"  >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -ad'" '
+	echo "clean/gitfile" >expect &&
+	echo "dirty-idx" >>expect &&
+	echo "dirty-wt" >>expect &&
+	git -ad >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -ac'" '
+	echo "." >expect &&
+	echo "clean" >>expect &&
+	echo "$qqname" >>expect &&
+	git -ac >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -a status -suno'" '
+	echo "[.]" >expect &&
+	echo "[clean]" >>expect &&
+	echo "[clean/gitfile]" >>expect &&
+	echo " M foo2.t" >>expect &&
+	echo "[dirty-idx]" >>expect &&
+	echo "D  foo3.t" >>expect &&
+	echo "[dirty-wt]" >>expect &&
+	echo " D foo4.t" >>expect &&
+	echo "[$qname]" >>expect &&
+	git -a status -suno >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -acx pwd'" '
+	echo "[.]" >expect &&
+	echo "$HOME" >>expect &&
+	echo "[clean]" >>expect &&
+	echo "$HOME/clean" >>expect &&
+	echo "[$qname]" >>expect &&
+	echo "$HOME/$qname" >>expect &&
+	git -acx pwd >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.1.1.349.g4cdd23e


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
@ 2013-01-27 19:04   ` Junio C Hamano
  2013-01-27 19:42     ` John Keeping
  2013-01-28  7:50     ` Lars Hjemli
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-01-27 19:04 UTC (permalink / raw)
  To: Lars Hjemli; +Cc: git

Lars Hjemli <hjemli@gmail.com> writes:

> When working with multiple, unrelated (or loosely related) git repos,
> there is often a need to locate all repos with uncommitted work and
> perform some action on them (say, commit and push). Before this patch,
> such tasks would require manually visiting all repositories, running
> `git status` within each one and then deciding what to do next.
>
> This mundane task can now be automated by e.g. `git for-each-repo --dirty
> status`, which will find all non-bare git repositories below the current
> directory (even nested ones), check if they are dirty (as defined by
> `git diff --quiet && git diff --cached --quiet`), and for each dirty repo
> print the path to the repo and then execute `git status` within the repo.
>
> The command also honours the option '--clean' which restricts the set of
> repos to those which '--dirty' would skip, and '-x' which is used to
> execute non-git commands.

It might make sense to internally use the RUN_GIT_CMD flag when the
first word of the command line is 'git' as an optimization, but
I am not sure it is a good idea to force the end users to think
about when to use -x and when not to.

In other words, I think

     git for-each-repo -d diff --name-only
     git for-each-repo -d -x ls '*.c'

is less nice than letting the user say

     git for-each-repo -d git diff --name-only
     git for-each-repo -d ls '*.c'

> Finally, the command to execute within each repo is optional. If none is
> given, git-for-each-repo will just print the path to each repo found. And
> since the command supports -z, this can be used for more advanced scripting
> needs.

It amounts to the same thing, but I would rather describe it as:

    To allow scripts to handle paths with shell-unsafe characters,
    support "-z" to show paths with NUL termination.  Otherwise,
    such paths are shown with the usual c-quoting.
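
As a rough sketch of why the NUL-terminated form matters to scripts: a
consumer that splits on NUL sees every path verbatim, including the one with
a double quote in it, and never has to undo c-quoting. `for_each_path` is a
made-up helper name, and it assumes an xargs that understands -0 (GNU and BSD
xargs do; POSIX does not require it).

```sh
#!/bin/sh
# for_each_path CMD [ARG...] (hypothetical helper): read NUL-terminated
# paths on stdin and run CMD once per path, with the path appended as
# the last argument. NUL cannot appear in a path, so spaces and quotes
# in repository names arrive unmodified.
for_each_path() {
	xargs -0 -n 1 "$@"
}

# Intended use with the proposed command:
#	git for-each-repo -z | for_each_path du -sh
```

Without -z the same script would receive "with\"quote" in its c-quoted form
and would have to unquote it before it could cd there.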

One more thing that nobody brought up during the previous reviews is
whether we want to support a subset of repositories by allowing the
standard pathspec match mechanism.  For example,

	git for-each-repo -d git diff --name-only -- foo/ bar/b\*z

might be a way to ask "please find repositories that match the given
pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
are dirty".  We would need to think about how to mark the end of the
command though---we could borrow \; from find(1), even though find
is not the best example of the UI design.  I.e.

	git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z

with or without "--".

> diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt
> new file mode 100644
> index 0000000..fb12b3f
> --- /dev/null
> +++ b/Documentation/git-for-each-repo.txt
> @@ -0,0 +1,71 @@
> +git-for-each-repo(1)
> +====================
> +
> +NAME
> +----
> +git-for-each-repo - Execute a git command in multiple non-bare repositories

There is a separate topic in flight that turns s/git/Git/ when we
refer to the system as a whole.  In any case, this is no longer
limited to "execute a Git command".

	Find non-bare Git repositories in subdirectories

or

	Find or execute a command in non-bare Git repositories in subdirectories


perhaps?

> +SYNOPSIS
> +--------
> +[verse]
> +'git for-each-repo' [-acdxz] [command]
> +
> +DESCRIPTION
> +-----------
> +The git-for-each-repo command is used to locate all non-bare git

Should be sufficient to say s/is used to locate/locates/.

> +repositories within the current directory tree, and optionally
> +execute a git command in each of the found repos.

s/a git command/a command/;

> +OPTIONS
> +-------
> ...
> +-x::
> +	Execute a generic (non-git) command in each repo.

Drop this option.

> +NOTES
> +-----
> +
> +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a
> +worktree with uncommitted changes.

Is it a definition that is different from usual?  If so why does it
need to be inconsistent with the rest of the system?

> diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
> new file mode 100644
> index 0000000..9333ae0
> --- /dev/null
> +++ b/builtin/for-each-repo.c
> @@ -0,0 +1,145 @@
> +/*
> + * "git for-each-repo" builtin command.
> + *
> + * Copyright (c) 2013 Lars Hjemli <hjemli@gmail.com>
> + */
> +#include "cache.h"
> +#include "color.h"
> +#include "quote.h"
> +#include "builtin.h"
> +#include "run-command.h"
> +#include "parse-options.h"
> +
> +#define ALL 0
> +#define DIRTY 1
> +#define CLEAN 2
> +
> +static char *color = GIT_COLOR_NORMAL;
> +static int eol = '\n';
> +static int match;
> +static int runopt = RUN_GIT_CMD;
> +
> +static const char * const builtin_foreachrepo_usage[] = {
> +	N_("git for-each-repo [-acdxz] [cmd]"),
> +	NULL
> +};
> +
> +static struct option builtin_foreachrepo_options[] = {
> +	OPT_SET_INT('a', "all", &match, N_("match both clean and dirty repositories"), ALL),
> +	OPT_SET_INT('c', "clean", &match, N_("only show clean repositories"), CLEAN),
> +	OPT_SET_INT('d', "dirty", &match, N_("only show dirty repositories"), DIRTY),
> +	OPT_SET_INT('x', NULL, &runopt, N_("execute generic (non-git) command"), 0),
> +	OPT_SET_INT('z', NULL, &eol, N_("terminate each repo path with NUL character"), 0),
> +	OPT_END(),
> +};
> +
> +static int get_repo_state(const char *dir)
> +{
> +	const char *diffidx[] = {"diff", "--quiet", "--cached", NULL};
> +	const char *diffwd[] = {"diff", "--quiet", NULL};
> +
> +	if (run_command_v_opt_cd_env(diffidx, RUN_GIT_CMD, dir, NULL) != 0)
> +		return DIRTY;
> +	if (run_command_v_opt_cd_env(diffwd, RUN_GIT_CMD, dir, NULL) != 0)
> +		return DIRTY;
> +	return CLEAN;
> +}
> +
> +static void print_repo_path(const char *path, unsigned pretty)
> +{
> +	if (path[0] == '.' && path[1] == '/')
> +		path += 2;
> +	if (pretty)
> +		color_fprintf_ln(stdout, color, "[%s]", path);

This is shown before running a command in that repository.  I am of
two minds.  It certainly is nice to be able to tell which repository
each block of output lines comes from, and not requiring the commands
to do this themselves is a good default.  However, I wonder if people
would want to do something like this:

	git for-each-repo sh -c '
		git diff --name-only |
		sed -e "s|^|$path/|"
        '

to get a consolidated view, in a way similar to how "submodule
foreach" can be used.  This unconditional output will get in the way
for such a use case.

Oh, that reminds me of another thing.  Perhaps we would want to
export the (relative) path to the found repository in some way to
allow the commands to do this kind of thing in the first place?
"submodule foreach" does this with $path, I think.
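
A minimal sketch of what exporting the path could look like, written as a
shell wrapper rather than as a change to the C code. The name `run_in_each`
is hypothetical, the variable name `path` is borrowed from "submodule
foreach", and the list of repositories is assumed to arrive on stdin, one
plain (unquoted) path per line.

```sh
#!/bin/sh
# run_in_each SCRIPT (hypothetical helper): for every repository path
# read from stdin (one per line), cd into it and run SCRIPT via sh -c
# with $path set to the path exactly as it was read.
run_in_each() {
	while IFS= read -r repo
	do
		# The subshell keeps the cd from leaking into the next
		# iteration; </dev/null keeps SCRIPT from eating the list.
		( cd "$repo" && path=$repo sh -c "$1" </dev/null ) || return
	done
}
```

With something like this, the consolidated view would be

	git for-each-repo | run_in_each 'git diff --name-only | sed -e "s|^|$path/|"'

though paths that need c-quoting would have to be handled separately.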

> +	else
> +		write_name_quoted(path, stdout, eol);
> +}

Nice.  Doubly nice that you do not hardcode "color" at this point
but made it into a separate variable.

> +static void handle_repo(struct strbuf *path, const char **argv)
> +{
> +	const char *gitdir;
> +	int len;
> +
> +	len = path->len;
> +	strbuf_addstr(path, ".git");
> +	gitdir = resolve_gitdir(path->buf);
> +	strbuf_setlen(path, len - 1);
> +	if (!gitdir)
> +		goto done;
> +	if (match != ALL && match != get_repo_state(path->buf))
> +		goto done;
> +	print_repo_path(path->buf, *argv != NULL);
> +	if (*argv)
> +		run_command_v_opt_cd_env(argv, runopt, path->buf, NULL);
> +done:
> +	strbuf_addstr(path, "/");

OK, you get "$D/" from the caller, make it "$D/.git" to call
resolve_gitdir() with, turn it to "$D" before printing and running,
and then add "/" back.  Slightly tricky but correct.

> +static int walk(struct strbuf *path, int argc, const char **argv)
> +{
> +	DIR *dir;
> +	struct dirent *ent;
> +	struct stat st;
> +	size_t len;
> +	int has_dotgit = 0;
> +	struct string_list list = STRING_LIST_INIT_DUP;
> +	struct string_list_item *item;
> +
> +	dir = opendir(path->buf);
> +	if (!dir)
> +		return errno;
> +	strbuf_addstr(path, "/");
> +	len = path->len;
> +	while ((ent = readdir(dir))) {
> +		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
> +			continue;
> +		if (!strcmp(ent->d_name, ".git")) {
> +			has_dotgit = 1;
> +			continue;
> +		}
> +		switch (DTYPE(ent)) {
> +		case DT_UNKNOWN:
> +		case DT_LNK:
> +			/* Use stat() to figure out if this path leads
> +			 * to a directory - it's  not important if it's
> +			 * a symlink which gets us there.
> +			 */
> +			strbuf_setlen(path, len);
> +			strbuf_addstr(path, ent->d_name);
> +			if (stat(path->buf, &st) || !S_ISDIR(st.st_mode))
> +				break;
> +			/* fallthrough */
> +		case DT_DIR:
> +			string_list_append(&list, ent->d_name);
> +			break;
> +		}
> +	}
> +	closedir(dir);
> +	strbuf_setlen(path, len);
> +	if (has_dotgit)
> +		handle_repo(path, argv);
> +	sort_string_list(&list);
> +	for_each_string_list_item(item, &list) {
> +		strbuf_setlen(path, len);
> +		strbuf_addstr(path, item->string);
> +		walk(path, argc, argv);
> +	}
> +	string_list_clear(&list, 0);
> +	return 0;
> +}

Is the "collect-first-and-then-sort" done so that the repositories
are shown in a stable order regardless of the order in which
readdir() returns the entries?  I am not complaining, just
curious.

> diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh

This command does not look like "6 - the revision tree commands" to
me. "7 - the porcelainish commands concerning the working tree" or
"9 - the git tools" may be a better match?

> new file mode 100755
> index 0000000..af02c0c
> --- /dev/null
> +++ b/t/t6400-for-each-repo.sh
> @@ -0,0 +1,150 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2013 Lars Hjemli
> +#
> +
> +test_description='Test the git-for-each-repo command'
> +
> +. ./test-lib.sh
> +
> +qname="with\"quote"
> +qqname="\"with\\\"quote\""

If Windows does not have problems with paths with dq in it, then
this is fine, but I dunno.  Otherwise, you may want to exclude the
c-quote testing from the main part of the test, and have a single
test that has prerequisite for filesystems that can do this at the
end of the script.

> +test_expect_success "setup" '
> +	test_create_repo clean &&
> +	(cd clean && test_commit foo1) &&
> +	git init --separate-git-dir=.cleansub clean/gitfile &&
> +	(cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) &&
> +	test_create_repo dirty-idx &&
> +	(cd dirty-idx && test_commit foo3 && git rm foo3.t) &&
> +	test_create_repo dirty-wt &&
> +	(cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git &&

Some platforms are symlink-challenged.  Can we do this test without
"ln -s"?  SYMLINKS prereq wouldn't be very useful for the setup
step, as all the remaining tests won't work without setting up the
test scenario.

> +	  test_commit foo4 && rm foo4.t) &&
> +	test_create_repo "$qname" &&
> +	(cd "$qname" && test_commit foo5) &&
> +	mkdir fakedir && mkdir fakedir/.git
> +'
> +
> +test_expect_success "without filtering, all repos are included" '
> +	echo "." >expect &&
> +	echo "clean" >>expect &&
> +	echo "clean/gitfile" >>expect &&
> +	echo "dirty-idx" >>expect &&
> +	echo "dirty-wt" >>expect &&
> +	echo "$qqname" >>expect &&

A single

	cat >expect <<-EOF
	.
	clean
	clean/gitfile
	...
	$qqname
	EOF

may be a lot easier to read (likewise for all the "expect"
preparation in the rest of the script).
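
As a rough illustration of the here-document form (a sketch run outside
the test harness; the file names expect-echo and expect-heredoc are made
up for the comparison and the repository list is shortened), both styles
below write the same bytes.  Inside an indented test body one would use
"<<-" with tab-indented lines; the plain "<<" form is used here only so
the sketch does not depend on tabs surviving a copy and paste.

```shell
# Work in a scratch directory so nothing in the current tree is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Style used in the patch: one echo per expected line.
echo "." >expect-echo &&
echo "clean" >>expect-echo &&
echo "clean/gitfile" >>expect-echo

# Here-document style: the expected output reads the way it will appear.
cat >expect-heredoc <<EOF
.
clean
clean/gitfile
EOF

# cmp exits 0 only when the two files are byte-for-byte identical.
cmp expect-echo expect-heredoc && same=yes || same=no
echo "$same"
```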

> +test_expect_success "-z NUL-terminates each path" '
> +	echo "(.)" >expect &&
> +	echo "(clean)" >>expect &&
> +	echo "(clean/gitfile)" >>expect &&
> +	echo "(dirty-idx)" >>expect &&
> +	echo "(dirty-wt)" >>expect &&
> +	echo "($qname)" >>expect &&
> +	git for-each-repo -z | xargs -0 printf "(%s)\n"  >actual &&

This needs prereq on "xargs -0", but because we know we do not have
any string with Q in it in the expected list of repositories, it may
be simpler to do something like this:

	echo ".QcleanQclean/gitfileQ...$qname" >expect &&
	git for-each-repo -z | tr "\0" Q >actual &&
	test_cmp expect actual
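
To make the NUL-to-Q idea concrete, here is a small sketch that does not
need the new command at all.  The function list_repos_z is a made-up
stand-in for "git for-each-repo -z", and it assumes that every path,
including the last one, is followed by a NUL.

```shell
# Stand-in for "git for-each-repo -z": print each path followed by a NUL.
list_repos_z () {
	printf '%s\000' . clean clean/gitfile
}

# Map every NUL to the letter Q so the result is ordinary text.
actual=$(list_repos_z | tr '\000' Q)
echo "$actual"
```

Under that assumption the converted output ends with a Q and carries no
trailing newline, so the expected file would probably need the same
shape (for example written with printf instead of echo).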

Thanks.


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 19:04   ` Junio C Hamano
@ 2013-01-27 19:42     ` John Keeping
  2013-01-27 19:45       ` Junio C Hamano
  2013-01-28  7:50     ` Lars Hjemli
  1 sibling, 1 reply; 18+ messages in thread
From: John Keeping @ 2013-01-27 19:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Hjemli, git

On Sun, Jan 27, 2013 at 11:04:08AM -0800, Junio C Hamano wrote:
> One more thing that nobody brought up during the previous reviews is
> if we want to support subset of repositories by allowing the
> standard pathspec match mechanism.  For example,
> 
> 	git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
> 
> might be a way to ask "please find repositories that match the given
> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
> are dirty".  We would need to think about how to mark the end of the
> command though---we could borrow \; from find(1), even though find
> is not the best example of the UI design.  I.e.
> 
> 	git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
> 
> with or without "--".

Would it be better to make this a (multi-valued) option?

    git for-each-repo -d --filter=foo/ --filter=bar/b\*z git diff --name-only

It seems a lot simpler than trying to figure out how the command is
going to handle '--' arguments.

> Oh, that reminds me of another thing.  Perhaps we would want to
> export the (relative) path to the found repository in some way to
> allow the commands to do this kind of thing in the first place?
> "submodule foreach" does this with $path, I think.

I think $path is the only variable exported by "submodule foreach" which
is applicable here, but it doesn't work on Windows, where environment
variables are case-insensitive.

Commit 64394e3 (git-submodule.sh: Don't use $path variable in
eval_gettext string) changed "submodule foreach" to use $sm_path
internally although I notice that the documentation still uses $path.

Perhaps $repo_path in this case?
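
A minimal sketch of what exporting the path could look like from the
child's point of view.  The name repo_path is only the one floated
above, the list of repositories is made up, and nothing in the posted
patch sets such a variable.

```shell
# Hypothetical list of discovered repositories (relative paths).
visited=$(
	for repo in clean clean/gitfile dirty-idx
	do
		# Export the path under the name suggested in this thread.
		repo_path=$repo
		export repo_path
		# The child process sees the variable in its environment.
		sh -c 'echo "visiting $repo_path"'
	done
)
echo "$visited"
```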


John


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 19:42     ` John Keeping
@ 2013-01-27 19:45       ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-01-27 19:45 UTC (permalink / raw)
  To: John Keeping; +Cc: Lars Hjemli, git

John Keeping <john@keeping.me.uk> writes:

> On Sun, Jan 27, 2013 at 11:04:08AM -0800, Junio C Hamano wrote:
>> One more thing that nobody brought up during the previous reviews is
>> if we want to support subset of repositories by allowing the
>> standard pathspec match mechanism.  For example,
>> 
>> 	git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
>> 
>> might be a way to ask "please find repositories that match the given
>> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
>> are dirty".  We would need to think about how to mark the end of the
>> command though---we could borrow \; from find(1), even though find
>> is not the best example of the UI design.  I.e.
>> 
>> 	git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
>> 
>> with or without "--".
>
> Would it be better to make this a (multi-valued) option?
>
>     git for-each-repo -d --filter=foo/ --filter=bar/b\*z git diff --name-only

The standard way to use filtering based on paths we have is to use
the pathspec parameters at the end of the command line.

I see no reason for such an inconsistency with an option like --filter.

>> Oh, that reminds me of another thing.  Perhaps we would want to
>> export the (relative) path to the found repository in some way to
>> allow the commands to do this kind of thing in the first place?
>> "submodule foreach" does this with $path, I think.
>
> I think $path is the only variable exported by "submodule foreach" which
> is applicable here, but it doesn't work on Windows, where environment
> variables are case-insensitive.
>
> Commit 64394e3 (git-submodule.sh: Don't use $path variable in
> eval_gettext string) changed "submodule foreach" to use $sm_path
> internally although I notice that the documentation still uses $path.
>
> Perhaps $repo_path in this case?

I do not care too deeply about the name, as long as the names used
by both mechanisms are the same.


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 19:04   ` Junio C Hamano
  2013-01-27 19:42     ` John Keeping
@ 2013-01-28  7:50     ` Lars Hjemli
  2013-01-28  8:10       ` Jonathan Nieder
  1 sibling, 1 reply; 18+ messages in thread
From: Lars Hjemli @ 2013-01-28  7:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, Jan 27, 2013 at 8:04 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Lars Hjemli <hjemli@gmail.com> writes:
>
>> The command also honours the option '--clean' which restricts the set of
>> repos to those which '--dirty' would skip, and '-x' which is used to
>> execute non-git commands.
>
> It might make sense to internally use RUN_GIT_CMD flag when the
> first word of the command line is 'git' as an optimization, but
> I am not sure it is a good idea to force the end users to think
> about when to use -x and when not to.
>
> In other words, I think
>
>      git for-each-repo -d diff --name-only
>      git for-each-repo -d -x ls '*.c'
>
> is less nice than letting the user say
>
>      git for-each-repo -d git diff --name-only
>      git for-each-repo -d ls '*.c'
>

The 'git-for-each-repo' command was made to allow any git command to
be executed in all discovered repositories, and I've used it that way
for two years (in the form of a shell-script called 'git-all'). During
this time, I've occasionally thought about forking non-git commands
but the itch hasn't been strong enough for me to scratch. The point
I'm trying to make is that to me, this command acts as a modifier for
other git commands[1]. Having the possibility to execute non-git
commands would be nice, but it is not the main objective of this
command.

[1] The 'git -a' rewrite patch shows how I think about this command -
it's just an option to the 'git' command, modifying the way any
subcommand is invoked (btw: I don't expect that patch to be applied
since 'git-all' was deemed too generic, so I'll just carry the patch in
my own tree).

>> Finally, the command to execute within each repo is optional. If none is
>> given, git-for-each-repo will just print the path to each repo found. And
>> since the command supports -z, this can be used for more advanced scripting
>> needs.
>
> It amounts to the same thing, but I would rather describe it as:
>
>     To allow scripts to handle paths with shell-unsafe characters,
>     support "-z" to show paths with NUL termination.  Otherwise,
>     such paths are shown with the usual c-quoting.
>

Much better, thanks.


> One more thing that nobody brought up during the previous reviews is
> if we want to support subset of repositories by allowing the
> standard pathspec match mechanism.  For example,
>
>         git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
>
> might be a way to ask "please find repositories that match the given
> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
> are dirty".  We would need to think about how to mark the end of the
> command though---we could borrow \; from find(1), even though find
> is not the best example of the UI design.  I.e.
>
>         git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
>
> with or without "--".

I don't think this would be very nice to end users, and would prefer
--include and --exclude options (the latter is actually already a part
of git-all, added by one of my coworkers).

>> +NOTES
>> +-----
>> +
>> +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a
>> +worktree with uncommitted changes.
>
> Is it a definition that is different from usual?  If so why does it
> need to be inconsistent with the rest of the system?

I just wanted to clarify what condition --dirty and --clean will
check. In particular, the lack of checking for untracked files (which
could be added as yet another option).

>> +static void print_repo_path(const char *path, unsigned pretty)
>> +{
>> +     if (path[0] == '.' && path[1] == '/')
>> +             path += 2;
>> +     if (pretty)
>> +             color_fprintf_ln(stdout, color, "[%s]", path);
>
> This is shown before running a command in that repository.  I am of
> two minds.  It certainly is nice to be able to tell which repository
> each block of output lines comes from, and not requiring the command
> to do this themselves is a good default.  However, I wonder if people
> would want to do something like this:
>
>         git for-each-repo sh -c '
>                 git diff --name-only |
>                 sed -e "s|^|$path/|"
>         '
>
> to get a consolidated view, in a way similar to how "submodule
> foreach" can be used.  This unconditional output will get in the way
> for such a use case.

I guess -q/--quiet could be useful.
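
For reference, the consolidated view boils down to something like the
sketch below.  The helper prefix_with_repo and the file names are made
up, and the printf calls stand in for "git diff --name-only" running in
two repositories.

```shell
# Prefix every line read on stdin with "<repo>/", where <repo> is $1.
# Assumes the path contains no "|", "&" or backslash, which are special
# in this sed expression.
prefix_with_repo () {
	sed -e "s|^|$1/|"
}

# Stand-ins for "git diff --name-only" output in two repositories.
consolidated=$(
	printf 'foo2.t\n' | prefix_with_repo clean/gitfile
	printf 'foo4.t\n' | prefix_with_repo dirty-wt
)
echo "$consolidated"
```

A "[path]" header printed before each command would end up interleaved
with these lines, which is the case a quiet mode would cover.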

>> +static int walk(struct strbuf *path, int argc, const char **argv)
>> +{
>> +     DIR *dir;
>> +     struct dirent *ent;
>> +     struct stat st;
>> +     size_t len;
>> +     int has_dotgit = 0;
>> +     struct string_list list = STRING_LIST_INIT_DUP;
>> +     struct string_list_item *item;
>> +
>> +     dir = opendir(path->buf);
>> +     if (!dir)
>> +             return errno;
>> +     strbuf_addstr(path, "/");
>> +     len = path->len;
>> +     while ((ent = readdir(dir))) {
>> +             if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
>> +                     continue;
>> +             if (!strcmp(ent->d_name, ".git")) {
>> +                     has_dotgit = 1;
>> +                     continue;
>> +             }
>> +             switch (DTYPE(ent)) {
>> +             case DT_UNKNOWN:
>> +             case DT_LNK:
>> +                     /* Use stat() to figure out if this path leads
>> +                      * to a directory - it's  not important if it's
>> +                      * a symlink which gets us there.
>> +                      */
>> +                     strbuf_setlen(path, len);
>> +                     strbuf_addstr(path, ent->d_name);
>> +                     if (stat(path->buf, &st) || !S_ISDIR(st.st_mode))
>> +                             break;
>> +                     /* fallthrough */
>> +             case DT_DIR:
>> +                     string_list_append(&list, ent->d_name);
>> +                     break;
>> +             }
>> +     }
>> +     closedir(dir);
>> +     strbuf_setlen(path, len);
>> +     if (has_dotgit)
>> +             handle_repo(path, argv);
>> +     sort_string_list(&list);
>> +     for_each_string_list_item(item, &list) {
>> +             strbuf_setlen(path, len);
>> +             strbuf_addstr(path, item->string);
>> +             walk(path, argc, argv);
>> +     }
>> +     string_list_clear(&list, 0);
>> +     return 0;
>> +}
>
> Is the "collect-first-and-then-sort" done so that the repositories
> are shown in a stable order regardless of the order in which
> readdir() returns the entries?

Yes (writing the testcases demonstrated a need for predictable output).
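
A quick way to see the ordering from the shell, as a sketch: with
LC_ALL=C, sort(1) compares bytes, which for plain ASCII names gives the
same order as strcmp(3).  The names below are made up and listed in an
arbitrary order, the way readdir() might return them.

```shell
# Names in an arbitrary order, as readdir() might hand them back.
unsorted='dirty-wt
clean
Zeta
dirty-idx'

# Byte-wise comparison: uppercase letters sort before lowercase ones.
sorted=$(printf '%s\n' "$unsorted" | LC_ALL=C sort)
echo "$sorted"
```

Note that walk() sorts the names within each directory, which is not
necessarily the same as sorting the full relative paths in one go.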


>> diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh
>
> This command does not look like "6 - the revision tree commands" to
> me. "7 - the porcelainish commands concerning the working tree" or
> "9 - the git tools" may be a better match?

Ok, how about t9003?

>> new file mode 100755
>> index 0000000..af02c0c
>> --- /dev/null
>> +++ b/t/t6400-for-each-repo.sh
>> @@ -0,0 +1,150 @@
>> +#!/bin/sh
>> +#
>> +# Copyright (c) 2013 Lars Hjemli
>> +#
>> +
>> +test_description='Test the git-for-each-repo command'
>> +
>> +. ./test-lib.sh
>> +
>> +qname="with\"quote"
>> +qqname="\"with\\\"quote\""
>
> If Windows does not have problems with paths with dq in it, then
> this is fine, but I dunno.  Otherwise, you may want to exclude the
> c-quote testing from the main part of the test, and have a single
> test that has prerequisite for filesystems that can do this at the
> end of the script.

I'll check my patch on msysgit before resending.


>> +test_expect_success "setup" '
>> +     test_create_repo clean &&
>> +     (cd clean && test_commit foo1) &&
>> +     git init --separate-git-dir=.cleansub clean/gitfile &&
>> +     (cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) &&
>> +     test_create_repo dirty-idx &&
>> +     (cd dirty-idx && test_commit foo3 && git rm foo3.t) &&
>> +     test_create_repo dirty-wt &&
>> +     (cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git &&
>
> Some platforms are symlink-challenged.  Can we do this test without
> "ln -s"?  SYMLINKS prereq wouldn't be very useful for the setup
> step, as all the remaining tests won't work without setting up the
> test scenario.

I added this test to check the DT_UNKNOWN/DT_LNK case in walk() so
I'd rather not drop it, but it can be moved into a standalone,
SYMLINKS-enabled testcase.
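
The behaviour that case relies on can be seen from the shell, as a
sketch on a filesystem that supports symlinks (the names real-repo and
linked-repo are made up): "test -L" looks at the link itself, like
lstat(), while "test -d" follows it, like the stat() call in walk().

```shell
# Work in a scratch directory so nothing in the current tree is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# A real directory and a symbolic link that leads to it.
mkdir real-repo &&
ln -s real-repo linked-repo

# Both checks succeed: the entry is a link, and it leads to a directory.
if test -L linked-repo && test -d linked-repo
then
	result="symlink to directory"
else
	result="not a symlink to a directory"
fi
echo "$result"
```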

Thanks for the review.

-- 
larsh


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28  7:50     ` Lars Hjemli
@ 2013-01-28  8:10       ` Jonathan Nieder
  2013-01-28 17:11         ` Lars Hjemli
  2013-01-28 17:45         ` Junio C Hamano
  0 siblings, 2 replies; 18+ messages in thread
From: Jonathan Nieder @ 2013-01-28  8:10 UTC (permalink / raw)
  To: Lars Hjemli; +Cc: Junio C Hamano, git

Hi,

Lars Hjemli wrote:

> [1] The 'git -a' rewrite patch shows how I think about this command -
> it's just an option to the 'git' command, modifying the way any
> subcommand is invoked (btw: I don't expect that patch to be applied
> since 'git-all' was deemed too generic, so I'll just carry the patch in
> my own tree).

As one data point, 'git all' also seems too generic to me but 'git -a'
doesn't.  Intuition can be weird.

So if I ran the world, then having commands

	git -a diff

and

	git for-each-repo git diff

do the same thing would be fine.  Of course I don't run the world. ;-)

[...]
>> One more thing that nobody brought up during the previous reviews is
>> if we want to support subset of repositories by allowing the
>> standard pathspec match mechanism.  For example,
>>
>>         git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
>>
>> might be a way to ask "please find repositories that match the given
>> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
>> are dirty".  We would need to think about how to mark the end of the
>> command though---we could borrow \; from find(1), even though find
>> is not the best example of the UI design.

In most non-git commands, "--" represents an end-of-options marker,
allowing arbitrary options afterward without having to worry about
escaping minus signs.  So in that spirit, if this weren't a git
command, I'd expect to be able to do

	for-each-repo -- git diff -- '*.c'

and have the second '--' passed verbatim to "git diff".

Unfortunately in git (imitating commands like "grep", I suppose), "--"
means "paths start here".  That means that with the git convention,
there is only one place to pass paths to a given command.

Tracing backwards: it would be really nice to be able to do

	git for-each-repo git grep -e foo -- '*.c'

or

	git -a grep -e foo -- '*.c'

For this practical reason, it seems that paths listed after the '--'
should go to the command being run.  On the other hand, if I wanted to
limit my for-each-repo run to repositories in two subdirectories of
the cwd, I'd be tempted to try

	git for-each-repo git grep -e foo -- src/ doc/

And if I wanted to limit to different file types in the repositories
under each directory, it would be tempting to use

	git for-each-repo git grep -e foo -- 'src/*.c' 'doc/*.txt'

Is there a convention that would be usable today that is roughly
forward-compatible with that?  (To throw an example out, requiring
that each pathspec passed to for-each-repo either starts with '*' or
contains no wildcards.)

Thanks,
Jonathan


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28  8:10       ` Jonathan Nieder
@ 2013-01-28 17:11         ` Lars Hjemli
  2013-01-28 18:35           ` Junio C Hamano
  2013-01-28 17:45         ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Lars Hjemli @ 2013-01-28 17:11 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git

On Mon, Jan 28, 2013 at 9:10 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> Lars Hjemli wrote:
>
>> [1] The 'git -a' rewrite patch shows how I think about this command -
>> it's just an option to the 'git' command, modifying the way any
>> subcommand is invoked (btw: I don't expect that patch to be applied
>> since 'git-all' was deemed too generic, so I'll just carry the patch in
>> my own tree).
>
> As one data point, 'git all' also seems too generic to me but 'git -a'
> doesn't.  Intuition can be weird.
>
> So if I ran the world, then having commands
>
>         git -a diff
>
> and
>
>         git for-each-repo git diff
>
> do the same thing would be fine.  Of course I don't run the world. ;-)

This would make me very happy. Junio?

--
larsh


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28  8:10       ` Jonathan Nieder
  2013-01-28 17:11         ` Lars Hjemli
@ 2013-01-28 17:45         ` Junio C Hamano
  2013-01-28 18:35           ` Lars Hjemli
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2013-01-28 17:45 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Lars Hjemli, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Tracing backwards: it would be really nice to be able to do
>
> 	git for-each-repo git grep -e foo -- '*.c'

This is a very good example that shows the command that is run in
the repositories found may want pathspecs passed, but at the same
time, makes me realize that these repositories have to be fairly
uniform for this command to be useful.  For example, 'src/*.c' or
'inc/*.h' pathspecs wouldn't be useful unless most, if not all, of
the projects the loop finds follow that layout convention.  This is not
necessarily limited to pathspecs, of course.  Unless they all have
the 'next' branch "git for-each-repo checkout next" would not work,
etc. etc.

As to the pathspec limiting to affect the loop itself, not the
argument given to the command that is run, I don't think it is
absolutely needed; I am perfectly fine with declaring that
for-each-repo goes to repositories in all subdirectories without
limit, especially if doing so will make the UI issues we have to
deal with simpler.

As to the "option to the command, not to the subcommand, -a option",
I have been assuming that it was a joke patch, but if "git -a grep"
turns out to be really useful, "submodule foreach" that iterates
over the submodules may also want to have such a short and sweet
mechanism.  Between "for-each-repo" and "submodule foreach", I do
not yet have a strong opinion on which one deserves it more.

Come to think of it, is there a reason why "for-each-repo" should
not be an extension to "submodule foreach"?  We can view this as
visiting repositories that _could_ be registered as a submodule, in
addition to iterating over the registered submodules, no?

If these two are unified, then we do not have to even worry about
which one deserves "git -a" more.


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 17:11         ` Lars Hjemli
@ 2013-01-28 18:35           ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-01-28 18:35 UTC (permalink / raw)
  To: Lars Hjemli; +Cc: Jonathan Nieder, git

Lars Hjemli <hjemli@gmail.com> writes:

> On Mon, Jan 28, 2013 at 9:10 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> ...
>> So if I ran the world, then having commands
>>
>>         git -a diff
>>
>> and
>>
>>         git for-each-repo git diff
>>
>> do the same thing would be fine.  Of course I don't run the world. ;-)
>
> This would make me very happy. Junio?

Ahh, our mails crossed (rather, I responded to the other message I
saw before I saw this one).  I am not completely sold on "git -a"
yet, but another worry I have is which one between "submodule
foreach" and "for-each-repo" should use "git -a", if we decide that
it is useful to the users to add it.


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 17:45         ` Junio C Hamano
@ 2013-01-28 18:35           ` Lars Hjemli
  2013-01-28 18:51             ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Lars Hjemli @ 2013-01-28 18:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Mon, Jan 28, 2013 at 6:45 PM, Junio C Hamano <gitster@pobox.com> wrote:
> As to the pathspec limiting to affect the loop itself, not the
> argument given to the command that is run, I don't think it is
> absolutely needed; I am perfectly fine with declaring that
> for-each-repo goes to repositories in all subdirectories without
> limit, especially if doing so will make the UI issues we have to
> deal with simpler.

Good (since the relative path of each repo will be exported to the
child process, that process can perform path limiting when needed).


> As to the "option to the command, not to the subcommand, -a option",
> I have been assuming that it was a joke patch, but if "git -a grep"
> turns out to be really useful, "submodule foreach" that iterates
> over the submodules may also want to have such a short and sweet
> mechanism.  Between "for-each-repo" and "submodule foreach", I do
> not yet have a strong opinion on which one deserves it more.
>
> Come to think of it, is there a reason why "for-each-repo" should
> not be an extension to "submodule foreach"?  We can view this as
> visiting repositories that _could_ be registered as a submodule, in
> addition to iterating over the registered submodules, no?

Yes, but I see some possible problems with that approach:
-'git for-each-repo' does not need to be started from within a git worktree
-'git for-each-repo' and 'git submodule foreach' have different
semantics for --dirty and --clean
-'git for-each-repo' is in C because my 'git-all' shell script was
horribly slow on large directory trees (especially on windows)

All of these problems are probably solvable, but it would require
quite some reworking of git-submodule.sh

-- 
larsh


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 18:35           ` Lars Hjemli
@ 2013-01-28 18:51             ` Junio C Hamano
  2013-01-28 19:42               ` Lars Hjemli
  2013-01-28 20:12               ` Jens Lehmann
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-01-28 18:51 UTC (permalink / raw)
  To: Lars Hjemli; +Cc: Jonathan Nieder, git

Lars Hjemli <hjemli@gmail.com> writes:

>> Come to think of it, is there a reason why "for-each-repo" should
>> not be an extension to "submodule foreach"?  We can view this as
>> visiting repositories that _could_ be registered as a submodule, in
>> addition to iterating over the registered submodules, no?
>
> Yes, but I see some possible problems with that approach:
> -'git for-each-repo' does not need to be started from within a git worktree

True, but "git submodule foreach --untracked" can be told that it is
OK not (yet) to be in any superproject, no?

> -'git for-each-repo' and 'git submodule foreach' have different
> semantics for --dirty and --clean

That could be a problem.  Is there a good reason why they should use
different definitions of dirtiness?

> -'git for-each-repo' is in C because my 'git-all' shell script was
> horribly slow on large directory trees (especially on windows)

Your for-each-repo could be a good basis to build a new builtin
"submodule--foreach" that is a pure helper hidden from the end users
that does both; cmd_foreach() in git-submodule.sh can simply delegate
to it.

> All of these problems are probably solvable, but it would require
> quite some reworking of git-submodule.sh

Of course some work is needed, but we do not have to convert all the
cmd_foo in git-submodule.sh in one step.  For the purpose of
unifying for-each-repo and submodule foreach to deliver the
functionality sooner to the end users, we can go the route to add
only the submodule--foreach builtin, out of which we will get
reusable implementation of module_list and other helper functions we
can leverage later to do other cmd_foo functions.


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 18:51             ` Junio C Hamano
@ 2013-01-28 19:42               ` Lars Hjemli
  2013-01-28 20:12               ` Jens Lehmann
  1 sibling, 0 replies; 18+ messages in thread
From: Lars Hjemli @ 2013-01-28 19:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Mon, Jan 28, 2013 at 7:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Lars Hjemli <hjemli@gmail.com> writes:
>
>>> Come to think of it, is there a reason why "for-each-repo" should
>>> not be an extension to "submodule foreach"?  We can view this as
>>> visiting repositories that _could_ be registered as a submodule, in
>>> addition to iterating over the registered submodules, no?
>>
>> Yes, but I see some possible problems with that approach:
>> -'git for-each-repo' does not need to be started from within a git worktree
>
> True, but "git submodule foreach --untracked" can be told that it is
> OK not (yet) to be in any superproject, no?

Yes.

>
>> -'git for-each-repo' and 'git submodule foreach' have different
>> semantics for --dirty and --clean
>
> That could be a problem.  Is there a good reason why they should use
> different definitions of dirtiness?

I suspected that 'submodule foreach --dirty' might want to compare the
HEAD sha1 in the submodule against the one recorded in the
superproject (similar to what 'git submodule status' does), but such a
check could be triggered by a different flag (e.g. --behind/--ahead or
something similar).

>> -'git for-each-repo' is in C because my 'git-all' shell script was
>> horribly slow on large directory trees (especially on windows)
>
> Your for-each-repo could be a good basis to build a new builtin
> "submodule--foreach" that is a pure helper hidden from the end users
> that does both; cmd_foreach() in git-submodule.sh can simply delegate
> to it.

Ok, I'll rework my patches in this direction. Thanks.

--
larsh


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 18:51             ` Junio C Hamano
  2013-01-28 19:42               ` Lars Hjemli
@ 2013-01-28 20:12               ` Jens Lehmann
  2013-01-28 20:34                 ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Jens Lehmann @ 2013-01-28 20:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt

Am 28.01.2013 19:51, schrieb Junio C Hamano:
> Lars Hjemli <hjemli@gmail.com> writes:
> 
>>> Come to think of it, is there a reason why "for-each-repo" should
>>> not be an extension to "submodule foreach"?  We can view this as
>>> visiting repositories that _could_ be registered as a submodule, in
>>> addition to iterating over the registered submodules, no?
>>
>> Yes, but I see some possible problems with that approach:
>> -'git for-each-repo' does not need to be started from within a git worktree
> 
> True, but "git submodule foreach --untracked" can be told that it is
> OK not (yet) to be in any superproject, no?

Hmm, I'm not sure how that would work as it looks for gitlinks
in the index which point to work tree paths.

>> -'git for-each-repo' and 'git submodule foreach' have different
>> semantics for --dirty and --clean

I'm confused, what semantics of --dirty and --clean does current
'git submodule foreach' have? I can't find any sign of it in the
current code ... did I miss something while skimming through this
thread? Or are you talking about status and diff here?

> That could be a problem.  Is there a good reason why they should use
> different definitions of dirtiness?

I don't see any (except of course for comparing a gitlink with the
HEAD of the submodule, which is an additional condition that only
applies to submodules). But I think the current for-each-repo
proposal doesn't allow traversing repos which contain untracked
content (and it would be nice if the user could somehow combine
that with the current --dirty flag to have both in one go).

>> -'git for-each-repo' is in C because my 'git-all' shell script was
>> horribly slow on large directory trees (especially on windows)
> 
> Your for-each-repo could be a good basis to build a new builtin
> "submodule--foreach" that is a pure helper hidden from the end users
> that does both; cmd_foreach() in git-submodule.sh can simply delegate
> to it.

I like that approach, because the operations are very similar from
the user's point of view. But please remember that internally they
would work differently, as submodule foreach walks the index and
only descends into those submodules that are populated (and contain
a .git directory or file) while for-each-repo scans the whole work
tree, which makes it a more expensive operation.

>> All of these problems are probably solvable, but it would require
>> quite some reworking of git-submodule.sh
> 
> Of course some work is needed, but we do not have to convert all the
> cmd_foo in git-submodule.sh in one step.  For the purpose of
> unifying for-each-repo and submodule foreach to deliver the
> functionality sooner to the end users, we can go the route to add
> only the submodule--foreach builtin, out of which we will get
> reusable implementation of module_list and other helper functions we
> can leverage later to do other cmd_foo functions.

I really like that idea!


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 20:12               ` Jens Lehmann
@ 2013-01-28 20:34                 ` Junio C Hamano
  2013-01-28 21:25                   ` Jens Lehmann
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2013-01-28 20:34 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Am 28.01.2013 19:51, schrieb Junio C Hamano:
>> Lars Hjemli <hjemli@gmail.com> writes:
>> 
>>>> Come to think of it, is there a reason why "for-each-repo" should
>>>> not be an extension to "submodule foreach"?  We can view this as
>>>> visiting repositories that _could_ be registered as a submodule, in
>>>> addition to iterating over the registered submodules, no?
>>>
>>> Yes, but I see some possible problems with that approach:
>>> -'git for-each-repo' does not need to be started from within a git worktree
>> 
>> True, but "git submodule foreach --untracked" can be told that it is
>> OK not (yet) to be in any superproject, no?
>
> Hmm, I'm not sure how that would work as it looks for gitlinks
> in the index which point to work tree paths.

I was imagining that "foreach --untracked" could go something like this:

 * If you are inside an existing git repository, read its index to
   learn the gitlinks in the directory and its subdirectories.

 * Start from the current directory and recursively apply the
   procedure in this step:

   * Scan the directory and iterate over the ones that have ".git" in
     it:

     * If it is a gitlinked one, show it, but do not descend into it
       unless --recursive is given (e.g. you start from /home/jens,
       find /home/jens/proj/ directory that has /home/jens/proj/.git
       in it.  /home/jens/.git/index knows that it is a submodule of
       the top-level superproject.  "proj" is handled, and it is up
       to the --recursive option if its submodules are handled).

     * If it is _not_ a gitlinked one, show it and descend into it
       (e.g. /home/jens/ is not a repository or /home/jens/proj is
       not a tracked submodule) to apply this procedure recursively.

Of course, without --untracked, we have no need to iterate over the
readdir() return values; instead we just scan the index of the
top-level superproject.

>>> -'git for-each-repo' and 'git submodule foreach' have different
>>> semantics for --dirty and --clean
>
> I'm confused, what semantics of --dirty and --clean does current
> 'git submodule foreach' have? I can't find any sign of it in the
> current code ... did I miss something while skimming through this
> thread? Or are you talking about status and diff here?

I think Lars is hinting that "submodule foreach" could restrict its
operation to a similar --dirty/--clean/--both option he has.  Of
course, the command given to foreach can decide to become no-op by
inspecting the submodule itself, so in that sense, --dirty/--clean
can be done without, but I think it would make sense to have it in
"submodule foreach" even without the "--untracked" option.

> But I think the current for-each-repo
> proposal doesn't allow traversing repos which contain untracked
> content (and it would be nice if the user could somehow combine
> that with the current --dirty flag to have both in one go).

Perhaps.  I personally felt it was really strange that submodule
diff and status consider that it is a sin to have untracked and
unignored cruft in the submodule working tree, though.


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 20:34                 ` Junio C Hamano
@ 2013-01-28 21:25                   ` Jens Lehmann
  2013-02-04  6:41                     ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Jens Lehmann @ 2013-01-28 21:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt

On 28.01.2013 21:34, Junio C Hamano wrote:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> On 28.01.2013 19:51, Junio C Hamano wrote:
>>> Lars Hjemli <hjemli@gmail.com> writes:
>>>
>>>>> Come to think of it, is there a reason why "for-each-repo" should
>>>>> not be an extension to "submodule foreach"?  We can view this as
>>>>> visiting repositories that _could_ be registered as a submodule, in
>>>>> addition to iterating over the registered submodules, no?
>>>>
>>>> Yes, but I see some possible problems with that approach:
>>>> -'git for-each-repo' does not need to be started from within a git worktree
>>>
>>> True, but "git submodule foreach --untracked" can be told that it is
>>> OK not (yet) to be in any superproject, no?
>>
>> Hmm, I'm not sure how that would work as it looks for gitlinks
>> in the index which point to work tree paths.
> 
> I was imagining that "foreach --untracked" could go something like this:
> 
>  * If you are inside an existing git repository, read its index to
>    learn the gitlinks in the directory and its subdirectories.
> 
>  * Start from the current directory and recursively apply the
>    procedure in this step:
> 
>    * Scan the directory and iterate over the ones that have ".git" in
>      it:
> 
>      * If it is a gitlinked one, show it, but do not descend into it
>        unless --recursive is given (e.g. you start from /home/jens,
>        find /home/jens/proj/ directory that has /home/jens/proj/.git
>        in it.  /home/jens/.git/index knows that it is a submodule of
>        the top-level superproject.  "proj" is handled, and it is up
>        to the --recursive option if its submodules are handled).
> 
>      * If it is _not_ a gitlinked one, show it and descend into it
>        (e.g. /home/jens/ is not a repository or /home/jens/proj is
>        not a tracked submodule) to apply this procedure recursively.
> 
> Of course, without --untracked, we have no need to iterate over the
> readdir() return values; instead we just scan the index of the
> top-level superproject.

Thanks for explaining, that makes tons of sense.

>>>> -'git for-each-repo' and 'git submodule foreach' have different
>>>> semantics for --dirty and --clean
>>
>> I'm confused, what semantics of --dirty and --clean does current
>> 'git submodule foreach' have? I can't find any sign of it in the
>> current code ... did I miss something while skimming through this
>> thread? Or are you talking about status and diff here?
> 
> I think Lars is hinting that "submodule foreach" could restrict its
> operation to a similar --dirty/--clean/--both option he has.  Of
> course, the command given to foreach can decide to become no-op by
> inspecting the submodule itself, so in that sense, --dirty/--clean
> can be done without, but I think it would make sense to have it in
> "submodule foreach" even without the "--untracked" option.

Nice idea. E.g. that would help submodule users to easily script
a workflow which descends only into modified submodules to create
branches and push them there. Or to remove branches which were
created everywhere only in those submodules that weren't changed.

>> But I think the current for-each-repo
>> proposal doesn't allow traversing repos which contain untracked
>> content (and it would be nice if the user could somehow combine
>> that with the current --dirty flag to have both in one go).
> 
> Perhaps.  I personally felt it was really strange that submodule
> diff and status consider that it is a sin to have untracked and
> unignored cruft in the submodule working tree, though.

The VCS we used at work before Git didn't show us any untracked
files, which caused trouble on a regular basis as people were
breaking builds for others because they forgot to check in new
files. That didn't happen with Git anymore, which was very cool.
But the problem reappeared as we started using submodules. Since
I taught status and diff to show that, we're happy again. So for
us it was everything but strange ;-)

But for for-each-repo I would rather propose that modifications of
tracked files can optionally and/or solely be used to pick the
repos. Maybe: --dirty=modified, --dirty=untracked and --dirty=both
with --dirty defaulting to modified?
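
A rough sketch of how the three proposed --dirty values could be told
apart, assuming the decision is made from `git status --porcelain`
output. The helper name is_dirty is hypothetical, and reading "both"
as "either kind of change selects the repo" is an assumption about the
proposal, not something the thread settles.

```shell
#!/bin/sh
# is_dirty MODE: read `git status --porcelain` (v1) output on stdin and
# succeed if the repository counts as dirty under MODE.
# Hypothetical helper; MODE is one of modified, untracked, both.
is_dirty () {
	awk -v mode="$1" '
		/^\?\? / { untracked = 1; next }   # untracked path
		/^!! /   { next }                  # ignored path (only with --ignored)
		         { modified = 1 }          # any other entry is a tracked change
		END {
			if (mode == "modified")  exit (modified ? 0 : 1)
			if (mode == "untracked") exit (untracked ? 0 : 1)
			if (mode == "both")      exit ((modified || untracked) ? 0 : 1)
			exit 2                     # unknown mode
		}'
}
```

It could be driven per repository with something like
`( cd "$repo" && git status --porcelain ) | is_dirty modified`.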


* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 21:25                   ` Jens Lehmann
@ 2013-02-04  6:41                     ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-02-04  6:41 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> On 28.01.2013 21:34, Junio C Hamano wrote:
> ...
>> I was imagining that "foreach --untracked" could go something like this:
>> 
>>  * If you are inside an existing git repository, read its index to
>>    learn the gitlinks in the directory and its subdirectories.
>> 
>>  * Start from the current directory and recursively apply the
>>    procedure in this step:
>> 
>>    * Scan the directory and iterate over the ones that have ".git" in
>>      it:
>> 
>>      * If it is a gitlinked one, show it, but do not descend into it
>>        unless --recursive is given (e.g. you start from /home/jens,
>>        find /home/jens/proj/ directory that has /home/jens/proj/.git
>>        in it.  /home/jens/.git/index knows that it is a submodule of
>>        the top-level superproject.  "proj" is handled, and it is up
>>        to the --recursive option if its submodules are handled).
>> 
>>      * If it is _not_ a gitlinked one, show it and descend into it
>>        (e.g. /home/jens/ is not a repository or /home/jens/proj is
>>        not a tracked submodule) to apply this procedure recursively.
>> 
>> Of course, without --untracked, we have no need to iterate over the
>> readdir() return values; instead we just scan the index of the
>> top-level superproject.
>
> Thanks for explaining, that makes tons of sense.

There is a small thinko above, though, and I'd like to correct it
before anybody takes the above too seriously as _the_ outline of the
design and implements it to the letter.

The --recursive option should govern both a tracked submodule and an
untracked one.  When asking to list both existing submodules and
directories that could become submodules, you should be able to say

	$ git submodule foreach --untracked

to list the direct submodules and the directories with .git in them
that are not yet submodules of the top-level superproject, but the
latter is limited to those with no parent directories with .git in
them (other than the top-level of the working tree of the
superproject).  With

	$ git submodule foreach --untracked --recursive

you would see submodules and their submodules recursively, and also
directories with .git in them (i.e. candidates to become direct
submodules of the superproject) and the directories with .git in
them inside such submodule candidates (i.e. candidates to become
direct submodules of the directories that could become direct
submodules of the superproject) recursively.

If we set things up this way:

	mkdir -p a/b c/d &&
	for d in . a a/b c c/d
	do
		git init "$d" &&
		( cd "$d" && git commit --allow-empty -m initial )
	done &&
	git add a &&
	( cd a && git add b )

The expected results for various combinations are:

 * "git submodule foreach" would visit 'a' and nothing else;
 * "git submodule foreach --recursive" would visit 'a' and 'a/b';
 * "git submodule foreach --untracked" would visit 'a' and 'c'; and
 * "git submodule foreach --untracked --recursive" would visit all four.
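
The traversal implied by these four results can be sketched in shell
as below. This is only an illustration under stated assumptions: the
function names (gitlinks_of, visit, scan_untracked) and the yes/no
argument convention are hypothetical, "--untracked" does not exist in
git-submodule.sh, and paths that ls-files would quote are not handled.

```shell
#!/bin/sh
# Sketch only; none of these helpers exist in git.

# gitlinks_of DIR: print the gitlink (mode 160000) paths that the index
# of the repository containing DIR records at or below DIR.
gitlinks_of () {
	( cd "$1" && git ls-files --stage 2>/dev/null ) |
	awk -F '\t' '$1 ~ /^160000 / { print $2 }'
}

# visit DIR UNTRACKED RECURSIVE: print the repositories found below DIR.
# UNTRACKED and RECURSIVE are "yes" or "no".
visit () {
	# Populated submodules recorded in the index come first.
	gitlinks_of "$1" | while IFS= read -r path
	do
		test -e "$1/$path/.git" || continue
		echo "$1/$path"
		if test "$3" = yes
		then
			( visit "$1/$path" "$2" "$3" ) </dev/null
		fi
	done
	# Only --untracked needs to look at the directory contents.
	if test "$2" = yes
	then
		scan_untracked "$1" "$1" "$3"
	fi
}

# scan_untracked REPO DIR RECURSIVE: look below DIR for directories that
# contain ".git" but are not gitlinks of REPO.
scan_untracked () {
	for sub in "$2"/*
	do
		test -d "$sub" && ! test -L "$sub" || continue
		rel=${sub#"$1"/}
		if ! test -e "$sub/.git"
		then
			# Plain directory: keep looking for repositories below it.
			( scan_untracked "$1" "$sub" "$3" )
		elif ! gitlinks_of "$1" | grep -Fxq -- "$rel"
		then
			# Submodule candidate: show it, look inside only if RECURSIVE.
			echo "$sub"
			if test "$3" = yes
			then
				( visit "$sub" yes yes )
			fi
		fi
	done
}
```

Run from the top of the setup above, `visit . no no` corresponds to
plain "foreach", `visit . no yes` to "--recursive", `visit . yes no`
to "--untracked" and `visit . yes yes` to "--untracked --recursive".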



Thread overview: 18+ messages
2013-01-27 12:46 [PATCH v4 0/2] for-each-repo: new command for multi-repo operations Lars Hjemli
2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
2013-01-27 19:04   ` Junio C Hamano
2013-01-27 19:42     ` John Keeping
2013-01-27 19:45       ` Junio C Hamano
2013-01-28  7:50     ` Lars Hjemli
2013-01-28  8:10       ` Jonathan Nieder
2013-01-28 17:11         ` Lars Hjemli
2013-01-28 18:35           ` Junio C Hamano
2013-01-28 17:45         ` Junio C Hamano
2013-01-28 18:35           ` Lars Hjemli
2013-01-28 18:51             ` Junio C Hamano
2013-01-28 19:42               ` Lars Hjemli
2013-01-28 20:12               ` Jens Lehmann
2013-01-28 20:34                 ` Junio C Hamano
2013-01-28 21:25                   ` Jens Lehmann
2013-02-04  6:41                     ` Junio C Hamano
2013-01-27 12:46 ` [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command Lars Hjemli
