* [PATCH v4 0/2] for-each-repo: new command for multi-repo operations
@ 2013-01-27 12:46 Lars Hjemli
  2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
  2013-01-27 12:46 ` [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command Lars Hjemli
  0 siblings, 2 replies; 18+ messages in thread
From: Lars Hjemli @ 2013-01-27 12:46 UTC (permalink / raw)
  To: git; +Cc: Lars Hjemli

Changes since v3:
* option -x used to execute non-git commands
* option -z used to NUL-terminate paths
* write_name_quoted() used to print repo paths
* repos are handled in sorted order (as defined by strcmp(3)) to get
  predictable output from the command
* unsetenv() reintroduced to avoid problems from GIT_DIR/WORK_TREE
* more tests

Lars Hjemli (2):
  for-each-repo: new command used for multi-repo operations
  git: rewrite `git -a` to become a git-for-each-repo command

 .gitignore                          |   1 +
 Documentation/git-for-each-repo.txt |  71 ++++++++++++
 Makefile                            |   1 +
 builtin.h                           |   1 +
 builtin/for-each-repo.c             | 145 ++++++++++++++++++++++++
 git.c                               |  37 +++++++
 t/t6400-for-each-repo.sh            | 213 ++++++++++++++++++++++++++++++++++++
 7 files changed, 469 insertions(+)
 create mode 100644 Documentation/git-for-each-repo.txt
 create mode 100644 builtin/for-each-repo.c
 create mode 100755 t/t6400-for-each-repo.sh

-- 
1.8.1.1.349.g4cdd23e

^ permalink raw reply [flat|nested] 18+ messages in thread
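The `-z` item in the changelog above can be made visible in a terminal by mapping each NUL byte to a printable character. The sketch below is only an illustration: `emit_paths_z` is a hypothetical stand-in that imitates what `git for-each-repo -z` is described as printing (the command exists only in this series), and `nul_to_q` is an assumed helper name.

```shell
#!/bin/sh
# Hypothetical stand-in for `git for-each-repo -z`: print each path
# followed by a NUL byte, as the -z option is described to do.
emit_paths_z () {
	printf '%s\000' . clean clean/gitfile dirty-idx dirty-wt
}

# Map every NUL byte to "Q" so the record terminators become visible.
nul_to_q () {
	tr '\000' Q
}

emit_paths_z | nul_to_q
echo
```

Because every path is terminated rather than separated, the converted stream ends in `Q` after the last path as well.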
* [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 12:46 [PATCH v4 0/2] for-each-repo: new command for multi-repo operations Lars Hjemli
@ 2013-01-27 12:46 ` Lars Hjemli
  2013-01-27 19:04   ` Junio C Hamano
  2013-01-27 12:46 ` [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command Lars Hjemli
  1 sibling, 1 reply; 18+ messages in thread
From: Lars Hjemli @ 2013-01-27 12:46 UTC (permalink / raw)
  To: git; +Cc: Lars Hjemli

When working with multiple, unrelated (or loosely related) git repos,
there is often a need to locate all repos with uncommitted work and
perform some action on them (say, commit and push). Before this patch,
such tasks would require manually visiting all repositories, running
`git status` within each one and then deciding what to do next.

This mundane task can now be automated by e.g. `git for-each-repo --dirty
status`, which will find all non-bare git repositories below the current
directory (even nested ones), check if they are dirty (as defined by
`git diff --quiet && git diff --cached --quiet`), and for each dirty repo
print the path to the repo and then execute `git status` within the repo.

The command also honours the option '--clean', which restricts the set of
repos to those which '--dirty' would skip, and '-x', which is used to
execute non-git commands.

Finally, the command to execute within each repo is optional. If none is
given, git-for-each-repo will just print the path to each repo found. And
since the command supports -z, this can be used for more advanced scripting
needs.

Note: since git-for-each-repo can execute both git and non-git commands,
it must cd into the worktree of each repository before executing the
command. There is then no need for the environment variables
$GIT_WORK_TREE and $GIT_DIR to be specified, so git-for-each-repo will
instead unset these variables to stop them from interfering with the
executed commands.

Signed-off-by: Lars Hjemli <hjemli@gmail.com> --- .gitignore | 1 + Documentation/git-for-each-repo.txt | 71 +++++++++++++++++ Makefile | 1 + builtin.h | 1 + builtin/for-each-repo.c | 145 ++++++++++++++++++++++++++++++++++ git.c | 1 + t/t6400-for-each-repo.sh | 150 ++++++++++++++++++++++++++++++++++++ 7 files changed, 370 insertions(+) create mode 100644 Documentation/git-for-each-repo.txt create mode 100644 builtin/for-each-repo.c create mode 100755 t/t6400-for-each-repo.sh diff --git a/.gitignore b/.gitignore index aa258a6..0c27981 100644 --- a/.gitignore +++ b/.gitignore @@ -56,6 +56,7 @@ /git-filter-branch /git-fmt-merge-msg /git-for-each-ref +/git-for-each-repo /git-format-patch /git-fsck /git-fsck-objects diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt new file mode 100644 index 0000000..fb12b3f --- /dev/null +++ b/Documentation/git-for-each-repo.txt @@ -0,0 +1,71 @@ +git-for-each-repo(1) +==================== + +NAME +---- +git-for-each-repo - Execute a git command in multiple non-bare repositories + +SYNOPSIS +-------- +[verse] +'git for-each-repo' [-acdxz] [command] + +DESCRIPTION +----------- +The git-for-each-repo command is used to locate all non-bare git +repositories within the current directory tree, and optionally +execute a git command in each of the found repos. + +OPTIONS +------- +-a:: +--all:: + Include both clean and dirty repositories (this is the default + behaviour of `git-for-each-repo`). + +-c:: +--clean:: + Only include repositories with a clean worktree. + +-d:: +--dirty:: + Only include repositories with a dirty worktree. + +-x:: + Execute a genric (non-git) command in each repo. + +-z:: + Terminate each path name with the NUL character. 
+ +EXAMPLES +-------- + +Various ways to exploit this command:: ++ +------------ +$ git for-each-repo <1> +$ git for-each-repo fetch <2> +$ git for-each-repo -d gui <3> +$ git for-each-repo -c push <4> +$ git for-each-repo -x du -sh <5> +------------ ++ +<1> Print the path to all repos found below the current directory. + +<2> Fetch updates from default remote in all repos. + +<3> Start linkgit:git-gui[1] in each repo containing uncommitted changes. + +<4> Push the current branch in each repo with no uncommited changes. + +<5> Print disk-usage for each repository. + +NOTES +----- + +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a +worktree with uncommitted changes. + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Makefile b/Makefile index a786d4c..8c42c17 100644 --- a/Makefile +++ b/Makefile @@ -870,6 +870,7 @@ BUILTIN_OBJS += builtin/fetch-pack.o BUILTIN_OBJS += builtin/fetch.o BUILTIN_OBJS += builtin/fmt-merge-msg.o BUILTIN_OBJS += builtin/for-each-ref.o +BUILTIN_OBJS += builtin/for-each-repo.o BUILTIN_OBJS += builtin/fsck.o BUILTIN_OBJS += builtin/gc.o BUILTIN_OBJS += builtin/grep.o diff --git a/builtin.h b/builtin.h index 7e7bbd6..02fc712 100644 --- a/builtin.h +++ b/builtin.h @@ -73,6 +73,7 @@ extern int cmd_fetch(int argc, const char **argv, const char *prefix); extern int cmd_fetch_pack(int argc, const char **argv, const char *prefix); extern int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix); extern int cmd_for_each_ref(int argc, const char **argv, const char *prefix); +extern int cmd_for_each_repo(int argc, const char **argv, const char *prefix); extern int cmd_format_patch(int argc, const char **argv, const char *prefix); extern int cmd_fsck(int argc, const char **argv, const char *prefix); extern int cmd_gc(int argc, const char **argv, const char *prefix); diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c new file mode 100644 index 0000000..9333ae0 --- /dev/null +++ 
b/builtin/for-each-repo.c @@ -0,0 +1,145 @@ +/* + * "git for-each-repo" builtin command. + * + * Copyright (c) 2013 Lars Hjemli <hjemli@gmail.com> + */ +#include "cache.h" +#include "color.h" +#include "quote.h" +#include "builtin.h" +#include "run-command.h" +#include "parse-options.h" + +#define ALL 0 +#define DIRTY 1 +#define CLEAN 2 + +static char *color = GIT_COLOR_NORMAL; +static int eol = '\n'; +static int match; +static int runopt = RUN_GIT_CMD; + +static const char * const builtin_foreachrepo_usage[] = { + N_("git for-each-repo [-acdxz] [cmd]"), + NULL +}; + +static struct option builtin_foreachrepo_options[] = { + OPT_SET_INT('a', "all", &match, N_("match both clean and dirty repositories"), ALL), + OPT_SET_INT('c', "clean", &match, N_("only show clean repositories"), CLEAN), + OPT_SET_INT('d', "dirty", &match, N_("only show dirty repositories"), DIRTY), + OPT_SET_INT('x', NULL, &runopt, N_("execute generic (non-git) command"), 0), + OPT_SET_INT('z', NULL, &eol, N_("terminate each repo path with NUL character"), 0), + OPT_END(), +}; + +static int get_repo_state(const char *dir) +{ + const char *diffidx[] = {"diff", "--quiet", "--cached", NULL}; + const char *diffwd[] = {"diff", "--quiet", NULL}; + + if (run_command_v_opt_cd_env(diffidx, RUN_GIT_CMD, dir, NULL) != 0) + return DIRTY; + if (run_command_v_opt_cd_env(diffwd, RUN_GIT_CMD, dir, NULL) != 0) + return DIRTY; + return CLEAN; +} + +static void print_repo_path(const char *path, unsigned pretty) +{ + if (path[0] == '.' 
&& path[1] == '/') + path += 2; + if (pretty) + color_fprintf_ln(stdout, color, "[%s]", path); + else + write_name_quoted(path, stdout, eol); +} + +static void handle_repo(struct strbuf *path, const char **argv) +{ + const char *gitdir; + int len; + + len = path->len; + strbuf_addstr(path, ".git"); + gitdir = resolve_gitdir(path->buf); + strbuf_setlen(path, len - 1); + if (!gitdir) + goto done; + if (match != ALL && match != get_repo_state(path->buf)) + goto done; + print_repo_path(path->buf, *argv != NULL); + if (*argv) + run_command_v_opt_cd_env(argv, runopt, path->buf, NULL); +done: + strbuf_addstr(path, "/"); +} + +static int walk(struct strbuf *path, int argc, const char **argv) +{ + DIR *dir; + struct dirent *ent; + struct stat st; + size_t len; + int has_dotgit = 0; + struct string_list list = STRING_LIST_INIT_DUP; + struct string_list_item *item; + + dir = opendir(path->buf); + if (!dir) + return errno; + strbuf_addstr(path, "/"); + len = path->len; + while ((ent = readdir(dir))) { + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) + continue; + if (!strcmp(ent->d_name, ".git")) { + has_dotgit = 1; + continue; + } + switch (DTYPE(ent)) { + case DT_UNKNOWN: + case DT_LNK: + /* Use stat() to figure out if this path leads + * to a directory - it's not important if it's + * a symlink which gets us there. 
+ */ + strbuf_setlen(path, len); + strbuf_addstr(path, ent->d_name); + if (stat(path->buf, &st) || !S_ISDIR(st.st_mode)) + break; + /* fallthrough */ + case DT_DIR: + string_list_append(&list, ent->d_name); + break; + } + } + closedir(dir); + strbuf_setlen(path, len); + if (has_dotgit) + handle_repo(path, argv); + sort_string_list(&list); + for_each_string_list_item(item, &list) { + strbuf_setlen(path, len); + strbuf_addstr(path, item->string); + walk(path, argc, argv); + } + string_list_clear(&list, 0); + return 0; +} + +int cmd_for_each_repo(int argc, const char **argv, const char *prefix) +{ + struct strbuf path = STRBUF_INIT; + + unsetenv(GIT_DIR_ENVIRONMENT); + unsetenv(GIT_WORK_TREE_ENVIRONMENT); + argc = parse_options(argc, argv, prefix, + builtin_foreachrepo_options, + builtin_foreachrepo_usage, + PARSE_OPT_STOP_AT_NON_OPTION); + if (want_color(GIT_COLOR_AUTO)) + color = GIT_COLOR_YELLOW; + strbuf_addstr(&path, "."); + return walk(&path, argc, argv); +} diff --git a/git.c b/git.c index ed66c66..6b53169 100644 --- a/git.c +++ b/git.c @@ -337,6 +337,7 @@ static void handle_internal_command(int argc, const char **argv) { "fetch-pack", cmd_fetch_pack, RUN_SETUP }, { "fmt-merge-msg", cmd_fmt_merge_msg, RUN_SETUP }, { "for-each-ref", cmd_for_each_ref, RUN_SETUP }, + { "for-each-repo", cmd_for_each_repo }, { "format-patch", cmd_format_patch, RUN_SETUP }, { "fsck", cmd_fsck, RUN_SETUP }, { "fsck-objects", cmd_fsck, RUN_SETUP }, diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh new file mode 100755 index 0000000..af02c0c --- /dev/null +++ b/t/t6400-for-each-repo.sh @@ -0,0 +1,150 @@ +#!/bin/sh +# +# Copyright (c) 2013 Lars Hjemli +# + +test_description='Test the git-for-each-repo command' + +. 
./test-lib.sh + +qname="with\"quote" +qqname="\"with\\\"quote\"" + +test_expect_success "setup" ' + test_create_repo clean && + (cd clean && test_commit foo1) && + git init --separate-git-dir=.cleansub clean/gitfile && + (cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) && + test_create_repo dirty-idx && + (cd dirty-idx && test_commit foo3 && git rm foo3.t) && + test_create_repo dirty-wt && + (cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git && + test_commit foo4 && rm foo4.t) && + test_create_repo "$qname" && + (cd "$qname" && test_commit foo5) && + mkdir fakedir && mkdir fakedir/.git +' + +test_expect_success "without filtering, all repos are included" ' + echo "." >expect && + echo "clean" >>expect && + echo "clean/gitfile" >>expect && + echo "dirty-idx" >>expect && + echo "dirty-wt" >>expect && + echo "$qqname" >>expect && + git for-each-repo >actual && + test_cmp expect actual +' + +test_expect_success "-z NUL-terminates each path" ' + echo "(.)" >expect && + echo "(clean)" >>expect && + echo "(clean/gitfile)" >>expect && + echo "(dirty-idx)" >>expect && + echo "(dirty-wt)" >>expect && + echo "($qname)" >>expect && + git for-each-repo -z | xargs -0 printf "(%s)\n" >actual && + test_cmp expect actual +' + +test_expect_success "--dirty only includes dirty repos" ' + echo "clean/gitfile" >expect && + echo "dirty-idx" >>expect && + echo "dirty-wt" >>expect && + git for-each-repo --dirty >actual && + test_cmp expect actual +' + +test_expect_success "--clean only includes clean repos" ' + echo "." 
>expect && + echo "clean" >>expect && + echo "$qqname" >>expect && + git for-each-repo --clean >actual && + test_cmp expect actual +' + +test_expect_success "run a git-command in all repos" ' + echo "[.]" >expect && + echo "[clean]" >>expect && + echo "[clean/gitfile]" >>expect && + echo " M foo2.t" >>expect && + echo "[dirty-idx]" >>expect && + echo "D foo3.t" >>expect && + echo "[dirty-wt]" >>expect && + echo " D foo4.t" >> expect + echo "[$qname]" >>expect && + git for-each-repo status -suno >actual && + test_cmp expect actual +' + +test_expect_success "run a git-command in dirty repos only" ' + echo "[clean/gitfile]" >expect && + echo " M foo2.t" >>expect && + echo "[dirty-idx]" >>expect && + echo "D foo3.t" >>expect && + echo "[dirty-wt]" >>expect && + echo " D foo4.t" >> expect + git for-each-repo -d status -suno >actual && + test_cmp expect actual +' + +test_expect_success "run a git-command in clean repos only" ' + echo "[.]" >expect && + echo "[clean]" >>expect && + echo "foo1.t" >>expect && + echo "[$qname]" >>expect && + echo "foo5.t" >>expect && + git for-each-repo -c ls-files >actual && + test_cmp expect actual +' + +test_expect_success "-z is disabled when a command is run" ' + echo "[.]" >expect && + echo "[clean]" >>expect && + echo "foo1.t" >>expect && + echo "[$qname]" >>expect && + echo "foo5.t" >>expect && + git for-each-repo -cz ls-files >actual && + test_cmp expect actual +' + +test_expect_success "-x executes any command in each repo" ' + echo "[.]" >expect && + echo "$HOME" >>expect && + echo "[clean]" >>expect && + echo "$HOME/clean" >>expect && + echo "[clean/gitfile]" >>expect && + echo "$HOME/clean/gitfile" >>expect && + echo "[dirty-idx]" >>expect && + echo "$HOME/dirty-idx" >>expect && + echo "[dirty-wt]" >>expect && + echo "$HOME/dirty-wt" >> expect + echo "[$qname]" >>expect && + echo "$HOME/$qname" >>expect && + git for-each-repo -x pwd >actual && + test_cmp expect actual +' + +test_expect_success "-cx executes any command in clean 
repos" ' + echo "[.]" >expect && + echo "$HOME" >>expect && + echo "[clean]" >>expect && + echo "$HOME/clean" >>expect && + echo "[$qname]" >>expect && + echo "$HOME/$qname" >>expect && + git for-each-repo -cx pwd >actual && + test_cmp expect actual +' + +test_expect_success "-dx executes any command in dirty repos" ' + echo "[clean/gitfile]" >expect && + echo "$HOME/clean/gitfile" >>expect && + echo "[dirty-idx]" >>expect && + echo "$HOME/dirty-idx" >>expect && + echo "[dirty-wt]" >>expect && + echo "$HOME/dirty-wt" >> expect + git for-each-repo -dx pwd >actual && + test_cmp expect actual +' + +test_done -- 1.8.1.1.349.g4cdd23e ^ permalink raw reply related [flat|nested] 18+ messages in thread
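The directory walk in this patch can be approximated with standard tools when experimenting with the idea. The following is a rough sketch under stated assumptions, not the patch's implementation: `list_repo_candidates` is a hypothetical helper; it only looks for entries named `.git` and does not validate them the way `resolve_gitdir()` does (so a directory like `fakedir` from the tests would be listed); it does not follow symlinked directories; it sorts whole paths bytewise rather than sorting the entries of each directory; and it does not cope with newlines in path names.

```shell
#!/bin/sh
# List directories below $1 that contain an entry named ".git".
# Hypothetical helper for illustration only; see the caveats above.
list_repo_candidates () {
	(
		cd "$1" &&
		find . -name .git -prune -print |
		sed -e 's|/\.git$||' -e 's|^\./||' |
		LC_ALL=C sort
	)
}

# Small demonstration in a scratch directory.
demo=$(mktemp -d) &&
mkdir -p "$demo/clean/.git" "$demo/clean/nested/.git" "$demo/plain" &&
list_repo_candidates "$demo"
rm -rf "$demo"
```

The `-prune` keeps `find` from descending into the `.git` directories themselves, while nested working trees such as `clean/nested` are still visited.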
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
@ 2013-01-27 19:04   ` Junio C Hamano
  2013-01-27 19:42     ` John Keeping
  2013-01-28  7:50     ` Lars Hjemli
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-01-27 19:04 UTC (permalink / raw)
  To: Lars Hjemli; +Cc: git

Lars Hjemli <hjemli@gmail.com> writes:

> When working with multiple, unrelated (or loosly related) git repos,
> there is often a need to locate all repos with uncommitted work and
> perform some action on them (say, commit and push). Before this patch,
> such tasks would require manually visiting all repositories, running
> `git status` within each one and then decide what to do next.
>
> This mundane task can now be automated by e.g. `git for-each-repo --dirty
> status`, which will find all non-bare git repositories below the current
> directory (even nested ones), check if they are dirty (as defined by
> `git diff --quiet && git diff --cached --quiet`), and for each dirty repo
> print the path to the repo and then execute `git status` within the repo.
>
> The command also honours the option '--clean' which restricts the set of
> repos to those which '--dirty' would skip, and '-x' which is used to
> execute non-git commands.

It might make sense to internally use the RUN_GIT_CMD flag when the
first word of the command line is 'git' as an optimization, but
I am not sure it is a good idea to force end users to think about
when to use -x and when not to.

In other words, I think

	git for-each-repo -d diff --name-only
	git for-each-repo -d -x ls '*.c'

is less nice than letting the user say

	git for-each-repo -d git diff --name-only
	git for-each-repo -d ls '*.c'

> Finally, the command to execute within each repo is optional. If none is
> given, git-for-each-repo will just print the path to each repo found.
And > since the command supports -z, this can be used for more advanced scripting > needs. It amounts to the same thing, but I would rather describe it as: To allow scripts to handle paths with shell-unsafe characters, support "-z" to show paths with NUL termination. Otherwise, such paths are shown with the usual c-quoting. One more thing that nobody brought up during the previous reviews is if we want to support subset of repositories by allowing the standard pathspec match mechanism. For example, git for-each-repo -d git diff --name-only -- foo/ bar/b\*z might be a way to ask "please find repositories match the given pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that are dirty". We would need to think about how to mark the end of the command though---we could borrow \; from find(1), even though find is not the best example of the UI design. I.e. git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z with or without "--". > diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt > new file mode 100644 > index 0000000..fb12b3f > --- /dev/null > +++ b/Documentation/git-for-each-repo.txt > @@ -0,0 +1,71 @@ > +git-for-each-repo(1) > +==================== > + > +NAME > +---- > +git-for-each-repo - Execute a git command in multiple non-bare repositories There is a separate topic in flight that turns s/git/Git/ when we refer to the system as a whole. In any case, this is no longer limited to "execute a Git command". Find non-bare Git repositories in subdirectories or Find or execute a command in non-bare Git repositories in subdirectories perhaps? > +SYNOPSIS > +-------- > +[verse] > +'git for-each-repo' [-acdxz] [command] > + > +DESCRIPTION > +----------- > +The git-for-each-repo command is used to locate all non-bare git Should be sufficient to say s/is used to locate/locates/. > +repositories within the current directory tree, and optionally > +execute a git command in each of the found repos. 
s/a git command/a command/; > +OPTIONS > +------- > ... > +-x:: > + Execute a genric (non-git) command in each repo. Drop this option. > +NOTES > +----- > + > +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a > +worktree with uncommitted changes. Is it a definition that is different from usual? If so why does it need to be inconsistent with the rest of the system? > diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c > new file mode 100644 > index 0000000..9333ae0 > --- /dev/null > +++ b/builtin/for-each-repo.c > @@ -0,0 +1,145 @@ > +/* > + * "git for-each-repo" builtin command. > + * > + * Copyright (c) 2013 Lars Hjemli <hjemli@gmail.com> > + */ > +#include "cache.h" > +#include "color.h" > +#include "quote.h" > +#include "builtin.h" > +#include "run-command.h" > +#include "parse-options.h" > + > +#define ALL 0 > +#define DIRTY 1 > +#define CLEAN 2 > + > +static char *color = GIT_COLOR_NORMAL; > +static int eol = '\n'; > +static int match; > +static int runopt = RUN_GIT_CMD; > + > +static const char * const builtin_foreachrepo_usage[] = { > + N_("git for-each-repo [-acdxz] [cmd]"), > + NULL > +}; > + > +static struct option builtin_foreachrepo_options[] = { > + OPT_SET_INT('a', "all", &match, N_("match both clean and dirty repositories"), ALL), > + OPT_SET_INT('c', "clean", &match, N_("only show clean repositories"), CLEAN), > + OPT_SET_INT('d', "dirty", &match, N_("only show dirty repositories"), DIRTY), > + OPT_SET_INT('x', NULL, &runopt, N_("execute generic (non-git) command"), 0), > + OPT_SET_INT('z', NULL, &eol, N_("terminate each repo path with NUL character"), 0), > + OPT_END(), > +}; > + > +static int get_repo_state(const char *dir) > +{ > + const char *diffidx[] = {"diff", "--quiet", "--cached", NULL}; > + const char *diffwd[] = {"diff", "--quiet", NULL}; > + > + if (run_command_v_opt_cd_env(diffidx, RUN_GIT_CMD, dir, NULL) != 0) > + return DIRTY; > + if (run_command_v_opt_cd_env(diffwd, RUN_GIT_CMD, dir, NULL) != 0) > + 
return DIRTY;
> +	return CLEAN;
> +}
> +
> +static void print_repo_path(const char *path, unsigned pretty)
> +{
> +	if (path[0] == '.' && path[1] == '/')
> +		path += 2;
> +	if (pretty)
> +		color_fprintf_ln(stdout, color, "[%s]", path);

This is shown before running a command in that repository. I am of
two minds. It certainly is nice to be able to tell which repository
each block of output lines comes from, and not requiring the commands
to do this themselves is a good default. However, I wonder if people
would want to do something like this:

	git for-each-repo sh -c '
		git diff --name-only |
		sed -e "s|^|$path/|"
	'

to get a consolidated view, in a way similar to how "submodule
foreach" can be used. This unconditional output will get in the way
for such a use case.

Oh, that reminds me of another thing. Perhaps we would want to
export the (relative) path to the found repository in some way to
allow the commands to do this kind of thing in the first place?
"submodule foreach" does this with $path, I think.

> +	else
> +		write_name_quoted(path, stdout, eol);
> +}

Nice. Doubly nice that you do not hardcode "color" at this point
but made it into a separate variable.

> +static void handle_repo(struct strbuf *path, const char **argv)
> +{
> +	const char *gitdir;
> +	int len;
> +
> +	len = path->len;
> +	strbuf_addstr(path, ".git");
> +	gitdir = resolve_gitdir(path->buf);
> +	strbuf_setlen(path, len - 1);
> +	if (!gitdir)
> +		goto done;
> +	if (match != ALL && match != get_repo_state(path->buf))
> +		goto done;
> +	print_repo_path(path->buf, *argv != NULL);
> +	if (*argv)
> +		run_command_v_opt_cd_env(argv, runopt, path->buf, NULL);
> +done:
> +	strbuf_addstr(path, "/");

OK, you get "$D/" from the caller, make it "$D/.git" to call
resolve_gitdir() with, turn it to "$D" before printing and running,
and then add "/" back. Slightly tricky but correct.

> +static int walk(struct strbuf *path, int argc, const char **argv)
> +{
> +	DIR *dir;
> +	struct dirent *ent;
> +	struct stat st;
> +	size_t len;
> +	int has_dotgit = 0;
> +	struct string_list list = STRING_LIST_INIT_DUP;
> +	struct string_list_item *item;
> +
> +	dir = opendir(path->buf);
> +	if (!dir)
> +		return errno;
> +	strbuf_addstr(path, "/");
> +	len = path->len;
> +	while ((ent = readdir(dir))) {
> +		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
> +			continue;
> +		if (!strcmp(ent->d_name, ".git")) {
> +			has_dotgit = 1;
> +			continue;
> +		}
> +		switch (DTYPE(ent)) {
> +		case DT_UNKNOWN:
> +		case DT_LNK:
> +			/* Use stat() to figure out if this path leads
> +			 * to a directory - it's not important if it's
> +			 * a symlink which gets us there.
> +			 */
> +			strbuf_setlen(path, len);
> +			strbuf_addstr(path, ent->d_name);
> +			if (stat(path->buf, &st) || !S_ISDIR(st.st_mode))
> +				break;
> +			/* fallthrough */
> +		case DT_DIR:
> +			string_list_append(&list, ent->d_name);
> +			break;
> +		}
> +	}
> +	closedir(dir);
> +	strbuf_setlen(path, len);
> +	if (has_dotgit)
> +		handle_repo(path, argv);
> +	sort_string_list(&list);
> +	for_each_string_list_item(item, &list) {
> +		strbuf_setlen(path, len);
> +		strbuf_addstr(path, item->string);
> +		walk(path, argc, argv);
> +	}
> +	string_list_clear(&list, 0);
> +	return 0;
> +}

Is the "collect-first-and-then-sort" done so that the repositories
are shown in a stable order regardless of the order in which
readdir() returns the entries? I am not complaining, just curious.

> diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh

This command does not look like "6 - the revision tree commands" to
me. "7 - the porcelainish commands concerning the working tree" or
"9 - the git tools" may be a better match?

> new file mode 100755 > index 0000000..af02c0c > --- /dev/null > +++ b/t/t6400-for-each-repo.sh > @@ -0,0 +1,150 @@ > +#!/bin/sh > +# > +# Copyright (c) 2013 Lars Hjemli > +# > + > +test_description='Test the git-for-each-repo command' > + > +. ./test-lib.sh > + > +qname="with\"quote" > +qqname="\"with\\\"quote\"" If Windows does not have problems with paths with dq in it, then this is fine, but I dunno. Otherwise, you may want to exclude the c-quote testing from the main part of the test, and have a single test that has prerequisite for filesystems that can do this at the end of the script. > +test_expect_success "setup" ' > + test_create_repo clean && > + (cd clean && test_commit foo1) && > + git init --separate-git-dir=.cleansub clean/gitfile && > + (cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) && > + test_create_repo dirty-idx && > + (cd dirty-idx && test_commit foo3 && git rm foo3.t) && > + test_create_repo dirty-wt && > + (cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git && Some platforms are symlink-challenged. Can we do this test without "ln -s"? SYMLINKS prereq wouldn't be very useful for the setup step, as all the remaining tests won't work without setting up the test scenario. > + test_commit foo4 && rm foo4.t) && > + test_create_repo "$qname" && > + (cd "$qname" && test_commit foo5) && > + mkdir fakedir && mkdir fakedir/.git > +' > + > +test_expect_success "without filtering, all repos are included" ' > + echo "." >expect && > + echo "clean" >>expect && > + echo "clean/gitfile" >>expect && > + echo "dirty-idx" >>expect && > + echo "dirty-wt" >>expect && > + echo "$qqname" >>expect && A single cat >expect <<-EOF . clean clean/gitfile ... $qqname EOF may be a lot easier to read (likewise for all the "expect" preparation in the rest of the script). 
> +test_expect_success "-z NUL-terminates each path" ' > + echo "(.)" >expect && > + echo "(clean)" >>expect && > + echo "(clean/gitfile)" >>expect && > + echo "(dirty-idx)" >>expect && > + echo "(dirty-wt)" >>expect && > + echo "($qname)" >>expect && > + git for-each-repo -z | xargs -0 printf "(%s)\n" >actual && This needs prereq on "xargs -0", but because we know we do not have any string with Q in it in the expected list of repositories, it may be simpler to do something like this: echo ".QcleanQclean/gitfileQ...$qname" >expect && git for-each-repo -z | tr "\0" Q >actual && test_cmp expect actual Thanks. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 19:04   ` Junio C Hamano
@ 2013-01-27 19:42     ` John Keeping
  2013-01-27 19:45       ` Junio C Hamano
  2013-01-28  7:50     ` Lars Hjemli
  1 sibling, 1 reply; 18+ messages in thread
From: John Keeping @ 2013-01-27 19:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lars Hjemli, git

On Sun, Jan 27, 2013 at 11:04:08AM -0800, Junio C Hamano wrote:
> One more thing that nobody brought up during the previous reviews is
> if we want to support subset of repositories by allowing the
> standard pathspec match mechanism.  For example,
>
>     git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
>
> might be a way to ask "please find repositories match the given
> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
> are dirty".  We would need to think about how to mark the end of the
> command though---we could borrow \; from find(1), even though find
> is not the best example of the UI design.  I.e.
>
>     git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
>
> with or without "--".

Would it be better to make this a (multi-valued) option?

    git for-each-repo -d --filter=foo/ --filter=bar/b\*z git diff --name-only

It seems a lot simpler than trying to figure out how the command is
going to handle '--' arguments.

> Oh, that reminds me of another thing.  Perhaps we would want to
> export the (relative) path to the found repository in some way to
> allow the commands to do this kind of thing in the first place?
> "submodule foreach" does this with $path, I think.

I think $path is the only variable exported by "submodule foreach" which
is applicable here, but it doesn't work on Windows, where environment
variables are case-insensitive.

Commit 64394e3 (git-submodule.sh: Don't use $path variable in
eval_gettext string) changed "submodule foreach" to use $sm_path
internally although I notice that the documentation still uses $path.

Perhaps $repo_path in this case?


John

^ permalink raw reply [flat|nested] 18+ messages in thread
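The consolidated view mentioned in the quoted review depends on the per-repo command knowing which repository it is running in. As a sketch of what such a command could do with that value, the hypothetical helper `prefix_with_repo_path` below prepends a repository path to every line on standard input. The path is passed as an argument here, which is an assumption made for illustration, since no variable name had been settled at this point in the thread.

```shell
#!/bin/sh
# Prefix each line read from stdin with "$1/".
# Hypothetical helper; $1 stands in for whatever path variable
# the command would eventually export.
prefix_with_repo_path () {
	while IFS= read -r line
	do
		printf '%s/%s\n' "$1" "$line"
	done
}

printf 'foo2.t\nbar.c\n' | prefix_with_repo_path clean/gitfile
```

Using `printf` rather than interpolating the path into a `sed` expression avoids trouble when the path contains `|`, `&` or a backslash.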
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-27 19:42     ` John Keeping
@ 2013-01-27 19:45       ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-01-27 19:45 UTC (permalink / raw)
  To: John Keeping; +Cc: Lars Hjemli, git

John Keeping <john@keeping.me.uk> writes:

> On Sun, Jan 27, 2013 at 11:04:08AM -0800, Junio C Hamano wrote:
>> One more thing that nobody brought up during the previous reviews is
>> if we want to support subset of repositories by allowing the
>> standard pathspec match mechanism.  For example,
>>
>>     git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
>>
>> might be a way to ask "please find repositories match the given
>> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
>> are dirty".  We would need to think about how to mark the end of the
>> command though---we could borrow \; from find(1), even though find
>> is not the best example of the UI design.  I.e.
>>
>>     git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
>>
>> with or without "--".
>
> Would it be better to make this a (multi-valued) option?
>
>     git for-each-repo -d --filter=foo/ --filter=bar/b\*z git diff --name-only

The standard way we have to filter based on paths is to use pathspec
parameters at the end of the command line. I see no reason to
introduce an inconsistency with an option like --filter.

>> Oh, that reminds me of another thing.  Perhaps we would want to
>> export the (relative) path to the found repository in some way to
>> allow the commands to do this kind of thing in the first place?
>> "submodule foreach" does this with $path, I think.
>
> I think $path is the only variable exported by "submodule foreach" which
> is applicable here, but it doesn't work on Windows, where environment
> variables are case-insensitive.
>
> Commit 64394e3 (git-submodule.sh: Don't use $path variable in
> eval_gettext string) changed "submodule foreach" to use $sm_path
> internally although I notice that the documentation still uses $path.
>
> Perhaps $repo_path in this case?

I do not care too deeply about the name, as long as the names used
by both mechanisms are the same.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-27 19:04 ` Junio C Hamano 2013-01-27 19:42 ` John Keeping @ 2013-01-28 7:50 ` Lars Hjemli 2013-01-28 8:10 ` Jonathan Nieder 1 sibling, 1 reply; 18+ messages in thread From: Lars Hjemli @ 2013-01-28 7:50 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Sun, Jan 27, 2013 at 8:04 PM, Junio C Hamano <gitster@pobox.com> wrote: > Lars Hjemli <hjemli@gmail.com> writes: > >> The command also honours the option '--clean' which restricts the set of >> repos to those which '--dirty' would skip, and '-x' which is used to >> execute non-git commands. > > It might make sense to internally use RUN_GIT_CMD flag when the > first word of the command line is 'git' as an optimization, but > I am not sure it is a good idea to force the end users to think > when to use -x and when not to is a good idea. > > In other words, I think > > git for-each-repo -d diff --name-only > git for-each-repo -d -x ls '*.c' > > is less nice than letting the user say > > git for-each-repo -d git diff --name-only > git for-each-repo -d ls '*.c' > The 'git-for-each-repo' command was made to allow any git command to be executed in all discovered repositories, and I've used it that way for two years (in the form of a shell-script called 'git-all'). During this time, I've occasionally thought about forking non-git commands but the itch hasn't been strong enough for me to scratch. The point I'm trying to make is that to me, this command acts as a modifier for other git commands[1]. Having the possibility to execute non-git commands would be nice, but it is not the main objective of this command. [1] The 'git -a' rewrite patch shows how I think about this command - it's just an option to the 'git' command, modifying the way any subcommand is invoked (btw: I don't expect that patch to be applied since 'git-all' was deemed too generic, so I'll just carry the patch in my own tree).
>> Finally, the command to execute within each repo is optional. If none is >> given, git-for-each-repo will just print the path to each repo found. And >> since the command supports -z, this can be used for more advanced scripting >> needs. > > It amounts to the same thing, but I would rather describe it as: > > To allow scripts to handle paths with shell-unsafe characters, > support "-z" to show paths with NUL termination. Otherwise, > such paths are shown with the usual c-quoting. > Much better, thanks. > One more thing that nobody brought up during the previous reviews is > if we want to support subset of repositories by allowing the > standard pathspec match mechanism. For example, > > git for-each-repo -d git diff --name-only -- foo/ bar/b\*z > > might be a way to ask "please find repositories match the given > pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that > are dirty". We would need to think about how to mark the end of the > command though---we could borrow \; from find(1), even though find > is not the best example of the UI design. I.e. > > git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z > > with or without "--". I don't think this would be very nice to end users, and would prefer --include and --exclude options (the latter is actually already a part of git-all, added by one of my coworkers). >> +NOTES >> +----- >> + >> +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a >> +worktree with uncommitted changes. > > Is it a definition that is different from usual? If so why does it > need to be inconsistent with the rest of the system? I just wanted to clarify what condition --dirty and --clean will check. In particular, the lack of checking for untracked files (which could be added as yet another option). >> +static void print_repo_path(const char *path, unsigned pretty) >> +{ >> + if (path[0] == '.' 
&& path[1] == '/') >> + path += 2; >> + if (pretty) >> + color_fprintf_ln(stdout, color, "[%s]", path); > > This is shown before running a command in that repository. I am of > two minds. It certainly is nice to be able to tell which repository > each block of output lines comes from, and not requiring the command > to do this themselves is a good default. However, I wonder if people > would want to do something like this: > > git for-each-repo sh -c ' > git diff --name-only | > sed -e "s|^|$path/|" > ' > > to get a consolidated view, in a way similar to how "submodule > foreach" can be used. This unconditional output will get in the way > for such a use case. I guess -q/--quiet could be useful. >> +static int walk(struct strbuf *path, int argc, const char **argv) >> +{ >> + DIR *dir; >> + struct dirent *ent; >> + struct stat st; >> + size_t len; >> + int has_dotgit = 0; >> + struct string_list list = STRING_LIST_INIT_DUP; >> + struct string_list_item *item; >> + >> + dir = opendir(path->buf); >> + if (!dir) >> + return errno; >> + strbuf_addstr(path, "/"); >> + len = path->len; >> + while ((ent = readdir(dir))) { >> + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) >> + continue; >> + if (!strcmp(ent->d_name, ".git")) { >> + has_dotgit = 1; >> + continue; >> + } >> + switch (DTYPE(ent)) { >> + case DT_UNKNOWN: >> + case DT_LNK: >> + /* Use stat() to figure out if this path leads >> + * to a directory - it's not important if it's >> + * a symlink which gets us there. 
>> + */ >> + strbuf_setlen(path, len); >> + strbuf_addstr(path, ent->d_name); >> + if (stat(path->buf, &st) || !S_ISDIR(st.st_mode)) >> + break; >> + /* fallthrough */ >> + case DT_DIR: >> + string_list_append(&list, ent->d_name); >> + break; >> + } >> + } >> + closedir(dir); >> + strbuf_setlen(path, len); >> + if (has_dotgit) >> + handle_repo(path, argv); >> + sort_string_list(&list); >> + for_each_string_list_item(item, &list) { >> + strbuf_setlen(path, len); >> + strbuf_addstr(path, item->string); >> + walk(path, argc, argv); >> + } >> + string_list_clear(&list, 0); >> + return 0; >> +} > > Is the "collect-first-and-then-sort" done so that the repositories > are shown in a stable order regardless of the order in which > readdir() returns he entries? Yes (writing the testcases demonstrated a need for predictable output). >> diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh > > This command does not look like "6 - the revision tree commands" to > me. "7 - the porcelainish commands concerning the working tree" or > "9 - the git tools" may be a better match? Ok, how about t9003? >> new file mode 100755 >> index 0000000..af02c0c >> --- /dev/null >> +++ b/t/t6400-for-each-repo.sh >> @@ -0,0 +1,150 @@ >> +#!/bin/sh >> +# >> +# Copyright (c) 2013 Lars Hjemli >> +# >> + >> +test_description='Test the git-for-each-repo command' >> + >> +. ./test-lib.sh >> + >> +qname="with\"quote" >> +qqname="\"with\\\"quote\"" > > If Windows does not have problems with paths with dq in it, then > this is fine, but I dunno. Otherwise, you may want to exclude the > c-quote testing from the main part of the test, and have a single > test that has prerequisite for filesystems that can do this at the > end of the script. I'll check my patch on msysgit before resending. 
>> +test_expect_success "setup" ' >> + test_create_repo clean && >> + (cd clean && test_commit foo1) && >> + git init --separate-git-dir=.cleansub clean/gitfile && >> + (cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) && >> + test_create_repo dirty-idx && >> + (cd dirty-idx && test_commit foo3 && git rm foo3.t) && >> + test_create_repo dirty-wt && >> + (cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git && > > Some platforms are symlink-challenged. Can we do this test without > "ln -s"? SYMLINKS prereq wouldn't be very useful for the setup > step, as all the remaining tests won't work without setting up the > test scenario. I added this test to check the DT_UNKNOWN/DT_LNK case in walk() so I'd rather not drop it, but it can be moved into a standalone, SYMLINKS-enabled testcase. Thanks for the review. -- larsh ^ permalink raw reply [flat|nested] 18+ messages in thread
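Editorial sketch: the point about collecting entries first and sorting them, so that output does not depend on readdir() order, can be approximated in shell with find and sort. This is not the patch's C implementation, and the function name list_repos is hypothetical. It differs from walk() in several ways: it sorts whole paths in byte order rather than sorting siblings during a depth-first walk (the two orders can differ when a name contains a byte that sorts below "/"), it does not follow symlinks to directories, and it breaks on paths that contain newlines.

```sh
# list_repos DIR   (hypothetical helper, not part of git)
# Print every directory at or below DIR that contains a ".git" entry
# (a directory, a gitfile or a symlink), one per line, in byte order.
list_repos() {
    # -prune stops find from descending into the .git directories.
    find "$1" -name .git -prune -print |
    # Turn "./a/.git" into "a": drop the "/.git" suffix and the "./" prefix.
    sed -e 's|/\.git$||' -e 's|^\./||' |
    # Sort in the C locale, comparable to ordering by strcmp(3).
    LC_ALL=C sort
}
```

Running list_repos . from the top of a directory tree gives the same list on every invocation, which is what makes the output usable in test expectations.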
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 7:50 ` Lars Hjemli @ 2013-01-28 8:10 ` Jonathan Nieder 2013-01-28 17:11 ` Lars Hjemli 2013-01-28 17:45 ` Junio C Hamano 0 siblings, 2 replies; 18+ messages in thread From: Jonathan Nieder @ 2013-01-28 8:10 UTC (permalink / raw) To: Lars Hjemli; +Cc: Junio C Hamano, git Hi, Lars Hjemli wrote: > [1] The 'git -a' rewrite patch shows how I think about this command - > it's just an option to the 'git' command, modifying the way any > subcommand is invoked (btw: I don't expect that patch to be applied > since 'git-all' was deemed to generic, so I'll just carry the patch in > my own tree). As one data point, 'git all' also seems too generic to me but 'git -a' doesn't. Intuition can be weird. So if I ran the world, then having commands git -a diff and git for-each-repo git diff do the same thing would be fine. Of course I don't run the world. ;-) [...] >> One more thing that nobody brought up during the previous reviews is >> if we want to support subset of repositories by allowing the >> standard pathspec match mechanism. For example, >> >> git for-each-repo -d git diff --name-only -- foo/ bar/b\*z >> >> might be a way to ask "please find repositories match the given >> pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that >> are dirty". We would need to think about how to mark the end of the >> command though---we could borrow \; from find(1), even though find >> is not the best example of the UI design. In most non-git commands, "--" represents an end-of-options marker, allowing arbitrary options afterward without having to worry about escaping minus signs. So in that spirit, if this weren't a git command, I'd expect to be able to do for-each-repo -- git diff -- '*.c' and have the second '--' passed verbatim to "git diff". Unfortunately in git (imitating commands like "grep", I suppose), "--" means "paths start here". 
That means that with the git convention, there is only one place to pass paths to a given command. Tracing backwards: it would be really nice to be able to do git for-each-repo git grep -e foo -- '*.c' or git -a grep -e foo -- '*.c' For this practical reason, it seems that paths listed after the '--' should go to the command being run. On the other hand, if I wanted to limit my for-each-repo run to repositories in two subdirectories of the cwd, I'd be tempted to try git for-each-repo git grep -e foo -- src/ doc/ And if I wanted to limit to different file types in the repositories under each directory, it would be tempting to use git for-each-repo git grep -e foo -- 'src/*.c' 'doc/*.txt' Is there a convention that would be usable today that is roughly forward-compatible with that? (To throw an example out, requiring that each pathspec passed to for-each-repo either starts with '*' or contains no wildcards.) Thanks, Jonathan ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 8:10 ` Jonathan Nieder @ 2013-01-28 17:11 ` Lars Hjemli 2013-01-28 18:35 ` Junio C Hamano 2013-01-28 17:45 ` Junio C Hamano 1 sibling, 1 reply; 18+ messages in thread From: Lars Hjemli @ 2013-01-28 17:11 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Junio C Hamano, git On Mon, Jan 28, 2013 at 9:10 AM, Jonathan Nieder <jrnieder@gmail.com> wrote: > > Lars Hjemli wrote: > >> [1] The 'git -a' rewrite patch shows how I think about this command - >> it's just an option to the 'git' command, modifying the way any >> subcommand is invoked (btw: I don't expect that patch to be applied >> since 'git-all' was deemed to generic, so I'll just carry the patch in >> my own tree). > > As one data point, 'git all' also seems too generic to me but 'git -a' > doesn't. Intuition can be weird. > > So if I ran the world, then having commands > > git -a diff > > and > > git for-each-repo git diff > > do the same thing would be fine. Of course I don't run the world. ;-) This would make me very happy. Junio? -- larsh ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 17:11 ` Lars Hjemli @ 2013-01-28 18:35 ` Junio C Hamano 0 siblings, 0 replies; 18+ messages in thread From: Junio C Hamano @ 2013-01-28 18:35 UTC (permalink / raw) To: Lars Hjemli; +Cc: Jonathan Nieder, git Lars Hjemli <hjemli@gmail.com> writes: > On Mon, Jan 28, 2013 at 9:10 AM, Jonathan Nieder <jrnieder@gmail.com> wrote: >> ... >> So if I ran the world, then having commands >> >> git -a diff >> >> and >> >> git for-each-repo git diff >> >> do the same thing would be fine. Of course I don't run the world. ;-) > > This would make me very happy. Junio? Ahh, our mails crossed (rather, I responded to the other message I saw before I saw this one). I am not completely sold on "git -a" yet, but another worry I have is which one between "submodule foreach" and "for-each-repo" should use "git -a", if we decide that it is useful to the users to add it. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 8:10 ` Jonathan Nieder 2013-01-28 17:11 ` Lars Hjemli @ 2013-01-28 17:45 ` Junio C Hamano 2013-01-28 18:35 ` Lars Hjemli 1 sibling, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2013-01-28 17:45 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Lars Hjemli, git Jonathan Nieder <jrnieder@gmail.com> writes: > Tracing backwards: it would be really nice to be able to do > > git for-each-repo git grep -e foo -- '*.c' This is a very good example that shows the command that is run in the repositories found may want pathspecs passed, but at the same time, makes me realize that these repositories have to be fairly uniform for this command to be useful. For example, 'src/*.c' or 'inc/*.h' pathspecs wouldn't be useful unless the majority, if not all, of the projects the loop finds follow that layout convention. This is not necessarily limited to pathspecs, of course. Unless they all have the 'next' branch "git for-each-repo checkout next" would not work, etc. etc. As to the pathspec limiting to affect the loop itself, not the argument given to the command that is run, I don't think it is absolutely needed; I am perfectly fine with declaring that for-each-repo goes to repositories in all subdirectories without limit, especially if doing so will make the UI issues we have to deal with simpler. As to the "option to the command, not to the subcommand, -a option", I have been assuming that it was a joke patch, but if "git -a grep" turns out to be really useful, "submodule foreach" that iterates over the submodules may also want to have such a short and sweet mechanism. Between "for-each-repo" and "submodule foreach", I do not yet have a strong opinion on which one deserves it more. Come to think of it, is there a reason why "for-each-repo" should not be an extension to "submodule foreach"?
We can view this as visiting repositories that _could_ be registered as a submodule, in addition to iterating over the registered submodules, no? If these two are unified, then we do not have to even worry about which one deserves "git -a" more. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 17:45 ` Junio C Hamano @ 2013-01-28 18:35 ` Lars Hjemli 2013-01-28 18:51 ` Junio C Hamano 0 siblings, 1 reply; 18+ messages in thread From: Lars Hjemli @ 2013-01-28 18:35 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, git On Mon, Jan 28, 2013 at 6:45 PM, Junio C Hamano <gitster@pobox.com> wrote: > As to the pathspec limiting to affect the loop itself, not the > argument given to the command that is run, I don't think it is > absolutely needed; I am perfectly fine with declaring that > for-each-repo goes to repositories in all subdirectories without > limit, especially if doing so will make the UI issues we have to > deal with simpler. Good (since the relative path of each repo will be exported to the child process, that process can perform path limiting when needed). > As to the "option to the command, not to the subcommand, -a option", > I have been assuming that it was a joke patch, but if "git -a grep" > turns out to be really useful, "submodule foreach" that iterates > over the submodules may also want to have such a short and sweet > mechanism. Between "for-each-repo" and "submodule foreach", I do > not yet have a strong opinion on which one deserves it more. > > Come to think of it, is there a reason why "for-each-repo" should > not be an extention to "submodule foreach"? We can view this as > visiting repositories that _could_ be registered as a submodule, in > addition to iterating over the registered submodules, no? 
Yes, but I see some possible problems with that approach: -'git for-each-repo' does not need to be started from within a git worktree -'git for-each-repo' and 'git submodule foreach' have different semantics for --dirty and --clean -'git for-each-repo' is in C because my 'git-all' shell script was horribly slow on large directory trees (especially on windows) All of these problems are probably solvable, but it would require quite some reworking of git-submodule.sh -- larsh ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 18:35 ` Lars Hjemli @ 2013-01-28 18:51 ` Junio C Hamano 2013-01-28 19:42 ` Lars Hjemli 2013-01-28 20:12 ` Jens Lehmann 0 siblings, 2 replies; 18+ messages in thread From: Junio C Hamano @ 2013-01-28 18:51 UTC (permalink / raw) To: Lars Hjemli; +Cc: Jonathan Nieder, git Lars Hjemli <hjemli@gmail.com> writes: >> Come to think of it, is there a reason why "for-each-repo" should >> not be an extention to "submodule foreach"? We can view this as >> visiting repositories that _could_ be registered as a submodule, in >> addition to iterating over the registered submodules, no? > > Yes, but I see some possible problems with that approach: > -'git for-each-repo' does not need to be started from within a git worktree True, but "git submodule foreach --untracked" can be told that it is OK not (yet) to be in any superproject, no? > -'git for-each-repo' and 'git submodule foreach' have different > semantics for --dirty and --clean That could be a problem. Is there a good reason why they should use different definitions of dirtyness? > -'git for-each-repo' is in C because my 'git-all' shell script was > horribly slow on large directory trees (especially on windows) Your for-each-repo could be a good basis to build a new builtin "submodule--foreach" that is a pure helper hidden from the end users that does both; cmd_foreach() in git-submodule.sh can simply delegate to it. > All of these problems are probably solvable, but it would require > quite some reworking of git-submodule.sh Of course some work is needed, but we do not have to convert all the cmd_foo in git-submodule.sh in one step. 
For the purpose of unifying for-each-repo and submodule foreach to deliver the functionality sooner to the end users, we can go the route to add only the submodule--foreach builtin, out of which we will get reusable implementation of module_list and other helper functions we can leverage later to do other cmd_foo functions. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 18:51 ` Junio C Hamano @ 2013-01-28 19:42 ` Lars Hjemli 2013-01-28 20:12 ` Jens Lehmann 1 sibling, 0 replies; 18+ messages in thread From: Lars Hjemli @ 2013-01-28 19:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, git On Mon, Jan 28, 2013 at 7:51 PM, Junio C Hamano <gitster@pobox.com> wrote: > Lars Hjemli <hjemli@gmail.com> writes: > >>> Come to think of it, is there a reason why "for-each-repo" should >>> not be an extention to "submodule foreach"? We can view this as >>> visiting repositories that _could_ be registered as a submodule, in >>> addition to iterating over the registered submodules, no? >> >> Yes, but I see some possible problems with that approach: >> -'git for-each-repo' does not need to be started from within a git worktree > > True, but "git submodule foreach --untracked" can be told that it is > OK not (yet) to be in any superproject, no? Yes. > >> -'git for-each-repo' and 'git submodule foreach' have different >> semantics for --dirty and --clean > > That could be a problem. Is there a good reason why they should use > different definitions of dirtyness? I suspected that 'submodule foreach --dirty' might want to compare the HEAD sha1 in the submodule against the one recorded in the superproject (similar to what 'git submodule status' does), but such a check could be triggered by a different flag (e.g. --behind/--ahead or something similar). >> -'git for-each-repo' is in C because my 'git-all' shell script was >> horribly slow on large directory trees (especially on windows) > > Your for-each-repo could be a good basis to build a new builtin > "submodule--foreach" that is a pure helper hidden from the end users > that does both; cmd_foreach() in git-submodule.sh can simply delegate > to it. Ok, I'll rework my patches in this direction. Thanks. -- larsh ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 18:51 ` Junio C Hamano 2013-01-28 19:42 ` Lars Hjemli @ 2013-01-28 20:12 ` Jens Lehmann 2013-01-28 20:34 ` Junio C Hamano 1 sibling, 1 reply; 18+ messages in thread From: Jens Lehmann @ 2013-01-28 20:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt Am 28.01.2013 19:51, schrieb Junio C Hamano: > Lars Hjemli <hjemli@gmail.com> writes: > >>> Come to think of it, is there a reason why "for-each-repo" should >>> not be an extention to "submodule foreach"? We can view this as >>> visiting repositories that _could_ be registered as a submodule, in >>> addition to iterating over the registered submodules, no? >> >> Yes, but I see some possible problems with that approach: >> -'git for-each-repo' does not need to be started from within a git worktree > > True, but "git submodule foreach --untracked" can be told that it is > OK not (yet) to be in any superproject, no? Hmm, I'm not sure how that would work as it looks for gitlinks in the index which point to work tree paths. >> -'git for-each-repo' and 'git submodule foreach' have different >> semantics for --dirty and --clean I'm confused, what semantics of --dirty and --clean does current 'git submodule foreach' have? I can't find any sign of it in the current code ... did I miss something while skimming through this thread? Or are you talking about status and diff here? > That could be a problem. Is there a good reason why they should use > different definitions of dirtyness? I don't see any (except of course for comparing a gitlink with the HEAD of the submodule, which is an additional condition that only applies to submodules). But I think the current for-each-repo proposal doesn't allow to traverse repos which contain untracked content (and it would be nice if the user could somehow combine that with the current --dirty flag to have both in one go). 
>> -'git for-each-repo' is in C because my 'git-all' shell script was >> horribly slow on large directory trees (especially on windows) > > Your for-each-repo could be a good basis to build a new builtin > "submodule--foreach" that is a pure helper hidden from the end users > that does both; cmd_foreach() in git-submodule.sh can simply delegate > to it. I like that approach, because the operations are very similar from the user's point of view. But please remember that internally they would work differently, as submodule foreach walks the index and only descends into those submodules that are populated (and contain a .git directory or file) while for-each-repo scans the whole work tree, which makes it a more expensive operation. >> All of these problems are probably solvable, but it would require >> quite some reworking of git-submodule.sh > > Of course some work is needed, but we do not have to convert all the > cmd_foo in git-submodule.sh in one step. For the purpose of > unifying for-each-repo and submodule foreach to deliver the > functionality sooner to the end users, we can go the route to add > only the submodule--foreach builtin, out of which we will get > reusable implementation of module_list and other helper functions we > can leverage later to do other cmd_foo functions. I really like that idea! ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 20:12 ` Jens Lehmann @ 2013-01-28 20:34 ` Junio C Hamano 2013-01-28 21:25 ` Jens Lehmann 0 siblings, 1 reply; 18+ messages in thread From: Junio C Hamano @ 2013-01-28 20:34 UTC (permalink / raw) To: Jens Lehmann; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt Jens Lehmann <Jens.Lehmann@web.de> writes: > Am 28.01.2013 19:51, schrieb Junio C Hamano: >> Lars Hjemli <hjemli@gmail.com> writes: >> >>>> Come to think of it, is there a reason why "for-each-repo" should >>>> not be an extention to "submodule foreach"? We can view this as >>>> visiting repositories that _could_ be registered as a submodule, in >>>> addition to iterating over the registered submodules, no? >>> >>> Yes, but I see some possible problems with that approach: >>> -'git for-each-repo' does not need to be started from within a git worktree >> >> True, but "git submodule foreach --untracked" can be told that it is >> OK not (yet) to be in any superproject, no? > > Hmm, I'm not sure how that would work as it looks for gitlinks > in the index which point to work tree paths. I was imagining that "foreach --untracked" could go something like this: * If you are inside an existing git repository, read its index to learn the gitlinks in the directory and its subdirectories. * Start from the current directory and recursively apply the procedure in this step: * Scan the directory and iterate over the ones that has ".git" in it: * If it is a gitlinked one, show it, but do not descend into it unless --recursive is given (e.g. you start from /home/jens, find /home/jens/proj/ directory that has /home/jens/proj/.git in it. /home/jens/.git/index knows that it is a submodule of the top-level superproject. "proj" is handled, and it is up to the --recursive option if its submodules are handled). * If it is _not_ a gitlinked one, show it and descend into it (e.g. 
/home/jens/ is not a repository or /home/jens/proj is not a tracked submodule) to apply this procedure recursively. Of course, without --untracked, we have no need to iterate over the readdir() return values; instead we just scan the index of the top-level superproject. >>> -'git for-each-repo' and 'git submodule foreach' have different >>> semantics for --dirty and --clean > > I'm confused, what semantics of --dirty and --clean does current > 'git submodule foreach' have? I can't find any sign of it in the > current code ... did I miss something while skimming through this > thread? Or are you talking about status and diff here? I think Lars is hinting that "submodule foreach" could restrict its operation to a similar --dirty/--clean/--both option he has. Of course, the command given to foreach can decide to become no-op by inspecting the submodule itself, so in that sense, --dirty/--clean can be done without, but I think it would make sense to have it in "submodule foreach" even without the "--untracked" option. > But I think the current for-each-repo > proposal doesn't allow to traverse repos which contain untracked > content (and it would be nice if the user could somehow combine > that with the current --dirty flag to have both in one go). Perhaps. I personally felt it was really strange that submodule diff and status consider that it is a sin to have untracked and unignored cruft in the submodule working tree, though. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations 2013-01-28 20:34 ` Junio C Hamano @ 2013-01-28 21:25 ` Jens Lehmann 2013-02-04 6:41 ` Junio C Hamano 0 siblings, 1 reply; 18+ messages in thread From: Jens Lehmann @ 2013-01-28 21:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt Am 28.01.2013 21:34, schrieb Junio C Hamano: > Jens Lehmann <Jens.Lehmann@web.de> writes: > >> Am 28.01.2013 19:51, schrieb Junio C Hamano: >>> Lars Hjemli <hjemli@gmail.com> writes: >>> >>>>> Come to think of it, is there a reason why "for-each-repo" should >>>>> not be an extention to "submodule foreach"? We can view this as >>>>> visiting repositories that _could_ be registered as a submodule, in >>>>> addition to iterating over the registered submodules, no? >>>> >>>> Yes, but I see some possible problems with that approach: >>>> -'git for-each-repo' does not need to be started from within a git worktree >>> >>> True, but "git submodule foreach --untracked" can be told that it is >>> OK not (yet) to be in any superproject, no? >> >> Hmm, I'm not sure how that would work as it looks for gitlinks >> in the index which point to work tree paths. > > I was imagining that "foreach --untracked" could go something like this: > > * If you are inside an existing git repository, read its index to > learn the gitlinks in the directory and its subdirectories. > > * Start from the current directory and recursively apply the > procedure in this step: > > * Scan the directory and iterate over the ones that has ".git" in > it: > > * If it is a gitlinked one, show it, but do not descend into it > unless --recursive is given (e.g. you start from /home/jens, > find /home/jens/proj/ directory that has /home/jens/proj/.git > in it. /home/jens/.git/index knows that it is a submodule of > the top-level superproject. "proj" is handled, and it is up > to the --recursive option if its submodules are handled). 
> > * If it is _not_ a gitlinked one, show it and descend into it > (e.g. /home/jens/ is not a repository or /home/jens/proj is > not a tracked submodule) to apply this procedure recursively. > > Of course, without --untracked, we have no need to iterate over the > readdir() return values; instead we just scan the index of the > top-level superproject. Thanks for explaining, that makes tons of sense. >>>> -'git for-each-repo' and 'git submodule foreach' have different >>>> semantics for --dirty and --clean >> >> I'm confused, what semantics of --dirty and --clean does current >> 'git submodule foreach' have? I can't find any sign of it in the >> current code ... did I miss something while skimming through this >> thread? Or are you talking about status and diff here? > > I think Lars is hinting that "submodule foreach" could restrict its > operation to a similar --dirty/--clean/--both option he has. Of > course, the command given to foreach can decide to become no-op by > inspecting the submodule itself, so in that sense, --dirty/--clean > can be done without, but I think it would make sense to have it in > "submodule foreach" even without the "--untracked" option. Nice idea. E.g. that would help submodule users to easily script a workflow which descends only into modified submodules to create branches and push them there. Or to remove branches which were created everywhere only in those submodules that weren't changed. >> But I think the current for-each-repo >> proposal doesn't allow to traverse repos which contain untracked >> content (and it would be nice if the user could somehow combine >> that with the current --dirty flag to have both in one go). > > Perhaps. I personally felt it was really strange that submodule > diff and status consider that it is a sin to have untracked and > unignored cruft in the submodule working tree, though. 

The VCS we used at work before Git didn't show us any untracked
files, which caused trouble on a regular basis as people were
breaking builds for others because they forgot to check in new
files. That didn't happen with Git anymore, which was very cool.
But the problem reappeared as we started using submodules. Since
I taught status and diff to show that, we're happy again. So for
us it was anything but strange ;-)

But for for-each-repo I would rather propose that modifications
of tracked files can optionally and/or solely be used to pick the
repos. Maybe: --dirty=modified, --dirty=untracked and --dirty=both
with --dirty defaulting to modified?
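
To make the proposed --dirty=modified, --dirty=untracked and
--dirty=both modes concrete, here is a minimal shell sketch. It is not
part of the posted patch: the function name repo_is_dirty and its
return codes are hypothetical, and "both" is assumed to mean "either
kind of dirt counts". The tracked-content check is the one given in the
commit message of patch 1/2 (`git diff --quiet && git diff --cached
--quiet`); untracked, unignored files are listed with `git ls-files
--others --exclude-standard`.

```shell
# Hypothetical helper, not part of the posted patch.
# Usage: repo_is_dirty <worktree> [modified|untracked|both]
# Returns 0 if dirty in the selected sense, 1 if clean, 2 on bad input.
repo_is_dirty () {
	repo=$1
	mode=${2:-modified}	# assumption: plain --dirty means "modified"
	test -d "$repo" || return 2
	case "$mode" in
	modified|untracked|both) ;;
	*) return 2 ;;
	esac
	if test "$mode" != untracked
	then
		# Tracked content: worktree vs. index, then index vs. HEAD.
		(cd "$repo" && git diff --quiet && git diff --cached --quiet) ||
			return 0
	fi
	if test "$mode" != modified
	then
		# Untracked and unignored files.
		others=$(cd "$repo" && git ls-files --others --exclude-standard)
		test -z "$others" || return 0
	fi
	return 1
}
```

A caller could then write something like `repo_is_dirty "$path" both &&
echo "$path"` while iterating over the repositories it has found.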
* Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations
  2013-01-28 21:25 ` Jens Lehmann
@ 2013-02-04  6:41 ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2013-02-04 6:41 UTC
To: Jens Lehmann; +Cc: Lars Hjemli, Jonathan Nieder, git, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> On 28.01.2013 21:34, Junio C Hamano wrote:
> ...
>> I was imagining that "foreach --untracked" could go something like this:
>>
>>  * If you are inside an existing git repository, read its index to
>>    learn the gitlinks in the directory and its subdirectories.
>>
>>  * Start from the current directory and recursively apply the
>>    procedure in this step:
>>
>>    * Scan the directory and iterate over the ones that have ".git" in
>>      them:
>>
>>      * If it is a gitlinked one, show it, but do not descend into it
>>        unless --recursive is given (e.g. you start from /home/jens,
>>        find /home/jens/proj/ directory that has /home/jens/proj/.git
>>        in it. /home/jens/.git/index knows that it is a submodule of
>>        the top-level superproject. "proj" is handled, and it is up
>>        to the --recursive option if its submodules are handled).
>>
>>      * If it is _not_ a gitlinked one, show it and descend into it
>>        (e.g. /home/jens/ is not a repository or /home/jens/proj is
>>        not a tracked submodule) to apply this procedure recursively.
>>
>> Of course, without --untracked, we have no need to iterate over the
>> readdir() return values; instead we just scan the index of the
>> top-level superproject.
>
> Thanks for explaining, that makes tons of sense.

There is a small thinko above, though, and I'd like to correct it
before anybody takes the above too seriously as _the_ outline of the
design and implements it to the letter.

The --recursive option should govern both a tracked submodule and an
untracked one.

When asking to list both existing submodules and directories that
could become submodules, you should be able to say

    $ git submodule foreach --untracked

to list the direct submodules and the directories with .git in them
that are not yet submodules of the top-level superproject, but the
latter is limited to those with no parent directories with .git in
them (other than the top-level of the working tree of the
superproject).

With

    $ git submodule foreach --untracked --recursive

you would see submodules and their submodules recursively, and also
directories with .git in them (i.e. candidates to become direct
submodules of the superproject) and the directories with .git in them
inside such submodule candidates (i.e. candidates to become direct
submodules of the directories that could become direct submodules of
the superproject) recursively.

If we set things up this way:

    mkdir -p a/b c/d &&
    for d in . a a/b c c/d
    do
        git init $d &&
        ( cd $d && git commit --allow-empty -m initial )
    done &&
    git add a &&
    ( cd a && git add b )

The expected results for various combinations are:

 * "git submodule foreach" would visit 'a' and nothing else;

 * "git submodule foreach --recursive" would visit 'a' and 'a/b';

 * "git submodule foreach --untracked" would visit 'a' and 'c'; and

 * "git submodule foreach --untracked --recursive" would visit all
   four.
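
The four expected results above can be reproduced with a small model of
the traversal. This is only a sketch under stated assumptions, not a
proposed implementation: it never runs git, it treats any directory
containing ".git" as a repository, and the GITLINKS variable stands in
for "paths recorded as gitlinks in some index". The names walk,
is_gitlinked, GITLINKS, UNTRACKED and RECURSIVE are all made up for
illustration.

```shell
# Model only: GITLINKS replaces reading gitlinks from a real index.
GITLINKS="a a/b"
UNTRACKED=no	# "yes" models --untracked
RECURSIVE=no	# "yes" models --recursive

is_gitlinked () {
	for link in $GITLINKS
	do
		test "$link" = "$1" && return 0
	done
	return 1
}

# walk <dir>: print the repositories that would be visited below <dir>.
walk () {
	for child in "$1"/*/
	do
		child=${child%/}
		test -d "$child" || continue	# glob matched nothing
		if ! test -e "$child/.git"
		then
			walk "$child"	# plain directory: keep looking
		elif is_gitlinked "${child#./}" || test "$UNTRACKED" = yes
		then
			echo "${child#./}"
			# --recursive governs tracked and untracked alike
			if test "$RECURSIVE" = yes
			then
				walk "$child"
			fi
		fi
	done
	return 0
}

# Demo in a scratch directory laid out like the example above.
(
	cd "$(mktemp -d)" &&
	mkdir -p a/.git a/b/.git c/.git c/d/.git &&
	UNTRACKED=yes RECURSIVE=no &&
	walk .	# prints "a" and "c"
)
```

Setting both variables to "yes" makes the same call print a, a/b, c and
c/d, and leaving both at "no" prints only a.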
* [PATCH v4 2/2] git: rewrite `git -a` to become a git-for-each-repo command
  2013-01-27 12:46 [PATCH v4 0/2] for-each-repo: new command for multi-repo operations Lars Hjemli
  2013-01-27 12:46 ` [PATCH v4 1/2] for-each-repo: new command used " Lars Hjemli
@ 2013-01-27 12:46 ` Lars Hjemli
  1 sibling, 0 replies; 18+ messages in thread
From: Lars Hjemli @ 2013-01-27 12:46 UTC
To: git; +Cc: Lars Hjemli

With this rewriting, it is now possible to run e.g. `git -ad gui` to
start up git-gui in each repo within the current directory which
contains uncommitted work.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 git.c                    | 36 +++++++++++++++++++++++++++
 t/t6400-for-each-repo.sh | 63 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/git.c b/git.c
index 6b53169..f933b5d 100644
--- a/git.c
+++ b/git.c
@@ -31,8 +31,42 @@ static void commit_pager_choice(void) {
 	}
 }
 
+/*
+ * Rewrite 'git -ad status' to 'git for-each-repo -d status'
+ */
+static int rewrite_foreach_repo(const char ***orig_argv,
+				const char **curr_argv,
+				int *curr_argc)
+{
+	const char **new_argv;
+	char *tmp;
+	int new_argc, curr_pos, i, j;
+
+	curr_pos = curr_argv - *orig_argv;
+	if (strlen(curr_argv[0]) == 2) {
+		curr_argv[0] = "for-each-repo";
+		return curr_pos - 1;
+	}
+
+	new_argc = curr_pos + *curr_argc + 1;
+	new_argv = xmalloc(new_argc * sizeof(void *));
+	for (i = j = 0; j < new_argc; i++, j++) {
+		if (i == curr_pos) {
+			asprintf(&tmp, "-%s", (*orig_argv)[i] + 2);
+			new_argv[j] = "for-each-repo";
+			new_argv[++j] = tmp;
+		} else {
+			new_argv[j] = (*orig_argv)[i];
+		}
+	}
+	*orig_argv = new_argv;
+	(*curr_argc)++;
+	return curr_pos;
+}
+
 static int handle_options(const char ***argv, int *argc, int *envchanged)
 {
+	const char ***pargv = argv;
 	const char **orig_argv = *argv;
 
 	while (*argc > 0) {
@@ -143,6 +177,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "0", 1);
 			if (envchanged)
 				*envchanged = 1;
+		} else if (!strncmp(cmd, "-a", 2)) {
+			return rewrite_foreach_repo(pargv, *argv, argc);
 		} else {
 			fprintf(stderr, "Unknown option: %s\n", cmd);
 			usage(git_usage_string);
diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh
index af02c0c..eaa4518 100755
--- a/t/t6400-for-each-repo.sh
+++ b/t/t6400-for-each-repo.sh
@@ -147,4 +147,67 @@ test_expect_success "-dx executes any command in dirty repos" '
 	test_cmp expect actual
 '
 
+test_expect_success "rewrite 'git -a'" '
+	echo "." >expect &&
+	echo "clean" >>expect &&
+	echo "clean/gitfile" >>expect &&
+	echo "dirty-idx" >>expect &&
+	echo "dirty-wt" >>expect &&
+	echo "$qqname" >>expect &&
+	git -a >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -az'" '
+	echo "(.)" >expect &&
+	echo "(clean)" >>expect &&
+	echo "(clean/gitfile)" >>expect &&
+	echo "(dirty-idx)" >>expect &&
+	echo "(dirty-wt)" >>expect &&
+	echo "($qname)" >>expect &&
+	git -az | xargs -0 printf "(%s)\n" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -ad'" '
+	echo "clean/gitfile" >expect &&
+	echo "dirty-idx" >>expect &&
+	echo "dirty-wt" >>expect &&
+	git -ad >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -ac'" '
+	echo "." >expect &&
+	echo "clean" >>expect &&
+	echo "$qqname" >>expect &&
+	git -ac >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -a status -suno'" '
+	echo "[.]" >expect &&
+	echo "[clean]" >>expect &&
+	echo "[clean/gitfile]" >>expect &&
+	echo " M foo2.t" >>expect &&
+	echo "[dirty-idx]" >>expect &&
+	echo "D  foo3.t" >>expect &&
+	echo "[dirty-wt]" >>expect &&
+	echo " D foo4.t" >>expect &&
+	echo "[$qname]" >>expect &&
+	git -a status -suno >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success "rewrite 'git -acx pwd'" '
+	echo "[.]" >expect &&
+	echo "$HOME" >>expect &&
+	echo "[clean]" >>expect &&
+	echo "$HOME/clean" >>expect &&
+	echo "[$qname]" >>expect &&
+	echo "$HOME/$qname" >>expect &&
+	git -acx pwd >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.1.1.349.g4cdd23e
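
For readers who want to see the effect of the argv rewriting without
reading the C, the mapping done by rewrite_foreach_repo() can be
sketched in shell. This is an illustration only and does not replace
the patch: the function name rewrite_args is hypothetical, and the
sketch only looks at the first argument, whereas the patch handles the
option at whatever position handle_options() has reached.

```shell
# Hypothetical illustration of the `git -a...` rewriting.
# Prints the rewritten argument list, one argument per line.
rewrite_args () {
	case "$1" in
	-a)
		# plain "-a": replace it with the command name
		shift
		set -- for-each-repo "$@"
		;;
	-a?*)
		# bundled options, e.g. "-ad": keep the rest as "-d"
		rest=${1#-a}
		shift
		set -- for-each-repo "-$rest" "$@"
		;;
	esac
	test $# -eq 0 || printf '%s\n' "$@"
}

rewrite_args -ad status
```

The call above prints for-each-repo, -d and status on separate lines,
which matches the 'git -ad status' to 'git for-each-repo -d status'
rewrite described in the comment in git.c.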