* [PATCH v2] contrib: added git-diffall @ 2012-02-22 22:12 Tim Henigan 2012-02-22 23:48 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Tim Henigan @ 2012-02-22 22:12 UTC (permalink / raw) To: git, gitster; +Cc: davvid, stefano.lattarini, tim.henigan This script adds directory diff support to git. It launches a single instance of the user-configured external diff tool and performs a directory diff between the specified revisions. The before/after files are copied to a tmp directory to do this. Either 'diff.tool' or 'merge.tool' must be set before running the script. The existing 'git difftool' command already allows the user to view diffs using an external tool. However if multiple files have changed, a separate instance of the tool is launched for each one. This can be tedious when many files are involved. Signed-off-by: Tim Henigan <tim.henigan@gmail.com> --- This script has been hosted on GitHub [1] since April 2010. Enough people have found it useful that I hope it will be considered for inclusion in the standard git install, either in contrib or as a new core command. Changes since v1: - Changed to #!/bin/sh - Eliminated use of 'which' statements - Fixed trap function to actually run on abnormal exit - Simplified path concatenation logic ($IFS) - Corrected indentation errors - Improved readability of while loop - Cleaned up quoting of variables This matches commit 5d4b90de3 on GitHub [1]. [1]: https://github.com/thenigan/git-diffall contrib/diffall/README | 18 +++ contrib/diffall/git-diffall | 255 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 273 insertions(+) create mode 100644 contrib/diffall/README create mode 100755 contrib/diffall/git-diffall diff --git a/contrib/diffall/README b/contrib/diffall/README new file mode 100644 index 0000000..12881d2 --- /dev/null +++ b/contrib/diffall/README @@ -0,0 +1,18 @@ +The git-diffall script provides a directory based diff mechanism +for git. The script relies on the diff.tool configuration option +to determine what diff viewer is used. + +This script is compatible with all the forms used to specify a +range of revisions to diff: + + 1. git diffall: shows diff between working tree and staged changes + 2. git diffall --cached [<commit>]: shows diff between staged changes and HEAD (or other named commit) + 3. git diffall <commit>: shows diff between working tree and named commit + 4. git diffall <commit> <commit>: show diff between two named commits + 5. git diffall <commit>..<commit>: same as above + 6. git diffall <commit>...<commit>: show the changes on the branch containing and up to the second , starting at a common ancestor of both <commit> + +Note: all forms take an optional path limiter [--] [<path>*] + +This script is based on an example provided by Thomas Rast on the Git list [1]: +[1] http://thread.gmane.org/gmane.comp.version-control.git/124807 diff --git a/contrib/diffall/git-diffall b/contrib/diffall/git-diffall new file mode 100755 index 0000000..d942612 --- /dev/null +++ b/contrib/diffall/git-diffall @@ -0,0 +1,255 @@ +#!/bin/sh +# Copyright 2010 - 2012, Tim Henigan <tim.henigan@gmail.com> +# +# Perform a directory diff between commits in the repository using +# the external diff or merge tool specified in the user's config. + +USAGE='[--cached] [--copy-back] [-x|--extcmd=<command>] <commit>{0,2} -- <path>* + + --cached Compare to the index rather than the working tree. + + --copy-back Copy files back to the working tree when the diff + tool exits (in case they were modified by the + user). This option is only valid if the diff + compared with the working tree. + + -x=<command> + --extcmd=<command> Specify a custom command for viewing diffs. + git-diffall ignores the configured defaults and + runs $command $LOCAL $REMOTE when this option is + specified. Additionally, $BASE is set in the + environment. +' + +SUBDIRECTORY_OK=1 +. "$(git --exec-path)/git-sh-setup" + +TOOL_MODE=diff +. "$(git --exec-path)/git-mergetool--lib" + +merge_tool="$(get_merge_tool)" +if test -z "$merge_tool" +then + echo "Error: Either the 'diff.tool' or 'merge.tool' option must be set." + usage +fi + +start_dir=$(pwd) + +# needed to access tar utility +cdup=$(git rev-parse --show-cdup) && +cd "$cdup" || { + echo >&2 "Cannot chdir to $cdup, the toplevel of the working tree" + exit 1 +} + +# mktemp is not available on all platforms (missing from msysgit) +# Use a hard-coded tmp dir if it is not available +tmp="$(mktemp -d -t tmp.XXXXXX 2>/dev/null)" || { + tmp=/tmp/git-diffall-tmp +} + +trap 'rm -rf "$tmp" 2>/dev/null' EXIT +mkdir -p "$tmp" + +left= +right= +paths= +path_sep= +compare_staged= +common_ancestor= +left_dir= +right_dir= +diff_tool= +copy_back= + +while test $# != 0 +do + case "$1" in + -h|--h|--he|--hel|--help) + usage + ;; + --cached) + compare_staged=1 + ;; + --copy-back) + copy_back=1 + ;; + -x|--e|--ex|--ext|--extc|--extcm|--extcmd) + diff_tool=$2 + shift + ;; + --) + path_sep=1 + ;; + -*) + echo Invalid option: "$1" + usage + ;; + *) + # could be commit, commit range or path limiter + case "$1" in + *...*) + left=${1%...*} + right=${1#*...} + common_ancestor=1 + ;; + *..*) + left=${1%..*} + right=${1#*..} + ;; + *) + if test -n "$path_sep" + then + paths="$paths$1 " + elif test -z "$left" + then + left=$1 + elif test -z "$right" + then + right=$1 + else + paths="$paths$1 " + fi + ;; + esac + ;; + esac + shift +done + +# Determine the set of files which changed +if test -n "$left" && test -n "$right" +then + left_dir="cmt-$(git rev-parse --short $left)" + right_dir="cmt-$(git rev-parse --short $right)" + + if test -n "$compare_staged" + then + usage + elif test -n "$common_ancestor" + then + git diff --name-only "$left"..."$right" -- $paths > "$tmp/filelist" + else + git diff --name-only "$left" "$right" -- $paths > "$tmp/filelist" + fi +elif test -n "$left" +then + left_dir="cmt-$(git rev-parse --short $left)" + + if test -n "$compare_staged" + then + right_dir="staged" + git diff --name-only --cached "$left" -- $paths > "$tmp/filelist" + else + right_dir="working_tree" + git diff --name-only "$left" -- $paths > "$tmp/filelist" + fi +else + left_dir="HEAD" + + if test -n "$compare_staged" + then + right_dir="staged" + git diff --name-only --cached -- $paths > "$tmp/filelist" + else + right_dir="working_tree" + git diff --name-only -- $paths > "$tmp/filelist" + fi +fi + +# Exit immediately if there are no diffs +if test ! -s "$tmp/filelist" +then + exit 0 +fi + +if test -n "$copy_back" && test "$right_dir" != "working_tree" +then + echo "--copy-back is only valid when diff includes the working tree." + exit 1 +fi + +# Create the named tmp directories that will hold the files to be compared +mkdir -p "$tmp/$left_dir" "$tmp/$right_dir" + +# Populate the tmp/right_dir directory with the files to be compared +if test -n "$right" +then + while read name + do + ls_list=$(git ls-tree $right $name) + if test -n "$ls_list" + then + mkdir -p "$tmp/$right_dir/$(dirname "$name")" + git show "$right":"$name" > "$tmp/$right_dir/$name" || true + fi + done < "$tmp/filelist" +elif test -n "$compare_staged" +then + while read name + do + ls_list=$(git ls-files -- $name) + if test -n "$ls_list" + then + mkdir -p "$tmp/$right_dir/$(dirname "$name")" + git show :"$name" > "$tmp/$right_dir/$name" + fi + done < "$tmp/filelist" +else + # Mac users have gnutar rather than tar + (tar --ignore-failed-read -c -T "$tmp/filelist" | (cd "$tmp/$right_dir" && tar -x)) || { + gnutar --ignore-failed-read -c -T "$tmp/filelist" | (cd "$tmp/$right_dir" && gnutar -x) + } +fi + +# Populate the tmp/left_dir directory with the files to be compared +while read name +do + if test -n "$left" + then + ls_list=$(git ls-tree $left $name) + if test -n "$ls_list" + then + mkdir -p "$tmp/$left_dir/$(dirname "$name")" + git show "$left":"$name" > "$tmp/$left_dir/$name" || true + fi + else + if test -n "$compare_staged" + then + ls_list=$(git ls-tree HEAD $name) + if test -n "$ls_list" + then + mkdir -p "$tmp/$left_dir/$(dirname "$name")" + git show HEAD:"$name" > "$tmp/$left_dir/$name" + fi + else + mkdir -p "$tmp/$left_dir/$(dirname "$name")" + git show :"$name" > "$tmp/$left_dir/$name" + fi + fi +done < "$tmp/filelist" + +cd "$tmp" +LOCAL="$left_dir" +REMOTE="$right_dir" + +if test -n "$diff_tool" +then + export BASE + eval $diff_tool '"$LOCAL"' '"$REMOTE"' +else + run_merge_tool "$merge_tool" false +fi + +# Copy files back to the working dir, if requested +if test -n "$copy_back" && test "$right_dir" = "working_tree" +then + cd "$start_dir" + git_top_dir=$(git rev-parse --show-toplevel) + find "$tmp/$right_dir" -type f | + while read file + do + cp "$file" "$git_top_dir/${file#$tmp/$right_dir/}" + done +fi -- 1.7.9.GIT ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-02-22 22:12 [PATCH v2] contrib: added git-diffall Tim Henigan @ 2012-02-22 23:48 ` Junio C Hamano 2012-02-23 9:56 ` Stefano Lattarini 2012-02-23 16:07 ` Tim Henigan 0 siblings, 2 replies; 8+ messages in thread From: Junio C Hamano @ 2012-02-22 23:48 UTC (permalink / raw) To: Tim Henigan; +Cc: git, davvid, stefano.lattarini Tim Henigan <tim.henigan@gmail.com> writes: > This script adds directory diff support to git. It launches a single > instance of the user-configured external diff tool and performs a > directory diff between the specified revisions. The before/after files > are copied to a tmp directory to do this. Either 'diff.tool' or > 'merge.tool' must be set before running the script. > > The existing 'git difftool' command already allows the user to view diffs > using an external tool. However if multiple files have changed, a > separate instance of the tool is launched for each one. This can be > tedious when many files are involved. We encourage our log messages to describe the problem first and then present solution to the problem, so I would update the above perhaps like this: The 'git difftool' command lets the user to use an external tool to view diffs, but it runs the tool for one file at the time. This makes it tedious to review a change that spans multiple files. The "git-diffall" script instead prepares temporary directories with preimage and postimage files, and launches a single instance of an external diff tool to view the differences in them. diff.tool or merge.tool configuration variable is used to specify what external tool is used. I am wondering if reusing "diff.tool" or "merge.tool" is a good idea, though. I guess that it is OK to assume that any external tool that can compare two directories MUST be able to compare two individual files, and if that is true, it is perfectly fine to reuse the configuration. But if an external tool "frobdiff" that can compare two directories cannot compare two individual files, it will make it impossible for the user to run "git difftool" if diff.tool is set to "frobdiff" to use with "diffall". Another thing that comes to my mind is if a user has an external tool that can use "diffall", is there ever a situation where the user chooses to use "difftool" instead, to go files one by one. I cannot offhand imagine any. Perhaps a longer term integration plan may be to lift the logic from this script and make it part of "difftool", and then add a boolean variable "difftool.<tool>.canCompareDirectory", without adding "git diffall" as a subcommand. The user can still run "git difftool", and when the external tool can compare two directories, the code to populate temporary directory (and to set the trap to clean after itself) taken from this tool will run inside "git difftool" frontend and then the external tool to compare the two directories is spawned. I also think that in a yet longer term, if this mode of "instantiate two directories to be compared, and let the external tool do the comparison" proves useful, almost all the "interesting" work done in this script should be made unnecessary by adding an updated "external diff interface" on the core side, so that nobody has to hurt his brain to implement an error-prone command line parsing logic. In other words, this statement cannot stay true: > +This script is compatible with all the forms used to specify a > +range of revisions to diff: without updating the script every time underlying "git diff" gains new way of comparing things. If we move the "prepare two directories, and point an external tool at them" logic to the core, we do not have to worry about it at all. Besides, I do not think the script covers all the forms; "git diff -R" support is totally missing. But that is all two steps in the future. > +# mktemp is not available on all platforms (missing from msysgit) > +# Use a hard-coded tmp dir if it is not available > +tmp="$(mktemp -d -t tmp.XXXXXX 2>/dev/null)" || { > + tmp=/tmp/git-diffall-tmp > +} It would not withstand malicious attacks, but doing tmp=/tmp/git-diffall-tmp.$$ would at least protect you from accidental name crashes better in the fallback codepath. > + > +trap 'rm -rf "$tmp" 2>/dev/null' EXIT Do you need to suppress errors, especially when you are running "rm -rf" with the 'f' option? > +mkdir -p "$tmp" > + > +left= > +right= > +paths= > +path_sep= > +compare_staged= > +common_ancestor= > +left_dir= > +right_dir= > +diff_tool= > +copy_back= You can write multiple assignment on a line to save vertical space if you want to, and the initialization sequence like this is a good place to do so. > +while test $# != 0 > +do > + case "$1" in > + -h|--h|--he|--hel|--help) > + usage > + ;; > + --cached) > + compare_staged=1 > + ;; > + --copy-back) > + copy_back=1 > + ;; > + -x|--e|--ex|--ext|--extc|--extcm|--extcmd) > + diff_tool=$2 > + shift > + ;; What if your command line ends with -x without $2? Don't you want to match "-t/--tool" that "difftool" already uses? > + --) > + path_sep=1 > + ;; > + -*) > + echo Invalid option: "$1" > + usage > + ;; > + *) > + # could be commit, commit range or path limiter > + case "$1" in > + *...*) > + left=${1%...*} > + right=${1#*...} > + common_ancestor=1 > + ;; Strictly speaking, that is not just a common_ancestor but is a merge_base, which is a common ancestor none of whose children is a common ancestor. > + *..*) > + left=${1%..*} > + right=${1#*..} > + ;; > + *) > + if test -n "$path_sep" > + then > + paths="$paths$1 " > + elif test -z "$left" > + then > + left=$1 > + elif test -z "$right" > + then > + right=$1 > + else > + paths="$paths$1 " > + fi > + ;; > + esac Hrm, so "diffall HEAD~2 Documentation/" is not the way to compare the contents of the Documentation/ directory between the named commit and the working tree, like "diff HEAD~2 Documentation/" does. That is not a show-stopper (a double-dash is an easy workaround), but it is worth pointing out. > + ;; > + esac > + shift > +done > + > +# Determine the set of files which changed > +if test -n "$left" && test -n "$right" > +then > + left_dir="cmt-$(git rev-parse --short $left)" > + right_dir="cmt-$(git rev-parse --short $right)" > + > + if test -n "$compare_staged" > + then > + usage > + elif test -n "$common_ancestor" > + then > + git diff --name-only "$left"..."$right" -- $paths > "$tmp/filelist" > + else > + git diff --name-only "$left" "$right" -- $paths > "$tmp/filelist" > + fi And this will not work with pathspec that have $IFS characters. If we really wanted to we could do that by properly quoting "$1" when you build $paths and then use eval when you run "git diff" here (look for 'sq' and 'eval' in existing scripts, e.g. "git-am.sh", if you are interested). Also you may want to write filelist using -z format to protect yourself from paths that contain LF, but that would require the loop "while read name" to be rewritten. > +# Exit immediately if there are no diffs > +if test ! -s "$tmp/filelist" > +then > + exit 0 > +fi Ok, you have trap set already so $tmp will disappear with this exit ;-) > +if test -n "$copy_back" && test "$right_dir" != "working_tree" > +then > + echo "--copy-back is only valid when diff includes the working tree." > + exit 1 > +fi I actually wondered why $right_dir needs to be populated with a copy in the first place (if you do not copy but give the working tree itself to the external tool, you do not even have to copy back). I know the answer to the question, namely, "because the external tool thinks files that are not in $left_dir are added files", but if you can find a way to tell the external tool to ignore new files (similar to how "diff -r" without -N works), running the tool with temporary left_dir and the true workdir as right_dir would be a lot cleaner solution to the problem. > +# Create the named tmp directories that will hold the files to be compared > +mkdir -p "$tmp/$left_dir" "$tmp/$right_dir" > + > +# Populate the tmp/right_dir directory with the files to be compared > +if test -n "$right" > +then > + while read name > + do > + ls_list=$(git ls-tree $right $name) > + if test -n "$ls_list" > + then > + mkdir -p "$tmp/$right_dir/$(dirname "$name")" > + git show "$right":"$name" > "$tmp/$right_dir/$name" || true > + fi > + done < "$tmp/filelist" "while read -r name" might make this slightly more robust; even though this loses leading and trailing whitespaces in filenames, we probably can get away without worrying about them. > +else > + # Mac users have gnutar rather than tar > + (tar --ignore-failed-read -c -T "$tmp/filelist" | (cd "$tmp/$right_dir" && tar -x)) || { > + gnutar --ignore-failed-read -c -T "$tmp/filelist" | (cd "$tmp/$right_dir" && gnutar -x) > + } What is this "--ignore-failed-read" about? Not reporting unreadable as an error smells really bad. If you require GNUism in your tar usage, this should be made configurable so that people can use alternative names (some systems come with "tar" that is POSIX and "gtar" that is GNU). > +cd "$tmp" > +LOCAL="$left_dir" > +REMOTE="$right_dir" > + > +if test -n "$diff_tool" > +then > + export BASE > + eval $diff_tool '"$LOCAL"' '"$REMOTE"' > +else > + run_merge_tool "$merge_tool" false > +fi > + > +# Copy files back to the working dir, if requested > +if test -n "$copy_back" && test "$right_dir" = "working_tree" > +then > + cd "$start_dir" > + git_top_dir=$(git rev-parse --show-toplevel) > + find "$tmp/$right_dir" -type f | > + while read file > + do > + cp "$file" "$git_top_dir/${file#$tmp/$right_dir/}" > + done > +fi This will copy new files created in $right_dir. Is that intended? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-02-22 23:48 ` Junio C Hamano @ 2012-02-23 9:56 ` Stefano Lattarini 2012-02-23 17:37 ` Junio C Hamano 2012-02-23 16:07 ` Tim Henigan 1 sibling, 1 reply; 8+ messages in thread From: Stefano Lattarini @ 2012-02-23 9:56 UTC (permalink / raw) To: Junio C Hamano; +Cc: Tim Henigan, git, davvid Hello everybody. Hope you don't mind 2 cents from an outsider ... On 02/23/2012 12:48 AM, Junio C Hamano wrote: > > Tim Henigan <tim.henigan@gmail.com> writes: > >> +# mktemp is not available on all platforms (missing from msysgit) >> +# Use a hard-coded tmp dir if it is not available >> +tmp="$(mktemp -d -t tmp.XXXXXX 2>/dev/null)" || { >> + tmp=/tmp/git-diffall-tmp >> +} > > It would not withstand malicious attacks, but doing > > tmp=/tmp/git-diffall-tmp.$$ > > would at least protect you from accidental name crashes better in the > fallback codepath. > Maybe this would be enough to withstand malicious attacks (even if not denial-of-service attacks): # mktemp is not available on all platforms (missing from msysgit) tmp=$(mktemp -d -t tmp.XXXXXX 2>/dev/null) || { tmp=/tmp/git-diffall-tmp.$$ mkdir "$tmp" || fatal "couldn't create temporary directory" } > >> +mkdir -p "$tmp" > At which point this should be removed, of course. Regards, Stefano ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-02-23 9:56 ` Stefano Lattarini @ 2012-02-23 17:37 ` Junio C Hamano 0 siblings, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2012-02-23 17:37 UTC (permalink / raw) To: Stefano Lattarini; +Cc: Tim Henigan, git, davvid Stefano Lattarini <stefano.lattarini@gmail.com> writes: >>> +# mktemp is not available on all platforms (missing from msysgit) >>> +# Use a hard-coded tmp dir if it is not available >>> +tmp="$(mktemp -d -t tmp.XXXXXX 2>/dev/null)" || { >>> + tmp=/tmp/git-diffall-tmp >>> +} >> ... > # mktemp is not available on all platforms (missing from msysgit) > tmp=$(mktemp -d -t tmp.XXXXXX 2>/dev/null) || { > tmp=/tmp/git-diffall-tmp.$$ > mkdir "$tmp" || fatal "couldn't create temporary directory" > } > >>> +mkdir -p "$tmp" >> > At which point this should be removed, of course. Good eyes; thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-02-22 23:48 ` Junio C Hamano 2012-02-23 9:56 ` Stefano Lattarini @ 2012-02-23 16:07 ` Tim Henigan 2012-02-23 19:02 ` Junio C Hamano 2012-04-10 23:06 ` Matt McClure 1 sibling, 2 replies; 8+ messages in thread From: Tim Henigan @ 2012-02-23 16:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, davvid, stefano.lattarini On Wed, Feb 22, 2012 at 6:48 PM, Junio C Hamano <gitster@pobox.com> wrote: > Tim Henigan <tim.henigan@gmail.com> writes: > > We encourage our log messages to describe the problem first and then > present solution to the problem, so I would update the above perhaps like > this: > > The 'git difftool' command lets the user to use an external tool > to view diffs, but it runs the tool for one file at the time. This > makes it tedious to review a change that spans multiple files. > > The "git-diffall" script instead prepares temporary directories > with preimage and postimage files, and launches a single instance > of an external diff tool to view the differences in them. > diff.tool or merge.tool configuration variable is used to specify > what external tool is used. Understood. I will update in v3. > I am wondering if reusing "diff.tool" or "merge.tool" is a good idea, > though. > > I guess that it is OK to assume that any external tool that can compare > two directories MUST be able to compare two individual files, and if that > is true, it is perfectly fine to reuse the configuration. But if an > external tool "frobdiff" that can compare two directories cannot compare > two individual files, it will make it impossible for the user to run "git > difftool" if diff.tool is set to "frobdiff" to use with "diffall". > > Another thing that comes to my mind is if a user has an external tool that > can use "diffall", is there ever a situation where the user chooses to use > "difftool" instead, to go files one by one. I cannot offhand imagine any. It was my assumption that any tool that supports directory diff also supports individual file diff. It seems like a strict subset of directory diff case. > Perhaps a longer term integration plan may be to lift the logic from this > ...snip... > > But that is all two steps in the future. I hope that this feature finds it way into the existing core commands. This script is intended to be a conversation starter that is also immediately useful as a separate command. Would it be better to begin the long-term discussion now and skip adding this to contrib? >> +# mktemp is not available on all platforms (missing from msysgit) >> +# Use a hard-coded tmp dir if it is not available >> +tmp="$(mktemp -d -t tmp.XXXXXX 2>/dev/null)" || { >> + tmp=/tmp/git-diffall-tmp >> +} > > It would not withstand malicious attacks, but doing > > tmp=/tmp/git-diffall-tmp.$$ > > would at least protect you from accidental name crashes better in the > fallback codepath. This makes sense. I will add a unique portion to the name of the tmp dir in v3. >> +trap 'rm -rf "$tmp" 2>/dev/null' EXIT > > Do you need to suppress errors, especially when you are running "rm -rf" > with the 'f' option? On msysgit, I found that "rm -rf $tmp" consistently fails due with a permission error. I don't understand why the script that created the tmp dir is not allowed to delete it. I am still looking into it, but so far it appears to be an idiosyncrasy of msysgit. >> +left= >> +right= >> +paths= >> +path_sep= >> +compare_staged= >> +common_ancestor= >> +left_dir= >> +right_dir= >> +diff_tool= >> +copy_back= > > You can write multiple assignment on a line to save vertical space if you > want to, and the initialization sequence like this is a good place to do > so. My personal preference is to keep them on separate lines. However if the compressed style is preferred, I will change it. >> + -x|--e|--ex|--ext|--extc|--extcm|--extcmd) >> + diff_tool=$2 >> + shift >> + ;; > > What if your command line ends with -x without $2? Currently it results in a shift error with no useful message to the user. I will add something for this in v3. > Don't you want to match "-t/--tool" that "difftool" already uses? Are you suggesting that I a) change "-x/--extcmd" to "-t/--tool" or that b) I add support for the "difftool -t/--tool" option? If "a", I was reusing the "difftool --extcmd" option which has the same behavior. If "b", I will look into it. >> + # could be commit, commit range or path limiter >> + case "$1" in >> + *...*) >> + left=${1%...*} >> + right=${1#*...} >> + common_ancestor=1 >> + ;; > > Strictly speaking, that is not just a common_ancestor but is a merge_base, > which is a common ancestor none of whose children is a common ancestor. Understood. I will change the name to "merge_base" in v3. >> + *..*) >> + left=${1%..*} >> + right=${1#*..} >> + ;; >> + *) >> + if test -n "$path_sep" >> + then >> + paths="$paths$1 " >> + elif test -z "$left" >> + then >> + left=$1 >> + elif test -z "$right" >> + then >> + right=$1 >> + else >> + paths="$paths$1 " >> + fi >> + ;; >> + esac > > Hrm, so "diffall HEAD~2 Documentation/" is not the way to compare the > contents of the Documentation/ directory between the named commit and > the working tree, like "diff HEAD~2 Documentation/" does. > > That is not a show-stopper (a double-dash is an easy workaround), but > it is worth pointing out. So I would need something to determine if a string represents a commit/tag/branch or a path?. I presume it would need to handle the corner case where a branch/tag and path have the same name. Is there anything like this in the mergetool lib scripts today? >> + then >> + git diff --name-only "$left"..."$right" -- $paths > "$tmp/filelist" >> + else >> + git diff --name-only "$left" "$right" -- $paths > "$tmp/filelist" >> + fi > > And this will not work with pathspec that have $IFS characters. If we > really wanted to we could do that by properly quoting "$1" when you build > $paths and then use eval when you run "git diff" here (look for 'sq' and > 'eval' in existing scripts, e.g. "git-am.sh", if you are interested). > > Also you may want to write filelist using -z format to protect yourself > from paths that contain LF, but that would require the loop "while read > name" to be rewritten. I just discovered that the script fails to handle files that have spaces in their name, so some further work is needed. >> +# Exit immediately if there are no diffs >> +if test ! -s "$tmp/filelist" >> +then >> + exit 0 >> +fi > > Ok, you have trap set already so $tmp will disappear with this exit ;-) I'll try not to be such a slow learner in the future...but no guarantees. >> +if test -n "$copy_back" && test "$right_dir" != "working_tree" >> +then >> + echo "--copy-back is only valid when diff includes the working tree." >> + exit 1 >> +fi > > I actually wondered why $right_dir needs to be populated with a copy in > the first place (if you do not copy but give the working tree itself to > the external tool, you do not even have to copy back). > > I know the answer to the question, namely, "because the external tool > thinks files that are not in $left_dir are added files", but if you can > find a way to tell the external tool to ignore new files (similar to how > "diff -r" without -N works), running the tool with temporary left_dir and > the true workdir as right_dir would be a lot cleaner solution to the > problem. I'll note this as "for future consideration". I spent some time trying this is the original implementation, but could not find a workable solution in the time I had available. >> + while read name >> + do >> + ls_list=$(git ls-tree $right $name) >> + if test -n "$ls_list" >> + then >> + mkdir -p "$tmp/$right_dir/$(dirname "$name")" >> + git show "$right":"$name" > "$tmp/$right_dir/$name" || true >> + fi >> + done < "$tmp/filelist" > > "while read -r name" might make this slightly more robust; even though > this loses leading and trailing whitespaces in filenames, we probably > can get away without worrying about them. > >> +else >> + # Mac users have gnutar rather than tar >> + (tar --ignore-failed-read -c -T "$tmp/filelist" | (cd "$tmp/$right_dir" && tar -x)) || { >> + gnutar --ignore-failed-read -c -T "$tmp/filelist" | (cd "$tmp/$right_dir" && gnutar -x) >> + } > > What is this "--ignore-failed-read" about? Not reporting unreadable as an > error smells really bad. If a file was added or deleted between the two commits being compared, tar would fail because a file was missing from "$tmp/filelist". The "--ignore-failed-read" prevents tar from halting the script in this case. > If you require GNUism in your tar usage, this should be made configurable > so that people can use alternative names (some systems come with "tar" > that is POSIX and "gtar" that is GNU). Is there an example showing how this could be configurable? The problem is that the "--ignore-failed-read" was not supported in all flavors of tar. >> +# Copy files back to the working dir, if requested >> +if test -n "$copy_back" && test "$right_dir" = "working_tree" >> +then >> + cd "$start_dir" >> + git_top_dir=$(git rev-parse --show-toplevel) >> + find "$tmp/$right_dir" -type f | >> + while read file >> + do >> + cp "$file" "$git_top_dir/${file#$tmp/$right_dir/}" >> + done >> +fi > > This will copy new files created in $right_dir. Is that intended? hmmm...that was not intended. If would be odd for the user to create new files in this tmp directory, but if the diff tool automatically generates any files then this could result in unwanted files. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-02-23 16:07 ` Tim Henigan @ 2012-02-23 19:02 ` Junio C Hamano 2012-04-10 23:06 ` Matt McClure 1 sibling, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2012-02-23 19:02 UTC (permalink / raw) To: Tim Henigan; +Cc: git, davvid, stefano.lattarini Tim Henigan <tim.henigan@gmail.com> writes: > On Wed, Feb 22, 2012 at 6:48 PM, Junio C Hamano <gitster@pobox.com> wrote: > >> Another thing that comes to my mind is if a user has an external tool that >> can use "diffall", is there ever a situation where the user chooses to use >> "difftool" instead, to go files one by one. I cannot offhand imagine any. > > It was my assumption that any tool that supports directory diff also > supports individual file diff. It seems like a strict subset of > directory diff case. > ... >> Perhaps a longer term integration plan may be to lift the logic from this >> > ...snip... >> >> But that is all two steps in the future. > > I hope that this feature finds it way into the existing core commands. > This script is intended to be a conversation starter that is also > immediately useful as a separate command. Would it be better to begin > the long-term discussion now and skip adding this to contrib? I would envision we have this in contrib/ first, without even fixing the whitespace-in-pathspec and whitespace-or-lf-in-paths issues I pointed out in my review, and let people play with it. My crystal ball tells an optimist in me that we will see people (you and others) try to fix issues they hit in their real life use cases, and the script will be improved while it is still in contrib/. And then somebody who has worked on difftool will step up and roll it into difftool proper, along the lines I hinted in the message you are responding to, at which point the script will be removed from contrib/. That is the first step in the future. The second step in the future may or may not come. It will involve adding an updated external diff interface on the core side and would prepare the two temporary directories before the core side calls the difftool among other things, and when that happens, we can lose most of the code in this script that the first step in the future may have ported into difftool. >> You can write multiple assignment on a line to save vertical space if you >> want to, and the initialization sequence like this is a good place to do >> so. > My personal preference is to keep them on separate lines. However if > the compressed style is preferred, I will change it. "I wouldn't bother" was what I meant by "if you want to". >>> + -x|--e|--ex|--ext|--extc|--extcm|--extcmd) > ... >> Don't you want to match "-t/--tool" that "difftool" already uses? > > Are you suggesting that I a) change "-x/--extcmd" to "-t/--tool" or > that b) I add support for the "difftool -t/--tool" option? No, I just misread the set of options "difftool" takes, without realizing that --extcmd and --tool are two different things, and it is correct to call your option "--extcmd" if it specifies what corresponds to what "difftool --extcmd" specifies. Sorry for the confusion. >> Hrm, so "diffall HEAD~2 Documentation/" is not the way to compare the >> contents of the Documentation/ directory between the named commit and >> the working tree, like "diff HEAD~2 Documentation/" does. >> >> That is not a show-stopper (a double-dash is an easy workaround), but >> it is worth pointing out. > > So I would need something to determine if a string represents a > commit/tag/branch or a path?. "It does not have to be fixed in the first version" (aka "for future consideration") was what I meant by "not a show-stopper". >> And this will not work with pathspec that have $IFS characters. If we >> really wanted to we could do that by properly quoting "$1" when you build >> $paths and then use eval when you run "git diff" here (look for 'sq' and >> 'eval' in existing scripts, e.g. "git-am.sh", if you are interested). >> >> Also you may want to write filelist using -z format to protect yourself >> from paths that contain LF, but that would require the loop "while read >> name" to be rewritten. > > I just discovered that the script fails to handle files that have > spaces in their name, so some further work is needed. Again, "It does not have to be fixed in the first version" (aka "for future consideration") was what I meant by "If we really wanted to". >> What is this "--ignore-failed-read" about? Not reporting unreadable as an >> error smells really bad. > > If a file was added or deleted between the two commits being compared, > tar would fail because a file was missing from "$tmp/filelist". The > "--ignore-failed-read" prevents tar from halting the script in this > case. But it will also ignore errors coming from other causes, no? Wouldn't we rather see an error if tar fails to read from a path that *has to* exist in the working tree because "diff" said it does? Again, it is just "for future consideration". >> If you require GNUism in your tar usage, this should be made configurable >> so that people can use alternative names (some systems come with "tar" >> that is POSIX and "gtar" that is GNU). > > Is there an example showing how this could be configurable? The > problem is that the "--ignore-failed-read" was not supported in all > flavors of tar. Grep for "$TAR" and also @@DIFF@@ in the Makefile, and add substitution for @@TAR@@ in cmd_munge_script, perhaps? By the way, I actually have an even more radical suggestion that may let you get rid of most of the lines in your script. If you tweak the usage so that "diffall" specific options *MUST* come first, e.g. USAGE=[--copy-back] [-x <cmd>] <arguments for diff> then you can parse your argument partially, i.e. copy_back= extcmd= while case "$#,$1" in 0,*) break ;; *,-*) ;; *) break ;; esac do case "$1" in --copy-back) copy_back=true ;; -x | --extcmd) test $# != 1 || usage extcmd=$2 shift ;; *) break ;; esac shift done then feed the remainder all to "diff", e.g. diff --raw --no-abbrev "$@" | And then you can prepare two temporary index files and stuff the output in them, by having something like this on the downstream of the pipe: while read -r lmode rmode lsha1 rsha1 status path do if test "$lmode" != $null_mode then GIT_INDEX_FILE=$tmp.left_index \ git update-index --add --cacheinfo $lmode $lsha1 $path fi if test "$rmode" != $null_mode then GIT_INDEX_FILE=$tmp.right_index \ git update-index --add --cacheinfo $rmode $rsha1 $path fi done Side Note: In the production version, you would probably give the "-z" option to "diff", and write this loop in Perl so that you can cope better with funny characters in the path. Instead of running two instances of "git update-index" for every path, the loop would group the entries for left and right side, and drive one instance of "git update-index --index-info" each to populate the two index files. Also the above needs to be adjusted to deal with the side that represents the working tree files; they are reported with $null_sha1 so in such a case instead of putting it in the temporary index, you would copy the working tree file to the temporary location. After you prepare these two temporary index files, you can then use them to populate your left_dir and right_dir like this: GIT_DIR=$(git rev-parse --git-dir) \ GIT_WORK_TREE=$left_dir \ GIT_INDEX_FILE=$tmp.left_index \ git checkout-index -a With this, you do not have to worry about anything about the funny combinations of where the two "directories" comes from when preparing the temporary directories to be compared. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-02-23 16:07 ` Tim Henigan 2012-02-23 19:02 ` Junio C Hamano @ 2012-04-10 23:06 ` Matt McClure 2012-04-11 8:38 ` David Aguilar 1 sibling, 1 reply; 8+ messages in thread From: Matt McClure @ 2012-04-10 23:06 UTC (permalink / raw) To: git Tim Henigan <tim.henigan <at> gmail.com> writes: > >> + do > >> + cp "$file" "$git_top_dir/${file#$tmp/$right_dir/}" > >> + done > >> +fi > > > > This will copy new files created in $right_dir. Is that intended? > > hmmm...that was not intended. If would be odd for the user to create > new files in this tmp directory, but if the diff tool automatically > generates any files then this could result in unwanted files. I think more generally, I would prefer if either side of the comparison is the working copy that the temp directory on that side be populated with symlinks. A particularly bad failure mode of the copy-back approach is: git diffall --copy-back # while my diffall tool is running, I edit the file somewhere else. # quit my diffall tool # --> my edits in the other tool are overwritten by diffall Editing the files in place via symlinks would resolve that. Matt ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] contrib: added git-diffall 2012-04-10 23:06 ` Matt McClure @ 2012-04-11 8:38 ` David Aguilar 0 siblings, 0 replies; 8+ messages in thread From: David Aguilar @ 2012-04-11 8:38 UTC (permalink / raw) To: Matt McClure; +Cc: git On Tue, Apr 10, 2012 at 4:06 PM, Matt McClure <matthewlmcclure@gmail.com> wrote: > Tim Henigan <tim.henigan <at> gmail.com> writes: > >> >> + do >> >> + cp "$file" "$git_top_dir/${file#$tmp/$right_dir/}" >> >> + done >> >> +fi >> > >> > This will copy new files created in $right_dir. Is that intended? >> >> hmmm...that was not intended. If would be odd for the user to create >> new files in this tmp directory, but if the diff tool automatically >> generates any files then this could result in unwanted files. > > I think more generally, I would prefer if either side of the comparison is the > working copy that the temp directory on that side be populated with symlinks. > > A particularly bad failure mode of the copy-back approach is: > > git diffall --copy-back > # while my diffall tool is running, I edit the file somewhere else. > # quit my diffall tool > # --> my edits in the other tool are overwritten by diffall > > Editing the files in place via symlinks would resolve that. I had a similar idea but didn't mention it because Windows came to mind. I always want to say, "darn it, this code would be so much easier if we could just ignore Windows", but that's not very helpful. I'd be happy with a runtime platform check where the copy back is only done on Windows. Everyone else can enjoy symlinks. Reading between the lines that could be interpreted as, "well, that copy back code is no good and *we* don't want to use it, but it's okay for Windows users", which is slightly dangerous because we'd always be running the symlink code path and wouldn't hit problems with the other path. So I'm torn. I think symlinks are a great idea, but Windows drives us towards the less-than-ideal solution. I want the best solution possible. Do we just accept that the copy-back code is simply the cost of supporting Windows and keep both code paths around? I would not be opposed to that if the result is a more robust user experience. -- David ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2012-04-11 8:38 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-02-22 22:12 [PATCH v2] contrib: added git-diffall Tim Henigan 2012-02-22 23:48 ` Junio C Hamano 2012-02-23 9:56 ` Stefano Lattarini 2012-02-23 17:37 ` Junio C Hamano 2012-02-23 16:07 ` Tim Henigan 2012-02-23 19:02 ` Junio C Hamano 2012-04-10 23:06 ` Matt McClure 2012-04-11 8:38 ` David Aguilar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).