git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv3] git apply: option to ignore whitespace differences
@ 2009-07-28 21:00 Giuseppe Bilotta
  2009-07-28 21:29 ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-28 21:00 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Giuseppe Bilotta

Introduce --ignore-whitespace option and corresponding config bool to
ignore whitespace differences while applying patches, akin to the
'patch' program.

'git am', 'git rebase' and the bash git completion are made aware of
this option.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
---

Third iteration of the ignore-whitespace option for git-apply. Sorry for
the long delay from the previous one, but I've been busy with real-life
stuff.

Significant changes since the previous iterations:
* a boolean config option has been introduced
* the patch that synchronized the whitespace when it was being ignored
  has been squashed in
* the fuzzy buffer comparison needs the length of both buffers since, as
  pointed out in a private comunication, the hash check is not enough to
  guarantee that the candidates have the same non-whitespace characters

With regard to this, by the way, the expected failure for patch3 is
currently catched by a hashing, but it should be improved by generating
two lines with the same hash, one being a proper initial substring of
the other (modulo whitespace); a situation which I wasn't able to create
without resorting to situations that would make git (and diff) go
binary. Suggestions welcome.

Also, I've come across the need to implement a similar option for merges
(which basically reduces to adding the flag XDF_IGNORE_WHITESPACE_CHANGE
in appropriate places) but couldn't come up with a sensible way to
expose it via parameters to pass through to the merge engines, whose
interactions are a little over my head for now.


 Documentation/config.txt               |    5 +
 Documentation/git-am.txt               |    3 +-
 Documentation/git-apply.txt            |    7 ++
 Documentation/git-rebase.txt           |    3 +-
 builtin-apply.c                        |  135 ++++++++++++++++++++++++++++--
 contrib/completion/git-completion.bash |    3 +
 git-am.sh                              |    3 +-
 git-rebase.sh                          |    3 +
 t/t4107-apply-ignore-whitespace.sh     |  143 ++++++++++++++++++++++++++++++++
 9 files changed, 295 insertions(+), 10 deletions(-)
 create mode 100755 t/t4107-apply-ignore-whitespace.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 792dd42..c24fa5d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -461,6 +461,11 @@ it will be treated as a shell command.  For example, defining
 executed from the top-level directory of a repository, which may
 not necessarily be the current directory.
 
+apply.ignore-whitespace::
+	Tells 'git-apply' to ignore whitespaces differences,
+	in the same way as the '--ignore-whitespace' option.
+	See linkgit:git-apply[1].
+
 apply.whitespace::
 	Tells 'git-apply' how to handle whitespaces, in the same way
 	as the '--whitespace' option. See linkgit:git-apply[1].
diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
index 32e689b..acd2a5d 100644
--- a/Documentation/git-am.txt
+++ b/Documentation/git-am.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git am' [--signoff] [--keep] [--utf8 | --no-utf8]
 	 [--3way] [--interactive] [--committer-date-is-author-date]
-	 [--ignore-date]
+	 [--ignore-date] [--ignore-whitespace]
 	 [--whitespace=<option>] [-C<n>] [-p<n>] [--directory=<dir>]
 	 [--reject] [-q | --quiet]
 	 [<mbox> | <Maildir>...]
@@ -65,6 +65,7 @@ default.   You can use `--no-utf8` to override this.
 	it is supposed to apply to and we have those blobs
 	available locally.
 
+--ignore-date::
 --whitespace=<option>::
 -C<n>::
 -p<n>::
diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 735374d..ef92c03 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 	  [--apply] [--no-add] [--build-fake-ancestor=<file>] [-R | --reverse]
 	  [--allow-binary-replacement | --binary] [--reject] [-z]
 	  [-pNUM] [-CNUM] [--inaccurate-eof] [--recount] [--cached]
+	  [--ignore-whitespace]
 	  [--whitespace=<nowarn|warn|fix|error|error-all>]
 	  [--exclude=PATH] [--include=PATH] [--directory=<root>]
 	  [--verbose] [<patch>...]
@@ -149,6 +150,9 @@ patch to each path is used.  A patch to a path that does not match any
 include/exclude pattern is used by default if there is no include pattern
 on the command line, and ignored if there is any include pattern.
 
+--ignore-whitespace::
+	When applying a patch, ignore whitespace differences if necessary.
+
 --whitespace=<action>::
 	When applying a patch, detect a new or modified line that has
 	whitespace errors.  What are considered whitespace errors is
@@ -205,6 +209,9 @@ running `git apply --directory=modules/git-gui`.
 Configuration
 -------------
 
+apply.ignore-whitespace::
+	Set this boolean configuration flag if you want to ignore whitespace
+	differences to be ignored by default.
 apply.whitespace::
 	When no `--whitespace` flag is given from the command
 	line, this configuration item is used as the default.
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index db1b71d..0aefc34 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -268,8 +268,9 @@ OPTIONS
 	exit with the message "Current branch is up to date" in such a
 	situation.
 
+--ignore-whitespace::
 --whitespace=<option>::
-	This flag is passed to the 'git-apply' program
+	These flag are passed to the 'git-apply' program
 	(see linkgit:git-apply[1]) that applies the patch.
 	Incompatible with the --interactive option.
 
diff --git a/builtin-apply.c b/builtin-apply.c
index 39dc96a..792c7ac 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -39,6 +39,7 @@ static int diffstat;
 static int numstat;
 static int summary;
 static int check;
+static int ignore_whitespace;
 static int apply = 1;
 static int apply_in_reverse;
 static int apply_with_reject;
@@ -214,6 +215,62 @@ static uint32_t hash_line(const char *cp, size_t len)
 	return h;
 }
 
+/*
+ * Compare lines s1 of length n1 and s2 of length n2, ignoring
+ * whitespace difference. Returns 1 if they match, 0 otherwise
+ */
+static int fuzzy_matchlines(const char *s1, size_t n1,
+			    const char *s2, size_t n2)
+{
+	const char *last1 = s1 + n1 - 1;
+	const char *last2 = s2 + n2 - 1;
+	int result = 0;
+
+	if (n1 < 0 || n2 < 0)
+		return 0;
+
+	/* ignore line endings */
+	while ((*last1 == '\r') || (*last1 == '\n'))
+		last1--;
+	while ((*last2 == '\r') || (*last2 == '\n'))
+		last2--;
+
+	/* skip leading whitespace */
+	while (isspace(*s1) && (s1 <= last1))
+		s1++;
+	while (isspace(*s2) && (s2 <= last2))
+		s2++;
+	/* early return if both lines are empty */
+	if ((s1 > last1) && (s2 > last2))
+		return 1;
+	while (!result) {
+		result = *s1++ - *s2++;
+		/*
+		 * Skip whitespace inside. We check for whitespace on
+		 * both buffers because we don't want "a b" to match
+		 * "ab"
+		 */
+		if (isspace(*s1) && isspace(*s2)) {
+			while (isspace(*s1) && s1 <= last1)
+				s1++;
+			while (isspace(*s2) && s2 <= last2)
+				s2++;
+		}
+		/*
+		 * If we reached the end on one side only,
+		 * lines don't match
+		 */
+		if (
+		    ((s2 > last2) && (s1 <= last1)) ||
+		    ((s1 > last1) && (s2 <= last2)))
+			return 0;
+		if ((s1 > last1) && (s2 > last2))
+			break;
+	}
+
+	return !result;
+}
+
 static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag)
 {
 	ALLOC_GROW(img->line_allocated, img->nr + 1, img->alloc);
@@ -1672,10 +1729,17 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 	}
 }
 
+/*
+ * Update the preimage, and the common lines in postimage,
+ * from buffer buf of length len. If postlen is 0 the postimage
+ * is updated in place, otherwise it's updated on a new buffer
+ * of length postlen
+ */
+
 static void update_pre_post_images(struct image *preimage,
 				   struct image *postimage,
 				   char *buf,
-				   size_t len)
+				   size_t len, size_t postlen)
 {
 	int i, ctx;
 	char *new, *old, *fixed;
@@ -1694,11 +1758,19 @@ static void update_pre_post_images(struct image *preimage,
 	*preimage = fixed_preimage;
 
 	/*
-	 * Adjust the common context lines in postimage, in place.
-	 * This is possible because whitespace fixing does not make
-	 * the string grow.
+	 * Adjust the common context lines in postimage. This can be
+	 * done in-place when we are just doing whitespace fixing,
+	 * which does not make the string grow, but needs a new buffer
+	 * when ignore_whitespace causes the update, since in this case
+	 * we could have e.g. tabs converted to multiple spaces.
+	 * We trust the caller to tell us if the update can be done
+	 * in place (postlen==0) or not.
 	 */
-	new = old = postimage->buf;
+	old = postimage->buf;
+	if (postlen)
+		new = postimage->buf = xmalloc(postlen);
+	else
+		new = old;
 	fixed = preimage->buf;
 	for (i = ctx = 0; i < postimage->nr; i++) {
 		size_t len = postimage->line[i].len;
@@ -1773,12 +1845,57 @@ static int match_fragment(struct image *img,
 	    !memcmp(img->buf + try, preimage->buf, preimage->len))
 		return 1;
 
+	/*
+	 * No exact match. If we are ignoring whitespace, run a line-by-line
+	 * fuzzy matching. We collect all the line length information because
+	 * we need it to adjust whitespace if we match.
+	 */
+	if (ignore_whitespace) {
+		size_t imgoff = 0;
+		size_t preoff = 0;
+		size_t postlen = postimage->len;
+		size_t imglen[preimage->nr];
+		for (i = 0; i < preimage->nr; i++) {
+			imglen[i] = img->line[try_lno+i].len;
+			size_t prelen = preimage->line[i].len;
+			if (!fuzzy_matchlines(
+				img->buf + try + imgoff, imglen[i],
+				preimage->buf + preoff, prelen))
+				return 0;
+			if (preimage->line[i].flag & LINE_COMMON)
+				postlen += imglen[i] - prelen;
+			imgoff += imglen[i];
+			preoff += prelen;
+		}
+
+		/*
+		 * Ok, the preimage matches with whitespace fuzz. Update it and
+		 * the common postimage lines to use the same whitespace as the
+		 * target. imgoff now holds the true length of the target that
+		 * matches the preimage, and we need to update the line lengths
+		 * of the preimage to match the target ones.
+		 */
+		fixed_buf = xmalloc(imgoff);
+		memcpy(fixed_buf, img->buf + try, imgoff);
+		for (i = 0; i < preimage->nr; i++)
+			preimage->line[i].len = imglen[i];
+
+		/*
+		 * Update the preimage buffer and the postimage context lines.
+		 */
+		update_pre_post_images(preimage, postimage,
+				fixed_buf, imgoff, postlen);
+		return 1;
+	}
+
 	if (ws_error_action != correct_ws_error)
 		return 0;
 
 	/*
 	 * The hunk does not apply byte-by-byte, but the hash says
-	 * it might with whitespace fuzz.
+	 * it might with whitespace fuzz. We haven't been asked to
+	 * ignore whitespace, we were asked to correct whitespace
+	 * errors, so let's try matching after whitespace correction.
 	 */
 	fixed_buf = xmalloc(preimage->len + 1);
 	buf = fixed_buf;
@@ -1830,7 +1947,7 @@ static int match_fragment(struct image *img,
 	 * hunk match.  Update the context lines in the postimage.
 	 */
 	update_pre_post_images(preimage, postimage,
-			       fixed_buf, buf - fixed_buf);
+			       fixed_buf, buf - fixed_buf, 0);
 	return 1;
 
  unmatch_exit:
@@ -3272,6 +3389,8 @@ static int git_apply_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, "apply.whitespace"))
 		return git_config_string(&apply_default_whitespace, var, value);
+	else if (!strcmp(var, "apply.ignore-whitespace"))
+		ignore_whitespace = git_config_bool(var, value);
 	return git_default_config(var, value, cb);
 }
 
@@ -3384,6 +3503,8 @@ int cmd_apply(int argc, const char **argv, const char *unused_prefix)
 		{ OPTION_CALLBACK, 0, "whitespace", &whitespace_option, "action",
 			"detect new or modified lines that have whitespace errors",
 			0, option_parse_whitespace },
+		OPT_BOOLEAN(0, "ignore-whitespace", &ignore_whitespace,
+			"ignore whitespace differences when finding context"),
 		OPT_BOOLEAN('R', "reverse", &apply_in_reverse,
 			"apply the patch in reverse"),
 		OPT_BOOLEAN(0, "unidiff-zero", &unidiff_zero,
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 5543dc4..7978d1d 100755
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -674,6 +674,7 @@ _git_am ()
 	--*)
 		__gitcomp "
 			--3way --committer-date-is-author-date --ignore-date
+			--ignore-whitespace
 			--interactive --keep --no-utf8 --signoff --utf8
 			--whitespace=
 			"
@@ -695,6 +696,7 @@ _git_apply ()
 			--stat --numstat --summary --check --index
 			--cached --index-info --reverse --reject --unidiff-zero
 			--apply --no-add --exclude=
+			--ignore-whitespace
 			--whitespace= --inaccurate-eof --verbose
 			"
 		return
@@ -1537,6 +1539,7 @@ _git_config ()
 	__gitcomp "
 		add.ignore-errors
 		alias.
+		apply.ignore-whitespace
 		apply.whitespace
 		branch.autosetupmerge
 		branch.autosetuprebase
diff --git a/git-am.sh b/git-am.sh
index d64d997..fe024b1 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -16,6 +16,7 @@ s,signoff       add a Signed-off-by line to the commit message
 u,utf8          recode into utf8 (default)
 k,keep          pass -k flag to git-mailinfo
 whitespace=     pass it through git-apply
+ignore-whitespace pass it through git-apply
 directory=      pass it through git-apply
 C=              pass it through git-apply
 p=              pass it through git-apply
@@ -303,7 +304,7 @@ do
 		git_apply_opt="$git_apply_opt $(sq "$1$2")"; shift ;;
 	--patch-format)
 		shift ; patch_format="$1" ;;
-	--reject)
+	--reject|--ignore-whitespace)
 		git_apply_opt="$git_apply_opt $1" ;;
 	--committer-date-is-author-date)
 		committer_date_is_author_date=t ;;
diff --git a/git-rebase.sh b/git-rebase.sh
index 18bc694..d741752 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -333,6 +333,9 @@ do
 			;;
 		esac
 		;;
+	--ignore-whitespace)
+		git_am_opt="$git_am_opt $1"
+		;;
 	--committer-date-is-author-date|--ignore-date)
 		git_am_opt="$git_am_opt $1"
 		force_rebase=t
diff --git a/t/t4107-apply-ignore-whitespace.sh b/t/t4107-apply-ignore-whitespace.sh
new file mode 100755
index 0000000..5a09001
--- /dev/null
+++ b/t/t4107-apply-ignore-whitespace.sh
@@ -0,0 +1,143 @@
+#!/bin/sh
+#
+# Copyright (c) 2009 Giuseppe Bilotta
+# Copyright (c) 2005 Junio C Hamano
+# Copyright (c) 2005 Robert Fitzsimons
+#
+
+test_description='git-apply --ignore-whitespace.
+
+'
+. ./test-lib.sh
+
+# setup
+
+cat > patch1.patch <<\EOF
+diff --git a/main.c b/main.c
+new file mode 100644
+--- /dev/null
++++ b/main.c
+@@ -0,0 +1,22 @@
++#include <stdio.h>
++
++void print_int(int num);
++int func(int num);
++
++int main() {
++       int i;
++
++       for (i = 0; i < 10; i++) {
++               print_int(func(i)); /* stuff */
++       }
++
++       return 0;
++}
++
++int func(int num) {
++       return num * num;
++}
++
++void print_int(int num) {
++       printf("%d", num);
++}
+EOF
+cat > patch2.patch <<\EOF
+diff --git a/main.c b/main.c
+--- a/main.c
++++ b/main.c
+@@ -10,6 +10,8 @@
+ 		print_int(func(i)); /* stuff */
+ 	}
+ 
++	printf("\n");
++
+ 	return 0;
+ }
+ 
+EOF
+
+cat > patch3.patch <<\EOF
+diff --git a/main.c b/main.c
+--- a/main.c
++++ b/main.c
+@@ -10,1 +10,1 @@
+ 		print_int(func(i)); 
+EOF
+
+cat > patch4.patch <<\EOF
+diff --git a/main.c b/main.c
+--- a/main.c
++++ b/main.c
+@@ -21,1 +21,1 @@
+ 	}; 
+\ No newline at end of file
+EOF
+
+cat > patch5.patch <<\EOF
+diff --git a/main.c b/main.c
+--- a/main.c
++++ b/main.c
+@@ -2,1 +2,1 @@
+ 	void print_int(int num);
+EOF
+
+cat > main.c.final <<\EOF
+#include <stdio.h>
+
+void print_int(int num);
+int func(int num);
+
+int main() {
+       int i;
+
+       for (i = 0; i < 10; i++) {
+               print_int(func(i)); /* stuff */
+       }
+
+	printf("\n");
+
+       return 0;
+}
+
+int func(int num) {
+       return num * num;
+}
+
+void print_int(int num) {
+       printf("%d", num);
+}
+EOF
+
+test_expect_success "S = patch1" \
+    'git-apply patch1.patch'
+
+test_expect_failure "F = patch2" \
+    'git-apply patch2.patch'
+
+git config apply.ignore-whitespace true
+
+test_expect_success "S = patch2 (--ignore-whitespace)" \
+    'git-apply patch2.patch'
+
+test_expect_failure "F = patch3 (missing string at EOL)" \
+    'git-apply patch3.patch'
+
+test_expect_failure "F = patch4 (missing EOL at EOF)" \
+    'git-apply patch4.patch'
+
+test_expect_success "S = patch5 (leading whitespace)" \
+    'git-apply patch5.patch'
+
+test_expect_success "S = main.c.final" \
+    'diff -q main.c main.c.final'
+
+rm -f main.c
+test_expect_success "S = patch1 (--ignore-whitespace)" \
+    'git-apply patch1.patch'
+
+test_expect_failure "F = patch5 (leading whitespace)" \
+    'git-apply --no-ignore-whitespace patch5.patch'
+
+test_done
+
+
-- 
1.6.3.3.511.gcffae6.dirty

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-28 21:00 [PATCHv3] git apply: option to ignore whitespace differences Giuseppe Bilotta
@ 2009-07-28 21:29 ` Junio C Hamano
  2009-07-29  6:33   ` Giuseppe Bilotta
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-07-28 21:29 UTC (permalink / raw)
  To: Giuseppe Bilotta; +Cc: git

Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:

> @@ -205,6 +209,9 @@ running `git apply --directory=modules/git-gui`.
>  Configuration
>  -------------
>  
> +apply.ignore-whitespace::
> +	Set this boolean configuration flag if you want to ignore whitespace
> +	differences to be ignored by default.

That's a strange sentence.

	When set to true, changes in amount of whitespace are ignored.

I am not happy with the name --ignore-whitespace.

Perhaps --ignore-space-change, to be consistent with a "git diff" option,
would be more appropriate.  Doing so has an added benefit of leaving the
door open to add --ignore-all-space option to the patch application side
later.

> +		/*
> +		 * Skip whitespace inside. We check for whitespace on
> +		 * both buffers because we don't want "a b" to match
> +		 * "ab"

		 * but that is only true when --ignore-space-change
                 * is in effect.
		 *
                 * If you want to match "if(a > b)" with "if (a > b)",
                 * please add --ignore-all-space option and use it.

    ;-)

If I am reading the patch correctly, use of this option always disables
checks and corrections of whitespace errors.  I personally feel that
somebody who is willing to accept a patch that has whitespace breakages
that needs this option would not care, but the documentation should warn
about it at the minimum.

Needless to say, a lot better option is not to disable the checks and
corrections at all.  After all, this "ignore space change" option is about
matching the common context lines and replaced/removed contents before the
change, and checks and corrections are about added/replaced contents after
the change.  They should be orthogonal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-28 21:29 ` Junio C Hamano
@ 2009-07-29  6:33   ` Giuseppe Bilotta
  2009-07-29  7:09     ` Junio C Hamano
  2009-07-29  8:40     ` Nanako Shiraishi
  0 siblings, 2 replies; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-29  6:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jul 28, 2009 at 11:29 PM, Junio C Hamano<gitster@pobox.com> wrote:
> Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:
>
>> @@ -205,6 +209,9 @@ running `git apply --directory=modules/git-gui`.
>>  Configuration
>>  -------------
>>
>> +apply.ignore-whitespace::
>> +     Set this boolean configuration flag if you want to ignore whitespace
>> +     differences to be ignored by default.
>
> That's a strange sentence.
>
>        When set to true, changes in amount of whitespace are ignored.

Indeed, my sentene was horrible.

> I am not happy with the name --ignore-whitespace.
>
> Perhaps --ignore-space-change, to be consistent with a "git diff" option,
> would be more appropriate.  Doing so has an added benefit of leaving the
> door open to add --ignore-all-space option to the patch application side
> later.

On the other hand, --ignore-whitespace matches the option name (and
behavior) of the 'patch' command (just like "git diff"'s matches the
'diff' option name and behavior). Principle of least surprise says
that someone coming to git from raw diff/patch setups would expect
--ignore-whitespace on the patch side.

A possible future expansion to allow ignoring all whitespace
completely would be to allow --ignore-whitespace=all

> If I am reading the patch correctly, use of this option always disables
> checks and corrections of whitespace errors.

Not exactly. Whitespace correction is not attempted to try matching
context lines, but the whitespace correction on the new lines is still
applied.

> I personally feel that
> somebody who is willing to accept a patch that has whitespace breakages
> that needs this option would not care

Indeed, my first patch fell through to the whitespace correction, I
changed it per your suggestion.

> but the documentation should warn
> about it at the minimum.

The code comments does mention that context line whitespace correction
that will be skipped. I'll add a note about it in the man page.

> Needless to say, a lot better option is not to disable the checks and
> corrections at all.  After all, this "ignore space change" option is about
> matching the common context lines and replaced/removed contents before the
> change, and checks and corrections are about added/replaced contents after
> the change.  They should be orthogonal.

They are, for new lines.

Actually, one thing that I've been thinking about doing is to adjust
the new lines to match the kind of whitespace change the context line
underwent. Example: if all the context lines had the change 4 spaces
-> tab, it would be nice to have the new lines undergo the same
change. However, this is going to be rather hard to implement.

-- 
Giuseppe "Oblomov" Bilotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  6:33   ` Giuseppe Bilotta
@ 2009-07-29  7:09     ` Junio C Hamano
  2009-07-29  8:20       ` Giuseppe Bilotta
  2009-07-29  8:40     ` Nanako Shiraishi
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-07-29  7:09 UTC (permalink / raw)
  To: Giuseppe Bilotta; +Cc: Junio C Hamano, git

Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:

> Actually, one thing that I've been thinking about doing is to adjust
> the new lines to match the kind of whitespace change the context line
> underwent. Example: if all the context lines had the change 4 spaces
> -> tab, it would be nice to have the new lines undergo the same
> change. However, this is going to be rather hard to implement.

Doing so will be actively wrong.

In the case of "whitespace=fix", it is justifiable to update ws broken
context lines while applying a ws corrected patch to a ws broken target,
and it also is justifiable not to update context lines while applying a ws
broken patch to a ws corrected target, because it is clear which one has
the correct whitespace layout (i.e. output of ws_fix_copy() by definition
is the correct outcome).  But in your example, it is not clear if the
layout using 4 spaces is correct or the one with a tab.  The tool should
refrain from guessing in such a case.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  7:09     ` Junio C Hamano
@ 2009-07-29  8:20       ` Giuseppe Bilotta
  2009-07-29  8:39         ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-29  8:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jul 29, 2009 at 9:09 AM, Junio C Hamano<gitster@pobox.com> wrote:
> Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:
>
>> Actually, one thing that I've been thinking about doing is to adjust
>> the new lines to match the kind of whitespace change the context line
>> underwent. Example: if all the context lines had the change 4 spaces
>> -> tab, it would be nice to have the new lines undergo the same
>> change. However, this is going to be rather hard to implement.
>
> Doing so will be actively wrong.
>
> In the case of "whitespace=fix", it is justifiable to update ws broken
> context lines while applying a ws corrected patch to a ws broken target,
> and it also is justifiable not to update context lines while applying a ws
> broken patch to a ws corrected target, because it is clear which one has
> the correct whitespace layout (i.e. output of ws_fix_copy() by definition
> is the correct outcome).  But in your example, it is not clear if the
> layout using 4 spaces is correct or the one with a tab.  The tool should
> refrain from guessing in such a case.

I was thinking more about consistency than 'correctness'  in this
case, actually. Typical scenario would be: patch is created for a file
using a given whitespace convention (e.g. 4 spaces). File is updated
to match the rest of the project (tabs). Patch would now introduce the
wrong whitespace convention for the new lines.

Of course, whitespace can (and should) be fixed a posteriori, but when
doing a rebase this can get quite annoying. So some kind of automatic
way to do it would be nice. But I don't think I'd get around to such a
feature any time soon. (Also, the merge case has a higher priority.)

-- 
Giuseppe "Oblomov" Bilotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  8:20       ` Giuseppe Bilotta
@ 2009-07-29  8:39         ` Junio C Hamano
  2009-07-29  9:05           ` Giuseppe Bilotta
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-07-29  8:39 UTC (permalink / raw)
  To: Giuseppe Bilotta; +Cc: git

Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:

> I was thinking more about consistency than 'correctness'  in this
> case, actually. Typical scenario would be: patch is created for a file
> using a given whitespace convention (e.g. 4 spaces). File is updated
> to match the rest of the project (tabs). Patch would now introduce the
> wrong whitespace convention for the new lines.

You are assuming that the patch is always based on a wrong convention and
the target always uses the right convention.  In general, however, you
cannot tell which way the "consisitency" should go --- often you have to
apply a patch based on a fixed codebase to an older version.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  6:33   ` Giuseppe Bilotta
  2009-07-29  7:09     ` Junio C Hamano
@ 2009-07-29  8:40     ` Nanako Shiraishi
  2009-07-29  9:09       ` Giuseppe Bilotta
  2009-07-31  0:27       ` Felipe Contreras
  1 sibling, 2 replies; 15+ messages in thread
From: Nanako Shiraishi @ 2009-07-29  8:40 UTC (permalink / raw)
  To: Giuseppe Bilotta; +Cc: Junio C Hamano, git

Quoting Giuseppe Bilotta <giuseppe.bilotta@gmail.com>

>> Perhaps --ignore-space-change, to be consistent with a "git diff" option,
>> would be more appropriate. Doing so has an added benefit of leaving the
>> door open to add --ignore-all-space option to the patch application side
>> later.
>
> On the other hand, --ignore-whitespace matches the option name (and
> behavior) of the 'patch' command (just like "git diff"'s matches the
> 'diff' option name and behavior). Principle of least surprise says
> that someone coming to git from raw diff/patch setups would expect
> --ignore-whitespace on the patch side.

Not everybody shares your diff/patch background.

I wouldn't be surprised if git were the first system they ever learn for
majority of users of version control systems in this century, especially
because now there are many books written on it.

Isn't it more important for git to be internally consistent across its
commands for such an audience to satisfy the principle of least surprise?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  8:39         ` Junio C Hamano
@ 2009-07-29  9:05           ` Giuseppe Bilotta
  0 siblings, 0 replies; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-29  9:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jul 29, 2009 at 10:39 AM, Junio C Hamano<gitster@pobox.com> wrote:
> Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:
>
>> I was thinking more about consistency than 'correctness'  in this
>> case, actually. Typical scenario would be: patch is created for a file
>> using a given whitespace convention (e.g. 4 spaces). File is updated
>> to match the rest of the project (tabs). Patch would now introduce the
>> wrong whitespace convention for the new lines.
>
> You are assuming that the patch is always based on a wrong convention and
> the target always uses the right convention.

No, I'm just assuming that the lines should have consistent whitespace

> In general, however, you
> cannot tell which way the "consisitency" should go --- often you have to
> apply a patch based on a fixed codebase to an older version.

Consistency should go the way of the context lines. So if the
convention changed and you're backporting changes, the changes would
use the old convention on the old codebase, and the new convention on
the new codebase, because that's what the context lines would direct
the new lines to.

Of course, this is all very abstract reasoning because I don't even
think the feature can be implemented at all robustly (too much 'AI'
work, IMO).

-- 
Giuseppe "Oblomov" Bilotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  8:40     ` Nanako Shiraishi
@ 2009-07-29  9:09       ` Giuseppe Bilotta
  2009-07-31  0:27       ` Felipe Contreras
  1 sibling, 0 replies; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-29  9:09 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Junio C Hamano, git

On Wed, Jul 29, 2009 at 10:40 AM, Nanako Shiraishi<nanako3@lavabit.com> wrote:
> Quoting Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
>
>>> Perhaps --ignore-space-change, to be consistent with a "git diff" option,
>>> would be more appropriate. Doing so has an added benefit of leaving the
>>> door open to add --ignore-all-space option to the patch application side
>>> later.
>>
>> On the other hand, --ignore-whitespace matches the option name (and
>> behavior) of the 'patch' command (just like "git diff"'s matches the
>> 'diff' option name and behavior). Principle of least surprise says
>> that someone coming to git from raw diff/patch setups would expect
>> --ignore-whitespace on the patch side.
>
> Not everybody shares your diff/patch background.
>
> I wouldn't be surprised if git were the first system they ever learn for
> majority of users of version control systems in this century, especially
> because now there are many books written on it.

This is something I hadn't considered.

> Isn't it more important for git to be internally consistent across its
> commands for such an audience to satisfy the principle of least surprise?

Would it be ok if I had the option and command line option turn into
ignore-space-change and keep ignore-whitespace as a synonym? Or would
that be too much?

-- 
Giuseppe "Oblomov" Bilotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-29  8:40     ` Nanako Shiraishi
  2009-07-29  9:09       ` Giuseppe Bilotta
@ 2009-07-31  0:27       ` Felipe Contreras
  2009-07-31  0:48         ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Felipe Contreras @ 2009-07-31  0:27 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Giuseppe Bilotta, Junio C Hamano, git

On Wed, Jul 29, 2009 at 8:40 AM, Nanako Shiraishi<nanako3@lavabit.com> wrote:
> Quoting Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
>
>>> Perhaps --ignore-space-change, to be consistent with a "git diff" option,
>>> would be more appropriate. Doing so has an added benefit of leaving the
>>> door open to add --ignore-all-space option to the patch application side
>>> later.
>>
>> On the other hand, --ignore-whitespace matches the option name (and
>> behavior) of the 'patch' command (just like "git diff"'s matches the
>> 'diff' option name and behavior). Principle of least surprise says
>> that someone coming to git from raw diff/patch setups would expect
>> --ignore-whitespace on the patch side.
>
> Not everybody shares your diff/patch background.
>
> I wouldn't be surprised if git were the first system they ever learn for
> majority of users of version control systems in this century, especially
> because now there are many books written on it.

That's not relevant, "white space" is an already used concept.

Google:
ignore space change: 17,300,000
ignore white space: 181,000,000

> Isn't it more important for git to be internally consistent across its
> commands for such an audience to satisfy the principle of least surprise?

Perhaps, but you are forgetting the option to change the current
commands' arguments.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-31  0:27       ` Felipe Contreras
@ 2009-07-31  0:48         ` Junio C Hamano
  2009-07-31 15:38           ` Felipe Contreras
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-07-31  0:48 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Nanako Shiraishi, Giuseppe Bilotta, Junio C Hamano, git

Felipe Contreras <felipe.contreras@gmail.com> writes:

> Google:
> ignore space change: 17,300,000
> ignore white space: 181,000,000

But that is not relevant either.  "ignore white space" is a superset of
"ignore space change", iow the latter has more specific meaning than the
former.  It is not surprising to see that search engines find larger
number of references to more general terms than more specific ones.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-31  0:48         ` Junio C Hamano
@ 2009-07-31 15:38           ` Felipe Contreras
  2009-07-31 16:16             ` Giuseppe Bilotta
  0 siblings, 1 reply; 15+ messages in thread
From: Felipe Contreras @ 2009-07-31 15:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nanako Shiraishi, Giuseppe Bilotta, git

On Fri, Jul 31, 2009 at 12:48 AM, Junio C Hamano<gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> Google:
>> ignore space change: 17,300,000
>> ignore white space: 181,000,000
>
> But that is not relevant either.  "ignore white space" is a superset of
> "ignore space change", iow the latter has more specific meaning than the
> former.  It is not surprising to see that search engines find larger
> number of references to more general terms than more specific ones.

The fact remains the same: white space is a much more used term, and
IMHO the only one that really matters.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-31 15:38           ` Felipe Contreras
@ 2009-07-31 16:16             ` Giuseppe Bilotta
  2009-07-31 17:17               ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-31 16:16 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Junio C Hamano, Nanako Shiraishi, git

On Fri, Jul 31, 2009 at 5:38 PM, Felipe
Contreras<felipe.contreras@gmail.com> wrote:
> On Fri, Jul 31, 2009 at 12:48 AM, Junio C Hamano<gitster@pobox.com> wrote:
>> Felipe Contreras <felipe.contreras@gmail.com> writes:
>>
>>> Google:
>>> ignore space change: 17,300,000
>>> ignore white space: 181,000,000
>>
>> But that is not relevant either.  "ignore white space" is a superset of
>> "ignore space change", iow the latter has more specific meaning than the
>> former.  It is not surprising to see that search engines find larger
>> number of references to more general terms than more specific ones.
>
> The fact remains the same: white space is a much more used term, and
> IMHO the only one that really matters.

Before I go on with the next revision of the patch, I would like to
have some kind of agreed convention to implement.

My suggestion would be the following:
(1) implement options --ignore-space-change, --ignore-all-space
mirroring their 'git diff' meaning.
(2) add --ignore-whitespace as a synonym to --ignore-space-change, for
consistency with 'patch'
(3) apply.ignore-whitespace accepts values
     * false,no,none,0 to mean no whitespace ignoring
     * true,yes,change,1 to mean ignore whitespace change
     * all,2 to mean ignore all whitespace

Objections?



-- 
Giuseppe "Oblomov" Bilotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-31 16:16             ` Giuseppe Bilotta
@ 2009-07-31 17:17               ` Junio C Hamano
  2009-07-31 19:22                 ` Giuseppe Bilotta
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2009-07-31 17:17 UTC (permalink / raw)
  To: Giuseppe Bilotta; +Cc: Felipe Contreras, Junio C Hamano, Nanako Shiraishi, git

Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:

> Before I go on with the next revision of the patch, I would like to
> have some kind of agreed convention to implement.
>
> My suggestion would be the following:
> (1) implement options --ignore-space-change, --ignore-all-space
> mirroring their 'git diff' meaning.
> (2) add --ignore-whitespace as a synonym to --ignore-space-change, for
> consistency with 'patch'
> (3) apply.ignore-whitespace accepts values
>      * false,no,none,0 to mean no whitespace ignoring
>      * true,yes,change,1 to mean ignore whitespace change
>      * all,2 to mean ignore all whitespace

I'd suggest taking one small bite at a time.  You do not have code to do
the ignore-all-space semantics that has been reviewed, and you neither
have the configuration variables.  So my preference would be to do in the
first round the first half of (1) with (2), docs, tests and nothing else.

A later follow-up patch would complete your (1) and (2) by supporting
ignore-all-space.  And perhaps you would give it --ignore-all-whitespace
synonym perhaps?  You may want to ask "GNU patch" people if they are
interested in ignoring all whitespaces, and if so what their plan is to
name that option, so that you can use the same name.

As to configuration:

 - I think the naming convention (I know, there are existing violators,
   but that is not an excuse to add new ones) is without dashes (again, I
   know I personally do not like CamelCase but that is what we have);

 - I personally think 0/1/2 would be more cluttering than they are useful.

 - I'd say "no (never, none)", "change", "all".  You could throw in
   "false" if you want, as declining this option is quite boolean-ish, but
   activating it is _not_ boolean decision and I would suggest actively
   keeping "true" or "yes" out of the choices.  Otherwise anybody who is
   tempted to answer "yes" has to ask himself "Ok, I want to say 'yes' but
   which semantics does it mean?  What would I be ignoring?"

That non-booleanness of "true" is why I am somewhat negative on your (2)
above.  It would only be there for people who know "GNU patch" because
they would understand it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv3] git apply: option to ignore whitespace differences
  2009-07-31 17:17               ` Junio C Hamano
@ 2009-07-31 19:22                 ` Giuseppe Bilotta
  0 siblings, 0 replies; 15+ messages in thread
From: Giuseppe Bilotta @ 2009-07-31 19:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Felipe Contreras, Nanako Shiraishi, git

On Fri, Jul 31, 2009 at 7:17 PM, Junio C Hamano<gitster@pobox.com> wrote:
> Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes:
>
>> Before I go on with the next revision of the patch, I would like to
>> have some kind of agreed convention to implement.
>>
>> My suggestion would be the following:
>> (1) implement options --ignore-space-change, --ignore-all-space
>> mirroring their 'git diff' meaning.
>> (2) add --ignore-whitespace as a synonym to --ignore-space-change, for
>> consistency with 'patch'
>> (3) apply.ignore-whitespace accepts values
>>      * false,no,none,0 to mean no whitespace ignoring
>>      * true,yes,change,1 to mean ignore whitespace change
>>      * all,2 to mean ignore all whitespace
>
> I'd suggest taking one small bite at a time.

Sure.

> A later follow-up patch would complete your (1) and (2) by supporting
> ignore-all-space.  And perhaps you would give it --ignore-all-whitespace
> synonym perhaps?  You may want to ask "GNU patch" people if they are
> interested in ignoring all whitespaces, and if so what their plan is to
> name that option, so that you can use the same name.

I was thinking about making --ignore-whitespace (I'd rather keep the
-- because that's what patch goes for) accept an optional argument,
allowing e.g. --ignore-whitespace=no to overrule the config option. In
this case we would have --ignore-whitespace=all as synonym to
-ignore-all-space. It would also make the --ignore-whitespace option
more in line with the ignorewhitespace config

-- 
Giuseppe "Oblomov" Bilotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2009-07-31 19:22 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-28 21:00 [PATCHv3] git apply: option to ignore whitespace differences Giuseppe Bilotta
2009-07-28 21:29 ` Junio C Hamano
2009-07-29  6:33   ` Giuseppe Bilotta
2009-07-29  7:09     ` Junio C Hamano
2009-07-29  8:20       ` Giuseppe Bilotta
2009-07-29  8:39         ` Junio C Hamano
2009-07-29  9:05           ` Giuseppe Bilotta
2009-07-29  8:40     ` Nanako Shiraishi
2009-07-29  9:09       ` Giuseppe Bilotta
2009-07-31  0:27       ` Felipe Contreras
2009-07-31  0:48         ` Junio C Hamano
2009-07-31 15:38           ` Felipe Contreras
2009-07-31 16:16             ` Giuseppe Bilotta
2009-07-31 17:17               ` Junio C Hamano
2009-07-31 19:22                 ` Giuseppe Bilotta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).