git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH/RFC 00/13] git-apply --whitespace=fix updates
@ 2008-02-02 10:54 Junio C Hamano
  2008-02-02 10:54 ` [PATCH 01/13] builtin-apply.c: refactor small part that matches context Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This mini-series teaches "git-apply --whitespace=fix" to cope
better with the whitespace changes it introduces to the context.

[PATCH 01/13] builtin-apply.c: refactor small part that matches context
[PATCH 02/13] builtin-apply.c: restructure "offset" matching
[PATCH 03/13] builtin-apply.c: push match-beginning/end logic down

These three are preparatory clean-ups to isolate the parts the
later patches need to update.

[PATCH 04/13] builtin-apply.c: make it more line oriented
[PATCH 05/13] builtin-apply.c: optimize match_beginning/end processing a bit.
[PATCH 06/13] builtin-apply.c: mark common context lines in lineinfo structure.

These introduce an array of hash values for each line in the
patch preimage, postimage and the target file, and simplifies
the code to find the place that the hunk applies.  The ultimate
motivation is to allow applying patches with whitespace
differences, but this step does not do that yet.

The matching logic that uses hash code of lines could also make
it faster to find the place the hunk applies to, but that is not
the primary purpose of the change.  We would however need to
bench it so that this does not at least slow it down too much.
I haven't done that yet.

[PATCH 07/13] builtin-apply.c: clean-up apply_one_fragment()

This reduces 3 variables (old, oldlines, oldsize) that keep
track of the preimage being built (and their "new" counterparts
for the postimage) to 2 variables (old, oldlines).  This cannot
be done cleanly without the conversion to a line oriented data
structure introduced by 04/13 (remove_first_line() is the most
problematic one).

[PATCH 08/13] builtin-apply.c: simplify calling site to apply_line()
[PATCH 09/13] builtin-apply.c: do not feed copy_wsfix() leading '+'
[PATCH 10/13] builtin-apply.c: move copy_wsfix() function a bit higher.

The function apply_line() reads one line of a patch and makes
whitespace corrections.  These three are preparatory steps to
make the function usable for correcting whitespaces in context
lines.  The function is renamed to copy_wsfix().

[PATCH 11/13] builtin-apply.c: pass ws_rule down to match_fragment()
[PATCH 12/13] git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run

This actually implements the application of a patch whose
preimage context lines do not match the target file but do match
if the whitespace breakages in them are corrected.  The fixing
goes the other way as well.  We may still have files with
whitespace breakages, and the submitter may have based the patch
on a tree whose whitespace breakages have been fixed.  In such a
case, the lines in the target file needs to be cleaned up and
compared with the cleaned-up context lines from the preimage of
the patch.

[PATCH 13/13] core.whitespace: cr-at-eol

This introduces a new error mode "cr-at-eol" which tells git not
to treat a CR at the end of line as a trailing whitespace error.
This is from my earlier patch sent to the list, and does not
depend on "the whitespace correction in the context" topic
above, but rebased because it textually conflicts with the
series.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 01/13] builtin-apply.c: refactor small part that matches context
  2008-02-02 10:54 [PATCH/RFC 00/13] git-apply --whitespace=fix updates Junio C Hamano
@ 2008-02-02 10:54 ` Junio C Hamano
  2008-02-02 10:54   ` [PATCH 02/13] builtin-apply.c: restructure "offset" matching Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This moves three "if" conditions out of line from find_offset()
function, which is responsible for finding the matching place in
the preimage to apply the patch.  There is no change in the
logic of the program.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   18 +++++++++++++-----
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 15432b6..2c052f8 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1437,6 +1437,17 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 	}
 }
 
+static int match_fragment(const char *buf, unsigned long size,
+			  unsigned long try,
+			  const char *fragment, unsigned long fragsize)
+{
+	if (try + fragsize > size)
+		return 0;
+	if (memcmp(buf + try, fragment, fragsize))
+		return 0;
+	return 1;
+}
+
 static int find_offset(const char *buf, unsigned long size,
 		       const char *fragment, unsigned long fragsize,
 		       int line, int *lines)
@@ -1461,8 +1472,7 @@ static int find_offset(const char *buf, unsigned long size,
 	}
 
 	/* Exact line number? */
-	if ((start + fragsize <= size) &&
-	    !memcmp(buf + start, fragment, fragsize))
+	if (match_fragment(buf, size, start, fragment, fragsize))
 		return start;
 
 	/*
@@ -1494,9 +1504,7 @@ static int find_offset(const char *buf, unsigned long size,
 			try = forwards;
 		}
 
-		if (try + fragsize > size)
-			continue;
-		if (memcmp(buf + try, fragment, fragsize))
+		if (!match_fragment(buf, size, try, fragment, fragsize))
 			continue;
 		n = (i >> 1)+1;
 		if (i & 1)
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 02/13] builtin-apply.c: restructure "offset" matching
  2008-02-02 10:54 ` [PATCH 01/13] builtin-apply.c: refactor small part that matches context Junio C Hamano
@ 2008-02-02 10:54   ` Junio C Hamano
  2008-02-02 10:54     ` [PATCH 03/13] builtin-apply.c: push match-beginning/end logic down Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This restructures code to find matching location with offset
in find_offset() function, so that there is need for only one
call site of match_fragment() function.  There still isn't a
change in the logic of the program.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   49 +++++++++++++++++++++++++------------------------
 1 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 2c052f8..0a304ab 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1452,8 +1452,8 @@ static int find_offset(const char *buf, unsigned long size,
 		       const char *fragment, unsigned long fragsize,
 		       int line, int *lines)
 {
-	int i;
-	unsigned long start, backwards, forwards;
+	int i, no_more_backwards, no_more_forwards;
+	unsigned long start, backwards, forwards, try;
 
 	if (fragsize > size)
 		return -1;
@@ -1471,32 +1471,44 @@ static int find_offset(const char *buf, unsigned long size,
 		}
 	}
 
-	/* Exact line number? */
-	if (match_fragment(buf, size, start, fragment, fragsize))
-		return start;
-
 	/*
 	 * There's probably some smart way to do this, but I'll leave
 	 * that to the smart and beautiful people. I'm simple and stupid.
 	 */
 	backwards = start;
 	forwards = start;
+	try = start;
+
 	for (i = 0; ; i++) {
-		unsigned long try;
-		int n;
+		no_more_backwards = !backwards;
+		no_more_forwards = (forwards + fragsize > size);
+
+		if (match_fragment(buf, size, try, fragment, fragsize)) {
+			int shift = ((i+1) >> 1);
+			if (i & 1)
+				shift = -shift;
+			*lines = shift;
+			return try;
+		}
+
+	again:
+		if (no_more_backwards && no_more_forwards)
+			break;
 
-		/* "backward" */
 		if (i & 1) {
-			if (!backwards) {
-				if (forwards + fragsize > size)
-					break;
-				continue;
+			if (no_more_backwards) {
+				i++;
+				goto again;
 			}
 			do {
 				--backwards;
 			} while (backwards && buf[backwards-1] != '\n');
 			try = backwards;
 		} else {
+			if (no_more_forwards) {
+				i++;
+				goto again;
+			}
 			while (forwards + fragsize <= size) {
 				if (buf[forwards++] == '\n')
 					break;
@@ -1504,18 +1516,7 @@ static int find_offset(const char *buf, unsigned long size,
 			try = forwards;
 		}
 
-		if (!match_fragment(buf, size, try, fragment, fragsize))
-			continue;
-		n = (i >> 1)+1;
-		if (i & 1)
-			n = -n;
-		*lines = n;
-		return try;
 	}
-
-	/*
-	 * We should start searching forward and backward.
-	 */
 	return -1;
 }
 
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 03/13] builtin-apply.c: push match-beginning/end logic down
  2008-02-02 10:54   ` [PATCH 02/13] builtin-apply.c: restructure "offset" matching Junio C Hamano
@ 2008-02-02 10:54     ` Junio C Hamano
  2008-02-02 10:54       ` [PATCH 04/13] builtin-apply.c: make it more line oriented Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This moves the logic to force match at the beginning and/or at
the end of the buffer to the actual function that finds the
match from its caller.  This is a necessary preparation for the
next step to allow matching disregarding certain differences,
such as whitespace changes.

We probably could optimize this even more by taking advantage of
the fact that match_beginning and match_end forces the match to
be at an exact location (anchored at the beginning and/or the
end), but that's for another commit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   46 ++++++++++++++++++++++++++++++++--------------
 1 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 0a304ab..f84a405 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1439,18 +1439,36 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 
 static int match_fragment(const char *buf, unsigned long size,
 			  unsigned long try,
-			  const char *fragment, unsigned long fragsize)
+			  const char *fragment, unsigned long fragsize,
+			  int match_beginning, int match_end)
 {
-	if (try + fragsize > size)
+	if (match_beginning && try)
 		return 0;
-	if (memcmp(buf + try, fragment, fragsize))
-		return 0;
-	return 1;
+
+	/*
+	 * Do we have an exact match?  If we were told to match
+	 * at the end, size must be exactly at try+fragsize,
+	 * otherwise try+fragsize must be still within the preimage,
+	 * and either case, the old piece should match the preimage
+	 * exactly.
+	 */
+	if ((match_end
+	     ? (try + fragsize == size)
+	     : (try + fragsize <= size)) &&
+	    !memcmp(buf + try, fragment, fragsize))
+		return 1;
+
+	/*
+	 * NEEDSWORK: We can optionally match fuzzily here, but
+	 * that is for a later round.
+	 */
+	return 0;
 }
 
 static int find_offset(const char *buf, unsigned long size,
 		       const char *fragment, unsigned long fragsize,
-		       int line, int *lines)
+		       int line, int *lines,
+		       int match_beginning, int match_end)
 {
 	int i, no_more_backwards, no_more_forwards;
 	unsigned long start, backwards, forwards, try;
@@ -1483,7 +1501,8 @@ static int find_offset(const char *buf, unsigned long size,
 		no_more_backwards = !backwards;
 		no_more_forwards = (forwards + fragsize > size);
 
-		if (match_fragment(buf, size, try, fragment, fragsize)) {
+		if (match_fragment(buf, size, try, fragment, fragsize,
+				   match_beginning, match_end)) {
 			int shift = ((i+1) >> 1);
 			if (i & 1)
 				shift = -shift;
@@ -1765,17 +1784,16 @@ static int apply_one_fragment(struct strbuf *buf, struct fragment *frag,
 	pos = frag->newpos;
 	for (;;) {
 		offset = find_offset(buf->buf, buf->len,
-				     oldlines, oldsize, pos, &lines);
-		if (match_end && offset + oldsize != buf->len)
-			offset = -1;
-		if (match_beginning && offset)
-			offset = -1;
+				     oldlines, oldsize, pos, &lines,
+				     match_beginning, match_end);
 		if (offset >= 0) {
 			if (ws_error_action == correct_ws_error &&
-			    (buf->len - oldsize - offset == 0)) /* end of file? */
+			    (buf->len - oldsize - offset == 0))
+				/* end of file? */
 				newsize -= new_blank_lines_at_end;
 
-			/* Warn if it was necessary to reduce the number
+			/*
+			 * Warn if it was necessary to reduce the number
 			 * of context lines.
 			 */
 			if ((leading != frag->leading) ||
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 04/13] builtin-apply.c: make it more line oriented
  2008-02-02 10:54     ` [PATCH 03/13] builtin-apply.c: push match-beginning/end logic down Junio C Hamano
@ 2008-02-02 10:54       ` Junio C Hamano
  2008-02-02 10:54         ` [PATCH 05/13] builtin-apply.c: optimize match_beginning/end processing a bit Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This changes the way git-apply internally works to be more line
oriented.  The logic to find where the patch applies with offset
used to count line numbers by always counting LF from the
beginning of the buffer, but it is simplified because we count
the line length of the target file and the preimage snippet
upfront now.

The ultimate motivation is to allow applying patches
whose preimage context has whitespace corruption that has
already been corrected in the local copy.  For that purpose, we
introduce a table of line-hash that allows us to match lines
that differ only in whitespaces.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |  403 ++++++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 280 insertions(+), 123 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index f84a405..e046e87 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -161,6 +161,92 @@ struct patch {
 	struct patch *next;
 };
 
+/*
+ * A line in a file, len-bytes long (includes the terminating LF,
+ * except for an incomplete line at the end if the file ends with
+ * one), and its contents hashes to 'hash'.
+ */
+struct line {
+	size_t len;
+	unsigned hash : 24;
+	unsigned flag : 8;
+};
+
+/*
+ * This represents a "file", which is an array of "lines".
+ */
+struct image {
+	char *buf;
+	size_t len;
+	size_t nr;
+	struct line *line_allocated;
+	struct line *line;
+};
+
+static uint32_t hash_line(const char *cp, size_t len)
+{
+	size_t i;
+	uint32_t h;
+	for (i = 0, h = 0; i < len; i++) {
+		if (!isspace(cp[i])) {
+			h = h * 3 + (cp[i] & 0xff);
+		}
+	}
+	return h;
+}
+
+static void prepare_image(struct image *image, char *buf, size_t len,
+			  int prepare_linetable)
+{
+	const char *cp, *ep;
+	int n;
+
+	image->buf = buf;
+	image->len = len;
+
+	if (!prepare_linetable) {
+		image->line = NULL;
+		image->line_allocated = NULL;
+		image->nr = 0;
+		return;
+	}
+
+	ep = image->buf + image->len;
+
+	/* First count lines */
+	cp = image->buf;
+	n = 0;
+	while (cp < ep) {
+		cp = strchrnul(cp, '\n');
+		n++;
+		cp++;
+	}
+
+	image->line_allocated = xcalloc(n, sizeof(struct line));
+	image->line = image->line_allocated;
+	image->nr = n;
+	cp = image->buf;
+	n = 0;
+	while (cp < ep) {
+		const char *next;
+		for (next = cp; next < ep && *next != '\n'; next++)
+			;
+		if (next < ep)
+			next++;
+		image->line[n].len = next - cp;
+		image->line[n].hash = hash_line(cp, next - cp);
+		cp = next;
+		n++;
+	}
+}
+
+static void clear_image(struct image *image)
+{
+	free(image->buf);
+	image->buf = NULL;
+	image->len = 0;
+}
+
 static void say_patch_name(FILE *output, const char *pre,
 			   struct patch *patch, const char *post)
 {
@@ -1437,14 +1523,29 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 	}
 }
 
-static int match_fragment(const char *buf, unsigned long size,
+static int match_fragment(struct image *img,
+			  struct image *preimage,
+			  struct image *postimage,
 			  unsigned long try,
-			  const char *fragment, unsigned long fragsize,
+			  int try_lno,
 			  int match_beginning, int match_end)
 {
-	if (match_beginning && try)
+	int i;
+
+	if (preimage->nr + try_lno > img->nr)
+		return 0;
+
+	if (match_beginning && try_lno)
+		return 0;
+
+	if (match_end && preimage->nr + try_lno != img->nr)
 		return 0;
 
+	/* Quick hash check */
+	for (i = 0; i < preimage->nr; i++)
+		if (preimage->line[i].hash != img->line[try_lno + i].hash)
+			return 0;
+
 	/*
 	 * Do we have an exact match?  If we were told to match
 	 * at the end, size must be exactly at try+fragsize,
@@ -1453,9 +1554,9 @@ static int match_fragment(const char *buf, unsigned long size,
 	 * exactly.
 	 */
 	if ((match_end
-	     ? (try + fragsize == size)
-	     : (try + fragsize <= size)) &&
-	    !memcmp(buf + try, fragment, fragsize))
+	     ? (try + preimage->len == img->len)
+	     : (try + preimage->len <= img->len)) &&
+	    !memcmp(img->buf + try, preimage->buf, preimage->len))
 		return 1;
 
 	/*
@@ -1465,105 +1566,78 @@ static int match_fragment(const char *buf, unsigned long size,
 	return 0;
 }
 
-static int find_offset(const char *buf, unsigned long size,
-		       const char *fragment, unsigned long fragsize,
-		       int line, int *lines,
-		       int match_beginning, int match_end)
+static int find_pos(struct image *img,
+		    struct image *preimage,
+		    struct image *postimage,
+		    int line,
+		    int match_beginning, int match_end)
 {
-	int i, no_more_backwards, no_more_forwards;
-	unsigned long start, backwards, forwards, try;
+	int i;
+	unsigned long backwards, forwards, try;
+	int backwards_lno, forwards_lno, try_lno;
 
-	if (fragsize > size)
+	if (preimage->nr > img->nr)
 		return -1;
 
-	start = 0;
-	if (line > 1) {
-		unsigned long offset = 0;
-		i = line-1;
-		while (offset + fragsize <= size) {
-			if (buf[offset++] == '\n') {
-				start = offset;
-				if (!--i)
-					break;
-			}
-		}
-	}
+	try = 0;
+	for (i = 0; i < line; i++)
+		try += img->line[i].len;
 
 	/*
 	 * There's probably some smart way to do this, but I'll leave
 	 * that to the smart and beautiful people. I'm simple and stupid.
 	 */
-	backwards = start;
-	forwards = start;
-	try = start;
+	backwards = try;
+	backwards_lno = line;
+	forwards = try;
+	forwards_lno = line;
+	try_lno = line;
 
 	for (i = 0; ; i++) {
-		no_more_backwards = !backwards;
-		no_more_forwards = (forwards + fragsize > size);
-
-		if (match_fragment(buf, size, try, fragment, fragsize,
-				   match_beginning, match_end)) {
-			int shift = ((i+1) >> 1);
-			if (i & 1)
-				shift = -shift;
-			*lines = shift;
-			return try;
-		}
+		if (match_fragment(img, preimage, postimage,
+				   try, try_lno,
+				   match_beginning, match_end))
+			return try_lno;
 
 	again:
-		if (no_more_backwards && no_more_forwards)
+		if (backwards_lno == 0 && forwards_lno == img->nr)
 			break;
 
 		if (i & 1) {
-			if (no_more_backwards) {
+			if (backwards_lno == 0) {
 				i++;
 				goto again;
 			}
-			do {
-				--backwards;
-			} while (backwards && buf[backwards-1] != '\n');
+			backwards_lno--;
+			backwards -= img->line[backwards_lno].len;
 			try = backwards;
+			try_lno = backwards_lno;
 		} else {
-			if (no_more_forwards) {
+			if (forwards_lno == img->nr) {
 				i++;
 				goto again;
 			}
-			while (forwards + fragsize <= size) {
-				if (buf[forwards++] == '\n')
-					break;
-			}
+			forwards += img->line[forwards_lno].len;
+			forwards_lno++;
 			try = forwards;
+			try_lno = forwards_lno;
 		}
 
 	}
 	return -1;
 }
 
-static void remove_first_line(const char **rbuf, int *rsize)
+static void remove_first_line(struct image *img)
 {
-	const char *buf = *rbuf;
-	int size = *rsize;
-	unsigned long offset;
-	offset = 0;
-	while (offset <= size) {
-		if (buf[offset++] == '\n')
-			break;
-	}
-	*rsize = size - offset;
-	*rbuf = buf + offset;
+	img->buf += img->line[0].len;
+	img->len -= img->line[0].len;
+	img->line++;
+	img->nr--;
 }
 
-static void remove_last_line(const char **rbuf, int *rsize)
+static void remove_last_line(struct image *img)
 {
-	const char *buf = *rbuf;
-	int size = *rsize;
-	unsigned long offset;
-	offset = size - 1;
-	while (offset > 0) {
-		if (buf[--offset] == '\n')
-			break;
-	}
-	*rsize = offset + 1;
+	img->len -= img->line[--img->nr].len;
 }
 
 static int apply_line(char *output, const char *patch, int plen,
@@ -1668,19 +1742,75 @@ static int apply_line(char *output, const char *patch, int plen,
 	return output + plen - buf;
 }
 
-static int apply_one_fragment(struct strbuf *buf, struct fragment *frag,
+static void update_image(struct image *img,
+			 int applied_pos,
+			 struct image *preimage,
+			 struct image *postimage)
+{
+	/*
+	 * remove the copy of preimage at offset in img
+	 * and replace it with postimage
+	 */
+	int i, nr;
+	size_t remove_count, insert_count, applied_at = 0;
+	char *result;
+
+	for (i = 0; i < applied_pos; i++)
+		applied_at += img->line[i].len;
+
+	remove_count = 0;
+	for (i = 0; i < preimage->nr; i++)
+		remove_count += img->line[applied_pos + i].len;
+	insert_count = postimage->len;
+
+	/* Adjust the contents */
+	result = xmalloc(img->len + insert_count - remove_count + 1);
+	memcpy(result, img->buf, applied_at);
+	memcpy(result + applied_at, postimage->buf, postimage->len);
+	memcpy(result + applied_at + postimage->len,
+	       img->buf + (applied_at + remove_count),
+	       img->len - (applied_at + remove_count));
+	free(img->buf);
+	img->buf = result;
+	img->len += insert_count - remove_count;
+	result[img->len] = '\0';
+
+	/* Adjust the line table */
+	nr = img->nr + postimage->nr - preimage->nr;
+	if (preimage->nr < postimage->nr) {
+		/*
+		 * NOTE: this knows that we never call remove_first_line()
+		 * on anything other than pre/post image.
+		 */
+		img->line = xrealloc(img->line, nr * sizeof(*img->line));
+		img->line_allocated = img->line;
+	}
+	if (preimage->nr != postimage->nr)
+		memmove(img->line + applied_pos + postimage->nr,
+			img->line + applied_pos + preimage->nr,
+			(img->nr - (applied_pos + preimage->nr)) *
+			sizeof(*img->line));
+	memcpy(img->line + applied_pos,
+	       postimage->line,
+	       postimage->nr * sizeof(*img->line));
+	img->nr = nr;
+}
+
+static int apply_one_fragment(struct image *img, struct fragment *frag,
 			      int inaccurate_eof, unsigned ws_rule)
 {
 	int match_beginning, match_end;
 	const char *patch = frag->patch;
-	int offset, size = frag->size;
+	int size = frag->size;
 	char *old = xmalloc(size);
 	char *new = xmalloc(size);
-	const char *oldlines, *newlines;
+	char *oldlines, *newlines;
 	int oldsize = 0, newsize = 0;
 	int new_blank_lines_at_end = 0;
 	unsigned long leading, trailing;
-	int pos, lines;
+	int pos, applied_pos;
+	struct image preimage;
+	struct image postimage;
 
 	while (size > 0) {
 		char first;
@@ -1780,32 +1910,16 @@ static int apply_one_fragment(struct strbuf *buf, struct fragment *frag,
 		match_end = !trailing;
 	}
 
-	lines = 0;
-	pos = frag->newpos;
+	pos = frag->newpos ? (frag->newpos - 1) : 0;
+	prepare_image(&preimage, oldlines, oldsize, 1);
+	prepare_image(&postimage, newlines, newsize, 1);
 	for (;;) {
-		offset = find_offset(buf->buf, buf->len,
-				     oldlines, oldsize, pos, &lines,
-				     match_beginning, match_end);
-		if (offset >= 0) {
-			if (ws_error_action == correct_ws_error &&
-			    (buf->len - oldsize - offset == 0))
-				/* end of file? */
-				newsize -= new_blank_lines_at_end;
 
-			/*
-			 * Warn if it was necessary to reduce the number
-			 * of context lines.
-			 */
-			if ((leading != frag->leading) ||
-			    (trailing != frag->trailing))
-				fprintf(stderr, "Context reduced to (%ld/%ld)"
-					" to apply fragment at %d\n",
-					leading, trailing, pos + lines);
-
-			strbuf_splice(buf, offset, oldsize, newlines, newsize);
-			offset = 0;
+		applied_pos = find_pos(img, &preimage, &postimage,
+				       pos, match_beginning, match_end);
+
+		if (applied_pos >= 0)
 			break;
-		}
 
 		/* Am I at my context limits? */
 		if ((leading <= p_context) && (trailing <= p_context))
@@ -1814,33 +1928,63 @@ static int apply_one_fragment(struct strbuf *buf, struct fragment *frag,
 			match_beginning = match_end = 0;
 			continue;
 		}
+
 		/*
 		 * Reduce the number of context lines; reduce both
 		 * leading and trailing if they are equal otherwise
 		 * just reduce the larger context.
 		 */
 		if (leading >= trailing) {
-			remove_first_line(&oldlines, &oldsize);
-			remove_first_line(&newlines, &newsize);
+			remove_first_line(&preimage);
+			remove_first_line(&postimage);
 			pos--;
 			leading--;
 		}
 		if (trailing > leading) {
-			remove_last_line(&oldlines, &oldsize);
-			remove_last_line(&newlines, &newsize);
+			remove_last_line(&preimage);
+			remove_last_line(&postimage);
 			trailing--;
 		}
 	}
 
-	if (offset && apply_verbosely)
-		error("while searching for:\n%.*s", oldsize, oldlines);
+	if (applied_pos >= 0) {
+		if (ws_error_action == correct_ws_error &&
+		    new_blank_lines_at_end &&
+		    postimage.nr + applied_pos == img->nr) {
+			/*
+			 * If the patch application adds blank lines
+			 * at the end, and if the patch applies at the
+			 * end of the image, remove those added blank
+			 * lines.
+			 */
+			while (new_blank_lines_at_end--)
+				remove_last_line(&postimage);
+		}
+
+		/*
+		 * Warn if it was necessary to reduce the number
+		 * of context lines.
+		 */
+		if ((leading != frag->leading) ||
+		    (trailing != frag->trailing))
+			fprintf(stderr, "Context reduced to (%ld/%ld)"
+				" to apply fragment at %d\n",
+				leading, trailing, applied_pos+1);
+		update_image(img, applied_pos, &preimage, &postimage);
+	} else {
+		if (apply_verbosely)
+			error("while searching for:\n%.*s", oldsize, oldlines);
+	}
 
 	free(old);
 	free(new);
-	return offset;
+	free(preimage.line_allocated);
+	free(postimage.line_allocated);
+
+	return (applied_pos < 0);
 }
 
-static int apply_binary_fragment(struct strbuf *buf, struct patch *patch)
+static int apply_binary_fragment(struct image *img, struct patch *patch)
 {
 	struct fragment *fragment = patch->fragments;
 	unsigned long len;
@@ -1857,22 +2001,26 @@ static int apply_binary_fragment(struct strbuf *buf, struct patch *patch)
 	}
 	switch (fragment->binary_patch_method) {
 	case BINARY_DELTA_DEFLATED:
-		dst = patch_delta(buf->buf, buf->len, fragment->patch,
+		dst = patch_delta(img->buf, img->len, fragment->patch,
 				  fragment->size, &len);
 		if (!dst)
 			return -1;
-		/* XXX patch_delta NUL-terminates */
-		strbuf_attach(buf, dst, len, len + 1);
+		clear_image(img);
+		img->buf = dst;
+		img->len = len;
 		return 0;
 	case BINARY_LITERAL_DEFLATED:
-		strbuf_reset(buf);
-		strbuf_add(buf, fragment->patch, fragment->size);
+		clear_image(img);
+		img->len = fragment->size;
+		img->buf = xmalloc(img->len+1);
+		memcpy(img->buf, fragment->patch, img->len);
+		img->buf[img->len] = '\0';
 		return 0;
 	}
 	return -1;
 }
 
-static int apply_binary(struct strbuf *buf, struct patch *patch)
+static int apply_binary(struct image *img, struct patch *patch)
 {
 	const char *name = patch->old_name ? patch->old_name : patch->new_name;
 	unsigned char sha1[20];
@@ -1893,7 +2041,7 @@ static int apply_binary(struct strbuf *buf, struct patch *patch)
 		 * See if the old one matches what the patch
 		 * applies to.
 		 */
-		hash_sha1_file(buf->buf, buf->len, blob_type, sha1);
+		hash_sha1_file(img->buf, img->len, blob_type, sha1);
 		if (strcmp(sha1_to_hex(sha1), patch->old_sha1_prefix))
 			return error("the patch applies to '%s' (%s), "
 				     "which does not match the "
@@ -1902,14 +2050,14 @@ static int apply_binary(struct strbuf *buf, struct patch *patch)
 	}
 	else {
 		/* Otherwise, the old one must be empty. */
-		if (buf->len)
+		if (img->len)
 			return error("the patch applies to an empty "
 				     "'%s' but it is not empty", name);
 	}
 
 	get_sha1_hex(patch->new_sha1_prefix, sha1);
 	if (is_null_sha1(sha1)) {
-		strbuf_release(buf);
+		clear_image(img);
 		return 0; /* deletion patch */
 	}
 
@@ -1924,20 +2072,21 @@ static int apply_binary(struct strbuf *buf, struct patch *patch)
 			return error("the necessary postimage %s for "
 				     "'%s' cannot be read",
 				     patch->new_sha1_prefix, name);
-		/* XXX read_sha1_file NUL-terminates */
-		strbuf_attach(buf, result, size, size + 1);
+		clear_image(img);
+		img->buf = result;
+		img->len = size;
 	} else {
 		/*
 		 * We have verified buf matches the preimage;
 		 * apply the patch data to it, which is stored
 		 * in the patch->fragments->{patch,size}.
 		 */
-		if (apply_binary_fragment(buf, patch))
+		if (apply_binary_fragment(img, patch))
 			return error("binary patch does not apply to '%s'",
 				     name);
 
 		/* verify that the result matches */
-		hash_sha1_file(buf->buf, buf->len, blob_type, sha1);
+		hash_sha1_file(img->buf, img->len, blob_type, sha1);
 		if (strcmp(sha1_to_hex(sha1), patch->new_sha1_prefix))
 			return error("binary patch to '%s' creates incorrect result (expecting %s, got %s)",
 				name, patch->new_sha1_prefix, sha1_to_hex(sha1));
@@ -1946,7 +2095,7 @@ static int apply_binary(struct strbuf *buf, struct patch *patch)
 	return 0;
 }
 
-static int apply_fragments(struct strbuf *buf, struct patch *patch)
+static int apply_fragments(struct image *img, struct patch *patch)
 {
 	struct fragment *frag = patch->fragments;
 	const char *name = patch->old_name ? patch->old_name : patch->new_name;
@@ -1954,10 +2103,10 @@ static int apply_fragments(struct strbuf *buf, struct patch *patch)
 	unsigned inaccurate_eof = patch->inaccurate_eof;
 
 	if (patch->is_binary)
-		return apply_binary(buf, patch);
+		return apply_binary(img, patch);
 
 	while (frag) {
-		if (apply_one_fragment(buf, frag, inaccurate_eof, ws_rule)) {
+		if (apply_one_fragment(img, frag, inaccurate_eof, ws_rule)) {
 			error("patch failed: %s:%ld", name, frag->oldpos);
 			if (!apply_with_reject)
 				return -1;
@@ -1993,6 +2142,9 @@ static int read_file_or_gitlink(struct cache_entry *ce, struct strbuf *buf)
 static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *ce)
 {
 	struct strbuf buf;
+	struct image image;
+	size_t len;
+	char *img;
 
 	strbuf_init(&buf, 0);
 	if (cached) {
@@ -2015,9 +2167,14 @@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *
 		}
 	}
 
-	if (apply_fragments(&buf, patch) < 0)
+	img = strbuf_detach(&buf, &len);
+	prepare_image(&image, img, len, !patch->is_binary);
+
+	if (apply_fragments(&image, patch) < 0)
 		return -1; /* note with --reject this succeeds. */
-	patch->result = strbuf_detach(&buf, &patch->resultsize);
+	patch->result = image.buf;
+	patch->resultsize = image.len;
+	free(image.line_allocated);
 
 	if (0 < patch->is_delete && patch->resultsize)
 		return error("removal patch leaves file contents");
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 05/13] builtin-apply.c: optimize match_beginning/end processing a bit.
  2008-02-02 10:54       ` [PATCH 04/13] builtin-apply.c: make it more line oriented Junio C Hamano
@ 2008-02-02 10:54         ` Junio C Hamano
  2008-02-02 10:54           ` [PATCH 06/13] builtin-apply.c: mark common context lines in lineinfo structure Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

Wnen the caller knows the hunk needs to match at the beginning
or at the end, there is no point starting from the line number
that is found in the patch and trying match with increasing
offset.  The logic to find matching lines was made more line
oriented with the previous patch and this optimization is now
trivial.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index e046e87..dc650f1 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1579,6 +1579,16 @@ static int find_pos(struct image *img,
 	if (preimage->nr > img->nr)
 		return -1;
 
+	/*
+	 * If match_begining or match_end is specified, there is no
+	 * point starting from a wrong line that will never match and
+	 * wander around and wait for a match at the specified end.
+	 */
+	if (match_beginning)
+		line = 0;
+	else if (match_end)
+		line = img->nr - preimage->nr;
+
 	try = 0;
 	for (i = 0; i < line; i++)
 		try += img->line[i].len;
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 06/13] builtin-apply.c: mark common context lines in lineinfo structure.
  2008-02-02 10:54         ` [PATCH 05/13] builtin-apply.c: optimize match_beginning/end processing a bit Junio C Hamano
@ 2008-02-02 10:54           ` Junio C Hamano
  2008-02-02 10:54             ` [PATCH 07/13] builtin-apply.c: clean-up apply_one_fragment() Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This updates the way preimage and postimage in a patch hunk is
parsed and prepared for applying.  By looking at image->line[n].flag,
the code can tell if it is a common context line that is the
same between the preimage and the postimage.

This matters when we actually start applying a patch with
contexts that have whitespace breakages that have already been
fixed in the target file.
---
 builtin-apply.c |   57 ++++++++++++++++++++++++++++++------------------------
 1 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index dc650f1..acd84f9 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -170,6 +170,7 @@ struct line {
 	size_t len;
 	unsigned hash : 24;
 	unsigned flag : 8;
+#define LINE_COMMON     1
 };
 
 /*
@@ -179,6 +180,7 @@ struct image {
 	char *buf;
 	size_t len;
 	size_t nr;
+	size_t alloc;
 	struct line *line_allocated;
 	struct line *line;
 };
@@ -195,49 +197,39 @@ static uint32_t hash_line(const char *cp, size_t len)
 	return h;
 }
 
+static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag)
+{
+	ALLOC_GROW(img->line_allocated, img->nr + 1, img->alloc);
+	img->line_allocated[img->nr].len = len;
+	img->line_allocated[img->nr].hash = hash_line(bol, len);
+	img->line_allocated[img->nr].flag = flag;
+	img->nr++;
+}
+
 static void prepare_image(struct image *image, char *buf, size_t len,
 			  int prepare_linetable)
 {
 	const char *cp, *ep;
-	int n;
 
+	memset(image, 0, sizeof(*image));
 	image->buf = buf;
 	image->len = len;
 
-	if (!prepare_linetable) {
-		image->line = NULL;
-		image->line_allocated = NULL;
-		image->nr = 0;
+	if (!prepare_linetable)
 		return;
-	}
 
 	ep = image->buf + image->len;
-
-	/* First count lines */
 	cp = image->buf;
-	n = 0;
-	while (cp < ep) {
-		cp = strchrnul(cp, '\n');
-		n++;
-		cp++;
-	}
-
-	image->line_allocated = xcalloc(n, sizeof(struct line));
-	image->line = image->line_allocated;
-	image->nr = n;
-	cp = image->buf;
-	n = 0;
 	while (cp < ep) {
 		const char *next;
 		for (next = cp; next < ep && *next != '\n'; next++)
 			;
 		if (next < ep)
 			next++;
-		image->line[n].len = next - cp;
-		image->line[n].hash = hash_line(cp, next - cp);
+		add_line_info(image, cp, next - cp, 0);
 		cp = next;
-		n++;
 	}
+	image->line = image->line_allocated;
 }
 
 static void clear_image(struct image *image)
@@ -1822,6 +1814,9 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 	struct image preimage;
 	struct image postimage;
 
+	memset(&preimage, 0, sizeof(preimage));
+	memset(&postimage, 0, sizeof(postimage));
+
 	while (size > 0) {
 		char first;
 		int len = linelen(patch, size);
@@ -1857,10 +1852,14 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 				break;
 			old[oldsize++] = '\n';
 			new[newsize++] = '\n';
+			add_line_info(&preimage, "\n", 1, LINE_COMMON);
+			add_line_info(&postimage, "\n", 1, LINE_COMMON);
 			break;
 		case ' ':
 		case '-':
 			memcpy(old + oldsize, patch + 1, plen);
+			add_line_info(&preimage, old + oldsize, plen,
+				      (first == ' ' ? LINE_COMMON : 0));
 			oldsize += plen;
 			if (first == '-')
 				break;
@@ -1869,6 +1868,9 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 			if (first != '+' || !no_add) {
 				int added = apply_line(new + newsize, patch,
 						       plen, ws_rule);
+				add_line_info(&postimage, new + newsize, added,
+					      (first == '+' ? 0 : LINE_COMMON));
+
 				newsize += added;
 				if (first == '+' &&
 				    added == 1 && new[newsize-1] == '\n')
@@ -1921,8 +1923,13 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 	}
 
 	pos = frag->newpos ? (frag->newpos - 1) : 0;
-	prepare_image(&preimage, oldlines, oldsize, 1);
-	prepare_image(&postimage, newlines, newsize, 1);
+	preimage.buf = old;
+	preimage.len = oldsize;
+	postimage.buf = new;
+	postimage.len = newsize;
+	preimage.line = preimage.line_allocated;
+	postimage.line = postimage.line_allocated;
+
 	for (;;) {
 
 		applied_pos = find_pos(img, &preimage, &postimage,
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 07/13] builtin-apply.c: clean-up apply_one_fragment()
  2008-02-02 10:54           ` [PATCH 06/13] builtin-apply.c: mark common context lines in lineinfo structure Junio C Hamano
@ 2008-02-02 10:54             ` Junio C Hamano
  2008-02-02 10:54               ` [PATCH 08/13] builtin-apply.c: simplify calling site to apply_line() Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

We had two pointer variables pointing to the same buffer and an
integer variable used to index into its tail part that was
active (old, oldlines and oldsize for the preimage, and their
'new' counterparts for the postimage).

To help readability, use 'oldlines' as the allocated pointer,
and use 'old' as the pointer to the tail that advances while the
code builds up the contents in the buffer.  The size 'oldsize'
can be computed as (old-oldines).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   55 +++++++++++++++++++++++++++----------------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index acd84f9..7fb3305 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1804,10 +1804,7 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 	int match_beginning, match_end;
 	const char *patch = frag->patch;
 	int size = frag->size;
-	char *old = xmalloc(size);
-	char *new = xmalloc(size);
-	char *oldlines, *newlines;
-	int oldsize = 0, newsize = 0;
+	char *old, *new, *oldlines, *newlines;
 	int new_blank_lines_at_end = 0;
 	unsigned long leading, trailing;
 	int pos, applied_pos;
@@ -1816,7 +1813,11 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 
 	memset(&preimage, 0, sizeof(preimage));
 	memset(&postimage, 0, sizeof(postimage));
+	oldlines = xmalloc(size);
+	newlines = xmalloc(size);
 
+	old = oldlines;
+	new = newlines;
 	while (size > 0) {
 		char first;
 		int len = linelen(patch, size);
@@ -1833,7 +1834,7 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 		 * followed by "\ No newline", then we also remove the
 		 * last one (which is the newline, of course).
 		 */
-		plen = len-1;
+		plen = len - 1;
 		if (len < size && patch[len] == '\\')
 			plen--;
 		first = *patch;
@@ -1850,30 +1851,30 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 			if (plen < 0)
 				/* ... followed by '\No newline'; nothing */
 				break;
-			old[oldsize++] = '\n';
-			new[newsize++] = '\n';
+			*old++ = '\n';
+			*new++ = '\n';
 			add_line_info(&preimage, "\n", 1, LINE_COMMON);
 			add_line_info(&postimage, "\n", 1, LINE_COMMON);
 			break;
 		case ' ':
 		case '-':
-			memcpy(old + oldsize, patch + 1, plen);
-			add_line_info(&preimage, old + oldsize, plen,
+			memcpy(old, patch + 1, plen);
+			add_line_info(&preimage, old, plen,
 				      (first == ' ' ? LINE_COMMON : 0));
-			oldsize += plen;
+			old += plen;
 			if (first == '-')
 				break;
 		/* Fall-through for ' ' */
 		case '+':
 			if (first != '+' || !no_add) {
-				int added = apply_line(new + newsize, patch,
+				int added = apply_line(new, patch,
 						       plen, ws_rule);
-				add_line_info(&postimage, new + newsize, added,
+				add_line_info(&postimage, new, added,
 					      (first == '+' ? 0 : LINE_COMMON));
 
-				newsize += added;
+				new += added;
 				if (first == '+' &&
-				    added == 1 && new[newsize-1] == '\n')
+				    added == 1 && new[-1] == '\n')
 					added_blank_line = 1;
 			}
 			break;
@@ -1892,16 +1893,13 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 		patch += len;
 		size -= len;
 	}
-
 	if (inaccurate_eof &&
-	    oldsize > 0 && old[oldsize - 1] == '\n' &&
-	    newsize > 0 && new[newsize - 1] == '\n') {
-		oldsize--;
-		newsize--;
+	    old > oldlines && old[-1] == '\n' &&
+	    new > newlines && new[-1] == '\n') {
+		old--;
+		new--;
 	}
 
-	oldlines = old;
-	newlines = new;
 	leading = frag->leading;
 	trailing = frag->trailing;
 
@@ -1923,10 +1921,10 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 	}
 
 	pos = frag->newpos ? (frag->newpos - 1) : 0;
-	preimage.buf = old;
-	preimage.len = oldsize;
-	postimage.buf = new;
-	postimage.len = newsize;
+	preimage.buf = oldlines;
+	preimage.len = old - oldlines;
+	postimage.buf = newlines;
+	postimage.len = new - newlines;
 	preimage.line = preimage.line_allocated;
 	postimage.line = postimage.line_allocated;
 
@@ -1990,11 +1988,12 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 		update_image(img, applied_pos, &preimage, &postimage);
 	} else {
 		if (apply_verbosely)
-			error("while searching for:\n%.*s", oldsize, oldlines);
+			error("while searching for:\n%.*s",
+			      (int)(old - oldlines), oldlines);
 	}
 
-	free(old);
-	free(new);
+	free(oldlines);
+	free(newlines);
 	free(preimage.line_allocated);
 	free(postimage.line_allocated);
 
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 08/13] builtin-apply.c: simplify calling site to apply_line()
  2008-02-02 10:54             ` [PATCH 07/13] builtin-apply.c: clean-up apply_one_fragment() Junio C Hamano
@ 2008-02-02 10:54               ` Junio C Hamano
  2008-02-02 10:54                 ` [PATCH 09/13] builtin-apply.c: do not feed copy_wsfix() leading '+' Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

The function apply_line() changed its behaviour depending on the
ws_error_action, whitespace_error and if the input was a context.
Make its caller responsible for such checking so that we can convert
the function to copy the contents of line while fixing whitespace
breakage more easily.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   38 ++++++++++++++++++++------------------
 1 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 7fb3305..d0d008f 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1642,7 +1642,7 @@ static void remove_last_line(struct image *img)
 	img->len -= img->line[--img->nr].len;
 }
 
-static int apply_line(char *output, const char *patch, int plen,
+static int copy_wsfix(char *output, const char *patch, int plen,
 		      unsigned ws_rule)
 {
 	/*
@@ -1659,12 +1659,6 @@ static int apply_line(char *output, const char *patch, int plen,
 	int need_fix_leading_space = 0;
 	char *buf;
 
-	if ((ws_error_action != correct_ws_error) || !whitespace_error ||
-	    *patch != '+') {
-		memcpy(output, patch + 1, plen);
-		return plen;
-	}
-
 	/*
 	 * Strip trailing whitespace
 	 */
@@ -1821,7 +1815,7 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 	while (size > 0) {
 		char first;
 		int len = linelen(patch, size);
-		int plen;
+		int plen, added;
 		int added_blank_line = 0;
 
 		if (!len)
@@ -1866,17 +1860,25 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 				break;
 		/* Fall-through for ' ' */
 		case '+':
-			if (first != '+' || !no_add) {
-				int added = apply_line(new, patch,
-						       plen, ws_rule);
-				add_line_info(&postimage, new, added,
-					      (first == '+' ? 0 : LINE_COMMON));
-
-				new += added;
-				if (first == '+' &&
-				    added == 1 && new[-1] == '\n')
-					added_blank_line = 1;
+			/* --no-add does not add new lines */
+			if (first == '+' && no_add)
+				break;
+
+			if (first != '+' ||
+			    !whitespace_error ||
+			    ws_error_action != correct_ws_error) {
+				memcpy(new, patch + 1, plen);
+				added = plen;
+			}
+			else {
+				added = copy_wsfix(new, patch, plen, ws_rule);
 			}
+			add_line_info(&postimage, new, added,
+				      (first == '+' ? 0 : LINE_COMMON));
+			new += added;
+			if (first == '+' &&
+			    added == 1 && new[-1] == '\n')
+				added_blank_line = 1;
 			break;
 		case '@': case '\\':
 			/* Ignore it, we already handled it */
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 09/13] builtin-apply.c: do not feed copy_wsfix() leading '+'
  2008-02-02 10:54               ` [PATCH 08/13] builtin-apply.c: simplify calling site to apply_line() Junio C Hamano
@ 2008-02-02 10:54                 ` Junio C Hamano
  2008-02-02 10:54                   ` [PATCH 10/13] builtin-apply.c: move copy_wsfix() function a bit higher Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

The "patch" parameter used to include leading '+' of an added
line in the patch, and the array was treated as 1-based.  Make
it accept the contents of the line alone and simplify the code.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |   38 ++++++++++++++++++--------------------
 1 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index d0d008f..0bc33bd 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1646,16 +1646,15 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 		      unsigned ws_rule)
 {
 	/*
-	 * plen is number of bytes to be copied from patch,
-	 * starting at patch+1 (patch[0] is '+').  Typically
-	 * patch[plen] is '\n', unless this is the incomplete
-	 * last line.
+	 * plen is number of bytes to be copied from patch, starting
+	 * at patch.  Typically patch[plen-1] is '\n', unless this is
+	 * the incomplete last line.
 	 */
 	int i;
 	int add_nl_to_tail = 0;
 	int fixed = 0;
-	int last_tab_in_indent = 0;
-	int last_space_in_indent = 0;
+	int last_tab_in_indent = -1;
+	int last_space_in_indent = -1;
 	int need_fix_leading_space = 0;
 	char *buf;
 
@@ -1663,11 +1662,11 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 	 * Strip trailing whitespace
 	 */
 	if ((ws_rule & WS_TRAILING_SPACE) &&
-	    (1 < plen && isspace(patch[plen-1]))) {
-		if (patch[plen] == '\n')
+	    (2 < plen && isspace(patch[plen-2]))) {
+		if (patch[plen-1] == '\n')
 			add_nl_to_tail = 1;
 		plen--;
-		while (0 < plen && isspace(patch[plen]))
+		while (0 < plen && isspace(patch[plen-1]))
 			plen--;
 		fixed = 1;
 	}
@@ -1675,25 +1674,25 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 	/*
 	 * Check leading whitespaces (indent)
 	 */
-	for (i = 1; i < plen; i++) {
+	for (i = 0; i < plen; i++) {
 		char ch = patch[i];
 		if (ch == '\t') {
 			last_tab_in_indent = i;
 			if ((ws_rule & WS_SPACE_BEFORE_TAB) &&
-			    0 < last_space_in_indent)
+			    0 <= last_space_in_indent)
 			    need_fix_leading_space = 1;
 		} else if (ch == ' ') {
 			last_space_in_indent = i;
 			if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
 			    8 <= i - last_tab_in_indent)
 				need_fix_leading_space = 1;
-		}
-		else
+		} else
 			break;
 	}
 
 	buf = output;
 	if (need_fix_leading_space) {
+		/* Process indent ourselves */
 		int consecutive_spaces = 0;
 		int last = last_tab_in_indent + 1;
 
@@ -1706,10 +1705,10 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 		}
 
 		/*
-		 * between patch[1..last], strip the funny spaces,
+		 * between patch[0..last-1], strip the funny spaces,
 		 * updating them to tab as needed.
 		 */
-		for (i = 1; i < last; i++, plen--) {
+		for (i = 0; i < last; i++) {
 			char ch = patch[i];
 			if (ch != ' ') {
 				consecutive_spaces = 0;
@@ -1724,13 +1723,12 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 		}
 		while (0 < consecutive_spaces--)
 			*output++ = ' ';
+		plen -= last;
+		patch += last;
 		fixed = 1;
-		i = last;
 	}
-	else
-		i = 1;
 
-	memcpy(output, patch + i, plen);
+	memcpy(output, patch, plen);
 	if (add_nl_to_tail)
 		output[plen++] = '\n';
 	if (fixed)
@@ -1871,7 +1869,7 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 				added = plen;
 			}
 			else {
-				added = copy_wsfix(new, patch, plen, ws_rule);
+				added = copy_wsfix(new, patch + 1, plen, ws_rule);
 			}
 			add_line_info(&postimage, new, added,
 				      (first == '+' ? 0 : LINE_COMMON));
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 10/13] builtin-apply.c: move copy_wsfix() function a bit higher.
  2008-02-02 10:54                 ` [PATCH 09/13] builtin-apply.c: do not feed copy_wsfix() leading '+' Junio C Hamano
@ 2008-02-02 10:54                   ` Junio C Hamano
  2008-02-02 10:54                     ` [PATCH 11/13] builtin-apply.c: pass ws_rule down to match_fragment() Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

I'll be calling this from match_fragment() in later rounds.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |  188 +++++++++++++++++++++++++++---------------------------
 1 files changed, 94 insertions(+), 94 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 0bc33bd..2af625a 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1515,6 +1515,100 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 	}
 }
 
+static int copy_wsfix(char *output, const char *patch, int plen,
+		      unsigned ws_rule)
+{
+	/*
+	 * plen is number of bytes to be copied from patch, starting
+	 * at patch.  Typically patch[plen-1] is '\n', unless this is
+	 * the incomplete last line.
+	 */
+	int i;
+	int add_nl_to_tail = 0;
+	int fixed = 0;
+	int last_tab_in_indent = -1;
+	int last_space_in_indent = -1;
+	int need_fix_leading_space = 0;
+	char *buf;
+
+	/*
+	 * Strip trailing whitespace
+	 */
+	if ((ws_rule & WS_TRAILING_SPACE) &&
+	    (2 < plen && isspace(patch[plen-2]))) {
+		if (patch[plen-1] == '\n')
+			add_nl_to_tail = 1;
+		plen--;
+		while (0 < plen && isspace(patch[plen-1]))
+			plen--;
+		fixed = 1;
+	}
+
+	/*
+	 * Check leading whitespaces (indent)
+	 */
+	for (i = 0; i < plen; i++) {
+		char ch = patch[i];
+		if (ch == '\t') {
+			last_tab_in_indent = i;
+			if ((ws_rule & WS_SPACE_BEFORE_TAB) &&
+			    0 <= last_space_in_indent)
+			    need_fix_leading_space = 1;
+		} else if (ch == ' ') {
+			last_space_in_indent = i;
+			if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
+			    8 <= i - last_tab_in_indent)
+				need_fix_leading_space = 1;
+		} else
+			break;
+	}
+
+	buf = output;
+	if (need_fix_leading_space) {
+		/* Process indent ourselves */
+		int consecutive_spaces = 0;
+		int last = last_tab_in_indent + 1;
+
+		if (ws_rule & WS_INDENT_WITH_NON_TAB) {
+			/* have "last" point at one past the indent */
+			if (last_tab_in_indent < last_space_in_indent)
+				last = last_space_in_indent + 1;
+			else
+				last = last_tab_in_indent + 1;
+		}
+
+		/*
+		 * between patch[0..last-1], strip the funny spaces,
+		 * updating them to tab as needed.
+		 */
+		for (i = 0; i < last; i++) {
+			char ch = patch[i];
+			if (ch != ' ') {
+				consecutive_spaces = 0;
+				*output++ = ch;
+			} else {
+				consecutive_spaces++;
+				if (consecutive_spaces == 8) {
+					*output++ = '\t';
+					consecutive_spaces = 0;
+				}
+			}
+		}
+		while (0 < consecutive_spaces--)
+			*output++ = ' ';
+		plen -= last;
+		patch += last;
+		fixed = 1;
+	}
+
+	memcpy(output, patch, plen);
+	if (add_nl_to_tail)
+		output[plen++] = '\n';
+	if (fixed)
+		applied_after_fixing_ws++;
+	return output + plen - buf;
+}
+
 static int match_fragment(struct image *img,
 			  struct image *preimage,
 			  struct image *postimage,
@@ -1642,100 +1736,6 @@ static void remove_last_line(struct image *img)
 	img->len -= img->line[--img->nr].len;
 }
 
-static int copy_wsfix(char *output, const char *patch, int plen,
-		      unsigned ws_rule)
-{
-	/*
-	 * plen is number of bytes to be copied from patch, starting
-	 * at patch.  Typically patch[plen-1] is '\n', unless this is
-	 * the incomplete last line.
-	 */
-	int i;
-	int add_nl_to_tail = 0;
-	int fixed = 0;
-	int last_tab_in_indent = -1;
-	int last_space_in_indent = -1;
-	int need_fix_leading_space = 0;
-	char *buf;
-
-	/*
-	 * Strip trailing whitespace
-	 */
-	if ((ws_rule & WS_TRAILING_SPACE) &&
-	    (2 < plen && isspace(patch[plen-2]))) {
-		if (patch[plen-1] == '\n')
-			add_nl_to_tail = 1;
-		plen--;
-		while (0 < plen && isspace(patch[plen-1]))
-			plen--;
-		fixed = 1;
-	}
-
-	/*
-	 * Check leading whitespaces (indent)
-	 */
-	for (i = 0; i < plen; i++) {
-		char ch = patch[i];
-		if (ch == '\t') {
-			last_tab_in_indent = i;
-			if ((ws_rule & WS_SPACE_BEFORE_TAB) &&
-			    0 <= last_space_in_indent)
-			    need_fix_leading_space = 1;
-		} else if (ch == ' ') {
-			last_space_in_indent = i;
-			if ((ws_rule & WS_INDENT_WITH_NON_TAB) &&
-			    8 <= i - last_tab_in_indent)
-				need_fix_leading_space = 1;
-		} else
-			break;
-	}
-
-	buf = output;
-	if (need_fix_leading_space) {
-		/* Process indent ourselves */
-		int consecutive_spaces = 0;
-		int last = last_tab_in_indent + 1;
-
-		if (ws_rule & WS_INDENT_WITH_NON_TAB) {
-			/* have "last" point at one past the indent */
-			if (last_tab_in_indent < last_space_in_indent)
-				last = last_space_in_indent + 1;
-			else
-				last = last_tab_in_indent + 1;
-		}
-
-		/*
-		 * between patch[0..last-1], strip the funny spaces,
-		 * updating them to tab as needed.
-		 */
-		for (i = 0; i < last; i++) {
-			char ch = patch[i];
-			if (ch != ' ') {
-				consecutive_spaces = 0;
-				*output++ = ch;
-			} else {
-				consecutive_spaces++;
-				if (consecutive_spaces == 8) {
-					*output++ = '\t';
-					consecutive_spaces = 0;
-				}
-			}
-		}
-		while (0 < consecutive_spaces--)
-			*output++ = ' ';
-		plen -= last;
-		patch += last;
-		fixed = 1;
-	}
-
-	memcpy(output, patch, plen);
-	if (add_nl_to_tail)
-		output[plen++] = '\n';
-	if (fixed)
-		applied_after_fixing_ws++;
-	return output + plen - buf;
-}
-
 static void update_image(struct image *img,
 			 int applied_pos,
 			 struct image *preimage,
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 11/13] builtin-apply.c: pass ws_rule down to match_fragment()
  2008-02-02 10:54                   ` [PATCH 10/13] builtin-apply.c: move copy_wsfix() function a bit higher Junio C Hamano
@ 2008-02-02 10:54                     ` Junio C Hamano
  2008-02-02 10:54                       ` [PATCH 12/13] git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This is necessary to allow match_fragment() to attempt a match
with a preimage that is based on a version before whitespace
errors were fixed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 2af625a..5f3c047 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1614,6 +1614,7 @@ static int match_fragment(struct image *img,
 			  struct image *postimage,
 			  unsigned long try,
 			  int try_lno,
+			  unsigned ws_rule,
 			  int match_beginning, int match_end)
 {
 	int i;
@@ -1656,6 +1657,7 @@ static int find_pos(struct image *img,
 		    struct image *preimage,
 		    struct image *postimage,
 		    int line,
+		    unsigned ws_rule,
 		    int match_beginning, int match_end)
 {
 	int i;
@@ -1691,7 +1693,7 @@ static int find_pos(struct image *img,
 
 	for (i = 0; ; i++) {
 		if (match_fragment(img, preimage, postimage,
-				   try, try_lno,
+				   try, try_lno, ws_rule,
 				   match_beginning, match_end))
 			return try_lno;
 
@@ -1930,8 +1932,8 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 
 	for (;;) {
 
-		applied_pos = find_pos(img, &preimage, &postimage,
-				       pos, match_beginning, match_end);
+		applied_pos = find_pos(img, &preimage, &postimage, pos,
+				       ws_rule, match_beginning, match_end);
 
 		if (applied_pos >= 0)
 			break;
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 12/13] git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run
  2008-02-02 10:54                     ` [PATCH 11/13] builtin-apply.c: pass ws_rule down to match_fragment() Junio C Hamano
@ 2008-02-02 10:54                       ` Junio C Hamano
  2008-02-02 10:54                         ` [PATCH 13/13] core.whitespace: cr-at-eol Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

When you have more than one patch series, an earlier one of which
tries to introduce whitespace breakages and a later one of which
has such a new line in its context, "git-apply --whitespace=fix"
will apply and fix the whitespace breakages in the earlier one,
making the resulting file not to match the context of the later
patch.

A short demonstration is in the new test, t4125.

For example, suppose the first patch is:

    diff a/hello.txt b/hello.txt
    --- a/hello.txt
    +++ b/hello.txt
    @@ -20,3 +20,3 @@
     Hello world.$
    -How Are you$
    -Today?$
    +How are you $
    +today? $

to fix broken case in the string, but it introduces unwanted
trailing whitespaces to the result (pretend you are looking at
"cat -e" output of the patch --- '$' signs are not in the patch
but are shown to make the EOL stand out).  And the second patch
is to change the wording of the greeting further:

    diff a/hello.txt b/hello.txt
    --- a/hello.txt
    +++ b/hello.txt
    @@ -18,5 +18,5 @@
     Greetings $

    -Hello world.$
    +Hello, everybody. $
     How are you $
    -today? $
    +these days? $

If you apply the first one with --whitespace=fix, you will get
this as the result:

    Hello world.$
    How are you$
    today?$

and this does not match the preimage of the second patch, which
demands extra whitespace after "How are you" and "today?".

This series is about teaching "git apply --whitespace=fix" to
cope with this situation better.  If the patch does not apply,
it rewrites the second patch like this and retries:

    diff a/hello.txt b/hello.txt
    --- a/hello.txt
    +++ b/hello.txt
    @@ -18,5 +18,5 @@
     Greetings$

    -Hello world.$
    +Hello, everybody.$
     How are you$
    -today?$
    +these days?$

This is done by rewriting the preimage lines in the hunk
(i.e. the lines that begin with ' ' or '-'), using the same
whitespace fixing rules as it is using to apply the patches, so
that it can notice what it did to the previous ones in the
series.

A careful reader may notice that the first patch in the example
did not touch the "Greetings" line, so the trailing whitespace
that is in the original preimage of the second patch is not from
the series.  Is rewriting this context line a problem?

If you think about it, you will realize that the reason for the
difference is because the submitter's tree was based on an
earlier version of the file that had whitespaces wrong on that
"Greetings" line, and the change that introduced the "Greetings"
line was added independently of this two-patch series to our
tree already with an earlier "git apply --whitespace=fix".

So it may appear this logic is rewriting too much, it is not
so.  It is just rewriting what we would have rewritten in the
past.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c          |  133 ++++++++++++++++++++++++++++++++++++++++++++--
 t/t4125-apply-ws-fuzz.sh |  103 +++++++++++++++++++++++++++++++++++
 2 files changed, 231 insertions(+), 5 deletions(-)
 create mode 100755 t/t4125-apply-ws-fuzz.sh

diff --git a/builtin-apply.c b/builtin-apply.c
index 5f3c047..fccf4a4 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1516,7 +1516,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 }
 
 static int copy_wsfix(char *output, const char *patch, int plen,
-		      unsigned ws_rule)
+		      unsigned ws_rule, int count_error)
 {
 	/*
 	 * plen is number of bytes to be copied from patch, starting
@@ -1604,11 +1604,74 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 	memcpy(output, patch, plen);
 	if (add_nl_to_tail)
 		output[plen++] = '\n';
-	if (fixed)
+	if (fixed && count_error)
 		applied_after_fixing_ws++;
 	return output + plen - buf;
 }
 
+static void update_pre_post_images(struct image *preimage,
+				   struct image *postimage,
+				   char *buf,
+				   size_t len)
+{
+	int i, ctx;
+	char *new, *old, *fixed;
+	struct image fixed_preimage;
+
+	/*
+	 * Update the preimage with whitespace fixes.  Note that we
+	 * are not losing preimage->buf -- apply_one_fragment() will
+	 * free "oldlines".
+	 */
+	prepare_image(&fixed_preimage, buf, len, 1);
+	assert(fixed_preimage.nr == preimage->nr);
+	for (i = 0; i < preimage->nr; i++)
+		fixed_preimage.line[i].flag = preimage->line[i].flag;
+	free(preimage->line_allocated);
+	*preimage = fixed_preimage;
+
+	/*
+	 * Adjust the common context lines in postimage, in place.
+	 * This is possible because whitespace fixing does not make
+	 * the string grow.
+	 */
+	new = old = postimage->buf;
+	fixed = preimage->buf;
+	for (i = ctx = 0; i < postimage->nr; i++) {
+		size_t len = postimage->line[i].len;
+		if (!(postimage->line[i].flag & LINE_COMMON)) {
+			/* an added line -- no counterparts in preimage */
+			memmove(new, old, len);
+			old += len;
+			new += len;
+			continue;
+		}
+
+		/* a common context -- skip it in the original postimage */
+		old += len;
+
+		/* and find the corresponding one in the fixed preimage */
+		while (ctx < preimage->nr &&
+		       !(preimage->line[ctx].flag & LINE_COMMON)) {
+			fixed += preimage->line[ctx].len;
+			ctx++;
+		}
+		if (preimage->nr <= ctx)
+			die("oops");
+
+		/* and copy it in, while fixing the line length */
+		len = preimage->line[ctx].len;
+		memcpy(new, fixed, len);
+		new += len;
+		fixed += len;
+		postimage->line[i].len = len;
+		ctx++;
+	}
+
+	/* Fix the length of the whole thing */
+	postimage->len = new - postimage->buf;
+}
+
 static int match_fragment(struct image *img,
 			  struct image *preimage,
 			  struct image *postimage,
@@ -1618,6 +1681,7 @@ static int match_fragment(struct image *img,
 			  int match_beginning, int match_end)
 {
 	int i;
+	char *fixed_buf, *buf, *orig, *target;
 
 	if (preimage->nr + try_lno > img->nr)
 		return 0;
@@ -1646,10 +1710,68 @@ static int match_fragment(struct image *img,
 	    !memcmp(img->buf + try, preimage->buf, preimage->len))
 		return 1;
 
+	if (ws_error_action != correct_ws_error)
+		return 0;
+
+	/*
+	 * The hunk does not apply byte-by-byte, but the hash says
+	 * it might with whitespace fuzz.
+	 */
+	fixed_buf = xmalloc(preimage->len + 1);
+	buf = fixed_buf;
+	orig = preimage->buf;
+	target = img->buf + try;
+	for (i = 0; i < preimage->nr; i++) {
+		size_t fixlen; /* length after fixing the preimage */
+		size_t oldlen = preimage->line[i].len;
+		size_t tgtlen = img->line[try_lno + i].len;
+		size_t tgtfixlen; /* length after fixing the target line */
+		char tgtfixbuf[1024], *tgtfix;
+		int match;
+
+		/* Try fixing the line in the preimage */
+		fixlen = copy_wsfix(buf, orig, oldlen, ws_rule, 0);
+
+		/* Try fixing the line in the target */
+		if (sizeof(tgtfixbuf) < tgtlen)
+			tgtfix = tgtfixbuf;
+		else
+			tgtfix = xmalloc(tgtlen);
+		tgtfixlen = copy_wsfix(tgtfix, target, tgtlen, ws_rule, 0);
+
+		/*
+		 * If they match, either the preimage was based on
+		 * a version before our tree fixed whitespace breakage,
+		 * or we are lacking a whitespace-fix patch the tree
+		 * the preimage was based on already had (i.e. target
+		 * has whitespace breakage, the preimage doesn't).
+		 * In either case, we are fixing the whitespace breakages
+		 * so we might as well take the fix together with their
+		 * real change.
+		 */
+		match = (tgtfixlen == fixlen && !memcmp(tgtfix, buf, fixlen));
+
+		if (tgtfix != tgtfixbuf)
+			free(tgtfix);
+		if (!match)
+			goto unmatch_exit;
+
+		orig += oldlen;
+		buf += fixlen;
+		target += tgtlen;
+	}
+
 	/*
-	 * NEEDSWORK: We can optionally match fuzzily here, but
-	 * that is for a later round.
+	 * Yes, the preimage is based on an older version that still
+	 * has whitespace breakages unfixed, and fixing them makes the
+	 * hunk match.  Update the context lines in the postimage.
 	 */
+	update_pre_post_images(preimage, postimage,
+			       fixed_buf, buf - fixed_buf);
+	return 1;
+
+ unmatch_exit:
+	free(fixed_buf);
 	return 0;
 }
 
@@ -1871,7 +1993,8 @@ static int apply_one_fragment(struct image *img, struct fragment *frag,
 				added = plen;
 			}
 			else {
-				added = copy_wsfix(new, patch + 1, plen, ws_rule);
+				added = copy_wsfix(new, patch + 1, plen,
+						   ws_rule, 1);
 			}
 			add_line_info(&postimage, new, added,
 				      (first == '+' ? 0 : LINE_COMMON));
diff --git a/t/t4125-apply-ws-fuzz.sh b/t/t4125-apply-ws-fuzz.sh
new file mode 100755
index 0000000..d6f15be
--- /dev/null
+++ b/t/t4125-apply-ws-fuzz.sh
@@ -0,0 +1,103 @@
+#!/bin/sh
+
+test_description='applying patch that has broken whitespaces in context'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+
+	>file &&
+	git add file &&
+
+	# file-0 is full of whitespace breakages
+	for l in a bb c d eeee f ggg h
+	do
+		echo "$l "
+	done >file-0 &&
+
+	# patch-0 creates a whitespace broken file
+	cat file-0 >file &&
+	git diff >patch-0 &&
+	git add file &&
+
+	# file-1 is still full of whitespace breakages,
+	# but has one line updated, without fixing any
+	# whitespaces.
+	# patch-1 records that change.
+	sed -e "s/d/D/" file-0 >file-1 &&
+	cat file-1 >file &&
+	git diff >patch-1 &&
+
+	# patch-all is the effect of both patch-0 and patch-1
+	>file &&
+	git add file &&
+	cat file-1 >file &&
+	git diff >patch-all &&
+
+	# patch-2 is the same as patch-1 but is based
+	# on a version that already has whitespace fixed,
+	# and does not introduce whitespace breakages.
+	sed -e "s/ $//" patch-1 >patch-2 &&
+
+	# If all whitespace breakages are fixed the contents
+	# should look like file-fixed
+	sed -e "s/ $//" file-1 >file-fixed
+
+'
+
+test_expect_success nofix '
+
+	>file &&
+	git add file &&
+
+	# Baseline.  Applying without fixing any whitespace
+	# breakages.
+	git apply --whitespace=nowarn patch-0 &&
+	git apply --whitespace=nowarn patch-1 &&
+
+	# The result should obviously match.
+	diff -u file-1 file
+'
+
+test_expect_success 'withfix (forward)' '
+
+	>file &&
+	git add file &&
+
+	# The first application will munge the context lines
+	# the second patch depends on.  We should be able to
+	# adjust and still apply.
+	git apply --whitespace=fix patch-0 &&
+	git apply --whitespace=fix patch-1 &&
+
+	diff -u file-fixed file
+'
+
+test_expect_success 'withfix (backward)' '
+
+	>file &&
+	git add file &&
+
+	# Now we have a whitespace breakages on our side.
+	git apply --whitespace=nowarn patch-0 &&
+
+	# And somebody sends in a patch based on image
+	# with whitespace already fixed.
+	git apply --whitespace=fix patch-2 &&
+
+	# The result should accept the whitespace fixed
+	# postimage.  But the line with "h" is beyond context
+	# horizon and left unfixed.
+
+	sed -e /h/d file-fixed >fixed-head &&
+	sed -e /h/d file >file-head &&
+	diff -u fixed-head file-head &&
+
+	sed -n -e /h/p file-fixed >fixed-tail &&
+	sed -n -e /h/p file >file-tail &&
+
+	! diff -u fixed-tail file-tail
+
+'
+
+test_done
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 13/13] core.whitespace: cr-at-eol
  2008-02-02 10:54                       ` [PATCH 12/13] git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run Junio C Hamano
@ 2008-02-02 10:54                         ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2008-02-02 10:54 UTC (permalink / raw)
  To: git

This new error mode allows a line to have a carriage return at the
end of the line when checking and fixing trailing whitespace errors.

Some people like to keep CRLF line ending recorded in the repository,
and still want to take advantage of the automated trailing whitespace
stripping.  We still show ^M in the diff output piped to "less" to
remind them that they do have the CR at the end, but these carriage
return characters at the end are no longer flagged as errors.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |    4 ++++
 builtin-apply.c          |   18 ++++++++++++++----
 cache.h                  |    1 +
 t/t4019-diff-wserror.sh  |   40 ++++++++++++++++++++++++++++++++++++++++
 ws.c                     |   15 +++++++++++++--
 5 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4e222f1..44cb640 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -308,6 +308,10 @@ core.whitespace::
   error (enabled by default).
 * `indent-with-non-tab` treats a line that is indented with 8 or more
   space characters as an error (not enabled by default).
+* `cr-at-eol` treats a carriage-return at the end of line as
+  part of the line terminator, i.e. with it, `trailing-space`
+  does not trigger if the character before such a carriage-return
+  is not a whitespace (not enabled by default).
 
 alias.*::
 	Command aliases for the linkgit:git[1] command wrapper - e.g.
diff --git a/builtin-apply.c b/builtin-apply.c
index fccf4a4..2b8ba81 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1525,6 +1525,7 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 	 */
 	int i;
 	int add_nl_to_tail = 0;
+	int add_cr_to_tail = 0;
 	int fixed = 0;
 	int last_tab_in_indent = -1;
 	int last_space_in_indent = -1;
@@ -1536,12 +1537,19 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 	 */
 	if ((ws_rule & WS_TRAILING_SPACE) &&
 	    (2 < plen && isspace(patch[plen-2]))) {
-		if (patch[plen-1] == '\n')
+		if (patch[plen - 1] == '\n') {
 			add_nl_to_tail = 1;
-		plen--;
-		while (0 < plen && isspace(patch[plen-1]))
 			plen--;
-		fixed = 1;
+			if (1 < plen && patch[plen - 1] == '\r') {
+				add_cr_to_tail = !!(ws_rule & WS_CR_AT_EOL);
+				plen--;
+			}
+		}
+		if (0 < plen && isspace(patch[plen - 1])) {
+			while (0 < plen && isspace(patch[plen-1]))
+				plen--;
+			fixed = 1;
+		}
 	}
 
 	/*
@@ -1602,6 +1610,8 @@ static int copy_wsfix(char *output, const char *patch, int plen,
 	}
 
 	memcpy(output, patch, plen);
+	if (add_cr_to_tail)
+		output[plen++] = '\r';
 	if (add_nl_to_tail)
 		output[plen++] = '\n';
 	if (fixed && count_error)
diff --git a/cache.h b/cache.h
index 549f4bb..ad11c90 100644
--- a/cache.h
+++ b/cache.h
@@ -652,6 +652,7 @@ void shift_tree(const unsigned char *, const unsigned char *, unsigned char *, i
 #define WS_TRAILING_SPACE	01
 #define WS_SPACE_BEFORE_TAB	02
 #define WS_INDENT_WITH_NON_TAB	04
+#define WS_CR_AT_EOL           010
 #define WS_DEFAULT_RULE (WS_TRAILING_SPACE|WS_SPACE_BEFORE_TAB)
 extern unsigned whitespace_rule_cfg;
 extern unsigned whitespace_rule(const char *);
diff --git a/t/t4019-diff-wserror.sh b/t/t4019-diff-wserror.sh
index 67e080b..0d9cbb6 100755
--- a/t/t4019-diff-wserror.sh
+++ b/t/t4019-diff-wserror.sh
@@ -12,6 +12,7 @@ test_expect_success setup '
 	echo "         Eight SP indent" >>F &&
 	echo " 	HT and SP indent" >>F &&
 	echo "With trailing SP " >>F &&
+	echo "Carriage ReturnQ" | tr Q "\015" >>F &&
 	echo "No problem" >>F
 
 '
@@ -27,6 +28,7 @@ test_expect_success default '
 	grep Eight normal >/dev/null &&
 	grep HT error >/dev/null &&
 	grep With error >/dev/null &&
+	grep Return error >/dev/null &&
 	grep No normal >/dev/null
 
 '
@@ -41,6 +43,7 @@ test_expect_success 'without -trail' '
 	grep Eight normal >/dev/null &&
 	grep HT error >/dev/null &&
 	grep With normal >/dev/null &&
+	grep Return normal >/dev/null &&
 	grep No normal >/dev/null
 
 '
@@ -56,6 +59,7 @@ test_expect_success 'without -trail (attribute)' '
 	grep Eight normal >/dev/null &&
 	grep HT error >/dev/null &&
 	grep With normal >/dev/null &&
+	grep Return normal >/dev/null &&
 	grep No normal >/dev/null
 
 '
@@ -71,6 +75,7 @@ test_expect_success 'without -space' '
 	grep Eight normal >/dev/null &&
 	grep HT normal >/dev/null &&
 	grep With error >/dev/null &&
+	grep Return error >/dev/null &&
 	grep No normal >/dev/null
 
 '
@@ -86,6 +91,7 @@ test_expect_success 'without -space (attribute)' '
 	grep Eight normal >/dev/null &&
 	grep HT normal >/dev/null &&
 	grep With error >/dev/null &&
+	grep Return error >/dev/null &&
 	grep No normal >/dev/null
 
 '
@@ -101,6 +107,7 @@ test_expect_success 'with indent-non-tab only' '
 	grep Eight error >/dev/null &&
 	grep HT normal >/dev/null &&
 	grep With normal >/dev/null &&
+	grep Return normal >/dev/null &&
 	grep No normal >/dev/null
 
 '
@@ -116,6 +123,39 @@ test_expect_success 'with indent-non-tab only (attribute)' '
 	grep Eight error >/dev/null &&
 	grep HT normal >/dev/null &&
 	grep With normal >/dev/null &&
+	grep Return normal >/dev/null &&
+	grep No normal >/dev/null
+
+'
+
+test_expect_success 'with cr-at-eol' '
+
+	rm -f .gitattributes
+	git config core.whitespace cr-at-eol
+	git diff --color >output
+	grep "$blue_grep" output >error
+	grep -v "$blue_grep" output >normal
+
+	grep Eight normal >/dev/null &&
+	grep HT error >/dev/null &&
+	grep With error >/dev/null &&
+	grep Return normal >/dev/null &&
+	grep No normal >/dev/null
+
+'
+
+test_expect_success 'with cr-at-eol (attribute)' '
+
+	git config --unset core.whitespace
+	echo "F whitespace=trailing,cr-at-eol" >.gitattributes
+	git diff --color >output
+	grep "$blue_grep" output >error
+	grep -v "$blue_grep" output >normal
+
+	grep Eight normal >/dev/null &&
+	grep HT error >/dev/null &&
+	grep With error >/dev/null &&
+	grep Return normal >/dev/null &&
 	grep No normal >/dev/null
 
 '
diff --git a/ws.c b/ws.c
index d09b9df..5a9ac45 100644
--- a/ws.c
+++ b/ws.c
@@ -14,6 +14,7 @@ static struct whitespace_rule {
 	{ "trailing-space", WS_TRAILING_SPACE },
 	{ "space-before-tab", WS_SPACE_BEFORE_TAB },
 	{ "indent-with-non-tab", WS_INDENT_WITH_NON_TAB },
+	{ "cr-at-eol", WS_CR_AT_EOL },
 };
 
 unsigned parse_whitespace_rule(const char *string)
@@ -124,6 +125,7 @@ unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
 	int written = 0;
 	int trailing_whitespace = -1;
 	int trailing_newline = 0;
+	int trailing_carriage_return = 0;
 	int i;
 
 	/* Logic is simpler if we temporarily ignore the trailing newline. */
@@ -131,6 +133,11 @@ unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
 		trailing_newline = 1;
 		len--;
 	}
+	if ((ws_rule & WS_CR_AT_EOL) &&
+	    len > 0 && line[len - 1] == '\r') {
+		trailing_carriage_return = 1;
+		len--;
+	}
 
 	/* Check for trailing whitespace. */
 	if (ws_rule & WS_TRAILING_SPACE) {
@@ -176,8 +183,10 @@ unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
 	}
 
 	if (stream) {
-		/* Now the rest of the line starts at written.
-		 * The non-highlighted part ends at trailing_whitespace. */
+		/*
+		 * Now the rest of the line starts at "written".
+		 * The non-highlighted part ends at "trailing_whitespace".
+		 */
 		if (trailing_whitespace == -1)
 			trailing_whitespace = len;
 
@@ -196,6 +205,8 @@ unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,
 			    len - trailing_whitespace, 1, stream);
 			fputs(reset, stream);
 		}
+		if (trailing_carriage_return)
+			fputc('\r', stream);
 		if (trailing_newline)
 			fputc('\n', stream);
 	}
-- 
1.5.4.2.g41ac4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2008-02-02 10:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-02-02 10:54 [PATCH/RFC 00/13] git-apply --whitespace=fix updates Junio C Hamano
2008-02-02 10:54 ` [PATCH 01/13] builtin-apply.c: refactor small part that matches context Junio C Hamano
2008-02-02 10:54   ` [PATCH 02/13] builtin-apply.c: restructure "offset" matching Junio C Hamano
2008-02-02 10:54     ` [PATCH 03/13] builtin-apply.c: push match-beginning/end logic down Junio C Hamano
2008-02-02 10:54       ` [PATCH 04/13] builtin-apply.c: make it more line oriented Junio C Hamano
2008-02-02 10:54         ` [PATCH 05/13] builtin-apply.c: optimize match_beginning/end processing a bit Junio C Hamano
2008-02-02 10:54           ` [PATCH 06/13] builtin-apply.c: mark common context lines in lineinfo structure Junio C Hamano
2008-02-02 10:54             ` [PATCH 07/13] builtin-apply.c: clean-up apply_one_fragment() Junio C Hamano
2008-02-02 10:54               ` [PATCH 08/13] builtin-apply.c: simplify calling site to apply_line() Junio C Hamano
2008-02-02 10:54                 ` [PATCH 09/13] builtin-apply.c: do not feed copy_wsfix() leading '+' Junio C Hamano
2008-02-02 10:54                   ` [PATCH 10/13] builtin-apply.c: move copy_wsfix() function a bit higher Junio C Hamano
2008-02-02 10:54                     ` [PATCH 11/13] builtin-apply.c: pass ws_rule down to match_fragment() Junio C Hamano
2008-02-02 10:54                       ` [PATCH 12/13] git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run Junio C Hamano
2008-02-02 10:54                         ` [PATCH 13/13] core.whitespace: cr-at-eol Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).