git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/8] Attribute and conversion patches
@ 2010-04-06 12:46 Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 1/8] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

These are some patches that are needed to get the support for
attributes and especially the 'ident' attribute to a useable
state.

Since last time the 'ident'-specific patch "Inhibit contraction of
foreign $Id$ during stats." is gone, and replaced with the generic
"Filter files that have changed only due to conversion changes.".

Henrik Grubbström (Grubba) (8):
  convert: Safer handling of $Id$ contraction.
  convert: Keep foreign $Id$ on checkout.
  status: Added missing calls to diff_unmodified_pair() in
    format_callbacks.
  diff: Filter files that have changed only due to conversion changes.
  convert: Added core.refilteronadd feature.
  attr: Fixed debug output for macro expansion.
  attr: Allow multiple changes to an attribute on the same line.
  attr: Expand macros immediately when encountered.

 Documentation/config.txt |   12 +++++++
 attr.c                   |   38 ++++++++++++++--------
 cache.h                  |    2 +
 config.c                 |   10 ++++++
 convert.c                |   28 ++++++++++++++++-
 diff.c                   |   42 +++++++++++++++++++++++++
 environment.c            |    2 +
 sha1_file.c              |   57 ++++++++++++++++++++++++++++++++++
 t/t0003-attributes.sh    |   15 +++++++++
 t/t0021-conversion.sh    |   76 ++++++++++++++++++++++++++++++++++++++++++----
 wt-status.c              |    4 ++
 11 files changed, 264 insertions(+), 22 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 1/8] convert: Safer handling of $Id$ contraction.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 2/8] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
of source text if the terminating $ was lost. It now refuses to
contract $Id:xxxxx$ strings spanning multiple lines.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
The behaviour implemented by the patch is in line with what other
VCSes that implement $Id$ do.

 convert.c             |   12 ++++++++++++
 t/t0021-conversion.sh |   16 ++++++++++------
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index 4f8fcb7..239fa0a 100644
--- a/convert.c
+++ b/convert.c
@@ -425,6 +425,8 @@ static int count_ident(const char *cp, unsigned long size)
 				cnt++;
 				break;
 			}
+			if (ch == '\n')
+				break;
 		}
 	}
 	return cnt;
@@ -455,6 +457,11 @@ static int ident_to_git(const char *path, const char *src, size_t len,
 			dollar = memchr(src + 3, '$', len - 3);
 			if (!dollar)
 				break;
+			if (memchr(src + 3, '\n', dollar - src - 3)) {
+				/* Line break before the next dollar. */
+				continue;
+			}
+
 			memcpy(dst, "Id$", 3);
 			dst += 3;
 			len -= dollar + 1 - src;
@@ -514,6 +521,11 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 				break;
 			}
 
+			if (memchr(src + 3, '\n', dollar - src - 3)) {
+				/* Line break before the next dollar. */
+				continue;
+			}
+
 			len -= dollar + 1 - src;
 			src  = dollar + 1;
 		} else {
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 6cb8d60..29438c5 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -65,17 +65,21 @@ test_expect_success expanded_in_repo '
 		echo "\$Id:NoSpaceAtFront \$"
 		echo "\$Id:NoSpaceAtEitherEnd\$"
 		echo "\$Id: NoTerminatingSymbol"
+		echo "\$Id: Foreign Commit With Spaces \$"
+		echo "\$Id: NoTerminatingSymbolAtEOF"
 	} > expanded-keywords &&
 
 	{
 		echo "File with expanded keywords"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
 		echo "\$Id: NoTerminatingSymbol"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: NoTerminatingSymbolAtEOF"
 	} > expected-output &&
 
 	git add expanded-keywords &&
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 2/8] convert: Keep foreign $Id$ on checkout.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 1/8] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks Henrik Grubbström (Grubba)
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

If there are foreign $Id$ keywords in the repository, they are most
likely there for a reason. Let's keep them on checkout (which is also
what the documentation indicates). Foreign $Id$ keywords are now
recognized by there being multiple space separated fields in $Id:xxxxx$.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
The typical use case is for repositories that have been converted
from some other VCS, where it is desirable to keep the old identifiers
around until there's some other reason to alter the file.

Note that the comment about expanded Ids in the repository
is obsoleted by the core.refilterOnDiff patch.

 convert.c             |   16 ++++++++++++++--
 t/t0021-conversion.sh |    2 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/convert.c b/convert.c
index 239fa0a..5a0b7fb 100644
--- a/convert.c
+++ b/convert.c
@@ -477,7 +477,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
                              struct strbuf *buf, int ident)
 {
 	unsigned char sha1[20];
-	char *to_free = NULL, *dollar;
+	char *to_free = NULL, *dollar, *spc;
 	int cnt;
 
 	if (!ident)
@@ -513,7 +513,10 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 		} else if (src[2] == ':') {
 			/*
 			 * It's possible that an expanded Id has crept its way into the
-			 * repository, we cope with that by stripping the expansion out
+			 * repository, we cope with that by stripping the expansion out.
+			 * This is probably not a good idea, since it will cause changes
+			 * on checkout, which won't go away by stash, but let's keep it
+			 * for git-style ids.
 			 */
 			dollar = memchr(src + 3, '$', len - 3);
 			if (!dollar) {
@@ -526,6 +529,15 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 				continue;
 			}
 
+			spc = memchr(src + 4, ' ', dollar - src - 4);
+			if (spc && spc < dollar-1) {
+				/* There are spaces in unexpected places.
+				 * This is probably an id from some other
+				 * versioning system. Keep it for now.
+				 */
+				continue;
+			}
+
 			len -= dollar + 1 - src;
 			src  = dollar + 1;
 		} else {
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 29438c5..828e35b 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -78,7 +78,7 @@ test_expect_success expanded_in_repo '
 		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
 		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
 		echo "\$Id: NoTerminatingSymbol"
-		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: Foreign Commit With Spaces \$"
 		echo "\$Id: NoTerminatingSymbolAtEOF"
 	} > expected-output &&
 
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 1/8] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 2/8] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-10 22:31   ` Junio C Hamano
  2010-04-10 22:32   ` Junio C Hamano
  2010-04-06 12:46 ` [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes Henrik Grubbström (Grubba)
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

The diff_queue_struct provided by diff_flush() is raw, and needs to be
filtered through diff_unmodified_pair() before being used.
This is already done by most of the other functions operating on
diff_queue_struct called by diff_flush().

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
For diff_modified_pair() to be able to do its job, it needs
to be called...

 wt-status.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 8ca59a2..e661225 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -229,6 +229,8 @@ static void wt_status_collect_changed_cb(struct diff_queue_struct *q,
 		struct wt_status_change_data *d;
 
 		p = q->queue[i];
+		if (diff_unmodified_pair(p))
+			continue;
 		it = string_list_insert(p->one->path, &s->change);
 		d = it->util;
 		if (!d) {
@@ -276,6 +278,8 @@ static void wt_status_collect_updated_cb(struct diff_queue_struct *q,
 		struct wt_status_change_data *d;
 
 		p = q->queue[i];
+		if (diff_unmodified_pair(p))
+			continue;
 		it = string_list_insert(p->two->path, &s->change);
 		d = it->util;
 		if (!d) {
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
                   ` (2 preceding siblings ...)
  2010-04-06 12:46 ` [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-10 22:37   ` Junio C Hamano
  2010-04-06 12:46 ` [PATCH v4 5/8] convert: Added core.refilteronadd feature Henrik Grubbström (Grubba)
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

When the conversion filter for a file is changed, files may get listed
as modified even though the user has not made any changes to them.
This patch adds a configuration option 'core.refilterOnDiff', which
performs an extra renormalization pass to filter out such files.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
The typical reason to enable this option is when you have lots of files
that have been affected by a configuration change (eg crlf convention
or ident expansion), but don't want to recommit the otherwise unchanged
files just to get them on canonic form in the repository.

 Documentation/config.txt |    6 ++++++
 cache.h                  |    1 +
 config.c                 |    5 +++++
 diff.c                   |   42 ++++++++++++++++++++++++++++++++++++++++++
 environment.c            |    1 +
 t/t0021-conversion.sh    |   25 +++++++++++++++++++++++++
 6 files changed, 80 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 06b2f82..4eb3ab3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -535,6 +535,12 @@ core.sparseCheckout::
 	Enable "sparse checkout" feature. See section "Sparse checkout" in
 	linkgit:git-read-tree[1] for more information.
 
+core.refilterOnDiff::
+	Enable "refilter on diff" feature. This causes source files that
+	have only changed from the committed version as a side effect of
+	a conversion filter change to be filtered from the output of eg
+	linkgit:git-status[1] and linkgit:git-diff[1].
+
 add.ignore-errors::
 	Tells 'git add' to continue adding files when some files cannot be
 	added due to indexing errors. Equivalent to the '--ignore-errors'
diff --git a/cache.h b/cache.h
index 6dcb100..cd2bca4 100644
--- a/cache.h
+++ b/cache.h
@@ -552,6 +552,7 @@ extern int read_replace_refs;
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
+extern int core_refilter_on_diff;
 
 enum safe_crlf {
 	SAFE_CRLF_FALSE = 0,
diff --git a/config.c b/config.c
index 6963fbe..4954797 100644
--- a/config.c
+++ b/config.c
@@ -523,6 +523,11 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if(!strcmp(var, "core.refilterondiff")) {
+		core_refilter_on_diff = git_config_bool(var, value);
+		return 0;
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
 }
diff --git a/diff.c b/diff.c
index 2daa732..b2d8e6d 100644
--- a/diff.c
+++ b/diff.c
@@ -8,6 +8,8 @@
 #include "delta.h"
 #include "xdiff-interface.h"
 #include "color.h"
+#include "cache.h"
+#include "object.h"
 #include "attr.h"
 #include "run-command.h"
 #include "utf8.h"
@@ -3097,6 +3099,46 @@ int diff_unmodified_pair(struct diff_filepair *p)
 		return 1; /* no change */
 	if (!one->sha1_valid && !two->sha1_valid)
 		return 1; /* both look at the same file on the filesystem. */
+	if (one->dirty_submodule || two->dirty_submodule)
+		return 0; /* Known to differ. */
+	/* The hashes differ, but this might be due to either of them
+	 * not having been normalized (eg due to later .gitattributes
+	 * changes.
+	 */
+	if (core_refilter_on_diff) {
+		unsigned char one_sha1_norm[20];
+		unsigned char two_sha1_norm[20];
+		struct strbuf nbuf = STRBUF_INIT;
+		unsigned long buflen = 0;
+		void *buf;
+
+		diff_fill_sha1_info(one);
+		diff_fill_sha1_info(two);
+		memcpy(one_sha1_norm, one->sha1, 20);
+		memcpy(two_sha1_norm, two->sha1, 20);
+
+		buf = read_object_with_reference(one->sha1, typename(OBJ_BLOB),
+						 &buflen, one_sha1_norm);
+		if (buf && convert_to_git(one->path, buf, buflen,
+					  &nbuf, safe_crlf))
+			hash_sha1_file(nbuf.buf, nbuf.len,
+				       typename(OBJ_BLOB), one_sha1_norm);
+		if (buf)
+			free(buf);
+
+		buf = read_object_with_reference(two->sha1, typename(OBJ_BLOB),
+						 &buflen, two_sha1_norm);
+		if (buf && convert_to_git(two->path, buf, buflen,
+					  &nbuf, safe_crlf))
+			hash_sha1_file(nbuf.buf, nbuf.len,
+				       typename(OBJ_BLOB), two_sha1_norm);
+		if (buf)
+			free(buf);
+
+		strbuf_release(&nbuf);
+		if (!hashcmp(one_sha1_norm, two_sha1_norm))
+			return 1; /* Same hash after normalization. */
+	}
 	return 0;
 }
 
diff --git a/environment.c b/environment.c
index 876c5e5..1b52bed 100644
--- a/environment.c
+++ b/environment.c
@@ -52,6 +52,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE;
 char *notes_ref_name;
 int grafts_replace_parents = 1;
 int core_apply_sparse_checkout;
+int core_refilter_on_diff;
 
 /* Parallel index stat data preload? */
 int core_preload_index = 0;
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 828e35b..48ae8bb 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,4 +93,29 @@ test_expect_success expanded_in_repo '
 	cmp expanded-keywords expected-output
 '
 
+# Check that files containing keywords with proper markup aren't marked
+# as modified on checkout when core.refilterOnDiff is set.
+test_expect_success keywords_not_modified '
+	{
+		echo "File with foreign keywords"
+		echo "\$Id\$"
+		echo "\$Id: NoTerminatingSymbol"
+		echo "\$Id: Foreign Commit With Spaces \$"
+		echo "\$Id: GitCommitId \$"
+		echo "\$Id: NoTerminatingSymbolAtEOF"
+	} > expanded-keywords2 &&
+
+	git add expanded-keywords2 &&
+	git commit -m "File with keywords expanded" &&
+
+	echo "expanded-keywords2 ident" >> .gitattributes &&
+
+	rm -f expanded-keywords2 &&
+	git checkout -- expanded-keywords2 &&
+	test "x`git status --porcelain -- expanded-keywords2`" = \
+             "x M expanded-keywords2" &&
+	git config --add core.refilterondiff true &&
+	test "x`git status --porcelain -- expanded-keywords2`" = x
+'
+
 test_done
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 5/8] convert: Added core.refilteronadd feature.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
                   ` (3 preceding siblings ...)
  2010-04-06 12:46 ` [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 6/8] attr: Fixed debug output for macro expansion Henrik Grubbström (Grubba)
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

When having $Id$ tags in other versioning systems, it is customary
to recalculate the tags in the source on commit or equvivalent.
This commit adds a configuration option to git that causes source
files to pass through a conversion roundtrip when added to the index.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
 Documentation/config.txt |    6 +++++
 cache.h                  |    1 +
 config.c                 |    5 ++++
 environment.c            |    1 +
 sha1_file.c              |   57 ++++++++++++++++++++++++++++++++++++++++++++++
 t/t0021-conversion.sh    |   35 ++++++++++++++++++++++++++++
 6 files changed, 105 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4eb3ab3..5225047 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -535,6 +535,12 @@ core.sparseCheckout::
 	Enable "sparse checkout" feature. See section "Sparse checkout" in
 	linkgit:git-read-tree[1] for more information.
 
+core.refilterOnAdd::
+	Enable "refilter on add" feature. This causes source files to be
+	behave as if they were checked out after a linkgit:git-add[1].
+	This is typically usefull if eg the `ident` attribute is active,
+	in which case the $Id$ tags will be updated.
+
 core.refilterOnDiff::
 	Enable "refilter on diff" feature. This causes source files that
 	have only changed from the committed version as a side effect of
diff --git a/cache.h b/cache.h
index cd2bca4..1bd8484 100644
--- a/cache.h
+++ b/cache.h
@@ -552,6 +552,7 @@ extern int read_replace_refs;
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
+extern int core_refilter_on_add;
 extern int core_refilter_on_diff;
 
 enum safe_crlf {
diff --git a/config.c b/config.c
index 4954797..d284897 100644
--- a/config.c
+++ b/config.c
@@ -523,6 +523,11 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.refilteronadd")) {
+		core_refilter_on_add = git_config_bool(var, value);
+		return 0;
+	}
+
 	if(!strcmp(var, "core.refilterondiff")) {
 		core_refilter_on_diff = git_config_bool(var, value);
 		return 0;
diff --git a/environment.c b/environment.c
index 1b52bed..4b4c966 100644
--- a/environment.c
+++ b/environment.c
@@ -52,6 +52,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE;
 char *notes_ref_name;
 int grafts_replace_parents = 1;
 int core_apply_sparse_checkout;
+int core_refilter_on_add;
 int core_refilter_on_diff;
 
 /* Parallel index stat data preload? */
diff --git a/sha1_file.c b/sha1_file.c
index a08a9d0..2aa800e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2466,6 +2466,54 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	return ret;
 }
 
+static int refilter_fd(int fd, struct stat *st, const char *path)
+{
+	int ret = -1;
+	size_t size = xsize_t(st->st_size);
+	struct strbuf gbuf = STRBUF_INIT;
+
+	if (!S_ISREG(st->st_mode)) {
+		struct strbuf sbuf = STRBUF_INIT;
+		if (strbuf_read(&sbuf, fd, 4096) >= 0)
+			ret = convert_to_git(path, sbuf.buf, sbuf.len, &gbuf, safe_crlf);
+		else
+			ret = -1;
+		strbuf_release(&sbuf);
+	} else if (size) {
+		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		ret = convert_to_git(path, buf, size, &gbuf, safe_crlf);
+		munmap(buf, size);
+	} else
+		ret = -1;
+
+	if (ret > 0) {
+		/* Something happened during conversion to git.
+		 * Now convert it back, and save the result.
+		 */
+		struct strbuf obuf = STRBUF_INIT;
+
+		lseek(fd, 0, SEEK_SET);
+
+		if (convert_to_working_tree(path, gbuf.buf, gbuf.len, &obuf)) {
+			if (write_or_whine(fd, obuf.buf, obuf.len, path))
+				ftruncate(fd, obuf.len);
+			else
+				ret = -1;
+		} else {
+			if (write_or_whine(fd, gbuf.buf, gbuf.len, path))
+				ftruncate(fd, gbuf.len);
+			else
+				ret = -1;
+		}
+
+		strbuf_release(&obuf);
+	}
+	strbuf_release(&gbuf);
+
+	close(fd);
+	return ret;
+}
+
 int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object)
 {
 	int fd;
@@ -2480,6 +2528,15 @@ int index_path(unsigned char *sha1, const char *path, struct stat *st, int write
 		if (index_fd(sha1, fd, st, write_object, OBJ_BLOB, path) < 0)
 			return error("%s: failed to insert into database",
 				     path);
+		if (write_object && core_refilter_on_add) {
+			fd = open(path, O_RDWR);
+			if (fd < 0)
+				return error("open(\"%s\"): %s", path,
+					     strerror(errno));
+			if (refilter_fd(fd, st, path) < 0)
+				return error("%s: failed to refilter file",
+					     path);
+		}
 		break;
 	case S_IFLNK:
 		if (strbuf_readlink(&sb, path, st->st_size)) {
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 48ae8bb..9ddbde3 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -118,4 +118,39 @@ test_expect_success keywords_not_modified '
 	test "x`git status --porcelain -- expanded-keywords2`" = x
 '
 
+# Check that keywords are expanded on add when refilter is enabled.
+test_expect_success expanded_on_add '
+	git config --add core.refilteronadd true
+
+	echo "expanded-keywords3 ident" >> .gitattributes &&
+
+	{
+		echo "File with keyword"
+		echo "\$Id\$"
+	} > expanded-keywords3 &&
+
+	{
+		echo "File with keyword"
+		echo "\$Id: 0661580d6f976fe7cc1e4512f8db3f2ddb8d93cc \$"
+	} > expected-output3 &&
+
+	git add expanded-keywords3 &&
+
+	cat expanded-keywords3 &&
+	cmp expanded-keywords3 expected-output3
+'
+
+# Check that keywords are expanded on commit when refilter is enabled.
+test_expect_success expanded_on_commit '
+	{
+		echo "File with keyword"
+		echo "\$Id\$"
+	} > expanded-keywords3 &&
+
+	git commit -m "File with keyword" expanded-keywords3 &&
+
+	cat expanded-keywords3 &&
+	cmp expanded-keywords3 expected-output3
+'
+
 test_done
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 6/8] attr: Fixed debug output for macro expansion.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
                   ` (4 preceding siblings ...)
  2010-04-06 12:46 ` [PATCH v4 5/8] convert: Added core.refilteronadd feature Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 7/8] attr: Allow multiple changes to an attribute on the same line Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 8/8] attr: Expand macros immediately when encountered Henrik Grubbström (Grubba)
  7 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

When debug_set() was called during macro expansion, it
received a pointer to a struct git_attr rather than a
string.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
 attr.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/attr.c b/attr.c
index f5346ed..5c6464e 100644
--- a/attr.c
+++ b/attr.c
@@ -605,7 +605,9 @@ static int fill_one(const char *what, struct match_attr *a, int rem)
 		const char *v = a->state[i].setto;
 
 		if (*n == ATTR__UNKNOWN) {
-			debug_set(what, a->u.pattern, attr, v);
+			debug_set(what,
+				  a->is_macro?a->u.attr->name:a->u.pattern,
+				  attr, v);
 			*n = v;
 			rem--;
 		}
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 7/8] attr: Allow multiple changes to an attribute on the same line.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
                   ` (5 preceding siblings ...)
  2010-04-06 12:46 ` [PATCH v4 6/8] attr: Fixed debug output for macro expansion Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  2010-04-06 12:46 ` [PATCH v4 8/8] attr: Expand macros immediately when encountered Henrik Grubbström (Grubba)
  7 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

When using macros it isn't inconceivable to have an attribute
being set by a macro, and then being reset explicitly.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
NB: Currently the tests in the testsuite patch will have the
opposite meaning, which is probably not what the user expects,
and is contrary to the documentation.

 attr.c                |    2 +-
 t/t0003-attributes.sh |    6 ++++++
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/attr.c b/attr.c
index 5c6464e..968fb8b 100644
--- a/attr.c
+++ b/attr.c
@@ -599,7 +599,7 @@ static int fill_one(const char *what, struct match_attr *a, int rem)
 	struct git_attr_check *check = check_all_attr;
 	int i;
 
-	for (i = 0; 0 < rem && i < a->num_attr; i++) {
+	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
 		struct git_attr *attr = a->state[i].attr;
 		const char **n = &(check[attr->attr_nr].value);
 		const char *v = a->state[i].setto;
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 1c77192..bd9c8de 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -22,6 +22,8 @@ test_expect_success 'setup' '
 	(
 		echo "f	test=f"
 		echo "a/i test=a/i"
+		echo "onoff test -test"
+		echo "offon -test test"
 	) >.gitattributes &&
 	(
 		echo "g test=a/g" &&
@@ -44,6 +46,8 @@ test_expect_success 'attribute test' '
 	attr_check b/g unspecified &&
 	attr_check a/b/h a/b/h &&
 	attr_check a/b/d/g "a/b/d/*"
+	attr_check onoff unset
+	attr_check offon set
 
 '
 
@@ -58,6 +62,8 @@ a/b/g: test: a/b/g
 b/g: test: unspecified
 a/b/h: test: a/b/h
 a/b/d/g: test: a/b/d/*
+onoff: test: unset
+offon: test: set
 EOF
 
 	sed -e "s/:.*//" < expect | git check-attr --stdin test > actual &&
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v4 8/8] attr: Expand macros immediately when encountered.
  2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
                   ` (6 preceding siblings ...)
  2010-04-06 12:46 ` [PATCH v4 7/8] attr: Allow multiple changes to an attribute on the same line Henrik Grubbström (Grubba)
@ 2010-04-06 12:46 ` Henrik Grubbström (Grubba)
  7 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-04-06 12:46 UTC (permalink / raw)
  To: git; +Cc: Henrik Grubbström  

When using macros it is otherwise hard to know whether an
attribute set by the macro should override an already set
attribute. Consider the following .gitattributes file:

[attr]mybinary	binary -ident
*		ident
foo.bin		mybinary
bar.bin		mybinary ident

Without this patch both foo.bin and bar.bin will have
the ident attribute set, which is probably not what
the user expects. With this patch foo.bin will have an
unset ident attribute, while bar.bin will have it set.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
FYI: The use case I attempted was:

 *.c			crlf ident
 [attr]foreign_ident	-ident block_commit=Remove-foreign_ident-attribute.
 src/version.c		foreign_ident

Which currently causes src/version.c to have the ident attribute set.

 attr.c                |   32 ++++++++++++++++++++------------
 t/t0003-attributes.sh |    9 +++++++++
 2 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/attr.c b/attr.c
index 968fb8b..f90bb8e 100644
--- a/attr.c
+++ b/attr.c
@@ -594,6 +594,8 @@ static int path_matches(const char *pathname, int pathlen,
 	return fnmatch(pattern, pathname + baselen, FNM_PATHNAME) == 0;
 }
 
+static int macroexpand_one(int attr_nr, int rem);
+
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
 	struct git_attr_check *check = check_all_attr;
@@ -610,6 +612,7 @@ static int fill_one(const char *what, struct match_attr *a, int rem)
 				  attr, v);
 			*n = v;
 			rem--;
+			rem = macroexpand_one(attr->attr_nr, rem);
 		}
 	}
 	return rem;
@@ -631,19 +634,27 @@ static int fill(const char *path, int pathlen, struct attr_stack *stk, int rem)
 	return rem;
 }
 
-static int macroexpand(struct attr_stack *stk, int rem)
+static int macroexpand_one(int attr_nr, int rem)
 {
+	struct attr_stack *stk;
+	struct match_attr *a = NULL;
 	int i;
-	struct git_attr_check *check = check_all_attr;
 
-	for (i = stk->num_matches - 1; 0 < rem && 0 <= i; i--) {
-		struct match_attr *a = stk->attrs[i];
-		if (!a->is_macro)
-			continue;
-		if (check[a->u.attr->attr_nr].value != ATTR__TRUE)
-			continue;
+	if (check_all_attr[attr_nr].value != ATTR__TRUE)
+		return rem;
+
+	for (stk = attr_stack; !a && stk; stk = stk->prev)
+		for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+			struct match_attr *ma = stk->attrs[i];
+			if (!ma->is_macro)
+				continue;
+			if (ma->u.attr->attr_nr == attr_nr)
+				a = ma;
+		}
+
+	if (a)
 		rem = fill_one("expand", a, rem);
-	}
+
 	return rem;
 }
 
@@ -668,9 +679,6 @@ int git_checkattr(const char *path, int num, struct git_attr_check *check)
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
 		rem = fill(path, pathlen, stk, rem);
 
-	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
-		rem = macroexpand(stk, rem);
-
 	for (i = 0; i < num; i++) {
 		const char *value = check_all_attr[check[i].attr->attr_nr].value;
 		if (value == ATTR__UNKNOWN)
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index bd9c8de..53bd7fc 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -20,10 +20,12 @@ test_expect_success 'setup' '
 
 	mkdir -p a/b/d a/c &&
 	(
+		echo "[attr]notest !test"
 		echo "f	test=f"
 		echo "a/i test=a/i"
 		echo "onoff test -test"
 		echo "offon -test test"
+		echo "no notest"
 	) >.gitattributes &&
 	(
 		echo "g test=a/g" &&
@@ -32,6 +34,7 @@ test_expect_success 'setup' '
 	(
 		echo "h test=a/b/h" &&
 		echo "d/* test=a/b/d/*"
+		echo "d/yes notest"
 	) >a/b/.gitattributes
 
 '
@@ -48,6 +51,9 @@ test_expect_success 'attribute test' '
 	attr_check a/b/d/g "a/b/d/*"
 	attr_check onoff unset
 	attr_check offon set
+	attr_check no unspecified
+	attr_check a/b/d/no "a/b/d/*"
+	attr_check a/b/d/yes unspecified
 
 '
 
@@ -64,6 +70,9 @@ a/b/h: test: a/b/h
 a/b/d/g: test: a/b/d/*
 onoff: test: unset
 offon: test: set
+no: test: unspecified
+a/b/d/no: test: a/b/d/*
+a/b/d/yes: test: unspecified
 EOF
 
 	sed -e "s/:.*//" < expect | git check-attr --stdin test > actual &&
-- 
1.7.0.3.316.g33b5e

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks.
  2010-04-06 12:46 ` [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks Henrik Grubbström (Grubba)
@ 2010-04-10 22:31   ` Junio C Hamano
  2010-04-12 13:00     ` Henrik Grubbström
  2010-04-10 22:32   ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2010-04-10 22:31 UTC (permalink / raw)
  To: Henrik Grubbström (Grubba); +Cc: git

"Henrik Grubbström (Grubba)"  <grubba@grubba.org> writes:

> The diff_queue_struct provided by diff_flush() is raw, and needs to be
> filtered through diff_unmodified_pair() before being used.
> This is already done by most of the other functions operating on
> diff_queue_struct called by diff_flush().

That is true but only if you are letting the diff front-end to feed
unmodified pairs to begin with, e.g. --find-copies-harder.  I don't think
the internal caller in wt-status does that.

I don't think the patch is wrong nor it would hurt, but I am puzzled why
you needed this patch.

>  wt-status.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks.
  2010-04-06 12:46 ` [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks Henrik Grubbström (Grubba)
  2010-04-10 22:31   ` Junio C Hamano
@ 2010-04-10 22:32   ` Junio C Hamano
  1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2010-04-10 22:32 UTC (permalink / raw)
  To: Henrik Grubbström (Grubba); +Cc: git

"Henrik Grubbström (Grubba)"  <grubba@grubba.org> writes:

> The diff_queue_struct provided by diff_flush() is raw, and needs to be
> filtered through diff_unmodified_pair() before being used.
> This is already done by most of the other functions operating on
> diff_queue_struct called by diff_flush().

That is true but only if you are letting the diff front-end to feed
unmodified pairs to begin with, e.g. --find-copies-harder.  I don't think
the internal caller in wt-status does that.

I don't think the patch is wrong nor it would hurt, but I am puzzled why
you needed this patch.

>  wt-status.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes.
  2010-04-06 12:46 ` [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes Henrik Grubbström (Grubba)
@ 2010-04-10 22:37   ` Junio C Hamano
  2010-04-16 15:30     ` Henrik Grubbström
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2010-04-10 22:37 UTC (permalink / raw)
  To: Henrik Grubbström (Grubba); +Cc: git

"Henrik Grubbström (Grubba)"  <grubba@grubba.org> writes:

> When the conversion filter for a file is changed, files may get listed
> as modified even though the user has not made any changes to them.
> This patch adds a configuration option 'core.refilterOnDiff', which
> performs an extra renormalization pass to filter out such files.
>
> Signed-off-by: Henrik Grubbström <grubba@grubba.org>

Does this really have to be done for every invocation of diff?

It is a problem worthy of a clean solution that changing the filtering
options makes files that are not really different (from the end user's
point of view).

But the problem feels very similar to the issue that touching the inode
information would make the cached stat information in the index invalid
and plumbing commands such as "diff-files" would report phantom changes.

And the way we solve the latter issue without undue overhead for all
command invocations is with "update-index --refresh" (either run directly
as a command inside Porcelain scripts that work with the plumbing, or
internally by calling refresh_cache() API in the C implementations of
Porcelain commands).  Hence:

	$ cat Makefile >Makefile+
        $ mv Makefile+ Makefile
        $ git diff-files --name-only
        Makefile
        $ git update-index --refresh
        $ git diff-files --name-only

once we spend cycles to revalidate the cached information in the index,
subsequent commands can trust the validity information without recomputing
the phantom differences that do not exist over and over.

I wonder if we can solve this in a similar way.  Especially, because
changing filtering options like the core.crlf settings is a one-off event
that is done even rarely than "touch Makefile", it doesn't feel right to
add an extra configuration that makes people pay the penalty during
everyday use just in case such a one-off event might have happened.

> The typical reason to enable this option is when you have lots of files
> that have been affected by a configuration change (eg crlf convention
> or ident expansion), but don't want to recommit the otherwise unchanged
> files just to get them on canonic form in the repository.

Of course you do not want to re-commit.  If however these files that are
unchanged from the end-user's point of view can be re-checked out safely,
then that would be similar to what "update-index --refresh" does for paths
that are stat-dirty.

         

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks.
  2010-04-10 22:31   ` Junio C Hamano
@ 2010-04-12 13:00     ` Henrik Grubbström
  0 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström @ 2010-04-12 13:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 863 bytes --]

On Sat, 10 Apr 2010, Junio C Hamano wrote:

> "Henrik Grubbström (Grubba)"  <grubba@grubba.org> writes:
>
>> The diff_queue_struct provided by diff_flush() is raw, and needs to be
>> filtered through diff_unmodified_pair() before being used.
>> This is already done by most of the other functions operating on
>> diff_queue_struct called by diff_flush().
>
> That is true but only if you are letting the diff front-end to feed
> unmodified pairs to begin with, e.g. --find-copies-harder.  I don't think
> the internal caller in wt-status does that.
>
> I don't think the patch is wrong nor it would hurt, but I am puzzled why
> you needed this patch.

Well, it's a prerequisite for the diff: Filter files that have changed...
patch, albeit apparently not sufficient (yet).

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes.
  2010-04-10 22:37   ` Junio C Hamano
@ 2010-04-16 15:30     ` Henrik Grubbström
  0 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström @ 2010-04-16 15:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1212 bytes --]

On Sat, 10 Apr 2010, Junio C Hamano wrote:

> "Henrik Grubbström (Grubba)"  <grubba@grubba.org> writes:
>
>> When the conversion filter for a file is changed, files may get listed
>> as modified even though the user has not made any changes to them.
>> This patch adds a configuration option 'core.refilterOnDiff', which
>> performs an extra renormalization pass to filter out such files.
>>
>> Signed-off-by: Henrik Grubbström <grubba@grubba.org>
>
> Does this really have to be done for every invocation of diff?
>
> But the problem feels very similar to the issue that touching the inode
> information would make the cached stat information in the index invalid
> and plumbing commands such as "diff-files" would report phantom changes.

True, storing this information in the index is a much better approach.

> Of course you do not want to re-commit.  If however these files that are
> unchanged from the end-user's point of view can be re-checked out safely,
> then that would be similar to what "update-index --refresh" does for paths
> that are stat-dirty.

I now have a tentative set of patches implementing this.

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-04-16 15:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-06 12:46 [PATCH v4 0/8] Attribute and conversion patches Henrik Grubbström (Grubba)
2010-04-06 12:46 ` [PATCH v4 1/8] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
2010-04-06 12:46 ` [PATCH v4 2/8] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
2010-04-06 12:46 ` [PATCH v4 3/8] status: Added missing calls to diff_unmodified_pair() in format_callbacks Henrik Grubbström (Grubba)
2010-04-10 22:31   ` Junio C Hamano
2010-04-12 13:00     ` Henrik Grubbström
2010-04-10 22:32   ` Junio C Hamano
2010-04-06 12:46 ` [PATCH v4 4/8] diff: Filter files that have changed only due to conversion changes Henrik Grubbström (Grubba)
2010-04-10 22:37   ` Junio C Hamano
2010-04-16 15:30     ` Henrik Grubbström
2010-04-06 12:46 ` [PATCH v4 5/8] convert: Added core.refilteronadd feature Henrik Grubbström (Grubba)
2010-04-06 12:46 ` [PATCH v4 6/8] attr: Fixed debug output for macro expansion Henrik Grubbström (Grubba)
2010-04-06 12:46 ` [PATCH v4 7/8] attr: Allow multiple changes to an attribute on the same line Henrik Grubbström (Grubba)
2010-04-06 12:46 ` [PATCH v4 8/8] attr: Expand macros immediately when encountered Henrik Grubbström (Grubba)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).