From: Andy Parkins <andyparkins@gmail.com>
To: git@vger.kernel.org
Subject: [PATCH 2/2] Add keyword unexpansion support to convert.c
Date: Tue, 17 Apr 2007 10:41:45 +0100
Message-ID: <200704171041.46176.andyparkins@gmail.com>

This patch adds keyword unexpansion support.  The unexpansion is only
performed when the "keywords" attribute is found for a file.  The check
for this attribute is done in the same way as the "crlf" attribute
check.

The actual unexpansion is performed by keyword_unexpand_git() which is
called from convert_to_git() when the "keywords" attribute is found.

keyword_unexpand_git() finds strings of the form

 $KEYWORD: ARBITRARY STRING$

and collapses them into

 $KEYWORD:$

No parsing of the keyword itself is performed; the content is simply
dropped.
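
As a rough illustration of the collapse described above, here is a minimal
standalone sketch.  It is not the code in this patch: collapse_keywords() is
a hypothetical helper, and it is deliberately stricter than
keyword_unexpand_git() in that it only collapses when a closing '$' is found
on the same line and does not allow whitespace between the '$' and the
keyword.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of this patch: copy src (len bytes) to dst,
 * collapsing "$Keyword: anything$" into "$Keyword:$".  dst must have room
 * for at least len bytes.  Returns the number of bytes written.
 */
static size_t collapse_keywords(const char *src, size_t len, char *dst)
{
	size_t i = 0, out = 0;

	while (i < len) {
		size_t colon, end;

		if (src[i] != '$') {
			dst[out++] = src[i++];
			continue;
		}
		/* Scan the keyword name; it must run straight up to a ':'. */
		colon = i + 1;
		while (colon < len && src[colon] != ':' && src[colon] != '$' &&
		       !isspace((unsigned char)src[colon]))
			colon++;
		if (colon == i + 1 || colon >= len || src[colon] != ':') {
			dst[out++] = src[i++];	/* not a keyword, copy the '$' */
			continue;
		}
		/* Look for the closing '$' before the end of the line. */
		end = colon + 1;
		while (end < len && src[end] != '$' && src[end] != '\n')
			end++;
		if (end >= len || src[end] != '$') {
			dst[out++] = src[i++];	/* unterminated, leave as is */
			continue;
		}
		/* Keep "$Keyword:" and the closing '$', drop what lies between. */
		memcpy(dst + out, src + i, colon - i + 1);
		out += colon - i + 1;
		dst[out++] = '$';
		i = end + 1;
	}
	return out;
}
```

Because the output is never longer than the input, a destination buffer of
the same size as the source is always sufficient, which is the same
assumption the patch makes when it allocates its new buffer.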

Despite the fact that this doesn't do anything useful from the user's
perspective, this patch forms the more important half of keyword
expansion support - because it prevents the expansion from entering the
repository.  It effectively creates blind spots that git tools won't
see.

convert_to_git() has also been changed so that it no longer only does
CRLF conversion.  Instead, a flag is kept to say whether any conversion
was done by the CRLF code, and then that converted buffer is passed to
keyword_unexpand_git() and the flag again updated.  It then returns 1 if
either of these conversion functions actually changed anything.
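
The chaining idea can be sketched in isolation roughly as follows.  This is
only an illustration under simplifying assumptions: strip_cr(),
drop_expansion() and convert_chain() are made-up names, both stages are
naive stand-ins for the real conversions, and they rewrite the buffer in
place instead of allocating a new one as the patch does.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the CRLF stage: drop the '\r' of every "\r\n" pair. */
static int strip_cr(char *buf, unsigned long *sizep)
{
	unsigned long i, out = 0, size = *sizep;

	for (i = 0; i < size; i++) {
		if (buf[i] == '\r' && i + 1 < size && buf[i + 1] == '\n')
			continue;
		buf[out++] = buf[i];
	}
	*sizep = out;
	return out != size;
}

/*
 * Stand-in for the keyword stage: after an opening '$', drop everything
 * between the ':' and the next '$' or newline.  Deliberately naive.
 */
static int drop_expansion(char *buf, unsigned long *sizep)
{
	unsigned long i, out = 0, size = *sizep;
	int in_keyword = 0, skipping = 0;

	for (i = 0; i < size; i++) {
		char c = buf[i];

		if (c == '$' || c == '\n') {
			/* A '$' opens a keyword only if none is open. */
			in_keyword = (c == '$' && !in_keyword && !skipping);
			skipping = 0;
		} else if (skipping) {
			continue;
		} else if (c == ':' && in_keyword) {
			skipping = 1;
		}
		buf[out++] = c;
	}
	*sizep = out;
	return out != size;
}

/* Run the enabled stages in order; report 1 if any of them changed buf. */
static int convert_chain(char *buf, unsigned long *sizep,
			 int want_crlf, int want_keywords)
{
	int changes = 0;

	if (want_crlf)
		changes |= strip_cr(buf, sizep);
	if (want_keywords)
		changes |= drop_expansion(buf, sizep);
	return changes;
}
```

The point of the sketch is only the control flow: each stage sees the output
of the previous one, and the caller gets a single "was anything changed"
answer.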

I've also included a test script to show that the keyword unexpansion is
working.  In particular, it demonstrates that the diff between a file with
keywords and the repository is blind to the expanded keyword.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
I'm not submitting this for application; I've not polished it, and I've not
written the expansion half yet.

However, I did want to show what I've been banging on about with keyword
expansion, and this does a reasonable job.  The test code shows that the idea
is sound - what goes in the repository is stable, and what appears in the
working directory can contain any arbitrary keyword expansion.

Adding expansion is harder, as I have no clue which calls to make to find
even the most basic information about an object; but I thought I'd get
feedback before I expend that effort.

Areas that might cause problems are the git-apply type of commands; I haven't
checked to see if they use convert_to_git() on their input to normalise it
for entry into the repository.  I hope so, as the CRLF support relies on it as
well :-)

 convert.c           |  115 +++++++++++++++++++++++++++++++++++++++++++++++++-
 t/t0030-keywords.sh |   76 +++++++++++++++++++++++++++++++++
 2 files changed, 188 insertions(+), 3 deletions(-)
 create mode 100755 t/t0030-keywords.sh

diff --git a/convert.c b/convert.c
index d0d4b81..0c7b270 100644
--- a/convert.c
+++ b/convert.c
@@ -230,16 +230,125 @@ static int git_path_check_crlf(const char *path)
 	return attr_crlf_check.isset;
 }
 
+/* ------------------ keywords -------------------- */
+
+static void setup_keyword_check(struct git_attr_check *check)
+{
+	static struct git_attr *attr_keyword;
+
+	if (!attr_keyword)
+		attr_keyword = git_attr("keywords", 8);
+	check->attr = attr_keyword;
+}
+
+static int git_path_check_keyword(const char *path)
+{
+	struct git_attr_check attr_keyword_check;
+
+	setup_keyword_check(&attr_keyword_check);
+
+	if (git_checkattr(path, 1, &attr_keyword_check))
+		return -1;
+	return attr_keyword_check.isset;
+}
+
+static int keyword_unexpand_git(const char *path, char **bufp, unsigned long *sizep)
+{
+	char *buffer, *nbuf, *keyword;
+	unsigned long size, keywordlength;
+	int changes = 0;
+	enum {
+		IN_VOID,
+		PRE_KEYWORD,
+		IN_KEYWORD,
+		IN_EXPANSION,
+		END_KEYWORD
+	} parser_state = IN_VOID;
+
+	size = *sizep;
+	if (!size)
+		return 0;
+	buffer = *bufp;
+
+	/*
+	 * Allocate an identically sized buffer; keyword unexpansion can
+	 * only reduce the size, so we'll never overflow (although we might
+	 * waste a few bytes).
+	 */
+	nbuf = xmalloc(size);
+	*bufp = nbuf;
+
+	while (size) {
+		unsigned char c;
+
+		c = *buffer;
+
+		switch (parser_state) {
+		case IN_VOID:        /* Normal characters, wait for '$' */
+			if (c == '$')
+				parser_state = PRE_KEYWORD;
+			break;
+		case PRE_KEYWORD:    /* Gap between '$' and keyword */
+			keywordlength = 0;
+			keyword = buffer;
+			if (!isspace(c))
+				parser_state = IN_KEYWORD;
+			else
+				break;
+		case IN_KEYWORD:     /* Keyword itself */
+			if (c == ':')
+				parser_state = IN_EXPANSION;
+			else if (c == '$' || c == '\n' || c == '\r')
+				parser_state = END_KEYWORD;
+			else
+				keywordlength++;
+			break;
+		case IN_EXPANSION:   /* The expansion gets silently removed */
+			if (c == '$' || c == '\n')
+				parser_state = END_KEYWORD;
+			else {
+				changes = 1;
+				/* Every character we skip reduces the overall size */
+				(*sizep)--;
+				buffer++;
+				size--;
+			}
+			continue;
+		case END_KEYWORD:    /* End of keyword */
+			parser_state = IN_VOID;
+			break;
+		}
+
+		*nbuf++ = c;
+		buffer++;
+		size--;
+	}
+
+	return (changes != 0);
+}
+
+/* ------------------------------------------------ */
 int convert_to_git(const char *path, char **bufp, unsigned long *sizep)
 {
+	int changes = 0;
+
 	switch (git_path_check_crlf(path)) {
 	case 0:
-		return 0;
+		break;
 	case 1:
-		return forcecrlf_to_git(path, bufp, sizep);
+		changes += forcecrlf_to_git(path, bufp, sizep);
+		break;
 	default:
-		return autocrlf_to_git(path, bufp, sizep);
+		changes += autocrlf_to_git(path, bufp, sizep);
+	}
+
+	switch (git_path_check_keyword(path)) {
+	case 0:
+		break;
+	case 1:
+		changes += keyword_unexpand_git(path, bufp, sizep);
 	}
+	return (changes != 0);
 }
 
 int convert_to_working_tree(const char *path, char **bufp, unsigned long *sizep)
diff --git a/t/t0030-keywords.sh b/t/t0030-keywords.sh
new file mode 100755
index 0000000..375acb8
--- /dev/null
+++ b/t/t0030-keywords.sh
@@ -0,0 +1,76 @@
+#!/bin/sh
+
+cd "$(dirname "$0")"
+
+test_description='Keyword expansion'
+
+. ./test-lib.sh
+
+# Adding the attribute "keywords" turns the keyword expansion on
+# I've used "notkeywords" as an attribute as a placeholder attribute
+# but this is just "somerandomattribute", it has no meaning
+
+# Expect success because the keyword attribute should be found
+test_expect_success 'Keywords attribute present' '
+
+	echo "keywordsfile keywords" >.gitattributes &&
+
+	echo "\$keyword: anythingcangohere\$" > keywordsfile &&
+
+	git add keywordsfile &&
+	git add .gitattributes &&
+	git commit -m test-keywords &&
+
+	git check-attr keywords -- keywordsfile
+'
+
+# Expect failure because the repository version should be different from the
+# working tree version.
+#
+#  In repository : $keyword:$
+#  In working dir: $keyword: anythingcangohere$
+#
+test_expect_failure 'Keywords unexpansion active' '
+
+	git show HEAD:keywordsfile > keywordsfile.cmp &&
+	cmp keywordsfile keywordsfile.cmp
+
+'
+
+# Expect success because we want to find the keyword line unexpanded in the
+# repository and hence appearing unchanged in the output of git-diff
+test_expect_success 'git-diff with keywords present' '
+	echo "Non-keyword containing line" >> keywordsfile &&
+	git diff -- keywordsfile | grep -qs "^ \$keyword:\$$"
+'
+
+# Expect failure because the keywords attribute should NOT be found
+test_expect_failure 'Keywords attribute absent' '
+
+	echo "keywordsfile notkeywords" >.gitattributes &&
+
+	git add .gitattributes &&
+	git commit -m test-not-keywords &&
+
+	git check-attr keywords -- keywordsfile
+
+'
+
+# If keywords are later disabled on that file, then the keyword unexpansion
+# will be ignored, so a diff should now show differences, because git is no
+# longer keyword blind
+test_expect_success 'git-diff with keywords in file but disabled' '
+	git diff -- keywordsfile | grep -qs "^diff"
+'
+
+# Expect success because the repository should be identical to the working tree
+test_expect_success 'Keywords unexpansion inactive' '
+
+	git add keywordsfile &&
+	git commit -m "test-not-keywords" &&
+
+	git show HEAD:keywordsfile > keywordsfile.cmp &&
+	cmp keywordsfile keywordsfile.cmp
+'
+
+test_done
-- 
1.5.1.1.821.g88bdb
