Git development
 help / color / mirror / Atom feed
* [PATCH 00/27] Revamp the attribute system; another round
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams, gitster, pclouds, sbeller

This series has been bounced around a bit (from Junio to Stefan) and finally
landed in my lap.  The end result of Stefan's attempt at the series still had a
couple of things that needed more tweaking.  It also has a few patches on top
which added functionality to pathspecs to be able to query into the attribute
system, which I've dropped from this series due to this series' length.

As a reminder the intent of this series is to revamp the attribute system so
that it can be thread-safe as well as a couple of other quality of life
changes.  This entailed removing dependencies on writing global data structures
during the attribute collection process.  Major changes are as follows:

 * The global array used to collect attributes needed to be made local and as a
   result was pushed out to the attr_check structure the caller prepares before
   querying the attribute system.

 * As it turns out the attribute stack ends up being used as a read-only
   structure during the collection process and as such parts of the attribute
   stack can be shared between different threads calling into the system.  To
   enable this sharing the attribute stack frames are stored in a hashmap and
   can be read out (or created and stored in the hashmap) based on the
   directory name of the path being queried.  This is possible because if a
   particular stack frame is included in the overall stack for a particular
   query, all of the frames underneath it will be the same for all queries that
   use this frame (only exception is the info frame which is handled special
   case, see the patch for details).

I took many of the first patches of this series as is from the series Stefan
prepared as as such may only need a cursory glace.  I did modify and change
some of the later patches authored by Junio to address a couple of naming
changes and to redistribute some code between patches so those patches would
need a closer look.

Thanks again to all the work Junio and Stefan put into this before I got a hold
of it.

Any comments are appreciated!

Thanks,
Brandon Williams

Brandon Williams (8):
  attr: pass struct attr_check to collect_some_attrs
  attr: use hashmap for attribute dictionary
  attr: eliminate global check_all_attr array
  attr: remove maybe-real, maybe-macro from git_attr
  attr: tighten const correctness with git_attr and match_attr
  attr: store attribute stacks in hashmap
  attr: push the bare repo check into read_attr()
  attr: reformat git_attr_set_direction() function

Junio C Hamano (17):
  commit.c: use strchrnul() to scan for one line
  attr.c: use strchrnul() to scan for one line
  attr.c: update a stale comment on "struct match_attr"
  attr.c: explain the lack of attr-name syntax check in parse_attr()
  attr.c: complete a sentence in a comment
  attr.c: mark where #if DEBUG ends more clearly
  attr.c: simplify macroexpand_one()
  attr.c: tighten constness around "git_attr" structure
  attr.c: plug small leak in parse_attr_line()
  attr.c: add push_stack() helper
  attr.c: outline the future plans by heavily commenting
  attr: rename function and struct related to checking attributes
  attr: (re)introduce git_check_attr() and struct attr_check
  attr: convert git_all_attrs() to use "struct attr_check"
  attr: convert git_check_attrs() callers to use the new API
  attr: retire git_check_attrs() API
  attr: change validity check for attribute names to use positive logic

Nguyễn Thái Ngọc Duy (1):
  attr: support quoting pathname patterns in C style

Stefan Beller (1):
  Documentation: fix a typo

 Documentation/gitattributes.txt               |  10 +-
 Documentation/technical/api-gitattributes.txt |  86 ++-
 archive.c                                     |  24 +-
 attr.c                                        | 932 +++++++++++++++++---------
 attr.h                                        |  50 +-
 builtin/check-attr.c                          |  66 +-
 builtin/pack-objects.c                        |  19 +-
 commit.c                                      |   3 +-
 common-main.c                                 |   3 +
 convert.c                                     |  25 +-
 ll-merge.c                                    |  33 +-
 t/t0003-attributes.sh                         |  26 +
 userdiff.c                                    |  19 +-
 ws.c                                          |  19 +-
 14 files changed, 834 insertions(+), 481 deletions(-)

-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply

* [PATCH 14/27] attr: rename function and struct related to checking attributes
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 archive.c              |  6 +++---
 attr.c                 | 12 ++++++------
 attr.h                 |  8 ++++----
 builtin/check-attr.c   | 19 ++++++++++---------
 builtin/pack-objects.c |  6 +++---
 convert.c              | 12 ++++++------
 ll-merge.c             | 10 +++++-----
 userdiff.c             |  4 ++--
 ws.c                   |  6 +++---
 9 files changed, 42 insertions(+), 41 deletions(-)

diff --git a/archive.c b/archive.c
index 01751e574..b76bd4691 100644
--- a/archive.c
+++ b/archive.c
@@ -87,7 +87,7 @@ void *sha1_file_to_archive(const struct archiver_args *args,
 	return buffer;
 }
 
-static void setup_archive_check(struct git_attr_check *check)
+static void setup_archive_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_export_ignore;
 	static struct git_attr *attr_export_subst;
@@ -123,7 +123,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	struct git_attr_check check[2];
+	struct attr_check_item check[2];
 	const char *path_without_prefix;
 	int err;
 
@@ -138,7 +138,7 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 	path_without_prefix = path.buf + args->baselen;
 
 	setup_archive_check(check);
-	if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+	if (!git_check_attrs(path_without_prefix, ARRAY_SIZE(check), check)) {
 		if (ATTR_TRUE(check[0].value))
 			return 0;
 		args->convert = ATTR_TRUE(check[1].value);
diff --git a/attr.c b/attr.c
index 50e5ee393..2f180d609 100644
--- a/attr.c
+++ b/attr.c
@@ -56,7 +56,7 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
 static int cannot_trust_maybe_real;
 
 /* NEEDSWORK: This will become per git_attr_check */
-static struct git_attr_check *check_all_attr;
+static struct attr_check_item *check_all_attr;
 
 const char *git_attr_name(const struct git_attr *attr)
 {
@@ -713,7 +713,7 @@ static int macroexpand_one(int attr_nr, int rem);
 
 static int fill_one(const char *what, struct match_attr *a, int rem)
 {
-	struct git_attr_check *check = check_all_attr;
+	struct attr_check_item *check = check_all_attr;
 	int i;
 
 	for (i = a->num_attr - 1; 0 < rem && 0 <= i; i--) {
@@ -778,7 +778,7 @@ static int macroexpand_one(int nr, int rem)
  * collected. Otherwise all attributes are collected.
  */
 static void collect_some_attrs(const char *path, int num,
-			       struct git_attr_check *check)
+			       struct attr_check_item *check)
 
 {
 	struct attr_stack *stk;
@@ -806,7 +806,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = 0;
 		for (i = 0; i < num; i++) {
 			if (!check[i].attr->maybe_real) {
-				struct git_attr_check *c;
+				struct attr_check_item *c;
 				c = check_all_attr + check[i].attr->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
@@ -821,7 +821,7 @@ static void collect_some_attrs(const char *path, int num,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
-int git_check_attr(const char *path, int num, struct git_attr_check *check)
+int git_check_attrs(const char *path, int num, struct attr_check_item *check)
 {
 	int i;
 
@@ -837,7 +837,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
 	return 0;
 }
 
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check)
 {
 	int i, count, j;
 
diff --git a/attr.h b/attr.h
index 00d7a662c..efc7bb3b3 100644
--- a/attr.h
+++ b/attr.h
@@ -20,11 +20,11 @@ extern const char git_attr__false[];
 #define ATTR_UNSET(v) ((v) == NULL)
 
 /*
- * Send one or more git_attr_check to git_check_attr(), and
+ * Send one or more git_attr_check to git_check_attrs(), and
  * each 'value' member tells what its value is.
  * Unset one is returned as NULL.
  */
-struct git_attr_check {
+struct attr_check_item {
 	const struct git_attr *attr;
 	const char *value;
 };
@@ -36,7 +36,7 @@ struct git_attr_check {
  */
 extern const char *git_attr_name(const struct git_attr *);
 
-int git_check_attr(const char *path, int, struct git_attr_check *);
+int git_check_attrs(const char *path, int, struct attr_check_item *);
 
 /*
  * Retrieve all attributes that apply to the specified path.  *num
@@ -45,7 +45,7 @@ int git_check_attr(const char *path, int, struct git_attr_check *);
  * objects describing the attributes and their values.  *check must be
  * free()ed by the caller.
  */
-int git_all_attrs(const char *path, int *num, struct git_attr_check **check);
+int git_all_attrs(const char *path, int *num, struct attr_check_item **check);
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 53a5a18c1..889264a5b 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,8 +24,8 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(int cnt, struct git_attr_check *check,
-	const char *file)
+static void output_attr(int cnt, struct attr_check_item *check,
+			const char *file)
 {
 	int j;
 	for (j = 0; j < cnt; j++) {
@@ -51,14 +51,15 @@ static void output_attr(int cnt, struct git_attr_check *check,
 	}
 }
 
-static void check_attr(const char *prefix, int cnt,
-	struct git_attr_check *check, const char *file)
+static void check_attr(const char *prefix,
+		       int cnt, struct attr_check_item *check,
+		       const char *file)
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
 	if (check != NULL) {
-		if (git_check_attr(full_path, cnt, check))
-			die("git_check_attr died");
+		if (git_check_attrs(full_path, cnt, check))
+			die("git_check_attrs died");
 		output_attr(cnt, check, file);
 	} else {
 		if (git_all_attrs(full_path, &cnt, &check))
@@ -69,8 +70,8 @@ static void check_attr(const char *prefix, int cnt,
 	free(full_path);
 }
 
-static void check_attr_stdin_paths(const char *prefix, int cnt,
-	struct git_attr_check *check)
+static void check_attr_stdin_paths(const char *prefix,
+				   int cnt, struct attr_check_item *check)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf unquoted = STRBUF_INIT;
@@ -99,7 +100,7 @@ static NORETURN void error_with_usage(const char *msg)
 
 int cmd_check_attr(int argc, const char **argv, const char *prefix)
 {
-	struct git_attr_check *check;
+	struct attr_check_item *check;
 	int cnt, i, doubledash, filei;
 
 	if (!is_bare_repository())
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8841f8b36..8b8fbd814 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -894,7 +894,7 @@ static void write_pack_file(void)
 			written, nr_result);
 }
 
-static void setup_delta_attr_check(struct git_attr_check *check)
+static void setup_delta_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_delta;
 
@@ -906,10 +906,10 @@ static void setup_delta_attr_check(struct git_attr_check *check)
 
 static int no_try_delta(const char *path)
 {
-	struct git_attr_check check[1];
+	struct attr_check_item check[1];
 
 	setup_delta_attr_check(check);
-	if (git_check_attr(path, ARRAY_SIZE(check), check))
+	if (git_check_attrs(path, ARRAY_SIZE(check), check))
 		return 0;
 	if (ATTR_FALSE(check->value))
 		return 1;
diff --git a/convert.c b/convert.c
index 4e17e45ed..1b9829279 100644
--- a/convert.c
+++ b/convert.c
@@ -1028,7 +1028,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 	return 1;
 }
 
-static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
+static enum crlf_action git_path_check_crlf(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1045,7 +1045,7 @@ static enum crlf_action git_path_check_crlf(struct git_attr_check *check)
 	return CRLF_UNDEFINED;
 }
 
-static enum eol git_path_check_eol(struct git_attr_check *check)
+static enum eol git_path_check_eol(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1058,7 +1058,7 @@ static enum eol git_path_check_eol(struct git_attr_check *check)
 	return EOL_UNSET;
 }
 
-static struct convert_driver *git_path_check_convert(struct git_attr_check *check)
+static struct convert_driver *git_path_check_convert(struct attr_check_item *check)
 {
 	const char *value = check->value;
 	struct convert_driver *drv;
@@ -1071,7 +1071,7 @@ static struct convert_driver *git_path_check_convert(struct git_attr_check *chec
 	return NULL;
 }
 
-static int git_path_check_ident(struct git_attr_check *check)
+static int git_path_check_ident(struct attr_check_item *check)
 {
 	const char *value = check->value;
 
@@ -1093,7 +1093,7 @@ static const char *conv_attr_name[] = {
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
 	int i;
-	static struct git_attr_check ccheck[NUM_CONV_ATTRS];
+	static struct attr_check_item ccheck[NUM_CONV_ATTRS];
 
 	if (!ccheck[0].attr) {
 		for (i = 0; i < NUM_CONV_ATTRS; i++)
@@ -1102,7 +1102,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		git_config(read_convert_config, NULL);
 	}
 
-	if (!git_check_attr(path, NUM_CONV_ATTRS, ccheck)) {
+	if (!git_check_attrs(path, NUM_CONV_ATTRS, ccheck)) {
 		ca->crlf_action = git_path_check_crlf(ccheck + 4);
 		if (ca->crlf_action == CRLF_UNDEFINED)
 			ca->crlf_action = git_path_check_crlf(ccheck + 0);
diff --git a/ll-merge.c b/ll-merge.c
index ad8be42f9..198f07aca 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -336,13 +336,13 @@ static const struct ll_merge_driver *find_ll_merge_driver(const char *merge_attr
 	return &ll_merge_drv[LL_TEXT_MERGE];
 }
 
-static int git_path_check_merge(const char *path, struct git_attr_check check[2])
+static int git_path_check_merge(const char *path, struct attr_check_item check[2])
 {
 	if (!check[0].attr) {
 		check[0].attr = git_attr("merge");
 		check[1].attr = git_attr("conflict-marker-size");
 	}
-	return git_check_attr(path, 2, check);
+	return git_check_attrs(path, 2, check);
 }
 
 static void normalize_file(mmfile_t *mm, const char *path)
@@ -362,7 +362,7 @@ int ll_merge(mmbuffer_t *result_buf,
 	     mmfile_t *theirs, const char *their_label,
 	     const struct ll_merge_options *opts)
 {
-	static struct git_attr_check check[2];
+	static struct attr_check_item check[2];
 	static const struct ll_merge_options default_opts;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
@@ -398,12 +398,12 @@ int ll_merge(mmbuffer_t *result_buf,
 
 int ll_merge_marker_size(const char *path)
 {
-	static struct git_attr_check check;
+	static struct attr_check_item check;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
 	if (!check.attr)
 		check.attr = git_attr("conflict-marker-size");
-	if (!git_check_attr(path, 1, &check) && check.value) {
+	if (!git_check_attrs(path, 1, &check) && check.value) {
 		marker_size = atoi(check.value);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
diff --git a/userdiff.c b/userdiff.c
index 2125d6da2..b0b44467a 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -263,7 +263,7 @@ struct userdiff_driver *userdiff_find_by_name(const char *name) {
 struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
 	static struct git_attr *attr;
-	struct git_attr_check check;
+	struct attr_check_item check;
 
 	if (!attr)
 		attr = git_attr("diff");
@@ -271,7 +271,7 @@ struct userdiff_driver *userdiff_find_by_path(const char *path)
 
 	if (!path)
 		return NULL;
-	if (git_check_attr(path, 1, &check))
+	if (git_check_attrs(path, 1, &check))
 		return NULL;
 
 	if (ATTR_TRUE(check.value))
diff --git a/ws.c b/ws.c
index ea4b2b1df..fbd876e84 100644
--- a/ws.c
+++ b/ws.c
@@ -71,7 +71,7 @@ unsigned parse_whitespace_rule(const char *string)
 	return rule;
 }
 
-static void setup_whitespace_attr_check(struct git_attr_check *check)
+static void setup_whitespace_attr_check(struct attr_check_item *check)
 {
 	static struct git_attr *attr_whitespace;
 
@@ -82,10 +82,10 @@ static void setup_whitespace_attr_check(struct git_attr_check *check)
 
 unsigned whitespace_rule(const char *pathname)
 {
-	struct git_attr_check attr_whitespace_rule;
+	struct attr_check_item attr_whitespace_rule;
 
 	setup_whitespace_attr_check(&attr_whitespace_rule);
-	if (!git_check_attr(pathname, 1, &attr_whitespace_rule)) {
+	if (!git_check_attrs(pathname, 1, &attr_whitespace_rule)) {
 		const char *value;
 
 		value = attr_whitespace_rule.value;
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 06/27] attr.c: mark where #if DEBUG ends more clearly
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 9bdf87a6f..17297fffe 100644
--- a/attr.c
+++ b/attr.c
@@ -469,7 +469,7 @@ static void debug_set(const char *what, const char *match, struct git_attr *attr
 #define debug_push(a) do { ; } while (0)
 #define debug_pop(a) do { ; } while (0)
 #define debug_set(a,b,c,d) do { ; } while (0)
-#endif
+#endif /* DEBUG_ATTR */
 
 static void drop_attr_stack(void)
 {
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 07/27] attr.c: simplify macroexpand_one()
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

The double-loop wants to do an early return immediately when one
matching macro is found.  Eliminate the extra variable 'a' used for
that purpose and rewrite the "assign the found item to 'a' to make
it non-NULL and force the loop(s) to terminate" with a direct return
from there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index 17297fffe..e42f931b3 100644
--- a/attr.c
+++ b/attr.c
@@ -705,24 +705,21 @@ static int fill(const char *path, int pathlen, int basename_offset,
 static int macroexpand_one(int nr, int rem)
 {
 	struct attr_stack *stk;
-	struct match_attr *a = NULL;
 	int i;
 
 	if (check_all_attr[nr].value != ATTR__TRUE ||
 	    !check_all_attr[nr].attr->maybe_macro)
 		return rem;
 
-	for (stk = attr_stack; !a && stk; stk = stk->prev)
-		for (i = stk->num_matches - 1; !a && 0 <= i; i--) {
+	for (stk = attr_stack; stk; stk = stk->prev) {
+		for (i = stk->num_matches - 1; 0 <= i; i--) {
 			struct match_attr *ma = stk->attrs[i];
 			if (!ma->is_macro)
 				continue;
 			if (ma->u.attr->attr_nr == nr)
-				a = ma;
+				return fill_one("expand", ma, rem);
 		}
-
-	if (a)
-		rem = fill_one("expand", a, rem);
+	}
 
 	return rem;
 }
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 10/27] attr: support quoting pathname patterns in C style
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, gitster, sbeller,
	Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

Full pattern must be quoted. So 'pat"t"ern attr' will give exactly
'pat"t"ern', not 'pattern'. Also clarify that leading whitespaces are
not part of the pattern and document comment syntax.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/gitattributes.txt |  8 +++++---
 attr.c                          | 15 +++++++++++++--
 t/t0003-attributes.sh           | 26 ++++++++++++++++++++++++++
 3 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c122..3173dee7e 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -21,9 +21,11 @@ Each line in `gitattributes` file is of form:
 	pattern	attr1 attr2 ...
 
 That is, a pattern followed by an attributes list,
-separated by whitespaces.  When the pattern matches the
-path in question, the attributes listed on the line are given to
-the path.
+separated by whitespaces. Leading and trailing whitespaces are
+ignored. Lines that begin with '#' are ignored. Patterns
+that begin with a double quote are quoted in C style.
+When the pattern matches the path in question, the attributes
+listed on the line are given to the path.
 
 Each attribute can be in one of these states for a given path:
 
diff --git a/attr.c b/attr.c
index d180c7833..e1c630f79 100644
--- a/attr.c
+++ b/attr.c
@@ -13,6 +13,7 @@
 #include "attr.h"
 #include "dir.h"
 #include "utf8.h"
+#include "quote.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -212,12 +213,21 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	const char *cp, *name, *states;
 	struct match_attr *res = NULL;
 	int is_macro;
+	struct strbuf pattern = STRBUF_INIT;
 
 	cp = line + strspn(line, blank);
 	if (!*cp || *cp == '#')
 		return NULL;
 	name = cp;
-	namelen = strcspn(name, blank);
+
+	if (*cp == '"' && !unquote_c_style(&pattern, name, &states)) {
+		name = pattern.buf;
+		namelen = pattern.len;
+	} else {
+		namelen = strcspn(name, blank);
+		states = name + namelen;
+	}
+
 	if (strlen(ATTRIBUTE_MACRO_PREFIX) < namelen &&
 	    starts_with(name, ATTRIBUTE_MACRO_PREFIX)) {
 		if (!macro_ok) {
@@ -239,7 +249,6 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	else
 		is_macro = 0;
 
-	states = name + namelen;
 	states += strspn(states, blank);
 
 	/* First pass to count the attr_states */
@@ -282,9 +291,11 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			cannot_trust_maybe_real = 1;
 	}
 
+	strbuf_release(&pattern);
 	return res;
 
 fail_return:
+	strbuf_release(&pattern);
 	free(res);
 	return NULL;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index f0fbb4255..f19ae4f8c 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -13,10 +13,31 @@ attr_check () {
 	test_line_count = 0 err
 }
 
+attr_check_quote () {
+
+	path="$1"
+	quoted_path="$2"
+	expect="$3"
+
+	git check-attr test -- "$path" >actual &&
+	echo "\"$quoted_path\": test: $expect" >expect &&
+	test_cmp expect actual
+
+}
+
+test_expect_success 'open-quoted pathname' '
+	echo "\"a test=a" >.gitattributes &&
+	test_must_fail attr_check a a
+'
+
+
 test_expect_success 'setup' '
 	mkdir -p a/b/d a/c b &&
 	(
 		echo "[attr]notest !test"
+		echo "\" d \"	test=d"
+		echo " e	test=e"
+		echo " e\"	test=e"
 		echo "f	test=f"
 		echo "a/i test=a/i"
 		echo "onoff test -test"
@@ -69,6 +90,11 @@ test_expect_success 'command line checks' '
 '
 
 test_expect_success 'attribute test' '
+
+	attr_check " d " d &&
+	attr_check e e &&
+	attr_check_quote e\" e\\\" e &&
+
 	attr_check f f &&
 	attr_check a/f f &&
 	attr_check a/c/f f &&
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 09/27] attr.c: plug small leak in parse_attr_line()
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

If any error is noticed after the match_attr structure is allocated,
we shouldn't just return NULL from this function.

Add a fail_return label that frees the allocated structure and
returns NULL, and consistently jump there when we want to return
NULL after cleaning up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/attr.c b/attr.c
index f7cf7ae30..d180c7833 100644
--- a/attr.c
+++ b/attr.c
@@ -223,7 +223,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (!macro_ok) {
 			fprintf(stderr, "%s not allowed: %s:%d\n",
 				name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 		is_macro = 1;
 		name += strlen(ATTRIBUTE_MACRO_PREFIX);
@@ -233,7 +233,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 			fprintf(stderr,
 				"%.*s is not a valid attribute name: %s:%d\n",
 				namelen, name, src, lineno);
-			return NULL;
+			goto fail_return;
 		}
 	}
 	else
@@ -246,7 +246,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	for (cp = states, num_attr = 0; *cp; num_attr++) {
 		cp = parse_attr(src, lineno, cp, NULL);
 		if (!cp)
-			return NULL;
+			goto fail_return;
 	}
 
 	res = xcalloc(1,
@@ -267,7 +267,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		if (res->u.pat.flags & EXC_FLAG_NEGATIVE) {
 			warning(_("Negative patterns are ignored in git attributes\n"
 				  "Use '\\!' for literal leading exclamation."));
-			return NULL;
+			goto fail_return;
 		}
 	}
 	res->is_macro = is_macro;
@@ -283,6 +283,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	}
 
 	return res;
+
+fail_return:
+	free(res);
+	return NULL;
 }
 
 /*
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 04/27] attr.c: explain the lack of attr-name syntax check in parse_attr()
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/attr.c b/attr.c
index 007f1a299..6b55a57ef 100644
--- a/attr.c
+++ b/attr.c
@@ -183,6 +183,12 @@ static const char *parse_attr(const char *src, int lineno, const char *cp,
 			return NULL;
 		}
 	} else {
+		/*
+		 * As this function is always called twice, once with
+		 * e == NULL in the first pass and then e != NULL in
+		 * the second pass, no need for invalid_attr_name()
+		 * check here.
+		 */
 		if (*cp == '-' || *cp == '!') {
 			e->setto = (*cp == '-') ? ATTR__FALSE : ATTR__UNSET;
 			cp++;
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 05/27] attr.c: complete a sentence in a comment
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 6b55a57ef..9bdf87a6f 100644
--- a/attr.c
+++ b/attr.c
@@ -300,7 +300,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
  * This is exactly the same as what is_excluded() does in dir.c to deal with
- * .gitignore
+ * .gitignore file and info/excludes file as a fallback.
  */
 
 static struct attr_stack {
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* [PATCH 03/27] attr.c: update a stale comment on "struct match_attr"
From: Brandon Williams @ 2017-01-12 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, pclouds, sbeller, Brandon Williams
In-Reply-To: <20170112235354.153403-1-bmwill@google.com>

From: Junio C Hamano <gitster@pobox.com>

When 82dce998 (attr: more matching optimizations from .gitignore,
2012-10-15) changed a pointer to a string "*pattern" into an
embedded "struct pattern" in struct match_attr, it forgot to update
the comment that describes the structure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 attr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index 04d24334e..007f1a299 100644
--- a/attr.c
+++ b/attr.c
@@ -131,9 +131,8 @@ struct pattern {
  * If is_macro is true, then u.attr is a pointer to the git_attr being
  * defined.
  *
- * If is_macro is false, then u.pattern points at the filename pattern
- * to which the rule applies.  (The memory pointed to is part of the
- * memory block allocated for the match_attr instance.)
+ * If is_macro is false, then u.pat is the filename pattern to which the
+ * rule applies.
  *
  * In either case, num_attr is the number of attributes affected by
  * this rule, and state is an array listing them.  The attributes are
-- 
2.11.0.390.gc69c2f50cf-goog


^ permalink raw reply related

* Re: Bug report: Documentation error in git-bisect man description
From: Manuel Ullmann @ 2017-01-13  0:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Christian Couder
In-Reply-To: <xmqq8tqfj15z.fsf@gitster.mtv.corp.google.com>

I see. Thanks for the clarification. The pairing not being pairs of
opposites was indeed, what confused me. So that description was not
meant in the sense, that you use these pairs when working with bisect, but
instead they are ordered according to the argument possibilities.

Sorry for spreading confusion. I think the second paragraph should be
sufficient for documentation.

Best regards
Manuel
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Manuel Ullmann <ullman.alias@posteo.de> writes:
>>
>> Hmmm, I tend to agree, modulo a minor fix.
>>
>> If the description were in a context inside a paragraph like this:
>>
>> 	When you want to tell 'git bisect' that a <rev> belongs to
>> 	the newer half of the history, you say
>>
>> 		git bisect (bad|new) [<rev>]
>>
>> 	On the other hand, when you want to tell 'git bisect' that a
>> 	<rev> belongs to the older half of the history, you can say
>>
>> 		git bisect (good|old) [<rev>]
>>
>> then the pairing we see in the current text makes quite a lot of
>> sense.
>
> Actually, the above is _exactly_ what was intended.  I misread the
> current documentation when I made the comment, and I think that the
> current one _IS_ correct.  The latter half of the above is not about
> a single rev.  You can paint multiple commits with the "older half"
> color, i.e.
>
> 	On the other hand, when you want to tell 'git bisect' that
> 	one or more <rev>s  belong to the older half of the history,
> 	you can say
>
> 		git bisect (good|old) [<rev>...]
>
> In contrast, you can mark only one <rev> as newer (or "already
> bad").  So pairing (bad|good) and (new|old) like you suggested
> breaks the correctness of the command line description.
>
> If (bad|new) and (good|old) bothers you because they may mislead the
> readers to think bad is an opposite of new (and good is an opposite
> of old), the only solution I can think of to that problem is to
> expand these two lines into four and list them like this:
>
>         git bisect bad [<rev>]
>         git bisect good [<rev>...]
>         git bisect new [<rev>]
>         git bisect old [<rev>...]

^ permalink raw reply

* Re: [PATCH 1/5] doc: add documentation for OPT_STRING_LIST
From: Jacob Keller @ 2017-01-13  0:51 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jacob Keller, Git mailing list
In-Reply-To: <alpine.DEB.2.20.1701121044120.3469@virtualbox>

On Thu, Jan 12, 2017 at 1:47 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Jake,
>
> On Wed, 11 Jan 2017, Jacob Keller wrote:
>
>> diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
>> index 27bd701c0d68..15e876e4c804 100644
>> --- a/Documentation/technical/api-parse-options.txt
>> +++ b/Documentation/technical/api-parse-options.txt
>> @@ -168,6 +168,11 @@ There are some macros to easily define options:
>>       Introduce an option with string argument.
>>       The string argument is put into `str_var`.
>>
>> +`OPT_STRING_LIST(short, long, &list, arg_str, description)`::
>> +     Introduce an option with a string argument. Repeated invocations
>> +     accumulate into a list of strings. Reset and clear the list with
>> +     `--no-option`.
>
> One suggestions: as the list parameter is not type-safe (apart from
> checking that it can be cast to a `void *`), it would be good to mention
> in the documentation that `list` must be of type `struct string_list`.
>

Ok.

> I was about to suggest that `--no-option` may be misleading, as the
> command-line option is not really called `--option` in almost all cases,
> but I see that the rest of that document uses that convention to refer to
> the negated option already...

Also, I am merely documenting what already existed, since the original
commit failed to add documentation. I don't know if we could make a
better implementation, but it certainly would be a behavior change for
the current users.

Thanks,
Jake

>
> Ciao,
> Dscho

^ permalink raw reply

* Re: [PATCH 2/5] name-rev: extend --refs to accept multiple patterns
From: Jacob Keller @ 2017-01-13  0:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jacob Keller, Git mailing list
In-Reply-To: <alpine.DEB.2.20.1701121049530.3469@virtualbox>

On Thu, Jan 12, 2017 at 1:56 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Jake,
>
> On Wed, 11 Jan 2017, Jacob Keller wrote:
>
>> diff --git a/t/t6007-rev-list-cherry-pick-file.sh b/t/t6007-rev-list-cherry-pick-file.sh
>> index 1408b608eb03..d072ec43b016 100755
>> --- a/t/t6007-rev-list-cherry-pick-file.sh
>> +++ b/t/t6007-rev-list-cherry-pick-file.sh
>> @@ -99,6 +99,36 @@ test_expect_success '--cherry-pick bar does not come up empty (II)' '
>>       test_cmp actual.named expect
>>  '
>>
>> +test_expect_success 'name-rev multiple --refs combine inclusive' '
>> +     git rev-list --left-right --cherry-pick F...E -- bar > actual &&
>
> Our current coding style seems to skip the space between `>` and `actual`
> (this applies to all redirections added in this patch).
>

Right that's easy to fix.

>> +     git name-rev --stdin --name-only --refs="*tags/F" --refs="*tags/E" \
>> +             < actual > actual.named &&
>> +     test_cmp actual.named expect
>> +'
>> +
>> +cat >expect <<EOF
>> +<tags/F
>> +$(git rev-list --left-right --right-only --cherry-pick F...E -- bar)
>> +EOF
>
> In the current revision of t6007, we seem to list the expected output
> explicitly, i.e. *not* generating it dynamically.
>
> If you *do* insist to generate the `expect` file dynamically, a better way
> would be to include that generation in the `test_expect_success` code so
> that errors in the call can be caught, too:
>

The main reason I don't like static expecations is that often other
tests inserted before or after my test now have to know what to put
here. Specifically, the problem is that we expect the output not to be
named, but actually to be sha1 hex output. This seems really brittle
for a test.

> test_expect_success 'name-rev --refs excludes non-matched patterns' '
>         echo "<tags/F" >expect &&
>         git rev-list --left-right --right-only --cherry-pick F...E -- \
>                 bar >>expect &&
>         [...]
>
> However, if I was asked for my preference, I would suggest to specify the
> `expect` contents explicitly, to document the expectation as of time of
> writing. The reason: I debugged my share of test breakages and these
> dynamically-generated `expect` files are the worst. When things break, you
> have to dig *real* deep to figure out what is going wrong, as sometimes
> the *generation of the `expect` file* regresses.
>

Do you have a better suggestion for how to check the expect vs actual
output ignoring the raw hex data that would be there otherwise? What I
want to avoid is a brittle test that depends precisely on actions of
prior tests in generating commits and tags. Specifically in this case,
we're going to end up with the sha1 hex ID of the commit in the
expected value.. hard-coding that feels wrong.

Thanks,
Jake

> Ciao,
> Dscho

^ permalink raw reply

* Re: [PATCH 3/5] name-rev: add support to discard refs by pattern match
From: Jacob Keller @ 2017-01-13  0:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jacob Keller, Git mailing list
In-Reply-To: <alpine.DEB.2.20.1701121056470.3469@virtualbox>

On Thu, Jan 12, 2017 at 1:57 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Jake,
>
> On Wed, 11 Jan 2017, Jacob Keller wrote:
>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> Extend name-rev further to support matching refs by adding `--discard`
>> patterns.
>
> Same comment applies as for 5/5: `--exclude-refs` may be a better name
> than `--discard`.
>

I agree, will change.

Thanks,
Jake

> Ciao,
> Dscho

^ permalink raw reply

* Re: [PATCH 5/5] describe: teach describe negative pattern matches
From: Jacob Keller @ 2017-01-13  0:59 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jacob Keller, Git mailing list
In-Reply-To: <5f723a0d-623f-bf97-00de-29d430484fed@kdbg.org>

On Thu, Jan 12, 2017 at 5:45 AM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 12.01.2017 um 01:17 schrieb Jacob Keller:
>>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> Teach git-describe the `--discard` option which will allow specifying
>> a glob pattern of tags to ignore. This can be combined with the
>> `--match` patterns to enable more flexibility in determining which tags
>> to consider.
>>
>> For example, suppose you wish to find the first official release tag
>> that contains a certain commit. If we assume that official release tags
>> are of the form "v*" and pre-release candidates include "*rc*" in their
>> name, we can now find the first tag that introduces commit abcdef via:
>>
>>   git describe --contains --match="v*" --discard="*rc*"
>
>
> I have a few dozen topic branches, many of them are work in progress and
> named wip/something. To see the completed branches, I routinely say
>
>     gitk --exclude=wip/* --branches
>
> these days.
>
> It would be great if you could provide the same user interface here. The
> example in the commit message would then look like this:
>
>    git describe --contains --exclude="*rc*" --match="v*"
>
> (I'm not saying that you should add --branches, but that you should prefer
> --exclude over --discard. Also, the order of --exclude and --match would be
> important.)

I think that --exclude makes sense, but the current implementation
does not differentiate ordering, since both are merely accumulated
into string_lists and then matched together. I'm not sure how order
would impact things here? In the current implementation, if something
is excluded and matched, it will be excluded. That is, exclusion
patterns take precedence over match patterns. I think this makes the
most sense semantically.

Thanks,
Jake

>
> -- Hannes
>

^ permalink raw reply

* Re: Bug report: Documentation error in git-bisect man description
From: Christian Couder @ 2017-01-13  1:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Manuel Ullmann, git
In-Reply-To: <xmqq8tqfj15z.fsf@gitster.mtv.corp.google.com>

On Fri, Jan 13, 2017 at 12:42 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Manuel Ullmann <ullman.alias@posteo.de> writes:
>>
>> Hmmm, I tend to agree, modulo a minor fix.
>>
>> If the description were in a context inside a paragraph like this:
>>
>>       When you want to tell 'git bisect' that a <rev> belongs to
>>       the newer half of the history, you say
>>
>>               git bisect (bad|new) [<rev>]
>>
>>       On the other hand, when you want to tell 'git bisect' that a
>>       <rev> belongs to the older half of the history, you can say
>>
>>               git bisect (good|old) [<rev>]
>>
>> then the pairing we see in the current text makes quite a lot of
>> sense.
>
> Actually, the above is _exactly_ what was intended.  I misread the
> current documentation when I made the comment, and I think that the
> current one _IS_ correct.  The latter half of the above is not about
> a single rev.  You can paint multiple commits with the "older half"
> color, i.e.
>
>         On the other hand, when you want to tell 'git bisect' that
>         one or more <rev>s  belong to the older half of the history,
>         you can say
>
>                 git bisect (good|old) [<rev>...]
>
> In contrast, you can mark only one <rev> as newer (or "already
> bad").  So pairing (bad|good) and (new|old) like you suggested
> breaks the correctness of the command line description.

Yeah, I agree.

> If (bad|new) and (good|old) bothers you because they may mislead the
> readers to think bad is an opposite of new (and good is an opposite
> of old), the only solution I can think of to that problem is to
> expand these two lines into four and list them like this:
>
>         git bisect bad [<rev>]
>         git bisect good [<rev>...]
>         git bisect new [<rev>]
>         git bisect old [<rev>...]

Maybe it would be more complete and a bit clearer if it was:

           git bisect (bad|new|<term-new>) [<rev>]
           git bisect (good|old|<term-old>) [<rev>...]

^ permalink raw reply

* Re: Bug report: Documentation error in git-bisect man description
From: Manuel Ullmann @ 2017-01-13  1:39 UTC (permalink / raw)
  To: Christian Couder; +Cc: Junio C Hamano, git
In-Reply-To: <CAP8UFD3JDWVVTWsSQcnh+dOHoZDcipVUFwkfQYNOAyV4431C2w@mail.gmail.com>

> On Fri, Jan 13, 2017 at 12:42 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>>> Manuel Ullmann <ullman.alias@posteo.de> writes:
>>>
>>> Hmmm, I tend to agree, modulo a minor fix.
>>>
>>> If the description were in a context inside a paragraph like this:
>>>
>>>       When you want to tell 'git bisect' that a <rev> belongs to
>>>       the newer half of the history, you say
>>>
>>>               git bisect (bad|new) [<rev>]
>>>
>>>       On the other hand, when you want to tell 'git bisect' that a
>>>       <rev> belongs to the older half of the history, you can say
>>>
>>>               git bisect (good|old) [<rev>]
>>>
>>> then the pairing we see in the current text makes quite a lot of
>>> sense.
>>
>> Actually, the above is _exactly_ what was intended.  I misread the
>> current documentation when I made the comment, and I think that the
>> current one _IS_ correct.  The latter half of the above is not about
>> a single rev.  You can paint multiple commits with the "older half"
>> color, i.e.
>>
>>         On the other hand, when you want to tell 'git bisect' that
>>         one or more <rev>s  belong to the older half of the history,
>>         you can say
>>
>>                 git bisect (good|old) [<rev>...]
>>
>> In contrast, you can mark only one <rev> as newer (or "already
>> bad").  So pairing (bad|good) and (new|old) like you suggested
>> breaks the correctness of the command line description.
>
> Yeah, I agree.
>
>> If (bad|new) and (good|old) bothers you because they may mislead the
>> readers to think bad is an opposite of new (and good is an opposite
>> of old), the only solution I can think of to that problem is to
>> expand these two lines into four and list them like this:
>>
>>         git bisect bad [<rev>]
>>         git bisect good [<rev>...]
>>         git bisect new [<rev>]
>>         git bisect old [<rev>...]
>
> Maybe it would be more complete and a bit clearer if it was:
>
>            git bisect (bad|new|<term-new>) [<rev>]
>            git bisect (good|old|<term-old>) [<rev>...]

That would clarify the intention quite a bit (at least for me).

Best regards,
Manuel

^ permalink raw reply

* [RFC-PATCH] lib-submodule-update.sh: define tests for recursing into submodules
From: Stefan Beller @ 2017-01-13  1:51 UTC (permalink / raw)
  Cc: git, jacob.keller, bmwill, Jens.Lehmann, Stefan Beller

Currently lib-submodule-update.sh provides 2 functions
test_submodule_switch and test_submodule_forced_switch that are used by a
variety of tests to ensure that submodules behave as expected. The current
expected behavior is that submodules are not touched at all.
(See 42639d2317a for the exact setup)

In the future we want to teach all these commands to properly recurse
into submodules. To do that, we'll add two testing functions to
submodule-update-lib.sh test_submodule_switch_recursing and
test_submodule_forced_switch_recursing.

These two functions behave in analogy to the already existing functions
just with a different expectation on submodule behavior. The submodule
in the working tree is expected to be updated to the recorded submodule
version. The behavior is analogous to e.g. the behavior of files in a
nested directory in the working tree, where a change to the working tree
handles any arising directory/file conflicts just fine.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

So I have been looking into implementing "checkout --recurse-submodules",
for quite some time, but the correct approach is to not just make git-checkout
support submodules, but all the working tree operations at the same time.

Currently all the working tree operations have a test that submodules are
not touched even when the superproject is messing with the submodule,
so all submodule operations are tested via the submodule library, e.g.:

    test_submodule_switch "git pull"

would see what happens for git-pull. Below I am proposing a test
that can be used for any operation that is aware of submodules, e.g.
eventually we'll have checks like:

    test_submodule_switch_recursing "git checkout --recurse-submodules"
    # as well as
    test_submodule_switch_recursing "git pull --recurse-submodules"

This RFC email is asking if the behavior is sound and expected for submodule
operations in the working tree.

Thanks,
Stefan

 t/lib-submodule-update.sh | 465 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 463 insertions(+), 2 deletions(-)

diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
index 79cdd34a54..bf33c30409 100755
--- a/t/lib-submodule-update.sh
+++ b/t/lib-submodule-update.sh
@@ -4,6 +4,7 @@
 # - New submodule (no_submodule => add_sub1)
 # - Removed submodule (add_sub1 => remove_sub1)
 # - Updated submodule (add_sub1 => modify_sub1)
+# - Updated submodule recursively (modify_sub1 => modify_sub1_recursively)
 # - Submodule updated to invalid commit (add_sub1 => invalid_sub1)
 # - Submodule updated from invalid commit (invalid_sub1 => valid_sub1)
 # - Submodule replaced by tracked files in directory (add_sub1 =>
@@ -20,8 +21,8 @@
 #                 /   replace_sub1_with_directory
 #                /----O
 #               /     ^
-#              /      modify_sub1
-#      O------O-------O
+#              /      modify_sub1      modify_sub1_recursive
+#      O------O-------O----------------O
 #      ^      ^\      ^
 #      |      | \     remove_sub1
 #      |      |  -----O-----O
@@ -67,6 +68,14 @@ create_lib_submodule_repo () {
 		git add sub1 &&
 		git commit -m "Modify sub1" &&
 
+		git checkout -b "modify_sub1_recursively" "modify_sub1" &&
+		git -C sub1 checkout -b "add_nested_sub" &&
+		git -C sub1 submodule add --branch no_submodule ./. sub2 &&
+		git -C sub1 commit -a -m "add a nested submodule" &&
+		git add sub1 &&
+		git commit -a -m "update submodule, that updates a nested submodule" &&
+		git -C sub1 submodule deinit -f --all &&
+
 		git checkout -b "replace_sub1_with_directory" "add_sub1" &&
 		git submodule update &&
 		(
@@ -133,6 +142,15 @@ test_git_directory_is_unchanged () {
 	)
 }
 
+test_git_directory_exists() {
+	test -e ".git/modules/$1" &&
+	(
+		cd ".git/modules/$1" &&
+		# does core.worktree point at the right place?
+		test "$(git config core.worktree)" = "../../../$1"
+	)
+}
+
 # Helper function to be executed at the start of every test below, it sets up
 # the submodule repo if it doesn't exist and configures the most problematic
 # settings for diff.ignoreSubmodules.
@@ -678,3 +696,446 @@ test_submodule_forced_switch () {
 		)
 	'
 }
+
+# Test that submodule contents are correctly updated when switching
+# between commits that change a submodule.
+# Test that the following transitions are correctly handled:
+# (These tests are also above in the case where we expect no change
+#  in the submodule)
+# - Updated submodule
+# - New submodule
+# - Removed submodule
+# - Directory containing tracked files replaced by submodule
+# - Submodule replaced by tracked files in directory
+# - Submodule replaced by tracked file with the same name
+# - tracked file replaced by submodule
+#
+# New test cases
+# - Removing a submodule with a git directory absorbs the submodules
+#   git directory first into the superproject.
+
+test_submodule_switch_recursing () {
+	command="$1"
+	######################### Appearing submodule #########################
+	# Switching to a commit letting a submodule appear checks it out ...
+	test_expect_success "$command: added submodule is checked out" '
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			$command add_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+	# ... ignoring an empty existing directory ...
+	test_expect_success "$command: added submodule is checked out in empty dir" '
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			mkdir sub1 &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			$command add_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+	# ... unless there is an untracked file in its place.
+	test_expect_success "$command: added submodule doesn't remove untracked file with same name" '
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			: >sub1 &&
+			test_must_fail $command add_sub1 &&
+			test_superproject_content origin/no_submodule &&
+			test_must_be_empty sub1
+		)
+	'
+
+	# ... but an ignored file is fine.
+	test_expect_success "$command: added submodule removes an untracked ignored file" '
+		test_when_finished "rm submodule_update/.git/info/exclude" &&
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			: >sub1 &&
+			echo sub1 > .git/info/exclude
+			$command add_sub1 &&
+			test_superproject_content origin/no_submodule &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+
+	# Replacing a tracked file with a submodule produces a checked out submodule
+	test_expect_success "$command: replace tracked file with submodule checks out submodule" '
+		prolog &&
+		reset_work_tree_to replace_sub1_with_file &&
+		(
+			cd submodule_update &&
+			git branch -t replace_file_with_sub1 origin/replace_file_with_sub1 &&
+			$command replace_file_with_sub1 &&
+			test_superproject_content origin/replace_file_with_sub1 &&
+			test_submodule_content sub1 origin/replace_file_with_sub1
+		)
+	'
+	# ... as does removing a directory with tracked files with a
+	# submodule.
+	test_expect_success "$command: replace directory with submodule" '
+		prolog &&
+		reset_work_tree_to replace_sub1_with_directory &&
+		(
+			cd submodule_update &&
+			git branch -t replace_directory_with_sub1 origin/replace_directory_with_sub1 &&
+			$command replace_directory_with_sub1 &&
+			test_superproject_content origin/replace_directory_with_sub1 &&
+			test_submodule_content sub1 origin/replace_directory_with_sub1
+		)
+	'
+
+	######################## Disappearing submodule #######################
+	# Removing a submodule removes its work tree ...
+	test_expect_success "$command: removed submodule removes submodules working tree" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t remove_sub1 origin/remove_sub1 &&
+			$command remove_sub1 &&
+			test_superproject_content origin/remove_sub1 &&
+			! test -e sub1
+		)
+	'
+	# ... absorbing a .git directory along the way.
+	test_expect_success "$command: removed submodule absorbs submodules .git directory" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t remove_sub1 origin/remove_sub1 &&
+			replace_gitfile_with_git_dir sub1 &&
+			rm -rf .git/modules &&
+			$command remove_sub1 &&
+			test_superproject_content origin/remove_sub1 &&
+			! test -e sub1 &&
+			test_git_directory_exists sub1
+		)
+	'
+	# Replacing a submodule with files in a directory must succeeds
+	# when the submodule is clean
+	test_expect_success "$command: replace submodule with a directory" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_directory origin/replace_sub1_with_directory &&
+			$command replace_sub1_with_directory &&
+			test_superproject_content origin/replace_sub1_with_directory &&
+			test_submodule_content sub1 origin/replace_sub1_with_directory
+		)
+	'
+	# ... absorbing a .git directory.
+	test_expect_success "$command: replace submodule containing a .git directory with a directory must absorb the git dir" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_directory origin/replace_sub1_with_directory &&
+			replace_gitfile_with_git_dir sub1 &&
+			rm -rf .git/modules &&
+			$command replace_sub1_with_directory &&
+			test_superproject_content origin/replace_sub1_with_directory &&
+			test_git_directory_exists sub1
+		)
+	'
+
+	# Replacing it with a file ...
+	test_expect_success "$command: replace submodule with a file" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_file origin/replace_sub1_with_file &&
+			$command replace_sub1_with_file &&
+			test_superproject_content origin/replace_sub1_with_file &&
+			test -f sub1
+		)
+	'
+
+	# ... must check its local work tree for untracked files
+	test_expect_success "$command: replace submodule with a file must fail with untracked files" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_file origin/replace_sub1_with_file &&
+			> sub1/untrackedfile &&
+			$command replace_sub1_with_file &&
+			test_superproject_content origin/replace_sub1_with_file &&
+			test -f sub1
+		)
+	'
+
+	# ... and ignored files are ignroed
+	test_expect_success "$command: replace submodule with a file works ignores ignored files in submodule" '
+		test_when_finished "rm .git/modules/sub1/info/exclude" &&
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_file origin/replace_sub1_with_file &&
+			> sub1/ignored &&
+			$command replace_sub1_with_file &&
+			test_superproject_content origin/replace_sub1_with_file &&
+			test -f sub1
+		)
+	'
+
+	########################## Modified submodule #########################
+	# Updating a submodule sha1 updates the submodule's work tree
+	test_expect_success "$command: modified submodule updates submodule work tree" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t modify_sub1 origin/modify_sub1 &&
+			$command modify_sub1 &&
+			test_superproject_content origin/modify_sub1 &&
+			test_submodule_content sub1 origin/modify_sub1
+		)
+	'
+
+	# Updating a submodule to an invalid sha1 doesn't update the
+	# superproject nor the submodule's work tree.
+	test_expect_success "$command: updating to a missing submodule commit fails" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t invalid_sub1 origin/invalid_sub1 &&
+			test_must_fail $command invalid_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+
+	test_expect_success "$command: modified submodule updates submodule recursively" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t modify_sub1_recursively origin/modify_sub1_recursively &&
+			$command modify_sub1_recursively &&
+			test_superproject_content origin/modify_sub1_recursively &&
+			test_submodule_content sub1 origin/modify_sub1_recursively
+			test_submodule_content sub1/sub2
+		)
+	'
+}
+
+# Test that submodule contents are updated when switching between commits
+# that change a submodule, but throwing away local changes in
+# the superproject as well as the submodule is allowed.
+test_submodule_forced_switch_recursing () {
+	command="$1"
+	######################### Appearing submodule #########################
+	# Switching to a commit letting a submodule appear creates empty dir ...
+	test_expect_success "$command: added submodule is checked out" '
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			$command add_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+	# ... and doesn't care if it already exists ...
+	test_expect_success "$command: added submodule ignores empty directory" '
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			mkdir sub1 &&
+			$command add_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+	# ... not caring about an untracked file either
+	test_expect_success "$command: added submodule does remove untracked unignored file with same name when forced" '
+		prolog &&
+		reset_work_tree_to no_submodule &&
+		(
+			cd submodule_update &&
+			git branch -t add_sub1 origin/add_sub1 &&
+			>sub1 &&
+			$command add_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+	# Replacing a tracked file with a submodule checks out the submodule
+	test_expect_success "$command: replace tracked file with submodule populates the submodule" '
+		prolog &&
+		reset_work_tree_to replace_sub1_with_file &&
+		(
+			cd submodule_update &&
+			git branch -t replace_file_with_sub1 origin/replace_file_with_sub1 &&
+			$command replace_file_with_sub1 &&
+			test_superproject_content origin/replace_file_with_sub1 &&
+			test_submodule_content sub1 origin/replace_file_with_sub1
+		)
+	'
+	# ... as does removing a directory with tracked files with a
+	# submodule.
+	test_expect_success "$command: replace directory with submodule" '
+		prolog &&
+		reset_work_tree_to replace_sub1_with_directory &&
+		(
+			cd submodule_update &&
+			git branch -t replace_directory_with_sub1 origin/replace_directory_with_sub1 &&
+			$command replace_directory_with_sub1 &&
+			test_superproject_content origin/replace_directory_with_sub1 &&
+			test_submodule_content sub1 origin/replace_directory_with_sub1
+		)
+	'
+
+	######################## Disappearing submodule #######################
+	# Removing a submodule doesn't remove its work tree ...
+	test_expect_success "$command: removed submodule leaves submodule directory and its contents in place" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t remove_sub1 origin/remove_sub1 &&
+			$command remove_sub1 &&
+			test_superproject_content origin/remove_sub1 &&
+			! test -e sub1
+		)
+	'
+	# ... especially when it contains a .git directory.
+	test_expect_success "$command: removed submodule leaves submodule containing a .git directory alone" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t remove_sub1 origin/remove_sub1 &&
+			replace_gitfile_with_git_dir sub1 &&
+			rm -rf .git/modules/sub1 &&
+			$command remove_sub1 &&
+			test_superproject_content origin/remove_sub1 &&
+			test_git_directory_exists sub1 &&
+			! test -e sub1
+		)
+	'
+	# Replacing a submodule with files in a directory ...
+	test_expect_failure "$command: replace submodule with a directory" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_directory origin/replace_sub1_with_directory &&
+			$command replace_sub1_with_directory &&
+			test_superproject_content origin/replace_sub1_with_directory
+		)
+	'
+	# ... absorbing a .git directory.
+	test_expect_failure "$command: replace submodule containing a .git directory with a directory must fail" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_directory origin/replace_sub1_with_directory &&
+			replace_gitfile_with_git_dir sub1 &&
+			rm -rf .git/modules/sub1 &&
+			$command replace_sub1_with_directory &&
+			test_superproject_content origin/replace_sub1_with_directory &&
+			test_git_directory_exists sub1
+		)
+	'
+	# Replacing it with a file
+	test_expect_failure "$command: replace submodule with a file must fail" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_file origin/replace_sub1_with_file &&
+			$command replace_sub1_with_file &&
+			test_superproject_content origin/replace_sub1_with_file
+		)
+	'
+
+	# ... even if the submodule contains ignored files
+	test_expect_failure "$command: replace submodule with a file" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_file origin/replace_sub1_with_file &&
+			: > sub1/expect &&
+			$command replace_sub1_with_file &&
+			test_superproject_content origin/replace_sub1_with_file
+		)
+	'
+
+	# ... but stops for untracked files that would be lost
+	test_expect_failure "$command: replace submodule with a file" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t replace_sub1_with_file origin/replace_sub1_with_file &&
+			: > sub1/untracked_file &&
+			test_must_fail $command replace_sub1_with_file &&
+			test_superproject_content origin/add_sub1 &&
+			test -f sub1/untracked_file
+		)
+	'
+
+	########################## Modified submodule #########################
+	# Updating a submodule sha1 updates the submodule's work tree
+	test_expect_success "$command: modified submodule updates submodule work tree" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t modify_sub1 origin/modify_sub1 &&
+			$command modify_sub1 &&
+			test_superproject_content origin/modify_sub1 &&
+			test_submodule_content sub1 origin/modify_sub1
+		)
+	'
+	# Updating a submodule to an invalid sha1 doesn't update the
+	# submodule's work tree, subsequent update will fail
+	test_expect_success "$command: modified submodule does not update submodule work tree to invalid commit" '
+		prolog &&
+		reset_work_tree_to add_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t invalid_sub1 origin/invalid_sub1 &&
+			test_must_fail $command invalid_sub1 &&
+			test_superproject_content origin/add_sub1 &&
+			test_submodule_content sub1 origin/add_sub1
+		)
+	'
+	# Updating a submodule from an invalid sha1 updates
+	test_expect_success "$command: modified submodule does not update submodule work tree from invalid commit" '
+		prolog &&
+		reset_work_tree_to invalid_sub1 &&
+		(
+			cd submodule_update &&
+			git branch -t valid_sub1 origin/valid_sub1 &&
+			$command valid_sub1 &&
+			test_superproject_content origin/valid_sub1 &&
+			test_submodule_content sub1 origin/valid_sub1
+		)
+	'
+}
-- 
2.11.0.299.g77584d2295


^ permalink raw reply related

* Re: [PATCH 2/2] Use 'env' to find perl instead of fixed path
From: Eric Wong @ 2017-01-13  2:48 UTC (permalink / raw)
  To: Pat Pannuto; +Cc: Junio C Hamano, Johannes Schindelin, Johannes Sixt, git
In-Reply-To: <CAAnLKaGvz4Wzs36gMSdoYCg+tzx6KFCe59FNnk5zNQ-L58ww1g@mail.gmail.com>

Pat Pannuto <pat.pannuto@gmail.com> wrote:
> You may still want the 1/2 patch in this series, just to make things
> internally consistent with "-w" vs "use warnings;" inside git's perl
> scripts.

No, that is a step back.  "-w" affects the entire process, so it
spots more potential problems.  The "warnings" pragma is scoped
to the enclosing block, so it won't span across files.

Existing instances of "use warnings" should remain, but I would
rather support adding "-w" to scripts which do not have it (and
fixing newly-found warnings along the way).

Thanks.

^ permalink raw reply

* Re: [PATCH 5/5] describe: teach describe negative pattern matches
From: Johannes Sixt @ 2017-01-13  6:43 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jacob Keller, Git mailing list
In-Reply-To: <CA+P7+xrmAmCPOzuaKcm+WxceXnowkM4gKz05tSpdC=CDwpCEug@mail.gmail.com>

Am 13.01.2017 um 01:59 schrieb Jacob Keller:
> I think that --exclude makes sense, but the current implementation
> does not differentiate ordering, since both are merely accumulated
> into string_lists and then matched together. I'm not sure how order
> would impact things here? In the current implementation, if something
> is excluded and matched, it will be excluded. That is, exclusion
> patterns take precedence over match patterns. I think this makes the
> most sense semantically.

When you write

   git log --exclude=wip/* --branches --remotes

--exclude applies only to --branches, not to --remotes.

  When you write

   git log --branches --exclude=origin/* --remotes

--exclude=origin/* applies only to --remotes, but not to --branches.

-- Hannes


^ permalink raw reply

* Re: [PATCH 5/5] describe: teach describe negative pattern matches
From: Jacob Keller @ 2017-01-13  6:57 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jacob Keller, Git mailing list
In-Reply-To: <5c8401ef-9609-f235-9228-be980a13edf1@kdbg.org>

On Thu, Jan 12, 2017 at 10:43 PM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 13.01.2017 um 01:59 schrieb Jacob Keller:
>>
>> I think that --exclude makes sense, but the current implementation
>> does not differentiate ordering, since both are merely accumulated
>> into string_lists and then matched together. I'm not sure how order
>> would impact things here? In the current implementation, if something
>> is excluded and matched, it will be excluded. That is, exclusion
>> patterns take precedence over match patterns. I think this makes the
>> most sense semantically.
>
>
> When you write
>
>   git log --exclude=wip/* --branches --remotes
>
> --exclude applies only to --branches, not to --remotes.
>
>  When you write
>
>   git log --branches --exclude=origin/* --remotes
>
> --exclude=origin/* applies only to --remotes, but not to --branches.
>
> -- Hannes
>

Well for describe I don't think the order matters.

Thanks,
Jake

^ permalink raw reply

* Re: "git fsck" not detecting garbage at the end of blob object files...
From: John Szakmeister @ 2017-01-13  9:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Dennis Kaarsemaker, git
In-Reply-To: <20170108052619.4ucjamsqad4g5add@sigill.intra.peff.net>

On Sun, Jan 8, 2017 at 12:26 AM, Jeff King <peff@peff.net> wrote:
> On Sat, Jan 07, 2017 at 10:47:03PM +0100, Dennis Kaarsemaker wrote:
>> On Sat, 2017-01-07 at 07:50 -0500, John Szakmeister wrote:
>> > I was perusing StackOverflow this morning and ran across this
>> > question: http://stackoverflow.com/questions/41521143/git-fsck-full-only-checking-directories/
>> >
>> > It was a simple question about why "checking objects" was not
>> > appearing, but in it was another issue.  The user purposefully
>> > corrupted a blob object file to see if `git fsck` would catch it by
>> > tacking extra data on at the end.  `git fsck` happily said everything
>> > was okay, but when I played with things locally I found out that `git
>> > gc` does not like that extra garbage.  I'm not sure what the trade-off
>> > needs to be here, but my expectation is that if `git fsck` says
>> > everything is okay, then all operations using that object (file)
>> > should work too.
>> >
>> > Is that unreasonable?  What would be the impact of fixing this issue?
>>
>> If you do this with a commit object or tree object, fsck does complain.
>> I think it's sensible to do so for blob objects as well.
>
> The existing extra-garbage check is in unpack_sha1_rest(), which is
> called as part of read_sha1_file(). And that's what we hit for commits
> and trees. However, we check the sha1 of blobs using the streaming
> interface (in case they're large). I think you'd want to put a similar
> check into read_istream_loose(). But note if you are grepping for it, it
> is hidden behind a macro; look for read_method_decl(loose).

That's for the pointer.

> I'm actually not sure if this should be downgrade to a warning. It's
> true that it's a form of corruption, but it doesn't actually prohibit us
> from getting the data we need to complete the operation. Arguably fsck
> should be more picky, but it is just relying on the same parse_object()
> code path that the rest of git uses.
>
> I doubt anybody cares too much either way, though. It's not like this is
> a common thing.

I kind of wonder about that myself too, and I'm not sure what to
think about it.  On the one hand, I'd like to know about
*anything* that has changed in an adverse way--it could indicate
a failure somewhere else that needs to be handled.  On the other
hand, scaring the user isn't all that advantageous.  I guess I'm
in the former camp.

As to whether this is common, yeah, it's probably not.  However,
I was surprised by the number of results that turned up when I
search for "garbage at end of loose object".

> I did notice another interesting case when looking at this. Fsck ends up
> in fsck_loose(), which has the sha1 and path of the loose object. It
> passes the sha1 to fsck_sha1(), and ignores the path entirely!
>
> So if you have a duplicate copy of the object in a pack, we'd actually
> find and check the duplicate. This can happen, e.g., if you had a loose
> object and fetched a thin-pack which made a copy of the loose object to
> complete the pack).
>
> Probably fsck_loose() should be more picky about making sure we are
> reading the data from the loose version we found.

Interesting find!  Thanks for the information Peff!

-John

^ permalink raw reply

* Re: "git fsck" not detecting garbage at the end of blob object files...
From: John Szakmeister @ 2017-01-13  9:16 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git
In-Reply-To: <1483825623.31837.9.camel@kaarsemaker.net>

On Sat, Jan 7, 2017 at 4:47 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> On Sat, 2017-01-07 at 07:50 -0500, John Szakmeister wrote:
>> I was perusing StackOverflow this morning and ran across this
>> question: http://stackoverflow.com/questions/41521143/git-fsck-full-only-checking-directories/
>>
>> It was a simple question about why "checking objects" was not
>> appearing, but in it was another issue.  The user purposefully
>> corrupted a blob object file to see if `git fsck` would catch it by
>> tacking extra data on at the end.  `git fsck` happily said everything
>> was okay, but when I played with things locally I found out that `git
>> gc` does not like that extra garbage.  I'm not sure what the trade-off
>> needs to be here, but my expectation is that if `git fsck` says
>> everything is okay, then all operations using that object (file)
>> should work too.
>>
>> Is that unreasonable?  What would be the impact of fixing this issue?
>
> If you do this with a commit object or tree object, fsck does complain.
> I think it's sensible to do so for blob objects as well.

Also very good information.  Thanks Dennis!

-John

^ permalink raw reply

* [PATCH v2 0/2] diff --no-index: support symlinks and pipes
From: Dennis Kaarsemaker @ 2017-01-13 10:20 UTC (permalink / raw)
  To: git; +Cc: Dennis Kaarsemaker

git diff <(command1) <(command2) is less useful than it could be, all it outputs is:

diff --git a/dev/fd/63 b/dev/fd/62
index 9e6542b297..9f7b2c291b 120000
--- a/dev/fd/63
+++ b/dev/fd/62
@@ -1 +1 @@
-pipe:[464811685]
\ No newline at end of file
+pipe:[464811687]
\ No newline at end of file

Normal diff provides arguably better output: the diff of the output of the
commands. This series makes it possible for git diff --no-index to follow
symlinks and read from pipes, mimicking the behaviour of normal diff.

v1: http://public-inbox.org/git/20161111201958.2175-1-dennis@kaarsemaker.net/

Changes since the RFC/v1 patch:
- Following symlinks is now the default. I think an accurate summary of the
  discussion on v1 is that this behaviour is useful enough to be the default,
  but to add an escape hatch. That escape hatch is named --no-dereference, name
  stolen from gnu diff.
- Added tests and documentation

Specifically not changed:
These changes affect only diff --no-index. Using --no-dereference is an error
without --no-index.

Dennis Kaarsemaker (2):
  diff --no-index: follow symlinks
  diff --no-index: support reading from pipes

 Documentation/diff-options.txt |  7 +++++++
 diff-no-index.c                | 15 ++++++++++++---
 diff.c                         | 23 +++++++++++++++++++----
 diff.h                         |  2 +-
 t/t4053-diff-no-index.sh       | 40 ++++++++++++++++++++++++++++++++++++++++
 t/test-lib.sh                  |  4 ++++
 6 files changed, 83 insertions(+), 8 deletions(-)

-- 
2.11.0-234-gaf85957


^ permalink raw reply related

* [PATCH v2 2/2] diff --no-index: support reading from pipes
From: Dennis Kaarsemaker @ 2017-01-13 10:20 UTC (permalink / raw)
  To: git; +Cc: Dennis Kaarsemaker
In-Reply-To: <20170113102021.6054-1-dennis@kaarsemaker.net>

diff <(command1) <(command2) provides useful output, let's make it
possible for git to do the same.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
---
 diff-no-index.c          |  8 ++++++++
 diff.c                   | 13 +++++++++++--
 t/t4053-diff-no-index.sh | 10 ++++++++++
 t/test-lib.sh            |  4 ++++
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 826fe97ffc..2123da435b 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -83,6 +83,14 @@ static struct diff_filespec *noindex_filespec(const char *name, int mode)
 		name = "/dev/null";
 	s = alloc_filespec(name);
 	fill_filespec(s, null_sha1, 0, mode);
+	/*
+	 * In --no-index mode, we support reading from pipes. canon_mode, called by
+	 * fill_filespec, gets confused by this and thinks we now have subprojects.
+	 * Detect this and tell the rest of the diff machinery to treat pipes as
+	 * normal files.
+	 */
+	if (S_ISGITLINK(s->mode))
+		s->mode = S_IFREG | ce_permissions(mode);
 	if (name == file_from_standard_input)
 		populate_from_stdin(s);
 	return s;
diff --git a/diff.c b/diff.c
index 2fc0226338..bb04eab331 100644
--- a/diff.c
+++ b/diff.c
@@ -2839,9 +2839,18 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
 		fd = open(s->path, O_RDONLY);
 		if (fd < 0)
 			goto err_empty;
-		s->data = xmmap(NULL, s->size, PROT_READ, MAP_PRIVATE, fd, 0);
+		if (!S_ISREG(st.st_mode)) {
+			struct strbuf sb = STRBUF_INIT;
+			strbuf_read(&sb, fd, 0);
+			s->size = sb.len;
+			s->data = strbuf_detach(&sb, NULL);
+			s->should_free = 1;
+		}
+		else {
+			s->data = xmmap(NULL, s->size, PROT_READ, MAP_PRIVATE, fd, 0);
+			s->should_munmap = 1;
+		}
 		close(fd);
-		s->should_munmap = 1;
 
 		/*
 		 * Convert from working tree format to canonical git format
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index c6046fef19..ba343566c0 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -157,4 +157,14 @@ test_expect_success SYMLINKS 'diff --no-index --no-dereference does not follow s
 	test_cmp expect actual
 '
 
+test_expect_success PROCESS_SUBSTITUTION 'diff --no-index works on fifos' '
+	cat >expect <<\EOF
+		@@ -1 +1 @@
+		-1
+		+2
+	EOF
+	test_expect_code 1 git diff --no-index <(echo 1) <(echo 2) | tail -n +5 > actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 11562bde10..78f3d24651 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1128,3 +1128,7 @@ build_option () {
 test_lazy_prereq LONG_IS_64BIT '
 	test 8 -le "$(build_option sizeof-long)"
 '
+
+test_lazy_prereq PROCESS_SUBSTITUTION '
+	eval "foo=<(echo test)" 2>/dev/null
+'
-- 
2.11.0-234-gaf85957


^ permalink raw reply related

* [PATCH v2 1/2] diff --no-index: follow symlinks
From: Dennis Kaarsemaker @ 2017-01-13 10:20 UTC (permalink / raw)
  To: git; +Cc: Dennis Kaarsemaker
In-Reply-To: <20170113102021.6054-1-dennis@kaarsemaker.net>

Git's diff machinery does not follow symlinks, which makes sense as git
itself also does not, but stores the symlink destination.

In --no-index mode however, it is useful for diff to to follow symlinks,
matching the behaviour of ordinary diff. A new --no-dereference (name
copied from diff) option has been added to disable this behaviour.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
---
 Documentation/diff-options.txt |  7 +++++++
 diff-no-index.c                |  7 ++++---
 diff.c                         | 10 ++++++++--
 diff.h                         |  2 +-
 t/t4053-diff-no-index.sh       | 30 ++++++++++++++++++++++++++++++
 5 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 2d77a19626..48bcf3cc5e 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -216,6 +216,13 @@ any of those replacements occurred.
 	commit range.  Defaults to `diff.submodule` or the 'short' format
 	if the config option is unset.
 
+ifdef::git-diff[]
+--no-dereference::
+	Normally, "git diff --no-index" will dereference symlinks and compare
+	the contents of the linked files, mimicking ordinary diff. This
+	option disables that behaviour.
+endif::git-diff[]
+
 --color[=<when>]::
 	Show colored diff.
 	`--color` (i.e. without '=<when>') is the same as `--color=always`.
diff --git a/diff-no-index.c b/diff-no-index.c
index f420786039..826fe97ffc 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -40,7 +40,7 @@ static int read_directory_contents(const char *path, struct string_list *list)
  */
 static const char file_from_standard_input[] = "-";
 
-static int get_mode(const char *path, int *mode)
+static int get_mode(const char *path, int *mode, int no_dereference)
 {
 	struct stat st;
 
@@ -52,7 +52,7 @@ static int get_mode(const char *path, int *mode)
 #endif
 	else if (path == file_from_standard_input)
 		*mode = create_ce_mode(0666);
-	else if (lstat(path, &st))
+	else if (no_dereference ? lstat(path, &st) : stat(path, &st))
 		return error("Could not access '%s'", path);
 	else
 		*mode = st.st_mode;
@@ -93,7 +93,8 @@ static int queue_diff(struct diff_options *o,
 {
 	int mode1 = 0, mode2 = 0;
 
-	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
+	if (get_mode(name1, &mode1, DIFF_OPT_TST(o, NO_DEREFERENCE)) ||
+		get_mode(name2, &mode2, DIFF_OPT_TST(o, NO_DEREFERENCE)))
 		return -1;
 
 	if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2)) {
diff --git a/diff.c b/diff.c
index be11e4ef2b..2fc0226338 100644
--- a/diff.c
+++ b/diff.c
@@ -2815,7 +2815,7 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
 		s->size = xsize_t(st.st_size);
 		if (!s->size)
 			goto empty;
-		if (S_ISLNK(st.st_mode)) {
+		if (S_ISLNK(s->mode)) {
 			struct strbuf sb = STRBUF_INIT;
 
 			if (strbuf_readlink(&sb, s->path, s->size))
@@ -2825,6 +2825,10 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
 			s->should_free = 1;
 			return 0;
 		}
+		if (S_ISLNK(st.st_mode)) {
+			stat(s->path, &st);
+			s->size = xsize_t(st.st_size);
+		}
 		if (size_only)
 			return 0;
 		if ((flags & CHECK_BINARY) &&
@@ -3884,7 +3888,9 @@ int diff_opt_parse(struct diff_options *options,
 	else if (!strcmp(arg, "--no-follow")) {
 		DIFF_OPT_CLR(options, FOLLOW_RENAMES);
 		DIFF_OPT_CLR(options, DEFAULT_FOLLOW_RENAMES);
-	} else if (!strcmp(arg, "--color"))
+	} else if (!strcmp(arg, "--no-dereference"))
+		DIFF_OPT_SET(options, NO_DEREFERENCE);
+	else if (!strcmp(arg, "--color"))
 		options->use_color = 1;
 	else if (skip_prefix(arg, "--color=", &arg)) {
 		int value = git_config_colorbool(NULL, arg);
diff --git a/diff.h b/diff.h
index 25ae60d5ff..74883db1eb 100644
--- a/diff.h
+++ b/diff.h
@@ -69,7 +69,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES      (1 <<  7)
 #define DIFF_OPT_RENAME_EMPTY        (1 <<  8)
-/* (1 <<  9) unused */
+#define DIFF_OPT_NO_DEREFERENCE      (1 <<  9)
 #define DIFF_OPT_HAS_CHANGES         (1 << 10)
 #define DIFF_OPT_QUICK               (1 << 11)
 #define DIFF_OPT_NO_INDEX            (1 << 12)
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 453e6c35eb..c6046fef19 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -127,4 +127,34 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit)
 	test_cmp expect actual.head
 '
 
+test_expect_success SYMLINKS 'diff --no-index follows symlinks' '
+	echo a >1 &&
+	echo b >2 &&
+	ln -s 1 3 &&
+	ln -s 2 4 &&
+	cat >expect <<\EOF
+		--- a/3
+		+++ b/4
+		@@ -1 +1 @@
+		-a
+		+b
+	EOF
+	test_expect_code 1 git diff --no-index 3 4 | tail -n +3 >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success SYMLINKS 'diff --no-index --no-dereference does not follow symlinks' '
+	cat >expect <<\EOF
+		--- a/3
+		+++ b/4
+		@@ -1 +1 @@
+		-1
+		\ No newline at end of file
+		+2
+		\ No newline at end of file
+	EOF
+	test_expect_code 1 git diff --no-index --no-dereference 3 4 | tail -n +3 > actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.11.0-234-gaf85957


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox