Git development
* [PATCH/WIP 00/11] read_directory() rewrite to support struct pathspec
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC
  To: git; +Cc: Nguyễn Thái Ngọc Duy

This is the first time "make test" fully passes (*) for me, so it's
probably good enough for human eyes. Just a heads-up on where this might
go.

A few points:

 - "git add --ignore-missing" is killed because I could not find an
   easy way to incorporate it into the new read_directory(). It looks
   like a hack to me, to expose .gitignore matching. Luckily nothing
   except git-submodule seems to use it.

 - I chose to use tree_entry_interesting() instead of
   match_pathspec(). The former has more optimizations but requires a
   tree-based structure. So I have to read the whole directory in and
   reconstruct a temporary tree object to make t_e_i() happy. I
   _think_ it does not impact performance for directories of
   reasonable size.

 - There'll be more work to get rid of match_pathspec() calls after
   read_directory()/fill_directory(). I haven't finished this part
   yet.

 - I'd really like to kill match_pathspec() so we only have one
   pathspec implementation instead of the two we have now, but that
   may be really hard because of staged entries in the index.

(*) t7012.7 fails but I think that's the test's fault.

Nguyễn Thái Ngọc Duy (11):
  Introduce "check-attr --excluded" as a replacement for "add --ignore-missing"
  notes-merge: use opendir/readdir instead of using read_directory()
  t5403: avoid doing "git add foo/bar" where foo/.git exists
  tree-walk.c: do not leak internal structure in tree_entry_len()
  symbolize return values of tree_entry_interesting()
  read_directory_recursive: reduce one indentation level
  tree_entry_interesting: make use of local pointer "item"
  tree-walk: mark useful pathspecs
  tree_entry_interesting: differentiate partial vs full match
  read-dir: stop using path_simplify code in favor of tree_entry_interesting()
  dir.c: remove dead code after read_directory() rewrite

 Documentation/git-check-attr.txt |    4 +
 builtin/add.c                    |   36 ++--
 builtin/check-attr.c             |   26 +++
 builtin/grep.c                   |   11 +-
 builtin/pack-objects.c           |    2 +-
 cache.h                          |    1 +
 dir.c                            |  428 +++++++++++++++++++-------------------
 dir.h                            |    8 +-
 git-submodule.sh                 |    2 +-
 list-objects.c                   |    9 +-
 notes-merge.c                    |   45 +++--
 t/t3700-add.sh                   |   19 --
 t/t5403-post-checkout-hook.sh    |   17 +-
 tree-diff.c                      |   19 +-
 tree-walk.c                      |   85 ++++----
 tree-walk.h                      |   19 ++-
 tree.c                           |   11 +-
 unpack-trees.c                   |    6 +-
 18 files changed, 394 insertions(+), 354 deletions(-)

-- 
1.7.3.1.256.g2539c.dirty


* [PATCH/WIP 01/11] Introduce "check-attr --excluded" as a replacement for "add --ignore-missing"
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

--ignore-missing is used by git-submodule to check whether a path may be
ignored by .gitignore files. It does not really fit in git-add ("git
add" takes pathspecs, but --ignore-missing takes only paths).

A Google search suggests that --ignore-missing is not used anywhere but
git-submodule.sh. Remove --ignore-missing and introduce "check-attr
--excluded" as a replacement.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-check-attr.txt |    4 ++++
 builtin/add.c                    |   14 +++-----------
 builtin/check-attr.c             |   26 ++++++++++++++++++++++++++
 git-submodule.sh                 |    2 +-
 t/t3700-add.sh                   |   19 -------------------
 5 files changed, 34 insertions(+), 31 deletions(-)

diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt
index 5abdbaa..94d2068 100644
--- a/Documentation/git-check-attr.txt
+++ b/Documentation/git-check-attr.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git check-attr' [-a | --all | attr...] [--] pathname...
 'git check-attr' --stdin [-z] [-a | --all | attr...] < <list-of-paths>
+'git check-attr' --excluded pathname...
 
 DESCRIPTION
 -----------
@@ -34,6 +35,9 @@ OPTIONS
 	Only meaningful with `--stdin`; paths are separated with a
 	NUL character instead of a linefeed character.
 
+--excluded::
+	Check if given paths are excluded by standard .gitignore rules.
+
 \--::
 	Interpret all preceding arguments as attributes and all following
 	arguments as path names.
diff --git a/builtin/add.c b/builtin/add.c
index c59b0c9..23ad4b8 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -310,7 +310,7 @@ static const char ignore_error[] =
 N_("The following paths are ignored by one of your .gitignore files:\n");
 
 static int verbose = 0, show_only = 0, ignored_too = 0, refresh_only = 0;
-static int ignore_add_errors, addremove, intent_to_add, ignore_missing = 0;
+static int ignore_add_errors, addremove, intent_to_add;
 
 static struct option builtin_add_options[] = {
 	OPT__DRY_RUN(&show_only, "dry run"),
@@ -325,7 +325,6 @@ static struct option builtin_add_options[] = {
 	OPT_BOOLEAN('A', "all", &addremove, "add changes from all tracked and untracked files"),
 	OPT_BOOLEAN( 0 , "refresh", &refresh_only, "don't add, only refresh the index"),
 	OPT_BOOLEAN( 0 , "ignore-errors", &ignore_add_errors, "just skip files which cannot be added because of errors"),
-	OPT_BOOLEAN( 0 , "ignore-missing", &ignore_missing, "check if - even missing - files are ignored in dry run"),
 	OPT_END(),
 };
 
@@ -387,8 +386,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (addremove && take_worktree_changes)
 		die(_("-A and -u are mutually incompatible"));
-	if (!show_only && ignore_missing)
-		die(_("Option --ignore-missing can only be used together with --dry-run"));
 	if ((addremove || take_worktree_changes) && !argc) {
 		static const char *here[2] = { ".", NULL };
 		argc = 1;
@@ -446,13 +443,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		for (i = 0; pathspec[i]; i++) {
 			if (!seen[i] && pathspec[i][0]
 			    && !file_exists(pathspec[i])) {
-				if (ignore_missing) {
-					int dtype = DT_UNKNOWN;
-					if (excluded(&dir, pathspec[i], &dtype))
-						dir_add_ignored(&dir, pathspec[i], strlen(pathspec[i]));
-				} else
-					die(_("pathspec '%s' did not match any files"),
-					    pathspec[i]);
+				die(_("pathspec '%s' did not match any files"),
+				    pathspec[i]);
 			}
 		}
 		free(seen);
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index 44c421e..4c17ccc 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -2,11 +2,13 @@
 #include "cache.h"
 #include "attr.h"
 #include "quote.h"
+#include "dir.h"
 #include "parse-options.h"
 
 static int all_attrs;
 static int cached_attrs;
 static int stdin_paths;
+static int exclude;
 static const char * const check_attr_usage[] = {
 "git check-attr [-a | --all | attr...] [--] pathname...",
 "git check-attr --stdin [-a | --all | attr...] < <list-of-paths>",
@@ -21,6 +23,7 @@ static const struct option check_attr_options[] = {
 	OPT_BOOLEAN(0 , "stdin", &stdin_paths, "read file names from stdin"),
 	OPT_BOOLEAN('z', NULL, &null_term_line,
 		"input paths are terminated by a null character"),
+	OPT_BOOLEAN(0,  "excluded", &exclude, "check exclude patterns"),
 	OPT_END()
 };
 
@@ -43,6 +46,16 @@ static void output_attr(int cnt, struct git_attr_check *check,
 	}
 }
 
+static void check_exclude(struct dir_struct *dir, const char *prefix, const char *file)
+{
+	char *full_path =
+		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
+	int dtype = DT_UNKNOWN;
+	if (excluded(dir, full_path, &dtype))
+		die("%s is ignored by one of your .gitignore files", full_path);
+	free(full_path);
+}
+
 static void check_attr(const char *prefix, int cnt,
 	struct git_attr_check *check, const char *file)
 {
@@ -103,6 +116,19 @@ int cmd_check_attr(int argc, const char **argv, const char *prefix)
 		die("invalid cache");
 	}
 
+	if (exclude) {
+		struct dir_struct dir;
+
+		if (stdin_paths)
+			die("--excluded cannot be used with --stdin (yet)");
+
+		memset(&dir, 0, sizeof(dir));
+		setup_standard_excludes(&dir);
+		for (i = 0; i < argc; i++)
+			check_exclude(&dir, prefix, argv[i]);
+		return 0;
+	}
+
 	if (cached_attrs)
 		git_attr_set_direction(GIT_ATTR_INDEX, NULL);
 
diff --git a/git-submodule.sh b/git-submodule.sh
index 928a62f..0bc3762 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -262,7 +262,7 @@ cmd_add()
 	git ls-files --error-unmatch "$path" > /dev/null 2>&1 &&
 	die "$(eval_gettext "'\$path' already exists in the index")"
 
-	if test -z "$force" && ! git add --dry-run --ignore-missing "$path" > /dev/null 2>&1
+	if test -z "$force" && ! git check-attr --excluded "$path" > /dev/null 2>&1
 	then
 		eval_gettextln "The following path is ignored by one of your .gitignore files:
 \$path
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index 575d950..23ff998 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -276,23 +276,4 @@ test_expect_success 'git add --dry-run of an existing file output' "
 	test_i18ncmp expect actual
 "
 
-cat >expect.err <<\EOF
-The following paths are ignored by one of your .gitignore files:
-ignored-file
-Use -f if you really want to add them.
-fatal: no files added
-EOF
-cat >expect.out <<\EOF
-add 'track-this'
-EOF
-
-test_expect_success 'git add --dry-run --ignore-missing of non-existing file' '
-	test_must_fail git add --dry-run --ignore-missing track-this ignored-file >actual.out 2>actual.err
-'
-
-test_expect_success 'git add --dry-run --ignore-missing of non-existing file output' '
-	test_i18ncmp expect.out actual.out &&
-	test_i18ncmp expect.err actual.err
-'
-
 test_done
-- 
1.7.3.1.256.g2539c.dirty


* [PATCH/WIP 02/11] notes-merge: use opendir/readdir instead of using read_directory()
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

notes_merge_commit() only needs to list all entries (non-recursively)
under a directory, which can be easily accomplished with
opendir/readdir and would be more lightweight than read_directory().

read_directory() is designed to list paths inside a working
directory. Using it outside of its scope may lead to undesired effects.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
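For readers who want the idea without the surrounding notes-merge
code, here is a minimal standalone sketch of a non-recursive listing
with opendir/readdir. It is not the code in this patch: the demo_*
names are hypothetical, and isxdigit() stands in for get_sha1_hex(),
which is only available inside git.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: 1 if name is exactly 40 hex digits, else 0. */
static int demo_is_sha1_hex(const char *name)
{
	size_t i;

	if (strlen(name) != 40)
		return 0;
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)name[i]))
			return 0;
	return 1;
}

/*
 * Count the entries directly under "path" whose names look like
 * hex SHA-1s. Returns -1 if the directory cannot be opened.
 * No recursion and no exclude handling: only opendir, readdir
 * and closedir are involved.
 */
static int demo_count_sha1_entries(const char *path)
{
	DIR *dir;
	struct dirent *e;
	int count = 0;

	assert(path != NULL);
	dir = opendir(path);
	if (!dir)
		return -1;
	while ((e = readdir(dir)) != NULL) {
		/* readdir() also reports "." and ".."; skip them */
		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;
		if (demo_is_sha1_hex(e->d_name))
			count++;
	}
	closedir(dir);
	return count;
}
```

The real function additionally stats each entry and writes it as a
blob; the sketch only shows why read_directory() is not needed to
enumerate a single directory level.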
 notes-merge.c |   45 +++++++++++++++++++++++++++------------------
 1 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/notes-merge.c b/notes-merge.c
index e9e4199..80d64a2 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -680,48 +680,57 @@ int notes_merge_commit(struct notes_merge_options *o,
 	 * commit message and parents from 'partial_commit'.
 	 * Finally store the new commit object SHA1 into 'result_sha1'.
 	 */
-	struct dir_struct dir;
-	char *path = xstrdup(git_path(NOTES_MERGE_WORKTREE "/"));
-	int path_len = strlen(path), i;
+	DIR *dir;
+	struct dirent *e;
+	struct strbuf path = STRBUF_INIT;
 	const char *msg = strstr(partial_commit->buffer, "\n\n");
+	int baselen;
 
-	OUTPUT(o, 3, "Committing notes in notes merge worktree at %.*s",
-	       path_len - 1, path);
+	strbuf_addstr(&path, git_path(NOTES_MERGE_WORKTREE));
+	OUTPUT(o, 3, "Committing notes in notes merge worktree at %s", path.buf);
 
 	if (!msg || msg[2] == '\0')
 		die("partial notes commit has empty message");
 	msg += 2;
 
-	memset(&dir, 0, sizeof(dir));
-	read_directory(&dir, path, path_len, NULL);
-	for (i = 0; i < dir.nr; i++) {
-		struct dir_entry *ent = dir.entries[i];
+	dir = opendir(path.buf);
+	if (!dir)
+		die_errno("could not open %s", path.buf);
+
+	strbuf_addch(&path, '/');
+	baselen = path.len;
+	while ((e = readdir(dir)) != NULL) {
 		struct stat st;
-		const char *relpath = ent->name + path_len;
 		unsigned char obj_sha1[20], blob_sha1[20];
 
-		if (ent->len - path_len != 40 || get_sha1_hex(relpath, obj_sha1)) {
-			OUTPUT(o, 3, "Skipping non-SHA1 entry '%s'", ent->name);
+		if (is_dot_or_dotdot(e->d_name))
+			continue;
+
+		if (strlen(e->d_name) != 40 || get_sha1_hex(e->d_name, obj_sha1)) {
+			OUTPUT(o, 3, "Skipping non-SHA1 entry '%s%s'", path.buf, e->d_name);
 			continue;
 		}
 
+		strbuf_addstr(&path, e->d_name);
 		/* write file as blob, and add to partial_tree */
-		if (stat(ent->name, &st))
-			die_errno("Failed to stat '%s'", ent->name);
-		if (index_path(blob_sha1, ent->name, &st, HASH_WRITE_OBJECT))
-			die("Failed to write blob object from '%s'", ent->name);
+		if (stat(path.buf, &st))
+			die_errno("Failed to stat '%s'", path.buf);
+		if (index_path(blob_sha1, path.buf, &st, HASH_WRITE_OBJECT))
+			die("Failed to write blob object from '%s'", path.buf);
 		if (add_note(partial_tree, obj_sha1, blob_sha1, NULL))
 			die("Failed to add resolved note '%s' to notes tree",
-			    ent->name);
+			    path.buf);
 		OUTPUT(o, 4, "Added resolved note for object %s: %s",
 		       sha1_to_hex(obj_sha1), sha1_to_hex(blob_sha1));
+		strbuf_setlen(&path, baselen);
 	}
 
 	create_notes_commit(partial_tree, partial_commit->parents, msg,
 			    result_sha1);
 	OUTPUT(o, 4, "Finalized notes merge commit: %s",
 	       sha1_to_hex(result_sha1));
-	free(path);
+	strbuf_release(&path);
+	closedir(dir);
 	return 0;
 }
 
-- 
1.7.3.1.256.g2539c.dirty


* [PATCH/WIP 03/11] t5403: avoid doing "git add foo/bar" where foo/.git exists
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

In this case, "foo" is considered a submodule and bar, if added,
belongs to foo/.git. "git add" should only allow "git add foo" in this
case, but it passes somehow.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t5403-post-checkout-hook.sh |   17 ++++++++++-------
 1 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/t/t5403-post-checkout-hook.sh b/t/t5403-post-checkout-hook.sh
index 1753ef2..3b3e2c1 100755
--- a/t/t5403-post-checkout-hook.sh
+++ b/t/t5403-post-checkout-hook.sh
@@ -16,10 +16,13 @@ test_expect_success setup '
 	git update-ref refs/heads/master $commit0 &&
 	git clone ./. clone1 &&
 	git clone ./. clone2 &&
-	GIT_DIR=clone2/.git git branch new2 &&
-	echo Data for commit1. >clone2/b &&
-	GIT_DIR=clone2/.git git add clone2/b &&
-	GIT_DIR=clone2/.git git commit -m new2
+	(
+		cd clone2 &&
+		git branch new2 &&
+		echo Data for commit1. >b &&
+		git add b &&
+		git commit -m new2
+	)
 '
 
 for clone in 1 2; do
@@ -48,7 +51,7 @@ test_expect_success 'post-checkout runs as expected ' '
 '
 
 test_expect_success 'post-checkout args are correct with git checkout -b ' '
-	GIT_DIR=clone1/.git git checkout -b new1 &&
+	( cd clone1 && git checkout -b new1 ) &&
 	old=$(awk "{print \$1}" clone1/.git/post-checkout.args) &&
 	new=$(awk "{print \$2}" clone1/.git/post-checkout.args) &&
 	flag=$(awk "{print \$3}" clone1/.git/post-checkout.args) &&
@@ -56,7 +59,7 @@ test_expect_success 'post-checkout args are correct with git checkout -b ' '
 '
 
 test_expect_success 'post-checkout receives the right args with HEAD changed ' '
-	GIT_DIR=clone2/.git git checkout new2 &&
+	( cd clone2 && git checkout new2 ) &&
 	old=$(awk "{print \$1}" clone2/.git/post-checkout.args) &&
 	new=$(awk "{print \$2}" clone2/.git/post-checkout.args) &&
 	flag=$(awk "{print \$3}" clone2/.git/post-checkout.args) &&
@@ -64,7 +67,7 @@ test_expect_success 'post-checkout receives the right args with HEAD changed ' '
 '
 
 test_expect_success 'post-checkout receives the right args when not switching branches ' '
-	GIT_DIR=clone2/.git git checkout master b &&
+	( cd clone2 && git checkout master b ) &&
 	old=$(awk "{print \$1}" clone2/.git/post-checkout.args) &&
 	new=$(awk "{print \$2}" clone2/.git/post-checkout.args) &&
 	flag=$(awk "{print \$3}" clone2/.git/post-checkout.args) &&
-- 
1.7.3.1.256.g2539c.dirty


* [PATCH/WIP 04/11] tree-walk.c: do not leak internal structure in tree_entry_len()
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

tree_entry_len() does not simply take two arbitrary arguments and return
the name length of a tree entry. The two pointers must point into a tree
item structure, or struct name_entry. Passing unrelated pointers will
return an incorrect value.

Force callers to pass struct name_entry instead of two pointers (in the
hope that they don't manually construct struct name_entry themselves).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
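To illustrate why the two pointers cannot be arbitrary, here is a
small standalone sketch. A raw tree entry is laid out as
"<octal mode> <name>\0<20-byte sha1>", so the name length is the
distance between the two pointers minus the NUL. The demo_* names
below are hypothetical stand-ins, not git's own structures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for git's struct name_entry. */
struct demo_name_entry {
	const unsigned char *sha1; /* just past the NUL that ends path */
	const char *path;          /* points into the same buffer */
	unsigned int mode;
};

/* Same arithmetic as tree_entry_len(): sha1 - path - 1 (the NUL). */
static int demo_entry_len(const struct demo_name_entry *ne)
{
	return (int)((const char *)ne->sha1 - ne->path - 1);
}

/*
 * Decode one raw entry of the form "<octal mode> <name>\0<sha1>".
 * Both pointers are derived from the same buffer, which is the
 * only situation in which the subtraction above is meaningful.
 */
static void demo_decode(const unsigned char *buf,
			struct demo_name_entry *ne)
{
	const char *p = (const char *)buf;
	unsigned int mode = 0;

	while (*p != ' ') {
		assert(*p >= '0' && *p <= '7');
		mode = (mode << 3) + (unsigned int)(*p++ - '0');
	}
	ne->mode = mode;
	ne->path = p + 1;
	ne->sha1 = (const unsigned char *)ne->path + strlen(ne->path) + 1;
}
```

With the old two-pointer signature nothing stopped a caller from
passing a path and a sha1 that live in different buffers; taking the
struct makes that mistake harder to write.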
 builtin/grep.c         |    2 +-
 builtin/pack-objects.c |    2 +-
 tree-diff.c            |    6 +++---
 tree-walk.c            |   16 ++++++++--------
 tree-walk.h            |    6 +++---
 tree.c                 |    2 +-
 unpack-trees.c         |    6 +++---
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 7d0779f..2cd0612 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -547,7 +547,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 	int old_baselen = base->len;
 
 	while (tree_entry(tree, &entry)) {
-		int te_len = tree_entry_len(entry.path, entry.sha1);
+		int te_len = tree_entry_len(&entry);
 
 		if (match != 2) {
 			match = tree_entry_interesting(&entry, base, tn_len, pathspec);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2b18de5..864154b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -975,7 +975,7 @@ static void add_pbase_object(struct tree_desc *tree,
 	while (tree_entry(tree,&entry)) {
 		if (S_ISGITLINK(entry.mode))
 			continue;
-		cmp = tree_entry_len(entry.path, entry.sha1) != cmplen ? 1 :
+		cmp = tree_entry_len(&entry) != cmplen ? 1 :
 		      memcmp(name, entry.path, cmplen);
 		if (cmp > 0)
 			continue;
diff --git a/tree-diff.c b/tree-diff.c
index b3cc2e4..6782484 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -21,8 +21,8 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2,
 	sha1 = tree_entry_extract(t1, &path1, &mode1);
 	sha2 = tree_entry_extract(t2, &path2, &mode2);
 
-	pathlen1 = tree_entry_len(path1, sha1);
-	pathlen2 = tree_entry_len(path2, sha2);
+	pathlen1 = tree_entry_len(&t1->entry);
+	pathlen2 = tree_entry_len(&t2->entry);
 	cmp = base_name_compare(path1, pathlen1, mode1, path2, pathlen2, mode2);
 	if (cmp < 0) {
 		show_entry(opt, "-", t1, base);
@@ -85,7 +85,7 @@ static void show_entry(struct diff_options *opt, const char *prefix,
 	unsigned mode;
 	const char *path;
 	const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode);
-	int pathlen = tree_entry_len(path, sha1);
+	int pathlen = tree_entry_len(&desc->entry);
 	int old_baselen = base->len;
 
 	strbuf_add(base, path, pathlen);
diff --git a/tree-walk.c b/tree-walk.c
index 418107e..f5d19f9 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -116,7 +116,7 @@ void setup_traverse_info(struct traverse_info *info, const char *base)
 
 char *make_traverse_path(char *path, const struct traverse_info *info, const struct name_entry *n)
 {
-	int len = tree_entry_len(n->path, n->sha1);
+	int len = tree_entry_len(n);
 	int pathlen = info->pathlen;
 
 	path[pathlen + len] = 0;
@@ -126,7 +126,7 @@ char *make_traverse_path(char *path, const struct traverse_info *info, const str
 			break;
 		path[--pathlen] = '/';
 		n = &info->name;
-		len = tree_entry_len(n->path, n->sha1);
+		len = tree_entry_len(n);
 		info = info->prev;
 		pathlen -= len;
 	}
@@ -253,7 +253,7 @@ static void extended_entry_extract(struct tree_desc_x *t,
 	 * The caller wants "first" from this tree, or nothing.
 	 */
 	path = a->path;
-	len = tree_entry_len(a->path, a->sha1);
+	len = tree_entry_len(a);
 	switch (check_entry_match(first, first_len, path, len)) {
 	case -1:
 		entry_clear(a);
@@ -271,7 +271,7 @@ static void extended_entry_extract(struct tree_desc_x *t,
 	while (probe.size) {
 		entry_extract(&probe, a);
 		path = a->path;
-		len = tree_entry_len(a->path, a->sha1);
+		len = tree_entry_len(a);
 		switch (check_entry_match(first, first_len, path, len)) {
 		case -1:
 			entry_clear(a);
@@ -362,7 +362,7 @@ int traverse_trees(int n, struct tree_desc *t, struct traverse_info *info)
 			e = entry + i;
 			if (!e->path)
 				continue;
-			len = tree_entry_len(e->path, e->sha1);
+			len = tree_entry_len(e);
 			if (!first) {
 				first = e->path;
 				first_len = len;
@@ -381,7 +381,7 @@ int traverse_trees(int n, struct tree_desc *t, struct traverse_info *info)
 				/* Cull the ones that are not the earliest */
 				if (!e->path)
 					continue;
-				len = tree_entry_len(e->path, e->sha1);
+				len = tree_entry_len(e);
 				if (name_compare(e->path, len, first, first_len))
 					entry_clear(e);
 			}
@@ -434,8 +434,8 @@ static int find_tree_entry(struct tree_desc *t, const char *name, unsigned char
 		int entrylen, cmp;
 
 		sha1 = tree_entry_extract(t, &entry, mode);
+		entrylen = tree_entry_len(&t->entry);
 		update_tree_entry(t);
-		entrylen = tree_entry_len(entry, sha1);
 		if (entrylen > namelen)
 			continue;
 		cmp = memcmp(name, entry, entrylen);
@@ -596,7 +596,7 @@ int tree_entry_interesting(const struct name_entry *entry,
 				      ps->max_depth);
 	}
 
-	pathlen = tree_entry_len(entry->path, entry->sha1);
+	pathlen = tree_entry_len(entry);
 
 	for (i = ps->nr - 1; i >= 0; i--) {
 		const struct pathspec_item *item = ps->items+i;
diff --git a/tree-walk.h b/tree-walk.h
index 0089581..884d01a 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -20,9 +20,9 @@ static inline const unsigned char *tree_entry_extract(struct tree_desc *desc, co
 	return desc->entry.sha1;
 }
 
-static inline int tree_entry_len(const char *name, const unsigned char *sha1)
+static inline int tree_entry_len(const struct name_entry *ne)
 {
-	return (const char *)sha1 - name - 1;
+	return (const char *)ne->sha1 - ne->path - 1;
 }
 
 void update_tree_entry(struct tree_desc *);
@@ -58,7 +58,7 @@ extern void setup_traverse_info(struct traverse_info *info, const char *base);
 
 static inline int traverse_path_len(const struct traverse_info *info, const struct name_entry *n)
 {
-	return info->pathlen + tree_entry_len(n->path, n->sha1);
+	return info->pathlen + tree_entry_len(n);
 }
 
 extern int tree_entry_interesting(const struct name_entry *, struct strbuf *, int, const struct pathspec *ps);
diff --git a/tree.c b/tree.c
index 698ecf7..e622198 100644
--- a/tree.c
+++ b/tree.c
@@ -99,7 +99,7 @@ static int read_tree_1(struct tree *tree, struct strbuf *base,
 		else
 			continue;
 
-		len = tree_entry_len(entry.path, entry.sha1);
+		len = tree_entry_len(&entry);
 		strbuf_add(base, entry.path, len);
 		strbuf_addch(base, '/');
 		retval = read_tree_1(lookup_tree(sha1),
diff --git a/unpack-trees.c b/unpack-trees.c
index 8282f5e..7c9ecf6 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -446,7 +446,7 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 	newinfo.prev = info;
 	newinfo.pathspec = info->pathspec;
 	newinfo.name = *p;
-	newinfo.pathlen += tree_entry_len(p->path, p->sha1) + 1;
+	newinfo.pathlen += tree_entry_len(p) + 1;
 	newinfo.conflicts |= df_conflicts;
 
 	for (i = 0; i < n; i++, dirmask >>= 1) {
@@ -495,7 +495,7 @@ static int do_compare_entry(const struct cache_entry *ce, const struct traverse_
 	ce_len -= pathlen;
 	ce_name = ce->name + pathlen;
 
-	len = tree_entry_len(n->path, n->sha1);
+	len = tree_entry_len(n);
 	return df_name_compare(ce_name, ce_len, S_IFREG, n->path, len, n->mode);
 }
 
@@ -626,7 +626,7 @@ static int find_cache_pos(struct traverse_info *info,
 	struct unpack_trees_options *o = info->data;
 	struct index_state *index = o->src_index;
 	int pfxlen = info->pathlen;
-	int p_len = tree_entry_len(p->path, p->sha1);
+	int p_len = tree_entry_len(p);
 
 	for (pos = o->cache_bottom; pos < index->cache_nr; pos++) {
 		struct cache_entry *ce = index->cache[pos];
-- 
1.7.3.1.256.g2539c.dirty


* [PATCH/WIP 05/11] symbolize return values of tree_entry_interesting()
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

This helps extend the set of values later on for "interesting, but cannot
decide if the entry truly matches yet" (i.e. prefix matches).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
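The caller pattern that every converted call site follows can be
sketched in isolation as below. The enum values are the ones added to
tree-walk.h by this patch; demo_match() is a hypothetical toy matcher
(literal prefix over names in sorted order) that only stands in for
tree_entry_interesting(), and demo_count() is likewise made up for
illustration.

```c
#include <assert.h>
#include <string.h>

/* Values as added to tree-walk.h in this patch. */
enum interesting {
	all_entries_not_interesting = -1,
	entry_not_interesting = 0,
	entry_interesting = 1,
	all_entries_interesting = 2
};

/*
 * Hypothetical toy matcher: names arrive in sorted order and
 * "prefix" is a literal prefix. Once a name sorts after the
 * prefix, no later name can match.
 */
static enum interesting demo_match(const char *name, const char *prefix)
{
	int cmp;

	assert(name != NULL && prefix != NULL);
	if (!*prefix)
		return all_entries_interesting;
	cmp = strncmp(name, prefix, strlen(prefix));
	if (!cmp)
		return entry_interesting;
	return cmp > 0 ? all_entries_not_interesting
		       : entry_not_interesting;
}

/* Same loop shape as grep_tree(), show_tree() and read_tree_1(). */
static int demo_count(const char **names, int nr, const char *prefix)
{
	enum interesting match = entry_not_interesting;
	int i, hits = 0;

	for (i = 0; i < nr; i++) {
		if (match != all_entries_interesting) {
			match = demo_match(names[i], prefix);
			if (match == all_entries_not_interesting)
				break;    /* nothing later can match */
			if (match == entry_not_interesting)
				continue; /* skip this entry only */
		}
		hits++;
	}
	return hits;
}
```

The two "all_entries_*" values are what let a caller stop calling the
matcher early, either by breaking out of the loop or by accepting
everything that follows.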
 builtin/grep.c |    9 +++++----
 list-objects.c |    9 +++++----
 tree-diff.c    |   13 +++++++------
 tree-walk.c    |   45 +++++++++++++++++++++------------------------
 tree-walk.h    |   12 +++++++++++-
 tree.c         |    9 +++++----
 6 files changed, 54 insertions(+), 43 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 2cd0612..2fc51fa 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -542,18 +542,19 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int
 static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 		     struct tree_desc *tree, struct strbuf *base, int tn_len)
 {
-	int hit = 0, match = 0;
+	int hit = 0;
+	enum interesting match = entry_not_interesting;
 	struct name_entry entry;
 	int old_baselen = base->len;
 
 	while (tree_entry(tree, &entry)) {
 		int te_len = tree_entry_len(&entry);
 
-		if (match != 2) {
+		if (match != all_entries_interesting) {
 			match = tree_entry_interesting(&entry, base, tn_len, pathspec);
-			if (match < 0)
+			if (match == all_entries_not_interesting)
 				break;
-			if (match == 0)
+			if (match == entry_not_interesting)
 				continue;
 		}
 
diff --git a/list-objects.c b/list-objects.c
index 39d80c0..3dd4a96 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -71,7 +71,8 @@ static void process_tree(struct rev_info *revs,
 	struct tree_desc desc;
 	struct name_entry entry;
 	struct name_path me;
-	int match = revs->diffopt.pathspec.nr == 0 ? 2 : 0;
+	enum interesting match = revs->diffopt.pathspec.nr == 0 ?
+		all_entries_interesting: entry_not_interesting;
 	int baselen = base->len;
 
 	if (!revs->tree_objects)
@@ -97,12 +98,12 @@ static void process_tree(struct rev_info *revs,
 	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
-		if (match != 2) {
+		if (match != all_entries_interesting) {
 			match = tree_entry_interesting(&entry, base, 0,
 						       &revs->diffopt.pathspec);
-			if (match < 0)
+			if (match == all_entries_not_interesting)
 				break;
-			if (match == 0)
+			if (match == entry_not_interesting)
 				continue;
 		}
 
diff --git a/tree-diff.c b/tree-diff.c
index 6782484..25cc981 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -64,14 +64,14 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2,
 static void show_tree(struct diff_options *opt, const char *prefix,
 		      struct tree_desc *desc, struct strbuf *base)
 {
-	int match = 0;
+	enum interesting match = entry_not_interesting;
 	for (; desc->size; update_tree_entry(desc)) {
-		if (match != 2) {
+		if (match != all_entries_interesting) {
 			match = tree_entry_interesting(&desc->entry, base, 0,
 						       &opt->pathspec);
-			if (match < 0)
+			if (match == all_entries_not_interesting)
 				break;
-			if (match == 0)
+			if (match == entry_not_interesting)
 				continue;
 		}
 		show_entry(opt, prefix, desc, base);
@@ -114,12 +114,13 @@ static void show_entry(struct diff_options *opt, const char *prefix,
 }
 
 static void skip_uninteresting(struct tree_desc *t, struct strbuf *base,
-			       struct diff_options *opt, int *match)
+			       struct diff_options *opt,
+			       enum interesting *match)
 {
 	while (t->size) {
 		*match = tree_entry_interesting(&t->entry, base, 0, &opt->pathspec);
 		if (*match) {
-			if (*match < 0)
+			if (*match == all_entries_not_interesting)
 				t->size = 0;
 			break;
 		}
diff --git a/tree-walk.c b/tree-walk.c
index f5d19f9..fc03262 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -573,27 +573,23 @@ static int match_dir_prefix(const char *base,
  *
  * Pre-condition: either baselen == base_offset (i.e. empty path)
  * or base[baselen-1] == '/' (i.e. with trailing slash).
- *
- * Return:
- *  - 2 for "yes, and all subsequent entries will be"
- *  - 1 for yes
- *  - zero for no
- *  - negative for "no, and no subsequent entries will be either"
  */
-int tree_entry_interesting(const struct name_entry *entry,
-			   struct strbuf *base, int base_offset,
-			   const struct pathspec *ps)
+enum interesting tree_entry_interesting(const struct name_entry *entry,
+					struct strbuf *base, int base_offset,
+					const struct pathspec *ps)
 {
 	int i;
 	int pathlen, baselen = base->len - base_offset;
-	int never_interesting = ps->has_wildcard ? 0 : -1;
+	int never_interesting = ps->has_wildcard ?
+		entry_not_interesting : all_entries_not_interesting;
 
 	if (!ps->nr) {
 		if (!ps->recursive || ps->max_depth == -1)
-			return 2;
-		return !!within_depth(base->buf + base_offset, baselen,
-				      !!S_ISDIR(entry->mode),
-				      ps->max_depth);
+			return all_entries_interesting;
+		return within_depth(base->buf + base_offset, baselen,
+				    !!S_ISDIR(entry->mode),
+				    ps->max_depth) ?
+			entry_interesting : entry_not_interesting;
 	}
 
 	pathlen = tree_entry_len(entry);
@@ -610,12 +606,13 @@ int tree_entry_interesting(const struct name_entry *entry,
 				goto match_wildcards;
 
 			if (!ps->recursive || ps->max_depth == -1)
-				return 2;
+				return all_entries_interesting;
 
-			return !!within_depth(base_str + matchlen + 1,
-					      baselen - matchlen - 1,
-					      !!S_ISDIR(entry->mode),
-					      ps->max_depth);
+			return within_depth(base_str + matchlen + 1,
+					    baselen - matchlen - 1,
+					    !!S_ISDIR(entry->mode),
+					    ps->max_depth) ?
+				entry_interesting : entry_not_interesting;
 		}
 
 		/* Either there must be no base, or the base must match. */
@@ -623,18 +620,18 @@ int tree_entry_interesting(const struct name_entry *entry,
 			if (match_entry(entry, pathlen,
 					match + baselen, matchlen - baselen,
 					&never_interesting))
-				return 1;
+				return entry_interesting;
 
 			if (ps->items[i].use_wildcard) {
 				if (!fnmatch(match + baselen, entry->path, 0))
-					return 1;
+					return entry_interesting;
 
 				/*
 				 * Match all directories. We'll try to
 				 * match files later on.
 				 */
 				if (ps->recursive && S_ISDIR(entry->mode))
-					return 1;
+					return entry_interesting;
 			}
 
 			continue;
@@ -653,7 +650,7 @@ match_wildcards:
 
 		if (!fnmatch(match, base->buf + base_offset, 0)) {
 			strbuf_setlen(base, base_offset + baselen);
-			return 1;
+			return entry_interesting;
 		}
 		strbuf_setlen(base, base_offset + baselen);
 
@@ -662,7 +659,7 @@ match_wildcards:
 		 * later on.
 		 */
 		if (ps->recursive && S_ISDIR(entry->mode))
-			return 1;
+			return entry_interesting;
 	}
 	return never_interesting; /* No matches */
 }
diff --git a/tree-walk.h b/tree-walk.h
index 884d01a..2bf0db9 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -61,6 +61,16 @@ static inline int traverse_path_len(const struct traverse_info *info, const stru
 	return info->pathlen + tree_entry_len(n);
 }
 
-extern int tree_entry_interesting(const struct name_entry *, struct strbuf *, int, const struct pathspec *ps);
+/* in general, positive means "kind of interesting" */
+enum interesting {
+	all_entries_not_interesting = -1, /* no, and no subsequent entries will be either */
+	entry_not_interesting = 0,
+	entry_interesting = 1,
+	all_entries_interesting = 2 /* yes, and all subsequent entries will be */
+};
+
+extern enum interesting tree_entry_interesting(const struct name_entry *,
+					       struct strbuf *, int,
+					       const struct pathspec *ps);
 
 #endif
diff --git a/tree.c b/tree.c
index e622198..676e9f7 100644
--- a/tree.c
+++ b/tree.c
@@ -52,7 +52,8 @@ static int read_tree_1(struct tree *tree, struct strbuf *base,
 	struct tree_desc desc;
 	struct name_entry entry;
 	unsigned char sha1[20];
-	int len, retval = 0, oldlen = base->len;
+	int len, oldlen = base->len;
+	enum interesting retval = entry_not_interesting;
 
 	if (parse_tree(tree))
 		return -1;
@@ -60,11 +61,11 @@ static int read_tree_1(struct tree *tree, struct strbuf *base,
 	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
-		if (retval != 2) {
+		if (retval != all_entries_interesting) {
 			retval = tree_entry_interesting(&entry, base, 0, pathspec);
-			if (retval < 0)
+			if (retval == all_entries_not_interesting)
 				break;
-			if (retval == 0)
+			if (retval == entry_not_interesting)
 				continue;
 		}
 
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* [PATCH/WIP 06/11] read_directory_recursive: reduce one indentation level
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c |   50 +++++++++++++++++++++++++-------------------------
 1 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/dir.c b/dir.c
index 6c0d782..0a78d00 100644
--- a/dir.c
+++ b/dir.c
@@ -968,34 +968,34 @@ static int read_directory_recursive(struct dir_struct *dir,
 {
 	DIR *fdir = opendir(*base ? base : ".");
 	int contents = 0;
+	struct dirent *de;
+	char path[PATH_MAX + 1];
 
-	if (fdir) {
-		struct dirent *de;
-		char path[PATH_MAX + 1];
-		memcpy(path, base, baselen);
-
-		while ((de = readdir(fdir)) != NULL) {
-			int len;
-			switch (treat_path(dir, de, path, sizeof(path),
-					   baselen, simplify, &len)) {
-			case path_recurse:
-				contents += read_directory_recursive
-					(dir, path, len, 0, simplify);
-				continue;
-			case path_ignored:
-				continue;
-			case path_handled:
-				break;
-			}
-			contents++;
-			if (check_only)
-				goto exit_early;
-			else
-				dir_add_name(dir, path, len);
+	if (!fdir)
+		return 0;
+
+	memcpy(path, base, baselen);
+
+	while ((de = readdir(fdir)) != NULL) {
+		int len;
+		switch (treat_path(dir, de, path, sizeof(path),
+				   baselen, simplify, &len)) {
+		case path_recurse:
+			contents += read_directory_recursive(dir, path, len, 0, simplify);
+			continue;
+		case path_ignored:
+			continue;
+		case path_handled:
+			break;
 		}
-exit_early:
-		closedir(fdir);
+		contents++;
+		if (check_only)
+			goto exit_early;
+		else
+			dir_add_name(dir, path, len);
 	}
+exit_early:
+	closedir(fdir);
 
 	return contents;
 }
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* [PATCH/WIP 07/11] tree_entry_interesting: make use of local pointer "item"
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 tree-walk.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index fc03262..2d9d17a 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -622,7 +622,7 @@ enum interesting tree_entry_interesting(const struct name_entry *entry,
 					&never_interesting))
 				return entry_interesting;
 
-			if (ps->items[i].use_wildcard) {
+			if (item->use_wildcard) {
 				if (!fnmatch(match + baselen, entry->path, 0))
 					return entry_interesting;
 
@@ -638,7 +638,7 @@ enum interesting tree_entry_interesting(const struct name_entry *entry,
 		}
 
 match_wildcards:
-		if (!ps->items[i].use_wildcard)
+		if (!item->use_wildcard)
 			continue;
 
 		/*
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* [PATCH/WIP 08/11] tree-walk: mark useful pathspecs
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

Useful pathspecs are those that help decide whether an item is in or
out, as opposed to useless ones whose presence does not change the
result.

Callers are responsible for clearing the flags before use and for
acting on them afterwards.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h     |    1 +
 tree-walk.c |   13 ++++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index be07ec7..946d910 100644
--- a/cache.h
+++ b/cache.h
@@ -532,6 +532,7 @@ struct pathspec {
 		const char *match;
 		int len;
 		unsigned int use_wildcard:1;
+		unsigned int useful:1;
 	} *items;
 };
 
diff --git a/tree-walk.c b/tree-walk.c
index 2d9d17a..5e9c522 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -595,11 +595,15 @@ enum interesting tree_entry_interesting(const struct name_entry *entry,
 	pathlen = tree_entry_len(entry);
 
 	for (i = ps->nr - 1; i >= 0; i--) {
-		const struct pathspec_item *item = ps->items+i;
+		struct pathspec_item *item = ps->items+i;
 		const char *match = item->match;
 		const char *base_str = base->buf + base_offset;
 		int matchlen = item->len;
 
+		/* assume it will be used (which usually means break
+		   the loop and return), reset it otherwise */
+		item->useful = 1;
+
 		if (baselen >= matchlen) {
 			/* If it doesn't match, move along... */
 			if (!match_dir_prefix(base_str, match, matchlen))
@@ -634,12 +638,12 @@ enum interesting tree_entry_interesting(const struct name_entry *entry,
 					return entry_interesting;
 			}
 
-			continue;
+			goto nouse;
 		}
 
 match_wildcards:
 		if (!item->use_wildcard)
-			continue;
+			goto nouse;
 
 		/*
 		 * Concatenate base and entry->path into one and do
@@ -660,6 +664,9 @@ match_wildcards:
 		 */
 		if (ps->recursive && S_ISDIR(entry->mode))
 			return entry_interesting;
+
+nouse:
+		item->useful = 0;
 	}
 	return never_interesting; /* No matches */
 }
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* [PATCH/WIP 09/11] tree_entry_interesting: differentiate partial vs full match
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

Up until now, for the pathspec a/b, both paths a and a/b would return
entry_interesting. Make the latter return entry_matched instead.

This way, if the caller follows the match down to "a" but decides to
stop there for some reason, it knows that "a" has not really matched
the given pathspec yet.
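
To illustrate the distinction (a rough sketch only, not the code in
this patch): the helper below classifies a path against a single
literal pathspec. The names sketch_match and sketch_classify() are
made up for this example, and wildcards and max_depth are left out on
purpose.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names; they only loosely mirror enum interesting. */
enum sketch_match {
	SKETCH_NO_MATCH = 0,
	SKETCH_PARTIAL = 1,	/* path is a leading directory of the pathspec */
	SKETCH_FULL = 2		/* path is the pathspec itself, or lies under it */
};

/* Classify "path" against one literal (non-wildcard) pathspec. */
static enum sketch_match sketch_classify(const char *pathspec, const char *path)
{
	size_t speclen, pathlen;

	assert(pathspec && path);
	speclen = strlen(pathspec);
	pathlen = strlen(path);

	if (pathlen < speclen) {
		/* "a" vs "a/b": only a directory prefix has matched so far */
		if (!strncmp(pathspec, path, pathlen) && pathspec[pathlen] == '/')
			return SKETCH_PARTIAL;
		return SKETCH_NO_MATCH;
	}
	if (strncmp(pathspec, path, speclen))
		return SKETCH_NO_MATCH;
	/* exact match, or path is inside the directory named by pathspec */
	if (pathlen == speclen || path[speclen] == '/')
		return SKETCH_FULL;
	return SKETCH_NO_MATCH;
}
```

With the pathspec "a/b", this reports "a" as a partial match and both
"a/b" and "a/b/c" as full matches, which is what a caller needs to
know if it decides to stop at "a".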

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 tree-walk.c |   13 ++++++++-----
 tree-walk.h |    5 +++--
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 5e9c522..6e12f0f 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -616,19 +616,22 @@ enum interesting tree_entry_interesting(const struct name_entry *entry,
 					    baselen - matchlen - 1,
 					    !!S_ISDIR(entry->mode),
 					    ps->max_depth) ?
-				entry_interesting : entry_not_interesting;
+				entry_matched : entry_not_interesting;
 		}
 
 		/* Either there must be no base, or the base must match. */
 		if (baselen == 0 || !strncmp(base_str, match, baselen)) {
 			if (match_entry(entry, pathlen,
 					match + baselen, matchlen - baselen,
-					&never_interesting))
-				return entry_interesting;
+					&never_interesting)) {
+				if (match[baselen + pathlen] == '/')
+					return entry_interesting;
+				return entry_matched;
+			}
 
 			if (item->use_wildcard) {
 				if (!fnmatch(match + baselen, entry->path, 0))
-					return entry_interesting;
+					return entry_matched;
 
 				/*
 				 * Match all directories. We'll try to
@@ -654,7 +657,7 @@ match_wildcards:
 
 		if (!fnmatch(match, base->buf + base_offset, 0)) {
 			strbuf_setlen(base, base_offset + baselen);
-			return entry_interesting;
+			return entry_matched;
 		}
 		strbuf_setlen(base, base_offset + baselen);
 
diff --git a/tree-walk.h b/tree-walk.h
index 2bf0db9..a5f92fa 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -65,8 +65,9 @@ static inline int traverse_path_len(const struct traverse_info *info, const stru
 enum interesting {
 	all_entries_not_interesting = -1, /* no, and no subsequent entries will be either */
 	entry_not_interesting = 0,
-	entry_interesting = 1,
-	all_entries_interesting = 2 /* yes, and all subsequent entries will be */
+	entry_interesting = 1, /* a potential match, but not there yet */
+	entry_matched = 2,
+	all_entries_interesting = 3 /* yes, and all subsequent entries will be */
 };
 
 extern enum interesting tree_entry_interesting(const struct name_entry *,
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* [PATCH/WIP 10/11] read-dir: stop using path_simplify code in favor of tree_entry_interesting()
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>

The current code computes a set of literal prefixes from the given
pathspecs and filters on that set. Call sites are then supposed to do
exact pathspec matching again to remove entries that match the prefix
set but not the pathspecs themselves.

This patch makes read_directory() use tree_entry_interesting()
directly, removing the need for call sites to filter again (although
call sites are untouched in this patch).

A less intrusive way would be to use match_pathspec_depth(), but I'd
rather reduce the use of that function and eventually remove it, so we
only have to maintain pathspec matching in one place:
tree_entry_interesting().

In order to make use of tree_entry_interesting(), directory content
from readdir() must be converted to tree object format, which means we
have to read all items of a directory at once and sort them. If the
directory is large, this may become an expensive operation. But then
again, the current code does nothing to stop reading a directory
early, so nothing is lost.
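
As a rough illustration of that conversion (a simplified sketch, not
the dir_to_tree() in this patch): each entry is laid out as
"<mode> SP <name> NUL <20 hash bytes>", and entries are sorted by name
first. The names sketch_entry and sketch_build_tree() are made up
here, the hash bytes are zero-filled placeholders, and the sort is a
plain strcmp(), which I believe differs from real tree objects where
directories are ordered as if their name ended in '/'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_HASH_LEN 20	/* raw SHA-1 size */

/* Hypothetical input record: one directory entry. */
struct sketch_entry {
	const char *mode;	/* octal mode as text, e.g. "100644" */
	const char *name;
};

static int sketch_cmp(const void *a, const void *b)
{
	const struct sketch_entry *ea = a, *eb = b;
	return strcmp(ea->name, eb->name);
}

/*
 * Serialize entries as "<mode> SP <name> NUL <20 bytes>", sorted by name.
 * The hash bytes are left as zeros. The caller frees the result.
 */
static unsigned char *sketch_build_tree(struct sketch_entry *e, size_t nr,
					size_t *size)
{
	unsigned char *buf, *p;
	size_t i;

	assert(e && size);
	qsort(e, nr, sizeof(*e), sketch_cmp);

	*size = 0;
	for (i = 0; i < nr; i++)
		*size += strlen(e[i].mode) + 1 + strlen(e[i].name) + 1 +
			 SKETCH_HASH_LEN;

	buf = calloc(*size ? *size : 1, 1);
	if (!buf)
		return NULL;
	for (i = 0, p = buf; i < nr; i++) {
		size_t mlen = strlen(e[i].mode), nlen = strlen(e[i].name);

		memcpy(p, e[i].mode, mlen);
		p += mlen;
		*p++ = ' ';
		memcpy(p, e[i].name, nlen + 1);	/* copies the NUL too */
		p += nlen + 1;
		p += SKETCH_HASH_LEN;		/* placeholder, already zeroed */
	}
	return buf;
}
```

The whole directory has to be collected before the first byte is
written, because both the buffer size and the sort order depend on
every entry. That is where the cost for large directories comes from.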

ignored_nr and ignored[] are no longer filled. read_directory() users
are supposed to use useful[] instead.
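
The idea behind useful[] can be sketched roughly as follows (an
illustration under simplifying assumptions, not the code in this
patch): every pathspec starts out as not useful and is cleared only
when it matches at least one file that is not excluded. The names
sketch_file, sketch_matches() and sketch_mark_useful() are made up,
matching is literal-prefix only, and the index check done by the real
code is ignored.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record: a file found on disk and its exclude status. */
struct sketch_file {
	const char *path;
	int excluded;
};

/* Returns 1 if "path" equals "spec" or lies under directory "spec". */
static int sketch_matches(const char *spec, const char *path)
{
	size_t n = strlen(spec);

	return !strncmp(spec, path, n) &&
	       (path[n] == '\0' || path[n] == '/');
}

/*
 * Set useful[i] to 1 if pathspec i matched at least one file that is
 * not excluded, 0 otherwise. Returns how many pathspecs stayed at 0.
 */
static int sketch_mark_useful(const char **specs, int nr_specs,
			      const struct sketch_file *files, int nr_files,
			      int *useful)
{
	int i, j, unused = 0;

	assert(specs && files && useful);
	/* guilty until proven useful */
	memset(useful, 0, sizeof(*useful) * nr_specs);

	for (j = 0; j < nr_files; j++) {
		if (files[j].excluded)
			continue;
		for (i = 0; i < nr_specs; i++)
			if (sketch_matches(specs[i], files[j].path))
				useful[i] = 1;
	}
	for (i = 0; i < nr_specs; i++)
		if (!useful[i])
			unused++;
	return unused;
}
```

A caller in the style of "git add" would then print every pathspec
whose flag is still 0, since everything it matched was ignored.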

Many functions are left unused in this patch to avoid cluttering it.
They will be removed later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/add.c |   22 +++--
 dir.c         |  317 ++++++++++++++++++++++++++++++++++++++-------------------
 dir.h         |    5 +
 tree-walk.c   |    2 +
 4 files changed, 236 insertions(+), 110 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 23ad4b8..92ba3d4 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -307,7 +307,7 @@ static int edit_patch(int argc, const char **argv, const char *prefix)
 static struct lock_file lock_file;
 
 static const char ignore_error[] =
-N_("The following paths are ignored by one of your .gitignore files:\n");
+N_("The following pathspecs are ignored by one of your .gitignore files:\n");
 
 static int verbose = 0, show_only = 0, ignored_too = 0, refresh_only = 0;
 static int ignore_add_errors, addremove, intent_to_add;
@@ -342,12 +342,20 @@ static int add_files(struct dir_struct *dir, int flags)
 {
 	int i, exit_status = 0;
 
-	if (dir->ignored_nr) {
-		fprintf(stderr, _(ignore_error));
-		for (i = 0; i < dir->ignored_nr; i++)
-			fprintf(stderr, "%s\n", dir->ignored[i]->name);
-		fprintf(stderr, _("Use -f if you really want to add them.\n"));
-		die(_("no files added"));
+	if (dir->useful) {
+		int show_header = 0;
+		for (i = 0; i < dir->ps2.nr; i++)
+			if (!dir->useful[i]) {
+				if (!show_header) {
+					fprintf(stderr, _(ignore_error));
+					show_header = 1;
+				}
+				fprintf(stderr, "%s\n", dir->ps2.items[i].match);
+			}
+		if (show_header) {
+			fprintf(stderr, _("Use -f if you really want to add them.\n"));
+			die(_("no files added"));
+		}
 	}
 
 	for (i = 0; i < dir->nr; i++)
diff --git a/dir.c b/dir.c
index 0a78d00..2946b2d 100644
--- a/dir.c
+++ b/dir.c
@@ -8,14 +8,18 @@
 #include "cache.h"
 #include "dir.h"
 #include "refs.h"
+#include "tree-walk.h"
+#include "string-list.h"
 
 struct path_simplify {
 	int len;
 	const char *path;
 };
 
-static int read_directory_recursive(struct dir_struct *dir, const char *path, int len,
-	int check_only, const struct path_simplify *simplify);
+static int read_directory_recursive(struct dir_struct *dir,
+				    struct strbuf *base,
+				    int check_only,
+				    enum interesting match);
 static int get_dtype(struct dirent *de, const char *path, int len);
 
 /* helper string functions with support for the ignore_case flag */
@@ -609,6 +613,93 @@ struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname,
 	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
 }
 
+/* Read and convert directory to tree object (with invalid SHA-1) */
+static void* dir_to_tree(struct strbuf *path, unsigned long *size)
+{
+	int pathlen = path->len;
+	DIR *fdir = opendir(pathlen ? path->buf : ".");
+	struct string_list paths = STRING_LIST_INIT_DUP;
+	struct dirent *de;
+	char *tree, *p;
+	int i,dtype;
+
+	if (!fdir)
+		return NULL;
+
+	*size = 0;
+	while ((de = readdir(fdir)) != NULL) {
+		int namelen = strlen(de->d_name);
+		struct string_list_item *item;
+		const char *mode = NULL;
+
+		if (is_dot_or_dotdot(de->d_name) ||
+		    !strcmp(de->d_name, ".git") ||
+		    /* Ignore overly long pathnames! */
+		    namelen + pathlen + 8 > PATH_MAX)
+			continue;
+
+		strbuf_add(path, de->d_name, namelen);
+		dtype = get_dtype(de, path->buf, path->len);
+		strbuf_setlen(path, pathlen);
+
+		switch (dtype) {
+		case DT_DIR: mode = "040000 "; break;
+		case DT_REG: mode = "100644 "; break;
+		case DT_LNK: mode = "120000 "; break;
+		default: continue;
+		}
+		item = string_list_insert(&paths, de->d_name);
+		item->util = (void*)mode;
+		/* 100644 SPC path NUL SHA-1 */
+		*size += 6 + 1 + namelen + 1 + 20;
+	}
+	closedir(fdir);
+
+	tree = xmalloc(*size);
+	for (i = 0, p = tree;i < paths.nr; i++) {
+		int len = strlen(paths.items[i].string) + 1;
+		if (!paths.items[i].util ||
+		    strlen(paths.items[i].util) != 7)
+			die("BUG: util should contain a mode");
+		memcpy(p, paths.items[i].util, 7);
+		p += 7;
+		memcpy(p, paths.items[i].string, len);
+		p += len;
+		/* we don't need valid SHA-1 for tree_entry_interesting() */
+		memcpy(p, "\xbb\xaa\xdd\xbb\xaa\xdd\xbb\xaa\xdd", 9);
+		p += 20;
+	}
+	string_list_clear(&paths, 0);
+	return tree;
+}
+
+static enum interesting match_both_pathspecs(struct dir_struct *dir,
+					     struct strbuf *base,
+					     const struct name_entry *ne)
+{
+	int i;
+	enum interesting ret1, ret2;
+
+	/* ps1 contains the base path, no need to care about it */
+	for (i = 0; i < dir->ps2.nr; i++)
+		dir->ps2.items[i].useful = 0;
+
+	ret1 = tree_entry_interesting(ne, base, 0, &dir->ps1);
+	if (ret1 <= 0)
+		return ret1;
+	ret2 = tree_entry_interesting(ne, base, 0, &dir->ps2);
+	if (ret2 <= 0)
+		return ret2;
+
+	if (ret1 == all_entries_interesting && ret2 == all_entries_interesting)
+		return all_entries_interesting;
+	else if ((ret1 == entry_matched || ret1 == all_entries_interesting) &&
+		 (ret2 == entry_matched || ret2 == all_entries_interesting))
+		return entry_matched;
+	else
+		return entry_interesting;
+}
+
 enum exist_status {
 	index_nonexistent = 0,
 	index_directory,
@@ -722,11 +813,10 @@ enum directory_treatment {
 };
 
 static enum directory_treatment treat_directory(struct dir_struct *dir,
-	const char *dirname, int len,
-	const struct path_simplify *simplify)
+						struct strbuf *dirname)
 {
 	/* The "len-1" is to strip the final '/' */
-	switch (directory_exists_in_index(dirname, len-1)) {
+	switch (directory_exists_in_index(dirname->buf, dirname->len-1)) {
 	case index_directory:
 		return recurse_into_directory;
 
@@ -740,7 +830,7 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 			break;
 		if (!(dir->flags & DIR_NO_GITLINKS)) {
 			unsigned char sha1[20];
-			if (resolve_gitlink_ref(dirname, "HEAD", sha1) == 0)
+			if (resolve_gitlink_ref(dirname->buf, "HEAD", sha1) == 0)
 				return show_directory;
 		}
 		return recurse_into_directory;
@@ -749,7 +839,7 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 	/* This is the "show_other_directories" case */
 	if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
 		return show_directory;
-	if (!read_directory_recursive(dir, dirname, len, 1, simplify))
+	if (!read_directory_recursive(dir, dirname, 1, entry_not_interesting))
 		return ignore_directory;
 	return show_directory;
 }
@@ -780,31 +870,35 @@ static int simplify_away(const char *path, int pathlen, const struct path_simpli
 }
 
 /*
- * This function tells us whether an excluded path matches a
- * list of "interesting" pathspecs. That is, whether a path matched
- * by any of the pathspecs could possibly be ignored by excluding
- * the specified path. This can happen if:
+ * This function flags pathspecs that are completely excluded, which
+ * usually means an input mistake. In other words, if all matched
+ * _files_ of a pathspec are excluded, flag the pathspec.
  *
- *   1. the path is mentioned explicitly in the pathspec
+ * The negated version would be: if any of matched files (by pathspec
+ * X) are not excluded, pathspec X is clear, which is exactly what
+ * this function does.
  *
- *   2. the path is a directory prefix of some element in the
- *      pathspec
+ * This function ignores dir->ps1 because that contains exactly one
+ * pathspec item: the path base. No need to worry about that.
  */
-static int exclude_matches_pathspec(const char *path, int len,
-		const struct path_simplify *simplify)
+static void mark_useful(struct dir_struct *dir,
+			const char *path, int len,
+			int dtype,
+			struct pathspec *ps, int exclude,
+			enum interesting match)
 {
-	if (simplify) {
-		for (; simplify->path; simplify++) {
-			if (len == simplify->len
-			    && !memcmp(path, simplify->path, len))
-				return 1;
-			if (len < simplify->len
-			    && simplify->path[len] == '/'
-			    && !memcmp(path, simplify->path, len))
-				return 1;
-		}
-	}
-	return 0;
+	int i;
+	if (!(dir->flags & DIR_COLLECT_IGNORED))
+		return;
+	/* half-matches (eg. prefix matches) do not count as useful */
+	if (match != all_entries_interesting && match != entry_matched)
+		return;
+	if (exclude && cache_name_is_other(path, len))
+		return;
+
+	for (i = 0; i < ps->nr; i++)
+		if (ps->items[i].useful)
+			dir->useful[i] = 1;
 }
 
 static int get_index_dtype(const char *path, int len)
@@ -872,15 +966,24 @@ enum path_treatment {
 	path_recurse
 };
 
-static enum path_treatment treat_one_path(struct dir_struct *dir,
-					  char *path, int *len,
-					  const struct path_simplify *simplify,
-					  int dtype, struct dirent *de)
+/* base is modified to contain ne */
+static int treat_path(struct dir_struct *dir,
+		      struct strbuf *base, const struct name_entry *ne,
+		      enum interesting match)
 {
-	int exclude = excluded(dir, path, &dtype);
-	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
-	    && exclude_matches_pathspec(path, *len, simplify))
-		dir_add_ignored(dir, path, *len);
+	int exclude, dtype;
+
+	strbuf_add(base, ne->path, tree_entry_len(ne));
+
+	/* It does not matter DT_REG or something else, excluded()
+	 * only cares if it's DT_DIR or not */
+	dtype = S_ISDIR(ne->mode) ? DT_DIR : DT_REG;
+	exclude = excluded(dir, base->buf, &dtype);
+
+	/* intermediate directory match does not count */
+	if (dtype == DT_REG)
+		mark_useful(dir, base->buf, base->len, dtype,
+				       &dir->ps2, exclude, match);
 
 	/*
 	 * Excluded? If we don't explicitly want to show
@@ -889,9 +992,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
 		return path_ignored;
 
-	if (dtype == DT_UNKNOWN)
-		dtype = get_dtype(de, path, *len);
-
 	/*
 	 * Do we want to see just the ignored files?
 	 * We still need to recurse into directories,
@@ -899,17 +999,13 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	 * directory may contain files that we do..
 	 */
 	if (!exclude && (dir->flags & DIR_SHOW_IGNORED)) {
-		if (dtype != DT_DIR)
+		if (!S_ISDIR(ne->mode))
 			return path_ignored;
 	}
 
-	switch (dtype) {
-	default:
-		return path_ignored;
-	case DT_DIR:
-		memcpy(path + *len, "/", 2);
-		(*len)++;
-		switch (treat_directory(dir, path, *len, simplify)) {
+	if (S_ISDIR(ne->mode)) {
+		strbuf_addch(base, '/');
+		switch (treat_directory(dir, base)) {
 		case show_directory:
 			if (exclude != !!(dir->flags
 					  & DIR_SHOW_IGNORED))
@@ -920,38 +1016,14 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		case ignore_directory:
 			return path_ignored;
 		}
-		break;
-	case DT_REG:
-	case DT_LNK:
-		break;
+
+		/* path_handled for dirs, must be gitlinks */
+		mark_useful(dir, base->buf, base->len, dtype,
+				       &dir->ps2, exclude, match);
 	}
 	return path_handled;
 }
 
-static enum path_treatment treat_path(struct dir_struct *dir,
-				      struct dirent *de,
-				      char *path, int path_max,
-				      int baselen,
-				      const struct path_simplify *simplify,
-				      int *len)
-{
-	int dtype;
-
-	if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
-		return path_ignored;
-	*len = strlen(de->d_name);
-	/* Ignore overly long pathnames! */
-	if (*len + baselen + 8 > path_max)
-		return path_ignored;
-	memcpy(path + baselen, de->d_name, *len + 1);
-	*len += baselen;
-	if (simplify_away(path, *len, simplify))
-		return path_ignored;
-
-	dtype = DTYPE(de);
-	return treat_one_path(dir, path, len, simplify, dtype, de);
-}
-
 /*
  * Read a directory tree. We currently ignore anything but
  * directories, regular files and symlinks. That's because git
@@ -962,40 +1034,46 @@ static enum path_treatment treat_path(struct dir_struct *dir,
  * That likely will not change.
  */
 static int read_directory_recursive(struct dir_struct *dir,
-				    const char *base, int baselen,
+				    struct strbuf *base,
 				    int check_only,
-				    const struct path_simplify *simplify)
+				    enum interesting match)
 {
-	DIR *fdir = opendir(*base ? base : ".");
-	int contents = 0;
-	struct dirent *de;
-	char path[PATH_MAX + 1];
+	unsigned long size;
+	void *tree_buf = dir_to_tree(base, &size);
+	int contents = 0, baselen = base->len;
+	struct tree_desc desc;
+	struct name_entry ne;
 
-	if (!fdir)
+	if (!tree_buf)
 		return 0;
 
-	memcpy(path, base, baselen);
+	init_tree_desc(&desc, tree_buf, size);
 
-	while ((de = readdir(fdir)) != NULL) {
-		int len;
-		switch (treat_path(dir, de, path, sizeof(path),
-				   baselen, simplify, &len)) {
+	while (tree_entry(&desc, &ne)) {
+		strbuf_setlen(base, baselen);
+		if (match != all_entries_interesting) {
+			match = match_both_pathspecs(dir, base, &ne);
+			if (match == all_entries_not_interesting)
+				break;
+			if (match == entry_not_interesting)
+				continue;
+		}
+		switch (treat_path(dir, base, &ne, match)) {
 		case path_recurse:
-			contents += read_directory_recursive(dir, path, len, 0, simplify);
-			continue;
-		case path_ignored:
+			contents += read_directory_recursive(dir, base, 0, match);
 			continue;
 		case path_handled:
+			contents++;
+			if (check_only)
+				goto exit_early;
+
+			dir_add_name(dir, base->buf, base->len);
 			break;
 		}
-		contents++;
-		if (check_only)
-			goto exit_early;
-		else
-			dir_add_name(dir, path, len);
 	}
 exit_early:
-	closedir(fdir);
+	free(tree_buf);
+	strbuf_setlen(base, baselen);
 
 	return contents;
 }
@@ -1054,6 +1132,7 @@ static void free_simplify(struct path_simplify *simplify)
 	free(simplify);
 }
 
+#if 0
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct path_simplify *simplify)
@@ -1088,20 +1167,52 @@ static int treat_leading_path(struct dir_struct *dir,
 			return 1; /* finished checking */
 	}
 }
+#endif
 
-int read_directory(struct dir_struct *dir, const char *path, int len, const char **pathspec)
+int read_directory(struct dir_struct *dir, const char *path, int len,
+		   const char **pathspec)
 {
-	struct path_simplify *simplify;
+	char *newpath = NULL;
+	struct strbuf base = STRBUF_INIT;
 
 	if (has_symlink_leading_path(path, len))
 		return dir->nr;
 
-	simplify = create_simplify(pathspec);
-	if (!len || treat_leading_path(dir, path, len, simplify))
-		read_directory_recursive(dir, path, len, 0, simplify);
-	free_simplify(simplify);
+	/*
+	 * tree_entry_interesting() does not implement AND operator on
+	 * pathspecs so we call tree_entry_interesting() twice and
+	 * join the results ourselves in match_both_pathspecs()
+	 */
+	if (path && *path) {
+		const char *pathspec1[2];
+		newpath = xmalloc(len + 1);
+		memcpy(newpath, path, len);
+		newpath[len] = 0;
+		pathspec1[0] = newpath;
+		pathspec1[1] = NULL;
+		init_pathspec(&dir->ps1, pathspec1);
+	}
+	else
+		init_pathspec(&dir->ps1, NULL);
+	init_pathspec(&dir->ps2, pathspec);
+
+	if (dir->flags & DIR_COLLECT_IGNORED) {
+		int size = sizeof(*dir->useful) * dir->ps2.nr;
+		dir->useful = xmalloc(size);
+		/* guilty until proven useful */
+		memset(dir->useful, 0, size);
+	}
+
+	read_directory_recursive(dir, &base, 0, entry_not_interesting);
+
+	strbuf_release(&base);
+	free_pathspec(&dir->ps1);
+	if (!(dir->flags & DIR_COLLECT_IGNORED))
+		free_pathspec(&dir->ps2);
+	free(newpath);
+
 	qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);
-	qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), cmp_name);
+
 	return dir->nr;
 }
 
diff --git a/dir.h b/dir.h
index dd6947e..362d7b1 100644
--- a/dir.h
+++ b/dir.h
@@ -43,6 +43,11 @@ struct dir_struct {
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
+	int *useful;
+
+	/* Include info (a joint of ps1 and ps2) */
+	struct pathspec ps1;
+	struct pathspec ps2;
 
 	/* Exclude info */
 	const char *exclude_per_dir;
diff --git a/tree-walk.c b/tree-walk.c
index 6e12f0f..b56fec1 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -600,6 +600,8 @@ enum interesting tree_entry_interesting(const struct name_entry *entry,
 		const char *base_str = base->buf + base_offset;
 		int matchlen = item->len;
 
+		/* TODO: 07ccbff (runstatus: do not recurse into subdirectories if not needed - 2006-09-28) */
+
 		/* assume it will be used (which usually means break
 		   the loop and return), reset it otherwise */
 		item->useful = 1;
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* [PATCH/WIP 11/11] dir.c: remove dead code after read_directory() rewrite
From: Nguyễn Thái Ngọc Duy @ 2011-10-24  6:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1319438176-7304-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c |  121 -----------------------------------------------------------------
 dir.h |    3 --
 2 files changed, 0 insertions(+), 124 deletions(-)

diff --git a/dir.c b/dir.c
index 2946b2d..4094962 100644
--- a/dir.c
+++ b/dir.c
@@ -11,11 +11,6 @@
 #include "tree-walk.h"
 #include "string-list.h"
 
-struct path_simplify {
-	int len;
-	const char *path;
-};
-
 static int read_directory_recursive(struct dir_struct *dir,
 				    struct strbuf *base,
 				    int check_only,
@@ -604,15 +599,6 @@ static struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathna
 	return dir->entries[dir->nr++] = dir_entry_new(pathname, len);
 }
 
-struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len)
-{
-	if (!cache_name_is_other(pathname, len))
-		return NULL;
-
-	ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->ignored_alloc);
-	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
-}
-
 /* Read and convert directory to tree object (with invalid SHA-1) */
 static void* dir_to_tree(struct strbuf *path, unsigned long *size)
 {
@@ -845,31 +831,6 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 }
 
 /*
- * This is an inexact early pruning of any recursive directory
- * reading - if the path cannot possibly be in the pathspec,
- * return true, and we'll skip it early.
- */
-static int simplify_away(const char *path, int pathlen, const struct path_simplify *simplify)
-{
-	if (simplify) {
-		for (;;) {
-			const char *match = simplify->path;
-			int len = simplify->len;
-
-			if (!match)
-				break;
-			if (len > pathlen)
-				len = pathlen;
-			if (!memcmp(path, match, len))
-				return 0;
-			simplify++;
-		}
-		return 1;
-	}
-	return 0;
-}
-
-/*
  * This function flags pathspecs that are completely excluded, which
  * usually means an input mistake. In other words, if all matched
  * _files_ of a pathspec are excluded, flag the pathspec.
@@ -1087,88 +1048,6 @@ static int cmp_name(const void *p1, const void *p2)
 				  e2->name, e2->len);
 }
 
-/*
- * Return the length of the "simple" part of a path match limiter.
- */
-static int simple_length(const char *match)
-{
-	int len = -1;
-
-	for (;;) {
-		unsigned char c = *match++;
-		len++;
-		if (c == '\0' || is_glob_special(c))
-			return len;
-	}
-}
-
-static struct path_simplify *create_simplify(const char **pathspec)
-{
-	int nr, alloc = 0;
-	struct path_simplify *simplify = NULL;
-
-	if (!pathspec)
-		return NULL;
-
-	for (nr = 0 ; ; nr++) {
-		const char *match;
-		if (nr >= alloc) {
-			alloc = alloc_nr(alloc);
-			simplify = xrealloc(simplify, alloc * sizeof(*simplify));
-		}
-		match = *pathspec++;
-		if (!match)
-			break;
-		simplify[nr].path = match;
-		simplify[nr].len = simple_length(match);
-	}
-	simplify[nr].path = NULL;
-	simplify[nr].len = 0;
-	return simplify;
-}
-
-static void free_simplify(struct path_simplify *simplify)
-{
-	free(simplify);
-}
-
-#if 0
-static int treat_leading_path(struct dir_struct *dir,
-			      const char *path, int len,
-			      const struct path_simplify *simplify)
-{
-	char pathbuf[PATH_MAX];
-	int baselen, blen;
-	const char *cp;
-
-	while (len && path[len - 1] == '/')
-		len--;
-	if (!len)
-		return 1;
-	baselen = 0;
-	while (1) {
-		cp = path + baselen + !!baselen;
-		cp = memchr(cp, '/', path + len - cp);
-		if (!cp)
-			baselen = len;
-		else
-			baselen = cp - path;
-		memcpy(pathbuf, path, baselen);
-		pathbuf[baselen] = '\0';
-		if (!is_directory(pathbuf))
-			return 0;
-		if (simplify_away(pathbuf, baselen, simplify))
-			return 0;
-		blen = baselen;
-		if (treat_one_path(dir, pathbuf, &blen, simplify,
-				   DT_DIR, NULL) == path_ignored)
-			return 0; /* do not recurse into it */
-		if (len <= baselen)
-			return 1; /* finished checking */
-	}
-}
-#endif
-
 int read_directory(struct dir_struct *dir, const char *path, int len,
 		   const char **pathspec)
 {
diff --git a/dir.h b/dir.h
index 362d7b1..7a7d818 100644
--- a/dir.h
+++ b/dir.h
@@ -33,7 +33,6 @@ struct exclude_stack {
 
 struct dir_struct {
 	int nr, alloc;
-	int ignored_nr, ignored_alloc;
 	enum {
 		DIR_SHOW_IGNORED = 1<<0,
 		DIR_SHOW_OTHER_DIRECTORIES = 1<<1,
@@ -42,7 +41,6 @@ struct dir_struct {
 		DIR_COLLECT_IGNORED = 1<<4
 	} flags;
 	struct dir_entry **entries;
-	struct dir_entry **ignored;
 	int *useful;
 
 	/* Include info (a joint of ps1 and ps2) */
@@ -82,7 +80,6 @@ extern int read_directory(struct dir_struct *, const char *path, int len, const
 extern int excluded_from_list(const char *pathname, int pathlen, const char *basename,
 			      int *dtype, struct exclude_list *el);
 extern int excluded(struct dir_struct *, const char *, int *);
-struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len);
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen,
 					  char **buf_p, struct exclude_list *which, int check_index);
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* Re: [PATCH] Reindent closing bracket using tab instead of spaces
From: Junio C Hamano @ 2011-10-24  6:56 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <1319430291-12612-1-git-send-email-pclouds@gmail.com>

Thanks.

^ permalink raw reply

* Re: [PATCH] read-cache.c: fix index memory allocation
From: Junio C Hamano @ 2011-10-24  7:07 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, John Hsing, Matthieu Moy, git
In-Reply-To: <4EA4B8E7.5070106@lsrfire.ath.cx>

Thanks.

This approach may be the most appropriate for the maintenance track, but
for the purpose of going forward, I wonder if we really want to keep the
"estimate and allocate a large pool, and carve out individual pieces"
approach.

This bulk allocation dates back to the days when we didn't have ondisk vs
incore representation differences, IIRC, and as a result we deliberately
leak cache entries whenever an entry in the index is replaced with a new
one. Does the overhead to allocate individually really kill us that much
for say a tree with 30k files in it?
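
To make the "estimate, allocate a pool, carve out pieces" pattern concrete,
here is a minimal sketch. It is not the code in read-cache.c; the names
(struct pool, pool_init, pool_carve, pool_release) are hypothetical and
exist only for illustration. The point is that every carve is checked
against the estimate, so an underestimate is reported instead of silently
writing past the end of the pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool type for illustration; not git's data structure. */
struct pool {
	char *base;	/* start of the single bulk allocation */
	size_t size;	/* estimated total size in bytes */
	size_t used;	/* bytes handed out so far */
};

/* Allocate the whole pool once, from an up-front size estimate. */
int pool_init(struct pool *p, size_t estimate)
{
	p->base = malloc(estimate ? estimate : 1);
	p->size = estimate;
	p->used = 0;
	return p->base ? 0 : -1;
}

/*
 * Carve "len" bytes out of the pool.  Returns NULL when the estimate
 * turns out to be too small, rather than overrunning the allocation.
 * A request that exactly fills the pool is accepted.
 */
void *pool_carve(struct pool *p, size_t len)
{
	void *ret;

	assert(p->used <= p->size);
	if (len > p->size - p->used)
		return NULL;
	ret = p->base + p->used;
	p->used += len;
	return ret;
}

/* Individual pieces cannot be freed; only the pool as a whole can. */
void pool_release(struct pool *p)
{
	free(p->base);
	p->base = NULL;
	p->size = p->used = 0;
}
```

A caller would call pool_init() once with the estimate, call pool_carve()
for each entry, and treat a NULL return as an estimation error. The sketch
also shows the drawback mentioned above: a piece that is later replaced
cannot be returned to the pool on its own.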

^ permalink raw reply

* Re: [PATCH] read-cache.c: fix index memory allocation
From: Junio C Hamano @ 2011-10-24  7:28 UTC (permalink / raw)
  To: René Scharfe; +Cc: Jeff King, John Hsing, Matthieu Moy, git
In-Reply-To: <4EA4B8E7.5070106@lsrfire.ath.cx>

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

>  t/t7510-status-index.sh |   50 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 53 insertions(+), 3 deletions(-)
>  create mode 100755 t/t7510-status-index.sh

> diff --git a/t/t7510-status-index.sh b/t/t7510-status-index.sh
> new file mode 100755
> index 0000000..bca359d
> --- /dev/null
> +++ b/t/t7510-status-index.sh
> @@ -0,0 +1,50 @@

Hmm, I cannot seem to make this test fail without the fix on my
Fedora 14 i686 VM when applied to v1.7.6.4 (the estimation code originates
from cf55870, back in the v1.7.6.1 days), but it does break on 'master'.

By the way, I'll move this to 7511.

Also would a patch like this help?

-- >8 --
Subject: [PATCH] read_index(): die on estimation error

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 read-cache.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 0a64103..2926615 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1270,6 +1270,7 @@ int read_index_from(struct index_state *istate, const char *path)
 	int fd, i;
 	struct stat st;
 	unsigned long src_offset, dst_offset;
+	size_t bulk_alloc_size;
 	struct cache_header *hdr;
 	void *mmap;
 	size_t mmap_size;
@@ -1315,7 +1316,8 @@ int read_index_from(struct index_state *istate, const char *path)
 	 * has room for a few  more flags, we can allocate using the same
 	 * index size
 	 */
-	istate->alloc = xmalloc(estimate_cache_size(mmap_size, istate->cache_nr));
+	bulk_alloc_size = estimate_cache_size(mmap_size, istate->cache_nr);
+	istate->alloc = xmalloc(bulk_alloc_size);
 	istate->initialized = 1;
 
 	src_offset = sizeof(*hdr);
@@ -1331,7 +1333,9 @@ int read_index_from(struct index_state *istate, const char *path)
 
 		src_offset += ondisk_ce_size(ce);
 		dst_offset += ce_size(ce);
+		if (bulk_alloc_size <= dst_offset)
+			die("cache size estimation error");
 	}
 	istate->timestamp.sec = st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
 
-- 
1.7.7.1.504.gcc718

^ permalink raw reply related

* Possible diff regression in v1.7.6-473-g27af01d
From: Franz Schrober @ 2011-10-24  9:23 UTC (permalink / raw)
  To: git@vger.kernel.org
  Cc: marat@slonopotamus.org, rctay89@gmail.com, gitster@pobox.com,
	franzschrober@yahoo.de

[-- Attachment #1: Type: text/plain, Size: 567 bytes --]

Hi,

I am using git to manage some patches on top of the actual upstream files, but noticed that the result of git-format-patch changed between 4bfe7cb6668c43c1136304bbb17eea1b3ddf0237 and 27af01d552331eacf1ed2671b2b4b6ad4c268106

I've attached two input files (I tried to provide a minimal example... I am not sure if a smaller example is possible but at least both files are smaller than 10 lines) and the results with version 1.7.6.3 and 1.7.7. The diffs were created using: git diff anonymized_orig anonymized_new

My .gitconfig file is empty.

Thanks

[-- Attachment #2: anonymized_orig --]
[-- Type: application/octet-stream, Size: 8 bytes --]

0
0
0
0

[-- Attachment #3: anonymized_new --]
[-- Type: application/octet-stream, Size: 16 bytes --]

1
2
0
3
4
5
6
7

[-- Attachment #4: diff.1.7.6.3 --]
[-- Type: application/octet-stream, Size: 171 bytes --]

diff --git a/anonymized_orig b/anonymized_new
index 44e0be8..ad0f859 100644
--- a/anonymized_orig
+++ b/anonymized_new
@@ -1,4 +1,8 @@
-0
-0
-0
-0
+1
+2
+0
+3
+4
+5
+6
+7

[-- Attachment #5: diff.1.7.7 --]
[-- Type: application/octet-stream, Size: 168 bytes --]

diff --git a/anonymized_orig b/anonymized_new
index 44e0be8..ad0f859 100644
--- a/anonymized_orig
+++ b/anonymized_new
@@ -1,4 +1,8 @@
+1
+2
 0
-0
-0
-0
+3
+4
+5
+6
+7

^ permalink raw reply related

* Re: Possible diff regression in v1.7.6-473-g27af01d
From: Thomas Rast @ 2011-10-24  9:38 UTC (permalink / raw)
  To: Franz Schrober
  Cc: git@vger.kernel.org, marat@slonopotamus.org, rctay89@gmail.com,
	gitster@pobox.com
In-Reply-To: <1319448227.70497.YahooMailNeo@web29402.mail.ird.yahoo.com>

Franz Schrober wrote:
> Hi,
> 
> I am using git to manage some patches on top of the actual upstream files, but noticed that the result of git-format-patch changed between 4bfe7cb6668c43c1136304bbb17eea1b3ddf0237 and 27af01d552331eacf1ed2671b2b4b6ad4c268106
> 
> I've attached two input files (I tried to provide a minimal example... I am not sure if a smaller example is possible but at least both files are smaller than 10 lines) and the results with version 1.7.6.3 and 1.7.7. The diffs were created using: git diff anonymized_orig anonymized_new
> 
> My .gitconfig file is empty.

I'm not sure why you call this a regression.  For the benefit of
people who hate saving attachments, you used

  $ paste anonymized_orig anonymized_new  | xclip
  0       1
  0       2
  0       0
  0       3
          4
          5
          6
          7

the old diff was

  --- a/anonymized_orig
  +++ b/anonymized_new
  @@ -1,4 +1,8 @@
  -0
  -0
  -0
  -0
  +1
  +2
  +0
  +3
  +4
  +5
  +6
  +7

and the new diff is

  --- a/anonymized_orig
  +++ b/anonymized_new
  @@ -1,4 +1,8 @@
  +1
  +2
   0
  -0
  -0
  -0
  +3
  +4
  +5
  +6
  +7
 
So the new diff correctly represents the change, and on top of that is
shorter (by only one line, admittedly).  What makes it a regression?
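
The line counts can be checked with a small sketch, assuming the usual
model in which a diff keeps a longest common subsequence (LCS) of lines as
context and marks everything else as removed or added. The helper names
below (lcs_length, changed_lines) are made up for this illustration and
are not part of git or xdiff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINES 64

/*
 * Length of the longest common subsequence of two arrays of lines,
 * computed with the textbook O(na * nb) dynamic programming table.
 */
size_t lcs_length(const char *const *a, size_t na,
		  const char *const *b, size_t nb)
{
	static size_t table[MAX_LINES + 1][MAX_LINES + 1];
	size_t i, j;

	assert(na <= MAX_LINES && nb <= MAX_LINES);
	for (i = 0; i <= na; i++)
		table[i][0] = 0;
	for (j = 0; j <= nb; j++)
		table[0][j] = 0;
	for (i = 1; i <= na; i++) {
		for (j = 1; j <= nb; j++) {
			if (!strcmp(a[i - 1], b[j - 1]))
				table[i][j] = table[i - 1][j - 1] + 1;
			else if (table[i - 1][j] >= table[i][j - 1])
				table[i][j] = table[i - 1][j];
			else
				table[i][j] = table[i][j - 1];
		}
	}
	return table[na][nb];
}

/* Number of '-' plus '+' lines when "common" lines are kept as context. */
size_t changed_lines(size_t na, size_t nb, size_t common)
{
	return (na - common) + (nb - common);
}
```

For the two files here (four lines of "0" against "1 2 0 3 4 5 6 7") the
LCS is a single "0" line. Keeping it as context gives 3 removals plus 7
additions plus 1 context line, 11 lines in total; keeping nothing in
common gives 4 removals plus 8 additions, 12 lines in total.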

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply

* Re: Possible diff regression in v1.7.6-473-g27af01d
From: Tay Ray Chuan @ 2011-10-24 10:11 UTC (permalink / raw)
  To: Thomas Rast
  Cc: Franz Schrober, git@vger.kernel.org, marat@slonopotamus.org,
	gitster@pobox.com
In-Reply-To: <201110241138.51448.trast@student.ethz.ch>

On Mon, Oct 24, 2011 at 5:38 PM, Thomas Rast <trast@student.ethz.ch> wrote:
>
> I'm not sure why you call this a regression.  For the benefit of
> people who hate saving attachments, you used
>
>  $ paste anonymized_orig anonymized_new  | xclip
>  0       1
>  0       2
>  0       0
>  0       3
>          4
>          5
>          6
>          7
>
> the old diff was
>
>  --- a/anonymized_orig
>  +++ b/anonymized_new
>  @@ -1,4 +1,8 @@
>  -0
>  -0
>  -0
>  -0
>  +1
>  +2
>  +0
>  +3
>  +4
>  +5
>  +6
>  +7
>
> and the new diff is
>
>  --- a/anonymized_orig
>  +++ b/anonymized_new
>  @@ -1,4 +1,8 @@
>  +1
>  +2
>   0
>  -0
>  -0
>  -0
>  +3
>  +4
>  +5
>  +6
>  +7
>
> So the new diff correctly represents the change, and on top of that is
> shorter (by only one line, admittedly).  What makes it a regression?

Thanks for inlining it, Thomas.

> Franz Schrober wrote:
>> Hi,
>>
>> I am using git to manage some patches on top of the actual upstream files, but noticed that the result of git-format-patch changed between 4bfe7cb6668c43c1136304bbb17eea1b3ddf0237 and 27af01d552331eacf1ed2671b2b4b6ad4c268106
>>
>> I've attached two input files (I tried to provide a minimal example... I am not sure if a smaller example is possible but at least both files are smaller than 10 lines) and the results with version 1.7.6.3 and 1.7.7. The diffs were created using: git diff anonymized_orig anonymized_new
>>
>> My .gitconfig file is empty.

This has been "fixed" in v1.7.7.1, with 713b85c (Merge branch
'rs/diff-cleanup-records-fix' into maint) - "fixed" in that it gives
back the old behaviour, not that the diff produced is incorrect and
needs fixing.
(I'm running 1.7.7.1.599.g03eec, I get the same diff as diff.1.7.6.3)

-- 
Cheers,
Ray Chuan

^ permalink raw reply

* Re: [PATCH 00/22] Refactor to accept NUL in commit messages
From: Štěpán Němec @ 2011-10-24 11:09 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, git, Jeff King, Ævar Arnfjörð
In-Reply-To: <CACsJy8AsfQnS3L1fabzB-z7BdH=jvB=XNnmP2RZu0qp7C1uGYQ@mail.gmail.com>

On Mon, 24 Oct 2011 07:10:08 +0200
Nguyen Thai Ngoc Duy wrote:

> This is argument for the sake of argument because I don't use utf-16
> and do not care much. UTF-16 can have more code points and some may
> prefer utf-16 to utf-8.

I suspect this is really tangential to this thread, but I can't make
much sense of that last sentence -- if you meant that UTF-16 is somehow
more apt at encoding Unicode code points than UTF-8, then that's not the
case. Both can represent all Unicode characters. If anything, things are
_more_, not less complicated in UTF-16, which apart from the NUL and
endianness complications has to jump through the "surrogate pairs" hoop
for code points bigger than U+FFFF (so you'll actually find many apps
with buggy UTF-16 implementation which break for those code points,
unlike when using UTF-8).
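
As an illustration of the surrogate pair step, here is a minimal sketch of
encoding a single code point in both forms. The function names
(utf16_encode, utf8_encode) are invented for this example; the arithmetic
follows the standard definitions of UTF-16 and UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one code point as UTF-16 code units.  Returns the number of
 * units written (1 or 2), or 0 if "cp" is not a valid scalar value.
 */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
	assert(out != NULL);
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;
	if (cp < 0x10000) {
		out[0] = (uint16_t)cp;
		return 1;
	}
	/* Above U+FFFF: split the remaining 20 bits into a surrogate pair. */
	cp -= 0x10000;
	out[0] = (uint16_t)(0xD800 + (cp >> 10));
	out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
	return 2;
}

/*
 * Encode one code point as UTF-8 bytes.  Returns the number of bytes
 * written (1 to 4), or 0 if "cp" is not a valid scalar value.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
	assert(out != NULL);
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	out[0] = (unsigned char)(0xF0 | (cp >> 18));
	out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (unsigned char)(0x80 | (cp & 0x3F));
	return 4;
}
```

For U+1F600 the UTF-16 form is the pair 0xD83D 0xDE00, while UTF-8 gives
the four bytes F0 9F 98 80. For plain ASCII such as U+0041 the single
UTF-16 unit 0x0041 contains a zero byte once serialized, which is the NUL
complication mentioned above; the UTF-8 form is the single byte 0x41.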

-- 
Štěpán

^ permalink raw reply

* Re: [PATCH 12/12] is_refname_available(): reimplement using do_for_each_ref_in_array()
From: Michael Haggerty @ 2011-10-24 11:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <4E9FD1C3.3090302@alum.mit.edu>

On 10/20/2011 09:46 AM, Michael Haggerty wrote:
> On 10/20/2011 03:40 AM, Junio C Hamano wrote:
>> Hmm, why is this patch and only this one in the series full of whitespace
>> violations? Did you use a different settings or something?
> 
> This happens rarely; I don't know why.  Maybe I copy-pasted snippets
> from a view in an application that expanded the tabs.  [...]

Now I think I know how this happened.  When "git diff"'s output goes to
a TTY, it passes its output through the pager.  The default pager, less,
seems to convert tabs into spaces.  I probably copy-pasted some output
of diff into my editor then removed the first column of '+' characters.

Just another reason why tabs are evil...

:-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* What's cooking in git.git (Oct 2011, #09; Sun, 23)
From: Junio C Hamano @ 2011-10-24 15:31 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking.  Commits prefixed with '-' are
only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'.

It probably is a good point to stop taking new topics and start
switching our focus to fixing bugs in the topics already in 'master'.

Here are the repositories that have my integration branches:

With maint, master, next, pu, todo, html and man:

        git://git.kernel.org/pub/scm/git/git.git
        git://repo.or.cz/alt-git.git
        https://code.google.com/p/git-core/
        https://github.com/git/git

With only maint, master, html and man:

        git://git.sourceforge.jp/gitroot/git-core/git.git
        git://git-core.git.sourceforge.net/gitroot/git-core/git-core

With all the topics and integration branches but not todo, html or man:

        https://github.com/gitster/git

By the way, I am planning to stop pushing the generated documentation
branches to the above repositories in the near term, as they are not
sources. The only reason the source repository at k.org has hosted these
branches was because it was the only repository over there that was
writable by me; it was an ugly historical and administrative workaround
and not a demonstration of the best practice.

They are pushed to their own separate repositories instead:

        git://git.kernel.org/pub/scm/git/git-{htmldocs,manpages}.git/
        git://repo.or.cz/git-{htmldocs,manpages}.git/
        https://code.google.com/p/git-{htmldocs,manpages}.git/
        https://github.com/gitster/git-{htmldocs,manpages}.git/

--------------------------------------------------
[New Topics]

* nd/pretty-commit-log-message (2011-10-23) 2 commits
 - pretty.c: use original commit message if reencoding fails
 - pretty.c: free get_header() return value

--------------------------------------------------
[Graduated to "master"]

* cn/doc-config-bare-subsection (2011-10-16) 1 commit
  (merged to 'next' on 2011-10-17 at a6412d4)
 + Documentation: update [section.subsection] to reflect what git does

* jc/broken-ref-dwim-fix (2011-10-19) 3 commits
  (merged to 'next' on 2011-10-19 at 40cad95)
 + resolve_ref(): report breakage to the caller without warning
 + resolve_ref(): expose REF_ISBROKEN flag
 + refs.c: move dwim_ref()/dwim_log() from sha1_name.c
 (this branch is tangled with jc/check-ref-format-fixup.)

This only takes the good bits from the failed jc/check-ref-format-fixup topic
and implements a saner workaround for the recent breakage on 'master'.

* jc/maint-remove-renamed-ref (2011-10-12) 1 commit
  (merged to 'next' on 2011-10-12 at 819c3e4)
 + branch -m/-M: remove undocumented RENAMED-REF

* jc/make-tags (2011-10-18) 1 commit
  (merged to 'next' on 2011-10-19 at b0b91bf)
 + Makefile: ask "ls-files" to list source files if available

* jc/match-refs-clarify (2011-09-12) 2 commits
  (merged to 'next' on 2011-10-19 at b295e1e)
 + rename "match_refs()" to "match_push_refs()"
 + send-pack: typofix error message

* jc/unseekable-bundle (2011-10-13) 2 commits
  (merged to 'next' on 2011-10-19 at 2978ee0)
 + bundle: add parse_bundle_header() helper function
 + bundle: allowing to read from an unseekable fd

* jk/daemon-msgs (2011-10-15) 1 commit
  (merged to 'next' on 2011-10-15 at 415cf53)
 + daemon: give friendlier error messages to clients
 (this branch is used by cb/daemon-permission-errors.)

* jk/maint-pack-objects-compete-with-delete (2011-10-14) 2 commits
  (merged to 'next' on 2011-10-15 at 49479e4)
 + downgrade "packfile cannot be accessed" errors to warnings
 + pack-objects: protect against disappearing packs

* mh/ref-api (2011-10-16) 7 commits
  (merged to 'next' on 2011-10-17 at 219000f)
 + clear_ref_cache(): inline function
 + write_ref_sha1(): only invalidate the loose ref cache
 + clear_ref_cache(): extract two new functions
 + clear_ref_cache(): rename parameter
 + invalidate_ref_cache(): expose this function in the refs API
 + invalidate_ref_cache(): take the submodule as parameter
 + invalidate_ref_cache(): rename function from invalidate_cached_refs()
 (this branch is used by mh/ref-api-2 and mh/ref-api-3.)

* ph/transport-with-gitfile (2011-10-11) 5 commits
  (merged to 'next' on 2011-10-12 at 6d58417)
 + Fix is_gitfile() for files too small or larger than PATH_MAX to be a gitfile
  (merged to 'next' on 2011-10-06 at 891b8b6)
 + Add test showing git-fetch groks gitfiles
 + Teach transport about the gitfile mechanism
 + Learn to handle gitfiles in enter_repo
 + enter_repo: do not modify input

* po/insn-editor (2011-10-17) 1 commit
  (merged to 'next' on 2011-10-19 at cbf5e0b)
 + "rebase -i": support special-purpose editor to edit insn sheet

* pw/p4-update (2011-10-17) 6 commits
  (merged to 'next' on 2011-10-17 at f69f6cc)
 + git-p4: handle files with shell metacharacters
 + git-p4: keyword flattening fixes
 + git-p4: stop ignoring apple filetype
 + git-p4: recognize all p4 filetypes
 + git-p4: handle utf16 filetype properly
 + git-p4 tests: refactor and cleanup

* sc/difftool-skip (2011-10-14) 2 commits
  (merged to 'next' on 2011-10-14 at b91c581)
 + t7800: avoid arithmetic expansion notation
  (merged to 'next' on 2011-10-11 at 38d7e84)
 + git-difftool: allow skipping file by typing 'n' at prompt

* ss/inet-ntop (2011-10-18) 1 commit
  (merged to 'next' on 2011-10-19 at 85469f6)
 + inet_ntop.c: Work around GCC 4.6's detection of uninitialized variables

--------------------------------------------------
[Stalled]

* hv/submodule-merge-search (2011-10-13) 4 commits
 - submodule.c: make two functions static
 - allow multiple calls to submodule merge search for the same path
 - push: Don't push a repository with unpushed submodules
 - push: teach --recurse-submodules the on-demand option

What the topic aims to achieve may make sense, but the implementation
looked somewhat suboptimal.

The fix-up at the tip queued on fg/submodule-auto-push topic has been
moved to this topic.

* sr/transport-helper-fix-rfc (2011-07-19) 2 commits
 - t5800: point out that deleting branches does not work
 - t5800: document inability to push new branch with old content

Perhaps 281eee4 (revision: keep track of the end-user input from the
command line, 2011-08-25) would help.

* jc/lookup-object-hash (2011-08-11) 6 commits
 - object hash: replace linear probing with 4-way cuckoo hashing
 - object hash: we know the table size is a power of two
 - object hash: next_size() helper for readability
 - pack-objects --count-only
 - object.c: remove duplicated code for object hashing
 - object.c: code movement for readability

I do not think there is anything fundamentally wrong with this series, but
the risk of breakage far outweighs observed performance gain in one
particular workload.

* jc/verbose-checkout (2011-10-16) 2 commits
 - checkout -v: give full status output after switching branches
 - checkout: move the local changes report to the end

This is just to leave a record that the reason why we do not do this is not
because we are incapable of coding this, but because it is not a good idea
to do this. I suspect people who are new to git that might think they need
it would soon realize they don't.

Will keep in 'pu' as a showcase for a while and then will drop.

--------------------------------------------------
[Cooking]

* tc/submodule-clone-name-detection (2011-10-21) 2 commits
  (merged to 'next' on 2011-10-23 at c18af03)
 + submodule::module_clone(): silence die() message from module_name()
 + submodule: whitespace fix

"git submodule clone" used to show unnecessary error message when
the submodule mapping from name to path is not found in the .gitmodules file.

Will merge to 'master'.

* jm/maint-gitweb-filter-forks-fix (2011-10-21) 1 commit
  (merged to 'next' on 2011-10-21 at debedcd)
 + gitweb: fix regression when filtering out forks

Will merge to 'master' shortly.

* lh/gitweb-site-html-head (2011-10-21) 1 commit
  (merged to 'next' on 2011-10-23 at 65075df)
 + gitweb: provide a way to customize html headers

Will merge to 'master' shortly.

* mh/ref-api-3 (2011-10-19) 11 commits
  (merged to 'next' on 2011-10-23 at 92e2d35)
 + is_refname_available(): reimplement using do_for_each_ref_in_array()
 + names_conflict(): simplify implementation
 + names_conflict(): new function, extracted from is_refname_available()
 + repack_without_ref(): reimplement using do_for_each_ref_in_array()
 + do_for_each_ref_in_array(): new function
 + do_for_each_ref(): correctly terminate while processesing extra_refs
 + add_ref(): take a (struct ref_entry *) parameter
 + create_ref_entry(): extract function from add_ref()
 + parse_ref_line(): add a check that the refname is properly formatted
 + repack_without_ref(): remove temporary
 + Rename another local variable name -> refname
 (this branch uses mh/ref-api-2.)

* mm/mediawiki-author-fix (2011-10-20) 1 commit
  (merged to 'next' on 2011-10-23 at 9f85b67)
 + git-remote-mediawiki: don't include HTTP login/password in author

Will merge to 'master' shortly.

* rr/revert-cherry-pick (2011-10-23) 5 commits
 - revert: simplify communicating command-line arguments
 - revert: allow mixed pick and revert instructions
 - revert: make commit subjects in insn sheet optional
 - revert: simplify getting commit subject in format_todo()
 - revert: free msg in format_todo()

The internals of "git revert/cherry-pick" have been further refactored to
serve as the basis for the sequencer.

Will merge to 'next'.

* jn/libperl-git-config (2011-10-21) 2 commits
  (merged to 'next' on 2011-10-21 at 76e2d4b)
 + Add simple test for Git::config_path() in t/t9700-perl-git.sh
 + libperl-git: refactor Git::config_*

Will merge to 'master' shortly.

* jc/check-ref-format-fixup (2011-10-19) 2 commits
  (merged to 'next' on 2011-10-19 at 98981be)
 + Revert "Restrict ref-like names immediately below $GIT_DIR"
  (merged to 'next' on 2011-10-15 at 8e89bc5)
 + Restrict ref-like names immediately below $GIT_DIR

This became a no-op except for the bottom one which is part of the other
topic now.
Will discard once the other topic graduates to 'master'.

* cb/daemon-permission-errors (2011-10-17) 2 commits
 - daemon: report permission denied error to clients
 - daemon: add tests

The tip commit might be loosening things a bit too much.
Will keep in 'pu' until hearing a convincing argument for the patch.

* kk/gitweb-side-by-side-diff (2011-10-17) 2 commits
 - gitweb: add a feature to show side-by-side diff
 - gitweb: change format_diff_line() to remove leading SP from $diff_class

Fun.
Will keep in 'pu' until the planned re-roll comes.

* mh/ref-api-2 (2011-10-17) 14 commits
  (merged to 'next' on 2011-10-19 at cc89f0e)
 + resolve_gitlink_ref_recursive(): change to work with struct ref_cache
 + Pass a (ref_cache *) to the resolve_gitlink_*() helper functions
 + resolve_gitlink_ref(): improve docstring
 + get_ref_dir(): change signature
 + refs: change signatures of get_packed_refs() and get_loose_refs()
 + is_dup_ref(): extract function from sort_ref_array()
 + add_ref(): add docstring
 + parse_ref_line(): add docstring
 + is_refname_available(): remove the "quiet" argument
 + clear_ref_array(): rename from free_ref_array()
 + refs: rename parameters result -> sha1
 + refs: rename "refname" variables
 + struct ref_entry: document name member
 + cache.h: add comments for git_path() and git_path_submodule()
 (this branch is used by mh/ref-api-3.)

The choice is either to merge this quickly to 'master' and hope there won't
be any more unexpected breakage that forces us to delay the release, or to
hold it on 'next' until the next cycle. I am inclined to do the former, but
not quite ready to commit to it yet.

* dm/pack-objects-update (2011-10-20) 4 commits
 - pack-objects: don't traverse objects unnecessarily
 - pack-objects: rewrite add_descendants_to_write_order() iteratively
 - pack-objects: use unsigned int for counter and offset values
 - pack-objects: mark add_to_write_order() as inline

Need to re-read this before deciding what to do; it came a bit too late in
the cycle for a series that touches a seriously important part of the
system.

* jk/git-tricks (2011-10-21) 3 commits
  (merged to 'next' on 2011-10-23 at 7c9bf71)
 + completion: match ctags symbol names in grep patterns
 + contrib: add git-jump script
 + contrib: add diff highlight script

* jc/signed-commit (2011-10-21) 7 commits
  (merged to 'next' on 2011-10-23 at 03eec25)
 + pretty: %G[?GS] placeholders
 + parse_signed_commit: really use the entire commit log message
 + test "commit -S" and "log --show-signature"
 + t7004: extract generic "GPG testing" bits
 + log: --show-signature
 + commit: teach --gpg-sign option
 + Split GPG interface into its own helper library

This is to replace the earlier "signed push" experiments.
Will keep in 'next' during this cycle.

* sg/complete-refs (2011-10-21) 9 commits
 - completion: remove broken dead code from __git_heads() and __git_tags()
 - completion: fast initial completion for config 'remote.*.fetch' value
 - completion: improve ls-remote output filtering in __git_refs_remotes()
 - completion: query only refs/heads/ in __git_refs_remotes()
 - completion: support full refs from remote repositories
 - completion: improve ls-remote output filtering in __git_refs()
 - completion: make refs completion consistent for local and remote repos
 - completion: optimize refs completion
 - completion: document __gitcomp()

Will merge to 'next' but won't merge further until an Ack or two from
people who have worked on the completion in the past comes.

* cn/fetch-prune (2011-10-15) 5 commits
  (merged to 'next' on 2011-10-16 at 02a449e)
 + fetch: treat --tags like refs/tags/*:refs/tags/* when pruning
 + fetch: honor the user-provided refspecs when pruning refs
 + remote: separate out the remote_find_tracking logic into query_refspecs
 + t5510: add tests for fetch --prune
 + fetch: free all the additional refspecs

"git fetch --prune" used to prune remote tracking branches by comparing
what was actually fetched and what was configured to be fetched, which was
wrong.

Will merge to 'master' shortly.

* jc/request-pull-show-head-4 (2011-10-15) 11 commits
  (merged to 'next' on 2011-10-15 at 7e340ff)
 + fmt-merge-msg.c: Fix an "dubious one-bit signed bitfield" sparse error
  (merged to 'next' on 2011-10-10 at 092175e)
 + environment.c: Fix an sparse "symbol not declared" warning
 + builtin/log.c: Fix an "Using plain integer as NULL pointer" warning
  (merged to 'next' on 2011-10-07 at fcaeca0)
 + fmt-merge-msg: use branch.$name.description
  (merged to 'next' on 2011-10-06 at fa5e0fe)
 + request-pull: use the branch description
 + request-pull: state what commit to expect
 + request-pull: modernize style
 + branch: teach --edit-description option
 + format-patch: use branch description in cover letter
 + branch: add read_branch_desc() helper function
 + Merge branch 'bk/ancestry-path' into jc/branch-desc

Allow setting "description" for branches and use it to help communications
between humans in various workflow elements.

Will keep in 'next' during this cycle.

^ permalink raw reply

* A note from the maintainer
From: Junio C Hamano @ 2011-10-24 15:32 UTC (permalink / raw)
  To: git

Welcome to the git development community.

This message is written by the maintainer and talks about how the Git
project is managed, and how you can work with it.

* Mailing list and the community

The development is primarily done on the Git mailing list. Help
requests, feature proposals, bug reports and patches should be sent to
the list address <git@vger.kernel.org>.  You don't have to be
subscribed to send messages.  The convention on the list is to keep
everybody involved on Cc:, so it is unnecessary to ask "Please Cc: me,
I am not subscribed".

Before sending patches, please read Documentation/SubmittingPatches
and Documentation/CodingGuidelines to familiarize yourself with the
project convention.

If you sent a patch and you did not hear any response from anybody for
several days, it could be that your patch was totally uninteresting,
but it also is possible that it was simply lost in the noise.  Please
do not hesitate to send a reminder message in such a case.  Messages
getting lost in the noise is a sign that people involved don't have
enough mental/time bandwidth to process them right at the moment, and
it often helps to wait until the list traffic becomes calmer before
sending such a reminder.

The list archive is available at a few public sites as well:

        http://news.gmane.org/gmane.comp.version-control.git/
        http://marc.theaimsgroup.com/?l=git
        http://www.spinics.net/lists/git/

and some people seem to prefer to read it over NNTP:

        nntp://news.gmane.org/gmane.comp.version-control.git

When you point at a message in a mailing list archive, using
gmane is often the easiest for readers to follow, like this:

        http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217

as it also allows people who subscribe to the mailing list as a gmane
newsgroup to "jump to" the article.

Some members of the development community can sometimes also be found
on the #git IRC channel on Freenode.  Its log is available at:

        http://colabti.org/irclogger/irclogger_log/git

* Reporting bugs

When you think git does not behave as you expect, please do not stop your
bug report with just "git does not work".  "I tried to do X but it did not
work" is not much better, neither is "I tried to do X and git did Y, which
is broken".  It often is that what you expect is _not_ what other people
expect, and chances are that what you expect is very different from what
people who have worked on git have expected (otherwise, the behavior
would have been changed to match that expectation a long time ago).

Please remember to always state

 - what you wanted to do;

 - what you did (the version of git and the command sequence to reproduce
   the behavior);

 - what you saw happen;

 - what you expected to see; and

 - how the last two are different.

See http://www.chiark.greenend.org.uk/~sgtatham/bugs.html for further
hints.

* Repositories, branches and documentation.

My public git.git repository is at:

        git://git.kernel.org/pub/scm/git/git.git/
	git://repo.or.cz/alt-git.git
	https://github.com/git/git
	https://code.google.com/p/git-core/

Impatient people might have better luck with the latter two (there are a
few other mirrors I push into at sourceforge and github as well).

Their gitweb interfaces are found at:

        http://git.kernel.org/?p=git/git.git
        http://repo.or.cz/w/alt-git.git

There are three branches in the git.git repository that are not about the
source tree of git: "html", "man", and "todo".

The "html" and "man" branches contain preformatted documentation from the
tip of the "master" branch; the tip of "html" is visible at:

        http://www.kernel.org/pub/software/scm/git/docs/
	http://git-core.googlecode.com/git-history/html/git.html

The above URL is the top-level documentation page, and it may have
links to documentation of older releases.

The "todo" branch was originally meant to contain a TODO list for me,
but is mostly used to keep some helper scripts I use to maintain git.
For example, the script that was used to maintain the two documentation
branches is found there as dodoc.sh, which may be a good demonstration
of how to use a post-update hook to automate a task after pushing into a
repository.

There are four branches in the git.git repository that track the source tree
of git: "master", "maint", "next", and "pu".

The "master" branch is meant to contain what is very well tested and
ready to be used in a production setting.  Every now and then, a "feature
release" is cut from the tip of this branch; these are typically named
with three dotted decimal digits.  The last such release was 1.7.7, done on
Sept 30, 2011. You can expect that the tip of the "master" branch is always
more stable than any of the released versions.

Whenever a feature release is made, the "maint" branch is forked off from
"master" at that point.  Obvious, safe and urgent fixes after a feature
release are applied to this branch and maintenance releases are cut from
it.  The maintenance releases are named with four dotted decimal digits,
after the feature release they are updates to; the last such release was
1.7.7.1.  New features never go to this branch.  This branch is also
merged into "master" to propagate the fixes forward.

New development does not usually happen on "master". When you send a
series of patches, after review on the mailing list, a separate topic
branch is forked from the tip of "master" and your patches are queued
there, and kept out of "master" while people test it out.  The quality of
topic branches is judged primarily by the mailing list discussions.

Topic branches that are in good shape are merged to the "next" branch. In
general, the "next" branch always contains the tip of "master".  It might
not be quite rock-solid production ready, but is expected to work more or
less without major breakage. The "next" branch is where new and exciting
things take place. A topic that is in "next" is expected to be polished to
perfection before it is merged to "master" (that's why "master" can be
expected to stay more stable than any released version).

The "pu" (proposed updates) branch bundles all the remaining topic
branches. The topics on the branch are not yet complete, well tested, or
well documented, and need further work. When a topic that was in "pu"
proves to be in testable shape, it is merged to "next".

You can run "git log --first-parent master..pu" to see what topics are
currently in flight.  Sometimes, an idea that looked promising turns out
to be not so good and the topic can be dropped from "pu" in such a case.

The two branches "master" and "maint" are never rewound, and "next"
usually will not be either.  After a feature release is made from
"master", however, "next" will be rebuilt from the tip of "master"
using the topics that didn't make the cut in the feature release.

Note that being in "next" is not a guarantee to appear in the next
release, nor even in any future release.  There have been cases where a
topic needed a few of its commits reverted before graduating to "master",
or where a topic that was already in "next" was reverted from "next"
because fatal flaws were found in it after it was merged.


* Other people's trees, trusted lieutenants and credits.

Documentation/SubmittingPatches outlines to whom your proposed changes
should be sent.  As described in contrib/README, I would delegate fixes
and enhancements in contrib/ area to the primary contributors of them.

Although the following are included in git.git repository, they have their
own authoritative repository and maintainers:

 - git-gui/ comes from git-gui project, maintained by Pat Thoyts:

        git://repo.or.cz/git-gui.git

 - gitk-git/ comes from Paul Mackerras's gitk project:

        git://git.kernel.org/pub/scm/gitk/gitk.git

I would like to thank everybody who helped to raise git into the current
shape.  Especially I would like to thank the git list regulars whose help
I have relied on and expect to continue relying on heavily:

 - Linus Torvalds, Shawn Pearce, Johannes Schindelin, Nicolas Pitre,
   René Scharfe, Jeff King, Jonathan Nieder, Johan Herland, Johannes
   Sixt, Sverre Rabbelier, Michael J Gruber, Nguyễn Thái Ngọc Duy,
   Ævar Arnfjörð Bjarmason and Thomas Rast on general design and
   implementation issues and reviews on the mailing list.

 - Shawn and Nicolas Pitre on pack issues.

 - Martin Langhoff, Frank Lichtenheld and Ævar Arnfjörð Bjarmason on
   cvsserver and cvsimport.

 - Paul Mackerras on gitk.

 - Eric Wong, David D. Kilzer and Sam Vilain on git-svn.

 - Simon Hausmann and Pete Wyckoff on git-p4.

 - Jakub Narebski, John Hawley, Petr Baudis, Luben Tuikov, Giuseppe Bilotta on
   gitweb.

 - J. Bruce Fields, Jonathan Nieder, Michael J Gruber and Thomas Rast on
   documentation (and countless others for proofreading and fixing).

 - Alexandre Julliard on Emacs integration.

 - David Aguilar and Charles Bailey for taking good care of git-mergetool
   (and Theodore Ts'o for creating it in the first place) and git-difftool.

 - Johannes Schindelin, Johannes Sixt, Erik Faye-Lund, Pat Thoyts and others
   for their effort to move things forward on the Windows front.

 - People on non-Linux platforms for keeping their eyes on portability;
   especially, Randal Schwartz, Theodore Ts'o, Jason Riedy, Thomas Glanzmann,
   Brandon Casey, Jeff King, Alex Riesen and countless others.

* This document

The latest copy of this document is found in git.git repository,
on 'todo' branch, as MaintNotes.


* Re: [PATCH] read-cache.c: fix index memory allocation
From: René Scharfe @ 2011-10-24 15:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, John Hsing, Matthieu Moy, git
In-Reply-To: <7vaa8q4zm9.fsf@alter.siamese.dyndns.org>

On 24.10.2011 09:28, Junio C Hamano wrote:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> 
>>  t/t7510-status-index.sh |   50 +++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 53 insertions(+), 3 deletions(-)
>>  create mode 100755 t/t7510-status-index.sh
> 
>> diff --git a/t/t7510-status-index.sh b/t/t7510-status-index.sh
>> new file mode 100755
>> index 0000000..bca359d
>> --- /dev/null
>> +++ b/t/t7510-status-index.sh
>> @@ -0,0 +1,50 @@
> 
> Hmm, I cannot seem to make this test fail without the fix on my
> Fedora 14 i686 VM when applied to v1.7.6.4 (the estimation code originates
> from cf55870, back in the v1.7.6.1 days), but it does break on 'master'.

Err, yes, I forgot to mention in the commit message that on my test
system the breakage occurs only after 2548183ba, "fix phantom untracked
files when core.ignorecase is set", which adds the pointer dir_next to
struct cache_entry.  This seems to have caused an unlucky constellation
of offsets and struct sizes for the size estimator.

> By the way, I'll move this to 7511.
> 
> Also would a patch like this help?

Only a little, I suspect.  If we've moved past the end then it's too
late.  And if we catch the error before it happens, dying is only
slightly better than crashing.
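
To make the "before it happens" case concrete, here is a minimal sketch
of a pool that checks the bounds before handing out memory, so nothing
is ever written past the end.  It is only an illustration under
assumptions, not the code in read-cache.c: struct pool and pool_carve()
are hypothetical names made up for this example.

```c
#include <stddef.h>

/* Hypothetical bump allocator standing in for the bulk buffer. */
struct pool {
	unsigned char *base;	/* start of the bulk allocation */
	size_t size;		/* bytes allocated, i.e. the estimate */
	size_t used;		/* bytes handed out so far */
};

/*
 * Carve len bytes out of the pool.  The bounds check runs before the
 * caller gets a pointer, so a too-small estimate is detected without
 * any out-of-bounds write.  Returns NULL when the request does not fit.
 */
static void *pool_carve(struct pool *p, size_t len)
{
	void *ret;

	/* used never exceeds size, so the subtraction cannot wrap */
	if (len > p->size - p->used)
		return NULL;
	ret = p->base + p->used;
	p->used += len;
	return ret;
}
```

In this sketch an exact fit is accepted; only a request that would cross
the end of the buffer is refused.  The caller still has to decide what to
do with a NULL, and dying is the obvious choice.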

> -- >8 --
> Subject: [PATCH] read_index(): die on estimation error
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  read-cache.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 0a64103..2926615 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1270,6 +1270,7 @@ int read_index_from(struct index_state *istate, const char *path)
>  	int fd, i;
>  	struct stat st;
>  	unsigned long src_offset, dst_offset;
> +	size_t bulk_alloc_size;
>  	struct cache_header *hdr;
>  	void *mmap;
>  	size_t mmap_size;
> @@ -1315,7 +1316,8 @@ int read_index_from(struct index_state *istate, const char *path)
>  	 * has room for a few  more flags, we can allocate using the same
>  	 * index size
>  	 */
> -	istate->alloc = xmalloc(estimate_cache_size(mmap_size, istate->cache_nr));
> +	bulk_alloc_size = estimate_cache_size(mmap_size, istate->cache_nr);
> +	istate->alloc = xmalloc(bulk_alloc_size);
>  	istate->initialized = 1;
>  
>  	src_offset = sizeof(*hdr);
> @@ -1331,7 +1333,9 @@ int read_index_from(struct index_state *istate, const char *path)
>  
>  		src_offset += ondisk_ce_size(ce);
>  		dst_offset += ce_size(ce);
> +		if (bulk_alloc_size <= dst_offset)
> +			die("cache size estimation error");
>  	}
>  	istate->timestamp.sec = st.st_mtime;
>  	istate->timestamp.nsec = ST_MTIME_NSEC(st);
>  


* Re: [PATCH] read-cache.c: fix index memory allocation
From: René Scharfe @ 2011-10-24 15:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, John Hsing, Matthieu Moy, git
In-Reply-To: <7vipne50lz.fsf@alter.siamese.dyndns.org>

On 24.10.2011 09:07, Junio C Hamano wrote:
> Thanks.
> 
> This approach may be the most appropriate for the maintenance track, but
> for the purpose of going forward, I wonder if we really want to keep the
> "estimate and allocate a large pool, and carve out individual pieces"
> scheme.
> 
> This bulk allocation dates back to the days when we didn't have ondisk vs
> incore representation differences, IIRC, and as a result we deliberately
> leak cache entries whenever an entry in the index is replaced with a new
> one. Does the overhead to allocate individually really kill us that much
> for say a tree with 30k files in it?

Probably not; unpack_trees() does that already.  (It calls
create_ce_entry() via unpack_nondirectories() via unpack_callback() via
traverse_trees()).
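
For comparison, allocating each entry individually could look roughly
like the sketch below.  This is an assumption-laden illustration and not
what create_ce_entry() does: struct entry and entry_new() are
hypothetical names, and the layout is far simpler than struct
cache_entry.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-core entry; not git's struct cache_entry. */
struct entry {
	unsigned int flags;
	size_t namelen;
	char name[];		/* flexible array member, NUL-terminated */
};

/*
 * Allocate one entry sized exactly for its name.  No up-front size
 * estimate is needed, and a replaced entry can simply be free()d
 * instead of being leaked inside a bulk buffer.
 */
static struct entry *entry_new(const char *name, unsigned int flags)
{
	size_t len = strlen(name);
	struct entry *e = malloc(sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->flags = flags;
	e->namelen = len;
	memcpy(e->name, name, len + 1);	/* copies the trailing NUL too */
	return e;
}
```

The cost is one malloc() per entry, which is the overhead asked about
above.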

René


* [PATCH v4 0/3] port upload-archive to Windows
From: Erik Faye-Lund @ 2011-10-24 16:02 UTC (permalink / raw)
  To: git; +Cc: gitster, j6t, peff, rene.scharfe

Here's a new iteration of this series. I delayed it until the
improved version of "enter_repo: do not modify input" hit master,
which happened recently.

The important change in this iteration (besides the patch that
already propagated upstream) is that I've moved
compat/win32/sys/poll.[ch] out of the sys-folder (as XSI suggests).
This enables us to easily upgrade the poll-emulation without
breaking the Windows build in the process.

Erik Faye-Lund (3):
  mingw: move poll out of sys-folder
  compat/win32/poll.c: upgrade from upstream
  upload-archive: use start_command instead of fork

 Makefile                 |    6 +-
 builtin/archive.c        |    6 +-
 builtin/upload-archive.c |   68 ++----
 compat/mingw.h           |    2 -
 compat/win32/poll.c      |  606 ++++++++++++++++++++++++++++++++++++++++++++++
 compat/win32/poll.h      |   53 ++++
 compat/win32/sys/poll.c  |  599 ---------------------------------------------
 compat/win32/sys/poll.h  |   53 ----
 t/t5000-tar-tree.sh      |   10 +-
 9 files changed, 694 insertions(+), 709 deletions(-)
 create mode 100644 compat/win32/poll.c
 create mode 100644 compat/win32/poll.h
 delete mode 100644 compat/win32/sys/poll.c
 delete mode 100644 compat/win32/sys/poll.h

-- 
1.7.7.msysgit.1.1.g7b316
