From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 10/14] ls-files: support "sparse patterns", used to form sparse checkout areas
Date: Sat, 20 Sep 2008 17:01:49 +0700
Message-ID: <1221904913-25887-11-git-send-email-pclouds@gmail.com>
In-Reply-To: <1221904913-25887-10-git-send-email-pclouds@gmail.com>

This implements sparse patterns and adds a --narrow-match option to
ls-files so that the patterns can be tested.

Sparse patterns are basically like .gitignore patterns, but several of
them can be combined on one line, separated by colons as in $PATH.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
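To illustrate the colon-separated syntax described in the commit message,
here is a minimal sketch of how a spec such as '1\:2:1' could be split into
individual patterns. It is not the code from this patch: next_pattern() is
a hypothetical helper invented for this note, and it assumes that a
backslash protects the following byte (including a colon) and that empty
patterns are simply skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: copy the next
 * colon-separated pattern from *cursor into buf (at most size-1 bytes),
 * keeping backslash escapes intact so that "\:" stays inside the pattern.
 * Advances *cursor past the pattern and its separator.
 * Returns 1 if a pattern was produced, 0 at the end of the spec.
 * Patterns longer than the buffer are silently truncated.
 */
static int next_pattern(const char **cursor, char *buf, size_t size)
{
	const char *p = *cursor;
	size_t n = 0;

	assert(size > 0);
	while (*p == ':')	/* skip empty patterns such as "a::b" */
		p++;
	if (!*p) {
		*cursor = p;
		return 0;
	}
	while (*p && *p != ':') {
		if (*p == '\\' && p[1]) {
			/* keep the backslash, then fall through to the escaped byte */
			if (n + 1 < size)
				buf[n++] = *p;
			p++;
		}
		if (n + 1 < size)
			buf[n++] = *p;
		p++;
	}
	buf[n] = '\0';
	*cursor = *p == ':' ? p + 1 : p;
	return 1;
}
```

Called repeatedly with a cursor that starts at the beginning of the spec,
the helper yields one pattern per call until it returns 0. Keeping the
backslash in the output leaves the unescaping to the matcher, which is
also what fnmatch(3) expects.
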
 Documentation/git-checkout.txt   |   45 +++++++++++++
 Documentation/git-ls-files.txt   |    8 ++-
 builtin-ls-files.c               |   21 +++++-
 t/t3003-ls-files-narrow-match.sh |   39 +++++++++++
 t/t3003/1                        |    3 +
 t/t3003/12                       |    6 ++
 t/t3003/clone-escape             |    4 +
 t/t3003/cur-12                   |    2 +
 t/t3003/root-sub-1               |    1 +
 t/t3003/slash-1                  |    1 +
 t/t3003/sub-1                    |    2 +
 t/t3003/sub-only                 |    3 +
 t/t3003/subsub-slash             |    3 +
 unpack-trees.c                   |  136 ++++++++++++++++++++++++++++++++++++++
 unpack-trees.h                   |   18 +++++
 15 files changed, 288 insertions(+), 4 deletions(-)
 create mode 100755 t/t3003-ls-files-narrow-match.sh
 create mode 100644 t/t3003/1
 create mode 100644 t/t3003/12
 create mode 100644 t/t3003/clone-escape
 create mode 100644 t/t3003/cur-12
 create mode 100644 t/t3003/root-sub-1
 create mode 100644 t/t3003/slash-1
 create mode 100644 t/t3003/sub
 create mode 100644 t/t3003/sub-1
 create mode 100644 t/t3003/sub-only
 create mode 100644 t/t3003/subsub-slash
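
The matching rules documented in this patch (negation with '!', a trailing
slash for directories, basename matching when the pattern has no slash,
FNM_PATHNAME matching otherwise) can be sketched as below. This is a
simplified illustration, not the patch's match_narrow_spec():
sketch_match() is a hypothetical name, the patterns are assumed to be
already split, and the prefix handling, the leading "/" and "./" forms and
backslash unescaping are all left out. It relies on POSIX fnmatch(3).

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/*
 * Hypothetical sketch: return 1 if `path` is selected by the `n`
 * patterns, 0 otherwise.  Patterns are tried from last to first, so a
 * later pattern (possibly negated with '!') overrides an earlier one.
 */
static int sketch_match(const char **patterns, int n, const char *path)
{
	int i;

	assert(path != NULL);
	for (i = n - 1; i >= 0; i--) {
		const char *pat = patterns[i];
		int negative = *pat == '!';
		size_t len;
		int match;

		if (negative)
			pat++;
		len = strlen(pat);
		if (len && pat[len - 1] == '/') {
			/* "dir/" selects everything underneath that directory */
			match = strncmp(path, pat, len) == 0;
		} else if (strchr(pat, '/')) {
			/* pattern with a slash: wildcards must not cross a '/' */
			match = fnmatch(pat, path, FNM_PATHNAME) == 0;
		} else {
			/* no slash: compare against the last path component only */
			const char *base = strrchr(path, '/');
			base = base ? base + 1 : path;
			match = fnmatch(pat, base, 0) == 0;
		}
		if (match)
			return !negative;
	}
	return 0;	/* no pattern matched */
}
```

With the patterns "sub/" and "!sub/subsub/", as in the sub-only test,
sub/1 is selected while sub/subsub/1 is not, because the negated pattern
is tried first.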

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 2b344e1..d6f94a6 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -205,6 +205,51 @@ is "assume unchanged" bit just ignores corresponding files in working
 directory while sparse checkout goes a bit farther, remove those files
 when it is safe to do so.
 
+Sparse patterns
+---------------
+
+Sparse patterns specify how you want to form your checkout area.
+Many patterns can be specified on one line, separated by colons.
+The patterns specify which files should or should not be checked out
+in the working directory (depending on the option used with the patterns).
+Patterns have the following format:
+
+ - An optional prefix '!' which negates the pattern; any file
+   matched by a previous pattern will become unmatched again.
+   If a negated pattern matches, it overrides
+   lower-precedence pattern sources.
+
+ - If the pattern ends with a slash, it is removed for the
+   purpose of the following description, but it would only find
+   a match with a directory.  In other words, `foo/` will match a
+   directory `foo` and paths underneath it, but will not match a
+   regular file or a symbolic link `foo` (this is consistent
+   with the way pathspec works in general in git).
+
+ - If the pattern does not contain a slash '/', git treats it as
+   a shell glob pattern and checks for a match against the
+   pathname without leading directories.
+
+ - Otherwise, git treats the pattern as a shell glob suitable
+   for consumption by fnmatch(3) with the FNM_PATHNAME flag:
+   wildcards in the pattern will not match a / in the pathname.
+   For example, "Documentation/\*.html" matches
+   "Documentation/git.html" but not
+   "Documentation/ppc/ppc.html".  A leading slash matches the
+   beginning of the pathname; for example, "/*.c" matches
+   "cat-file.c" but not "mozilla-sha1/sha1.c".
+
+ - Patterns beginning with a slash match against the full pathname,
+   as opposed to the normal case, where a pattern only matches
+   pathnames relative to the current working directory.
+
+ - Patterns beginning with "./" are treated like normal patterns; that
+   is, they follow the rules above. But because such a pattern contains
+   a slash, the "fnmatch rule" applies. This is a work-around for when
+   you do not want the "no slash" rule to apply.
+
+ - Because colons are used to separate patterns, you cannot put them
+   in patterns directly. You must quote them with a backslash.
 
 EXAMPLES
 --------
diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 1de68e2..fbed73b 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -12,7 +12,7 @@ SYNOPSIS
 'git ls-files' [-z] [-t] [-v]
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified|orphaned|no-checkout])\*
 		(-[c|d|o|i|s|u|k|m])\*
-		[--sparse]
+		[--sparse] [--narrow-match=<sparse patterns>]
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 		[--exclude-per-directory=<file>]
@@ -90,6 +90,12 @@ OPTIONS
 	No-checkout entries can be shown using --orphaned or
 	--no-checkout (or both).
 
+--narrow-match=<sparse patterns>::
+	This option can be used to test sparse patterns. The given sparse patterns will
+	be used to filter ls-files output. Entries not matching the spec will be
+	ignored. This option can only be used with --cached or --stage.
+	See linkgit:git-checkout[1] for more information about sparse patterns.
+
 -z::
 	\0 line termination on output.
 
diff --git a/builtin-ls-files.c b/builtin-ls-files.c
index 873de15..1c81022 100644
--- a/builtin-ls-files.c
+++ b/builtin-ls-files.c
@@ -10,6 +10,8 @@
 #include "dir.h"
 #include "builtin.h"
 #include "tree.h"
+#include "tree-walk.h"
+#include "unpack-trees.h"
 
 static int abbrev;
 static int show_deleted;
@@ -31,6 +33,7 @@ static const char **pathspec;
 static int error_unmatch;
 static char *ps_matched;
 static const char *with_tree;
+static struct narrow_spec *narrow_spec;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -187,7 +190,7 @@ static void show_ce_entry(const char *tag, struct cache_entry *ce)
 	int len = prefix_len;
 	int offset = prefix_offset;
 
-	if (len >= ce_namelen(ce))
+	if (len >= ce_namelen(ce) && !narrow_spec)
 		die("git ls-files: internal error - cache entry not superset of prefix");
 
 	if (pathspec && !pathspec_match(pathspec, ps_matched, ce->name, len))
@@ -260,6 +263,8 @@ static void show_files(struct dir_struct *dir, const char *prefix)
 			}
 			if (!(show_cached | show_stage))
 				continue;
+			if (narrow_spec && !match_narrow_spec(narrow_spec, ce->name))
+				continue;
 			show_ce_entry(ce_stage(ce) ? tag_unmerged : tag_cached, ce);
 		}
 	}
@@ -439,7 +444,7 @@ int report_path_error(const char *ps_matched, const char **pathspec, int prefix_
 
 static const char ls_files_usage[] =
 	"git ls-files [-z] [-t] [-v] (--[cached|deleted|others|stage|unmerged|killed|modified|orphaned|no-checkout])* "
-	"[ --sparse ] "
+	"[ --sparse ] [--narrow-match=<narrow_spec>] "
 	"[ --ignored ] [--exclude=<pattern>] [--exclude-from=<file>] "
 	"[ --exclude-per-directory=<filename> ] [--exclude-standard] "
 	"[--full-name] [--abbrev] [--] [<file>]*";
@@ -498,6 +503,10 @@ int cmd_ls_files(int argc, const char **argv, const char *prefix)
 			sparse_checkout = 1;
 			continue;
 		}
+		if (!prefixcmp(arg, "--narrow-match=")) {
+			narrow_spec = parse_narrow_spec(arg+15, prefix);
+			continue;
+		}
 		if (!strcmp(arg, "-d") || !strcmp(arg, "--deleted")) {
 			show_deleted = 1;
 			continue;
@@ -629,8 +638,14 @@ int cmd_ls_files(int argc, const char **argv, const char *prefix)
 	      show_killed | show_modified | show_orphaned | show_no_checkout))
 		show_cached = 1;
 
+	if (narrow_spec && !show_cached && !show_stage)
+		die("ls-files: --narrow-match can only be used with either --cached or --stage");
+
+	if (narrow_spec && narrow_spec->has_root && prefix_offset != 0)
+		die("ls-files: --narrow-match with root matching patterns requires --full-name");
+
 	read_cache();
-	if (prefix)
+	if (prefix && (!narrow_spec || !narrow_spec->has_root))
 		prune_cache(prefix);
 	if (with_tree) {
 		/*
diff --git a/t/t3003-ls-files-narrow-match.sh b/t/t3003-ls-files-narrow-match.sh
new file mode 100755
index 0000000..5611cab
--- /dev/null
+++ b/t/t3003-ls-files-narrow-match.sh
@@ -0,0 +1,39 @@
+#!/bin/sh
+
+test_description='This test is for narrow spec matching'
+
+. test-lib.sh
+
+D="$(cd ..;pwd)"/t3003
+
+test_pattern() {
+	test_expect_success "pattern $1" '
+		(
+		if [ -n "'$3'" ]; then cd '$3'; fi
+		git ls-files --full-name --narrow-match="'"$2"'" > result &&
+		diff -u result "'"$D/$1"'"
+		)
+	'
+}
+
+test_expect_success 'setup' '
+	touch 1 2 3 "1:2" &&
+	mkdir -p sub/subsub &&
+	touch sub/1 sub/2 sub/3 &&
+	touch sub/subsub/1 sub/subsub/2 sub/subsub/3 &&
+	git add .
+'
+
+test_pattern 1 1
+test_pattern sub sub
+test_pattern sub-1 1 sub
+test_pattern root-sub-1 /1 sub
+test_pattern subsub-slash subsub/ sub
+test_pattern sub-only 'sub/:!sub/subsub/'
+test_pattern 12 1:2
+test_pattern cur-12 ./1:./2
+test_pattern slash-1 'sub/*1'
+test_pattern clone-escape '1\:2:1'
+
+test_done
+
diff --git a/t/t3003/1 b/t/t3003/1
new file mode 100644
index 0000000..9b73321
--- /dev/null
+++ b/t/t3003/1
@@ -0,0 +1,3 @@
+1
+sub/1
+sub/subsub/1
diff --git a/t/t3003/12 b/t/t3003/12
new file mode 100644
index 0000000..5d71811
--- /dev/null
+++ b/t/t3003/12
@@ -0,0 +1,6 @@
+1
+2
+sub/1
+sub/2
+sub/subsub/1
+sub/subsub/2
diff --git a/t/t3003/clone-escape b/t/t3003/clone-escape
new file mode 100644
index 0000000..11cdf68
--- /dev/null
+++ b/t/t3003/clone-escape
@@ -0,0 +1,4 @@
+1
+1:2
+sub/1
+sub/subsub/1
diff --git a/t/t3003/cur-12 b/t/t3003/cur-12
new file mode 100644
index 0000000..1191247
--- /dev/null
+++ b/t/t3003/cur-12
@@ -0,0 +1,2 @@
+1
+2
diff --git a/t/t3003/root-sub-1 b/t/t3003/root-sub-1
new file mode 100644
index 0000000..d00491f
--- /dev/null
+++ b/t/t3003/root-sub-1
@@ -0,0 +1 @@
+1
diff --git a/t/t3003/slash-1 b/t/t3003/slash-1
new file mode 100644
index 0000000..5798e42
--- /dev/null
+++ b/t/t3003/slash-1
@@ -0,0 +1 @@
+sub/1
diff --git a/t/t3003/sub b/t/t3003/sub
new file mode 100644
index 0000000..e69de29
diff --git a/t/t3003/sub-1 b/t/t3003/sub-1
new file mode 100644
index 0000000..3ef951a
--- /dev/null
+++ b/t/t3003/sub-1
@@ -0,0 +1,2 @@
+sub/1
+sub/subsub/1
diff --git a/t/t3003/sub-only b/t/t3003/sub-only
new file mode 100644
index 0000000..3115212
--- /dev/null
+++ b/t/t3003/sub-only
@@ -0,0 +1,3 @@
+sub/1
+sub/2
+sub/3
diff --git a/t/t3003/subsub-slash b/t/t3003/subsub-slash
new file mode 100644
index 0000000..bc585b0
--- /dev/null
+++ b/t/t3003/subsub-slash
@@ -0,0 +1,3 @@
+sub/subsub/1
+sub/subsub/2
+sub/subsub/3
diff --git a/unpack-trees.c b/unpack-trees.c
index e59d144..ce4c826 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -726,6 +726,142 @@ static void show_stage_entry(FILE *o,
 }
 #endif
 
+struct narrow_spec *parse_narrow_spec(const char *spec, const char *prefix)
+{
+	struct narrow_spec *ns;
+	struct narrow_pattern *p;
+	const char *start = spec, *end;
+	int has_wildcards, has_slashes;
+
+	ns = xmalloc(sizeof(*ns));
+	memset(ns, 0, sizeof(*ns));
+	if (prefix)
+		ns->prefix = xstrdup(prefix);
+
+	while (*start) {
+		end = start;
+		has_slashes = has_wildcards = 0;
+		while (*end && *end != ':') {
+			if (*end == '*' || *end == '[' || *end == '?')
+				has_wildcards = 1;
+			if (*end == '/')
+				has_slashes = 1;
+			if (*end == '\\') {
+				end++;
+				has_wildcards = 1;
+				if (*end == '\0') /* trailing backslash */
+					break;
+			}
+			end++;
+		}
+		if (start == end && *start++ == ':') /* empty pattern: step over the colon */
+			continue;
+
+		p = xmalloc(offsetof(struct narrow_pattern, pattern)+(end-start)+1);
+		p->negative = *start == '!';
+		if (p->negative)
+			start++;
+		p->has_slashes = has_slashes;
+		p->has_wildcards = has_wildcards;
+		p->has_trailing_slash = end[-1] == '/';
+		p->has_root = *start == '/';
+		if (p->has_root)
+			start++;
+		else if (*start == '.' && start[1] == '/')
+			start += 2;
+		p->len = end-start;
+		memcpy(p->pattern, start, p->len);
+		p->pattern[p->len] = '\0';
+
+		ALLOC_GROW(ns->patterns, ns->nr + 1, ns->alloc);
+		ns->patterns[ns->nr++] = p;
+		ns->has_root |= p->has_root;
+
+		if (*end != ':')
+			break;
+		start = end + 1;
+	}
+	return ns;
+}
+
+int match_narrow_spec(struct narrow_spec *spec, const char *path)
+{
+	int i, prefix_len = 0;
+
+	if (!spec || !spec->nr)
+		return 1; /* always match if spec is NULL or empty */
+
+	if (spec->prefix) {
+		/*
+		 * optimization:
+		 * if there is no pattern with leading slash
+		 * then it is safe to only match inside prefix
+		 */
+		if (!spec->has_root && prefixcmp(path, spec->prefix))
+			return 0;
+		prefix_len = strlen(spec->prefix);
+	}
+
+	for (i = spec->nr - 1;i >= 0; i--) {
+		struct narrow_pattern *p = spec->patterns[i];
+		const char *new_path = path + prefix_len;
+		int match;
+
+		if (p->has_root)
+			new_path = path; /* match full path */
+		else if (spec->has_root) {
+			if (spec->prefix && prefixcmp(path, spec->prefix))
+				continue;
+		}
+		/* !spec->has_root case has been handled above */
+
+		if (p->has_trailing_slash) {
+			/* the only "wildcard" here is backslash escape */
+			if (p->has_wildcards) {
+				char *unescaped_pattern = xstrdup(p->pattern);
+				char *src, *dst;
+
+				src = dst = unescaped_pattern;
+				while (*src) {
+					if (*src == '\\')
+						src++;
+					if (src != dst)
+						*dst = *src;
+					src++;
+					dst++;
+				}
+				*dst = '\0';
+				match = prefixcmp(new_path, unescaped_pattern) == 0;
+				free(unescaped_pattern);
+			}
+			else
+				match = prefixcmp(new_path, p->pattern) == 0;
+		}
+		else if (p->has_slashes) {
+			if (p->has_wildcards)
+				match = fnmatch(p->pattern, new_path, FNM_PATHNAME) == 0;
+			else
+				match = strcmp(p->pattern, new_path) == 0;
+		}
+		else {
+			const char *basename = strrchr(path + prefix_len, '/');
+			if (basename)
+				basename++;
+			else
+				basename = path + prefix_len;
+			if (p->has_wildcards)
+				match = fnmatch(p->pattern, basename, 0) == 0;
+			else
+				match = strcmp(p->pattern, basename) == 0;
+		}
+		if (match)
+			return p->negative ? 0 : 1;
+	}
+
+	/* no pattern is matched */
+	return 0;
+}
+
 int threeway_merge(struct cache_entry **stages, struct unpack_trees_options *o)
 {
 	struct cache_entry *index;
diff --git a/unpack-trees.h b/unpack-trees.h
index 0d26f3d..6b1971f 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -16,6 +16,22 @@ struct unpack_trees_error_msgs {
 	const char *bind_overlap;
 };
 
+struct narrow_spec {
+	int nr;
+	int alloc;
+	int has_root:1;
+	const char *prefix;
+	struct narrow_pattern {
+		int len;
+		int has_root:1;
+		int has_slashes:1;
+		int has_wildcards:1;
+		int has_trailing_slash:1;
+		int negative:1;
+		char pattern[FLEX_ARRAY];
+	} **patterns;
+};
+
 struct unpack_trees_options {
 	unsigned int reset:1,
 		     merge:1,
@@ -48,6 +64,8 @@ struct unpack_trees_options {
 extern int unpack_trees(unsigned n, struct tree_desc *t,
 		struct unpack_trees_options *options);
 
+struct narrow_spec *parse_narrow_spec(const char *spec, const char *prefix);
+int match_narrow_spec(struct narrow_spec *spec, const char *path);
 int threeway_merge(struct cache_entry **stages, struct unpack_trees_options *o);
 int twoway_merge(struct cache_entry **src, struct unpack_trees_options *o);
 int bind_merge(struct cache_entry **src, struct unpack_trees_options *o);
-- 
1.6.0.96.g2fad1.dirty

