git.vger.kernel.org archive mirror
* [PATCHv2] grep: use slash for path delimiter, not colon
@ 2013-08-26 14:46 Phil Hord
  2013-08-26 19:28 ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Phil Hord @ 2013-08-26 14:46 UTC
  To: git; +Cc: phil.hord, Jeff King, Junio C Hamano, Jonathan Nieder, Phil Hord

When a commit is grepped and matching filenames are printed, grep-objects
creates the filename by prefixing the original cmdline argument to the
matched path separated by a colon.  Normally this forms a valid blob
reference to the filename, like this:

  git grep -l foo HEAD
  HEAD:some/path/to/foo.txt
      ^

But a tree path may be given to grep instead; in this case the colon is
not a valid delimiter to use since it is placed inside a path.

  git grep -l foo HEAD:some
  HEAD:some:path/to/foo.txt
           ^

The slash path delimiter should be used instead.  Fix git grep to
discern the correct delimiter so it can report valid object names.

  git grep -l foo HEAD:some
  HEAD:some/path/to/foo.txt
           ^

Also, prevent the delimiter from being added twice, as happens now in
these examples:

  git grep -l foo HEAD:
  HEAD::some/path/to/foo.txt
       ^
  git grep -l foo HEAD:some/
  HEAD:some/:path/to/foo.txt
            ^

Add a test to confirm correct path forming.
---
This version is a bit more deterministic and also adds a test.

It accepts the cost of examining the path argument a second time to
determine whether it is a tree-ish plus a path rather than just a
tree-ish (such as a commit).  get_sha1 is called only one extra time for
each tree-ish argument, so this is not expensive.  This way we avoid
mucking with the object_array API, and we do not rely on the object type
to tell us anything about how the object name was spelled.

This one also adds a check to avoid duplicating an extant delimiter.
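
As a rough illustration of the delimiter rules described above, here is a
standalone sketch in C.  It is not the code from this patch:
pick_delimiter() and join_name() are hypothetical helpers, and looking
for text after the first ':' only approximates what
get_sha1_with_context() reports in ctx.path (it would misjudge a name
such as ":/message", for example).

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: choose the delimiter to place
 * between the object name from the command line and a path found under it.
 * Assumption: a non-empty string after the first ':' means the name
 * already contains a path (the patch asks get_sha1_with_context() instead).
 */
static char pick_delimiter(const char *name)
{
	const char *colon = strchr(name, ':');

	/* "HEAD:some" already names a path inside a tree: extend it with '/' */
	if (colon && colon[1] != '\0')
		return '/';
	/* "HEAD" or "HEAD:" names a tree-ish without a path: use ':' */
	return ':';
}

/*
 * Hypothetical helper: write "<name><delimiter><path>" into out without
 * doubling a delimiter that name already ends with.
 * Returns 0 on success, -1 if out is too small.
 */
static int join_name(char *out, size_t outlen,
		     const char *name, const char *path)
{
	size_t len = strlen(name);
	int n;

	if (!len) {
		/* no name given: print the bare path */
		n = snprintf(out, outlen, "%s", path);
	} else {
		char delimiter = pick_delimiter(name);

		if (name[len - 1] == delimiter)
			n = snprintf(out, outlen, "%s%s", name, path);
		else
			n = snprintf(out, outlen, "%s%c%s",
				     name, delimiter, path);
	}
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With these helpers, "HEAD" joined with "some/path/to/foo.txt", and each
of "HEAD:", "HEAD:some" and "HEAD:some/" joined with the remaining part
of the path, all come out as HEAD:some/path/to/foo.txt.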

 builtin/grep.c  |  9 ++++++++-
 t/t7810-grep.sh | 15 +++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 03bc442..6fc418f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -480,8 +480,15 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		len = name ? strlen(name) : 0;
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
+			struct object_context ctx;
+			unsigned char sha1[20];
+			char delimiter = ':';
+			if (!get_sha1_with_context(name, 0, sha1, &ctx) &&
+				ctx.path[0]!=0)
+				delimiter='/';
 			strbuf_add(&base, name, len);
-			strbuf_addch(&base, ':');
+			if (name[len-1] != delimiter)
+				strbuf_addch(&base, delimiter);
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index f698001..2494bfc 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -886,6 +886,21 @@ test_expect_success 'grep -e -- -- path' '
 '
 
 cat >expected <<EOF
+HEAD:t/a/v:1:vvv
+HEAD:t/v:1:vvv
+EOF
+
+test_expect_success "grep HEAD -- path/" '
+	git grep -n -e vvv HEAD -- t/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success "grep HEAD:path" '
+	git grep -n -e vvv HEAD:t/ >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<EOF
 hello.c:int main(int argc, const char **argv)
 hello.c:	printf("Hello world.\n");
 EOF
-- 
1.8.4.557.g34b3a2e

* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
@ 2013-09-22 19:15 Jonathon Mah
  2013-09-24  6:57 ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathon Mah @ 2013-09-22 19:15 UTC
  To: phil.hord, Jeff King; +Cc: git@vger.kernel.org

> >     HEAD:/some/path/to/foo.txt
> >     HEAD:some/path/to/foo.txt
> 
> With my patch it prints the latter.
> 
> This is because get_sha1_with_context("HEAD:"...) returns an empty
> 'path' string.  The code decides to use ':' as the delimiter in that
> case, but it sees there already is one at the end of "HEAD:".

A few days ago I came across the same "surprising" output of git-grep, tried to adjust the code to print "git show"-able object names, and ran into similar subtleties. I just found this thread, and Jeff's code handles more cases than mine did (I didn't try Phil's initial patch), but I can add some more test cases with non-showable output (again related to git-grep's path scoping):

$ git grep -l cache HEAD:./ | head -1
HEAD:./:.gitignore

$ cd Documentation
$ git grep -l cache HEAD | head -1
HEAD:CodingGuidelines

$ git grep -l cache HEAD:Documentation/CodingGuidelines
../HEAD:Documentation/CodingGuidelines
(woah!)

Sorry that I don't yet have anything useful to suggest! But I can tell the story of my use case:

I have a large repository (1.6GB bare) which I don't work on, but which contains code that I need to refer to. A checkout is ~600MB and 27k files, which I'd like to avoid (it's redundant data, and would slow down backups of my drive). I found myself "git-grep"ping through parts of the tree, looking through the results, and then "git-show"ing interesting files. Having a real object name in the grep output allows copy-and-paste of the object path.



Jonathon Mah
me@JonathonMah.com


end of thread, other threads:[~2013-09-24  6:57 UTC | newest]

Thread overview: 13+ messages
2013-08-26 14:46 [PATCHv2] grep: use slash for path delimiter, not colon Phil Hord
2013-08-26 19:28 ` Jeff King
2013-08-26 19:53   ` Jeff King
2013-08-26 19:54     ` [PATCH 1/2] grep: stop using object_array Jeff King
2013-08-26 19:56     ` [PATCH 2/2] grep: use slash for path delimiter, not colon Jeff King
2013-08-26 20:13       ` Johannes Sixt
2013-08-26 20:52         ` Phil Hord
2013-08-26 20:52         ` Jeff King
2013-08-26 21:03           ` Phil Hord
2013-08-26 21:13             ` Jeff King
2013-08-27  3:37     ` [PATCHv2] " Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2013-09-22 19:15 [PATCH 2/2] " Jonathon Mah
2013-09-24  6:57 ` Jeff King
