* [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 19:53 [PATCHv2] " Jeff King
@ 2013-08-26 19:56 ` Jeff King
2013-08-26 20:13 ` Johannes Sixt
0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2013-08-26 19:56 UTC (permalink / raw)
To: Phil Hord; +Cc: git, phil.hord, Junio C Hamano, Jonathan Nieder
From: Phil Hord <hordp@cisco.com>
When a commit is grepped and matching filenames are printed, grep-objects
creates the filename by prefixing the original cmdline argument to the
matched path separated by a colon. Normally this forms a valid blob
reference to the filename, like this:
git grep -l foo HEAD
HEAD:some/path/to/foo.txt
^
But a tree path may be given to grep instead; in this case the colon is
not a valid delimiter to use since it is placed inside a path.
git grep -l foo HEAD:some
HEAD:some:path/to/foo.txt
^
The slash path delimiter should be used instead. Fix git grep to
discern the correct delimiter so it can report valid object names.
git grep -l foo HEAD:some
HEAD:some/path/to/foo.txt
^
Also, prevent the delimiter being added twice, as happens now in these
examples:
git grep -l foo HEAD:
HEAD::some/path/to/foo.txt
^
git grep -l foo HEAD:some/
HEAD:some/:path/to/foo.txt
^
Add a test to confirm correct path forming.
Signed-off-by: Jeff King <peff@peff.net>
---
I left the author as you, since you have done all the hard work; this is
really just me rebasing your patch on top of mine. But note that you did
not signoff the original.
builtin/grep.c | 13 +++++++++----
t/t7810-grep.sh | 15 +++++++++++++++
2 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index ee47d49..2df7986 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,7 @@ struct object_to_grep {
struct object_to_grep {
struct object *item;
const char *name;
+ unsigned has_path:1;
};
static int use_threads = 1;
@@ -462,7 +463,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
}
static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
- struct object *obj, const char *name)
+ struct object *obj, const char *name, char delimiter)
{
if (obj->type == OBJ_BLOB)
return grep_sha1(opt, obj->sha1, name, 0, NULL);
@@ -485,7 +486,8 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
strbuf_init(&base, PATH_MAX + len + 1);
if (len) {
strbuf_add(&base, name, len);
- strbuf_addch(&base, ':');
+ if (name[len-1] != delimiter)
+ strbuf_addch(&base, delimiter);
}
init_tree_desc(&tree, data, size);
hit = grep_tree(opt, pathspec, &tree, &base, base.len,
@@ -506,7 +508,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
for (i = 0; i < nr; i++) {
struct object *real_obj;
real_obj = deref_tag(list[i].item, NULL, 0);
- if (grep_object(opt, pathspec, real_obj, list[i].name)) {
+ if (grep_object(opt, pathspec, real_obj, list[i].name,
+ list[i].has_path ? '/' : ':')) {
hit = 1;
if (opt->status_only)
break;
@@ -822,8 +825,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
for (i = 0; i < argc; i++) {
const char *arg = argv[i];
unsigned char sha1[20];
+ struct object_context oc;
/* Is it a rev? */
- if (!get_sha1(arg, sha1)) {
+ if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
struct object *object = parse_object_or_die(sha1, arg);
if (!seen_dashdash)
verify_non_filename(prefix, arg);
@@ -831,6 +835,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
ALLOC_GROW(list, list_nr+1, list_alloc);
list[list_nr].item = object;
list[list_nr].name = arg;
+ list[list_nr].has_path = !!oc.path[0];
list_nr++;
continue;
}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index f698001..2494bfc 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -886,6 +886,21 @@ cat >expected <<EOF
'
cat >expected <<EOF
+HEAD:t/a/v:1:vvv
+HEAD:t/v:1:vvv
+EOF
+
+test_expect_success "grep HEAD -- path/" '
+ git grep -n -e vvv HEAD -- t/ >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success "grep HEAD:path" '
+ git grep -n -e vvv HEAD:t/ >actual &&
+ test_cmp expected actual
+'
+
+cat >expected <<EOF
hello.c:int main(int argc, const char **argv)
hello.c: printf("Hello world.\n");
EOF
--
1.8.4.2.g87d4a77
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 19:56 ` [PATCH 2/2] " Jeff King
@ 2013-08-26 20:13 ` Johannes Sixt
2013-08-26 20:52 ` Phil Hord
2013-08-26 20:52 ` Jeff King
0 siblings, 2 replies; 8+ messages in thread
From: Johannes Sixt @ 2013-08-26 20:13 UTC (permalink / raw)
To: phil.hord; +Cc: Jeff King, Phil Hord, git, Junio C Hamano, Jonathan Nieder
Am 26.08.2013 21:56, schrieb Jeff King:
> Also, prevent the delimiter being added twice, as happens now in these
> examples:
>
> git grep -l foo HEAD:
> HEAD::some/path/to/foo.txt
> ^
Which one of these two does it print then?
HEAD:/some/path/to/foo.txt
HEAD:some/path/to/foo.txt
-- Hannes
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 20:13 ` Johannes Sixt
@ 2013-08-26 20:52 ` Phil Hord
2013-08-26 20:52 ` Jeff King
1 sibling, 0 replies; 8+ messages in thread
From: Phil Hord @ 2013-08-26 20:52 UTC (permalink / raw)
To: Johannes Sixt
Cc: Jeff King, Phil Hord, git@vger.kernel.org, Junio C Hamano,
Jonathan Nieder
On Mon, Aug 26, 2013 at 4:13 PM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 26.08.2013 21:56, schrieb Jeff King:
>> Also, prevent the delimiter being added twice, as happens now in these
>> examples:
>>
>> git grep -l foo HEAD:
>> HEAD::some/path/to/foo.txt
>> ^
>
> Which one of these two does it print then?
>
> HEAD:/some/path/to/foo.txt
> HEAD:some/path/to/foo.txt
With my patch it prints the latter.
This is because get_sha1_with_context("HEAD:"...) returns an empty
'path' string. The code decides to use ':' as the delimiter in that
case, but it sees there already is one at the end of "HEAD:".
Phil
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 20:13 ` Johannes Sixt
2013-08-26 20:52 ` Phil Hord
@ 2013-08-26 20:52 ` Jeff King
2013-08-26 21:03 ` Phil Hord
1 sibling, 1 reply; 8+ messages in thread
From: Jeff King @ 2013-08-26 20:52 UTC (permalink / raw)
To: Johannes Sixt; +Cc: phil.hord, Phil Hord, git, Junio C Hamano, Jonathan Nieder
On Mon, Aug 26, 2013 at 10:13:14PM +0200, Johannes Sixt wrote:
> Am 26.08.2013 21:56, schrieb Jeff King:
> > Also, prevent the delimiter being added twice, as happens now in these
> > examples:
> >
> > git grep -l foo HEAD:
> > HEAD::some/path/to/foo.txt
> > ^
>
> Which one of these two does it print then?
>
> HEAD:/some/path/to/foo.txt
> HEAD:some/path/to/foo.txt
It should (and does) print the latter.
But I do note that our pathspec handling for subdirectories seems buggy.
If you do:
$ cd Documentation
$ git grep -l foo | head -1
RelNotes/1.5.1.5.txt
that's fine; we limit to the current directory. But then if you do:
$ git grep -l foo HEAD | head -1
HEAD:RelNotes/1.5.1.5.txt
we still limit to the current directory, but the output does note note
this (it should be "HEAD:./RelNotes/1.5.1.5.txt"). I think this bug is
orthogonal to Phil's patch, though.
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 20:52 ` Jeff King
@ 2013-08-26 21:03 ` Phil Hord
2013-08-26 21:13 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Phil Hord @ 2013-08-26 21:03 UTC (permalink / raw)
To: Jeff King
Cc: Johannes Sixt, Phil Hord, git@vger.kernel.org, Junio C Hamano,
Jonathan Nieder
On Mon, Aug 26, 2013 at 4:52 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Aug 26, 2013 at 10:13:14PM +0200, Johannes Sixt wrote:
>
>> Am 26.08.2013 21:56, schrieb Jeff King:
>> > Also, prevent the delimiter being added twice, as happens now in these
>> > examples:
>> >
>> > git grep -l foo HEAD:
>> > HEAD::some/path/to/foo.txt
>> > ^
>>
>> Which one of these two does it print then?
>>
>> HEAD:/some/path/to/foo.txt
>> HEAD:some/path/to/foo.txt
>
> It should (and does) print the latter.
>
> But I do note that our pathspec handling for subdirectories seems buggy.
> If you do:
>
> $ cd Documentation
> $ git grep -l foo | head -1
> RelNotes/1.5.1.5.txt
>
> that's fine; we limit to the current directory. But then if you do:
>
> $ git grep -l foo HEAD | head -1
> HEAD:RelNotes/1.5.1.5.txt
>
> we still limit to the current directory, but the output does not note
> this (it should be "HEAD:./RelNotes/1.5.1.5.txt"). I think this bug is
> orthogonal to Phil's patch, though.
Maybe not. My path completes the assumption that the L:R value
returned by grep is an object ref; but Junio still thought it wasn't.
I think this is another case where his view was correct.
There's more bad news on this front.
$ cd Documentation
$ git grep -l foo HEAD .. | head -1
HEAD:../.gitignore
That's not a valid ref, either (though maybe it could be).
Phil
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 21:03 ` Phil Hord
@ 2013-08-26 21:13 ` Jeff King
0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2013-08-26 21:13 UTC (permalink / raw)
To: Phil Hord
Cc: Johannes Sixt, Phil Hord, git@vger.kernel.org, Junio C Hamano,
Jonathan Nieder
On Mon, Aug 26, 2013 at 05:03:04PM -0400, Phil Hord wrote:
> > $ git grep -l foo HEAD | head -1
> > HEAD:RelNotes/1.5.1.5.txt
> >
> > we still limit to the current directory, but the output does not note
> > this (it should be "HEAD:./RelNotes/1.5.1.5.txt"). I think this bug is
> > orthogonal to Phil's patch, though.
>
> Maybe not. My path completes the assumption that the L:R value
> returned by grep is an object ref; but Junio still thought it wasn't.
> I think this is another case where his view was correct.
I certainly assumed it was, because it is in most cases it is. And something
like "HEAD:RelNotes/1.5.1.5.txt" certainly _looks_ like one, and is
generated by the current git. And what is the point of coming up with a
file listing if the names you return do not actually exist?
> There's more bad news on this front.
>
> $ cd Documentation
> $ git grep -l foo HEAD .. | head -1
> HEAD:../.gitignore
>
> That's not a valid ref, either (though maybe it could be).
Yes, though we seem to normalize paths already. So the other entries
from that command are (in git.git):
HEAD:../.mailmap
HEAD:RelNotes/1.5.1.5.txt
So we could either:
1. Prepend the current path before normalizing to yield:
HEAD:.mailmap
HEAD:Documentation/RelNotes/1.5.1.5.txt
2. Teach the get_sha1 path parser about "..", and prepend "./" when we
are in a prefixed subdir.
HEAD:./../.mailmap
HEAD:./RelNotes/1.5.1.5.txt
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
@ 2013-09-22 19:15 Jonathon Mah
2013-09-24 6:57 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Jonathon Mah @ 2013-09-22 19:15 UTC (permalink / raw)
To: phil.hord, Jeff King; +Cc: git@vger.kernel.org
> > HEAD:/some/path/to/foo.txt
> > HEAD:some/path/to/foo.txt
>
> With my patch it prints the latter.
>
> This is because get_sha1_with_context("HEAD:"...) returns an empty
> 'path' string. The code decides to use ':' as the delimiter in that
> case, but it sees there already is one at the end of "HEAD:".
A few days ago I came across the same "surprising" output of git-grep, tried to adjust the code to print "git show"-able object names, and ran into similar subtleties. I just found this thread, and Jeff's code handles more cases than mine did (I didn't try Phil's initial patch), but I can add some more test cases with non-showable output (again related to git-grep's path scoping):
$ git grep -l cache HEAD:./ | head -1
HEAD:./:.gitignore
$ cd Documentation
$ git grep -l cache HEAD | head -1
HEAD:CodingGuidelines
$ git grep -l cache HEAD:Documentation/CodingGuidelines
../HEAD:Documentation/CodingGuidelines
(woah!)
Sorry that I don't yet have anything useful to suggest! But I can tell the story of my use case:
I have a large repository (1.6GB bare) which I don't work on, but which contains code that I need to refer to. A checkout is ~600MB and 27k files, which I'd like to avoid (it's redundant data, and would slow down backups of my drive). I found myself "git-grep"ping through parts of the tree, looking through the results, and then "git-show"ing interesting files. Having a real object name in the grep output allows copy-and-paste of the object path.
Jonathon Mah
me@JonathonMah.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-09-22 19:15 [PATCH 2/2] grep: use slash for path delimiter, not colon Jonathon Mah
@ 2013-09-24 6:57 ` Jeff King
0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2013-09-24 6:57 UTC (permalink / raw)
To: Jonathon Mah; +Cc: phil.hord, git@vger.kernel.org
On Sun, Sep 22, 2013 at 12:15:21PM -0700, Jonathon Mah wrote:
> A few days ago I came across the same "surprising" output of git-grep,
> tried to adjust the code to print "git show"-able object names, and
> ran into similar subtleties. I just found this thread, and Jeff's code
> handles more cases than mine did (I didn't try Phil's initial patch),
> but I can add some more test cases with non-showable output (again
> related to git-grep's path scoping):
>
If you haven't read the side thread starting at [1], there are some
arguments that git-grep is doing the right thing already. I think there
are a few issues at play here:
> $ git grep -l cache HEAD:./ | head -1
> HEAD:./:.gitignore
As you show, using a colon separator from a tree-ish that contains a
partial path looks bad. The downside of turning this into a slash,
though, is that you lose the information of the tree-ish. See [2].
> $ cd Documentation
> $ git grep -l cache HEAD | head -1
> HEAD:CodingGuidelines
Grepping from a subdirectory produces relative paths that look like real
tree:path specifications, but aren't. Showing the full path would
potentially be cluttering if you are in a deep directory. These days we
have the "./" syntax, though, so we could perhaps output:
HEAD:./CodingGuidelines
which is succinct but can be used to access the path in the tree (and
makes more clear to the user that we have only grepped in the current
subdirectory).
> $ git grep -l cache HEAD:Documentation/CodingGuidelines
> ../HEAD:Documentation/CodingGuidelines
> (woah!)
That one just seems nonsensical and buggy to me. We should not be
applying ".." at all to a blob spec like this.
-Peff
[1] http://thread.gmane.org/gmane.comp.version-control.git/232892/focus=232980
[2] http://thread.gmane.org/gmane.comp.version-control.git/232892/focus=233004
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2013-09-24 6:57 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-22 19:15 [PATCH 2/2] grep: use slash for path delimiter, not colon Jonathon Mah
2013-09-24 6:57 ` Jeff King
-- strict thread matches above, loose matches on Subject: below --
2013-08-26 19:53 [PATCHv2] " Jeff King
2013-08-26 19:56 ` [PATCH 2/2] " Jeff King
2013-08-26 20:13 ` Johannes Sixt
2013-08-26 20:52 ` Phil Hord
2013-08-26 20:52 ` Jeff King
2013-08-26 21:03 ` Phil Hord
2013-08-26 21:13 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).