* [PATCH 1/2] grep: stop using object_array
2013-08-26 19:53 ` Jeff King
@ 2013-08-26 19:54 ` Jeff King
2013-08-26 19:56 ` [PATCH 2/2] grep: use slash for path delimiter, not colon Jeff King
2013-08-27 3:37 ` [PATCHv2] " Junio C Hamano
2 siblings, 0 replies; 11+ messages in thread
From: Jeff King @ 2013-08-26 19:54 UTC (permalink / raw)
To: Phil Hord; +Cc: git, phil.hord, Junio C Hamano, Jonathan Nieder
We use an object_array to store the set of objects to grep
that we received on the command-line. There is no particular
reason to use object_array here except that its code was
already written, and it contained the elements we needed
(though we did not care about mode at all).
However, future patches will need to remember more about the
arguments than object_array can provide. Let's use our own
custom struct. Thanks to the ALLOC_GROW macro, this is really
only a few lines longer (and we even save a few bytes of
memory as we do not care about the mode, and we know that we
do not have to copy the name strings, as they come from
argv).
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/grep.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index d3b3b1d..ee47d49 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -23,6 +23,11 @@ static char const * const grep_usage[] = {
NULL
};
+struct object_to_grep {
+ struct object *item;
+ const char *name;
+};
+
static int use_threads = 1;
#ifndef NO_PTHREADS
@@ -493,16 +498,15 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
}
static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
- const struct object_array *list)
+ const struct object_to_grep *list, int nr)
{
unsigned int i;
int hit = 0;
- const unsigned int nr = list->nr;
for (i = 0; i < nr; i++) {
struct object *real_obj;
- real_obj = deref_tag(list->objects[i].item, NULL, 0);
- if (grep_object(opt, pathspec, real_obj, list->objects[i].name)) {
+ real_obj = deref_tag(list[i].item, NULL, 0);
+ if (grep_object(opt, pathspec, real_obj, list[i].name)) {
hit = 1;
if (opt->status_only)
break;
@@ -628,7 +632,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
int external_grep_allowed__ignored;
const char *show_in_pager = NULL, *default_pager = "dummy";
struct grep_opt opt;
- struct object_array list = OBJECT_ARRAY_INIT;
+ struct object_to_grep *list = NULL;
+ int list_nr = 0, list_alloc = 0;
const char **paths = NULL;
struct pathspec pathspec;
struct string_list path_list = STRING_LIST_INIT_NODUP;
@@ -822,7 +827,11 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
struct object *object = parse_object_or_die(sha1, arg);
if (!seen_dashdash)
verify_non_filename(prefix, arg);
- add_object_array(object, arg, &list);
+
+ ALLOC_GROW(list, list_nr+1, list_alloc);
+ list[list_nr].item = object;
+ list[list_nr].name = arg;
+ list_nr++;
continue;
}
if (!strcmp(arg, "--")) {
@@ -833,7 +842,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
}
#ifndef NO_PTHREADS
- if (list.nr || cached || online_cpus() == 1)
+ if (list_nr || cached || online_cpus() == 1)
use_threads = 0;
#else
use_threads = 0;
@@ -861,7 +870,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
pathspec.max_depth = opt.max_depth;
pathspec.recursive = 1;
- if (show_in_pager && (cached || list.nr))
+ if (show_in_pager && (cached || list_nr))
die(_("--open-files-in-pager only works on the worktree"));
if (show_in_pager && opt.pattern_list && !opt.pattern_list->next) {
@@ -889,12 +898,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
if (!use_index || untracked) {
int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
- if (list.nr)
+ if (list_nr)
die(_("--no-index or --untracked cannot be used with revs."));
hit = grep_directory(&opt, &pathspec, use_exclude);
} else if (0 <= opt_exclude) {
die(_("--[no-]exclude-standard cannot be used for tracked contents."));
- } else if (!list.nr) {
+ } else if (!list_nr) {
if (!cached)
setup_work_tree();
@@ -902,7 +911,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
} else {
if (cached)
die(_("both --cached and trees are given."));
- hit = grep_objects(&opt, &pathspec, &list);
+ hit = grep_objects(&opt, &pathspec, list, list_nr);
}
if (use_threads)
--
1.8.4.2.g87d4a77
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 19:53 ` Jeff King
2013-08-26 19:54 ` [PATCH 1/2] grep: stop using object_array Jeff King
@ 2013-08-26 19:56 ` Jeff King
2013-08-26 20:13 ` Johannes Sixt
2013-08-27 3:37 ` [PATCHv2] " Junio C Hamano
2 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2013-08-26 19:56 UTC (permalink / raw)
To: Phil Hord; +Cc: git, phil.hord, Junio C Hamano, Jonathan Nieder
From: Phil Hord <hordp@cisco.com>
When a commit is grepped and matching filenames are printed, grep-objects
creates the filename by prefixing the original cmdline argument to the
matched path separated by a colon. Normally this forms a valid blob
reference to the filename, like this:
git grep -l foo HEAD
HEAD:some/path/to/foo.txt
^
But a tree path may be given to grep instead; in this case the colon is
not a valid delimiter to use since it is placed inside a path.
git grep -l foo HEAD:some
HEAD:some:path/to/foo.txt
^
The slash path delimiter should be used instead. Fix git grep to
discern the correct delimiter so it can report valid object names.
git grep -l foo HEAD:some
HEAD:some/path/to/foo.txt
^
Also, prevent the delimiter being added twice, as happens now in these
examples:
git grep -l foo HEAD:
HEAD::some/path/to/foo.txt
^
git grep -l foo HEAD:some/
HEAD:some/:path/to/foo.txt
^
Add a test to confirm correct path forming.
Signed-off-by: Jeff King <peff@peff.net>
---
I left the author as you, since you have done all the hard work; this is
really just me rebasing your patch on top of mine. But note that you did
not signoff the original.
builtin/grep.c | 13 +++++++++----
t/t7810-grep.sh | 15 +++++++++++++++
2 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index ee47d49..2df7986 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,7 @@ struct object_to_grep {
struct object_to_grep {
struct object *item;
const char *name;
+ unsigned has_path:1;
};
static int use_threads = 1;
@@ -462,7 +463,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
}
static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
- struct object *obj, const char *name)
+ struct object *obj, const char *name, char delimiter)
{
if (obj->type == OBJ_BLOB)
return grep_sha1(opt, obj->sha1, name, 0, NULL);
@@ -485,7 +486,8 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
strbuf_init(&base, PATH_MAX + len + 1);
if (len) {
strbuf_add(&base, name, len);
- strbuf_addch(&base, ':');
+ if (name[len-1] != delimiter)
+ strbuf_addch(&base, delimiter);
}
init_tree_desc(&tree, data, size);
hit = grep_tree(opt, pathspec, &tree, &base, base.len,
@@ -506,7 +508,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
for (i = 0; i < nr; i++) {
struct object *real_obj;
real_obj = deref_tag(list[i].item, NULL, 0);
- if (grep_object(opt, pathspec, real_obj, list[i].name)) {
+ if (grep_object(opt, pathspec, real_obj, list[i].name,
+ list[i].has_path ? '/' : ':')) {
hit = 1;
if (opt->status_only)
break;
@@ -822,8 +825,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
for (i = 0; i < argc; i++) {
const char *arg = argv[i];
unsigned char sha1[20];
+ struct object_context oc;
/* Is it a rev? */
- if (!get_sha1(arg, sha1)) {
+ if (!get_sha1_with_context(arg, 0, sha1, &oc)) {
struct object *object = parse_object_or_die(sha1, arg);
if (!seen_dashdash)
verify_non_filename(prefix, arg);
@@ -831,6 +835,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
ALLOC_GROW(list, list_nr+1, list_alloc);
list[list_nr].item = object;
list[list_nr].name = arg;
+ list[list_nr].has_path = !!oc.path[0];
list_nr++;
continue;
}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index f698001..2494bfc 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -886,6 +886,21 @@ cat >expected <<EOF
'
cat >expected <<EOF
+HEAD:t/a/v:1:vvv
+HEAD:t/v:1:vvv
+EOF
+
+test_expect_success "grep HEAD -- path/" '
+ git grep -n -e vvv HEAD -- t/ >actual &&
+ test_cmp expected actual
+'
+
+test_expect_success "grep HEAD:path" '
+ git grep -n -e vvv HEAD:t/ >actual &&
+ test_cmp expected actual
+'
+
+cat >expected <<EOF
hello.c:int main(int argc, const char **argv)
hello.c: printf("Hello world.\n");
EOF
--
1.8.4.2.g87d4a77
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 19:56 ` [PATCH 2/2] grep: use slash for path delimiter, not colon Jeff King
@ 2013-08-26 20:13 ` Johannes Sixt
2013-08-26 20:52 ` Phil Hord
2013-08-26 20:52 ` Jeff King
0 siblings, 2 replies; 11+ messages in thread
From: Johannes Sixt @ 2013-08-26 20:13 UTC (permalink / raw)
To: phil.hord; +Cc: Jeff King, Phil Hord, git, Junio C Hamano, Jonathan Nieder
Am 26.08.2013 21:56, schrieb Jeff King:
> Also, prevent the delimiter being added twice, as happens now in these
> examples:
>
> git grep -l foo HEAD:
> HEAD::some/path/to/foo.txt
> ^
Which one of these two does it print then?
HEAD:/some/path/to/foo.txt
HEAD:some/path/to/foo.txt
-- Hannes
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 20:13 ` Johannes Sixt
@ 2013-08-26 20:52 ` Phil Hord
2013-08-26 20:52 ` Jeff King
1 sibling, 0 replies; 11+ messages in thread
From: Phil Hord @ 2013-08-26 20:52 UTC (permalink / raw)
To: Johannes Sixt
Cc: Jeff King, Phil Hord, git@vger.kernel.org, Junio C Hamano,
Jonathan Nieder
On Mon, Aug 26, 2013 at 4:13 PM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 26.08.2013 21:56, schrieb Jeff King:
>> Also, prevent the delimiter being added twice, as happens now in these
>> examples:
>>
>> git grep -l foo HEAD:
>> HEAD::some/path/to/foo.txt
>> ^
>
> Which one of these two does it print then?
>
> HEAD:/some/path/to/foo.txt
> HEAD:some/path/to/foo.txt
With my patch it prints the latter.
This is because get_sha1_with_context("HEAD:"...) returns an empty
'path' string. The code decides to use ':' as the delimiter in that
case, but it sees there already is one at the end of "HEAD:".
Phil
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 20:13 ` Johannes Sixt
2013-08-26 20:52 ` Phil Hord
@ 2013-08-26 20:52 ` Jeff King
2013-08-26 21:03 ` Phil Hord
1 sibling, 1 reply; 11+ messages in thread
From: Jeff King @ 2013-08-26 20:52 UTC (permalink / raw)
To: Johannes Sixt; +Cc: phil.hord, Phil Hord, git, Junio C Hamano, Jonathan Nieder
On Mon, Aug 26, 2013 at 10:13:14PM +0200, Johannes Sixt wrote:
> Am 26.08.2013 21:56, schrieb Jeff King:
> > Also, prevent the delimiter being added twice, as happens now in these
> > examples:
> >
> > git grep -l foo HEAD:
> > HEAD::some/path/to/foo.txt
> > ^
>
> Which one of these two does it print then?
>
> HEAD:/some/path/to/foo.txt
> HEAD:some/path/to/foo.txt
It should (and does) print the latter.
But I do note that our pathspec handling for subdirectories seems buggy.
If you do:
$ cd Documentation
$ git grep -l foo | head -1
RelNotes/1.5.1.5.txt
that's fine; we limit to the current directory. But then if you do:
$ git grep -l foo HEAD | head -1
HEAD:RelNotes/1.5.1.5.txt
we still limit to the current directory, but the output does note note
this (it should be "HEAD:./RelNotes/1.5.1.5.txt"). I think this bug is
orthogonal to Phil's patch, though.
-Peff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 20:52 ` Jeff King
@ 2013-08-26 21:03 ` Phil Hord
2013-08-26 21:13 ` Jeff King
0 siblings, 1 reply; 11+ messages in thread
From: Phil Hord @ 2013-08-26 21:03 UTC (permalink / raw)
To: Jeff King
Cc: Johannes Sixt, Phil Hord, git@vger.kernel.org, Junio C Hamano,
Jonathan Nieder
On Mon, Aug 26, 2013 at 4:52 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Aug 26, 2013 at 10:13:14PM +0200, Johannes Sixt wrote:
>
>> Am 26.08.2013 21:56, schrieb Jeff King:
>> > Also, prevent the delimiter being added twice, as happens now in these
>> > examples:
>> >
>> > git grep -l foo HEAD:
>> > HEAD::some/path/to/foo.txt
>> > ^
>>
>> Which one of these two does it print then?
>>
>> HEAD:/some/path/to/foo.txt
>> HEAD:some/path/to/foo.txt
>
> It should (and does) print the latter.
>
> But I do note that our pathspec handling for subdirectories seems buggy.
> If you do:
>
> $ cd Documentation
> $ git grep -l foo | head -1
> RelNotes/1.5.1.5.txt
>
> that's fine; we limit to the current directory. But then if you do:
>
> $ git grep -l foo HEAD | head -1
> HEAD:RelNotes/1.5.1.5.txt
>
> we still limit to the current directory, but the output does not note
> this (it should be "HEAD:./RelNotes/1.5.1.5.txt"). I think this bug is
> orthogonal to Phil's patch, though.
Maybe not. My path completes the assumption that the L:R value
returned by grep is an object ref; but Junio still thought it wasn't.
I think this is another case where his view was correct.
There's more bad news on this front.
$ cd Documentation
$ git grep -l foo HEAD .. | head -1
HEAD:../.gitignore
That's not a valid ref, either (though maybe it could be).
Phil
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] grep: use slash for path delimiter, not colon
2013-08-26 21:03 ` Phil Hord
@ 2013-08-26 21:13 ` Jeff King
0 siblings, 0 replies; 11+ messages in thread
From: Jeff King @ 2013-08-26 21:13 UTC (permalink / raw)
To: Phil Hord
Cc: Johannes Sixt, Phil Hord, git@vger.kernel.org, Junio C Hamano,
Jonathan Nieder
On Mon, Aug 26, 2013 at 05:03:04PM -0400, Phil Hord wrote:
> > $ git grep -l foo HEAD | head -1
> > HEAD:RelNotes/1.5.1.5.txt
> >
> > we still limit to the current directory, but the output does not note
> > this (it should be "HEAD:./RelNotes/1.5.1.5.txt"). I think this bug is
> > orthogonal to Phil's patch, though.
>
> Maybe not. My path completes the assumption that the L:R value
> returned by grep is an object ref; but Junio still thought it wasn't.
> I think this is another case where his view was correct.
I certainly assumed it was, because it is in most cases it is. And something
like "HEAD:RelNotes/1.5.1.5.txt" certainly _looks_ like one, and is
generated by the current git. And what is the point of coming up with a
file listing if the names you return do not actually exist?
> There's more bad news on this front.
>
> $ cd Documentation
> $ git grep -l foo HEAD .. | head -1
> HEAD:../.gitignore
>
> That's not a valid ref, either (though maybe it could be).
Yes, though we seem to normalize paths already. So the other entries
from that command are (in git.git):
HEAD:../.mailmap
HEAD:RelNotes/1.5.1.5.txt
So we could either:
1. Prepend the current path before normalizing to yield:
HEAD:.mailmap
HEAD:Documentation/RelNotes/1.5.1.5.txt
2. Teach the get_sha1 path parser about "..", and prepend "./" when we
are in a prefixed subdir.
HEAD:./../.mailmap
HEAD:./RelNotes/1.5.1.5.txt
-Peff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCHv2] grep: use slash for path delimiter, not colon
2013-08-26 19:53 ` Jeff King
2013-08-26 19:54 ` [PATCH 1/2] grep: stop using object_array Jeff King
2013-08-26 19:56 ` [PATCH 2/2] grep: use slash for path delimiter, not colon Jeff King
@ 2013-08-27 3:37 ` Junio C Hamano
2 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2013-08-27 3:37 UTC (permalink / raw)
To: Jeff King; +Cc: Phil Hord, git, phil.hord, Jonathan Nieder
Jeff King <peff@peff.net> writes:
> On Mon, Aug 26, 2013 at 03:28:26PM -0400, Jeff King wrote:
>
>> Changing the object_array API would be hard, but I don't think we need
>> to do it here. Can we simply stop using object_array to pass the list,
>> and instead just have a custom list?
>>
>> I'll see how painful that is.
>
> Not very, I think. Here's the series.
>
> [1/2]: grep: stop using object_array
> [2/2]: grep: use slash for path delimiter, not colon
I agree that if we were to do this, these patches show a reasonable
approach to do so.
I however am not yet convinced if its output is necessarily better
X-<.
^ permalink raw reply [flat|nested] 11+ messages in thread