* Regression in git log with multiple authors
@ 2010-08-26 17:39 Emil Sit
2010-08-26 19:05 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Emil Sit @ 2010-08-26 17:39 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano
Commit 80235ba79ef43349f455cce869397b3e726f4058 introduced a
regression in a corner case for git log --author when multiple authors
are specified. Prior to 1.7.0.3, if I wanted to find all commits done
by a series of authors, I could simply specify "git log --author=a1
--author=a2" to get all commits done by a1 and a2. However, in the
latest releases, this finds nothing.
Here's a simple test case that demonstrates this:
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 023f225..587069c 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -372,6 +372,14 @@ test_expect_success 'log --grep --author
implicitly uses all-match' '
test_cmp expect actual
'
+test_expect_success 'log --author --author matches both authors' '
+ # author matches only initial and third
+ # frotz matches only second
+ git log --author="A U Thor" --author="frotz\.com>$"
--format=%s >actual &&
+ ( echo third ; echo second ; echo initial ) >expect &&
+ test_cmp expect actual
+'
+
test_expect_success 'grep with CE_VALID file' '
git update-index --assume-unchanged t/t &&
rm t/t &&
This fails against master, but if you revert 80235ba, this will pass
(whereas obviously 'log --grep --author implicitly uses all-match'
will then fail).
It doesn't seem like I can work-around this with 'git log --author a1
--or --author a2'. Is there some other way to find commits by a set
of authors? I don't think it makes sense to treat multiple --author
flags with "and' logic since a commit can only have one author. So
maybe all --authors should be grouped with ors and then anded against
all --committers?
--
Emil Sit / http://www.emilsit.net/
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: Regression in git log with multiple authors
2010-08-26 17:39 Regression in git log with multiple authors Emil Sit
@ 2010-08-26 19:05 ` Junio C Hamano
2010-09-13 8:13 ` [PATCH 1/2] grep: move logic to compile header pattern into a separate helper Junio C Hamano
2010-09-13 8:18 ` [PATCH 2/2] log --author: take union of multiple "author" requests Junio C Hamano
0 siblings, 2 replies; 6+ messages in thread
From: Junio C Hamano @ 2010-08-26 19:05 UTC (permalink / raw)
To: Emil Sit; +Cc: git
Emil Sit <sit@emilsit.net> writes:
> Commit 80235ba79ef43349f455cce869397b3e726f4058 introduced a
> regression in a corner case for git log --author when multiple authors
> are specified. Prior to 1.7.0.3, if I wanted to find all commits done
> by a series of authors, I could simply specify "git log --author=a1
> --author=a2" to get all commits done by a1 and a2. However, in the
> latest releases, this finds nothing.
That is more or less deliberate, not in the sense that the patch wanted to
forbid looking for multiple authors but in the sense that the patch wanted
to apply the "grep" terms as intersection, not as union.
In the olden days,
log --author=me --committer=him --grep=this --grep=that
used to be turned into:
(OR (HEADER-AUTHOR me)
(HEADER-COMMITTER him)
(PATTERN this)
(PATTERN that))
showing my patches that do not have any "this" nor "that", which was
totally bogus and useless.
80235ba ("log --author=me --grep=it" should find intersection, not union,
2010-01-17) improved it greatly to turn the same into:
(all-match (HEADER-AUTHOR me)
(HEADER-COMMITTER him)
(OR
(PATTERN this)
(PATTERN that)))
That is, "show only patches by me committed by him that have either this
or that", which is a lot more natural thing to ask. So simply reverting
the commit is out of question.
But I do not think it is a bad idea if you turned
log --author=me --author=her --committer=him --committer=you --grep=this
into
(all-match (OR
(HEADER-AUTHOR me)
(HEADER-AUTHOR her))
(OR
(HEADER-COMMITTER him)
(HEADER-COMMITTER you))
(OR
(PATTERN this)))
as it is obvious that with multiple authors (or committers) the command
line is asking for union among them.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] grep: move logic to compile header pattern into a separate helper
2010-08-26 19:05 ` Junio C Hamano
@ 2010-09-13 8:13 ` Junio C Hamano
2010-09-13 8:18 ` [PATCH 2/2] log --author: take union of multiple "author" requests Junio C Hamano
1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2010-09-13 8:13 UTC (permalink / raw)
To: git; +Cc: Emil Sit
The callers should be queuing only GREP_PATTERN_HEAD elements to the
header_list queue; simplify the switch and guard it with an assert.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* This is just a clean-up before the real fun.
grep.c | 43 +++++++++++++++++++++----------------------
1 files changed, 21 insertions(+), 22 deletions(-)
diff --git a/grep.c b/grep.c
index 82fb349..718a3c2 100644
--- a/grep.c
+++ b/grep.c
@@ -189,29 +189,31 @@ static struct grep_expr *compile_pattern_expr(struct grep_pat **list)
return compile_pattern_or(list);
}
-void compile_grep_patterns(struct grep_opt *opt)
+static struct grep_expr *prep_header_patterns(struct grep_opt *opt)
{
struct grep_pat *p;
- struct grep_expr *header_expr = NULL;
-
- if (opt->header_list) {
- p = opt->header_list;
- header_expr = compile_pattern_expr(&p);
- if (p)
- die("incomplete pattern expression: %s", p->pattern);
- for (p = opt->header_list; p; p = p->next) {
- switch (p->token) {
- case GREP_PATTERN: /* atom */
- case GREP_PATTERN_HEAD:
- case GREP_PATTERN_BODY:
- compile_regexp(p, opt);
- break;
- default:
- opt->extended = 1;
- break;
- }
- }
+ struct grep_expr *header_expr;
+
+ if (!opt->header_list)
+ return NULL;
+ p = opt->header_list;
+ header_expr = compile_pattern_expr(&p);
+ if (p)
+ die("incomplete pattern expression: %s", p->pattern);
+ for (p = opt->header_list; p; p = p->next) {
+ if (p->token != GREP_PATTERN_HEAD)
+ die("bug: a non-header pattern in grep header list.");
+ if (p->field < 0 || GREP_HEADER_FIELD_MAX <= p->field)
+ die("bug: unknown header field %d", p->field);
+ compile_regexp(p, opt);
}
+ return header_expr;
+}
+
+void compile_grep_patterns(struct grep_opt *opt)
+{
+ struct grep_pat *p;
+ struct grep_expr *header_expr = prep_header_patterns(opt);
for (p = opt->pattern_list; p; p = p->next) {
switch (p->token) {
@@ -231,9 +233,6 @@ void compile_grep_patterns(struct grep_opt *opt)
else if (!opt->extended)
return;
- /* Then bundle them up in an expression.
- * A classic recursive descent parser would do.
- */
p = opt->pattern_list;
if (p)
opt->pattern_expression = compile_pattern_expr(&p);
--
1.7.3.rc1.227.gee5c7b
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] log --author: take union of multiple "author" requests
2010-08-26 19:05 ` Junio C Hamano
2010-09-13 8:13 ` [PATCH 1/2] grep: move logic to compile header pattern into a separate helper Junio C Hamano
@ 2010-09-13 8:18 ` Junio C Hamano
2010-09-13 16:17 ` Emil Sit
1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2010-09-13 8:18 UTC (permalink / raw)
To: git; +Cc: Emil Sit
In the olden days,
log --author=me --committer=him --grep=this --grep=that
used to be turned into:
(OR (HEADER-AUTHOR me)
(HEADER-COMMITTER him)
(PATTERN this)
(PATTERN that))
showing my patches that do not have any "this" nor "that", which was
totally useless.
80235ba ("log --author=me --grep=it" should find intersection, not union,
2010-01-17) improved it greatly to turn the same into:
(ALL-MATCH
(HEADER-AUTHOR me)
(HEADER-COMMITTER him)
(OR (PATTERN this) (PATTERN that)))
That is, "show only patches by me and committed by him, that have either
this or that", which is a lot more natural thing to ask.
We however need to be a bit more clever when the user asks more than one
"author" (or "committer"); because a commit has only one author (and one
committer), they ought to be interpreted as asking for union to be useful.
The current implementation simply added another author/committer pattern
at the same top-level for ALL-MATCH to insist on matching all, finding
nothing.
Turn
log --author=me --author=her \
--committer=him --committer=you \
--grep=this --grep=that
into
(ALL-MATCH
(OR (HEADER-AUTHOR me) (HEADER-AUTHOR her))
(OR (HEADER-COMMITTER him) (HEADER-COMMITTER you))
(OR (PATTERN this) (PATTERN that)))
instead.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
grep.c | 65 ++++++++++++++++++++++++++++++++++++++++++++----------
grep.h | 2 +
t/t7810-grep.sh | 29 +++++++++++++++++++++++-
3 files changed, 83 insertions(+), 13 deletions(-)
diff --git a/grep.c b/grep.c
index 718a3c2..63c4280 100644
--- a/grep.c
+++ b/grep.c
@@ -189,17 +189,32 @@ static struct grep_expr *compile_pattern_expr(struct grep_pat **list)
return compile_pattern_or(list);
}
+static struct grep_expr *grep_true_expr(void)
+{
+ struct grep_expr *z = xcalloc(1, sizeof(*z));
+ z->node = GREP_NODE_TRUE;
+ return z;
+}
+
+static struct grep_expr *grep_or_expr(struct grep_expr *left, struct grep_expr *right)
+{
+ struct grep_expr *z = xcalloc(1, sizeof(*z));
+ z->node = GREP_NODE_OR;
+ z->u.binary.left = left;
+ z->u.binary.right = right;
+ return z;
+}
+
static struct grep_expr *prep_header_patterns(struct grep_opt *opt)
{
struct grep_pat *p;
struct grep_expr *header_expr;
+ struct grep_expr *(header_group[GREP_HEADER_FIELD_MAX]);
+ enum grep_header_field fld;
if (!opt->header_list)
return NULL;
p = opt->header_list;
- header_expr = compile_pattern_expr(&p);
- if (p)
- die("incomplete pattern expression: %s", p->pattern);
for (p = opt->header_list; p; p = p->next) {
if (p->token != GREP_PATTERN_HEAD)
die("bug: a non-header pattern in grep header list.");
@@ -207,6 +222,33 @@ static struct grep_expr *prep_header_patterns(struct grep_opt *opt)
die("bug: unknown header field %d", p->field);
compile_regexp(p, opt);
}
+
+ for (fld = 0; fld < GREP_HEADER_FIELD_MAX; fld++)
+ header_group[fld] = NULL;
+
+ for (p = opt->header_list; p; p = p->next) {
+ struct grep_expr *h;
+ struct grep_pat *pp = p;
+
+ h = compile_pattern_atom(&pp);
+ if (!h || pp != p->next)
+ die("bug: malformed header expr");
+ if (!header_group[p->field]) {
+ header_group[p->field] = h;
+ continue;
+ }
+ header_group[p->field] = grep_or_expr(h, header_group[p->field]);
+ }
+
+ header_expr = NULL;
+
+ for (fld = 0; fld < GREP_HEADER_FIELD_MAX; fld++) {
+ if (!header_group[fld])
+ continue;
+ if (!header_expr)
+ header_expr = grep_true_expr();
+ header_expr = grep_or_expr(header_group[fld], header_expr);
+ }
return header_expr;
}
@@ -242,22 +284,18 @@ void compile_grep_patterns(struct grep_opt *opt)
if (!header_expr)
return;
- if (opt->pattern_expression) {
- struct grep_expr *z;
- z = xcalloc(1, sizeof(*z));
- z->node = GREP_NODE_OR;
- z->u.binary.left = opt->pattern_expression;
- z->u.binary.right = header_expr;
- opt->pattern_expression = z;
- } else {
+ if (!opt->pattern_expression)
opt->pattern_expression = header_expr;
- }
+ else
+ opt->pattern_expression = grep_or_expr(opt->pattern_expression,
+ header_expr);
opt->all_match = 1;
}
static void free_pattern_expr(struct grep_expr *x)
{
switch (x->node) {
+ case GREP_NODE_TRUE:
case GREP_NODE_ATOM:
break;
case GREP_NODE_NOT:
@@ -486,6 +524,9 @@ static int match_expr_eval(struct grep_expr *x, char *bol, char *eol,
if (!x)
die("Not a valid grep expression");
switch (x->node) {
+ case GREP_NODE_TRUE:
+ h = 1;
+ break;
case GREP_NODE_ATOM:
h = match_one_pattern(x->u.atom, bol, eol, ctx, &match, 0);
break;
diff --git a/grep.h b/grep.h
index efa8cff..06621fe 100644
--- a/grep.h
+++ b/grep.h
@@ -22,6 +22,7 @@ enum grep_header_field {
GREP_HEADER_AUTHOR = 0,
GREP_HEADER_COMMITTER
};
+#define GREP_HEADER_FIELD_MAX (GREP_HEADER_COMMITTER + 1)
struct grep_pat {
struct grep_pat *next;
@@ -41,6 +42,7 @@ enum grep_expr_node {
GREP_NODE_ATOM,
GREP_NODE_NOT,
GREP_NODE_AND,
+ GREP_NODE_TRUE,
GREP_NODE_OR
};
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 8a63227..dc5c085 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -324,8 +324,13 @@ test_expect_success 'log grep setup' '
echo a >>file &&
test_tick &&
- git commit -a -m "third"
+ git commit -a -m "third" &&
+ echo a >>file &&
+ test_tick &&
+ GIT_AUTHOR_NAME="Night Fall" \
+ GIT_AUTHOR_EMAIL="nitfol@frobozz.com" \
+ git commit -a -m "fourth"
'
test_expect_success 'log grep (1)' '
@@ -372,6 +377,28 @@ test_expect_success 'log --grep --author implicitly uses all-match' '
test_cmp expect actual
'
+test_expect_success 'log with multiple --author uses union' '
+ git log --author="Thor" --author="Aster" --format=%s >actual &&
+ {
+ echo third && echo second && echo initial
+ } >expect &&
+ test_cmp expect actual
+'
+
+test_expect_success 'log with --grep and multiple --author uses all-match' '
+ git log --author="Thor" --author="Night" --grep=i --format=%s >actual &&
+ {
+ echo third && echo initial
+ } >expect &&
+ test_cmp expect actual
+'
+
+test_expect_success 'log with --grep and multiple --author uses all-match' '
+ git log --author="Thor" --author="Night" --grep=q --format=%s >actual &&
+ >expect &&
+ test_cmp expect actual
+'
+
test_expect_success 'grep with CE_VALID file' '
git update-index --assume-unchanged t/t &&
rm t/t &&
--
1.7.3.rc1.227.gee5c7b
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] log --author: take union of multiple "author" requests
2010-09-13 8:18 ` [PATCH 2/2] log --author: take union of multiple "author" requests Junio C Hamano
@ 2010-09-13 16:17 ` Emil Sit
2010-09-13 18:11 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Emil Sit @ 2010-09-13 16:17 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, Sep 13, 2010 at 4:18 AM, Junio C Hamano <gitster@pobox.com> wrote:
> log --author=me --author=her \
> --committer=him --committer=you \
> --grep=this --grep=that
>
> into
>
> (ALL-MATCH
> (OR (HEADER-AUTHOR me) (HEADER-AUTHOR her))
> (OR (HEADER-COMMITTER him) (HEADER-COMMITTER you))
> (OR (PATTERN this) (PATTERN that)))
Both patches look good to me. Tested fine with my use cases. Thanks
for doing this.
I'm a little confused about the implementation with regards to
--all-match; does there still need to be an all-match flag? Seems
like it has now been deprecated (to being, essentially, the default).
In any case, there should also probably something going along with
this patch series to update Documentation/rev-list-options.txt.
Thanks again.
--
Emil Sit / http://www.emilsit.net/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] log --author: take union of multiple "author" requests
2010-09-13 16:17 ` Emil Sit
@ 2010-09-13 18:11 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2010-09-13 18:11 UTC (permalink / raw)
To: Emil Sit; +Cc: git
Emil Sit <sit@emilsit.net> writes:
> I'm a little confused about the implementation with regards to
> --all-match; does there still need to be an all-match flag?
When used in the context of "git log --grep/--author/--committer" (as
opposed to more flexible "git grep"), in conjunction with either of the
"header match" element (--author/committer), all-match is implied.
The implementation of the all-match rewriting gets a bit trickier than
necessary, as our internal representation of nodes does not have n-ary
ALL-MATCH (nor n-ary OR/AND) node.
An (ALL-MATCH 1 2 3 4) node is instead represented by this grep_expr
binary tree (rooted at the leftmost OR node):
OR--OR--OR--4
| | |
1 2 3
and requiring the top-level terms of backbone OR chain (i.e. 1 2 3 4) to
all match.
In order to represent
(ALL-MATCH
(PATTERN this)
(OR (AUTHOR A) (AUTHOR B)))
we cannot simply do
OR--------------OR-----------author B
| |
pattern "this" author A
because this requires both (AUTHOR A) and (AUTHOR B) to match, in addition
to "this". We instead need to do something like:
OR--------------OR---TRUE
| |
pattern "this" OR---author B
|
author A
to say "this" must match and (OR (author A) (author B)) must match (IOW
the terms on the backbone OR chain are (PATTERN this), (OR (AUTHOR A/B))
and TRUE and they all have to match).
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-09-13 18:12 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-08-26 17:39 Regression in git log with multiple authors Emil Sit
2010-08-26 19:05 ` Junio C Hamano
2010-09-13 8:13 ` [PATCH 1/2] grep: move logic to compile header pattern into a separate helper Junio C Hamano
2010-09-13 8:18 ` [PATCH 2/2] log --author: take union of multiple "author" requests Junio C Hamano
2010-09-13 16:17 ` Emil Sit
2010-09-13 18:11 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).