git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH/RFC] Use regex for :/ matching
@ 2007-12-03  4:32 Jeff King
  2007-12-03 10:55 ` Johannes Schindelin
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2007-12-03  4:32 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

The sha1 syntax :/ used to be a strict prefix match.
Instead, let's use a regular expression, which can save on
typing. E.g.,

  git show :/"object name: introduce ':/<oneline prefix>'"

vs

  git show :/introduce..:/.oneline

Signed-off-by: Jeff King <peff@peff.net>
---
Obviously this changes the semantics of existing queries.
Specifically:

  - regex metacharacters are now interpreted
  - the pattern is no longer anchored at the start

I find it much more useful than the original, but perhaps it deserves
its own syntax instead?

I also considered that it might be much slower than the original, but it
is not:

  # original
  $ time git log :/notfound >/dev/null
  real    0m1.055s
  user    0m1.020s
  sys     0m0.024s

  # regex
  $ time git log :/notfound >/dev/null
  real    0m1.065s
  user    0m1.028s
  sys     0m0.036s

Curiously, both are much slower than --grep:

  $ time git log --grep=notfound >/dev/null
  real    0m0.677s
  user    0m0.640s
  sys     0m0.036s

And finally, if accepted, a followup patch should change the "prefix"
variable to "search" or similar; putting it in here made the diff a bit
noisy.

 Documentation/git-rev-parse.txt |    4 ++--
 sha1_name.c                     |   25 ++++++++++++++++++++++---
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 329fce0..41dd684 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -208,8 +208,8 @@ blobs contained in a commit.
   and dereference the tag recursively until a non-tag object is
   found.
 
-* A colon, followed by a slash, followed by a text: this names
-  a commit whose commit message starts with the specified text.
+* A colon, followed by a slash, followed by a regular expression: this names
+  a commit whose commit message starts with a line matching the expression.
   This name returns the youngest matching commit which is
   reachable from any ref.  If the commit message starts with a
   '!', you have to repeat that;  the special sequence ':/!',
diff --git a/sha1_name.c b/sha1_name.c
index d364244..2ad91a3 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -588,8 +588,8 @@ static int handle_one_ref(const char *path,
 
 /*
  * This interprets names like ':/Initial revision of "git"' by searching
- * through history and returning the first commit whose message starts
- * with the given string.
+ * through history and returning the first commit whose first line
+ * matches the given regex.
  *
  * For future extension, ':/!' is reserved. If you want to match a message
  * beginning with a '!', you have to repeat the exclamation mark.
@@ -600,12 +600,22 @@ static int get_sha1_oneline(const char *prefix, unsigned char *sha1)
 {
 	struct commit_list *list = NULL, *backup = NULL, *l;
 	int retval = -1;
+	regex_t re;
+	int r;
 
 	if (prefix[0] == '!') {
 		if (prefix[1] != '!')
 			die ("Invalid search pattern: %s", prefix);
 		prefix++;
 	}
+
+	r = regcomp(&re, prefix, REG_NOSUB);
+	if (r != 0) {
+		char errbuf[256];
+		regerror(r, &re, errbuf, sizeof(errbuf));
+		return error("unable to compile regex: %s", errbuf);
+	}
+
 	if (!save_commit_buffer)
 		return error("Could not expand oneline-name.");
 	for_each_ref(handle_one_ref, &list);
@@ -613,13 +623,21 @@ static int get_sha1_oneline(const char *prefix, unsigned char *sha1)
 		commit_list_insert(l->item, &backup);
 	while (list) {
 		char *p;
+		char *end;
 		struct commit *commit;
 
 		commit = pop_most_recent_commit(&list, ONELINE_SEEN);
 		parse_object(commit->object.sha1);
 		if (!commit->buffer || !(p = strstr(commit->buffer, "\n\n")))
 			continue;
-		if (!prefixcmp(p + 2, prefix)) {
+		p += 2;
+		end = strchr(p, '\n');
+		if (end)
+			*end = '\0';
+		r = regexec(&re, p, 0, NULL, 0);
+		if (end)
+			*end = '\n';
+		if (r == 0) {
 			hashcpy(sha1, commit->object.sha1);
 			retval = 0;
 			break;
@@ -628,6 +646,7 @@ static int get_sha1_oneline(const char *prefix, unsigned char *sha1)
 	free_commit_list(list);
 	for (l = backup; l; l = l->next)
 		clear_commit_marks(l->item, ONELINE_SEEN);
+	regfree(&re);
 	return retval;
 }
 
-- 
1.5.3.7.2068.g5949-dirty

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-03  4:32 [PATCH/RFC] Use regex for :/ matching Jeff King
@ 2007-12-03 10:55 ` Johannes Schindelin
  2007-12-03 17:30   ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Schindelin @ 2007-12-03 10:55 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Hi,

On Sun, 2 Dec 2007, Jeff King wrote:

> The sha1 syntax :/ used to be a strict prefix match.
> Instead, let's use a regular expression, which can save on
> typing. E.g.,
> 
>   git show :/"object name: introduce ':/<oneline prefix>'"
> 
> vs
> 
>   git show :/introduce..:/.oneline

Heh: 
http://repo.or.cz/w/git/dscho.git?a=commitdiff;h=2546cd9732bb8d4bc1d2485ba7bbc1d5c8bac935

Except that I did not support ".." (does yours?), _and_ that my patch is 
not as nice as yours.

But then, my patch also works when save_commit_buffer == 0.  But I can 
refactor this into its own patch, since it really is a separate issue.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-03 10:55 ` Johannes Schindelin
@ 2007-12-03 17:30   ` Jeff King
  2007-12-03 18:17     ` Junio C Hamano
  2007-12-03 18:42     ` [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0 Johannes Schindelin
  0 siblings, 2 replies; 11+ messages in thread
From: Jeff King @ 2007-12-03 17:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Mon, Dec 03, 2007 at 10:55:15AM +0000, Johannes Schindelin wrote:

> > The sha1 syntax :/ used to be a strict prefix match.
> > Instead, let's use a regular expression, which can save on
> Heh: 
> http://repo.or.cz/w/git/dscho.git?a=commitdiff;h=2546cd9732bb8d4bc1d2485ba7bbc1d5c8bac935

Hmm, the major difference seems to be that you grep the entire body,
whereas I grep just the oneline. My goal was to avoid matching the
search string in the message of a merge commit with merge summaries
turned on.

> Except that I did not support ".." (does yours?), _and_ that my patch is 
> not as nice as yours.

No, I didn't. I'm not sure it is sane, since :/ can contain free-form
text (and with a regex, .. is not that unlikely). And you can always do
git-log --not :/foo :/bar

> But then, my patch also works when save_commit_buffer == 0.  But I can 
> refactor this into its own patch, since it really is a separate issue.

Agreed.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-03 17:30   ` Jeff King
@ 2007-12-03 18:17     ` Junio C Hamano
  2007-12-06  5:52       ` Jeff King
  2007-12-03 18:42     ` [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0 Johannes Schindelin
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2007-12-03 18:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, git

Jeff King <peff@peff.net> writes:

> On Mon, Dec 03, 2007 at 10:55:15AM +0000, Johannes Schindelin wrote:
>
>> Except that I did not support ".." (does yours?), _and_ that my patch is 
>> not as nice as yours.
>
> No, I didn't. I'm not sure it is sane, since :/ can contain free-form
> text (and with a regex, .. is not that unlikely). And you can always do
> git-log --not :/foo :/bar
>
>> But then, my patch also works when save_commit_buffer == 0.  But I can 
>> refactor this into its own patch, since it really is a separate issue.
>
> Agreed.

What I found mildly irritating in the current syntax (and that is the
primary reason why I use it rarely) is that there is no way to tell it
to dig from a particular ref (e.g. "on master branch, find the latest
commit whose title matches this string").

For "git-log", you can do "git log master --grep=string -1" to emulate
this, and it extends to "git log maint..master --grep=string" to limit
the revision ranges and "find all not just latest" very naturally.

Also once we start to talk about supporting "ranges", I suspect the
semantics becomes fuzzier, because ":/" is defined as an extended SHA-1
expression that resolves to a single commit.  ":/A..:/B" is probably
unneeded as you can always say "^:/B :/A", but making ":/A" (we need an
extended syntax to differentiate from the singleton case we currently
have) to mean "all the commits that match this pattern" is something
people might be interested to see.  Unfortunately, that does not define
a revision range operator but a operator that gives back set of commits
that are not consecutive, primarily good for giving to nothing but "git
show", and again "git log --grep" would emulate it just as well.

So in short:

 * I do not think extending it to mean a set of commits (with some
   definition of how the set is computed) is a good idea.  It can stay
   "name one commit that matches this string" without losing usefulness,
   and I think it should;

 * The definition of the "match" can be tweaked and introducing regexp
   might be a good way;

 * The definition of the "match" may become more useful if we can limit
   which refs to dig from.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0
  2007-12-03 17:30   ` Jeff King
  2007-12-03 18:17     ` Junio C Hamano
@ 2007-12-03 18:42     ` Johannes Schindelin
  2007-12-03 21:34       ` Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: Johannes Schindelin @ 2007-12-03 18:42 UTC (permalink / raw)
  To: Jeff King; +Cc: git, gitster


Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	On Mon, 3 Dec 2007, Jeff King wrote:

	> On Mon, Dec 03, 2007 at 10:55:15AM +0000, Johannes Schindelin wrote:
	> 
	> > But then, my patch also works when save_commit_buffer == 0.  
	> > But I can refactor this into its own patch, since it really is 
	> > a separate issue.
	> 
	> Agreed.

	Here we go.

 sha1_name.c |   19 ++++++++++++++++---
 1 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index e78b2b9..da1aad1 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -601,24 +601,35 @@ static int get_sha1_oneline(const char *prefix, unsigned char *sha1)
 {
 	struct commit_list *list = NULL, *backup = NULL, *l;
 	int retval = -1;
+	char *temp_commit_buffer = NULL;
 
 	if (prefix[0] == '!') {
 		if (prefix[1] != '!')
 			die ("Invalid search pattern: %s", prefix);
 		prefix++;
 	}
-	if (!save_commit_buffer)
-		return error("Could not expand oneline-name.");
 	for_each_ref(handle_one_ref, &list);
 	for (l = list; l; l = l->next)
 		commit_list_insert(l->item, &backup);
 	while (list) {
 		char *p;
 		struct commit *commit;
+		enum object_type type;
+		unsigned long size;
 
 		commit = pop_most_recent_commit(&list, ONELINE_SEEN);
 		parse_object(commit->object.sha1);
-		if (!commit->buffer || !(p = strstr(commit->buffer, "\n\n")))
+		if (temp_commit_buffer)
+			free(temp_commit_buffer);
+		if (commit->buffer)
+			p = commit->buffer;
+		else {
+			p = read_sha1_file(commit->object.sha1, &type, &size);
+			if (!p)
+				continue;
+			temp_commit_buffer = p;
+		}
+		if (!(p = strstr(p, "\n\n")))
 			continue;
 		if (!prefixcmp(p + 2, prefix)) {
 			hashcpy(sha1, commit->object.sha1);
@@ -626,6 +637,8 @@ static int get_sha1_oneline(const char *prefix, unsigned char *sha1)
 			break;
 		}
 	}
+	if (temp_commit_buffer)
+		free(temp_commit_buffer);
 	free_commit_list(list);
 	for (l = backup; l; l = l->next)
 		clear_commit_marks(l->item, ONELINE_SEEN);
-- 
1.5.3.7.2119.g774e1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0
  2007-12-03 18:42     ` [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0 Johannes Schindelin
@ 2007-12-03 21:34       ` Junio C Hamano
  2007-12-03 22:52         ` Johannes Schindelin
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2007-12-03 21:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, git, gitster

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>
> 	On Mon, 3 Dec 2007, Jeff King wrote:
>
> 	> On Mon, Dec 03, 2007 at 10:55:15AM +0000, Johannes Schindelin wrote:
> 	> 
> 	> > But then, my patch also works when save_commit_buffer == 0.  
> 	> > But I can refactor this into its own patch, since it really is 
> 	> > a separate issue.
> 	> 
> 	> Agreed.
>
> 	Here we go.

The log message is a bit lacking.  I would have expected "git-show
command can take ':/prefix' as an argument to name a commit whose
subject begins with the prefix, but git-foobar command didn't and
instead errored out.  This fixes it so that ':/prefix' syntax can be
used for any command that wants to take a commit object name".

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0
  2007-12-03 21:34       ` Junio C Hamano
@ 2007-12-03 22:52         ` Johannes Schindelin
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Schindelin @ 2007-12-03 22:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

Hi,

On Mon, 3 Dec 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >
> > 	On Mon, 3 Dec 2007, Jeff King wrote:
> >
> > 	> On Mon, Dec 03, 2007 at 10:55:15AM +0000, Johannes Schindelin wrote:
> > 	> 
> > 	> > But then, my patch also works when save_commit_buffer == 0.  
> > 	> > But I can refactor this into its own patch, since it really is 
> > 	> > a separate issue.
> > 	> 
> > 	> Agreed.
> >
> > 	Here we go.
> 
> The log message is a bit lacking.  I would have expected "git-show
> command can take ':/prefix' as an argument to name a commit whose
> subject begins with the prefix, but git-foobar command didn't and
> instead errored out.  This fixes it so that ':/prefix' syntax can be
> used for any command that wants to take a commit object name".

Sorry.  Can I ask you to amend the commit message with this?

-- snip --
Earlier, ':/<oneline-prefix>' would not work (i.e. die) with commands that 
set save_commit_buffer = 0, such as blame, describe, pack-objects, reflog 
and bundle.

Fix this.
-- snap --

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-03 18:17     ` Junio C Hamano
@ 2007-12-06  5:52       ` Jeff King
  2007-12-06  6:33         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2007-12-06  5:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git

On Mon, Dec 03, 2007 at 10:17:18AM -0800, Junio C Hamano wrote:

> So in short:
> 
>  * I do not think extending it to mean a set of commits (with some
>    definition of how the set is computed) is a good idea.  It can stay
>    "name one commit that matches this string" without losing usefulness,
>    and I think it should;
> 
>  * The definition of the "match" can be tweaked and introducing regexp
>    might be a good way;
> 
>  * The definition of the "match" may become more useful if we can limit
>    which refs to dig from.

Obviously the overall design and usage of :/ is going to take some
thinking and is not 1.5.4 material. However, we do have it in its
current form, and I think regex versus prefix string matching is
orthogonal to the range issues. Should I post my rebased :/ regex patch,
or do you want to just leave it for post-1.5.4?

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-06  5:52       ` Jeff King
@ 2007-12-06  6:33         ` Junio C Hamano
  2007-12-06  6:39           ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2007-12-06  6:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, git

Jeff King <peff@peff.net> writes:

> Obviously the overall design and usage of :/ is going to take some
> thinking and is not 1.5.4 material. However, we do have it in its
> current form, and I think regex versus prefix string matching is
> orthogonal to the range issues. Should I post my rebased :/ regex patch,
> or do you want to just leave it for post-1.5.4?

Making what was string match to regex using the same syntax is a
regression, isn't it?  I do not use :/ very often myself, so I
personally would not mind but people who are used to using :/ may get
upset about the change.  I do not feel strongly enough for changing it
to regex to declare such a change unilaterally.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-06  6:33         ` Junio C Hamano
@ 2007-12-06  6:39           ` Jeff King
  2007-12-06 12:13             ` Johannes Schindelin
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2007-12-06  6:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git

On Wed, Dec 05, 2007 at 10:33:49PM -0800, Junio C Hamano wrote:

> Making what was string match to regex using the same syntax is a
> regression, isn't it?  I do not use :/ very often myself, so I
> personally would not mind but people who are used to using :/ may get
> upset about the change.  I do not feel strongly enough for changing it
> to regex to declare such a change unilaterally.

Yes, it is a regression, which is why my initial patch was labeled RFC.
Nobody commented on that aspect, so either they don't care, or (as you
often seem to find out) they will just complain later. :)

I am fine with adding it with a new syntax, but that is going to take
some thought, so perhaps it is better left to post-1.5.4.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH/RFC] Use regex for :/ matching
  2007-12-06  6:39           ` Jeff King
@ 2007-12-06 12:13             ` Johannes Schindelin
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Schindelin @ 2007-12-06 12:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

Hi,

On Thu, 6 Dec 2007, Jeff King wrote:

> On Wed, Dec 05, 2007 at 10:33:49PM -0800, Junio C Hamano wrote:
> 
> > Making what was string match to regex using the same syntax is a 
> > regression, isn't it?  I do not use :/ very often myself, so I 
> > personally would not mind but people who are used to using :/ may get 
> > upset about the change.  I do not feel strongly enough for changing it 
> > to regex to declare such a change unilaterally.
> 
> Yes, it is a regression, which is why my initial patch was labeled RFC. 
> Nobody commented on that aspect, [...]

I did.  I sent a link to my own version of that patch, thereby displaying 
my agreement with the intent.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2007-12-06 12:14 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-03  4:32 [PATCH/RFC] Use regex for :/ matching Jeff King
2007-12-03 10:55 ` Johannes Schindelin
2007-12-03 17:30   ` Jeff King
2007-12-03 18:17     ` Junio C Hamano
2007-12-06  5:52       ` Jeff King
2007-12-06  6:33         ` Junio C Hamano
2007-12-06  6:39           ` Jeff King
2007-12-06 12:13             ` Johannes Schindelin
2007-12-03 18:42     ` [PATCH] Allow ':/<oneline-prefix>' syntax to work with save_commit_buffer == 0 Johannes Schindelin
2007-12-03 21:34       ` Junio C Hamano
2007-12-03 22:52         ` Johannes Schindelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).