Git development
 help / color / mirror / Atom feed
* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Junio C Hamano @ 2006-06-30 20:26 UTC (permalink / raw)
  To: jnareb; +Cc: git
In-Reply-To: <e83t0m$4s0$2@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Matthias version is truly more expressive, especially with context limiting
> extension. 

That's orthogonal.  I do not think there is any reason you
cannot make the version whose --near is similar to --and to
understand different ranges for each "neighbor search"
expression using --near=M:N syntax.

Now stop talking and code it up, please ;-).

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Jakub Narebski @ 2006-06-30 19:11 UTC (permalink / raw)
  To: git
In-Reply-To: <7v1wt6ik4x.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> The --near Matthias talk about is syntactically not like --and
> but more like --not.  It takes a condition for a line after
> that, and loosens it to cover nearby lines.  So "-e A"
> means "the line must have A on it" but "--near -e A" means "the
> line must be nearby a line that satisfies `-e A'".
> 
> Matthias's "--near EXP" is spelled as "-e '' --near EXP" (the
> first one is always true) with our syntax, in other words.
> 
> I do not think either of these semantics is invalid; they are
> just different.  The version by Matthias is more general and
> more expressive.

It also uses the fact that grep search for _lines_, the fact I have forgot
about. But if we cannot search for multiline regexp using git-grep,
Matthias version is truly more expressive, especially with context limiting
extension. 

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* Re: [PATCH/RFT] upload-pack.c: <sys/poll.h> includes <ctype.h> on OpenBSD 3.8
From: Jakub Narebski @ 2006-06-30 19:00 UTC (permalink / raw)
  To: git
In-Reply-To: <7vr716h4xm.fsf_-_@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> Try to work it around by including the system headers first.

Shouldn't it be always the case?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* [PATCH/RFT] upload-pack.c: <sys/poll.h> includes <ctype.h> on OpenBSD 3.8
From: Junio C Hamano @ 2006-06-30 18:30 UTC (permalink / raw)
  To: git; +Cc: Randal L. Schwartz
In-Reply-To: <7vk66yilxd.fsf@assigned-by-dhcp.cox.net>

Merlyn reports that <sys/poll.h> on OpenBSD 3.8 includes <ctype.h>
and having our custom ctype (done in git-compat-util.h which is
included via cache.h) makes upload-pack.c uncompilable.  Try to
work it around by including the system headers first.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * Can somebody with OpenBSD who can reproduce the original
   problem confirm or reject this patch, so that the issue can
   be resolved before 1.4.1, please?

diff --git a/upload-pack.c b/upload-pack.c
index 2b70c3d..b18eb9b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1,3 +1,6 @@
+#include <signal.h>
+#include <sys/wait.h>
+#include <sys/poll.h>
 #include "cache.h"
 #include "refs.h"
 #include "pkt-line.h"
@@ -5,9 +8,6 @@ #include "tag.h"
 #include "object.h"
 #include "commit.h"
 #include "exec_cmd.h"
-#include <signal.h>
-#include <sys/poll.h>
-#include <sys/wait.h>
 
 static const char upload_pack_usage[] = "git-upload-pack [--strict] [--timeout=nn] <dir>";
 

^ permalink raw reply related

* Re: [PATCH/RFC] Add git-instaweb, instantly browse the working repo with gitweb
From: Junio C Hamano @ 2006-06-30 18:24 UTC (permalink / raw)
  To: Eric Wong; +Cc: git
In-Reply-To: <11513991301372-git-send-email-normalperson@yhbt.net>

Eric Wong <normalperson@yhbt.net> writes:

> I got tired of having to configure gitweb for every repository
> I work on.  I sometimes prefer gitweb to standard GUIs like gitk
> or gitview; so this lets me automatically configure gitweb to
> browse my working repository and also opens my browser to it.
>
> Signed-off-by: Eric Wong <normalperson@yhbt.net>

This is cute but I haven't seen much responses to it yet.  Are
people uninterested?

^ permalink raw reply

* git object hash cleanups
From: Linus Torvalds @ 2006-06-30 18:20 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


This IMNSHO cleans up the object hashing.

The hash expansion is separated out into a function of its own, the hash 
array (and size) names are made more obvious, and the code is generally 
made to look a bit more like the object-ref hashing.

It also gets rid of "find_object()" returning an index (or negative 
position if no object is found), since that is made redundant by the 
simplified object rehashing. The basic operation is now "lookup_object()" 
which just returns the object itself.

There's an almost unmeasurable speed increase, but more importantly, I 
think the end result is more readable.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

I tried to be really careful, and this should all be good, but I'm still 
embarrassed about my hash insertion bug in object-refs.c, so people should 
double- and triple-check this.

diff --git a/object.c b/object.c
index 31c77ea..37277f9 100644
--- a/object.c
+++ b/object.c
@@ -5,88 +5,97 @@ #include "tree.h"
 #include "commit.h"
 #include "tag.h"
 
-static struct object **objs;
-static int nr_objs, obj_allocs;
+static struct object **obj_hash;
+static int nr_objs, obj_hash_size;
 
 unsigned int get_max_object_index(void)
 {
-	return obj_allocs;
+	return obj_hash_size;
 }
 
 struct object *get_indexed_object(unsigned int idx)
 {
-	return objs[idx];
+	return obj_hash[idx];
 }
 
 const char *type_names[] = {
 	"none", "blob", "tree", "commit", "bad"
 };
 
+static unsigned int hash_obj(struct object *obj, unsigned int n)
+{
+	unsigned int hash = *(unsigned int *)obj->sha1;
+	return hash % n;
+}
+
+static void insert_obj_hash(struct object *obj, struct object **hash, unsigned int size)
+{
+	int j = hash_obj(obj, size);
+
+	while (hash[j]) {
+		j++;
+		if (j >= size)
+			j = 0;
+	}
+	hash[j] = obj;
+}
+
 static int hashtable_index(const unsigned char *sha1)
 {
 	unsigned int i;
 	memcpy(&i, sha1, sizeof(unsigned int));
-	return (int)(i % obj_allocs);
+	return (int)(i % obj_hash_size);
 }
 
-static int find_object(const unsigned char *sha1)
+struct object *lookup_object(const unsigned char *sha1)
 {
 	int i;
+	struct object *obj;
 
-	if (!objs)
-		return -1;
+	if (!obj_hash)
+		return NULL;
 
 	i = hashtable_index(sha1);
-	while (objs[i]) {
-		if (memcmp(sha1, objs[i]->sha1, 20) == 0)
-			return i;
+	while ((obj = obj_hash[i]) != NULL) {
+		if (!memcmp(sha1, obj->sha1, 20))
+			break;
 		i++;
-		if (i == obj_allocs)
+		if (i == obj_hash_size)
 			i = 0;
 	}
-	return -1 - i;
+	return obj;
 }
 
-struct object *lookup_object(const unsigned char *sha1)
+static void grow_object_hash(void)
 {
-	int pos = find_object(sha1);
-	if (pos >= 0)
-		return objs[pos];
-	return NULL;
+	int i;
+	int new_hash_size = obj_hash_size < 32 ? 32 : 2 * obj_hash_size;
+	struct object **new_hash;
+
+	new_hash = calloc(new_hash_size, sizeof(struct object *));
+	for (i = 0; i < obj_hash_size; i++) {
+		struct object *obj = obj_hash[i];
+		if (!obj)
+			continue;
+		insert_obj_hash(obj, new_hash, new_hash_size);
+	}
+	free(obj_hash);
+	obj_hash = new_hash;
+	obj_hash_size = new_hash_size;
 }
 
 void created_object(const unsigned char *sha1, struct object *obj)
 {
-	int pos;
-
 	obj->parsed = 0;
-	memcpy(obj->sha1, sha1, 20);
-	obj->type = TYPE_NONE;
 	obj->used = 0;
+	obj->type = TYPE_NONE;
+	obj->flags = 0;
+	memcpy(obj->sha1, sha1, 20);
 
-	if (obj_allocs - 1 <= nr_objs * 2) {
-		int i, count = obj_allocs;
-		obj_allocs = (obj_allocs < 32 ? 32 : 2 * obj_allocs);
-		objs = xrealloc(objs, obj_allocs * sizeof(struct object *));
-		memset(objs + count, 0, (obj_allocs - count)
-				* sizeof(struct object *));
-		for (i = 0; i < obj_allocs; i++)
-			if (objs[i]) {
-				int j = find_object(objs[i]->sha1);
-				if (j != i) {
-					j = -1 - j;
-					objs[j] = objs[i];
-					objs[i] = NULL;
-				}
-			}
-	}
-
-	pos = find_object(sha1);
-	if (pos >= 0)
-		die("Inserting %s twice\n", sha1_to_hex(sha1));
-	pos = -pos-1;
+	if (obj_hash_size - 1 <= nr_objs * 2)
+		grow_object_hash();
 
-	objs[pos] = obj;
+	insert_obj_hash(obj, obj_hash, obj_hash_size);
 	nr_objs++;
 }
 

^ permalink raw reply related

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Matthias Lederhofer @ 2006-06-30 18:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jakub Narebski
In-Reply-To: <7vfyhmil07.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
> Matthias Lederhofer <matled@gmx.net> writes:
> 
> > -e foo --or  --near \( -e A -- or -e B \)
> > would mean lines containing foo or having A or B in the context.
> 
> How would that "--near" be useful?  You will see A or B either way.
Ok, this example was quite bad.

If --near is binary
-e foo --and ( --near=3:0 -e A --or --near=0:3 -e B )
could not be done anymore, could it (without repeating the first
pattern)? (Find foo with A in the 3 lines before or B in the 3 lines
after the line.)
Without different contexts for multiple --near it probably does not
matter if --near is binary or unary.

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Junio C Hamano @ 2006-06-30 18:16 UTC (permalink / raw)
  To: jnareb; +Cc: git
In-Reply-To: <e83p0q$dla$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Because --near needs an expression it check context for (context is for
> found match of lhs expression). So
>
>   -e foo --near \( -e A --or -e B \)
>
> means lines containing foo and either A or B in the context _for "foo"_.

The syntax and semantics of --near I suggested (and you are
following) and what Matthias discusses are different and I think
that is why you two are talking past each other.

What I originally suggested is that you can (syntactically)
replace --near with --and.  That is, the LHS is the match and
RHS is "the LHS must match, but in addition RHS must match but
unlike --and RHS does not have to be exactly on the same line
but it is OK if it is a line somewhere nearby".

The --near Matthias talk about is syntactically not like --and
but more like --not.  It takes a condition for a line after
that, and loosens it to cover nearby lines.  So "-e A"
means "the line must have A on it" but "--near -e A" means "the
line must be nearby a line that satisfies `-e A'".

Matthias's "--near EXP" is spelled as "-e '' --near EXP" (the
first one is always true) with our syntax, in other words.

I do not think either of these semantics is invalid; they are
just different.  The version by Matthias is more general and
more expressive.

^ permalink raw reply

* Re: [PATCH 1/3] Fix probing for already installed Error.pm
From: Pavel Roskin @ 2006-06-30 18:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Petr Baudis, git
In-Reply-To: <7vbqsbks69.fsf@assigned-by-dhcp.cox.net>

On Fri, 2006-06-30 at 00:40 -0700, Junio C Hamano wrote:
> > It is trying to see if we need to install the Error.pm we ship
> > just in case the system does not have Error.pm available.  But
> > this script is run in perl/ directory where we have that private
> > copy of Error.pm, so "require Error" always succeeds, eh, at
> > least after you fixed the above syntax error X-<.

Nice catch!  Thank you for fixing it.

-- 
Regards,
Pavel Roskin

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Jakub Narebski @ 2006-06-30 18:03 UTC (permalink / raw)
  To: git
In-Reply-To: <E1FwN7M-0007GI-Ng@moooo.ath.cx>

Matthias Lederhofer wrote:

> Jakub Narebski wrote:
>> I think --near _has_ to be non-symmetric binary operator, i.e. first
>> argument specifies line to be found, second argument has to be in context
>> for first line if it is found.
>> 
>> So the above expression would be written as:
>> 
>>   -e foo --near \( A --or B \)
> Why is that?
>   -e foo --and --near \( -e A --or -e B \)
> would mean lines containing foo and either A or B in the context and
>   -e foo --or  --near \( -e A --or -e B \)
> would mean lines containing foo or having A or B in the context.

Because --near needs an expression it check context for (context is for
found match of lhs expression). So

  -e foo --near \( -e A --or -e B \)

means lines containing foo and either A or B in the context _for "foo"_.

--and --near could be shorthand for --and-near, and --or --near for
--or-near... except that the second one doesn't have much sense:

What is the difference between
  -e foo --or --near \( -e A --or -e B \)
and
  -e foo --or \( -e A --or -e B \)

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Junio C Hamano @ 2006-06-30 17:58 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git, Jakub Narebski
In-Reply-To: <E1FwN7M-0007GI-Ng@moooo.ath.cx>

Matthias Lederhofer <matled@gmx.net> writes:

> -e foo --or  --near \( -e A -- or -e B \)
> would mean lines containing foo or having A or B in the context.

How would that "--near" be useful?  You will see A or B either way.

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Matthias Lederhofer @ 2006-06-30 17:49 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e83n97$973$1@sea.gmane.org>

Jakub Narebski wrote:
> I think --near _has_ to be non-symmetric binary operator, i.e. first
> argument specifies line to be found, second argument has to be in context
> for first line if it is found.
> 
> So the above expression would be written as:
> 
>   -e foo --near \( A --or B \)
Why is that?
-e foo --and --near \( -e A -- or -e B \)
would mean lines containing foo and either A or B in the context and
-e foo --or  --near \( -e A -- or -e B \)
would mean lines containing foo or having A or B in the context.

> BTW. we can make -e equivalent to --or, and empty (default) operator to
> --and, but of course you have to delimit expression from files, i.e. either
> 
>   git grep A B C D -- files
This is incompatible with the current implementation.
'git grep A B C D -- files' means A is the pattern, B, C, D are
revisions and files is the pathspec.

> or
> 
>   git grep -e \( A B C D \) files
> 
> which would be equivalent to
> 
>   git grep A --and B --and C --and D files
I think this could probably be used.  But I think having two different
implicit operators depending on the context is too confusing.

^ permalink raw reply

* Re: fc046a75d539a78e6b2c16534c4078617a69a327 fails on OpenBSD 3.8
From: Junio C Hamano @ 2006-06-30 17:38 UTC (permalink / raw)
  To: Randal L. Schwartz; +Cc: git
In-Reply-To: <86sllmy3ia.fsf@blue.stonehenge.com>

merlyn@stonehenge.com (Randal L. Schwartz) writes:

>>>>>> "Junio" == Junio C Hamano <junkio@cox.net> writes:
>...
>>> In file included from /usr/include/sys/poll.h:54,
>>> from upload-pack.c:9:
>>> /usr/include/ctype.h:91:1: unterminated #if
>>> /usr/include/ctype.h:40:1: unterminated #ifndef
>
> Junio> Crap.  Including <sys/poll.h> pollutes your namespace with ctype
> Junio> macros?

I should stop imitating others -- not my style.  Sorry.
Would this work for you?

diff --git a/upload-pack.c b/upload-pack.c
index 2b70c3d..b18eb9b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1,3 +1,6 @@
+#include <signal.h>
+#include <sys/wait.h>
+#include <sys/poll.h>
 #include "cache.h"
 #include "refs.h"
 #include "pkt-line.h"
@@ -5,9 +8,6 @@ #include "tag.h"
 #include "object.h"
 #include "commit.h"
 #include "exec_cmd.h"
-#include <signal.h>
-#include <sys/poll.h>
-#include <sys/wait.h>
 
 static const char upload_pack_usage[] = "git-upload-pack [--strict] [--timeout=nn] <dir>";
 

^ permalink raw reply related

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Jakub Narebski @ 2006-06-30 17:33 UTC (permalink / raw)
  To: git
In-Reply-To: <7vpsgqimu7.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> Matthias Lederhofer <matled@gmx.net> writes:

>> 3. Is --near just another subexpression? e.g. search for foo with
>>    either A or B in the context:
>>    -e foo --and ( --near A --or --near B )
>>    This does not make sense without 1 and 2.
> 
> Ah, interesting.  I was thinking --near to be weaker form of --and,
> but you made it to be a unary predicate (like --not).  That
> would be neater.

I think --near _has_ to be non-symmetric binary operator, i.e. first
argument specifies line to be found, second argument has to be in context
for first line if it is found.

So the above expression would be written as:

  -e foo --near \( A --or B \)


BTW. we can make -e equivalent to --or, and empty (default) operator to
--and, but of course you have to delimit expression from files, i.e. either

  git grep A B C D -- files

or

  git grep -e \( A B C D \) files

which would be equivalent to

  git grep A --and B --and C --and D files

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Junio C Hamano @ 2006-06-30 17:18 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git
In-Reply-To: <E1FwMPf-0005XA-N9@moooo.ath.cx>

Matthias Lederhofer <matled@gmx.net> writes:

> 1. Should the context of near be the same as -[ABC] or perhaps
>    --near=N / --near=N:M (default could be the same as specified by
>    -[ABC]).

As an end-user, I do not care either way.

> 2. Should it be possible to specify another boolean expression after
>    --near? e.g. --near ( -e foo --or ( -e bar --and -e baz )) to match
>    if the context contains foo or 'bar and baz'.

I would say why not.

> 3. Is --near just another subexpression? e.g. search for foo with
>    either A or B in the context:
>    -e foo --and ( --near A --or --near B )
>    This does not make sense without 1 and 2.

Ah, interesting.  I was thinking --near to be weaker form of --and,
but you made it to be a unary predicate (like --not).  That
would be neater.

> With some or all of those features quite mighty and complex
> expressions can be build:
> -e A --and --near=3:-1 ( -e B --and --near=0:0 ( -e foo --and -e bar ) )
> This could mean: find lines containing A and have B in any of the 3
> lines before A (without the line containing A). Additionally foo and
> bar have to be found on the same line before A.

Having said that, I suspect the above made-up example may not be
so useful in practice.  I think a more realistic usage is "I
want to find lines that contain `made-up' and `realistic' but
the paragraph might have been filled by the editor and they may
be found on separate nearby lines.  Instead of saying `-e
made-up --and -e realistic', I would say `-e made-up --near -e
realistic' to find what I want".  That would find the first two
lines of this paragraph, among others.

> With the new extended expressions it would be really nice if git-grep
> could also be used outside a git repository :)

I am not sure about `outside' but it might be useful to extend
the working tree walker and glob filter used there to match what
ls-files uses so that it can do untracked files as well.

^ permalink raw reply

* Re: fc046a75d539a78e6b2c16534c4078617a69a327 fails on OpenBSD 3.8
From: Randal L. Schwartz @ 2006-06-30 17:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vy7veindn.fsf@assigned-by-dhcp.cox.net>

>>>>> "Junio" == Junio C Hamano <junkio@cox.net> writes:

Junio> merlyn@stonehenge.com (Randal L. Schwartz) writes:
>> gcc -o upload-pack.o -c -g -O2 -Wall -I/usr/local/include -DSHA1_HEADER='<openssl/sha.h>' -DNO_STRCASESTR upload-pack.c
>> In file included from /usr/include/sys/poll.h:54,
>> from upload-pack.c:9:
>> /usr/include/ctype.h:67: error: syntax error before ']' token
>> /usr/include/ctype.h:68: error: syntax error before ']' token
>> /usr/include/ctype.h:70: error: syntax error before ']' token
>> /usr/include/ctype.h:75: error: syntax error before ']' token
>> /usr/include/ctype.h:78: error: syntax error before '(' token
>> /usr/include/ctype.h:79: error: syntax error before '(' token
>> /usr/include/ctype.h:93: error: syntax error before "c"
>> In file included from /usr/include/sys/poll.h:54,
>> from upload-pack.c:9:
>> /usr/include/ctype.h:91:1: unterminated #if
>> /usr/include/ctype.h:40:1: unterminated #ifndef

Junio> Crap.  Including <sys/poll.h> pollutes your namespace with ctype
Junio> macros?

>From /usr/include/sys/poll.h:

    #ifndef _KERNEL
    #include <ctype.h>

So, I guess, it's ... Yes.

This sounds familiar.  Maybe the mailing list archive has me reporting
this bug last time too. :)

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

^ permalink raw reply

* Re: fc046a75d539a78e6b2c16534c4078617a69a327 fails on OpenBSD 3.8
From: Junio C Hamano @ 2006-06-30 17:06 UTC (permalink / raw)
  To: git
In-Reply-To: <86wtayy42o.fsf@blue.stonehenge.com>

merlyn@stonehenge.com (Randal L. Schwartz) writes:

> gcc -o upload-pack.o -c -g -O2 -Wall -I/usr/local/include -DSHA1_HEADER='<openssl/sha.h>' -DNO_STRCASESTR upload-pack.c
> In file included from /usr/include/sys/poll.h:54,
>                  from upload-pack.c:9:
> /usr/include/ctype.h:67: error: syntax error before ']' token
> /usr/include/ctype.h:68: error: syntax error before ']' token
> /usr/include/ctype.h:70: error: syntax error before ']' token
> /usr/include/ctype.h:75: error: syntax error before ']' token
> /usr/include/ctype.h:78: error: syntax error before '(' token
> /usr/include/ctype.h:79: error: syntax error before '(' token
> /usr/include/ctype.h:93: error: syntax error before "c"
> In file included from /usr/include/sys/poll.h:54,
>                  from upload-pack.c:9:
> /usr/include/ctype.h:91:1: unterminated #if
> /usr/include/ctype.h:40:1: unterminated #ifndef

Crap.  Including <sys/poll.h> pollutes your namespace with ctype
macros?

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Matthias Lederhofer @ 2006-06-30 17:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vejx6k54p.fsf@assigned-by-dhcp.cox.net>

> Side note.  It would be interesting to have a slightly different
> form of --and called --near.  You would use it like this:
> 
> 	git grep -C -e AND --near -e OR
> 
> to find lines that has AND on it, and within the context
> distance there is a line that has OR on it.  The lines that are
> hit with such a query are still the ones that have AND on them
> (in other words, a line that has OR is used to further filter
> out the results so it will be prefixed with '-', not ':', unless
> that line happens to also have AND on it).
Nice idea even though I don't now about practical importance but it
sounds quite handy.  A few questions about this (some or all of those
features may make it quite complex):
1. Should the context of near be the same as -[ABC] or perhaps
   --near=N / --near=N:M (default could be the same as specified by
   -[ABC]).
2. Should it be possible to specify another boolean expression after
   --near? e.g. --near ( -e foo --or ( -e bar --and -e baz )) to match
   if the context contains foo or 'bar and baz'.
3. Is --near just another subexpression? e.g. search for foo with
   either A or B in the context:
   -e foo --and ( --near A --or --near B )
   This does not make sense without 1 and 2.

With some or all of those features quite mighty and complex
expressions can be build:
-e A --and --near=3:-1 ( -e B --and --near=0:0 ( -e foo --and -e bar ) )
This could mean: find lines containing A and have B in any of the 3
lines before A (without the line containing A). Additionally foo and
bar have to be found on the same line before A.

I'm really not asking for this, just telling about some ideas that
come to my mind for --near.

> With your syntax perhaps this is spelled as "--near -C -e AND -e
> OR".
Huh? What do you mean by "my syntax"? The only thing different is the
option to change the default operator to 'and'.

With the new extended expressions it would be really nice if git-grep
could also be used outside a git repository :)

^ permalink raw reply

* fc046a75d539a78e6b2c16534c4078617a69a327 fails on OpenBSD 3.8
From: Randal L. Schwartz @ 2006-06-30 16:57 UTC (permalink / raw)
  To: git


gcc -o upload-pack.o -c -g -O2 -Wall -I/usr/local/include -DSHA1_HEADER='<openssl/sha.h>' -DNO_STRCASESTR upload-pack.c
In file included from /usr/include/sys/poll.h:54,
                 from upload-pack.c:9:
/usr/include/ctype.h:67: error: syntax error before ']' token
/usr/include/ctype.h:68: error: syntax error before ']' token
/usr/include/ctype.h:70: error: syntax error before ']' token
/usr/include/ctype.h:75: error: syntax error before ']' token
/usr/include/ctype.h:78: error: syntax error before '(' token
/usr/include/ctype.h:79: error: syntax error before '(' token
/usr/include/ctype.h:93: error: syntax error before "c"
In file included from /usr/include/sys/poll.h:54,
                 from upload-pack.c:9:
/usr/include/ctype.h:91:1: unterminated #if
/usr/include/ctype.h:40:1: unterminated #ifndef
In file included from upload-pack.c:9:
/usr/include/sys/poll.h:53:1: unterminated #ifndef
/usr/include/sys/poll.h:28:1: unterminated #ifndef
gmake: *** [upload-pack.o] Error 1

The lines in ctype.h that are probably relevant are:

    #if defined(__GNUC__) || defined(_ANSI_LIBRARY) || defined(lint)
    int	isalnum(int);
    int	isalpha(int);
    int	iscntrl(int);
    int	isdigit(int);
    int	isgraph(int);
    int	islower(int);
    int	isprint(int);
    int	ispunct(int);
    int	isspace(int);
    int	isupper(int);
    int	isxdigit(int);
    int	tolower(int);
    int	toupper(int);

Line 67 is "int isalnum(int)"

Are you defining a macro when you shouldn't be in upload-pack?

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

^ permalink raw reply

* Re: [PATCH] consider previous pack undeltified object state only when reusing delta data
From: Nicolas Pitre @ 2006-06-30 16:55 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Johannes Schindelin, Junio C Hamano, git
In-Reply-To: <44A518D6.8040901@op5.se>

On Fri, 30 Jun 2006, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> > Hi,
> > 
> > On Thu, 29 Jun 2006, Nicolas Pitre wrote:
> > 
> > 
> > > Without this there would never be a chance to improve packing for
> > > previously undeltified objects.
> > 
> > 
> > Earlier this year, I was quite surprised to learn that multiple repackings
> > actually improved packing. Does that patch mean this feature is gone?
> > 
> 
> The patch Linus sent removes that feature. This one re-introduces it.

Not really.

Actually that multiple repacking "feature" was rather an artifact of the 
delta data reuse code and not really by design.  Here's what happened 
before:

Consider the first repack where no delta exists, or "git-repack -a -f" 
where the -f argument makes it ignores existing delta data.  In that 
case all objects are sorted and delta attempted on them within a window.

So to simplify things let's assume objects are numbered from 1 upwards.  
First obj #1 is added to the window.  Obj #2 attempts a delta against 
obj #1.  Obj #3 attempts a delta against objs #2 and #1.  Obj #4 
attempts a delta against objs #3, #2 and #1.  And so on for all object: 
each new object attempts a delta against the last 10 objects (the 
default window size is 10) and the best delta, if any, is kept.

In the end, some objects get deltified, some don't, and a new pack is 
produced.

When repacking without -f to git-repack, then already deltified objects 
are simply copied as is from the existing pack(s) avoiding costly delta 
re-computation.  Still, without Linus' patch, non-deltified objects were 
considered for deltification and deltas attempted on them.

So supposing that objects #1 through #10 were not deltified, and objects 
#11 through #50 were deltified, then those deltified objects were 
skipped over for the purpose of delta matching and therefore object #51 
ended up attempting a delta against objs #1 to 10 instead of #41 to #50 
like in the previous run.  The net effect was similar to a larger window 
for some objects providing more opportunities for successful deltas, and 
therefore a smaller pack.

With Linus' patch those objects already known to be undeltified are, 
too, skipped.  That means that successive git-repack without the -f 
argument are now producing identical packs all the time and the artifact 
above is gone.

I think this is a good thing since now the packing behavior is more 
predictable.  But nothing is lost since if you want to have better 
packing like before you simply have to specify a slightly larger window 
size on the first git-repack.  It'll take a bit more time but running 
git-repack many times also took more time in the end anyway.


Nicolas

^ permalink raw reply

* Re: [PATCH 1/3] Add read_cache_from() and discard_cache()
From: Junio C Hamano @ 2006-06-30 16:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Alex Riesen
In-Reply-To: <Pine.LNX.4.63.0606301643150.29667@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> +int discard_cache()
> +{
> +	int ret;
> +	
> +	if (cache_mmap == NULL)
> +		return 0;
> +	ret = munmap(cache_mmap, cache_mmap_size);
> +	cache_mmap = NULL;
> +	cache_mmap_size = 0;
> +	active_nr = active_cache_changed = 0;
> +	/* no need to throw away allocated active_cache */
> +	return ret;
> +}
> +

I haven't been following the details of the patches in this
thread while they are being cooked actively, but two things to
look out for are:

 - I am guessing you run discard_cache() because you want to
   read in a new cache (or start from a clean slate).  I am not
   sure what you are doing with the old cache tree data
   structure.  If you are starting from a clean slate
   (i.e. there is no read_cache_from() after you call
   discard_cache), you would probably need to discard the old
   cache tree; otherwise your next write-tree may produce an
   incorrect index file.  If you keep the old one and later
   swap it in, the problem might be even more severe.

 - index_timestamp is left as the old value in this patch when
   you switch cache using read_cache_from() directly.  I have a
   suspicion you may be bitten by "Racy Git" problem, especially
   because the operations are supposed to happen quickly thanks
   to the effort of you two ;-) increasing the risks that the
   file timestamp of the working tree file and the cached entry
   match.

^ permalink raw reply

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
From: Junio C Hamano @ 2006-06-30 15:57 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git
In-Reply-To: <E1FwGgm-0006Nc-9a@moooo.ath.cx>

Matthias Lederhofer <matled@gmx.net> writes:

> Junio C Hamano wrote:
>> I see you are trying hard to think of a way to justify your
>> original prefix "--and" (or --FOO) implementation, but I simply
>> do not see much point in that.  I doubt changing the default
>> operator from --or to --and is less confusing than changing the
>> precedence for the users, so you would hear the same "I
>> personally feel FOO should not even exist" objection from me.
>
> It just happens to make more sense to me and I don't see a reason not to
> add this. If no one else is interested in this I'll just stop arguing :)
> Here again an overview of the arguments if anyone is interested:
> - Less to type for common searches using only AND (or more ANDs than
>   ORs).
> - Easy to implement (both with and without extended expressions).
> - AND/* is the normal implicit operator in other contexts than grep
>   (math).
> - The high precedence operator (AND) should be implicit rather than
>   the low precedence one (OR) (so this is only fulfilled when the
>   option is used).

Side note.  It would be interesting to have a slightly different
form of --and called --near.  You would use it like this:

	git grep -C -e AND --near -e OR

to find lines that has AND on it, and within the context
distance there is a line that has OR on it.  The lines that are
hit with such a query are still the ones that have AND on them
(in other words, a line that has OR is used to further filter
out the results so it will be prefixed with '-', not ':', unless
that line happens to also have AND on it).

With your syntax perhaps this is spelled as "--near -C -e AND -e
OR".

^ permalink raw reply

* Re: [PATCH] autoconf: Use autoconf to write installation directories to config.mak
From: Jakub Narebski @ 2006-06-30 15:15 UTC (permalink / raw)
  To: git
In-Reply-To: <44A51693.5020501@op5.se>

Andreas Ericsson wrote:

> grep -q autogen config.mak || \
>       echo "-include config.mak.autogen" >> config.mak
> 
> I wouldn't want my long-standing, functioning config.mak overwritten, 
> but I *might* be interested in trying some of the options provided by 
> ./configure.

Thanks for the solution. Done in my latest patch.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* Re: [PATCH] git-grep: boolean expression on pattern matching.
From: Matthias Lederhofer @ 2006-06-30 15:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vsllnj6rh.fsf_-_@assigned-by-dhcp.cox.net>

> This extends the behaviour of git-grep when multiple -e options
> are given.  So far, we allowed multiple -e to behave just like
> regular grep with multiple -e, i.e. the patterns are OR'ed
> together.
> 
> With this change, you can also have multiple patterns AND'ed
> together, or form boolean expressions, like this (the
> parentheses are quoted from the shell in this example):
> 
> 	$ git grep -e _PATTERN --and \( -e atom -e token \)
This looks really nice. So for a few trivial tests it did not fail :)

I noticed an unrelated bug. The context separators ("--") are missing
between matches in different files:

$ git-grep -e foobar -A 1 (this uses external grep)
Documentation/git-diff-tree.txt:I.e. "foo" does not pick up `foobar.h`.  "foo" does match `foo/bar.h`
Documentation/git-diff-tree.txt-so it can be used to name subdirectories.
--
git-send-email.perl:#$initial_reply_to = ''; #<20050203173208.GA23964@foobar.com>';
git-send-email.perl-
--
[..]

$ git-grep -e foobar -A 1 master (this is internal grep)
master:Documentation/git-diff-tree.txt:I.e. "foo" does not pick up `foobar.h`.  "foo" does match `foo/bar.h`
master:Documentation/git-diff-tree.txt-so it can be used to name subdirectories.
master:git-send-email.perl:#$initial_reply_to = ''; #<20050203173208.GA23964@foobar.com>';
master:git-send-email.perl-
[..]

I think this cannot be fixed in the loop in builtin-grep.c:grep_cache
because after the last hit there should be no separator but it is not
known if a grep_sha1/grep_file will match and produce output. So I
think there has to be a variable passed down which tells those
functions to print the separator before any other output.

^ permalink raw reply

* [PATCH 13] autoconf: Append '-include config.mak.autogen' to config.mak if it is not present
From: Jakub Narebski @ 2006-06-30 15:11 UTC (permalink / raw)
  To: git; +Cc: Andreas Ericsson
In-Reply-To: <200606301708.19521.jnareb@gmail.com>

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Andreas Ericsson wrote:
> Jakub Narebski wrote:
>>
>> The idea was to use ./configure to _generate_ config.mak, which the user can
>> then edit.
>
> This is bad, since it forces users to do one thing first and then do
> what they're used to. Better to have the script add
>
> -include config.mak.autogen
>
> LAST in config.mak, unless it's already in the file and generate
> config.mak.autogen with configure, e.g. with
>
> grep -q autogen config.mak || \
>         echo "-include config.mak.autogen" >> config.mak
>
> Since Make does things bottoms-up (much like swedish students and
> midsummer celebrators), the previous hand-edited defaults in config.mak
> will beat the ones in config.mak.autogen (a good thing).
>
> I wouldn't want my long-standing, functioning config.mak overwritten,
> but I *might* be interested in trying some of the options provided by
> ./configure.

Done, with small changes.

Can anyone tell me if frep use is portable enough?

 configure.ac |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/configure.ac b/configure.ac
index 1ead656..2904077 100644
--- a/configure.ac
+++ b/configure.ac
@@ -54,3 +54,6 @@ AC_CONFIG_FILES(["${config_file}":"${con
 AC_OUTPUT
 
 rm -f "${config_append}"
+
+grep -q -s -F "-include ${config_file}" config.mak || \
+        echo  "-include ${config_file}" >> config.mak
-- 
1.4.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox