* Re: [RFC] get_pathspec(): free() old buffer if rewriting
2006-05-06 22:37 ` Linus Torvalds
@ 2006-05-06 22:44 ` Johannes Schindelin
2006-05-07 4:26 ` Junio C Hamano
1 sibling, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2006-05-06 22:44 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git, junkio
Hi,
On Sat, 6 May 2006, Linus Torvalds wrote:
> On Sun, 7 May 2006, Johannes Schindelin wrote:
> >
> > This might be the wrong way to do it, but as it is without this patch,
> > get_pathspec() is leaking memory.
>
> I'm not at all convinced we want to do somethng like this.
>
> get_pathspec() is a one-time event. It doesn't "leak" memory. To me,
> "leaking" is when you continually lose a bit of memory, and you eventually
> run out. I don't see that happening here.
I see your point. That was exactly why I put "RFC" and not "PATCH" on the
subject.
> So there's a difference between "don't care" and "leak memory". It sounds
> like you may be using some automated tool that warns because it simply
> doesn't understand that difference.
Nope. No automated tool. Just my brain which wanted to fix _all_
occurrences of that prefix_path() usage bug.
But in case anybody uses Valgrind et al. on git, how about at least a
comment telling the casual observer why we don't free() the buffers?
Ciao,
Dscho
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] get_pathspec(): free() old buffer if rewriting
2006-05-06 22:37 ` Linus Torvalds
2006-05-06 22:44 ` Johannes Schindelin
@ 2006-05-07 4:26 ` Junio C Hamano
2006-05-07 4:50 ` Junio C Hamano
1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2006-05-07 4:26 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
Linus Torvalds <torvalds@osdl.org> writes:
> On Sun, 7 May 2006, Johannes Schindelin wrote:
>>
>> This might be the wrong way to do it, but as it is without this patch,
>> get_pathspec() is leaking memory.
>
> I'm not at all convinced we want to do somethng like this.
>
> get_pathspec() is a one-time event. It doesn't "leak" memory. To me,
> "leaking" is when you continually lose a bit of memory, and you eventually
> run out. I don't see that happening here.
I agree with that, except that blame does use it for each
commit and loses memory with deep history.
I have been contemplating to revamp pathspec stuff to deal with
not just an array pointers to strings. tree-diff already uses a
parallel array of ints (pathlens) to optimize away the repeated
use of strlen() for each pathspec elements, and I think we would
be better off using something like that everywhere. The API I
have in mind goes like this:
struct pathspec; /* opaque */
A type opaque to the caller.
struct pathspec **get_pathspec(const char *prefix,
const char **pathspec,
int wildcard);
The caller gives the prefix (return value from
setup_git_directory(), the user supplied pathspec list,
and if it wants to use ls-files style wildcard or
tree-diff style directory prefix bahaviour. A newly
allocated array is returned and the caller can free() it
when done.
int match_pathspec(const char *path, int len,
struct pathspec **pathspec);
See if the path (with length) matches the given spec.
path[len-1] == '/' signals that the caller is traversing
a tree and checking if it is worth descending into the
tree. In that case, original spec string "foo/bar/baz"
matches path = "foo/" (with len = 4). Otherwise the full
path is checked so that original spec string would match
path = "foo/bar/baz" (with len = 11), but not "foo"
(with len = 3). If the get_pathspec() was called with
wildcard support, spec string "foo/bar*" matches these:
"foo/" (i.e. should descend into it),
"foo/barboz/" (ditto)
"foo/bar.txt" (matches)
but not these:
"fob/" (no point descending into it)
"foo/bax" (does not match)
A wildcard aware diff-tree, when invoked like this:
cd Documentation
git-diff-tree -r A B -- 'ho*'
might do:
struct pathspec **spec;
const char *(paths[2]);
paths[0] = "how*"; paths[1] = NULL;
spec = get_pathspec("Documentation/", argv, 1);
and when traversing the tree for A and B, upon seeing "Documentation"
entry in the topleve tree object buffers, call:
path_match("Documentation", 14, 1, spec)
which would return true (worth descending into the directory).
The recursive call would, upon seeing "hooks.txt" entry, call:
path_match("Documentation/hooks.txt", 23, 0, spec)
which would say true and compares the entry from two trees.
Also, the same recursive invocation would call
path_match("Documentation/howto", 19, 1, spec)
which would return true to cause it further recurse into it.
^ permalink raw reply [flat|nested] 5+ messages in thread