git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] get_pathspec(): free() old buffer if rewriting
@ 2006-05-06 22:04 Johannes Schindelin
  2006-05-06 22:37 ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2006-05-06 22:04 UTC (permalink / raw)
  To: git, junkio


This might be the wrong way to do it, but as it is without this patch,
get_pathspec() is leaking memory.

However, it is by no means guaranteed that the input is malloc()ed. The
tests run through without problems, but you never know...

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

---

 setup.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/setup.c b/setup.c
index fe7f884..9c39d6e 100644
--- a/setup.c
+++ b/setup.c
@@ -126,6 +126,11 @@ const char **get_pathspec(const char *pr
 	prefixlen = prefix ? strlen(prefix) : 0;
 	do {
 		*p = prefix_path(prefix, prefixlen, entry);
+		if (*p != entry) {
+			if (*p > entry && *p < entry + strlen(entry))
+				*p = strdup(*p);
+			free((char*)entry);
+		}
 	} while ((entry = *++p) != NULL);
 	return (const char **) pathspec;
 }
-- 
1.3.2.g9ba6-dirty

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC] get_pathspec(): free() old buffer if rewriting
  2006-05-06 22:04 [RFC] get_pathspec(): free() old buffer if rewriting Johannes Schindelin
@ 2006-05-06 22:37 ` Linus Torvalds
  2006-05-06 22:44   ` Johannes Schindelin
  2006-05-07  4:26   ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: Linus Torvalds @ 2006-05-06 22:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, junkio



On Sun, 7 May 2006, Johannes Schindelin wrote:
> 
> This might be the wrong way to do it, but as it is without this patch,
> get_pathspec() is leaking memory.

I'm not at all convinced we want to do somethng like this.

get_pathspec() is a one-time event. It doesn't "leak" memory. To me, 
"leaking" is when you continually lose a bit of memory, and you eventually 
run out. I don't see that happening here.

It's like saying that calling an initialization routine "leaks" memory, 
because it doesn't free the memory that contains the function that is now 
no longer ever used. It's clearly leaving that memory unused, "dangling".

But the fact is, the OS will free the memory for us when we exit, and 
simplicity is often much more powerful than trying to be overly careful.

So there's a difference between "don't care" and "leak memory". It sounds 
like you may be using some automated tool that warns because it simply 
doesn't understand that difference.

			Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] get_pathspec(): free() old buffer if rewriting
  2006-05-06 22:37 ` Linus Torvalds
@ 2006-05-06 22:44   ` Johannes Schindelin
  2006-05-07  4:26   ` Junio C Hamano
  1 sibling, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2006-05-06 22:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, junkio

Hi,

On Sat, 6 May 2006, Linus Torvalds wrote:

> On Sun, 7 May 2006, Johannes Schindelin wrote:
> > 
> > This might be the wrong way to do it, but as it is without this patch,
> > get_pathspec() is leaking memory.
> 
> I'm not at all convinced we want to do somethng like this.
> 
> get_pathspec() is a one-time event. It doesn't "leak" memory. To me, 
> "leaking" is when you continually lose a bit of memory, and you eventually 
> run out. I don't see that happening here.

I see your point. That was exactly why I put "RFC" and not "PATCH" on the 
subject.

> So there's a difference between "don't care" and "leak memory". It sounds 
> like you may be using some automated tool that warns because it simply 
> doesn't understand that difference.

Nope. No automated tool. Just my brain which wanted to fix _all_ 
occurrences of that prefix_path() usage bug.

But in case anybody uses Valgrind et al. on git, how about at least a 
comment telling the casual observer why we don't free() the buffers?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] get_pathspec(): free() old buffer if rewriting
  2006-05-06 22:37 ` Linus Torvalds
  2006-05-06 22:44   ` Johannes Schindelin
@ 2006-05-07  4:26   ` Junio C Hamano
  2006-05-07  4:50     ` Junio C Hamano
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2006-05-07  4:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Sun, 7 May 2006, Johannes Schindelin wrote:
>> 
>> This might be the wrong way to do it, but as it is without this patch,
>> get_pathspec() is leaking memory.
>
> I'm not at all convinced we want to do somethng like this.
>
> get_pathspec() is a one-time event. It doesn't "leak" memory. To me, 
> "leaking" is when you continually lose a bit of memory, and you eventually 
> run out. I don't see that happening here.

I agree with that, except that blame does use it for each
commit and loses memory with deep history.

I have been contemplating to revamp pathspec stuff to deal with
not just an array pointers to strings.  tree-diff already uses a
parallel array of ints (pathlens) to optimize away the repeated
use of strlen() for each pathspec elements, and I think we would
be better off using something like that everywhere.  The API I
have in mind goes like this:

struct pathspec; /* opaque */ 

	A type opaque to the caller.

struct pathspec **get_pathspec(const char *prefix,
			       const char **pathspec,
			       int wildcard);

	The caller gives the prefix (return value from
	setup_git_directory(), the user supplied pathspec list,
	and if it wants to use ls-files style wildcard or
	tree-diff style directory prefix bahaviour.  A newly
	allocated array is returned and the caller can free() it
	when done.

int match_pathspec(const char *path, int len,
	           struct pathspec **pathspec);

	See if the path (with length) matches the given spec.
	path[len-1] == '/' signals that the caller is traversing
	a tree and checking if it is worth descending into the
	tree.  In that case, original spec string "foo/bar/baz"
	matches path = "foo/" (with len = 4).  Otherwise the full
	path is checked so that original spec string would match
	path = "foo/bar/baz" (with len = 11), but not "foo"
	(with len = 3).  If the get_pathspec() was called with
	wildcard support, spec string "foo/bar*" matches these:

                "foo/" (i.e. should descend into it),
                "foo/barboz/" (ditto)
                "foo/bar.txt" (matches)

        but not these:

		"fob/" (no point descending into it)
                "foo/bax" (does not match)

A wildcard aware diff-tree, when invoked like this:

	cd Documentation
        git-diff-tree -r A B -- 'ho*'

might do:

	struct pathspec **spec;
        const char *(paths[2]);
	paths[0] = "how*"; paths[1] = NULL;
	spec = get_pathspec("Documentation/", argv, 1);

and when traversing the tree for A and B, upon seeing "Documentation"
entry in the topleve tree object buffers, call:

	path_match("Documentation", 14, 1, spec)

which would return true (worth descending into the directory).
The recursive call would, upon seeing "hooks.txt" entry, call:

	path_match("Documentation/hooks.txt", 23, 0, spec)

which would say true and compares the entry from two trees.
Also, the same recursive invocation would call

	path_match("Documentation/howto", 19, 1, spec)

which would return true to cause it further recurse into it.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] get_pathspec(): free() old buffer if rewriting
  2006-05-07  4:26   ` Junio C Hamano
@ 2006-05-07  4:50     ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2006-05-07  4:50 UTC (permalink / raw)
  To: git

Junio C Hamano <junkio@cox.net> writes:

> int match_pathspec(const char *path, int len,
> 	           struct pathspec **pathspec);
>
> 	See if the path (with length) matches the given spec.
> 	path[len-1] == '/' signals that the caller is traversing
> 	a tree and checking if it is worth descending into the
> 	tree.
> ...
> and when traversing the tree for A and B, upon seeing "Documentation"
> entry in the topleve tree object buffers, call:
>
> 	path_match("Documentation", 14, 1, spec)

Oops, match_pathspec("Documentation/", 15, spec) is what I meant
here.  Likewise in the later examples.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-05-07  4:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-06 22:04 [RFC] get_pathspec(): free() old buffer if rewriting Johannes Schindelin
2006-05-06 22:37 ` Linus Torvalds
2006-05-06 22:44   ` Johannes Schindelin
2006-05-07  4:26   ` Junio C Hamano
2006-05-07  4:50     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).