From: Junio C Hamano <junkio@cox.net>
To: Linus Torvalds <torvalds@osdl.org>
Cc: git@vger.kernel.org
Subject: Re: [RFC] get_pathspec(): free() old buffer if rewriting
Date: Sat, 06 May 2006 21:26:12 -0700 [thread overview]
Message-ID: <7viroipijf.fsf@assigned-by-dhcp.cox.net> (raw)
In-Reply-To: <Pine.LNX.4.64.0605061532190.16343@g5.osdl.org> (Linus Torvalds's message of "Sat, 6 May 2006 15:37:19 -0700 (PDT)")
Linus Torvalds <torvalds@osdl.org> writes:
> On Sun, 7 May 2006, Johannes Schindelin wrote:
>>
>> This might be the wrong way to do it, but as it is without this patch,
>> get_pathspec() is leaking memory.
>
> I'm not at all convinced we want to do somethng like this.
>
> get_pathspec() is a one-time event. It doesn't "leak" memory. To me,
> "leaking" is when you continually lose a bit of memory, and you eventually
> run out. I don't see that happening here.
I agree with that, except that blame does use it for each
commit and loses memory with deep history.
I have been contemplating to revamp pathspec stuff to deal with
not just an array pointers to strings. tree-diff already uses a
parallel array of ints (pathlens) to optimize away the repeated
use of strlen() for each pathspec elements, and I think we would
be better off using something like that everywhere. The API I
have in mind goes like this:
struct pathspec; /* opaque */
A type opaque to the caller.
struct pathspec **get_pathspec(const char *prefix,
const char **pathspec,
int wildcard);
The caller gives the prefix (return value from
setup_git_directory(), the user supplied pathspec list,
and if it wants to use ls-files style wildcard or
tree-diff style directory prefix bahaviour. A newly
allocated array is returned and the caller can free() it
when done.
int match_pathspec(const char *path, int len,
struct pathspec **pathspec);
See if the path (with length) matches the given spec.
path[len-1] == '/' signals that the caller is traversing
a tree and checking if it is worth descending into the
tree. In that case, original spec string "foo/bar/baz"
matches path = "foo/" (with len = 4). Otherwise the full
path is checked so that original spec string would match
path = "foo/bar/baz" (with len = 11), but not "foo"
(with len = 3). If the get_pathspec() was called with
wildcard support, spec string "foo/bar*" matches these:
"foo/" (i.e. should descend into it),
"foo/barboz/" (ditto)
"foo/bar.txt" (matches)
but not these:
"fob/" (no point descending into it)
"foo/bax" (does not match)
A wildcard aware diff-tree, when invoked like this:
cd Documentation
git-diff-tree -r A B -- 'ho*'
might do:
struct pathspec **spec;
const char *(paths[2]);
paths[0] = "how*"; paths[1] = NULL;
spec = get_pathspec("Documentation/", argv, 1);
and when traversing the tree for A and B, upon seeing "Documentation"
entry in the topleve tree object buffers, call:
path_match("Documentation", 14, 1, spec)
which would return true (worth descending into the directory).
The recursive call would, upon seeing "hooks.txt" entry, call:
path_match("Documentation/hooks.txt", 23, 0, spec)
which would say true and compares the entry from two trees.
Also, the same recursive invocation would call
path_match("Documentation/howto", 19, 1, spec)
which would return true to cause it further recurse into it.
next prev parent reply other threads:[~2006-05-07 4:26 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-05-06 22:04 [RFC] get_pathspec(): free() old buffer if rewriting Johannes Schindelin
2006-05-06 22:37 ` Linus Torvalds
2006-05-06 22:44 ` Johannes Schindelin
2006-05-07 4:26 ` Junio C Hamano [this message]
2006-05-07 4:50 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7viroipijf.fsf@assigned-by-dhcp.cox.net \
--to=junkio@cox.net \
--cc=git@vger.kernel.org \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).