git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <junkio@cox.net>
To: Linus Torvalds <torvalds@osdl.org>
Cc: git@vger.kernel.org
Subject: Re: [RFC] get_pathspec(): free() old buffer if rewriting
Date: Sat, 06 May 2006 21:26:12 -0700	[thread overview]
Message-ID: <7viroipijf.fsf@assigned-by-dhcp.cox.net> (raw)
In-Reply-To: <Pine.LNX.4.64.0605061532190.16343@g5.osdl.org> (Linus Torvalds's message of "Sat, 6 May 2006 15:37:19 -0700 (PDT)")

Linus Torvalds <torvalds@osdl.org> writes:

> On Sun, 7 May 2006, Johannes Schindelin wrote:
>> 
>> This might be the wrong way to do it, but as it is without this patch,
>> get_pathspec() is leaking memory.
>
> I'm not at all convinced we want to do somethng like this.
>
> get_pathspec() is a one-time event. It doesn't "leak" memory. To me, 
> "leaking" is when you continually lose a bit of memory, and you eventually 
> run out. I don't see that happening here.

I agree with that, except that blame does use it for each
commit and loses memory with deep history.

I have been contemplating to revamp pathspec stuff to deal with
not just an array pointers to strings.  tree-diff already uses a
parallel array of ints (pathlens) to optimize away the repeated
use of strlen() for each pathspec elements, and I think we would
be better off using something like that everywhere.  The API I
have in mind goes like this:

struct pathspec; /* opaque */ 

	A type opaque to the caller.

struct pathspec **get_pathspec(const char *prefix,
			       const char **pathspec,
			       int wildcard);

	The caller gives the prefix (return value from
	setup_git_directory(), the user supplied pathspec list,
	and if it wants to use ls-files style wildcard or
	tree-diff style directory prefix bahaviour.  A newly
	allocated array is returned and the caller can free() it
	when done.

int match_pathspec(const char *path, int len,
	           struct pathspec **pathspec);

	See if the path (with length) matches the given spec.
	path[len-1] == '/' signals that the caller is traversing
	a tree and checking if it is worth descending into the
	tree.  In that case, original spec string "foo/bar/baz"
	matches path = "foo/" (with len = 4).  Otherwise the full
	path is checked so that original spec string would match
	path = "foo/bar/baz" (with len = 11), but not "foo"
	(with len = 3).  If the get_pathspec() was called with
	wildcard support, spec string "foo/bar*" matches these:

                "foo/" (i.e. should descend into it),
                "foo/barboz/" (ditto)
                "foo/bar.txt" (matches)

        but not these:

		"fob/" (no point descending into it)
                "foo/bax" (does not match)

A wildcard aware diff-tree, when invoked like this:

	cd Documentation
        git-diff-tree -r A B -- 'ho*'

might do:

	struct pathspec **spec;
        const char *(paths[2]);
	paths[0] = "how*"; paths[1] = NULL;
	spec = get_pathspec("Documentation/", argv, 1);

and when traversing the tree for A and B, upon seeing "Documentation"
entry in the topleve tree object buffers, call:

	path_match("Documentation", 14, 1, spec)

which would return true (worth descending into the directory).
The recursive call would, upon seeing "hooks.txt" entry, call:

	path_match("Documentation/hooks.txt", 23, 0, spec)

which would say true and compares the entry from two trees.
Also, the same recursive invocation would call

	path_match("Documentation/howto", 19, 1, spec)

which would return true to cause it further recurse into it.

  parent reply	other threads:[~2006-05-07  4:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-06 22:04 [RFC] get_pathspec(): free() old buffer if rewriting Johannes Schindelin
2006-05-06 22:37 ` Linus Torvalds
2006-05-06 22:44   ` Johannes Schindelin
2006-05-07  4:26   ` Junio C Hamano [this message]
2006-05-07  4:50     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7viroipijf.fsf@assigned-by-dhcp.cox.net \
    --to=junkio@cox.net \
    --cc=git@vger.kernel.org \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).