Git development


* Re: rev-list/tree committer/author information.
From: Petr Baudis @ 2005-05-16 22:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Daniel Barkalow, Git Mailing List
In-Reply-To: <7vekc6onzc.fsf_-_@assigned-by-dhcp.cox.net>

Dear diary, on Mon, May 16, 2005 at 11:33:11PM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> LT> Anyway, everything I've read so far makes sense, and it
> LT> might make sense to continue git development using just
> LT> git-pb. The only thing I personally think sucks is the
> LT> author/committer matching of git-rev-list/tree, since it
> LT> would seem like somebody might well like to match on an
> LT> arbitrary part of a commit, and special-casing
> LT> author/committer seems somewhat broken.
> 
> Well, that author/committer thing is not in git-pb yet, if I am
> not mistaken [*1].
> 
> The only reason why I did it that way was because the strategy
> taken by "struct object" derivatives seemed to pick up bare
> absolute minimum to support actual callers that have immediate
> need for information stored in structural fields, as opposed to
> designing for helping yet to be written callers by adding fields
> to hold information of "having this might also help somebody in
> the future" type.  And the author and committer names are in the
> structured fields while signed-off-by and others are not.  Also
> when author / committer name strings are intern'ed like the way
> I did, the memory consumption even for a long sequence of
> commits are kept reasonably low.  However,...

I like Linus' suggestion. At the very least, what about making the
matching generic for the header? Something like --match-header or
whatever.

> *1* Petr has been applying quite good judgements. I would have
> polluted git-jc with that patch already if I were still running
> it.  So far, I have been generally happy with his acceptance
> criteria for external patches.  Anything he places on hold or
> just outright returns to me, I later find rooms for improvements
> myself, and the later rounds that eventually get accepted always
> turn out to be far cleaner, thanks to his comments.

Thanks for the praise :-), but I'm actually quite unhappy with putting
patches "on hold" and it's not intentional. It's just that I don't feel
right about the patch enough to apply it immediately, and I either don't
have time to voice my concerns if they are non-trivial, or I just want
some time to think about it.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

* Re: [PATCH 2/4] Tweak diff output further to make it a bit less distracting.
From: Petr Baudis @ 2005-05-16 22:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, torvalds
In-Reply-To: <7vvf5kqj9l.fsf@assigned-by-dhcp.cox.net>

Dear diary, on Sun, May 15, 2005 at 11:19:50PM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> Adds a newline between each diff.  Also change "#mode : "
> string, which was misleading in that we are not showing just
> mode when we talk about a file changing into a symlink.
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>

So, I've been looking at the output, and I have to admit that I'm still
not too happy with it (I know I'm horrible). It turned out to be rather
confusing, since there are normally no blank lines in the middle of the
diffs, so it looked as if the blank lines were actually part of the diffed
files.

What about just throwing away the newlines and just passing '@.'?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

* Re: [PATCH 3/4] Implement git-checkout-cache -u to update stat information in the cache.
From: Petr Baudis @ 2005-05-16 22:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, torvalds
In-Reply-To: <7vll6gqj3z.fsf@assigned-by-dhcp.cox.net>

Dear diary, on Sun, May 15, 2005 at 11:23:12PM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> With -u flag, git-checkout-cache picks up the stat information
> from newly created file and updates the cache.  This removes the
> need to run git-update-cache --refresh immediately after running
> git-checkout-cache.

I actually feel ok with this, but I wonder about Linus' opinion about
it.  :-)

> *** The one I posted earlier failed to add a couple of files
> *** that the patch should have added.  Please discard it and
> *** replace it with this one.  Thanks.

That reminds me, the patches would be a little easier for me to process
if you followed the /dev/null convention. (You don't need to rediff your
older patches just because of that, or even bother about it if it's a
problem for you, though.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

* rev-list/tree committer/author information.
From: Junio C Hamano @ 2005-05-16 21:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: pasky, Daniel Barkalow, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505160837080.28162@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Anyway, everything I've read so far makes sense, and it
LT> might make sense to continue git development using just
LT> git-pb. The only thing I personally think sucks is the
LT> author/committer matching of git-rev-list/tree, since it
LT> would seem like somebody might well like to match on an
LT> arbitrary part of a commit, and special-casing
LT> author/committer seems somewhat broken.

Well, that author/committer thing is not in git-pb yet, if I am
not mistaken [*1].

The only reason why I did it that way was because the strategy
taken by "struct object" derivatives seemed to pick up bare
absolute minimum to support actual callers that have immediate
need for information stored in structural fields, as opposed to
designing for helping yet to be written callers by adding fields
to hold information of "having this might also help somebody in
the future" type.  And the author and committer names are in the
structured fields while signed-off-by and others are not.  Also
when author / committer name strings are intern'ed like the way
I did, the memory consumption even for a long sequence of
commits are kept reasonably low.  However,...

LT> I personally suspect that both git-rev-list and git-rev-tree
LT> should have an alternate output format that could be more
LT> easily grepped by subsequent commands. For example, right
LT> now git-rev-list just outputs a list of commit ID's, and it
LT> might make sense to have a flag to just append the commit
LT> message to the output, and zero-terminate it (and if the
LT> commit message has a NUL byte in it, just truncate it at
LT> that point).

I think what you said here makes more sense [*2*].  The above
implies to keep the unpacked raw data as a whole to be
accessible to the callers for at least commit objects and if we
go that route I think it would make more sense to do that
uniformly for everything (probably except for pure "blob"
objects for size concerns but we might as well do them while we
are at it).  On the other hand, the current lifetime rules being
what it is, that strategy may introduce memory consumption
problems when working on a huge project.

[Footnote]

*1* Petr has been applying quite good judgements. I would have
polluted git-jc with that patch already if I were still running
it.  So far, I have been generally happy with his acceptance
criteria for external patches.  Anything he places on hold or
just outright returns to me, I later find rooms for improvements
myself, and the later rounds that eventually get accepted always
turn out to be far cleaner, thanks to his comments.

*2* At least in principle.  I am not quite sure what the output
should look like for rev-tree.

^ permalink raw reply

* Re: git-rev-list  in local commit order
From: Sean @ 2005-05-16 21:25 UTC (permalink / raw)
  To: tglx; +Cc: git
In-Reply-To: <1116195235.11872.213.camel@tglx>

On Sun, May 15, 2005 6:13 pm, Thomas Gleixner said:
> Last try.
>
> A repository Id makes it possible to identify workflows in and across
> repositories.

Sorry, your proposal falls short; an accurate work flow would allow you to
show every repository a commit passed through on the way to its final
destination.  Your proposal does not allow that; as discussed.  Nor does
it handle multiple projects or branches within a single repository.

As noted by others, using git often means the creation of temporary
repositories, hardly something that deserves an identifier.  Git, by
design, doesn't give a hoot about individual repositories.

And you also haven't addressed what to do when someone else uses say,
Linus' repoid, as their own.  It seems like a risk to have the operation
of each repository depend on a value anyone else can duplicate.  Linus
can't control what repoid everyone else uses, he can control the time on
his own machine.  Unique repoid's are an illusion.

> This information is valuable for me and others due to already discussed
> reasons.

Why should everyone else manage repoids in their own personal repository
for you; what value will _they_ get out of it?

> I accept that is irrelevant for you.

Personally I don't really care either way.  But you haven't given one real
example where it is actually needed to do useful work.  Making pretty
graphs on a web page doesn't count if they're not useful to anyone.  You
shouldn't force everyone else to manage repoid's unless there is some
value for _them_.

If you're still going to pursue this, at least make sure repoid is not
mandatory.  If a local repository identifier isn't defined, don't create a
repoid line in the commits.

Sean

^ permalink raw reply

* Re: [RFD] Ignore rules
From: David Greaves @ 2005-05-16 16:05 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git
In-Reply-To: <pan.2005.05.16.09.35.22.73817@smurf.noris.de>

Matthias Urlichs wrote:

>Hi, Jon Seymour wrote:
>
>  
>
>>a. pushing the ignore logic into the core git tools such as git-ls-files
>>
>>b. including the current ignore .* rule as a default ignore rule that
>>can be overridden by a .gitignore file
>>    
>>
>
>I'd say YES to both.
>
>My preferred ignore file logic would be:
>
>- stop at first match (that's more efficient)
>  
>
more efficient true - but then surely 98% of the time you have to check
_all_ patterns since files aren't generally ignored.
And the ability to override earlier matches makes life much easier.
So I say no shortcuts, last pattern to match decides ignore/accept status

>- !pattern prevents exclusion of matching files
>- bash-style shell globs, except that ...
>  - a pattern that starts with / is a regexp
>  - * doesn't cross directory boundaries, but ** does
>  
>
>- I don't need a per-repository (i.e. non-checked-in/propagated)
>  ignore file.
>
I agree.
But for the sake of checking a couple of files it makes sense to define
a complete set of locations.
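
To make the difference between the two precedence rules concrete, here is a
minimal sketch in plain sh. It is not taken from any git or Cogito code; the
function names (rule_matches, is_ignored_first, is_ignored_last) are made up
for illustration, and the rules are ordinary shell globs, so unlike the
proposal quoted above "*" here also matches "/".

```sh
#!/bin/sh
# A rule is a shell glob; a leading "!" means "do not ignore".
# Both is_ignored_* functions return 0 (true) when the path is ignored.

# rule_matches PATH GLOB: succeed if PATH matches GLOB.
rule_matches() {
	case "$1" in
	$2) return 0 ;;
	esac
	return 1
}

# is_ignored_first PATH RULE...: the first matching rule decides.
is_ignored_first() {
	path=$1; shift
	for rule in "$@"; do
		case "$rule" in
		'!'*) rule_matches "$path" "${rule#\!}" && return 1 ;;
		*)    rule_matches "$path" "$rule" && return 0 ;;
		esac
	done
	return 1
}

# is_ignored_last PATH RULE...: the last matching rule decides.
is_ignored_last() {
	path=$1; shift
	ignored=1
	for rule in "$@"; do
		case "$rule" in
		'!'*) rule_matches "$path" "${rule#\!}" && ignored=1 ;;
		*)    rule_matches "$path" "$rule" && ignored=0 ;;
		esac
	done
	return $ignored
}
```

With the rules '*.o' '!important.o' (in that order), is_ignored_first treats
important.o as ignored because '*.o' matches first, while is_ignored_last
keeps it because the later '!important.o' overrides the earlier match.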

David


-- 


^ permalink raw reply

* Re: Summary of core GIT while you are away.
From: Linus Torvalds @ 2005-05-16 16:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: pasky, braddr, nico, david, Git Mailing List
In-Reply-To: <7vzmuy13od.fsf@assigned-by-dhcp.cox.net>

On Sat, 14 May 2005, Junio C Hamano wrote:
>
> Hoping that you had a good time during your vacation, and at the
> same time also hoping to help you catch up with GIT traffic,
> here is my version of the summary of things that happened in the
> GIT community around core GIT [*1*].

Thanks. I'm back and reading email, but have by no means caught up yet.

Anyway, everything I've read so far makes sense, and it might make sense
to continue git development using just git-pb. The only thing I personally
think sucks is the author/committer matching of git-rev-list/tree, since
it would seem like somebody might well like to match on an arbitrary part
of a commit, and special-casing author/committer seems somewhat broken. 
But that's a fairly minor nit.

I personally suspect that both git-rev-list and git-rev-tree should have
an alternate output format that could be more easily grepped by subsequent
commands. For example, right now git-rev-list just outputs a list of
commit ID's, and it might make sense to have a flag to just append the
commit message to the output, and zero-terminate it (and if the commit
message has a NUL byte in it, just truncate it at that point).

Then you could just do

	git-rev-list -v --header HEAD | grep -z 'author[^\n]*Linus'

to tell it to do the "verbose" thing (only showing the header of the 
commit, not the whole message), and grep for "Linus" in the author line.

Or, if you want to see all commits that are signed off by so-and-so, do:

	git-rev-list -v HEAD | grep -zi 'signed-off-by:[^\n]*so-and-so'

and it would still be pretty efficient, and a _lot_ more generic and 
flexible than the "--author" and "--committer" flags.

I also suspect that a lot of uses of git-rev-{tree|list} actually want to 
see the commit stuff anyway, ie things like gitweb etc currently likely 
end up doing a separate "git-cat-file commit <sha1>" to get the whole 
commit information, which the "-v" flag would just give directly.

The only question is whether you want to have it be human-readable by
default (indent the message lines with a <tab>, and nonheaders with
<tab>+4*<space> or something), and then have a "-z" flag to do the
machine-readable version described above.
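
As a rough sketch of the machine-readable variant, assuming the proposed
NUL-terminated output: the "-v" and "--header" flags are only a proposal in
this message, so emit_records below is a hypothetical stand-in that fakes such
output, and match_header is an illustrative name as well. Instead of relying on
how a particular grep treats newlines under -z, the sketch first turns every
record into a single line.

```sh
#!/bin/sh
# Hypothetical stand-in for the proposed "git-rev-list -v --header" output:
# one record per commit, each terminated by a NUL byte.
emit_records() {
	printf 'commit 1111\nauthor Linus Torvalds\ncommitter Petr Baudis\n\0'
	printf 'commit 2222\nauthor Junio C Hamano\ncommitter Petr Baudis\n\0'
}

# match_header HEADER TEXT: print the ID of every commit whose HEADER line
# contains TEXT (a fixed string, not a regexp).  Newlines inside a record
# become tabs and the NUL terminators become newlines, so ordinary
# line-based tools can be used on the result.
match_header() {
	tr '\n\0' '\t\n' |
	awk -F '\t' -v header="$1" -v text="$2" '{
		for (i = 2; i <= NF; i++)
			if (index($i, header " ") == 1 && index($i, text) > 0) {
				sub(/^commit /, "", $1)
				print $1
				break
			}
	}'
}
```

For example, "emit_records | match_header author Linus" prints 1111, and any
other header line (or a Signed-off-by line, if the record carried the full
message) could be matched the same way without a dedicated flag.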

Hmm?

		Linus

^ permalink raw reply

* Re: speeding up cg-log -u
From: Daniel Barkalow @ 2005-05-16 15:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Petr Baudis, Zack Brown, git, Linus Torvalds
In-Reply-To: <7voebboafy.fsf@assigned-by-dhcp.cox.net>

On Mon, 16 May 2005, Junio C Hamano wrote:

> >>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> DB> Existence is the primary thing, and everything else was added as
> DB> needed. (Pure connectivity is a bit special, because it's a property of
> DB> generic objects so that fsck-cache doesn't need to know about particular
> DB> types of objects unless there are particular things to check about them)
> 
> DB> If you need more fields, let me know, and I'll figure out how to include
> DB> them.
> 
> Could you take a look at the latest round of the patch and see
> what I did there makes sense?
> 
>     From: Junio C Hamano <junkio@cox.net>
>     Date: Sun, 15 May 2005 14:18:36 -0700
>     Message-ID: <7vy8agqjbn.fsf@assigned-by-dhcp.cox.net>
>     Subject: [PATCH 1/4] Add --author and --committer match
>              to git-rev-list and git-rev-tree.

It seems generally good to me.

I think it would fit stylistically a bit better if the "mark" field on the
person names were left for programs to use however they wanted, and the
"interesting" determination were done in the programs (or, since there are
two with the same characteristics, a new file they both link against).

Alternatively, put the used bit definitions in the header file and have a
mask for unused flags.

> Another thing I wanted to ask you about was lifetime rules.
> When we "lookup" these objects (and then "parse" them, which
> causes more memory to be used), who is responsible for freeing
> them?  When my program thinks it is done with a commit, is it
> allowed to free it?  Or does the lookup machinery own all of the
> objects that have ever been looked up, and the program should
> not worry about freeing them to begin with?

The lookup machinery owns all of the objects that have been looked up. The
thing is that the program can never effectively tell if it's really done
with a commit, because some other branch it's following could have
incorrect dates and suddenly turn out to be descended from a commit that
it freed, and things would likely misbehave if the object were looked up
again, since the flags would be reset.

We could have something that causes them to be reset to unparsed, if the
program thinks that, even if it references the same object again, it won't
need the contents.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: README rewrite
From: Zack Brown @ 2005-05-16 15:16 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Wink Saville, git
In-Reply-To: <20050515172802.GH13024@pasky.ji.cz>

On Sun, May 15, 2005 at 07:28:03PM +0200, Petr Baudis wrote:
> Dear diary, on Sun, May 15, 2005 at 05:53:15PM CEST, I got a letter
> where Zack Brown <zbrown@tumblerings.org> told me that...
> > On Sun, May 15, 2005 at 08:40:41AM -0700, Zack Brown wrote:
> > > This much I think I understand. What I don't understand is how to actually use
> > > branches. I don't see a Cogito command to create or destroy them.
> > 
> > Or I'm blind. The cg-branch-add command is right there. It also has a long
> > comment at the top of the script. Unfortunately the comment only describes how
> > to use the command, not what exactly branches are or how to work with them.
> > 
> > Clearly 'branches' are diverging branches of development. But if I have my
> > own tree, with several branches in it, it's unclear to me how to specify
> > which branch I'm actually working on at any given moment.
> 
> I think it's actually very BKish. Each repository has its own "master"
> branch, which always corresponds with your current branch of
> development. That is, your working tree is always represented by the
> "master" branch.
> 
> The rest of branches are "remote", that is they just point at the other
> repositories. When you want to get the new changes from them, you
> cg-pull, or cg-update to merge them to your branch too.
> 
> So if you want to create a new branch, you cg-clone the original branch.
> And if you want to refer to the new branch in any other branch, you
> cg-branch-add it in the other branch.

So a branch is just a name for a cloned tree somewhere, the same as a tag is
just a name for a revision some time in the past?

> 
> So the local branch is the "master" branch, the rest are "remote"
> branches. Note that there is a theoretical support for multiple local
> branches, but I decided not to make things even more confusing and there
> is no Cogito interface for managing them now.

Is there anything about the repository that 'knows' which is the master branch,
or is this just a matter of which person is in charge? So, if I have a project,
and I have a Cogito repository, so far it's just me, and just one branch.

Then another person joins the project, and they clone my repository onto their
local system, and give it their own branch name.

Now here is the question:

We decide that the other person is a better project leader, and we decide to use
their branch as the master branch, and mine as just a remote branch.

Would that be normal Cogito behavior? i.e. there is nothing to distinguish a
'master' branch from any other except that it is the one everyone says is the
master branch?

Be well,
Zack

> 
> I will add cg-switch which will switch the master branch to some other
> branch (e.g. cg-switch linus will rename your current master to
> master-1234 or something, update your "origin" branch to point to the
> "linus" branch, and make your "master" branch to point at the same
> commit as the "origin" branch). I might also do something like
> cg-branch-add --local, which will add a local branch and you could then
> cg-switch to it too.
> 
> -- 
> 				Petr "Pasky" Baudis
> Stuff: http://pasky.or.cz/
> C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

-- 
Zack Brown

^ permalink raw reply

* Re: [RFD] Ignore rules
From: Matthias Urlichs @ 2005-05-16  9:35 UTC (permalink / raw)
  To: git
In-Reply-To: <2cfc4032050514181127c02e43@mail.gmail.com>

Hi, Jon Seymour wrote:

> a. pushing the ignore logic into the core git tools such as git-ls-files
> 
> b. including the current ignore .* rule as a default ignore rule that
> can be overridden by a .gitignore file

I'd say YES to both.

My preferred ignore file logic would be:

- stop at first match (that's more efficient)
- !pattern prevents exclusion of matching files
- bash-style shell globs, except that ...
  - a pattern that starts with / is a regexp
  - * doesn't cross directory boundaries, but ** does
- I don't need a per-repository (i.e. non-checked-in/propagated)
  ignore file.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@smurf.noris.de



^ permalink raw reply

* Re: Mercurial 0.4e vs git network pull
From: Matthias Urlichs @ 2005-05-16  9:29 UTC (permalink / raw)
  To: git; +Cc: linux-kernel
In-Reply-To: <200505151122.j4FBMJa01073@adam.yggdrasil.com>

Hi, Adam J. Richter wrote:

> 	Being able to do without a server side CGI script might
> encourage deployment a bit more, both for security reasons and
> effort of deployment.

A simple server-side CGI would be a "send me all changeset SHA-1s,
starting at HEAD until you reach FOO" operation (FOO being the SHA1 of
the previous head you've pulled before). This operation is simple enough
that people should have no problem installing such a CGI.

You could then stream-pull the actual contents over HTTP/1.1 without
further CGI interaction.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@smurf.noris.de

^ permalink raw reply

* Re: [PATCH cogito] "cg-whatsnew" command
From: Catalin Marinas @ 2005-05-16  8:33 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git
In-Reply-To: <pan.2005.05.15.17.36.37.623874@smurf.noris.de>

[-- Attachment #1: Type: text/plain, Size: 283 bytes --]

Matthias Urlichs <smurf@smurf.noris.de> wrote:
>> +	cg-diff		[-p] [-r FROM_ID[:TO_ID]] [-m [BNAME] [BNAME]] [FILE]...
>
> That should be
>
> [-m [BNAME [BNAME]]]

You are right.

> though I'd suggest something more mnemonic than two BNAMEs.

Another try, see attached.

-- 
Catalin


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch-m-option --]
[-- Type: text/x-patch, Size: 4529 bytes --]

"-m" option added to cg-diff, cg-log and cg-mkpatch

This option takes two optional parameters, branch2 and branch1, and shows
the changes in branch2 not yet merged to branch1. Branch2 defaults to
origin and branch1 to HEAD.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

---
commit 54c787f373557940617bca2c206c731d04c7e07b
tree d46bb25bdadb369e6cbf28ca25ffaeb4b41f7381
parent fa6e9eb368e949e78c4e66217461cf624b52b0a2
author Catalin Marinas <cmarinas@pc1117.cambridge.arm.com> Mon, 16 May 2005 09:30:08 +0100
committer Catalin Marinas <cmarinas@pc1117.cambridge.arm.com> Mon, 16 May 2005 09:30:08 +0100

 cg-diff    |   19 +++++++++++++++++++
 cg-help    |    8 +++++---
 cg-log     |   19 +++++++++++++++++++
 cg-mkpatch |   19 +++++++++++++++++++
 4 files changed, 62 insertions(+), 3 deletions(-)

Index: cg-diff
===================================================================
--- de641904363cd3759f132ee7c0dfaf8a2ee58388/cg-diff  (mode:100755)
+++ d46bb25bdadb369e6cbf28ca25ffaeb4b41f7381/cg-diff  (mode:100755)
@@ -14,6 +14,9 @@
 # -p instead of one ID denotes a parent commit to the specified ID
 # (which must not be a tree, obviously).
 #
+# -m [branch2] [branch1] shows the changes in branch2 (defaulting to
+# origin) not yet merged to branch1 (defaulting to HEAD)
+#
 # Outputs a diff converting the first tree to the second one.
 
 . ${COGITO_LIB}cg-Xlib
@@ -31,6 +34,22 @@
 	parent=1
 fi
 
+if [ "$1" = "-m" ]; then
+	branch=HEAD
+	id2=origin
+	shift
+	if [ "$1" ]; then
+		id2=$1
+		shift
+	fi
+	if [ "$1" ]; then
+		branch=$1
+		shift
+	fi
+	id1=$(git-merge-base "$branch" "$id2")
+	[ "$id1" ] || die "Unable to determine the merge base"
+fi
+
 if [ "$1" = "-r" ]; then
 	shift
 	id1=$(echo "$1": | cut -d : -f 1)
Index: cg-help
===================================================================
--- de641904363cd3759f132ee7c0dfaf8a2ee58388/cg-help  (mode:100755)
+++ d46bb25bdadb369e6cbf28ca25ffaeb4b41f7381/cg-help  (mode:100755)
@@ -26,14 +26,16 @@
 	cg-cancel
 	cg-clone	[-s] SOURCE_LOC [DESTDIR]
 	cg-commit	[-m"Commit message"]... [-e | -E] [FILE]... < log message
-	cg-diff		[-p] [-r FROM_ID[:TO_ID]] [FILE]...
+	cg-diff		[-p] [-r FROM_ID[:TO_ID]] [-m [SRC_BNAME [DST_BNAME]]] \\
+			[FILE]...
 	cg-export	DEST [TREE_ID]
 	cg-help		[COMMAND]
 	cg-init
-	cg-log		[-c] [-f] [-r FROM_ID[:TO_ID]] [FILE]...
+	cg-log		[-c] [-f] [-r FROM_ID[:TO_ID]] \\
+			[-m [SRC_BNAME [DST_BNAME]]] [FILE]...
 	cg-ls		[TREE_ID]
 	cg-merge	[-c] [-b BASE_ID] FROM_ID
-	cg-mkpatch	[-s] [-r FROM_ID[:TO_ID]]
+	cg-mkpatch	[-s] [-r FROM_ID[:TO_ID]] [-m [SRC_BNAME [DST_BNAME]]]
 	cg-patch			< patch on stdin
 	cg-pull		[BNAME]
 	cg-restore	[FILE]...
Index: cg-log
===================================================================
--- de641904363cd3759f132ee7c0dfaf8a2ee58388/cg-log  (mode:100755)
+++ d46bb25bdadb369e6cbf28ca25ffaeb4b41f7381/cg-log  (mode:100755)
@@ -22,6 +22,9 @@
 # (HEAD by default), or id1:id2 representing an (id1;id2] range
 # of commits to show.
 #
+# -m [branch2] [branch1] shows the changes in branch2 (defaulting to
+# origin) not yet merged to branch1 (defaulting to HEAD)
+#
 # The rest of arguments are took as filenames; cg-log then displays
 # only changes in those files.
 
@@ -94,6 +97,22 @@
 
 log_start=
 log_end=
+if [ "$1" = "-m" ]; then
+	branch=HEAD
+	log_end=origin
+	shift
+	if [ "$1" ]; then
+		log_end=$1
+		shift
+	fi
+	if [ "$1" ]; then
+		branch=$1
+		shift
+	fi
+	log_start=$(git-merge-base "$branch" "$log_end")
+	[ "$log_start" ] || die "Unable to determine the merge base"
+fi
+
 if [ "$1" = "-r" ]; then
 	shift
 	log_start="$1"
Index: cg-mkpatch
===================================================================
--- de641904363cd3759f132ee7c0dfaf8a2ee58388/cg-mkpatch  (mode:100755)
+++ d46bb25bdadb369e6cbf28ca25ffaeb4b41f7381/cg-mkpatch  (mode:100755)
@@ -9,6 +9,9 @@
 #
 # Takes an -r followed with ID defaulting to HEAD, or id1:id2, forming
 # a range (id1;id2]. (Use "id1:" to take just everything from id1 to HEAD.)
+#
+# -m [branch2] [branch1] shows the changes in branch2 (defaulting to
+# origin) not yet merged to branch1 (defaulting to HEAD)
 
 . ${COGITO_LIB}cg-Xlib
 
@@ -65,6 +68,22 @@
 
 log_start=
 log_end=
+if [ "$1" = "-m" ]; then
+	branch=HEAD
+	log_end=origin
+	shift
+	if [ "$1" ]; then
+		log_end=$1
+		shift
+	fi
+	if [ "$1" ]; then
+		branch=$1
+		shift
+	fi
+	log_start=$(git-merge-base "$branch" "$log_end")
+	[ "$log_start" ] || die "Unable to determine the merge base"
+fi
+
 if [ "$1" = "-r" ]; then
 	shift
 	log_start="$1"

^ permalink raw reply
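
The "-m" handling in the patch above is repeated in cg-diff, cg-log and
cg-mkpatch. As a minimal sketch only (not part of the patch), the shared logic
could look like the helper below. The names parse_m_option, m_src, m_dst,
m_base and merge_base_cmd are hypothetical; merge_base_cmd exists purely so
the sketch can be tried without a repository, and it defaults to the
git-merge-base command the patch calls.

```sh
#!/bin/sh
# Command used to find the common ancestor; override it for experiments.
: "${merge_base_cmd:=git-merge-base}"

# parse_m_option [SRC_BRANCH [DST_BRANCH]]
# Sets m_src (default "origin") and m_dst (default "HEAD"), the same
# defaults as in the patch, then stores their merge base in m_base.
# Returns non-zero if no merge base could be determined.
parse_m_option() {
	m_src=origin
	m_dst=HEAD
	if [ -n "${1:-}" ]; then
		m_src=$1
		shift
	fi
	if [ -n "${1:-}" ]; then
		m_dst=$1
	fi
	m_base=$("$merge_base_cmd" "$m_dst" "$m_src") || return 1
	[ -n "$m_base" ]
}
```

A caller would then use m_base and m_src where the patch uses id1/id2 or
log_start/log_end, and would still need to shift the consumed arguments itself.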

* Re: speeding up cg-log -u
From: Junio C Hamano @ 2005-05-16  8:13 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Petr Baudis, Zack Brown, git, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505151158300.30848-100000@iabervon.org>

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> Existence is the primary thing, and everything else was added as
DB> needed. (Pure connectivity is a bit special, because it's a property of
DB> generic objects so that fsck-cache doesn't need to know about particular
DB> types of objects unless there are particular things to check about them)

DB> If you need more fields, let me know, and I'll figure out how to include
DB> them.

Could you take a look at the latest round of the patch and see
what I did there makes sense?

    From: Junio C Hamano <junkio@cox.net>
    Date: Sun, 15 May 2005 14:18:36 -0700
    Message-ID: <7vy8agqjbn.fsf@assigned-by-dhcp.cox.net>
    Subject: [PATCH 1/4] Add --author and --committer match
             to git-rev-list and git-rev-tree.

Another thing I wanted to ask you about was lifetime rules.
When we "lookup" these objects (and then "parse" them, which
causes more memory to be used), who is responsible for freeing
them?  When my program thinks it is done with a commit, is it
allowed to free it?  Or does the lookup machinery own all of the
objects that have ever been looked up, and the program should
not worry about freeing them to begin with?

^ permalink raw reply

* Re: [PATCH 0/4] Pulling refs files
From: Junio C Hamano @ 2005-05-16  7:55 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Petr Baudis, git, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505151129200.30848-100000@iabervon.org>

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> On Sat, 14 May 2005, Junio C Hamano wrote:
>> I am having a bit hard time understanding how the end user uses
>> what you are trying to give them.  Is the basic idea to let them
>> say "I want to get Pasky's $GIT_DIR/refs/heads/master and store
>> it in my $GIT_DIR/refs/heads/git-pb, and then I want to start
>> the pull starting from the commit recorded in that ref"?

DB> Yes. This would be: git-http-pull -w heads/git-pb heads/master
DB> http://www.kernel.org/pub/scm/cogito/git-pb.git/

That would be quite handy.  IIRC, Pasky had a gripe about
limiting this under refs hierarchy (that's why you can write
"heads/master" in your example not "refs/heads/master", though),
and I am sympathetic to it.  Giving unlimited download access
anywhere under foobar.git/ would be fine.  About the receiving
side, letting things to be written anywhere the user wants would
also be fine.

Since this will all be scripted anyway, I do not mind if the
above example needs to be spelled as:

 $ git-http-pull -w ${GIT_DIR-.git}/refs/heads/git-pb refs/heads/master \
   http://www.kernel.org/pub/scm/cogito/git-pb.git/

^ permalink raw reply

* [PATCH 2/2] Add sample ignore logic to git-run-with-user-path command.
From: Junio C Hamano @ 2005-05-16  6:06 UTC (permalink / raw)
  To: pasky; +Cc: git, torvalds

This adds a sample ignore file logic to git-run-with-user-path
command.  This is primarily to serve as an example for plugging
ignore file logic to the previously introduced framework, and to
spur mailing list discussions on what the final ignore file
logic should be, and where the information should come from.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/git-run-with-user-path.txt |   32 ++++++++
Makefile                                 |    6 +
paths.c                                  |  117 +++++++++++++++++++++++++++++--
t/t7001-git-run-with-user-path-ignore.sh |   67 +++++++++++++++++
4 files changed, 215 insertions(+), 7 deletions(-)
t/t7001-git-run-with-user-path-ignore.sh (. --> 100755)

--- a/Documentation/git-run-with-user-path.txt
+++ b/Documentation/git-run-with-user-path.txt
@@ -53,6 +53,38 @@
 	--no-ignore flag, there is no such filtering done.
 
 
+IGNORE FILES
+------------
+
+This command currently uses a pcre based implementation to express
+ignore patterns.  The purpose of this implementation is to primarily
+serve as an example and to start GIT mailing list discussions, and by no
+means is cast in stone.  This section describes what this sample
+implementation does.
+
+The information used to define which paths to ignore is read from two
+files.  Both files use the same syntax.
+
+First, $GIT_DIR/ignore is read.  Then, the file whose path (relative to
+the project top) is recorded in $GIT_DIR/info/ignore-file is read next.
+The latter file is expected to be revision controlled with GIT.
+
+These two files should record one ignore record per line.  A line that
+is empty, and a line that starts with a '#' are ignored and used as
+comments.
+
+Each ignore record is a pcre regular expression, optionally prefixed
+with a '!'.  To determine if a path is to be ignored, the path is
+matched against each ignore record in the order they appear in the
+ignore file.  If an ignore record matches the path and does not have
+the optional '!', the path is ignored.  If the matching record has the
+'!', the path is not ignored.  In either case, the remaining ignore
+records are not consulted after the first match.  This means (1) an
+earlier entry in an ignore file has precedence over later ones, and
+(2) the entries in $GIT_DIR/ignore have precedence over the ones in
+the file named by $GIT_DIR/info/ignore-file.
+
+
 ENVIRONMENT VARIABLES
 ---------------------
 
--- a/Makefile
+++ b/Makefile
@@ -54,6 +54,12 @@
 LIBS = $(LIB_FILE)
 LIBS += -lz
 
+IGNORE_USING_PCRE=1
+
+ifdef IGNORE_USING_PCRE
+  LIBS += -lpcreposix
+endif
+
 ifdef MOZILLA_SHA1
   SHA1_HEADER="mozilla-sha1/sha1.h"
   LIB_OBJS += mozilla-sha1/sha1.o
--- a/paths.c
+++ b/paths.c
@@ -2,6 +2,7 @@
  * Copyright (c) 2005 Junio C Hamano
  */
 #include <string.h>
+#include <pcreposix.h>
 #include "cache.h"
 #include "paths.h"
 
@@ -37,23 +38,125 @@
 	}
 }
 
+static struct ignore_entry {
+	int negate;
+	regex_t regexp;
+} **ignore_list;
+static int ignore_nr;
+static int ignore_alloc;
+
+static void add_ignore(const char *buf)
+{
+	struct ignore_entry *ie = xmalloc(sizeof(*ie));
+	if (buf[0] == '!') {
+		ie->negate = 1;
+		buf++;
+	}
+	else
+		ie->negate = 0;
+
+	if (regcomp(&(ie->regexp), buf, 0)) {
+		fprintf(stderr, "bad regexp <%s>\n", buf);
+		free(ie);
+		return;
+	}
+	if (ignore_alloc <= ignore_nr) {
+		ignore_alloc = alloc_nr(ignore_alloc);
+		ignore_list = xrealloc(ignore_list,
+				       ignore_alloc * sizeof(ie));
+	}
+	ignore_list[ignore_nr++] = ie;
+}
+
+static void read_ignore_list(const char *path)
+{
+	FILE *in;
+	char buf[1024];
+	in = fopen(path, "r");
+	if (!in)
+		return;
+	while (fgets(buf, sizeof(buf), in) != NULL) {
+		int l = strlen(buf);
+		/* Skip blank lines, # comments, and lines with no final LF. */
+		if (buf[0] != '#' && buf[0] != '\n' && buf[l-1] == '\n') {
+			buf[l-1] = 0;
+			add_ignore(buf);
+		}
+	}
+	fclose(in);
+}
+
+static void read_ignore_list_from_file(const char *path)
+{
+	char filename[PATH_MAX];
+	int len;
+	FILE *in;
+
+	in = fopen(path, "r");
+	if (!in)
+		return;
+	strcpy(filename, git_project_top);
+	len = strlen(filename);
+	filename[len++] = '/';
+	if (fgets(filename + len, sizeof(filename) - len, in) == NULL) {
+		fclose(in);
+		return;
+	}
+	fclose(in);
+	len = strlen(filename);
+	if (filename[len-1] != '\n')
+		return;
+	filename[len-1] = 0;
+	read_ignore_list(filename);
+}
+
 static int initialize_ignore_list(void)
 {
-	/* Put the Porcelain layer ignore logic initialization here.
-	 * Return non-zero after issuing appropriate error message
-	 * if initialization fails.
-	 */
+	char *git_dir = gitenv("GIT_DIR");
+	char path[PATH_MAX];
+	int git_dir_len;
+
+	if (! git_dir)
+		sprintf(path, "%s/.git", git_project_top);
+	else
+		strcpy(path, git_dir);
+	git_dir_len = strlen(path);
+	path[git_dir_len++] = '/';
+
+	/* read private list first, and then shared list. */
+	strcpy(path + git_dir_len, "ignore");
+	read_ignore_list(path);
+
+	strcpy(path + git_dir_len, "info/ignore-file");
+	read_ignore_list_from_file(path);
+
 	return 0;
 }
 
 int path_ignored(const char *path)
 {
+	int i;
+
 	if (!verify_path(path))
 		return 1;
 
-	/* Put the Porcelain layer ignore logic here.
-	 * Return non-zero if path is to be ignored.
-	 */
+	for (i = 0; i < ignore_nr; i++) {
+		int status;
+		regmatch_t pmatch[10];
+		char errbuf[1024];
+
+		status = regexec(&(ignore_list[i]->regexp), path,
+				 sizeof(pmatch)/sizeof(pmatch[0]),
+				 pmatch, 0);
+		if (!status)
+			return !ignore_list[i]->negate;
+		if (status == REG_NOMATCH)
+			continue;
+
+		regerror(status, &(ignore_list[i]->regexp), errbuf,
+			 sizeof(errbuf));
+		fprintf(stderr, "pcre regexp execution error <%s>\n", errbuf);
+	}
 	return 0;
 }
 
--- a/t/t7001-git-run-with-user-path-ignore.sh
+++ b/t/t7001-git-run-with-user-path-ignore.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+#
+# Copyright (c) 2005, Junio C Hamano
+#
+
+test_description='git-run-with-user-path basic test (part #2).
+
+The command is used to help running core GIT commands that always
+expect to be run from the top level directory (i.e. the directory
+that corresponds to the top of tree GIT_INDEX_FILE describes).
+
+It knows how to handle ignore files convention used by the Porcelain
+layer implementation.
+'
+
+. ./test-lib.sh
+
+LF='
+'
+HERE=$(pwd)
+
+test_expect_success \
+setup '
+echo ".*1\$" >.git/ignore &&
+echo ".*0\$" >dontdiff &&
+mkdir .git/info &&
+echo "dontdiff" >.git/info/ignore-file &&
+mkdir path0 path1 path1/path2 &&
+for p in path0/file0 path1/file1 path1/path2/file2
+do
+    echo hello >$p || exit 1
+done
+'
+
+cd $HERE
+test_expect_success \
+'finding paths from a subdirectory' '
+    case "$(cd path0 &&
+            git-run-with-user-path echo -- \
+	    file0 ../path1/path2/file2)" in
+    "path1/path2/file2") : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+cd $HERE
+test_expect_success \
+'feeding find output via xargs from a subdirectory' '
+    case "$(cd path0 &&
+	    find . ../path1 -type f -print0 |
+	    xargs -r -0 git-run-with-user-path ls -1 --)" in
+    "path1/path2/file2") : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+cd $HERE
+test_expect_success \
+'using !negate pattern' '
+    echo "!path0/file0$" >>.git/ignore &&
+    case "$(git-run-with-user-path ls -1 -- path0/* path1/file1)" in
+    "path0/file0") : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+test_done
------------------------------------------------


^ permalink raw reply

* [PATCH 1/2] Introduce git-run-with-user-path helper program.
From: Junio C Hamano @ 2005-05-16  6:05 UTC (permalink / raw)
  To: pasky; +Cc: git, torvalds

Introduce git-run-with-user-path helper program.

A new command git-run-with-user-path takes a command and paths
that are filesystem paths (either relative to the cwd or
absolute pathname).  It canonicalizes these paths to be usable
by the core GIT commands, filters using the ignore pattern rule,
chdir(2)'s to the top level of the tree and runs the given
command with these canonicalized paths as its arguments.

This version contains the hooks needed to implement the ignore
pattern rule, but it does not implement any ignore pattern
rules yet, pending further mailing list discussion.
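
To make the canonicalization step concrete, here is a minimal
sketch; it is not the code in this patch.  It assumes the input is
already an absolute path, and the names normalize_path() and
below_top() are hypothetical.  Empty components and "." are dropped,
".." removes the previous component, and the result is then compared
against a project top that is assumed to have no trailing slash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): lexically normalize an
 * absolute path.  Returns a malloc'ed string the caller must free,
 * or NULL on allocation failure.
 */
static char *normalize_path(const char *path)
{
	size_t o = 0;			/* bytes written to out so far */
	const char *cp = path;
	char *out;

	assert(path != NULL && path[0] == '/');	/* absolute paths only */
	out = malloc(strlen(path) + 2);
	if (!out)
		return NULL;

	while (*cp) {
		const char *ep;
		size_t n;

		while (*cp == '/')	/* skip runs of slashes */
			cp++;
		ep = strchr(cp, '/');
		if (!ep)
			ep = cp + strlen(cp);
		n = (size_t)(ep - cp);

		if (n == 0 || (n == 1 && cp[0] == '.')) {
			/* nothing left, or "." : emit nothing */
		} else if (n == 2 && cp[0] == '.' && cp[1] == '.') {
			/* "..": drop the last component, never above "/" */
			while (o > 0 && out[--o] != '/')
				;
		} else {
			out[o++] = '/';
			memcpy(out + o, cp, n);
			o += n;
		}
		cp = ep;
	}
	if (o == 0)
		out[o++] = '/';
	out[o] = '\0';
	return out;
}

/* Part of norm below top, "" if they are equal, NULL if outside. */
static const char *below_top(const char *norm, const char *top)
{
	size_t n = strlen(top);

	if (strncmp(norm, top, n))
		return NULL;
	if (norm[n] == '\0')
		return norm + n;
	if (norm[n] == '/')
		return norm + n + 1;
	return NULL;
}
```

With a project top of /usr/src/linux, an input such as
/usr/src/linux/fs/../include//linux/. would normalize to
/usr/src/linux/include/linux, and below_top() would hand back
include/linux.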

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/git-run-with-user-path.txt |   79 ++++++++++++
Makefile                                 |    7 -
paths.c                                  |  199 +++++++++++++++++++++++++++++++
paths.h                                  |   14 ++
run-with-user-path.c                     |   61 +++++++++
t/README                                 |    1 
t/t7000-git-run-with-user-path-basic.sh  |   66 ++++++++++
update-cache.c                           |   29 ----
8 files changed, 427 insertions(+), 29 deletions(-)
Documentation/git-run-with-user-path.txt (. --> 100644)
paths.c (. --> 100644)
paths.h (. --> 100644)
run-with-user-path.c (. --> 100644)
t/t7000-git-run-with-user-path-basic.sh (. --> 100755)

--- a/Documentation/git-run-with-user-path.txt
+++ b/Documentation/git-run-with-user-path.txt
@@ -0,0 +1,79 @@
+git-run-with-user-path(1)
+=========================
+v0.1, May 2005
+
+NAME
+----
+git-run-with-user-path - Run command from the top after canonicalizing paths.
+
+
+SYNOPSIS
+--------
+'git-run-with-user-path' [options] <command> <argument>... '--' <path>...
+
+DESCRIPTION
+-----------
+This command takes a <command>, zero or more <argument> and zero
+or more <path> arguments.  <path> arguments name objects on the
+filesystem, <command> is typically a core GIT command, and
+<argument> are the initial arguments to the <command>.
+
+It first finds the project top directory (the directory that corresponds
+to the top of the tree structure GIT_INDEX_FILE describes).  When the
+environment variable GIT_PROJECT_TOP is set, the value of the variable
+is used.  Then the <path> parameters are canonicalized to be relative to
+the project top.  It then chdir(2)'s to the project top directory and
+runs the given <command>, with <argument> and these canonicalized <path>
+arguments.
+
+This is useful for the Porcelain layer to run core GIT commands from
+subdirectories.  For example, if linux-2.6.git tree is checked out in
+/usr/src/linux, you can do:
+
+    $ cd /usr/src/linux/fs
+    $ ... work in fs directory making changes ...
+    $ git-run-with-user-path git-diff-tree -r HEAD -- ext? ../include/linux
+    $ find ext? ../include/linux ! -type d -print0 |
+      xargs -0 git-run-with-user-path git-update-cache --add -- --
+
+The above is roughly equivalent to:
+
+    $ cd /usr/src/linux
+    $ git-diff-tree -r HEAD fs/ext? include/linux
+    $ find fs/ext? include/linux ! -type d -print0 |
+      xargs -0 git-update-cache --add --
+
+
+OPTIONS
+-------
+--no-ignore::
+
+	By default, the path arguments are filtered with the
+	same ignore rules Porcelain layers use.  With
+	--no-ignore flag, there is no such filtering done.
+
+
+ENVIRONMENT VARIABLES
+---------------------
+
+'GIT_PROJECT_TOP'::
+	If the 'GIT_PROJECT_TOP' environment variable is set
+	then it specifies the directory that corresponds to the
+	top level of the tree structure GIT_INDEX_FILE describes.
+	When this environment variable is not defined, the
+	closest parent directory that has a .git/ subdirectory
+	is searched for and used.
+
+
+Author
+------
+Written by Junio C Hamano <junkio@cox.net>
+
+Documentation
+--------------
+Documentation by Junio C Hamano.
+
+GIT
+---
+Part of the link:git.html[git] suite
+
--- a/Makefile
+++ b/Makefile
@@ -28,7 +28,7 @@
 	git-unpack-file git-export git-diff-cache git-convert-cache \
 	git-http-pull git-rpush git-rpull git-rev-list git-mktag \
 	git-diff-helper git-tar-tree git-local-pull git-write-blob \
-	git-get-tar-commit-id
+	git-get-tar-commit-id git-run-with-user-path
 
 all: $(PROG)
 
@@ -46,6 +46,9 @@
 LIB_H += diff.h
 LIB_OBJS += diff.o
 
+LIB_H += paths.h
+LIB_OBJS += paths.o
+
 LIB_OBJS += gitenv.o
 
 LIBS = $(LIB_FILE)
@@ -100,6 +103,7 @@
 git-rpush: rsh.c
 git-rpull: rsh.c pull.c
 git-rev-list: rev-list.c
+git-run-with-user-path: run-with-user-path.c 
 git-mktag: mktag.c
 git-diff-helper: diff-helper.c
 git-tar-tree: tar-tree.c
@@ -117,6 +121,7 @@
 sha1_file.o: $(LIB_H)
 usage.o: $(LIB_H)
 diff.o: $(LIB_H)
+paths.o: $(LIB_H)
 strbuf.o: $(LIB_H)
 gitenv.o: $(LIB_H)
 
--- a/paths.c
+++ b/paths.c
@@ -0,0 +1,199 @@
+/*
+ * Copyright (c) 2005 Junio C Hamano
+ */
+#include <string.h>
+#include "cache.h"
+#include "paths.h"
+
+/****************************************************************/
+
+/* Ignore list handling part */
+
+/*
+ * We fundamentally don't like some paths: we don't want
+ * dot or dot-dot anywhere, and in fact, we don't even want
+ * any other dot-files (.git or anything else). They
+ * are hidden, for chist sake.
+ *
+ * Also, we don't want double slashes or slashes at the
+ * end that can make pathnames ambiguous.
+ */
+int verify_path(const char *path)
+{
+	char c;
+
+	goto inside;
+	for (;;) {
+		if (!c)
+			return 1;
+		if (c == '/') {
+inside:
+			c = *path++;
+			if (c != '/' && c != '.' && c != '\0')
+				continue;
+			return 0;
+		}
+		c = *path++;
+	}
+}
+
+static int initialize_ignore_list(void)
+{
+	/* Put the Porcelain layer ignore logic initialization here.
+	 * Return non-zero after issuing appropriate error message
+	 * if initialization fails.
+	 */
+	return 0;
+}
+
+int path_ignored(const char *path)
+{
+	if (!verify_path(path))
+		return 1;
+
+	/* Put the Porcelain layer ignore logic here.
+	 * Return non-zero if path is to be ignored.
+	 */
+	return 0;
+}
+
+
+/****************************************************************/
+
+/* Path canonicalization part */
+
+char *git_project_top = NULL;
+static char git_cwd[PATH_MAX];
+
+static int find_project_top(void)
+{
+	char path[PATH_MAX];
+	int dir_length;
+
+	if (!getcwd(git_cwd, sizeof(git_cwd)))
+		return error("cannot get cwd to find GIT_PROJECT_TOP");
+
+	git_project_top = gitenv("GIT_PROJECT_TOP");
+	if (git_project_top)
+		return 0;
+
+	strcpy(path, git_cwd);
+	while (path[0] && strcmp(path, "/") && !git_project_top) {
+		char *cp;
+		struct stat st;
+		dir_length = strlen(path);
+		path[dir_length] = '/';
+
+		strcpy(path + dir_length + 1, ".git");
+		if (stat(path, &st) < 0) {
+			if (errno != ENOENT)
+				return error("%s: %s", path, strerror(errno));
+			/* notfound */
+		}
+		else if (S_ISDIR(st.st_mode)) {
+			path[dir_length] = 0;
+			git_project_top = strdup(path);
+			break;
+		}
+		else
+			return error("%s: not a directory", path);
+		path[dir_length] = 0;
+		cp = strrchr(path, '/');
+		if (cp)
+			*cp = 0;
+	}
+	if (!git_project_top)
+		return error("cannot find GIT_PROJECT_TOP");
+
+	return 0;
+}
+
+char *canon_path(const char *path)
+{
+	/* path is either absolute path from root fs or
+	 * relative to the git_cwd.  What is the relative path
+	 * for that thing, viewed from GIT_PROJECT_TOP?
+	 */
+	char *cp, *op, *endp, *result = NULL;
+	char *work = xmalloc(strlen(git_cwd) + strlen(path) + 2);
+	int pfxlen = strlen(git_project_top);
+
+	if (path[0] == '/')
+		strcpy(work, path);
+	else
+		sprintf(work, "%s/%s", git_cwd, path);
+	/* We will copy to *op starting from *cp while removing
+	 * nonsense.  Initially op and cp are both set to one
+	 * past the root-level '/'.
+	 */
+	op = cp = work + 1;
+	endp = cp + strlen(cp);
+	while (cp < endp) {
+		char *ep = strchr(cp, '/');
+		if (!ep)
+			ep = endp; /* at terminating NUL */
+		/* Now look at what is between cp and ep. */
+		if (ep == cp) {
+			/* Remove double slashes.
+			 * "/xxx//foo" ==> "/xxx//foo"
+			 *    cp^              cp^
+			 */
+			cp++;
+			continue;
+		}
+		if (*cp == '.') {
+			/* dot something.  What is it? */
+			if (cp[1] == 0 || cp[1] == '/') {
+				/* Remove trailing dot.
+				 * "/xxx/." ==> "/xxx/."
+				 *     cp^           cp^
+				 * "/xxx/./foo" ==> "/xxx/./foo"
+				 *     cp^               cp^
+				 */
+				cp = ep;
+				continue;
+			}
+			if (cp[1] == '.' && (cp[2] == 0 || cp[2] == '/')) {
+				/* Uplevel.
+				 * "/xxx/../foo" ==> "/xxx/../foo"
+				 *     cp^                  cp^
+				 * while backspacing "xxx" in the op
+				 */
+				cp = cp + 3;
+				op -= 2;
+				if (op < work)
+					op = work + 1;
+				while (*op != '/' && work < op)
+					op--;
+				op++;
+				continue;
+			}
+		}
+		/* otherwise there is no funnies */
+		while (cp <= ep && *cp)
+			*op++ = *cp++;
+	}
+	*op = 0;
+	if (op[-1] == '/' && op != work)
+		op[-1] = 0;
+
+	if (!strncmp(git_project_top, work, pfxlen) &&
+	    (work[pfxlen] == '/' || work[pfxlen] == 0))
+		result = strdup(work[pfxlen] ? work + pfxlen + 1 : "");
+	/* otherwise, path is outside of git-project-top and we ignore it. */
+
+	free(work);
+	return result;
+}
+
+/****************************************************************/
+
+int setup_paths(void)
+{
+	if (find_project_top())
+		return -1;
+	if (initialize_ignore_list())
+		return -1;
+	return 0;
+}
+
--- a/paths.h
+++ b/paths.h
@@ -0,0 +1,14 @@
+/*
+ * Copyright (c) 2005 Junio C Hamano
+ */
+#ifndef _PATHS_H_
+#define _PATHS_H_
+
+int setup_paths(void);
+extern char *git_project_top;
+
+char *canon_path(const char *);
+int path_ignored(const char *);
+int verify_path(const char *);
+
+#endif
--- a/run-with-user-path.c
+++ b/run-with-user-path.c
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2005 Junio C Hamano
+ */
+#include <unistd.h>
+#include "cache.h"
+#include "paths.h"
+
+static int no_ignore = 0;
+
+static const char *usage_rwup = 
+"git-run-with-user-path [ --no-ignore ] <command> <argument>... -- <path>...";
+
+static int prepare_path_args(char **exec_param, char **path)
+{
+	int i, cnt;
+	char *canon;
+
+	for (i = cnt = 0; path[i]; i++) {
+		canon = canon_path(path[i]);
+		if (canon && (no_ignore || !path_ignored(canon)))
+			exec_param[cnt++] = canon;
+	}
+	return cnt;
+}
+
+int main(int ac, char **av)
+{
+	char **exec_param;
+	int i, command_end, cnt_path;
+
+	if (setup_paths())
+		exit(1);
+
+	while (1 < ac && av[1][0] == '-') {
+		if (!strcmp(av[1], "--no-ignore"))
+			no_ignore = 1;
+		else
+			break;
+		ac--; av++;
+	}
+	for (i = 1; i < ac; i++)
+		if (!strcmp(av[i], "--"))
+			break;
+	if (ac <= i)
+		die(usage_rwup); /* no -- to start path */
+
+	command_end = i; /* pointing at -- */
+
+	/* command command arg1 arg2 ... path1 path2 ... NULL */
+	exec_param = xcalloc(ac, sizeof(char *));
+	exec_param[ac - 1] = 0;
+	for (i = 1; i < command_end; i++)
+		exec_param[i - 1] = av[i];
+	cnt_path = prepare_path_args(exec_param + command_end - 1,
+				     av + command_end + 1);
+
+	if (chdir(git_project_top))
+		die("cannot chdir to %s", git_project_top);
+	execvp(exec_param[0], exec_param);
+	die("cannot exec %s", exec_param[0]);
+}
--- a/t/README
+++ b/t/README
@@ -73,6 +73,7 @@
 	4 - the diff commands
 	5 - the pull and exporting commands
 	6 - the revision tree commands (even e.g. merge-base)
+	7 - the non-core commands and helpers
 
 Second digit tells the particular command we are testing.
 
--- a/t/t7000-git-run-with-user-path-basic.sh
+++ b/t/t7000-git-run-with-user-path-basic.sh
@@ -0,0 +1,66 @@
+#!/bin/sh
+#
+# Copyright (c) 2005, Junio C Hamano
+#
+
+test_description='git-run-with-user-path basic test.
+
+The command is used to help running core GIT commands that always
+expect to be run from the top level directory (i.e. the directory
+that corresponds to the top of tree GIT_INDEX_FILE describes).
+'
+
+. ./test-lib.sh
+
+LF='
+'
+HERE=$(pwd)
+
+test_expect_success \
+setup '
+mkdir path0 path1 path1/path2 &&
+for p in path0/file0 path1/file1 path1/path2/file2
+do
+    echo hello >$p &&
+    git-update-cache --add -- $p || exit 1
+done
+'
+
+test_expect_success \
+'finding paths from a subdirectory' '
+    case "$(cd path0 &&
+            git-run-with-user-path --no-ignore cat -- \
+	    file0 ../path1/path2/file2)" in
+    "hello${LF}hello") : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+test_expect_success \
+'feeding find output via xargs from a subdirectory' '
+    case "$(cd path0 &&
+	    find . ../path1 -type f -print0 |
+	    xargs -r -0 git-run-with-user-path --no-ignore cat --)" in
+    "hello${LF}hello${LF}hello") : ;;
+    *) (exit 1) ;;
+    esac
+'
+
+cd $HERE
+mv .git .svn
+GIT_DIR=$(pwd)/.svn
+GIT_PROJECT_TOP=$(pwd)
+export GIT_DIR GIT_PROJECT_TOP
+
+test_expect_success \
+'feeding find output via xargs from a subdirectory (with GIT_PROJECT_TOP)' '
+    case "$(cd path0 &&
+            find . ../path1 -type f -print0 |
+	    xargs -r -0 git-run-with-user-path --no-ignore cat --)" in
+    "hello${LF}hello${LF}hello") : ;;
+    *) (exit 1) ;;
+    esac
+    cd ..
+'
+
+test_done
--- a/update-cache.c
+++ b/update-cache.c
@@ -5,6 +5,7 @@
  */
 #include <signal.h>
 #include "cache.h"
+#include "paths.h"
 
 /*
  * Default to not allowing changes to the list of files. The
@@ -257,34 +258,6 @@
 	return has_errors;
 }
 
-/*
- * We fundamentally don't like some paths: we don't want
- * dot or dot-dot anywhere, and in fact, we don't even want
- * any other dot-files (.git or anything else). They
- * are hidden, for chist sake.
- *
- * Also, we don't want double slashes or slashes at the
- * end that can make pathnames ambiguous.
- */
-static int verify_path(char *path)
-{
-	char c;
-
-	goto inside;
-	for (;;) {
-		if (!c)
-			return 1;
-		if (c == '/') {
-inside:
-			c = *path++;
-			if (c != '/' && c != '.' && c != '\0')
-				continue;
-			return 0;
-		}
-		c = *path++;
-	}
-}
-
 static int add_cacheinfo(char *arg1, char *arg2, char *arg3)
 {
 	int size, len, option;
------------------------------------------------


^ permalink raw reply

* [PATCH 0/2] Introducing git-run-with-user-path program.
From: Junio C Hamano @ 2005-05-16  6:04 UTC (permalink / raw)
  To: pasky; +Cc: git, torvalds

This is the new series I mentioned earlier today.

 [PATCH 1/2] Introduce git-run-with-user-path helper program.
 [PATCH 2/2] Add sample ignore logic to git-run-with-user-path command.

The first one adds a path canonicalization helper with path
ignore hooks but no ignore logic implementation (it passes
everything that passes verify_path()).  The second one adds a
sample ignore logic implementation using PCRE.

Although the second one is done primarily as an example and to
start a mailing list discussion, it should also be safe to merge
if you decide to take patch 1, because the logic is used only by
git-run-with-user-path, which is a new program that no Porcelain
uses right now.
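
As a rough illustration of the first-match-wins rule in the second
patch, here is a minimal sketch.  It is an assumption-laden stand-in,
not the patch itself: it uses the POSIX <regex.h> interface (basic
regular expressions) instead of pcreposix, and struct ignore_rule,
compile_rule() and is_ignored() are hypothetical names.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical record type for this sketch; the patch has its own. */
struct ignore_rule {
	int negate;	/* pattern was written with a leading '!' */
	regex_t re;
};

/* Compile one line of an ignore file; returns 0 on success. */
static int compile_rule(struct ignore_rule *rule, const char *line)
{
	assert(rule != NULL && line != NULL);
	rule->negate = (line[0] == '!');
	if (rule->negate)
		line++;
	return regcomp(&rule->re, line, REG_NOSUB);
}

/* The first matching rule decides; no match means "not ignored". */
static int is_ignored(const struct ignore_rule *rules, size_t nr,
		      const char *path)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (regexec(&rules[i].re, path, 0, NULL, 0) == 0)
			return !rules[i].negate;
	return 0;
}
```

Loading the rules in the order ".*1$", "!path0/file0$", ".*0$" would
keep path0/file0 (the '!' rule matches before ".*0$" is reached) while
still ignoring path1/file1.  A real caller would also regfree() each
rule when done.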

^ permalink raw reply

* Re: Mercurial 0.4e vs git network pull
From: Matt Mackall @ 2005-05-16  1:12 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Adam J. Richter, pasky, git, linux-kernel, mercurial, torvalds
In-Reply-To: <428793A1.5070004@pobox.com>

On Sun, May 15, 2005 at 02:23:29PM -0400, Jeff Garzik wrote:
> Matt Mackall wrote:
> >On Sun, May 15, 2005 at 04:22:19AM -0700, Adam J. Richter wrote:
> >
> >>On Sun, 15 May 2005 10:54:05 +0200, Petr Baudis wrote:
> >>
> >>>Dear diary, on Thu, May 12, 2005 at 10:57:35PM CEST, I got a letter
> >>>where Matt Mackall <mpm@selenic.com> told me that...
> >>>
> >>>>Does this need an HTTP request (and round trip) per object? It appears
> >>>>to. That's 2200 requests/round trips for my 800 patch benchmark.
> >>
> >>>Yes it does. On the other side, it needs no server-side CGI. But I guess
> >>>it should be pretty easy to write some kind of server-side CGI streamer,
> >>>and it would then easily take just a single HTTP request (telling the
> >>>server the commit ID and receiving back all the objects).
> >>
> >>	I don't understand what was wrong with Jeff Garzik's previous
> >>suggestion of using http/1.1 pipelining to coalesce the round trips.
> >
> >
> >You can't do pipelining if you can't look ahead far enough to fill the 
> >pipe.
> 
> Even if you cannot fill a pipeline, HTTP/1.1 is sufficiently useful 
> simply by removing the per-request connection overhead.

Sure. It cuts round trips by a factor of 2. But that's just about all
it does.

Mercurial already does:
  - approximately O(log(new changesets)) requests/data to find new changesets
  - one request to get an entire changegroup (set of all new
    changesets), which comes back all nicely pipelined and sorted by file
  - delta transfer

In "dumb http" mode, ie what's been there since about day three, it
can do:
  - one request (size proportional to total number of changesets) to
    find new changesets
  - approximately two requests per changed file to pull all deltas
    (vs request per file revision)
  - delta transfer

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply

* Re: [RFC] adding merge-node to parent lines in a commit
From: Sean @ 2005-05-15 22:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v64xkqfqg.fsf@assigned-by-dhcp.cox.net>

On Sun, May 15, 2005 6:36 pm, Junio C Hamano said:
>>>>>> "S" == Sean  <seanlkml@sympatico.ca> writes:
>
> S> Yes, it's pretty basic unless i'm overlooking something:
>
>     Rn---\
>          Mn
>          Mn-1
>     Rn-1  |
>     Rn-2  |
>     Rn-3--/
>     Initial
>
> S> So for this particular case it's rather simple, the Rn merge
> S> node would have:
>
> S> parent  Rn-1  (omitted)
> S> parent  Mn    Rn-3
>
> Isn't that something you can obtain by scanning the parents and
> running merge-base across them without recording it in the
> commit?

Yes it is, and i was mistaken to think it gave more information.  It would
shave some time off history traversal, but nothing else.

Sean



^ permalink raw reply

* Re: [RFC] adding merge-node to parent lines in a commit
From: Junio C Hamano @ 2005-05-15 22:36 UTC (permalink / raw)
  To: Sean; +Cc: Junio C Hamano, git
In-Reply-To: <1441.10.10.10.24.1116194876.squirrel@linux1>

>>>>> "S" == Sean  <seanlkml@sympatico.ca> writes:

S> Yes, it's pretty basic unless i'm overlooking something:

    Rn---\
         Mn
         Mn-1
    Rn-1  |
    Rn-2  |
    Rn-3--/
    Initial

S> So for this particular case it's rather simple, the Rn merge
S> node would have:

S> parent  Rn-1  (omitted)
S> parent  Mn    Rn-3

Isn't that something you can obtain by scanning the parents and
running merge-base across them without recording it in the
commit?
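
For illustration only, the kind of computation meant here can be
sketched on a toy graph.  This is a simplification and not what the
merge-base command actually does: struct toy_graph and
toy_merge_base() are hypothetical, commits are small integers, and
the walk simply returns the first common ancestor it reaches.

```c
#include <assert.h>

#define MAX_NODES 16
#define MAX_PARENTS 2

/* Hypothetical toy commit graph: parent[i][k] is -1 when absent. */
struct toy_graph {
	int nr;
	int parent[MAX_NODES][MAX_PARENTS];
};

/* Mark every node reachable from start (start included). */
static void mark_ancestors(const struct toy_graph *g, int start,
			   unsigned char *seen)
{
	int k;

	if (start < 0 || seen[start])
		return;
	seen[start] = 1;
	for (k = 0; k < MAX_PARENTS; k++)
		mark_ancestors(g, g->parent[start][k], seen);
}

/*
 * Return a common ancestor of a and b, or -1 if there is none.
 * Walks b's history breadth-first and stops at the first node that
 * is also an ancestor of a.
 */
static int toy_merge_base(const struct toy_graph *g, int a, int b)
{
	unsigned char from_a[MAX_NODES] = { 0 };
	unsigned char queued[MAX_NODES] = { 0 };
	int queue[MAX_NODES];
	int head = 0, tail = 0, k;

	assert(g->nr <= MAX_NODES);
	assert(a >= 0 && a < g->nr && b >= 0 && b < g->nr);

	mark_ancestors(g, a, from_a);
	queue[tail++] = b;
	queued[b] = 1;
	while (head < tail) {
		int n = queue[head++];

		if (from_a[n])
			return n;
		for (k = 0; k < MAX_PARENTS; k++) {
			int p = g->parent[n][k];

			if (p >= 0 && !queued[p]) {
				queued[p] = 1;	/* each node is queued once */
				queue[tail++] = p;
			}
		}
	}
	return -1;
}
```

Numbering the commits in the picture as Initial=0, Rn-3=1, Rn-2=2,
Rn-1=3, Mn-1=4, Mn=5 and Rn=6, the two parents of Rn are 3 and 5, and
the walk from either one meets the other's history at Rn-3.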

^ permalink raw reply

* Re: git-rev-list  in local commit order
From: Thomas Gleixner @ 2005-05-15 22:13 UTC (permalink / raw)
  To: Sean; +Cc: git
In-Reply-To: <1392.10.10.10.24.1116193437.squirrel@linux1>

On Sun, 2005-05-15 at 17:43 -0400, Sean wrote:

> Well I honestly don't know what you want.   If I wanted to include a
> "fortune" line in every commit and couldn't explain what value it
> provided, i'd expect you or others to object.

Last try.

A repository Id makes it possible to identify workflows in and across
repositories. 

This information is valuable for me and others due to already discussed
reasons. 

I accept that it is irrelevant for you.

tglx

^ permalink raw reply

* Re: [RFC] adding merge-node to parent lines in a commit
From: Sean @ 2005-05-15 22:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfywoqira.fsf@assigned-by-dhcp.cox.net>

On Sun, May 15, 2005 5:30 pm, Junio C Hamano said:
> It is not clear to me which object's ID is being recorded as
> SHA1-MERGE-NODE in your proposal.  Care to illustrate?

Hey Junio,

Yes, it's pretty basic unless i'm overlooking something:

Rn---\
     Mn
     Mn-1
Rn-1  |
Rn-2  |
Rn-3--/
Initial

So for this particular case it's rather simple, the Rn merge node would have:

parent  Rn-1  (omitted)
parent  Mn    Rn-3

Essentially, you are recording in the merge commit the length of each
branch that represents new commits that did not exist in this repository
before this merge.

But there are more complicated cases with multiple parent merges.   What
this would avoid is the need to calculate common_ancestor in routines
like git-rev-list, because that information would already be stored.   As
far as I know this information is already available at merge time anyway,
so recording it shouldn't be a burden.   I'm not sure it buys you a ton
out the other end, but perhaps it would be slightly easier to piece the
merge history back together.

Sean

^ permalink raw reply

* Re: git repository for net drivers available
From: Junio C Hamano @ 2005-05-15 21:46 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jeff Garzik, Linux Kernel, git
In-Reply-To: <20050515200514.GA31414@pasky.ji.cz>

>>>>> "PB" == Petr Baudis <pasky@ucw.cz> writes:

PB> Dear diary, on Fri, May 13, 2005 at 05:29:30PM CEST, I got a letter
PB> where Jeff Garzik <jgarzik@pobox.com> told me that...
>> Looks like cogito is using $repo/heads/$branch, whereas my git repo is 
>> using $repo/branches/$branch.

PB> Would it be a big problem to use refs/heads/$branch? That's the
PB> currently commonly agreed convention about location for storing branch
PB> heads, not just some weird Cogito-specific invention. And it'd be very
PB> nice to have those locations consistent across git repositories.

Since Jeff brought up $repo/branches/$branch, you may also want
to add that $repo/branches/$branch is used to record the URL of
the remote $branch (the information used to be in a flat file
$repo/remotes, branch name and URL separated by shell $IFS, one
record on each line), and is quite different from those 40-byte
SHA1 plus LF files you see in $repo/refs/*/ directory.
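
To show the difference in file contents, here is a minimal sketch of
a check for the "40-byte SHA1 plus LF" shape.  The function name
looks_like_ref() is hypothetical, and the sketch assumes the SHA1 is
written as 40 hexadecimal digits; a file holding a URL fails the
check.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical check: does buf (len bytes) look like the contents of
 * a ref file, i.e. 40 hexadecimal digits followed by one LF?
 */
static int looks_like_ref(const char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL);
	if (len != 41 || buf[40] != '\n')
		return 0;
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)buf[i]))
			return 0;
	return 1;
}
```

A Porcelain could use a test like this to tell the two kinds of files
apart before deciding how to interpret them.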

I think it is a reasonable one.  I also follow the
$repo/branches/$branch convention Cogito uses, and I would
encourage other Porcelain implementations to follow suit.

^ permalink raw reply

* Re: git-rev-list  in local commit order
From: Sean @ 2005-05-15 21:43 UTC (permalink / raw)
  To: tglx; +Cc: git
In-Reply-To: <1116192629.11872.201.camel@tglx>

On Sun, May 15, 2005 5:30 pm, Thomas Gleixner said:
> On Sun, 2005-05-15 at 17:21 -0400, Sean wrote:
>> > Time is illusion.
>>
>> What you're missing is that time is only important in this case to
>> deduce
>> the relative age of each commit LOCALLY.   The intention of this
>> proposal
>> is not to allow time comparison of commits between repositories.
>
> I do not want to compare times. I want to figure out workflows and
> histories between different repositories.

Well I honestly don't know what you want.   If I wanted to include a
"fortune" line in every commit and couldn't explain what value it
provided, i'd expect you or others to object.

My time based proposal solves the issue of :

Rn------\
Rn-1    Mn
Rn-2    Mn-1
Rn-3 ---/
Initial

Showing up in two repositories sorted based on the order they were
committed locally.  This was an issue that you stated you were trying to
solve.  The test case works just as advertised.  Remote times don't
matter, all that matters is the time you merge the objects locally.
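
A minimal sketch of that ordering, under stated assumptions: each
commit is paired with the time it arrived in the local repository
(how that time is obtained is left open here), and struct
local_commit and sort_by_local_time() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical record: a commit name plus its local arrival time. */
struct local_commit {
	const char *name;
	time_t local_time;
};

/* qsort comparator: newest local arrival first. */
static int newest_first(const void *a, const void *b)
{
	const struct local_commit *x = a, *y = b;

	if (x->local_time == y->local_time)
		return 0;
	return x->local_time < y->local_time ? 1 : -1;
}

/* Sort list in place so the most recent local arrival comes first. */
static void sort_by_local_time(struct local_commit *list, size_t nr)
{
	assert(list != NULL);
	qsort(list, nr, sizeof(*list), newest_first);
}
```

Because only the local times are compared, the same set of commits
can sort differently in two repositories, each reflecting its own
commit and merge order.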

> Even LOCALLY is no guarantee for correct timestamps.

Sure, but then your repoid might have gone missing or be set incorrectly
too.   One nice thing if your time is wrong, you can simply reset the
timestamp on the file.   If your repo-id is wrong, you have to recast the
commit object which will get a different SHA1 number and make things more
difficult.

Sean

^ permalink raw reply

* Re: [RFC] adding merge-node to parent lines in a commit
From: Junio C Hamano @ 2005-05-15 21:30 UTC (permalink / raw)
  To: Sean; +Cc: git
In-Reply-To: <1282.10.10.10.24.1116192147.squirrel@linux1>

It is not clear to me which object's ID is being recorded as
SHA1-MERGE-NODE in your proposal.  Care to illustrate?

^ permalink raw reply
