Git development
 help / color / mirror / Atom feed
* Re: [PATCH] Add some fancy colors in the test library when terminal    supports it.
From: Pierre Habouzit @ 2007-10-22 13:45 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Shawn O. Pearce, git
In-Reply-To: <471C950E.40702@viscovery.net>

[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]

On Mon, Oct 22, 2007 at 12:18:22PM +0000, Johannes Sixt wrote:
> Pierre Habouzit schrieb:
> >On Mon, Oct 22, 2007 at 11:35:30AM +0000, Johannes Sixt wrote:
> >>Pierre Habouzit schrieb:
> >>>On Mon, Oct 22, 2007 at 08:53:36AM +0000, Johannes Sixt wrote:
> >>>>Pierre Habouzit schrieb:
> >>>>>+say_color () {
> >>>>>+	[ "$nocolor" = 0 ] &&  [ "$1" != '-1' ] && tput setaf "$1"
> >>>>>+	shift
> >>>>>+	echo "* $*"
> >>>>>+	tput op
>        ^^^^^^^^
> I am talking about this line.

  Oooh, good catch :P it should be guarded by a [ "$nocolor" = 0 ]
indeed :P (or use your solution).

> >>>>>+}
> >>I wanted to point out that if tput is not available, the second 
> >>invocation will leave "tput: command not found" behind on stderr. 
> >>Therefore, I proposed to make the definition of say_color() different 
> >>depending on whether $color is set or not. Then you don't need to test 
> >>for $color twice inside the function.
> >  Right we can do that. I'll try to rework the patch. and no it
> >shouldn't leave tput: command not found as I 2>/dev/null and I think the
> >shell doesn't print that in that case. At least my zsh doesn't.
> 
> There is no 2>/dev/null. Am I missing something?

  I was.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Johannes Schindelin @ 2007-10-22 13:48 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Jakub Narebski, Steffen Prohaska, Federico Mena Quintero, git
In-Reply-To: <471C9B13.9080603@op5.se>

Hi,

On Mon, 22 Oct 2007, Andreas Ericsson wrote:

> If I were to suggest any improvements, it'd be to change the semantics of
> git-pull to always update the local branches set up to be merged with the
> remote tracking branches when they, prior to fetching, pointed to the same
> commit, such that when
> 
> $ git show-ref master
> d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/heads/master
> d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/remotes/origin/master
> 
> refs/heads/master gets set to refs/remotes/origin/master post-fetch.

In general, this should fail.  Because you are expected to have local 
changes in the local branches.  What you describe suggests that you should 
not use the branch name "master" at all, but "origin/master".

That said, there is a pretty simple way to achieve what you want (even if 
it does not help the confusion you create between local and remote 
branches):

	git config --add remote.origin.fetch master:master

Of course, when you checkout "master" and pull then, you'll get even more 
problems, _exactly_ because you muddled up the clear distinction between 
local and remote branches.

Ciao,
Dscho

^ permalink raw reply

* Re: Proposed git mv behavioral change
From: Karl Hasselström @ 2007-10-22 14:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Ari Entlich, Michael Witten, git
In-Reply-To: <20071020064003.GA30605@coredump.intra.peff.net>

On 2007-10-20 02:40:03 -0400, Jeff King wrote:

> Right. So the exact state you have in your index never actually
> existed in your working tree. But that's OK if:
>   - the changes are trivial and obviously correct
>   - you're not actually planning on _committing_ that, you just want
>     to build the commit using the index

Or

    - you plan to revisit the commit at a later time and verify that
      it's OK

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply

* Re: Proposed git mv behavioral change
From: Karl Hasselström @ 2007-10-22 14:08 UTC (permalink / raw)
  To: Wincent Colaiuta; +Cc: Michael Witten, Shawn O. Pearce, git
In-Reply-To: <87ACA689-CBBD-4E6D-9BAD-B3BBA8391DB9@wincent.com>

On 2007-10-20 13:15:19 +0200, Wincent Colaiuta wrote:

> I think the issue here is that git-mv as it is currently implemented
> really conflates two things:
>
> 1. Renaming a file in the traditional "mv" sense
> 2. Staging the entire contents of the file in the index, ready or not
>
> So it's kind of like the command were called git-mv-and-add or git-
> rename-and-add. And given that the index as a staging area is such a
> central content in Git, users often want to have more control over
> what gets added to the index than that; ie. "I really just wanted to
> rename the file, and leave the staging of modifications to the
> content up to me".

I've come to that conclusion too while reading this thread. It would
make more sense for git-mv to, as others have already suggested, move
the file in the worktree and move the file in the index but _not_ add
workdir updates to the index. git-mv --cached would do only the index,
and not touch the worktree.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply

* Re: [PATCH] don't set-group-id on directories on apple
From: Johannes Schindelin @ 2007-10-22 14:16 UTC (permalink / raw)
  To: Scott R Parish; +Cc: git
In-Reply-To: <20071022075459.GA1157@srparish.net>

Hi,

On Mon, 22 Oct 2007, Scott R Parish wrote:

> "git init --shared=all" was failing because chmod was returning
> EPERM.

Not here.  This is git version 1.5.3.rc4.1716.gc3498, and "uname -a" says

Darwin michael-stirrats-mac-mini.local 8.10.0 Darwin Kernel Version 
8.10.0: Wed May 23 16:50:59 PDT 2007; root:xnu-792.21.3~1/RELEASE_PPC 
Power Macintosh powerpc

Is it possible that you have stricter permission settings?  Or that you 
try to re-initialise a repository that somebody else initialised 
originally?

Ciao,
Dscho

^ permalink raw reply

* Re: Git User's Survey 2007 unfinished summary continued
From: Federico Mena Quintero @ 2007-10-22 14:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0710200035020.25221@racer.site>

On Sat, 2007-10-20 at 00:37 +0100, Johannes Schindelin wrote:

> FWIW try this:
> 
> 	git<Space><Tab>
> 
> or even better:
> 
> 	git<Enter>

Hmm, this didn't work.  I'm still using git 1.5.4.2 from the 
stock openSUSE 10.3 packages --- did this get added in a newer version?

  Federico

^ permalink raw reply

* Re: Git User's Survey 2007 unfinished summary continued
From: Andreas Ericsson @ 2007-10-22 14:29 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Johannes Schindelin, Steffen Prohaska, Federico Mena Quintero,
	git
In-Reply-To: <8fe92b430710220526i65ecb862ie1037e9d94d93b83@mail.gmail.com>

Jakub Narebski wrote:
> On 10/22/07, Andreas Ericsson <ae@op5.se> wrote:
> [...]
>>>>>> On 10/20/07, Steffen Prohaska <prohaska@zib.de> wrote:
>>>>>>
>>>>>>> Maybe we could group commands into more categories?
> 
>> Similarly, it might be helpful to have help topics the gdb way, like
>> "git help patches". It's one of those things that people have come to
>> expect from a software tool, so perhaps we should humor them? Given gits
>> "every help topic is a man-page" idiom, this shouldn't require any real
>> technical effort.
>>
>> Such topics should probably include
>> merge/merges/merging - overview of various ways of putting two lines of
>> development back together
>> patch/patches - how to create, send and apply
>> tags/branches/refs - what they are, why they're good, link to merging
> 
> Very good idea. It is definitely something that can be worked on.
> 
> By the way, what do you think about "spying" version of git, specially
> marked release which gathers statistics of porcelain used, with
> frequency of its use, and git-sendstats command added in this release?
> 

I like it and I'd use it. What's more interesting is that I could 
probably get my co-workers to do the same.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: [PATCH] don't set-group-id on directories on apple
From: Scott Parish @ 2007-10-22 14:29 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0710221234070.25221@racer.site>

On Mon, Oct 22, 2007 at 03:16:01PM +0100, Johannes Schindelin wrote:

> On Mon, 22 Oct 2007, Scott R Parish wrote:
> 
> > "git init --shared=all" was failing because chmod was returning
> > EPERM.
> 
> Not here.  This is git version 1.5.3.rc4.1716.gc3498, and "uname -a" says
> 
> Darwin michael-stirrats-mac-mini.local 8.10.0 Darwin Kernel Version 
> 8.10.0: Wed May 23 16:50:59 PDT 2007; root:xnu-792.21.3~1/RELEASE_PPC 
> Power Macintosh powerpc

Darwin poplar.local 8.10.1 Darwin Kernel Version 8.10.1: Wed May 23 16:33:00 PDT 2007; root:xnu-792.22.5~1/RELEASE_I386 i386 i386

> Is it possible that you have stricter permission settings?

This is a possibility, but i have no idea what that might be. I've
tried googling around without any luck (except for a post about
reading "mkdir(2)").

> Or that you try to re-initialise a repository that somebody else
> initialised originally?

No, this was a failure when running tests (t1301-shared-repo.sh)

sRp

-- 
Scott Parish
http://srparish.net/

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Andreas Ericsson @ 2007-10-22 14:31 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jakub Narebski, Steffen Prohaska, Federico Mena Quintero, git
In-Reply-To: <Pine.LNX.4.64.0710221445170.25221@racer.site>

Johannes Schindelin wrote:
> Hi,
> 
> On Mon, 22 Oct 2007, Andreas Ericsson wrote:
> 
>> If I were to suggest any improvements, it'd be to change the semantics of
>> git-pull to always update the local branches set up to be merged with the
>> remote tracking branches when they, prior to fetching, pointed to the same
>> commit, such that when
>>
>> $ git show-ref master
>> d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/heads/master
>> d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/remotes/origin/master
>>
>> refs/heads/master gets set to refs/remotes/origin/master post-fetch.
> 
> In general, this should fail.  Because you are expected to have local 
> changes in the local branches.


BS argument. Git knows when I haven't got any changes on my local 
branches, and it can be fairly safely assumed that when I feel like 
making any, I'd like to make them off as fresh a tip as possible unless 
I explicitly tell git otherwise.

Nice hint though. I'm working on a patch for it now but I've only looked 
at it 15 minutes over lunch today, so it'll probably be a few days.


>  What you describe suggests that you should 
> not use the branch name "master" at all, but "origin/master".
> 

No. I want the ability to commit locally without it affecting my 
upstream tracking branches, but I also want to make sure that when I 
want to work on some branch I don't frequently touch, git will make sure 
it's kept up-to-speed with the branch I explicitly have told it to merge 
with, without me having to remember if I was on that branch when I last 
did git-pull (I might not have a network connection), and without having 
to remember what I decided to call my locally-modifiable branch.


> That said, there is a pretty simple way to achieve what you want (even if 
> it does not help the confusion you create between local and remote 
> branches):
> 
> 	git config --add remote.origin.fetch master:master
> 
> Of course, when you checkout "master" and pull then, you'll get even more 
> problems, _exactly_ because you muddled up the clear distinction between 
> local and remote branches.
> 

That's not what I want at all. I must have been unclear in my original 
post. I'm talking about git doing automatically what every single user 
I've ever talked to wants it to do, which is to maintain the state of 
sync that the "local-and-modifiable" branches had with the 
"local-non-modifiable-aka-remote-tracking" branches. Note that the state 
of sync is more important to users than git never ever touching the 
branches that they *could* have (but don't have) changes on.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: [PATCH] execv_git_cmd(): also try PATH if everything else fails.
From: Scott Parish @ 2007-10-22 14:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Shawn O. Pearce, git
In-Reply-To: <Pine.LNX.4.64.0710212256270.25221@racer.site>

On Sun, Oct 21, 2007 at 10:59:01PM +0100, Johannes Schindelin wrote:

> Earlier, we tried to find the git commands in several possible exec
> dirs.  Now, if all of these failed, try to find the git command in
> PATH.

I'm tempted to try a different approach. What if instead of looping
and building up strings of all the different absolute paths we want
to try we just prepend to PATH with the correct extra precedence,
and then call execvp on the command we want?

sRp

-- 
Scott Parish
http://srparish.net/

^ permalink raw reply

* deprecating colgit
From: martin f krafft @ 2007-10-22 14:23 UTC (permalink / raw)
  To: git discussion list; +Cc: home in vcs discussion list
In-Reply-To: <20070906131525.GA7261@piper.oerlikon.madduck.net>

[-- Attachment #1: Type: text/plain, Size: 733 bytes --]

Dear list(s),

I would like to deprecate colgit [0] in favour of Joey Hess' mr [1],
which does everything colgit did, and much more.

0. http://lists.zerezo.com/git/msg629629.html
1. http://kitenet.net/~joey/code/mr/

If you have not seen mr, well, basically it manages collections of
repositories, like colgit aspired to do, but Joey made it
vcs-agnostic and much more robust. It's definitely worth checking
out.

-- 
martin;              (greetings from the heart of the sun.)
  \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
 
"if a man treats life artistically, his brain is his heart."
                                                        -- oscar wilde
 
spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: Git User's Survey 2007 unfinished summary continued
From: Federico Mena Quintero @ 2007-10-22 14:53 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Johannes Schindelin, Jakub Narebski, git
In-Reply-To: <471C586A.9030900@op5.se>

On Mon, 2007-10-22 at 09:59 +0200, Andreas Ericsson wrote:

> I doubt many people on this list regularly use git-blame but it's a 
> command that's definitely non-trivial to script out using only the 
> "proper" commands, and CVS/SVN users expect it to be there, so it's 
> probably worth listing anyhow.

Hmmm, I don't really want to turn the "summary" thread into oodles of
sub-threads, but here goes :)

Personally I find git-blame *EXTREMELY* useful.  The workflow is:

1. Bug #12345 for FooApp gets assigned to you.

2. git-svn clone fooapp's repository

3. git checkout -b my-bugfix-branch-for-12345

4. debug debug debug

5. "WTF?  Who wrote this crappy code?"

6. git blame culprit-file.c

7. "Oh, it was $person with $commit_id... what were they thinking at the
time?"

8. git show $commit_id

9. "Oh, I see their intentions now... what was going on at that time?"

10. git log <date range around $commit_id>

11. etc.

Git-blame is very nice for code archaeology (long explanation at
http://mail.gnome.org/archives/desktop-devel-list/2007-September/msg00238.html).

> Similarly, it might be helpful to have help topics the gdb way, like 
> "git help patches".

This would be simply fantastic.  If those help topics suggested
workflows, I'd be delighted :)  Feel free to poke me if you write up
some text; I'd love to help on this.

  Federico

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Johannes Schindelin @ 2007-10-22 15:00 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Jakub Narebski, Steffen Prohaska, Federico Mena Quintero, git
In-Reply-To: <471CB443.9070606@op5.se>

Hi,

On Mon, 22 Oct 2007, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> 
> > On Mon, 22 Oct 2007, Andreas Ericsson wrote:
> > 
> > > If I were to suggest any improvements, it'd be to change the 
> > > semantics of git-pull to always update the local branches set up to 
> > > be merged with the remote tracking branches when they, prior to 
> > > fetching, pointed to the same commit, such that when
> > > 
> > > $ git show-ref master
> > > d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/heads/master
> > > d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/remotes/origin/master
> > > 
> > > refs/heads/master gets set to refs/remotes/origin/master post-fetch.
> > 
> > In general, this should fail.  Because you are expected to have local 
> > changes in the local branches.
> 
> 
> BS argument.

Aha.  So you want to make sure that the local branches are no longer 
"purely" local.  And you want to stop updating them when unpushed changes 
are in the local branches.

Seems I cannot help you.

Ciao,
Dscho

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summary continued
From: Andreas Ericsson @ 2007-10-22 15:16 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jakub Narebski, Steffen Prohaska, Federico Mena Quintero, git
In-Reply-To: <Pine.LNX.4.64.0710221558230.25221@racer.site>

Johannes Schindelin wrote:
> Hi,
> 
> On Mon, 22 Oct 2007, Andreas Ericsson wrote:
> 
>> Johannes Schindelin wrote:
>>
>>> On Mon, 22 Oct 2007, Andreas Ericsson wrote:
>>>
>>>> If I were to suggest any improvements, it'd be to change the 
>>>> semantics of git-pull to always update the local branches set up to 
>>>> be merged with the remote tracking branches when they, prior to 
>>>> fetching, pointed to the same commit, such that when
>>>>
>>>> $ git show-ref master
>>>> d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/heads/master
>>>> d4027a816dd0b416dc8c7b37e2c260e6905f11b6 refs/remotes/origin/master
>>>>
>>>> refs/heads/master gets set to refs/remotes/origin/master post-fetch.
>>> In general, this should fail.  Because you are expected to have local 
>>> changes in the local branches.
>>
>> BS argument.
> 
> Aha.  So you want to make sure that the local branches are no longer 
> "purely" local.  And you want to stop updating them when unpushed changes 
> are in the local branches.
> 

To me, it's more along the lines of "let git help me not make the 
mistake of hacking on a six-week old codebase when I've explicitly asked 
it to merge these and those remote tracking branches into these and 
those local branches". Not updating those branches when there *are* 
changes on them is something users can understand and will probably also 
appreciate, but the reason for not allowing even fast-forwards escape me.

> Seems I cannot help you.
> 

Well, I knew that much from the start so I didn't ask you to, and since 
you seem to fail to grasp what I had in mind, I'm sure you'd botch the 
implementation anyway. Thanks for not quite offering though ;-)

I'm sure you'll review the patch though, and I'm equally sure I will 
appreciate your technical comments rather a lot more than this current 
bickering about a feature it seems I can't express clearly enough in words.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: [PATCH] execv_git_cmd(): also try PATH if everything else fails.
From: Andreas Ericsson @ 2007-10-22 15:19 UTC (permalink / raw)
  To: Scott Parish; +Cc: Johannes Schindelin, Shawn O. Pearce, git
In-Reply-To: <20071022143637.GP16291@srparish.net>

Scott Parish wrote:
> On Sun, Oct 21, 2007 at 10:59:01PM +0100, Johannes Schindelin wrote:
> 
>> Earlier, we tried to find the git commands in several possible exec
>> dirs.  Now, if all of these failed, try to find the git command in
>> PATH.
> 
> I'm tempted to try a different approach. What if instead of looping
> and building up strings of all the different absolute paths we want
> to try we just prepend to PATH with the correct extra precedence,
> and then call execvp on the command we want?
> 

That's how the original git --exec-dir feature got implemented. There's 
even a nifty function for it in git.c; prepend_to_path(). It's a 
provably workable solution.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: best git practices, was Re: Git User's Survey 2007 unfinished summarycontinued
From: Federico Mena Quintero @ 2007-10-22 15:24 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Andreas Ericsson, Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0710221156540.25221@racer.site>

On Mon, 2007-10-22 at 12:04 +0100, Johannes Schindelin wrote:

> <rationale>There is a good chance that git is not optimised for most 
> people's daily workflows, as project maintainers seemed to be much more 
> forthcoming with patches, and therefore maintainers' tasks are much more 
> optimised than in other SCMs.</rationale>

The following are workflows that would be very useful for my cow-orkers
and project peers.

End-user workflows:

* Clone from another SCM (mostly svn).  Make a local branch to implement
something in various commits.  Rebase to the "latest upstream sources"
when you are done, and then do the equivalent to "svn dcommit" to upload
your final changes to the other SCM.  The use case for this is to fix a
complicated bug in GNOME, which uses SVN.

* While you are doing that, produce a patchset that you can send to the
maintainers for review.  I love "git-format-patch -n --thread --stdout >
foo", but it's pretty painful to have to 1. look up git-format-patch's
options in the man page (if you use --stdout, shouldn't -n and --thread
be turned on by default?); 2. import "foo" into Evolution to then be
able to edit the zeroth mail, and then be able to use Evo's SMTP
configuration to send out the mails while preserving the threading.

* Clone a git repository which has several interesting branches.  Figure
out which branch you are interested in.  Create a local branch based on
that; do your changes there.  Keep your code up to date (rebase?  when
to fetch / pull / etc?).

* You have a personal branch with a bunch of commits:  a mess of "real
work", "remove debugging printf", "fix typo", etc.  Reformat / reorder
those patches into something suitable for submission.  [I just found out
about git rebase --interactive and it's *FANTASTIC* for this!]

Maintainer workflows:

* Start a personal project in git and publish it for others to clone.
Assume several possible setups:  dumb web server with no git installed,
git installed but no git-daemon, git installed with git-daemon running.
I've found that publishing is not trivial at all; it's a rather
cumbersome multi-step process.

* Several of your contributors publish their own git repositories.
Integrate changes from them, or review them.  This interesting because
you'll have to do a lot of navigation in repos with which you aren't
familiar, you'll have to use the merging and conflict resolution tools,
you'll have to maintain the signoffs, etc.

* Set up a public repository to which other people can push.

* Publish just some of your branches.  Do that often, say, because you
have new work to show in each of those branches.

  Federico

^ permalink raw reply

* Re: What's cooking in git/spearce.git (topics)
From: Steffen Prohaska @ 2007-10-22 15:27 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Shawn O. Pearce
In-Reply-To: <20071022063222.GS14735@spearce.org>


On Oct 22, 2007, at 8:32 AM, Shawn O. Pearce wrote:

> * sp/push-refspec (Sun Oct 14 10:54:45 2007 +0200) 6 commits
>  - push, send-pack: use same rules as git-rev-parse to resolve
>    refspecs
>  - add ref_cmp_full_short() comparing full ref name with a short name
>  - push, send-pack: support pushing HEAD to real ref name
>  - rev-parse: teach "git rev-parse --symbolic" to print the full ref
>    name
>  - add get_sha1_with_real_ref() returning full name of ref on demand
>  - push, send-pack: fix test if remote branch exists for colon-less
>    refspec
>
> I've briefly looked at this series and there's reasons why its not
> in next yet.

It's not ready for next. Especially the last patch in the list
changes the existing behaviour in a way that might be unexpected
by longtime git users. And maybe we even need for the 1.6 cycle
before we can change the behaviour of git push.


> Its actually something that I'm interested in seeing
> fixed as the current behavior of how git-push matches refs on the
> remote side is just plain nuts.  I'll look at it further after I
> get ph/parseopt and cc/skip into next.

I planned to draw a conclusion from the discussion in

http://marc.info/?l=git&m=119286893014690&w=2

and send an updated proposal based on what I learnt. But
unfortunately I didn't have time yet.

My impression now is that the details of the behaviour of "git
push" are hard to understand and should be made more explicit.

Related tasks are currently encoded in the refspecs, but the
details are not always obvious right away:
- creation of new branches on the remote side.
- deletion of branches on the remote side.
- pushing of branches matching on local and remote side.
- pushing local branches explicitly to a different ref on the remote.
- save newbies from pushing to 'non-standard' location, that
   is only push to heads and tags.
- but also allow to push to funny refs if you force git to
   do this.

All this is related to the topic above, although its maybe too much
to be solved at once.

	Steffen

^ permalink raw reply

* Re: [PATCH] execv_git_cmd(): also try PATH if everything else fails.
From: Johannes Sixt @ 2007-10-22 15:36 UTC (permalink / raw)
  To: Andreas Ericsson, Scott Parish; +Cc: Johannes Schindelin, Shawn O. Pearce, git
In-Reply-To: <471CBF88.6020300@op5.se>

Andreas Ericsson schrieb:
> Scott Parish wrote:
>> I'm tempted to try a different approach. What if instead of looping
>> and building up strings of all the different absolute paths we want
>> to try we just prepend to PATH with the correct extra precedence,
>> and then call execvp on the command we want?
>>
> 
> That's how the original git --exec-dir feature got implemented. There's 
> even a nifty function for it in git.c; prepend_to_path(). It's a 
> provably workable solution.

The reason that this was done is for the sake of shell scripts: They need to 
have the path that was finally decided as exec-path in $PATH.

But I can't think of any negative side effect if *all* exec-path candidates 
are in $PATH. It's important, though, that all paths are absolute because 
the tools chdir every now and then.

-- Hannes

^ permalink raw reply

* Re: [PATCH, take 1] Linear-time/space rename logic (exact renames only)
From: Linus Torvalds @ 2007-10-22 16:33 UTC (permalink / raw)
  To: skimo
  Cc: Git Mailing List, Junio C Hamano, Shawn O. Pearce, David Kastrup,
	Jeff King
In-Reply-To: <20071022070750.GM1179MdfPADPa@greensroom.kotnet.org>



On Mon, 22 Oct 2007, Sven Verdoolaege wrote:
> 
> Aren't you truncating the ptr list after the first entry here?
> (While you still need the whole list in free_file_table.)

Yes. I didn't have that bug in the first version (I didn't do a separate 
"free_file_table()" at all - I just free'd the src/dst pointer lists at 
the end of that function). But I wanted to "clean up" the thing. Duh.

		Linus

^ permalink raw reply

* odd behavior with concurrent fetch/checkout
From: J. Bruce Fields @ 2007-10-22 16:51 UTC (permalink / raw)
  To: git

Just now I checked out a topic branch in my working repo:

	git checkout server-xprt-switch

and while waiting for it to complete (I just started work and caches
were all cold), I ran a

	git fetch origin

in another window to update from Linus.  The git fetch gave a warning:

	remote: Generating pack...
	remote: Counting objects: 7550
	remote: Done counting 12885 objects.
	remote: Result has 8400 objects.
	remote: Deltifying 8400 objects...
	remote:  100% (8400/8400) done
	Indexing 8400 objects...
	remote: Total 8400 (delta 7257), reused 5696 (delta 4586)
	 100% (8400/8400) done
	Resolving 7257 deltas...
	 100% (7257/7257) done
	* refs/remotes/origin/master: fast forward to branch 'master' of
	* git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
	  old..new: d85714d..55b70a0
	Cannot fetch into the current branch.

Why the warning?  Also, afterwards I was left with server-xprt-switch
pointing to the tip of the branch I'd just switched from (another
miscellaneous topic branch).  The working directory was in some
completely different state--thanks to a quick reset --hard I don't know
what it was.  Also, in the reflog for the checked-out branch:

	commit bac1e7977eb4781e62cee7f1c7c3d13a9e5d8d74
	Reflog: server-xprt-switch@{0} (J. Bruce Fields <bfields@citi.umich.edu>)
	Reflog message: fetch origin: Undoing incorrectly fetched HEAD.
	Author: J. Bruce Fields <bfields@citi.umich.edu>
	Date:   Mon Oct 22 12:32:37 2007 -0400
	...

Why was a fetch into the remote fooling with HEAD or anything under
refs/heads/?

--b.

^ permalink raw reply

* [PATCH] "current_exec_path" is a misleading name, use "argv_exec_path"
From: Scott R Parish @ 2007-10-22 17:01 UTC (permalink / raw)
  To: git

 Signed-off-by: Scott R Parish <srp@srparish.net>

---
 exec_cmd.c |   12 ++++++------
 exec_cmd.h |    2 +-
 git.c      |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index 9b74ed2..8b681d0 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -5,11 +5,11 @@
 
 extern char **environ;
 static const char *builtin_exec_path = GIT_EXEC_PATH;
-static const char *current_exec_path;
+static const char *argv_exec_path = 0;
 
-void git_set_exec_path(const char *exec_path)
+void git_set_argv_exec_path(const char *exec_path)
 {
-	current_exec_path = exec_path;
+	argv_exec_path = exec_path;
 }
 
 
@@ -18,8 +18,8 @@ const char *git_exec_path(void)
 {
 	const char *env;
 
-	if (current_exec_path)
-		return current_exec_path;
+	if (argv_exec_path)
+		return argv_exec_path;
 
 	env = getenv(EXEC_PATH_ENVIRONMENT);
 	if (env && *env) {
@@ -34,7 +34,7 @@ int execv_git_cmd(const char **argv)
 {
 	char git_command[PATH_MAX + 1];
 	int i;
-	const char *paths[] = { current_exec_path,
+	const char *paths[] = { argv_exec_path,
 				getenv(EXEC_PATH_ENVIRONMENT),
 				builtin_exec_path };
 
diff --git a/exec_cmd.h b/exec_cmd.h
index 849a839..da99287 100644
--- a/exec_cmd.h
+++ b/exec_cmd.h
@@ -1,7 +1,7 @@
 #ifndef GIT_EXEC_CMD_H
 #define GIT_EXEC_CMD_H
 
-extern void git_set_exec_path(const char *exec_path);
+extern void git_set_argv_exec_path(const char *exec_path);
 extern const char* git_exec_path(void);
 extern int execv_git_cmd(const char **argv); /* NULL terminated */
 extern int execl_git_cmd(const char *cmd, ...);
diff --git a/git.c b/git.c
index e1c99e3..f659338 100644
--- a/git.c
+++ b/git.c
@@ -51,7 +51,7 @@ static int handle_options(const char*** argv, int* argc, int* envchanged)
 		if (!prefixcmp(cmd, "--exec-path")) {
 			cmd += 11;
 			if (*cmd == '=')
-				git_set_exec_path(cmd + 1);
+				git_set_argv_exec_path(cmd + 1);
 			else {
 				puts(git_exec_path());
 				exit(0);
-- 
gitgui.0.8.4.11176.gd9205-dirty

^ permalink raw reply related

* [PATCH] use only the PATH for exec'ing git commands
From: Scott R Parish @ 2007-10-22 17:01 UTC (permalink / raw)
  To: git

We need to correctly set up PATH for non-c based git commands. Since we
already do this, we can just use that PATH and execvp, instead of looping
over the paths with execve.

This patch adds a setup_path() function to exec_cmd.c, which sets
the PATH order correctly for our search order. execv_git_cmd() is
stripped down to setting up argv and calling execvp(). git.c's main()
only only needs to call setup_path().

Signed-off-by: Scott R Parish <srp@srparish.net>
---
 exec_cmd.c |  122 ++++++++++++++++++++++++++----------------------------------
 exec_cmd.h |    1 +
 git.c      |   43 +++------------------
 3 files changed, 61 insertions(+), 105 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index 8b681d0..b154c24 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -29,85 +29,69 @@ const char *git_exec_path(void)
 	return builtin_exec_path;
 }
 
+static void add_path(struct strbuf *out, const char *path)
+{
+	if (path && strlen(path)) {
+		if (is_absolute_path(path))
+			strbuf_addstr(out, path);
+		else
+			strbuf_addstr(out, make_absolute_path(path));
+
+		strbuf_addch(out, ':');
+	}
+}
+
+void setup_path(const char *cmd_path)
+{
+	const char *old_path = getenv("PATH");
+	struct strbuf new_path;
+
+	strbuf_init(&new_path, 0);
+
+	add_path(&new_path, argv_exec_path);
+	add_path(&new_path, getenv(EXEC_PATH_ENVIRONMENT));
+	add_path(&new_path, builtin_exec_path);
+	add_path(&new_path, cmd_path);
+		
+	if (old_path)
+		strbuf_addstr(&new_path, old_path);
+	else 
+		strbuf_addstr(&new_path, "/usr/local/bin:/usr/bin:/bin");
+
+	setenv("PATH", new_path.buf, 1);
+
+	strbuf_release(&new_path);
+}
+
 
 int execv_git_cmd(const char **argv)
 {
-	char git_command[PATH_MAX + 1];
-	int i;
-	const char *paths[] = { argv_exec_path,
-				getenv(EXEC_PATH_ENVIRONMENT),
-				builtin_exec_path };
-
-	for (i = 0; i < ARRAY_SIZE(paths); ++i) {
-		size_t len;
-		int rc;
-		const char *exec_dir = paths[i];
-		const char *tmp;
-
-		if (!exec_dir || !*exec_dir) continue;
-
-		if (*exec_dir != '/') {
-			if (!getcwd(git_command, sizeof(git_command))) {
-				fprintf(stderr, "git: cannot determine "
-					"current directory: %s\n",
-					strerror(errno));
-				break;
-			}
-			len = strlen(git_command);
-
-			/* Trivial cleanup */
-			while (!prefixcmp(exec_dir, "./")) {
-				exec_dir += 2;
-				while (*exec_dir == '/')
-					exec_dir++;
-			}
-
-			rc = snprintf(git_command + len,
-				      sizeof(git_command) - len, "/%s",
-				      exec_dir);
-			if (rc < 0 || rc >= sizeof(git_command) - len) {
-				fprintf(stderr, "git: command name given "
-					"is too long.\n");
-				break;
-			}
-		} else {
-			if (strlen(exec_dir) + 1 > sizeof(git_command)) {
-				fprintf(stderr, "git: command name given "
-					"is too long.\n");
-				break;
-			}
-			strcpy(git_command, exec_dir);
-		}
-
-		len = strlen(git_command);
-		rc = snprintf(git_command + len, sizeof(git_command) - len,
-			      "/git-%s", argv[0]);
-		if (rc < 0 || rc >= sizeof(git_command) - len) {
-			fprintf(stderr,
-				"git: command name given is too long.\n");
-			break;
-		}
+	struct strbuf cmd;
+	const char *tmp;
 
-		/* argv[0] must be the git command, but the argv array
-		 * belongs to the caller, and my be reused in
-		 * subsequent loop iterations. Save argv[0] and
-		 * restore it on error.
-		 */
+	strbuf_init(&cmd, 0);
+	strbuf_addf(&cmd, "git-%s", argv[0]);
 
-		tmp = argv[0];
-		argv[0] = git_command;
+	/* argv[0] must be the git command, but the argv array
+	 * belongs to the caller, and my be reused in
+	 * subsequent loop iterations. Save argv[0] and
+	 * restore it on error.
+	 */
+	tmp = argv[0];
+	argv[0] = cmd.buf;
 
-		trace_argv_printf(argv, -1, "trace: exec:");
+	trace_argv_printf(argv, -1, "trace: exec:");
 
-		/* execve() can only ever return if it fails */
-		execve(git_command, (char **)argv, environ);
+	/* execvp() can only ever return if it fails */
+	execvp(cmd.buf, (char **)argv);
 
-		trace_printf("trace: exec failed: %s\n", strerror(errno));
+	trace_printf("trace: exec failed: %s\n", strerror(errno));
 
-		argv[0] = tmp;
-	}
-	return -1;
+	argv[0] = tmp;
 
+	strbuf_release(&cmd);
+
+	return -1;
 }
 
 
diff --git a/exec_cmd.h b/exec_cmd.h
index da99287..a892355 100644
--- a/exec_cmd.h
+++ b/exec_cmd.h
@@ -3,6 +3,7 @@
 
 extern void git_set_argv_exec_path(const char *exec_path);
 extern const char* git_exec_path(void);
+extern void setup_path(const char *);
 extern int execv_git_cmd(const char **argv); /* NULL terminated */
 extern int execl_git_cmd(const char *cmd, ...);
 
diff --git a/git.c b/git.c
index f659338..a639e42 100644
--- a/git.c
+++ b/git.c
@@ -6,28 +6,6 @@
 const char git_usage_string[] =
 	"git [--version] [--exec-path[=GIT_EXEC_PATH]] [-p|--paginate|--no-pager] [--bare] [--git-dir=GIT_DIR] [--work-tree=GIT_WORK_TREE] [--help] COMMAND [ARGS]";
 
-static void prepend_to_path(const char *dir, int len)
-{
-	const char *old_path = getenv("PATH");
-	char *path;
-	int path_len = len;
-
-	if (!old_path)
-		old_path = "/usr/local/bin:/usr/bin:/bin";
-
-	path_len = len + strlen(old_path) + 1;
-
-	path = xmalloc(path_len + 1);
-
-	memcpy(path, dir, len);
-	path[len] = ':';
-	memcpy(path + len + 1, old_path, path_len - len);
-
-	setenv("PATH", path, 1);
-
-	free(path);
-}
-
 static int handle_options(const char*** argv, int* argc, int* envchanged)
 {
 	int handled = 0;
@@ -403,7 +381,7 @@ int main(int argc, const char **argv)
 {
 	const char *cmd = argv[0] ? argv[0] : "git-help";
 	char *slash = strrchr(cmd, '/');
-	const char *exec_path = NULL;
+	const char *cmd_path = NULL;
 	int done_alias = 0;
 
 	/*
@@ -413,10 +391,7 @@ int main(int argc, const char **argv)
 	 */
 	if (slash) {
 		*slash++ = 0;
-		if (*cmd == '/')
-			exec_path = cmd;
-		else
-			exec_path = xstrdup(make_absolute_path(cmd));
+		cmd_path = cmd;
 		cmd = slash;
 	}
 
@@ -451,16 +426,12 @@ int main(int argc, const char **argv)
 	cmd = argv[0];
 
 	/*
-	 * We execute external git command via execv_git_cmd(),
-	 * which looks at "--exec-path" option, GIT_EXEC_PATH
-	 * environment, and $(gitexecdir) in Makefile while built,
-	 * in this order.  For scripted commands, we prepend
-	 * the value of the exec_path variable to the PATH.
+	 * We use PATH to find git commands, but we prepend some higher
+	 * precidence paths: the "--exec-path" option, the GIT_EXEC_PATH
+	 * environment, and the $(gitexecdir) from the Makefile at build
+	 * time.
 	 */
-	if (exec_path)
-		prepend_to_path(exec_path, strlen(exec_path));
-	exec_path = git_exec_path();
-	prepend_to_path(exec_path, strlen(exec_path));
+	setup_path(cmd_path);
 
 	while (1) {
 		/* See if it's an internal command */
-- 
gitgui.0.8.4.11176.gd9205-dirty

^ permalink raw reply related

* (unknown)
From: racin @ 2007-10-22 18:16 UTC (permalink / raw)
  To: git


Hello,

I found the following on the development version of git.el: saving
non-git-managed files in Emacs threw an error.

It is due to a simple error in the call to condition-case in a
recently added function, git-update-save-file.

I attached the patch for your convenience.

Regards,
Matthieu Lemerre

PS: Please Cc me when you ackwowledge; I'm not subscribed to the list.
As a matter of fact, I found the bug only because I didn't find git.el
for my distribution (debian) so I got directly from the development
version on the website.

^ permalink raw reply

* [PATCH, take 2] Linear-time/space rename logic (exact renames only)
From: Linus Torvalds @ 2007-10-22 17:29 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano, Shawn O. Pearce
  Cc: David Kastrup, Jeff King, Sven Verdoolaege
In-Reply-To: <alpine.LFD.0.999.0710220932150.10525@woody.linux-foundation.org>


Ok, as some people notices, there were a few bugs in the previous patch. I 
didn't free the hashes correctly (stupid) and the Makefile had "hash.o" 
instead of "hash.h".

But more importantly, my "testing" had been totally broken, because I had 
forgotten to actually move the rename limiting code to after the exact 
rename phase, so when I tested the 100,000 file rename, almost none of the 
new code triggered, so my performance testing was totally bogus.

When fixing that, I noticed that while my new exact rename detection was 
essentially instantaneous, there were some O(n*m) effects in the generic 
diff code from the extremely stupid way we handled the "was it a copy or a 
rename" issue.

To fix that, I just made the "was the path used by a rename" be a counter 
instead of a single "it was used"

With that in place, I could actually time the rename detection of 100,000 
files in my big-rename test repository. This is what it looks like when 
you rename a hundred thousand files:

	[torvalds@woody big-rename]$ time ~/git/git show -C | wc -l
	400006

	real    0m2.675s
	user    0m2.148s
	sys     0m0.540s

(each renamed file is 4 lines: they looks like

	diff --git a/really-big-dir/file-1-1-1-1-1 b/moved-big-dir/file-1-1-1-1-1
	similarity index 100%
	rename from really-big-dir/file-1-1-1-1-1
	rename to moved-big-dir/file-1-1-1-1-1

and the extra six lines is from a one-liner commit message and all the 
commit information and spacing).

So two seconds to do that exact rename detection.

Now, I can't really compare it to the "before" stage, because that is just 
so horrible. The rename detection limit triggers, so you never even get 
any renames, but if I were to move the limit check later (like I do in 
this patch) without my other fixes, it would take hours. Trying to do ten 
billion (100k x 100k) SHA1 and pathname compares simply isn't going to 
work.

But I *can* compare it to the old code *with* the rename limiting, which 
still wastes all the time on the whole "copy usage" crap. So here are the 
numbers on that repo without the patch:

	[torvalds@woody big-rename]$ time git show -C | wc -l
	1400006
	
	real    0m12.383s
	user    0m12.365s
	sys     0m0.160s

That's right: we used to take 12 seconds and not even do renames (now you 
see fourteen lines per file moved: seven lines each for the delete and the 
create of a one-liner file, and the same extra six lines of commit 
information).

So it not only makes the rename detection possible in the first place, it 
removes some stupid code that made it take a long time even when it 
failed!

Now, I'd still be careful with this patch, and I'd really like people to 
double-check all my logic, but I think it's worthy of some 'pu' love.

		Linus
---

 Makefile          |    4 +-
 diff.c            |   24 +----
 diffcore-rename.c |  275 ++++++++++++++++++++++++++++++----------------------
 diffcore.h        |    2 +-
 hash.c            |  110 +++++++++++++++++++++
 hash.h            |   43 ++++++++
 6 files changed, 321 insertions(+), 137 deletions(-)

diff --git a/Makefile b/Makefile
index 8db4dbe..b1ca186 100644
--- a/Makefile
+++ b/Makefile
@@ -291,7 +291,7 @@ LIB_H = \
 	run-command.h strbuf.h tag.h tree.h git-compat-util.h revision.h \
 	tree-walk.h log-tree.h dir.h path-list.h unpack-trees.h builtin.h \
 	utf8.h reflog-walk.h patch-ids.h attr.h decorate.h progress.h \
-	mailmap.h remote.h
+	mailmap.h remote.h hash.h
 
 DIFF_OBJS = \
 	diff.o diff-lib.o diffcore-break.o diffcore-order.o \
@@ -301,7 +301,7 @@ DIFF_OBJS = \
 LIB_OBJS = \
 	blob.o commit.o connect.o csum-file.o cache-tree.o base85.o \
 	date.o diff-delta.o entry.o exec_cmd.o ident.o \
-	interpolate.o \
+	interpolate.o hash.o \
 	lockfile.o \
 	patch-ids.o \
 	object.o pack-check.o pack-write.o patch-delta.o path.o pkt-line.o \
diff --git a/diff.c b/diff.c
index 6648e01..e892030 100644
--- a/diff.c
+++ b/diff.c
@@ -2586,9 +2586,9 @@ void diff_debug_filepair(const struct diff_filepair *p, int i)
 {
 	diff_debug_filespec(p->one, i, "one");
 	diff_debug_filespec(p->two, i, "two");
-	fprintf(stderr, "score %d, status %c stays %d broken %d\n",
+	fprintf(stderr, "score %d, status %c rename_used %d broken %d\n",
 		p->score, p->status ? p->status : '?',
-		p->source_stays, p->broken_pair);
+		p->rename_used, p->broken_pair);
 }
 
 void diff_debug_queue(const char *msg, struct diff_queue_struct *q)
@@ -2606,8 +2606,8 @@ void diff_debug_queue(const char *msg, struct diff_queue_struct *q)
 
 static void diff_resolve_rename_copy(void)
 {
-	int i, j;
-	struct diff_filepair *p, *pp;
+	int i;
+	struct diff_filepair *p;
 	struct diff_queue_struct *q = &diff_queued_diff;
 
 	diff_debug_queue("resolve-rename-copy", q);
@@ -2629,27 +2629,15 @@ static void diff_resolve_rename_copy(void)
 		 * either in-place edit or rename/copy edit.
 		 */
 		else if (DIFF_PAIR_RENAME(p)) {
-			if (p->source_stays) {
-				p->status = DIFF_STATUS_COPIED;
-				continue;
-			}
 			/* See if there is some other filepair that
 			 * copies from the same source as us.  If so
 			 * we are a copy.  Otherwise we are either a
 			 * copy if the path stays, or a rename if it
 			 * does not, but we already handled "stays" case.
 			 */
-			for (j = i + 1; j < q->nr; j++) {
-				pp = q->queue[j];
-				if (strcmp(pp->one->path, p->one->path))
-					continue; /* not us */
-				if (!DIFF_PAIR_RENAME(pp))
-					continue; /* not a rename/copy */
-				/* pp is a rename/copy from the same source */
+			if (--p->one->rename_used > 0)
 				p->status = DIFF_STATUS_COPIED;
-				break;
-			}
-			if (!p->status)
+			else
 				p->status = DIFF_STATUS_RENAMED;
 		}
 		else if (hashcmp(p->one->sha1, p->two->sha1) ||
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 2077a9b..cc105db 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -4,6 +4,7 @@
 #include "cache.h"
 #include "diff.h"
 #include "diffcore.h"
+#include "hash.h"
 
 /* Table of rename/copy destinations */
 
@@ -55,12 +56,10 @@ static struct diff_rename_dst *locate_rename_dst(struct diff_filespec *two,
 static struct diff_rename_src {
 	struct diff_filespec *one;
 	unsigned short score; /* to remember the break score */
-	unsigned src_path_left : 1;
 } *rename_src;
 static int rename_src_nr, rename_src_alloc;
 
 static struct diff_rename_src *register_rename_src(struct diff_filespec *one,
-						   int src_path_left,
 						   unsigned short score)
 {
 	int first, last;
@@ -92,33 +91,9 @@ static struct diff_rename_src *register_rename_src(struct diff_filespec *one,
 			(rename_src_nr - first - 1) * sizeof(*rename_src));
 	rename_src[first].one = one;
 	rename_src[first].score = score;
-	rename_src[first].src_path_left = src_path_left;
 	return &(rename_src[first]);
 }
 
-static int is_exact_match(struct diff_filespec *src,
-			  struct diff_filespec *dst,
-			  int contents_too)
-{
-	if (src->sha1_valid && dst->sha1_valid &&
-	    !hashcmp(src->sha1, dst->sha1))
-		return 1;
-	if (!contents_too)
-		return 0;
-	if (diff_populate_filespec(src, 1) || diff_populate_filespec(dst, 1))
-		return 0;
-	if (src->size != dst->size)
-		return 0;
-	if (src->sha1_valid && dst->sha1_valid)
-	    return !hashcmp(src->sha1, dst->sha1);
-	if (diff_populate_filespec(src, 0) || diff_populate_filespec(dst, 0))
-		return 0;
-	if (src->size == dst->size &&
-	    !memcmp(src->data, dst->data, src->size))
-		return 1;
-	return 0;
-}
-
 static int basename_same(struct diff_filespec *src, struct diff_filespec *dst)
 {
 	int src_len = strlen(src->path), dst_len = strlen(dst->path);
@@ -216,6 +191,7 @@ static void record_rename_pair(int dst_index, int src_index, int score)
 		die("internal error: dst already matched.");
 
 	src = rename_src[src_index].one;
+	src->rename_used++;
 	one = alloc_filespec(src->path);
 	fill_filespec(one, src->sha1, src->mode);
 
@@ -229,7 +205,6 @@ static void record_rename_pair(int dst_index, int src_index, int score)
 		dp->score = rename_src[src_index].score;
 	else
 		dp->score = score;
-	dp->source_stays = rename_src[src_index].src_path_left;
 	rename_dst[dst_index].pair = dp;
 }
 
@@ -247,19 +222,127 @@ static int score_compare(const void *a_, const void *b_)
 	return b->score - a->score;
 }
 
-static int compute_stays(struct diff_queue_struct *q,
-			 struct diff_filespec *one)
+struct file_similarity {
+	int src_dst, index;
+	struct diff_filespec *filespec;
+	struct file_similarity *next;
+};
+
+static int find_identical_files(struct file_similarity *src,
+				struct file_similarity *dst)
 {
-	int i;
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		if (strcmp(one->path, p->two->path))
-			continue;
-		if (DIFF_PAIR_RENAME(p)) {
-			return 0; /* something else is renamed into this */
+	int renames = 0;
+	do {
+		struct diff_filespec *one = src->filespec;
+		struct file_similarity *p, *best;
+		int i = 100;
+
+		best = NULL;
+		for (p = dst; p; p = p->next) {
+			struct diff_filespec *two = p->filespec;
+
+			/* Already picked as a destination? */
+			if (!p->src_dst)
+				continue;
+			/* False hash collission? */
+			if (hashcmp(one->sha1, two->sha1))
+				continue;
+			best = p;
+			if (basename_same(one, two))
+				break;
+
+			/* Too many identical alternatives? Pick one */
+			if (!--i)
+				break;
 		}
+		if (best) {
+			best->src_dst = 0;
+			record_rename_pair(best->index, src->index, MAX_SCORE);
+			renames++;
+		}
+	} while ((src = src->next) != NULL);
+	return renames;
+}
+
+/*
+ * Note: the rest of the rename logic depends on this
+ * phase also populating all the filespecs for any
+ * entry that isn't matched up with an exact rename.
+ */
+static void free_similarity_list(struct file_similarity *p)
+{
+	while (p) {
+		struct file_similarity *entry = p;
+		p = p->next;
+
+		/* Stupid special case, see note above! */
+		diff_populate_filespec(entry->filespec, 0);
+		free(entry);
+	}
+}
+
+static int find_same_files(void *ptr)
+{
+	int ret;
+	struct file_similarity *p = ptr;
+	struct file_similarity *src = NULL, *dst = NULL;
+
+	/* Split the hash list up into sources and destinations */
+	do {
+		struct file_similarity *entry = p;
+		p = p->next;
+		if (entry->src_dst < 0) {
+			entry->next = src;
+			src = entry;
+		} else {
+			entry->next = dst;
+			dst = entry;
+		}
+	} while (p);
+
+	/*
+	 * If we have both sources *and* destinations, see if
+	 * we can match them up
+	 */
+	ret = (src && dst) ? find_identical_files(src, dst) : 0;
+
+	/* Free the hashes and return the number of renames found */
+	free_similarity_list(src);
+	free_similarity_list(dst);
+	return ret;
+}
+
+static unsigned int hash_filespec(struct diff_filespec *filespec)
+{
+	unsigned int hash;
+	if (!filespec->sha1_valid) {
+		if (diff_populate_filespec(filespec, 0))
+			return 0;
+		hash_sha1_file(filespec->data, filespec->size, "blob", filespec->sha1);
+	}
+	memcpy(&hash, filespec->sha1, sizeof(hash));
+	return hash;
+}
+
+static void insert_file_table(struct hash_table *table, int src_dst, int index, struct diff_filespec *filespec)
+{
+	void **pos;
+	unsigned int hash;
+	struct file_similarity *entry = xmalloc(sizeof(*entry));
+
+	entry->src_dst = src_dst;
+	entry->index = index;
+	entry->filespec = filespec;
+	entry->next = NULL;
+
+	hash = hash_filespec(filespec);
+	pos = insert_hash(hash, entry, table);
+
+	/* We already had an entry there? */
+	if (pos) {
+		entry->next = *pos;
+		*pos = entry;
 	}
-	return 1;
 }
 
 /*
@@ -268,50 +351,26 @@ static int compute_stays(struct diff_queue_struct *q,
  * The first round matches up the up-to-date entries,
  * and then during the second round we try to match
  * cache-dirty entries as well.
- *
- * Note: the rest of the rename logic depends on this
- * phase also populating all the filespecs for any
- * entry that isn't matched up with an exact rename,
- * see "is_exact_match()".
  */
 static int find_exact_renames(void)
 {
-	int rename_count = 0;
-	int contents_too;
-
-	for (contents_too = 0; contents_too < 2; contents_too++) {
-		int i;
-
-		for (i = 0; i < rename_dst_nr; i++) {
-			struct diff_filespec *two = rename_dst[i].two;
-			int j;
-
-			if (rename_dst[i].pair)
-				continue; /* dealt with an earlier round */
-			for (j = 0; j < rename_src_nr; j++) {
-				int k;
-				struct diff_filespec *one = rename_src[j].one;
-				if (!is_exact_match(one, two, contents_too))
-					continue;
-
-				/* see if there is a basename match, too */
-				for (k = j; k < rename_src_nr; k++) {
-					one = rename_src[k].one;
-					if (basename_same(one, two) &&
-						is_exact_match(one, two,
-							contents_too)) {
-						j = k;
-						break;
-					}
-				}
-
-				record_rename_pair(i, j, (int)MAX_SCORE);
-				rename_count++;
-				break; /* we are done with this entry */
-			}
-		}
-	}
-	return rename_count;
+	int i;
+	struct hash_table file_table;
+
+	init_hash(&file_table);
+	for (i = 0; i < rename_src_nr; i++)
+		insert_file_table(&file_table, -1, i, rename_src[i].one);
+
+	for (i = 0; i < rename_dst_nr; i++)
+		insert_file_table(&file_table, 1, i, rename_dst[i].two);
+
+	/* Find the renames */
+	i = for_each_hash(&file_table, find_same_files);
+
+	/* .. and free the hash data structure */
+	free_hash(&file_table);
+
+	return i;
 }
 
 void diffcore_rename(struct diff_options *options)
@@ -340,20 +399,36 @@ void diffcore_rename(struct diff_options *options)
 				locate_rename_dst(p->two, 1);
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
-			/* If the source is a broken "delete", and
+			/*
+			 * If the source is a broken "delete", and
 			 * they did not really want to get broken,
 			 * that means the source actually stays.
+			 * So we increment the "rename_used" score
+			 * by one, to indicate ourselves as a user
 			 */
-			int stays = (p->broken_pair && !p->score);
-			register_rename_src(p->one, stays, p->score);
+			if (p->broken_pair && !p->score)
+				p->one->rename_used++;
+			register_rename_src(p->one, p->score);
+		}
+		else if (detect_rename == DIFF_DETECT_COPY) {
+			/*
+			 * Increment the "rename_used" score by
+			 * one, to indicate ourselves as a user.
+			 */
+			p->one->rename_used++;
+			register_rename_src(p->one, p->score);
 		}
-		else if (detect_rename == DIFF_DETECT_COPY)
-			register_rename_src(p->one, 1, p->score);
 	}
 	if (rename_dst_nr == 0 || rename_src_nr == 0)
 		goto cleanup; /* nothing to do */
 
 	/*
+	 * We really want to cull the candidates list early
+	 * with cheap tests in order to avoid doing deltas.
+	 */
+	rename_count = find_exact_renames();
+
+	/*
 	 * This basically does a test for the rename matrix not
 	 * growing larger than a "rename_limit" square matrix, ie:
 	 *
@@ -369,12 +444,6 @@ void diffcore_rename(struct diff_options *options)
 	if (rename_dst_nr * rename_src_nr > rename_limit * rename_limit)
 		goto cleanup;
 
-	/*
-	 * We really want to cull the candidates list early
-	 * with cheap tests in order to avoid doing deltas.
-	 */
-	rename_count = find_exact_renames();
-
 	/* Have we run out the created file pool?  If so we can avoid
 	 * doing the delta matrix altogether.
 	 */
@@ -474,16 +543,7 @@ void diffcore_rename(struct diff_options *options)
 					pair_to_free = p;
 			}
 			else {
-				for (j = 0; j < rename_dst_nr; j++) {
-					if (!rename_dst[j].pair)
-						continue;
-					if (strcmp(rename_dst[j].pair->
-						   one->path,
-						   p->one->path))
-						continue;
-					break;
-				}
-				if (j < rename_dst_nr)
+				if (p->one->rename_used)
 					/* this path remains */
 					pair_to_free = p;
 			}
@@ -509,23 +569,6 @@ void diffcore_rename(struct diff_options *options)
 	*q = outq;
 	diff_debug_queue("done collapsing", q);
 
-	/* We need to see which rename source really stays here;
-	 * earlier we only checked if the path is left in the result,
-	 * but even if a path remains in the result, if that is coming
-	 * from copying something else on top of it, then the original
-	 * source is lost and does not stay.
-	 */
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		if (DIFF_PAIR_RENAME(p) && p->source_stays) {
-			/* If one appears as the target of a rename-copy,
-			 * then mark p->source_stays = 0; otherwise
-			 * leave it as is.
-			 */
-			p->source_stays = compute_stays(q, p->one);
-		}
-	}
-
 	for (i = 0; i < rename_dst_nr; i++) {
 		diff_free_filespec_data(rename_dst[i].two);
 		free(rename_dst[i].two);
diff --git a/diffcore.h b/diffcore.h
index eb618b1..ceda932 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -30,6 +30,7 @@ struct diff_filespec {
 	const char *funcname_pattern_ident;
 	unsigned long size;
 	int xfrm_flags;		 /* for use by the xfrm */
+	int rename_used;         /* Count of rename users */
 	unsigned short mode;	 /* file mode */
 	unsigned sha1_valid : 1; /* if true, use sha1 and trust mode;
 				  * if false, use the name and read from
@@ -56,7 +57,6 @@ struct diff_filepair {
 	struct diff_filespec *two;
 	unsigned short int score;
 	char status; /* M C R N D U (see Documentation/diff-format.txt) */
-	unsigned source_stays : 1; /* all of R/C are copies */
 	unsigned broken_pair : 1;
 	unsigned renamed_pair : 1;
 	unsigned is_unmerged : 1;
diff --git a/hash.c b/hash.c
new file mode 100644
index 0000000..7b492d4
--- /dev/null
+++ b/hash.c
@@ -0,0 +1,110 @@
+/*
+ * Some generic hashing helpers.
+ */
+#include "cache.h"
+#include "hash.h"
+
+/*
+ * Look up a hash entry in the hash table. Return the pointer to
+ * the existing entry, or the empty slot if none existed. The caller
+ * can then look at the (*ptr) to see whether it existed or not.
+ */
+static struct hash_table_entry *lookup_hash_entry(unsigned int hash, struct hash_table *table)
+{
+	unsigned int size = table->size, nr = hash % size;
+	struct hash_table_entry *array = table->array;
+
+	while (array[nr].ptr) {
+		if (array[nr].hash == hash)
+			break;
+		nr++;
+		if (nr >= size)
+			nr = 0;
+	}
+	return array + nr;
+}
+
+
+/*
+ * Insert a new hash entry pointer into the table.
+ *
+ * If that hash entry already existed, return the pointer to
+ * the existing entry (and the caller can create a list of the
+ * pointers or do anything else). If it didn't exist, return
+ * NULL (and the caller knows the pointer has been inserted).
+ */
+static void **insert_hash_entry(unsigned int hash, void *ptr, struct hash_table *table)
+{
+	struct hash_table_entry *entry = lookup_hash_entry(hash, table);
+
+	if (!entry->ptr) {
+		entry->ptr = ptr;
+		entry->hash = hash;
+		table->nr++;
+		return NULL;
+	}
+	return &entry->ptr;
+}
+
+static void grow_hash_table(struct hash_table *table)
+{
+	unsigned int i;
+	unsigned int old_size = table->size, new_size;
+	struct hash_table_entry *old_array = table->array, *new_array;
+
+	new_size = alloc_nr(old_size);
+	new_array = xcalloc(sizeof(struct hash_table_entry), new_size);
+	table->size = new_size;
+	table->array = new_array;
+	table->nr = 0;
+	for (i = 0; i < old_size; i++) {
+		unsigned int hash = old_array[i].hash;
+		void *ptr = old_array[i].ptr;
+		if (ptr)
+			insert_hash_entry(hash, ptr, table);
+	}
+	free(old_array);
+}
+
+void *lookup_hash(unsigned int hash, struct hash_table *table)
+{
+	if (!table->array)
+		return NULL;
+	return &lookup_hash_entry(hash, table)->ptr;
+}
+
+void **insert_hash(unsigned int hash, void *ptr, struct hash_table *table)
+{
+	unsigned int nr = table->nr;
+	if (nr >= table->size/2)
+		grow_hash_table(table);
+	return insert_hash_entry(hash, ptr, table);
+}
+
+int for_each_hash(struct hash_table *table, int (*fn)(void *))
+{
+	int sum = 0;
+	unsigned int i;
+	unsigned int size = table->size;
+	struct hash_table_entry *array = table->array;
+
+	for (i = 0; i < size; i++) {
+		void *ptr = array->ptr;
+		array++;
+		if (ptr) {
+			int val = fn(ptr);
+			if (val < 0)
+				return val;
+			sum += val;
+		}
+	}
+	return sum;
+}
+
+void free_hash(struct hash_table *table)
+{
+	free(table->array);
+	table->array = NULL;
+	table->size = 0;
+	table->nr = 0;
+}
diff --git a/hash.h b/hash.h
new file mode 100644
index 0000000..5056c9a
--- /dev/null
+++ b/hash.h
@@ -0,0 +1,43 @@
+#ifndef HASH_H
+#define HASH_H
+
+/*
+ * These are some simple generic hash table helper functions.
+ * Not necessarily suitable for all users, but good for things
+ * where you want to just keep track of a list of things, and
+ * have a good hash to use on them.
+ *
+ * It keeps the hash table at roughly 50-75% free, so the memory
+ * cost of the hash table itself is roughly
+ *
+ *	3 * 2*sizeof(void *) * nr_of_objects
+ *
+ * bytes. 
+ *
+ * FIXME: on 64-bit architectures, we waste memory. It would be
+ * good to have just 32-bit pointers, requiring a special allocator
+ * for hashed entries or something.
+ */
+struct hash_table_entry {
+	unsigned int hash;
+	void *ptr;
+};
+
+struct hash_table {
+	unsigned int size, nr;
+	struct hash_table_entry *array;
+};
+
+extern void *lookup_hash(unsigned int hash, struct hash_table *table);
+extern void **insert_hash(unsigned int hash, void *ptr, struct hash_table *table);
+extern int for_each_hash(struct hash_table *table, int (*fn)(void *));
+extern void free_hash(struct hash_table *table);
+
+static inline void init_hash(struct hash_table *table)
+{
+	table->size = 0;
+	table->nr = 0;
+	table->array = NULL;
+}
+
+#endif

^ permalink raw reply related

* Re: [PATCH] use only the PATH for exec'ing git commands
From: Johannes Schindelin @ 2007-10-22 17:44 UTC (permalink / raw)
  To: Scott R Parish; +Cc: git
In-Reply-To: <20071022170148.GB29642@srparish.net>

Hi,

On Mon, 22 Oct 2007, Scott R Parish wrote:

>  3 files changed, 61 insertions(+), 105 deletions(-)

Nice.

Ciao,
Dscho

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox