Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH] Mark gitk script executable
From: Jonathan Nieder @ 2011-01-07  3:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Anders Kaseorg, git, Paul Mackerras
In-Reply-To: <7vlj2x8mr4.fsf@alter.siamese.dyndns.org>

Junio C Hamano wrote:
> Anders Kaseorg <andersk@MIT.EDU> writes:

>> The executable bit on gitk-git/gitk was lost (accidentally it seems) by 
>> commit 62ba5143ec2ab9d4083669b1b1679355e7639cd5.  Put it back, so that 
>> gitk can be run directly from a git.git checkout.
>>
>> Note that the script is already executable in gitk.git, just not in 
>> git.git.
>
> It did not lose the bit by accident but 62ba5143 pretty much was a
> deliberate fix.  "gitk" is a source file, and its build product,
> gitk-wish, is what is eventually installed with executable bit on.

How does this case differ from other executable source files like
git-am.sh?

^ permalink raw reply

* Re: git mergetool broken when rerere active
From: Martin von Zweigbergk @ 2011-01-07  2:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Martin von Zweigbergk, git, Magnus Baeck, Avery Pennarun,
	Jay Soffian, David Aguilar
In-Reply-To: <7v62u1bzeg.fsf@alter.siamese.dyndns.org>

On Thu, 6 Jan 2011, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Probably we would need a "git rerere remaining" sobcommand that is similar
> > to status but also includes the "punted" paths.
> 
> ... which may look like this.
> 
> Replace use of "git rerere status" in bb0a484 (mergetool: Skip
> autoresolved paths, 2010-08-17) with "git rerere remainder" and see what
> happens.

I think it almost works. The only thing I can see that is not handled
is the case where a file that had conflicts that have now been
resolved and the file has been added to the index. These files still
show up in 'git rerere remaining' and thus in 'git mergetool'. It is
of course possible in mergetool to calculate the set difference
between `git rerere remaining` and the set of added files, but would
that be the right thing to do? Should 'git rerere remaining' rather be
changed?

Would it be easier to add a 'git rerere resolved' that lists the files
for which all conflicts have been resolved by rerere?x We could then
make mergetool use the usual list of files from 'git ls-files -u' and
subtract these files.

^ permalink raw reply

* Re: Resumable clone/Gittorrent (again) - stable packs?
From: Zenaan Harkness @ 2011-01-07  2:36 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <alpine.LFD.2.00.1101061552580.22191@xanadu.home>

On Fri, Jan 7, 2011 at 08:09, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 6 Jan 2011, Zenaan Harkness wrote:
>
>> Bittorrent requires some stability around torrent files.
>>
>> Can packs be generated deterministically?
>
> They _could_, but we do _not_ want to do that.
>
> The only thing which is stable in Git is the canonical representation of
> objects, and the objects they depend on, expressed by their SHA1
> signature.  Any BitTorrent-alike design for Git must be based on that
> property and not the packed representation of those objects which is not
> meant to be stable.
>
> If you don't want to design anything and simply reuse current BitTorrent
> codebase then simply create a Git bundle from some release version and
> seed that bundle for a sufficiently long period to be worth it.  Then
> falling back to git fetch in order to bring the repo up to date with the
> very latest commits should be small and quick.  When that clone gets too
> big then it's time to start seeding another more up-to-date bundle.

Thanks guys for the explanations.

So, we don't _want_ to generate packs deterministically.
BUT, we _can_ reliably unpack a pack (duh).

So if my configured "canonical upstream" decides on a particular
compression etc, I (my git client) doesn't care what has been chosen
by my upstream.

What is important for torrent-able packs though is stability over some
time period, no matter what the format.

There's been much talk of caching, invalidating of caches, overlapping
torrent-packs etc.

In every case, for torrents to work, the P2P'd files must have some
stability over some time period.
(If this assumption is incorrect, please clarify, not counting
every-file-is-a-torrent and every-commit-is-a-torrent.)

So, torrentable options:
- torrent per commit
- torrent per pack
- torrent per torrent-archive - new file format

Torrent per commit - too small, too many torrents; we need larger
p2p-able sizes in general.

Torrent per pack - packs non-deterministically created, both between
hosts and even intra-host (libz upgrade, nr_threads change, git pack
algorithm optimization).

A new torrent format, if "close enough" to current git pack
performance (cpu load, threadability, size) is potential for new
version of git pack file format - we don't want to store two sets of
pack files on disk, if sensible to not do so; unlikely to happen - I
can't conceive that a torrentable format would be anything but worse
than pack files and therefore would be rejected from git master.

Can we can relax the perceived requirement to deterministically create
pack files?
Well, over what time period are pack files stable in a particular git?
Over what time period do we require stable files for torrenting?

Can we simply configure our local git to keep specified pack files for
specified time period?
And use those for torrent-packs?
Perhaps the torrent file could have a UseBy date?

Zen

^ permalink raw reply

* Re: [PATCH] Mark gitk script executable
From: Junio C Hamano @ 2011-01-07  2:34 UTC (permalink / raw)
  To: Anders Kaseorg; +Cc: git
In-Reply-To: <alpine.DEB.2.02.1101061943140.6372@dr-wily.mit.edu>

Anders Kaseorg <andersk@MIT.EDU> writes:

> The executable bit on gitk-git/gitk was lost (accidentally it seems) by 
> commit 62ba5143ec2ab9d4083669b1b1679355e7639cd5.  Put it back, so that 
> gitk can be run directly from a git.git checkout.
>
> Note that the script is already executable in gitk.git, just not in 
> git.git.

It did not lose the bit by accident but 62ba5143 pretty much was a
deliberate fix.  "gitk" is a source file, and its build product,
gitk-wish, is what is eventually installed with executable bit on.

^ permalink raw reply

* Re: [PATCH] t9010: svnadmin can fail even if available
From: Jonathan Nieder @ 2011-01-07  1:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: A Large Angry SCM, Git Mailing List, Ramkumar Ramachandra
In-Reply-To: <7vpqs98qti.fsf@alter.siamese.dyndns.org>

Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>>> Also isn't the breakage not just this test, but also in all the tests that
>>> try to run "svnadmin load"?  Shouldn't we somehow hoist this logic out of
>>> t9010 and put it in t/lib-vcs-svn.sh or somewhere?
>
> Am I mistaken and t9010 is the only one that needs the fix in your patch?

Everything except t9010 is testing git-svn.  My argument before was
that this libsvn_subr breakage is going to leave git-svn broken
anyway, so why bother guarding against it in tests?

That precise argument is broken in at least three ways:

 - the libsvn_subr breakage would only breaks "git svn dcommit" (and
   "git svn set-tree") and only when the remote svn repo is actually a
   local svn repo using the FSFS backend.

 - maybe the next time "svnadmin load" breaks it will not break
   git svn

 - this particular breakage seems especially unworthy of workarounds
   because it is so easy to fix on the svn side.

So why do I like the patch to t9010?  Two reasons: first, it means
t9010 could run on systems without svn installed at all, and second,
it means that there is one less test to worry about if svn fails in
some other way in the future.

To do the same for t91* would be impossible.  If svn is broken or not
installed, svn-fe will run fine, but "git svn" will not.  On the other
hand, if svnadmin were broken but svn still worked, "git svn" would be
fine but that would be quite strange and I do not think it is worth
spending time to prepare for.

^ permalink raw reply

* Re: [msysGit] [PATCH/RFC] alias: use run_command api to execute aliases
From: Johannes Schindelin @ 2011-01-07  1:17 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: git, msysgit, j6t
In-Reply-To: <1294341187-3956-1-git-send-email-kusmabite@gmail.com>

Hi,

On Thu, 6 Jan 2011, Erik Faye-Lund wrote:

> On Windows, system() executes with cmd.exe instead of /bin/sh. This
> means that aliases currently has to be batch-scripts instead of
> bourne-scripts. On top of that, cmd.exe does not handle single quotes,
> which is what the code-path currently uses to handle arguments with
> spaces.
> 
> To solve both problems in one go, use run_command_v_opt() to execute
> the alias. It already does the right thing prepend "sh -c " to the
> alias.

Would this not break setups where aliases were defined to execute batch 
scripts?

If this is true, I'm of two minds here.

Ciao,
Dscho

^ permalink raw reply

* [PATCH] Mark gitk script executable
From: Anders Kaseorg @ 2011-01-07  1:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

The executable bit on gitk-git/gitk was lost (accidentally it seems) by 
commit 62ba5143ec2ab9d4083669b1b1679355e7639cd5.  Put it back, so that 
gitk can be run directly from a git.git checkout.

Note that the script is already executable in gitk.git, just not in 
git.git.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
---
 0 files changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 gitk-git/gitk

diff --git a/gitk-git/gitk b/gitk-git/gitk
old mode 100644
new mode 100755
-- 
1.7.4-rc0

^ permalink raw reply

* Re: [PATCH] t9010: svnadmin can fail even if available
From: Junio C Hamano @ 2011-01-07  1:07 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: A Large Angry SCM, Git Mailing List, Ramkumar Ramachandra
In-Reply-To: <20110106204605.GA15090@burratino>

Jonathan Nieder <jrnieder@gmail.com> writes:

>> Also isn't the breakage not just this test, but also in all the tests that
>> try to run "svnadmin load"?  Shouldn't we somehow hoist this logic out of
>> t9010 and put it in t/lib-vcs-svn.sh or somewhere?

Am I mistaken and t9010 is the only one that needs the fix in your patch?

^ permalink raw reply

* [PATCH] gitk: Take only numeric version components when computing $git_version
From: Anders Kaseorg @ 2011-01-07  0:42 UTC (permalink / raw)
  To: Paul Mackerras, Junio C Hamano; +Cc: Mathias Lafeldt, git
In-Reply-To: <4D231646.5080005@debugon.org>

This fixes errors running with release candidate versions of Git:
  Error in startup script: expected version number but got "1.7.4-rc0"

Also, $git_version is no longer artificially limited to three
components.  That limitation was added by commit
194bbf6cc8c2f3c14a920c841841d66b7667a848 to deal with msysGit version
strings like “1.6.4.msysgit.0”, and we don’t need it now.  Hence as
another side effect, this enables showing notes with git version
1.6.6.2 or 1.6.6.3, as originally intended by commit
7defefb134270b6e8ab3e422b343b41a4a383f5d.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
---
 gitk-git/gitk |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gitk-git/gitk b/gitk-git/gitk
index e82c6bf..9cbc09d 100755
--- a/gitk-git/gitk
+++ b/gitk-git/gitk
@@ -11581,7 +11581,7 @@ if {![info exists have_ttk]} {
 set use_ttk [expr {$have_ttk && $want_ttk}]
 set NS [expr {$use_ttk ? "ttk" : ""}]
 
-set git_version [join [lrange [split [lindex [exec git version] end] .] 0 2] .]
+regexp {^git version ([\d.]*\d)} [exec git version] _ git_version
 
 set show_notes {}
 if {[package vcompare $git_version "1.6.6.2"] >= 0} {
-- 
1.7.4-rc0

^ permalink raw reply related

* Re: [PATCH/RFC] Documentation/checkout: explain behavior wrt local changes
From: Junio C Hamano @ 2011-01-07  0:16 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: r.ductor, git, Jeff King
In-Reply-To: <20110106225222.GA15900@burratino>

Jonathan Nieder <jrnieder@gmail.com> writes:

>  'git checkout' [<branch>]::
>  'git checkout' -b|-B <new_branch> [<start point>]::
>  
> -	This form switches branches by updating the index, working
> -	tree, and HEAD to reflect the specified branch.
> +	This form switches branches by changing `HEAD` and updating the
> +	tracked files to the specified branch.  'git checkout' will
> +	stop without doing anything if local changes overlap with
> +	changes to the tracked files.  (Any local changes that do not
> +	overlap with changes from `HEAD` to the specified branch will
> +	be preserved.)

Even though I am fully aware that "switch" was an attempt to match the
terminology from another SCM (e.g. svn), it may help readers in the longer
term if we stop using "switch" and instead explain this as "checking out a
branch" and what this action is _for_.  I suspect that an explanation of
the purpose would clarify the reason why the command does what it does.

You check out a branch to work on that branch, namely, eventually to
commit changes to that branch (as opposed to committing them to the
current branch).  For that purpose, the local changes obviously need to be
preserved.  And that is why it refuses to act if paths with local changes
need to be updated.

Side note.  When you check out a path (or a set of paths), you don't work
on these files (git does not track individual files), hence there is no
possible interpretation other than "get a fresh unmodified copy out of
something" when the command is used with paths, i.e. "checking out paths
(either out of the index or out of a named commit)".

In any case, your rewrite is certainly an improvement compared to an
unexplained "reflect", so I think this is a step in the right direction.

Thanks.

^ permalink raw reply

* Re: concurrent fetches to update same mirror
From: Jeff King @ 2011-01-06 23:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, Neal Kreitzinger, git
In-Reply-To: <7vbp3vc4k4.fsf@alter.siamese.dyndns.org>

On Wed, Jan 05, 2011 at 03:29:47PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Interestingly, in the case of ref _creation_, not update, like this:
> >
> >   mkdir repo && cd repo && git init
> >   git remote add origin some-remote-repo-that-takes-a-few-seconds
> >   xterm -e 'git fetch -v; read' & xterm -e 'git fetch -v; read'
> >
> > then both will happily update, the second one overwriting the results of
> > the first. It seems in the case of locking a ref which previously didn't
> > exist, we don't enforce that it still doesn't exist.
> 
> We probably should, especially when there is no --force or +prefix is
> involved.

Hmph. So I created the test below to try to exercise this, expecting to
see at least one failure: according to the above example, we aren't
actually checking "null sha1 means ref must not exist", so we should get
an erroneous success for that case. And there is the added complication
that the null sha1 may also mean "don't care what the old one was". So
even if I changed the code, we would get erroneous failures the other
way.

But much to my surprise, it actually passes with stock git. Which means
I need to dig a little further to see exactly what is going on.

Hooray for test-driven development, I guess? :)

diff --git a/t/t1403-concurrent-refs.sh b/t/t1403-concurrent-refs.sh
new file mode 100755
index 0000000..7fb4424
--- /dev/null
+++ b/t/t1403-concurrent-refs.sh
@@ -0,0 +1,49 @@
+#!/bin/sh
+
+test_description='test behavior of concurrent ref updates'
+. ./test-lib.sh
+
+ref=refs/heads/foo
+null=0000000000000000000000000000000000000000
+
+check_ref() {
+	echo $2 >expect &&
+	git rev-parse --verify $1 >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success setup '
+	for name in A B C; do
+		test_tick &&
+		T=$(git write-tree) &&
+		sha1=$(echo $name | git commit-tree $T) &&
+		eval $name=$sha1
+	done
+'
+
+test_expect_success '(create ref, expecting non-null sha1) should fail' '
+	test_must_fail git update-ref $ref $A $C &&
+	test_must_fail git rev-parse --verify $ref
+'
+
+test_expect_success '(create ref, expecting null sha1) should work' '
+	git update-ref $ref $A $null &&
+	check_ref $ref $A
+'
+
+test_expect_success '(update ref, expecting null sha1) should fail' '
+	test_must_fail git update-ref $ref $B $null &&
+	check_ref $ref $A
+'
+
+test_expect_success '(update ref, expecting wrong sha1) should fail' '
+	test_must_fail git update-ref $ref $B $C &&
+	check_ref $ref $A
+'
+
+test_expect_success '(update ref, expecting current sha1) should work' '
+	git update-ref $ref $B $A &&
+	check_ref $ref $B
+'
+
+test_done

^ permalink raw reply related

* [PATCH v3] alias: use run_command api to execute aliases
From: Erik Faye-Lund @ 2011-01-06 23:00 UTC (permalink / raw)
  To: git; +Cc: msysgit, j6t, jrnieder

On Windows, system() executes with cmd.exe instead of /bin/sh. This
means that aliases currently has to be batch-scripts instead of
bourne-scripts. On top of that, cmd.exe does not handle single quotes,
which is what the code-path currently uses to handle arguments with
spaces.

To solve both problems in one go, use run_command_v_opt() to execute
the alias. It already does the right thing prepend "sh -c " to the
alias.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
---

Fixed the issues pointed out by Jonathan Nieder. For real this time.
Also changed die to die_errno, as suggested by Johannes Sixt.

 git.c |   34 +++++++++++++++++-----------------
 1 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/git.c b/git.c
index 68334f6..23610aa 100644
--- a/git.c
+++ b/git.c
@@ -177,24 +177,24 @@ static int handle_alias(int *argcp, const char ***argv)
 	alias_string = alias_lookup(alias_command);
 	if (alias_string) {
 		if (alias_string[0] == '!') {
+			const char **alias_argv;
+			int argc = *argcp, i;
+
 			commit_pager_choice();
-			if (*argcp > 1) {
-				struct strbuf buf;
-
-				strbuf_init(&buf, PATH_MAX);
-				strbuf_addstr(&buf, alias_string);
-				sq_quote_argv(&buf, (*argv) + 1, PATH_MAX);
-				free(alias_string);
-				alias_string = buf.buf;
-			}
-			trace_printf("trace: alias to shell cmd: %s => %s\n",
-				     alias_command, alias_string + 1);
-			ret = system(alias_string + 1);
-			if (ret >= 0 && WIFEXITED(ret) &&
-			    WEXITSTATUS(ret) != 127)
-				exit(WEXITSTATUS(ret));
-			die("Failed to run '%s' when expanding alias '%s'",
-			    alias_string + 1, alias_command);
+
+			/* build alias_argv */
+			alias_argv = xmalloc(sizeof(*alias_argv) * (argc + 1));
+			alias_argv[0] = alias_string + 1;
+			for (i = 1; i < argc; ++i)
+				alias_argv[i] = (*argv)[i];
+			alias_argv[argc] = NULL;
+
+			ret = run_command_v_opt(alias_argv, RUN_USING_SHELL);
+			if (ret >= 0)   /* normal exit */
+				exit(ret);
+
+			die_errno("While expanding alias '%s': '%s'",
+			    alias_command, alias_string + 1);
 		}
 		count = split_cmdline(alias_string, &new_argv);
 		if (count < 0)
-- 
1.7.3.3.585.g74f6e

^ permalink raw reply related

* [PATCH/RFC] Documentation/checkout: explain behavior wrt local changes
From: Jonathan Nieder @ 2011-01-06 22:52 UTC (permalink / raw)
  To: r.ductor; +Cc: git, Jeff King
In-Reply-To: <20110106173522.GB11346@burratino>

Jonathan Nieder wrote:

> Proposed clearer text would be welcome, especially if in the form of
> a patch to Documentation/git-checkout.txt (see Documentation/SubmittingPatches).

Like this, maybe?

-- 8< --
Subject: Documentation/checkout: explain behavior wrt local changes

The current start of the description to "git checkout" tries to
combine an explanation of "git checkout <branch> --" with "git
checkout -- <paths>" and ends up with a muddle (as Jeff noticed).

In particular, the text does not make it obvious that the "git
checkout <branch> --" form does not clobber local changes relative
to the HEAD commit in the worktree and index.

Reported-by: r.ductor <r.ductor@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
I somehow suspect that a mention of --merge would make this clearer
but good words for it aren't coming at the moment.  Improvements
welcome.

 Documentation/git-checkout.txt |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 22d3611..cfb71a8 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -15,16 +15,18 @@ SYNOPSIS
 
 DESCRIPTION
 -----------
-Updates files in the working tree to match the version in the index
-or the specified tree.  If no paths are given, 'git checkout' will
-also update `HEAD` to set the specified branch as the current
-branch.
+There are two different modes -- one to switch branches and one to
+make some paths in the work tree match the index or specified tree.
 
 'git checkout' [<branch>]::
 'git checkout' -b|-B <new_branch> [<start point>]::
 
-	This form switches branches by updating the index, working
-	tree, and HEAD to reflect the specified branch.
+	This form switches branches by changing `HEAD` and updating the
+	tracked files to the specified branch.  'git checkout' will
+	stop without doing anything if local changes overlap with
+	changes to the tracked files.  (Any local changes that do not
+	overlap with changes from `HEAD` to the specified branch will
+	be preserved.)
 +
 If `-b` is given, a new branch is created as if linkgit:git-branch[1]
 were called and then checked out; in this case you can
-- 
1.7.4.rc1

^ permalink raw reply related

* Re: [PATCH 0/3] Falling back "diff -C -C" to "diff -C" more gracefully
From: Jeff King @ 2011-01-06 22:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Stefan Haller
In-Reply-To: <1294350606-19530-1-git-send-email-gitster@pobox.com>

On Thu, Jan 06, 2011 at 01:50:03PM -0800, Junio C Hamano wrote:

> Junio C Hamano (3):
>   diffcore-rename: refactor "too many candidates" logic
>   diffcore-rename: record filepair for rename src
>   diffcore-rename: fall back to -C when -C -C busts the rename limit

I read through the series, and it looks good to me. I didn't do any
testing, though. :)

-Peff

^ permalink raw reply

* Re: clone breaks replace
From: Junio C Hamano @ 2011-01-06 21:59 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Phillip Susi, git, Christian Couder
In-Reply-To: <20110106213338.GA15325@burratino>

Jonathan Nieder <jrnieder@gmail.com> writes:

> Therefore if you want clients to be able to choose between a minimal
> history and a larger one to save bandwidth, it has to work like this
>
>  - to get the minimal history, fetch _without_ any replacement refs
>  - to get the full history, fetch the replacement refs on top of that.
>
> because an additional reference can only increase the number of
> objects to be downloaded.

Very nicely and clearly put.  Can we have this somewhere in the docs?

^ permalink raw reply

* Re: Concurrent pushes updating the same ref
From: Marc Branchaud @ 2011-01-06 21:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List
In-Reply-To: <7v1v4pbz6y.fsf@alter.siamese.dyndns.org>

On 11-01-06 02:37 PM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> On Thu, Jan 06, 2011 at 10:46:38AM -0500, Marc Branchaud wrote:
>>
>>> fatal: Unable to create
>>> '/usr/xiplink/git/public/Main.git/refs/builds/3.3.0-3.lock': File exists.
>>> If no other git process is currently running, this probably means a
>>> git process crashed in this repository earlier. Make sure no other git
>>> process is running and remove the file manually to continue.
>>> fatal: The remote end hung up unexpectedly
>>>
>>> I think the cause is pretty obvious, and in a normal interactive situation
>>> the solution would be to simply try again.  But in a script trying again
>>> isn't so straightforward.
>>>
>>> So I'm wondering if there's any sense or desire to make git a little more
>>> flexible here.  Maybe teach it to wait and try again once or twice when it
>>> sees a lock file.  I presume that normally a ref lock file should disappear
>>> pretty quickly, so there shouldn't be a need to wait very long.
>>
>> Yeah, we probably should try again. The simplest possible (and untested)
>> patch is below. However, a few caveats:
>>
>>   1. This patch unconditionally retries for all lock files. Do all
>>      callers want that?
> 
> I actually have to say that _no_ caller should want this.  If somebody
> earlier crashed, we would want to know about it (and how).  If somebody
> else alive is actively holding a lock, why not make it the responsibility
> of a calling script to decide if it wants to retry itself or perhaps
> decide to do something else?

I'm not sure I follow this.

How would retrying a few times prevent us from finding out about an earlier
crash?  It's not like we're overriding the lock by retrying.  Nobody's going
to be able to remove a lock created by a crashed process, right?

And if someone active doesn't release the lock and the low-level code retried
a few times, the caller can still decide what to do.  I don't see how it
would even impact that decision -- if the caller wants to try again, the
system can still retry a few times underneath the caller's one retry.  It
seems fine to me.

		M.

^ permalink raw reply

* [PATCH 2/3] diffcore-rename: record filepair for rename src
From: Junio C Hamano @ 2011-01-06 21:50 UTC (permalink / raw)
  To: git
In-Reply-To: <1294350606-19530-1-git-send-email-gitster@pobox.com>

This will allow us to later skip unmodified entries added due to "-C -C".
We might also want to do something similar to rename_dst side, but that
would only be for the sake of symmetry and not necessary for this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diffcore-rename.c |   23 ++++++++++++-----------
 1 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 6ab050d..9ce81b6 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -54,22 +54,23 @@ static struct diff_rename_dst *locate_rename_dst(struct diff_filespec *two,
 
 /* Table of rename/copy src files */
 static struct diff_rename_src {
-	struct diff_filespec *one;
+	struct diff_filepair *p;
 	unsigned short score; /* to remember the break score */
 } *rename_src;
 static int rename_src_nr, rename_src_alloc;
 
-static struct diff_rename_src *register_rename_src(struct diff_filespec *one,
-						   unsigned short score)
+static struct diff_rename_src *register_rename_src(struct diff_filepair *p)
 {
 	int first, last;
+	struct diff_filespec *one = p->one;
+	unsigned short score = p->score;
 
 	first = 0;
 	last = rename_src_nr;
 	while (last > first) {
 		int next = (last + first) >> 1;
 		struct diff_rename_src *src = &(rename_src[next]);
-		int cmp = strcmp(one->path, src->one->path);
+		int cmp = strcmp(one->path, src->p->one->path);
 		if (!cmp)
 			return src;
 		if (cmp < 0) {
@@ -89,7 +90,7 @@ static struct diff_rename_src *register_rename_src(struct diff_filespec *one,
 	if (first < rename_src_nr)
 		memmove(rename_src + first + 1, rename_src + first,
 			(rename_src_nr - first - 1) * sizeof(*rename_src));
-	rename_src[first].one = one;
+	rename_src[first].p = p;
 	rename_src[first].score = score;
 	return &(rename_src[first]);
 }
@@ -204,7 +205,7 @@ static void record_rename_pair(int dst_index, int src_index, int score)
 	if (rename_dst[dst_index].pair)
 		die("internal error: dst already matched.");
 
-	src = rename_src[src_index].one;
+	src = rename_src[src_index].p->one;
 	src->rename_used++;
 	src->count++;
 
@@ -384,7 +385,7 @@ static int find_exact_renames(void)
 
 	init_hash(&file_table);
 	for (i = 0; i < rename_src_nr; i++)
-		insert_file_table(&file_table, -1, i, rename_src[i].one);
+		insert_file_table(&file_table, -1, i, rename_src[i].p->one);
 
 	for (i = 0; i < rename_dst_nr; i++)
 		insert_file_table(&file_table, 1, i, rename_dst[i].two);
@@ -472,7 +473,7 @@ void diffcore_rename(struct diff_options *options)
 			 */
 			if (p->broken_pair && !p->score)
 				p->one->rename_used++;
-			register_rename_src(p->one, p->score);
+			register_rename_src(p);
 		}
 		else if (detect_rename == DIFF_DETECT_COPY) {
 			/*
@@ -480,7 +481,7 @@ void diffcore_rename(struct diff_options *options)
 			 * one, to indicate ourselves as a user.
 			 */
 			p->one->rename_used++;
-			register_rename_src(p->one, p->score);
+			register_rename_src(p);
 		}
 	}
 	if (rename_dst_nr == 0 || rename_src_nr == 0)
@@ -526,7 +527,7 @@ void diffcore_rename(struct diff_options *options)
 			m[j].dst = -1;
 
 		for (j = 0; j < rename_src_nr; j++) {
-			struct diff_filespec *one = rename_src[j].one;
+			struct diff_filespec *one = rename_src[j].p->one;
 			struct diff_score this_src;
 			this_src.score = estimate_similarity(one, two,
 							     minimum_score);
@@ -556,7 +557,7 @@ void diffcore_rename(struct diff_options *options)
 		dst = &rename_dst[mx[i].dst];
 		if (dst->pair)
 			continue; /* already done, either exact or fuzzy. */
-		if (rename_src[mx[i].src].one->rename_used)
+		if (rename_src[mx[i].src].p->one->rename_used)
 			continue;
 		record_rename_pair(mx[i].dst, mx[i].src, mx[i].score);
 		rename_count++;
-- 
1.7.4.rc1.214.g2a4f9

^ permalink raw reply related

* [PATCH 0/3] Falling back "diff -C -C" to "diff -C" more gracefully
From: Junio C Hamano @ 2011-01-06 21:50 UTC (permalink / raw)
  To: git; +Cc: Stefan Haller
In-Reply-To: <7vhbdlahnp.fsf@alter.siamese.dyndns.org>

In a project with many paths, "diff -C -C" may have too many rename source
candidates (as it tries to use all existing paths) but the rename source
candidates "diff -C" would use may still fit under the rename detection
limit.  Currently, we punt and disable the inexact rename detection
altogether even in such a case.

This weatherballoon series illustrates how diffcore-rename can be tweaked
to allow "-C -C" to fall back to "-C".  Somebody should write a test, but
not today ;-).

Junio C Hamano (3):
  diffcore-rename: refactor "too many candidates" logic
  diffcore-rename: record filepair for rename src
  diffcore-rename: fall back to -C when -C -C busts the rename limit

 diffcore-rename.c |   97 ++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 71 insertions(+), 26 deletions(-)

-- 
1.7.4.rc1.214.g2a4f9

^ permalink raw reply

* [PATCH 3/3] diffcore-rename: fall back to -C when -C -C busts the rename limit
From: Junio C Hamano @ 2011-01-06 21:50 UTC (permalink / raw)
  To: git
In-Reply-To: <1294350606-19530-1-git-send-email-gitster@pobox.com>

When there are too many paths in the project, the number of rename source
candidates "git diff -C -C" finds will exceed the rename detection limit,
and no inexact rename detection is performed.  We however could fall back
to "git diff -C" if the number of paths modified are sufficiently small.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diffcore-rename.c |   37 +++++++++++++++++++++++++++++++++++--
 1 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 9ce81b6..4851af3 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -415,11 +415,18 @@ static void record_if_better(struct diff_score m[], struct diff_score *o)
 		m[worst] = *o;
 }
 
+/*
+ * Returns:
+ * 0 if we are under the limit;
+ * 1 if we need to disable inexact rename detection;
+ * 2 if we would be under the limit if we were given -C instead of -C -C.
+ */
 static int too_many_rename_candidates(int num_create,
 				      struct diff_options *options)
 {
 	int rename_limit = options->rename_limit;
 	int num_src = rename_src_nr;
+	int i;
 
 	/*
 	 * This basically does a test for the rename matrix not
@@ -436,6 +443,19 @@ static int too_many_rename_candidates(int num_create,
 	    (num_create * num_src <= rename_limit * rename_limit))
 		return 0;
 
+	/* Are we running under -C -C? */
+	if (!DIFF_OPT_TST(options, FIND_COPIES_HARDER))
+		return 1;
+
+	/* Would we bust the limit if we were running under -C? */
+	for (num_src = i = 0; i < rename_src_nr; i++) {
+		if (diff_unmodified_pair(rename_src[i].p))
+			continue;
+		num_src++;
+	}
+	if ((num_create <= rename_limit || num_src <= rename_limit) &&
+	    (num_create * num_src <= rename_limit * rename_limit))
+		return 2;
 	return 1;
 }
 
@@ -446,7 +466,7 @@ void diffcore_rename(struct diff_options *options)
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct diff_queue_struct outq;
 	struct diff_score *mx;
-	int i, j, rename_count;
+	int i, j, rename_count, skip_unmodified = 0;
 	int num_create, num_src, dst_cnt;
 
 	if (!minimum_score)
@@ -508,10 +528,18 @@ void diffcore_rename(struct diff_options *options)
 	if (!num_create)
 		goto cleanup;
 
-	if (too_many_rename_candidates(num_create, options)) {
+	switch (too_many_rename_candidates(num_create, options)) {
+	case 1:
 		if (options->warn_on_too_large_rename)
 			warning("too many files (created: %d deleted: %d), skipping inexact rename detection", num_create, num_src);
 		goto cleanup;
+	case 2:
+		if (options->warn_on_too_large_rename)
+			warning("too many files, falling back to -C");
+		skip_unmodified = 1;
+		break;
+	default:
+		break;
 	}
 
 	mx = xcalloc(num_create * NUM_CANDIDATE_PER_DST, sizeof(*mx));
@@ -529,6 +557,11 @@ void diffcore_rename(struct diff_options *options)
 		for (j = 0; j < rename_src_nr; j++) {
 			struct diff_filespec *one = rename_src[j].p->one;
 			struct diff_score this_src;
+
+			if (skip_unmodified &&
+			    diff_unmodified_pair(rename_src[j].p))
+				continue;
+
 			this_src.score = estimate_similarity(one, two,
 							     minimum_score);
 			this_src.name_score = basename_same(one, two);
-- 
1.7.4.rc1.214.g2a4f9

^ permalink raw reply related

* [PATCH 1/3] diffcore-rename: refactor "too many candidates" logic
From: Junio C Hamano @ 2011-01-06 21:50 UTC (permalink / raw)
  To: git
In-Reply-To: <1294350606-19530-1-git-send-email-gitster@pobox.com>

Move the logic to a separate function, to be enhanced by later patches in
the series.

While at it, swap the condition used in the if statement from "if it is
too big then do this" to "if it would fit then do this".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diffcore-rename.c |   39 +++++++++++++++++++++++++--------------
 1 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index df41be5..6ab050d 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -414,11 +414,34 @@ static void record_if_better(struct diff_score m[], struct diff_score *o)
 		m[worst] = *o;
 }
 
+static int too_many_rename_candidates(int num_create,
+				      struct diff_options *options)
+{
+	int rename_limit = options->rename_limit;
+	int num_src = rename_src_nr;
+
+	/*
+	 * This basically does a test for the rename matrix not
+	 * growing larger than a "rename_limit" square matrix, ie:
+	 *
+	 *    num_create * num_src > rename_limit * rename_limit
+	 *
+	 * but handles the potential overflow case specially (and we
+	 * assume at least 32-bit integers)
+	 */
+	if (rename_limit <= 0 || rename_limit > 32767)
+		rename_limit = 32767;
+	if ((num_create <= rename_limit || num_src <= rename_limit) &&
+	    (num_create * num_src <= rename_limit * rename_limit))
+		return 0;
+
+	return 1;
+}
+
 void diffcore_rename(struct diff_options *options)
 {
 	int detect_rename = options->detect_rename;
 	int minimum_score = options->rename_score;
-	int rename_limit = options->rename_limit;
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct diff_queue_struct outq;
 	struct diff_score *mx;
@@ -484,19 +507,7 @@ void diffcore_rename(struct diff_options *options)
 	if (!num_create)
 		goto cleanup;
 
-	/*
-	 * This basically does a test for the rename matrix not
-	 * growing larger than a "rename_limit" square matrix, ie:
-	 *
-	 *    num_create * num_src > rename_limit * rename_limit
-	 *
-	 * but handles the potential overflow case specially (and we
-	 * assume at least 32-bit integers)
-	 */
-	if (rename_limit <= 0 || rename_limit > 32767)
-		rename_limit = 32767;
-	if ((num_create > rename_limit && num_src > rename_limit) ||
-	    (num_create * num_src > rename_limit * rename_limit)) {
+	if (too_many_rename_candidates(num_create, options)) {
 		if (options->warn_on_too_large_rename)
 			warning("too many files (created: %d deleted: %d), skipping inexact rename detection", num_create, num_src);
 		goto cleanup;
-- 
1.7.4.rc1.214.g2a4f9

^ permalink raw reply related

* Re: [PATCH 2/3] t0001,t1510,t3301: use sane_unset which always returns with status 0
From: Brandon Casey @ 2011-01-06 21:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brandon Casey
In-Reply-To: <7vlj2xai2v.fsf@alter.siamese.dyndns.org>

On 01/06/2011 02:32 PM, Junio C Hamano wrote:
> Thanks; there are a bit more from 
> 
>  $git grep -n -e '^[ 	]*unset[ 	].*&&' t
> 
> * t/t9130-git-svn-authors-file.sh:101:		unset GIT_CONFIG &&
> 
> As far as I can tell, this is unsafe; nobody has set GIT_CONFIG up to this
> point.

Ah, svn isn't installed on the Solaris box I tested on, so
this test wasn't run.

-Brandon

^ permalink raw reply

* Re: clone breaks replace
From: Jonathan Nieder @ 2011-01-06 21:33 UTC (permalink / raw)
  To: Phillip Susi; +Cc: git, Christian Couder
In-Reply-To: <4D262D68.2050804@cfl.rr.com>

Phillip Susi wrote:

> I managed to
> correctly add a replace commit that truncates the history and contains
> instructions where you can find it, and running git log only goes back
> to the replacement commit, unless you add --no-replace-objects, which
> causes it to show the original full history.

Before I get to your real question: this seems a bit backwards.  Let
me say a few words about why.

In the days before replacement refs (and today, too), each commit
name described not only the state of a tree at a moment but the
history that led up to it.  In fact you can see this somewhat directly:
given two distinct commits A and B if you try

	$ git cat-file commit A >a.commit
	$ git cat-file commit B >b.commit
	$ diff -u a.commit b.commit

then you will see precisely what can make them different:

 - the author's name and email and the date of authorship
 - the committer's name and email and the date committed
 - the names of the parent commits, describing the history
 - the name of a tree, describing the content
 - the log message, including its encoding

The commit name is a hash of that information (see git-hash-object(1))
and an invariant maintained is "if a repository has access to commit A,
it has access to its parents, their parents, and so on".  This invariant
is maintained during object transfer and garbage collection and relied
on by object transfer and revision traversal.

The beauty of replacement refs is that they can be easily added or
removed without breaking this invariant.  And a replacement ref is an
actual reference into history, so garbage collection does not remove
those commits and the repository keeps enough information to traverse
both the modified and unmodified history.

Therefore if you want clients to be able to choose between a minimal
history and a larger one to save bandwidth, it has to work like this

 - to get the minimal history, fetch _without_ any replacement refs
 - to get the full history, fetch the replacement refs on top of that.

because an additional reference can only increase the number of
objects to be downloaded.

> The problem is that when I clone the repository, I expect the clone to
> contain only history up to the replacement record, and not the old
> history before that.  Instead, the clone contains only the full original
> history, and the replacement ref is not imported at all.  A git replace
> in the new clone shows nothing.
>
> Shouldn't clone copy .git/refs/replace?

With that in mind, I suspect the best way to achieve what you are
looking for is the following:

 1. Make a big, ugly history (branch "big").  Presumably this part's
    already done.

 2. Find the part you want to get rid of and make appropriate
    replacement refs so "gitk big" shows what you want it to.

 3. Use "git filter-branch" to make that history a reality (branch
    "simpler").  Remove the replacement refs.

 4. Use "git replace" to graft back on the pieces you cauterized.
    Publish the result.

 5. Perhaps also run and publish "git replace big simpler", so
    contributors of branches based against the old 'big' can merge
    your latest changes from 'simpler'.  Encourage contributors to
    use 'git rebase' or 'git filter-branch' to rebase their
    contributions against the new, simpler history.

Does that make sense?

Jonathan

^ permalink raw reply

* Re: Resumable clone/Gittorrent (again) - stable packs?
From: Nicolas Pitre @ 2011-01-06 21:09 UTC (permalink / raw)
  To: Zenaan Harkness; +Cc: git
In-Reply-To: <AANLkTikv+L5Da7A5VM7BAgnue=m0O_-nHmHchJzfGxJa@mail.gmail.com>

On Thu, 6 Jan 2011, Zenaan Harkness wrote:

> Bittorrent requires some stability around torrent files.
> 
> Can packs be generated deterministically?

They _could_, but we do _not_ want to do that.

The only thing which is stable in Git is the canonical representation of 
objects, and the objects they depend on, expressed by their SHA1 
signature.  Any BitTorrent-alike design for Git must be based on that 
property and not the packed representation of those objects which is not 
meant to be stable.

If you don't want to design anything and simply reuse current BitTorrent 
codebase then simply create a Git bundle from some release version and 
seed that bundle for a sufficiently long period to be worth it.  Then 
falling back to git fetch in order to bring the repo up to date with the 
very latest commits should be small and quick.  When that clone gets too 
big then it's time to start seeding another more up-to-date bundle.

Nicolas

^ permalink raw reply

* Re: log -p hides changes in merge commit
From: Jonathan Nieder @ 2011-01-06 21:04 UTC (permalink / raw)
  To: Phillip Susi; +Cc: Junio C Hamano, git
In-Reply-To: <4D262B05.2060306@cfl.rr.com>

Phillip Susi wrote:

> What I would like to do is be able to review a merge to sign off on it.
>  While the full diff against the left parent would be a large and
> unhelpful amalgamation of the changes in the merged branch, any
> additional changes made during the commit should not be hidden.  This
> allows someone performing the merge to effectively sneak in unintended
> changes.  I would expect any such changes to be shown by log -p, but
> this only seems to happen if you add -c.

To be more precise, here is what -c and --cc do.  Consider the
following history (time flowing left to right):

    -- [topic]
   /
 B --- [master]

>From the master branch, I merge topic.  (1) If I am lucky, the changes
from B to topic and B to master touch entirely different sections of
code (though perhaps within the same files), so one could just apply
the two diffs in succession to make a merge automatically.  (2) Almost
as good is the case when they touch code a couple of lines apart ---
"git merge" still figures it out automatically.  (3) Less nice is the
case when they touch the same line, say --- but even here the correct
merge can be obvious.  (4) Worst of all is when the changes
semantically conflict but syntactically do not:

	$ git merge topic
	$ make test;	# fails!
	$ ... hack hack hack ...
	$ git commit --amend

   -- o [topic] ------
  /                   \
 B -- o --------------- [master]

In case (1), -c will show a "combined diff" for files where master
does not match either the old master or topic.  --cc, on the other
hand, will correctly suppress these uninteresting diffs.

In case (2), -c will show a noisy "combined diff" as before.
--cc will show a combined diff when the changes from both parents
touch nearby code, even if it merged trivially.

In case (3), -c and --cc will show the semantically boring but
syntactically interesting merge.

Case (4) is underspecified.  So let's give a more precise example:
the old master and topic tried to fix the same bug in two incompatible
ways.  When merging, I decide I like the topic's way better, so I
resolve conflicts in favor of the topic.  Hopefully all unrelated
changes on master were preserved!

In this case, -c and --cc will very likely show nothing at all.
Each file matches one of the two parents (old master or topic) so
there is no easy way to distinguish the case from (0) or (1).

By now it should be clear how to get the diff you are looking for.
One makes a test merge, perhaps using the iffy "resolve in favor
of one side or the other" feature to save time on conflicts:

	git checkout oldmaster^0
	git merge topic
	git reset --merge ORIG_HEAD; # meh, too many conflicts
	git merge -Xours topic

and then makes a diff.

	git diff master

Hope that helps,
Jonathan

^ permalink raw reply

* clone breaks replace
From: Phillip Susi @ 2011-01-06 21:00 UTC (permalink / raw)
  To: git

I've been experimenting with git replace to remove ancient history, and
I have found that cloning a repository breaks replace.  I read about
this process at http://progit.org/2010/03/17/replace.html.  I managed to
correctly add a replace commit that truncates the history and contains
instructions where you can find it, and running git log only goes back
to the replacement commit, unless you add --no-replace-objects, which
causes it to show the original full history.

The problem is that when I clone the repository, I expect the clone to
contain only history up to the replacement record, and not the old
history before that.  Instead, the clone contains only the full original
history, and the replacement ref is not imported at all.  A git replace
in the new clone shows nothing.

Shouldn't clone copy .git/refs/replace?

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox