Git development
 help / color / mirror / Atom feed
* Re: [PATCH 1/7] "git" calls help_unknown_cmd(""); "git help" and "git help -a" return 0
From: Scott Parish @ 2007-10-27  7:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v640twka3.fsf@gitster.siamese.dyndns.org>

On Fri, Oct 26, 2007 at 04:03:48PM -0700, Junio C Hamano wrote:

> > Regarding "git: '' is not a git-command" the way i was seeing that
> > is that git is usually only called with commands, and '' isn't a
> > valid command, hence the reason to exit 1, the help is just a nice
> > user experience.
> 
> But think who would type "git<Enter>".  They are either people
> who (1) do not even know that "git" alone is not useful and that
> it always wants a subcommand, or (2) know "git<Enter>" is the
> same as "git help" and wants to get the "common command list"
> quickly.  Technically, "'' is not a git-command" is correct, but
> the message does not help either audience, does it?

Fair enough, i'll drop that in the updated patch set i'm about ready
to send. Incidentally i was also missing the "usage" string.

sRp

-- 
Scott Parish
http://srparish.net/

^ permalink raw reply

* merge vs rebase: Is visualization in gitk the only problem?
From: Steffen Prohaska @ 2007-10-27  7:17 UTC (permalink / raw)
  To: Git Mailing List

There are some discussions going on about merge vs. rebase. The
suggestions by Dscho is to prefer rebase over merge in a
CVS-style workflow.

Rebase has definitely benefits but not all of its details
are obvious at a first glance. Tell a newbie to read the
git rebase man page and explain what git reabase does and
you know what I mean. Rebase definitely can help to create a
cleaner history. But it rewrites history and therefore destroys
information, for example information about the original code
base a patch was developed against, or merge conflicts that
were resolved. You also need to decide when to use rebase and
when to use merge. So you need to make a choice.

Why not always use git merge?

Is the only problem of git merge that it might create loops
in the history with potentially long running parallel lines
that are insufficiently visualized in gitk?

If so, why not improve the visualization?

Or is there any other deficiency of always using merge and
never using rebase that I don't see?

Obviously you can use git merge only if you want to have _all_
changes from the other branch. But this is often what you
want. In a CVS-style workflow you want to merge all changes
from the shared branch before pushing. Why going through the
hassel of git rebase?

I'm aware that you can do other things with git rebase that
cannot be achieved by using git merge. For example moving a
topic branch to a different stable branch without merging all
of the topic branch's history ("--onto" option). It is easy
to explain that rebase has it's own value.

	Steffen

^ permalink raw reply

* Re: Git and Windows
From: Steffen Prohaska @ 2007-10-27  6:54 UTC (permalink / raw)
  To: Bo Yang; +Cc: Git Mailing List
In-Reply-To: <47215396.1080202@gmail.com>


On Oct 26, 2007, at 4:40 AM, Bo Yang wrote:

> Johannes Schindelin :
>> Hi,
>>
>> On Thu, 25 Oct 2007, Bo Yang wrote:
>>
>>
>>>   I am a new comer to this list but I have used git for two week  
>>> development control. I think it is a very cool tool, the only  
>>> flaw is that I have not found Windows version of it. Does git  
>>> just aim at Linux kernel development? Is there any plan or in the  
>>> future to migrate it to windows?
>>>
>>
>> Funny.  The first three hits I get from Google are
>>
>> 	Wikipedia,
>> 	GitWiki and
>> 	msysgit
>>
>> The first two pointing to the third.  And happily enough, there is  
>> a Download page at the third site.  Oh, and it has a description  
>> what its affiliation with git is.
>>
> So, if I want to get involved with the Windows version git  
> development, I should download the Gitme, right?

Yes, if you want to contribute to git on Windows.

And if you only want to use git, you can download

http://msysgit.googlecode.com/files/Git-1.5.3-preview20071019.exe

	Steffen

^ permalink raw reply

* Re: [PATCH] Bisect run: "skip" current commit if script exit code is 125.
From: Junio C Hamano @ 2007-10-27  6:12 UTC (permalink / raw)
  To: Tom Prince; +Cc: git
In-Reply-To: <20071027053305.GB3115@hermes.priv>

Tom Prince <tom.prince@ualberta.net> writes:

> On Sat, Oct 27, 2007 at 07:02:31AM +0200, Christian Couder wrote:
>> Le vendredi 26 octobre 2007, Benoit SIGOURE a écrit :
>> > On Oct 26, 2007, at 5:39 AM, Christian Couder wrote:
>> > >
>> > > +The special exit code 125 should be used when the current source code
>> > > +cannot be tested. If the "run" script exits with this code, the
>> > > current
>> > > +revision will be "skip"ped, see `git bisect skip` above.
>> > > [...]
>> >
>> Also there is:
>> 
>> $ grep 77 /usr/include/sysexits.h
>> #define EX_NOPERM       77      /* permission denied */
>
> How about 
>
> #define EX_TEMPFAIL     75      /* temp failure; user is invited to retry */

Let's stop bikeshedding.  125 is as good as anything else.

^ permalink raw reply

* Re: recent change in git.git/master broke my repos
From: Junio C Hamano @ 2007-10-27  6:09 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Jeff King, git, Randal L. Schwartz
In-Reply-To: <Pine.LNX.4.64.0710251351330.7345@iabervon.org>

Daniel Barkalow <barkalow@iabervon.org> writes:

> On Thu, 25 Oct 2007, Jeff King wrote:
>
>> On Thu, Oct 25, 2007 at 07:32:36AM -0700, Randal L. Schwartz wrote:
>> 
>> > I have echo "ref: refs/remotes/origin/master" >.git/refs/heads/upstream
>> > so that my daily update script can go:
>> > 
>> >    git-fetch
>> >    if [ repo is on master, and is not dirty ];
>> >       git-merge upstream
>> >    fi
>> > 
>> > Yesterday that worked.
>> > 
>> > Today I get a rash of:
>> > 
>> >   fatal: Couldn't find remote ref refs/remotes/origin/master
>> > 
>> > from my git-fetch.
>> 
>> Randal and I discussed this a bit on IRC, and it turns out not to be
>> related to the 'upstream' symref. Instead, he had a broken
>> branch.master.merge config that pointed to "refs/remotes/origin/master"
>> (which you can see from his script above doesn't actually get used).
>> 
>> So presumably the old git-fetch didn't care that the contents of
>> branch.*.master didn't exist (it's just that nothing got marked for
>> merging), but the one just merged from the db/fetch-pack topic does.
>> 
>> Is this behavior change intentional?
>
> It's not exactly intentional; it's just that nobody seems to have tested 
> this particular misconfiguration. It should probably report an error 
> (since the configuration is, in fact, broken and potentially misleading), 
> but it probably shouldn't be fatal and certainly shouldn't be so 
> uninformative.

How would we proceed from here, then?

If you had "branch.master.merge = refs/heads/foobar", kept
running happily, and suddenly the remote stopped carrying that
foobar branch, you would get a configuration that uses
nonexistent remote branch name, so this is not purely a
configuration error on the fetcher's side.

Older git used to enumerate remote tracking branches explicitly,
and one of the remote.origin.fetch entries would have said
"refs/heads/foobar:refs/remotes/origin/foobar", and that would
have made git-fetch fail with the error, complaining that such a
branch does not exist.

You are suggesting that git-fetch should not fail if
remote.origin.fetch is refs/heads/*:refs/remotes/origin/*
wildcard, and I think I can agree with that; "git pull" however
should notice and and exit with an error.

Perhaps something like this?

-- >8 --
[PATCH] git-fetch: do not fail when remote branch disappears

When the branch named with branch.$name.merge is not covered by
the fetch configuration for the remote repository named with
branch.$name.remote, we automatically add that branch to the set
of branches to be fetched.  However, if the remote repository
does not have that branch (e.g. it used to exist, but got
removed), this is not a reason to fail the git-fetch itself.

The situation however will be noticed if git-fetch was called by
git-pull, as the resulting FETCH_HEAD would not have any entry
that is marked for merging.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 builtin-fetch.c |   24 ++++++++++++++++--------
 remote.c        |   22 +++++++++++++---------
 remote.h        |    5 ++++-
 3 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index b9d2b0c..003ed76 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -48,15 +48,21 @@ static void add_merge_config(struct ref **head,
 		if (rm)
 			continue;
 
-		/* Not fetched to a tracking branch?  We need to fetch
+		/*
+		 * Not fetched to a tracking branch?  We need to fetch
 		 * it anyway to allow this branch's "branch.$name.merge"
-		 * to be honored by git-pull.
+		 * to be honored by git-pull, but we do not have to
+		 * fail if branch.$name.merge is misconfigured to point
+		 * at a nonexisting branch.  If we were indeed called by
+		 * git-pull, it will notice the misconfiguration because
+		 * there is no entry in the resulting FETCH_HEAD marked
+		 * for merging.
 		 */
 		refspec.src = branch->merge[i]->src;
 		refspec.dst = NULL;
 		refspec.pattern = 0;
 		refspec.force = 0;
-		get_fetch_map(remote_refs, &refspec, tail);
+		get_fetch_map(remote_refs, &refspec, tail, 1);
 		for (rm = *old_tail; rm; rm = rm->next)
 			rm->merge = 1;
 	}
@@ -75,7 +81,7 @@ static struct ref *get_ref_map(struct transport *transport,
 
 	if (ref_count || tags) {
 		for (i = 0; i < ref_count; i++) {
-			get_fetch_map(remote_refs, &refs[i], &tail);
+			get_fetch_map(remote_refs, &refs[i], &tail, 0);
 			if (refs[i].dst && refs[i].dst[0])
 				*autotags = 1;
 		}
@@ -88,7 +94,7 @@ static struct ref *get_ref_map(struct transport *transport,
 			refspec.dst = "refs/tags/";
 			refspec.pattern = 1;
 			refspec.force = 0;
-			get_fetch_map(remote_refs, &refspec, &tail);
+			get_fetch_map(remote_refs, &refspec, &tail, 0);
 		}
 	} else {
 		/* Use the defaults */
@@ -97,7 +103,7 @@ static struct ref *get_ref_map(struct transport *transport,
 		int has_merge = branch_has_merge_config(branch);
 		if (remote && (remote->fetch_refspec_nr || has_merge)) {
 			for (i = 0; i < remote->fetch_refspec_nr; i++) {
-				get_fetch_map(remote_refs, &remote->fetch[i], &tail);
+				get_fetch_map(remote_refs, &remote->fetch[i], &tail, 0);
 				if (remote->fetch[i].dst &&
 				    remote->fetch[i].dst[0])
 					*autotags = 1;
@@ -110,11 +116,13 @@ static struct ref *get_ref_map(struct transport *transport,
 			 * as given in branch.<name>.remote, we add the
 			 * ref given in branch.<name>.merge, too.
 			 */
-			if (has_merge && !strcmp(branch->remote_name,
-						remote->name))
+			if (has_merge &&
+			    !strcmp(branch->remote_name, remote->name))
 				add_merge_config(&ref_map, remote_refs, branch, &tail);
 		} else {
 			ref_map = get_remote_ref(remote_refs, "HEAD");
+			if (!ref_map)
+				die("Couldn't find remote ref HEAD");
 			ref_map->merge = 1;
 		}
 	}
diff --git a/remote.c b/remote.c
index 170015a..bec2ba1 100644
--- a/remote.c
+++ b/remote.c
@@ -857,7 +857,7 @@ struct ref *get_remote_ref(struct ref *remote_refs, const char *name)
 	struct ref *ref = find_ref_by_name_abbrev(remote_refs, name);
 
 	if (!ref)
-		die("Couldn't find remote ref %s\n", name);
+		return NULL;
 
 	return copy_ref(ref);
 }
@@ -889,20 +889,24 @@ static struct ref *get_local_ref(const char *name)
 
 int get_fetch_map(struct ref *remote_refs,
 		  const struct refspec *refspec,
-		  struct ref ***tail)
+		  struct ref ***tail,
+		  int missing_ok)
 {
 	struct ref *ref_map, *rm;
 
 	if (refspec->pattern) {
 		ref_map = get_expanded_map(remote_refs, refspec);
 	} else {
-		ref_map = get_remote_ref(remote_refs,
-					 refspec->src[0] ?
-					 refspec->src : "HEAD");
-
-		ref_map->peer_ref = get_local_ref(refspec->dst);
-		if (ref_map->peer_ref && refspec->force)
-			ref_map->peer_ref->force = 1;
+		const char *name = refspec->src[0] ? refspec->src : "HEAD";
+
+		ref_map = get_remote_ref(remote_refs, name);
+		if (!missing_ok && !ref_map)
+			die("Couldn't find remote ref %s", name);
+		if (ref_map) {
+			ref_map->peer_ref = get_local_ref(refspec->dst);
+			if (ref_map->peer_ref && refspec->force)
+				ref_map->peer_ref->force = 1;
+		}
 	}
 
 	for (rm = ref_map; rm; rm = rm->next) {
diff --git a/remote.h b/remote.h
index c62636d..878b4ec 100644
--- a/remote.h
+++ b/remote.h
@@ -67,9 +67,12 @@ int match_refs(struct ref *src, struct ref *dst, struct ref ***dst_tail,
  * *tail is the pointer to the tail pointer of the list of results
  * beforehand, and will be set to the tail pointer of the list of
  * results afterward.
+ *
+ * missing_ok is usually false, but when we are adding branch.$name.merge
+ * it is Ok if the branch is not at the remote anymore.
  */
 int get_fetch_map(struct ref *remote_refs, const struct refspec *refspec,
-		  struct ref ***tail);
+		  struct ref ***tail, int missing_ok);
 
 struct ref *get_remote_ref(struct ref *remote_refs, const char *name);
 

^ permalink raw reply related

* Re: [PATCH] Bisect run: "skip" current commit if script exit code is 125.
From: Tom Prince @ 2007-10-27  5:33 UTC (permalink / raw)
  To: git
In-Reply-To: <20071027052834.GA3115@hermes.priv>

On Sat, Oct 27, 2007 at 07:02:31AM +0200, Christian Couder wrote:
> Le vendredi 26 octobre 2007, Benoit SIGOURE a écrit :
> > On Oct 26, 2007, at 5:39 AM, Christian Couder wrote:
> > >
> > > +The special exit code 125 should be used when the current source code
> > > +cannot be tested. If the "run" script exits with this code, the
> > > current
> > > +revision will be "skip"ped, see `git bisect skip` above.
> > > [...]
> >
> Also there is:
> 
> $ grep 77 /usr/include/sysexits.h
> #define EX_NOPERM       77      /* permission denied */

How about 

#define EX_TEMPFAIL     75      /* temp failure; user is invited to retry */

^ permalink raw reply

* Re: [PATCH] Bisect run: "skip" current commit if script exit code is 125.
From: Christian Couder @ 2007-10-27  5:02 UTC (permalink / raw)
  To: Benoit SIGOURE; +Cc: git list
In-Reply-To: <A43880E9-E496-48AA-BC1C-2C98DFD12370@lrde.epita.fr>

Le vendredi 26 octobre 2007, Benoit SIGOURE a écrit :
> On Oct 26, 2007, at 5:39 AM, Christian Couder wrote:
> >
> > +The special exit code 125 should be used when the current source code
> > +cannot be tested. If the "run" script exits with this code, the
> > current
> > +revision will be "skip"ped, see `git bisect skip` above.
> > [...]
>
> Since exit 77 is already used by automake to mean "skip", wouldn't it
> be better to do the same thing here?

I don't think 77 is better, first because for automake this is to ignore 
some non portable test results "in environments where they don't make 
sense", so if we "bisect run" the same test script and it returns 77 once, 
it will probably returns 77 everytime because the environment will not have 
changed after a new revision has been checked out.

Also there is:

$ grep 77 /usr/include/sysexits.h
#define EX_NOPERM       77      /* permission denied */

and a search for "include <sysexits.h>" in http://www.google.com/codesearch 
returns a lot of results.

Christian.

^ permalink raw reply

* Re: [PATCH 6/9] git-hash-object: Add --stdin-paths option
From: Junio C Hamano @ 2007-10-27  1:02 UTC (permalink / raw)
  To: Brian Downing; +Cc: Adam Roben, git
In-Reply-To: <20071026231902.GC2519@lavos.net>

bdowning@lavos.net (Brian Downing) writes:

> On Fri, Oct 26, 2007 at 02:00:47PM -0700, Junio C Hamano wrote:
>> In addition, if you are enhancing cat-file to spew chunked
>> output out, I suspect that there should be a mode of operation
>> for hash-object that eats that data format.  IOW, this pipe
>> 
>> 	git-cat-file --batch <list-of-sha1 |
>>         git-hash-object --batch
>> 
>> should be an intuitive no-op, shouldn't it?
>
> I think that's an obviously good thing to do.  However, given your
> suggested output format (which I also like):
>
>>    * git-cat-file --batch <list-of-sha1
>> 
>>      outputs a record of this form
>> 
>>           <sha1> SP <type> SP <size> LF <contents> LF
>> 
>>      for each of the input lines.
>
> What should the input behavior be?  Obviously the sha1 will probably
> not be known on the input side.  Should that simply be optional (i.e.
> it will accept either "<sha1> SP <type> SP <size>" or "<type> SP <size>"
> or should it only accept the latter, and a dummy sha1 will need to be
> filled in if the sha1 is not known (presumably "000...000")?

Yeah, you caught me ;-)

Either making it optional or requiring a dummy value would work.
If a non-dummy value is given, we could use it to validate it.

But that would not be a useful application anyway.  So perhaps
just the sequence of "<type> SP <size> LF <contents> LF" would
be the most sensible thing to do.

^ permalink raw reply

* Re: [PATCH 2/2] Do the fuzzy rename detection limits with the exact renames removed
From: Linus Torvalds @ 2007-10-27  0:12 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.0.999.0710261654300.30120@woody.linux-foundation.org>


On Fri, 26 Oct 2007, Linus Torvalds wrote:
> 
> Works For Me(tm), although this isn't all that obviously testable (ie I 
> should find a test that is border-line, and triggers the rename detection 
> limits, but then has enough exact renames that the rename detection array 
> goes away).

Actually, I just tested it. I used that same 100,000 file thing, but added 
one more (larger) file, and did another commit that moved the 100,000 
files exactly, and the one larger file with a small change.

The code before this patch (but with my linear-time rename changes) would 
find the 100,000 exact renames, but would *not* find the one that had a 
small change, because it would hit the rename limit, and wouldn't even 
try.

With these two patches in place, it finds all the exact renames, and once 
it has done that, the resulting rename array is small enough (one single 
unknown target remaining, even if there are 100,001 possible source files) 
that it doesn't trigger the rename limit, and it now finds the remaining 
non-exact rename too.

So it not only looked obvious, it seems to work too.

		Linus

^ permalink raw reply

* [PATCH 2/2] Do the fuzzy rename detection limits with the exact renames removed
From: Linus Torvalds @ 2007-10-26 23:56 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.0.999.0710261646230.30120@woody.linux-foundation.org>


When we do the fuzzy rename detection, we don't care about the
destinations that we already handled with the exact rename detector.
And, in fact, the code already knew that - but the rename limiter, which
used to run *before* exact renames were detected, did not.

This fixes it so that the rename detection limiter now bases its
decisions on the *remaining* rename counts, rather than the original
ones.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

Pretty obvious. It just moves the tests around a bit, and uses the updated 
counters.

Works For Me(tm), although this isn't all that obviously testable (ie I 
should find a test that is border-line, and triggers the rename detection 
limits, but then has enough exact renames that the rename detection array 
goes away).

 diffcore-rename.c |   32 ++++++++++++++++++--------------
 1 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 7ed5ef8..f9ebea5 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -435,33 +435,37 @@ void diffcore_rename(struct diff_options *options)
 	 */
 	rename_count = find_exact_renames();
 
+	/* Did we only want exact renames? */
+	if (minimum_score == MAX_SCORE)
+		goto cleanup;
+
+	/*
+	 * Calculate how many renames are left (but all the source
+	 * files still remain as options for rename/copies!)
+	 */
+	num_create = (rename_dst_nr - rename_count);
+	num_src = rename_src_nr;
+
+	/* All done? */
+	if (!num_create)
+		goto cleanup;
+
 	/*
 	 * This basically does a test for the rename matrix not
 	 * growing larger than a "rename_limit" square matrix, ie:
 	 *
-	 *    rename_dst_nr * rename_src_nr > rename_limit * rename_limit
+	 *    num_create * num_src > rename_limit * rename_limit
 	 *
 	 * but handles the potential overflow case specially (and we
 	 * assume at least 32-bit integers)
 	 */
 	if (rename_limit <= 0 || rename_limit > 32767)
 		rename_limit = 32767;
-	if (rename_dst_nr > rename_limit && rename_src_nr > rename_limit)
+	if (num_create > rename_limit && num_src > rename_limit)
 		goto cleanup;
-	if (rename_dst_nr * rename_src_nr > rename_limit * rename_limit)
+	if (num_create * num_src > rename_limit * rename_limit)
 		goto cleanup;
 
-	/* Have we run out the created file pool?  If so we can avoid
-	 * doing the delta matrix altogether.
-	 */
-	if (rename_count == rename_dst_nr)
-		goto cleanup;
-
-	if (minimum_score == MAX_SCORE)
-		goto cleanup;
-
-	num_create = (rename_dst_nr - rename_count);
-	num_src = rename_src_nr;
 	mx = xmalloc(sizeof(*mx) * num_create * num_src);
 	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
 		int base = dst_cnt * num_src;

^ permalink raw reply related

* [PATCH 1/2] Fix ugly magic special case in exact rename detection
From: Linus Torvalds @ 2007-10-26 23:51 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


For historical reasons, the exact rename detection had populated the
filespecs for the entries it compared, and the rest of the similarity
analysis depended on that.  I hadn't even bothered to debug why that was
the case when I re-did the rename detection, I just made the new one
have the same broken behaviour, with a note about this special case.

This fixes that fixme.  The reason the exact rename detector needed to
fill in the file sizes of the files it checked was that the _inexact_
rename detector was broken, and started comparing file sizes before it
filled them in.

Fixing that allows the exact phase to do the sane thing of never even
caring (since all *it* cares about is really just the SHA1 itself, not
the size nor the contents).

It turns out that this also indirectly fixes a bug: trying to populate
all the filespecs will run out of virtual memory if there is tons and
tons of possible rename options.  The fuzzy similarity analysis does the
right thing in this regard, and free's the blob info after it has
generated the hash tables, so the special case code caused more trouble
than just some extra illogical code.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

Without this, when I try to do a "git commit --amend" on my 
horror-repository with the commit that moved around a hundred thousand 
files, I actually have "git runstatus" die() on me, because xmmap() fails 
(mmap() returns NULL due to running out of vm mappings).

Not to mention that this is "obviously correct" (tm) and the "right thing" 
(tm)" to do (famous last words). It's wrong of estimate_similarity() to 
try to compare the sizes of the filespec entries before they have been 
filled in, and it really shouldn't expect the exact rename detection to 
fill them in, since the exact rename detection doesn't even care!

 diffcore-rename.c |   27 ++++++++++++++-------------
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 3946932..7ed5ef8 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -144,6 +144,20 @@ static int estimate_similarity(struct diff_filespec *src,
 	if (!S_ISREG(src->mode) || !S_ISREG(dst->mode))
 		return 0;
 
+	/*
+	 * Need to check that source and destination sizes are
+	 * filled in before comparing them.
+	 *
+	 * If we already have "cnt_data" filled in, we know it's
+	 * all good (avoid checking the size for zero, as that
+	 * is a possible size - we really should have a flag to
+	 * say whether the size is valid or not!)
+	 */
+	if (!src->cnt_data && diff_populate_filespec(src, 0))
+		return 0;
+	if (!dst->cnt_data && diff_populate_filespec(dst, 0))
+		return 0;
+
 	max_size = ((src->size > dst->size) ? src->size : dst->size);
 	base_size = ((src->size < dst->size) ? src->size : dst->size);
 	delta_size = max_size - base_size;
@@ -159,11 +173,6 @@ static int estimate_similarity(struct diff_filespec *src,
 	if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
 		return 0;
 
-	if ((!src->cnt_data && diff_populate_filespec(src, 0))
-		|| (!dst->cnt_data && diff_populate_filespec(dst, 0)))
-		return 0; /* error but caught downstream */
-
-
 	delta_limit = (unsigned long)
 		(base_size * (MAX_SCORE-minimum_score) / MAX_SCORE);
 	if (diffcore_count_changes(src, dst,
@@ -270,19 +279,11 @@ static int find_identical_files(struct file_similarity *src,
 	return renames;
 }
 
-/*
- * Note: the rest of the rename logic depends on this
- * phase also populating all the filespecs for any
- * entry that isn't matched up with an exact rename.
- */
 static void free_similarity_list(struct file_similarity *p)
 {
 	while (p) {
 		struct file_similarity *entry = p;
 		p = p->next;
-
-		/* Stupid special case, see note above! */
-		diff_populate_filespec(entry->filespec, 0);
 		free(entry);
 	}
 }

^ permalink raw reply related

* Re: [PATCH 4/6] copy vs rename detection: avoid unnecessary O(n*m) loops
From: Linus Torvalds @ 2007-10-26 23:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vy7dpv4ls.fsf@gitster.siamese.dyndns.org>



On Fri, 26 Oct 2007, Junio C Hamano wrote:
> 
> So for a broken pair, the actual value of rename_used does not really 
> matter.  We only care about it not going down to zero.

Correct. The rename_used count really is immaterial *except* for the magic 
distinction between zero ("it's a rename, no original source file left") 
and non-zero ("it's a copy, original source file remains").

Which is why the new counter is so fundamentally simple: by decrementing 
it for each rename we encounter, we automatically get that behaviour of 
"only the last user turns into a 'rename' if the source file really went 
away" that we want. 

The old code did it all with some really expensive loops over the 
remaining renames.

		Linus

^ permalink raw reply

* Re: [PATCH] Make rebase smarter
From: Junio C Hamano @ 2007-10-26 23:29 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Steven Walter, git, federico
In-Reply-To: <Pine.LNX.4.64.0710270013030.4362@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Fri, 26 Oct 2007, Junio C Hamano wrote:
>
>> Steven Walter <stevenrwalter@gmail.com> writes:
>> 
>> > It is a common workflow to run "git fetch; git rebase origin/<foo>" 
>> > Where foo is the remote tracking branch.  git-rebase should default to 
>> > using the remote tracking branch if no other ref is given.
>> 
>> This would be a reasonable choice between refusing outright and
>> picking one possible action.
>
> Another sensible choice would be "git rebase FETCH_HEAD", at least just 
> after a "git fetch <nick> <branch>"...

We can get the best of both worlds by noticing a line in
FETCH_HEAD without not-for-merge marker and use that as the
'onto' commit for the rebase.

^ permalink raw reply

* Re: [PATCH 4/6] copy vs rename detection: avoid unnecessary O(n*m) loops
From: Junio C Hamano @ 2007-10-26 23:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <alpine.LFD.0.999.0710261600510.30120@woody.linux-foundation.org>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> The nice thing about the whole counting thing is that we really don't even 
> care. What happens is:
>
>  - *if* any rename at all happens, the rename detection will increment the 
>    "rename_used" thing even more for the source (once for each rename)
>
>  - so if the rename_used started out non-zero, it will never become zero 
>    in diff_resolve_rename_copy(), and every single detected "rename" will 
>    be considered a copy - exactly like we want.
>
>  - in other words, a "rename_used++" before rename-detection guarantees 
>    that you never see a rename, only a copy of the source.

Ok, and that is because we _know_ a broken pair means that the
path itself remains in the postimage and any other postimage
that takes from its preimage can never "rename the path away".
They can only be copies.  Which makes sense.

> The above is actually true *even*if* the 
>
> 	if (!strcmp(p->one->path, p->two->path))
>
> code in diff_resolve_rename_copy() actually triggers,...
> ..., so not doing 
> the decrement there is equivalent to doing another pre-increment.

Yes, this is what confused me.  So for a broken pair, the actual
value of rename_used does not really matter.  We only care about
it not going down to zero.

^ permalink raw reply

* Re: [PATCH 6/9] git-hash-object: Add --stdin-paths option
From: Brian Downing @ 2007-10-26 23:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Adam Roben, git
In-Reply-To: <7vy7dpwpz4.fsf@gitster.siamese.dyndns.org>

On Fri, Oct 26, 2007 at 02:00:47PM -0700, Junio C Hamano wrote:
> In addition, if you are enhancing cat-file to spew chunked
> output out, I suspect that there should be a mode of operation
> for hash-object that eats that data format.  IOW, this pipe
> 
> 	git-cat-file --batch <list-of-sha1 |
>         git-hash-object --batch
> 
> should be an intuitive no-op, shouldn't it?

I think that's an obviously good thing to do.  However, given your
suggested output format (which I also like):

>    * git-cat-file --batch <list-of-sha1
> 
>      outputs a record of this form
> 
>           <sha1> SP <type> SP <size> LF <contents> LF
> 
>      for each of the input lines.

What should the input behavior be?  Obviously the sha1 will probably
not be known on the input side.  Should that simply be optional (i.e.
it will accept either "<sha1> SP <type> SP <size>" or "<type> SP <size>"
or should it only accept the latter, and a dummy sha1 will need to be
filled in if the sha1 is not known (presumably "000...000")?

-bcd

^ permalink raw reply

* Re: [PATCH] Make rebase smarter
From: Johannes Schindelin @ 2007-10-26 23:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Steven Walter, git, federico
In-Reply-To: <7vk5p9wpwd.fsf@gitster.siamese.dyndns.org>

Hi,

On Fri, 26 Oct 2007, Junio C Hamano wrote:

> Steven Walter <stevenrwalter@gmail.com> writes:
> 
> > It is a common workflow to run "git fetch; git rebase origin/<foo>" 
> > Where foo is the remote tracking branch.  git-rebase should default to 
> > using the remote tracking branch if no other ref is given.
> 
> This would be a reasonable choice between refusing outright and
> picking one possible action.

Another sensible choice would be "git rebase FETCH_HEAD", at least just 
after a "git fetch <nick> <branch>"...

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 4/6] copy vs rename detection: avoid unnecessary O(n*m) loops
From: Linus Torvalds @ 2007-10-26 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vabq5wkri.fsf@gitster.siamese.dyndns.org>



On Fri, 26 Oct 2007, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > @@ -2640,27 +2640,21 @@ static void diff_resolve_rename_copy(void)
> >  		 * either in-place edit or rename/copy edit.
> >  		 */
> >  		else if (DIFF_PAIR_RENAME(p)) {
> > +			/*
> > +			 * A rename might have re-connected a broken
> > +			 * pair up, causing the pathnames to be the
> > +			 * same again. If so, that's not a rename at
> > +			 * all, just a modification..
> > +			 *
> > +			 * Otherwise, see if this source was used for
> > +			 * multiple renames, in which case we decrement
> > +			 * the count, and call it a copy.
> >  			 */
> > +			if (!strcmp(p->one->path, p->two->path))
> > +				p->status = DIFF_STATUS_MODIFIED;
> > +			else if (--p->one->rename_used > 0)
> >  				p->status = DIFF_STATUS_COPIED;
> > +			else
> >  				p->status = DIFF_STATUS_RENAMED;
> >  		}
> >  		else if (hashcmp(p->one->sha1, p->two->sha1) ||
> 
> The interaction between the above and ...

Heh.

I'm pretty sure it's right, because:

> > @@ -338,15 +320,25 @@ void diffcore_rename(struct diff_options *options)
> >  				locate_rename_dst(p->two, 1);
> >  		}
> >  		else if (!DIFF_FILE_VALID(p->two)) {
> > +			/*
> > +			 * If the source is a broken "delete", and
> >  			 * they did not really want to get broken,
> >  			 * that means the source actually stays.
> > +			 * So we increment the "rename_used" score
> > +			 * by one, to indicate ourselves as a user
> > +			 */
> > +			if (p->broken_pair && !p->score)
> > +				p->one->rename_used++;
> > +			register_rename_src(p->one, p->score);
> > +		}
> > +		else if (detect_rename == DIFF_DETECT_COPY) {
> > +			/*
> > +			 * Increment the "rename_used" score by
> > +			 * one, to indicate ourselves as a user.
> >  			 */
> > +			p->one->rename_used++;
> > +			register_rename_src(p->one, p->score);
> >  		}
> >  	}
> >  	if (rename_dst_nr == 0 || rename_src_nr == 0)
> >  		goto cleanup; /* nothing to do */
> 
> ... this part feels a bit too subtle for a still-jet-lagged
> brain to grok.  I wonder what happens if the preimage of a
> broken pair is used as the rename source for more than two
> postimages.

The nice thing about the whole counting thing is that we really don't even 
care. What happens is:

 - *if* any rename at all happens, the rename detection will increment the 
   "rename_used" thing even more for the source (once for each rename)

 - so if the rename_used started out non-zero, it will never become zero 
   in diff_resolve_rename_copy(), and every single detected "rename" will 
   be considered a copy - exactly like we want.

 - in other words, a "rename_used++" before rename-detection guarantees 
   that you never see a rename, only a copy of the source.

The above is actually true *even*if* the 

	if (!strcmp(p->one->path, p->two->path))

code in diff_resolve_rename_copy() actually triggers, so we could have 
(and at one point I did) done the "--p->one->rename_used > 0" test before 
the strcmp() test, and it would all continue to work fine. However, the 
reason that I put the strcmp() first is that it needs to be done whether 
we decide to consider it a "copy" or a "rename" (so we cannot avoid it 
anyway), and *if* it triggers, we actually want to avoid the rename_used 
going down to zero anyway (not that it would, because I think it's bound 
to be one of the cases where we pre-incremented the count), so not doing 
the decrement there is equivalent to doing another pre-increment.

So I think it's all right, and more obviously right than it used to be. 
But hey, it's possible that I missed something.

		Linus

^ permalink raw reply

* Re: [PATCH 1/7] "git" calls help_unknown_cmd(""); "git help" and "git help -a" return 0
From: Junio C Hamano @ 2007-10-26 23:03 UTC (permalink / raw)
  To: Scott Parish; +Cc: git
In-Reply-To: <20071025045228.GE759@srparish.net>

Scott Parish <sRp@srparish.net> writes:

> That's what i was hoping this patch did. I'm not entirely sure how
> its wrong as it seems to work for me.

I misread the patch.  My mistake.

> Regarding "git: '' is not a git-command" the way i was seeing that
> is that git is usually only called with commands, and '' isn't a
> valid command, hence the reason to exit 1, the help is just a nice
> user experience.

But think who would type "git<Enter>".  They are either people
who (1) do not even know that "git" alone is not useful and that
it always wants a subcommand, or (2) know "git<Enter>" is the
same as "git help" and wants to get the "common command list"
quickly.  Technically, "'' is not a git-command" is correct, but
the message does not help either audience, does it?

^ permalink raw reply

* Re: [PATCH 4/6] copy vs rename detection: avoid unnecessary O(n*m) loops
From: Junio C Hamano @ 2007-10-26 22:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <alpine.LFD.0.999.0710251119120.30120@woody.linux-foundation.org>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> @@ -2640,27 +2640,21 @@ static void diff_resolve_rename_copy(void)
>  		 * either in-place edit or rename/copy edit.
>  		 */
>  		else if (DIFF_PAIR_RENAME(p)) {
> +			/*
> +			 * A rename might have re-connected a broken
> +			 * pair up, causing the pathnames to be the
> +			 * same again. If so, that's not a rename at
> +			 * all, just a modification..
> +			 *
> +			 * Otherwise, see if this source was used for
> +			 * multiple renames, in which case we decrement
> +			 * the count, and call it a copy.
>  			 */
> +			if (!strcmp(p->one->path, p->two->path))
> +				p->status = DIFF_STATUS_MODIFIED;
> +			else if (--p->one->rename_used > 0)
>  				p->status = DIFF_STATUS_COPIED;
> +			else
>  				p->status = DIFF_STATUS_RENAMED;
>  		}
>  		else if (hashcmp(p->one->sha1, p->two->sha1) ||

The interaction between the above and ...

> @@ -338,15 +320,25 @@ void diffcore_rename(struct diff_options *options)
>  				locate_rename_dst(p->two, 1);
>  		}
>  		else if (!DIFF_FILE_VALID(p->two)) {
> +			/*
> +			 * If the source is a broken "delete", and
>  			 * they did not really want to get broken,
>  			 * that means the source actually stays.
> +			 * So we increment the "rename_used" score
> +			 * by one, to indicate ourselves as a user
> +			 */
> +			if (p->broken_pair && !p->score)
> +				p->one->rename_used++;
> +			register_rename_src(p->one, p->score);
> +		}
> +		else if (detect_rename == DIFF_DETECT_COPY) {
> +			/*
> +			 * Increment the "rename_used" score by
> +			 * one, to indicate ourselves as a user.
>  			 */
> +			p->one->rename_used++;
> +			register_rename_src(p->one, p->score);
>  		}
>  	}
>  	if (rename_dst_nr == 0 || rename_src_nr == 0)
>  		goto cleanup; /* nothing to do */

... this part feels a bit too subtle for a still-jet-lagged
brain to grok.  I wonder what happens if the preimage of a
broken pair is used as the rename source for more than two
postimages.

^ permalink raw reply

* Re: How to remove a specific hunk
From: Olivier Ramonat @ 2007-10-26 21:59 UTC (permalink / raw)
  To: git
In-Reply-To: <47220A05.4040705@wanadoo.fr>

Pascal,

Pascal Obry <pascal.obry@wanadoo.fr> writes:
> Andreas,
>
>> Once you've added the other two hunks, they'll no longer show up in
>> git-diff, so you can do something like this:
>> 
>> $ git-add -i; # add the other two hunks to commit
>> $ git-diff > middle-hunk.patch
>> $ git-apply -R middle-hunk.patch
>> test, test, test
>> $ git-apply middle-hunk.patch
>
> Thanks, this will clearly work. I was expecting something more
> integrated like a "git reset --interactive" or something like that :)

A solution could be :

git add -i 
--> Add the two "good" hunks

git checkout-index file_with_bad_hunk
--> Remove the "bad" hunk by getting the staged version

And then 
git reset HEAD file_with_bad_hunk 
if you want to unstage it.

Olivier

^ permalink raw reply

* Re: How to run git-gui always in English?
From: Alex Riesen @ 2007-10-26 21:41 UTC (permalink / raw)
  To: Peter Karlsson; +Cc: Steffen Prohaska, Shawn O. Pearce, Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0710260857210.3542@perkele.intern.softwolves.pp.se>

Peter Karlsson, Fri, Oct 26, 2007 10:00:32 +0200:
> Steffen Prohaska:
>
>> There are a lot of efforts going on to localize git-gui, including 
>> technical terms like "push". Personally I don't understand what this 
>> should be useful for. The command is called "git push"s. So, why should it 
>> be named differently in the gui.
>
> Not that I agree that "push" is a technical word, but perhaps you have a 
> point. Why should there be such words in the GUI to start with? It's a GUI, 
> trying to abstract away the command line. Why not have a button "Send" or a 
> menu entry "Send changes to server", mimicing the "git push" command line 
> option? Using command line names or showing protocol data directly in a 
> user-oriented GUI is most often a bad idea.
>
> Or perhaps what we need is an actual translation from "gitish" to English, 
> which would have
>
> msgid "Push"
> msgstr "Send changes to server"
>

Because you do not send changes to a _server_. There is no server.
There is just another repo. Hence just "push"

^ permalink raw reply

* Re: [PATCH] Make rebase smarter
From: Junio C Hamano @ 2007-10-26 21:02 UTC (permalink / raw)
  To: Steven Walter; +Cc: git, federico
In-Reply-To: <1193373682-3608-1-git-send-email-stevenrwalter@gmail.com>

Steven Walter <stevenrwalter@gmail.com> writes:

> It is a common workflow to run "git fetch; git rebase origin/<foo>" Where
> foo is the remote tracking branch.  git-rebase should default to using
> the remote tracking branch if no other ref is given.

This would be a reasonable choice between refusing outright and
picking one possible action.  I do not have a strong preference
as to what that "one possible action" should be, but if people
like to base on the remote tracking branch set to merge by
default, I am fine with it.

> +	curr_branch=$(git symbolic-ref -q HEAD)
> +	curr_branch=${curr_branch//refs\/heads\//}
> +	merge=$(git config branch.$curr_branch.merge)
> +	remote=$(git config branch.$curr_branch.remote)
> +	fetch=$(git config remote.$remote.fetch)
> +
> +	expanded=$(git fetch--tool expand-refs-wildcard "0000000000000000000000000000000000000000 $merge" "$remote" "$fetch")
> +	upstream_name=${expanded/#*:/}
> +fi
>  upstream=`git rev-parse --verify "${upstream_name}^0"` ||
>      die "invalid upstream $upstream_name"

 * How does this work if there is no such tracking configuration?

   - branch.<curr>.merge may be missing;
   - branch.<curr>.remote may be missing;
   - remote.<remote>.fetch may be explicit, multiple and/or non-wildcard;

 * ${parameter/pattern/string} is a bashism we do not allow in
   git scripts.

Problems in the implementation aside, it probably makes sense to
first have a helper function that takes a local branch name and
computes the remote tracking branch that a given local branch is
set to merge from, if exists, and use it here.  I suspect there
are other places in the Porcelain that would benefit from such a
helper function.

^ permalink raw reply

* Re: [PATCH] git-fetch: print informative messages to stdout, not stderr
From: Junio C Hamano @ 2007-10-26 21:01 UTC (permalink / raw)
  To: Gerrit Pape; +Cc: git
In-Reply-To: <20071026085355.24930.qmail@2c06371a7c3ae6.315fe32.mid.smarden.org>

Gerrit Pape <pape@smarden.org> writes:

> git-fetch writes informations about storing tags and the like to stderr,
> which should only be used for errors.  This patch changes it to use
> stdout instead.

I do not have strong preference but the reason this goes to
stderr is because it is considered part of the progress
reporting.  It is not designed for consumption by scripted
callers, and its formatting is subject to change freely, which
is what allows the recent "terse fetch" discussion.

^ permalink raw reply

* Re: [PATCH 6/9] git-hash-object: Add --stdin-paths option
From: Junio C Hamano @ 2007-10-26 21:00 UTC (permalink / raw)
  To: Adam Roben; +Cc: git
In-Reply-To: <1193307927-3592-7-git-send-email-aroben@apple.com>

Adam Roben <aroben@apple.com> writes:

> This allows multiple paths to be specified on stdin.

Ok.  List of paths is certainly a good thing to have.

In addition, if you are enhancing cat-file to spew chunked
output out, I suspect that there should be a mode of operation
for hash-object that eats that data format.  IOW, this pipe

	git-cat-file --batch <list-of-sha1 |
        git-hash-object --batch

should be an intuitive no-op, shouldn't it?

^ permalink raw reply

* Re: [PATCH 5/9] Add tests for git hash-object
From: Junio C Hamano @ 2007-10-26 21:00 UTC (permalink / raw)
  To: Adam Roben; +Cc: git, Johannes Sixt
In-Reply-To: <1193307927-3592-6-git-send-email-aroben@apple.com>

Adam Roben <aroben@apple.com> writes:

> +test_expect_success \
> +    'hash a file' \
> +    "test $hello_sha1 = \$(git hash-object hello)"
> +
> +test_expect_success \
> +    'hash from stdin' \
> +    "test $hello_sha1 = \$(echo '$hello_content' | git hash-object --stdin)"

Needs to make sure no object has been written to the object
database at this point?

> +test_expect_success \
> +    'hash a file and write to database' \
> +    "test $hello_sha1 = \$(git hash-object -w hello)"

... and make sure the objectis written here?

> +test_expect_success \
> +    'hash from stdin and write to database' \
> +    "test $hello_sha1 = \$(echo '$hello_content' | git hash-object -w --stdin)"
> +
> +test_done

... and/or here?

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox