Git development

* Re: Unresolved issues #2
From: Linus Torvalds @ 2006-05-06 15:26 UTC
  To: Junio C Hamano; +Cc: Daniel Barkalow, Git Mailing List
In-Reply-To: <7vhd43vgnm.fsf@assigned-by-dhcp.cox.net>

On Fri, 5 May 2006, Junio C Hamano wrote:
>
> > So I'd argue that (a) yes, we do want to have the "proto porcelain" that 
> > sets remote branch information without the user having to know the magic 
> > "git repo-config" incantation, or know which file in .git/remotes/ to 
> > edit, but that (b) it's even more important to try to decide on what the 
> > remote description format _is_.
> 
> Is it format you care about or the semantics?

I _personally_ care about the semantics, but not very deeply - since I 
tend to actually have just one main branch, and a couple of throw-away 
ones if I ended up working on something.

But I think that for this thing to become useful, we want to care about 
the format - or at least the interface to the different users (with the 
acknowledgement that "users" should often be porcelain above us).

Right now we've basically had people hand-editing the remotes files, and I 
think cogito still uses the older branches format that came from cogito in 
the first place. I think we should just try to decide on a config file 
format, and make it easy for cogito etc to use it.

		Linus

* Re: [ANNOUNCE] Git wiki
From: Jakub Narebski @ 2006-05-06 13:37 UTC
  To: git
In-Reply-To: <e3g9m6$8i0$1@sea.gmane.org>

Jakub Narebski wrote:

> Petr Baudis wrote:

>> If you use persistent file ids, you never miss it _AND_ you DO NOT WALK
>> THE COMMIT CHAIN! You still just match file ids in the two trees.
> 
> Let's not jump to one of the possible solutions. Detecting and noting
> renames and content moves (with user interaction) at commit time is nice...
> unless one does something which does not allow interactivity (like applying
> a patchbomb), but even then detecting and saving the info at commit would
> be a good idea.
> 
> What we need is, for two given linked revisions (with a path between
> them), to easily extract information about renames (content moving).
> Perhaps using an additional structure... best if we could do this without
> walking the chain. The rest is details... ;-P

Or rather a structure which, for a given file F in a given revision A and a
given other revision B, would tell ALL the files in revision B which are
sources of the contents (via the history/commit tree) of the file F.

-- 
Jakub Narebski
Warsaw, Poland

* Re: [ANNOUNCE] Git wiki
From: Bertrand Jacquin @ 2006-05-06 12:46 UTC
  To: Junio C Hamano; +Cc: Jakub Narebski, git
In-Reply-To: <7vslnntxay.fsf@assigned-by-dhcp.cox.net>

On 5/6/06, Junio C Hamano <junkio@cox.net> wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
> > Perhaps an option to do rename detection with walking the commit chain?
>
> Have fun implementing that ;-).

I agree that it could be interesting to have such a thing. But that's
incredibly stupid and moreover a rare case.

--
Beber
#e.fr@freenode

* [RFC] Managing projects - advanced Git tutorial/walkthrough
From: Jakub Narebski @ 2006-05-06  8:43 UTC
  To: git

I have browsed through the Git documentation: "A tutorial introduction to
git" (tutorial.txt), "A short git tutorial" (core-tutorial.txt) which,
contrary to its title, is a tutorial on low-level git commands and is
longer than the first one, "Everyday GIT With 20 Commands Or
So" (everyday.txt) and "git for CVS users" (cvs-migration.txt) which does
not mention git-blame and git-annotate.

What I miss is a walkthrough-type tutorial describing a typical workflow (or
workflows), and a tutorial concentrating on advanced topics which may come
up only once in a while, or only in some cases, but which it would be nice
to know how to resolve.

Perhaps some of the following problems would need Git improvement (e.g.
better support for subprojects: "bind" idea)...

1. Description of typical workflow, with 'stable'/'maintenance'/'fixes' and
'development'/'master'/'main' branches, how to put bugfixes into both
branches etc. Perhaps description of git branches and workflow, or Linux
kernel branches and workflow.

2. Contrib: how to add a project which was externally managed to contrib and
later/or to core, preserving history. Examples: gitk for git, or, perhaps,
as parsecvs would be for git, or git-svn for git.

3. Subprojects: how to manage a project which depends on another externally
managed (third-party) project, and perhaps needs patches for it. Examples:
out-of-tree kernel patches + userspace tools, a plugin for some program which
may need bugfixes, a program which needs some library, gitk before its
incorporation into git,... Perhaps a description of the whole sequence of
project development, from an add-on project (some new filesystem for Linux,
gitk) to being incorporated into a bigger project (filesystem included in
the Linux kernel, gitk in the git repository).

4. Splitting a repository: splitting one big project (X.org, a Linux
distribution) into modules.

-- 
Jakub Narebski
Warsaw, Poland

* Re: [ANNOUNCE] Git wiki
From: Junio C Hamano @ 2006-05-06  7:41 UTC
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e3hjfk$bjn$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Perhaps an option to do rename detection with walking the commit chain?

Have fun implementing that ;-).

* Re: [PATCH] binary patch.
From: Junio C Hamano @ 2006-05-06  7:40 UTC
  To: git
In-Reply-To: <7vy7xgzsiu.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

>> Yeah, things still to be done are deflating both delta and
>> optionally perhaps use just deflate without delta for "new file"
>> codepath.
>>
>> And testsuite.
>
> -- >8 --
> [PATCH] binary diff: further updates.
>...
> Signed-off-by: Junio C Hamano <junkio@cox.net>
>
> ---
>
>  * Have done only very minimum testing, and unlike somebody else ;-)
>    my code is not always perfect, so this will still stay out of
>    "next" for a while.

OK, now it passes the testsuite I wrote, so I'll push this out
in "next".  I was not drunk while doing the testsuite so the
code now must be perfect ;-).

* Re: [ANNOUNCE] Git wiki
From: Jakub Narebski @ 2006-05-06  7:33 UTC
  To: git
In-Reply-To: <46a038f90605052353m2d2aca11weac7efee80c6fb35@mail.gmail.com>

Martin Langhoff wrote:

> On 5/6/06, Junio C Hamano <junkio@cox.net> wrote:

>>> Try doing
>>>
>>> git diff v1.3.0..
>>>
>>> and think about what that actually _means_. Think about the fact that it
>>> doesn't actually walk the commit chain at all: it diffs the trees
>>> between v1.3.0 and the current one. What if the rename happened in a
>>> commit in the middle?
>>
>> Then the automated renames detection will miss it given that the other
>> accumulated differences are large enough, and the suggested workarounds
>> _are_ precisely walking the commit chain.
> 
> I agree here with Pasky that after a while the automated
> renames/copy/splitup detection will miss the operation in cases where
> it would be interesting to note it to the user.

Perhaps an option to do rename detection with walking the commit chain?
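
The difference between comparing only the two end trees and walking the
commit chain can be shown with a toy model. This is only a sketch under
stated assumptions: trees are plain dicts mapping a path to a list of
lines, the similarity() score and the 0.5 threshold are stand-ins rather
than git's actual heuristic, and all function names are hypothetical.

```python
def similarity(a, b):
    """Fraction of lines two blobs share; a stand-in for a real similarity score."""
    if not a and not b:
        return 1.0
    return len(set(a) & set(b)) / max(len(a), len(b))


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair each path that vanished from old_tree with its most similar new path."""
    deleted = [p for p in old_tree if p not in new_tree]
    added = [p for p in new_tree if p not in old_tree]
    renames = {}
    for src in deleted:
        scored = [(similarity(old_tree[src], new_tree[dst]), dst) for dst in added]
        if scored:
            score, dst = max(scored)
            if score >= threshold:
                renames[src] = dst
    return renames


def detect_renames_walking(trees, threshold=0.5):
    """Compose the renames found between each pair of adjacent trees."""
    where = {path: path for path in trees[0]}  # original path -> current path
    for old, new in zip(trees, trees[1:]):
        step = detect_renames(old, new, threshold)
        where = {orig: step.get(cur, cur) for orig, cur in where.items()}
    return {o: c for o, c in where.items() if o != c and c in trees[-1]}


original = ["line %d" % i for i in range(10)]
rewritten = original[:2] + ["new %d" % i for i in range(8)]
v1 = {"a.txt": original}
v2 = {"b.txt": original}    # pure rename in a middle commit
v3 = {"b.txt": rewritten}   # later commits rewrite most of the file

print(detect_renames(v1, v3))                 # endpoints only: rename is missed
print(detect_renames_walking([v1, v2, v3]))   # step by step: rename is found
```

Comparing v1 with v3 directly finds nothing, because only two of ten
lines survive; walking the chain sees the unmodified rename in the middle
step and carries it forward. The cost is one tree comparison per commit
instead of one in total.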

-- 
Jakub Narebski
Warsaw, Poland

* Re: [ANNOUNCE] Git wiki
From: Junio C Hamano @ 2006-05-06  7:14 UTC
  To: Martin Langhoff; +Cc: git
In-Reply-To: <46a038f90605052353m2d2aca11weac7efee80c6fb35@mail.gmail.com>

"Martin Langhoff" <martin.langhoff@gmail.com> writes:

> I agree here with Pasky that after a while the automated
> renames/copy/splitup detection will miss the operation in cases where
> it would be interesting to note it to the user. IIRC git-rerere is the
> tool that knows about this (still voodoo to me how) and could be used
> to help here. At what (runtime) cost, I don't know, but that kind of
> walking history to tell me more interesting things about the diff is
> something that is usually worthwhile.

FYI rerere is a totally unrelated voodoo.

It remembers the conflict marker pattern <<< === >>> immediately
after it runs "merge" (ah, that reminds me -- I should replace
them with diff3), and then remembers the result of the manual
resolution just before the user makes a commit.  Then, when next
time it runs "merge" for something and notices <<< === >>>
pattern it has seen before, it runs a three-way merge between
the previous resolution result and the current conflicted state,
using the previous conflicted state as the common origin.
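
The replay step described above can be sketched as a three-way merge
over chunks. This is an illustration only, not rerere's code: it assumes
each file is already split into the same number of aligned chunks (the
whole <<< === >>> region counting as one chunk), and merge3() is a
hypothetical helper.

```python
def merge3(base, ours, theirs):
    """Three-way merge of aligned chunk lists; raises on a real conflict."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch needs chunk-aligned inputs")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:
            merged.append(o)      # only "ours" changed, or both agree
        elif o == b:
            merged.append(t)      # only "theirs" changed
        else:
            raise ValueError("conflict: %r vs %r" % (o, t))
    return merged


CONFLICT = "<<<<<<<\nfoo(1);\n=======\nfoo(2);\n>>>>>>>\n"

previous_conflict = ["header v1\n", CONFLICT, "footer\n"]            # common origin
previous_resolution = ["header v1\n", "foo(1, 2);\n", "footer\n"]    # what the user committed
current_conflict = ["header v2\n", CONFLICT, "footer\n"]             # same markers seen again

result = merge3(previous_conflict, previous_resolution, current_conflict)
print("".join(result))
```

Because the previous conflicted state is the common origin, the manual
resolution is one side's change and whatever else differs in the
current file (here the header) is the other side's change, so both end
up in the result.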

* Re: Unresolved issues #2 (shallow clone again)
From: Junio C Hamano @ 2006-05-06  7:10 UTC
  To: Martin Langhoff; +Cc: git
In-Reply-To: <46a038f90605052323o29f8bfadr7426f97d8dfc2319@mail.gmail.com>

"Martin Langhoff" <martin.langhoff@gmail.com> writes:

> On 5/6/06, Linus Torvalds <torvalds@osdl.org> wrote:
>> Of course, that would require another slight difference to "rev-list.c",
>> where we'd only recurse into trees of selected commit objects (ie we'd
>> have to mark the HAVE/WANT commits specially, but it's not exactly
>> complex either).
>
> Would it make sense to make all the shallow clone machinery walk
> everything and trim only blob objects? In that case, all the machinery
> that walks commits/trees would remain intact -- we only have to deal
> with the case of not having blob objects, which affects fewer
> codepaths.
>
> It means that for a merge or checkout involving stuff we "don't have",
> it's trivial to know what we are missing, and so we can attempt a fetch of
> the missing objects or tell the user how to request them before
> retrying.
>
> And in any case commits and trees are lightweight and compress well...

Commits maybe, but is this based on hard facts?

Earlier Linus said something about "git log" working on a
commit-only copy, but obviously you would want at least trees
for the path limiting part to work, so having commits and trees
would be handy, but my impression was that at least for a deep
project like the kernel, trees tend to be nonnegligible (a commit
consists of 18K paths and 1200 trees or something like that).

* Re: [ANNOUNCE] Git wiki
From: Martin Langhoff @ 2006-05-06  6:53 UTC
  To: Junio C Hamano; +Cc: Petr Baudis, git
In-Reply-To: <7vr738w8t4.fsf@assigned-by-dhcp.cox.net>

On 5/6/06, Junio C Hamano <junkio@cox.net> wrote:
> > If you use persistent file ids, you never miss it _AND_ you DO NOT WALK
> > THE COMMIT CHAIN! You still just match file ids in the two trees.
>
> It is unworkable.

+1 -- explicit file ids are evil. Arch/TLA demonstrated that amply...
they are a serious annoyance to the end user, they have a lot of
not-elegantly solvable cases (same file created with the same contents
in several repos -- say via an emailed patch) that git gets right
_today_.

They _are_ useful in a very small set of cases -- namely in the case
of a naive mv, which git handles correctly today. Subtler things git
sometimes does right, sometimes fails, but it can be made to be much
smarter by interpreting content changes better, for instance all this
talk about getting pickaxe to guess where the patch should be applied
for a file that got split into 3.

But those subtler cases are totally impossible with explicit id
tracking. I used Arch for a long time with very large trees, and
renames coming left, right and centre. Explicit ids didn't help much,
and the number of manual fixups we had to do was awful.

I am using GIT with the very same project, and just now, typing this,
I realised that there are still many renames happening in the project.
I had forgotten about it -- well, not really: I do use git-merge
instead of cg-merge when I suspect there may be interesting cases ;-)

Of course, YMMV, and I have to confess I was a sceptic for a while...
but now as an end-user dealing with messy projects, I say LIRAR: Linus
Is Right About Renames.

OTOH,

>> Try doing
>>
>> git diff v1.3.0..
>>
>> and think about what that actually _means_. Think about the fact that it
>> doesn't actually walk the commit chain at all: it diffs the trees between
>> v1.3.0 and the current one. What if the rename happened in a commit in
>> the middle?
>
> Then the automated renames detection will miss it given that the other
> accumulated differences are large enough, and the suggested workarounds
> _are_ precisely walking the commit chain.

I agree here with Pasky that after a while the automated
renames/copy/splitup detection will miss the operation in cases where
it would be interesting to note it to the user. IIRC git-rerere is the
tool that knows about this (still voodoo to me how) and could be used
to help here. At what (runtime) cost, I don't know, but that kind of
walking history to tell me more interesting things about the diff is
something that is usually worthwhile.

Usual disclaimers apply.

martin

* Re: Unresolved issues #2 (shallow clone again)
From: Martin Langhoff @ 2006-05-06  6:23 UTC
  To: Linus Torvalds; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0605050848230.3622@g5.osdl.org>

On 5/6/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Of course, that would require another slight difference to "rev-list.c",
> where we'd only recurse into trees of selected commit objects (ie we'd
> have to mark the HAVE/WANT commits specially, but it's not exactly
> complex either).

Would it make sense to make all the shallow clone machinery walk
everything and trim only blob objects? In that case, all the machinery
that walks commits/trees would remain intact -- we only have to deal
with the case of not having blob objects, which affects fewer
codepaths.

It means that for a merge or checkout involving stuff we "don't have",
it's trivial to know what we are missing, and so we can attempt a fetch of
the missing objects or tell the user how to request them before
retrying.

And in any case commits and trees are lightweight and compress well...

> Of course, the complexity of _both_ of these approaches is really in the
> fsck stage, and all the crud you need to then do other things with these
> pared-down repos. For example, do you allow cloning? And do you just
> automatically notice that you're cloning a shallow repo, and only do a
> shallow clone. Etc etc..

Definitely.

cheers,

martin

* Re: Unresolved issues #2
From: Junio C Hamano @ 2006-05-06  5:58 UTC
  To: Linus Torvalds; +Cc: Daniel Barkalow, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0605041715500.3611@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> I'm actually growing pretty fond of the config file interfaces that Dscho 
> is pushing. I really like the idea of "git pull" doing different things 
> depending on which branch is active at the time, because different 
> branches really can have different sources they come from.

> Always pulling from the same default source seems wrong,...

> So Johannes' patches seem to move into that direction, and having it all 
> in the config file actually seems to be quite readable.

I share the same reasoning and that is why I am carrying the
series in "next".  I think per branch attributes are wonderful
things.

> So I'd argue that (a) yes, we do want to have the "proto porcelain" that 
> sets remote branch information without the user having to know the magic 
> "git repo-config" incantation, or know which file in .git/remotes/ to 
> edit, but that (b) it's even more important to try to decide on what the 
> remote description format _is_.

Is it format you care about or the semantics?

> I personally have just two preferences:
>
>  - I'd like each branch I'm on to have a "default source" for pulling (and 
>    _maybe_ for pushing too). I'd like to just say "git pull", and it would 
>    automatically select the appropriate thing to pull from.
>
>  - maybe the same per-branch thing for "push", but more importantly for 
>    me, I like to push to multiple destinations, and I'd like the 
>    description format to be sane. I think it may already be sane in the 
>    form it is in now (supporting both config file _and_ .git/remotes/ 
>    formats), I'd just like us to decide on exactly what the meaning is, 
>    and hopefully get to the point where we can tell porcelain how to use 
>    that meaning to their advantage (and not change it)
>
> Others may disagree, or (equally importantly), may have additional 
> preferences. We should try to find something that works for everybody, and 
> that is easy to work with.

In my day job, I maintain a base code for a generic application
in "master", various topics, mostly branched from "master" but
sometimes from another topic branch, and one branch each per
customer installation, which pulls from the master, topics and
contains specific customizations.  While on master or any one of
the generic topic branches, I need to remember not to pull from
installation branches.  For that matter, the installation
branches should not be pulled into anything else.  So not just
"this branch usually merges from there", but "this branch should
not be merged into others" (mark "installation branches" as
such), and "this branch should never merge from that one" (mark
"master" with "installation branches") would prevent mistakes.
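
The three kinds of per-branch attribute mentioned above could be modelled
roughly as follows. Everything here is hypothetical and only meant to
make the semantics concrete: the attribute names (default_source, leaf,
never_merge_from), the branch names, and the merge_objection() helper
are assumptions, not an existing config format or git interface.

```python
from dataclasses import dataclass, field


@dataclass
class BranchAttrs:
    default_source: str = ""          # "this branch usually merges from there"
    leaf: bool = False                # "this branch should not be merged into others"
    never_merge_from: set = field(default_factory=set)  # "never merge from that one"


def merge_objection(attrs, into, source):
    """Return a reason why merging `source` into `into` is a mistake, or None."""
    src = attrs.get(source, BranchAttrs())
    dst = attrs.get(into, BranchAttrs())
    if src.leaf:
        return "%s should not be merged into other branches" % source
    if source in dst.never_merge_from:
        return "%s should never merge from %s" % (into, source)
    return None


attrs = {
    "master": BranchAttrs(default_source="origin",
                          never_merge_from={"install/acme"}),
    "topic/foo": BranchAttrs(default_source="master"),
    "install/acme": BranchAttrs(default_source="master", leaf=True),
}

print(merge_objection(attrs, "master", "install/acme"))   # refused
print(merge_objection(attrs, "install/acme", "master"))   # None: allowed
```

A porcelain could consult default_source when "git pull" is run without
arguments and call something like merge_objection() before merging.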

One thing I noticed in "What's in libata.git" Jeff did by
mimicking my "What's in git.git" was that the description for
each topic branch included where it branched from (iow, what
other branch it builds on).  This is sometimes derivable, but
having it as a property for a branch is very handy.

* [PATCH/RFC (git-core)] update-index --again
From: Junio C Hamano @ 2006-05-06  5:05 UTC
  To: git; +Cc: Carl Worth

After running 'git-update-index' for some paths, you may want to
do the update on the same set of paths again.

The new flag --again checks the paths whose index entries
are different from the HEAD commit and updates them from the
working tree contents.

This was brought up by Carl Worth on #git.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * I want to reorganize index file to contain both blob and tree
   entries in not so distant future, so I probably should not be
   doing something like this which _would_ need rework when that
   happens, but I think what Carl wanted to do is a reasonable
   thing to ask.

   http://colabti.de/irclogger/irclogger_log/git?date=2006-05-05,Fri&sel=16#l31

 Documentation/git-update-index.txt |    6 ++-
 t/t2101-update-index-reupdate.sh   |   73 ++++++++++++++++++++++++++++++++++++
 update-index.c                     |   56 ++++++++++++++++++++++++++--
 3 files changed, 131 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 57177c7..d043e86 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -15,7 +15,7 @@ SYNOPSIS
 	     [--cacheinfo <mode> <object> <file>]\*
 	     [--chmod=(+|-)x]
 	     [--assume-unchanged | --no-assume-unchanged]
-	     [--really-refresh] [--unresolve]
+	     [--really-refresh] [--unresolve] [--again]
 	     [--info-only] [--index-info]
 	     [-z] [--stdin]
 	     [--verbose]
@@ -80,6 +80,10 @@ OPTIONS
 	filesystem that has very slow lstat(2) system call
 	(e.g. cifs).
 
+--again::
+	Runs `git-update-index` itself on the paths whose index
+	entries are different from those from the `HEAD` commit.
+
 --unresolve::
 	Restores the 'unmerged' or 'needs updating' state of a
 	file during a merge if it was cleared by accident.
diff --git a/t/t2101-update-index-reupdate.sh b/t/t2101-update-index-reupdate.sh
new file mode 100755
index 0000000..5c505c6
--- /dev/null
+++ b/t/t2101-update-index-reupdate.sh
@@ -0,0 +1,73 @@
+#!/bin/sh
+#
+# Copyright (c) 2006 Junio C Hamano
+#
+
+test_description='git-update-index --again test.
+'
+
+. ./test-lib.sh
+
+test_expect_success 'update-index --add' \
+	'echo hello world >file1 &&
+	 echo goodbye people >file2 &&
+	 git-update-index --add file1 file2 &&
+	 git-ls-files -s >current &&
+	 cmp current - <<\EOF
+100644 3b18e512dba79e4c8300dd08aeb37f8e728b8dad 0	file1
+100644 9db8893856a8a02eaa73470054b7c1c5a7c82e47 0	file2
+EOF'
+
+test_expect_success 'update-index --again' \
+	'rm -f file1 &&
+	echo hello everybody >file2 &&
+	if git-update-index --again
+	then
+		echo should have refused to remove file1
+		exit 1
+	else
+		echo happy - failed as expected
+	fi &&
+	 git-ls-files -s >current &&
+	 cmp current - <<\EOF
+100644 3b18e512dba79e4c8300dd08aeb37f8e728b8dad 0	file1
+100644 9db8893856a8a02eaa73470054b7c1c5a7c82e47 0	file2
+EOF'
+
+test_expect_success 'update-index --remove --again' \
+	'git-update-index --remove --again &&
+	 git-ls-files -s >current &&
+	 cmp current - <<\EOF
+100644 0f1ae1422c2bf43f117d3dbd715c988a9ed2103f 0	file2
+EOF'
+
+test_expect_success 'first commit' 'git-commit -m initial'
+
+test_expect_success 'update-index again' \
+	'mkdir -p dir1 &&
+	echo hello world >dir1/file3 &&
+	echo goodbye people >file2 &&
+	git-update-index --add file2 dir1/file3 &&
+	echo hello everybody >file2 &&
+	echo happy >dir1/file3 &&
+	git-update-index --again &&
+	git-ls-files -s >current &&
+	cmp current - <<\EOF
+100644 53ab446c3f4e42ce9bb728a0ccb283a101be4979 0	dir1/file3
+100644 0f1ae1422c2bf43f117d3dbd715c988a9ed2103f 0	file2
+EOF'
+
+test_expect_success 'update-index --update from subdir' \
+	'echo not so happy >file2 &&
+	cd dir1 &&
+	cat ../file2 >file3 &&
+	git-update-index --again &&
+	cd .. &&
+	git-ls-files -s >current &&
+	cmp current - <<\EOF
+100644 d7fb3f695f06c759dbf3ab00046e7cc2da22d10f 0	dir1/file3
+100644 0f1ae1422c2bf43f117d3dbd715c988a9ed2103f 0	file2
+EOF'
+
+test_done
+
diff --git a/update-index.c b/update-index.c
index 1870ac7..e6c460b 100644
--- a/update-index.c
+++ b/update-index.c
@@ -473,7 +473,7 @@ static void read_index_info(int line_ter
 }
 
 static const char update_index_usage[] =
-"git-update-index [-q] [--add] [--replace] [--remove] [--unmerged] [--refresh] [--really-refresh] [--cacheinfo] [--chmod=(+|-)x] [--assume-unchanged] [--info-only] [--force-remove] [--stdin] [--index-info] [--unresolve] [--ignore-missing] [-z] [--verbose] [--] <file>...";
+"git-update-index [-q] [--add] [--replace] [--remove] [--unmerged] [--refresh] [--really-refresh] [--cacheinfo] [--chmod=(+|-)x] [--assume-unchanged] [--info-only] [--force-remove] [--stdin] [--index-info] [--unresolve] [--again] [--ignore-missing] [-z] [--verbose] [--] <file>...";
 
 static unsigned char head_sha1[20];
 static unsigned char merge_head_sha1[20];
@@ -488,11 +488,13 @@ static struct cache_entry *read_one_ent(
 	struct cache_entry *ce;
 
 	if (get_tree_entry(ent, path, sha1, &mode)) {
-		error("%s: not in %s branch.", path, which);
+		if (which)
+			error("%s: not in %s branch.", path, which);
 		return NULL;
 	}
 	if (mode == S_IFDIR) {
-		error("%s: not a blob in %s branch.", path, which);
+		if (which)
+			error("%s: not a blob in %s branch.", path, which);
 		return NULL;
 	}
 	size = cache_entry_size(namelen);
@@ -597,6 +599,47 @@ static int do_unresolve(int ac, const ch
 	return err;
 }
 
+static int do_reupdate(int ac, const char **av,
+		       const char *prefix, int prefix_length)
+{
+	/* Read HEAD and run update-index on paths that are
+	 * merged and already different between index and HEAD.
+	 */
+	int pos;
+	int has_head = 1;
+
+	if (read_ref(git_path("HEAD"), head_sha1))
+		/* If there is no HEAD, that means it is an initial
+		 * commit.  Update everything in the index.
+		 */
+		has_head = 0;
+ redo:
+	for (pos = 0; pos < active_nr; pos++) {
+		struct cache_entry *ce = active_cache[pos];
+		struct cache_entry *old = NULL;
+		int save_nr;
+		if (ce_stage(ce))
+			continue;
+		if (has_head)
+			old = read_one_ent(NULL, head_sha1,
+					   ce->name, ce_namelen(ce), 0);
+		if (old && ce->ce_mode == old->ce_mode &&
+		    !memcmp(ce->sha1, old->sha1, 20)) {
+			free(old);
+			continue; /* unchanged */
+		}
+		/* Be careful.  The working tree may not have the
+		 * path anymore, in which case, under 'allow_remove',
+		 * or worse yet 'allow_replace', active_nr may decrease.
+		 */
+		save_nr = active_nr;
+		update_one(ce->name + prefix_length, prefix, prefix_length);
+		if (save_nr != active_nr)
+			goto redo;
+	}
+	return 0;
+}
+
 int main(int argc, const char **argv)
 {
 	int i, newfd, entries, has_errors = 0, line_termination = '\n';
@@ -714,6 +757,13 @@ int main(int argc, const char **argv)
 					active_cache_changed = 0;
 				goto finish;
 			}
+			if (!strcmp(path, "--again")) {
+				has_errors = do_reupdate(argc - i, argv + i,
+							 prefix, prefix_length);
+				if (has_errors)
+					active_cache_changed = 0;
+				goto finish;
+			}
 			if (!strcmp(path, "--ignore-missing")) {
 				not_new = 1;
 				continue;
-- 
1.3.2.g2749
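
The selection rule that --again applies can be modelled in a few lines.
This is a sketch of the semantics only, not a port of do_reupdate():
the index, HEAD and working tree are plain dicts from path to content,
reupdate() is a hypothetical name, and the error text is made up. The C
code also has to restart its scan when active_nr changes, which a dict
copy sidesteps here.

```python
def reupdate(index, head, worktree, allow_remove=False):
    """Return a copy of index with every entry that differs from head refreshed.

    head is None when there is no HEAD yet (initial commit); in that case
    every index entry is refreshed.
    """
    result = dict(index)
    for path in sorted(index):
        if head is not None and head.get(path) == index[path]:
            continue  # same as HEAD, so not one of the paths updated earlier
        if path in worktree:
            result[path] = worktree[path]
        elif allow_remove:
            del result[path]
        else:
            raise ValueError("%s: missing from working tree" % path)
    return result


# Before the first commit: file1 was deleted, file2 was edited again.
index = {"file1": "hello world", "file2": "goodbye people"}
worktree = {"file2": "hello everybody"}
print(reupdate(index, None, worktree, allow_remove=True))

# After a commit: only entries that already differ from HEAD are refreshed.
head = {"file2": "hello everybody"}
index = {"dir1/file3": "hello world", "file2": "hello everybody"}
worktree = {"dir1/file3": "not so happy", "file2": "not so happy"}
print(reupdate(index, head, worktree))
```

The second call leaves file2 alone even though the working tree copy has
changed, because its index entry still matches HEAD; that is the same
outcome the last test in t2101 checks for.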

* script to create debian package
From: Matthias Lederhofer @ 2006-05-05 20:53 UTC
  To: git

[-- Attachment #1: Type: text/plain, Size: 185 bytes --]

I wrote a script for git similar to scripts/package/builddeb in the
kernel tree. Is there any interest in integrating it into the git
source tree? I've attached the script.

[-- Attachment #2: git-deb-build --]
[-- Type: text/plain, Size: 826 bytes --]

#!/bin/sh
tmpdir="`pwd`/debian/tmp"

make prefix=/usr all doc
make prefix="$tmpdir/usr" install install-doc

version="`cat GIT-VERSION-FILE | cut -d ' ' -f 3`"
name="git debian package script <`id -nu`@`hostname -f`>"

mkdir -p "$tmpdir/DEBIAN"

cat <<EOF > debian/control
Source: git
Section: devel
Priority: optional
Maintainer: $name
Standards-Version: 3.6.1

Package: git
Conflicts: git-arch, git-core, git-cvs, git-doc, git-email, git-svn, gitk
Provides: git-arch, git-core, git-cvs, git-doc, git-email, git-svn, gitk
Architecture: any
Description: git, version $version
 This package contains git version $version.
EOF

cat <<EOF > debian/changelog
git ($version-1) unstable; urgency=low

  * A standard release

 -- $name  `date -R`
EOF

chmod -R og-w debian/tmp
dpkg-gencontrol -isp
fakeroot dpkg --build "$tmpdir" ..

* [PATCH 2/2] git-svn 1.0.0
From: Eric Wong @ 2006-05-05 19:35 UTC
  To: junkio, git; +Cc: Eric Wong
In-Reply-To: <11468577403821-git-send-email-normalperson@yhbt.net>

Signed-off-by: Eric Wong <normalperson@yhbt.net>

---

 contrib/git-svn/git-svn.perl |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

3680a525691232e2f45f2bcf63458e547ff109ba
diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index e003501..de13a96 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -8,7 +8,7 @@ use vars qw/	$AUTHOR $VERSION
 		$GIT_SVN_INDEX $GIT_SVN
 		$GIT_DIR $REV_DIR/;
 $AUTHOR = 'Eric Wong <normalperson@yhbt.net>';
-$VERSION = '0.11.0';
+$VERSION = '1.0.0';
 
 use Cwd qw/abs_path/;
 $GIT_DIR = abs_path($ENV{GIT_DIR} || '.git');
-- 
1.3.2.ge3d7

* [PATCH 1/2] git-svn: documentation updates
From: Eric Wong @ 2006-05-05 19:35 UTC
  To: junkio, git; +Cc: Eric Wong
In-Reply-To: <11468577402388-git-send-email-normalperson@yhbt.net>

* Clarify that 'init' requires an argument
* Remove instances of 'SVN_URL' in the manpage, it's not an
  environment variable.
* Refer to 'Additional Fetch Arguments' when documenting 'fetch'
* document --authors-file / -A option

Thanks to Pavel Roskin and Seth Falcon for bringing these issues
to my attention.

Signed-off-by: Eric Wong <normalperson@yhbt.net>

---

 contrib/git-svn/git-svn.perl |    6 ++++--
 contrib/git-svn/git-svn.txt  |   45 ++++++++++++++++++++++++++++++++----------
 2 files changed, 38 insertions(+), 13 deletions(-)

f16bddd18d55404fbaa4a88020f4d58ab2b1c620
diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index 7c44450..e003501 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -42,7 +42,8 @@ my %fc_opts = ( 'no-ignore-externals' =>
 my %cmd = (
 	fetch => [ \&fetch, "Download new revisions from SVN",
 			{ 'revision|r=s' => \$_revision, %fc_opts } ],
-	init => [ \&init, "Initialize and fetch (import)", { } ],
+	init => [ \&init, "Initialize a repo for tracking" .
+			  " (requires URL argument)", { } ],
 	commit => [ \&commit, "Commit git revisions to SVN",
 			{	'stdin|' => \$_stdin,
 				'edit|e' => \$_edit,
@@ -220,7 +221,8 @@ when you have upgraded your tools and ha
 }
 
 sub init {
-	$SVN_URL = shift or croak "SVN repository location required\n";
+	$SVN_URL = shift or die "SVN repository location required " .
+				"as a command-line argument\n";
 	unless (-d $GIT_DIR) {
 		sys('git-init-db');
 	}
diff --git a/contrib/git-svn/git-svn.txt b/contrib/git-svn/git-svn.txt
index e18fcaf..f7d3de4 100644
--- a/contrib/git-svn/git-svn.txt
+++ b/contrib/git-svn/git-svn.txt
@@ -36,17 +36,22 @@ COMMANDS
 --------
 init::
 	Creates an empty git repository with additional metadata
-	directories for git-svn.  The SVN_URL must be specified
-	at this point.
+	directories for git-svn.  The Subversion URL must be specified
+	as a command-line argument.
 
 fetch::
-	Fetch unfetched revisions from the SVN_URL we are tracking.
-	refs/heads/remotes/git-svn will be updated to the latest revision.
+	Fetch unfetched revisions from the Subversion URL we are
+	tracking.  refs/remotes/git-svn will be updated to the
+	latest revision.
 
-	Note: You should never attempt to modify the remotes/git-svn branch
-	outside of git-svn.  Instead, create a branch from remotes/git-svn
-	and work on that branch.  Use the 'commit' command (see below)
-	to write git commits back to remotes/git-svn.
+	Note: You should never attempt to modify the remotes/git-svn
+	branch outside of git-svn.  Instead, create a branch from
+	remotes/git-svn and work on that branch.  Use the 'commit'
+	command (see below) to write git commits back to
+	remotes/git-svn.
+
+	See 'Additional Fetch Arguments' if you are interested in
+	manually joining branches on commit.
 
 commit::
 	Commit specified commit or tree objects to SVN.  This relies on
@@ -62,9 +67,9 @@ rebuild::
 	tracked with git-svn.  Unfortunately, git-clone does not clone
 	git-svn metadata and the svn working tree that git-svn uses for
 	its operations.  This rebuilds the metadata so git-svn can
-	resume fetch operations.  SVN_URL may be optionally specified if
-	the directory/repository you're tracking has moved or changed
-	protocols.
+	resume fetch operations.  A Subversion URL may be optionally
+	specified at the command-line if the directory/repository you're
+	tracking has moved or changed protocols.
 
 show-ignore::
 	Recursively finds and lists the svn:ignore property on
@@ -123,6 +128,24 @@ OPTIONS
 	repo-config key: svn.l
 	repo-config key: svn.findcopiesharder
 
+-A<filename>::
+--authors-file=<filename>::
+
+	Syntax is compatible with the files used by git-svnimport and
+	git-cvsimport:
+
+------------------------------------------------------------------------
+loginname = Joe User <user@example.com>
+------------------------------------------------------------------------
+
+	If this option is specified and git-svn encounters an SVN
+	committer name that does not exist in the authors-file, git-svn
+	will abort operation. The user will then have to add the
+	appropriate entry.  Re-running the previous git-svn command
+	after the authors-file is modified should continue operation.
+
+	repo-config key: svn.authors-file
+
 ADVANCED OPTIONS
 ----------------
 -b<refname>::
-- 
1.3.2.ge3d7
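
The authors-file behaviour documented in the patch above (one
"loginname = Joe User <user@example.com>" entry per line, abort on an
unknown committer) can be sketched as below. This is not git-svn's Perl
code; the regular expression, the function names and the error text are
assumptions chosen to match the documented format.

```python
import re

# login, "=", full name, then the e-mail address in angle brackets
AUTHOR_LINE = re.compile(r"^(\S+)\s*=\s*(.+?)\s*<(.+)>\s*$")


def parse_authors(text):
    """Map each SVN login name to a (name, email) pair; skip lines that do not match."""
    authors = {}
    for line in text.splitlines():
        match = AUTHOR_LINE.match(line)
        if match:
            login, name, email = match.groups()
            authors[login] = (name, email)
    return authors


def lookup_author(authors, login):
    """Abort on an unknown committer so the user can add the missing entry."""
    if login not in authors:
        raise LookupError("author %s is not in the authors file" % login)
    return authors[login]


authors = parse_authors("loginname = Joe User <user@example.com>\n")
print(lookup_author(authors, "loginname"))
```

Raising instead of guessing an identity mirrors the documented
behaviour: the user adds the missing entry and re-runs the same command.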

* [0/2 PATCH] git-svn 1.0.0 release
From: Eric Wong @ 2006-05-05 19:35 UTC
  To: junkio, git

It's been very solid for a long time now.  I haven't run into
any problems with it myself in a while, and no critical bugs
that I know of exist.  Labeling it 1.0.0 may make it look
less scary to new users :)

Thanks to all those who gave feedback.

 git-svn.perl |    8 +++++---
 git-svn.txt  |   45 ++++++++++++++++++++++++++++++++++-----------
 2 files changed, 39 insertions(+), 14 deletions(-)

^ permalink raw reply

* Re: [PATCH] binary patch.
From: Junio C Hamano @ 2006-05-05 20:50 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0605051605340.6713@iabervon.org>

Daniel Barkalow <barkalow@iabervon.org> writes:

> On Fri, 5 May 2006, Junio C Hamano wrote:
>
>> But "binaryness" affects only certain operations that extract
>> the data (e.g. diff and grep) and not others (e.g. fetch).
>> Also, it makes sense to be able to retroactively mark a blob,
>> which was not marked as such originally, as binary.  So I do
>> not think it should be recorded in the object header.
>
> Why do you think it makes sense to retroactively mark a blob with things 
> like binariness or MIME type? To the extent that the information is not 
> possible to extract from the blob contents, it seems to me to be a 
> permanent aspect of the blob. And I could see having blobs with the same 
> content but different type information (that one is a ZIP archive, while 
> this one is an OpenDocument file), and tools may care how they were 
> specified, and the user would want to be able to track how they had 
> historically been marked, if the system allows them to be marked at all.
>
> Of course, there's still the issue of how this info is generated for a new 
> blob; I think it should live in the index for tracked files and come from 
> a .gitignore-style file for new files. (For that matter, there could be a 
> .gitmetadata file, which would handle "ignore" as well as binary and 
> whatever other info you want to produce about your not-previously-tracked 
> files.)

I think Nico's solution (compromise?) is the right and most
practical one.

^ permalink raw reply

* Re: [ANNOUNCE] Git wiki
From: Olivier Galibert @ 2006-05-05 20:45 UTC (permalink / raw)
  To: linux; +Cc: git
In-Reply-To: <20060505181540.GB27689@pasky.or.cz>

On Fri, May 05, 2006 at 08:15:41PM +0200, Petr Baudis wrote:
> The automatic vs. explicit movement tracking is a lot more
> controversial. Explicit movement tracking is pretty easy to provide for
> file-level movements, it's just that the user says "I _did_ move file
> A to file B" (I never got the Linus' argument that the user has no idea
> - he just _performed_ the move, also explicitly, by calling *mv).

In one of my projects 99% of the renames are "done" when unzipping the
source release of the next version.  Explicit tracking would be
unbearable, frankly.

And once you have a good enough implicit tracking, why bother with an
explicit one?

  OG.

^ permalink raw reply

* Re: [PATCH] binary patch.
From: Daniel Barkalow @ 2006-05-05 20:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git
In-Reply-To: <7vac9wxom0.fsf@assigned-by-dhcp.cox.net>

On Fri, 5 May 2006, Junio C Hamano wrote:

> But "binaryness" affects only certain operations that extract
> the data (e.g. diff and grep) and not others (e.g. fetch).
> Also, it makes sense to be able to retroactively mark a blob,
> which was not marked as such originally, as binary.  So I do
> not think it should be recorded in the object header.

Why do you think it makes sense to retroactively mark a blob with things 
like binariness or MIME type? To the extent that the information is not 
possible to extract from the blob contents, it seems to me to be a 
permanent aspect of the blob. And I could see having blobs with the same 
content but different type information (that one is a ZIP archive, while 
this one is an OpenDocument file), and tools may care how they were 
specified, and the user would want to be able to track how they had 
historically been marked, if the system allows them to be marked at all.

Of course, there's still the issue of how this info is generated for a new 
blob; I think it should live in the index for tracked files and come from 
a .gitignore-style file for new files. (For that matter, there could be a 
.gitmetadata file, which would handle "ignore" as well as binary and 
whatever other info you want to produce about your not-previously-tracked 
files.)

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: [PATCH] binary patch.
From: Nicolas Pitre @ 2006-05-05 20:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vac9wxom0.fsf@assigned-by-dhcp.cox.net>

On Fri, 5 May 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Fri, 5 May 2006, Junio C Hamano wrote:
> >
> >> The delta is going to be deflated and hopefully gets a bit
> >> smaller, so if we really care that level of detail, it might be
> >> worth to do (deflate_size*3/2) or something like that here, use
> >> delta with or without deflate whichever is smaller, and mark the
> >> uncompressed delta with a different tag ("uncompressed delta"?).
> >> And for symmetry, to deal with uncompressible data, we may want
> >> to have "uncompressed literal" as well.
> >
> > Nah...  Please just forget that.  ;-)
> 
> I was serious about the above actually.

And I think this is overkill.

First, if a deflated delta is to be _larger_ than its inflated version 
this is because the delta data is really really short, most probably 
shorter than a single base85 line.  Same for literal data.

So I truly think the pretty special and rare case where not deflating 
might be smaller is simply not worth the added complexity.
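
The trade-off can be illustrated with a minimal sketch, assuming Python's
zlib as a stand-in for the deflate step.  The function name and the
"deflated"/"uncompressed" tags are hypothetical and are not part of any
actual patch format; the sketch only shows that very short payloads tend to
grow when deflated, while longer repetitive ones shrink.

```python
import zlib


def encode_smaller(payload: bytes):
    """Return (tag, data), keeping whichever representation is smaller.

    The tag names are illustrative only; no real patch format is implied.
    """
    deflated = zlib.compress(payload)
    if len(deflated) < len(payload):
        return ("deflated", deflated)
    # Deflating did not help (typical for very short input), so keep
    # the payload as-is and mark it accordingly.
    return ("uncompressed", payload)


if __name__ == "__main__":
    for sample in (b"abc", b"a" * 1000):
        tag, data = encode_smaller(sample)
        print(len(sample), tag, len(data))
```

Supporting the extra tag means every reader of the format has to handle
two encodings, which is the added complexity being weighed here.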

> BTW, this "binary patch" opens a different can of worms.
> 
> Currently, the diff uses a heuristic borrowed from GNU diff 
> (I did not look at the code when I did it, but it is described
> in its documentation) to decide if a file is binary (look at the
> first few bytes and find NUL).  I am sure people will want to
> have a way to say "that heuristic fails but this _is_ a binary
> file and please treat it as such".
> 
> There are two, both valid, I think, ways to do it.
> 
>  - give an option to "diff" that says "treat this path as binary
>    for this invocation of the program".
> 
>  - give an attribute to blob object that says "this blob is
>    binary and should be treated as such".
> 
> The latter is probably the right way to go in the longer term.

I'm not sure I agree.

> A blob being binary or not is a property of the content and does
> not depend on where it sits in the history, so unlike "recording
> renames as a hint in commit objects", the attribute is at the
> blob level, not at the commit nor the tree that points at the
> blob.

Well, sort of.

> But "binaryness" affects only certain operations that extract
> the data (e.g. diff and grep) and not others (e.g. fetch).
> Also, it makes sense to be able to retroactively mark a blob,
> which was not marked as such originally, as binary.  So I do
> not think it should be recorded in the object header.

Agreed.

> Which suggests that we may perhaps want to have notes that can
> be attached to existing objects to augment them without changing
> the contents of the data, and have tools notice these notes when
> they are available.  Another example is to associate correct
> MIME types to blobs so, gitweb _blob_ links can do sensible
> things to them.

I think blobs are the wrong level to attach such notes.  If you go that 
path you'll have to add one entry for every blob that the many 
revisions of the same file might produce.

Instead I think it should be attached to files.  After all being a 
binary or not is a file attribute regardless of its revision.  And 
implementation wise I'd do it as a .gitbin file listing all names of 
files that should be considered as binaries, with path globbing and all, 
just like .gitignore currently lists files that should be ignored.

And the advantage is that those .gitbin files can be distributed and 
revision-controlled just like the .gitignore files.

And in addition to those files you could have a section in the repo 
config file listing default name patterns for files that are considered 
binaries.  Or even a section, if present, that lists patterns for files 
that are _not_ binaries since that list might certainly be shorter.  
There could be a corresponding .gittext as well.

And in the absence of any of those then the default automatic heuristic 
applies.
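
A minimal sketch of that lookup order, assuming the patterns have already
been read from the proposed .gitbin/.gittext files or the config section.
The function name, the precedence of binary patterns over text patterns,
and the 8000-byte probe size are all assumptions made for illustration;
none of them come from an existing implementation.

```python
from fnmatch import fnmatchcase


def is_binary(path, data, bin_patterns=(), text_patterns=()):
    """Decide binary-ness from path patterns, falling back to a NUL check.

    bin_patterns / text_patterns stand in for the contents of the
    proposed .gitbin and .gittext files (hypothetical at this point).
    """
    # Explicit "this is binary" patterns win first (an assumption).
    if any(fnmatchcase(path, pat) for pat in bin_patterns):
        return True
    # Explicit "this is text" patterns come next.
    if any(fnmatchcase(path, pat) for pat in text_patterns):
        return False
    # Nothing matched: apply the automatic heuristic (NUL near the start).
    return b"\0" in data[:8000]


if __name__ == "__main__":
    print(is_binary("img/logo.png", b"plain bytes", bin_patterns=["*.png"]))
    print(is_binary("notes.txt", b"a\0b", text_patterns=["*.txt"]))
    print(is_binary("README", b"just text\n"))
```

Note that fnmatchcase lets "*" match "/" as well, so "*.png" also covers
files in subdirectories; real ignore-style matching may differ.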

Nicolas

^ permalink raw reply

* Re: [ANNOUNCE] Git wiki
From: Junio C Hamano @ 2006-05-05 19:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060505185445.GD27689@pasky.or.cz>

Petr Baudis <pasky@suse.cz> writes:

> I doubt this in fact happens that often (to a degree the automatic
> rename detection would catch). And if it happens, then the user has to
> tell Git - I have never heard that _this_ would be any problem in other
> version control systems.

It does not become an issue only because users accept it as a
fact of life.  When Linus was moving most of the contents in
rev-list.c to create a new revision.c, I already had some tweaks
to rev-list.c published before he sent me a patch for the code
movement, and I am sure he needed to re-roll the patch by
merging the change I did to rev-list.c back into his revision.c
file.  No SCM may handle that automatically, and no user
accustomed to existing SCM (including git) expects that to work
automatically.  But that does not necessarily mean a tool that
notices it and tells user what is going on is a bad thing.

However it is a different story to try recording "what is going on"
whether it comes from the tool's guess or directly from the user.

Having a way to affect the imprecise "guess" the tool makes when
that guesswork is needed might make sense.  If you (think you)
know arch/i386/foo.h was copied to create arch/x86-64/foo.h but
the detector does not detect it and seeing a creation patch for
arch/x86-64/foo.h frustrates you, you may want to have a way to
explicitly say "compare arch/i386/foo.h with arch/x86-64/foo.h
in that commit -- I want to examine the change needed to adjust
foo to x86-64 architecture".

But we have "git diff v2.6.14:arch/i386/foo.h v2.6.14:arch/x86-64/foo.h"
for that ;-).

> Then the automated renames detection will miss it given that the other
> accumulated differences are large enough, and the suggested workarounds
> _are_ precisely walking the commit chain.

The HEAD may _not_ have anything to do with v1.3.0 in which case
you would get nothing from walking the ancestry.

> If you use persistent file ids, you never miss it _AND_ you DO NOT WALK
> THE COMMIT CHAIN! You still just match file ids in the two trees.

It is unworkable.

Which one should inherit the persistent id of the old
rev-list.c?  New rev-list.c, or revision.c that has most of the
old contents split out?

Oh, and did you know there was a different revision.h that is
not related to the current revision.h in the history of git?
Should its persistent id have any relation with the persistent
id of the current revision.h?  When would you decide to make the
id inherited and when not to?  If I remove revision.h by mistake
in a commit and resurrect it in the next commit, should it get
the same id back?  If I forget to tell the tool that those two
"disappeared and then reappeared" are related and should get the
same persistent id when I make the resurrection commit, and keep
piling other commits on top, do I have to rewind the ancestry
chain all the way to correct the mistake?

^ permalink raw reply

* Re: [ANNOUNCE] Git wiki
From: Jakub Narebski @ 2006-05-05 19:39 UTC (permalink / raw)
  To: git
In-Reply-To: <20060505185445.GD27689@pasky.or.cz>

Petr Baudis wrote:

> Dear diary, on Fri, May 05, 2006 at 08:31:06PM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...

> I prefer making this [rename detection] data dependable to having to
> resort to guessing from a smaller amount of dependable data.
> 
>> There's another reason why encoding movement information in the commit is
>> totally broken, namely the fact that a lot of the actions DO NOT WALK THE
>> COMMIT CHAIN!
>> 
>> Try doing
>> 
>> git diff v1.3.0..
>> 
>> and think about what that actually _means_. Think about the fact that it
>> doesn't actually walk the commit chain at all: it diffs the trees between
>> v1.3.0 and the current one. What if the rename happened in a commit in
>> the middle?
> 
> Then the automated renames detection will miss it given that the other
> accumulated differences are large enough, and the suggested workarounds
> _are_ precisely walking the commit chain.
> 
> If you use persistent file ids, you never miss it _AND_ you DO NOT WALK
> THE COMMIT CHAIN! You still just match file ids in the two trees.

Let's not jump to just one of the possible solutions. Detecting and noting
renames and content moves (with user interaction) at commit time is nice...
unless one is doing something that does not allow interactivity (like
applying a patchbomb), but even then detecting and saving the info at commit
would be a good idea.

What we need is a way, given two linked revisions (with a path between
them), to easily extract information about renames (content moves). Perhaps
using an additional structure... best if we could do this without walking
the chain.
The rest is details... ;-P
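
To make the tree-to-tree case concrete, here is a rough sketch of rename
guessing that looks only at the two endpoint trees and never walks the
commits in between.  It is not git's actual rename detection: the trees
are plain dicts of path to content, the similarity measure is Python's
difflib, and the function name and the 0.5 threshold are assumptions
chosen for illustration.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair paths that vanished with paths that appeared, by similarity.

    old_tree / new_tree map path -> file content (str).  Returns a list
    of (old_path, new_path, score) for pairs scoring >= threshold.
    """
    deleted = [p for p in old_tree if p not in new_tree]
    added = [p for p in new_tree if p not in old_tree]
    renames = []
    for src in deleted:
        best, best_score = None, threshold
        for dst in added:
            score = SequenceMatcher(None, old_tree[src], new_tree[dst]).ratio()
            if score >= best_score:
                best, best_score = dst, score
        if best is not None:
            renames.append((src, best, best_score))
            added.remove(best)  # each new path is claimed at most once
    return renames


if __name__ == "__main__":
    old = {"a.c": "int main(void)\n{\n\treturn 0;\n}\n", "keep.h": "x"}
    new = {"b.c": "int main(void)\n{\n\treturn 1;\n}\n", "keep.h": "x"}
    for src, dst, score in detect_renames(old, new):
        print(src, "->", dst, round(score, 2))
```

When the content has drifted too far between the two endpoints, the score
falls below the threshold and the pair is reported as an unrelated delete
plus create, which is the "missed rename" case argued about above.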

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* Re: [PATCH] binary patch.
From: Junio C Hamano @ 2006-05-05 19:23 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605051431390.24505@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> On Fri, 5 May 2006, Junio C Hamano wrote:
>
>> The delta is going to be deflated and hopefully gets a bit
>> smaller, so if we really care that level of detail, it might be
>> worth to do (deflate_size*3/2) or something like that here, use
>> delta with or without deflate whichever is smaller, and mark the
>> uncompressed delta with a different tag ("uncompressed delta"?).
>> And for symmetry, to deal with uncompressible data, we may want
>> to have "uncompressed literal" as well.
>
> Nah...  Please just forget that.  ;-)

I was serious about the above actually.

BTW, this "binary patch" opens a different can of worms.

Currently, the diff uses a heuristic borrowed from GNU diff 
(I did not look at the code when I did it, but it is described
in its documentation) to decide if a file is binary (look at the
first few bytes and find NUL).  I am sure people will want to
have a way to say "that heuristic fails but this _is_ a binary
file and please treat it as such".
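
A minimal sketch of such a check, assuming a probe window of 8000 bytes
(the description above only says "the first few bytes", so the window
size and the function name are assumptions):

```python
def looks_binary(data: bytes, probe_size: int = 8000) -> bool:
    """Guess that data is binary if a NUL byte appears near the start.

    probe_size is an assumed window; only the first probe_size bytes
    are inspected.
    """
    return b"\0" in data[:probe_size]


if __name__ == "__main__":
    print(looks_binary(b"hello world\n"))          # plain ASCII text
    print(looks_binary(b"PK\x03\x04\x00\x00"))      # bytes containing NUL
    print(looks_binary("hi".encode("utf-16-le")))  # text, yet contains NUL
```

The last call shows one way the guess can go wrong: UTF-16 text contains
NUL bytes and is classified as binary, while a binary file with no NUL in
the probed window would be classified as text.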

There are two, both valid, I think, ways to do it.

 - give an option to "diff" that says "treat this path as binary
   for this invocation of the program".

 - give an attribute to blob object that says "this blob is
   binary and should be treated as such".

The latter is probably the right way to go in the longer term.

A blob being binary or not is a property of the content and does
not depend on where it sits in the history, so unlike "recording
renames as a hint in commit objects", the attribute is at the
blob level, not at the commit nor the tree that points at the
blob.

But "binaryness" affects only certain operations that extract
the data (e.g. diff and grep) and not others (e.g. fetch).
Also, it makes sense to be able to retroactively mark a blob,
which was not marked as such originally, as binary.  So I do
not think it should be recorded in the object header.

Which suggests that we may perhaps want to have notes that can
be attached to existing objects to augment them without changing
the contents of the data, and have tools notice these notes when
they are available.  Another example is to associate correct
MIME types with blobs so that gitweb _blob_ links can do sensible
things with them.

These external notes are purely for Porcelains (in the context
of this sentence "diff" and "grep" are Porcelain), but we would
also want a way to propagate them across repositories somehow.
In a sense, "grafts" information is similar to the external
notes in that it augments existing commit objects, but its
effect is a bit more intrusive; it affects the way the core
operates.

^ permalink raw reply
