Git development


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Bill Davidsen @ 2005-05-03 17:40 UTC
  To: Valdis.Kletnieks
  Cc: Andrea Arcangeli, Matt Mackall, Linus Torvalds, linux-kernel, git
In-Reply-To: <200505021614.j42GEufG008441@turing-police.cc.vt.edu>

Valdis.Kletnieks@vt.edu wrote:
> On Mon, 02 May 2005 11:49:32 EDT, Bill Davidsen said:
> 
>>Andrea Arcangeli wrote:
>>
>>>On Fri, Apr 29, 2005 at 01:39:59PM -0700, Matt Mackall wrote:
> 
> 
>>>-#!/usr/bin/python
>>>+#!/usr/bin/env python
>>> #
>>> # mercurial - a minimal scalable distributed SCM
>>> # v0.4b "oedipa maas"
>>
>>Could you explain why this is necessary or desirable? I looked at what 
>>env does, and I am missing the point of duplicating bash normal 
>>behaviour regarding definition of per-process environment entries.
> 
> 
> Most likely, his python lives elsewhere than /usr/bin, and the 'env' call
> results in causing a walk across $PATH to find it....

Assuming that he has env in a standard place... I hope this isn't going 
to start some rash of efforts to make packages run on non-standard 
toolchains, which add requirements for one tool to get around 
misplacement of another.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me


* Re: Careful object writing..
From: Linus Torvalds @ 2005-05-03 19:56 UTC
  To: Chris Wedgwood; +Cc: Git Mailing List
In-Reply-To: <20050503194739.GA7082@taniwha.stupidest.org>

On Tue, 3 May 2005, Chris Wedgwood wrote:
>
> On Tue, May 03, 2005 at 12:47:36PM -0700, Linus Torvalds wrote:
> 
> > Me, I refuse to slow down my habits for old filesystems. You can
> > either fsck, or use a logging filesystem.
> 
> ok, so you're saying everyone use linux ext3 or similar more or
> less...

No. I'm saying
 - you can use git-fsck-cache
 - or you can use a logging filesystem.

I happen to use both.

> how about we drop all the objects in one directory then?

I don't even have directory hashing on, and as mentioned, the logging 
filesystem is _not_ a requirement. It's just a reality for most of us.

		Linus

* Re: git and symlinks as tracked content
From: H. Peter Anvin @ 2005-05-03 19:50 UTC
  To: Linus Torvalds; +Cc: Kay Sievers, git
In-Reply-To: <Pine.LNX.4.58.0505031151240.26698@ppc970.osdl.org>

Linus Torvalds wrote:
> 
> So you have
> 
>  - directories: S_IFDIR (0040000) point to "tree" objects for contents
>  - symlinks: S_IFLNK (0120000) point to "blob" objects
>  - executables: S_IFREG | 0755 (0100755) point to "blob" objects
>  - regular files: S_IFREG | 0644 (0100644) point to "blob" objects
> 
> which seems very sane and regular. 
> 

One thing about using a hierarchy of "tree" objects... as far as I 
understand today, it's possible for "git" to represent a limited 
scattering of files underneath the root, such as keeping one's 
configuration files underneath one's home directory.  Scanning the whole 
home directory to check in (or worse, out) files would suck.

On the other hand, having a single "tree" object for a large project 
that would have to be constantly updated would suck, too.

This is certainly *not* mutually exclusive; it's mostly a matter of 
making sure that if scaffolding directory objects are necessary, that 
they can be automatically added/created, and aren't exhaustively 
searched for uncontrolled objects.

> Now, I also have a plan for device nodes, but that one is so ugly that I'm 
> a bit ashamed of it. That one does:
> 
>  - S_IFCHR/S_IFBLK (0020000 or 0060000), with the 20-byte SHA1 not being a 
>    SHA1 at all, but just the major:minor numbers in some nice binary 
>    encoding. Probably: two network byte order 32-bit values, with twelve 
>    bytes of some non-zero signature (the SHA1 of all zeroes should be 
>    avoided, so the signature really should be something else than just 
>    twelve bytes of zero).
> 

OK, that's ugly.  I'm impressed.  :)

	-hpa

* [PATCH] Terminate diff-* on non-zero exit from GIT_EXTERNAL_DIFF
From: Junio C Hamano @ 2005-05-03 19:49 UTC
  To: Linus Torvalds; +Cc: git

This patch changes the git-apply-patch-script to exit non-zero
when the patch cannot be applied.  Previously, the external diff
driver deliberately ignored the exit status of GIT_EXTERNAL_DIFF
command, which was a design mistake.  It now stops the
processing when GIT_EXTERNAL_DIFF exits non-zero, so the damages
from running git-diff-* with git-apply-patch-script between two
wrong trees can be contained.

The "diff" command line that the built-in driver builds is changed
to always exit 0 in order to match this new behaviour.  I know
Pasky does not use GIT_EXTERNAL_DIFF yet, so this change should
not break Cogito, either.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diff.c                 |   19 +++++++++------
git-apply-patch-script |   59 +++++++++++++++++++++++--------------------------
2 files changed, 39 insertions(+), 39 deletions(-)

# - HEAD: Merge with linus-mirror.
# + 2: Use GIT_EXTERNAL_DIFF exit status to terminate diff early.
--- a/diff.c
+++ b/diff.c
@@ -83,7 +83,7 @@ static void builtin_diff(const char *nam
 {
 	int i, next_at;
 	const char *diff_cmd = "diff -L'%s%s' -L'%s%s'";
-	const char *diff_arg  = "'%s' '%s'";
+	const char *diff_arg  = "'%s' '%s'||:"; /* "||:" is to return 0 */
 	const char *input_name_sq[2];
 	const char *path0[2];
 	const char *path1[2];
@@ -261,16 +261,19 @@ void run_external_diff(const char *name,
 			printf("* Unmerged path %s\n", name);
 		exit(0);
 	}
-	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
-		/* We do not check the exit status because typically
+	if (waitpid(pid, &status, 0) < 0 ||
+	    !WIFEXITED(status) || WEXITSTATUS(status)) {
+		/* Earlier we did not check the exit status because
 		 * diff exits non-zero if files are different, and
-		 * we are not interested in knowing that.  We *knew*
-		 * they are different and that's why we ran diff
-		 * in the first place!  However if it dies by a signal,
-		 * we stop processing immediately.
+		 * we are not interested in knowing that.  It was a
+		 * mistake which made it harder to quit a diff-*
+		 * session that uses the git-apply-patch-script as
+		 * the GIT_EXTERNAL_DIFF.  A custom GIT_EXTERNAL_DIFF
+		 * should also exit non-zero only when it wants to
+		 * abort the entire diff-* session.
 		 */
 		remove_tempfile();
-		die("external diff died unexpectedly.\n");
+		die("external diff died, stopping at %s.\n", name);
 	}
 	remove_tempfile();
 }
--- a/git-apply-patch-script
+++ b/git-apply-patch-script
@@ -21,38 +21,35 @@ then
 fi
 # This will say "patching ..." so we do not say anything outselves.
 
-diff -u -L "a/$name" -L "b/$name" "$tmp1" "$tmp2" | patch -p1
-test -f "$name.rej" || {
-    case "$mode1,$mode2" in
-    .,?x)
-	# newly created
-	case "$mode2" in
-	+x)
-	    echo >&2 "created $name with mode +x."
-	    chmod "$mode2" "$name"
-	    ;;
-	-x)
-	    echo >&2 "created $name."
-	    ;;
-	esac
-	git-update-cache --add -- "$name"
+diff -u -L "a/$name" -L "b/$name" "$tmp1" "$tmp2" | patch -p1 || exit
+case "$mode1,$mode2" in
+.,?x)
+    # newly created
+    case "$mode2" in
+    +x)
+	echo >&2 "created $name with mode +x."
+	chmod "$mode2" "$name"
 	;;
-    ?x,.)
-	# deleted
-	echo >&2 "deleted $name."
-	rm -f "$name"
-	git-update-cache --remove -- "$name"
+    -x)
+	echo >&2 "created $name."
 	;;
+    esac
+    git-update-cache --add -- "$name"
+    ;;
+?x,.)
+    # deleted
+    echo >&2 "deleted $name."
+    rm -f "$name"
+    git-update-cache --remove -- "$name"
+    ;;
+*)
+    # changed
+    case "$mode1,$mode2" in
+    "$mode2,$mode1") ;;
     *)
-	# changed
-	case "$mode1,$mode2" in
-	"$mode2,$mode1") ;;
-	*)
-	    echo >&2 "changing mode from $mode1 to $mode2."
-	    chmod "$mode2" "$name"
-	    ;;
-	esac
-	git-update-cache -- "$name"
+	echo >&2 "changing mode from $mode1 to $mode2."
+	chmod "$mode2" "$name"
+	;;
     esac
-}
-exit 0
+    git-update-cache -- "$name"
+esac
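
For readers following the diff.c hunk above, the new test can be tried in
isolation.  What follows is only a minimal sketch, not code from git: the
helper names child_exited_cleanly() and run_shell() are made up for
illustration, and the command is run through /bin/sh as an assumption.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring the condition in the patch: report
 * success only if the child could be waited for, exited normally
 * (was not killed by a signal), and exited with status 0.
 */
int child_exited_cleanly(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status))
		return 0;
	return 1;
}

/* Hypothetical wrapper: run `cmd` via /bin/sh and check how it ended. */
int run_shell(const char *cmd)
{
	pid_t pid;

	assert(cmd != NULL);
	pid = fork();
	if (pid < 0)
		return 0;
	if (pid == 0) {
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);	/* only reached if exec itself failed */
	}
	return child_exited_cleanly(pid);
}
```

With this sketch, run_shell("false") reports failure while
run_shell("false ||:") reports success, which is the effect the "||:"
suffix has on the built-in diff command line.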


* Re: Careful object writing..
From: Chris Wedgwood @ 2005-05-03 19:47 UTC
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505031242330.26698@ppc970.osdl.org>

On Tue, May 03, 2005 at 12:47:36PM -0700, Linus Torvalds wrote:

> Me, I refuse to slow down my habits for old filesystems. You can
> either fsck, or use a logging filesystem.

ok, so you're saying everyone use linux ext3 or similar more or
less...

how about we drop all the objects in one directory then?

* Re: Careful object writing..
From: Linus Torvalds @ 2005-05-03 19:47 UTC
  To: Chris Wedgwood; +Cc: Git Mailing List
In-Reply-To: <20050503192753.GA6435@taniwha.stupidest.org>

On Tue, 3 May 2005, Chris Wedgwood wrote:
> 
> how is this better than a single rename?  i take it there is something
> fundamental from clue.101 i slept through here?

A rename will overwrite any old object, which means that you cannot do any 
collision checks. In contrast, a "link()" will return EEXIST if somebody 
else raced with you and created a new object, and you can do collision 
checks instead of overwriting another person's object.
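
The difference can be sketched in a few lines of C.  This is an
illustration under stated assumptions, not the code that was committed:
publish_object() and its return convention are hypothetical, and the
caller is assumed to pick a temporary name on the same filesystem as the
final name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the object to `tmp_path`, make it
 * read-only, then give it its final name with link().
 * Returns 0 if we created the object, 1 if an object with that name
 * already existed (the caller can then do a collision check instead
 * of overwriting it), and -1 on any other error.
 */
int publish_object(const char *tmp_path, const char *final_path,
		   const void *buf, size_t len)
{
	const char *p = buf;
	int fd, ret = 0;

	assert(tmp_path && final_path && (buf || len == 0));

	fd = open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			ret = -1;
			break;
		}
		p += n;
		len -= (size_t)n;
	}
	if (ret == 0 && fchmod(fd, 0444) < 0)
		ret = -1;
	if (close(fd) < 0)
		ret = -1;

	/* Unlike rename(), link() refuses to replace an existing name. */
	if (ret == 0 && link(tmp_path, final_path) < 0)
		ret = (errno == EEXIST) ? 1 : -1;

	unlink(tmp_path);	/* the temporary name is never kept */
	return ret;
}
```

A rename() in place of the link()/unlink() pair would succeed silently in
the EEXIST case, which is exactly the situation where a collision check
is wanted.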

> also, if you are *really* paranoid you want to fsync *before* you do
> the link/unlink or rename --- which is what MTAs do[1]

Me, I refuse to slow down my habits for old filesystems. You can either 
fsck, or use a logging filesystem. 

I don't see anybody not using logging filesystems these days, so..

> also, shouldn't HEAD (and similar)[2] be updated with a temporary and
> a rename too?

Maybe. Much less important, though.

> > NOTE NOTE NOTE! I have _not_ updated all the helper stuff that also
> > write objects.
> 
> i thought this was all common code?  if it's not maybe now is the time
> to change that?

It is all common code, except:
 - things like "fetch from another host" will use rsync/wget/xxx to 
   actually get the files. To those programs, we're not talking about git 
   objects, we're just talking "regular files"
 - rpull.c has a special different routine to write its objects. I don't 
   use it, so..

Anyway, it should be reasonably easily fixable.

		Linus

* Re: [PATCH 0/3] cogito spec file updates
From: Chris Wright @ 2005-05-03 19:35 UTC
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050503182850.GL18917@shell0.pdx.osdl.net>

* Chris Wright (chrisw@osdl.org) wrote:
> Here's the outstanding updates for the spec file, up to 0.8-2 which is
> the latest on kernel.org.
> 
> 	http://www.kernel.org/pub/software/scm/cogito/RPMS/

What's your method for creating a release tarball?  If it were formalized
(i.e. Makefile rule), then it'd be simple to use VERSION to drive the
spec file, and it'd only need updating for real content changes (similar
to what Kay did).

thanks,
-chris

* Re: Careful object writing..
From: Chris Wedgwood @ 2005-05-03 19:27 UTC
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505031204030.26698@ppc970.osdl.org>

On Tue, May 03, 2005 at 12:15:08PM -0700, Linus Torvalds wrote:

> So now I do it "right", and create a temporary file in the "top"
> object directory, and then when it's all done, I do a "link()" to
> the final place and unlink the original.

how is this better than a single rename?  i take it there is something
fundamental from clue.101 i slept through here?

also, if you are *really* paranoid you want to fsync *before* you do
the link/unlink or rename --- which is what MTAs do[1]

however, that said it *kills* performance and if it's not critical
it's really a terrible idea

also, shouldn't HEAD (and similar)[2] be updated with a temporary and
a rename too?

> I also change the permission to 0444 before it gets its final name.

cool

> NOTE NOTE NOTE! I have _not_ updated all the helper stuff that also
> write objects.

i thought this was all common code?  if it's not maybe now is the time
to change that?

[1] yes, i know this depends on the fs used and various things and
    ext3 should be fine, blah blah blah, but not everyone uses ext3
    and quite probably not everyone will use git under Linux

[2] i didn't check the code as i'm still using BK in places

* Re: More problems...
From: Junio C Hamano @ 2005-05-03 19:18 UTC
  To: Andreas Gal
  Cc: Petr Baudis, Linus Torvalds, Anton Altaparmakov, Russell King,
	Ryan Anderson, git
In-Reply-To: <Pine.LNX.4.58.0505030757440.29716@sam.ics.uci.edu>

>>>>> "AG" == Andreas Gal <gal@uci.edu> writes:

AG> I am just soft-linking objects/ in the branched tree. I can live with 
AG> dangling objects, branching is extremly fast, and diskspace is cheap 
AG> anyway. The only downside is that it doesn't work too well with rsync as 
AG> network protocol,...

I usually do not use symlinks myself, but doesn't "rsync -L" work
for you?

* Re: 'read-tree -m head' vs 'read-tree head'
From: Junio C Hamano @ 2005-05-03 19:13 UTC
  To: Thomas Glanzmann; +Cc: GIT
In-Reply-To: <20050503124935.GT25004@cip.informatik.uni-erlangen.de>

The form "git-read-tree <tree>" does not care what the original
cache contained and builds the cache from scratch. On the other
hand, "git-read-tree -m <tree>" uses what the original cache
contained to speed things up in a later checkout-cache.  That's
the official description of the difference.

That said, I've been wondering if "git-read-tree -m <tree>"
always does the same thing (but only making the operation
afterwards faster) as "git-read-tree <tree>".  That is, if there
is a valid use case where you would want to use it without "-m"
because "-m" does something wrong.  If there is no such valid
use case probably we should always do "-m" version if we are
reading only one tree, practically deprecating "-m" flag to the
same status as "-r" flag to git-diff-cache.

However, I have not had time to think things through and have
not bugged Linus about it myself.

* Careful object writing..
From: Linus Torvalds @ 2005-05-03 19:15 UTC
  To: Git Mailing List

I just pushed out the commit that tries to finally actually write the sha1
objects the right way in a shared object directory environment.

I used to be lazy, and just do "O_CREAT | O_EXCL" on the final name, but
that obviously is not very nice when it can result in other people seeing
objects that haven't been fully finalized yet.

So now I do it "right", and create a temporary file in the "top" object
directory, and then when it's all done, I do a "link()" to the final place
and unlink the original.

I also change the permission to 0444 before it gets its final name.

Two notes:

 - because the objects all get created initially in .git/objects rather 
   than in the subdirectory they get moved to, you can't use symlinks 
   to other filesystems for the 256 object subdirectories. The object 
   directory has to be one filesystem (but it doesn't have to be the same 
   one as you actually keep your working directories on, of course)

 - The upside of this is that filesystem block allocators should do the 
   right thing. Instead of spreading the objects out (because they are in 
   different directories), they should be created together.

Anyway, somebody should double-check the thing. It _should_ now work
correctly over NFS etc too, and everything should be nice and atomic (and
with any half-way decent filesystem, it also means that even if you have a
system crash in the middle, you'll never see half-created objects).

NOTE NOTE NOTE! I have _not_ updated all the helper stuff that also write 
objects. So things like "git-http-pull" etc will still write objects 
directly into the object directory, and that can cause problems with 
shared usage. Same goes for "write_sha1_from_fd()" that rpull.c uses. I 
hope somebody will take a look at those issues..

Anyway, at least the really core operations should now really be
"thread-safe" in a shared object directory environment.

		Linus

* Re: git and symlinks as tracked content
From: Morten Welinder @ 2005-05-03 19:10 UTC
  To: Linus Torvalds; +Cc: Kay Sievers, git
In-Reply-To: <Pine.LNX.4.58.0505031151240.26698@ppc970.osdl.org>

Something in the patching food chain will also need to know how to turn
regular files into symlinks (and vice versa) in the same way we ought to
have that for directories right now.

Morten

* Re: git and symlinks as tracked content
From: Linus Torvalds @ 2005-05-03 19:02 UTC
  To: Kay Sievers; +Cc: git
In-Reply-To: <1115145234.21105.111.camel@localhost.localdomain>

On Tue, 3 May 2005, Kay Sievers wrote:
>
> Is there a sane model to make git aware of tracking symlinks in the
> repository? In the bk udev tree we've had a test sysfs-tree with a lot
> of symlinks in it.
> 
> Where can we store the link-target? In its own blob-object or directly
> in the tree-object?

I'd suggest you create a blob object with the symlink name, and then in
the tree you point to that blob, but with the S_IFLNK value in the mode
field (0120000).

So you have

 - directories: S_IFDIR (0040000) point to "tree" objects for contents
 - symlinks: S_IFLNK (0120000) point to "blob" objects
 - executables: S_IFREG | 0755 (0100755) point to "blob" objects
 - regular files: S_IFREG | 0644 (0100644) point to "blob" objects

which seems very sane and regular. 
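
As a rough illustration of that mapping (a sketch only, not git's tree
parser), the mode values listed above can be turned into an object kind
like this.  The enum and function names are hypothetical, and the
constants are defined locally so the sketch does not depend on which
S_IF* macros <sys/stat.h> happens to expose.

```c
#include <assert.h>

/* File-type bits, using the octal values quoted in the list above. */
enum {
	MODE_FMT = 0170000,	/* mask for the file-type bits */
	MODE_DIR = 0040000,
	MODE_REG = 0100000,
	MODE_LNK = 0120000
};

enum entry_kind { ENTRY_UNKNOWN, ENTRY_TREE, ENTRY_BLOB };

/* Which kind of object a tree entry with this mode points to. */
enum entry_kind entry_kind_for_mode(unsigned int mode)
{
	assert(mode <= 0177777);	/* only type and permission bits */

	switch (mode & MODE_FMT) {
	case MODE_DIR:
		return ENTRY_TREE;
	case MODE_LNK:	/* the blob holds the link target */
	case MODE_REG:
		return ENTRY_BLOB;
	default:
		return ENTRY_UNKNOWN;
	}
}

int entry_is_symlink(unsigned int mode)
{
	return (mode & MODE_FMT) == MODE_LNK;
}

int entry_is_executable(unsigned int mode)
{
	return (mode & MODE_FMT) == MODE_REG && (mode & 0100) != 0;
}
```

Symlinks and regular files take the same path here and differ only in
the type bits, which is what makes the scheme regular.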

Now, I also have a plan for device nodes, but that one is so ugly that I'm 
a bit ashamed of it. That one does:

 - S_IFCHR/S_IFBLK (0020000 or 0060000), with the 20-byte SHA1 not being a 
   SHA1 at all, but just the major:minor numbers in some nice binary 
   encoding. Probably: two network byte order 32-bit values, with twelve 
   bytes of some non-zero signature (the SHA1 of all zeroes should be 
   avoided, so the signature really should be something else than just 
   twelve bytes of zero).

That should cover most of it.
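
The device-node idea could be sketched as below.  Everything beyond "two
network byte order 32-bit values plus twelve non-zero signature bytes" is
an assumption made for illustration: the field order (major, minor,
signature), the signature string "git-dev-node", and the function names
are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEVNODE_SIG	"git-dev-node"	/* hypothetical 12-byte tag */
#define DEVNODE_SIG_LEN	12

static void put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fill the 20-byte "SHA1" slot with major:minor plus the signature. */
void devnode_encode(uint32_t dev_major, uint32_t dev_minor,
		    unsigned char out[20])
{
	assert(sizeof(DEVNODE_SIG) - 1 == DEVNODE_SIG_LEN);
	put_be32(out, dev_major);
	put_be32(out + 4, dev_minor);
	memcpy(out + 8, DEVNODE_SIG, DEVNODE_SIG_LEN);
}

/* Returns 0 and fills in the numbers, or -1 if the signature is absent. */
int devnode_decode(const unsigned char in[20],
		   uint32_t *dev_major, uint32_t *dev_minor)
{
	if (memcmp(in + 8, DEVNODE_SIG, DEVNODE_SIG_LEN) != 0)
		return -1;
	*dev_major = get_be32(in);
	*dev_minor = get_be32(in + 4);
	return 0;
}
```

Because the signature is non-zero, a slot of twenty zero bytes never
decodes as a device node in this sketch.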

> How would a exported "patch" with symlinks as content look like?

The easiest way is to make this exactly the same as the "executable bit". 
A symlink is just a normal blob, it just has a "symlink mode" instead of 
"0755" or "0644" mode.

When you think of it that way, the "patch" ends up falling out very 
naturally, I think. It would look like

	New file: filename (Mode: 0120000)
	--- /dev/null
	+++ filename
	@@ 0,0 1,1
	+symlink-value

(or something, you get the idea).

		Linus

* git and symlinks as tracked content
From: Kay Sievers @ 2005-05-03 18:33 UTC
  To: git

Is there a sane model to make git aware of tracking symlinks in the
repository? In the bk udev tree we've had a test sysfs-tree with a lot
of symlinks in it.

Where can we store the link-target? In its own blob-object or directly
in the tree-object?

How would an exported "patch" with symlinks as content look?

Thanks,
Kay

* [PATCH 2/3] cogito spec file 0.8-1
From: Chris Wright @ 2005-05-03 18:32 UTC
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050503183038.GM18917@shell0.pdx.osdl.net>

Update spec file to cogito 0.8.  Obsolete the git package, add some
more build and install prereqs, move to /usr/bin, update URLs.

Signed-off-by: Chris Wright <chrisw@osdl.org>

--- cogito/git.spec~0.7-1	2005-05-03 11:02:15.000000000 -0700
+++ cogito/git.spec	2005-05-03 11:10:32.000000000 -0700
@@ -1,16 +1,17 @@
-Name: 		git
-Version: 	0.7
+Name: 		cogito
+Version: 	0.8
 Release: 	1
 Vendor: 	Petr Baudis <pasky@ucw.cz>
 Summary:  	Git core and tools
 License: 	GPL
 Group: 		Development/Tools
-URL: 		http://pasky.or.cz/~pasky/dev/git/
-Source: 	http://pasky.or.cz/~pasky/dev/git/%{name}-pasky-%{version}.tar.bz2
-Provides: 	git = %{version}
-BuildRequires:	zlib-devel openssl-devel
+URL: 		http://kernel.org/pub/software/scm/cogito/
+Source: 	http://kernel.org/pub/software/scm/cogito/%{name}-%{version}.tar.bz2
+Provides: 	cogito = %{version}
+Obsoletes:	git
+BuildRequires:	zlib-devel, openssl-devel, curl-devel
 BuildRoot:	%{_tmppath}/%{name}-%{version}-root
-Prereq: 	sh-utils diffutils
+Prereq: 	sh-utils, diffutils, rsync, rcs, mktemp >= 1.5
 
 %description
 GIT comes in two layers. The bottom layer is merely an extremely fast
@@ -20,7 +21,7 @@ enables human beings to work with the da
 similar to other SCM tools (like CVS, BitKeeper or Monotone).
 
 %prep
-%setup -q -n %{name}-pasky-%{version}
+%setup -q -n %{name}-%{version}
 
 %build
 
@@ -28,17 +29,19 @@ make
 
 %install
 rm -rf $RPM_BUILD_ROOT
-make DESTDIR=$RPM_BUILD_ROOT prefix=/usr/local install
+make DESTDIR=$RPM_BUILD_ROOT prefix=/usr/ install
 
 %clean
 rm -rf $RPM_BUILD_ROOT
 
 %files
 %defattr(-,root,root)
-/usr/local/bin/*
-#%{_mandir}/*/*
+/usr/bin/*
 
 %changelog
+* Mon Apr 25 2005 Chris Wright <chrisw@osdl.org> 0.8-1
+- Update to cogito, rename package, move to /usr/bin, update prereqs
+
 * Mon Apr 25 2005 Chris Wright <chrisw@osdl.org> 0.7-1
 - Update to 0.7
 

* [PATCH 3/3] cogito spec file 0.8-2
From: Chris Wright @ 2005-05-03 18:33 UTC
  To: Petr Baudis; +Cc: git, terje.rosten
In-Reply-To: <20050503183211.GN18917@shell0.pdx.osdl.net>

Some small additions to the cogito spec file:

 o include some documentation
 o use %{_prefix} macro
 o drop -n from %setup macro

(chrisw: dropped spec file rename)

Signed-off-by: Terje Rosten <terje.rosten@ntnu.no>
Signed-off-by: Chris Wright <chrisw@osdl.org>

--- cogito/git.spec~0.8-1	2005-05-03 11:02:28.000000000 -0700
+++ cogito/git.spec	2005-05-03 11:10:32.000000000 -0700
@@ -1,6 +1,6 @@
 Name: 		cogito
 Version: 	0.8
-Release: 	1
+Release: 	2
 Vendor: 	Petr Baudis <pasky@ucw.cz>
 Summary:  	Git core and tools
 License: 	GPL
@@ -21,7 +21,7 @@ enables human beings to work with the da
 similar to other SCM tools (like CVS, BitKeeper or Monotone).
 
 %prep
-%setup -q -n %{name}-%{version}
+%setup -q
 
 %build
 
@@ -29,7 +29,7 @@ make
 
 %install
 rm -rf $RPM_BUILD_ROOT
-make DESTDIR=$RPM_BUILD_ROOT prefix=/usr/ install
+make DESTDIR=$RPM_BUILD_ROOT prefix=%{_prefix} install
 
 %clean
 rm -rf $RPM_BUILD_ROOT
@@ -37,8 +37,14 @@ rm -rf $RPM_BUILD_ROOT
 %files
 %defattr(-,root,root)
 /usr/bin/*
+%doc README README.reference COPYING Changelog
 
 %changelog
+* Wed Apr 27 2005 Terje Rosten <terje.rosten@ntnu.no> 0.8-2
+- Doc files
+- Use %%{_prefix} macro
+- Drop -n option to %%setup macro
+
 * Mon Apr 25 2005 Chris Wright <chrisw@osdl.org> 0.8-1
 - Update to cogito, rename package, move to /usr/bin, update prereqs
 

* [PATCH 1/3] cogito spec file 0.7-1
From: Chris Wright @ 2005-05-03 18:30 UTC
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050503182850.GL18917@shell0.pdx.osdl.net>

Update spec file to 0.7

Signed-off-by: Chris Wright <chrisw@osdl.org>

--- cogito/git.spec~0.6.3-1	2005-05-03 11:01:52.000000000 -0700
+++ cogito/git.spec	2005-05-03 11:10:32.000000000 -0700
@@ -1,5 +1,5 @@
 Name: 		git
-Version: 	0.6.3
+Version: 	0.7
 Release: 	1
 Vendor: 	Petr Baudis <pasky@ucw.cz>
 Summary:  	Git core and tools
@@ -39,5 +39,8 @@ rm -rf $RPM_BUILD_ROOT
 #%{_mandir}/*/*
 
 %changelog
+* Mon Apr 25 2005 Chris Wright <chrisw@osdl.org> 0.7-1
+- Update to 0.7
+
 * Thu Apr 21 2005 Chris Wright <chrisw@osdl.org> 0.6.3-1
 - Initial rpm build

* [PATCH 0/3] cogito spec file updates
From: Chris Wright @ 2005-05-03 18:28 UTC
  To: Petr Baudis; +Cc: git

Hi Pasky,

Here's the outstanding updates for the spec file, up to 0.8-2 which is
the latest on kernel.org.

	http://www.kernel.org/pub/software/scm/cogito/RPMS/

thanks,
-chris

* Re: RFC: adding xdelta compression to git
From: Davide Libenzi @ 2005-05-03 18:10 UTC
  To: Linus Torvalds; +Cc: C. Scott Ananian, Alon Ziv, git
In-Reply-To: <Pine.LNX.4.58.0505031031240.3594@ppc970.osdl.org>

On Tue, 3 May 2005, Linus Torvalds wrote:

> On Tue, 3 May 2005, C. Scott Ananian wrote:
> > 
> > Linus knows this.  His point is just to be sure you actually *code* that 
> > walk in fsck, and (hopefully) do so w/o complicating the fsck too much.
> 
> Indeed. It's also a performance issue.
> 
> If you do xdelta objects, and don't tell fsck about it, then fsck will 
> just check every object as a blob. Why is that bad?
> 
> Think about it: let's say that you have a series of xdelta objects, and a 
> fsck that is xdelta-unaware. It will unpack each object independently, 
> which means that it will keep on doing the same early xdelta work over and 
> over and over again. Instead of just applying them in order, and checking 
> the sha1 of the result at each point.
> 
> Now, you probably want to limit the length of the chains to some fairly 
> small number anyway, so maybe that's not a big deal. Who knows. And I'm 
> actually still so anal that I don't think I'd use this for _my_ tree, just 
> because I'm a worry-wart (and I still think disk is incredibly cheap ;)

If you use a "full tip" metadata format with reverse deltas, drop a 
"full" version from time to time along the chain, and keep a small index 
file, you get:

1) No matter how big the xdelta collection object becomes, you only 
   touch very limited regions of it (thanks to the small index file, 
   which can take less than 20+8 bytes per entry in the xdelta blob)

2) Checkout happens without any xpatching at all (since the tip is full)

3) A checkin requires only one xdelta operation (since the tip is full), 
   and zero when it is time to store a full version along the chain (I 
   usually drop one every 10-16 xdeltas, depending on the progressive 
   size of the delta operations)

4) Worst-case performance when reconstructing history is bounded by the 
   longest xdelta chain (10-16)

In some ways I tend to agree (strangely ;) with you about the 
disk-is-cheap mantra, but network bandwidth matters IMO. So, if you do 
not want (being a real worry-wart) to use xdelta on the FS trees, you can 
still have way smarter network protocols that use xdelta plus knowledge 
of the git history structure. The rsync algo uses xdelta, but the poor 
guy cannot take advantage of the history that only git knows about. 
So, if Larry and Greg share a common object A, and Larry changes A and 
makes a new git object B, rsync will transfer the whole object B, because 
it has no idea of the git structure. Git, though, has this knowledge, and 
it can say to the remote fetcher: Look, I have this new thing called B, 
which is basically your thing A plus this very small xdelta (B-A). And 
typical xdelta diffs are really small (1/7 to 1/10 of classical 
'diff -u' ones).

- Davide

* [PATCH] cogito: Updated cg-status -a
From: Matt Porter @ 2005-05-03 17:47 UTC
  To: Petr Baudis; +Cc: git

Updated patch versus latest cogito and bug fix for a thinko.
If -a is passed, the same output is generated but it also shows
all modified but uncommitted files as well.

Signed-off-by: Matt Porter <mporter@kernel.crashing.org>

--- aa6233be6d1b8bf42797c409a7c23b50593afc99/cg-status  (mode:100755 sha1:9e7f0e59284a3d15cda35bbd5579c44d8eda05d5)
+++ ee35a6204e59cf47966080be20d8248a6e4aa3c3/cg-status  (mode:100755 sha1:dc821a1255f012a612aa4d25ffc551c32b017bd9)
@@ -3,7 +3,9 @@
 # Show status of entries in your working tree.
 # Copyright (c) Petr Baudis, 2005
 #
-# Takes no arguments.
+# Takes an optional -a argument which will cause all repository status
+# to be shown, including modified but uncommitted files
+
 
 . cg-Xlib
 
@@ -20,3 +22,16 @@
 	shift
 done
 ' padding
+
+if [ "$1" = "-a" ]; then
+	{
+		git-update-cache --refresh
+	} | cut -f 1 -d ":" | xargs sh -c '
+	while [ "$1" ]; do
+		tag="M";
+		filename=${1%: *};
+		echo "$tag $filename";
+		shift
+	done
+	' padding
+fi

* Re: RFC: adding xdelta compression to git
From: Linus Torvalds @ 2005-05-03 17:35 UTC
  To: C. Scott Ananian; +Cc: Davide Libenzi, Alon Ziv, git
In-Reply-To: <Pine.LNX.4.61.0505031151380.32767@cag.csail.mit.edu>

On Tue, 3 May 2005, C. Scott Ananian wrote:
> 
> Linus knows this.  His point is just to be sure you actually *code* that 
> walk in fsck, and (hopefully) do so w/o complicating the fsck too much.

Indeed. It's also a performance issue.

If you do xdelta objects, and don't tell fsck about it, then fsck will 
just check every object as a blob. Why is that bad?

Think about it: let's say that you have a series of xdelta objects, and a 
fsck that is xdelta-unaware. It will unpack each object independently, 
which means that it will keep on doing the same early xdelta work over and 
over and over again. Instead of just applying them in order, and checking 
the sha1 of the result at each point.
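
To put rough numbers on that, here is a small sketch under a simplifying
assumption that is not stated in the mail: the chain is one full base
followed by n deltas, and checking the object at depth k on its own costs
k delta applications.  The function names are made up.

```c
#include <assert.h>

/*
 * Delta applications needed when every object in a chain of
 * `chain_len` deltas is unpacked independently: 1 + 2 + ... + n.
 */
unsigned long independent_unpack_cost(unsigned long chain_len)
{
	assert(chain_len < 65536);	/* keep n * (n + 1) from overflowing */
	return chain_len * (chain_len + 1) / 2;
}

/*
 * Delta applications needed when the chain is walked in order and
 * each intermediate result is checked as it is produced.
 */
unsigned long in_order_unpack_cost(unsigned long chain_len)
{
	return chain_len;
}
```

Under that assumption a chain of 16 deltas costs 136 applications when
each object is checked independently, against 16 when they are applied in
order, so the penalty grows quadratically with chain length.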

Now, you probably want to limit the length of the chains to some fairly 
small number anyway, so maybe that's not a big deal. Who knows. And I'm 
actually still so anal that I don't think I'd use this for _my_ tree, just 
because I'm a worry-wart (and I still think disk is incredibly cheap ;)

		Linus

* Re: questions about cg-update, cg-pull, and cg-clone.
From: Joel Becker @ 2005-05-03 17:20 UTC
  To: Zack Brown; +Cc: Petr Baudis, Git Mailing List
In-Reply-To: <20050503155915.GV4747@ca-server1.us.oracle.com>

On Tue, May 03, 2005 at 08:59:15AM -0700, Joel Becker wrote:
> 	Then you change the first file, adding a few functions.  You
> commit it, and it now has the hash 111111.  This change means the tree
> hash becomes 222222.  So, HEAD contains 222222.
> 	You then update from Petr again.  He's changed the second file.
> Its hash is no longer cccccc, it's eeeeee.  In his tree, the hash of
> the tree is 333333 (from file 1's aaaaaa and file 2's eeeeee).  But the
> hash of your tree is 444444 (from your local file 1's 111111 and file
> 2's eeeeee).  So, the hash of your tree becomes 444444.  Your HEAD
> contains 444444.
> This does _not_ match his 333333 HEAD.  You are committing the
> combination of his change and yours.  He is saying that this work, which
> may have required hand-merging or commit resolution, is "interesting"
> information.

	Actually, it is more than interesting.  The tree has gone from a
HEAD of 222222 to a HEAD of 444444.  When HEAD changes, you need a
commit to describe the path.  Otherwise, you have a breakdown in the
history.  cg-log (or any other command) would have no way to get back
from 444444 to 222222 (or Petr's 333333) without the commit object
specifying its parent(s).
	If you have made no commits on your side, then the old HEAD is
Petr's old HEAD, the new HEAD is Petr's new 333333, and he's already
created a commit object describing this.  You're just fast-forwarding.

Joel

-- 

"The nice thing about egotists is that they don't talk about other
 people."
         - Lucille S. Harper

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Rene Scharfe @ 2005-05-03 17:14 UTC
  To: Bill Davidsen
  Cc: Matt Mackall,
	Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>,
	Linus Torvalds, Ryan Anderson, Andrea Arcangeli, linux-kernel,
	git
In-Reply-To: <4277A52E.1020601@tmr.com>

Bill Davidsen schrieb:
> On the theory that my first post got lost, why use /usr/bin/env at 
> all, when bash already does that substitution? To support people who 
> use other shells?
> 
> ie.: FOO=xx perl -e '$a=$ENV{FOO}; print "$a\n"'

/usr/bin/env is used in scripts in the shebang line (the very first line
of the script, starting with "#!", which denotes the interpreter to use
for that script) to make a PATH search for the real interpreter.
Some folks keep their python (or Perl, or Bash etc.) in /usr/local/bin
or in $HOME, that's why this construct is needed at all.

Changing environment variables is not the goal; this usage
exploits only a side-effect of env.  It is portable in practice because
env is in /usr/bin on most modern systems.

So you could replace this first line of a Python script:

   #!/usr/bin/env python

with this:

   #!python

except that the latter doesn't work because you need to specify an
absolute path there. :]
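
For the curious, the PATH walk that env ends up doing can be approximated
in C.  This is a simplified sketch, not how env is actually implemented:
find_in_path() is a hypothetical name, and a real exec-style search
differs in its details (for instance in how it treats files it cannot
execute).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: look for an executable regular file called
 * `name` in each $PATH directory, in order.  On success the full path
 * is left in `out` and 0 is returned; otherwise -1 is returned and the
 * contents of `out` are unspecified.
 */
int find_in_path(const char *name, char *out, size_t outlen)
{
	const char *dir = getenv("PATH");
	struct stat st;

	assert(name && out && outlen > 0);
	if (!dir)
		return -1;

	for (;;) {
		const char *end = strchr(dir, ':');
		size_t dirlen = end ? (size_t)(end - dir) : strlen(dir);
		int n;

		if (dirlen == 0)	/* empty entry means current directory */
			n = snprintf(out, outlen, "./%s", name);
		else
			n = snprintf(out, outlen, "%.*s/%s",
				     (int)dirlen, dir, name);

		if (n > 0 && (size_t)n < outlen &&
		    stat(out, &st) == 0 && S_ISREG(st.st_mode) &&
		    access(out, X_OK) == 0)
			return 0;

		if (!end)
			return -1;
		dir = end + 1;
	}
}
```

Calling find_in_path("python", buf, sizeof(buf)) would yield whichever
python comes first in $PATH, be it in /usr/bin, /usr/local/bin or $HOME.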

Rene

* [PATCH] cogito: Add cg-undo command
From: Matt Porter @ 2005-05-03 17:06 UTC
  To: Petr Baudis; +Cc: git

Adds a cg-undo command which takes a commit ID and resets HEAD
to the parent of the commit ID...refreshing the tree. This undoes
a single commit or a series of commits.

Signed-off-by: Matt Porter <mporter@kernel.crashing.org>

--- a1aff2a6748c0c0d08058c7d74503e724abc5d03/Makefile  (mode:100644 sha1:6ae0afa0208a8f755d383281a6d049a4ef90fe63)
+++ 023d9a7929d2f933d8e008f1679f13a58f7b1229/Makefile  (mode:100644 sha1:6c282aeebe86ecee9e634481b3d51fd53a582791)
@@ -47,7 +47,7 @@
 	cg-add cg-admin-lsobj cg-cancel cg-clone cg-commit cg-diff \
 	cg-export cg-help cg-init cg-log cg-ls cg-merge cg-mkpatch \
 	cg-patch cg-pull cg-branch-add cg-branch-ls cg-rm cg-seek cg-status \
-	cg-tag cg-tag-ls cg-update cg-Xlib
+	cg-tag cg-tag-ls cg-undo cg-update cg-Xlib
 
 COMMON=	read-cache.o
 
Index: cg-help
===================================================================
--- a1aff2a6748c0c0d08058c7d74503e724abc5d03/cg-help  (mode:100755 sha1:1f5d2d79b67490d44ce0f575ff9a4b80134ea47f)
+++ 023d9a7929d2f933d8e008f1679f13a58f7b1229/cg-help  (mode:100755 sha1:c7dc8f3e03895374cd0dae544570a37a459c2466)
@@ -43,6 +43,7 @@
 	cg-status
 	cg-tag		TNAME [COMMIT_ID]
 	cg-tag-ls
+	cg-undo		[COMMIT_ID]
 	cg-update	[BNAME]
 	cg-version
 
Index: cg-undo
===================================================================
--- /dev/null  (tree:a1aff2a6748c0c0d08058c7d74503e724abc5d03)
+++ 023d9a7929d2f933d8e008f1679f13a58f7b1229/cg-undo  (mode:100755 sha1:7fd6d89158fb5aeee42aa05a93f2c81884d9bd34)
@@ -0,0 +1,20 @@
+#!/usr/bin/env bash
+#
+# Undo a commit or a series of commits
+# Copyright (C) Matt Porter, 2005
+#
+# Takes a commit ID which is the earliest commit to be
+# removed from the repository.
+
+. cg-Xlib
+
+PARENT=`git-cat-file commit $1 | grep parent | cut -f 2 -d " "`
+echo "Undo from $1 to current HEAD"
+echo "Reset HEAD to $PARENT"
+echo "$PARENT" > .git/HEAD
+git-read-tree -m "$PARENT" || {
+	echo >&2 "$PARENT: bad commit"
+	exit 1
+}
+git-checkout-cache -f -a
+git-update-cache --refresh

* Re: [PATCH] add the ability to create and retrieve delta objects
From: Chris Mason @ 2005-05-03 16:54 UTC
  To: Nicolas Pitre; +Cc: Linus Torvalds, Alon Ziv, git
In-Reply-To: <Pine.LNX.4.62.0505031104080.14033@localhost.localdomain>

On Tuesday 03 May 2005 11:04, Nicolas Pitre wrote:
> On Tue, 3 May 2005, Chris Mason wrote:

> > coffee:~/git/linus.orig # echo foo > foo
> > coffee:~/git/linus.orig # echo foo2 > foo2
> > coffee:~/git/linus.orig # ./test-delta -d foo foo2 delta1
> > coffee:~/git/linus.orig # ls -la delta1
> > -rw-r--r--  1 root root 14 2005-05-03 10:36 delta1
> > coffee:~/git/linus.orig # ./test-delta -p foo delta1 out
> > *** glibc detected *** free(): invalid next size (fast): 0x0804b008 ***
>
> OK, doh!

Thanks, this one works ;)  I'll kick off a run with this replacing zdelta, 
should be around 3 hours.  For my small tree run with 300 patches, it's faster 
than zdelta with about the same space savings.

-chris
