Git development
* [PATCH] Make cg-clone handle local directories as sources
From: Ryan Anderson @ 2005-04-29 21:59 UTC (permalink / raw)
  To: git


cg-clone is described as only being used with remote repositories, but
it has the nice feature of creating the destination directory for you.

This patch adds two features:
	1. A destination directory can (optionally) be specified.
	2. The source directory can be in the local file system.

The following, for example, now works:

	cg-clone rsync://rsync.kernel.org/pub/scm/cogito/cogito.git
	mkdir test ; cd test
	cg-clone ../cogito ../cogito2/

Index: cg-clone
===================================================================
--- c3aa1e6b53cc59d5fbe261f3f859584904ae3a63/cg-clone  (mode:100755 sha1:4ee0685c358e094c5350b3968d013105da6ddf7e)
+++ uncommitted/cg-clone  (mode:100755)
@@ -11,13 +11,22 @@
 . cg-Xlib
 
 location=$1
-[ "$location" ] || die "usage: cg-clone SOURCE_LOC"
+[ "$location" ] || die "usage: cg-clone SOURCE_LOC [DEST_LOC]"
 location=${location%/}
 
-dir=${location##*/}; dir=${dir%.git}
+if [ "$2" == "" ]; then
+	dir=${location##*/}; dir=${dir%.git}
+else
+	dir=$2
+fi
+
+pwd=$(pwd)
+relative_location=$(echo "$location" | sed -e "s#^[^/]#$pwd\/&#")
+
 [ -e "$dir" ] && die "$dir/ already exists"
 mkdir "$dir"
 cd "$dir"
 
-cg-init $location || exit $?
+echo "cg-init $relative_location"
+cg-init $relative_location || exit $?
 echo "Cloned to $dir/ (origin $location available as branch \"origin\")"
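
A note on the sed expression above: it prefixes $pwd to any location whose first character is not "/", which appears to include URLs such as the rsync:// one in the example. A minimal sketch of one way to tell the cases apart follows; the helper name absolute_location is hypothetical and not part of cg-clone.

```shell
# Hypothetical helper (not part of cg-clone): make a clone source absolute
# only when it is a relative local path.
absolute_location() {
	case "$1" in
	*://*) printf '%s\n' "$1" ;;          # URL such as rsync://... : leave as is
	/*)    printf '%s\n' "$1" ;;          # already an absolute path
	*)     printf '%s\n' "$(pwd)/$1" ;;   # relative local path: anchor at cwd
	esac
}
```

Under this sketch, "absolute_location ../cogito" prints the current directory followed by "/../cogito", while an rsync:// location is passed through unchanged.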


-- 

Ryan Anderson
  sometimes Pug Majere


* Re: More problems...
From: Anton Altaparmakov @ 2005-04-29 21:57 UTC (permalink / raw)
  To: Russell King
  Cc: Junio C Hamano, Linus Torvalds, Ryan Anderson, Petr Baudis, git
In-Reply-To: <20050429221903.F30010@flint.arm.linux.org.uk>

On Fri, 29 Apr 2005, Russell King wrote:
> On Fri, Apr 29, 2005 at 02:07:29PM -0700, Junio C Hamano wrote:
> > >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> > LT> Absolutely. I use the same "git-pull-script" between two local directories 
> > LT> on disk...
> > LT> Of course, I don't bother with the linking. But that's the trivial part.
> > 
> > Would it be useful if somebody wrote local-pull.c similar to
> > http-pull.c, which clones one local SHA_FILE_DIRECTORY to
> > another, with an option to (1) try hardlink and if it fails
> > fail; (2) try hardlink and if it fails try symlink and if it
> > fails fail; (3) try hardlink and if it fails try copy and if it
> > fails fail?
> 
> What would be nice is if it finds an existing file for the one it's
> trying to hard link, it compares the contents (maybe - is this actually
> necessary?) and if identical, it removes the original file replacing
> it with a hard link.

Unless I have completely misunderstood things, you never need to compare 
the file contents.  Just compare the file names.  If they match, i.e. the 
SHA1 is the same, the contents must match by definition.  So you only need 
a stat(), rather than read&decompress&compare.
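
The name-only check described above could be sketched as follows. This is an illustration under stated assumptions, not code from git: the function name link_object is hypothetical, both arguments are assumed to be object files with the same SHA1-derived name, and the "-ef" test (same device and inode) is assumed to be available in the shell's test builtin.

```shell
# Hypothetical sketch: share one object file between two object directories.
# $1 = source object file, $2 = destination file with the same SHA1 name.
link_object() {
	src=$1 dst=$2
	if [ ! -e "$dst" ]; then
		ln "$src" "$dst"              # not present yet: plain hard link
	elif [ "$src" -ef "$dst" ]; then
		:                             # already the same inode, nothing to do
	else
		# Same name is taken to mean same content, so nothing is read or
		# compared; the duplicate is replaced by a hard link via a rename.
		ln "$src" "$dst.tmp" && mv -f "$dst.tmp" "$dst"
	fi
}
```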

> This means that you'll always be trying to maintain the hard linked
> structure between various working trees in the background.
> 
> But maybe this should have an option to enable this behaviour.

There should definitely be an option to either enable or disable this as 
there are legitimate cases for not wanting hard links or indeed using 
file systems which do not support them.

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


* Re: The big git command renaming..
From: H. Peter Anvin @ 2005-04-29 21:58 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <20050429213540.GA1691@redhat.com>

Dave Jones wrote:
> On Fri, Apr 29, 2005 at 02:24:43PM -0700, Linus Torvalds wrote:
>  > 
>  > Ok, I hate to do this, since my fingers have already gotten used to the 
>  > old names, but we clearly can't continue to use command names like 
>  > "update-cache" or "read-tree" that are totally non-git-specific.
>  > 
>  > So I just pushed out a change that renames the commands to always have a 
>  > "git-" prefix. In addition, I renamed "show-diff" to "diff-files", which
>  > together with the prefix means that it becomes "git-diff-files" when used.
>  > 
>  > Since I end up using tab-completion for almost all my work, and since
>  > -within- the source directory there's no confusion, I didn't actually name
>  > the source files with any git- prefix. Quite the reverse: I removed the
>  > prefix from the two .c files that already had it (so git-mktag.c is now
>  > just "mktag.c"), and the general rule for building the executable from a C 
>  > file is now
>  > 
>  > 	git-%: %.c $(LIB_FILE)
>  > 		$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
>  > 
>  > 
>  > this seemed to be a nice regular interface that means that binaries get 
>  > installed with clear "git-" prefixes, but that I don't have to look at 
>  > them when I edit the sources.
> 
> Can you push out a new tarball to kernel.org too please, to kill
> some potential confusion in documentation/scripts ?

Oh yes, and can that tarball please be put in /pub/software/scm/git, and 
the associated git tree be moved to /pub/scm/git?

	-hpa


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Denys Duchier @ 2005-04-29 21:57 UTC (permalink / raw)
  To: Tom Lord; +Cc: noel, seanlkml, git
In-Reply-To: <200504292044.NAA28429@emf.net>

Tom Lord <lord@emf.net> writes:

>   > My example had Joe downloading a remote signed tree, reviewing the changes
>   > locally between his own trusted tree and the remote tree, 
>
> In the real world, that "review" step is the weak link.  When it goes
> wrong, the first step is to make sure we are reviewing a tree everyone
> involved *intended* -- and it's only with signed diffs adding up to
> that tree that we get there.

Hi Tom,

I hope I am not speaking out of turn or misinterpreting issues beyond my grasp,
but my perception of git is that when you sign a commit, you guarantee that this
is indeed the next step in the chronology of your own branch.  It's not about
diffs; it's about a singular brachial chronology - of course, additional
information may be recorded about topological antecedents, but that's not what
the signature is about.  The diff from that chronology can easily be generated
and scrutinized by anyone, and imported or not into another branch.

Cheers,

-- 
Dr. Denys Duchier - IRI & LIFL - CNRS, Lille, France
AIM: duchierdenys


* [PATCH] Makefile: The big git command renaming fallout fix.
From: Junio C Hamano @ 2005-04-29 21:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vacnh45x0.fsf@assigned-by-dhcp.cox.net>

Here is another.  This one belongs to a clean-up category.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
cd /opt/packrat/playpen/public/in-place/git/git.linus/
show-diff -p Makefile
--- k/Makefile  (mode:100644)
+++ l/Makefile  (mode:100644)
@@ -59,8 +59,6 @@ CFLAGS += '-DSHA1_HEADER=$(SHA1_HEADER)'
 $(LIB_FILE): $(LIB_OBJS)
 	$(AR) rcs $@ $(LIB_OBJS)
 
-init-db: init-db.o
-
 git-%: %.c $(LIB_FILE)
 	$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
 
@@ -104,6 +102,7 @@ read-cache.o: $(LIB_H)
 sha1_file.o: $(LIB_H)
 usage.o: $(LIB_H)
 diff.o: $(LIB_H)
+strbuf.o: $(LIB_H)
 
 clean:
 	rm -f *.o mozilla-sha1/*.o ppc/*.o $(PROG) $(LIB_FILE)




* RE: Mercurial 0.4b vs git patchbomb benchmark
From: Andrew Timberlake-Newell @ 2005-04-29 20:57 UTC (permalink / raw)
  To: 'Tom Lord'; +Cc: noel, seanlkml, git
In-Reply-To: <200504292026.NAA28131@emf.net>

>   > It looks to me like he did read carefully.
> 
>   > There were two different ideas:
>   >   TL)  Passing tree & diff and trusting diff to create tree
>   >   NM)  Passing tree and generating diff versus local tree for review
> 
> Well, I guess *you* didn't read carefully.  I also spoke about the
> value of passing around triples: ancestry, diff, and tree.  The
> question is about linking signatures to things that humans can
> reasonably *intend* and be reasonably held accountable for, hence one
> of the values of signed diffs.  (I cited other practical reasons to
> value signed diffs and use them in specific ways, too.)

I know that you mentioned other things.  That doesn't invalidate that Noel
was talking about your starting point description of how git works and
suggesting that it isn't how git actually works.  The relevance of your
other points depends upon having the base model correct.

You can argue that glass houses are inherently brittle, but why should I
care if mine is already made of bricks instead of glass?  If the model
against which you are arguing is not the model which is used by git, then
the model isn't a relevant basis for claiming problems with git.




* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Horst von Brand @ 2005-04-29 21:45 UTC (permalink / raw)
  To: Tom Lord; +Cc: seanlkml, git
In-Reply-To: <200504291928.MAA27145@emf.net>

Tom Lord <lord@emf.net> said:
> Think of it this way:
> 
>   (a) Joe, the mainline maintainer, gets a trusted message containing
>       a diff.
> 
>   (b) Joe reads the diff, it makes great sense, he wants to merge.
> 
>   (c) Joe downloads a tree.  Supposedly that tree is the result of
>       applying this diff.   The tree, not the diff, is used for
>       merging.
> 
> You can see the logical hole there... now the practical one:
> 
> 
>    (d) Joe is repeating (a..c) at an unfathomably high rate.
>        At a low rate, he could be double-checking enough that
>        the diff-vs-tree problem isn't that serious.  But
>        at the rate he operates, exploits appear all along the
>        patch-flow pipeline because so much stuff goes unchecked.
> 
>        Joe may scan the changes he's merged before committing but,
>        if his rate is high, that scan *must*, out of biological and
>        physical necessity, be shallow.   Exploits can occur on the
>        submitter machine, in the communication channel, and on Joe's 
>        machine.   Social exploits can occur because of the separation
>        between a submitter saying "this is what I'm doing" vs. the reality
>        of what the submitter is doing.

Now pray tell how Joe signing one, two, three, or none of the things he is
juggling makes any difference here.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: More problems...
From: Daniel Barkalow @ 2005-04-29 21:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Ryan Anderson, Petr Baudis, Russell King, git
In-Reply-To: <7vhdhp47hq.fsf@assigned-by-dhcp.cox.net>

On Fri, 29 Apr 2005, Junio C Hamano wrote:

> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> LT> Absolutely. I use the same "git-pull-script" between two local directories 
> LT> on disk...
> LT> Of course, I don't bother with the linking. But that's the trivial part.
> 
> Would it be useful if somebody wrote local-pull.c similar to
> http-pull.c, which clones one local SHA_FILE_DIRECTORY to
> another, with an option to (1) try hardlink and if it fails
> fail; (2) try hardlink and if it fails try symlink and if it
> fails fail; (3) try hardlink and if it fails try copy and if it
> fails fail?

If someone does this, they should make a pull.c out of http-pull and
rpull; the logic for determining what you need to copy, given what you
have and what the user wants to have, should be shared.

(Note that some usage patterns only require the latest commit, or at least
can deal with fetching other stuff only when needed.)

	-Daniel
*This .sig left intentionally blank*



* [PATCH] The big git command renaming fallout fix.
From: Junio C Hamano @ 2005-04-29 21:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504291416190.18901@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Ok, I hate to do this, ...

Well, it was time.  This fixes the git-export which calls diff-tree.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
cd /opt/packrat/playpen/public/in-place/git/git.linus/
show-diff -p export.c
--- k/export.c  (mode:100644)
+++ l/export.c  (mode:100644)
@@ -18,7 +18,7 @@ void show_commit(struct commit *commit)
 		char *against = sha1_to_hex(commit->parents->item->object.sha1);
 		printf("\n\n======== diff against %s ========\n", against);
 		fflush(NULL);
-		sprintf(cmdline, "diff-tree -p %s %s", against, hex);
+		sprintf(cmdline, "git-diff-tree -p %s %s", against, hex);
 		system(cmdline);
 	}
 	printf("======== end ========\n\n");






* Re: [PATCH] jit-trackdown
From: Daniel Barkalow @ 2005-04-29 21:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Greaves, GIT Mailing Lists
In-Reply-To: <7voebx4dyd.fsf@assigned-by-dhcp.cox.net>

On Fri, 29 Apr 2005, Junio C Hamano wrote:

> Have toilet side gitters reached a consensus (or semi-consensus)
> on how things under .git/ should be organized?  Is there a
> summary somewhere, something along the following lines?

I've made a proposal like the following:

.git/
  objects/    (traditional)
  refs/       Directories of hex SHA1 + newline files
    heads/    Commits which are heads of various sorts
    tags/     Tags, by the tag name (or some local renaming of it)
  info/       Other shared information
    remotes
  ...         Everything else isn't shared
  HEAD        Symlink to refs/heads/<something>

The plumbing doesn't care what you name heads or tags, but expects things
in heads to be commit objects and things in tags to be tag objects (which
can tag whatever).

AFAICT, there is general consensus that this is how things should be, but
I haven't convinced Linus that the plumbing should know about anything
other than objects/.
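
As an illustration only, the proposed layout could be created by hand roughly like this. The function name make_layout and the head name "master" are assumptions made for the example, and info/remotes is left out because its format is not described here.

```shell
# Hypothetical sketch of the proposed .git/ layout.
# $1 = working tree root, $2 = hex SHA1 of the commit the head points at.
make_layout() {
	root=$1 sha1=$2
	mkdir -p "$root/.git/objects" "$root/.git/refs/heads" \
		"$root/.git/refs/tags" "$root/.git/info"
	# A ref is a file holding a hex SHA1 followed by a newline.
	printf '%s\n' "$sha1" >"$root/.git/refs/heads/master"
	# HEAD is a symlink into refs/heads/ (the target is relative to .git/).
	ln -s refs/heads/master "$root/.git/HEAD"
}
```

With such a layout, reading .git/HEAD follows the symlink and yields the SHA1 stored for the current head.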

	-Daniel
*This .sig left intentionally blank*



* Re: The big git command renaming..
From: Dave Jones @ 2005-04-29 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504291416190.18901@ppc970.osdl.org>

On Fri, Apr 29, 2005 at 02:24:43PM -0700, Linus Torvalds wrote:
 > 
 > Ok, I hate to do this, since my fingers have already gotten used to the 
 > old names, but we clearly can't continue to use command names like 
 > "update-cache" or "read-tree" that are totally non-git-specific.
 > 
 > So I just pushed out a change that renames the commands to always have a 
 > "git-" prefix. In addition, I renamed "show-diff" to "diff-files", which
 > together with the prefix means that it becomes "git-diff-files" when used.
 > 
 > Since I end up using tab-completion for almost all my work, and since
 > -within- the source directory there's no confusion, I didn't actually name
 > the source files with any git- prefix. Quite the reverse: I removed the
 > prefix from the two .c files that already had it (so git-mktag.c is now
 > just "mktag.c"), and the general rule for building the executable from a C 
 > file is now
 > 
 > 	git-%: %.c $(LIB_FILE)
 > 		$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)
 > 
 > 
 > this seemed to be a nice regular interface that means that binaries get 
 > installed with clear "git-" prefixes, but that I don't have to look at 
 > them when I edit the sources.

Can you push out a new tarball to kernel.org too please, to kill
some potential confusion in documentation/scripts ?

		Dave




* Re: More problems...
From: Russell King @ 2005-04-29 21:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Ryan Anderson, Petr Baudis, git
In-Reply-To: <7vhdhp47hq.fsf@assigned-by-dhcp.cox.net>

On Fri, Apr 29, 2005 at 02:07:29PM -0700, Junio C Hamano wrote:
> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> LT> Absolutely. I use the same "git-pull-script" between two local directories 
> LT> on disk...
> LT> Of course, I don't bother with the linking. But that's the trivial part.
> 
> Would it be useful if somebody wrote local-pull.c similar to
> http-pull.c, which clones one local SHA_FILE_DIRECTORY to
> another, with an option to (1) try hardlink and if it fails
> fail; (2) try hardlink and if it fails try symlink and if it
> fails fail; (3) try hardlink and if it fails try copy and if it
> fails fail?

What would be nice is if it finds an existing file for the one it's
trying to hard link, it compares the contents (maybe - is this actually
necessary?) and if identical, it removes the original file replacing
it with a hard link.

This means that you'll always be trying to maintain the hard linked
structure between various working trees in the background.

But maybe this should have an option to enable this behaviour.

-- 
Russell King



* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Matt Mackall @ 2005-04-29 21:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sean, linux-kernel, git
In-Reply-To: <Pine.LNX.4.58.0504291338540.18901@ppc970.osdl.org>

On Fri, Apr 29, 2005 at 01:49:18PM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 29 Apr 2005, Matt Mackall wrote:
> > 
> > The changeset log (and everything else) has an external index.
> 
> I don't actually know exactly how the BK changeset file works, but your 
> explanation really sounds _very_ much like it.

I've never used BK, but I got the impression that it was all SCCS
under the covers, which means adding stuff and reconstructing random
versions is expensive (just as it is in CVS). The split between index
and data in Mercurial is intended to address that.
 
> I didn't want to do anything that even smelled of BK. Of course, part of
> my reason for that is that I didn't feel comfortable with a delta model at
> all (I wouldn't know where to start, and I hate how they always end up
> having different rules for "delta"ble and "non-delta"ble objects).

There aren't really any such rules here. While the index contains a
full DAG, the deltas are done opportunistically on a linearized
(topologically sorted) version of it. We try to make a delta against
the previous tip (regardless of whether or not it's the parent), and
if that is a win, we store it.

> So it sounds like it could work fine, but it in fact sounds so much like 
> the ChangeSet file that I'd personally not have done it that way. 

Well I originally set out to do it differently, but I decided my
current approach was the fastest route to something that actually
worked.

-- 
Mathematics is the supreme nostalgia of our time.


* The big git command renaming..
From: Linus Torvalds @ 2005-04-29 21:24 UTC (permalink / raw)
  To: Git Mailing List


Ok, I hate to do this, since my fingers have already gotten used to the 
old names, but we clearly can't continue to use command names like 
"update-cache" or "read-tree" that are totally non-git-specific.

So I just pushed out a change that renames the commands to always have a
"git-" prefix. In addition, I renamed "show-diff" to "diff-files", which
together with the prefix means that it becomes "git-diff-files" when used.

Since I end up using tab-completion for almost all my work, and since
-within- the source directory there's no confusion, I didn't actually name
the source files with any git- prefix. Quite the reverse: I removed the
prefix from the two .c files that already had it (so git-mktag.c is now
just "mktag.c"), and the general rule for building the executable from a C 
file is now

	git-%: %.c $(LIB_FILE)
		$(CC) $(CFLAGS) -o $@ $(filter %.c,$^) $(LIBS)


this seemed to be a nice regular interface that means that binaries get 
installed with clear "git-" prefixes, but that I don't have to look at 
them when I edit the sources.

Sorry to everybody else whose fingers have already learnt the old names. 
The good news is that if you use cogito, you won't care.

		Linus


* Re: git network protocol
From: Daniel Barkalow @ 2005-04-29 21:15 UTC (permalink / raw)
  To: David Lang; +Cc: git
In-Reply-To: <Pine.LNX.4.62.0504291333550.7439@qynat.qvtvafvgr.pbz>

On Fri, 29 Apr 2005, David Lang wrote:

> would it make sense for the network git protocol to be something along the 
> lines of
> 
> client contacts server and sends
> the tag you want to sync with (defaults to head)
> the local index file

Actually, you really want to have a bidirectional interaction, where the
client first fetches the info to determine where to start, and then goes
through the reachable space, asking for anything it doesn't already have.

(In the long run, we want to keep track of some things we already have all
of, or know we're missing, etc., so the receiver side doesn't have to
look over its whole tree.)
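
The walk over the reachable space can be sketched with a toy model. This is not git's object format or either of the existing pull programs: here each object is assumed to be a file named by its id whose whitespace-separated contents list the ids it refers to, and "asking the server" is modelled by a plain cp from a source directory.

```shell
# Toy model (not git's object format): fetch everything reachable from an id.
# $1 = source ("server") directory, $2 = destination directory, $3 = object id.
pull() {
	src=$1 dst=$2 id=$3
	[ -e "$dst/$id" ] && return 0          # already have it: ask for nothing
	cp "$src/$id" "$dst/$id" || return 1   # stand-in for requesting the object
	# The reference list is expanded before recursing, so reusing the
	# global variable names in nested calls is harmless here.
	for ref in $(cat "$dst/$id"); do
		pull "$src" "$dst" "$ref" || return 1
	done
}
```

The early return assumes that an object already present also has everything it references, which is a simplification of the bookkeeping discussed above.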

git already includes two versions of this protocol; the first runs against
a static HTTP server, and the second uses ssh to get a socket. At some
point, I'm going to enable these programs to read and write
.git/refs/?/? to figure out what they're supposed to get.

	-Daniel
*This .sig left intentionally blank*



* Re: More problems...
From: Junio C Hamano @ 2005-04-29 21:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ryan Anderson, Petr Baudis, Russell King, git
In-Reply-To: <Pine.LNX.4.58.0504291311320.18901@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Absolutely. I use the same "git-pull-script" between two local directories 
LT> on disk...
LT> Of course, I don't bother with the linking. But that's the trivial part.

Would it be useful if somebody wrote local-pull.c similar to
http-pull.c, which clones one local SHA_FILE_DIRECTORY to
another, with an option to (1) try hardlink and if it fails
fail; (2) try hardlink and if it fails try symlink and if it
fails fail; (3) try hardlink and if it fails try copy and if it
fails fail?

Then from a source repository that contains good stuff plus
throwaway experimental commits you can prepare pruned for-public
tree.  Of course you can do it today by copying and then running
git-prune in the destination, though.
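
The three fallback modes listed above might look like this for a single object file. The function name copy_object and the mode names link, symlink and copy are made up for this sketch; no local-pull program with these options is implied to exist.

```shell
# Hypothetical sketch of the three fallback modes for one object file.
# $1 = mode (link | symlink | copy), $2 = source file, $3 = destination file.
# For the symlink mode, $2 should be an absolute path so the link resolves.
copy_object() {
	mode=$1 src=$2 dst=$3
	ln "$src" "$dst" 2>/dev/null && return 0   # every mode tries a hard link first
	case "$mode" in
	link)    return 1 ;;                        # (1) hard link or fail
	symlink) ln -s "$src" "$dst" ;;             # (2) fall back to a symlink
	copy)    cp "$src" "$dst" ;;                # (3) fall back to a copy
	*)       return 2 ;;                        # unknown mode
	esac
}
```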




* Next problem: cg-commit
From: Russell King @ 2005-04-29 20:51 UTC (permalink / raw)
  To: git

Unfortunately, cg-commit seems to return the wrong exit status, returning
1 on success.  E.g.:

$ cg-commit
arch/arm/mach-ixp2000/pci.c
include/asm-arm/arch-ixp2000/platform.h
Enter commit message, terminated by ctrl-D on a separate line:
blah blah blah
Committed as fafb525292acc9c0818b91b1d8e58cf770616542.
$ echo $?
1

It appears that [ "$merging" ] towards the end of cg-commit is the
cause of this odd behaviour.  Force zero exit status, since we
successfully completed.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>

--- cg-commit.old	2005-04-26 04:02:01.000000000 +0100
+++ cg-commit	2005-04-29 21:47:57.161333483 +0100
@@ -114,6 +114,7 @@
 	echo "Committed as $newhead."
 	echo $newhead >.git/HEAD
 	[ "$merging" ] && rm .git/merging .git/merging-sym .git/merge-base
+	exit 0
 else
 	die "error during commit (oldhead $oldhead, treeid $treeid)"
 fi
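
The effect can be reproduced in isolation. In the sketch below the function names are hypothetical stand-ins for the tail of cg-commit: when $merging is empty, the [ ... ] test fails, the && list ends with status 1, and that becomes the status of the whole function or script unless something follows it.

```shell
# Stand-in for the end of the commit path: the last command decides the status.
finish_commit() {
	merging=$1
	[ "$merging" ] && echo "cleaning up merge state"
}

# Same, with an explicit success status as in the patch above.
finish_commit_fixed() {
	merging=$1
	[ "$merging" ] && echo "cleaning up merge state"
	return 0
}
```

Calling finish_commit with an empty argument returns 1 even though nothing went wrong, while finish_commit_fixed returns 0.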


-- 
Russell King



* Re: Odd decision of git-pasky-0.7 to do a merge
From: Russell King @ 2005-04-29 20:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0504291043060.18901@ppc970.osdl.org>

On Fri, Apr 29, 2005 at 10:44:29AM -0700, Linus Torvalds wrote:
> On Fri, 29 Apr 2005, Russell King wrote:
> > Why it decided that a merge was necessary is beyond me.  Any ideas?
> > Did Linus forget to merge his tree properly?
> 
> It looks like it was unable to find the right common ancestor.
> 
> If you only had my stuff in it, the common ancestor _should_ have been the 
> parent (c60c390620e0abb60d4ae8c43583714bda27763f), which _should_ have 
> been your old top.
> 
> But maybe merge-base didn't work right?

Yup - pasky-0.7 came out with some weird commit-id, but cogito-0.8
got it right.  Now using cogito-0.8 here, so I'm no longer concerned
about this particular problem.

-- 
Russell King



* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Linus Torvalds @ 2005-04-29 20:49 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Sean, linux-kernel, git
In-Reply-To: <20050429202341.GB21897@waste.org>



On Fri, 29 Apr 2005, Matt Mackall wrote:
> 
> The changeset log (and everything else) has an external index.

I don't actually know exactly how the BK changeset file works, but your 
explanation really sounds _very_ much like it.

I didn't want to do anything that even smelled of BK. Of course, part of
my reason for that is that I didn't feel comfortable with a delta model at
all (I wouldn't know where to start, and I hate how they always end up
having different rules for "delta"ble and "non-delta"ble objects).

But another was that exactly since I've been using BK for so long, I
wanted to make sure that my model just emulated the way I've been _using_
BK, rather than any BK technical details.

So it sounds like it could work fine, but it in fact sounds so much like 
the ChangeSet file that I'd personally not have done it that way. 

			Linus


* Re: Val Henson's critique of hash-based content storage systems
From: Morten Welinder @ 2005-04-29 20:47 UTC (permalink / raw)
  To: Rob Jellinghaus; +Cc: git
In-Reply-To: <loom.20050429T015434-928@post.gmane.org>

On 4/28/05, Rob Jellinghaus <robj@unrealities.com> wrote:
> I assume most people here have read this, but just in case:
> 
> http://www.usenix.org/events/hotos03/tech/full_papers/henson/henson.pdf

The math in section 3 is bogus.  1-(1-2^-b)^n  isn't hard to compute and
even if it was, it is the wrong formula.  (Set n==2^b; you obviously should
get probability 1 for collision.)

The right formula is 1-B!/B^n/(B-n)! where B=2^b.  For n=2^80 and b=160
you get about 39%.
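
For reference, the 39% figure can be checked with the usual birthday approximation, which assumes n is much smaller than B (here B is the number of possible hash values and n the number of objects):

```latex
P(n, B) = 1 - \frac{B!}{B^{n}\,(B-n)!}
        = 1 - \prod_{k=0}^{n-1}\left(1 - \frac{k}{B}\right)
        \approx 1 - \exp\!\left(-\frac{n(n-1)}{2B}\right),
\qquad B = 2^{b}.

% With b = 160 and n = 2^{80}:
\frac{n(n-1)}{2B} \approx \frac{2^{160}}{2^{161}} = \frac{1}{2},
\qquad
P \approx 1 - e^{-1/2} \approx 0.393 .
```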

Morten


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Tom Lord @ 2005-04-29 20:44 UTC (permalink / raw)
  To: noel; +Cc: noel, seanlkml, git
In-Reply-To: <20050429202117.GA15417@uglybox.localnet>



  > Your example had Joe reviewing a signed diff, and then applying changes
  > from a tree that "supposedly" had the diff applied correctly, but may
  > have been corrupted. If the tree was not an accurate representation of
  > applying the diff, then the changes Joe applied to his tree will be
  > different than those that he reviewed.

That's right.   I'm saying that Joe needn't rely on the tree at all since
he should be having his tools verify its contents anyway.  Given that, 
he may as well have his tools *generate* the tree.  Having generated the tree,
it's gravy to then verify that it matches the tree the submitter thought he
was sending -- that's a *secondary* checksum where `git' currently uses
it as primary.


  > My example had Joe downloading a remote signed tree, reviewing the changes
  > locally between his own trusted tree and the remote tree, 

In the real world, that "review" step is the weak link.  When it goes
wrong, the first step is to make sure we are reviewing a tree everyone
involved *intended* -- and it's only with signed diffs adding up to
that tree that we get there.

-t


* git network protocol
From: David Lang @ 2005-04-29 20:42 UTC (permalink / raw)
  To: git
In-Reply-To: <20050429202117.GA15417@uglybox.localnet>

would it make sense for the network git protocol to be something along the 
lines of

client contacts server and sends
the tag you want to sync with (defaults to head)
the local index file

then the server can use the git tools locally to figure out what objects 
need to be sent to do the merge and only send those objects.

no, this isn't as efficient as only sending diffs, but it avoids sending
any objects that aren't needed (which would be sent if you just did a 
straight rsync)

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Matt Mackall @ 2005-04-29 20:39 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel, git
In-Reply-To: <20050429203027.GK17379@opteron.random>

On Fri, Apr 29, 2005 at 10:30:27PM +0200, Andrea Arcangeli wrote:
> On Thu, Apr 28, 2005 at 11:01:57PM -0700, Matt Mackall wrote:
> > change nodes so you've got to potentially traverse all the commits to
> > reconstruct a file's history. That's gonna be O(top-level changes)
> > seeks. This introduces a number of problems:
> > 
> > - no way to easily find previous revisions of a file
> >   (being able to see when a particular change was introduced is a
> >   pretty critical feature)
> > - no way to do bandwidth-efficient delta transfer
> > - no way to do efficient delta storage
> > - no way to do merges based on the file's history[1]
> 
> And IMHO also no-way to implement a git-on-the-fly efficient network
> protocol if tons of clients connect at the same time, it would be
> DoSable etc... At the very least such a system would require a huge
> amount of ram. So I see the only efficient way to design a network
> protocol for git not to use git, but to import the data into mercurial
> and to implement the network protocol on top of mercurial.
> 
> The one downside is that git is sort of rock solid in the way it stores
> data on disk, it makes rsync usage trivial too, the git fsck is reliable
> and you can just sign the hash of the root of the tree and you sign
> everything including file contents. And of course the checkin is
> absolutely trivial and fast too.

Mercurial is amenable to rsync provided you devote a read-only
repository to it on the client side. In other words, you rsync from
kernel.org/mercurial/linus to local/linus and then you merge from
local/linus to your own branch. Mercurial's hashing hierarchy is
similar to git's (and Monotone's), so you can sign a single hash of
the tree as well.

> With a more efficient diff-based storage like mercurial we'd be losing
> those fsck properties etc.. but those reliability properties aren't worth
> the network and disk space they take IMHO, and the checkin time
> shouldn't be substantially different (still running in O(1) when
> appending at the head). And we could always store the hash of the
> changeset, to give it some basic self-checking.

I think I can implement a decent repository check similar to git's; it
has just not been a priority.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: Val Henson's critique of hash-based content storage systems
From: C. Scott Ananian @ 2005-04-29 20:41 UTC (permalink / raw)
  To: Tom Lord; +Cc: git, robj
In-Reply-To: <200504292037.NAA28344@emf.net>

On Fri, 29 Apr 2005, Tom Lord wrote:

> My point is simply that blob-db implementations should assume that the
> mathematicians will succeed and take the small steps necessary to make
> sure that those bitstrings can't be used to crash a distributed
> blob-db infrastructure.

And my point is that you haven't *begun* to describe how one might use an 
arbitrary hash collision to "crash a distributed blob-db infrastructure".

Remember, first you've got to get some reference to your collision into 
the db...  (and if you can do that, why are you mucking around with hash 
collisions?)
   --scott

Philadelphia PBPRIME STANDEL for Dummies milita Richard Tomlinson 
ESSENCE SUMAC Nader KUCLUB WSHOOFS QKENCHANT AK-47 AMQUACK supercomputer
                          ( http://cscott.net/ )


* Re: Val Henson's critique of hash-based content storage systems
From: Tom Lord @ 2005-04-29 20:37 UTC (permalink / raw)
  To: cscott; +Cc: git, robj
In-Reply-To: <Pine.LNX.4.61.0504291608410.32145@cag.csail.mit.edu>


  lord:

  > I would expect someone to have on hand a small number of blobs that are
  > different but have the same hash and, eventually, to drop said files
  > into a blob-based infrastructure to wreak havoc.

  cscott:
  
  This is just ridiculous.  The number of known collisions in SHA1 is 
  *exactly zero* at this point in time --- not guaranteed to stay that way, 
  of course, but generating collisions is likely to remain relatively 
  expensive for some time.

Blob-dbs and the low-level object system (trees, file-contents, and
changesets) are pretty fundamental things.  It is likely (and
desirable) -- not guaranteed but likely (and desirable) -- that people
will invest heavily in building infrastructure that operates solely at
that level of abstraction.  Arguably, that is already happening.

Simultaneously, it is very desirable that some mathematician somewhere
will discover two bitstrings which are different but have the same SHA1
checksum, and then tell everyone in the world about their discovery.

My point is simply that blob-db implementations should assume that the
mathematicians will succeed and take the small steps necessary to make
sure that those bitstrings can't be used to crash a distributed
blob-db infrastructure.

-t




