Git development

* Re: VCS comparison table
From: Jeff King @ 2006-10-23 20:06 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: James Henstridge, Jakub Narebski, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Andreas Ericsson, Carl Worth, git
In-Reply-To: <453CF966.7000308@utoronto.ca>

On Mon, Oct 23, 2006 at 01:18:30PM -0400, Aaron Bentley wrote:

> And, unlike git, Bazaar branches are all independent entities[1], and
[...]
> [1] The fact that they may share storage is not important to the model.

Sorry, I don't understand this statement. How are git branches not
independent? Sure, they tend to exist in repositories with other
branches, but there's no need to (it simply allows the sharing of object
storage). There's no reason I can't move any branch from any repo into
its own repo, or vice versa move any unrelated branch into a repo with
other branches.

It all Just Works because there _isn't_ any branch information. It's
simply a pointer into the DAG, so if I have the right parts of the DAG
(which git is careful to make sure of), I can just make a pointer, and I
have absolutely zero connection to wherever the DAG came from.
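
A minimal sketch of the "branch is just a pointer into the DAG" idea, assuming a toy repository modelled as two dictionaries (commit id to parent list, and branch name to commit id). The commit ids, the dictionary layout, and the helper names `reachable` and `copy_branch` are made up for illustration and are not git's data structures or API.

```python
# Toy model: a repository is a commit DAG plus a table of named pointers.

def reachable(commits, tip):
    """Return the set of commit ids reachable from tip by following parents."""
    seen = set()
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(commits[cid])
    return seen


def copy_branch(src, dst, name):
    """Copy branch `name` from repo `src` into repo `dst`.

    Only the part of the DAG the branch needs is copied; the branch itself
    is nothing more than one entry in the refs table.
    """
    tip = src["refs"][name]
    for cid in reachable(src["commits"], tip):
        dst["commits"][cid] = src["commits"][cid]
    dst["refs"][name] = tip


# Hypothetical commit ids; each maps to its list of parents.
origin = {
    "commits": {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]},
    "refs": {"master": "c", "topic": "x"},
}
standalone = {"commits": {}, "refs": {}}
copy_branch(origin, standalone, "topic")
print(sorted(standalone["commits"]))  # ['a', 'x']
print(standalone["refs"])             # {'topic': 'x'}
```

Nothing in `standalone` records where the commits came from, which is the property described above: once the needed part of the DAG is present, the pointer is all there is to the branch.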

> they each have a URL.

In cogito, branches can each have a URL, but git-clone doesn't have a
way (that I know of) to clone only a subset of branches. It would be
fairly trivial to implement, I think.

> So:
> 
> http://code.aaronbentley.com/bzrrepo/bzr.ab 1695
> 
> is a name for
> 
> abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

The git analog is of course:

http://kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git v2.6.18

as a name for

e478bec0ba0a83a48a0f6982934b6de079e7e6b3

The difference being that Linus assigned the "local" name of v2.6.18
rather than having git auto-assign it.

> And it does not depend on any other branch, especially not bzr.dev

Of course. For me, the above commit is actually

  ssh://peff.net/home/peff/git/linux-2.6 v2.6.18

but once it is in my local repository, it's indistinguishable from one I
pulled directly from kernel.org.

And I wonder if THAT is at the root of this discussion. bzr isn't
"centralized" in the sense that you have to talk to a central server, or
rely on it for doing any operations.  But you actually CARE about where
your commits come from, and git fundamentally doesn't.

-Peff

^ permalink raw reply

* Re: [RFC] git-split: Split the history of a git repository by subdirectories and ranges
From: Jakub Narebski @ 2006-10-23 20:07 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0610231237080.3962@g5.osdl.org>

There is also the not-so-obvious result that

  git rev-list --parents --full-history <head> -- <pathspec>

generates a different result than it does when either --parents or
--full-history is absent.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* [PATCH] Checking for "diff.color." should come before "diff.color"
From: Andy Parkins @ 2006-10-23 19:51 UTC (permalink / raw)
  To: git

In git_diff_ui_config() the strncmp() for "diff.color" would have matched for
"diff.color.", so "diff.color." configs would never be processed.

Fix is to move "diff.color." check before "diff.color"
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
 diff.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index 3315378..d795be4 100644
--- a/diff.c
+++ b/diff.c
@@ -60,10 +60,6 @@ int git_diff_ui_config(const char *var, 
 		diff_rename_limit_default = git_config_int(var, value);
 		return 0;
 	}
-	if (!strcmp(var, "diff.color")) {
-		diff_use_color_default = git_config_colorbool(var, value);
-		return 0;
-	}
 	if (!strcmp(var, "diff.renames")) {
 		if (!value)
 			diff_detect_rename_default = DIFF_DETECT_RENAME;
@@ -79,6 +75,10 @@ int git_diff_ui_config(const char *var, 
 		color_parse(value, var, diff_colors[slot]);
 		return 0;
 	}
+	if (!strcmp(var, "diff.color")) {
+		diff_use_color_default = git_config_colorbool(var, value);
+		return 0;
+	}
 	return git_default_config(var, value);
 }
 
-- 
1.4.2.3

^ permalink raw reply related
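
The ordering concern in the patch description can be illustrated with a small sketch. It assumes a first-match dispatcher, uses `startswith()` as a stand-in for a `strncmp()`-style prefix test and `==` as a stand-in for a `strcmp()`-style exact test, and picks `diff.color.meta` as an example key; the `dispatch` helper and the labels are hypothetical. Note that the hunk above shows the `diff.color` test written with `strcmp()`, an exact comparison, and the third call shows that an exact test does not shadow the longer key.

```python
def dispatch(var, checks):
    """Return the label of the first check that accepts var, else 'default'."""
    for label, accepts in checks:
        if accepts(var):
            return label
    return "default"


# startswith() stands in for a strncmp()-style prefix test,
# == stands in for a strcmp()-style exact test.
switch_prefix = ("switch", lambda v: v.startswith("diff.color"))
switch_exact = ("switch", lambda v: v == "diff.color")
slot_prefix = ("slot", lambda v: v.startswith("diff.color."))

key = "diff.color.meta"
print(dispatch(key, [switch_prefix, slot_prefix]))  # switch (slot check is shadowed)
print(dispatch(key, [slot_prefix, switch_prefix]))  # slot
print(dispatch(key, [switch_exact, slot_prefix]))   # slot (exact test never shadows)
```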

* Re: [RFC] git-split: Split the history of a git repository by subdirectories and ranges
From: Linus Torvalds @ 2006-10-23 19:50 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Junio C Hamano, git
In-Reply-To: <453D17B5.6070203@freedesktop.org>

On Mon, 23 Oct 2006, Josh Triplett wrote:
>
> > Without the "--full-history", you get a simplified history, but it's 
> > likely to be _too_ simplified for your use, since it will not only 
> > collapse multiple identical parents, it will also totally _remove_ parents 
> > that don't introduce any new content.
> 
> Considering that git-split does exactly that (remove parents that don't
> introduce new content, assuming they changed things outside the
> subtree), that might actually work for us.  I just checked, and the
> output of "git log --parents -- $project" on one of my repositories
> seems to show the same sequence of commits as git log --parents on the
> head commit printed by git-split $project (apart from the rewritten
> sha1s), including elimination of irrelevant merges.

Ok. In that case, you're good to go, and just use the current 
simplification entirely.

Although I think that somebody (Dscho?) also had a patch to remove 
multiple identical parents, which he claimed could happen with 
simplification otherwise. I didn't look any closer at it.

> > So there are multiple levels of history simplification, and right now the 
> > internal git revision parser only gives you two choices: "none" 
> > (--full-history) and "extreme" (which is the default when you give a set 
> > of filenames). 
> 
> I don't think we need any middle ground here; why might we want less
> simplification?

There's really three levels of simplification:

 - none at all ("--full-history"). This is really annoying, but if you 
   want to guarantee that you see all the changes (even duplicate ones) 
   done along all branches, you currently need to do this one.

   Currently "git whatchanged" uses this one (and that ignores merges by
   default, making it quite palatable). So with "git whatchanged", you 
   will get _every_ commit that changed the file, even if there are 
   duplicates along different histories.

 - extreme (the current default). This one is really nice, in that it 
   shows the simplest history you can make that explains the end result. 
   But it means that if you had two branches that ended up with the same 
   result, we will pick just one of them. And the other one may have done 
   it differently, and the different way of reaching the same result might 
   be interesting. We'll never know.

   As an example: the extreme simplification can also throw away branches 
   that had work reverted on them - the branch ended up the _same_ as the 
   one we chose, but it did so because it had some experimental work that 
   was deemed to be bad. Extreme simplification may or may not remove that 
   experiment, simply depending on which branch it _happened_ to pick.

   Currently, this is what most git users see if they ask for pathname 
   simplification, ie "gitk drivers/char" or "git log -p kernel/sched.c"
   uses this simplification. It's extremely useful, but it definitely 
   culls real history too.

 - The nice one that doesn't throw away potentially interesting 
   duplicate paths to reach the same end result. We don't have this one, 
   so no git commands do this yet.

   The way to do this one would be "--full-history", but then removing all 
   parents that are "redundant". In other words, for any merge that 
   remains (because of the --full-history), check if one parent is a full 
   superset of another one, and if so, remove the "dominated" parent, 
   which simplifies the merge. Continue until nothing can be simplified 
   any more.

   This would _usually_ end up giving the same graph as the "extreme" 
   simplification, but if there were two branches that really _did_ 
   generate the same end result using different commits, they'd remain in 
   the end result.

The problem with the "nice one" is that it's expensive as hell. There may 
be clever tricks to make it less so, though. But I think it's the 
RightThing(tm) to do, at least as an option for when you really want to 
see a reasonable history that still contains everything that is relevant.
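
A rough sketch of the "remove dominated parents" step described for the third level, under one reading of it: a parent is treated as dominated when it is already contained in the history of another parent of the same merge. The graph, the commit names, and the helpers `ancestors` and `drop_dominated` are hypothetical, and this is not how git's revision walker is written. It handles a single merge; the description above repeats the step until nothing more can be simplified.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit` through parent links, inclusive."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def drop_dominated(parents, merge):
    """Return the parents of `merge` that are not contained in another parent.

    Identical parents are collapsed first, then every parent that is an
    ancestor of some other parent is dropped.
    """
    plist = list(dict.fromkeys(parents[merge]))  # collapse identical parents
    kept = []
    for p in plist:
        dominated = any(q != p and p in ancestors(parents, q) for q in plist)
        if not dominated:
            kept.append(p)
    return kept


# Hypothetical path-limited history: commit -> list of parents.
history = {
    "root": [],
    "A": ["root"],
    "B": ["A"],
    "M": ["B", "A"],  # A is already part of B's history
    "C": ["root"],
    "D": ["root"],
    "N": ["C", "D"],  # two independent routes from root
}
print(drop_dominated(history, "M"))  # ['B']
print(drop_dominated(history, "N"))  # ['C', 'D']
```

The sketch recomputes a full ancestor set for every pair of parents, which gives a feel for why the approach is costly on a large history.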

			Linus

^ permalink raw reply

* [PATCH] Fix regression tests on Cygwin
From: Lars Hjemli @ 2006-10-23 19:34 UTC (permalink / raw)
  To: git

On Cygwin, "make test" fails due to a missing ".exe" in a couple of places.

This fixes it, in a somewhat ugly way....

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
---
 t/t0000-basic.sh |    9 ++++++++-
 t/test-lib.sh    |   10 +++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 2c9bbb5..41d53be 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -25,7 +25,14 @@ # or have too old python without subproc
 # out before running any tests.  Also catch the bogosity of trying
 # to run tests without building while we are at it.
 
-../git >/dev/null
+X=
+uname=$(uname -o)
+if test "$uname" = "Cygwin"
+then
+	X=".exe"
+fi
+
+../git$X >/dev/null
 if test $? != 1
 then
 	echo >&2 'You do not seem to have built git yet.'
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 2488e6e..8a64f6e 100755
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -43,6 +43,14 @@ case $(echo $GIT_TRACE |tr "[A-Z]" "[a-z
 		;;
 esac
 
+X=
+uname=$(uname -o)
+if test "$uname" = "Cygwin"
+then
+	X=".exe"
+fi
+
+
 # Each test should start with something like this, after copyright notices:
 #
 # test_description='Description of this test...
@@ -175,7 +183,7 @@ test_create_repo () {
 	repo="$1"
 	mkdir "$repo"
 	cd "$repo" || error "Cannot setup test environment"
-	"$GIT_EXEC_PATH/git" init-db --template=$GIT_EXEC_PATH/templates/blt/ 2>/dev/null ||
+	"$GIT_EXEC_PATH/git$X" init-db --template=$GIT_EXEC_PATH/templates/blt/ 2>/dev/null ||
 	error "cannot run git init-db -- have you built things yet?"
 	mv .git/hooks .git/hooks-disabled
 	cd "$owd"
-- 
1.4.3.1.g1688

^ permalink raw reply related
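
The suffix logic from the patch, restated as a small sketch. It assumes the caller already has the string printed by `uname -o`; the helper names `exe_suffix` and `git_binary` are hypothetical and only mirror the `X=".exe"` variable used above.

```python
def exe_suffix(uname_o):
    """Return the executable suffix for the OS name printed by `uname -o`."""
    return ".exe" if uname_o == "Cygwin" else ""


def git_binary(exec_path, uname_o):
    """Build the path of the git binary, adding the suffix where needed."""
    return f"{exec_path}/git{exe_suffix(uname_o)}"


print(git_binary("..", "Cygwin"))     # ../git.exe
print(git_binary("..", "GNU/Linux"))  # ../git
```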

* Re: [PATCH] git-cherry should show "+" instead of "-" and vice versa
From: Petr Baudis @ 2006-10-23 19:33 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git
In-Reply-To: <200610232003.08861.andyparkins@gmail.com>

Dear diary, on Mon, Oct 23, 2006 at 09:03:08PM CEST, I got a letter
where Andy Parkins <andyparkins@gmail.com> said that...
> In git-cherry.sh:
> 
>   if test -f "$patch/$2"
>   then
>     sign=-
>   else
>     sign=+
>   fi
> 
> Documentation says 'If the change seems to be in the upstream, it is shown on
> the standard output with prefix "+"', however the above does the reverse.
> When the file $patch/$2 exists it is because the patch /is/ in upstream so
> the sign should be "+".
>
> Signed-off-by: Andy Parkins <andyparkins@gmail.com>

See also

	http://news.gmane.org/find-root.php?message_id=<Pine.LNX.4.58.0608071328200.22971@kivilampi-30.cs.helsinki.fi>

Did the documentation ever get fixed, or did no one care enough? ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply

* Re: [RFC] git-split: Split the history of a git repository by subdirectories and ranges
From: Josh Triplett @ 2006-10-23 19:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0610230846420.3962@g5.osdl.org>

Linus Torvalds wrote:
> 
> On Mon, 23 Oct 2006, Josh Triplett wrote:
>>> I wonder if using "git-log --full-history -- $project" to let the core 
>>> side omit commits that do not change the $project (but still give you 
>>> all merged branches) would have made your job any easier?
>> I don't think it would.  We still need to know what commit to use as the
>> parent of any given commit, so we don't want commits in the log output
>> with parents that don't exist in the log output.  And rewriting parents
>> in git-log based on which revisions change the specified subdirectory
>> seems like a bad idea.
> 
> Umm.. You didn't realize that git log already _does_ exactly that?

No, I didn't, primarily because the git log output I've scrutinized most
carefully came from git log --pretty=raw, which doesn't rewrite parents
even when pointed at a subdirectory.

> You need to rewrite the parents in order to get a nice and readable 
> history, which in turn is needed for any visualizer. So git has long done 
> the parent rewriting in order to be able to do things like
> 
> 	gitk drivers/char
> 
> on the kernel.
>
> And yes, that's done by the core revision parsing code, so when you do
> 
> 	git log --full-history --parents -- $project
> 
> you do get the rewritten parent output (of course, it's not actually 
> _simplified_, so you get a fair amount of duplicate parents etc which 
> you'd still have to simplify and which don't do anything at all).
> 
> Without the "--full-history", you get a simplified history, but it's 
> likely to be _too_ simplified for your use, since it will not only 
> collapse multiple identical parents, it will also totally _remove_ parents 
> that don't introduce any new content.

Considering that git-split does exactly that (remove parents that don't
introduce new content, assuming they changed things outside the
subtree), that might actually work for us.  I just checked, and the
output of "git log --parents -- $project" on one of my repositories
seems to show the same sequence of commits as git log --parents on the
head commit printed by git-split $project (apart from the rewritten
sha1s), including elimination of irrelevant merges.

> So there are multiple levels of history simplification, and right now the 
> internal git revision parser only gives you two choices: "none" 
> (--full-history) and "extreme" (which is the default when you give a set 
> of filenames). 

I don't think we need any middle ground here; why might we want less
simplification?

- Josh Triplett




^ permalink raw reply

* Re: VCS comparison table
From: Linus Torvalds @ 2006-10-23 19:18 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski
In-Reply-To: <1161629801.27312.22.camel@charis.lan.vernstok.nl>

On Mon, 23 Oct 2006, Jelmer Vernooij wrote:
>
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

That wasn't what I was trying to aim at - the problem is that the bzr 
revision ID isn't "safe" in itself. Anybody can create a revision with the 
same names - and they may both have checksums that match their own 
revision, but you have no idea which one is "correct".

So you just have to trust the person that generates the name, to use a 
proper name generation algorithm. You have to _trust_ that your 64-bit 
random number really is random, for example. And that nobody is trying to 
mess with your repo.

This isn't a problem in normal behaviour, but it's a problem in an attack 
scenario: imagine somebody hacking the central server, and replacing the 
repository with something that had all the same commit names, but one of 
the revisions was changed to introduce a nasty backdoor. Change 
all the checksums to match too..

It would _look_ fine to somebody who fetches an update, and the maintainer 
might not ever even notice (because he wouldn't send the _old_ revision 
again, and _his_ tree would be fine, so he'd happily continue to send 
out new revisions on top of the bad one on the public site, never even 
realizing that people are fetching something that doesn't match what he is 
pushing).

In contrast, in git, if you replace something in a git repository, the 
name changes, and if I were to try to push an update on top of a broken 
repo like that, it simply wouldn't work - I couldn't fast-forward my own 
branch, because it's no longer a proper subset of what I'm trying to send.

So in git, you can _trust_ the names. They actually self-verify. You can't 
have maliciously made-up names that point to something else than what they 
are. 

[ Also, as a result, and related to this same issue: the git protocol 
  actually never sends object names when sending the object itself. It 
  just sends the object data, and the _recipient_ generates the name from 
  that.

  So you can't do the _other_ kind of spoofing, and make a repository that 
  _claims_ to have one name and the data would differ - because if you do 
  that, anybody who pulls from the spoofed repository will re-create 
  different names than you claimed, and won't even be able to pull such a 
  malicious repository. ]
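
A small sketch of why such names self-verify, assuming the object naming scheme git uses for loose objects: the SHA-1 of a header `<type> <size>\0` followed by the content. The helper names `git_object_name` and `receive` are hypothetical, and `receive` takes a claimed name only so the comparison can be shown; as described above, the protocol itself sends no name at all.

```python
import hashlib


def git_object_name(obj_type, data):
    """Name an object as SHA-1 over '<type> <size>', a NUL byte, and the content."""
    header = f"{obj_type} {len(data)}".encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def receive(claimed_name, data):
    """Recipient side: derive the name from the bytes, then compare."""
    actual = git_object_name("blob", data)
    return actual, actual == claimed_name


good = b"hello world\n"
name = git_object_name("blob", good)
print(name)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(receive(name, good)[1])                    # True
print(receive(name, b"hello world, evil\n")[1])  # False
```

Because the recipient recomputes the name from the data, tampered content simply ends up under a different name, and anything that referred to the original name no longer resolves to it.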

		Linus

^ permalink raw reply

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-23 19:12 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Carl Worth, Andreas Ericsson, git
In-Reply-To: <1161629801.27312.22.camel@charis.lan.vernstok.nl>

Jelmer Vernooij wrote:

> There are no requirements on what a revid is in bzr. It's a unique
> identifier, nothing more. It can be whatever you like, as long as it's
> unique for that specific commit. The committer+date+random_number is
> just what bzr uses at the moment to create those unique identifiers.

In an unpacked git repository, the commit id is also the commit's address
(pack files add another level of indirection via the pack index file),
and it doubles as a checksum.

P.S. I'm interested in the bzr equivalents of git's object types:
commits (revision info), and what is stored there besides the commit
message and the "snapshot"; trees/manifests, i.e. how files are gathered
together to form a given revision; and blobs, i.e. what the storage
format is and how it is divided: changeset-like as in Arch, per-file
"buckets" as in Mercurial and CVS, or something different altogether.
Is there an equivalent of git tags and tag objects?

^ permalink raw reply

* [PATCH] git-cherry should show "+" instead of "-" and vice versa
From: Andy Parkins @ 2006-10-23 19:03 UTC (permalink / raw)
  To: git

In git-cherry.sh:

  if test -f "$patch/$2"
  then
    sign=-
  else
    sign=+
  fi

Documentation says 'If the change seems to be in the upstream, it is shown on
the standard output with prefix "+"', however the above does the reverse.
When the file $patch/$2 exists it is because the patch /is/ in upstream so the
sign should be "+".

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
 git-cherry.sh |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-cherry.sh b/git-cherry.sh
index 8832573..1cc48e8 100755
--- a/git-cherry.sh
+++ b/git-cherry.sh
@@ -71,9 +71,9 @@ do
 	then
 		if test -f "$patch/$2"
 		then
-			sign=-
-		else
 			sign=+
+		else
+			sign=-
 		fi
 		case "$verbose" in
 		t)
-- 
1.4.2.3

^ permalink raw reply related

* Re: VCS comparison table
From: Shawn Pearce @ 2006-10-23 19:02 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, Jakub Narebski, James Henstridge, bazaar-ng,
	Matthew D. Fuller, Andreas Ericsson, Carl Worth, git
In-Reply-To: <1161629801.27312.22.camel@charis.lan.vernstok.nl>

Jelmer Vernooij <jelmer@samba.org> wrote:
> On Mon, 2006-10-23 at 11:45 -0700, Linus Torvalds wrote:
> > On Mon, 23 Oct 2006, Jakub Narebski wrote:
> > > The place for timestamp and committer info is in the revision metadata
> > > (in commit object in git). Not in revision id. Unless you think that
> > > "accidentally the same" doesn't happen...
> > Well, git and bzr really do share the same "stable" revision naming, 
> > although in git it's more indirect, and thus "covers" more.
> > 
[snip]
> > So you could more easily _fake_ a commit name in bzr, and depending on how 
> > things are done it might be more open to malicious attacks for that reason 
> > (or unintentionally - if two people apply the exact same patch from an 
> > email, and take the author/date info from the email like git does, you 
> > might have clashes. But with a 64-bit random number, that's probably 
> > unlikely, unless you also hit some other bad luck like having the 
> > pseudo-random sequence seeded by "time()", and people just _happen_ to 
> > apply the email at the exact same second).
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

I think Linus' original point here was that if you communicate the
revision id to another person and they fetch that revision there
is no assurance that the commit they have received is the exact
same commit you had.

In Git that assurance is implicitly present as the unique
identification you communicated to the other person is also that
integrity verification.  Therefore it's nearly impossible to spoof.

-- 
Shawn.

^ permalink raw reply

* Re: VCS comparison table
From: Jelmer Vernooij @ 2006-10-23 18:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Narebski, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <Pine.LNX.4.64.0610231134450.3962@g5.osdl.org>

On Mon, 2006-10-23 at 11:45 -0700, Linus Torvalds wrote:
> On Mon, 23 Oct 2006, Jakub Narebski wrote:
> > The place for timestamp and committer info is in the revision metadata
> > (in commit object in git). Not in revision id. Unless you think that
> > "accidentally the same" doesn't happen...
> Well, git and bzr really do share the same "stable" revision naming, 
> although in git it's more indirect, and thus "covers" more.
> 
> In git, the revision name indirectly includes the commit comments too (and 
> git obviously also distinguishes between "committer" and "author", and 
> those end up being indirectly credited in the name of the commit too). But 
> in a very real sense, the bzr stable ("real") revision name does 
> effectively contain the same things as a git ID: it's just that it's a 
> small subset (only committer+date+random number) of what git includes in 
> its names.
There are no requirements on what a revid is in bzr. It's a unique
identifier, nothing more. It can be whatever you like, as long as it's
unique for that specific commit. The committer+date+random_number is
just what bzr uses at the moment to create those unique identifiers.

> So you could more easily _fake_ a commit name in bzr, and depending on how 
> things are done it might be more open to malicious attacks for that reason 
> (or unintentionally - if two people apply the exact same patch from an 
> email, and take the author/date info from the email like git does, you 
> might have clashes. But with a 64-bit random number, that's probably 
> unlikely, unless you also hit some other bad luck like having the 
> pseudo-random sequence seeded by "time()", and people just _happen_ to 
> apply the email at the exact same second).
Bzr stores a checksum of the commit separately from the revision id in
the metadata of a revision. The revision is not used by itself to check
the integrity of a revision.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/


^ permalink raw reply

* [PATCH] enable index-pack streaming capability
From: Nicolas Pitre @ 2006-10-23 18:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

A new flag, --stdin, allows for a pack to be received over a stream.
When this flag is provided, the pack content is written to either
the named pack file or directly to the object repository under the
same name as produced by git-repack.  The pack index is written as
well with the corresponding base name, unless the index name is
overriden with -o.

With this patch, git-index-pack could be used instead of
git-unpack-objects when fetching remote objects but only with
non "thin" packs for now.

Signed-off-by: Nicolas Pitre <nico@cam.org>

---

diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt
index 71ce557..0799e67 100644
--- a/Documentation/git-index-pack.txt
+++ b/Documentation/git-index-pack.txt
@@ -8,7 +8,7 @@ git-index-pack - Build pack index file f
 
 SYNOPSIS
 --------
-'git-index-pack' [-o <index-file>] <pack-file>
+'git-index-pack' [-o <index-file>] { <pack-file> | --stdin [<pack-file>] }
 
 
 DESCRIPTION
@@ -29,7 +29,13 @@ OPTIONS
 	fails if the name of packed archive does not end
 	with .pack).
 
-
+--stdin::
+	When this flag is provided, the pack is read from stdin
+	instead and a copy is then written to <pack-file>. If
+	<pack-file> is not specified, the pack is written to
+	objects/pack/ directory of the current git repository with
+	a default name determined from the pack content.
+	
 Author
 ------
 Written by Sergey Vlasov <vsu@altlinux.ru>
diff --git a/index-pack.c b/index-pack.c
index e33f605..cecdd26 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -8,7 +8,7 @@ #include "tag.h"
 #include "tree.h"
 
 static const char index_pack_usage[] =
-"git-index-pack [-o index-file] pack-file";
+"git-index-pack [-o <index-file>] { <pack-file> | --stdin [<pack-file>] }";
 
 struct object_entry
 {
@@ -37,17 +37,18 @@ struct delta_entry
 	union delta_base base;
 };
 
-static const char *pack_name;
 static struct object_entry *objects;
 static struct delta_entry *deltas;
 static int nr_objects;
 static int nr_deltas;
 
+static int from_stdin;
+
 /* We always read in 4kB chunks. */
 static unsigned char input_buffer[4096];
 static unsigned long input_offset, input_len, consumed_bytes;
 static SHA_CTX input_ctx;
-static int input_fd;
+static int input_fd, output_fd, mmap_fd;
 
 /*
  * Make sure at least "min" bytes are available in the buffer, and
@@ -60,6 +61,8 @@ static void * fill(int min)
 	if (min > sizeof(input_buffer))
 		die("cannot fill %d bytes", min);
 	if (input_offset) {
+		if (output_fd >= 0)
+			write_or_die(output_fd, input_buffer, input_offset);
 		SHA1_Update(&input_ctx, input_buffer, input_offset);
 		memcpy(input_buffer, input_buffer + input_offset, input_len);
 		input_offset = 0;
@@ -86,13 +89,31 @@ static void use(int bytes)
 	consumed_bytes += bytes;
 }
 
-static void open_pack_file(void)
+static const char * open_pack_file(const char *pack_name)
 {
-	input_fd = open(pack_name, O_RDONLY);
-	if (input_fd < 0)
-		die("cannot open packfile '%s': %s", pack_name,
-		    strerror(errno));
+	if (from_stdin) {
+		input_fd = 0;
+		if (!pack_name) {
+			static char tmpfile[PATH_MAX];
+			snprintf(tmpfile, sizeof(tmpfile),
+				 "%s/pack_XXXXXX", get_object_directory());
+			output_fd = mkstemp(tmpfile);
+			pack_name = xstrdup(tmpfile);
+		} else
+			output_fd = open(pack_name, O_CREAT|O_EXCL|O_RDWR, 0600);
+		if (output_fd < 0)
+			die("unable to create %s: %s\n", pack_name, strerror(errno));
+		mmap_fd = output_fd;
+	} else {
+		input_fd = open(pack_name, O_RDONLY);
+		if (input_fd < 0)
+			die("cannot open packfile '%s': %s",
+			    pack_name, strerror(errno));
+		output_fd = -1;
+		mmap_fd = input_fd;
+	}
 	SHA1_Init(&input_ctx);
+	return pack_name;
 }
 
 static void parse_pack_header(void)
@@ -101,10 +122,9 @@ static void parse_pack_header(void)
 
 	/* Header consistency check */
 	if (hdr->hdr_signature != htonl(PACK_SIGNATURE))
-		die("packfile '%s' signature mismatch", pack_name);
+		die("pack signature mismatch");
 	if (!pack_version_ok(hdr->hdr_version))
-		die("packfile '%s' version %d unsupported",
-		    pack_name, ntohl(hdr->hdr_version));
+		die("pack version %d unsupported", ntohl(hdr->hdr_version));
 
 	nr_objects = ntohl(hdr->hdr_entries);
 	use(sizeof(struct pack_header));
@@ -122,8 +142,7 @@ static void bad_object(unsigned long off
 	va_start(params, format);
 	vsnprintf(buf, sizeof(buf), format, params);
 	va_end(params);
-	die("packfile '%s': bad object at offset %lu: %s",
-	    pack_name, offset, buf);
+	die("pack has bad object at offset %lu: %s", offset, buf);
 }
 
 static void *unpack_entry_data(unsigned long offset, unsigned long size)
@@ -222,9 +241,9 @@ static void * get_data_from_pack(struct 
 	int st;
 
 	map = mmap(NULL, len + pg_offset, PROT_READ, MAP_PRIVATE,
-		   input_fd, from - pg_offset);
+		   mmap_fd, from - pg_offset);
 	if (map == MAP_FAILED)
-		die("cannot mmap packfile '%s': %s", pack_name, strerror(errno));
+		die("cannot mmap pack file: %s", strerror(errno));
 	data = xmalloc(obj->size);
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = data;
@@ -382,14 +401,16 @@ static void parse_pack_objects(unsigned 
 	SHA1_Update(&input_ctx, input_buffer, input_offset);
 	SHA1_Final(sha1, &input_ctx);
 	if (hashcmp(fill(20), sha1))
-		die("packfile '%s' SHA1 mismatch", pack_name);
+		die("pack is corrupted (SHA1 mismatch)");
 	use(20);
+	if (output_fd >= 0)
+		write_or_die(output_fd, input_buffer, input_offset);
 
 	/* If input_fd is a file, we should have reached its end now. */
 	if (fstat(input_fd, &st))
-		die("cannot fstat packfile '%s': %s", pack_name, strerror(errno));
+		die("cannot fstat packfile: %s", strerror(errno));
 	if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes)
-		die("packfile '%s' has junk at the end", pack_name);
+		die("pack has junk at the end");
 
 	/* Sort deltas by base SHA1/offset for fast searching */
 	qsort(deltas, nr_deltas, sizeof(struct delta_entry),
@@ -435,7 +456,7 @@ static void parse_pack_objects(unsigned 
 	for (i = 0; i < nr_deltas; i++) {
 		if (deltas[i].obj->real_type == OBJ_REF_DELTA ||
 		    deltas[i].obj->real_type == OBJ_OFS_DELTA)
-			die("packfile '%s' has unresolved deltas",  pack_name);
+			die("pack has unresolved deltas");
 	}
 }
 
@@ -450,12 +471,12 @@ static int sha1_compare(const void *_a, 
  * On entry *sha1 contains the pack content SHA1 hash, on exit it is
  * the SHA1 hash of sorted object names.
  */
-static void write_index_file(const char *index_name, unsigned char *sha1)
+static const char * write_index_file(const char *index_name, unsigned char *sha1)
 {
 	struct sha1file *f;
 	struct object_entry **sorted_by_sha, **list, **last;
 	unsigned int array[256];
-	int i;
+	int i, fd;
 	SHA_CTX ctx;
 
 	if (nr_objects) {
@@ -472,8 +493,19 @@ static void write_index_file(const char 
 	else
 		sorted_by_sha = list = last = NULL;
 
-	unlink(index_name);
-	f = sha1create("%s", index_name);
+	if (!index_name) {
+		static char tmpfile[PATH_MAX];
+		snprintf(tmpfile, sizeof(tmpfile),
+			 "%s/index_XXXXXX", get_object_directory());
+		fd = mkstemp(tmpfile);
+		index_name = xstrdup(tmpfile);
+	} else {
+		unlink(index_name);
+		fd = open(index_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
+	}
+	if (fd < 0)
+		die("unable to create %s: %s", index_name, strerror(errno));
+	f = sha1fd(fd, index_name);
 
 	/*
 	 * Write the first-level table (the list is sorted,
@@ -513,12 +545,52 @@ static void write_index_file(const char 
 	sha1close(f, NULL, 1);
 	free(sorted_by_sha);
 	SHA1_Final(sha1, &ctx);
+	return index_name;
+}
+
+static void final(const char *final_pack_name, const char *curr_pack_name,
+		  const char *final_index_name, const char *curr_index_name,
+		  unsigned char *sha1)
+{
+	char name[PATH_MAX];
+	int err;
+
+	if (!from_stdin) {
+		close(input_fd);
+	} else {
+		err = close(output_fd);
+		if (err)
+			die("error while closing pack file: %s", strerror(errno));
+		chmod(curr_pack_name, 0444);
+	}
+
+	if (final_pack_name != curr_pack_name) {
+		if (!final_pack_name) {
+			snprintf(name, sizeof(name), "%s/pack/pack-%s.pack",
+				 get_object_directory(), sha1_to_hex(sha1));
+			final_pack_name = name;
+		}
+		if (move_temp_to_file(curr_pack_name, final_pack_name))
+			die("cannot store pack file");
+	}
+
+	chmod(curr_index_name, 0444);
+	if (final_index_name != curr_index_name) {
+		if (!final_index_name) {
+			snprintf(name, sizeof(name), "%s/pack/pack-%s.idx",
+				 get_object_directory(), sha1_to_hex(sha1));
+			final_index_name = name;
+		}
+		if (move_temp_to_file(curr_index_name, final_index_name))
+			die("cannot store index file");
+	}
 }
 
 int main(int argc, char **argv)
 {
 	int i;
-	char *index_name = NULL;
+	const char *curr_pack, *pack_name = NULL;
+	const char *curr_index, *index_name = NULL;
 	char *index_name_buf = NULL;
 	unsigned char sha1[20];
 
@@ -526,7 +598,9 @@ int main(int argc, char **argv)
 		const char *arg = argv[i];
 
 		if (*arg == '-') {
-			if (!strcmp(arg, "-o")) {
+			if (!strcmp(arg, "--stdin")) {
+				from_stdin = 1;
+			} else if (!strcmp(arg, "-o")) {
 				if (index_name || (i+1) >= argc)
 					usage(index_pack_usage);
 				index_name = argv[++i];
@@ -540,9 +614,9 @@ int main(int argc, char **argv)
 		pack_name = arg;
 	}
 
-	if (!pack_name)
+	if (!pack_name && !from_stdin)
 		usage(index_pack_usage);
-	if (!index_name) {
+	if (!index_name && pack_name) {
 		int len = strlen(pack_name);
 		if (!has_extension(pack_name, ".pack"))
 			die("packfile name '%s' does not end with '.pack'",
@@ -553,13 +627,14 @@ int main(int argc, char **argv)
 		index_name = index_name_buf;
 	}
 
-	open_pack_file();
+	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
 	deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
 	parse_pack_objects(sha1);
 	free(deltas);
-	write_index_file(index_name, sha1);
+	curr_index = write_index_file(index_name, sha1);
+	final(pack_name, curr_pack, index_name, curr_index, sha1);
 	free(objects);
 	free(index_name_buf);
 

^ permalink raw reply related

* Re: VCS comparison table
From: Linus Torvalds @ 2006-10-23 18:45 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jelmer Vernooij, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <200610232031.12399.jnareb@gmail.com>

On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> The place for timestamp and committer info is in the revision metadata
> (in commit object in git). Not in revision id. Unless you think that
> "accidentally the same" doesn't happen...

Well, git and bzr really do share the same "stable" revision naming, 
although in git it's more indirect, and thus "covers" more.

In git, the revision name indirectly includes the commit comments too (and 
git obviously also distinguishes between "committer" and "author", and 
those end up being indirectly credited in the name of the commit too). But 
in a very real sense, the bzr stable ("real") revision name does 
effectively contain the same things as a git ID: it's just that it's a 
small subset (only committer+date+random number) of what git includes in 
its names.

So you could more easily _fake_ a commit name in bzr, and depending on how 
things are done it might be more open to malicious attacks for that reason 
(or unintentionally - if two people apply the exact same patch from an 
email, and take the author/date info from the email like git does, you 
might have clashes. But with a 64-bit random number, that's probably 
unlikely, unless you also hit some other bad luck like having the 
pseudo-random sequence seeded by "time()", and people just _happen_ to 
apply the email at the exact same second).

The git use of hashes and parenthood information make any accidental 
clashes like that a non-issue: if you have exactly the same information, 
it really _is_ the same commit, since the hash includes the parenthood 
too. So you're left with just malicious attacks, and those currently look 
practically impossible too, of course.
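
A rough sketch of that argument, not git's actual code: the name of a
commit is a hash over everything the commit records, parent included, so
identical input yields the identical name and a different parent yields a
different one. Everything here is an assumption made for illustration:
the `toy_commit` layout, the function names, and above all the hash,
where FNV-1a stands in for the SHA-1 that git really uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FNV_OFFSET 14695981039346656037ULL
#define FNV_PRIME  1099511628211ULL

/* FNV-1a, used here ONLY as a stand-in for SHA-1 (assumption of this sketch). */
static uint64_t toy_hash(uint64_t h, const char *s)
{
	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= FNV_PRIME;
	}
	return h;
}

/* Hypothetical, simplified commit; real git commits can have several parents. */
struct toy_commit {
	uint64_t parent;        /* name of the parent commit, 0 for a root */
	const char *tree;       /* stands in for the tree name */
	const char *author;     /* "name <email> date" */
	const char *committer;  /* "name <email> date" */
	const char *message;
};

/* The commit's name covers every field, so it also covers the parenthood. */
static uint64_t toy_commit_name(const struct toy_commit *c)
{
	char parent[17];
	uint64_t h = FNV_OFFSET;

	assert(c && c->tree && c->author && c->committer && c->message);
	snprintf(parent, sizeof(parent), "%016llx",
		 (unsigned long long)c->parent);
	h = toy_hash(h, "parent ");
	h = toy_hash(h, parent);
	h = toy_hash(h, "\ntree ");
	h = toy_hash(h, c->tree);
	h = toy_hash(h, "\nauthor ");
	h = toy_hash(h, c->author);
	h = toy_hash(h, "\ncommitter ");
	h = toy_hash(h, c->committer);
	h = toy_hash(h, "\n\n");
	h = toy_hash(h, c->message);
	return h;
}
```

Two people who apply the same patch on top of the same parent, with the
same author, committer and date, end up with the same name because they
really did create the same commit. Apply it on top of a different parent
and the name changes.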

So I don't think bzr and git differ in this respect. I think you can 
_trust_ stable git names a lot more, but that's a separate issue.

			Linus

^ permalink raw reply

* Re: VCS comparison table
From: Jelmer Vernooij @ 2006-10-23 18:44 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <200610232031.12399.jnareb@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

On Mon, 2006-10-23 at 20:31 +0200, Jakub Narebski wrote:
> Jelmer Vernooij wrote:
> >> By the way, I wonder if accidentally identical revisions
> >> (see example for accidental clean merge on revctrl.org)
> >> would get the same revision id in bzr. In git they would.
> 
> > They won't. The revision id is made up of the committer's email address,
> > a timestamp and a bunch of random data. It wouldn't be hard to switch
> > to using checksums as revids instead, but I don't think there are any plans
> > in that direction.
> The place for timestamp and committer info is in the revision metadata
> (in commit object in git). Not in revision id. Unless you think that
> "accidentally the same" doesn't happen...
The revision id isn't parsed by bzr. It's just a unique identifier that
is generated at commit-time and is currently created by concatenating
those three fields. It can be anything you like. The bzr-svn plugin for
example creates revision ids in the form
svn:REVNUM@REPOS_UUID-BRANCHPATH and bzr-git uses git:GITREVID. Nothing
would break if bzr started using a different format.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: prune/prune-packed
From: Petr Baudis @ 2006-10-23 18:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: gitzilla, J. Bruce Fields, git
In-Reply-To: <7vvembzp6y.fsf@assigned-by-dhcp.cox.net>

Dear diary, on Mon, Oct 23, 2006 at 05:27:49AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> A Large Angry SCM <gitzilla@gmail.com> writes:
> 
> > J. Bruce Fields wrote:
> >> Junio C Hamano <junkio@cox.net> writes:
> >>> I am considering the following to address irritation some people
> >>> (including me, actually) are experiencing with this change when
> >>> viewing a small (or no) diff.  Any objections?
> >>
> >> So for me, if I run
> >>
> >> 	less -FRS file
> >>
> >> where "file" is less than a page, I see nothing happen whatsoever.
> >>
> >> At a guess, maybe it's clearing the screen, displaying the file, then
> >> restoring, all before I see anything happen?
> >
> > Junio,
> >
> > How about reverting this change? From the reports here, it is causing
> > problems on a number of different distributions.
> 
> Hmmm.  I thought I was using gnome-terminal as well, but I
> always work in screen and did not see this problem.
> 
> Sorry, but you are right and Linus is more right.  How about
> doing FRSX.

I should like that solution more since I hate the alternate screen, but
I actually don't, since it should be left at the user's will whether to
use the alternate screen or not, and Git shouldn't change the default on
whim. Git is trying to be too smart here, and I think it's more annoying
to override what the user is used to than to have to press q by default.

Yes, the user can always override Git by setting their own $LESS, but that
means another explicit action on the user's side is required and they
don't receive any further cool flags we might stick in there later.
(BTW, I don't think this is right either. In Cogito, I do

	LESS="$myflags$LESS"

unless $CG_LESS is set, in which case I do

	LESS="$CG_LESS".

So people like Jens who have LESS set still get sensible behaviour from
Cogito _and_ they don't lose the ability to override Cogito's less
flags.)
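
For a tool written in C, the same rule could look roughly like the sketch
below. It is only an illustration under assumptions: `compose_less` and
its parameter names are made up here, and it treats any non-NULL override
as "set", which may differ from how the shell version tests $CG_LESS.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Compute the value to export as LESS:
 *   - if an override (CG_LESS in Cogito) is given, use it as-is;
 *   - otherwise prepend the tool's flags to whatever the user has in LESS.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int compose_less(char *buf, size_t size, const char *tool_flags,
			const char *user_less, const char *override)
{
	assert(buf && tool_flags);
	if (!user_less)
		user_less = "";	/* LESS not set by the user */

	if (override) {
		if (strlen(override) >= size)
			return -1;
		snprintf(buf, size, "%s", override);
	} else {
		if (strlen(tool_flags) + strlen(user_less) >= size)
			return -1;
		snprintf(buf, size, "%s%s", tool_flags, user_less);
	}
	return 0;
}
```

A caller would pass getenv("LESS") and getenv("CG_LESS") as the last two
arguments and then export the result with setenv("LESS", buf, 1) before
starting the pager. That way a user with LESS already set keeps their own
flags and still picks up whatever the tool adds later.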

BTW, I think not seeing output of paged commands is a major problem,
this should probably warrant another bugfix release.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply

* Re: VCS comparison table
From: Linus Torvalds @ 2006-10-23 18:34 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <200610232021.55625.jnareb@gmail.com>

On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> By the way, I wonder if accidentally identical revisions
> (see example for accidental clean merge on revctrl.org)
> would get the same revision id in bzr. In git they would.

git can have no "accidentally identical revisions". They'd have to be 
purposefully done, but yes, they'd obviously (on purpose) get the same 
revision name if that's the case.

You may think of tree (not commit) identity, where git on purpose names 
trees the same regardless of how you got to them. So on a _tree_ level, 
you are always supposed to get the same result regardless of how you 
import things (ie two people importing the same tar-ball should always get 
exactly the same tree ID).

But the actual commit names are identical only if the same people are 
claimed to have authored (and committed) them at the same time - so it's 
definitely not "accidental" if the commits are called the same: they 
really _are_ the same.

Btw, I think you misunderstand the term "accidental clean merge". It means 
that two identical changes on two branches will merge without conflicts 
being reported.

A merge algorithm that doesn't do "accidental clean merge" is totally 
broken. The accidental clean merge is a usability requirement for pretty 
much anything - you often have two branches doing the same thing (possibly 
for different reasons - two people independently found the same bug that 
showed itself in two different ways - so they may even think that they 
are fixing different issues, and may have written totally different 
changelogs to explain the bug, but the solution is identical and should 
obviously merge cleanly).

So "accidental clean merge" may _sound_ like something bad, but it's 
actually a seriously good property (it's really just a special case of 
"convergence" - again, that's a good thing).
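
To make the term concrete, here is a minimal three-way merge of a single
region of text. It is a sketch, not the merge code of git or any other
tool, and the names `merge3` and `merge_status` are invented for the
example. The first test is the "accidental clean merge" case: both sides
made the identical change, so there is nothing to report.

```c
#include <assert.h>
#include <string.h>

enum merge_status { MERGE_CLEAN, MERGE_CONFLICT };

/*
 * Merge one region given the common ancestor (base) and the two sides.
 * On MERGE_CLEAN, *result points at the merged text; on MERGE_CONFLICT
 * it is set to NULL.
 */
static enum merge_status merge3(const char *base, const char *ours,
				const char *theirs, const char **result)
{
	assert(base && ours && theirs && result);

	if (!strcmp(ours, theirs)) {
		/* Both sides agree, whether by identical change or no change. */
		*result = ours;
		return MERGE_CLEAN;
	}
	if (!strcmp(base, ours)) {
		/* Only their side changed. */
		*result = theirs;
		return MERGE_CLEAN;
	}
	if (!strcmp(base, theirs)) {
		/* Only our side changed. */
		*result = ours;
		return MERGE_CLEAN;
	}
	/* Both sides changed the region, and differently. */
	*result = NULL;
	return MERGE_CONFLICT;
}
```

Note that nothing here looks at who made the change or what the changelog
said. Two people fixing the same bug with the same edit converge, even if
they described it in totally different ways.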

		Linus

^ permalink raw reply

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-23 18:31 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <1161628001.27312.8.camel@charis.lan.vernstok.nl>

Jelmer Vernooij wrote:
>> By the way, I wonder if accidentally identical revisions
>> (see example for accidental clean merge on revctrl.org)
>> would get the same revision id in bzr. In git they would.

> They won't. The revision id is made up of the committer's email address,
> a timestamp and a bunch of random data. It wouldn't be hard to switch
> to using checksums as revids instead, but I don't think there are any plans
> in that direction.

The place for timestamp and committer info is in the revision metadata
(in commit object in git). Not in revision id. Unless you think that
"accidentally the same" doesn't happen...
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: VCS comparison table
From: Jelmer Vernooij @ 2006-10-23 18:26 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <200610232021.55625.jnareb@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1086 bytes --]

On Mon, 2006-10-23 at 20:21 +0200, Jakub Narebski wrote:
> Linus Torvalds wrote:
> > On Mon, 23 Oct 2006, Jakub Narebski wrote:
> >> 
> >> Besides, you need [constant] network access for this mapping.
> > 
> > I _think_ that Aaron was trying to say that
> > 
> > 	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> > 
> > is always constant, so you can use that.
> > 
> > Of course, nobody will ever do that, because in practice they're not 
> > shown, the same way the "true" BK revision names were never shown and thus 
> > never really used.
> 
> By the way, I wonder if accidentally identical revisions
> (see example for accidental clean merge on revctrl.org)
> would get the same revision id in bzr. In git they would.
They won't. The revision id is made up of the committer's email address,
a timestamp and a bunch of random data. It wouldn't be hard to switch
to using checksums as revids instead, but I don't think there are any plans
in that direction.

Cheers,

Jelmer
-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-23 18:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <Pine.LNX.4.64.0610231103460.3962@g5.osdl.org>

Linus Torvalds wrote:
> 
> On Mon, 23 Oct 2006, Jakub Narebski wrote:
>> 
>> Besides, you need [constant] network access for this mapping.
> 
> I _think_ that Aaron was trying to say that
> 
> 	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> 
> is always constant, so you can use that.
> 
> Of course, nobody will ever do that, because in practice they're not 
> shown, the same way the "true" BK revision names were never shown and thus 
> never really used.

By the way, I wonder if accidentally identical revisions
(see example for accidental clean merge on revctrl.org)
would get the same revision id in bzr. In git they would.
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: VCS comparison table
From: Linus Torvalds @ 2006-10-23 18:04 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Andreas Ericsson,
	Carl Worth, git
In-Reply-To: <200610231953.19605.jnareb@gmail.com>



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> Besides, you need [constant] network access for this mapping.

I _think_ that Aaron was trying to say that

	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

is always constant, so you can use that.

Of course, nobody will ever do that, because in practice they're not 
shown, the same way the "true" BK revision names were never shown and thus 
never really used.

		Linus

^ permalink raw reply

* Re: [PATCH] Documentation for the [remote] config
From: Jakub Narebski @ 2006-10-23 17:58 UTC (permalink / raw)
  To: git
In-Reply-To: <87pscjnfvd.fsf@gmail.com>

Thanks a lot!
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply

* Re: VCS comparison table
From: Jakub Narebski @ 2006-10-23 17:53 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <453CF966.7000308@utoronto.ca>

Aaron Bentley wrote:
> James Henstridge wrote:

>> Why do you continue to repeat this argument?  No one is claiming that
>> a revision number by itself, as Bazaar uses them, is a global
>> identifier.  In fact, we keep on saying that they only have meaning in
>> the context of a branch.
> 
> And, unlike git, Bazaar branches are all independent entities[1], and
> they each have a URL.
> 
> So:
> 
> http://code.aaronbentley.com/bzrrepo/bzr.ab 1695
> 
> is a name for
> 
> abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> 
> And it does not depend on any other branch, especially not bzr.dev
> 
> Since:
> 1. anyone with write access to the urls can create them
> 2. anyone with read access to the urls can read them
> 3. the maintainers of the mainline have no control over them
>    (except as provided by 1)
> 
> these identifiers are not centralized.

If you don't use centralized numbers (i.e. always referring to bzr.dev,
either by always using (bzr.dev URL, revno), or by using "merge" for
bzr.dev and "pull" for the rest), the numbers are volatile. If the URL
vanishes, then the (URL, revno) to revid mapping is no longer valid. Yeah,
I know, cool URIs don't change...

Besides, you need [constant] network access for this mapping.
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: VCS comparison table
From: Linus Torvalds @ 2006-10-23 17:29 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Erik Bågfors, bazaar-ng, git, Jakub Narebski
In-Reply-To: <20061022185350.GW75501@over-yonder.net>

On Sun, 22 Oct 2006, Matthew D. Fuller wrote:
> 
> > This special treatment influences or directly causes many of the
> > things in bzr that we've been discussing:
>   [...]
> > I've been arguing that all of these impacts are dubious. But I can
> > understand that a bzr user hearing arguments against them might fear
> > that they would lose the ability to be able to see a view of commits
> > that "belong" to a particular branch.
> 
> Dead center.

The thing that the bzr people don't seem to realize is that their choice 
of revision naming has serious side effects, some of them really 
technical, and limiting.

I already brought this up once, and I suspect that the bzr people simply 
DID NOT UNDERSTAND the question:

 - how do you do the git equivalent of "gitk --all"

which is just another reason why "branch-local" revision naming is simply 
stupid and has real _technical_ problems.

I really suspect that a lot of people can't see further than their own 
feet, and don't understand the subtle indirect problems that branch-local 
naming causes. 

For example, how long does it take to do an arbitrary "undo" (ie forcing a 
branch to an earlier state) in a project with tens of thousands of 
commits? That's actually a really important operation, and yes, 
performance does matter. It's something that you do a lot when you do 
things like "bisect" (which I used to approximate with BK by hand, and 
yes, re-weaving the branch history was apparently a big part of why it 
took _minutes_ to do sometimes).

Again, this is something that people don't expect to have _anything_ to do 
with revision numbering, but the fact is, it's a big part of the picture. 
If you have branch-local revision numbering, you need to renumber all 
revisions on events like this, and even if it is "just" re-creating the 
revno->"real ID" cache, it's actually an expensive operation exactly 
because it's going to be at least linear in history.

One of the git design requirements was that no operation should _ever_ 
need to be linear in history size, because it becomes a serious limiter of 
scalability at some point. We were seeing some of those issues with BK, 
which is why I cared.

So in git, doing things like jumping back and forth in history is O(1). 
Always (with a really low constant cost too). Of course, checking out the 
end result is then roughly O(n), but even there "n" is the size of the 
_changes_, not number of revisions or number of files.
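
The difference can be modelled in a few lines. This is a hypothetical
model, not how git or bzr are implemented: `struct commit`, `struct
branch`, `branch_reset` and `rebuild_revnos` are names made up for the
sketch. Moving a branch is a single pointer store, while rebuilding a
branch-local revno table has to walk the whole first-parent chain.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified history node: only the first parent is tracked. */
struct commit {
	const struct commit *parent;	/* NULL for the root commit */
	const char *id;			/* the stable, global name */
};

/* A branch is nothing more than a pointer into the history. */
struct branch {
	const struct commit *tip;
};

/* Jumping to any other commit: O(1), independent of history size. */
static void branch_reset(struct branch *b, const struct commit *target)
{
	assert(b);
	b->tip = target;
}

/*
 * Rebuild the revno -> commit table for a branch: table[r - 1] is revno r.
 * Returns the number of revisions. The table is filled only when it has
 * room (cap >= return value). The cost is linear in the history length.
 */
static size_t rebuild_revnos(const struct branch *b,
			     const struct commit **table, size_t cap)
{
	const struct commit *c;
	size_t n = 0, i;

	assert(b && table);
	for (c = b->tip; c; c = c->parent)
		n++;			/* first walk: count revisions */
	if (n > cap)
		return n;		/* too small; caller may retry */
	for (c = b->tip, i = n; c; c = c->parent)
		table[--i] = c;		/* second walk: tip gets revno n */
	return n;
}
```

With global names only `branch_reset` is needed after a jump. With
branch-local numbers every such jump is followed by something like
`rebuild_revnos`, and that is where the cost linear in history comes from.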

(And there are obviously operations that _are_ O(revision history), the 
most trivial one being anything that visualizes all of history - but they 
depend on the size of history not because the operation itself gets more 
expensive, but because the dataset increases).

The whole confusion between "bzr pull" and "bzr merge" is another 
_technical_ sign of why branch-local revision numbers are a mistake. 

			Linus

^ permalink raw reply

* Re: VCS comparison table
From: Aaron Bentley @ 2006-10-23 17:18 UTC (permalink / raw)
  To: James Henstridge
  Cc: Jakub Narebski, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git
In-Reply-To: <a7e835d40610230801m4ac92409gbddcf66dcd1bb429@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

James Henstridge wrote:
> Why do you continue to repeat this argument?  No one is claiming that
> a revision number by itself, as Bazaar uses them, is a global
> identifier.  In fact, we keep on saying that they only have meaning in
> the context of a branch.

And, unlike git, Bazaar branches are all independent entities[1], and
they each have a URL.

So:

http://code.aaronbentley.com/bzrrepo/bzr.ab 1695

is a name for

abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

And it does not depend on any other branch, especially not bzr.dev

Since:
1. anyone with write access to the urls can create them
2. anyone with read access to the urls can read them
3. the maintainers of the mainline have no control over them
   (except as provided by 1)

these identifiers are not centralized.

Aaron

[1] The fact that they may share storage is not important to the model.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFPPlm0F+nu1YWqI0RAlmLAJ9cpw5X7UXQ82EmoIeUrKzEaFbhdACfZPsS
CRJ69XWi7XAWJRi7Fgt9ICU=
=WrV9
-----END PGP SIGNATURE-----

^ permalink raw reply
