Git development
* Re: [PATCH] binary patch.
From: Junio C Hamano @ 2006-05-05 19:23 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605051431390.24505@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> On Fri, 5 May 2006, Junio C Hamano wrote:
>
>> The delta is going to be deflated and hopefully gets a bit
>> smaller, so if we really care that level of detail, it might be
>> worth to do (deflate_size*3/2) or something like that here, use
>> delta with or without deflate whichever is smaller, and mark the
>> uncompressed delta with a different tag ("uncompressed delta"?).
>> And for symmetry, to deal with uncompressible data, we may want
>> to have "uncompressed literal" as well.
>
> Nah...  Please just forget that.  ;-)

I was serious about the above actually.
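
To make the idea concrete, here is a minimal sketch in Python rather
than the C of the patch.  It assumes the delta has already been computed
elsewhere and is handed in as bytes; the tag names "literal" and "delta"
for the deflated forms, and the function name pick_encoding, are only
illustrative.  It simply keeps whichever of the candidate encodings is
smallest.

```python
import zlib
from typing import Optional, Tuple


def pick_encoding(literal: bytes, delta: Optional[bytes] = None) -> Tuple[str, bytes]:
    """Return (tag, payload) for the smallest candidate encoding.

    The tags are illustrative names, not an existing patch format.
    """
    candidates = [
        ("literal", zlib.compress(literal)),   # deflated full contents
        ("uncompressed literal", literal),     # wins for incompressible data
    ]
    if delta is not None:
        candidates += [
            ("delta", zlib.compress(delta)),   # deflated delta
            ("uncompressed delta", delta),     # wins when deflate overhead dominates
        ]
    # min() keeps the first candidate on ties, so deflated forms are preferred.
    return min(candidates, key=lambda c: len(c[1]))
```

A very short delta stays uncompressed here because the zlib header and
checksum alone cost more than the delta itself.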

BTW, this "binary patch" opens a different can of worms.

Currently, the diff uses a heuristic borrowed from GNU diff 
(I did not look at the code when I did it, but it is described
in its documentation) to decide if a file is binary (look at the
first few bytes and find NUL).  I am sure people will want to
have a way to say "that heuristic fails but this _is_ a binary
file and please treat it as such".
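
As a rough illustration of the heuristic and of a per-invocation
override, here is a sketch in Python.  The sniff length of 8000 bytes is
an assumption made for the example ("the first few bytes" is all that is
said above), and the names looks_binary, is_binary and forced_binary are
hypothetical.

```python
def looks_binary(data: bytes, sniff_len: int = 8000) -> bool:
    """Heuristic: a NUL byte near the start suggests binary content.

    sniff_len is an assumed value for this sketch.
    """
    return b"\0" in data[:sniff_len]


def is_binary(path: str, data: bytes, forced_binary: frozenset = frozenset()) -> bool:
    """An explicit "treat this path as binary" request wins over the heuristic."""
    return path in forced_binary or looks_binary(data)
```

The override only adds paths to the binary set; content that the
heuristic already flags stays binary either way.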

I think there are two ways to do it, both valid.

 - give an option to "diff" that says "treat this path as binary
   for this invocation of the program".

 - give an attribute to blob object that says "this blob is
   binary and should be treated as such".

The latter is probably the right way to go in the longer term.

A blob being binary or not is a property of the content and does
not depend on where it sits in the history, so unlike "recording
renames as a hint in commit objects", the attribute is at the
blob level, not at the commit nor the tree that points at the
blob.

But "binaryness" affects only certain operations that extract
the data (e.g. diff and grep) and not others (e.g. fetch).
Also, it makes sense to be able to retroactively mark as binary a
blob that was not marked as such originally.  So I do
not think it should be recorded in the object header.

Which suggests that we may perhaps want to have notes that can
be attached to existing objects to augment them without changing
the contents of the data, and have tools notice these notes when
they are available.  Another example is associating correct
MIME types with blobs, so that gitweb _blob_ links can do
sensible things with them.

These external notes are purely for Porcelains (in the context
of this sentence "diff" and "grep" are Porcelain), but we would
also want a way to propagate them across repositories somehow.
In a sense, "grafts" information is similar to the external
notes in that it augments existing commit objects, but its
effect is a bit more intrusive; it affects the way the core
operates.


* [PATCH] Clarify git-cherry documentation.
From: sean @ 2006-05-05 19:06 UTC (permalink / raw)
  To: git

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>


---

 Documentation/git-cherry.txt |   19 ++++++++++++++-----
 1 files changed, 14 insertions(+), 5 deletions(-)

6978ad8b3935b8ce2c55da65b099c67a32ff94d0
diff --git a/Documentation/git-cherry.txt b/Documentation/git-cherry.txt
index 9a5e371..893baaa 100644
--- a/Documentation/git-cherry.txt
+++ b/Documentation/git-cherry.txt
@@ -11,11 +11,20 @@ SYNOPSIS
 
 DESCRIPTION
 -----------
-Each commit between the fork-point and <head> is examined, and compared against
-the change each commit between the fork-point and <upstream> introduces.
-Commits already included in upstream are prefixed with '-' (meaning "drop from
-my local pull"), while commits missing from upstream are prefixed with '+'
-(meaning "add to the updated upstream").
+The changeset (or "diff") of each commit between the fork-point and <head>
+is compared against each commit between the fork-point and <upstream>.
+
+Every commit with a changeset that doesn't exist in the other branch
+has its id (sha1) reported, prefixed by a symbol.  Those existing only
+in the <upstream> branch are prefixed with a minus (-) sign, and those
+that only exist in the <head> branch are prefixed with a plus (+) symbol.
+
+Because git-cherry compares the changeset rather than the commit id
+(sha1), you can use git-cherry to find out if a commit you made locally
+has been applied <upstream> under a different commit id.  For example,
+this will happen if you're feeding patches <upstream> via email rather
+than pushing or pulling commits directly.
+
 
 OPTIONS
 -------
-- 
1.3.1.g9c203


* [PATCH] Update  git-unpack-objects documentation.
From: sean @ 2006-05-05 19:05 UTC (permalink / raw)
  To: git

Document that git-unpack-objects will not produce any
results when used on a pack that exists in a repository;
move it first.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>


---

 Documentation/git-unpack-objects.txt |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

68facf4045556d10c541534a086b3c6486a1c5fb
diff --git a/Documentation/git-unpack-objects.txt b/Documentation/git-unpack-objects.txt
index 1828062..c20b38b 100644
--- a/Documentation/git-unpack-objects.txt
+++ b/Documentation/git-unpack-objects.txt
@@ -13,9 +13,16 @@ SYNOPSIS
 
 DESCRIPTION
 -----------
-Reads a packed archive (.pack) from the standard input, and
-expands the objects contained in the pack into "one-file
-one-object" format in $GIT_OBJECT_DIRECTORY.
+Read a packed archive (.pack) from the standard input, expanding
+the objects contained within and writing them into the repository in
+"loose" (one object per file) format.
+
+Objects that already exist in the repository will *not* be unpacked
+from the pack-file.  Therefore, nothing will be unpacked if you use
+this command on a pack-file that exists within the target repository.
+
+Please see the `git-repack` documentation for options to generate
+new packs and replace existing ones.
 
 OPTIONS
 -------
-- 
1.3.1.g9c203


* [PATCH] Fix up docs where "--" isn't displayed correctly.
From: sean @ 2006-05-05 19:05 UTC (permalink / raw)
  To: git

A bare "--" doesn't show up in man or html pages correctly
as two individual dashes unless backslashed as \--
in the asciidoc source.  Note, no backslash is needed
inside a literal block.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>


---

 Documentation/git-add.txt            |    2 +-
 Documentation/git-checkout-index.txt |    2 +-
 Documentation/git-commit.txt         |    2 +-
 Documentation/git-log.txt            |    2 +-
 Documentation/git-ls-files.txt       |    2 +-
 Documentation/git-merge-index.txt    |    4 ++--
 Documentation/git-prune.txt          |    2 +-
 Documentation/git-rm.txt             |    2 +-
 Documentation/git-update-index.txt   |    2 +-
 Documentation/git-verify-pack.txt    |    2 +-
 Documentation/git-whatchanged.txt    |    2 +-
 Documentation/gitk.txt               |    2 +-
 12 files changed, 13 insertions(+), 13 deletions(-)

32a74a984e6c1869dbebc9bc8d2fe9503e8dd624
diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index ae24547..5e31129 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -26,7 +26,7 @@ OPTIONS
 -v::
         Be verbose.
 
---::
+\--::
 	This option can be used to separate command-line options from
 	the list of files, (useful when filenames might be mistaken
 	for command-line options).
diff --git a/Documentation/git-checkout-index.txt b/Documentation/git-checkout-index.txt
index 09bd6a5..765c173 100644
--- a/Documentation/git-checkout-index.txt
+++ b/Documentation/git-checkout-index.txt
@@ -63,7 +63,7 @@ OPTIONS
 	Only meaningful with `--stdin`; paths are separated with
 	NUL character instead of LF.
 
---::
+\--::
 	Do not interpret any more arguments as options.
 
 The order of the flags used to matter, but not anymore.
diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index 0a7365b..38df59c 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -106,7 +106,7 @@ but can be used to amend a merge commit.
 	index and the latest commit does not match on the
 	specified paths to avoid confusion.
 
---::
+\--::
 	Do not interpret any more arguments as options.
 
 <file>...::
diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index af378ff..c9ffff7 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -51,7 +51,7 @@ git log v2.6.12.. include/scsi drivers/s
 	Show all commits since version 'v2.6.12' that changed any file
 	in the include/scsi or drivers/scsi subdirectories
 
-git log --since="2 weeks ago" -- gitk::
+git log --since="2 weeks ago" \-- gitk::
 
 	Show the changes during the last two weeks to the file 'gitk'.
 	The "--" is necessary to avoid confusion with the *branch* named
diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 796d049..a29c633 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -106,7 +106,7 @@ OPTIONS
 	lines, show only handful hexdigits prefix.
 	Non default number of digits can be specified with --abbrev=<n>.
 
---::
+\--::
 	Do not interpret any more arguments as options.
 
 <file>::
diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
index fbc986a..332e023 100644
--- a/Documentation/git-merge-index.txt
+++ b/Documentation/git-merge-index.txt
@@ -8,7 +8,7 @@ git-merge-index - Runs a merge for files
 
 SYNOPSIS
 --------
-'git-merge-index' [-o] [-q] <merge-program> (-a | -- | <file>\*)
+'git-merge-index' [-o] [-q] <merge-program> (-a | \-- | <file>\*)
 
 DESCRIPTION
 -----------
@@ -19,7 +19,7 @@ files are passed as arguments 5, 6 and 7
 
 OPTIONS
 -------
---::
+\--::
 	Do not interpret any more arguments as options.
 
 -a::
diff --git a/Documentation/git-prune.txt b/Documentation/git-prune.txt
index f694fcb..a11e303 100644
--- a/Documentation/git-prune.txt
+++ b/Documentation/git-prune.txt
@@ -28,7 +28,7 @@ OPTIONS
 	Do not remove anything; just report what it would
 	remove.
 
---::
+\--::
 	Do not interpret any more arguments as options.
 
 <head>...::
diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index c9c3088..66fc478 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -32,7 +32,7 @@ OPTIONS
 -v::
         Be verbose.
 
---::
+\--::
 	This option can be used to separate command-line options from
 	the list of files, (useful when filenames might be mistaken
 	for command-line options).
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 23f2b6f..57177c7 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -113,7 +113,7 @@ OPTIONS
 	Only meaningful with `--stdin`; paths are separated with
 	NUL character instead of LF.
 
---::
+\--::
 	Do not interpret any more arguments as options.
 
 <file>::
diff --git a/Documentation/git-verify-pack.txt b/Documentation/git-verify-pack.txt
index 4962d69..7a6132b 100644
--- a/Documentation/git-verify-pack.txt
+++ b/Documentation/git-verify-pack.txt
@@ -25,7 +25,7 @@ OPTIONS
 -v::
 	After verifying the pack, show list of objects contained
 	in the pack.
---::
+\--::
 	Do not interpret any more arguments as options.
 
 OUTPUT FORMAT
diff --git a/Documentation/git-whatchanged.txt b/Documentation/git-whatchanged.txt
index 641cb7e..e8f21d0 100644
--- a/Documentation/git-whatchanged.txt
+++ b/Documentation/git-whatchanged.txt
@@ -58,7 +58,7 @@ git-whatchanged -p v2.6.12.. include/scs
 	Show as patches the commits since version 'v2.6.12' that changed
 	any file in the include/scsi or drivers/scsi subdirectories
 
-git-whatchanged --since="2 weeks ago" -- gitk::
+git-whatchanged --since="2 weeks ago" \-- gitk::
 
 	Show the changes during the last two weeks to the file 'gitk'.
 	The "--" is necessary to avoid confusion with the *branch* named
diff --git a/Documentation/gitk.txt b/Documentation/gitk.txt
index eb126d7..cb482bf 100644
--- a/Documentation/gitk.txt
+++ b/Documentation/gitk.txt
@@ -31,7 +31,7 @@ gitk v2.6.12.. include/scsi drivers/scsi
 	Show as the changes since version 'v2.6.12' that changed any
 	file in the include/scsi or drivers/scsi subdirectories
 
-gitk --since="2 weeks ago" -- gitk::
+gitk --since="2 weeks ago" \-- gitk::
 
 	Show the changes during the last two weeks to the file 'gitk'.
 	The "--" is necessary to avoid confusion with the *branch* named
-- 
1.3.1.g9c203


* [PATCH] Several trivial documentation touch ups.
From: sean @ 2006-05-05 19:05 UTC (permalink / raw)
  To: git

  Move incorrect asciidoc level 2 titles back to level 1.

  Show output of git-name-rev in man page example.

  Reword sentences that begin with a period (.) in asciidoc
  numbered lists to work around conversion to man page bug.

  Mention that git-repack now calls git-prune-packed
  when the -d option is passed to it.

  [imap] section headers in the config file example need to be
  contained in a literal block.  imap.pass is the proper config
  file variable to use, not imap.password.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>


---

 Documentation/git-clone.txt       |    2 +-
 Documentation/git-imap-send.txt   |    4 +++-
 Documentation/git-name-rev.txt    |    1 +
 Documentation/git-repack.txt      |    1 +
 Documentation/git-repo-config.txt |    6 +++---
 Documentation/git-reset.txt       |    2 +-
 6 files changed, 10 insertions(+), 6 deletions(-)

227b8dd1fa66a6d96a25e9fd8fc070be1ea31449
diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 131e445..b333f51 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -101,7 +101,7 @@ OPTIONS
 	is not allowed.
 
 Examples
-~~~~~~~~
+--------
 
 Clone from upstream::
 +
diff --git a/Documentation/git-imap-send.txt b/Documentation/git-imap-send.txt
index cfc0d88..eca9e9c 100644
--- a/Documentation/git-imap-send.txt
+++ b/Documentation/git-imap-send.txt
@@ -29,6 +29,7 @@ CONFIGURATION
 git-imap-send requires the following values in the repository
 configuration file (shown with examples):
 
+..........................
 [imap]
     Folder = "INBOX.Drafts"
 
@@ -38,8 +39,9 @@ configuration file (shown with examples)
 [imap]
     Host = imap.server.com
     User = bob
-    Password = pwd
+    Pass = pwd
     Port = 143
+..........................
 
 
 BUGS
diff --git a/Documentation/git-name-rev.txt b/Documentation/git-name-rev.txt
index 6870708..ffaa004 100644
--- a/Documentation/git-name-rev.txt
+++ b/Documentation/git-name-rev.txt
@@ -41,6 +41,7 @@ Enter git-name-rev:
 
 ------------
 % git name-rev 33db5f4d9027a10e477ccf054b2c1ab94f74c85a
+33db5f4d9027a10e477ccf054b2c1ab94f74c85a tags/v0.99^0~940
 ------------
 
 Now you are wiser, because you know that it happened 940 revisions before v0.99.
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index d2f9a44..9516227 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -38,6 +38,7 @@ OPTIONS
 -d::
 	After packing, if the newly created packs make some
 	existing packs redundant, remove the redundant packs.
+	Also runs gitlink:git-prune-packed[1].
 
 -l::
         Pass the `--local` option to `git pack-objects`, see
diff --git a/Documentation/git-repo-config.txt b/Documentation/git-repo-config.txt
index ddcf523..fd44f62 100644
--- a/Documentation/git-repo-config.txt
+++ b/Documentation/git-repo-config.txt
@@ -34,10 +34,10 @@ convert the value to the canonical form 
 a "true" or "false" string for bool). If no type specifier is passed,
 no checks or transformations are performed on the value.
 
-This command will fail if
+This command will fail if:
 
-. .git/config is invalid,
-. .git/config can not be written to,
+. The .git/config file is invalid,
+. Can not write to .git/config,
 . no section was provided,
 . the section or key is invalid,
 . you try to unset an option which does not exist, or
diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt
index ebcfe5e..b27399d 100644
--- a/Documentation/git-reset.txt
+++ b/Documentation/git-reset.txt
@@ -43,7 +43,7 @@ OPTIONS
 	Commit to make the current HEAD.
 
 Examples
-~~~~~~~~
+--------
 
 Undo a commit and redo::
 +
-- 
1.3.1.g9c203


* Re: [ANNOUNCE] Git wiki
From: Dave Jones @ 2006-05-05 19:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Petr Baudis, linux, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0605050944200.3622@g5.osdl.org>

On Fri, May 05, 2006 at 10:48:38AM -0700, Linus Torvalds wrote:

 > (and yes, I'm somewhat biased: in my opinion, having a 
 > million monkeys throwing crap at the walls and encoding the information in 
 > the patterns on monkey shit is a better format than CVS), so it would 
 > actually have improved BK, while also making it possible to interoperate 
 > if you didn't want to use BK itself.
 >  ...
 > So that was really my "fallback" position: if nothing out there worked, 
 > I'd rather go back to lists of patches than use CVS. 

I've encountered managing kernel trees in CVS both during my tenure at SuSE,
and to a more involved extent as Fedora/RHEL maintainer, and I'd just like
to echo how much it _completely sucks_ at times.

Rebasing to a newer release is a *nightmare* that usually takes
up most of an afternoon compared to rebasing my git based projects.

In the event I can't persuade the powers that be to switch to git at some point
for managing our packages, I'll be sure to bring up your suggestion of
a million monkeys. I believe you can pick them up fairly cheap these days.

		Dave

-- 
http://www.codemonkey.org.uk


* Re: [PATCH 1/3] Alphabetize the glossary.
From: Junio C Hamano @ 2006-05-05 19:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0605041238240.26488@wbgn013.biozentrum.uni-wuerzburg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> The idea of having it not alphabetized, but doing it by a script, was to 
> let people actually _read_ it. There is nothing more annoying than having 
> to jump forward and backward and eventually be lost.
>
> glossary, as I started it, was topologically ordered: no Git term was used 
> before it was described (at least that was the plan).

I myself rarely read either the man or the html formatted ones.  When I
need to find something, I go straight to Documentation/
directory looking for *.txt files.  Being able to find things
from an alphabetized list is very handy.

On the other hand, we would want to make it easy for people to
read it in the logical order.  For that purpose, html formatted
version, thanks to the cross references the script creates, is a
lot easier than the plain text version.

Maybe we should do both.  We _could_ teach the sort script to
also do a topological sort, and have two sections in the
resulting formatted documentation, the top part being
"alphabetical", and the second part being "bedtime reading".

A random sort that is merely topologically correct probably is
not what we want, so it might make sense to have a hint that
instructs "these should come first before those although they
are topologically independent" to the sort script.  Of course
that "hint" could be the order entries appear in the source text
(which was what you had originally), but when somebody wants to
add a new entry to the glossary, it is unambiguous where the
new entry should go if the source text is already sorted, which
I am hoping would make it somewhat easier to maintain.
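
A sketch of such a hinted topological sort, in Python for brevity (the
actual sort script is not shown here, so this is only an assumption
about what it could look like).  The hint is taken to be the order of
the `entries` list, and `uses` is a hypothetical map from each term to
the terms its definition relies on.

```python
import heapq


def reading_order(entries, uses):
    """Topologically sort glossary terms, breaking ties by hint order.

    entries: terms in hint order (hypothetical input format).
    uses:    term -> set of terms that must be described before it.
    """
    position = {term: i for i, term in enumerate(entries)}
    pending = {
        term: {d for d in uses.get(term, ()) if d in position and d != term}
        for term in entries
    }
    dependents = {term: [] for term in entries}
    for term, deps in pending.items():
        for dep in deps:
            dependents[dep].append(term)

    # Among all terms whose prerequisites are done, emit the earliest hint first.
    ready = [(position[t], t) for t in entries if not pending[t]]
    heapq.heapify(ready)
    order = []
    while ready:
        _, term = heapq.heappop(ready)
        order.append(term)
        for nxt in dependents[term]:
            pending[nxt].discard(term)
            if not pending[nxt]:
                heapq.heappush(ready, (position[nxt], nxt))
    if len(order) != len(entries):
        raise ValueError("glossary definitions refer to each other in a cycle")
    return order
```

With an alphabetized source the hint degenerates to "alphabetical among
independent terms", which is at least deterministic.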


* Re: [ANNOUNCE] Git wiki
From: Petr Baudis @ 2006-05-05 18:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux, git
In-Reply-To: <Pine.LNX.4.64.0605051123420.3622@g5.osdl.org>

Dear diary, on Fri, May 05, 2006 at 08:31:06PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> Moving data around happens with a whole lot more than "mv".

Let's keep this on the per-file level - if you want to go below the file
granularity, I already _DID_ say that I agree that explicit tracking is
not the way. (If sub-file tracking would end up having any usable
reliability in real-world cases, which is something I do not take for
granted.)

Another thing is, the sub-file content tracking would end up being a lot
more "magic" than the simple per-file content tracking, and you stated
several times that you prefer simple merge over better but magic merge -
so why do you prefer sub-file content tracking anyway?

> It happens with patches (somebody _else_ may have done an "mv", without 
> using git at all),

_Here_ is the place for automated renames detection. Between applying
and committing the patch, the user can verify that it got the renames
right. That's impossible when guessing the renames later.

> and it happens with editors (moving data around until 
> most of it exists in another file).

I doubt this in fact happens that often (to a degree the automatic
rename detection would catch). And if it happens, then the user has to
tell Git - I have never heard that _this_ would be any problem in other
version control systems. You could make it more foolproof by running the
automatic rename detection on the diff being committed and suggesting
to the user that other, as yet unrecorded renames did happen.

The point is, the user stays in control and can override any stupid guess.

> So doing "*mv" is just a special case.
> 
> And supporting special cases is _wrong_. If you start depending on data 
> that isn't actually dependable, that's WRONG.

I prefer making this data dependable over having to resort to guessing
from a smaller amount of dependable data.

> There's another reason why encoding movement information in the commit is 
> totally broken, namely the fact that a lot of the actions DO NOT WALK THE 
> COMMIT CHAIN!
> 
> Try doing
> 
> 	git diff v1.3.0..
> 
> and think about what that actually _means_. Think about the fact that it 
> doesn't actually walk the commit chain at all: it diffs the trees between 
> v1.3.0 and the current one. What if the rename happened in a commit in the 
> middle?

Then the automated renames detection will miss it given that the other
accumulated differences are large enough, and the suggested workarounds
_are_ precisely walking the commit chain.

If you use persistent file ids, you never miss it _AND_ you DO NOT WALK
THE COMMIT CHAIN! You still just match file ids in the two trees.
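
A sketch of what matching file ids in two trees could look like, in
Python.  This is purely hypothetical: git trees do not store a
persistent file id, so the `path -> file id` maps below are an assumed
data model for the sake of the argument.

```python
def renames_by_file_id(old_tree: dict, new_tree: dict) -> dict:
    """Return {old_path: new_path} for ids present in both trees under different paths.

    Both arguments map path -> persistent file id (a hypothetical model).
    """
    old_path_of = {file_id: path for path, file_id in old_tree.items()}
    return {
        old_path_of[file_id]: path
        for path, file_id in new_tree.items()
        if file_id in old_path_of and old_path_of[file_id] != path
    }
```

Only the two end-point trees are consulted; no commit in between is
visited.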

> The "track contents, not intentions" approach avoids both these things. 
> The end result is _reliable_, not a "random guess".

No, the end result is whatever some heuristic randomly guessed, and
it's not reliable either since the heuristic can change.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: [ANNOUNCE] Git wiki
From: Jakub Narebski @ 2006-05-05 18:49 UTC (permalink / raw)
  To: git
In-Reply-To: <e3fvj2$779$1@sea.gmane.org>

Jakub Narebski wrote:

> Junio C Hamano wrote:
> 
>> Petr Baudis <pasky@suse.cz> writes:
>> 
>>> But the non-obviously important part here to note is that the branch B
>>> merely "corrects a typo on a comment somewhere" - the latest versions in
>>> branch A and branch B are always compared for renames, therefore if
>>> branch A renamed the file and branch B sums up to some larger-scale
>>> changes in the file, it still won't be merged properly.
>> 
>> I probably am guilty of starting this misinformation, but the
>> code does not compare the latest in A and B for rename
>> detection; it compares (O, A) and (O, B).
>> 
>> But the end result is the same - what you say is correct.  If a
>> path (say O to A) that renamed has too big a change, then no
>> matter how small the changes are on the other path (O to B),
>> rename detection can be fooled.  We could perhaps alleviate it
>> by following the whole commit chain.
> 
> Or perhaps by helper information about renames, entered either by git-mv
> (and git-cp) or rename detection at commit, e.g. in the following form
> 
>         note at <commit-sha1> was-in <pathname>
>         note at <commit-sha1> was-in <pathname>
> 
> (the obvious limit of this "note header" solution is that it wouldn't
> work for file and directory names containing "\n"). I'm not sure if
> <pathname> should be just the basename, or the full pathname.

Erm, I'm sorry, forget the implementation which wouldn't work. The idea was
to accumulate renames and contents moving information, and remember at
which commit it occurred. But its place (as a _helper_ information) is
perhaps in a separate structure.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH] binary patch.
From: Nicolas Pitre @ 2006-05-05 18:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vejz8z80p.fsf@assigned-by-dhcp.cox.net>

On Fri, 5 May 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> >> +	delta = NULL;
> >> +	deflated = deflate_it(two->ptr, two->size, &deflate_size);
> >> +	if (one->size && two->size) {
> >> +		delta = diff_delta(one->ptr, one->size,
> >> +				   two->ptr, two->size,
> >> +				   &delta_size, deflate_size);
> >
> > Here you probably want to use deflate_size-1 (deflate_size can't be 0).
> 
> I am not sure if -1 is worth here.
> 
> The delta is going to be deflated and hopefully gets a bit
> smaller, so if we really care that level of detail, it might be
> worth to do (deflate_size*3/2) or something like that here, use
> delta with or without deflate whichever is smaller, and mark the
> uncompressed delta with a different tag ("uncompressed delta"?).
> And for symmetry, to deal with uncompressible data, we may want
> to have "uncompressed literal" as well.

Nah...  Please just forget that.  ;-)


Nicolas


* Re: [ANNOUNCE] Git wiki
From: Linus Torvalds @ 2006-05-05 18:31 UTC (permalink / raw)
  To: Petr Baudis; +Cc: linux, git
In-Reply-To: <20060505181540.GB27689@pasky.or.cz>



On Fri, 5 May 2006, Petr Baudis wrote:
> 
> The automatic vs. explicit movement tracking is a lot more
> controversial. Explicit movement tracking is pretty easy to provide for
> file-level movements, it's just that the user says "I _did_ move file
> A to file B" (I never got the Linus' argument that the user has no idea
> - he just _performed_ the move, also explicitly, by calling *mv).

THE USER DID NO SUCH THING.

Moving data around happens with a whole lot more than "mv".

It happens with patches (somebody _else_ may have done an "mv", without 
using git at all), and it happens with editors (moving data around until 
most of it exists in another file).

So doing "*mv" is just a special case.

And supporting special cases is _wrong_. If you start depending on data 
that isn't actually dependable, that's WRONG.

There's another reason why encoding movement information in the commit is 
totally broken, namely the fact that a lot of the actions DO NOT WALK THE 
COMMIT CHAIN!

Try doing

	git diff v1.3.0..

and think about what that actually _means_. Think about the fact that it 
doesn't actually walk the commit chain at all: it diffs the trees between 
v1.3.0 and the current one. What if the rename happened in a commit in the 
middle?

The "track contents, not intentions" approach avoids both these things. 
The end result is _reliable_, not a "random guess".
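
As an illustration of working from contents alone, here is a small
Python sketch that pairs up deleted and added paths between two tree
snapshots by text similarity.  It is not git's actual rename detection:
the use of difflib, the 0.5 threshold, and the `path -> text` dicts are
all assumptions made for the example.

```python
from difflib import SequenceMatcher


def detect_renames(old: dict, new: dict, threshold: float = 0.5) -> dict:
    """Guess renames between two snapshots given as path -> text.

    Only paths that vanished are compared against paths that appeared;
    no history between the two snapshots is consulted.
    """
    deleted = {p: text for p, text in old.items() if p not in new}
    added = {p: text for p, text in new.items() if p not in old}
    renames = {}
    for old_path, old_text in sorted(deleted.items()):
        best_path, best_score = None, threshold
        for new_path, new_text in sorted(added.items()):
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames[old_path] = best_path
            del added[best_path]  # each new path is matched at most once
    return renames
```

The same function gives the same answer whether the two snapshots are
one commit apart or a thousand, because it only ever looks at the two
end points.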

Adding file movement note to commits is simply WRONG.

Why does this come up every three months or so? I was right the first 
time. You'd think that as time passes, people would just notice more and 
more how right I was and am, instead of forgetting and bringing this 
idiotic idea up over and over and over again.

		Linus


* Re: [ANNOUNCE] Git wiki
From: Jakub Narebski @ 2006-05-05 18:27 UTC (permalink / raw)
  To: git
In-Reply-To: <20060505181540.GB27689@pasky.or.cz>

Petr Baudis wrote:

> The automatic vs. explicit movement tracking is a lot more
> controversial. Explicit movement tracking is pretty easy to provide for
> file-level movements, it's just that the user says "I _did_ move file
> A to file B" (I never got the Linus' argument that the user has no idea
> - he just _performed_ the move, also explicitly, by calling *mv).
> 
> However, I guess the explicit movement tracking completely fails if you
> go sub-file (without being extremely bothersome for the user) - you
> would have to have control over the editor and the clipboard and even
> then I'm not sure if you could reach any sensible results.

If I remember correctly, there are some problems when explicit file-level
content movement tracking (a.k.a. file rename tracking) is done via the
equivalent of file-ids, inodes, or persistent names, although it works for
many (most?) cases.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [ANNOUNCE] Git wiki
From: Petr Baudis @ 2006-05-05 18:20 UTC (permalink / raw)
  To: linux; +Cc: git
In-Reply-To: <20060505181540.GB27689@pasky.or.cz>

Dear diary, on Fri, May 05, 2006 at 08:15:41PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> There are really two distinctions here which should be kept separate:
> automatic vs. explicit movement tracking and file-level vs.
> subfile-level movement tracking.

I should have revised this paragraph before sending the mail out, I
ended up sorting out my thoughts on the subject as I wrote the mail. The
two aspects end up so tied that it makes sense to mingle them. Examining
them separately here hopefully still sheds some light on the possible
reasoning behind the Git design decisions.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: [ANNOUNCE] Git wiki
From: Petr Baudis @ 2006-05-05 18:15 UTC (permalink / raw)
  To: linux; +Cc: git
In-Reply-To: <20060505005659.9092.qmail@science.horizon.com>

Dear diary, on Fri, May 05, 2006 at 02:56:59AM CEST, I got a letter
where linux@horizon.com said that...
> But, as Linus has pointed out, this is a very partial solution which
> introduces a lot of difficulties elsewhere.  File renaming is a subset of
> the general class of code reorganizations.  Source files will be split,
> merged, and have functions moved back and forth.  You want the patch to
> find the code it applies to even if that code was moved.
> 
> And that can be done by taking a more global view of the patch.
> Identical file names is only a heuristic.  If the hunk on branch A
> can't find a place to apply on the same file in branch B, then
> you have to look a little harder, either at changes from branch B
> that introduce matching code elsewhere, or perhaps looking
> through history for a change that removed the match from the
> obvious place to see if it added a match elsewhere.

There are really two distinctions here which should be kept separate:
automatic vs. explicit movement tracking and file-level vs.
subfile-level movement tracking.

The automatic vs. explicit movement tracking is a lot more
controversial. Explicit movement tracking is pretty easy to provide for
file-level movements, it's just that the user says "I _did_ move file
A to file B" (I never got the Linus' argument that the user has no idea
- he just _performed_ the move, also explicitly, by calling *mv).

However, I guess the explicit movement tracking completely fails if you
go sub-file (without being extremely bothersome for the user) - you
would have to have control over the editor and the clipboard and even
then I'm not sure if you could reach any sensible results.

I still dislike automated movement tracking for whole files, but I'm
reconciled to it, because it is probably the only really sensible way
to implement subfile-level tracking.  It would not be hard to implement
using pickaxe (actually, I believe it was near the top of Junio's TODO
a few weeks ago) and a similarity detector comparing the new and old
versions (if they are dissimilar enough, check whether that or a similar
hunk was added somewhere else in the same commit; well, at least the
idea sounds simple).

One obvious problem is ambiguity - several similar files are renamed
to other similar files, and now how do you decide which version to
choose? Merge the change to all the new files? Only to some? Panic?
I wonder how the current recursive strategy deals with that.
Of course, this case sounds quite artificial and rare for whole files,
but I suspect that it will be much more common once you do not deal with
files but just hunks, moving bits of code around.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: [ANNOUNCE] Git wiki
From: Linus Torvalds @ 2006-05-05 17:48 UTC (permalink / raw)
  To: Petr Baudis; +Cc: linux, Git Mailing List
In-Reply-To: <20060505163629.GZ27689@pasky.or.cz>



On Fri, 5 May 2006, Petr Baudis wrote:
> 
> It's a philosophical question here, but I'd say that Git is much closer
> to Monotone than to any other version control system

Some historical background..

Before I dropped BK, I ended up being involved in trying to get Larry and 
Tridge to come to some agreement about how to solve the issues Tridge had 
with BK not being open-source. That actually went on for maybe two months 
or so, and I kept on hoping that we'd find some acceptable middle ground. 

I thought we could find something that would actually work for everybody: 
to hopefully both make BK technically better, _and_ to make the end result 
more palatable to the "free software or bust" contingent.

One of the suggestions that I tried to push as an acceptable middle ground 
was to make a "generic" BK repository export format, so that people who 
didn't want to use BK could still get all the information, and not in a 
broken format like CVS (yes, CVS makes sense as an interchange format, 
since _everybody_ speaks CVS, but it's a horrible, horrible, horrible 
format from any technical standpoint).

My example export format was really a strange mixture of patches with 
parenthood information, where the history information was described with 
hashes (MD5 rather than SHA1, but that was just an implementation thing, 
and mostly because BK used MD5 sums). Not something really useful as a 
real SCM, but it wasn't designed for that - it was just meant to be a 
useful and unambiguous interoperability format.

Now, that didn't work out, and I was a little bummed. I thought it would 
have made both sides happy, because it would actually have been a better 
format than CVS (and yes, I'm somewhat biased: in my opinion, having a 
million monkeys throwing crap at the walls and encoding the information in 
the patterns on monkey shit is a better format than CVS), so it would 
actually have improved BK, while also making it possible to interoperate 
if you didn't want to use BK itself.

But Tridge didn't believe that it would actually have exported all the 
information in a BK tree, even if both I and Larry told him it would. I'm 
not a hundred percent sure that Larry would have gone for the export 
format either, but hey, one sign of a good compromise is that neither side 
really gets what they really want. Whatever. It didn't work.

So it didn't actually resolve the deadlock, but when it became clear that 
I couldn't work with BK any more, I thought I might use something like 
that "patch + parenthood" representation as a way to maintain my tree 
while looking at other alternatives.

So in many ways, when I started looking around for distributed SCM's, I 
came into the game with the background of keeping the history around as 
chains of hashes describing it, and then just having patches to describe 
the differences between versions.

So that was really my "fallback" position: if nothing out there worked, 
I'd rather go back to lists of patches than use CVS. 

Now, if you keep track of just patches, one of the issues is that you 
can't afford to re-create the tree every time by walking patches forward 
from the beginning, so I also was planning to have a "cache" that 
maintained the current state of the tree as a separate state from the 
working tree, so that I would always have the "working tree" and the 
"result of patches up to this moment" as two separate things (so that I 
could do the "bk diff" that I was used to doing to see the difference 
between my last state and the current state of the working tree).

In other words, I was already working on the git "index" file. And I was 
planning to just have a patch-based system behind it, with a hashed 
history. Kind of "quilt with history and an index to speed things up".

The index itself would be backed-up with whole files (all hidden in the 
".dircache" directory), and the patch series would thus normally never 
actually be _used_. So the inefficiency of working with patches would 
never be much of an issue. A "commit" would create a new patch from the 
current working directory and the previous shadow tree, and update the 
shadow tree and add a new entry to the history list.

And then I found Monotone.

Now, monotone was slow. Monotone was so _horrendously_ slow that I had to 
do special hacks just to import _one_ version of Linux into it in less 
than two hours. It was something stupid like an O(N**3) algorithm in the 
number of filenames (and the kernel had 17,291 files at that time: 
v2.6.12-rc2), and it was just totally unusable for me.

I also thought (and still think) that the whole signing thing was a waste 
of time and misdesigned, and I obviously am not a huge fan of databases. 
So in many ways I disliked the monotone implementation decisions (and some 
of its design decisions). But at the same time, I immediately liked the 
SHA1 object naming concept of Monotone.

It also already matched how I had conceptually planned on doing the 
history anyway, and had some ideas for, but it took that whole "history 
hashing" all the way.

And thus git was born. 

So git really has three parents. In a very real sense, BK (or, perhaps 
more appropriately - the way I personally used BK, which is not 
necessarily how others have used it) was the biggest thing from the 
standpoint of what I wanted my _workflow_ to be like. It was simply how I 
had done things for the last few years, so a lot of my mental model for 
how things are supposed to _work_ came from BK. 

I still don't think people give Larry enough credit for actually pushing 
this whole distributed SCM thing as a _usable_ model. Very few of the 
open-source distributed SCM's are actually usable even today, and as far 
as I've been able to gather, the commercial ones aren't really any closer 
either. Larry didn't have the kind of examples of what _can_ work that I 
had.

The other parent was the stupid "series of patches" model, which was what 
really resulted in the "index" thing. I realize that people don't always 
much like the index, but it's really a pretty central part of git history, 
and one of the distinguishing marks of git. It may be trivial, and to some 
degree it's been overshadowed by all the tree operations we do (the 
combination of revision walking and tree diffing), but it was very central 
to how git came to be.

The index also ended up being central to how we did merges - even if some 
day we may end up doing more of that on a pure tree level (ie the current 
git-merge-tree model), I think the way we ended up doing merges owes a lot 
to the index as a staging area.

(Historically, the "index" was called the "cache". Exactly because it came 
from the notion of "caching" the top commit state in a patch series, and 
then working with patches either backwards or forwards from that top 
cached state. Similarly, we didn't have a ".git" directory: it was 
called ".dircache", exactly because it was all about caching the state 
of the previous commit directory layout).

And finally, Monotone for the "everything is an object named by its SHA1" 
model, which to some degree is perhaps the central - or at least the most 
obvious - part of git. It largely was designed really just to be the 
"backing store" for the "cache", and to not be _that_ important. That also 
explains why I didn't worry too much about disk usage etc initially: the 
object store wasn't even the most important part, and I envisioned just 
moving old objects that weren't needed into some "backup storage" kind of 
thing.

			Linus


* Re: [PATCH] binary patch.
From: Junio C Hamano @ 2006-05-05 17:38 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605051128100.28543@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

>> +	delta = NULL;
>> +	deflated = deflate_it(two->ptr, two->size, &deflate_size);
>> +	if (one->size && two->size) {
>> +		delta = diff_delta(one->ptr, one->size,
>> +				   two->ptr, two->size,
>> +				   &delta_size, deflate_size);
>
> Here you probably want to use deflate_size-1 (deflate_size can't be 0).

I am not sure the -1 is worth it here.

The delta is going to be deflated and hopefully gets a bit
smaller, so if we really care about that level of detail, it might
be worth doing (deflate_size*3/2) or something like that here, use
delta with or without deflate whichever is smaller, and mark the
uncompressed delta with a different tag ("uncompressed delta"?).
And for symmetry, to deal with uncompressible data, we may want
to have "uncompressed literal" as well.
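
The "whichever is smaller" selection could be sketched roughly as
follows. This is only an illustration: the enum tags and the
pick_encoding() helper are hypothetical names, not part of the patch,
and a size of 0 is assumed to mean "no delta was produced".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the three candidate encodings. */
enum patch_encoding {
	ENC_DEFLATED_LITERAL,
	ENC_DEFLATED_DELTA,
	ENC_RAW_DELTA
};

/*
 * Pick the smallest candidate.  A delta size of 0 means "no delta
 * was produced" (assumption of this sketch); ties keep the earlier
 * candidate.
 */
static enum patch_encoding pick_encoding(size_t deflated_literal,
					 size_t raw_delta,
					 size_t deflated_delta)
{
	enum patch_encoding best = ENC_DEFLATED_LITERAL;
	size_t best_size = deflated_literal;

	/* The deflated literal is never empty (deflate_size can't be 0). */
	assert(deflated_literal > 0);

	if (raw_delta && raw_delta < best_size) {
		best = ENC_RAW_DELTA;
		best_size = raw_delta;
	}
	if (deflated_delta && deflated_delta < best_size)
		best = ENC_DEFLATED_DELTA;
	return best;
}
```

An "uncompressed literal" candidate would be a fourth tag handled the
same way.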

>> +		orig_size = delta_size;
>> +		if (delta)
>> +			delta = deflate_it(delta, delta_size, &delta_size);
>
> Here you're leaking the original delta buffer memory.

Indeed.  Thanks.
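
One way to avoid the leak, as a rough sketch: keep the original
pointer, deflate into a new buffer, then free the original.
fake_deflate() below is a stand-in written for this example (it just
copies the data so the sketch builds without zlib), and
deflate_in_place() is a hypothetical helper name; neither is the
actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for deflate_it() (assumption): returns a newly allocated
 * buffer and stores its length in *result_size.  It only copies the
 * input; a real version would call zlib.
 */
static unsigned char *fake_deflate(const unsigned char *data,
				   unsigned long size,
				   unsigned long *result_size)
{
	unsigned char *out = malloc(size ? size : 1);

	assert(out != NULL);
	memcpy(out, data, size);
	*result_size = size;
	return out;
}

/* Replace *buf with its "deflated" form and release the original. */
static void deflate_in_place(unsigned char **buf, unsigned long *size)
{
	unsigned char *orig = *buf;

	*buf = fake_deflate(orig, *size, size);
	free(orig);	/* this is the free that the quoted hunk lacks */
}
```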


* Re: [ANNOUNCE] Git wiki
From: Jakub Narebski @ 2006-05-05 16:47 UTC (permalink / raw)
  To: git
In-Reply-To: <7vejz8241m.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> Petr Baudis <pasky@suse.cz> writes:
> 
>> But the non-obviously important part here to note is that the branch B
>> merely "corrects a typo on a comment somewhere" - the latest versions in
>> branch A and branch B are always compared for renames, therefore if
>> branch A renamed the file and branch B sums up to some larger-scale
>> changes in the file, it still won't be merged properly.
> 
> I probably am guilty of starting this misinformation, but the
> code does not compare the latest in A and B for rename
> detection; it compares (O, A) and (O, B).
> 
> But the end result is the same - what you say is correct.  If a
> path (say O to A) that renamed has too big a change, then no
> matter how small the changes are on the other path (O to B),
> rename detection can be fooled.  We could perhaps alleviate it
> by following the whole commit chain.

Or perhaps by helper information about renames, entered either by git-mv
(and git-cp) or rename detection at commit, e.g. in the following form

        note at <commit-sha1> was-in <pathname>
        note at <commit-sha1> was-in <pathname>

(the obvious limitation of this "note header" solution is that it wouldn't
work for file and directory names containing "\n"). I'm not sure if
<pathname> should be just the basename or the full pathname.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [ANNOUNCE] Git wiki
From: Petr Baudis @ 2006-05-05 16:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vejz8241m.fsf@assigned-by-dhcp.cox.net>

Dear diary, on Fri, May 05, 2006 at 11:51:01AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@suse.cz> writes:
> 
> > But the non-obviously important part here to note is that the branch B
> > merely "corrects a typo on a comment somewhere" - the latest versions in
> > branch A and branch B are always compared for renames, therefore if
> > branch A renamed the file and branch B sums up to some larger-scale
> > changes in the file, it still won't be merged properly.
> 
> I probably am guilty of starting this misinformation, but the
> code does not compare the latest in A and B for rename
> detection; it compares (O, A) and (O, B).

Where O = LCA(A,B) (modulo recursiveness)? Yes, that is what I meant to
say but I phrased it wrong, sorry.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: [ANNOUNCE] Git wiki
From: Petr Baudis @ 2006-05-05 16:36 UTC (permalink / raw)
  To: linux; +Cc: git
In-Reply-To: <20060505005659.9092.qmail@science.horizon.com>

Dear diary, on Fri, May 05, 2006 at 02:56:59AM CEST, I got a letter
where linux@horizon.com said that...
> Actually, AFAICT from looking at the mailing list history, it's not dirty
> politics: the tie-breaker was the support and enthusiasm of the mercurial
> developers.  It passed with only minor comment on the git mailing list,
> but it was a Big Thing to the hg folks.
> 
> There are ups and downs.  OpenSolaris is definitely the big fish in
> the mercurial pond (that wasn't *meant* to sound like a recipe for
> heavy metal toxicity), and will get lots of attention, but git has more
> real-world experience.  The big fish in the git pond is Linus and Linux.
> 
> In any case, mercurial and git are really very similar, far closer
> to each other than any third system, so it's not like the decision is
> a descent into heresy.  Hopefully some useful cross-pollination
> can occur, and converting history from one to the other would be
> simple if anyone ever wanted to.

It's a philosophical question here, but I'd say that Git is much closer
to Monotone than to any other version control system - I think it can be
described as the Monotone model with a more elegant implementation (for
some, at least ;), no certificates and a restriction of one head per
branch.
And another important difference is that Monotone has persistent file
identifiers, but I think that's about the only thing that would make
Monotone more "file orientated".

I'm not much of a Mercurial pro but it appears to me that the
architectural differences there are larger, especially wrt. the revlogs
and an overall much more file-oriented model.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: Unresolved issues #2 (shallow clone again)
From: Linus Torvalds @ 2006-05-05 15:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e3fqb9$hed$1@sea.gmane.org>



On Fri, 5 May 2006, Jakub Narebski wrote:

> Linus Torvalds wrote:
> 
> > So what you'd get is a _really_ cut down history that doesn't contain any
> > commit history at all (just distinct "points in commit history time"), but
> > that _does_ contain all the objects that the commits point to.
> 
> So we would get 'skin-deep clone' rather than 'shallow' one?

Well, it's really shallow, but perhaps more importantly, I think it should 
be really easy, and have totally unambiguous semantics. Never any question 
of how far back to go, and I think we already really do have all the 
support logic for doing it.

Now, we don't actually expose the internal "no_walk" flag with a 
"--no-walk" command line argument parsing, but that's a one-liner.

There's another approach that might be a bit friendlier, which is again to 
walk only the objects of the WANT/HAVE things, but then _do_ walk the 
history for just commit objects. Something close to what I think the http 
fetch thing does if you pass it "-c -t". That too shouldn't require too 
much extra complexity, and it would mean that "git log" at least works.

Of course, that would require another slight difference to "rev-list.c", 
where we'd only recurse into trees of selected commit objects (ie we'd 
have to mark the HAVE/WANT commits specially, but it's not exactly 
complex either).

Of course, the complexity of _both_ of these approaches is really in the 
fsck stage, and all the crud you need to then do other things with these 
pared-down repos. For example, do you allow cloning? And do you just 
automatically notice that you're cloning a shallow repo, and only do a 
shallow clone? Etc etc..

		Linus


* Re: [PATCH] binary patch.
From: Nicolas Pitre @ 2006-05-05 15:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vy7xgzsiu.fsf@assigned-by-dhcp.cox.net>

On Fri, 5 May 2006, Junio C Hamano wrote:

> > Yeah, things still to be done are deflating both delta and
> > optionally perhaps use just deflate without delta for "new file"
> > codepath.

> +	/* We could do deflated delta, or we could do just deflated two,
> +	 * whichever is smaller.
> +	 */
> +	delta = NULL;
> +	deflated = deflate_it(two->ptr, two->size, &deflate_size);
> +	if (one->size && two->size) {
> +		delta = diff_delta(one->ptr, one->size,
> +				   two->ptr, two->size,
> +				   &delta_size, deflate_size);

Here you probably want to use deflate_size-1 (deflate_size can't be 0).

> +		orig_size = delta_size;
> +		if (delta)
> +			delta = deflate_it(delta, delta_size, &delta_size);

Here you're leaking the original delta buffer memory.


Nicolas


* Re: Unresolved issues #2 (shallow clone again)
From: Carl Worth @ 2006-05-05 15:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0605050806370.3622@g5.osdl.org>


On Fri, 5 May 2006 08:10:50 -0700 (PDT), Linus Torvalds wrote:
> 
> Namely just have a mode to "git-send-pack" that uses the "--no-walk" flag 
> to generate the object list to send.
> 
> What that does is to never walk the object history: so it will just use 
> the "I HAVE THESE" and "I WANT THESE" commit references to directly 
> generate the list of commits, and then walks the trees to generate the 
> list of trees/blobs that differ between the particular end-points.

Oh, I think that's a great idea. I had proposed cutting the WANT list
down to a single commit, but I wasn't clever enough to think to also
cut down the HAVE walking to solve the problems that would have been
introduced by shallow clones.

And I think the resulting behavior is quite reasonable. I think one
can argue a sort of zero-one-infinity rule here. With history, either
you don't care about any of it, or else you really should care about
all of it.

-Carl



* Re: Unresolved issues #2 (shallow clone again)
From: Jakub Narebski @ 2006-05-05 15:18 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0605050806370.3622@g5.osdl.org>

Linus Torvalds wrote:

> So what you'd get is a _really_ cut down history that doesn't contain any
> commit history at all (just distinct "points in commit history time"), but
> that _does_ contain all the objects that the commits point to.

So we would get 'skin-deep clone' rather than 'shallow' one?

-- 
Jakub Narebski
Warsaw, Poland


* Re: Unresolved issues #2 (shallow clone again)
From: Linus Torvalds @ 2006-05-05 15:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carl Worth, git
In-Reply-To: <7v1wv92u7o.fsf@assigned-by-dhcp.cox.net>



On Thu, 4 May 2006, Junio C Hamano wrote:
> 
> Jokes aside, I think listing the updated conversation elements
> like you did above is a good step forward.
> 
> The vocabulary we would want from the requestor side is probably
> (at least):
> 
> 	I WANT to have these
>         I HAVE these
>         I'm MISSING these
>         Don't bother with these this time around (--since, ^v2.6.16, ...)

Actually, I think we can do something simpler that _most_ people might be 
happy with.

Namely just have a mode to "git-send-pack" that uses the "--no-walk" flag 
to generate the object list to send.

What that does is to never walk the object history: so it will just use 
the "I HAVE THESE" and "I WANT THESE" commit references to directly 
generate the list of commits, and then walks the trees to generate the 
list of trees/blobs that differ between the particular end-points.

We already have the "no_walk" flag internally, we just don't expose it.

So what you'd get is a _really_ cut down history that doesn't contain any 
commit history at all (just distinct "points in commit history time"), but 
that _does_ contain all the objects that the commits point to.
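
To make the difference concrete, here is a toy sketch. It is not the
rev-list code: it assumes a strictly linear history stored in a
hypothetical parent[] array, and list_commits() is a name made up for
this example. With walking, every commit from WANT back to (but
excluding) HAVE is listed; with no_walk, only the WANT end-point is.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PARENT (-1)

/*
 * Toy model: commit i has at most one parent, parent[i].
 * Writes the commits that would be listed for one WANT and one HAVE
 * into out[] (at most out_cap entries) and returns how many.
 */
static size_t list_commits(const int *parent, size_t ncommits,
			   int want, int have, int no_walk,
			   int *out, size_t out_cap)
{
	size_t n = 0;
	int c = want;

	assert(want >= 0 && (size_t)want < ncommits);

	while (c != NO_PARENT && c != have && n < out_cap) {
		out[n++] = c;
		if (no_walk)
			break;	/* never follow parent links */
		c = parent[c];
	}
	return n;
}
```

In the no_walk case HAVE would still serve as the base for the tree
diff; it just no longer bounds a history walk.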

		Linus


* [PATCH] Fix for config file section parsing.
From: sean @ 2006-05-05 13:49 UTC (permalink / raw)
  To: git

Currently, if the target key has a section that matches
the initial substring of another section we mistakenly
believe we've found the correct section.  To avoid this
problem, ensure that the section lengths are identical
before comparison.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>

---

Here's an example of the problem:

$ git repo-config a.b.c d
$ cat .git/config
[a.b]
        c = d
$ git repo-config a.x y
$ cat .git/config
[a.b]
        c = d
        x = y
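
For illustration only, the old and new checks can be compared in
isolation. The two helpers below are hypothetical stand-ins for the
condition inside store_aux(): "key" is the key read from the file,
"target" is the key being stored, and "baselen" is the length of the
target's section name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old check: only the first baselen bytes are compared. */
static int section_matches_prefix(const char *key, const char *target,
				  size_t baselen)
{
	assert(key != NULL && target != NULL);
	return strncmp(key, target, baselen) == 0;
}

/*
 * Patched check: the section part of key (everything before its
 * last '.') must also be exactly baselen bytes long.
 */
static int section_matches_exact(const char *key, const char *target,
				 size_t baselen)
{
	const char *dot;

	assert(key != NULL && target != NULL);
	dot = strrchr(key, '.');
	if (!dot)
		return 0;	/* no section part at all */
	return (size_t)(dot - key) == baselen &&
	       strncmp(key, target, baselen) == 0;
}
```

With the example above, the existing key "a.b.c" and the target "a.x"
(baselen 1) match under the old check but not under the patched one.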


 config.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

f956fd1a6b1d3c82d9bc735427b58ec41d9e5dd1
diff --git a/config.c b/config.c
index 4e1f0c2..a3e14d7 100644
--- a/config.c
+++ b/config.c
@@ -335,8 +335,9 @@ static int store_aux(const char* key, co
 			store.offset[store.seen] = ftell(config_file);
 			store.state = KEY_SEEN;
 			store.seen++;
-		} else if(!strncmp(key, store.key, store.baselen))
-			store.state = SECTION_SEEN;
+		} else if (strrchr(key, '.') - key == store.baselen &&
+			      !strncmp(key, store.key, store.baselen))
+					store.state = SECTION_SEEN;
 	}
 	return 0;
 }
-- 
1.3.2.g2fb9


