Git development


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Jakub Narebski @ 2006-04-25 10:08 UTC (permalink / raw)
  To: git
In-Reply-To: <7vmzeax9gj.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Actually, this can be resolved using automatic history grafts to the
>> remote repository we pulled from, if the commit is not present on local
>> side (and removing graft when commit appears on local side).
> 
> You do not even need history grafts.  The "cherry-pick source"
> was a bad example.  Maybe using "related" as a way to implement
> "bind" would have been a better example -- we want inter-commit
> relationship that requires connectivity but without ancestry for
> them.
> 
> You can just have two kinds of 'related'.  One that means
> connectivity, the other that does not.

Good idea.

Another problem for core git, but I think orthogonal to the "related"/"note"
distinction is if the relation (or note) should be used as helper in
merges, perhaps by some agreed upon convention on the
comment/description/value part (e.g. "mergehelper" or "mergeinfo").

BTW, in your first example, what should the "key" relation mean?
"cherrypick" (which should be "note" as we don't need connectivity) is
quite obvious (or equivalent "origin" if rebase wouldn't destroy the branch
picked from).

-- 
Jakub Narebski
Warsaw, Poland


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Junio C Hamano @ 2006-04-25  9:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: jnareb, git
In-Reply-To: <e2kp27$8ne$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Actually, this can be resolved using automatic history grafts to the remote
> repository we pulled from, if the commit is not present on local side (and
> removing graft when commit appears on local side).

You do not even need history grafts.  The "cherry-pick source"
was a bad example.  Maybe using "related" as a way to implement
"bind" would have been a better example -- we want inter-commit
relationship that requires connectivity but without ancestry for
them.

You can just have two kinds of 'related'.  One that means
connectivity, the other that does not.

At that point, the latter does not even have to belong to the
core.  The Porcelains can make use of it as long as they agree
on a common convention and use that information consistently.
It does not even have to be "related" (which implies what comes
after "related" is an object name) -- it could be an arbitrary
metainformation that the core does not have to care.  So an
updated suggestion is to have optional 0-or-more "note" and
"related" fields.  'note' is followed by one token and
additional information.  'related' is followed by an object name
that needs the additional connectivity, and additional
information.  For example:

    tree 0aaa3fecff73ab428999cb9156f8abc075516abe
    parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
    parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
    related a0e7d36193b96f552073558acf5fcc1f10528917 bind linux-2.6
    note cherrypick v1.3.0~12
    note origin "next" branch at junio's repository
    note rename "foobar" to "barboz"
    author Junio C Hamano <junkio@cox.net> 1145943079 -0700
    committer Junio C Hamano <junkio@cox.net> 1145943079 -0700

    Merge branch 'pb/config' into next

The core side can say "Oh, this is a 'note' so I do not care
what it is -- I'd just skip to the end of line", while
Porcelains that "cat-file commit" this object can grep for
"note" and look at the first token to figure out what to do with
it.  The core needs to be aware of the 'related' ones and does
the connectivity crud using the object name, and Porcelains can
use the rest of the line to do intelligent things.
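
The split described above can be sketched in a few lines of Python. This
is only an illustration under assumptions: 'note' and 'related' are the
headers proposed in this message, not part of any released commit
format, and the helper names (split_commit, core_links,
porcelain_notes) are hypothetical.

```python
# Sketch only: 'note' and 'related' are the headers proposed in this thread,
# not fields of a released git commit format. Helper names are hypothetical.
COMMIT = """\
tree 0aaa3fecff73ab428999cb9156f8abc075516abe
parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
related a0e7d36193b96f552073558acf5fcc1f10528917 bind linux-2.6
note cherrypick v1.3.0~12
note origin "next" branch at junio's repository
note rename "foobar" to "barboz"
author Junio C Hamano <junkio@cox.net> 1145943079 -0700
committer Junio C Hamano <junkio@cox.net> 1145943079 -0700

Merge branch 'pb/config' into next
"""


def split_commit(raw):
    """Return (header_lines, message); the first blank line separates them."""
    header, _, message = raw.partition("\n\n")
    return header.split("\n"), message


def core_links(header_lines):
    """What the core would look at: object names that need connectivity."""
    links = []
    for line in header_lines:
        keyword, _, rest = line.partition(" ")
        if keyword == "related":
            name, _, info = rest.partition(" ")
            links.append((name, info))
        # 'note' lines are skipped to end of line without interpretation.
    return links


def porcelain_notes(header_lines):
    """What a Porcelain would look at: notes grouped by their first token."""
    notes = {}
    for line in header_lines:
        keyword, _, rest = line.partition(" ")
        if keyword == "note":
            token, _, info = rest.partition(" ")
            notes.setdefault(token, []).append(info)
    return notes


headers, message = split_commit(COMMIT)
print(core_links(headers))
print(porcelain_notes(headers)["cherrypick"])
```

The core-side function never inspects what follows the object name, and
the Porcelain-side function never needs an object database, which is the
division of labour the proposal is after.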

Now, it is debatable that such an extra information like 'note'
belongs to the header that the core deals with.  IIRC, Linus
argued that he does not want to have arbitrary cruft in the
header and instead to have it as a comment in the message part
when somebody talked about recording renames in the commit.

We have the author and the committer fields that are not used by
the core (only half of the committer field is used by the core
to date-order the commit list).  But I suspect most of the time
such metainformation is useless to the end-user humans, so if I
have to vote I'd rather put them in the header, have the UI
layer filter them out unless asked when presenting the commit to
the humans, and give Porcelains freedom to do whatever they
wish.

Things are easier to filter out when they properly follow some
structure, so I'd rather have "cruft" in the header.  Right now,
git-cherry-pick ends the commit message with "(cherry picked
from $commit commit)".  In theory, rebase can notice by parsing
commit log message, but it certainly would be easier and more
robust if we had a 'note' facility and a well established
convention to use it.
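
To show why parsing the log message is the fragile option, here is a
rough Python sketch of what a tool would have to do today. The pattern
follows the wording quoted above, "(cherry picked from $commit commit)";
the exact wording may differ between git versions, and the function name
is hypothetical.

```python
import re

# Assumption: the trailer has the form quoted in the message above,
# with a full 40-hex-digit object name in place of $commit.
PICKED = re.compile(r"\(cherry picked from ([0-9a-f]{40}) commit\)\s*\Z")


def cherry_pick_source(message):
    """Return the source commit named at the end of the message, or None."""
    m = PICKED.search(message)
    return m.group(1) if m else None


msg = (
    "Fix foo\n"
    "\n"
    "(cherry picked from 0032d548db56eac9ea09b4ba05843365f6325b85 commit)\n"
)
print(cherry_pick_source(msg))
```

Anything that edits the message (an amended commit, a reworded trailer)
silently defeats this, which is the robustness argument for a structured
header.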


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Jakub Narebski @ 2006-04-25  9:10 UTC (permalink / raw)
  To: git
In-Reply-To: <BAYC1-PASMTP116C6B217F25F2ADAF0C67AEBF0@CEZ.ICE>

sean wrote:

> On Tue, 25 Apr 2006 04:34:36 -0400
> sean <seanlkml@sympatico.ca> wrote:
> 
>> If you're cherry-picking from a disposable branch, then you don't want to
>> include a link to it in your new commit.  Once you include the link, the
>> source commit should be protected from pruning just like any other piece
>> of history.  Otherwise there's no way for fsck-objects to know if a
>> missing
>> object means corruption or not.  So you need a way at commit time to
>> request the explicit linkage.
> 
> Actually this implies that anyone pulling just this branch would
> potentially
> also end up pulling large portions of other branches too.   So maybe
> making
> them optional is The Right Thing.  In which case, we'd just have to accept
> these as weaker than the parentage links and fsck-objects et. al. would
> have to tolerate such missing commits.

Actually, this can be resolved using automatic history grafts to the remote
repository we pulled from, if the commit is not present on local side (and
removing graft when commit appears on local side).

I was more concerned about size of repository required by keeping some parts
of history which would be purged without those "related" links. But your
concern (pulling) is more important.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: sean @ 2006-04-25  8:57 UTC (permalink / raw)
  To: jnareb, git
In-Reply-To: <20060425043436.2ff53318.seanlkml@sympatico.ca>

On Tue, 25 Apr 2006 04:34:36 -0400
sean <seanlkml@sympatico.ca> wrote:

> If you're cherry-picking from a disposable branch, then you don't want to 
> include a link to it in your new commit.  Once you include the link, the 
> source commit should be protected from pruning just like any other piece 
> of history.  Otherwise there's no way for fsck-objects to know if a missing 
> object means corruption or not.  So you need a way at commit time to
> request the explicit linkage.

Actually this implies that anyone pulling just this branch would potentially
also end up pulling large portions of other branches too.   So maybe making
them optional is The Right Thing.  In which case, we'd just have to accept 
these as weaker than the parentage links and fsck-objects et al. would have 
to tolerate such missing commits.
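
A minimal Python sketch of such a tolerant check, assuming a toy object
store in which every object lists its "parent" links and its optional
"weak" links. The store layout and the function name are hypothetical,
and this is not how fsck-objects is implemented.

```python
def check_connectivity(objects, tips):
    """Walk from the tips; return (missing_required, missing_optional).

    Assumption: only the direct target of a weak link may be absent.
    If the target is present, its own parents are required again.
    """
    missing_required, missing_optional = set(), set()
    seen = set()
    stack = [(tip, True) for tip in tips]
    while stack:
        name, required = stack.pop()
        if name not in objects:
            (missing_required if required else missing_optional).add(name)
            continue
        if name in seen:
            continue
        seen.add(name)
        stack.extend((p, True) for p in objects[name]["parent"])
        stack.extend((w, False) for w in objects[name]["weak"])
    return missing_required, missing_optional


# C2 was cherry-picked from X, which has since been pruned.
repo = {
    "C2": {"parent": ["C1"], "weak": ["X"]},
    "C1": {"parent": [], "weak": []},
}
print(check_connectivity(repo, ["C2"]))
```

A missing parent is still reported as corruption; a missing weak target
is only recorded, so the caller can decide whether to mention it.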

So now that i've clearly come down in favor of both sides of this argument,
i'll leave the decision to smarter people than me.

Sean


* Re: [RFC] get_sha1(): :path and :[0-3]:path to extract from index.
From: Junio C Hamano @ 2006-04-25  8:46 UTC (permalink / raw)
  To: Uwe Zeisberger; +Cc: Linus Torvalds, git
In-Reply-To: <20060425083724.GA1663@informatik.uni-freiburg.de>

Uwe Zeisberger <zeisberg@informatik.uni-freiburg.de> writes:

>> This is a fairly straightforward patch to allow "get_sha1()" to
>> also have shorthands for blob objects in the current index.
>
> I sometimes want to have something like that:
>
> 	uzeisberger@io:~/gsrc/linux-2.6$ git cat-file blob v2.6.16:Makefile
>
> That is not a shortcut for objects in the current index, but for blobs
> in written trees.

That's already present in the "master".  You are responding to a
wrong message ;-).


* Re: [PATCH 3/4] Deprecate usage of git-var -l for getting config vars list
From: Uwe Zeisberger @ 2006-04-25  8:39 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, git
In-Reply-To: <20060424225930.14086.76174.stgit@machine.or.cz>

Hello Petr,

Petr Baudis wrote:
> +    my @gitvars = `git-repo-config -l`;
>      if ($?) {
> -       print "E problems executing git-var on the server -- this is not a git repository or the PATH is not set correcly.\n";
> +       print "E problems executing git-repo-config on the server -- this is not a git repository or the PATH is not set correcly.\n";

I didn't check the patch, but you may want to s/correcly/correctly/.

Best regards
Uwe

-- 
Uwe Zeisberger

http://www.google.com/search?q=5+choose+3


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: sean @ 2006-04-25  8:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e2kjul$ntq$1@sea.gmane.org>

On Tue, 25 Apr 2006 09:43:33 +0200
Jakub Narebski <jnareb@gmail.com> wrote:

> Perhaps there should be an option to specify that the link is optional, and
> the object pointed can be gone missing. For example for cherrypick the
> original cherry-picked commit can either be removed completely, e.g. when
> the original branch is deleted, or it can be modified breaking link when we
> rewrite history up to original commit on original branch.
> 
> Also all other commands which show commit (commit message at least) should
> be considered for including "related" links...

If you're cherry-picking from a disposable branch, then you don't want to 
include a link to it in your new commit.  Once you include the link, the 
source commit should be protected from pruning just like any other piece 
of history.  Otherwise there's no way for fsck-objects to know if a missing 
object means corruption or not.  So you need a way at commit time to
request the explicit linkage.

This might be useful for bug tracking front ends that could automatically 
show a hot fix migrating from devel, to testing, to release branches.  With 
Junio's proposal, perhaps there's even a better keyword for these particular 
linkages.

Sean.


* Re: [RFC] get_sha1(): :path and :[0-3]:path to extract from index.
From: Uwe Zeisberger @ 2006-04-25  8:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git
In-Reply-To: <7v7j5iph7f.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:
> [ NOTE! The reason I put "RFC" in the subject rather than "PATCH" is that 
>   I'm not 100% sure this isn't just a "shiny object" of mine rather than a 
>   really useful thing to do. What do people think? Have you ever wanted to 
>   access individual files in some random revision? Do you think this is 
>   useful? I think it's cool and _may_ be useful, but I'm not going to 
>   really push this patch. Consider it a throw-away patch unless somebody 
>   else finds it intriguing enough.. ]
> 
> This is a fairly straightforward patch to allow "get_sha1()" to
> also have shorthands for blob objects in the current index.
I sometimes want to have something like that:

	uzeisberger@io:~/gsrc/linux-2.6$ git cat-file blob v2.6.16:Makefile

That is not a shortcut for objects in the current index, but for blobs
in written trees.

It's easy to hack a script that does that.  Something like that[1]:

	#! /bin/sh

	eval `echo ${1} | sed 's/\\(.*\\):\\(.*\\)/commit=\"\\1^{}\"; file=\"\\2\"/'`

	tree=`git cat-file commit ${commit} | sed -n 's/tree //p'`

	blob=`git ls-tree -r ${tree} | awk "\\\$4 == \\"${file}\\" { print \\\$3 }"`

	git cat-file blob ${blob}

But if the rev-parser could handle that, that would be much finer.  Or
is there already a way to do this that I don't know?

Best regards
Uwe

[1] It's not tested and probably fails if there are some "bad"
characters in ${1} and could be implemented in a much cleverer way.

-- 
Uwe Zeisberger

http://www.google.com/search?q=0+degree+Celsius+in+kelvin


* [ANNOUNCE] GIT 1.3.1
From: Junio C Hamano @ 2006-04-25  8:04 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

The latest maintenance release GIT 1.3.1 is available at the
usual places:

	http://www.kernel.org/pub/software/scm/git/

	git-1.3.1.tar.{gz,bz2}			(tarball)
	RPMS/$arch/git-*-1.3.1-1.$arch.rpm	(RPM)

Just to let people who are new to the game know, 1.3.X (1<=X)
series is purely bugfix maintenance on top of 1.3.0 release, and
no new features will be added unless there is a compelling
reason (think of them like 2.6.16.X releases of the kernel,
slightly looser updates criteria).

There are four primary branches in git.git repository.

 * Releases in the 1.3.X series come from the "maint" branch, which
   was forked at 1.3.0.

 * All the new features and improvements start their life as
   topic branches that are merged to the "pu" (proposed updates)
   branch.  They are often incomplete and/or unstable.  You can
   think of "pu" as the -mm.

 * When these topic branches become stable enough, they are
   merged into "next" branch.  I personally run "next" branch
   for my work to trust my data to it, until very close to the
   next feature release.

 * After being cooked for a few days to a week in "next", these
   good changes graduate to the "master" branch.  Some of them
   die while being cooked there, but that does not happen very
   often (bad apples are culled while in "pu").

I make feature releases out of the "master" branch periodically,
which are tagged as X.Y.0 releases.  At that point, "maint"
branch is forked to prepare X.Y.1 and onward.

If you are an end user, the "maint" releases are the recommended
stable releases, but you could miss out on new features quickly.
Please do send in bug reports for them, but do not expect new
cool features to appear there.

If you want to stay current and stable, I would recommend to
track "master".

If you are a git developer, being aware of what is happening in
"next" would often be very helpful.  The infrastructure you plan
to base your change on may be in the process of being updated
and your changes to "master" can become useless when that
happens.

If you are a truly devoted git hacker, picking what is in "pu"
can be exciting and useful from time to time.

Post 1.3.0 "master" branch development currently contains major
rewrite of log/show/whatchanged infrastructure, and will be
getting updated pack-objects, faster write-tree, consolidated
"git diff" that is not a shell script, and hopefully many
others.  To reiterate, they will be part of 1.4.0 and no 1.3.X
series release will have them.

----------------------------------------------------------------

Changes since v1.3.0 are as follows:

Jonas Fonseca:
      Fix filename scaling for binary files

Junio C Hamano:
      git-merge: a bit more readable user guidance.
      pre-commit hook: complain about conflict markers.
      git-commit --amend: two fixes.
      pack-objects: do not stop at object that is "too small"
      mailinfo: decode underscore used in "Q" encoding properly.

Linus Torvalds:
      git-log produces no output

Nicolas Pitre:
      fix pack-object buffer size

Paul Mackerras:
      rev-parse: better error message for ambiguous arguments

Petr Baudis:
      Document git-var -l listing also configuration variables
      Document the configuration file

Santi Béjar:
      Reintroduce svn pools to solve the memory leak.

Serge E. Hallyn:
      socksetup: don't return on set_reuse_addr() error

Shawn Pearce:
      Document git-clone --reference


* Re: [BUG] gitk draws a wrong line
From: Uwe Zeisberger @ 2006-04-25  7:54 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17485.31716.452326.229628@cargo.ozlabs.ibm.com>

Hello Paul,

Paul Mackerras wrote:
> Uwe Zeisberger writes:
> 
> > and then going to commit 10c2df65060e1ab57b2f75e0749de0ee9b8f4810, 
> > I see a small superfluous line between the two commits under 10c2df.
> > 
> > But still worse, if I select the line going down from 10c2df and then
> > select its parent (i.e. c76b6b) I get a big line ending in the commit
> > descriptions and four lines ending in midair.
> 
> That is an X server bug, it seems.  Tk already clips vertices that it
> sends to the X server to be within a box that is no more than 32000
> pixels wide or high, but that seems not to be enough with some X
> servers.  What X server version are you using and what sort of video
> card?
It's a Debian system with XFree 4.3.0.dfsg.1-14 and according to lspci I
have

	0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]

> If you're feeling adventurous, you can rebuild Tk with the patch below
> (courtesy of D. Richard Hipp) and see if that fixes it.  If it does it
> proves that it is an X server bug.
OK, I tried that and it helped. 

I list my steps for people who want to "fix" it, too:

	Note: You need a deb-src line in your sources.list.  Moreover
	some packages are assumed to be installed (e.g. devscripts,
	fakeroot, vim :-))

	~$ mkdir src; cd src
	~/src$ apt-get build-dep tk8.4
	~/src$ apt-get source tk8.4
	~/src$ cd tk8.4-8.4.12
	~/src/tk8.4-8.4.12$ vim generic/tkCanvUtil.c

	[ apply the patch provided by Paul/D. Richard Hipp ]

	~/src/tk8.4-8.4.12$ dch -n

	[ write a sensible changelog entry ]

	~/src/tk8.4-8.4.12$ fakeroot dpkg-buildpackage
	...

	~/src/tk8.4-8.4.12$ cd ..
	~/src$ sudo dpkg -i tk8.4_8.4.12-1.1_i386.deb

I'm not entirely clear what this patch does.  From only reading it, I
assume it should only have an effect on rather big windows, right?

Do you know some more details about the bug?  Do you know to whom it
should be reported?

Thanks for your help
Uwe

-- 
Uwe Zeisberger

http://www.google.com/search?q=gravity+on+earth%3D


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Jakub Narebski @ 2006-04-25  7:43 UTC (permalink / raw)
  To: git
In-Reply-To: <7v7j5e2jv7.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> Here is a related but not necessarily competing idle thought.
> 
> How about an ability to "attach" arbitrary objects to commit
> objects?  The commit object would look like:
> 
>     tree 0aaa3fecff73ab428999cb9156f8abc075516abe
>     parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
>     parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
>     related a0e7d36193b96f552073558acf5fcc1f10528917 key
>     related 0032d548db56eac9ea09b4ba05843365f6325b85 cherrypick
>     author Junio C Hamano <junkio@cox.net> 1145943079 -0700
>     committer Junio C Hamano <junkio@cox.net> 1145943079 -0700
> 
>     Merge branch 'pb/config' into next
> 
>     * pb/config:
>       Deprecate usage of git-var -l for getting config vars list
>       git-repo-config --list support
> 
> The format of "related" attribute is, keyword "related", SP, 40-byte
> hexadecimal object name, SP, and arbitrary sequence of bytes
> except LF and NUL.  Let's call this arbitrary sequence of bytes
> "the nature of relation".
> 
> The semantics I would attach to these "related" links are as
> follows:
> 
>  * To the "core" level git, they do not mean anything other than
>    "you must have these objects, and objects reachable from
>    them, if you are going to have this commit and claim your
>    repository is without missing objects".
> 
> That means "git-rev-list --objects" needs to list these objects
> (and if they are tags, commits, and trees, then what are
> reachable from them), and "git-fsck" needs to consider these
> related objects and objects reachable from them are reachable
> from this commit.  NOTHING ELSE NEEDS TO BE DONE by the core
> (obviously, cat-file needs to show them, and commit-tree needs to
> record them, but that goes without saying).

Perhaps there should be an option to specify that the link is optional, and
the object pointed can be gone missing. For example for cherrypick the
original cherry-picked commit can either be removed completely, e.g. when
the original branch is deleted, or it can be modified breaking link when we
rewrite history up to original commit on original branch.

Also all other commands which show commit (commit message at least) should
be considered for including "related" links...

-- 
Jakub Narebski
Warsaw, Poland


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Junio C Hamano @ 2006-04-25  7:29 UTC (permalink / raw)
  To: git; +Cc: jnareb
In-Reply-To: <e2kgga$d7q$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Additionally for each of those cases we have to consider how to compute the
> link and which commands should be modified, which commands can make use of
> the link and should be modified, should the link be to commit, tag, tree or
> blob, what we want to do with link when pulling/pushing/cloning into
> another repository and which commands should be modified. Not only use case
> scenarios.

This last paragraph is a very good suggestion.  The alleged "use
cases" are just a laundry list of wishes, if they are not
accompanied by descriptions on what the modified data structure
and added attribute _means_ and how they are _used_.

Here is a related but not necessarily competing idle thought.

How about an ability to "attach" arbitrary objects to commit
objects?  The commit object would look like:

    tree 0aaa3fecff73ab428999cb9156f8abc075516abe
    parent 5a6a8c0e012137a3f0059be40ec7b2f4aa614355
    parent e1cbc46d12a0524fd5e710cbfaf3f178fc3da504
    related a0e7d36193b96f552073558acf5fcc1f10528917 key
    related 0032d548db56eac9ea09b4ba05843365f6325b85 cherrypick
    author Junio C Hamano <junkio@cox.net> 1145943079 -0700
    committer Junio C Hamano <junkio@cox.net> 1145943079 -0700

    Merge branch 'pb/config' into next

    * pb/config:
      Deprecate usage of git-var -l for getting config vars list
      git-repo-config --list support

The format of "related" attribute is, keyword "related", SP, 40-byte
hexadecimal object name, SP, and arbitrary sequence of bytes
except LF and NUL.  Let's call this arbitrary sequence of bytes
"the nature of relation".
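
As a sketch, the line format just described could be checked like this
in Python. The regular expression and the function name are
illustrative only; lower-case hexadecimal and an optionally empty
"nature of relation" are assumptions, not something stated above.

```python
import re

# keyword "related", SP, 40 hex digits, SP, then any bytes except LF and NUL.
# Assumptions: hex digits are lower case; the trailing part may be empty.
RELATED = re.compile(rb"related ([0-9a-f]{40}) ([^\n\x00]*)")


def parse_related(line):
    """Return (object_name, nature_of_relation) or None if malformed."""
    m = RELATED.fullmatch(line)
    if m is None:
        return None
    return m.group(1).decode("ascii"), m.group(2)


print(parse_related(b"related a0e7d36193b96f552073558acf5fcc1f10528917 key"))
print(parse_related(b"related a0e7 key"))
```

The nature of relation is kept as bytes because the description allows
an arbitrary byte sequence, not necessarily valid text in any encoding.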

The semantics I would attach to these "related" links are as
follows:

 * To the "core" level git, they do not mean anything other than
   "you must have these objects, and objects reachable from
   them, if you are going to have this commit and claim your
   repository is without missing objects".

That means "git-rev-list --objects" needs to list these objects
(and if they are tags, commits, and trees, then what are
reachable from them), and "git-fsck" needs to consider these
related objects and objects reachable from them are reachable
from this commit.  NOTHING ELSE NEEDS TO BE DONE by the core
(obviously, cat-file needs to show them, and commit-tree needs to
record them, but that goes without saying).
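
The reachability rule can be illustrated with a toy traversal in Python.
The object names and the in-memory store are hypothetical; the real
rev-list and fsck code is C and looks nothing like this. The only point
is that "related" is followed exactly like "parent" and "tree" when
enumerating objects.

```python
from collections import deque

# Toy object store: name -> links grouped by kind. All names are made up.
OBJECTS = {
    "C2": {"parent": ["C1"], "related": ["S1"], "tree": ["T2"]},
    "C1": {"parent": [], "related": [], "tree": ["T1"]},
    "S1": {"parent": [], "related": [], "tree": ["T3"]},
    "T1": {},
    "T2": {},
    "T3": {},
}


def reachable(tips, follow_related=True):
    """Return every object reachable from the tips."""
    seen = set()
    queue = deque(tips)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        for kind, targets in OBJECTS[name].items():
            if kind == "related" and not follow_related:
                continue
            queue.extend(targets)
    return seen


print(sorted(reachable(["C2"])))
print(sorted(reachable(["C2"], follow_related=False)))
```

With follow_related=False the commit S1 and its tree T3 drop out of the
result, which is what a transfer or prune would see if the core ignored
the new header.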

Then porcelains can agree on what different kinds of nature of
relation mean and do sensible things.  The earlier "omit the
cherry-picked ones" example I gave can examine "cherrypick".


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Jakub Narebski @ 2006-04-25  6:44 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

Sam Vilain wrote:

> This patch series implements "prior" links in commit objects.  A
> 'prior' link on a commit represents its historical precedent, as
> opposed to the previous commit(s) that this commit builds upon.
> 
> This is a proof of concept only; there is an outstanding bug (I put
> the prior header right after parent, when it should really go after
> author/committer), and room for improvement no doubt remains elsewhere.
> Not to mention my shocking C coding style ;)

I think "prior" link concept is too generic and is used for quite unrelated
things.

> Examples of use cases this helps:
> 
>  1. heads that represent topic branch merges
> 
>     This is the "pu" branch case, where the head is a merge of several
>     topic branches that is continually moved forward.
> 
>     topic branches     head
>       ,___.   ,___.
>      | TA1 | | TB1 |
>       `---'   `---'    ,__.
>          ^\_____^\____| H1 |
>                        `--'
> 
>     + some topic branch changes and a republish:
> 
>       ,___.   ,___.
>      | TA1 | | TB1 |
>       `---'   `---'^   ,__.
>         |^\_____^\____| H1 |
>         |       |      `--'
>       ,_|_.   ,_|_.      P
>      | TA2 | | TB2 |     |
>       `---'   `---'^     |
>         ^       ^        |
>       ,_|_.     |        |
>      | TA3 |    |        |
>       `---'     |      ,__.
>          ^\______\____| H2 |
>                        `--'
> 
>     key:  ^ = parent   P = prior

This case is clear. You want to record previous head of "pu"-like branch,
but you also want to drop the history, so you don't want to record it as
one of parents. I'm not sure if this link would be informative only, or if
it could be useful e.g. in merge computing.

>  2. revising published commits / re-basing
> 
>     This is what "stg" et al do.  The tools allow you to commit,
>     rewind, revise, recommit, fast forward, etc.
> 
>     In this case, the "prior" link would point to the last revision of
>     a patch.  Tools would probably support only doing this for selected, 
>     "published" patch chains 

This case is quite different. If I understand it correctly prior either
points to the previous patch in patch stack, or the bottom of the
stack/patch stack attachment point. If this cannot be computed easily, it
could I guess be added, but perhaps using other name for link.

>  3. sub-projects
> 
>     In this case, the commit on the "main" commit line would have a
>     "prior" link to the commit on the sub-project.  The sub-project
>     would effectively be its own head with copied commits objects on
>     the main head.
>
>  4. tracking cherry picking
> 
>     In this case, the "prior" link just points to the commit that was
>     cherry picked.  This is perhaps a little different, but an idea
>     that somebody else had for this feature.

Those two are yet another case altogether, the "prior" link pointing to "the
same" commit in another history line. I agree with Junio that for (3)
"bind" proposal (if I understand correctly it points to tree rather than to
commit) is more clean way to go. As to cherry picking (and perhaps
"cherry-pick on steroids" aka rebase), there is truly 0-1 relation (either
this link is not needed at all, or there is only one commit to link to),
but I don't think it should have the same name as in case (1), as this is
very different. And there is a problem that the link might be dangling if
we deleted the branch we cherry-picked commit from, or did some history
rewrite. Perhaps "cherry" would be better name for this link :-)

Additionally for each of those cases we have to consider how to compute the
link and which commands should be modified, which commands can make use of
the link and should be modified, should the link be to commit, tag, tree or
blob, what we want to do with link when pulling/pushing/cloning into
another repository and which commands should be modified. Not only use case
scenarios.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases
From: Junio C Hamano @ 2006-04-25  5:19 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git
In-Reply-To: <20060425043106.18382.48165.stgit@localhost.localdomain>

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> From: Sam Vilain <sam.vilain@catalyst.net.nz>
>
> It is possible that a good merge base may be found looking via "prior"
> links as well.  We follow them where possible.

You need to define what "prior" means before making decision
like that.  If "prior" can mean cherry-picked one from unrelated
line of development, the above reasoning does not apply.


* Re: [PATCH 1/5] add 'prior' link in commit structure
From: Junio C Hamano @ 2006-04-25  5:18 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425043106.18382.24344.stgit@localhost.localdomain>

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> For now this is just recorded as a char* pointer, as it is not an
> error condition for the commit not to be present locally.

Object ancestry is parsed lazily, so you should not have to do this.
Just point at another commit if you are to have only one (I
recommend against it) or have another commit_list, but when you
instantiate you may want to have a flag in the commit object
itself that says "this need not exist".


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Junio C Hamano @ 2006-04-25  5:16 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

Sam Vilain <sam.vilain@catalyst.net.nz> writes:

> Examples of use cases this helps:

My reaction to this patch series is that you try to cover quite
different and unrelated things, without thinking things through,
and end up covering nothing usefully.  What is missing in these
"use cases" is a coherent semantics.

What the "prior" means to humans and tools.  And my *guess* of
what they mean suggests you are trying to make it mean many
unrelated concepts.

>  1. heads that represent topic branch merges
>
>     This is the "pu" branch case, where the head is a merge of several
>     topic branches that is continually moved forward.

For usage like "pu", the previous "pu" head could be recorded as
one of the parents; you do not need anything special.

The reason I do not include the previous head when I reconstruct
"pu" is because I explicitly *want* to drop history -- not
having to carry forward a failed experiment is what is desired
there.  Otherwise I would manage "pu" just like I currently do
"next" and "master".  So this is not a justification to add
something new.

>  2. revising published commits / re-basing
>
>     This is what "stg" et al do.  The tools allow you to commit,
>     rewind, revise, recommit, fast forward, etc.

stg wants to have a link to the fork-point commit.  I do not
know if it is absolutely necessary (you might be able to figure
it out using merge-base, I dunno).

>     In this case, the "prior" link would point to the last revision of
>     a patch.  Tools would probably

Probably what...???

>  3. sub-projects
>
>     In this case, the commit on the "main" commit line would have a
>     "prior" link to the commit on the sub-project.  The sub-project
>     would effectively be its own head with copied commits objects on
>     the main head.

You say you can have only one "prior" per commit, which makes
this unsuitable to bind multiple subprojects into a larger
project (the earlier "bind" proposal allows zero or more).

When you, a human, see a "prior" link in "git cat-file commit"
output, what does that tell you?  Is it "the previous commit
this thing replaces?"  Or is it a commit in a different line of
development which is its subproject?  Or is it a commit that was
cherry-picked from a different line?  How would you tell?  And
assuming you _could_ somehow tell, how would it help you to know
it?

When the Plumbing and the Porcelain see a "prior" link, what
should they do?  It hugely depends on what that link means.  You
have a patch to merge-base to include the prior commit of the
commit in question in the ancestry chain, but that is probably
valid only for case 1. and perhaps 2.  If the link points at a
commit of an otherwise unrelated subproject head, you would _never_
want to include that in the merge-base computation.  The same goes
for the "this commit was taken out of context from an otherwise
unrelated branch" link you envision for case 4.  I think including
"prior" in the ancestry list for cases 1. and 2. makes some sense in
the merge-base example only because (1) it does not have to be any
different from an ordinary "parent" to begin with for case 1.,
and (2) for case 2. it points at the fork-point, which is sort of a
merge-base already.

There may be some narrower concrete use case for which you can
devise coherent semantics, and teach tools and humans how to
interpret such inter-commit relationships that are _not_
parent-child ancestry.  For example, if you have one special
link to point at a "cherry-picked" commit, rebasing _could_ take
advantage of it.  When your side branch tip is at D, and commit
D has a "this was cherry-picked from commit E" note, and if you
are rebasing your work on top of F:

        A---B---C---D
       /
  o---o---E---F

the tool can notice that F can reach E and carry forward only A,
B, and C on top of F, omitting D.  So having such a link might
be useful.  But if that is what you are going to do, I do not
think you would want to conflate that with other inter-commit
relationships, such as "previous hydra cap".

Oh, and you would need an update to rev-list --objects and
fsck-objects if you are to add any new link to commit objects.
Otherwise fetch/push would not get the related commits prior
points at, and prune will happily discard them.  But before even
bothering with it, you need to come up with the semantics first.

^ permalink raw reply

* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Sam Vilain @ 2006-04-25  4:34 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

Sam Vilain wrote:

>    In this case, the "prior" link would point to the last revision of
>    a patch.  Tools would probably
>  
>
... support only doing this for selected, "published" patch chains

^ permalink raw reply

* [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Sam Vilain @ 2006-04-25  3:54 UTC (permalink / raw)
  To: git

This patch series implements "prior" links in commit objects.  A
'prior' link on a commit represents its historical precedent, as
opposed to the previous commit(s) that this commit builds upon.

This is a proof of concept only; there is an outstanding bug (I put
the prior header right after parent, when it should really go after
author/committer), and room for improvement no doubt remains elsewhere.
Not to mention my shocking C coding style ;)

Examples of use cases this helps:

 1. heads that represent topic branch merges

    This is the "pu" branch case, where the head is a merge of several
    topic branches that is continually moved forward.

    topic branches     head
      ,___.   ,___.
     | TA1 | | TB1 |
      `---'   `---'    ,__.
         ^\_____^\____| H1 |
                       `--'

    + some topic branch changes and a republish:

      ,___.   ,___.
     | TA1 | | TB1 |
      `---'   `---'^   ,__.
        |^\_____^\____| H1 |
        |       |      `--'
      ,_|_.   ,_|_.      P
     | TA2 | | TB2 |     |
      `---'   `---'^     |
        ^       ^        |
      ,_|_.     |        |
     | TA3 |    |        |
      `---'     |      ,__.
         ^\______\____| H2 |
                       `--'

    key:  ^ = parent   P = prior

 2. revising published commits / re-basing

    This is what "stg" et al do.  The tools allow you to commit,
    rewind, revise, recommit, fast forward, etc.

    In this case, the "prior" link would point to the last revision of
    a patch.  Tools would probably

 3. sub-projects

    In this case, the commit on the "main" commit line would have a
    "prior" link to the commit on the sub-project.  The sub-project
    would effectively be its own head with copied commit objects on
    the main head.

 4. tracking cherry picking

    In this case, the "prior" link just points to the commit that was
    cherry picked.  This is perhaps a little different, but an idea
    that somebody else had for this feature.

Sam.

^ permalink raw reply

* [PATCH 4/5] git-commit-tree: add support for prior
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add support in git-commit-tree for -r as well as associated
documentation.
---

 Documentation/git-commit-tree.txt |    6 ++++++
 commit-tree.c                     |   26 +++++++++++++++++++++-----
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index 27b3d12..e11ba1f 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -20,6 +20,9 @@ A commit object usually has 1 parent (a 
 to 16 parents.  More than one parent represents a merge of branches
 that led to them.
 
+A commit object can have 1 prior commit.  This represents the previous
+commit that this one replaces (including history).
+
 While a tree represents a particular directory state of a working
 directory, a commit represents that state in "time", and explains how
 to get there.
@@ -38,6 +41,8 @@ OPTIONS
 -p <parent commit>::
 	Each '-p' indicates the id of a parent commit object.
 	
+-r <other commit>::
+	One '-r' indicates the id of a prior commit object.
 
 Commit Information
 ------------------
@@ -45,6 +50,7 @@ Commit Information
 A commit encapsulates:
 
 - all parent object ids
+- a prior object id (optional)
 - author name, email and date
 - committer name and email and the commit time.
 
diff --git a/commit-tree.c b/commit-tree.c
index 2d86518..6660b01 100644
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -61,8 +61,9 @@ static void check_valid(unsigned char *s
  */
 #define MAXPARENT (16)
 static unsigned char parent_sha1[MAXPARENT][20];
+static unsigned char prior_sha1[21] = "\0";
 
-static const char commit_tree_usage[] = "git-commit-tree <sha1> [-p <sha1>]* < changelog";
+static const char commit_tree_usage[] = "git-commit-tree <sha1> [-p <sha1>]* [-r <sha1>] < changelog";
 
 static int new_parent(int idx)
 {
@@ -99,11 +100,22 @@ int main(int argc, char **argv)
 	for (i = 2; i < argc; i += 2) {
 		char *a, *b;
 		a = argv[i]; b = argv[i+1];
-		if (!b || strcmp(a, "-p") || get_sha1(b, parent_sha1[parents]))
+		if (!b)
 			usage(commit_tree_usage);
-		check_valid(parent_sha1[parents], commit_type);
-		if (new_parent(parents))
-			parents++;
+		if (!strcmp(a, "-p")) {
+			if (get_sha1(b, parent_sha1[parents]) < 0)
+				usage(commit_tree_usage);
+			check_valid(parent_sha1[parents], commit_type);
+			if (new_parent(parents))
+				parents++;
+		}
+		else if (!strcmp(a, "-r")) {
+			if (strcmp(&prior_sha1, "") || get_sha1(b, &prior_sha1) < 0)
+				usage(commit_tree_usage);
+		}
+		else {
+			usage(commit_tree_usage);
+		}
 	}
 	if (!parents)
 		fprintf(stderr, "Committing initial tree %s\n", argv[1]);
@@ -118,6 +130,10 @@ int main(int argc, char **argv)
 	 */
 	for (i = 0; i < parents; i++)
 		add_buffer(&buffer, &size, "parent %s\n", sha1_to_hex(parent_sha1[i]));
+	if (strcmp(&prior_sha1, "")) {
+		fprintf(stderr, "Setting prior to %s\n", sha1_to_hex(&prior_sha1));
+		add_buffer(&buffer, &size, "prior %s\n", sha1_to_hex(&prior_sha1));
+	}
 
 	/* Person/date information */
 	add_buffer(&buffer, &size, "author %s\n", git_author_info(1));

^ permalink raw reply related

* [PATCH 1/5] add 'prior' link in commit structure
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add a space in the commit for a prior commit that forms this commit's
historical, not substantial, precedent.

For now this is just recorded as a char* pointer, as it is not an
error condition for the commit not to be present locally.
---

 commit.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/commit.h b/commit.h
index de142af..b00a6b9 100644
--- a/commit.h
+++ b/commit.h
@@ -13,6 +13,7 @@ struct commit {
 	struct object object;
 	unsigned long date;
 	struct commit_list *parents;
+	char *prior;
 	struct tree *tree;
 	char *buffer;
 };

^ permalink raw reply related

* [PATCH 3/5] commit.c: parse 'prior' link
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Parse for the 'prior' link in a commit
---

 commit.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/commit.c b/commit.c
index 2717dd8..e4bc396 100644
--- a/commit.c
+++ b/commit.c
@@ -260,6 +260,18 @@ int parse_commit_buffer(struct commit *i
 			n_refs++;
 		}
 	}
+	if (!memcmp(bufptr, "prior ", 6)) {
+		unsigned char prior[20];
+		if (get_sha1_hex(bufptr + 6, prior) || bufptr[46] != '\n')
+			return error("bad prior in commit %s", sha1_to_hex(item->object.sha1));
+		bufptr += 47;
+
+		item->prior = xmalloc(21);
+		memcpy(item->prior, prior, 20);
+		item->prior[20] = '\0';
+	} else {
+		item->prior = 0;
+	}
 	if (graft) {
 		int i;
 		struct commit *new_parent;

^ permalink raw reply related

* [PATCH 2/5] git-merge-base: follow 'prior' links to find merge bases
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

It is possible that a good merge base may be found looking via "prior"
links as well.  We follow them where possible.
---

 merge-base.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/merge-base.c b/merge-base.c
index 07f5ab4..ed6d18c 100644
--- a/merge-base.c
+++ b/merge-base.c
@@ -207,6 +207,18 @@ static int merge_base(struct commit *rev
 			p->object.flags |= flags;
 			insert_by_date(p, &list);
 		}
+		/* If the commit has a "prior" reference, add it */
+		if (commit->prior) {
+			struct commit *prior;
+			prior = lookup_commit_reference_gently(commit->prior, 1);
+			if (prior) {
+				if ((prior->object.flags & flags) != flags) {
+					parse_commit(prior);
+					prior->object.flags |= flags;
+					insert_by_date(prior, &list);
+				}
+			}
+		}
 	}
 
 	if (!result)

^ permalink raw reply related

* [PATCH 5/5] git-commit: add --prior to set prior link
From: Sam Vilain @ 2006-04-25  4:31 UTC (permalink / raw)
  To: git
In-Reply-To: <20060425035421.18382.51677.stgit@localhost.localdomain>

From: Sam Vilain <sam.vilain@catalyst.net.nz>

Add command-line support for --prior and add a description to the
ASCIIDOC
---

 Documentation/git-commit.txt |   10 ++++++++++
 git-commit.sh                |   19 +++++++++++++++++--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index 6f2c495..ca5073c 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -10,6 +10,7 @@ SYNOPSIS
 [verse]
 'git-commit' [-a] [-s] [-v] [(-c | -C) <commit> | -F <file> | -m <msg>]
 	   [--no-verify] [--amend] [-e] [--author <author>]
+           [-p <commit>]
 	   [--] [[-i | -o ]<file>...]
 
 DESCRIPTION
@@ -106,6 +107,15 @@ but can be used to amend a merge commit.
 	index and the latest commit does not match on the
 	specified paths to avoid confusion.
 
+-p|--prior <commit>::
+	Specify a commit that this new commit is the next version of.
+        Use when you want a branch to supersede another branch, but
+        with a new commit history.  It is also used for sub-projects,
+        where commits on the parent tree mirror commits in the
+        sub-project.  <commit> does not have to exist in the local
+        repository, if it is specified as a full 40-digit hex SHA1
+        sum.  Otherwise it is parsed as a local revision.
+
 --::
 	Do not interpret any more arguments as options.
 
diff --git a/git-commit.sh b/git-commit.sh
index 26cd7ca..3feb60d 100755
--- a/git-commit.sh
+++ b/git-commit.sh
@@ -3,7 +3,7 @@ #
 # Copyright (c) 2005 Linus Torvalds
 # Copyright (c) 2006 Junio C Hamano
 
-USAGE='[-a] [-s] [-v] [--no-verify] [-m <message> | -F <logfile> | (-C|-c) <commit>) [--amend] [-e] [--author <author>] [[-i | -o] <path>...]'
+USAGE='[-a] [-s] [-v] [--no-verify] [-m <message> | -F <logfile> | (-C|-c) <commit>) [--amend] [-e] [--author <author>] [-p <commit>] [[-i | -o] <path>...]'
 SUBDIRECTORY_OK=Yes
 . git-sh-setup
 
@@ -200,6 +200,7 @@ log_given=
 log_message=
 verify=t
 verbose=
+prior=
 signoff=
 force_author=
 only_include_assumed=
@@ -344,6 +345,19 @@ do
       shift
       break
       ;;
+  -p|--p|--pr|--pri|--prio|--prior)
+      shift
+      prior="$1"
+      if echo "$prior" | perl -ne 'exit 1 unless /^[0-9a-f]{40}$/i'
+      then
+          prior=`echo "$prior" | tr '[A-Z]' '[a-z]'`
+      else
+	  prior=`git-rev-parse "$prior"`
+	  [ -n "$prior" ] || exit 1
+      fi
+      PRIOR="-r $prior"
+      shift
+      ;;
   -*)
       usage
       ;;
@@ -602,6 +616,7 @@ then
 		PARENTS=$(git-cat-file commit HEAD |
 			sed -n -e '/^$/q' -e 's/^parent /-p /p')
 	fi
+	
 	current=$(git-rev-parse --verify HEAD)
 else
 	if [ -z "$(git-ls-files)" ]; then
@@ -673,7 +688,7 @@ then
 		tree=$(GIT_INDEX_FILE="$TMP_INDEX" git-write-tree) &&
 		rm -f "$TMP_INDEX"
 	fi &&
-	commit=$(cat "$GIT_DIR"/COMMIT_MSG | git-commit-tree $tree $PARENTS) &&
+	commit=$(cat "$GIT_DIR"/COMMIT_MSG | git-commit-tree $tree $PARENTS $PRIOR) &&
 	git-update-ref HEAD $commit $current &&
 	rm -f -- "$GIT_DIR/MERGE_HEAD" &&
 	if test -f "$NEXT_INDEX"

^ permalink raw reply related

* [PATCH] split the diff-delta interface
From: Nicolas Pitre @ 2006-04-25  3:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This patch splits the diff-delta interface into index creation and delta
generation.  A wrapper is provided to preserve the diff-delta() call.

This will allow for an optimization in pack-objects.c where the source
object could be fixed and a full window of objects tentatively tried
against that same source object without recomputing the source index
each time.

This patch only restructures things, plus a couple of cleanups for good
measure.  There is no performance change yet.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---


diff --git a/delta.h b/delta.h
index 9464f3e..9ef44c1 100644
--- a/delta.h
+++ b/delta.h
@@ -1,12 +1,73 @@
 #ifndef DELTA_H
 #define DELTA_H
 
-/* handling of delta buffers */
-extern void *diff_delta(void *from_buf, unsigned long from_size,
-			void *to_buf, unsigned long to_size,
-		        unsigned long *delta_size, unsigned long max_size);
-extern void *patch_delta(void *src_buf, unsigned long src_size,
-			 void *delta_buf, unsigned long delta_size,
+/* opaque object for delta index */
+struct delta_index;
+
+/*
+ * create_delta_index: compute index data from given buffer
+ *
+ * This returns a pointer to a struct delta_index that should be passed to
+ * subsequent create_delta() calls, or to free_delta_index().  A NULL pointer
+ * is returned on failure.  The given buffer must not be freed nor altered
+ * before free_delta_index() is called.  The returned pointer must be freed
+ * using free_delta_index().
+ */
+extern struct delta_index *
+create_delta_index(const void *buf, unsigned long bufsize);
+
+/*
+ * free_delta_index: free the index created by create_delta_index()
+ */
+extern void free_delta_index(struct delta_index *index);
+
+/*
+ * create_delta: create a delta from given index for the given buffer
+ *
+ * This function may be called multiple times with different buffers using
+ * the same delta_index pointer.  If max_delta_size is non-zero and the
+ * resulting delta is to be larger than max_delta_size then NULL is returned.
+ * On success, a non-NULL pointer to the buffer with the delta data is
+ * returned and *delta_size is updated with its size.  The returned buffer
+ * must be freed by the caller.
+ */
+extern void *
+create_delta(const struct delta_index *index,
+	     const void *buf, unsigned long bufsize,
+	     unsigned long *delta_size, unsigned long max_delta_size);
+
+/*
+ * diff_delta: create a delta from source buffer to target buffer
+ *
+ * If max_delta_size is non-zero and the resulting delta is to be larger
+ * than max_delta_size then NULL is returned.  On success, a non-NULL
+ * pointer to the buffer with the delta data is returned and *delta_size is
+ * updated with its size.  The returned buffer must be freed by the caller.
+ */
+static inline void *
+diff_delta(const void *src_buf, unsigned long src_bufsize,
+	   const void *trg_buf, unsigned long trg_bufsize,
+	   unsigned long *delta_size, unsigned long max_delta_size)
+{
+	struct delta_index *index = create_delta_index(src_buf, src_bufsize);
+	if (index) {
+		void *delta = create_delta(index, trg_buf, trg_bufsize, 
+					   delta_size, max_delta_size);
+		free_delta_index(index);
+		return delta;
+	}
+	return NULL;
+}
+
+/*
+ * patch_delta: recreate target buffer given source buffer and delta data
+ *
+ * On success, a non-NULL pointer to the target buffer is returned and
+ * *trg_bufsize is updated with its size.  On failure a NULL pointer is
+ * returned.  The returned buffer must be freed by the caller.
+ */
+extern void *patch_delta(const void *src_buf, unsigned long src_size,
+			 const void *delta_buf, unsigned long delta_size,
 			 unsigned long *dst_size);
 
 /* the smallest possible delta size is 4 bytes */
@@ -14,7 +75,7 @@ #define DELTA_SIZE_MIN	4
 
 /*
  * This must be called twice on the delta data buffer, first to get the
- * expected reference buffer size, and again to get the result buffer size.
+ * expected source buffer size, and again to get the target buffer size.
  */
 static inline unsigned long get_delta_hdr_size(const unsigned char **datap,
 					       const unsigned char *top)
diff --git a/diff-delta.c b/diff-delta.c
index 1188b31..fdedf94 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -27,53 +27,70 @@ #include "delta.h"
 /* block size: min = 16, max = 64k, power of 2 */
 #define BLK_SIZE 16
 
-#define MIN(a, b) ((a) < (b) ? (a) : (b))
+/* maximum hash entry list for the same hash bucket */
+#define HASH_LIMIT 64
 
 #define GR_PRIME 0x9e370001
 #define HASH(v, shift) (((unsigned int)(v) * GR_PRIME) >> (shift))
 
-struct index {
+struct index_entry {
 	const unsigned char *ptr;
 	unsigned int val;
-	struct index *next;
+	struct index_entry *next;
 };
 
-static struct index ** delta_index(const unsigned char *buf,
-				   unsigned long bufsize,
-				   unsigned long trg_bufsize,
-				   unsigned int *hash_shift)
+struct delta_index {
+	const void *src_buf;
+	unsigned long src_size;
+	unsigned int hash_shift;
+	struct index_entry *hash[0];
+};
+
+struct delta_index * create_delta_index(const void *buf, unsigned long bufsize)
 {
-	unsigned int i, hsize, hshift, hlimit, entries, *hash_count;
-	const unsigned char *data;
-	struct index *entry, **hash;
+	unsigned int i, hsize, hshift, entries, *hash_count;
+	const unsigned char *data, *buffer = buf;
+	struct delta_index *index;
+	struct index_entry *entry, **hash;
 	void *mem;
 
+	if (!buf || !bufsize)
+		return NULL;
+
 	/* determine index hash size */
 	entries = bufsize  / BLK_SIZE;
 	hsize = entries / 4;
 	for (i = 4; (1 << i) < hsize && i < 31; i++);
 	hsize = 1 << i;
 	hshift = 32 - i;
-	*hash_shift = hshift;
 
 	/* allocate lookup index */
-	mem = malloc(hsize * sizeof(*hash) + entries * sizeof(*entry));
+	mem = malloc(sizeof(*index) +
+		     sizeof(*hash) * hsize +
+		     sizeof(*entry) * entries);
 	if (!mem)
 		return NULL;
+	index = mem;
+	mem = index + 1;
 	hash = mem;
-	entry = mem + hsize * sizeof(*hash);
+	mem = hash + hsize;
+	entry = mem;
+
+	index->src_buf = buf;
+	index->src_size = bufsize;
+	index->hash_shift = hshift;
 	memset(hash, 0, hsize * sizeof(*hash));
 
 	/* allocate an array to count hash entries */
 	hash_count = calloc(hsize, sizeof(*hash_count));
 	if (!hash_count) {
-		free(hash);
+		free(index);
 		return NULL;
 	}
 
 	/* then populate the index */
-	data = buf + entries * BLK_SIZE - BLK_SIZE;
-	while (data >= buf) {
+	data = buffer + entries * BLK_SIZE - BLK_SIZE;
+	while (data >= buffer) {
 		unsigned int val = adler32(0, data, BLK_SIZE);
 		i = HASH(val, hshift);
 		entry->ptr = data;
@@ -91,27 +108,18 @@ static struct index ** delta_index(const
 	 * bucket that would bring us to O(m*n) computing costs (m and n
 	 * corresponding to reference and target buffer sizes).
 	 *
-	 * The more the target buffer is large, the more it is important to
-	 * have small entry lists for each hash buckets.  With such a limit
-	 * the cost is bounded to something more like O(m+n).
-	 */
-	hlimit = (1 << 26) / trg_bufsize;
-	if (hlimit < 4*BLK_SIZE)
-		hlimit = 4*BLK_SIZE;
-
-	/*
-	 * Now make sure none of the hash buckets has more entries than
+	 * Make sure none of the hash buckets has more entries than
 	 * we're willing to test.  Otherwise we cull the entry list
 	 * uniformly to still preserve a good repartition across
 	 * the reference buffer.
 	 */
 	for (i = 0; i < hsize; i++) {
-		if (hash_count[i] < hlimit)
+		if (hash_count[i] < HASH_LIMIT)
 			continue;
 		entry = hash[i];
 		do {
-			struct index *keep = entry;
-			int skip = hash_count[i] / hlimit / 2;
+			struct index_entry *keep = entry;
+			int skip = hash_count[i] / HASH_LIMIT / 2;
 			do {
 				entry = entry->next;
 			} while(--skip && entry);
@@ -120,7 +128,12 @@ static struct index ** delta_index(const
 	}
 	free(hash_count);
 
-	return hash;
+	return index;
+}
+
+void free_delta_index(struct delta_index *index)
+{
+	free(index);
 }
 
 /* provide the size of the copy opcode given the block offset and size */
@@ -131,21 +144,17 @@ #define COPYOP_SIZE(o, s) \
 /* the maximum size for any opcode */
 #define MAX_OP_SIZE COPYOP_SIZE(0xffffffff, 0xffffffff)
 
-void *diff_delta(void *from_buf, unsigned long from_size,
-		 void *to_buf, unsigned long to_size,
-		 unsigned long *delta_size,
-		 unsigned long max_size)
+void *
+create_delta(const struct delta_index *index,
+	     const void *trg_buf, unsigned long trg_size,
+	     unsigned long *delta_size, unsigned long max_size)
 {
 	unsigned int i, outpos, outsize, hash_shift;
 	int inscnt;
 	const unsigned char *ref_data, *ref_top, *data, *top;
 	unsigned char *out;
-	struct index *entry, **hash;
 
-	if (!from_size || !to_size)
-		return NULL;
-	hash = delta_index(from_buf, from_size, to_size, &hash_shift);
-	if (!hash)
+	if (!trg_buf || !trg_size)
 		return NULL;
 
 	outpos = 0;
@@ -153,60 +162,55 @@ void *diff_delta(void *from_buf, unsigne
 	if (max_size && outsize >= max_size)
 		outsize = max_size + MAX_OP_SIZE + 1;
 	out = malloc(outsize);
-	if (!out) {
-		free(hash);
+	if (!out)
 		return NULL;
-	}
-
-	ref_data = from_buf;
-	ref_top = from_buf + from_size;
-	data = to_buf;
-	top = to_buf + to_size;
 
 	/* store reference buffer size */
-	out[outpos++] = from_size;
-	from_size >>= 7;
-	while (from_size) {
-		out[outpos - 1] |= 0x80;
-		out[outpos++] = from_size;
-		from_size >>= 7;
+	i = index->src_size;
+	while (i >= 0x80) {
+		out[outpos++] = i | 0x80;
+		i >>= 7;
 	}
+	out[outpos++] = i;
 
 	/* store target buffer size */
-	out[outpos++] = to_size;
-	to_size >>= 7;
-	while (to_size) {
-		out[outpos - 1] |= 0x80;
-		out[outpos++] = to_size;
-		to_size >>= 7;
+	i = trg_size;
+	while (i >= 0x80) {
+		out[outpos++] = i | 0x80;
+		i >>= 7;
 	}
+	out[outpos++] = i;
 
+	ref_data = index->src_buf;
+	ref_top = ref_data + index->src_size;
+	data = trg_buf;
+	top = trg_buf + trg_size;
+	hash_shift = index->hash_shift;
 	inscnt = 0;
 
 	while (data < top) {
 		unsigned int moff = 0, msize = 0;
-		if (data + BLK_SIZE <= top) {
-			unsigned int val = adler32(0, data, BLK_SIZE);
-			i = HASH(val, hash_shift);
-			for (entry = hash[i]; entry; entry = entry->next) {
-				const unsigned char *ref = entry->ptr;
-				const unsigned char *src = data;
-				unsigned int ref_size = ref_top - ref;
-				if (entry->val != val)
-					continue;
-				if (ref_size > top - src)
-					ref_size = top - src;
-				if (ref_size > 0x10000)
-					ref_size = 0x10000;
-				if (ref_size <= msize)
-					break;
-				while (ref_size-- && *src++ == *ref)
-					ref++;
-				if (msize < ref - entry->ptr) {
-					/* this is our best match so far */
-					msize = ref - entry->ptr;
-					moff = entry->ptr - ref_data;
-				}
+		struct index_entry *entry;
+		unsigned int val = adler32(0, data, BLK_SIZE);
+		i = HASH(val, hash_shift);
+		for (entry = index->hash[i]; entry; entry = entry->next) {
+			const unsigned char *ref = entry->ptr;
+			const unsigned char *src = data;
+			unsigned int ref_size = ref_top - ref;
+			if (entry->val != val)
+				continue;
+			if (ref_size > top - src)
+				ref_size = top - src;
+			if (ref_size > 0x10000)
+				ref_size = 0x10000;
+			if (ref_size <= msize)
+				break;
+			while (ref_size-- && *src++ == *ref)
+				ref++;
+			if (msize < ref - entry->ptr) {
+				/* this is our best match so far */
+				msize = ref - entry->ptr;
+				moff = entry->ptr - ref_data;
 			}
 		}
 
@@ -271,7 +275,6 @@ void *diff_delta(void *from_buf, unsigne
 				out = realloc(out, outsize);
 			if (!out) {
 				free(tmp);
-				free(hash);
 				return NULL;
 			}
 		}
@@ -280,7 +283,6 @@ void *diff_delta(void *from_buf, unsigne
 	if (inscnt)
 		out[outpos - inscnt - 1] = inscnt;
 
-	free(hash);
 	*delta_size = outpos;
 	return out;
 }
diff --git a/patch-delta.c b/patch-delta.c
index d95f0d9..8f318ed 100644
--- a/patch-delta.c
+++ b/patch-delta.c
@@ -13,8 +13,8 @@ #include <stdlib.h>
 #include <string.h>
 #include "delta.h"
 
-void *patch_delta(void *src_buf, unsigned long src_size,
-		  void *delta_buf, unsigned long delta_size,
+void *patch_delta(const void *src_buf, unsigned long src_size,
+		  const void *delta_buf, unsigned long delta_size,
 		  unsigned long *dst_size)
 {
 	const unsigned char *data, *top;

^ permalink raw reply related

* Re: [BUG] gitk draws a wrong line
From: Paul Mackerras @ 2006-04-25  1:31 UTC (permalink / raw)
  To: Uwe Zeisberger; +Cc: git
In-Reply-To: <20060418104014.GA2299@informatik.uni-freiburg.de>

Uwe Zeisberger writes:

> and then going to commit 10c2df65060e1ab57b2f75e0749de0ee9b8f4810, 
> I see a small superfluous line between the two commits under 10c2df.
> 
> But still worse, if I select the line going down from 10c2df and then
> select its parent (i.e. c76b6b) I get a big line ending in the commit
> descriptions and four lines ending in midair.

That is an X server bug, it seems.  Tk already clips vertices that it
sends to the X server to be within a box that is no more than 32000
pixels wide or high, but that seems not to be enough with some X
servers.  What X server version are you using and what sort of video
card?

If you're feeling adventurous, you can rebuild Tk with the patch below
(courtesy of D. Richard Hipp) and see if that fixes it.  If it does it
proves that it is an X server bug.

Paul.

--- tkCanvUtil.c.orig   2006-02-08 08:51:31.859761208 -0500
+++ tkCanvUtil.c        2006-02-08 08:57:11.744090936 -0500
@@ -1657,25 +1657,27 @@
 
     /*
     ** Constrain all vertices of the path to be within a box that is no
-    ** larger than 32000 pixels wide or height.  The top-left corner of
+    ** larger than 16000 pixels wide or height.  The top-left corner of
     ** this clipping box is 1000 pixels above and to the left of the top
     ** left corner of the window on which the canvas is displayed.
     **
     ** This means that a canvas will not display properly on a canvas
-    ** window that is larger than 31000 pixels wide or high.  That is no
+    ** window that is larger than 14000 pixels wide or high.  That is no
     ** a problem today, but might someday become a factor for ultra-high
     ** resolutions displays.
     **
     ** The X11 protocol allows us (in theory) to expand the size of the
     ** clipping box to 32767 pixels.  But we have found experimentally that
-    ** XFree86 sometimes fails to draw lines correctly if they are longe
-    ** than about 32500 pixels.  So we have left a little margin in the
-    ** size to mask that bug.
+    ** XFree86 has problems with sizes bigger than 32500 pixels and we
+    ** have received reports of other X servers running into trouble
+    ** at around 29000 pixels.  So we are going to play it safe and limit
+    ** pixel values to 14 bits: 16384.  That is still sufficient for
+    ** a 4x4 ft display at 300 dpi.
     */
     lft = canvPtr->xOrigin - 1000.0;
     top = canvPtr->yOrigin - 1000.0;
-    rgh = lft + 32000.0;
-    btm = top + 32000.0;
+    rgh = lft + 16383.0;
+    btm = top + 16383.0;
 
     /* Try the common case first - no clipping.  Loop over the input
     ** coordinates and translate them into appropriate output coordinates.

^ permalink raw reply
