Git development
* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Junio C Hamano @ 2006-04-26  7:50 UTC
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e2n72h$aqe$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

> Do I understand correctly that toplevel (master project) commits have tree
> which points to combined tree, and "bind" links which points to the
> subprojects commits whose trees make up the overall tree, or does the
> master tree points to tree containing only toplevel files (overall Makefile
> for example, INSTALL or README for the whole project including
> subprojects,...)?

The plan for "bind commit" was to have the toplevel commit
contain:

        tree -- this covers the whole tree including subprojects
        parent -- list of parents in the toplevel project
        bind -- commit object name of subproject, plus which
                directory to graft its tree onto.

And a subproject commit, unless it contains a subsubproject,
would look just like an ordinary commit.  Its tree would match
the entry in the tree of the toplevel commit at the path given
in the "bind" line of the toplevel commit.
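
As a rough illustration of the layout above, the header of a toplevel
commit could be assembled as follows.  This is a sketch only: the
"bind <commit> <path>" line format, the bind_commit_header helper and
the placeholder object names are assumptions made for illustration,
not necessarily what the jc/bind code writes.

```shell
#!/bin/sh
# Hypothetical helper: print the header lines of a toplevel "bind commit".
# usage: bind_commit_header <tree> <parent> <subproject-commit> <path>
bind_commit_header () {
	printf 'tree %s\n' "$1"          # whole tree, subprojects included
	printf 'parent %s\n' "$2"        # parent in the toplevel project
	printf 'bind %s %s\n' "$3" "$4"  # subproject commit and graft directory
}

# Placeholder object names, not real objects.
tree=1111111111111111111111111111111111111111
parent=2222222222222222222222222222222222222222
sub=3333333333333333333333333333333333333333

header=$(bind_commit_header "$tree" "$parent" "$sub" kernel-2.6/)
printf '%s\n' "$header"
```

A subproject commit would carry no "bind" line of its own unless it
had a subsubproject.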

Some reading material, from newer to older:

  * http://www.kernel.org/git/?p=git/git.git;a=blob;hb=todo;f=Subpro.txt

  This talks about the overall "vision" of what the user-level
  interaction might look like, with a sketch of how the core-level
  would help Porcelain to implement that interaction.  Most of the
  core-level support described there is in the "bind commit"
  changes, except "update-index --bind/--unbind" to record the
  information on bound subprojects in the index file.

  * http://thread.gmane.org/gmane.comp.version-control.git/15072

  This was the thread that led to the above proposal.

  * http://thread.gmane.org/gmane.comp.version-control.git/14486

  This is older.  It touches an alternative "gitlink" approach,
  which I meant to prototype but never got around to.

  Surprisingly, these two threads are mostly noise-free and
  literally every message is worth reading.

Some old but working core-side code is available on the jc/bind
branch of the public git.git repository.

> BTW. I have lately stumbled upon (somewhat Vault and Subversion biased)
>  http://software.ericsink.com/Beyond_CheckOut_and_CheckIn.html
> Read about Share and Pin -- it's about subprojects (when you edit out the
> flawed "branch as folder" approach of author).

Not really.  You can easily do that by checking out another
project in a separate subdirectory.

My private working area for git.git is structured like this:

        /home/junio/git.junio/.git
                              Makefile
                              COPYING
                              Documentation/
                              ...
                              Meta/.git
                              Meta/TODO
                              Meta/Make
                              Meta/TO
                              Meta/WI
                              ...

Notice two .git directories?  That's right.  

The top-level .git repository has the familiar branches like
"maint", "master", "next", "pu", in addition to various topic
branches.

Meta/.git is a separate repository that is a clone of "todo"
branch of git.git repository.  The top-level .git repository
does not even have "todo" branch.  I just happen to push into
the same public repository git.git at kernel.org from these two
separate repositories.

The Meta/ repository is "pinned" to a specific version, without
having any funky "Pin feature", no thank you, because I have
full control of when I update what is checked out in the Meta/
directory.

What you _might_ want is a reverse of Pinning.  Sometimes, you
would want to make sure subproject part is at least this version
or later to build other parts of the whole.

But for my particular "Meta/" directory, I do not need such a
linkage.  The major reason I do not keep TODO in the main
project is because it is supposed to be a task list for me
across "maint", "master" and "next".  I do not want it to
fluctuate whenever I work on different branches.

-jc


* Re: [PATCH] Make die() and error() prefix line with binary name if set
From: Junio C Hamano @ 2006-04-26  8:32 UTC
  To: Rocco Rutte; +Cc: git
In-Reply-To: <20060425101207.GC5482@bolero.cs.tu-berlin.de>

Rocco Rutte <pdmef@gmx.net> writes:

> Now, git_set_appname() can be used to set the name of the binary
> as first call in a binary's main() routine which will be used
> as prefix in die() and error(). If it was not called, no prefix
> will be printed.

I agree with the general direction, but...

> @@ -1960,6 +1960,8 @@ int main(int argc, char **argv)
>  	int read_stdin = 1;
>  	const char *whitespace_option = NULL;
>  +	git_set_appname("git-apply");
> +
>  	for (i = 1; i < argc; i++) {
>  		const char *arg = argv[i];
>  		char *end;

... what's wrong with your mailer?


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Jakub Narebski @ 2006-04-26  8:44 UTC
  To: git
In-Reply-To: <7virowrd1y.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> BTW. I have lately stumbled upon (somewhat Vault and Subversion biased)
>>  http://software.ericsink.com/Beyond_CheckOut_and_CheckIn.html
>> Read about Share and Pin -- it's about subprojects (when you edit out the
>> flawed "branch as folder" approach of author).

By the way, I mentioned this link only because it *might* be interesting
to see what others need subproject support for, and how they think about
and implement it.

> Not really.  You can easily do that by checking out another
> project in a separate subdirectory.
> 
> My private working area for git.git is structured like this:
> 
>         /home/junio/git.junio/.git
>                               Makefile
>                               COPYING
>                               Documentation/
>                               ...
>                               Meta/.git
>                               Meta/TODO
>                               Meta/Make
>                               Meta/TO
>                               Meta/WI
>                               ...
> 
> Notice two .git directories?  That's right.
[...] 
> Meta/.git is a separate repository that is a clone of "todo"
> branch of git.git repository.  The top-level .git repository
> does not even have "todo" branch.  I just happen to push into
> the same public repository git.git at kernel.org from these two
> separate repositories.

And the top-level .git repository is told to ignore the Meta directory?

Interesting idea...

-- 
Jakub Narebski
Warsaw, Poland


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Junio C Hamano @ 2006-04-26  9:21 UTC
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e2nbrl$p6l$1@sea.gmane.org>

Jakub Narebski <jnareb@gmail.com> writes:

>> Notice two .git directories?  That's right.
> [...] 
>> Meta/.git is a separate repository that is a clone of "todo"
>> branch of git.git repository.  The top-level .git repository
>> does not even have "todo" branch.  I just happen to push into
>> the same public repository git.git at kernel.org from these two
>> separate repositories.
>
> And the top-level .git repository is told to ignore the Meta directory?

Yes, I have .git/info/exclude that says something like this:

/.mailmap
*~
/Meta
+*
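
A minimal sketch of that arrangement, for illustration only.  It uses
a scratch directory from mktemp in place of a real working area and
empty directories in place of real repositories (both are
assumptions); in practice Meta/ would be an ordinary clone.

```shell
#!/bin/sh
# Sketch: record a nested repository's directory in the outer
# repository's private exclude file.  $top is a scratch directory
# standing in for the toplevel working tree (an assumption).
top=$(mktemp -d)
mkdir -p "$top/.git/info" "$top/Meta/.git"

exclude="$top/.git/info/exclude"
touch "$exclude"

# Append the pattern only if it is not already listed.
grep -qx '/Meta' "$exclude" || echo '/Meta' >>"$exclude"
grep -qx '/Meta' "$exclude" || echo '/Meta' >>"$exclude"   # second run adds nothing

cat "$exclude"
```

Because the pattern lives in .git/info/exclude rather than in a
tracked .gitignore, it stays private to this one working area.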


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links
From: Jakub Narebski @ 2006-04-26  9:28 UTC
  To: git
In-Reply-To: <7virowrd1y.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> And a subproject commit, unless it contains a subsubproject,
> would look just like an ordinary commit.  Its tree would match
> the entry in the tree of the toplevel commit at the path given
> in the "bind" line of the toplevel commit.
> 
> Some reading material, from newer to older:
> 
>   * http://www.kernel.org/git/?p=git/git.git;a=blob;hb=todo;f=Subpro.txt
> 
>   This talks about the overall "vision" of what the user-level
>   interaction might look like, with a sketch of how the core-level
>   would help Porcelain to implement that interaction.  Most of the
>   core-level support described there is in the "bind commit"
>   changes, except "update-index --bind/--unbind" to record the
>   information on bound subprojects in the index file.

By the way, this file talks about (1) "using"/"userspace"/"embedder"
subproject holding 'appliance/', and toplevel (master) holding toplevel
Makefile, or (2) 'using' subproject holding both 'appliance/' and toplevel
Makefile with the help of --exclude. 

Another option would be to have only the "embedded"/"used"/"requirement"
part be a subproject holding 'kernel-2.6', with 'appliance/' held by the
toplevel (master) commit.  Perhaps not the best solution for the 'kernel +
userspace tools' example, but it might be a better workflow for an
'application + library' or 'application + engine' example.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH] Make die() and error() prefix line with binary name if set
From: Rocco Rutte @ 2006-04-26 10:43 UTC
  To: git
In-Reply-To: <7vejzkrb2y.fsf@assigned-by-dhcp.cox.net>

* Junio C Hamano <junkio@cox.net>:

>... what's wrong with your mailer?

I don't know. I recall having seen this earlier.

And while I look at it (I bet this is an f=f issue), the patch is at:

   <http://user.cs.tu-berlin.de/~pdmef/0001-Make-die-and-error-prefix-line-with-binary-name-if-set.txt>

   bye, Rocco
-- 
:wq!


* new gitk feature
From: Paul Mackerras @ 2006-04-26 10:59 UTC
  To: git

I just pushed some changes to gitk which add a new feature, the
ability to have multiple "views" of a repository.  Each view is a
subgraph of the full graph.  At the moment the only subgraph that you
can specify is the subgraph containing the commits that affect a
specified set of files or directories.  You can switch between views
quickly, and if the currently selected commit exists in the new view
when you switch views, it is selected in the new view.  There is one
view which always exists, the "All files" view.  If files or
directories are specified on the command line, a "Command line" view
is automatically created and selected at startup.

Thus, for the kernel repository I can have a "PPC" view which shows
changes to arch/powerpc, include/asm-powerpc etc.  When looking at a
commit in that view, I can switch to the "All files" view to see where
that commit fits in the overall history.

There is a "View" menu which contains the menu items for creating,
deleting, editing and selecting views.  If you check the "Remember
this view" box, gitk will write the definition of the view to your
~/.gitk file, and it will be automatically put in the list on startup.

I plan to add various other kinds of views, for example, a view that
shows only the commits that affect a selected file (or part of a file,
perhaps), and a view that shows just the current commit together with
all the commits that have tags.  (The latter will require some help
from git-rev-list. :)

Paul.


* What's in git.git
From: Junio C Hamano @ 2006-04-26 11:09 UTC
  To: git; +Cc: linux-kernel

* The 'maint' branch has fixes mentioned in the 1.3.1 
  announcement.

  As I outlined in the 1.3.1 maintenance release announcement,
  people with that release will soon be missing many
  improvements.  The following is a list of what to expect.


* In addition to the above, the 'master' branch has these since
  the last announcement:

  - git-update-index --chmod=+x now affects all the subsequent
    files (Alex Riesen).

  - git-update-index --unresolve paths...; this needs
    documentation (hint).

  - minor "diff --stat" and "show --stat" fixes.

  - Makefile dependency fixes.  This fixes the infamous
    "libgit.a still contains stale diff.o" problem.

  - contrib has colordiff that understands --cc output.

  - beginning of libified "git diff" family.

  - git-commit-tree <ent> -p <parent> now takes extended SHA1
    expression, not limited to 40-byte SHA1, for <ent> (it
    already did so for <parent>).

  - updated gitk to handle repositories with large number of
    tags and heads (Paul).


* The 'next' branch, in addition, has these.

  - internal log/show/whatchanged family (Linus and me).

  - beginning of internal format-patch.

  - Geert's similarity code in contrib/

  - cache-tree optimization to speed up git-apply + write-tree
    cycles.

    Initially I was getting close to 50% improvement, but
    re-benching suggests it is more like 16%.  An earlier
    version in 'next' used a separate .git/index.aux to record
    the cache-tree information but now it is stored as part of
    the index.  If you used previous 'next' (ha, ha) version and
    see tmp-indexXXXX.aux or next-indexXXXX.aux files left in
    your $GIT_DIR, they can safely be removed.

  - more "diff --stat" fixes.

  - git-cvsserver: typofixes.

  - diff-delta interface reorganization (Nico)

  - git-repo-config --list (Pasky)


* The 'pu' branch, in addition, has these.

  - resurrect "bind commit"; this has been done only partially.

    I have not updated the rev-list/fsck-objects yet.  Probably
    need to drop the specific "bind " line and replace it with
    "link object bind" in the commit objects before going
    forward.

  - get_sha1(): :path and :[0-3]:path to extract from index.

  - Loosening path argument check a little bit in revision.c.

    I've been meaning to do the opposite of this, the tightening
    of the ambiguous case mentioned by Linus, but haven't got
    around to it yet (I haven't got around to too many things,
    hint hint).

  - reverse the pack-objects delta window logic (Nico)

    This is in theory the right thing to do, but things are not
    quite there yet.  But Nico is on top of it so we will see
    quite an improvement in the pack generation hopefully very
    soon.


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Andreas Ericsson @ 2006-04-26 11:25 UTC
  To: sean; +Cc: Linus Torvalds, junkio, git, jnareb
In-Reply-To: <BAYC1-PASMTP086A906CFB378AB229C2D8AEBF0@CEZ.ICE>

sean wrote:
> On Tue, 25 Apr 2006 08:40:25 -0700 (PDT)
> Linus Torvalds <torvalds@osdl.org> wrote:
> 
> 
>>On Tue, 25 Apr 2006, Linus Torvalds wrote:
>>
>>>I want the git objects to have clear and unambiguous semantics. I want 
>>>people to be able to explain exactly what the fields _mean_. No "this 
>>>random field could be used this random way" crud, please.
>>
>>Btw, if the whole point is a "leave random porcelain a field that they can 
>>use any way they want", then I say "Hell NO!".
>>
>>Random porcelain can already just maintain their own lists of "related" 
>>stuff, any way they want: you can keep it in a file in ".git/porcelain", 
>>called "list-commit-relationships", or you could use a git blob for it and 
>>have a reference to it in .git/refs/porcelain/relationships or whatever. 
>>
>>If it has no clear and real semantic meaning for core git, then it 
>>shouldn't be in the core git objects.
>>
>>The absolute last thing we want is a "random out" that starts to mean 
>>different things to different people, groups and porcelains.
>>
>>That's just crazy, and it's how you end up with a backwards compatibility 
>>mess five years from now that is totally unresolvable, because different 
>>projects end up having different meanings or uses for the fields, so 
>>converting the database (if we ever find a better format, or somebody 
>>notices that SHA1 can be broken by a five-year-old-with-a-crayon).
>>
>>There's a reason "minimalist" actually ends up _working_. I'll take a UNIX 
>>"system calls have meanings" approach over a Windows "there's fifteen 
>>different flavors of 'open()', and we also support magic filenames with 
>>specific meaning" kind of thing.
>>
> 
> 
> It's a fair point.  But adding a separate database to augment the core 
> information has some downsides.  That is, that information isn't pulled, 
> cloned, or pushed automatically; it doesn't get to ride for free on top 
> of the core.
> 
> Accommodating extra git headers (or "note"'s in Junio's example) would allow
> a developer to record the fact that he is integrating a patch taken 
> from a commit in the devel branch and backporting it to the release 
> branch.   Either by adding a note that references the bug tracking #, or 
> a commit sha1 from the devel branch that is already associated with the bug.
> 

This information is something I, as a human, would definitely want to 
read. What's the point of recording it in the commit-header if we're not 
going to show it to users anyway? I'm with Linus on this one. Keep 
headers as simple as possible.

> Of course that information could be embedded in the free text area, but 
> you yourself have argued vigorously that it is brain damaged to try and rely
> on parsing free form text for these types of situations.

Why would there be a need to parse it? The entire *point* of history is 
to present it to readers in as accessible and understandable a way as 
possible. Git's sha1 hashes mean absolutely nothing, so a note saying 
something was cherry-picked from commit 
"89987987ad987aef987987aff987987d" on branch "devel" will be pointless 
unless the one doing the committing states the why as well as the what 
in the commit-message anyways.

Besides, only developers will likely ever look at the commit-messages, 
and they will likely only ever do it when they are bisecting or looking 
for the implementation date of a certain feature or other.

>  Most of the potential 
> uses aren't really meant for a human to read while looking at the log anyway, 
> they just get in the way.

I still fail to see a use case for this. Could you give me some examples 
of when recorded information isn't meant to be presented to the user?

> 
> But if the information is in the actual commit header it gets to tag along
> for free with never any worry it will be separated from the commit in question.
> So when the developer above updates his official repo the bug tracker system 
> can notice that the bug referenced in its system has had a patch backported 
> and take whatever action is desired.  
> 

We already have something like this. Every commit with a top-line message 
containing "bug #" followed by a number automatically updates our 
bugtracking system with the commit-message in its entirety. If the word 
before "bug #" matches "fix.*" then the status of the bug is set to that.

This might seem cumbersome to some but it's really very straightforward, 
and for a couple of reasons it's a very good solution:
1. Devs who Do It Right don't have to fiddle with their browser just to 
enter the info twice, so they learn fast. :)
2. BT history (viewed by non-devs too) gets updated with accurate 
information promptly.
3. No matter how you solve the problem you're going to need to write a 
custom commit/update hook anyway, so this is as good as having the info 
in the note.
4. The info going to the BT is easily modifiable, so if someone screws 
up they can fix it later. Fixing an already written git commit takes 
some doing if there are commits on top.
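
The top-line convention described above could be parsed roughly like 
this. It is a sketch under assumptions, not our actual hook: the 
parse_bug_line name, the "<number> <status>" output format and the "-" 
placeholder for "no status change" are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a commit message on stdin and print
# "<bug-number> <status>" if its first line mentions "bug #<n>".
# Status is the word just before "bug #" when that word starts with
# "fix", otherwise "-".  Returns non-zero if no bug is referenced.
parse_bug_line () {
	top=$(sed -n 1p)
	num=$(printf '%s\n' "$top" |
		sed -n 's/.*bug #\([0-9][0-9]*\).*/\1/p')
	test -n "$num" || return 1
	# The leading space lets the pattern anchor on a word boundary.
	status=$(printf ' %s\n' "$top" |
		sed -n 's/.* \(fix[^ ]*\) bug #[0-9].*/\1/p')
	echo "$num ${status:--}"
}

printf 'fixes bug #42: handle empty input\n\nLonger text.\n' | parse_bug_line
# prints "42 fixes"
```

A real update hook would feed each new commit's message through 
something like this and then talk to the bug tracker; that part is 
site-specific and left out here.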

> Of course there are other ways to do this, but integrating it into git means it
> gets a free ride on the core, and it shouldn't really get in the way of core 
> any more than email X- headers get in the way of email flowing.
> 

True. I've suggested before that arbitrary headers could be added to git 
commits by prefixing them with X- (preferably followed by an abbrev of 
the porcelain name adding the note). This way it's easy to filter, you 
get the free ride, and porcelains can do whatever they want while core 
git can strip everything following the sequence "\nX-" up to and 
including the next newline.

This way you have only one special byte-sequence with special meaning 
that the plumbing has to know it should ignore, which is a lot more 
extensible (not to mention easier to code).
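
The stripping rule could be sketched as a small filter. This is only an 
illustration of the idea: strip_x_headers is a hypothetical name, the 
header is assumed to end at the first blank line, and T and P stand in 
for real object names. Nothing here is existing git behaviour.

```shell
#!/bin/sh
# Hypothetical filter: drop "X-" lines from the header part of a
# commit object's text (everything up to the first blank line),
# leaving the free-form message untouched.
strip_x_headers () {
	sed '1,/^$/{/^X-/d;}'
}

printf 'tree T\nparent P\nX-cogito-note: picked\n\nX-ray support\n' |
strip_x_headers
```

Note that the message line starting with "X-ray" survives, since only 
the header block is filtered.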

In addition, if those X- lines aren't included in the sha1 computation 
they can easily be removed and added to without affecting the ancestry 
chain. This would probably have quite a performance impact though.

That said, I don't think even "X-" headers are a very good idea. Perhaps 
I've just got a poor imagination but I can't think of a good use for them.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* [PATCH] Alter git-rebase command line options.
From: sean @ 2006-04-26 11:51 UTC
  To: git


  git rebase [--branch <branch>] <newbase>
  git rebase --continue
  git rebase --abort

Add "--continue" to restart the rebase process after
manually resolving conflicts.  The user is warned if
there are still differences between the index and the
working files.

Add "--abort" to restore the original branch, and
remove the .dotest working files.

Change the order that branch and newbase are specified
as per comments from Linus.  Also remove the need to
specify both an upstream branch _and_ a new merge base.

The documentation is updated to reflect this new command
line format but the script still quietly supports the
existing command line options completely.

This fixes a minor bug in the current version where:
"git rebase master^ master" doesn't notice that there
is no need to perform the rebase.

---

 Documentation/git-rebase.txt |   95 ++++++++++++++++++++++--------
 git-rebase.sh                |  133 ++++++++++++++++++++++--------------------
 2 files changed, 139 insertions(+), 89 deletions(-)

d8366d9de1aecf3143af646f49e7f7bc0f924ae6
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 4a7e67a..f1e83ea 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -3,76 +3,121 @@ git-rebase(1)
 
 NAME
 ----
-git-rebase - Rebase local commits to new upstream head
+git-rebase - Rebase local commits to a new upstream head
 
 SYNOPSIS
 --------
-'git-rebase' [--onto <newbase>] <upstream> [<branch>]
+'git-rebase' [--branch <branch>] <newbase>
+
+'git-rebase' --continue
+
+'git-rebase' --abort
 
 DESCRIPTION
 -----------
-git-rebase applies to <upstream> (or optionally to <newbase>) commits
-from <branch> that do not appear in <upstream>. When <branch> is not
-specified it defaults to the current branch (HEAD).
+git-rebase replaces <branch> with a new branch of the same name having
+a HEAD of <newbase>.  It then attempts to make a new commit for each
+commit from the original <branch> that does not yet exist in this new
+<branch>.
+
+It is possible that a merge failure will prevent this process from being
+completely automatic.  You will have to resolve any such merge failure
+and run `git rebase --continue`.  If you can not resolve the merge
+failure, running `git rebase --abort` will restore the original <branch>
+and remove the working files found in the .dotest directory.
 
-When git-rebase is complete, <branch> will be updated to point to the
-newly created line of commit objects, so the previous line will not be
-accessible unless there are other references to it already.
+Note that if <branch> is not specified on the command line, the currently
+checked out branch is used.
 
 Assume the following history exists and the current branch is "topic":
 
+------------
           A---B---C topic
          /
     D---E---F---G master
+------------
+
+From this point, the result of running the following command:
+
 
-From this point, the result of either of the following commands:
+    git rebase --branch topic master
 
-    git-rebase master
-    git-rebase master topic
 
 would be:
 
+------------
                   A'--B'--C' topic
                  /
     D---E---F---G master
+------------
 
 While, starting from the same point, the result of either of the following
 commands:
 
-    git-rebase --onto master~1 master
-    git-rebase --onto master~1 master topic
+    git rebase master~1
+    git rebase --branch topic master~1
+
 
 would be:
 
+------------
               A'--B'--C' topic
              /
     D---E---F---G master
+------------
 
 In case of conflict, git-rebase will stop at the first problematic commit
-and leave conflict markers in the tree.  After resolving the conflict manually
-and updating the index with the desired resolution, you can continue the
-rebasing process with
+and leave conflict markers in the tree.  You can use git diff to locate
+the markers (<<<<<<) and make edits to resolve the conflict.  For each
+file you edit, you need to tell git that the conflict has been resolved,
+typically this would be done with
+
+
+    git update-index <filename>
+
+
+After resolving the conflict manually and updating the index with the
+desired resolution, you can continue the rebasing process with
+
+
+    git rebase --continue
 
-    git am --resolved --3way
 
 Alternatively, you can undo the git-rebase with
 
-    git reset --hard ORIG_HEAD
-    rm -r .dotest
+
+    git rebase --abort
 
 OPTIONS
 -------
 <newbase>::
-	Starting point at which to create the new commits. If the
-	--onto option is not specified, the starting point is
-	<upstream>.
-
-<upstream>::
-	Upstream branch to compare against.
+	Starting point at which to create the new commits.
 
 <branch>::
 	Working branch; defaults to HEAD.
 
+--continue::
+	Restart the rebasing process after having resolved a merge conflict.
+
+--abort::
+	Restore the original branch and abort the rebase operation.
+
+NOTES
+-----
+When you rebase a branch, you are changing its history in a way that
+will cause problems for anyone who already has a copy of the branch
+in their repository and tries to pull updates from you.  You should
+understand the implications of using 'git rebase' on a repository that
+you share.
+
+When the git rebase command is run, it will first execute a "pre-rebase"
+hook if one exists.  You can use this hook to do sanity checks and
+reject the rebase if it isn't appropriate.  Please see the template
+pre-rebase hook script for an example.
+
+You must be in the top directory of your project to start (or continue)
+a rebase.  Upon completion, <branch> will be the current branch.
+
 Author
 ------
 Written by Junio C Hamano <junkio@cox.net>
diff --git a/git-rebase.sh b/git-rebase.sh
index 86dfe9c..5a4e33b 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -3,40 +3,61 @@ #
 # Copyright (c) 2005 Junio C Hamano.
 #
 
-USAGE='[--onto <newbase>] <upstream> [<branch>]'
-LONG_USAGE='git-rebase applies to <upstream> (or optionally to <newbase>) commits
-from <branch> that do not appear in <upstream>. When <branch> is not
-specified it defaults to the current branch (HEAD).
-
-When git-rebase is complete, <branch> will be updated to point to the
-newly created line of commit objects, so the previous line will not be
-accessible unless there are other references to it already.
-
-Assuming the following history:
-
-          A---B---C topic
-         /
-    D---E---F---G master
-
-The result of the following command:
-
-    git-rebase --onto master~1 master topic
-
-  would be:
-
-              A'\''--B'\''--C'\'' topic
-             /
-    D---E---F---G master
+USAGE='[--branch <branch>] <newbase>'
+LONG_USAGE='git-rebase replaces <branch> with a new one of the
+same name having a HEAD of <newbase>.  It then attempts to create
+a new commit for each commit from the original <branch> that does
+not yet exist on this new <branch>.
+
+It is possible that a merge failure will prevent this process
+from being completely automatic.  You will have to resolve any
+such merge failure and run git-rebase --continue.  If you can
+not resolve the merge failure, running git-rebase --abort will
+restore the original <branch> and remove the working files found
+in the .dotest directory.
+
+Note that if <branch> is not specified on the command line, the
+currently checked out branch is used.  You must be in the top
+directory of your project to start (or continue) a rebase.
+
+Example:       git-rebase --branch topic master~1
+
+        A---B---C topic                   A'\''--B'\''--C'\'' topic
+       /                   -->           /
+  D---E---F---G master          D---E---F---G master
 '
 
 . git-sh-setup
 
 unset newbase
+unset branch_name
 while case "$#" in 0) break ;; esac
 do
 	case "$1" in
+	--continue)
+		diff=$(git-diff-files)
+		case "$diff" in
+		?*)	echo "You must edit all merge conflicts and then"
+			echo "mark them as resolved using git update-index"
+			exit 1
+			;;
+		esac
+		git am --resolved --3way
+		exit
+		;;
+	--abort)
+		[ -d .dotest ] || die "No rebase in progress?"
+		git reset --hard ORIG_HEAD
+		rm -r .dotest
+		exit
+		;;
+	--branch)
+		test $# -ne 3 -o -n "$newbase" && usage
+		branch_name="$2"
+		shift
+		;;
 	--onto)
-		test 2 -le "$#" || usage
+		test $# -lt 2 -o -n "$branch_name" && usage
 		newbase="$2"
 		shift
 		;;
@@ -49,6 +70,20 @@ do
 	esac
 	shift
 done
+# Quietly support the historic command line [--onto newbase] newbase' [branch]
+test $# -lt 1 && usage
+test -z "$newbase" && newbase="$1"
+shift
+if [ -z "$branch_name" ]; then
+	if [ $# -gt 0 ]; then
+		branch_name="$1"
+		shift
+	else	branch_name=`git symbolic-ref HEAD` || die "No current branch"
+		branch_name=`expr "z$branch_name" : 'zrefs/heads/\(.*\)'`
+	fi
+fi
+test $# -gt 0 && usage
+git checkout "$branch_name" || usage
 
 # Make sure we do not have .dotest
 if mkdir .dotest
@@ -72,11 +107,6 @@ case "$diff" in
 	;;
 esac
 
-# The upstream head must be given.  Make sure it is valid.
-upstream_name="$1"
-upstream=`git rev-parse --verify "${upstream_name}^0"` ||
-    die "invalid upstream $upstream_name"
-
 # If a hook exists, give it a chance to interrupt
 if test -x "$GIT_DIR/hooks/pre-rebase"
 then
@@ -86,47 +116,22 @@ then
 	}
 fi
 
-# If the branch to rebase is given, first switch to it.
-case "$#" in
-2)
-	branch_name="$2"
-	git-checkout "$2" || usage
-	;;
-*)
-	branch_name=`git symbolic-ref HEAD` || die "No current branch"
-	branch_name=`expr "z$branch_name" : 'zrefs/heads/\(.*\)'`
-	;;
-esac
-branch=$(git-rev-parse --verify "${branch_name}^0") || exit
-
 # Make sure the branch to rebase onto is valid.
-onto_name=${newbase-"$upstream_name"}
-onto=$(git-rev-parse --verify "${onto_name}^0") || exit
-
-# Now we are rebasing commits $upstream..$branch on top of $onto
+branch=$(git-rev-parse --verify "${branch_name}^0") || exit
+onto=$(git-rev-parse --verify "${newbase}^0") || exit
 
 # Check if we are already based on $onto, but this should be
 # done only when upstream and onto are the same.
-if test "$upstream" = "onto"
-then
-	mb=$(git-merge-base "$onto" "$branch")
-	if test "$mb" = "$onto"
-	then
-		echo >&2 "Current branch $branch_name is up to date."
-		exit 0
-	fi
-fi
-
-# Rewind the head to "$onto"; this saves our current head in ORIG_HEAD.
-git-reset --hard "$onto"
-
-# If the $onto is a proper descendant of the tip of the branch, then
-# we just fast forwarded.
+mb=$(git-merge-base "$onto" "$branch")
 if test "$mb" = "$onto"
 then
-	echo >&2 "Fast-forwarded $branch to $newbase."
+	echo >&2 "Current branch $branch_name already has $newbase as a base!"
 	exit 0
 fi
 
-git-format-patch -k --stdout --full-index "$upstream" ORIG_HEAD |
+# Rewind the head to "$onto"; this saves our current head in ORIG_HEAD.
+git-reset --hard "$newbase"
+
+# Now we are rebasing commits $newbase..$branch on top of $newbase
+git-format-patch -k --stdout --full-index "$newbase" ORIG_HEAD |
 git am --binary -3 -k
-- 
1.3.0.gd8366


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Jakub Narebski @ 2006-04-26 12:01 UTC
  To: git
In-Reply-To: <444F58B0.6090603@op5.se>

Andreas Ericsson wrote:

> I've suggested before that arbitrary headers could be added to git
> commits by prefixing them with X- (preferably followed by an abbrev of
> the porcelain name adding the note). This way it's easy to filter, you
> get the free ride, and porcelains can do whatever they want while core
> git can strip everything following the sequence "\nX-" up to and
> including the next newline.
> 
> This way you have only one special byte-sequence with special meaning
> that the plumbing has to know it should ignore, which is a lot more
> extensible (not to mention easier to code).
> 
> In addition, if those X- lines aren't included in the sha1 computation
> they can easily be removed and added to without affecting the ancestry
> chain. This would probably have quite a performance impact though.
> 
> That said, I don't think even "X-" headers is a very good idea. Perhaps
> I've just got poor imagination but I can't think of a good use for them.

Well, the "note" headers are just that, but instead of prefixing 'extra'
headers with "X-" you prefix them with "note ".

I think that the "note" (or X-) headers should be included in calculating
sha1, as the free-form of commit (the comment) is.

As to use: for now 'git cherry-pick' and 'git revert' record the commit
picked or the commit reverted in free form. It could be recorded in a "note"
header instead, or additionally as a "note" header. 'git rebase' could also record
the original commit e.g. as "note original <branchname> <sha1-of-commit>".
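
To make the idea concrete, here is a rough sketch of how a Porcelain might
pick such headers out of a commit. The commit text, the object names and the
"note original" line are all made up for illustration; git does not write or
read "note" headers today, this only assumes the format suggested above.

```shell
#!/bin/sh
# A made-up commit body. The "note" header follows the format
# proposed above; it is an assumption, not something git writes.
commit='tree 1111111111111111111111111111111111111111
parent 2222222222222222222222222222222222222222
author A U Thor <author@example.com> 1146052800 +0000
committer A U Thor <author@example.com> 1146052800 +0000
note original topic 3333333333333333333333333333333333333333

Rebased commit

note this line is free-form text, not a header'

# Print only "note" headers. The first blank line ends the header
# block, so awk stops there and never looks at the message text.
notes=$(printf '%s\n' "$commit" | awk '/^$/ { exit } /^note / { print }')
echo "$notes"
```

Because the scan stops at the first blank line, the free-form comment area
stays untouched, which is the point of keeping such information in headers.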

And it would be the place for Porcelain to record simple information which
is of use to them, but usually not interesting to the user, so it would be
better if it didn't pollute the free-form/comment area.


The "prior" (for saving "pu"-like branches previous state) and "bind" (for
managing subprojects) I think should rather be of the "related"/"link" kind.

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH] Alter git-rebase command line options.
From: Jakub Narebski @ 2006-04-26 12:30 UTC (permalink / raw)
  To: git
In-Reply-To: <BAYC1-PASMTP0659C709B7FFCB63182FE1AEBC0@CEZ.ICE>

sean wrote:

>   git rebase [--branch <branch>] <newbase>
>   git rebase --continue
>   git rebase --abort
> 
> Add "--continue" to restart the rebase process after
> manually resolving conflicts.  The user is warned if
> there are still differences between the index and the
> working files.
> 
> Add "--abort" to restore the original branch, and
> remove the .dotest working files.

Very nice.

>  SYNOPSIS
>  --------
> -'git-rebase' [--onto <newbase>] <upstream> [<branch>]
> +'git-rebase' [--branch <branch>] <newbase>
> +
> +'git-rebase' --continue
> +
> +'git-rebase' --abort
>  
>  DESCRIPTION
>  -----------
> -git-rebase applies to <upstream> (or optionally to <newbase>) commits
> -from <branch> that do not appear in <upstream>. When <branch> is not
> -specified it defaults to the current branch (HEAD).
> +git-rebase replaces <branch> with a new branch of the same name having
> +a HEAD of <newbase>.  It then attempts to make a new commit for each
> +commit from the original <branch> that does not yet exist in this new
> +<branch>.

What about 'git-rebase --onto <newbase> <upstream> <branch>' three options
version?

-- 
Jakub Narebski
Warsaw, Poland


* Re: [RFC] [PATCH 0/5] Implement 'prior' commit object links (and other commit links ideas)
From: Jakub Narebski @ 2006-04-26 12:42 UTC (permalink / raw)
  To: git
In-Reply-To: <7vslo1v4zw.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

> My initial 'related' without 'note' was flawed - it used
> cherry-pick as an example of 'related' when it clearly should
> have been 'note' (no connectivity required).
[...]
> There definitely needs to be an ability to specify a list of
> "nature of links this repository accepts", if we were to do
> 'link'.  It probably should default to an empty set.  rev-list
> --objects would include objects pointed by 'link' only when the
> repository wants such links to be honored.  fsck-objects will
> declare an object that is reachable only by a 'link' that is not
> accepted by the repository "uninteresting" and let git-prune
> remove it.

I think that perhaps connectivity should be more fine-grained than this.
Namely we might want links which are neither fsck-able nor pulled (and can be
dangling), but will prevent the object pointed to from being pruned. The
"original" (or "cherrypick") relation comes to mind.

Of course that can be configured per repository...

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH] Alter git-rebase command line options.
From: sean @ 2006-04-26 13:04 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e2np4p$b9a$1@sea.gmane.org>

On Wed, 26 Apr 2006 14:30:47 +0200
Jakub Narebski <jnareb@gmail.com> wrote:
> 
> What about 'git-rebase --onto <newbase> <upstream> <branch>' three options
> version?

Ahh yes, I didn't look closely enough at that, and got fooled by a bug
in the current version[1] into thinking it was never used anyway.  Will have
to respin the script and come up with some docs.  What's a reason someone
would want or need to use this three option version?

Sean

[1]  Line 110:  if test "$upstream" = "onto"
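
To spell out the footnote, here is a small sketch of why that test never
fires. The two values are made-up stand-ins for what git-rebase.sh computes;
any pair of identical names behaves the same way.

```shell
#!/bin/sh
# Hypothetical stand-ins for the values git-rebase.sh works with.
upstream=0123abcd
onto=0123abcd

# Form quoted in [1]: compares $upstream with the literal word "onto",
# so the "already up to date" branch is effectively never taken.
if test "$upstream" = "onto"
then buggy=taken
else buggy=skipped
fi

# With the missing "$": compares the two variables.
if test "$upstream" = "$onto"
then fixed=taken
else fixed=skipped
fi

echo "buggy=$buggy fixed=$fixed"
```

With identical values this prints "buggy=skipped fixed=taken".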


* Re: new gitk feature
From: Jan-Benedict Glaw @ 2006-04-26 13:57 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17487.21137.344427.173131@cargo.ozlabs.ibm.com>

On Wed, 2006-04-26 20:59:29 +1000, Paul Mackerras <paulus@samba.org> wrote:
> Thus, for the kernel repository I can have a "PPC" view which shows
> changes to arch/powerpc, include/asm-powerpc etc.  When looking at a
> commit in that view, I can switch to the "All files" view to see where
> that commit fits in the overall history.

Hmm.. Neat feature for arch maintainers. An easy way to see what's
happening in the i386 tree for example :)

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 für einen Freien Staat voll Freier Bürger"  | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));


* [PATCH] Add --continue and --abort options to git-rebase.
From: sean @ 2006-04-26 14:49 UTC (permalink / raw)
  To: git
In-Reply-To: <e2np4p$b9a$1@sea.gmane.org>


  git rebase [--onto <newbase>] <upstream> [<branch>]
  git rebase --continue
  git rebase --abort

Add "--continue" to restart the rebase process after
manually resolving conflicts.  The user is warned if
there are still differences between the index and the
working files.

Add "--abort" to restore the original branch, and
remove the .dotest working files.

This fixes a minor bug in the current version where:
"git rebase master^ master" doesn't notice that there
is no need to perform the rebase.

Some minor additions to the git-rebase documentation.

---

Take 2.  Much simpler patch which doesn't try to
rejigger the command line too much.

 Documentation/git-rebase.txt |   76 +++++++++++++++++++++++++++++++++++-------
 git-rebase.sh                |   64 ++++++++++++++++++++++-------------
 2 files changed, 102 insertions(+), 38 deletions(-)

b009f7b17dce8f860f242f9cafc2aa510daf9f41
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 4a7e67a..cf74005 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -3,38 +3,54 @@ git-rebase(1)
 
 NAME
 ----
-git-rebase - Rebase local commits to new upstream head
+git-rebase - Rebase local commits to a new head
 
 SYNOPSIS
 --------
 'git-rebase' [--onto <newbase>] <upstream> [<branch>]
 
+'git-rebase' --continue
+
+'git-rebase' --abort
+
 DESCRIPTION
 -----------
-git-rebase applies to <upstream> (or optionally to <newbase>) commits
-from <branch> that do not appear in <upstream>. When <branch> is not
-specified it defaults to the current branch (HEAD).
+git-rebase replaces <branch> with a new branch of the same name.  When
+the --onto option is provided the new branch starts out with a HEAD equal
+to <newbase>, otherwise it is equal to <upstream>.  It then attempts to
+create a new commit for each commit from the original <branch> that does
+not exist in the <upstream> branch.
 
-When git-rebase is complete, <branch> will be updated to point to the
-newly created line of commit objects, so the previous line will not be
-accessible unless there are other references to it already.
+It is possible that a merge failure will prevent this process from being
+completely automatic.  You will have to resolve any such merge failure
+and run `git rebase --continue`.  If you can not resolve the merge
+failure, running `git rebase --abort` will restore the original <branch>
+and remove the working files found in the .dotest directory.
+
+Note that if <branch> is not specified on the command line, the currently
+checked out branch is used.
 
 Assume the following history exists and the current branch is "topic":
 
+------------
           A---B---C topic
          /
     D---E---F---G master
+------------
 
 From this point, the result of either of the following commands:
 
+
     git-rebase master
     git-rebase master topic
 
 would be:
 
+------------
                   A'--B'--C' topic
                  /
     D---E---F---G master
+------------
 
 While, starting from the same point, the result of either of the following
 commands:
@@ -44,21 +60,33 @@ commands:
 
 would be:
 
+------------
               A'--B'--C' topic
              /
     D---E---F---G master
+------------
 
 In case of conflict, git-rebase will stop at the first problematic commit
-and leave conflict markers in the tree.  After resolving the conflict manually
-and updating the index with the desired resolution, you can continue the
-rebasing process with
+and leave conflict markers in the tree.  You can use git diff to locate
+the markers (<<<<<<<) and make edits to resolve the conflict.  For each
+file you edit, you need to tell git that the conflict has been resolved,
+typically this would be done with
+
+
+    git update-index <filename>
+
+
+After resolving the conflict manually and updating the index with the
+desired resolution, you can continue the rebasing process with
+
+
+    git rebase --continue
 
-    git am --resolved --3way
 
 Alternatively, you can undo the git-rebase with
 
-    git reset --hard ORIG_HEAD
-    rm -r .dotest
+
+    git rebase --abort
 
 OPTIONS
 -------
@@ -73,6 +101,28 @@ OPTIONS
 <branch>::
 	Working branch; defaults to HEAD.
 
+--continue::
+	Restart the rebasing process after having resolved a merge conflict.
+
+--abort::
+	Restore the original branch and abort the rebase operation.
+
+NOTES
+-----
+When you rebase a branch, you are changing its history in a way that
+will cause problems for anyone who already has a copy of the branch
+in their repository and tries to pull updates from you.  You should
+understand the implications of using 'git rebase' on a repository that
+you share.
+
+When the git rebase command is run, it will first execute a "pre-rebase"
+hook if one exists.  You can use this hook to do sanity checks and
+reject the rebase if it isn't appropriate.  Please see the template
+pre-rebase hook script for an example.
+
+You must be in the top directory of your project to start (or continue)
+a rebase.  Upon completion, <branch> will be the current branch.
+
 Author
 ------
 Written by Junio C Hamano <junkio@cox.net>
diff --git a/git-rebase.sh b/git-rebase.sh
index 86dfe9c..2085ebe 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -4,37 +4,51 @@ # Copyright (c) 2005 Junio C Hamano.
 #
 
 USAGE='[--onto <newbase>] <upstream> [<branch>]'
-LONG_USAGE='git-rebase applies to <upstream> (or optionally to <newbase>) commits
-from <branch> that do not appear in <upstream>. When <branch> is not
-specified it defaults to the current branch (HEAD).
-
-When git-rebase is complete, <branch> will be updated to point to the
-newly created line of commit objects, so the previous line will not be
-accessible unless there are other references to it already.
-
-Assuming the following history:
-
-          A---B---C topic
-         /
-    D---E---F---G master
-
-The result of the following command:
-
-    git-rebase --onto master~1 master topic
-
-  would be:
-
-              A'\''--B'\''--C'\'' topic
-             /
-    D---E---F---G master
+LONG_USAGE='git-rebase replaces <branch> with a new branch of the
+same name.  When the --onto option is provided the new branch starts
+out with a HEAD equal to <newbase>, otherwise it is equal to <upstream>.
+It then attempts to create a new commit for each commit from the original
+<branch> that does not exist in the <upstream> branch.
+
+It is possible that a merge failure will prevent this process from being
+completely automatic.  You will have to resolve any such merge failure
+and run git-rebase --continue.  If you can not resolve the merge failure,
+running git-rebase --abort will restore the original <branch> and remove
+the working files found in the .dotest directory.
+
+Note that if <branch> is not specified on the command line, the
+currently checked out branch is used.  You must be in the top
+directory of your project to start (or continue) a rebase.
+
+Example:       git-rebase master~1 topic
+
+        A---B---C topic                   A'\''--B'\''--C'\'' topic
+       /                   -->           /
+  D---E---F---G master          D---E---F---G master
 '
-
 . git-sh-setup
 
 unset newbase
 while case "$#" in 0) break ;; esac
 do
 	case "$1" in
+	--continue)
+		diff=$(git-diff-files)
+		case "$diff" in
+		?*)	echo "You must edit all merge conflicts and then"
+			echo "mark them as resolved using git update-index"
+			exit 1
+			;;
+		esac
+		git am --resolved --3way
+		exit
+		;;
+	--abort)
+		[ -d .dotest ] || die "No rebase in progress?"
+		git reset --hard ORIG_HEAD
+		rm -r .dotest
+		exit
+		;;
 	--onto)
 		test 2 -le "$#" || usage
 		newbase="$2"
@@ -107,7 +121,7 @@ # Now we are rebasing commits $upstream.
 
 # Check if we are already based on $onto, but this should be
 # done only when upstream and onto are the same.
-if test "$upstream" = "onto"
+if test "$upstream" = "$onto"
 then
 	mb=$(git-merge-base "$onto" "$branch")
 	if test "$mb" = "$onto"
-- 
1.3.0.gb009


* Re: new gitk feature
From: Linus Torvalds @ 2006-04-26 15:09 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17487.21137.344427.173131@cargo.ozlabs.ibm.com>



On Wed, 26 Apr 2006, Paul Mackerras wrote:
>
> I just pushed some changes to gitk which add a new feature, the
> ability to have multiple "views" of a repository.  Each view is a
> subgraph of the full graph.  At the moment the only subgraph that you
> can specify is the subgraph containing the commits that affect a
> specified set of files or directories.  You can switch between views
> quickly, and if the currently selected commit exists in the new view
> when you switch views, it is selected in the new view.

This gets close to something I wanted, but at the same time falls very 
short of it because the views are always shown completely disjoint.

I've wanted for a long time to have a way to _highlight_ commits. That's 
actually very much a "view" thing, but it's a mode where you really see 
one view, but the commits that exist in another view have a different 
color (or have the commits that _don't_ exist in the other view be grayed 
out).

I hope that your new "view" thing would support this notion too: instead 
of having to totally switch between views, it would be wonderful if you 
could have one "master view" and then use another view to "highlight".

Also, I think revision information should be part of a view. For example, 
in the "highlight" case, I'd love to have the "main view" be the default 
"everything", and then have some way to _highlight_ the view that is 
defined by the revision pattern "v1.3.1.."

Any possibility of something like that? I'd _love_ to be able to see the 
whole tree, but with things that touch certain files or things that are 
newer highlighted.

(Btw, the "revision information" is also cool things like "--unpacked". I 
actually use "gitk --unpacked" every once in a while, just because it's 
such a cool way to say "show me everything I've added since I packed the 
repo last").

			Linus


* Re: lstat() call in rev-parse.c
From: Matthias Lederhofer @ 2006-04-26 15:28 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0604230906370.3701@g5.osdl.org>

> So the rule is: if you don't give that "--", then we have to be able to 
> confirm that the filenames are really files. Not a misspelled revision 
> name, or a revision name that was correctly spelled, but for the wrong 
> project, because you were in the wrong subdirectory ;)

Shouldn't git rev-parse try to stat the file (additionally?) in the
current directory instead of the top git directory? git (diff|log|..)
seem to fail every time in a subdirectory without --.


* Re: lstat() call in rev-parse.c
From: Linus Torvalds @ 2006-04-26 15:43 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git
In-Reply-To: <E1FYlwn-0005mf-CL@moooo.ath.cx>



On Wed, 26 Apr 2006, Matthias Lederhofer wrote:

> > So the rule is: if you don't give that "--", then we have to be able 
> > to confirm that the filenames are really files. Not a misspelled 
> > revision name, or a revision name that was correctly spelled, but for 
> > the wrong project, because you were in the wrong subdirectory ;)
> 
> Shouldn't git rev-parse try to stat the file (additionally?) in the 
> current directory instead of the top git directory? git (diff|log|..) 
> seem to fail every time in a subdirectory without --.

Good point. However, the reason for that is that it actually _does_ stat 
the file in the current directory, but it has done the 

	revs->prefix = setup_git_directory();

in the init path (and it does need to do that, since that's what figures 
out where the .git directory is, so that we can parse the revisions 
correctly).

And that "setup_git_directory()" will chdir() to the root of the project.

So the "lstat()" should probably take "revs->prefix" into account, the 
way get_pathspec() does. Ie we should probably use

	char *name = argv[i];
	if (revs->prefix)
		name = prefix_filename(revs->prefix, strlen(revs->prefix), name);
	if (lstat(name, ..) < 0)
		die(...)

instead of just a plain lstat().
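
The effect is easy to reproduce outside of git. The following is only a
sketch of the situation, not the proposed code: the directory layout, the
file name and the "prefix" variable are made up, and "test -e" stands in for
lstat() (it follows symlinks, which lstat() does not).

```shell
#!/bin/sh
# Simulate a project root with one file in a subdirectory.
root=$(mktemp -d)
mkdir "$root/sub"
: >"$root/sub/file.c"

# The user ran the command inside sub/ and named "file.c".  After the
# chdir() to the root, only the prefix remembers where they were.
prefix="sub/"
arg="file.c"
cd "$root" || exit 1

# Plain check, relative to the root: the file is not found.
if test -e "$arg"; then plain=found; else plain=missing; fi

# Prefixed check, as suggested above: the file is found.
if test -e "$prefix$arg"; then prefixed=found; else prefixed=missing; fi

echo "plain=$plain prefixed=$prefixed"
cd / && rm -rf "$root"
```

This prints "plain=missing prefixed=found", which matches the failure seen
in a subdirectory without "--".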

Probably worth doing as a small helper function of its own (and get rid of 
the current "die_badfile()" - and do all of that inside the helper 
function).

Somebody?

		Linus


* Re: [PATCH/RFC] reverse the pack-objects delta window logic
From: Nicolas Pitre @ 2006-04-26 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpsj4sxer.fsf@assigned-by-dhcp.cox.net>

On Tue, 25 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > Note, this is a RFC particularly to Junio since the resulting pack is 
> > larger than without the patch with git-repack -a -f.  However using a 
> > subsequent git-repack -a brings the pack size down to expected size.  So 
> > I'm not sure I've got everything right.
> 
> I haven't tested it seriously yet, but there is nothing that
> looks obviously wrong that might cause the inflation problem,
> from the cursory look after applying the patch on top of your
> last round.
> 
> > +	if (nr_objects == nr_result && trg_entry->delta_limit >= max_depth)
> > +		return 0;
> 
> The older code was loosening this check only for a delta chain
> that is already in pack (which is limited to its previous
> max_depth).  The end result is almost the same -- a thin pack
> recipient would have deeper delta than it asked. The difference
> is that the earlier code had implicit 2*max_depth limit,

Ah.  Indeed.  Didn't realize that.  I can restore that behavior quite 
easily if necessary.

> but this one makes the chain length unbounded, which I do not think it 
> is necessarily a bad change.

Well as long as the thin pack doesn't carry too many revisions it should 
be fine since, as the comment in the code says, those packs are always 
unpacked.

Initially I had a bug where the delta depth was completely ignored.  I 
was pretty excited when repacking the kernel produced a pack 20% smaller 
although I didn't know why at that time.  But when attempting another 
git-repack -a -f then the initial object counting was sooooooo 
slooooooooow.

> > -	/*
> > -	 * NOTE!
> > -	 *
> > -	 * We always delta from the bigger to the smaller, since that's
> > -	 * more space-efficient (deletes don't have to say _what_ they
> > -	 * delete).
> > -	 */
> 
> This comment by Linus still applies, even though the scan order
> is now reversed; no need to remove it.

This is not exactly true.  In general it is so, but as we fixed the 
deltification of objects with the same name but in different directories 
it is well possible to go from smaller to larger and leaving that 
comment there is misleading.

This is also why I changed the sizediff rule such that:

	sizediff = src_size < size ? size - src_size : 0;

Since the src buffer already has its delta index computed, it costs 
almost nothing to attempt matching much smaller objects against it.  
However if we go from small to larger then the previous logic still 
applies.
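
As a quick illustration of that rule (a sketch only; the byte counts are
arbitrary and "size" is assumed here to be the size of the delta target):

```shell
#!/bin/sh
# sizediff SRC_SIZE SIZE
# Mirrors: sizediff = src_size < size ? size - src_size : 0;
sizediff () {
	if test "$1" -lt "$2"
	then echo $(( $2 - $1 ))
	else echo 0
	fi
}

sizediff 100 400	# small source, larger target
sizediff 400 100	# larger source, smaller target
```

The first call prints 300 and the second prints 0, so only the
small-to-large direction carries a size difference under this rule.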

> > +	if (trg_entry->delta) {
> > +		/*
> > +		 * The target object already has a delta base but we just
> > +		 * found a better one.  Remove it from its former base
> > +		 * childhood and redetermine the base delta_limit (if used).
> > +		 */
> 
> And you are making the delta chain unbound for thin case, you
> can probably omit this with the same if() here; the
> recomputation seems rather expensive.

Ah right.  I was doing so partly, but I can skip any tree maintenance 
altogether in that case as well.

> > +		if (!size)
> > +			continue;
> > +		delta_index = create_delta_index(n->data, size);
> > +		if (!delta_index)
> > +			die("out of memory");
> 
> It might be worth saying "if (size < 50)" here as well; no point
> wasting the delta window for small sources.

Good point.  No real effect on the pack size though.

> > -#if 0
> > -		/* if we made n a delta, and if n is already at max
> > -		 * depth, leaving it in the window is pointless.  we
> > -		 * should evict it first.
> > -		 * ... in theory only; somehow this makes things worse.
> > -		 */
> > -		if (entry->delta && depth <= entry->depth)
> > -			continue;
> > -#endif
> 
> I was almost tempted to suggest that the degradation you are
> seeing might be related to this mystery I did not get around to
> solve.  By allowing to give chance to try delta against less
> optimum candidates, it appeared that we ended up making the
> final pack size bigger than otherwise, which suggests that our
> choice between plain undeltified and a delta half its size might
> be favoring delta too much.  But it does not appear to be
> related to the inflation you are seeing.

Certainly not, since git-repack -a may only delta _more_ and the pack 
size actually goes down a lot in my case.

The mystery I'm facing is why would a second pass with git-repack -a fix 
things?  It has a different window behavior since objects already 
deltified do not occupy window space. Hmmm.  That would certainly 
explain why doing a git-repack -a after a git-repack -a -f produces a 
smaller pack even currently.

> BTW, have you tried it without --no-reuse-pack on an object list
> that is not thin?  It appears you are busting the depth limit.
> 
> Using the same "git rev-list --objects v1.2.3..v1.3.0" as input,
> git-pack-objects without --no-reuse-pack gives this
> distribution:
> 
> chain length = 1: 364 objects
> chain length = 2: 269 objects
> chain length = 3: 198 objects
> chain length = 4: 164 objects
> chain length = 5: 148 objects
> chain length = 6: 123 objects
> chain length = 7: 122 objects
> chain length = 8: 103 objects
> chain length = 9: 92 objects
> chain length = 10: 234 objects
> chain length = 11: 12 objects
> chain length = 12: 1 object
> chain length = 13: 2 objects

Oops.  OK fixed.

> So it _might_ be that the depth limiting code is subtly broken
> which is causing you throw away a perfectly good delta base
> which in turn results in a bad pack.

Actually no.  That bug instead allowed each given base to deltify more 
targets than it should have.


Nicolas


* [PATCH] git-fetch: resolve remote symrefs for HTTP transport
From: Nick Hengeveld @ 2006-04-26 16:10 UTC (permalink / raw)
  To: git

git-fetch validates that a remote ref resolves to a SHA1 prior to calling
git-http-fetch.  This adds support for resolving a few levels of symrefs
to get to the SHA1.

Signed-off-by: Nick Hengeveld <nickh@reactrix.com>


---

Maybe this isn't the right way to handle this - since we're already
calling perl we could use LWP to do the transfers (using keepalive
even?) or we could let git-http-fetch take care of it and deal with
remote names that don't resolve.  It may also make sense to modify
git-http-fetch so it can fetch more than one head at a time.
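
For reference, the resolution loop can be tried without a network. This is
a sketch under stated assumptions rather than the patch itself: the "remote"
is a temporary directory, cat stands in for the curl call, the resolve
function and the object name are made up, and the symref is assumed to use
exactly "ref: " with one space.

```shell
#!/bin/sh
# Fake remote: each file holds either "ref: <name>" or an object name.
remote=$(mktemp -d)
mkdir -p "$remote/refs/heads"
echo "ref: refs/heads/master" >"$remote/HEAD"
echo "0123456789abcdef0123456789abcdef01234567" >"$remote/refs/heads/master"

# resolve NAME: follow "ref: " indirections at most max_depth times.
resolve () {
	head="ref: $1"
	depth=0
	max_depth=5
	while test "$depth" -lt "$max_depth"
	do
		case "$head" in
		"ref: "*) name=${head#ref: } ;;
		*) break ;;
		esac
		# cat stands in for fetching "$remote/$name" with curl.
		head=$(cat "$remote/$name" 2>/dev/null) || return 1
		depth=$(( $depth + 1 ))
	done
	case "$head" in
	"ref: "*) return 1 ;;	# still symbolic: chain too deep
	esac
	echo "$head"
}

result=$(resolve HEAD)
echo "$result"
rm -rf "$remote"
```

Here HEAD resolves through one level of indirection to the object name
stored in refs/heads/master; a chain longer than max_depth is treated as a
failure instead of looping forever.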

 git-fetch.sh |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

aa50f9012834993d8bd080050bc13b23465f9185
diff --git a/git-fetch.sh b/git-fetch.sh
index 83143f8..280f62e 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -270,14 +270,22 @@ fetch_main () {
 	  if [ -n "$GIT_SSL_NO_VERIFY" ]; then
 	      curl_extra_args="-k"
 	  fi
-	  remote_name_quoted=$(perl -e '
+	  max_depth=5
+	  depth=0
+	  head="ref: $remote_name"
+	  while (expr "z$head" : "zref:" && expr $depth \< $max_depth) >/dev/null
+	  do
+	    remote_name_quoted=$(perl -e '
 	      my $u = $ARGV[0];
+              $u =~ s/^ref:\s*//;
 	      $u =~ s{([^-a-zA-Z0-9/.])}{sprintf"%%%02x",ord($1)}eg;
 	      print "$u";
-	  ' "$remote_name")
-	  head=$(curl -nsfL $curl_extra_args "$remote/$remote_name_quoted") &&
+	  ' "$head")
+	    head=$(curl -nsfL $curl_extra_args "$remote/$remote_name_quoted")
+	    depth=$( expr \( $depth + 1 \) )
+	  done
 	  expr "z$head" : "z$_x40\$" >/dev/null ||
-		  die "Failed to fetch $remote_name from $remote"
+	      die "Failed to fetch $remote_name from $remote"
 	  echo >&2 Fetching "$remote_name from $remote" using http
 	  git-http-fetch -v -a "$head" "$remote/" || exit
 	  ;;
-- 
1.3.0.g368f0-dirty


* Re: [PATCH] git-fetch: resolve remote symrefs for HTTP transport
From: Shawn Pearce @ 2006-04-26 17:09 UTC (permalink / raw)
  To: Nick Hengeveld; +Cc: git
In-Reply-To: <20060426161001.GH32744@reactrix.com>

Nick Hengeveld <nickh@reactrix.com> wrote:
> 
> Maybe this isn't the right way to handle this - since we're already
> calling perl we could use LWP to do the transfers (using keepalive
> even?)

LWP, no.  My Mac OS X perl installation appears to have LWP installed
by dumb luck but my Gentoo Linux perl doesn't have LWP anywhere
in @INC.  :-) Yet both systems run GIT happily.

The HTTP support in GIT is already linked against libcurl and libcurl
is required to use said HTTP support.  I would think that libcurl
is capable of using Keep-Alive when possible, and libcurl and C
are certainly available anywhere GIT's HTTP support is currently
being used.  Ideally any HTTP feature should either be using the
curl command line tool, or better, be written in C against the
libcurl library.  But not LWP.  It's not always available even though
a valid perl is.

-- 
Shawn.


* [PATCH] Fix filename verification when in a subdirectory
From: Linus Torvalds @ 2006-04-26 17:15 UTC (permalink / raw)
  To: Junio C Hamano, Matthias Lederhofer; +Cc: Git Mailing List, Paul Mackerras
In-Reply-To: <Pine.LNX.4.64.0604260832240.3701@g5.osdl.org>


When we are in a subdirectory of a git archive, we need to take the prefix 
of that subdirectory into account when we verify filename arguments.

Noted by Matthias Lederhofer

This also uses the improved error reporting for all the other git commands 
that use the revision parsing interfaces, not just git-rev-parse. Also, it 
makes the error reporting for mixed filenames and argument flags clearer 
(you cannot put flags after the start of the pathname list).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
On Wed, 26 Apr 2006, Linus Torvalds wrote:
> > 
> > Shouldn't git rev-parse try to stat the file (additionally?) in the 
> > current directory instead of the top git directory? git (diff|log|..) 
> > seem to fail every time in a subdirectory without --.
> 
> Good point. However, the reason for that is that it actually _does_ stat 
> the file in the current directory, but it has done the 
> 
> 	revs->prefix = setup_git_directory();
> 
> in the init path (and it does need to do that, since that's what figures 
> out where the .git directory is, so that we can parse the revisions 
> correctly).
> 
> And that "setup_git_directory()" will chdir() to the root of the project.

diff --git a/cache.h b/cache.h
index 69801b0..4d8fabc 100644
--- a/cache.h
+++ b/cache.h
@@ -134,6 +134,7 @@ extern const char *setup_git_directory_g
 extern const char *setup_git_directory(void);
 extern const char *prefix_path(const char *prefix, int len, const char *path);
 extern const char *prefix_filename(const char *prefix, int len, const char *path);
+extern void verify_filename(const char *prefix, const char *name);
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
diff --git a/rev-parse.c b/rev-parse.c
index 7f66ae2..62e16af 100644
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -160,14 +160,6 @@ static int show_file(const char *arg)
 	return 0;
 }
 
-static void die_badfile(const char *arg)
-{
-	if (errno != ENOENT)
-		die("'%s': %s", arg, strerror(errno));
-	die("'%s' is ambiguous - revision name or file/directory name?\n"
-	    "Please put '--' before the list of filenames.", arg);
-}
-
 int main(int argc, char **argv)
 {
 	int i, as_is = 0, verify = 0;
@@ -177,14 +169,12 @@ int main(int argc, char **argv)
 	git_config(git_default_config);
 
 	for (i = 1; i < argc; i++) {
-		struct stat st;
 		char *arg = argv[i];
 		char *dotdot;
 
 		if (as_is) {
 			if (show_file(arg) && as_is < 2)
-				if (lstat(arg, &st) < 0)
-					die_badfile(arg);
+				verify_filename(prefix, arg);
 			continue;
 		}
 		if (!strcmp(arg,"-n")) {
@@ -350,8 +340,7 @@ int main(int argc, char **argv)
 			continue;
 		if (verify)
 			die("Needed a single revision");
-		if (lstat(arg, &st) < 0)
-			die_badfile(arg);
+		verify_filename(prefix, arg);
 	}
 	show_default();
 	if (verify && revs_count != 1)
diff --git a/revision.c b/revision.c
index f9c7d15..f2a9f25 100644
--- a/revision.c
+++ b/revision.c
@@ -752,17 +752,15 @@ int setup_revisions(int argc, const char
 			arg++;
 		}
 		if (get_sha1(arg, sha1) < 0) {
-			struct stat st;
 			int j;
 
 			if (seen_dashdash || local_flags)
 				die("bad revision '%s'", arg);
 
 			/* If we didn't have a "--", all filenames must exist */
-			for (j = i; j < argc; j++) {
-				if (lstat(argv[j], &st) < 0)
-					die("'%s': %s", argv[j], strerror(errno));
-			}
+			for (j = i; j < argc; j++)
+				verify_filename(revs->prefix, argv[j]);
+
 			revs->prune_data = get_pathspec(revs->prefix, argv + i);
 			break;
 		}
diff --git a/setup.c b/setup.c
index 36ede3d..119ef7d 100644
--- a/setup.c
+++ b/setup.c
@@ -62,6 +62,29 @@ const char *prefix_filename(const char *
 	return path;
 }
 
+/*
+ * Verify a filename that we got as an argument for a pathspec
+ * entry. Note that a filename that begins with "-" never verifies
+ * as true, because even if such a filename were to exist, we want
+ * it to be preceded by the "--" marker (or we want the user to
+ * use a format like "./-filename")
+ */
+void verify_filename(const char *prefix, const char *arg)
+{
+	const char *name;
+	struct stat st;
+
+	if (*arg == '-')
+		die("bad flag '%s' used after filename", arg);
+	name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
+	if (!lstat(name, &st))
+		return;
+	if (errno == ENOENT);
+		die("ambiguous argument '%s': unknown revision or filename\n"
+		    "Use '--' to separate filenames from revisions", arg);
+	die("'%s': %s", arg, strerror(errno));
+}
+
 const char **get_pathspec(const char *prefix, const char **pathspec)
 {
 	const char *entry = *pathspec;


* Re: [PATCH/RFC] reverse the pack-objects delta window logic
From: Nicolas Pitre @ 2006-04-26 17:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpsj4sxer.fsf@assigned-by-dhcp.cox.net>

On Tue, 25 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > Note, this is a RFC particularly to Junio since the resulting pack is 
> > larger than without the patch with git-repack -a -f.  However using a 
> > subsequent git-repack -a brings the pack size down to expected size.  So 
> > I'm not sure I've got everything right.
> 
> I haven't tested it seriously yet, but there is nothing that
> looks obviously wrong that might cause the inflation problem,
> from the cursory look after applying the patch on top of your
> last round.

Never mind.  I found a flaw in the determination of delta_limit when 
reparenting a delta target.  The immediate parent's delta_limit is 
readjusted when its longest delta is moved to another base, but if that 
parent was itself a delta then the delta_limit adjustment is not 
propagated back up to the top.  This means that some objects were 
falsely credited with too high delta_limit.

And actually I'm not sure how to solve that without walking the tree 
up to the top each time, which I want to avoid as much as possible.
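
To make the problem concrete, here is a minimal sketch of the naive fix, that is, the walk up the chain described above. It is not code from the patch. The struct layout and the helper names (longest_below, propagate_up, link_child, unlink_child, reparent) are hypothetical, and it assumes delta_limit means "length of the longest delta chain hanging below this object". The walk stops early as soon as an ancestor's value does not change, which bounds the cost in the common case but is still a walk toward the top.

```c
#include <stddef.h>

/* Hypothetical object node: 'base' is the object this one is a delta of. */
struct node {
	struct node *base;
	struct node *first_child;
	struct node *next_sibling;
	unsigned delta_limit;	/* assumed: longest delta chain below this node */
};

/* Recompute the longest chain below n from its direct children. */
static unsigned longest_below(const struct node *n)
{
	unsigned best = 0;
	const struct node *c;

	for (c = n->first_child; c; c = c->next_sibling)
		if (c->delta_limit + 1 > best)
			best = c->delta_limit + 1;
	return best;
}

/* Walk toward the top, stopping once an ancestor's value is unchanged. */
void propagate_up(struct node *n)
{
	for (; n; n = n->base) {
		unsigned limit = longest_below(n);

		if (limit == n->delta_limit)
			break;	/* ancestors above this point are unaffected */
		n->delta_limit = limit;
	}
}

void link_child(struct node *base, struct node *child)
{
	child->base = base;
	child->next_sibling = base->first_child;
	base->first_child = child;
}

void unlink_child(struct node *child)
{
	struct node **p;

	if (!child->base)
		return;
	for (p = &child->base->first_child; *p; p = &(*p)->next_sibling) {
		if (*p == child) {
			*p = child->next_sibling;
			break;
		}
	}
	child->base = NULL;
	child->next_sibling = NULL;
}

/* Move a delta to another base and fix up both chains of ancestors. */
void reparent(struct node *child, struct node *new_base)
{
	struct node *old_base = child->base;

	unlink_child(child);
	link_child(new_base, child);
	propagate_up(old_base);	/* the step the flaw above was missing */
	propagate_up(new_base);
}
```

With a chain A <- B <- C, moving C to an unrelated base D must lower both B and A; adjusting only B leaves A credited with a chain that no longer exists.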


Nicolas


* Re: [PATCH] Fix filename verification when in a subdirectory
From: Timo Hirvonen @ 2006-04-26 18:05 UTC (permalink / raw)
  To: torvalds; +Cc: junkio, matled, git, paulus
In-Reply-To: <Pine.LNX.4.64.0604261010390.3701@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:

> +void verify_filename(const char *prefix, const char *arg)
> +{
> +	const char *name;
> +	struct stat st;
> +
> +	if (*arg == '-')
> +		die("bad flag '%s' used after filename", arg);
> +	name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
> +	if (!lstat(name, &st))
> +		return;
> +	if (errno == ENOENT);

Extra semicolon.
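
A small sketch of why it matters, in case the effect is not obvious: the ';' right after the condition is an empty statement and becomes the whole body of the if, so the following die() with the "ambiguous argument" message runs for every errno value and the final strerror() die() is never reached. The two functions below are hypothetical stand-ins that return a label instead of calling die(); they are not part of the patch.

```c
#include <errno.h>

/*
 * How the compiler parses the posted code: the stray ';' is the
 * entire if-body, so the next statement is unconditional.
 */
const char *classify_as_posted(int err)
{
	if (err == ENOENT)
		;			/* empty statement left by the extra ';' */
	return "ambiguous";		/* reached for every errno value */
}

/* Presumably intended: only ENOENT gets the "ambiguous" message. */
const char *classify_as_intended(int err)
{
	if (err == ENOENT)
		return "ambiguous";
	return "strerror";		/* e.g. EACCES reports the real error */
}
```

So with the semicolon in place, a permission error on an existing file would also be reported as an unknown revision or filename.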

-- 
http://onion.dynserv.net/~timo/
