Git development
* Re: How to delete large files
From: Mikael Magnusson @ 2009-10-02 12:26 UTC (permalink / raw)
  To: Nils Homer; +Cc: git
In-Reply-To: <C6EB1E10.D7AB%nilshomer@gmail.com>

2009/10/2 Nils Homer <nilshomer@gmail.com>:
> I wish to delete some large offending files (10MB to >100MB each) from my
> git repository (git://bfast.git.sourceforge.net/gitroot/bfast/bfast).  The
> current size of the repo is 656MB.
>
> I created a backup of my repository and then searched for such offending
> files using a script found here:
> http://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
> where I modified the script to output in MB instead of KB.
>
> This gave me a list of files that I wanted to delete:
> size  pack  SHA                                       location
> 113   113   f9f2faab597d4f8ccbfac2864347dbc256353fbf  tests.long/save/save.tar.gz
> 113   113   926b1ba880a26354c4a6b9391985f57fbc9a1174  tests.long/save/save.tar.gz
> 113   113   e568480bcb8239e6d1ed8d2da86c309c0d3d101b  tests.long/save/save.tar.gz
> 113   113   e3c0ee53f20e8ebfb60eaefcd7b405168c26a565  tests.long/save/save.tar.gz
> 103   103   ee2ee50c5075d05d29764c8d4b9acc2acedda919  tests.long/save/save.tar.gz
> 35    35    319c75945c27096093dbab5a0bf6a9a08089bc2d  tests.long/data/data.tar.gz
> 11    11    805193c74ceeffca9da3a2788545e701d77e1caf  tests/save/save.tar.gz
> 11    11    658e4a78c1028875ab597d6bde5823cd6a1694b9  tests/save/save.tar.gz
>
> So I decided to remove these files using:
> git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch
> tests.long/save/save.tar.gz tests.long/data/data.tar.gz
> tests/save/save.tar.gz" HEAD
>
> I then ran:
> rm -rf .git/refs/original
> git reflog expire --expire=now --all
> git gc --prune=now
>
> Still (using du) the repository is 656MB and I can see the above files in
> the revision list:
> git rev-list --all --objects | grep tests.long/save/save.tar.gz
> ee2ee50c5075d05d29764c8d4b9acc2acedda919 tests.long/save/save.tar.gz
> e568480bcb8239e6d1ed8d2da86c309c0d3d101b tests.long/save/save.tar.gz
> f9f2faab597d4f8ccbfac2864347dbc256353fbf tests.long/save/save.tar.gz
> 926b1ba880a26354c4a6b9391985f57fbc9a1174 tests.long/save/save.tar.gz
> e3c0ee53f20e8ebfb60eaefcd7b405168c26a565 tests.long/save/save.tar.gz
>
> Could this be because of tags that I had previously created?
>
> I am running git version 1.6.3.3.  I appreciate any help in advance,

Well, you only gave "HEAD" to git filter-branch to rewrite; I think
you want --all to rewrite all the refs you have.

-- 
Mikael Magnusson

* Re: How to delete large files
From: Johannes Sixt @ 2009-10-02 13:41 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: Nils Homer, git
In-Reply-To: <237967ef0910020526w51c05570g606ebc16e0b4a3e7@mail.gmail.com>

Mikael Magnusson schrieb:
> Well, you just gave "HEAD" to git filter-branch to rewrite, i think
> you want --all to rewrite all refs you have.

... and '--tag-name-filter cat' to rewrite tags as well.
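
A minimal sketch of how one might check which refs still reach a file before choosing what to rewrite. It builds a throwaway repository; the file name big.bin and the tag v1 are made up for illustration and do not come from the thread.

```shell
#!/bin/sh
# Hypothetical demo: a throwaway repository where a tag and a branch both
# keep a large file reachable.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo payload > big.bin
git add big.bin && git commit -q -m "add big.bin"
git tag v1                                   # the tag pins the commit containing big.bin
git rm -q big.bin && git commit -q -m "remove big.bin"

# For every ref, report whether its history still contains the path.
refs_with_file=""
for ref in $(git for-each-ref --format='%(refname)'); do
	if git rev-list --objects "$ref" | grep -q ' big.bin$'; then
		refs_with_file="$refs_with_file $ref"
	fi
done
echo "refs that still reach big.bin:$refs_with_file"
```

Every ref printed here has to be part of the rewrite, which is why passing only HEAD leaves the blobs in place.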

-- Hannes

* Re: [PATCH] filter-branch: add --prune-empty to option summary
From: Adam Brewster @ 2009-10-02 14:18 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20091002074537.GA27664@coredump.intra.peff.net>

>
> Thanks. This makes sense given the existing structure, though I have to
> wonder how useful some of these gigantic synopses really are.

I find them useful mostly when dealing with commands that I generally
know, but don't remember the exact spelling or ordering of the
options.  I might forget, for example, if it's --msg-filter or
--message-filter.

Adam

* Figuring out which patches have been applied
From: Jon Smirl @ 2009-10-02 14:36 UTC (permalink / raw)
  To: Git Mailing List

I have a stack of 100 patches against 2.6.30. A lot of these got
merged between 2.6.30-32.  How can I tell which ones have been
applied?

It doesn't work to check if patch A has been applied to 2.6.32. Other
patches may have been applied on top of patch A obscuring it.

One solution would be to rebase the patch stack forward one commit at
a time. That solves the problem of later patches obscuring patch A. Is
there a better way to do this?

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Re: Git push over git protocol for corporate environment
From: Eugene Sajine @ 2009-10-02 14:41 UTC (permalink / raw)
  To: Eugene Sajine, git
In-Reply-To: <00163623ac5d75929b0474e66b96@google.com>

I'm sorry if this turns out to be a dup again... My first email got stuck or
was deleted by the server, although I didn't use any HTML...

Thanks to everybody for prompt answers!

There is one thing I’m still missing though. Do I understand correctly
that if a person has an ssh access (account) to the host in internal
network, then this won’t be enough for him to be able to push to the
repo? Should we still go through the hassle of managing the ssh keys
for each particular user who is supposed to have push access?

I believe the answer is yes and that's why I'm leaning towards pulls
and pushes over git protocol. There is no solution yet which would be
as effective and simple to maintain. Using git protocol will not add
security, but it won't be worse than existing CVS or any other
centralized version control security model. As soon as security comes
into play we will need some other solution, but so far I haven't
seen anything that would be easy to sell to the company.

Github is cool, but FI is way too expensive and very hard to sell.

Gitorious is even better!! for corporate use, I think, because of its
team-oriented approach, but... man... I would kill for a Java
implementation or anything as simple as that!! As far as I can see, it is
impossible to install on a network without internet access, and the
number of dependencies you have to install or pre-install is enormous.
I read somewhere that Ruby on Rails is fun to develop with but a
nightmare to deploy and maintain, and that seems to be true. Come on,
guys!! Look at Hudson CI: one war file containing everything you
need, the application starts from the command line with "java -jar hudson.war"
and runs on any port you specify. The time from starting the download to
having the first build is less than 10 minutes!!! If there are any Gitorious
guys here, please think about it, and don't forget to share the profit ;)!

I think cgit could be competitive, although I have failed to run
it so far because of some build issues. As with all the other web-based
tools, you have to implement something yourself to create and set up
bare repos on the server automatically (probably even editing the config
file via a script), to avoid a mess and to avoid one guy spending his
time adding and configuring repos... We will probably end up using
gitweb, as it at least knows how to scan a folder for git repos, although
it also gives me trouble installing... Both the cgit and gitweb attempts
were made under cygwin, so that is probably the real problem with
them ;)

I think that this is what is missing right now in order for git to get
rocket start and spread inside companies: secure and easy to maintain
mainline hosting.

Probably my lack of experience with git causes these thoughts, so
while I continue to work on it, I would really appreciate any
advice, especially about experience using git outside open source and
in teams larger than three people.

Thanks a lot,
Eugene

* Re: Re: Git push over git protocol for corporate environment
From: Shawn O. Pearce @ 2009-10-02 14:47 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: git
In-Reply-To: <76c5b8580910020741p2024f6c0w70be53338924e7e8@mail.gmail.com>

Eugene Sajine <euguess@gmail.com> wrote:
> Gitorious is even better!! for corporate use, i think, because of its
> team oriented approach, but... man... I would kill for java
> implementation or anything as simple as that!!

If you want a Java based server, look at either:

* SCuMD               http://github.com/gaffo/scumd
* Gerrit Code Review  http://code.google.com/p/gerrit/

I think SCuMD might be easier to install, I don't think it depends
upon a database or a servlet container like Gerrit does.  But both
are a SSH+Git implementation with some access control capabilities,
and are implemented in Java.

I don't think either is (yet) as easy to install as Hudson CI.
Both projects have a much smaller team of developers behind them,
and are still focusing on basic functionality rather than ease of
new system setup.

-- 
Shawn.

* Re: [PATCH] Introduce <branch>@{tracked} as shortcut to the tracked branch
From: Björn Steinbrink @ 2009-10-02 14:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Johan Herland, git, Michael J Gruber,
	Johannes Schindelin, Pete Wyckoff, i.am
In-Reply-To: <7vzl92bql9.fsf@alter.siamese.dyndns.org>

On 2009.09.10 11:29:54 -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > On Thu, Sep 10, 2009 at 12:18:06PM +0200, Johan Herland wrote:
> >
> >> > > A special shortcut '@{tracked}' refers to the branch tracked by the
> >> > > current branch.
> >> >
> >> > Sorry, I didn't know the name of the long form was up for discussion.
> >> > But it should certainly coincide with the key which for-each-ref
> >> > uses, shouldn't it? I don't care whether tracked or upstream, but
> >> > for-each-ref's "upstream" has set the precedent.
> >> 
> >> ...and 'git branch --track' set an even earlier precedent...
> >
> > FWIW, that came about from this discussion:
> >
> >   http://article.gmane.org/gmane.comp.version-control.git/115765
> 
> After re-reading the discussion in the thread that contains the quoted
> article, it sounds like we may want to fix "branch --track X Y".  X does
> not "track" Y in the same sense as origin/master "tracks" master at
> origin.  Rather, X builds on Y.

FWIW, I just had a discussion on #git with Jim Ramsay (lack) about that.
We agreed that "branch --track" is unfortunate and causes confusion,
when it comes to the term "tracking branch" (which the glossary
basically defines to be something in the refs/remotes/ namespace, at
least something you never commit to), as people also start to use that
for branch heads that have an "upstream branch" set in the
configuration.

Jim initially preferred the term "linked", while I argued for the option
to be called --upstream (potentially with the possibility to say
--upstream=rebase or --upstream=merge, the default being "merge", unless
branch.autosetuprebase is set).

We could (I think) to some degree agree that "upstream" is quite good,
although we couldn't find a short and sweet term to describe the branch
head that has a configured upstream branch (some ideas were "upstreamed
branch", "uplinked branch", "downstream branch" and some more, which I
all dislike). After having read the glossary and finding "upstream
branch" in there, I'm even more in favor of using the "upstream" term in
some form.

During the discussion, there were some requests (again) for a command
that allows changing branch.<name>.remote and branch.<name>.merge for
already existing branch heads. With that extra input, I'd now favor:

git branch --set-upstream X Y

because that's potentially reusable for the "change upstream for an
existing branch" case, though I'm totally clueless how to actually do
that, given that "git branch" uses flags to switch between "create new
branch" and "operate on existing branch". So reusing a flag won't
(easily) do the trick, at least not without special casing that could be
dangerous. You could, for example, accidentally change the upstream for an
existing branch, while you meant to create a new one.

It's a bit sad that we don't have subcommands for "git branch" like we
do for "git remote", that would make the whole thing a lot easier, but
it's way too late to change that, I guess. At least having "git branch
<name>" default to be "git branch add <name>" would make some branch
names "invalid" for that shortform. So that looks like a no-go.
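
As a rough illustration of what "changing the upstream of an existing branch" amounts to, the two configuration keys named above can be set by hand with git config. This is only a sketch in a throwaway repository; the branch names X and Y follow the example in this message, and the `X@{upstream}` suffix used for the check assumes a Git release newer than the ones discussed in this thread.

```shell
#!/bin/sh
# Hypothetical demo: make the existing branch X build on the local branch Y.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo base > file && git add file && git commit -q -m "base"
git branch Y
git branch X

# The same two settings "git branch --track X Y" records for a local branch:
# "." as the remote means "this repository".
git config branch.X.remote .
git config branch.X.merge refs/heads/Y

# Ask Git what it now considers the upstream of X.
upstream=$(git rev-parse --abbrev-ref 'X@{upstream}')
echo "X builds on: $upstream"
```

Later Git releases also offer `git branch --set-upstream-to=Y X`, which writes the same two keys.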

Björn

* Re: Trying to split repository
From: Josef Wolf @ 2009-10-02 15:42 UTC (permalink / raw)
  To: git
In-Reply-To: <c376da900910011747i894404dne1ea60dae5e3990b@mail.gmail.com>

On Thu, Oct 01, 2009 at 08:47:15PM -0400, Adam Brewster wrote:
> >>
> >> git-filter-branch accepts a --prune-empty option that does what I
> >> think you're looking for.
> > Thanks for your answer, Adam!
> > Is this a new option? 1.6.0.2 doesn't seem to have it.
> 1.6.0.2 was released September 2008 (git log -n1 v1.6.0.2).
> 
> This feature was added in October 2008.  (git blame
> Documentation/git-filter-branch.txt; git log -n1 d3240d93).
> 
> It's still missing from the option summary in master though.

Thanks for clarifying that, Adam!

* Re: Re: Git push over git protocol for corporate environment
From: Eugene Sajine @ 2009-10-02 15:58 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git, Eugene Sajine
In-Reply-To: <20091002144727.GZ14660@spearce.org>

Thanks Shawn!

I saw info about SCuMD and Gerrit in previous emails, but
unfortunately I haven't had enough time to spend with those tools yet.
I'm reading about Gerrit right now.


On Fri, Oct 2, 2009 at 10:47 AM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Eugene Sajine <euguess@gmail.com> wrote:
>> Gitorious is even better!! for corporate use, i think, because of its
>> team oriented approach, but... man... I would kill for java
>> implementation or anything as simple as that!!
>
> If you want a Java based server, look at either:
>
> * SCuMD               http://github.com/gaffo/scumd
> * Gerrit Code Review  http://code.google.com/p/gerrit/
>
> I think SCuMD might be easier to install, I don't think it depends
> upon a database or a servlet container like Gerrit does.  But both
> are a SSH+Git implementation with some access control capabilities,
> and are implemented in Java.
>
> I don't think either is (yet) as easy to install as Hudson CI.
> Both projects have a much smaller team of developers behind them,
> and are still focusing on basic functionality rather than ease of
> new system setup.
>
> --
> Shawn.
>

* git as a versioned filesystem
From: Scott Wiersdorf @ 2009-10-02 16:49 UTC (permalink / raw)
  To: git

Hi all,

First off, I'm *not* using git as a typical VCS on the front-end; I'm
using it for dist control on the back-end. I'm also fairly new to git
(about a week and a half into it).

The Scene
=========

For source control, we're using CVS (migrating off of it, btw--I only
have limited influence around here). We build our software, etc,
etc. and then we have the developers scp/rsync/untar their builds on a
*master disk image*.

This master disk image is disted via rsync to a few thousand servers
to keep them all up to date and in sync, etc. This works mostly fine
and I can't really change this system.

The Problem
===========

Our problem has been that occasionally bad stuff gets put in the
master image and we have no easy way to revert it or to allow the QA
team to cherry-pick/revert changes to that master image.

The Solution
============

Git seems like the perfect tool for this, but I'm still not sure how
to adapt it to our situation. I'm building a tool that uses git to let
the developers commit their binary changes to this master image into
the git repository, which hopefully will allow me to offer the QA team
some ability to cherry-pick updates or revert regressions and make a
clean dist image from week to week.

The Question
============

What I need to know from y'all is: is there a better way, a more
git-like way, to accomplish this. Here's the model I *want* to follow:

-----a----b--T1-------c--------d-e---f------g [master]
               \   (a)  \
                ----|----c'---                [B1]

Here is branch B1 created from the master at some point in time T1. On
branch B1, I revert commit (a) and cherry-pick commit (c):

  git checkout master
  git branch B1
  git checkout B1
  git revert a
  git cherry-pick c

At this point, B1 is our "perfect image" and we're ready to dist it. I
check it out elsewhere and rsync it, etc. Wonderful.
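
The model above can be tried out in a throwaway repository. The sketch below is only an approximation of the diagram: each commit simply adds one small text file named after it (a.txt, b.txt, ...), which is an assumption made for the demo and not how the real image is laid out.

```shell
#!/bin/sh
# Hypothetical demo of the B1 branch: revert (a) and cherry-pick (c).
set -e
export GIT_EDITOR=true                    # never open an editor for commit messages
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example" && git config user.email "example@example.com"

# Helper: commit "x" adds the file x.txt.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a; a=$(git rev-parse HEAD)
commit b
git branch B1                             # T1: B1 starts here
commit c; c=$(git rev-parse HEAD)
commit d

git checkout -q B1
git revert --no-edit "$a" >/dev/null      # drop the change made by (a)
git cherry-pick "$c" >/dev/null           # bring in (c) as c'

b1_files=$(ls | tr '\n' ' ')
echo "B1 contains: $b1_files"
```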

But now it's a few weeks later and we're ready to do another
dist. What I *want* to do is create a *copy* of branch B1 to give the
release manager a reference point for him to bring things up to
date. What is the best way to do that?

If I branch off of B1, now I have the burden of doing a whole lot of
cherry-picks and having a challenging time getting things back in sync:

-----a----b--T1-------c--------d-e---f------g [master]
               \   (a)  \         \   \
                ----|----c'---     \   \      [B1]
                               \    \   \
                                -----e'--f'---[B2]

Ugh. Now B2 is kind of a mess. If I rebase it on master, I'll get (d)
and maybe (a) again, which I don't want. [side question: unless
there's a way to rebase on master but still exclude
commits... possible?]. B3 and B4 are going to look even worse and the
risk of drifting so far away from the master is unappealing.

Ideally I'd want each week's release to come directly from the master,
kind of the flying-fish approach:

                               ----e'--f'---  [B2]
                             /    /   /
-----a----b--T1-------c--------d-e---f------g [master]
               \   (a)  \
                ----|----c'---                [B1]

The problem with this is that now B2 contains (a), so I'll have to
revert that again--which I can do happily--but I just wonder if
there's a better way. If it's possible to simply *copy* branch B1 to
B2 without making B2 a branch off of B1.

In the absence of a git-branch-copy, is there something that would
help me do set intersection and subtraction between branches?
Something like this:

  git log B1
  ... bunch of commit ids ...
  git log B2
  ... bunch of commit ids ...

  ## find the intersection(B1, B2)

  ## revert all the things missing in B1 from B2

  ## now B2 is the same as B1--assuming git is idempotent (is it?)

  ## is there way besides rebase to clean out a revert as if it never
  ## happened? I suppose I could branch again and repeat this as
  ## needed.
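
One possible way to get at the "set difference" between two branches is the symmetric-difference range B1...master combined with --left-right and --cherry-pick, which hides commits that exist on both sides as equivalent patches. The sketch below assumes the same made-up history as the diagrams (one small file per commit) and compares B1 against master.

```shell
#!/bin/sh
# Hypothetical demo: list what is only on B1 ("<") and only on master (">").
set -e
export GIT_EDITOR=true
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a; a=$(git rev-parse HEAD)
commit b
git branch B1
commit c; c=$(git rev-parse HEAD)
commit d

git checkout -q B1
git revert --no-edit "$a" >/dev/null
git cherry-pick "$c" >/dev/null

# %m prints "<" for commits only reachable from B1 and ">" for commits only
# reachable from master; (c) and c' are patch-equivalent, so both are hidden.
report=$(git log --left-right --cherry-pick --format='%m %s' B1...master | sort)
printf '%s\n' "$report"
```

In this demo the report lists the revert of (a) on the B1 side and (d) on the master side.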

Am I even thinking about this correctly?

Keep in mind that these commits are not source code commits; they're
file system changes of all kinds: updated binaries and libraries, new
directory trees, removed directory trees, etc. It's much closer to a
package manager in spirit than a VCS.

I feel like I'm missing something grand in git-rev-list or git-log or
git-bisect or some other tool that will make all my troubles
disappear. I've read an awful lot of the man pages, but am still very
new to git and I'm certain I've missed some subtleties.

Any ideas? I'm not even sure I'm asking the right questions. I'll
accept any advice on this subject.

Scott
-- 
Scott Wiersdorf
<scott@perlcode.org>

* Re: git only one file
From: gsky @ 2009-10-02 17:24 UTC (permalink / raw)
  To: git
In-Reply-To: <25140456.post@talk.nabble.com>




synhedionn wrote:
> 
> with git add .  , a directory is expected, but I don't need all my files
> to be recorded, only one of my thousands, so how can I record just 1 file?
> 

git add /path/to/file
-- 
View this message in context: http://www.nabble.com/git-only-one-file-tp25140456p25718014.html
Sent from the git mailing list archive at Nabble.com.

* HTTP NTLM Authentication
From: gsky @ 2009-10-02 17:28 UTC (permalink / raw)
  To: git


Is it possible for me to pass arguments to the curl calls that git uses to
access a repository hosted over HTTP?

I am having a problem accessing the repository because it is authenticated
using NTLM. I can curl the repository using

curl --ntlm http://username:pass@machine.domain/git

How can I do the same for the git clone of the repository?  Is it possible
easily, or do I have to modify the source and recompile?

gsky
-- 
View this message in context: http://www.nabble.com/HTTP-NTLM-Authentication-tp25718488p25718488.html
Sent from the git mailing list archive at Nabble.com.

* Re: git as a versioned filesystem
From: Avery Pennarun @ 2009-10-02 18:11 UTC (permalink / raw)
  To: Scott Wiersdorf, git
In-Reply-To: <20091002164929.GA12725@perlcode.org>

On Fri, Oct 2, 2009 at 12:49 PM, Scott Wiersdorf <scott@perlcode.org> wrote:
> Git seems like the perfect tool for this, but I'm still not sure how
> to adapt it to our situation. I'm building a tool that uses git to let
> the developers commit their binary changes to this master image into
> the git repository, which hopefully will allow me to offer the QA team
> some ability to cherry-pick updates or revert regressions and make a
> clean dist image from week to week.

Beware that git performs rather badly on binary files, especially huge
ones, which it tries to load entirely into RAM.  It also keeps every
revision of every file that was ever committed (and every user who
checks it out needs to download the whole thing), so your giant binary
repository is going to get very big, very fast.

I've looked into using git for this kind of situation myself.  It's
close, but not quite there (for my purposes anyway).  It basically
just needs some optimizations and some improved support for "shallow
clones."

But on to your actual question:

> But now it's a few weeks later and we're ready to do another
> dist. What I *want* to do is create a *copy* of branch B1 to give the
> release manager a reference point for him to bring things up to
> date. What is the best way to do that?
>
> If I branch off of B1, now I have the burden of doing a whole lot of
> cherry-picks and having a challenging time getting things back in sync:
>
> -----a----b--T1-------c--------d-e---f------g [master]
>               \   (a)  \         \   \
>                ----|----c'---     \   \      [B1]
>                               \    \   \
>                                -----e'--f'---[B2]
>
> Ugh. Now B2 is kind of a mess. If I rebase it on master, I'll get (d)
> and maybe (a) again, which I don't want. [side question: unless
> there's a way to rebase on master but still exclude
> commits... possible?]. B3 and B4 are going to look even worse and the
> risk of drifting so far away from the master is unappealing.

If you rebase your "release" changes onto current master, you'll get
the revert-a patch applied, so (a) will still be gone.  Rebase will
also probably be smart enough to throw away c', since it's identical
to (c).  You will indeed end up with the unwanted (d), but you can
just revert that in B2.

> Ideally I'd want each week's release to come directly from the master,
> kind of the flying-fish approach:
>
>                               ----e'--f'---  [B2]
>                             /    /   /
> -----a----b--T1-------c--------d-e---f------g [master]
>               \   (a)  \
>                ----|----c'---                [B1]
>
> The problem with this is that now B2 contains (a), so I'll have to
> revert that again--which I can do happily--but I just wonder if
> there's a better way. If it's possible to simply *copy* branch B1 to
> B2 without making B2 a branch off of B1.

"revert-a" is a patch on its own.  Git doesn't think of reverting (a)
as anything special; it's just a change that happens to reverse what
(a) does.  So if you rebase B1 onto master, it will get copied.  It
sounds like rebase will produce exactly the results you're looking for
here.
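
A sketch of that rebase in a throwaway repository, assuming the same made-up history as the diagrams (each commit adds one small file named after it). B2 starts as a second name for the commit B1 points at and is then rebased onto master.

```shell
#!/bin/sh
# Hypothetical demo: copy B1 to B2, then rebase B2 onto the current master.
set -e
export GIT_EDITOR=true
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit a; a=$(git rev-parse HEAD)
commit b
git branch B1
commit c; c=$(git rev-parse HEAD)
commit d

git checkout -q B1
git revert --no-edit "$a" >/dev/null
git cherry-pick "$c" >/dev/null

git branch B2 B1                       # a "copy": B2 names the same commit as B1
git checkout -q B2
git rebase master >/dev/null 2>&1      # replays revert-a; c' is skipped as already applied

extra=$(git rev-list --count master..B2)
b2_files=$(ls | tr '\n' ' ')
echo "commits on B2 but not on master: $extra"
echo "B2 contains: $b2_files"
```

In this demo B2 ends up as master plus the revert of (a), so the unwanted (d) is still present and would need its own revert.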

Now, that said, this release process seems extremely suspicious to me.

To summarize what I'm hearing: you have a 'master' branch that people
put stuff into, but which doesn't actually work correctly.  At the
last minute before a release, you make a new branch, drop out the
stuff that doesn't work, and put it into production.

This sounds problematic.  If (a) and (d) don't work, why are they in
master at all?  Git makes branching really easy: get people to put
their not-quite-working features into a different branch, and let the
release manager merge those branches into master when they're actually
ready.

If you do that, you'll always be releasing straight out of master, and
your life will be a lot simpler.  And if you "merge --squash" from the
feature branches into master, you can throw away the interim versions
of the feature branches, which should help keep your repository from
growing so quickly with tons of binary file revisions that never even
got released.
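
A small sketch of the "merge --squash" idea, assuming a made-up feature branch that goes through three interim versions of one file (image.bin is a hypothetical name). Only the final state lands on master, as a single commit.

```shell
#!/bin/sh
# Hypothetical demo: squash a feature branch into one commit on master.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

echo v0 > image.bin && git add image.bin && git commit -q -m "base"

git checkout -q -b feature
for v in 1 2 3; do
	echo "v$v" > image.bin
	git commit -q -am "feature: attempt $v"
done

git checkout -q master
git merge --squash feature >/dev/null   # stages the combined change, does not commit
git commit -q -m "feature: final image"

count=$(git rev-list --count master)
content=$(cat image.bin)
echo "commits on master: $count, image.bin is at: $content"
```

The interim versions only become eligible for garbage collection once nothing references them any more, for example after the feature branch has been deleted with `git branch -D feature`.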

>  ## is there way besides rebase to clean out a revert as if it never
>  ## happened? I suppose I could branch again and repeat this as
>  ## needed.

You probably want "git rebase -i".

Have fun,

Avery

* Re: Figuring out which patches have been applied
From: Julian Phillips @ 2009-10-02 18:45 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Git Mailing List
In-Reply-To: <9e4733910910020736n539f4331nfd61175b275c7d28@mail.gmail.com>

On Fri, 2 Oct 2009, Jon Smirl wrote:

> I have a stack of 100 patches against 2.6.30. A lot of these got
> merged between 2.6.30-32.  How can I tell which ones have been
> applied?
>
> It doesn't work to check if patch A has been applied to 2.6.32. Other
> patches may have been applied on top of patch A obscuring it.
>
> One solution would be to rebase the patch stack forward one commit at
> a time. That solves the problem of later patches obscuring patch A. Is
> there a better way to do this?

Have you tried using git-cherry?  I believe it was intended for this
purpose.
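
A minimal sketch of what git-cherry reports, using a throwaway repository with a hypothetical two-patch stack (p1, p2) on a branch called patches, of which only p1 has been applied upstream.

```shell
#!/bin/sh
# Hypothetical demo: "-" marks patches already upstream, "+" marks the rest.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b patches
commit p1; p1=$(git rev-parse HEAD)
commit p2

git checkout -q master
commit unrelated                        # upstream moves on ...
git cherry-pick "$p1" >/dev/null        # ... and picks up p1 as a new commit

git cherry -v master patches            # one line per patch in the stack
applied=$(git cherry master patches | grep -c '^-' || true)
pending=$(git cherry master patches | grep -c '^+' || true)
echo "applied upstream: $applied, still pending: $pending"
```

The comparison is done on patch IDs, so a patch that was changed while being applied upstream would still show up as "+".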

-- 
Julian

  ---
Progress might have been all right once, but it's gone on too long.
 		-- Ogden Nash

* Re: Git push over git protocol for corporate environment
From: Ismael Luceno @ 2009-10-02 18:54 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: git
In-Reply-To: <76c5b8580910020741p2024f6c0w70be53338924e7e8@mail.gmail.com>

Eugene Sajine escribió:
> I think that this is what is missing right now in order for git to get
> rocket start and spread inside companies: secure and easy to maintain
> mainline hosting.
> 

It looks like your problem is using cygwin. It's more complicated on a
MS-Windows environment, and personally I think it's a _very bad idea_.

Git is really easy to use in fact, you just set up the repo with:

  mkdir repo.git
  cd repo.git
  git init --bare --shared=all

--shared=all makes the repo readable to anyone, and ensures push rights
to users under the same group as the user setting up the repo. You can
change the group with chgrp of course.

SSH access will be needed to push, unless the users can remotely mount
the repo via NFS or any other protocol.

Pulling is possible over http too, you just need to make
hooks/post-update executable. To export via git protocol you must create
an empty file named "git-daemon-export-ok".
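
Put together, the setup steps above might look like the sketch below. It runs in a temporary directory; the fallback hook body is an assumption for installations that ship no sample hooks, and the group name in the comment is hypothetical.

```shell
#!/bin/sh
# Hypothetical demo: shared bare repository, exportable over http and git://.
set -e
base=$(mktemp -d) && cd "$base"
mkdir repo.git && cd repo.git
git init -q --bare --shared=all

# Enable the post-update hook so "git update-server-info" runs after each push
# (needed for pulling over plain http).
if [ -f hooks/post-update.sample ]; then
	cp hooks/post-update.sample hooks/post-update
else
	mkdir -p hooks
	printf '#!/bin/sh\nexec git update-server-info\n' > hooks/post-update
fi
chmod +x hooks/post-update

# Allow git-daemon to export this repository.
touch git-daemon-export-ok

# To hand the repository to another group, e.g. a hypothetical "devs" group:
#   chgrp -R devs .

shared=$(git config core.sharedRepository)
echo "core.sharedRepository = $shared"
```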

Besides setting up a web repo browser and a git server, there's nothing
else specific to git.

-- 
Ismael Luceno


* [PATCH] Use the best HTTP authentication method supported by the server
From: Nicholas Miell @ 2009-10-02 19:04 UTC (permalink / raw)
  To: gsky51; +Cc: git, Nicholas Miell
In-Reply-To: <25718488.post@talk.nabble.com>

Currently, libcurl is limited to using HTTP Basic authentication if a
username and password are specified. HTTP Basic passes the username
and password to the server as plaintext, which is obviously
suboptimal. Furthermore, some servers are configured to require a more
secure authentication method (e.g. Digest or NTLM), which means that
git can't talk to them at all.

This is easily solved by telling libcurl to use any HTTP
authentication method it pleases. I leave the decision as to whether
HTTP Basic (i.e. completely insecure) should be allowed at all to
somebody else.  This can be easily changed in the future by using
CURLAUTH_ANYSAFE instead of CURLAUTH_ANY.

Signed-off-by: Nicholas Miell <nmiell@gmail.com>
---
 http.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

This passes make test; but I haven't actually tested it on a real
HTTP server.

diff --git a/http.c b/http.c
index 23b2a19..1937b45 100644
--- a/http.c
+++ b/http.c
@@ -185,6 +185,7 @@ static void init_curl_http_auth(CURL *result)
 		if (!user_pass)
 			user_pass = xstrdup(getpass("Password: "));
 		strbuf_addf(&up, "%s:%s", user_name, user_pass);
+		curl_easy_setopt(result, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
 		curl_easy_setopt(result, CURLOPT_USERPWD,
 				 strbuf_detach(&up, NULL));
 	}
-- 
1.6.2.5

* Re: Figuring out which patches have been applied
From: skillzero @ 2009-10-02 19:16 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Git Mailing List
In-Reply-To: <9e4733910910020736n539f4331nfd61175b275c7d28@mail.gmail.com>

On Fri, Oct 2, 2009 at 7:36 AM, Jon Smirl <jonsmirl@gmail.com> wrote:
> I have a stack of 100 patches against 2.6.30. A lot of these got
> merged between 2.6.30-32.  How can I tell which ones have been
> applied?
>
> It doesn't work to check if patch A has been applied to 2.6.32. Other
> patches may have been applied on top of patch A obscuring it.
>
> One solution would be to rebase the patch stack forward one commit at
> a time. That solves the problem of later patches obscuring patch A. Is
> there a better way to do this?

There may be a better way, but I needed to do a similar thing with
commits that were cherry-pick'd so I wrote a simple
git-contains-equivalent script to search for equivalent patch ID's
given a commit ID. You could do something like that, but using
git-patch-id as the source instead of getting it from an existing commit
like the following script does.

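#!/bin/bash
# Usage: git-contains-equivalent <commit>
# Lists the branches containing commits whose patch ID matches <commit>.

set -o pipefail

# Resolve the argument to a full commit ID, then compute its patch ID.
searchCommitID=$(git rev-parse --verify "$1^{commit}") || exit 1
searchPatchID=$(git show "$searchCommitID" | git patch-id) || exit 1
# git patch-id prints "<patch ID> <commit ID>"; keep only the patch ID.
searchPatchID=${searchPatchID% *}

echo "Searching for equivalents to commit $searchCommitID (patch $searchPatchID)..."
# Compute the patch ID of every commit reachable from any ref, keep the matches.
git log --all -p | git patch-id | grep "^$searchPatchID " |
while read -r patchID commitID; do
	if [ "$commitID" = "$searchCommitID" ]; then
		echo "Exact commit $commitID is on the following branches:"
	else
		echo "Equivalent commit $commitID is on the following branches:"
	fi
	git branch -a --contains "$commitID"
done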

* Re: How to delete large files
From: Nils Homer @ 2009-10-02 19:20 UTC (permalink / raw)
  To: git
In-Reply-To: <4AC6031A.2070409@viscovery.net>

Thank-you for all of your insightful help. Combining all the advice, the
commands that worked are:

git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch
$files" --tag-name-filter cat -- --all

rm -rf .git/refs/original

git reflog expire --expire=now --all

git gc --prune=now

I then cloned the repository to a different location and replaced my
centralized version with the cloned copy.
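
For reference, the whole sequence can be rehearsed on a throwaway repository before touching a real one. The sketch below is only an illustration: big.bin, README and the tag v1 are made-up names, and FILTER_BRANCH_SQUELCH_WARNING is set merely to silence the notice that newer Git releases print for filter-branch.

```shell
#!/bin/sh
# Hypothetical demo: remove a file from all branches and tags, then verify.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo payload > big.bin
echo hello > README
git add . && git commit -q -m "initial"
git tag v1                                   # a tag that would otherwise pin big.bin
echo more >> README && git commit -q -am "update README"

files="big.bin"
git filter-branch --index-filter \
	"git rm -rf --cached --ignore-unmatch $files" \
	--tag-name-filter cat -- --all >/dev/null 2>&1

rm -rf .git/refs/original                    # drop the backup refs filter-branch keeps
git reflog expire --expire=now --all
git gc --quiet --prune=now

# Count the objects named big.bin that are still reachable from any ref.
remaining=$(git rev-list --all --objects | grep -c ' big.bin$' || true)
echo "reachable objects still named big.bin: $remaining"
```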

Thanks,

Nils


On 10/2/09 6:41 AM, "Johannes Sixt" <j.sixt@viscovery.net> wrote:

> Mikael Magnusson schrieb:
>> Well, you just gave "HEAD" to git filter-branch to rewrite, i think
>> you want --all to rewrite all refs you have.
> 
> ... and '--tag-name-filter cat' to rewrite tags as well.
> 
> -- Hannes

* "Not currently on any branch"
From: Tim @ 2009-10-02 20:08 UTC (permalink / raw)
  To: git

I have some code in a git repo that is "Not currently on any branch". Now,
there's the master branch and another branch 'ui-integration' that I'm using in
this project. I don't know how the project got into this headless state, but I
need to be using the 'ui-integration' branch. 

I tried looking around the blogosphere for a solution, and tried what I found
here. But it seems like only my last commit (not the previous 10 I made) shows
up in the master branch (not ui-integration ).  
http://blog.kortina.net/post/71935540/fix-git-not-currently-on-any-branch-problem

What's the most straightforward & cleanest way to merge my changes in the
headless branch to my 'ui-integration' branch? 

Thanks in advance
Tim


* Re: "Not currently on any branch"
From: Steven Noonan @ 2009-10-02 20:58 UTC (permalink / raw)
  To: Tim; +Cc: git
In-Reply-To: <loom.20091002T215942-663@post.gmane.org>

On Fri, Oct 2, 2009 at 1:08 PM, Tim <timothyjwashington@yahoo.ca> wrote:
> I have some code in a git repo that is "Not currently on any branch". Now,
> there's the master branch and another branch 'ui-integration' that I'm using in
> this project. I don't know how the project got into this headless state, but I
> need to be using the 'ui-integration' branch.
>
> I tried looking around the blogosphere for a solution, and tried what I found
> here. But it seems like only my last commit (not the previous 10 I made) shows
> up in the master branch (not ui-integration ).
> http://blog.kortina.net/post/71935540/fix-git-not-currently-on-any-branch-problem
>
> What's the most straightforward & cleanest way to merge my changes in the
> headless branch to my 'ui-integration' branch?
>

Try 'git checkout -b temp', which creates a branch called 'temp'
pointing at the commit you currently have checked out. Then merge your
changes into ui-integration with 'git checkout ui-integration; git
merge temp', and finally drop the temporary branch with
'git branch -d temp'.

- Steven


* [PATCH] MSVC: fix build warnings
From: Michael Wookey @ 2009-10-02 21:40 UTC (permalink / raw)
  To: git

When building with MSVC, the following warnings are issued:

  warning C4700: uninitialized local variable 'xxx' used

where 'xxx' is the name of the uninitialised variable. In all instances,
the variable 'xxx' is being used to initialise itself. Stop initialising
these variables with themselves to suppress the warnings with MSVC.

Some of these variables require an initial value. This is to prevent gcc
from issuing a warning about a variable being used before it has been
initialised. Suppress these gcc warnings by explicitly initialising the
required variables.

Signed-off-by: Michael Wookey <michaelwookey@gmail.com>
---
This patch is but a small step in removing the build warnings that are
generated when compiling with MSVC.

 builtin-branch.c      |    2 +-
 builtin-cat-file.c    |    2 +-
 builtin-fast-export.c |    2 +-
 builtin-fetch--tool.c |    4 ++--
 builtin-rev-list.c    |    2 +-
 fast-import.c         |    4 ++--
 match-trees.c         |   12 ++++++------
 merge-recursive.c     |    2 +-
 run-command.c         |    2 +-
 transport.c           |    2 +-
 wt-status.c           |    2 +-
 11 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin-branch.c b/builtin-branch.c
index 9f57992..cf6a9ca 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -93,7 +93,7 @@ static const char *branch_get_color(enum color_branch ix)

 static int delete_branches(int argc, const char **argv, int force, int kinds)
 {
-	struct commit *rev, *head_rev = head_rev;
+	struct commit *rev, *head_rev;
 	unsigned char sha1[20];
 	char *name = NULL;
 	const char *fmt, *remote;
diff --git a/builtin-cat-file.c b/builtin-cat-file.c
index 5906842..669608a 100644
--- a/builtin-cat-file.c
+++ b/builtin-cat-file.c
@@ -152,7 +152,7 @@ static int batch_one_object(const char *obj_name, int print_contents)
 	unsigned char sha1[20];
 	enum object_type type = 0;
 	unsigned long size;
-	void *contents = contents;
+	void *contents;

 	if (!obj_name)
 	   return 1;
diff --git a/builtin-fast-export.c b/builtin-fast-export.c
index b0a4029..07e41ea 100644
--- a/builtin-fast-export.c
+++ b/builtin-fast-export.c
@@ -422,7 +422,7 @@ static void get_tags_and_duplicates(struct object_array *pending,
 	for (i = 0; i < pending->nr; i++) {
 		struct object_array_entry *e = pending->objects + i;
 		unsigned char sha1[20];
-		struct commit *commit = commit;
+		struct commit *commit;
 		char *full_name;

 		if (dwim_ref(e->name, strlen(e->name), sha1, &full_name) != 1)
diff --git a/builtin-fetch--tool.c b/builtin-fetch--tool.c
index 3dbdf7a..8463f66 100644
--- a/builtin-fetch--tool.c
+++ b/builtin-fetch--tool.c
@@ -416,14 +416,14 @@ static int expand_refs_wildcard(const char *ls_remote_result, int numrefs,
 static int pick_rref(int sha1_only, const char *rref, const char *ls_remote_result)
 {
 	int err = 0;
-	int lrr_count = lrr_count, i, pass;
+	int lrr_count, i, pass;
 	const char *cp;
 	struct lrr {
 		const char *line;
 		const char *name;
 		int namelen;
 		int shown;
-	} *lrr_list = lrr_list;
+	} *lrr_list;

 	for (pass = 0; pass < 2; pass++) {
 		/* pass 0 counts and allocates, pass 1 fills... */
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index 4ba1c12..b7b9fe3 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -386,7 +386,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 		mark_edges_uninteresting(revs.commits, &revs, show_edge);

 	if (bisect_list) {
-		int reaches = reaches, all = all;
+		int reaches, all;

 		revs.commits = find_bisection(revs.commits, &reaches, &all,
 					      bisect_find_all);
diff --git a/fast-import.c b/fast-import.c
index 7ef9865..6ed1602 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1858,7 +1858,7 @@ static void file_change_m(struct branch *b)
 	const char *p = command_buf.buf + 2;
 	static struct strbuf uq = STRBUF_INIT;
 	const char *endp;
-	struct object_entry *oe = oe;
+	struct object_entry *oe;
 	unsigned char sha1[20];
 	uint16_t mode, inline_data = 0;

@@ -2084,7 +2084,7 @@ static int parse_from(struct branch *b)

 static struct hash_list *parse_merge(unsigned int *count)
 {
-	struct hash_list *list = NULL, *n, *e = e;
+	struct hash_list *list = NULL, *n, *e;
 	const char *from;
 	struct branch *s;

diff --git a/match-trees.c b/match-trees.c
index 0fd6df7..99d559e 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -72,12 +72,12 @@ static int score_trees(const unsigned char *hash1, const unsigned char *hash2)
 		die("%s is not a tree", sha1_to_hex(hash2));
 	init_tree_desc(&two, two_buf, size);
 	while (one.size | two.size) {
-		const unsigned char *elem1 = elem1;
-		const unsigned char *elem2 = elem2;
-		const char *path1 = path1;
-		const char *path2 = path2;
-		unsigned mode1 = mode1;
-		unsigned mode2 = mode2;
+		const unsigned char *elem1 = NULL;
+		const unsigned char *elem2 = NULL;
+		const char *path1 = NULL;
+		const char *path2 = NULL;
+		unsigned mode1 = 0;
+		unsigned mode2 = 0;
 		int cmp;

 		if (one.size)
diff --git a/merge-recursive.c b/merge-recursive.c
index f55b7eb..8d7de22 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1267,7 +1267,7 @@ int merge_recursive(struct merge_options *o,
 {
 	struct commit_list *iter;
 	struct commit *merged_common_ancestors;
-	struct tree *mrtree = mrtree;
+	struct tree *mrtree;
 	int clean;

 	if (show(o, 4)) {
diff --git a/run-command.c b/run-command.c
index cf2d8f7..014f723 100644
--- a/run-command.c
+++ b/run-command.c
@@ -19,7 +19,7 @@ int start_command(struct child_process *cmd)
 {
 	int need_in, need_out, need_err;
 	int fdin[2], fdout[2], fderr[2];
-	int failed_errno = failed_errno;
+	int failed_errno;

 	/*
 	 * In case of errors we must keep the promise to close FDs
diff --git a/transport.c b/transport.c
index 644a30a..c6bb992 100644
--- a/transport.c
+++ b/transport.c
@@ -102,7 +102,7 @@ static void insert_packed_refs(const char *packed_refs, struct ref **list)
 		return;

 	for (;;) {
-		int cmp = cmp, len;
+		int cmp, len;

 		if (!fgets(buffer, sizeof(buffer), f)) {
 			fclose(f);
diff --git a/wt-status.c b/wt-status.c
index 38eb245..060ad17 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -133,7 +133,7 @@ static void wt_status_print_change_data(struct wt_status *s,
 {
 	struct wt_status_change_data *d = it->util;
 	const char *c = color(change_type, s);
-	int status = status;
+	int status = 0;
 	char *one_name;
 	char *two_name;
 	const char *one, *two;
-- 
1.6.5.rc2


* Re: "Not currently on any branch"
From: Alex Riesen @ 2009-10-02 21:46 UTC (permalink / raw)
  To: Tim; +Cc: git
In-Reply-To: <loom.20091002T215942-663@post.gmane.org>

On Fri, Oct 2, 2009 at 22:08, Tim <timothyjwashington@yahoo.ca> wrote:
> What's the most straightforward & cleanest way to merge my changes in the
> headless branch to my 'ui-integration' branch?

Assuming you use a Bourne shell:

$ prev=$(git rev-parse HEAD)
$ git checkout ui-integration && git merge $prev


* Re: [PATCH] MSVC: fix build warnings
From: Junio C Hamano @ 2009-10-02 22:05 UTC (permalink / raw)
  To: Michael Wookey; +Cc: git
In-Reply-To: <d2e97e800910021440q46bd46c4y8a5af987620ffc5c@mail.gmail.com>

Michael Wookey <michaelwookey@gmail.com> writes:

> diff --git a/builtin-branch.c b/builtin-branch.c
> index 9f57992..cf6a9ca 100644
> --- a/builtin-branch.c
> +++ b/builtin-branch.c
> @@ -93,7 +93,7 @@ static const char *branch_get_color(enum color_branch ix)
>
>  static int delete_branches(int argc, const char **argv, int force, int kinds)
>  {
> -	struct commit *rev, *head_rev = head_rev;

I haven't tried, but the patch may break build with "gcc -Werror".

This is a common and unfortunate idiom to tell the readers of the code
that this initialization is unnecessary, gcc is not clever enough to
notice and gives warnings, and we are squelching it, knowing what we are
doing.
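
To make the trade-off concrete, here is a minimal sketch of the two-pass
shape that pick_rref() in the patch above uses. It is not git code: the
function name keep_non_negative() and its inputs are made up for
illustration. The variables count and list are only assigned in pass 0 and
only read in pass 1, so they are never used uninitialised, but a compiler
that cannot follow the control flow may still warn. The sketch uses explicit
initialisers, which should keep both gcc and MSVC quiet; the "x = x" idiom
would instead spell the declarations as "size_t count = count;" and
"int *list = list;".

```c
#include <stdlib.h>

/*
 * Hypothetical example, not taken from git.
 * Returns a newly allocated array holding the non-negative values of
 * vals[0..n-1] and stores their number in *out_nr. The caller frees it.
 * Returns NULL if the allocation fails.
 */
int *keep_non_negative(const int *vals, size_t n, size_t *out_nr)
{
	size_t count = 0;	/* explicit initialiser; value is overwritten in pass 0 */
	int *list = NULL;	/* likewise; assigned before pass 1 reads it */
	int pass;

	for (pass = 0; pass < 2; pass++) {
		size_t i, k = 0;

		/* pass 0 counts and allocates, pass 1 fills */
		for (i = 0; i < n; i++) {
			if (vals[i] < 0)
				continue;
			if (pass)
				list[k] = vals[i];
			k++;
		}
		if (!pass) {
			count = k;
			/* allocate at least one element so an empty result is not NULL */
			list = malloc((count ? count : 1) * sizeof(*list));
			if (!list)
				return NULL;
		}
	}

	*out_nr = count;
	return list;
}
```

The explicit initialisers cost nothing measurable here, but they no longer
tell the reader that the initial value is never used, which is the point of
the self-initialisation idiom described above.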


* Re: [PATCH 2/6 (v4)] basic revision cache system, no integration or features
From: Nick Edelen @ 2009-10-02 22:12 UTC (permalink / raw)
  To: Nick Edelen, Junio C Hamano, Nicolas Pitre, Johannes Schindelin,
	Sam Vilain
In-Reply-To: <op.uyuwkovjtdk399@sirnot.private>

Second in the revision cache series, this particular patch provides:
  - minimal API: caching only commit topo data
  - minimal porcelain: add and walk cache slices
  - appropriate tests

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
removed useless test.

  Makefile                  |    2 +
  builtin-rev-cache.c       |  207 ++++++++
  builtin.h                 |    1 +
  commit.c                  |    2 +
  git.c                     |    1 +
  rev-cache.c               | 1171 +++++++++++++++++++++++++++++++++++++++++++++
  rev-cache.h               |  107 ++++
  revision.c                |    2 +-
  revision.h                |   26 +-
  t/t6017-rev-cache-list.sh |  106 ++++
  10 files changed, 1623 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 12defd4..3098cc7 100644
--- a/Makefile
+++ b/Makefile
@@ -534,6 +534,7 @@ LIB_OBJS += refs.o
  LIB_OBJS += remote.o
  LIB_OBJS += replace_object.o
  LIB_OBJS += rerere.o
+LIB_OBJS += rev-cache.o
  LIB_OBJS += revision.o
  LIB_OBJS += run-command.o
  LIB_OBJS += server-info.o
@@ -626,6 +627,7 @@ BUILTIN_OBJS += builtin-remote.o
  BUILTIN_OBJS += builtin-replace.o
  BUILTIN_OBJS += builtin-rerere.o
  BUILTIN_OBJS += builtin-reset.o
+BUILTIN_OBJS += builtin-rev-cache.o
  BUILTIN_OBJS += builtin-rev-list.o
  BUILTIN_OBJS += builtin-rev-parse.o
  BUILTIN_OBJS += builtin-revert.o
diff --git a/builtin-rev-cache.c b/builtin-rev-cache.c
new file mode 100644
index 0000000..6eb7065
--- /dev/null
+++ b/builtin-rev-cache.c
@@ -0,0 +1,207 @@
+#include "cache.h"
+#include "object.h"
+#include "commit.h"
+#include "diff.h"
+#include "revision.h"
+#include "rev-cache.h"
+
+/* porcelain for rev-cache.c */
+static int handle_add(int argc, const char *argv[]) /* args beyond this command */
+{
+	struct rev_info revs;
+	struct rev_cache_info rci;
+	char dostdin = 0;
+	unsigned int flags = 0;
+	int i, retval;
+	unsigned char cache_sha1[20];
+	struct commit_list *starts = 0, *ends = 0;
+	struct commit *commit;
+
+	init_revisions(&revs, 0);
+	init_rev_cache_info(&rci);
+
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i], "--stdin"))
+			dostdin = 1;
+		else if (!strcmp(argv[i], "--fresh") || !strcmp(argv[i], "--incremental"))
+			starts_from_slices(&revs, UNINTERESTING);
+		else if (!strcmp(argv[i], "--not"))
+			flags ^= UNINTERESTING;
+		else if (!strcmp(argv[i], "--legs"))
+			rci.legs = 1;
+		else if (!strcmp(argv[i], "--no-objects"))
+			rci.objects = 0;
+		else if (!strcmp(argv[i], "--all")) {
+			const char *args[2];
+			int argn = 0;
+
+			args[argn++] = "rev-list";
+			args[argn++] = "--all";
+			setup_revisions(argn, args, &revs, 0);
+		} else
+			handle_revision_arg(argv[i], &revs, flags, 1);
+	}
+
+	if (dostdin) {
+		char line[1000];
+
+		flags = 0;
+		while (fgets(line, sizeof(line), stdin)) {
+			int len = strlen(line);
+			while (len && (line[len - 1] == '\n' || line[len - 1] == '\r'))
+				line[--len] = 0;
+
+			if (!len)
+				break;
+
+			if (!strcmp(line, "--not"))
+				flags ^= UNINTERESTING;
+			else
+				handle_revision_arg(line, &revs, flags, 1);
+		}
+	}
+
+	retval = make_cache_slice(&rci, &revs, &starts, &ends, cache_sha1);
+	if (retval < 0)
+		return retval;
+
+	printf("%s\n", sha1_to_hex(cache_sha1));
+
+	fprintf(stderr, "endpoints:\n");
+	while ((commit = pop_commit(&starts)))
+		fprintf(stderr, "S %s\n", sha1_to_hex(commit->object.sha1));
+	while ((commit = pop_commit(&ends)))
+		fprintf(stderr, "E %s\n", sha1_to_hex(commit->object.sha1));
+
+	return 0;
+}
+
+static int handle_walk(int argc, const char *argv[])
+{
+	struct commit *commit;
+	struct rev_info revs;
+	struct commit_list *queue, *work, **qp;
+	unsigned char *sha1p, *sha1pt;
+	unsigned long date = 0;
+	unsigned int flags = 0;
+	int retval, slop = 5, i;
+
+	init_revisions(&revs, 0);
+
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i], "--not"))
+			flags ^= UNINTERESTING;
+		else if (!strcmp(argv[i], "--objects"))
+			revs.tree_objects = revs.blob_objects = 1;
+		else
+			handle_revision_arg(argv[i], &revs, flags, 1);
+	}
+
+	work = 0;
+	sha1p = 0;
+	for (i = 0; i < revs.pending.nr; i++) {
+		commit = lookup_commit(revs.pending.objects[i].item->sha1);
+
+		sha1pt = get_cache_slice(commit);
+		if (!sha1pt)
+			die("%s: not in a cache slice", sha1_to_hex(commit->object.sha1));
+
+		if (!i)
+			sha1p = sha1pt;
+		else if (sha1p != sha1pt)
+			die("walking porcelain is /per/ cache slice; commits cannot be spread out among several");
+
+		insert_by_date(commit, &work);
+	}
+
+	if (!sha1p)
+		die("nothing to traverse!");
+
+	queue = 0;
+	qp = &queue;
+	commit = pop_commit(&work);
+	retval = traverse_cache_slice(&revs, sha1p, commit, &date, &slop, &qp, &work);
+	if (retval < 0)
+		return retval;
+
+	fprintf(stderr, "queue:\n");
+	while ((commit = pop_commit(&queue)) != 0) {
+		printf("%s\n", sha1_to_hex(commit->object.sha1));
+	}
+
+	fprintf(stderr, "work:\n");
+	while ((commit = pop_commit(&work)) != 0) {
+		printf("%s\n", sha1_to_hex(commit->object.sha1));
+	}
+
+	fprintf(stderr, "pending:\n");
+	for (i = 0; i < revs.pending.nr; i++) {
+		struct object *obj = revs.pending.objects[i].item;
+
+		/* unfortunately, despite our careful generation, object duplication *is* a possibility...
+		 * (eg. same object introduced into two different branches) */
+		if (obj->flags & SEEN)
+			continue;
+
+		printf("%s\n", sha1_to_hex(revs.pending.objects[i].item->sha1));
+		obj->flags |= SEEN;
+	}
+
+	return 0;
+}
+
+static int handle_help(void)
+{
+	char *usage = "\
+usage:\n\
+git-rev-cache COMMAND [options] [<commit-id>...]\n\
+commands:\n\
+  add    - add revisions to the cache.  reads commit ids from stdin, \n\
+           formatted as: START START ... --not END END ...\n\
+           options:\n\
+            --all                  use all branch heads as starts\n\
+            --fresh/--incremental  exclude everything already in a cache slice\n\
+            --stdin                also read commit ids from stdin (same form\n\
+                                   as cmd)\n\
+            --legs                 ensure branch is entirely self-contained\n\
+            --no-objects           don't add non-commit objects to slice\n\
+  walk   - walk a cache slice based on set of commits; formatted as add\n\
+           options:\n\
+           --objects               include non-commit objects in traversals\n\
+  fuse   - coalesce cache slices into a single cache.\n\
+           options:\n\
+            --all                  include all objects in repository\n\
+            --no-objects           don't add non-commit objects to slice\n\
+            --ignore-size[=N]      ignore slices of size >= N; defaults to ~5MB\n\
+  index  - regenerate the cache index.";
+
+	puts(usage);
+
+	return 0;
+}
+
+int cmd_rev_cache(int argc, const char *argv[], const char *prefix)
+{
+	const char *arg;
+	int r;
+
+	git_config(git_default_config, NULL);
+
+	if (argc > 1)
+		arg = argv[1];
+	else
+		arg = "";
+
+	argc -= 2;
+	argv += 2;
+	if (!strcmp(arg, "add"))
+		r = handle_add(argc, argv);
+	else if (!strcmp(arg, "walk"))
+		r = handle_walk(argc, argv);
+	else
+		return handle_help();
+
+	fprintf(stderr, "final return value: %d\n", r);
+
+	return 0;
+}
diff --git a/builtin.h b/builtin.h
index a2174dc..2f89feb 100644
--- a/builtin.h
+++ b/builtin.h
@@ -86,6 +86,7 @@ extern int cmd_remote(int argc, const char **argv, const char *prefix);
  extern int cmd_config(int argc, const char **argv, const char *prefix);
  extern int cmd_rerere(int argc, const char **argv, const char *prefix);
  extern int cmd_reset(int argc, const char **argv, const char *prefix);
+extern int cmd_rev_cache(int argc, const char **argv, const char *prefix);
  extern int cmd_rev_list(int argc, const char **argv, const char *prefix);
  extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
  extern int cmd_revert(int argc, const char **argv, const char *prefix);
diff --git a/commit.c b/commit.c
index fedbd5e..61d83c6 100644
--- a/commit.c
+++ b/commit.c
@@ -252,6 +252,8 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
  	item->tree = lookup_tree(parent);
  	bufptr += 46; /* "tree " + "hex sha1" + "\n" */
  	pptr = &item->parents;
+	while (pop_commit(pptr))
+		; /* clear anything from cache */

  	graft = lookup_commit_graft(item->object.sha1);
  	while (bufptr + 48 < tail && !memcmp(bufptr, "parent ", 7)) {
diff --git a/git.c b/git.c
index 9883009..dfaa07f 100644
--- a/git.c
+++ b/git.c
@@ -343,6 +343,7 @@ static void handle_internal_command(int argc, const char **argv)
  		{ "repo-config", cmd_config },
  		{ "rerere", cmd_rerere, RUN_SETUP },
  		{ "reset", cmd_reset, RUN_SETUP },
+		{ "rev-cache", cmd_rev_cache, RUN_SETUP },
  		{ "rev-list", cmd_rev_list, RUN_SETUP },
  		{ "rev-parse", cmd_rev_parse },
  		{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
diff --git a/rev-cache.c b/rev-cache.c
new file mode 100644
index 0000000..8951cdf
--- /dev/null
+++ b/rev-cache.c
@@ -0,0 +1,1171 @@
+#include "cache.h"
+#include "object.h"
+#include "commit.h"
+#include "tree.h"
+#include "tree-walk.h"
+#include "blob.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "rev-cache.h"
+#include "run-command.h"
+
+/* list resembles pack index format */
+static uint32_t fanout[0xff + 2];
+
+static unsigned char *idx_map;
+static int idx_size;
+static struct rc_index_header idx_head;
+static unsigned char *idx_caches;
+static char no_idx;
+
+static struct strbuf *acc_buffer;
+
+#define SLOP			5
+
+/* initialization */
+
+struct rc_index_entry *from_disked_rc_index_entry(struct rc_index_entry_ondisk *src, struct rc_index_entry *dst)
+{
+	static struct rc_index_entry entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	dst->sha1 = src->sha1;
+	dst->is_start = !!(src->flags & 0x80);
+	dst->cache_index = src->flags & 0x7f;
+	dst->pos = ntohl(src->pos);
+
+	return dst;
+}
+
+struct rc_index_entry_ondisk *to_disked_rc_index_entry(struct rc_index_entry *src, struct rc_index_entry_ondisk *dst)
+{
+	static struct rc_index_entry_ondisk entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	if (dst->sha1 != src->sha1)
+		hashcpy(dst->sha1, src->sha1);
+	dst->flags = (unsigned char)src->is_start << 7 | (unsigned char)src->cache_index;
+	dst->pos = htonl(src->pos);
+
+	return dst;
+}
+
+struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondisk *src, struct rc_object_entry *dst)
+{
+	static struct rc_object_entry entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	dst->type = src->flags >> 5;
+	dst->is_end = !!(src->flags & 0x10);
+	dst->is_start = !!(src->flags & 0x08);
+	dst->uninteresting = !!(src->flags & 0x04);
+	dst->include = !!(src->flags & 0x02);
+	dst->flag = !!(src->flags & 0x01);
+
+	dst->sha1 = src->sha1;
+	dst->merge_nr = src->merge_nr;
+	dst->split_nr = src->split_nr;
+
+	dst->size_size = src->sizes >> 5;
+	dst->padding = src->sizes & 0x1f;
+
+	dst->date = ntohl(src->date);
+	dst->path = ntohs(src->path);
+
+	return dst;
+}
+
+struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry *src, struct rc_object_entry_ondisk *dst)
+{
+	static struct rc_object_entry_ondisk entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	dst->flags  = (unsigned char)src->type << 5;
+	dst->flags |= (unsigned char)src->is_end << 4;
+	dst->flags |= (unsigned char)src->is_start << 3;
+	dst->flags |= (unsigned char)src->uninteresting << 2;
+	dst->flags |= (unsigned char)src->include << 1;
+	dst->flags |= (unsigned char)src->flag;
+
+	if (dst->sha1 != src->sha1)
+		hashcpy(dst->sha1, src->sha1);
+	dst->merge_nr = src->merge_nr;
+	dst->split_nr = src->split_nr;
+
+	dst->sizes  = (unsigned char)src->size_size << 5;
+	dst->sizes |= (unsigned char)src->padding;
+
+	dst->date = htonl(src->date);
+	dst->path = htons(src->path);
+
+	return dst;
+}
+
+static int get_index_head(unsigned char *map, int len, struct rc_index_header *head, uint32_t *fanout, unsigned char **caches)
+{
+	struct rc_index_header whead;
+	int i, index = sizeof(struct rc_index_header);
+
+	memcpy(&whead, map, sizeof(struct rc_index_header));
+	if (memcmp(whead.signature, "REVINDEX", 8) || whead.version != SUPPORTED_REVINDEX_VERSION)
+		return -1;
+
+	memcpy(head->signature, "REVINDEX", 8);
+	head->version = whead.version;
+	head->ofs_objects = ntohl(whead.ofs_objects);
+	head->object_nr = ntohl(whead.object_nr);
+	head->cache_nr = whead.cache_nr;
+	head->max_date = ntohl(whead.max_date);
+
+	if (len < index + head->cache_nr * 20 + 0x100 * sizeof(uint32_t))
+		return -2;
+
+	*caches = xmalloc(head->cache_nr * 20);
+	memcpy(*caches, map + index, head->cache_nr * 20);
+	index += head->cache_nr * 20;
+
+	memcpy(fanout, map + index, 0x100 * sizeof(uint32_t));
+	for (i = 0; i <= 0xff; i++)
+		fanout[i] = ntohl(fanout[i]);
+	fanout[0x100] = len;
+
+	return 0;
+}
+
+/* added in init_index */
+static void cleanup_cache_slices(void)
+{
+	if (idx_map) {
+		free(idx_caches);
+		munmap(idx_map, idx_size);
+		idx_map = 0;
+	}
+
+}
+
+static int init_index(void)
+{
+	int fd;
+	struct stat fi;
+
+	fd = open(git_path("rev-cache/index"), O_RDONLY);
+	if (fd == -1 || fstat(fd, &fi))
+		goto end;
+	if (fi.st_size < sizeof(struct rc_index_header))
+		goto end;
+
+	idx_size = fi.st_size;
+	idx_map = xmmap(0, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+	if (idx_map == MAP_FAILED)
+		goto end;
+	if (get_index_head(idx_map, fi.st_size, &idx_head, fanout, &idx_caches))
+		goto end;
+
+	atexit(cleanup_cache_slices);
+
+	return 0;
+
+end:
+	idx_map = 0;
+	no_idx = 1;
+	return -1;
+}
+
+/* this assumes index is already loaded */
+static struct rc_index_entry_ondisk *search_index_1(unsigned char *sha1)
+{
+	int start, end, starti, endi, i, len, r;
+	struct rc_index_entry_ondisk *ie;
+
+	if (!idx_map)
+		return 0;
+
+	/* binary search */
+	start = fanout[(int)sha1[0]];
+	end = fanout[(int)sha1[0] + 1];
+	len = (end - start) / sizeof(struct rc_index_entry_ondisk);
+	if (!len || len * sizeof(struct rc_index_entry_ondisk) != end - start)
+		return 0;
+
+	starti = 0;
+	endi = len - 1;
+	for (;;) {
+		i = (endi + starti) / 2;
+		ie = (struct rc_index_entry_ondisk *)(idx_map + start + i * sizeof(struct rc_index_entry_ondisk));
+		r = hashcmp(sha1, ie->sha1);
+
+		if (r) {
+			if (starti + 1 == endi) {
+				starti++;
+				continue;
+			} else if (starti == endi)
+				break;
+
+			if (r > 0)
+				starti = i;
+			else /* r < 0 */
+				endi = i;
+		} else
+			return ie;
+	}
+
+	return 0;
+}
+
+static struct rc_index_entry *search_index(unsigned char *sha1)
+{
+	struct rc_index_entry_ondisk *ied = search_index_1(sha1);
+
+	if (ied)
+		return from_disked_rc_index_entry(ied, 0);
+
+	return 0;
+}
+
+unsigned char *get_cache_slice(struct commit *commit)
+{
+	struct rc_index_entry *ie;
+
+	if (!idx_map) {
+		if (no_idx)
+			return 0;
+		init_index();
+	}
+
+	if (commit->date > idx_head.max_date)
+		return 0;
+
+	ie = search_index(commit->object.sha1);
+	if (ie && ie->cache_index < idx_head.cache_nr)
+		return idx_caches + ie->cache_index * 20;
+
+	return 0;
+}
+
+
+/* traversal */
+
+static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work)
+{
+	struct rc_index_entry *iep;
+	struct rc_object_entry *oep;
+	struct commit_list *prev, *wp, **wpp;
+	int retval;
+
+	iep = search_index(commit->object.sha1), 0;
+	oep = RC_OBTAIN_OBJECT_ENTRY(map + iep->pos);
+
+	/* the .uninteresting bit isn't strictly necessary, as we check the object during traversal as well,
+	 * but we might as well initialize it while we're at it */
+	oep->include = 1;
+	oep->uninteresting = !!(commit->object.flags & UNINTERESTING);
+	to_disked_rc_object_entry(oep, (struct rc_object_entry_ondisk *)(map + iep->pos));
+	retval = iep->pos;
+
+	/* include any others in the work array */
+	prev = 0;
+	wpp = work;
+	wp = *work;
+	while (wp) {
+		struct object *obj = &wp->item->object;
+		struct commit *co;
+
+		/* is this in our cache slice? */
+		iep = search_index(obj->sha1);
+		if (!iep || hashcmp(idx_caches + iep->cache_index * 20, head->sha1)) {
+			prev = wp;
+			wp = wp->next;
+			wpp = &wp;
+			continue;
+		}
+
+		if (iep->pos < retval)
+			retval = iep->pos;
+
+		oep = RC_OBTAIN_OBJECT_ENTRY(map + iep->pos);
+
+		/* mark this for later */
+		oep->include = 1;
+		oep->uninteresting = !!(obj->flags & UNINTERESTING);
+		to_disked_rc_object_entry(oep, (struct rc_object_entry_ondisk *)(map + iep->pos));
+
+		/* remove from work list */
+		co = pop_commit(wpp);
+		wp = *wpp;
+		if (prev)
+			prev->next = wp;
+	}
+
+	return retval;
+}
+
+#define IPATH				0x40
+#define UPATH				0x80
+
+#define GET_COUNT(x)		((x) & 0x3f)
+#define SET_COUNT(x, s)		((x) = ((x) & ~0x3f) | ((s) & 0x3f))
+
+static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *map,
+	struct rev_info *revs, struct commit *commit,
+	unsigned long *date_so_far, int *slop_so_far,
+	struct commit_list ***queue, struct commit_list **work)
+{
+	struct commit_list *insert_cache = 0;
+	struct commit **last_objects, *co;
+	int i, total_path_nr = head->path_nr, retval = -1;
+	char consume_children = 0;
+	unsigned char *paths;
+
+	i = setup_traversal(head, map, commit, work);
+	if (i < 0)
+		return -1;
+
+	paths = xcalloc(total_path_nr, sizeof(uint16_t));
+	last_objects = xcalloc(total_path_nr, sizeof(struct commit *));
+
+	/* i already set */
+	while (i < head->size) {
+		struct rc_object_entry *entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
+		int path = entry->path;
+		struct object *obj;
+		int index = i;
+
+		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
+
+		/* add extra objects if necessary */
+		if (entry->type != OBJ_COMMIT)
+			continue;
+		else
+			consume_children = 0;
+
+		if (path >= total_path_nr)
+			goto end;
+
+		/* in one of our branches?
+		 * uninteresting trumps interesting */
+		if (entry->include)
+			paths[path] |= entry->uninteresting ? UPATH : IPATH;
+		else if (!paths[path])
+			continue;
+
+		/* date stuff */
+		if (revs->max_age != -1 && entry->date < revs->max_age)
+			paths[path] |= UPATH;
+
+		/* lookup object */
+		co = lookup_commit(entry->sha1);
+		obj = &co->object;
+
+		if (obj->flags & UNINTERESTING)
+			paths[path] |= UPATH;
+
+		if ((paths[path] & IPATH) && (paths[path] & UPATH)) {
+			paths[path] = UPATH;
+
+			/* mark edge */
+			if (last_objects[path]) {
+				parse_commit(last_objects[path]);
+
+				last_objects[path]->object.flags &= ~FACE_VALUE;
+				last_objects[path] = 0;
+			}
+		}
+
+		/* now we gotta re-assess the whole interesting thing... */
+		entry->uninteresting = !!(paths[path] & UPATH);
+
+		/* first close paths */
+		if (entry->split_nr) {
+			int j, off = index + sizeof(struct rc_object_entry_ondisk) + RC_PATH_SIZE(entry->merge_nr);
+
+			for (j = 0; j < entry->split_nr; j++) {
+				unsigned short p = ntohs(*(uint16_t *)(map + off + RC_PATH_SIZE(j)));
+
+				if (p >= total_path_nr)
+					goto end;
+
+				/* boundary commit? */
+				if ((paths[p] & IPATH) && entry->uninteresting) {
+					if (last_objects[p]) {
+						parse_commit(last_objects[p]);
+
+						last_objects[p]->object.flags &= ~FACE_VALUE;
+						last_objects[p] = 0;
+					}
+				} else if (last_objects[p] && !last_objects[p]->object.parsed)
+					commit_list_insert(co, &last_objects[p]->parents);
+
+				/* can't close a merge path until all parents have been encountered */
+				if (GET_COUNT(paths[p])) {
+					SET_COUNT(paths[p], GET_COUNT(paths[p]) - 1);
+
+					if (GET_COUNT(paths[p]))
+						continue;
+				}
+
+				paths[p] = 0;
+				last_objects[p] = 0;
+			}
+		}
+
+		/* make topo relations */
+		if (last_objects[path] && !last_objects[path]->object.parsed)
+			commit_list_insert(co, &last_objects[path]->parents);
+
+		/* initialize commit */
+		if (!entry->is_end) {
+			co->date = entry->date;
+			obj->flags |= ADDED | FACE_VALUE;
+		} else
+			parse_commit(co);
+
+		obj->flags |= SEEN;
+
+		if (entry->uninteresting)
+			obj->flags |= UNINTERESTING;
+
+		/* we need to know what the edges are */
+		last_objects[path] = co;
+
+		/* add to list */
+		if (!(obj->flags & UNINTERESTING) || revs->show_all) {
+			if (entry->is_end)
+				insert_by_date_cached(co, work, insert_cache, &insert_cache);
+			else
+				*queue = &commit_list_insert(co, *queue)->next;
+
+			/* add children to list as well */
+			if (obj->flags & UNINTERESTING)
+				consume_children = 0;
+			else
+				consume_children = 1;
+		}
+
+		/* open parents */
+		if (entry->merge_nr) {
+			int j, off = index + sizeof(struct rc_object_entry_ondisk);
+			char flag = entry->uninteresting ? UPATH : IPATH;
+
+			for (j = 0; j < entry->merge_nr; j++) {
+				unsigned short p = ntohs(*(uint16_t *)(map + off + RC_PATH_SIZE(j)));
+
+				if (p >= total_path_nr)
+					goto end;
+
+				if (paths[p] & flag)
+					continue;
+
+				paths[p] |= flag;
+			}
+
+			/* make sure we don't use this path before all our parents have had their say */
+			SET_COUNT(paths[path], entry->merge_nr);
+		}
+
+	}
+
+	retval = 0;
+
+end:
+	free(paths);
+	free(last_objects);
+
+	return retval;
+}
+
+static int get_cache_slice_header(unsigned char *cache_sha1, unsigned char *map, int len, struct rc_slice_header *head)
+{
+	int t;
+
+	memcpy(head, map, sizeof(struct rc_slice_header));
+	head->ofs_objects = ntohl(head->ofs_objects);
+	head->object_nr = ntohl(head->object_nr);
+	head->size = ntohl(head->size);
+	head->path_nr = ntohs(head->path_nr);
+
+	if (memcmp(head->signature, "REVCACHE", 8))
+		return -1;
+	if (head->version != SUPPORTED_REVCACHE_VERSION)
+		return -2;
+	if (hashcmp(head->sha1, cache_sha1))
+		return -3;
+	t = sizeof(struct rc_slice_header);
+	if (t != head->ofs_objects || t >= len)
+		return -4;
+
+	head->size = len;
+
+	return 0;
+}
+
+int traverse_cache_slice(struct rev_info *revs,
+	unsigned char *cache_sha1, struct commit *commit,
+	unsigned long *date_so_far, int *slop_so_far,
+	struct commit_list ***queue, struct commit_list **work)
+{
+	int fd = -1, retval = -3;
+	struct stat fi;
+	struct rc_slice_header head;
+	struct rev_cache_info *rci;
+	unsigned char *map = MAP_FAILED;
+
+	/* the index should've been loaded already to find cache_sha1, but it's good
+	 * to be absolutely sure... */
+	if (!idx_map)
+		init_index();
+	if (!idx_map)
+		return -1;
+
+	/* load options */
+	rci = &revs->rev_cache_info;
+
+	memset(&head, 0, sizeof(struct rc_slice_header));
+
+	fd = open(git_path("rev-cache/%s", sha1_to_hex(cache_sha1)), O_RDONLY);
+	if (fd == -1)
+		goto end;
+	if (fstat(fd, &fi) || fi.st_size < sizeof(struct rc_slice_header))
+		goto end;
+
+	map = xmmap(0, fi.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	if (map == MAP_FAILED)
+		goto end;
+	if (get_cache_slice_header(cache_sha1, map, fi.st_size, &head))
+		goto end;
+
+	retval = traverse_cache_slice_1(&head, map, revs, commit, date_so_far, slop_so_far, queue, work);
+
+end:
+	if (map != MAP_FAILED)
+		munmap(map, fi.st_size);
+	if (fd != -1)
+		close(fd);
+
+	return retval;
+}
+
+
+
+/* generation */
+
+struct path_track {
+	struct commit *commit;
+	int path; /* for keeping track of children */
+
+	struct path_track *next, *prev;
+};
+
+static unsigned char *paths;
+static int path_nr = 1, path_sz;
+
+static struct path_track *path_track;
+static struct path_track *path_track_alloc;
+
+#define PATH_IN_USE			0x80 /* biggest bit we can get as a char */
+
+static int get_new_path(void)
+{
+	int i;
+
+	for (i = 1; i < path_nr; i++)
+		if (!paths[i])
+			break;
+
+	if (i == path_nr) {
+		if (path_nr >= path_sz) {
+			path_sz += 50;
+			paths = xrealloc(paths, path_sz);
+			memset(paths + path_sz - 50, 0, 50);
+		}
+		path_nr++;
+	}
+
+	paths[i] = PATH_IN_USE;
+	return i;
+}
+
+static void remove_path_track(struct path_track **ppt, char total_free)
+{
+	struct path_track *t = *ppt;
+
+	if (t->next)
+		t->next->prev = t->prev;
+	if (t->prev)
+		t->prev->next = t->next;
+
+	t = t->next;
+
+	if (total_free)
+		free(*ppt);
+	else {
+		(*ppt)->next = path_track_alloc;
+		path_track_alloc = *ppt;
+	}
+
+	*ppt = t;
+}
+
+static struct path_track *make_path_track(struct path_track **head, struct commit *commit)
+{
+	struct path_track *pt;
+
+	if (path_track_alloc) {
+		pt = path_track_alloc;
+		path_track_alloc = pt->next;
+	} else
+		pt = xmalloc(sizeof(struct path_track));
+
+	memset(pt, 0, sizeof(struct path_track));
+	pt->commit = commit;
+
+	pt->next = *head;
+	if (*head)
+		(*head)->prev = pt;
+	*head = pt;
+
+	return pt;
+}
+
+static void add_path_to_track(struct commit *commit, int path)
+{
+	make_path_track(&path_track, commit);
+	path_track->path = path;
+}
+
+static void handle_paths(struct commit *commit, struct rc_object_entry *object, struct strbuf *merge_str, struct strbuf *split_str)
+{
+	int child_nr, parent_nr, open_parent_nr, this_path;
+	struct commit_list *list;
+	struct commit *first_parent;
+	struct path_track **ppt, *pt;
+
+	/* we can only re-use a closed path once all its children have been encountered,
+	 * as we need to keep track of commit boundaries */
+	ppt = &path_track;
+	pt = *ppt;
+	child_nr = 0;
+	while (pt) {
+		if (pt->commit == commit) {
+			uint16_t write_path;
+
+			if (paths[pt->path] != PATH_IN_USE)
+				paths[pt->path]--;
+
+			/* make sure we can handle this */
+			child_nr++;
+			if (child_nr > 0x7f)
+				die("%s: too many branches!  rev-cache can only handle %d parents/children per commit",
+					sha1_to_hex(object->sha1), 0x7f);
+
+			/* add to split list */
+			object->split_nr++;
+			write_path = htons((uint16_t)pt->path);
+			strbuf_add(split_str, &write_path, sizeof(uint16_t));
+
+			remove_path_track(ppt, 0);
+			pt = *ppt;
+		} else {
+			pt = pt->next;
+			ppt = &pt;
+		}
+	}
+
+	/* initialize our self! */
+	if (!commit->indegree) {
+		commit->indegree = get_new_path();
+		object->is_start = 1;
+	}
+
+	this_path = commit->indegree;
+	paths[this_path] = PATH_IN_USE;
+	object->path = this_path;
+
+	/* count interesting parents */
+	parent_nr = open_parent_nr = 0;
+	first_parent = 0;
+	for (list = commit->parents; list; list = list->next) {
+		if (list->item->object.flags & UNINTERESTING) {
+			object->is_end = 1;
+			continue;
+		}
+
+		parent_nr++;
+		if (!list->item->indegree)
+			open_parent_nr++;
+		if (!first_parent)
+			first_parent = list->item;
+	}
+
+	if (!parent_nr)
+		return;
+
+	if (parent_nr == 1 && open_parent_nr == 1) {
+		first_parent->indegree = this_path;
+		return;
+	}
+
+	/* bail out on obscene parent/child #s */
+	if (parent_nr > 0x7f)
+		die("%s: too many parents in merge!  rev-cache can only handle %d parents/children per commit",
+			sha1_to_hex(object->sha1), 0x7f);
+
+	/* make merge list */
+	object->merge_nr = parent_nr;
+	paths[this_path] = parent_nr;
+
+	for (list = commit->parents; list; list = list->next) {
+		struct commit *p = list->item;
+		uint16_t write_path;
+
+		if (p->object.flags & UNINTERESTING)
+			continue;
+
+		/* unfortunately, due to boundary tracking, we can't re-use merge paths
+		 * (we are unable to guarantee last parent path = this, so the last parent
+		 * won't always be able to set this as a boundary object) */
+		if (!p->indegree)
+			p->indegree = get_new_path();
+
+		write_path = htons((uint16_t)p->indegree);
+		strbuf_add(merge_str, &write_path, sizeof(uint16_t));
+
+		/* make sure path is properly ended */
+		add_path_to_track(p, this_path);
+	}
+
+}
+
+
+static void add_object_entry(const unsigned char *sha1, int type, struct rc_object_entry *nothisone,
+	struct strbuf *merge_str, struct strbuf *split_str)
+{
+	struct rc_object_entry object;
+
+	if (!nothisone) {
+		memset(&object, 0, sizeof(object));
+		object.sha1 = (unsigned char *)sha1;
+		object.type = type;
+
+		if (merge_str)
+			object.merge_nr = merge_str->len / sizeof(uint16_t);
+		if (split_str)
+			object.split_nr = split_str->len / sizeof(uint16_t);
+
+		nothisone = &object;
+	}
+
+	strbuf_add(acc_buffer, to_disked_rc_object_entry(nothisone, 0), sizeof(struct rc_object_entry_ondisk));
+
+	if (merge_str && merge_str->len)
+		strbuf_add(acc_buffer, merge_str->buf, merge_str->len);
+	if (split_str && split_str->len)
+		strbuf_add(acc_buffer, split_str->buf, split_str->len);
+
+}
+
+static void init_revcache_directory(void)
+{
+	struct stat fi;
+
+	if (stat(git_path("rev-cache"), &fi) || !S_ISDIR(fi.st_mode))
+		if (mkdir(git_path("rev-cache"), 0777))
+			die("can't make rev-cache directory");
+
+}
+
+void init_rev_cache_info(struct rev_cache_info *rci)
+{
+	rci->objects = 1;
+	rci->legs = 0;
+	rci->make_index = 1;
+
+	rci->add_to_pending = 1;
+
+	rci->ignore_size = 0;
+}
+
+void maybe_fill_with_defaults(struct rev_cache_info *rci)
+{
+	static struct rev_cache_info def_rci;
+
+	if (rci)
+		return;
+
+	init_rev_cache_info(&def_rci);
+	rci = &def_rci;
+}
+
+int make_cache_slice(struct rev_cache_info *rci,
+	struct rev_info *revs, struct commit_list **starts, struct commit_list **ends,
+	unsigned char *cache_sha1)
+{
+	struct rev_info therevs;
+	struct strbuf buffer, startlist, endlist;
+	struct rc_slice_header head;
+	struct commit *commit;
+	unsigned char sha1[20];
+	struct strbuf merge_paths, split_paths;
+	int object_nr, total_sz, fd;
+	char file[PATH_MAX], *newfile;
+	struct rev_cache_info *trci;
+	git_SHA_CTX ctx;
+
+	maybe_fill_with_defaults(rci);
+
+	init_revcache_directory();
+	strcpy(file, git_path("rev-cache/XXXXXX"));
+	fd = xmkstemp(file);
+
+	strbuf_init(&buffer, 0);
+	strbuf_init(&startlist, 0);
+	strbuf_init(&endlist, 0);
+	strbuf_init(&merge_paths, 0);
+	strbuf_init(&split_paths, 0);
+	acc_buffer = &buffer;
+
+	if (!revs) {
+		revs = &therevs;
+		init_revisions(revs, 0);
+
+		/* we're gonna assume no one else has already traversed this... */
+		while ((commit = pop_commit(starts)))
+			add_pending_object(revs, &commit->object, 0);
+
+		while ((commit = pop_commit(ends))) {
+			commit->object.flags |= UNINTERESTING;
+			add_pending_object(revs, &commit->object, 0);
+		}
+	}
+
+	/* write head placeholder */
+	memset(&head, 0, sizeof(head));
+	head.ofs_objects = htonl(sizeof(head));
+	xwrite(fd, &head, sizeof(head));
+
+	/* init revisions! */
+	revs->tree_objects = 1;
+	revs->blob_objects = 1;
+	revs->topo_order = 1;
+	revs->lifo = 1;
+
+	/* re-use info from other caches if possible */
+	trci = &revs->rev_cache_info;
+	init_rev_cache_info(trci);
+	trci->add_to_pending = 0;
+
+	setup_revisions(0, 0, revs, 0);
+	if (prepare_revision_walk(revs))
+		die("died preparing revision walk");
+
+	object_nr = total_sz = 0;
+	while ((commit = get_revision(revs)) != 0) {
+		struct rc_object_entry object;
+
+		strbuf_setlen(&merge_paths, 0);
+		strbuf_setlen(&split_paths, 0);
+
+		memset(&object, 0, sizeof(object));
+		object.type = OBJ_COMMIT;
+		object.date = commit->date;
+		object.sha1 = commit->object.sha1;
+
+		handle_paths(commit, &object, &merge_paths, &split_paths);
+
+		if (object.is_end) {
+			strbuf_add(&endlist, object.sha1, 20);
+			if (ends)
+				commit_list_insert(commit, ends);
+		}
+		/* the two *aren't* mutually exclusive */
+		if (object.is_start) {
+			strbuf_add(&startlist, object.sha1, 20);
+			if (starts)
+				commit_list_insert(commit, starts);
+		}
+
+		commit->indegree = 0;
+
+		add_object_entry(0, 0, &object, &merge_paths, &split_paths);
+		object_nr++;
+
+		/* print every ~1MB or so */
+		if (buffer.len > 1000000) {
+			write_in_full(fd, buffer.buf, buffer.len);
+			total_sz += buffer.len;
+
+			strbuf_setlen(&buffer, 0);
+		}
+	}
+
+	if (buffer.len) {
+		write_in_full(fd, buffer.buf, buffer.len);
+		total_sz += buffer.len;
+	}
+
+	/* go ahead and free some stuff... */
+	strbuf_release(&buffer);
+	strbuf_release(&merge_paths);
+	strbuf_release(&split_paths);
+	if (path_sz)
+		free(paths);
+	while (path_track_alloc)
+		remove_path_track(&path_track_alloc, 1);
+
+	/* the meaning of the hash name is more or less irrelevant, it's the uniqueness that matters */
+	strbuf_add(&endlist, startlist.buf, startlist.len);
+	git_SHA1_Init(&ctx);
+	git_SHA1_Update(&ctx, endlist.buf, endlist.len);
+	git_SHA1_Final(sha1, &ctx);
+
+	/* now actually initialize header */
+	memcpy(head.signature, "REVCACHE", 8); /* signature[] is 8 bytes; no room for a NUL */
+	head.version = SUPPORTED_REVCACHE_VERSION;
+
+	head.object_nr = htonl(object_nr);
+	head.size = htonl(ntohl(head.ofs_objects) + total_sz);
+	head.path_nr = htons(path_nr);
+	hashcpy(head.sha1, sha1);
+
+	/* some info! */
+	fprintf(stderr, "objects: %d\n", object_nr);
+	fprintf(stderr, "paths: %d\n", path_nr);
+
+	lseek(fd, 0, SEEK_SET);
+	xwrite(fd, &head, sizeof(head));
+
+	if (rci->make_index && make_cache_index(rci, sha1, fd, ntohl(head.size)) < 0)
+		die("can't update index");
+
+	close(fd);
+
+	newfile = git_path("rev-cache/%s", sha1_to_hex(sha1));
+	if (rename(file, newfile))
+		die("can't move temp file");
+
+	/* let our caller know what we've just made */
+	if (cache_sha1)
+		hashcpy(cache_sha1, sha1);
+
+	strbuf_release(&endlist);
+	strbuf_release(&startlist);
+
+	return 0;
+}
+
+
+static int index_sort_hash(const void *a, const void *b)
+{
+	return hashcmp(((struct rc_index_entry_ondisk *)a)->sha1, ((struct rc_index_entry_ondisk *)b)->sha1);
+}
+
+static int write_cache_index(struct strbuf *body)
+{
+	struct rc_index_header whead;
+	struct lock_file *lk;
+	int fd, i;
+
+	/* clear index map if loaded */
+	if (idx_map) {
+		munmap(idx_map, idx_size);
+		idx_map = 0;
+	}
+
+	lk = xcalloc(sizeof(struct lock_file), 1);
+	fd = hold_lock_file_for_update(lk, git_path("rev-cache/index"), 0);
+	if (fd < 0) {
+		free(lk);
+		return -1;
+	}
+
+	/* endianness yay! */
+	memset(&whead, 0, sizeof(whead));
+	memcpy(whead.signature, "REVINDEX", 8);
+	whead.version = idx_head.version;
+	whead.ofs_objects = htonl(idx_head.ofs_objects);
+	whead.object_nr = htonl(idx_head.object_nr);
+	whead.cache_nr = idx_head.cache_nr;
+	whead.max_date = htonl(idx_head.max_date);
+
+	write_in_full(fd, &whead, sizeof(struct rc_index_header));
+	write_in_full(fd, idx_caches, idx_head.cache_nr * 20);
+
+	for (i = 0; i <= 0xff; i++)
+		fanout[i] = htonl(fanout[i]);
+	write_in_full(fd, fanout, 0x100 * sizeof(uint32_t));
+
+	write_in_full(fd, body->buf, body->len);
+
+	if (commit_lock_file(lk) < 0)
+		return -2;
+
+	/* lk freed by lockfile.c */
+
+	return 0;
+}
+
+int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
+	int fd, unsigned int size)
+{
+	struct strbuf buffer;
+	int i, cache_index, cur;
+	unsigned char *map;
+	unsigned long max_date;
+
+	if (!idx_map)
+		init_index();
+
+	lseek(fd, 0, SEEK_SET);
+	map = xmmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	if (map == MAP_FAILED)
+		return -1;
+
+	strbuf_init(&buffer, 0);
+	if (idx_map) {
+		strbuf_add(&buffer, idx_map + fanout[0], fanout[0x100] - fanout[0]);
+	} else {
+		/* not an update */
+		memset(&idx_head, 0, sizeof(struct rc_index_header));
+		idx_caches = 0;
+
+		memcpy(idx_head.signature, "REVINDEX", 8); /* signature[] is 8 bytes; no room for a NUL */
+		idx_head.version = SUPPORTED_REVINDEX_VERSION;
+		idx_head.ofs_objects = sizeof(struct rc_index_header) + 0x100 * sizeof(uint32_t);
+	}
+
+	/* are we remaking a slice? */
+	for (i = 0; i < idx_head.cache_nr; i++)
+		if (!hashcmp(idx_caches + i * 20, cache_sha1))
+			break;
+
+	if (i == idx_head.cache_nr) {
+		cache_index = idx_head.cache_nr++;
+		idx_head.ofs_objects += 20;
+
+		idx_caches = xrealloc(idx_caches, idx_head.cache_nr * 20);
+		hashcpy(idx_caches + cache_index * 20, cache_sha1);
+	} else
+		cache_index = i;
+
+	i = sizeof(struct rc_slice_header); /* offset */
+	max_date = idx_head.max_date;
+	while (i < size) {
+		struct rc_index_entry index_entry, *entry;
+		struct rc_index_entry_ondisk *disked_entry;
+		struct rc_object_entry *object_entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
+		unsigned long date;
+		int off, pos = i;
+
+		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(object_entry);
+
+		if (object_entry->type != OBJ_COMMIT)
+			continue;
+
+		/* don't include ends; otherwise we'll find ourselves in loops */
+		if (object_entry->is_end)
+			continue;
+
+		/* handle index duplication
+		 * -> keep old copy unless new one is a start -- based on expected usage, older ones will be more
+		 * likely to lead to greater slice traversals than new ones */
+		date = object_entry->date;
+		if (date > idx_head.max_date) {
+			disked_entry = 0;
+			if (date > max_date)
+				max_date = date;
+		} else
+			disked_entry = search_index_1(object_entry->sha1);
+
+		if (disked_entry && !object_entry->is_start)
+			continue;
+		else if (disked_entry) {
+			/* mmm, pointer arithmetic... tasty */  /* (entry - idx_map = offset, so cast is valid) */
+			off = (unsigned int)((unsigned char *)disked_entry - idx_map) - fanout[0];
+			disked_entry = (struct rc_index_entry_ondisk *)(buffer.buf + off);
+			entry = from_disked_rc_index_entry(disked_entry, 0);
+		} else
+			entry = &index_entry;
+
+		memset(entry, 0, sizeof(index_entry));
+		entry->sha1 = object_entry->sha1;
+		entry->is_start = object_entry->is_start;
+		entry->cache_index = cache_index;
+		entry->pos = pos;
+
+		if (entry == &index_entry) {
+			strbuf_add(&buffer, to_disked_rc_index_entry(entry, 0), sizeof(struct rc_index_entry_ondisk));
+			idx_head.object_nr++;
+		} else
+			to_disked_rc_index_entry(entry, disked_entry);
+
+	}
+
+	idx_head.max_date = max_date;
+	qsort(buffer.buf, buffer.len / sizeof(struct rc_index_entry_ondisk), sizeof(struct rc_index_entry_ondisk), index_sort_hash);
+
+	/* generate fanout */
+	cur = 0x00;
+	for (i = 0; i < buffer.len; i += sizeof(struct rc_index_entry_ondisk)) {
+		struct rc_index_entry_ondisk *entry = (struct rc_index_entry_ondisk *)(buffer.buf + i);
+
+		while (cur <= entry->sha1[0])
+			fanout[cur++] = i + idx_head.ofs_objects;
+	}
+
+	while (cur <= 0xff)
+		fanout[cur++] = idx_head.ofs_objects + buffer.len;
+
+	/* BOOM! */
+	if (write_cache_index(&buffer))
+		return -1;
+
+	munmap(map, size);
+	strbuf_release(&buffer);
+
+	/* idx_map is unloaded without cleanup_cache_slices(), so regardless of previous index existence
+	 * we can still free this up */
+	free(idx_caches);
+
+	return 0;
+}
+
+
+/* add start-commits from each cache slice (uninterestingness will be propagated) */
+void starts_from_slices(struct rev_info *revs, unsigned int flags)
+{
+	struct commit *commit;
+	int i;
+
+	if (!idx_map)
+		init_index();
+	if (!idx_map)
+		return;
+
+	for (i = idx_head.ofs_objects; i < idx_size; i += sizeof(struct rc_index_entry_ondisk)) {
+		struct rc_index_entry *entry = RC_OBTAIN_INDEX_ENTRY(idx_map + i);
+
+		if (!entry->is_start)
+			continue;
+
+		commit = lookup_commit(entry->sha1);
+		if (!commit)
+			continue;
+
+		commit->object.flags |= flags;
+		add_pending_object(revs, &commit->object, 0);
+	}
+
+}
diff --git a/rev-cache.h b/rev-cache.h
new file mode 100644
index 0000000..a76dc53
--- /dev/null
+++ b/rev-cache.h
@@ -0,0 +1,107 @@
+#ifndef REV_CACHE_H
+#define REV_CACHE_H
+
+#define SUPPORTED_REVCACHE_VERSION 		1
+#define SUPPORTED_REVINDEX_VERSION		1
+
+#define RC_PATH_SIZE(x)	(sizeof(uint16_t) * (x))
+
+#define RC_OBTAIN_OBJECT_ENTRY(p)			from_disked_rc_object_entry((struct rc_object_entry_ondisk *)(p), 0)
+#define RC_OBTAIN_INDEX_ENTRY(p)			from_disked_rc_index_entry((struct rc_index_entry_ondisk *)(p), 0)
+
+#define RC_ACTUAL_OBJECT_ENTRY_SIZE(e)		(sizeof(struct rc_object_entry_ondisk) + RC_PATH_SIZE((e)->merge_nr + (e)->split_nr) + (e)->size_size)
+
+/* single index maps objects to cache files */
+struct rc_index_header {
+	char signature[8]; /* REVINDEX */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	unsigned char cache_nr;
+
+	uint32_t max_date;
+};
+
+struct rc_index_entry_ondisk {
+	unsigned char sha1[20];
+	unsigned char flags;
+	uint32_t pos;
+};
+
+struct rc_index_entry {
+	unsigned char *sha1;
+	unsigned is_start:1;
+	unsigned cache_index:7;
+	uint32_t pos;
+};
+
+
+/* structure for actual cache file */
+struct rc_slice_header {
+	char signature[8]; /* REVCACHE */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	uint16_t path_nr;
+	uint32_t size;
+
+	unsigned char sha1[20];
+};
+
+struct rc_object_entry_ondisk {
+	unsigned char flags;
+	unsigned char sha1[20];
+
+	unsigned char merge_nr;
+	unsigned char split_nr;
+	unsigned char sizes;
+
+	uint32_t date;
+	uint16_t path;
+};
+
+struct rc_object_entry {
+	unsigned type:3;
+	unsigned is_end:1;
+	unsigned is_start:1;
+	unsigned uninteresting:1;
+	unsigned include:1;
+	unsigned flag:1; /* unused */
+	unsigned char *sha1; /* 20 byte */
+
+	unsigned char merge_nr; /* : 7 */
+	unsigned char split_nr; /* : 7 */
+	unsigned size_size:3;
+	unsigned padding:5;
+
+	uint32_t date;
+	uint16_t path;
+
+	/* merge paths */
+	/* split paths */
+	/* size */
+};
+
+struct rc_index_entry *from_disked_rc_index_entry(struct rc_index_entry_ondisk *src, struct rc_index_entry *dst);
+struct rc_index_entry_ondisk *to_disked_rc_index_entry(struct rc_index_entry *src, struct rc_index_entry_ondisk *dst);
+struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondisk *src, struct rc_object_entry *dst);
+struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry *src, struct rc_object_entry_ondisk *dst);
+
+extern unsigned char *get_cache_slice(struct commit *commit);
+extern int traverse_cache_slice(struct rev_info *revs,
+	unsigned char *cache_sha1, struct commit *commit,
+	unsigned long *date_so_far, int *slop_so_far,
+	struct commit_list ***queue, struct commit_list **work);
+
+extern void init_rev_cache_info(struct rev_cache_info *rci);
+extern int make_cache_slice(struct rev_cache_info *rci,
+	struct rev_info *revs, struct commit_list **starts, struct commit_list **ends,
+	unsigned char *cache_sha1);
+extern int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
+	int fd, unsigned int size);
+
+extern void starts_from_slices(struct rev_info *revs, unsigned int flags);
+
+#endif
diff --git a/revision.c b/revision.c
index 35eca4a..c7fd35f 100644
--- a/revision.c
+++ b/revision.c
@@ -432,7 +432,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
  	commit->object.flags |= TREESAME;
  }

-static void insert_by_date_cached(struct commit *p, struct commit_list **head,
+void insert_by_date_cached(struct commit *p, struct commit_list **head,
  		    struct commit_list *cached_base, struct commit_list **cache)
  {
  	struct commit_list *new_entry;
diff --git a/revision.h b/revision.h
index 9d0dddb..7db4b9e 100644
--- a/revision.h
+++ b/revision.h
@@ -13,7 +13,8 @@
  #define CHILD_SHOWN	(1u<<6)
  #define ADDED		(1u<<7)	/* Parents already parsed and added? */
  #define SYMMETRIC_LEFT	(1u<<8)
-#define ALL_REV_FLAGS	((1u<<9)-1)
+#define FACE_VALUE	(1u<<9)
+#define ALL_REV_FLAGS	((1u<<10)-1)

  #define DECORATE_SHORT_REFS	1
  #define DECORATE_FULL_REFS	2
@@ -21,6 +22,19 @@
  struct rev_info;
  struct log_info;

+struct rev_cache_info {
+	/* generation flags */
+	unsigned objects : 1,
+		legs : 1,
+		make_index : 1;
+
+	/* traversal flags */
+	unsigned add_to_pending : 1;
+
+	/* fuse options */
+	unsigned int ignore_size;
+};
+
  struct rev_info {
  	/* Starting list */
  	struct commit_list *commits;
@@ -76,6 +90,10 @@ struct rev_info {
  			dense_combined_merges:1,
  			always_show_header:1;

+	/* rev-cache flags */
+	unsigned int for_pack:1,
+		dont_cache_me:1;
+
  	/* Format info */
  	unsigned int	shown_one:1,
  			show_merge:1,
@@ -119,6 +137,9 @@ struct rev_info {
  	struct reflog_walk_info *reflog_info;
  	struct decoration children;
  	struct decoration merge_simplification;
+
+	/* caching info, used ONLY by traverse_cache_slice */
+	struct rev_cache_info rev_cache_info;
  };

  #define REV_TREE_SAME		0
@@ -171,4 +192,7 @@ enum commit_action {
  extern enum commit_action get_commit_action(struct rev_info *revs, struct commit *commit);
  extern enum commit_action simplify_commit(struct rev_info *revs, struct commit *commit);

+extern void insert_by_date_cached(struct commit *p, struct commit_list **head,
+		    struct commit_list *cached_base, struct commit_list **cache);
+
  #endif
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
new file mode 100755
index 0000000..f59f568
--- /dev/null
+++ b/t/t6017-rev-cache-list.sh
@@ -0,0 +1,106 @@
+#!/bin/sh
+
+test_description='git rev-cache tests'
+. ./test-lib.sh
+
+test_cmp_sorted() {
+	grep -io "[a-f0-9]*" $1 | sort >.tmpfile1 &&
+	grep -io "[a-f0-9]*" $2 | sort >.tmpfile2 &&
+	test_cmp .tmpfile1 .tmpfile2
+}
+
+# we want a totally wacked out branch structure...
+# we need branching and merging of sizes up through 3, tree
+# addition/deletion, and enough branching to exercise path
+# reuse
+test_expect_success 'init repo' '
+	echo bla >file &&
+	git add . &&
+	git commit -m "bla" &&
+
+	git branch b1 &&
+	git checkout b1 &&
+	echo blu >file2 &&
+	mkdir d1 &&
+	echo bang >d1/filed1 &&
+	git add . &&
+	git commit -m "blu" &&
+
+	git checkout master &&
+	git branch b2 &&
+	git checkout b2 &&
+	echo kaplaa >>file &&
+	git commit -a -m "kaplaa" &&
+
+	git checkout master &&
+	mkdir smoke &&
+	echo omg >smoke/bong &&
+	git add . &&
+	git commit -m "omg" &&
+
+	git branch b4 &&
+	git checkout b4 &&
+	echo shazam >file8 &&
+	git add . &&
+	git commit -m "shazam" &&
+	git merge -m "merge b2" b2 &&
+
+	echo bam >smoke/pipe &&
+	git add . &&
+	git commit -m "bam" &&
+
+	git checkout master &&
+	echo pow >file7 &&
+	git add . &&
+	git commit -m "pow" &&
+	git merge -m "merge b4" b4 &&
+
+	git checkout b1 &&
+	echo stuff >d1/filed1 &&
+	git commit -a -m "stuff" &&
+
+	git branch b11 &&
+	git checkout b11 &&
+	echo wazzup >file3 &&
+	git add file3 &&
+	git commit -m "wazzup" &&
+
+	git checkout b1 &&
+	mkdir d1/d2 &&
+	echo lol >d1/d2/filed2 &&
+	git add . &&
+	git commit -m "lol" &&
+
+	git checkout master &&
+	git merge -m "triple merge" b1 b11 &&
+	git rm -r d1 &&
+	git commit -a -m "oh noes"
+'
+
+git-rev-list HEAD --not HEAD~3 >proper_commit_list_limited
+git-rev-list HEAD >proper_commit_list
+git-rev-list HEAD --objects >proper_object_list
+
+test_expect_success 'make cache slice' '
+	git-rev-cache add HEAD 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+test_expect_success 'remake cache slice' '
+	git-rev-cache add HEAD 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+#check core mechanics and rev-list hook for commits
+test_expect_success 'test rev-caches walker directly (limited)' '
+	git-rev-cache walk HEAD --not HEAD~3 >list &&
+	test_cmp_sorted list proper_commit_list_limited
+'
+
+test_expect_success 'test rev-caches walker directly (unlimited)' '
+	git-rev-cache walk HEAD >list &&
+	test_cmp_sorted list proper_commit_list
+'
+
+test_done
+
-- 
tg: (a6dbf88..) t/revcache/basic (depends on: master)


* Re: [PATCH 3/6 (v4)] support for non-commit object caching in rev-cache
From: Nick Edelen @ 2009-10-02 22:12 UTC (permalink / raw)
  To: Nick Edelen, Junio C Hamano, Nicolas Pitre, Johannes Schindelin,
	Sam Vilain
In-Reply-To: <op.uyuwkrfntdk399@sirnot.private>

Summarized, this third patch contains:
  - support for non-commit object caching
  - expansion of porcelain to accommodate non-commit objects
  - appropriate tests

Objects are stored relative to the commit in which they were introduced --
commits are 'diffed' against their parents.  This will eliminate the need for
tree recursion in cached commits (significantly reducing I/O), and potentially
be useful to external applications.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
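An illustrative aside, not part of the patch: the "objects introduced by a
commit" described above are computed in add_unique_objects() by diffing the
commit's tree against each parent and intersecting the sorted results.  The
sketch below shows only that intersection step on fixed-width records (a
20-byte hash plus one is-tree byte).  The names intersect_records, HASH_LEN
and REC_LEN are hypothetical; the patch itself works on strbufs and uses
hashcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20             /* SHA-1 length, as used by the patch */
#define REC_LEN  (HASH_LEN + 1) /* hash followed by one is-tree byte */

/*
 * Hypothetical helper: keep in `a` only those records whose hash also
 * appears in `b`.  Both buffers must already be sorted by hash.
 * Returns the new length of `a` in bytes.
 */
static size_t intersect_records(unsigned char *a, size_t a_len,
				const unsigned char *b, size_t b_len)
{
	size_t i, j = 0, next = 0;

	assert(a_len % REC_LEN == 0 && b_len % REC_LEN == 0);

	for (i = 0; i < a_len; i += REC_LEN) {
		/* advance in b until its hash is >= the current hash in a */
		while (j < b_len && memcmp(b + j, a + i, HASH_LEN) < 0)
			j += REC_LEN;

		/* not present in b: drop this record from a */
		if (j >= b_len || memcmp(b + j, a + i, HASH_LEN) != 0)
			continue;

		/* present in both: compact it towards the front of a */
		if (next != i)
			memmove(a + next, a + i, REC_LEN);
		next += REC_LEN;
	}

	return next;
}
```

In the patch one such list is built per parent (via diff_tree_sha1() followed
by qsort()), and a commit with no parents falls back to dumping its whole
tree.  Both lists being sorted is what lets the intersection run in a single
linear pass.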
  rev-cache.c               |  202 ++++++++++++++++++++++++++++++++++++++++++++-
  t/t6017-rev-cache-list.sh |    6 ++
  2 files changed, 206 insertions(+), 2 deletions(-)

diff --git a/rev-cache.c b/rev-cache.c
index 8951cdf..ef6b58a 100644
--- a/rev-cache.c
+++ b/rev-cache.c
@@ -259,6 +259,32 @@ unsigned char *get_cache_slice(struct commit *commit)

  /* traversal */

+static void handle_noncommit(struct rev_info *revs, unsigned char *ptr, struct rc_object_entry *entry)
+{
+	struct object *obj = 0;
+
+	switch (entry->type) {
+	case OBJ_TREE:
+		if (revs->tree_objects)
+			obj = (struct object *)lookup_tree(entry->sha1);
+		break;
+	case OBJ_BLOB:
+		if (revs->blob_objects)
+			obj = (struct object *)lookup_blob(entry->sha1);
+		break;
+	case OBJ_TAG:
+		if (revs->tag_objects)
+			obj = (struct object *)lookup_tag(entry->sha1);
+		break;
+	}
+
+	if (!obj)
+		return;
+
+	obj->flags |= FACE_VALUE;
+	add_pending_object(revs, obj, "");
+}
+
  static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work)
  {
  	struct rc_index_entry *iep;
@@ -347,9 +373,12 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
  		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);

  		/* add extra objects if necessary */
-		if (entry->type != OBJ_COMMIT)
+		if (entry->type != OBJ_COMMIT) {
+			if (consume_children)
+				handle_noncommit(revs, map + index, entry);
+
  			continue;
-		else
+		} else
  			consume_children = 0;

  		if (path >= total_path_nr)
@@ -777,6 +806,171 @@ static void add_object_entry(const unsigned char *sha1, int type, struct rc_obje

  }

+/* returns non-zero to continue parsing, 0 to skip */
+typedef int (*dump_tree_fn)(const unsigned char *, const char *, unsigned int); /* sha1, path, mode */
+
+/* we need to walk the trees by hash, so unfortunately we can't use traverse_trees in tree-walk.c */
+static int dump_tree(struct tree *tree, dump_tree_fn fn)
+{
+	struct tree_desc desc;
+	struct name_entry entry;
+	struct tree *subtree;
+	int r;
+
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(&desc, tree->buffer, tree->size);
+	while (tree_entry(&desc, &entry)) {
+		switch (fn(entry.sha1, entry.path, entry.mode)) {
+		case 0:
+			goto continue_loop;
+		default:
+			break;
+		}
+
+		if (S_ISDIR(entry.mode)) {
+			subtree = lookup_tree(entry.sha1);
+			if (!subtree)
+				return -2;
+
+			if ((r = dump_tree(subtree, fn)) < 0)
+				return r;
+		}
+
+continue_loop:
+		continue;
+	}
+
+	return 0;
+}
+
+static int dump_tree_callback(const unsigned char *sha1, const char *path, unsigned int mode)
+{
+	unsigned char data[21];
+
+	hashcpy(data, sha1);
+	data[20] = !!S_ISDIR(mode);
+
+	strbuf_add(acc_buffer, data, 21);
+
+	return 1;
+}
+
+static void tree_addremove(struct diff_options *options,
+	int whatnow, unsigned mode,
+	const unsigned char *sha1,
+	const char *concatpath)
+{
+	unsigned char data[21];
+
+	if (whatnow != '+')
+		return;
+
+	hashcpy(data, sha1);
+	data[20] = !!S_ISDIR(mode);
+
+	strbuf_add(acc_buffer, data, 21);
+}
+
+static void tree_change(struct diff_options *options,
+	unsigned old_mode, unsigned new_mode,
+	const unsigned char *old_sha1,
+	const unsigned char *new_sha1,
+	const char *concatpath)
+{
+	unsigned char data[21];
+
+	if (!hashcmp(old_sha1, new_sha1))
+		return;
+
+	hashcpy(data, new_sha1);
+	data[20] = !!S_ISDIR(new_mode);
+
+	strbuf_add(acc_buffer, data, 21);
+}
+
+static int sort_type_hash(const void *a, const void *b)
+{
+	const unsigned char *sa = (const unsigned char *)a,
+		*sb = (const unsigned char *)b;
+
+	if (sa[20] == sb[20])
+		return hashcmp(sa, sb);
+
+	return sa[20] > sb[20] ? -1 : 1;
+}
+
+static int add_unique_objects(struct commit *commit)
+{
+	struct commit_list *list;
+	struct strbuf os, ost, *orig_buf;
+	struct diff_options opts;
+	int i, j, next;
+	char is_first = 1;
+
+	strbuf_init(&os, 0);
+	strbuf_init(&ost, 0);
+	orig_buf = acc_buffer;
+
+	diff_setup(&opts);
+	DIFF_OPT_SET(&opts, RECURSIVE);
+	DIFF_OPT_SET(&opts, TREE_IN_RECURSIVE);
+	opts.change = tree_change;
+	opts.add_remove = tree_addremove;
+
+	/* this is only called for non-ends (ie. all parents interesting) */
+	for (list = commit->parents; list; list = list->next) {
+		if (is_first)
+			acc_buffer = &os;
+		else
+			acc_buffer = &ost;
+
+		strbuf_setlen(acc_buffer, 0);
+		diff_tree_sha1(list->item->tree->object.sha1, commit->tree->object.sha1, "", &opts);
+		qsort(acc_buffer->buf, acc_buffer->len / 21, 21, (int (*)(const void *, const void *))hashcmp);
+
+		/* take intersection */
+		if (!is_first) {
+			for (next = i = j = 0; i < os.len; i += 21) {
+				while (j < ost.len && hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)) < 0)
+					j += 21;
+
+				if (j >= ost.len || hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)))
+					continue;
+
+				if (next != i)
+					memcpy(os.buf + next, os.buf + i, 21);
+				next += 21;
+			}
+
+			if (next != i)
+				strbuf_setlen(&os, next);
+		} else
+			is_first = 0;
+	}
+
+	if (is_first) {
+		acc_buffer = &os;
+		dump_tree(commit->tree, dump_tree_callback);
+	}
+
+	if (os.len)
+		qsort(os.buf, os.len / 21, 21, sort_type_hash);
+
+	acc_buffer = orig_buf;
+	for (i = 0; i < os.len; i += 21)
+		add_object_entry((unsigned char *)(os.buf + i), os.buf[i + 20] ? OBJ_TREE : OBJ_BLOB, 0, 0, 0);
+
+	/* last but not least, the main tree */
+	add_object_entry(commit->tree->object.sha1, OBJ_TREE, 0, 0, 0);
+
+	strbuf_release(&ost);
+	strbuf_release(&os);
+
+	return i / 21 + 1;
+}
+
  static void init_revcache_directory(void)
  {
  	struct stat fi;
@@ -902,6 +1096,10 @@ int make_cache_slice(struct rev_cache_info *rci,
  		add_object_entry(0, 0, &object, &merge_paths, &split_paths);
  		object_nr++;

+		/* add all unique children for this commit */
+		if (rci->objects && !object.is_end)
+			object_nr += add_unique_objects(commit);
+
  		/* print every ~1MB or so */
  		if (buffer.len > 1000000) {
  			write_in_full(fd, buffer.buf, buffer.len);
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
index f59f568..dc0fc07 100755
--- a/t/t6017-rev-cache-list.sh
+++ b/t/t6017-rev-cache-list.sh
@@ -102,5 +102,11 @@ test_expect_success 'test rev-caches walker directly (unlimited)' '
  	test_cmp_sorted list proper_commit_list
  '

+#do the same for objects
+test_expect_success 'test rev-caches walker with objects' '
+	git-rev-cache walk --objects HEAD >list &&
+	test_cmp_sorted list proper_object_list
+'
+
  test_done

-- 
tg: (ec20331..) t/revcache/objects (depends on: t/revcache/basic)


