* Re: [PATCH] strbuf_readlink semantics update.
From: Junio C Hamano @ 2008-12-25 7:23 UTC (permalink / raw)
To: René Scharfe; +Cc: Pierre Habouzit, Linus Torvalds, git
In-Reply-To: <4952532F.5050704@lsrfire.ath.cx>
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> Pierre Habouzit schrieb:
>> On Tue, Dec 23, 2008 at 06:16:01PM +0000, Linus Torvalds wrote:
>>>
>>> On Tue, 23 Dec 2008, Pierre Habouzit wrote:
>>>> when readlink fails, the strbuf shall not be destroyed. It's not how
>>>> read_file_or_gitlink works for example.
>>> I disagree.
>>>
>>> This patch just makes things worse. Just leave the "strbuf_release()" in
>>> _one_ place.
> ...
> The "append or do nothing" rule is broken by strbuf_getline(), but I agree
> to your reasoning. How about refining this rule a bit to "do your thing
> and roll back changes if an error occurs"? I think it's not worth to undo
> allocation extensions, but making reverting first time allocations seems
> like a good idea. Something like this?
I think this is much better than Pierre's. Pierre's "if it is called
strbuf_*, it should always append" is a good uniformity to have in an API,
but making the caller suffer for clean-up is going backwards. The reason
we use strbuf when we can is so that the callers do not have to worry
about memory allocation issues too much.
^ permalink raw reply
* Re: [PATCH] handle_remote_ls_ctx can parsing href starting at http://
From: Junio C Hamano @ 2008-12-25 7:04 UTC (permalink / raw)
To: Kirill A. Korinskiy; +Cc: git
In-Reply-To: <1230021075-10113-1-git-send-email-catap@catap.ru>
"Kirill A. Korinskiy" <catap@catap.ru> writes:
> The program call remote_ls() to get remote objects over http;
> handle_remote_ls_ctx() is used to parse it's response to populated
> "struct remote_ls_ctx" that is returned from remote_ls().
>
> The handle_remote_ls_ctx() function assumed that the server will
> returned local path in href field, but RFC 4918 demand of support full
> URI (http://localhost/repo.git for example).
>
> This resulted in push failure (git-http-push ask server
> PROPFIND /repo.git/alhost:8080/repo.git/refs/) when a server returned
> full URI.
Thanks.
Do you mean PROPFIND was made to that garbage with :8080 in it when the
server returned a full URI http://localhost/repo.git as in the example in
the previous paragraph, or are you using a different example here?
I am contemplating of munging your commit log message like this...
commit e1f33efe07b9a520505fccd71bea1292fc9448dd
Author: Kirill A. Korinskiy <catap@catap.ru>
Date: Tue Dec 23 11:31:15 2008 +0300
http-push: support full URI in handle_remote_ls_ctx()
The program calls remote_ls() to get list of files from the server over
HTTP; handle_remote_ls_ctx() is used to parse its response to populate
"struct remote_ls_ctx" that is returned from remote_ls().
The handle_remote_ls_ctx() function assumed that the server returns a
local path in href field, but RFC 4918 (14.7) demand of support full URI
(e.g. "http://localhost:8080/repo.git").
This resulted in push failure (e.g. git-http-push issues a PROPFIND
request to "/repo.git/alhost:8080/repo.git/refs/" to the server).
Signed-off-by: Kirill A. Korinskiy <catap@catap.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
^ permalink raw reply
* A note from the maintainer
From: Junio C Hamano @ 2008-12-25 6:48 UTC (permalink / raw)
To: git
Welcome to git community.
This message talks about how git.git is managed, and how you can work
with it.
* IRC and Mailing list
Many active members of development community hang around on #git
IRC channel on Freenode. Its log is available at:
http://colabti.org/irclogger/irclogger_log/git
The development however is primarily done on the git mailing list
(git@vger.kernel.org). If you have patches, please send them to the
list, following Documentation/SubmittingPatches.
I usually try to read all patches posted to the list, and follow
almost all the discussions on the list, unless the topic is about an
obscure corner that I do not personally use. But I am obviously not
perfect. If you sent a patch that you did not hear from anybody for
three days, that is a very good indication that it was dropped on the
floor --- please do not hesitate to remind me.
The list archive is available at a few public sites as well:
http://news.gmane.org/gmane.comp.version-control.git/
http://marc.theaimsgroup.com/?l=git
http://www.spinics.net/lists/git/
and some people seem to prefer to read it over NNTP:
nntp://news.gmane.org/gmane.comp.version-control.git
When you point at a message in a mailing list archive, using
gmane is often the easiest to follow by readers, like this:
http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217
as it also allows people who subscribe to the mailing list as
gmane newsgroup to "jump to" the article.
* Repositories, branches and documentation.
My public git.git repository is at:
git://git.kernel.org/pub/scm/git/git.git/
Immediately after I publish to the primary repository at kernel.org, I
also push into an alternate here:
git://repo.or.cz/alt-git.git/
Impatient people might have better luck with the latter one.
Their gitweb interfaces are found at:
http://git.kernel.org/?p=git/git.git
http://repo.or.cz/w/alt-git.git
There are three branches in git.git repository that are not
about the source tree of git: "todo", "html" and "man". The
first one was meant to contain TODO list for me, but I am not
good at maintaining such a list and it is not as often updated as
it could/should be. It also contains some helper scripts I use
to maintain git.
The "html" and "man" are autogenerated documentation from the
tip of the "master" branch; the tip of "html" is extracted to be
visible at kernel.org at:
http://www.kernel.org/pub/software/scm/git/docs/
The above URL is the top-level documentation page, and it has
links to documentation of older releases.
The script to maintain these two documentation branches are
found in "todo" branch as dodoc.sh, if you are interested. It
is a good demonstration of how to use an update hook to automate
a task.
There are four branches in git.git repository that track the
source tree of git: "master", "maint", "next", and "pu". I may
add more maintenance branches (e.g. "maint-1.5.4") if we have
hugely backward incompatible feature updates in the future to keep
an older release alive; I may not, but the distributed nature of
git means any volunteer can run a stable-tree like that herself.
The "master" branch is meant to contain what are very well
tested and ready to be used in a production setting. There
could occasionally be minor breakages or brown paper bag bugs
but they are not expected to be anything major, and more
importantly quickly and trivially fixable. Every now and
then, a "feature release" is cut from the tip of this branch and
they typically are named with three dotted decimal digits. The
last such release was 1.6.1 done on Dec 24th 2008. You
can expect that the tip of the "master" branch is always more
stable than any of the released versions.
Whenever a feature release is made, "maint" branch is forked off
from "master" at that point. Obvious, safe and urgent fixes
after a feature release are applied to this branch and
maintenance releases are cut from it. The maintenance releases
are named with four dotted decimal, named after the feature
release they are updates to; the last such release was 1.6.0.6.
New features never go to this branch. This branch is also
merged into "master" to propagate the fixes forward.
A trivial and safe enhancement goes directly on top of "master".
A new development, either initiated by myself or more often by
somebody who found his or her own itch to scratch, does not
usually happen on "master", however. Instead, a separate topic
branch is forked from the tip of "master", and it first is
tested in isolation; I may make minimum fixups at this point.
Usually there are a handful such topic branches that are running
ahead of "master" in git.git repository. I do not publish the
tip of these branches in my public repository, however, partly
to keep the number of branches that downstream developers need
to worry about low, and primarily because I am lazy.
The quality of topic branches are judged primarily by the mailing list
discussions. Some of them start out as "good idea but obviously is
broken in some areas (e.g. breaks the existing testsuite)" and then
with some more work (either by the original contributor's effort or
help from other people on the list) becomes "more or less done and can
now be tested by wider audience". Luckily, most of them start out in
the latter, better shape.
The "next" branch is to merge and test topic branches in the
latter category. In general, the branch always contains the tip
of "master". It might not be quite rock-solid production ready,
but is expected to work more or less without major breakage. I
usually use "next" version of git for my own work, so it cannot
be _that_ broken to prevent me from pushing the changes out.
The "next" branch is where new and exciting things take place.
The two branches "master" and "maint" are never rewound, and
"next" usually will not be either (this automatically means the
topics that have been merged into "next" are usually not
rebased, and you can find the tip of topic branches you are
interested in from the output of "git log next"). You should be
able to safely track them.
After a feature release is made from "master", however, "next"
will be rebuilt from the tip of "master" using the surviving
topics. The commit that replaces the tip of the "next" will
have the identical tree, but it will have different ancestry
from the tip of "master". An announcement will be made to warn
people about such a rebasing.
The "pu" (proposed updates) branch bundles all the remainder of
topic branches. The "pu" branch, and topic branches that are
only in "pu", are subject to rebasing in general. By the above
definition of how "next" works, you can tell that this branch
will contain quite experimental and obviously broken stuff.
When a topic that was in "pu" proves to be in testable shape, it
graduates to "next". I do this with:
git checkout next
git merge that-topic-branch
Sometimes, an idea that looked promising turns out to be not so
good and the topic can be dropped from "pu" in such a case.
A topic that is in "next" is expected to be tweaked and fixed to
perfection before it is merged to "master" (that's why "master"
can be expected to stay very stable). Similarly to the above, I
do it with this:
git checkout master
git merge that-topic-branch
git branch -d that-topic-branch
Note that being in "next" is not a guarantee to appear in the
next release (being in "master" is such a guarantee, unless it
is later found seriously broken and reverted), or even in any
future release. There even were cases that topics needed
reverting a few commits in them before graduating to "master",
or a topic that already was in "next" were entirely reverted
from "next" because fatal flaws were found in them later.
Starting from v1.5.0, "master" and "maint" have release notes
for the next release in Documentation/RelNotes-* files, so that
I do not have to run around summarizing what happened just
before the release.
* Other people's trees, trusted lieutenants and credits.
Documentation/SubmittingPatches outlines who your changes should
be sent to. As described in contrib/README, I would delegate
fixes and enhancements in contrib/ area to primary contributors
of them.
Although the following are included in git.git repository, they
have their own authoritative repository and maintainers:
- git-gui/ comes from Shawn Pearce's git-gui project:
git://repo.or.cz/git-gui.git
- gitk-git/ comes from Paul Mackerras's gitk project:
git://git.kernel.org/pub/scm/gitk/gitk.git
I would like to thank everybody who helped to raise git into the
current shape. Especially I would like to thank the git list
regulars whose help I have relied on and expect to continue
relying on heavily:
- Linus on general design issues.
- Linus, Shawn Pearce, Johannes Schindelin, Nicolas Pitre,
René Scharfe and Jeff King on general implementation issues.
- Shawn and Nicolas Pitre on pack issues.
- Martin Langhoff and Frank Lichtenheld on cvsserver and cvsimport.
- Paul Mackerras on gitk.
- Eric Wong on git-svn.
- Simon Hausmann on git-p4.
- Jakub Narebski, Petr Baudis, and Luben Tuikov on gitweb.
- J. Bruce Fields on documentaton issues.
- Johannes Schindelin, Johannes Sixt and others for their effort
to move things forward on the Windows front. Most of the fruits
from their porting efforts have been merged to the mainline git.git
repository in 1.6.0 release.
- People on non-Linux platforms for keeping their eyes on
portability; especially, Randal Schwartz, Theodore Ts'o,
Jason Riedy, Thomas Glanzmann, but countless others as well.
* This document
The latest copy of this document is found in git.git repository,
on 'todo' branch, as MaintNotes.
I also keep a copy of it at http://members.cox.net/junkio/
^ permalink raw reply
* [ANNOUNCE] GIT 1.6.1
From: Junio C Hamano @ 2008-12-25 6:36 UTC (permalink / raw)
To: git; +Cc: linux-kernel
The latest feature release GIT 1.6.1 is available at the usual
places:
http://www.kernel.org/pub/software/scm/git/
git-1.6.1.tar.{gz,bz2} (source tarball)
git-htmldocs-1.6.1.tar.{gz,bz2} (preformatted docs)
git-manpages-1.6.1.tar.{gz,bz2} (preformatted docs)
The RPM binary packages for a few architectures are also found in the
vicinity.
RPMS/$arch/git-*-1.6.1-1.fc9.$arch.rpm (RPM)
----------------------------------------------------------------
GIT v1.6.1 Release Notes
========================
Updates since v1.6.0
--------------------
When some commands (e.g. "git log", "git diff") spawn pager internally, we
used to make the pager the parent process of the git command that produces
output. This meant that the exit status of the whole thing comes from the
pager, not the underlying git command. We swapped the order of the
processes around and you will see the exit code from the command from now
on.
(subsystems)
* gitk can call out to git-gui to view "git blame" output; git-gui in turn
can run gitk from its blame view.
* Various git-gui updates including updated translations.
* Various gitweb updates from repo.or.cz installation.
* Updates to emacs bindings.
(portability)
* A few test scripts used nonportable "grep" that did not work well on
some platforms, e.g. Solaris.
* Sample pre-auto-gc script has OS X support.
* Makefile has support for (ancient) FreeBSD 4.9.
(performance)
* Many operations that are lstat(3) heavy can be told to pre-execute
necessary lstat(3) in parallel before their main operations, which
potentially gives much improved performance for cold-cache cases or in
environments with weak metadata caching (e.g. NFS).
* The underlying diff machinery to produce textual output has been
optimized, which would result in faster "git blame" processing.
* Most of the test scripts (but not the ones that try to run servers)
can be run in parallel.
* Bash completion of refnames in a repository with massive number of
refs has been optimized.
* Cygwin port uses native stat/lstat implementations when applicable,
which leads to improved performance.
* "git push" pays attention to alternate repositories to avoid sending
unnecessary objects.
* "git svn" can rebuild an out-of-date rev_map file.
(usability, bells and whistles)
* When you mistype a command name, git helpfully suggests what it guesses
you might have meant to say. help.autocorrect configuration can be set
to a non-zero value to accept the suggestion when git can uniquely
guess.
* The packfile machinery hopefully is more robust when dealing with
corrupt packs if redundant objects involved in the corruption are
available elsewhere.
* "git add -N path..." adds the named paths as an empty blob, so that
subsequent "git diff" will show a diff as if they are creation events.
* "git add" gained a built-in synonym for people who want to say "stage
changes" instead of "add contents to the staging area" which amounts
to the same thing.
* "git apply" learned --include=paths option, similar to the existing
--exclude=paths option.
* "git bisect" is careful about a user mistake and suggests testing of
merge base first when good is not a strict ancestor of bad.
* "git bisect skip" can take a range of commits.
* "git blame" re-encodes the commit metainfo to UTF-8 from i18n.commitEncoding
by default.
* "git check-attr --stdin" can check attributes for multiple paths.
* "git checkout --track origin/hack" used to be a syntax error. It now
DWIMs to create a corresponding local branch "hack", i.e. acts as if you
said "git checkout --track -b hack origin/hack".
* "git checkout --ours/--theirs" can be used to check out one side of a
conflicting merge during conflict resolution.
* "git checkout -m" can be used to recreate the initial conflicted state
during conflict resolution.
* "git cherry-pick" can also utilize rerere for conflict resolution.
* "git clone" learned to be verbose with -v
* "git commit --author=$name" can look up author name from existing
commits.
* output from "git commit" has been reworded in a more concise and yet
more informative way.
* "git count-objects" reports the on-disk footprint for packfiles and
their corresponding idx files.
* "git daemon" learned --max-connections=<count> option.
* "git daemon" exports REMOTE_ADDR to record client address, so that
spawned programs can act differently on it.
* "git describe --tags" favours closer lightweight tags than farther
annotated tags now.
* "git diff" learned to mimic --suppress-blank-empty from GNU diff via a
configuration option.
* "git diff" learned to put more sensible hunk headers for Python,
HTML and ObjC contents.
* "git diff" learned to vary the a/ vs b/ prefix depending on what are
being compared, controlled by diff.mnemonicprefix configuration.
* "git diff" learned --dirstat-by-file to count changed files, not number
of lines, when summarizing the global picture.
* "git diff" learned "textconv" filters --- a binary or hard-to-read
contents can be munged into human readable form and the difference
between the results of the conversion can be viewed (obviously this
cannot produce a patch that can be applied, so this is disabled in
format-patch among other things).
* "--cached" option to "git diff has an easier to remember synonym "--staged",
to ask "what is the difference between the given commit and the
contents staged in the index?"
* "git for-each-ref" learned "refname:short" token that gives an
unambiguously abbreviated refname.
* Auto-numbering of the subject lines is the default for "git
format-patch" now.
* "git grep" learned to accept -z similar to GNU grep.
* "git help" learned to use GIT_MAN_VIEWER environment variable before
using "man" program.
* "git imap-send" can optionally talk SSL.
* "git index-pack" is more careful against disk corruption while
completing a thin pack.
* "git log --check" and "git log --exit-code" passes their underlying diff
status with their exit status code.
* "git log" learned --simplify-merges, a milder variant of --full-history;
"gitk --simplify-merges" is easier to view than with --full-history.
* "git log" learned "--source" to show what ref each commit was reached
from.
* "git log" also learned "--simplify-by-decoration" to show the
birds-eye-view of the topology of the history.
* "git log --pretty=format:" learned "%d" format element that inserts
names of tags that point at the commit.
* "git merge --squash" and "git merge --no-ff" into an unborn branch are
noticed as user errors.
* "git merge -s $strategy" can use a custom built strategy if you have a
command "git-merge-$strategy" on your $PATH.
* "git pull" (and "git fetch") can be told to operate "-v"erbosely or
"-q"uietly.
* "git push" can be told to reject deletion of refs with receive.denyDeletes
configuration.
* "git rebase" honours pre-rebase hook; use --no-verify to bypass it.
* "git rebase -p" uses interactive rebase machinery now to preserve the merges.
* "git reflog expire branch" can be used in place of "git reflog expire
refs/heads/branch".
* "git remote show $remote" lists remote branches one-per-line now.
* "git send-email" can be given revision range instead of files and
maildirs on the command line, and automatically runs format-patch to
generate patches for the given revision range.
* "git submodule foreach" subcommand allows you to iterate over checked
out submodules.
* "git submodule sync" subcommands allows you to update the origin URL
recorded in submodule directories from the toplevel .gitmodules file.
* "git svn branch" can create new branches on the other end.
* "gitweb" can use more saner PATH_INFO based URL.
(internal)
* "git hash-object" learned to lie about the path being hashed, so that
correct gitattributes processing can be done while hashing contents
stored in a temporary file.
* various callers of git-merge-recursive avoid forking it as an external
process.
* Git class defined in "Git.pm" can be subclasses a bit more easily.
* We used to link GNU regex library as a compatibility layer for some
platforms, but it turns out it is not necessary on most of them.
* Some path handling routines used fixed number of buffers used alternately
but depending on the call depth, this arrangement led to hard to track
bugs. This issue is being addressed.
Fixes since v1.6.0
------------------
All of the fixes in v1.6.0.X maintenance series are included in this
release, unless otherwise noted.
* Porcelains implemented as shell scripts were utterly confused when you
entered to a subdirectory of a work tree from sideways, following a
symbolic link (this may need to be backported to older releases later).
* Tracking symbolic links would work better on filesystems whose lstat()
returns incorrect st_size value for them.
* "git add" and "git update-index" incorrectly allowed adding S/F when S
is a tracked symlink that points at a directory D that has a path F in
it (we still need to fix a similar nonsense when S is a submodule and F
is a path in it).
* "git am" after stopping at a broken patch lost --whitespace, -C, -p and
--3way options given from the command line initially.
* "git diff --stdin" used to take two trees on a line and compared them,
but we dropped support for such a use case long time ago. This has
been resurrected.
* "git filter-branch" failed to rewrite a tag name with slashes in it.
* "git http-push" did not understand URI scheme other than opaquelocktoken
when acquiring a lock from the server (this may need to be backported to
older releases later).
* After "git rebase -p" stopped with conflicts while replaying a merge,
"git rebase --continue" did not work (may need to be backported to older
releases).
* "git revert" records relative to which parent a revert was made when
reverting a merge. Together with new documentation that explains issues
around reverting a merge and merging from the updated branch later, this
hopefully will reduce user confusion (this may need to be backported to
older releases later).
* "git rm --cached" used to allow an empty blob that was added earlier to
be removed without --force, even when the file in the work tree has
since been modified.
* "git push --tags --all $there" failed with generic usage message without
telling saying these two options are incompatible.
* "git log --author/--committer" match used to potentially match the
timestamp part, exposing internal implementation detail. Also these did
not work with --fixed-strings match at all.
* "gitweb" did not mark non-ASCII characters imported from external HTML fragments
correctly.
^ permalink raw reply
* Re: Questions about repo and git submodule
From: Emily Ren @ 2008-12-25 6:28 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Git Mailinglist
In-Reply-To: <20081223152951.GB27865@spearce.org>
Hi Shawn,
Merry Christmas !
Thank you for your guide ! I've created git repository with submodules
with a simple script successfully.
Now I have another question, since my android repo will always sync
up from android.git.kernel.org, my git repository needs to be updated
accordingly. Is there a tool I can use to sync up from repo to git
repositoy ?
Thanks,
Emily
On Tue, Dec 23, 2008 at 11:29 PM, Shawn O. Pearce <spearce@spearce.org> wrote:
> Emily Ren <lingyan.ren@gmail.com> wrote:
>>
>> I have some questions about android repo and git submodule.
>>
>> I created a repo repository with below commands:
>> 1. repo init -u git://android.git.kernel.org/platform/manifest.git
>> 2. repo initialized in /android
>>
>> 1. The android dir is not a git repository,
>
> Correct, it is not a git repository. The repo tool does not use
> git submodules. The top level of a repo client has a ".repo/"
> directory with metadata, not a ".git/" directory. The table of
> contents (the subprojects) is stored in XML files under ".repo/".
>
> <aside>
> I actually fought against the XML format for repo's manifest, but
> others felt it was suitable. And then walked away from the project
> after Android open-sourced its code tree. Leaving me to maintain it.
> I see a file format simplification in the future for repo.
> </aside>
>
>> if other people clone my
>> android code, how does it work?
>
> Sadly this isn't supported correctly. You can't initialize one
> repo client from another, even though you can git clone one git
> repository from any other. Its a bug in repo's design. The data
> under ".repo/projects/" isn't laid out correctly to permit reuse
> of one repo client to initialize another.
>
> Its something I keep meaning to fix, but its going to take some
> real effort.
>
> In the mean time, there is a "--mirror" flag to repo init
> which can be used to clone everything into bare repositories.
> Those bare repositories can be published for others to repo init
> from, though you need to customize the manifest.git:default.xml
> so that the embedded URL refers back to your server and not
> android.git.kernel.org. Yet another thing I want to fix.
>
>> 2. I want to make android dir to be a git repository, is it workable
>> that I create submodule for each subdirectory in another directory? Is
>> there a script for it?
>
> You might be able to do something like this:
>
> cd /android
> git init
> repo forall -c 'cd /android && git submodule add `pwd`'
>
> Also, you might want to consider asking questions related to repo
> on the repo-discuss@googlegroups.com mailing list. There's a lot
> more repo users there than on the git mailing list, and they have
> started to come up with their own "tips n tricks".
>
> --
> Shawn.
>
^ permalink raw reply
* Re: git rev-list with --name-status?
From: Junio C Hamano @ 2008-12-25 3:24 UTC (permalink / raw)
To: skillzero; +Cc: git
In-Reply-To: <2729632a0812241453x4ae50362g4bcd3317e5be0429@mail.gmail.com>
skillzero@gmail.com writes:
> I'd like to also include the output of --name-status so the email
> shows which files were changed by each commit (rather than just a
> summary at the end since our pushes sometimes have a lot of commits in
> them).
>
> git rev-list doesn't seem to support --name-status and git log doesn't
> seem to support --stdin. Is there a better way to do what I want?
The plumbing rev-list never runs diff internally.
Depending on what you want, "git log --stat" or "git log --name-only" or
even "git log --name-status -B -C" may serve you nicely.
"Depending on what you want" is the key phrase that indicates that what
you are asking for would be most likely found in Porcelains, not plumbing.
Even though there is not much reason to _avoid_ using "log" these days,
you could do your own scripting for whatever reason; perhaps you feel like
it would be a more macho thing to do (which isn't), perhaps you want more
customization than options supported by the stock "log" Porcelain gives
you.
In olden days, people scripted around plumbing, partly because the
Porcelains were implemented that way, and partly because the choices the
Porcelains back then gave you was limited than what we have now. Your
script may look like this:
git-rev-list --parents $range |
while read commit parents
do
... do whatever you want with them ...
done
or
git-rev-list --pretty --parents $range |
perl -e '
while (<>) {
if (/^commit /../^$/) {
if (/^commit (\S+)(.*)?/) {
... we have a new commit; flush what
... you accumulated for the previous one.
... and prepare for this commit.
... $1 is the commit, $2 has parents you
... can further split
}
... do "header" things here ...
next;
}
s/^ //;
... do "log" things here ...
}
... flush what you accumulated for the last commit.
'
^ permalink raw reply
* [JGIT PATCH 19/23] Allow more direct access to determine isWritten
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-19-git-send-email-spearce@spearce.org>
This gives the JVM a better chance to inline the isWritten method
when writing a pack file out.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 2 +-
.../spearce/jgit/transport/PackedObjectInfo.java | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index bb889e8..50d06c2 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -932,7 +932,7 @@ boolean isDeltaRepresentation() {
* @return true if object is already written; false otherwise.
*/
boolean isWritten() {
- return getOffset() != 0;
+ return offset != 0;
}
PackedObjectLoader getReuseLoader() {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/PackedObjectInfo.java b/org.spearce.jgit/src/org/spearce/jgit/transport/PackedObjectInfo.java
index f37f421..045b795 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/PackedObjectInfo.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/PackedObjectInfo.java
@@ -49,7 +49,7 @@
* objects from the pack. This extension of ObjectId includes the offset.
*/
public class PackedObjectInfo extends ObjectId {
- private long offset;
+ protected long offset;
private int crc;
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 16/23] Defer parsing of the ObjectId while walking a PackIndex Iterator
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-16-git-send-email-spearce@spearce.org>
This is a slight performance improvement for the PackReverseIndex
construction. The only code that really cares about the ObjectId
from a PackIndex entry is test cases. By avoiding construction
we can save some CPU time during the several passes we do to make
the reverse index data structure within PackWriter.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../tst/org/spearce/jgit/lib/PackIndexTest.java | 4 +-
.../tst/org/spearce/jgit/lib/PackWriterTest.java | 2 +-
.../src/org/spearce/jgit/lib/PackIndex.java | 48 +++++++++++--------
.../src/org/spearce/jgit/lib/PackIndexV1.java | 20 +++++---
.../src/org/spearce/jgit/lib/PackIndexV2.java | 27 +++++++----
5 files changed, 61 insertions(+), 40 deletions(-)
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackIndexTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackIndexTest.java
index 8ab380e..7163718 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackIndexTest.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackIndexTest.java
@@ -134,10 +134,10 @@ assertEquals("c59759f143fb1fe21c197981df75a7ee00290799", iter.next()
*/
public void testCompareEntriesOffsetsWithFindOffsets() {
for (MutableEntry me : smallIdx) {
- assertEquals(smallIdx.findOffset(me), me.getOffset());
+ assertEquals(smallIdx.findOffset(me.toObjectId()), me.getOffset());
}
for (MutableEntry me : denseIdx) {
- assertEquals(denseIdx.findOffset(me), me.getOffset());
+ assertEquals(denseIdx.findOffset(me.toObjectId()), me.getOffset());
}
}
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java
index 3c02955..ec138a0 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java
@@ -487,7 +487,7 @@ public int compare(MutableEntry o1, MutableEntry o2) {
int i = 0;
for (MutableEntry me : entries) {
- assertEquals(objectsOrder[i++], me.copy());
+ assertEquals(objectsOrder[i++].toObjectId(), me.toObjectId());
}
}
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndex.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndex.java
index 5fb41b1..5b7a62d 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndex.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndex.java
@@ -239,15 +239,10 @@ abstract long findCRC32(AnyObjectId objId) throws MissingObjectException,
* in pack (both mutable).
*
*/
- public static class MutableEntry extends MutableObjectId {
- private long offset;
+ public static class MutableEntry {
+ protected final MutableObjectId idBuffer = new MutableObjectId();
- /**
- * Empty constructor. Object fields should be filled in later.
- */
- public MutableEntry() {
- super();
- }
+ long offset;
/**
* Returns offset for this index object entry
@@ -258,30 +253,43 @@ public long getOffset() {
return offset;
}
- void setOffset(long offset) {
- this.offset = offset;
+ /** @return hex string describing the object id of this entry. */
+ public String name() {
+ ensureId();
+ return idBuffer.name();
}
- private MutableEntry(MutableEntry src) {
- super(src);
- this.offset = src.offset;
+ /** @return a copy of the object id. */
+ public ObjectId toObjectId() {
+ ensureId();
+ return idBuffer.toObjectId();
}
- /**
- * Returns mutable copy of this mutable entry.
- *
- * @return copy of this mutable entry
- */
+ /** @return a complete copy of this entry, that won't modify */
public MutableEntry cloneEntry() {
- return new MutableEntry(this);
+ final MutableEntry r = new MutableEntry();
+ ensureId();
+ r.idBuffer.w1 = idBuffer.w1;
+ r.idBuffer.w2 = idBuffer.w2;
+ r.idBuffer.w3 = idBuffer.w3;
+ r.idBuffer.w4 = idBuffer.w4;
+ r.idBuffer.w5 = idBuffer.w5;
+ r.offset = offset;
+ return r;
+ }
+
+ protected void ensureId() {
+ // Override in implementations.
}
}
protected abstract class EntriesIterator implements Iterator<MutableEntry> {
- protected MutableEntry objectId = new MutableEntry();
+ protected final MutableEntry entry = initEntry();
protected long returnedNumber = 0;
+ protected abstract MutableEntry initEntry();
+
public boolean hasNext() {
return returnedNumber < getObjectCount();
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV1.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV1.java
index 02465f6..90b5f6f 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV1.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV1.java
@@ -162,21 +162,27 @@ boolean hasCRC32Support() {
private int levelTwo;
+ @Override
+ protected MutableEntry initEntry() {
+ return new MutableEntry() {
+ protected void ensureId() {
+ idBuffer.fromRaw(idxdata[levelOne], levelTwo
+ - Constants.OBJECT_ID_LENGTH);
+ }
+ };
+ }
+
public MutableEntry next() {
for (; levelOne < idxdata.length; levelOne++) {
if (idxdata[levelOne] == null)
continue;
-
if (levelTwo < idxdata[levelOne].length) {
- long offset = NB.decodeUInt32(idxdata[levelOne], levelTwo);
- objectId.setOffset(offset);
- objectId.fromRaw(idxdata[levelOne], levelTwo + 4);
+ entry.offset = NB.decodeUInt32(idxdata[levelOne], levelTwo);
levelTwo += Constants.OBJECT_ID_LENGTH + 4;
returnedNumber++;
- return objectId;
- } else {
- levelTwo = 0;
+ return entry;
}
+ levelTwo = 0;
}
throw new NoSuchElementException();
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV2.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV2.java
index cc3ad65..48a5206 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV2.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackIndexV2.java
@@ -234,25 +234,32 @@ else if (cmp == 0) {
private int levelTwo;
+ @Override
+ protected MutableEntry initEntry() {
+ return new MutableEntry() {
+ protected void ensureId() {
+ idBuffer.fromRaw(names[levelOne], levelTwo
+ - Constants.OBJECT_ID_LENGTH / 4);
+ }
+ };
+ }
+
public MutableEntry next() {
for (; levelOne < names.length; levelOne++) {
if (levelTwo < names[levelOne].length) {
- objectId.fromRaw(names[levelOne], levelTwo);
- int arrayIdx = levelTwo / (Constants.OBJECT_ID_LENGTH / 4)
- * 4;
- long offset = NB.decodeUInt32(offset32[levelOne], arrayIdx);
+ int idx = levelTwo / (Constants.OBJECT_ID_LENGTH / 4) * 4;
+ long offset = NB.decodeUInt32(offset32[levelOne], idx);
if ((offset & IS_O64) != 0) {
- arrayIdx = (8 * (int) (offset & ~IS_O64));
- offset = NB.decodeUInt64(offset64, arrayIdx);
+ idx = (8 * (int) (offset & ~IS_O64));
+ offset = NB.decodeUInt64(offset64, idx);
}
- objectId.setOffset(offset);
+ entry.offset = offset;
levelTwo += Constants.OBJECT_ID_LENGTH / 4;
returnedNumber++;
- return objectId;
- } else {
- levelTwo = 0;
+ return entry;
}
+ levelTwo = 0;
}
throw new NoSuchElementException();
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 13/23] Dispose of RevCommit buffers when they aren't used in PackWriter
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-13-git-send-email-spearce@spearce.org>
We don't need the commit buffer, so we might as well throw it
away immediately to reduce the total memory usage within the
writer process.
The same goes for the buffers within UploadPack when its doing
a check to see if it has a complete base.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 2 ++
.../src/org/spearce/jgit/transport/UploadPack.java | 1 +
2 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index 89460f2..88b2b1f 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -802,10 +802,12 @@ private void findObjectsToPack(final ObjectWalk walker)
while ((o = walker.next()) != null) {
addObject(o);
+ o.dispose();
initMonitor.update(1);
}
while ((o = walker.nextObject()) != null) {
addObject(o);
+ o.dispose();
initMonitor.update(1);
}
initMonitor.endTask();
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/UploadPack.java b/org.spearce.jgit/src/org/spearce/jgit/transport/UploadPack.java
index 4401951..d57df03 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/UploadPack.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/UploadPack.java
@@ -434,6 +434,7 @@ private boolean wantSatisfied(final RevCommit want) throws IOException {
}
return true;
}
+ c.dispose();
}
return false;
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 15/23] Don't unpack delta chains while converting delta to whole object
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-15-git-send-email-spearce@spearce.org>
If an object is currently a delta but we need to convert it into
a whole object we shouldn't be chasing the delta chain to get its
real type code. We had the type code at the time we made the list
of objects to pack; so we save it in the ObjectToPack structure.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 28 ++++++++++++++-----
1 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index 88b2b1f..a0823c7 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -705,14 +705,14 @@ private void writeObject(final ObjectToPack otp) throws IOException {
private void writeWholeObject(final ObjectToPack otp) throws IOException {
if (otp.hasReuseLoader()) {
final PackedObjectLoader loader = otp.getReuseLoader();
- writeObjectHeader(loader.getType(), loader.getSize());
+ writeObjectHeader(otp.getType(), loader.getSize());
loader.copyRawData(out, buf);
otp.disposeLoader();
} else {
final ObjectLoader loader = db.openObject(windowCursor, otp);
final DeflaterOutputStream deflaterOut = new DeflaterOutputStream(
out, deflater);
- writeObjectHeader(loader.getType(), loader.getSize());
+ writeObjectHeader(otp.getType(), loader.getSize());
deflaterOut.write(loader.getCachedBytes());
deflaterOut.finish();
deflater.reset();
@@ -833,7 +833,7 @@ public void addObject(final RevObject object)
return;
}
- final ObjectToPack otp = new ObjectToPack(object);
+ final ObjectToPack otp = new ObjectToPack(object, object.getType());
try {
objectsLists[object.getType()].add(otp);
} catch (ArrayIndexOutOfBoundsException x) {
@@ -858,7 +858,8 @@ public void addObject(final RevObject object)
private PackedObjectLoader reuseLoader;
- private int deltaDepth;
+ /** Low bits contain the objectType; higher bits the deltaDepth */
+ private int flags;
private boolean wantWrite;
@@ -868,9 +869,12 @@ public void addObject(final RevObject object)
*
* @param src
* object id of object for packing
+ * @param type
+ * real type code of the object, not its in-pack type.
*/
- ObjectToPack(AnyObjectId src) {
+ ObjectToPack(AnyObjectId src, final int type) {
super(src);
+ flags |= type;
}
/**
@@ -946,15 +950,23 @@ void disposeLoader() {
this.reuseLoader = null;
}
+ int getType() {
+ return flags & 0x7;
+ }
+
int getDeltaDepth() {
- return deltaDepth;
+ return flags >>> 3;
}
void updateDeltaDepth() {
+ final int d;
if (deltaBase instanceof ObjectToPack)
- deltaDepth = ((ObjectToPack) deltaBase).deltaDepth + 1;
+ d = ((ObjectToPack) deltaBase).getDeltaDepth() + 1;
else if (deltaBase != null)
- deltaDepth = 1;
+ d = 1;
+ else
+ d = 0;
+ flags = (d << 3) | getType();
}
boolean wantWrite() {
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 21/23] Use an ArrayList for the reuseLoader collection in PackWriter
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-21-git-send-email-spearce@spearce.org>
Nulling out the elements of the array is quicker than deallocating
LinkedList nodes and allocating new ones for the next use.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index 3ef9154..69b4294 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -46,7 +46,6 @@
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
-import java.util.LinkedList;
import java.util.List;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
@@ -577,8 +576,7 @@ public void writePack(OutputStream packStream) throws IOException {
private void searchForReuse() throws IOException {
initMonitor.beginTask(SEARCHING_REUSE_PROGRESS, getObjectsNumber());
- final Collection<PackedObjectLoader> reuseLoaders = new LinkedList<PackedObjectLoader>();
-
+ final Collection<PackedObjectLoader> reuseLoaders = new ArrayList<PackedObjectLoader>();
for (List<ObjectToPack> list : objectsLists) {
for (ObjectToPack otp : list) {
if (initMonitor.isCancelled())
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 22/23] Don't cut off existing delta chains if we are reusing deltas
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-22-git-send-email-spearce@spearce.org>
The existing pack file may have chains longer than our current max
depth, such as if it is a historical pack and was generated with a
maximum chain depth >100. Cutting off the chains would create a
bigger pack file that we need to transmit, and it costs a lot more
CPU time to expand the delta object in order to recompress it whole,
because we couldn't copy it directly from the pack file.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 6 ------
1 files changed, 0 insertions(+), 6 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index 69b4294..4c01b6e 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -681,12 +681,6 @@ private void writeObject(final ObjectToPack otp) throws IOException {
writeObject(deltaBase);
}
}
-
- otp.updateDeltaDepth();
- if (otp.getDeltaDepth() > maxDeltaDepth) {
- otp.clearDeltaBase();
- otp.disposeLoader();
- }
}
assert !otp.isWritten();
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 23/23] Correctly honor the thin parameter to PackWriter.writePack
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-23-git-send-email-spearce@spearce.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index 4c01b6e..b2d0227 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -465,8 +465,9 @@ public void preparePack(
final Collection<? extends ObjectId> uninterestingObjects,
final boolean thin, final boolean ignoreMissingUninteresting)
throws IOException {
+ this.thin = thin;
ObjectWalk walker = setUpWalker(interestingObjects,
- uninterestingObjects, thin, ignoreMissingUninteresting);
+ uninterestingObjects, ignoreMissingUninteresting);
findObjectsToPack(walker);
}
@@ -759,7 +760,7 @@ private void writeChecksum() throws IOException {
private ObjectWalk setUpWalker(
final Collection<? extends ObjectId> interestingObjects,
final Collection<? extends ObjectId> uninterestingObjects,
- final boolean thin, final boolean ignoreMissingUninteresting)
+ final boolean ignoreMissingUninteresting)
throws MissingObjectException, IOException,
IncorrectObjectTypeException {
final ObjectWalk walker = new ObjectWalk(db);
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 18/23] Correctly use a long for the offsets within a generated pack
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-18-git-send-email-spearce@spearce.org>
Pack files can be larger than 2 GiB, especially if the PackIndexV2
format is being used.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../tst/org/spearce/jgit/lib/PackWriterTest.java | 12 ++++++------
.../spearce/jgit/util/CountingOutputStream.java | 5 ++---
2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java
index ec138a0..98e0d3a 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/PackWriterTest.java
@@ -308,11 +308,11 @@ public void testWritePack4ThinPack() throws IOException {
*/
public void testWritePack2SizeDeltasVsNoDeltas() throws Exception {
testWritePack2();
- final int sizePack2NoDeltas = cos.getCount();
+ final long sizePack2NoDeltas = cos.getCount();
tearDown();
setUp();
testWritePack2DeltasReuseRefs();
- final int sizePack2DeltasRefs = cos.getCount();
+ final long sizePack2DeltasRefs = cos.getCount();
assertTrue(sizePack2NoDeltas > sizePack2DeltasRefs);
}
@@ -327,11 +327,11 @@ public void testWritePack2SizeDeltasVsNoDeltas() throws Exception {
*/
public void testWritePack2SizeOffsetsVsRefs() throws Exception {
testWritePack2DeltasReuseRefs();
- final int sizePack2DeltasRefs = cos.getCount();
+ final long sizePack2DeltasRefs = cos.getCount();
tearDown();
setUp();
testWritePack2DeltasReuseOffsets();
- final int sizePack2DeltasOffsets = cos.getCount();
+ final long sizePack2DeltasOffsets = cos.getCount();
assertTrue(sizePack2DeltasRefs > sizePack2DeltasOffsets);
}
@@ -345,11 +345,11 @@ public void testWritePack2SizeOffsetsVsRefs() throws Exception {
*/
public void testWritePack4SizeThinVsNoThin() throws Exception {
testWritePack4();
- final int sizePack4 = cos.getCount();
+ final long sizePack4 = cos.getCount();
tearDown();
setUp();
testWritePack4ThinPack();
- final int sizePack4Thin = cos.getCount();
+ final long sizePack4Thin = cos.getCount();
assertTrue(sizePack4 > sizePack4Thin);
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/util/CountingOutputStream.java b/org.spearce.jgit/src/org/spearce/jgit/util/CountingOutputStream.java
index 53935dc..b0b5f7d 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/util/CountingOutputStream.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/util/CountingOutputStream.java
@@ -45,8 +45,7 @@
* Counting output stream decoration. Counts bytes written to stream.
*/
public class CountingOutputStream extends FilterOutputStream {
-
- private int count;
+ private long count;
/**
* Create counting stream being decorated to provided real output stream.
@@ -76,7 +75,7 @@ public void write(byte[] b, int off, int len) throws IOException {
* @return number of written bytes since last reset (object is reset upon
* creation)
*/
- public int getCount() {
+ public long getCount() {
return count;
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 20/23] Move "wantWrite" field of ObjectToPack into the flags field
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-20-git-send-email-spearce@spearce.org>
This reduces the ObjectToPack structure by 4 bytes, saving us 4 MB
on the linux kernel repository. Its only a single bit value, so
keeping it in a full boolean is rather wasteful given how many of
these objects we need in to pack a large project.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 23 ++++++++++++-------
1 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index 50d06c2..3ef9154 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -859,11 +859,16 @@ public void addObject(final RevObject object)
private PackedObjectLoader reuseLoader;
- /** Low bits contain the objectType; higher bits the deltaDepth */
+ /**
+ * Bit field, from bit 0 to bit 31:
+ * <ul>
+ * <li>1 bit: wantWrite</li>
+ * <li>3 bits: type</li>
+ * <li>28 bits: deltaDepth</li>
+ * </ul>
+ */
private int flags;
- private boolean wantWrite;
-
/**
* Construct object for specified object id. <br/> By default object is
* marked as not written and non-delta packed (as a whole object).
@@ -875,7 +880,7 @@ public void addObject(final RevObject object)
*/
ObjectToPack(AnyObjectId src, final int type) {
super(src);
- flags |= type;
+ flags |= type << 1;
}
/**
@@ -952,11 +957,11 @@ void disposeLoader() {
}
int getType() {
- return flags & 0x7;
+ return (flags>>1) & 0x7;
}
int getDeltaDepth() {
- return flags >>> 3;
+ return flags >>> 4;
}
void updateDeltaDepth() {
@@ -967,15 +972,15 @@ else if (deltaBase != null)
d = 1;
else
d = 0;
- flags = (d << 3) | getType();
+ flags = (d << 4) | flags & 0x15;
}
boolean wantWrite() {
- return wantWrite;
+ return (flags & 1) == 1;
}
void markWantWrite() {
- this.wantWrite = true;
+ flags |= 1;
}
}
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 14/23] Don't unpack delta chains while writing a pack from a pack v1 index
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-14-git-send-email-spearce@spearce.org>
We only need to inflate the delta to verify the content is sane;
going all the way through the delta chain by getCachedBytes is
a massive expensive that we just cannot afford to pay. This
simple change improves clone time for the Linux kernel from
14m50s to 3m20s, due to the improved throughput in our write
objects phase.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/ByteArrayWindow.java | 14 ++++++++++++++
.../src/org/spearce/jgit/lib/ByteBufferWindow.java | 17 +++++++++++++++++
.../src/org/spearce/jgit/lib/ByteWindow.java | 10 ++++++++++
.../src/org/spearce/jgit/lib/PackFile.java | 18 +++++++++++-------
.../src/org/spearce/jgit/lib/WindowCursor.java | 16 ++++++++++++++++
.../src/org/spearce/jgit/lib/WindowedFile.java | 5 +++++
6 files changed, 73 insertions(+), 7 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java
index b32d4f8..7b7c7a4 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java
@@ -89,6 +89,20 @@ int inflate(final byte[] array, final int pos, final byte[] b, int o,
return o;
}
+ void inflateVerify(final byte[] array, final int pos, final Inflater inf)
+ throws DataFormatException {
+ ensureLoaded(array);
+ while (!inf.finished()) {
+ if (inf.needsInput()) {
+ inf.setInput(array, pos, array.length - pos);
+ break;
+ }
+ inf.inflate(verifyGarbageBuffer, 0, verifyGarbageBuffer.length);
+ }
+ while (!inf.finished() && !inf.needsInput())
+ inf.inflate(verifyGarbageBuffer, 0, verifyGarbageBuffer.length);
+ }
+
private synchronized void ensureLoaded(final byte[] array) {
if (!loaded) {
try {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteBufferWindow.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteBufferWindow.java
index 03bafb1..a6757e8 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteBufferWindow.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteBufferWindow.java
@@ -90,4 +90,21 @@ int inflate(final ByteBuffer buffer, final int pos, final byte[] b, int o,
o += inf.inflate(b, o, b.length - o);
return o;
}
+
+ void inflateVerify(final ByteBuffer buffer, final int pos,
+ final Inflater inf) throws DataFormatException {
+ final byte[] tmp = new byte[512];
+ final ByteBuffer s = buffer.slice();
+ s.position(pos);
+ while (s.remaining() > 0 && !inf.finished()) {
+ if (inf.needsInput()) {
+ final int n = Math.min(s.remaining(), tmp.length);
+ s.get(tmp, 0, n);
+ inf.setInput(tmp, 0, n);
+ }
+ inf.inflate(verifyGarbageBuffer, 0, verifyGarbageBuffer.length);
+ }
+ while (!inf.finished() && !inf.needsInput())
+ inf.inflate(verifyGarbageBuffer, 0, verifyGarbageBuffer.length);
+ }
}
\ No newline at end of file
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java
index 8d37de7..732664b 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java
@@ -207,4 +207,14 @@ final int inflate(T ref, long pos, byte[] dstbuf, int dstoff, Inflater inf)
*/
abstract int inflate(T ref, int pos, byte[] dstbuf, int dstoff, Inflater inf)
throws DataFormatException;
+
+ protected static final byte[] verifyGarbageBuffer = new byte[2048];
+
+ final void inflateVerify(T ref, long pos, Inflater inf)
+ throws DataFormatException {
+ inflateVerify(ref, (int) (pos - start), inf);
+ }
+
+ abstract void inflateVerify(T ref, int pos, Inflater inf)
+ throws DataFormatException;
}
\ No newline at end of file
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
index 3cdca8f..0de4c55 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
@@ -218,15 +218,19 @@ final void copyRawData(final PackedObjectLoader loader,
final ObjectId id = findObjectForOffset(objectOffset);
final long expected = idx.findCRC32(id);
if (computed != expected)
- throw new CorruptObjectException(id,
- "Possible data corruption - CRC32 of raw pack data (object offset "
- + objectOffset
- + ") mismatch CRC32 from pack index");
+ throw new CorruptObjectException("Object at " + dataOffset
+ + " in " + getPackFile() + " has bad zlib stream");
} else {
+ try {
+ pack.verifyCompressed(dataOffset, curs);
+ } catch (DataFormatException dfe) {
+ final CorruptObjectException coe;
+ coe = new CorruptObjectException("Object at " + dataOffset
+ + " in " + getPackFile() + " has bad zlib stream");
+ coe.initCause(dfe);
+ throw coe;
+ }
pack.copyToStream(dataOffset, buf, cnt, out, curs);
-
- // read to verify against Adler32 zlib checksum
- loader.getCachedBytes();
}
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCursor.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCursor.java
index 7aad081..5c8bec5 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCursor.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCursor.java
@@ -125,6 +125,22 @@ int inflate(final WindowedFile provider, long position,
}
}
+ void inflateVerify(final WindowedFile provider, long position)
+ throws IOException, DataFormatException {
+ if (inf == null)
+ inf = InflaterCache.get();
+ else
+ inf.reset();
+ for (;;) {
+ pin(provider, position);
+ window.inflateVerify(handle, position, inf);
+ if (inf.finished())
+ return;
+ position = window.end;
+ }
+ }
+
+
private void pin(final WindowedFile provider, final long position)
throws IOException {
final ByteWindow w = window;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
index 7626693..5eb8465 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
@@ -248,6 +248,11 @@ void readCompressed(final long position, final byte[] dstbuf,
throw new EOFException("Short compressed stream at " + position);
}
+ void verifyCompressed(final long position, final WindowCursor curs)
+ throws IOException, DataFormatException {
+ curs.inflateVerify(this, position);
+ }
+
/**
* Overridable hook called after the file is opened.
* <p>
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 17/23] Only do one getCachedBytes per whole object written
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-17-git-send-email-spearce@spearce.org>
Getting the size of an object may cause getCachedBytes to be invoked,
and then have its result discarded. To save time we should make the
call to getCachedBytes first, hold onto the byte[], then obtain the
size off the byte[], and finally write from the byte[].
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/PackWriter.java | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
index a0823c7..bb889e8 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackWriter.java
@@ -710,10 +710,11 @@ private void writeWholeObject(final ObjectToPack otp) throws IOException {
otp.disposeLoader();
} else {
final ObjectLoader loader = db.openObject(windowCursor, otp);
+ final byte[] data = loader.getCachedBytes();
final DeflaterOutputStream deflaterOut = new DeflaterOutputStream(
out, deflater);
- writeObjectHeader(otp.getType(), loader.getSize());
- deflaterOut.write(loader.getCachedBytes());
+ writeObjectHeader(otp.getType(), data.length);
+ deflaterOut.write(data);
deflaterOut.finish();
deflater.reset();
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 12/23] Change ByteArrayWindow to read content outside of WindowCache's lock
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-12-git-send-email-spearce@spearce.org>
This improves the concurrency limit of WindowCache by allowing the
individual windows to be paged in outside of the cache lock. By
moving it out multiple threads can read in different windows of
the same (or different) pack files concurrently, but we still do
only one read per window.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/ByteArrayWindow.java | 17 +++++++++++++++++
.../src/org/spearce/jgit/lib/WindowCache.java | 2 +-
.../src/org/spearce/jgit/lib/WindowedFile.java | 15 +++++++--------
3 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java
index cea71be..b32d4f8 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteArrayWindow.java
@@ -38,6 +38,8 @@
package org.spearce.jgit.lib;
+import java.io.IOException;
+import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;
@@ -45,6 +47,8 @@
* A {@link ByteWindow} with an underlying byte array for storage.
*/
final class ByteArrayWindow extends ByteWindow<byte[]> {
+ boolean loaded;
+
/**
* Constructor for ByteWindow.
*
@@ -64,6 +68,7 @@ ByteArrayWindow(final WindowedFile o, final long p, final int d,
}
int copy(final byte[] array, final int p, final byte[] b, final int o, int n) {
+ ensureLoaded(array);
n = Math.min(array.length - p, n);
System.arraycopy(array, p, b, o, n);
return n;
@@ -71,6 +76,7 @@ int copy(final byte[] array, final int p, final byte[] b, final int o, int n) {
int inflate(final byte[] array, final int pos, final byte[] b, int o,
final Inflater inf) throws DataFormatException {
+ ensureLoaded(array);
while (!inf.finished()) {
if (inf.needsInput()) {
inf.setInput(array, pos, array.length - pos);
@@ -82,4 +88,15 @@ int inflate(final byte[] array, final int pos, final byte[] b, int o,
o += inf.inflate(b, o, b.length - o);
return o;
}
+
+ private synchronized void ensureLoaded(final byte[] array) {
+ if (!loaded) {
+ try {
+ provider.fd.getChannel().read(ByteBuffer.wrap(array), start);
+ } catch (IOException e) {
+ throw new RuntimeException("Cannot fault in window", e);
+ }
+ loaded = true;
+ }
+ }
}
\ No newline at end of file
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java
index f478f04..649567b 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java
@@ -270,7 +270,7 @@ public static synchronized final void get(final WindowCursor curs,
releaseMemory();
runClearedWindowQueue();
- wp.loadWindow(curs, id, id << windowSizeShift, wsz);
+ wp.allocWindow(curs, id, id << windowSizeShift, wsz);
final ByteWindow<?> e = curs.window;
e.chainNext = cache[idx];
cache[idx] = e;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
index 9c5cf1e..7626693 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
@@ -73,7 +73,7 @@
final int hash;
- private RandomAccessFile fd;
+ RandomAccessFile fd;
private long length;
@@ -298,8 +298,8 @@ void cacheClose() {
fd = null;
}
- void loadWindow(final WindowCursor curs, final int windowId,
- final long pos, final int size) throws IOException {
+ void allocWindow(final WindowCursor curs, final int windowId,
+ final long pos, final int size) {
if (WindowCache.mmap) {
MappedByteBuffer map;
try {
@@ -323,7 +323,10 @@ void loadWindow(final WindowCursor curs, final int windowId,
if (map != null) {
if (map.hasArray()) {
final byte[] b = map.array();
- curs.window = new ByteArrayWindow(this, pos, windowId, b);
+ final ByteArrayWindow w;
+ w = new ByteArrayWindow(this, pos, windowId, b);
+ w.loaded = true;
+ curs.window = w;
curs.handle = b;
} else {
curs.window = new ByteBufferWindow(this, pos, windowId, map);
@@ -334,10 +337,6 @@ void loadWindow(final WindowCursor curs, final int windowId,
}
final byte[] b = new byte[size];
- synchronized (fd) {
- fd.seek(pos);
- fd.readFully(b);
- }
curs.window = new ByteArrayWindow(this, pos, windowId, b);
curs.handle = b;
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 11/23] Rewrite WindowCache to use a hash table
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-11-git-send-email-spearce@spearce.org>
This shaved 10s off my linux-2.6 "jgit rev-list --objects" test.
We're also getting slightly better cache hit ratios, approaching
95% during commit access and 80% during tree access on the same
repository. Its a significant win.
The cache is also a bit better about memory usage. We now make
sure the byte[] or ByteBuffer handle is cleared out of the soft
reference when we push the reference into the queue manually. A
small change, but it makes a marked difference with the overall
JVM memory usage.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/ByteWindow.java | 10 +-
.../src/org/spearce/jgit/lib/WindowCache.java | 323 ++++++++++++--------
.../src/org/spearce/jgit/lib/WindowedFile.java | 2 +-
3 files changed, 199 insertions(+), 136 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java
index 1f6beb8..8d37de7 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ByteWindow.java
@@ -56,14 +56,20 @@
* type of object reference used to manage the window data.
*/
abstract class ByteWindow<T> extends SoftReference<T> {
+ boolean sizeActive = true;
+
+ ByteWindow<?> chainNext;
+
+ ByteWindow<?> lruPrev;
+
+ ByteWindow<?> lruNext;
+
final WindowedFile provider;
final int id;
final int size;
- int lastAccessed;
-
final long start;
final long end;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java
index f617845..f478f04 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowCache.java
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2007, Robin Rosenberg <robin.rosenberg@dewire.com>
+ * Copyright (C) 2008, Google Inc.
* Copyright (C) 2008, Shawn O. Pearce <spearce@spearce.org>
*
* All rights reserved.
@@ -41,11 +41,7 @@
import java.io.IOException;
import java.lang.ref.ReferenceQueue;
-/**
- * The WindowCache manages reusable <code>Windows</code> and inflaters used by
- * the other windowed file access classes.
- */
-public class WindowCache {
+class WindowCache {
private static final int KB = 1024;
private static final int MB = 1024 * KB;
@@ -68,23 +64,51 @@ private static final int bits(int newSize) {
static final ReferenceQueue<?> clearedWindowQueue;
- private static ByteWindow[] windows;
+ private static ByteWindow[] cache;
+
+ private static ByteWindow lruHead;
- private static int openWindowCount;
+ private static ByteWindow lruTail;
private static int openByteCount;
- private static int accessClock;
+ private static int hits;
+
+ private static int reqs;
+
+ private static int opens;
+
+ private static int closes;
static {
maxByteCount = 10 * MB;
windowSizeShift = bits(8 * KB);
windowSize = 1 << windowSizeShift;
mmap = false;
- windows = new ByteWindow[maxByteCount / windowSize];
+ cache = new ByteWindow[cacheTableSize()];
clearedWindowQueue = new ReferenceQueue<Object>();
}
+ static synchronized String statString() {
+ int maxChain = 0, tot = 0;
+ for (ByteWindow<?> e : cache) {
+ int n = 0;
+ for (; e != null; e = e.chainNext) {
+ n++;
+ tot++;
+ }
+ maxChain = Math.max(maxChain, n);
+ }
+ return "WindowCache[ hits: " + (hits * 100 / reqs) + "%"
+ + "; windows: " + tot + " maxChain: " + maxChain + "; kb:"
+ + (openByteCount / KB) + "; cache: " + cache.length + " fds: "
+ + (opens - closes) + "," + opens + "," + closes + " ]";
+ }
+
+ private static int cacheTableSize() {
+ return 5 * (maxByteCount / windowSize) / 2;
+ }
+
/**
* Modify the configuration of the window cache.
* <p>
@@ -135,27 +159,36 @@ private static synchronized void reconfigureImpl(final int packedGitLimit,
// We have to throw away every window we have. None
// of them are suitable for the new configuration.
//
- for (int i = 0; i < openWindowCount; i++) {
- final ByteWindow win = windows[i];
- if (--win.provider.openCount == 0)
- win.provider.cacheClose();
- windows[i] = null;
+ for (ByteWindow<?> e : cache) {
+ for (; e != null; e = e.chainNext)
+ clear(e);
+ }
+ runClearedWindowQueue();
+ cache = new ByteWindow[cacheTableSize()];
+
+ } else {
+ if (prune) {
+ // We should decrease our memory usage.
+ //
+ releaseMemory();
+ runClearedWindowQueue();
}
- windows = new ByteWindow[maxByteCount / windowSize];
- openWindowCount = 0;
- openByteCount = 0;
- } else if (prune) {
- // Our memory limit was decreased so we should try
- // to drop windows to ensure we meet the new lower
- // limit we were just given.
- //
- final int wincnt = maxByteCount / windowSize;
- releaseMemory(wincnt, null, 0, 0);
- if (wincnt != windows.length) {
- final ByteWindow[] n = new ByteWindow[wincnt];
- System.arraycopy(windows, 0, n, 0, openWindowCount);
- windows = n;
+ if (cache.length != cacheTableSize()) {
+ // The cache table should be resized.
+ // Rehash every entry.
+ //
+ final ByteWindow[] priorTable = cache;
+
+ cache = new ByteWindow[cacheTableSize()];
+ for (ByteWindow<?> e : priorTable) {
+ for (ByteWindow<?> n; e != null; e = n) {
+ n = e.chainNext;
+ final int idx = hash(e.provider, e.id);
+ e.chainNext = cache[idx];
+ cache[idx] = e;
+ }
+ }
}
}
}
@@ -178,19 +211,27 @@ private static synchronized void reconfigureImpl(final int packedGitLimit,
*/
public static synchronized final void get(final WindowCursor curs,
final WindowedFile wp, final long position) throws IOException {
+ reqs++;
final int id = (int) (position >> windowSizeShift);
- int idx = binarySearch(wp, id);
- if (0 <= idx) {
- final ByteWindow<?> w = windows[idx];
- if ((curs.handle = w.get()) != null) {
- w.lastAccessed = ++accessClock;
- curs.window = w;
- return;
+ final int idx = hash(wp, id);
+ for (ByteWindow<?> e = cache[idx]; e != null; e = e.chainNext) {
+ if (e.provider == wp && e.id == id) {
+ if ((curs.handle = e.get()) != null) {
+ curs.window = e;
+ makeMostRecent(e);
+ hits++;
+ return;
+ }
+
+ clear(e);
+ break;
}
}
- if (++wp.openCount == 1) {
+ if (wp.openCount == 0) {
try {
+ opens++;
+ wp.openCount = 1;
wp.cacheOpen();
} catch (IOException ioe) {
wp.openCount = 0;
@@ -201,106 +242,55 @@ public static synchronized final void get(final WindowCursor curs,
} catch (Error ioe) {
wp.openCount = 0;
throw ioe;
+ } finally {
+ wp.openCount--;
}
// The cacheOpen may have mapped the window we are trying to
// map ourselves. Retrying the search ensures that does not
// happen to us.
//
- idx = binarySearch(wp, id);
- if (0 <= idx) {
- final ByteWindow<?> w = windows[idx];
- if ((curs.handle = w.get()) != null) {
- w.lastAccessed = ++accessClock;
- curs.window = w;
- return;
+ for (ByteWindow<?> e = cache[idx]; e != null; e = e.chainNext) {
+ if (e.provider == wp && e.id == id) {
+ if ((curs.handle = e.get()) != null) {
+ curs.window = e;
+ makeMostRecent(e);
+ return;
+ }
+
+ clear(e);
+ break;
}
}
}
- idx = -(idx + 1);
- final int wSz = windowSize(wp, id);
- idx = releaseMemory(windows.length, wp, idx, wSz);
-
- if (idx < 0)
- idx = 0;
- final int toMove = openWindowCount - idx;
- if (toMove > 0)
- System.arraycopy(windows, idx, windows, idx + 1, toMove);
- wp.loadWindow(curs, id, id << windowSizeShift, wSz);
- windows[idx] = curs.window;
- openWindowCount++;
- openByteCount += curs.window.size;
+ final int wsz = windowSize(wp, id);
+ wp.openCount++;
+ openByteCount += wsz;
+ releaseMemory();
+ runClearedWindowQueue();
+
+ wp.loadWindow(curs, id, id << windowSizeShift, wsz);
+ final ByteWindow<?> e = curs.window;
+ e.chainNext = cache[idx];
+ cache[idx] = e;
+ insertLRU(e);
}
- private static int releaseMemory(final int maxWindowCount,
- final WindowedFile willRead, int insertionIndex, final int willAdd) {
- for (;;) {
- final ByteWindow<?> w = (ByteWindow<?>) clearedWindowQueue.poll();
- if (w == null)
- break;
- final int oldest = binarySearch(w.provider, w.id);
- if (oldest < 0 || windows[oldest] != w)
- continue; // Must have been evicted by our other controls.
-
- final WindowedFile p = w.provider;
- if (--p.openCount == 0 && p != willRead)
- p.cacheClose();
-
- openByteCount -= w.size;
- final int toMove = openWindowCount - oldest - 1;
- if (toMove > 0)
- System.arraycopy(windows, oldest + 1, windows, oldest, toMove);
- windows[--openWindowCount] = null;
- if (oldest < insertionIndex)
- insertionIndex--;
- }
-
- while (openWindowCount >= maxWindowCount
- || (openWindowCount > 0 && openByteCount + willAdd > maxByteCount)) {
- int oldest = 0;
- for (int k = openWindowCount - 1; k > 0; k--) {
- if (windows[k].lastAccessed < windows[oldest].lastAccessed)
- oldest = k;
- }
-
- final ByteWindow w = windows[oldest];
- final WindowedFile p = w.provider;
- if (--p.openCount == 0 && p != willRead)
- p.cacheClose();
-
- openByteCount -= w.size;
- final int toMove = openWindowCount - oldest - 1;
- if (toMove > 0)
- System.arraycopy(windows, oldest + 1, windows, oldest, toMove);
- windows[--openWindowCount] = null;
- w.enqueue();
- if (oldest < insertionIndex)
- insertionIndex--;
+ private static void makeMostRecent(ByteWindow<?> e) {
+ if (lruHead != e) {
+ unlinkLRU(e);
+ insertLRU(e);
}
-
- return insertionIndex;
}
- private static final int binarySearch(final WindowedFile sprov,
- final int sid) {
- if (openWindowCount == 0)
- return -1;
- final int shc = sprov.hash;
- int high = openWindowCount;
- int low = 0;
- do {
- final int mid = (low + high) / 2;
- final ByteWindow mw = windows[mid];
- if (mw.provider == sprov && mw.id == sid)
- return mid;
- final int mhc = mw.provider.hash;
- if (mhc < shc || (shc == mhc && mw.id < sid))
- low = mid + 1;
- else
- high = mid;
- } while (low < high);
- return -(low + 1);
+ private static void releaseMemory() {
+ ByteWindow<?> e = lruTail;
+ while (openByteCount > maxByteCount && e != null) {
+ final ByteWindow<?> p = e.lruPrev;
+ clear(e);
+ e = p;
+ }
}
/**
@@ -316,22 +306,89 @@ private static final int binarySearch(final WindowedFile sprov,
* cache.
*/
public static synchronized final void purge(final WindowedFile wp) {
- int d = 0;
- for (int s = 0; s < openWindowCount; s++) {
- final ByteWindow win = windows[s];
- if (win.provider != wp)
- windows[d++] = win;
- else
- openByteCount -= win.size;
+ for (ByteWindow e : cache) {
+ for (; e != null; e = e.chainNext) {
+ if (e.provider == wp)
+ clear(e);
+ }
+ }
+ runClearedWindowQueue();
+ }
+
+ private static void runClearedWindowQueue() {
+ ByteWindow<?> e;
+ while ((e = (ByteWindow) clearedWindowQueue.poll()) != null) {
+ unlinkSize(e);
+ unlinkLRU(e);
+ unlinkCache(e);
+ e.chainNext = null;
+ e.lruNext = null;
+ e.lruPrev = null;
}
- openWindowCount = d;
+ }
+
+ private static void clear(final ByteWindow<?> e) {
+ unlinkSize(e);
+ e.clear();
+ e.enqueue();
+ }
- if (wp.openCount > 0) {
- wp.openCount = 0;
- wp.cacheClose();
+ private static void unlinkSize(final ByteWindow<?> e) {
+ if (e.sizeActive) {
+ if (--e.provider.openCount == 0) {
+ closes++;
+ e.provider.cacheClose();
+ }
+ openByteCount -= e.size;
+ e.sizeActive = false;
}
}
+ private static void unlinkCache(final ByteWindow dead) {
+ final int idx = hash(dead.provider, dead.id);
+ ByteWindow<?> e = cache[idx], p = null, n;
+ for (; e != null; p = e, e = n) {
+ n = e.chainNext;
+ if (e == dead) {
+ if (p == null)
+ cache[idx] = n;
+ else
+ p.chainNext = n;
+ break;
+ }
+ }
+ }
+
+ private static void unlinkLRU(final ByteWindow e) {
+ final ByteWindow<?> prev = e.lruPrev;
+ final ByteWindow<?> next = e.lruNext;
+
+ if (prev != null)
+ prev.lruNext = next;
+ else
+ lruHead = next;
+
+ if (next != null)
+ next.lruPrev = prev;
+ else
+ lruTail = prev;
+ }
+
+ private static void insertLRU(final ByteWindow<?> e) {
+ final ByteWindow h = lruHead;
+ e.lruPrev = null;
+ e.lruNext = h;
+ if (h != null)
+ h.lruPrev = e;
+ else
+ lruTail = e;
+ lruHead = e;
+ }
+
+ private static int hash(final WindowedFile wp, final int id) {
+ return ((wp.hash + id) >>> 1) % cache.length;
+ }
+
private static int windowSize(final WindowedFile file, final int id) {
final long len = file.length();
final long pos = id << windowSizeShift;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
index dd94f24..9c5cf1e 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
@@ -88,7 +88,7 @@
*/
public WindowedFile(final File file) {
fPath = file;
- hash = System.identityHashCode(this);
+ hash = System.identityHashCode(this) * 31;
length = Long.MAX_VALUE;
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 10/23] Make mmap mode more reliable by forcing GC at the correct spot
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-10-git-send-email-spearce@spearce.org>
ObjectLoaders can defer the bulk of their data access to the
getCachedBytes() method invocation, rather than at the time
of their construction. That means there's code paths into
the mmap allocator which aren't protected by the "try GC and
try again" within openObject, resulting in some random crashes
due to mmap failures.
The right place to check for mmap failure and retry is within
WindowedFile.loadWindow, where we actually invoke the mmap call.
If the call fails, we need can immediately retry it, and if it
fails again, we can temporarily degrade to the byte[] variant
until the JVM is able to evict and GC enough space.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/lib/Repository.java | 27 +-----------
.../src/org/spearce/jgit/lib/WindowedFile.java | 43 ++++++++++++++-----
2 files changed, 34 insertions(+), 36 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java
index 02f8103..e1c4049 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java
@@ -290,30 +290,9 @@ public ObjectLoader openObject(final WindowCursor curs, final AnyObjectId id)
int k = packs.length;
if (k > 0) {
do {
- try {
- final ObjectLoader ol = packs[--k].get(curs, id);
- if (ol != null)
- return ol;
- } catch (IOException ioe) {
- // This shouldn't happen unless the pack was corrupted
- // after we opened it or the VM runs out of memory. This is
- // a know problem with memory mapped I/O in java and have
- // been noticed with JDK < 1.6. Tell the gc that now is a good
- // time to collect and try once more.
- try {
- curs.release();
- System.gc();
- final ObjectLoader ol = packs[k].get(curs, id);
- if (ol != null)
- return ol;
- } catch (IOException ioe2) {
- ioe2.printStackTrace();
- ioe.printStackTrace();
- // Still fails.. that's BAD, maybe the pack has
- // been corrupted after all, or the gc didn't manage
- // to release enough previously mmaped areas.
- }
- }
+ final ObjectLoader ol = packs[--k].get(curs, id);
+ if (ol != null)
+ return ol;
} while (k > 0);
}
try {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
index f640c42..dd94f24 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WindowedFile.java
@@ -299,22 +299,41 @@ void cacheClose() {
}
void loadWindow(final WindowCursor curs, final int windowId,
- final long pos, final int windowSize) throws IOException {
+ final long pos, final int size) throws IOException {
if (WindowCache.mmap) {
- final MappedByteBuffer map = fd.getChannel().map(MapMode.READ_ONLY,
- pos, windowSize);
- if (map.hasArray()) {
- final byte[] b = map.array();
- curs.window = new ByteArrayWindow(this, pos, windowId, b);
- curs.handle = b;
- } else {
- curs.window = new ByteBufferWindow(this, pos, windowId, map);
- curs.handle = map;
+ MappedByteBuffer map;
+ try {
+ map = fd.getChannel().map(MapMode.READ_ONLY, pos, size);
+ } catch (IOException e) {
+ // The most likely reason this failed is the JVM has run out
+ // of virtual memory. We need to discard quickly, and try to
+ // force the GC to finalize and release any existing mappings.
+ try {
+ curs.release();
+ System.gc();
+ System.runFinalization();
+ map = fd.getChannel().map(MapMode.READ_ONLY, pos, size);
+ } catch (IOException ioe2) {
+ // Temporarily disable mmap and do buffered disk IO.
+ //
+ map = null;
+ System.err.println("warning: mmap failure: "+ioe2);
+ }
+ }
+ if (map != null) {
+ if (map.hasArray()) {
+ final byte[] b = map.array();
+ curs.window = new ByteArrayWindow(this, pos, windowId, b);
+ curs.handle = b;
+ } else {
+ curs.window = new ByteBufferWindow(this, pos, windowId, map);
+ curs.handle = map;
+ }
+ return;
}
- return;
}
- final byte[] b = new byte[windowSize];
+ final byte[] b = new byte[size];
synchronized (fd) {
fd.seek(pos);
fd.readFully(b);
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 09/23] Remove getId from ObjectLoader API as its unnecessary overhead
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-9-git-send-email-spearce@spearce.org>
Apparently nobody actually requires the getId() method on an ObjectLoader.
In every single location where the getId() is being used the caller has
an AnyObjectId wich the proper value already in-scope, typically the value
they passed into Repository.openObject(). Avoiding having the id in the
loader API means PackFile doesn't have to copy a MutableObjectId into an
immutable ObjectId, saving an object allocation per object lookup.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../tst/org/spearce/jgit/lib/T0004_PackReader.java | 2 -
.../jgit/errors/CorruptObjectException.java | 12 ++++++
.../src/org/spearce/jgit/lib/Constants.java | 2 +-
.../spearce/jgit/lib/DeltaPackedObjectLoader.java | 3 +-
.../src/org/spearce/jgit/lib/ObjectLoader.java | 38 --------------------
.../src/org/spearce/jgit/lib/PackFile.java | 12 +-----
.../src/org/spearce/jgit/lib/Repository.java | 2 +-
.../org/spearce/jgit/lib/UnpackedObjectLoader.java | 12 +++----
.../spearce/jgit/lib/WholePackedObjectLoader.java | 3 +-
.../src/org/spearce/jgit/revwalk/RevWalk.java | 8 ++--
.../jgit/transport/WalkFetchConnection.java | 15 +++++++-
11 files changed, 42 insertions(+), 67 deletions(-)
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java
index f6fff52..6ffb904 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java
@@ -57,7 +57,6 @@ public void test003_lookupCompressedObject() throws IOException {
pr = new PackFile(db, TEST_IDX, TEST_PACK);
or = pr.get(new WindowCursor(), id);
assertNotNull(or);
- assertEquals(id, or.getId());
assertEquals(Constants.OBJ_TREE, or.getType());
assertEquals(35, or.getSize());
assertEquals(7738, or.getDataOffset());
@@ -72,7 +71,6 @@ public void test004_lookupDeltifiedObject() throws IOException {
or = db.openObject(id);
assertNotNull(or);
assertTrue(or instanceof PackedObjectLoader);
- assertEquals(id, or.getId());
assertEquals(Constants.OBJ_BLOB, or.getType());
assertEquals(18009, or.getSize());
assertEquals(537, ((PackedObjectLoader) or).getDataOffset());
diff --git a/org.spearce.jgit/src/org/spearce/jgit/errors/CorruptObjectException.java b/org.spearce.jgit/src/org/spearce/jgit/errors/CorruptObjectException.java
index 4eb7aa6..2b0e287 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/errors/CorruptObjectException.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/errors/CorruptObjectException.java
@@ -40,6 +40,7 @@
import java.io.IOException;
+import org.spearce.jgit.lib.AnyObjectId;
import org.spearce.jgit.lib.ObjectId;
/**
@@ -55,6 +56,17 @@
* @param id
* @param why
*/
+ public CorruptObjectException(final AnyObjectId id, final String why) {
+ this(id.toObjectId(), why);
+ }
+
+ /**
+ * Construct a CorruptObjectException for reporting a problem specified
+ * object id
+ *
+ * @param id
+ * @param why
+ */
public CorruptObjectException(final ObjectId id, final String why) {
super("Object " + id.name() + " is corrupt: " + why);
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java
index 9613d07..8f093d6 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Constants.java
@@ -311,7 +311,7 @@ public static String typeString(final int typeCode) {
* @throws CorruptObjectException
* there is no valid type identified by <code>typeString</code>.
*/
- public static int decodeTypeString(final ObjectId id,
+ public static int decodeTypeString(final AnyObjectId id,
final byte[] typeString, final byte endMark,
final MutableInteger offset) throws CorruptObjectException {
try {
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java
index e73f8e5..85f2bda 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/DeltaPackedObjectLoader.java
@@ -92,7 +92,8 @@ public long getSize() throws IOException {
return data;
} catch (DataFormatException dfe) {
final CorruptObjectException coe;
- coe = new CorruptObjectException(getId(), "bad stream");
+ coe = new CorruptObjectException("Object at " + dataOffset + " in "
+ + pack.getPackFile() + " has bad zlib stream");
coe.initCause(dfe);
throw coe;
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java
index 8d745dd..5485d8d 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/ObjectLoader.java
@@ -39,50 +39,12 @@
package org.spearce.jgit.lib;
import java.io.IOException;
-import java.security.MessageDigest;
/**
* Base class for a set of loaders for different representations of Git objects.
* New loaders are constructed for every object.
*/
public abstract class ObjectLoader {
- private ObjectId objectId;
-
- /**
- * @return the id of this object, possibly computed on demand
- * @throws IOException
- */
- public ObjectId getId() throws IOException {
- if (objectId == null) {
- final MessageDigest md = Constants.newMessageDigest();
- md.update(Constants.encodedTypeString(getType()));
- md.update((byte) ' ');
- md.update(Constants.encodeASCII(getSize()));
- md.update((byte) 0);
- md.update(getCachedBytes());
- objectId = ObjectId.fromRaw(md.digest());
- }
- return objectId;
- }
-
- /**
- * @return true if id of loaded object is already known, false otherwise.
- */
- protected boolean hasComputedId() {
- return objectId != null;
- }
-
- /**
- * Set the SHA-1 id of the object handled by this loader
- *
- * @param id
- */
- protected void setId(final ObjectId id) {
- if (objectId != null)
- throw new IllegalStateException("Id already set.");
- objectId = id;
- }
-
/**
* @return Git in pack object type, see {@link Constants}.
* @throws IOException
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
index cd17bd4..3cdca8f 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
@@ -129,11 +129,7 @@ public boolean hasObject(final AnyObjectId id) {
public PackedObjectLoader get(final WindowCursor curs, final AnyObjectId id)
throws IOException {
final long offset = idx.findOffset(id);
- if (offset == -1)
- return null;
- final PackedObjectLoader objReader = reader(curs, offset);
- objReader.setId(id.toObjectId());
- return objReader;
+ return 0 < offset ? reader(curs, offset) : null;
}
/**
@@ -219,11 +215,7 @@ final void copyRawData(final PackedObjectLoader loader,
pack.copyToStream(dataOffset, buf, cnt, crcOut, curs);
final long computed = crc.getValue();
- ObjectId id;
- if (loader.hasComputedId())
- id = loader.getId();
- else
- id = findObjectForOffset(objectOffset);
+ final ObjectId id = findObjectForOffset(objectOffset);
final long expected = idx.findCRC32(id);
if (computed != expected)
throw new CorruptObjectException(id,
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java b/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java
index 8f6e96e..02f8103 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/Repository.java
@@ -317,7 +317,7 @@ public ObjectLoader openObject(final WindowCursor curs, final AnyObjectId id)
} while (k > 0);
}
try {
- return new UnpackedObjectLoader(this, id.toObjectId());
+ return new UnpackedObjectLoader(this, id);
} catch (FileNotFoundException fnfe) {
return null;
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java
index 3ad273f..0560c3a 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/UnpackedObjectLoader.java
@@ -67,13 +67,13 @@
* SHA-1
* @throws IOException
*/
- public UnpackedObjectLoader(final Repository db, final ObjectId id)
+ public UnpackedObjectLoader(final Repository db, final AnyObjectId id)
throws IOException {
this(readCompressed(db, id), id);
}
- private static byte[] readCompressed(final Repository db, final ObjectId id)
- throws FileNotFoundException, IOException {
+ private static byte[] readCompressed(final Repository db,
+ final AnyObjectId id) throws FileNotFoundException, IOException {
final FileInputStream objStream = new FileInputStream(db.toFile(id));
final byte[] compressed;
try {
@@ -101,10 +101,8 @@ public UnpackedObjectLoader(final byte[] compressed)
this(compressed, null);
}
- private UnpackedObjectLoader(final byte[] compressed, final ObjectId id)
+ private UnpackedObjectLoader(final byte[] compressed, final AnyObjectId id)
throws CorruptObjectException {
- setId(id);
-
// Try to determine if this is a legacy format loose object or
// a new style loose object. The legacy format was completely
// compressed with zlib so the first byte must be 0x78 (15-bit
@@ -177,7 +175,7 @@ private UnpackedObjectLoader(final byte[] compressed, final ObjectId id)
}
}
- private void decompress(final ObjectId id, final Inflater inf, int p)
+ private void decompress(final AnyObjectId id, final Inflater inf, int p)
throws CorruptObjectException {
try {
while (!inf.finished())
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java b/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java
index 7185df5..3b4f90d 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/WholePackedObjectLoader.java
@@ -72,7 +72,8 @@ WholePackedObjectLoader(final WindowCursor curs, final PackFile pr,
return data;
} catch (DataFormatException dfe) {
final CorruptObjectException coe;
- coe = new CorruptObjectException(getId(), "bad stream");
+ coe = new CorruptObjectException("Object at " + dataOffset + " in "
+ + pack.getPackFile() + " has bad zlib stream");
coe.initCause(dfe);
throw coe;
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java b/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java
index b1571ab..dfb34d9 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java
@@ -687,23 +687,23 @@ public RevObject parseAny(final AnyObjectId id)
final int type = ldr.getType();
switch (type) {
case Constants.OBJ_COMMIT: {
- final RevCommit c = createCommit(ldr.getId());
+ final RevCommit c = createCommit(id);
c.parseCanonical(this, data);
r = c;
break;
}
case Constants.OBJ_TREE: {
- r = new RevTree(ldr.getId());
+ r = new RevTree(id);
r.flags |= PARSED;
break;
}
case Constants.OBJ_BLOB: {
- r = new RevBlob(ldr.getId());
+ r = new RevBlob(id);
r.flags |= PARSED;
break;
}
case Constants.OBJ_TAG: {
- final RevTag t = new RevTag(ldr.getId());
+ final RevTag t = new RevTag(id);
t.parseCanonical(this, data);
r = t;
break;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java b/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
index 5d0c6bc..93b5bd2 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
@@ -42,6 +42,7 @@
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
+import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
@@ -155,6 +156,8 @@
private final MutableObjectId idBuffer = new MutableObjectId();
+ private final MessageDigest objectDigest = Constants.newMessageDigest();
+
/**
* Errors received while trying to obtain an object.
* <p>
@@ -573,9 +576,17 @@ private void verifyLooseObject(final AnyObjectId id, final byte[] compressed)
throw e;
}
- if (!AnyObjectId.equals(id, uol.getId())) {
+ objectDigest.reset();
+ objectDigest.update(Constants.encodedTypeString(uol.getType()));
+ objectDigest.update((byte) ' ');
+ objectDigest.update(Constants.encodeASCII(uol.getSize()));
+ objectDigest.update((byte) 0);
+ objectDigest.update(uol.getCachedBytes());
+ idBuffer.fromRaw(objectDigest.digest(), 0);
+
+ if (!AnyObjectId.equals(id, idBuffer)) {
throw new TransportException("Incorrect hash for " + id.name()
- + "; computed " + uol.getId().name() + " as a "
+ + "; computed " + idBuffer.name() + " as a "
+ Constants.encodedTypeString(uol.getType()) + " from "
+ compressed.length + " bytes.");
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 07/23] Switch ObjectWalk to use a naked CanonicalTreeParser because its faster
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-7-git-send-email-spearce@spearce.org>
Removing the indirection of the TreeWalk during an ObjectWalk's loop
over the trees saves us 20 seconds on one of my linux-2.6 clones. On
average we went from 1m37s for "jgit rev-list --objects" down to 1m17s.
C git still does the same operation in 0m20s, so we're a long way from
matching C git's performance, but this is a worthwhile improvement, even
though the code has become slightly more complex.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../src/org/spearce/jgit/revwalk/ObjectWalk.java | 42 ++++++++++---------
.../jgit/treewalk/AbstractTreeIterator.java | 20 +++++++++
.../spearce/jgit/treewalk/CanonicalTreeParser.java | 39 ++++++++++++++++++-
.../src/org/spearce/jgit/treewalk/TreeWalk.java | 4 +-
4 files changed, 82 insertions(+), 23 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java b/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java
index cfdbe18..0d67a38 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java
@@ -46,7 +46,7 @@
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.FileMode;
import org.spearce.jgit.lib.Repository;
-import org.spearce.jgit.treewalk.TreeWalk;
+import org.spearce.jgit.treewalk.CanonicalTreeParser;
/**
* Specialized subclass of RevWalk to include trees, blobs and tags.
@@ -74,7 +74,7 @@
*/
private static final int IN_PENDING = RevWalk.REWRITE;
- private final TreeWalk treeWalk;
+ private CanonicalTreeParser treeWalk;
private BlockObjQueue pendingObjects;
@@ -92,8 +92,8 @@
*/
public ObjectWalk(final Repository repo) {
super(repo);
- treeWalk = new TreeWalk(repo);
pendingObjects = new BlockObjQueue();
+ treeWalk = new CanonicalTreeParser();
}
/**
@@ -240,17 +240,17 @@ public RevObject nextObject() throws MissingObjectException,
fromTreeWalk = false;
if (enterSubtree) {
- treeWalk.enterSubtree();
+ treeWalk = treeWalk.createSubtreeIterator(db, idBuffer, curs);
enterSubtree = false;
}
- while (treeWalk.next()) {
- final FileMode mode = treeWalk.getFileMode(0);
+ for (; !treeWalk.eof(); treeWalk = treeWalk.next()) {
+ final FileMode mode = treeWalk.getEntryFileMode();
final int sType = mode.getObjectType();
switch (sType) {
case Constants.OBJ_BLOB: {
- treeWalk.getObjectId(idBuffer, 0);
+ treeWalk.getEntryObjectId(idBuffer);
final RevObject o = lookupAny(idBuffer, sType);
if ((o.flags & SEEN) != 0)
continue;
@@ -262,7 +262,7 @@ public RevObject nextObject() throws MissingObjectException,
return o;
}
case Constants.OBJ_TREE: {
- treeWalk.getObjectId(idBuffer, 0);
+ treeWalk.getEntryObjectId(idBuffer);
final RevObject o = lookupAny(idBuffer, sType);
if ((o.flags & SEEN) != 0)
continue;
@@ -277,10 +277,11 @@ public RevObject nextObject() throws MissingObjectException,
default:
if (FileMode.GITLINK.equals(mode))
continue;
- treeWalk.getObjectId(idBuffer, 0);
+ treeWalk.getEntryObjectId(idBuffer);
throw new CorruptObjectException("Invalid mode " + mode
+ " for " + idBuffer.name() + " "
- + treeWalk.getPathString() + " in " + currentTree + ".");
+ + treeWalk.getEntryPathString() + " in " + currentTree
+ + ".");
}
}
@@ -295,7 +296,7 @@ public RevObject nextObject() throws MissingObjectException,
continue;
if (o instanceof RevTree) {
currentTree = (RevTree) o;
- treeWalk.reset(currentTree);
+ treeWalk = treeWalk.resetRoot(db, currentTree, curs);
}
return o;
}
@@ -353,7 +354,7 @@ public void checkConnectivity() throws MissingObjectException,
* has no path, such as for annotated tags or root level trees.
*/
public String getPathString() {
- return fromTreeWalk ? treeWalk.getPathString() : null;
+ return fromTreeWalk ? treeWalk.getEntryPathString() : null;
}
@Override
@@ -385,33 +386,34 @@ private void markTreeUninteresting(final RevTree tree)
return;
tree.flags |= UNINTERESTING;
- treeWalk.reset(tree);
- while (treeWalk.next()) {
- final FileMode mode = treeWalk.getFileMode(0);
+ treeWalk = treeWalk.resetRoot(db, tree, curs);
+ for (;!treeWalk.eof(); treeWalk = treeWalk.next()) {
+ final FileMode mode = treeWalk.getEntryFileMode();
final int sType = mode.getObjectType();
switch (sType) {
case Constants.OBJ_BLOB: {
- treeWalk.getObjectId(idBuffer, 0);
+ treeWalk.getEntryObjectId(idBuffer);
lookupAny(idBuffer, sType).flags |= UNINTERESTING;
continue;
}
case Constants.OBJ_TREE: {
- treeWalk.getObjectId(idBuffer, 0);
+ treeWalk.getEntryObjectId(idBuffer);
final RevObject subtree = lookupAny(idBuffer, sType);
if ((subtree.flags & UNINTERESTING) == 0) {
subtree.flags |= UNINTERESTING;
- treeWalk.enterSubtree();
+ treeWalk = treeWalk.createSubtreeIterator(db, idBuffer,
+ curs);
}
continue;
}
default:
if (FileMode.GITLINK.equals(mode))
continue;
- treeWalk.getObjectId(idBuffer, 0);
+ treeWalk.getEntryObjectId(idBuffer);
throw new CorruptObjectException("Invalid mode " + mode
+ " for " + idBuffer.name() + " "
- + treeWalk.getPathString() + " in " + tree + ".");
+ + treeWalk.getEntryPathString() + " in " + tree + ".");
}
}
}
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
index c404cdc..0229582 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
@@ -308,6 +308,26 @@ public ObjectId getEntryObjectId() {
}
/**
+ * Obtain the ObjectId for the current entry.
+ *
+ * @param out
+ * buffer to copy the object id into.
+ */
+ public void getEntryObjectId(final MutableObjectId out) {
+ out.fromRaw(idBuffer(), idOffset());
+ }
+
+ /** @return the file mode of the current entry. */
+ public FileMode getEntryFileMode() {
+ return FileMode.fromBits(mode);
+ }
+
+ /** @return path of the current entry, as a string. */
+ public String getEntryPathString() {
+ return TreeWalk.pathOf(this);
+ }
+
+ /**
* Get the byte array buffer object IDs must be copied out of.
* <p>
* The id buffer contains the bytes necessary to construct an ObjectId for
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java
index df3384d..6a5bada 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java
@@ -52,6 +52,8 @@
/** Parses raw Git trees from the canonical semi-text/semi-binary format. */
public class CanonicalTreeParser extends AbstractTreeIterator {
+ private static final byte[] EMPTY = {};
+
private byte[] raw;
/** First offset within {@link #raw} of the current entry's data. */
@@ -62,7 +64,7 @@
/** Create a new parser. */
public CanonicalTreeParser() {
- // Nothing necessary.
+ raw = EMPTY;
}
private CanonicalTreeParser(final CanonicalTreeParser p) {
@@ -92,6 +94,41 @@ public void reset(final byte[] treeData) {
* messages if data corruption is found.
* @param curs
* window cursor to use during repository access.
+ * @return the root level parser.
+ * @throws MissingObjectException
+ * the object supplied is not available from the repository.
+ * @throws IncorrectObjectTypeException
+ * the object supplied as an argument is not actually a tree and
+ * cannot be parsed as though it were a tree.
+ * @throws IOException
+ * a loose object or pack file could not be read.
+ */
+ public CanonicalTreeParser resetRoot(final Repository repo,
+ final AnyObjectId id, final WindowCursor curs)
+ throws IncorrectObjectTypeException, IOException {
+ CanonicalTreeParser p = this;
+ while (p.parent != null)
+ p = (CanonicalTreeParser) p.parent;
+ p.reset(repo, id, curs);
+ return p;
+ }
+
+ /** @return this iterator, or its parent, if the tree is at eof. */
+ public CanonicalTreeParser next() {
+ next(1);
+ return eof() && parent != null ? (CanonicalTreeParser) parent : this;
+ }
+
+ /**
+ * Reset this parser to walk through the given tree.
+ *
+ * @param repo
+ * repository to load the tree data from.
+ * @param id
+ * identity of the tree being parsed; used only in exception
+ * messages if data corruption is found.
+ * @param curs
+ * window cursor to use during repository access.
* @throws MissingObjectException
* the object supplied is not available from the repository.
* @throws IncorrectObjectTypeException
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
index cf66bf4..d1841b3 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
@@ -596,7 +596,7 @@ public ObjectId getObjectId(final int nth) {
public void getObjectId(final MutableObjectId out, final int nth) {
final AbstractTreeIterator t = trees[nth];
if (t.matches == currentHead)
- out.fromRaw(t.idBuffer(), t.idOffset());
+ t.getEntryObjectId(out);
else
out.clear();
}
@@ -834,7 +834,7 @@ private CanonicalTreeParser parserFor(final AnyObjectId id)
return p;
}
- private static String pathOf(final AbstractTreeIterator t) {
+ static String pathOf(final AbstractTreeIterator t) {
return RawParseUtils.decode(Constants.CHARSET, t.path, 0, t.pathLen);
}
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 06/23] Reduce garbage allocation when using TreeWalk
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-6-git-send-email-spearce@spearce.org>
We now support more efficiently resetting a TreeWalk when the
walker is using a single tree traversal from the repository.
This is very common during an ObjectWalk for a PackWriter, so
optimizing the code path is worth while. By using a special
form of reset we can avoid unnecessary temporary arrays and
also skip some loop conditional instructions.
When diving into a subtree iterator (which is also quite a
common operation in ObjectWalk) we want to reuse an idBuffer
and a WindowCursor as much as possible, ensuring that these
aren't created temporarily on the stack. This should help
to reduce the stress on the Inflater cache by allowing it
to reuse the same Inflater on each tree construction. We
also can avoid creating a temporary ObjectId by filling out
the mutable idBuffer during the repository lookup.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../jgit/pgm/opt/AbstractTreeIteratorHandler.java | 6 ++-
.../src/org/spearce/jgit/revwalk/ObjectWalk.java | 5 +-
.../jgit/transport/WalkFetchConnection.java | 4 +-
.../jgit/treewalk/AbstractTreeIterator.java | 28 ++++++++++
.../spearce/jgit/treewalk/CanonicalTreeParser.java | 46 ++++++++++++----
.../src/org/spearce/jgit/treewalk/TreeWalk.java | 58 ++++++++++++++++++--
6 files changed, 124 insertions(+), 23 deletions(-)
diff --git a/org.spearce.jgit.pgm/src/org/spearce/jgit/pgm/opt/AbstractTreeIteratorHandler.java b/org.spearce.jgit.pgm/src/org/spearce/jgit/pgm/opt/AbstractTreeIteratorHandler.java
index 9432d5f..12d4191 100644
--- a/org.spearce.jgit.pgm/src/org/spearce/jgit/pgm/opt/AbstractTreeIteratorHandler.java
+++ b/org.spearce.jgit.pgm/src/org/spearce/jgit/pgm/opt/AbstractTreeIteratorHandler.java
@@ -51,6 +51,7 @@
import org.spearce.jgit.errors.IncorrectObjectTypeException;
import org.spearce.jgit.errors.MissingObjectException;
import org.spearce.jgit.lib.ObjectId;
+import org.spearce.jgit.lib.WindowCursor;
import org.spearce.jgit.treewalk.AbstractTreeIterator;
import org.spearce.jgit.treewalk.CanonicalTreeParser;
import org.spearce.jgit.treewalk.FileTreeIterator;
@@ -110,8 +111,9 @@ public int parseArguments(final Parameters params) throws CmdLineException {
throw new CmdLineException(name + " is not a tree");
final CanonicalTreeParser p = new CanonicalTreeParser();
+ final WindowCursor curs = new WindowCursor();
try {
- p.reset(clp.getRepository(), clp.getRevWalk().parseTree(id));
+ p.reset(clp.getRepository(), clp.getRevWalk().parseTree(id), curs);
} catch (MissingObjectException e) {
throw new CmdLineException(name + " is not a tree");
} catch (IncorrectObjectTypeException e) {
@@ -119,6 +121,8 @@ public int parseArguments(final Parameters params) throws CmdLineException {
} catch (IOException e) {
throw new CmdLineException("cannot read " + name + ": "
+ e.getMessage());
+ } finally {
+ curs.release();
}
setter.addValue(p);
diff --git a/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java b/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java
index 454cb4a..cfdbe18 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/revwalk/ObjectWalk.java
@@ -45,7 +45,6 @@
import org.spearce.jgit.lib.AnyObjectId;
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.FileMode;
-import org.spearce.jgit.lib.ObjectId;
import org.spearce.jgit.lib.Repository;
import org.spearce.jgit.treewalk.TreeWalk;
@@ -296,7 +295,7 @@ public RevObject nextObject() throws MissingObjectException,
continue;
if (o instanceof RevTree) {
currentTree = (RevTree) o;
- treeWalk.reset(new ObjectId[] { currentTree });
+ treeWalk.reset(currentTree);
}
return o;
}
@@ -386,7 +385,7 @@ private void markTreeUninteresting(final RevTree tree)
return;
tree.flags |= UNINTERESTING;
- treeWalk.reset(new ObjectId[] { tree });
+ treeWalk.reset(tree);
while (treeWalk.next()) {
final FileMode mode = treeWalk.getFileMode(0);
final int sType = mode.getObjectType();
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java b/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
index 6300f10..5d0c6bc 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
@@ -296,7 +296,7 @@ private void processBlob(final RevObject obj) throws TransportException {
private void processTree(final RevObject obj) throws TransportException {
try {
- treeWalk.reset(new ObjectId[] { obj });
+ treeWalk.reset(obj);
while (treeWalk.next()) {
final FileMode mode = treeWalk.getFileMode(0);
final int sType = mode.getObjectType();
@@ -720,7 +720,7 @@ private void markTreeComplete(final RevTree tree) throws IOException {
if (tree.has(COMPLETE))
return;
tree.add(COMPLETE);
- treeWalk.reset(new ObjectId[] { tree });
+ treeWalk.reset(tree);
while (treeWalk.next()) {
final FileMode mode = treeWalk.getFileMode(0);
final int sType = mode.getObjectType();
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
index 5226ab6..c404cdc 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/AbstractTreeIterator.java
@@ -46,8 +46,10 @@
import org.spearce.jgit.errors.IncorrectObjectTypeException;
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.FileMode;
+import org.spearce.jgit.lib.MutableObjectId;
import org.spearce.jgit.lib.ObjectId;
import org.spearce.jgit.lib.Repository;
+import org.spearce.jgit.lib.WindowCursor;
import org.spearce.jgit.treewalk.filter.TreeFilter;
/**
@@ -347,6 +349,32 @@ public abstract AbstractTreeIterator createSubtreeIterator(Repository repo)
throws IncorrectObjectTypeException, IOException;
/**
+ * Create a new iterator for the current entry's subtree.
+ * <p>
+ * The parent reference of the iterator must be <code>this</code>, otherwise
+ * the caller would not be able to exit out of the subtree iterator
+ * correctly and return to continue walking <code>this</code>.
+ *
+ * @param repo
+ * repository to load the tree data from.
+ * @param idBuffer
+ * temporary ObjectId buffer for use by this method.
+ * @param curs
+ * window cursor to use during repository access.
+ * @return a new parser that walks over the current subtree.
+ * @throws IncorrectObjectTypeException
+ * the current entry is not actually a tree and cannot be parsed
+ * as though it were a tree.
+ * @throws IOException
+ * a loose object or pack file could not be read.
+ */
+ public AbstractTreeIterator createSubtreeIterator(final Repository repo,
+ final MutableObjectId idBuffer, final WindowCursor curs)
+ throws IncorrectObjectTypeException, IOException {
+ return createSubtreeIterator(repo);
+ }
+
+ /**
* Is this tree iterator positioned on its first entry?
* <p>
* An iterator is positioned on the first entry if <code>back(1)</code>
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java
index dcc53cd..df3384d 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/CanonicalTreeParser.java
@@ -41,11 +41,14 @@
import org.spearce.jgit.errors.IncorrectObjectTypeException;
import org.spearce.jgit.errors.MissingObjectException;
+import org.spearce.jgit.lib.AnyObjectId;
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.FileMode;
+import org.spearce.jgit.lib.MutableObjectId;
import org.spearce.jgit.lib.ObjectId;
import org.spearce.jgit.lib.ObjectLoader;
import org.spearce.jgit.lib.Repository;
+import org.spearce.jgit.lib.WindowCursor;
/** Parses raw Git trees from the canonical semi-text/semi-binary format. */
public class CanonicalTreeParser extends AbstractTreeIterator {
@@ -87,6 +90,8 @@ public void reset(final byte[] treeData) {
* @param id
* identity of the tree being parsed; used only in exception
* messages if data corruption is found.
+ * @param curs
+ * window cursor to use during repository access.
* @throws MissingObjectException
* the object supplied is not available from the repository.
* @throws IncorrectObjectTypeException
@@ -95,27 +100,46 @@ public void reset(final byte[] treeData) {
* @throws IOException
* a loose object or pack file could not be read.
*/
- public void reset(final Repository repo, final ObjectId id)
+ public void reset(final Repository repo, final AnyObjectId id,
+ final WindowCursor curs)
throws IncorrectObjectTypeException, IOException {
- final ObjectLoader ldr = repo.openObject(id);
- if (ldr == null)
- throw new MissingObjectException(id, Constants.TYPE_TREE);
+ final ObjectLoader ldr = repo.openObject(curs, id);
+ if (ldr == null) {
+ final ObjectId me = id.toObjectId();
+ throw new MissingObjectException(me, Constants.TYPE_TREE);
+ }
final byte[] subtreeData = ldr.getCachedBytes();
- if (ldr.getType() != Constants.OBJ_TREE)
- throw new IncorrectObjectTypeException(id, Constants.TYPE_TREE);
+ if (ldr.getType() != Constants.OBJ_TREE) {
+ final ObjectId me = id.toObjectId();
+ throw new IncorrectObjectTypeException(me, Constants.TYPE_TREE);
+ }
reset(subtreeData);
}
- public CanonicalTreeParser createSubtreeIterator(final Repository repo)
+ @Override
+ public CanonicalTreeParser createSubtreeIterator(final Repository repo,
+ final MutableObjectId idBuffer, final WindowCursor curs)
throws IncorrectObjectTypeException, IOException {
- final ObjectId id = getEntryObjectId();
- if (!FileMode.TREE.equals(mode))
- throw new IncorrectObjectTypeException(id, Constants.TYPE_TREE);
+ idBuffer.fromRaw(idBuffer(), idOffset());
+ if (!FileMode.TREE.equals(mode)) {
+ final ObjectId me = idBuffer.toObjectId();
+ throw new IncorrectObjectTypeException(me, Constants.TYPE_TREE);
+ }
final CanonicalTreeParser p = new CanonicalTreeParser(this);
- p.reset(repo, id);
+ p.reset(repo, idBuffer, curs);
return p;
}
+ public CanonicalTreeParser createSubtreeIterator(final Repository repo)
+ throws IncorrectObjectTypeException, IOException {
+ final WindowCursor curs = new WindowCursor();
+ try {
+ return createSubtreeIterator(repo, new MutableObjectId(), curs);
+ } finally {
+ curs.release();
+ }
+ }
+
@Override
public byte[] idBuffer() {
return raw;
diff --git a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
index 38a726b..cf66bf4 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/treewalk/TreeWalk.java
@@ -44,11 +44,13 @@
import org.spearce.jgit.errors.IncorrectObjectTypeException;
import org.spearce.jgit.errors.MissingObjectException;
import org.spearce.jgit.errors.StopWalkException;
+import org.spearce.jgit.lib.AnyObjectId;
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.FileMode;
import org.spearce.jgit.lib.MutableObjectId;
import org.spearce.jgit.lib.ObjectId;
import org.spearce.jgit.lib.Repository;
+import org.spearce.jgit.lib.WindowCursor;
import org.spearce.jgit.revwalk.RevTree;
import org.spearce.jgit.treewalk.filter.PathFilterGroup;
import org.spearce.jgit.treewalk.filter.TreeFilter;
@@ -99,7 +101,7 @@
* a tree object was not found.
*/
public static TreeWalk forPath(final Repository db, final String path,
- final ObjectId[] trees) throws MissingObjectException,
+ final AnyObjectId[] trees) throws MissingObjectException,
IncorrectObjectTypeException, CorruptObjectException, IOException {
final TreeWalk r = new TreeWalk(db);
r.setFilter(PathFilterGroup.createFromStrings(Collections
@@ -142,6 +144,10 @@ public static TreeWalk forPath(final Repository db, final String path,
private final Repository db;
+ private final MutableObjectId idBuffer = new MutableObjectId();
+
+ private final WindowCursor curs = new WindowCursor();
+
private TreeFilter filter;
AbstractTreeIterator[] trees;
@@ -278,6 +284,46 @@ public void reset() {
}
/**
+ * Reset this walker to run over a single existing tree.
+ *
+ * @param id
+ * the tree we need to parse. The walker will execute over this
+ * single tree if the reset is successful.
+ * @throws MissingObjectException
+ * the given tree object does not exist in this repository.
+ * @throws IncorrectObjectTypeException
+ * the given object id does not denote a tree, but instead names
+ * some other non-tree type of object. Note that commits are not
+ * trees, even if they are sometimes called a "tree-ish".
+ * @throws CorruptObjectException
+ * the object claimed to be a tree, but its contents did not
+ * appear to be a tree. The repository may have data corruption.
+ * @throws IOException
+ * a loose object or pack file could not be read.
+ */
+ public void reset(final AnyObjectId id) throws MissingObjectException,
+ IncorrectObjectTypeException, CorruptObjectException, IOException {
+ if (trees.length == 1) {
+ AbstractTreeIterator o = trees[0];
+ while (o.parent != null)
+ o = o.parent;
+ if (o instanceof CanonicalTreeParser) {
+ o.matches = null;
+ o.matchShift = 0;
+ ((CanonicalTreeParser) o).reset(db, id, curs);
+ trees[0] = o;
+ } else {
+ trees[0] = parserFor(id);
+ }
+ } else {
+ trees = new AbstractTreeIterator[] { parserFor(id) };
+ }
+
+ advance = false;
+ depth = 0;
+ }
+
+ /**
* Reset this walker to run over a set of existing trees.
*
* @param ids
@@ -295,7 +341,7 @@ public void reset() {
* @throws IOException
* a loose object or pack file could not be read.
*/
- public void reset(final ObjectId[] ids) throws MissingObjectException,
+ public void reset(final AnyObjectId[] ids) throws MissingObjectException,
IncorrectObjectTypeException, CorruptObjectException, IOException {
final int oldLen = trees.length;
final int newLen = ids.length;
@@ -311,7 +357,7 @@ public void reset(final ObjectId[] ids) throws MissingObjectException,
if (o instanceof CanonicalTreeParser) {
o.matches = null;
o.matchShift = 0;
- ((CanonicalTreeParser) o).reset(db, ids[i]);
+ ((CanonicalTreeParser) o).reset(db, ids[i], curs);
r[i] = o;
continue;
}
@@ -709,7 +755,7 @@ public void enterSubtree() throws MissingObjectException,
final AbstractTreeIterator t = trees[i];
final AbstractTreeIterator n;
if (t.matches == ch && !t.eof() && FileMode.TREE.equals(t.mode))
- n = t.createSubtreeIterator(db);
+ n = t.createSubtreeIterator(db, idBuffer, curs);
else
n = new EmptyTreeIterator(t);
tmp[i] = n;
@@ -781,10 +827,10 @@ private void exitSubtree() {
currentHead = minRef;
}
- private CanonicalTreeParser parserFor(final ObjectId id)
+ private CanonicalTreeParser parserFor(final AnyObjectId id)
throws IncorrectObjectTypeException, IOException {
final CanonicalTreeParser p = new CanonicalTreeParser();
- p.reset(db, id);
+ p.reset(db, id, curs);
return p;
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 08/23] Remove the unused PackFile.get(ObjectId) form
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-8-git-send-email-spearce@spearce.org>
Everyone passes in a WindowCursor, except this one test case.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../tst/org/spearce/jgit/lib/T0004_PackReader.java | 2 +-
.../src/org/spearce/jgit/lib/PackFile.java | 19 -------------------
2 files changed, 1 insertions(+), 20 deletions(-)
diff --git a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java
index 8288e56..f6fff52 100644
--- a/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java
+++ b/org.spearce.jgit.test/tst/org/spearce/jgit/lib/T0004_PackReader.java
@@ -55,7 +55,7 @@ public void test003_lookupCompressedObject() throws IOException {
id = ObjectId.fromString("902d5476fa249b7abc9d84c611577a81381f0327");
pr = new PackFile(db, TEST_IDX, TEST_PACK);
- or = pr.get(id);
+ or = pr.get(new WindowCursor(), id);
assertNotNull(or);
assertEquals(id, or.getId());
assertEquals(Constants.OBJ_TREE, or.getType());
diff --git a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
index fc1b6ea..cd17bd4 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/lib/PackFile.java
@@ -117,25 +117,6 @@ public boolean hasObject(final AnyObjectId id) {
/**
* Get an object from this pack.
*
- * @param id
- * the object to obtain from the pack. Must not be null.
- * @return the object loader for the requested object if it is contained in
- * this pack; null if the object was not found.
- * @throws IOException
- * the pack file or the index could not be read.
- */
- public PackedObjectLoader get(final AnyObjectId id) throws IOException {
- final WindowCursor wc = new WindowCursor();
- try {
- return get(wc, id);
- } finally {
- wc.release();
- }
- }
-
- /**
- * Get an object from this pack.
- *
* @param curs
* temporary working space associated with the calling thread.
* @param id
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
* [JGIT PATCH 05/23] Change walker based fetch to use TreeWalk's MutableObjectId accessor
From: Shawn O. Pearce @ 2008-12-25 2:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: git
In-Reply-To: <1230171079-17156-5-git-send-email-spearce@spearce.org>
Fetch time is usually dominated by network traffic, but while marking
objects reachable locally we can still benefit from the reduction in
object allocations this variant of getObjectId offers.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../jgit/transport/WalkFetchConnection.java | 29 +++++++++++--------
1 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java b/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
index 91c5ea8..6300f10 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/WalkFetchConnection.java
@@ -59,6 +59,7 @@
import org.spearce.jgit.lib.AnyObjectId;
import org.spearce.jgit.lib.Constants;
import org.spearce.jgit.lib.FileMode;
+import org.spearce.jgit.lib.MutableObjectId;
import org.spearce.jgit.lib.ObjectChecker;
import org.spearce.jgit.lib.ObjectId;
import org.spearce.jgit.lib.PackIndex;
@@ -152,6 +153,8 @@
*/
private final Set<String> packsConsidered;
+ private final MutableObjectId idBuffer = new MutableObjectId();
+
/**
* Errors received while trying to obtain an object.
* <p>
@@ -300,16 +303,17 @@ private void processTree(final RevObject obj) throws TransportException {
switch (sType) {
case Constants.OBJ_BLOB:
- case Constants.OBJ_TREE: {
- final ObjectId sId = treeWalk.getObjectId(0);
- needs(revWalk.lookupAny(sId, sType));
+ case Constants.OBJ_TREE:
+ treeWalk.getObjectId(idBuffer, 0);
+ needs(revWalk.lookupAny(idBuffer, sType));
continue;
- }
+
default:
if (FileMode.GITLINK.equals(mode))
continue;
+ treeWalk.getObjectId(idBuffer, 0);
throw new CorruptObjectException("Invalid mode " + mode
- + " for " + treeWalk.getObjectId(0).name() + " "
+ + " for " + idBuffer.name() + " "
+ treeWalk.getPathString() + " in "
+ obj.getId().name() + ".");
}
@@ -722,14 +726,14 @@ private void markTreeComplete(final RevTree tree) throws IOException {
final int sType = mode.getObjectType();
switch (sType) {
- case Constants.OBJ_BLOB: {
- final ObjectId sid = treeWalk.getObjectId(0);
- revWalk.lookupAny(sid, sType).add(COMPLETE);
+ case Constants.OBJ_BLOB:
+ treeWalk.getObjectId(idBuffer, 0);
+ revWalk.lookupAny(idBuffer, sType).add(COMPLETE);
continue;
- }
+
case Constants.OBJ_TREE: {
- final ObjectId sid = treeWalk.getObjectId(0);
- final RevObject o = revWalk.lookupAny(sid, sType);
+ treeWalk.getObjectId(idBuffer, 0);
+ final RevObject o = revWalk.lookupAny(idBuffer, sType);
if (!o.has(COMPLETE)) {
o.add(COMPLETE);
treeWalk.enterSubtree();
@@ -739,8 +743,9 @@ private void markTreeComplete(final RevTree tree) throws IOException {
default:
if (FileMode.GITLINK.equals(mode))
continue;
+ treeWalk.getObjectId(idBuffer, 0);
throw new CorruptObjectException("Invalid mode " + mode
- + " for " + treeWalk.getObjectId(0).name() + " "
+ + " for " + idBuffer.name() + " "
+ treeWalk.getPathString() + " in " + tree.name() + ".");
}
}
--
1.6.1.rc4.301.g5497a
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox