From: Chico Sokol <chico.sokol@gmail.com>
To: Shawn Pearce <spearce@spearce.org>
Cc: John Szakmeister <john@szakmeister.net>, git <git@vger.kernel.org>
Subject: Re: Reading commit objects
Date: Wed, 22 May 2013 11:20:44 -0300
Message-ID: <CABx5MBS9YgNmZD_tumMJ-MJVjHbRFCKbCjs9AZ347-OCwqO7qQ@mail.gmail.com>
In-Reply-To: <CAJo=hJtqACW+CR5FkmDfwyK1Wg3Kcppy6DbW7P=On_qJyvsYvQ@mail.gmail.com>

I'm not criticizing JGit, guys. It simply doesn't fit our needs.
We're not interested in mapping git commands to Java, and we don't have
the same RAM limitations.

I know the JGit team is doing a great job, and we do not intend to build
a library with such completeness.

Are you guys JGit contributors? Can you point me to the code that
unpacks git objects? The closest I could get was this class:
https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/UnpackedObject.java

There seem to be a standard and a non-standard format for the packed
object, judging by the comments in this method:
https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/UnpackedObject.java#L272

I suspect that the default inflater class of the Java API expects the
object to be in the standard format.

What does the following comment mean? What is the "Experimental
pack-based" format? Are there any docs on the spec for it?

    We must determine if the buffer contains the standard
    zlib-deflated stream or the experimental format based
    on the in-pack object format. Compare the header byte
    for each format:

    RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
    Experimental pack-based : Stttssss : ttt = 1,2,3,4

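
As I read it, the header-byte rule in that comment could be sketched roughly as below. The class and method names (LooseObjectHeader, looksLikeZlib) are made up for illustration and are not JGit's. The divisible-by-31 test is not in the comment quoted above; it is the header check that RFC 1950 defines for the first two bytes, added here as an assumption about how to make the guess more reliable.

```java
public final class LooseObjectHeader {
    private LooseObjectHeader() {}

    /**
     * Guess whether a buffer starts with an RFC 1950 zlib header
     * rather than the "experimental pack-based" header byte.
     * Hypothetical helper, not part of JGit.
     */
    public static boolean looksLikeZlib(byte[] hdr) {
        if (hdr.length < 2) {
            return false;
        }
        int b0 = hdr[0] & 0xff;
        int b1 = hdr[1] & 0xff;
        // 0www1000: top bit clear, low nibble 8 (compression method = deflate).
        boolean deflateHeader = (b0 & 0x8f) == 0x08;
        // RFC 1950: the first two bytes, read as a 16-bit big-endian
        // number, must be a multiple of 31.
        boolean checkOk = ((b0 << 8) | b1) % 31 == 0;
        return deflateHeader && checkOk;
    }
}
```

A buffer that fails this test would then have to be handled as the other format (or rejected); that part is not sketched here.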
--
Chico Sokol
On Wed, May 22, 2013 at 2:59 AM, Shawn Pearce <spearce@spearce.org> wrote:
> On Tue, May 21, 2013 at 3:18 PM, Chico Sokol <chico.sokol@gmail.com> wrote:
>> Ok, we discovered that the commit object actually contains the tree
>> object's sha1, by reading its contents with python zlib library.
>>
>> So the bug must be with our java code (we're building a java lib).
>>
>> Is there any non-standard issue in git's zlib compression? We're
>> decompressing its contents with java default zlib api, so it should
>> work normally, here's our code, that's printing that wrong output:
>>
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.util.zip.InflaterInputStream;
>> import org.apache.commons.io.IOUtils;
>> ...
>> File obj = new File(".git/objects/25/0f67ef017fcb97b5371a302526872cfcadad21");
>> InflaterInputStream inflaterInputStream = new InflaterInputStream(new
>> FileInputStream(obj));
>> System.out.println(IOUtils.readLines(inflaterInputStream));
> ...
>>>> Currently, we're trying to parse commit objects. After decompressing
>>>> the contents of a commit object file we got the following output:
>>>>
>>>> commit 191
>>>> author Francisco Sokol <chico.sokol@gmail.com> 1369140112 -0300
>>>> committer Francisco Sokol <chico.sokol@gmail.com> 1369140112 -0300
>>>>
>>>> first commit
>
> Your code is broken. IOUtils is probably corrupting what you get back.
> After inflating the stream you should see the object type ("commit"),
> space, its length in bytes as a base 10 string, and then a NUL ('\0').
> Following that is the tree line, and parent(s) if any. I wonder if
> IOUtils discarded the remainder of the line after the NUL and did not
> consider the tree line.
>
> And you wonder why JGit code is confusing. We can't rely on "standard
> Java APIs" to do the right thing, because commonly used libraries have
> made assumptions that disagree with the way Git works.
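
For reference, the layout Shawn describes (type, space, decimal length, NUL, then the object body) can be read with plain java.util.zip by treating the inflated data as bytes and splitting at the first NUL, instead of reading it as lines of text. This is only a minimal sketch: LooseObjectReader and LooseObject are hypothetical names, and it assumes the file is in the standard zlib format, not the experimental one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.InflaterInputStream;

public final class LooseObjectReader {
    /** Parsed loose object: header fields plus the raw body bytes. */
    public static final class LooseObject {
        public final String type;
        public final int size;
        public final byte[] body;

        LooseObject(String type, int size, byte[] body) {
            this.type = type;
            this.size = size;
            this.body = body;
        }
    }

    private LooseObjectReader() {}

    public static LooseObject read(InputStream compressed) throws IOException {
        // Inflate everything into memory as raw bytes.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in = new InflaterInputStream(compressed)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        byte[] raw = out.toByteArray();

        // The header is "<type> <size>" terminated by a single NUL byte.
        int nul = 0;
        while (nul < raw.length && raw[nul] != 0) {
            nul++;
        }
        if (nul == raw.length) {
            throw new IOException("missing NUL after object header");
        }
        String header = new String(raw, 0, nul, StandardCharsets.US_ASCII);
        int sp = header.indexOf(' ');
        if (sp < 0) {
            throw new IOException("malformed object header: " + header);
        }
        String type = header.substring(0, sp);
        int size = Integer.parseInt(header.substring(sp + 1));

        // Everything after the NUL is the object body.
        byte[] body = Arrays.copyOfRange(raw, nul + 1, raw.length);
        if (body.length != size) {
            throw new IOException("size mismatch: header says " + size
                    + ", body has " + body.length + " bytes");
        }
        return new LooseObject(type, size, body);
    }
}
```

Called with a FileInputStream on the object file, this returns the type and the body separately. For a commit, the body decoded as text should begin with the tree line, followed by any parent lines.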