* Java Inflater problem decompressing packfile
@ 2011-04-16  2:05 madmarcos
  2011-04-16  6:37 ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: madmarcos @ 2011-04-16  2:05 UTC (permalink / raw)
  To: git

This may be better suited for the Java forums but I will ask it here just in
case someone has run into it before.

I have a packfile that I have saved as a file from the git-upload-pack
command. I want to read through the packfile, decompressing each of the
objects. 
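
Before walking the objects, it may be worth confirming that the saved file is a clean pack stream. The sketch below is an illustration only, with hypothetical class and method names. It assumes a version 2 pack from a SHA-1 repository: a 12-byte header ("PACK", version, object count) and a 20-byte trailer holding the SHA-1 of everything before it. If bytes were added or lost when the stream was saved (for example, transport framing left in the file), the trailer check should fail.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of the original test class.
final class PackFileCheck {

    /** Returns the object count from the 12-byte header; throws if "PACK" is missing. */
    static long objectCount(byte[] pack) {
        if (pack.length < 12 + 20
                || pack[0] != 'P' || pack[1] != 'A' || pack[2] != 'C' || pack[3] != 'K') {
            throw new IllegalArgumentException("not a pack: missing PACK signature");
        }
        // Bytes 4..7 hold the version, bytes 8..11 the object count (big-endian).
        return ((pack[8] & 0xffL) << 24) | ((pack[9] & 0xffL) << 16)
                | ((pack[10] & 0xffL) << 8) | (pack[11] & 0xffL);
    }

    /** True if the last 20 bytes equal the SHA-1 of all bytes before them. */
    static boolean trailerMatches(byte[] pack) throws NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(pack, 0, pack.length - 20);
        byte[] expected = Arrays.copyOfRange(pack, pack.length - 20, pack.length);
        return Arrays.equals(sha1.digest(), expected);
    }
}
```

A mismatch here would point at the saved file rather than at the inflater code.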

My little inflater procedure works fine for a tiny HelloWorld project. So, I
decided to mix it up a little and use the jEdit source for a larger test. I
am 99% certain the jEdit.git packfile itself is ok as I have passed it
through directly to eGit's Import using an SSH proxy and eGit unpacked it
just fine.

So, my inflater method decompresses the first 7 objects fine (a commit, a
couple of trees, and several blobs) and a cursory visual inspection of the
decompressed data seems fine. The eighth object becomes a problem, though.
It is a blob with the name build.xml that is 51,060 bytes decompressed
(looking at the original pre-git-pushed jEdit source). The actual file size
matches the decompressed data content size in the packfile object header. 
The inflater procedure writes the decompressed data to System.out for
visual inspection. Roughly the first third looks like the original
build.xml, but after that the output is garbled. The procedure moves on
past the garbled 8th object, but it dies on the 9th object with an
"unknown compression method" error.

So I created a new test inflater that decompresses only the 8th object.
Simply opening the packfile, copying out the compressed data (7793
bytes), and inflating it yields the same XML, with the last two thirds
garbled.

As a further test, I took the original build.xml file, compressed it
using Java's Deflater (yielding a 7793-byte array), and then inflated it
using the same procedure. It decompresses fine; all of the XML is
readable.
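
The round-trip test described above can be sketched roughly as follows. The class and method names are hypothetical. The `firstMismatch` helper rests on an assumption that may not hold: that the Deflater output and the bytes stored in the pack should be identical. A matching length of 7793 suggests this but does not prove it. If they are expected to match, the first differing offset shows where the saved data goes wrong.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for the round-trip experiment.
final class ZlibRoundTrip {

    /** Compresses raw bytes into a zlib stream with the default settings. */
    static byte[] deflate(byte[] raw) {
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!d.finished()) {
            int n = d.deflate(buf);
            out.write(buf, 0, n);
        }
        d.end();
        return out.toByteArray();
    }

    /** Inflates a complete zlib stream, looping until the Inflater reports the end. */
    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inf = new Inflater();
        try {
            inf.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && !inf.finished()) {
                    throw new DataFormatException("stream is truncated or needs a dictionary");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            inf.end();
        }
    }

    /** Index of the first differing byte within len bytes, or -1 if none differ. */
    static int firstMismatch(byte[] a, int aOff, byte[] b, int bOff, int len) {
        for (int i = 0; i < len; i++) {
            if (a[aOff + i] != b[bOff + i]) {
                return i;
            }
        }
        return -1;
    }
}
```

For the case in this thread, the comparison would be something like `firstMismatch(packFile, 9473, deflated, 0, deflated.length)`.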

Now, I have tried several variations of an inflater procedure, including a
patchwork variation of jGit's WindowCursor.inflate method. But they all
yield the same garbled result for the compressed build.xml data.

Any suggestions? I can post some of my ugly code if asked.

Thanks!

--
View this message in context: http://git.661346.n2.nabble.com/Java-Inflater-problem-decompressing-packfile-tp6278154p6278154.html
Sent from the git mailing list archive at Nabble.com.


* Re: Java Inflater problem decompressing packfile
  2011-04-16  2:05 Java Inflater problem decompressing packfile madmarcos
@ 2011-04-16  6:37 ` Jeff King
  2011-04-16 14:23   ` madmarcos
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2011-04-16  6:37 UTC (permalink / raw)
  To: madmarcos; +Cc: git

On Fri, Apr 15, 2011 at 07:05:05PM -0700, madmarcos wrote:

> So, my inflater method decompresses the first 7 objects fine (a commit, a
> couple of trees, and several blobs) and a cursory visual inspection of the
> decompressed data seems fine. The eighth object becomes a problem, though.
> It is a blob with the name build.xml that is 51,060 bytes decompressed
> (looking at the original pre-git-pushed jEdit source). The actual file size
> matches the decompressed data content size in the packfile object header. 
> The inflater procedure writes the decompressed data to System.out for
> visual inspection. Roughly the first third looks like the original
> build.xml, but after that the output is garbled. The procedure moves on
> past the garbled 8th object, but it dies on the 9th object with an
> "unknown compression method" error.

Is it possible that the blob is stored as a delta within the pack? In
that case the pack header will tell you what the eventual size of the
blob will be, but the data will actually be a diff against another pack
object. Does your inflater handle delta-fied objects?

-Peff


* Re: Java Inflater problem decompressing packfile
  2011-04-16  6:37 ` Jeff King
@ 2011-04-16 14:23   ` madmarcos
  2011-04-16 14:36     ` madmarcos
  2011-04-17  4:36     ` Jeff King
  0 siblings, 2 replies; 10+ messages in thread
From: madmarcos @ 2011-04-16 14:23 UTC (permalink / raw)
  To: git

No, my inflater doesn't handle deltas, yet. 
But there are a few reasons why I don't think that's the case.
1. The project has only been pushed once to the git repository before my
tests. No updates to the git repository project or anything like that. 
2. If it were a delta, would the first 1/3 of it be completely normal and
readable? There is no pattern that I can see to the remaining 2/3. It looks
as if the characters in the 2/3 part were interleaved with the other
characters about 10 times.
3. The object type in the header is parsed as 3, or a blob. Aren't the delta
object types higher numbers than that?
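
For reference, here is a minimal sketch of how the type and size can be read from a pack object header. The names are hypothetical. The first byte carries a continuation bit, a 3-bit type, and the low 4 bits of the size; each following byte adds 7 more size bits while its high bit is set. Types 1 to 4 are commit, tree, blob, and tag; 6 and 7 are the two delta types (offset delta and reference delta). For a delta, extra base information follows the header before the zlib data, which this sketch does not parse.

```java
// Hypothetical helper for decoding a pack object header.
final class PackObjectHeader {
    final int type;          // 1=commit, 2=tree, 3=blob, 4=tag, 6=ofs-delta, 7=ref-delta
    final long size;         // inflated size recorded in the header
    final int headerLength;  // number of header bytes consumed

    PackObjectHeader(int type, long size, int headerLength) {
        this.type = type;
        this.size = size;
        this.headerLength = headerLength;
    }

    /** Decodes the type/size header that starts at the given offset. */
    static PackObjectHeader parse(byte[] pack, int offset) {
        int pos = offset;
        int b = pack[pos++] & 0xff;
        int type = (b >> 4) & 0x07;
        long size = b & 0x0f;
        int shift = 4;
        while ((b & 0x80) != 0) {
            b = pack[pos++] & 0xff;
            size |= (long) (b & 0x7f) << shift;
            shift += 7;
        }
        return new PackObjectHeader(type, size, pos - offset);
    }

    boolean isDelta() {
        return type == 6 || type == 7;
    }
}
```

A blob of 51060 bytes encodes as the three bytes 0xB4 0xF7 0x18, which is consistent with "2 extra size bytes" and with compressed data starting 3 bytes after the object offset.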



* Re: Java Inflater problem decompressing packfile
  2011-04-16 14:23   ` madmarcos
@ 2011-04-16 14:36     ` madmarcos
  2011-04-16 14:58       ` madmarcos
  2011-04-17  4:36     ` Jeff King
  1 sibling, 1 reply; 10+ messages in thread
From: madmarcos @ 2011-04-16 14:36 UTC (permalink / raw)
  To: git

One more thing:
While decompressing the garbled object, the procedure throws an
"incorrect data check" error.


* Re: Java Inflater problem decompressing packfile
  2011-04-16 14:36     ` madmarcos
@ 2011-04-16 14:58       ` madmarcos
  2011-04-16 15:50         ` madmarcos
  0 siblings, 1 reply; 10+ messages in thread
From: madmarcos @ 2011-04-16 14:58 UTC (permalink / raw)
  To: git

Here is some code that shows the problem. Sorry if the formatting fails; I
am not sure whether I am supposed to use code tags or something.


		try {
			byte[] packFile = readFile("/Users/marcos/GitProxyCache/jedit.pack");

			//THE BELOW OBJECT DECOMPRESSES FINE
			//Object starts at index 8616
			//Type = 3, Decompressed size = 2248 (uses 2 extra size bytes)
			//byte[] packDataWindow = new byte[8000];
			//System.arraycopy(packFile, 8619, packDataWindow, 0, packDataWindow.length); //works

			//THE BELOW OBJECT FAILS TO INFLATE
			//IT CAUSES an "incorrect data check" error
			//Object starts at index 9470
			//Type = 3, Decompressed size = 51060 (uses 2 extra size bytes)
			byte[] packDataWindow = new byte[8000];
			System.arraycopy(packFile, 9473, packDataWindow, 0, packDataWindow.length); //does not work

			Inflater decompresser = new Inflater();
			decompresser.setInput(packDataWindow, 0, packDataWindow.length);
			byte[] result = new byte[60000];
			int resultLength = 0;
			resultLength = decompresser.inflate(result);
			String outputString = new String(result, 0, resultLength, "UTF-8");
			System.out.println(outputString);
			decompresser.end();
		} catch (Exception e) {
			e.printStackTrace();
		}
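
As an alternative to copying a fixed 8000-byte window, one could inflate straight from the pack array and let the Inflater report where the zlib stream ends. This is only a sketch with hypothetical names (`PackInflate`, `inflateAt`). It assumes the whole pack is in memory and that `dataOffset` points at the first zlib byte after the object header. It does not explain the "incorrect data check" error, but it shows how many compressed bytes were consumed, which can be checked against the expected 7793.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical helper: inflate one object in place and report bytes consumed.
final class PackInflate {

    static final class Result {
        final byte[] data;   // inflated object content
        final int consumed;  // compressed bytes read from the pack

        Result(byte[] data, int consumed) {
            this.data = data;
            this.consumed = consumed;
        }
    }

    static Result inflateAt(byte[] pack, int dataOffset) throws DataFormatException {
        Inflater inf = new Inflater();
        try {
            // Offer everything from dataOffset onward; the Inflater stops
            // consuming input once it reaches the end of the zlib stream.
            inf.setInput(pack, dataOffset, pack.length - dataOffset);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && !inf.finished()) {
                    throw new DataFormatException("zlib stream is truncated or needs a dictionary");
                }
                out.write(buf, 0, n);
            }
            return new Result(out.toByteArray(), (int) inf.getBytesRead());
        } finally {
            inf.end();
        }
    }
}
```

With this, `result.data.length` can be compared against the size from the object header, and the next object would be expected at `dataOffset + result.consumed`.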



* Re: Java Inflater problem decompressing packfile
  2011-04-16 14:58       ` madmarcos
@ 2011-04-16 15:50         ` madmarcos
  2011-04-17  0:40           ` madmarcos
  0 siblings, 1 reply; 10+ messages in thread
From: madmarcos @ 2011-04-16 15:50 UTC (permalink / raw)
  To: git

If you want to play with the packfile, you can find it here:

http://galadriel.cs.utsa.edu/utsadocs/jedit.pack




* Re: Java Inflater problem decompressing packfile
  2011-04-16 15:50         ` madmarcos
@ 2011-04-17  0:40           ` madmarcos
  2011-04-17  4:02             ` madmarcos
  0 siblings, 1 reply; 10+ messages in thread
From: madmarcos @ 2011-04-17  0:40 UTC (permalink / raw)
  To: git

Someone on the Java forums asked whether I knew that the file was being
read completely before inflating. Well... I just assumed so (yes, I know
that is not a good thing to do).
So here is my readFile code in case you want to see it:

	public byte[] readFile(String fileName) {
		byte[] input2 = null;
		File tempPackInputFile2 = new File(fileName);
		InputStream tempPackInputStream2;
		try {
			tempPackInputStream2 = new FileInputStream(tempPackInputFile2);
			long tempPackLength2 = tempPackInputFile2.length();
			input2 = new byte[(int) tempPackLength2];
			// Read in the bytes
			int offset2 = 0;
			int numRead2 = 0;
			while (offset2 < input2.length
					&& (numRead2 = tempPackInputStream2.read(input2, offset2, input2.length - offset2)) >= 0) {
				offset2 += numRead2;
			}
			tempPackInputStream2.close();
		} catch (Exception e) {
			e.printStackTrace();
		}
		return input2;
	}
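
A shorter way to rule out a partial read, assuming a Java version that provides java.nio.file (Java 7 or later), is to let the library read the whole file and fail loudly on error. The class name here is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical replacement for the hand-written read loop.
final class PackFileReader {

    /** Reads the entire file into memory, or throws IOException on any failure. */
    static byte[] readFile(String fileName) throws IOException {
        return Files.readAllBytes(Paths.get(fileName));
    }
}
```

Unlike the loop above, this does not swallow exceptions, so a short or failed read cannot go unnoticed. Comparing the returned length with the file size on disk is a quick sanity check.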


* Re: Java Inflater problem decompressing packfile
  2011-04-17  0:40           ` madmarcos
@ 2011-04-17  4:02             ` madmarcos
  2011-04-17  4:06               ` madmarcos
  0 siblings, 1 reply; 10+ messages in thread
From: madmarcos @ 2011-04-17  4:02 UTC (permalink / raw)
  To: git

My entire testing class is below. Just change the file path string in the
first line of test4.


package server_test2;

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.util.zip.Inflater;

public class InflaterTest2 {

	public InflaterTest2() {
	}

	public void test4() {
		try {
			byte[] packFile = readFile2("/Users/marcos/GitProxyCache/jedit.pack");

			byte [] packDataWindow = new byte[8000];

			//the below object is the .classpath blob. works fine
			System.arraycopy(packFile, 4828 + 2, packDataWindow, 0, packDataWindow.length);

			//the below object is the .project blob. works fine
			//System.arraycopy(packFile, 4978 + 2, packDataWindow, 0, packDataWindow.length);

			//the below object is the top-level tree. works fine
			//System.arraycopy(packFile, 5171 + 2, packDataWindow, 0, packDataWindow.length);

			//the below object is source code notes blob. works fine
			//System.arraycopy(packFile, 5760 + 3, packDataWindow, 0, packDataWindow.length);

			//the below object is a build file blob. works fine
			//System.arraycopy(packFile, 8619, packDataWindow, 0, packDataWindow.length);

			//THE BELOW OBJECT FAILS TO INFLATE
			//IT CAUSES an "incorrect data check" error
			//Object starts at index 9470
			//Type = 3, Decompressed size = 51060 (uses 2 extra size bytes)
			//System.arraycopy(packFile, 9470 + 3, packDataWindow, 0, packDataWindow.length);

			Inflater decompresser = new Inflater();
			decompresser.setInput(packDataWindow, 0, packDataWindow.length);
			byte[] result = new byte[60000];
			int resultLength = 0;
			resultLength = decompresser.inflate(result);
			
			String outputString = new String(result, 0, resultLength, "UTF-8");
			System.out.println(outputString);

			System.out.println("------- End Decompressed Output -------");
			int numCompressed = (int) decompresser.getBytesRead();
			System.out.println("# Bytes Compressed: " + numCompressed);
			decompresser.end();

		} catch (Exception e) {
			e.printStackTrace();
		}
	}

	public byte[] readFile2(String fileName) {
		byte[] input2 = null;
		File tempPackInputFile2 = new File(fileName);
		DataInputStream tempPackInputStream2;
		try {
			tempPackInputStream2 = new DataInputStream(new FileInputStream(tempPackInputFile2));
			long tempPackLength2 = tempPackInputFile2.length();
			input2 = new byte[(int) tempPackLength2];
			tempPackInputStream2.readFully(input2);
			tempPackInputStream2.close();
		} catch (Exception e) {
			e.printStackTrace();
		}
		return input2;
	}

	public static void main(String[] args) {
		InflaterTest2 app = new InflaterTest2();
		app.test4();
		System.out.println("Done.");
	}

}


* Re: Java Inflater problem decompressing packfile
  2011-04-17  4:02             ` madmarcos
@ 2011-04-17  4:06               ` madmarcos
  0 siblings, 0 replies; 10+ messages in thread
From: madmarcos @ 2011-04-17  4:06 UTC (permalink / raw)
  To: git

I hit send too quickly.
Beneath the readFile2 call in the test4() method are several
System.arraycopy lines, all but the first commented out. You can uncomment
any one of them to see that blob or tree decompressed. All but the last
arraycopy line work fine; the last one is the problem.

Thanks for any light you can shed on the problem. I am pulling my hair out.


* Re: Java Inflater problem decompressing packfile
  2011-04-16 14:23   ` madmarcos
  2011-04-16 14:36     ` madmarcos
@ 2011-04-17  4:36     ` Jeff King
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff King @ 2011-04-17  4:36 UTC (permalink / raw)
  To: madmarcos; +Cc: git

On Sat, Apr 16, 2011 at 07:23:52AM -0700, madmarcos wrote:

> No, my inflater doesn't handle deltas, yet. 
> But there are a few reasons why I don't think that's the case.
> 1. The project has only been pushed once to the git repository before my
> tests. No updates to the git repository project or anything like that. 

You can still have deltas between blobs in a single commit, but only if
you have similar blobs.

> 2. If it were a delta, would the first 1/3 of it be completely normal and
> readable? There is no pattern that I can see to the remaining 2/3. It looks
> as if the characters in the 2/3 part were interleaved with the other
> characters about 10 times.

It wouldn't be completely normal, but you might see chunks of the file
along with binary patch instructions (like "put this chunk at offset
...").
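
For reference, a minimal sketch of what applying such a delta involves, written in Java since that is what the code in this thread uses. The names are hypothetical and this is not Git's or jGit's actual code. It assumes the inflated delta layout used in packs: two variable-length sizes (base size, then result size), followed by instructions that either copy a range from the base object or insert literal bytes carried in the delta itself.

```java
// Hypothetical helper: apply an inflated pack delta to its base object.
final class GitDelta {

    static byte[] apply(byte[] base, byte[] delta) {
        int[] pos = {0};
        long baseSize = readSize(delta, pos);
        if (baseSize != base.length) {
            throw new IllegalArgumentException("base size mismatch");
        }
        byte[] out = new byte[(int) readSize(delta, pos)];
        int o = 0;
        int p = pos[0];
        while (p < delta.length) {
            int cmd = delta[p++] & 0xff;
            if ((cmd & 0x80) != 0) {
                // Copy instruction: the low 7 bits say which offset/size bytes follow.
                int off = 0;
                int len = 0;
                if ((cmd & 0x01) != 0) off |= (delta[p++] & 0xff);
                if ((cmd & 0x02) != 0) off |= (delta[p++] & 0xff) << 8;
                if ((cmd & 0x04) != 0) off |= (delta[p++] & 0xff) << 16;
                if ((cmd & 0x08) != 0) off |= (delta[p++] & 0xff) << 24;
                if ((cmd & 0x10) != 0) len |= (delta[p++] & 0xff);
                if ((cmd & 0x20) != 0) len |= (delta[p++] & 0xff) << 8;
                if ((cmd & 0x40) != 0) len |= (delta[p++] & 0xff) << 16;
                if (len == 0) len = 0x10000;  // a size of zero means 64 KiB
                System.arraycopy(base, off, out, o, len);
                o += len;
            } else if (cmd != 0) {
                // Insert instruction: the next cmd bytes are literal data.
                System.arraycopy(delta, p, out, o, cmd);
                p += cmd;
                o += cmd;
            } else {
                throw new IllegalArgumentException("reserved delta opcode 0");
            }
        }
        if (o != out.length) {
            throw new IllegalArgumentException("result size mismatch");
        }
        return out;
    }

    /** Reads a little-endian base-128 size; the high bit marks continuation. */
    private static long readSize(byte[] delta, int[] pos) {
        long value = 0;
        int shift = 0;
        int b;
        do {
            b = delta[pos[0]++] & 0xff;
            value |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }
}
```

The literal bytes of insert instructions are why chunks of readable file content can show up in an inflated delta, mixed with binary opcodes.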

> 3. The object type in the header is parsed as 3, or a blob. Aren't the delta
> object types higher numbers than that?

Yeah, if it is coming up as type 3, then it should definitely be a
whole, literal blob.

As far as your Java code, I don't see anything overtly wrong, but then,
I know absolutely nothing about any of the classes you are using.

-Peff

