* Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-16 2:05 UTC
To: git

This may be better suited for the Java forums, but I will ask it here just in case someone has run into it before.

I have a packfile that I saved to a file from the output of the git-upload-pack command. I want to read through the packfile, decompressing each of the objects. My little inflater procedure works fine for a tiny HelloWorld project, so I decided to mix it up a little and use the jEdit source for a larger test. I am 99% certain the jEdit.git packfile itself is OK, as I have passed it directly through to eGit's Import using an SSH proxy and eGit unpacked it just fine.

My inflater method decompresses the first 7 objects fine (a commit, a couple of trees, and several blobs), and a cursory visual inspection of the decompressed data seems fine. The eighth object becomes a problem, though. It is a blob with the name build.xml that is 51,060 bytes decompressed (looking at the original pre-git-pushed jEdit source). The actual file size matches the decompressed size given in the packfile object header. The inflater procedure outputs the decompressed data to System.out for visual inspection. Approximately the first 1/3 looks like the original build.xml, but after that the output is garbled. The procedure continues decompressing objects after the garbled 8th object, but it dies on the 9th object with an "unknown compression method" error.

So I created a new test inflater to focus only on decompressing the 8th object. Simply opening the packfile, copying out the compressed data (7793 bytes), and inflating it yields the same 2/3 garbled XML. As a further test, I then took the original build.xml file, compressed it using Java's Deflater (yielding a 7793-byte array), and inflated it using the same procedure, and it decompresses fine. All of the XML is readable.

I have tried several variations of an inflater procedure, including a patchwork variation of jGit's WindowCursor.inflate method, but they all yield the same garbled result for the compressed build.xml data.

Any suggestions? I can post some of my ugly code if asked.

Thanks!

--
View this message in context: http://git.661346.n2.nabble.com/Java-Inflater-problem-decompressing-packfile-tp6278154p6278154.html
Sent from the git mailing list archive at Nabble.com.

* Re: Java Inflater problem decompressing packfile
From: Jeff King @ 2011-04-16 6:37 UTC
To: madmarcos; +Cc: git

On Fri, Apr 15, 2011 at 07:05:05PM -0700, madmarcos wrote:

> So, my inflater method decompresses the first 7 objects fine (a commit, a
> couple of trees, and several blobs) and a cursory visual inspection of the
> decompressed data seems fine. The eighth object becomes a problem, though.
> It is a blob with the name build.xml that is 51,060 bytes decompressed
> (looking at the original pre-git-pushed jEdit source). The actual file size
> matches the decompressed data content size in the packfile object header.
> The inflater procedure outputs the decompressed data to System.out for
> visual inspection. Approximately the first 1/3 looks like the original
> build.xml but after that, the output is garbled. The procedure continues
> decompressing objects after the 8th, but garbled, object but it dies on the
> 9th object with an "unknown compression method" error.

Is it possible that the blob is stored as a delta within the pack? In that case the pack header will tell you what the eventual size of the blob will be, but the data will actually be a diff against another pack object. Does your inflater handle delta-fied objects?

-Peff

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-16 14:23 UTC
To: git

No, my inflater doesn't handle deltas yet. But there are a few reasons why I don't think that's the case.

1. The project has only been pushed once to the git repository before my tests. No updates to the git repository project or anything like that.
2. If it were a delta, would the first 1/3 of it be completely normal and readable? There is no pattern that I can see to the remaining 2/3. It looks as if the characters in the 2/3 part were interleaved with the other characters about 10 times.
3. The object type in the header is parsed as 3, or a blob. Aren't the delta object types higher numbers than that?
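
For readers following point 3: the per-object header in a packfile can be decoded with a few lines of Java. The sketch below is illustrative only and is not the poster's code; the class name `PackObjectHeader` and its fields are made up for this example. It assumes the layout in git's pack-format documentation: the first byte carries a continuation bit, a 3-bit type, and the low 4 bits of the size, and each following byte adds 7 more size bits. Types 1 to 4 are commit, tree, blob, and tag; types 6 and 7 are the two delta types (ofs-delta and ref-delta), which carry extra base information between this header and the zlib data.

```java
public class PackObjectHeader {
    public final int type;         // 1=commit, 2=tree, 3=blob, 4=tag, 6=ofs-delta, 7=ref-delta
    public final long size;        // size recorded in the header
    public final int headerLength; // number of header bytes consumed

    private PackObjectHeader(int type, long size, int headerLength) {
        this.type = type;
        this.size = size;
        this.headerLength = headerLength;
    }

    /** Decodes the type/size header that starts at the given offset. */
    public static PackObjectHeader parse(byte[] pack, int offset) {
        int pos = offset;
        int b = pack[pos++] & 0xff;
        int type = (b >> 4) & 0x07;   // bits 6..4 hold the object type
        long size = b & 0x0f;         // bits 3..0 hold the low 4 size bits
        int shift = 4;
        while ((b & 0x80) != 0) {     // high bit set: another size byte follows
            b = pack[pos++] & 0xff;
            size |= (long) (b & 0x7f) << shift;
            shift += 7;
        }
        return new PackObjectHeader(type, size, pos - offset);
    }

    public boolean isDelta() {
        return type == 6 || type == 7;
    }

    public static void main(String[] args) {
        // Hand-encoded header for a blob (type 3) whose size is 51060 bytes.
        byte[] header = {(byte) 0xB4, (byte) 0xF7, 0x18};
        PackObjectHeader h = parse(header, 0);
        System.out.println("type=" + h.type + " size=" + h.size
                + " headerLength=" + h.headerLength + " delta=" + h.isDelta());
    }
}
```

A size of 51060 needs three header bytes in this encoding, which is consistent with the "2 extra size bytes" mentioned later in the thread.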

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-16 14:36 UTC
To: git

One more thing: when it is decompressing the garbled object, it throws an "incorrect data check" error.
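
Some background on that message, as a hedged aside: a zlib stream ends with an Adler-32 checksum of the uncompressed data, and zlib reports a mismatch between that trailer and the bytes it actually produced as "incorrect data check". The sketch below is not taken from the thread. It uses a made-up class `AdlerCheckDemo` that compresses a sample string with `Deflater`, damages the trailer, and shows that `Inflater` then refuses the stream. It says nothing about where the damage in the poster's packfile comes from.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class AdlerCheckDemo {

    /** Compresses raw bytes into a complete zlib stream (header, data, Adler-32). */
    public static byte[] deflate(byte[] raw) {
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        return out.toByteArray();
    }

    /** Returns null if the stream inflates cleanly, otherwise an error description. */
    public static String inflateError(byte[] zlibData) {
        Inflater inf = new Inflater();
        inf.setInput(zlibData);
        byte[] buf = new byte[1024];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && !inf.finished()
                        && (inf.needsInput() || inf.needsDictionary())) {
                    return "stream ended early or needs a preset dictionary";
                }
            }
            return null;
        } catch (DataFormatException e) {
            return e.getMessage() != null ? e.getMessage() : "data format error";
        } finally {
            inf.end();
        }
    }

    public static void main(String[] args) {
        byte[] z = deflate("<project name=\"demo\"/>".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact stream:  " + inflateError(z));
        z[z.length - 1] ^= 0x01; // flip one bit in the Adler-32 trailer
        System.out.println("damaged stream: " + inflateError(z));
    }
}
```

The checksum is only verified at the end of the stream, so an inflater can emit a long run of output before it notices that something upstream was wrong.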

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-16 14:58 UTC
To: git

Here is some code that shows the problem. Sorry if the formatting fails; I am not sure if I am supposed to use code tags or something.

    try {
        byte[] packFile = readFile("/Users/marcos/GitProxyCache/jedit.pack");

        //THE BELOW OBJECT DECOMPRESSES FINE
        //Object starts at index 8616
        //Type = 3, Decompressed size = 2248 (uses 2 extra size bytes)
        //byte [] packDataWindow = new byte[8000];
        //System.arraycopy(packFile, 8619, packDataWindow, 0, packDataWindow.length); //works

        //THE BELOW OBJECT FAILS TO INFLATE
        //IT CAUSES an "incorrect data check" error
        //Object starts at index 9470
        //Type = 3, Decompressed size = 51060 (uses 2 extra size bytes)
        byte [] packDataWindow = new byte[8000];
        System.arraycopy(packFile, 9473, packDataWindow, 0, packDataWindow.length); //does not work

        Inflater decompresser = new Inflater();
        decompresser.setInput(packDataWindow, 0, packDataWindow.length);
        byte[] result = new byte[60000];
        int resultLength = 0;
        resultLength = decompresser.inflate(result);
        String outputString = new String(result, 0, resultLength, "UTF-8");
        System.out.println(outputString);
        decompresser.end();
    } catch (Exception e) {
        e.printStackTrace();
    }
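
As a point of comparison, here is one way the inflate step could be written so that it does not depend on a fixed 8000-byte window or a single `inflate()` call. This is a sketch under stated assumptions, not a fix for the problem reported above: the class `PackInflateSketch`, its `Result` type, and the `inflateAt` method are hypothetical names. It hands the inflater everything from the data offset to the end of the array, loops until `finished()`, compares the output length with the size from the object header, and uses `getBytesRead()` to learn how many compressed bytes the object occupied, which is what locates the next object.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class PackInflateSketch {

    /** Inflated bytes plus the number of compressed bytes the stream occupied. */
    public static final class Result {
        public final byte[] data;
        public final int consumed;
        Result(byte[] data, int consumed) {
            this.data = data;
            this.consumed = consumed;
        }
    }

    /** Inflates the zlib stream starting at dataOffset (just past the object header). */
    public static Result inflateAt(byte[] pack, int dataOffset, long expectedSize)
            throws DataFormatException {
        Inflater inf = new Inflater();
        try {
            // Offer all remaining bytes; the inflater stops at the end of the stream.
            inf.setInput(pack, dataOffset, pack.length - dataOffset);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && !inf.finished()
                        && (inf.needsInput() || inf.needsDictionary())) {
                    throw new DataFormatException("zlib stream ended early");
                }
                out.write(buf, 0, n);
            }
            if (out.size() != expectedSize) {
                throw new DataFormatException(
                        "expected " + expectedSize + " bytes, got " + out.size());
            }
            return new Result(out.toByteArray(), (int) inf.getBytesRead());
        } finally {
            inf.end();
        }
    }

    /** Helper for the demo only: produces a complete zlib stream. */
    public static byte[] deflate(byte[] raw) {
        Deflater d = new Deflater();
        d.setInput(raw);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!d.finished()) {
            out.write(buf, 0, d.deflate(buf));
        }
        d.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        // Two zlib streams stored back to back, standing in for adjacent pack objects.
        byte[] a = "first object".getBytes(StandardCharsets.UTF_8);
        byte[] b = "second object".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream region = new ByteArrayOutputStream();
        byte[] za = deflate(a);
        byte[] zb = deflate(b);
        region.write(za, 0, za.length);
        region.write(zb, 0, zb.length);
        byte[] bytes = region.toByteArray();

        Result first = inflateAt(bytes, 0, a.length);
        Result second = inflateAt(bytes, first.consumed, b.length);
        System.out.println(new String(first.data, StandardCharsets.UTF_8));
        System.out.println(new String(second.data, StandardCharsets.UTF_8));
    }
}
```

In a real pack the second call would start at the next object's header rather than directly at `first.consumed`, since each object has its own type/size header in front of its zlib data.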

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-16 15:50 UTC
To: git

If you want to play with the packfile, you can find it here:

http://galadriel.cs.utsa.edu/utsadocs/jedit.pack

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-17 0:40 UTC
To: git

Someone on the Java forums asked if I knew that the file was being read completely before inflating. Well... I just assumed (yes, I know, not a good thing to do). So here is my readFile code in case you want to see it:

    public byte [] readFile(String fileName) {
        byte [] input2 = null;
        File tempPackInputFile2 = new File(fileName);
        InputStream tempPackInputStream2;
        try {
            tempPackInputStream2 = new FileInputStream(tempPackInputFile2);
            long tempPackLength2 = tempPackInputFile2.length();
            input2 = new byte[(int) tempPackLength2];
            // Read in the bytes
            int offset2 = 0;
            int numRead2 = 0;
            while (offset2 < input2.length
                    && (numRead2 = tempPackInputStream2.read(input2, offset2, input2.length-offset2)) >= 0) {
                offset2 += numRead2;
            }
            tempPackInputStream2.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return input2;
    }
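
For anyone who wants to rule out the file-reading step quickly, a shorter variant is possible with `java.nio.file.Files` (available from Java 7 onward). The sketch below is an illustration with hypothetical names (`ReadWholeFile`, `packObjectCount`), not the poster's code. It reads the whole file, lets I/O errors propagate instead of printing them and returning null, and adds a sanity check of the 12-byte pack header, which per git's pack-format documentation is the signature "PACK", a 4-byte version, and a 4-byte object count, both big-endian.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadWholeFile {

    /** Reads the entire file into memory, failing loudly on I/O errors. */
    public static byte[] readFile(String fileName) throws IOException {
        Path path = Paths.get(fileName);
        byte[] data = Files.readAllBytes(path);
        if (data.length != Files.size(path)) {
            throw new IOException("short read: got " + data.length + " bytes");
        }
        return data;
    }

    /** Checks the "PACK" signature and returns the object count from the header. */
    public static long packObjectCount(byte[] pack) {
        if (pack.length < 12
                || pack[0] != 'P' || pack[1] != 'A' || pack[2] != 'C' || pack[3] != 'K') {
            throw new IllegalArgumentException("data does not start with a PACK header");
        }
        // Bytes 4..7 hold the version, bytes 8..11 the object count (big-endian).
        return ByteBuffer.wrap(pack, 8, 4).getInt() & 0xffffffffL;
    }

    public static void main(String[] args) throws IOException {
        // The path is supplied by the caller; no particular location is assumed.
        byte[] pack = readFile(args[0]);
        System.out.println("bytes read: " + pack.length);
        System.out.println("objects in pack: " + packObjectCount(pack));
    }
}
```

If the signature check fails, the saved file starts with something other than raw pack data, which would be worth knowing before debugging the inflater any further.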

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-17 4:02 UTC
To: git

My entire testing class is below. Just change the file path string in the first line of test4.

    package server_test2;

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.util.zip.Inflater;

    public class InflaterTest2 {

        public InflaterTest2() {
        }

        public void test4() {
            try {
                byte[] packFile = readFile2("/Users/marcos/GitProxyCache/jedit.pack");
                byte [] packDataWindow = new byte[8000];

                //the below object is the .classpath blob. works fine
                System.arraycopy(packFile, 4828 + 2, packDataWindow, 0, packDataWindow.length);

                //the below object is the .project blob. works fine
                //System.arraycopy(packFile, 4978 + 2, packDataWindow, 0, packDataWindow.length);

                //the below object is the top-level tree. works fine
                //System.arraycopy(packFile, 5171 + 2, packDataWindow, 0, packDataWindow.length);

                //the below object is source code notes blob. works fine
                //System.arraycopy(packFile, 5760 + 3, packDataWindow, 0, packDataWindow.length);

                //the below object is a build file blob. works fine
                //System.arraycopy(packFile, 8619, packDataWindow, 0, packDataWindow.length);

                //THE BELOW OBJECT FAILS TO INFLATE
                //IT CAUSES an "incorrect data check" error
                //Object starts at index 9470
                //Type = 3, Decompressed size = 51060 (uses 2 extra size bytes)
                //System.arraycopy(packFile, 9470 + 3, packDataWindow, 0, packDataWindow.length);

                Inflater decompresser = new Inflater();
                decompresser.setInput(packDataWindow, 0, packDataWindow.length);
                byte[] result = new byte[60000];
                int resultLength = 0;
                resultLength = decompresser.inflate(result);
                String outputString = new String(result, 0, resultLength, "UTF-8");
                System.out.println(outputString);
                System.out.println("------- End Decompressed Output -------");
                int numCompressed = (int) decompresser.getBytesRead();
                System.out.println("# Bytes Compressed: " + numCompressed);
                decompresser.end();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public byte [] readFile2(String fileName) {
            byte [] input2 = null;
            File tempPackInputFile2 = new File(fileName);
            DataInputStream tempPackInputStream2;
            try {
                tempPackInputStream2 = new DataInputStream(new FileInputStream(tempPackInputFile2));
                long tempPackLength2 = tempPackInputFile2.length();
                input2 = new byte[(int) tempPackLength2];
                tempPackInputStream2.readFully(input2);
                tempPackInputStream2.close();
            } catch (Exception e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            return input2;
        }

        public static void main(String[] args) {
            InflaterTest2 app = new InflaterTest2();
            app.test4();
            System.out.println("Done.");
        }
    }

* Re: Java Inflater problem decompressing packfile
From: madmarcos @ 2011-04-17 4:06 UTC
To: git

I hit send too quickly. Beneath the readFile2 call in the test4() method are several System.arraycopy lines, all but the first commented out. You can uncomment any one of them to see that blob or tree decompressed. All but the last arraycopy line work fine; the last one is the problem.

Thanks for any light you can shed on the problem. I am pulling my hair out.

* Re: Java Inflater problem decompressing packfile
From: Jeff King @ 2011-04-17 4:36 UTC
To: madmarcos; +Cc: git

On Sat, Apr 16, 2011 at 07:23:52AM -0700, madmarcos wrote:

> No, my inflater doesn't handle deltas, yet.
> But there are a few reasons why I don't think that's the case.
> 1. The project has only been pushed once to the git repository before my
> tests. No updates to the git repository project or anything like that.

You can still have deltas between blobs in a single commit, but only if you have similar blobs.

> 2. If it were a delta, would the first 1/3 of it be completely normal and
> readable? There is no pattern that I can see to the remaining 2/3. It looks
> as if the characters in the 2/3 part were interleaved with the other
> characters about 10 times.

It wouldn't be completely normal, but you might see chunks of the file along with binary patch instructions (like "put this chunk at offset ...").

> 3. The object type in the header is parsed as 3, or a blob. Aren't the delta
> object types higher numbers than that?

Yeah, if it is coming up as type 3, then it should definitely be a whole, literal blob.

As far as your Java code, I don't see anything overtly wrong, but then, I know absolutely nothing about any of the classes you are using.

-Peff
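
To make the "binary patch instructions" remark concrete, here is a small sketch of how an already-inflated delta could be applied to its base object. It is an illustration only: the class `DeltaApply` is hypothetical, and the layout is assumed from git's pack-format documentation. The delta starts with two variable-length sizes (base size, then result size), followed by instructions. An instruction byte with the high bit set copies a range from the base, with the low seven bits saying which offset and length bytes follow; an instruction byte with the high bit clear inserts that many literal bytes taken from the delta itself.

```java
import java.nio.charset.StandardCharsets;

public class DeltaApply {

    /** Applies an inflated delta to its base object and returns the result. */
    public static byte[] apply(byte[] base, byte[] delta) {
        int[] pos = {0};
        long baseSize = readSize(delta, pos);
        long resultSize = readSize(delta, pos);
        if (baseSize != base.length) {
            throw new IllegalArgumentException("delta was made for a different base");
        }
        byte[] out = new byte[(int) resultSize];
        int outPos = 0;
        int p = pos[0];
        while (p < delta.length) {
            int cmd = delta[p++] & 0xff;
            if ((cmd & 0x80) != 0) {
                // Copy instruction: optional offset bytes, then optional length bytes.
                int off = 0;
                int len = 0;
                if ((cmd & 0x01) != 0) off |= (delta[p++] & 0xff);
                if ((cmd & 0x02) != 0) off |= (delta[p++] & 0xff) << 8;
                if ((cmd & 0x04) != 0) off |= (delta[p++] & 0xff) << 16;
                if ((cmd & 0x08) != 0) off |= (delta[p++] & 0xff) << 24;
                if ((cmd & 0x10) != 0) len |= (delta[p++] & 0xff);
                if ((cmd & 0x20) != 0) len |= (delta[p++] & 0xff) << 8;
                if ((cmd & 0x40) != 0) len |= (delta[p++] & 0xff) << 16;
                if (len == 0) len = 0x10000; // a length of zero means 64 KiB
                System.arraycopy(base, off, out, outPos, len);
                outPos += len;
            } else if (cmd != 0) {
                // Insert instruction: the next cmd bytes are literal data.
                System.arraycopy(delta, p, out, outPos, cmd);
                p += cmd;
                outPos += cmd;
            } else {
                throw new IllegalArgumentException("delta opcode 0 is reserved");
            }
        }
        if (outPos != out.length) {
            throw new IllegalArgumentException("delta produced " + outPos + " bytes");
        }
        return out;
    }

    /** Reads a little-endian base-128 size, advancing pos[0]. */
    private static long readSize(byte[] delta, int[] pos) {
        long size = 0;
        int shift = 0;
        int b;
        do {
            b = delta[pos[0]++] & 0xff;
            size |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return size;
    }

    public static void main(String[] args) {
        byte[] base = "hello world".getBytes(StandardCharsets.UTF_8);
        // Hand-built delta: copy "hello ", insert "big ", copy "world".
        byte[] delta = {0x0b, 0x0f, (byte) 0x90, 0x06,
                        0x04, 'b', 'i', 'g', ' ',
                        (byte) 0x91, 0x06, 0x05};
        System.out.println(new String(apply(base, delta), StandardCharsets.UTF_8));
    }
}
```

The insert instructions are why fragments of readable file content can show up inside a raw delta, which matches the description above. As the reply notes, though, an object whose header says type 3 is stored whole, so none of this applies to the build.xml blob in question.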