From: Prasanta Sadhukhan <Prasanta.Sadhukhan@Sun.COM>
To: Linux C Programming List <linux-c-programming@vger.kernel.org>,
	Glynn Clements <glynn@gclements.plus.com>
Subject: Re: code to list contents of zip files
Date: Thu, 29 Oct 2009 15:03:52 +0530
Message-ID: <4AE96180.8000807@sun.com>
In-Reply-To: <4AE94373.8020702@sun.com>

[-- Attachment #1: Type: text/plain, Size: 2710 bytes --]

Hi All,

Is it possible to extract the contents of one particular file from a zip
file? For example, with the attached ziptest.c I want to read
Class3.class from testclasses.zip into a buffer. Can anyone point out
what I need to change in the code?

Thanks in advance
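
To make the question concrete, here is a rough sketch of how a named
entry might be located by walking the local headers of an archive that
is already in memory. The names find_entry, struct zip_entry, le16 and
le32 are hypothetical and do not appear in ziptest.c. Offsets here are
counted from the start of the 4-byte signature, so each is 2 larger than
the LOCSIZ/LOCFIL/LOCEXT values in ziptest.c. It assumes the sizes are
stored in the local header (general-purpose flag bit 3 clear).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Little-endian readers for a byte buffer (hypothetical helpers). */
static unsigned le16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Hypothetical result type, not part of ziptest.c. */
struct zip_entry {
    size_t data_off;        /* offset of the entry's data in the buffer */
    unsigned long csize;    /* compressed size from the local header */
    unsigned long usize;    /* uncompressed size from the local header */
    unsigned method;        /* 0 = stored, 8 = deflated */
};

/*
 * Scan consecutive local headers in buf[0..len) for an entry called name.
 * Returns 0 and fills *e on success, -1 if the entry is not found or a
 * header cannot be skipped safely.
 */
int find_entry(const unsigned char *buf, size_t len, const char *name,
               struct zip_entry *e)
{
    size_t pos = 0, want;

    assert(buf != NULL && name != NULL && e != NULL);
    want = strlen(name);
    while (len - pos >= 30 && le32(buf + pos) == 0x04034b50UL) {
        const unsigned char *h = buf + pos;
        unsigned nlen = le16(h + 26);           /* file name length */
        unsigned xlen = le16(h + 28);           /* extra field length */
        unsigned long csize = le32(h + 18);     /* compressed size */
        size_t data = pos + 30 + nlen + xlen;

        if (le16(h + 6) & 8)                    /* sizes follow the data */
            return -1;
        if (data > len || csize > len - data)   /* truncated archive */
            return -1;
        if (nlen == want && memcmp(h + 30, name, want) == 0) {
            e->data_off = data;
            e->csize = csize;
            e->usize = le32(h + 22);
            e->method = le16(h + 8);
            return 0;
        }
        pos = data + csize;                     /* next local header */
    }
    return -1;
}
```

A caller would pass the full entry path as stored in the archive, for
example "package1/package3/Class3.class". Archives whose entries set
flag bit 3 carry the sizes after the data, so this walk gives up on them
and the central directory would have to be used instead.
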
Prasanta Sadhukhan wrote:
>> Thanks Glynn.
>> I fixed the flaws and now I get the whole file contents from the zip 
>> file, i.e., the program outputs the contents of the file stored in 
>> the zip file.
>> But actually, I only wanted to list the contents of the zip file as 
>> zipinfo does.
>>
>> Regards
>> Prasanta
>> Glynn Clements wrote:
>>> Prasanta Sadhukhan wrote:
>>>
>>>  
>>>> I have tried to write a ziptest program with the information I found, 
>>>> but when I inflate the contents by calling inflate() [line68], I get 
>>>> Z_DATA_ERROR, meaning the input data is corrupted or does not conform 
>>>> to the zlib format. However, zipinfo and unzip both work on the 
>>>> attached ziptest.zip file (and the header validity check in the 
>>>> program also finds the zip header valid).
>>>> Can anyone point out what more I should be doing to get rid of 
>>>> this problem?
>>>>     
>>>
>>> I've found 3 flaws:
>>>
>>> 1. You're not skipping over the variable-length fields (file name and
>>> extra field) at the end of the header; add:
>>>
>>>     fseek(file, SH(&h[LOCFIL]), SEEK_CUR);
>>>     fseek(file, SH(&h[LOCEXT]), SEEK_CUR);
>>>
>>> after reading the header but before calling inf().
>>>
>>> 2. You're trying to inflate() everything up to end-of-file, when you
>>> should be using the compressed length from the header (LOCSIZ) to
>>> determine how much compressed data is available (even with only one
>>> file, the central directory occurs at the end of the file).
>>>
>>> 3. This one wasn't obvious until I looked at the unzip source code. 
>>> You need to call inflateInit2(&strm, -15) rather than inflateInit(),
>>> in order to have it process "raw" data. zlib.h says:
>>>
>>>      windowBits can also be -8..-15 for raw inflate. In this case, -windowBits
>>>    determines the window size. inflate() will then process raw deflate data,
>>>    not looking for a zlib or gzip header, not generating a check value, and not
>>>    looking for any check values for comparison at the end of the stream. This
>>>    is for use with other formats that use the deflate compressed data format
>>>    such as zip. ...
>>>
>>> IOW, inflateInit() expects the data to be "wrapped" with a zlib or
>>> gzip header and trailer, but ZIP files don't have these.
>>>
>>> After fixing the above issues, I get C source code on stdout.
>>>
>>>   
>>


[-- Attachment #2: testclasses.zip --]
[-- Type: application/x-zip-compressed, Size: 7796 bytes --]

[-- Attachment #3: ziptest.c --]
[-- Type: text/plain, Size: 5306 bytes --]

#define _GNU_SOURCE             /* for strcasestr() on glibc */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>             /* malloc(), free() */
#include <zlib.h>
#include <errno.h>
#include <assert.h>

#define CHUNK 16384

/* PKZIP header definitions */
#define ZIPMAG 0x4b50           /* two-byte zip lead-in */
#define LOCREM 0x0403           /* remaining two bytes in zip signature */
#define LOCSIG 0x04034b50L      /* full signature */
#define LOCFLG 4                /* offset of bit flag */
#define  CRPFLG 1               /*  bit for encrypted entry */
#define  EXTFLG 8               /*  bit for extended local header */
#define LOCHOW 6                /* offset of compression method */
#define LOCTIM 8                /* file mod time (for decryption) */
#define LOCCRC 12               /* offset of crc */
#define LOCSIZ 16               /* offset of compressed size */
#define LOCLEN 20               /* offset of uncompressed length */
#define LOCFIL 24               /* offset of file name field length */
#define LOCEXT 26               /* offset of extra field length */
#define LOCHDR 28               /* size of local header, including LOCREM */
#define EXTHDR 16               /* size of extended local header, inc sig */

#define SH(p) ((unsigned short)(unsigned char)((p)[0]) | ((unsigned short)(unsigned char)((p)[1]) << 8))

int inf(FILE *source, FILE *dest)
{
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];
	unsigned char dict;

    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit2(&strm, -15);
    if (ret != Z_OK) {
		printf("inflateInit failed\n");
        return ret;
	}

		printf("inflateInit succeeded\n");
    /* decompress until deflate stream ends or end of file */
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
			printf("fread failed\n");
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
		printf("fread succeeded\n");
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;

        /* run inflate() on input until output buffer not full */
        do {
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                printf("inflate returned Z_NEED_DICT\n");
                (void)inflateEnd(&strm);
                return Z_DATA_ERROR;
            case Z_DATA_ERROR:
                printf("inflate returned Z_DATA_ERROR\n");
                (void)inflateEnd(&strm);
                return ret;
            case Z_MEM_ERROR:
                printf("inflate returned Z_MEM_ERROR\n");
                (void)inflateEnd(&strm);
                return ret;
            }
		printf("inflate succeeded\n");
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
		printf("fwrite failed\n");
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }
        } while (strm.avail_out == 0);

        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);

    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}

/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("ziptest: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}

int main(int argc, char **argv)
{
	char str[] = "./testclasses.zip/package1/package3/Class3.class";
	char *substr = strcasestr(str, ".zip");
	char *loc;
	char *zipfile;
	int errnum;
	unsigned short n;
	unsigned char h[LOCHDR];
	int ret;

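	if (substr == NULL) {
		printf("zip not found\n");
		substr = strcasestr(str, ".jar");
		if (substr == NULL) {
			printf("jar not found\n");
			return 1;	/* nothing to open */
		}
		printf("jar found at location: %s\n",substr);
	}
	else
		printf("zip found at location: %s\n",substr);
	if (*(substr+4) == '\0')
		printf("zip/jar found at last\n");

	/* copy the archive path, including the 4-character extension */
	loc = (char*)malloc(substr-str+4+1);	/* +1 for the terminating NUL */
	if (loc == NULL)
		return 1;
	strncpy(loc, str, substr-str+4);
	loc[substr-str+4] = '\0';	/* strncpy() does not terminate here */
	printf("zip path = %s\n",loc);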

	zipfile = substr+4+1;
	printf("zipfile %s\n", zipfile);

	errno = 0;
	FILE* file = fopen(loc, "rb");	/* binary mode: zip data is not text */
	if (file == (FILE*)NULL) {
		printf("cannot open zipfile. errno %d\n",errno);
		return 1;
	}
	printf("file %p\n", (void *)file);

	n = getc(file);
	printf("n %x\n",n);
	n |= getc(file) << 8;
	printf("n %x\n",n);
	printf("getc returns n = 0x%x, errno %d\n", n, errno);
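	if (n == ZIPMAG)
	{
		if (fread((char *)h, 1, LOCHDR, file) != LOCHDR || SH(h) != LOCREM) {
			printf("invalid zipfile\n");
			return 1;
		}
		printf("valid zip or jar file\n");
	} else {
		printf("input not a zip file\n");
		return 1;
	}

	/* skip the variable-length file name and extra field */
	fseek(file, SH(&h[LOCFIL]), SEEK_CUR);
	fseek(file, SH(&h[LOCEXT]), SEEK_CUR);
	/* inflates the first entry only, whatever its name is */
	ret = inf(file, stdout);
	if (ret != Z_OK)
		zerr(ret);
	fclose(file);
	free(loc);
	return ret == Z_OK ? 0 : 1;
}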
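
For the earlier request in this thread, listing the entries the way
zipinfo does, no inflation is needed at all: the names and sizes can be
read from the central directory at the end of the archive. The following
is a minimal sketch, assuming the whole archive is already in memory and
that it is a single-disk archive without ZIP64 extensions. The names
list_zip, rd16 and rd32 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Little-endian readers for a byte buffer (hypothetical helpers). */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/*
 * Print "uncompressed-size name" for every central directory entry of
 * the archive in buf[0..len). Returns the number of entries listed, or
 * -1 if no valid central directory is found.
 */
int list_zip(const unsigned char *buf, size_t len, FILE *out)
{
    size_t i, pos;
    unsigned n, k;

    assert(buf != NULL && out != NULL);
    if (len < 22)
        return -1;
    /* The end-of-central-directory record is 22 bytes plus an optional
       comment, so search backwards from the end for its signature. */
    for (i = len - 22; rd32(buf + i) != 0x06054b50UL; i--)
        if (i == 0)
            return -1;

    n = rd16(buf + i + 10);         /* total number of entries */
    pos = rd32(buf + i + 16);       /* offset of the central directory */
    for (k = 0; k < n; k++) {
        unsigned nlen, xlen, clen;

        if (pos > len || len - pos < 46 ||
            rd32(buf + pos) != 0x02014b50UL)
            return -1;
        nlen = rd16(buf + pos + 28);    /* file name length */
        xlen = rd16(buf + pos + 30);    /* extra field length */
        clen = rd16(buf + pos + 32);    /* comment length */
        if (len - pos - 46 < nlen)
            return -1;
        fprintf(out, "%lu %.*s\n", rd32(buf + pos + 24),
                (int)nlen, (const char *)(buf + pos + 46));
        pos += 46 + (size_t)nlen + xlen + clen;
    }
    return (int)n;
}
```

Each central directory entry also stores the offset of its local header
(at byte 42 of the entry), which gives a way to seek straight to one
named file instead of walking the local headers from the start.
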

Thread overview: 10+ messages
2009-10-20 12:32 code to list contents of zip files Prasanta Sadhukhan
2009-10-20 13:14 ` Glynn Clements
2009-10-21 12:26   ` Prasanta Sadhukhan
2009-10-21 18:52     ` Glynn Clements
2009-10-22 12:28       ` Prasanta Sadhukhan
2009-10-22 15:13         ` Glynn Clements
     [not found]         ` <4AE94373.8020702@sun.com>
2009-10-29  9:33           ` Prasanta Sadhukhan [this message]
2009-10-30  2:30             ` Glynn Clements
2009-10-30  7:02               ` Prasanta Sadhukhan
2009-10-30 20:36                 ` Glynn Clements
