git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Fix archiving some corner-case files into zip
@ 2025-08-04 16:56 Toon Claes
  2025-08-04 16:56 ` [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry() Toon Claes
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Toon Claes @ 2025-08-04 16:56 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, René Scharfe, Toon Claes

At $DAYJOB, we've add a customer report an issue where they failed to
download a zip archive from a repository. The error they saw come from
git-archive(1) is:

        fatal: deflate error (0)

My friendly colleague Justin Tobler was able to reproduce this issue[1].
We've diagnosed this error happens on some files that exceed
core.bigFileThreshold. To reproduce the issue, you can run:

        git clone --depth=1 https://github.com/chromium/chromium.git
        cd chromium
        git -c core.bigFileThreshold=1 archive -o foo.zip --format=zip HEAD -- \
                chrome/test/data/third_party/kraken/tests/kraken-1.1/imaging-darkroom-data.js

(originally he mentioned another file, but that didn't trigger the bug
for me)

And a patch to fix the issue was presented that message.

I have tested the fix, and I can confirm this fixes the issue. But I'm
concerned this doesn't fix all issues.

Another way one could trigger the issue, is by initializing
`unsigned char compressed` with length `STREAM_BUFFER_SIZE / 2` (so half
the length of the input buffer, instead of double).

With Justin's fix, you see the error doesn't happen no more. But it
seems, the resulting zip archive isn't valid. When I try to unzip it, I
see:

    inflating: chrome/test/data/third_party/kraken/tests/kraken-1.1/imaging-darkroom-data.js   bad CRC 3ba68a86  (should be b09a04a2)

And when the length is set to `STREAM_BUFFER_SIZE` (so equal length to
input buffer), the decompress goes well, but the data seems to be
mangled.

This is because only the final call of git_deflate() is being wrapped in
a loop for the current chunk of input data. We can see in various other
callsites in the Git codebase, git_deflate() is usually called in a
`while` loop (even when the `flush` parameter is set to `0` =
Z_NO_FLUSH).

For the record, I want to give all the credit to Justin for diagnosing
this bug and to determine a solution. Where he aims to provide a fix
that is minimal, I wanted to present an alternative solution that
implements zlib usuage according to the official usage example[2], but
the changes are more substantial.

I'm on the fence which of two is the better approach. Because the ZIP
format has a End Of Central Directory record (EOCD) at the end, it's far
more likely *only* the final git_deflate() call suffers from unprocessed
input data, so the final Justin provides probably Just Works. I'm gonna
leave it up to the community to decide what is "better"?

[1]: https://lore.kernel.org/git/20250802220803.95137-1-jltobler@gmail.com/
[2]: https://zlib.net/zlib_how.html
[3]: https://en.wikipedia.org/wiki/ZIP_(file_format)#End_of_central_directory_record_(EOCD)

--
Cheers,
Toon

---
Toon Claes (4):
      archive-zip: deduplicate code setting output buffer in write_zip_entry()
      archive-zip: remove unneccesarry condition in write_zip_entry()
      archive-zip: in write_zip_entry() call git_deflate() in a loop
      archive-zip: move git_deflate() with Z_FINISH into the loop

 archive-zip.c | 40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)
---



---

base-commit: e813a0200a7121b97fec535f0d0b460b0a33356c
change-id: 20250801-toon-archive-zip-fix-2deac42d5aa3

Thanks
--
Toon


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry()
  2025-08-04 16:56 [PATCH 0/4] Fix archiving some corner-case files into zip Toon Claes
@ 2025-08-04 16:56 ` Toon Claes
  2025-08-07 19:51   ` René Scharfe
  2025-08-04 16:56 ` [PATCH 2/4] archive-zip: remove unneccesarry condition " Toon Claes
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Toon Claes @ 2025-08-04 16:56 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, René Scharfe, Toon Claes

There were two callsites setting the size and address of the output
buffer. Instead of setting them outside the loop and in the loop after
calling git_deflate(). Set them once in the loop, right before the
git_deflate() call.

Co-authored-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
---
 archive-zip.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/archive-zip.c b/archive-zip.c
index df8866d5ba..cc6d0cadd9 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -458,8 +458,6 @@ static int write_zip_entry(struct archiver_args *args,
 		git_deflate_init_raw(&zstream, args->compression_level);
 
 		compressed_size = 0;
-		zstream.next_out = compressed;
-		zstream.avail_out = sizeof(compressed);
 
 		for (;;) {
 			readlen = read_istream(stream, buf, sizeof(buf));
@@ -473,6 +471,8 @@ static int write_zip_entry(struct archiver_args *args,
 
 			zstream.next_in = buf;
 			zstream.avail_in = readlen;
+			zstream.next_out = compressed;
+			zstream.avail_out = sizeof(compressed);
 			result = git_deflate(&zstream, 0);
 			if (result != Z_OK)
 				die(_("deflate error (%d)"), result);
@@ -481,8 +481,6 @@ static int write_zip_entry(struct archiver_args *args,
 			if (out_len > 0) {
 				write_or_die(1, compressed, out_len);
 				compressed_size += out_len;
-				zstream.next_out = compressed;
-				zstream.avail_out = sizeof(compressed);
 			}
 
 		}

-- 
2.50.1.327.g047016eb4a


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] archive-zip: remove unneccesarry condition in write_zip_entry()
  2025-08-04 16:56 [PATCH 0/4] Fix archiving some corner-case files into zip Toon Claes
  2025-08-04 16:56 ` [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry() Toon Claes
@ 2025-08-04 16:56 ` Toon Claes
  2025-08-07 19:51   ` René Scharfe
  2025-08-04 16:56 ` [PATCH 3/4] archive-zip: in write_zip_entry() call git_deflate() in a loop Toon Claes
  2025-08-04 16:56 ` [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop Toon Claes
  3 siblings, 1 reply; 9+ messages in thread
From: Toon Claes @ 2025-08-04 16:56 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, René Scharfe, Toon Claes

The function write_or_die() can handle a length that's zero, thus we can
remove the condition that checks the value of `out_len` that surrounds
this call. The value shall never be negative as this would have caused
data being omitted in the deflated output.

Co-authored-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
---
 archive-zip.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/archive-zip.c b/archive-zip.c
index cc6d0cadd9..d41a12de5f 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -478,11 +478,8 @@ static int write_zip_entry(struct archiver_args *args,
 				die(_("deflate error (%d)"), result);
 			out_len = zstream.next_out - compressed;
 
-			if (out_len > 0) {
-				write_or_die(1, compressed, out_len);
-				compressed_size += out_len;
-			}
-
+			write_or_die(1, compressed, out_len);
+			compressed_size += out_len;
 		}
 		close_istream(stream);
 		if (readlen)

-- 
2.50.1.327.g047016eb4a


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] archive-zip: in write_zip_entry() call git_deflate() in a loop
  2025-08-04 16:56 [PATCH 0/4] Fix archiving some corner-case files into zip Toon Claes
  2025-08-04 16:56 ` [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry() Toon Claes
  2025-08-04 16:56 ` [PATCH 2/4] archive-zip: remove unneccesarry condition " Toon Claes
@ 2025-08-04 16:56 ` Toon Claes
  2025-08-07 19:51   ` René Scharfe
  2025-08-04 16:56 ` [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop Toon Claes
  3 siblings, 1 reply; 9+ messages in thread
From: Toon Claes @ 2025-08-04 16:56 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, René Scharfe, Toon Claes

The function git_deflate() might not complete to deflate all the input
data in one go. While the function is already being called in a loop,
every loop fresh data is read from the stream. This is not correct,
because input data might get lost.

As we see in many other callsites, git_deflate() should be called in a
loop on the existing input to make it process all the input data.

Add in a nested loop around git_deflate() to process the input buffer
completely, before continuing the parent loop that reads from more data
from the input stream.

Co-authored-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
---
 archive-zip.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/archive-zip.c b/archive-zip.c
index d41a12de5f..25a0224130 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -471,15 +471,17 @@ static int write_zip_entry(struct archiver_args *args,
 
 			zstream.next_in = buf;
 			zstream.avail_in = readlen;
-			zstream.next_out = compressed;
-			zstream.avail_out = sizeof(compressed);
-			result = git_deflate(&zstream, 0);
-			if (result != Z_OK)
-				die(_("deflate error (%d)"), result);
-			out_len = zstream.next_out - compressed;
+			do {
+				zstream.next_out = compressed;
+				zstream.avail_out = sizeof(compressed);
+				result = git_deflate(&zstream, 0);
+				if (result != Z_OK)
+					die(_("deflate error (%d)"), result);
+				out_len = zstream.next_out - compressed;
 
-			write_or_die(1, compressed, out_len);
-			compressed_size += out_len;
+				write_or_die(1, compressed, out_len);
+				compressed_size += out_len;
+			} while (zstream.avail_out == 0);
 		}
 		close_istream(stream);
 		if (readlen)

-- 
2.50.1.327.g047016eb4a


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop
  2025-08-04 16:56 [PATCH 0/4] Fix archiving some corner-case files into zip Toon Claes
                   ` (2 preceding siblings ...)
  2025-08-04 16:56 ` [PATCH 3/4] archive-zip: in write_zip_entry() call git_deflate() in a loop Toon Claes
@ 2025-08-04 16:56 ` Toon Claes
  2025-08-07 19:51   ` René Scharfe
  3 siblings, 1 reply; 9+ messages in thread
From: Toon Claes @ 2025-08-04 16:56 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, René Scharfe, Toon Claes

Instead duplicating code to do the final deflate (with `flush` value
Z_FINISH), bring this call inside the loop that's deflate parts of the
input stream. This causes also this final deflate to be wrapped in a
loop to ensure the whole input is taken care of.

This change makes crc32() to be called without checking if the `readlen`
is greater than zero, but looking at the zlib manual[1] should be
allowed.

This patch concluded some refactoring, making the code more similar to
the example usage of the official zlib docs[2].

[1]: https://zlib.net/manual.html
[2]: https://zlib.net/zlib_how.html

Co-authored-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
---
 archive-zip.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/archive-zip.c b/archive-zip.c
index 25a0224130..559ed267be 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -451,7 +451,7 @@ static int write_zip_entry(struct archiver_args *args,
 		unsigned char buf[STREAM_BUFFER_SIZE];
 		ssize_t readlen;
 		git_zstream zstream;
-		int result;
+		int result, flush;
 		size_t out_len;
 		unsigned char compressed[STREAM_BUFFER_SIZE * 2];
 
@@ -459,44 +459,37 @@ static int write_zip_entry(struct archiver_args *args,
 
 		compressed_size = 0;
 
-		for (;;) {
+		do {
 			readlen = read_istream(stream, buf, sizeof(buf));
-			if (readlen <= 0)
+			if (readlen < 0)
 				break;
 			crc = crc32(crc, buf, readlen);
-			if (is_binary == -1)
+			if ((is_binary == -1) && readlen)
 				is_binary = entry_is_binary(args->repo->index,
 							    path_without_prefix,
 							    buf, readlen);
 
+			flush = readlen ? Z_NO_FLUSH : Z_FINISH;
 			zstream.next_in = buf;
 			zstream.avail_in = readlen;
 			do {
 				zstream.next_out = compressed;
 				zstream.avail_out = sizeof(compressed);
-				result = git_deflate(&zstream, 0);
-				if (result != Z_OK)
+				result = git_deflate(&zstream, flush);
+				if ((result != Z_OK) && (result != Z_STREAM_END))
 					die(_("deflate error (%d)"), result);
 				out_len = zstream.next_out - compressed;
 
 				write_or_die(1, compressed, out_len);
 				compressed_size += out_len;
 			} while (zstream.avail_out == 0);
-		}
+		} while (flush != Z_FINISH);
+
 		close_istream(stream);
 		if (readlen)
 			return readlen;
 
-		zstream.next_in = buf;
-		zstream.avail_in = 0;
-		result = git_deflate(&zstream, Z_FINISH);
-		if (result != Z_STREAM_END)
-			die("deflate error (%d)", result);
-
 		git_deflate_end(&zstream);
-		out_len = zstream.next_out - compressed;
-		write_or_die(1, compressed, out_len);
-		compressed_size += out_len;
 		zip_offset += compressed_size;
 
 		write_zip_data_desc(size, compressed_size, crc);

-- 
2.50.1.327.g047016eb4a


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry()
  2025-08-04 16:56 ` [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry() Toon Claes
@ 2025-08-07 19:51   ` René Scharfe
  0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2025-08-07 19:51 UTC (permalink / raw)
  To: Toon Claes, git; +Cc: Justin Tobler

On 8/4/25 6:56 PM, Toon Claes wrote:
> There were two callsites setting the size and address of the output
> buffer. Instead of setting them outside the loop and in the loop after
> calling git_deflate(). Set them once in the loop, right before the
> git_deflate() call.
> 
> Co-authored-by: Justin Tobler <jltobler@gmail.com>
> Signed-off-by: Toon Claes <toon@iotcl.com>
> ---
>  archive-zip.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/archive-zip.c b/archive-zip.c
> index df8866d5ba..cc6d0cadd9 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -458,8 +458,6 @@ static int write_zip_entry(struct archiver_args *args,
>  		git_deflate_init_raw(&zstream, args->compression_level);
>  
>  		compressed_size = 0;
> -		zstream.next_out = compressed;
> -		zstream.avail_out = sizeof(compressed);
>  
>  		for (;;) {
>  			readlen = read_istream(stream, buf, sizeof(buf));
> @@ -473,6 +471,8 @@ static int write_zip_entry(struct archiver_args *args,
>  
>  			zstream.next_in = buf;
>  			zstream.avail_in = readlen;
> +			zstream.next_out = compressed;
> +			zstream.avail_out = sizeof(compressed);
>  			result = git_deflate(&zstream, 0);
>  			if (result != Z_OK)
>  				die(_("deflate error (%d)"), result);
> @@ -481,8 +481,6 @@ static int write_zip_entry(struct archiver_args *args,
>  			if (out_len > 0) {
>  				write_or_die(1, compressed, out_len);
>  				compressed_size += out_len;
> -				zstream.next_out = compressed;
> -				zstream.avail_out = sizeof(compressed);
>  			}
>  
>  		}
> 

Less lines, great!  This works because for out_len == 0 these
assignments are noops, and that condition so unlikely to be true that
we don't need to worry about the then unnecessary writes.

René


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] archive-zip: remove unneccesarry condition in write_zip_entry()
  2025-08-04 16:56 ` [PATCH 2/4] archive-zip: remove unneccesarry condition " Toon Claes
@ 2025-08-07 19:51   ` René Scharfe
  0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2025-08-07 19:51 UTC (permalink / raw)
  To: Toon Claes, git; +Cc: Justin Tobler

On 8/4/25 6:56 PM, Toon Claes wrote:
> The function write_or_die() can handle a length that's zero, thus we can
> remove the condition that checks the value of `out_len` that surrounds
> this call. The value shall never be negative as this would have caused
> data being omitted in the deflated output.
> 
> Co-authored-by: Justin Tobler <jltobler@gmail.com>
> Signed-off-by: Toon Claes <toon@iotcl.com>
> ---
>  archive-zip.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/archive-zip.c b/archive-zip.c
> index cc6d0cadd9..d41a12de5f 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -478,11 +478,8 @@ static int write_zip_entry(struct archiver_args *args,
>  				die(_("deflate error (%d)"), result);
>  			out_len = zstream.next_out - compressed;
>  
> -			if (out_len > 0) {
> -				write_or_die(1, compressed, out_len);
> -				compressed_size += out_len;
> -			}
> -
> +			write_or_die(1, compressed, out_len);
> +			compressed_size += out_len;
>  		}
>  		close_istream(stream);
>  		if (readlen)
> 

OK, less lines again, great!  write_or_die() handles out_len == 0
just fine; this won't increase the number of syscalls we make.

René


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4] archive-zip: in write_zip_entry() call git_deflate() in a loop
  2025-08-04 16:56 ` [PATCH 3/4] archive-zip: in write_zip_entry() call git_deflate() in a loop Toon Claes
@ 2025-08-07 19:51   ` René Scharfe
  0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2025-08-07 19:51 UTC (permalink / raw)
  To: Toon Claes, git; +Cc: Justin Tobler

On 8/4/25 6:56 PM, Toon Claes wrote:
> The function git_deflate() might not complete to deflate all the input
> data in one go. While the function is already being called in a loop,
> every loop fresh data is read from the stream. This is not correct,
> because input data might get lost.
> 
> As we see in many other callsites, git_deflate() should be called in a
> loop on the existing input to make it process all the input data.
> 
> Add in a nested loop around git_deflate() to process the input buffer
> completely, before continuing the parent loop that reads from more data
> from the input stream.
> 
> Co-authored-by: Justin Tobler <jltobler@gmail.com>
> Signed-off-by: Toon Claes <toon@iotcl.com>
> ---
>  archive-zip.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/archive-zip.c b/archive-zip.c
> index d41a12de5f..25a0224130 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -471,15 +471,17 @@ static int write_zip_entry(struct archiver_args *args,
>  
>  			zstream.next_in = buf;
>  			zstream.avail_in = readlen;
> -			zstream.next_out = compressed;
> -			zstream.avail_out = sizeof(compressed);
> -			result = git_deflate(&zstream, 0);
> -			if (result != Z_OK)
> -				die(_("deflate error (%d)"), result);
> -			out_len = zstream.next_out - compressed;
> +			do {
> +				zstream.next_out = compressed;
> +				zstream.avail_out = sizeof(compressed);
> +				result = git_deflate(&zstream, 0);
> +				if (result != Z_OK)
> +					die(_("deflate error (%d)"), result);
> +				out_len = zstream.next_out - compressed;
>  
> -			write_or_die(1, compressed, out_len);
> -			compressed_size += out_len;
> +				write_or_die(1, compressed, out_len);
> +				compressed_size += out_len;
> +			} while (zstream.avail_out == 0);
>  		}
>  		close_istream(stream);
>  		if (readlen)
> 

Makes sense.  If deflate somehow fills the output buffer (with
internally pending data, I suppose -- the fresh input data alone is not
enough), this clears it and lets it go another round without feeding new
input.

The mistake was thinking that the existence of deflateBound(), which
gives a maximum deflated size for any given input, implies a similarly
tight limit on individual chunks of input, which makes no sense in
hindsight.

René


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop
  2025-08-04 16:56 ` [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop Toon Claes
@ 2025-08-07 19:51   ` René Scharfe
  0 siblings, 0 replies; 9+ messages in thread
From: René Scharfe @ 2025-08-07 19:51 UTC (permalink / raw)
  To: Toon Claes, git; +Cc: Justin Tobler

On 8/4/25 6:56 PM, Toon Claes wrote:
> Instead duplicating code to do the final deflate (with `flush` value
> Z_FINISH), bring this call inside the loop that's deflate parts of the
> input stream. This causes also this final deflate to be wrapped in a
> loop to ensure the whole input is taken care of.
> 
> This change makes crc32() to be called without checking if the `readlen`
> is greater than zero, but looking at the zlib manual[1] should be
> allowed.

Reading the manual is surely allowed, that's what it was written for.
Snarkiness aside, s/should/it should/

But if you're referring to the example with a zero length there:

	uLong crc = crc32(0L, Z_NULL, 0);

... then note that it's also giving a Z_NULL pointer, which crc32_z()
(https://github.com/madler/zlib/blob/develop/crc32.c#L585) has special
handling for; the length is ignored in this example.

> 
> This patch concluded some refactoring, making the code more similar to
> the example usage of the official zlib docs[2].
> 
> [1]: https://zlib.net/manual.html
> [2]: https://zlib.net/zlib_how.html
> 
> Co-authored-by: Justin Tobler <jltobler@gmail.com>
> Signed-off-by: Toon Claes <toon@iotcl.com>
> ---
>  archive-zip.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/archive-zip.c b/archive-zip.c
> index 25a0224130..559ed267be 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -451,7 +451,7 @@ static int write_zip_entry(struct archiver_args *args,
>  		unsigned char buf[STREAM_BUFFER_SIZE];
>  		ssize_t readlen;
>  		git_zstream zstream;
> -		int result;
> +		int result, flush;
>  		size_t out_len;
>  		unsigned char compressed[STREAM_BUFFER_SIZE * 2];
>  
> @@ -459,44 +459,37 @@ static int write_zip_entry(struct archiver_args *args,
>  
>  		compressed_size = 0;
>  
> -		for (;;) {
> +		do {
>  			readlen = read_istream(stream, buf, sizeof(buf));
> -			if (readlen <= 0)
> +			if (readlen < 0)
>  				break;
>  			crc = crc32(crc, buf, readlen);
> -			if (is_binary == -1)
> +			if ((is_binary == -1) && readlen)

It's probably fine to call crc32 with zero length, but since you have the
readlen condition here anyway it would basically be free to extend it to
that call as well.

>  				is_binary = entry_is_binary(args->repo->index,
>  							    path_without_prefix,
>  							    buf, readlen);
>  
> +			flush = readlen ? Z_NO_FLUSH : Z_FINISH;

... and setting Z_FINISH could be moved under there as well to avoid
setting Z_NO_FLUSH over and over.  I'm probably nitpicking here, though.

>  			zstream.next_in = buf;
>  			zstream.avail_in = readlen;
>  			do {
>  				zstream.next_out = compressed;
>  				zstream.avail_out = sizeof(compressed);
> -				result = git_deflate(&zstream, 0);
> -				if (result != Z_OK)
> +				result = git_deflate(&zstream, flush);
> +				if ((result != Z_OK) && (result != Z_STREAM_END))
>  					die(_("deflate error (%d)"), result);
>  				out_len = zstream.next_out - compressed;
>  
>  				write_or_die(1, compressed, out_len);
>  				compressed_size += out_len;
>  			} while (zstream.avail_out == 0);

Hmm, the manual says "After deflate has returned Z_STREAM_END, the only
possible operations on the stream are deflateReset or deflateEnd.".
Here we could have deflate fill the output buffer to the brim, be done
and return Z_STREAM_END and we'd go again, which would result in an
error, I suppose.  So the while loop above needs to end on Z_STREAM_END
as well, no?

René


> -		}
> +		} while (flush != Z_FINISH);
> +
>  		close_istream(stream);
>  		if (readlen)
>  			return readlen;
>  
> -		zstream.next_in = buf;
> -		zstream.avail_in = 0;
> -		result = git_deflate(&zstream, Z_FINISH);
> -		if (result != Z_STREAM_END)
> -			die("deflate error (%d)", result);
> -
>  		git_deflate_end(&zstream);
> -		out_len = zstream.next_out - compressed;
> -		write_or_die(1, compressed, out_len);
> -		compressed_size += out_len;
>  		zip_offset += compressed_size;
>  
>  		write_zip_data_desc(size, compressed_size, crc);
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-08-07 19:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-04 16:56 [PATCH 0/4] Fix archiving some corner-case files into zip Toon Claes
2025-08-04 16:56 ` [PATCH 1/4] archive-zip: deduplicate code setting output buffer in write_zip_entry() Toon Claes
2025-08-07 19:51   ` René Scharfe
2025-08-04 16:56 ` [PATCH 2/4] archive-zip: remove unneccesarry condition " Toon Claes
2025-08-07 19:51   ` René Scharfe
2025-08-04 16:56 ` [PATCH 3/4] archive-zip: in write_zip_entry() call git_deflate() in a loop Toon Claes
2025-08-07 19:51   ` René Scharfe
2025-08-04 16:56 ` [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop Toon Claes
2025-08-07 19:51   ` René Scharfe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).