From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 7/9] reftable/block: reuse zstream when writing log blocks
Date: Thu, 4 Apr 2024 07:36:58 +0200 [thread overview]
Message-ID: <Zg48ersnCJqsk-Ey@tanuki> (raw)
In-Reply-To: <xmqqplv65jet.fsf@gitster.g>
[-- Attachment #1: Type: text/plain, Size: 4421 bytes --]
On Wed, Apr 03, 2024 at 12:35:22PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > @@ -139,39 +143,60 @@ int block_writer_finish(struct block_writer *w)
> > w->next += 2;
> > put_be24(w->buf + 1 + w->header_off, w->next);
> >
> > + /*
> > + * Log records are stored zlib-compressed. Note that the compression
> > + * also spans over the restart points we have just written.
> > + */
> > if (block_writer_type(w) == BLOCK_TYPE_LOG) {
> > int block_header_skip = 4 + w->header_off;
> > + uLongf src_len = w->next - block_header_skip, compressed_len;
> > + unsigned char *compressed;
> > + int ret;
> > +
> > + ret = deflateReset(w->zstream);
> > + if (ret != Z_OK)
> > + return REFTABLE_ZLIB_ERROR;
> > +
> > + /*
> > + * Precompute the upper bound of how many bytes the compressed
> > + * data may end up with. Combined with `Z_FINISH`, `deflate()`
> > + * is guaranteed to return `Z_STREAM_END`.
> > + */
> > + compressed_len = deflateBound(w->zstream, src_len);
> > + REFTABLE_ALLOC_ARRAY(compressed, compressed_len);
>
> OK.
>
> > + w->zstream->next_out = compressed;
> > + w->zstream->avail_out = compressed_len;
> > + w->zstream->next_in = w->buf + block_header_skip;
> > + w->zstream->avail_in = src_len;
> > +
> > + /*
> > + * We want to perform all decompression in a single
> > + * step, which is why we can pass Z_FINISH here. Note
> > + * that both `Z_OK` and `Z_BUF_ERROR` indicate that we
> > + * need to retry according to documentation.
> > + *
> > + * If the call fails we retry with a bigger output
> > + * buffer.
> > + */
>
> I am not sure where the retry is happening, though.
>
> block_writer_finish() is called by writer_flush_nonempty_block()
> which returns a negative return to its caller, which is
> writer_flush_block(). writer_flush_block() in turn returns a
> negative return to its callers from writer_add_record(),
> write_finish_section(), and write_object_record(). Nobody seems to
> react to REFTABLE_ZLIB_ERROR (other than the reftable/error.c that
> stringifies the error for messages).
>
> But we have asked deflateBound() so if we did not get Z_STREAM_END,
> wouldn't it mean some data corruption that retrying would not help?
Yeha, this comment is stale from a previous iteration.
> > + ret = deflate(w->zstream, Z_FINISH);
> > + if (ret != Z_STREAM_END) {
> > reftable_free(compressed);
> > - break;
> > + return REFTABLE_ZLIB_ERROR;
> > }
> > +
> > + /*
> > + * Overwrite the uncompressed data we have already written and
> > + * adjust the `next` pointer to point right after the
> > + * compressed data.
> > + */
> > + memcpy(w->buf + block_header_skip, compressed,
> > + w->zstream->total_out);
> > + w->next = w->zstream->total_out + block_header_skip;
> > +
> > + reftable_free(compressed);
> > }
> > +
> > return w->next;
> > }
>
> OK.
>
> > @@ -425,6 +450,8 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it,
> >
> > void block_writer_release(struct block_writer *bw)
> > {
> > + deflateEnd(bw->zstream);
> > + FREE_AND_NULL(bw->zstream);
> > FREE_AND_NULL(bw->restarts);
> > strbuf_release(&bw->last_key);
> > /* the block is not owned. */
> > diff --git a/reftable/block.h b/reftable/block.h
> > index 47acc62c0a..1375957fc8 100644
> > --- a/reftable/block.h
> > +++ b/reftable/block.h
> > @@ -18,6 +18,7 @@ license that can be found in the LICENSE file or at
> > * allocation overhead.
> > */
> > struct block_writer {
> > + z_stream *zstream;
> > uint8_t *buf;
> > uint32_t block_size;
> >
> > diff --git a/reftable/writer.c b/reftable/writer.c
> > index d347ec4cc6..51e663bb19 100644
> > --- a/reftable/writer.c
> > +++ b/reftable/writer.c
> > @@ -153,6 +153,10 @@ void reftable_writer_free(struct reftable_writer *w)
> > {
> > if (!w)
> > return;
> > + if (w->block_writer) {
> > + block_writer_release(w->block_writer);
> > + w->block_writer = NULL;
> > + }
>
> This smells like an orthogonal fix to an unrelated resource leakage?
True. The memory leak simply never occurred before this change, but in
theory it could have happened. Will move into a separate commit.
Patrick
> > reftable_free(w->block);
> > reftable_free(w);
> > }
>
> Thanks.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-04-04 5:37 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-02 17:29 [PATCH 0/9] reftable: optimize write performance Patrick Steinhardt
2024-04-02 17:29 ` [PATCH 1/9] refs/reftable: fix D/F conflict error message on ref copy Patrick Steinhardt
2024-04-03 18:28 ` Junio C Hamano
2024-04-02 17:29 ` [PATCH 2/9] refs/reftable: perform explicit D/F check when writing symrefs Patrick Steinhardt
2024-04-02 17:30 ` [PATCH 3/9] refs/reftable: skip duplicate name checks Patrick Steinhardt
2024-04-02 17:30 ` [PATCH 4/9] refs/reftable: don't recompute committer ident Patrick Steinhardt
2024-04-03 18:58 ` Junio C Hamano
2024-04-04 5:36 ` Patrick Steinhardt
2024-04-02 17:30 ` [PATCH 5/9] reftable/writer: refactorings for `writer_add_record()` Patrick Steinhardt
2024-04-02 17:30 ` [PATCH 6/9] reftable/writer: refactorings for `writer_flush_nonempty_block()` Patrick Steinhardt
2024-04-02 17:30 ` [PATCH 7/9] reftable/block: reuse zstream when writing log blocks Patrick Steinhardt
2024-04-03 19:35 ` Junio C Hamano
2024-04-04 5:36 ` Patrick Steinhardt [this message]
2024-04-02 17:30 ` [PATCH 8/9] reftable/block: reuse compressed array Patrick Steinhardt
2024-04-02 17:30 ` [PATCH 9/9] reftable/writer: reset `last_key` instead of releasing it Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 00/11] reftable: optimize write performance Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 01/11] refs/reftable: fix D/F conflict error message on ref copy Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 02/11] refs/reftable: perform explicit D/F check when writing symrefs Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 03/11] refs/reftable: skip duplicate name checks Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 04/11] reftable: remove " Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 05/11] refs/reftable: don't recompute committer ident Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 06/11] reftable/writer: refactorings for `writer_add_record()` Patrick Steinhardt
2024-04-04 6:58 ` Han-Wen Nienhuys
2024-04-04 7:32 ` Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 07/11] reftable/writer: refactorings for `writer_flush_nonempty_block()` Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 08/11] reftable/writer: unify releasing memory Patrick Steinhardt
2024-04-04 7:08 ` Han-Wen Nienhuys
2024-04-04 7:32 ` Patrick Steinhardt
2024-04-04 9:00 ` Han-Wen Nienhuys
2024-04-04 11:43 ` Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 09/11] reftable/writer: reset `last_key` instead of releasing it Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 10/11] reftable/block: reuse zstream when writing log blocks Patrick Steinhardt
2024-04-04 5:48 ` [PATCH v2 11/11] reftable/block: reuse compressed array Patrick Steinhardt
2024-04-04 7:09 ` [PATCH v2 00/11] reftable: optimize write performance Han-Wen Nienhuys
2024-04-04 7:32 ` Patrick Steinhardt
2024-04-08 12:23 ` [PATCH v3 " Patrick Steinhardt
2024-04-08 12:23 ` [PATCH v3 01/11] refs/reftable: fix D/F conflict error message on ref copy Patrick Steinhardt
2024-04-08 12:23 ` [PATCH v3 02/11] refs/reftable: perform explicit D/F check when writing symrefs Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 03/11] refs/reftable: skip duplicate name checks Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 04/11] reftable: remove " Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 05/11] refs/reftable: don't recompute committer ident Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 06/11] reftable/writer: refactorings for `writer_add_record()` Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 07/11] reftable/writer: refactorings for `writer_flush_nonempty_block()` Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 08/11] reftable/writer: unify releasing memory Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 09/11] reftable/writer: reset `last_key` instead of releasing it Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 10/11] reftable/block: reuse zstream when writing log blocks Patrick Steinhardt
2024-04-08 12:24 ` [PATCH v3 11/11] reftable/block: reuse compressed array Patrick Steinhardt
2024-04-09 0:09 ` [PATCH v3 00/11] reftable: optimize write performance Junio C Hamano
2024-04-09 3:16 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zg48ersnCJqsk-Ey@tanuki \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).