From: Patrick Steinhardt <ps@pks.im>
To: Karthik Nayak <karthik.188@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 3/3] reftable: prevent 'update_index' changes after adding records
Date: Tue, 21 Jan 2025 07:56:29 +0100
Message-ID: <Z49FHQgsFSX6sTxu@pks.im>
In-Reply-To: <20250121-461-corrupted-reftable-followup-v2-3-37e26c7a79b4@gmail.com>
On Tue, Jan 21, 2025 at 04:34:12AM +0100, Karthik Nayak wrote:
> The function `reftable_writer_set_limits()` allows updating the
> 'min_update_index' and 'max_update_index' of a reftable writer. These
> values are written to both the writer's header and footer.
>
> Since the header is written during the first block write, any subsequent
> changes to the update index would create a mismatch between the header
> and footer values. The footer would contain the newer values while the
> header retained the original ones.
>
> To fix this bug, prevent callers from updating these values after any
Nit: it's not really fixing a bug, but protecting us against it. Not
worth a reroll though, from my point of view.
> diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
> index 5f9afa620bb00de66c311765fb0ae8c6f56401ae..1ea014d389cc47f173279e3234a82f3fcbc807a0 100644
> --- a/reftable/reftable-writer.h
> +++ b/reftable/reftable-writer.h
> @@ -124,17 +124,21 @@ int reftable_writer_new(struct reftable_writer **out,
> int (*flush_func)(void *),
> void *writer_arg, const struct reftable_write_options *opts);
>
> -/* Set the range of update indices for the records we will add. When writing a
> - table into a stack, the min should be at least
> - reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> -
> - For transactional updates to a stack, typically min==max, and the
> - update_index can be obtained by inspeciting the stack. When converting an
> - existing ref database into a single reftable, this would be a range of
> - update-index timestamps.
> +/*
> + * Set the range of update indices for the records we will add. When writing a
> + * table into a stack, the min should be at least
> + * reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> + *
> + * For transactional updates to a stack, typically min==max, and the
> + * update_index can be obtained by inspeciting the stack. When converting an
> + * existing ref database into a single reftable, this would be a range of
> + * update-index timestamps.
> + *
> + * The function should be called before adding any records to the writer. If not
> + * it will fail with REFTABLE_API_ERROR.
> */
Thanks for updating this. I think the reftable library is one of those
code areas where it makes sense to sneak in a formatting fix every now
and then because its coding style is quite alien to Git's own in some
places. We could also do it all in one go, but I strongly doubt that it
would be worth the churn.
> -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> - uint64_t max);
> +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> + uint64_t max);
>
> /*
> Add a reftable_ref_record. The record should have names that come after
> diff --git a/reftable/writer.c b/reftable/writer.c
> index 740c98038eaf883258bef4988f78977ac7e4a75a..03acbdbcce75fd51820c5fb016bd94f0f7f4914a 100644
> --- a/reftable/writer.c
> +++ b/reftable/writer.c
> @@ -179,11 +179,20 @@ int reftable_writer_new(struct reftable_writer **out,
> return 0;
> }
>
> -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> - uint64_t max)
> +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> + uint64_t max)
> {
> + /*
> + * The limits should be set before any records are added to the writer.
> + * Check if any records were added by checking if `last_key` was set.
> + */
> + if (w->last_key.len)
> + return REFTABLE_API_ERROR;
Hm. Using the last key feels somewhat dangerous to me as it does get
reset at times, e.g. when finishing writing the current section. It
_should_ work, but overall it just feels a tad too disconnected from the
thing that we actually want to check.
How about we instead use `next`? This variable records the offset of the
next block we're about to write, and `writer_flush_nonempty_block()`
uses it directly to check whether we're currently writing the first
block in order to decide whether it needs to write a header or not. If
it's 0, we know that we haven't written the first block yet. That feels
much more closely aligned with what we're checking.
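To illustrate the difference between the two checks, here is a minimal
sketch. It is a heavily simplified model and not the reftable code:
`struct mock_writer` and all `mock_*` functions are hypothetical, and
the assumption that finishing a section both flushes a block and clears
the last key is only a rough rendering of the behavior described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the writer state. */
struct mock_writer {
	size_t last_key_len; /* models w->last_key.len */
	uint64_t next;       /* models w->next: offset of the next block */
};

/* Adding a record remembers its key; nothing is written out yet. */
void mock_add_record(struct mock_writer *w, const char *key)
{
	w->last_key_len = strlen(key);
}

/*
 * Assumption: finishing a section flushes the pending block (the first
 * flush is the one that would emit the header) and resets the last key.
 */
void mock_finish_section(struct mock_writer *w, uint64_t block_size)
{
	w->next += block_size;
	w->last_key_len = 0;
}

/* Candidate check 1: "were records added?" judged by the last key. */
int mock_locked_by_last_key(const struct mock_writer *w)
{
	return w->last_key_len != 0;
}

/* Candidate check 2: "was the first block written?" judged by `next`. */
int mock_locked_by_next(const struct mock_writer *w)
{
	return w->next != 0;
}

/* Walks through the scenario and asserts what each check reports. */
void mock_demo(void)
{
	struct mock_writer w = { 0 };

	mock_add_record(&w, "refs/heads/main");
	assert(mock_locked_by_last_key(&w));

	mock_finish_section(&w, 4096);
	/* The last key was reset, so check 1 no longer sees the records. */
	assert(!mock_locked_by_last_key(&w));
	/* The block offset only grows, so check 2 still reports a write. */
	assert(mock_locked_by_next(&w));
}
```

In this model the `next`-based check answers the question we care
about, namely whether the header could already have been written,
without depending on state that is reset between sections.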
> diff --git a/t/unit-tests/t-reftable-stack.c b/t/unit-tests/t-reftable-stack.c
> index aeec195b2b1014445d71c5db39a9795017fd8ff2..b23edf18a7d75b0c2292490ad06d4dfaaa571e79 100644
> --- a/t/unit-tests/t-reftable-stack.c
> +++ b/t/unit-tests/t-reftable-stack.c
Can we maybe add a unit test that demonstrates the error?
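As a rough idea of what such a test could demonstrate, here is a
self-contained sketch. It does not use the unit test framework in
t/unit-tests or the real writer: `struct sketch_writer`, the `sketch_*`
functions, the `guarded` flag and the value of `SKETCH_API_ERROR` are
all made up for illustration. It models the header being captured with
the first block and the footer being captured on close.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_API_ERROR (-1) /* placeholder, not the library's value */

/* Hypothetical model: only the fields needed to show the mismatch. */
struct sketch_writer {
	uint64_t next; /* 0 until the first block is written */
	uint64_t min_update_index, max_update_index;
	uint64_t header_min, header_max; /* captured with the first block */
	uint64_t footer_min, footer_max; /* captured on close */
};

/* With `guarded` set, refuse changes once the first block is out. */
int sketch_set_limits(struct sketch_writer *w, uint64_t min, uint64_t max,
		      int guarded)
{
	if (guarded && w->next)
		return SKETCH_API_ERROR;
	w->min_update_index = min;
	w->max_update_index = max;
	return 0;
}

/* The first flush records the header values, later ones do not. */
void sketch_flush_block(struct sketch_writer *w, uint64_t size)
{
	if (!w->next) {
		w->header_min = w->min_update_index;
		w->header_max = w->max_update_index;
	}
	w->next += size;
}

/* Closing records the footer values from the current limits. */
void sketch_close(struct sketch_writer *w)
{
	w->footer_min = w->min_update_index;
	w->footer_max = w->max_update_index;
}

int sketch_header_matches_footer(const struct sketch_writer *w)
{
	return w->header_min == w->footer_min &&
	       w->header_max == w->footer_max;
}

/* Runs the scenario; returns 1 if header and footer agree at the end. */
int sketch_run(int guarded)
{
	struct sketch_writer w = { 0 };
	int ret;

	ret = sketch_set_limits(&w, 1, 1, guarded);
	assert(ret == 0);
	sketch_flush_block(&w, 4096);

	/* Late update: accepted without the guard, rejected with it. */
	ret = sketch_set_limits(&w, 5, 5, guarded);
	assert(ret == (guarded ? SKETCH_API_ERROR : 0));

	sketch_close(&w);
	return sketch_header_matches_footer(&w);
}
```

Under these assumptions `sketch_run(0)` ends with a header and footer
that disagree, while `sketch_run(1)` rejects the late update and keeps
them consistent. A real test would of course exercise
`reftable_writer_set_limits()` itself.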
Patrick