All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Karthik Nayak <karthik.188@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 3/3] reftable: prevent 'update_index' changes after adding records
Date: Tue, 21 Jan 2025 07:56:29 +0100	[thread overview]
Message-ID: <Z49FHQgsFSX6sTxu@pks.im> (raw)
In-Reply-To: <20250121-461-corrupted-reftable-followup-v2-3-37e26c7a79b4@gmail.com>

On Tue, Jan 21, 2025 at 04:34:12AM +0100, Karthik Nayak wrote:
> The function `reftable_writer_set_limits()` allows updating the
> 'min_update_index' and 'max_update_index' of a reftable writer. These
> values are written to both the writer's header and footer.
> 
> Since the header is written during the first block write, any subsequent
> changes to the update index would create a mismatch between the header
> and footer values. The footer would contain the newer values while the
> header retained the original ones.
> 
> To fix this bug, prevent callers from updating these values after any

Nit: it's not really fixing a bug, but protecting us against it. Not
worth a reroll though, from my point of view.

> diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
> index 5f9afa620bb00de66c311765fb0ae8c6f56401ae..1ea014d389cc47f173279e3234a82f3fcbc807a0 100644
> --- a/reftable/reftable-writer.h
> +++ b/reftable/reftable-writer.h
> @@ -124,17 +124,21 @@ int reftable_writer_new(struct reftable_writer **out,
>  			int (*flush_func)(void *),
>  			void *writer_arg, const struct reftable_write_options *opts);
>  
> -/* Set the range of update indices for the records we will add. When writing a
> -   table into a stack, the min should be at least
> -   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> -
> -   For transactional updates to a stack, typically min==max, and the
> -   update_index can be obtained by inspeciting the stack. When converting an
> -   existing ref database into a single reftable, this would be a range of
> -   update-index timestamps.
> +/*
> + * Set the range of update indices for the records we will add. When writing a
> + * table into a stack, the min should be at least
> + * reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> + *
> + * For transactional updates to a stack, typically min==max, and the
> + * update_index can be obtained by inspeciting the stack. When converting an
> + * existing ref database into a single reftable, this would be a range of
> + * update-index timestamps.
> + *
> + * The function should be called before adding any records to the writer. If not
> + * it will fail with REFTABLE_API_ERROR.
>   */

Thanks for updating this. I think the reftable library is one of those
code areas where it makes sense to sneak in a formatting fix every now
and then because its coding style is quite alien to Git's own in some
places. We could also do it all in one go, but I strongly doubt that it
would be worth the churn.

> -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> -				uint64_t max);
> +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> +			       uint64_t max);
>  
>  /*
>    Add a reftable_ref_record. The record should have names that come after

> diff --git a/reftable/writer.c b/reftable/writer.c
> index 740c98038eaf883258bef4988f78977ac7e4a75a..03acbdbcce75fd51820c5fb016bd94f0f7f4914a 100644
> --- a/reftable/writer.c
> +++ b/reftable/writer.c
> @@ -179,11 +179,20 @@ int reftable_writer_new(struct reftable_writer **out,
>  	return 0;
>  }
>  
> -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> -				uint64_t max)
> +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> +			       uint64_t max)
>  {
> +	/*
> +	 * The limits should be set before any records are added to the writer.
> +	 * Check if any records were added by checking if `last_key` was set.
> +	 */
> +	if (w->last_key.len)
> +		return REFTABLE_API_ERROR;

Hm. Using the last key feels somewhat dangerous to me as it does get
reset at times, e.g. when finishing writing the current section. It
_should_ work, but overall it just feels a tad to disconnected from the
thing that we actually want to check.

How about we instead use `next`? This variable records the offset of the
next block we're about to write, and `writer_flush_nonempty_block()`
uses it directly to check whether we're currently writing the first
block in order to decide whether it needs to write a header or not. If
it's 0, we know that we haven't written the first block yet. That feels
much closer aligned with what we're checking.

> diff --git a/t/unit-tests/t-reftable-stack.c b/t/unit-tests/t-reftable-stack.c
> index aeec195b2b1014445d71c5db39a9795017fd8ff2..b23edf18a7d75b0c2292490ad06d4dfaaa571e79 100644
> --- a/t/unit-tests/t-reftable-stack.c
> +++ b/t/unit-tests/t-reftable-stack.c

Can we maybe add a unit test that demonstrates the error?

Patrick

  reply	other threads:[~2025-01-21  6:56 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-17  7:59 [PATCH 0/3] refs: small followups to the migration corruption fix Karthik Nayak
2025-01-17  7:59 ` [PATCH 1/3] refs: mark `ref_transaction_update_reflog()` as static Karthik Nayak
2025-01-17  9:29   ` Patrick Steinhardt
2025-01-20 11:17     ` Karthik Nayak
2025-01-17  7:59 ` [PATCH 2/3] refs: use 'uint64_t' for 'ref_update.index' Karthik Nayak
2025-01-17  7:59 ` [PATCH 3/3] reftable: prevent 'update_index' changes after header write Karthik Nayak
2025-01-17  9:29   ` Patrick Steinhardt
2025-01-20 11:47     ` Karthik Nayak
2025-01-20 12:18       ` Karthik Nayak
2025-01-21  3:34 ` [PATCH v2 0/3] refs: small followups to the migration corruption fix Karthik Nayak
2025-01-21  3:34   ` [PATCH v2 1/3] refs: mark `ref_transaction_update_reflog()` as static Karthik Nayak
2025-01-21  3:34   ` [PATCH v2 2/3] refs: use 'uint64_t' for 'ref_update.index' Karthik Nayak
2025-01-21  3:34   ` [PATCH v2 3/3] reftable: prevent 'update_index' changes after adding records Karthik Nayak
2025-01-21  6:56     ` Patrick Steinhardt [this message]
2025-01-21 11:44       ` Karthik Nayak
2025-01-22  5:35   ` [PATCH v3 0/3] refs: small followups to the migration corruption fix Karthik Nayak
2025-01-22  5:35     ` [PATCH v3 1/3] refs: mark `ref_transaction_update_reflog()` as static Karthik Nayak
2025-01-22  5:35     ` [PATCH v3 2/3] refs: use 'uint64_t' for 'ref_update.index' Karthik Nayak
2025-01-22  5:35     ` [PATCH v3 3/3] reftable: prevent 'update_index' changes after adding records Karthik Nayak
2025-01-22 12:12       ` Patrick Steinhardt
2025-01-22 17:50         ` Junio C Hamano
2025-01-22 21:57           ` Junio C Hamano
2025-02-01  2:24       ` undefined behavior in unit tests, was " Jeff King
2025-02-01 10:33         ` Phillip Wood
2025-02-03  5:41           ` Patrick Steinhardt
2025-02-03 14:11             ` Junio C Hamano
2025-02-03 15:37           ` Jeff King
2025-02-03  5:40         ` Patrick Steinhardt
2025-02-03 15:20         ` Karthik Nayak
2025-02-03 15:38           ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z49FHQgsFSX6sTxu@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=karthik.188@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.