git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Justin Tobler <jltobler@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 7/8] reftable/stack: fix corruption on concurrent compaction
Date: Thu, 1 Aug 2024 10:41:04 +0200	[thread overview]
Message-ID: <ZqtKIKW-lqvb00z9@tanuki> (raw)
In-Reply-To: <bkr4qwpgpzl2nffjuz52te3vwmn3gckoahrkynw2zrqrzptwd6@7b776obncdiy>

[-- Attachment #1: Type: text/plain, Size: 3464 bytes --]

On Wed, Jul 31, 2024 at 08:04:17PM -0500, Justin Tobler wrote:
> On 24/07/31 04:15PM, Patrick Steinhardt wrote:
> > But there is a bug in the code. Suppose we have two processes which are
> > compacting two slices of the table. Given that we lock each of the
> > tables before compacting them, we know that the slices must be disjunct
> > from each other. But regardless of that, compaction performed by one
> > process will always impact what the other process needs to write to the
> > "tables.list" file.
> 
> I'm not quite sure I understand at this point how it is possible for two
> compaction operations to be performed concurrently. Wouldn't there
> always be overlap between the two compaction segments thus causing one
> of the operations to be unable to acquire all of the required locks and
> abort?

In practice we cannot assume anything about how another process compacts
tables. While we can assume something about how a particular version of
Git compacts tables, we cannot assume anything about future versions of
Git or about alternate implementations of Git. The reftable backend
allows for compacting only a subset of tables, and the heuristic is not
mandated by the on-disk format except that the tables that we are about
to compact need to be next to each other in the stack.

Furthermore, with the next patch, we also handle it gracefully when some
parts of the stack are locked already. Thus, it can easily happen that
process A compacts tables 1 to 3, whereas process B will try to compact
tables 1 to 5, fail to acquire the lock for table 3, and then reduce the
range to compact to 3 to 5.

> > changed after we have locked it for the second time in (5). This has the
> > consequence that we will always commit the old, cached in-core tables to
> > disk without paying to respect what the other process has written. This
> > scenario would then lead to data loss and corruption.
> 
> If a concurrent compaction happens though, it would mess up the indices
> and cause problems when writting the "tables.list" file. That would not
> be good.

Yup.

> > This can even happen in the simpler case of one compacting process and
> > one writing process. The newly-appended table by the writing process
> > would get discarded by the compacting process because it never sees the
> > new table.
> 
> This is indeed a problem. Since we don't reload the stack, we are
> unaware of any concurrently append tables causing them to not be
> written in the new "tables.list" file. Scary

Indeed.

> > +		/*
> > +		 * We have found the new range that we want to replace, so
> > +		 * let's update the range of tables that we want to replace.
> > +		 */
> > +		last_to_replace = last + (new_offset - first);
> > +		first_to_replace = new_offset;
> > +	} else {
> > +		REFTABLE_CALLOC_ARRAY(names, st->merged->stack_len + 1);
> 
> I was confused at first by the `stack_len` + 1. The extra element is
> NULL which tells us there are no more tables to add to the list,
> correct? It looks like `fd_read_lines()` also adds an extra element.

Yes, that's the reason why we have it. We end up passing `names` to
`free_names()`, which uses `NULL` as a sentinel value to know when to
stop iterating over the array's entries.

I'll add a comment.

Thanks for your review. I'll wait a bit longer before sending out
another version of this patch series to wait for some more feedback.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-08-01  8:41 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-31 14:14 [PATCH 0/8] reftable: improvements and fixes for compaction Patrick Steinhardt
2024-07-31 14:14 ` [PATCH 1/8] reftable/stack: refactor function to gather table sizes Patrick Steinhardt
2024-08-02 20:26   ` Junio C Hamano
2024-07-31 14:14 ` [PATCH 2/8] reftable/stack: test compaction with already-locked tables Patrick Steinhardt
2024-08-02 21:05   ` Junio C Hamano
2024-08-05 12:11     ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 3/8] reftable/stack: update stats on failed full compaction Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 4/8] reftable/stack: simplify tracking of table locks Patrick Steinhardt
2024-07-31 21:57   ` Justin Tobler
2024-08-02 23:00   ` Junio C Hamano
2024-08-05 12:11     ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 5/8] reftable/stack: do not die when fsyncing lock file files Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 6/8] reftable/stack: use lock_file when adding table to "tables.list" Patrick Steinhardt
2024-07-31 23:02   ` Justin Tobler
2024-08-01  8:40     ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 7/8] reftable/stack: fix corruption on concurrent compaction Patrick Steinhardt
2024-08-01  1:04   ` Justin Tobler
2024-08-01  8:41     ` Patrick Steinhardt [this message]
2024-07-31 14:15 ` [PATCH 8/8] reftable/stack: handle locked tables during auto-compaction Patrick Steinhardt
2024-08-02 23:24   ` Junio C Hamano
2024-08-05 12:11     ` Patrick Steinhardt
2024-08-05 13:07 ` [PATCH v2 0/9] reftable: improvements and fixes for compaction Patrick Steinhardt
2024-08-05 13:07   ` [PATCH v2 1/9] reftable/stack: refactor function to gather table sizes Patrick Steinhardt
2024-08-05 13:07   ` [PATCH v2 2/9] reftable/stack: extract function to setup stack with N tables Patrick Steinhardt
2024-08-05 13:07   ` [PATCH v2 3/9] reftable/stack: test compaction with already-locked tables Patrick Steinhardt
2024-08-08 10:46     ` Karthik Nayak
2024-08-05 13:08   ` [PATCH v2 4/9] reftable/stack: update stats on failed full compaction Patrick Steinhardt
2024-08-05 13:08   ` [PATCH v2 5/9] reftable/stack: simplify tracking of table locks Patrick Steinhardt
2024-08-05 13:08   ` [PATCH v2 6/9] reftable/stack: do not die when fsyncing lock file files Patrick Steinhardt
2024-08-05 13:08   ` [PATCH v2 7/9] reftable/stack: use lock_file when adding table to "tables.list" Patrick Steinhardt
2024-08-05 13:08   ` [PATCH v2 8/9] reftable/stack: fix corruption on concurrent compaction Patrick Steinhardt
2024-08-08 12:14     ` Karthik Nayak
2024-08-08 13:29       ` Patrick Steinhardt
2024-08-05 13:08   ` [PATCH v2 9/9] reftable/stack: handle locked tables during auto-compaction Patrick Steinhardt
2024-08-06 18:46     ` Justin Tobler
2024-08-07  5:31       ` Patrick Steinhardt
2024-08-07 19:12         ` Justin Tobler
2024-08-08 12:25     ` Karthik Nayak
2024-08-08 12:26   ` [PATCH v2 0/9] reftable: improvements and fixes for compaction Karthik Nayak
2024-08-08 14:06 ` [PATCH v3 " Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 1/9] reftable/stack: refactor function to gather table sizes Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 2/9] reftable/stack: extract function to setup stack with N tables Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 3/9] reftable/stack: test compaction with already-locked tables Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 4/9] reftable/stack: update stats on failed full compaction Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 5/9] reftable/stack: simplify tracking of table locks Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 6/9] reftable/stack: do not die when fsyncing lock file files Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 7/9] reftable/stack: use lock_file when adding table to "tables.list" Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 8/9] reftable/stack: fix corruption on concurrent compaction Patrick Steinhardt
2024-08-08 14:06   ` [PATCH v3 9/9] reftable/stack: handle locked tables during auto-compaction Patrick Steinhardt
2024-08-08 16:52   ` [PATCH v3 0/9] reftable: improvements and fixes for compaction Karthik Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZqtKIKW-lqvb00z9@tanuki \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=jltobler@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).