From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Justin Tobler <jltobler@gmail.com>, Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 0/9] reftable: improvements and fixes for compaction
Date: Mon, 5 Aug 2024 15:07:42 +0200 [thread overview]
Message-ID: <cover.1722862822.git.ps@pks.im> (raw)
In-Reply-To: <cover.1722435214.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 7432 bytes --]
Hi,
this is the second version of my patch series that aims to improve the
way reftable stack perform compaction.
Changes compared to v1:
- Extract a test helper function that sets up a stack with N tables
containing refs.
- Reuse file descriptor that we have already stored in a local
variable instead of calling `lock_file_fd()` a second time.
- Remove a no-op change in the last patch.
- Add a comment explaining why we have to allocate N+1 many table
names.
- Some typo fixes.
Thanks!
Patrick
Patrick Steinhardt (9):
reftable/stack: refactor function to gather table sizes
reftable/stack: extract function to setup stack with N tables
reftable/stack: test compaction with already-locked tables
reftable/stack: update stats on failed full compaction
reftable/stack: simplify tracking of table locks
reftable/stack: do not die when fsyncing lock file files
reftable/stack: use lock_file when adding table to "tables.list"
reftable/stack: fix corruption on concurrent compaction
reftable/stack: handle locked tables during auto-compaction
reftable/stack.c | 231 +++++++++++++++++++++++++++++--------
reftable/stack_test.c | 138 +++++++++++++++++-----
t/t0610-reftable-basics.sh | 21 ++--
3 files changed, 306 insertions(+), 84 deletions(-)
Range-diff against v1:
1: 5d99191f5c = 1: 6d2b54ba8e reftable/stack: refactor function to gather table sizes
-: ---------- > 2: ff17306cc0 reftable/stack: extract function to setup stack with N tables
2: 123fb9d80e ! 3: 63e64c8d82 reftable/stack: test compaction with already-locked tables
@@ reftable/stack_test.c: static void test_reftable_stack_auto_compaction(void)
+ err = reftable_new_stack(&st, dir, &opts);
+ EXPECT_ERR(err);
+
-+ for (size_t i = 0; i < 5; i++) {
-+ struct reftable_ref_record ref = {
-+ .update_index = reftable_stack_next_update_index(st),
-+ .value_type = REFTABLE_REF_VAL1,
-+ .value.val1 = { i },
-+ };
-+
-+ strbuf_reset(&buf);
-+ strbuf_addf(&buf, "refs/heads/branch-%04" PRIuMAX, (uintmax_t) i);
-+ ref.refname = buf.buf;
-+
-+ err = reftable_stack_add(st, &write_test_ref, &ref);
-+ EXPECT_ERR(err);
-+ }
++ write_n_ref_tables(st, &opts, 5);
+ EXPECT(st->merged->stack_len == 5);
+
+ /*
@@ reftable/stack_test.c: static void test_reftable_stack_add_performs_auto_compact
+ err = reftable_new_stack(&st, dir, &opts);
+ EXPECT_ERR(err);
+
-+ for (size_t i = 0; i < 3; i++) {
-+ struct reftable_ref_record ref = {
-+ .update_index = reftable_stack_next_update_index(st),
-+ .value_type = REFTABLE_REF_VAL1,
-+ .value.val1 = { i },
-+ };
-+
-+ strbuf_reset(&buf);
-+ strbuf_addf(&buf, "refs/heads/branch-%04" PRIuMAX, (uintmax_t) i);
-+ ref.refname = buf.buf;
-+
-+ err = reftable_stack_add(st, &write_test_ref, &ref);
-+ EXPECT_ERR(err);
-+ }
++ write_n_ref_tables(st, &opts, 3);
+ EXPECT(st->merged->stack_len == 3);
+
+ /* Lock one of the tables that we're about to compact. */
3: 1fa7acbddf = 4: 1989dafcb4 reftable/stack: update stats on failed full compaction
4: 40d9f75cf2 = 5: 73e5d104eb reftable/stack: simplify tracking of table locks
5: 9233c36894 = 6: e411e14904 reftable/stack: do not die when fsyncing lock file files
6: 9703246156 ! 7: b868a518d6 reftable/stack: use lock_file when adding table to "tables.list"
@@ Commit message
compacting the stack, we manually handle the lock when adding a new
table to it. While not wrong, it is at least inconsistent.
- Refactor the code to consistenly lock "tables.list" via the `lock_file`
+ Refactor the code to consistently lock "tables.list" via the `lock_file`
subsytem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
@@ reftable/stack.c: int reftable_addition_commit(struct reftable_addition *add)
goto done;
}
-- err = fsync_component(FSYNC_COMPONENT_REFERENCE, lock_file_fd);
-+ err = fsync_component(FSYNC_COMPONENT_REFERENCE,
-+ get_lock_file_fd(&add->tables_list_lock));
- if (err < 0) {
- err = REFTABLE_IO_ERROR;
- goto done;
- }
-
- err = rename_tempfile(&add->lock_file, add->stack->list_file);
+ err = commit_lock_file(&add->tables_list_lock);
if (err < 0) {
7: b746565bf0 ! 8: ff17414d26 reftable/stack: fix corruption on concurrent compaction
@@ Commit message
Letting concurrent processes modify the "tables.list" file while we are
doing the compaction is very much part of the design and thus expected.
After all, it may take some time to compact tables in the case where we
- are compacting a lot or very large tables.
+ are compacting a lot of very large tables.
But there is a bug in the code. Suppose we have two processes which are
compacting two slices of the table. Given that we lock each of the
@@ Commit message
process will always impact what the other process needs to write to the
"tables.list" file.
- Right now , we do not check whether the "tables.list" has been
- changed after we have locked it for the second time in (5). This has the
+ Right now, we do not check whether the "tables.list" has been changed
+ after we have locked it for the second time in (5). This has the
consequence that we will always commit the old, cached in-core tables to
disk without paying to respect what the other process has written. This
scenario would then lead to data loss and corruption.
@@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
+ last_to_replace = last + (new_offset - first);
+ first_to_replace = new_offset;
+ } else {
++ /*
++ * `fd_read_lines()` uses a `NULL` sentinel to indicate that
++ * the array is at its end. As we use `free_names()` to free
++ * the array, we need to include this sentinel value here and
++ * thus have to allocate `stack_len + 1` many entries.
++ */
+ REFTABLE_CALLOC_ARRAY(names, st->merged->stack_len + 1);
+ for (size_t i = 0; i < st->merged->stack_len; i++)
+ names[i] = xstrdup(st->readers[i]->name);
8: dc22178307 ! 9: 1ef560feb1 reftable/stack: handle locked tables during auto-compaction
@@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
}
/*
-@@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
- * delete the files after we closed them on Windows, so this needs to
- * happen first.
- */
-- err = reftable_stack_reload_maybe_reuse(st, first < last);
-+ err = reftable_stack_reload_maybe_reuse(st, first_to_replace < last_to_replace);
- if (err < 0)
- goto done;
-
@@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
static int stack_compact_range_stats(struct reftable_stack *st,
base-commit: 39bf06adf96da25b87c9aa7d35a32ef3683eb4a4
--
2.46.0.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-08-05 13:07 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-31 14:14 [PATCH 0/8] reftable: improvements and fixes for compaction Patrick Steinhardt
2024-07-31 14:14 ` [PATCH 1/8] reftable/stack: refactor function to gather table sizes Patrick Steinhardt
2024-08-02 20:26 ` Junio C Hamano
2024-07-31 14:14 ` [PATCH 2/8] reftable/stack: test compaction with already-locked tables Patrick Steinhardt
2024-08-02 21:05 ` Junio C Hamano
2024-08-05 12:11 ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 3/8] reftable/stack: update stats on failed full compaction Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 4/8] reftable/stack: simplify tracking of table locks Patrick Steinhardt
2024-07-31 21:57 ` Justin Tobler
2024-08-02 23:00 ` Junio C Hamano
2024-08-05 12:11 ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 5/8] reftable/stack: do not die when fsyncing lock file files Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 6/8] reftable/stack: use lock_file when adding table to "tables.list" Patrick Steinhardt
2024-07-31 23:02 ` Justin Tobler
2024-08-01 8:40 ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 7/8] reftable/stack: fix corruption on concurrent compaction Patrick Steinhardt
2024-08-01 1:04 ` Justin Tobler
2024-08-01 8:41 ` Patrick Steinhardt
2024-07-31 14:15 ` [PATCH 8/8] reftable/stack: handle locked tables during auto-compaction Patrick Steinhardt
2024-08-02 23:24 ` Junio C Hamano
2024-08-05 12:11 ` Patrick Steinhardt
2024-08-05 13:07 ` Patrick Steinhardt [this message]
2024-08-05 13:07 ` [PATCH v2 1/9] reftable/stack: refactor function to gather table sizes Patrick Steinhardt
2024-08-05 13:07 ` [PATCH v2 2/9] reftable/stack: extract function to setup stack with N tables Patrick Steinhardt
2024-08-05 13:07 ` [PATCH v2 3/9] reftable/stack: test compaction with already-locked tables Patrick Steinhardt
2024-08-08 10:46 ` Karthik Nayak
2024-08-05 13:08 ` [PATCH v2 4/9] reftable/stack: update stats on failed full compaction Patrick Steinhardt
2024-08-05 13:08 ` [PATCH v2 5/9] reftable/stack: simplify tracking of table locks Patrick Steinhardt
2024-08-05 13:08 ` [PATCH v2 6/9] reftable/stack: do not die when fsyncing lock file files Patrick Steinhardt
2024-08-05 13:08 ` [PATCH v2 7/9] reftable/stack: use lock_file when adding table to "tables.list" Patrick Steinhardt
2024-08-05 13:08 ` [PATCH v2 8/9] reftable/stack: fix corruption on concurrent compaction Patrick Steinhardt
2024-08-08 12:14 ` Karthik Nayak
2024-08-08 13:29 ` Patrick Steinhardt
2024-08-05 13:08 ` [PATCH v2 9/9] reftable/stack: handle locked tables during auto-compaction Patrick Steinhardt
2024-08-06 18:46 ` Justin Tobler
2024-08-07 5:31 ` Patrick Steinhardt
2024-08-07 19:12 ` Justin Tobler
2024-08-08 12:25 ` Karthik Nayak
2024-08-08 12:26 ` [PATCH v2 0/9] reftable: improvements and fixes for compaction Karthik Nayak
2024-08-08 14:06 ` [PATCH v3 " Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 1/9] reftable/stack: refactor function to gather table sizes Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 2/9] reftable/stack: extract function to setup stack with N tables Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 3/9] reftable/stack: test compaction with already-locked tables Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 4/9] reftable/stack: update stats on failed full compaction Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 5/9] reftable/stack: simplify tracking of table locks Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 6/9] reftable/stack: do not die when fsyncing lock file files Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 7/9] reftable/stack: use lock_file when adding table to "tables.list" Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 8/9] reftable/stack: fix corruption on concurrent compaction Patrick Steinhardt
2024-08-08 14:06 ` [PATCH v3 9/9] reftable/stack: handle locked tables during auto-compaction Patrick Steinhardt
2024-08-08 16:52 ` [PATCH v3 0/9] reftable: improvements and fixes for compaction Karthik Nayak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1722862822.git.ps@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jltobler@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).