From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Subject: Re: [PATCH 3/4] reftable/stack: register lockfiles during compaction
Date: Wed, 6 Mar 2024 12:59:52 +0100 [thread overview]
Message-ID: <ZehauDgP4L40QKcH@tanuki> (raw)
In-Reply-To: <mwhby7dxpiyrvknqe2uoli4ulygjy6hbxcxpqt3alw3dthwntr@4o24tp5jp6h7>
[-- Attachment #1: Type: text/plain, Size: 4131 bytes --]
On Tue, Mar 05, 2024 at 05:30:48PM -0600, Justin Tobler wrote:
> On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> > We do not register any of the locks we acquire when compacting the
> > reftable stack via our lockfiles interfaces. These locks will thus not
> > be released when Git gets killed.
> >
> > Refactor the code to register locks as lockfiles.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> > ...
> > + /*
> > + * Write the new "tables.list" contents with the compacted table we
> > + * have just written. In case the compacted table became empty we
> > + * simply skip writing it.
> > + */
> > + for (i = 0; i < first; i++)
> > + strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
> > + if (!is_empty_table)
> > + strbuf_addf(&tables_list_buf, "%s\n", new_table_name.buf);
>
> Something not really related to this patch, but I noticed and had a
> question about.
>
> If I'm understanding this correctly, when a newly compacted table is
> empty, it becomes possible for a range of indexes to no longer exist
> within the stack. If this occurs in the middle of the stack, future
> compaction will likely combine the tables on either side and restore the
> missing index range. If the empty table was at the end of the stack,
> would this effectly reset the max index to something lower for future
> tables written to the stack? If so, could this lead to issues with
> separate concurrent table writes?
Very good question indeed, but I think we should be fine here. This is
mostly because concurrent writers will notice when "tables.list" has
changed, and, if so, abort the transaction with an out-of-date error.
A few scenarios with concurrent processes, one process which compacts
the stack (C) and one which modifies it (M):
- M acquires the lock before C compacts: M sees the whole stack and
uses the latest update index to update it, resulting in a newly
written table. When C then locks afterwards, it may decide to
compact and drop some tables in the middle of the stack. This may
lead to a gap in update indices, but this is fine.
- M acquires the lock while C compacts: M sees the whole stack and
uses the latest update index to update the stack. C then acquires
the lock to write the merged tables, notices that its compacted
tables still exist and are in the same order, and thus removes them.
We now have a gap in update indices, but this is totally fine.
- M acquires the lock after C compacts: M will refresh "tables.list"
after it has acquired the lock itself. Thus, it won't ever see the
now-dropped empty table.
M cannot write its table when C has the "tables.list" lock, so this
scenario cannot happen. In the same spirit, two Ms cannot race with each
other either as only one can have the "tables.list" lock, and the other
one would abort with an out-of-date error when it has subsequently
acquired the lock and found the "tables.list" contents to have been
updated concurrently.
> > ...
> > diff --git a/reftable/system.h b/reftable/system.h
> > index 6b74a81514..5d8b6dede5 100644
> > --- a/reftable/system.h
> > +++ b/reftable/system.h
> > @@ -12,7 +12,9 @@ license that can be found in the LICENSE file or at
> > /* This header glues the reftable library to the rest of Git */
> >
> > #include "git-compat-util.h"
> > +#include "lockfile.h"
> > #include "strbuf.h"
> > +#include "tempfile.h"
> > #include "hash-ll.h" /* hash ID, sizes.*/
> > #include "dir.h" /* remove_dir_recursively, for tests.*/
>
> Naive question, why do we include the headers in `system.h`? I assume
> this is because they are common? Are there other benefits to this
> indirection?
Well, "system.h" is supposedly the glue between the common Git codebase
and the reftable library, so all Git-specific headers should be added
here instead of being added individually to the respective files in the
library. Whether that is ultimately a sensible thing and whether it
really helps us all that much is a different question though.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-03-06 11:59 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
2024-03-04 11:10 ` [PATCH 1/4] lockfile: report when rollback fails Patrick Steinhardt
2024-03-05 22:09 ` Justin Tobler
2024-03-06 12:00 ` Patrick Steinhardt
2024-03-04 11:10 ` [PATCH 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
2024-03-05 22:30 ` Justin Tobler
2024-03-06 11:59 ` Patrick Steinhardt
2024-03-06 16:34 ` Justin Tobler
2024-03-06 16:36 ` Junio C Hamano
2024-03-07 6:17 ` Patrick Steinhardt
2024-03-07 17:59 ` Junio C Hamano
2024-03-07 20:54 ` Patrick Steinhardt
2024-03-07 21:06 ` Junio C Hamano
2024-03-04 11:10 ` [PATCH 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
2024-03-05 23:30 ` Justin Tobler
2024-03-06 11:59 ` Patrick Steinhardt [this message]
2024-03-06 16:39 ` Junio C Hamano
2024-03-06 19:57 ` Justin Tobler
2024-03-04 11:10 ` [PATCH 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
2024-03-07 12:38 ` Toon claes
2024-03-07 12:58 ` Patrick Steinhardt
2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
2024-03-07 13:10 ` [PATCH v2 1/4] lockfile: report when rollback fails Patrick Steinhardt
2024-03-07 13:10 ` [PATCH v2 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
2024-03-07 13:10 ` [PATCH v2 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
2024-03-07 13:10 ` [PATCH v2 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZehauDgP4L40QKcH@tanuki \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).