git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] reftable/stack: register temporary files
@ 2024-03-04 11:10 Patrick Steinhardt
  2024-03-04 11:10 ` [PATCH 1/4] lockfile: report when rollback fails Patrick Steinhardt
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-04 11:10 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1109 bytes --]

Hi,

when writing to a reftable stack there are several kinds of temporary
files that we have to write. For one this is the new table, second the
new "tables.list.lock" file. We do not register all of these with the
tempfiles subsystem though, which can have the consequence that
interrupted writes may leave those files behind.

This patch series addresses those cases for three cases:

  - Newly written tables.

  - Newly compacted tables.

  - Pending "tables.list.lock" during compaction.

The end result should be more robust regarding interruption and leave
left cruft behind.

Patrick

Patrick Steinhardt (4):
  lockfile: report when rollback fails
  reftable/stack: register new tables as tempfiles
  reftable/stack: register lockfiles during compaction
  reftable/stack: register compacted tables as tempfiles

 lockfile.h        |   4 +-
 reftable/stack.c  | 329 ++++++++++++++++++++++------------------------
 reftable/system.h |   2 +
 tempfile.c        |  21 +--
 tempfile.h        |   2 +-
 5 files changed, 177 insertions(+), 181 deletions(-)

-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/4] lockfile: report when rollback fails
  2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
@ 2024-03-04 11:10 ` Patrick Steinhardt
  2024-03-05 22:09   ` Justin Tobler
  2024-03-04 11:10 ` [PATCH 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-04 11:10 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2959 bytes --]

We do not report to the caller when rolling back a lockfile fails, which
will be needed by the reftable compaction logic in a subsequent commit.
It also cannot really report on all errors because the function calls
`delete_tempfile()`, which doesn't return an error either.

Refactor the code so that both `delete_tempfile()` and
`rollback_lock_file()` return an error code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 lockfile.h |  4 ++--
 tempfile.c | 21 +++++++++++++--------
 tempfile.h |  2 +-
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/lockfile.h b/lockfile.h
index 90af4e66b2..4ed570d3f7 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -323,9 +323,9 @@ static inline int commit_lock_file_to(struct lock_file *lk, const char *path)
  * for a `lock_file` object that has already been committed or rolled
  * back.
  */
-static inline void rollback_lock_file(struct lock_file *lk)
+static inline int rollback_lock_file(struct lock_file *lk)
 {
-	delete_tempfile(&lk->tempfile);
+	return delete_tempfile(&lk->tempfile);
 }
 
 #endif /* LOCKFILE_H */
diff --git a/tempfile.c b/tempfile.c
index ecdebf1afb..ed88cf8431 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -50,15 +50,17 @@
 
 static VOLATILE_LIST_HEAD(tempfile_list);
 
-static void remove_template_directory(struct tempfile *tempfile,
+static int remove_template_directory(struct tempfile *tempfile,
 				      int in_signal_handler)
 {
 	if (tempfile->directory) {
 		if (in_signal_handler)
-			rmdir(tempfile->directory);
+			return rmdir(tempfile->directory);
 		else
-			rmdir_or_warn(tempfile->directory);
+			return rmdir_or_warn(tempfile->directory);
 	}
+
+	return 0;
 }
 
 static void remove_tempfiles(int in_signal_handler)
@@ -353,16 +355,19 @@ int rename_tempfile(struct tempfile **tempfile_p, const char *path)
 	return 0;
 }
 
-void delete_tempfile(struct tempfile **tempfile_p)
+int delete_tempfile(struct tempfile **tempfile_p)
 {
 	struct tempfile *tempfile = *tempfile_p;
+	int err = 0;
 
 	if (!is_tempfile_active(tempfile))
-		return;
+		return 0;
 
-	close_tempfile_gently(tempfile);
-	unlink_or_warn(tempfile->filename.buf);
-	remove_template_directory(tempfile, 0);
+	err |= close_tempfile_gently(tempfile);
+	err |= unlink_or_warn(tempfile->filename.buf);
+	err |= remove_template_directory(tempfile, 0);
 	deactivate_tempfile(tempfile);
 	*tempfile_p = NULL;
+
+	return err ? -1 : 0;
 }
diff --git a/tempfile.h b/tempfile.h
index d0413af733..2d2ae5b657 100644
--- a/tempfile.h
+++ b/tempfile.h
@@ -269,7 +269,7 @@ int reopen_tempfile(struct tempfile *tempfile);
  * `delete_tempfile()` for a `tempfile` object that has already been
  * deleted or renamed.
  */
-void delete_tempfile(struct tempfile **tempfile_p);
+int delete_tempfile(struct tempfile **tempfile_p);
 
 /*
  * Close the file descriptor and/or file pointer if they are still
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
  2024-03-04 11:10 ` [PATCH 1/4] lockfile: report when rollback fails Patrick Steinhardt
@ 2024-03-04 11:10 ` Patrick Steinhardt
  2024-03-05 22:30   ` Justin Tobler
  2024-03-04 11:10 ` [PATCH 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-04 11:10 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 3253 bytes --]

We do not register new tables which we're about to add to the stack with
the tempfile API. Those tables will thus not be deleted in case Git gets
killed.

Refactor the code to register tables as tempfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index b64e55648a..81544fbfa0 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -737,8 +737,9 @@ int reftable_addition_add(struct reftable_addition *add,
 	struct strbuf tab_file_name = STRBUF_INIT;
 	struct strbuf next_name = STRBUF_INIT;
 	struct reftable_writer *wr = NULL;
+	struct tempfile *tab_file = NULL;
 	int err = 0;
-	int tab_fd = 0;
+	int tab_fd;
 
 	strbuf_reset(&next_name);
 	format_name(&next_name, add->next_update_index, add->next_update_index);
@@ -746,17 +747,20 @@ int reftable_addition_add(struct reftable_addition *add,
 	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
 	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
 
-	tab_fd = mkstemp(temp_tab_file_name.buf);
-	if (tab_fd < 0) {
+	tab_file = mks_tempfile(temp_tab_file_name.buf);
+	if (!tab_file) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 	if (add->stack->config.default_permissions) {
-		if (chmod(temp_tab_file_name.buf, add->stack->config.default_permissions)) {
+		if (chmod(get_tempfile_path(tab_file),
+			  add->stack->config.default_permissions)) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
 	}
+	tab_fd = get_tempfile_fd(tab_file);
+
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
 				 &add->stack->config);
 	err = write_table(wr, arg);
@@ -771,14 +775,13 @@ int reftable_addition_add(struct reftable_addition *add,
 	if (err < 0)
 		goto done;
 
-	err = close(tab_fd);
-	tab_fd = 0;
+	err = close_tempfile_gently(tab_file);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
-	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	err = stack_check_addition(add->stack, get_tempfile_path(tab_file));
 	if (err < 0)
 		goto done;
 
@@ -789,14 +792,13 @@ int reftable_addition_add(struct reftable_addition *add,
 
 	format_name(&next_name, wr->min_update_index, wr->max_update_index);
 	strbuf_addstr(&next_name, ".ref");
-
 	stack_filename(&tab_file_name, add->stack, next_name.buf);
 
 	/*
 	  On windows, this relies on rand() picking a unique destination name.
 	  Maybe we should do retry loop as well?
 	 */
-	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	err = rename_tempfile(&tab_file, tab_file_name.buf);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
@@ -806,14 +808,7 @@ int reftable_addition_add(struct reftable_addition *add,
 			    add->new_tables_cap);
 	add->new_tables[add->new_tables_len++] = strbuf_detach(&next_name, NULL);
 done:
-	if (tab_fd > 0) {
-		close(tab_fd);
-		tab_fd = 0;
-	}
-	if (temp_tab_file_name.len > 0) {
-		unlink(temp_tab_file_name.buf);
-	}
-
+	delete_tempfile(&tab_file);
 	strbuf_release(&temp_tab_file_name);
 	strbuf_release(&tab_file_name);
 	strbuf_release(&next_name);
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/4] reftable/stack: register lockfiles during compaction
  2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
  2024-03-04 11:10 ` [PATCH 1/4] lockfile: report when rollback fails Patrick Steinhardt
  2024-03-04 11:10 ` [PATCH 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
@ 2024-03-04 11:10 ` Patrick Steinhardt
  2024-03-05 23:30   ` Justin Tobler
  2024-03-04 11:10 ` [PATCH 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
  2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
  4 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-04 11:10 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 10974 bytes --]

We do not register any of the locks we acquire when compacting the
reftable stack via our lockfiles interfaces. These locks will thus not
be released when Git gets killed.

Refactor the code to register locks as lockfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c  | 255 ++++++++++++++++++++++------------------------
 reftable/system.h |   2 +
 2 files changed, 123 insertions(+), 134 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 81544fbfa0..977336b7d5 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -978,212 +978,199 @@ static int stack_compact_range(struct reftable_stack *st,
 			       size_t first, size_t last,
 			       struct reftable_log_expiry_config *expiry)
 {
-	char **delete_on_success = NULL, **subtable_locks = NULL, **listp = NULL;
-	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tables_list_buf = STRBUF_INIT;
+	struct strbuf new_table_temp_path = STRBUF_INIT;
 	struct strbuf new_table_name = STRBUF_INIT;
-	struct strbuf lock_file_name = STRBUF_INIT;
-	struct strbuf ref_list_contents = STRBUF_INIT;
 	struct strbuf new_table_path = STRBUF_INIT;
-	size_t i, j, compact_count;
-	int err = 0;
-	int have_lock = 0;
-	int lock_file_fd = -1;
-	int is_empty_table = 0;
+	struct strbuf table_name = STRBUF_INIT;
+	struct lock_file tables_list_lock = LOCK_INIT;
+	struct lock_file *table_locks = NULL;
+	int is_empty_table = 0, err = 0;
+	size_t i;
 
 	if (first > last || (!expiry && first == last)) {
 		err = 0;
 		goto done;
 	}
 
-	compact_count = last - first + 1;
-	REFTABLE_CALLOC_ARRAY(delete_on_success, compact_count + 1);
-	REFTABLE_CALLOC_ARRAY(subtable_locks, compact_count + 1);
-
 	st->stats.attempts++;
 
-	strbuf_reset(&lock_file_name);
-	strbuf_addstr(&lock_file_name, st->list_file);
-	strbuf_addstr(&lock_file_name, ".lock");
-
-	lock_file_fd =
-		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0666);
-	if (lock_file_fd < 0) {
-		if (errno == EEXIST) {
+	/*
+	 * Hold the lock so that we can read "tables.list" and lock all tables
+	 * which are part of the user-specified range.
+	 */
+	err = hold_lock_file_for_update(&tables_list_lock, st->list_file,
+					LOCK_NO_DEREF);
+	if (err < 0) {
+		if (errno == EEXIST)
 			err = 1;
-		} else {
+		else
 			err = REFTABLE_IO_ERROR;
-		}
 		goto done;
 	}
-	/* Don't want to write to the lock for now.  */
-	close(lock_file_fd);
-	lock_file_fd = -1;
 
-	have_lock = 1;
 	err = stack_uptodate(st);
-	if (err != 0)
+	if (err)
 		goto done;
 
-	for (i = first, j = 0; i <= last; i++) {
-		struct strbuf subtab_file_name = STRBUF_INIT;
-		struct strbuf subtab_lock = STRBUF_INIT;
-		int sublock_file_fd = -1;
-
-		stack_filename(&subtab_file_name, st,
-			       reader_name(st->readers[i]));
-
-		strbuf_reset(&subtab_lock);
-		strbuf_addbuf(&subtab_lock, &subtab_file_name);
-		strbuf_addstr(&subtab_lock, ".lock");
+	/*
+	 * Lock all tables in the user-provided range. This is the slice of our
+	 * stack which we'll compact.
+	 */
+	REFTABLE_CALLOC_ARRAY(table_locks, last - first + 1);
+	for (i = first; i <= last; i++) {
+		stack_filename(&table_name, st, reader_name(st->readers[i]));
 
-		sublock_file_fd = open(subtab_lock.buf,
-				       O_EXCL | O_CREAT | O_WRONLY, 0666);
-		if (sublock_file_fd >= 0) {
-			close(sublock_file_fd);
-		} else if (sublock_file_fd < 0) {
-			if (errno == EEXIST) {
+		err = hold_lock_file_for_update(&table_locks[i - first],
+						table_name.buf, LOCK_NO_DEREF);
+		if (err < 0) {
+			if (errno == EEXIST)
 				err = 1;
-			} else {
+			else
 				err = REFTABLE_IO_ERROR;
-			}
+			goto done;
 		}
 
-		subtable_locks[j] = subtab_lock.buf;
-		delete_on_success[j] = subtab_file_name.buf;
-		j++;
-
-		if (err != 0)
+		/*
+		 * We need to close the lockfiles as we might otherwise easily
+		 * run into file descriptor exhaustion when we compress a lot
+		 * of tables.
+		 */
+		err = close_lock_file_gently(&table_locks[i - first]);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
 			goto done;
+		}
 	}
 
-	err = unlink(lock_file_name.buf);
-	if (err < 0)
+	/*
+	 * We have locked all tables in our range and can thus release the
+	 * "tables.list" lock while compacting the locked tables. This allows
+	 * concurrent updates to the stack to proceed.
+	 */
+	err = rollback_lock_file(&tables_list_lock);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
 		goto done;
-	have_lock = 0;
-
-	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
-				   expiry);
-	/* Compaction + tombstones can create an empty table out of non-empty
-	 * tables. */
-	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
-	if (is_empty_table) {
-		err = 0;
 	}
-	if (err < 0)
-		goto done;
 
-	lock_file_fd =
-		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0666);
-	if (lock_file_fd < 0) {
-		if (errno == EEXIST) {
+	/*
+	 * Compact the now-locked tables into a new table. Note that compacting
+	 * these tables may end up with an empty new table in case tombstones
+	 * end up cancelling out all refs in that range.
+	 */
+	err = stack_compact_locked(st, first, last, &new_table_temp_path, expiry);
+	if (err < 0) {
+		if (err != REFTABLE_EMPTY_TABLE_ERROR)
+			goto done;
+		is_empty_table = 1;
+	}
+
+	/*
+	 * Now that we have written the new, compacted table we need to re-lock
+	 * "tables.list". We'll then replace the compacted range of tables with
+	 * the new table.
+	 */
+	err = hold_lock_file_for_update(&tables_list_lock, st->list_file,
+					LOCK_NO_DEREF);
+	if (err < 0) {
+		if (errno == EEXIST)
 			err = 1;
-		} else {
+		else
 			err = REFTABLE_IO_ERROR;
-		}
 		goto done;
 	}
-	have_lock = 1;
+
 	if (st->config.default_permissions) {
-		if (chmod(lock_file_name.buf, st->config.default_permissions) < 0) {
+		if (chmod(get_lock_file_path(&tables_list_lock),
+			  st->config.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
 	}
 
-	format_name(&new_table_name, st->readers[first]->min_update_index,
-		    st->readers[last]->max_update_index);
-	strbuf_addstr(&new_table_name, ".ref");
-
-	stack_filename(&new_table_path, st, new_table_name.buf);
-
+	/*
+	 * If the resulting compacted table is not empty, then we need to move
+	 * it into place now.
+	 */
 	if (!is_empty_table) {
-		/* retry? */
-		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		format_name(&new_table_name, st->readers[first]->min_update_index,
+			    st->readers[last]->max_update_index);
+		strbuf_addstr(&new_table_name, ".ref");
+		stack_filename(&new_table_path, st, new_table_name.buf);
+
+		err = rename(new_table_temp_path.buf, new_table_path.buf);
 		if (err < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
 	}
 
-	for (i = 0; i < first; i++) {
-		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
-		strbuf_addstr(&ref_list_contents, "\n");
-	}
-	if (!is_empty_table) {
-		strbuf_addbuf(&ref_list_contents, &new_table_name);
-		strbuf_addstr(&ref_list_contents, "\n");
-	}
-	for (i = last + 1; i < st->merged->stack_len; i++) {
-		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
-		strbuf_addstr(&ref_list_contents, "\n");
-	}
-
-	err = write_in_full(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
-	if (err < 0) {
-		err = REFTABLE_IO_ERROR;
-		unlink(new_table_path.buf);
-		goto done;
-	}
-
-	err = fsync_component(FSYNC_COMPONENT_REFERENCE, lock_file_fd);
+	/*
+	 * Write the new "tables.list" contents with the compacted table we
+	 * have just written. In case the compacted table became empty we
+	 * simply skip writing it.
+	 */
+	for (i = 0; i < first; i++)
+		strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
+	if (!is_empty_table)
+		strbuf_addf(&tables_list_buf, "%s\n", new_table_name.buf);
+	for (i = last + 1; i < st->merged->stack_len; i++)
+		strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
+
+	err = write_in_full(get_lock_file_fd(&tables_list_lock),
+			    tables_list_buf.buf, tables_list_buf.len);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		unlink(new_table_path.buf);
 		goto done;
 	}
 
-	err = close(lock_file_fd);
-	lock_file_fd = -1;
+	err = fsync_component(FSYNC_COMPONENT_REFERENCE, get_lock_file_fd(&tables_list_lock));
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		unlink(new_table_path.buf);
 		goto done;
 	}
 
-	err = rename(lock_file_name.buf, st->list_file);
+	err = commit_lock_file(&tables_list_lock);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		unlink(new_table_path.buf);
 		goto done;
 	}
-	have_lock = 0;
 
-	/* Reload the stack before deleting. On windows, we can only delete the
-	   files after we closed them.
-	*/
+	/*
+	 * Reload the stack before deleting the compacted tables. We can only
+	 * delete the files after we closed them on Windows, so this needs to
+	 * happen first.
+	 */
 	err = reftable_stack_reload_maybe_reuse(st, first < last);
+	if (err < 0)
+		goto done;
 
-	listp = delete_on_success;
-	while (*listp) {
-		if (strcmp(*listp, new_table_path.buf)) {
-			unlink(*listp);
-		}
-		listp++;
+	/*
+	 * Delete the old tables. They may still be in use by concurrent
+	 * readers, so it is expected that unlinking tables may fail.
+	 */
+	for (i = first; i <= last; i++) {
+		struct lock_file *table_lock = &table_locks[i - first];
+		char *table_path = get_locked_file_path(table_lock);
+		unlink(table_path);
+		free(table_path);
 	}
 
 done:
-	free_names(delete_on_success);
+	rollback_lock_file(&tables_list_lock);
+	for (i = first; table_locks && i <= last; i++)
+		rollback_lock_file(&table_locks[i - first]);
+	reftable_free(table_locks);
 
-	if (subtable_locks) {
-		listp = subtable_locks;
-		while (*listp) {
-			unlink(*listp);
-			listp++;
-		}
-		free_names(subtable_locks);
-	}
-	if (lock_file_fd >= 0) {
-		close(lock_file_fd);
-		lock_file_fd = -1;
-	}
-	if (have_lock) {
-		unlink(lock_file_name.buf);
-	}
 	strbuf_release(&new_table_name);
 	strbuf_release(&new_table_path);
-	strbuf_release(&ref_list_contents);
-	strbuf_release(&temp_tab_file_name);
-	strbuf_release(&lock_file_name);
+	strbuf_release(&new_table_temp_path);
+	strbuf_release(&tables_list_buf);
+	strbuf_release(&table_name);
 	return err;
 }
 
diff --git a/reftable/system.h b/reftable/system.h
index 6b74a81514..5d8b6dede5 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -12,7 +12,9 @@ license that can be found in the LICENSE file or at
 /* This header glues the reftable library to the rest of Git */
 
 #include "git-compat-util.h"
+#include "lockfile.h"
 #include "strbuf.h"
+#include "tempfile.h"
 #include "hash-ll.h" /* hash ID, sizes.*/
 #include "dir.h" /* remove_dir_recursively, for tests.*/
 
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/4] reftable/stack: register compacted tables as tempfiles
  2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2024-03-04 11:10 ` [PATCH 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
@ 2024-03-04 11:10 ` Patrick Steinhardt
  2024-03-07 12:38   ` Toon claes
  2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
  4 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-04 11:10 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 4529 bytes --]

We do not register tables resulting from stack compaction with the
tempfile API. Those tables will thus not be deleted in case Git gets
killed.

Refactor the code to register compacted tables as tempfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 53 +++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 23 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 977336b7d5..40129da16c 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -827,51 +827,57 @@ uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
 
 static int stack_compact_locked(struct reftable_stack *st,
 				size_t first, size_t last,
-				struct strbuf *temp_tab,
-				struct reftable_log_expiry_config *config)
+				struct reftable_log_expiry_config *config,
+				struct tempfile **temp_table_out)
 {
 	struct strbuf next_name = STRBUF_INIT;
-	int tab_fd = -1;
+	struct strbuf table_path = STRBUF_INIT;
 	struct reftable_writer *wr = NULL;
+	struct tempfile *temp_table;
+	int temp_table_fd;
 	int err = 0;
 
 	format_name(&next_name,
 		    reftable_reader_min_update_index(st->readers[first]),
 		    reftable_reader_max_update_index(st->readers[last]));
+	stack_filename(&table_path, st, next_name.buf);
+	strbuf_addstr(&table_path, ".temp.XXXXXX");
 
-	stack_filename(temp_tab, st, next_name.buf);
-	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+	temp_table = mks_tempfile(table_path.buf);
+	if (!temp_table) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	temp_table_fd = get_tempfile_fd(temp_table);
 
-	tab_fd = mkstemp(temp_tab->buf);
 	if (st->config.default_permissions &&
-	    chmod(temp_tab->buf, st->config.default_permissions) < 0) {
+	    chmod(get_tempfile_path(temp_table), st->config.default_permissions) < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
-	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd, &st->config);
-
+	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
+				 &temp_table_fd, &st->config);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
+
 	err = reftable_writer_close(wr);
 	if (err < 0)
 		goto done;
 
-	err = close(tab_fd);
-	tab_fd = 0;
+	err = close_tempfile_gently(temp_table);
+	if (err < 0)
+		goto done;
+
+	*temp_table_out = temp_table;
+	temp_table = NULL;
 
 done:
+	delete_tempfile(&temp_table);
 	reftable_writer_free(wr);
-	if (tab_fd > 0) {
-		close(tab_fd);
-		tab_fd = 0;
-	}
-	if (err != 0 && temp_tab->len > 0) {
-		unlink(temp_tab->buf);
-		strbuf_release(temp_tab);
-	}
 	strbuf_release(&next_name);
+	strbuf_release(&table_path);
 	return err;
 }
 
@@ -979,12 +985,12 @@ static int stack_compact_range(struct reftable_stack *st,
 			       struct reftable_log_expiry_config *expiry)
 {
 	struct strbuf tables_list_buf = STRBUF_INIT;
-	struct strbuf new_table_temp_path = STRBUF_INIT;
 	struct strbuf new_table_name = STRBUF_INIT;
 	struct strbuf new_table_path = STRBUF_INIT;
 	struct strbuf table_name = STRBUF_INIT;
 	struct lock_file tables_list_lock = LOCK_INIT;
 	struct lock_file *table_locks = NULL;
+	struct tempfile *new_table = NULL;
 	int is_empty_table = 0, err = 0;
 	size_t i;
 
@@ -1059,7 +1065,7 @@ static int stack_compact_range(struct reftable_stack *st,
 	 * these tables may end up with an empty new table in case tombstones
 	 * end up cancelling out all refs in that range.
 	 */
-	err = stack_compact_locked(st, first, last, &new_table_temp_path, expiry);
+	err = stack_compact_locked(st, first, last, expiry, &new_table);
 	if (err < 0) {
 		if (err != REFTABLE_EMPTY_TABLE_ERROR)
 			goto done;
@@ -1099,7 +1105,7 @@ static int stack_compact_range(struct reftable_stack *st,
 		strbuf_addstr(&new_table_name, ".ref");
 		stack_filename(&new_table_path, st, new_table_name.buf);
 
-		err = rename(new_table_temp_path.buf, new_table_path.buf);
+		err = rename_tempfile(&new_table, new_table_path.buf);
 		if (err < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
@@ -1166,9 +1172,10 @@ static int stack_compact_range(struct reftable_stack *st,
 		rollback_lock_file(&table_locks[i - first]);
 	reftable_free(table_locks);
 
+	delete_tempfile(&new_table);
 	strbuf_release(&new_table_name);
 	strbuf_release(&new_table_path);
-	strbuf_release(&new_table_temp_path);
+
 	strbuf_release(&tables_list_buf);
 	strbuf_release(&table_name);
 	return err;
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] lockfile: report when rollback fails
  2024-03-04 11:10 ` [PATCH 1/4] lockfile: report when rollback fails Patrick Steinhardt
@ 2024-03-05 22:09   ` Justin Tobler
  2024-03-06 12:00     ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: Justin Tobler @ 2024-03-05 22:09 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> We do not report to the caller when rolling back a lockfile fails, which
> will be needed by the reftable compaction logic in a subsequent commit.
> It also cannot really report on all errors because the function calls
> `delete_tempfile()`, which doesn't return an error either.
> 
> Refactor the code so that both `delete_tempfile()` and
> `rollback_lock_file()` return an error code.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  lockfile.h |  4 ++--
>  tempfile.c | 21 +++++++++++++--------
>  tempfile.h |  2 +-
>  3 files changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/lockfile.h b/lockfile.h
> index 90af4e66b2..4ed570d3f7 100644
> --- a/lockfile.h
> +++ b/lockfile.h
> @@ -323,9 +323,9 @@ static inline int commit_lock_file_to(struct lock_file *lk, const char *path)
>   * for a `lock_file` object that has already been committed or rolled
>   * back.
>   */
> -static inline void rollback_lock_file(struct lock_file *lk)
> +static inline int rollback_lock_file(struct lock_file *lk)
>  {
> -	delete_tempfile(&lk->tempfile);
> +	return delete_tempfile(&lk->tempfile);
>  }

question: For a lockfile that is already committed or rolled back, is
the underlying tempfile still active? I'm trying to figure out if it
possible for an error to be returned in this scenerio. If not, it might
be nice to clarify in the comment above that, in addition to being a
NOOP, no error is returned.

-Justin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-04 11:10 ` [PATCH 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
@ 2024-03-05 22:30   ` Justin Tobler
  2024-03-06 11:59     ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: Justin Tobler @ 2024-03-05 22:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> We do not register new tables which we're about to add to the stack with
> the tempfile API. Those tables will thus not be deleted in case Git gets
> killed.
> 
> Refactor the code to register tables as tempfiles.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  reftable/stack.c | 29 ++++++++++++-----------------
>  1 file changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/reftable/stack.c b/reftable/stack.c
> index b64e55648a..81544fbfa0 100644
> --- a/reftable/stack.c
> +++ b/reftable/stack.c
> @@ -737,8 +737,9 @@ int reftable_addition_add(struct reftable_addition *add,
>  	struct strbuf tab_file_name = STRBUF_INIT;
>  	struct strbuf next_name = STRBUF_INIT;
>  	struct reftable_writer *wr = NULL;
> +	struct tempfile *tab_file = NULL;
>  	int err = 0;
> -	int tab_fd = 0;
> +	int tab_fd;
>  
>  	strbuf_reset(&next_name);
>  	format_name(&next_name, add->next_update_index, add->next_update_index);
> @@ -746,17 +747,20 @@ int reftable_addition_add(struct reftable_addition *add,
>  	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
>  	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
>  
> -	tab_fd = mkstemp(temp_tab_file_name.buf);
> -	if (tab_fd < 0) {
> +	tab_file = mks_tempfile(temp_tab_file_name.buf);
> +	if (!tab_file) {
>  		err = REFTABLE_IO_ERROR;
>  		goto done;
>  	}
>  	if (add->stack->config.default_permissions) {
> -		if (chmod(temp_tab_file_name.buf, add->stack->config.default_permissions)) {
> +		if (chmod(get_tempfile_path(tab_file),
> +			  add->stack->config.default_permissions)) {
>  			err = REFTABLE_IO_ERROR;
>  			goto done;
>  		}
>  	}

Since the tempfile is now being created through the tempfile API, I
think the file mode can be set directly through `mks_tempfile_m()`
instead of creating the tempfile and then using chmod. Just something I
thought to mention.

> +	tab_fd = get_tempfile_fd(tab_file);

-Justin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/4] reftable/stack: register lockfiles during compaction
  2024-03-04 11:10 ` [PATCH 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
@ 2024-03-05 23:30   ` Justin Tobler
  2024-03-06 11:59     ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: Justin Tobler @ 2024-03-05 23:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> We do not register any of the locks we acquire when compacting the
> reftable stack via our lockfiles interfaces. These locks will thus not
> be released when Git gets killed.
> 
> Refactor the code to register locks as lockfiles.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> ...
> +	/*
> +	 * Write the new "tables.list" contents with the compacted table we
> +	 * have just written. In case the compacted table became empty we
> +	 * simply skip writing it.
> +	 */
> +	for (i = 0; i < first; i++)
> +		strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
> +	if (!is_empty_table)
> +		strbuf_addf(&tables_list_buf, "%s\n", new_table_name.buf);

Something not really related to this patch, but I noticed and had a
question about.

If I'm understanding this correctly, when a newly compacted table is
empty, it becomes possible for a range of indexes to no longer exist 
within the stack. If this occurs in the middle of the stack, future
compaction will likely combine the tables on either side and restore the
missing index range. If the empty table was at the end of the stack,
would this effectly reset the max index to something lower for future
tables written to the stack? If so, could this lead to issues with
separate concurrent table writes?

> ...  
> diff --git a/reftable/system.h b/reftable/system.h
> index 6b74a81514..5d8b6dede5 100644
> --- a/reftable/system.h
> +++ b/reftable/system.h
> @@ -12,7 +12,9 @@ license that can be found in the LICENSE file or at
>  /* This header glues the reftable library to the rest of Git */
>  
>  #include "git-compat-util.h"
> +#include "lockfile.h"
>  #include "strbuf.h"
> +#include "tempfile.h"
>  #include "hash-ll.h" /* hash ID, sizes.*/
>  #include "dir.h" /* remove_dir_recursively, for tests.*/

Naive question, why do we include the headers in `system.h`? I assume
this is because they are common? Are there other benefits to this
indirection?

-Justin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/4] reftable/stack: register lockfiles during compaction
  2024-03-05 23:30   ` Justin Tobler
@ 2024-03-06 11:59     ` Patrick Steinhardt
  2024-03-06 16:39       ` Junio C Hamano
  2024-03-06 19:57       ` Justin Tobler
  0 siblings, 2 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-06 11:59 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 4131 bytes --]

On Tue, Mar 05, 2024 at 05:30:48PM -0600, Justin Tobler wrote:
> On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> > We do not register any of the locks we acquire when compacting the
> > reftable stack via our lockfiles interfaces. These locks will thus not
> > be released when Git gets killed.
> > 
> > Refactor the code to register locks as lockfiles.
> > 
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> > ...
> > +	/*
> > +	 * Write the new "tables.list" contents with the compacted table we
> > +	 * have just written. In case the compacted table became empty we
> > +	 * simply skip writing it.
> > +	 */
> > +	for (i = 0; i < first; i++)
> > +		strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
> > +	if (!is_empty_table)
> > +		strbuf_addf(&tables_list_buf, "%s\n", new_table_name.buf);
> 
> Something not really related to this patch, but I noticed and had a
> question about.
> 
> If I'm understanding this correctly, when a newly compacted table is
> empty, it becomes possible for a range of indexes to no longer exist 
> within the stack. If this occurs in the middle of the stack, future
> compaction will likely combine the tables on either side and restore the
> missing index range. If the empty table was at the end of the stack,
> would this effectly reset the max index to something lower for future
> tables written to the stack? If so, could this lead to issues with
> separate concurrent table writes?

Very good question indeed, but I think we should be fine here. This is
mostly because concurrent writers will notice when "tables.list" has
changed, and, if so, abort the transaction with an out-of-date error.

A few scenarios with concurrent processes, one process which compacts
the stack (C) and one which modifies it (M):

  - M acquires the lock before C compacts: M sees the whole stack and
    uses the latest update index to update it, resulting in a newly
    written table. When C then locks afterwards, it may decide to
    compact and drop some tables in the middle of the stack. This may
    lead to a gap in update indices, but this is fine.

  - M acquires the lock while C compacts: M sees the whole stack and
    uses the latest update index to update the stack. C then acquires
    the lock to write the merged tables, notices that its compacted
    tables still exist and are in the same order, and thus removes them.
    We now have a gap in update indices, but this is totally fine.

  - M acquires the lock after C compacts: M will refresh "tables.list"
    after it has acquired the lock itself. Thus, it won't ever see the
    now-dropped empty table.

M cannot write its table when C has the "tables.list" lock, so this
scenario cannot happen. In the same spirit, two Ms cannot race with each
other either as only one can have the "tables.list" lock, and the other
one would abort with an out-of-date error when it has subsequently
acquired the lock and found the "tables.list" contents to have been
updated concurrently.

> > ...  
> > diff --git a/reftable/system.h b/reftable/system.h
> > index 6b74a81514..5d8b6dede5 100644
> > --- a/reftable/system.h
> > +++ b/reftable/system.h
> > @@ -12,7 +12,9 @@ license that can be found in the LICENSE file or at
> >  /* This header glues the reftable library to the rest of Git */
> >  
> >  #include "git-compat-util.h"
> > +#include "lockfile.h"
> >  #include "strbuf.h"
> > +#include "tempfile.h"
> >  #include "hash-ll.h" /* hash ID, sizes.*/
> >  #include "dir.h" /* remove_dir_recursively, for tests.*/
> 
> Naive question, why do we include the headers in `system.h`? I assume
> this is because they are common? Are there other benefits to this
> indirection?

Well, "system.h" is supposedly the glue between the common Git codebase
and the reftable library, so all Git-specific headers should be added
here instead of being added individually to the respective files in the
library. Whether that is ultimately a sensible thing and whether it
really helps us all that much is a different question though.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-05 22:30   ` Justin Tobler
@ 2024-03-06 11:59     ` Patrick Steinhardt
  2024-03-06 16:34       ` Justin Tobler
  2024-03-06 16:36       ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-06 11:59 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2409 bytes --]

On Tue, Mar 05, 2024 at 04:30:18PM -0600, Justin Tobler wrote:
> On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> > We do not register new tables which we're about to add to the stack with
> > the tempfile API. Those tables will thus not be deleted in case Git gets
> > killed.
> > 
> > Refactor the code to register tables as tempfiles.
> > 
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  reftable/stack.c | 29 ++++++++++++-----------------
> >  1 file changed, 12 insertions(+), 17 deletions(-)
> > 
> > diff --git a/reftable/stack.c b/reftable/stack.c
> > index b64e55648a..81544fbfa0 100644
> > --- a/reftable/stack.c
> > +++ b/reftable/stack.c
> > @@ -737,8 +737,9 @@ int reftable_addition_add(struct reftable_addition *add,
> >  	struct strbuf tab_file_name = STRBUF_INIT;
> >  	struct strbuf next_name = STRBUF_INIT;
> >  	struct reftable_writer *wr = NULL;
> > +	struct tempfile *tab_file = NULL;
> >  	int err = 0;
> > -	int tab_fd = 0;
> > +	int tab_fd;
> >  
> >  	strbuf_reset(&next_name);
> >  	format_name(&next_name, add->next_update_index, add->next_update_index);
> > @@ -746,17 +747,20 @@ int reftable_addition_add(struct reftable_addition *add,
> >  	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
> >  	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
> >  
> > -	tab_fd = mkstemp(temp_tab_file_name.buf);
> > -	if (tab_fd < 0) {
> > +	tab_file = mks_tempfile(temp_tab_file_name.buf);
> > +	if (!tab_file) {
> >  		err = REFTABLE_IO_ERROR;
> >  		goto done;
> >  	}
> >  	if (add->stack->config.default_permissions) {
> > -		if (chmod(temp_tab_file_name.buf, add->stack->config.default_permissions)) {
> > +		if (chmod(get_tempfile_path(tab_file),
> > +			  add->stack->config.default_permissions)) {
> >  			err = REFTABLE_IO_ERROR;
> >  			goto done;
> >  		}
> >  	}
> 
> Since the tempfile is now being created through the tempfile API, I
> think the file mode can be set directly through `mks_tempfile_m()`
> instead of creating the tempfile and then using chmod. Just something I
> thought to mention.

Unfortunately not. The problem is that `mks_tempfile_m()` will munge
passed-in permissions via "core.sharedRepository", but we already pre
calculated the target mode in `config.default_permissions`. Thus, the
result would have wrong permissions if we used `mks_tempfile_m()`.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/4] lockfile: report when rollback fails
  2024-03-05 22:09   ` Justin Tobler
@ 2024-03-06 12:00     ` Patrick Steinhardt
  0 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-06 12:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1924 bytes --]

On Tue, Mar 05, 2024 at 04:09:07PM -0600, Justin Tobler wrote:
> On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> > We do not report to the caller when rolling back a lockfile fails, which
> > will be needed by the reftable compaction logic in a subsequent commit.
> > It also cannot really report on all errors because the function calls
> > `delete_tempfile()`, which doesn't return an error either.
> > 
> > Refactor the code so that both `delete_tempfile()` and
> > `rollback_lock_file()` return an error code.
> > 
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  lockfile.h |  4 ++--
> >  tempfile.c | 21 +++++++++++++--------
> >  tempfile.h |  2 +-
> >  3 files changed, 16 insertions(+), 11 deletions(-)
> > 
> > diff --git a/lockfile.h b/lockfile.h
> > index 90af4e66b2..4ed570d3f7 100644
> > --- a/lockfile.h
> > +++ b/lockfile.h
> > @@ -323,9 +323,9 @@ static inline int commit_lock_file_to(struct lock_file *lk, const char *path)
> >   * for a `lock_file` object that has already been committed or rolled
> >   * back.
> >   */
> > -static inline void rollback_lock_file(struct lock_file *lk)
> > +static inline int rollback_lock_file(struct lock_file *lk)
> >  {
> > -	delete_tempfile(&lk->tempfile);
> > +	return delete_tempfile(&lk->tempfile);
> >  }
> 
> question: For a lockfile that is already committed or rolled back, is
> the underlying tempfile still active? I'm trying to figure out if it
> possible for an error to be returned in this scenerio. If not, it might
> be nice to clarify in the comment above that, in addition to being a
> NOOP, no error is returned.

No, it's not. I thought that it being a no-op should be clear enough,
but on the other hand it doesn't hurt either to clarify even further.

I'll refrain from sending a v2 only to add this comment as there haven't
been any other things that need to be addressed yet.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-06 11:59     ` Patrick Steinhardt
@ 2024-03-06 16:34       ` Justin Tobler
  2024-03-06 16:36       ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Justin Tobler @ 2024-03-06 16:34 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/03/06 12:59PM, Patrick Steinhardt wrote:
> On Tue, Mar 05, 2024 at 04:30:18PM -0600, Justin Tobler wrote:
> > On 24/03/04 12:10PM, Patrick Steinhardt wrote:
> > > We do not register new tables which we're about to add to the stack with
> > > the tempfile API. Those tables will thus not be deleted in case Git gets
> > > killed.
> > > 
> > > Refactor the code to register tables as tempfiles.
> > > 
> > > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > > ---
> > >  reftable/stack.c | 29 ++++++++++++-----------------
> > >  1 file changed, 12 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/reftable/stack.c b/reftable/stack.c
> > > index b64e55648a..81544fbfa0 100644
> > > --- a/reftable/stack.c
> > > +++ b/reftable/stack.c
> > > @@ -737,8 +737,9 @@ int reftable_addition_add(struct reftable_addition *add,
> > >  	struct strbuf tab_file_name = STRBUF_INIT;
> > >  	struct strbuf next_name = STRBUF_INIT;
> > >  	struct reftable_writer *wr = NULL;
> > > +	struct tempfile *tab_file = NULL;
> > >  	int err = 0;
> > > -	int tab_fd = 0;
> > > +	int tab_fd;
> > >  
> > >  	strbuf_reset(&next_name);
> > >  	format_name(&next_name, add->next_update_index, add->next_update_index);
> > > @@ -746,17 +747,20 @@ int reftable_addition_add(struct reftable_addition *add,
> > >  	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
> > >  	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
> > >  
> > > -	tab_fd = mkstemp(temp_tab_file_name.buf);
> > > -	if (tab_fd < 0) {
> > > +	tab_file = mks_tempfile(temp_tab_file_name.buf);
> > > +	if (!tab_file) {
> > >  		err = REFTABLE_IO_ERROR;
> > >  		goto done;
> > >  	}
> > >  	if (add->stack->config.default_permissions) {
> > > -		if (chmod(temp_tab_file_name.buf, add->stack->config.default_permissions)) {
> > > +		if (chmod(get_tempfile_path(tab_file),
> > > +			  add->stack->config.default_permissions)) {
> > >  			err = REFTABLE_IO_ERROR;
> > >  			goto done;
> > >  		}
> > >  	}
> > 
> > Since the tempfile is now being created through the tempfile API, I
> > think the file mode can be set directly through `mks_tempfile_m()`
> > instead of creating the tempfile and then using chmod. Just something I
> > thought to mention.
> 
> Unfortunately not. The problem is that `mks_tempfile_m()` will munge
> passed-in permissions via "core.sharedRepository", but we already pre
> calculated the target mode in `config.default_permissions`. Thus, the
> result would have wrong permissions if we used `mks_tempfile_m()`.

Ah that makes sense. Thanks for the explanation!

-Justin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-06 11:59     ` Patrick Steinhardt
  2024-03-06 16:34       ` Justin Tobler
@ 2024-03-06 16:36       ` Junio C Hamano
  2024-03-07  6:17         ` Patrick Steinhardt
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2024-03-06 16:36 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

>> Since the tempfile is now being created through the tempfile API, I
>> think the file mode can be set directly through `mks_tempfile_m()`
>> instead of creating the tempfile and then using chmod. Just something I
>> thought to mention.
>
> Unfortunately not. The problem is that `mks_tempfile_m()` will munge
> passed-in permissions via "core.sharedRepository", but we already pre
> calculated the target mode in `config.default_permissions`. Thus, the
> result would have wrong permissions if we used `mks_tempfile_m()`.

I somehow found that default_permissions thing always disturbing.

Even if we keep a separate mechanism for determining the file
permission (perhaps in order to give ourselves a better separation
as "an independent library" from the rest of Git), shouldn't the
permission setting that is computed by the mechanism and stored in
config.default_permissions be consistent with the permission the
rest of git computes based on core.sharedRepository?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/4] reftable/stack: register lockfiles during compaction
  2024-03-06 11:59     ` Patrick Steinhardt
@ 2024-03-06 16:39       ` Junio C Hamano
  2024-03-06 19:57       ` Justin Tobler
  1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2024-03-06 16:39 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

>> Naive question, why do we include the headers in `system.h`? I assume
>> this is because they are common? Are there other benefits to this
>> indirection?
>
> Well, "system.h" is supposedly the glue between the common Git codebase
> and the reftable library, so all Git-specific headers should be added
> here instead of being added individually to the respective files in the
> library. Whether that is ultimately a sensible thing and whether it
> really helps us all that much is a different question though.

That matches my understanding of what have been done in reftable/
directory.  If a project other than Git wants to use the reftable
code, they only need to prepare a shim and write their own
"system.h" to provide services equivalent to what Git supplies.

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/4] reftable/stack: register lockfiles during compaction
  2024-03-06 11:59     ` Patrick Steinhardt
  2024-03-06 16:39       ` Junio C Hamano
@ 2024-03-06 19:57       ` Justin Tobler
  1 sibling, 0 replies; 26+ messages in thread
From: Justin Tobler @ 2024-03-06 19:57 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On 24/03/06 12:59PM, Patrick Steinhardt wrote:

> > Something not really related to this patch, but I noticed and had a
> > question about.
> > 
> > If I'm understanding this correctly, when a newly compacted table is
> > empty, it becomes possible for a range of indexes to no longer exist 
> > within the stack. If this occurs in the middle of the stack, future
> > compaction will likely combine the tables on either side and restore the
> > missing index range. If the empty table was at the end of the stack,
> > would this effectly reset the max index to something lower for future
> > tables written to the stack? If so, could this lead to issues with
> > separate concurrent table writes?
> 
> Very good question indeed, but I think we should be fine here. This is
> mostly because concurrent writers will notice when "tables.list" has
> changed, and, if so, abort the transaction with an out-of-date error.
> 
> A few scenarios with concurrent processes, one process which compacts
> the stack (C) and one which modifies it (M):
> 
>   - M acquires the lock before C compacts: M sees the whole stack and
>     uses the latest update index to update it, resulting in a newly
>     written table. When C then locks afterwards, it may decide to
>     compact and drop some tables in the middle of the stack. This may
>     lead to a gap in update indices, but this is fine.
> 
>   - M acquires the lock while C compacts: M sees the whole stack and
>     uses the latest update index to update the stack. C then acquires
>     the lock to write the merged tables, notices that its compacted
>     tables still exist and are in the same order, and thus removes them.
>     We now have a gap in update indices, but this is totally fine.
> 
>   - M acquires the lock after C compacts: M will refresh "tables.list"
>     after it has acquired the lock itself. Thus, it won't ever see the
>     now-dropped empty table.
> 
> M cannot write its table when C has the "tables.list" lock, so this
> scenario cannot happen. In the same spirit, two Ms cannot race with each
> other either as only one can have the "tables.list" lock, and the other
> one would abort with an out-of-date error when it has subsequently
> acquired the lock and found the "tables.list" contents to have been
> updated concurrently.

Thanks Patrick for the great explanation! Digging into this a bit
further, I see that we return `REFTABLE_LOCK_ERROR` when the list file
lock already exists or has changed when attempting to add a new table to
the list. 

When performing compaction in `stack_compact_range()`, after initially
acquiring the table list lock, we also check if the stack is up-to-date
with `stack_uptodate()`. I noticed that this check is not performed
again after the table list is locked for the second time. At first I
thought this could be problematic, but I realized that this would only
be an issue for concurrent compactions and because the tables are locked
it should not matter.

-Justin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-06 16:36       ` Junio C Hamano
@ 2024-03-07  6:17         ` Patrick Steinhardt
  2024-03-07 17:59           ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07  6:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]

On Wed, Mar 06, 2024 at 08:36:25AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> Since the tempfile is now being created through the tempfile API, I
> >> think the file mode can be set directly through `mks_tempfile_m()`
> >> instead of creating the tempfile and then using chmod. Just something I
> >> thought to mention.
> >
> > Unfortunately not. The problem is that `mks_tempfile_m()` will munge
> > passed-in permissions via "core.sharedRepository", but we already pre
> > calculated the target mode in `config.default_permissions`. Thus, the
> > result would have wrong permissions if we used `mks_tempfile_m()`.
> 
> I somehow found that default_permissions thing always disturbing.
> 
> Even if we keep a separate mechanism for determining the file
> permission (perhaps in order to give ourselves a better separation
> as "an independent library" from the rest of Git), shouldn't the
> permission setting that is computed by the mechanism and stored in
> config.default_permissions be consistent with the permission the
> rest of git computes based on core.sharedRepository?

It is consistent. The problem is rather that `mks_tempfile_m()` takes a
mode as input, but still ends up applying the umask to that mode. Thus,
using that function without a subsequent call to chmod(3P) would end up
mishandling "core.sharedRepository".

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] reftable/stack: register compacted tables as tempfiles
  2024-03-04 11:10 ` [PATCH 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
@ 2024-03-07 12:38   ` Toon claes
  2024-03-07 12:58     ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: Toon claes @ 2024-03-07 12:38 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

> diff --git a/reftable/stack.c b/reftable/stack.c
> index 977336b7d5..40129da16c 100644
> --- a/reftable/stack.c
> +++ b/reftable/stack.c
> @@ -827,51 +827,57 @@ uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
>  
>  static int stack_compact_locked(struct reftable_stack *st,
>  				size_t first, size_t last,
> -				struct strbuf *temp_tab,
> -				struct reftable_log_expiry_config *config)
> +				struct reftable_log_expiry_config *config,
> +				struct tempfile **temp_table_out)
>  {
>  	struct strbuf next_name = STRBUF_INIT;
> -	int tab_fd = -1;
> +	struct strbuf table_path = STRBUF_INIT;
>  	struct reftable_writer *wr = NULL;
> +	struct tempfile *temp_table;
> +	int temp_table_fd;

Just one small nit, if you don't mind? In PATCH 2/4 you use
`struct tempfile *tab_file` and `int tab_fd`. I would like to see
consistency and use similar names. Personally I don't like table being
shortened to "tab", and I think you feel the same as you've renamed the
parameter from this function.


-- 
Toon

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/4] reftable/stack: register compacted tables as tempfiles
  2024-03-07 12:38   ` Toon claes
@ 2024-03-07 12:58     ` Patrick Steinhardt
  0 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 12:58 UTC (permalink / raw)
  To: Toon claes; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1673 bytes --]

On Thu, Mar 07, 2024 at 01:38:56PM +0100, Toon claes wrote:
> > diff --git a/reftable/stack.c b/reftable/stack.c
> > index 977336b7d5..40129da16c 100644
> > --- a/reftable/stack.c
> > +++ b/reftable/stack.c
> > @@ -827,51 +827,57 @@ uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
> >  
> >  static int stack_compact_locked(struct reftable_stack *st,
> >  				size_t first, size_t last,
> > -				struct strbuf *temp_tab,
> > -				struct reftable_log_expiry_config *config)
> > +				struct reftable_log_expiry_config *config,
> > +				struct tempfile **temp_table_out)
> >  {
> >  	struct strbuf next_name = STRBUF_INIT;
> > -	int tab_fd = -1;
> > +	struct strbuf table_path = STRBUF_INIT;
> >  	struct reftable_writer *wr = NULL;
> > +	struct tempfile *temp_table;
> > +	int temp_table_fd;
> 
> Just one small nit, if you don't mind? In PATCH 2/4 you use
> `struct tempfile *tab_file` and `int tab_fd`. I would like to see
> consistency and use similar names. Personally I don't like table being
> shortened to "tab", and I think you feel the same as you've renamed the
> parameter from this function.

Yeah, I'm always a proponent of descriptive, unabbreviated names and
would thus rather want to use "temp_table" or something like that. But
in 2/4 I decided to stick with "tab_file" because there is a bunch of
already existing variables in that function which are called similar --
"temp_tab_file_name", "tab_file_name" and "tab_fd". But renaming all of
them to ensure function-level consistency would create a lot of churn.

I'll thus instead rename variables over here to their abbreviated
versions.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 0/4] reftable/stack: register temporary files
  2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2024-03-04 11:10 ` [PATCH 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
@ 2024-03-07 13:10 ` Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 1/4] lockfile: report when rollback fails Patrick Steinhardt
                     ` (3 more replies)
  4 siblings, 4 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 13:10 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon claes, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 5192 bytes --]

Hi,

this is the second version of my patch series that registers temporary
files written by the reftable library with the tempfile framework. This
ensures that those files get cleaned up even when Git dies or otherwise
gets signalled.

Changes compared to v1:

  - Patch 1: Clarify that rolling back a deactivated lockfile will not
    result in an error.

  - Patch 4: Rename some local variables for improved consistency with
    other code.

Patrick

Patrick Steinhardt (4):
  lockfile: report when rollback fails
  reftable/stack: register new tables as tempfiles
  reftable/stack: register lockfiles during compaction
  reftable/stack: register compacted tables as tempfiles

 lockfile.h        |   6 +-
 reftable/stack.c  | 330 ++++++++++++++++++++++------------------------
 reftable/system.h |   2 +
 tempfile.c        |  21 +--
 tempfile.h        |   2 +-
 5 files changed, 178 insertions(+), 183 deletions(-)

Range-diff against v1:
1:  1acaa9ca1a ! 1:  782e96a678 lockfile: report when rollback fails
    @@ Commit message
     
      ## lockfile.h ##
     @@ lockfile.h: static inline int commit_lock_file_to(struct lock_file *lk, const char *path)
    +  * Roll back `lk`: close the file descriptor and/or file pointer and
    +  * remove the lockfile. It is a NOOP to call `rollback_lock_file()`
       * for a `lock_file` object that has already been committed or rolled
    -  * back.
    +- * back.
    ++ * back. No error will be returned in this case.
       */
     -static inline void rollback_lock_file(struct lock_file *lk)
     +static inline int rollback_lock_file(struct lock_file *lk)
2:  02bf41d419 = 2:  5dbc93d5be reftable/stack: register new tables as tempfiles
3:  45b5c3167f = 3:  c88c85443e reftable/stack: register lockfiles during compaction
4:  b952d54a05 ! 4:  4023d78f08 reftable/stack: register compacted tables as tempfiles
    @@ reftable/stack.c: uint64_t reftable_stack_next_update_index(struct reftable_stac
     -				struct strbuf *temp_tab,
     -				struct reftable_log_expiry_config *config)
     +				struct reftable_log_expiry_config *config,
    -+				struct tempfile **temp_table_out)
    ++				struct tempfile **tab_file_out)
      {
      	struct strbuf next_name = STRBUF_INIT;
     -	int tab_fd = -1;
    -+	struct strbuf table_path = STRBUF_INIT;
    ++	struct strbuf tab_file_path = STRBUF_INIT;
      	struct reftable_writer *wr = NULL;
    -+	struct tempfile *temp_table;
    -+	int temp_table_fd;
    - 	int err = 0;
    +-	int err = 0;
    ++	struct tempfile *tab_file;
    ++	int tab_fd, err = 0;
      
      	format_name(&next_name,
      		    reftable_reader_min_update_index(st->readers[first]),
      		    reftable_reader_max_update_index(st->readers[last]));
    -+	stack_filename(&table_path, st, next_name.buf);
    -+	strbuf_addstr(&table_path, ".temp.XXXXXX");
    ++	stack_filename(&tab_file_path, st, next_name.buf);
    ++	strbuf_addstr(&tab_file_path, ".temp.XXXXXX");
      
     -	stack_filename(temp_tab, st, next_name.buf);
     -	strbuf_addstr(temp_tab, ".temp.XXXXXX");
    -+	temp_table = mks_tempfile(table_path.buf);
    -+	if (!temp_table) {
    ++	tab_file = mks_tempfile(tab_file_path.buf);
    ++	if (!tab_file) {
     +		err = REFTABLE_IO_ERROR;
     +		goto done;
     +	}
    -+	temp_table_fd = get_tempfile_fd(temp_table);
    ++	tab_fd = get_tempfile_fd(tab_file);
      
     -	tab_fd = mkstemp(temp_tab->buf);
      	if (st->config.default_permissions &&
     -	    chmod(temp_tab->buf, st->config.default_permissions) < 0) {
    -+	    chmod(get_tempfile_path(temp_table), st->config.default_permissions) < 0) {
    ++	    chmod(get_tempfile_path(tab_file), st->config.default_permissions) < 0) {
      		err = REFTABLE_IO_ERROR;
      		goto done;
      	}
    @@ reftable/stack.c: uint64_t reftable_stack_next_update_index(struct reftable_stac
     -	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd, &st->config);
     -
     +	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
    -+				 &temp_table_fd, &st->config);
    ++				 &tab_fd, &st->config);
      	err = stack_write_compact(st, wr, first, last, config);
      	if (err < 0)
      		goto done;
    @@ reftable/stack.c: uint64_t reftable_stack_next_update_index(struct reftable_stac
      
     -	err = close(tab_fd);
     -	tab_fd = 0;
    -+	err = close_tempfile_gently(temp_table);
    ++	err = close_tempfile_gently(tab_file);
     +	if (err < 0)
     +		goto done;
     +
    -+	*temp_table_out = temp_table;
    -+	temp_table = NULL;
    ++	*tab_file_out = tab_file;
    ++	tab_file = NULL;
      
      done:
    -+	delete_tempfile(&temp_table);
    ++	delete_tempfile(&tab_file);
      	reftable_writer_free(wr);
     -	if (tab_fd > 0) {
     -		close(tab_fd);
    @@ reftable/stack.c: uint64_t reftable_stack_next_update_index(struct reftable_stac
     -		strbuf_release(temp_tab);
     -	}
      	strbuf_release(&next_name);
    -+	strbuf_release(&table_path);
    ++	strbuf_release(&tab_file_path);
      	return err;
      }
      
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/4] lockfile: report when rollback fails
  2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
@ 2024-03-07 13:10   ` Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 13:10 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon claes, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 3155 bytes --]

We do not report to the caller when rolling back a lockfile fails, which
will be needed by the reftable compaction logic in a subsequent commit.
It also cannot really report on all errors because the function calls
`delete_tempfile()`, which doesn't return an error either.

Refactor the code so that both `delete_tempfile()` and
`rollback_lock_file()` return an error code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 lockfile.h |  6 +++---
 tempfile.c | 21 +++++++++++++--------
 tempfile.h |  2 +-
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/lockfile.h b/lockfile.h
index 90af4e66b2..1bb9926497 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -321,11 +321,11 @@ static inline int commit_lock_file_to(struct lock_file *lk, const char *path)
  * Roll back `lk`: close the file descriptor and/or file pointer and
  * remove the lockfile. It is a NOOP to call `rollback_lock_file()`
  * for a `lock_file` object that has already been committed or rolled
- * back.
+ * back. No error will be returned in this case.
  */
-static inline void rollback_lock_file(struct lock_file *lk)
+static inline int rollback_lock_file(struct lock_file *lk)
 {
-	delete_tempfile(&lk->tempfile);
+	return delete_tempfile(&lk->tempfile);
 }
 
 #endif /* LOCKFILE_H */
diff --git a/tempfile.c b/tempfile.c
index ecdebf1afb..ed88cf8431 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -50,15 +50,17 @@
 
 static VOLATILE_LIST_HEAD(tempfile_list);
 
-static void remove_template_directory(struct tempfile *tempfile,
+static int remove_template_directory(struct tempfile *tempfile,
 				      int in_signal_handler)
 {
 	if (tempfile->directory) {
 		if (in_signal_handler)
-			rmdir(tempfile->directory);
+			return rmdir(tempfile->directory);
 		else
-			rmdir_or_warn(tempfile->directory);
+			return rmdir_or_warn(tempfile->directory);
 	}
+
+	return 0;
 }
 
 static void remove_tempfiles(int in_signal_handler)
@@ -353,16 +355,19 @@ int rename_tempfile(struct tempfile **tempfile_p, const char *path)
 	return 0;
 }
 
-void delete_tempfile(struct tempfile **tempfile_p)
+int delete_tempfile(struct tempfile **tempfile_p)
 {
 	struct tempfile *tempfile = *tempfile_p;
+	int err = 0;
 
 	if (!is_tempfile_active(tempfile))
-		return;
+		return 0;
 
-	close_tempfile_gently(tempfile);
-	unlink_or_warn(tempfile->filename.buf);
-	remove_template_directory(tempfile, 0);
+	err |= close_tempfile_gently(tempfile);
+	err |= unlink_or_warn(tempfile->filename.buf);
+	err |= remove_template_directory(tempfile, 0);
 	deactivate_tempfile(tempfile);
 	*tempfile_p = NULL;
+
+	return err ? -1 : 0;
 }
diff --git a/tempfile.h b/tempfile.h
index d0413af733..2d2ae5b657 100644
--- a/tempfile.h
+++ b/tempfile.h
@@ -269,7 +269,7 @@ int reopen_tempfile(struct tempfile *tempfile);
  * `delete_tempfile()` for a `tempfile` object that has already been
  * deleted or renamed.
  */
-void delete_tempfile(struct tempfile **tempfile_p);
+int delete_tempfile(struct tempfile **tempfile_p);
 
 /*
  * Close the file descriptor and/or file pointer if they are still
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 2/4] reftable/stack: register new tables as tempfiles
  2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 1/4] lockfile: report when rollback fails Patrick Steinhardt
@ 2024-03-07 13:10   ` Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
  3 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 13:10 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon claes, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 3253 bytes --]

We do not register new tables which we're about to add to the stack with
the tempfile API. Those tables will thus not be deleted in case Git gets
killed.

Refactor the code to register tables as tempfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index b64e55648a..81544fbfa0 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -737,8 +737,9 @@ int reftable_addition_add(struct reftable_addition *add,
 	struct strbuf tab_file_name = STRBUF_INIT;
 	struct strbuf next_name = STRBUF_INIT;
 	struct reftable_writer *wr = NULL;
+	struct tempfile *tab_file = NULL;
 	int err = 0;
-	int tab_fd = 0;
+	int tab_fd;
 
 	strbuf_reset(&next_name);
 	format_name(&next_name, add->next_update_index, add->next_update_index);
@@ -746,17 +747,20 @@ int reftable_addition_add(struct reftable_addition *add,
 	stack_filename(&temp_tab_file_name, add->stack, next_name.buf);
 	strbuf_addstr(&temp_tab_file_name, ".temp.XXXXXX");
 
-	tab_fd = mkstemp(temp_tab_file_name.buf);
-	if (tab_fd < 0) {
+	tab_file = mks_tempfile(temp_tab_file_name.buf);
+	if (!tab_file) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 	if (add->stack->config.default_permissions) {
-		if (chmod(temp_tab_file_name.buf, add->stack->config.default_permissions)) {
+		if (chmod(get_tempfile_path(tab_file),
+			  add->stack->config.default_permissions)) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
 	}
+	tab_fd = get_tempfile_fd(tab_file);
+
 	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd,
 				 &add->stack->config);
 	err = write_table(wr, arg);
@@ -771,14 +775,13 @@ int reftable_addition_add(struct reftable_addition *add,
 	if (err < 0)
 		goto done;
 
-	err = close(tab_fd);
-	tab_fd = 0;
+	err = close_tempfile_gently(tab_file);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
-	err = stack_check_addition(add->stack, temp_tab_file_name.buf);
+	err = stack_check_addition(add->stack, get_tempfile_path(tab_file));
 	if (err < 0)
 		goto done;
 
@@ -789,14 +792,13 @@ int reftable_addition_add(struct reftable_addition *add,
 
 	format_name(&next_name, wr->min_update_index, wr->max_update_index);
 	strbuf_addstr(&next_name, ".ref");
-
 	stack_filename(&tab_file_name, add->stack, next_name.buf);
 
 	/*
 	  On windows, this relies on rand() picking a unique destination name.
 	  Maybe we should do retry loop as well?
 	 */
-	err = rename(temp_tab_file_name.buf, tab_file_name.buf);
+	err = rename_tempfile(&tab_file, tab_file_name.buf);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
@@ -806,14 +808,7 @@ int reftable_addition_add(struct reftable_addition *add,
 			    add->new_tables_cap);
 	add->new_tables[add->new_tables_len++] = strbuf_detach(&next_name, NULL);
 done:
-	if (tab_fd > 0) {
-		close(tab_fd);
-		tab_fd = 0;
-	}
-	if (temp_tab_file_name.len > 0) {
-		unlink(temp_tab_file_name.buf);
-	}
-
+	delete_tempfile(&tab_file);
 	strbuf_release(&temp_tab_file_name);
 	strbuf_release(&tab_file_name);
 	strbuf_release(&next_name);
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 3/4] reftable/stack: register lockfiles during compaction
  2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 1/4] lockfile: report when rollback fails Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
@ 2024-03-07 13:10   ` Patrick Steinhardt
  2024-03-07 13:10   ` [PATCH v2 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
  3 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 13:10 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon claes, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 10974 bytes --]

We do not register any of the locks we acquire when compacting the
reftable stack via our lockfiles interfaces. These locks will thus not
be released when Git gets killed.

Refactor the code to register locks as lockfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c  | 255 ++++++++++++++++++++++------------------------
 reftable/system.h |   2 +
 2 files changed, 123 insertions(+), 134 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 81544fbfa0..977336b7d5 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -978,212 +978,199 @@ static int stack_compact_range(struct reftable_stack *st,
 			       size_t first, size_t last,
 			       struct reftable_log_expiry_config *expiry)
 {
-	char **delete_on_success = NULL, **subtable_locks = NULL, **listp = NULL;
-	struct strbuf temp_tab_file_name = STRBUF_INIT;
+	struct strbuf tables_list_buf = STRBUF_INIT;
+	struct strbuf new_table_temp_path = STRBUF_INIT;
 	struct strbuf new_table_name = STRBUF_INIT;
-	struct strbuf lock_file_name = STRBUF_INIT;
-	struct strbuf ref_list_contents = STRBUF_INIT;
 	struct strbuf new_table_path = STRBUF_INIT;
-	size_t i, j, compact_count;
-	int err = 0;
-	int have_lock = 0;
-	int lock_file_fd = -1;
-	int is_empty_table = 0;
+	struct strbuf table_name = STRBUF_INIT;
+	struct lock_file tables_list_lock = LOCK_INIT;
+	struct lock_file *table_locks = NULL;
+	int is_empty_table = 0, err = 0;
+	size_t i;
 
 	if (first > last || (!expiry && first == last)) {
 		err = 0;
 		goto done;
 	}
 
-	compact_count = last - first + 1;
-	REFTABLE_CALLOC_ARRAY(delete_on_success, compact_count + 1);
-	REFTABLE_CALLOC_ARRAY(subtable_locks, compact_count + 1);
-
 	st->stats.attempts++;
 
-	strbuf_reset(&lock_file_name);
-	strbuf_addstr(&lock_file_name, st->list_file);
-	strbuf_addstr(&lock_file_name, ".lock");
-
-	lock_file_fd =
-		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0666);
-	if (lock_file_fd < 0) {
-		if (errno == EEXIST) {
+	/*
+	 * Hold the lock so that we can read "tables.list" and lock all tables
+	 * which are part of the user-specified range.
+	 */
+	err = hold_lock_file_for_update(&tables_list_lock, st->list_file,
+					LOCK_NO_DEREF);
+	if (err < 0) {
+		if (errno == EEXIST)
 			err = 1;
-		} else {
+		else
 			err = REFTABLE_IO_ERROR;
-		}
 		goto done;
 	}
-	/* Don't want to write to the lock for now.  */
-	close(lock_file_fd);
-	lock_file_fd = -1;
 
-	have_lock = 1;
 	err = stack_uptodate(st);
-	if (err != 0)
+	if (err)
 		goto done;
 
-	for (i = first, j = 0; i <= last; i++) {
-		struct strbuf subtab_file_name = STRBUF_INIT;
-		struct strbuf subtab_lock = STRBUF_INIT;
-		int sublock_file_fd = -1;
-
-		stack_filename(&subtab_file_name, st,
-			       reader_name(st->readers[i]));
-
-		strbuf_reset(&subtab_lock);
-		strbuf_addbuf(&subtab_lock, &subtab_file_name);
-		strbuf_addstr(&subtab_lock, ".lock");
+	/*
+	 * Lock all tables in the user-provided range. This is the slice of our
+	 * stack which we'll compact.
+	 */
+	REFTABLE_CALLOC_ARRAY(table_locks, last - first + 1);
+	for (i = first; i <= last; i++) {
+		stack_filename(&table_name, st, reader_name(st->readers[i]));
 
-		sublock_file_fd = open(subtab_lock.buf,
-				       O_EXCL | O_CREAT | O_WRONLY, 0666);
-		if (sublock_file_fd >= 0) {
-			close(sublock_file_fd);
-		} else if (sublock_file_fd < 0) {
-			if (errno == EEXIST) {
+		err = hold_lock_file_for_update(&table_locks[i - first],
+						table_name.buf, LOCK_NO_DEREF);
+		if (err < 0) {
+			if (errno == EEXIST)
 				err = 1;
-			} else {
+			else
 				err = REFTABLE_IO_ERROR;
-			}
+			goto done;
 		}
 
-		subtable_locks[j] = subtab_lock.buf;
-		delete_on_success[j] = subtab_file_name.buf;
-		j++;
-
-		if (err != 0)
+		/*
+		 * We need to close the lockfiles as we might otherwise easily
+		 * run into file descriptor exhaustion when we compress a lot
+		 * of tables.
+		 */
+		err = close_lock_file_gently(&table_locks[i - first]);
+		if (err < 0) {
+			err = REFTABLE_IO_ERROR;
 			goto done;
+		}
 	}
 
-	err = unlink(lock_file_name.buf);
-	if (err < 0)
+	/*
+	 * We have locked all tables in our range and can thus release the
+	 * "tables.list" lock while compacting the locked tables. This allows
+	 * concurrent updates to the stack to proceed.
+	 */
+	err = rollback_lock_file(&tables_list_lock);
+	if (err < 0) {
+		err = REFTABLE_IO_ERROR;
 		goto done;
-	have_lock = 0;
-
-	err = stack_compact_locked(st, first, last, &temp_tab_file_name,
-				   expiry);
-	/* Compaction + tombstones can create an empty table out of non-empty
-	 * tables. */
-	is_empty_table = (err == REFTABLE_EMPTY_TABLE_ERROR);
-	if (is_empty_table) {
-		err = 0;
 	}
-	if (err < 0)
-		goto done;
 
-	lock_file_fd =
-		open(lock_file_name.buf, O_EXCL | O_CREAT | O_WRONLY, 0666);
-	if (lock_file_fd < 0) {
-		if (errno == EEXIST) {
+	/*
+	 * Compact the now-locked tables into a new table. Note that compacting
+	 * these tables may end up with an empty new table in case tombstones
+	 * end up cancelling out all refs in that range.
+	 */
+	err = stack_compact_locked(st, first, last, &new_table_temp_path, expiry);
+	if (err < 0) {
+		if (err != REFTABLE_EMPTY_TABLE_ERROR)
+			goto done;
+		is_empty_table = 1;
+	}
+
+	/*
+	 * Now that we have written the new, compacted table we need to re-lock
+	 * "tables.list". We'll then replace the compacted range of tables with
+	 * the new table.
+	 */
+	err = hold_lock_file_for_update(&tables_list_lock, st->list_file,
+					LOCK_NO_DEREF);
+	if (err < 0) {
+		if (errno == EEXIST)
 			err = 1;
-		} else {
+		else
 			err = REFTABLE_IO_ERROR;
-		}
 		goto done;
 	}
-	have_lock = 1;
+
 	if (st->config.default_permissions) {
-		if (chmod(lock_file_name.buf, st->config.default_permissions) < 0) {
+		if (chmod(get_lock_file_path(&tables_list_lock),
+			  st->config.default_permissions) < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
 	}
 
-	format_name(&new_table_name, st->readers[first]->min_update_index,
-		    st->readers[last]->max_update_index);
-	strbuf_addstr(&new_table_name, ".ref");
-
-	stack_filename(&new_table_path, st, new_table_name.buf);
-
+	/*
+	 * If the resulting compacted table is not empty, then we need to move
+	 * it into place now.
+	 */
 	if (!is_empty_table) {
-		/* retry? */
-		err = rename(temp_tab_file_name.buf, new_table_path.buf);
+		format_name(&new_table_name, st->readers[first]->min_update_index,
+			    st->readers[last]->max_update_index);
+		strbuf_addstr(&new_table_name, ".ref");
+		stack_filename(&new_table_path, st, new_table_name.buf);
+
+		err = rename(new_table_temp_path.buf, new_table_path.buf);
 		if (err < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
 		}
 	}
 
-	for (i = 0; i < first; i++) {
-		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
-		strbuf_addstr(&ref_list_contents, "\n");
-	}
-	if (!is_empty_table) {
-		strbuf_addbuf(&ref_list_contents, &new_table_name);
-		strbuf_addstr(&ref_list_contents, "\n");
-	}
-	for (i = last + 1; i < st->merged->stack_len; i++) {
-		strbuf_addstr(&ref_list_contents, st->readers[i]->name);
-		strbuf_addstr(&ref_list_contents, "\n");
-	}
-
-	err = write_in_full(lock_file_fd, ref_list_contents.buf, ref_list_contents.len);
-	if (err < 0) {
-		err = REFTABLE_IO_ERROR;
-		unlink(new_table_path.buf);
-		goto done;
-	}
-
-	err = fsync_component(FSYNC_COMPONENT_REFERENCE, lock_file_fd);
+	/*
+	 * Write the new "tables.list" contents with the compacted table we
+	 * have just written. In case the compacted table became empty we
+	 * simply skip writing it.
+	 */
+	for (i = 0; i < first; i++)
+		strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
+	if (!is_empty_table)
+		strbuf_addf(&tables_list_buf, "%s\n", new_table_name.buf);
+	for (i = last + 1; i < st->merged->stack_len; i++)
+		strbuf_addf(&tables_list_buf, "%s\n", st->readers[i]->name);
+
+	err = write_in_full(get_lock_file_fd(&tables_list_lock),
+			    tables_list_buf.buf, tables_list_buf.len);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		unlink(new_table_path.buf);
 		goto done;
 	}
 
-	err = close(lock_file_fd);
-	lock_file_fd = -1;
+	err = fsync_component(FSYNC_COMPONENT_REFERENCE, get_lock_file_fd(&tables_list_lock));
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		unlink(new_table_path.buf);
 		goto done;
 	}
 
-	err = rename(lock_file_name.buf, st->list_file);
+	err = commit_lock_file(&tables_list_lock);
 	if (err < 0) {
 		err = REFTABLE_IO_ERROR;
 		unlink(new_table_path.buf);
 		goto done;
 	}
-	have_lock = 0;
 
-	/* Reload the stack before deleting. On windows, we can only delete the
-	   files after we closed them.
-	*/
+	/*
+	 * Reload the stack before deleting the compacted tables. We can only
+	 * delete the files after we closed them on Windows, so this needs to
+	 * happen first.
+	 */
 	err = reftable_stack_reload_maybe_reuse(st, first < last);
+	if (err < 0)
+		goto done;
 
-	listp = delete_on_success;
-	while (*listp) {
-		if (strcmp(*listp, new_table_path.buf)) {
-			unlink(*listp);
-		}
-		listp++;
+	/*
+	 * Delete the old tables. They may still be in use by concurrent
+	 * readers, so it is expected that unlinking tables may fail.
+	 */
+	for (i = first; i <= last; i++) {
+		struct lock_file *table_lock = &table_locks[i - first];
+		char *table_path = get_locked_file_path(table_lock);
+		unlink(table_path);
+		free(table_path);
 	}
 
 done:
-	free_names(delete_on_success);
+	rollback_lock_file(&tables_list_lock);
+	for (i = first; table_locks && i <= last; i++)
+		rollback_lock_file(&table_locks[i - first]);
+	reftable_free(table_locks);
 
-	if (subtable_locks) {
-		listp = subtable_locks;
-		while (*listp) {
-			unlink(*listp);
-			listp++;
-		}
-		free_names(subtable_locks);
-	}
-	if (lock_file_fd >= 0) {
-		close(lock_file_fd);
-		lock_file_fd = -1;
-	}
-	if (have_lock) {
-		unlink(lock_file_name.buf);
-	}
 	strbuf_release(&new_table_name);
 	strbuf_release(&new_table_path);
-	strbuf_release(&ref_list_contents);
-	strbuf_release(&temp_tab_file_name);
-	strbuf_release(&lock_file_name);
+	strbuf_release(&new_table_temp_path);
+	strbuf_release(&tables_list_buf);
+	strbuf_release(&table_name);
 	return err;
 }
 
diff --git a/reftable/system.h b/reftable/system.h
index 6b74a81514..5d8b6dede5 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -12,7 +12,9 @@ license that can be found in the LICENSE file or at
 /* This header glues the reftable library to the rest of Git */
 
 #include "git-compat-util.h"
+#include "lockfile.h"
 #include "strbuf.h"
+#include "tempfile.h"
 #include "hash-ll.h" /* hash ID, sizes.*/
 #include "dir.h" /* remove_dir_recursively, for tests.*/
 
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 4/4] reftable/stack: register compacted tables as tempfiles
  2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2024-03-07 13:10   ` [PATCH v2 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
@ 2024-03-07 13:10   ` Patrick Steinhardt
  3 siblings, 0 replies; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 13:10 UTC (permalink / raw)
  To: git; +Cc: Justin Tobler, Toon claes, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 4510 bytes --]

We do not register tables resulting from stack compaction with the
tempfile API. Those tables will thus not be deleted in case Git gets
killed.

Refactor the code to register compacted tables as tempfiles.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/stack.c | 54 +++++++++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/reftable/stack.c b/reftable/stack.c
index 977336b7d5..1ecf1b9751 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -827,51 +827,56 @@ uint64_t reftable_stack_next_update_index(struct reftable_stack *st)
 
 static int stack_compact_locked(struct reftable_stack *st,
 				size_t first, size_t last,
-				struct strbuf *temp_tab,
-				struct reftable_log_expiry_config *config)
+				struct reftable_log_expiry_config *config,
+				struct tempfile **tab_file_out)
 {
 	struct strbuf next_name = STRBUF_INIT;
-	int tab_fd = -1;
+	struct strbuf tab_file_path = STRBUF_INIT;
 	struct reftable_writer *wr = NULL;
-	int err = 0;
+	struct tempfile *tab_file;
+	int tab_fd, err = 0;
 
 	format_name(&next_name,
 		    reftable_reader_min_update_index(st->readers[first]),
 		    reftable_reader_max_update_index(st->readers[last]));
+	stack_filename(&tab_file_path, st, next_name.buf);
+	strbuf_addstr(&tab_file_path, ".temp.XXXXXX");
 
-	stack_filename(temp_tab, st, next_name.buf);
-	strbuf_addstr(temp_tab, ".temp.XXXXXX");
+	tab_file = mks_tempfile(tab_file_path.buf);
+	if (!tab_file) {
+		err = REFTABLE_IO_ERROR;
+		goto done;
+	}
+	tab_fd = get_tempfile_fd(tab_file);
 
-	tab_fd = mkstemp(temp_tab->buf);
 	if (st->config.default_permissions &&
-	    chmod(temp_tab->buf, st->config.default_permissions) < 0) {
+	    chmod(get_tempfile_path(tab_file), st->config.default_permissions) < 0) {
 		err = REFTABLE_IO_ERROR;
 		goto done;
 	}
 
-	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush, &tab_fd, &st->config);
-
+	wr = reftable_new_writer(reftable_fd_write, reftable_fd_flush,
+				 &tab_fd, &st->config);
 	err = stack_write_compact(st, wr, first, last, config);
 	if (err < 0)
 		goto done;
+
 	err = reftable_writer_close(wr);
 	if (err < 0)
 		goto done;
 
-	err = close(tab_fd);
-	tab_fd = 0;
+	err = close_tempfile_gently(tab_file);
+	if (err < 0)
+		goto done;
+
+	*tab_file_out = tab_file;
+	tab_file = NULL;
 
 done:
+	delete_tempfile(&tab_file);
 	reftable_writer_free(wr);
-	if (tab_fd > 0) {
-		close(tab_fd);
-		tab_fd = 0;
-	}
-	if (err != 0 && temp_tab->len > 0) {
-		unlink(temp_tab->buf);
-		strbuf_release(temp_tab);
-	}
 	strbuf_release(&next_name);
+	strbuf_release(&tab_file_path);
 	return err;
 }
 
@@ -979,12 +984,12 @@ static int stack_compact_range(struct reftable_stack *st,
 			       struct reftable_log_expiry_config *expiry)
 {
 	struct strbuf tables_list_buf = STRBUF_INIT;
-	struct strbuf new_table_temp_path = STRBUF_INIT;
 	struct strbuf new_table_name = STRBUF_INIT;
 	struct strbuf new_table_path = STRBUF_INIT;
 	struct strbuf table_name = STRBUF_INIT;
 	struct lock_file tables_list_lock = LOCK_INIT;
 	struct lock_file *table_locks = NULL;
+	struct tempfile *new_table = NULL;
 	int is_empty_table = 0, err = 0;
 	size_t i;
 
@@ -1059,7 +1064,7 @@ static int stack_compact_range(struct reftable_stack *st,
 	 * these tables may end up with an empty new table in case tombstones
 	 * end up cancelling out all refs in that range.
 	 */
-	err = stack_compact_locked(st, first, last, &new_table_temp_path, expiry);
+	err = stack_compact_locked(st, first, last, expiry, &new_table);
 	if (err < 0) {
 		if (err != REFTABLE_EMPTY_TABLE_ERROR)
 			goto done;
@@ -1099,7 +1104,7 @@ static int stack_compact_range(struct reftable_stack *st,
 		strbuf_addstr(&new_table_name, ".ref");
 		stack_filename(&new_table_path, st, new_table_name.buf);
 
-		err = rename(new_table_temp_path.buf, new_table_path.buf);
+		err = rename_tempfile(&new_table, new_table_path.buf);
 		if (err < 0) {
 			err = REFTABLE_IO_ERROR;
 			goto done;
@@ -1166,9 +1171,10 @@ static int stack_compact_range(struct reftable_stack *st,
 		rollback_lock_file(&table_locks[i - first]);
 	reftable_free(table_locks);
 
+	delete_tempfile(&new_table);
 	strbuf_release(&new_table_name);
 	strbuf_release(&new_table_path);
-	strbuf_release(&new_table_temp_path);
+
 	strbuf_release(&tables_list_buf);
 	strbuf_release(&table_name);
 	return err;
-- 
2.44.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-07  6:17         ` Patrick Steinhardt
@ 2024-03-07 17:59           ` Junio C Hamano
  2024-03-07 20:54             ` Patrick Steinhardt
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2024-03-07 17:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> It is consistent. The problem is rather that `mks_tempfile_m()` takes a
> mode as input, but still ends up applying the umask to that mode.

Ah, OK.  

> Thus,
> using that function without a subsequent call to chmod(3P) would end up
> mishandling "core.sharedRepository".

It is understandable because this is meant for tempfile; you want to
create it, expect to use it yourself and not by others, before you
are done with it.  There is no place in this sequence where you need
to open access up to those who share access to the repository.

The rest of the message is primarily for my own education.

I got a bit curious how other parts of the system does this.  For
example, writing an loose object calls git_mkstemp_mode() with 0444
and then finalize_object_file() calls adjust_shared_perm() after it
renames the file to its final name at the end.  A tight umask like
0700 may demote the original 0444 to 0400, but adjust_shared_perm()
expands the 0400 appropriately.

The index file is more interesting.  We bypass the whole mkstemp
thing (as we know the final filename and add .lock after it), and
the call chain looks like this:

    repo_refresh_and_write_index() ->
      repo_hold_locked_index() ->
        hold_lock_file_for_update() ->
          lock_file() ->
            create_tempfile_mode() ->
              open()
              activate_tempfile()
              adjust_shared_perm()
      refresh_index()
      write_locked_index() ->
        commit_locked_index() ->
          commit_lock_file_to() ->
            rename_tempfile().

After we create the file with open() and before returning to the
caller, hold_lock_file_for_update() already sets the permission
bits correctly, so the "committing" phase to rename the written
lockfile into its final place becomes merely a rename without any
futzing with permission bits.

So in short, for a file that we intend to create, write, and then
commit to the final name, we use at least two approaches:

 - Let mkstemp_mode() do its thing, and fix the permission bits
   later with adjust_shared_perm().

 - Let hold_lock_file_for_update() take care of the permission bits.

I sense there might be some clean-up opportunities around here.
After all, lockfile is (or at least pretends to be) built on top of
tempfile, and it is for more permanent (as opposed to temporary)
files, but it somehow wasn't a good fit to wrap new tables in this
series?

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-07 17:59           ` Junio C Hamano
@ 2024-03-07 20:54             ` Patrick Steinhardt
  2024-03-07 21:06               ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Patrick Steinhardt @ 2024-03-07 20:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1381 bytes --]

On Thu, Mar 07, 2024 at 09:59:50AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
[snip]
> I sense there might be some clean-up opportunities around here.
> After all, lockfile is (or at least pretends to be) built on top of
> tempfile, and it is for more permanent (as opposed to temporary)
> files, but it somehow wasn't a good fit to wrap new tables in this
> series?

Well, I didn't think lockfiles are a good fit here. I did convert
"tables.list" to use a lock because it's a natural fit given that we
want to use "table.list.lock". But newly written tables aren't written
to a file with ".lock" suffix, but instead to a file ending with
".temp.XXXXXX". This is intentionally so that two processes can write
new tables at the same point in time, even though two concurrent writes
will end up being mutually exclusive.

As lockfiles to me are rather about mutually exclusive locking I think
that using tempfiles directly is preferable. As far as I can see there
is also no real benefit with lockfiles in our context, except for the
mode handling. But given that we have "default_permissions" I'd say it
is preferable to consistently use these for now via chmod(3P).

I ain't got any strong opinions on this though, I'm rather being
pragmatic. So if there is a good reason to use lockfiles that I missed
then I wouldn't mind converting the code.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/4] reftable/stack: register new tables as tempfiles
  2024-03-07 20:54             ` Patrick Steinhardt
@ 2024-03-07 21:06               ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2024-03-07 21:06 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> On Thu, Mar 07, 2024 at 09:59:50AM -0800, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
> [snip]
>> I sense there might be some clean-up opportunities around here.
>> After all, lockfile is (or at least pretends to be) built on top of
>> tempfile, and it is for more permanent (as opposed to temporary)
>> files, but it somehow wasn't a good fit to wrap new tables in this
>> series?
> ...
> As lockfiles to me are rather about mutually exclusive locking I think
> that using tempfiles directly is preferable. As far as I can see there
> is also no real benefit with lockfiles in our context, except for the
> mode handling. But given that we have "default_permissions" I'd say it
> is preferable to consistently use these for now via chmod(3P).

I wasn't talking about the use of temp or lock API in _this_ series,
but was talking about the different permission bit handling between
the two API.  The loose object codepath that uses the tempfile API
is closer to what you are doing, which suffers from the same "at
creation tempfile API does not bother with permission bits because
it may be removed at the end".  The index codepath that uses the
hold_lock_file_for_update() does not have to care, as it gets the
permission right from the start.

Because of these differences, the loose object codepath has to do
the adjust_perm_bits() itself, and similarly, you have to fix the
permission the same.

These callers may become simpler if we give an option to the
git_mkstemps_mode() to return a file whose permission bits are
already correct from the start.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2024-03-07 21:06 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-04 11:10 [PATCH 0/4] reftable/stack: register temporary files Patrick Steinhardt
2024-03-04 11:10 ` [PATCH 1/4] lockfile: report when rollback fails Patrick Steinhardt
2024-03-05 22:09   ` Justin Tobler
2024-03-06 12:00     ` Patrick Steinhardt
2024-03-04 11:10 ` [PATCH 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
2024-03-05 22:30   ` Justin Tobler
2024-03-06 11:59     ` Patrick Steinhardt
2024-03-06 16:34       ` Justin Tobler
2024-03-06 16:36       ` Junio C Hamano
2024-03-07  6:17         ` Patrick Steinhardt
2024-03-07 17:59           ` Junio C Hamano
2024-03-07 20:54             ` Patrick Steinhardt
2024-03-07 21:06               ` Junio C Hamano
2024-03-04 11:10 ` [PATCH 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
2024-03-05 23:30   ` Justin Tobler
2024-03-06 11:59     ` Patrick Steinhardt
2024-03-06 16:39       ` Junio C Hamano
2024-03-06 19:57       ` Justin Tobler
2024-03-04 11:10 ` [PATCH 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt
2024-03-07 12:38   ` Toon claes
2024-03-07 12:58     ` Patrick Steinhardt
2024-03-07 13:10 ` [PATCH v2 0/4] reftable/stack: register temporary files Patrick Steinhardt
2024-03-07 13:10   ` [PATCH v2 1/4] lockfile: report when rollback fails Patrick Steinhardt
2024-03-07 13:10   ` [PATCH v2 2/4] reftable/stack: register new tables as tempfiles Patrick Steinhardt
2024-03-07 13:10   ` [PATCH v2 3/4] reftable/stack: register lockfiles during compaction Patrick Steinhardt
2024-03-07 13:10   ` [PATCH v2 4/4] reftable/stack: register compacted tables as tempfiles Patrick Steinhardt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).