[PATCH v4 0/3] Make update refs more atomic

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v4 0/3] Make update refs more atomic
@ 2014-04-14 18:29 Ronnie Sahlberg
  2014-04-14 18:29 ` [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions Ronnie Sahlberg
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-14 18:29 UTC (permalink / raw)
  To: git; +Cc: Ronnie Sahlberg

refs.c:ref_transaction_commit() intermingles doing updates and checks with
actually applying changes to the refs in loops that abort on error.
This is done one ref at a time and means that if an error is detected that
will fail the operation partway through the list of refs to update we
will end up with some changes applied to disk and others not.

Without having transaction support from the filesystem, it is hard to
make an update that involves multiple refs to guarantee atomicity, but we
can do a somewhat better than we currently do.

These patches change the update and delete functions to use a three
call pattern of

1, lock
2, update, or flag for deletion
3, apply on disk  (rename() or unlink())

When a transaction is commited we first do all the locking, preparations
and most of the error checking before we actually start applying any changes
to the filesystem store.

This means that more of the error cases that will fail the commit
will trigger before we start doing any changes to the actual files.

This should make the changes of refs in refs_transaction_commit slightly
more atomic.

Version 4:
* Fix a bug in fast-import.c:dump_tags and make sure no tests fail

Version 3:
* Rebased onto mhagger/ref-transactions.
* Removed the patch to do update/delete from a single loop.

Version 2:
Updates and fixes based on Junio's feedback.
* Fix the subject line for patches so they comply with the project standard.
* Redo the update/delete loops so that we maintain the correct order of
  operations. Perform all updates first, then perform the deletes.
* Add an additional patch that allows us to do the update/delete in the correct
  order from within a single loop by first sorting the refs so that deletes
  are after all non-deletes.

Ronnie Sahlberg (3):
  refs.c: split writing and commiting a ref into two separate functions
  refs.c: split delete_ref_loose() into a separate flag-for-deletion and
    commit phase
  refs.c: change ref_transaction_commit to run the commit loops once all
    work is finished

 branch.c               | 10 ++++--
 builtin/commit.c       |  5 +++
 builtin/fetch.c        |  7 +++-
 builtin/receive-pack.c |  4 +++
 builtin/replace.c      |  6 +++-
 builtin/tag.c          |  6 +++-
 fast-import.c          | 18 ++++++++--
 refs.c                 | 98 +++++++++++++++++++++++++++++++++-----------------
 refs.h                 |  6 ++++
 sequencer.c            |  4 +++
 walker.c               |  4 +++
 11 files changed, 129 insertions(+), 39 deletions(-)

-- 
1.9.1.505.gd05696d

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions
  2014-04-14 18:29 [PATCH v4 0/3] Make update refs more atomic Ronnie Sahlberg
@ 2014-04-14 18:29 ` Ronnie Sahlberg
  2014-04-15 11:17   ` Michael Haggerty
  2014-04-14 18:29 ` [PATCH v4 2/3] refs.c: split delete_ref_loose() into a separate flag-for-deletion and commit phase Ronnie Sahlberg
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-14 18:29 UTC (permalink / raw)
  To: git; +Cc: Ronnie Sahlberg

Change the function write_ref_sha1() to just write the ref but not
commit the ref or the lockfile.
Add a new function commit_ref_lock() that will commit the change done by
a previous write_ref_sha1().
Update all callers of write_ref_sha1() to call commit_ref_lock().

The new pattern for updating a ref is now :

lock = lock_ref_sha1_basic() (or varient of)
write_ref_sha1(lock)
unlock_ref(lock) | commit_ref_lock(lock)

Once write_ref_sha1() returns, the new ref has been written and the lock
file has been closed.
At that stage we can then either call unlock_ref() which will abort the
update and delete the lock file withouth applying it, or call
commit_ref_lock() which will rename the lock file onto the ref file.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
---
 branch.c               | 10 ++++++++--
 builtin/commit.c       |  5 +++++
 builtin/fetch.c        |  7 ++++++-
 builtin/receive-pack.c |  4 ++++
 builtin/replace.c      |  6 +++++-
 builtin/tag.c          |  6 +++++-
 fast-import.c          | 18 ++++++++++++++++--
 refs.c                 | 41 +++++++++++++++++++++++++++++++----------
 refs.h                 |  4 ++++
 sequencer.c            |  4 ++++
 walker.c               |  4 ++++
 11 files changed, 92 insertions(+), 17 deletions(-)

diff --git a/branch.c b/branch.c
index 660097b..903ea75 100644
--- a/branch.c
+++ b/branch.c
@@ -304,9 +304,15 @@ void create_branch(const char *head,
 	if (real_ref && track)
 		setup_tracking(ref.buf + 11, real_ref, track, quiet);
 
-	if (!dont_change_ref)
-		if (write_ref_sha1(lock, sha1, msg) < 0)
+	if (!dont_change_ref) {
+		if (write_ref_sha1(lock, sha1, msg) < 0) {
+			unlock_ref(lock);
 			die_errno(_("Failed to write ref"));
+		}
+		if (commit_ref_lock(lock) < 0) {
+			die_errno(_("Failed to commit ref"));
+		}
+	}
 
 	strbuf_release(&ref);
 	free(real_ref);
diff --git a/builtin/commit.c b/builtin/commit.c
index d9550c5..3d8a3a8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1686,9 +1686,14 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 		die(_("cannot lock HEAD ref"));
 	}
 	if (write_ref_sha1(ref_lock, sha1, sb.buf) < 0) {
+		unlock_ref(ref_lock);
 		rollback_index_files();
 		die(_("cannot update HEAD ref"));
 	}
+	if (commit_ref_lock(ref_lock) < 0) {
+		rollback_index_files();
+		die(_("cannot commit HEAD ref"));
+	}
 
 	unlink(git_path("CHERRY_PICK_HEAD"));
 	unlink(git_path("REVERT_HEAD"));
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 55f457c..ebfb854 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -388,7 +388,12 @@ static int s_update_ref(const char *action,
 	if (!lock)
 		return errno == ENOTDIR ? STORE_REF_ERROR_DF_CONFLICT :
 					  STORE_REF_ERROR_OTHER;
-	if (write_ref_sha1(lock, ref->new_sha1, msg) < 0)
+	if (write_ref_sha1(lock, ref->new_sha1, msg) < 0) {
+		unlock_ref(lock);
+		return errno == ENOTDIR ? STORE_REF_ERROR_DF_CONFLICT :
+					  STORE_REF_ERROR_OTHER;
+	}
+	if (commit_ref_lock(lock) < 0)
 		return errno == ENOTDIR ? STORE_REF_ERROR_DF_CONFLICT :
 					  STORE_REF_ERROR_OTHER;
 	return 0;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c323081..4760274 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -587,8 +587,12 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 			return "failed to lock";
 		}
 		if (write_ref_sha1(lock, new_sha1, "push")) {
+			unlock_ref(lock);
 			return "failed to write"; /* error() already called */
 		}
+		if (commit_ref_lock(lock))
+			return "failed to commit"; /* error() already called */
+
 		return NULL; /* good */
 	}
 }
diff --git a/builtin/replace.c b/builtin/replace.c
index b62420a..c09ff49 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -160,8 +160,12 @@ static int replace_object(const char *object_ref, const char *replace_ref,
 	lock = lock_any_ref_for_update(ref, prev, 0, NULL);
 	if (!lock)
 		die("%s: cannot lock the ref", ref);
-	if (write_ref_sha1(lock, repl, NULL) < 0)
+	if (write_ref_sha1(lock, repl, NULL) < 0) {
+		unlock_ref(lock);
 		die("%s: cannot update the ref", ref);
+	}
+	if (commit_ref_lock(lock) < 0)
+		die("%s: cannot commit the ref", ref);
 
 	return 0;
 }
diff --git a/builtin/tag.c b/builtin/tag.c
index 40356e3..8653a64 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -644,8 +644,12 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 	lock = lock_any_ref_for_update(ref.buf, prev, 0, NULL);
 	if (!lock)
 		die(_("%s: cannot lock the ref"), ref.buf);
-	if (write_ref_sha1(lock, object, NULL) < 0)
+	if (write_ref_sha1(lock, object, NULL) < 0) {
+		unlock_ref(lock);
 		die(_("%s: cannot update the ref"), ref.buf);
+	}
+	if (commit_ref_lock(lock) < 0)
+		die(_("%s: cannot commit the ref"), ref.buf);
 	if (force && !is_null_sha1(prev) && hashcmp(prev, object))
 		printf(_("Updated tag '%s' (was %s)\n"), tag, find_unique_abbrev(prev, DEFAULT_ABBREV));
 
diff --git a/fast-import.c b/fast-import.c
index fb4738d..f732bfb 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1706,8 +1706,13 @@ static int update_branch(struct branch *b)
 			return -1;
 		}
 	}
-	if (write_ref_sha1(lock, b->sha1, msg) < 0)
+	if (write_ref_sha1(lock, b->sha1, msg) < 0) {
+		unlock_ref(lock);
 		return error("Unable to update %s", b->name);
+	}
+	if (commit_ref_lock(lock) < 0) {
+		return error("Unable to commit %s", b->name);
+	}
 	return 0;
 }
 
@@ -1732,8 +1737,17 @@ static void dump_tags(void)
 	for (t = first_tag; t; t = t->next_tag) {
 		sprintf(ref_name, "tags/%s", t->name);
 		lock = lock_ref_sha1(ref_name, NULL);
-		if (!lock || write_ref_sha1(lock, t->sha1, msg) < 0)
+		if (!lock) {
+			failure |= error("Unable to lock %s", ref_name);
+			continue;
+		}
+		if (write_ref_sha1(lock, t->sha1, msg) < 0) {
 			failure |= error("Unable to update %s", ref_name);
+			unlock_ref(lock);
+			continue;
+		}
+		if (commit_ref_lock(lock) < 0)
+			failure |= error("Unable to commit %s", ref_name);
 	}
 }
 
diff --git a/refs.c b/refs.c
index 728a761..646afd7 100644
--- a/refs.c
+++ b/refs.c
@@ -2633,9 +2633,14 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
 	lock->force_write = 1;
 	hashcpy(lock->old_sha1, orig_sha1);
 	if (write_ref_sha1(lock, orig_sha1, logmsg)) {
+		unlock_ref(lock);
 		error("unable to write current sha1 into %s", newrefname);
 		goto rollback;
 	}
+	if (commit_ref_lock(lock)) {
+		error("unable to commit current sha1 into %s", newrefname);
+		goto rollback;
+	}
 
 	return 0;
 
@@ -2649,8 +2654,12 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
 	lock->force_write = 1;
 	flag = log_all_ref_updates;
 	log_all_ref_updates = 0;
-	if (write_ref_sha1(lock, orig_sha1, NULL))
+	if (write_ref_sha1(lock, orig_sha1, NULL)) {
+		unlock_ref(lock);
 		error("unable to write current sha1 into %s", oldrefname);
+	}
+	if (commit_ref_lock(lock))
+		error("unable to commit current sha1 into %s", oldrefname);
 	log_all_ref_updates = flag;
 
  rollbacklog:
@@ -2807,34 +2816,30 @@ int write_ref_sha1(struct ref_lock *lock,
 	if (!lock)
 		return -1;
 	if (!lock->force_write && !hashcmp(lock->old_sha1, sha1)) {
-		unlock_ref(lock);
+		lock->skipped_write = 1;
 		return 0;
 	}
 	o = parse_object(sha1);
 	if (!o) {
 		error("Trying to write ref %s with nonexistent object %s",
 			lock->ref_name, sha1_to_hex(sha1));
-		unlock_ref(lock);
 		return -1;
 	}
 	if (o->type != OBJ_COMMIT && is_branch(lock->ref_name)) {
 		error("Trying to write non-commit object %s to branch %s",
 			sha1_to_hex(sha1), lock->ref_name);
-		unlock_ref(lock);
 		return -1;
 	}
 	if (write_in_full(lock->lock_fd, sha1_to_hex(sha1), 40) != 40 ||
 	    write_in_full(lock->lock_fd, &term, 1) != 1
 		|| close_ref(lock) < 0) {
 		error("Couldn't write %s", lock->lk->filename);
-		unlock_ref(lock);
 		return -1;
 	}
 	clear_loose_ref_cache(&ref_cache);
 	if (log_ref_write(lock->ref_name, lock->old_sha1, sha1, logmsg) < 0 ||
 	    (strcmp(lock->ref_name, lock->orig_ref_name) &&
 	     log_ref_write(lock->orig_ref_name, lock->old_sha1, sha1, logmsg) < 0)) {
-		unlock_ref(lock);
 		return -1;
 	}
 	if (strcmp(lock->orig_ref_name, "HEAD") != 0) {
@@ -2858,7 +2863,12 @@ int write_ref_sha1(struct ref_lock *lock,
 		    !strcmp(head_ref, lock->ref_name))
 			log_ref_write("HEAD", lock->old_sha1, sha1, logmsg);
 	}
-	if (commit_ref(lock)) {
+	return 0;
+}
+
+int commit_ref_lock(struct ref_lock *lock)
+{
+	if (!lock->skipped_write && commit_ref(lock)) {
 		error("Couldn't set %s", lock->ref_name);
 		unlock_ref(lock);
 		return -1;
@@ -3375,10 +3385,17 @@ int update_ref(const char *action, const char *refname,
 	       int flags, enum action_on_err onerr)
 {
 	struct ref_lock *lock;
+	int ret;
+
 	lock = update_ref_lock(refname, oldval, flags, NULL, onerr);
 	if (!lock)
 		return 1;
-	return update_ref_write(action, refname, sha1, lock, onerr);
+	ret = update_ref_write(action, refname, sha1, lock, onerr);
+	if (ret)
+		unlock_ref(lock);
+	else
+		ret = commit_ref_lock(lock);
+	return ret;
 }
 
 static int ref_update_compare(const void *r1, const void *r2)
@@ -3453,7 +3470,11 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 					       update->refname,
 					       update->new_sha1,
 					       update->lock, onerr);
-			update->lock = NULL; /* freed by update_ref_write */
+			if (ret)
+				unlock_ref(update->lock);
+			else
+				commit_ref_lock(update->lock);
+			update->lock = NULL;
 			if (ret)
 				goto cleanup;
 		}
@@ -3464,7 +3485,7 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		struct ref_update *update = updates[i];
 
 		if (update->lock) {
-			delnames[delnum++] = update->lock->ref_name;
+			delnames[delnum++] = update->refname;
 			ret |= delete_ref_loose(update->lock, update->type);
 		}
 	}
diff --git a/refs.h b/refs.h
index 0f08def..f14a417 100644
--- a/refs.h
+++ b/refs.h
@@ -8,6 +8,7 @@ struct ref_lock {
 	unsigned char old_sha1[20];
 	int lock_fd;
 	int force_write;
+	int skipped_write;
 };
 
 struct ref_transaction;
@@ -153,6 +154,9 @@ extern void unlock_ref(struct ref_lock *lock);
 /** Writes sha1 into the ref specified by the lock. **/
 extern int write_ref_sha1(struct ref_lock *lock, const unsigned char *sha1, const char *msg);
 
+/** Commit any changes done to the ref specified by the lock. **/
+extern int commit_ref_lock(struct ref_lock *lock);
+
 /** Setup reflog before using. **/
 int log_ref_setup(const char *refname, char *logfile, int bufsize);
 
diff --git a/sequencer.c b/sequencer.c
index bde5f04..ffadf82 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -283,6 +283,10 @@ static int fast_forward_to(const unsigned char *to, const unsigned char *from,
 					   0, NULL);
 	strbuf_addf(&sb, "%s: fast-forward", action_name(opts));
 	ret = write_ref_sha1(ref_lock, to, sb.buf);
+	if (ret)
+		unlock_ref(ref_lock);
+	else
+		ret |= commit_ref_lock(ref_lock);
 	strbuf_release(&sb);
 	return ret;
 }
diff --git a/walker.c b/walker.c
index 1dd86b8..5ce5a1d 100644
--- a/walker.c
+++ b/walker.c
@@ -295,6 +295,10 @@ int walker_fetch(struct walker *walker, int targets, char **target,
 		if (!write_ref || !write_ref[i])
 			continue;
 		ret = write_ref_sha1(lock[i], &sha1[20 * i], msg ? msg : "fetch (unknown)");
+		if (ret)
+			unlock_ref(lock[i]);
+		else
+			ret |= commit_ref_lock(lock[i]);
 		lock[i] = NULL;
 		if (ret)
 			goto unlock_and_fail;
-- 
1.9.1.505.gd05696d

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 2/3] refs.c: split delete_ref_loose() into a separate flag-for-deletion and commit phase
  2014-04-14 18:29 [PATCH v4 0/3] Make update refs more atomic Ronnie Sahlberg
  2014-04-14 18:29 ` [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions Ronnie Sahlberg
@ 2014-04-14 18:29 ` Ronnie Sahlberg
  2014-04-15 17:19   ` Michael Haggerty
  2014-04-14 18:29 ` [PATCH v4 3/3] refs.c: change ref_transaction_commit to run the commit loops once all work is finished Ronnie Sahlberg
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-14 18:29 UTC (permalink / raw)
  To: git; +Cc: Ronnie Sahlberg

Change delete_ref_loose()) to just flag that a ref is to be deleted but do
not actually unlink the files.
Change commit_ref_lock() so that it will unlink refs that are flagged for
deletion.
Change all callers of delete_ref_loose() to explicitely call commit_ref_lock()
to commit the deletion.

The new pattern for deleting loose refs thus become:

lock = lock_ref_sha1_basic() (or varient of)
delete_ref_loose(lock)
unlock_ref(lock) | commit_ref_lock(lock)

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
---
 refs.c | 32 ++++++++++++++++++++------------
 refs.h |  2 ++
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/refs.c b/refs.c
index 646afd7..a14addb 100644
--- a/refs.c
+++ b/refs.c
@@ -2484,16 +2484,9 @@ static int repack_without_ref(const char *refname)
 
 static int delete_ref_loose(struct ref_lock *lock, int flag)
 {
-	if (!(flag & REF_ISPACKED) || flag & REF_ISSYMREF) {
-		/* loose */
-		int err, i = strlen(lock->lk->filename) - 5; /* .lock */
-
-		lock->lk->filename[i] = 0;
-		err = unlink_or_warn(lock->lk->filename);
-		lock->lk->filename[i] = '.';
-		if (err && errno != ENOENT)
-			return 1;
-	}
+	lock->delete_ref = 1;
+	lock->delete_flag = flag;
+
 	return 0;
 }
 
@@ -2515,7 +2508,7 @@ int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
 
 	unlink_or_warn(git_path("logs/%s", lock->ref_name));
 	clear_loose_ref_cache(&ref_cache);
-	unlock_ref(lock);
+	ret |= commit_ref_lock(lock);
 	return ret;
 }
 
@@ -2868,7 +2861,20 @@ int write_ref_sha1(struct ref_lock *lock,
 
 int commit_ref_lock(struct ref_lock *lock)
 {
-	if (!lock->skipped_write && commit_ref(lock)) {
+	if (lock->delete_ref) {
+		int flag = lock->delete_flag;
+
+		if (!(flag & REF_ISPACKED) || flag & REF_ISSYMREF) {
+			/* loose */
+			int err, i = strlen(lock->lk->filename) - 5; /* .lock */
+
+			lock->lk->filename[i] = 0;
+			err = unlink_or_warn(lock->lk->filename);
+			lock->lk->filename[i] = '.';
+			if (err && errno != ENOENT)
+				return 1;
+		}
+	} else if (!lock->skipped_write && commit_ref(lock)) {
 		error("Couldn't set %s", lock->ref_name);
 		unlock_ref(lock);
 		return -1;
@@ -3487,6 +3493,8 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		if (update->lock) {
 			delnames[delnum++] = update->refname;
 			ret |= delete_ref_loose(update->lock, update->type);
+			ret |= commit_ref_lock(update->lock);
+			update->lock = NULL;
 		}
 	}
 
diff --git a/refs.h b/refs.h
index f14a417..223be30 100644
--- a/refs.h
+++ b/refs.h
@@ -9,6 +9,8 @@ struct ref_lock {
 	int lock_fd;
 	int force_write;
 	int skipped_write;
+	int delete_ref;
+	int delete_flag;
 };
 
 struct ref_transaction;
-- 
1.9.1.505.gd05696d

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 3/3] refs.c: change ref_transaction_commit to run the commit loops once all work is finished
  2014-04-14 18:29 [PATCH v4 0/3] Make update refs more atomic Ronnie Sahlberg
  2014-04-14 18:29 ` [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions Ronnie Sahlberg
  2014-04-14 18:29 ` [PATCH v4 2/3] refs.c: split delete_ref_loose() into a separate flag-for-deletion and commit phase Ronnie Sahlberg
@ 2014-04-14 18:29 ` Ronnie Sahlberg
  2014-04-14 20:24 ` [PATCH v4 0/3] Make update refs more atomic Junio C Hamano
  2014-04-15  6:36 ` Michael Haggerty
  4 siblings, 0 replies; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-14 18:29 UTC (permalink / raw)
  To: git; +Cc: Ronnie Sahlberg

During a transaction commit we will both update and delete refs.
Since both update and delete now use the same pattern

    lock = lock_ref_sha1_basic() (or varient of)
    write_ref_sha1(lock)/delete_ref_loose(lock)
    unlock_ref(lock) | commit_ref_lock(lock)

we can now simplify ref_transaction_commit to have one loop that locks all
involved refs.
A second loop that writes or flags for deletion, but does not commit, all
the refs.
And a final third loop that commits all the refs once all the work and
preparations are complete.

This makes updating/deleting multiple refs more atomic since we will not start
the commit phase until all the preparations have completed successfully.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
---
 refs.c | 39 ++++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/refs.c b/refs.c
index a14addb..87193c7 100644
--- a/refs.c
+++ b/refs.c
@@ -3467,42 +3467,47 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		}
 	}
 
-	/* Perform updates first so live commits remain referenced */
+	/* Prepare all the updates/deletes */
 	for (i = 0; i < n; i++) {
 		struct ref_update *update = updates[i];
 
-		if (!is_null_sha1(update->new_sha1)) {
+		if (!is_null_sha1(update->new_sha1))
 			ret = update_ref_write(msg,
 					       update->refname,
 					       update->new_sha1,
 					       update->lock, onerr);
-			if (ret)
-				unlock_ref(update->lock);
-			else
-				commit_ref_lock(update->lock);
-			update->lock = NULL;
-			if (ret)
-				goto cleanup;
+		else {
+			delnames[delnum++] = update->refname;
+			ret = delete_ref_loose(update->lock, update->type);
 		}
+		if (ret)
+			goto cleanup;
 	}
 
-	/* Perform deletes now that updates are safely completed */
+	ret |= repack_without_refs(delnames, delnum);
+	for (i = 0; i < delnum; i++)
+		unlink_or_warn(git_path("logs/%s", delnames[i]));
+	clear_loose_ref_cache(&ref_cache);
+
+	/* Perform updates first so live commits remain referenced */
+	for (i = 0; i < n; i++) {
+		struct ref_update *update = updates[i];
+
+		if (update->lock && !update->lock->delete_ref) {
+			ret |= commit_ref_lock(update->lock);
+			update->lock = NULL;
+		}
+	}
+	/* And finally perform all deletes */
 	for (i = 0; i < n; i++) {
 		struct ref_update *update = updates[i];
 
 		if (update->lock) {
-			delnames[delnum++] = update->refname;
-			ret |= delete_ref_loose(update->lock, update->type);
 			ret |= commit_ref_lock(update->lock);
 			update->lock = NULL;
 		}
 	}
 
-	ret |= repack_without_refs(delnames, delnum);
-	for (i = 0; i < delnum; i++)
-		unlink_or_warn(git_path("logs/%s", delnames[i]));
-	clear_loose_ref_cache(&ref_cache);
-
 cleanup:
 	for (i = 0; i < n; i++)
 		if (updates[i]->lock)
-- 
1.9.1.505.gd05696d

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-14 18:29 [PATCH v4 0/3] Make update refs more atomic Ronnie Sahlberg
                   ` (2 preceding siblings ...)
  2014-04-14 18:29 ` [PATCH v4 3/3] refs.c: change ref_transaction_commit to run the commit loops once all work is finished Ronnie Sahlberg
@ 2014-04-14 20:24 ` Junio C Hamano
  2014-04-15 16:41   ` Ronnie Sahlberg
  2014-04-15  6:36 ` Michael Haggerty
  4 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2014-04-14 20:24 UTC (permalink / raw)
  To: Ronnie Sahlberg; +Cc: git

Thanks; will queue.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-14 18:29 [PATCH v4 0/3] Make update refs more atomic Ronnie Sahlberg
                   ` (3 preceding siblings ...)
  2014-04-14 20:24 ` [PATCH v4 0/3] Make update refs more atomic Junio C Hamano
@ 2014-04-15  6:36 ` Michael Haggerty
  2014-04-15 16:33   ` Ronnie Sahlberg
  4 siblings, 1 reply; 16+ messages in thread
From: Michael Haggerty @ 2014-04-15  6:36 UTC (permalink / raw)
  To: Ronnie Sahlberg, git

On 04/14/2014 08:29 PM, Ronnie Sahlberg wrote:
> refs.c:ref_transaction_commit() intermingles doing updates and checks with
> actually applying changes to the refs in loops that abort on error.
> This is done one ref at a time and means that if an error is detected that
> will fail the operation partway through the list of refs to update we
> will end up with some changes applied to disk and others not.
> 
> Without having transaction support from the filesystem, it is hard to
> make an update that involves multiple refs to guarantee atomicity, but we
> can do a somewhat better than we currently do.

It took me a moment to understand what you were talking about here,
because the code for ref_transaction_commit() already seems
superficially to do reference modifications in phases.  The problem is
that write_ref_sha1() internally contains additional checks that can
fail in "normal" circumstances.  So the most important part of this
patch series is allowing those checks to be done before committing anything.

> These patches change the update and delete functions to use a three
> call pattern of
> 
> 1, lock
> 2, update, or flag for deletion
> 3, apply on disk  (rename() or unlink())
> 
> When a transaction is commited we first do all the locking, preparations
> and most of the error checking before we actually start applying any changes
> to the filesystem store.
> 
> This means that more of the error cases that will fail the commit
> will trigger before we start doing any changes to the actual files.
> 
> 
> This should make the changes of refs in refs_transaction_commit slightly
> more atomic.
> [...]

Yes, this is a good and important goal.

I wonder, however, whether your approach of changing callers from

    lock = lock_ref_sha1_basic() (or varient of)
    write_ref_sha1(lock)

to

    lock = lock_ref_sha1_basic() (or varient of)
    write_ref_sha1(lock)
    unlock_ref(lock) | commit_ref_lock(lock)

is not doing work that we will soon need to rework.  Would it be jumping
the gun to change the callers to

    transaction = ref_transaction_begin();
    ref_transaction_{update,delete,etc}(transaction, ...);
    ref_transaction_{commit,rollback}(transaction, ...);

instead?  Then we could bury the details of calling write_ref_sha1() and
commit_lock_ref() inside ref_transaction_commit() rather than having to
expose them in the public API.

I suspect that the answer is "no, ref transactions are not yet powerful
enough to do everything that the callers need".  But then I would
suggest that we *make* them powerful enough and *then* make the change
at the callers.

I'm not saying that we shouldn't accept your change as a first step [1]
and do the next step later, but wanted to get your reaction about making
the first step a bit more ambitious.

Michael

[1] Though I still need to review your patch series in detail.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions
  2014-04-14 18:29 ` [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions Ronnie Sahlberg
@ 2014-04-15 11:17   ` Michael Haggerty
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Haggerty @ 2014-04-15 11:17 UTC (permalink / raw)
  To: Ronnie Sahlberg, git

See my comment to your cover letter where I suggest using ref
transactions instead of making callers deal with even more of the
details of updating references.  But I will comment on these patches
anyway, in case you'd rather leave them closer to the current form.

On 04/14/2014 08:29 PM, Ronnie Sahlberg wrote:
> Change the function write_ref_sha1() to just write the ref but not
> commit the ref or the lockfile.
> Add a new function commit_ref_lock() that will commit the change done by
> a previous write_ref_sha1().
> Update all callers of write_ref_sha1() to call commit_ref_lock().
> 
> The new pattern for updating a ref is now :
> 
> lock = lock_ref_sha1_basic() (or varient of)

s/varient/variant/

> write_ref_sha1(lock)
> unlock_ref(lock) | commit_ref_lock(lock)
> 
> Once write_ref_sha1() returns, the new ref has been written and the lock
> file has been closed.
> At that stage we can then either call unlock_ref() which will abort the
> update and delete the lock file withouth applying it, or call

You need a comma after "unlock_ref()".

s/withouth/without/

> commit_ref_lock() which will rename the lock file onto the ref file.


The commit message would be easier to read with better formatting; maybe

---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---
refs.c: split writing and commiting a ref into two separate functions

* Change the function write_ref_sha1() to just write the ref but not
  commit the ref or the lockfile.

* Add a new function commit_ref_lock() that will commit the change done by
  a previous write_ref_sha1().

* Update all callers of write_ref_sha1() to call commit_ref_lock().

The new pattern for updating a ref is now :

    lock = lock_ref_sha1_basic() (or variant of)
    write_ref_sha1(lock)
    unlock_ref(lock) | commit_ref_lock(lock)

Once write_ref_sha1() returns, the new ref has been written and the lock
file has been closed. At that stage we can then either call unlock_ref(),
which will abort the update and delete the lock file without applying it,
or call commit_ref_lock() which will rename the lock file onto the ref
file.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---


> Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
> ---
>  branch.c               | 10 ++++++++--
>  builtin/commit.c       |  5 +++++
>  builtin/fetch.c        |  7 ++++++-
>  builtin/receive-pack.c |  4 ++++
>  builtin/replace.c      |  6 +++++-
>  builtin/tag.c          |  6 +++++-
>  fast-import.c          | 18 ++++++++++++++++--
>  refs.c                 | 41 +++++++++++++++++++++++++++++++----------
>  refs.h                 |  4 ++++
>  sequencer.c            |  4 ++++
>  walker.c               |  4 ++++
>  11 files changed, 92 insertions(+), 17 deletions(-)
> 
> diff --git a/branch.c b/branch.c
> index 660097b..903ea75 100644
> --- a/branch.c
> +++ b/branch.c
> @@ -304,9 +304,15 @@ void create_branch(const char *head,
>  	if (real_ref && track)
>  		setup_tracking(ref.buf + 11, real_ref, track, quiet);
>  
> -	if (!dont_change_ref)
> -		if (write_ref_sha1(lock, sha1, msg) < 0)
> +	if (!dont_change_ref) {
> +		if (write_ref_sha1(lock, sha1, msg) < 0) {
> +			unlock_ref(lock);
>  			die_errno(_("Failed to write ref"));
> +		}
> +		if (commit_ref_lock(lock) < 0) {
> +			die_errno(_("Failed to commit ref"));
> +		}
> +	}
>  
>  	strbuf_release(&ref);
>  	free(real_ref);

There are a lot of changes like this with similar duplicated error
handling.  Why not define a helper function like
write_ref_sha1_and_commit() that does what the old write_ref_sha1() used
to do (for the callers who don't care about updating multiple references
at once)?

In fact, I would recommend renaming the function in a preparatory
commit, to reduce the amount of code churn in the second commit where
you extract the two new separate functions.

> diff --git a/builtin/commit.c b/builtin/commit.c
> index d9550c5..3d8a3a8 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1686,9 +1686,14 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>  		die(_("cannot lock HEAD ref"));
>  	}
>  	if (write_ref_sha1(ref_lock, sha1, sb.buf) < 0) {
> +		unlock_ref(ref_lock);
>  		rollback_index_files();
>  		die(_("cannot update HEAD ref"));
>  	}
> +	if (commit_ref_lock(ref_lock) < 0) {
> +		rollback_index_files();
> +		die(_("cannot commit HEAD ref"));
> +	}
>  
>  	unlink(git_path("CHERRY_PICK_HEAD"));
>  	unlink(git_path("REVERT_HEAD"));
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 55f457c..ebfb854 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -388,7 +388,12 @@ static int s_update_ref(const char *action,
>  	if (!lock)
>  		return errno == ENOTDIR ? STORE_REF_ERROR_DF_CONFLICT :
>  					  STORE_REF_ERROR_OTHER;
> -	if (write_ref_sha1(lock, ref->new_sha1, msg) < 0)
> +	if (write_ref_sha1(lock, ref->new_sha1, msg) < 0) {
> +		unlock_ref(lock);
> +		return errno == ENOTDIR ? STORE_REF_ERROR_DF_CONFLICT :
> +					  STORE_REF_ERROR_OTHER;
> +	}
> +	if (commit_ref_lock(lock) < 0)
>  		return errno == ENOTDIR ? STORE_REF_ERROR_DF_CONFLICT :
>  					  STORE_REF_ERROR_OTHER;
>  	return 0;
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index c323081..4760274 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -587,8 +587,12 @@ static const char *update(struct command *cmd, struct shallow_info *si)
>  			return "failed to lock";
>  		}
>  		if (write_ref_sha1(lock, new_sha1, "push")) {
> +			unlock_ref(lock);
>  			return "failed to write"; /* error() already called */
>  		}
> +		if (commit_ref_lock(lock))
> +			return "failed to commit"; /* error() already called */
> +
>  		return NULL; /* good */
>  	}
>  }
> diff --git a/builtin/replace.c b/builtin/replace.c
> index b62420a..c09ff49 100644
> --- a/builtin/replace.c
> +++ b/builtin/replace.c
> @@ -160,8 +160,12 @@ static int replace_object(const char *object_ref, const char *replace_ref,
>  	lock = lock_any_ref_for_update(ref, prev, 0, NULL);
>  	if (!lock)
>  		die("%s: cannot lock the ref", ref);
> -	if (write_ref_sha1(lock, repl, NULL) < 0)
> +	if (write_ref_sha1(lock, repl, NULL) < 0) {
> +		unlock_ref(lock);
>  		die("%s: cannot update the ref", ref);
> +	}
> +	if (commit_ref_lock(lock) < 0)
> +		die("%s: cannot commit the ref", ref);
>  
>  	return 0;
>  }
> diff --git a/builtin/tag.c b/builtin/tag.c
> index 40356e3..8653a64 100644
> --- a/builtin/tag.c
> +++ b/builtin/tag.c
> @@ -644,8 +644,12 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
>  	lock = lock_any_ref_for_update(ref.buf, prev, 0, NULL);
>  	if (!lock)
>  		die(_("%s: cannot lock the ref"), ref.buf);
> -	if (write_ref_sha1(lock, object, NULL) < 0)
> +	if (write_ref_sha1(lock, object, NULL) < 0) {
> +		unlock_ref(lock);
>  		die(_("%s: cannot update the ref"), ref.buf);
> +	}
> +	if (commit_ref_lock(lock) < 0)
> +		die(_("%s: cannot commit the ref"), ref.buf);
>  	if (force && !is_null_sha1(prev) && hashcmp(prev, object))
>  		printf(_("Updated tag '%s' (was %s)\n"), tag, find_unique_abbrev(prev, DEFAULT_ABBREV));
>  
> diff --git a/fast-import.c b/fast-import.c
> index fb4738d..f732bfb 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -1706,8 +1706,13 @@ static int update_branch(struct branch *b)
>  			return -1;
>  		}
>  	}
> -	if (write_ref_sha1(lock, b->sha1, msg) < 0)
> +	if (write_ref_sha1(lock, b->sha1, msg) < 0) {
> +		unlock_ref(lock);
>  		return error("Unable to update %s", b->name);
> +	}
> +	if (commit_ref_lock(lock) < 0) {
> +		return error("Unable to commit %s", b->name);
> +	}
>  	return 0;
>  }
>  
> @@ -1732,8 +1737,17 @@ static void dump_tags(void)
>  	for (t = first_tag; t; t = t->next_tag) {
>  		sprintf(ref_name, "tags/%s", t->name);
>  		lock = lock_ref_sha1(ref_name, NULL);
> -		if (!lock || write_ref_sha1(lock, t->sha1, msg) < 0)
> +		if (!lock) {
> +			failure |= error("Unable to lock %s", ref_name);
> +			continue;
> +		}
> +		if (write_ref_sha1(lock, t->sha1, msg) < 0) {
>  			failure |= error("Unable to update %s", ref_name);
> +			unlock_ref(lock);
> +			continue;
> +		}
> +		if (commit_ref_lock(lock) < 0)
> +			failure |= error("Unable to commit %s", ref_name);
>  	}
>  }
>  
> diff --git a/refs.c b/refs.c
> index 728a761..646afd7 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2633,9 +2633,14 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
>  	lock->force_write = 1;
>  	hashcpy(lock->old_sha1, orig_sha1);
>  	if (write_ref_sha1(lock, orig_sha1, logmsg)) {
> +		unlock_ref(lock);
>  		error("unable to write current sha1 into %s", newrefname);
>  		goto rollback;
>  	}
> +	if (commit_ref_lock(lock)) {
> +		error("unable to commit current sha1 into %s", newrefname);
> +		goto rollback;
> +	}
>  
>  	return 0;
>  
> @@ -2649,8 +2654,12 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
>  	lock->force_write = 1;
>  	flag = log_all_ref_updates;
>  	log_all_ref_updates = 0;
> -	if (write_ref_sha1(lock, orig_sha1, NULL))
> +	if (write_ref_sha1(lock, orig_sha1, NULL)) {
> +		unlock_ref(lock);
>  		error("unable to write current sha1 into %s", oldrefname);
> +	}
> +	if (commit_ref_lock(lock))
> +		error("unable to commit current sha1 into %s", oldrefname);
>  	log_all_ref_updates = flag;
>  
>   rollbacklog:
> @@ -2807,34 +2816,30 @@ int write_ref_sha1(struct ref_lock *lock,
>  	if (!lock)
>  		return -1;
>  	if (!lock->force_write && !hashcmp(lock->old_sha1, sha1)) {
> -		unlock_ref(lock);
> +		lock->skipped_write = 1;
>  		return 0;
>  	}
>  	o = parse_object(sha1);
>  	if (!o) {
>  		error("Trying to write ref %s with nonexistent object %s",
>  			lock->ref_name, sha1_to_hex(sha1));
> -		unlock_ref(lock);
>  		return -1;
>  	}
>  	if (o->type != OBJ_COMMIT && is_branch(lock->ref_name)) {
>  		error("Trying to write non-commit object %s to branch %s",
>  			sha1_to_hex(sha1), lock->ref_name);
> -		unlock_ref(lock);
>  		return -1;
>  	}
>  	if (write_in_full(lock->lock_fd, sha1_to_hex(sha1), 40) != 40 ||
>  	    write_in_full(lock->lock_fd, &term, 1) != 1
>  		|| close_ref(lock) < 0) {
>  		error("Couldn't write %s", lock->lk->filename);
> -		unlock_ref(lock);
>  		return -1;
>  	}
>  	clear_loose_ref_cache(&ref_cache);
>  	if (log_ref_write(lock->ref_name, lock->old_sha1, sha1, logmsg) < 0 ||
>  	    (strcmp(lock->ref_name, lock->orig_ref_name) &&
>  	     log_ref_write(lock->orig_ref_name, lock->old_sha1, sha1, logmsg) < 0)) {
> -		unlock_ref(lock);
>  		return -1;
>  	}
>  	if (strcmp(lock->orig_ref_name, "HEAD") != 0) {
> @@ -2858,7 +2863,12 @@ int write_ref_sha1(struct ref_lock *lock,
>  		    !strcmp(head_ref, lock->ref_name))
>  			log_ref_write("HEAD", lock->old_sha1, sha1, logmsg);
>  	}
> -	if (commit_ref(lock)) {
> +	return 0;
> +}
> +
> +int commit_ref_lock(struct ref_lock *lock)
> +{
> +	if (!lock->skipped_write && commit_ref(lock)) {
>  		error("Couldn't set %s", lock->ref_name);
>  		unlock_ref(lock);
>  		return -1;
> @@ -3375,10 +3385,17 @@ int update_ref(const char *action, const char *refname,
>  	       int flags, enum action_on_err onerr)
>  {
>  	struct ref_lock *lock;
> +	int ret;
> +
>  	lock = update_ref_lock(refname, oldval, flags, NULL, onerr);
>  	if (!lock)
>  		return 1;
> -	return update_ref_write(action, refname, sha1, lock, onerr);
> +	ret = update_ref_write(action, refname, sha1, lock, onerr);
> +	if (ret)
> +		unlock_ref(lock);
> +	else
> +		ret = commit_ref_lock(lock);
> +	return ret;
>  }
>  
>  static int ref_update_compare(const void *r1, const void *r2)
> @@ -3453,7 +3470,11 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  					       update->refname,
>  					       update->new_sha1,
>  					       update->lock, onerr);
> -			update->lock = NULL; /* freed by update_ref_write */
> +			if (ret)
> +				unlock_ref(update->lock);
> +			else
> +				commit_ref_lock(update->lock);
> +			update->lock = NULL;
>  			if (ret)
>  				goto cleanup;
>  		}
> @@ -3464,7 +3485,7 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  		struct ref_update *update = updates[i];
>  
>  		if (update->lock) {
> -			delnames[delnum++] = update->lock->ref_name;
> +			delnames[delnum++] = update->refname;

Isn't this hunk orthogonal to the main point of this commit?  If so,
please split it into a separate commit.

>  			ret |= delete_ref_loose(update->lock, update->type);
>  		}
>  	}
> diff --git a/refs.h b/refs.h
> index 0f08def..f14a417 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -8,6 +8,7 @@ struct ref_lock {
>  	unsigned char old_sha1[20];
>  	int lock_fd;
>  	int force_write;
> +	int skipped_write;
>  };
>  
>  struct ref_transaction;
> @@ -153,6 +154,9 @@ extern void unlock_ref(struct ref_lock *lock);
>  /** Writes sha1 into the ref specified by the lock. **/
>  extern int write_ref_sha1(struct ref_lock *lock, const unsigned char *sha1, const char *msg);
>  
> +/** Commit any changes done to the ref specified by the lock. **/
> +extern int commit_ref_lock(struct ref_lock *lock);
> +
>  /** Setup reflog before using. **/
>  int log_ref_setup(const char *refname, char *logfile, int bufsize);
>  
> diff --git a/sequencer.c b/sequencer.c
> index bde5f04..ffadf82 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -283,6 +283,10 @@ static int fast_forward_to(const unsigned char *to, const unsigned char *from,
>  					   0, NULL);
>  	strbuf_addf(&sb, "%s: fast-forward", action_name(opts));
>  	ret = write_ref_sha1(ref_lock, to, sb.buf);
> +	if (ret)
> +		unlock_ref(ref_lock);
> +	else
> +		ret |= commit_ref_lock(ref_lock);

"|=" could be changed to "=" here and in the next hunk.

>  	strbuf_release(&sb);
>  	return ret;
>  }
> diff --git a/walker.c b/walker.c
> index 1dd86b8..5ce5a1d 100644
> --- a/walker.c
> +++ b/walker.c
> @@ -295,6 +295,10 @@ int walker_fetch(struct walker *walker, int targets, char **target,
>  		if (!write_ref || !write_ref[i])
>  			continue;
>  		ret = write_ref_sha1(lock[i], &sha1[20 * i], msg ? msg : "fetch (unknown)");
> +		if (ret)
> +			unlock_ref(lock[i]);
> +		else
> +			ret |= commit_ref_lock(lock[i]);
>  		lock[i] = NULL;
>  		if (ret)
>  			goto unlock_and_fail;
> 


-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-15  6:36 ` Michael Haggerty
@ 2014-04-15 16:33   ` Ronnie Sahlberg
  2014-04-15 20:32     ` Michael Haggerty
  0 siblings, 1 reply; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-15 16:33 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git@vger.kernel.org

On Mon, Apr 14, 2014 at 11:36 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On 04/14/2014 08:29 PM, Ronnie Sahlberg wrote:
>> refs.c:ref_transaction_commit() intermingles doing updates and checks with
>> actually applying changes to the refs in loops that abort on error.
>> This is done one ref at a time and means that if an error is detected that
>> will fail the operation partway through the list of refs to update we
>> will end up with some changes applied to disk and others not.
>>
>> Without having transaction support from the filesystem, it is hard to
>> make an update that involves multiple refs to guarantee atomicity, but we
>> can do a somewhat better than we currently do.
>
> It took me a moment to understand what you were talking about here,
> because the code for ref_transaction_commit() already seems
> superficially to do reference modifications in phases.  The problem is
> that write_ref_sha1() internally contains additional checks that can
> fail in "normal" circumstances.  So the most important part of this
> patch series is allowing those checks to be done before committing anything.

Yes.
This patch series is mainly focused on making ref_transaction_commit()
more atomic
so that it does more checks for failures before it starts doing
irreversible changes to the underlying files.

These patches change ref_transaction_commit() to do even more checks
before it starts applying the
actual changes to disk.


There are also changes to other users of write_ref_sha1() too but
those are mainly just to
reflect that this function no longer actually does the changes to the
underlying files any more.


>
>> These patches change the update and delete functions to use a three
>> call pattern of
>>
>> 1, lock
>> 2, update, or flag for deletion
>> 3, apply on disk  (rename() or unlink())
>>
>> When a transaction is commited we first do all the locking, preparations
>> and most of the error checking before we actually start applying any changes
>> to the filesystem store.
>>
>> This means that more of the error cases that will fail the commit
>> will trigger before we start doing any changes to the actual files.
>>
>>
>> This should make the changes of refs in refs_transaction_commit slightly
>> more atomic.
>> [...]
>
> Yes, this is a good and important goal.
>
> I wonder, however, whether your approach of changing callers from
>
>     lock = lock_ref_sha1_basic() (or varient of)
>     write_ref_sha1(lock)
>
> to
>
>     lock = lock_ref_sha1_basic() (or varient of)
>     write_ref_sha1(lock)
>     unlock_ref(lock) | commit_ref_lock(lock)
>
> is not doing work that we will soon need to rework.  Would it be jumping
> the gun to change the callers to
>
>     transaction = ref_transaction_begin();
>     ref_transaction_{update,delete,etc}(transaction, ...);
>     ref_transaction_{commit,rollback}(transaction, ...);
>
> instead?  Then we could bury the details of calling write_ref_sha1() and
> commit_lock_ref() inside ref_transaction_commit() rather than having to
> expose them in the public API.
>

I think you are right.

Lets put this patch series on the backburner for now and start by
making all callers use transactions
and remove write_ref_sha1() from the public API thar refs.c exports.

Once everything is switched over to transactions I can rework this
patchseries for ref_transaction_commit()
and resubmit to the mailing list.


Lets start preparing patches to change all external callers to use
transactions instead.
I am happy to help preparing patches for this. How do we ensure that
we do not create duplicate work
and work on the same functions?


regards
ronnie sahlberg


> I suspect that the answer is "no, ref transactions are not yet powerful
> enough to do everything that the callers need".  But then I would
> suggest that we *make* them powerful enough and *then* make the change
> at the callers.
>
> I'm not saying that we shouldn't accept your change as a first step [1]
> and do the next step later, but wanted to get your reaction about making
> the first step a bit more ambitious.
>
> Michael
>
> [1] Though I still need to review your patch series in detail.
>
> --
> Michael Haggerty
> mhagger@alum.mit.edu
> http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-14 20:24 ` [PATCH v4 0/3] Make update refs more atomic Junio C Hamano
@ 2014-04-15 16:41   ` Ronnie Sahlberg
  0 siblings, 0 replies; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-15 16:41 UTC (permalink / raw)
  To: Junio C Hamano, Michael Haggerty; +Cc: git@vger.kernel.org

On Mon, Apr 14, 2014 at 1:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Thanks; will queue.

Junio,
Please defer queuing for now.

I think we should convert more of the external callers to use
transactions first.
Once that is done and everything uses transactions I will re-send an
updated version of this patch series.


regards
ronnie sahlberg

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 2/3] refs.c: split delete_ref_loose() into a separate flag-for-deletion and commit phase
  2014-04-14 18:29 ` [PATCH v4 2/3] refs.c: split delete_ref_loose() into a separate flag-for-deletion and commit phase Ronnie Sahlberg
@ 2014-04-15 17:19   ` Michael Haggerty
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Haggerty @ 2014-04-15 17:19 UTC (permalink / raw)
  To: Ronnie Sahlberg, git; +Cc: Jeff King

On 04/14/2014 08:29 PM, Ronnie Sahlberg wrote:
> Change delete_ref_loose()) to just flag that a ref is to be deleted but do
> not actually unlink the files.
> Change commit_ref_lock() so that it will unlink refs that are flagged for
> deletion.
> Change all callers of delete_ref_loose() to explicitely call commit_ref_lock()

s/explicitely/explicitly/

> to commit the deletion.
> 
> The new pattern for deleting loose refs thus become:
> 
> lock = lock_ref_sha1_basic() (or varient of)

s/varient/variant/

> delete_ref_loose(lock)
> unlock_ref(lock) | commit_ref_lock(lock)

Formatting: sentences should be flowed together if they are within a
paragraph, or separated with blank lines if they constitute separate
paragraphs, or have "bullet" characters and be indented if they are a
bullet list.

Code should be indented to set it off from the prose.

> Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
> ---
>  refs.c | 32 ++++++++++++++++++++------------
>  refs.h |  2 ++
>  2 files changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 646afd7..a14addb 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2484,16 +2484,9 @@ static int repack_without_ref(const char *refname)
>  
>  static int delete_ref_loose(struct ref_lock *lock, int flag)
>  {
> -	if (!(flag & REF_ISPACKED) || flag & REF_ISSYMREF) {
> -		/* loose */
> -		int err, i = strlen(lock->lk->filename) - 5; /* .lock */
> -
> -		lock->lk->filename[i] = 0;
> -		err = unlink_or_warn(lock->lk->filename);
> -		lock->lk->filename[i] = '.';
> -		if (err && errno != ENOENT)
> -			return 1;
> -	}
> +	lock->delete_ref = 1;
> +	lock->delete_flag = flag;
> +
>  	return 0;
>  }
>  
> @@ -2515,7 +2508,7 @@ int delete_ref(const char *refname, const unsigned char *sha1, int delopt)
>  
>  	unlink_or_warn(git_path("logs/%s", lock->ref_name));
>  	clear_loose_ref_cache(&ref_cache);
> -	unlock_ref(lock);
> +	ret |= commit_ref_lock(lock);
>  	return ret;
>  }

Before this patch, the sequence for deleting a reference was

    acquire lock on loose ref file
    delete loose ref file
    acquire lock on packed-refs file
    rewrite packed-refs file, omitting ref
    activate packed-refs file and release its lock
    release lock on loose ref file

Another process that tries to read the reference's value between steps 2
and 4 sees some old value of the reference from the packed-refs file
rather than seeing either its recent value or seeing it undefined.  The
value that it sees can be arbitrarily old, and might even point at an
object that has long-since been garbage-collected.  If the packed-refs
lock acquisition fails, then the old value can be left in the
packed-refs file and becomes the value of the reference permanently.  So
this is not correct (it's a known problem).

After the patch, it is

    acquire lock on loose ref file
    acquire lock on packed-refs file
    rewrite packed-refs file, omitting ref
    activate packed-refs file and release its lock
    delete loose ref file          <-- now this happens later
    release lock on loose ref file

A pack-refs process that runs between steps 4 and 5 might be able to
acquire the packed-refs file lock, see the doomed loose value of the
reference, and pack it, with the end effect that the reference is not
deleted after all.  But this is less bad than what can happen now.  Can
anybody think of any new races that the new sequence would open?

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-15 16:33   ` Ronnie Sahlberg
@ 2014-04-15 20:32     ` Michael Haggerty
  2014-04-16 17:11       ` Ronnie Sahlberg
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Haggerty @ 2014-04-15 20:32 UTC (permalink / raw)
  To: Ronnie Sahlberg; +Cc: git@vger.kernel.org

On 04/15/2014 06:33 PM, Ronnie Sahlberg wrote:
> On Mon, Apr 14, 2014 at 11:36 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> [...]
>> I wonder, however, whether your approach of changing callers from
>>
>>     lock = lock_ref_sha1_basic() (or varient of)
>>     write_ref_sha1(lock)
>>
>> to
>>
>>     lock = lock_ref_sha1_basic() (or varient of)
>>     write_ref_sha1(lock)
>>     unlock_ref(lock) | commit_ref_lock(lock)
>>
>> is not doing work that we will soon need to rework.  Would it be jumping
>> the gun to change the callers to
>>
>>     transaction = ref_transaction_begin();
>>     ref_transaction_{update,delete,etc}(transaction, ...);
>>     ref_transaction_{commit,rollback}(transaction, ...);
>>
>> instead?  Then we could bury the details of calling write_ref_sha1() and
>> commit_lock_ref() inside ref_transaction_commit() rather than having to
>> expose them in the public API.
> 
> I think you are right.
> 
> Lets put this patch series on the backburner for now and start by
> making all callers use transactions
> and remove write_ref_sha1() from the public API thar refs.c exports.
> 
> Once everything is switched over to transactions I can rework this
> patchseries for ref_transaction_commit()
> and resubmit to the mailing list.

Sounds good.  Rewriting callers to use transactions would be a great
next step.  Please especially keep track of what new features the
transactions API still needs.  More flexible error handling?  The
ability to have steps in the transaction that are "best-effort" (i.e.,
don't abort the transaction if they fail)?  Different reflog messages
for different updates within the same transaction rather than one reflog
message for all updates?  Etc.

And some callers who currently change multiple references one at a time
might be able to be rewritten to update the references in a single
transaction.

> Lets start preparing patches to change all external callers to use
> transactions instead.
> I am happy to help preparing patches for this. How do we ensure that
> we do not create duplicate work
> and work on the same functions?

I have a few loose ends to take care of on my lockfile patch series, and
there are a few things I would like to tidy up internal to the
transactions implementation, so I think if you are working on the caller
side then we won't step on each other's toes too much in the near future.

I suggest we use IRC (mhagger@freenode) or XMPP (mhagger@jabber.org) for
small-scale coordination.  I also have a GitHub repo
(http://github.com/mhagger/git) to which I often push intermediate
results; I will try to push to that more regularly (warning: I often
rebase feature branches even after they are pushed to GitHub).  I think
you are in Pacific Time whereas I am in Berlin, so we will tend to work
in serial rather than in parallel; that should help.  It would be a good
habit to shoot each short status emails at the end of each working day.

Of course we should only use one-on-one communication for early work; as
soon as something is getting ripe we should make sure our technical
discussions take place here on the mailing list.

Sound OK?
Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-15 20:32     ` Michael Haggerty
@ 2014-04-16 17:11       ` Ronnie Sahlberg
  2014-04-16 19:31         ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-16 17:11 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git@vger.kernel.org

On Tue, Apr 15, 2014 at 1:32 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On 04/15/2014 06:33 PM, Ronnie Sahlberg wrote:
>> On Mon, Apr 14, 2014 at 11:36 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>>> [...]
>>> I wonder, however, whether your approach of changing callers from
>>>
>>>     lock = lock_ref_sha1_basic() (or varient of)
>>>     write_ref_sha1(lock)
>>>
>>> to
>>>
>>>     lock = lock_ref_sha1_basic() (or varient of)
>>>     write_ref_sha1(lock)
>>>     unlock_ref(lock) | commit_ref_lock(lock)
>>>
>>> is not doing work that we will soon need to rework.  Would it be jumping
>>> the gun to change the callers to
>>>
>>>     transaction = ref_transaction_begin();
>>>     ref_transaction_{update,delete,etc}(transaction, ...);
>>>     ref_transaction_{commit,rollback}(transaction, ...);
>>>
>>> instead?  Then we could bury the details of calling write_ref_sha1() and
>>> commit_lock_ref() inside ref_transaction_commit() rather than having to
>>> expose them in the public API.
>>
>> I think you are right.
>>
>> Lets put this patch series on the backburner for now and start by
>> making all callers use transactions
>> and remove write_ref_sha1() from the public API thar refs.c exports.
>>
>> Once everything is switched over to transactions I can rework this
>> patchseries for ref_transaction_commit()
>> and resubmit to the mailing list.
>
> Sounds good.  Rewriting callers to use transactions would be a great
> next step.  Please especially keep track of what new features the
> transactions API still needs.  More flexible error handling?  The
> ability to have steps in the transaction that are "best-effort" (i.e.,
> don't abort the transaction if they fail)?  Different reflog messages
> for different updates within the same transaction rather than one reflog
> message for all updates?  Etc.
>
> And some callers who currently change multiple references one at a time
> might be able to be rewritten to update the references in a single
> transaction.

As an experiment I rewrite most of the callers to use transactions yesterday.
Most callers would translate just fine, but some callers, such as walker_fetch()
does not yet fit well with the current transaction code.

For example that code does want to first take out locks on all refs
before it does a
lot of processing, with the locks held, before it writes and updates the refs.

Some of my thoughts after going over the callers :

Currently any locking of refs in a transaction only happens during the commit
phase. I think it would be useful to have a mechanism where you could
optionally take out locks for the involved refs early during the transaction.
So that simple callers could continue using
ref_transaction_begin()
ref_transaction_create|update|delete()*
ref_transaction_commit()

but, if a caller such as walker_fetch() could opt to do
ref_transaction_begin()
ref_transaction_lock_ref()*
...do stuff...
ref_transaction_create|update|delete()*
ref_transaction_commit()

In this second case ref_transaction_commit() would only take out any locks that
are missing during the 'lock the refs" loop.

Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
early during
a transaction.

A second idea is to change the signatures for
ref_transaction_create|update|delete()
slightly and allow them to return errors early.
We can check for things like add_update() failing, check that the
ref-name looks sane,
check some of the flags, like if has_old==true then old sha1 should
not be NULL or 0{40}, etc.

Additionally for robustness, if any of these functions detect an error
we can flag this in the
transaction structure and take action during ref_transaction_commit().
I.e. if a ref_transaction_update had a hard failure, do not allow
ref_transaction_commit()
to succeed.

Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
All callers that use these functions should check the function for error.

Suggestion 3: remove the qsort and check for duplicates in
ref_transaction_commit()
Since we are already taking out a lock for each ref we are updating
during the transaction
any duplicate refs will fail the second attempt to lock the same ref which will
implicitly make sure that a transaction will not change the same ref twice.

There are only two caveats I see with this third suggestion.
1, The second lock attempt will always cause a die() since we
eventually would end up
in lock_ref_sha1_basic() and this function will always unconditionally
die() if the lock failed.
But your locking changes are addressing this, right?
(all callers to lock_ref_sha1() or lock_any_ref_for_update() do check
for and handle if the lock
 failed, so that change to not die() should be safe)

2, We would need to take care when a lock fails here to print the
proper error message
so that we still show "Multiple updates for ref '%s' not allowed." if
the lock failed because
the transaction had already locked this file.

>
>> Lets start preparing patches to change all external callers to use
>> transactions instead.
>> I am happy to help preparing patches for this. How do we ensure that
>> we do not create duplicate work
>> and work on the same functions?
>
> I have a few loose ends to take care of on my lockfile patch series, and
> there are a few things I would like to tidy up internal to the
> transactions implementation, so I think if you are working on the caller
> side then we won't step on each other's toes too much in the near future.
>
> I suggest we use IRC (mhagger@freenode) or XMPP (mhagger@jabber.org) for
> small-scale coordination.  I also have a GitHub repo
> (http://github.com/mhagger/git) to which I often push intermediate
> results; I will try to push to that more regularly (warning: I often
> rebase feature branches even after they are pushed to GitHub).  I think
> you are in Pacific Time whereas I am in Berlin, so we will tend to work
> in serial rather than in parallel; that should help.  It would be a good
> habit to shoot each short status emails at the end of each working day.
>
> Of course we should only use one-on-one communication for early work; as
> soon as something is getting ripe we should make sure our technical
> discussions take place here on the mailing list.
>
> Sound OK?
> Michael
>
> --
> Michael Haggerty
> mhagger@alum.mit.edu
> http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-16 17:11       ` Ronnie Sahlberg
@ 2014-04-16 19:31         ` Junio C Hamano
  2014-04-16 21:31           ` Ronnie Sahlberg
  2014-04-16 21:51           ` Michael Haggerty
  0 siblings, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2014-04-16 19:31 UTC (permalink / raw)
  To: Ronnie Sahlberg; +Cc: Michael Haggerty, git@vger.kernel.org

Ronnie Sahlberg <sahlberg@google.com> writes:

> Currently any locking of refs in a transaction only happens during the commit
> phase. I think it would be useful to have a mechanism where you could
> optionally take out locks for the involved refs early during the transaction.
> So that simple callers could continue using
> ref_transaction_begin()
> ref_transaction_create|update|delete()*
> ref_transaction_commit()
>
> but, if a caller such as walker_fetch() could opt to do
> ref_transaction_begin()
> ref_transaction_lock_ref()*
> ...do stuff...
> ref_transaction_create|update|delete()*
> ref_transaction_commit()
>
> In this second case ref_transaction_commit() would only take out any locks that
> are missing during the 'lock the refs" loop.
>
> Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
> early during
> a transaction.

Hmph.

I am not sure if that is the right way to go, or instead change all
create/update/delete to take locks without adding a new primitive.

> A second idea is to change the signatures for
> ref_transaction_create|update|delete()
> slightly and allow them to return errors early.
> We can check for things like add_update() failing, check that the
> ref-name looks sane,
> check some of the flags, like if has_old==true then old sha1 should
> not be NULL or 0{40}, etc.
>
> Additionally for robustness, if any of these functions detect an error
> we can flag this in the
> transaction structure and take action during ref_transaction_commit().
> I.e. if a ref_transaction_update had a hard failure, do not allow
> ref_transaction_commit()
> to succeed.
>
> Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
> All callers that use these functions should check the function for error.

I think that is a very sensible thing to do.

The details of determining "this cannot possibly succeed" may change
(for example, if we have them take locks at the point of
create/delete/update, a failure to lock may count as an early
error).

Is there any reason why this should be conditional (i.e. you said
"allow them to", implying that the early failure is optional)?

> Suggestion 3: remove the qsort and check for duplicates in
> ref_transaction_commit()
> Since we are already taking out a lock for each ref we are updating
> during the transaction
> any duplicate refs will fail the second attempt to lock the same ref which will
> implicitly make sure that a transaction will not change the same ref twice.

I do not know if I care about the implementation detail of "do we
have a unique set of update requests?".  While I do not see a strong
need for one transaction to touch the same ref twice (e.g. create to
point at commit A and update it to point at commit B), I do not see
why we should forbid such a use in the future.

Just some of my knee-jerk reactions.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-16 19:31         ` Junio C Hamano
@ 2014-04-16 21:31           ` Ronnie Sahlberg
  2014-04-16 21:42             ` Junio C Hamano
  2014-04-16 21:51           ` Michael Haggerty
  1 sibling, 1 reply; 16+ messages in thread
From: Ronnie Sahlberg @ 2014-04-16 21:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git@vger.kernel.org

On Wed, Apr 16, 2014 at 12:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Ronnie Sahlberg <sahlberg@google.com> writes:
>
>> Currently any locking of refs in a transaction only happens during the commit
>> phase. I think it would be useful to have a mechanism where you could
>> optionally take out locks for the involved refs early during the transaction.
>> So that simple callers could continue using
>> ref_transaction_begin()
>> ref_transaction_create|update|delete()*
>> ref_transaction_commit()
>>
>> but, if a caller such as walker_fetch() could opt to do
>> ref_transaction_begin()
>> ref_transaction_lock_ref()*
>> ...do stuff...
>> ref_transaction_create|update|delete()*
>> ref_transaction_commit()
>>
>> In this second case ref_transaction_commit() would only take out any locks that
>> are missing during the 'lock the refs" loop.
>>
>> Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
>> early during
>> a transaction.
>
> Hmph.
>
> I am not sure if that is the right way to go, or instead change all
> create/update/delete to take locks without adding a new primitive.

ack.

>
>> A second idea is to change the signatures for
>> ref_transaction_create|update|delete()
>> slightly and allow them to return errors early.
>> We can check for things like add_update() failing, check that the
>> ref-name looks sane,
>> check some of the flags, like if has_old==true then old sha1 should
>> not be NULL or 0{40}, etc.
>>
>> Additionally for robustness, if any of these functions detect an error
>> we can flag this in the
>> transaction structure and take action during ref_transaction_commit().
>> I.e. if a ref_transaction_update had a hard failure, do not allow
>> ref_transaction_commit()
>> to succeed.
>>
>> Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
>> All callers that use these functions should check the function for error.
>
> I think that is a very sensible thing to do.
>
> The details of determining "this cannot possibly succeed" may change
> (for example, if we have them take locks at the point of
> create/delete/update, a failure to lock may count as an early
> error).
>
> Is there any reason why this should be conditional (i.e. you said
> "allow them to", implying that the early failure is optional)?

It was poor wording on my side. Checking for the ref_transaction_*()
return for error should be mandatory (modulo bugs).

But a caller could be buggy and fail to check properly.
It would be very cheap to detect this condition in
ref_transaction_commit() which could then do

  die("transaction commit called for errored transaction");

which would make it easy to spot this kind of bugs.


>
>> Suggestion 3: remove the qsort and check for duplicates in
>> ref_transaction_commit()
>> Since we are already taking out a lock for each ref we are updating
>> during the transaction
>> any duplicate refs will fail the second attempt to lock the same ref which will
>> implicitly make sure that a transaction will not change the same ref twice.
>
> I do not know if I care about the implementation detail of "do we
> have a unique set of update requests?".  While I do not see a strong
> need for one transaction to touch the same ref twice (e.g. create to
> point at commit A and update it to point at commit B), I do not see
> why we should forbid such a use in the future.
>

ack.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-16 21:31           ` Ronnie Sahlberg
@ 2014-04-16 21:42             ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2014-04-16 21:42 UTC (permalink / raw)
  To: Ronnie Sahlberg; +Cc: Michael Haggerty, git@vger.kernel.org

Ronnie Sahlberg <sahlberg@google.com> writes:

>> I am not sure if that is the right way to go, or instead change all
>> create/update/delete to take locks without adding a new primitive.
>
> ack.

Hmph.  When I say "I am not sure", "I dunno", etc., I do mean it.

Did you mean by "Ack" "I do not know, either", or "I think it is
better to take locks early everywhere"?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 0/3] Make update refs more atomic
  2014-04-16 19:31         ` Junio C Hamano
  2014-04-16 21:31           ` Ronnie Sahlberg
@ 2014-04-16 21:51           ` Michael Haggerty
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Haggerty @ 2014-04-16 21:51 UTC (permalink / raw)
  To: Junio C Hamano, Ronnie Sahlberg; +Cc: git@vger.kernel.org

On 04/16/2014 09:31 PM, Junio C Hamano wrote:
> Ronnie Sahlberg <sahlberg@google.com> writes:
> 
>> Currently any locking of refs in a transaction only happens during the commit
>> phase. I think it would be useful to have a mechanism where you could
>> optionally take out locks for the involved refs early during the transaction.
>> So that simple callers could continue using
>> ref_transaction_begin()
>> ref_transaction_create|update|delete()*
>> ref_transaction_commit()
>>
>> but, if a caller such as walker_fetch() could opt to do
>> ref_transaction_begin()
>> ref_transaction_lock_ref()*
>> ...do stuff...
>> ref_transaction_create|update|delete()*
>> ref_transaction_commit()
>>
>> In this second case ref_transaction_commit() would only take out any locks that
>> are missing during the 'lock the refs" loop.
>>
>> Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
>> early during
>> a transaction.
> 
> Hmph.
> 
> I am not sure if that is the right way to go, or instead change all
> create/update/delete to take locks without adding a new primitive.

Junio's suggestion seems like a good idea to me.  Obviously, as soon as
we take out a lock we could also do any applicable old_sha1 check and
possibly fail fast.

Does a "verify" operation require holding a lock?  If not, when is the
verification done--early, or during the commit, or both?  (I realize
that we don't yet *have* a verify operation at the API level, but we
should!)

We also need to think about for what period of time we have to hold the
packed-refs lock.

Finally, we shouldn't forget that currently the reflog files are locked
by holding the lock on the corresponding loose reference file.  Do we
need to integrate these files into our system any more than they
currently are?

[By the way, I noticed the other day that the command

    git reflog expire --stale-fix --expire-unreachable=now --all

can hold loose reference locks for a *long* time (like 10s of minutes),
especially in weird cases like the repository having 9000 packfiles for
some reason or another :-)  The command execution time grows strongly
with the length of the reference's log, or maybe even the square of the
length assuming the history is roughly linear and reachability is
computed separately for each SHA-1.  This is just an empirical
observation so far; I haven't looked into the code yet.]

>> A second idea is to change the signatures for
>> ref_transaction_create|update|delete()
>> slightly and allow them to return errors early.
>> We can check for things like add_update() failing, check that the
>> ref-name looks sane,
>> check some of the flags, like if has_old==true then old sha1 should
>> not be NULL or 0{40}, etc.
>>
>> Additionally for robustness, if any of these functions detect an error
>> we can flag this in the
>> transaction structure and take action during ref_transaction_commit().
>> I.e. if a ref_transaction_update had a hard failure, do not allow
>> ref_transaction_commit()
>> to succeed.
>>
>> Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
>> All callers that use these functions should check the function for error.
> 
> I think that is a very sensible thing to do.
> 
> The details of determining "this cannot possibly succeed" may change
> (for example, if we have them take locks at the point of
> create/delete/update, a failure to lock may count as an early
> error).
> 
> Is there any reason why this should be conditional (i.e. you said
> "allow them to", implying that the early failure is optional)?

Also a good suggestion.  We should make it clear in the documentation
that the create/delete/update functions are not *obliged* to return an
error (for example that the current value of the reference does not
agree with old_sha1) because future alternate ref backends might
possibly not be able to make such checks until the commit phase.  For
example, checking old_sha1 might involve a round-trip to a database or
remote repository, in which case it might be preferable to check all
values in a single round-trip upon commit.

So, callers might be informed early of problems, or they might only
learn about problems when they try to commit the transaction.  They have
to be able to handle either type of error reporting.

So then the question arises (and maybe this is what Ronnie was getting
at by suggesting that the checks might be conditional): are callers
*obliged* to check the return values from create/delete/update, or are
they allowed to just keep throwing everything into the transaction,
ignoring errors, and only check the result of ref_transaction_commit()?

I don't feel strongly one way or the other about this question.  It
might be nice to be able to write callers sloppily, but it would cost a
bit more code in the reference backends.  Though maybe it wouldn't even
be much extra code, given that we would probably want to put consistency
checks in there anyway.

>> Suggestion 3: remove the qsort and check for duplicates in
>> ref_transaction_commit()
>> Since we are already taking out a lock for each ref we are updating
>> during the transaction
>> any duplicate refs will fail the second attempt to lock the same ref which will
>> implicitly make sure that a transaction will not change the same ref twice.
> 
> I do not know if I care about the implementation detail of "do we
> have a unique set of update requests?".  While I do not see a strong
> need for one transaction to touch the same ref twice (e.g. create to
> point at commit A and update it to point at commit B), I do not see
> why we should forbid such a use in the future.

I agree.  For example, we might very well want to allow multiple updates
to a single reference, as long as they are mutually consistent, to be
coalesced into a single update.  I also expect that the error message in
the current code is more illuminating than the generic error message
that would result from the inability to lock a reference (because it is
already locked by an earlier update in the same transaction!)  Let's
leave this aspect the way it is now and revisit the topic later when we
have learned more.

I see, though, that this will be a little tricky to implement early
locking.  Currently we sort all of the updates and look for duplicates
*before* we acquire any locks.  But if we want to acquire locks
immediately, then (since the list isn't sorted yet) it will be expensive
to look for duplicates before attempting to lock.  I can see two
possibilities:

1. Use a different data structure, like a hash map, for storing updates,
so that we have a cheap way to lookup whether there is already an update
involving a reference.

2. Try to lock the reference, and if the lock fails *then* look back
through the list to see if we're the ones holding the lock already.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-04-16 21:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-14 18:29 [PATCH v4 0/3] Make update refs more atomic Ronnie Sahlberg
2014-04-14 18:29 ` [PATCH v4 1/3] refs.c: split writing and commiting a ref into two separate functions Ronnie Sahlberg
2014-04-15 11:17   ` Michael Haggerty
2014-04-14 18:29 ` [PATCH v4 2/3] refs.c: split delete_ref_loose() into a separate flag-for-deletion and commit phase Ronnie Sahlberg
2014-04-15 17:19   ` Michael Haggerty
2014-04-14 18:29 ` [PATCH v4 3/3] refs.c: change ref_transaction_commit to run the commit loops once all work is finished Ronnie Sahlberg
2014-04-14 20:24 ` [PATCH v4 0/3] Make update refs more atomic Junio C Hamano
2014-04-15 16:41   ` Ronnie Sahlberg
2014-04-15  6:36 ` Michael Haggerty
2014-04-15 16:33   ` Ronnie Sahlberg
2014-04-15 20:32     ` Michael Haggerty
2014-04-16 17:11       ` Ronnie Sahlberg
2014-04-16 19:31         ` Junio C Hamano
2014-04-16 21:31           ` Ronnie Sahlberg
2014-04-16 21:42             ` Junio C Hamano
2014-04-16 21:51           ` Michael Haggerty

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).