git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Turner <dturner@twopensource.com>
To: git@vger.kernel.org, mhagger@alum.mit.edu
Cc: Jeff King <peff@peff.net>
Subject: [PATCH v3 42/44] introduce "extensions" form of core.repositoryformatversion
Date: Mon, 12 Oct 2015 17:52:03 -0400	[thread overview]
Message-ID: <1444686725-27660-44-git-send-email-dturner@twopensource.com> (raw)
In-Reply-To: <1444686725-27660-1-git-send-email-dturner@twopensource.com>

From: Jeff King <peff@peff.net>

Normally we try to avoid bumps of the whole-repository
core.repositoryformatversion field. However, it is
unavoidable if we want to safely change certain aspects of
git in a backwards-incompatible way (e.g., modifying the set
of ref tips that we must traverse to generate a list of
unreachable, safe-to-prune objects).

If we were to bump the repository version for every such
change, then any implementation understanding version `X`
would also have to understand `X-1`, `X-2`, and so forth,
even though the incompatibilities may be in orthogonal parts
of the system, and there is otherwise no reason we cannot
implement one without the other (or more importantly, that
the user cannot choose to use one feature without the other,
weighing the tradeoff in compatibility only for that
particular feature).

This patch documents the existing repositoryformatversion
strategy and introduces a new format, "1", which lets a
repository specify that it must run with an arbitrary set of
extensions. This can be used, for example:

 - to inform git that the objects should not be pruned based
   only on the reachability of the ref tips (e.g, because it
   has "clone --shared" children)

 - that the refs are stored in a format besides the usual
   "refs" and "packed-refs" directories

Because we bump to format "1", and because format "1"
requires that a running git knows about any extensions
mentioned, we know that older versions of the code will not
do something dangerous when confronted with these new
formats.

For example, if the user chooses to use database storage for
refs, they may set the "extensions.refbackend" config to
"db". Older versions of git will not understand format "1"
and bail. Versions of git which understand "1" but do not
know about "refbackend", or which know about "refbackend"
but not about the "db" backend, will refuse to run. This is
annoying, of course, but much better than the alternative of
claiming that there are no refs in the repository, or
writing to a location that other implementations will not
read.

Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that
is meant to be used by users and future extensions to
provide safety with older implementations.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/technical/repository-version.txt | 81 ++++++++++++++++++++++++++
 cache.h                                        |  6 ++
 setup.c                                        | 37 +++++++++++-
 t/t1302-repo-version.sh                        | 38 ++++++++++++
 4 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/technical/repository-version.txt

diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt
new file mode 100644
index 0000000..3d7106d
--- /dev/null
+++ b/Documentation/technical/repository-version.txt
@@ -0,0 +1,81 @@
+Git Repository Format Versions
+==============================
+
+Every git repository is marked with a numeric version in the
+`core.repositoryformatversion` key of its `config` file. This version
+specifies the rules for operating on the on-disk repository data. An
+implementation of git which does not understand a particular version
+advertised by an on-disk repository MUST NOT operate on that repository;
+doing so risks not only producing wrong results, but actually losing
+data.
+
+Because of this rule, version bumps should be kept to an absolute
+minimum. Instead, we generally prefer these strategies:
+
+  - bumping format version numbers of individual data files (e.g.,
+    index, packfiles, etc). This restricts the incompatibilities only to
+    those files.
+
+  - introducing new data that gracefully degrades when used by older
+    clients (e.g., pack bitmap files are ignored by older clients, which
+    simply do not take advantage of the optimization they provide).
+
+A whole-repository format version bump should only be part of a change
+that cannot be independently versioned. For instance, if one were to
+change the reachability rules for objects, or the rules for locking
+refs, that would require a bump of the repository format version.
+
+Note that this applies only to accessing the repository's disk contents
+directly. An older client which understands only format `0` may still
+connect via `git://` to a repository using format `1`, as long as the
+server process understands format `1`.
+
+The preferred strategy for rolling out a version bump (whether whole
+repository or for a single file) is to teach git to read the new format,
+and allow writing the new format with a config switch or command line
+option (for experimentation or for those who do not care about backwards
+compatibility with older gits). Then after a long period to allow the
+reading capability to become common, we may switch to writing the new
+format by default.
+
+The currently defined format versions are:
+
+Version `0`
+-----------
+
+This is the format defined by the initial version of git, including but
+not limited to the format of the repository directory, the repository
+configuration file, and the object and ref storage. Specifying the
+complete behavior of git is beyond the scope of this document.
+
+Version `1`
+-----------
+
+This format is identical to version `0`, with the following exceptions:
+
+  1. When reading the `core.repositoryformatversion` variable, a git
+     implementation which supports version 1 MUST also read any
+     configuration keys found in the `extensions` section of the
+     configuration file.
+
+  2. If a version-1 repository specifies any `extensions.*` keys that
+     the running git has not implemented, the operation MUST NOT
+     proceed. Similarly, if the value of any known key is not understood
+     by the implementation, the operation MUST NOT proceed.
+
+Note that if no extensions are specified in the config file, then
+`core.repositoryformatversion` SHOULD be set to `0` (setting it to `1`
+provides no benefit, and makes the repository incompatible with older
+implementations of git).
+
+This document will serve as the master list for extensions. Any
+implementation wishing to define a new extension should make a note of
+it here, in order to claim the name.
+
+The defined extensions are:
+
+`noop`
+~~~~~~
+
+This extension does not change git's behavior at all. It is useful only
+for testing format-1 compatibility.
diff --git a/cache.h b/cache.h
index 692cfd3..edfe7fa 100644
--- a/cache.h
+++ b/cache.h
@@ -699,7 +699,13 @@ extern const char *refs_backend_type;
 
 extern int grafts_replace_parents;
 
+/*
+ * GIT_REPO_VERSION is the version we write by default. The
+ * _READ variant is the highest number we know how to
+ * handle.
+ */
 #define GIT_REPO_VERSION 0
+#define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
 extern int check_repository_format(void);
 
diff --git a/setup.c b/setup.c
index b264471..7433eed 100644
--- a/setup.c
+++ b/setup.c
@@ -5,6 +5,7 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+static struct string_list unknown_extensions = STRING_LIST_INIT_DUP;
 
 /*
  * The input parameter must contain an absolute path, and it must already be
@@ -356,10 +357,23 @@ void setup_work_tree(void)
 
 static int check_repo_format(const char *var, const char *value, void *cb)
 {
+	const char *ext;
+
 	if (strcmp(var, "core.repositoryformatversion") == 0)
 		repository_format_version = git_config_int(var, value);
 	else if (strcmp(var, "core.sharedrepository") == 0)
 		shared_repository = git_config_perm(var, value);
+	else if (skip_prefix(var, "extensions.", &ext)) {
+		/*
+		 * record any known extensions here; otherwise,
+		 * we fall through to recording it as unknown, and
+		 * check_repository_format will complain
+		 */
+		if (!strcmp(ext, "noop"))
+			;
+		else
+			string_list_append(&unknown_extensions, ext);
+	}
 	return 0;
 }
 
@@ -370,6 +384,8 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	config_fn_t fn;
 	int ret = 0;
 
+	string_list_clear(&unknown_extensions, 0);
+
 	if (get_common_dir(&sb, gitdir))
 		fn = check_repo_format;
 	else
@@ -387,16 +403,31 @@ static int check_repository_format_gently(const char *gitdir, int *nongit_ok)
 	 * is a good one.
 	 */
 	git_config_early(fn, NULL, repo_config);
-	if (GIT_REPO_VERSION < repository_format_version) {
+	if (GIT_REPO_VERSION_READ < repository_format_version) {
 		if (!nongit_ok)
 			die ("Expected git repo version <= %d, found %d",
-			     GIT_REPO_VERSION, repository_format_version);
+			     GIT_REPO_VERSION_READ, repository_format_version);
 		warning("Expected git repo version <= %d, found %d",
-			GIT_REPO_VERSION, repository_format_version);
+			GIT_REPO_VERSION_READ, repository_format_version);
 		warning("Please upgrade Git");
 		*nongit_ok = -1;
 		ret = -1;
 	}
+
+	if (repository_format_version >= 1 && unknown_extensions.nr) {
+		int i;
+
+		if (!nongit_ok)
+			die("unknown repository extension: %s",
+			    unknown_extensions.items[0].string);
+
+		for (i = 0; i < unknown_extensions.nr; i++)
+			warning("unknown repository extension: %s",
+				unknown_extensions.items[i].string);
+		*nongit_ok = -1;
+		ret = -1;
+	}
+
 	strbuf_release(&sb);
 	return ret;
 }
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index 0d9388a..8dd6fd7 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -67,4 +67,42 @@ test_expect_success 'gitdir required mode' '
 	)
 '
 
+check_allow () {
+	git rev-parse --git-dir >actual &&
+	echo .git >expect &&
+	test_cmp expect actual
+}
+
+check_abort () {
+	test_must_fail git rev-parse --git-dir
+}
+
+# avoid git-config, since it cannot be trusted to run
+# in a repository with a broken version
+mkconfig () {
+	echo '[core]' &&
+	echo "repositoryformatversion = $1" &&
+	shift &&
+
+	if test $# -gt 0; then
+		echo '[extensions]' &&
+		for i in "$@"; do
+			echo "$i"
+		done
+	fi
+}
+
+while read outcome version extensions; do
+	test_expect_success "$outcome version=$version $extensions" "
+		mkconfig $version $extensions >.git/config &&
+		check_${outcome}
+	"
+done <<\EOF
+allow 0
+allow 1
+allow 1 noop
+abort 1 no-such-extension
+allow 0 no-such-extension
+EOF
+
 test_done
-- 
2.4.2.644.g97b850b-twtrsrc

  parent reply	other threads:[~2015-10-12 21:53 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-12 21:51 [PATCH v3 00/44] alternate refs backends David Turner
2015-10-12 21:51 ` [PATCH v3 01/44] refs.c: create a public version of verify_refname_available David Turner
2015-10-13  2:28   ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 02/44] refs: make repack_without_refs and is_branch public David Turner
2015-10-13  2:39   ` Michael Haggerty
2015-10-13  7:23     ` Michael Haggerty
2015-10-13 21:18       ` David Turner
2015-10-12 21:51 ` [PATCH v3 03/44] refs-be-files.c: rename refs to refs-be-files David Turner
2015-10-12 21:51 ` [PATCH v3 04/44] refs.c: add a new refs.c file to hold all common refs code David Turner
2015-10-12 21:51 ` [PATCH v3 05/44] refs.c: move update_ref to refs.c David Turner
2015-10-13  3:41   ` Michael Haggerty
2015-10-13 20:40     ` David Turner
2015-10-14  1:01     ` David Turner
2015-10-12 21:51 ` [PATCH v3 06/44] refs.c: move delete_ref and delete_refs to the common code David Turner
2015-10-13  3:51   ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 07/44] refs.c: move read_ref_at to the common refs file David Turner
2015-10-12 21:51 ` [PATCH v3 08/44] refs.c: move the hidden refs functions to the common code David Turner
2015-10-12 21:51 ` [PATCH v3 09/44] refs.c: move dwim and friend functions to the common refs code David Turner
2015-10-12 21:51 ` [PATCH v3 10/44] refs.c: move warn_if_dangling_symref* to the common code David Turner
2015-10-12 21:51 ` [PATCH v3 11/44] refs.c: move read_ref, read_ref_full and ref_exists " David Turner
2015-10-12 21:51 ` [PATCH v3 12/44] refs.c: move resolve_refdup to common David Turner
2015-10-12 21:51 ` [PATCH v3 13/44] refs.c: move check_refname_format to the common code David Turner
2015-10-12 21:51 ` [PATCH v3 14/44] refs.c: move is_branch " David Turner
2015-10-12 21:51 ` [PATCH v3 15/44] refs.c: move prettify_refname " David Turner
2015-10-12 21:51 ` [PATCH v3 16/44] refs.c: move ref iterators " David Turner
2015-10-13  4:12   ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 17/44] refs.c: move head_ref_namespaced " David Turner
2015-10-12 21:51 ` [PATCH v3 18/44] refs: move transaction functions into " David Turner
2015-10-13  6:15   ` Michael Haggerty
2015-10-13 11:29   ` Michael Haggerty
2015-10-13 22:21     ` David Turner
2015-10-13 23:05       ` David Turner
2015-10-12 21:51 ` [PATCH v3 19/44] refs: add a backend method structure with transaction functions David Turner
2015-10-12 21:51 ` [PATCH v3 19/44] refs-be-files.c: " David Turner
2015-10-12 22:25   ` David Turner
2015-10-13  7:56   ` Michael Haggerty
2015-10-13 18:28     ` David Turner
2015-10-13 21:12       ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 20/44] refs-be-files.c: add methods for misc ref operations David Turner
2015-10-13  7:44   ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 21/44] refs-be-files.c: add methods for the ref iterators David Turner
2015-10-12 21:51 ` [PATCH v3 22/44] refs-be-files.c: add method for for_each_reftype_ David Turner
2015-10-12 21:51 ` [PATCH v3 23/44] refs-be-files.c: add do_for_each_per_worktree_ref David Turner
2015-10-12 21:51 ` [PATCH v3 24/44] refs.c: move refname_is_safe to the common code David Turner
2015-10-12 21:51 ` [PATCH v3 25/44] refs.h: document make refname_is_safe and add it to header David Turner
2015-10-13  7:33   ` Michael Haggerty
2015-10-13  9:18     ` Michael Haggerty
2015-10-13 18:32       ` David Turner
2015-10-12 21:51 ` [PATCH v3 26/44] refs.c: move copy_msg to the common code David Turner
2015-10-13  9:25   ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 27/44] refs.c: move peel_object " David Turner
2015-10-13  9:34   ` Michael Haggerty
2015-10-12 21:51 ` [PATCH v3 28/44] refs.c: move should_autocreate_reflog to " David Turner
2015-10-12 21:51 ` [PATCH v3 29/44] refs.c: add ref backend init function David Turner
2015-10-12 21:51 ` [PATCH v3 30/44] refs.c: add methods for reflog David Turner
2015-10-12 21:51 ` [PATCH v3 31/44] refs.c: add method for initial ref transaction commit David Turner
2015-10-12 21:51 ` [PATCH v3 32/44] initdb: move safe_create_dir into common code David Turner
2015-10-12 21:51 ` [PATCH v3 33/44] refs.c: add method for initializing refs db David Turner
2015-10-12 21:51 ` [PATCH v3 34/44] refs-be-files.c: add method to rename refs David Turner
2015-10-12 21:51 ` [PATCH v3 35/44] refs: make lock generic David Turner
2015-10-12 21:51 ` [PATCH v3 36/44] refs: make files_log_ref_write functions public David Turner
2015-10-12 21:51 ` [PATCH v3 37/44] refs: break out a ref conflict check David Turner
2015-10-13 10:25   ` Michael Haggerty
2015-10-13 18:46     ` David Turner
2015-10-12 21:51 ` [PATCH v3 38/44] refs: move duplicate check to common code David Turner
2015-10-12 21:52 ` [PATCH v3 39/44] refs: always handle non-normal refs in files backend David Turner
2015-10-12 21:52 ` [PATCH v3 40/44] refs: allow ref backend to be set for clone David Turner
2015-10-12 21:52 ` [PATCH v3 41/44] refs: add register_refs_backend David Turner
2015-10-12 21:52 ` David Turner [this message]
2015-10-12 21:52 ` [PATCH v3 43/44] refs: add LMDB refs backend David Turner
2015-10-12 21:52 ` [PATCH v3 44/44] refs: tests for db backend David Turner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1444686725-27660-44-git-send-email-dturner@twopensource.com \
    --to=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).