From: Thomas Gummerer <t.gummerer@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org, mhagger@alum.mit.edu
Subject: Re: [PATCH v3 19/20] refs: add LMDB refs backend
Date: Fri, 15 Jan 2016 14:33:17 +0100 [thread overview]
Message-ID: <20160115133317.GJ10612@hank> (raw)
In-Reply-To: <1452788777-24954-20-git-send-email-dturner@twopensource.com>
[I don't know too much about the refs code, but was still interested
in this patch series, so below are the few things that I noticed while
reading through it.]
On 01/14, David Turner wrote:
> Add a database backend for refs using LMDB. This backend runs git
> for-each-ref about 30% faster than the files backend with fully-packed
> refs on a repo with ~120k refs. It's also about 4x faster than using
> fully-unpacked refs. In addition, and perhaps more importantly, it
> avoids case-conflict issues on OS X.
>
> LMDB has a few features that make it suitable for usage in git:
>
> 1. It is relatively lightweight; it requires only one header file, and
> the library code takes under 64k at runtime.
>
> 2. It is well-tested: it's been used in OpenLDAP for years.
>
> 3. It's very fast. LMDB's benchmarks show that it is among
> the fastest key-value stores.
>
> 4. It has a relatively simple concurrency story; readers don't
> block writers and writers don't block readers.
>
> Ronnie Sahlberg's original version of this patchset used tdb. The
> major disadvantage of tdb is that tdb is hard to build on OS X. It's
> also not in homebrew. So lmdb seemed simpler.
>
> To test this backend's correctness, I hacked test-lib.sh and
> test-lib-functions.sh to run all tests under the refs backend. Dozens
> of tests use manual ref/reflog reading/writing, or create submodules
> without passing --ref-storage to git init. If those tests are
> changed to use the update-ref machinery or test-refs-lmdb-backend (or,
> in the case of packed-refs, corrupt refs, and dumb fetch tests, are
> skipped), the only remaining failing tests are the git-new-workdir
> tests and the gitweb tests.
>
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
> .gitignore | 1 +
> Documentation/config.txt | 7 +
> Documentation/git-clone.txt | 3 +-
> Documentation/git-init.txt | 2 +-
> Documentation/technical/refs-lmdb-backend.txt | 52 +
> Documentation/technical/repository-version.txt | 5 +
> Makefile | 12 +
> config.c | 29 +
> configure.ac | 33 +
> contrib/workdir/git-new-workdir | 3 +
> refs.h | 2 +
> refs/lmdb-backend.c | 2051 ++++++++++++++++++++++++
> setup.c | 11 +-
> test-refs-lmdb-backend.c | 64 +
> transport.c | 7 +-
> 15 files changed, 2275 insertions(+), 7 deletions(-)
> create mode 100644 Documentation/technical/refs-lmdb-backend.txt
> create mode 100644 refs/lmdb-backend.c
> create mode 100644 test-refs-lmdb-backend.c
>
[...]
> diff --git a/Documentation/technical/refs-lmdb-backend.txt b/Documentation/technical/refs-lmdb-backend.txt
> new file mode 100644
> index 0000000..c328bfc
> --- /dev/null
> +++ b/Documentation/technical/refs-lmdb-backend.txt
> @@ -0,0 +1,52 @@
> +Notes on the LMDB refs backend
> +==============================
> +
> +Design:
> +------
> +
> +Refs and reflogs are stored in a lmdb database in .git/refdb. All
> +keys and values are \0-terminated.
> +
> +Keys for refs are the name of the ref (e.g. refs/heads/master).
> +Values are the value of the ref, in hex
> +(e.g. 61f23eb0f81357c19fa91e2b8c6f3906c3a8f9b0).
> +
> +All per-worktree refs (refs/bisect/* and HEAD) are stored using
> +the traditional files-based backend.
> +
> +Reflogs are stored as a series of database entries.
> +
> +For non-empty reflogs, there is one entry per logged ref update. The
> +key format is logs/[refname]\0[timestamp]. The timestamp is a 64-bit
> +unsigned integer number of nanoseconds since 1/1/1970, stored in
> +network byte order. This means that reflog entries are
> +chronologically ordered. Because LMDB is a btree database, we can
> +efficiently iterate over these keys.
> +
> +For an empty reflog, there is a "header" entry to show that a reflog
> +exists. The header has the same format as an ordinary reflog, but with
> +a timeztamp of all zeros and an empty value.
s/timeztamp/timestamp/
> +
> +Reflog values are in the same format as the original files-based
> +reflog, including the trailing LF. The date in the reflog value
> +matches the date in the timestamp field.
> +
> +Weaknesses:
> +-----------
> +
> +The reflog format is somewhat inefficient: a binary format could store
> +reflog date/time information in somewhat less space.
> +
> +The rsync and file:// transports don't work yet, because they
> +don't use the refs API.
> +
> +git new-workdir is incompatible with the lmdb backend. Fortunately,
> +git new-workdir is deprecated, and worktrees work fine.
> +
> +LMDB locks the entire database for write. Any other writer waits
> +until the first writer is done before beginning. Readers do not wait
> +for writers, and writers do not wait for readers. The underlying
> +scheme is approximately MVCC; each reader's queries see the state of
> +the database as-of the time that the reader acquired its read lock.
> +This is not too far off from the files backend, which loads all refs
> +into memory when one is requested.
[...]
> diff --git a/contrib/workdir/git-new-workdir b/contrib/workdir/git-new-workdir
> index 888c34a..c035a43 100755
> --- a/contrib/workdir/git-new-workdir
> +++ b/contrib/workdir/git-new-workdir
> @@ -28,6 +28,9 @@ git_dir=$(cd "$orig_git" 2>/dev/null &&
> git rev-parse --git-dir 2>/dev/null) ||
> die "Not a git repository: \"$orig_git\""
>
> +
> +test "$(git config extensions.refstorage)" = "lmdb" && die "git-new-workdir is incompatible with the refs lmdb storage"
> +
Is it expected that other potential ref backends are compatible with
git-new-workdir? Otherwise I think it would make more sense to
whitelist the files backend here instead of blacklisting the lmdb
backend, so we don't risk forgetting about this when adding another
backend.
> case "$git_dir" in
> .git)
> git_dir="$orig_git/.git"
[...]
> +static int lmdb_init_db(struct strbuf *err, int shared)
> +{
> + /*
> + * To create a db, all we need to do is make a directory for
> + * it to live in; lmdb will do the rest.
> + */
> +
> + if (!db_path)
> + db_path = xstrdup(real_path(get_refdb_path(get_git_common_dir())));
I think we're leaking some memory from get_refdb_path() here.
get_refdb_path() uses strbuf_detach(), which according to its docstring
makes its caller take care of the memory of the returned string.
real_path() then uses strbuf_addstr() to add the string to its
internal strbuf, but leaves the string we get from get_refdb_path()
alone, so it leaks.
> +
> + if (mkdir(db_path, 0775) && errno != EEXIST) {
> + strbuf_addf(err, "%s", strerror(errno));
> + return -1;
> + }
> +
> + return 0;
> +}
> +
[...]
> +/*
> + * Begin a transaction. Because only one transaction per thread is
> + * permitted, we use a global transaction object. If a read-write
> + * transaction is presently already in-progress, and a read-only
> + * transaction is requested, the read-write transaction will be
> + * returned instead. If a read-write transaction is requested and a
> + * read-only transaction is open, the read-only transaction will be
> + * closed.
> + *
> + * It is a bug to request a read-write transaction during another
> + * read-write transaction.
> + *
> + * As a result, it is unsafe to retain read-only transactions past the
> + * point where a read-write transaction might be needed. For
> + * instance, any call that has callbacks outside this module must
> + * conclude all of its reads from the database before calling those
> + * callbacks, or must reacquire the transaction after its callbacks
> + * are completed.
> + */
> +int lmdb_transaction_begin_flags(struct strbuf *err, unsigned int flags)
> +{
> + int ret;
> + MDB_txn *txn;
> + static size_t last_txnid = 0;
> + int force_restart = 0;
> + MDB_envinfo stat;
> +
> + if (!db_path)
> + db_path = xstrdup(real_path(get_refdb_path(get_git_common_dir())));
Same comment about leaking memory from get_refdb_path() as above applies.
> + init_env(&env, db_path);
> +
> + /*
> + * Since each transaction sees a consistent view of the db,
> + * downstream processes that write the db won't be seen in
> + * this transaction. We can check if the last transaction id
> + * has changed since this read transaction was started, and if
> + * so, we want to reopen the transaction.
> + */
> +
> + mdb_env_info(env, &stat);
> + if (stat.me_last_txnid != last_txnid) {
> + force_restart = 1;
> + last_txnid = stat.me_last_txnid;
> + }
> +
> + if (!transaction.txn) {
> + ret = mdb_txn_begin(env, NULL, flags, &txn);
> + if (ret) {
> + strbuf_addf(err, "mdb_txn_begin failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + ret = mdb_dbi_open(txn, NULL, 0, &transaction.dbi);
> + if (ret) {
> + strbuf_addf(err, "mdb_txn_open failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + transaction.txn = txn;
> + transaction.flags = flags;
> + return 0;
> + }
> +
> + if (transaction.flags == flags && !(flags & MDB_RDONLY))
> + die("BUG: rw transaction started during another rw txn");
> +
> + if (force_restart || (transaction.flags != flags && transaction.flags & MDB_RDONLY)) {
> + /*
> + * RO -> RW, or forced restart due to possible changes
> + * from downstream processes.
> + */
> + mdb_txn_abort(transaction.txn);
> + ret = mdb_txn_begin(env, NULL, flags, &txn);
> + if (ret) {
> + strbuf_addf(err, "restarting txn: mdb_txn_begin failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + ret = mdb_dbi_open(txn, NULL, 0, &transaction.dbi);
> + if (ret) {
> + strbuf_addf(err, "mdb_txn_open failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + transaction.txn = txn;
> + transaction.flags = flags;
> + }
> + /* RW -> RO just keeps the RW txn */
> + return 0;
> +}
[...]
next prev parent reply other threads:[~2016-01-15 13:32 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-14 16:25 [PATCH v3 00/20] refs backend rebase on pu David Turner
2016-01-14 16:25 ` [PATCH v3 01/20] refs: add a backend method structure with transaction functions David Turner
2016-01-14 16:25 ` [PATCH v3 02/20] refs: add methods for misc ref operations David Turner
2016-01-14 16:26 ` [PATCH v3 03/20] refs: add methods for the ref iterators David Turner
2016-01-14 16:26 ` [PATCH v3 04/20] refs: add do_for_each_per_worktree_ref David Turner
2016-01-14 16:26 ` [PATCH v3 05/20] refs: add methods for reflog David Turner
2016-01-14 16:26 ` [PATCH v3 06/20] refs: add method for initial ref transaction commit David Turner
2016-01-14 16:26 ` [PATCH v3 07/20] refs: add method for delete_refs David Turner
2016-01-14 16:26 ` [PATCH v3 08/20] refs: add methods to init refs db David Turner
2016-01-14 16:26 ` [PATCH v3 09/20] refs: add method to rename refs David Turner
2016-01-14 16:26 ` [PATCH v3 10/20] refs: make lock generic David Turner
2016-01-14 16:26 ` [PATCH v3 11/20] refs: move duplicate check to common code David Turner
2016-01-14 16:26 ` [PATCH v3 12/20] refs: allow log-only updates David Turner
2016-01-14 16:26 ` [PATCH v3 13/20] refs: resolve symbolic refs first David Turner
2016-02-04 7:37 ` Jeff King
2016-02-04 19:24 ` David Turner
2016-01-14 16:26 ` [PATCH v3 14/20] refs: always handle non-normal refs in files backend David Turner
2016-01-14 16:26 ` [PATCH v3 15/20] init: allow alternate backends to be set for new repos David Turner
2016-01-15 11:33 ` SZEDER Gábor
2016-01-15 12:51 ` Thomas Gummerer
2016-01-19 19:12 ` David Turner
2016-02-04 9:48 ` Duy Nguyen
2016-02-04 20:05 ` David Turner
2016-01-14 16:26 ` [PATCH v3 16/20] refs: check submodules ref storage config David Turner
2016-01-14 16:26 ` [PATCH v3 17/20] refs: allow ref backend to be set for clone David Turner
2016-01-15 11:32 ` SZEDER Gábor
2016-01-19 17:06 ` David Turner
2016-01-21 9:08 ` SZEDER Gábor
2016-01-14 16:26 ` [PATCH v3 18/20] svn: learn ref-storage argument David Turner
2016-01-15 11:34 ` SZEDER Gábor
2016-01-14 16:26 ` [PATCH v3 19/20] refs: add LMDB refs backend David Turner
2016-01-15 13:33 ` Thomas Gummerer [this message]
2016-01-19 18:55 ` David Turner
2016-02-04 9:58 ` Duy Nguyen
2016-02-04 19:33 ` David Turner
2016-01-14 16:26 ` [PATCH v3 20/20] refs: tests for lmdb backend David Turner
2016-02-02 20:08 ` [PATCH v3 00/20] refs backend rebase on pu David Turner
2016-02-02 22:13 ` Junio C Hamano
2016-02-04 0:09 ` Junio C Hamano
2016-02-04 1:12 ` David Turner
2016-02-04 1:54 ` Ramsay Jones
2016-02-04 2:58 ` Jeff King
2016-02-04 22:44 ` David Turner
2016-02-04 10:09 ` Duy Nguyen
2016-02-04 21:39 ` David Turner
2016-02-04 11:42 ` Duy Nguyen
2016-02-04 20:25 ` David Turner
2016-02-04 20:39 ` Ramsay Jones
2016-02-04 21:23 ` David Turner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160115133317.GJ10612@hank \
--to=t.gummerer@gmail.com \
--cc=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=mhagger@alum.mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.