From: Thomas Gummerer <t.gummerer@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org, mhagger@alum.mit.edu
Subject: Re: [PATCH v3 19/20] refs: add LMDB refs backend
Date: Fri, 15 Jan 2016 14:33:17 +0100 [thread overview]
Message-ID: <20160115133317.GJ10612@hank> (raw)
In-Reply-To: <1452788777-24954-20-git-send-email-dturner@twopensource.com>
[I don't know too much about the refs code, but was still interested
in this patch series, so below are the few things that I noticed while
reading through it.]
On 01/14, David Turner wrote:
> Add a database backend for refs using LMDB. This backend runs git
> for-each-ref about 30% faster than the files backend with fully-packed
> refs on a repo with ~120k refs. It's also about 4x faster than using
> fully-unpacked refs. In addition, and perhaps more importantly, it
> avoids case-conflict issues on OS X.
>
> LMDB has a few features that make it suitable for usage in git:
>
> 1. It is relatively lightweight; it requires only one header file, and
> the library code takes under 64k at runtime.
>
> 2. It is well-tested: it's been used in OpenLDAP for years.
>
> 3. It's very fast. LMDB's benchmarks show that it is among
> the fastest key-value stores.
>
> 4. It has a relatively simple concurrency story; readers don't
> block writers and writers don't block readers.
>
> Ronnie Sahlberg's original version of this patchset used tdb. The
> major disadvantage of tdb is that tdb is hard to build on OS X. It's
> also not in homebrew. So lmdb seemed simpler.
>
> To test this backend's correctness, I hacked test-lib.sh and
> test-lib-functions.sh to run all tests under the refs backend. Dozens
> of tests use manual ref/reflog reading/writing, or create submodules
> without passing --ref-storage to git init. If those tests are
> changed to use the update-ref machinery or test-refs-lmdb-backend (or,
> in the case of packed-refs, corrupt refs, and dumb fetch tests, are
> skipped), the only remaining failing tests are the git-new-workdir
> tests and the gitweb tests.
>
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
> .gitignore | 1 +
> Documentation/config.txt | 7 +
> Documentation/git-clone.txt | 3 +-
> Documentation/git-init.txt | 2 +-
> Documentation/technical/refs-lmdb-backend.txt | 52 +
> Documentation/technical/repository-version.txt | 5 +
> Makefile | 12 +
> config.c | 29 +
> configure.ac | 33 +
> contrib/workdir/git-new-workdir | 3 +
> refs.h | 2 +
> refs/lmdb-backend.c | 2051 ++++++++++++++++++++++++
> setup.c | 11 +-
> test-refs-lmdb-backend.c | 64 +
> transport.c | 7 +-
> 15 files changed, 2275 insertions(+), 7 deletions(-)
> create mode 100644 Documentation/technical/refs-lmdb-backend.txt
> create mode 100644 refs/lmdb-backend.c
> create mode 100644 test-refs-lmdb-backend.c
>
[...]
> diff --git a/Documentation/technical/refs-lmdb-backend.txt b/Documentation/technical/refs-lmdb-backend.txt
> new file mode 100644
> index 0000000..c328bfc
> --- /dev/null
> +++ b/Documentation/technical/refs-lmdb-backend.txt
> @@ -0,0 +1,52 @@
> +Notes on the LMDB refs backend
> +==============================
> +
> +Design:
> +------
> +
> +Refs and reflogs are stored in a lmdb database in .git/refdb. All
> +keys and values are \0-terminated.
> +
> +Keys for refs are the name of the ref (e.g. refs/heads/master).
> +Values are the value of the ref, in hex
> +(e.g. 61f23eb0f81357c19fa91e2b8c6f3906c3a8f9b0).
> +
> +All per-worktree refs (refs/bisect/* and HEAD) are stored using
> +the traditional files-based backend.
> +
> +Reflogs are stored as a series of database entries.
> +
> +For non-empty reflogs, there is one entry per logged ref update. The
> +key format is logs/[refname]\0[timestamp]. The timestamp is a 64-bit
> +unsigned integer number of nanoseconds since 1/1/1970, stored in
> +network byte order. This means that reflog entries are
> +chronologically ordered. Because LMDB is a btree database, we can
> +efficiently iterate over these keys.
> +
> +For an empty reflog, there is a "header" entry to show that a reflog
> +exists. The header has the same format as an ordinary reflog, but with
> +a timeztamp of all zeros and an empty value.
s/timeztamp/timestamp/
> +
> +Reflog values are in the same format as the original files-based
> +reflog, including the trailing LF. The date in the reflog value
> +matches the date in the timestamp field.
> +
> +Weaknesses:
> +-----------
> +
> +The reflog format is somewhat inefficient: a binary format could store
> +reflog date/time information in somewhat less space.
> +
> +The rsync and file:// transports don't work yet, because they
> +don't use the refs API.
> +
> +git new-workdir is incompatible with the lmdb backend. Fortunately,
> +git new-workdir is deprecated, and worktrees work fine.
> +
> +LMDB locks the entire database for write. Any other writer waits
> +until the first writer is done before beginning. Readers do not wait
> +for writers, and writers do not wait for readers. The underlying
> +scheme is approximately MVCC; each reader's queries see the state of
> +the database as-of the time that the reader acquired its read lock.
> +This is not too far off from the files backend, which loads all refs
> +into memory when one is requested.
[...]
> diff --git a/contrib/workdir/git-new-workdir b/contrib/workdir/git-new-workdir
> index 888c34a..c035a43 100755
> --- a/contrib/workdir/git-new-workdir
> +++ b/contrib/workdir/git-new-workdir
> @@ -28,6 +28,9 @@ git_dir=$(cd "$orig_git" 2>/dev/null &&
> git rev-parse --git-dir 2>/dev/null) ||
> die "Not a git repository: \"$orig_git\""
>
> +
> +test "$(git config extensions.refstorage)" = "lmdb" && die "git-new-workdir is incompatible with the refs lmdb storage"
> +
Is it expected that other potential ref backends are compatible with
git-new-workdir? Otherwise I think it would make more sense to
whitelist the files backend here instead of blacklisting the lmdb
backend, so we don't risk forgetting about this when adding another
backend.
> case "$git_dir" in
> .git)
> git_dir="$orig_git/.git"
[...]
> +static int lmdb_init_db(struct strbuf *err, int shared)
> +{
> + /*
> + * To create a db, all we need to do is make a directory for
> + * it to live in; lmdb will do the rest.
> + */
> +
> + if (!db_path)
> + db_path = xstrdup(real_path(get_refdb_path(get_git_common_dir())));
I think we're leaking some memory from get_refdb_path() here.
get_refdb_path() uses strbuf_detach(), which according to its docstring
makes its caller take care of the memory of the returned string.
real_path() then uses strbuf_addstr() to add the string to its
internal strbuf, but leaves the string we get from get_refdb_path()
alone, so it leaks.
> +
> + if (mkdir(db_path, 0775) && errno != EEXIST) {
> + strbuf_addf(err, "%s", strerror(errno));
> + return -1;
> + }
> +
> + return 0;
> +}
> +
[...]
> +/*
> + * Begin a transaction. Because only one transaction per thread is
> + * permitted, we use a global transaction object. If a read-write
> + * transaction is presently already in-progress, and a read-only
> + * transaction is requested, the read-write transaction will be
> + * returned instead. If a read-write transaction is requested and a
> + * read-only transaction is open, the read-only transaction will be
> + * closed.
> + *
> + * It is a bug to request a read-write transaction during another
> + * read-write transaction.
> + *
> + * As a result, it is unsafe to retain read-only transactions past the
> + * point where a read-write transaction might be needed. For
> + * instance, any call that has callbacks outside this module must
> + * conclude all of its reads from the database before calling those
> + * callbacks, or must reacquire the transaction after its callbacks
> + * are completed.
> + */
> +int lmdb_transaction_begin_flags(struct strbuf *err, unsigned int flags)
> +{
> + int ret;
> + MDB_txn *txn;
> + static size_t last_txnid = 0;
> + int force_restart = 0;
> + MDB_envinfo stat;
> +
> + if (!db_path)
> + db_path = xstrdup(real_path(get_refdb_path(get_git_common_dir())));
Same comment about leaking memory from get_refdb_path() as above applies.
> + init_env(&env, db_path);
> +
> + /*
> + * Since each transaction sees a consistent view of the db,
> + * downstream processes that write the db won't be seen in
> + * this transaction. We can check if the last transaction id
> + * has changed since this read transaction was started, and if
> + * so, we want to reopen the transaction.
> + */
> +
> + mdb_env_info(env, &stat);
> + if (stat.me_last_txnid != last_txnid) {
> + force_restart = 1;
> + last_txnid = stat.me_last_txnid;
> + }
> +
> + if (!transaction.txn) {
> + ret = mdb_txn_begin(env, NULL, flags, &txn);
> + if (ret) {
> + strbuf_addf(err, "mdb_txn_begin failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + ret = mdb_dbi_open(txn, NULL, 0, &transaction.dbi);
> + if (ret) {
> + strbuf_addf(err, "mdb_txn_open failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + transaction.txn = txn;
> + transaction.flags = flags;
> + return 0;
> + }
> +
> + if (transaction.flags == flags && !(flags & MDB_RDONLY))
> + die("BUG: rw transaction started during another rw txn");
> +
> + if (force_restart || (transaction.flags != flags && transaction.flags & MDB_RDONLY)) {
> + /*
> + * RO -> RW, or forced restart due to possible changes
> + * from downstream processes.
> + */
> + mdb_txn_abort(transaction.txn);
> + ret = mdb_txn_begin(env, NULL, flags, &txn);
> + if (ret) {
> + strbuf_addf(err, "restarting txn: mdb_txn_begin failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + ret = mdb_dbi_open(txn, NULL, 0, &transaction.dbi);
> + if (ret) {
> + strbuf_addf(err, "mdb_txn_open failed: %s",
> + mdb_strerror(ret));
> + return -1;
> + }
> + transaction.txn = txn;
> + transaction.flags = flags;
> + }
> + /* RW -> RO just keeps the RW txn */
> + return 0;
> +}
[...]
next prev parent reply other threads:[~2016-01-15 13:32 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-14 16:25 [PATCH v3 00/20] refs backend rebase on pu David Turner
2016-01-14 16:25 ` [PATCH v3 01/20] refs: add a backend method structure with transaction functions David Turner
2016-01-14 16:25 ` [PATCH v3 02/20] refs: add methods for misc ref operations David Turner
2016-01-14 16:26 ` [PATCH v3 03/20] refs: add methods for the ref iterators David Turner
2016-01-14 16:26 ` [PATCH v3 04/20] refs: add do_for_each_per_worktree_ref David Turner
2016-01-14 16:26 ` [PATCH v3 05/20] refs: add methods for reflog David Turner
2016-01-14 16:26 ` [PATCH v3 06/20] refs: add method for initial ref transaction commit David Turner
2016-01-14 16:26 ` [PATCH v3 07/20] refs: add method for delete_refs David Turner
2016-01-14 16:26 ` [PATCH v3 08/20] refs: add methods to init refs db David Turner
2016-01-14 16:26 ` [PATCH v3 09/20] refs: add method to rename refs David Turner
2016-01-14 16:26 ` [PATCH v3 10/20] refs: make lock generic David Turner
2016-01-14 16:26 ` [PATCH v3 11/20] refs: move duplicate check to common code David Turner
2016-01-14 16:26 ` [PATCH v3 12/20] refs: allow log-only updates David Turner
2016-01-14 16:26 ` [PATCH v3 13/20] refs: resolve symbolic refs first David Turner
2016-02-04 7:37 ` Jeff King
2016-02-04 19:24 ` David Turner
2016-01-14 16:26 ` [PATCH v3 14/20] refs: always handle non-normal refs in files backend David Turner
2016-01-14 16:26 ` [PATCH v3 15/20] init: allow alternate backends to be set for new repos David Turner
2016-01-15 11:33 ` SZEDER Gábor
2016-01-15 12:51 ` Thomas Gummerer
2016-01-19 19:12 ` David Turner
2016-02-04 9:48 ` Duy Nguyen
2016-02-04 20:05 ` David Turner
2016-01-14 16:26 ` [PATCH v3 16/20] refs: check submodules ref storage config David Turner
2016-01-14 16:26 ` [PATCH v3 17/20] refs: allow ref backend to be set for clone David Turner
2016-01-15 11:32 ` SZEDER Gábor
2016-01-19 17:06 ` David Turner
2016-01-21 9:08 ` SZEDER Gábor
2016-01-14 16:26 ` [PATCH v3 18/20] svn: learn ref-storage argument David Turner
2016-01-15 11:34 ` SZEDER Gábor
2016-01-14 16:26 ` [PATCH v3 19/20] refs: add LMDB refs backend David Turner
2016-01-15 13:33 ` Thomas Gummerer [this message]
2016-01-19 18:55 ` David Turner
2016-02-04 9:58 ` Duy Nguyen
2016-02-04 19:33 ` David Turner
2016-01-14 16:26 ` [PATCH v3 20/20] refs: tests for lmdb backend David Turner
2016-02-02 20:08 ` [PATCH v3 00/20] refs backend rebase on pu David Turner
2016-02-02 22:13 ` Junio C Hamano
2016-02-04 0:09 ` Junio C Hamano
2016-02-04 1:12 ` David Turner
2016-02-04 1:54 ` Ramsay Jones
2016-02-04 2:58 ` Jeff King
2016-02-04 22:44 ` David Turner
2016-02-04 10:09 ` Duy Nguyen
2016-02-04 21:39 ` David Turner
2016-02-04 11:42 ` Duy Nguyen
2016-02-04 20:25 ` David Turner
2016-02-04 20:39 ` Ramsay Jones
2016-02-04 21:23 ` David Turner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160115133317.GJ10612@hank \
--to=t.gummerer@gmail.com \
--cc=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=mhagger@alum.mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).