From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH v5 25/27] refs: add LMDB refs storage backend
Date: Wed, 24 Feb 2016 15:37:22 -0500 [thread overview]
Message-ID: <1456346242.18017.16.camel@twopensource.com> (raw)
In-Reply-To: <CACsJy8BQZCBpfmvXk+o5PqM7=zad7cxgA9B2995Rb0D0YBxEVw@mail.gmail.com>
On Sat, 2016-02-20 at 15:59 +0700, Duy Nguyen wrote:
> On Thu, Feb 18, 2016 at 12:17 PM, David Turner <
> dturner@twopensource.com> wrote:
> > LMDB has a few features that make it suitable for usage in git:
> > ...
>
> I'm reading lmdb documents and hitting the caveat section [1].
> Random thoughts
>
> * "There is normally no pure read-only mode, since readers need write
> access to locks and lock file.".
>
> This will be a problem for server side that serves git:// protocol
> only. Some of those servers disable write access to the entire
> repository and git still works fine (but won't when lmdb is used).
> Should we do something in this case? Just tell server admins to relax
> file access, or use MDB_NOLOCK (and when? based on config var?)
MDB_NOLOCK is a good idea. I'm going to add this to the "Weaknesses"
section of the docs and plan to fix it later, unless you feel strongly
otherwise.
> * " Use an MDB_env* in the process which opened it, without
> fork()ing.". We do use fork on non-Windows in run-command.c, but it
> should be followed by exec() with no ref access in between, so we're
> almost good.
>
> I notice atexit() is used in this for/exec code, which reminds me we
> also use atexit() in many other places. I hope none of them access
> refs, or we could be in trouble.
I don't think they should do that anyway -- it's too much complexity
for an atexit handler.
> * "Do not have open an LMDB database twice in the same process at the
> same time. Not even from a plain open() call - close()ing it breaks
> flock() advisory locking"
>
> I wonder what happens if we do open twice, will it error out or
> silently ignore and move on? Because if it's the latter case, we need
> some protection from the caller and I'm not sure if
> lmdb-backend.c:init_env() has it, especially when it open submodule's
> lmdb.
I think LMDB could check this, but it doesn't seem to. I've refactored
some of the submodule transaction stuff so that now we have more
protection there.
> * "Avoid long-lived transactions...."
>
> OK we don't have a problem with this. But it makes me realize lmdb
> transactions do not map with ref transactions. We don't open lmdb
> transaction at ref_transaction_begin(), for example. Is it simply
> more
> convenient to do transactions the current way, or is it impossible or
> incorrect to attach lmdb transactions to ref_transaction_*()?
That was what I did originally, but reviewers here noted that it had
some problems:
1. What if, while a transaction is open, git opens a subprocess that
wants to make its own transaction? There can only be one writer
transaction open at a time.
2. As you note, long-lived transactions.
Also, the files backend also doesn't do any locking until the last
moment, and it's reasonable to try to be compatible with that.
> * "Avoid aborting a process with an active transaction. The
> transaction becomes "long-lived" as above until a check for stale
> readers is performed or the lockfile is reset, since the process may
> not remove it from the lockfile."
>
> Does it mean we should at atexit() and signal handler to release
> currently active transaction?
Probably a good idea.
> * "Do not use LMDB databases on remote filesystems, even between
> processes on the same host. This breaks flock() on some OSes,
> possibly
> memory map sync, and certainly sync between programs on different
> hosts."
>
> OK can't do anything about it anyway, but maybe it should be
> mentioned
> somewhere in Git documentation.
Sure.
> * "Opening a database can fail if another process is opening or
> closing it at exactly the same time."
>
> We have some retry logic in resolve_ref_1(). Do we need the same for
> lmdb? Not very important though.
>
> [1] http://symas.com/mdb/doc/
wat.
OK, I'll add a retry loop on that. I guess we can just keep retrying
on EACCES or EAGAIN.
next prev parent reply other threads:[~2016-02-24 20:37 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-18 5:17 [PATCH v5 00/27] refs backends David Turner
2016-02-18 5:17 ` [PATCH v5 01/27] refs: Move head_ref{,_submodule} to the common code David Turner
2016-02-18 5:17 ` [PATCH v5 02/27] refs: move for_each_*ref* functions into " David Turner
2016-02-18 5:17 ` [PATCH v5 03/27] refs: add a backend method structure with transaction functions David Turner
2016-02-18 5:17 ` [PATCH v5 04/27] refs: add methods for misc ref operations David Turner
2016-02-18 5:17 ` [PATCH v5 05/27] refs: add method for do_for_each_ref David Turner
2016-02-18 5:17 ` [PATCH v5 06/27] refs: add do_for_each_per_worktree_ref David Turner
2016-02-18 5:17 ` [PATCH v5 07/27] refs: add methods for reflog David Turner
2016-02-18 5:17 ` [PATCH v5 08/27] refs: add method for initial ref transaction commit David Turner
2016-02-18 5:17 ` [PATCH v5 09/27] refs: add method for delete_refs David Turner
2016-02-18 5:17 ` [PATCH v5 10/27] refs: add methods to init refs db David Turner
2016-02-18 5:17 ` [PATCH v5 11/27] refs: add method to rename refs David Turner
2016-02-18 5:17 ` [PATCH v5 12/27] refs: forbid cross-backend ref renames David Turner
2016-02-20 4:30 ` Duy Nguyen
2016-02-24 20:48 ` David Turner
2016-02-18 5:17 ` [PATCH v5 13/27] refs: make lock generic David Turner
2016-02-18 5:17 ` [PATCH v5 14/27] refs: move duplicate check to common code David Turner
2016-02-18 5:17 ` [PATCH v5 15/27] refs: allow log-only updates David Turner
2016-02-18 5:17 ` [PATCH v5 16/27] refs: don't dereference on rename David Turner
2016-02-18 5:17 ` [PATCH v5 17/27] refs: on symref reflog expire, lock symref not referrent David Turner
2016-02-18 5:17 ` [PATCH v5 18/27] refs: resolve symbolic refs first David Turner
2016-02-18 5:17 ` [PATCH v5 19/27] refs: always handle non-normal refs in files backend David Turner
2016-02-18 5:17 ` [PATCH v5 20/27] init: allow alternate ref strorage to be set for new repos David Turner
2016-02-18 5:17 ` [PATCH v5 21/27] refs: check submodules' ref storage config David Turner
2016-02-18 5:17 ` [PATCH v5 22/27] clone: allow ref storage backend to be set for clone David Turner
2016-02-18 5:17 ` [PATCH v5 23/27] svn: learn ref-storage argument David Turner
2016-02-20 23:55 ` Eric Wong
2016-02-23 18:08 ` David Turner
2016-02-18 5:17 ` [PATCH v5 24/27] refs: add register_ref_storage_backends() David Turner
2016-02-18 5:17 ` [PATCH v5 25/27] refs: add LMDB refs storage backend David Turner
2016-02-18 8:50 ` Duy Nguyen
2016-02-18 20:23 ` David Turner
2016-02-18 21:15 ` Junio C Hamano
2016-02-19 2:54 ` Duy Nguyen
2016-02-19 19:10 ` David Turner
2016-02-20 13:14 ` Duy Nguyen
2016-02-24 20:41 ` David Turner
2016-02-20 21:32 ` Junio C Hamano
2016-02-19 22:49 ` David Turner
2016-02-19 23:08 ` Junio C Hamano
2016-02-20 2:58 ` Duy Nguyen
2016-02-24 20:43 ` David Turner
2016-02-25 10:07 ` Duy Nguyen
2016-02-20 8:59 ` Duy Nguyen
2016-02-24 20:37 ` David Turner [this message]
2016-02-25 10:12 ` Duy Nguyen
2016-02-25 20:05 ` [PATCH] refs: document transaction semantics David Turner
2016-02-25 20:10 ` David Turner
2016-02-25 20:34 ` Junio C Hamano
2016-02-25 20:50 ` David Turner
2016-02-18 5:17 ` [PATCH v5 26/27] refs: tests for lmdb backend David Turner
2016-02-18 5:17 ` [PATCH v5 27/27] tests: add ref-storage argument David Turner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1456346242.18017.16.camel@twopensource.com \
--to=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=mhagger@alum.mit.edu \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.