From: David Disseldorp <ddiss@suse.de>
To: Amitay Isaacs <amitay@gmail.com>
Cc: Samba Technical <samba-technical@lists.samba.org>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: [PATCH] Ceph RADOS cluster mutex helper for Samba CTDB
Date: Thu, 8 Dec 2016 19:39:54 +0100
Message-ID: <20161208193954.7ce6b896@suse.de>
In-Reply-To: <CAJ+X7mRh04D+Yvtf0xx3dT6rTa=9KvJagyK=1PJQC=1R+u++7w@mail.gmail.com>
Hi Amitay,
On Wed, 7 Dec 2016 13:32:34 +1100, Amitay Isaacs wrote:
> Hi David,
>
> On Tue, Dec 6, 2016 at 11:18 PM, David Disseldorp <ddiss@suse.de> wrote:
>
> > This time with the patch-set attached...
> >
> > > ctdb/doc/Makefile | 3 +-
> > > ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml | 90 +++++
> > > .../utils/ceph/ctdb_mutex_ceph_rados_helper.c | 334 ++++++++++++++++++
> > > ctdb/utils/ceph/test_ceph_rados_reclock.sh | 151 ++++++++
> > > ctdb/wscript | 19 +
> > > 5 files changed, 596 insertions(+), 1 deletion(-)
> >
>
> In patch 1, why do you need to include any of the CTDB files
> (protocol/protocol.h and common/system.h) and have dependency on
> ctdb-system? I don't see you are using any of the functions defined in
> common/system.h.
>
> Please include the manpage in SAMBA_BINARY() definition. Also include it in
> manpages[] list. It might be better to merge patch 1 and patch 2.
Thanks for the feedback. Please find a new version attached (atop the
etcd changes), attempting to address your points above:
- drop unnecessary includes and ctdb-system dependency
+ add separate talloc and tevent deps
+ use tevent_timeval_current_ofs() instead of timeval_current_ofs()
- conditionally generate the man page
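
For anyone trying the helper by hand: it reports its result as a single
status byte on stdout ("0" holding, "1" contended, "2" timeout, "3" error,
per the CTDB_MUTEX_STATUS_* defines in the patch). Below is a minimal sketch
of a caller that reads and decodes that byte. status_name() and
read_status_byte() are hypothetical names used for illustration only; they
are not part of the patch or of CTDB, and ctdbd's real handling may differ.

```c
#define _POSIX_C_SOURCE 200809L	/* for popen()/pclose() */
#include <stdio.h>

/* Hypothetical: map the helper's status byte to a short description. */
static const char *status_name(int c)
{
	switch (c) {
	case '0': return "holding";
	case '1': return "contended";
	case '2': return "timeout";
	case '3': return "error";
	default:  return "unknown";
	}
}

/*
 * Hypothetical: run cmd via the shell and return the first byte it
 * writes to stdout, or -1 if nothing could be read. pclose() waits for
 * the command to exit, so this only suits invocations that terminate
 * (e.g. the contended or error cases).
 */
static int read_status_byte(const char *cmd)
{
	FILE *fp = popen(cmd, "r");
	int c;

	if (fp == NULL) {
		return -1;
	}
	c = fgetc(fp);
	pclose(fp);
	return (c == EOF) ? -1 : c;
}
```

A helper that obtains the lock keeps running until it receives SIGTERM or
its parent exits, so a real caller would keep the pipe open after reading
the byte rather than waiting in pclose().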
Cheers, David
[-- Attachment #2: ctdb_reclock_ceph_v3.patchset --]
[-- Type: text/x-patch, Size: 25256 bytes --]
From 54c16ac1dfafb06111aeafb2377b06bd5db36994 Mon Sep 17 00:00:00 2001
From: David Disseldorp <ddiss@samba.org>
Date: Thu, 1 Dec 2016 13:33:22 +0100
Subject: [PATCH 1/3] ctdb: cluster mutex helper using Ceph RADOS
ctdb_mutex_ceph_rados_helper implements the cluster mutex helper API
atop Ceph using the librados rados_lock_exclusive()/rados_unlock()
functionality.
Once configured, split brain avoidance during CTDB recovery will be
handled using locks against an object located in a Ceph RADOS pool.
Signed-off-by: David Disseldorp <ddiss@samba.org>
---
ctdb/utils/ceph/ctdb_mutex_ceph_rados_helper.c | 328 +++++++++++++++++++++++++
ctdb/wscript | 19 ++
2 files changed, 347 insertions(+)
create mode 100644 ctdb/utils/ceph/ctdb_mutex_ceph_rados_helper.c
diff --git a/ctdb/utils/ceph/ctdb_mutex_ceph_rados_helper.c b/ctdb/utils/ceph/ctdb_mutex_ceph_rados_helper.c
new file mode 100644
index 0000000..326a0b0
--- /dev/null
+++ b/ctdb/utils/ceph/ctdb_mutex_ceph_rados_helper.c
@@ -0,0 +1,328 @@
+/*
+ CTDB mutex helper using Ceph librados locks
+
+ Copyright (C) David Disseldorp 2016
+
+ Based on ctdb_mutex_fcntl_helper.c, which is:
+ Copyright (C) Martin Schwenke 2015
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, see <http://www.gnu.org/licenses/>.
+*/
+
+#include "replace.h"
+
+#include "tevent.h"
+#include "talloc.h"
+#include "rados/librados.h"
+
+#define CTDB_MUTEX_CEPH_LOCK_NAME "ctdb_reclock_mutex"
+#define CTDB_MUTEX_CEPH_LOCK_COOKIE CTDB_MUTEX_CEPH_LOCK_NAME
+#define CTDB_MUTEX_CEPH_LOCK_DESC "CTDB recovery lock"
+
+#define CTDB_MUTEX_STATUS_HOLDING "0"
+#define CTDB_MUTEX_STATUS_CONTENDED "1"
+#define CTDB_MUTEX_STATUS_TIMEOUT "2"
+#define CTDB_MUTEX_STATUS_ERROR "3"
+
+static char *progname = NULL;
+
+static int ctdb_mutex_rados_ctx_create(const char *ceph_cluster_name,
+ const char *ceph_auth_name,
+ const char *pool_name,
+ rados_t *_ceph_cluster,
+ rados_ioctx_t *_ioctx)
+{
+ rados_t ceph_cluster = NULL;
+ rados_ioctx_t ioctx = NULL;
+ int ret;
+
+ ret = rados_create2(&ceph_cluster, ceph_cluster_name, ceph_auth_name, 0);
+ if (ret < 0) {
+ fprintf(stderr, "%s: failed to initialise Ceph cluster %s as %s"
+ " - (%s)\n", progname, ceph_cluster_name, ceph_auth_name,
+ strerror(-ret));
+ return ret;
+ }
+
+ /* path=NULL tells librados to use default locations */
+ ret = rados_conf_read_file(ceph_cluster, NULL);
+ if (ret < 0) {
+ fprintf(stderr, "%s: failed to parse Ceph cluster config"
+ " - (%s)\n", progname, strerror(-ret));
+ rados_shutdown(ceph_cluster);
+ return ret;
+ }
+
+ ret = rados_connect(ceph_cluster);
+ if (ret < 0) {
+ fprintf(stderr, "%s: failed to connect to Ceph cluster %s as %s"
+ " - (%s)\n", progname, ceph_cluster_name, ceph_auth_name,
+ strerror(-ret));
+ rados_shutdown(ceph_cluster);
+ return ret;
+ }
+
+
+ ret = rados_ioctx_create(ceph_cluster, pool_name, &ioctx);
+ if (ret < 0) {
+ fprintf(stderr, "%s: failed to create Ceph ioctx for pool %s"
+ " - (%s)\n", progname, pool_name, strerror(-ret));
+ rados_shutdown(ceph_cluster);
+ return ret;
+ }
+
+ *_ceph_cluster = ceph_cluster;
+ *_ioctx = ioctx;
+
+ return 0;
+}
+
+static void ctdb_mutex_rados_ctx_destroy(rados_t ceph_cluster,
+ rados_ioctx_t ioctx)
+{
+ rados_ioctx_destroy(ioctx);
+ rados_shutdown(ceph_cluster);
+}
+
+static int ctdb_mutex_rados_lock(rados_ioctx_t ioctx,
+ const char *oid)
+{
+ int ret;
+
+ ret = rados_lock_exclusive(ioctx, oid,
+ CTDB_MUTEX_CEPH_LOCK_NAME,
+ CTDB_MUTEX_CEPH_LOCK_COOKIE,
+ CTDB_MUTEX_CEPH_LOCK_DESC,
+ NULL, /* infinite duration */
+ 0);
+ if ((ret == -EEXIST) || (ret == -EBUSY)) {
+ /* lock contention */
+ return ret;
+ } else if (ret < 0) {
+ /* unexpected failure */
+ fprintf(stderr,
+ "%s: Failed to get lock on RADOS object '%s' - (%s)\n",
+ progname, oid, strerror(-ret));
+ return ret;
+ }
+
+ /* lock obtained */
+ return 0;
+}
+
+static int ctdb_mutex_rados_unlock(rados_ioctx_t ioctx,
+ const char *oid)
+{
+ int ret;
+
+ ret = rados_unlock(ioctx, oid,
+ CTDB_MUTEX_CEPH_LOCK_NAME,
+ CTDB_MUTEX_CEPH_LOCK_COOKIE);
+ if (ret < 0) {
+ fprintf(stderr,
+ "%s: Failed to drop lock on RADOS object '%s' - (%s)\n",
+ progname, oid, strerror(-ret));
+ return ret;
+ }
+
+ return 0;
+}
+
+struct ctdb_mutex_rados_state {
+ bool holding_mutex;
+ const char *ceph_cluster_name;
+ const char *ceph_auth_name;
+ const char *pool_name;
+ const char *object;
+ int ppid;
+ struct tevent_context *ev;
+ struct tevent_signal *sig_ev;
+ struct tevent_timer *timer_ev;
+ rados_t ceph_cluster;
+ rados_ioctx_t ioctx;
+};
+
+static void ctdb_mutex_rados_sigterm_cb(struct tevent_context *ev,
+ struct tevent_signal *se,
+ int signum,
+ int count,
+ void *siginfo,
+ void *private_data)
+{
+ struct ctdb_mutex_rados_state *cmr_state = private_data;
+ int ret;
+
+ if (!cmr_state->holding_mutex) {
+ fprintf(stderr, "Sigterm callback invoked without mutex!\n");
+ ret = -EINVAL;
+ goto err_ctx_cleanup;
+ }
+
+ ret = ctdb_mutex_rados_unlock(cmr_state->ioctx, cmr_state->object);
+err_ctx_cleanup:
+ ctdb_mutex_rados_ctx_destroy(cmr_state->ceph_cluster,
+ cmr_state->ioctx);
+ talloc_free(cmr_state);
+ exit(ret ? 1 : 0);
+}
+
+static void ctdb_mutex_rados_timer_cb(struct tevent_context *ev,
+ struct tevent_timer *te,
+ struct timeval current_time,
+ void *private_data)
+{
+ struct ctdb_mutex_rados_state *cmr_state = private_data;
+ int ret;
+
+ if (!cmr_state->holding_mutex) {
+ fprintf(stderr, "Timer callback invoked without mutex!\n");
+ ret = -EINVAL;
+ goto err_ctx_cleanup;
+ }
+
+ if ((kill(cmr_state->ppid, 0) == 0) || (errno != ESRCH)) {
+ /* parent still around, keep waiting */
+ cmr_state->timer_ev = tevent_add_timer(cmr_state->ev, cmr_state,
+ tevent_timeval_current_ofs(5, 0),
+ ctdb_mutex_rados_timer_cb,
+ cmr_state);
+ if (cmr_state->timer_ev == NULL) {
+ fprintf(stderr, "Failed to create timer event\n");
+ /* rely on signal cb */
+ }
+ return;
+ }
+
+ /* parent ended, drop lock and exit */
+ ret = ctdb_mutex_rados_unlock(cmr_state->ioctx, cmr_state->object);
+err_ctx_cleanup:
+ ctdb_mutex_rados_ctx_destroy(cmr_state->ceph_cluster,
+ cmr_state->ioctx);
+ talloc_free(cmr_state);
+ exit(ret ? 1 : 0);
+}
+
+int main(int argc, char *argv[])
+{
+ int ret;
+ struct ctdb_mutex_rados_state *cmr_state;
+
+ progname = argv[0];
+
+ if (argc != 5) {
+ fprintf(stderr, "Usage: %s <Ceph Cluster> <Ceph user> "
+ "<RADOS pool> <RADOS object>\n",
+ progname);
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ ret = setvbuf(stdout, NULL, _IONBF, 0);
+ if (ret != 0) {
+ fprintf(stderr, "Failed to configure unbuffered stdout I/O\n");
+ }
+
+ cmr_state = talloc_zero(NULL, struct ctdb_mutex_rados_state);
+ if (cmr_state == NULL) {
+ fprintf(stdout, CTDB_MUTEX_STATUS_ERROR);
+ ret = -ENOMEM;
+ goto err_out;
+ }
+
+ cmr_state->ceph_cluster_name = argv[1];
+ cmr_state->ceph_auth_name = argv[2];
+ cmr_state->pool_name = argv[3];
+ cmr_state->object = argv[4];
+
+ cmr_state->ppid = getppid();
+ if (cmr_state->ppid == 1) {
+ /*
+ * The original parent is gone and the process has
+ * been reparented to init. This can happen if the
+ * helper is started just as the parent is killed
+ * during shutdown. The error message doesn't need to
+ * be stellar, since there won't be anything around to
+ * capture and log it...
+ */
+ fprintf(stderr, "%s: PPID == 1\n", progname);
+ ret = -EPIPE;
+ goto err_state_free;
+ }
+
+ cmr_state->ev = tevent_context_init(cmr_state);
+ if (cmr_state->ev == NULL) {
+ fprintf(stderr, "tevent_context_init failed\n");
+ fprintf(stdout, CTDB_MUTEX_STATUS_ERROR);
+ ret = -ENOMEM;
+ goto err_state_free;
+ }
+
+ /* wait for sigterm */
+ cmr_state->sig_ev = tevent_add_signal(cmr_state->ev, cmr_state, SIGTERM, 0,
+ ctdb_mutex_rados_sigterm_cb,
+ cmr_state);
+ if (cmr_state->sig_ev == NULL) {
+ fprintf(stderr, "Failed to create signal event\n");
+ fprintf(stdout, CTDB_MUTEX_STATUS_ERROR);
+ ret = -ENOMEM;
+ goto err_state_free;
+ }
+
+ /* periodically check parent */
+ cmr_state->timer_ev = tevent_add_timer(cmr_state->ev, cmr_state,
+ tevent_timeval_current_ofs(5, 0),
+ ctdb_mutex_rados_timer_cb,
+ cmr_state);
+ if (cmr_state->timer_ev == NULL) {
+ fprintf(stderr, "Failed to create timer event\n");
+ fprintf(stdout, CTDB_MUTEX_STATUS_ERROR);
+ ret = -ENOMEM;
+ goto err_state_free;
+ }
+
+ ret = ctdb_mutex_rados_ctx_create(cmr_state->ceph_cluster_name,
+ cmr_state->ceph_auth_name,
+ cmr_state->pool_name,
+ &cmr_state->ceph_cluster,
+ &cmr_state->ioctx);
+ if (ret < 0) {
+ fprintf(stdout, CTDB_MUTEX_STATUS_ERROR);
+ goto err_state_free;
+ }
+
+ ret = ctdb_mutex_rados_lock(cmr_state->ioctx, cmr_state->object);
+ if ((ret == -EEXIST) || (ret == -EBUSY)) {
+ fprintf(stdout, CTDB_MUTEX_STATUS_CONTENDED);
+ goto err_ctx_cleanup;
+ } else if (ret < 0) {
+ fprintf(stdout, CTDB_MUTEX_STATUS_ERROR);
+ goto err_ctx_cleanup;
+ }
+
+ cmr_state->holding_mutex = true;
+ fprintf(stdout, CTDB_MUTEX_STATUS_HOLDING);
+
+ /* wait for the signal / timer events to do their work */
+ ret = tevent_loop_wait(cmr_state->ev);
+ if (ret < 0) {
+ goto err_ctx_cleanup;
+ }
+err_ctx_cleanup:
+ ctdb_mutex_rados_ctx_destroy(cmr_state->ceph_cluster,
+ cmr_state->ioctx);
+err_state_free:
+ talloc_free(cmr_state);
+err_out:
+ return ret ? 1 : 0;
+}
diff --git a/ctdb/wscript b/ctdb/wscript
index d7b1891..59bd8e2 100644
--- a/ctdb/wscript
+++ b/ctdb/wscript
@@ -79,6 +79,9 @@ def set_options(opt):
opt.add_option('--enable-etcd-reclock',
help=("Enable etcd recovery lock helper (default=no)"),
action="store_true", dest='ctdb_etcd_reclock', default=False)
+ opt.add_option('--enable-ceph-reclock',
+ help=("Enable Ceph CTDB recovery lock helper (default=no)"),
+ action="store_true", dest='ctdb_ceph_reclock', default=False)
opt.add_option('--with-logdir',
help=("Path to log directory"),
@@ -201,6 +204,15 @@ def configure(conf):
Logs.info('Building with etcd support')
conf.env.etcd_reclock = have_etcd_reclock
+ if Options.options.ctdb_ceph_reclock:
+ if (conf.CHECK_HEADERS('rados/librados.h', False, False, 'rados') and
+ conf.CHECK_LIB('rados', shlib=True)):
+ Logs.info('Building with Ceph librados recovery lock support')
+ conf.define('HAVE_LIBRADOS', 1)
+ else:
+ Logs.error("Missing librados for Ceph recovery lock support")
+ sys.exit(1)
+
conf.env.CTDB_BINDIR = os.path.join(conf.env.EXEC_PREFIX, 'bin')
conf.env.CTDB_ETCDIR = os.path.join(conf.env.SYSCONFDIR, 'ctdb')
conf.env.CTDB_VARDIR = os.path.join(conf.env.LOCALSTATEDIR, 'lib/ctdb')
@@ -540,6 +552,13 @@ def build(bld):
bld.INSTALL_FILES('${CTDB_PMDADIR}', 'utils/pmda/README',
destname='README')
+ if bld.env.HAVE_LIBRADOS:
+ bld.SAMBA_BINARY('ctdb_mutex_ceph_rados_helper',
+ source='utils/ceph/ctdb_mutex_ceph_rados_helper.c',
+ deps='talloc tevent rados',
+ includes='include',
+ install_path='${CTDB_HELPER_BINDIR}')
+
sed_expr1 = 's|/usr/local/var/lib/ctdb|%s|g' % (bld.env.CTDB_VARDIR)
sed_expr2 = 's|/usr/local/etc/ctdb|%s|g' % (bld.env.CTDB_ETCDIR)
sed_expr3 = 's|/usr/local/var/log|%s|g' % (bld.env.CTDB_LOGDIR)
--
2.10.2
From 35912b7dca417639615ad5662b5a76ee3e25a6ec Mon Sep 17 00:00:00 2001
From: David Disseldorp <ddiss@samba.org>
Date: Thu, 1 Dec 2016 14:22:45 +0100
Subject: [PATCH 2/3] ctdb/doc: man page for Ceph RADOS cluster mutex helper
Signed-off-by: David Disseldorp <ddiss@samba.org>
---
ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml | 90 +++++++++++++++++++++++++++++
ctdb/wscript | 12 +++-
2 files changed, 100 insertions(+), 2 deletions(-)
create mode 100644 ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml
diff --git a/ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml b/ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml
new file mode 100644
index 0000000..e5dedc7
--- /dev/null
+++ b/ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml
@@ -0,0 +1,90 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry
+ PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+ "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<refentry id="ctdb_mutex_ceph_rados_helper.7">
+
+ <refmeta>
+ <refentrytitle>Ceph RADOS Mutex</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo class="source">ctdb</refmiscinfo>
+ <refmiscinfo class="manual">CTDB - clustered TDB database</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>ctdb_mutex_ceph_rados_helper</refname>
+ <refpurpose>Ceph RADOS cluster mutex helper</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>DESCRIPTION</title>
+ <para>
+ ctdb_mutex_ceph_rados_helper can be used as a recovery lock provider
+ for CTDB. When configured, split brain avoidance during CTDB recovery
+ will be handled using locks against an object located in a Ceph RADOS
+ pool.
+ To enable this functionality, include the following line in your CTDB
+ config file:
+ </para>
+ <screen format="linespecific">
+CTDB_RECOVERY_LOCK="!ctdb_mutex_ceph_rados_helper [Cluster] [User] [Pool] [Object]"
+
+Cluster: Ceph cluster name (e.g. ceph)
+User: Ceph cluster user name (e.g. client.admin)
+Pool: Ceph RADOS pool name
+Object: Ceph RADOS object name
+ </screen>
+ <para>
+ The Ceph cluster <parameter>Cluster</parameter> must be up and running,
+ with a configuration and keyring file for <parameter>User</parameter>
+ located in a librados default search path (e.g. /etc/ceph/).
+ <parameter>Pool</parameter> must already exist.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>SEE ALSO</title>
+ <para>
+ <citerefentry><refentrytitle>ctdb</refentrytitle>
+ <manvolnum>7</manvolnum></citerefentry>,
+
+ <citerefentry><refentrytitle>ctdbd</refentrytitle>
+ <manvolnum>1</manvolnum></citerefentry>,
+
+ <ulink url="http://ctdb.samba.org/"/>
+ </para>
+ </refsect1>
+
+ <refentryinfo>
+ <author>
+ <contrib>
+ This documentation was written by David Disseldorp
+ </contrib>
+ </author>
+
+ <copyright>
+ <year>2016</year>
+ <holder>David Disseldorp</holder>
+ </copyright>
+ <legalnotice>
+ <para>
+ This program is free software; you can redistribute it and/or
+ modify it under the terms of the GNU General Public License as
+ published by the Free Software Foundation; either version 3 of
+ the License, or (at your option) any later version.
+ </para>
+ <para>
+ This program is distributed in the hope that it will be
+ useful, but WITHOUT ANY WARRANTY; without even the implied
+ warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
+ PURPOSE. See the GNU General Public License for more details.
+ </para>
+ <para>
+ You should have received a copy of the GNU General Public
+ License along with this program; if not, see
+ <ulink url="http://www.gnu.org/licenses"/>.
+ </para>
+ </legalnotice>
+ </refentryinfo>
+
+</refentry>
diff --git a/ctdb/wscript b/ctdb/wscript
index 59bd8e2..d0e8ec7 100644
--- a/ctdb/wscript
+++ b/ctdb/wscript
@@ -58,6 +58,10 @@ manpages_etcd = [
'ctdb-etcd.7',
]
+manpages_ceph = [
+ 'ctdb_mutex_ceph_rados_helper.7',
+]
+
def set_options(opt):
opt.PRIVATE_EXTENSION_DEFAULT('ctdb')
@@ -273,7 +277,9 @@ def configure(conf):
conf.env.ctdb_prebuilt_manpages = []
manpages = manpages_binary + manpages_misc
if conf.env.etcd_reclock:
- manpages = manpages + manpages_etcd
+ manpages += manpages_etcd
+ if conf.env.HAVE_LIBRADOS:
+ manpages += manpages_ceph
for m in manpages:
if os.path.exists(os.path.join("doc", m)):
Logs.info(" %s: yes" % (m))
@@ -572,7 +578,9 @@ def build(bld):
manpages_extra = manpages_misc
if bld.env.etcd_reclock:
- manpages_extra = manpages_extra + manpages_etcd
+ manpages_extra += manpages_etcd
+ if bld.env.HAVE_LIBRADOS:
+ manpages_extra += manpages_ceph
for f in manpages_binary + manpages_extra:
x = '%s.xml' % (f)
bld.SAMBA_GENERATOR(x,
--
2.10.2
From dbc411675b338ba755c4521a0d859e2c9d67bf87 Mon Sep 17 00:00:00 2001
From: David Disseldorp <ddiss@samba.org>
Date: Tue, 6 Dec 2016 13:03:27 +0100
Subject: [PATCH 3/3] ctdb: add test script for ctdb_mutex_ceph_rados_helper
This standalone test script performs the following:
- using ctdb_mutex_ceph_rados_helper, take a lock on the Ceph RADOS
object at $CLUSTER/$POOL/$OBJECT using the Ceph keyring for $USER
+ confirm that lock is obtained, via ctdb_mutex_ceph_rados_helper "0"
output
- check RADOS object lock state, using the "rados lock info" command
- attempt to obtain the lock again, using ctdb_mutex_ceph_rados_helper
+ confirm that the lock is not successfully taken
- tell the first locker to drop the lock and exit, via SIGTERM
- once the first locker has exited, attempt to get the lock again
+ confirm that this attempt succeeds
Signed-off-by: David Disseldorp <ddiss@samba.org>
---
ctdb/utils/ceph/test_ceph_rados_reclock.sh | 151 +++++++++++++++++++++++++++++
1 file changed, 151 insertions(+)
create mode 100755 ctdb/utils/ceph/test_ceph_rados_reclock.sh
diff --git a/ctdb/utils/ceph/test_ceph_rados_reclock.sh b/ctdb/utils/ceph/test_ceph_rados_reclock.sh
new file mode 100755
index 0000000..1adacf6
--- /dev/null
+++ b/ctdb/utils/ceph/test_ceph_rados_reclock.sh
@@ -0,0 +1,151 @@
+#!/bin/bash
+# standalone test for ctdb_mutex_ceph_rados_helper
+#
+# Copyright (C) David Disseldorp 2016
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, see <http://www.gnu.org/licenses/>.
+
+# XXX The following parameters may require configuration:
+CLUSTER="ceph" # Name of the Ceph cluster under test
+USER="client.admin" # Ceph user - a keyring must exist
+POOL="rbd" # RADOS pool - must exist
+OBJECT="ctdb_reclock" # RADOS object: target for lock requests
+
+# test procedure:
+# - using ctdb_mutex_ceph_rados_helper, take a lock on the Ceph RADOS object at
+# $CLUSTER/$POOL/$OBJECT using the Ceph keyring for $USER
+# + confirm that lock is obtained, via ctdb_mutex_ceph_rados_helper "0" output
+# - check RADOS object lock state, using the "rados lock info" command
+# - attempt to obtain the lock again, using ctdb_mutex_ceph_rados_helper
+# + confirm that the lock is not successfully taken ("1" output=contention)
+# - tell the first locker to drop the lock and exit, via SIGTERM
+# - once the first locker has exited, attempt to get the lock again
+# + confirm that this attempt succeeds
+
+function _fail() {
+ echo "FAILED: $*"
+ exit 1
+}
+
+# this test requires the Ceph "rados" binary, and "jq" json parser
+which jq > /dev/null || exit 1
+which rados > /dev/null || exit 1
+which ctdb_mutex_ceph_rados_helper || exit 1
+
+TMP_DIR="$(mktemp --directory)" || exit 1
+rados -p "$POOL" rm "$OBJECT"
+
+(ctdb_mutex_ceph_rados_helper "$CLUSTER" "$USER" "$POOL" "$OBJECT" \
+ > ${TMP_DIR}/first) &
+locker_pid=$!
+
+# TODO wait for ctdb_mutex_ceph_rados_helper to write one byte to stdout,
+# indicating lock acquisition success/failure
+sleep 1
+
+first_out=$(cat ${TMP_DIR}/first)
+[ "$first_out" == "0" ] \
+ || _fail "expected lock acquisition (0), but got $first_out"
+
+rados -p "$POOL" lock info "$OBJECT" ctdb_reclock_mutex \
+ > ${TMP_DIR}/lock_state_first
+
+# echo "with lock: `cat ${TMP_DIR}/lock_state_first`"
+
+LOCK_NAME="$(jq -r '.name' ${TMP_DIR}/lock_state_first)"
+[ "$LOCK_NAME" == "ctdb_reclock_mutex" ] \
+ || _fail "unexpected lock name: $LOCK_NAME"
+LOCK_TYPE="$(jq -r '.type' ${TMP_DIR}/lock_state_first)"
+[ "$LOCK_TYPE" == "exclusive" ] \
+ || _fail "unexpected lock type: $LOCK_TYPE"
+
+LOCK_COUNT="$(jq -r '.lockers | length' ${TMP_DIR}/lock_state_first)"
+[ $LOCK_COUNT -eq 1 ] || _fail "expected 1 lock in rados state, got $LOCK_COUNT"
+LOCKER_COOKIE="$(jq -r '.lockers[0].cookie' ${TMP_DIR}/lock_state_first)"
+[ "$LOCKER_COOKIE" == "ctdb_reclock_mutex" ] \
+ || _fail "unexpected locker cookie: $LOCKER_COOKIE"
+LOCKER_DESC="$(jq -r '.lockers[0].description' ${TMP_DIR}/lock_state_first)"
+[ "$LOCKER_DESC" == "CTDB recovery lock" ] \
+ || _fail "unexpected locker description: $LOCKER_DESC"
+
+# second attempt while first is still holding the lock - expect failure
+ctdb_mutex_ceph_rados_helper "$CLUSTER" "$USER" "$POOL" "$OBJECT" \
+ > ${TMP_DIR}/second
+second_out=$(cat ${TMP_DIR}/second)
+[ "$second_out" == "1" ] \
+ || _fail "expected lock contention (1), but got $second_out"
+
+# confirm lock state didn't change
+rados -p "$POOL" lock info "$OBJECT" ctdb_reclock_mutex \
+ > ${TMP_DIR}/lock_state_second
+
+diff ${TMP_DIR}/lock_state_first ${TMP_DIR}/lock_state_second \
+ || _fail "unexpected lock state change"
+
+# tell first locker to drop the lock and terminate
+kill $locker_pid || exit 1
+
+wait $locker_pid &> /dev/null
+
+rados -p "$POOL" lock info "$OBJECT" ctdb_reclock_mutex \
+ > ${TMP_DIR}/lock_state_third
+# echo "without lock: `cat ${TMP_DIR}/lock_state_third`"
+
+LOCK_NAME="$(jq -r '.name' ${TMP_DIR}/lock_state_third)"
+[ "$LOCK_NAME" == "ctdb_reclock_mutex" ] \
+ || _fail "unexpected lock name: $LOCK_NAME"
+LOCK_TYPE="$(jq -r '.type' ${TMP_DIR}/lock_state_third)"
+[ "$LOCK_TYPE" == "exclusive" ] \
+ || _fail "unexpected lock type: $LOCK_TYPE"
+
+LOCK_COUNT="$(jq -r '.lockers | length' ${TMP_DIR}/lock_state_third)"
+[ $LOCK_COUNT -eq 0 ] \
+ || _fail "didn't expect any locks in rados state, got $LOCK_COUNT"
+
+exec >${TMP_DIR}/third -- ctdb_mutex_ceph_rados_helper "$CLUSTER" "$USER" "$POOL" "$OBJECT" &
+locker_pid=$!
+
+sleep 1
+
+rados -p "$POOL" lock info "$OBJECT" ctdb_reclock_mutex \
+ > ${TMP_DIR}/lock_state_fourth
+# echo "with lock again: `cat ${TMP_DIR}/lock_state_fourth`"
+
+LOCK_NAME="$(jq -r '.name' ${TMP_DIR}/lock_state_fourth)"
+[ "$LOCK_NAME" == "ctdb_reclock_mutex" ] \
+ || _fail "unexpected lock name: $LOCK_NAME"
+LOCK_TYPE="$(jq -r '.type' ${TMP_DIR}/lock_state_fourth)"
+[ "$LOCK_TYPE" == "exclusive" ] \
+ || _fail "unexpected lock type: $LOCK_TYPE"
+
+LOCK_COUNT="$(jq -r '.lockers | length' ${TMP_DIR}/lock_state_fourth)"
+[ $LOCK_COUNT -eq 1 ] || _fail "expected 1 lock in rados state, got $LOCK_COUNT"
+LOCKER_COOKIE="$(jq -r '.lockers[0].cookie' ${TMP_DIR}/lock_state_fourth)"
+[ "$LOCKER_COOKIE" == "ctdb_reclock_mutex" ] \
+ || _fail "unexpected locker cookie: $LOCKER_COOKIE"
+LOCKER_DESC="$(jq -r '.lockers[0].description' ${TMP_DIR}/lock_state_fourth)"
+[ "$LOCKER_DESC" == "CTDB recovery lock" ] \
+ || _fail "unexpected locker description: $LOCKER_DESC"
+
+kill $locker_pid || exit 1
+wait $locker_pid &> /dev/null
+
+third_out=$(cat ${TMP_DIR}/third)
+[ "$third_out" == "0" ] \
+ || _fail "expected lock acquisition (0), but got $third_out"
+
+rm ${TMP_DIR}/*
+rmdir $TMP_DIR
+
+echo "$0: all tests passed"
--
2.10.2