* [PATCH / RFC] nfs-utils: High Availability NFS
[not found] ` <16677.22269.988036.787320@cse.unsw.edu.au>
@ 2004-08-26 17:21 ` Paul Clements
2004-08-26 18:43 ` Paul Clements
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Paul Clements @ 2004-08-26 17:21 UTC (permalink / raw)
To: Neil Brown, nfs
[-- Attachment #1: Type: text/plain, Size: 2116 bytes --]
Hi all,
I've recently coded up some modifications to nfs-utils that will
allow the tools to be used in a High Availability NFS environment
(i.e., capable of switching and failing over NFS exports from
one server to another, while still preserving client connections
and file locks). The modifications are in the form of callout
hooks in statd and mountd. Any HA NFS implementation may take
advantage of these hooks since the actual content of the callout
programs will not be dictated by nfs-utils, but rather will be
left up to the HA cluster software implementor. Currently, the
callout hooks in statd and mountd look like:
statd and mountd -- new command line option:
-------------------------------------------
-H command line option to specify an HA callout program (without -H
no callouts are made) -- the callout program can be any executable
or script
statd events that trigger a callout:
-----------------------------------
add client to notify list (SM_MON) - triggers "add-client" callout
delete client from notify list (SM_UNMON and SM_UNMONALL) - triggers
"del-client" callout
statd events that trigger re-read of the notify list:
----------------------------------------------------
SIGUSR1 sent to statd - triggers re-read of notify list from disk
(notify_hosts()) -- this will be done when one server takes over
(e.g., on failover or switchover) an NFS export from another server
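Re-reading the list from disk means file I/O, which is generally not safe to do from inside a signal handler. A more conservative pattern is for the handler to record the request and for the daemon's main loop to act on it. The following is a minimal sketch of that pattern, assuming a single-threaded daemon; reread_requested, install_reread_handler() and take_reread_request() are hypothetical names, not nfs-utils functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t reread_requested = 0;

/* Async-signal-safe: only stores to a sig_atomic_t flag. */
static void on_sigusr1(int sig)
{
	(void)sig;
	reread_requested = 1;
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on error. */
int install_reread_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigusr1;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;	/* restart interrupted system calls where possible */
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 (and clears the flag) if a re-read was requested, else 0. */
int take_reread_request(void)
{
	if (!reread_requested)
		return 0;
	reread_requested = 0;
	return 1;
}
```

The main loop would then call notify_hosts() whenever take_reread_request() returns 1.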
mountd events that trigger a callout:
------------------------------------
client mount request - triggers "mount" callout
client unmount request - triggers "unmount" callout
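These callouts will simply result in the HA callout program being
called with the following command line arguments:
<ha_callout_prog> [mount|unmount|add-client|del-client] <client> <server> <count>
(for the mountd callouts the third argument is the exported path rather
than a server name; <count> is passed as eight hex digits)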
Note that the mountd hook is not needed when running on the 2.6
kernel. The 2.6 kernel has a mechanism (which is more reliable
than using the rmtab file) that it uses to authenticate
unknown clients.
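As a rough illustration of the argument convention above, a callout program written in C might parse its command line along these lines. This is only a sketch: ha_event, ha_request and ha_parse_request() are hypothetical names, and the count is read as hexadecimal because the posted ha_callout() formats it with %.8x.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum ha_event { HA_MOUNT, HA_UNMOUNT, HA_ADD_CLIENT, HA_DEL_CLIENT, HA_UNKNOWN };

/* Hypothetical parsed form of one callout invocation. */
struct ha_request {
	enum ha_event event;
	const char *client;	/* client host name */
	const char *target;	/* server name (statd) or exported path (mountd) */
	unsigned long count;	/* mount count; 0 for the statd events */
};

enum ha_event ha_event_from_name(const char *name)
{
	if (strcmp(name, "mount") == 0)
		return HA_MOUNT;
	/* "umount" is accepted too, as a defensive assumption about callers */
	if (strcmp(name, "unmount") == 0 || strcmp(name, "umount") == 0)
		return HA_UNMOUNT;
	if (strcmp(name, "add-client") == 0)
		return HA_ADD_CLIENT;
	if (strcmp(name, "del-client") == 0)
		return HA_DEL_CLIENT;
	return HA_UNKNOWN;
}

/* Parse argv as passed to the callout program; 0 on success, -1 if malformed. */
int ha_parse_request(int argc, char **argv, struct ha_request *req)
{
	char *end;

	if (argc != 5)
		return -1;
	req->event = ha_event_from_name(argv[1]);
	if (req->event == HA_UNKNOWN)
		return -1;
	req->client = argv[2];
	req->target = argv[3];
	errno = 0;
	req->count = strtoul(argv[4], &end, 16);	/* count arrives as hex digits */
	if (errno != 0 || end == argv[4] || *end != '\0')
		return -1;
	return 0;
}
```

A real callout program would call ha_parse_request() from main() and then act on the event, for example by recording the client on shared storage.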
The patch (against nfs-utils-1.0.6) is pretty unobtrusive, adding a
little less than 100 lines.
Comments? Questions?
Thanks,
Paul
[-- Attachment #2: nfs_utils_ha_callout.diff --]
[-- Type: text/plain, Size: 9807 bytes --]
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h nfs-utils-1.0.6/support/include/ha-callout.h
--- nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h 1969-12-31 19:00:00.000000000 -0500
+++ nfs-utils-1.0.6/support/include/ha-callout.h 2004-08-26 10:37:53.000000000 -0400
@@ -0,0 +1,38 @@
+/*
+ * support/include/ha-callout.h
+ *
+ * High Availability NFS Callout support routines
+ *
+ * Copyright (c) 2004, Paul Clements, SteelEye Technology
+ *
+ * In order to implement HA NFS, we need several callouts at key
+ * points in statd and mountd. These callouts all come to ha_callout(),
+ * which, in turn, calls out to an ha-callout script (not part of nfs-utils;
+ * defined by -H argument to rpc.statd and rpc.mountd).
+ */
+#ifndef HA_CALLOUT_H
+#define HA_CALLOUT_H
+
+extern char *ha_callout_prog;
+
+static inline void
+ha_callout(char *event, char *arg1, char *arg2, int arg3)
+{
+ char buf[PATH_MAX]; /* should be plenty */
+ int ret;
+
+ if (!ha_callout_prog) /* HA callout is not enabled */
+ return;
+
+ sprintf(buf, "%s \"%s\" \"%s\" \"%s\" %.8x", ha_callout_prog,
+ event, arg1, arg2, arg3);
+ ret = system(buf);
+
+#ifdef dprintf
+ dprintf(N_DEBUG, "system call %s returned %d\n", buf, WEXITSTATUS(ret));
+#else
+ xlog(D_GENERAL, "system call %s returned %d\n", buf, WEXITSTATUS(ret));
+#endif
+}
+
+#endif
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c nfs-utils-1.0.6/utils/mountd/mountd.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c 2004-08-17 11:01:14.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/mountd.c 2004-08-26 10:40:25.000000000 -0400
@@ -36,6 +36,11 @@ static struct nfs_fh_len *get_rootfh(str
int new_cache = 0;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * send mount or unmount requests -- the callout is not needed for 2.6 kernel */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -48,6 +53,7 @@ static struct option longopts[] =
{ "version", 0, 0, 'v' },
{ "port", 1, 0, 'p' },
{ "no-tcp", 0, 0, 'n' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -444,7 +450,7 @@ main(int argc, char **argv)
/* Parse the command line options and arguments. */
opterr = 0;
- while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hN:V:v", longopts, NULL)) != EOF)
+ while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hH:N:V:v", longopts, NULL)) != EOF)
switch (c) {
case 'o':
descriptors = atoi(optarg);
@@ -463,6 +469,9 @@ main(int argc, char **argv)
case 'f':
export_file = optarg;
break;
+ case 'H': /* PRC: specify a high-availability callout program */
+ ha_callout_prog = optarg;
+ break;
case 'h':
usage(argv [0], 0);
break;
@@ -596,6 +605,7 @@ usage(const char *prog, int n)
"Usage: %s [-F|--foreground] [-h|--help] [-v|--version] [-d kind|--debug kind]\n"
" [-o num|--descriptors num] [-f exports-file|--exports-file=file]\n"
" [-p|--port port] [-V version|--nfs-version version]\n"
-" [-N version|--no-nfs-version version] [-n|--no-tcp]\n", prog);
+" [-N version|--no-nfs-version version] [-n|--no-tcp]\n"
+" [-H ha-callout-prog]\n", prog);
exit(n);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c nfs-utils-1.0.6/utils/mountd/rmtab.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c 2003-07-31 01:19:26.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/rmtab.c 2004-08-25 15:21:53.000000000 -0400
@@ -19,6 +19,7 @@
#include "exportfs.h"
#include "xio.h"
#include "mountd.h"
+#include "ha-callout.h"
#include <limits.h> /* PATH_MAX */
@@ -61,6 +62,8 @@ mountlist_add(char *host, const char *pa
host) == 0
&& strcmp(rep->r_path, path) == 0) {
rep->r_count++;
+ /* PRC: do the HA callout: */
+ ha_callout("mount", rep->r_client, rep->r_path, rep->r_count);
putrmtabent(rep, &pos);
endrmtabent();
xfunlock(lockid);
@@ -75,6 +78,8 @@ mountlist_add(char *host, const char *pa
xe.r_path [sizeof (xe.r_path) - 1] = '\0';
xe.r_count = 1;
if (setrmtabent("a")) {
+ /* PRC: do the HA callout: */
+ ha_callout("mount", xe.r_client, xe.r_path, xe.r_count);
putrmtabent(&xe, NULL);
endrmtabent();
}
@@ -103,8 +108,11 @@ mountlist_del(char *hname, const char *p
while ((rep = getrmtabent(1, NULL)) != NULL) {
match = !strcmp (rep->r_client, hname)
&& !strcmp(rep->r_path, path);
- if (match)
+ if (match) {
rep->r_count--;
+ /* PRC: do the HA callout: */
+ ha_callout("unmount", rep->r_client, rep->r_path, rep->r_count);
+ }
if (!match || rep->r_count)
fputrmtabent(fp, rep, NULL);
}
Binary files nfs-utils-1.0.6-PRISTINE/utils/nfsd/nfsd and nfs-utils-1.0.6/utils/nfsd/nfsd differ
Binary files nfs-utils-1.0.6-PRISTINE/utils/nfsstat/nfsstat and nfs-utils-1.0.6/utils/nfsstat/nfsstat differ
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c nfs-utils-1.0.6/utils/statd/monitor.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c 2004-08-17 11:01:14.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/monitor.c 2004-08-26 09:40:10.000000000 -0400
@@ -19,6 +19,7 @@
#include "misc.h"
#include "statd.h"
#include "notlist.h"
+#include "ha-callout.h"
notify_list * rtnl = NULL; /* Run-time notify list. */
@@ -177,6 +178,8 @@ sm_mon_1_svc(struct mon *argp, struct sv
goto failure;
}
free(path);
+ /* PRC: do the HA callout: */
+ ha_callout("add-client", mon_name, my_name, 0);
nlist_insert(&rtnl, clnt);
close(fd);
@@ -232,6 +235,10 @@ sm_unmon_1_svc(struct mon_id *argp, stru
/* Match! */
dprintf(N_DEBUG, "UNMONITORING %s for %s",
mon_name, my_name);
+
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, my_name, 0);
+
nlist_free(&rtnl, clnt);
/* Do not unlink the monitor file. There are
* cases when a lock is cleared locally on the
@@ -287,6 +294,8 @@ sm_unmon_all_1_svc(struct my_id *argp, s
sizeof (mon_name) - 1);
mon_name[sizeof (mon_name) - 1] = '\0';
temp = NL_NEXT(clnt);
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, argp->my_name, 0);
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
++count;
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c nfs-utils-1.0.6/utils/statd/rmtcall.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c 2003-09-12 01:41:38.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/rmtcall.c 2004-08-25 14:54:00.000000000 -0400
@@ -38,6 +38,7 @@
#include "statd.h"
#include "notlist.h"
#include "log.h"
+#include "ha-callout.h"
#define MAXMSGSIZE (2048 / sizeof(unsigned int))
@@ -414,6 +415,8 @@ process_notify_list(void)
note(N_ERROR,
"Can't notify %s, giving up.",
NL_MON_NAME(entry));
+ /* PRC: do the HA callout */
+ ha_callout("del-client", NL_MY_NAME(entry), NL_MON_NAME(entry), 0);
xunlink(SM_BAK_DIR, NL_MON_NAME(entry), 0);
nlist_free(&notify, entry);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c nfs-utils-1.0.6/utils/statd/statd.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c 2003-09-12 02:24:29.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/statd.c 2004-08-25 13:29:08.000000000 -0400
@@ -48,6 +48,11 @@ int run_mode = 0; /* foreground logging
char *name_p = NULL;
char *version_p = NULL;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * are added to or deleted from the notify list */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -59,6 +64,7 @@ static struct option longopts[] =
{ "name", 1, 0, 'n' },
{ "state-directory-path", 1, 0, 'P' },
{ "notify-mode", 0, 0, 'N' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -102,6 +108,13 @@ killer (int sig)
exit (0);
}
+static void
+sigusr (int sig)
+{
+ dprintf (N_DEBUG, "Caught signal %d, re-reading notify list.", sig);
+ notify_hosts();
+}
+
/*
* Startup information.
*/
@@ -148,6 +161,7 @@ usage()
fprintf(stderr," -n, --name Specify a local hostname.\n");
fprintf(stderr," -P State directory path.\n");
fprintf(stderr," -N Run in notify only mode.\n");
+ fprintf(stderr," -H Specify a high-availability callout program.\n");
}
static const char *pidfile = "/var/run/rpc.statd.pid";
@@ -236,7 +250,7 @@ int main (int argc, char **argv)
MY_NAME = NULL;
/* Process command line switches */
- while ((arg = getopt_long(argc, argv, "h?vVFNdn:p:o:P:", longopts, NULL)) != EOF) {
+ while ((arg = getopt_long(argc, argv, "h?vVFNH:dn:p:o:P:", longopts, NULL)) != EOF) {
switch (arg) {
case 'V': /* Version */
case 'v':
@@ -302,6 +316,13 @@ int main (int argc, char **argv)
sprintf(SM_STAT_PATH, "%s/state", DIR_BASE );
}
break;
+ case 'H': /* PRC: specify the ha-callout program */
+ if ((ha_callout_prog = xstrdup(optarg)) == NULL) {
+ fprintf(stderr, "%s: xstrdup(%s) failed!\n",
+ argv[0], optarg);
+ exit(1);
+ }
+ break;
case '?': /* heeeeeelllllllpppp? heh */
case 'h':
usage();
@@ -397,6 +418,8 @@ int main (int argc, char **argv)
signal (SIGHUP, killer);
signal (SIGINT, killer);
signal (SIGTERM, killer);
+ /* PRC: trap SIGUSR1 to re-read notify list from disk */
+ signal(SIGUSR1, sigusr);
/* WARNING: the following works on Linux and SysV, but not BSD! */
signal(SIGCHLD, SIG_IGN);
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-26 17:21 ` [PATCH / RFC] nfs-utils: High Availability NFS Paul Clements
@ 2004-08-26 18:43 ` Paul Clements
2004-08-27 7:30 ` Olaf Kirch
2004-08-31 6:51 ` Neil Brown
2 siblings, 0 replies; 18+ messages in thread
From: Paul Clements @ 2004-08-26 18:43 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
[-- Attachment #1: Type: text/plain, Size: 921 bytes --]
Of course when I reviewed my posting, I noticed a small error in the patch:
> diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c nfs-utils-1.0.6/utils/statd/rmtcall.c
> --- nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c 2003-09-12 01:41:38.000000000 -0400
> +++ nfs-utils-1.0.6/utils/statd/rmtcall.c 2004-08-25 14:54:00.000000000 -0400
> @@ -38,6 +38,7 @@
> #include "statd.h"
> #include "notlist.h"
> #include "log.h"
> +#include "ha-callout.h"
>
> #define MAXMSGSIZE (2048 / sizeof(unsigned int))
>
> @@ -414,6 +415,8 @@ process_notify_list(void)
> note(N_ERROR,
> "Can't notify %s, giving up.",
> NL_MON_NAME(entry));
> + /* PRC: do the HA callout */
> + ha_callout("del-client", NL_MY_NAME(entry), NL_MON_NAME(entry), 0);
The second and third arguments need to be reversed. Corrected patch
attached.
Sorry about that...
--
Paul
[-- Attachment #2: nfs_utils_ha_callout-2.diff --]
[-- Type: text/plain, Size: 9537 bytes --]
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h nfs-utils-1.0.6/support/include/ha-callout.h
--- nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h 1969-12-31 19:00:00.000000000 -0500
+++ nfs-utils-1.0.6/support/include/ha-callout.h 2004-08-26 11:32:18.000000000 -0400
@@ -0,0 +1,38 @@
+/*
+ * support/include/ha-callout.h
+ *
+ * High Availability NFS Callout support routines
+ *
+ * Copyright (c) 2004, Paul Clements, SteelEye Technology
+ *
+ * In order to implement HA NFS, we need several callouts at key
+ * points in statd and mountd. These callouts all come to ha_callout(),
+ * which, in turn, calls out to an ha-callout script (not part of nfs-utils;
+ * defined by -H argument to rpc.statd and rpc.mountd).
+ */
+#ifndef HA_CALLOUT_H
+#define HA_CALLOUT_H
+
+extern char *ha_callout_prog;
+
+static inline void
+ha_callout(char *event, char *arg1, char *arg2, int arg3)
+{
+ char buf[PATH_MAX]; /* should be plenty */
+ int ret;
+
+ if (!ha_callout_prog) /* HA callout is not enabled */
+ return;
+
+ sprintf(buf, "%s \"%s\" \"%s\" \"%s\" %.8x", ha_callout_prog,
+ event, arg1, arg2, arg3);
+ ret = system(buf);
+
+#ifdef dprintf
+ dprintf(N_DEBUG, "system call %s returned %d\n", buf, WEXITSTATUS(ret));
+#else
+ xlog(D_GENERAL, "system call %s returned %d\n", buf, WEXITSTATUS(ret));
+#endif
+}
+
+#endif
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c nfs-utils-1.0.6/utils/mountd/mountd.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c 2003-09-12 18:14:16.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/mountd.c 2004-08-26 11:32:18.000000000 -0400
@@ -36,6 +36,11 @@ static struct nfs_fh_len *get_rootfh(str
int new_cache = 0;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * send mount or unmount requests -- the callout is not needed for 2.6 kernel */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -48,6 +53,7 @@ static struct option longopts[] =
{ "version", 0, 0, 'v' },
{ "port", 1, 0, 'p' },
{ "no-tcp", 0, 0, 'n' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -444,7 +450,7 @@ main(int argc, char **argv)
/* Parse the command line options and arguments. */
opterr = 0;
- while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hN:V:v", longopts, NULL)) != EOF)
+ while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hH:N:V:v", longopts, NULL)) != EOF)
switch (c) {
case 'o':
descriptors = atoi(optarg);
@@ -463,6 +469,9 @@ main(int argc, char **argv)
case 'f':
export_file = optarg;
break;
+ case 'H': /* PRC: specify a high-availability callout program */
+ ha_callout_prog = optarg;
+ break;
case 'h':
usage(argv [0], 0);
break;
@@ -596,6 +605,7 @@ usage(const char *prog, int n)
"Usage: %s [-F|--foreground] [-h|--help] [-v|--version] [-d kind|--debug kind]\n"
" [-o num|--descriptors num] [-f exports-file|--exports-file=file]\n"
" [-p|--port port] [-V version|--nfs-version version]\n"
-" [-N version|--no-nfs-version version] [-n|--no-tcp]\n", prog);
+" [-N version|--no-nfs-version version] [-n|--no-tcp]\n"
+" [-H ha-callout-prog]\n", prog);
exit(n);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c nfs-utils-1.0.6/utils/mountd/rmtab.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c 2003-07-31 01:19:26.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/rmtab.c 2004-08-26 11:32:18.000000000 -0400
@@ -19,6 +19,7 @@
#include "exportfs.h"
#include "xio.h"
#include "mountd.h"
+#include "ha-callout.h"
#include <limits.h> /* PATH_MAX */
@@ -61,6 +62,8 @@ mountlist_add(char *host, const char *pa
host) == 0
&& strcmp(rep->r_path, path) == 0) {
rep->r_count++;
+ /* PRC: do the HA callout: */
+ ha_callout("mount", rep->r_client, rep->r_path, rep->r_count);
putrmtabent(rep, &pos);
endrmtabent();
xfunlock(lockid);
@@ -75,6 +78,8 @@ mountlist_add(char *host, const char *pa
xe.r_path [sizeof (xe.r_path) - 1] = '\0';
xe.r_count = 1;
if (setrmtabent("a")) {
+ /* PRC: do the HA callout: */
+ ha_callout("mount", xe.r_client, xe.r_path, xe.r_count);
putrmtabent(&xe, NULL);
endrmtabent();
}
@@ -103,8 +108,11 @@ mountlist_del(char *hname, const char *p
while ((rep = getrmtabent(1, NULL)) != NULL) {
match = !strcmp (rep->r_client, hname)
&& !strcmp(rep->r_path, path);
- if (match)
+ if (match) {
rep->r_count--;
+ /* PRC: do the HA callout: */
+ ha_callout("unmount", rep->r_client, rep->r_path, rep->r_count);
+ }
if (!match || rep->r_count)
fputrmtabent(fp, rep, NULL);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c nfs-utils-1.0.6/utils/statd/monitor.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c 2003-09-12 01:41:35.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/monitor.c 2004-08-26 11:32:18.000000000 -0400
@@ -19,6 +19,7 @@
#include "misc.h"
#include "statd.h"
#include "notlist.h"
+#include "ha-callout.h"
notify_list * rtnl = NULL; /* Run-time notify list. */
@@ -177,6 +178,8 @@ sm_mon_1_svc(struct mon *argp, struct sv
goto failure;
}
free(path);
+ /* PRC: do the HA callout: */
+ ha_callout("add-client", mon_name, my_name, 0);
nlist_insert(&rtnl, clnt);
close(fd);
@@ -232,6 +235,10 @@ sm_unmon_1_svc(struct mon_id *argp, stru
/* Match! */
dprintf(N_DEBUG, "UNMONITORING %s for %s",
mon_name, my_name);
+
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, my_name, 0);
+
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
@@ -276,6 +283,8 @@ sm_unmon_all_1_svc(struct my_id *argp, s
sizeof (mon_name) - 1);
mon_name[sizeof (mon_name) - 1] = '\0';
temp = NL_NEXT(clnt);
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, argp->my_name, 0);
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
++count;
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c nfs-utils-1.0.6/utils/statd/rmtcall.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c 2003-09-12 01:41:38.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/rmtcall.c 2004-08-26 13:11:41.000000000 -0400
@@ -38,6 +38,7 @@
#include "statd.h"
#include "notlist.h"
#include "log.h"
+#include "ha-callout.h"
#define MAXMSGSIZE (2048 / sizeof(unsigned int))
@@ -414,6 +415,8 @@ process_notify_list(void)
note(N_ERROR,
"Can't notify %s, giving up.",
NL_MON_NAME(entry));
+ /* PRC: do the HA callout */
+ ha_callout("del-client", NL_MON_NAME(entry), NL_MY_NAME(entry), 0);
xunlink(SM_BAK_DIR, NL_MON_NAME(entry), 0);
nlist_free(&notify, entry);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c nfs-utils-1.0.6/utils/statd/statd.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c 2003-09-12 02:24:29.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/statd.c 2004-08-26 11:32:18.000000000 -0400
@@ -48,6 +48,11 @@ int run_mode = 0; /* foreground logging
char *name_p = NULL;
char *version_p = NULL;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * are added to or deleted from the notify list */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -59,6 +64,7 @@ static struct option longopts[] =
{ "name", 1, 0, 'n' },
{ "state-directory-path", 1, 0, 'P' },
{ "notify-mode", 0, 0, 'N' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -102,6 +108,13 @@ killer (int sig)
exit (0);
}
+static void
+sigusr (int sig)
+{
+ dprintf (N_DEBUG, "Caught signal %d, re-reading notify list.", sig);
+ notify_hosts();
+}
+
/*
* Startup information.
*/
@@ -148,6 +161,7 @@ usage()
fprintf(stderr," -n, --name Specify a local hostname.\n");
fprintf(stderr," -P State directory path.\n");
fprintf(stderr," -N Run in notify only mode.\n");
+ fprintf(stderr," -H Specify a high-availability callout program.\n");
}
static const char *pidfile = "/var/run/rpc.statd.pid";
@@ -236,7 +250,7 @@ int main (int argc, char **argv)
MY_NAME = NULL;
/* Process command line switches */
- while ((arg = getopt_long(argc, argv, "h?vVFNdn:p:o:P:", longopts, NULL)) != EOF) {
+ while ((arg = getopt_long(argc, argv, "h?vVFNH:dn:p:o:P:", longopts, NULL)) != EOF) {
switch (arg) {
case 'V': /* Version */
case 'v':
@@ -302,6 +316,13 @@ int main (int argc, char **argv)
sprintf(SM_STAT_PATH, "%s/state", DIR_BASE );
}
break;
+ case 'H': /* PRC: specify the ha-callout program */
+ if ((ha_callout_prog = xstrdup(optarg)) == NULL) {
+ fprintf(stderr, "%s: xstrdup(%s) failed!\n",
+ argv[0], optarg);
+ exit(1);
+ }
+ break;
case '?': /* heeeeeelllllllpppp? heh */
case 'h':
usage();
@@ -397,6 +418,8 @@ int main (int argc, char **argv)
signal (SIGHUP, killer);
signal (SIGINT, killer);
signal (SIGTERM, killer);
+ /* PRC: trap SIGUSR1 to re-read notify list from disk */
+ signal(SIGUSR1, sigusr);
/* WARNING: the following works on Linux and SysV, but not BSD! */
signal(SIGCHLD, SIG_IGN);
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-26 17:21 ` [PATCH / RFC] nfs-utils: High Availability NFS Paul Clements
2004-08-26 18:43 ` Paul Clements
@ 2004-08-27 7:30 ` Olaf Kirch
2004-08-27 13:13 ` Paul Clements
2004-08-31 6:51 ` Neil Brown
2 siblings, 1 reply; 18+ messages in thread
From: Olaf Kirch @ 2004-08-27 7:30 UTC (permalink / raw)
To: Paul Clements; +Cc: Neil Brown, nfs
Hi,
I'm working on putting statd into the kernel, so I'm necessarily not
too enthusiastic about adding external callouts to statd :)
On Thu, Aug 26, 2004 at 01:21:20PM -0400, Paul Clements wrote:
> statd events that trigger a callout:
> -----------------------------------
> add client to notify list (SM_MON) - triggers "add-client" callout
>
> delete client from notify list (SM_UNMON and SM_UNMONALL) - triggers
> "del-client" callout
I think the same could be achieved by attaching a dnotify handler to
the directory, and it would be compatible with a kernel level statd.
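For readers unfamiliar with dnotify: it is the Linux fcntl(F_NOTIFY) interface, which signals a process after the contents of a directory change. The following is a minimal sketch of such a handler, assuming the directory in question is statd's monitor directory; watch_sm_dir() and take_dir_change() are hypothetical names, and the fallback constants are only there for builds where the feature macro does not take effect.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for F_NOTIFY and the DN_* flags */
#endif
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Fallbacks with the Linux values, in case the feature macro was ignored. */
#ifndef F_NOTIFY
#define F_NOTIFY	1026
#define DN_CREATE	0x00000004
#define DN_DELETE	0x00000008
#define DN_MULTISHOT	0x80000000
#endif

static volatile sig_atomic_t sm_dir_changed = 0;

static void on_dir_change(int sig)
{
	(void)sig;
	sm_dir_changed = 1;	/* the main loop rescans the directory later */
}

/* Returns 1 (and clears the flag) if a change was signalled, else 0. */
int take_dir_change(void)
{
	if (!sm_dir_changed)
		return 0;
	sm_dir_changed = 0;
	return 1;
}

/*
 * Ask the kernel to raise SIGIO whenever an entry is created in or
 * removed from `path`. Returns the directory fd (keep it open while
 * watching), or -1 on error.
 */
int watch_sm_dir(const char *path)
{
	struct sigaction sa;
	int fd;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_dir_change;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, NULL) < 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fcntl(fd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

The signal only says that something changed; the watcher still has to rescan the directory to find out which client was added or removed.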
Cheers
Olaf
--
Olaf Kirch | The Hardware Gods hate me.
okir@suse.de |
---------------+
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-27 7:30 ` Olaf Kirch
@ 2004-08-27 13:13 ` Paul Clements
2004-08-30 8:25 ` Olaf Kirch
0 siblings, 1 reply; 18+ messages in thread
From: Paul Clements @ 2004-08-27 13:13 UTC (permalink / raw)
To: Olaf Kirch; +Cc: Neil Brown, nfs
Hi,
Olaf Kirch wrote:
> I'm working on putting statd into the kernel, so I'm necessarily not
> too enthusiastic about adding external callouts to statd :)
With so many things moving out of the kernel to userland, it's
surprising to see statd moving into the kernel. And it certainly does
make modifications, such as this, much more difficult.
> On Thu, Aug 26, 2004 at 01:21:20PM -0400, Paul Clements wrote:
>
>>statd events that trigger a callout:
>>-----------------------------------
>>add client to notify list (SM_MON) - triggers "add-client" callout
>>
>>delete client from notify list (SM_UNMON and SM_UNMONALL) - triggers
>>"del-client" callout
>
>
> I think the same could be achieved by attaching a dnotify handler to
> the directory, and it would be compatible with a kernel level statd.
But the callout happens before statd replies to the client, so there's
no chance that we lose any clients if there's a failure. If we use
dnotify then we only find out there's been a change after the change has
occurred, which means the client may have already gotten a reply saying
he's been added to the notify list. If we get a failure at this point,
we've lost a client, and upon failover the client's locks are no longer
valid.
--
Paul
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-27 13:13 ` Paul Clements
@ 2004-08-30 8:25 ` Olaf Kirch
2004-08-30 10:19 ` Greg Banks
2004-09-03 7:28 ` Kedar Sovani
0 siblings, 2 replies; 18+ messages in thread
From: Olaf Kirch @ 2004-08-30 8:25 UTC (permalink / raw)
To: Paul Clements; +Cc: Neil Brown, nfs
On Fri, Aug 27, 2004 at 09:13:06AM -0400, Paul Clements wrote:
> With so many things moving out of the kernel to userland, it's
> surprising to see statd moving into the kernel. And it certainly does
> make modifications, such as this, much more difficult.
Well, the issue with statd is that it exists _only_ to make lockd happy.
But what it really does is make lockd awfully complicated because we
have to do upcalls all the time.
So my rationale for moving statd into the kernel is to actually eliminate
a lot of code and have a minimal set of functionality in there that does
exactly what lockd needs, no more. In particular, no more SM_MON and
SM_UNMON support that allows untrusted hosts to do sneaky stuff - instead,
lockd writes the /var/lib/nfs/sm files directly. The only thing my kernel
statd supports is SM_NOTIFY, and even that is delivered directly to the
locking code.
> But the callout happens before statd replies to the client, so there's
> no chance that we lose any clients if there's a failure. If we use
> dnotify then we only find out there's been a change after the change has
> occurred, which means the client may have already gotten a reply saying
> he's been added to the notify list. If we get a failure at this point,
> we've lost a client, and upon failover the client's locks are no longer
> valid.
Doesn't a HA NFS deployment require a shared disk anyway? Why not just
move /var/lib/nfs to this file system so it gets shared the same way you
share the rest of your data?
Olaf
--
Olaf Kirch | The Hardware Gods hate me.
okir@suse.de |
---------------+
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-30 8:25 ` Olaf Kirch
@ 2004-08-30 10:19 ` Greg Banks
2004-08-30 15:03 ` Paul Clements
2004-09-03 7:28 ` Kedar Sovani
1 sibling, 1 reply; 18+ messages in thread
From: Greg Banks @ 2004-08-30 10:19 UTC (permalink / raw)
To: Olaf Kirch; +Cc: Paul Clements, Neil Brown, nfs
On Mon, Aug 30, 2004 at 10:25:11AM +0200, Olaf Kirch wrote:
> On Fri, Aug 27, 2004 at 09:13:06AM -0400, Paul Clements wrote:
> > With so many things moving out of the kernel to userland, it's
> > surprising to see statd moving into the kernel. And it certainly does
> > make modifications, such as this, much more difficult.
>
> Well, the issue with statd is that it exists _only_ to make lockd happy.
> But what it really does is make lockd awfully complicated because we
> have to do upcalls all the time.
How is that any different from the other rpc calls lockd has to do?
> So my rationale for moving statd into the kernel is to actually eliminate
> a lot of code and have a minimal set of functionality in there that does
> exactly what lockd needs, no more. In particular, no more SM_MON and
> SM_UNMON support that allows untrusted hosts to do sneaky stuff - instead,
Didn't you fix that for CERT CA-99.05? If configured correctly, statd
rejects calls not from localhost. Probably it should also reject calls
not from a privileged port.
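The two checks described here can be sketched as follows. This is an illustration only, not the code statd actually uses: caller_is_trusted() is a hypothetical helper, and treating all of 127.0.0.0/8 as "localhost" is an assumption.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Returns 1 if the caller's address is on the loopback network and its
 * source port is a reserved (privileged) one, i.e. below 1024; else 0.
 */
int caller_is_trusted(const struct sockaddr_in *sin)
{
	uint32_t addr = ntohl(sin->sin_addr.s_addr);
	uint16_t port = ntohs(sin->sin_port);

	if (sin->sin_family != AF_INET)
		return 0;
	if ((addr >> 24) != 127)	/* not in 127.0.0.0/8 */
		return 0;
	return port < IPPORT_RESERVED;
}
```

An RPC service routine would apply such a check to the caller's address before acting on SM_MON or SM_UNMON.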
> Doesn't a HA NFS deployment require a shared disk anyway?
Shared data disks, not a shared root disk.
> Why not just
> move /var/lib/nfs to this file system so it gets shared the same way you
> share the rest of your data?
This would have to be done manually on a per-site basis because you
can't predict where the mountpoint of any particular shared disk will
be when NFS starts. So packaging an HA solution based on that
arrangement would be a mite tricky.
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-30 10:19 ` Greg Banks
@ 2004-08-30 15:03 ` Paul Clements
0 siblings, 0 replies; 18+ messages in thread
From: Paul Clements @ 2004-08-30 15:03 UTC (permalink / raw)
To: Greg Banks; +Cc: Olaf Kirch, Neil Brown, nfs
Greg Banks wrote:
> On Mon, Aug 30, 2004 at 10:25:11AM +0200, Olaf Kirch wrote:
>>Doesn't a HA NFS deployment require a shared disk anyway?
>
>
> Shared data disks, not a shared root disk.
>
>
>>Why not just
>>move /var/lib/nfs to this file system so it gets shared the same way you
>>share the rest of your data?
>
>
> This would have to be done manually on a per-site basis because you
> can't predict where the mountpoint of any particular shared disk will
> be when NFS starts. So packaging an HA solution based on that
> arrangement would be a mite tricky.
You also cannot support active/active configurations when /var/lib/nfs
is linked (or mounted) to a shared location. That is our primary reason
for needing the callouts. Specifically, in order to support
active/active configurations, either you've got to broadcast the
information to all cluster nodes or you've got to "shadow" the
/var/lib/nfs contents on shared storage.
--
Paul
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-26 17:21 ` [PATCH / RFC] nfs-utils: High Availability NFS Paul Clements
2004-08-26 18:43 ` Paul Clements
2004-08-27 7:30 ` Olaf Kirch
@ 2004-08-31 6:51 ` Neil Brown
2004-08-31 16:26 ` Paul Clements
2 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2004-08-31 6:51 UTC (permalink / raw)
To: Paul Clements; +Cc: nfs
On Thursday August 26, paul.clements@steeleye.com wrote:
> Hi all,
>
> I've recently coded up some modifications to nfs-utils that will
> allow the tools to be used in a High Availability NFS environment
> (i.e., capable of switching and failing over NFS exports from
> one server to another, while still preserving client connections
> and file locks). The modifications are in the form of callout
> hooks in statd and mountd. Any HA NFS implementation may take
> advantage of these hooks since the actual content of the callout
> programs will not be dictated by nfs-utils, but rather will be
> left up to the HA cluster software implementor.
I'm mostly happy with this patch.
I'd like to ask for two changes before it gets committed.
First - update the relevant man pages to explain -H and SIGUSR1
usage.
Second, I have a pathological aversion to "system(3)" for this sort
of task. I start worrying about stray quotes and other magic
characters.
I would much prefer something like:
sprintf(buf, "%.8x", arg3);
switch(pid=fork()) {
case 0: execl(ha_callout_prog,"callout",arg1,arg2,buf,NULL);
exit(1);
case -1: perror(PROGNAME ": fork"); break;
default: waitpid(pid, NULL, 0);
}
Also, I would really prefer the count was passed in decimal (%d) - it
is more traditional. If hex is needed, the program can convert it.
(why it is hex in "rmtab" I really don't know)....
But I could probably be talked into hex if you try.
NeilBrown
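The fork/exec pattern suggested above can be sketched as a standalone helper. This is an illustration, not the code that went into nfs-utils: run_callout() is a hypothetical name, and the sketch also collects the child's exit status through waitpid()'s second argument, which the fragment above leaves as NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_callout() is a hypothetical helper. It runs `prog` with three
 * string arguments and no shell, so quotes and other metacharacters in
 * the arguments are passed through verbatim. Returns the child's exit
 * status, or -1 if the child could not be started or did not exit
 * normally.
 */
static int
run_callout(const char *prog, const char *event,
	    const char *arg1, const char *arg2)
{
	int status;
	pid_t pid;

	assert(prog != NULL);

	pid = fork();
	switch (pid) {
	case -1:
		perror("fork");
		return -1;
	case 0:
		/* Child: argv[0] is the program name by convention. */
		execl(prog, prog, event, arg1, arg2, (char *)NULL);
		perror("execl");
		_exit(127);
	default:
		if (waitpid(pid, &status, 0) < 0) {
			perror("waitpid");
			return -1;
		}
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because no shell is involved, a client name such as a'b;c reaches the callout program as a single, unmodified argument.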
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-31 6:51 ` Neil Brown
@ 2004-08-31 16:26 ` Paul Clements
2004-08-31 20:46 ` Paul Clements
2004-08-31 23:56 ` Neil Brown
0 siblings, 2 replies; 18+ messages in thread
From: Paul Clements @ 2004-08-31 16:26 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]
Neil Brown wrote:
> On Thursday August 26, paul.clements@steeleye.com wrote:
> I'm mostly happy with this patch.
> I'd like to ask for two changes before it gets committed.
>
> First - update the relevant man pages to explain -H and SIGUSR1
> usage.
OK, I've added sections to the mountd and statd man pages.
> Second, I have a pathological aversion to "system(3)" for this sort
> of task. I start worrying about stray quotes and other magic
> characters.
> I would much prefer something like:
>
> sprintf(buf, "%.8x", arg3);
> switch(pid=fork()) {
> case 0: execl(ha_callout_prog,"callout",arg1,arg2,buf,NULL);
> exit(1);
> case -1: perror(PROGNAME ": fork"); break;
> default: waitpid(pid, NULL, 0);
> }
OK, I've changed this.
> Also, I would really prefer the count was passed in decimal (%d) - it
> is more traditional. If hex is needed, the program can convert it.
> (why it is hex in "rmtab" I really don't know)....
OK, I pass it in decimal now.
New patch attached.
--
Paul
[-- Attachment #2: nfs_utils_ha_callout-3.diff --]
[-- Type: text/plain, Size: 12780 bytes --]
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h nfs-utils-1.0.6/support/include/ha-callout.h
--- nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h 1969-12-31 19:00:00.000000000 -0500
+++ nfs-utils-1.0.6/support/include/ha-callout.h 2004-08-31 10:31:18.000000000 -0400
@@ -0,0 +1,49 @@
+/*
+ * support/include/ha-callout.h
+ *
+ * High Availability NFS Callout support routines
+ *
+ * Copyright (c) 2004, Paul Clements, SteelEye Technology
+ *
+ * In order to implement HA NFS, we need several callouts at key
+ * points in statd and mountd. These callouts all come to ha_callout(),
+ * which, in turn, calls out to an ha-callout script (not part of nfs-utils;
+ * defined by -H argument to rpc.statd and rpc.mountd).
+ */
+#ifndef HA_CALLOUT_H
+#define HA_CALLOUT_H
+
+#include <sys/wait.h>
+
+extern char *ha_callout_prog;
+
+static inline void
+ha_callout(char *event, char *arg1, char *arg2, int arg3)
+{
+ char buf[16]; /* should be plenty */
+ pid_t pid;
+ int ret = -1;
+
+ if (!ha_callout_prog) /* HA callout is not enabled */
+ return;
+
+ sprintf(buf, "%d", arg3);
+
+ pid = fork();
+ switch (pid) {
+ case 0: execl(ha_callout_prog, event, arg1, arg2, buf, NULL);
+ perror("execl");
+ exit(2);
+ case -1: perror("fork");
+ break;
+ default: waitpid(pid, &ret, 0);
+ }
+
+#ifdef dprintf
+ dprintf(N_DEBUG, "ha callout returned %d\n", WEXITSTATUS(ret));
+#else
+ xlog(D_GENERAL, "ha callout returned %d\n", WEXITSTATUS(ret));
+#endif
+}
+
+#endif
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c nfs-utils-1.0.6/utils/mountd/mountd.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c 2003-09-12 18:14:16.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/mountd.c 2004-08-31 10:31:02.000000000 -0400
@@ -36,6 +36,11 @@ static struct nfs_fh_len *get_rootfh(str
int new_cache = 0;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * send mount or unmount requests -- the callout is not needed for 2.6 kernel */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -48,6 +53,7 @@ static struct option longopts[] =
{ "version", 0, 0, 'v' },
{ "port", 1, 0, 'p' },
{ "no-tcp", 0, 0, 'n' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -444,7 +450,7 @@ main(int argc, char **argv)
/* Parse the command line options and arguments. */
opterr = 0;
- while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hN:V:v", longopts, NULL)) != EOF)
+ while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hH:N:V:v", longopts, NULL)) != EOF)
switch (c) {
case 'o':
descriptors = atoi(optarg);
@@ -463,6 +469,9 @@ main(int argc, char **argv)
case 'f':
export_file = optarg;
break;
+ case 'H': /* PRC: specify a high-availability callout program */
+ ha_callout_prog = optarg;
+ break;
case 'h':
usage(argv [0], 0);
break;
@@ -596,6 +605,7 @@ usage(const char *prog, int n)
"Usage: %s [-F|--foreground] [-h|--help] [-v|--version] [-d kind|--debug kind]\n"
" [-o num|--descriptors num] [-f exports-file|--exports-file=file]\n"
" [-p|--port port] [-V version|--nfs-version version]\n"
-" [-N version|--no-nfs-version version] [-n|--no-tcp]\n", prog);
+" [-N version|--no-nfs-version version] [-n|--no-tcp]\n"
+" [-H ha-callout-prog]\n", prog);
exit(n);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.man nfs-utils-1.0.6/utils/mountd/mountd.man
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.man 2003-07-17 19:17:28.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/mountd.man 2004-08-31 10:31:02.000000000 -0400
@@ -2,7 +2,8 @@
.\" mountd(8)
.\"
.\" Copyright (C) 1999 Olaf Kirch <okir@monad.swb.de>
-.TH rpc.mountd 8 "25 Aug 2000"
+.\" Modified by Paul Clements, 2004.
+.TH rpc.mountd 8 "31 Aug 2004"
.SH NAME
rpc.mountd \- NFS mount daemon
.SH SYNOPSIS
@@ -99,6 +100,16 @@ Force
to bind to the specified port num, instead of using the random port
number assigned by the portmapper.
.TP
+.B \-H " or " \-\-ha-callout prog
+Specify a high availability callout program, which will receive callouts
+for all client mount and unmount requests. This allows
+.B rpc.mountd
+to be used in a High Availability NFS (HA-NFS) environment. This callout is not
+needed (and should not be used) with 2.6 and later kernels (instead,
+mount the nfsd filesystem on
+.B /proc/fs/nfsd
+).
+.TP
.B \-V " or " \-\-nfs-version
This option can be used to request that
.B rpc.mountd
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c nfs-utils-1.0.6/utils/mountd/rmtab.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c 2003-07-31 01:19:26.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/rmtab.c 2004-08-31 10:31:02.000000000 -0400
@@ -19,6 +19,7 @@
#include "exportfs.h"
#include "xio.h"
#include "mountd.h"
+#include "ha-callout.h"
#include <limits.h> /* PATH_MAX */
@@ -61,6 +62,8 @@ mountlist_add(char *host, const char *pa
host) == 0
&& strcmp(rep->r_path, path) == 0) {
rep->r_count++;
+ /* PRC: do the HA callout: */
+ ha_callout("mount", rep->r_client, rep->r_path, rep->r_count);
putrmtabent(rep, &pos);
endrmtabent();
xfunlock(lockid);
@@ -75,6 +78,8 @@ mountlist_add(char *host, const char *pa
xe.r_path [sizeof (xe.r_path) - 1] = '\0';
xe.r_count = 1;
if (setrmtabent("a")) {
+ /* PRC: do the HA callout: */
+ ha_callout("mount", xe.r_client, xe.r_path, xe.r_count);
putrmtabent(&xe, NULL);
endrmtabent();
}
@@ -103,8 +108,11 @@ mountlist_del(char *hname, const char *p
while ((rep = getrmtabent(1, NULL)) != NULL) {
match = !strcmp (rep->r_client, hname)
&& !strcmp(rep->r_path, path);
- if (match)
+ if (match) {
rep->r_count--;
+ /* PRC: do the HA callout: */
+ ha_callout("umount", rep->r_client, rep->r_path, rep->r_count);
+ }
if (!match || rep->r_count)
fputrmtabent(fp, rep, NULL);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c nfs-utils-1.0.6/utils/statd/monitor.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c 2003-09-12 01:41:35.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/monitor.c 2004-08-31 10:31:02.000000000 -0400
@@ -19,6 +19,7 @@
#include "misc.h"
#include "statd.h"
#include "notlist.h"
+#include "ha-callout.h"
notify_list * rtnl = NULL; /* Run-time notify list. */
@@ -177,6 +178,8 @@ sm_mon_1_svc(struct mon *argp, struct sv
goto failure;
}
free(path);
+ /* PRC: do the HA callout: */
+ ha_callout("add-client", mon_name, my_name, 0);
nlist_insert(&rtnl, clnt);
close(fd);
@@ -232,6 +235,10 @@ sm_unmon_1_svc(struct mon_id *argp, stru
/* Match! */
dprintf(N_DEBUG, "UNMONITORING %s for %s",
mon_name, my_name);
+
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, my_name, 0);
+
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
@@ -276,6 +283,8 @@ sm_unmon_all_1_svc(struct my_id *argp, s
sizeof (mon_name) - 1);
mon_name[sizeof (mon_name) - 1] = '\0';
temp = NL_NEXT(clnt);
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, argp->my_name, 0);
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
++count;
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c nfs-utils-1.0.6/utils/statd/rmtcall.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c 2003-09-12 01:41:38.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/rmtcall.c 2004-08-31 10:31:02.000000000 -0400
@@ -38,6 +38,7 @@
#include "statd.h"
#include "notlist.h"
#include "log.h"
+#include "ha-callout.h"
#define MAXMSGSIZE (2048 / sizeof(unsigned int))
@@ -414,6 +415,8 @@ process_notify_list(void)
note(N_ERROR,
"Can't notify %s, giving up.",
NL_MON_NAME(entry));
+ /* PRC: do the HA callout */
+ ha_callout("del-client", NL_MON_NAME(entry), NL_MY_NAME(entry), 0);
xunlink(SM_BAK_DIR, NL_MON_NAME(entry), 0);
nlist_free(&notify, entry);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c nfs-utils-1.0.6/utils/statd/statd.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c 2003-09-12 02:24:29.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/statd.c 2004-08-31 10:31:02.000000000 -0400
@@ -48,6 +48,11 @@ int run_mode = 0; /* foreground logging
char *name_p = NULL;
char *version_p = NULL;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * are added to or deleted from the notify list */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -59,6 +64,7 @@ static struct option longopts[] =
{ "name", 1, 0, 'n' },
{ "state-directory-path", 1, 0, 'P' },
{ "notify-mode", 0, 0, 'N' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -102,6 +108,13 @@ killer (int sig)
exit (0);
}
+static void
+sigusr (int sig)
+{
+ dprintf (N_DEBUG, "Caught signal %d, re-reading notify list.", sig);
+ notify_hosts();
+}
+
/*
* Startup information.
*/
@@ -148,6 +161,7 @@ usage()
fprintf(stderr," -n, --name Specify a local hostname.\n");
fprintf(stderr," -P State directory path.\n");
fprintf(stderr," -N Run in notify only mode.\n");
+ fprintf(stderr," -H Specify a high-availability callout program.\n");
}
static const char *pidfile = "/var/run/rpc.statd.pid";
@@ -236,7 +250,7 @@ int main (int argc, char **argv)
MY_NAME = NULL;
/* Process command line switches */
- while ((arg = getopt_long(argc, argv, "h?vVFNdn:p:o:P:", longopts, NULL)) != EOF) {
+ while ((arg = getopt_long(argc, argv, "h?vVFNH:dn:p:o:P:", longopts, NULL)) != EOF) {
switch (arg) {
case 'V': /* Version */
case 'v':
@@ -302,6 +316,13 @@ int main (int argc, char **argv)
sprintf(SM_STAT_PATH, "%s/state", DIR_BASE );
}
break;
+ case 'H': /* PRC: specify the ha-callout program */
+ if ((ha_callout_prog = xstrdup(optarg)) == NULL) {
+ fprintf(stderr, "%s: xstrdup(%s) failed!\n",
+ argv[0], optarg);
+ exit(1);
+ }
+ break;
case '?': /* heeeeeelllllllpppp? heh */
case 'h':
usage();
@@ -397,6 +418,8 @@ int main (int argc, char **argv)
signal (SIGHUP, killer);
signal (SIGINT, killer);
signal (SIGTERM, killer);
+ /* PRC: trap SIGUSR1 to re-read notify list from disk */
+ signal(SIGUSR1, sigusr);
/* WARNING: the following works on Linux and SysV, but not BSD! */
signal(SIGCHLD, SIG_IGN);
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/statd.man nfs-utils-1.0.6/utils/statd/statd.man
--- nfs-utils-1.0.6-PRISTINE/utils/statd/statd.man 2002-09-16 15:23:03.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/statd.man 2004-08-31 10:31:02.000000000 -0400
@@ -4,11 +4,12 @@
.\" Copyright (C) 1999 Olaf Kirch <okir@monad.swb.de>
.\" Modified by Jeffrey A. Uphoff, 1999, 2002.
.\" Modified by Lon Hohberger, 2000.
-.TH rpc.statd 8 "16 Sep 2002"
+.\" Modified by Paul Clements, 2004.
+.TH rpc.statd 8 "31 Aug 2004"
.SH NAME
rpc.statd \- NSM status monitor
.SH SYNOPSIS
-.B "/sbin/rpc.statd [-F] [-d] [-?] [-n " name "] [-o " port "] [-p " port "] [-V]"
+.B "/sbin/rpc.statd [-F] [-d] [-?] [-n " name "] [-o " port "] [-p " port "] [-H " prog "] [-V]"
.SH DESCRIPTION
The
.B rpc.statd
@@ -101,6 +102,12 @@ statd program will check its state direc
monitored nodes, and exit once the notifications have been sent. This mode is
used to enable Highly Available NFS implementations (i.e. HA-NFS).
.TP
+.BI "\-H, " "" " \-\-ha-callout " prog
+Specify a high availability callout program, which will receive callouts
+for all client monitor and unmonitor requests. This allows
+.B rpc.statd
+to be used in a High Availability NFS (HA-NFS) environment.
+.TP
.B -?
Causes
.B rpc.statd
@@ -135,6 +142,15 @@ and
.BR hosts_access (5)
manual pages.
+.SH SIGNALS
+.BR SIGUSR1
+causes
+.B rpc.statd
+to re-read the notify list from disk
+and send notifications to clients. This can be used in High Availability NFS
+(HA-NFS) environments to notify clients to reacquire file locks upon takeover
+of an NFS export from another server.
+
.SH FILES
.BR /var/lib/nfs/state
.br
@@ -153,3 +169,5 @@ Olaf Kirch <okir@monad.swb.de>
H.J. Lu <hjl@gnu.org>
.br
Lon Hohberger <hohberger@missioncriticallinux.com>
+.br
+Paul Clements <paul.clements@steeleye.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-31 16:26 ` Paul Clements
@ 2004-08-31 20:46 ` Paul Clements
2004-08-31 23:56 ` Neil Brown
1 sibling, 0 replies; 18+ messages in thread
From: Paul Clements @ 2004-08-31 20:46 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
[-- Attachment #1: Type: text/plain, Size: 1410 bytes --]
Paul Clements wrote:
> diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h nfs-utils-1.0.6/support/include/ha-callout.h
> --- nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h 1969-12-31 19:00:00.000000000 -0500
> +++ nfs-utils-1.0.6/support/include/ha-callout.h 2004-08-31 10:31:18.000000000 -0400
> @@ -0,0 +1,49 @@
> +/*
> + * support/include/ha-callout.h
> + *
> + * High Availability NFS Callout support routines
> + *
> + * Copyright (c) 2004, Paul Clements, SteelEye Technology
> + *
> + * In order to implement HA NFS, we need several callouts at key
> + * points in statd and mountd. These callouts all come to ha_callout(),
> + * which, in turn, calls out to an ha-callout script (not part of nfs-utils;
> + * defined by -H argument to rpc.statd and rpc.mountd).
> + */
> +#ifndef HA_CALLOUT_H
> +#define HA_CALLOUT_H
> +
> +#include <sys/wait.h>
> +
> +extern char *ha_callout_prog;
> +
> +static inline void
> +ha_callout(char *event, char *arg1, char *arg2, int arg3)
> +{
> + char buf[16]; /* should be plenty */
> + pid_t pid;
> + int ret = -1;
> +
> + if (!ha_callout_prog) /* HA callout is not enabled */
> + return;
> +
> + sprintf(buf, "%d", arg3);
> +
> + pid = fork();
> + switch (pid) {
> + case 0: execl(ha_callout_prog, event, arg1, arg2, buf, NULL);
This bit is not quite right. Corrected patch attached.
--
Paul
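The correction concerns execl()'s argument list: the first argument after the path becomes the new program's argv[0], so without repeating the program name the event string lands in argv[0] and every other argument shifts down by one. A small sketch of the intended layout follows; build_callout_argv() and CALLOUT_ARGC are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CALLOUT_ARGC 5	/* program name + event + 3 arguments */

/*
 * build_callout_argv() is a hypothetical helper. It fills `argv` the way
 * the corrected execl() call lays it out: argv[0] repeats the program
 * path, so the callout sees the event name as its first real argument.
 * `countbuf` receives the decimal form of `count`.
 */
static void
build_callout_argv(char *argv[CALLOUT_ARGC + 1], char *prog, char *event,
		   char *arg1, char *arg2, int count,
		   char *countbuf, size_t countbuflen)
{
	assert(prog != NULL && event != NULL);

	snprintf(countbuf, countbuflen, "%d", count);
	argv[0] = prog;		/* what a shell script sees as $0 */
	argv[1] = event;	/* $1: "mount", "add-client", ... */
	argv[2] = arg1;		/* $2 */
	argv[3] = arg2;		/* $3 */
	argv[4] = countbuf;	/* $4 */
	argv[5] = NULL;		/* execv() needs a NULL terminator */
}
```

The resulting vector could be handed to execv(prog, argv), which is equivalent to the corrected execl() call.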
[-- Attachment #2: nfs_utils_ha_callout-4.diff --]
[-- Type: text/plain, Size: 12802 bytes --]
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h nfs-utils-1.0.6/support/include/ha-callout.h
--- nfs-utils-1.0.6-PRISTINE/support/include/ha-callout.h 1969-12-31 19:00:00.000000000 -0500
+++ nfs-utils-1.0.6/support/include/ha-callout.h 2004-08-31 15:29:13.000000000 -0400
@@ -0,0 +1,50 @@
+/*
+ * support/include/ha-callout.h
+ *
+ * High Availability NFS Callout support routines
+ *
+ * Copyright (c) 2004, Paul Clements, SteelEye Technology
+ *
+ * In order to implement HA NFS, we need several callouts at key
+ * points in statd and mountd. These callouts all come to ha_callout(),
+ * which, in turn, calls out to an ha-callout script (not part of nfs-utils;
+ * defined by -H argument to rpc.statd and rpc.mountd).
+ */
+#ifndef HA_CALLOUT_H
+#define HA_CALLOUT_H
+
+#include <sys/wait.h>
+
+extern char *ha_callout_prog;
+
+static inline void
+ha_callout(char *event, char *arg1, char *arg2, int arg3)
+{
+ char buf[16]; /* should be plenty */
+ pid_t pid;
+ int ret = -1;
+
+ if (!ha_callout_prog) /* HA callout is not enabled */
+ return;
+
+ sprintf(buf, "%d", arg3);
+
+ pid = fork();
+ switch (pid) {
+ case 0: execl(ha_callout_prog, ha_callout_prog,
+ event, arg1, arg2, buf, NULL);
+ perror("execl");
+ exit(2);
+ case -1: perror("fork");
+ break;
+ default: waitpid(pid, &ret, 0);
+ }
+
+#ifdef dprintf
+ dprintf(N_DEBUG, "ha callout returned %d\n", WEXITSTATUS(ret));
+#else
+ xlog(D_GENERAL, "ha callout returned %d\n", WEXITSTATUS(ret));
+#endif
+}
+
+#endif
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c nfs-utils-1.0.6/utils/mountd/mountd.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.c 2003-09-12 18:14:16.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/mountd.c 2004-08-31 10:31:02.000000000 -0400
@@ -36,6 +36,11 @@ static struct nfs_fh_len *get_rootfh(str
int new_cache = 0;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * send mount or unmount requests -- the callout is not needed for 2.6 kernel */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -48,6 +53,7 @@ static struct option longopts[] =
{ "version", 0, 0, 'v' },
{ "port", 1, 0, 'p' },
{ "no-tcp", 0, 0, 'n' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -444,7 +450,7 @@ main(int argc, char **argv)
/* Parse the command line options and arguments. */
opterr = 0;
- while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hN:V:v", longopts, NULL)) != EOF)
+ while ((c = getopt_long(argc, argv, "o:n:Fd:f:p:P:hH:N:V:v", longopts, NULL)) != EOF)
switch (c) {
case 'o':
descriptors = atoi(optarg);
@@ -463,6 +469,9 @@ main(int argc, char **argv)
case 'f':
export_file = optarg;
break;
+ case 'H': /* PRC: specify a high-availability callout program */
+ ha_callout_prog = optarg;
+ break;
case 'h':
usage(argv [0], 0);
break;
@@ -596,6 +605,7 @@ usage(const char *prog, int n)
"Usage: %s [-F|--foreground] [-h|--help] [-v|--version] [-d kind|--debug kind]\n"
" [-o num|--descriptors num] [-f exports-file|--exports-file=file]\n"
" [-p|--port port] [-V version|--nfs-version version]\n"
-" [-N version|--no-nfs-version version] [-n|--no-tcp]\n", prog);
+" [-N version|--no-nfs-version version] [-n|--no-tcp]\n"
+" [-H ha-callout-prog]\n", prog);
exit(n);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.man nfs-utils-1.0.6/utils/mountd/mountd.man
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/mountd.man 2003-07-17 19:17:28.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/mountd.man 2004-08-31 10:31:02.000000000 -0400
@@ -2,7 +2,8 @@
.\" mountd(8)
.\"
.\" Copyright (C) 1999 Olaf Kirch <okir@monad.swb.de>
-.TH rpc.mountd 8 "25 Aug 2000"
+.\" Modified by Paul Clements, 2004.
+.TH rpc.mountd 8 "31 Aug 2004"
.SH NAME
rpc.mountd \- NFS mount daemon
.SH SYNOPSIS
@@ -99,6 +100,16 @@ Force
to bind to the specified port num, instead of using the random port
number assigned by the portmapper.
.TP
+.B \-H " or " \-\-ha-callout prog
+Specify a high availability callout program, which will receive callouts
+for all client mount and unmount requests. This allows
+.B rpc.mountd
+to be used in a High Availability NFS (HA-NFS) environment. This callout is not
+needed (and should not be used) with 2.6 and later kernels (instead,
+mount the nfsd filesystem on
+.B /proc/fs/nfsd
+).
+.TP
.B \-V " or " \-\-nfs-version
This option can be used to request that
.B rpc.mountd
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c nfs-utils-1.0.6/utils/mountd/rmtab.c
--- nfs-utils-1.0.6-PRISTINE/utils/mountd/rmtab.c 2003-07-31 01:19:26.000000000 -0400
+++ nfs-utils-1.0.6/utils/mountd/rmtab.c 2004-08-31 10:31:02.000000000 -0400
@@ -19,6 +19,7 @@
#include "exportfs.h"
#include "xio.h"
#include "mountd.h"
+#include "ha-callout.h"
#include <limits.h> /* PATH_MAX */
@@ -61,6 +62,8 @@ mountlist_add(char *host, const char *pa
host) == 0
&& strcmp(rep->r_path, path) == 0) {
rep->r_count++;
+ /* PRC: do the HA callout: */
+ ha_callout("mount", rep->r_client, rep->r_path, rep->r_count);
putrmtabent(rep, &pos);
endrmtabent();
xfunlock(lockid);
@@ -75,6 +78,8 @@ mountlist_add(char *host, const char *pa
xe.r_path [sizeof (xe.r_path) - 1] = '\0';
xe.r_count = 1;
if (setrmtabent("a")) {
+ /* PRC: do the HA callout: */
+ ha_callout("mount", xe.r_client, xe.r_path, xe.r_count);
putrmtabent(&xe, NULL);
endrmtabent();
}
@@ -103,8 +108,11 @@ mountlist_del(char *hname, const char *p
while ((rep = getrmtabent(1, NULL)) != NULL) {
match = !strcmp (rep->r_client, hname)
&& !strcmp(rep->r_path, path);
- if (match)
+ if (match) {
rep->r_count--;
+ /* PRC: do the HA callout: */
+ ha_callout("umount", rep->r_client, rep->r_path, rep->r_count);
+ }
if (!match || rep->r_count)
fputrmtabent(fp, rep, NULL);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c nfs-utils-1.0.6/utils/statd/monitor.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/monitor.c 2003-09-12 01:41:35.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/monitor.c 2004-08-31 10:31:02.000000000 -0400
@@ -19,6 +19,7 @@
#include "misc.h"
#include "statd.h"
#include "notlist.h"
+#include "ha-callout.h"
notify_list * rtnl = NULL; /* Run-time notify list. */
@@ -177,6 +178,8 @@ sm_mon_1_svc(struct mon *argp, struct sv
goto failure;
}
free(path);
+ /* PRC: do the HA callout: */
+ ha_callout("add-client", mon_name, my_name, 0);
nlist_insert(&rtnl, clnt);
close(fd);
@@ -232,6 +235,10 @@ sm_unmon_1_svc(struct mon_id *argp, stru
/* Match! */
dprintf(N_DEBUG, "UNMONITORING %s for %s",
mon_name, my_name);
+
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, my_name, 0);
+
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
@@ -276,6 +283,8 @@ sm_unmon_all_1_svc(struct my_id *argp, s
sizeof (mon_name) - 1);
mon_name[sizeof (mon_name) - 1] = '\0';
temp = NL_NEXT(clnt);
+ /* PRC: do the HA callout: */
+ ha_callout("del-client", mon_name, argp->my_name, 0);
nlist_free(&rtnl, clnt);
xunlink(SM_DIR, mon_name, 1);
++count;
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c nfs-utils-1.0.6/utils/statd/rmtcall.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/rmtcall.c 2003-09-12 01:41:38.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/rmtcall.c 2004-08-31 10:31:02.000000000 -0400
@@ -38,6 +38,7 @@
#include "statd.h"
#include "notlist.h"
#include "log.h"
+#include "ha-callout.h"
#define MAXMSGSIZE (2048 / sizeof(unsigned int))
@@ -414,6 +415,8 @@ process_notify_list(void)
note(N_ERROR,
"Can't notify %s, giving up.",
NL_MON_NAME(entry));
+ /* PRC: do the HA callout */
+ ha_callout("del-client", NL_MON_NAME(entry), NL_MY_NAME(entry), 0);
xunlink(SM_BAK_DIR, NL_MON_NAME(entry), 0);
nlist_free(&notify, entry);
}
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c nfs-utils-1.0.6/utils/statd/statd.c
--- nfs-utils-1.0.6-PRISTINE/utils/statd/statd.c 2003-09-12 02:24:29.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/statd.c 2004-08-31 10:31:02.000000000 -0400
@@ -48,6 +48,11 @@ int run_mode = 0; /* foreground logging
char *name_p = NULL;
char *version_p = NULL;
+/* PRC: a high-availability callout program can be specified with -H
+ * When this is done, the program will receive callouts whenever clients
+ * are added to or deleted from the notify list */
+char *ha_callout_prog = NULL;
+
static struct option longopts[] =
{
{ "foreground", 0, 0, 'F' },
@@ -59,6 +64,7 @@ static struct option longopts[] =
{ "name", 1, 0, 'n' },
{ "state-directory-path", 1, 0, 'P' },
{ "notify-mode", 0, 0, 'N' },
+ { "ha-callout", 1, 0, 'H' },
{ NULL, 0, 0, 0 }
};
@@ -102,6 +108,13 @@ killer (int sig)
exit (0);
}
+static void
+sigusr (int sig)
+{
+ dprintf (N_DEBUG, "Caught signal %d, re-reading notify list.", sig);
+ notify_hosts();
+}
+
/*
* Startup information.
*/
@@ -148,6 +161,7 @@ usage()
fprintf(stderr," -n, --name Specify a local hostname.\n");
fprintf(stderr," -P State directory path.\n");
fprintf(stderr," -N Run in notify only mode.\n");
+ fprintf(stderr," -H Specify a high-availability callout program.\n");
}
static const char *pidfile = "/var/run/rpc.statd.pid";
@@ -236,7 +250,7 @@ int main (int argc, char **argv)
MY_NAME = NULL;
/* Process command line switches */
- while ((arg = getopt_long(argc, argv, "h?vVFNdn:p:o:P:", longopts, NULL)) != EOF) {
+ while ((arg = getopt_long(argc, argv, "h?vVFNH:dn:p:o:P:", longopts, NULL)) != EOF) {
switch (arg) {
case 'V': /* Version */
case 'v':
@@ -302,6 +316,13 @@ int main (int argc, char **argv)
sprintf(SM_STAT_PATH, "%s/state", DIR_BASE );
}
break;
+ case 'H': /* PRC: specify the ha-callout program */
+ if ((ha_callout_prog = xstrdup(optarg)) == NULL) {
+ fprintf(stderr, "%s: xstrdup(%s) failed!\n",
+ argv[0], optarg);
+ exit(1);
+ }
+ break;
case '?': /* heeeeeelllllllpppp? heh */
case 'h':
usage();
@@ -397,6 +418,8 @@ int main (int argc, char **argv)
signal (SIGHUP, killer);
signal (SIGINT, killer);
signal (SIGTERM, killer);
+ /* PRC: trap SIGUSR1 to re-read notify list from disk */
+ signal(SIGUSR1, sigusr);
/* WARNING: the following works on Linux and SysV, but not BSD! */
signal(SIGCHLD, SIG_IGN);
diff -purN --exclude-from /export/public/clemep/tmp/dontdiff nfs-utils-1.0.6-PRISTINE/utils/statd/statd.man nfs-utils-1.0.6/utils/statd/statd.man
--- nfs-utils-1.0.6-PRISTINE/utils/statd/statd.man 2002-09-16 15:23:03.000000000 -0400
+++ nfs-utils-1.0.6/utils/statd/statd.man 2004-08-31 10:31:02.000000000 -0400
@@ -4,11 +4,12 @@
.\" Copyright (C) 1999 Olaf Kirch <okir@monad.swb.de>
.\" Modified by Jeffrey A. Uphoff, 1999, 2002.
.\" Modified by Lon Hohberger, 2000.
-.TH rpc.statd 8 "16 Sep 2002"
+.\" Modified by Paul Clements, 2004.
+.TH rpc.statd 8 "31 Aug 2004"
.SH NAME
rpc.statd \- NSM status monitor
.SH SYNOPSIS
-.B "/sbin/rpc.statd [-F] [-d] [-?] [-n " name "] [-o " port "] [-p " port "] [-V]"
+.B "/sbin/rpc.statd [-F] [-d] [-?] [-n " name "] [-o " port "] [-p " port "] [-H " prog "] [-V]"
.SH DESCRIPTION
The
.B rpc.statd
@@ -101,6 +102,12 @@ statd program will check its state direc
monitored nodes, and exit once the notifications have been sent. This mode is
used to enable Highly Available NFS implementations (i.e. HA-NFS).
.TP
+.BI "\-H, " "" " \-\-ha-callout " prog
+Specify a high availability callout program, which will receive callouts
+for all client monitor and unmonitor requests. This allows
+.B rpc.statd
+to be used in a High Availability NFS (HA-NFS) environment.
+.TP
.B -?
Causes
.B rpc.statd
@@ -135,6 +142,15 @@ and
.BR hosts_access (5)
manual pages.
+.SH SIGNALS
+.BR SIGUSR1
+causes
+.B rpc.statd
+to re-read the notify list from disk
+and send notifications to clients. This can be used in High Availability NFS
+(HA-NFS) environments to notify clients to reacquire file locks upon takeover
+of an NFS export from another server.
+
.SH FILES
.BR /var/lib/nfs/state
.br
@@ -153,3 +169,5 @@ Olaf Kirch <okir@monad.swb.de>
H.J. Lu <hjl@gnu.org>
.br
Lon Hohberger <hohberger@missioncriticallinux.com>
+.br
+Paul Clements <paul.clements@steeleye.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-31 16:26 ` Paul Clements
2004-08-31 20:46 ` Paul Clements
@ 2004-08-31 23:56 ` Neil Brown
2004-09-01 3:49 ` Paul Clements
1 sibling, 1 reply; 18+ messages in thread
From: Neil Brown @ 2004-08-31 23:56 UTC (permalink / raw)
To: Paul Clements; +Cc: nfs
On Tuesday August 31, paul.clements@steeleye.com wrote:
>
> New patch attached.
>
Thanks. I'm almost ready to commit it.
I added a bit to the documentation to describe the arguments passed to
the callout.
For mountd
+).
+The program will be called with 4 arguments.
+The first will be
+.B mount
+or
+.B umount
+depending on the reason for the callout.
+The second will be the name of the client performing the mount.
+The third will be the path that the client is mounting.
+The last is the number of concurrent mounts that we believe the client
+has of that path.
For statd
+The program will be run with 3 arguments: The first is either
+.B add-client
+or
+.B del-client
+depending on the reason for the callout.
+The second will be the name of the client.
+The third will be ....
+.TP
Note that for statd, I dropped the third arg being passed. It seemed
easier than explaining in the doco why there is a fourth arg.
Also note that I don't remember what "my_name" is all about. Could
you provide a small piece of text to go there?
Finally, would you consider having "unmount" instead of "umount" as a
possible first arg for the mountd callout?
Thanks
NeilBrown
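
For illustration, here is a minimal sketch in C of how a callout program
might validate the four mountd arguments described above. The names
struct mount_event and parse_mount_event() are hypothetical and are not
part of nfs-utils. The sketch accepts both "umount" and "unmount" because
the spelling of the first argument is still being discussed in this
message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one mountd callout event (not an nfs-utils type). */
struct mount_event {
    int is_mount;        /* 1 for "mount", 0 for "umount" or "unmount" */
    const char *client;  /* name of the client performing the (un)mount */
    const char *path;    /* exported path involved */
    long count;          /* concurrent mounts the server believes exist */
};

/*
 * Parse the four callout arguments (program name already stripped).
 * Returns 0 on success, -1 if the arguments are malformed.
 */
int parse_mount_event(int argc, char *const argv[], struct mount_event *ev)
{
    char *end;

    if (argc != 4)
        return -1;

    if (strcmp(argv[0], "mount") == 0)
        ev->is_mount = 1;
    else if (strcmp(argv[0], "umount") == 0 ||
             strcmp(argv[0], "unmount") == 0)
        ev->is_mount = 0;
    else
        return -1;

    ev->client = argv[1];
    ev->path = argv[2];

    /* The last argument must be a non-negative decimal number. */
    ev->count = strtol(argv[3], &end, 10);
    if (end == argv[3] || *end != '\0' || ev->count < 0)
        return -1;

    return 0;
}
```

A callout's main() could call parse_mount_event(argc - 1, argv + 1, &ev)
and then record or remove the client as the cluster software requires.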
-------------------------------------------------------
This SF.Net email is sponsored by BEA Weblogic Workshop
FREE Java Enterprise J2EE developer tools!
Get your free copy of BEA WebLogic Workshop 8.1 today.
http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
* Re: Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-31 23:56 ` Neil Brown
@ 2004-09-01 3:49 ` Paul Clements
2004-09-06 2:17 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Paul Clements @ 2004-09-01 3:49 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
Neil Brown wrote:
> On Tuesday August 31, paul.clements@steeleye.com wrote:
> I added a bit to the documentation to describe the arguments passed to
> the callout.
Those look good to me.
> Note that for statd, I dropped the third arg being passed. It seemed
> easier than explaining in the doco why there is a fourth arg.
> Also note that I don't remember what "my_name" is all about. Could
> you provide a small piece of text to go there.
The my_name argument is the server name, but honestly, I don't know that
we need anything other than the client name (we really just need to know
what files to create in the sm/sm.bak directories) so if you want to
just drop that argument it's fine with me.
> Finally, would you consider having "unmount" instead of "umount" as a
> possible first arg for the mountd callout?
unmount instead of umount is fine if you want to go ahead and make that
change...
Thanks,
Paul
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-08-30 8:25 ` Olaf Kirch
2004-08-30 10:19 ` Greg Banks
@ 2004-09-03 7:28 ` Kedar Sovani
2004-09-06 1:47 ` Neil Brown
1 sibling, 1 reply; 18+ messages in thread
From: Kedar Sovani @ 2004-09-03 7:28 UTC (permalink / raw)
To: Olaf Kirch; +Cc: Paul Clements, Neil Brown, nfs
The NFS server keeps state about the recent responses given to clients
in a reply cache.
Shouldn't that state be replicated to the failed-over NFS server as well?
Kedar.
On Mon, 2004-08-30 at 13:55, Olaf Kirch wrote:
> On Fri, Aug 27, 2004 at 09:13:06AM -0400, Paul Clements wrote:
> > With so many things moving out of the kernel to userland, it's
> > surprising to see statd moving into the kernel. And it certainly does
> > make modifications, such as this, much more difficult.
>
> Well, the issue with statd is that it exists _only_ to make lockd happy.
> But what it really does is make lockd awfully complicated because we
> have to do upcalls all the time.
>
> So my rationale for moving statd into the kernel is to actually eliminate
> a lot of code and have a minimal set of functionality in there that does
> exactly what lockd needs, no more. In particular, no more SM_MON and
> SM_UNMON support that allows untrusted hosts to do sneaky stuff - instead,
> lockd writes the /var/lib/nfs/sm files directly. The only thing my kernel
> statd supports is SM_NOTIFY, and that even is delivered directly to the
> locking code.
>
> > But the callout happens before statd replies to the client, so there's
> > no chance that we lose any clients if there's a failure. If we use
> > dnotify then we only find out there's been a change after the change has
> > occurred, which means the client may have already gotten a reply saying
> > he's been added to the notify list. If we get a failure at this point,
> > we've lost a client, and upon failover the client's locks are no longer
> > valid.
>
> Doesn't a HA NFS deployment require a shared disk anyway? Why not just
> move /var/lib/nfs to this file system so it gets shared the same way you
> share the rest of your data?
>
> Olaf
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-09-03 7:28 ` Kedar Sovani
@ 2004-09-06 1:47 ` Neil Brown
2004-09-06 12:47 ` Kedar Sovani
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2004-09-06 1:47 UTC (permalink / raw)
To: Kedar Sovani; +Cc: Olaf Kirch, Paul Clements, nfs
On September 3, kedars@hotpop.com wrote:
> NFS Server keeps state about the recent responses given to the clients,
> in a reply cache.
>
> Shouldn't that state be replicated to the failed-over NFS server as well
> ?
The reply cache is an optimisation and cannot be relied upon. Copying
it across in a fail-over would be both very expensive (it is highly
volatile) and of little benefit.
NeilBrown
* Re: Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-09-01 3:49 ` Paul Clements
@ 2004-09-06 2:17 ` Neil Brown
2004-09-06 15:42 ` Paul Clements
2004-10-25 23:47 ` [PATCH] nfs-utils: High Availability NFS bugfixes Paul Clements
0 siblings, 2 replies; 18+ messages in thread
From: Neil Brown @ 2004-09-06 2:17 UTC (permalink / raw)
To: Paul Clements; +Cc: nfs
On Tuesday August 31, paul.clements@steeleye.com wrote:
> Neil Brown wrote:
> > On Tuesday August 31, paul.clements@steeleye.com wrote:
>
> > I added a bit to the documentation to describe the arguments passed to
> > the callout.
>
> Those look good to me.
>
>
> > Note that for statd, I dropped the third arg being passed. It seemed
> > easier than explaining in the doco why there is a fourth arg.
> > Also note that I don't remember what "my_name" is all about. Could
> > you provide a small piece of text to go there.
>
> The my_name argument is the server name, but honestly, I don't know that
> we need anything other than the client name (we really just need to know
> what files to create in the sm/sm.bak directories) so if you want to
> just drop that argument it's fine with me.
>
>
> > Finally, would you consider having "unmount" instead of "umount" as a
> > possible first arg for the mountd callout?
>
> unmount instead of umount is fine if you want to go ahead and make that
> change...
>
> Thanks,
> Paul
Ok, I have made these changes, committed them, and tagged them with
nfs-utils-1-0-6-post3.
I'm not sure if I mentioned, but I changed the SIGHUP handling so that
instead of doing the work in the signal handler, I set a flag and do
the work in the main loop (svc_run). This avoids some races.
NeilBrown
* Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-09-06 1:47 ` Neil Brown
@ 2004-09-06 12:47 ` Kedar Sovani
0 siblings, 0 replies; 18+ messages in thread
From: Kedar Sovani @ 2004-09-06 12:47 UTC (permalink / raw)
To: Neil Brown; +Cc: Olaf Kirch, Paul Clements, nfs
On Mon, 2004-09-06 at 07:17, Neil Brown wrote:
> On September 3, kedars@hotpop.com wrote:
> > NFS Server keeps state about the recent responses given to the clients,
> > in a reply cache.
> >
> > Shouldn't that state be replicated to the failed-over NFS server as well
> > ?
>
> The reply cache is an optimisation and cannot be relied upon. Copying
> it across in a fail-over would be both very expensive (it is highly
> volatile) and of little benefit.
>
> NeilBrown
Yes, migrating it across servers would be very expensive.
But isn't the reply cache there for correctness (idempotency) rather
than optimisation?
E.g.
1. NFS Client sends "create" (file) to the server.
2. Server creates the file and sends a "success" response.
3. The response is lost.
4. The Client retries the "create"
5. and the NFS Server responds from the _reply cache_ stating that the
result is true. (as the reply was cached in step 2)
Now if the server crashes after the 3rd step, and the failed-over server
starts processing requests, it will respond to step 4 above with
FILE EXISTS instead of success, which, I think, is incorrect.
Am I missing something?
Or perhaps such scenarios are rare and, when they do occur, not very
significant.
I guess this is what you meant by "of little benefit" above.
thanks,
Kedar.
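
The scenario in steps 1 to 5 can be sketched as a toy model in C. This
is a deliberately simplified illustration, not how any real NFS server
implements its reply cache: struct server, handle_create() and the
file_exists variable are hypothetical, the cache is keyed only on a
request id (xid), and "create" is assumed to fail with EEXIST when the
file is already there.

```c
#include <errno.h>

#define CACHE_SLOTS 8

/* One cached reply, keyed by request id (a simplification). */
struct reply {
    unsigned xid;
    int status;
    int valid;
};

/* Per-server volatile state: lost when that server crashes. */
struct server {
    struct reply cache[CACHE_SLOTS];
};

/* Stands in for on-disk state that survives a failover. */
int file_exists;

/*
 * Handle a "create" request. A retransmission that hits the reply
 * cache gets the original answer; otherwise the request is executed.
 */
int handle_create(struct server *s, unsigned xid)
{
    struct reply *r = &s->cache[xid % CACHE_SLOTS];
    int status;

    if (r->valid && r->xid == xid)
        return r->status;          /* replay the cached reply */

    status = file_exists ? EEXIST : 0;
    file_exists = 1;               /* the file exists after any attempt */

    r->xid = xid;
    r->status = status;
    r->valid = 1;
    return status;
}
```

With two struct server instances sharing file_exists, a retry sent to
the original server returns 0 again, while the same retry sent to a
freshly started takeover server (empty cache) returns EEXIST.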
* Re: Re: [PATCH / RFC] nfs-utils: High Availability NFS
2004-09-06 2:17 ` Neil Brown
@ 2004-09-06 15:42 ` Paul Clements
2004-10-25 23:47 ` [PATCH] nfs-utils: High Availability NFS bugfixes Paul Clements
1 sibling, 0 replies; 18+ messages in thread
From: Paul Clements @ 2004-09-06 15:42 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
Neil Brown wrote:
> Ok, I have made these changes, committed them, and tagged them with
> nfs-utils-1-0-6-post3.
Thanks Neil. I'll take a look tomorrow.
> I'm not sure if I mentioned, but I changed the SIGHUP handling so that
> instead of doing the work in the signal handler, I set a flag and do
> the work in the main loop (svc_run). This avoids some races.
That sounds fine.
--
Paul
* [PATCH] nfs-utils: High Availability NFS bugfixes
2004-09-06 2:17 ` Neil Brown
2004-09-06 15:42 ` Paul Clements
@ 2004-10-25 23:47 ` Paul Clements
1 sibling, 0 replies; 18+ messages in thread
From: Paul Clements @ 2004-10-25 23:47 UTC (permalink / raw)
To: Neil Brown; +Cc: nfs
[-- Attachment #1: Type: text/plain, Size: 527 bytes --]
A few bugfixes to go on top of the HA NFS patch that I sent a
while back:
* avoid call to waitpid() with SIGCHLD set to SIG_IGN -- we'll set
SIGCHLD to SIG_DFL before calling the HA callout and then switch it back
to its old value afterward
* fix arguments to waitpid() so that correct return value is reported in
debug message
* fix signal handler for statd to additionally do a change_state() --
this is required to ensure that clients see a state change, else they
may ignore the notification
Thanks,
Paul
[-- Attachment #2: nfs-utils-1.0.6-ha_bugfix.diff --]
[-- Type: text/plain, Size: 2197 bytes --]
diff -pur nfs-utils-1.0.6/support/include/ha-callout.h nfs-utils-1.0.6-ha_bugfix/support/include/ha-callout.h
--- nfs-utils-1.0.6/support/include/ha-callout.h 2004-10-25 12:09:37.000000000 -0400
+++ nfs-utils-1.0.6-ha_bugfix/support/include/ha-callout.h 2004-10-25 12:17:53.000000000 -0400
@@ -14,6 +14,7 @@
#define HA_CALLOUT_H
#include <sys/wait.h>
+#include <signal.h>
extern char *ha_callout_prog;
@@ -23,12 +24,18 @@ ha_callout(char *event, char *arg1, char
char buf[16]; /* should be plenty */
pid_t pid;
int ret = -1;
+ struct sigaction oldact, newact;
if (!ha_callout_prog) /* HA callout is not enabled */
return;
sprintf(buf, "%d", arg3);
+ newact.sa_handler = SIG_DFL;
+ newact.sa_flags = 0;
+ sigemptyset(&newact.sa_mask);
+ sigaction(SIGCHLD, &newact, &oldact);
+
pid = fork();
switch (pid) {
case 0: execl(ha_callout_prog, ha_callout_prog,
@@ -39,7 +46,8 @@ ha_callout(char *event, char *arg1, char
exit(2);
case -1: perror("fork");
break;
- default: ret = waitpid(pid, NULL, 0);
+ default: pid = waitpid(pid, &ret, 0);
+ sigaction(SIGCHLD, &oldact, &newact);
}
#ifdef dprintf
diff -pur nfs-utils-1.0.6/utils/statd/statd.c nfs-utils-1.0.6-ha_bugfix/utils/statd/statd.c
--- nfs-utils-1.0.6/utils/statd/statd.c 2004-10-25 12:09:37.000000000 -0400
+++ nfs-utils-1.0.6-ha_bugfix/utils/statd/statd.c 2004-10-25 12:17:53.000000000 -0400
@@ -111,7 +111,8 @@ killer (int sig)
static void
sigusr (int sig)
{
- dprintf (N_DEBUG, "Caught signal %d, re-reading notify list.", sig);
+ dprintf (N_DEBUG, "Caught signal %d, re-notifying (state %d).", sig,
+ MY_STATE);
re_notify = 1;
}
Only in nfs-utils-1.0.6-ha_bugfix/utils/statd: statd.c.orig
diff -pur nfs-utils-1.0.6/utils/statd/svc_run.c nfs-utils-1.0.6-ha_bugfix/utils/statd/svc_run.c
--- nfs-utils-1.0.6/utils/statd/svc_run.c 2004-10-25 12:09:37.000000000 -0400
+++ nfs-utils-1.0.6-ha_bugfix/utils/statd/svc_run.c 2004-10-25 12:17:53.000000000 -0400
@@ -88,6 +88,9 @@ my_svc_run(void)
if (svc_stop)
return;
if (re_notify) {
+ change_state();
+ dprintf(N_DEBUG, "Notifying...(new state %d)",
+ MY_STATE);
notify_hosts();
re_notify = 0;
}
Thread overview: 18+ messages
[not found] <4124DB86.9060505@steeleye.com>
[not found] ` <16677.22269.988036.787320@cse.unsw.edu.au>
2004-08-26 17:21 ` [PATCH / RFC] nfs-utils: High Availability NFS Paul Clements
2004-08-26 18:43 ` Paul Clements
2004-08-27 7:30 ` Olaf Kirch
2004-08-27 13:13 ` Paul Clements
2004-08-30 8:25 ` Olaf Kirch
2004-08-30 10:19 ` Greg Banks
2004-08-30 15:03 ` Paul Clements
2004-09-03 7:28 ` Kedar Sovani
2004-09-06 1:47 ` Neil Brown
2004-09-06 12:47 ` Kedar Sovani
2004-08-31 6:51 ` Neil Brown
2004-08-31 16:26 ` Paul Clements
2004-08-31 20:46 ` Paul Clements
2004-08-31 23:56 ` Neil Brown
2004-09-01 3:49 ` Paul Clements
2004-09-06 2:17 ` Neil Brown
2004-09-06 15:42 ` Paul Clements
2004-10-25 23:47 ` [PATCH] nfs-utils: High Availability NFS bugfixes Paul Clements