* [Cluster-devel] [PATCH dlm-tool 1/2] Revert "dlm_controld: add support for waitplock_recovery switch"
From: Alexander Aring @ 2020-09-04 14:29 UTC
To: cluster-devel.redhat.com
This reverts commit 0e9d0a6563f4acef5a27eade8eb29c7e6748c8d2.
---
dlm_controld/action.c | 5 -----
dlm_controld/dlm.conf.5 | 2 --
dlm_controld/dlm_daemon.h | 1 -
dlm_controld/main.c | 5 -----
4 files changed, 13 deletions(-)
diff --git a/dlm_controld/action.c b/dlm_controld/action.c
index bc9c44f2..9e18d286 100644
--- a/dlm_controld/action.c
+++ b/dlm_controld/action.c
@@ -881,11 +881,6 @@ int setup_configfs_options(void)
dlm_options[timewarn_ind].file_set)
set_configfs_cluster("timewarn_cs", NULL, opt(timewarn_ind));
- if (dlm_options[enable_waitplock_recovery_ind].cli_set ||
- dlm_options[enable_waitplock_recovery_ind].file_set)
- set_configfs_cluster("waitplock_recovery", NULL,
- opt(enable_waitplock_recovery_ind));
-
set_configfs_cluster("mark", NULL, optu(mark_ind));
proto_name = opts(protocol_ind);
diff --git a/dlm_controld/dlm.conf.5 b/dlm_controld/dlm.conf.5
index e92dfc8e..1ce0c644 100644
--- a/dlm_controld/dlm.conf.5
+++ b/dlm_controld/dlm.conf.5
@@ -46,8 +46,6 @@ debug_logfile
.br
enable_plock
.br
-enable_waitplock_recovery
-.br
plock_debug
.br
plock_rate_limit
diff --git a/dlm_controld/dlm_daemon.h b/dlm_controld/dlm_daemon.h
index ee21c256..0b4ae5f2 100644
--- a/dlm_controld/dlm_daemon.h
+++ b/dlm_controld/dlm_daemon.h
@@ -102,7 +102,6 @@ enum {
mark_ind,
enable_fscontrol_ind,
enable_plock_ind,
- enable_waitplock_recovery_ind,
plock_debug_ind,
plock_rate_limit_ind,
plock_ownership_ind,
diff --git a/dlm_controld/main.c b/dlm_controld/main.c
index 645bd26b..470a067c 100644
--- a/dlm_controld/main.c
+++ b/dlm_controld/main.c
@@ -1768,11 +1768,6 @@ static void set_opt_defaults(void)
1, NULL, 0,
"enable/disable posix lock support for cluster fs");
- set_opt_default(enable_waitplock_recovery_ind,
- "enable_waitplock_recovery", '\0', req_arg_bool,
- 0, NULL, 0,
- "enable/disable posix lock to wait for dlm recovery after lock acquire");
-
set_opt_default(plock_debug_ind,
"plock_debug", 'P', no_arg,
0, NULL, 0,
--
2.26.2
* [Cluster-devel] [PATCH dlm-tool 2/2] dlm_controld: set SO_RCVBUF for netlink socket
From: Alexander Aring @ 2020-09-04 14:29 UTC
To: cluster-devel.redhat.com
I saw some messages like:
1597148652 uevent recv error -1 errno 105
in a dlm_tool dump. Errno 105 is ENOBUFS, returned by a recv on an AF_NETLINK
socket. Further investigation showed that uevents are dropped in this
case; see the added comment. The error message appeared on a node that
hung inside do_uevent() in the dlm kernel code, which means that the
node was waiting for a sysfs write to "event_done". My guess is that
dlm_controld dropped some "important" messages and never wrote to
"event_done" in this case. Either way, we should try to prevent such ENOBUFS
errors on the netlink socket, which this patch attempts in a simple way.
---
dlm_controld/dlm_daemon.h | 2 ++
dlm_controld/main.c | 19 +++++++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/dlm_controld/dlm_daemon.h b/dlm_controld/dlm_daemon.h
index 0b4ae5f2..95848201 100644
--- a/dlm_controld/dlm_daemon.h
+++ b/dlm_controld/dlm_daemon.h
@@ -83,6 +83,8 @@
#define DEFAULT_LOGFILE_PRIORITY LOG_INFO
#define DEFAULT_LOGFILE LOG_FILE_PATH
+#define DEFAULT_NETLINK_RCVBUF (2 * 1024 * 1024)
+
enum {
no_arg = 0,
req_arg_bool = 1,
diff --git a/dlm_controld/main.c b/dlm_controld/main.c
index 470a067c..a82fc9c2 100644
--- a/dlm_controld/main.c
+++ b/dlm_controld/main.c
@@ -765,6 +765,7 @@ static void process_uevent(int ci)
static int setup_uevent(void)
{
struct sockaddr_nl snl;
+ int rcvbuf;
int s, rv;
s = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
@@ -773,6 +774,24 @@ static int setup_uevent(void)
return s;
}
+ /* man 7 netlink:
+ *
+ * However, reliable transmissions from kernel to user are impossible in
+ * any case. The kernel can't send a netlink message if the socket buffer
+ * is full: the message will be dropped and the kernel and the user-space
+ * process will no longer have the same view of kernel state. It is up to
+ * the application to detect when this happens (via the ENOBUFS error
+ * returned by recvmsg(2)) and resynchronize.
+ *
+ * To make ENOBUFS errors less likely we set the receive buffer to two
+ * megabytes, as other applications do. This does not ensure that we never
+ * receive ENOBUFS, but it makes it less likely. Maybe it is worth handling
+ * ENOBUFS errors in a different way in the future.
+ */
+ rcvbuf = DEFAULT_NETLINK_RCVBUF;
+ setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));
+ setsockopt(s, SOL_SOCKET, SO_RCVBUFFORCE, &rcvbuf, sizeof(rcvbuf));
+
memset(&snl, 0, sizeof(snl));
snl.nl_family = AF_NETLINK;
snl.nl_pid = getpid();
--
2.26.2