* [Cluster-devel] [PATCH AUTOSEL 5.12 003/116] fs: dlm: fix debugfs dump
[not found] <20210505163125.3460440-1-sashal@kernel.org>
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 004/116] fs: dlm: fix mark setting deadlock Sasha Levin
` (6 subsequent siblings)
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit 92c48950b43f4a767388cf87709d8687151a641f ]
This patch fixes the following message which randomly pops up during
glocktop call:
seq_file: buggy .next function table_seq_next did not update position index
The issue is that seq_read_iter() in fs/seq_file.c also needs an
increment of the index in an non next record case as well which this
patch fixes otherwise seq_read_iter() will print out the above message.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/debug_fs.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/dlm/debug_fs.c b/fs/dlm/debug_fs.c
index d6bbccb0ed15..d5bd990bcab8 100644
--- a/fs/dlm/debug_fs.c
+++ b/fs/dlm/debug_fs.c
@@ -542,6 +542,7 @@ static void *table_seq_next(struct seq_file *seq, void *iter_ptr, loff_t *pos)
if (bucket >= ls->ls_rsbtbl_size) {
kfree(ri);
+ ++*pos;
return NULL;
}
tree = toss ? &ls->ls_rsbtbl[bucket].toss : &ls->ls_rsbtbl[bucket].keep;
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 004/116] fs: dlm: fix mark setting deadlock
[not found] <20210505163125.3460440-1-sashal@kernel.org>
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 003/116] fs: dlm: fix debugfs dump Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 005/116] fs: dlm: add errno handling to check callback Sasha Levin
` (5 subsequent siblings)
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit e125fbeb538e5e35a00c6c8150a5361bef34814c ]
This patch fixes an deadlock issue when dlm_lowcomms_close() is called.
When dlm_lowcomms_close() is called the clusters_root.subsys.su_mutex is
held to remove configfs items. At this time we flushing (e.g.
cancel_work_sync()) the workers of send and recv workqueue. Due the fact
that we accessing configfs items (mark values), these workers will lock
clusters_root.subsys.su_mutex as well which are already hold by
dlm_lowcomms_close() and ends in a deadlock situation.
[67170.703046] ======================================================
[67170.703965] WARNING: possible circular locking dependency detected
[67170.704758] 5.11.0-rc4+ #22 Tainted: G W
[67170.705433] ------------------------------------------------------
[67170.706228] dlm_controld/280 is trying to acquire lock:
[67170.706915] ffff9f2f475a6948 ((wq_completion)dlm_recv){+.+.}-{0:0}, at: __flush_work+0x203/0x4c0
[67170.708026]
but task is already holding lock:
[67170.708758] ffffffffa132f878 (&clusters_root.subsys.su_mutex){+.+.}-{3:3}, at: configfs_rmdir+0x29b/0x310
[67170.710016]
which lock already depends on the new lock.
The new behaviour adds the mark value to the node address configuration
which doesn't require to held the clusters_root.subsys.su_mutex by
accessing mark values in a separate datastructure. However the mark
values can be set now only after a node address was set which is the
case when the user is using dlm_controld.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/config.c | 29 ++++++++++------------------
fs/dlm/config.h | 1 -
fs/dlm/lowcomms.c | 49 ++++++++++++++++++++++++++++++++---------------
fs/dlm/lowcomms.h | 1 +
4 files changed, 45 insertions(+), 35 deletions(-)
diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 49c5f9407098..582bffa09a66 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -688,6 +688,7 @@ static ssize_t comm_mark_show(struct config_item *item, char *buf)
static ssize_t comm_mark_store(struct config_item *item, const char *buf,
size_t len)
{
+ struct dlm_comm *comm;
unsigned int mark;
int rc;
@@ -695,7 +696,15 @@ static ssize_t comm_mark_store(struct config_item *item, const char *buf,
if (rc)
return rc;
- config_item_to_comm(item)->mark = mark;
+ if (mark == 0)
+ mark = dlm_config.ci_mark;
+
+ comm = config_item_to_comm(item);
+ rc = dlm_lowcomms_nodes_set_mark(comm->nodeid, mark);
+ if (rc)
+ return rc;
+
+ comm->mark = mark;
return len;
}
@@ -870,24 +879,6 @@ int dlm_comm_seq(int nodeid, uint32_t *seq)
return 0;
}
-void dlm_comm_mark(int nodeid, unsigned int *mark)
-{
- struct dlm_comm *cm;
-
- cm = get_comm(nodeid);
- if (!cm) {
- *mark = dlm_config.ci_mark;
- return;
- }
-
- if (cm->mark)
- *mark = cm->mark;
- else
- *mark = dlm_config.ci_mark;
-
- put_comm(cm);
-}
-
int dlm_our_nodeid(void)
{
return local_comm ? local_comm->nodeid : 0;
diff --git a/fs/dlm/config.h b/fs/dlm/config.h
index c210250a2581..d2cd4bd20313 100644
--- a/fs/dlm/config.h
+++ b/fs/dlm/config.h
@@ -48,7 +48,6 @@ void dlm_config_exit(void);
int dlm_config_nodes(char *lsname, struct dlm_config_node **nodes_out,
int *count_out);
int dlm_comm_seq(int nodeid, uint32_t *seq);
-void dlm_comm_mark(int nodeid, unsigned int *mark);
int dlm_our_nodeid(void);
int dlm_our_addr(struct sockaddr_storage *addr, int num);
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 372c34ff8594..440dce99d0d9 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -116,6 +116,7 @@ struct writequeue_entry {
struct dlm_node_addr {
struct list_head list;
int nodeid;
+ int mark;
int addr_count;
int curr_addr_index;
struct sockaddr_storage *addr[DLM_MAX_ADDR_COUNT];
@@ -303,7 +304,8 @@ static int addr_compare(const struct sockaddr_storage *x,
}
static int nodeid_to_addr(int nodeid, struct sockaddr_storage *sas_out,
- struct sockaddr *sa_out, bool try_new_addr)
+ struct sockaddr *sa_out, bool try_new_addr,
+ unsigned int *mark)
{
struct sockaddr_storage sas;
struct dlm_node_addr *na;
@@ -331,6 +333,8 @@ static int nodeid_to_addr(int nodeid, struct sockaddr_storage *sas_out,
if (!na->addr_count)
return -ENOENT;
+ *mark = na->mark;
+
if (sas_out)
memcpy(sas_out, &sas, sizeof(struct sockaddr_storage));
@@ -350,7 +354,8 @@ static int nodeid_to_addr(int nodeid, struct sockaddr_storage *sas_out,
return 0;
}
-static int addr_to_nodeid(struct sockaddr_storage *addr, int *nodeid)
+static int addr_to_nodeid(struct sockaddr_storage *addr, int *nodeid,
+ unsigned int *mark)
{
struct dlm_node_addr *na;
int rv = -EEXIST;
@@ -364,6 +369,7 @@ static int addr_to_nodeid(struct sockaddr_storage *addr, int *nodeid)
for (addr_i = 0; addr_i < na->addr_count; addr_i++) {
if (addr_compare(na->addr[addr_i], addr)) {
*nodeid = na->nodeid;
+ *mark = na->mark;
rv = 0;
goto unlock;
}
@@ -412,6 +418,7 @@ int dlm_lowcomms_addr(int nodeid, struct sockaddr_storage *addr, int len)
new_node->nodeid = nodeid;
new_node->addr[0] = new_addr;
new_node->addr_count = 1;
+ new_node->mark = dlm_config.ci_mark;
list_add(&new_node->list, &dlm_node_addrs);
spin_unlock(&dlm_node_addrs_spin);
return 0;
@@ -519,6 +526,23 @@ int dlm_lowcomms_connect_node(int nodeid)
return 0;
}
+int dlm_lowcomms_nodes_set_mark(int nodeid, unsigned int mark)
+{
+ struct dlm_node_addr *na;
+
+ spin_lock(&dlm_node_addrs_spin);
+ na = find_node_addr(nodeid);
+ if (!na) {
+ spin_unlock(&dlm_node_addrs_spin);
+ return -ENOENT;
+ }
+
+ na->mark = mark;
+ spin_unlock(&dlm_node_addrs_spin);
+
+ return 0;
+}
+
static void lowcomms_error_report(struct sock *sk)
{
struct connection *con;
@@ -867,7 +891,7 @@ static int accept_from_sock(struct listen_connection *con)
/* Get the new node's NODEID */
make_sockaddr(&peeraddr, 0, &len);
- if (addr_to_nodeid(&peeraddr, &nodeid)) {
+ if (addr_to_nodeid(&peeraddr, &nodeid, &mark)) {
unsigned char *b=(unsigned char *)&peeraddr;
log_print("connect from non cluster node");
print_hex_dump_bytes("ss: ", DUMP_PREFIX_NONE,
@@ -876,9 +900,6 @@ static int accept_from_sock(struct listen_connection *con)
return -1;
}
- dlm_comm_mark(nodeid, &mark);
- sock_set_mark(newsock->sk, mark);
-
log_print("got connection from %d", nodeid);
/* Check to see if we already have a connection to this node. This
@@ -892,6 +913,8 @@ static int accept_from_sock(struct listen_connection *con)
goto accept_err;
}
+ sock_set_mark(newsock->sk, mark);
+
mutex_lock(&newcon->sock_mutex);
if (newcon->sock) {
struct connection *othercon = newcon->othercon;
@@ -1015,8 +1038,6 @@ static void sctp_connect_to_sock(struct connection *con)
struct socket *sock;
unsigned int mark;
- dlm_comm_mark(con->nodeid, &mark);
-
mutex_lock(&con->sock_mutex);
/* Some odd races can cause double-connects, ignore them */
@@ -1029,7 +1050,7 @@ static void sctp_connect_to_sock(struct connection *con)
}
memset(&daddr, 0, sizeof(daddr));
- result = nodeid_to_addr(con->nodeid, &daddr, NULL, true);
+ result = nodeid_to_addr(con->nodeid, &daddr, NULL, true, &mark);
if (result < 0) {
log_print("no address for nodeid %d", con->nodeid);
goto out;
@@ -1104,13 +1125,11 @@ static void sctp_connect_to_sock(struct connection *con)
static void tcp_connect_to_sock(struct connection *con)
{
struct sockaddr_storage saddr, src_addr;
+ unsigned int mark;
int addr_len;
struct socket *sock = NULL;
- unsigned int mark;
int result;
- dlm_comm_mark(con->nodeid, &mark);
-
mutex_lock(&con->sock_mutex);
if (con->retries++ > MAX_CONNECT_RETRIES)
goto out;
@@ -1125,15 +1144,15 @@ static void tcp_connect_to_sock(struct connection *con)
if (result < 0)
goto out_err;
- sock_set_mark(sock->sk, mark);
-
memset(&saddr, 0, sizeof(saddr));
- result = nodeid_to_addr(con->nodeid, &saddr, NULL, false);
+ result = nodeid_to_addr(con->nodeid, &saddr, NULL, false, &mark);
if (result < 0) {
log_print("no address for nodeid %d", con->nodeid);
goto out_err;
}
+ sock_set_mark(sock->sk, mark);
+
add_sock(sock, con);
/* Bind to our cluster-known address connecting to avoid
diff --git a/fs/dlm/lowcomms.h b/fs/dlm/lowcomms.h
index 0918f9376489..790d6703b17e 100644
--- a/fs/dlm/lowcomms.h
+++ b/fs/dlm/lowcomms.h
@@ -21,6 +21,7 @@ int dlm_lowcomms_close(int nodeid);
void *dlm_lowcomms_get_buffer(int nodeid, int len, gfp_t allocation, char **ppc);
void dlm_lowcomms_commit_buffer(void *mh);
int dlm_lowcomms_connect_node(int nodeid);
+int dlm_lowcomms_nodes_set_mark(int nodeid, unsigned int mark);
int dlm_lowcomms_addr(int nodeid, struct sockaddr_storage *addr, int len);
#endif /* __LOWCOMMS_DOT_H__ */
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 005/116] fs: dlm: add errno handling to check callback
[not found] <20210505163125.3460440-1-sashal@kernel.org>
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 003/116] fs: dlm: fix debugfs dump Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 004/116] fs: dlm: fix mark setting deadlock Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 006/116] fs: dlm: add check if dlm is currently running Sasha Levin
` (4 subsequent siblings)
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit 8aa9540b49e0833feba75dbf4f45babadd0ed215 ]
This allows to return individual errno values for the config attribute
check callback instead of returning invalid argument only.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/config.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 582bffa09a66..8439610c266a 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -125,7 +125,7 @@ static ssize_t cluster_cluster_name_store(struct config_item *item,
CONFIGFS_ATTR(cluster_, cluster_name);
static ssize_t cluster_set(struct dlm_cluster *cl, unsigned int *cl_field,
- int *info_field, bool (*check_cb)(unsigned int x),
+ int *info_field, int (*check_cb)(unsigned int x),
const char *buf, size_t len)
{
unsigned int x;
@@ -137,8 +137,11 @@ static ssize_t cluster_set(struct dlm_cluster *cl, unsigned int *cl_field,
if (rc)
return rc;
- if (check_cb && check_cb(x))
- return -EINVAL;
+ if (check_cb) {
+ rc = check_cb(x);
+ if (rc)
+ return rc;
+ }
*cl_field = x;
*info_field = x;
@@ -161,14 +164,20 @@ static ssize_t cluster_##name##_show(struct config_item *item, char *buf) \
} \
CONFIGFS_ATTR(cluster_, name);
-static bool dlm_check_zero(unsigned int x)
+static int dlm_check_zero(unsigned int x)
{
- return !x;
+ if (!x)
+ return -EINVAL;
+
+ return 0;
}
-static bool dlm_check_buffer_size(unsigned int x)
+static int dlm_check_buffer_size(unsigned int x)
{
- return (x < DEFAULT_BUFFER_SIZE);
+ if (x < DEFAULT_BUFFER_SIZE)
+ return -EINVAL;
+
+ return 0;
}
CLUSTER_ATTR(tcp_port, dlm_check_zero);
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 006/116] fs: dlm: add check if dlm is currently running
[not found] <20210505163125.3460440-1-sashal@kernel.org>
` (2 preceding siblings ...)
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 005/116] fs: dlm: add errno handling to check callback Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 007/116] fs: dlm: change allocation limits Sasha Levin
` (3 subsequent siblings)
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit 517461630d1c25ee4dfc1dee80973a64189d6ccf ]
This patch adds checks for dlm config attributes regarding to protocol
parameters as it makes only sense to change them when dlm is not running.
It also adds a check for valid protocol specifiers and return invalid
argument if they are not supported.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/config.c | 34 ++++++++++++++++++++++++++++++++--
fs/dlm/lowcomms.c | 2 +-
fs/dlm/lowcomms.h | 3 +++
3 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 8439610c266a..88d95d96e36c 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -164,6 +164,36 @@ static ssize_t cluster_##name##_show(struct config_item *item, char *buf) \
} \
CONFIGFS_ATTR(cluster_, name);
+static int dlm_check_protocol_and_dlm_running(unsigned int x)
+{
+ switch (x) {
+ case 0:
+ /* TCP */
+ break;
+ case 1:
+ /* SCTP */
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (dlm_allow_conn)
+ return -EBUSY;
+
+ return 0;
+}
+
+static int dlm_check_zero_and_dlm_running(unsigned int x)
+{
+ if (!x)
+ return -EINVAL;
+
+ if (dlm_allow_conn)
+ return -EBUSY;
+
+ return 0;
+}
+
static int dlm_check_zero(unsigned int x)
{
if (!x)
@@ -180,7 +210,7 @@ static int dlm_check_buffer_size(unsigned int x)
return 0;
}
-CLUSTER_ATTR(tcp_port, dlm_check_zero);
+CLUSTER_ATTR(tcp_port, dlm_check_zero_and_dlm_running);
CLUSTER_ATTR(buffer_size, dlm_check_buffer_size);
CLUSTER_ATTR(rsbtbl_size, dlm_check_zero);
CLUSTER_ATTR(recover_timer, dlm_check_zero);
@@ -188,7 +218,7 @@ CLUSTER_ATTR(toss_secs, dlm_check_zero);
CLUSTER_ATTR(scan_secs, dlm_check_zero);
CLUSTER_ATTR(log_debug, NULL);
CLUSTER_ATTR(log_info, NULL);
-CLUSTER_ATTR(protocol, NULL);
+CLUSTER_ATTR(protocol, dlm_check_protocol_and_dlm_running);
CLUSTER_ATTR(mark, NULL);
CLUSTER_ATTR(timewarn_cs, dlm_check_zero);
CLUSTER_ATTR(waitwarn_us, NULL);
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 440dce99d0d9..c438ce0ac115 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -135,7 +135,7 @@ static DEFINE_SPINLOCK(dlm_node_addrs_spin);
static struct listen_connection listen_con;
static struct sockaddr_storage *dlm_local_addr[DLM_MAX_ADDR_COUNT];
static int dlm_local_count;
-static int dlm_allow_conn;
+int dlm_allow_conn;
/* Work queues */
static struct workqueue_struct *recv_workqueue;
diff --git a/fs/dlm/lowcomms.h b/fs/dlm/lowcomms.h
index 790d6703b17e..bcd4dbd1dc98 100644
--- a/fs/dlm/lowcomms.h
+++ b/fs/dlm/lowcomms.h
@@ -14,6 +14,9 @@
#define LOWCOMMS_MAX_TX_BUFFER_LEN 4096
+/* switch to check if dlm is running */
+extern int dlm_allow_conn;
+
int dlm_lowcomms_start(void);
void dlm_lowcomms_stop(void);
void dlm_lowcomms_exit(void);
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 007/116] fs: dlm: change allocation limits
[not found] <20210505163125.3460440-1-sashal@kernel.org>
` (3 preceding siblings ...)
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 006/116] fs: dlm: add check if dlm is currently running Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 008/116] fs: dlm: check on minimum msglen size Sasha Levin
` (2 subsequent siblings)
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit c45674fbdda138814ca21138475219c96fa5aa1f ]
While running tcpkill I experienced invalid header length values while
receiving to check that a node doesn't try to send a invalid dlm message
we also check on applications minimum allocation limit. Also use
DEFAULT_BUFFER_SIZE as maximum allocation limit. The define
LOWCOMMS_MAX_TX_BUFFER_LEN is to calculate maximum buffer limits on
application layer, future midcomms layer will subtract their needs from
this define.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/lowcomms.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index c438ce0ac115..39d6418f6e89 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1374,9 +1374,11 @@ void *dlm_lowcomms_get_buffer(int nodeid, int len, gfp_t allocation, char **ppc)
struct writequeue_entry *e;
int offset = 0;
- if (len > LOWCOMMS_MAX_TX_BUFFER_LEN) {
- BUILD_BUG_ON(PAGE_SIZE < LOWCOMMS_MAX_TX_BUFFER_LEN);
+ if (len > DEFAULT_BUFFER_SIZE ||
+ len < sizeof(struct dlm_header)) {
+ BUILD_BUG_ON(PAGE_SIZE < DEFAULT_BUFFER_SIZE);
log_print("failed to allocate a buffer of size %d", len);
+ WARN_ON(1);
return NULL;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 008/116] fs: dlm: check on minimum msglen size
[not found] <20210505163125.3460440-1-sashal@kernel.org>
` (4 preceding siblings ...)
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 007/116] fs: dlm: change allocation limits Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 009/116] fs: dlm: flush swork on shutdown Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 010/116] fs: dlm: add shutdown hook Sasha Levin
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit 710176e8363f269c6ecd73d203973b31ace119d3 ]
This patch adds an additional check for minimum dlm header size which is
an invalid dlm message and signals a broken stream. A msglen field cannot
be less than the dlm header size because the field is inclusive header
lengths.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/midcomms.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index fde3a6afe4be..0bedfa8606a2 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -49,9 +49,10 @@ int dlm_process_incoming_buffer(int nodeid, unsigned char *buf, int len)
* cannot deliver this message to upper layers
*/
msglen = get_unaligned_le16(&hd->h_length);
- if (msglen > DEFAULT_BUFFER_SIZE) {
- log_print("received invalid length header: %u, will abort message parsing",
- msglen);
+ if (msglen > DEFAULT_BUFFER_SIZE ||
+ msglen < sizeof(struct dlm_header)) {
+ log_print("received invalid length header: %u from node %d, will abort message parsing",
+ msglen, nodeid);
return -EBADMSG;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 009/116] fs: dlm: flush swork on shutdown
[not found] <20210505163125.3460440-1-sashal@kernel.org>
` (5 preceding siblings ...)
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 008/116] fs: dlm: check on minimum msglen size Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 010/116] fs: dlm: add shutdown hook Sasha Levin
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit eec054b5a7cfe6d1f1598a323b05771ee99857b5 ]
This patch fixes the flushing of send work before shutdown. The function
cancel_work_sync() is not the right workqueue functionality to use here
as it would cancel the work if the work queues itself. In cases of
EAGAIN in send() for dlm message we need to be sure that everything is
send out before. The function flush_work() will ensure that every send
work is be done inclusive in EAGAIN cases.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/lowcomms.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 39d6418f6e89..2ab2a38cb3b7 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -709,10 +709,7 @@ static void shutdown_connection(struct connection *con)
{
int ret;
- if (cancel_work_sync(&con->swork)) {
- log_print("canceled swork for node %d", con->nodeid);
- clear_bit(CF_WRITE_PENDING, &con->flags);
- }
+ flush_work(&con->swork);
mutex_lock(&con->sock_mutex);
/* nothing to shutdown */
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Cluster-devel] [PATCH AUTOSEL 5.12 010/116] fs: dlm: add shutdown hook
[not found] <20210505163125.3460440-1-sashal@kernel.org>
` (6 preceding siblings ...)
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 009/116] fs: dlm: flush swork on shutdown Sasha Levin
@ 2021-05-05 16:29 ` Sasha Levin
7 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2021-05-05 16:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Alexander Aring <aahringo@redhat.com>
[ Upstream commit 9d232469bcd772dbedb9e75a165c681b920524ee ]
This patch fixes issues which occurs when dlm lowcomms synchronize their
workqueues but dlm application layer already released the lockspace. In
such cases messages like:
dlm: gfs2: release_lockspace final free
dlm: invalid lockspace 3841231384 from 1 cmd 1 type 11
are printed on the kernel log. This patch is solving this issue by
introducing a new "shutdown" hook before calling "stop" hook when the
lockspace is going to be released finally. This should pretend any
dlm messages sitting in the workqueues during or after lockspace
removal.
It's necessary to call dlm_scand_stop() as I instrumented
dlm_lowcomms_get_buffer() code to report a warning after it's called after
dlm_midcomms_shutdown() functionality, see below:
WARNING: CPU: 1 PID: 3794 at fs/dlm/midcomms.c:1003 dlm_midcomms_get_buffer+0x167/0x180
Modules linked in: joydev iTCO_wdt intel_pmc_bxt iTCO_vendor_support drm_ttm_helper ttm pcspkr serio_raw i2c_i801 i2c_smbus drm_kms_helper virtio_scsi lpc_ich virtio_balloon virtio_console xhci_pci xhci_pci_renesas cec qemu_fw_cfg drm [last unloaded: qxl]
CPU: 1 PID: 3794 Comm: dlm_scand Tainted: G W 5.11.0+ #26
Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014
RIP: 0010:dlm_midcomms_get_buffer+0x167/0x180
Code: 5d 41 5c 41 5d 41 5e 41 5f c3 0f 0b 45 31 e4 5b 5d 4c 89 e0 41 5c 41 5d 41 5e 41 5f c3 4c 89 e7 45 31 e4 e8 3b f1 ec ff eb 86 <0f> 0b 4c 89 e7 45 31 e4 e8 2c f1 ec ff e9 74 ff ff ff 0f 1f 80 00
RSP: 0018:ffffa81503f8fe60 EFLAGS: 00010202
RAX: 0000000000000008 RBX: ffff8f969827f200 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffffad1e89a0 RDI: ffff8f96a5294160
RBP: 0000000000000001 R08: 0000000000000000 R09: ffff8f96a250bc60
R10: 00000000000045d3 R11: 0000000000000000 R12: ffff8f96a250bc60
R13: ffffa81503f8fec8 R14: 0000000000000070 R15: 0000000000000c40
FS: 0000000000000000(0000) GS:ffff8f96fbc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055aa3351c000 CR3: 000000010bf22000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
dlm_scan_rsbs+0x420/0x670
? dlm_uevent+0x20/0x20
dlm_scand+0xbf/0xe0
kthread+0x13a/0x150
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x22/0x30
To synchronize all dlm scand messages we stop it right before shutdown
hook.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/dlm/lockspace.c | 20 +++++++++++---------
fs/dlm/lowcomms.c | 42 +++++++++++++++++++++++-------------------
fs/dlm/lowcomms.h | 1 +
3 files changed, 35 insertions(+), 28 deletions(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
index 561dcad08ad6..c14cf2b7faab 100644
--- a/fs/dlm/lockspace.c
+++ b/fs/dlm/lockspace.c
@@ -404,12 +404,6 @@ static int threads_start(void)
return error;
}
-static void threads_stop(void)
-{
- dlm_scand_stop();
- dlm_lowcomms_stop();
-}
-
static int new_lockspace(const char *name, const char *cluster,
uint32_t flags, int lvblen,
const struct dlm_lockspace_ops *ops, void *ops_arg,
@@ -702,8 +696,11 @@ int dlm_new_lockspace(const char *name, const char *cluster,
ls_count++;
if (error > 0)
error = 0;
- if (!ls_count)
- threads_stop();
+ if (!ls_count) {
+ dlm_scand_stop();
+ dlm_lowcomms_shutdown();
+ dlm_lowcomms_stop();
+ }
out:
mutex_unlock(&ls_lock);
return error;
@@ -788,6 +785,11 @@ static int release_lockspace(struct dlm_ls *ls, int force)
dlm_recoverd_stop(ls);
+ if (ls_count == 1) {
+ dlm_scand_stop();
+ dlm_lowcomms_shutdown();
+ }
+
dlm_callback_stop(ls);
remove_lockspace(ls);
@@ -880,7 +882,7 @@ int dlm_release_lockspace(void *lockspace, int force)
if (!error)
ls_count--;
if (!ls_count)
- threads_stop();
+ dlm_lowcomms_stop();
mutex_unlock(&ls_lock);
return error;
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 2ab2a38cb3b7..6a3660bdbb46 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1607,6 +1607,29 @@ static int work_start(void)
return 0;
}
+static void shutdown_conn(struct connection *con)
+{
+ if (con->shutdown_action)
+ con->shutdown_action(con);
+}
+
+void dlm_lowcomms_shutdown(void)
+{
+ /* Set all the flags to prevent any
+ * socket activity.
+ */
+ dlm_allow_conn = 0;
+
+ if (recv_workqueue)
+ flush_workqueue(recv_workqueue);
+ if (send_workqueue)
+ flush_workqueue(send_workqueue);
+
+ dlm_close_sock(&listen_con.sock);
+
+ foreach_conn(shutdown_conn);
+}
+
static void _stop_conn(struct connection *con, bool and_other)
{
mutex_lock(&con->sock_mutex);
@@ -1628,12 +1651,6 @@ static void stop_conn(struct connection *con)
_stop_conn(con, true);
}
-static void shutdown_conn(struct connection *con)
-{
- if (con->shutdown_action)
- con->shutdown_action(con);
-}
-
static void connection_release(struct rcu_head *rcu)
{
struct connection *con = container_of(rcu, struct connection, rcu);
@@ -1690,19 +1707,6 @@ static void work_flush(void)
void dlm_lowcomms_stop(void)
{
- /* Set all the flags to prevent any
- socket activity.
- */
- dlm_allow_conn = 0;
-
- if (recv_workqueue)
- flush_workqueue(recv_workqueue);
- if (send_workqueue)
- flush_workqueue(send_workqueue);
-
- dlm_close_sock(&listen_con.sock);
-
- foreach_conn(shutdown_conn);
work_flush();
foreach_conn(free_conn);
work_stop();
diff --git a/fs/dlm/lowcomms.h b/fs/dlm/lowcomms.h
index bcd4dbd1dc98..48bbc4e18761 100644
--- a/fs/dlm/lowcomms.h
+++ b/fs/dlm/lowcomms.h
@@ -18,6 +18,7 @@
extern int dlm_allow_conn;
int dlm_lowcomms_start(void);
+void dlm_lowcomms_shutdown(void);
void dlm_lowcomms_stop(void);
void dlm_lowcomms_exit(void);
int dlm_lowcomms_close(int nodeid);
--
2.30.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-05-05 16:29 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20210505163125.3460440-1-sashal@kernel.org>
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 003/116] fs: dlm: fix debugfs dump Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 004/116] fs: dlm: fix mark setting deadlock Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 005/116] fs: dlm: add errno handling to check callback Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 006/116] fs: dlm: add check if dlm is currently running Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 007/116] fs: dlm: change allocation limits Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 008/116] fs: dlm: check on minimum msglen size Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 009/116] fs: dlm: flush swork on shutdown Sasha Levin
2021-05-05 16:29 ` [Cluster-devel] [PATCH AUTOSEL 5.12 010/116] fs: dlm: add shutdown hook Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).