
Distributed Replicated Block Device (DRBD) development

* [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS
@ 2024-06-24  5:46 zhengbing.huang
  2024-06-24  5:46 ` [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters() zhengbing.huang
                   ` (10 more replies)
  0 siblings, 11 replies; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

In our network failure and drbd down testing, we found a warning in dmesg, and the drbd down process went into D state:

"kernel: drbd /unregistered/ramtest3/0 drbd103: ASSERTION device->disk_state[NOW] == D_FAILED || device->disk_state[NOW] == D_DETACHING FAILED in go_diskless"

The problem is that the wait_event is interruptible: it can be interrupted by a signal, and drbd_cleanup_device() is then called before go_diskless().

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_nl.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drbd/drbd_nl.c b/drbd/drbd_nl.c
index 530334e61..7b4539431 100644
--- a/drbd/drbd_nl.c
+++ b/drbd/drbd_nl.c
@@ -3676,7 +3676,7 @@ static int adm_detach(struct drbd_device *device, bool force, bool intentional_d
 		      const char *tag, struct sk_buff *reply_skb)
 {
 	const char *err_str = NULL;
-	int ret, retcode;
+	int retcode;
 
 	device->device_conf.intentional_diskless = intentional_diskless;
 	if (force) {
@@ -3692,19 +3692,16 @@ static int adm_detach(struct drbd_device *device, bool force, bool intentional_d
 			CS_VERBOSE | CS_WAIT_COMPLETE | CS_SERIALIZE, tag, &err_str));
 	/* D_DETACHING will transition to DISKLESS. */
 	drbd_resume_io(device);
-	ret = wait_event_interruptible(device->misc_wait,
-			get_disk_state(device) != D_DETACHING);
+	wait_event(device->misc_wait, get_disk_state(device) != D_DETACHING);
 	if (retcode >= SS_SUCCESS) {
 		/* wait for completion of drbd_ldev_destroy() */
-		wait_event_interruptible(device->misc_wait, !test_bit(GOING_DISKLESS, &device->flags));
+		wait_event(device->misc_wait, !test_bit(GOING_DISKLESS, &device->flags));
 		drbd_cleanup_device(device);
 	}
 	else
 		device->device_conf.intentional_diskless = false;
 	if (retcode == SS_IS_DISKLESS)
 		retcode = SS_NOTHING_TO_DO;
-	if (ret)
-		retcode = ERR_INTR;
 out:
 	if (err_str) {
 		drbd_msg_put_info(reply_skb, err_str);
-- 
2.27.0



* [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters()
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28  9:35   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path zhengbing.huang
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Check that ldev is not NULL before using it in drbd_reconsider_queue_parameters().

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_receiver.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drbd/drbd_receiver.c b/drbd/drbd_receiver.c
index 49e7815ed..fd07b29d7 100644
--- a/drbd/drbd_receiver.c
+++ b/drbd/drbd_receiver.c
@@ -9845,7 +9845,12 @@ static void conn_disconnect(struct drbd_connection *connection)
 		rcu_read_unlock();
 
 		peer_device_disconnected(peer_device);
-		drbd_reconsider_queue_parameters(device, device->ldev);
+		if (get_ldev(device)) {
+			drbd_reconsider_queue_parameters(device, device->ldev);
+			put_ldev(device);
+		} else {
+			drbd_reconsider_queue_parameters(device, NULL);
+		}
 
 		kref_put(&device->kref, drbd_destroy_device);
 		rcu_read_lock();
-- 
2.27.0
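
As a rough illustration of the guard this patch adds, the following is a single-threaded userspace model, not DRBD code. All `model_*` names, the `max_bio_size` field, and the plain integer counter are assumptions made for the sketch; the real get_ldev()/put_ldev() helpers differ in detail. It only shows the shape of the call site: take a reference if a backing device is attached, otherwise pass NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backing device description. */
struct model_ldev {
	int max_bio_size;
};

struct model_device {
	struct model_ldev *ldev; /* NULL once the backing device is gone */
	int local_cnt;           /* references currently held on ldev */
};

/* Take a reference only if a backing device is attached. */
static bool model_get_ldev(struct model_device *dev)
{
	if (!dev->ldev)
		return false;
	dev->local_cnt++;
	return true;
}

static void model_put_ldev(struct model_device *dev)
{
	dev->local_cnt--;
}

/* Accepts NULL for "no backing device", like the patched call site. */
static int model_reconsider(const struct model_ldev *ldev)
{
	return ldev ? ldev->max_bio_size : 0;
}

/* Shape of the patched code in conn_disconnect(). */
static int model_on_disconnect(struct model_device *dev)
{
	int limit;

	if (model_get_ldev(dev)) {
		limit = model_reconsider(dev->ldev);
		model_put_ldev(dev);
	} else {
		limit = model_reconsider(NULL);
	}
	return limit;
}
```

In this model the reference count is back at zero after the call whether or not a backing device was attached, and `dev->ldev` is never dereferenced when it is NULL.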



* [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
  2024-06-24  5:46 ` [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters() zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28  9:40   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false zhengbing.huang
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index cfbae0e78..eccd0c6ce 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -922,6 +922,7 @@ static void dtr_path_established(struct dtr_cm *cm)
 			atomic_set(&cs->active_state, PCS_INACTIVE);
 			wake_up(&cs->wq);
 		}
+		kref_put(&cm->kref, dtr_destroy_cm);
 		return;
 	}
 
-- 
2.27.0



* [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
  2024-06-24  5:46 ` [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters() zhengbing.huang
  2024-06-24  5:46 ` [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28 11:51   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED) zhengbing.huang
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index eccd0c6ce..b7ccb15d4 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1089,9 +1089,13 @@ static void dtr_cma_retry_connect_work_fn(struct work_struct *work)
 	if (err) {
 		struct dtr_path *path = container_of(cs, struct dtr_path, cs);
 		struct drbd_transport *transport = path->path.transport;
+		struct dtr_transport *rdma_transport =
+			container_of(transport, struct dtr_transport, transport);
 
 		tr_err(transport, "dtr_start_try_connect failed  %d\n", err);
-		schedule_delayed_work(&cs->retry_connect_work, HZ);
+		if (rdma_transport->active) {
+			schedule_delayed_work(&cs->retry_connect_work, HZ);
+		}
 	}
 }
 
@@ -1116,6 +1120,8 @@ static void dtr_remove_cm_from_path(struct dtr_path *path, struct dtr_cm *failed
 static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_cm)
 {
 	struct drbd_transport *transport = path->path.transport;
+	struct dtr_transport *rdma_transport =
+		container_of(transport, struct dtr_transport, transport);
 	struct dtr_connect_state *cs = &path->cs;
 	long connect_int = 10 * HZ;
 	struct net_conf *nc;
@@ -1128,7 +1134,9 @@ static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_c
 		connect_int = nc->connect_int * HZ;
 	rcu_read_unlock();
 
-	schedule_delayed_work(&cs->retry_connect_work, connect_int);
+	if (rdma_transport->active) {
+		schedule_delayed_work(&cs->retry_connect_work, connect_int);
+	}
 }
 
 static void dtr_cma_connect_work_fn(struct work_struct *work)
-- 
2.27.0



* [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED)
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (2 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28 12:07   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 06/11] drbd_transport_rdma: put kref in error path zhengbing.huang
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

We need to drain all tx completions on disconnect so that every kref held on the cm is put.

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index b7ccb15d4..9a6d15b78 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1956,9 +1956,6 @@ static void dtr_tx_cq_event_handler(struct ib_cq *cq, void *ctx)
 			err = dtr_handle_tx_cq_event(cq, cm);
 		} while (!err);
 
-		if (cm->state != DSM_CONNECTED)
-			break;
-
 		rc = ib_req_notify_cq(cq, IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
 		if (unlikely(rc < 0)) {
 			struct drbd_transport *transport = cm->path->path.transport;
-- 
2.27.0



* [PATCH 06/11] drbd_transport_rdma: put kref in error path
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (3 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED) zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28 12:12   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error zhengbing.huang
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index 9a6d15b78..c7adc87e3 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1157,6 +1157,7 @@ static void dtr_cma_connect_work_fn(struct work_struct *work)
 	kref_get(&cm->kref); /* for the path->cm pointer */
 	err = dtr_path_prepare(path, cm, true);
 	if (err) {
+		kref_put(&cm->kref, dtr_destroy_cm);
 		tr_err(transport, "dtr_path_prepare() = %d\n", err);
 		goto out;
 	}
-- 
2.27.0
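
Since this patch carries no description, here is a hedged userspace sketch of the kind of imbalance it appears to address. It is a model only: `model_kref`, `model_cm`, and `model_connect_work()` are hypothetical names, and the assumption that the `out:` label drops exactly one reference (the one held for the work item) is not confirmed by the diff context shown.

```c
#include <assert.h>

struct model_kref {
	int refcount;
};

struct model_cm {
	struct model_kref kref; /* must stay the first member, see below */
	int destroyed;          /* counts calls to the release function */
};

static void model_kref_init(struct model_kref *k)
{
	k->refcount = 1;
}

static void model_kref_get(struct model_kref *k)
{
	k->refcount++;
}

/* Drop one reference; call release() when the count reaches zero. */
static int model_kref_put(struct model_kref *k,
			  void (*release)(struct model_kref *))
{
	if (--k->refcount == 0) {
		release(k);
		return 1;
	}
	return 0;
}

static void model_destroy_cm(struct model_kref *k)
{
	/* kref is the first member, so the cast recovers the containing cm */
	struct model_cm *cm = (struct model_cm *)k;

	cm->destroyed++;
}

/*
 * prepare_err: simulated result of the prepare step.
 * put_on_error: 1 models the patched code, 0 the unpatched code.
 */
static int model_connect_work(struct model_cm *cm, int prepare_err,
			      int put_on_error)
{
	model_kref_get(&cm->kref); /* for the path->cm pointer */
	if (prepare_err) {
		if (put_on_error)
			model_kref_put(&cm->kref, model_destroy_cm);
		goto out;
	}
	return 0; /* success: the path keeps its reference */
out:
	/* assumption: the error exit drops only the work item's reference */
	model_kref_put(&cm->kref, model_destroy_cm);
	return prepare_err;
}
```

Under these assumptions, the unpatched flow leaves one reference behind after a failed prepare, so the release function never runs; with the extra put, the count reaches zero and the cm is released exactly once.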



* [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (4 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 06/11] drbd_transport_rdma: put kref in error path zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28 12:19   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop zhengbing.huang
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index c7adc87e3..77ff0055e 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -2355,8 +2355,11 @@ static int dtr_repost_tx_desc(struct dtr_cm *old_cm, struct dtr_tx_desc *tx_desc
 			return -ECONNRESET;
 
 		err = dtr_remap_tx_desc(old_cm, cm, tx_desc);
-		if (err)
+		if (err) {
+			pr_err("dtr_remap_tx_desc failed cm : %px\n", cm);
+			kref_put(&cm->kref, dtr_destroy_cm);
 			continue;
+		}
 
 		err = __dtr_post_tx_desc(cm, tx_desc);
 		if (!err) {
-- 
2.27.0



* [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (5 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28 12:36   ` Philipp Reisner
  2024-06-24  5:46 ` [PATCH 09/11] drbd_transport_rdma: introduce timeout for rdma_disocnnect zhengbing.huang
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

If the send_sig() in drbd_thread_stop() runs before the wait_for_completion_interruptible() in dtr_connect(),
dtr_connect() cannot return during a network failure.

So replace wait_for_completion_interruptible() with wait_for_completion_interruptible_timeout(), and
let dtr_connect() check the status itself.

This behavior is similar to that of the TCP transport.

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index 77ff0055e..c47b344f8 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -2996,12 +2996,21 @@ static int dtr_connect(struct drbd_transport *transport)
 {
 	struct dtr_transport *rdma_transport =
 		container_of(transport, struct dtr_transport, transport);
-	int i, err = -ENOMEM;
+	int i, err;
 
-	err = wait_for_completion_interruptible(&rdma_transport->connected);
-	if (err) {
+again:
+	if (drbd_should_abort_listening(transport)) {
+		err = -EAGAIN;
+		goto abort;
+	}
+
+	err = wait_for_completion_interruptible_timeout(&rdma_transport->connected, HZ);
+	if (err < 0) {
 		flush_signals(current);
 		goto abort;
+	} else if (err == 0) {
+		/* timed out */
+		goto again;
 	}
 
 	err = atomic_read(&rdma_transport->first_path_connect_err);
-- 
2.27.0
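
The loop introduced above (bounded wait, then re-check the abort condition) can be sketched in userspace as follows. This is a simplified model under stated assumptions: the `model_*` names and the tick counters are hypothetical, time is simulated rather than slept, and the signal case (`err < 0` in the patch) is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_EAGAIN 11

struct model_transport {
	int ticks_until_connected; /* 0 = completed now, negative = never */
	int ticks_until_abort;     /* 0 = abort requested, negative = never */
	int polls;                 /* number of bounded waits performed */
};

/*
 * Stand-in for one bounded wait: returns > 0 if the completion fired,
 * 0 if the (simulated) timeout expired first.
 */
static long model_wait_timeout(struct model_transport *t)
{
	t->polls++;
	if (t->ticks_until_connected == 0)
		return 1;
	if (t->ticks_until_connected > 0)
		t->ticks_until_connected--;
	if (t->ticks_until_abort > 0)
		t->ticks_until_abort--;
	return 0;
}

static bool model_should_abort(const struct model_transport *t)
{
	return t->ticks_until_abort == 0;
}

/* Poll with a timeout so an abort request is noticed within one period. */
static int model_connect(struct model_transport *t)
{
	for (;;) {
		if (model_should_abort(t))
			return -MODEL_EAGAIN;
		if (model_wait_timeout(t) > 0)
			return 0;
		/* timed out: loop and re-check the abort condition */
	}
}
```

The point of the design is that a stop request no longer depends on a signal arriving while the thread is already sleeping: even if the wake-up is missed, the next timeout brings the thread back to the abort check.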



* [PATCH 09/11] drbd_transport_rdma: introduce timeout for rdma_disocnnect
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (6 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-24  5:46 ` [PATCH 10/11] drbd_transport_rdma: introduce timeout for rdma_connect zhengbing.huang
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

The RDMA driver's timeout for a DREQ is too long during a network failure, so
introduce a timeout for rdma_disconnect().

On timeout we put the kref, which eventually leads to rdma_destroy_id().
That cancels all pending DREQs in the RDMA driver, so there is no use-after-free
problem in dtr_cma_event_handler().

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index c47b344f8..811f1a20a 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -2760,9 +2760,15 @@ static void __dtr_disconnect_path(struct dtr_path *path)
 	}
 
 	/* There might be a signal pending here. Not incorruptible! */
-	wait_event_timeout(cm->state_wq,
-			   !test_bit(DSB_CONNECTED, &cm->state),
-			   HZ);
+	err = wait_event_timeout(cm->state_wq,
+			   !test_bit(DSB_CONNECTED, &cm->state), 20 * HZ);
+
+	if (err == 0 && test_and_clear_bit(DSB_CONNECTED, &cm->state)) {
+		dtr_remove_cm_from_path(path, cm);
+
+		kref_put(&cm->kref, dtr_destroy_cm);
+		clear_bit(TR_ESTABLISHED, &path->path.flags);
+	}
 
 	if (test_bit(DSB_CONNECTED, &cm->state))
 		tr_warn(transport, "WARN: not properly disconnected, state = %lu\n",
-- 
2.27.0



* [PATCH 10/11] drbd_transport_rdma: introduce timeout for rdma_connect
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (7 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 09/11] drbd_transport_rdma: introduce timeout for rdma_disocnnect zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-24  5:46 ` [PATCH 11/11] drbd_transport_rdma: wake up state_wq after clear DSB_CONNECTED in dtr_tx_timeout_work_fn zhengbing.huang
  2024-06-28  9:10 ` [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS Philipp Reisner
  10 siblings, 0 replies; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index 811f1a20a..0cd639254 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -140,11 +140,13 @@ union dtr_immediate {
 
 enum dtr_state_bits {
 	DSB_CONNECT_REQ,
+	DSB_CONNECTING,
 	DSB_CONNECTED,
 	DSB_ERROR,
 };
 
 #define DSM_CONNECT_REQ   (1 << DSB_CONNECT_REQ)
+#define DSM_CONNECTING    (1 << DSB_CONNECTING)
 #define DSM_CONNECTED     (1 << DSB_CONNECTED)
 #define DSM_ERROR         (1 << DSB_ERROR)
 
@@ -1033,6 +1035,7 @@ static int dtr_cma_accept(struct dtr_listener *listener, struct rdma_cm_id *new_
 		return -EAGAIN;
 	}
 
+	set_bit(DSB_CONNECTING, &cm->state);
 	err = rdma_accept(new_cm_id, &dtr_conn_param);
 	if (err)
 		kref_put(&cm->kref, dtr_destroy_cm);
@@ -1163,6 +1166,7 @@ static void dtr_cma_connect_work_fn(struct work_struct *work)
 	}
 
 	kref_get(&cm->kref); /* Expecting RDMA_CM_EVENT_ESTABLISHED */
+	set_bit(DSB_CONNECTING, &cm->state);
 	err = rdma_connect(cm->id, &dtr_conn_param);
 	if (err) {
 		kref_put(&cm->kref, dtr_destroy_cm); /* no RDMA_CM_EVENT_ESTABLISHED */
@@ -1170,6 +1174,15 @@ static void dtr_cma_connect_work_fn(struct work_struct *work)
 		goto out;
 	}
 
+	err = wait_event_timeout(cm->state_wq,
+			   !test_bit(DSB_CONNECTING, &cm->state), 20*HZ);
+
+	if (err == 0 && test_and_clear_bit(DSB_CONNECTING, &cm->state)) {
+		kref_put(&cm->kref, dtr_destroy_cm);
+		tr_err(transport, "rdma_connect timeout\n");
+		goto out;
+	}
+
 	kref_put(&cm->kref, dtr_destroy_cm); /* for work */
 	return;
 out:
@@ -1293,6 +1306,9 @@ static int dtr_cma_event_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event
 
 	case RDMA_CM_EVENT_ESTABLISHED:
 		// pr_info("%s: RDMA_CM_EVENT_ESTABLISHED\n", cm->name);
+		if (!test_and_clear_bit(DSB_CONNECTING, &cm->state))
+			return 0;
+		wake_up(&cm->state_wq);
 		/* cm->state = DSM_CONNECTED; is set later in the work item */
 		/* This is called for active and passive connections */
 
@@ -1313,6 +1329,8 @@ static int dtr_cma_event_handler(struct rdma_cm_id *cm_id, struct rdma_cm_event
 		// pr_info("%s: RDMA_CM_EVENT_REJECTED\n", cm->name);
 		// pr_info("event = %d, status = %d\n", event->event, event->status);
 		set_bit(DSB_ERROR, &cm->state);
+		if (!test_and_clear_bit(DSB_CONNECTING, &cm->state))
+			return 0;
 
 		dtr_cma_retry_connect(cm->path, cm);
 		break;
-- 
2.27.0
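
The patch relies on test_and_clear_bit() of DSB_CONNECTING so that either the timeout path or the event handler acts on a connection attempt, but not both. A minimal userspace sketch of that arbitration, using C11 atomics as a stand-in for the kernel bit operations, could look like this. The `model_*` names and the counters are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_CONNECTING_BIT 1u

struct model_cm {
	atomic_ulong state;
	int timeouts_handled;
	int established_handled;
};

static void model_cm_init(struct model_cm *cm)
{
	atomic_init(&cm->state, 0UL);
	cm->timeouts_handled = 0;
	cm->established_handled = 0;
}

/* Atomically clear the bit and report whether it was set before. */
static bool model_test_and_clear_bit(unsigned int bit, atomic_ulong *word)
{
	unsigned long mask = 1UL << bit;
	unsigned long old = atomic_fetch_and(word, ~mask);

	return (old & mask) != 0;
}

static void model_start_connect(struct model_cm *cm)
{
	atomic_fetch_or(&cm->state, 1UL << MODEL_CONNECTING_BIT);
}

/* Timeout path: only gives up if the attempt is still pending. */
static void model_on_timeout(struct model_cm *cm)
{
	if (model_test_and_clear_bit(MODEL_CONNECTING_BIT, &cm->state))
		cm->timeouts_handled++;
}

/* Event path: ignores the event if the timeout path already won. */
static void model_on_established(struct model_cm *cm)
{
	if (!model_test_and_clear_bit(MODEL_CONNECTING_BIT, &cm->state))
		return;
	cm->established_handled++;
}
```

Whichever side clears the bit first owns the follow-up work, such as dropping the reference taken for the expected event; the other side sees the bit already cleared and backs off.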



* [PATCH 11/11] drbd_transport_rdma: wake up state_wq after clear DSB_CONNECTED in dtr_tx_timeout_work_fn
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (8 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 10/11] drbd_transport_rdma: introduce timeout for rdma_connect zhengbing.huang
@ 2024-06-24  5:46 ` zhengbing.huang
  2024-06-28  9:10 ` [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS Philipp Reisner
  10 siblings, 0 replies; 32+ messages in thread
From: zhengbing.huang @ 2024-06-24  5:46 UTC (permalink / raw)
  To: drbd-dev

From: Dongsheng Yang <dongsheng.yang@easystack.cn>

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
---
 drbd/drbd_transport_rdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
index 0cd639254..2df33af90 100644
--- a/drbd/drbd_transport_rdma.c
+++ b/drbd/drbd_transport_rdma.c
@@ -1572,6 +1572,7 @@ static void dtr_tx_timeout_work_fn(struct work_struct *work)
 	if (!test_and_clear_bit(DSB_CONNECTED, &cm->state) || !path)
 		goto out;
 
+	wake_up(&cm->state_wq);
 	transport = path->path.transport;
 	tr_warn(transport, "%pI4 - %pI4: tx timeout\n",
 		&((struct sockaddr_in *)&path->path.my_addr)->sin_addr,
-- 
2.27.0



* Re: [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS
  2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
                   ` (9 preceding siblings ...)
  2024-06-24  5:46 ` [PATCH 11/11] drbd_transport_rdma: wake up state_wq after clear DSB_CONNECTED in dtr_tx_timeout_work_fn zhengbing.huang
@ 2024-06-28  9:10 ` Philipp Reisner
  2024-07-01  2:02   ` Dongsheng Yang
  10 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28  9:10 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

First of all, thanks for contributing patches to us.
Please find my reply on the patch below the quote:

On Mon, Jun 24, 2024 at 7:52 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> In our network failure and drbd down testing, we found a warning in dmesg, and the drbd down process went into D state:
>
> "kernel: drbd /unregistered/ramtest3/0 drbd103: ASSERTION device->disk_state[NOW] == D_FAILED || device->disk_state[NOW] == D_DETACHING FAILED in go_diskless"
>
> The problem is that the wait_event is interruptible: it can be interrupted by a signal, and drbd_cleanup_device() is then called before go_diskless().
>

In this case, I suggest improving the expression in the assertion.
Improving an assertion can also mean removing that assertion.

The wait_event_interruptible() is there for a reason. Think of a
backing disk that behaves like a tar pit: a backing device that no
longer finishes IO requests. You want a way to interrupt the drbdsetup
waiting in detach.

PS: More elaborate commit messages are welcome.

best regards,
 Philipp
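
To make the ordering problem from the commit message concrete, here is a small userspace model, not DRBD code. It contrasts the pre-patch control flow, where cleanup is not gated on the result of the interruptible wait, with one hypothetical variant that keeps the wait interruptible but skips cleanup when a signal cut it short. That variant is an assumption made for illustration and was not proposed by either party in this thread; all `model_*` names are invented.

```c
#include <assert.h>
#include <stdbool.h>

enum model_disk_state { MODEL_DETACHING, MODEL_DISKLESS };

struct model_device {
	enum model_disk_state disk_state;
	bool cleaned_up_while_detaching; /* set if cleanup ran too early */
};

/*
 * Stand-in for an interruptible wait: returns 0 once the state has left
 * MODEL_DETACHING, or -1 if a (simulated) signal arrives first.
 */
static int model_wait_interruptible(struct model_device *dev,
				    bool signal_pending)
{
	if (signal_pending)
		return -1;                /* interrupted, state unchanged */
	dev->disk_state = MODEL_DISKLESS; /* worker finished going diskless */
	return 0;
}

static void model_cleanup(struct model_device *dev)
{
	if (dev->disk_state == MODEL_DETACHING)
		dev->cleaned_up_while_detaching = true;
}

/* Pre-patch shape: the wait result does not gate the cleanup. */
static int model_detach_unchecked(struct model_device *dev,
				  bool signal_pending)
{
	int ret = model_wait_interruptible(dev, signal_pending);

	model_cleanup(dev);
	return ret;
}

/* Hypothetical variant: still interruptible, but no cleanup on a signal. */
static int model_detach_checked(struct model_device *dev,
				bool signal_pending)
{
	int ret = model_wait_interruptible(dev, signal_pending);

	if (ret == 0)
		model_cleanup(dev);
	return ret;
}
```

In the model, the unchecked flow runs cleanup while the device is still detaching whenever a signal arrives, which corresponds to the situation the assertion in go_diskless complains about; the checked flow returns the interruption to the caller and leaves the device alone.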


* Re: [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters()
  2024-06-24  5:46 ` [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters() zhengbing.huang
@ 2024-06-28  9:35   ` Philipp Reisner
  0 siblings, 0 replies; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28  9:35 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Thanks.

On Mon, Jun 24, 2024 at 8:32 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> Check that ldev is not NULL before using it in drbd_reconsider_queue_parameters().
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_receiver.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drbd/drbd_receiver.c b/drbd/drbd_receiver.c
> index 49e7815ed..fd07b29d7 100644
> --- a/drbd/drbd_receiver.c
> +++ b/drbd/drbd_receiver.c
> @@ -9845,7 +9845,12 @@ static void conn_disconnect(struct drbd_connection *connection)
>                 rcu_read_unlock();
>
>                 peer_device_disconnected(peer_device);
> -               drbd_reconsider_queue_parameters(device, device->ldev);
> +               if (get_ldev(device)) {
> +                       drbd_reconsider_queue_parameters(device, device->ldev);
> +                       put_ldev(device);
> +               } else {
> +                       drbd_reconsider_queue_parameters(device, NULL);
> +               }
>
>                 kref_put(&device->kref, drbd_destroy_device);
>                 rcu_read_lock();
> --
> 2.27.0
>


* Re: [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
  2024-06-24  5:46 ` [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path zhengbing.huang
@ 2024-06-28  9:40   ` Philipp Reisner
  2024-07-01  2:07     ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28  9:40 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

Please add more information about why you think this change fixes a bug.
Have you experienced a leak of cm structs?
We got an RDMA_CM_EVENT_ESTABLISHED event. Even if DRBD does not do
anything with this cm, we still expect an RDMA_CM_EVENT_DISCONNECTED in
the future. Is there a problem in the handling of the disconnect?

best regards,
 Philipp

On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_transport_rdma.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> index cfbae0e78..eccd0c6ce 100644
> --- a/drbd/drbd_transport_rdma.c
> +++ b/drbd/drbd_transport_rdma.c
> @@ -922,6 +922,7 @@ static void dtr_path_established(struct dtr_cm *cm)
>                         atomic_set(&cs->active_state, PCS_INACTIVE);
>                         wake_up(&cs->wq);
>                 }
> +               kref_put(&cm->kref, dtr_destroy_cm);
>                 return;
>         }
>
> --
> 2.27.0
>


* Re: [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false
  2024-06-24  5:46 ` [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false zhengbing.huang
@ 2024-06-28 11:51   ` Philipp Reisner
  2024-07-01  2:11     ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28 11:51 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

Please explain what problem you are fixing with this change. Do you
have a log that shows a problem in this area? Please describe why your
proposed change improves DRBD's behavior.

best regards,
 Philipp

On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_transport_rdma.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> index eccd0c6ce..b7ccb15d4 100644
> --- a/drbd/drbd_transport_rdma.c
> +++ b/drbd/drbd_transport_rdma.c
> @@ -1089,9 +1089,13 @@ static void dtr_cma_retry_connect_work_fn(struct work_struct *work)
>         if (err) {
>                 struct dtr_path *path = container_of(cs, struct dtr_path, cs);
>                 struct drbd_transport *transport = path->path.transport;
> +               struct dtr_transport *rdma_transport =
> +                       container_of(transport, struct dtr_transport, transport);
>
>                 tr_err(transport, "dtr_start_try_connect failed  %d\n", err);
> -               schedule_delayed_work(&cs->retry_connect_work, HZ);
> +               if (rdma_transport->active) {
> +                       schedule_delayed_work(&cs->retry_connect_work, HZ);
> +               }
>         }
>  }
>
> @@ -1116,6 +1120,8 @@ static void dtr_remove_cm_from_path(struct dtr_path *path, struct dtr_cm *failed
>  static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_cm)
>  {
>         struct drbd_transport *transport = path->path.transport;
> +       struct dtr_transport *rdma_transport =
> +               container_of(transport, struct dtr_transport, transport);
>         struct dtr_connect_state *cs = &path->cs;
>         long connect_int = 10 * HZ;
>         struct net_conf *nc;
> @@ -1128,7 +1134,9 @@ static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_c
>                 connect_int = nc->connect_int * HZ;
>         rcu_read_unlock();
>
> -       schedule_delayed_work(&cs->retry_connect_work, connect_int);
> +       if (rdma_transport->active) {
> +               schedule_delayed_work(&cs->retry_connect_work, connect_int);
> +       }
>  }
>
>  static void dtr_cma_connect_work_fn(struct work_struct *work)
> --
> 2.27.0
>


* Re: [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED)
  2024-06-24  5:46 ` [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED) zhengbing.huang
@ 2024-06-28 12:07   ` Philipp Reisner
  2024-07-01  2:23     ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28 12:07 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

It appears that you are trying to fix a leak of cm structures. Is that correct?
Do you mean the reference on cm that is held because of the timer?
Please describe what the problem is, and how you are improving the situation.

In case this approach is the right solution, the patch should also change the
dtr_handle_tx_cq_event() function to type void.

best regards,
 Philipp

On Mon, Jun 24, 2024 at 8:22 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> We need to drain all tx completions on disconnect so that every kref held on the cm is put.
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_transport_rdma.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> index b7ccb15d4..9a6d15b78 100644
> --- a/drbd/drbd_transport_rdma.c
> +++ b/drbd/drbd_transport_rdma.c
> @@ -1956,9 +1956,6 @@ static void dtr_tx_cq_event_handler(struct ib_cq *cq, void *ctx)
>                         err = dtr_handle_tx_cq_event(cq, cm);
>                 } while (!err);
>
> -               if (cm->state != DSM_CONNECTED)
> -                       break;
> -
>                 rc = ib_req_notify_cq(cq, IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
>                 if (unlikely(rc < 0)) {
>                         struct drbd_transport *transport = cm->path->path.transport;
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 06/11] drbd_transport_rdma: put kref in error path
  2024-06-24  5:46 ` [PATCH 06/11] drbd_transport_rdma: put kref in error path zhengbing.huang
@ 2024-06-28 12:12   ` Philipp Reisner
  0 siblings, 0 replies; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28 12:12 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

It looks plausible, but still, please provide a description.

On Mon, Jun 24, 2024 at 7:52 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_transport_rdma.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> index 9a6d15b78..c7adc87e3 100644
> --- a/drbd/drbd_transport_rdma.c
> +++ b/drbd/drbd_transport_rdma.c
> @@ -1157,6 +1157,7 @@ static void dtr_cma_connect_work_fn(struct work_struct *work)
>         kref_get(&cm->kref); /* for the path->cm pointer */
>         err = dtr_path_prepare(path, cm, true);
>         if (err) {
> +               kref_put(&cm->kref, dtr_destroy_cm);
>                 tr_err(transport, "dtr_path_prepare() = %d\n", err);
>                 goto out;
>         }
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error
  2024-06-24  5:46 ` [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error zhengbing.huang
@ 2024-06-28 12:19   ` Philipp Reisner
  2024-07-01  2:28     ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28 12:19 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

This looks wrong. In this loop, we are trying to find a path on which
to repost this tx_desc. When the remapping fails, there is no reason
to drop a ref on the cm.

So, please provide a description of what you intend with this change.

best regards,
 Philipp

On Mon, Jun 24, 2024 at 9:27 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_transport_rdma.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> index c7adc87e3..77ff0055e 100644
> --- a/drbd/drbd_transport_rdma.c
> +++ b/drbd/drbd_transport_rdma.c
> @@ -2355,8 +2355,11 @@ static int dtr_repost_tx_desc(struct dtr_cm *old_cm, struct dtr_tx_desc *tx_desc
>                         return -ECONNRESET;
>
>                 err = dtr_remap_tx_desc(old_cm, cm, tx_desc);
> -               if (err)
> +               if (err) {
> +                       pr_err("dtr_remap_tx_desc failed cm : %px\n", cm);
> +                       kref_put(&cm->kref, dtr_destroy_cm);
>                         continue;
> +               }
>
>                 err = __dtr_post_tx_desc(cm, tx_desc);
>                 if (!err) {
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop
  2024-06-24  5:46 ` [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop zhengbing.huang
@ 2024-06-28 12:36   ` Philipp Reisner
  2024-07-01  2:30     ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-06-28 12:36 UTC (permalink / raw)
  To: zhengbing.huang; +Cc: Dongsheng Yang, drbd-dev

Hello Dongsheng,

I am repeating your description in my own words so that you can verify
I got it right:

CPU 0 executes dtr_connect() and is still before the
wait_for_completion_interruptible().
CPU 1 executes send_sig() in drbd_thread_stop().

Then you conclude that wait_for_completion_interruptible() will not
abort, because the signal
was raised before CPU 0 reached wait_for_completion_interruptible().

If that is your description, then it is wrong.
This is not how signals and the wait_event() macros work.

best regards,
 Philipp

On Mon, Jun 24, 2024 at 9:27 AM zhengbing.huang
<zhengbing.huang@easystack.cn> wrote:
>
> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> If the send_sig() in drbd_thread_stop before wait_for_completion_interruptible() in dtr_connect(),
> it can't return from dtr_connect in network failure.
>
> So replace wait_for_completion_interruptible with wait_for_completion_interruptible_timeout, and
> check status by dtr_connect() itself.
>
> This behavior is similar with tcp transport
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> ---
>  drbd/drbd_transport_rdma.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> index 77ff0055e..c47b344f8 100644
> --- a/drbd/drbd_transport_rdma.c
> +++ b/drbd/drbd_transport_rdma.c
> @@ -2996,12 +2996,21 @@ static int dtr_connect(struct drbd_transport *transport)
>  {
>         struct dtr_transport *rdma_transport =
>                 container_of(transport, struct dtr_transport, transport);
> -       int i, err = -ENOMEM;
> +       int i, err;
>
> -       err = wait_for_completion_interruptible(&rdma_transport->connected);
> -       if (err) {
> +again:
> +       if (drbd_should_abort_listening(transport)) {
> +               err = -EAGAIN;
> +               goto abort;
> +       }
> +
> +       err = wait_for_completion_interruptible_timeout(&rdma_transport->connected, HZ);
> +       if (err < 0) {
>                 flush_signals(current);
>                 goto abort;
> +       } else if (err == 0) {
> +               /* timed out */
> +               goto again;
>         }
>
>         err = atomic_read(&rdma_transport->first_path_connect_err);
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS
  2024-06-28  9:10 ` [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS Philipp Reisner
@ 2024-07-01  2:02   ` Dongsheng Yang
  2024-07-01 10:00     ` Philipp Reisner
  0 siblings, 1 reply; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:02 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Fri, 2024/6/28 at 5:10 PM, Philipp Reisner wrote:
> Hello Dongsheng,
> 
> First of all, thanks for contributing patches to us.
> Please find my reply on the patch below the quote:
> 
> On Mon, Jun 24, 2024 at 7:52 AM zhengbing.huang
> <zhengbing.huang@easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>
>> In our network failure and drbd down testing, we found warning in dmesg and drbd down process into D state:
>>
>> "kernel: drbd /unregistered/ramtest3/0 drbd103: ASSERTION device->disk_state[NOW] == D_FAILED || device->disk_state[NOW] == D_DETACHING FAILED in go_diskless"
>>
>> the problem is the wait_event is inttruptable, it could be intrupted by signal and call drbd_cleanup_device before go_diskless()
>>
> 
> In this case, I suggest improving the expression in the assertion.
> Improving an assertion can also mean removing that assertion.

Hi Philipp,
	This patchset fixes the problems found by a network failure test
script[1].
	Patch [1/11] is not just about a WARNING: the bug leaves a process stuck
in D state in wait_event(device->misc_wait, !test_bit(GOING_DISKLESS,
&device->flags)); in adm_del_minor().

Consider this sequence:

a) drbd_adm_down -> adm_detach -> change_disk_state(device, D_DETACHING...

b) This calls put_ldev(), sets GOING_DISKLESS and posts the GO_DISKLESS
work.

c) adm_detach() starts wait_event_interruptible(device->misc_wait,
			get_disk_state(device) != D_DETACHING);
but the wait can be interrupted. In that case drbd_cleanup_device() is
called and sets device->disk_state[NOW] = D_DISKLESS;

After that, the process goes on to adm_del_minor() and
wait_event(device->misc_wait, !test_bit(GOING_DISKLESS,
&device->flags)); which expects drbd_ldev_destroy to clear GOING_DISKLESS.

d) Meanwhile, the go_diskless work starts and prints the warning quoted in
the commit message. It calls change_disk_state(device, D_DISKLESS,
CS_HARD, "go-diskless", NULL); but disk_state[NOW] is already
D_DISKLESS, so it does not schedule &device->ldev_destroy_work.

As a result, the wait_event in c) will never return.
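
The sequence a) to d) can be modeled in a few lines of user-space C. This is
only a sketch under stated assumptions: the struct, the enum values and every
model_* function are hypothetical stand-ins rather than DRBD code, and the
model checks only for the detaching state where the real assertion also
accepts D_FAILED.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DRBD disk states named above. */
enum model_disk_state { MODEL_UP_TO_DATE, MODEL_DETACHING, MODEL_DISKLESS };

struct model_device {
	enum model_disk_state disk_state; /* models device->disk_state[NOW] */
	bool going_diskless;              /* models the GOING_DISKLESS flag */
	bool assertion_failed;            /* models the ASSERTION ... FAILED message */
};

/* Steps a) and b): enter the detaching state and queue the go-diskless work. */
static void model_start_detach(struct model_device *d)
{
	d->disk_state = MODEL_DETACHING;
	d->going_diskless = true;
}

/* Step c), interrupted case: cleanup forces the state to diskless early. */
static void model_cleanup_device(struct model_device *d)
{
	d->disk_state = MODEL_DISKLESS;
}

/* Step d): the queued work runs. Only a real transition to diskless
 * triggers the destroy step that clears the flag. */
static void model_go_diskless_work(struct model_device *d)
{
	if (d->disk_state != MODEL_DETACHING)
		d->assertion_failed = true;
	if (d->disk_state != MODEL_DISKLESS) {
		d->disk_state = MODEL_DISKLESS;
		d->going_diskless = false; /* models the destroy work clearing the flag */
	}
}

/* Returns true if the wait in adm_del_minor() would ever be satisfied. */
bool model_down_completes(bool interrupted)
{
	struct model_device d = { MODEL_UP_TO_DATE, false, false };

	model_start_detach(&d);
	if (interrupted)
		model_cleanup_device(&d);
	model_go_diskless_work(&d);

	assert(d.disk_state == MODEL_DISKLESS);
	assert(d.assertion_failed == interrupted);
	return !d.going_diskless;
}
```

Calling model_down_completes(false) models the uninterrupted detach, where the
flag is cleared. Calling model_down_completes(true) models the interrupted
detach, where the flag stays set and the final wait has nothing to wake it.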


[1]:
check_drbd_process() {
     ps aux | grep " D"|grep drbd
}

check_node_2_drbd_process() {
     ssh node-2 'ps aux' | grep " D"|grep drbd
}

wait_for_no_drbd_d_state() {
     count=0
     while true; do
         if check_drbd_process; then
             echo "Found drbd process in D state, sleeping for ${count} second..."
             sleep 1
             count=$((count + 1))
         else
             echo "No drbd process in D state."
             break
         fi
     done
     while true; do
         if check_node_2_drbd_process; then
             echo "Found drbd process in D state, sleeping for ${count} second..."
             sleep 1
             count=$((count + 1))
         else
             echo "No drbd process in D state."
             break
         fi
     done
}

random_sleep=$((RANDOM % 100))

ssh node-2 "ifup Bond2-roce.1469"
ifup Bond2-roce.1469

sleep 5

for i in `seq 0 9`; do
         drbdadm up ramtest${i}
         ssh node-2 "drbdadm up ramtest${i}"
done

sleep ${random_sleep}

ssh node-2 "ifdown Bond2-roce.1469"

random_sleep=$((RANDOM % 10))

for i in `seq 0 9`; do
         drbdsetup fail-io ramtest${i} &
         drbdadm down ramtest${i} &
done

sleep 10

wait_for_no_drbd_d_state
> 
> The wait_event_interruptible() is there for a reason. Think of a
> backing disk that behaves like a tar pit—a backing device that no
> longer finishes IO requests. You want a way to interrupt the drbdsetup
> waiting in detach.
> 
> PS: A bit more elaborative commit messages are welcome.
> 
> best regards,
>   Philipp
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
  2024-06-28  9:40   ` Philipp Reisner
@ 2024-07-01  2:07     ` Dongsheng Yang
  2024-07-01  2:48       ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:07 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Fri, 2024/6/28 at 5:40 PM, Philipp Reisner wrote:
> Hello Dongsheng,
> 
> Please add more information why you think this change fixes a bug.
> Have you experienced a leak of cm structs?
> We got a RDMA_CM_EVENT_ESTABLISHED event. Even if DRBD does not do
> anything with this cm, we still expect a RDMA_CM_EVENT_DISCONNECTED in
> the future. Is there a problem in the handling of the disconnect?

If dtr_path_established() takes this branch, it does not call
schedule_work(&cm->establish_work);

That means dtr_path_established_work_fn() never sets path->cm->state =
DSM_CONNECTED;, so __dtr_disconnect_path() does not call
rdma_disconnect(), and this reference is never put.
> 
> best regards,
>   Philipp
> 
> On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
> <zhengbing.huang@easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>
>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> ---
>>   drbd/drbd_transport_rdma.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> index cfbae0e78..eccd0c6ce 100644
>> --- a/drbd/drbd_transport_rdma.c
>> +++ b/drbd/drbd_transport_rdma.c
>> @@ -922,6 +922,7 @@ static void dtr_path_established(struct dtr_cm *cm)
>>                          atomic_set(&cs->active_state, PCS_INACTIVE);
>>                          wake_up(&cs->wq);
>>                  }
>> +               kref_put(&cm->kref, dtr_destroy_cm);
>>                  return;
>>          }
>>
>> --
>> 2.27.0
>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false
  2024-06-28 11:51   ` Philipp Reisner
@ 2024-07-01  2:11     ` Dongsheng Yang
  0 siblings, 0 replies; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:11 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Fri, 2024/6/28 at 7:51 PM, Philipp Reisner wrote:
> Hello Dongsheng,
> 
> Please explain what problem you are fixing with this change. Do you
> have a log that shows a problem in this area? Please describe why your
> proposed change improves DRBD's behavior.

retry_connect_work is flushed in dtr_free(), and that part is correct. But if
new work is scheduled after the flush, we hit a NULL pointer dereference in
our testing. So do not schedule new retry_connect_work when
rdma_transport->active is false; dtr_free() sets it to false before
flushing retry_connect_work.
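
A small user-space model of this ordering problem, as a sketch only: the
struct, its fields and the model_* functions are hypothetical, the flush is
modeled as "run whatever is queued right now", and the freed flag stands in
for whatever state the late work touches when it crashes.

```c
#include <assert.h>
#include <stdbool.h>

struct model_transport {
	bool active;     /* models rdma_transport->active */
	bool freed;      /* set once the teardown has finished */
	int queued;      /* number of retry work items currently queued */
	bool bad_access; /* models the crash seen when late work runs */
};

/* Models the retry work: the connect attempt fails (network is down),
 * so the work wants to queue itself again. */
static void model_retry_work(struct model_transport *t, bool check_active)
{
	if (t->freed) {
		t->bad_access = true; /* work ran after teardown */
		return;
	}
	if (!check_active || t->active)
		t->queued++;
}

/* Returns true if no work item runs after the teardown has finished. */
bool model_free_is_safe(bool check_active)
{
	struct model_transport t = { true, false, 1, false };
	int to_flush;

	/* Teardown: mark inactive first, then flush what is queued now. */
	t.active = false;
	to_flush = t.queued;
	t.queued = 0;
	while (to_flush-- > 0)
		model_retry_work(&t, check_active);
	t.freed = true;

	/* Anything re-queued during the flush fires later. */
	while (t.queued > 0) {
		t.queued--;
		model_retry_work(&t, check_active);
	}

	assert(t.queued == 0);
	return !t.bad_access;
}
```

With check_active set to false the flushed work re-queues itself and later
runs against the torn-down transport. With check_active set to true the
re-queue is skipped, which is the behavior the patch aims for.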
> 
> best regards,
>   Philipp
> 
> On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
> <zhengbing.huang@easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>
>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> ---
>>   drbd/drbd_transport_rdma.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> index eccd0c6ce..b7ccb15d4 100644
>> --- a/drbd/drbd_transport_rdma.c
>> +++ b/drbd/drbd_transport_rdma.c
>> @@ -1089,9 +1089,13 @@ static void dtr_cma_retry_connect_work_fn(struct work_struct *work)
>>          if (err) {
>>                  struct dtr_path *path = container_of(cs, struct dtr_path, cs);
>>                  struct drbd_transport *transport = path->path.transport;
>> +               struct dtr_transport *rdma_transport =
>> +                       container_of(transport, struct dtr_transport, transport);
>>
>>                  tr_err(transport, "dtr_start_try_connect failed  %d\n", err);
>> -               schedule_delayed_work(&cs->retry_connect_work, HZ);
>> +               if (rdma_transport->active) {
>> +                       schedule_delayed_work(&cs->retry_connect_work, HZ);
>> +               }
>>          }
>>   }
>>
>> @@ -1116,6 +1120,8 @@ static void dtr_remove_cm_from_path(struct dtr_path *path, struct dtr_cm *failed
>>   static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_cm)
>>   {
>>          struct drbd_transport *transport = path->path.transport;
>> +       struct dtr_transport *rdma_transport =
>> +               container_of(transport, struct dtr_transport, transport);
>>          struct dtr_connect_state *cs = &path->cs;
>>          long connect_int = 10 * HZ;
>>          struct net_conf *nc;
>> @@ -1128,7 +1134,9 @@ static void dtr_cma_retry_connect(struct dtr_path *path, struct dtr_cm *failed_c
>>                  connect_int = nc->connect_int * HZ;
>>          rcu_read_unlock();
>>
>> -       schedule_delayed_work(&cs->retry_connect_work, connect_int);
>> +       if (rdma_transport->active) {
>> +               schedule_delayed_work(&cs->retry_connect_work, connect_int);
>> +       }
>>   }
>>
>>   static void dtr_cma_connect_work_fn(struct work_struct *work)
>> --
>> 2.27.0
>>
> .
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED)
  2024-06-28 12:07   ` Philipp Reisner
@ 2024-07-01  2:23     ` Dongsheng Yang
  0 siblings, 0 replies; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:23 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Fri, 2024/6/28 at 8:07 PM, Philipp Reisner wrote:
> Hello Dongsheng,
> 
> It appears that you are trying to fix a leak of cm structures. Is that correct?

Yes. In our network failure testing, we found the drbdadm down command
hanging in dtr_free() at
wait_event(rdma_transport->cm_count_wait,!atomic_read(&rdma_transport->cm_count));

We located the leaked cm in memory and saw that its tx_descs_posted was
not 0. After more debugging we found the problem addressed in [05/11].

Consider this case:

a) Two tx_descs are posted, so tx_descs_posted is 2.

b) The first tx_desc completes; dtr_tx_cq_event_handler() is called and
enters dtr_handle_tx_cq_event().

c) The network fails and dtr_tx_timeout_work_fn() clears CONNECTED.

d) dtr_handle_tx_cq_event() returns. By this time the second tx_desc
has already completed, so we expect rc = ib_req_notify_cq(cq, IB_CQ_NEXT_COMP
| IB_CQ_REPORT_MISSED_EVENTS); to return 1 and the next iteration of the
while loop to call dtr_handle_tx_cq_event() again.

e) Instead, the handler sees that cm->state is not CONNECTED and breaks out
of the outer while loop, so the second tx_desc is never handled.
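
The case above can be replayed with a toy completion queue. This is a sketch
under assumptions: the model_* names and both structs are hypothetical, the
outer loop is assumed to repeat while the notify call reports missed events,
and the failure is injected right after the inner drain loop found the queue
empty, as in steps c) and d).

```c
#include <assert.h>
#include <stdbool.h>

struct model_cq { int ready; };  /* completions waiting to be polled */
struct model_cm {
	bool connected;          /* models cm->state == DSM_CONNECTED */
	int tx_descs_posted;     /* posted tx_descs not yet completed */
};

/* Polls one completion. Returns 0 if one was handled, 1 if the queue is empty. */
static int model_handle_tx_cq_event(struct model_cq *cq, struct model_cm *cm)
{
	if (cq->ready == 0)
		return 1;
	cq->ready--;
	cm->tx_descs_posted--;
	return 0;
}

/* Models re-arming with missed-event reporting: returns 1 when completions
 * arrived before the queue was re-armed. */
static int model_req_notify(const struct model_cq *cq)
{
	return cq->ready > 0 ? 1 : 0;
}

/* Returns how many posted tx_descs are left unhandled by the handler. */
int model_tx_descs_left(bool break_when_disconnected)
{
	struct model_cq cq = { 1 };        /* step b): first completion is ready */
	struct model_cm cm = { true, 2 };  /* step a): two tx_descs posted */
	bool failure_injected = false;
	int rc = 0;

	do {
		while (model_handle_tx_cq_event(&cq, &cm) == 0)
			; /* drain everything that is ready */

		if (!failure_injected) {
			cm.connected = false; /* step c): network failure */
			cq.ready++;           /* step d): second completion arrives */
			failure_injected = true;
		}

		if (break_when_disconnected && !cm.connected)
			break;                /* step e): the check removed by the patch */

		rc = model_req_notify(&cq);
	} while (rc > 0);

	assert(cm.tx_descs_posted >= 0);
	return cm.tx_descs_posted;
}
```

With the early break the model leaves one tx_desc unhandled; without it the
second pass of the outer loop drains the queue.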

> Do you mean the reference on cm that is held because of the timer?
> Please describe what the problem is, and how you are improving the situation.
> 
> In case this approach is the right solution, the patch should also change the
> dtr_handle_tx_cq_event() function to type void.
> 
> best regards,
>   Philipp
> 
> On Mon, Jun 24, 2024 at 8:22 AM zhengbing.huang
> <zhengbing.huang@easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>
>> We need to drain all tx in disconnect to put all kref for cm
>>
>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> ---
>>   drbd/drbd_transport_rdma.c | 3 ---
>>   1 file changed, 3 deletions(-)
>>
>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> index b7ccb15d4..9a6d15b78 100644
>> --- a/drbd/drbd_transport_rdma.c
>> +++ b/drbd/drbd_transport_rdma.c
>> @@ -1956,9 +1956,6 @@ static void dtr_tx_cq_event_handler(struct ib_cq *cq, void *ctx)
>>                          err = dtr_handle_tx_cq_event(cq, cm);
>>                  } while (!err);
>>
>> -               if (cm->state != DSM_CONNECTED)
>> -                       break;
>> -
>>                  rc = ib_req_notify_cq(cq, IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS);
>>                  if (unlikely(rc < 0)) {
>>                          struct drbd_transport *transport = cm->path->path.transport;
>> --
>> 2.27.0
>>
> .
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error
  2024-06-28 12:19   ` Philipp Reisner
@ 2024-07-01  2:28     ` Dongsheng Yang
  0 siblings, 0 replies; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:28 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Fri, 2024/6/28 at 8:19 PM, Philipp Reisner wrote:
> Hello Dongsheng,
> 
> This looks wrong. In this loop, we are trying to find a path on which
> to repost this tx_desc. When the remapping fails, there is no reason
> to drop a ref on the cm.

But dtr_select_and_get_cm_for_tx() takes a ref. If we don't put it before
continue, who will put this ref?
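
The question can be made concrete with a reference-count model of the repost
loop. This is a sketch that takes the premise of this reply as given, namely
that selecting a cm for tx takes a reference. The model_* names are
hypothetical, and the success path is simply assumed to balance its own
reference, since that part of the function is not shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PATHS 3

struct model_cm {
	int refs;       /* models cm->kref, counting only refs taken in the loop */
	bool remap_ok;  /* whether remapping the tx_desc onto this cm works */
};

/* Returns the number of references still held after the repost loop. */
int model_leaked_refs(bool put_on_remap_error)
{
	/* The first two candidates fail to remap, the third one works. */
	struct model_cm cms[MODEL_PATHS] = {
		{ 0, false }, { 0, false }, { 0, true },
	};
	int i, leaked = 0;

	for (i = 0; i < MODEL_PATHS; i++) {
		struct model_cm *cm = &cms[i];

		cm->refs++; /* models the select-and-get step */

		if (!cm->remap_ok) {
			if (put_on_remap_error)
				cm->refs--; /* the put added by the patch */
			continue;
		}

		/* Assumption: the success path drops the loop's reference. */
		cm->refs--;
		break;
	}

	for (i = 0; i < MODEL_PATHS; i++) {
		assert(cms[i].refs >= 0);
		leaked += cms[i].refs;
	}
	return leaked;
}
```

Under these assumptions each failed remap leaves one reference behind unless
the error branch puts it before continue.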
> 
> So, please provide a description what you are intending here with this change?
> 
> best regards,
>   Philipp
> 
> On Mon, Jun 24, 2024 at 9:27 AM zhengbing.huang
> <zhengbing.huang@easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>
>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> ---
>>   drbd/drbd_transport_rdma.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> index c7adc87e3..77ff0055e 100644
>> --- a/drbd/drbd_transport_rdma.c
>> +++ b/drbd/drbd_transport_rdma.c
>> @@ -2355,8 +2355,11 @@ static int dtr_repost_tx_desc(struct dtr_cm *old_cm, struct dtr_tx_desc *tx_desc
>>                          return -ECONNRESET;
>>
>>                  err = dtr_remap_tx_desc(old_cm, cm, tx_desc);
>> -               if (err)
>> +               if (err) {
>> +                       pr_err("dtr_remap_tx_desc failed cm : %px\n", cm);
>> +                       kref_put(&cm->kref, dtr_destroy_cm);
>>                          continue;
>> +               }
>>
>>                  err = __dtr_post_tx_desc(cm, tx_desc);
>>                  if (!err) {
>> --
>> 2.27.0
>>
> .
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop
  2024-06-28 12:36   ` Philipp Reisner
@ 2024-07-01  2:30     ` Dongsheng Yang
  0 siblings, 0 replies; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:30 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Fri, 2024/6/28 at 8:36 PM, Philipp Reisner wrote:
> Hello Dongsheng,
> 
> I am repeating your description in my own words so that you can verify
> I got it right:
> 
> CPU 0 executes dtr_connect() and is still before the
> wait_for_completion_interruptible().
> CPU 1 executes send_sig() in drbd_thread_stop().
> 
> Then you conclude that wait_for_completion_interruptible() will not
> abort, because the signal
> was raised before CPU 0 reached wait_for_completion_interruptible().

The problem is that dtr_prepare_connect() calls flush_signals(), so the
signal from drbd_thread_stop() can be flushed by dtr_prepare_connect().
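
A minimal model of that ordering, as a sketch only: the model_* names are
hypothetical, the completion is assumed never to fire because the network is
down, and the abort check is modeled as a plain stop flag because the thread
does not show what drbd_should_abort_listening() actually tests.

```c
#include <assert.h>
#include <stdbool.h>

enum model_result {
	MODEL_INTERRUPTED,     /* the wait returns because a signal is pending */
	MODEL_ABORTED,         /* the stop condition was noticed by polling */
	MODEL_BLOCKED_FOREVER, /* nothing is left to wake the waiter */
};

struct model_thread {
	bool signal_pending;   /* models a pending signal on the task */
	bool stop_requested;   /* models the state the abort check looks at */
};

enum model_result model_connect(bool flush_before_wait, bool poll_with_timeout)
{
	struct model_thread t = { false, false };

	/* The stop request arrives first: flag the stop and send a signal. */
	t.stop_requested = true;
	t.signal_pending = true;
	assert(t.stop_requested && t.signal_pending);

	/* Models the flush in the prepare step discarding the signal. */
	if (flush_before_wait)
		t.signal_pending = false;

	/* Patched variant: check the stop condition before each timed wait. */
	if (poll_with_timeout && t.stop_requested)
		return MODEL_ABORTED;

	/* An interruptible wait returns at once if a signal is pending. */
	if (t.signal_pending)
		return MODEL_INTERRUPTED;

	/* No signal, no completion, no timeout. */
	return MODEL_BLOCKED_FOREVER;
}
```

Without the flush the pending signal ends the wait, which matches the
objection quoted above. With the flush the signal is gone and only the
timeout-based polling of the patch notices the stop request.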
> 
> If that is your description, then it is wrong.
> This is not how signals and the wait_event() macros work.
> 
> best regards,
>   Philipp
> 
> On Mon, Jun 24, 2024 at 9:27 AM zhengbing.huang
> <zhengbing.huang@easystack.cn> wrote:
>>
>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>
>> If the send_sig() in drbd_thread_stop before wait_for_completion_interruptible() in dtr_connect(),
>> it can't return from dtr_connect in network failure.
>>
>> So replace wait_for_completion_interruptible with wait_for_completion_interruptible_timeout, and
>> check status by dtr_connect() itself.
>>
>> This behavior is similar with tcp transport
>>
>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> ---
>>   drbd/drbd_transport_rdma.c | 15 ++++++++++++---
>>   1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> index 77ff0055e..c47b344f8 100644
>> --- a/drbd/drbd_transport_rdma.c
>> +++ b/drbd/drbd_transport_rdma.c
>> @@ -2996,12 +2996,21 @@ static int dtr_connect(struct drbd_transport *transport)
>>   {
>>          struct dtr_transport *rdma_transport =
>>                  container_of(transport, struct dtr_transport, transport);
>> -       int i, err = -ENOMEM;
>> +       int i, err;
>>
>> -       err = wait_for_completion_interruptible(&rdma_transport->connected);
>> -       if (err) {
>> +again:
>> +       if (drbd_should_abort_listening(transport)) {
>> +               err = -EAGAIN;
>> +               goto abort;
>> +       }
>> +
>> +       err = wait_for_completion_interruptible_timeout(&rdma_transport->connected, HZ);
>> +       if (err < 0) {
>>                  flush_signals(current);
>>                  goto abort;
>> +       } else if (err == 0) {
>> +               /* timed out */
>> +               goto again;
>>          }
>>
>>          err = atomic_read(&rdma_transport->first_path_connect_err);
>> --
>> 2.27.0
>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
  2024-07-01  2:07     ` Dongsheng Yang
@ 2024-07-01  2:48       ` Dongsheng Yang
  2024-10-16 16:44         ` Philipp Reisner
  0 siblings, 1 reply; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-01  2:48 UTC (permalink / raw)
  To: Philipp Reisner, zhengbing.huang; +Cc: drbd-dev



On Mon, 2024/7/1 at 10:07 AM, Dongsheng Yang wrote:
> 
> 
> On Fri, 2024/6/28 at 5:40 PM, Philipp Reisner wrote:
>> Hello Dongsheng,
>>
>> Please add more information why you think this change fixes a bug.
>> Have you experienced a leak of cm structs?
>> We got a RDMA_CM_EVENT_ESTABLISHED event. Even if DRBD does not do
>> anything with this cm, we still expect a RDMA_CM_EVENT_DISCONNECTED in
>> the future. Is there a problem in the handling of the disconnect?
> 
> If dtr_path_established() go into this branch, it will not 
> schedule_work(&cm->establish_work);
> 
> That means path->cm->state = DSM_CONNECTED; will not be done in 
> dtr_path_established_work_fn(), so __dtr_disconnect_path() will not call 
> rdma_disconnect(). That means this reference will never be put.

Consider this example:
a) rdma_connect() is called and RDMA_CM_EVENT_ESTABLISHED is received.

b) The network fails and dtr_path_established() takes the error path.

c) establish_work is not scheduled.

d) The drbdadm down test hangs because the cm ref is never put.
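
Steps a) to d) can be written down as a reference-count model. This is a
sketch under assumptions: the model_* names are hypothetical, the single
reference stands for the one held on behalf of the established connection,
and the teardown is assumed to put it only when the cm reached the connected
state, as described above.

```c
#include <assert.h>
#include <stdbool.h>

struct model_cm {
	int refs;        /* models cm->kref */
	bool connected;  /* models cm->state == DSM_CONNECTED */
};

static void model_get(struct model_cm *cm)
{
	cm->refs++;
}

static void model_put(struct model_cm *cm)
{
	assert(cm->refs > 0);
	cm->refs--;
}

/* Returns the references left after teardown; 0 means the down can finish. */
int model_refs_after_teardown(bool establish_fails, bool put_in_error_path)
{
	struct model_cm cm = { 0, false };

	/* Step a): the connection is established and a reference is held. */
	model_get(&cm);

	if (establish_fails) {
		/* Steps b) and c): error path, the establish work never runs. */
		if (put_in_error_path)
			model_put(&cm); /* the put added by the patch */
	} else {
		cm.connected = true; /* the establish work marks the cm connected */
	}

	/* Teardown: only a connected cm is disconnected, and only the
	 * disconnect handling puts the reference. */
	if (cm.connected) {
		cm.connected = false;
		model_put(&cm);
	}

	return cm.refs;
}
```

In this model the normal path and the patched error path both end with zero
references, while the unpatched error path keeps one and so matches the hang
in step d).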
>>
>> best regards,
>>   Philipp
>>
>> On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
>> <zhengbing.huang@easystack.cn> wrote:
>>>
>>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>>
>>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>> ---
>>>   drbd/drbd_transport_rdma.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>>> index cfbae0e78..eccd0c6ce 100644
>>> --- a/drbd/drbd_transport_rdma.c
>>> +++ b/drbd/drbd_transport_rdma.c
>>> @@ -922,6 +922,7 @@ static void dtr_path_established(struct dtr_cm *cm)
>>>                          atomic_set(&cs->active_state, PCS_INACTIVE);
>>>                          wake_up(&cs->wq);
>>>                  }
>>> +               kref_put(&cm->kref, dtr_destroy_cm);
>>>                  return;
>>>          }
>>>
>>> -- 
>>> 2.27.0
>>>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS
  2024-07-01  2:02   ` Dongsheng Yang
@ 2024-07-01 10:00     ` Philipp Reisner
  2024-07-02  1:45       ` Dongsheng Yang
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-07-01 10:00 UTC (permalink / raw)
  To: Dongsheng Yang; +Cc: drbd-dev

Hi Dongsheng,

Thanks for all this information! That is already nearly a perfect
commit message.
Still, I am looking for a better approach to solving this problem.
Instead of making the detach uninterruptible, I suggest finding a way
to still schedule the ldev_destroy_work in this corner case.

PS: I prefer changes with 100 lines of commit message that touch 3
lines of code over 3 lines of commit message for 100 lines of code
changes.

best regards,
 Philipp

On Mon, Jul 1, 2024 at 4:02 AM Dongsheng Yang
<dongsheng.yang@easystack.cn> wrote:
>
>
>
> On Fri, 2024/6/28 at 5:10 PM, Philipp Reisner wrote:
> > Hello Dongsheng,
> >
> > First of all, thanks for contributing patches to us.
> > Please find my reply on the patch below the quote:
> >
> > On Mon, Jun 24, 2024 at 7:52 AM zhengbing.huang
> > <zhengbing.huang@easystack.cn> wrote:
> >>
> >> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
> >>
> >> In our network failure and drbd down testing, we found warning in dmesg and drbd down process into D state:
> >>
> >> "kernel: drbd /unregistered/ramtest3/0 drbd103: ASSERTION device->disk_state[NOW] == D_FAILED || device->disk_state[NOW] == D_DETACHING FAILED in go_diskless"
> >>
> >> the problem is the wait_event is inttruptable, it could be intrupted by signal and call drbd_cleanup_device before go_diskless()
> >>
> >
> > In this case, I suggest improving the expression in the assertion.
> > Improving an assertion can also mean removing that assertion.
>
> Hi Philipp,
>         This patchset is fixing the problems found by a network failure test
> script[1].
>         The [1/11] is not about just a WARNING, it will result a process with D
> state in wait_event(device->misc_wait, !test_bit(GOING_DISKLESS,
> &device->flags)); in adm_del_minor().
>
> let's think about this sequence:
>
> a) drbd_adm_down -> adm_detach -> change_disk_state(device, D_DETACHING...
>
> b) it will call put_ldev(), set GOING_DISKLESS and post a work for
> GO_DISKLESS
>
> c) adm_detach() start wait_event_interruptible(device->misc_wait,
>                         get_disk_state(device) != D_DETACHING);
> but it can be intrrupted, then call drbd_cleanup_device() to set
> device->disk_state[NOW] = D_DISKLESS;
>
> after that, it will go to adm_del_minor() and
> wait_event(device->misc_wait, !test_bit(GOING_DISKLESS,
> &device->flags)); which expects drbd_ldev_destroy to clear GOING_DISKLESS.
>
> d) on the other hand, go_diskless work start and warn on the message in
> commit message. it will do change_disk_state(device, D_DISKLESS,
> CS_HARD, "go-diskless", NULL); But the disk_state[NOW] is already
> D_DISKLESS. So it will not schedule &device->ldev_destroy_work.
>
> As a result, the wait_event in c) will never return.
>
>
> [1]:
> check_drbd_process() {
>      ps aux | grep " D"|grep drbd
> }
>
> check_node_2_drbd_process() {
>      ssh node-2 'ps aux' | grep " D"|grep drbd
> }
>
> wait_for_no_drbd_d_state() {
>      count=0
>      while true; do
>          if check_drbd_process; then
>              echo "Found drbd process in D state, sleeping for ${count}
> second..."
>              sleep 1
>              count=$((count + 1))
>          else
>              echo "No drbd process in D state."
>              break
>          fi
>      done
>      while true; do
>          if check_node_2_drbd_process; then
>              echo "Found drbd process in D state, sleeping for ${count}
> second..."
>              sleep 1
>              count=$((count + 1))
>          else
>              echo "No drbd process in D state."
>              break
>          fi
>      done
> }
>
> random_sleep=$((RANDOM % 100))
>
> ssh node-2 "ifup Bond2-roce.1469"
> ifup Bond2-roce.1469
>
> sleep 5
>
> for i in `seq 0 9`; do
>          drbdadm up ramtest${i}
>          ssh node-2 "drbdadm up ramtest${i}"
> done
>
> sleep ${random_sleep}
>
> ssh node-2 "ifdown Bond2-roce.1469"
>
> random_sleep=$((RANDOM % 10))
>
> for i in `seq 0 9`; do
>          drbdsetup fail-io ramtest${i} &
>          drbdadm down ramtest${i} &
> done
>
> sleep 10
>
> wait_for_no_drbd_d_state
> >
> > The wait_event_interruptible() is there for a reason. Think of a
> > backing disk that behaves like a tar pit—a backing device that no
> > longer finishes IO requests. You want a way to interrupt the drbdsetup
> > waiting in detach.
> >
> > PS: A bit more elaborative commit messages are welcome.
> >
> > best regards,
> >   Philipp
> >

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS
  2024-07-01 10:00     ` Philipp Reisner
@ 2024-07-02  1:45       ` Dongsheng Yang
  2024-07-03 14:31         ` [PATCH] drbd: make drbd_adm_detach() interruptible Philipp Reisner
  0 siblings, 1 reply; 32+ messages in thread
From: Dongsheng Yang @ 2024-07-02  1:45 UTC (permalink / raw)
  To: Philipp Reisner, Dongsheng Yang; +Cc: drbd-dev



On Mon, 2024/7/1 at 6:00 PM, Philipp Reisner wrote:
> Hi Dongsheng,
> 
> Thanks for all this information! That is already nearly a perfect
> commit message.
> Still, I am looking for a better approach to solving this problem.
> Instead of making the detach uninterruptible, I suggest finding a way
> to still schedule the ldev_destroy_work in this corner case.

Sounds good. We are happy to review and test your patch when
it's ready.
> 
> PS: I prefer changes with 100 lines of commit message that touches 3
> lines of code over 3 lines of commit message for 100 lines of code
> changes.

Totally understood, that's important for discussion and code maintenance.

Thanks
> 
> best regards,
>   Philipp
> 
> On Mon, Jul 1, 2024 at 4:02 AM Dongsheng Yang
> <dongsheng.yang@easystack.cn> wrote:
>>
>>
>>
>> On Fri, 2024/6/28 at 5:10 PM, Philipp Reisner wrote:
>>> Hello Dongsheng,
>>>
>>> First of all, thanks for contributing patches to us.
>>> Please find my reply on the patch below the quote:
>>>
>>> On Mon, Jun 24, 2024 at 7:52 AM zhengbing.huang
>>> <zhengbing.huang@easystack.cn> wrote:
>>>>
>>>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>>>
>>>> In our network failure and drbd down testing, we found a warning in dmesg and the drbd down process stuck in D state:
>>>>
>>>> "kernel: drbd /unregistered/ramtest3/0 drbd103: ASSERTION device->disk_state[NOW] == D_FAILED || device->disk_state[NOW] == D_DETACHING FAILED in go_diskless"
>>>>
>>>> The problem is that the wait_event is interruptible: it can be interrupted by a signal and call drbd_cleanup_device() before go_diskless().
>>>>
>>>
>>> In this case, I suggest improving the expression in the assertion.
>>> Improving an assertion can also mean removing that assertion.
>>
>> Hi Philipp,
>>          This patchset fixes the problems found by a network failure test
>> script[1].
>>          [1/11] is not just about a WARNING; it results in a process stuck
>> in D state in wait_event(device->misc_wait, !test_bit(GOING_DISKLESS,
>> &device->flags)); in adm_del_minor().
>>
>> Let's think about this sequence:
>>
>> a) drbd_adm_down -> adm_detach -> change_disk_state(device, D_DETACHING...
>>
>> b) It will call put_ldev(), set GOING_DISKLESS and post a work for
>> GO_DISKLESS.
>>
>> c) adm_detach() starts wait_event_interruptible(device->misc_wait,
>>                          get_disk_state(device) != D_DETACHING);
>> but this can be interrupted, after which drbd_cleanup_device() is called
>> and sets device->disk_state[NOW] = D_DISKLESS;
>>
>> After that, it will go to adm_del_minor() and
>> wait_event(device->misc_wait, !test_bit(GOING_DISKLESS,
>> &device->flags)); which expects drbd_ldev_destroy to clear GOING_DISKLESS.
>>
>> d) On the other hand, the go_diskless work starts and emits the warning
>> quoted in the commit message. It will do change_disk_state(device, D_DISKLESS,
>> CS_HARD, "go-diskless", NULL); but disk_state[NOW] is already
>> D_DISKLESS, so it will not schedule &device->ldev_destroy_work.
>>
>> As a result, the wait_event in c) will never return.
>>
>>
>> [1]:
>> check_drbd_process() {
>>       ps aux | grep " D"|grep drbd
>> }
>>
>> check_node_2_drbd_process() {
>>       ssh node-2 'ps aux' | grep " D"|grep drbd
>> }
>>
>> wait_for_no_drbd_d_state() {
>>       count=0
>>       while true; do
>>           if check_drbd_process; then
>>               echo "Found drbd process in D state, sleeping for ${count} second..."
>>               sleep 1
>>               count=$((count + 1))
>>           else
>>               echo "No drbd process in D state."
>>               break
>>           fi
>>       done
>>       while true; do
>>           if check_node_2_drbd_process; then
>>               echo "Found drbd process in D state, sleeping for ${count} second..."
>>               sleep 1
>>               count=$((count + 1))
>>           else
>>               echo "No drbd process in D state."
>>               break
>>           fi
>>       done
>> }
>>
>> random_sleep=$((RANDOM % 100))
>>
>> ssh node-2 "ifup Bond2-roce.1469"
>> ifup Bond2-roce.1469
>>
>> sleep 5
>>
>> for i in `seq 0 9`; do
>>           drbdadm up ramtest${i}
>>           ssh node-2 "drbdadm up ramtest${i}"
>> done
>>
>> sleep ${random_sleep}
>>
>> ssh node-2 "ifdown Bond2-roce.1469"
>>
>> random_sleep=$((RANDOM % 10))
>>
>> for i in `seq 0 9`; do
>>           drbdsetup fail-io ramtest${i} &
>>           drbdadm down ramtest${i} &
>> done
>>
>> sleep 10
>>
>> wait_for_no_drbd_d_state
>>>
>>> The wait_event_interruptible() is there for a reason. Think of a
>>> backing disk that behaves like a tar pit—a backing device that no
>>> longer finishes IO requests. You want a way to interrupt the drbdsetup
>>> waiting in detach.
>>>
>>> PS: Somewhat more elaborate commit messages are welcome.
>>>
>>> best regards,
>>>    Philipp
>>>


* [PATCH] drbd: make drbd_adm_detach() interruptible
  2024-07-02  1:45       ` Dongsheng Yang
@ 2024-07-03 14:31         ` Philipp Reisner
  2024-07-04  2:59           ` Zhengbing
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-07-03 14:31 UTC (permalink / raw)
  To: Dongsheng Yang; +Cc: Philipp Reisner, drbd-dev

If a backing device suddenly ceases delivering I/O completions, and in
reaction, the user issues a `drbdsetup detach`, the operation will
hang when it tries to write internal meta-data.

The user should have used `drbdsetup --force detach`, but it is too
late. There was no way to interrupt the hanging drbdsetup detach.

Improve the situation by making detach operations interruptible.
---
 drbd/drbd_actlog.c |  5 ++++-
 drbd/drbd_int.h    |  1 +
 drbd/drbd_state.c  | 29 +++++++++++++++++++++++++++--
 3 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/drbd/drbd_actlog.c b/drbd/drbd_actlog.c
index bc09dee2f..d6ba168ac 100644
--- a/drbd/drbd_actlog.c
+++ b/drbd/drbd_actlog.c
@@ -74,7 +74,10 @@ void wait_until_done_or_force_detached(struct drbd_device *device, struct drbd_b
 		dt = MAX_SCHEDULE_TIMEOUT;
 
 	dt = wait_event_timeout(device->misc_wait,
-			*done || test_bit(FORCE_DETACH, &device->flags), dt);
+			*done ||
+			test_bit(FORCE_DETACH, &device->flags) ||
+			test_bit(INTERRUPT_DETACH, &device->flags),
+			dt);
 	if (dt == 0) {
 		drbd_err(device, "meta-data IO operation timed out\n");
 		drbd_handle_io_error(device, DRBD_FORCE_DETACH);
diff --git a/drbd/drbd_int.h b/drbd/drbd_int.h
index 0ebd79091..8ea752edd 100644
--- a/drbd/drbd_int.h
+++ b/drbd/drbd_int.h
@@ -521,6 +521,7 @@ enum device_flag {
 	MD_NO_FUA,		/* meta data device does not support barriers,
 				   so don't even try */
 	FORCE_DETACH,		/* Force-detach from local disk, aborting any pending local IO */
+	INTERRUPT_DETACH,	/* Interrupt an ongoing detach operation */
 	NEW_CUR_UUID,		/* Create new current UUID when thawing IO or issuing local IO */
 	__NEW_CUR_UUID,		/* Set NEW_CUR_UUID as soon as state change visible */
 	WRITING_NEW_CUR_UUID,	/* Set while the new current ID gets generated. */
diff --git a/drbd/drbd_state.c b/drbd/drbd_state.c
index be1de8f06..643b2f385 100644
--- a/drbd/drbd_state.c
+++ b/drbd/drbd_state.c
@@ -924,14 +924,39 @@ void state_change_lock(struct drbd_resource *resource, unsigned long *irq_flags,
 	resource->state_change_flags = flags;
 }
 
+/* Interrupt writing meta-data */
+static void interrupt_detach(struct drbd_resource *resource, struct completion *done)
+{
+	struct drbd_device *device;
+	int vnr;
+
+	idr_for_each_entry(&resource->devices, device, vnr) {
+		if (device->disk_state[NOW] == D_DETACHING) {
+			set_bit(INTERRUPT_DETACH, &device->flags);
+			wake_up_all(&device->misc_wait);
+		}
+	}
+
+	wait_for_completion(done);
+
+	idr_for_each_entry(&resource->devices, device, vnr) {
+		if (test_bit(INTERRUPT_DETACH, &device->flags))
+			clear_bit(INTERRUPT_DETACH, &device->flags);
+	}
+}
+
 static void __state_change_unlock(struct drbd_resource *resource, unsigned long *irq_flags, struct completion *done)
 {
 	enum chg_state_flags flags = resource->state_change_flags;
 
 	resource->state_change_flags = 0;
 	write_unlock_irqrestore(&resource->state_rwlock, *irq_flags);
-	if (done && expect(resource, current != resource->worker.task))
-		wait_for_completion(done);
+	if (done && expect(resource, current != resource->worker.task)) {
+		int err = wait_for_completion_interruptible(done);
+
+		if (err == -ERESTARTSYS)
+			interrupt_detach(resource, done);
+	}
 	if ((flags & CS_SERIALIZE) && !(flags & (CS_ALREADY_SERIALIZED | CS_PREPARE)))
 		up(&resource->state_sem);
 }
-- 
2.45.2



* Re:[PATCH] drbd: make drbd_adm_detach() interruptible
  2024-07-03 14:31         ` [PATCH] drbd: make drbd_adm_detach() interruptible Philipp Reisner
@ 2024-07-04  2:59           ` Zhengbing
  0 siblings, 0 replies; 32+ messages in thread
From: Zhengbing @ 2024-07-04  2:59 UTC (permalink / raw)
  To: Philipp Reisner; +Cc: dongsheng.yang, drbd-dev


Hi Philipp,
    IIUC, this patch is meant to fix the following problem:
(1) The backing disk fails and the hardware never returns a completion.

(2) The drbdsetup detach command calls queue_after_state_change_work(resource, done, work); this work is handled in drbd_worker, but it can't finish because it is stuck in drbd_md_sync()->wait_until_done_or_force_detached().

(3) The drbdsetup detach process continues to __state_change_unlock() and hangs because the related after_state_change work has not completed.

So this patch allows the user to send a signal to the drbdsetup detach command with kill; the process then enters interrupt_detach(), which lets wait_until_done_or_force_detached() continue.

After that, w_after_state_change() can go on to complete(worker->done), which lets the drbdsetup detach process continue.

If this is what you intend, I think it fixes a different problem than [1/11] in our patchset, so we need both [1/11] and this patch, right?
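
To double check my understanding, the flow in (1) to (3) can be modeled in
userspace roughly as follows. This is only a sketch: the model_* names are
made up for illustration and are not DRBD symbols, the timeout of the
meta-data wait is ignored, and the worker is reduced to a function that is
called again after each wake-up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names are not part of DRBD. */
enum {
	MODEL_FORCE_DETACH     = 1u << 0,
	MODEL_INTERRUPT_DETACH = 1u << 1,
};

struct model_detach {
	bool io_done;        /* the meta-data I/O completion arrived */
	unsigned int flags;  /* stands in for device->flags */
	bool work_completed; /* stands in for complete(done) by the worker */
};

/* Wait condition of the meta-data wait after the patch (timeout ignored). */
static bool model_md_wait_finished(const struct model_detach *s)
{
	return s->io_done ||
	       (s->flags & MODEL_FORCE_DETACH) ||
	       (s->flags & MODEL_INTERRUPT_DETACH);
}

/* The worker can only complete its work once the meta-data wait ends. */
static void model_worker_step(struct model_detach *s)
{
	if (model_md_wait_finished(s))
		s->work_completed = true;
}

/* Returns true if the modeled drbdsetup detach gets to return. */
bool model_detach_returns(bool io_completes, bool signal_sent)
{
	struct model_detach s = { io_completes, 0, false };

	model_worker_step(&s);
	if (!s.work_completed && signal_sent) {
		/* Interrupt path: set the flag and wake the worker ... */
		s.flags |= MODEL_INTERRUPT_DETACH;
		model_worker_step(&s);
		/* ... wait for the completion, then clear the flag again. */
		s.flags &= ~MODEL_INTERRUPT_DETACH;
	}
	assert(!(s.flags & MODEL_INTERRUPT_DETACH));
	return s.work_completed;
}
```

In this model, model_detach_returns(false, false) stays stuck, while
model_detach_returns(false, true) returns, which matches the behavior I
expect from the patch.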


best regards,
    zhengbing



From: Philipp Reisner <philipp.reisner@linbit.com>
Date: 2024-07-03 22:31:35
To:  Dongsheng Yang <dongsheng.yang@linux.dev>
Cc:  "zhengbing . huang" <zhengbing.huang@easystack.cn>,drbd-dev@lists.linbit.com,Philipp Reisner <philipp.reisner@linbit.com>
Subject: [PATCH] drbd: make drbd_adm_detach() interruptible
>If a backing device suddenly ceases delivering I/O completions, and in
>reaction, the user issues a `drbdsetup detach`, the operation will
>hang when it tries to write internal meta-data.
>
>The user should have used `drbdsetup --force detach`, but it is too
>late. There was no way to interrupt the hanging drbdsetup detach.
>
>Improve the situation by making detach operations interruptible.
>---
> drbd/drbd_actlog.c |  5 ++++-
> drbd/drbd_int.h    |  1 +
> drbd/drbd_state.c  | 29 +++++++++++++++++++++++++++--
> 3 files changed, 32 insertions(+), 3 deletions(-)
>
>diff --git a/drbd/drbd_actlog.c b/drbd/drbd_actlog.c
>index bc09dee2f..d6ba168ac 100644
>--- a/drbd/drbd_actlog.c
>+++ b/drbd/drbd_actlog.c
>@@ -74,7 +74,10 @@ void wait_until_done_or_force_detached(struct drbd_device *device, struct drbd_b
> 		dt = MAX_SCHEDULE_TIMEOUT;
> 
> 	dt = wait_event_timeout(device->misc_wait,
>-			*done || test_bit(FORCE_DETACH, &device->flags), dt);
>+			*done ||
>+			test_bit(FORCE_DETACH, &device->flags) ||
>+			test_bit(INTERRUPT_DETACH, &device->flags),
>+			dt);
> 	if (dt == 0) {
> 		drbd_err(device, "meta-data IO operation timed out\n");
> 		drbd_handle_io_error(device, DRBD_FORCE_DETACH);
>diff --git a/drbd/drbd_int.h b/drbd/drbd_int.h
>index 0ebd79091..8ea752edd 100644
>--- a/drbd/drbd_int.h
>+++ b/drbd/drbd_int.h
>@@ -521,6 +521,7 @@ enum device_flag {
> 	MD_NO_FUA,		/* meta data device does not support barriers,
> 				   so don't even try */
> 	FORCE_DETACH,		/* Force-detach from local disk, aborting any pending local IO */
>+	INTERRUPT_DETACH,	/* Interrupt an ongoing detach operation */
> 	NEW_CUR_UUID,		/* Create new current UUID when thawing IO or issuing local IO */
> 	__NEW_CUR_UUID,		/* Set NEW_CUR_UUID as soon as state change visible */
> 	WRITING_NEW_CUR_UUID,	/* Set while the new current ID gets generated. */
>diff --git a/drbd/drbd_state.c b/drbd/drbd_state.c
>index be1de8f06..643b2f385 100644
>--- a/drbd/drbd_state.c
>+++ b/drbd/drbd_state.c
>@@ -924,14 +924,39 @@ void state_change_lock(struct drbd_resource *resource, unsigned long *irq_flags,
> 	resource->state_change_flags = flags;
> }
> 
>+/* Interrupt writing meta-data */
>+static void interrupt_detach(struct drbd_resource *resource, struct completion *done)
>+{
>+	struct drbd_device *device;
>+	int vnr;
>+
>+	idr_for_each_entry(&resource->devices, device, vnr) {
>+		if (device->disk_state[NOW] == D_DETACHING) {
>+			set_bit(INTERRUPT_DETACH, &device->flags);
>+			wake_up_all(&device->misc_wait);
>+		}
>+	}
>+
>+	wait_for_completion(done);
>+
>+	idr_for_each_entry(&resource->devices, device, vnr) {
>+		if (test_bit(INTERRUPT_DETACH, &device->flags))
>+			clear_bit(INTERRUPT_DETACH, &device->flags);
>+	}
>+}
>+
> static void __state_change_unlock(struct drbd_resource *resource, unsigned long *irq_flags, struct completion *done)
> {
> 	enum chg_state_flags flags = resource->state_change_flags;
> 
> 	resource->state_change_flags = 0;
> 	write_unlock_irqrestore(&resource->state_rwlock, *irq_flags);
>-	if (done && expect(resource, current != resource->worker.task))
>-		wait_for_completion(done);
>+	if (done && expect(resource, current != resource->worker.task)) {
>+		int err = wait_for_completion_interruptible(done);
>+
>+		if (err == -ERESTARTSYS)
>+			interrupt_detach(resource, done);
>+	}
> 	if ((flags & CS_SERIALIZE) && !(flags & (CS_ALREADY_SERIALIZED | CS_PREPARE)))
> 		up(&resource->state_sem);
> }
>-- 
>2.45.2
>


* Re: [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
  2024-07-01  2:48       ` Dongsheng Yang
@ 2024-10-16 16:44         ` Philipp Reisner
  2024-10-17  6:42           ` Zhengbing
  0 siblings, 1 reply; 32+ messages in thread
From: Philipp Reisner @ 2024-10-16 16:44 UTC (permalink / raw)
  To: Dongsheng Yang; +Cc: drbd-dev

Hello easystack team,

I probably fixed the (or some of the) problems you intended to fix
with this patch series.

Please see these recent commits on the master branch:

3f82aed6e drbd_transport: fix a use after free in drbd_get_listener()
1f2f11c47 rdma: remove dtr_path_established()
e38120f3d rdma: Fix leaking a cm_id object
e0f5e307e rdma: fix superficial kref_put() in error code path in dtr_cma_accept()
35a6b002c rdma: fix concurrency of activate_path() and __dtr_disconnect_path()
6ccd39432 rdma: fix an access after free() in dtr_destroy_cm()
5a711b347 rdma: fix free() of scheduled delayed_work
847aab659 rdma: remove misguided kref_get()/kref_put()

I will go over your patch series and your comments. I'm sorry I
dropped the ball on this. I was out on vacation in July, and when I
came back, 1000 things needed my attention.

Best regards,
 Philipp

On Mon, Jul 1, 2024 at 4:48 AM Dongsheng Yang
<dongsheng.yang@easystack.cn> wrote:
>
>
>
> On Monday, 2024/7/1 at 10:07 AM, Dongsheng Yang wrote:
> >
> >
> > On Friday, 2024/6/28 at 5:40 PM, Philipp Reisner wrote:
> >> Hello Dongsheng,
> >>
> >> Please add more information on why you think this change fixes a bug.
> >> Have you experienced a leak of cm structs?
> >> We got a RDMA_CM_EVENT_ESTABLISHED event. Even if DRBD does not do
> >> anything with this cm, we still expect a RDMA_CM_EVENT_DISCONNECTED in
> >> the future. Is there a problem in the handling of the disconnect?
> >
> > If dtr_path_established() goes into this branch, it will not
> > schedule_work(&cm->establish_work);
> >
> > That means path->cm->state = DSM_CONNECTED; will not be done in
> > dtr_path_established_work_fn(), so __dtr_disconnect_path() will not call
> > rdma_disconnect(). That means this reference will never be put.
>
> Let me consider this example:
> a) rdma_connect() is called and RDMA_CM_EVENT_ESTABLISHED is received.
>
> b) The network fails and dtr_path_established() goes into the error path.
>
> c) establish_work will not be scheduled.
>
> d) The drbdadm down test will hang because the cm ref is not put.
> >>
> >> best regards,
> >>   Philipp
> >>
> >> On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
> >> <zhengbing.huang@easystack.cn> wrote:
> >>>
> >>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
> >>>
> >>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
> >>> ---
> >>>   drbd/drbd_transport_rdma.c | 1 +
> >>>   1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
> >>> index cfbae0e78..eccd0c6ce 100644
> >>> --- a/drbd/drbd_transport_rdma.c
> >>> +++ b/drbd/drbd_transport_rdma.c
> >>> @@ -922,6 +922,7 @@ static void dtr_path_established(struct dtr_cm *cm)
> >>>                          atomic_set(&cs->active_state, PCS_INACTIVE);
> >>>                          wake_up(&cs->wq);
> >>>                  }
> >>> +               kref_put(&cm->kref, dtr_destroy_cm);
> >>>                  return;
> >>>          }
> >>>
> >>> --
> >>> 2.27.0
> >>>


* Re:Re: [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
  2024-10-16 16:44         ` Philipp Reisner
@ 2024-10-17  6:42           ` Zhengbing
  0 siblings, 0 replies; 32+ messages in thread
From: Zhengbing @ 2024-10-17  6:42 UTC (permalink / raw)
  To: Philipp Reisner; +Cc: Dongsheng Yang, drbd-dev


Hi Philipp,

Thank you for your reply.

We'll take a look at those patches.



Best regards,
zhengbing




From: Philipp Reisner <philipp.reisner@linbit.com>
Date: 2024-10-17 00:44:40
To:  Dongsheng Yang <dongsheng.yang@easystack.cn>
Cc:  "zhengbing.huang" <zhengbing.huang@easystack.cn>,drbd-dev@lists.linbit.com,dongsheng.yang@linux.dev
Subject: Re: [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path
>Hello easystack team,
>
>I probably fixed the (or some of the) problems you intended to fix
>with this patch series.
>
>Please see these recent commits on the master branch:
>
>3f82aed6e drbd_transport: fix a use after free in drbd_get_listener()
>1f2f11c47 rdma: remove dtr_path_established()
>e38120f3d rdma: Fix leaking a cm_id object
>e0f5e307e rdma: fix superficial kref_put() in error code path in dtr_cma_accept()
>35a6b002c rdma: fix concurrency of activate_path() and __dtr_disconnect_path()
>6ccd39432 rdma: fix an access after free() in dtr_destroy_cm()
>5a711b347 rdma: fix free() of scheduled delayed_work
>847aab659 rdma: remove misguided kref_get()/kref_put()
>
>I will go over your patch series and your comments. I'm sorry I
>dropped the ball on this. I was out on vacation in July, and when I
>came back, 1000 things needed my attention.
>
>Best regards,
> Philipp
>
>On Mon, Jul 1, 2024 at 4:48 AM Dongsheng Yang
><dongsheng.yang@easystack.cn> wrote:
>>
>>
>>
>> On Monday, 2024/7/1 at 10:07 AM, Dongsheng Yang wrote:
>> >
>> >
>> > On Friday, 2024/6/28 at 5:40 PM, Philipp Reisner wrote:
>> >> Hello Dongsheng,
>> >>
>> >> Please add more information on why you think this change fixes a bug.
>> >> Have you experienced a leak of cm structs?
>> >> We got a RDMA_CM_EVENT_ESTABLISHED event. Even if DRBD does not do
>> >> anything with this cm, we still expect a RDMA_CM_EVENT_DISCONNECTED in
>> >> the future. Is there a problem in the handling of the disconnect?
>> >
>> > If dtr_path_established() goes into this branch, it will not
>> > schedule_work(&cm->establish_work);
>> >
>> > That means path->cm->state = DSM_CONNECTED; will not be done in
>> > dtr_path_established_work_fn(), so __dtr_disconnect_path() will not call
>> > rdma_disconnect(). That means this reference will never be put.
>>
>> Let me consider this example:
>> a) rdma_connect() is called and RDMA_CM_EVENT_ESTABLISHED is received.
>>
>> b) The network fails and dtr_path_established() goes into the error path.
>>
>> c) establish_work will not be scheduled.
>>
>> d) The drbdadm down test will hang because the cm ref is not put.
>> >>
>> >> best regards,
>> >>   Philipp
>> >>
>> >> On Mon, Jun 24, 2024 at 9:28 AM zhengbing.huang
>> >> <zhengbing.huang@easystack.cn> wrote:
>> >>>
>> >>> From: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> >>>
>> >>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>> >>> ---
>> >>>   drbd/drbd_transport_rdma.c | 1 +
>> >>>   1 file changed, 1 insertion(+)
>> >>>
>> >>> diff --git a/drbd/drbd_transport_rdma.c b/drbd/drbd_transport_rdma.c
>> >>> index cfbae0e78..eccd0c6ce 100644
>> >>> --- a/drbd/drbd_transport_rdma.c
>> >>> +++ b/drbd/drbd_transport_rdma.c
>> >>> @@ -922,6 +922,7 @@ static void dtr_path_established(struct dtr_cm *cm)
>> >>>                          atomic_set(&cs->active_state, PCS_INACTIVE);
>> >>>                          wake_up(&cs->wq);
>> >>>                  }
>> >>> +               kref_put(&cm->kref, dtr_destroy_cm);
>> >>>                  return;
>> >>>          }
>> >>>
>> >>> --
>> >>> 2.27.0
>> >>>


Thread overview: 32+ messages
2024-06-24  5:46 [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS zhengbing.huang
2024-06-24  5:46 ` [PATCH 02/11] drbd_receiver: get_ldev before use device->ldev for drbd_reconsider_queue_parameters() zhengbing.huang
2024-06-28  9:35   ` Philipp Reisner
2024-06-24  5:46 ` [PATCH 03/11] drbd_transport_rdma: put kref for cm in dtr_path_established in error path zhengbing.huang
2024-06-28  9:40   ` Philipp Reisner
2024-07-01  2:07     ` Dongsheng Yang
2024-07-01  2:48       ` Dongsheng Yang
2024-10-16 16:44         ` Philipp Reisner
2024-10-17  6:42           ` Zhengbing
2024-06-24  5:46 ` [PATCH 04/11] drbd_transport_rdma: dont schedule retry_connect_work in active is false zhengbing.huang
2024-06-28 11:51   ` Philipp Reisner
2024-07-01  2:11     ` Dongsheng Yang
2024-06-24  5:46 ` [PATCH 05/11] drbd_transport_rdma: dont break in dtr_tx_cq_event_handler if (cm->state != DSM_CONNECTED) zhengbing.huang
2024-06-28 12:07   ` Philipp Reisner
2024-07-01  2:23     ` Dongsheng Yang
2024-06-24  5:46 ` [PATCH 06/11] drbd_transport_rdma: put kref in error path zhengbing.huang
2024-06-28 12:12   ` Philipp Reisner
2024-06-24  5:46 ` [PATCH 07/11] drbd_transport_rdma: put kref in dtr_remap_tx_desc error zhengbing.huang
2024-06-28 12:19   ` Philipp Reisner
2024-07-01  2:28     ` Dongsheng Yang
2024-06-24  5:46 ` [PATCH 08/11] drbd_transport_rdma: fix a race between dtr_connect and drbd_thread_stop zhengbing.huang
2024-06-28 12:36   ` Philipp Reisner
2024-07-01  2:30     ` Dongsheng Yang
2024-06-24  5:46 ` [PATCH 09/11] drbd_transport_rdma: introduce timeout for rdma_disocnnect zhengbing.huang
2024-06-24  5:46 ` [PATCH 10/11] drbd_transport_rdma: introduce timeout for rdma_connect zhengbing.huang
2024-06-24  5:46 ` [PATCH 11/11] drbd_transport_rdma: wake up state_wq after clear DSB_CONNECTED in dtr_tx_timeout_work_fn zhengbing.huang
2024-06-28  9:10 ` [PATCH 01/11] drbd_nl: dont allow detating to be inttrupted in waiting D_DETACHING to DISKLESS Philipp Reisner
2024-07-01  2:02   ` Dongsheng Yang
2024-07-01 10:00     ` Philipp Reisner
2024-07-02  1:45       ` Dongsheng Yang
2024-07-03 14:31         ` [PATCH] drbd: make drbd_adm_detach() interruptible Philipp Reisner
2024-07-04  2:59           ` Zhengbing
