[PATCH Resend V2 1/2] md-cluster: Defer MD reloading to mddev->thread

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH Resend V2 1/2] md-cluster: Defer MD reloading to mddev->thread
@ 2015-11-29  8:57 Guoqing Jiang
  2015-11-29  8:57 ` [PATCH Resend V2 2/2] md-cluster: Protect communication with mutexes Guoqing Jiang
  0 siblings, 1 reply; 5+ messages in thread
From: Guoqing Jiang @ 2015-11-29  8:57 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, rgoldwyn

Reloading of superblock must be performed under reconfig_mutex. However,
this cannot be done with md_reload_sb because it would deadlock with
the message DLM lock. So, we defer it in md_check_recovery() which is
executed by mddev->thread.

This introduces a new flag, MD_RELOAD_SB, which if set, will reload the
superblock. And good_device_nr is also added to 'struct mddev' which is
used to get the num of the good device within cluster raid.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
---
v2 changes: remove atomic and add good_device_nr to struct mddev

 drivers/md/md-cluster.c | 4 +++-
 drivers/md/md.c         | 4 ++++
 drivers/md/md.h         | 4 ++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index db9375f..b659ef7 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -432,8 +432,10 @@ static void process_add_new_disk(struct mddev *mddev, struct cluster_msg *cmsg)
 static void process_metadata_update(struct mddev *mddev, struct cluster_msg *msg)
 {
 	struct md_cluster_info *cinfo = mddev->cluster_info;
-	md_reload_sb(mddev, le32_to_cpu(msg->raid_slot));
+	mddev->good_device_nr = le32_to_cpu(msg->raid_slot);
+	set_bit(MD_RELOAD_SB, &mddev->flags);
 	dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR);
+	md_wakeup_thread(mddev->thread);
 }
 
 static void process_remove_disk(struct mddev *mddev, struct cluster_msg *msg)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 642c87c..5ee18a0 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8273,6 +8273,7 @@ void md_check_recovery(struct mddev *mddev)
 		(mddev->flags & MD_UPDATE_SB_FLAGS & ~ (1<<MD_CHANGE_PENDING)) ||
 		test_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
 		test_bit(MD_RECOVERY_DONE, &mddev->recovery) ||
+		test_bit(MD_RELOAD_SB, &mddev->flags) ||
 		(mddev->external == 0 && mddev->safemode == 1) ||
 		(mddev->safemode == 2 && ! atomic_read(&mddev->writes_pending)
 		 && !mddev->in_sync && mddev->recovery_cp == MaxSector)
@@ -8321,6 +8322,9 @@ void md_check_recovery(struct mddev *mddev)
 						rdev->raid_disk < 0)
 					md_kick_rdev_from_array(rdev);
 			}
+
+			if (test_and_clear_bit(MD_RELOAD_SB, &mddev->flags))
+				md_reload_sb(mddev, mddev->good_device_nr);
 		}
 
 		if (!mddev->external) {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index f5b9aad..7a6da5c 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -235,6 +235,9 @@ struct mddev {
 				 */
 #define MD_JOURNAL_CLEAN 5	/* A raid with journal is already clean */
 #define MD_HAS_JOURNAL	6	/* The raid array has journal feature set */
+#define MD_RELOAD_SB	7	/* Reload the superblock because another node
+				 * updated it.
+				 */
 
 	int				suspended;
 	atomic_t			active_io;
@@ -465,6 +468,7 @@ struct mddev {
 	struct work_struct event_work;	/* used by dm to report failure event */
 	void (*sync_super)(struct mddev *mddev, struct md_rdev *rdev);
 	struct md_cluster_info		*cluster_info;
+	unsigned int			good_device_nr;	/* good device num within cluster raid */
 };
 
 static inline int __must_check mddev_lock(struct mddev *mddev)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH Resend V2 2/2] md-cluster: Protect communication with mutexes
  2015-11-29  8:57 [PATCH Resend V2 1/2] md-cluster: Defer MD reloading to mddev->thread Guoqing Jiang
@ 2015-11-29  8:57 ` Guoqing Jiang
  2015-11-29 17:43   ` [PATCH V3 " Guoqing Jiang
  0 siblings, 1 reply; 5+ messages in thread
From: Guoqing Jiang @ 2015-11-29  8:57 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, rgoldwyn

Communication can happen through multiple threads. It is possible that
one thread steps over another threads sequence. So, we use mutexes to
protect both the send and receive sequences.

Send communication is locked through state bit, MD_CLUSTER_SEND_LOCK.
Communication is locked with bit manipulation in order to allow
"lock and hold" for the add operation. In case of an add operation,
if the lock is held, MD_CLUSTER_SEND_LOCKED_ALREADY is set.
When md_update_sb() calls metadata_update_start(), it checks
(in a single statement to avoid races), if the communication
is already locked. If yes, it merely returns zero, else it
locks the token lockresource.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
v2 changes:
1. replace send_mutex with wait_queue.
2. change MD_CLUSTER_COMM_LOCKED to MD_CLUSTER_SEND_LOCKED_ALREADY,
   and set it later with add_new_disk.
3. add a new lock_token and change lock_comm to make it better
   suit for the specific cases of locking communication.

 drivers/md/md-cluster.c | 72 ++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 10 deletions(-)

diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index b659ef7..c37132a 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -48,12 +48,26 @@ struct resync_info {
 #define		MD_CLUSTER_SUSPEND_READ_BALANCING	2
 #define		MD_CLUSTER_BEGIN_JOIN_CLUSTER		3
 
+/* Lock the send communication. This is done through
+ * bit manipulation as opposed to a mutex in order to
+ * accomodate lock and hold. See next comment.
+ */
+#define		MD_CLUSTER_SEND_LOCK			4
+/* If cluster operations must lock the communication channel,
+ * so as to perform extra operations (and no other operation
+ * is allowed on the MD, such as adding a disk. Token needs
+ * to be locked and held until the operation completes with
+ * a md_update_sb(), which would eventually release the lock.
+ */
+#define		MD_CLUSTER_SEND_LOCKED_ALREADY		5
+
 
 struct md_cluster_info {
 	/* dlm lock space and resources for clustered raid. */
 	dlm_lockspace_t *lockspace;
 	int slot_number;
 	struct completion completion;
+	struct mutex recv_mutex;
 	struct dlm_lock_resource *bitmap_lockres;
 	struct dlm_lock_resource **other_bitmap_lockres;
 	struct dlm_lock_resource *resync_lockres;
@@ -68,6 +82,7 @@ struct md_cluster_info {
 	struct dlm_lock_resource *no_new_dev_lockres;
 	struct md_thread *recv_thread;
 	struct completion newdisk_completion;
+	wait_queue_head_t wait;
 	unsigned long state;
 };
 
@@ -508,6 +523,7 @@ static void recv_daemon(struct md_thread *thread)
 	struct cluster_msg msg;
 	int ret;
 
+	mutex_lock(&cinfo->recv_mutex);
 	/*get CR on Message*/
 	if (dlm_lock_sync(message_lockres, DLM_LOCK_CR)) {
 		pr_err("md/raid1:failed to get CR on MESSAGE\n");
@@ -534,33 +550,45 @@ static void recv_daemon(struct md_thread *thread)
 	ret = dlm_unlock_sync(message_lockres);
 	if (unlikely(ret != 0))
 		pr_info("unlock msg failed return %d\n", ret);
+	mutex_unlock(&cinfo->recv_mutex);
 }
 
-/* lock_comm()
+/* lock_token()
  * Takes the lock on the TOKEN lock resource so no other
  * node can communicate while the operation is underway.
- * If called again, and the TOKEN lock is alread in EX mode
- * return success. However, care must be taken that unlock_comm()
- * is called only once.
  */
-static int lock_comm(struct md_cluster_info *cinfo)
+static int lock_token(struct md_cluster_info *cinfo)
 {
 	int error;
 
-	if (cinfo->token_lockres->mode == DLM_LOCK_EX)
-		return 0;
-
 	error = dlm_lock_sync(cinfo->token_lockres, DLM_LOCK_EX);
 	if (error)
 		pr_err("md-cluster(%s:%d): failed to get EX on TOKEN (%d)\n",
 				__func__, __LINE__, error);
+
+	/* Lock the receive sequence */
+	mutex_lock(&cinfo->recv_mutex);
 	return error;
 }
 
+/* lock_comm()
+ * Sets the MD_CLUSTER_SEND_LOCK bit to lock the send channel.
+ */
+static int lock_comm(struct md_cluster_info *cinfo)
+{
+	wait_event(cinfo->wait,
+		   !test_and_set_bit(MD_CLUSTER_SEND_LOCK, &cinfo->state));
+
+	return lock_token(cinfo);
+}
+
 static void unlock_comm(struct md_cluster_info *cinfo)
 {
 	WARN_ON(cinfo->token_lockres->mode != DLM_LOCK_EX);
+	mutex_unlock(&cinfo->recv_mutex);
 	dlm_unlock_sync(cinfo->token_lockres);
+	clear_bit(MD_CLUSTER_SEND_LOCK, &cinfo->state);
+	wake_up(&cinfo->wait);
 }
 
 /* __sendmsg()
@@ -713,6 +741,8 @@ static int join(struct mddev *mddev, int nodes)
 	spin_lock_init(&cinfo->suspend_lock);
 	init_completion(&cinfo->completion);
 	set_bit(MD_CLUSTER_BEGIN_JOIN_CLUSTER, &cinfo->state);
+	init_waitqueue_head(&cinfo->wait);
+	mutex_init(&cinfo->recv_mutex);
 
 	mddev->cluster_info = cinfo;
 
@@ -843,9 +873,25 @@ static int slot_number(struct mddev *mddev)
 	return cinfo->slot_number - 1;
 }
 
+/*
+ * Check if the communication is already locked, else lock the communication
+ * channel.
+ * If it is already locked, token is in EX mode, and hence lock_token()
+ * should not be called.
+ */
 static int metadata_update_start(struct mddev *mddev)
 {
-	return lock_comm(mddev->cluster_info);
+	struct md_cluster_info *cinfo = mddev->cluster_info;
+
+	wait_event(cinfo->wait,
+		   !test_and_set_bit(MD_CLUSTER_SEND_LOCK, &cinfo->state) ||
+		   test_and_clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state));
+
+	/* If token is already locked, return 0 */
+	if (cinfo->token_lockres->mode == DLM_LOCK_EX)
+		return 0;
+
+	return lock_token(cinfo);
 }
 
 static int metadata_update_finish(struct mddev *mddev)
@@ -870,6 +916,7 @@ static int metadata_update_finish(struct mddev *mddev)
 		ret = __sendmsg(cinfo, &cmsg);
 	} else
 		pr_warn("md-cluster: No good device id found to send\n");
+	clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
 	unlock_comm(cinfo);
 	return ret;
 }
@@ -877,6 +924,7 @@ static int metadata_update_finish(struct mddev *mddev)
 static void metadata_update_cancel(struct mddev *mddev)
 {
 	struct md_cluster_info *cinfo = mddev->cluster_info;
+	clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
 	unlock_comm(cinfo);
 }
 
@@ -970,14 +1018,18 @@ static int add_new_disk(struct mddev *mddev, struct md_rdev *rdev)
 		ret = -ENOENT;
 	if (ret)
 		unlock_comm(cinfo);
-	else
+	else {
 		dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR);
+		set_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
+		wake_up(&cinfo->wait);
+	}
 	return ret;
 }
 
 static void add_new_disk_cancel(struct mddev *mddev)
 {
 	struct md_cluster_info *cinfo = mddev->cluster_info;
+	clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
 	unlock_comm(cinfo);
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH V3 2/2] md-cluster: Protect communication with mutexes
  2015-11-29  8:57 ` [PATCH Resend V2 2/2] md-cluster: Protect communication with mutexes Guoqing Jiang
@ 2015-11-29 17:43   ` Guoqing Jiang
  2015-11-30  0:58     ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Guoqing Jiang @ 2015-11-29 17:43 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, rgoldwyn

Communication can happen through multiple threads. It is possible that
one thread steps over another threads sequence. So, we use mutexes to
protect both the send and receive sequences.

Send communication is locked through state bit, MD_CLUSTER_SEND_LOCK.
Communication is locked with bit manipulation in order to allow
"lock and hold" for the add operation. In case of an add operation,
if the lock is held, MD_CLUSTER_SEND_LOCKED_ALREADY is set.
When md_update_sb() calls metadata_update_start(), it checks
(in a single statement to avoid races), if the communication
is already locked. If yes, it merely returns zero, else it
locks the token lockresource.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
v3 change: add missed mutex_unlock in recv_daemon

v2 changes:
1. replace send_mutex with wait_queue.
2. change MD_CLUSTER_COMM_LOCKED to MD_CLUSTER_SEND_LOCKED_ALREADY,
   and set it later with add_new_disk.
3. add a new lock_token and change lock_comm to make it better
   suit for the specific cases of locking communication.

 drivers/md/md-cluster.c | 73 ++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 63 insertions(+), 10 deletions(-)

diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index b659ef7..ad3ec7d 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -48,12 +48,26 @@ struct resync_info {
 #define		MD_CLUSTER_SUSPEND_READ_BALANCING	2
 #define		MD_CLUSTER_BEGIN_JOIN_CLUSTER		3
 
+/* Lock the send communication. This is done through
+ * bit manipulation as opposed to a mutex in order to
+ * accomodate lock and hold. See next comment.
+ */
+#define		MD_CLUSTER_SEND_LOCK			4
+/* If cluster operations must lock the communication channel,
+ * so as to perform extra operations (and no other operation
+ * is allowed on the MD, such as adding a disk. Token needs
+ * to be locked and held until the operation completes with
+ * a md_update_sb(), which would eventually release the lock.
+ */
+#define		MD_CLUSTER_SEND_LOCKED_ALREADY		5
+
 
 struct md_cluster_info {
 	/* dlm lock space and resources for clustered raid. */
 	dlm_lockspace_t *lockspace;
 	int slot_number;
 	struct completion completion;
+	struct mutex recv_mutex;
 	struct dlm_lock_resource *bitmap_lockres;
 	struct dlm_lock_resource **other_bitmap_lockres;
 	struct dlm_lock_resource *resync_lockres;
@@ -68,6 +82,7 @@ struct md_cluster_info {
 	struct dlm_lock_resource *no_new_dev_lockres;
 	struct md_thread *recv_thread;
 	struct completion newdisk_completion;
+	wait_queue_head_t wait;
 	unsigned long state;
 };
 
@@ -508,9 +523,11 @@ static void recv_daemon(struct md_thread *thread)
 	struct cluster_msg msg;
 	int ret;
 
+	mutex_lock(&cinfo->recv_mutex);
 	/*get CR on Message*/
 	if (dlm_lock_sync(message_lockres, DLM_LOCK_CR)) {
 		pr_err("md/raid1:failed to get CR on MESSAGE\n");
+		mutex_unlock(&cinfo->recv_mutex);
 		return;
 	}
 
@@ -534,33 +551,45 @@ static void recv_daemon(struct md_thread *thread)
 	ret = dlm_unlock_sync(message_lockres);
 	if (unlikely(ret != 0))
 		pr_info("unlock msg failed return %d\n", ret);
+	mutex_unlock(&cinfo->recv_mutex);
 }
 
-/* lock_comm()
+/* lock_token()
  * Takes the lock on the TOKEN lock resource so no other
  * node can communicate while the operation is underway.
- * If called again, and the TOKEN lock is alread in EX mode
- * return success. However, care must be taken that unlock_comm()
- * is called only once.
  */
-static int lock_comm(struct md_cluster_info *cinfo)
+static int lock_token(struct md_cluster_info *cinfo)
 {
 	int error;
 
-	if (cinfo->token_lockres->mode == DLM_LOCK_EX)
-		return 0;
-
 	error = dlm_lock_sync(cinfo->token_lockres, DLM_LOCK_EX);
 	if (error)
 		pr_err("md-cluster(%s:%d): failed to get EX on TOKEN (%d)\n",
 				__func__, __LINE__, error);
+
+	/* Lock the receive sequence */
+	mutex_lock(&cinfo->recv_mutex);
 	return error;
 }
 
+/* lock_comm()
+ * Sets the MD_CLUSTER_SEND_LOCK bit to lock the send channel.
+ */
+static int lock_comm(struct md_cluster_info *cinfo)
+{
+	wait_event(cinfo->wait,
+		   !test_and_set_bit(MD_CLUSTER_SEND_LOCK, &cinfo->state));
+
+	return lock_token(cinfo);
+}
+
 static void unlock_comm(struct md_cluster_info *cinfo)
 {
 	WARN_ON(cinfo->token_lockres->mode != DLM_LOCK_EX);
+	mutex_unlock(&cinfo->recv_mutex);
 	dlm_unlock_sync(cinfo->token_lockres);
+	clear_bit(MD_CLUSTER_SEND_LOCK, &cinfo->state);
+	wake_up(&cinfo->wait);
 }
 
 /* __sendmsg()
@@ -713,6 +742,8 @@ static int join(struct mddev *mddev, int nodes)
 	spin_lock_init(&cinfo->suspend_lock);
 	init_completion(&cinfo->completion);
 	set_bit(MD_CLUSTER_BEGIN_JOIN_CLUSTER, &cinfo->state);
+	init_waitqueue_head(&cinfo->wait);
+	mutex_init(&cinfo->recv_mutex);
 
 	mddev->cluster_info = cinfo;
 
@@ -843,9 +874,25 @@ static int slot_number(struct mddev *mddev)
 	return cinfo->slot_number - 1;
 }
 
+/*
+ * Check if the communication is already locked, else lock the communication
+ * channel.
+ * If it is already locked, token is in EX mode, and hence lock_token()
+ * should not be called.
+ */
 static int metadata_update_start(struct mddev *mddev)
 {
-	return lock_comm(mddev->cluster_info);
+	struct md_cluster_info *cinfo = mddev->cluster_info;
+
+	wait_event(cinfo->wait,
+		   !test_and_set_bit(MD_CLUSTER_SEND_LOCK, &cinfo->state) ||
+		   test_and_clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state));
+
+	/* If token is already locked, return 0 */
+	if (cinfo->token_lockres->mode == DLM_LOCK_EX)
+		return 0;
+
+	return lock_token(cinfo);
 }
 
 static int metadata_update_finish(struct mddev *mddev)
@@ -870,6 +917,7 @@ static int metadata_update_finish(struct mddev *mddev)
 		ret = __sendmsg(cinfo, &cmsg);
 	} else
 		pr_warn("md-cluster: No good device id found to send\n");
+	clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
 	unlock_comm(cinfo);
 	return ret;
 }
@@ -877,6 +925,7 @@ static int metadata_update_finish(struct mddev *mddev)
 static void metadata_update_cancel(struct mddev *mddev)
 {
 	struct md_cluster_info *cinfo = mddev->cluster_info;
+	clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
 	unlock_comm(cinfo);
 }
 
@@ -970,14 +1019,18 @@ static int add_new_disk(struct mddev *mddev, struct md_rdev *rdev)
 		ret = -ENOENT;
 	if (ret)
 		unlock_comm(cinfo);
-	else
+	else {
 		dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR);
+		set_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
+		wake_up(&cinfo->wait);
+	}
 	return ret;
 }
 
 static void add_new_disk_cancel(struct mddev *mddev)
 {
 	struct md_cluster_info *cinfo = mddev->cluster_info;
+	clear_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
 	unlock_comm(cinfo);
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH V3 2/2] md-cluster: Protect communication with mutexes
  2015-11-29 17:43   ` [PATCH V3 " Guoqing Jiang
@ 2015-11-30  0:58     ` NeilBrown
  2015-11-30 15:09       ` Guoqing Jiang
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2015-11-30  0:58 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: linux-raid, rgoldwyn

[-- Attachment #1: Type: text/plain, Size: 1684 bytes --]

On Mon, Nov 30 2015, Guoqing Jiang wrote:

> +#define		MD_CLUSTER_SEND_LOCK			4
> +/* If cluster operations must lock the communication channel,
> + * so as to perform extra operations (and no other operation
> + * is allowed on the MD, such as adding a disk. Token needs
> + * to be locked and held until the operation completes with
> + * a md_update_sb(), which would eventually release the lock.
> + */

This comment has unbalanced parentheses.  How did it even compile :-)

But there is something else that isn't as clear as it could be....

>  
> @@ -970,14 +1019,18 @@ static int add_new_disk(struct mddev *mddev, struct md_rdev *rdev)
>  		ret = -ENOENT;
>  	if (ret)
>  		unlock_comm(cinfo);
> -	else
> +	else {
>  		dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR);
> +		set_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
> +		wake_up(&cinfo->wait);
> +	}

This code suggests that something has caused a metadata update to be
triggered.  i.e. something set MD_CHANGE_DEVS or similar.
That is presumably add_bound_rdev() - which hasn't yet been called, but
while soon after in add_new_disk().

This feels a little bit fragile.  The new code in md-cluster is only
correct if code in add_new_disk does a particular thing.  I guess I'd at
least like to see a comment in add_new_disk() in md-cluster explaining
what will cause MD_CLUSTE_SEND_LOCKED_ALREADY to eventually be cleared.
Maybe it could also schedule the MD_CHANGE_DEVS to be completely certain
that will happen, but that probably isn't necessary.

So I've applied these for now, but if you could fix one comment and add
another that would help.

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH V3 2/2] md-cluster: Protect communication with mutexes
  2015-11-30  0:58     ` NeilBrown
@ 2015-11-30 15:09       ` Guoqing Jiang
  0 siblings, 0 replies; 5+ messages in thread
From: Guoqing Jiang @ 2015-11-30 15:09 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, rgoldwyn

Hi Neil,

On 11/30/2015 08:58 AM, NeilBrown wrote:
> On Mon, Nov 30 2015, Guoqing Jiang wrote:
>
>> +#define		MD_CLUSTER_SEND_LOCK			4
>> +/* If cluster operations must lock the communication channel,
>> + * so as to perform extra operations (and no other operation
>> + * is allowed on the MD, such as adding a disk. Token needs
>> + * to be locked and held until the operation completes with
>> + * a md_update_sb(), which would eventually release the lock.
>> + */
> This comment has unbalanced parentheses.  How did it even compile :-)
Oops, thanks for catch that, how careless I am.
> But there is something else that isn't as clear as it could be....
>
>>   
>> @@ -970,14 +1019,18 @@ static int add_new_disk(struct mddev *mddev, struct md_rdev *rdev)
>>   		ret = -ENOENT;
>>   	if (ret)
>>   		unlock_comm(cinfo);
>> -	else
>> +	else {
>>   		dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR);
>> +		set_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
>> +		wake_up(&cinfo->wait);
>> +	}
> This code suggests that something has caused a metadata update to be
> triggered.  i.e. something set MD_CHANGE_DEVS or similar.
> That is presumably add_bound_rdev() - which hasn't yet been called, but
> while soon after in add_new_disk().
>
> This feels a little bit fragile.  The new code in md-cluster is only
> correct if code in add_new_disk does a particular thing.  I guess I'd at
> least like to see a comment in add_new_disk() in md-cluster explaining
> what will cause MD_CLUSTE_SEND_LOCKED_ALREADY to eventually be cleared.
> Maybe it could also schedule the MD_CHANGE_DEVS to be completely certain
> that will happen, but that probably isn't necessary.
>
> So I've applied these for now, but if you could fix one comment and add
> another that would help.
What about the following changes? Thanks a lot.

[PATCH] md-cluster: comments update for MD_CLUSTER_SEND_LOCKED_ALREADY

1. fix unbalanced parentheses.
2. add more description about that MD_CLUSTER_SEND_LOCKED_ALREADY
    will be cleared after set it in add_new_disk.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
---
  drivers/md/md-cluster.c | 16 +++++++++++++---
  1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index ad3ec7d..68b4866 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -53,9 +53,9 @@ struct resync_info {
   * accomodate lock and hold. See next comment.
   */
  #define                MD_CLUSTER_SEND_LOCK                    4
-/* If cluster operations must lock the communication channel,
- * so as to perform extra operations (and no other operation
- * is allowed on the MD, such as adding a disk. Token needs
+/* If cluster operations (such as adding a disk) must lock
+ * the communication channel, so as to perform extra operations
+ * and no other operation is allowed on the MD. Token needs
   * to be locked and held until the operation completes with
   * a md_update_sb(), which would eventually release the lock.
   */
@@ -1021,6 +1021,16 @@ static int add_new_disk(struct mddev *mddev, 
struct md_rdev *rdev)
                 unlock_comm(cinfo);
         else {
                 dlm_lock_sync(cinfo->no_new_dev_lockres, DLM_LOCK_CR);
+                /* Since MD_CHANGE_DEVS will be set in add_bound_rdev which
+                 * will run soon after add_new_disk, the path will be 
invoked:
+                 *   md_wakeup_thread(mddev->thread) -> conf->thread 
(raid1d)
+                 *                     -> md_check_recovery -> md_update_sb
+                 *                     -> metadata_update_start/finish
+                 * MD_CLUSTER_SEND_LOCKED_ALREADY will be cleared 
eventually.
+                 *
+                 * For other failure cases, metadata_update_cancel and
+                 * add_new_disk_cancel also clear below bit as well.
+                 */
                 set_bit(MD_CLUSTER_SEND_LOCKED_ALREADY, &cinfo->state);
                 wake_up(&cinfo->wait);
         }
-- 
2.1.4

Guoqing

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-11-30 15:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-29  8:57 [PATCH Resend V2 1/2] md-cluster: Defer MD reloading to mddev->thread Guoqing Jiang
2015-11-29  8:57 ` [PATCH Resend V2 2/2] md-cluster: Protect communication with mutexes Guoqing Jiang
2015-11-29 17:43   ` [PATCH V3 " Guoqing Jiang
2015-11-30  0:58     ` NeilBrown
2015-11-30 15:09       ` Guoqing Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).