Distributed Replicated Block Device (DRBD) development
 help / color / mirror / Atom feed
* [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resync to stall
@ 2008-01-11 15:23 Graham, Simon
  2008-01-14  9:04 ` Lars Ellenberg
  2008-01-14 13:57 ` [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resyncto stall Graham, Simon
  0 siblings, 2 replies; 3+ messages in thread
From: Graham, Simon @ 2008-01-11 15:23 UTC (permalink / raw)
  To: drbd-dev


[-- Attachment #1.1: Type: text/plain, Size: 231 bytes --]

The attached patches revert a couple of changes made in DRBD-8.0 that
can cause resync to stall if it is pasued/resumed quickly (as can happen
if you have a lot of devices all releated with sync-after settings).

 

Simon


[-- Attachment #1.2: Type: text/html, Size: 3502 bytes --]

[-- Attachment #2: 0004-Revert-Fixing-drbdadm-pause-sync-r0-drbdadm-resu.patch --]
[-- Type: application/octet-stream, Size: 1039 bytes --]

From 93144f76bb3e22bc850cafafa489fc904b6c93db Mon Sep 17 00:00:00 2001
From: Simon Graham <sgraham@anna.sn.stratus.com>
Date: Fri, 21 Dec 2007 22:27:56 -0500
Subject: [PATCH] Revert "Fixing "drbdadm pause-sync r0 ; drbdadm resume-sync r0""

This reverts commit 68660cd885564b590c0559efdbb28f9edcb87ff2.

The change caused the sync to stall when paused and resumed quickly
---
 drbd/drbd_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drbd/drbd_main.c b/drbd/drbd_main.c
index 8d18601..1430794 100644
--- a/drbd/drbd_main.c
+++ b/drbd/drbd_main.c
@@ -785,7 +785,7 @@ int _drbd_set_state(drbd_dev* mdev, drbd_state_t ns,enum chg_state_flags flags)
 		INFO("Syncer continues.\n");
 		mdev->rs_paused += (long)jiffies-(long)mdev->rs_mark_time;
 		if (ns.conn == SyncTarget) {
-			if (!test_and_clear_bit(STOP_SYNC_TIMER,&mdev->flags)) {
+			if (!test_bit(STOP_SYNC_TIMER,&mdev->flags)) {
 				mod_timer(&mdev->resync_timer,jiffies);
 			}
 			/* This if (!test_bit) is only needed for the case
-- 
1.5.4.rc1


[-- Attachment #3: 0005-Revert-make-resync-more-robust-don-t-reset-find-bi.patch --]
[-- Type: application/octet-stream, Size: 2004 bytes --]

From cab2179c6ceea943b45dcfc99a54e14fd725a082 Mon Sep 17 00:00:00 2001
From: Simon Graham <sgraham@anna.sn.stratus.com>
Date: Fri, 21 Dec 2007 22:28:24 -0500
Subject: [PATCH] Revert "make resync more robust, don't reset find bit offset too early."

This reverts commit 3a57119417c46c51dd4bc720ab7dbf14228f05bb.

The change caused syncing to stall if pased/resumed quickly
---
 drbd/drbd_bitmap.c |    4 ++--
 drbd/drbd_main.c   |   12 ++++--------
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drbd/drbd_bitmap.c b/drbd/drbd_bitmap.c
index b9309a3..6a7e4ae 100644
--- a/drbd/drbd_bitmap.c
+++ b/drbd/drbd_bitmap.c
@@ -874,7 +874,7 @@ unsigned long drbd_bm_find_next(drbd_dev *mdev)
 	}
 	if (i >= b->bm_bits) {
 		i = -1UL;
-		/* leave b->bm_fo unchanged. */
+		b->bm_fo = 0;
 	} else {
 		b->bm_fo = i+1;
 	}
@@ -898,7 +898,7 @@ void drbd_bm_set_find(drbd_dev *mdev, unsigned long i)
 
 int drbd_bm_rs_done(drbd_dev *mdev)
 {
-	return (mdev->bitmap->bm_fo >= mdev->bitmap->bm_bits);
+	return mdev->bitmap->bm_fo == 0;
 }
 
 /* returns number of bits actually changed.
diff --git a/drbd/drbd_main.c b/drbd/drbd_main.c
index 1430794..98a1c63 100644
--- a/drbd/drbd_main.c
+++ b/drbd/drbd_main.c
@@ -784,14 +784,10 @@ int _drbd_set_state(drbd_dev* mdev, drbd_state_t ns,enum chg_state_flags flags)
 	    (ns.conn == SyncTarget  || ns.conn == SyncSource) ) {
 		INFO("Syncer continues.\n");
 		mdev->rs_paused += (long)jiffies-(long)mdev->rs_mark_time;
-		if (ns.conn == SyncTarget) {
-			if (!test_bit(STOP_SYNC_TIMER,&mdev->flags)) {
-				mod_timer(&mdev->resync_timer,jiffies);
-			}
-			/* This if (!test_bit) is only needed for the case
-			   that a device that has ceased to used its timer,
-			   i.e. it is already in drbd_resync_finished() gets
-			   paused and resumed. */
+		if( ns.conn == SyncTarget ) {
+			D_ASSERT(!test_bit(STOP_SYNC_TIMER,&mdev->flags));
+			clear_bit(STOP_SYNC_TIMER,&mdev->flags);
+			mod_timer(&mdev->resync_timer,jiffies);
 		}
 	}
 
-- 
1.5.4.rc1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resync to stall
  2008-01-11 15:23 [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resync to stall Graham, Simon
@ 2008-01-14  9:04 ` Lars Ellenberg
  2008-01-14 13:57 ` [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resyncto stall Graham, Simon
  1 sibling, 0 replies; 3+ messages in thread
From: Lars Ellenberg @ 2008-01-14  9:04 UTC (permalink / raw)
  To: drbd-dev

On Fri, Jan 11, 2008 at 10:23:37AM -0500, Graham, Simon wrote:
> The attached patches revert a couple of changes made in DRBD-8.0 that can cause
> resync to stall if it is pasued/resumed quickly (as can happen if you have a
> lot of devices all releated with sync-after settings).

I don't think these should be reverted,
I believe these changes are "correct".
If they cause resync to be stalled, then there is an other problem
which should be tracked down and fixed.


-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resyncto stall
  2008-01-11 15:23 [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resync to stall Graham, Simon
  2008-01-14  9:04 ` Lars Ellenberg
@ 2008-01-14 13:57 ` Graham, Simon
  1 sibling, 0 replies; 3+ messages in thread
From: Graham, Simon @ 2008-01-14 13:57 UTC (permalink / raw)
  To: Lars Ellenberg, drbd-dev

> On Fri, Jan 11, 2008 at 10:23:37AM -0500, Graham, Simon wrote:
> > The attached patches revert a couple of changes made in DRBD-8.0
that
> can cause
> > resync to stall if it is pasued/resumed quickly (as can happen if
you
> have a
> > lot of devices all releated with sync-after settings).
> 
> I don't think these should be reverted,
> I believe these changes are "correct".
> If they cause resync to be stalled, then there is an other problem
> which should be tracked down and fixed.
> 

Perhaps but they definitely cause sync stalls (because there are race
conditions setting and clearing the stop-sync-timer flag because it's
set in one thread and cleared in another). I think all the
checking/asserting should just be removed -- if you want it to stop, set
the flag, if you want sync to run, clear it and that's it...

For the moment, in 8.0 at least, I think it's simply better to revert
these changes because they cause problems.

Simon

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2008-01-14 13:57 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-11 15:23 [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resync to stall Graham, Simon
2008-01-14  9:04 ` Lars Ellenberg
2008-01-14 13:57 ` [Drbd-dev] [DRBD-8.0 PATCH] Revert some changes that cause resyncto stall Graham, Simon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox