From mboxrd@z Thu Jan 1 00:00:00 1970
From: Philipp Reisner
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] DRBD8: nodes deadlock in PausedSync{ST]
Date: Wed, 31 Oct 2007 14:46:39 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200710311446.42512.philipp.reisner@linbit.com>
Cc: "Montrose, Ernest"
List-Id: Coordination of development

On Monday 29 October 2007 23:33:29 Montrose, Ernest wrote:
> Hi all,
> I have been struggling with a problem here where the nodes enter
> PausedSync[T|S] and stay there.
> This happens when one node comes up from a fresh attach, connect
> sequence. I think the issue happens this way. Say we have two volumes
> drbd5 and drbd16 and we attempt to connect both of them at roughly the
> same time. Furthermore, drbd5 and 16 will require syncing, say as sync
> target. What I observe is this:
> * drbd16 is connecting and drbd5 is syncing. So 16 is paused isp=1
> * drbd16 enters receive_state() but before acquiring the req_lock
> that thread loses the CPU to drbd5 that is finishing syncing. After_isp
> is cleared on 16 giving drbd16 the green light to continue syncing. So
> far so good.
> * Now drbd16 resumes with the old peer_isp=1
> * So now we are paused forever.
>
> So I think receive_state() is just racy but I could be wrong. I am
> really not sure how to fix this but I include a patch here that may help
> to at least illustrate the problem. It seems to close the window for
> this particular race somewhat.
>

Hi Ernest,

Your patch fixes the issue. I spent an hour or so understanding the
exact timing, and drew diagrams of it... it is fixed with that change.
No race left, I think.

I committed it to my git tree.

-Phil
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Vivenotgasse 48, 1120 Vienna, Austria        http://www.linbit.com :