From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Wed, 26 Sep 2007 08:25:11 -0500
Subject: [Cluster-devel] [PATCH] dlm: schedule during recovery loops
In-Reply-To: <46FA07DF.6040902@redhat.com>
References: <20070925162311.GE15893@redhat.com> <46FA07DF.6040902@redhat.com>
Message-ID: <20070926132511.GB15033@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Wed, Sep 26, 2007 at 08:18:55AM +0100, Patrick Caulfield wrote:
> David Teigland wrote:
> > Call schedule() in a bunch of places where the recovery code loops
> > through lists of locks. The theory is that these lists become so
> > long that looping through them triggers the softlockup watchdog.
> > (usually on ia64, doesn't seem to happen often on other arch's).
> >
> > Signed-off-by: David Teigland
> >
> I think we're encouraged to use cond_resched() instead these days. It has the
> same effect but doesn't force a schedule if there is nothing else to run.

OK, I'd like to try cond_resched() instead. How certain are we that
it's just as effective in avoiding the softlockup watchdog? Testing it
is going to be difficult, since the problem is largely unreproducible
outside of some single-CPU ia64 machines in the QE dept...