From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Tue, 4 Jan 2011 16:27:38 -0500
Subject: [Cluster-devel] [PATCH] dlm: send_bast_queue() skip list loop not only sending basts to convertqueue
In-Reply-To: <1294171611-24786-1-git-send-email-cmaiolino@redhat.com>
References: <1294171611-24786-1-git-send-email-cmaiolino@redhat.com>
Message-ID: <20110104212738.GA26812@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Tue, Jan 04, 2011 at 06:06:51PM -0200, cmaiolino at redhat.com wrote:
> The resource groups got corrupted without this patch:

I could see an extraneous bast leading to confusion in gfs2 about the lock
state, but gfs2 should probably be asserting somewhere before it actually
corrupts anything...

> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 64e5f3e..565c519 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -1847,7 +1847,7 @@ static void send_bast_queue(struct dlm_rsb *r, struct list_head *head,
>
>  	list_for_each_entry(gr, head, lkb_statequeue) {
>  		/* skip self when sending basts to convertqueue */
> -		if (gr == lkb)
> +		if (head == &r->res_grantqueue && gr == lkb)
>  			continue;
>  		if (gr->lkb_bastfn && modes_require_bast(gr, lkb)) {
>  			queue_bast(r, gr, lkb->lkb_rqmode);

I haven't been able to figure out the problem or the fix; some printk's
around the case in question would be revealing.

This is the specific case where a TRY_1CB (NOQUEUEBAST) conversion fails.
Here's how I step through the code for that case:

_convert_lock(lkb)
  error = do_convert(lkb)
    when error equals -EAGAIN, lkb remains on grantqueue
  do_convert_effects(lkb, -EAGAIN)
    -EAGAIN and NOQUEUEBAST -> send_blocking_asts_all ->
      send_bast_queue(grantqueue, lkb)
        [lkb is expected to be here, skip sending bast to self]
      send_bast_queue(convertqueue, lkb):
        [lkb should not be on here, but your patch implies there are
         cases where it can be?  I think that would be a bug.]

Dave
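
To make the walk-through above concrete, here is a minimal user-space
sketch of the two skip-self checks. It is not the kernel code: the
model_* types, the has_bastfn and basts_queued fields, and the two
function names are all hypothetical stand-ins, and modes_require_bast()
is assumed to always return true. It only illustrates that the original
and patched checks can differ in one situation, namely when lkb itself
is found on a queue other than the grantqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dlm_lkb; not the real kernel type. */
struct model_lkb {
	int id;
	bool has_bastfn;	/* stands in for gr->lkb_bastfn != NULL */
	int basts_queued;	/* bumped where the kernel would call queue_bast() */
};

/* Hypothetical stand-in for a list_head based queue: a plain array. */
struct model_queue {
	struct model_lkb **locks;
	size_t count;
};

/* Hypothetical stand-in for struct dlm_rsb with only the two queues. */
struct model_rsb {
	struct model_queue grantqueue;
	struct model_queue convertqueue;
};

/*
 * Original check: skip lkb itself on whichever queue is being walked.
 * Assumption: modes_require_bast() is treated as always true here.
 * Returns the number of basts queued during this walk.
 */
static int send_bast_queue_orig(struct model_rsb *r, struct model_queue *head,
				struct model_lkb *lkb)
{
	int sent = 0;

	(void)r;	/* unused; kept so both variants share a signature */
	for (size_t i = 0; i < head->count; i++) {
		struct model_lkb *gr = head->locks[i];

		if (gr == lkb)
			continue;
		if (gr->has_bastfn) {
			gr->basts_queued++;
			sent++;
		}
	}
	return sent;
}

/* Patched check: skip lkb itself only while walking the grantqueue. */
static int send_bast_queue_patched(struct model_rsb *r,
				   struct model_queue *head,
				   struct model_lkb *lkb)
{
	int sent = 0;

	for (size_t i = 0; i < head->count; i++) {
		struct model_lkb *gr = head->locks[i];

		if (head == &r->grantqueue && gr == lkb)
			continue;
		if (gr->has_bastfn) {
			gr->basts_queued++;
			sent++;
		}
	}
	return sent;
}
```

With lkb only on the grantqueue, both variants queue the same basts on
both queues. They diverge only if lkb is placed on the convertqueue, in
which case the patched variant queues a bast to lkb itself. That is the
situation the walk-through says should not occur after a failed
NOQUEUEBAST conversion.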