From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Wed, 9 Aug 2017 11:41:44 -0500
Subject: [Cluster-devel] [PATCH 13/17] dlm: fix _can_be_granted() for lock at the head of covert queue.
In-Reply-To:
References:
Message-ID: <20170809164144.GD21204@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Wed, Aug 09, 2017 at 05:51:37AM +0000, tsutomu.owa at toshiba.co.jp wrote:
> If there is a lock resource conflict on multiple nodes, the lock on
> convert queue may not be granted forever.
>
> EX.)
> grant queue:
> node0 grmode NL / rqmode IV
> node1 grmode NL / rqmode IV
>
> convert queue:
> node2 grmode NL / rqmode EX
> node3 grmode PR / rqmode EX
>
> wait queue:
> node4 grmode IV / rqmode PR
> node5 grmode IV / rqmode PR
>
> When the lock conversion (node PR -> NL) of node 0 is completed, the lock
> of node 2 should be grantable. However, __can_be_granted() returns 0
> because the grmode of the lock on node 3 in convert queue is PR.
>
> When checking the lock at the head of convert queue, exclude
> queue_conflict() targeting convert queue.

This example doesn't look right. node2's NL->EX cannot be granted because
it conflicts with the PR lock held by node3. (The grmode is still valid
when a lock is on the convert queue.)

There are two valid outcomes in the example above, either
1) node3 PR->EX is granted, or
2) node4 and node5 PR requests are granted.

What have you seen the dlm do in this state? If it does not grant
anything, that would be a bug.

Based on the sequence of events you describe, I think that the correct
outcome would be 1 (granting node3's PR->EX), based on this rule:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/dlm/lock.c#n2429

> -       if (queue_conflict(&r->res_convertqueue, lkb))
> +       if (!first_in_list(lkb, &r->res_convertqueue) &&
> +           queue_conflict(&r->res_convertqueue, lkb))
>                 return 0;
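
To make the mode argument above concrete, here is a minimal standalone
sketch of the state after node0's PR->NL conversion. It is not the dlm
code: the names (enum mode, struct lk, mode_compat, grantable_by_mode,
example) are hypothetical, the table is the usual six-mode compatibility
matrix, and it checks mode compatibility only. It deliberately ignores
the queue-ordering rules in _can_be_granted().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode names for this sketch; not the kernel's definitions. */
enum mode { MODE_IV = -1, MODE_NL, MODE_CR, MODE_CW, MODE_PR, MODE_PW, MODE_EX };

struct lk {
	int node;
	enum mode grmode;	/* mode currently held, MODE_IV if none */
	enum mode rqmode;	/* mode being requested, MODE_IV if none */
};

/* compat[held][wanted]: 1 if "wanted" can coexist with "held". */
static const int compat[6][6] = {
	/*          NL CR CW PR PW EX */
	/* NL */ {  1, 1, 1, 1, 1, 1 },
	/* CR */ {  1, 1, 1, 1, 1, 0 },
	/* CW */ {  1, 1, 1, 0, 0, 0 },
	/* PR */ {  1, 1, 0, 1, 0, 0 },
	/* PW */ {  1, 1, 0, 0, 0, 0 },
	/* EX */ {  1, 0, 0, 0, 0, 0 },
};

static int mode_compat(enum mode held, enum mode wanted)
{
	assert(held >= MODE_NL && held <= MODE_EX);
	assert(wanted >= MODE_NL && wanted <= MODE_EX);
	return compat[held][wanted];
}

/*
 * Return 1 if locks[self].rqmode is compatible with the grmode of every
 * other lock. A lock on the convert queue still holds its grmode, so it
 * is checked like a granted lock; wait-queue locks (grmode IV) hold
 * nothing and are skipped.
 */
static int grantable_by_mode(const struct lk *locks, size_t n, size_t self)
{
	for (size_t i = 0; i < n; i++) {
		if (i == self || locks[i].grmode == MODE_IV)
			continue;
		if (!mode_compat(locks[i].grmode, locks[self].rqmode))
			return 0;
	}
	return 1;
}

/* The example from the patch description, after node0 is back at NL. */
static const struct lk example[] = {
	{ 0, MODE_NL, MODE_IV },	/* grant queue */
	{ 1, MODE_NL, MODE_IV },	/* grant queue */
	{ 2, MODE_NL, MODE_EX },	/* convert queue */
	{ 3, MODE_PR, MODE_EX },	/* convert queue */
	{ 4, MODE_IV, MODE_PR },	/* wait queue */
	{ 5, MODE_IV, MODE_PR },	/* wait queue */
};
```

With this table, grantable_by_mode(example, 6, 2) is 0 because node3
still holds PR, while index 3 (node3 PR->EX) and index 4 (node4's PR
request) both pass the mode check. Which of those two is granted first
is decided by the ordering rules, which this sketch does not model.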