From: Joseph Qi <joseph.qi@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer
Date: Fri, 8 Nov 2013 10:19:55 +0800
Message-ID: <527C4A4B.6020402@huawei.com>
In-Reply-To: <20131107131948.GH24799@localhost>

On 2013/11/7 21:19, Joel Becker wrote:
> On Thu, Nov 07, 2013 at 08:12:02PM +0800, Joseph Qi wrote:
>> We ran the ocfs2 test program dirop_fileop_racer and found a deadlock.
>>
>> The case is described below.
>> Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create
>> /race/16/1 in the filesystem, and let the inode number of dir 16 be less
>> than the inode number of dir race.
>>
>> Node A                            Node B
>> mv /race/16/1 /race/
>>                                   (right after Node A has got EX
>>                                   mode on /race/16/ and is trying
>>                                   to get EX mode on /race/)
>>                                   ls /race/16/
>>
>> In this case, Node A holds EX mode on /race/16/ and wants EX mode on
>> /race/. Node B holds PR mode on /race/ and wants PR mode on /race/16/.
>> Since EX and PR are mutually exclusive, neither request can be granted
>> and the two nodes deadlock.
> 
> Interesting.  What DLM are you using, o2dlm or fs/dlm?  I would expect
> that fs/dlm would do deadlock detection, but I could be wrong.
>
We are using the ocfs2 DLM (o2dlm).
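
To illustrate what deadlock detection would have to notice in this case,
here is a minimal sketch that models the two nodes as a wait-for graph and
looks for a cycle. It is not o2dlm or fs/dlm code; the function names
(compatible, wait_for_graph, has_cycle) and the data layout are hypothetical,
and only the PR and EX modes from the report are modeled.

```python
# Hypothetical model of the reported case; not o2dlm or fs/dlm code.

def compatible(a, b):
    # Of the two modes in the report, only PR with PR can coexist.
    return a == "PR" and b == "PR"

def wait_for_graph(held, wanted):
    """held/wanted map a node name to a list of (resource, mode).
    Returns a dict mapping each node to the set of nodes it waits on."""
    graph = {n: set() for n in set(held) | set(wanted)}
    for waiter, requests in wanted.items():
        for res, mode in requests:
            for holder, grants in held.items():
                if holder == waiter:
                    continue
                for hres, hmode in grants:
                    if hres == res and not compatible(mode, hmode):
                        graph[waiter].add(holder)
    return graph

def has_cycle(graph):
    # Depth-first search; reaching a node that is still on the stack is a cycle.
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GREY
        for m in graph[n]:
            if color[m] == GREY:
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# The state described in the report.
held = {"A": [("/race/16/", "EX")], "B": [("/race/", "PR")]}
wanted = {"A": [("/race/", "EX")], "B": [("/race/16/", "PR")]}
graph = wait_for_graph(held, wanted)
deadlocked = has_cycle(graph)
```

Node A waits on Node B and Node B waits on Node A, so the graph contains a
cycle and neither request can make progress on its own.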

> There's no way the PR on /race/ will downconvert, because there is a
> reference.  We really want a signal to that PR waiting on /race/16/, but
> there's no in-progress work happening on node A for that.
> 
Could a timeout resolve this issue?
A quick thought: once the timeout expires, cancel the queued lock request
and let the lock operation fail.
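
A rough sketch of that idea, using Python threads in place of cluster nodes.
The names lock_both and after_first are hypothetical, threading.Lock is
exclusive only (so Node B's PR requests are modeled as exclusive here), and
releasing the first lock on failure is an assumption about how the failed
operation would unwind. Only Node A uses a timeout, which keeps the outcome
deterministic.

```python
import threading

def lock_both(first, second, timeout=-1, after_first=None):
    """Take `first`, then wait for `second` for at most `timeout` seconds
    (-1 waits forever). On timeout, drop the queued request, release
    `first`, and report failure."""
    first.acquire()
    if after_first:
        after_first()  # demo hook: make sure both sides hold their first lock
    if second.acquire(timeout=timeout):
        return True
    first.release()  # assumption: the failed operation drops what it holds
    return False

dir_race, dir_16 = threading.Lock(), threading.Lock()
both_hold_first = threading.Barrier(2)
results = {}

def node_a():
    # mv /race/16/1 /race/ : holds /race/16/, then wants /race/
    results["A"] = lock_both(dir_16, dir_race, timeout=0.2,
                             after_first=both_hold_first.wait)

def node_b():
    # ls /race/16/ : holds /race/, then wants /race/16/
    results["B"] = lock_both(dir_race, dir_16,
                             after_first=both_hold_first.wait)

threads = [threading.Thread(target=f) for f in (node_a, node_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Node A's request times out and fails, which releases /race/16/ and lets
Node B finish. The cost is that the rename returns an error and has to be
retried by the caller.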

> I suppose we could hack this to check for ancestors.  That is, rename
> locks should be in ancestor order before trying inode number order.  I'm
> not sure that always works, though, especially if the ancestors are not
> consecutive and might also be affected by in-flight moves...
> 
> Joel
> 
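
For two directories, the ancestor-first ordering suggested above could look
roughly like the sketch below. The function rename_lock_order and the
(path, inode number) tuples are hypothetical, and the sketch assumes a stable
snapshot of the paths, so it does not address the concern about in-flight
moves changing the ancestry.

```python
from pathlib import PurePosixPath

def rename_lock_order(a, b):
    """a and b are (path, inode_number) tuples for two directories.
    Returns them in the order they should be locked: an ancestor before
    its descendant, otherwise lower inode number first."""
    pa, pb = PurePosixPath(a[0]), PurePosixPath(b[0])
    if pa in pb.parents:
        return [a, b]
    if pb in pa.parents:
        return [b, a]
    return sorted([a, b], key=lambda d: d[1])

# Inode numbers are made up; dir 16 has the lower one, as in the report.
race = ("/race/", 200)
dir_16 = ("/race/16/", 100)
order = rename_lock_order(dir_16, race)
```

With plain inode-number ordering the rename would lock /race/16/ first. With
the ancestor check it locks /race/ first, which is the same order the ls on
Node B uses (parent, then child).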

