* [Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer
From: Joseph Qi @ 2013-11-07 12:12 UTC
To: ocfs2-devel
We ran the ocfs2 test program dirop_fileop_racer and found a deadlock case.
The case is described below.
Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create
/race/16/1 in the filesystem, and arrange for the inode number of dir 16
to be less than the inode number of dir race.
Node A                          Node B
mv /race/16/1 /race/
                                right after Node A has got the
                                EX mode of /race/16/, and tries to
                                get EX mode of /race/
                                ls /race/16/
In this case, Node A holds EX on /race/16/ and wants EX on /race/.
Node B holds PR on /race/ and wants PR on /race/16/. Since EX and PR
are mutually exclusive, neither request can be granted, and a deadlock
results.
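The cycle can be made explicit with a small wait-for graph. What follows
is a minimal Python sketch of the scenario, not ocfs2 or DLM code. The
names `held`, `wanted`, `wait_for_edges`, and `has_cycle` are hypothetical,
and the only compatibility rule assumed is that PR is compatible with PR
while EX conflicts with every mode.

```python
# Hypothetical model of the two-node scenario; not ocfs2 or DLM code.
COMPATIBLE = {("PR", "PR")}  # assumption: EX conflicts with everything


def conflicts(requested, granted):
    """Return True if the requested mode cannot coexist with the granted one."""
    return (requested, granted) not in COMPATIBLE


# resource -> (holding node, granted mode)
held = {"/race/16/": ("A", "EX"), "/race/": ("B", "PR")}
# requesting node -> (resource, requested mode)
wanted = {"A": ("/race/", "EX"), "B": ("/race/16/", "PR")}


def wait_for_edges(held, wanted):
    """Map each blocked node to the node whose lock it is waiting on."""
    edges = {}
    for node, (resource, mode) in wanted.items():
        holder, granted = held[resource]
        if holder != node and conflicts(mode, granted):
            edges[node] = holder
    return edges


def has_cycle(edges):
    """Follow the wait-for chain from each node and report any loop."""
    for start in edges:
        seen = set()
        node = start
        while node in edges:
            if node in seen:
                return True
            seen.add(node)
            node = edges[node]
    return False


edges = wait_for_edges(held, wanted)
deadlocked = has_cycle(edges)
```

With the state above, A waits on B and B waits on A, so the chain loops
back on itself and neither node can make progress.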
* [Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer
From: Joel Becker @ 2013-11-07 13:19 UTC
To: ocfs2-devel
On Thu, Nov 07, 2013 at 08:12:02PM +0800, Joseph Qi wrote:
> We ran the ocfs2 test program dirop_fileop_racer and found a deadlock case.
>
> The case is described below.
> Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create
> /race/16/1 in the filesystem, and arrange for the inode number of dir 16
> to be less than the inode number of dir race.
>
> Node A                          Node B
> mv /race/16/1 /race/
>                                 right after Node A has got the
>                                 EX mode of /race/16/, and tries to
>                                 get EX mode of /race/
>                                 ls /race/16/
>
> In this case, Node A holds EX on /race/16/ and wants EX on /race/.
> Node B holds PR on /race/ and wants PR on /race/16/. Since EX and PR
> are mutually exclusive, neither request can be granted, and a deadlock
> results.
Interesting. What DLM are you using, o2dlm or fs/dlm? I would expect
that fs/dlm would do deadlock detection, but I could be wrong.
There's no way the PR on /race/ will downconvert, because there is a
reference. We really want a signal to that PR waiting on /race/16/, but
there's no in-progress work happening on node A for that.
I suppose we could hack this to check for ancestors. That is, rename
locks should be in ancestor order before trying inode number order. I'm
not sure that always works, though, especially if the ancestors are not
consecutive and might also be affected by in-flight moves...
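A rough illustration of that ordering rule, as a Python sketch under
stated assumptions rather than a proposed patch: `parent_of` is a
hypothetical map from a directory's inode number to its parent's inode
number, the inode numbers 100 and 200 are made up, and the directory tree
is assumed to stay fixed while the order is computed. That last assumption
is exactly what in-flight moves would break, so the sketch does not
address that concern.

```python
def is_ancestor(parent_of, a, b):
    """Return True if inode a is a proper ancestor of inode b.

    parent_of is a hypothetical, acyclic child -> parent inode map.
    """
    node = parent_of.get(b)
    while node is not None:
        if node == a:
            return True
        node = parent_of.get(node)
    return False


def lock_order(parent_of, ino_a, ino_b):
    """Order two inodes: ancestor first, else ascending inode number."""
    if is_ancestor(parent_of, ino_a, ino_b):
        return (ino_a, ino_b)
    if is_ancestor(parent_of, ino_b, ino_a):
        return (ino_b, ino_a)
    return tuple(sorted((ino_a, ino_b)))


# Made-up numbers: dir 16 has inode 100, dir race has inode 200,
# so pure inode-number order would lock the child (16) first.
parent_of = {100: 200}
order = lock_order(parent_of, 100, 200)  # (200, 100): parent first
```

With this rule the rename would take /race/ before /race/16/, the same
parent-then-child order in which Node B acquires its locks in the
reported scenario.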
Joel
--
"Egotist: a person more interested in himself than in me."
- Ambrose Bierce
http://www.jlbec.org/
jlbec at evilplan.org
* [Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer
From: Joseph Qi @ 2013-11-08 2:19 UTC
To: ocfs2-devel
On 2013/11/7 21:19, Joel Becker wrote:
> Interesting. What DLM are you using, o2dlm or fs/dlm? I would expect
> that fs/dlm would do deadlock detection, but I could be wrong.
>
We are using the ocfs2 dlm (o2dlm).
> There's no way the PR on /race/ will downconvert, because there is a
> reference. We really want a signal to that PR waiting on /race/16/, but
> there's no in-progress work happening on node A for that.
>
Can a timeout resolve this issue?
A first thought is that, once the timeout expires, we cancel the queued
lock request and let the lock operation fail.
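A minimal sketch of that idea in Python, using `threading.Lock` purely
as a stand-in for cluster locks (an assumption; this is not how o2dlm
requests are issued or cancelled). The helper name `lock_both` is
hypothetical, and releasing the first lock when the second request times
out is also an assumption about how the failed operation would unwind.

```python
import threading


def lock_both(first, second, timeout):
    """Take `first`, then try `second` for at most `timeout` seconds.

    On timeout the pending request is abandoned, `first` is released,
    and False is returned so the caller can fail or retry the operation.
    """
    first.acquire()
    if not second.acquire(timeout=timeout):
        first.release()  # assumption: back out so the other side can proceed
        return False
    return True


child = threading.Lock()   # stands in for /race/16/
parent = threading.Lock()  # stands in for /race/

parent.acquire()           # pretend another node already holds /race/
timed_out = not lock_both(child, parent, timeout=0.05)
child_free_after_timeout = not child.locked()

parent.release()           # the other node lets go
got_both = lock_both(child, parent, timeout=0.05)
```

The trade-off is that the caller now has to handle a failed rename or
lookup, and choosing a timeout that separates a deadlock from a merely
slow lock holder is not obvious.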
> I suppose we could hack this to check for ancestors. That is, rename
> locks should be in ancestor order before trying inode number order. I'm
> not sure that always works, though, especially if the ancestors are not
> consecutive and might also be affected by in-flight moves...
>
> Joel
>
* [Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer
From: Joseph Qi @ 2013-11-28 1:40 UTC
To: ocfs2-devel
On 2013/11/8 10:19, Joseph Qi wrote:
> On 2013/11/7 21:19, Joel Becker wrote:
>> Interesting. What DLM are you using, o2dlm or fs/dlm? I would expect
>> that fs/dlm would do deadlock detection, but I could be wrong.
>>
> We are using ocfs2 dlm.
>
>> There's no way the PR on /race/ will downconvert, because there is a
>> reference. We really want a signal to that PR waiting on /race/16/, but
>> there's no in-progress work happening on node A for that.
>>
> Can a timeout resolve this issue?
> A first thought is that, once the timeout expires, we cancel the queued
> lock request and let the lock operation fail.
>
In this case, it seems we could take the lock on /race/ (the parent)
first, and then the lock on /race/16/ (the child).
That is, check whether one inode is an ancestor of the other.
Am I missing anything?
>> I suppose we could hack this to check for ancestors. That is, rename
>> locks should be in ancestor order before trying inode number order. I'm
>> not sure that always works, though, especially if the ancestors are not
>> consecutive and might also be affected by in-flight moves...
>>
>> Joel
>>