* [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
@ 2016-10-19  5:19 Eric Ren
From: Eric Ren @ 2016-10-19  5:19 UTC
  To: ocfs2-devel

Hi all!

Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
results in another deadlock as we have discussed in the recent thread:
    https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html

Before this one, a similar deadlock was fixed by Junxiao:
    commit c25a1e0671fb ("ocfs2: fix posix_acl_create deadlock")
    commit 5ee0fbd50fdf ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")

We are in a situation where we have to avoid recursive cluster locking, but
there is no way to check whether a cluster lock has already been taken by a
process.

Mostly, we can avoid recursive locking by writing code carefully. However, as
the deadlock issues have shown, it is very hard to handle the routines that
are called directly by the vfs. For instance:

    const struct inode_operations ocfs2_file_iops = {
            .permission     = ocfs2_permission,
            .get_acl        = ocfs2_iop_get_acl,
            .set_acl        = ocfs2_iop_set_acl,
    };


ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
The problem is that the call chain of ocfs2_permission() includes *_acl().
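
To make the recursion concrete, here is a toy userspace model of that call
chain. It is only a sketch: toy_lock, toy_inode_lock(), toy_permission() and
toy_get_acl() are made-up names, not ocfs2 code, and the lock is assumed to be
strictly non-recursive. Where a real blocking lock could hang, the model just
returns -1 and counts the attempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive lock; not the ocfs2 lock resource. */
struct toy_lock {
    bool held;
    int  recursive_attempts;    /* lock attempts made while already held */
};

/* Returns 0 on success, -1 where a real blocking lock could hang. */
static int toy_inode_lock(struct toy_lock *l)
{
    if (l->held) {
        l->recursive_attempts++;
        return -1;
    }
    l->held = true;
    return 0;
}

static void toy_inode_unlock(struct toy_lock *l)
{
    assert(l->held);            /* unlocking a free lock is a bug */
    l->held = false;
}

/* Models a get_acl-style callback that takes the lock on its own. */
static int toy_get_acl(struct toy_lock *l)
{
    if (toy_inode_lock(l) != 0)
        return -1;
    toy_inode_unlock(l);
    return 0;
}

/* Models a permission-style callback: lock, then reach the ACL path. */
static int toy_permission(struct toy_lock *l)
{
    int ret;

    if (toy_inode_lock(l) != 0)
        return -1;
    ret = toy_get_acl(l);       /* second lock attempt while still held */
    toy_inode_unlock(l);
    return ret;
}
```

Called on its own, toy_get_acl() succeeds. Called through toy_permission(),
it hits the lock that its caller already holds.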

I can think of three possible solutions. The first one is to implement the
inode permission routine in ocfs2 itself, replacing the existing
generic_permission(); this would bring lots of changes and pull too many
trivial vfs functions into ocfs2 code. I would rather avoid this.

The second one, which I am trying now, is to keep track of the processes that
lock/unlock a cluster lock, using the following draft patches. But I quickly
found out that a cluster lock taken by processA can be unlocked by processB.
For example, system files like journal:0000 are locked during mount and
unlocked during umount.
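
The holder-tracking idea, and the way it breaks when a different process does
the unlock, can be sketched as below. This is a userspace illustration under
my own assumptions, not the draft patches: toy_lockres, toy_add_holder(),
toy_remove_holder() and toy_is_holder() are hypothetical names, and plain
integers stand in for process ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HOLDERS 8

/* Toy lock resource that remembers which process ids took it. */
struct toy_lockres {
    int    holders[TOY_MAX_HOLDERS];
    size_t nr_holders;
};

/* True if pid is recorded as holding the lock; a caller could use this
 * to skip taking the lock a second time. */
static bool toy_is_holder(const struct toy_lockres *res, int pid)
{
    for (size_t i = 0; i < res->nr_holders; i++)
        if (res->holders[i] == pid)
            return true;
    return false;
}

/* Record pid as a holder at lock time. */
static void toy_add_holder(struct toy_lockres *res, int pid)
{
    assert(res->nr_holders < TOY_MAX_HOLDERS);
    res->holders[res->nr_holders++] = pid;
}

/* Drop pid's record at unlock time; returns false if pid never locked. */
static bool toy_remove_holder(struct toy_lockres *res, int pid)
{
    for (size_t i = 0; i < res->nr_holders; i++) {
        if (res->holders[i] == pid) {
            /* Fill the hole with the last entry; order is irrelevant. */
            res->holders[i] = res->holders[--res->nr_holders];
            return true;
        }
    }
    return false;
}
```

If pid 100 records itself at lock time (say, during mount) and pid 200 does
the unlock (during umount), toy_remove_holder() finds no record for pid 200
and the entry for pid 100 stays behind, so the tracking no longer reflects
who holds the lock.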

The third one is to revert that problematic commit! It looks like get/set_acl()
are always called from other vfs callbacks like ocfs2_permission(). I think
we can do this if that's true, right? Anyway, I'll try to work out whether
it's true ;-)

I hope for your input on solving this problem ;-)

Thanks,
Eric


Thread overview: 19+ messages
2016-10-19  5:19 [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Eric Ren
2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock Eric Ren
2016-10-19  5:19 ` [Ocfs2-devel] [DRAFT 2/2] ocfs2: fix deadlock caused by recursive cluster locking Eric Ren
2016-10-31 10:55   ` piaojun
2016-11-01  1:45     ` Eric Ren
2016-11-10 10:49       ` piaojun
2016-11-11  1:56         ` Eric Ren
2016-11-14  5:42           ` piaojun
2016-11-14 10:03             ` Eric Ren
2016-11-15  2:13               ` Eric Ren
2016-11-09  4:55   ` Eric Ren
2016-10-19  6:57 ` [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas? Junxiao Bi
2016-10-19  7:46   ` Eric Ren
2016-10-24  9:13 ` Eric Ren
2016-10-28  6:20 ` Christoph Hellwig
2016-10-28  7:06   ` Eric Ren
2016-11-09  4:47 ` [Ocfs2-devel] " Eric Ren
