From: Andrew Morton <akpm@osdl.org>
To: Zach Brown <zach.brown@oracle.com>
Cc: linux-kernel@vger.kernel.org, hch@infradead.org
Subject: Re: [RFC] page lock ordering and OCFS2
Date: Mon, 17 Oct 2005 16:17:44 -0700
Message-ID: <20051017161744.7df90a67.akpm@osdl.org>
In-Reply-To: <20051017222051.GA26414@tetsuo.zabbo.net>
Zach Brown <zach.brown@oracle.com> wrote:
>
>
> I sent an earlier version of this patch to linux-fsdevel and was met with
> deafening silence.
Maybe because nobody understood your description ;)
> I'm resending the commentary from that first mail and am
> including a new version of the patch. This time it has much clearer naming
> and some kerneldoc blurbs. Here goes...
>
> --
>
> In recent weeks we've been reworking the locking in OCFS2 to simplify things
> and make it behave more like a "local" Linux file system. We've run into an
> ordering inversion between a page's PG_locked and OCFS2's DLM locks that
> protect page cache contents. I'm including a patch at the end of this mail
> that I think is a clean way to give the file system a chance to get the
> ordering right, but we're open to any and all suggestions. We want to do the
> cleanest thing.
The patch is of course pretty unwelcome: lots of weird stuff in the core
VFS which everyone has to maintain but probably will not test.
So I think we need a better understanding of what the locking inversion
problem is, so we can perhaps find a better solution. Bear in mind that
ext3 has (rare, unsolved) lock inversion problems in this area as well, so
commonality will be sought.
> OCFS2 maintains page cache coherence between nodes by requiring that a node
> hold a valid lock while there are active pages in the page cache.
"active pages in the page cache" means present pagecache pages in the node
which holds pages in its pagecache, yes?
> The page
> cache is invalidated before a node releases a lock so that another node can
acquire it. While this invalidation is happening, new locks cannot be acquired
> on that node. This is equivalent to a DLM processing thread acquiring
> PG_locked during truncation while holding a DLM lock. Normal file system user
> tasks come to the a_ops with PG_locked acquired by their callers before they
> have a chance to get DLM locks.
So where is the lock inversion?
Perhaps if you were to cook up one of those little threadA/threadB ascii
diagrams we could see where the inversion occurs?
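In lieu of that diagram, the two opposing acquisition orders being described can be sketched as a toy lock-order checker. The lock names and the chosen canonical order here are purely illustrative assumptions, not the real kernel or OCFS2 APIs:

```python
# Toy lock-order checker for the inversion described above.
# "PG_locked" and "dlm_lock" are stand-in names; the canonical order
# (PG_locked before dlm_lock) is an assumption for illustration only.

ORDER = {"PG_locked": 0, "dlm_lock": 1}   # assumed required order

def acquire(name, held):
    """Record acquiring 'name'; flag an inversion if a later-order
    lock is already held when an earlier-order lock is requested."""
    for h in held:
        if ORDER[name] < ORDER[h]:
            return "inversion"
    held.append(name)
    return "ok"

# User task path: arrives at the a_ops with PG_locked already taken
# by the caller, then acquires the DLM lock.
held = []
user_task = [acquire("PG_locked", held), acquire("dlm_lock", held)]

# DLM processing thread: holds the DLM lock, then needs PG_locked to
# invalidate pagecache pages before releasing it -- the opposite order.
held = []
dlm_thread = [acquire("dlm_lock", held), acquire("PG_locked", held)]

print("user task:", user_task)    # ['ok', 'ok']
print("dlm thread:", dlm_thread)  # ['ok', 'inversion']
```

With each path run alone nothing deadlocks; the hazard is the two paths racing against each other, which is exactly what a consistent lock ordering (or a checker like lockdep) is meant to rule out.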