linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Nick Piggin <npiggin@suse.de>
Cc: Jan Kara <jack@suse.cz>, Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [rfc][patch] mm: lockdep page lock
Date: Tue, 16 Mar 2010 12:52:00 +0100	[thread overview]
Message-ID: <20100316115200.GA5699@quack.suse.cz> (raw)
In-Reply-To: <20100316022153.GJ2869@laptop>

On Tue 16-03-10 13:21:53, Nick Piggin wrote:
> On Mon, Mar 15, 2010 at 07:08:00PM +0100, Jan Kara wrote:
> >   Hi,
> > 
> > On Tue 16-03-10 02:58:59, Nick Piggin wrote:
> > > This patch isn't totally complete. Needs some nesting annotations for
> > > filesystems like ntfs, and some async lock release annotations for other
> > > end-io handlers, also page migration code needs to set the page lock
> > > class. But the core of it is working nicely and is a pretty small patch.
> > > 
> > > It is a bit different to one Peter posted a while back, with differences.
> > > I don't care so much about bloating struct page with a few more bytes.
> > > lockdep can't run on a production kernel so I think it's preferable to be
> > > catching more complex errors than avoiding overhead. I also set the page
> > > lock class at the time it is added to pagecache when we have the mapping
> > > pinned to the page.
> > > 
> > > One issue I wonder about is if the lock class is changed while some other
> > > page locker is waiting to get the lock but has already called
> > > lock_acquire for the old class. Possibly it could be solved if lockdep
> > > has different primitives to say the caller is contending for a lock
> > > versus if it has been granted the lock?
> > > 
> > > Do you think it would be useful?  --
> > > 
> > > Page lock has very complex dependencies, so it would be really nice to
> > > add lockdep support for it.
> > > 
> > > For example: add_to_page_cache_locked(GFP_KERNEL) (called with page
> > > locked) -> page reclaim performs a trylock_page -> page reclaim performs
> > > a writepage -> writepage performs a get_block -> get_block reads
> > > buffercache -> buffercache read requires grow_dev_page -> grow_dev_page
> > > locks buffercache page -> if writepage fails, page reclaim calls
> > > handle_write_error -> handle_write_error performs a lock_page
> > > 
> > > So before even considering any other locks or more complex nested
> > > filesystems, we can hold at least 3 different page locks at once. Should
> > > be safe because we have an fs->bdev page lock ordering, and because
> > > add_to_page_cache* tend to be called on new (non-LRU) pages that can't be
> > > locked elsewhere, however a notable exception is tmpfs which moves live
> > > pages in and out of pagecache.
> > > 
> > > So lockdepify the page lock. Each filesystem type gets a unique key, to
> > > handle inter-filesystem nesting (like regular filesystem -> buffercache,
> > > or ecryptfs -> lower). Newly allocated pages get a default lock class,
> > > and it is reassigned to their filesystem type when being added to page
> > > cache.
> >   You'll probably soon notice that quite some filesystems (ext4, xfs,
> > ocfs2, ...) lock several pages at once in their writepages function. The
> 
> Yes indeed. This is what I had meant about nesting with NTFS, but I
> understand that others do it too.
> 
> 
> > locking rule here is that we always lock pages in index increasing order. I
> > don't think lockdep will be able to handle something like that. Probably we
> > can just avoid lockdep checking in these functions (or just acquire the
> > page lock class for the first page) but definitely there will be some
> 
> You are right, I don't think lockdep would work with that, so just
> checking the lock for the first page should be better than nothing.
> It might require some lockdep support in order to add context so it
> doesn't go mad when unlock_page is called (would rather not add any
> page flags to track that).
> 
> If we were really clever and able to get back to the address of
> struct page that _is_ holding the lock, we could just do a simple
> check to ensure its index is < the index of the page we are trying
> to take.
> 
> That would give reasonable nesting checking without requiring lockdep
> to track new chains for every page (obviously not feasible).
  This is an interesting idea. We could store a pointer to the
first locked page (which is attached to some mapping) in task_struct.
That should work fine.
								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-03-16 11:52 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-15 15:58 [rfc][patch] mm: lockdep page lock Nick Piggin
2010-03-15 18:08 ` Jan Kara
2010-03-16  2:21   ` Nick Piggin
2010-03-16 11:52     ` Jan Kara [this message]
2010-03-24 13:28     ` Peter Zijlstra
2010-03-25  5:36       ` Nick Piggin
2010-03-25  9:40         ` Peter Zijlstra
2010-03-26  3:18       ` Jamie Lokier
2010-03-26  6:54         ` Peter Zijlstra
2010-03-26 11:54           ` Jamie Lokier
2010-03-24 13:54     ` Peter Zijlstra
2010-03-16  6:20 ` Ingo Molnar
2010-03-16  6:42   ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100316115200.GA5699@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).