From: Jan Kara <jack@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, Miklos Szeredi <mszeredi@suse.cz>,
	linux-mm@kvack.org, Al Viro <viro@ZenIV.linux.org.uk>,
	Jay <jinshan.xiong@whamcloud.com>,
	stable@kernel.org, Nick Piggin <npiggin@kernel.dk>
Subject: Re: [PATCH] mm: Fix assertion mapping->nrpages == 0 in end_writeback()
Date: Tue, 14 Jun 2011 00:49:24 +0200
Message-ID: <20110613224924.GM4907@quack.suse.cz>
In-Reply-To: <20110613151401.51b539a0.akpm@linux-foundation.org>

On Mon 13-06-11 15:14:01, Andrew Morton wrote:
> On Tue, 14 Jun 2011 00:01:44 +0200
> Jan Kara <jack@suse.cz> wrote:
> > On Wed 08-06-11 18:36:43, Jan Kara wrote:
> > > On Tue 07-06-11 14:33:01, Andrew Morton wrote:
> > > > On Tue, 07 Jun 2011 07:46:37 +0200
> > > > Miklos Szeredi <mszeredi@suse.cz> wrote:
> > > > 
> > > > > > Either way, I don't think that the uglypatch expresses a full
> > > > > > > understanding of the bug ;)
> > > > > 
> > > > > I don't see a better way, how would we make nrpages update atomically
> > > > > wrt the radix-tree while using only RCU?
> > > > > 
> > > > > The question is, does it matter that those two can get temporarily out
> > > > > of sync?
> > > > > 
> > > > > In case of inode eviction it does, not only because of that BUG_ON, but
> > > > > because page reclaim must be somehow synchronised with eviction.
> > > > > Otherwise it may access tree_lock on the mapping of an already freed
> > > > > inode.
> > > > > 
> > > > > In other cases?  AFAICS it doesn't matter.  Most ->nrpages accesses
> > > > > > weren't under tree_lock before Nick's RCUification, so their use was
> > > > > > just an optimization.
> > > > 
> > > > Gee, we've made a bit of a mess here.
> > > > 
> > > > Rather than bodging around particular codesites where that mess exposes
> > > > itself, how about we step back and work out what our design is here,
> > > > then implement it and check that all sites comply with it?
> > > > 
> > > > What is the relationship between the radix-tree and nrpages?  What are
> > > > the locking rules?  Can anyone come up with a one-sentence proposal?
> > > AFAIU, nrpages and radix-tree are consistent under tree_lock.
> > > 
> > > nrpages is only used (well, apart from shmfs and other filesystems which
> > > use the value as a guess at how much they should expect to write, or for
> > > similar heuristics) to test mapping->nrpages == 0, and the test is performed
> > > without any synchronization, which looks natural because we later do only
> > > rcu-protected lookups anyway. So it seems the test is expected to be
> > > unreliable and we just use it to make things faster. The same race as with
> > > the nrpages test can happen during the radix tree lookup anyway...
> > > 
> > > I went through the tests and the only place which seems to really care
> > > about the races with __add_to_page_cache() or __delete_from_page_cache()
> > > is when the inode should be removed from memory. There we have to be
> > > careful. Races with __add_to_page_cache() cannot happen because there is
> > > no one who could trigger the addition of a new page to the inode being evicted.
> > > Races with __delete_from_page_cache() are possible though...
> >   Andrew, any opinion on this? I'd like to get the bug fixed... I'll
> > happily move the nrpages check in end_writeback() under the spinlock if
> > people find that nicer. That place really looks like the only one which
> > depends on nrpages being consistent and up to date.
> 
> That seems a cleaner way of avoiding one manifestation of the bug.
  OK.

> But what *is* the bug?  That we've made nrpages incoherent with the
> state of the tree?  Or is it simply that the rule has always been "you
> must hold tree_lock to access nrpages", and the rcuification exposed
> that?
> 
> I want to actually fix this stuff up and get a good clear design which
> we can describe and understand.  No band-aids, please.  Not in here.
  OK, I believe the rule is "you must hold tree_lock to access nrpages", but
there are plenty of places which don't hold tree_lock and still peek at
nrpages to see if they have anything to do (and they were there even before
the radix tree was rcuified). These checks are inherently racy and usually the
callers don't care, but possibly each such place should carry a comment
explaining why the racy check does not matter...
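
To make the distinction concrete, here is a minimal user-space sketch, not
kernel code. All names in it (struct mapping_model, model_add_page() and so
on) are hypothetical: a pthread mutex stands in for tree_lock, a plain
counter stands in for the radix tree contents, and a relaxed atomic load
stands in for an unlocked peek at nrpages. It only illustrates the rule,
assuming the two values are always updated together under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct address_space. */
struct mapping_model {
	pthread_mutex_t tree_lock;  /* models mapping->tree_lock */
	unsigned long tree_entries; /* models the radix tree contents */
	atomic_ulong nrpages;       /* models mapping->nrpages */
};

static void mapping_model_init(struct mapping_model *m)
{
	pthread_mutex_init(&m->tree_lock, NULL);
	m->tree_entries = 0;
	atomic_init(&m->nrpages, 0);
}

/* Tree and counter change together, so they agree under tree_lock. */
static void model_add_page(struct mapping_model *m)
{
	pthread_mutex_lock(&m->tree_lock);
	m->tree_entries++;
	atomic_fetch_add_explicit(&m->nrpages, 1, memory_order_relaxed);
	pthread_mutex_unlock(&m->tree_lock);
}

static void model_delete_page(struct mapping_model *m)
{
	pthread_mutex_lock(&m->tree_lock);
	assert(m->tree_entries > 0);
	m->tree_entries--;
	atomic_fetch_sub_explicit(&m->nrpages, 1, memory_order_relaxed);
	pthread_mutex_unlock(&m->tree_lock);
}

/*
 * Racy peek: the answer may be stale by the time the caller acts on it.
 * Only suitable as a "is there possibly anything to do?" hint.
 */
static bool model_maybe_has_pages(struct mapping_model *m)
{
	return atomic_load_explicit(&m->nrpages, memory_order_relaxed) != 0;
}

/*
 * Check for an eviction-like path: taken under tree_lock, so it cannot
 * observe a deletion that has updated only one of the two values.
 */
static bool model_is_empty(struct mapping_model *m)
{
	bool empty;

	pthread_mutex_lock(&m->tree_lock);
	empty = m->tree_entries == 0 &&
		atomic_load_explicit(&m->nrpages, memory_order_relaxed) == 0;
	pthread_mutex_unlock(&m->tree_lock);
	return empty;
}
```

In this model a caller that only wants to skip work would use
model_maybe_has_pages() and tolerate a stale answer, while a path that
asserts emptiness before freeing would use model_is_empty().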

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


