From: Fengguang Wu <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Greg Thelen <gthelen@google.com>, Ying Han <yinghan@google.com>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
Minchan Kim <minchan.kim@gmail.com>,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/9] writeback: introduce the pageout work
Date: Fri, 2 Mar 2012 12:48:58 +0800
Message-ID: <20120302044858.GA14802@localhost>
In-Reply-To: <20120301163837.GA13104@quack.suse.cz>
On Thu, Mar 01, 2012 at 05:38:37PM +0100, Jan Kara wrote:
> On Thu 01-03-12 20:36:40, Wu Fengguang wrote:
> > > Please have a think about all of this and see if you can demonstrate
> > > how the iput() here is guaranteed safe.
> >
> > There are already several __iget()/iput() calls inside fs-writeback.c.
> > The existing iput() calls already demonstrate its safety?
> >
> > Basically the flusher works in this way
> >
> > - the dirty inode list i_wb_list does not reference count the inode at all
> >
> > - the flusher thread does something analog to igrab() and set I_SYNC
> > before going off to writeout the inode
> >
> > - evict() will wait for completion of I_SYNC
> Yes, you are right that currently writeback code already holds inode
> references and so it can happen that flusher thread drops the last inode
> reference. But currently that could create problems only if someone waits
> for flusher thread to make progress while effectively blocking e.g.
> truncate from happening. Currently flusher thread handles sync(2) and
> background writeback and filesystems take care to not hold any locks
> blocking IO / truncate while possibly waiting for these.
>
> But with your addition situation changes significantly - now anyone doing
> allocation can block and do allocation from all sorts of places including
> ones where we hold locks blocking other fs activity. The good news is that
> we use GFP_NOFS in such places. So if GFP_NOFS allocation cannot possibly
> depend on a completion of some writeback work, then I'd still be
> comfortable with dropping inode references from writeback code. But Andrew
> is right this at least needs some arguing...
You seem to miss the point that we neither wait nor allocate pages
inside queue_pageout_work(). The final iput() will not block the
random tasks, because they do not wait for the work to complete.

        random task                     flusher thread

        page allocation
          page reclaim
            queue_pageout_work()
              igrab()

                  ......  after a while  ......

                                        execute pageout work
                                          iput()
                                          <work completed>

If the pageout works are not executed quickly, there will be some
reclaim_wait()s, and vmscan will be slowed down. However, vmscan is
not waiting for any specific work to complete, so no loop of
dependencies can form and lead to a deadlock.
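
To make the ordering concrete, here is a rough userspace toy model of
the handoff. It is not kernel code: every toy_* name is made up for
illustration, the real igrab()/iput() do far more, and locking and
I_SYNC are left out entirely. It only shows that the queueing side
pins the inode and returns without waiting, while the reference is
dropped later on the flusher side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct inode: only the reference count matters here. */
struct toy_inode {
	int refcount;
	bool evicted;
};

#define TOY_MAX_WORKS 16

/* Pending pageout works; each queued entry pins one inode. */
struct toy_wb {
	struct toy_inode *works[TOY_MAX_WORKS];
	size_t head, tail;	/* plain FIFO indices, no wrap-around */
};

/* Hypothetical analogue of igrab(): fails once the inode is gone. */
static struct toy_inode *toy_igrab(struct toy_inode *inode)
{
	if (inode->evicted)
		return NULL;
	inode->refcount++;
	return inode;
}

/* Hypothetical analogue of iput(): the last put evicts the inode. */
static void toy_iput(struct toy_inode *inode)
{
	assert(inode->refcount > 0);
	if (--inode->refcount == 0)
		inode->evicted = true;
}

/* Reclaim side: pin the inode, queue the work, return. Never waits. */
static bool toy_queue_pageout_work(struct toy_wb *wb, struct toy_inode *inode)
{
	if (wb->tail == TOY_MAX_WORKS)
		return false;		/* queue full: caller just moves on */
	if (!toy_igrab(inode))
		return false;
	wb->works[wb->tail++] = inode;
	return true;
}

/* Flusher side: execute one queued work, then drop its reference. */
static bool toy_flusher_run_one(struct toy_wb *wb)
{
	struct toy_inode *inode;

	if (wb->head == wb->tail)
		return false;		/* nothing queued */
	inode = wb->works[wb->head++];
	/* ... the writeout of the inode's pages would happen here ... */
	toy_iput(inode);		/* may be the final reference */
	return true;
}
```

In this model the final toy_iput() can only ever run from
toy_flusher_run_one(), and toy_queue_pageout_work() has no path that
waits on it.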
The iput() does have the theoretical possibility of deadlocking the
flusher thread itself (but not the other random tasks). Since the
flusher thread has always been doing iput() without running into such
bugs, we can reasonably expect the new iput() to be as safe in practice.
Thanks,
Fengguang