From: Fengguang Wu <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, Greg Thelen <gthelen@google.com>,
	Ying Han <yinghan@google.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Minchan Kim <minchan.kim@gmail.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>,
	Adrian Hunter <adrian.hunter@intel.com>
Subject: Re: [PATCH 5/9] writeback: introduce the pageout work
Date: Sat, 3 Mar 2012 22:27:45 +0800
Message-ID: <20120303142745.GA17789@localhost>
In-Reply-To: <20120303135558.GA9869@localhost>

[correct email addresses for Artem and Adrian]

On Sat, Mar 03, 2012 at 09:55:58PM +0800, Fengguang Wu wrote:
> On Fri, Mar 02, 2012 at 11:57:00AM -0800, Andrew Morton wrote:
> > On Fri, 2 Mar 2012 18:39:51 +0800
> > Fengguang Wu <fengguang.wu@intel.com> wrote:
> > 
> > > > And I agree it's unlikely but given enough time and people, I
> > > > believe someone will find a way to (inadvertently) trigger this.
> > > 
> > > Right. The pageout works could add lots more iput() calls to the
> > > flusher and turn some hidden, statistically impossible bugs into
> > > real ones.
> > > 
> > > Fortunately the "flusher deadlocks itself" case is easy to detect and
> > > prevent as illustrated in another email.
> > 
> > It would be a heck of a lot safer and saner to avoid the iput().  We
> > know how to do this, so why not do it?
> 
> My concern about the page lock is that it costs more code and sounds like
> hacking around something. It seems we (including me) have been trying
> to shy away from the iput() problem. Since we are unlikely to get rid
> of the already existing iput() calls from the flusher context, why not
> face the problem, sort it out, and use it with confidence in new code?
> 
> Let me try it now. The only way iput() can deadlock the flusher is
> for the iput() path to come back, queue some work, and wait for it.
> Here is the exhaustive list of the queue+wait paths:
> 
> writeback_inodes_sb_nr_if_idle
>   ext4_nonda_switch
>     ext4_page_mkwrite                   # from page fault
>     ext4_da_write_begin                 # from user writes
> 
> writeback_inodes_sb_nr
>   quotactl syscall                      # from syscall
>   __sync_filesystem                     # from sync/umount
>   shrink_liability                      # ubifs
>     make_free_space
>       ubifs_budget_space                # from all over ubifs:
> 
>    2    274  /c/linux/fs/ubifs/dir.c <<ubifs_create>>
>    3    531  /c/linux/fs/ubifs/dir.c <<ubifs_link>>
>    4    586  /c/linux/fs/ubifs/dir.c <<ubifs_unlink>>
>    5    675  /c/linux/fs/ubifs/dir.c <<ubifs_rmdir>>
>    6    731  /c/linux/fs/ubifs/dir.c <<ubifs_mkdir>>
>    7    803  /c/linux/fs/ubifs/dir.c <<ubifs_mknod>>
>    8    871  /c/linux/fs/ubifs/dir.c <<ubifs_symlink>>
>    9   1006  /c/linux/fs/ubifs/dir.c <<ubifs_rename>>
>   10   1009  /c/linux/fs/ubifs/dir.c <<ubifs_rename>>
>   11    246  /c/linux/fs/ubifs/file.c <<write_begin_slow>>
>   12    388  /c/linux/fs/ubifs/file.c <<allocate_budget>>
>   13   1125  /c/linux/fs/ubifs/file.c <<do_truncation>>   <===== deadlockable
>   14   1217  /c/linux/fs/ubifs/file.c <<do_setattr>>
>   15   1381  /c/linux/fs/ubifs/file.c <<update_mctime>>
>   16   1486  /c/linux/fs/ubifs/file.c <<ubifs_vm_page_mkwrite>>
>   17    110  /c/linux/fs/ubifs/ioctl.c <<setflags>>
>   19    122  /c/linux/fs/ubifs/xattr.c <<create_xattr>>
>   20    201  /c/linux/fs/ubifs/xattr.c <<change_xattr>>
>   21    494  /c/linux/fs/ubifs/xattr.c <<remove_xattr>>
> 
> It seems they are all safe except for ubifs. ubifs may actually
> deadlock from the above do_truncation() caller. However it should be

Sorry, do_truncation() is actually called from ubifs_setattr(),
which is not related to iput().

Are there other ways for iput() to call into the above list of ubifs
functions, then start writeback work and wait for it, which would
deadlock the flusher? ubifs_unlink() and perhaps remove_xattr()?

> fixable because the ubifs call to writeback_inodes_sb_nr() sounds like a
> very brute-force writeback-and-wait, and there may well be a better way
> out.
> 
> CCing ubifs developers for possible thoughts..
> 
> Thanks,
> Fengguang
> 
> PS. I'll be traveling next week and won't have much time to reply to
> emails. Sorry about that.
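
The "flusher deadlocks itself" case discussed above (a path running in
the flusher comes back, queues work for the flusher, and waits for it)
can be modeled in userspace. What follows is only a toy Python sketch
under stated assumptions, not kernel code and not the check from the
other email: the Flusher class, queue_and_wait() and iput_like_path()
are hypothetical names invented for illustration. It shows one way the
self-wait could be detected, by comparing the caller with the worker
thread before blocking.

```python
import queue
import threading


class Flusher:
    """Toy single-threaded worker standing in for the flusher thread."""

    def __init__(self):
        self._works = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Execute queued works one at a time, like a single flusher.
        while True:
            func, done, result = self._works.get()
            try:
                result.append(func())
            except Exception as exc:
                # Hand the failure back to the waiter instead of dying.
                result.append(exc)
            finally:
                done.set()

    def queue_and_wait(self, func):
        """Queue func on the flusher and block until it has run."""
        if threading.current_thread() is self._thread:
            # The flusher would wait for work that only it can execute.
            raise RuntimeError("flusher would deadlock waiting on itself")
        done = threading.Event()
        result = []
        self._works.put((func, done, result))
        done.wait()
        return result[0]


flusher = Flusher()


def iput_like_path():
    # Hypothetical stand-in for an iput() path that queues work and waits.
    return flusher.queue_and_wait(lambda: "nested work ran")


# From an ordinary thread, queue+wait is safe.
outer_ok = flusher.queue_and_wait(lambda: "work ran")

# From inside the flusher, the nested queue+wait is detected and refused;
# the worker returns the exception object to the outer waiter.
nested = flusher.queue_and_wait(iput_like_path)
```

Without the current_thread() check, the second call would hang forever:
the only thread able to run the nested work is the one blocked waiting
for it.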


