linux-btrfs.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: mhocko@kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Dave Chinner <david@fromorbit.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	Jan Kara <jack@suse.cz>, Michal Hocko <mhocko@suse.com>
Subject: Re: [RFC 5/8] ext4: Do not fail journal due to block allocator
Date: Wed, 5 Aug 2015 13:43:36 +0200
Message-ID: <20150805114336.GB5132@quack.suse.cz>
In-Reply-To: <1438768284-30927-6-git-send-email-mhocko@kernel.org>

On Wed 05-08-15 11:51:21, mhocko@kernel.org wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Since "mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM"
> the memory allocator doesn't endlessly loop to satisfy low-order allocations
> and instead fails them to allow callers to handle them gracefully.
> 
> Some of the callers are not yet prepared for this behavior though. The ext4
> block allocator relies solely on GFP_NOFS allocation requests, and
> allocation failures lead to aborting the journal too easily:
> 
> [  345.028333] oom-trash: page allocation failure: order:0, mode:0x50
> [  345.028336] CPU: 1 PID: 8334 Comm: oom-trash Tainted: G        W       4.0.0-nofs3-00006-gdfe9931f5f68 #588
> [  345.028337] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150428_134905-gandalf 04/01/2014
> [  345.028339]  0000000000000000 ffff880005a17708 ffffffff81538a54 ffffffff8107a40f
> [  345.028341]  0000000000000050 ffff880005a17798 ffffffff810fe854 0000000180000000
> [  345.028342]  0000000000000046 0000000000000000 ffffffff81a52100 0000000000000246
> [  345.028343] Call Trace:
> [  345.028348]  [<ffffffff81538a54>] dump_stack+0x4f/0x7b
> [  345.028370]  [<ffffffff810fe854>] warn_alloc_failed+0x12a/0x13f
> [  345.028373]  [<ffffffff81101bd2>] __alloc_pages_nodemask+0x7f3/0x8aa
> [  345.028375]  [<ffffffff810f9933>] pagecache_get_page+0x12a/0x1c9
> [  345.028390]  [<ffffffffa005bc64>] ext4_mb_load_buddy+0x220/0x367 [ext4]
> [  345.028414]  [<ffffffffa006014f>] ext4_free_blocks+0x522/0xa4c [ext4]
> [  345.028425]  [<ffffffffa0054e14>] ext4_ext_remove_space+0x833/0xf22 [ext4]
> [  345.028434]  [<ffffffffa005677e>] ext4_ext_truncate+0x8c/0xb0 [ext4]
> [  345.028441]  [<ffffffffa00342bf>] ext4_truncate+0x20b/0x38d [ext4]
> [  345.028462]  [<ffffffffa003573c>] ext4_evict_inode+0x32b/0x4c1 [ext4]
> [  345.028464]  [<ffffffff8116d04f>] evict+0xa0/0x148
> [  345.028466]  [<ffffffff8116dca8>] iput+0x1a1/0x1f0
> [  345.028468]  [<ffffffff811697b4>] __dentry_kill+0x136/0x1a6
> [  345.028470]  [<ffffffff81169a3e>] dput+0x21a/0x243
> [  345.028472]  [<ffffffff81157cda>] __fput+0x184/0x19b
> [  345.028473]  [<ffffffff81157d29>] ____fput+0xe/0x10
> [  345.028475]  [<ffffffff8105a05f>] task_work_run+0x8a/0xa1
> [  345.028477]  [<ffffffff810452f0>] do_exit+0x3c6/0x8dc
> [  345.028482]  [<ffffffff8104588a>] do_group_exit+0x4d/0xb2
> [  345.028483]  [<ffffffff8104eeeb>] get_signal+0x5b1/0x5f5
> [  345.028488]  [<ffffffff81002202>] do_signal+0x28/0x5d0
> [...]
> [  345.028624] EXT4-fs error (device hdb1) in ext4_free_blocks:4879: Out of memory
> [  345.033097] Aborting journal on device hdb1-8.
> [  345.036339] EXT4-fs (hdb1): Remounting filesystem read-only
> [  345.036344] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
> [  345.036766] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
> [  345.038583] EXT4-fs error (device hdb1) in ext4_ext_remove_space:3048: Journal has aborted
> [  345.049115] EXT4-fs error (device hdb1) in ext4_ext_truncate:4669: Journal has aborted
> [  345.050434] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
> [  345.053064] EXT4-fs error (device hdb1) in ext4_truncate:3668: Journal has aborted
> [  345.053582] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
> [  345.053946] EXT4-fs error (device hdb1) in ext4_orphan_del:2686: Journal has aborted
> [  345.055367] EXT4-fs error (device hdb1) in ext4_reserve_inode_write:4834: Journal has aborted
> 
> The failure is really premature because the GFP_NOFS allocation context is
> very restricted, especially under fs-metadata-heavy loads. Before we
> go with a more sophisticated solution, let's simply imitate the previous
> behavior of non-failing NOFS allocation and use __GFP_NOFAIL for the
> buddy block allocator. I wasn't able to trigger the issue with this
> patch anymore.
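
To make the difference between the two behaviors concrete, here is a rough
userspace model, not kernel code. All names (model_alloc, MODEL_NOFS,
MODEL_NOFAIL, model_failures_left) are made up for illustration and the flag
values are arbitrary. It only assumes what the changelog describes: without
the no-fail flag a failed attempt is handed back to the caller, with it the
allocator keeps retrying until it succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gfp flags; the values are not the kernel's. */
#define MODEL_NOFS   0x1u
#define MODEL_NOFAIL 0x2u

/* How many upcoming attempts will fail (simulated memory pressure). */
static unsigned int model_failures_left;
/* How many attempts have been made since the counter was last reset. */
static unsigned int model_attempts;

/* A single allocation attempt; fails while simulated pressure lasts. */
static void *model_try_alloc(size_t size)
{
	model_attempts++;
	if (model_failures_left > 0) {
		model_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Model of the policy described in the changelog: without MODEL_NOFAIL
 * the first failure is returned to the caller as NULL; with MODEL_NOFAIL
 * the attempt is repeated until it succeeds.
 */
static void *model_alloc(size_t size, unsigned int flags)
{
	assert(size > 0);
	for (;;) {
		void *p = model_try_alloc(size);

		if (p || !(flags & MODEL_NOFAIL))
			return p;
		/* A real allocator would reclaim or wait here before retrying. */
	}
}
```

With three simulated failures pending, model_alloc(64, MODEL_NOFS) returns
NULL after one attempt, which is the case that ends in the journal abort
above. model_alloc(64, MODEL_NOFS | MODEL_NOFAIL) returns a valid pointer on
the fourth attempt, so the caller never sees the failure.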
 
The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.com>

								Honza

> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  fs/ext4/mballoc.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 5b1613a54307..e6361622bfd5 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -992,7 +992,8 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
>  	block = group * 2;
>  	pnum = block / blocks_per_page;
>  	poff = block % blocks_per_page;
> -	page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
> +	page = find_or_create_page(inode->i_mapping, pnum,
> +				   GFP_NOFS|__GFP_NOFAIL);
>  	if (!page)
>  		return -ENOMEM;
>  	BUG_ON(page->mapping != inode->i_mapping);
> @@ -1006,7 +1007,8 @@ static int ext4_mb_get_buddy_page_lock(struct super_block *sb,
>  
>  	block++;
>  	pnum = block / blocks_per_page;
> -	page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
> +	page = find_or_create_page(inode->i_mapping, pnum,
> +				   GFP_NOFS|__GFP_NOFAIL);
>  	if (!page)
>  		return -ENOMEM;
>  	BUG_ON(page->mapping != inode->i_mapping);
> @@ -1158,7 +1160,8 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
>  			 * wait for it to initialize.
>  			 */
>  			page_cache_release(page);
> -		page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
> +		page = find_or_create_page(inode->i_mapping, pnum,
> +					   GFP_NOFS|__GFP_NOFAIL);
>  		if (page) {
>  			BUG_ON(page->mapping != inode->i_mapping);
>  			if (!PageUptodate(page)) {
> @@ -1194,7 +1197,8 @@ ext4_mb_load_buddy(struct super_block *sb, ext4_group_t group,
>  	if (page == NULL || !PageUptodate(page)) {
>  		if (page)
>  			page_cache_release(page);
> -		page = find_or_create_page(inode->i_mapping, pnum, GFP_NOFS);
> +		page = find_or_create_page(inode->i_mapping, pnum,
> +					   GFP_NOFS|__GFP_NOFAIL);
>  		if (page) {
>  			BUG_ON(page->mapping != inode->i_mapping);
>  			if (!PageUptodate(page)) {
> -- 
> 2.5.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
