From: NeilBrown <neilb@suse.de>
To: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Sage Weil <sage@inktank.com>, Mark Fasheh <mfasheh@suse.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: Use GFP_KERNEL allocation for the page cache in page_cache_read
Date: Thu, 19 Mar 2015 08:38:35 +1100
Message-ID: <20150319083835.2115ba11@notabene.brown>
In-Reply-To: <20150318154540.GN17241@dhcp22.suse.cz>

On Wed, 18 Mar 2015 16:45:40 +0100 Michal Hocko <mhocko@suse.cz> wrote:

> What do you think about this v2? I cannot say I like it, but I
> really dislike the whole mapping_gfp_mask API, to be honest.
> ---
> From d88010d6f5f59d7eb87b691e27e201d12cab9141 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Wed, 18 Mar 2015 16:06:40 +0100
> Subject: [PATCH] mm: Allow __GFP_FS for page_cache_read page cache allocation
> 
> page_cache_read has historically used page_cache_alloc_cold to
> allocate a new page. This means that mapping_gfp_mask is used as the
> base for the gfp_mask. Many filesystems set this mask to
> GFP_NOFS to prevent fs recursion issues. page_cache_read is,
> however, not called from the fs layer, so it doesn't need this
> protection. Even ceph and ocfs2, which call filemap_fault from their
> fault handlers, seem to be OK because they do not take any fs lock
> before invoking the generic implementation.
> 
> The protection might even be harmful. There is a strong push to fail
> GFP_NOFS allocations rather than loop within the allocator indefinitely
> with a very limited reclaim ability. Once we start failing those requests,
> the OOM killer might be triggered prematurely because the page cache
> allocation failure is propagated up the page fault path and ends up in
> pagefault_out_of_memory.
> 
> Add __GFP_FS and __GFP_IO to the gfp mask which is coming from the
> mapping to fix this issue.
> 
> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  mm/filemap.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 968cd8e03d2e..8b50d5eb52b2 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1752,7 +1752,15 @@ static int page_cache_read(struct file *file, pgoff_t offset)
>  	int ret;
>  
>  	do {
> -		page = page_cache_alloc_cold(mapping);
> +		gfp_t page_cache_gfp = mapping_gfp_mask(mapping)|__GFP_COLD;
> +
> +		/*
> +		 * This code is not called from the fs layer so we do not need
> +		 * reclaim recursion protection. !GFP_FS might fail too easily
> +		 * and trigger the OOM killer prematurely.
> +		 */
> +		page_cache_gfp |= __GFP_FS | __GFP_IO;
> +		page = __page_cache_alloc(page_cache_gfp);
>  		if (!page)
>  			return -ENOMEM;
>  

Nearly half the places in the kernel which call mapping_gfp_mask() remove the
__GFP_FS bit.

That suggests to me that it might make sense to have
   mapping_gfp_mask_fs()
and
   mapping_gfp_mask_nofs()

and let the presence of __GFP_FS (and __GFP_IO) be determined by the
call-site rather than the filesystem.
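
The two helpers suggested above might look roughly like the following
user-space sketch. Everything in it is an assumption made for illustration:
the struct, the flag values and the helper bodies are hypothetical stand-ins
rather than kernel code, and whether the _nofs variant should also clear
__GFP_IO is left open.

```c
/* Hypothetical stand-ins for kernel types and flag bits (illustrative values). */
typedef unsigned int gfp_t;

#define __GFP_WAIT 0x10u
#define __GFP_IO   0x40u
#define __GFP_FS   0x80u
#define GFP_NOFS   (__GFP_WAIT | __GFP_IO)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)

/* Minimal model of a mapping: only the gfp mask chosen by the filesystem. */
struct address_space {
	gfp_t gfp_mask;
};

/* Existing accessor, modelled here as a plain field read. */
static gfp_t mapping_gfp_mask(const struct address_space *mapping)
{
	return mapping->gfp_mask;
}

/*
 * Assumed semantics: the call site knows it is outside the fs layer,
 * so reclaim may recurse into the filesystem and issue IO.
 */
static gfp_t mapping_gfp_mask_fs(const struct address_space *mapping)
{
	return mapping_gfp_mask(mapping) | __GFP_FS | __GFP_IO;
}

/*
 * Assumed semantics: the call site may hold fs locks, so __GFP_FS is
 * cleared regardless of what the filesystem put in the mapping.
 */
static gfp_t mapping_gfp_mask_nofs(const struct address_space *mapping)
{
	return mapping_gfp_mask(mapping) & ~__GFP_FS;
}
```

With helpers of this shape, page_cache_read() would ask for
mapping_gfp_mask_fs(mapping), and callers that currently mask out __GFP_FS
by hand would use mapping_gfp_mask_nofs(mapping).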

However I am a bit concerned about drivers/block/loop.c.
Might a filesystem read on the loop block device wait for a page_cache_read()
on the loop-mounted file?  In that case you really don't want __GFP_FS set
when allocating that page.
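
A small user-space sketch of that concern, under a stated assumption: it
supposes the loop driver clears __GFP_IO and __GFP_FS in the backing file's
mapping mask, which is my reading of loop.c and not something established in
this thread. The flag values and both function names are hypothetical. The
point is only that an unconditional OR at the call site puts back bits the
mapping owner removed.

```c
/* Hypothetical stand-ins for kernel types and flag bits (illustrative values). */
typedef unsigned int gfp_t;

#define __GFP_WAIT 0x10u
#define __GFP_IO   0x40u
#define __GFP_FS   0x80u
#define __GFP_COLD 0x100u
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)

struct address_space {
	gfp_t gfp_mask;
};

/* Assumed loop-style restriction: no IO and no fs recursion for this mapping. */
static gfp_t loop_backing_mask(gfp_t old_mask)
{
	return old_mask & ~(__GFP_IO | __GFP_FS);
}

/* Mask computed the way the quoted patch does it in page_cache_read(). */
static gfp_t patched_page_cache_gfp(const struct address_space *mapping)
{
	gfp_t page_cache_gfp = mapping->gfp_mask | __GFP_COLD;

	page_cache_gfp |= __GFP_FS | __GFP_IO;
	return page_cache_gfp;
}
```

If a read on the loop device can end up waiting on such an allocation, the
restored __GFP_FS is what would let reclaim re-enter the filesystem.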

NeilBrown
