All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	Jiri Bohac <jbohac@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH] mm: make page pfmemalloc check more robust
Date: Thu, 13 Aug 2015 11:00:02 +0100	[thread overview]
Message-ID: <20150813100002.GA9854@suse.de> (raw)
In-Reply-To: <1439456364-4530-1-git-send-email-mhocko@kernel.org>

On Thu, Aug 13, 2015 at 10:58:54AM +0200, mhocko@kernel.org wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The patch c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb")
> added the checks for page->pfmemalloc to __skb_fill_page_desc():
> 
>         if (page->pfmemalloc && !page->mapping)
>                 skb->pfmemalloc = true;
> 
> It assumes page->mapping == NULL implies that page->pfmemalloc can be
> trusted.  However, __delete_from_page_cache() can set set page->mapping
> to NULL and leave page->index value alone. Due to being in union, a
> non-zero page->index will be interpreted as true page->pfmemalloc.
> 
> So the assumption is invalid if the networking code can see such a
> page. And it seems it can. We have encountered this with a NFS over
> loopback setup when such a page is attached to a new skbuf. There is no
> copying going on in this case so the page confuses __skb_fill_page_desc
> which interprets the index as pfmemalloc flag and the network stack
> drops packets that have been allocated using the reserves unless they
> are to be queued on sockets handling the swapping which is the case here
> and that leads to hangs when the nfs client waits for a response from
> the server which has been dropped and thus never arrive.
> 
> The struct page is already heavily packed so rather than finding
> another hole to put it in, let's do a trick instead. We can reuse the
> index again but define it to an impossible value (-1UL). This is the
> page index so it should never see the value that large. Replace all
> direct users of page->pfmemalloc by page_is_pfmemalloc which will
> hide this nastiness from unspoiled eyes.
> 
> The information will get lost if somebody wants to use page->index
> obviously but that was the case before and the original code expected
> that the information should be persisted somewhere else if that is
> really needed (e.g. what SLAB and SLUB do).
> 
> Fixes: c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb")
> Cc: stable # 3.6+
> Debugged-by: Vlastimil Babka <vbabka@suse.com>
> Debugged-by: Jiri Bohac <jbohac@suse.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: mhocko@kernel.org
Cc: linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	Jiri Bohac <jbohac@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH] mm: make page pfmemalloc check more robust
Date: Thu, 13 Aug 2015 11:00:02 +0100	[thread overview]
Message-ID: <20150813100002.GA9854@suse.de> (raw)
In-Reply-To: <1439456364-4530-1-git-send-email-mhocko@kernel.org>

On Thu, Aug 13, 2015 at 10:58:54AM +0200, mhocko@kernel.org wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The patch c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb")
> added the checks for page->pfmemalloc to __skb_fill_page_desc():
> 
>         if (page->pfmemalloc && !page->mapping)
>                 skb->pfmemalloc = true;
> 
> It assumes page->mapping == NULL implies that page->pfmemalloc can be
> trusted.  However, __delete_from_page_cache() can set set page->mapping
> to NULL and leave page->index value alone. Due to being in union, a
> non-zero page->index will be interpreted as true page->pfmemalloc.
> 
> So the assumption is invalid if the networking code can see such a
> page. And it seems it can. We have encountered this with a NFS over
> loopback setup when such a page is attached to a new skbuf. There is no
> copying going on in this case so the page confuses __skb_fill_page_desc
> which interprets the index as pfmemalloc flag and the network stack
> drops packets that have been allocated using the reserves unless they
> are to be queued on sockets handling the swapping which is the case here
> and that leads to hangs when the nfs client waits for a response from
> the server which has been dropped and thus never arrive.
> 
> The struct page is already heavily packed so rather than finding
> another hole to put it in, let's do a trick instead. We can reuse the
> index again but define it to an impossible value (-1UL). This is the
> page index so it should never see the value that large. Replace all
> direct users of page->pfmemalloc by page_is_pfmemalloc which will
> hide this nastiness from unspoiled eyes.
> 
> The information will get lost if somebody wants to use page->index
> obviously but that was the case before and the original code expected
> that the information should be persisted somewhere else if that is
> really needed (e.g. what SLAB and SLUB do).
> 
> Fixes: c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb")
> Cc: stable # 3.6+
> Debugged-by: Vlastimil Babka <vbabka@suse.com>
> Debugged-by: Jiri Bohac <jbohac@suse.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

  parent reply	other threads:[~2015-08-13 10:00 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-13  8:58 [PATCH] mm: make page pfmemalloc check more robust mhocko
2015-08-13  8:58 ` mhocko
2015-08-13  8:58 ` mhocko
2015-08-13  9:13 ` Vlastimil Babka
2015-08-13  9:13   ` Vlastimil Babka
2015-08-13  9:31   ` Michal Hocko
2015-08-13  9:31     ` Michal Hocko
2015-08-13 14:40   ` Eric Dumazet
2015-08-13 14:40     ` Eric Dumazet
2015-08-14 13:26     ` Vlastimil Babka
2015-08-14 13:26       ` Vlastimil Babka
2015-08-13 10:00 ` Mel Gorman [this message]
2015-08-13 10:00   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150813100002.GA9854@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=jbohac@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=netdev@vger.kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.