All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
To: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>,
	Pekka Enberg <penberg-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Joonsoo Kim <iamjoonsoo.kim-Hm3cg6mZ9cc@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
	Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>,
	Jarno Rajahalme
	<jrajahalme-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>,
	Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
	Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org
Subject: Re: [patch v2 1/3] mm: remove GFP_THISNODE
Date: Mon, 02 Mar 2015 14:46:41 +0100	[thread overview]
Message-ID: <54F469C1.9090601@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.10.1502271415510.7225-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>

On 02/27/2015 11:16 PM, David Rientjes wrote:
> NOTE: this is not about __GFP_THISNODE, this is only about GFP_THISNODE.
>
> GFP_THISNODE is a secret combination of gfp bits that have different
> behavior than expected.  It is a combination of __GFP_THISNODE,
> __GFP_NORETRY, and __GFP_NOWARN and is special-cased in the page allocator
> slowpath to fail without trying reclaim even though it may be used in
> combination with __GFP_WAIT.
>
> An example of the problem this creates: commit e97ca8e5b864 ("mm: fix
> GFP_THISNODE callers and clarify") fixed up many users of GFP_THISNODE
> that really just wanted __GFP_THISNODE.  The problem doesn't end there,
> however, because even it was a no-op for alloc_misplaced_dst_page(),
> which also sets __GFP_NORETRY and __GFP_NOWARN, and
> migrate_misplaced_transhuge_page(), where __GFP_NORETRY and __GFP_NOWAIT
> is set in GFP_TRANSHUGE.  Converting GFP_THISNODE to __GFP_THISNODE is
> a no-op in these cases since the page allocator special-cases
> __GFP_THISNODE && __GFP_NORETRY && __GFP_NOWARN.
>
> It's time to just remove GFP_THISNODE entirely.  We leave __GFP_THISNODE
> to restrict an allocation to a local node, but remove GFP_THISNODE and
> its obscurity.  Instead, we require that a caller clear __GFP_WAIT if it
> wants to avoid reclaim.
>
> This allows the aforementioned functions to actually reclaim as they
> should.  It also enables any future callers that want to do
> __GFP_THISNODE but also __GFP_NORETRY && __GFP_NOWARN to reclaim.  The
> rule is simple: if you don't want to reclaim, then don't set __GFP_WAIT.
>
> Aside: ovs_flow_stats_update() really wants to avoid reclaim as well, so
> it is unchanged.
>
> Signed-off-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Acked-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>

So you've convinced me that this is a non-functional change for slab and 
a prerequisity for patch 2/3 which is the main point of this series for 
4.0. That said, the new 'goto nopage' condition is still triggered by a 
combination of flag states, and the less we have those, the better for 
us IMHO.

Looking at commit 952f3b51be which introduced this particular check 
using GFP_THISNODE, it seemed like it was a workaround to avoid 
triggering reclaim, without needing a new gfp flag. Nowadays, we have 
such flag called __GFP_NO_KSWAPD and as I explained in my reply to v1 
(where I missed the new condition), passing the flag would already 
prevent wake_all_kswapds() and treating the allocation as atomic in 
gfp_to_alloc_flags(). So the whole difference would be another 
get_page_from_freelist() attempt (possibly less constrained than the 
fast path one) and then bail out on !wait.

So it would be IMHO better for longer-term maintainability to have the 
relevant __GFP_THISNODE callers pass also __GFP_NO_KSWAPD to denote 
these opportunistic allocation attempts, instead of having this subtle 
semantic difference attached to __GFP_THISNODE && !__GFP_WAIT. It would 
be probably too risky for 4.0. On the other hand, I don't think even 
this series is really urgent. It's true that patch 2/3 fixes two 
commits, including a 4.0 one, but those commits are already not 
regressions without the fix. But maybe current -rcX is low enough to 
proceed with this series and catch any regressions in time, allowing the 
larger cleanups later.

WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>, Pravin Shelar <pshelar@nicira.com>,
	Jarno Rajahalme <jrajahalme@nicira.com>,
	Li Zefan <lizefan@huawei.com>, Greg Thelen <gthelen@google.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	netdev@vger.kernel.org, cgroups@vger.kernel.org,
	dev@openvswitch.org
Subject: Re: [patch v2 1/3] mm: remove GFP_THISNODE
Date: Mon, 02 Mar 2015 14:46:41 +0100	[thread overview]
Message-ID: <54F469C1.9090601@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.10.1502271415510.7225@chino.kir.corp.google.com>

On 02/27/2015 11:16 PM, David Rientjes wrote:
> NOTE: this is not about __GFP_THISNODE, this is only about GFP_THISNODE.
>
> GFP_THISNODE is a secret combination of gfp bits that have different
> behavior than expected.  It is a combination of __GFP_THISNODE,
> __GFP_NORETRY, and __GFP_NOWARN and is special-cased in the page allocator
> slowpath to fail without trying reclaim even though it may be used in
> combination with __GFP_WAIT.
>
> An example of the problem this creates: commit e97ca8e5b864 ("mm: fix
> GFP_THISNODE callers and clarify") fixed up many users of GFP_THISNODE
> that really just wanted __GFP_THISNODE.  The problem doesn't end there,
> however, because even it was a no-op for alloc_misplaced_dst_page(),
> which also sets __GFP_NORETRY and __GFP_NOWARN, and
> migrate_misplaced_transhuge_page(), where __GFP_NORETRY and __GFP_NOWAIT
> is set in GFP_TRANSHUGE.  Converting GFP_THISNODE to __GFP_THISNODE is
> a no-op in these cases since the page allocator special-cases
> __GFP_THISNODE && __GFP_NORETRY && __GFP_NOWARN.
>
> It's time to just remove GFP_THISNODE entirely.  We leave __GFP_THISNODE
> to restrict an allocation to a local node, but remove GFP_THISNODE and
> its obscurity.  Instead, we require that a caller clear __GFP_WAIT if it
> wants to avoid reclaim.
>
> This allows the aforementioned functions to actually reclaim as they
> should.  It also enables any future callers that want to do
> __GFP_THISNODE but also __GFP_NORETRY && __GFP_NOWARN to reclaim.  The
> rule is simple: if you don't want to reclaim, then don't set __GFP_WAIT.
>
> Aside: ovs_flow_stats_update() really wants to avoid reclaim as well, so
> it is unchanged.
>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

So you've convinced me that this is a non-functional change for slab and 
a prerequisity for patch 2/3 which is the main point of this series for 
4.0. That said, the new 'goto nopage' condition is still triggered by a 
combination of flag states, and the less we have those, the better for 
us IMHO.

Looking at commit 952f3b51be which introduced this particular check 
using GFP_THISNODE, it seemed like it was a workaround to avoid 
triggering reclaim, without needing a new gfp flag. Nowadays, we have 
such flag called __GFP_NO_KSWAPD and as I explained in my reply to v1 
(where I missed the new condition), passing the flag would already 
prevent wake_all_kswapds() and treating the allocation as atomic in 
gfp_to_alloc_flags(). So the whole difference would be another 
get_page_from_freelist() attempt (possibly less constrained than the 
fast path one) and then bail out on !wait.

So it would be IMHO better for longer-term maintainability to have the 
relevant __GFP_THISNODE callers pass also __GFP_NO_KSWAPD to denote 
these opportunistic allocation attempts, instead of having this subtle 
semantic difference attached to __GFP_THISNODE && !__GFP_WAIT. It would 
be probably too risky for 4.0. On the other hand, I don't think even 
this series is really urgent. It's true that patch 2/3 fixes two 
commits, including a 4.0 one, but those commits are already not 
regressions without the fix. But maybe current -rcX is low enough to 
proceed with this series and catch any regressions in time, allowing the 
larger cleanups later.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>, Pravin Shelar <pshelar@nicira.com>,
	Jarno Rajahalme <jrajahalme@nicira.com>,
	Li Zefan <lizefan@huawei.com>, Greg Thelen <gthelen@google.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	netdev@vger.kernel.org, cgroups@vger.kernel.org,
	dev@openvswitch.org
Subject: Re: [patch v2 1/3] mm: remove GFP_THISNODE
Date: Mon, 02 Mar 2015 14:46:41 +0100	[thread overview]
Message-ID: <54F469C1.9090601@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.10.1502271415510.7225@chino.kir.corp.google.com>

On 02/27/2015 11:16 PM, David Rientjes wrote:
> NOTE: this is not about __GFP_THISNODE, this is only about GFP_THISNODE.
>
> GFP_THISNODE is a secret combination of gfp bits that have different
> behavior than expected.  It is a combination of __GFP_THISNODE,
> __GFP_NORETRY, and __GFP_NOWARN and is special-cased in the page allocator
> slowpath to fail without trying reclaim even though it may be used in
> combination with __GFP_WAIT.
>
> An example of the problem this creates: commit e97ca8e5b864 ("mm: fix
> GFP_THISNODE callers and clarify") fixed up many users of GFP_THISNODE
> that really just wanted __GFP_THISNODE.  The problem doesn't end there,
> however, because even it was a no-op for alloc_misplaced_dst_page(),
> which also sets __GFP_NORETRY and __GFP_NOWARN, and
> migrate_misplaced_transhuge_page(), where __GFP_NORETRY and __GFP_NOWAIT
> is set in GFP_TRANSHUGE.  Converting GFP_THISNODE to __GFP_THISNODE is
> a no-op in these cases since the page allocator special-cases
> __GFP_THISNODE && __GFP_NORETRY && __GFP_NOWARN.
>
> It's time to just remove GFP_THISNODE entirely.  We leave __GFP_THISNODE
> to restrict an allocation to a local node, but remove GFP_THISNODE and
> its obscurity.  Instead, we require that a caller clear __GFP_WAIT if it
> wants to avoid reclaim.
>
> This allows the aforementioned functions to actually reclaim as they
> should.  It also enables any future callers that want to do
> __GFP_THISNODE but also __GFP_NORETRY && __GFP_NOWARN to reclaim.  The
> rule is simple: if you don't want to reclaim, then don't set __GFP_WAIT.
>
> Aside: ovs_flow_stats_update() really wants to avoid reclaim as well, so
> it is unchanged.
>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

So you've convinced me that this is a non-functional change for slab and 
a prerequisity for patch 2/3 which is the main point of this series for 
4.0. That said, the new 'goto nopage' condition is still triggered by a 
combination of flag states, and the less we have those, the better for 
us IMHO.

Looking at commit 952f3b51be which introduced this particular check 
using GFP_THISNODE, it seemed like it was a workaround to avoid 
triggering reclaim, without needing a new gfp flag. Nowadays, we have 
such flag called __GFP_NO_KSWAPD and as I explained in my reply to v1 
(where I missed the new condition), passing the flag would already 
prevent wake_all_kswapds() and treating the allocation as atomic in 
gfp_to_alloc_flags(). So the whole difference would be another 
get_page_from_freelist() attempt (possibly less constrained than the 
fast path one) and then bail out on !wait.

So it would be IMHO better for longer-term maintainability to have the 
relevant __GFP_THISNODE callers pass also __GFP_NO_KSWAPD to denote 
these opportunistic allocation attempts, instead of having this subtle 
semantic difference attached to __GFP_THISNODE && !__GFP_WAIT. It would 
be probably too risky for 4.0. On the other hand, I don't think even 
this series is really urgent. It's true that patch 2/3 fixes two 
commits, including a 4.0 one, but those commits are already not 
regressions without the fix. But maybe current -rcX is low enough to 
proceed with this series and catch any regressions in time, allowing the 
larger cleanups later.


  parent reply	other threads:[~2015-03-02 13:46 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-26  0:23 [patch 1/2] mm: remove GFP_THISNODE David Rientjes
2015-02-26  0:23 ` David Rientjes
2015-02-26  0:56 ` Christoph Lameter
2015-02-26  0:56   ` Christoph Lameter
2015-02-26  1:04   ` David Rientjes
2015-02-26  1:04     ` David Rientjes
2015-02-26  8:30 ` Vlastimil Babka
2015-02-26  8:30   ` Vlastimil Babka
2015-02-27  3:09   ` David Rientjes
2015-02-27  3:09     ` David Rientjes
2015-02-27  7:34     ` Vlastimil Babka
2015-02-27  7:34       ` Vlastimil Babka
2015-02-27 22:03       ` David Rientjes
2015-02-27 22:03         ` David Rientjes
2015-02-27 22:19         ` Vlastimil Babka
2015-02-27 22:19           ` Vlastimil Babka
2015-02-27 22:31           ` David Rientjes
2015-02-27 22:31             ` David Rientjes
2015-02-27 22:52             ` Vlastimil Babka
2015-02-27 22:52               ` Vlastimil Babka
     [not found] ` <alpine.DEB.2.10.1502251621010.10303-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2015-02-27 22:16   ` [patch v2 1/3] " David Rientjes
2015-02-27 22:16     ` David Rientjes
2015-02-27 22:16     ` David Rientjes
2015-02-27 22:17     ` [patch v2 2/3] mm, thp: really limit transparent hugepage allocation to local node David Rientjes
2015-02-27 22:17       ` David Rientjes
     [not found]       ` <alpine.DEB.2.10.1502271416580.7225-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2015-03-02 13:47         ` Vlastimil Babka
2015-03-02 13:47           ` Vlastimil Babka
2015-03-02 13:47           ` Vlastimil Babka
2015-02-27 22:17     ` [patch v2 3/3] kernel, cpuset: remove exception for __GFP_THISNODE David Rientjes
2015-02-27 22:17       ` David Rientjes
2015-03-02 13:47       ` Vlastimil Babka
2015-03-02 13:47         ` Vlastimil Babka
2015-02-27 22:53     ` [patch v2 1/3] mm: remove GFP_THISNODE Christoph Lameter
2015-02-27 22:53       ` Christoph Lameter
2015-02-28  3:21       ` David Rientjes
2015-02-28  3:21         ` David Rientjes
     [not found]     ` <alpine.DEB.2.10.1502271415510.7225-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2015-03-02 13:46       ` Vlastimil Babka [this message]
2015-03-02 13:46         ` Vlastimil Babka
2015-03-02 13:46         ` Vlastimil Babka
2015-03-02 15:46         ` Christoph Lameter
2015-03-02 15:46           ` Christoph Lameter
2015-03-02 16:02           ` Vlastimil Babka
2015-03-02 16:02             ` Vlastimil Babka
2015-03-02 16:08             ` Christoph Lameter
2015-03-02 16:08               ` Christoph Lameter
     [not found]               ` <alpine.DEB.2.11.1503021007030.6245-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>
2015-03-02 16:23                 ` Vlastimil Babka
2015-03-02 16:23                   ` Vlastimil Babka
2015-03-02 16:23                   ` Vlastimil Babka
2015-03-02 20:40                   ` David Rientjes
2015-03-02 20:40                     ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54F469C1.9090601@suse.cz \
    --to=vbabka-alswssmvlrq@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org \
    --cc=dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org \
    --cc=gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=iamjoonsoo.kim-Hm3cg6mZ9cc@public.gmane.org \
    --cc=jrajahalme-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
    --cc=mgorman-l3A5Bk7waGM@public.gmane.org \
    --cc=netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=penberg-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org \
    --cc=rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.