linux-mm.kvack.org archive mirror
* [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node
@ 2014-08-29  7:03 Weijie Yang
  2014-08-29  8:12 ` Mel Gorman
  2014-08-29 13:09 ` Johannes Weiner
  0 siblings, 2 replies; 4+ messages in thread
From: Weijie Yang @ 2014-08-29  7:03 UTC
  To: 'Mel Gorman'
  Cc: 'Andrew Morton', 'Rik van Riel',
	'Johannes Weiner', rientjes, 'Weijie Yang',
	'linux-kernel', 'Linux-MM'

When entering the page_alloc slowpath, we wake up kswapd on every pgdat
according to the zonelist and high_zoneidx. However, this does not
take the nodemask into account and could prematurely wake up kswapd on
some unintended nodes.

This patch uses for_each_zone_zonelist_nodemask() instead of
for_each_zone_zonelist() in wake_all_kswapds() to avoid the above situation.

Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
---
 mm/page_alloc.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 18cee0d..29b595a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2457,12 +2457,14 @@ __alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
 static void wake_all_kswapds(unsigned int order,
 			     struct zonelist *zonelist,
 			     enum zone_type high_zoneidx,
-			     struct zone *preferred_zone)
+			     struct zone *preferred_zone,
+			     nodemask_t *nodemask)
 {
 	struct zoneref *z;
 	struct zone *zone;
 
-	for_each_zone_zonelist(zone, z, zonelist, high_zoneidx)
+	for_each_zone_zonelist_nodemask(zone, z, zonelist,
+						high_zoneidx, nodemask)
 		wakeup_kswapd(zone, order, zone_idx(preferred_zone));
 }
 
@@ -2560,7 +2562,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 restart:
 	if (!(gfp_mask & __GFP_NO_KSWAPD))
-		wake_all_kswapds(order, zonelist, high_zoneidx, preferred_zone);
+		wake_all_kswapds(order, zonelist, high_zoneidx,
+				preferred_zone, nodemask);
 
 	/*
 	 * OK, we're below the kswapd watermark and have kicked background
-- 
1.7.10.4
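
The effect of the change can be sketched with a small user-space model. This is only an illustration, not kernel code: `toy_zoneref`, `toy_wake_all_kswapds()` and the bitmask representation of the nodemask are hypothetical names invented for this sketch. It assumes, as in the kernel's zonelist iterators, that a NULL nodemask means "no node restriction", which is the behaviour the old for_each_zone_zonelist() loop had.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 8

/* Hypothetical stand-in for one zonelist entry: a zone of a given type on a node. */
struct toy_zoneref {
	int node;     /* id of the node that owns the zone */
	int zone_idx; /* zone type index, 0 is the lowest */
};

/*
 * Toy model of the patched wake_all_kswapds(): walk the zonelist, skip zones
 * above high_zoneidx, and skip nodes that are not set in *nodemask.
 * Bit n of *nodemask set means node n is allowed. A NULL nodemask disables
 * the node filter, which models the loop before the patch.
 * Marks woken[node] for every node whose kswapd would be poked and returns
 * the number of wakeup calls issued.
 */
static int toy_wake_all_kswapds(const struct toy_zoneref *zl, size_t n,
				int high_zoneidx,
				const unsigned int *nodemask,
				bool woken[TOY_MAX_NODES])
{
	int calls = 0;

	for (size_t i = 0; i < n; i++) {
		assert(zl[i].node >= 0 && zl[i].node < TOY_MAX_NODES);

		if (zl[i].zone_idx > high_zoneidx)
			continue; /* zone type not usable for this allocation */
		if (nodemask && !((*nodemask >> zl[i].node) & 1u))
			continue; /* node excluded by the caller's nodemask */

		woken[zl[i].node] = true;
		calls++;
	}
	return calls;
}
```

With a zonelist spanning nodes 0 and 1 and a nodemask that allows only node 0, passing NULL marks both nodes as woken, while passing the mask marks only node 0.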


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node
  2014-08-29  7:03 [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node Weijie Yang
@ 2014-08-29  8:12 ` Mel Gorman
  2014-08-29  9:12   ` Weijie Yang
  2014-08-29 13:09 ` Johannes Weiner
  1 sibling, 1 reply; 4+ messages in thread
From: Mel Gorman @ 2014-08-29  8:12 UTC
  To: Weijie Yang
  Cc: 'Andrew Morton', 'Rik van Riel',
	'Johannes Weiner', rientjes, 'Weijie Yang',
	'linux-kernel', 'Linux-MM'

On Fri, Aug 29, 2014 at 03:03:19PM +0800, Weijie Yang wrote:
> When entering the page_alloc slowpath, we wake up kswapd on every pgdat
> according to the zonelist and high_zoneidx. However, this does not
> take the nodemask into account and could prematurely wake up kswapd on
> some unintended nodes.
> 
> This patch uses for_each_zone_zonelist_nodemask() instead of
> for_each_zone_zonelist() in wake_all_kswapds() to avoid the above situation.
> 
> Signed-off-by: Weijie Yang <weijie.yang@samsung.com>

Just out of curiosity, did you measure a problem due to this or is
the patch the result of code inspection? It was known that we examined
useless nodes, but that was assumed not to be a problem because the
watermark check should prevent spurious wakeups.  However, we do a cpuset
check and this patch is consistent with that, so regardless of why you
wrote the patch

Acked-by: Mel Gorman <mgorman@suse.de>
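
The two checks mentioned above (the cpuset check and the watermark check) can be pictured as early exits on the wakeup path. The following is a simplified user-space sketch, not the kernel's wakeup_kswapd(): the struct fields, the enum and `toy_wakeup_kswapd()` are hypothetical, and the real watermark test also depends on the allocation order and other state that is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced view of a zone for this sketch. */
struct toy_zone {
	bool populated;           /* zone has any pages at all */
	bool cpuset_allowed;      /* zone's node is permitted by the cpuset */
	unsigned long free_pages; /* current number of free pages */
	unsigned long high_wmark; /* free-page level at which kswapd may rest */
};

enum toy_wake_result {
	TOY_SKIP_EMPTY,    /* nothing to reclaim from */
	TOY_SKIP_CPUSET,   /* filtered by the cpuset check */
	TOY_SKIP_BALANCED, /* filtered by the watermark check */
	TOY_WOKEN          /* kswapd would actually be woken */
};

/*
 * Toy model of the filters a wakeup request passes through. A request for
 * an unintended node still reaches this function and is only dropped if one
 * of the checks happens to fire; filtering by nodemask in the caller avoids
 * the request altogether.
 */
static enum toy_wake_result toy_wakeup_kswapd(const struct toy_zone *zone)
{
	assert(zone);

	if (!zone->populated)
		return TOY_SKIP_EMPTY;
	if (!zone->cpuset_allowed)
		return TOY_SKIP_CPUSET;
	if (zone->free_pages > zone->high_wmark)
		return TOY_SKIP_BALANCED; /* assumed simplification of the watermark test */
	return TOY_WOKEN;
}
```

In this model, a zone on a node outside the nodemask but inside the cpuset is woken as soon as its free pages drop to the watermark, which is the case the patch removes.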

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node
  2014-08-29  8:12 ` Mel Gorman
@ 2014-08-29  9:12   ` Weijie Yang
  0 siblings, 0 replies; 4+ messages in thread
From: Weijie Yang @ 2014-08-29  9:12 UTC
  To: Mel Gorman
  Cc: Weijie Yang, Andrew Morton, Rik van Riel, Johannes Weiner,
	rientjes, linux-kernel, Linux-MM

On Fri, Aug 29, 2014 at 4:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> On Fri, Aug 29, 2014 at 03:03:19PM +0800, Weijie Yang wrote:
>> When entering the page_alloc slowpath, we wake up kswapd on every pgdat
>> according to the zonelist and high_zoneidx. However, this does not
>> take the nodemask into account and could prematurely wake up kswapd on
>> some unintended nodes.
>>
>> This patch uses for_each_zone_zonelist_nodemask() instead of
>> for_each_zone_zonelist() in wake_all_kswapds() to avoid the above situation.
>>
>> Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
>
> Just out of curiosity, did you measure a problem due to this or is
> the patch the result of code inspection? It was known that we examined
> useless nodes, but that was assumed not to be a problem because the
> watermark check should prevent spurious wakeups.  However, we do a cpuset
> check and this patch is consistent with that, so regardless of why you
> wrote the patch

It is a patch due to code review :-)

> Acked-by: Mel Gorman <mgorman@suse.de>
>
> --
> Mel Gorman
> SUSE Labs


* Re: [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node
  2014-08-29  7:03 [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node Weijie Yang
  2014-08-29  8:12 ` Mel Gorman
@ 2014-08-29 13:09 ` Johannes Weiner
  1 sibling, 0 replies; 4+ messages in thread
From: Johannes Weiner @ 2014-08-29 13:09 UTC
  To: Weijie Yang
  Cc: 'Mel Gorman', 'Andrew Morton',
	'Rik van Riel', rientjes, 'Weijie Yang',
	'linux-kernel', 'Linux-MM'

On Fri, Aug 29, 2014 at 03:03:19PM +0800, Weijie Yang wrote:
> When entering the page_alloc slowpath, we wake up kswapd on every pgdat
> according to the zonelist and high_zoneidx. However, this does not
> take the nodemask into account and could prematurely wake up kswapd on
> some unintended nodes.
> 
> This patch uses for_each_zone_zonelist_nodemask() instead of
> for_each_zone_zonelist() in wake_all_kswapds() to avoid the above situation.
> 
> Signed-off-by: Weijie Yang <weijie.yang@samsung.com>

Wow, we have never respected nodemask when waking kswapd, but your
change does make sense to me.

As far as impact goes, this has the chance of reducing reclaim/swapping
for certain configurations.  Higher-order wakeups on an ineligible
zone are more obviously undesirable, but even order-0 rebalancing is
not necessarily a future investment for other allocations on that
node, as other allocations may have access to the free pages of a
third node and overall demand might drop before these are exhausted.
This reminds me of the issue fixed in 3a025760fc15 ("mm: page_alloc:
spill to remote nodes before waking kswapd"), where accidental eager
order-0 rebalancing turned out to be a true waste.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
