From: Mel Gorman <mgorman@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Leon Romanovsky <leon@leon.nu>, Vlastimil Babka <vbabka@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: [PATCH] mm: page_alloc: Fix setting of ZONE_FAIR_DEPLETED on UP v2
Date: Mon, 8 Sep 2014 12:57:18 +0100
Message-ID: <20140908115718.GL17501@suse.de>
In-Reply-To: <CALq1K=JO2b-=iq40RRvK8JFFbrzyH5EyAp5jyS50CeV0P3eQcA@mail.gmail.com>

Commit 4ffeaf35 (mm: page_alloc: reduce cost of the fair zone allocation
policy) arguably broke the fair zone allocation policy on UP with these
hunks.

a/mm/page_alloc.c
b/mm/page_alloc.c
@@ -1612,6 +1612,9 @@ again:
       	}

       	__mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
+       if (zone_page_state(zone, NR_ALLOC_BATCH) == 0 &&
+           !zone_is_fair_depleted(zone))
+               zone_set_flag(zone, ZONE_FAIR_DEPLETED);

       	__count_zone_vm_events(PGALLOC, zone, 1 << order);
       	zone_statistics(preferred_zone, zone, gfp_flags);
@@ -1966,8 +1985,10 @@ zonelist_scan:
               	if (alloc_flags & ALLOC_FAIR) {
                       	if (!zone_local(preferred_zone, zone))
                               	break;
-                       if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0)
+                       if (zone_is_fair_depleted(zone)) {
+                               nr_fair_skipped++;
                               	continue;
+                       }
               	}

A <= check was replaced with ==. On SMP this does not matter because
zone_page_state returns negative values as zero. That clamp exists to
hide per-CPU drift, which is not possible on UP, so UP has no clamp.
Vlastimil Babka correctly pointed out that on UP the counter can step
past zero and go negative due to high-order allocations, in which case
the == 0 check can miss the depletion.

However, Leon Romanovsky pointed out that a <= check on zone_page_state
was never correct, as zone_page_state returns unsigned long, so the root
cause of the breakage was the <= check in the first place.
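
To illustrate the point about the return type, here is a minimal userspace
sketch in C. It is not kernel code: read_counter_unclamped and the two
looks_depleted_* helpers are hypothetical names used only for this example.
It shows that once a negative long is returned as unsigned long, a <= 0
test behaves exactly like == 0 and never sees the negative value.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an accessor that returns unsigned long
 * without clamping, like the UP build before this patch. */
static unsigned long read_counter_unclamped(long raw)
{
	return raw;	/* implicit conversion: negative values become huge */
}

/* A "<= 0" test on an unsigned value can only be true for exactly 0. */
static bool looks_depleted_le(long raw)
{
	return read_counter_unclamped(raw) <= 0;
}

/* The "== 0" test gives the same answer for every input. */
static bool looks_depleted_eq(long raw)
{
	return read_counter_unclamped(raw) == 0;
}
```

With a raw value of -4 both helpers return false, because the value read
back is a very large positive number rather than a negative one.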

zone_page_state is an API hazard because the difference in behaviour
between SMP and UP is very surprising. There is a good reason to allow
NR_ALLOC_BATCH to go negative -- when the counter is reset, the negative
value takes recent activity into account. This patch makes zone_page_state
behave the same on SMP and UP, as saving one branch on UP is not likely to
make a measurable performance difference.
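
As a rough illustration of the behaviour after this patch, the following
userspace sketch in C models the clamped read and the depletion check.
struct zone_model, model_page_state and model_charge are hypothetical names
for this example only. They simplify the kernel's struct zone, vm_stat
counters and zone flags and do not reproduce them.

```c
#include <stdbool.h>

/* Simplified model, not the kernel's struct zone. */
struct zone_model {
	long alloc_batch;	/* stands in for vm_stat[NR_ALLOC_BATCH] */
	bool fair_depleted;	/* stands in for ZONE_FAIR_DEPLETED */
};

/* Clamped read: negative raw values are reported as zero. */
static unsigned long model_page_state(const struct zone_model *z)
{
	long x = z->alloc_batch;

	if (x < 0)
		x = 0;
	return x;
}

/* Charge an allocation of 2^order pages, then flag depletion. */
static void model_charge(struct zone_model *z, unsigned int order)
{
	z->alloc_batch -= 1L << order;
	if (model_page_state(z) == 0 && !z->fair_depleted)
		z->fair_depleted = true;
}
```

Starting from alloc_batch = 3, an order-2 charge leaves the raw counter at
-1. The clamped read returns 0, so the == 0 check marks the zone depleted
while the raw counter keeps its negative value for the next reset.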

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/vmstat.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 82e7db7..cece0f0 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -131,10 +131,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
 					enum zone_stat_item item)
 {
 	long x = atomic_long_read(&zone->vm_stat[item]);
-#ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
-#endif
 	return x;
 }
 
