All of lore.kernel.org
 help / color / mirror / Atom feed
From: Aaron Lu <aaron.lu@intel.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Huang Ying <ying.huang@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Kemi Wang <kemi.wang@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>, Michal Hocko <mhocko@suse.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Matthew Wilcox <willy@infradead.org>,
	Daniel Jordan <daniel.m.jordan@oracle.com>
Subject: [RFC PATCH v2 4/4] mm/free_pcppages_bulk: reduce overhead of cluster operation on free path
Date: Tue, 20 Mar 2018 16:54:52 +0800	[thread overview]
Message-ID: <20180320085452.24641-5-aaron.lu@intel.com> (raw)
In-Reply-To: <20180320085452.24641-1-aaron.lu@intel.com>

After "no_merge for order 0", the biggest overhead in free path for
order 0 pages is now add_to_cluster(). As pages are freed one by one,
it caused frequent operation of add_to_cluster().

Ideally, if only one migratetype pcp list has pages to free and
count=pcp->batch in free_pcppages_bulk(), we can avoid calling
add_to_cluster() one time per page but adding them in one go as
a single cluster. Let's call this ideal case as single_mt and
single_mt_unmovable represents when only unmovable pcp list has
pages and count in free_pcppages_bulk() equals to pcp->batch.
Ditto for single_mt_movable and single_mt_reclaimable.

I added some counters to see how often this ideal case is. On my
desktop, after boot:

free_pcppages_bulk:	6268
single_mt:		3885 (62%)

free_pcppages_bulk means the number of time this function gets called.
single_mt means number of times when only one pcp migratetype list has
pages to be freed and count=pcp->batch.

single_mt can be further devided into the following 3 cases:
single_mt_unmovable:	 263 (4%)
single_mt_movable:	2566 (41%)
single_mt_reclaimable:	1056 (17%)

After kbuild with a distro kconfig:

free_pcppages_bulk:	9100508
single_mt:		8440310 (93%)

Again, single_mt can be further devided:
single_mt_unmovable:	    290 (0%)
single_mt_movable:	8435483 (92.75%)
single_mt_reclaimable:	   4537 (0.05%)

Considering capturing the case of single_mt_movable requires fewer
lines of code and it is the most often ideal case, I think capturing
this case alone is enough.

This optimization brings zone->lock contention down from 25% to
almost zero again using the parallel free workload.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
 mm/page_alloc.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ac93833a2877..ad15e4ef99d6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1281,6 +1281,36 @@ static bool bulkfree_pcp_prepare(struct page *page)
 }
 #endif /* CONFIG_DEBUG_VM */
 
+static inline bool free_cluster_pages(struct zone *zone, struct list_head *list,
+				      int mt, int count)
+{
+	struct cluster *c;
+	struct page *page, *n;
+
+	if (!can_skip_merge(zone, 0))
+		return false;
+
+	if (count != this_cpu_ptr(zone->pageset)->pcp.batch)
+		return false;
+
+	c = new_cluster(zone, count, list_first_entry(list, struct page, lru));
+	if (unlikely(!c))
+		return false;
+
+	list_for_each_entry_safe(page, n, list, lru) {
+		set_page_order(page, 0);
+		set_page_merge_skipped(page);
+		page->cluster = c;
+		list_add(&page->lru, &zone->free_area[0].free_list[mt]);
+	}
+
+	INIT_LIST_HEAD(list);
+	zone->free_area[0].nr_free += count;
+	__mod_zone_page_state(zone, NR_FREE_PAGES, count);
+
+	return true;
+}
+
 /*
  * Frees a number of pages from the PCP lists
  * Assumes all pages on list are in same zone, and of same order.
@@ -1295,9 +1325,9 @@ static bool bulkfree_pcp_prepare(struct page *page)
 static void free_pcppages_bulk(struct zone *zone, int count,
 					struct per_cpu_pages *pcp)
 {
-	int migratetype = 0;
-	int batch_free = 0;
-	bool isolated_pageblocks;
+	int migratetype = MIGRATE_MOVABLE;
+	int batch_free = 0, saved_count = count;
+	bool isolated_pageblocks, single_mt = false;
 	struct page *page, *tmp;
 	LIST_HEAD(head);
 
@@ -1319,8 +1349,11 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		} while (list_empty(list));
 
 		/* This is the only non-empty list. Free them all. */
-		if (batch_free == MIGRATE_PCPTYPES)
+		if (batch_free == MIGRATE_PCPTYPES) {
 			batch_free = count;
+			if (batch_free == saved_count)
+				single_mt = true;
+		}
 
 		do {
 			unsigned long pfn, buddy_pfn;
@@ -1359,9 +1392,14 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	spin_lock(&zone->lock);
 	isolated_pageblocks = has_isolate_pageblock(zone);
 
+	if (!isolated_pageblocks && single_mt)
+		free_cluster_pages(zone, &head, migratetype, saved_count);
+
 	/*
 	 * Use safe version since after __free_one_page(),
 	 * page->lru.next will not point to original list.
+	 *
+	 * If free_cluster_pages() succeeds, head will be an empty list here.
 	 */
 	list_for_each_entry_safe(page, tmp, &head, lru) {
 		int mt = get_pcppage_migratetype(page);
-- 
2.14.3

  parent reply	other threads:[~2018-03-20  8:53 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20  8:54 [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free Aaron Lu
2018-03-20  8:54 ` [RFC PATCH v2 1/4] mm/page_alloc: use helper functions to add/remove a page to/from buddy Aaron Lu
2018-03-20 11:35   ` Vlastimil Babka
2018-03-20 13:50     ` Aaron Lu
2018-03-20  8:54 ` [RFC PATCH v2 2/4] mm/__free_one_page: skip merge for order-0 page unless compaction failed Aaron Lu
2018-03-20 11:45   ` Vlastimil Babka
2018-03-20 14:11     ` Aaron Lu
2018-03-21  7:53       ` Vlastimil Babka
2018-03-22 17:15       ` Matthew Wilcox
2018-03-22 18:39         ` Daniel Jordan
2018-03-22 18:50           ` Matthew Wilcox
2018-03-20 22:58   ` Figo.zhang
2018-03-21  1:59     ` Aaron Lu
2018-03-21  4:21       ` Figo.zhang
2018-03-21  4:53         ` Aaron Lu
2018-03-21  5:59           ` Figo.zhang
2018-03-21  7:42             ` Aaron Lu
2018-03-20  8:54 ` [RFC PATCH v2 3/4] mm/rmqueue_bulk: alloc without touching individual page structure Aaron Lu
2018-03-20 22:29   ` Figo.zhang
2018-03-21  1:52     ` Aaron Lu
2018-03-21 12:55   ` Vlastimil Babka
2018-03-21 15:01     ` Aaron Lu
2018-03-29 19:16       ` Daniel Jordan
2018-03-20  8:54 ` Aaron Lu [this message]
2018-03-21 17:44 ` [RFC PATCH v2 0/4] Eliminate zone->lock contention for will-it-scale/page_fault1 and parallel free Daniel Jordan
2018-03-22  1:30   ` Aaron Lu
2018-03-22 11:20     ` Daniel Jordan
2018-03-29 19:19 ` Daniel Jordan
2018-03-30  1:42   ` Aaron Lu
2018-03-30 14:27     ` Daniel Jordan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180320085452.24641-5-aaron.lu@intel.com \
    --to=aaron.lu@intel.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=daniel.m.jordan@oracle.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=kemi.wang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.