From: John Dykstra <jdykstra@cray.com>
To: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <mingo@redhat.com>, <linux-kernel@vger.kernel.org>
Subject: [PATCH V2] mm, x86, pat: Improve scaling of pat_pagerange_is_ram()
Date: Fri, 25 May 2012 16:12:46 -0500
Message-ID: <1337980366.1979.6.camel@redwood>
In-Reply-To: <1337208407.1997.49.camel@sbsiddha-desk.sc.intel.com>

On Wed, 2012-05-16 at 15:46 -0700, Suresh Siddha wrote:
> Instead of duplicating what kernel/resource.c:walk_system_ram_range() is
> already doing, can we just provide a callback that can be used with
> walk_system_ram_range() and see if the expected RAM pages is what the
> callback also sees.

The resulting patch is a bit longer than V1.  
---

Function pat_pagerange_is_ram() scales poorly to large address ranges,
because it probes the resource tree for each page.  On a 2.6 GHz
Opteron, this function consumes 34 ms for a 1 GB range.  It is called
twice during untrack_pfn_vma(), slowing process cleanup and handicapping
the OOM killer.

This replacement makes a single call to walk_system_ram_range() with a
callback, and consumes less than 1 ms under the same conditions.

Signed-off-by: John Dykstra <jdykstra@cray.com> on behalf of Cray Inc. 
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
---
 arch/x86/mm/pat.c |   58 ++++++++++++++++++++++++++++++++--------------------
 1 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f6ff57b..246cce8 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -158,31 +158,45 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, unsigned long req_type)
 	return req_type;
 }
 
+struct pagerange_is_ram_state {
+	unsigned long		cur_pfn;
+	int			ram;
+	int			not_ram;
+};
+
+static int pagerange_is_ram_callback(unsigned long initial_pfn,
+				unsigned long total_nr_pages, void *arg)
+{
+	struct pagerange_is_ram_state *state = arg;
+
+	state->not_ram |= initial_pfn > state->cur_pfn;
+	state->ram |= total_nr_pages > 0;
+	state->cur_pfn = initial_pfn + total_nr_pages;
+
+	return state->ram && state->not_ram;
+}
+
 static int pat_pagerange_is_ram(resource_size_t start, resource_size_t end)
 {
-	int ram_page = 0, not_rampage = 0;
-	unsigned long page_nr;
+	int ret = 0;
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long end_pfn = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	struct pagerange_is_ram_state state = {start_pfn, 0, 0};
 
-	for (page_nr = (start >> PAGE_SHIFT); page_nr < (end >> PAGE_SHIFT);
-	     ++page_nr) {
-		/*
-		 * For legacy reasons, physical address range in the legacy ISA
-		 * region is tracked as non-RAM. This will allow users of
-		 * /dev/mem to map portions of legacy ISA region, even when
-		 * some of those portions are listed(or not even listed) with
-		 * different e820 types(RAM/reserved/..)
-		 */
-		if (page_nr >= (ISA_END_ADDRESS >> PAGE_SHIFT) &&
-		    page_is_ram(page_nr))
-			ram_page = 1;
-		else
-			not_rampage = 1;
-
-		if (ram_page == not_rampage)
-			return -1;
-	}
+	/*
+	 * For legacy reasons, physical address range in the legacy ISA
+	 * region is tracked as non-RAM. This will allow users of
+	 * /dev/mem to map portions of legacy ISA region, even when
+	 * some of those portions are listed(or not even listed) with
+	 * different e820 types(RAM/reserved/..)
+	 */
+	if (start_pfn < ISA_END_ADDRESS >> PAGE_SHIFT)
+		start_pfn = ISA_END_ADDRESS >> PAGE_SHIFT;
 
-	return ram_page;
+	if (start_pfn < end_pfn)
+		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
+				&state, pagerange_is_ram_callback);
+	return (ret > 0) ? -1 : (state.ram ? 1 : 0);
 }
 
 /*
-- 
1.7.0.4




Thread overview: 6+ messages
2012-05-14 20:26 [PATCH] mm, x86, pat: Improve scaling of pat_pagerange_is_ram() John Dykstra
2012-05-16 22:46 ` Suresh Siddha
2012-05-18  6:48   ` Ingo Molnar
2012-05-25 21:12   ` John Dykstra [this message]
2012-05-26  0:37     ` [PATCH V2] " Suresh Siddha
2012-05-30 13:34     ` [tip:x86/urgent] x86/mm/pat: " tip-bot for John Dykstra
