public inbox for linux-kernel@vger.kernel.org
* [patch 0/2] x86,pat: Reduce contention on the memtype_lock -V3
@ 2010-03-15 13:21 holt
  2010-03-15 13:21 ` [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3 holt
  2010-03-15 13:21 ` [patch 2/2] x86,pat Convert memtype_lock into an rw_lock holt
  0 siblings, 2 replies; 7+ messages in thread
From: holt @ 2010-03-15 13:21 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Venkatesh Pallipadi, Venkatesh Pallipadi, Suresh Siddha,
	Linux Kernel Mailing List, x86


Tracking memtype on x86 uses a single global spin_lock for both reading
and changing the memory type.  This includes changes made to page flags,
which are perfectly parallelizable.

Part one of the patchset makes the page-based tracking use cmpxchg
without a need for a lock.

Part two of the patchset converts the spin_lock into a read/write lock.


To: Ingo Molnar <mingo@redhat.com>
To: H. Peter Anvin <hpa@zytor.com>
To: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org

---

 arch/x86/include/asm/cacheflush.h |   44 +++++++++++++++++++++-----------------
 arch/x86/mm/pat.c                 |   28 ++++++++----------------
 2 files changed, 35 insertions(+), 37 deletions(-)


* [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3
  2010-03-15 13:21 [patch 0/2] x86,pat: Reduce contention on the memtype_lock -V3 holt
@ 2010-03-15 13:21 ` holt
  2010-03-17 10:18   ` Robin Holt
                     ` (2 more replies)
  2010-03-15 13:21 ` [patch 2/2] x86,pat Convert memtype_lock into an rw_lock holt
  1 sibling, 3 replies; 7+ messages in thread
From: holt @ 2010-03-15 13:21 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Venkatesh Pallipadi, Venkatesh Pallipadi, Suresh Siddha,
	Linux Kernel Mailing List, x86

[-- Attachment #1: memtype_atomic_update_V3 --]
[-- Type: text/plain, Size: 5839 bytes --]


While testing an application that uses the xpmem (out of tree) driver, we
noticed a significantly lower page fault rate on x86_64 than on ia64.
For one test running with 32 cpus, one thread per cpu, it took 1:08
(min:sec) for each of the threads to vm_insert_pfn 2GB worth of pages.
For the same test running on 256 cpus, one thread per cpu, it took 14:48
to vm_insert_pfn 2GB worth of pages.

The slowdown was tracked to lookup_memtype(), which acquires the
memtype_lock spinlock.  This heavily contended lock was slowing down
vm_insert_pfn().

With the cmpxchg-on-page->flags method, both the 32 cpu and 256 cpu
cases take approximately 1.3 seconds to complete.


To: Ingo Molnar <mingo@redhat.com>
To: H. Peter Anvin <hpa@zytor.com>
To: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org

---

Changes since -V2:
1) Cleared up the naming of the masks used in setting and clearing
the flags.


Changes since -V1:
1) Introduced atomic setting and clearing of the page flags instead of
using the global memtype_lock to protect page->flags.

2) This gave me the opportunity to convert the rwlock back into a
spinlock without affecting _MY_ test's performance, as all the pages my
test was using are tracked by struct pages.

3) Corrected the commit log.  The timings were for 32 cpus and not 256.

 arch/x86/include/asm/cacheflush.h |   44 +++++++++++++++++++++-----------------
 arch/x86/mm/pat.c                 |    8 ------
 2 files changed, 25 insertions(+), 27 deletions(-)

Index: linux-next/arch/x86/include/asm/cacheflush.h
===================================================================
--- linux-next.orig/arch/x86/include/asm/cacheflush.h	2010-03-12 19:55:06.690471974 -0600
+++ linux-next/arch/x86/include/asm/cacheflush.h	2010-03-12 19:55:41.846472324 -0600
@@ -44,9 +44,6 @@ static inline void copy_from_user_page(s
 	memcpy(dst, src, len);
 }
 
-#define PG_WC				PG_arch_1
-PAGEFLAG(WC, WC)
-
 #ifdef CONFIG_X86_PAT
 /*
  * X86 PAT uses page flags WC and Uncached together to keep track of
@@ -55,16 +52,24 @@ PAGEFLAG(WC, WC)
  * _PAGE_CACHE_UC_MINUS and fourth state where page's memory type has not
  * been changed from its default (value of -1 used to denote this).
  * Note we do not support _PAGE_CACHE_UC here.
- *
- * Caller must hold memtype_lock for atomicity.
  */
+
+#define _PGMT_DEFAULT		0
+#define _PGMT_WC		PG_arch_1
+#define _PGMT_UC_MINUS		PG_uncached
+#define _PGMT_WB		(PG_uncached | PG_arch_1)
+#define _PGMT_MASK		(PG_uncached | PG_arch_1)
+#define _PGMT_CLEAR_MASK	(~_PGMT_MASK)
+
 static inline unsigned long get_page_memtype(struct page *pg)
 {
-	if (!PageUncached(pg) && !PageWC(pg))
+	unsigned long pg_flags = pg->flags & _PGMT_MASK;
+
+	if (pg_flags == _PGMT_DEFAULT)
 		return -1;
-	else if (!PageUncached(pg) && PageWC(pg))
+	else if (pg_flags == _PGMT_WC)
 		return _PAGE_CACHE_WC;
-	else if (PageUncached(pg) && !PageWC(pg))
+	else if (pg_flags == _PGMT_UC_MINUS)
 		return _PAGE_CACHE_UC_MINUS;
 	else
 		return _PAGE_CACHE_WB;
@@ -72,25 +77,26 @@ static inline unsigned long get_page_mem
 
 static inline void set_page_memtype(struct page *pg, unsigned long memtype)
 {
+	unsigned long memtype_flags = _PGMT_DEFAULT;
+	unsigned long old_flags;
+	unsigned long new_flags;
+
 	switch (memtype) {
 	case _PAGE_CACHE_WC:
-		ClearPageUncached(pg);
-		SetPageWC(pg);
+		memtype_flags = _PGMT_WC;
 		break;
 	case _PAGE_CACHE_UC_MINUS:
-		SetPageUncached(pg);
-		ClearPageWC(pg);
+		memtype_flags = _PGMT_UC_MINUS;
 		break;
 	case _PAGE_CACHE_WB:
-		SetPageUncached(pg);
-		SetPageWC(pg);
-		break;
-	default:
-	case -1:
-		ClearPageUncached(pg);
-		ClearPageWC(pg);
+		memtype_flags = _PGMT_WB;
 		break;
 	}
+
+	do {
+		old_flags = pg->flags;
+		new_flags = (old_flags & _PGMT_CLEAR_MASK) | memtype_flags;
+	} while (cmpxchg(&pg->flags, old_flags, new_flags) != old_flags);
 }
 #else
 static inline unsigned long get_page_memtype(struct page *pg) { return -1; }
Index: linux-next/arch/x86/mm/pat.c
===================================================================
--- linux-next.orig/arch/x86/mm/pat.c	2010-03-12 19:55:06.690471974 -0600
+++ linux-next/arch/x86/mm/pat.c	2010-03-12 19:55:59.434468352 -0600
@@ -190,8 +190,6 @@ static int pat_pagerange_is_ram(unsigned
  * Here we do two pass:
  * - Find the memtype of all the pages in the range, look for any conflicts
  * - In case of no conflicts, set the new memtype for pages in the range
- *
- * Caller must hold memtype_lock for atomicity.
  */
 static int reserve_ram_pages_type(u64 start, u64 end, unsigned long req_type,
 				  unsigned long *new_type)
@@ -297,9 +295,7 @@ int reserve_memtype(u64 start, u64 end, 
 	is_range_ram = pat_pagerange_is_ram(start, end);
 	if (is_range_ram == 1) {
 
-		spin_lock(&memtype_lock);
 		err = reserve_ram_pages_type(start, end, req_type, new_type);
-		spin_unlock(&memtype_lock);
 
 		return err;
 	} else if (is_range_ram < 0) {
@@ -351,9 +347,7 @@ int free_memtype(u64 start, u64 end)
 	is_range_ram = pat_pagerange_is_ram(start, end);
 	if (is_range_ram == 1) {
 
-		spin_lock(&memtype_lock);
 		err = free_ram_pages_type(start, end);
-		spin_unlock(&memtype_lock);
 
 		return err;
 	} else if (is_range_ram < 0) {
@@ -394,10 +388,8 @@ static unsigned long lookup_memtype(u64 
 
 	if (pat_pagerange_is_ram(paddr, paddr + PAGE_SIZE)) {
 		struct page *page;
-		spin_lock(&memtype_lock);
 		page = pfn_to_page(paddr >> PAGE_SHIFT);
 		rettype = get_page_memtype(page);
-		spin_unlock(&memtype_lock);
 		/*
 		 * -1 from get_page_memtype() implies RAM page is in its
 		 * default state and not reserved, and hence of type WB



* [patch 2/2] x86,pat Convert memtype_lock into an rw_lock.
  2010-03-15 13:21 [patch 0/2] x86,pat: Reduce contention on the memtype_lock -V3 holt
  2010-03-15 13:21 ` [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3 holt
@ 2010-03-15 13:21 ` holt
  2010-03-17 16:21   ` Suresh Siddha
  1 sibling, 1 reply; 7+ messages in thread
From: holt @ 2010-03-15 13:21 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Venkatesh Pallipadi, Venkatesh Pallipadi, Suresh Siddha,
	Linux Kernel Mailing List, x86

[-- Attachment #1: memtype_rwlock_V3 --]
[-- Type: text/plain, Size: 3012 bytes --]


Convert the memtype_lock from a spin_lock to an rw_lock.  The first
version of my patch included this change, and it did improve performance
for fault-in times.  The atomic page flags patch (first in the series)
improves things much more for RAM pages; this patch is to help the other
pages.

To: Ingo Molnar <mingo@redhat.com>
To: H. Peter Anvin <hpa@zytor.com>
To: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org

---

 arch/x86/mm/pat.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Index: linux-next/arch/x86/mm/pat.c
===================================================================
--- linux-next.orig/arch/x86/mm/pat.c	2010-03-15 08:13:47.262469632 -0500
+++ linux-next/arch/x86/mm/pat.c	2010-03-15 08:13:47.294446141 -0500
@@ -130,7 +130,7 @@ void pat_init(void)
 
 #undef PAT
 
-static DEFINE_SPINLOCK(memtype_lock);	/* protects memtype accesses */
+static DEFINE_RWLOCK(memtype_lock);	/* protects memtype accesses */
 
 /*
  * Does intersection of PAT memory type and MTRR memory type and returns
@@ -310,7 +310,7 @@ int reserve_memtype(u64 start, u64 end, 
 	new->end	= end;
 	new->type	= actual_type;
 
-	spin_lock(&memtype_lock);
+	write_lock(&memtype_lock);
 
 	err = rbt_memtype_check_insert(new, new_type);
 	if (err) {
@@ -318,12 +318,12 @@ int reserve_memtype(u64 start, u64 end, 
 		       "track %s, req %s\n",
 		       start, end, cattr_name(new->type), cattr_name(req_type));
 		kfree(new);
-		spin_unlock(&memtype_lock);
+		write_unlock(&memtype_lock);
 
 		return err;
 	}
 
-	spin_unlock(&memtype_lock);
+	write_unlock(&memtype_lock);
 
 	dprintk("reserve_memtype added 0x%Lx-0x%Lx, track %s, req %s, ret %s\n",
 		start, end, cattr_name(new->type), cattr_name(req_type),
@@ -354,9 +354,9 @@ int free_memtype(u64 start, u64 end)
 		return -EINVAL;
 	}
 
-	spin_lock(&memtype_lock);
+	write_lock(&memtype_lock);
 	err = rbt_memtype_erase(start, end);
-	spin_unlock(&memtype_lock);
+	write_unlock(&memtype_lock);
 
 	if (err) {
 		printk(KERN_INFO "%s:%d freeing invalid memtype %Lx-%Lx\n",
@@ -400,7 +400,7 @@ static unsigned long lookup_memtype(u64 
 		return rettype;
 	}
 
-	spin_lock(&memtype_lock);
+	read_lock(&memtype_lock);
 
 	entry = rbt_memtype_lookup(paddr);
 	if (entry != NULL)
@@ -408,7 +408,7 @@ static unsigned long lookup_memtype(u64 
 	else
 		rettype = _PAGE_CACHE_UC_MINUS;
 
-	spin_unlock(&memtype_lock);
+	read_unlock(&memtype_lock);
 	return rettype;
 }
 
@@ -748,9 +748,9 @@ static struct memtype *memtype_get_idx(l
 	if (!print_entry)
 		return NULL;
 
-	spin_lock(&memtype_lock);
+	read_lock(&memtype_lock);
 	ret = rbt_memtype_copy_nth_element(print_entry, pos);
-	spin_unlock(&memtype_lock);
+	read_unlock(&memtype_lock);
 
 	if (!ret) {
 		return print_entry;



* Re: [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3
  2010-03-15 13:21 ` [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3 holt
@ 2010-03-17 10:18   ` Robin Holt
  2010-03-17 16:21   ` Suresh Siddha
  2010-03-23 23:21   ` Rafael J. Wysocki
  2 siblings, 0 replies; 7+ messages in thread
From: Robin Holt @ 2010-03-17 10:18 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Venkatesh Pallipadi, Venkatesh Pallipadi, Suresh Siddha,
	Linux Kernel Mailing List, x86

Is there any movement on this?  The problem is easily understood and
the code in this patch is quite clear.  I am having difficulty getting
distros to evaluate this patch because it has not been accepted upstream.
While I understand a careful review process is desirable, I first
submitted these patches for review on 26 Feb.

Thanks,
Robin

On Mon, Mar 15, 2010 at 08:21:04AM -0500, holt@sgi.com wrote:
> 
> While testing an application using the xpmem (out of kernel) driver, we
> noticed a significant page fault rate reduction of x86_64 with respect
> to ia64.  For one test running with 32 cpus, one thread per cpu, it
> took 01:08 for each of the threads to vm_insert_pfn 2GB worth of pages.
> For the same test running on 256 cpus, one thread per cpu, it took 14:48
> to vm_insert_pfn 2 GB worth of pages.
> 
> The slowdown was tracked to lookup_memtype which acquires the
> spinlock memtype_lock.  This heavily contended lock was slowing down
> vm_insert_pfn().
> 
> With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu
> cases take approx 00:01.3 seconds to complete.


* Re: [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3
  2010-03-15 13:21 ` [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3 holt
  2010-03-17 10:18   ` Robin Holt
@ 2010-03-17 16:21   ` Suresh Siddha
  2010-03-23 23:21   ` Rafael J. Wysocki
  2 siblings, 0 replies; 7+ messages in thread
From: Suresh Siddha @ 2010-03-17 16:21 UTC (permalink / raw)
  To: holt@sgi.com
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Pallipadi, Venkatesh, Venkatesh Pallipadi,
	Linux Kernel Mailing List, x86@kernel.org

On Mon, 2010-03-15 at 06:21 -0700, holt@sgi.com wrote:
> While testing an application using the xpmem (out of kernel) driver, we
> noticed a significant page fault rate reduction of x86_64 with respect
> to ia64.  For one test running with 32 cpus, one thread per cpu, it
> took 01:08 for each of the threads to vm_insert_pfn 2GB worth of pages.
> For the same test running on 256 cpus, one thread per cpu, it took 14:48
> to vm_insert_pfn 2 GB worth of pages.
> 
> The slowdown was tracked to lookup_memtype which acquires the
> spinlock memtype_lock.  This heavily contended lock was slowing down
> vm_insert_pfn().
> 
> With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu
> cases take approx 00:01.3 seconds to complete.

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>



* Re: [patch 2/2] x86,pat Convert memtype_lock into an rw_lock.
  2010-03-15 13:21 ` [patch 2/2] x86,pat Convert memtype_lock into an rw_lock holt
@ 2010-03-17 16:21   ` Suresh Siddha
  0 siblings, 0 replies; 7+ messages in thread
From: Suresh Siddha @ 2010-03-17 16:21 UTC (permalink / raw)
  To: holt@sgi.com
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Venkatesh Pallipadi,
	Venkatesh Pallipadi, Linux Kernel Mailing List, x86@kernel.org

On Mon, 2010-03-15 at 06:21 -0700, holt@sgi.com wrote:
> Convert the memtype_lock from a spin_lock to an rw_lock.  The first
> version of my patch had this and it did improve performance for fault
> in times.  The atomic page flags patch (first in the series) improves
> things much greater for ram pages.  This patch is to help the other pages.
> 

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>

x86 folks, can you please queue both of these patches if you don't
have any objections?

thanks,
suresh



* Re: [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3
  2010-03-15 13:21 ` [patch 1/2] x86,pat Update the page flags for memtype atomically instead of using memtype_lock. -V3 holt
  2010-03-17 10:18   ` Robin Holt
  2010-03-17 16:21   ` Suresh Siddha
@ 2010-03-23 23:21   ` Rafael J. Wysocki
  2 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2010-03-23 23:21 UTC (permalink / raw)
  To: holt
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Venkatesh Pallipadi,
	Venkatesh Pallipadi, Suresh Siddha, Linux Kernel Mailing List,
	x86

On Monday 15 March 2010, holt@sgi.com wrote:
> 
> While testing an application using the xpmem (out of kernel) driver, we
> noticed a significant page fault rate reduction of x86_64 with respect
> to ia64.  For one test running with 32 cpus, one thread per cpu, it
> took 01:08 for each of the threads to vm_insert_pfn 2GB worth of pages.
> For the same test running on 256 cpus, one thread per cpu, it took 14:48
> to vm_insert_pfn 2 GB worth of pages.
> 
> The slowdown was tracked to lookup_memtype which acquires the
> spinlock memtype_lock.  This heavily contended lock was slowing down
> vm_insert_pfn().
> 
> With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu
> cases take approx 00:01.3 seconds to complete.
>  #ifdef CONFIG_X86_PAT
>  /*
>   * X86 PAT uses page flags WC and Uncached together to keep track of
> @@ -55,16 +52,24 @@ PAGEFLAG(WC, WC)
>   * _PAGE_CACHE_UC_MINUS and fourth state where page's memory type has not
>   * been changed from its default (value of -1 used to denote this).
>   * Note we do not support _PAGE_CACHE_UC here.
> - *
> - * Caller must hold memtype_lock for atomicity.
>   */
> +
> +#define _PGMT_DEFAULT		0
> +#define _PGMT_WC		PG_arch_1
> +#define _PGMT_UC_MINUS		PG_uncached
> +#define _PGMT_WB		(PG_uncached | PG_arch_1)
> +#define _PGMT_MASK		(PG_uncached | PG_arch_1)
> +#define _PGMT_CLEAR_MASK	(~_PGMT_MASK)
> +

Can we manipulate the PG_* constants this way?  They are just bit
numbers, so _PGMT_WB, for example, should be
((1UL << PG_uncached) | (1UL << PG_arch_1)).

Rafael

