public inbox for linux-ia64@vger.kernel.org
* [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory
@ 2007-12-13 15:03 de Dinechin, Christophe (Integrity VM)
  2007-12-13 21:07 ` Luck, Tony
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: de Dinechin, Christophe (Integrity VM) @ 2007-12-13 15:03 UTC (permalink / raw)
  To: linux-ia64@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Picco, Robert W.,
	tony.luck@intel.com

Improve performance of memory allocations on ia64 by avoiding a global TLB purge when only a single page from the file cache needs to be purged. This happens whenever we evict a page from the buffer cache to make room for some other allocation.

Test case: Run 'find /usr -type f | xargs cat > /dev/null' in the background to fill the buffer cache, then run something that uses memory, e.g. 'gmake -j50 install'. Instrumentation showed that the number of global TLB purges went from a few million down to about 170 over a 12-hour run of the above.

The performance impact is particularly noticeable under virtualization, because a virtual TLB is generally both larger and slower to purge than a physical one.

Signed-off-by: Christophe de Dinechin <ddd@hp.com>

---

diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
index 1682fc6..66ae9c4 100644
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -10,6 +10,7 @@
  *              IPI based ptc implementation and A-step IPI implementation.
  * Rohit Seth <rohit.seth@intel.com>
  * Ken Chen <kenneth.w.chen@intel.com>
+ * Christophe de Dinechin <ddd@hp.com>: Avoid ptc.e on memory allocation
  */
 #include <linux/module.h>
 #include <linux/init.h>
@@ -89,9 +90,16 @@ ia64_global_tlb_purge (struct mm_struct *mm, unsigned long start,
 {
        static DEFINE_SPINLOCK(ptcg_lock);

-       if (mm != current->active_mm || !current->mm) {
-               flush_tlb_all();
-               return;
+       struct mm_struct *active_mm = current->active_mm;
+
+       if (mm != active_mm) {
+               /* Restore region IDs for mm */
+               if (mm && active_mm) {
+                       activate_context(mm);
+               } else {
+                       flush_tlb_all();
+                       return;
+               }
        }

        /* HW requires global serialization of ptc.ga.  */
@@ -107,6 +115,10 @@ ia64_global_tlb_purge (struct mm_struct *mm, unsigned long start,
                } while (start < end);
        }
        spin_unlock(&ptcg_lock);
+
+       if (mm != active_mm) {
+               activate_context(active_mm);
+       }
 }

 void
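
For readers who want to compare the two conditions side by side, here is a minimal user-space sketch of the decision the hunk above changes. It is only a model: struct mm_model, enum purge_path and both function names are made up for illustration and do not exist in the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct mm_struct. */
struct mm_model { int id; };

/* Which path a purge request ends up on in this model. */
enum purge_path {
    PURGE_RANGE,          /* purge only the requested range */
    PURGE_RANGE_BORROWED, /* same, after temporarily activating mm's context */
    PURGE_EVERYTHING      /* fall back to flush_tlb_all() */
};

/* Before the patch: any mismatch, or a task without an mm, flushes everything. */
enum purge_path old_purge_path(const struct mm_model *mm,
                               const struct mm_model *active_mm,
                               const struct mm_model *current_mm)
{
    if (mm != active_mm || current_mm == NULL)
        return PURGE_EVERYTHING;
    return PURGE_RANGE;
}

/* After the patch: a mismatch flushes everything only if either mm is NULL. */
enum purge_path new_purge_path(const struct mm_model *mm,
                               const struct mm_model *active_mm)
{
    if (mm == active_mm)
        return PURGE_RANGE;
    if (mm != NULL && active_mm != NULL)
        return PURGE_RANGE_BORROWED;
    return PURGE_EVERYTHING;
}
```

In this model, evicting a page that belongs to some other process (mm != active_mm, both non-NULL) moves from PURGE_EVERYTHING to PURGE_RANGE_BORROWED, which is the case the test above exercises. Note that the new condition also no longer looks at current->mm.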


* RE: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory
  2007-12-13 15:03 [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory de Dinechin, Christophe (Integrity VM)
@ 2007-12-13 21:07 ` Luck, Tony
  2007-12-14 17:17   ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating de Dinechin, Christophe (Integrity VM)
  2007-12-17 10:42 ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating de Dinechin, Christophe (Integrity VM)
  2007-12-18  1:05 ` KAMEZAWA Hiroyuki
  2 siblings, 1 reply; 8+ messages in thread
From: Luck, Tony @ 2007-12-13 21:07 UTC (permalink / raw)
  To: de Dinechin, Christophe (Integrity VM), linux-ia64
  Cc: linux-kernel, Picco, Robert W.

> Test case: Run 'find /usr -type f | xargs cat > /dev/null'
> in the background to fill the buffer cache, then run
> something that uses memory, e.g. 'gmake -j50 install'.
> Instrumentation showed that the number of global TLB
> purges went from a few million down to about 170 over
> a 12-hour run of the above.

What was your system configuration for this test?  I'm running
on a 4 socket Montecito (with HT enabled, so 16 logical cpus)
with 4G of memory.  In the first hour of my test I've only
seen 125 calls where the old code would have called flush_tlb_all().
The new code managed to avoid all of these ... so it is batting
100%. This is out of just over a million calls to
ia64_global_tlb_purge().

So clearly I'm not managing to stress the system as heavily
as you did.

-Tony


* RE: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating
  2007-12-13 21:07 ` Luck, Tony
@ 2007-12-14 17:17   ` de Dinechin, Christophe (Integrity VM)
  2007-12-14 18:19     ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory Luck, Tony
  0 siblings, 1 reply; 8+ messages in thread
From: de Dinechin, Christophe (Integrity VM) @ 2007-12-14 17:17 UTC (permalink / raw)
  To: Luck, Tony, linux-ia64@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Picco, Robert W.

Hi Tony,


Thanks for testing this. I just tried the following on a 4-way virtual machine with 2G of memory running on a 4-way rx5670 host with 14G:

        # One window: Fill the buffer cache with some stuff
        while true; do find / -type f | xargs cat > /dev/null; done

        # Another window: do some memory allocations
        cd /usr/src/linux-2.6.23.9
        while true; do gmake clean; gmake -j50; done

I also added some instrumentation that shows old and new PTC.E in /proc/meminfo. Here is what a grep of PTC.E in /proc/meminfo gives me every second under that load, after waiting until the buffer cache fills up:

        Fri Dec 14 23:27:16 CET 2007: PTC.E: 1161198 (old) 0 (new)
        Fri Dec 14 23:27:17 CET 2007: PTC.E: 1166233 (old) 0 (new)
        Fri Dec 14 23:27:19 CET 2007: PTC.E: 1174342 (old) 0 (new)
        Fri Dec 14 23:27:20 CET 2007: PTC.E: 1175436 (old) 0 (new)
        Fri Dec 14 23:27:21 CET 2007: PTC.E: 1176494 (old) 0 (new)
        Fri Dec 14 23:27:22 CET 2007: PTC.E: 1176756 (old) 0 (new)

As you can see, the global purge rates can be pretty respectable under this kind of load. I chose -j50 to generate enough processes to stress my own system; you may need more with 4G. Check with xosview or similar that the buffer cache fills up memory but is kept relatively small by user-space memory pressure (at around 5-10% for my own testing).
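
To put a rough number on "respectable": between the first and last samples above, the (old) counter goes from 1161198 to 1176756 over 6 seconds, i.e. about 2593 avoided PTC.E per second on average. A small sketch of that arithmetic follows; purge_rate is a hypothetical helper name, not part of the instrumentation.

```c
#include <assert.h>

/*
 * Average number of events per second between two samples of a
 * monotonically increasing counter taken `seconds` apart.
 */
double purge_rate(unsigned long count_start, unsigned long count_end,
                  unsigned long seconds)
{
    assert(seconds > 0);
    assert(count_end >= count_start);
    return (double)(count_end - count_start) / (double)seconds;
}
```

For the samples above, purge_rate(1161198UL, 1176756UL, 6UL) gives 2593.0. The per-interval figures vary a lot (from 262 up to 5035 in a single second), so the average is only indicative.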

If for some reason the same effect does not show up on your system, then that would be very interesting... Please let me know.


Regards
Christophe


* RE: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory
  2007-12-14 17:17   ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating de Dinechin, Christophe (Integrity VM)
@ 2007-12-14 18:19     ` Luck, Tony
  0 siblings, 0 replies; 8+ messages in thread
From: Luck, Tony @ 2007-12-14 18:19 UTC (permalink / raw)
  To: de Dinechin, Christophe (Integrity VM), linux-ia64
  Cc: linux-kernel, Picco, Robert W.

> As you can see, the global purge rates can be pretty respectable
> under this kind of load. I chose -j50 to generate enough processes
> to stress my own system; you may need more with 4G. Check with
> xosview or similar that the buffer cache fills up memory but
> is kept relatively small by user-space memory pressure (at
> around 5-10% for my own testing).

My 4G of memory was indeed the problem ... in two ways:
  1) I didn't install "Everything" on this machine.  So the
     'find /usr -type f | xargs cat' was only juggling with
     just over 2G of files, which all fit in the page cache!
  2) I'd assumed you had used -j50 because you were running
     on some humungous superdome system with that many cpus.
     I was only using -j16 ... which probably fit into the
     remaining available memory.

So I moved the 'find' to /home (which has >7G of files), increased
the -j factor, and just to make really sure ran a little program
that did a malloc() & mlock() of 2G of memory.
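
A minimal sketch of what such a malloc() & mlock() helper might look like is shown here; it is not the actual program that was run, and pin_memory / unpin_memory are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>

/*
 * Allocate `bytes` of memory and lock it into RAM so it cannot be
 * paged out. Returns the buffer, or NULL with errno set on failure.
 */
void *pin_memory(size_t bytes)
{
    void *buf;

    if (bytes == 0) {
        errno = EINVAL;
        return NULL;
    }

    buf = malloc(bytes);
    if (buf == NULL)
        return NULL;

    /* mlock() faults the pages in and keeps them resident. */
    if (mlock(buf, bytes) != 0) {
        int saved = errno;

        free(buf);
        errno = saved;
        return NULL;
    }
    return buf;
}

/* Unlock and free a buffer obtained from pin_memory(). */
void unpin_memory(void *buf, size_t bytes)
{
    if (buf == NULL)
        return;
    munlock(buf, bytes);
    free(buf);
}
```

Calling pin_memory((size_t)2 << 30) and then sleeping would approximate the 2G case on a 64-bit system. Locking that much memory normally needs root or a raised RLIMIT_MEMLOCK, otherwise mlock() fails and the helper returns NULL.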

I've been running for about 20 minutes and already see just over half
a million cases where your patch avoided flush_tlb_all() (at the
moment it is managing to do so in every case).

Do you know what the call sequence looks like for the few cases
where your patch doesn't manage to avoid it (you mentioned just 170
times out of several million in the patch submission)?

-Tony



* RE: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating
  2007-12-13 15:03 [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory de Dinechin, Christophe (Integrity VM)
  2007-12-13 21:07 ` Luck, Tony
@ 2007-12-17 10:42 ` de Dinechin, Christophe (Integrity VM)
  2007-12-18  1:05 ` KAMEZAWA Hiroyuki
  2 siblings, 0 replies; 8+ messages in thread
From: de Dinechin, Christophe (Integrity VM) @ 2007-12-17 10:42 UTC (permalink / raw)
  To: linux-ia64

Luck, Tony wrote:
> Do you know what the call sequence looks like for the few cases where
> your patch doesn't manage to avoid it (you mentioned just 170 times out
> of several million in the patch submission)?

One of the remaining call paths (at least on RedHat 2.6.9-42.EL) occurs during module loading / unloading, as the stack trace below shows. This one seems quite legitimate to me. I don't know if there are others.

 [<a00000010005b3d0>] local_flush_tlb_all+0xd0/0x1e0
 [<a000000100053cd0>] smp_flush_tlb_all+0x70/0xa0
 [<a000000100119e50>] unmap_vm_area+0x390/0x3c0
 [<a00000010011a820>] __remove_vm_area+0x80/0xe0
 [<a00000010011a8b0>] remove_vm_area+0x30/0x80
 [<a00000010011a980>] __vunmap+0x80/0x2a0
 [<a00000010011ac40>] vfree+0xa0/0xc0
 [<a000000100050430>] module_free+0x90/0xc0


Regards
Christophe


* Re: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating
  2007-12-13 15:03 [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory de Dinechin, Christophe (Integrity VM)
  2007-12-13 21:07 ` Luck, Tony
  2007-12-17 10:42 ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating de Dinechin, Christophe (Integrity VM)
@ 2007-12-18  1:05 ` KAMEZAWA Hiroyuki
  2007-12-18  6:00   ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory Kyle McMartin
  2 siblings, 1 reply; 8+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-12-18  1:05 UTC (permalink / raw)
  To: de Dinechin, Christophe (Integrity VM)
  Cc: linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org,
	Picco, Robert W., tony.luck@intel.com

On Thu, 13 Dec 2007 15:03:07 +0000

> +       if (mm != active_mm) {
> +               /* Restore region IDs for mm */
> +               if (mm && active_mm) {
> +                       activate_context(mm);
> +               } else {
> +                       flush_tlb_all();
> +                       return;
> +               }
>         }
should be 

preempt_disable();
activate_context(mm);
preempt_enable();

?
(from comments for activate_context()).


Thanks,
-Kame



* Re: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory
  2007-12-18  1:05 ` KAMEZAWA Hiroyuki
@ 2007-12-18  6:00   ` Kyle McMartin
  2007-12-18  6:26     ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 8+ messages in thread
From: Kyle McMartin @ 2007-12-18  6:00 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: de Dinechin, Christophe (Integrity VM),
	linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org,
	Picco, Robert W., tony.luck@intel.com

On Tue, Dec 18, 2007 at 10:05:45AM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 13 Dec 2007 15:03:07 +0000
> 
> > +       if (mm != active_mm) {
> > +               /* Restore region IDs for mm */
> > +               if (mm && active_mm) {
> > +                       activate_context(mm);
> > +               } else {
> > +                       flush_tlb_all();
> > +                       return;
> > +               }
> >         }
> should be 
> 

platform_global_tlb_purge is already (and afaict, only) called under
preempt_disable. then again, the sn2 global_tlb_purge
does it, so possibly for completeness' sake it should be added here as
well?

regards, kyle


* Re: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating
  2007-12-18  6:00   ` [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory Kyle McMartin
@ 2007-12-18  6:26     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 8+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-12-18  6:26 UTC (permalink / raw)
  To: Kyle McMartin
  Cc: de Dinechin, Christophe (Integrity VM),
	linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org,
	Picco, Robert W., tony.luck@intel.com

On Tue, 18 Dec 2007 01:00:15 -0500
Kyle McMartin <kyle@mcmartin.ca> wrote:

> On Tue, Dec 18, 2007 at 10:05:45AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Thu, 13 Dec 2007 15:03:07 +0000
> > 
> > > +       if (mm != active_mm) {
> > > +               /* Restore region IDs for mm */
> > > +               if (mm && active_mm) {
> > > +                       activate_context(mm);
> > > +               } else {
> > > +                       flush_tlb_all();
> > > +                       return;
> > > +               }
> > >         }
> > should be 
> > 
> 
> platform_global_tlb_purge is already (and afaict, only) called under
> preempt_disable. then again, the sn2 global_tlb_purge
> does it, so possibly for completeness' sake it should be added here as
> well?
> 
Thank you. I see. flush_tlb_range() is the only caller.

...maybe adding a comment would be helpful (for me :).
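
As a rough illustration of the kind of documented precondition I mean, here is a user-space model, not kernel code. Every name in it (mm_model, preempt_depth, loaded_mm and the model_* functions) is hypothetical; the only thing taken from this thread is the assumption that the caller already runs with preemption disabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct mm_model { int region_id; };

int preempt_depth;                /* models the preempt-disable nesting count */
const struct mm_model *loaded_mm; /* models whose region IDs are loaded */

void model_preempt_disable(void) { preempt_depth++; }

void model_preempt_enable(void)
{
    assert(preempt_depth > 0);
    preempt_depth--;
}

/* Precondition: the caller has preemption disabled. */
void model_activate_context(const struct mm_model *mm)
{
    assert(preempt_depth > 0);
    loaded_mm = mm;
}

/*
 * Models the patched purge: borrow mm's context, purge, restore.
 * Precondition: the caller has preemption disabled.
 * Returns the context that was loaded while the purge ran.
 */
const struct mm_model *model_global_purge(const struct mm_model *mm,
                                          const struct mm_model *active_mm)
{
    const struct mm_model *during_purge;

    assert(preempt_depth > 0);
    if (mm != active_mm)
        model_activate_context(mm);
    during_purge = loaded_mm; /* the range purge would happen here */
    if (mm != active_mm)
        model_activate_context(active_mm);
    return during_purge;
}

/* Models the caller, which owns the preempt-disabled section. */
const struct mm_model *model_flush_range(const struct mm_model *mm,
                                         const struct mm_model *active_mm)
{
    const struct mm_model *during_purge;

    model_preempt_disable();
    during_purge = model_global_purge(mm, active_mm);
    model_preempt_enable();
    return during_purge;
}
```

The point of the model is only that the purge function states and checks its precondition instead of repeating the disable/enable pair, so a future second caller that forgets to disable preemption would trip the assertion.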

Thanks,
-Kame

