* [Patch 2/3] Free off node page tables instead of placing on the quicklist.
From: Robin Holt @ 2005-02-26 14:26 UTC
To: linux-ia64
Tony,
This patch is simple but necessary for large NUMA configurations.
It ensures that only pages from the local node are added to a
CPU's quicklist. This prevents pages from being trapped on a remote
node's quicklist when a process starts, touches a large number of pages
to fill pmd and pte entries, migrates to another node, and then unmaps
or exits. Under those conditions the pages get trapped, and if the
machine has more than 100 nodes of the same size, the calculated
pgtable high water mark will be larger than any single node, so page
table cache flushing will never occur.
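The node-local rule described above can be illustrated outside the kernel. The following is only a user-space sketch under stated assumptions: the `model_page` and `model_cpu` types and both functions are hypothetical stand-ins, not kernel APIs, and `free()` stands in for returning a page to its home node's allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_page {
	int node;                 /* node whose memory backs this page */
	struct model_page *next;  /* link used while on a quicklist */
};

struct model_cpu {
	int node;                      /* node this CPU belongs to */
	struct model_page *quicklist;  /* per-CPU cache of freed page-table pages */
	size_t cached;                 /* number of pages on the quicklist */
};

/* Allocate a model page that claims to live on the given node. */
static struct model_page *model_page_alloc(int node)
{
	struct model_page *pg = malloc(sizeof(*pg));

	if (pg) {
		pg->node = node;
		pg->next = NULL;
	}
	return pg;
}

/*
 * Returns 1 if the page was cached on the CPU's quicklist, or 0 if it
 * was off-node and was released immediately instead of being cached.
 */
static int model_quicklist_free(struct model_cpu *cpu, struct model_page *pg)
{
	assert(cpu != NULL && pg != NULL);

	if (pg->node != cpu->node) {
		free(pg);  /* off-node page: never enters this CPU's cache */
		return 0;
	}

	pg->next = cpu->quicklist;  /* local page: push onto the quicklist */
	cpu->quicklist = pg;
	cpu->cached++;
	return 1;
}
```

In this model a page allocated on node 0 and freed by a CPU on node 1 is released at once, so it cannot accumulate on the node 1 quicklist.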
I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without
this patch and did not notice any change.
On an sn2 machine, there was a slight improvement, which is possibly
due to pages from other nodes trapped on the test node before starting
the run. I did not investigate further.
Signed-off-by: Robin Holt <holt@sgi.com>
Before:
Process fork+exit: 184.2333 microseconds
Process fork+exit: 184.7241 microseconds
Process fork+exit: 184.0333 microseconds
Process fork+exit: 185.6667 microseconds
Process fork+exit: 185.4000 microseconds
Process fork+exit: 184.6000 microseconds
Process fork+exit: 184.1333 microseconds
Process fork+exit: 184.3667 microseconds
Process fork+exit: 184.7667 microseconds
Process fork+exit: 183.7097 microseconds
Process fork+execve: 188.5172 microseconds
Process fork+execve: 190.0000 microseconds
Process fork+execve: 189.7931 microseconds
Process fork+execve: 190.2414 microseconds
Process fork+execve: 190.5517 microseconds
Process fork+execve: 190.5172 microseconds
Process fork+execve: 191.0000 microseconds
Process fork+execve: 189.9310 microseconds
Process fork+execve: 191.2069 microseconds
Process fork+execve: 190.8276 microseconds
After:
Process fork+exit: 180.8065 microseconds
Process fork+exit: 182.4286 microseconds
Process fork+exit: 184.0333 microseconds
Process fork+exit: 183.3226 microseconds
Process fork+exit: 182.6333 microseconds
Process fork+exit: 183.4000 microseconds
Process fork+exit: 183.4667 microseconds
Process fork+exit: 182.1935 microseconds
Process fork+exit: 182.0667 microseconds
Process fork+exit: 183.7742 microseconds
Process fork+execve: 188.1667 microseconds
Process fork+execve: 188.6071 microseconds
Process fork+execve: 187.5333 microseconds
Process fork+execve: 188.9286 microseconds
Process fork+execve: 188.4333 microseconds
Process fork+execve: 187.6000 microseconds
Process fork+execve: 187.6333 microseconds
Process fork+execve: 188.5333 microseconds
Process fork+execve: 187.9655 microseconds
Process fork+execve: 186.3667 microseconds
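The runs listed above can be summarized by their averages. This is a minimal sketch, assuming a plain arithmetic mean is an adequate summary; the array names, `NSAMPLES`, and `mean_us` are illustrative and not part of lmbench.

```c
#include <assert.h>
#include <stddef.h>

/* Timings in microseconds, copied from the lat_proc output listed above. */
static const double fork_exit_before[] = {
	184.2333, 184.7241, 184.0333, 185.6667, 185.4000,
	184.6000, 184.1333, 184.3667, 184.7667, 183.7097,
};
static const double fork_exit_after[] = {
	180.8065, 182.4286, 184.0333, 183.3226, 182.6333,
	183.4000, 183.4667, 182.1935, 182.0667, 183.7742,
};
static const double fork_execve_before[] = {
	188.5172, 190.0000, 189.7931, 190.2414, 190.5517,
	190.5172, 191.0000, 189.9310, 191.2069, 190.8276,
};
static const double fork_execve_after[] = {
	188.1667, 188.6071, 187.5333, 188.9286, 188.4333,
	187.6000, 187.6333, 188.5333, 187.9655, 186.3667,
};

/* Number of elements in a statically sized array. */
#define NSAMPLES(a) (sizeof(a) / sizeof((a)[0]))

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean_us(const double *samples, size_t n)
{
	double sum = 0.0;

	assert(samples != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	return n ? sum / (double)n : 0.0;
}
```

Calling `mean_us(fork_exit_before, NSAMPLES(fork_exit_before))` and the matching "after" call gives the two averages to compare for each test.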
pgalloc.h | 10 ++++++++++
1 files changed, 10 insertions(+)
Index: linux-2.6/include/asm-ia64/pgalloc.h
=================================
--- linux-2.6.orig/include/asm-ia64/pgalloc.h 2005-02-25 14:40:02.208212833 -0600
+++ linux-2.6/include/asm-ia64/pgalloc.h 2005-02-25 15:10:32.929665721 -0600
@@ -49,6 +49,16 @@
 static inline void
 pgtable_quicklist_free (void *pgtable_entry)
 {
+#ifdef CONFIG_NUMA
+	int pg_node;
+
+	pg_node = page_zone(virt_to_page(pgtable_entry))->zone_pgdat->node_id;
+	if (pg_node != numa_node_id()) {
+		free_page((unsigned long) pgtable_entry);
+		return;
+	}
+#endif
+
 	preempt_disable();
 	*(unsigned long *)pgtable_entry = (unsigned long) local_cpu_data->pgtable_quicklist;
 	local_cpu_data->pgtable_quicklist = (unsigned long *) pgtable_entry;
* Re: [Patch 2/3] Free off node page tables instead of placing on
From: Zou Nan hai @ 2005-02-28 5:23 UTC
To: linux-ia64
On Sat, 2005-02-26 at 22:26, Robin Holt wrote:
> Tony,
>
> This patch is simple but necessary for large numa configurations.
> Index: linux-2.6/include/asm-ia64/pgalloc.h
> =================================
> --- linux-2.6.orig/include/asm-ia64/pgalloc.h 2005-02-25 14:40:02.208212833 -0600
> +++ linux-2.6/include/asm-ia64/pgalloc.h 2005-02-25 15:10:32.929665721 -0600
> @@ -49,6 +49,16 @@
>  static inline void
>  pgtable_quicklist_free (void *pgtable_entry)
>  {
> +#ifdef CONFIG_NUMA
> +	int pg_node;
> +
> +	pg_node = page_zone(virt_to_page(pgtable_entry))->zone_pgdat->node_id;
Maybe you could use the page_to_nid macro here.
> +	if (pg_node != numa_node_id()) {
Would it be better to wrap this condition in unlikely()?
> +		free_page((unsigned long) pgtable_entry);
> +		return;
> +	}
> +#endif
> +
>  	preempt_disable();
>  	*(unsigned long *)pgtable_entry = (unsigned long) local_cpu_data->pgtable_quicklist;
>  	local_cpu_data->pgtable_quicklist = (unsigned long *) pgtable_entry;
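For readers unfamiliar with the second suggestion: the kernel's unlikely() is a thin wrapper around the compiler's __builtin_expect, telling it that the condition is expected to be false so the common path stays straight-line. The following is a user-space sketch only, assuming GCC or Clang; `sketch_unlikely` and `sketch_is_off_node` are hypothetical names, and plain integers stand in for what page_to_nid() and numa_node_id() would return.

```c
#include <assert.h>

/* Same shape as the kernel's unlikely(); renamed to mark it as a stand-in. */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper: returns 1 when a page's node differs from the
 * CPU's node, i.e. when the page should bypass the quicklist.
 */
static int sketch_is_off_node(int page_node, int cpu_node)
{
	assert(page_node >= 0 && cpu_node >= 0);

	/* The hint changes code layout only; the result is unaffected. */
	if (sketch_unlikely(page_node != cpu_node))
		return 1;
	return 0;
}
```

The hint suits this case because most page-table pages are expected to be freed on the node that allocated them.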
* Re: [Patch 2/3] Free off node page tables instead of placing on the quicklist.
From: Robin Holt @ 2005-02-28 11:28 UTC
To: linux-ia64
Tony,
This patch is simple but necessary for large NUMA configurations.
It ensures that only pages from the local node are added to a
CPU's quicklist. This prevents pages from being trapped on a remote
node's quicklist when a process starts, touches a large number of pages
to fill pmd and pte entries, migrates to another node, and then unmaps
or exits. Under those conditions the pages get trapped, and if the
machine has more than 100 nodes of the same size, the calculated
pgtable high water mark will be larger than any single node, so page
table cache flushing will never occur.
I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without
this patch and did not notice any change.
On an sn2 machine, there was a slight improvement, which is possibly
due to pages from other nodes trapped on the test node before starting
the run. I did not investigate further.
Signed-off-by: Robin Holt <holt@sgi.com>
Before:
Process fork+exit: 184.2333 microseconds
Process fork+exit: 184.7241 microseconds
Process fork+exit: 184.0333 microseconds
Process fork+exit: 185.6667 microseconds
Process fork+exit: 185.4000 microseconds
Process fork+exit: 184.6000 microseconds
Process fork+exit: 184.1333 microseconds
Process fork+exit: 184.3667 microseconds
Process fork+exit: 184.7667 microseconds
Process fork+exit: 183.7097 microseconds
Process fork+execve: 188.5172 microseconds
Process fork+execve: 190.0000 microseconds
Process fork+execve: 189.7931 microseconds
Process fork+execve: 190.2414 microseconds
Process fork+execve: 190.5517 microseconds
Process fork+execve: 190.5172 microseconds
Process fork+execve: 191.0000 microseconds
Process fork+execve: 189.9310 microseconds
Process fork+execve: 191.2069 microseconds
Process fork+execve: 190.8276 microseconds
After:
Process fork+exit: 180.8065 microseconds
Process fork+exit: 182.4286 microseconds
Process fork+exit: 184.0333 microseconds
Process fork+exit: 183.3226 microseconds
Process fork+exit: 182.6333 microseconds
Process fork+exit: 183.4000 microseconds
Process fork+exit: 183.4667 microseconds
Process fork+exit: 182.1935 microseconds
Process fork+exit: 182.0667 microseconds
Process fork+exit: 183.7742 microseconds
Process fork+execve: 188.1667 microseconds
Process fork+execve: 188.6071 microseconds
Process fork+execve: 187.5333 microseconds
Process fork+execve: 188.9286 microseconds
Process fork+execve: 188.4333 microseconds
Process fork+execve: 187.6000 microseconds
Process fork+execve: 187.6333 microseconds
Process fork+execve: 188.5333 microseconds
Process fork+execve: 187.9655 microseconds
Process fork+execve: 186.3667 microseconds
pgalloc.h | 7 +++++++
1 files changed, 7 insertions(+)
Index: linux-2.6/include/asm-ia64/pgalloc.h
=================================
--- linux-2.6.orig/include/asm-ia64/pgalloc.h 2005-02-26 08:16:22.345531297 -0600
+++ linux-2.6/include/asm-ia64/pgalloc.h 2005-02-28 05:23:55.672514335 -0600
@@ -49,6 +49,13 @@
 static inline void
 pgtable_quicklist_free (void *pgtable_entry)
 {
+#ifdef CONFIG_NUMA
+	if (unlikely(page_to_nid(virt_to_page(pgtable_entry)) != numa_node_id())) {
+		free_page((unsigned long) pgtable_entry);
+		return;
+	}
+#endif
+
 	preempt_disable();
 	*(unsigned long *)pgtable_entry = (unsigned long) local_cpu_data->pgtable_quicklist;
 	local_cpu_data->pgtable_quicklist = (unsigned long *) pgtable_entry;