From: "Roedel, Joerg" <Joerg.Roedel@amd.com>
To: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
"H. Peter Anvin" <hpa@zytor.com>,
"x86@kernel.org" <x86@kernel.org>,
Yinghai Lu <yinghai@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Stable Maintainers <stable@kernel.org>
Subject: Re: [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive -V2.
Date: Thu, 19 Aug 2010 19:30:13 +0200 [thread overview]
Message-ID: <20100819173013.GG19773@amd.com> (raw)
In-Reply-To: <20100818183024.GZ3043@sgi.com>
On Wed, Aug 18, 2010 at 02:30:24PM -0400, Robin Holt wrote:
> When I investigated, I noticed that all the zone_data[] structures are
> allocated precisely at the beginning of the individual node's physical
> memory. By simply staggering based upon nodeid, I reduced the average
> down to 0.0006% of every second.
Interesting. Have you measured cache misses for both cases?
> With this patch, the max value did not change. I believe that is a
> combination of cacheline contention updating the zone's vmstat information
> combined with round_jiffies_common spattering unrelated cpus onto the same
> jiffie for their next update. I will investigate those issues seperately.
The max value probably the case where none of the data the code accesses
is actually in the cache. Cache aliasing does not help then so I would
not expect a change in the maximum amount here.
> I had no idea whether to ask stable@kernel.org to pull this back to the
> stable releases. My reading of the stable_kernel_rules.txt criteria is
> only fuzzy as to whether this meets the "oh, that's not good" standard.
> I personally think this meets that criteria, but I am unwilling to defend
> that position too stridently. In the end, I punted and added them to
> the Cc list. We will be asking both SuSE and RedHat to add this to
> their upcoming update releases as we expect it to affect their customers.
I don't think this is stable material. It improves performance and does
not fix a bug. But I am not the one to decide this :-)
>
> arch/x86/mm/numa_64.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> Index: round_jiffies/arch/x86/mm/numa_64.c
> ===================================================================
> --- round_jiffies.orig/arch/x86/mm/numa_64.c 2010-08-18 11:39:20.495141178 -0500
> +++ round_jiffies/arch/x86/mm/numa_64.c 2010-08-18 13:24:42.679080103 -0500
> @@ -198,6 +198,7 @@ setup_node_bootmem(int nodeid, unsigned
> unsigned long start_pfn, last_pfn, nodedata_phys;
> const int pgdat_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
> int nid;
> + unsigned long cache_alias_offset;
> #ifndef CONFIG_NO_BOOTMEM
> unsigned long bootmap_start, bootmap_pages, bootmap_size;
> void *bootmap;
> @@ -221,9 +222,16 @@ setup_node_bootmem(int nodeid, unsigned
> start_pfn = start >> PAGE_SHIFT;
> last_pfn = end >> PAGE_SHIFT;
>
> - node_data[nodeid] = early_node_mem(nodeid, start, end, pgdat_size,
> + /*
> + * Allocate an extra cacheline per node to reduce cacheline
> + * aliasing when scanning all node's node_data.
> + */
> + cache_alias_offset = nodeid * SMP_CACHE_BYTES;
> + node_data[nodeid] = cache_alias_offset +
> + early_node_mem(nodeid, start, end,
> + pgdat_size + cache_alias_offset,
> SMP_CACHE_BYTES);
> - if (node_data[nodeid] == NULL)
> + if (node_data[nodeid] == (void *)cache_alias_offset)
It is cleaner to keep the NULL check here and add the cache_alias_offset
to the pointer after that check.
> return;
> nodedata_phys = __pa(node_data[nodeid]);
> reserve_early(nodedata_phys, nodedata_phys + pgdat_size, "NODE_DATA");
>
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
next prev parent reply other threads:[~2010-08-19 17:28 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-08-18 16:56 [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive Robin Holt
2010-08-18 18:30 ` [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive -V2 Robin Holt
2010-08-19 17:30 ` Roedel, Joerg [this message]
2010-08-19 20:42 ` Robin Holt
2010-08-19 22:02 ` Robin Holt
2010-08-19 22:54 ` H. Peter Anvin
2010-08-20 13:58 ` Robin Holt
2010-08-20 15:03 ` Robin Holt
2010-08-20 16:16 ` H. Peter Anvin
2010-08-21 13:07 ` Robin Holt
2010-08-23 21:42 ` H. Peter Anvin
2010-08-25 11:08 ` Robin Holt
2010-08-25 18:56 ` H. Peter Anvin
2010-08-25 21:49 ` Yinghai Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100819173013.GG19773@amd.com \
--to=joerg.roedel@amd.com \
--cc=holt@sgi.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=stable@kernel.org \
--cc=steiner@sgi.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
--cc=yinghai@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.