Re: [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive -V2.

From: "H. Peter Anvin" <hpa@zytor.com>
To: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	x86@kernel.org, Yinghai Lu <yinghai@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Joerg Roedel <joerg.roedel@amd.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Stable Maintainers <stable@kernel.org>
Subject: Re: [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive -V2.
Date: Fri, 20 Aug 2010 09:16:08 -0700
Message-ID: <4C6EAA48.7070902@zytor.com>
In-Reply-To: <20100820150319.GB3220@sgi.com>

On 08/20/2010 08:03 AM, Robin Holt wrote:
> 
> In short, without the cpu information, I think we are heading back to
> as much of a kludge as I had originally submitted.  We could assume
> the number of sets will always be less than some large value like 16MB,
> but that runs the risk of wasting a large amount of memory.
> 
> Alternatively, we could base the color value upon something very concrete.
> For this particular allocation, we have an array of structures whose
> elements are 1792 bytes long (28 cache lines).  If I specify an offset
> of 29, it merely means the first element of my newly allocated array
> is now going to collide with the first allocation's second element.
> I really see no advantage to further allocating space.  The advantage
> to this method is it entirely removes the processor configuration from
> the question.  It allows me to keep the offset calculation from polluting
> the e820 allocator as well.  Basically, the change remains localized.
> 

That's pretty much the idea, really.  However, I don't think you can
localize the change without making invalid assumptions about the behavior
of lower-level primitives, which, given that changes are in progress, will
cause serious problems.

The issue here is that the e820 allocator (which is about to be axed,
but its successor will need to support similar operations) has a few
parameters that it takes in: (start, end, size, alignment).  It will
return a block at address (addr) fulfilling the requirements:

	start <= addr
	addr+size <= end
	(addr % alignment) == 0

However, for coloring (which is what you're doing here; coloring doesn't
have to be precise), what you really want is for the last constraint to
read like:

	start <= addr
	addr+size <= end
	(addr % alignment) == offset
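
As a rough illustration of the three constraints above, here is a minimal
user-space sketch.  find_colored_addr() is a hypothetical helper, not an
existing kernel or e820 interface; it assumes a single free range
[start, end) and returns UINT64_MAX when no address fits.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): return the lowest addr with
 *   start <= addr, addr + size <= end, (addr % alignment) == offset,
 * or UINT64_MAX if no such address exists in [start, end).
 */
static uint64_t find_colored_addr(uint64_t start, uint64_t end,
                                  uint64_t size, uint64_t alignment,
                                  uint64_t offset)
{
    uint64_t rem, addr;

    /* Reject meaningless parameters and ranges too small for size. */
    if (alignment == 0 || offset >= alignment)
        return UINT64_MAX;
    if (size > end || start > end - size)
        return UINT64_MAX;

    /* Round start up to the next address congruent to offset. */
    rem = start % alignment;
    if (rem <= offset)
        addr = start + (offset - rem);
    else
        addr = start + (alignment - rem) + offset;

    /* addr < start means the rounding wrapped around. */
    if (addr < start || addr > end - size)
        return UINT64_MAX;

    return addr;
}
```

With offset == 0 this reduces to the existing (start, end, size,
alignment) behavior, so the extra parameter would not change current
callers.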

You can leave your alignment at some arbitrarily large value (in the case
of your 1792-byte structure, you can make the observation that
1792*4096 < 8 MiB, so 8 MiB works), and the offset is simply
1792*(node number).  This will over-color massively, of course, *but
you're not allocating memory you don't need* and so it doesn't really
matter.
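
The arithmetic can be checked with a short sketch.  It reads the scheme
as an 8 MiB alignment with a per-node offset of 1792*(node number).  The
names ELEM_SIZE, MAX_MULT, COLOR_ALIGN and node_color_offset() are made up
for illustration, and treating 4096 as the upper bound on the node number
is an assumption taken from the 1792*4096 figure, not something the
kernel defines.

```c
#include <stdint.h>

#define ELEM_SIZE   1792ULL        /* 28 cache lines, from the quoted mail */
#define MAX_MULT    4096ULL        /* assumed bound on the node number */
#define COLOR_ALIGN (8ULL << 20)   /* 8 MiB */

/* 1792 * 4096 = 7340032 bytes, which is below 8 MiB = 8388608 bytes. */
_Static_assert(ELEM_SIZE * MAX_MULT < COLOR_ALIGN,
               "every per-node offset must stay below the alignment");

/* Hypothetical: offset to request for a given node's allocation. */
static uint64_t node_color_offset(unsigned int node)
{
    return ELEM_SIZE * (uint64_t)node;
}
```

Each node's array then starts at a different residue modulo 8 MiB, and
the only space consumed is the array itself.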

However, this does mean that there is a need to be able to pass the
offset parameter down to the allocator.  Not doing that will mean either
wasting a huge amount of memory or relying on internal behavior of the
allocator, which is already scheduled to change.

Does this make sense?

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

Thread overview: 14+ messages
2010-08-18 16:56 [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive Robin Holt
2010-08-18 18:30 ` [Patch] numa:x86_64: Cacheline aliasing makes for_each_populated_zone extremely expensive -V2 Robin Holt
2010-08-19 17:30   ` Roedel, Joerg
2010-08-19 20:42     ` Robin Holt
2010-08-19 22:02       ` Robin Holt
2010-08-19 22:54   ` H. Peter Anvin
2010-08-20 13:58     ` Robin Holt
2010-08-20 15:03       ` Robin Holt
2010-08-20 16:16         ` H. Peter Anvin [this message]
2010-08-21 13:07           ` Robin Holt
2010-08-23 21:42             ` H. Peter Anvin
2010-08-25 11:08               ` Robin Holt
2010-08-25 18:56                 ` H. Peter Anvin
2010-08-25 21:49                   ` Yinghai Lu