public inbox for linux-kernel@vger.kernel.org
From: ebiederm@xmission.com (Eric W. Biederman)
To: Christoph Lameter <cl@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>, Tejun Heo <tj@kernel.org>,
	Ingo Molnar <mingo@elte.hu>,
	travis@sgi.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	steiner@sgi.com, Hugh Dickins <hugh@veritas.com>
Subject: Re: regarding the x86_64 zero-based percpu patches
Date: Mon, 12 Jan 2009 09:44:58 -0800
Message-ID: <m1mydwxvtx.fsf@frodo.ebiederm.org>
In-Reply-To: <Pine.LNX.4.64.0901121120280.30369@quilx.com> (Christoph Lameter's message of "Mon, 12 Jan 2009 11:23:27 -0600 (CST)")

Christoph Lameter <cl@linux-foundation.org> writes:

> On Sat, 10 Jan 2009, Rusty Russell wrote:
>
>> > As I was trying to do more stuff per-cpu
>> > (not putting a lot of stuff into per-cpu area but even with small
>> > things limited per-cpu area poses scalability problems), cpu_alloc
>> > seems to fit the bill better.
>>
>> Unfortunately cpu_alloc didn't solve this problem either.
>>
>> We need to grow the areas, but for NUMA layouts it's non-trivial.  I don't
>> like the idea of remapping: one TLB entry per page per cpu is going to suck.
>> Finding pages which are "congruent" with the original percpu pages is more
>> promising, but it will almost certainly need to elbow pages out the way to
>> have a chance of succeeding on a real system.
>
> An allocation automatically falls back to the nearest node on NUMA
> cpu_to_node() gives you the current node.
>
> There are 2M TLB entries on x86_64. If we really get into a high usage
> scenario then the 2M entry makes sense. Average server memory sizes likely
> already are way beyond 10G per box. The higher that goes the more
> reasonable the 2M TLB entry will be.

2M of per cpu data doesn't make sense, and likely indicates a design
flaw somewhere.  It just doesn't make sense to have large amounts of
data allocated per cpu.

The most common user of per cpu data I am aware of is allocating one
word per cpu for counters.
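
To make the "one word per cpu" case concrete, here is a minimal user-space
sketch in C.  It is not kernel code: NR_CPUS_SKETCH, struct
percpu_counter_sketch and both helpers are hypothetical names made up for
illustration, and a plain array stands in for the real per cpu areas.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical cpu count for this sketch */

/* One word per cpu.  Each cpu only ever writes its own slot. */
struct percpu_counter_sketch {
	long count[NR_CPUS_SKETCH];
};

/* Fast path: bump the slot owned by the given cpu, no lock needed. */
static void counter_inc(struct percpu_counter_sketch *c, int cpu)
{
	c->count[cpu]++;
}

/* Slow path: fold all the per cpu words into a single total. */
static long counter_sum(const struct percpu_counter_sketch *c)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += c->count[cpu];
	return sum;
}
```

The whole footprint is one long per cpu, which is why a counter like this
comes nowhere near needing 2M.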

What would be better is simply to:
- Require a lock to access another cpu's per cpu data.
- Do large page allocations for the per cpu data.

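At that point we could grow the per cpu data by simply reallocating it on
each cpu and updating the register that holds the base pointer.  The lock
is what makes the move safe: nobody else can be looking at the old area
while it is being replaced.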
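
A rough user-space sketch of that scheme in C, under stated assumptions:
struct cpu_area and all the helpers are hypothetical names, a plain pointer
stands in for the base register, a C11 atomic_flag spinlock stands in for
whatever lock the kernel would use, and calloc() stands in for a large page
allocation.  It only illustrates the idea and is not a proposed
implementation.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one cpu's per cpu area. */
struct cpu_area {
	atomic_flag lock;	/* taken by remote accessors and by grow */
	char *base;		/* stands in for the base pointer register */
	size_t size;
};

static void area_lock(struct cpu_area *a)
{
	while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void area_unlock(struct cpu_area *a)
{
	atomic_flag_clear_explicit(&a->lock, memory_order_release);
}

static int cpu_area_init(struct cpu_area *a, size_t size)
{
	atomic_flag_clear(&a->lock);
	a->base = calloc(1, size);
	a->size = a->base ? size : 0;
	return a->base ? 0 : -1;
}

/* Owner access: runs on the owning cpu, so it does not take the lock. */
static void cpu_area_write_local(struct cpu_area *a, size_t off, long val)
{
	memcpy(a->base + off, &val, sizeof(val));
}

/* Remote access: must hold the lock so the area cannot move underneath. */
static long cpu_area_read_remote(struct cpu_area *a, size_t off)
{
	long val;

	area_lock(a);
	memcpy(&val, a->base + off, sizeof(val));
	area_unlock(a);
	return val;
}

/* Grow by reallocating and switching the base pointer; run by the owner. */
static int cpu_area_grow(struct cpu_area *a, size_t new_size)
{
	char *new_base, *old_base;

	if (new_size <= a->size)
		return 0;
	new_base = calloc(1, new_size);
	if (!new_base)
		return -1;

	area_lock(a);
	memcpy(new_base, a->base, a->size);	/* offsets stay valid */
	old_base = a->base;
	a->base = new_base;
	a->size = new_size;
	area_unlock(a);

	free(old_base);
	return 0;
}
```

Everything is addressed as an offset from the base, so data written before
a grow is found at the same offset afterwards, and a remote reader never
sees a half-moved area because it has to take the lock first.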

Eric

