From: Tejun Heo <tj@kernel.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
rusty@rustcorp.com.au, tglx@linutronix.de, x86@kernel.org,
linux-kernel@vger.kernel.org, hpa@zytor.com, jeremy@goop.org,
cpw@sgi.com
Subject: Re: [PATCHSET x86/core/percpu] implement dynamic percpu allocator
Date: Mon, 23 Feb 2009 09:43:30 +0900 [thread overview]
Message-ID: <49A1F132.8080004@kernel.org> (raw)
In-Reply-To: <20090222193817.GC21320@elte.hu>
Hello, Ingo.
Ingo Molnar wrote:
> Heck no. It is absolutely crazy to complicate __pa()/__va() in
> _any_ way just to 'save' one more 2MB dTLB.
Are __pa()/__va() that hot paths? Or am I over-estimating the cost of
2MB dTLB?
> We'll use that TLB because that is what TLBs are for: to handle
> mapped pages. Yes, in the percpu scheme we are working on we'll
> have a 'dual' mapping for the static percpu area (on 64-bit) but
> mapping aliases have been one of the most basic CPU features for
> the past 15 years ...
>
> Even a single NOP in the __pa()/__va() path is _more_ expensive
> than that TLB, believe me.
Alright, I'll believe you. That actually works very nice for me. :-)
> Look at last year's cheap quad CPU:
>
> Data TLB: 4MB pages, 4-way associative, 32 entries
>
> That's 32x2MB = 64MB of data reach. Our access patterns in the
> kernel tend to be pretty focused as well, so 32 is more than
> enough in practice.
>
> Especially if the pte is cached a TLB fill is very cheap on
> Intel CPUs. So even if we were trashing those 32 entries (which
> we are generally not), having a dTLB for the percpu area is a
> TLB entry well spent.
>
> So lets just do the most simple and most straightforward mapping
> approach which i suggested: it takes advantage of everything, is
> very close to the best possible performance in the cached case -
> and dont worry about hardware resources.
Alright, for NUMA, I'll just remap a large page. For UMA, I already
wrote code to embed it existing large page nicely, so I'll keep it
that way. The added code is only about 40 lines which is localized in
setup_percpu.c and all __init. The NUMA remap also shouldn't take too
much code if the __pa/__va() trick isn't necessary. I'll post the
patches soon.
> The moment you start worrying about hardware resources on that
> level and start 'optimizing' it in software, you've already lost
> it. It leads down to the path of soft-TLB handlers and other
> sillyness. There's no way you can win such a race against
> hardware fundamentals - at least at today's speed of advance in
> the hw space.
Well, I was hoping for not introducing any performance regression
while converting to new allocator. Performance penalty due to TLB
pressure is especially difficult to measure, so avoiding any addition
there makes accepting the new allocator much easier but I gotta admit
that I'm not an expert at x86 micro performance tuning. If you think
the overhead is acceptable, I'm a happy camper.
Thanks.
--
tejun
next prev parent reply other threads:[~2009-02-23 0:44 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-18 12:04 [PATCHSET x86/core/percpu] implement dynamic percpu allocator Tejun Heo
2009-02-18 12:04 ` [PATCH 01/10] vmalloc: call flush_cache_vunmap() from unmap_kernel_range() Tejun Heo
2009-02-19 12:06 ` Nick Piggin
2009-02-19 22:36 ` David Miller
2009-02-18 12:04 ` [PATCH 02/10] module: fix out-of-range memory access Tejun Heo
2009-02-19 12:08 ` Nick Piggin
2009-02-20 7:16 ` Tejun Heo
2009-02-18 12:04 ` [PATCH 03/10] module: reorder module pcpu related functions Tejun Heo
2009-02-18 12:04 ` [PATCH 04/10] alloc_percpu: change percpu_ptr to per_cpu_ptr Tejun Heo
2009-02-18 12:04 ` [PATCH 05/10] alloc_percpu: add align argument to __alloc_percpu Tejun Heo
2009-02-18 12:04 ` [PATCH 06/10] percpu: kill percpu_alloc() and friends Tejun Heo
2009-02-19 0:17 ` Rusty Russell
2009-03-11 18:36 ` Tony Luck
2009-03-11 22:44 ` Rusty Russell
2009-03-12 2:06 ` Tejun Heo
2009-02-18 12:04 ` [PATCH 07/10] vmalloc: implement vm_area_register_early() Tejun Heo
2009-02-19 0:55 ` Tejun Heo
2009-02-19 12:09 ` Nick Piggin
2009-02-18 12:04 ` [PATCH 08/10] vmalloc: add un/map_kernel_range_noflush() Tejun Heo
2009-02-19 12:17 ` Nick Piggin
2009-02-20 1:27 ` Tejun Heo
2009-02-20 7:15 ` Subject: [PATCH 08/10 UPDATED] " Tejun Heo
2009-02-20 8:32 ` Andrew Morton
2009-02-21 3:21 ` Tejun Heo
2009-02-18 12:04 ` [PATCH 09/10] percpu: implement new dynamic percpu allocator Tejun Heo
2009-02-19 10:10 ` Andrew Morton
2009-02-19 11:01 ` Ingo Molnar
2009-02-20 2:45 ` Tejun Heo
2009-02-19 12:07 ` Rusty Russell
2009-02-20 2:35 ` Tejun Heo
2009-02-20 3:04 ` Andrew Morton
2009-02-20 5:29 ` Tejun Heo
2009-02-24 2:52 ` Rusty Russell
2009-02-19 11:51 ` Rusty Russell
2009-02-20 3:01 ` Tejun Heo
2009-02-20 3:02 ` Tejun Heo
2009-02-24 2:56 ` Rusty Russell
2009-02-24 5:27 ` [PATCH tj-percpu] percpu: add __read_mostly to variables which are mostly read only Tejun Heo
2009-02-24 5:47 ` [PATCH 09/10] percpu: implement new dynamic percpu allocator Tejun Heo
2009-02-24 17:41 ` Luck, Tony
2009-02-26 3:17 ` Tejun Heo
2009-02-27 19:41 ` Luck, Tony
2009-02-19 12:36 ` Nick Piggin
2009-02-20 3:04 ` Tejun Heo
2009-02-20 7:30 ` [PATCH UPDATED " Tejun Heo
2009-02-20 8:37 ` Andrew Morton
2009-02-21 3:23 ` Tejun Heo
2009-02-21 3:42 ` [PATCH tj-percpu] percpu: s/size/bytes/g in new percpu allocator and interface Tejun Heo
2009-02-21 7:48 ` Tejun Heo
2009-02-21 7:55 ` [PATCH tj-percpu] percpu: clean up size usage Tejun Heo
2009-02-21 7:56 ` Tejun Heo
2009-02-18 12:04 ` [PATCH 10/10] x86: convert to the new dynamic percpu allocator Tejun Heo
2009-02-18 13:43 ` [PATCHSET x86/core/percpu] implement " Ingo Molnar
2009-02-19 0:31 ` Tejun Heo
2009-02-19 10:51 ` Rusty Russell
2009-02-19 11:06 ` Ingo Molnar
2009-02-19 12:14 ` Rusty Russell
2009-02-20 3:08 ` Tejun Heo
2009-02-20 5:36 ` Tejun Heo
2009-02-20 7:33 ` Tejun Heo
2009-02-19 0:30 ` Tejun Heo
2009-02-19 11:07 ` Ingo Molnar
2009-02-20 3:17 ` Tejun Heo
2009-02-20 9:32 ` Ingo Molnar
2009-02-21 7:10 ` Tejun Heo
2009-02-21 7:33 ` Tejun Heo
2009-02-22 19:38 ` Ingo Molnar
2009-02-23 0:43 ` Tejun Heo [this message]
2009-02-23 10:17 ` Ingo Molnar
2009-02-23 13:38 ` [patch] x86: optimize __pa() to be linear again on 64-bit x86 Ingo Molnar
2009-02-23 14:08 ` Nick Piggin
2009-02-23 14:53 ` Ingo Molnar
2009-02-24 16:00 ` Andi Kleen
2009-02-27 5:57 ` Tejun Heo
2009-02-27 6:57 ` Ingo Molnar
2009-02-27 7:11 ` Tejun Heo
2009-02-22 19:27 ` [PATCHSET x86/core/percpu] implement dynamic percpu allocator Ingo Molnar
2009-02-23 0:47 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49A1F132.8080004@kernel.org \
--to=tj@kernel.org \
--cc=cpw@sgi.com \
--cc=hpa@zytor.com \
--cc=jeremy@goop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=rusty@rustcorp.com.au \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.