From: Ingo Molnar <mingo@elte.hu>
To: Pardo <pardo@google.com>
Cc: akpm@linux-foundation.org, hugh@veritas.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, briangrant@google.com,
cgd@google.com, mbligh@google.com,
Ulrich Drepper <drepper@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Arjan van de Ven <arjan@infradead.org>
Subject: Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?
Date: Wed, 13 Aug 2008 12:44:45 +0200
Message-ID: <20080813104445.GA24632@elte.hu>
In-Reply-To: <af8810200808121745h596c175bk348d0aaeeb9bcb45@mail.gmail.com>
* Pardo <pardo@google.com> wrote:
> As example, in one case creating new threads goes from about 35,000
> cycles up to about 25,000,000 cycles -- which is under 100 threads per
> second. [...]
> Various things would address the slow pthread_create(). Choices
> include:
> - Be more platform-aware about when to use MAP_32BIT.
> - Abandon use of MAP_32BIT entirely, with worse performance on some machines.
> - Change the mmap() algorithm to be faster on allocation failure
> (avoid a linear search of vmas).
Sigh, unfortunately MAP_32BIT use in 64-bit apps for stacks was
apparently created without foresight about what would happen in the MM
when thread stacks exhaust 4GB.
The problem is that MAP_32BIT is used both as a performance hack for
64-bit apps and as an ABI compat mechanism for 32-bit apps. So we cannot
just start disregarding MAP_32BIT in the kernel - we'd break 32-bit
compat apps and/or compat 32-bit libraries.
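For readers unfamiliar with the flag, here is a minimal sketch of the userspace side: request a stack mapping in the low part of the address space with MAP_32BIT, and retry without the flag if that fails. The helper name alloc_stack() and the retry policy are assumptions made for illustration; this is not glibc's actual allocation code. On platforms that do not define MAP_32BIT the sketch simply skips the first attempt.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_32BIT
#define MAP_32BIT 0          /* flag only exists on x86-64 Linux */
#endif
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/*
 * Hypothetical helper: map `size` bytes for a thread stack.
 * Tries a low-address mapping first, then falls back to an
 * unrestricted mapping. Sets *used_low to 1 if the low-address
 * attempt succeeded, 0 otherwise. Returns NULL on failure.
 */
static void *alloc_stack(size_t size, int *used_low)
{
    const int base = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    assert(size > 0 && used_low != NULL);
    *used_low = 0;

    if (MAP_32BIT != 0) {
        p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 base | MAP_32BIT, -1, 0);
        if (p != MAP_FAILED) {
            *used_low = 1;
            return p;
        }
        /* Low region exhausted: this failing attempt is the slow path. */
    }

    p = mmap(NULL, size, PROT_READ | PROT_WRITE, base, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The cost discussed in this thread sits in the first mmap() call once the low region is full: the kernel searches for a free range before reporting failure, and only then does the fallback run.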
There are various other options to solve the (severe!) performance
breakdown:
1- glibc could stop using MAP_32BIT for 64-bit thread stacks (the
boxes where context-switching is slow probably do not matter all that
much anymore - they were very slow at everything 64-bit anyway)
Pros: easiest solution.
Cons: slows down the affected machines and needs a new glibc.
2- We could introduce a new MAP_64BIT_STACK flag which the kernel
would translate into MAP_32BIT on those old CPUs. It would be
disregarded on modern CPUs and thread stacks would be 64-bit.
Pros: cleanest solution.
Cons: needs both new glibc and new kernel to take advantage of.
3- We could detect the first-4G-is-full condition and cache it. Problem
is, there will likely be small holes in it so it's rather hard to do
it in a sane way. Also, every munmap() of a thread stack will
invalidate this - triggering a slow linear search every now and then.
Pros: only needs a new kernel to take advantage of.
Cons: is the most complex and messiest solution with no clear
benefit to other workloads. Also, does not 100% solve the
performance problem and prolongs the 4GB thread-stack
hack.
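To make the trade-off in option 3 concrete, here is a toy model, not kernel code. It assumes the low region is an array of stack-sized slots (so it ignores the "small holes" problem entirely) and uses hypothetical names low_alloc() and low_free(). It shows the cached "full" result skipping the linear search, and any free throwing that cache away.

```c
#include <assert.h>
#include <stdbool.h>

#define LOW_SLOTS 8   /* toy stand-in for the low region, in stack-sized slots */

struct low_region {
    bool used[LOW_SLOTS];
    bool full_cached;      /* cached "no free slot" result */
    unsigned long scans;   /* slots inspected so far (the cost we care about) */
};

/* Returns a slot index, or -1 if the low region is full. */
static int low_alloc(struct low_region *r)
{
    if (r->full_cached)
        return -1;                     /* fast path: no search at all */

    for (int i = 0; i < LOW_SLOTS; i++) {
        r->scans++;
        if (!r->used[i]) {
            r->used[i] = true;
            return i;
        }
    }
    r->full_cached = true;             /* remember the failed search */
    return -1;
}

/* Freeing any slot invalidates the cache, as the message points out. */
static void low_free(struct low_region *r, int slot)
{
    assert(slot >= 0 && slot < LOW_SLOTS);
    r->used[slot] = false;
    r->full_cached = false;
}
```

In this model a workload that alternates low_free() and low_alloc() near the full mark pays for a fresh scan after every free, which is the "slow linear search every now and then" described above.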
i'd go for 1) or 2).
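Option 2 can be sketched as a small flag translation. Everything here is illustrative: SKETCH_MAP_64BIT_STACK uses a placeholder bit value, resolve_stack_flags() is a hypothetical name, and the slow_64bit_switch argument stands in for whatever CPU check a kernel would actually perform. Only the 0x40 value of MAP_32BIT on x86-64 Linux is taken from the real headers.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAP_32BIT       0x40u      /* MAP_32BIT on x86-64 Linux */
#define SKETCH_MAP_64BIT_STACK 0x20000u   /* hypothetical flag, placeholder value */

static_assert((SKETCH_MAP_32BIT & SKETCH_MAP_64BIT_STACK) == 0,
              "sketch flags must not share bits");

/*
 * Translate the hypothetical stack hint into concrete mmap flags.
 * On CPUs where 64-bit context switches are slow, the hint becomes
 * MAP_32BIT; everywhere else it is dropped and the stack may land
 * anywhere in the 64-bit address space. An explicit MAP_32BIT from
 * the caller is left untouched, so compat users keep their semantics.
 */
static unsigned int resolve_stack_flags(unsigned int flags,
                                        bool slow_64bit_switch)
{
    if (!(flags & SKETCH_MAP_64BIT_STACK))
        return flags;

    flags &= ~SKETCH_MAP_64BIT_STACK;
    if (slow_64bit_switch)
        flags |= SKETCH_MAP_32BIT;
    return flags;
}
```

The point of the split is that the userspace library states its intent ("this is a thread stack") and the decision about low addresses moves to the side that knows the CPU.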
Ingo