From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>, Andi Kleen <ak@linux.intel.com>,
Peter Anvin <hpa@zytor.com>, Mike Galbraith <bitbucket@online.de>,
Thomas Gleixner <tglx@linutronix.de>,
Arjan van de Ven <arjan@linux.intel.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [PATCH 0/7] preempt_count rework -v2
Date: Thu, 12 Sep 2013 13:51:55 +0200 [thread overview]
Message-ID: <20130912115155.GV31370@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <CA+55aFwdS5-U8UtjZ8qsKyn7ap05MPXMFeU-GXgU2PE5KrCrHg@mail.gmail.com>
On Wed, Sep 11, 2013 at 07:43:42PM -0700, Linus Torvalds wrote:
> On Wed, Sep 11, 2013 at 7:20 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > I split the thing up into two macros GEN_UNARY_RMWcc and
> > GEN_BINARY_RMWcc which ends up being more readable as well as smaller
> > code overall.
>
> Yes, that looks like the right abstraction level. Powerful without
> being complicated.
> That said, the very same memory clobber may be what makes this whole
> approach a loss, if it causes gcc to do tons of reloads or other
> random things.
Random things below, not sure why
> For other cases? Who knows.. A lot of the "change and test atomically"
> things have the same containment need, so it might not be a problem.
Right, all atomic ops already contain a memory clobber to go along with
their memory barrier semantics.
So I did a defconfig build without the patch and with the patch and got
a significant size increase:
text data bss filename
11173443 1423736 1183744 defconfig-build/vmlinux.before
11174516 1423736 1183744 defconfig-build/vmlinux.after
I then undid all the bitop conversions that added the extra memory
clobber and got:
11174204 1423736 1183744 defconfig-build/vmlinux.after
Which is still a significant increase, so I had a look at what GCC
generates, for mm/rmap.c, which uses quite a few atomic_*_and_test(),
functions I got:
text data bss dec hex filename
8675 1 16 8692 21f4 defconfig-build/mm/rmap.o
8739 1 16 8756 2234 defconfig-build/mm/rmap.o
So the increase is there too, doing a objdump -D on them the first
difference is:
0000000000000660 <do_page_add_anon_rmap>:
660: 55 push %rbp
661: 48 89 e5 mov %rsp,%rbp
664: 48 83 ec 20 sub $0x20,%rsp
668: 48 89 5d f0 mov %rbx,-0x10(%rbp)
66c: 4c 89 65 f8 mov %r12,-0x8(%rbp)
670: 48 89 fb mov %rdi,%rbx
673: f0 ff 47 18 lock incl 0x18(%rdi)
677: 0f 94 c0 sete %al
67a: 84 c0 test %al,%al
67c: 75 12 jne 690 <do_page_add_anon_rmap+0x30>
67e: 48 8b 5d f0 mov -0x10(%rbp),%rbx
682: 4c 8b 65 f8 mov -0x8(%rbp),%r12
686: c9 leaveq
vs.:
0000000000000660 <do_page_add_anon_rmap>:
660: 55 push %rbp
661: 48 89 e5 mov %rsp,%rbp
664: 48 83 ec 20 sub $0x20,%rsp
668: 48 89 5d e0 mov %rbx,-0x20(%rbp)
66c: 4c 89 65 e8 mov %r12,-0x18(%rbp)
670: 48 89 fb mov %rdi,%rbx
673: 4c 89 6d f0 mov %r13,-0x10(%rbp)
677: 4c 89 75 f8 mov %r14,-0x8(%rbp)
67b: f0 ff 47 18 lock incl 0x18(%rdi)
67f: 74 17 je 698 <do_page_add_anon_rmap+0x38>
681: 48 8b 5d e0 mov -0x20(%rbp),%rbx
685: 4c 8b 65 e8 mov -0x18(%rbp),%r12
689: 4c 8b 6d f0 mov -0x10(%rbp),%r13
68d: 4c 8b 75 f8 mov -0x8(%rbp),%r14
691: c9 leaveq
For some obscure (to me) reason the new fangled asm goto construct
generates a bunch of extra MOVs.
OTOH the good news is that a kernel with that patch applied does indeed
boot properly on real hardware.
next prev parent reply other threads:[~2013-09-12 11:52 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-10 13:08 [PATCH 0/7] preempt_count rework -v2 Peter Zijlstra
2013-09-10 13:08 ` [PATCH 1/7] sched: Introduce preempt_count accessor functions Peter Zijlstra
2013-09-10 13:08 ` [PATCH 2/7] sched: Add NEED_RESCHED to the preempt_count Peter Zijlstra
2013-09-11 1:59 ` Andy Lutomirski
2013-09-11 8:25 ` Peter Zijlstra
2013-09-11 11:06 ` Peter Zijlstra
2013-09-11 13:34 ` Mike Galbraith
2013-09-12 6:01 ` Mike Galbraith
2013-09-11 16:35 ` Andy Lutomirski
2013-09-11 18:05 ` Peter Zijlstra
2013-09-11 18:07 ` Andy Lutomirski
2013-09-11 11:14 ` Peter Zijlstra
2013-09-10 13:08 ` [PATCH 3/7] sched, arch: Create asm/preempt.h Peter Zijlstra
2013-09-10 13:08 ` [PATCH 4/7] sched: Create more preempt_count accessors Peter Zijlstra
2013-09-10 13:08 ` [PATCH 5/7] sched: Extract the basic add/sub preempt_count modifiers Peter Zijlstra
2013-09-10 13:08 ` [PATCH 6/7] sched, x86: Provide a per-cpu preempt_count implementation Peter Zijlstra
2013-09-10 13:27 ` Peter Zijlstra
2013-09-10 14:02 ` Eric Dumazet
2013-09-10 15:25 ` Peter Zijlstra
2013-09-10 16:48 ` Peter Zijlstra
2013-09-10 13:08 ` [PATCH 7/7] sched, x86: Optimize the preempt_schedule() call Peter Zijlstra
2013-09-10 13:42 ` Ingo Molnar
2013-09-10 13:55 ` Jan Beulich
2013-09-10 13:55 ` Jan Beulich
2013-09-10 14:25 ` Ingo Molnar
2013-09-10 13:51 ` [PATCH 0/7] preempt_count rework -v2 Ingo Molnar
2013-09-10 13:56 ` Ingo Molnar
2013-09-10 15:14 ` Peter Zijlstra
2013-09-10 15:29 ` Arjan van de Ven
2013-09-10 15:35 ` Peter Zijlstra
2013-09-10 16:24 ` Linus Torvalds
2013-09-11 16:00 ` H. Peter Anvin
2013-09-10 16:34 ` Linus Torvalds
2013-09-10 16:45 ` Peter Zijlstra
2013-09-10 17:06 ` Linus Torvalds
2013-09-10 21:25 ` Peter Zijlstra
2013-09-10 21:43 ` Linus Torvalds
2013-09-10 21:51 ` H. Peter Anvin
2013-09-10 22:02 ` Linus Torvalds
2013-09-10 22:06 ` H. Peter Anvin
2013-09-11 13:13 ` Peter Zijlstra
2013-09-11 13:26 ` Peter Zijlstra
2013-09-11 15:29 ` H. Peter Anvin
2013-09-11 15:33 ` Linus Torvalds
2013-09-11 18:59 ` Peter Zijlstra
2013-09-11 23:02 ` Linus Torvalds
2013-09-12 2:20 ` Peter Zijlstra
2013-09-12 2:43 ` Linus Torvalds
2013-09-12 11:51 ` Peter Zijlstra [this message]
2013-09-12 12:25 ` Ingo Molnar
2013-09-13 7:25 ` Kevin Easton
2013-09-13 8:06 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130912115155.GV31370@twins.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=ak@linux.intel.com \
--cc=arjan@linux.intel.com \
--cc=bitbucket@online.de \
--cc=fweisbec@gmail.com \
--cc=hpa@zytor.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.