All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>, Andi Kleen <ak@linux.intel.com>,
	Peter Anvin <hpa@zytor.com>, Mike Galbraith <bitbucket@online.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [PATCH 0/7] preempt_count rework -v2
Date: Thu, 12 Sep 2013 13:51:55 +0200	[thread overview]
Message-ID: <20130912115155.GV31370@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <CA+55aFwdS5-U8UtjZ8qsKyn7ap05MPXMFeU-GXgU2PE5KrCrHg@mail.gmail.com>

On Wed, Sep 11, 2013 at 07:43:42PM -0700, Linus Torvalds wrote:
> On Wed, Sep 11, 2013 at 7:20 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > I split the thing up into two macros GEN_UNARY_RMWcc and
> > GEN_BINARY_RMWcc which ends up being more readable as well as smaller
> > code overall.
> 
> Yes, that looks like the right abstraction level. Powerful without
> being complicated.

> That said, the very same memory clobber may be what makes this whole
> approach a loss, if it causes gcc to do tons of reloads or other
> random things.

Random things below, not sure why

> For other cases? Who knows.. A lot of the "change and test atomically"
> things have the same containment need, so it might not be a problem.

Right, all atomic ops already contain a memory clobber to go along with
their memory barrier semantics.

So I did a defconfig build without the patch and with the patch and got
a significant size increase:

      text    data     bss   filename
   11173443 1423736 1183744 defconfig-build/vmlinux.before
   11174516 1423736 1183744 defconfig-build/vmlinux.after

I then undid all the bitop conversions that added the extra memory
clobber and got:

   11174204 1423736 1183744 defconfig-build/vmlinux.after

Which is still a significant increase, so I had a look at what GCC
generates, for mm/rmap.c, which uses quite a few atomic_*_and_test(),
functions I got:

   text    data     bss     dec     hex filename
   8675       1      16    8692    21f4 defconfig-build/mm/rmap.o
   8739       1      16    8756    2234 defconfig-build/mm/rmap.o

So the increase is there too, doing a objdump -D on them the first
difference is:

0000000000000660 <do_page_add_anon_rmap>:
     660:	55                   	push   %rbp
     661:	48 89 e5             	mov    %rsp,%rbp
     664:	48 83 ec 20          	sub    $0x20,%rsp
     668:	48 89 5d f0          	mov    %rbx,-0x10(%rbp)
     66c:	4c 89 65 f8          	mov    %r12,-0x8(%rbp)
     670:	48 89 fb             	mov    %rdi,%rbx
     673:	f0 ff 47 18          	lock incl 0x18(%rdi)
     677:	0f 94 c0             	sete   %al
     67a:	84 c0                	test   %al,%al
     67c:	75 12                	jne    690 <do_page_add_anon_rmap+0x30>
     67e:	48 8b 5d f0          	mov    -0x10(%rbp),%rbx
     682:	4c 8b 65 f8          	mov    -0x8(%rbp),%r12
     686:	c9                   	leaveq 

vs.:

0000000000000660 <do_page_add_anon_rmap>:
     660:	55                   	push   %rbp
     661:	48 89 e5             	mov    %rsp,%rbp
     664:	48 83 ec 20          	sub    $0x20,%rsp
     668:	48 89 5d e0          	mov    %rbx,-0x20(%rbp)
     66c:	4c 89 65 e8          	mov    %r12,-0x18(%rbp)
     670:	48 89 fb             	mov    %rdi,%rbx
     673:	4c 89 6d f0          	mov    %r13,-0x10(%rbp)
     677:	4c 89 75 f8          	mov    %r14,-0x8(%rbp)
     67b:	f0 ff 47 18          	lock incl 0x18(%rdi)
     67f:	74 17                	je     698 <do_page_add_anon_rmap+0x38>
     681:	48 8b 5d e0          	mov    -0x20(%rbp),%rbx
     685:	4c 8b 65 e8          	mov    -0x18(%rbp),%r12
     689:	4c 8b 6d f0          	mov    -0x10(%rbp),%r13
     68d:	4c 8b 75 f8          	mov    -0x8(%rbp),%r14
     691:	c9                   	leaveq 

For some obscure (to me) reason the new fangled asm goto construct
generates a bunch of extra MOVs.

OTOH the good news is that a kernel with that patch applied does indeed
boot properly on real hardware.

  reply	other threads:[~2013-09-12 11:52 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-10 13:08 [PATCH 0/7] preempt_count rework -v2 Peter Zijlstra
2013-09-10 13:08 ` [PATCH 1/7] sched: Introduce preempt_count accessor functions Peter Zijlstra
2013-09-10 13:08 ` [PATCH 2/7] sched: Add NEED_RESCHED to the preempt_count Peter Zijlstra
2013-09-11  1:59   ` Andy Lutomirski
2013-09-11  8:25     ` Peter Zijlstra
2013-09-11 11:06       ` Peter Zijlstra
2013-09-11 13:34         ` Mike Galbraith
2013-09-12  6:01           ` Mike Galbraith
2013-09-11 16:35         ` Andy Lutomirski
2013-09-11 18:05           ` Peter Zijlstra
2013-09-11 18:07             ` Andy Lutomirski
2013-09-11 11:14   ` Peter Zijlstra
2013-09-10 13:08 ` [PATCH 3/7] sched, arch: Create asm/preempt.h Peter Zijlstra
2013-09-10 13:08 ` [PATCH 4/7] sched: Create more preempt_count accessors Peter Zijlstra
2013-09-10 13:08 ` [PATCH 5/7] sched: Extract the basic add/sub preempt_count modifiers Peter Zijlstra
2013-09-10 13:08 ` [PATCH 6/7] sched, x86: Provide a per-cpu preempt_count implementation Peter Zijlstra
2013-09-10 13:27   ` Peter Zijlstra
2013-09-10 14:02   ` Eric Dumazet
2013-09-10 15:25     ` Peter Zijlstra
2013-09-10 16:48   ` Peter Zijlstra
2013-09-10 13:08 ` [PATCH 7/7] sched, x86: Optimize the preempt_schedule() call Peter Zijlstra
2013-09-10 13:42   ` Ingo Molnar
2013-09-10 13:55     ` Jan Beulich
2013-09-10 13:55       ` Jan Beulich
2013-09-10 14:25       ` Ingo Molnar
2013-09-10 13:51 ` [PATCH 0/7] preempt_count rework -v2 Ingo Molnar
2013-09-10 13:56   ` Ingo Molnar
2013-09-10 15:14     ` Peter Zijlstra
2013-09-10 15:29     ` Arjan van de Ven
2013-09-10 15:35       ` Peter Zijlstra
2013-09-10 16:24       ` Linus Torvalds
2013-09-11 16:00         ` H. Peter Anvin
2013-09-10 16:34     ` Linus Torvalds
2013-09-10 16:45       ` Peter Zijlstra
2013-09-10 17:06         ` Linus Torvalds
2013-09-10 21:25           ` Peter Zijlstra
2013-09-10 21:43             ` Linus Torvalds
2013-09-10 21:51               ` H. Peter Anvin
2013-09-10 22:02                 ` Linus Torvalds
2013-09-10 22:06                   ` H. Peter Anvin
2013-09-11 13:13               ` Peter Zijlstra
2013-09-11 13:26                 ` Peter Zijlstra
2013-09-11 15:29                 ` H. Peter Anvin
2013-09-11 15:33                 ` Linus Torvalds
2013-09-11 18:59                   ` Peter Zijlstra
2013-09-11 23:02                     ` Linus Torvalds
2013-09-12  2:20                       ` Peter Zijlstra
2013-09-12  2:43                         ` Linus Torvalds
2013-09-12 11:51                           ` Peter Zijlstra [this message]
2013-09-12 12:25                             ` Ingo Molnar
2013-09-13  7:25                         ` Kevin Easton
2013-09-13  8:06                           ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130912115155.GV31370@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=arjan@linux.intel.com \
    --cc=bitbucket@online.de \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.