From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Paul Mackerras <paulus@au1.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	James Hogan <james.hogan@imgtec.com>,
	"James E.J. Bottomley" <jejb@parisc-linux.org>,
	Helge Deller <deller@gmx.de>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	"David S. Miller" <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Anton Blanchard <anton@au1.ibm.com>
Subject: Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
Date: Tue, 24 Sep 2013 10:10:27 +1000
Message-ID: <1379981427.5443.8.camel@pasglop>
In-Reply-To: <CA+55aFwaf_Wst=AS75ydBJVQ6aJxPfAzXdt-UXj3qC9WeUt7kw@mail.gmail.com>

On Sun, 2013-09-22 at 15:22 -0700, Linus Torvalds wrote:
>  - use %r13 for the per-thread thread-info pointer instead. A
> per-thread pointer is *not* volatile like the per-cpu base is.

 .../...

> Alternatively, make %r13 point to the percpu side, but make sure that
> you always use an asm accessor to fetch the value. In particular, I
> think you need to make __my_cpu_offset be an inline asm that fetches
> %r13 into some other register. Otherwise you can never get it right.

BTW, that boils down to a choice between using r13 either as a TLS
pointer for current or current_thread_info, or as a per-cpu pointer.
Which one is the most performance critical?

Now in the first case, it seems to me that using it as "current" rather
than "current_thread_info()" is a better idea, since we access current a
LOT more overall in the kernel. From there we can find a way to put
thread_info into the task struct (via the thread struct maybe) to make
it a simple offset from current.
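
To make that layout concrete, here is a rough userspace sketch, not
actual kernel code. The structs are heavily simplified stand-ins, the
placement of thread_info inside the thread struct is only the assumption
floated above, and r13_current is a hypothetical plain variable standing
in for the register.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct thread_info {
	unsigned long flags;
	int preempt_count;
};

struct thread_struct {
	struct thread_info info;	/* assumption: thread_info embedded here */
	unsigned long ksp;
};

struct task_struct {
	int pid;
	struct thread_struct thread;
};

/* Stand-in for r13; in a real kernel this would be the register itself. */
static struct task_struct *r13_current;

/* "current" is then just the register value, no memory access needed. */
static inline struct task_struct *get_current(void)
{
	return r13_current;
}

/* thread_info becomes a constant offset from current. */
static inline struct thread_info *current_thread_info(void)
{
	return &get_current()->thread.info;
}
```

With this shape, current_thread_info() is pure address arithmetic on
current, so neither accessor needs a load to find its object.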

The big pro of that approach is of course that r13 becomes the TLS as
intended, and we can feel a lot more comfortable that we are "safe" vs.
whatever craziness gcc will come up with next.

The flip side is that per-cpu will remain a load away, so getting the
address of a per-cpu variable would typically be a 3-instruction deal
involving a load and a pair of adds to get to the address, then the
actual per-cpu access proper. This is equivalent to what we have today
(we put the per-cpu offset in the PACA). Using r13 as the per-cpu
pointer lets us avoid that first load.
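
The difference between the two address computations can be sketched in
plain C as below. This is only an illustration under assumptions: the
PACA is reduced to a single data_offset field, r13_paca and
r13_percpu_base are hypothetical variables standing in for the two
possible meanings of r13, and a per-cpu variable is modelled as a byte
offset into a per-cpu area.

```c
#include <stdint.h>

/* Simplified stand-in for the PACA: only the per-cpu offset is kept. */
struct paca_struct {
	uint64_t data_offset;
};

/* Option A (what we have today): r13 points at the PACA. */
static struct paca_struct *r13_paca;

static inline void *percpu_ptr_via_paca(uintptr_t var_offset)
{
	/* The extra load: fetch the per-cpu offset from the PACA first. */
	uintptr_t base = (uintptr_t)r13_paca->data_offset;

	return (void *)(base + var_offset);
}

/* Option B: r13 holds the per-cpu base itself. */
static uintptr_t r13_percpu_base;

static inline void *percpu_ptr_direct(uintptr_t var_offset)
{
	/* No load needed, the base is already in the "register". */
	return (void *)(r13_percpu_base + var_offset);
}
```

Both helpers produce the same address for the same CPU; the only
difference is whether the base has to be fetched from memory first.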

So what's the most worthwhile thing to do here? I'm leaning toward the
first option, i.e. stick current in r13 and feel a lot safer about it (I
won't have to scrutinize generated code all over the place to convince
myself things aren't crossing the barriers). And if the thread_info is
in the task struct, that makes accessing it really trivial & fast as
well.

Cheers,
Ben.


