From: "Michael S. Tsirkin" <mst@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>,
	Davidlohr Bueso <dbueso@suse.de>,
	Peter Zijlstra <peterz@infradead.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	virtualization <virtualization@lists.linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH 3/4] x86,asm: Re-work smp_store_mb()
Date: Wed, 13 Jan 2016 00:21:10 +0200
Message-ID: <20160113001824-mutt-send-email-mst@redhat.com>
In-Reply-To: <CALCETrWBxiAb8KHjBfb2rRhX3KrbLfc3bzhfQnyCdE3G4mnsSA@mail.gmail.com>

On Tue, Jan 12, 2016 at 12:59:58PM -0800, Andy Lutomirski wrote:
> On Tue, Jan 12, 2016 at 12:54 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Tue, Jan 12, 2016 at 12:30 PM, Andy Lutomirski <luto@kernel.org> wrote:
> >>
> >> I recall reading somewhere that lock addl $0, 32(%rsp) or so (maybe even 64)
> >> was better because it avoided stomping on very-likely-to-be-hot write
> >> buffers.
> >
> > I suspect it could go either way. You want a small constant (for the
> > instruction size), but any small constant is likely to be within the
> > current stack frame anyway. I don't think 0(%rsp) is particularly
> > likely to have a spill on it right then and there, but who knows..
> >
> > And 64(%rsp) is possibly going to be cold in the L1 cache, especially
> > if it's just after a deep function call. Which it might be. So it
> > might work the other way.
> >
> > So my guess would be that you wouldn't be able to measure the
> > difference. It might be there, but probably too small to really see in
> > any noise.
> >
> > But numbers talk, bullshit walks. It would be interesting to be proven wrong.
> 
> Here's an article with numbers:
> 
> http://shipilev.net/blog/2014/on-the-fence-with-dependencies/
> 
> I think they're suggesting using a negative offset, which is safe as
> long as it doesn't page fault, even though we have the redzone
> disabled.
> 
> --Andy

OK, so I'll have to tweak the test to put something
on the stack to measure the difference: my test currently
modifies a global variable instead.
I'll try that by tomorrow.

I couldn't measure any difference between mfence and lock+addl
except in a micro-benchmark, but hey, since we are tweaking this,
let's do the optimal thing.

-- 
MST
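
A minimal user-space sketch of the comparison discussed above, not the
actual test used in this thread. It stores to a stack variable and then
issues a full barrier in one of three ways: mfence, lock addl at
0(%rsp), or lock addl at a negative offset just below the stack
pointer. All function names here (loop_mfence, loop_lock_addl_sp0,
loop_lock_addl_below_sp, time_loop_ns) are hypothetical. Two
assumptions to keep in mind: user-space code on x86-64 normally has the
red zone enabled, unlike the kernel, so the compiler may place the
local variable below %rsp and the "hot" offset depends on the generated
code; and on non-x86-64 targets the sketch falls back to
__sync_synchronize() only so that it still builds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__)
/* Full barrier using the dedicated fence instruction. */
static inline void fence_mfence(void)
{
	__asm__ __volatile__("mfence" ::: "memory");
}

/* Full barrier using a locked add of zero at the top of the stack. */
static inline void fence_lock_addl_sp0(void)
{
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
}

/* Same, at a negative offset; adding zero leaves the memory unchanged. */
static inline void fence_lock_addl_below_sp(void)
{
	__asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc");
}
#else
/* Fallback for other architectures: compiler-provided full barrier. */
static inline void fence_mfence(void)             { __sync_synchronize(); }
static inline void fence_lock_addl_sp0(void)      { __sync_synchronize(); }
static inline void fence_lock_addl_below_sp(void) { __sync_synchronize(); }
#endif

/* Generate one store+fence loop per barrier variant. */
#define DEFINE_LOOP(name, fence)					\
static uint64_t name(uint64_t iters)					\
{									\
	volatile uint64_t slot = 0;	/* local, so it lives on the stack */ \
	for (uint64_t i = 0; i < iters; i++) {				\
		slot = i + 1;		/* the store being ordered */	\
		fence();						\
	}								\
	return slot;							\
}

DEFINE_LOOP(loop_mfence, fence_mfence)
DEFINE_LOOP(loop_lock_addl_sp0, fence_lock_addl_sp0)
DEFINE_LOOP(loop_lock_addl_below_sp, fence_lock_addl_below_sp)

/* Time one loop variant and return the elapsed wall-clock nanoseconds. */
static uint64_t time_loop_ns(uint64_t (*loop)(uint64_t), uint64_t iters)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	uint64_t last = loop(iters);
	clock_gettime(CLOCK_MONOTONIC, &end);

	assert(last == iters);	/* every store in the loop was performed */
	return (uint64_t)((int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
			  (int64_t)(end.tv_nsec - start.tv_nsec));
}
```

To compare the variants, call for example
time_loop_ns(loop_mfence, 100000000) and
time_loop_ns(loop_lock_addl_below_sp, 100000000) several times and
compare the spread of results rather than single runs, since, as noted
above, any difference may be too small to see in the noise.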
