From: Will Deacon <will.deacon@arm.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	Jan Glauber <jglauber@marvell.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Jayachandran Chandrasekharan Nair <jnair@marvell.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [RFC] Disable lockref on arm64
Date: Wed, 22 May 2019 17:04:17 +0100
Message-ID: <20190522160417.GF7876@fuggles.cambridge.arm.com>
In-Reply-To: <CAKv+Gu9U9z3iAuz4V1c5zTHuz1As8FSNGY-TJon4OLErB8ts8Q@mail.gmail.com>

On Sat, May 18, 2019 at 12:00:34PM +0200, Ard Biesheuvel wrote:
> On Sat, 18 May 2019 at 06:25, Jayachandran Chandrasekharan Nair
> <jnair@marvell.com> wrote:
> >
> > On Mon, May 06, 2019 at 07:10:40PM +0100, Will Deacon wrote:
> > > On Mon, May 06, 2019 at 06:13:12AM +0000, Jayachandran Chandrasekharan Nair wrote:
> > > > Perhaps someone from ARM can chime in here how the cas/yield combo
> > > > is expected to work when there is contention. ThunderX2 does not
> > > > do much with the yield, but I don't expect any ARM implementation
> > > > to treat YIELD as a hint not to yield, but to get/keep exclusive
> > > > access to the last failed CAS location.
> > >
> > > Just picking up on this as "someone from ARM".
> > >
> > > The yield instruction in our implementation of cpu_relax() is *only* there
> > > as a scheduling hint to QEMU so that it can treat it as an internal
> > > scheduling hint and run some other thread; see 1baa82f48030 ("arm64:
> > > Implement cpu_relax as yield"). We can't use WFE or WFI blindly here, as it
> > > could be a long time before we see a wake-up event such as an interrupt. Our
> > > implementation of smp_cond_load_acquire() is much better for that kind of
> > > thing, but doesn't help at all for a contended CAS loop where the variable
> > > is actually changing constantly.
> >
> > Looking thru the perf output of this case (open/close of a file from
> > multiple CPUs), I see that refcount is a significant factor in most
> > kernel configurations - and that too uses cmpxchg (without yield).
> > x86 has an optimized inline version of refcount that helps
> > significantly. Do you think this is worth looking at for arm64?
> >
>
> I looked into this a while ago [0], but at the time, we decided to
> stick with the generic implementation until we encountered a use case
> that benefits from it. Worth a try, I suppose ...
>
> [0] https://lore.kernel.org/linux-arm-kernel/20170903101622.12093-1-ard.biesheuvel@linaro.org/

If JC can show that we benefit from this, it would be interesting to see if
we can implement the refcount-full saturating arithmetic using the
LDMIN/LDMAX instructions instead of the current cmpxchg() loops.

Will
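
For readers following the thread, here is a minimal sketch of the kind of
cmpxchg()-style saturating increment being discussed, written with C11
atomics rather than the kernel's own primitives. The name
sketch_inc_not_zero() and the SKETCH_SATURATED value are hypothetical and
chosen only for illustration; this is not the kernel's refcount code.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical saturation value for this sketch (an assumption). */
#define SKETCH_SATURATED UINT_MAX

/*
 * Increment *r unless it is 0 (object already released) or already
 * saturated. Returns true if the caller now holds a reference.
 */
static bool sketch_inc_not_zero(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);

	for (;;) {
		if (old == 0)
			return false;
		if (old == SKETCH_SATURATED)
			return true;	/* stay pinned at the saturation value */
		/*
		 * On failure, 'old' is refreshed with the current value and
		 * the checks above are repeated before the next attempt.
		 */
		if (atomic_compare_exchange_weak_explicit(r, &old, old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
}
```

Under contention the compare-exchange fails whenever another CPU updates
the counter first, so each CPU may go around the loop several times. A
single atomic min/max instruction would have no such retry loop; whether
it can express the zero and saturation checks above is the open question
in the message.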