All of lore.kernel.org
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Jisheng Zhang <jszhang@kernel.org>
Cc: Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arm64/lib: copy_page: s/stnp/stp
Date: Wed, 26 Jun 2024 19:08:42 +0100	[thread overview]
Message-ID: <ZnxZKih_riuFb1NF@arm.com> (raw)
In-Reply-To: <ZnwAoWh-_bTxT-rT@xhacker>

On Wed, Jun 26, 2024 at 07:50:57PM +0800, Jisheng Zhang wrote:
> On Mon, Jun 24, 2024 at 06:56:33PM +0100, Catalin Marinas wrote:
> > On Thu, Jun 13, 2024 at 08:18:12AM +0800, Jisheng Zhang wrote:
> > > stnp performs non-temporal store, give a hints to the memory system
> > > that caching is not useful for this data. But the scenario where
> > > copy_page() used may not have this implication, although I must admit
> > > there's such case where stnp helps performance(good). In this good
> > > case, we can rely on the HW write streaming mechanism in some
> > > implementations such as cortex-a55 to detect the case and take actions.
> > > 
> > > testing with https://github.com/apinski-cavium/copy_page_benchmark
> > > this patch can reduce the time by about 3% on cortex-a55 platforms.
[...]
> > It looks like it always copies to the same page, the stp may even
> > benefit from some caching of the data which we wouldn't need in a real
> > scenario.
> 
> Yep this is also my understanding where's the improvement from. And
> I must admit there's case where stnp helps performance. we can rely
> on the HW write streaming mechanism to detect and take actions.

Well, is that case realistic? Can you show any improvement with some
real-world uses? Most likely modern CPUs fall back to non-temporal
stores after a series of STPs but it depends on how soon they do it, how
much cache gets polluted. OTOH, page copying could be the result of a
CoW and we'd expect subsequent accesses from the user where some caching
may be beneficial.

So, hard to tell but we should make a decision based on a microbenchmark
that writes over the same page multiple times. If you have some
real-world tests that exercise this path (e.g. CoW, Android app startup)
and show an improvement, I'd be in favour of this. Otherwise, no.

Thanks.

-- 
Catalin


      reply	other threads:[~2024-06-26 18:09 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-13  0:18 [PATCH] arm64/lib: copy_page: s/stnp/stp Jisheng Zhang
2024-06-24 17:56 ` Catalin Marinas
2024-06-26 11:50   ` Jisheng Zhang
2024-06-26 18:08     ` Catalin Marinas [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZnxZKih_riuFb1NF@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=jszhang@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.