From: Catalin Marinas <catalin.marinas@arm.com>
To: Zhangshaokun <zhangshaokun@hisilicon.com>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>,
John Garry <john.garry@huawei.com>,
Will Deacon <will.deacon@arm.com>,
Zhenfa Qiu <qiuzhenfa@hisilicon.com>,
Hanjun Guo <guohanjun@huawei.com>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64: cache: Update cache_line_size for HiSilicon certain platform
Date: Fri, 29 Mar 2019 18:52:57 +0000
Message-ID: <20190329185257.GD48010@arrakis.emea.arm.com>
In-Reply-To: <d7287106-d526-7817-9c31-298bd0038d6c@hisilicon.com>

On Wed, Mar 27, 2019 at 03:16:34PM +0800, Zhangshaokun wrote:
> On 2019/3/26 22:55, Catalin Marinas wrote:
> > On Tue, Mar 26, 2019 at 02:28:10PM +0800, Shaokun Zhang wrote:
> >> For HiSilicon's certain platform, like Kunpeng920 server SoC, it uses the
> >> tsv110 CPUs whose L1 cache line size is 64-Byte, while the cache line size
> >> of L3C is 128-Byte.
> >> cache_line_size is used mostly for IO device drivers, so we shall correct
> >> the right value and the device drivers can match it accurately to get good
> >> performance.
[...]
> > What's the CTR_EL0.CWG value on your SoC?
>
> It's 4'b0100 and cache line size is 64-byte.
>
> >> When test mlx5 with Kunpeng920 SoC, ib_send_bw is run under the condition
> >> that the length of the packet is 4-Byte and only one queue and cpu core:
> >> Without this patch: 1.67 Mpps
> >> with this patch : 2.40 Mpps
> >
> > This needs a better explanation. How does cache_line_size() affect the
> > 4-byte packet? Does it send more packets at once?
> >
> > I've seen in the mlx5 code assumptions about cache_line_size() being
> > 128. It looks to me more like some driver hand-tuning for specific
> > system configuration. Can the driver be changed to be more generic
>
> I'm not sure that mlx5 may implement some actions for different cache line
> size from different arch or platforms, so the driver needs to read the
> right cache_line_size.

We need to better understand the cause of the performance hit, but from a
quick grep for "128" in the mlx5 code I can see different code paths
executed when cache_line_size() returns 128 (saved in cqe_size). IOW,
presuming you can somehow disable the L3C, do you still see the same
performance difference?

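To illustrate the kind of branch meant here, a minimal standalone sketch
follows. It is not the mlx5 source: pick_cqe_size() is a hypothetical
helper, and the 64-byte fallback is an assumption made for the example.
The point is only that an exact comparison against 128 changes which
path runs.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the mlx5 driver): choose a
 * completion queue entry size from the reported cache line size.
 * Only an exact value of 128 selects the large entry; every other
 * value falls back to 64 (assumed default for this sketch).
 */
static size_t pick_cqe_size(size_t line_size)
{
	return line_size == 128 ? 128 : 64;
}
```

With a reported line size of 64 the sketch picks 64-byte entries, so a
platform reporting 64 never reaches the 128-byte path regardless of what
its outer cache levels look like.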
> Originally, I thought this interface was used mainly for IO drivers and no
> harm to any other places.

Looking through the slab code, cache_line_size() is used when
SLAB_HWCACHE_ALIGN is passed and IIUC this is for performance reasons
rather than I/O (the DMA alignment is given by ARCH_DMA_MINALIGN which
is 128 on arm64).

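As a rough illustration of why the reported line size matters for slab
object layout, here is a standalone sketch loosely modelled on the
hardware-cache alignment idea. The names hwcache_align() and align_up()
are hypothetical, and the halving loop for small objects is an
assumption of this sketch rather than a quote of the slab code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick an alignment for an object of 'size' bytes
 * given the reported cache line size. Small objects get a smaller
 * alignment so that several of them can share one line.
 */
static size_t hwcache_align(size_t size, size_t line_size)
{
	size_t align = line_size;

	/* The line size must be a non-zero power of two. */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

	while (align > 1 && size <= align / 2)
		align /= 2;

	return align;
}

/* Round 'size' up to a multiple of 'align' (a power of two). */
static size_t align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

Under these assumptions a 130-byte object is padded to 192 bytes with a
64-byte line and to 256 bytes with a 128-byte line, so changing the
reported value alters memory footprint as well as sharing behaviour.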

Anyway, if the performance drop is indeed caused by more L3C cacheline
bouncing, we can look at fixing cache_line_size() for your CPU to return
128, but I'd like it done using the cpufeature/errata framework. We have
an arm64_ftr_reg_ctrel0.sys_val that we could update to report a
128-byte cacheline and read this in cache_line_size(), but I think this
may cause user accesses to CTR_EL0 to trap into the kernel to be
emulated. That may cause more performance issues for the user than just
misreporting the CWG.

Alternatively, we could store the cache line size in a variable and have
cache_line_size() return it.

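A minimal standalone sketch of that alternative, for discussion only.
It assumes CWG sits in CTR_EL0 bits [27:24] and encodes log2 of the
granule size in 4-byte words; line_size_override,
sketch_cache_line_size() and DMA_MINALIGN_FALLBACK are hypothetical
names, with the fallback standing in for ARCH_DMA_MINALIGN.

```c
#include <stdint.h>

#define CTR_CWG_SHIFT		24
#define CTR_CWG_MASK		0xfu
#define DMA_MINALIGN_FALLBACK	128u	/* stands in for ARCH_DMA_MINALIGN */

/* Hypothetical variable set once at boot by a CPU quirk; 0 = no override. */
static unsigned int line_size_override;

/* Extract the CWG field from a raw CTR_EL0 value. */
static unsigned int cwg_from_ctr(uint64_t ctr)
{
	return (unsigned int)((ctr >> CTR_CWG_SHIFT) & CTR_CWG_MASK);
}

/*
 * Return the override if one was stored, otherwise decode CWG.
 * CWG counts 4-byte words as a power of two; 0 means "not provided",
 * in which case the sketch falls back to the DMA minimum alignment.
 */
static unsigned int sketch_cache_line_size(uint64_t ctr)
{
	unsigned int cwg;

	if (line_size_override)
		return line_size_override;

	cwg = cwg_from_ctr(ctr);
	return cwg ? 4u << cwg : DMA_MINALIGN_FALLBACK;
}
```

With CWG = 0b0100 this gives 4 << 4 = 64 bytes, matching the value
reported above. Setting the override to 128 changes what the kernel-side
helper returns without touching the CTR_EL0 value seen by userspace.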
Cc'ing Suzuki for comments on cpu errata approach.
--
Catalin