From: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
Ganesh GR <ganeshgr@chelsio.com>,
Nirranjan Kirubaharan <nirranjan@chelsio.com>,
Indranil Choudhury <indranil@chelsio.com>
Subject: Re: [RFC PATCH 2/3] x86/io: implement 256-bit IO read and write
Date: Tue, 20 Mar 2018 19:02:07 +0530
Message-ID: <20180320133206.GB25574@chelsio.com>
In-Reply-To: <alpine.DEB.2.21.1803191525010.2010@nanos.tec.linutronix.de>

On Monday, March 19, 2018 at 20:13:10 +0530, Thomas Gleixner wrote:
> On Mon, 19 Mar 2018, Rahul Lakkireddy wrote:
>
> > Use VMOVDQU AVX CPU instruction when available to do 256-bit
> > IO read and write.
>
> That's not what the patch does. See below.
>
> > Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> > Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
>
> That Signed-off-by chain is wrong....
>
> > +#ifdef CONFIG_AS_AVX
> > +#include <asm/fpu/api.h>
> > +
> > +static inline u256 __readqq(const volatile void __iomem *addr)
> > +{
> > + u256 ret;
> > +
> > + kernel_fpu_begin();
> > + asm volatile("vmovdqu %0, %%ymm0" :
> > + : "m" (*(volatile u256 __force *)addr));
> > + asm volatile("vmovdqu %%ymm0, %0" : "=m" (ret));
> > + kernel_fpu_end();
> > + return ret;
>
> You _cannot_ assume that the instruction is available just because
> CONFIG_AS_AVX is set. The availability is determined by the runtime
> evaluated CPU feature flags, i.e. X86_FEATURE_AVX.
>
OK. I will add a boot_cpu_has(X86_FEATURE_AVX) check as well.

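To illustrate the shape of such a runtime check, here is a rough userspace sketch, not the kernel patch itself. It assumes GCC or Clang: __builtin_cpu_supports("avx") stands in for boot_cpu_has(X86_FEATURE_AVX), and the names u256_sketch, sketch_cpu_has_avx() and sketch_read256() are made up for this example. The kernel version would still need kernel_fpu_begin()/kernel_fpu_end() around the AVX path; userspace does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 256-bit container; stands in for the u256 type of this series. */
typedef struct { uint64_t q[4]; } u256_sketch;

/* Userspace stand-in for boot_cpu_has(X86_FEATURE_AVX). */
static int sketch_cpu_has_avx(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	return __builtin_cpu_supports("avx") ? 1 : 0;
#else
	return 0;	/* not x86-64: always take the fallback path */
#endif
}

/* Read 256 bits: one VMOVDQU if AVX is usable, else four 64-bit reads. */
static u256_sketch sketch_read256(const volatile void *addr)
{
	u256_sketch ret;
	const volatile uint64_t *p = (const volatile uint64_t *)addr;
	int i;

	assert(addr != NULL);
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	if (sketch_cpu_has_avx()) {
		/* Load 32 bytes into ymm0, then store them into ret. */
		__asm__ volatile("vmovdqu %1, %%ymm0\n\t"
				 "vmovdqu %%ymm0, %0"
				 : "=m" (ret)
				 : "m" (*(const volatile u256_sketch *)addr)
				 : "xmm0");
		return ret;
	}
#endif
	/* Fallback: four consecutive 64-bit reads. */
	for (i = 0; i < 4; i++)
		ret.q[i] = p[i];
	return ret;
}
```

Both paths return the same 32 bytes, so callers do not need to know which one was taken.
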
> Aside of that I very much doubt that this is faster than 4 consecutive
> 64bit reads/writes as you have the full overhead of
> kernel_fpu_begin()/end() for each access.
>
> You did not provide any numbers for this so its even harder to
> determine.
>
Sorry about that. Here are the numbers with and without this series,
for reading up to 2 GB of on-chip memory via MMIO:

                 Without series    With series
                 (64-bit reads)    (256-bit reads)
    Time taken   52 seconds        26 seconds

Doing 256-bit reads halves the time taken.

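For reference, the two timings can be turned into throughput figures. The small sketch below assumes that exactly 2 GiB was read in each run (the measurement only says "up to 2 GB"), and the helper names are made up for this example. Under that assumption the figures come out at roughly 39 MiB/s for 64-bit reads and roughly 79 MiB/s for 256-bit reads.

```c
#include <assert.h>

/* Throughput in MiB/s for `bytes` transferred in `seconds`. */
static double sketch_mib_per_sec(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / (1024.0 * 1024.0) / seconds;
}

/* Speedup factor of the new timing over the old one. */
static double sketch_speedup(double old_seconds, double new_seconds)
{
	assert(new_seconds > 0.0);
	return old_seconds / new_seconds;
}
```

With the measured 52 and 26 seconds, sketch_speedup(52.0, 26.0) gives a factor of 2.
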
> As far as I can tell the code where you are using this is a debug
> facility. What's the point? Debug is hardly a performance critical problem.
>
On a High Availability server, the logs of the failing system must be
collected as quickly as possible, so we are concerned with the time
taken to collect our large on-chip memory. Doing 256-bit reads at a
time reduces that time.

Thanks,
Rahul