linux-kernel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: David Laight <David.Laight@ACULAB.COM>,
	"'Rahul Lakkireddy'" <rahul.lakkireddy@chelsio.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
	"ganeshgr@chelsio.com" <ganeshgr@chelsio.com>,
	"nirranjan@chelsio.com" <nirranjan@chelsio.com>,
	"indranil@chelsio.com" <indranil@chelsio.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Eric Biggers <ebiggers3@gmail.com>
Subject: Re: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
Date: Tue, 20 Mar 2018 09:26:51 +0100
Message-ID: <20180320082651.jmxvvii2xvmpyr2s@gmail.com>
In-Reply-To: <alpine.DEB.2.21.1803191625080.2010@nanos.tec.linutronix.de>


* Thomas Gleixner <tglx@linutronix.de> wrote:

> > Useful also for code that needs AVX-like registers to do things like CRCs.
> 
> x86/crypto/ has a lot of AVX optimized code.

Yeah, that's true, but the crypto code is processing fundamentally bigger blocks 
of data, which amortizes the cost of using kernel_fpu_begin()/_end().

kernel_fpu_begin()/_end() is a pretty heavy operation because it does a full FPU 
save/restore via the XSAVE[S] and XRSTOR[S] instructions, which can easily copy a 
thousand bytes around! So kernel_fpu_begin()/_end() is probably a non-starter for 
something small, like a single 256-bit or 512-bit word access.
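
To put rough numbers on that, here is a small back-of-the-envelope sketch in C. It
adds up the architectural sizes of the standard (non-compacted) XSAVE components up
to AVX and compares the total with the bytes needed to spill a couple of YMM
registers by hand. The enum and function names are made up for illustration and are
not kernel identifiers. The real save area depends on which features the CPU and
kernel enable, and it grows past 2 KB once AVX-512 state is included.

```c
#include <assert.h>
#include <stddef.h>

/* Architectural sizes, in bytes, of the standard-format XSAVE components. */
enum {
	XSAVE_LEGACY_BYTES = 512,	/* x87 + SSE region, same layout as FXSAVE */
	XSAVE_HEADER_BYTES = 64,	/* XSAVE header */
	XSAVE_AVX_BYTES    = 16 * 16,	/* upper 128 bits of ymm0..ymm15 */
	YMM_BYTES          = 32		/* one full 256-bit register */
};

/* Bytes covered by a full save of x87 + SSE + AVX state. */
static size_t xsave_bytes_up_to_avx(void)
{
	return XSAVE_LEGACY_BYTES + XSAVE_HEADER_BYTES + XSAVE_AVX_BYTES;
}

/* Bytes written when spilling only 'nregs' YMM registers by hand. */
static size_t ymm_spill_bytes(unsigned int nregs)
{
	assert(nregs <= 16);	/* the AVX component holds ymm0..ymm15 */
	return (size_t)nregs * YMM_BYTES;
}
```

With these constants a full save covers 832 bytes each way, against 64 bytes for two
YMM registers. The XSAVEOPT/XSAVES optimizations can skip unmodified components, so
this is an upper bound for that feature set rather than a measured cost.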

But there's actually a new thing in modern kernels: we got rid of (most of) the lazy 
FPU save/restore code, and our new x86 FPU model is very "direct", with no FPU faults 
taken normally.

So assuming the target driver will only load on modern FPUs I *think* it should 
actually be possible to do something like (pseudocode):

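	# save the registers we are about to clobber; vmovdqu because stack
	# slots 40 bytes apart cannot both be 32-byte aligned for vmovdqa
	vmovdqu %ymm0, 40(%rsp)
	vmovdqu %ymm1, 80(%rsp)

	...
	# use ymm0 and ymm1
	...

	# restore in reverse order
	vmovdqu 80(%rsp), %ymm1
	vmovdqu 40(%rsp), %ymm0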

... without using the heavy XSAVE/XRSTOR instructions.

Note that preemption probably still needs to be disabled and possibly there are 
other details as well, but there should be no 'heavy' FPU operations.
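
As a rough illustration of the save/use/restore idea, here is a user-space C sketch
that borrows ymm0 for a single 256-bit copy and then puts the old contents back. It
is only an analogy and not kernel code: it does not model disabling preemption,
interrupt context or MMIO ordering, and the names copy32() and copy32_avx() are
hypothetical. It assumes GCC or Clang on x86-64 and falls back to memcpy() when AVX
is not available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__GNUC__)
/* Copy 32 bytes through ymm0, preserving whatever ymm0 held before. */
__attribute__((target("avx")))
static void copy32_avx(void *dst, const void *src)
{
	uint8_t save[32];	/* spill slot; unaligned moves, so no alignment needed */

	__asm__ volatile(
		"vmovdqu %%ymm0, (%[save])\n\t"	/* save the caller's ymm0 */
		"vmovdqu (%[src]), %%ymm0\n\t"	/* one 256-bit load */
		"vmovdqu %%ymm0, (%[dst])\n\t"	/* one 256-bit store */
		"vmovdqu (%[save]), %%ymm0\n\t"	/* restore the caller's ymm0 */
		:
		: [save] "r" (save), [src] "r" (src), [dst] "r" (dst)
		: "memory");
}
#endif

/* Returns 1 if the AVX path was taken, 0 if the memcpy() fallback ran. */
static int copy32(void *dst, const void *src)
{
	assert(dst != NULL && src != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
	if (__builtin_cpu_supports("avx")) {
		copy32_avx(dst, src);
		return 1;
	}
#endif
	memcpy(dst, src, 32);
	return 0;
}
```

Because ymm0 is restored inside the same asm statement, the statement only needs a
"memory" clobber. Whether the equivalent sequence is safe inside the kernel is the
open question here, so treat this as a sketch of the mechanics only.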

I think this should still preserve all user-space FPU state, and it shouldn't muck up 
any 'weird' user-space FPU state we might have interrupted either (such as pending 
exceptions, running legacy x87 code, NaN registers or weird FPU control word 
settings).

But I could be wrong, it should be checked whether this sequence is safe. 
Worst-case we might have to save/restore the FPU control and tag words - but those 
operations should still be much faster than a full XSAVE/XRSTOR pair.

So I do think we could do more in this area to improve driver performance, if the 
code is correct and if there are actual benchmarks showing real benefits.

Thanks,

	Ingo
