linuxppc-dev.lists.ozlabs.org archive mirror
* How to make use of SPE instructions?
@ 2015-01-08  9:58 Markus Stockhausen
  2015-01-15 22:38 ` Michael Ellerman
  2015-01-15 22:56 ` Scott Wood
  0 siblings, 2 replies; 5+ messages in thread
From: Markus Stockhausen @ 2015-01-08  9:58 UTC (permalink / raw)
  To: linuxppc-dev@lists.ozlabs.org

Hello,

I developed a SHA224/256 kernel crypto module with SPE instructions.
The results look quite promising (roughly +50% speedup). Nevertheless, the
flood of "SPE used in kernel" kernel messages makes me feel
uncomfortable.

My findings so far:

- I can configure the kernel with "SPE support".
- arch/powerpc/kernel/head_fsl_booke.S suggests that the message is
  triggered unconditionally whenever we make use of SPE in the kernel.
- There is a function enable_kernel_spe(), but I don't know how
  it could help me in my work.

I guess I need some kind of "brackets" around my code to make sure
the upper 32 bits of the registers are saved correctly during a task switch.
Or is the use of SPE instructions inside the kernel totally forbidden? Can any
expert offer some helpful advice?

Thanks in advance.

Markus

[-- Attachment #2: InterScan_Disclaimer.txt --]
[-- Type: text/plain, Size: 1650 bytes --]

****************************************************************************
This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497

****************************************************************************


* Re: How to make use of SPE instructions?
  2015-01-08  9:58 How to make use of SPE instructions? Markus Stockhausen
@ 2015-01-15 22:38 ` Michael Ellerman
  2015-01-15 22:56 ` Scott Wood
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2015-01-15 22:38 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: linuxppc-dev@lists.ozlabs.org

On Thu, 2015-01-08 at 09:58 +0000, Markus Stockhausen wrote:
> Hello,
> 
> I developed a SHA224/256 kernel crypto module with SPE instructions.
> The results look quite promising (roughly +50% speedup). Nevertheless, the
> flood of "SPE used in kernel" kernel messages makes me feel
> uncomfortable.
> 
> My findings so far:
> 
> - I can configure the kernel with "SPE support".
> - arch/powerpc/kernel/head_fsl_booke.S suggests that the message is
>   triggered unconditionally whenever we make use of SPE in the kernel.
> - There is a function enable_kernel_spe(), but I don't know how
>   it could help me in my work.
> 
> I guess I need some kind of "brackets" around my code to make sure
> the upper 32 bits of the registers are saved correctly during a task switch.
> Or is the use of SPE instructions inside the kernel totally forbidden? Can any
> expert offer some helpful advice?

I don't know about SPE specifically, hopefully someone from FSL can chime in.

But, IIUIC, you should just be able to call enable_kernel_spe() prior to your
code, i.e. before you execute any SPE instructions. That will save any values
userspace has left in the SPE regs, and then enable SPE for you in the kernel.
That should get rid of the "SPE used in kernel" messages.

cheers


* Re: How to make use of SPE instructions?
  2015-01-08  9:58 How to make use of SPE instructions? Markus Stockhausen
  2015-01-15 22:38 ` Michael Ellerman
@ 2015-01-15 22:56 ` Scott Wood
  2015-01-16  5:27   ` AW: " Markus Stockhausen
  1 sibling, 1 reply; 5+ messages in thread
From: Scott Wood @ 2015-01-15 22:56 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: linuxppc-dev@lists.ozlabs.org

On Thu, 2015-01-08 at 09:58 +0000, Markus Stockhausen wrote:
> Hello,
> 
> I developed a SHA224/256 kernel crypto module with SPE instructions.
> The results look quite promising (roughly +50% speedup). Nevertheless, the
> flood of "SPE used in kernel" kernel messages makes me feel
> uncomfortable.
> 
> My findings so far:
> 
> - I can configure the kernel with "SPE support".
> - arch/powerpc/kernel/head_fsl_booke.S suggests that the message is
>   triggered unconditionally whenever we make use of SPE in the kernel.
> - There is a function enable_kernel_spe(), but I don't know how
>   it could help me in my work.
> 
> I guess I need some kind of "brackets" around my code to make sure
> the upper 32 bits of the registers are saved correctly during a task switch.
> Or is the use of SPE instructions inside the kernel totally forbidden? Can any
> expert offer some helpful advice?

You need to disable preemption, call enable_kernel_spe(), and finish
using SPE before you enable preemption.  This assumes that SPE is never
used from interrupt context.  Be careful to not disable preemption for
too long.

-Scott
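
The sequence described above can be sketched as follows. This is a
user-space model, not kernel code: preempt_disable(), preempt_enable() and
enable_kernel_spe() are replaced here by stand-in stubs so that the example
compiles on its own, and spe_begin(), spe_end(), spe_transform() and
hash_blocks() are hypothetical names that do not come from the thread. In a
real module the three stubbed calls would be the kernel's own helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins (assumptions) that model kernel state in user space. */
static int preempt_depth;	/* models the preempt count */
static int spe_enabled;		/* models "SPE may be used right now" */

static void preempt_disable(void)
{
	preempt_depth++;
}

static void preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

static void enable_kernel_spe(void)
{
	/* Must only be called once preemption is already off. */
	assert(preempt_depth > 0);
	spe_enabled = 1;
}

/* Hypothetical "brackets" around SPE code. */
static void spe_begin(void)
{
	preempt_disable();
	enable_kernel_spe();
}

static void spe_end(void)
{
	/* Model only: from here on SPE must not be touched again. */
	spe_enabled = 0;
	preempt_enable();
}

/* Hypothetical stand-in for the SPE-accelerated transform. */
static size_t spe_transform(const unsigned char *data, size_t blocks)
{
	/* All SPE work has to happen inside the bracket. */
	assert(preempt_depth > 0 && spe_enabled);
	(void)data;
	return blocks;
}

/* Process a number of blocks with the bracket applied. */
size_t hash_blocks(const unsigned char *data, size_t blocks)
{
	size_t done;

	spe_begin();
	done = spe_transform(data, blocks);
	spe_end();
	return done;
}
```

The asserts encode the two rules from the reply: enable_kernel_spe() is
called with preemption already disabled, and every SPE instruction is
finished before preemption is enabled again.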


* AW: How to make use of SPE instructions?
  2015-01-15 22:56 ` Scott Wood
@ 2015-01-16  5:27   ` Markus Stockhausen
  2015-01-20  7:38     ` Scott Wood
  0 siblings, 1 reply; 5+ messages in thread
From: Markus Stockhausen @ 2015-01-16  5:27 UTC (permalink / raw)
  To: Scott Wood, Michael Ellerman; +Cc: linuxppc-dev@lists.ozlabs.org

> From: Scott Wood [scottwood@freescale.com]
> Sent: Thursday, 15 January 2015 23:56
> To: Markus Stockhausen
> Cc: linuxppc-dev@lists.ozlabs.org
> Subject: Re: How to make use of SPE instructions?
> 
> You need to disable preemption, call enable_kernel_spe(), and finish
> using SPE before you enable preemption.  This assumes that SPE is never
> used from interrupt context.  Be careful to not disable preemption for
> too long.

Thanks for your feedback. That did the trick. I'm currently working on
a (low-power) 800 MHz single-core P1014 CPU. That should be the
cheapest and slowest hardware that is available with SPE. My target
is to use the module for calculating hash values of IPsec packets, so
we are talking about input data of up to ~1500 bytes.

I did some tests with the tcrypt module and I get a hashing speed of
~46 MByte/s for 2K data chunks. The stock module gives 29 MByte/s. In
other words, ~22,000 hashes per second, including around 10% overhead
from the tcrypt data feeder. That is a worst case of 46 us per hash and
therefore 46 us with preemption disabled.

In between I spent some time doing the same for SHA-1. There
we have ~46,000 hashes per second or 21 us per 2K of data. That is
+13% compared to the already available PPC assembler module.

Three questions are left:

- Does the setup conflict with the mentioned interrupt context?
- Is that a reasonable time interval for disabling preemption?
- Should I send the patches to this or to the crypto list?

Markus
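
The arithmetic behind the figures in this message can be checked with a few
lines of C. This is only a sketch: the function names are made up for
illustration, and it assumes that "MByte" means 10^6 bytes and "2K" means
2048 bytes, which the message does not state.

```c
#include <assert.h>

/* Hashes per second for a given throughput and chunk size. */
double hashes_per_second(double bytes_per_second, double chunk_bytes)
{
	assert(chunk_bytes > 0.0);
	return bytes_per_second / chunk_bytes;
}

/* Time spent on one hash, in microseconds. */
double usec_per_hash(double hashes_per_sec)
{
	assert(hashes_per_sec > 0.0);
	return 1e6 / hashes_per_sec;
}

/* Relative gain of new_rate over old_rate, in percent. */
double speedup_percent(double new_rate, double old_rate)
{
	assert(old_rate > 0.0);
	return (new_rate / old_rate - 1.0) * 100.0;
}
```

Under those assumptions, hashes_per_second(46e6, 2048) is about 22,460 and
usec_per_hash() of that value is about 44.5, in line with the ~22,000 hashes
per second and ~46 us quoted. usec_per_hash(46000) is about 21.7 for the
SHA-1 case, and speedup_percent(46, 29) comes to about 59 for the raw tcrypt
rates.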

* Re: AW: How to make use of SPE instructions?
  2015-01-16  5:27   ` AW: " Markus Stockhausen
@ 2015-01-20  7:38     ` Scott Wood
  0 siblings, 0 replies; 5+ messages in thread
From: Scott Wood @ 2015-01-20  7:38 UTC (permalink / raw)
  To: Markus Stockhausen; +Cc: linuxppc-dev@lists.ozlabs.org

On Fri, 2015-01-16 at 05:27 +0000, Markus Stockhausen wrote:
> > From: Scott Wood [scottwood@freescale.com]
> > Sent: Thursday, 15 January 2015 23:56
> > To: Markus Stockhausen
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Subject: Re: How to make use of SPE instructions?
> > 
> > You need to disable preemption, call enable_kernel_spe(), and finish
> > using SPE before you enable preemption.  This assumes that SPE is never
> > used from interrupt context.  Be careful to not disable preemption for
> > too long.
> 
> Thanks for your feedback. That did the trick. I'm currently working on
> a (low-power) 800 MHz single-core P1014 CPU. That should be the
> cheapest and slowest hardware that is available with SPE.

Some of the mpc85xx chips can go a bit slower than that, e.g. 667 MHz is
the bottom end of the range for MPC8544:

http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8544E

> My target is to use the module for calculating hash values of IPsec
> packets, so we are talking about input data of up to ~1500 bytes.
> 
> I did some tests with the tcrypt module and I get a hashing speed of
> ~46 MByte/s for 2K data chunks. The stock module gives 29 MByte/s. In
> other words, ~22,000 hashes per second, including around 10% overhead
> from the tcrypt data feeder. That is a worst case of 46 us per hash and
> therefore 46 us with preemption disabled.

Worst case or average case?  Can chunks be larger?  How long does it
take to do a chunk if you start with a cold cache?  Etc.

> In between I spent some time doing the same for SHA-1. There
> we have ~46,000 hashes per second or 21 us per 2K of data. That is
> +13% compared to the already available PPC assembler module.
> 
> Three questions are left:
> 
> - Does the setup conflict with the mentioned interrupt context?

You didn't say whether you're doing it from interrupt context...

That said, the only current user of enable_kernel_spe() is KVM, which
disables interrupts, so it wouldn't bother me to change it to
WARN_ON(!irqs_disabled()), other than that it would deviate from what
enable_kernel_fp() does (and that does have users that only disable
preemption).

> - Is that a reasonable time interval for disabling preemption?

It's OK if the worst case is really 46 us, but if you can find a way to
break it up a bit without affecting throughput too much, I'd do so.

> - Should I send the patches to this or to the crypto list?

You can CC this list for broader review, but changes to the crypto driver
would get merged via the crypto list and maintainer.

-Scott
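
One way to "break it up a bit" is to cap the number of bytes handled per
non-preemptible section and to re-enable preemption between chunks. The
sketch below is a user-space model under stated assumptions: the kernel
helpers are replaced by stand-in stubs, MAX_BYTES is an arbitrary
illustrative value, and spe_transform() and hash_update() are hypothetical
names. The right cap would have to be found by measuring throughput on the
target hardware.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical cap per non-preemptible section. A multiple of the
 * 64-byte SHA-256 block size, so chunks end on block boundaries.
 */
#define MAX_BYTES 1024u

/* Stand-ins (assumptions) that model kernel state in user space. */
static int preempt_depth;	/* models the preempt count */
static size_t sections;		/* number of non-preemptible sections */
static size_t longest_section;	/* most bytes handled in one section */

static void preempt_disable(void)
{
	preempt_depth++;
}

static void preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

static void enable_kernel_spe(void)
{
	assert(preempt_depth > 0);
}

/* Hypothetical stand-in for the SPE transform; only records sizes. */
static void spe_transform(const unsigned char *data, size_t len)
{
	assert(preempt_depth > 0);
	(void)data;
	sections++;
	if (len > longest_section)
		longest_section = len;
}

/* Feed the input in chunks of at most MAX_BYTES per section. */
void hash_update(const unsigned char *data, size_t len)
{
	while (len > 0) {
		size_t n = len > MAX_BYTES ? MAX_BYTES : len;

		preempt_disable();
		enable_kernel_spe();
		spe_transform(data, n);
		preempt_enable();	/* the scheduler may run here */

		data += n;
		len -= n;
	}
}
```

With this structure the time spent with preemption disabled is bounded by
the cap rather than by the input length, at the cost of one extra
enable_kernel_spe() call per chunk.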
