From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: Call for testing/opinions: Optimized memset/memcpy
Date: Sun, 14 Jul 2013 14:13:21 +0100
Message-ID: <20130714131320.GS24642@n2100.arm.linux.org.uk>
In-Reply-To: <CAKv+Gu-fUNSZaNMgZekoCX2kOGEFDLKY1yFtKq1=yp9xQN3rOA@mail.gmail.com>
On Sun, Jul 14, 2013 at 01:37:44PM +0200, Ard Biesheuvel wrote:
> On 14 July 2013 13:19, Harm Hanemaaijer <fgenfb@yahoo.com> wrote:
> > Dr. David Alan Gilbert <gilbertd <at> treblig.org> writes:
> >>
> >> Maybe neon is worth a try these days (although be careful of platforms
> >> like Tegra 2 that don't have it); there was a recent patch that enabled
> >> use in the kernel (I think for some RAID use). The downside is it's
> >> supposed to be quite power hungry.
> >>
> >
> > As it turns out, NEON isn't too hard to implement. I have added NEON support
> > to copy_page, memset, memzero, and memcpy (both for the aligned and unaligned
> > case) in my userspace testing environment. It gives a nice boost (ranging
> > from 10% for copy_page to >30% for unaligned memcpy on a Cortex A8), which
> > can potentially be more on other cores. Although I have not tested a live
> > kernel yet, it looks like NEON can be used fairly transparently #ifdefed on
> > the CONFIG_NEON kernel definition as long as only the lower end of the
> > NEON/vfp register file is clobbered (although this needs verification).
> >
>
> You will clobber the userland NEON contents of the register file if
> you don't preserve them properly. Also, kernel preemption (if enabled)
> may put your task to sleep at any time, and the context switching
> machinery is totally oblivious of NEON being used in the kernel, so
> the kernel side will get corrupted as well in this case.

The other issue is that not every ARMv7 core has NEON, so this is going
to have to be something that is selected at runtime, which means
indirecting every memcpy/memset through a function pointer.

The final point is, don't forget that gcc will generate implicit calls
to memset/memcpy, and NEON won't be available early in the kernel boot,
so you can't optimize those function pointers away.
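
To make the runtime-selection point concrete, here is a minimal userspace
sketch in C of what indirecting memcpy through a function pointer could look
like. It is not kernel code and not anything proposed in this thread: the
names memcpy_generic, memcpy_fast, memcpy_impl, select_memcpy and my_memcpy
are all hypothetical, and memcpy_fast merely stands in for a NEON routine by
calling the C library. The pointer starts out at the generic routine, so
calls made before feature detection (including compiler-generated ones)
still work.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*memcpy_fn)(void *dst, const void *src, size_t n);

/* Baseline byte copy: valid on every core and usable before any
 * CPU feature probing has happened. */
static void *memcpy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical stand-in for a NEON-accelerated copy. In this sketch it
 * simply defers to the C library; a real routine would use NEON loads
 * and stores and would have to preserve the register state. */
static void *memcpy_fast(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* The pointer defaults to the generic routine, so early callers never
 * reach an implementation that is not yet usable. */
static memcpy_fn memcpy_impl = memcpy_generic;

/* Hypothetical hook: called once, after the CPU features are known. */
static void select_memcpy(int cpu_has_neon)
{
    memcpy_impl = cpu_has_neon ? memcpy_fast : memcpy_generic;
}

/* Every caller pays for one indirect call through memcpy_impl. */
static void *my_memcpy(void *dst, const void *src, size_t n)
{
    return memcpy_impl(dst, src, n);
}
```

The cost shown here is the one described above: each copy goes through an
indirect call, and because the pointer can change after boot, the compiler
cannot resolve it to a direct call.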
Thread overview: 18+ messages
2013-07-13 15:51 Call for testing/opinions: Optimized memset/memcpy Harm Hanemaaijer
2013-07-13 16:48 ` Dr. David Alan Gilbert
2013-07-13 21:13 ` Harm Hanemaaijer
2013-07-15 13:15 ` Catalin Marinas
2013-07-14 11:19 ` Harm Hanemaaijer
2013-07-14 11:32 ` Dr. David Alan Gilbert
2013-07-14 11:37 ` Ard Biesheuvel
2013-07-14 13:13 ` Russell King - ARM Linux [this message]
2013-07-14 13:33 ` Harm Hanemaaijer
2013-07-14 14:09 ` Ard Biesheuvel
2013-07-14 14:32 ` Russell King - ARM Linux
2013-07-13 17:24 ` Willy Tarreau
2013-07-13 21:51 ` Harm Hanemaaijer
2013-07-14 6:13 ` Willy Tarreau
2013-07-14 11:00 ` Harm Hanemaaijer
2013-07-14 13:09 ` Russell King - ARM Linux
2013-07-14 13:59 ` Harm Hanemaaijer
2013-07-14 15:21 ` Siarhei Siamashka