From: Jeff Garzik <jgarzik@pobox.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Arjan van de Ven <arjanv@redhat.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [BK+PATCH] remove __constant_memcpy
Date: Thu, 17 Apr 2003 19:49:03 -0400 [thread overview]
Message-ID: <3E9F3D6F.9030501@pobox.com> (raw)
In-Reply-To: <Pine.LNX.4.44.0304171253270.2795-100000@home.transmeta.com>
[-- Attachment #1: Type: text/plain, Size: 1719 bytes --]
Linus Torvalds wrote:
> On Thu, 17 Apr 2003, Jeff Garzik wrote:
>
>>__constant_memcpy was used for small, constant-sized cases AFTER
>>the kernel made the decision not to hand the copy duties over to the
>>kernel's MMX/SSE code. Take a look at the bottom of the patch below,
>>and also this snip from a non-hacked string.h, for illustration...
>
>
> This is the part I don't like
>
> #define memcpy(t, f, n) \
> (__builtin_constant_p(n) ? \
> - __constant_memcpy((t),(f),(n)) : \
> + __builtin_memcpy((t),(f),(n)) : \
> __memcpy((t),(f),(n)))
>
> Notice? Our old __constant_memcpy() would do the rigth thing for large
> copies. In conrast, I don't know that gcc will do so.
If DTRT means just using the existing code for large copies in general,
that's easy enough... patch attached. I made the definition of "large
copy" more pessimistic, where <= 128 bytes goes to __builtin_memcpy,
otherwise to __constant_memcpy. I carried this over to the MMX side too
by proxy, as the existing MMX code already called __constant_memcpy.
Thus, the modification is now only in one place.
If DTRT means not using MMX/SSE[2], that's a given considering the above
code is from the "we don't have MMX/SSE" part of the ifdef. If gcc
starts using MMX with -march=i586 it's time for us all to go home and
write a new compiler ;-)
Three tangents:
1) where did the 486 string ops go?
2) why no sse2-optimized memcpy? just that noone has done one yet?
3) for copies >512 bytes, should we simply call the un-inlined memcpy?
I would think that the call would get lost in cache misses of the block
copy itself, but then again, __constant_memcpy (as it appears in output
asm) is pretty darn small already.
Jeff
[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2015 bytes --]
===== include/asm-i386/string.h 1.8 vs edited =====
--- 1.8/include/asm-i386/string.h Mon Mar 31 17:29:08 2003
+++ edited/include/asm-i386/string.h Thu Apr 17 19:45:20 2003
@@ -214,49 +214,9 @@
*/
static inline void * __constant_memcpy(void * to, const void * from, size_t n)
{
- switch (n) {
- case 0:
- return to;
- case 1:
- *(unsigned char *)to = *(const unsigned char *)from;
- return to;
- case 2:
- *(unsigned short *)to = *(const unsigned short *)from;
- return to;
- case 3:
- *(unsigned short *)to = *(const unsigned short *)from;
- *(2+(unsigned char *)to) = *(2+(const unsigned char *)from);
- return to;
- case 4:
- *(unsigned long *)to = *(const unsigned long *)from;
- return to;
- case 6: /* for Ethernet addresses */
- *(unsigned long *)to = *(const unsigned long *)from;
- *(2+(unsigned short *)to) = *(2+(const unsigned short *)from);
- return to;
- case 8:
- *(unsigned long *)to = *(const unsigned long *)from;
- *(1+(unsigned long *)to) = *(1+(const unsigned long *)from);
- return to;
- case 12:
- *(unsigned long *)to = *(const unsigned long *)from;
- *(1+(unsigned long *)to) = *(1+(const unsigned long *)from);
- *(2+(unsigned long *)to) = *(2+(const unsigned long *)from);
- return to;
- case 16:
- *(unsigned long *)to = *(const unsigned long *)from;
- *(1+(unsigned long *)to) = *(1+(const unsigned long *)from);
- *(2+(unsigned long *)to) = *(2+(const unsigned long *)from);
- *(3+(unsigned long *)to) = *(3+(const unsigned long *)from);
- return to;
- case 20:
- *(unsigned long *)to = *(const unsigned long *)from;
- *(1+(unsigned long *)to) = *(1+(const unsigned long *)from);
- *(2+(unsigned long *)to) = *(2+(const unsigned long *)from);
- *(3+(unsigned long *)to) = *(3+(const unsigned long *)from);
- *(4+(unsigned long *)to) = *(4+(const unsigned long *)from);
- return to;
- }
+ if (n <= 128)
+ return __builtin_memcpy(to, from, n);
+
#define COMMON(x) \
__asm__ __volatile__( \
"rep ; movsl" \
next prev parent reply other threads:[~2003-04-17 23:37 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-04-17 0:57 [BK+PATCH] remove __constant_memcpy Jeff Garzik
2003-04-17 1:04 ` Jeff Garzik
2003-04-17 2:06 ` Linus Torvalds
2003-04-17 8:46 ` Arjan van de Ven
2003-04-17 9:02 ` Roman Zippel
2003-04-17 9:04 ` Arjan van de Ven
2003-04-17 9:11 ` Jakub Jelinek
2003-04-17 16:07 ` Linus Torvalds
2003-04-17 19:07 ` Jeff Garzik
2003-04-17 19:19 ` Jeff Garzik
2003-04-17 19:54 ` Linus Torvalds
2003-04-17 23:49 ` Jeff Garzik [this message]
2003-04-17 23:52 ` Jeff Garzik
2003-04-17 23:59 ` Linus Torvalds
2003-04-18 0:29 ` H. Peter Anvin
2003-04-18 9:06 ` Arjan van de Ven
2003-04-18 14:31 ` Timothy Miller
2003-04-18 15:07 ` Richard B. Johnson
2003-04-17 22:58 ` J.A. Magallon
2003-04-17 23:10 ` Jeff Garzik
2003-04-17 13:17 ` Alan Cox
2003-04-17 13:17 ` Alan Cox
2003-04-17 14:32 ` Jeff Garzik
2003-04-17 14:40 ` Jeff Garzik
2003-04-17 20:01 ` H. Peter Anvin
-- strict thread matches above, loose matches on Subject: below --
2003-04-17 2:22 Nakajima, Jun
2003-04-17 23:50 Chuck Ebbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3E9F3D6F.9030501@pobox.com \
--to=jgarzik@pobox.com \
--cc=arjanv@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@transmeta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.