public inbox for linux-kernel@vger.kernel.org
From: Manfred Spraul <manfred@colorfullife.com>
To: Linus Torvalds <torvalds@transmeta.com>, linux-kernel@vger.kernel.org
Subject: Re: [beta patch] SSE copy_page() / clear_page()
Date: Wed, 14 Feb 2001 23:37:59 +0100
Message-ID: <3A8B08C7.BD79E3B4@colorfullife.com>
In-Reply-To: <3A846C84.109F1D7D@colorfullife.com> <200102092240.OAA15902@penguin.transmeta.com>

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

I have another idea for SSE, and this one is far safer:

Only use the SSE prefetch instructions, and leave the string operations
for the actual copy. The prefetch instructions only prefetch; they don't
touch the SSE registers, so there are neither reentrancy nor interrupt
problems.
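
A minimal user-space sketch of the same idea, for illustration only: hint
the source range into the cache first, then let the ordinary copy routine
do the work. The helper name copy_with_prefetch and the PREFETCH_LINE
constant are made up for this example; __builtin_prefetch is the GCC/Clang
builtin, and with locality 0 it typically becomes prefetchnta on x86 when
SSE is enabled (that lowering is an assumption about the compiler, not
something the patch relies on).

```c
#include <stddef.h>
#include <string.h>

/* Assumed cache-line size; 32 bytes matches the stride used in the patch. */
#define PREFETCH_LINE 32

/*
 * Hypothetical helper: issue non-temporal read prefetch hints for the
 * whole source range, then copy with the normal routine. The prefetch
 * is only a hint, so it cannot fault and uses no SSE registers.
 */
static void *copy_with_prefetch(void *dst, const void *src, size_t size)
{
	const char *s = src;
	size_t i;

	if (size > 128) {	/* same threshold as the patch */
		for (i = 0; i < size; i += PREFETCH_LINE)
			__builtin_prefetch(s + i, 0, 0); /* read, no temporal locality */
	}
	return memcpy(dst, src, size);
}
```

The copy itself is unchanged, so the result is identical with or without
the hints; only the timing can differ.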

I tried the attached hack^H^H^H^Hpatch, and read(fd, buf, 4000000) from
user space got 7% faster (from 264768842 cycles to 246303748 cycles,
single cpu, noacpi, 'linux -b', fastest time from several thousand
runs).
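
For anyone who wants to repeat the measurement, here is a rough sketch of a
"fastest of N runs" harness. It is not the program I used: the helper names
(now_ns, fastest_read_ns, speedup_percent) are made up for this example, and
it uses clock_gettime() in nanoseconds as a stand-in for the cycle counter.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in nanoseconds (assumed stand-in for a cycle counter). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: time `runs` read() calls from offset 0 and return
 * the fastest one. Returns UINT64_MAX if lseek() or read() fails.
 */
static uint64_t fastest_read_ns(int fd, void *buf, size_t len, int runs)
{
	uint64_t best = UINT64_MAX;
	int r;

	for (r = 0; r < runs; r++) {
		uint64_t t0, t1;
		ssize_t n;

		if (lseek(fd, 0, SEEK_SET) < 0)
			return UINT64_MAX;
		t0 = now_ns();
		n = read(fd, buf, len);
		t1 = now_ns();
		if (n < 0)
			return UINT64_MAX;
		if (t1 - t0 < best)
			best = t1 - t0;
	}
	return best;
}

/* Relative improvement in percent: (before - after) / before. */
static double speedup_percent(uint64_t before, uint64_t after)
{
	return 100.0 * (double)(before - after) / (double)before;
}
```

Feeding the two cycle counts above into speedup_percent() gives a value
just under 7, which is where the "7% faster" comes from.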

The reason why this works is simple:

Intel Pentium III and Pentium 4 have hardcoded "fast string copy"
operations that invalidate whole cache lines during writes (documented in
the most obvious place: multiprocessor management, memory ordering).

The result is a very fast write, but the read is still slow, and the read
is the part that the prefetch speeds up.

--
	Manfred

[-- Attachment #2: patch-sse-prefetchnta --]
[-- Type: text/plain, Size: 566 bytes --]

--- 2.4/mm/filemap.c	Wed Feb 14 10:51:42 2001
+++ build-2.4/mm/filemap.c	Wed Feb 14 22:11:44 2001
@@ -1248,6 +1248,20 @@
 		size = count;
 
 	kaddr = kmap(page);
+	if (size > 128) {
+		int i;
+		__asm__ __volatile__(
+			"movl (%1), %0\n\t"
+			: "=r" (i)
+			: "r" (kaddr+offset)); /* load tlb entry */
+		for(i=0;i<size;i+=64) {
+			__asm__ __volatile__(
+				"prefetchnta (%1, %0)\n\t"
+				"prefetchnta 32(%1, %0)\n\t"
+				: /* no output */
+				: "r" (i), "r" (kaddr+offset));
+		}
+	}
 	left = __copy_to_user(desc->buf, kaddr + offset, size);
 	kunmap(page);
 	
