From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: linux-ia64@vger.kernel.org
Subject: Re: flush_icache_range
Date: Mon, 23 May 2005 13:43:16 +0000
Message-ID: <4291DDF4.9060107@bull.net>
In-Reply-To: <4236D7B5.8050408@bull.net>
[-- Attachment #1: Type: text/plain, Size: 2667 bytes --]
The Itanium 2 Processor Reference Manual for Software Development and
Optimization (May 2004) says in chapter 5.8:
"In Itanium 2 processor, each fc will invalidate 128 bytes corresponding
to the L3 cache line size. Since both the L1I and L1D have line sizes of
64 bytes, a single fc instruction can invalidate two lines."
Can someone please confirm that an equivalent statement is true for
"fc.i", too?
For example:
"In Itanium 2 processor, each fc.i instruction will ensure that 128 bytes
(corresponding to the L3 cache line size) of the I-cache(s) be coherent with
the data caches. Since the L1I cache has line sizes of 64 bytes, a single
fc.i instruction can make coherent two lines."
This gave me the idea to try 128-byte strides
(each measurement is repeated 10 times):
Modified in d-cache:
cycles = 19,164 time = 14.782 usec
cycles = 18,060 time = 13.930 usec
cycles = 16,929 time = 13.058 usec
cycles = 17,597 time = 13.573 usec
cycles = 17,163 time = 13.239 usec
cycles = 16,990 time = 13.105 usec
cycles = 17,427 time = 13.442 usec
cycles = 17,028 time = 13.134 usec
cycles = 16,993 time = 13.107 usec
cycles = 16,930 time = 13.059 usec
Valid:
cycles = 13,514 time = 10.424 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,746 time = 10.603 usec
cycles = 13,866 time = 10.695 usec
cycles = 13,830 time = 10.668 usec
cycles = 13,790 time = 10.637 usec
cycles = 13,830 time = 10.668 usec
Invalid:
cycles = 13,794 time = 10.640 usec
cycles = 13,790 time = 10.637 usec
cycles = 13,830 time = 10.668 usec
cycles = 13,830 time = 10.668 usec
cycles = 13,966 time = 10.773 usec
cycles = 13,994 time = 10.794 usec
cycles = 14,074 time = 10.856 usec
cycles = 13,574 time = 10.470 usec
cycles = 13,902 time = 10.723 usec
cycles = 14,114 time = 10.887 usec
These cycle counts are incredibly low
compared to my previous results:
With a 32-byte stride:
Modified in d-cache: cycles = 215 K, time = 169 usec
Valid: cycles = 222 K, time = 171 usec
Invalid: cycles = 222 K, time = 171 usec
With a 64-byte stride:
Modified in d-cache: cycles = 63 K, time = 49 usec
Valid: cycles = 116 K, time = 89 usec
Invalid: cycles = 116 K, time = 89 usec
This is a Tiger box with the following CPUs:
processor : 0
vendor : GenuineIntel
arch : IA-64
family : Itanium 2
model : 1
revision : 5
archrev : 0
features : branchlong
cpu number : 0
cpu regs : 4
cpu MHz : 1296.435998
itc MHz : 1296.435998
BogoMIPS : 1941.96
etc...
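As a sanity check on the figures above, the usec column appears to follow
from the cycle counts and the 1296.435998 MHz clock reported for this CPU.
A minimal sketch in C, where cycles_to_usec is a hypothetical helper name
and not part of the patch:

```c
#include <assert.h>

/*
 * Convert a cycle count to microseconds for a clock given in MHz.
 * One MHz is one cycle per microsecond, so usec = cycles / MHz.
 * (Hypothetical helper, for illustration only.)
 */
double cycles_to_usec(unsigned long cycles, double mhz)
{
    assert(mhz > 0.0);
    return (double)cycles / mhz;
}
```

For instance, cycles_to_usec(19164, 1296.435998) comes out at roughly
14.78 usec, in line with the first 128-byte-stride measurement.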
Can these results be real?
Thanks,
Zoltan
[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 1217 bytes --]
--- linux-2.6.11-orig/arch/ia64/lib/flush.S 2005-04-26 15:59:49.000000000 +0200
+++ linux-2.6.11/arch/ia64/lib/flush.S 2005-05-23 15:30:24.891935385 +0200
@@ -7,6 +7,22 @@
#include <asm/asmmacro.h>
#include <asm/page.h>
+
+#if defined(CONFIG_ITANIUM)
+#define CACHE_SHIFT 5
+#else
+/*
+ * In Itanium 2 processor, each fc.i instruction will ensure that 128 bytes
+ * (corresponding to the L3 cache line size) of the I-cache(s) be coherent with
+ * the data caches. Since the L1I cache has line sizes of 64 bytes, a single
+ * fc.i instruction can make coherent two lines.
+ */
+#define CACHE_SHIFT 7
+#endif
+
+#define CACHE_BYTES (1 << CACHE_SHIFT)
+
+
/*
* flush_icache_range(start,end)
* Must flush range from start to end-1 but nothing else (need to
@@ -17,7 +33,7 @@
alloc r2=ar.pfs,2,0,0,0
sub r8=in1,in0,1
;;
- shr.u r8=r8,5 // we flush 32 bytes per iteration
+ shr.u r8=r8,CACHE_SHIFT // we flush CACHE_BYTES bytes per iteration
.save ar.lc, r3
mov r3=ar.lc // save ar.lc
;;
@@ -26,8 +42,8 @@
mov ar.lc=r8
;;
-.Loop: fc in0 // issuable on M0 only
- add in0=32,in0
+.Loop: fc.i in0 // issuable on M0 only
+ add in0=CACHE_BYTES,in0
br.cloop.sptk.few .Loop
;;
sync.i
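The loop count in the patch comes from shifting (end - start - 1) right by
CACHE_SHIFT. The arithmetic can be sketched in C as below, assuming that
br.cloop executes the loop body ar.lc + 1 times and that nothing outside
the hunks shown aligns in0. The function names are hypothetical and only
serve to compare the iteration count with the number of lines the range
actually touches:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Iterations executed by the loop as written in the patch: ar.lc is
 * loaded with (end - start - 1) >> shift, and br.cloop is assumed to
 * run the body ar.lc + 1 times.
 */
uint64_t loop_iterations(uint64_t start, uint64_t end, unsigned shift)
{
    assert(end > start);
    return ((end - start - 1) >> shift) + 1;
}

/*
 * Number of (1 << shift)-byte lines touched by the byte range
 * [start, end - 1], counting partially covered lines.
 */
uint64_t lines_spanned(uint64_t start, uint64_t end, unsigned shift)
{
    assert(end > start);
    return ((end - 1) >> shift) - (start >> shift) + 1;
}
```

For a line-aligned start the two agree: a 4096-byte range starting at 0
gives 32 in both cases with a shift of 7. For a range starting at 0x7f and
ending at 0x81, the first function gives 1 while the second gives 2, so it
may be worth checking whether the start address should be aligned down to
CACHE_BYTES before the loop.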