From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: linux-ia64@vger.kernel.org
Subject: Re: flush_icache_range
Date: Mon, 23 May 2005 13:43:16 +0000 [thread overview]
Message-ID: <4291DDF4.9060107@bull.net> (raw)
In-Reply-To: <4236D7B5.8050408@bull.net>
[-- Attachment #1: Type: text/plain, Size: 2667 bytes --]
The Itanium 2 processor Reference Manual for SW development & optimization
(May 2004) says in chapter 5.8:
"In Itanium 2 processor, each fc will invalidate 128 bytes corresponding
to the L3 cache line size. Since both the L1I and L1D have line sizes of
64 bytes, a single fc instruction can invalidate two lines."
Can someone please confirm that an equivalent statement holds for
"fc.i", too?
Say:
"In Itanium 2 processor, each fc.i instruction will ensure that 128 bytes
(corresponding to the L3 cache line size) of the I-cache(s) be coherent with
the data caches. Since the L1I cache has a line size of 64 bytes, a single
fc.i instruction can make two lines coherent."
This gave me the idea to try with a 128-byte stride
(the measurements are repeated 10 times):
Modified in d-cache:
cycles = 19,164 time = 14.782 usec
cycles = 18,060 time = 13.930 usec
cycles = 16,929 time = 13.058 usec
cycles = 17,597 time = 13.573 usec
cycles = 17,163 time = 13.239 usec
cycles = 16,990 time = 13.105 usec
cycles = 17,427 time = 13.442 usec
cycles = 17,028 time = 13.134 usec
cycles = 16,993 time = 13.107 usec
cycles = 16,930 time = 13.059 usec
Valid:
cycles = 13,514 time = 10.424 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,518 time = 10.427 usec
cycles = 13,746 time = 10.603 usec
cycles = 13,866 time = 10.695 usec
cycles = 13,830 time = 10.668 usec
cycles = 13,790 time = 10.637 usec
cycles = 13,830 time = 10.668 usec
Invalid:
cycles = 13,794 time = 10.640 usec
cycles = 13,790 time = 10.637 usec
cycles = 13,830 time = 10.668 usec
cycles = 13,830 time = 10.668 usec
cycles = 13,966 time = 10.773 usec
cycles = 13,994 time = 10.794 usec
cycles = 14,074 time = 10.856 usec
cycles = 13,574 time = 10.470 usec
cycles = 13,902 time = 10.723 usec
cycles = 14,114 time = 10.887 usec
I got these incredibly low cycle counts,
compared to my previous results:
With a 32-byte stride:
Modified in d-cache: cycles = 215 K, time = 169 usec
Valid: cycles = 222 K, time = 171 usec
Invalid: cycles = 222 K, time = 171 usec
With a 64-byte stride:
Modified in d-cache: cycles = 63 K, time = 49 usec
Valid: cycles = 116 K, time = 89 usec
Invalid: cycles = 116 K, time = 89 usec
This is a Tiger box with the following CPUs:
processor : 0
vendor : GenuineIntel
arch : IA-64
family : Itanium 2
model : 1
revision : 5
archrev : 0
features : branchlong
cpu number : 0
cpu regs : 4
cpu MHz : 1296.435998
itc MHz : 1296.435998
BogoMIPS : 1941.96
etc...
Can these results be real?
Thanks,
Zoltan
[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 1217 bytes --]
--- linux-2.6.11-orig/arch/ia64/lib/flush.S 2005-04-26 15:59:49.000000000 +0200
+++ linux-2.6.11/arch/ia64/lib/flush.S 2005-05-23 15:30:24.891935385 +0200
@@ -7,6 +7,22 @@
#include <asm/asmmacro.h>
#include <asm/page.h>
+
+#if defined(CONFIG_ITANIUM)
+#define CACHE_SHIFT 5
+#else
+/*
+ * In Itanium 2 processor, each fc.i instruction will ensure that 128 bytes
+ * (corresponding to the L3 cache line size) of the I-cache(s) be coherent with
+ * the data caches. Since the L1I cache has line sizes of 64 bytes, a single
+ * fc.i instruction can make coherent two lines.
+ */
+#define CACHE_SHIFT 7
+#endif
+
+#define CACHE_BYTES (1 << CACHE_SHIFT)
+
+
/*
* flush_icache_range(start,end)
* Must flush range from start to end-1 but nothing else (need to
@@ -17,7 +33,7 @@
alloc r2=ar.pfs,2,0,0,0
sub r8=in1,in0,1
;;
- shr.u r8=r8,5 // we flush 32 bytes per iteration
+ shr.u r8=r8,CACHE_SHIFT // we flush CACHE_BYTES bytes per iteration
.save ar.lc, r3
mov r3=ar.lc // save ar.lc
;;
@@ -26,8 +42,8 @@
mov ar.lc=r8
;;
-.Loop: fc in0 // issuable on M0 only
- add in0=32,in0
+.Loop: fc.i in0 // issuable on M0 only
+ add in0=CACHE_BYTES,in0
br.cloop.sptk.few .Loop
;;
sync.i