* [parisc-linux] 2.6.10-rc1-pa11 profile data
@ 2004-11-11 7:54 Grant Grundler
2004-11-11 8:11 ` Randolph Chung
2004-11-12 5:29 ` [parisc-linux] 2.6.10-rc1-pa11 profile data Grant Grundler
0 siblings, 2 replies; 25+ messages in thread
From: Grant Grundler @ 2004-11-11 7:54 UTC (permalink / raw)
To: parisc-linux
I was comparing "time" output for various flavors of
kernels and arches. We are something like 11m/17m/5m
for real/user/sys on a dual J6700 (dual 750MHz) running a
2.6.10-rc1-pa11-32SMP kernel, building a 64-bit kernel
(using gcc 3.0.4). Similar numbers for a J6000 (dual 550MHz)
doing a 32-bit kernel build (gcc 3.3.x): 14m/22m/5m.
While this might look very favorable compared to a similar full kernel
build on a 1.5GHz RX2600, which takes about as long (11m/20m/1m),
the ia64 machine spends less than 1m in the kernel, versus our 5m.
I've collected two profiles for -64SMP and will collect
some UP profiles tomorrow. The profiles so far measure
a full kernel build. I expect I'll do the same for -64UP
kernels too.
What I have so far is on:
http://www.parisc-linux.org/~grundler/prof-j6700/
d- and i-cache flushing routines are still the top consumers.
hth,
grant
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 7:54 [parisc-linux] 2.6.10-rc1-pa11 profile data Grant Grundler
@ 2004-11-11 8:11 ` Randolph Chung
2004-11-11 17:39 ` Carlos O'Donell
` (5 more replies)
2004-11-12 5:29 ` [parisc-linux] 2.6.10-rc1-pa11 profile data Grant Grundler
1 sibling, 6 replies; 25+ messages in thread
From: Randolph Chung @ 2004-11-11 8:11 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
> I've collect two profiles for -64SMP and will collect
> some UP profiles tomorrow. profiles so far are measuring
> a full kernel build. I expect I'll do the same for -64UP
> kernels too.
hmm.. interesting. top consumers are (with idle loop functions removed)
40646 flush_kernel_icache_page 406.4600
7364 fdsync 368.2000
10567 flush_user_dcache_range_asm 293.5278
10387 flush_user_icache_range_asm 288.5278
21409 __clear_user_page_asm 191.1518
5356 _spin_lock_irqsave 111.5833
1768 fisync 110.5000
1928 _spin_lock 48.2000
4255 purge_kernel_dcache_page 42.5500
339 $lclu_done 42.3750
4089 flush_kernel_dcache_page 40.8900
5053 copy_user_page_asm 33.2434
569 _write_unlock_irq 17.7812
422 _spin_unlock 17.5833
1567 find_vma_prev 16.3229
181 $lslen_loop 15.0833
96 $lslen_done 12.0000
996 _write_trylock 11.3182
137 $lsfu_loop 8.5625
748 flush_user_dcache_page 7.4800
we really need to do better at cache flushing..... anybody have any
ideas? :)
but looking at the other ones:
- __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
at a time instead of 4
- _spin_lock* needs investigation to see if we have some bad locks
someplace. lockmeter anybody?
- *lclu* can be rewritten to do better than 1-byte at a time
- copy_user_page_asm can be sped up slightly by using pa_memcpy, but not
much when i tried last time
- *lslen* can also probably be written in a smarter way...
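As a rough C illustration of the __clear_user_page_asm point above: doubling the store width halves the number of stores needed for the same buffer. This is only a sketch, not the kernel's code. The real routine is hand-written assembly, and toy_clear_words and toy_clear_doublewords are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero nbytes using 4-byte stores; returns how many stores were issued. */
static size_t toy_clear_words(uint32_t *p, size_t nbytes)
{
    assert(nbytes % sizeof(uint32_t) == 0);
    size_t stores = nbytes / sizeof(uint32_t);

    for (size_t i = 0; i < stores; i++)
        p[i] = 0;               /* one 32-bit store per element */
    return stores;
}

/* Zero nbytes using 8-byte stores; half as many stores for the same size. */
static size_t toy_clear_doublewords(uint64_t *p, size_t nbytes)
{
    assert(nbytes % sizeof(uint64_t) == 0);
    size_t stores = nbytes / sizeof(uint64_t);

    for (size_t i = 0; i < stores; i++)
        p[i] = 0;               /* one 64-bit store per element */
    return stores;
}
```

Assuming a 4096-byte page, that is 1024 stores versus 512. Whether it shows up as a speedup probably depends on how memory-bound the loop is.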
i suspect some areas for further investigation are:
- can we do tlb_flush_mm() in a smarter way for SMP?
- can we improve kernel entry time for interrupts (and syscalls) by
being smarter about what we save on the stack? (i.e. only callee-save
registers and not all the registers?)
volunteers? :)
randolph
--
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 8:11 ` Randolph Chung
@ 2004-11-11 17:39 ` Carlos O'Donell
2004-11-11 17:42 ` Randolph Chung
2004-11-11 18:23 ` Joel Soete
` (4 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Carlos O'Donell @ 2004-11-11 17:39 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
> - can we improve kernel entry time for interrupts (and syscalls) by
> being smarter about what we save on the stack? (i.e. only callee-save
> registers and not all the registers?)
>
> volunteers? :)
I have been stewing over the following:
Leave the existing syscall save-everything code in place.
Create a branch in front of the syscall save-everything code that
branches on the value of "enable_fast_syscall".
The variable is set via some mechanism. What's the currently accepted
way? A /proc twiddle?
The branched-to code path contains the fast path that saves only the
callee-save registers.
Allow a compile time option to switch kernel syscalls to the 'fast'
function call ABI method for people who know they are installing
on a recent glibc.
That's much later on my todo list, but because sometimes I get
frustrated with binutils I go to work on other things for a break :)
Cheers,
Carlos.
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 17:39 ` Carlos O'Donell
@ 2004-11-11 17:42 ` Randolph Chung
2004-11-11 17:50 ` Matthew Wilcox
0 siblings, 1 reply; 25+ messages in thread
From: Randolph Chung @ 2004-11-11 17:42 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: parisc-linux
> I have been stewing over the following:
>
> Leave the existing syscall save everything code in place.
>
> Create a branch infront of the syscall save everything code that
> branches on the value of "enable_fast_syscall"
eh? nononono. we should *always* be able to only preserve callee-saved
registers. From the application point of view, when they call e.g.
read(), it is a function call. The app should not expect any
caller-saved registers to be preserved across the function/system call.
randolph
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 17:42 ` Randolph Chung
@ 2004-11-11 17:50 ` Matthew Wilcox
2004-11-11 17:59 ` Randolph Chung
0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2004-11-11 17:50 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
On Thu, Nov 11, 2004 at 09:42:58AM -0800, Randolph Chung wrote:
> eh? nononono. we should *always* be able to only preserve callee-saved
> registers. From the application point of view, when they call e.g.
> read(), it is a function call. The app should not expect any
> caller-saved registers to be preserved across the function/system call.
As I'm sure you already know, we do have to be careful to avoid leaking
kernel-internal or another task's information in the registers that
are call-clobbered. I know some architectures do this by having a
kernel exit path that deliberately clobbers as many registers as possible.
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 17:50 ` Matthew Wilcox
@ 2004-11-11 17:59 ` Randolph Chung
2004-11-11 18:36 ` Grant Grundler
0 siblings, 1 reply; 25+ messages in thread
From: Randolph Chung @ 2004-11-11 17:59 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
> As I'm sure you already know, we do have to be careful to avoid leaking
> kernel-internal or another task's information in the registers that
> are call-clobbered. I know some architectures do this by having a
> kernel exit path that deliberately clobbers as many registers as possible.
sure, we can zero all the call clobbered registers on exit. But not
having to save all of those pesky floating point registers and half a
dozen general registers should still be a huge win.
randolph
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 8:11 ` Randolph Chung
2004-11-11 17:39 ` Carlos O'Donell
@ 2004-11-11 18:23 ` Joel Soete
2004-11-11 18:51 ` Randolph Chung
2004-11-26 16:59 ` flush_kernel_[di]cache_page question? [WAS: " Joel Soete
` (3 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Joel Soete @ 2004-11-11 18:23 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
Randolph Chung wrote:
>>I've collect two profiles for -64SMP and will collect
>>some UP profiles tomorrow. profiles so far are measuring
>>a full kernel build. I expect I'll do the same for -64UP
>>kernels too.
>
>
> hmm.. interesting. top consumers are (with idle loop functions removed)
>
> 40646 flush_kernel_icache_page 406.4600
> 7364 fdsync 368.2000
> 10567 flush_user_dcache_range_asm 293.5278
> 10387 flush_user_icache_range_asm 288.5278
mmm (maybe another stupid remark, but) I noticed that:
748 flush_user_dcache_page 7.4800
648 flush_user_icache_page 6.4800
4255 purge_kernel_dcache_page 42.5500
10567 flush_user_dcache_range_asm 293.5278
10387 flush_user_icache_range_asm 288.5278
40646 flush_kernel_icache_page 406.4600
10 flush_kernel_icache_range_asm 0.0862
i.e. flush_kernel_[di]cache_page is used far more than flush_kernel_[di]cache_range_asm, while flush_user_[di]cache_range_asm is
used more than flush_user_[di]cache_page.
Isn't it strange?
[...]
mmm also:
49576 machine_restart 774.6250
??
(I don't understand, because the stats were reset with "readprofile -r" before the build)
>
> we really need to do better at cache flushing..... anybody have any
> ideas? :)
>
> but looking at the other ones:
> - __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
> at a time instead of 4
> - _spin_lock* needs investigation to see if we have some bad locks
> someplace. lockmeter anybody?
> - *lclu* can be rewritten to do better than 1-byte at a time
> - copy_user_page_asm can be sped up slightly by using pa_memcpy, but not
> much when i tried last time
> - *lslen* can also probably be written in a smarter way...
>
> i suspect some areas for further investigation are:
> - can we do tlb_flush_mm() in a smarter way for SMP?
> - can we improve kernel entry time for interrupts (and syscalls) by
> being smarter about what we save on the stack? (i.e. only callee-save
> registers and not all the registers?)
>
> volunteers? :)
>
I can't really help more, but I will take a look in more detail from time to time :)
Thanks,
Joel
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 17:59 ` Randolph Chung
@ 2004-11-11 18:36 ` Grant Grundler
0 siblings, 0 replies; 25+ messages in thread
From: Grant Grundler @ 2004-11-11 18:36 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
On Thu, Nov 11, 2004 at 09:59:33AM -0800, Randolph Chung wrote:
> sure, we can zero all the call clobbered registers on exit. But not
> having to save all of those pesky floating point registers and half a
> dozen general registers should still be a huge win.
Randolph and I talked about this more privately.
In a nutshell, "huge win" is slightly overstating it and we agree
fixing the cache utilization would be a much bigger win.
Randolph thinks we can save 20 loads and stores per interrupt
and potential context switch. The thinking is we are saving/restoring
some registers twice and should split the save/restore between
interrupt/trap and context switch code. So if no context switch is
performed, we only save/restore a subset of the registers manually
and the rest are preserved according to the ABI.
Did I get that right?
thanks,
grant
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 18:23 ` Joel Soete
@ 2004-11-11 18:51 ` Randolph Chung
0 siblings, 0 replies; 25+ messages in thread
From: Randolph Chung @ 2004-11-11 18:51 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
> i.e. flush_kernel_[di]cache_page is used far more than
> flush_kernel_[di]cache_range_asm, while flush_user_[di]cache_range_asm is
> used more than flush_user_[di]cache_page.
>
> Isn't it strange?
could it be that kernel mappings tend to be bigger and user mappings
tend to be smaller? i'm only guessing here...
> mmm also:
> 49576 machine_restart 774.6250
this is an artifact of the way the measurements are done. these are
actually calls to cpu_idle().
randolph
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 7:54 [parisc-linux] 2.6.10-rc1-pa11 profile data Grant Grundler
2004-11-11 8:11 ` Randolph Chung
@ 2004-11-12 5:29 ` Grant Grundler
1 sibling, 0 replies; 25+ messages in thread
From: Grant Grundler @ 2004-11-12 5:29 UTC (permalink / raw)
To: parisc-linux
On Thu, Nov 11, 2004 at 12:54:31AM -0700, Grant Grundler wrote:
> I've collect two profiles for -64SMP and will collect
> some UP profiles tomorrow.
"tomorrow" finally arrived. :^)
> http://www.parisc-linux.org/~grundler/prof-j6700/
I've added the 64-bit UP profile numbers as promised.
And some of the top consumers look familiar:
root@gggj6k:~# sort -rnk 3 prof-2.6.10-rc1-pa11-64-01.txt
40150 flush_kernel_icache_page 401.5000
6798 fdsync 339.9000
10645 flush_user_dcache_range_asm 295.6944
10353 flush_user_icache_range_asm 287.5833
13871 machine_restart 216.7344
20839 __clear_user_page_asm 186.0625
10478 cpu_idle 145.5278
1380 fisync 86.2500
365 $lclu_done 45.6250
3794 purge_kernel_dcache_page 37.9400
3535 flush_kernel_dcache_page 35.3500
3279 copy_user_page_asm 21.5724
1358 find_get_page 18.8611
185 $lslen_loop 15.4167
1228 find_vma_prev 12.7917
101 $lslen_done 12.6250
128 $lsfu_loop 8.0000
162 file_ra_state_init 6.7500
...
enjoy!
grant
* flush_kernel_[di]cache_page question? [WAS: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-11 8:11 ` Randolph Chung
2004-11-11 17:39 ` Carlos O'Donell
2004-11-11 18:23 ` Joel Soete
@ 2004-11-26 16:59 ` Joel Soete
2004-11-26 17:13 ` Randolph Chung
2004-11-26 19:02 ` Grant Grundler
2004-11-28 21:01 ` [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data] Joel Soete
` (2 subsequent siblings)
5 siblings, 2 replies; 25+ messages in thread
From: Joel Soete @ 2004-11-26 16:59 UTC (permalink / raw)
To: Randolph Chung, Grant Grundler; +Cc: parisc-linux
Hello all,
>
> hmm.. interesting. top consumers are (with idle loop functions removed)
>
> 40646 flush_kernel_icache_page 406.4600
[...]
> 4089 flush_kernel_dcache_page 40.8900
[...]
> we really need to do better at cache flushing..... anybody have any
> ideas? :)
>
Can somebody help me understand these:
[...]
flush_kernel_dcache_page:
.proc
.callinfo NO_CALLS
.entry
ldil L%dcache_stride,%r1
ldw R%dcache_stride(%r1),%r23
#ifdef __LP64__
depdi,z 1,63-PAGE_SHIFT,1,%r25
#else
depwi,z 1,31-PAGE_SHIFT,1,%r25
#endif
add %r26,%r25,%r25
sub %r25,%r23,%r25
1: fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
fdc,m %r23(%r26)
CMPB<< %r26,%r25,1b
fdc,m %r23(%r26)
sync
bv %r0(%r2)
nop
.exit
.procend
.export flush_user_dcache_page
[...]
flush_kernel_icache_page:
.proc
.callinfo NO_CALLS
.entry
ldil L%icache_stride,%r1
ldw R%icache_stride(%r1),%r23
#ifdef __LP64__
depdi,z 1,63-PAGE_SHIFT,1,%r25
#else
depwi,z 1,31-PAGE_SHIFT,1,%r25
#endif
add %r26,%r25,%r25
sub %r25,%r23,%r25
1: fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
fic,m %r23(%r26)
CMPB<< %r26,%r25,1b
fic,m %r23(%r26)
sync
bv %r0(%r2)
nop
.exit
.procend
[...]
I tried googling the p-l m-l but couldn't find out why there is this chain
of 15 f[di]c,m?
Thanks in advance for your attention,
Joel
* Re: flush_kernel_[di]cache_page question? [WAS: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-26 16:59 ` flush_kernel_[di]cache_page question? [WAS: " Joel Soete
@ 2004-11-26 17:13 ` Randolph Chung
2004-11-26 19:02 ` Grant Grundler
1 sibling, 0 replies; 25+ messages in thread
From: Randolph Chung @ 2004-11-26 17:13 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
> I try google on the p-l m-l but I didn't reach to find out why those chain
> of 15 f[di]c,m?
16, actually (including the one in the delay slot of the CMPB).
,m is post increment, so we are just doing an unrolled loop of
fdc 0(address)
address = address + cache_stride
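The loop shape in the listing can be rendered in C roughly as follows. This is a sketch under assumptions: toy_flush_line is a hypothetical stand-in for a single fdc/fic that only counts calls, and the page size is assumed to be a multiple of 16 strides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one "fdc"/"fic": here it only counts lines. */
static size_t toy_lines_flushed;

static void toy_flush_line(uintptr_t addr)
{
    (void)addr;
    toy_lines_flushed++;
}

/*
 * 16 post-incrementing flushes per iteration, the last one sitting in
 * the branch delay slot. Assumes page_size is a multiple of 16 * stride.
 */
static void toy_flush_page(uintptr_t addr, size_t page_size, size_t stride)
{
    assert(stride != 0 && page_size % (16 * stride) == 0);
    uintptr_t last = addr + page_size - stride;   /* %r25 in the listing */
    int again;

    do {
        for (int i = 0; i < 15; i++) {   /* the 15 visible fdc,m */
            toy_flush_line(addr);
            addr += stride;              /* ",m": post-increment the base */
        }
        again = addr < last;             /* CMPB<< compares first ... */
        toy_flush_line(addr);            /* ... then the delay-slot fdc,m runs */
        addr += stride;
    } while (again);
}
```

With a 4096-byte page and a 32-byte stride this makes 8 trips through the loop and touches 128 lines, i.e. every line in the page exactly once.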
randolph
* Re: flush_kernel_[di]cache_page question? [WAS: [parisc-linux] 2.6.10-rc1-pa11 profile data
2004-11-26 16:59 ` flush_kernel_[di]cache_page question? [WAS: " Joel Soete
2004-11-26 17:13 ` Randolph Chung
@ 2004-11-26 19:02 ` Grant Grundler
1 sibling, 0 replies; 25+ messages in thread
From: Grant Grundler @ 2004-11-26 19:02 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
On Fri, Nov 26, 2004 at 05:59:18PM +0100, Joel Soete wrote:
> > 40646 flush_kernel_icache_page 406.4600
> [...]
> > 4089 flush_kernel_dcache_page 40.8900
> [...]
> > we really need to do better at cache flushing..... anybody have any
> > ideas? :)
> >
> Is somebody can help me to understand those:
> [...]
> flush_kernel_dcache_page:
Joel,
the problem is not in the flush_kernel_dcache_page() routine.
The problem is we are calling it too often.
grant
* [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-11-11 8:11 ` Randolph Chung
` (2 preceding siblings ...)
2004-11-26 16:59 ` flush_kernel_[di]cache_page question? [WAS: " Joel Soete
@ 2004-11-28 21:01 ` Joel Soete
2004-11-28 21:13 ` Matthew Wilcox
2004-12-01 17:44 ` More questions " Joel Soete
2004-12-03 15:00 ` *lcul and memory granularity question[Was: " Joel Soete
5 siblings, 1 reply; 25+ messages in thread
From: Joel Soete @ 2004-11-28 21:01 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
Hello all,
Randolph Chung wrote:
>>I've collect two profiles for -64SMP and will collect
>>some UP profiles tomorrow. profiles so far are measuring
>>a full kernel build. I expect I'll do the same for -64UP
>>kernels too.
>
>
> hmm.. interesting. top consumers are (with idle loop functions removed)
>
> 40646 flush_kernel_icache_page 406.4600
> 7364 fdsync 368.2000
> 10567 flush_user_dcache_range_asm 293.5278
> 10387 flush_user_icache_range_asm 288.5278
I have an additional question about these functions:
* on parisc, does ..._dcache_... above really refer to the data cache?
* and, respectively, ..._icache_... to the instruction cache?
Do they have a different meaning in generic linux?
My confusion came from:
include/asm-parisc/cacheflush.h:
[...]
#define flush_icache_page(vma,page) do { \
	flush_kernel_dcache_page(page_address(page)); \
	flush_kernel_icache_page(page_address(page)); } while (0)
[...]
Thanks again,
Joel
PS: I don't suspect any error, I am just confused :(
* Re: [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-11-28 21:01 ` [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data] Joel Soete
@ 2004-11-28 21:13 ` Matthew Wilcox
2004-11-29 1:14 ` Michael S. Zick
0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2004-11-28 21:13 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
On Sun, Nov 28, 2004 at 09:01:42PM +0000, Joel Soete wrote:
> I have additional question about such functions:
>
> * in parisc above ..._dcache_... refer well to data cache?
> * and respectively ..._icache_... refer to instruction cache?
>
> Have they different meaning for generic linux?
No, your understanding is correct (see Documentation/cachetlb.txt)
> The confusion came for me from:
>
> include/asm-parisc/cacheflush.h:
> [...]
> #define flush_icache_page(vma,page) do {
> flush_kernel_dcache_page(page_address(page));
> flush_kernel_icache_page(page_address(page)); } while (0)
> [...]
I see why this confuses you. PA-RISC has writeback data caches that
are non-coherent with the instruction cache. So it's not enough to
just flush the icache; if the page has been modified, we need to force
the data in the dcache back to ram, then remove any existing cache for
instructions in that page. Then instruction accesses to that page will
fetch the correct data from memory and everything will work.
Many other architectures have writethrough data caches. They don't need
to flush the dcache.
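The ordering argument above can be illustrated with a toy single-line model in C. This is only a sketch: struct toy_line and the toy_* functions are invented for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one memory location seen through split I/D caches. */
struct toy_line {
    uint32_t ram;       /* value in main memory */
    uint32_t dcache;    /* value held in the writeback data cache */
    bool     d_dirty;   /* dcache holds a newer value than ram */
    uint32_t icache;    /* value held in the instruction cache */
    bool     i_valid;   /* icache copy is usable */
};

/* A store lands in the dcache only (writeback): ram is not updated yet. */
static void toy_store(struct toy_line *l, uint32_t v)
{
    assert(l != NULL);
    l->dcache = v;
    l->d_dirty = true;
}

/* Instruction fetch: use the icache if valid, else refill from ram. */
static uint32_t toy_ifetch(struct toy_line *l)
{
    assert(l != NULL);
    if (!l->i_valid) {
        l->icache = l->ram;
        l->i_valid = true;
    }
    return l->icache;
}

/* Step 1: push dirty data back to ram. */
static void toy_flush_dcache(struct toy_line *l)
{
    if (l->d_dirty) {
        l->ram = l->dcache;
        l->d_dirty = false;
    }
}

/* Step 2: drop the read-only icache copy; nothing is written back. */
static void toy_flush_icache(struct toy_line *l)
{
    l->i_valid = false;
}

/* Same order as the flush_icache_page() macro: dcache first, then icache. */
static void toy_flush_icache_page(struct toy_line *l)
{
    toy_flush_dcache(l);
    toy_flush_icache(l);
}
```

In this model, calling only toy_flush_icache() after a store makes the next fetch refill from stale ram, while toy_flush_icache_page() makes it see the new value.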
* Re: [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-11-28 21:13 ` Matthew Wilcox
@ 2004-11-29 1:14 ` Michael S. Zick
2004-11-29 2:00 ` Matthew Wilcox
0 siblings, 1 reply; 25+ messages in thread
From: Michael S. Zick @ 2004-11-29 1:14 UTC (permalink / raw)
To: parisc-linux
On Sun November 28 2004 15:13, Matthew Wilcox wrote:
> On Sun, Nov 28, 2004 at 09:01:42PM +0000, Joel Soete wrote:
> >
> > include/asm-parisc/cacheflush.h:
> > [...]
> > #define flush_icache_page(vma,page) do {
> > flush_kernel_dcache_page(page_address(page));
> > flush_kernel_icache_page(page_address(page)); } while (0)
> > [...]
>
> I see why this confuses you. PA-RISC has writeback data caches that
> are non-coherent with the instruction cache. So it's not enough to
> just flush the icache; if the page has been modified, we need to force
> the data in the dcache back to ram, then remove any existing cache for
> instructions in that page. Then instruction accesses to that page will
> fetch the correct data from memory and everything will work.
>
Matt, Joel,
Here is a, perhaps dumb, question from a non-parisc source...
I note Matt's statement: "...then remove any existing cache for
instructions in that page."
Which sounds very reasonable.
Question:
Is the:
> > flush_kernel_icache_page(page_address(page));
(or the hardware that receives the command)
smart enough to just mark the page 'invalid' or is it
actually an 'absolute update external storage'?
I ask because both flush commands are written the
same, BUT...
The first should be an 'absolute update external
storage'.
The second should be either just a 'mark
invalid' or 'conditional update external storage'.
Mike
* Re: [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-11-29 1:14 ` Michael S. Zick
@ 2004-11-29 2:00 ` Matthew Wilcox
0 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2004-11-29 2:00 UTC (permalink / raw)
To: Michael S. Zick; +Cc: parisc-linux
On Sun, Nov 28, 2004 at 07:14:14PM -0600, Michael S. Zick wrote:
> Question:
> Is the:
> > > flush_kernel_icache_page(page_address(page));
> (or the hardware that receives the command)
>
> smart enough to just mark the page 'invalid' or is it
> actually an 'absolute update external storage'?
The I-cache is, by definition, read-only, so there's nothing to update
main memory with.
* More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-11-11 8:11 ` Randolph Chung
` (3 preceding siblings ...)
2004-11-28 21:01 ` [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data] Joel Soete
@ 2004-12-01 17:44 ` Joel Soete
2004-12-01 17:56 ` Matthew Wilcox
2004-12-03 10:24 ` Joel Soete
2004-12-03 15:00 ` *lcul and memory granularity question[Was: " Joel Soete
5 siblings, 2 replies; 25+ messages in thread
From: Joel Soete @ 2004-12-01 17:44 UTC (permalink / raw)
To: Randolph Chung, Grant Grundler; +Cc: parisc-linux
[...]
>
> but looking at the other ones:
> - __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
> at a time instead of 4
Is this the idea (principle only):
--- arch/parisc/kernel/pacache.S.Orig 2004-12-01 17:09:53.000000000 +0100
+++ arch/parisc/kernel/pacache.S-t1 2004-12-01 17:58:16.000000000 +0100
@@ -505,6 +505,16 @@
ldi 64,%r1
1:
+#ifdef __LP64__
+ std %r0,0(%r28)
+ std %r0,8(%r28)
+ std %r0,16(%r28)
+ std %r0,24(%r28)
+ std %r0,32(%r28)
+ std %r0,40(%r28)
+ std %r0,48(%r28)
+ std %r0,56(%r28)
+#else
stw %r0,0(%r28)
stw %r0,4(%r28)
stw %r0,8(%r28)
@@ -521,6 +531,7 @@
stw %r0,52(%r28)
stw %r0,56(%r28)
stw %r0,60(%r28)
+#endif
ADDIB> -1,%r1,1b
ldo 64(%r28),%r28
I doubt that's enough, because I haven't yet found how r0 is set?
[...]
Thanks in advance for your attention,
Joel
* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-12-01 17:44 ` More questions " Joel Soete
@ 2004-12-01 17:56 ` Matthew Wilcox
2004-12-01 18:33 ` Joel Soete
2004-12-03 10:24 ` Joel Soete
1 sibling, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2004-12-01 17:56 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
On Wed, Dec 01, 2004 at 06:44:44PM +0100, Joel Soete wrote:
> I doubt that's enough because I don't yet find how r0 is set?
r0 is magic on PA. Writes are discarded, reads return 0. It's the
hardware equivalent of /dev/zero ;-)
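For anyone who prefers C to prose, the r0 behaviour can be modelled like this. It is a toy sketch, not how any emulator or the kernel represents registers: struct toy_regs, toy_read and toy_write are made-up names, and the 64-bit register width assumes PA 2.0.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NREGS 32    /* general registers %r0..%r31 */

struct toy_regs {
    uint64_t gr[TOY_NREGS];
};

/* Reading %r0 always yields zero, whatever was "written" to it. */
static uint64_t toy_read(const struct toy_regs *r, unsigned n)
{
    assert(n < TOY_NREGS);
    return n == 0 ? 0 : r->gr[n];
}

/* Writes to %r0 are silently discarded. */
static void toy_write(struct toy_regs *r, unsigned n, uint64_t v)
{
    assert(n < TOY_NREGS);
    if (n != 0)
        r->gr[n] = v;
}
```

So a store of %r0 always writes zeroes, with no instruction needed to set it up first.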
* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-12-01 17:56 ` Matthew Wilcox
@ 2004-12-01 18:33 ` Joel Soete
0 siblings, 0 replies; 25+ messages in thread
From: Joel Soete @ 2004-12-01 18:33 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
>
> On Wed, Dec 01, 2004 at 06:44:44PM +0100, Joel Soete wrote:
> > I doubt that's enough because I don't yet find how r0 is set?
>
> r0 is magic on PA. Writes are discarded, reads return 0. It's the
> hardware equivalent of /dev/zero ;-)
>
Cool that make the stuff so ?
And iirc registers are 64bit wide on pa2.0?
Thanks,
Joel
* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-12-01 17:44 ` More questions " Joel Soete
2004-12-01 17:56 ` Matthew Wilcox
@ 2004-12-03 10:24 ` Joel Soete
2004-12-03 15:41 ` Randolph Chung
1 sibling, 1 reply; 25+ messages in thread
From: Joel Soete @ 2004-12-03 10:24 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
Hello all,
I need more advice:
Joel Soete wrote:
> [...]
>
>>but looking at the other ones:
>>- __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
>> at a time instead of 4
>
> Is it the idea (principle only):
> --- arch/parisc/kernel/pacache.S.Orig 2004-12-01 17:09:53.000000000 +0100
> +++ arch/parisc/kernel/pacache.S-t1 2004-12-01 17:58:16.000000000 +0100
> @@ -505,6 +505,16 @@
> ldi 64,%r1
>
> 1:
> +#ifdef __LP64__
> + std %r0,0(%r28)
> + std %r0,8(%r28)
> + std %r0,16(%r28)
> + std %r0,24(%r28)
> + std %r0,32(%r28)
> + std %r0,40(%r28)
> + std %r0,48(%r28)
> + std %r0,56(%r28)
> +#else
> stw %r0,0(%r28)
> stw %r0,4(%r28)
> stw %r0,8(%r28)
> @@ -521,6 +531,7 @@
> stw %r0,52(%r28)
> stw %r0,56(%r28)
> stw %r0,60(%r28)
> +#endif
> ADDIB> -1,%r1,1b
> ldo 64(%r28),%r28
>
I tested it on my b2k with a 64-bit kernel and it works, but it doesn't seem to bring any benefit.
As far as I can trust readprofile, the ratio between the first and last columns is always the same (1/80 in this case).
What did I miss?
I had hoped for a gain of at least a factor of two, since I reduced the number of instructions by that ratio.
My idea now is to write a small test case to see whether keeping the same number of instructions but using fewer loop iterations helps or not.
Thanks again for any further advice,
Joel
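
For readers who do not read PA-RISC assembly, here is a rough C sketch of what the patch changes. It is an illustration only, not the kernel code: the function names, the 4096-byte page size and the 64-byte chunk per iteration are assumptions taken from the "ldi 64,%r1" loop count and the store offsets in the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES  4096u  /* assumed: 64 iterations of 64 bytes */
#define CHUNK_BYTES 64u    /* bytes cleared per loop iteration */

/* 32-bit variant: sixteen 4-byte stores per chunk (the stw %r0 path). */
static inline void clear_page_32(uint32_t *page)
{
    for (unsigned n = PAGE_BYTES / CHUNK_BYTES; n > 0; n--) {
        for (size_t i = 0; i < CHUNK_BYTES / sizeof(uint32_t); i++)
            page[i] = 0;
        page += CHUNK_BYTES / sizeof(uint32_t);
    }
}

/* 64-bit variant: eight 8-byte stores per chunk (the std %r0 path).
 * Same number of loop iterations, half as many stores in each one. */
static inline void clear_page_64(uint64_t *page)
{
    for (unsigned n = PAGE_BYTES / CHUNK_BYTES; n > 0; n--) {
        page[0] = 0;
        page[1] = 0;
        page[2] = 0;
        page[3] = 0;
        page[4] = 0;
        page[5] = 0;
        page[6] = 0;
        page[7] = 0;
        page += CHUNK_BYTES / sizeof(uint64_t);
    }
}
```

A C compiler may well generate similar code for both functions, so this only shows the shape of the two loops; measuring the difference needs the assembly versions and a workload that clears many pages.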
^ permalink raw reply [flat|nested] 25+ messages in thread
* *lcul and memory granularity question[Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-11-11 8:11 ` Randolph Chung
` (4 preceding siblings ...)
2004-12-01 17:44 ` More questions " Joel Soete
@ 2004-12-03 15:00 ` Joel Soete
2004-12-03 15:13 ` Matthew Wilcox
5 siblings, 1 reply; 25+ messages in thread
From: Joel Soete @ 2004-12-03 15:00 UTC (permalink / raw)
Cc: parisc-linux
Hello all,
Randolph Chung wrote:
> - *lclu* can be rewritten to do better than 1-byte at a time
I have an additional question about parisc alignment and this remark:
a char variable is 1-byte aligned, but what about a struct of 3, 5, 7 or more bytes?
My idea is that a 3-byte struct could be aligned as a 32-bit word, so clearing it could be done by clearing the whole word;
the same for 5 and 7 bytes if aligned as 2*32 bits, and so on up to an unrolled loop covering the maximum cache line size (128 bytes IIRC),
perhaps using a case define as we do for __put/get__user/kernel_asm?
Or is memory management more complex than I imagine, so that I really have to treat 3 bytes as 2+1 bytes (5 = 2*2+1, ...)?
Thanks a lot,
Joel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: *lcul and memory granularity question[Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-12-03 15:00 ` *lcul and memory granularity question[Was: " Joel Soete
@ 2004-12-03 15:13 ` Matthew Wilcox
0 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2004-12-03 15:13 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
On Fri, Dec 03, 2004 at 03:00:33PM +0000, Joel Soete wrote:
> >- *lclu* can be rewritten to do better than 1-byte at a time
>
> I have an additional question about parisc alignment and this remark:
> a char type var is 1byte align; ... but what's about a 3, 5, 7 and more
> bytes struct size?
Don't think in terms of structs, think in terms of an arbitrary array of
bytes. You can't assume anything about the alignment of lclu or the size.
In particular, you can't write more than the requested number of bytes.
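
One way to read these constraints, sketched in portable C rather than PA-RISC assembly: clear single bytes until the pointer is word aligned, clear whole words while at least one full word remains, then finish byte by byte. The helper name clear_bytes is hypothetical, and the sketch ignores the fault handling a real user-space clearing routine would need.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero exactly `len` bytes starting at `dst`.
 * No alignment or size is assumed, and nothing past dst+len is written. */
static inline void clear_bytes(void *dst, size_t len)
{
    unsigned char *p = dst;
    const uint32_t zero = 0;

    /* Head: single-byte stores until p is 4-byte aligned or len runs out. */
    while (len > 0 && ((uintptr_t)p % sizeof(uint32_t)) != 0) {
        *p++ = 0;
        len--;
    }

    /* Body: one aligned 4-byte store per step while a whole word remains.
     * memcpy of a fixed size keeps this well defined in C; compilers
     * typically turn it into a single word store. */
    while (len >= sizeof(uint32_t)) {
        memcpy(p, &zero, sizeof(zero));
        p += sizeof(uint32_t);
        len -= sizeof(uint32_t);
    }

    /* Tail: the remaining 0 to 3 bytes, one at a time. */
    while (len > 0) {
        *p++ = 0;
        len--;
    }
}
```

With this shape a 3-byte request is never rounded up to a full word, which is the case the question above was asking about.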
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-12-03 10:24 ` Joel Soete
@ 2004-12-03 15:41 ` Randolph Chung
2004-12-07 14:42 ` Joel Soete
0 siblings, 1 reply; 25+ messages in thread
From: Randolph Chung @ 2004-12-03 15:41 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
> I test it on my b2k with 64bit kernel and it works but it didn't seems to
> bring me any benefit?
> As far as I can believe readprofile: the ratio between the first and last
> column is always the same (1/80 in this case).
> What did I miss?
what workload did you test this on? possibly you only see a benefit with
workloads that need to do a lot of page clearings.... so you probably
want to find a workload so that the first column is >>1
randolph
--
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
2004-12-03 15:41 ` Randolph Chung
@ 2004-12-07 14:42 ` Joel Soete
0 siblings, 0 replies; 25+ messages in thread
From: Joel Soete @ 2004-12-07 14:42 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
Hello Randolph,
> > I test it on my b2k with 64bit kernel and it works but it didn't seems
> > to bring me any benefit?
> > As far as I can believe readprofile: the ratio between the first and
> > last column is always the same (1/80 in this case).
> > What did I miss?
>
> what workload did you test this on? possibly you only see a benefit with
> workloads that need to do a lot of page clearings.... so you probably
> want to find a workload so that the first column is >>1
>
I redid the following test with 2.6.10-rc3-pa2 on this b2k with a 64-bit
kernel:
readprofile -r ; make V=1 vmlinux 2>&1 | tee /var/logs/k-2.6.10-rc3-pa2-b2k64 ; readprofile > /var/logs/prof2b-2.6.10-rc3-pa2-b2k64-3
I first built the kernel from cvs and rebooted into it, which gave the following result:
/var/logs/prof2b-2.6.10-rc3-pa2-b2k64-3: 15500 __clear_user_page_asm 138.3929 (i.e. 1/112)
I then applied the previously mentioned lclu patch, rebuilt and rebooted, and built this kernel a 4th time, which gave:
/var/logs/prof2b-2.6.10-rc3-pa2-b2k64-4: 12609 __clear_user_page_asm 112.5804 (i.e. 1/112)
Interesting: the ratio stays constant between tests, but the number of clock ticks
was clearly reduced (so I presume a potential benefit, even though small ;-)
hth,
Joel
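
The arithmetic behind the two readprofile lines above can be checked with a small C sketch. The helper names are hypothetical. It assumes, as the readprofile man page describes it, that the third column is the tick count divided by the length of the function, which would explain why the ratio between the columns does not change from one run to the next.

```c
/* Hypothetical helpers for comparing two readprofile lines. */

/* First column divided by third column. If the third column is ticks
 * divided by function length, this is just the function length in
 * profiling units, so it stays the same for the same code. */
static inline double ticks_per_load(double ticks, double load)
{
    return ticks / load;
}

/* Relative drop in clock ticks between two runs, in percent. */
static inline double percent_reduction(double before, double after)
{
    return 100.0 * (before - after) / before;
}
```

With the figures quoted above, ticks_per_load(15500, 138.3929) and ticks_per_load(12609, 112.5804) both come out at about 112, while percent_reduction(15500, 12609) is roughly 18.7. The tick count in the first column is therefore the number to compare between runs, not the ratio.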
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2004-12-07 14:42 UTC | newest]
Thread overview: 25+ messages
2004-11-11 7:54 [parisc-linux] 2.6.10-rc1-pa11 profile data Grant Grundler
2004-11-11 8:11 ` Randolph Chung
2004-11-11 17:39 ` Carlos O'Donell
2004-11-11 17:42 ` Randolph Chung
2004-11-11 17:50 ` Matthew Wilcox
2004-11-11 17:59 ` Randolph Chung
2004-11-11 18:36 ` Grant Grundler
2004-11-11 18:23 ` Joel Soete
2004-11-11 18:51 ` Randolph Chung
2004-11-26 16:59 ` flush_kernel_[di]cache_page question? [WAS: " Joel Soete
2004-11-26 17:13 ` Randolph Chung
2004-11-26 19:02 ` Grant Grundler
2004-11-28 21:01 ` [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data] Joel Soete
2004-11-28 21:13 ` Matthew Wilcox
2004-11-29 1:14 ` Michael S. Zick
2004-11-29 2:00 ` Matthew Wilcox
2004-12-01 17:44 ` More questions " Joel Soete
2004-12-01 17:56 ` Matthew Wilcox
2004-12-01 18:33 ` Joel Soete
2004-12-03 10:24 ` Joel Soete
2004-12-03 15:41 ` Randolph Chung
2004-12-07 14:42 ` Joel Soete
2004-12-03 15:00 ` *lcul and memory granularity question[Was: " Joel Soete
2004-12-03 15:13 ` Matthew Wilcox
2004-11-12 5:29 ` [parisc-linux] 2.6.10-rc1-pa11 profile data Grant Grundler