* [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Grant Grundler @ 2004-11-11 7:54 UTC
To: parisc-linux

I was comparing "time" output for various flavors of kernels and arches.
We are something like 11m/17m/5m for real/user/sys on a dual J6700
(dual 750MHz) running a 2.6.10-rc1-pa11-32SMP kernel and building a
64-bit kernel (using gcc 3.0.4). Similar numbers for a J6000 (dual 550MHz)
doing a 32-bit kernel build (gcc 3.3.x): 14m/22m/5m.

While this might look very favorable next to a similar full kernel build
on a 1.5GHz RX2600, which takes about as long (11m/20m/1m), the ia64
machine spends less than 1m in the kernel.

I've collected two profiles for -64SMP and will collect some UP profiles
tomorrow. Profiles so far are measuring a full kernel build. I expect
I'll do the same for -64UP kernels too.

What I have so far is on:
    http://www.parisc-linux.org/~grundler/prof-j6700/

d- and i-cache flushing routines are still the top consumers.

hth,
grant
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Randolph Chung @ 2004-11-11 8:11 UTC
To: Grant Grundler
Cc: parisc-linux

> I've collect two profiles for -64SMP and will collect
> some UP profiles tomorrow. profiles so far are measuring
> a full kernel build. I expect I'll do the same for -64UP
> kernels too.

hmm.. interesting. top consumers are (with idle loop functions removed)

     40646 flush_kernel_icache_page                 406.4600
      7364 fdsync                                   368.2000
     10567 flush_user_dcache_range_asm              293.5278
     10387 flush_user_icache_range_asm              288.5278
     21409 __clear_user_page_asm                    191.1518
      5356 _spin_lock_irqsave                       111.5833
      1768 fisync                                   110.5000
      1928 _spin_lock                                48.2000
      4255 purge_kernel_dcache_page                  42.5500
       339 $lclu_done                                42.3750
      4089 flush_kernel_dcache_page                  40.8900
      5053 copy_user_page_asm                        33.2434
       569 _write_unlock_irq                         17.7812
       422 _spin_unlock                              17.5833
      1567 find_vma_prev                             16.3229
       181 $lslen_loop                               15.0833
        96 $lslen_done                               12.0000
       996 _write_trylock                            11.3182
       137 $lsfu_loop                                 8.5625
       748 flush_user_dcache_page                     7.4800

we really need to do better at cache flushing..... anybody have any
ideas? :)

but looking at the other ones:
- __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
  at a time instead of 4
- _spin_lock* needs investigation to see if we have some bad locks
  someplace. lockmeter anybody?
- *lclu* can be rewritten to do better than 1-byte at a time
- copy_user_page_asm can be sped up slightly by using pa_memcpy, but not
  much when i tried last time
- *lslen* can also probably be written in a smarter way...

i suspect some areas for further investigation are:
- can we do tlb_flush_mm() in a smarter way for SMP?
- can we improve kernel entry time for interrupts (and syscalls) by
  being smarter about what we save on the stack? (i.e. only callee-save
  registers and not all the registers?)

volunteers? :)

randolph
-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Carlos O'Donell @ 2004-11-11 17:39 UTC
To: Randolph Chung
Cc: parisc-linux

> - can we improve kernel entry time for interrupts (and syscalls) by
>   being smarter about what we save on the stack? (i.e. only callee-save
>   registers and not all the registers?)
>
> volunteers? :)

I have been stewing over the following:

Leave the existing syscall save-everything code in place.

Create a branch in front of the syscall save-everything code that
branches on the value of "enable_fast_syscall".

The variable is set via some mechanism. What's the currently accepted
way? /proc twiddle?

The branched code path contains the fast callee-save-only register save.

Allow a compile-time option to switch kernel syscalls to the 'fast'
function call ABI method for people that know they are installing on
a recent glibc.

That's much later on my todo list, but because sometimes I get
frustrated with binutils I go to work on other things for a break :)

Cheers,
Carlos.

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Randolph Chung @ 2004-11-11 17:42 UTC
To: Carlos O'Donell
Cc: parisc-linux

> I have been stewing over the following:
>
> Leave the existing syscall save everything code in place.
>
> Create a branch infront of the syscall save everything code that
> branches on the value of "enable_fast_syscall"

eh? nononono. we should *always* be able to only preserve callee-saved
registers. From the application point of view, when they call e.g.
read(), it is a function call. The app should not expect any
caller-saved registers to be preserved across the function/system call.

randolph

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Matthew Wilcox @ 2004-11-11 17:50 UTC
To: Randolph Chung
Cc: parisc-linux

On Thu, Nov 11, 2004 at 09:42:58AM -0800, Randolph Chung wrote:
> eh? nononono. we should *always* be able to only preserve callee-saved
> registers. From the application point of view, when they call e.g.
> read(), it is a function call. The app should not expect any
> caller-saved registers to be preserved across the function/system call.

As I'm sure you already know, we do have to be careful to avoid leaking
kernel-internal or another task's information in the registers that
are call-clobbered. I know some architectures do this by having a
kernel exit path that deliberately clobbers as many registers as possible.

-- 
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Randolph Chung @ 2004-11-11 17:59 UTC
To: Matthew Wilcox
Cc: parisc-linux

> As I'm sure you already know, we do have to be careful to avoid leaking
> kernel-internal or another task's information in the registers that
> are call-clobbered. I know some architectures do this by having a
> kernel exit path that deliberately clobbers as many registers as possible.

sure, we can zero all the call-clobbered registers on exit. But not
having to save all of those pesky floating point registers and half a
dozen general registers should still be a huge win.

randolph

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Grant Grundler @ 2004-11-11 18:36 UTC
To: Randolph Chung
Cc: parisc-linux

On Thu, Nov 11, 2004 at 09:59:33AM -0800, Randolph Chung wrote:
> sure, we can zero all the call clobbered registers on exit. But not
> having to save all of those pesky floating pointer registers and half a
> dozen general registers should still be a huge win.

Randolph and I talked about this more privately.
In a nutshell, "huge win" is slightly overstating it, and we agree
that fixing the cache utilization would be a much bigger win.

Randolph thinks we can save 20 loads and stores per interrupt and
potential context switch. The thinking is that we are saving/restoring
some registers twice and should split the save/restore between the
interrupt/trap code and the context switch code. So if no context switch
is performed, we only save/restore a subset of the registers manually
and the rest are preserved according to the ABI.

Did I get that right?

thanks,
grant

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Joel Soete @ 2004-11-11 18:23 UTC
To: Randolph Chung
Cc: parisc-linux

Randolph Chung wrote:
>>I've collect two profiles for -64SMP and will collect
>>some UP profiles tomorrow. profiles so far are measuring
>>a full kernel build. I expect I'll do the same for -64UP
>>kernels too.
>
> hmm.. interesting. top consumers are (with idle loop functions removed)
>
>  40646 flush_kernel_icache_page                 406.4600
>   7364 fdsync                                   368.2000
>  10567 flush_user_dcache_range_asm              293.5278
>  10387 flush_user_icache_range_asm              288.5278

mmm (maybe another stupid remark, but) I noticed that:

       748 flush_user_dcache_page                     7.4800
       648 flush_user_icache_page                     6.4800
      4255 purge_kernel_dcache_page                  42.5500
     10567 flush_user_dcache_range_asm              293.5278
     10387 flush_user_icache_range_asm              288.5278
     40646 flush_kernel_icache_page                 406.4600
        10 flush_kernel_icache_range_asm              0.0862

i.e. flush_kernel_[di]cache_page is used far more than
flush_kernel_[di]cache_range_asm, while flush_user_[di]cache_range_asm is
used more than flush_user_[di]cache_page.

Isn't it strange?

[...]

mmm also:

     49576 machine_restart                          774.6250

?? (I don't understand, because the stats were cleared with
"readprofile -r" before the build)

> we really need to do better at cache flushing..... anybody have any
> ideas? :)
>
> but looking at the other ones:
> - __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
>   at a time instead of 4
> - _spin_lock* needs investigation to see if we have some bad locks
>   someplace. lockmeter anybody?
> - *lclu* can be rewritten to do better than 1-byte at a time
> - copy_user_page_asm can be sped up slightly by using pa_memcpy, but not
>   much when i tried last time
> - *lslen* can also probably be written in a smarter way...
>
> i suspect some areas for further investigation are:
> - can we do tlb_flush_mm() in a smarter way for SMP?
> - can we improve kernel entry time for interrupts (and syscalls) by
>   being smarter about what we save on the stack? (i.e. only callee-save
>   registers and not all the registers?)
>
> volunteers? :)

I can't really help more, but I will take a more detailed look from
time to time :)

Thanks,
Joel

* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Randolph Chung @ 2004-11-11 18:51 UTC
To: Joel Soete
Cc: parisc-linux

> i.e. flush_kernel_[di]cache_page is used far more than
> flush_kernel_[di]cache_range_asm, while flush_user_[di]cache_range_asm is
> used more than flush_user_[di]cache_page.
>
> Isn't it strange?

could it be that kernel mappings tend to be bigger and user mappings
tend to be smaller? i'm only guessing here...

> mmm also:
>  49576 machine_restart                          774.6250

this is an artifact of the way the measurements are done. these are
actually calls to cpu_idle().

randolph

* flush_kernel_[di]cache_page question? [WAS: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Joel Soete @ 2004-11-26 16:59 UTC
To: Randolph Chung, Grant Grundler
Cc: parisc-linux

Hello all,

> hmm.. interesting. top consumers are (with idle loop functions removed)
>
>  40646 flush_kernel_icache_page                 406.4600
[...]
>   4089 flush_kernel_dcache_page                  40.8900
[...]
> we really need to do better at cache flushing..... anybody have any
> ideas? :)

Can somebody help me understand these:

[...]
flush_kernel_dcache_page:
        .proc
        .callinfo NO_CALLS
        .entry

        ldil    L%dcache_stride,%r1
        ldw     R%dcache_stride(%r1),%r23

#ifdef __LP64__
        depdi,z 1,63-PAGE_SHIFT,1,%r25
#else
        depwi,z 1,31-PAGE_SHIFT,1,%r25
#endif
        add     %r26,%r25,%r25
        sub     %r25,%r23,%r25

1:      fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        fdc,m   %r23(%r26)
        CMPB<<  %r26,%r25,1b
        fdc,m   %r23(%r26)

        sync
        bv      %r0(%r2)
        nop
        .exit

        .procend

        .export flush_user_dcache_page
[...]
flush_kernel_icache_page:
        .proc
        .callinfo NO_CALLS
        .entry

        ldil    L%icache_stride,%r1
        ldw     R%icache_stride(%r1),%r23

#ifdef __LP64__
        depdi,z 1,63-PAGE_SHIFT,1,%r25
#else
        depwi,z 1,31-PAGE_SHIFT,1,%r25
#endif
        add     %r26,%r25,%r25
        sub     %r25,%r23,%r25

1:      fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        fic,m   %r23(%r26)
        CMPB<<  %r26,%r25,1b
        fic,m   %r23(%r26)

        sync
        bv      %r0(%r2)
        nop
        .exit

        .procend
[...]

I tried googling the p-l m-l but I couldn't find out why there are those
chains of 15 f[di]c,m?

Thanks in advance for your attention,
Joel

* Re: flush_kernel_[di]cache_page question? [WAS: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Randolph Chung @ 2004-11-26 17:13 UTC
To: Joel Soete
Cc: parisc-linux

> I tried googling the p-l m-l but I couldn't find out why there are those
> chains of 15 f[di]c,m?

16, actually (including the one in the delay slot of the CMPB<<). ,m is
post-increment, so we are just doing an unrolled loop of

    fdc 0(address)
    address = address + cache_stride

randolph
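
The unrolled loop described above can be modelled in C to check the count. This is only a sketch: count_page_flushes, FLUSH_UNROLL and the page_size/stride parameters are hypothetical names for illustration, not kernel code, and the model assumes the page size is a whole multiple of 16 strides.

```c
#include <assert.h>
#include <stddef.h>

/* 15 flushes in the loop body plus 1 in the branch delay slot. */
#define FLUSH_UNROLL 16

/*
 * Toy model of the unrolled flush loop, not kernel code.
 * page_size and stride are illustrative parameters; in the real routine
 * the stride is loaded from dcache_stride or icache_stride.
 * Returns the number of fdc/fic instructions issued for one page.
 */
size_t count_page_flushes(size_t page_size, size_t stride)
{
    size_t addr = 0;     /* offset into the page, stands in for %r26 */
    size_t last;         /* last line offset, stands in for %r25 */
    size_t flushes = 0;
    int i, again;

    /* The model assumes a whole number of unrolled passes per page. */
    assert(stride > 0 && page_size > 0);
    assert(page_size % (FLUSH_UNROLL * stride) == 0);
    last = page_size - stride;

    do {
        for (i = 0; i < FLUSH_UNROLL - 1; i++) {
            flushes++;       /* flush the line at addr ... */
            addr += stride;  /* ... then post-increment the base */
        }
        again = addr < last; /* CMPB<< decides before the delay slot runs */
        flushes++;           /* the delay-slot flush executes either way */
        addr += stride;
    } while (again);

    return flushes;
}
```

With a 4096-byte page and a 64-byte stride the model issues 64 flushes in four passes, i.e. exactly one flush per cache line, which is what the 16-way unrolling is meant to achieve.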

* Re: flush_kernel_[di]cache_page question? [WAS: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Grant Grundler @ 2004-11-26 19:02 UTC
To: Joel Soete
Cc: parisc-linux

On Fri, Nov 26, 2004 at 05:59:18PM +0100, Joel Soete wrote:
> >  40646 flush_kernel_icache_page                 406.4600
> [...]
> >   4089 flush_kernel_dcache_page                  40.8900
> [...]
> > we really need to do better at cache flushing..... anybody have any
> > ideas? :)
> >
> Is somebody can help me to understand those:
> [...]
> flush_kernel_dcache_page:

Joel,
the problem is not in the flush_kernel_dcache_page() routine.
The problem is that we are calling it too often.

grant

* [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Joel Soete @ 2004-11-28 21:01 UTC
To: Randolph Chung
Cc: parisc-linux

Hello all,

Randolph Chung wrote:
>>I've collect two profiles for -64SMP and will collect
>>some UP profiles tomorrow. profiles so far are measuring
>>a full kernel build. I expect I'll do the same for -64UP
>>kernels too.
>
> hmm.. interesting. top consumers are (with idle loop functions removed)
>
>  40646 flush_kernel_icache_page                 406.4600
>   7364 fdsync                                   368.2000
>  10567 flush_user_dcache_range_asm              293.5278
>  10387 flush_user_icache_range_asm              288.5278

I have an additional question about these functions:

  * on parisc, does ..._dcache_... above really refer to the data cache?
  * and, respectively, does ..._icache_... refer to the instruction cache?

Do they have a different meaning in generic Linux?

My confusion came from:

include/asm-parisc/cacheflush.h:
[...]
#define flush_icache_page(vma,page) do { \
        flush_kernel_dcache_page(page_address(page)); \
        flush_kernel_icache_page(page_address(page)); } while (0)
[...]

Thanks again,
Joel

PS: I don't suspect any error, I am just confused :(

* Re: [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Matthew Wilcox @ 2004-11-28 21:13 UTC
To: Joel Soete
Cc: parisc-linux

On Sun, Nov 28, 2004 at 09:01:42PM +0000, Joel Soete wrote:
> I have an additional question about these functions:
>
>   * on parisc, does ..._dcache_... above really refer to the data cache?
>   * and, respectively, does ..._icache_... refer to the instruction cache?
>
> Do they have a different meaning in generic Linux?

No, your understanding is correct (see Documentation/cachetlb.txt)

> My confusion came from:
>
> include/asm-parisc/cacheflush.h:
> [...]
> #define flush_icache_page(vma,page) do { \
>         flush_kernel_dcache_page(page_address(page)); \
>         flush_kernel_icache_page(page_address(page)); } while (0)
> [...]

I see why this confuses you. PA-RISC has writeback data caches that
are non-coherent with the instruction cache. So it's not enough to
just flush the icache; if the page has been modified, we need to force
the data in the dcache back to RAM, then remove any existing cached
instructions for that page. Then instruction accesses to that page will
fetch the correct data from memory and everything will work.

Many other architectures have writethrough data caches. They don't need
to flush the dcache.
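
Why the dcache flush has to come first can be seen in a toy model. The sketch below is not how the hardware or the kernel is implemented: struct toy_system and every toy_* function are made-up names, and each cache is reduced to a single line. It only illustrates the ordering argument for a write-back dcache that is not coherent with the icache.

```c
/* One cache line of the toy model. */
struct toy_line {
    int valid;
    int dirty;
    unsigned value;
};

/* A single memory word with a separate write-back dcache and an icache. */
struct toy_system {
    unsigned ram;
    struct toy_line dcache;
    struct toy_line icache;
};

/* A store lands in the write-back dcache only; RAM is not updated yet. */
static void toy_store(struct toy_system *s, unsigned v)
{
    s->dcache.valid = 1;
    s->dcache.dirty = 1;
    s->dcache.value = v;
}

/* An instruction fetch hits the icache or refills it from RAM. */
static unsigned toy_ifetch(struct toy_system *s)
{
    if (!s->icache.valid) {
        s->icache.value = s->ram;
        s->icache.valid = 1;
    }
    return s->icache.value;
}

/* Flushing the dcache writes dirty data back to RAM and drops the line. */
static void toy_flush_dcache(struct toy_system *s)
{
    if (s->dcache.valid && s->dcache.dirty)
        s->ram = s->dcache.value;
    s->dcache.valid = 0;
    s->dcache.dirty = 0;
}

/* The icache is read-only, so flushing it just invalidates the line. */
static void toy_flush_icache(struct toy_system *s)
{
    s->icache.valid = 0;
}

/* Invalidate only the icache: the refill still sees the old RAM contents. */
unsigned toy_fetch_after_icache_flush_only(unsigned old_insn, unsigned new_insn)
{
    struct toy_system s = { old_insn, { 0, 0, 0 }, { 0, 0, 0 } };

    toy_ifetch(&s);            /* icache now holds old_insn */
    toy_store(&s, new_insn);   /* new code sits dirty in the dcache */
    toy_flush_icache(&s);
    return toy_ifetch(&s);
}

/* Write back the dcache first, then invalidate the icache. */
unsigned toy_fetch_after_both_flushes(unsigned old_insn, unsigned new_insn)
{
    struct toy_system s = { old_insn, { 0, 0, 0 }, { 0, 0, 0 } };

    toy_ifetch(&s);
    toy_store(&s, new_insn);
    toy_flush_dcache(&s);
    toy_flush_icache(&s);
    return toy_ifetch(&s);
}
```

In this model the icache-only variant fetches the stale word, while the dcache-then-icache variant fetches the newly stored one.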

* Re: [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Michael S. Zick @ 2004-11-29 1:14 UTC
To: parisc-linux

On Sun November 28 2004 15:13, Matthew Wilcox wrote:
> On Sun, Nov 28, 2004 at 09:01:42PM +0000, Joel Soete wrote:
> >
> > include/asm-parisc/cacheflush.h:
> > [...]
> > #define flush_icache_page(vma,page) do { \
> >         flush_kernel_dcache_page(page_address(page)); \
> >         flush_kernel_icache_page(page_address(page)); } while (0)
> > [...]
>
> I see why this confuses you. PA-RISC has writeback data caches that
> are non-coherent with the instruction cache. So it's not enough to
> just flush the icache; if the page has been modified, we need to force
> the data in the dcache back to ram, then remove any existing cache for
> instructions in that page. Then instruction accesses to that page will
> fetch the correct data from memory and everything will work.
>
Matt, Joel,

Here is a, perhaps dumb, question from a non-parisc source...

I note Matt's statement:
"...then remove any existing cache for instructions in that page."
Which sounds very reasonable.

Question:
Is the:
> >         flush_kernel_icache_page(page_address(page));
(or the hardware that receives the command)

smart enough to just mark the page 'invalid', or is it
actually an 'absolute update external storage'?

I ask because both flush commands are written the same, BUT...
The first should be an 'absolute update external storage'.
The second should be either just a 'mark invalid' or a
'conditional update external storage'.

Mike

* Re: [id]cache meaning? [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Matthew Wilcox @ 2004-11-29 2:00 UTC
To: Michael S. Zick
Cc: parisc-linux

On Sun, Nov 28, 2004 at 07:14:14PM -0600, Michael S. Zick wrote:
> Question:
> Is the:
> > >         flush_kernel_icache_page(page_address(page));
> (or the hardware that receives the command)
>
> smart enough to just mark the page 'invalid' or is it
> actually an 'absolute update external storage'?

The I-cache is, by definition, read-only, so there's nothing to update
main memory with.

* More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Joel Soete @ 2004-12-01 17:44 UTC
To: Randolph Chung, Grant Grundler
Cc: parisc-linux

[...]
>
> but looking at the other ones:
> - __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
>   at a time instead of 4

Is this the idea (principle only):

--- arch/parisc/kernel/pacache.S.Orig   2004-12-01 17:09:53.000000000 +0100
+++ arch/parisc/kernel/pacache.S-t1     2004-12-01 17:58:16.000000000 +0100
@@ -505,6 +505,16 @@
        ldi     64,%r1

 1:
+#ifdef __LP64__
+       std     %r0,0(%r28)
+       std     %r0,8(%r28)
+       std     %r0,16(%r28)
+       std     %r0,24(%r28)
+       std     %r0,32(%r28)
+       std     %r0,40(%r28)
+       std     %r0,48(%r28)
+       std     %r0,56(%r28)
+#else
        stw     %r0,0(%r28)
        stw     %r0,4(%r28)
        stw     %r0,8(%r28)
@@ -521,6 +531,7 @@
        stw     %r0,52(%r28)
        stw     %r0,56(%r28)
        stw     %r0,60(%r28)
+#endif
        ADDIB>  -1,%r1,1b
        ldo     64(%r28),%r28

I doubt that's enough, because I can't yet find where r0 is set?

[...]

Thanks in advance for your attention,
Joel
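
The arithmetic behind the patch can be sketched in C: both variants clear 64 bytes per loop pass, but the 64-bit one needs half as many stores. This is an illustration only; TOY_PAGE_SIZE, clear_page_32 and clear_page_64 are hypothetical names, and a 4096-byte page is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the illustration. */
#define TOY_PAGE_SIZE 4096u

/* Clear one page with 4-byte stores (the stw variant); returns store count. */
size_t clear_page_32(uint32_t *page)
{
    size_t i, stores = 0;

    for (i = 0; i < TOY_PAGE_SIZE / sizeof(uint32_t); i++) {
        page[i] = 0;
        stores++;
    }
    return stores;
}

/* Clear one page with 8-byte stores (the std variant); returns store count. */
size_t clear_page_64(uint64_t *page)
{
    size_t i, stores = 0;

    for (i = 0; i < TOY_PAGE_SIZE / sizeof(uint64_t); i++) {
        page[i] = 0;
        stores++;
    }
    return stores;
}
```

Halving the store count does not by itself halve the run time: the same number of cache lines still has to be written, so the memory system may well remain the limiting factor.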

* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Matthew Wilcox @ 2004-12-01 17:56 UTC
To: Joel Soete
Cc: parisc-linux

On Wed, Dec 01, 2004 at 06:44:44PM +0100, Joel Soete wrote:
> I doubt that's enough because I don't yet find how r0 is set?

r0 is magic on PA. Writes are discarded, reads return 0. It's the
hardware equivalent of /dev/zero ;-)

* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Joel Soete @ 2004-12-01 18:33 UTC
To: Matthew Wilcox
Cc: parisc-linux

> On Wed, Dec 01, 2004 at 06:44:44PM +0100, Joel Soete wrote:
> > I doubt that's enough because I don't yet find how r0 is set?
>
> r0 is magic on PA. Writes are discarded, reads return 0. It's the
> hardware equivalent of /dev/zero ;-)
>
Cool that make the stuff so ?

And iirc registers are 64 bits wide on pa2.0?

Thanks,
Joel

* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Joel Soete @ 2004-12-03 10:24 UTC
To: Joel Soete
Cc: parisc-linux

Hello all,

I need more advice:

Joel Soete wrote:
> [...]
>
>>but looking at the other ones:
>>- __clear_user_page_asm can be optimized for 64-bit by writing 8 bytes
>>  at a time instead of 4
>
> Is it the idea (principle only):
> --- arch/parisc/kernel/pacache.S.Orig   2004-12-01 17:09:53.000000000 +0100
> +++ arch/parisc/kernel/pacache.S-t1     2004-12-01 17:58:16.000000000 +0100
> @@ -505,6 +505,16 @@
>         ldi     64,%r1
>
>  1:
> +#ifdef __LP64__
> +       std     %r0,0(%r28)
> +       std     %r0,8(%r28)
> +       std     %r0,16(%r28)
> +       std     %r0,24(%r28)
> +       std     %r0,32(%r28)
> +       std     %r0,40(%r28)
> +       std     %r0,48(%r28)
> +       std     %r0,56(%r28)
> +#else
>         stw     %r0,0(%r28)
>         stw     %r0,4(%r28)
>         stw     %r0,8(%r28)
> @@ -521,6 +531,7 @@
>         stw     %r0,52(%r28)
>         stw     %r0,56(%r28)
>         stw     %r0,60(%r28)
> +#endif
>         ADDIB>  -1,%r1,1b
>         ldo     64(%r28),%r28
>
I tested it on my b2k with a 64-bit kernel and it works, but it doesn't
seem to bring me any benefit.
As far as I can trust readprofile, the ratio between the first and last
column is always the same (1/80 in this case).
What did I miss?

My hope was for a benefit of at least 1/2, as I reduced the number of
insns by the same factor.

Right now my idea is to write a small test case to see whether using the
same number of insns but with fewer loop iterations helps or not.

Thanks again for more advice,
Joel

* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Randolph Chung @ 2004-12-03 15:41 UTC
To: Joel Soete
Cc: parisc-linux

> I tested it on my b2k with a 64-bit kernel and it works, but it doesn't
> seem to bring me any benefit.
> As far as I can trust readprofile, the ratio between the first and last
> column is always the same (1/80 in this case).
> What did I miss?

what workload did you test this on? possibly you only see a benefit with
workloads that need to do a lot of page clearing.... so you probably
want to find a workload where the first column is >>1

randolph
* Re: More questions [Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Joel Soete @ 2004-12-07 14:42 UTC
To: Randolph Chung; +Cc: parisc-linux

Hello Randolph,

> > I test it on my b2k with 64bit kernel and it works but it didn't seems to
> > bring me any benefit?
> > As far as I can believe readprofile: the ratio between the first and last
> > column is always the same (1/80 in this case).
> > What did I miss?
>
> what workload did you test this on? possibly you only see a benefit with
> workloads that need to do a lot of page clearings.... so you probably
> want to find a workload so that the first column is >>1

I redid the following test with 2.6.10-rc3-pa2 on this b2k with a 64-bit kernel:

    readprofile -r ; make V=1 vmlinux 2>&1 | tee /var/logs/k-2.6.10-rc3-pa2-b2k64 ; readprofile > /var/logs/prof2b-2.6.10-rc3-pa2-b2k64-3

I first built the kernel from cvs and rebooted into it to obtain the following result:

/var/logs/prof2b-2.6.10-rc3-pa2-b2k64-3:
 15500 __clear_user_page_asm                    138.3929 (i.e. 1/112)

I applied the previously mentioned lclu patch, rebuilt and rebooted, then built this kernel a 4th time to obtain:

/var/logs/prof2b-2.6.10-rc3-pa2-b2k64-4:
 12609 __clear_user_page_asm                    112.5804 (i.e. 1/112)

Interesting: the ratio stays constant between tests, but the number of clock ticks was clearly reduced (so I presume a potential benefit, even though small ;-)

hth,
    Joel
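On the constant 1/112 ratio: as far as the readprofile man page describes it, the third column is the tick count divided by the length of the function, so dividing the first column by the third gives back that length and stays the same for as long as the function itself is unchanged. The first column is the one to compare between runs. A small C sketch of that arithmetic with the two samples quoted above; the struct and function names are hypothetical.

```c
#include <stdio.h>

/* One readprofile line: clock ticks (column 1) and normalized load (column 3). */
struct sample {
    unsigned long ticks;
    double load;
};

/* The two __clear_user_page_asm samples reported in the message above. */
static const struct sample unpatched = { 15500, 138.3929 };
static const struct sample patched   = { 12609, 112.5804 };

/* ticks / load: recovers the divisor readprofile used for the load column. */
double ticks_per_load(struct sample s)
{
    return (double)s.ticks / s.load;
}

/* Fraction by which the tick count dropped between two runs. */
double tick_reduction(struct sample before, struct sample after)
{
    return (double)(before.ticks - after.ticks) / (double)before.ticks;
}

/* Print both ratios and the relative change in ticks. */
void print_summary(void)
{
    printf("ticks/load unpatched: %.2f\n", ticks_per_load(unpatched));
    printf("ticks/load patched:   %.2f\n", ticks_per_load(patched));
    printf("tick reduction:       %.1f%%\n",
           100.0 * tick_reduction(unpatched, patched));
}
```

Both ratios come out at about 112, and the tick count drops by roughly 18.7%. That is only a rough comparison, since the two builds need not have taken the same total number of ticks.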
* *lcul and memory granularity question[Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Joel Soete @ 2004-12-03 15:00 UTC
Cc: parisc-linux

Hello all,

Randolph Chung wrote:
> - *lclu* can be rewritten to do better than 1-byte at a time

I have an additional question about parisc alignment and this remark: a char variable is 1-byte aligned, but what about a struct whose size is 3, 5, 7 or more bytes?

My idea is that a 3-byte struct could be aligned as a 32-bit word, so clearing it could be done by clearing the whole word; the same for 5 and 7 bytes if aligned as 2*32 bits, and so on up to an unrolled loop of the max cache size (128 bytes iirc), and btw using a case define like the one we use for __put/get__user/kernel_asm?

Or is the memory management more complex than I imagine, so that I really should treat 3 bytes as 2+1 bytes (5=2*2+1, ...)?

Thanks a lot,
    Joel
* Re: *lcul and memory granularity question[Was: [parisc-linux] 2.6.10-rc1-pa11 profile data]
From: Matthew Wilcox @ 2004-12-03 15:13 UTC
To: Joel Soete; +Cc: parisc-linux

On Fri, Dec 03, 2004 at 03:00:33PM +0000, Joel Soete wrote:
> >- *lclu* can be rewritten to do better than 1-byte at a time
>
> I have an additional question about parisc alignment and this remark:
> a char type var is 1byte align; ... but what's about a 3, 5, 7 and more
> bytes struct size?

Don't think in terms of structs, think in terms of an arbitrary array
of bytes. You can't assume anything about the alignment of lclu or the
size. In particular, you can't write more than the requested number
of bytes.

--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
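The constraint in this reply (any alignment, any size, never write past the requested length) can be sketched in C as a head/body/tail clear: single bytes until the pointer is word-aligned, whole words while enough bytes remain, then single bytes again. This is only an illustration of the principle, not the kernel's lclu routine; `clear_bytes` and `clears_exactly` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zero exactly len bytes at dst, whatever the alignment or length. */
void clear_bytes(void *dst, size_t len)
{
    unsigned char *p = dst;
    const unsigned long zero = 0;

    /* Head: single bytes until p is word-aligned or len runs out. */
    while (len > 0 && ((uintptr_t)p % sizeof(unsigned long)) != 0) {
        *p++ = 0;
        len--;
    }
    /* Body: whole words; a fixed-size memcpy is typically one store. */
    while (len >= sizeof zero) {
        memcpy(p, &zero, sizeof zero);
        p += sizeof zero;
        len -= sizeof zero;
    }
    /* Tail: the remaining bytes (fewer than one word), one at a time. */
    while (len > 0) {
        *p++ = 0;
        len--;
    }
}

/* Return 1 if clearing len bytes at buf+offset zeroes that range and
 * leaves every byte outside it untouched. */
int clears_exactly(size_t offset, size_t len)
{
    unsigned char buf[64];

    assert(offset + len <= sizeof buf);
    memset(buf, 0xAA, sizeof buf);
    clear_bytes(buf + offset, len);
    for (size_t i = 0; i < sizeof buf; i++) {
        unsigned char want = (i >= offset && i < offset + len) ? 0 : 0xAA;
        if (buf[i] != want)
            return 0;
    }
    return 1;
}
```

With this shape a 3-byte request at an odd address clears 3 bytes and nothing else, so the byte that follows is never rounded up into a word store; `clears_exactly(1, 3)` checks exactly that case.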
* Re: [parisc-linux] 2.6.10-rc1-pa11 profile data
From: Grant Grundler @ 2004-11-12 5:29 UTC
To: parisc-linux

On Thu, Nov 11, 2004 at 12:54:31AM -0700, Grant Grundler wrote:
> I've collect two profiles for -64SMP and will collect
> some UP profiles tomorrow.

"tomorrow" finally arrived. :^)

> http://www.parisc-linux.org/~grundler/prof-j6700/

I've added the 64-bit UP profile numbers as promised.
And some of the top consumers look familiar:

root@gggj6k:~# sort -rnk 3 prof-2.6.10-rc1-pa11-64-01.txt
 40150 flush_kernel_icache_page                 401.5000
  6798 fdsync                                   339.9000
 10645 flush_user_dcache_range_asm              295.6944
 10353 flush_user_icache_range_asm              287.5833
 13871 machine_restart                          216.7344
 20839 __clear_user_page_asm                    186.0625
 10478 cpu_idle                                 145.5278
  1380 fisync                                    86.2500
   365 $lclu_done                                45.6250
  3794 purge_kernel_dcache_page                  37.9400
  3535 flush_kernel_dcache_page                  35.3500
  3279 copy_user_page_asm                        21.5724
  1358 find_get_page                             18.8611
   185 $lslen_loop                               15.4167
  1228 find_vma_prev                             12.7917
   101 $lslen_done                               12.6250
   128 $lsfu_loop                                 8.0000
   162 file_ra_state_init                         6.7500
...

enjoy!
grant