From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <396AF1B8.6FB1401C@agelectronics.co.uk>
Date: Tue, 11 Jul 2000 11:06:48 +0100
From: Adrian Cox
MIME-Version: 1.0
To: Dan Malek
CC: linuxppc-dev
Subject: Re: Help with string.S
References: <3967B1E3.80CAC746@embeddededge.com> <396969E1.A7256E4A@lightning.ch> <396A5162.411F49EF@embeddededge.com>
Content-Type: text/plain; charset=us-ascii
Sender: owner-linuxppc-dev@lists.linuxppc.org
List-Id:

Dan Malek wrote:
> > What gives me trouble is the fact that dcbz instruction in function
> > arch/ppc/lib/string.S:__copy_tofrom_user does not seem to work for me.
>
> These are becoming a pain in the ass instructions.  Has anyone ever
> done some performance analysis to see what we really gain here in
> real life?  Sure, locally and logically you can make an intuitive
> argument, but we are sure fetching lots of instructions just to get
> this aligned, and further to actually move the data.

The 7xx(x) processors don't have the alignment handler set up to cover
this problem in 2.2, so they just get an oops when somebody writes to
uncached memory, like a framebuffer device.

This could probably be solved by starting the function with a test of
the address, and using a version without cache operations for target
addresses above the kernel image of memory.  Or by removing the cache
operations.  Even if they stay, could they be a compilation time
optimisation for particular processors?

> You know, we could make this even faster by using the Altivec and the
> new cache streaming modes on the 7400 processors :-).  I've tested this
> in applications.  It really works.

The 7400 certainly doesn't need the dcbz, as it will perform an implicit
allocation if the entire cache line is written by store instructions.

- Adrian Cox, AG Electronics
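
As a rough illustration of the two ideas above (test the target address
first, and make the cache operations a compile-time option), here is a
minimal sketch in portable C rather than the real PowerPC assembly.
Everything in it is hypothetical: copy_checked, the cacheable_start and
cacheable_end bounds, the COPY_USE_LINE_ZERO switch and the 32-byte line
size are assumptions for the example, and memset only stands in for
dcbz. It is not the code in arch/ppc/lib/string.S.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 32-byte cache lines. */
#define CACHE_LINE 32u

/* Compile-time switch: build with -DCOPY_USE_LINE_ZERO=0 to leave the
 * cache-operation path out for processors that do not benefit from it. */
#ifndef COPY_USE_LINE_ZERO
#define COPY_USE_LINE_ZERO 1
#endif

/* Hypothetical bounds of cacheable memory, set by the caller. */
static uintptr_t cacheable_start, cacheable_end;

/* Records which path the last copy took (for illustration only). */
static int last_copy_used_line_zero;

/* Portable stand-in for dcbz: zero a whole destination line. */
static void zero_line(unsigned char *line)
{
    memset(line, 0, CACHE_LINE);
}

/* True only if [dst, dst + n) lies wholly inside the cacheable range. */
static int dst_is_cacheable(const void *dst, size_t n)
{
    uintptr_t a = (uintptr_t)dst;
    return a >= cacheable_start && a < cacheable_end &&
           n <= cacheable_end - a;
}

/* Plain byte copy with no cache operations; usable on uncached targets. */
static void copy_plain(unsigned char *d, const unsigned char *s, size_t n)
{
    while (n--)
        *d++ = *s++;
}

/* Copy that zeroes each fully covered, aligned destination line first. */
static void copy_line_zero(unsigned char *d, const unsigned char *s, size_t n)
{
    /* Bytes needed to bring the destination up to a line boundary. */
    size_t head = (CACHE_LINE - ((uintptr_t)d % CACHE_LINE)) % CACHE_LINE;
    if (head > n)
        head = n;
    copy_plain(d, s, head);
    d += head; s += head; n -= head;

    while (n >= CACHE_LINE) {
        zero_line(d);               /* where dcbz would go */
        memcpy(d, s, CACHE_LINE);
        d += CACHE_LINE; s += CACHE_LINE; n -= CACHE_LINE;
    }
    copy_plain(d, s, n);            /* trailing partial line */
}

/* Test the target address first, then pick the copy routine. */
void *copy_checked(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    last_copy_used_line_zero = COPY_USE_LINE_ZERO && dst_is_cacheable(dst, n);
    if (last_copy_used_line_zero)
        copy_line_zero(dst, src, n);
    else
        copy_plain(dst, src, n);
    return dst;
}
```

The range check costs a couple of compares per call, which is the price
of keeping the cache operations for ordinary memory while still being
safe for something like a framebuffer. Whether that beats simply
removing the cache operations would need measuring on real hardware.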