* question about altivec registers
@ 1999-10-25 20:51 Jim Terman
  1999-10-25 22:27 ` Claude Robitaille
  1999-10-26  4:42 ` Kumar Gala
  0 siblings, 2 replies; 23+ messages in thread
From: Jim Terman @ 1999-10-25 20:51 UTC
  To: linuxppc-dev

I have a question about using the altivec registers on a G4 Macintosh
running linuxppc. Will there be any conflicts? I don't expect any
kernel support, but can I be confident that the kernel will not touch
any of the altivec registers? Any info will be greatly appreciated.

--
______________________________________________________________________________
Jim Terman      | 323 Vintage Park Dr. | Voice: (650) 356-5446
terman@ddi.com  | Foster City, CA      | Fax:   (650) 356-5490
Diab-SDS, Inc.  | 94404                | web site - http://www.ddi.com

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: question about altivec registers 1999-10-25 20:51 question about altivec registers Jim Terman @ 1999-10-25 22:27 ` Claude Robitaille 1999-10-25 22:31 ` Jim Terman 1999-10-26 4:42 ` Kumar Gala 1 sibling, 1 reply; 23+ messages in thread From: Claude Robitaille @ 1999-10-25 22:27 UTC (permalink / raw) To: Jim Terman; +Cc: linuxppc-dev There is a flag for that purpose. You should look into the Altivec environment on Motorola's Web site, or at www.altivec.org Claude On Mon, 25 Oct 1999, Jim Terman wrote: > > I have a question about using the altivec registers on a G4 Macintosh > running linuxppc. Will there be any conflicts. I don't expect any > kernal support, but can I be confident that the kernal will not touch > any of the altivec registers. Any info will be greatly appreciated. > > -- > ______________________________________________________________________________ > Jim Terman | 323 Vintage Park Dr. | Voice: (650) 356-5446 > terman@ddi.com | Foster City, CA | Fax: (650) 356-5490 > Diab-SDS, Inc. | 94404 | web site - http://www.ddi.com > > ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: question about altivec registers
  1999-10-25 22:27 ` Claude Robitaille
@ 1999-10-25 22:31   ` Jim Terman
  1999-10-25 22:44     ` erik cameron
                       ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Jim Terman @ 1999-10-25 22:31 UTC
  To: Claude Robitaille; +Cc: Jim Terman, linuxppc-dev

I understand how to get it working from the processor point of view. I
just want to make sure that the linuxppc kernel will not interfere if I
compile a program that uses these registers.

The bottom line is that I am not looking for any help with the altivec
registers from the kernel. I just want to be sure that there will not
be any interference. If not, great. I'd just like that assurance.
Thanks for any clarification you can give me.

Claude Robitaille writes:
 > There is a flag for that purpose. You should look into the Altivec
 > environment on Motorola's Web site, or at www.altivec.org
 >
 > Claude
 >
 > On Mon, 25 Oct 1999, Jim Terman wrote:
 >
 > > I have a question about using the altivec registers on a G4 Macintosh
 > > running linuxppc. Will there be any conflicts? I don't expect any
 > > kernel support, but can I be confident that the kernel will not touch
 > > any of the altivec registers? Any info will be greatly appreciated.

--
Jim Terman      | 323 Vintage Park Dr. | Voice: (650) 356-5446
terman@ddi.com  | Foster City, CA      | Fax:   (650) 356-5490
Diab-SDS, Inc.  | 94404                | web site - http://www.ddi.com
* Re: question about altivec registers
  1999-10-25 22:31 ` Jim Terman
@ 1999-10-25 22:44   ` erik cameron
  1999-10-25 23:28   ` Claude Robitaille
       [not found]   ` <Pine.LNX.4.10.9910251916060.5902-100000@modemcable220.93-200-24.mtl.mc.videotron.net>
  2 siblings, 0 replies; 23+ messages in thread
From: erik cameron @ 1999-10-25 22:44 UTC
  To: Jim Terman; +Cc: Claude Robitaille, linuxppc-dev

perhaps this is a stupid question, but aren't the values of the altivec
registers just saved as part of the process' hardware context and
swapped in and out as part of a context switch? in which case it
wouldn't matter if the kernel (or any other process) changed the values
while your job was sleeping.

On Mon, Oct 25, 1999 at 03:31:48PM -0700, Jim Terman wrote:
> I understand how to get it working from the processor point of view. I
> just want to make sure that the linuxppc kernel will not interfere if I
> compile a program that uses these registers.
>
> The bottom line is that I am not looking for any help with the altivec
> registers from the kernel. I just want to be sure that there will not
> be any interference. If not, great. I'd just like that assurance.
> Thanks for any clarification you can give me.
>
> Claude Robitaille writes:
> > There is a flag for that purpose. You should look into the Altivec
> > environment on Motorola's Web site, or at www.altivec.org
> >
> > On Mon, 25 Oct 1999, Jim Terman wrote:
> > > I have a question about using the altivec registers on a G4 Macintosh
> > > running linuxppc. Will there be any conflicts? I don't expect any
> > > kernel support, but can I be confident that the kernel will not touch
> > > any of the altivec registers? Any info will be greatly appreciated.

--
erik cameron
unix systems administrator
jfi/mrsec @ the university of chicago
e-cameron@uchicago.edu

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: question about altivec registers 1999-10-25 22:31 ` Jim Terman 1999-10-25 22:44 ` erik cameron @ 1999-10-25 23:28 ` Claude Robitaille [not found] ` <Pine.LNX.4.10.9910251916060.5902-100000@modemcable220.93-200-24.mtl.mc.vi deotron.net> 2 siblings, 0 replies; 23+ messages in thread From: Claude Robitaille @ 1999-10-25 23:28 UTC (permalink / raw) To: Jim Terman; +Cc: linuxppc-dev The flag is to actually tell the kernel that your application is using the Altivec registers, so that it can save time by not saving and restoring them when the interrupted or swapped application is not using them. It is supposed to be dynamic so only routines actually using Altivec should set (and clear) it. I am not sure of the details so look into the manual for the hardware support. I think the kernel should enforce the use of this flag since moving the full Altivec register set is time consuming (16 bytes X 32 registers = 1/2 KB). Claude On Mon, 25 Oct 1999, Jim Terman wrote: > I understand how to get it working from the processor point of view. I > just want to make sure that the linuxppc kernal will not interfere if I > compile a program that uses these registers. > > The bottom line is that I am not looking for any help with the altivec > registers from the kernal. I just want to be sure that there will not > be any inteference. If not, great. I'd just like that insurance. > Thanks for any clarification you can give me. > > Claude Robitaille writes: > > There is a flag for that purpose. You should look into the Altivec > > environment on Motorola's Web site, or at www.altivec.org > > > > Claude > > > > > > On Mon, 25 Oct 1999, Jim Terman wrote: > > > > > > > > I have a question about using the altivec registers on a G4 Macintosh > > > running linuxppc. Will there be any conflicts. I don't expect any l > > > kernal support, but can I be confident that the kernal will not touch > > > any of the altivec registers. Any info will be greatly appreciated. 
> > > > > > -- > > > ______________________________________________________________________________ > > > Jim Terman | 323 Vintage Park Dr. | Voice: (650) 356-5446 > > > terman@ddi.com | Foster City, CA | Fax: (650) 356-5490 > > > Diab-SDS, Inc. | 94404 | web site - http://www.ddi.com > > > > > > > > > > > -- > ______________________________________________________________________________ > Jim Terman | 323 Vintage Park Dr. | Voice: (650) 356-5446 > terman@ddi.com | Foster City, CA | Fax: (650) 356-5490 > Diab-SDS, Inc. | 94404 | web site - http://www.ddi.com > > ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
[parent not found: <Pine.LNX.4.10.9910251916060.5902-100000@modemcable220.93-200-24.mtl.mc.vi deotron.net>]
* Re: question about altivec registers [not found] ` <Pine.LNX.4.10.9910251916060.5902-100000@modemcable220.93-200-24.mtl.mc.vi deotron.net> @ 1999-10-25 23:53 ` Rob Barris 1999-10-26 18:22 ` Geert Uytterhoeven 1999-10-26 22:03 ` Tom Vier 0 siblings, 2 replies; 23+ messages in thread From: Rob Barris @ 1999-10-25 23:53 UTC (permalink / raw) To: linuxppc-dev >The flag is to actually tell the kernel that your application is using >the Altivec registers, so that it can save time by not saving and >restoring them when the interrupted or swapped application is not using >them. It is supposed to be dynamic so only routines actually using Altivec >should set (and clear) it. I am not sure of the details so look into the >manual for the hardware support. I think the kernel should enforce the >use of this flag since moving the full Altivec register set is time >consuming (16 bytes X 32 registers = 1/2 KB). I worked this out once, the extra 512 bytes of register context, multiplied by (say) a thousand context switches per second only add up to about a MB of memory traffic per second - a fraction of a percent of the available memory bandwidth in a G4 machine. Most of that will sit in cache anyway depending on the working set size of the processes involved. Wiggling the mouse probably causes more memory traffic than that (code fetches and ISR handling). And this is with a very high hypothetical context switch rate which I suspect may never be seen in real life use. -- Rob Barris Quicksilver Software Inc. rbarris@quicksilver.com ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
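Rob's estimate can be reproduced with a few lines of C. This is only a sketch of the arithmetic in his message: the helper name is made up for illustration, and counting one save plus one restore per context switch is an assumption about what his "about a MB" figure includes.

```c
#include <stdint.h>

/* Size of the full AltiVec register file: 32 registers of 16 bytes each. */
enum { VR_COUNT = 32, VR_BYTES = 16, VR_CONTEXT_BYTES = VR_COUNT * VR_BYTES };

/*
 * Hypothetical helper: extra bytes of memory traffic per second if every
 * context switch stores one task's vector registers and loads another's.
 */
uint64_t altivec_switch_traffic(uint64_t switches_per_second)
{
    /* One save (outgoing task) plus one restore (incoming task). */
    return switches_per_second * 2u * VR_CONTEXT_BYTES;
}
```

With these assumptions, `altivec_switch_traffic(1000)` gives 1,024,000 bytes per second, which is the "about a MB" in the message. Whether that traffic reaches main memory or stays in cache is the separate question the rest of the thread argues about.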
* Re: question about altivec registers 1999-10-25 23:53 ` Rob Barris @ 1999-10-26 18:22 ` Geert Uytterhoeven 1999-10-26 22:13 ` Rob Barris 1999-10-26 22:38 ` Tom Vier 1999-10-26 22:03 ` Tom Vier 1 sibling, 2 replies; 23+ messages in thread From: Geert Uytterhoeven @ 1999-10-26 18:22 UTC (permalink / raw) To: Rob Barris; +Cc: linuxppc-dev On Mon, 25 Oct 1999, Rob Barris wrote: > >The flag is to actually tell the kernel that your application is using > >the Altivec registers, so that it can save time by not saving and > >restoring them when the interrupted or swapped application is not using > >them. It is supposed to be dynamic so only routines actually using Altivec > >should set (and clear) it. I am not sure of the details so look into the > >manual for the hardware support. I think the kernel should enforce the > >use of this flag since moving the full Altivec register set is time > >consuming (16 bytes X 32 registers = 1/2 KB). > > I worked this out once, the extra 512 bytes of register context, > multiplied by (say) a thousand context switches per second only add up to > about a MB of memory traffic per second - a fraction of a percent of the > available memory bandwidth in a G4 machine. Most of that will sit in cache > anyway depending on the working set size of the processes involved. Moving around blocks of 512 bytes quickly thrashes the L1 cache, unless the loads/stores are done using cache-bypassing instructions (cfr. MOVE16 on '040). Don't know whether PPC has these (still no PPC guru :-( Gr{oetje,eeting}s, -- Geert Uytterhoeven -- Linux/{m68k~Amiga,PPC~CHRP} -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: question about altivec registers 1999-10-26 18:22 ` Geert Uytterhoeven @ 1999-10-26 22:13 ` Rob Barris 1999-10-26 22:38 ` Tom Vier 1 sibling, 0 replies; 23+ messages in thread From: Rob Barris @ 1999-10-26 22:13 UTC (permalink / raw) To: linuxppc-dev >On Mon, 25 Oct 1999, Rob Barris wrote: >> >The flag is to actually tell the kernel that your application is using >> >the Altivec registers, so that it can save time by not saving and >> >restoring them when the interrupted or swapped application is not using >> >them. It is supposed to be dynamic so only routines actually using Altivec >> >should set (and clear) it. I am not sure of the details so look into the >> >manual for the hardware support. I think the kernel should enforce the >> >use of this flag since moving the full Altivec register set is time >> >consuming (16 bytes X 32 registers = 1/2 KB). >> >> I worked this out once, the extra 512 bytes of register context, >> multiplied by (say) a thousand context switches per second only add up to >> about a MB of memory traffic per second - a fraction of a percent of the >> available memory bandwidth in a G4 machine. Most of that will sit in cache >> anyway depending on the working set size of the processes involved. > >Moving around blocks of 512 bytes quickly thrashes the L1 cache, unless the >loads/stores are done using cache-bypassing instructions (cfr. MOVE16 on >'040). >Don't know whether PPC has these (still no PPC guru :-( Well, doing anything useful will cause traffic in and out of L1. That's just a fact of life. "Thrash" is a strong word, considering we're talking about 512 bytes of data, that's 512/32768 == 1/64 of the typical PPC 750 data cache. Further, the PPC register state was already large (at least 384 bytes for all 32 int and 32 fp regs) - no one seemed to be noticing context switch time as a problem before, this further supports my assertion. 2.5 times "tiny" is still "tiny". 
Now, copying a 16K or 32K block from point A to point B will indeed cause a complete cache replacement. But that's not what's going on here. In fact, for a few processes being switched between rapidly, it may well be the case that those register state blocks may park in the L1 or L2 and not go back out to main memory at all. But my estimate was based on a worst case again, and assuming that anything leaving L1 has to go to RAM and not the L2. The point I was trying to make is that even in a hypothetical worst case scenario, the added traffic is modest and possibly below the threshold of noticeability. Moving things in and out of L1 is not bad in itself. The net impact is what matters, that's what my calculation was trying to show. If for example, memory was infinitely fast, traffic to and from L1 would have no impact. OK so that's not true, the question then becomes "so how much time does in fact get spent servicing that traffic, given real memory speeds". At a hypothetical switch rate of 1KHz (extremely high) the overhead is still quite small. -- Rob Barris Quicksilver Software Inc. rbarris@quicksilver.com ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
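The sizes Rob quotes can be checked the same way. The sketch below just restates his numbers (32 four-byte integer registers, 32 eight-byte FP registers, 32 sixteen-byte vector registers, a 32 KB L1 data cache); the function names are illustrative and special-purpose registers are ignored, which is an assumption.

```c
#include <stddef.h>

enum {
    GPR_BYTES       = 32 * 4,    /* 32 integer registers, 4 bytes each  */
    FPR_BYTES       = 32 * 8,    /* 32 FP registers, 8 bytes each       */
    VR_BYTES_TOTAL  = 32 * 16,   /* 32 vector registers, 16 bytes each  */
    L1_DCACHE_BYTES = 32 * 1024  /* PPC 750 data cache size, as quoted  */
};

/* Register state before AltiVec: integer plus floating point. */
size_t base_context_bytes(void) { return GPR_BYTES + FPR_BYTES; }

/* Register state once the vector registers are added. */
size_t full_context_bytes(void) { return base_context_bytes() + VR_BYTES_TOTAL; }

/* How many vector register files fit in the L1 data cache. */
size_t vector_files_per_l1(void) { return L1_DCACHE_BYTES / VR_BYTES_TOTAL; }
```

This gives 384 bytes before and 896 bytes after, a factor of roughly 2.3 (close to the "2.5 times" in the message), and confirms that one vector register file is 1/64 of the data cache.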
* Re: question about altivec registers
  1999-10-26 18:22 ` Geert Uytterhoeven
  1999-10-26 22:13   ` Rob Barris
@ 1999-10-26 22:38   ` Tom Vier
  1 sibling, 0 replies; 23+ messages in thread
From: Tom Vier @ 1999-10-26 22:38 UTC
  To: Geert Uytterhoeven; +Cc: linuxppc-dev

On Tue, Oct 26, 1999 at 08:22:06PM +0200, Geert Uytterhoeven wrote:
> Moving around blocks of 512 bytes quickly thrashes the L1 cache, unless the
> loads/stores are done using cache-bypassing instructions (cfr. MOVE16 on '040).
> Don't know whether PPC has these (still no PPC guru :-(

from what i've read, you can disable cache for the altivec regs. this was
intended for doing infrequent vector ops between frequent vector ops
(loops) without disturbing the cache.

--
Tom Vier - 0x27371A1C
thomassr@erols.com
http://users.erols.com/thomassr/zero/
DSA Key fingerprint: 42D4 82D6 6DF5 77EC 1251 30D2 D9E7 E858 2737 1A2C
* Re: question about altivec registers
  1999-10-25 23:53 ` Rob Barris
  1999-10-26 18:22   ` Geert Uytterhoeven
@ 1999-10-26 22:03   ` Tom Vier
  1 sibling, 0 replies; 23+ messages in thread
From: Tom Vier @ 1999-10-26 22:03 UTC
  To: Rob Barris; +Cc: linuxppc-dev

On Mon, Oct 25, 1999 at 04:53:52PM -0700, Rob Barris wrote:
> I worked this out once, the extra 512 bytes of register context,
> multiplied by (say) a thousand context switches per second only add up to
> about a MB of memory traffic per second - a fraction of a percent of the
> available memory bandwidth in a G4 machine. Most of that will sit in cache
> anyway depending on the working set size of the processes involved.

couldn't you just do lazy context saves? ie, disable the vector ops by
default; when a proc tries to use a vector op, catch the exception, mark
the proc as vector-using and enable vectors. when a context switch occurs,
the proc stays marked as vector-using; disable vectors and continue (and
re-enable them when you switch the proc's context back in). it's a little
more complicated when more than one proc wants vectors. in that case,
before you re-enable vectors, check to see if the vector regs need their
context switched. or maybe that complexity isn't worth the
bandwidth/latency it saves. does linux/ppc do lazy FPU context saves this
way?

if you don't do lazy vector saves, i would think it would raise context
switch times a sizable amount. there's four times as much data in those
128bit regs as there is in the 32bit GPRs.

--
Tom Vier - 0x27371A1C
thomassr@erols.com
http://users.erols.com/thomassr/zero/
DSA Key fingerprint: 42D4 82D6 6DF5 77EC 1251 30D2 D9E7 E858 2737 1A2C
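The lazy scheme Tom describes can be modelled in plain C. What follows is a user-space sketch under stated assumptions, not kernel code: `struct task`, `struct cpu`, `context_switch` and `vec_unavailable_trap` are invented names, the "hardware" registers are just an array, and the `vec_enabled` flag stands in for the enable bit that makes vector instructions trap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VR_COUNT 32
#define VR_BYTES 16

struct task {
    bool    uses_vec;                /* set on the task's first vector op   */
    uint8_t vr[VR_COUNT][VR_BYTES];  /* saved vector context                */
};

struct cpu {
    bool         vec_enabled;        /* models the vector-unit enable bit   */
    struct task *current;            /* task now running                    */
    struct task *vec_owner;          /* task whose values are in the regs   */
    uint8_t      vr[VR_COUNT][VR_BYTES]; /* models the hardware registers   */
    unsigned     transfers;          /* number of real save/restore rounds  */
};

/* Move the vector context only when the registers belong to someone else. */
static void load_vec_context(struct cpu *c, struct task *t)
{
    if (c->vec_owner == t)
        return;                              /* already live, nothing to do */
    if (c->vec_owner != NULL)
        memcpy(c->vec_owner->vr, c->vr, sizeof c->vr);  /* save old owner   */
    memcpy(c->vr, t->vr, sizeof c->vr);                 /* load new owner   */
    c->vec_owner = t;
    c->transfers++;
}

/* Context switch: enable vectors only for tasks already marked as users. */
void context_switch(struct cpu *c, struct task *next)
{
    c->current = next;
    if (next->uses_vec) {
        load_vec_context(c, next);
        c->vec_enabled = true;
    } else {
        c->vec_enabled = false;  /* first vector op will trap */
    }
}

/* Trap taken when the current task runs a vector op with the unit disabled. */
void vec_unavailable_trap(struct cpu *c)
{
    struct task *t = c->current;
    t->uses_vec = true;
    load_vec_context(c, t);
    c->vec_enabled = true;
}
```

In this model, switching from a vector-using task to a task that never touches vectors and back again moves no vector state at all; the 512-byte copy only happens when a second vector-using task actually needs the registers.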
* Re: question about altivec registers
  1999-10-25 20:51 question about altivec registers Jim Terman
  1999-10-25 22:27 ` Claude Robitaille
@ 1999-10-26  4:42 ` Kumar Gala
  1999-10-26 21:52   ` Jim Terman
  1 sibling, 1 reply; 23+ messages in thread
From: Kumar Gala @ 1999-10-26 4:42 UTC
  To: linuxppc-dev

The Linux kernel as is will not affect the AltiVec registers in any way.
However, there is a minor change to the kernel that will be required. You
will need to enable the MSR VEC bit (bit 6 in the MSR) to tell the
processor that the AltiVec unit is available (this is similar to the MSR
FP bit). If the bit is not set, the processor will generate an AltiVec
Unavailable exception, which will be trapped (incorrectly) as an unknown
0xf00 exception.

The 0xf00 exception is for the performance monitor,
and 0xf20 is the AltiVec Unavailable exception.

All of these details are documented in the AltiVec Programming
Environments Manual (available from the Motorola website).

If you need any help getting a simple kernel up and running for running
single AltiVec-enabled processes, let me know.

- kumar gala

ignorance is bliss.
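The bit and vector numbers in Kumar's message translate into C constants as follows. This is a sketch, not the kernel's own header: the macro and function names are chosen for illustration, and it assumes the usual PowerPC convention that bit 0 is the most significant bit of the 32-bit MSR, so "bit 6" is the mask 0x02000000.

```c
#include <stdbool.h>
#include <stdint.h>

/* PowerPC numbers bits from the most significant end: bit 0 is the MSB. */
#define MSR_BIT(n)  (UINT32_C(1) << (31 - (n)))
#define MSR_VEC     MSR_BIT(6)   /* AltiVec unit available */

/* Exception vector offsets named in the message. */
#define EXC_VEC_PERF_MONITOR   0xf00u
#define EXC_VEC_ALTIVEC_UNAV   0xf20u

enum exc_kind { EXC_PERF_MONITOR, EXC_ALTIVEC_UNAVAILABLE, EXC_OTHER };

/* True if vector instructions may run under this MSR value. */
bool altivec_enabled(uint32_t msr) { return (msr & MSR_VEC) != 0; }

/* MSR value with the AltiVec unit turned on, other bits unchanged. */
uint32_t enable_altivec(uint32_t msr) { return msr | MSR_VEC; }

/* Tell the two neighbouring vectors apart instead of lumping them as 0xf00. */
enum exc_kind classify_exception(uint32_t vector_offset)
{
    switch (vector_offset) {
    case EXC_VEC_PERF_MONITOR: return EXC_PERF_MONITOR;
    case EXC_VEC_ALTIVEC_UNAV: return EXC_ALTIVEC_UNAVAILABLE;
    default:                   return EXC_OTHER;
    }
}
```

The two vectors sit only 0x20 bytes apart, so a handler that dispatches on the 0xf00 entry alone cannot tell them apart; a kernel that wants to support AltiVec needs a separate case for 0xf20.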
* Re: question about altivec registers
  1999-10-26  4:42 ` Kumar Gala
@ 1999-10-26 21:52   ` Jim Terman
  1999-10-26 22:43     ` Kumar Gala
  0 siblings, 1 reply; 23+ messages in thread
From: Jim Terman @ 1999-10-26 21:52 UTC
  To: Kumar Gala; +Cc: linuxppc-dev

Will the linuxppc kernel as it is right now save the AltiVec registers
if we enable the MSR VEC bit? I've been trying to follow the other messages
on this subject, but I'm not clear.

Kumar Gala writes:
 > The Linux kernel as is will not affect the AltiVec registers in any way.
 > However, there is a minor change to the kernel that will be required. You
 > will need to enable the MSR VEC bit (bit 6 in the MSR) to tell the
 > processor that the AltiVec unit is available (this is similar to the MSR
 > FP bit). If the bit is not set, the processor will generate an AltiVec
 > Unavailable exception, which will be trapped (incorrectly) as an unknown
 > 0xf00 exception.
 >
 > The 0xf00 exception is for the performance monitor,
 > and 0xf20 is the AltiVec Unavailable exception.
 >
 > All of these details are documented in the AltiVec Programming
 > Environments Manual (available from the Motorola website).
 >
 > If you need any help getting a simple kernel up and running for running
 > single AltiVec-enabled processes, let me know.
 >
 > - kumar gala

--
Jim Terman      | 323 Vintage Park Dr. | Voice: (650) 356-5446
terman@ddi.com  | Foster City, CA      | Fax:   (650) 356-5490
Diab-SDS, Inc.  | 94404                | web site - http://www.ddi.com
* Re: question about altivec registers
  1999-10-26 21:52 ` Jim Terman
@ 1999-10-26 22:43   ` Kumar Gala
  1999-10-27  8:58     ` Adrian Cox
  0 siblings, 1 reply; 23+ messages in thread
From: Kumar Gala @ 1999-10-26 22:43 UTC
  To: Jim Terman; +Cc: linuxppc-dev

> Will the linuxppc kernel as it is right now save the AltiVec registers
> if we enable the MSR VEC bit? I've been trying to follow the other messages
> on this subject, but I'm not clear.

No, the kernel does not know anything about the AltiVec registers. Unlike
on the x86 platform, on an interrupt typically only two registers are
saved and restored by the hardware: SRR0 and SRR1. SRR1 contains the MSR
settings and SRR0 the PC to return to. On an rfi, the values are copied
back out of these registers.

The AltiVec registers have to be saved and restored explicitly. If you look
at /arch/ppc/kernel/head.S and look for load_up_fp, you will see how the
floating point unit is handled on exceptions. Essentially, some checks are
done and a pointer is kept to the last process that used the FP unit
(last_task_used_fp). If needed, the FP registers are saved into that
process's context and the FP registers of the incoming process are
restored.

- kumar

ignorance is bliss.
* Re: question about altivec registers 1999-10-26 22:43 ` Kumar Gala @ 1999-10-27 8:58 ` Adrian Cox 1999-10-27 13:21 ` Gabriel Paubert 0 siblings, 1 reply; 23+ messages in thread From: Adrian Cox @ 1999-10-27 8:58 UTC (permalink / raw) To: Kumar Gala; +Cc: Jim Terman, linuxppc-dev Kumar Gala wrote: > The AltiVec registers have to be saved and restore explicitly, if you look > at /arch/ppc/kernel/head.S and look for load_up_fp you will see how the > floating point unit is handled on exceptions. Essential what is done is > there are some checks done, and a pointer is kept to the last > process using the FP unit (last_task_used_fp) which then if needed the FP > regs are saved in to that processes context and the FPs for the incoming > are restored. Linux on PowerPC should end up doing a classic lazy save/restore for the vector context, as it already does for the floating point registers. On SMP systems this simple approach isn't possible, but a quick approximation is to detect the first time a process uses Altivec, and marking it to always save and restore vector context from then on. I'd recommend that compiler writers use the vrsave register to mark which vector registers they use, as a precaution against future kernels which may look at this. Note that the G4 is extremely fast at linear sequences of cacheable stores (store miss merging), and it is probably cheaper for the kernel to ignore vrsave and avoid branches in the save and restore sequence. Of course, it is correct to simply set every bit in vrsave at the start of your application, and never change it again. It may be non-optimal on future systems, but it should remain correct. As for the cache thrashing effect, remember that 512 bytes going in and out of the L2 cache is not very expensive, and that there is probably 1 or 2MB of L2 fitted. - Adrian Cox, AG Electronics ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: question about altivec registers
  1999-10-27  8:58 ` Adrian Cox
@ 1999-10-27 13:21   ` Gabriel Paubert
  1999-10-27 16:05     ` Geert Uytterhoeven
                       ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Gabriel Paubert @ 1999-10-27 13:21 UTC
  To: Adrian Cox; +Cc: Kumar Gala, Jim Terman, linuxppc-dev

On Wed, 27 Oct 1999, Adrian Cox wrote:

> Linux on PowerPC should end up doing a classic lazy save/restore for the
> vector context, as it already does for the floating point registers. On
> SMP systems this simple approach isn't possible, but a quick
> approximation is to detect the first time a process uses Altivec, and
> marking it to always save and restore vector context from then on.

Agreed.

> I'd recommend that compiler writers use the vrsave register to mark
> which vector registers they use, as a precaution against future kernels
> which may look at this. Note that the G4 is extremely fast at linear
> sequences of cacheable stores (store miss merging), and it is probably
> cheaper for the kernel to ignore vrsave and avoid branches in the save
> and restore sequence. Of course, it is correct to simply set every bit
> in vrsave at the start of your application, and never change it again.
> It may be non-optimal on future systems, but it should remain correct.

Don't forget nevertheless a worthwhile optimization: VRSAVE=0 means that
the program has no active Altivec registers at the time, so the save can
be skipped altogether (except for vrsave and the control/status register).

And why would you want to use a bitmap? This seems braindead to me: put a
value between 0 and 32 in vrsave. Since all registers are identical
in use and purpose, save registers 0 to n. Disclaimer: I've not seen
whether the ABI specifies how and which Altivec registers are saved and
restored across calls.

Paranoid point of view: the restore must reload all Altivec registers (or
clear the ones which are not specified as used by VRSAVE), otherwise you
might leak the contents of the Altivec registers of another process.
I'm not a security expert, but I don't like this possibility at all.

Code bloat concerns: to save or restore a single Altivec register you
actually need 2 instructions given the available addressing modes. This
makes 512 bytes of code for 32 register saves + 32 register restores
(there are ways to slightly reduce it, but there is also the overhead of
setting up several integer registers, saving vrsave and the control/status
register...). Count 12 bytes/register if you use a bit in vrsave to check
every register. But the branches are not that expensive if the cr bits
are set far enough in advance. Assuming vrsave has been copied to r0:

        cmpwi   r0,0
        beq-    done            # vrsave == 0: nothing live, skip the saves
        mtcrf   0x1,r0
        la      r3,vregsavearea+448
        li      r4,16
        li      r5,32
        li      r6,48
        bf      31,30f
        stvx    v31,r6,r3
30:     mtcrf   0x2,r0
        bf      30,29f
        stvx    v30,r5,r3
29:     srwi    r0,r0,8
        bf      29,28f
        stvx    v29,r4,r3
28:     bf      28,27f
        stvx    v28,0,r3
27:     addi    r3,r3,-64
        bf      27,26f
        stvx    v27,r6,r3
26:     mtcrf   0x1,r0
        bf      26,25f
        stvx    v26,r5,r3
25:     bf      25,24f
        stvx    v25,r4,r3
24:     bf      24,23f
        stvx    v24,0,r3
23:     addi    r3,r3,-64
        bf      31,22f
        stvx    v23,r6,r3
22:     mtcrf   0x2,r0          # Cycle since 30: repeats here
        bf      30,21f
        stvx    v22,r5,r3
21:     srwi    r0,r0,8
        bf      29,20f
        ...
0:      bf      24,done
        stvx    v0,0,r3
done:
        # now save the control/status register...

In this code the bits to test are always set 3 or more branches ahead of
the test, by interleaving 2 cr fields set up by mtcrf according to the
vrsave bits. But the code is significantly larger than using a count and
branching to the right place in the save routine.

> As for the cache thrashing effect, remember that 512 bytes going in and
> out of the L2 cache is not very expensive, and that there is probably 1
> or 2MB of L2 fitted.

My feeling is that it is unlikely that the code is in the L1 cache, this code is not a tight loop which is executed 1000 times in a row, and it is probably saturating L2 cache bandwidth. If you need 8 bytes of code and 16 bytes of data for each register save/load on average, it's 3 L2 data beats or 6 clocks in the most common scenario (L2 at 1/2 core frequency). Gabriel. ** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/ ^ permalink raw reply [flat|nested] 23+ messages in thread
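The bitmap convention and the "paranoid" restore from Gabriel's message can be sketched in C. This is an illustrative model, not the assembly above translated: the function names are invented, the register file is a plain array, and it assumes VRSAVE bit n (counting from the most significant bit) marks register vn, which is what the `bf 31` / `stvx v31` pairing in the listing implies.

```c
#include <stdint.h>
#include <string.h>

#define VR_COUNT 32
#define VR_BYTES 16

typedef uint8_t vreg_t[VR_BYTES];

/* VRSAVE bit n, counted from the most significant bit, marks vn as live. */
static uint32_t vrsave_mask(unsigned n)
{
    return UINT32_C(0x80000000) >> n;
}

/* Save only the registers marked in vrsave; returns how many were copied. */
unsigned save_marked(uint32_t vrsave, vreg_t hw[VR_COUNT], vreg_t area[VR_COUNT])
{
    unsigned n, saved = 0;

    for (n = 0; n < VR_COUNT; n++) {
        if (vrsave & vrsave_mask(n)) {
            memcpy(area[n], hw[n], VR_BYTES);
            saved++;
        }
    }
    return saved;
}

/*
 * Restore the marked registers and zero every unmarked one, so that values
 * left behind by another process cannot be read by the incoming one.
 */
void restore_marked(uint32_t vrsave, vreg_t hw[VR_COUNT], vreg_t area[VR_COUNT])
{
    unsigned n;

    for (n = 0; n < VR_COUNT; n++) {
        if (vrsave & vrsave_mask(n))
            memcpy(hw[n], area[n], VR_BYTES);
        else
            memset(hw[n], 0, VR_BYTES);
    }
}
```

Note the asymmetry: the save side can skip unmarked registers, but the restore side still has to write all 32, which is the cost of the paranoid point of view.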
* Re: question about altivec registers
  1999-10-27 13:21 ` Gabriel Paubert
@ 1999-10-27 16:05   ` Geert Uytterhoeven
  1999-10-27 18:23   ` Kumar Gala
  1999-10-27 22:39   ` Tony Mantler
  2 siblings, 0 replies; 23+ messages in thread
From: Geert Uytterhoeven @ 1999-10-27 16:05 UTC
  To: Gabriel Paubert; +Cc: Adrian Cox, Kumar Gala, Jim Terman, linuxppc-dev

On Wed, 27 Oct 1999, Gabriel Paubert wrote:
> On Wed, 27 Oct 1999, Adrian Cox wrote:
> > As for the cache thrashing effect, remember that 512 bytes going in and
> > out of the L2 cache is not very expensive, and that there is probably 1
> > or 2MB of L2 fitted.
>
> My feeling is that it is unlikely that the code is in the L1 cache, this
> code is not a tight loop which is executed 1000 times in a row, and it is
> probably saturating L2 cache bandwidth. If you need 8 bytes of code and 16
> bytes of data for each register save/load on average, it's 3 L2 data beats
> or 6 clocks in the most common scenario (L2 at 1/2 core frequency).

I wasn't primarily concerned about code taking space in the L1 cache, but
about the saved AltiVec registers pushing valuable data out of the L1 cache
on each save.

Gr{oetje,eeting}s,
--
Geert Uytterhoeven -- Linux/{m68k~Amiga,PPC~CHRP} -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                                            -- Linus Torvalds
* Re: question about altivec registers
  1999-10-27 13:21 ` Gabriel Paubert
  1999-10-27 16:05   ` Geert Uytterhoeven
@ 1999-10-27 18:23   ` Kumar Gala
  1999-10-27 22:39   ` Tony Mantler
  2 siblings, 0 replies; 23+ messages in thread
From: Kumar Gala @ 1999-10-27 18:23 UTC
  To: Gabriel Paubert; +Cc: Adrian Cox, Jim Terman, linuxppc-dev

The main purpose of VRSAVE was for Apple. They allocate registers in
order, which makes it efficient to use VRSAVE to decide which AltiVec
registers to save and restore, since you only need to determine up to
which register to save.

The ABI for System V allocates registers similarly to GPRs, in that
vr3, vr4, vr5 are used to pass arguments and so on. This is documented in
the Motorola docs (AltiVec Programming Environments Manual on their
website).

The problem then becomes that, due to the ABI, it is more costly to
determine which vector registers were used and which were not if VRSAVE is
used as a bitmap. Also, the current ABI does not state the use of VRSAVE
at all, and therefore the non-Apple compilers do not even include it in
generated code that uses AltiVec.

Also, based on how FP is implemented, if the process currently running
does not use AltiVec in its current time slice, the registers are not
saved and restored.

As for the cache issues with saving and restoring all the VR registers,
I am looking into this further.

- kumar

ignorance is bliss.
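The "determine up to which register to save" idea can be made concrete. The helper below is a sketch with an invented name; it assumes the same bit order as elsewhere in the thread (VRSAVE bit 0, the most significant bit, marks v0) and returns how many registers, starting at v0, must be saved to cover every marked one.

```c
#include <stdint.h>

/*
 * Number of registers v0..v(count-1) to save so that every register
 * marked in the VRSAVE bitmap is included. Returns 0 for an empty bitmap.
 */
unsigned vrsave_save_count(uint32_t vrsave)
{
    unsigned count = 32;

    if (vrsave == 0)
        return 0;

    /* The least significant set bit belongs to the highest register used. */
    while ((vrsave & 1u) == 0) {
        vrsave >>= 1;
        count--;
    }
    return count;
}
```

With in-order allocation the count is tight: a bitmap of 0xE0000000 (v0 to v2) yields 3. With an ABI that starts at a higher register it is not: a bitmap of 0x10000000 (only v3 marked) yields 4, so three unused registers get saved along with the one that matters. That is the trade-off between the count and the bitmap being discussed.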
* Re: question about altivec registers
  1999-10-27 13:21 ` Gabriel Paubert
  1999-10-27 16:05 ` Geert Uytterhoeven
  1999-10-27 18:23 ` Kumar Gala
@ 1999-10-27 22:39 ` Tony Mantler
  1999-10-28 11:01 ` Gabriel Paubert
  2 siblings, 1 reply; 23+ messages in thread
From: Tony Mantler @ 1999-10-27 22:39 UTC
To: Gabriel Paubert; +Cc: linuxppc-dev

At 8:21 AM -0500 10/27/99, Gabriel Paubert wrote:
[...]
>And why would you want to use a bitmap? This seems braindead to me, put a
>value between 0 and 32 in vrsave. Since all registers are identical
>in use and purpose, save registers 0 to n. Disclaimer: I've not seen if
>the ABI specifies how and which Altivec registers are saved and restored
>across calls.

It would seem to me that using a bitmap to mark used registers would allow
more flexibility on the compiler side to play with register usage without
incurring longer context switch times. I wouldn't try to guess the relative
tradeoff values though.

>Paranoid point of view: the restore must reload all altivec registers
>(or clear the ones which are not specified as used by VRSAVE), otherwise
>you might leak the contents of the Altivec registers of another process.
>I'm not a security expert, but I don't like this possibility at all.
[...]

Beyond security, clearing the registers would also serve to enforce strict
usage of whatever is defined as the VRSAVE format, and avoid the
possibility of a mouth-breathing code-typist releasing a binary that
doesn't mark its registers. In theory such a binary would only break once a
different application touches the altivec registers, resulting in a
situation of either A: the kernel being forced to save all altivec
registers (bad) or B: allowing those binaries to be broken and upsetting
their users (slightly less bad). Obviously pre-breaking those binaries would
be the preferable solution, so they never need see the light of day.

That's my 2c.

Cheers - Tony :)

--
Tony Mantler Renaissance Nerd Extraordinaire eek@escape.ca
Winnipeg, Manitoba, Canada http://www.escape.ca/~eek
* Re: question about altivec registers
  1999-10-27 22:39 ` Tony Mantler
@ 1999-10-28 11:01 ` Gabriel Paubert
  1999-10-28 21:20 ` Tony Mantler
  0 siblings, 1 reply; 23+ messages in thread
From: Gabriel Paubert @ 1999-10-28 11:01 UTC
To: Tony Mantler; +Cc: linuxppc-dev

On Wed, 27 Oct 1999, Tony Mantler wrote:
> At 8:21 AM -0500 10/27/99, Gabriel Paubert wrote:
> [...]
> >And why would you want to use a bitmap? This seems braindead to me, put a
> >value between 0 and 32 in vrsave. Since all registers are identical
> >in use and purpose, save registers 0 to n. Disclaimer: I've not seen if
> >the ABI specifies how and which Altivec registers are saved and restored
> >across calls.
>
> It would seem to me that using a bitmap to mark used registers would allow
> more flexibility on the compiler side to play with register usage without
> incurring longer context switch times. I wouldn't try to guess the relative
> tradeoff values though.

Since all 32 registers have exactly the same capabilities, using only a
consecutive set of registers is not a problem for the compiler. That's how
it works for integer registers and FP registers 14 to 31 (for integer
actually register 0 is special). Taking into account the comments on the
ABI, the only strategy which seems implementable is to distinguish only 2
cases: vrsave=0 and vrsave!=0. Using a bitmap means a lot of conditional
branches, which can obviously be folded but nevertheless bloat the code.
(For save+restore: 64 conditional branches + 16 mtcrf = 320 bytes.)

There is also a problem of knowing how to update the bitmap in nested
subroutines: to keep it correct the called subroutine must save the
current vrsave and then OR it with the bitmask of the registers it uses.
Then on exit it has to restore the caller's vrsave. Do we want such a
complex strategy? I don't mean that it is impossible to implement, but that
it looks complex.

OTOH, if the register usage is designed similarly to integer and FP, the
bitmask might look like 111...1100...0011...11 (i.e. with at most 2
transitions between 0 and 1 in the bit string). It might be worth
optimizing the save/restore routine for this case, saving/restoring more
registers than necessary when vrsave does not have such a canonical form.

> >
> >Paranoid point of view: the restore must reload all altivec registers
> >(or clear the ones which are not specified as used by VRSAVE), otherwise
> >you might leak the contents of the Altivec registers of another process.
> >I'm not a security expert, but I don't like this possibility at all.
> [...]
>
> Beyond security, clearing the registers would also serve to enforce strict
> usage of whatever is defined as the VRSAVE format, and avoid the
> possibility of a mouth-breathing code-typist releasing a binary that
> doesn't mark its registers, which in theory would only break once a
> different application touches the altivec registers, resulting in a
> situation of either A: the kernel being forced to save all altivec
> registers (bad) or B: allowing those binaries to be broken and upsetting
> their users (slightly less bad). Obviously pre-breaking those binaries would
> be the preferable solution, so they never need see the light of day.

Indeed, I had not considered this problem. Note that conditional clearing
of most registers can probably be done without conditional branches. Just
put a copy of vrsave in one vr and then find a smart way to transform
these bits into masks to clear the registers (probably you'll have to splat
it first). It won't work for the last register(s) because you need
some workspace, however.

Anyway, having a special fast path for the case vrsave=0 is probably the
most important optimization IMHO.

Gabriel.
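Editorial sketch: the nested-subroutine protocol described in the message above (save the caller's vrsave, OR in the registers this routine uses, restore on exit) can be modelled in a few lines of C. This is only an illustration under stated assumptions: the global `vrsave` stands in for the real special-purpose register, the helper names `vrsave_enter`, `vrsave_exit` and `vr_bit` are hypothetical, and the bit layout (most significant bit = register 0) is assumed rather than taken from the thread.

```c
#include <stdint.h>

/* Stand-in for the VRSAVE special-purpose register (assumption: real code
   would read and write it with mfspr/mtspr, not a C global). */
static uint32_t vrsave;

/* Bit for vector register n, assuming the most significant bit stands
   for register 0. */
static uint32_t vr_bit(unsigned n)
{
    return UINT32_C(0x80000000) >> n;
}

/* Prologue: remember the caller's bitmap, then mark this routine's
   registers as live on top of it. Returns the value to restore later. */
static uint32_t vrsave_enter(uint32_t used_mask)
{
    uint32_t saved = vrsave;
    vrsave = saved | used_mask;
    return saved;
}

/* Epilogue: put the caller's bitmap back. */
static void vrsave_exit(uint32_t saved)
{
    vrsave = saved;
}
```

Each routine pays one read, one OR and two writes regardless of how many registers it uses, which is the cost being weighed against a simple "up to register n" count.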
* Re: question about altivec registers
  1999-10-28 11:01 ` Gabriel Paubert
@ 1999-10-28 21:20 ` Tony Mantler
  1999-10-29 11:58 ` Benjamin Herrenschmidt
  1999-10-29 12:49 ` Gabriel Paubert
  0 siblings, 2 replies; 23+ messages in thread
From: Tony Mantler @ 1999-10-28 21:20 UTC
To: Gabriel Paubert; +Cc: linuxppc-dev

At 6:01 AM -0500 10/28/99, Gabriel Paubert wrote:
>On Wed, 27 Oct 1999, Tony Mantler wrote:
> [...]
>> It would seem to me that using a bitmap to mark used registers would allow
>> more flexibility on the compiler side to play with register usage without
>> incurring longer context switch times. I wouldn't try to guess the relative
>> tradeoff values though.
>
>Since all 32 registers have exactly the same capabilities, using only a
>consecutive set of registers is not a problem for the compiler. That's how
>it works for integer registers and FP registers 14 to 31 (for integer
>actually register 0 is special).

I suppose I'm a bit too used to 68k stuff, where sorting register usage
takes a back seat to efficient register re-use. However, with the size of
the data in the Altivec registers, I would expect a bit of optimization to
slant away from cases where the registers can be easily sorted and packed.

>Taking into account the comments on the
>ABI, the only strategy which seems implementable is to distinguish only 2
>cases: vrsave=0 and vrsave!=0. Using a bitmap means a lot of conditional
>branches, which can obviously be folded but nevertheless bloat the code.
>(For save+restore: 64 conditional branches + 16 mtcrf = 320 bytes.)

Quite true.

>There is also a problem of knowing how to update the bitmap in nested
>subroutines: to keep it correct the called subroutine must save the
>current vrsave and then OR it with the bitmask of the registers it uses.
>Then on exit it has to restore the caller's vrsave. Do we want such a
>complex strategy? I don't mean that it is impossible to implement, but that
>it looks complex.

I think saving registers in a subroutine is a pain no matter how it's
implemented. If the VRSAVE is used as a count, the subroutine still has to
save the old value, save the overwritten registers, calculate what the
proper new value is (think new < old = oops!) then restore the overwritten
registers and old VRSAVE value when it exits.

>OTOH, if the register usage is designed similarly to integer and FP, the
>bitmask might look like 111...1100...0011...11 (i.e. with at most 2
>transitions between 0 and 1 in the bit string). It might be worth
>optimizing the save/restore routine for this case, saving/restoring more
>registers than necessary when vrsave does not have such a canonical form.

Hmm, count bits in from the left and right, mask and check for missed bits,
then branch to either a full save or a left+right save.

Doing it that way would also somewhat optimize VRSAVE=0: since both the
leftmost and rightmost bits are 0, it would pass right through the
left-save and right-save halves of the optimized register save.

Perhaps a little longer than "if (VRSAVE==0) return;", but it's quick
enough for me.

[.. enforce proper VRSAVE usage ..]
>
>Indeed, I had not considered this problem. Note that conditional clearing
>of most registers can probably be done without conditional branches. Just
>put a copy of vrsave in one vr and then find a smart way to transform
>these bits into masks to clear the registers (probably you'll have to splat
>it first). It won't work for the last register(s) because you need
>some workspace, however.

Mmm, the joys of a bitwise AND.

>Anyway, having a special fast path for the case vrsave=0 is probably the
>most important optimization IMHO.

Indeed.

--
Tony Mantler Renaissance Nerd Extraordinaire eek@escape.ca
Winnipeg, Manitoba, Canada http://www.escape.ca/~eek
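Editorial sketch: the "count bits in from the left and right, mask and check for missed bits" idea from the message above, written out in C. It is a model of the decision only, not kernel code. The names `struct save_plan` and `plan_save` are hypothetical, and the assumption that the most significant bit corresponds to register 0 is mine. A bitmap that is not of the form ones/zeros/ones falls back to a full save.

```c
#include <stdint.h>

/* Which registers to save: 0..left-1 and 32-right..31 (hypothetical type). */
struct save_plan {
    unsigned left;
    unsigned right;
};

static struct save_plan plan_save(uint32_t vrsave)
{
    struct save_plan p = {0, 0};

    /* Count the run of set bits starting at the most significant end. */
    while (p.left < 32 && (vrsave & (UINT32_C(0x80000000) >> p.left)))
        p.left++;
    if (p.left == 32)
        return p;               /* every register is live: full save */

    /* Count the run of set bits starting at the least significant end.
       At least one bit is clear here, so this stops before 32. */
    while (vrsave & (UINT32_C(1) << p.right))
        p.right++;

    /* Mask off both runs and check for bits that were missed. */
    uint32_t left_mask  = p.left  ? ~UINT32_C(0) << (32 - p.left)  : 0;
    uint32_t right_mask = p.right ? ~UINT32_C(0) >> (32 - p.right) : 0;
    if (vrsave & ~(left_mask | right_mask)) {
        p.left = 32;            /* not canonical: fall back to a full save */
        p.right = 0;
    }
    return p;
}
```

With vrsave equal to 0 both counts come out as 0, so nothing is saved, which matches the observation that this scheme passes straight through in that case.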
* Re: question about altivec registers
  1999-10-28 21:20 ` Tony Mantler
@ 1999-10-29 11:58 ` Benjamin Herrenschmidt
  1999-10-29 12:49 ` Gabriel Paubert
  1 sibling, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 1999-10-29 11:58 UTC
To: Tony Mantler, linuxppc-dev

On Thu, Oct 28, 1999, Tony Mantler <eek@escape.ca> wrote:

>I think saving registers in a subroutine is a pain no matter how it's
>implemented. If the VRSAVE is used as a count, the subroutine still has to
>save the old value, save the overwritten registers, calculate what the
>proper new value is (think new < old = oops!) then restore the overwritten
>registers and old VRSAVE value when it exits.

In their latest Mac compiler, Metrowerks added a pragma for manually
optimising this when you know you'll use a bunch of those registers:

<<
#pragma altivec_vrsave allon is now supported. It sets VRsave assuming
that all altivec registers are in use, best used with "#pragma
altivec_vrsave off" so only the parent routine updates the vrsave
register.
>>
* Re: question about altivec registers
  1999-10-28 21:20 ` Tony Mantler
  1999-10-29 11:58 ` Benjamin Herrenschmidt
@ 1999-10-29 12:49 ` Gabriel Paubert
  1999-10-30 4:14 ` Tony Mantler
  1 sibling, 1 reply; 23+ messages in thread
From: Gabriel Paubert @ 1999-10-29 12:49 UTC
To: Tony Mantler; +Cc: linuxppc-dev

On Thu, 28 Oct 1999, Tony Mantler wrote:

> I suppose I'm a bit too used to 68k stuff, where sorting register usage
> takes a back seat to efficient register re-use. However, with the size of
> the data in the Altivec registers, I would expect a bit of optimization to
> slant away from cases where the registers can be easily sorted and packed.

Things are different when all registers are identical and instructions
have separate operands for inputs and the output. I've programmed 68k too
and it's often painful (Intel is worse, to be fair).

> >There is also a problem of knowing how to update the bitmap in nested
> >subroutines: to keep it correct the called subroutine must save the
> >current vrsave and then OR it with the bitmask of the registers it uses.
> >Then on exit it has to restore the caller's vrsave. Do we want such a
> >complex strategy? I don't mean that it is impossible to implement, but
> >that it looks complex.
>
> I think saving registers in a subroutine is a pain no matter how it's
> implemented. If the VRSAVE is used as a count, the subroutine still has to
> save the old value, save the overwritten registers, calculate what the
> proper new value is (think new < old = oops!) then restore the overwritten
> registers and old VRSAVE value when it exits.

In the end a bitmap seems the best, since the code can be free of
conditionals and fairly compact:
- at start of routine (register numbers chosen randomly):
        mfspr   r12,vrsave
        oris    r0,r12,0x....   # mask of used bits
        ori     r0,r0,0x....    # mask of used bits (only if using vr16-vr31)
        stw     r12,somewhere on the stack
        mtspr   vrsave,r0

and at the end:
        lwz     r12,somewhere on the stack
        mtspr   vrsave,r12

> >OTOH, if the register usage is designed similarly to integer and FP, the
> >bitmask might look like 111...1100...0011...11 (i.e. with at most 2
> >transitions between 0 and 1 in the bit string). It might be worth
> >optimizing the save/restore routine for this case, saving/restoring more
> >registers than necessary when vrsave does not have such a canonical form.
>
> Hmm, count bits in from the left and right, mask and check for missed bits,
> then branch to either a full save or a left+right save.

Yes, cntlzw on a vrsave copy (after a few simple manipulations) is your
friend. Besides this the ABI separates two ranges: R0 to R13 and R14
to R31 (I could be off by one). Optimize for the common case, find the
first set bit with index >=14 and last set bit with index <=13 and save
only these 2 ranges. Optimizing for more complex cases is not worth the
trouble, just ensure that they work properly.

> Doing it that way would also somewhat optimize VRSAVE=0, since both the
> leftmost and rightmost bits are 0, it would pass right through the
> left-save and right-save half of the optimized register save.

I would also optimize specifically for vrsave=0: a compare and a
conditional branch are not that costly, especially if the branch is done
well after the compare, with all the bitmap manipulation in between:

        mfspr   r3,vrsave
        cmpwi   cr1,r3,0
        andis.  r4,r3,0xfffc
        rlwinm  r5,r3,0,0x0003ffff
        neg     r6,r4
        cntlzw  r5,r5           # first register of r14..r31 to save
        and     r4,r4,r6
        cntlzw  r4,r4           # last register of r0..r13 to save
        beq     cr1,nothing_to_save

It's not finished: you have to set up registers to address the save area
and compute a branch inside the save routine to actually perform the save
(backwards for r0..r13, forwards for r14..r31).

> Perhaps a little longer than "if (VRSAVE==0) return;", but it's quick
> enough for me.

Probably close to optimal, lazy enough without trying to be too smart and
executing tons of code as the result. Never forget that this code is
unlikely to be found in L1 icache.

> >Indeed, I had not considered this problem. Note that conditional clearing
> >of most registers can probably be done without conditional branches. Just
> >put a copy of vrsave in one vr and then find a smart way to transform
> >these bits into masks to clear the registers (probably you'll have to splat
> >it first). It won't work for the last register(s) because you need
> >some workspace, however.
>
> Mmm, the joys of a bitwise AND.

Well, after having a more detailed look at Altivec, I missed a shift
by immediate amount in bits to make the code as compact as possible. There
are probably tricks to work around this, I might have started with the
wrong idea on the way to implement this...

Gabriel.
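Editorial sketch: a C model of what the mask/neg/and/cntlzw sequence in the message above computes, for readers who want to check the arithmetic. The function and type names (`cntlzw`, `split_ranges`, `struct vr_ranges`) are hypothetical, the loop-based `cntlzw` is a portable stand-in for the instruction (assumed to return 32 for a zero input), and the bit layout (most significant bit = register 0) is an assumption.

```c
#include <stdint.h>

/* Portable stand-in for the cntlzw instruction: number of leading zero
   bits, 32 when the input is zero. */
static unsigned cntlzw(uint32_t x)
{
    unsigned n = 0;
    while (n < 32 && !(x & (UINT32_C(0x80000000) >> n)))
        n++;
    return n;
}

struct vr_ranges {
    unsigned last_low;   /* highest live register in 0..13, 32 if none  */
    unsigned first_high; /* lowest live register in 14..31, 32 if none */
};

static struct vr_ranges split_ranges(uint32_t vrsave)
{
    uint32_t low  = vrsave & UINT32_C(0xfffc0000); /* registers 0..13  */
    uint32_t high = vrsave & UINT32_C(0x0003ffff); /* registers 14..31 */

    /* neg + and: x & -x isolates the least significant set bit, which is
       the highest-numbered live register in the low range. */
    uint32_t lowest = low & (0u - low);

    struct vr_ranges r = { cntlzw(lowest), cntlzw(high) };
    return r;
}
```

For example, a bitmap marking registers 0 to 2 and 20 to 31 (0xE0000FFF) gives last_low = 2 and first_high = 20, so the save routine would cover 0..2 and 20..31.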
* Re: question about altivec registers
  1999-10-29 12:49 ` Gabriel Paubert
@ 1999-10-30 4:14 ` Tony Mantler
  0 siblings, 0 replies; 23+ messages in thread
From: Tony Mantler @ 1999-10-30 4:14 UTC
To: Gabriel Paubert; +Cc: linuxppc-dev

At 7:49 AM -0500 10/29/99, Gabriel Paubert wrote:
>On Thu, 28 Oct 1999, Tony Mantler wrote:
>
>> I suppose I'm a bit too used to 68k stuff, where sorting register usage
>> takes a back seat to efficient register re-use. However, with the size of
>> the data in the Altivec registers, I would expect a bit of optimization to
>> slant away from cases where the registers can be easily sorted and packed.
>
>Things are different when all registers are identical and instructions
>have separate operands for inputs and the output. I've programmed 68k too
>and it's often painful (Intel is worse, to be fair).

I haven't found it too bad. It's rather sensibly designed for its intended
applications, and considering that it was originally laid out way back in
the early 80's (iirc), it's stood the test of time rather well.

[...]
>> I think saving registers in a subroutine is a pain no matter how it's
>> implemented. If the VRSAVE is used as a count, the subroutine still has to
>> save the old value, save the overwritten registers, calculate what the
>> proper new value is (think new < old = oops!) then restore the overwritten
>> registers and old VRSAVE value when it exits.
>
>In the end a bitmap seems the best, since the code can be free of
>conditionals and fairly compact:
>- at start of routine (register numbers chosen randomly):
>        mfspr   r12,vrsave
>        oris    r0,r12,0x....   # mask of used bits
>        ori     r0,r0,0x....    # mask of used bits (only if using vr16-vr31)
>        stw     r12,somewhere on the stack
>        mtspr   vrsave,r0
>
>and at the end:
>        lwz     r12,somewhere on the stack
>        mtspr   vrsave,r12

Looks clean enough to me.

[...]
>Yes, cntlzw on a vrsave copy (after a few simple manipulations) is your
>friend. Besides this the ABI separates two ranges: R0 to R13 and R14
>to R31 (I could be off by one). Optimize for the common case, find the
>first set bit with index >=14 and last set bit with index <=13 and save
>only these 2 ranges. Optimizing for more complex cases is not worth the
>trouble, just ensure that they work properly.

Indeed.

>> Doing it that way would also somewhat optimize VRSAVE=0, since both the
>> leftmost and rightmost bits are 0, it would pass right through the
>> left-save and right-save half of the optimized register save.
>
>I would also optimize specifically for vrsave=0: a compare and a
>conditional branch are not that costly, especially if the branch is done
>well after the compare, with all the bitmap manipulation in between:
>
>        mfspr   r3,vrsave
>        cmpwi   cr1,r3,0
>        andis.  r4,r3,0xfffc
>        rlwinm  r5,r3,0,0x0003ffff
>        neg     r6,r4
>        cntlzw  r5,r5           # first register of r14..r31 to save
>        and     r4,r4,r6
>        cntlzw  r4,r4           # last register of r0..r13 to save
>        beq     cr1,nothing_to_save
>
>It's not finished: you have to set up registers to address the save area
>and compute a branch inside the save routine to actually perform the save
>(backwards for r0..r13, forwards for r14..r31).

Yeah, one extra branch certainly won't kill anyone.

[.. clearing unused registers ..]
>Well, after having a more detailed look at Altivec, I missed a shift
>by immediate amount in bits to make the code as compact as possible. There
>are probably tricks to work around this, I might have started with the
>wrong idea on the way to implement this...

Hmm, I just re-read the altivec spec sheet and, though I wouldn't call
myself an expert on PPC, it would seem that there are 3 ways to clear the
registers.

The first way would be to use a bunch of branch conditionals, which we
probably want to avoid.

The second way would be to calculate a 0 or -1 entirely within the vector
unit, which would both use a bunch of vector registers, and probably be
rather messy, as it's not really what the vector unit is designed for.

The third would be to calculate a 0 or -1 in the GPRs, then copy and splat
it into a vector register. Unfortunately it would appear that copying a
value from a GPR to a vector register can only be done by writing the value
to memory, then reading it back in again, which isn't very pretty at all.

Oh well, time to watch Southpark, filmed in hella-cool
((( Spooooky-vision ))) ;)

--
Tony Mantler Renaissance Nerd Extraordinaire eek@escape.ca
Winnipeg, Manitoba, Canada http://www.escape.ca/~eek
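Editorial sketch: the "calculate a 0 or -1 in the GPRs" option from the message above, modelled in C so the masking arithmetic is visible. This says nothing about how the value would reach a real vector register. `struct vreg`, `live_mask` and `clear_unmarked` are hypothetical names, the register file is an ordinary array, and the bit layout (most significant bit = register 0) is an assumption.

```c
#include <stdint.h>

/* A 128-bit vector register modelled as four 32-bit words (assumption:
   purely a software model, not a real register file). */
struct vreg {
    uint32_t w[4];
};

/* All ones if register n is marked live in vrsave, zero otherwise.
   The subtraction turns the single bit 0/1 into 0/0xffffffff without
   a conditional branch. */
static uint32_t live_mask(uint32_t vrsave, unsigned n)
{
    return 0u - ((vrsave >> (31u - n)) & 1u);
}

/* Clear every register that vrsave does not mark as live by ANDing it
   with the per-register mask; live registers are left untouched. */
static void clear_unmarked(struct vreg regs[32], uint32_t vrsave)
{
    for (unsigned n = 0; n < 32; n++) {
        uint32_t m = live_mask(vrsave, n);
        for (unsigned i = 0; i < 4; i++)
            regs[n].w[i] &= m;
    }
}
```

The loops here are only for brevity; the point is that the per-register decision is a bitwise AND with a 0 or -1 mask rather than a test and branch.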
Thread overview: 23+ messages
1999-10-25 20:51 question about altivec registers Jim Terman
1999-10-25 22:27 ` Claude Robitaille
1999-10-25 22:31 ` Jim Terman
1999-10-25 22:44 ` erik cameron
1999-10-25 23:28 ` Claude Robitaille
[not found] ` <Pine.LNX.4.10.9910251916060.5902-100000@modemcable220.93-200-24.mtl.mc.videotron.net>
1999-10-25 23:53 ` Rob Barris
1999-10-26 18:22 ` Geert Uytterhoeven
1999-10-26 22:13 ` Rob Barris
1999-10-26 22:38 ` Tom Vier
1999-10-26 22:03 ` Tom Vier
1999-10-26 4:42 ` Kumar Gala
1999-10-26 21:52 ` Jim Terman
1999-10-26 22:43 ` Kumar Gala
1999-10-27 8:58 ` Adrian Cox
1999-10-27 13:21 ` Gabriel Paubert
1999-10-27 16:05 ` Geert Uytterhoeven
1999-10-27 18:23 ` Kumar Gala
1999-10-27 22:39 ` Tony Mantler
1999-10-28 11:01 ` Gabriel Paubert
1999-10-28 21:20 ` Tony Mantler
1999-10-29 11:58 ` Benjamin Herrenschmidt
1999-10-29 12:49 ` Gabriel Paubert
1999-10-30 4:14 ` Tony Mantler