From mboxrd@z Thu Jan  1 00:00:00 1970
From: Cary Coutant
Date: Tue, 17 Oct 2000 22:40:06 +0000
Subject: Re: [Linux-ia64] avoiding float underflow software assist
Message-Id:
List-Id:
References:
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-ia64@vger.kernel.org

>There's this method of masking just the control bits of sf0 into sf* if
>you'd rather not provide direct library access to the fpsr. E.g., to
>set ftz in s0:
>
>    asm volatile("fsetc.s0 0x7f, 0x01");
>
>Or dup s0 into s2:
>
>    asm volatile("fsetc.s2 0x7f, 0x00");
>
>Not sure if/why intel gave us this instruction. Perhaps switching speed
>is important to some apps.

This is definitely your best approach to setting the ftz bit:

    fsetc.s0 0x7f,0x01
    fsetc.s2 0,0x40
    fsetc.s3 0,0x40

This instruction is there to facilitate keeping the control bits of sf2
and sf3 in sync with those of sf0, as mandated by the runtime
architecture document. It makes it very easy for the compiler to change
a control bit temporarily in sf2 or sf3 for one operation or a few, then
restore it to its proper state.

Note the 0x40 when setting sf2 and sf3 -- the td (trap disable) bits
should be set for these status fields (also mandated by the runtime
architecture document).

-cary
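
For readers following the mask arithmetic, here is a small portable C sketch of what the two fsetc immediates appear to do. It is a model only, not the instruction itself. It assumes fsetc computes (sf0.controls AND and-mask) OR or-mask and writes the result to the named status field, and that the seven control bits are laid out with ftz in bit 0 and td in bit 6, which matches the 0x01 and 0x40 values used in this thread. The name fsetc_model and the enum constants are hypothetical, introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the 7 control bits in one FPSR status field. */
enum {
    SF_FTZ = 0x01,           /* flush-to-zero */
    SF_WRE = 0x02,           /* widest-range exponent */
    SF_PC  = 0x0c,           /* precision control (2 bits) */
    SF_RC  = 0x30,           /* rounding control (2 bits) */
    SF_TD  = 0x40,           /* traps disabled */
    SF_CONTROLS_MASK = 0x7f  /* all seven control bits */
};

/*
 * Hypothetical model of "fsetc.sfN amask, omask".
 * Takes the current sf0 control bits and returns the control bits
 * that would be written to the target status field, assuming the
 * semantics (sf0.controls & amask) | omask.
 */
uint8_t fsetc_model(uint8_t sf0_controls, uint8_t amask, uint8_t omask)
{
    /* Both immediates are 7-bit fields. */
    assert(amask <= SF_CONTROLS_MASK);
    assert(omask <= SF_CONTROLS_MASK);
    return (uint8_t)(((sf0_controls & amask) | omask) & SF_CONTROLS_MASK);
}
```

Under this model, starting from sf0 controls of 0x3c, fsetc_model(0x3c, 0x7f, 0x01) keeps every sf0 bit and adds ftz, giving 0x3d. An and-mask of 0x7f with an or-mask of 0x40 carries the sf0 bits over and adds td, while an and-mask of 0 yields just the or-mask (0x40) whatever sf0 holds.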