qemu-devel.nongnu.org archive mirror
* [Qemu-devel]  Why some ARM NEON helper functions need mask?
@ 2011-10-30 11:39 陳韋任
  2011-10-30 12:06 ` Max Filippov
  2011-10-30 18:21 ` Chih-Min Chao
  0 siblings, 2 replies; 4+ messages in thread
From: 陳韋任 @ 2011-10-30 11:39 UTC (permalink / raw)
  To: qemu-devel

Hi, all

  I am looking into QEMU's implementation of the ARM NEON instructions
(target-arm/neon_helper.c). Some helper functions, neon_add_u8 for
example, apply a mask. I thought simply adding a and b would be
enough, and I can't figure out why the mask operation is needed.

---
uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b)
{
    uint32_t mask;
    mask = (a ^ b) & 0x80808080u;
    a &= ~0x80808080u;
    b &= ~0x80808080u;
    return (a + b) ^ mask;
}
---

  Any help is appreciated.

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667


* Re: [Qemu-devel] Why some ARM NEON helper functions need mask?
  2011-10-30 11:39 [Qemu-devel] Why some ARM NEON helper functions need mask? 陳韋任
@ 2011-10-30 12:06 ` Max Filippov
  2011-10-30 18:21 ` Chih-Min Chao
  1 sibling, 0 replies; 4+ messages in thread
From: Max Filippov @ 2011-10-30 12:06 UTC (permalink / raw)
  To: qemu-devel; +Cc: 陳韋任

>   I am looking into QEMU's implementation for ARM NEON instructions
> (target-arm/neon_helper.c). Some helper functions will do mask
> operation, neon_add_u8, for example. I thought simply adding a and b
> is enough and can't figure out why the mask operation is needed.

These are SIMD instructions that act on independent data 'lanes' packed into a larger data item.
Operations on one lane must not interfere with the other lanes.
 
> ---
> uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b)
> {
>     uint32_t mask;
>1:     mask = (a ^ b) & 0x80808080u;
>2:     a &= ~0x80808080u;
>3:     b &= ~0x80808080u;
>4:     return (a + b) ^ mask;
> }
> ---

In your example there are four 8-bit lanes packed into a 32-bit word.
If we add the whole 32-bit words, care must be taken to prevent a carry out of one lane from propagating into the next.
This is done by clearing the top bit of each 8-bit operand (steps 2 and 3), so that each lane's 7-bit sum fits within its own 8 bits.
The top bits are summed modulo 2 separately (step 1) and then XORed back into the result (step 4), which also accounts for the carry into each lane's top bit.
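
As a rough check of the four steps, here is a minimal C sketch that compares the masked addition with a lane-by-lane reference. The names add_u8_per_lane, add_u8_masked and check_add_u8 are made up for illustration and are not QEMU functions; the HELPER() macro is replaced by a plain function, and the pseudo-random inputs are only an arbitrary spread, not an exhaustive proof.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: add each 8-bit lane on its own, discarding its carry out. */
static uint32_t add_u8_per_lane(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        /* Truncating to uint8_t keeps only this lane's 8-bit sum. */
        uint8_t lane = (uint8_t)((a >> shift) + (b >> shift));
        result |= (uint32_t)lane << shift;
    }
    return result;
}

/* Same arithmetic as the quoted helper, written as a plain function. */
static uint32_t add_u8_masked(uint32_t a, uint32_t b)
{
    uint32_t mask = (a ^ b) & 0x80808080u; /* step 1: top bits summed modulo 2 */
    a &= ~0x80808080u;                     /* step 2: clear top bit of each lane */
    b &= ~0x80808080u;                     /* step 3: same for b */
    return (a + b) ^ mask;                 /* step 4: fold the top bits back in */
}

/* Compare both versions over a spread of inputs; asserts on any mismatch. */
static void check_add_u8(void)
{
    uint32_t x = 0x12345678u; /* arbitrary seed (assumption) */
    for (int i = 0; i < 100000; i++) {
        /* Simple linear congruential step, used only to vary the inputs. */
        x = x * 1664525u + 1013904223u;
        uint32_t a = x;
        x = x * 1664525u + 1013904223u;
        uint32_t b = x;
        assert(add_u8_masked(a, b) == add_u8_per_lane(a, b));
    }
}
```

Calling check_add_u8() from a test program should pass silently if the two versions agree on every sampled input.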

Thanks.
-- Max


* Re: [Qemu-devel] Why some ARM NEON helper functions need mask?
  2011-10-30 11:39 [Qemu-devel] Why some ARM NEON helper functions need mask? 陳韋任
  2011-10-30 12:06 ` Max Filippov
@ 2011-10-30 18:21 ` Chih-Min Chao
  2011-10-31  7:48   ` 陳韋任
  1 sibling, 1 reply; 4+ messages in thread
From: Chih-Min Chao @ 2011-10-30 18:21 UTC (permalink / raw)
  To: 陳韋任; +Cc: qemu-devel

On Sun, Oct 30, 2011 at 7:39 PM, 陳韋任 <chenwj@iis.sinica.edu.tw> wrote:
> Hi, all
>
>  I am looking into QEMU's implementation for ARM NEON instructions
> (target-arm/neon_helper.c). Some helper functions will do mask
> operation, neon_add_u8, for example. I thought simply adding a and b
> is enough and can't figure out why the mask operation is needed.
>
> ---
> uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b)
> {
>    uint32_t mask;
>    mask = (a ^ b) & 0x80808080u;
>    a &= ~0x80808080u;
>    b &= ~0x80808080u;
>    return (a + b) ^ mask;
> }
> ---
>
For example, treat each word as four 8-bit lanes:

a = 0x01 01 01 01
b = 0xFF FF FF FF

The expected lane-wise result of a + b is
0x00 00 00 00

whereas simply adding a and b as 32-bit words gives
0x01 01 01 00
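
The same example can be sketched in C to show where the carries end up. The function names add_u32_plain, add_u8_lanes and carry_leak are hypothetical, chosen only for this illustration; add_u8_lanes repeats the arithmetic of the quoted helper without the HELPER() macro.

```c
#include <stdint.h>

/* Plain 32-bit addition: carries cross lane boundaries. */
static uint32_t add_u32_plain(uint32_t a, uint32_t b)
{
    return a + b; /* unsigned arithmetic wraps modulo 2^32 */
}

/* Lane-wise addition using the same masking as the quoted helper. */
static uint32_t add_u8_lanes(uint32_t a, uint32_t b)
{
    uint32_t mask = (a ^ b) & 0x80808080u;
    a &= ~0x80808080u;
    b &= ~0x80808080u;
    return (a + b) ^ mask;
}

/* Bits where the plain sum differs from the lane-wise sum. */
static uint32_t carry_leak(uint32_t a, uint32_t b)
{
    return add_u32_plain(a, b) ^ add_u8_lanes(a, b);
}
```

With a = 0x01010101 and b = 0xFFFFFFFF, add_u32_plain returns 0x01010100, add_u8_lanes returns 0x00000000, and carry_leak returns 0x01010100: each of the three upper lanes picked up a carry from the lane below it.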


* Re: [Qemu-devel] Why some ARM NEON helper functions need mask?
  2011-10-30 18:21 ` Chih-Min Chao
@ 2011-10-31  7:48   ` 陳韋任
  0 siblings, 0 replies; 4+ messages in thread
From: 陳韋任 @ 2011-10-31  7:48 UTC (permalink / raw)
  To: Max Filippov, Chih-Min Chao; +Cc: qemu-devel, 陳韋任


  Thanks, Max and Chih-Min. It's much clearer to me now.

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667

