* [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
@ 2017-10-09 11:07 Sandipan Das
2017-10-09 13:49 ` David Laight
2017-10-09 14:45 ` Naveen N. Rao
0 siblings, 2 replies; 8+ messages in thread
From: Sandipan Das @ 2017-10-09 11:07 UTC
To: mpe; +Cc: naveen.n.rao, paulus, anton, segher, linuxppc-dev
According to the GCC documentation, the behaviour of __builtin_clz()
and __builtin_clzl() is undefined if the value of the input argument
is zero. Without handling this special case, these builtins have been
used for emulating the following instructions:
* Count Leading Zeros Word (cntlzw[.])
* Count Leading Zeros Doubleword (cntlzd[.])
This fixes the emulated behaviour of these instructions by adding an
additional check for this special case.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
---
arch/powerpc/lib/sstep.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 0f7e41bd7e88..ebbc0b92650c 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1717,11 +1717,19 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
* Logical instructions
*/
case 26: /* cntlzw */
- op->val = __builtin_clz((unsigned int) regs->gpr[rd]);
+ val = (unsigned int) regs->gpr[rd];
+ if (val == 0)
+ op->val = 32;
+ else
+ op->val = __builtin_clz(val);
goto logical_done;
#ifdef __powerpc64__
case 58: /* cntlzd */
- op->val = __builtin_clzl(regs->gpr[rd]);
+ val = regs->gpr[rd];
+ if (val == 0)
+ op->val = 64;
+ else
+ op->val = __builtin_clzl(val);
goto logical_done;
#endif
case 28: /* and */
--
2.13.6
* RE: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 11:07 [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions Sandipan Das
@ 2017-10-09 13:49 ` David Laight
2017-10-09 14:20 ` Segher Boessenkool
2017-10-09 14:45 ` Naveen N. Rao
1 sibling, 1 reply; 8+ messages in thread
From: David Laight @ 2017-10-09 13:49 UTC
To: 'Sandipan Das', mpe@ellerman.id.au
Cc: linuxppc-dev@lists.ozlabs.org, naveen.n.rao@linux.vnet.ibm.com,
paulus@samba.org, anton@samba.org
From: Sandipan Das
> Sent: 09 October 2017 12:07
> According to the GCC documentation, the behaviour of __builtin_clz()
> and __builtin_clzl() is undefined if the value of the input argument
> is zero. Without handling this special case, these builtins have been
> used for emulating the following instructions:
> * Count Leading Zeros Word (cntlzw[.])
> * Count Leading Zeros Doubleword (cntlzd[.])
>
> This fixes the emulated behaviour of these instructions by adding an
> additional check for this special case.
Presumably the result is undefined because the underlying cpu
instruction is used - and its return value is implementation defined.
Here you are emulating the cpu instruction - so executing one will
give the same answer as if the 'real' one were executed.
Indeed it might be worth an asm statement that definitely executes
the cpu instruction?
David
>
> Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
> ---
> arch/powerpc/lib/sstep.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 0f7e41bd7e88..ebbc0b92650c 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1717,11 +1717,19 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> * Logical instructions
> */
> case 26: /* cntlzw */
> - op->val = __builtin_clz((unsigned int) regs->gpr[rd]);
> + val = (unsigned int) regs->gpr[rd];
> + if (val == 0)
> + op->val = 32;
> + else
> + op->val = __builtin_clz(val);
> goto logical_done;
> #ifdef __powerpc64__
> case 58: /* cntlzd */
> - op->val = __builtin_clzl(regs->gpr[rd]);
> + val = regs->gpr[rd];
> + if (val == 0)
> + op->val = 64;
> + else
> + op->val = __builtin_clzl(val);
> goto logical_done;
> #endif
> case 28: /* and */
> --
> 2.13.6
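
For illustration only, the asm-statement idea raised in this reply could look roughly like the sketch below. The helper name hw_cntlzw() is hypothetical and is not part of sstep.c; the non-powerpc branch is an assumed stand-in so that the sketch also builds on other hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from sstep.c: execute the real cntlzw on powerpc. */
static inline unsigned int hw_cntlzw(uint32_t x)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	unsigned int n;

	/* cntlzw counts leading zeros of the low 32 bits; a zero input gives 32. */
	__asm__("cntlzw %0,%1" : "=r" (n) : "r" (x));
	return n;
#else
	/* Assumed portable stand-in: never hand 0 to the builtin. */
	return x ? (unsigned int)__builtin_clz(x) : 32;
#endif
}

/* Small self-check showing the expected results. */
static inline void hw_cntlzw_selfcheck(void)
{
	assert(hw_cntlzw(0) == 32);
	assert(hw_cntlzw(1) == 31);
	assert(hw_cntlzw(0x80000000u) == 0);
}
```

The asm variant only works when the emulation code itself runs on powerpc, which is the case for sstep.c, whereas the guarded builtin leaves instruction selection to the compiler.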
* Re: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 13:49 ` David Laight
@ 2017-10-09 14:20 ` Segher Boessenkool
2017-10-09 14:43 ` David Laight
0 siblings, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2017-10-09 14:20 UTC
To: David Laight
Cc: 'Sandipan Das', mpe@ellerman.id.au, paulus@samba.org,
naveen.n.rao@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,
anton@samba.org
On Mon, Oct 09, 2017 at 01:49:26PM +0000, David Laight wrote:
> From: Sandipan Das
> > Sent: 09 October 2017 12:07
> > According to the GCC documentation, the behaviour of __builtin_clz()
> > and __builtin_clzl() is undefined if the value of the input argument
> > is zero. Without handling this special case, these builtins have been
> > used for emulating the following instructions:
> > * Count Leading Zeros Word (cntlzw[.])
> > * Count Leading Zeros Doubleword (cntlzd[.])
> >
> > This fixes the emulated behaviour of these instructions by adding an
> > additional check for this special case.
>
> Presumably the result is undefined because the underlying cpu
> instruction is used - and its return value is implementation defined.
It is undefined because the result is undefined, and the compiler
optimises based on that. The return value of the builtin is undefined,
not implementation defined.
The patch is correct.
Segher
* RE: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 14:20 ` Segher Boessenkool
@ 2017-10-09 14:43 ` David Laight
2017-10-09 14:47 ` naveen.n.rao
2017-10-09 14:47 ` Segher Boessenkool
0 siblings, 2 replies; 8+ messages in thread
From: David Laight @ 2017-10-09 14:43 UTC
To: 'Segher Boessenkool'
Cc: 'Sandipan Das', mpe@ellerman.id.au, paulus@samba.org,
naveen.n.rao@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,
anton@samba.org
From: Segher Boessenkool
> Sent: 09 October 2017 15:21
> On Mon, Oct 09, 2017 at 01:49:26PM +0000, David Laight wrote:
> > From: Sandipan Das
> > > Sent: 09 October 2017 12:07
> > > According to the GCC documentation, the behaviour of __builtin_clz()
> > > and __builtin_clzl() is undefined if the value of the input argument
> > > is zero. Without handling this special case, these builtins have been
> > > used for emulating the following instructions:
> > > * Count Leading Zeros Word (cntlzw[.])
> > > * Count Leading Zeros Doubleword (cntlzd[.])
> > >
> > > This fixes the emulated behaviour of these instructions by adding an
> > > additional check for this special case.
> >
> > Presumably the result is undefined because the underlying cpu
> > instruction is used - and its return value is implementation defined.
>
> It is undefined because the result is undefined, and the compiler
> optimises based on that. The return value of the builtin is undefined,
> not implementation defined.
>
> The patch is correct.
But the code you are emulating might be relying on the (un)defined value
the cpu instruction gives for zero input rather than the input width.
Or, put another way, if the return value for a clz instruction with zero
argument is undefined (as it is on x86 - intel and amd may differ) then the
emulation can return any value since the code can't care.
So the conditional is not needed.
David
* Re: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 14:43 ` David Laight
@ 2017-10-09 14:47 ` naveen.n.rao
2017-10-09 15:24 ` David Laight
2017-10-09 14:47 ` Segher Boessenkool
1 sibling, 1 reply; 8+ messages in thread
From: naveen.n.rao @ 2017-10-09 14:47 UTC
To: David Laight
Cc: 'Segher Boessenkool', 'Sandipan Das',
mpe@ellerman.id.au, paulus@samba.org,
linuxppc-dev@lists.ozlabs.org, anton@samba.org
On 2017/10/09 02:43PM, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 09 October 2017 15:21
> > On Mon, Oct 09, 2017 at 01:49:26PM +0000, David Laight wrote:
> > > From: Sandipan Das
> > > > Sent: 09 October 2017 12:07
> > > > According to the GCC documentation, the behaviour of __builtin_clz()
> > > > and __builtin_clzl() is undefined if the value of the input argument
> > > > is zero. Without handling this special case, these builtins have been
> > > > used for emulating the following instructions:
> > > > * Count Leading Zeros Word (cntlzw[.])
> > > > * Count Leading Zeros Doubleword (cntlzd[.])
> > > >
> > > > This fixes the emulated behaviour of these instructions by adding an
> > > > additional check for this special case.
> > >
> > > Presumably the result is undefined because the underlying cpu
> > > instruction is used - and its return value is implementation defined.
> >
> > It is undefined because the result is undefined, and the compiler
> > optimises based on that. The return value of the builtin is undefined,
> > not implementation defined.
> >
> > The patch is correct.
>
> But the code you are emulating might be relying on the (un)defined value
> the cpu instruction gives for zero input rather than the input width.
>
> Or, put another way, if the return value for a clz instruction with zero
> argument is undefined (as it is on x86 - intel and amd may differ) then the
> emulation can return any value since the code can't care.
> So the conditional is not needed.
This is about the behavior of the gcc builtin being undefined, rather
than the actual cpu instruction itself.
- Naveen
* RE: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 14:47 ` naveen.n.rao
@ 2017-10-09 15:24 ` David Laight
0 siblings, 0 replies; 8+ messages in thread
From: David Laight @ 2017-10-09 15:24 UTC
To: 'naveen.n.rao@linux.vnet.ibm.com'
Cc: 'Sandipan Das', paulus@samba.org, anton@samba.org,
linuxppc-dev@lists.ozlabs.org
From: naveen.n.rao@linux.vnet.ibm.com
> Sent: 09 October 2017 15:48
...
> This is about the behavior of the gcc builtin being undefined, rather
> than the actual cpu instruction itself.
I'd have hoped that the gcc builtin just generated the expected cpu instruction.
So it is only undefined because it is very cpu dependent.
More problematic here would be any cpu flag register settings.
eg: x86 would set the 'Z' flag for zero input.
David
* Re: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 14:43 ` David Laight
2017-10-09 14:47 ` naveen.n.rao
@ 2017-10-09 14:47 ` Segher Boessenkool
1 sibling, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2017-10-09 14:47 UTC
To: David Laight
Cc: 'Sandipan Das', mpe@ellerman.id.au, paulus@samba.org,
naveen.n.rao@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,
anton@samba.org
On Mon, Oct 09, 2017 at 02:43:45PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 09 October 2017 15:21
> > On Mon, Oct 09, 2017 at 01:49:26PM +0000, David Laight wrote:
> > > From: Sandipan Das
> > > > Sent: 09 October 2017 12:07
> > > > According to the GCC documentation, the behaviour of __builtin_clz()
> > > > and __builtin_clzl() is undefined if the value of the input argument
> > > > is zero. Without handling this special case, these builtins have been
> > > > used for emulating the following instructions:
> > > > * Count Leading Zeros Word (cntlzw[.])
> > > > * Count Leading Zeros Doubleword (cntlzd[.])
> > > >
> > > > This fixes the emulated behaviour of these instructions by adding an
> > > > additional check for this special case.
> > >
> > > Presumably the result is undefined because the underlying cpu
> > > instruction is used - and its return value is implementation defined.
> >
> > It is undefined because the result is undefined, and the compiler
> > optimises based on that. The return value of the builtin is undefined,
> > not implementation defined.
> >
> > The patch is correct.
>
> But the code you are emulating might be relying on the (un)defined value
> the cpu instruction gives for zero input rather than the input width.
>
> Or, put another way, if the return value for a clz instruction with zero
> argument is undefined (as it is on x86 - intel and amd may differ) then the
> emulation can return any value since the code can't care.
> So the conditional is not needed.
The cntlz[wd][.] insn has defined behaviour for 0 input. It's just the
builtin that does not. So we shouldn't call the builtin with an input
of 0 -- exactly what this patch does -- and that is all that was wrong.
Segher
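
As a rough illustration of the point above, the sketch below compares guarded builtins against a bit-by-bit reference that returns the operand width for a zero input, which is the result the thread states for cntlzw (32) and cntlzd (64). All names here (ref_clz, emu_cntlzw, emu_cntlzd, check_against_reference) are hypothetical, and __builtin_clzll() is used instead of the patch's __builtin_clzl() on the assumption that the sketch may be built where long is 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: count zero bits from the top of a width-bit value; 0 gives width. */
static inline unsigned int ref_clz(uint64_t x, unsigned int width)
{
	unsigned int n = 0;
	uint64_t bit = (uint64_t)1 << (width - 1);

	while (bit && !(x & bit)) {
		n++;
		bit >>= 1;
	}
	return n;
}

/* Guarded builtin for the word case, in the if/else shape of the patch. */
static inline unsigned int emu_cntlzw(uint32_t x)
{
	if (x == 0)
		return 32;
	return (unsigned int)__builtin_clz(x);
}

/* Guarded builtin for the doubleword case. */
static inline unsigned int emu_cntlzd(uint64_t x)
{
	if (x == 0)
		return 64;
	return (unsigned int)__builtin_clzll(x);
}

/* Compare both emulations with the reference on a few sample values. */
static inline void check_against_reference(void)
{
	static const uint64_t samples[] = {
		0, 1, 0x80000000u, 0xffffffffu, 0x100000000ull, UINT64_MAX
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		uint32_t w = (uint32_t)samples[i];

		assert(emu_cntlzw(w) == ref_clz(w, 32));
		assert(emu_cntlzd(samples[i]) == ref_clz(samples[i], 64));
	}
}
```

The builtin is only ever reached with a non-zero argument, so its undefined case never comes into play.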
* Re: [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions
2017-10-09 11:07 [PATCH] powerpc/lib/sstep: Fix count leading zeros instructions Sandipan Das
2017-10-09 13:49 ` David Laight
@ 2017-10-09 14:45 ` Naveen N. Rao
1 sibling, 0 replies; 8+ messages in thread
From: Naveen N. Rao @ 2017-10-09 14:45 UTC
To: Sandipan Das; +Cc: mpe, paulus, anton, segher, linuxppc-dev
On 2017/10/09 11:07AM, Sandipan Das wrote:
> According to the GCC documentation, the behaviour of __builtin_clz()
> and __builtin_clzl() is undefined if the value of the input argument
> is zero. Without handling this special case, these builtins have been
> used for emulating the following instructions:
> * Count Leading Zeros Word (cntlzw[.])
> * Count Leading Zeros Doubleword (cntlzd[.])
>
> This fixes the emulated behaviour of these instructions by adding an
> additional check for this special case.
So:
Fixes: 3cdfcbfd32b9d ("powerpc: Change analyse_instr so it doesn't modify *regs")
>
> Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
> ---
> arch/powerpc/lib/sstep.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 0f7e41bd7e88..ebbc0b92650c 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1717,11 +1717,19 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> * Logical instructions
> */
> case 26: /* cntlzw */
> - op->val = __builtin_clz((unsigned int) regs->gpr[rd]);
> + val = (unsigned int) regs->gpr[rd];
> + if (val == 0)
> + op->val = 32;
> + else
> + op->val = __builtin_clz(val);
Can be made more compact:
op->val = ( val ? __builtin_clz(val) : 32 );
- Naveen
> goto logical_done;
> #ifdef __powerpc64__
> case 58: /* cntlzd */
> - op->val = __builtin_clzl(regs->gpr[rd]);
> + val = regs->gpr[rd];
> + if (val == 0)
> + op->val = 64;
> + else
> + op->val = __builtin_clzl(val);
> goto logical_done;
> #endif
> case 28: /* and */
> --
> 2.13.6
>
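
A minimal standalone sketch of the compact form suggested above, applied to both opcodes. The struct demo_op type and the demo_cntlz() function are hypothetical stand-ins for struct instruction_op and the relevant switch cases in analyse_instr(); __builtin_clzll() replaces __builtin_clzl() here on the assumption that the sketch may be built where long is 32 bits.

```c
#include <assert.h>

/* Hypothetical stand-in for the part of struct instruction_op used here. */
struct demo_op {
	unsigned long val;
};

/* Returns 0 if xop is one of the two count-leading-zeros opcodes, else -1. */
static inline int demo_cntlz(struct demo_op *op, unsigned int xop,
			     unsigned long long reg)
{
	unsigned int word;

	switch (xop) {
	case 26:	/* cntlzw: only the low 32 bits of the register count */
		word = (unsigned int)reg;
		op->val = word ? __builtin_clz(word) : 32;
		return 0;
	case 58:	/* cntlzd: the full 64-bit register */
		op->val = reg ? __builtin_clzll(reg) : 64;
		return 0;
	}
	return -1;
}
```

Note that the word case must test the truncated value: a register whose low 32 bits are zero but whose high bits are set still yields 32.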