* [RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
@ 2015-01-09 23:47 Tobias Klausmann
[not found] ` <1420847276-8754-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-09 23:47 UTC (permalink / raw)
To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++
1 file changed, 109 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 9a0bb60..6a3d515 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -997,6 +997,115 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
i->op = OP_MOV;
break;
}
+ case OP_CVT: {
+ Storage res;
+ bld.setPosition(i, true); /* make sure bld is init'ed */
+ switch(i->dType) {
+ case TYPE_U16:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
+ UINT16_MAX));
+ else
+ res.data.u16 = util_iround(imm0.reg.data.f32);
+ break;
+ case TYPE_F64:
+ if (i->saturate)
+ res.data.u16 = util_iround(CLAMP(imm0.reg.data.f64, 0,
+ UINT16_MAX));
+ else
+ res.data.u16 = util_iround(imm0.reg.data.f64);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.u16));
+ break;
+ case TYPE_U32:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.u32 = util_iround(CLAMP(imm0.reg.data.f32, 0,
+ UINT32_MAX));
+ else
+ res.data.u32 = util_iround(imm0.reg.data.f32);
+ break;
+ case TYPE_F64:
+ if (i->saturate)
+ res.data.u32 = util_iround(CLAMP(imm0.reg.data.f64, 0,
+ UINT32_MAX));
+ else
+ res.data.u32 = util_iround(imm0.reg.data.f64);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.u32));
+ break;
+ case TYPE_S16:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.s16 = util_iround(CLAMP(imm0.reg.data.f32, INT16_MIN,
+ INT16_MAX));
+ else
+ res.data.s16 = util_iround(imm0.reg.data.f32);
+ break;
+ case TYPE_F64:
+ if (i->saturate)
+ res.data.s16 = util_iround(CLAMP(imm0.reg.data.f64, INT16_MIN,
+ INT16_MAX));
+ else
+ res.data.s16 = util_iround(imm0.reg.data.f64);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.s16));
+ break;
+ case TYPE_S32:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.s32 = util_iround(CLAMP(imm0.reg.data.f32, INT32_MIN,
+ INT32_MAX));
+ else
+ res.data.s32 = util_iround(imm0.reg.data.f32);
+ break;
+ case TYPE_F64:
+ if (i->saturate)
+ res.data.s32 = util_iround(CLAMP(imm0.reg.data.f64, INT32_MIN,
+ INT32_MAX));
+ else
+ res.data.s32 = util_iround(imm0.reg.data.f64);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.s32));
+ break;
+ case TYPE_F32:
+ switch (i->sType) {
+ case TYPE_U16: res.data.f32 = (float) imm0.reg.data.u16; break;
+ case TYPE_U32: res.data.f32 = (float) imm0.reg.data.u32; break;
+ case TYPE_S16: res.data.f32 = (float) imm0.reg.data.s16; break;
+ case TYPE_S32: res.data.f32 = (float) imm0.reg.data.s32; break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.f32));
+ break;
+ default:
+ return;
+ }
+ i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
+ i->setSrc(1, NULL);
+ i->op = OP_MOV;
+
+ i->src(0).mod = Modifier(0); /* Clear the already applied modifier */
+ break;
+ }
default:
return;
}
--
2.2.1
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau
^ permalink raw reply related [flat|nested] 18+ messages in thread
[parent not found: <1420847276-8754-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <1420847276-8754-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
@ 2015-01-10 1:41 ` Ilia Mirkin
[not found] ` <CAKb7UvhkHtoRFP1rk8=9w68ZcgesV21mGAMmdB5LsHvFVNzo3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Ilia Mirkin @ 2015-01-10 1:41 UTC (permalink / raw)
To: Tobias Klausmann
Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On Fri, Jan 9, 2015 at 6:47 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote:
> [...]
> + case OP_CVT: {
> + Storage res;
> + bld.setPosition(i, true); /* make sure bld is init'ed */
> + switch(i->dType) {
> + case TYPE_U16:
> + switch (i->sType) {
> + case TYPE_F32:
> + if (i->saturate)
> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
> + UINT16_MAX));
> + else
> + res.data.u16 = util_iround(imm0.reg.data.f32);
> + break;
> + case TYPE_F64:

The F64 stuff needs more thought, as I don't think we can always store
the f64 immediates. In my patches, I just outlaw fp64 immediates in
the first place. Please leave these out for now.

> [...]
^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CAKb7UvhkHtoRFP1rk8=9w68ZcgesV21mGAMmdB5LsHvFVNzo3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <CAKb7UvhkHtoRFP1rk8=9w68ZcgesV21mGAMmdB5LsHvFVNzo3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-10 1:08 ` Tobias Klausmann
2015-01-10 1:24 ` [PATCH v2] " Tobias Klausmann
1 sibling, 0 replies; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-10 1:08 UTC (permalink / raw)
To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On 10.01.2015 02:41, Ilia Mirkin wrote:
> On Fri, Jan 9, 2015 at 6:47 PM, Tobias Klausmann
> <tobias.johannes.klausmann@mni.thm.de> wrote:
>> [...]
>> + case TYPE_F64:
> The F64 stuff needs more thought, as I don't think we can always store
> the f64 immediates. In my patches, I just outlaw fp64 immediates in
> the first place. Please leave these out for now.

Oh, I removed only the lower part of it. I beg your pardon for sending that version here :/

>> [...]
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <CAKb7UvhkHtoRFP1rk8=9w68ZcgesV21mGAMmdB5LsHvFVNzo3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-10 1:08 ` Tobias Klausmann
@ 2015-01-10 1:24 ` Tobias Klausmann
[not found] ` <1420853067-13115-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
1 sibling, 1 reply; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-10 1:24 UTC (permalink / raw)
To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
---
V2: beat me, whip me, split out F64
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 ++++++++++++++++++++++
1 file changed, 81 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 9a0bb60..741c74f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
i->op = OP_MOV;
break;
}
+ case OP_CVT: {
+ Storage res;
+ bld.setPosition(i, true); /* make sure bld is init'ed */
+ switch(i->dType) {
+ case TYPE_U16:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
+ UINT16_MAX));
+ else
+ res.data.u16 = util_iround(imm0.reg.data.f32);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.u16));
+ break;
+ case TYPE_U32:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.u32 = util_iround(CLAMP(imm0.reg.data.f32, 0,
+ UINT32_MAX));
+ else
+ res.data.u32 = util_iround(imm0.reg.data.f32);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.u32));
+ break;
+ case TYPE_S16:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.s16 = util_iround(CLAMP(imm0.reg.data.f32, INT16_MIN,
+ INT16_MAX));
+ else
+ res.data.s16 = util_iround(imm0.reg.data.f32);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.s16));
+ break;
+ case TYPE_S32:
+ switch (i->sType) {
+ case TYPE_F32:
+ if (i->saturate)
+ res.data.s32 = util_iround(CLAMP(imm0.reg.data.f32, INT32_MIN,
+ INT32_MAX));
+ else
+ res.data.s32 = util_iround(imm0.reg.data.f32);
+ break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.s32));
+ break;
+ case TYPE_F32:
+ switch (i->sType) {
+ case TYPE_U16: res.data.f32 = (float) imm0.reg.data.u16; break;
+ case TYPE_U32: res.data.f32 = (float) imm0.reg.data.u32; break;
+ case TYPE_S16: res.data.f32 = (float) imm0.reg.data.s16; break;
+ case TYPE_S32: res.data.f32 = (float) imm0.reg.data.s32; break;
+ default:
+ return;
+ }
+ i->setSrc(0, bld.mkImm(res.data.f32));
+ break;
+ default:
+ return;
+ }
+ i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
+ i->setSrc(1, NULL);
+ i->op = OP_MOV;
+
+ i->src(0).mod = Modifier(0); /* Clear the already applied modifier */
+ break;
+ }
default:
return;
}
--
2.2.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
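For readers following the TYPE_F32 destination branch of the patch: the same immediate bits yield different floats depending on which integer type they are read as, which is why the fold has to switch on i->sType. A minimal standalone sketch of that idea follows. SrcType and fold_int_to_f32 are hypothetical names made up for illustration and are not part of nv50_ir; the patch itself reads the typed members of imm0.reg.data rather than casting a raw word.

```cpp
#include <cstdint>

// Hypothetical stand-in for the integer source types handled by the patch.
enum class SrcType { U16, U32, S16, S32 };

// Hypothetical helper (not a Mesa function): interpret the low bits of `raw`
// as the given integer type and convert that value to float.
static float fold_int_to_f32(uint32_t raw, SrcType type)
{
   switch (type) {
   case SrcType::U16:
      return static_cast<float>(static_cast<uint16_t>(raw));
   case SrcType::U32:
      return static_cast<float>(raw);
   case SrcType::S16:
      // Assumes two's complement narrowing (guaranteed since C++20).
      return static_cast<float>(
         static_cast<int16_t>(static_cast<uint16_t>(raw)));
   case SrcType::S32:
      // Same two's complement assumption as above.
      return static_cast<float>(static_cast<int32_t>(raw));
   }
   return 0.0f; // not reached for valid SrcType values
}
```

With raw = 0xFFFF, reading the bits as U16 gives 65535.0f, while reading them as S16 gives -1.0f.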
[parent not found: <1420853067-13115-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <1420853067-13115-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
@ 2015-01-11 0:58 ` Ilia Mirkin
[not found] ` <CAKb7UvinNbgsgz7PGzzy0fAmfzAykm9Fph_FRDxoZKV5cS+ybg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Ilia Mirkin @ 2015-01-11 0:58 UTC (permalink / raw)
To: Tobias Klausmann
Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote:
> [...]
> + case OP_CVT: {
> + Storage res;
> + bld.setPosition(i, true); /* make sure bld is init'ed */
> + switch(i->dType) {
> + case TYPE_U16:
> + switch (i->sType) {
> + case TYPE_F32:
> + if (i->saturate)
> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
> + UINT16_MAX));

Where did this saturate stuff come from? It doesn't make sense to
saturate to a non-float dtype. I'd go ahead and just
assert(!i->saturate) in the int dtype cases.

One does wonder what the hw does if the float doesn't fit in the
destination... whether it saturates or not. I don't hugely care
though.

> [...]
> + i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
> + i->setSrc(1, NULL);

How can src(1) be set? OP_CVT only has the one arg...

> [...]
^ permalink raw reply [flat|nested] 18+ messages in thread
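To make the saturate question concrete, here is a small standalone sketch of what folding an F32 to U16 conversion computes with and without the clamp. fold_f32_to_u16 is a hypothetical helper, not a Mesa function, and std::llround is only a stand-in for util_iround (an assumption: the two need not round ties identically). The sketch says nothing about what the hardware does with out-of-range inputs.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of nv50_ir): fold a constant F32 -> U16
// conversion on the host. NaN inputs are not handled in this sketch.
static uint16_t fold_f32_to_u16(float value, bool saturate)
{
   if (saturate) {
      // Clamp into the representable U16 range before rounding.
      value = std::min(std::max(value, 0.0f), 65535.0f);
   }
   // Round to nearest integer; ties go away from zero with llround.
   const long long rounded = std::llround(value);
   // Without saturation, out-of-range values wrap modulo 2^16 here.
   return static_cast<uint16_t>(rounded);
}
```

For an input of 70000.0f the saturating fold yields 65535, while the non-saturating fold wraps to 4464; whether wrapping matches the hardware is exactly the open question raised in the review.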
[parent not found: <CAKb7UvinNbgsgz7PGzzy0fAmfzAykm9Fph_FRDxoZKV5cS+ybg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <CAKb7UvinNbgsgz7PGzzy0fAmfzAykm9Fph_FRDxoZKV5cS+ybg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-11 17:27 ` Tobias Klausmann
[not found] ` <54B2B283.9020400-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-11 17:27 UTC (permalink / raw)
To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On 11.01.2015 01:58, Ilia Mirkin wrote:
> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann
> <tobias.johannes.klausmann@mni.thm.de> wrote:
>> [...]
>> + if (i->saturate)
>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
>> + UINT16_MAX));
> Where did this saturate stuff come from? It doesn't make sense to
> saturate to a non-float dtype. I'd go ahead and just
> assert(!i->saturate) in the int dtype cases.
>
> One does wonder what the hw does if the float doesn't fit in the
> destination... whether it saturates or not. I don't hugely care
> though.

Actually, I can't remember why that was added in the first place. I'll go ahead and follow your advice here.

>> [...]
>> + i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
>> + i->setSrc(1, NULL);
> How can src(1) be set? OP_CVT only has the one arg...

Agreed, it's NULL anyway.

>> [...]
^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <54B2B283.9020400-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <54B2B283.9020400-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
@ 2015-01-11 19:19 ` Ilia Mirkin
[not found] ` <CAKb7Uvg_dR3U_swgod0oLb5cLm8OX8OtTQPU514VG92G1vYg2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Ilia Mirkin @ 2015-01-11 19:19 UTC (permalink / raw)
To: Tobias Klausmann
Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote:
> On 11.01.2015 01:58, Ilia Mirkin wrote:
>> [...]
>> Where did this saturate stuff come from? It doesn't make sense to
>> saturate to a non-float dtype. I'd go ahead and just
>> assert(!i->saturate) in the int dtype cases.
>>
>> One does wonder what the hw does if the float doesn't fit in the
>> destination... whether it saturates or not. I don't hugely care
>> though.
>
> Actually i can't remember why that was added in the first place, i'll go
> ahead and follow your advice here.

Oh wait... this was to support saturating an array access into a u16...

const int sat = (i->op == OP_TXF) ? 1 : 0;
DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat;

So... basically if the source is a U32 and the dest is a U16, we want
to saturate there? IMO this is such a minor use-case that it doesn't
really matter. However I guess you can keep the saturate bits around
if you like.

-ilia
^ permalink raw reply [flat|nested] 18+ messages in thread
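The mkCvt snippet above sets saturate on a U32 to U16 conversion of the array layer. As a rough illustration of what a constant fold would have to compute in that case, assuming saturation simply clamps to the U16 range, consider the sketch below. fold_u32_to_u16 is a hypothetical name, not an nv50_ir function. Note that the v2 patch as posted only folds F32 sources for a U16 destination, so a U32 source would fall through to the default return and stay unfolded.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of nv50_ir): fold a constant U32 -> U16
// conversion, optionally saturating to the destination range.
static uint16_t fold_u32_to_u16(uint32_t layer, bool saturate)
{
   if (saturate) {
      // Clamp to the largest value a U16 can hold.
      layer = std::min<uint32_t>(layer, UINT16_MAX);
   }
   // Without saturation only the low 16 bits survive.
   return static_cast<uint16_t>(layer);
}
```

An out-of-range layer such as 70000 becomes 65535 when saturated and 4464 when truncated, so the two folds only agree for values that already fit in 16 bits.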
[parent not found: <CAKb7Uvg_dR3U_swgod0oLb5cLm8OX8OtTQPU514VG92G1vYg2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <CAKb7Uvg_dR3U_swgod0oLb5cLm8OX8OtTQPU514VG92G1vYg2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-11 19:56 ` Tobias Klausmann
[not found] ` <54B2D552.6030700-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-11 19:56 UTC (permalink / raw)
To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On 11.01.2015 20:19, Ilia Mirkin wrote:
> [...]
> Oh wait... this was to support saturating an array access into a u16...
>
> const int sat = (i->op == OP_TXF) ? 1 : 0;
> DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
> bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat;
>
> So... basically if the source is a U32 and the dest is a U16, we want
> to saturate there? IMO this is such a minor use-case that it doesn't
> really matter. However I guess you can keep the saturate bits around
> if you like.

Going by the tests, we can do it with or without the saturate. Adding assert(!i->saturate) is the only thing that breaks the tests you surely meant:

glsl-resource-not-bound 1DArray
glsl-resource-not-bound 2DArray
glsl-resource-not-bound 2DMSArray

> -ilia
^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <54B2D552.6030700-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <54B2D552.6030700-AqjdNwhu20eELgA04lAiVw@public.gmane.org> @ 2015-01-11 19:57 ` Ilia Mirkin [not found] ` <CAKb7Uvi7Ke_fzbpQ_JvLvv-1u2H=F-udRdb0A+dCM0=tsqBKBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Ilia Mirkin @ 2015-01-11 19:57 UTC (permalink / raw) To: Tobias Klausmann Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote: > > > On 11.01.2015 20:19, Ilia Mirkin wrote: >> >> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann >> <tobias.johannes.klausmann@mni.thm.de> wrote: >>> >>> >>> On 11.01.2015 01:58, Ilia Mirkin wrote: >>>> >>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>>> <tobias.johannes.klausmann@mni.thm.de> wrote: >>>>> >>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>>> {S16/32})->F32 >>>>> >>>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> >>>>> --- >>>>> V2: beat me, whip me, split out F64 >>>>> >>>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 >>>>> ++++++++++++++++++++++ >>>>> 1 file changed, 81 insertions(+) >>>>> >>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> index 9a0bb60..741c74f 100644 >>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, >>>>> ImmediateValue >>>>> &imm0, int s) >>>>> i->op = OP_MOV; >>>>> break; >>>>> } >>>>> + case OP_CVT: { >>>>> + Storage res; >>>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>>> + switch(i->dType) { >>>>> + case TYPE_U16: >>>>> + switch (i->sType) { >>>>> + case TYPE_F32: >>>>> + if (i->saturate) >>>>> + res.data.u16 = 
util_iround(CLAMP(imm0.reg.data.f32, 0,
>>>>> + UINT16_MAX));
>>>>
>>>> Where did this saturate stuff come from? It doesn't make sense to
>>>> saturate to a non-float dtype. I'd go ahead and just
>>>> assert(!i->saturate) in the int dtype cases.
>>>>
>>>> One does wonder what the hw does if the float doesn't fit in the
>>>> destination... whether it saturates or not. I don't hugely care
>>>> though.
>>>
>>> Actually i can't remember why that was added in the first place, i'll go
>>> ahead and follow your advice here.
>>
>> Oh wait... this was to support saturating an array access into a u16...
>>
>> const int sat = (i->op == OP_TXF) ? 1 : 0;
>> DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
>> bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat;
>>
>> So... basically if the source is a U32 and the dest is a U16, we want
>> to saturate there? IMO this is such a minor use-case that it doesn't
>> really matter. However I guess you can keep the saturate bits around
>> if you like.
>
> We can do it with or without the saturate if we rely on the test,
> assert(!i->saturate)'ing is the only thing that breaks the test you sure
> meant:
>
> glsl-resource-not-bound 1DArray
> glsl-resource-not-bound 2DArray
> glsl-resource-not-bound 2DMSArray

Hm, those are the only times that a texelFetch is done in piglit with
a constant layer index, I guess.
[parent not found: <CAKb7Uvi7Ke_fzbpQ_JvLvv-1u2H=F-udRdb0A+dCM0=tsqBKBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <CAKb7Uvi7Ke_fzbpQ_JvLvv-1u2H=F-udRdb0A+dCM0=tsqBKBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-01-11 20:17 ` Tobias Klausmann [not found] ` <54B2DA6A.7080505-AqjdNwhu20eELgA04lAiVw@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Tobias Klausmann @ 2015-01-11 20:17 UTC (permalink / raw) To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On 11.01.2015 20:57, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann > <tobias.johannes.klausmann@mni.thm.de> wrote: >> >> On 11.01.2015 20:19, Ilia Mirkin wrote: >>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann >>> <tobias.johannes.klausmann@mni.thm.de> wrote: >>>> >>>> On 11.01.2015 01:58, Ilia Mirkin wrote: >>>>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>>>> <tobias.johannes.klausmann@mni.thm.de> wrote: >>>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>>>> {S16/32})->F32 >>>>>> >>>>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> >>>>>> --- >>>>>> V2: beat me, whip me, split out F64 >>>>>> >>>>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 >>>>>> ++++++++++++++++++++++ >>>>>> 1 file changed, 81 insertions(+) >>>>>> >>>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> index 9a0bb60..741c74f 100644 >>>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>>> @@ -997,6 +997,87 @@ ConstantFolding::opnd(Instruction *i, >>>>>> ImmediateValue >>>>>> &imm0, int s) >>>>>> i->op = OP_MOV; >>>>>> break; >>>>>> } >>>>>> + case OP_CVT: { >>>>>> + Storage res; >>>>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>>>> + switch(i->dType) { >>>>>> + case TYPE_U16: >>>>>> + switch (i->sType) { >>>>>> + case TYPE_F32: 
>>>>>> + if (i->saturate)
>>>>>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
>>>>>> + UINT16_MAX));
>>>>> Where did this saturate stuff come from? It doesn't make sense to
>>>>> saturate to a non-float dtype. I'd go ahead and just
>>>>> assert(!i->saturate) in the int dtype cases.
>>>>>
>>>>> One does wonder what the hw does if the float doesn't fit in the
>>>>> destination... whether it saturates or not. I don't hugely care
>>>>> though.
>>>> Actually i can't remember why that was added in the first place, i'll go
>>>> ahead and follow your advice here.
>>> Oh wait... this was to support saturating an array access into a u16...
>>>
>>> const int sat = (i->op == OP_TXF) ? 1 : 0;
>>> DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32;
>>> bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat;
>>>
>>> So... basically if the source is a U32 and the dest is a U16, we want
>>> to saturate there? IMO this is such a minor use-case that it doesn't
>>> really matter. However I guess you can keep the saturate bits around
>>> if you like.
>> We can do it with or without the saturate if we rely on the test,
>> assert(!i->saturate)'ing is the only thing that breaks the test you sure
>> meant:
>>
>> glsl-resource-not-bound 1DArray
>> glsl-resource-not-bound 2DArray
>> glsl-resource-not-bound 2DMSArray
> Hm, those are the only times that a texelFetch is done in piglit with
> a constant layer index, I guess.

OK, I'll keep the saturates for (U/S)16, both to satisfy the "dependency"
you posted above and to be future-proof in case somebody implements
something similar(?) for S16!
[parent not found: <54B2DA6A.7080505-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <54B2DA6A.7080505-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
@ 2015-01-11 21:40 ` Tobias Klausmann
[not found] ` <1421012422-30607-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-11 21:40 UTC (permalink / raw)
To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, imirkin-FrUbXkNCsVf2fBVCVOL8/A

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
---
V2: Split out F64 parts
V3: remove handling of saturate for (U/S)32,

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73 ++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 21d20ca..aaf0d0d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
       i->op = OP_MOV;
       break;
    }
+   case OP_CVT: {
+      Storage res;
+      bld.setPosition(i, true); /* make sure bld is init'ed */
+      switch(i->dType) {
+      case TYPE_U16:
+         switch (i->sType) {
+         case TYPE_F32:
+            if (i->saturate)
+               res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
+                                                UINT16_MAX));
+            else
+               res.data.u16 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.u16));
+         break;
+      case TYPE_U32:
+         assert(!i->saturate);
+         switch (i->sType) {
+         case TYPE_F32:
+            res.data.u32 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.u32));
+         break;
+      case TYPE_S16:
+         switch (i->sType) {
+         case TYPE_F32:
+            if (i->saturate)
+               res.data.s16 = util_iround(CLAMP(imm0.reg.data.f32, INT16_MIN,
+                                                INT16_MAX));
+            else
+               res.data.s16 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.s16));
+         break;
+      case TYPE_S32:
+         assert(!i->saturate);
+         switch (i->sType) {
+         case TYPE_F32:
+            res.data.s32 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.s32));
+         break;
+      case TYPE_F32:
+         switch (i->sType) {
+         case TYPE_U16: res.data.f32 = (float) imm0.reg.data.u16; break;
+         case TYPE_U32: res.data.f32 = (float) imm0.reg.data.u32; break;
+         case TYPE_S16: res.data.f32 = (float) imm0.reg.data.s16; break;
+         case TYPE_S32: res.data.f32 = (float) imm0.reg.data.s32; break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.f32));
+         break;
+      default:
+         return;
+      }
+      i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
+      i->op = OP_MOV;
+      i->src(0).mod = Modifier(0); /* Clear the already applied modifier */
+      break;
+   }
    default:
       return;
    }
--
2.2.1
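As a standalone illustration of the F32 -> U16/S16 folding discussed in this patch, here is a minimal sketch. It is not the Mesa code: `iround`, `fold_f32_to_u16` and `fold_f32_to_s16` are hypothetical names, `iround` only stands in for util_iround() and is assumed to round to the nearest integer, and `std::clamp` assumes C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for Mesa's util_iround(): round to nearest integer.
static inline int iround(float f)
{
   return static_cast<int>(std::lrintf(f));
}

// Fold an F32 immediate into a U16 value. With saturate set, the input is
// clamped to [0, UINT16_MAX] before rounding; without it, the rounded
// integer is simply truncated to 16 bits (modulo 65536).
static inline uint16_t fold_f32_to_u16(float value, bool saturate)
{
   assert(!std::isnan(value)); // NaN inputs are out of scope for this sketch
   if (saturate)
      value = std::clamp(value, 0.0f, static_cast<float>(UINT16_MAX));
   return static_cast<uint16_t>(iround(value));
}

// Same idea for S16, clamping to [INT16_MIN, INT16_MAX] when saturating.
// Without saturate, out-of-range results are implementation-defined
// before C++20, so callers should not rely on them.
static inline int16_t fold_f32_to_s16(float value, bool saturate)
{
   assert(!std::isnan(value));
   if (saturate)
      value = std::clamp(value, static_cast<float>(INT16_MIN),
                         static_cast<float>(INT16_MAX));
   return static_cast<int16_t>(iround(value));
}
```

With these helpers, `fold_f32_to_u16(70000.0f, true)` yields 65535, while the non-saturating call wraps to 4464 (70000 modulo 65536), which is the difference the saturate flag is meant to capture.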
[parent not found: <1421012422-30607-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <1421012422-30607-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org> @ 2015-01-11 21:54 ` Ilia Mirkin [not found] ` <CAKb7Uvgnd25Ubm4m_-auNHw8p_Z9g=7sSxDrRb+CmbE5MZtohA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Ilia Mirkin @ 2015-01-11 21:54 UTC (permalink / raw) To: Tobias Klausmann Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote: > Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 > > Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> > --- > V2: Split out F64 parts > V3: remove handling of saturate for (U/S)32, > > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73 ++++++++++++++++++++++ > 1 file changed, 73 insertions(+) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > index 21d20ca..aaf0d0d 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > @@ -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) > i->op = OP_MOV; > break; > } > + case OP_CVT: { > + Storage res; > + bld.setPosition(i, true); /* make sure bld is init'ed */ > + switch(i->dType) { > + case TYPE_U16: > + switch (i->sType) { > + case TYPE_F32: > + if (i->saturate) > + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, > + UINT16_MAX)); > + else > + res.data.u16 = util_iround(imm0.reg.data.f32); > + break; > + default: > + return; > + } This won't get hit for the U32 -> U16 conversion though right? Did you test that case? Am I misreading/misunderstanding perhaps? 
-ilia
[parent not found: <CAKb7Uvgnd25Ubm4m_-auNHw8p_Z9g=7sSxDrRb+CmbE5MZtohA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <CAKb7Uvgnd25Ubm4m_-auNHw8p_Z9g=7sSxDrRb+CmbE5MZtohA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-01-11 22:08 ` Tobias Klausmann [not found] ` <54B2F467.3050007-AqjdNwhu20eELgA04lAiVw@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Tobias Klausmann @ 2015-01-11 22:08 UTC (permalink / raw) To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On 11.01.2015 22:54, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann > <tobias.johannes.klausmann@mni.thm.de> wrote: >> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 >> >> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> >> --- >> V2: Split out F64 parts >> V3: remove handling of saturate for (U/S)32, >> >> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73 ++++++++++++++++++++++ >> 1 file changed, 73 insertions(+) >> >> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >> index 21d20ca..aaf0d0d 100644 >> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >> @@ -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) >> i->op = OP_MOV; >> break; >> } >> + case OP_CVT: { >> + Storage res; >> + bld.setPosition(i, true); /* make sure bld is init'ed */ >> + switch(i->dType) { >> + case TYPE_U16: >> + switch (i->sType) { >> + case TYPE_F32: >> + if (i->saturate) >> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, >> + UINT16_MAX)); >> + else >> + res.data.u16 = util_iround(imm0.reg.data.f32); >> + break; >> + default: >> + return; >> + } > This won't get hit for the U32 -> U16 conversion though right? Did you > test that case? Am I misreading/misunderstanding perhaps? A complete piglit run did not hit i->saturate for U32 or S32. 
That said, I kept the assert() there on purpose for now to actually make sure we are not hitting such a case. Do I misread you now? :)
[parent not found: <54B2F467.3050007-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <54B2F467.3050007-AqjdNwhu20eELgA04lAiVw@public.gmane.org> @ 2015-01-11 22:12 ` Ilia Mirkin [not found] ` <CAKb7UvgiWyNAumaKaiw_B2f49Q+xXszdwBE1abJwoC8SEpH8Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Ilia Mirkin @ 2015-01-11 22:12 UTC (permalink / raw) To: Tobias Klausmann Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote: > > > On 11.01.2015 22:54, Ilia Mirkin wrote: >> >> On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann >> <tobias.johannes.klausmann@mni.thm.de> wrote: >>> >>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>> {S16/32})->F32 >>> >>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> >>> --- >>> V2: Split out F64 parts >>> V3: remove handling of saturate for (U/S)32, >>> >>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73 >>> ++++++++++++++++++++++ >>> 1 file changed, 73 insertions(+) >>> >>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> index 21d20ca..aaf0d0d 100644 >>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>> @@ -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue >>> &imm0, int s) >>> i->op = OP_MOV; >>> break; >>> } >>> + case OP_CVT: { >>> + Storage res; >>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>> + switch(i->dType) { >>> + case TYPE_U16: >>> + switch (i->sType) { >>> + case TYPE_F32: >>> + if (i->saturate) >>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, >>> + UINT16_MAX)); >>> + else >>> + res.data.u16 = util_iround(imm0.reg.data.f32); >>> + break; >>> + default: >>> + return; >>> + } >> >> This won't get hit for the U32 -> 
U16 conversion though right? Did you
>> test that case? Am I misreading/misunderstanding perhaps?
>
> A complete piglit run did not hit i->saturate for U32 or S32. That said, i
> kept the assert() there on purpose for now to actually make sure we are no
> hitting such a case. Do i misread you now? :)

From my read of the code, we'd hit that case now with TXF on a
2D_ARRAY with a constant as the array element. i.e. a piglit with

uniform sampler2DArray foo;
texelFetch(foo, ivec3(1, 2, 3));

-ilia
[parent not found: <CAKb7UvgiWyNAumaKaiw_B2f49Q+xXszdwBE1abJwoC8SEpH8Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <CAKb7UvgiWyNAumaKaiw_B2f49Q+xXszdwBE1abJwoC8SEpH8Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-01-11 22:48 ` Tobias Klausmann [not found] ` <54B2FDB9.60906-AqjdNwhu20eELgA04lAiVw@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Tobias Klausmann @ 2015-01-11 22:48 UTC (permalink / raw) To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On 11.01.2015 23:12, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann > <tobias.johannes.klausmann@mni.thm.de> wrote: >> >> On 11.01.2015 22:54, Ilia Mirkin wrote: >>> On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann >>> <tobias.johannes.klausmann@mni.thm.de> wrote: >>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>> {S16/32})->F32 >>>> >>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> >>>> --- >>>> V2: Split out F64 parts >>>> V3: remove handling of saturate for (U/S)32, >>>> >>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73 >>>> ++++++++++++++++++++++ >>>> 1 file changed, 73 insertions(+) >>>> >>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> index 21d20ca..aaf0d0d 100644 >>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>> @@ -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue >>>> &imm0, int s) >>>> i->op = OP_MOV; >>>> break; >>>> } >>>> + case OP_CVT: { >>>> + Storage res; >>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>> + switch(i->dType) { >>>> + case TYPE_U16: >>>> + switch (i->sType) { >>>> + case TYPE_F32: >>>> + if (i->saturate) >>>> + res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0, >>>> + UINT16_MAX)); >>>> + else >>>> + res.data.u16 = util_iround(imm0.reg.data.f32); >>>> + break; 
>>>> + default:
>>>> + return;
>>>> + }
>>> This won't get hit for the U32 -> U16 conversion though right? Did you
>>> test that case? Am I misreading/misunderstanding perhaps?
>> A complete piglit run did not hit i->saturate for U32 or S32. That said, i
>> kept the assert() there on purpose for now to actually make sure we are no
>> hitting such a case. Do i misread you now? :)
> From my read of the code, we'd hit that case now with TXF on a
> 2D_ARRAY with a constant as the array element. i.e. a piglit with
>
> uniform sampler2DArray foo;
> texelFetch(foo, ivec3(1, 2, 3));

Tested this (I hope I did the right thing) and the assert did not get
triggered, but I am still uncertain about this.
-> move the assert into the F32 case for U32/S32 just to make sure...
switch (i->sType)
case TYPE_F32:
assert(...)
...

Other than that, we are not even going to fold U32 -> U16 ;-)

Greetings,
Tobias
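The assert placement proposed in the message above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `DataType`, `Cvt` and `fold_cvt_to_u32` are hypothetical, simplified stand-ins and not the nv50_ir types, and `std::optional` assumes C++17.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>

// Hypothetical, reduced model of the instruction fields involved.
enum class DataType { U16, U32, S32, F32 };

struct Cvt {
   DataType dType;   // destination type
   DataType sType;   // source type
   bool saturate;    // saturate flag on the conversion
};

// Fold a CVT to U32 if the source type is handled; return std::nullopt to
// mean "left unfolded". The assert sits inside the F32 case, so it only
// guards the path that is actually folded and cannot fire for source types
// that are skipped anyway.
static std::optional<uint32_t> fold_cvt_to_u32(const Cvt &cvt, float f32_imm)
{
   if (cvt.dType != DataType::U32)
      return std::nullopt;

   switch (cvt.sType) {
   case DataType::F32:
      assert(!cvt.saturate);
      return static_cast<uint32_t>(std::lrintf(f32_imm));
   default:
      return std::nullopt; // unhandled source type: no fold, no assert
   }
}
```

The design point is that a saturating conversion from an unhandled source type simply stays unfolded instead of tripping the assert.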
[parent not found: <54B2FDB9.60906-AqjdNwhu20eELgA04lAiVw@public.gmane.org>]
* Re: [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions [not found] ` <54B2FDB9.60906-AqjdNwhu20eELgA04lAiVw@public.gmane.org> @ 2015-01-11 22:53 ` Ilia Mirkin [not found] ` <CAKb7UvidgoVLmvvG3r2M2Eio-EexLh1RsXh9GK8Pf-UMSVOPgw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Ilia Mirkin @ 2015-01-11 22:53 UTC (permalink / raw) To: Tobias Klausmann Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org On Sun, Jan 11, 2015 at 5:48 PM, Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> wrote: > > > On 11.01.2015 23:12, Ilia Mirkin wrote: >> >> On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann >> <tobias.johannes.klausmann@mni.thm.de> wrote: >>> >>> >>> On 11.01.2015 22:54, Ilia Mirkin wrote: >>>> >>>> On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann >>>> <tobias.johannes.klausmann@mni.thm.de> wrote: >>>>> >>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, >>>>> {S16/32})->F32 >>>>> >>>>> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> >>>>> --- >>>>> V2: Split out F64 parts >>>>> V3: remove handling of saturate for (U/S)32, >>>>> >>>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73 >>>>> ++++++++++++++++++++++ >>>>> 1 file changed, 73 insertions(+) >>>>> >>>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> index 21d20ca..aaf0d0d 100644 >>>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >>>>> @@ -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i, >>>>> ImmediateValue >>>>> &imm0, int s) >>>>> i->op = OP_MOV; >>>>> break; >>>>> } >>>>> + case OP_CVT: { >>>>> + Storage res; >>>>> + bld.setPosition(i, true); /* make sure bld is init'ed */ >>>>> + switch(i->dType) { >>>>> + case TYPE_U16: >>>>> + switch (i->sType) { >>>>> + case TYPE_F32: >>>>> + if (i->saturate) >>>>> + 
res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
>>>>> + UINT16_MAX));
>>>>> + else
>>>>> + res.data.u16 = util_iround(imm0.reg.data.f32);
>>>>> + break;
>>>>> + default:
>>>>> + return;
>>>>> + }
>>>>
>>>> This won't get hit for the U32 -> U16 conversion though right? Did you
>>>> test that case? Am I misreading/misunderstanding perhaps?
>>>
>>> A complete piglit run did not hit i->saturate for U32 or S32. That said, i
>>> kept the assert() there on purpose for now to actually make sure we are no
>>> hitting such a case. Do i misread you now? :)
>>
>> From my read of the code, we'd hit that case now with TXF on a
>> 2D_ARRAY with a constant as the array element. i.e. a piglit with
>>
>> uniform sampler2DArray foo;
>> texelFetch(foo, ivec3(1, 2, 3));
>
> Tested this (hope i did the right thing) and the assert did not get
> triggered, but i am still uncertain of this.
> -> move the assert into the F32 case for U32/S32 just to make sure...
> switch (i->sType)
> case TYPE_F32:
> assert(...)
> ...
>
> other than that, we are not even going to fold U32 -> U16 ;-)

Right, and that's the problem. Try it with a piglit that has the code
I suggest... if you don't end up collapsing it, include the TGSI
that's generated (and also the shader test source) and we'll go from
there.

-ilia
[parent not found: <CAKb7UvidgoVLmvvG3r2M2Eio-EexLh1RsXh9GK8Pf-UMSVOPgw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* [PATCH v4] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <CAKb7UvidgoVLmvvG3r2M2Eio-EexLh1RsXh9GK8Pf-UMSVOPgw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-01-24 18:18 ` Tobias Klausmann
2015-01-24 18:19 ` [PATCH] " Tobias Klausmann
1 sibling, 0 replies; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-24 18:18 UTC (permalink / raw)
To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, imirkin-FrUbXkNCsVf2fBVCVOL8/A

Folding for conversions:
F32->(U{16/32}, S{16/32})
(U{16/32}, {S16/32})->F32
U32 -> U16

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
---
V2: Split out F64 parts
V3: remove handling of saturate for (U/S)32
V4: handle U32->U16 for OP_TXF

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 79 ++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 21d20ca..235aed9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -997,6 +997,85 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
       i->op = OP_MOV;
       break;
    }
+   case OP_CVT: {
+      Storage res;
+      bld.setPosition(i, true); /* make sure bld is init'ed */
+      switch(i->dType) {
+      case TYPE_U16:
+         switch (i->sType) {
+         case TYPE_F32:
+            if (i->saturate)
+               res.data.u16 = util_iround(CLAMP(imm0.reg.data.f32, 0,
+                                                UINT16_MAX));
+            else
+               res.data.u16 = util_iround(imm0.reg.data.f32);
+            break;
+         case TYPE_U32:
+            if (i->saturate)
+               res.data.u16 = CLAMP(imm0.reg.data.u32, 0, UINT16_MAX);
+            else
+               res.data.u16 = imm0.reg.data.u32;
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.u16));
+         break;
+      case TYPE_U32:
+         assert(!i->saturate);
+         switch (i->sType) {
+         case TYPE_F32:
+            res.data.u32 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.u32));
+         break;
+      case TYPE_S16:
+         switch (i->sType) {
+         case TYPE_F32:
+            if (i->saturate)
+               res.data.s16 = util_iround(CLAMP(imm0.reg.data.f32, INT16_MIN,
+                                                INT16_MAX));
+            else
+               res.data.s16 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.s16));
+         break;
+      case TYPE_S32:
+         assert(!i->saturate);
+         switch (i->sType) {
+         case TYPE_F32:
+            res.data.s32 = util_iround(imm0.reg.data.f32);
+            break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.s32));
+         break;
+      case TYPE_F32:
+         switch (i->sType) {
+         case TYPE_U16: res.data.f32 = (float) imm0.reg.data.u16; break;
+         case TYPE_U32: res.data.f32 = (float) imm0.reg.data.u32; break;
+         case TYPE_S16: res.data.f32 = (float) imm0.reg.data.s16; break;
+         case TYPE_S32: res.data.f32 = (float) imm0.reg.data.s32; break;
+         default:
+            return;
+         }
+         i->setSrc(0, bld.mkImm(res.data.f32));
+         break;
+      default:
+         return;
+      }
+      i->setType(i->dType); /* Remove i->sType, which we don't need anymore */
+      i->op = OP_MOV;
+      i->src(0).mod = Modifier(0); /* Clear the already applied modifier */
+      break;
+   }
    default:
       return;
    }
--
2.2.2
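The U32 -> U16 case added in v4 can be illustrated on its own with the sketch below. `fold_u32_to_u16` is a hypothetical name used only for this example; the logic mirrors the saturate/non-saturate split of the patch but is not the Mesa code.

```cpp
#include <algorithm>
#include <cstdint>

// Fold a U32 immediate (for example a constant texelFetch layer index)
// into a U16 value. With saturate set, values above UINT16_MAX are clamped
// to UINT16_MAX; a lower bound of 0 is implied by the unsigned type.
// Without saturate, the upper 16 bits are simply dropped.
static inline uint16_t fold_u32_to_u16(uint32_t value, bool saturate)
{
   if (saturate)
      value = std::min<uint32_t>(value, UINT16_MAX);
   return static_cast<uint16_t>(value);
}
```

For a small layer index such as 7 both variants agree; they differ only once the source exceeds 16 bits, where 0x12345 saturates to 0xFFFF but truncates to 0x2345.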
* Re: [PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
[not found] ` <CAKb7UvidgoVLmvvG3r2M2Eio-EexLh1RsXh9GK8Pf-UMSVOPgw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-24 18:18 ` [PATCH v4] " Tobias Klausmann
@ 2015-01-24 18:19 ` Tobias Klausmann
1 sibling, 0 replies; 18+ messages in thread
From: Tobias Klausmann @ 2015-01-24 18:19 UTC (permalink / raw)
To: Ilia Mirkin; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

On 11.01.2015 23:53, Ilia Mirkin wrote:
> On Sun, Jan 11, 2015 at 5:48 PM, Tobias Klausmann
> <tobias.johannes.klausmann@mni.thm.de> wrote:
>> On 11.01.2015 23:12, Ilia Mirkin wrote:
>>> On Sun, Jan 11, 2015 at 5:08 PM, Tobias Klausmann
>>> <tobias.johannes.klausmann@mni.thm.de> wrote:
>>>> On 11.01.2015 22:54, Ilia Mirkin wrote:
>>>>> On Sun, Jan 11, 2015 at 4:40 PM, Tobias Klausmann
>>>>> <tobias.johannes.klausmann@mni.thm.de> wrote:
>>>>>> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32},
>>>>>> {S16/32})->F32 Signed-off-by: Tobias Klausmann
>>>>>> <tobias.johannes.klausmann@mni.thm.de> --- V2: Split out F64
>>>>>> parts V3: remove handling of saturate for (U/S)32,
>>>>>> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 73
>>>>>> ++++++++++++++++++++++ 1 file changed, 73 insertions(+) diff
>>>>>> --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index
>>>>>> 21d20ca..aaf0d0d 100644 ---
>>>>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++
>>>>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@
>>>>>> -997,6 +997,79 @@ ConstantFolding::opnd(Instruction *i,
>>>>>> ImmediateValue &imm0, int s) i->op = OP_MOV; break; } + case
>>>>>> OP_CVT: { + Storage res; + bld.setPosition(i, true); /* make sure
>>>>>> bld is init'ed */ + switch(i->dType) { + case TYPE_U16: + switch
>>>>>> (i->sType) { + case TYPE_F32: + if (i->saturate) + res.data.u16 =
>>>>>> util_iround(CLAMP(imm0.reg.data.f32, 0, + UINT16_MAX)); + else +
>>>>>> res.data.u16 = util_iround(imm0.reg.data.f32); + break; +
>>>>>> default: + return; + }
>>>>> This won't get hit for the U32 -> U16 conversion though right? Did
>>>>> you test that case? Am I misreading/misunderstanding perhaps?
>>>> A complete piglit run did not hit i->saturate for U32 or S32. That
>>>> said, i kept the assert() there on purpose for now to actually make
>>>> sure we are no hitting such a case. Do i misread you now? :)
>>> From my read of the code, we'd hit that case now with TXF on a
>>> 2D_ARRAY with a constant as the array element. i.e. a piglit with
>>> uniform sampler2DArray foo; texelFetch(foo, ivec3(1, 2, 3));
>> Tested this (hope i did the right thing) and the assert did not get
>> triggered, but i am still uncertain of this. -> move the assert into
>> the F32 case for U32/S32 just to make sure... switch (i->sType) case
>> TYPE_F32: assert(...) ... other than that, we are not even going to
>> fold U32 -> U16 ;-)
> Right, and that's the problem. Try it with a piglit that has the code
> I suggest... if you don't end up collapsing it, include the TGSI
> that's generated (and also the shader test source) and we'll go from
> there. -ilia

I haven't found a piglit test triggering that, but I have created a TGSI
shader on my own. That's the only reason I am writing this email and not
just posting the patch. This one is collapsed just fine though.

FRAG
DCL OUT[0..2], COLOR
DCL CONST[0..2]
DCL TEMP[0..2], LOCAL
IMM[0] FLT32 { 1, 2, 3, 4}
IMM[1] UINT32 { 5, 6, 7, 8}
0: TEX TEMP[0], IMM[0], SAMP[0], 2D_ARRAY
1: TXF TEMP[1], IMM[1], SAMP[0], 2D_ARRAY
2: MOV OUT[0], TEMP[0]
3: MOV OUT[1], TEMP[1]
4: END

Greetings,
Tobias
Thread overview: 18+ messages
2015-01-09 23:47 [RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions Tobias Klausmann
[not found] ` <1420847276-8754-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-10 1:41 ` Ilia Mirkin
[not found] ` <CAKb7UvhkHtoRFP1rk8=9w68ZcgesV21mGAMmdB5LsHvFVNzo3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-10 1:08 ` Tobias Klausmann
2015-01-10 1:24 ` [PATCH v2] " Tobias Klausmann
[not found] ` <1420853067-13115-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 0:58 ` Ilia Mirkin
[not found] ` <CAKb7UvinNbgsgz7PGzzy0fAmfzAykm9Fph_FRDxoZKV5cS+ybg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-11 17:27 ` Tobias Klausmann
[not found] ` <54B2B283.9020400-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 19:19 ` Ilia Mirkin
[not found] ` <CAKb7Uvg_dR3U_swgod0oLb5cLm8OX8OtTQPU514VG92G1vYg2A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-11 19:56 ` Tobias Klausmann
[not found] ` <54B2D552.6030700-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 19:57 ` Ilia Mirkin
[not found] ` <CAKb7Uvi7Ke_fzbpQ_JvLvv-1u2H=F-udRdb0A+dCM0=tsqBKBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-11 20:17 ` Tobias Klausmann
[not found] ` <54B2DA6A.7080505-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 21:40 ` [PATCH] " Tobias Klausmann
[not found] ` <1421012422-30607-1-git-send-email-tobias.johannes.klausmann-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 21:54 ` Ilia Mirkin
[not found] ` <CAKb7Uvgnd25Ubm4m_-auNHw8p_Z9g=7sSxDrRb+CmbE5MZtohA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-11 22:08 ` Tobias Klausmann
[not found] ` <54B2F467.3050007-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 22:12 ` Ilia Mirkin
[not found] ` <CAKb7UvgiWyNAumaKaiw_B2f49Q+xXszdwBE1abJwoC8SEpH8Lg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-11 22:48 ` Tobias Klausmann
[not found] ` <54B2FDB9.60906-AqjdNwhu20eELgA04lAiVw@public.gmane.org>
2015-01-11 22:53 ` Ilia Mirkin
[not found] ` <CAKb7UvidgoVLmvvG3r2M2Eio-EexLh1RsXh9GK8Pf-UMSVOPgw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-01-24 18:18 ` [PATCH v4] " Tobias Klausmann
2015-01-24 18:19 ` [PATCH] " Tobias Klausmann