All of lore.kernel.org
 help / color / mirror / Atom feed
* Possible fix to _mesa_remove_extra_moves function in shader/prog_optimize.c (request testing)
@ 2010-05-05 20:35 Conn Clark
  2010-05-06 21:50 ` Brian Paul
  0 siblings, 1 reply; 3+ messages in thread
From: Conn Clark @ 2010-05-05 20:35 UTC (permalink / raw)
  To: dri-devel

[-- Attachment #1: Type: text/plain, Size: 846 bytes --]

Hello,

Here is a possible fix/hack to get _mesa_remove_extra_moves function
in shader/prog_optimize.c usable. As far as I could tell with my
testing there was an issue with this optimizing pass and OPCODE_MUL .
I just added an exception to for this one instruction and made it easy
to add others should further testing indicate they need to be added
too.

It bumped my Nexuiz scores on demo1 from 5,8,and 12 to 5,9, and 13. It
also reduced the testing runtime from 234 seconds to 225 seconds.


I have only tested on my radeon hd 3100 based laptop but would like to
hear results from other types of cards too.

Conn
-- 

Conn O. Clark

Observation: In formal computer science advances are made
by standing on the shoulders of giants. Linux has proved
that if there are enough of you, you can advance just as
far by stepping on each others toes.

[-- Attachment #2: possible-_mesa_remove_extra_moves-fix.patch --]
[-- Type: application/octet-stream, Size: 3521 bytes --]

diff --git a/src/mesa/shader/prog_optimize.c b/src/mesa/shader/prog_optimize.c
index 2941a17..1cdc186 100644
--- a/src/mesa/shader/prog_optimize.c
+++ b/src/mesa/shader/prog_optimize.c
@@ -356,13 +356,14 @@ enum temp_use
    READ,
    WRITE,
    FLOW,
+   PROBLEMATIC,
    END
 };
 
 /**
  * Scan forward in program from 'start' for the next occurance of TEMP[index].
- * Return READ, WRITE, FLOW or END to indicate the next usage or an indicator
- * that we can't look further.
+ * Return READ, WRITE, FLOW, PROBLEMATIC, or END to indicate the next usage or
+ * an indicator that we can't look further.
  */
 static enum temp_use
 find_next_temp_use(const struct gl_program *prog, GLuint start, GLuint index)
@@ -372,11 +373,21 @@ find_next_temp_use(const struct gl_program *prog, GLuint start, GLuint index)
    for (i = start; i < prog->NumInstructions; i++) {
       const struct prog_instruction *inst = prog->Instructions + i;
       switch (inst->Opcode) {
-      case OPCODE_BGNLOOP:
+/*      case OPCODE_BGNLOOP: */ /* does not appear to be a problem */
       case OPCODE_ENDLOOP:
       case OPCODE_BGNSUB:
       case OPCODE_ENDSUB:
          return FLOW;
+
+      case OPCODE_MUL: /* suspect it has something to do with writemask */
+/*      case OPCODE_ABS:
+      case OPCODE_ADD:
+      case OPCODE_SUB:  */ 
+/* if trouble occurs I would guess one or more of the commented out
+ * instructions above are the likely culprits */
+
+         return PROBLEMATIC;
+
       default:
          {
             const GLuint numSrc = _mesa_num_inst_src_regs(inst->Opcode);
@@ -578,7 +589,7 @@ _mesa_remove_extra_moves(struct gl_program *prog)
             /* get pointer to previous instruction */
             prevI = i - 1;
             while (prevI > 0 && removeInst[prevI])
-               prevI--;
+               prevI--;  
             prevInst = prog->Instructions + prevI;
 
             if (prevInst->DstReg.File == PROGRAM_TEMPORARY &&
@@ -588,7 +599,7 @@ _mesa_remove_extra_moves(struct gl_program *prog)
                enum temp_use next_use =
                   find_next_temp_use(prog, i + 1, tempIndex);
 
-               if (next_use == WRITE || next_use == END) {
+               if (next_use == WRITE || next_use == END) { 
                   /* OK, we can safely remove this MOV instruction.
                    * Transform:
                    *   prevI: FOO tempIndex, x, y;
@@ -602,7 +613,7 @@ _mesa_remove_extra_moves(struct gl_program *prog)
                   prevInst->DstReg.Index = inst->DstReg.Index;
 
                   /* flag this instruction for removal */
-                  removeInst[i] = GL_TRUE;
+                  removeInst[i] = GL_TRUE; 
 
                   if (dbg) {
                      printf("Remove MOV at %u\n", i);
@@ -619,7 +630,7 @@ _mesa_remove_extra_moves(struct gl_program *prog)
    }
 
    /* now remove the instructions which aren't needed */
-   rem = remove_instructions(prog, removeInst);
+   rem = remove_instructions(prog, removeInst); 
 
    free(removeInst);
 
@@ -1020,16 +1031,18 @@ _mesa_reallocate_registers(struct gl_program *prog)
 void
 _mesa_optimize_program(GLcontext *ctx, struct gl_program *program)
 {
+
    _mesa_remove_extra_move_use(program);
 
    if (1)
       _mesa_remove_dead_code(program);
 
-   if (0) /* not tested much yet */
+   if (1) /* not tested much yet */
       _mesa_remove_extra_moves(program);
 
    if (0)
       _mesa_consolidate_registers(program);
    else
       _mesa_reallocate_registers(program);
+
 }

[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: Possible fix to _mesa_remove_extra_moves function in shader/prog_optimize.c (request testing)
  2010-05-05 20:35 Possible fix to _mesa_remove_extra_moves function in shader/prog_optimize.c (request testing) Conn Clark
@ 2010-05-06 21:50 ` Brian Paul
  2010-05-06 22:27   ` Conn Clark
  0 siblings, 1 reply; 3+ messages in thread
From: Brian Paul @ 2010-05-06 21:50 UTC (permalink / raw)
  To: Conn Clark; +Cc: dri-devel@lists.freedesktop.org

Conn Clark wrote:
> Hello,
> 
> Here is a possible fix/hack to get _mesa_remove_extra_moves function
> in shader/prog_optimize.c usable. As far as I could tell with my
> testing there was an issue with this optimizing pass and OPCODE_MUL .
> I just added an exception to for this one instruction and made it easy
> to add others should further testing indicate they need to be added
> too.
> 
> It bumped my Nexuiz scores on demo1 from 5,8,and 12 to 5,9, and 13. It
> also reduced the testing runtime from 234 seconds to 225 seconds.
> 
> 
> I have only tested on my radeon hd 3100 based laptop but would like to
> hear results from other types of cards too.

I'm a bit nervous about enabling that function without a _lot_ more 
testing.  And any special case added for MUL would seem to apply to 
any ALU instruction.  That tells me that there's probably other issues 
to shake out of the code before we can enable it.

If you're interested, you should at least run the glean and piglits 
tests which exercise shaders and GPU programs.

-Brian

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Possible fix to _mesa_remove_extra_moves function in shader/prog_optimize.c (request testing)
  2010-05-06 21:50 ` Brian Paul
@ 2010-05-06 22:27   ` Conn Clark
  0 siblings, 0 replies; 3+ messages in thread
From: Conn Clark @ 2010-05-06 22:27 UTC (permalink / raw)
  To: Brian Paul; +Cc: dri-devel@lists.freedesktop.org

On Thu, May 6, 2010 at 2:50 PM, Brian Paul <brianp@vmware.com> wrote:
> Conn Clark wrote:
>>
>> Hello,
>>
>> Here is a possible fix/hack to get _mesa_remove_extra_moves function
>> in shader/prog_optimize.c usable. As far as I could tell with my
>> testing there was an issue with this optimizing pass and OPCODE_MUL .
>> I just added an exception to for this one instruction and made it easy
>> to add others should further testing indicate they need to be added
>> too.
>>
>> It bumped my Nexuiz scores on demo1 from 5,8,and 12 to 5,9, and 13. It
>> also reduced the testing runtime from 234 seconds to 225 seconds.
>>
>>
>> I have only tested on my radeon hd 3100 based laptop but would like to
>> hear results from other types of cards too.
>
> I'm a bit nervous about enabling that function without a _lot_ more testing.
>  And any special case added for MUL would seem to apply to any ALU
> instruction.  That tells me that there's probably other issues to shake out
> of the code before we can enable it.
>
> If you're interested, you should at least run the glean and piglits tests
> which exercise shaders and GPU programs.
>
> -Brian
>

Brian,

 I couldn't agree more about the testing and that is the stage where I
am at. If you read my patch you probably noticed that I had a few ALU
instructions ready to drop into the problematic slot. Of course there
is still the chance that the MUL problem is with the R600/R700 support
itself. So far I have had only 3 testers besides myself (all of them
radeon users).  None of them has reported any problems yet. I will run
piglet and try and get glean running however.

Please don't take my current work as a push to get it included yet.

Thanks for your original work.

Conn
-- 

Conn O. Clark

Observation: In formal computer science advances are made
by standing on the shoulders of giants. Linux has proved
that if there are enough of you, you can advance just as
far by stepping on each others toes.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-05-06 22:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-05 20:35 Possible fix to _mesa_remove_extra_moves function in shader/prog_optimize.c (request testing) Conn Clark
2010-05-06 21:50 ` Brian Paul
2010-05-06 22:27   ` Conn Clark

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.