public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
@ 2015-10-01 15:59 Michel Thierry
  2015-10-01 16:09 ` Chris Wilson
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Michel Thierry @ 2015-10-01 15:59 UTC (permalink / raw)
  To: intel-gfx

We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
range pt in gen6_for_each_pde").

But the static analyzer still complains that, just before we break due
to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
iter value that is bigger than I915_PDES. Of course, this isn't really
a problem since no one uses pt outside the macro. Still, every single
new usage of the macro will create a new issue for us to mark as a
false positive.

Also, Paulo re-started the discussion a while ago [1], but didn't end up
implemented.

In order to "solve" this "problem", this patch takes the ideas from
Chris and Dave, but that check would change the desired behavior of the
code, because the object (for example pdp->page_directory[iter]) can be
null during init/alloc, and C would take this as false, breaking the for
loop immediately.

This has been already verified with "static analysis tools".

[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 9fbb07d..94f8344 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
  */
 #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
 	for (iter = gen6_pde_index(start); \
-	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
+	     pt = (length > 0 && iter < I915_PDES) ? \
+			(pd)->page_table[iter] : NULL, \
+	     length > 0 && iter < I915_PDES; \
 	     iter++, \
 	     temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
 	     temp = min_t(unsigned, temp, length), \
@@ -459,7 +461,9 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
  */
 #define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
 	for (iter = gen8_pde_index(start); \
-	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES;	\
+	     pt = (length > 0 && iter < I915_PDES) ? \
+			(pd)->page_table[iter] : NULL, \
+	     length > 0 && iter < I915_PDES;	\
 	     iter++,				\
 	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
 	     temp = min(temp, length),					\
@@ -467,7 +471,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 
 #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
 	for (iter = gen8_pdpe_index(start); \
-	     pd = (pdp)->page_directory[iter], \
+	     pd = (length > 0 && (iter < I915_PDPES_PER_PDP(dev))) ? \
+			(pdp)->page_directory[iter] : NULL, \
 	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
 	     iter++,				\
 	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
@@ -476,7 +481,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 
 #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
 	for (iter = gen8_pml4e_index(start);	\
-	     pdp = (pml4)->pdps[iter], \
+	     pdp = (length > 0 && iter < GEN8_PML4ES_PER_PML4) ? \
+			(pml4)->pdps[iter] : NULL, \
 	     length > 0 && iter < GEN8_PML4ES_PER_PML4; \
 	     iter++,				\
 	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
-- 
2.6.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry
@ 2015-10-01 16:09 ` Chris Wilson
  2015-10-02 12:47   ` Michel Thierry
  2015-10-02  7:58 ` Daniel Vetter
  2015-10-02 13:16 ` [PATCH v2] " Michel Thierry
  2 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2015-10-01 16:09 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> range pt in gen6_for_each_pde").
> 
> But the static analyzer still complains that, just before we break due
> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> iter value that is bigger than I915_PDES. Of course, this isn't really
> a problem since no one uses pt outside the macro. Still, every single
> new usage of the macro will create a new issue for us to mark as a
> false positive.
> 
> Also, Paulo re-started the discussion a while ago [1], but didn't end up
> implemented.
> 
> In order to "solve" this "problem", this patch takes the ideas from
> Chris and Dave, but that check would change the desired behavior of the
> code, because the object (for example pdp->page_directory[iter]) can be
> null during init/alloc, and C would take this as false, breaking the for
> loop immediately.
> 
> This has been already verified with "static analysis tools".
> 
> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> 
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 9fbb07d..94f8344 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
>   */
>  #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>  	for (iter = gen6_pde_index(start); \
> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
> +	     pt = (length > 0 && iter < I915_PDES) ? \
> +			(pd)->page_table[iter] : NULL, \
> +	     length > 0 && iter < I915_PDES; \

length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0,

as the compiler wouldn't be able to CSE it otherwise (I think).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry
  2015-10-01 16:09 ` Chris Wilson
@ 2015-10-02  7:58 ` Daniel Vetter
  2015-10-02  8:58   ` Chris Wilson
  2015-10-02 13:16 ` [PATCH v2] " Michel Thierry
  2 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2015-10-02  7:58 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> range pt in gen6_for_each_pde").
> 
> But the static analyzer still complains that, just before we break due
> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> iter value that is bigger than I915_PDES. Of course, this isn't really
> a problem since no one uses pt outside the macro. Still, every single
> new usage of the macro will create a new issue for us to mark as a
> false positive.
> 
> Also, Paulo re-started the discussion a while ago [1], but didn't end up
> implemented.
> 
> In order to "solve" this "problem", this patch takes the ideas from
> Chris and Dave, but that check would change the desired behavior of the
> code, because the object (for example pdp->page_directory[iter]) can be
> null during init/alloc, and C would take this as false, breaking the for
> loop immediately.
> 
> This has been already verified with "static analysis tools".
> 
> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> 
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>

So maybe I'm dense and not seeing what's really going on, but the only
thing we seem to be doing is create a pointer to arr[SIZE], i.e. a pointer
to the element right after the last valid one. Pointer arithmetic and
comparison are explicitly allowed by the C standard on such a pointer. The
only thing not allowed is dereference it (which we don't seem to be doing
here).

The reason for this is exactly this case here, not allowing this means
checking whether you've run off the end of an array often becomes very
unnatural.

So if there's nothing else going on I think the right choice is to mark
them all up as false positives - the tool is wrong.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 9fbb07d..94f8344 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
>   */
>  #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>  	for (iter = gen6_pde_index(start); \
> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
> +	     pt = (length > 0 && iter < I915_PDES) ? \
> +			(pd)->page_table[iter] : NULL, \
> +	     length > 0 && iter < I915_PDES; \
>  	     iter++, \
>  	     temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
>  	     temp = min_t(unsigned, temp, length), \
> @@ -459,7 +461,9 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>   */
>  #define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
>  	for (iter = gen8_pde_index(start); \
> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES;	\
> +	     pt = (length > 0 && iter < I915_PDES) ? \
> +			(pd)->page_table[iter] : NULL, \
> +	     length > 0 && iter < I915_PDES;	\
>  	     iter++,				\
>  	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
>  	     temp = min(temp, length),					\
> @@ -467,7 +471,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>  
>  #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
>  	for (iter = gen8_pdpe_index(start); \
> -	     pd = (pdp)->page_directory[iter], \
> +	     pd = (length > 0 && (iter < I915_PDPES_PER_PDP(dev))) ? \
> +			(pdp)->page_directory[iter] : NULL, \
>  	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
>  	     iter++,				\
>  	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
> @@ -476,7 +481,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>  
>  #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
>  	for (iter = gen8_pml4e_index(start);	\
> -	     pdp = (pml4)->pdps[iter], \
> +	     pdp = (length > 0 && iter < GEN8_PML4ES_PER_PML4) ? \
> +			(pml4)->pdps[iter] : NULL, \
>  	     length > 0 && iter < GEN8_PML4ES_PER_PML4; \
>  	     iter++,				\
>  	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
> -- 
> 2.6.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-02  7:58 ` Daniel Vetter
@ 2015-10-02  8:58   ` Chris Wilson
  2015-10-02 10:52     ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2015-10-02  8:58 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Fri, Oct 02, 2015 at 09:58:05AM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
> > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > range pt in gen6_for_each_pde").
> > 
> > But the static analyzer still complains that, just before we break due
> > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > iter value that is bigger than I915_PDES. Of course, this isn't really
> > a problem since no one uses pt outside the macro. Still, every single
> > new usage of the macro will create a new issue for us to mark as a
> > false positive.
> > 
> > Also, Paulo re-started the discussion a while ago [1], but didn't end up
> > implemented.
> > 
> > In order to "solve" this "problem", this patch takes the ideas from
> > Chris and Dave, but that check would change the desired behavior of the
> > code, because the object (for example pdp->page_directory[iter]) can be
> > null during init/alloc, and C would take this as false, breaking the for
> > loop immediately.
> > 
> > This has been already verified with "static analysis tools".
> > 
> > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > 
> > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dave Gordon <david.s.gordon@intel.com>
> > Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> 
> So maybe I'm dense and not seeing what's really going on, but the only
> thing we seem to be doing is create a pointer to arr[SIZE], i.e. a pointer
> to the element right after the last valid one. Pointer arithmetic and
> comparison are explicitly allowed by the C standard on such a pointer. The
> only thing not allowed is dereference it (which we don't seem to be doing
> here).

You're thinking of &(pd)->page_table[iter] (i.e. (pd)->page_table +
iter). There is an apparent dereference here of (pd)->page_table[ITER_SIZE].
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-02  8:58   ` Chris Wilson
@ 2015-10-02 10:52     ` Daniel Vetter
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2015-10-02 10:52 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Michel Thierry, intel-gfx

On Fri, Oct 02, 2015 at 09:58:08AM +0100, Chris Wilson wrote:
> On Fri, Oct 02, 2015 at 09:58:05AM +0200, Daniel Vetter wrote:
> > On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
> > > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > > range pt in gen6_for_each_pde").
> > > 
> > > But the static analyzer still complains that, just before we break due
> > > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > > iter value that is bigger than I915_PDES. Of course, this isn't really
> > > a problem since no one uses pt outside the macro. Still, every single
> > > new usage of the macro will create a new issue for us to mark as a
> > > false positive.
> > > 
> > > Also, Paulo re-started the discussion a while ago [1], but didn't end up
> > > implemented.
> > > 
> > > In order to "solve" this "problem", this patch takes the ideas from
> > > Chris and Dave, but that check would change the desired behavior of the
> > > code, because the object (for example pdp->page_directory[iter]) can be
> > > null during init/alloc, and C would take this as false, breaking the for
> > > loop immediately.
> > > 
> > > This has been already verified with "static analysis tools".
> > > 
> > > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > > 
> > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dave Gordon <david.s.gordon@intel.com>
> > > Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> > 
> > So maybe I'm dense and not seeing what's really going on, but the only
> > thing we seem to be doing is create a pointer to arr[SIZE], i.e. a pointer
> > to the element right after the last valid one. Pointer arithmetic and
> > comparison are explicitly allowed by the C standard on such a pointer. The
> > only thing not allowed is dereference it (which we don't seem to be doing
> > here).
> 
> You're thinking of &(pd)->page_table[iter] (i.e. (pd)->page_table +
> iter). There is an apparent dereference here of (pd)->page_table[ITER_SIZE].

Oh right.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-01 16:09 ` Chris Wilson
@ 2015-10-02 12:47   ` Michel Thierry
  2015-10-02 12:58     ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Michel Thierry @ 2015-10-02 12:47 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 10/1/2015 5:09 PM, Chris Wilson wrote:
> On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
>> ---
>>   drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index 9fbb07d..94f8344 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
>>    */
>>   #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>>   	for (iter = gen6_pde_index(start); \
>> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
>> +	     pt = (length > 0 && iter < I915_PDES) ? \
>> +			(pd)->page_table[iter] : NULL, \
>> +	     length > 0 && iter < I915_PDES; \
>
> length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0,
>
> as the compiler wouldn't be able to CSE it otherwise (I think).

Even after that change, the compiler keeps doing an optimization when 
page_table[iter] is null (takes the null assignment as the break condition).

I've been playing with these examples
http://paste.ubuntu.com/12638106/

Only the 1st example (a) iterates over all elements, b & c stop after 
the 1st run.

Thanks,

-Michel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-02 12:47   ` Michel Thierry
@ 2015-10-02 12:58     ` Chris Wilson
  2015-10-02 13:11       ` Michel Thierry
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2015-10-02 12:58 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Fri, Oct 02, 2015 at 01:47:03PM +0100, Michel Thierry wrote:
> On 10/1/2015 5:09 PM, Chris Wilson wrote:
> >On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
> >>---
> >>  drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
> >>  1 file changed, 10 insertions(+), 4 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>index 9fbb07d..94f8344 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>@@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
> >>   */
> >>  #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
> >>  	for (iter = gen6_pde_index(start); \
> >>-	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
> >>+	     pt = (length > 0 && iter < I915_PDES) ? \
> >>+			(pd)->page_table[iter] : NULL, \
> >>+	     length > 0 && iter < I915_PDES; \
> >
> >length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0,
> >
> >as the compiler wouldn't be able to CSE it otherwise (I think).
> 
> Even after that change, the compiler keeps doing an optimization
> when page_table[iter] is null (takes the null assignment as the
> break condition).
> 
> I've been playing with these examples
> http://paste.ubuntu.com/12638106/
> 
> Only the 1st example (a) iterates over all elements, b & c stop
> after the 1st run.

Forgot that was the condition you wanted to change.

length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]), 1 : 0,

Would be nice to prove that length > 0 implies iter < I915_PDES. If only
we had smart tools :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-02 12:58     ` Chris Wilson
@ 2015-10-02 13:11       ` Michel Thierry
  0 siblings, 0 replies; 22+ messages in thread
From: Michel Thierry @ 2015-10-02 13:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Paulo Zanoni, Dave Gordon, daniel

On 10/2/2015 1:58 PM, Chris Wilson wrote:
> On Fri, Oct 02, 2015 at 01:47:03PM +0100, Michel Thierry wrote:
>> On 10/1/2015 5:09 PM, Chris Wilson wrote:
>>> On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote:
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
>>>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> index 9fbb07d..94f8344 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> @@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
>>>>    */
>>>>   #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>>>>   	for (iter = gen6_pde_index(start); \
>>>> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
>>>> +	     pt = (length > 0 && iter < I915_PDES) ? \
>>>> +			(pd)->page_table[iter] : NULL, \
>>>> +	     length > 0 && iter < I915_PDES; \
>>>
>>> length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0,
>>>
>>> as the compiler wouldn't be able to CSE it otherwise (I think).
>>
>> Even after that change, the compiler keeps doing an optimization
>> when page_table[iter] is null (takes the null assignment as the
>> break condition).
>>
>> I've been playing with these examples
>> http://paste.ubuntu.com/12638106/
>>
>> Only the 1st example (a) iterates over all elements, b & c stop
>> after the 1st run.
>
> Forgot that was the condition you wanted to change.
>
> length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]), 1 : 0,
>
> Would be nice to prove that length > 0 implies iter < I915_PDES. If only
> we had smart tools :)

Thanks, that made it.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry
  2015-10-01 16:09 ` Chris Wilson
  2015-10-02  7:58 ` Daniel Vetter
@ 2015-10-02 13:16 ` Michel Thierry
  2015-10-05 16:36   ` Dave Gordon
  2 siblings, 1 reply; 22+ messages in thread
From: Michel Thierry @ 2015-10-02 13:16 UTC (permalink / raw)
  To: intel-gfx

We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
range pt in gen6_for_each_pde").

But the static analyzer still complains that, just before we break due
to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
iter value that is bigger than I915_PDES. Of course, this isn't really
a problem since no one uses pt outside the macro. Still, every single
new usage of the macro will create a new issue for us to mark as a
false positive.

Also, Paulo re-started the discussion a while ago [1], but didn't end up
implemented.

In order to "solve" this "problem", this patch takes the ideas from
Chris and Dave, but that check would change the desired behavior of the
code, because the object (for example pdp->page_directory[iter]) can be
null during init/alloc, and C would take this as false, breaking the for
loop immediately.

This has been already verified with "static analysis tools".

[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html

v2: Make it a single statement, while preventing the common subexpression
elimination (Chris)

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 9fbb07d..a216397 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -394,7 +394,8 @@ struct i915_hw_ppgtt {
  */
 #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
 	for (iter = gen6_pde_index(start); \
-	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
+	     length > 0 && iter < I915_PDES ? \
+			(pt = (pd)->page_table[iter]), 1 : 0; \
 	     iter++, \
 	     temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
 	     temp = min_t(unsigned, temp, length), \
@@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
  */
 #define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
 	for (iter = gen8_pde_index(start); \
-	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES;	\
+	     length > 0 && iter < I915_PDES ? \
+			(pt = (pd)->page_table[iter]), 1 : 0; \
 	     iter++,				\
 	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
 	     temp = min(temp, length),					\
@@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 
 #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
 	for (iter = gen8_pdpe_index(start); \
-	     pd = (pdp)->page_directory[iter], \
-	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
+	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
+			(pd = (pdp)->page_directory[iter]), 1 : 0; \
 	     iter++,				\
 	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
 	     temp = min(temp, length),					\
@@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 
 #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
 	for (iter = gen8_pml4e_index(start);	\
-	     pdp = (pml4)->pdps[iter], \
-	     length > 0 && iter < GEN8_PML4ES_PER_PML4; \
+	     length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
+			(pdp = (pml4)->pdps[iter]), 1 : 0; \
 	     iter++,				\
 	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
 	     temp = min(temp, length),					\
-- 
2.6.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-02 13:16 ` [PATCH v2] " Michel Thierry
@ 2015-10-05 16:36   ` Dave Gordon
  2015-10-05 16:59     ` Michel Thierry
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Gordon @ 2015-10-05 16:36 UTC (permalink / raw)
  To: Michel Thierry, intel-gfx

On 02/10/15 14:16, Michel Thierry wrote:
> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> range pt in gen6_for_each_pde").
>
> But the static analyzer still complains that, just before we break due
> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> iter value that is bigger than I915_PDES. Of course, this isn't really
> a problem since no one uses pt outside the macro. Still, every single
> new usage of the macro will create a new issue for us to mark as a
> false positive.
>
> Also, Paulo re-started the discussion a while ago [1], but didn't end up
> implemented.
>
> In order to "solve" this "problem", this patch takes the ideas from
> Chris and Dave, but that check would change the desired behavior of the
> code, because the object (for example pdp->page_directory[iter]) can be
> null during init/alloc, and C would take this as false, breaking the for
> loop immediately.
>
> This has been already verified with "static analysis tools".
>
> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
>
> v2: Make it a single statement, while preventing the common subexpression
> elimination (Chris)
>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------
>   1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 9fbb07d..a216397 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -394,7 +394,8 @@ struct i915_hw_ppgtt {
>    */
>   #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>   	for (iter = gen6_pde_index(start); \
> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
> +	     length > 0 && iter < I915_PDES ? \
> +			(pt = (pd)->page_table[iter]), 1 : 0; \
>   	     iter++, \
>   	     temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
>   	     temp = min_t(unsigned, temp, length), \
> @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>    */
>   #define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
>   	for (iter = gen8_pde_index(start); \
> -	     pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES;	\
> +	     length > 0 && iter < I915_PDES ? \
> +			(pt = (pd)->page_table[iter]), 1 : 0; \
>   	     iter++,				\
>   	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
>   	     temp = min(temp, length),					\
> @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>
>   #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
>   	for (iter = gen8_pdpe_index(start); \
> -	     pd = (pdp)->page_directory[iter], \
> -	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
> +	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
> +			(pd = (pdp)->page_directory[iter]), 1 : 0; \
>   	     iter++,				\
>   	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
>   	     temp = min(temp, length),					\
> @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>
>   #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
>   	for (iter = gen8_pml4e_index(start);	\
> -	     pdp = (pml4)->pdps[iter], \
> -	     length > 0 && iter < GEN8_PML4ES_PER_PML4; \
> +	     length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
> +			(pdp = (pml4)->pdps[iter]), 1 : 0; \

this won't compile -- see below

>   	     iter++,				\
>   	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
>   	     temp = min(temp, length),					\

The man page for C operators tells us:

        Operator                             Associativity
        () [] -> .                           left to right
        ! ~ ++ -- + - (type) * & sizeof      right to left
        * / %                                left to right
        + -                                  left to right
        << >>                                left to right
        < <= > >=                            left to right
        == !=                                left to right
        &                                    left to right
        ^                                    left to right
        |                                    left to right
        &&                                   left to right
        ||                                   left to right
        ?:                                   right to left
        = += -= *= /= %= <<= >>= &= ^= |=    right to left
        ,                                    left to right

So there's a problem with the above code, because the comma operator is 
LOWER precedence than either assignment or ?: You'd need to put the 
parentheses around the (pdp = ... , 1) section, not just the assignment.

Or for yet another variation, how about:

#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
         for (iter = gen8_pdpe_index(start);                            \
              iter < I915_PDPES_PER_PDP(dev) &&                         \
                 (pd = (pdp)->page_directory[iter], length > 0);        \
              iter++,                                                   \
              temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,      \
              temp = min(temp, length),                                 \
              start += temp, length -= temp)

#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
         for (iter = gen8_pml4e_index(start);                           \
              iter < GEN8_PML4ES_PER_PML4 &&                            \
                 (pdp = (pml4)->pdps[iter], length > 0);                \
              iter++,                                                   \
              temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
              temp = min(temp, length),                                 \
              start += temp, length -= temp)

with no ugly ?: at all :)

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-05 16:36   ` Dave Gordon
@ 2015-10-05 16:59     ` Michel Thierry
  2015-10-06  8:38       ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Michel Thierry @ 2015-10-05 16:59 UTC (permalink / raw)
  To: Dave Gordon, intel-gfx

On 10/5/2015 5:36 PM, Dave Gordon wrote:
> On 02/10/15 14:16, Michel Thierry wrote:
>> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
>> range pt in gen6_for_each_pde").
>>
>> But the static analyzer still complains that, just before we break due
>> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
>> iter value that is bigger than I915_PDES. Of course, this isn't really
>> a problem since no one uses pt outside the macro. Still, every single
>> new usage of the macro will create a new issue for us to mark as a
>> false positive.
>>
>> Also, Paulo re-started the discussion a while ago [1], but didn't end up
>> implemented.
>>
>> In order to "solve" this "problem", this patch takes the ideas from
>> Chris and Dave, but that check would change the desired behavior of the
>> code, because the object (for example pdp->page_directory[iter]) can be
>> null during init/alloc, and C would take this as false, breaking the for
>> loop immediately.
>>
>> This has been already verified with "static analysis tools".
>>
>> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
>>
>> v2: Make it a single statement, while preventing the common subexpression
>> elimination (Chris)
>>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Dave Gordon <david.s.gordon@intel.com>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------
>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> index 9fbb07d..a216397 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>> @@ -394,7 +394,8 @@ struct i915_hw_ppgtt {
>>    */
>>   #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>>       for (iter = gen6_pde_index(start); \
>> -         pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
>> +         length > 0 && iter < I915_PDES ? \
>> +            (pt = (pd)->page_table[iter]), 1 : 0; \
>>            iter++, \
>>            temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
>>            temp = min_t(unsigned, temp, length), \
>> @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>    */
>>   #define gen8_for_each_pde(pt, pd, start, length, temp, iter)        \
>>       for (iter = gen8_pde_index(start); \
>> -         pt = (pd)->page_table[iter], length > 0 && iter <
>> I915_PDES;    \
>> +         length > 0 && iter < I915_PDES ? \
>> +            (pt = (pd)->page_table[iter]), 1 : 0; \
>>            iter++,                \
>>            temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,    \
>>            temp = min(temp, length),                    \
>> @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>
>>   #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)    \
>>       for (iter = gen8_pdpe_index(start); \
>> -         pd = (pdp)->page_directory[iter], \
>> -         length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
>> +         length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
>> +            (pd = (pdp)->page_directory[iter]), 1 : 0; \
>>            iter++,                \
>>            temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,    \
>>            temp = min(temp, length),                    \
>> @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>
>>   #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)    \
>>       for (iter = gen8_pml4e_index(start);    \
>> -         pdp = (pml4)->pdps[iter], \
>> -         length > 0 && iter < GEN8_PML4ES_PER_PML4; \
>> +         length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
>> +            (pdp = (pml4)->pdps[iter]), 1 : 0; \
>
> this won't compile -- see below

Hmm, it compiled (also got rid of of the "analysis tool error" and 
didn't see any behavior change).

>
>>            iter++,                \
>>            temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,    \
>>            temp = min(temp, length),                    \
>
> The man page for C operators tells us:
>
>         Operator                             Associativity
>         () [] -> .                           left to right
>         ! ~ ++ -- + - (type) * & sizeof      right to left
>         * / %                                left to right
>         + -                                  left to right
>         << >>                                left to right
>         < <= > >=                            left to right
>         == !=                                left to right
>         &                                    left to right
>         ^                                    left to right
>         |                                    left to right
>         &&                                   left to right
>         ||                                   left to right
>         ?:                                   right to left
>         = += -= *= /= %= <<= >>= &= ^= |=    right to left
>         ,                                    left to right
>
> So there's a problem with the above code, because the comma operator is
> LOWER precedence than either assignment or ?: You'd need to put the
> parentheses around the (pdp = ... , 1) section, not just the assignment.
>
> Or for yet another variation, how about:
>
> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
>          for (iter = gen8_pdpe_index(start);                            \
>               iter < I915_PDPES_PER_PDP(dev) &&                         \
>                  (pd = (pdp)->page_directory[iter], length > 0);        \
>               iter++,                                                   \
>               temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,      \
>               temp = min(temp, length),                                 \
>               start += temp, length -= temp)
>
> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
>          for (iter = gen8_pml4e_index(start);                           \
>               iter < GEN8_PML4ES_PER_PML4 &&                            \
>                  (pdp = (pml4)->pdps[iter], length > 0);                \
>               iter++,                                                   \
>               temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
>               temp = min(temp, length),                                 \
>               start += temp, length -= temp)
>
> with no ugly ?: at all :)

This also works.

Since it's a change for not a real issue, I don't have a preference.
I can send either.

Thanks,

>
> .Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-05 16:59     ` Michel Thierry
@ 2015-10-06  8:38       ` Daniel Vetter
  2015-10-06  9:43         ` Chris Wilson
  2015-10-06 11:21         ` Dave Gordon
  0 siblings, 2 replies; 22+ messages in thread
From: Daniel Vetter @ 2015-10-06  8:38 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx

On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
> On 10/5/2015 5:36 PM, Dave Gordon wrote:
> >On 02/10/15 14:16, Michel Thierry wrote:
> >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> >>range pt in gen6_for_each_pde").
> >>
> >>But the static analyzer still complains that, just before we break due
> >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> >>iter value that is bigger than I915_PDES. Of course, this isn't really
> >>a problem since no one uses pt outside the macro. Still, every single
> >>new usage of the macro will create a new issue for us to mark as a
> >>false positive.
> >>
> >>Also, Paulo re-started the discussion a while ago [1], but didn't end up
> >>implemented.
> >>
> >>In order to "solve" this "problem", this patch takes the ideas from
> >>Chris and Dave, but that check would change the desired behavior of the
> >>code, because the object (for example pdp->page_directory[iter]) can be
> >>null during init/alloc, and C would take this as false, breaking the for
> >>loop immediately.
> >>
> >>This has been already verified with "static analysis tools".
> >>
> >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> >>
> >>v2: Make it a single statement, while preventing the common subexpression
> >>elimination (Chris)
> >>
> >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>Cc: Dave Gordon <david.s.gordon@intel.com>
> >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------
> >>  1 file changed, 8 insertions(+), 6 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>index 9fbb07d..a216397 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> >>@@ -394,7 +394,8 @@ struct i915_hw_ppgtt {
> >>   */
> >>  #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
> >>      for (iter = gen6_pde_index(start); \
> >>-         pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
> >>+         length > 0 && iter < I915_PDES ? \
> >>+            (pt = (pd)->page_table[iter]), 1 : 0; \
> >>           iter++, \
> >>           temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
> >>           temp = min_t(unsigned, temp, length), \
> >>@@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
> >>   */
> >>  #define gen8_for_each_pde(pt, pd, start, length, temp, iter)        \
> >>      for (iter = gen8_pde_index(start); \
> >>-         pt = (pd)->page_table[iter], length > 0 && iter <
> >>I915_PDES;    \
> >>+         length > 0 && iter < I915_PDES ? \
> >>+            (pt = (pd)->page_table[iter]), 1 : 0; \
> >>           iter++,                \
> >>           temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,    \
> >>           temp = min(temp, length),                    \
> >>@@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
> >>
> >>  #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)    \
> >>      for (iter = gen8_pdpe_index(start); \
> >>-         pd = (pdp)->page_directory[iter], \
> >>-         length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
> >>+         length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
> >>+            (pd = (pdp)->page_directory[iter]), 1 : 0; \
> >>           iter++,                \
> >>           temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,    \
> >>           temp = min(temp, length),                    \
> >>@@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
> >>
> >>  #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)    \
> >>      for (iter = gen8_pml4e_index(start);    \
> >>-         pdp = (pml4)->pdps[iter], \
> >>-         length > 0 && iter < GEN8_PML4ES_PER_PML4; \
> >>+         length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
> >>+            (pdp = (pml4)->pdps[iter]), 1 : 0; \
> >
> >this won't compile -- see below
> 
> Hmm, it compiled (also got rid of of the "analysis tool error" and didn't
> see any behavior change).
> 
> >
> >>           iter++,                \
> >>           temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,    \
> >>           temp = min(temp, length),                    \
> >
> >The man page for C operators tells us:
> >
> >        Operator                             Associativity
> >        () [] -> .                           left to right
> >        ! ~ ++ -- + - (type) * & sizeof      right to left
> >        * / %                                left to right
> >        + -                                  left to right
> >        << >>                                left to right
> >        < <= > >=                            left to right
> >        == !=                                left to right
> >        &                                    left to right
> >        ^                                    left to right
> >        |                                    left to right
> >        &&                                   left to right
> >        ||                                   left to right
> >        ?:                                   right to left
> >        = += -= *= /= %= <<= >>= &= ^= |=    right to left
> >        ,                                    left to right
> >
> >So there's a problem with the above code, because the comma operator is
> >LOWER precedence than either assignment or ?: You'd need to put the
> >parentheses around the (pdp = ... , 1) section, not just the assignment.
> >
> >Or for yet another variation, how about:
> >
> >#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
> >         for (iter = gen8_pdpe_index(start);                            \
> >              iter < I915_PDPES_PER_PDP(dev) &&                         \
> >                 (pd = (pdp)->page_directory[iter], length > 0);        \
> >              iter++,                                                   \
> >              temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,      \
> >              temp = min(temp, length),                                 \
> >              start += temp, length -= temp)
> >
> >#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
> >         for (iter = gen8_pml4e_index(start);                           \
> >              iter < GEN8_PML4ES_PER_PML4 &&                            \
> >                 (pdp = (pml4)->pdps[iter], length > 0);                \
> >              iter++,                                                   \
> >              temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
> >              temp = min(temp, length),                                 \
> >              start += temp, length -= temp)
> >
> >with no ugly ?: at all :)
> 
> This also works.
> 
> Since it's a change for not a real issue, I don't have a preference.
> I can send either.

Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
middle and always parses it as a ? (b) : c. If lower-level operators in
the middle could split the ternary operator then it would result in
parsing fail (sinc ? without the : is just useless). So lgtm. Someone
willing to smack an r-b onto the patch?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-06  8:38       ` Daniel Vetter
@ 2015-10-06  9:43         ` Chris Wilson
  2015-10-06 10:09           ` Daniel Vetter
  2015-10-06 11:21         ` Dave Gordon
  1 sibling, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2015-10-06  9:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote:
> On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
> > On 10/5/2015 5:36 PM, Dave Gordon wrote:
> > >On 02/10/15 14:16, Michel Thierry wrote:
> > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > >>range pt in gen6_for_each_pde").
> > >>
> > >>But the static analyzer still complains that, just before we break due
> > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > >>iter value that is bigger than I915_PDES. Of course, this isn't really
> > >>a problem since no one uses pt outside the macro. Still, every single
> > >>new usage of the macro will create a new issue for us to mark as a
> > >>false positive.
> > >>
> > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up
> > >>implemented.
> > >>
> > >>In order to "solve" this "problem", this patch takes the ideas from
> > >>Chris and Dave, but that check would change the desired behavior of the
> > >>code, because the object (for example pdp->page_directory[iter]) can be
> > >>null during init/alloc, and C would take this as false, breaking the for
> > >>loop immediately.
> > >>
> > >>This has been already verified with "static analysis tools".
> > >>
> > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > >>
> > >>v2: Make it a single statement, while preventing the common subexpression
> > >>elimination (Chris)
> > >>
> > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > >>Cc: Dave Gordon <david.s.gordon@intel.com>
> > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>

> Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
> middle and always parses it as a ? (b) : c. If lower-level operators in
> the middle could split the ternary operator then it would result in
> parsing fail (sinc ? without the : is just useless). So lgtm. Someone
> willing to smack an r-b onto the patch?

I think it's good enough.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Something to consider is that ppgtt_insert is 10x slower than
ppgtt_clear, and that some workloads (admittedly not 48b!) spend a
disproportionate amount of time changing PTE. If you have ideas for
spending up insertion, feel free to experiment!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-06  9:43         ` Chris Wilson
@ 2015-10-06 10:09           ` Daniel Vetter
  2015-10-06 10:19             ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2015-10-06 10:09 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Michel Thierry, intel-gfx

On Tue, Oct 06, 2015 at 10:43:48AM +0100, Chris Wilson wrote:
> On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote:
> > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
> > > On 10/5/2015 5:36 PM, Dave Gordon wrote:
> > > >On 02/10/15 14:16, Michel Thierry wrote:
> > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > > >>range pt in gen6_for_each_pde").
> > > >>
> > > >>But the static analyzer still complains that, just before we break due
> > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > > >>iter value that is bigger than I915_PDES. Of course, this isn't really
> > > >>a problem since no one uses pt outside the macro. Still, every single
> > > >>new usage of the macro will create a new issue for us to mark as a
> > > >>false positive.
> > > >>
> > > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up
> > > >>implemented.
> > > >>
> > > >>In order to "solve" this "problem", this patch takes the ideas from
> > > >>Chris and Dave, but that check would change the desired behavior of the
> > > >>code, because the object (for example pdp->page_directory[iter]) can be
> > > >>null during init/alloc, and C would take this as false, breaking the for
> > > >>loop immediately.
> > > >>
> > > >>This has been already verified with "static analysis tools".
> > > >>
> > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > > >>
> > > >>v2: Make it a single statement, while preventing the common subexpression
> > > >>elimination (Chris)
> > > >>
> > > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > > >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > >>Cc: Dave Gordon <david.s.gordon@intel.com>
> > > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> 
> > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
> > middle and always parses it as a ? (b) : c. If lower-level operators in
> > the middle could split the ternary operator then it would result in
> > parsing fail (sinc ? without the : is just useless). So lgtm. Someone
> > willing to smack an r-b onto the patch?
> 
> I think it's good enough.
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Queued for -next, thanks for the patch.

> Something to consider is that ppgtt_insert is 10x slower than
> ppgtt_clear, and that some workloads (admittedly not 48b!) spend a
> disproportionate amount of time changing PTE. If you have ideas for
> spending up insertion, feel free to experiment!

Hm, where do we waste all that time? 10x slower is pretty impressive since
on a quick look I can only see the sg table walk as the additional bit of
memory traversals insert does on top of clear ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-06 10:09           ` Daniel Vetter
@ 2015-10-06 10:19             ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2015-10-06 10:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, Oct 06, 2015 at 12:09:51PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 10:43:48AM +0100, Chris Wilson wrote:
> > On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote:
> > > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
> > > > On 10/5/2015 5:36 PM, Dave Gordon wrote:
> > > > >On 02/10/15 14:16, Michel Thierry wrote:
> > > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > > > >>range pt in gen6_for_each_pde").
> > > > >>
> > > > >>But the static analyzer still complains that, just before we break due
> > > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > > > >>iter value that is bigger than I915_PDES. Of course, this isn't really
> > > > >>a problem since no one uses pt outside the macro. Still, every single
> > > > >>new usage of the macro will create a new issue for us to mark as a
> > > > >>false positive.
> > > > >>
> > > > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up
> > > > >>implemented.
> > > > >>
> > > > >>In order to "solve" this "problem", this patch takes the ideas from
> > > > >>Chris and Dave, but that check would change the desired behavior of the
> > > > >>code, because the object (for example pdp->page_directory[iter]) can be
> > > > >>null during init/alloc, and C would take this as false, breaking the for
> > > > >>loop immediately.
> > > > >>
> > > > >>This has been already verified with "static analysis tools".
> > > > >>
> > > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > > > >>
> > > > >>v2: Make it a single statement, while preventing the common subexpression
> > > > >>elimination (Chris)
> > > > >>
> > > > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > > > >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > >>Cc: Dave Gordon <david.s.gordon@intel.com>
> > > > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> > 
> > > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
> > > middle and always parses it as a ? (b) : c. If lower-level operators in
> > > the middle could split the ternary operator then it would result in
> > > parsing fail (sinc ? without the : is just useless). So lgtm. Someone
> > > willing to smack an r-b onto the patch?
> > 
> > I think it's good enough.
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Queued for -next, thanks for the patch.
> 
> > Something to consider is that ppgtt_insert is 10x slower than
> > ppgtt_clear, and that some workloads (admittedly not 48b!) spend a
> > disproportionate amount of time changing PTE. If you have ideas for
> > spending up insertion, feel free to experiment!
> 
> Hm, where do we waste all that time? 10x slower is pretty impressive since
> on a quick look I can only see the sg table walk as the additional bit of
> memory traversals insert does on top of clear ...

The sg_page_iter claims top spot, followed by the memory dereferences
(at a guess, it doesn't seem the individual sg struct isn't dense enough
to be cache friendly or maybe we should just cachealign it?). We can make
substantial improvement by opencoding the page iter (under the assumption
that we do not have page coallescing) like:

http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=nightly&id=f9baf155c3096058ef9aeeb39b586713291efc56

with that and eliminating the ivb_pte_encode() indirection gets us to
only be ~5x slower than clearing. It is not pretty (and is rather
cavalier about the last page), but at that point it seems to be memory bound.

However, in this particular benchmark where inserting ppgtt dominates
the kernel profile, we are GPU bound and improving the insert is lost in
the noise. (Unless we disable the GPU load and assume an infinitely fast
GPU like Skylake!)
-Chris


-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-06  8:38       ` Daniel Vetter
  2015-10-06  9:43         ` Chris Wilson
@ 2015-10-06 11:21         ` Dave Gordon
  2015-10-06 11:53           ` Chris Wilson
  1 sibling, 1 reply; 22+ messages in thread
From: Dave Gordon @ 2015-10-06 11:21 UTC (permalink / raw)
  To: Daniel Vetter, Michel Thierry; +Cc: intel-gfx

On 06/10/15 09:38, Daniel Vetter wrote:
> On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
>> On 10/5/2015 5:36 PM, Dave Gordon wrote:
>>> On 02/10/15 14:16, Michel Thierry wrote:
>>>> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
>>>> range pt in gen6_for_each_pde").
>>>>
>>>> But the static analyzer still complains that, just before we break due
>>>> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
>>>> iter value that is bigger than I915_PDES. Of course, this isn't really
>>>> a problem since no one uses pt outside the macro. Still, every single
>>>> new usage of the macro will create a new issue for us to mark as a
>>>> false positive.
>>>>
>>>> Also, Paulo re-started the discussion a while ago [1], but didn't end up
>>>> implemented.
>>>>
>>>> In order to "solve" this "problem", this patch takes the ideas from
>>>> Chris and Dave, but that check would change the desired behavior of the
>>>> code, because the object (for example pdp->page_directory[iter]) can be
>>>> null during init/alloc, and C would take this as false, breaking the for
>>>> loop immediately.
>>>>
>>>> This has been already verified with "static analysis tools".
>>>>
>>>> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
>>>>
>>>> v2: Make it a single statement, while preventing the common subexpression
>>>> elimination (Chris)
>>>>
>>>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Dave Gordon <david.s.gordon@intel.com>
>>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------
>>>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> b/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> index 9fbb07d..a216397 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
>>>> @@ -394,7 +394,8 @@ struct i915_hw_ppgtt {
>>>>    */
>>>>   #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
>>>>       for (iter = gen6_pde_index(start); \
>>>> -         pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
>>>> +         length > 0 && iter < I915_PDES ? \
>>>> +            (pt = (pd)->page_table[iter]), 1 : 0; \
>>>>            iter++, \
>>>>            temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
>>>>            temp = min_t(unsigned, temp, length), \
>>>> @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>>>    */
>>>>   #define gen8_for_each_pde(pt, pd, start, length, temp, iter)        \
>>>>       for (iter = gen8_pde_index(start); \
>>>> -         pt = (pd)->page_table[iter], length > 0 && iter <
>>>> I915_PDES;    \
>>>> +         length > 0 && iter < I915_PDES ? \
>>>> +            (pt = (pd)->page_table[iter]), 1 : 0; \
>>>>            iter++,                \
>>>>            temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,    \
>>>>            temp = min(temp, length),                    \
>>>> @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>>>
>>>>   #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)    \
>>>>       for (iter = gen8_pdpe_index(start); \
>>>> -         pd = (pdp)->page_directory[iter], \
>>>> -         length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
>>>> +         length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
>>>> +            (pd = (pdp)->page_directory[iter]), 1 : 0; \
>>>>            iter++,                \
>>>>            temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,    \
>>>>            temp = min(temp, length),                    \
>>>> @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>>>>
>>>>   #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)    \
>>>>       for (iter = gen8_pml4e_index(start);    \
>>>> -         pdp = (pml4)->pdps[iter], \
>>>> -         length > 0 && iter < GEN8_PML4ES_PER_PML4; \
>>>> +         length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
>>>> +            (pdp = (pml4)->pdps[iter]), 1 : 0; \
>>>
>>> this won't compile -- see below
>>
>> Hmm, it compiled (also got rid of of the "analysis tool error" and didn't
>> see any behavior change).
>>
>>>
>>>>            iter++,                \
>>>>            temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,    \
>>>>            temp = min(temp, length),                    \
>>>
>>> The man page for C operators tells us:
>>>
>>>         Operator                             Associativity
>>>         () [] -> .                           left to right
>>>         ! ~ ++ -- + - (type) * & sizeof      right to left
>>>         * / %                                left to right
>>>         + -                                  left to right
>>>         << >>                                left to right
>>>         < <= > >=                            left to right
>>>         == !=                                left to right
>>>         &                                    left to right
>>>         ^                                    left to right
>>>         |                                    left to right
>>>         &&                                   left to right
>>>         ||                                   left to right
>>>         ?:                                   right to left
>>>         = += -= *= /= %= <<= >>= &= ^= |=    right to left
>>>         ,                                    left to right
>>>
>>> So there's a problem with the above code, because the comma operator is
>>> LOWER precedence than either assignment or ?: You'd need to put the
>>> parentheses around the (pdp = ... , 1) section, not just the assignment.
>>>
>>> Or for yet another variation, how about:
>>>
>>> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)         \
>>>          for (iter = gen8_pdpe_index(start);                            \
>>>               iter < I915_PDPES_PER_PDP(dev) &&                         \
>>>                  (pd = (pdp)->page_directory[iter], length > 0);        \
>>>               iter++,                                                   \
>>>               temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,      \
>>>               temp = min(temp, length),                                 \
>>>               start += temp, length -= temp)
>>>
>>> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
>>>          for (iter = gen8_pml4e_index(start);                           \
>>>               iter < GEN8_PML4ES_PER_PML4 &&                            \
>>>                  (pdp = (pml4)->pdps[iter], length > 0);                \
>>>               iter++,                                                   \
>>>               temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
>>>               temp = min(temp, length),                                 \
>>>               start += temp, length -= temp)
>>>
>>> with no ugly ?: at all :)
>>
>> This also works.
>>
>> Since it's a change for not a real issue, I don't have a preference.
>> I can send either.
>
> Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
> middle and always parses it as a ? (b) : c. If lower-level operators in
> the middle could split the ternary operator then it would result in
> parsing fail (sinc ? without the : is just useless). So lgtm. Someone
> willing to smack an r-b onto the patch?
> -Daniel

Hmmm ... it failed to compile when I tried it yesterday, but I think I 
must have bodged the copypaste from the email to the header file, and 
then just assumed the error message meant something was wrong with the 
way all those low-precedence operators were stacked together. Anyway I 
just did it again and it compiles OK now, so it gets my

Reviewed-by: Dave Gordon <david.s.gordon@intel.com>

... although I still think my version is (slightly) easier to read. Or 
it could be improved even more by moving the increment of 'iter' to the 
end, making it one line shorter and perhaps helping the compiler a little :)

#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
         for (iter = gen8_pml4e_index(start);                           \
              iter < GEN8_PML4ES_PER_PML4 &&                            \
                 (pdp = (pml4)->pdps[iter], length > 0);                \
              temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
              temp = min(temp, length),                                 \
              start += temp, length -= temp, ++iter)

Or, noting that 'temp' is never used in the generated loop bodies, we 
could eliminate the parameter and make it local to the loop header :)

#define gen8_for_each_pml4e(pdp, pml4, start, length, iter)            \
         for (iter = gen8_pml4e_index(start);                           \
              iter < GEN8_PML4ES_PER_PML4 &&                            \
                 (pdp = (pml4)->pdps[iter], length > 0);                \
              ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);   \
                     temp = min(temp - start, length);                  \
                     start += temp, length -= temp; }),                 \
                 ++iter)

.Dave.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-06 11:21         ` Dave Gordon
@ 2015-10-06 11:53           ` Chris Wilson
  2015-12-03 15:29             ` Dave Gordon
  2015-12-08 13:30             ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon
  0 siblings, 2 replies; 22+ messages in thread
From: Chris Wilson @ 2015-10-06 11:53 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Oct 06, 2015 at 12:21:05PM +0100, Dave Gordon wrote:
> ... although I still think my version is (slightly) easier to read.
> Or it could be improved even more by moving the increment of 'iter'
> to the end, making it one line shorter and perhaps helping the
> compiler a little :)
> 
> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
>         for (iter = gen8_pml4e_index(start);                           \
>              iter < GEN8_PML4ES_PER_PML4 &&                            \
>                 (pdp = (pml4)->pdps[iter], length > 0);                \
>              temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
>              temp = min(temp, length),                                 \
>              start += temp, length -= temp, ++iter)

Shorter, yes, but we may as well take advantage of not using [iter] if
length == 0, so meh.
 
> Or, noting that 'temp' is never used in the generated loop bodies,
> we could eliminate the parameter and make it local to the loop
> header :)
> 
> #define gen8_for_each_pml4e(pdp, pml4, start, length, iter)            \
>         for (iter = gen8_pml4e_index(start);                           \
>              iter < GEN8_PML4ES_PER_PML4 &&                            \
>                 (pdp = (pml4)->pdps[iter], length > 0);                \
>              ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);   \
>                     temp = min(temp - start, length);                  \
>                     start += temp, length -= temp; }),                 \
>                 ++iter)

Removing extraneous parameters from macros is differently a usability win.
Care to spin a real patch so we can see how it looks in practice?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)
  2015-10-06 11:53           ` Chris Wilson
@ 2015-12-03 15:29             ` Dave Gordon
  2015-12-08 13:30             ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon
  1 sibling, 0 replies; 22+ messages in thread
From: Dave Gordon @ 2015-12-03 15:29 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Michel Thierry, intel-gfx

[-- Attachment #1: Type: text/plain, Size: 2155 bytes --]

On 06/10/15 12:53, Chris Wilson wrote:
> On Tue, Oct 06, 2015 at 12:21:05PM +0100, Dave Gordon wrote:
>> ... although I still think my version is (slightly) easier to read.
>> Or it could be improved even more by moving the increment of 'iter'
>> to the end, making it one line shorter and perhaps helping the
>> compiler a little :)
>>
>> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)      \
>>          for (iter = gen8_pml4e_index(start);                           \
>>               iter < GEN8_PML4ES_PER_PML4 &&                            \
>>                  (pdp = (pml4)->pdps[iter], length > 0);                \
>>               temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
>>               temp = min(temp, length),                                 \
>>               start += temp, length -= temp, ++iter)
>
> Shorter, yes, but we may as well take advantage of not using [iter] if
> length == 0, so meh.
>
>> Or, noting that 'temp' is never used in the generated loop bodies,
>> we could eliminate the parameter and make it local to the loop
>> header :)
>>
>> #define gen8_for_each_pml4e(pdp, pml4, start, length, iter)            \
>>          for (iter = gen8_pml4e_index(start);                           \
>>               iter < GEN8_PML4ES_PER_PML4 &&                            \
>>                  (pdp = (pml4)->pdps[iter], length > 0);                \
>>               ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);   \
>>                      temp = min(temp - start, length);                  \
>>                      start += temp, length -= temp; }),                 \
>>                  ++iter)
>
> Removing extraneous parameters from macros is differently a usability win.
> Care to spin a real patch so we can see how it looks in practice?
> -Chris

OK, here's the patch that Chris asked for. On compilation, this version 
comes out about 160 bytes smaller than the original, and the disassembly 
has 100 fewer lines. This probably doesn't improve performance 
significantly, though, as many of the uses of these macros are inside 
debug-only data dumping functions.

.Dave.


[-- Attachment #2: 0001-drm-i915-eliminate-temp-in-gen8_for_each_-pdd-pdpe-p.patch --]
[-- Type: text/x-patch, Size: 10038 bytes --]

>From aa121eb42768e19097d93d3a430ab998fc0f62d2 Mon Sep 17 00:00:00 2001
From: Dave Gordon <david.s.gordon@intel.com>
Date: Thu, 3 Dec 2015 11:19:58 +0000
Subject: [PATCH] drm/i915: eliminate 'temp' in gen8_for_each_{pdd,pdpe,pml4e}
 macros
Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ

All of these iterator macros require a 'temp' argument, used merely to
hold internal partial results. We can instead declare the temporary
variable inside the macro, so the caller need not provide it.

Some of the old code contained nested iterators that actually reused the
same 'temp' variable for both inner and outer instances. It's quite
surprising that this didn't introduce bugs! But it does show that the
value of 'temp' isn't required to persist during the iterated body.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 39 ++++++++++++++---------------
 drivers/gpu/drm/i915/i915_gem_gtt.h | 49 +++++++++++++++++--------------------
 2 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1f7e6b9..c25e8b0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -770,10 +770,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 		gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length,
 					   scratch_pte);
 	} else {
-		uint64_t templ4, pml4e;
+		uint64_t pml4e;
 		struct i915_page_directory_pointer *pdp;
 
-		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) {
 			gen8_ppgtt_clear_pte_range(vm, pdp, start, length,
 						   scratch_pte);
 		}
@@ -839,10 +839,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 					      cache_level);
 	} else {
 		struct i915_page_directory_pointer *pdp;
-		uint64_t templ4, pml4e;
+		uint64_t pml4e;
 		uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT;
 
-		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) {
 			gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter,
 						      start, cache_level);
 		}
@@ -1020,10 +1020,9 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
 {
 	struct drm_device *dev = vm->dev;
 	struct i915_page_table *pt;
-	uint64_t temp;
 	uint32_t pde;
 
-	gen8_for_each_pde(pt, pd, start, length, temp, pde) {
+	gen8_for_each_pde(pt, pd, start, length, pde) {
 		/* Don't reallocate page tables */
 		if (test_bit(pde, pd->used_pdes)) {
 			/* Scratch is never allocated this way */
@@ -1082,13 +1081,12 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 {
 	struct drm_device *dev = vm->dev;
 	struct i915_page_directory *pd;
-	uint64_t temp;
 	uint32_t pdpe;
 	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
 
 	WARN_ON(!bitmap_empty(new_pds, pdpes));
 
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		if (test_bit(pdpe, pdp->used_pdpes))
 			continue;
 
@@ -1136,12 +1134,11 @@ gen8_ppgtt_alloc_page_dirpointers(struct i915_address_space *vm,
 {
 	struct drm_device *dev = vm->dev;
 	struct i915_page_directory_pointer *pdp;
-	uint64_t temp;
 	uint32_t pml4e;
 
 	WARN_ON(!bitmap_empty(new_pdps, GEN8_PML4ES_PER_PML4));
 
-	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 		if (!test_bit(pml4e, pml4->used_pml4es)) {
 			pdp = alloc_pdp(dev);
 			if (IS_ERR(pdp))
@@ -1225,7 +1222,6 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 	struct i915_page_directory *pd;
 	const uint64_t orig_start = start;
 	const uint64_t orig_length = length;
-	uint64_t temp;
 	uint32_t pdpe;
 	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
 	int ret;
@@ -1252,7 +1248,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 	}
 
 	/* For every page directory referenced, allocate page tables */
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
 						new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES));
 		if (ret)
@@ -1264,7 +1260,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 
 	/* Allocations have completed successfully, so set the bitmaps, and do
 	 * the mappings. */
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		gen8_pde_t *const page_directory = kmap_px(pd);
 		struct i915_page_table *pt;
 		uint64_t pd_len = length;
@@ -1274,7 +1270,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 		/* Every pd should be allocated, we just did that above. */
 		WARN_ON(!pd);
 
-		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
+		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
 			/* Same reasoning as pd */
 			WARN_ON(!pt);
 			WARN_ON(!pd_len);
@@ -1311,6 +1307,8 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 
 err_out:
 	while (pdpe--) {
+		unsigned long temp;
+
 		for_each_set_bit(temp, new_page_tables + pdpe *
 				BITS_TO_LONGS(I915_PDES), I915_PDES)
 			free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]);
@@ -1333,7 +1331,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 			container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer *pdp;
-	uint64_t temp, pml4e;
+	uint64_t pml4e;
 	int ret = 0;
 
 	/* Do the pml4 allocations first, so we don't need to track the newly
@@ -1352,7 +1350,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 	     "The allocation has spanned more than 512GB. "
 	     "It is highly likely this is incorrect.");
 
-	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 		WARN_ON(!pdp);
 
 		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
@@ -1392,10 +1390,9 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
 			  struct seq_file *m)
 {
 	struct i915_page_directory *pd;
-	uint64_t temp;
 	uint32_t pdpe;
 
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		struct i915_page_table *pt;
 		uint64_t pd_len = length;
 		uint64_t pd_start = start;
@@ -1405,7 +1402,7 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
 			continue;
 
 		seq_printf(m, "\tPDPE #%d\n", pdpe);
-		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
+		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
 			uint32_t  pte;
 			gen8_pte_t *pt_vaddr;
 
@@ -1455,11 +1452,11 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
 		gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m);
 	} else {
-		uint64_t templ4, pml4e;
+		uint64_t pml4e;
 		struct i915_pml4 *pml4 = &ppgtt->pml4;
 		struct i915_page_directory_pointer *pdp;
 
-		gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) {
+		gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 			if (!test_bit(pml4e, pml4->used_pml4es))
 				continue;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 877c32c..b448ad8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -455,32 +455,29 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
  * between from start until start + length. On gen8+ it simply iterates
  * over every page directory entry in a page directory.
  */
-#define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
-	for (iter = gen8_pde_index(start); \
-	     length > 0 && iter < I915_PDES ? \
-			(pt = (pd)->page_table[iter]), 1 : 0; \
-	     iter++,				\
-	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
-	     temp = min(temp, length),					\
-	     start += temp, length -= temp)
-
-#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
-	for (iter = gen8_pdpe_index(start); \
-	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
-			(pd = (pdp)->page_directory[iter]), 1 : 0; \
-	     iter++,				\
-	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
-	     temp = min(temp, length),					\
-	     start += temp, length -= temp)
-
-#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
-	for (iter = gen8_pml4e_index(start);	\
-	     length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
-			(pdp = (pml4)->pdps[iter]), 1 : 0; \
-	     iter++,				\
-	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
-	     temp = min(temp, length),					\
-	     start += temp, length -= temp)
+#define gen8_for_each_pde(pt, pd, start, length, iter)			\
+	for (iter = gen8_pde_index(start);				\
+	     length > 0 && iter < I915_PDES &&				\
+		(pt = (pd)->page_table[iter], true);			\
+	     ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT);		\
+		    temp = min(temp - start, length);			\
+		    start += temp, length -= temp; }), ++iter)
+
+#define gen8_for_each_pdpe(pd, pdp, start, length, iter)		\
+	for (iter = gen8_pdpe_index(start);				\
+	     length > 0 && iter < I915_PDPES_PER_PDP(dev) &&		\
+		(pd = (pdp)->page_directory[iter], true);		\
+	     ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT);	\
+		    temp = min(temp - start, length);			\
+		    start += temp, length -= temp; }), ++iter)
+
+#define gen8_for_each_pml4e(pdp, pml4, start, length, iter)		\
+	for (iter = gen8_pml4e_index(start);				\
+	     length > 0 && iter < GEN8_PML4ES_PER_PML4 &&		\
+		(pdp = (pml4)->pdps[iter], true);			\
+	     ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);	\
+		    temp = min(temp - start, length);			\
+		    start += temp, length -= temp; }), ++iter)
 
 static inline uint32_t gen8_pte_index(uint64_t address)
 {
-- 
1.9.1


[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e}
  2015-10-06 11:53           ` Chris Wilson
  2015-12-03 15:29             ` Dave Gordon
@ 2015-12-08 13:30             ` Dave Gordon
  2015-12-08 13:30               ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon
  2015-12-10  8:35               ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Daniel Vetter
  1 sibling, 2 replies; 22+ messages in thread
From: Dave Gordon @ 2015-12-08 13:30 UTC (permalink / raw)
  To: intel-gfx

Way back at the beginning of October, Chris Wilson suggested that cleaning
up these macros by removing the redundant 'temp' might be worthwhile. So
here's the patch.

There's one more thing that might be cleaned up here (but for which
I don't have a patch yet), which is that gen8_for_each_pdpe() still
references a non-parameter value named 'dev'. Bizarrely, this apparent
nonlocal reference is only used as the parameter to another macro -- which
then doesn't actually use it! Suggestions on how to tidy this up welcome!

One possibility was to define a common iterator (__gen8_for_each())
and then use it to create all the three level-specific macros, which
meant that the loop bounds were passed in rather than being part of the
iterator. But the common macro needed a LOT of parameters, and didn't
really look like an improvement. So I've gone with this simple rewrite
for now.


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros
  2015-12-08 13:30             ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon
@ 2015-12-08 13:30               ` Dave Gordon
  2015-12-10  8:39                 ` Daniel Vetter
  2015-12-10  8:35               ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Daniel Vetter
  1 sibling, 1 reply; 22+ messages in thread
From: Dave Gordon @ 2015-12-08 13:30 UTC (permalink / raw)
  To: intel-gfx

All of these iterator macros require a 'temp' argument, used merely to
hold internal partial results. We can instead declare the temporary
variable inside the macro, so the caller need not provide it.

Some of the old code contained nested iterators that actually reused the
same 'temp' variable for both inner and outer instances. It's quite
surprising that this didn't introduce bugs! But it does show that the
value of 'temp' isn't required to persist during the iterated body.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 39 ++++++++++++++---------------
 drivers/gpu/drm/i915/i915_gem_gtt.h | 49 +++++++++++++++++--------------------
 2 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1f7e6b9..c25e8b0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -770,10 +770,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 		gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length,
 					   scratch_pte);
 	} else {
-		uint64_t templ4, pml4e;
+		uint64_t pml4e;
 		struct i915_page_directory_pointer *pdp;
 
-		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) {
 			gen8_ppgtt_clear_pte_range(vm, pdp, start, length,
 						   scratch_pte);
 		}
@@ -839,10 +839,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 					      cache_level);
 	} else {
 		struct i915_page_directory_pointer *pdp;
-		uint64_t templ4, pml4e;
+		uint64_t pml4e;
 		uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT;
 
-		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) {
 			gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter,
 						      start, cache_level);
 		}
@@ -1020,10 +1020,9 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
 {
 	struct drm_device *dev = vm->dev;
 	struct i915_page_table *pt;
-	uint64_t temp;
 	uint32_t pde;
 
-	gen8_for_each_pde(pt, pd, start, length, temp, pde) {
+	gen8_for_each_pde(pt, pd, start, length, pde) {
 		/* Don't reallocate page tables */
 		if (test_bit(pde, pd->used_pdes)) {
 			/* Scratch is never allocated this way */
@@ -1082,13 +1081,12 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 {
 	struct drm_device *dev = vm->dev;
 	struct i915_page_directory *pd;
-	uint64_t temp;
 	uint32_t pdpe;
 	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
 
 	WARN_ON(!bitmap_empty(new_pds, pdpes));
 
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		if (test_bit(pdpe, pdp->used_pdpes))
 			continue;
 
@@ -1136,12 +1134,11 @@ gen8_ppgtt_alloc_page_dirpointers(struct i915_address_space *vm,
 {
 	struct drm_device *dev = vm->dev;
 	struct i915_page_directory_pointer *pdp;
-	uint64_t temp;
 	uint32_t pml4e;
 
 	WARN_ON(!bitmap_empty(new_pdps, GEN8_PML4ES_PER_PML4));
 
-	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 		if (!test_bit(pml4e, pml4->used_pml4es)) {
 			pdp = alloc_pdp(dev);
 			if (IS_ERR(pdp))
@@ -1225,7 +1222,6 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 	struct i915_page_directory *pd;
 	const uint64_t orig_start = start;
 	const uint64_t orig_length = length;
-	uint64_t temp;
 	uint32_t pdpe;
 	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
 	int ret;
@@ -1252,7 +1248,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 	}
 
 	/* For every page directory referenced, allocate page tables */
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
 						new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES));
 		if (ret)
@@ -1264,7 +1260,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 
 	/* Allocations have completed successfully, so set the bitmaps, and do
 	 * the mappings. */
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		gen8_pde_t *const page_directory = kmap_px(pd);
 		struct i915_page_table *pt;
 		uint64_t pd_len = length;
@@ -1274,7 +1270,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 		/* Every pd should be allocated, we just did that above. */
 		WARN_ON(!pd);
 
-		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
+		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
 			/* Same reasoning as pd */
 			WARN_ON(!pt);
 			WARN_ON(!pd_len);
@@ -1311,6 +1307,8 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 
 err_out:
 	while (pdpe--) {
+		unsigned long temp;
+
 		for_each_set_bit(temp, new_page_tables + pdpe *
 				BITS_TO_LONGS(I915_PDES), I915_PDES)
 			free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]);
@@ -1333,7 +1331,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 			container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer *pdp;
-	uint64_t temp, pml4e;
+	uint64_t pml4e;
 	int ret = 0;
 
 	/* Do the pml4 allocations first, so we don't need to track the newly
@@ -1352,7 +1350,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 	     "The allocation has spanned more than 512GB. "
 	     "It is highly likely this is incorrect.");
 
-	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 		WARN_ON(!pdp);
 
 		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
@@ -1392,10 +1390,9 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
 			  struct seq_file *m)
 {
 	struct i915_page_directory *pd;
-	uint64_t temp;
 	uint32_t pdpe;
 
-	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
 		struct i915_page_table *pt;
 		uint64_t pd_len = length;
 		uint64_t pd_start = start;
@@ -1405,7 +1402,7 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
 			continue;
 
 		seq_printf(m, "\tPDPE #%d\n", pdpe);
-		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
+		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
 			uint32_t  pte;
 			gen8_pte_t *pt_vaddr;
 
@@ -1455,11 +1452,11 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
 		gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m);
 	} else {
-		uint64_t templ4, pml4e;
+		uint64_t pml4e;
 		struct i915_pml4 *pml4 = &ppgtt->pml4;
 		struct i915_page_directory_pointer *pdp;
 
-		gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) {
+		gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
 			if (!test_bit(pml4e, pml4->used_pml4es))
 				continue;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 877c32c..b448ad8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -455,32 +455,29 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
  * between from start until start + length. On gen8+ it simply iterates
  * over every page directory entry in a page directory.
  */
-#define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
-	for (iter = gen8_pde_index(start); \
-	     length > 0 && iter < I915_PDES ? \
-			(pt = (pd)->page_table[iter]), 1 : 0; \
-	     iter++,				\
-	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
-	     temp = min(temp, length),					\
-	     start += temp, length -= temp)
-
-#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
-	for (iter = gen8_pdpe_index(start); \
-	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
-			(pd = (pdp)->page_directory[iter]), 1 : 0; \
-	     iter++,				\
-	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
-	     temp = min(temp, length),					\
-	     start += temp, length -= temp)
-
-#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
-	for (iter = gen8_pml4e_index(start);	\
-	     length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
-			(pdp = (pml4)->pdps[iter]), 1 : 0; \
-	     iter++,				\
-	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
-	     temp = min(temp, length),					\
-	     start += temp, length -= temp)
+#define gen8_for_each_pde(pt, pd, start, length, iter)			\
+	for (iter = gen8_pde_index(start);				\
+	     length > 0 && iter < I915_PDES &&				\
+		(pt = (pd)->page_table[iter], true);			\
+	     ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT);		\
+		    temp = min(temp - start, length);			\
+		    start += temp, length -= temp; }), ++iter)
+
+#define gen8_for_each_pdpe(pd, pdp, start, length, iter)		\
+	for (iter = gen8_pdpe_index(start);				\
+	     length > 0 && iter < I915_PDPES_PER_PDP(dev) &&		\
+		(pd = (pdp)->page_directory[iter], true);		\
+	     ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT);	\
+		    temp = min(temp - start, length);			\
+		    start += temp, length -= temp; }), ++iter)
+
+#define gen8_for_each_pml4e(pdp, pml4, start, length, iter)		\
+	for (iter = gen8_pml4e_index(start);				\
+	     length > 0 && iter < GEN8_PML4ES_PER_PML4 &&		\
+		(pdp = (pml4)->pdps[iter], true);			\
+	     ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);	\
+		    temp = min(temp - start, length);			\
+		    start += temp, length -= temp; }), ++iter)
 
 static inline uint32_t gen8_pte_index(uint64_t address)
 {
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e}
  2015-12-08 13:30             ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon
  2015-12-08 13:30               ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon
@ 2015-12-10  8:35               ` Daniel Vetter
  1 sibling, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2015-12-10  8:35 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Dec 08, 2015 at 01:30:50PM +0000, Dave Gordon wrote:
> Way back at the beginning of October, Chris Wilson suggested that cleaning
> up these macros by removing the redundant 'temp' might be worthwhile. So
> here's the patch.
> 
> There's one more thing that might be cleaned up here (but for which
> I don't have a patch yet), which is that gen8_for_each_pdpe() still
> references a non-parameter value named 'dev'. Bizarrely, this apparent
> nonlocal reference is only used as the parameter to another macro -- which
> then doesn't actually use it! Suggestions on how to tidy this up welcome!

Iirc this is fallout from how we originally handled full ppgtt and
aliasing ppgtt. Old versions tried to transparently fall back to simpler
code, nowadays we just declare the gpu wedged. That's why there's no need
to look at dev any more. I think you can just remove that.
-Daniel

> 
> One possibility was to define a common iterator (__gen8_for_each())
> and then use it to create all the three level-specific macros, which
> meant that the loop bounds were passed in rather than being part of the
> iterator. But the common macro needed a LOT of parameters, and didn't
> really look like an improvement. So I've gone with this simple rewrite
> for now.
> 
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros
  2015-12-08 13:30               ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon
@ 2015-12-10  8:39                 ` Daniel Vetter
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Vetter @ 2015-12-10  8:39 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Dec 08, 2015 at 01:30:51PM +0000, Dave Gordon wrote:
> All of these iterator macros require a 'temp' argument, used merely to
> hold internal partial results. We can instead declare the temporary
> variable inside the macro, so the caller need not provide it.
> 
> Some of the old code contained nested iterators that actually reused the
> same 'temp' variable for both inner and outer instances. It's quite
> surprising that this didn't introduce bugs! But it does show that the
> value of 'temp' isn't required to persist during the iterated body.
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 39 ++++++++++++++---------------
>  drivers/gpu/drm/i915/i915_gem_gtt.h | 49 +++++++++++++++++--------------------
>  2 files changed, 41 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1f7e6b9..c25e8b0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -770,10 +770,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>  		gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length,
>  					   scratch_pte);
>  	} else {
> -		uint64_t templ4, pml4e;
> +		uint64_t pml4e;
>  		struct i915_page_directory_pointer *pdp;
>  
> -		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
> +		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) {
>  			gen8_ppgtt_clear_pte_range(vm, pdp, start, length,
>  						   scratch_pte);
>  		}
> @@ -839,10 +839,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>  					      cache_level);
>  	} else {
>  		struct i915_page_directory_pointer *pdp;
> -		uint64_t templ4, pml4e;
> +		uint64_t pml4e;
>  		uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT;
>  
> -		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
> +		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) {
>  			gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter,
>  						      start, cache_level);
>  		}
> @@ -1020,10 +1020,9 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
>  {
>  	struct drm_device *dev = vm->dev;
>  	struct i915_page_table *pt;
> -	uint64_t temp;
>  	uint32_t pde;
>  
> -	gen8_for_each_pde(pt, pd, start, length, temp, pde) {
> +	gen8_for_each_pde(pt, pd, start, length, pde) {
>  		/* Don't reallocate page tables */
>  		if (test_bit(pde, pd->used_pdes)) {
>  			/* Scratch is never allocated this way */
> @@ -1082,13 +1081,12 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
>  {
>  	struct drm_device *dev = vm->dev;
>  	struct i915_page_directory *pd;
> -	uint64_t temp;
>  	uint32_t pdpe;
>  	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
>  
>  	WARN_ON(!bitmap_empty(new_pds, pdpes));
>  
> -	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
> +	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
>  		if (test_bit(pdpe, pdp->used_pdpes))
>  			continue;
>  
> @@ -1136,12 +1134,11 @@ gen8_ppgtt_alloc_page_dirpointers(struct i915_address_space *vm,
>  {
>  	struct drm_device *dev = vm->dev;
>  	struct i915_page_directory_pointer *pdp;
> -	uint64_t temp;
>  	uint32_t pml4e;
>  
>  	WARN_ON(!bitmap_empty(new_pdps, GEN8_PML4ES_PER_PML4));
>  
> -	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> +	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
>  		if (!test_bit(pml4e, pml4->used_pml4es)) {
>  			pdp = alloc_pdp(dev);
>  			if (IS_ERR(pdp))
> @@ -1225,7 +1222,6 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  	struct i915_page_directory *pd;
>  	const uint64_t orig_start = start;
>  	const uint64_t orig_length = length;
> -	uint64_t temp;
>  	uint32_t pdpe;
>  	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
>  	int ret;
> @@ -1252,7 +1248,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  	}
>  
>  	/* For every page directory referenced, allocate page tables */
> -	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
> +	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
>  		ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
>  						new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES));
>  		if (ret)
> @@ -1264,7 +1260,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  
>  	/* Allocations have completed successfully, so set the bitmaps, and do
>  	 * the mappings. */
> -	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
> +	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
>  		gen8_pde_t *const page_directory = kmap_px(pd);
>  		struct i915_page_table *pt;
>  		uint64_t pd_len = length;
> @@ -1274,7 +1270,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  		/* Every pd should be allocated, we just did that above. */
>  		WARN_ON(!pd);
>  
> -		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
> +		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
>  			/* Same reasoning as pd */
>  			WARN_ON(!pt);
>  			WARN_ON(!pd_len);
> @@ -1311,6 +1307,8 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
>  
>  err_out:
>  	while (pdpe--) {
> +		unsigned long temp;
> +
>  		for_each_set_bit(temp, new_page_tables + pdpe *
>  				BITS_TO_LONGS(I915_PDES), I915_PDES)
>  			free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]);
> @@ -1333,7 +1331,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
>  	struct i915_hw_ppgtt *ppgtt =
>  			container_of(vm, struct i915_hw_ppgtt, base);
>  	struct i915_page_directory_pointer *pdp;
> -	uint64_t temp, pml4e;
> +	uint64_t pml4e;
>  	int ret = 0;
>  
>  	/* Do the pml4 allocations first, so we don't need to track the newly
> @@ -1352,7 +1350,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
>  	     "The allocation has spanned more than 512GB. "
>  	     "It is highly likely this is incorrect.");
>  
> -	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> +	gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
>  		WARN_ON(!pdp);
>  
>  		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
> @@ -1392,10 +1390,9 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
>  			  struct seq_file *m)
>  {
>  	struct i915_page_directory *pd;
> -	uint64_t temp;
>  	uint32_t pdpe;
>  
> -	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
> +	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
>  		struct i915_page_table *pt;
>  		uint64_t pd_len = length;
>  		uint64_t pd_start = start;
> @@ -1405,7 +1402,7 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
>  			continue;
>  
>  		seq_printf(m, "\tPDPE #%d\n", pdpe);
> -		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
> +		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
>  			uint32_t  pte;
>  			gen8_pte_t *pt_vaddr;
>  
> @@ -1455,11 +1452,11 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
>  	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
>  		gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m);
>  	} else {
> -		uint64_t templ4, pml4e;
> +		uint64_t pml4e;
>  		struct i915_pml4 *pml4 = &ppgtt->pml4;
>  		struct i915_page_directory_pointer *pdp;
>  
> -		gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) {
> +		gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
>  			if (!test_bit(pml4e, pml4->used_pml4es))
>  				continue;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 877c32c..b448ad8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -455,32 +455,29 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
>   * between from start until start + length. On gen8+ it simply iterates
>   * over every page directory entry in a page directory.
>   */
> -#define gen8_for_each_pde(pt, pd, start, length, temp, iter)		\
> -	for (iter = gen8_pde_index(start); \
> -	     length > 0 && iter < I915_PDES ? \
> -			(pt = (pd)->page_table[iter]), 1 : 0; \
> -	     iter++,				\
> -	     temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,	\
> -	     temp = min(temp, length),					\
> -	     start += temp, length -= temp)
> -
> -#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)	\
> -	for (iter = gen8_pdpe_index(start); \
> -	     length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
> -			(pd = (pdp)->page_directory[iter]), 1 : 0; \
> -	     iter++,				\
> -	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
> -	     temp = min(temp, length),					\
> -	     start += temp, length -= temp)
> -
> -#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
> -	for (iter = gen8_pml4e_index(start);	\
> -	     length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
> -			(pdp = (pml4)->pdps[iter]), 1 : 0; \
> -	     iter++,				\
> -	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
> -	     temp = min(temp, length),					\
> -	     start += temp, length -= temp)
> +#define gen8_for_each_pde(pt, pd, start, length, iter)			\
> +	for (iter = gen8_pde_index(start);				\
> +	     length > 0 && iter < I915_PDES &&				\
> +		(pt = (pd)->page_table[iter], true);			\
> +	     ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT);		\
> +		    temp = min(temp - start, length);			\
> +		    start += temp, length -= temp; }), ++iter)
> +
> +#define gen8_for_each_pdpe(pd, pdp, start, length, iter)		\
> +	for (iter = gen8_pdpe_index(start);				\
> +	     length > 0 && iter < I915_PDPES_PER_PDP(dev) &&		\
> +		(pd = (pdp)->page_directory[iter], true);		\
> +	     ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT);	\
> +		    temp = min(temp - start, length);			\
> +		    start += temp, length -= temp; }), ++iter)
> +
> +#define gen8_for_each_pml4e(pdp, pml4, start, length, iter)		\
> +	for (iter = gen8_pml4e_index(start);				\
> +	     length > 0 && iter < GEN8_PML4ES_PER_PML4 &&		\
> +		(pdp = (pml4)->pdps[iter], true);			\
> +	     ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);	\
> +		    temp = min(temp - start, length);			\
> +		    start += temp, length -= temp; }), ++iter)
>  
>  static inline uint32_t gen8_pte_index(uint64_t address)
>  {
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-12-10  8:39 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry
2015-10-01 16:09 ` Chris Wilson
2015-10-02 12:47   ` Michel Thierry
2015-10-02 12:58     ` Chris Wilson
2015-10-02 13:11       ` Michel Thierry
2015-10-02  7:58 ` Daniel Vetter
2015-10-02  8:58   ` Chris Wilson
2015-10-02 10:52     ` Daniel Vetter
2015-10-02 13:16 ` [PATCH v2] " Michel Thierry
2015-10-05 16:36   ` Dave Gordon
2015-10-05 16:59     ` Michel Thierry
2015-10-06  8:38       ` Daniel Vetter
2015-10-06  9:43         ` Chris Wilson
2015-10-06 10:09           ` Daniel Vetter
2015-10-06 10:19             ` Chris Wilson
2015-10-06 11:21         ` Dave Gordon
2015-10-06 11:53           ` Chris Wilson
2015-12-03 15:29             ` Dave Gordon
2015-12-08 13:30             ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon
2015-12-08 13:30               ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon
2015-12-10  8:39                 ` Daniel Vetter
2015-12-10  8:35               ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox