* [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3)
@ 2015-10-01 15:59 Michel Thierry
2015-10-01 16:09 ` Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Michel Thierry @ 2015-10-01 15:59 UTC (permalink / raw)
To: intel-gfx
We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
range pt in gen6_for_each_pde").
But the static analyzer still complains that, just before we break due
to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
iter value that is bigger than I915_PDES. Of course, this isn't really
a problem since no one uses pt outside the macro. Still, every single
new usage of the macro will create a new issue for us to mark as a
false positive.
Also, Paulo re-started the discussion a while ago [1], but didn't end up
implemented.
In order to "solve" this "problem", this patch takes the ideas from
Chris and Dave, but that check would change the desired behavior of the
code, because the object (for example pdp->page_directory[iter]) can be
null during init/alloc, and C would take this as false, breaking the for
loop immediately.
This has been already verified with "static analysis tools".
[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 9fbb07d..94f8344 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -394,7 +394,9 @@ struct i915_hw_ppgtt {
*/
#define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
for (iter = gen6_pde_index(start); \
- pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
+ pt = (length > 0 && iter < I915_PDES) ? \
+ (pd)->page_table[iter] : NULL, \
+ length > 0 && iter < I915_PDES; \
iter++, \
temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
temp = min_t(unsigned, temp, length), \
@@ -459,7 +461,9 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
*/
#define gen8_for_each_pde(pt, pd, start, length, temp, iter) \
for (iter = gen8_pde_index(start); \
- pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
+ pt = (length > 0 && iter < I915_PDES) ? \
+ (pd)->page_table[iter] : NULL, \
+ length > 0 && iter < I915_PDES; \
iter++, \
temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \
temp = min(temp, length), \
@@ -467,7 +471,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \
for (iter = gen8_pdpe_index(start); \
- pd = (pdp)->page_directory[iter], \
+ pd = (length > 0 && (iter < I915_PDPES_PER_PDP(dev))) ? \
+ (pdp)->page_directory[iter] : NULL, \
length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
iter++, \
temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \
@@ -476,7 +481,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \
for (iter = gen8_pml4e_index(start); \
- pdp = (pml4)->pdps[iter], \
+ pdp = (length > 0 && iter < GEN8_PML4ES_PER_PML4) ? \
+ (pml4)->pdps[iter] : NULL, \
length > 0 && iter < GEN8_PML4ES_PER_PML4; \
iter++, \
temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \
--
2.6.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 22+ messages in thread* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry @ 2015-10-01 16:09 ` Chris Wilson 2015-10-02 12:47 ` Michel Thierry 2015-10-02 7:58 ` Daniel Vetter 2015-10-02 13:16 ` [PATCH v2] " Michel Thierry 2 siblings, 1 reply; 22+ messages in thread From: Chris Wilson @ 2015-10-01 16:09 UTC (permalink / raw) To: Michel Thierry; +Cc: intel-gfx On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > range pt in gen6_for_each_pde"). > > But the static analyzer still complains that, just before we break due > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > iter value that is bigger than I915_PDES. Of course, this isn't really > a problem since no one uses pt outside the macro. Still, every single > new usage of the macro will create a new issue for us to mark as a > false positive. > > Also, Paulo re-started the discussion a while ago [1], but didn't end up > implemented. > > In order to "solve" this "problem", this patch takes the ideas from > Chris and Dave, but that check would change the desired behavior of the > code, because the object (for example pdp->page_directory[iter]) can be > null during init/alloc, and C would take this as false, breaking the for > loop immediately. > > This has been already verified with "static analysis tools". > > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Dave Gordon <david.s.gordon@intel.com> > Signed-off-by: Michel Thierry <michel.thierry@intel.com> > --- > drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h > index 9fbb07d..94f8344 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.h > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > @@ -394,7 +394,9 @@ struct i915_hw_ppgtt { > */ > #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ > for (iter = gen6_pde_index(start); \ > - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > + pt = (length > 0 && iter < I915_PDES) ? \ > + (pd)->page_table[iter] : NULL, \ > + length > 0 && iter < I915_PDES; \ length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0, as the compiler wouldn't be able to CSE it otherwise (I think). -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-01 16:09 ` Chris Wilson @ 2015-10-02 12:47 ` Michel Thierry 2015-10-02 12:58 ` Chris Wilson 0 siblings, 1 reply; 22+ messages in thread From: Michel Thierry @ 2015-10-02 12:47 UTC (permalink / raw) To: Chris Wilson, intel-gfx On 10/1/2015 5:09 PM, Chris Wilson wrote: > On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: >> --- >> drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h >> index 9fbb07d..94f8344 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h >> @@ -394,7 +394,9 @@ struct i915_hw_ppgtt { >> */ >> #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ >> for (iter = gen6_pde_index(start); \ >> - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ >> + pt = (length > 0 && iter < I915_PDES) ? \ >> + (pd)->page_table[iter] : NULL, \ >> + length > 0 && iter < I915_PDES; \ > > length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0, > > as the compiler wouldn't be able to CSE it otherwise (I think). Even after that change, the compiler keeps doing an optimization when page_table[iter] is null (takes the null assignment as the break condition). I've been playing with these examples http://paste.ubuntu.com/12638106/ Only the 1st example (a) iterates over all elements, b & c stop after the 1st run. Thanks, -Michel _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-02 12:47 ` Michel Thierry @ 2015-10-02 12:58 ` Chris Wilson 2015-10-02 13:11 ` Michel Thierry 0 siblings, 1 reply; 22+ messages in thread From: Chris Wilson @ 2015-10-02 12:58 UTC (permalink / raw) To: Michel Thierry; +Cc: intel-gfx On Fri, Oct 02, 2015 at 01:47:03PM +0100, Michel Thierry wrote: > On 10/1/2015 5:09 PM, Chris Wilson wrote: > >On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: > >>--- > >> drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++---- > >> 1 file changed, 10 insertions(+), 4 deletions(-) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h > >>index 9fbb07d..94f8344 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem_gtt.h > >>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > >>@@ -394,7 +394,9 @@ struct i915_hw_ppgtt { > >> */ > >> #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ > >> for (iter = gen6_pde_index(start); \ > >>- pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > >>+ pt = (length > 0 && iter < I915_PDES) ? \ > >>+ (pd)->page_table[iter] : NULL, \ > >>+ length > 0 && iter < I915_PDES; \ > > > >length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0, > > > >as the compiler wouldn't be able to CSE it otherwise (I think). > > Even after that change, the compiler keeps doing an optimization > when page_table[iter] is null (takes the null assignment as the > break condition). > > I've been playing with these examples > http://paste.ubuntu.com/12638106/ > > Only the 1st example (a) iterates over all elements, b & c stop > after the 1st run. Forgot that was the condition you wanted to change. length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]), 1 : 0, Would be nice to prove that length > 0 implies iter < I915_PDES. If only we had smart tools :) -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-02 12:58 ` Chris Wilson @ 2015-10-02 13:11 ` Michel Thierry 0 siblings, 0 replies; 22+ messages in thread From: Michel Thierry @ 2015-10-02 13:11 UTC (permalink / raw) To: Chris Wilson, intel-gfx, Paulo Zanoni, Dave Gordon, daniel On 10/2/2015 1:58 PM, Chris Wilson wrote: > On Fri, Oct 02, 2015 at 01:47:03PM +0100, Michel Thierry wrote: >> On 10/1/2015 5:09 PM, Chris Wilson wrote: >>> On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: >>>> --- >>>> drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++---- >>>> 1 file changed, 10 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> index 9fbb07d..94f8344 100644 >>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> @@ -394,7 +394,9 @@ struct i915_hw_ppgtt { >>>> */ >>>> #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ >>>> for (iter = gen6_pde_index(start); \ >>>> - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ >>>> + pt = (length > 0 && iter < I915_PDES) ? \ >>>> + (pd)->page_table[iter] : NULL, \ >>>> + length > 0 && iter < I915_PDES; \ >>> >>> length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]) : 0, >>> >>> as the compiler wouldn't be able to CSE it otherwise (I think). >> >> Even after that change, the compiler keeps doing an optimization >> when page_table[iter] is null (takes the null assignment as the >> break condition). >> >> I've been playing with these examples >> http://paste.ubuntu.com/12638106/ >> >> Only the 1st example (a) iterates over all elements, b & c stop >> after the 1st run. > > Forgot that was the condition you wanted to change. > > length > 0 && iter < I915_PDES ? (pt = (pd)->page_table[iter]), 1 : 0, > > Would be nice to prove that length > 0 implies iter < I915_PDES. If only > we had smart tools :) Thanks, that made it. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry 2015-10-01 16:09 ` Chris Wilson @ 2015-10-02 7:58 ` Daniel Vetter 2015-10-02 8:58 ` Chris Wilson 2015-10-02 13:16 ` [PATCH v2] " Michel Thierry 2 siblings, 1 reply; 22+ messages in thread From: Daniel Vetter @ 2015-10-02 7:58 UTC (permalink / raw) To: Michel Thierry; +Cc: intel-gfx On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > range pt in gen6_for_each_pde"). > > But the static analyzer still complains that, just before we break due > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > iter value that is bigger than I915_PDES. Of course, this isn't really > a problem since no one uses pt outside the macro. Still, every single > new usage of the macro will create a new issue for us to mark as a > false positive. > > Also, Paulo re-started the discussion a while ago [1], but didn't end up > implemented. > > In order to "solve" this "problem", this patch takes the ideas from > Chris and Dave, but that check would change the desired behavior of the > code, because the object (for example pdp->page_directory[iter]) can be > null during init/alloc, and C would take this as false, breaking the for > loop immediately. > > This has been already verified with "static analysis tools". > > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Dave Gordon <david.s.gordon@intel.com> > Signed-off-by: Michel Thierry <michel.thierry@intel.com> So maybe I'm dense and not seeing what's really going on, but the only thing we seem to be doing is create a pointer to arr[SIZE], i.e. a pointer to the element right after the last valid one. Pointer arithmetic and comparison are explicitly allowed by the C standard on such a pointer. The only thing not allowed is dereference it (which we don't seem to be doing here). The reason for this is exactly this case here, not allowing this means checking whether you've run off the end of an array often becomes very unnatural. So if there's nothing else going on I think the right choice is to mark them all up as false positives - the tool is wrong. -Daniel > --- > drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h > index 9fbb07d..94f8344 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.h > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > @@ -394,7 +394,9 @@ struct i915_hw_ppgtt { > */ > #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ > for (iter = gen6_pde_index(start); \ > - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > + pt = (length > 0 && iter < I915_PDES) ? \ > + (pd)->page_table[iter] : NULL, \ > + length > 0 && iter < I915_PDES; \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \ > temp = min_t(unsigned, temp, length), \ > @@ -459,7 +461,9 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > */ > #define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ > for (iter = gen8_pde_index(start); \ > - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > + pt = (length > 0 && iter < I915_PDES) ? \ > + (pd)->page_table[iter] : NULL, \ > + length > 0 && iter < I915_PDES; \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ > temp = min(temp, length), \ > @@ -467,7 +471,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > > #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ > for (iter = gen8_pdpe_index(start); \ > - pd = (pdp)->page_directory[iter], \ > + pd = (length > 0 && (iter < I915_PDPES_PER_PDP(dev))) ? \ > + (pdp)->page_directory[iter] : NULL, \ > length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ > @@ -476,7 +481,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > > #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > for (iter = gen8_pml4e_index(start); \ > - pdp = (pml4)->pdps[iter], \ > + pdp = (length > 0 && iter < GEN8_PML4ES_PER_PML4) ? \ > + (pml4)->pdps[iter] : NULL, \ > length > 0 && iter < GEN8_PML4ES_PER_PML4; \ > iter++, \ > temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > -- > 2.6.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-02 7:58 ` Daniel Vetter @ 2015-10-02 8:58 ` Chris Wilson 2015-10-02 10:52 ` Daniel Vetter 0 siblings, 1 reply; 22+ messages in thread From: Chris Wilson @ 2015-10-02 8:58 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Fri, Oct 02, 2015 at 09:58:05AM +0200, Daniel Vetter wrote: > On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: > > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > > range pt in gen6_for_each_pde"). > > > > But the static analyzer still complains that, just before we break due > > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > > iter value that is bigger than I915_PDES. Of course, this isn't really > > a problem since no one uses pt outside the macro. Still, every single > > new usage of the macro will create a new issue for us to mark as a > > false positive. > > > > Also, Paulo re-started the discussion a while ago [1], but didn't end up > > implemented. > > > > In order to "solve" this "problem", this patch takes the ideas from > > Chris and Dave, but that check would change the desired behavior of the > > code, because the object (for example pdp->page_directory[iter]) can be > > null during init/alloc, and C would take this as false, breaking the for > > loop immediately. > > > > This has been already verified with "static analysis tools". > > > > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Dave Gordon <david.s.gordon@intel.com> > > Signed-off-by: Michel Thierry <michel.thierry@intel.com> > > So maybe I'm dense and not seeing what's really going on, but the only > thing we seem to be doing is create a pointer to arr[SIZE], i.e. a pointer > to the element right after the last valid one. Pointer arithmetic and > comparison are explicitly allowed by the C standard on such a pointer. The > only thing not allowed is dereference it (which we don't seem to be doing > here). You're thinking of &(pd)->page_table[iter] (i.e. (pd)->page_table + iter). There is an apparent dereference here of (pd)->page_table[ITER_SIZE]. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-02 8:58 ` Chris Wilson @ 2015-10-02 10:52 ` Daniel Vetter 0 siblings, 0 replies; 22+ messages in thread From: Daniel Vetter @ 2015-10-02 10:52 UTC (permalink / raw) To: Chris Wilson, Daniel Vetter, Michel Thierry, intel-gfx On Fri, Oct 02, 2015 at 09:58:08AM +0100, Chris Wilson wrote: > On Fri, Oct 02, 2015 at 09:58:05AM +0200, Daniel Vetter wrote: > > On Thu, Oct 01, 2015 at 04:59:35PM +0100, Michel Thierry wrote: > > > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > > > range pt in gen6_for_each_pde"). > > > > > > But the static analyzer still complains that, just before we break due > > > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > > > iter value that is bigger than I915_PDES. Of course, this isn't really > > > a problem since no one uses pt outside the macro. Still, every single > > > new usage of the macro will create a new issue for us to mark as a > > > false positive. > > > > > > Also, Paulo re-started the discussion a while ago [1], but didn't end up > > > implemented. > > > > > > In order to "solve" this "problem", this patch takes the ideas from > > > Chris and Dave, but that check would change the desired behavior of the > > > code, because the object (for example pdp->page_directory[iter]) can be > > > null during init/alloc, and C would take this as false, breaking the for > > > loop immediately. > > > > > > This has been already verified with "static analysis tools". > > > > > > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > > > > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > Cc: Dave Gordon <david.s.gordon@intel.com> > > > Signed-off-by: Michel Thierry <michel.thierry@intel.com> > > > > So maybe I'm dense and not seeing what's really going on, but the only > > thing we seem to be doing is create a pointer to arr[SIZE], i.e. a pointer > > to the element right after the last valid one. Pointer arithmetic and > > comparison are explicitly allowed by the C standard on such a pointer. The > > only thing not allowed is dereference it (which we don't seem to be doing > > here). > > You're thinking of &(pd)->page_table[iter] (i.e. (pd)->page_table + > iter). There is an apparent dereference here of (pd)->page_table[ITER_SIZE]. Oh right. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry 2015-10-01 16:09 ` Chris Wilson 2015-10-02 7:58 ` Daniel Vetter @ 2015-10-02 13:16 ` Michel Thierry 2015-10-05 16:36 ` Dave Gordon 2 siblings, 1 reply; 22+ messages in thread From: Michel Thierry @ 2015-10-02 13:16 UTC (permalink / raw) To: intel-gfx We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of range pt in gen6_for_each_pde"). But the static analyzer still complains that, just before we break due to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an iter value that is bigger than I915_PDES. Of course, this isn't really a problem since no one uses pt outside the macro. Still, every single new usage of the macro will create a new issue for us to mark as a false positive. Also, Paulo re-started the discussion a while ago [1], but didn't end up implemented. In order to "solve" this "problem", this patch takes the ideas from Chris and Dave, but that check would change the desired behavior of the code, because the object (for example pdp->page_directory[iter]) can be null during init/alloc, and C would take this as false, breaking the for loop immediately. This has been already verified with "static analysis tools". [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html v2: Make it a single statement, while preventing the common subexpression elimination (Chris) Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> --- drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 9fbb07d..a216397 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -394,7 +394,8 @@ struct i915_hw_ppgtt { */ #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ for (iter = gen6_pde_index(start); \ - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ + length > 0 && iter < I915_PDES ? \ + (pt = (pd)->page_table[iter]), 1 : 0; \ iter++, \ temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \ temp = min_t(unsigned, temp, length), \ @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) */ #define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ for (iter = gen8_pde_index(start); \ - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ + length > 0 && iter < I915_PDES ? \ + (pt = (pd)->page_table[iter]), 1 : 0; \ iter++, \ temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ temp = min(temp, length), \ @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ for (iter = gen8_pdpe_index(start); \ - pd = (pdp)->page_directory[iter], \ - length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \ + length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ + (pd = (pdp)->page_directory[iter]), 1 : 0; \ iter++, \ temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ temp = min(temp, length), \ @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ for (iter = gen8_pml4e_index(start); \ - pdp = (pml4)->pdps[iter], \ - length > 0 && iter < GEN8_PML4ES_PER_PML4; \ + length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ + (pdp = (pml4)->pdps[iter]), 1 : 0; \ iter++, \ temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ temp = min(temp, length), \ -- 2.6.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-02 13:16 ` [PATCH v2] " Michel Thierry @ 2015-10-05 16:36 ` Dave Gordon 2015-10-05 16:59 ` Michel Thierry 0 siblings, 1 reply; 22+ messages in thread From: Dave Gordon @ 2015-10-05 16:36 UTC (permalink / raw) To: Michel Thierry, intel-gfx On 02/10/15 14:16, Michel Thierry wrote: > We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > range pt in gen6_for_each_pde"). > > But the static analyzer still complains that, just before we break due > to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > iter value that is bigger than I915_PDES. Of course, this isn't really > a problem since no one uses pt outside the macro. Still, every single > new usage of the macro will create a new issue for us to mark as a > false positive. > > Also, Paulo re-started the discussion a while ago [1], but didn't end up > implemented. > > In order to "solve" this "problem", this patch takes the ideas from > Chris and Dave, but that check would change the desired behavior of the > code, because the object (for example pdp->page_directory[iter]) can be > null during init/alloc, and C would take this as false, breaking the for > loop immediately. > > This has been already verified with "static analysis tools". > > [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > v2: Make it a single statement, while preventing the common subexpression > elimination (Chris) > > Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Dave Gordon <david.s.gordon@intel.com> > Signed-off-by: Michel Thierry <michel.thierry@intel.com> > --- > drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h > index 9fbb07d..a216397 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.h > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > @@ -394,7 +394,8 @@ struct i915_hw_ppgtt { > */ > #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ > for (iter = gen6_pde_index(start); \ > - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > + length > 0 && iter < I915_PDES ? \ > + (pt = (pd)->page_table[iter]), 1 : 0; \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \ > temp = min_t(unsigned, temp, length), \ > @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > */ > #define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ > for (iter = gen8_pde_index(start); \ > - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > + length > 0 && iter < I915_PDES ? \ > + (pt = (pd)->page_table[iter]), 1 : 0; \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ > temp = min(temp, length), \ > @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > > #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ > for (iter = gen8_pdpe_index(start); \ > - pd = (pdp)->page_directory[iter], \ > - length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \ > + length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ > + (pd = (pdp)->page_directory[iter]), 1 : 0; \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ > temp = min(temp, length), \ > @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > > #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > for (iter = gen8_pml4e_index(start); \ > - pdp = (pml4)->pdps[iter], \ > - length > 0 && iter < GEN8_PML4ES_PER_PML4; \ > + length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ > + (pdp = (pml4)->pdps[iter]), 1 : 0; \ this won't compile -- see below > iter++, \ > temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > temp = min(temp, length), \ The man page for C operators tells us: Operator Associativity () [] -> . left to right ! ~ ++ -- + - (type) * & sizeof right to left * / % left to right + - left to right << >> left to right < <= > >= left to right == != left to right & left to right ^ left to right | left to right && left to right || left to right ?: right to left = += -= *= /= %= <<= >>= &= ^= |= right to left , left to right So there's a problem with the above code, because the comma operator is LOWER precedence than either assignment or ?: You'd need to put the parentheses around the (pdp = ... , 1) section, not just the assignment. Or for yet another variation, how about: #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ for (iter = gen8_pdpe_index(start); \ iter < I915_PDPES_PER_PDP(dev) && \ (pd = (pdp)->page_directory[iter], length > 0); \ iter++, \ temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ temp = min(temp, length), \ start += temp, length -= temp) #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ for (iter = gen8_pml4e_index(start); \ iter < GEN8_PML4ES_PER_PML4 && \ (pdp = (pml4)->pdps[iter], length > 0); \ iter++, \ temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ temp = min(temp, length), \ start += temp, length -= temp) with no ugly ?: at all :) .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-05 16:36 ` Dave Gordon @ 2015-10-05 16:59 ` Michel Thierry 2015-10-06 8:38 ` Daniel Vetter 0 siblings, 1 reply; 22+ messages in thread From: Michel Thierry @ 2015-10-05 16:59 UTC (permalink / raw) To: Dave Gordon, intel-gfx On 10/5/2015 5:36 PM, Dave Gordon wrote: > On 02/10/15 14:16, Michel Thierry wrote: >> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of >> range pt in gen6_for_each_pde"). >> >> But the static analyzer still complains that, just before we break due >> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an >> iter value that is bigger than I915_PDES. Of course, this isn't really >> a problem since no one uses pt outside the macro. Still, every single >> new usage of the macro will create a new issue for us to mark as a >> false positive. >> >> Also, Paulo re-started the discussion a while ago [1], but didn't end up >> implemented. >> >> In order to "solve" this "problem", this patch takes the ideas from >> Chris and Dave, but that check would change the desired behavior of the >> code, because the object (for example pdp->page_directory[iter]) can be >> null during init/alloc, and C would take this as false, breaking the for >> loop immediately. >> >> This has been already verified with "static analysis tools". >> >> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html >> >> v2: Make it a single statement, while preventing the common subexpression >> elimination (Chris) >> >> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> Cc: Dave Gordon <david.s.gordon@intel.com> >> Signed-off-by: Michel Thierry <michel.thierry@intel.com> >> --- >> drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------ >> 1 file changed, 8 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h >> b/drivers/gpu/drm/i915/i915_gem_gtt.h >> index 9fbb07d..a216397 100644 >> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h >> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h >> @@ -394,7 +394,8 @@ struct i915_hw_ppgtt { >> */ >> #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ >> for (iter = gen6_pde_index(start); \ >> - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ >> + length > 0 && iter < I915_PDES ? \ >> + (pt = (pd)->page_table[iter]), 1 : 0; \ >> iter++, \ >> temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \ >> temp = min_t(unsigned, temp, length), \ >> @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) >> */ >> #define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ >> for (iter = gen8_pde_index(start); \ >> - pt = (pd)->page_table[iter], length > 0 && iter < >> I915_PDES; \ >> + length > 0 && iter < I915_PDES ? \ >> + (pt = (pd)->page_table[iter]), 1 : 0; \ >> iter++, \ >> temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ >> temp = min(temp, length), \ >> @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) >> >> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ >> for (iter = gen8_pdpe_index(start); \ >> - pd = (pdp)->page_directory[iter], \ >> - length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \ >> + length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ >> + (pd = (pdp)->page_directory[iter]), 1 : 0; \ >> iter++, \ >> temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ >> temp = min(temp, length), \ >> @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) >> >> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ >> for (iter = gen8_pml4e_index(start); \ >> - pdp = (pml4)->pdps[iter], \ >> - length > 0 && iter < GEN8_PML4ES_PER_PML4; \ >> + length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ >> + (pdp = (pml4)->pdps[iter]), 1 : 0; \ > > this won't compile -- see below Hmm, it compiled (also got rid of of the "analysis tool error" and didn't see any behavior change). > >> iter++, \ >> temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ >> temp = min(temp, length), \ > > The man page for C operators tells us: > > Operator Associativity > () [] -> . left to right > ! ~ ++ -- + - (type) * & sizeof right to left > * / % left to right > + - left to right > << >> left to right > < <= > >= left to right > == != left to right > & left to right > ^ left to right > | left to right > && left to right > || left to right > ?: right to left > = += -= *= /= %= <<= >>= &= ^= |= right to left > , left to right > > So there's a problem with the above code, because the comma operator is > LOWER precedence than either assignment or ?: You'd need to put the > parentheses around the (pdp = ... , 1) section, not just the assignment. > > Or for yet another variation, how about: > > #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ > for (iter = gen8_pdpe_index(start); \ > iter < I915_PDPES_PER_PDP(dev) && \ > (pd = (pdp)->page_directory[iter], length > 0); \ > iter++, \ > temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ > temp = min(temp, length), \ > start += temp, length -= temp) > > #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > for (iter = gen8_pml4e_index(start); \ > iter < GEN8_PML4ES_PER_PML4 && \ > (pdp = (pml4)->pdps[iter], length > 0); \ > iter++, \ > temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > temp = min(temp, length), \ > start += temp, length -= temp) > > with no ugly ?: at all :) This also works. Since it's a change for not a real issue, I don't have a preference. I can send either. Thanks, > > .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-05 16:59 ` Michel Thierry @ 2015-10-06 8:38 ` Daniel Vetter 2015-10-06 9:43 ` Chris Wilson 2015-10-06 11:21 ` Dave Gordon 0 siblings, 2 replies; 22+ messages in thread From: Daniel Vetter @ 2015-10-06 8:38 UTC (permalink / raw) To: Michel Thierry; +Cc: intel-gfx On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote: > On 10/5/2015 5:36 PM, Dave Gordon wrote: > >On 02/10/15 14:16, Michel Thierry wrote: > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > >>range pt in gen6_for_each_pde"). > >> > >>But the static analyzer still complains that, just before we break due > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > >>iter value that is bigger than I915_PDES. Of course, this isn't really > >>a problem since no one uses pt outside the macro. Still, every single > >>new usage of the macro will create a new issue for us to mark as a > >>false positive. > >> > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up > >>implemented. > >> > >>In order to "solve" this "problem", this patch takes the ideas from > >>Chris and Dave, but that check would change the desired behavior of the > >>code, because the object (for example pdp->page_directory[iter]) can be > >>null during init/alloc, and C would take this as false, breaking the for > >>loop immediately. > >> > >>This has been already verified with "static analysis tools". > >> > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > >> > >>v2: Make it a single statement, while preventing the common subexpression > >>elimination (Chris) > >> > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > >>Cc: Dave Gordon <david.s.gordon@intel.com> > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com> > >>--- > >> drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------ > >> 1 file changed, 8 insertions(+), 6 deletions(-) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h > >>b/drivers/gpu/drm/i915/i915_gem_gtt.h > >>index 9fbb07d..a216397 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem_gtt.h > >>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > >>@@ -394,7 +394,8 @@ struct i915_hw_ppgtt { > >> */ > >> #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ > >> for (iter = gen6_pde_index(start); \ > >>- pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ > >>+ length > 0 && iter < I915_PDES ? \ > >>+ (pt = (pd)->page_table[iter]), 1 : 0; \ > >> iter++, \ > >> temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \ > >> temp = min_t(unsigned, temp, length), \ > >>@@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > >> */ > >> #define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ > >> for (iter = gen8_pde_index(start); \ > >>- pt = (pd)->page_table[iter], length > 0 && iter < > >>I915_PDES; \ > >>+ length > 0 && iter < I915_PDES ? \ > >>+ (pt = (pd)->page_table[iter]), 1 : 0; \ > >> iter++, \ > >> temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ > >> temp = min(temp, length), \ > >>@@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > >> > >> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ > >> for (iter = gen8_pdpe_index(start); \ > >>- pd = (pdp)->page_directory[iter], \ > >>- length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \ > >>+ length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ > >>+ (pd = (pdp)->page_directory[iter]), 1 : 0; \ > >> iter++, \ > >> temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ > >> temp = min(temp, length), \ > >>@@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > >> > >> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > >> for (iter = gen8_pml4e_index(start); \ > >>- pdp = (pml4)->pdps[iter], \ > >>- length > 0 && iter < GEN8_PML4ES_PER_PML4; \ > >>+ length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ > >>+ (pdp = (pml4)->pdps[iter]), 1 : 0; \ > > > >this won't compile -- see below > > Hmm, it compiled (also got rid of of the "analysis tool error" and didn't > see any behavior change). > > > > >> iter++, \ > >> temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > >> temp = min(temp, length), \ > > > >The man page for C operators tells us: > > > > Operator Associativity > > () [] -> . left to right > > ! ~ ++ -- + - (type) * & sizeof right to left > > * / % left to right > > + - left to right > > << >> left to right > > < <= > >= left to right > > == != left to right > > & left to right > > ^ left to right > > | left to right > > && left to right > > || left to right > > ?: right to left > > = += -= *= /= %= <<= >>= &= ^= |= right to left > > , left to right > > > >So there's a problem with the above code, because the comma operator is > >LOWER precedence than either assignment or ?: You'd need to put the > >parentheses around the (pdp = ... , 1) section, not just the assignment. > > > >Or for yet another variation, how about: > > > >#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ > > for (iter = gen8_pdpe_index(start); \ > > iter < I915_PDPES_PER_PDP(dev) && \ > > (pd = (pdp)->page_directory[iter], length > 0); \ > > iter++, \ > > temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ > > temp = min(temp, length), \ > > start += temp, length -= temp) > > > >#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > > for (iter = gen8_pml4e_index(start); \ > > iter < GEN8_PML4ES_PER_PML4 && \ > > (pdp = (pml4)->pdps[iter], length > 0); \ > > iter++, \ > > temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > > temp = min(temp, length), \ > > start += temp, length -= temp) > > > >with no ugly ?: at all :) > > This also works. > > Since it's a change for not a real issue, I don't have a preference. > I can send either. Yeah, since ?: is a ternary operator parsing implicitly adds the () in the middle and always parses it as a ? (b) : c. If lower-level operators in the middle could split the ternary operator then it would result in parsing fail (sinc ? without the : is just useless). So lgtm. Someone willing to smack an r-b onto the patch? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-06 8:38 ` Daniel Vetter @ 2015-10-06 9:43 ` Chris Wilson 2015-10-06 10:09 ` Daniel Vetter 2015-10-06 11:21 ` Dave Gordon 1 sibling, 1 reply; 22+ messages in thread From: Chris Wilson @ 2015-10-06 9:43 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote: > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote: > > On 10/5/2015 5:36 PM, Dave Gordon wrote: > > >On 02/10/15 14:16, Michel Thierry wrote: > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > > >>range pt in gen6_for_each_pde"). > > >> > > >>But the static analyzer still complains that, just before we break due > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > > >>iter value that is bigger than I915_PDES. Of course, this isn't really > > >>a problem since no one uses pt outside the macro. Still, every single > > >>new usage of the macro will create a new issue for us to mark as a > > >>false positive. > > >> > > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up > > >>implemented. > > >> > > >>In order to "solve" this "problem", this patch takes the ideas from > > >>Chris and Dave, but that check would change the desired behavior of the > > >>code, because the object (for example pdp->page_directory[iter]) can be > > >>null during init/alloc, and C would take this as false, breaking the for > > >>loop immediately. > > >> > > >>This has been already verified with "static analysis tools". > > >> > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > >> > > >>v2: Make it a single statement, while preventing the common subexpression > > >>elimination (Chris) > > >> > > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > > >>Cc: Dave Gordon <david.s.gordon@intel.com> > > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com> > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the > middle and always parses it as a ? (b) : c. If lower-level operators in > the middle could split the ternary operator then it would result in > parsing fail (sinc ? without the : is just useless). So lgtm. Someone > willing to smack an r-b onto the patch? I think it's good enough. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Something to consider is that ppgtt_insert is 10x slower than ppgtt_clear, and that some workloads (admittedly not 48b!) spend a disproportionate amount of time changing PTE. If you have ideas for spending up insertion, feel free to experiment! -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-06 9:43 ` Chris Wilson @ 2015-10-06 10:09 ` Daniel Vetter 2015-10-06 10:19 ` Chris Wilson 0 siblings, 1 reply; 22+ messages in thread From: Daniel Vetter @ 2015-10-06 10:09 UTC (permalink / raw) To: Chris Wilson, Daniel Vetter, Michel Thierry, intel-gfx On Tue, Oct 06, 2015 at 10:43:48AM +0100, Chris Wilson wrote: > On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote: > > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote: > > > On 10/5/2015 5:36 PM, Dave Gordon wrote: > > > >On 02/10/15 14:16, Michel Thierry wrote: > > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > > > >>range pt in gen6_for_each_pde"). > > > >> > > > >>But the static analyzer still complains that, just before we break due > > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > > > >>iter value that is bigger than I915_PDES. Of course, this isn't really > > > >>a problem since no one uses pt outside the macro. Still, every single > > > >>new usage of the macro will create a new issue for us to mark as a > > > >>false positive. > > > >> > > > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up > > > >>implemented. > > > >> > > > >>In order to "solve" this "problem", this patch takes the ideas from > > > >>Chris and Dave, but that check would change the desired behavior of the > > > >>code, because the object (for example pdp->page_directory[iter]) can be > > > >>null during init/alloc, and C would take this as false, breaking the for > > > >>loop immediately. > > > >> > > > >>This has been already verified with "static analysis tools". > > > >> > > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > > >> > > > >>v2: Make it a single statement, while preventing the common subexpression > > > >>elimination (Chris) > > > >> > > > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > > > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > >>Cc: Dave Gordon <david.s.gordon@intel.com> > > > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com> > > > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the > > middle and always parses it as a ? (b) : c. If lower-level operators in > > the middle could split the ternary operator then it would result in > > parsing fail (sinc ? without the : is just useless). So lgtm. Someone > > willing to smack an r-b onto the patch? > > I think it's good enough. > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Queued for -next, thanks for the patch. > Something to consider is that ppgtt_insert is 10x slower than > ppgtt_clear, and that some workloads (admittedly not 48b!) spend a > disproportionate amount of time changing PTE. If you have ideas for > spending up insertion, feel free to experiment! Hm, where do we waste all that time? 10x slower is pretty impressive since on a quick look I can only see the sg table walk as the additional bit of memory traversals insert does on top of clear ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-06 10:09 ` Daniel Vetter @ 2015-10-06 10:19 ` Chris Wilson 0 siblings, 0 replies; 22+ messages in thread From: Chris Wilson @ 2015-10-06 10:19 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Tue, Oct 06, 2015 at 12:09:51PM +0200, Daniel Vetter wrote: > On Tue, Oct 06, 2015 at 10:43:48AM +0100, Chris Wilson wrote: > > On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote: > > > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote: > > > > On 10/5/2015 5:36 PM, Dave Gordon wrote: > > > > >On 02/10/15 14:16, Michel Thierry wrote: > > > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of > > > > >>range pt in gen6_for_each_pde"). > > > > >> > > > > >>But the static analyzer still complains that, just before we break due > > > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an > > > > >>iter value that is bigger than I915_PDES. Of course, this isn't really > > > > >>a problem since no one uses pt outside the macro. Still, every single > > > > >>new usage of the macro will create a new issue for us to mark as a > > > > >>false positive. > > > > >> > > > > >>Also, Paulo re-started the discussion a while ago [1], but didn't end up > > > > >>implemented. > > > > >> > > > > >>In order to "solve" this "problem", this patch takes the ideas from > > > > >>Chris and Dave, but that check would change the desired behavior of the > > > > >>code, because the object (for example pdp->page_directory[iter]) can be > > > > >>null during init/alloc, and C would take this as false, breaking the for > > > > >>loop immediately. > > > > >> > > > > >>This has been already verified with "static analysis tools". > > > > >> > > > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html > > > > >> > > > > >>v2: Make it a single statement, while preventing the common subexpression > > > > >>elimination (Chris) > > > > >> > > > > >>Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> > > > > >>Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > > >>Cc: Dave Gordon <david.s.gordon@intel.com> > > > > >>Signed-off-by: Michel Thierry <michel.thierry@intel.com> > > > > > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the > > > middle and always parses it as a ? (b) : c. If lower-level operators in > > > the middle could split the ternary operator then it would result in > > > parsing fail (sinc ? without the : is just useless). So lgtm. Someone > > > willing to smack an r-b onto the patch? > > > > I think it's good enough. > > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > > Queued for -next, thanks for the patch. > > > Something to consider is that ppgtt_insert is 10x slower than > > ppgtt_clear, and that some workloads (admittedly not 48b!) spend a > > disproportionate amount of time changing PTE. If you have ideas for > > spending up insertion, feel free to experiment! > > Hm, where do we waste all that time? 10x slower is pretty impressive since > on a quick look I can only see the sg table walk as the additional bit of > memory traversals insert does on top of clear ... The sg_page_iter claims top spot, followed by the memory dereferences (at a guess, it doesn't seem the individual sg struct isn't dense enough to be cache friendly or maybe we should just cachealign it?). We can make substantial improvement by opencoding the page iter (under the assumption that we do not have page coallescing) like: http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=nightly&id=f9baf155c3096058ef9aeeb39b586713291efc56 with that and eliminating the ivb_pte_encode() indirection gets us to only be ~5x slower than clearing. It is not pretty (and is rather cavalier about the last page), but at that point it seems to be memory bound. However, in this particular benchmark where inserting ppgtt dominates the kernel profile, we are GPU bound and improving the insert is lost in the noise. (Unless we disable the GPU load and assume an infinitely fast GPU like Skylake!) -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-06 8:38 ` Daniel Vetter 2015-10-06 9:43 ` Chris Wilson @ 2015-10-06 11:21 ` Dave Gordon 2015-10-06 11:53 ` Chris Wilson 1 sibling, 1 reply; 22+ messages in thread From: Dave Gordon @ 2015-10-06 11:21 UTC (permalink / raw) To: Daniel Vetter, Michel Thierry; +Cc: intel-gfx On 06/10/15 09:38, Daniel Vetter wrote: > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote: >> On 10/5/2015 5:36 PM, Dave Gordon wrote: >>> On 02/10/15 14:16, Michel Thierry wrote: >>>> We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of >>>> range pt in gen6_for_each_pde"). >>>> >>>> But the static analyzer still complains that, just before we break due >>>> to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an >>>> iter value that is bigger than I915_PDES. Of course, this isn't really >>>> a problem since no one uses pt outside the macro. Still, every single >>>> new usage of the macro will create a new issue for us to mark as a >>>> false positive. >>>> >>>> Also, Paulo re-started the discussion a while ago [1], but didn't end up >>>> implemented. >>>> >>>> In order to "solve" this "problem", this patch takes the ideas from >>>> Chris and Dave, but that check would change the desired behavior of the >>>> code, because the object (for example pdp->page_directory[iter]) can be >>>> null during init/alloc, and C would take this as false, breaking the for >>>> loop immediately. >>>> >>>> This has been already verified with "static analysis tools". >>>> >>>> [1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html >>>> >>>> v2: Make it a single statement, while preventing the common subexpression >>>> elimination (Chris) >>>> >>>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> >>>> Cc: Chris Wilson <chris@chris-wilson.co.uk> >>>> Cc: Dave Gordon <david.s.gordon@intel.com> >>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com> >>>> --- >>>> drivers/gpu/drm/i915/i915_gem_gtt.h | 14 ++++++++------ >>>> 1 file changed, 8 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> b/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> index 9fbb07d..a216397 100644 >>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h >>>> @@ -394,7 +394,8 @@ struct i915_hw_ppgtt { >>>> */ >>>> #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \ >>>> for (iter = gen6_pde_index(start); \ >>>> - pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \ >>>> + length > 0 && iter < I915_PDES ? \ >>>> + (pt = (pd)->page_table[iter]), 1 : 0; \ >>>> iter++, \ >>>> temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \ >>>> temp = min_t(unsigned, temp, length), \ >>>> @@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) >>>> */ >>>> #define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ >>>> for (iter = gen8_pde_index(start); \ >>>> - pt = (pd)->page_table[iter], length > 0 && iter < >>>> I915_PDES; \ >>>> + length > 0 && iter < I915_PDES ? \ >>>> + (pt = (pd)->page_table[iter]), 1 : 0; \ >>>> iter++, \ >>>> temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ >>>> temp = min(temp, length), \ >>>> @@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) >>>> >>>> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ >>>> for (iter = gen8_pdpe_index(start); \ >>>> - pd = (pdp)->page_directory[iter], \ >>>> - length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \ >>>> + length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ >>>> + (pd = (pdp)->page_directory[iter]), 1 : 0; \ >>>> iter++, \ >>>> temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ >>>> temp = min(temp, length), \ >>>> @@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr) >>>> >>>> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ >>>> for (iter = gen8_pml4e_index(start); \ >>>> - pdp = (pml4)->pdps[iter], \ >>>> - length > 0 && iter < GEN8_PML4ES_PER_PML4; \ >>>> + length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ >>>> + (pdp = (pml4)->pdps[iter]), 1 : 0; \ >>> >>> this won't compile -- see below >> >> Hmm, it compiled (also got rid of of the "analysis tool error" and didn't >> see any behavior change). >> >>> >>>> iter++, \ >>>> temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ >>>> temp = min(temp, length), \ >>> >>> The man page for C operators tells us: >>> >>> Operator Associativity >>> () [] -> . left to right >>> ! ~ ++ -- + - (type) * & sizeof right to left >>> * / % left to right >>> + - left to right >>> << >> left to right >>> < <= > >= left to right >>> == != left to right >>> & left to right >>> ^ left to right >>> | left to right >>> && left to right >>> || left to right >>> ?: right to left >>> = += -= *= /= %= <<= >>= &= ^= |= right to left >>> , left to right >>> >>> So there's a problem with the above code, because the comma operator is >>> LOWER precedence than either assignment or ?: You'd need to put the >>> parentheses around the (pdp = ... , 1) section, not just the assignment. >>> >>> Or for yet another variation, how about: >>> >>> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ >>> for (iter = gen8_pdpe_index(start); \ >>> iter < I915_PDPES_PER_PDP(dev) && \ >>> (pd = (pdp)->page_directory[iter], length > 0); \ >>> iter++, \ >>> temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ >>> temp = min(temp, length), \ >>> start += temp, length -= temp) >>> >>> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ >>> for (iter = gen8_pml4e_index(start); \ >>> iter < GEN8_PML4ES_PER_PML4 && \ >>> (pdp = (pml4)->pdps[iter], length > 0); \ >>> iter++, \ >>> temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ >>> temp = min(temp, length), \ >>> start += temp, length -= temp) >>> >>> with no ugly ?: at all :) >> >> This also works. >> >> Since it's a change for not a real issue, I don't have a preference. >> I can send either. > > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the > middle and always parses it as a ? (b) : c. If lower-level operators in > the middle could split the ternary operator then it would result in > parsing fail (sinc ? without the : is just useless). So lgtm. Someone > willing to smack an r-b onto the patch? > -Daniel Hmmm ... it failed to compile when I tried it yesterday, but I think I must have bodged the copypaste from the email to the header file, and then just assumed the error message meant something was wrong with the way all those low-precedence operators were stacked together. Anyway I just did it again and it compiles OK now, so it gets my Reviewed-by: Dave Gordon <david.s.gordon@intel.com> ... although I still think my version is (slightly) easier to read. Or it could be improved even more by moving the increment of 'iter' to the end, making it one line shorter and perhaps helping the compiler a little :) #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ for (iter = gen8_pml4e_index(start); \ iter < GEN8_PML4ES_PER_PML4 && \ (pdp = (pml4)->pdps[iter], length > 0); \ temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ temp = min(temp, length), \ start += temp, length -= temp, ++iter) Or, noting that 'temp' is never used in the generated loop bodies, we could eliminate the parameter and make it local to the loop header :) #define gen8_for_each_pml4e(pdp, pml4, start, length, iter) \ for (iter = gen8_pml4e_index(start); \ iter < GEN8_PML4ES_PER_PML4 && \ (pdp = (pml4)->pdps[iter], length > 0); \ ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT); \ temp = min(temp - start, length); \ start += temp, length -= temp; }), \ ++iter) .Dave. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-06 11:21 ` Dave Gordon @ 2015-10-06 11:53 ` Chris Wilson 2015-12-03 15:29 ` Dave Gordon 2015-12-08 13:30 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon 0 siblings, 2 replies; 22+ messages in thread From: Chris Wilson @ 2015-10-06 11:53 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Oct 06, 2015 at 12:21:05PM +0100, Dave Gordon wrote: > ... although I still think my version is (slightly) easier to read. > Or it could be improved even more by moving the increment of 'iter' > to the end, making it one line shorter and perhaps helping the > compiler a little :) > > #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > for (iter = gen8_pml4e_index(start); \ > iter < GEN8_PML4ES_PER_PML4 && \ > (pdp = (pml4)->pdps[iter], length > 0); \ > temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > temp = min(temp, length), \ > start += temp, length -= temp, ++iter) Shorter, yes, but we may as well take advantage of not using [iter] if length == 0, so meh. > Or, noting that 'temp' is never used in the generated loop bodies, > we could eliminate the parameter and make it local to the loop > header :) > > #define gen8_for_each_pml4e(pdp, pml4, start, length, iter) \ > for (iter = gen8_pml4e_index(start); \ > iter < GEN8_PML4ES_PER_PML4 && \ > (pdp = (pml4)->pdps[iter], length > 0); \ > ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT); \ > temp = min(temp - start, length); \ > start += temp, length -= temp; }), \ > ++iter) Removing extraneous parameters from macros is differently a usability win. Care to spin a real patch so we can see how it looks in practice? -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3) 2015-10-06 11:53 ` Chris Wilson @ 2015-12-03 15:29 ` Dave Gordon 2015-12-08 13:30 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon 1 sibling, 0 replies; 22+ messages in thread From: Dave Gordon @ 2015-12-03 15:29 UTC (permalink / raw) To: Chris Wilson, Daniel Vetter, Michel Thierry, intel-gfx [-- Attachment #1: Type: text/plain, Size: 2155 bytes --] On 06/10/15 12:53, Chris Wilson wrote: > On Tue, Oct 06, 2015 at 12:21:05PM +0100, Dave Gordon wrote: >> ... although I still think my version is (slightly) easier to read. >> Or it could be improved even more by moving the increment of 'iter' >> to the end, making it one line shorter and perhaps helping the >> compiler a little :) >> >> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ >> for (iter = gen8_pml4e_index(start); \ >> iter < GEN8_PML4ES_PER_PML4 && \ >> (pdp = (pml4)->pdps[iter], length > 0); \ >> temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ >> temp = min(temp, length), \ >> start += temp, length -= temp, ++iter) > > Shorter, yes, but we may as well take advantage of not using [iter] if > length == 0, so meh. > >> Or, noting that 'temp' is never used in the generated loop bodies, >> we could eliminate the parameter and make it local to the loop >> header :) >> >> #define gen8_for_each_pml4e(pdp, pml4, start, length, iter) \ >> for (iter = gen8_pml4e_index(start); \ >> iter < GEN8_PML4ES_PER_PML4 && \ >> (pdp = (pml4)->pdps[iter], length > 0); \ >> ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT); \ >> temp = min(temp - start, length); \ >> start += temp, length -= temp; }), \ >> ++iter) > > Removing extraneous parameters from macros is differently a usability win. > Care to spin a real patch so we can see how it looks in practice? > -Chris OK, here's the patch that Chris asked for. On compilation, this version comes out about 160 bytes smaller than the original, and the disassembly has 100 fewer lines. This probably doesn't improve performance significantly, though, as many of the uses of these macros are inside debug-only data dumping functions. .Dave. [-- Attachment #2: 0001-drm-i915-eliminate-temp-in-gen8_for_each_-pdd-pdpe-p.patch --] [-- Type: text/x-patch, Size: 10038 bytes --] >From aa121eb42768e19097d93d3a430ab998fc0f62d2 Mon Sep 17 00:00:00 2001 From: Dave Gordon <david.s.gordon@intel.com> Date: Thu, 3 Dec 2015 11:19:58 +0000 Subject: [PATCH] drm/i915: eliminate 'temp' in gen8_for_each_{pdd,pdpe,pml4e} macros Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ All of these iterator macros require a 'temp' argument, used merely to hold internal partial results. We can instead declare the temporary variable inside the macro, so the caller need not provide it. Some of the old code contained nested iterators that actually reused the same 'temp' variable for both inner and outer instances. It's quite surprising that this didn't introduce bugs! But it does show that the value of 'temp' isn't required to persist during the iterated body. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> --- drivers/gpu/drm/i915/i915_gem_gtt.c | 39 ++++++++++++++--------------- drivers/gpu/drm/i915/i915_gem_gtt.h | 49 +++++++++++++++++-------------------- 2 files changed, 41 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1f7e6b9..c25e8b0 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -770,10 +770,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length, scratch_pte); } else { - uint64_t templ4, pml4e; + uint64_t pml4e; struct i915_page_directory_pointer *pdp; - gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) { gen8_ppgtt_clear_pte_range(vm, pdp, start, length, scratch_pte); } @@ -839,10 +839,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, cache_level); } else { struct i915_page_directory_pointer *pdp; - uint64_t templ4, pml4e; + uint64_t pml4e; uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT; - gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) { gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, start, cache_level); } @@ -1020,10 +1020,9 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm, { struct drm_device *dev = vm->dev; struct i915_page_table *pt; - uint64_t temp; uint32_t pde; - gen8_for_each_pde(pt, pd, start, length, temp, pde) { + gen8_for_each_pde(pt, pd, start, length, pde) { /* Don't reallocate page tables */ if (test_bit(pde, pd->used_pdes)) { /* Scratch is never allocated this way */ @@ -1082,13 +1081,12 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm, { struct drm_device *dev = vm->dev; struct i915_page_directory *pd; - uint64_t temp; uint32_t pdpe; uint32_t pdpes = I915_PDPES_PER_PDP(dev); WARN_ON(!bitmap_empty(new_pds, pdpes)); - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { if (test_bit(pdpe, pdp->used_pdpes)) continue; @@ -1136,12 +1134,11 @@ gen8_ppgtt_alloc_page_dirpointers(struct i915_address_space *vm, { struct drm_device *dev = vm->dev; struct i915_page_directory_pointer *pdp; - uint64_t temp; uint32_t pml4e; WARN_ON(!bitmap_empty(new_pdps, GEN8_PML4ES_PER_PML4)); - gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) { + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { if (!test_bit(pml4e, pml4->used_pml4es)) { pdp = alloc_pdp(dev); if (IS_ERR(pdp)) @@ -1225,7 +1222,6 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, struct i915_page_directory *pd; const uint64_t orig_start = start; const uint64_t orig_length = length; - uint64_t temp; uint32_t pdpe; uint32_t pdpes = I915_PDPES_PER_PDP(dev); int ret; @@ -1252,7 +1248,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, } /* For every page directory referenced, allocate page tables */ - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length, new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES)); if (ret) @@ -1264,7 +1260,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, /* Allocations have completed successfully, so set the bitmaps, and do * the mappings. */ - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { gen8_pde_t *const page_directory = kmap_px(pd); struct i915_page_table *pt; uint64_t pd_len = length; @@ -1274,7 +1270,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, /* Every pd should be allocated, we just did that above. */ WARN_ON(!pd); - gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) { + gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) { /* Same reasoning as pd */ WARN_ON(!pt); WARN_ON(!pd_len); @@ -1311,6 +1307,8 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, err_out: while (pdpe--) { + unsigned long temp; + for_each_set_bit(temp, new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES), I915_PDES) free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]); @@ -1333,7 +1331,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); struct i915_page_directory_pointer *pdp; - uint64_t temp, pml4e; + uint64_t pml4e; int ret = 0; /* Do the pml4 allocations first, so we don't need to track the newly @@ -1352,7 +1350,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, "The allocation has spanned more than 512GB. " "It is highly likely this is incorrect."); - gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) { + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { WARN_ON(!pdp); ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length); @@ -1392,10 +1390,9 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp, struct seq_file *m) { struct i915_page_directory *pd; - uint64_t temp; uint32_t pdpe; - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { struct i915_page_table *pt; uint64_t pd_len = length; uint64_t pd_start = start; @@ -1405,7 +1402,7 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp, continue; seq_printf(m, "\tPDPE #%d\n", pdpe); - gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) { + gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) { uint32_t pte; gen8_pte_t *pt_vaddr; @@ -1455,11 +1452,11 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m) if (!USES_FULL_48BIT_PPGTT(vm->dev)) { gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m); } else { - uint64_t templ4, pml4e; + uint64_t pml4e; struct i915_pml4 *pml4 = &ppgtt->pml4; struct i915_page_directory_pointer *pdp; - gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) { + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { if (!test_bit(pml4e, pml4->used_pml4es)) continue; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 877c32c..b448ad8 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -455,32 +455,29 @@ static inline uint32_t gen6_pde_index(uint32_t addr) * between from start until start + length. On gen8+ it simply iterates * over every page directory entry in a page directory. */ -#define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ - for (iter = gen8_pde_index(start); \ - length > 0 && iter < I915_PDES ? \ - (pt = (pd)->page_table[iter]), 1 : 0; \ - iter++, \ - temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ - temp = min(temp, length), \ - start += temp, length -= temp) - -#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ - for (iter = gen8_pdpe_index(start); \ - length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ - (pd = (pdp)->page_directory[iter]), 1 : 0; \ - iter++, \ - temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ - temp = min(temp, length), \ - start += temp, length -= temp) - -#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ - for (iter = gen8_pml4e_index(start); \ - length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ - (pdp = (pml4)->pdps[iter]), 1 : 0; \ - iter++, \ - temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ - temp = min(temp, length), \ - start += temp, length -= temp) +#define gen8_for_each_pde(pt, pd, start, length, iter) \ + for (iter = gen8_pde_index(start); \ + length > 0 && iter < I915_PDES && \ + (pt = (pd)->page_table[iter], true); \ + ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT); \ + temp = min(temp - start, length); \ + start += temp, length -= temp; }), ++iter) + +#define gen8_for_each_pdpe(pd, pdp, start, length, iter) \ + for (iter = gen8_pdpe_index(start); \ + length > 0 && iter < I915_PDPES_PER_PDP(dev) && \ + (pd = (pdp)->page_directory[iter], true); \ + ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT); \ + temp = min(temp - start, length); \ + start += temp, length -= temp; }), ++iter) + +#define gen8_for_each_pml4e(pdp, pml4, start, length, iter) \ + for (iter = gen8_pml4e_index(start); \ + length > 0 && iter < GEN8_PML4ES_PER_PML4 && \ + (pdp = (pml4)->pdps[iter], true); \ + ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT); \ + temp = min(temp - start, length); \ + start += temp, length -= temp; }), ++iter) static inline uint32_t gen8_pte_index(uint64_t address) { -- 1.9.1 [-- Attachment #3: Type: text/plain, Size: 159 bytes --] _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} 2015-10-06 11:53 ` Chris Wilson 2015-12-03 15:29 ` Dave Gordon @ 2015-12-08 13:30 ` Dave Gordon 2015-12-08 13:30 ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon 2015-12-10 8:35 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Daniel Vetter 1 sibling, 2 replies; 22+ messages in thread From: Dave Gordon @ 2015-12-08 13:30 UTC (permalink / raw) To: intel-gfx Way back at the beginning of October, Chris Wilson suggested that cleaning up these macros by removing the redundant 'temp' might be worthwhile. So here's the patch. There's one more thing that might be cleaned up here (but for which I don't have a patch yet), which is that gen8_for_each_pdpe() still references a non-parameter value named 'dev'. Bizarrely, this apparent nonlocal reference is only used as the parameter to another macro -- which then doesn't actually use it! Suggestions on how to tidy this up welcome! One possibility was to define a common iterator (__gen8_for_each()) and then use it to create all the three level-specific macros, which meant that the loop bounds were passed in rather than being part of the iterator. But the common macro needed a LOT of parameters, and didn't really look like an improvement. So I've gone with this simple rewrite for now. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros 2015-12-08 13:30 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon @ 2015-12-08 13:30 ` Dave Gordon 2015-12-10 8:39 ` Daniel Vetter 2015-12-10 8:35 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Daniel Vetter 1 sibling, 1 reply; 22+ messages in thread From: Dave Gordon @ 2015-12-08 13:30 UTC (permalink / raw) To: intel-gfx All of these iterator macros require a 'temp' argument, used merely to hold internal partial results. We can instead declare the temporary variable inside the macro, so the caller need not provide it. Some of the old code contained nested iterators that actually reused the same 'temp' variable for both inner and outer instances. It's quite surprising that this didn't introduce bugs! But it does show that the value of 'temp' isn't required to persist during the iterated body. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> --- drivers/gpu/drm/i915/i915_gem_gtt.c | 39 ++++++++++++++--------------- drivers/gpu/drm/i915/i915_gem_gtt.h | 49 +++++++++++++++++-------------------- 2 files changed, 41 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1f7e6b9..c25e8b0 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -770,10 +770,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length, scratch_pte); } else { - uint64_t templ4, pml4e; + uint64_t pml4e; struct i915_page_directory_pointer *pdp; - gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) { gen8_ppgtt_clear_pte_range(vm, pdp, start, length, scratch_pte); } @@ -839,10 +839,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, cache_level); } else { struct i915_page_directory_pointer *pdp; - uint64_t templ4, pml4e; + uint64_t pml4e; uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT; - gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) { gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, start, cache_level); } @@ -1020,10 +1020,9 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm, { struct drm_device *dev = vm->dev; struct i915_page_table *pt; - uint64_t temp; uint32_t pde; - gen8_for_each_pde(pt, pd, start, length, temp, pde) { + gen8_for_each_pde(pt, pd, start, length, pde) { /* Don't reallocate page tables */ if (test_bit(pde, pd->used_pdes)) { /* Scratch is never allocated this way */ @@ -1082,13 +1081,12 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm, { struct drm_device *dev = vm->dev; struct i915_page_directory *pd; - uint64_t temp; uint32_t pdpe; uint32_t pdpes = I915_PDPES_PER_PDP(dev); WARN_ON(!bitmap_empty(new_pds, pdpes)); - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { if (test_bit(pdpe, pdp->used_pdpes)) continue; @@ -1136,12 +1134,11 @@ gen8_ppgtt_alloc_page_dirpointers(struct i915_address_space *vm, { struct drm_device *dev = vm->dev; struct i915_page_directory_pointer *pdp; - uint64_t temp; uint32_t pml4e; WARN_ON(!bitmap_empty(new_pdps, GEN8_PML4ES_PER_PML4)); - gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) { + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { if (!test_bit(pml4e, pml4->used_pml4es)) { pdp = alloc_pdp(dev); if (IS_ERR(pdp)) @@ -1225,7 +1222,6 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, struct i915_page_directory *pd; const uint64_t orig_start = start; const uint64_t orig_length = length; - uint64_t temp; uint32_t pdpe; uint32_t pdpes = I915_PDPES_PER_PDP(dev); int ret; @@ -1252,7 +1248,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, } /* For every page directory referenced, allocate page tables */ - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length, new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES)); if (ret) @@ -1264,7 +1260,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, /* Allocations have completed successfully, so set the bitmaps, and do * the mappings. */ - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { gen8_pde_t *const page_directory = kmap_px(pd); struct i915_page_table *pt; uint64_t pd_len = length; @@ -1274,7 +1270,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, /* Every pd should be allocated, we just did that above. */ WARN_ON(!pd); - gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) { + gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) { /* Same reasoning as pd */ WARN_ON(!pt); WARN_ON(!pd_len); @@ -1311,6 +1307,8 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, err_out: while (pdpe--) { + unsigned long temp; + for_each_set_bit(temp, new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES), I915_PDES) free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]); @@ -1333,7 +1331,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); struct i915_page_directory_pointer *pdp; - uint64_t temp, pml4e; + uint64_t pml4e; int ret = 0; /* Do the pml4 allocations first, so we don't need to track the newly @@ -1352,7 +1350,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, "The allocation has spanned more than 512GB. " "It is highly likely this is incorrect."); - gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) { + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { WARN_ON(!pdp); ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length); @@ -1392,10 +1390,9 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp, struct seq_file *m) { struct i915_page_directory *pd; - uint64_t temp; uint32_t pdpe; - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { struct i915_page_table *pt; uint64_t pd_len = length; uint64_t pd_start = start; @@ -1405,7 +1402,7 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp, continue; seq_printf(m, "\tPDPE #%d\n", pdpe); - gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) { + gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) { uint32_t pte; gen8_pte_t *pt_vaddr; @@ -1455,11 +1452,11 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m) if (!USES_FULL_48BIT_PPGTT(vm->dev)) { gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m); } else { - uint64_t templ4, pml4e; + uint64_t pml4e; struct i915_pml4 *pml4 = &ppgtt->pml4; struct i915_page_directory_pointer *pdp; - gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) { + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { if (!test_bit(pml4e, pml4->used_pml4es)) continue; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 877c32c..b448ad8 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -455,32 +455,29 @@ static inline uint32_t gen6_pde_index(uint32_t addr) * between from start until start + length. On gen8+ it simply iterates * over every page directory entry in a page directory. */ -#define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ - for (iter = gen8_pde_index(start); \ - length > 0 && iter < I915_PDES ? \ - (pt = (pd)->page_table[iter]), 1 : 0; \ - iter++, \ - temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ - temp = min(temp, length), \ - start += temp, length -= temp) - -#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ - for (iter = gen8_pdpe_index(start); \ - length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ - (pd = (pdp)->page_directory[iter]), 1 : 0; \ - iter++, \ - temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ - temp = min(temp, length), \ - start += temp, length -= temp) - -#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ - for (iter = gen8_pml4e_index(start); \ - length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ - (pdp = (pml4)->pdps[iter]), 1 : 0; \ - iter++, \ - temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ - temp = min(temp, length), \ - start += temp, length -= temp) +#define gen8_for_each_pde(pt, pd, start, length, iter) \ + for (iter = gen8_pde_index(start); \ + length > 0 && iter < I915_PDES && \ + (pt = (pd)->page_table[iter], true); \ + ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT); \ + temp = min(temp - start, length); \ + start += temp, length -= temp; }), ++iter) + +#define gen8_for_each_pdpe(pd, pdp, start, length, iter) \ + for (iter = gen8_pdpe_index(start); \ + length > 0 && iter < I915_PDPES_PER_PDP(dev) && \ + (pd = (pdp)->page_directory[iter], true); \ + ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT); \ + temp = min(temp - start, length); \ + start += temp, length -= temp; }), ++iter) + +#define gen8_for_each_pml4e(pdp, pml4, start, length, iter) \ + for (iter = gen8_pml4e_index(start); \ + length > 0 && iter < GEN8_PML4ES_PER_PML4 && \ + (pdp = (pml4)->pdps[iter], true); \ + ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT); \ + temp = min(temp - start, length); \ + start += temp, length -= temp; }), ++iter) static inline uint32_t gen8_pte_index(uint64_t address) { -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros 2015-12-08 13:30 ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon @ 2015-12-10 8:39 ` Daniel Vetter 0 siblings, 0 replies; 22+ messages in thread From: Daniel Vetter @ 2015-12-10 8:39 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 01:30:51PM +0000, Dave Gordon wrote: > All of these iterator macros require a 'temp' argument, used merely to > hold internal partial results. We can instead declare the temporary > variable inside the macro, so the caller need not provide it. > > Some of the old code contained nested iterators that actually reused the > same 'temp' variable for both inner and outer instances. It's quite > surprising that this didn't introduce bugs! But it does show that the > value of 'temp' isn't required to persist during the iterated body. > > Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Queued for -next, thanks for the patch. -Daniel > --- > drivers/gpu/drm/i915/i915_gem_gtt.c | 39 ++++++++++++++--------------- > drivers/gpu/drm/i915/i915_gem_gtt.h | 49 +++++++++++++++++-------------------- > 2 files changed, 41 insertions(+), 47 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 1f7e6b9..c25e8b0 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -770,10 +770,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, > gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length, > scratch_pte); > } else { > - uint64_t templ4, pml4e; > + uint64_t pml4e; > struct i915_page_directory_pointer *pdp; > > - gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { > + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) { > gen8_ppgtt_clear_pte_range(vm, pdp, start, length, > scratch_pte); > } > @@ -839,10 +839,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, > cache_level); > } else { > struct i915_page_directory_pointer *pdp; > - uint64_t templ4, pml4e; > + uint64_t pml4e; > uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT; > > - gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { > + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, pml4e) { > gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, > start, cache_level); > } > @@ -1020,10 +1020,9 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm, > { > struct drm_device *dev = vm->dev; > struct i915_page_table *pt; > - uint64_t temp; > uint32_t pde; > > - gen8_for_each_pde(pt, pd, start, length, temp, pde) { > + gen8_for_each_pde(pt, pd, start, length, pde) { > /* Don't reallocate page tables */ > if (test_bit(pde, pd->used_pdes)) { > /* Scratch is never allocated this way */ > @@ -1082,13 +1081,12 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm, > { > struct drm_device *dev = vm->dev; > struct i915_page_directory *pd; > - uint64_t temp; > uint32_t pdpe; > uint32_t pdpes = I915_PDPES_PER_PDP(dev); > > WARN_ON(!bitmap_empty(new_pds, pdpes)); > > - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { > + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { > if (test_bit(pdpe, pdp->used_pdpes)) > continue; > > @@ -1136,12 +1134,11 @@ gen8_ppgtt_alloc_page_dirpointers(struct i915_address_space *vm, > { > struct drm_device *dev = vm->dev; > struct i915_page_directory_pointer *pdp; > - uint64_t temp; > uint32_t pml4e; > > WARN_ON(!bitmap_empty(new_pdps, GEN8_PML4ES_PER_PML4)); > > - gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) { > + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { > if (!test_bit(pml4e, pml4->used_pml4es)) { > pdp = alloc_pdp(dev); > if (IS_ERR(pdp)) > @@ -1225,7 +1222,6 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, > struct i915_page_directory *pd; > const uint64_t orig_start = start; > const uint64_t orig_length = length; > - uint64_t temp; > uint32_t pdpe; > uint32_t pdpes = I915_PDPES_PER_PDP(dev); > int ret; > @@ -1252,7 +1248,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, > } > > /* For every page directory referenced, allocate page tables */ > - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { > + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { > ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length, > new_page_tables + pdpe * BITS_TO_LONGS(I915_PDES)); > if (ret) > @@ -1264,7 +1260,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, > > /* Allocations have completed successfully, so set the bitmaps, and do > * the mappings. */ > - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { > + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { > gen8_pde_t *const page_directory = kmap_px(pd); > struct i915_page_table *pt; > uint64_t pd_len = length; > @@ -1274,7 +1270,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, > /* Every pd should be allocated, we just did that above. */ > WARN_ON(!pd); > > - gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) { > + gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) { > /* Same reasoning as pd */ > WARN_ON(!pt); > WARN_ON(!pd_len); > @@ -1311,6 +1307,8 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, > > err_out: > while (pdpe--) { > + unsigned long temp; > + > for_each_set_bit(temp, new_page_tables + pdpe * > BITS_TO_LONGS(I915_PDES), I915_PDES) > free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]); > @@ -1333,7 +1331,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, > struct i915_hw_ppgtt *ppgtt = > container_of(vm, struct i915_hw_ppgtt, base); > struct i915_page_directory_pointer *pdp; > - uint64_t temp, pml4e; > + uint64_t pml4e; > int ret = 0; > > /* Do the pml4 allocations first, so we don't need to track the newly > @@ -1352,7 +1350,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, > "The allocation has spanned more than 512GB. " > "It is highly likely this is incorrect."); > > - gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) { > + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { > WARN_ON(!pdp); > > ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length); > @@ -1392,10 +1390,9 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp, > struct seq_file *m) > { > struct i915_page_directory *pd; > - uint64_t temp; > uint32_t pdpe; > > - gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) { > + gen8_for_each_pdpe(pd, pdp, start, length, pdpe) { > struct i915_page_table *pt; > uint64_t pd_len = length; > uint64_t pd_start = start; > @@ -1405,7 +1402,7 @@ static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp, > continue; > > seq_printf(m, "\tPDPE #%d\n", pdpe); > - gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) { > + gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) { > uint32_t pte; > gen8_pte_t *pt_vaddr; > > @@ -1455,11 +1452,11 @@ static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m) > if (!USES_FULL_48BIT_PPGTT(vm->dev)) { > gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m); > } else { > - uint64_t templ4, pml4e; > + uint64_t pml4e; > struct i915_pml4 *pml4 = &ppgtt->pml4; > struct i915_page_directory_pointer *pdp; > > - gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) { > + gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) { > if (!test_bit(pml4e, pml4->used_pml4es)) > continue; > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h > index 877c32c..b448ad8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.h > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h > @@ -455,32 +455,29 @@ static inline uint32_t gen6_pde_index(uint32_t addr) > * between from start until start + length. On gen8+ it simply iterates > * over every page directory entry in a page directory. > */ > -#define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ > - for (iter = gen8_pde_index(start); \ > - length > 0 && iter < I915_PDES ? \ > - (pt = (pd)->page_table[iter]), 1 : 0; \ > - iter++, \ > - temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start, \ > - temp = min(temp, length), \ > - start += temp, length -= temp) > - > -#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ > - for (iter = gen8_pdpe_index(start); \ > - length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \ > - (pd = (pdp)->page_directory[iter]), 1 : 0; \ > - iter++, \ > - temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ > - temp = min(temp, length), \ > - start += temp, length -= temp) > - > -#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \ > - for (iter = gen8_pml4e_index(start); \ > - length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \ > - (pdp = (pml4)->pdps[iter]), 1 : 0; \ > - iter++, \ > - temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \ > - temp = min(temp, length), \ > - start += temp, length -= temp) > +#define gen8_for_each_pde(pt, pd, start, length, iter) \ > + for (iter = gen8_pde_index(start); \ > + length > 0 && iter < I915_PDES && \ > + (pt = (pd)->page_table[iter], true); \ > + ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT); \ > + temp = min(temp - start, length); \ > + start += temp, length -= temp; }), ++iter) > + > +#define gen8_for_each_pdpe(pd, pdp, start, length, iter) \ > + for (iter = gen8_pdpe_index(start); \ > + length > 0 && iter < I915_PDPES_PER_PDP(dev) && \ > + (pd = (pdp)->page_directory[iter], true); \ > + ({ u64 temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT); \ > + temp = min(temp - start, length); \ > + start += temp, length -= temp; }), ++iter) > + > +#define gen8_for_each_pml4e(pdp, pml4, start, length, iter) \ > + for (iter = gen8_pml4e_index(start); \ > + length > 0 && iter < GEN8_PML4ES_PER_PML4 && \ > + (pdp = (pml4)->pdps[iter], true); \ > + ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT); \ > + temp = min(temp - start, length); \ > + start += temp, length -= temp; }), ++iter) > > static inline uint32_t gen8_pte_index(uint64_t address) > { > -- > 1.9.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} 2015-12-08 13:30 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon 2015-12-08 13:30 ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon @ 2015-12-10 8:35 ` Daniel Vetter 1 sibling, 0 replies; 22+ messages in thread From: Daniel Vetter @ 2015-12-10 8:35 UTC (permalink / raw) To: Dave Gordon; +Cc: intel-gfx On Tue, Dec 08, 2015 at 01:30:50PM +0000, Dave Gordon wrote: > Way back at the beginning of October, Chris Wilson suggested that cleaning > up these macros by removing the redundant 'temp' might be worthwhile. So > here's the patch. > > There's one more thing that might be cleaned up here (but for which > I don't have a patch yet), which is that gen8_for_each_pdpe() still > references a non-parameter value named 'dev'. Bizarrely, this apparent > nonlocal reference is only used as the parameter to another macro -- which > then doesn't actually use it! Suggestions on how to tidy this up welcome! Iirc this is fallout from how we originally handled full ppgtt and aliasing ppgtt. Old versions tried to transparently fall back to simpler code, nowadays we just declare the gpu wedged. That's why there's no need to look at dev any more. I think you can just remove that. -Daniel > > One possibility was to define a common iterator (__gen8_for_each()) > and then use it to create all the three level-specific macros, which > meant that the loop bounds were passed in rather than being part of the > iterator. But the common macro needed a LOT of parameters, and didn't > really look like an improvement. So I've gone with this simple rewrite > for now. > > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2015-12-10 8:39 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-01 15:59 [PATCH] drm/i915: prevent out of range pt in the PDE macros (take 3) Michel Thierry
2015-10-01 16:09 ` Chris Wilson
2015-10-02 12:47 ` Michel Thierry
2015-10-02 12:58 ` Chris Wilson
2015-10-02 13:11 ` Michel Thierry
2015-10-02 7:58 ` Daniel Vetter
2015-10-02 8:58 ` Chris Wilson
2015-10-02 10:52 ` Daniel Vetter
2015-10-02 13:16 ` [PATCH v2] " Michel Thierry
2015-10-05 16:36 ` Dave Gordon
2015-10-05 16:59 ` Michel Thierry
2015-10-06 8:38 ` Daniel Vetter
2015-10-06 9:43 ` Chris Wilson
2015-10-06 10:09 ` Daniel Vetter
2015-10-06 10:19 ` Chris Wilson
2015-10-06 11:21 ` Dave Gordon
2015-10-06 11:53 ` Chris Wilson
2015-12-03 15:29 ` Dave Gordon
2015-12-08 13:30 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Dave Gordon
2015-12-08 13:30 ` [PATCH 1/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} macros Dave Gordon
2015-12-10 8:39 ` Daniel Vetter
2015-12-10 8:35 ` [PATCH 0/1] drm/i915: eliminate 'temp' in gen8_for_each_{pdd, pdpe, pml4e} Daniel Vetter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox