* [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
@ 2017-03-10 9:42 Chris Wilson
2017-03-10 9:58 ` Mika Kuoppala
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 9:42 UTC (permalink / raw)
To: intel-gfx
We only need to clflush those cachelines that we have validated to be
read by the GPU. Userspace typically fills the batch length in
correctly, the exceptions tend to be explicit tests within igt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 21b1cd917d81..b9ce9a6881ea 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
}
if (ret == 0 && needs_clflush_after)
- drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
+ drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
+ (void *)cmd - shadow_batch_obj->mm.mapping);
i915_gem_object_unpin_map(shadow_batch_obj);
return ret;
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 9:42 [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines Chris Wilson
@ 2017-03-10 9:58 ` Mika Kuoppala
2017-03-10 10:04 ` Chris Wilson
2017-03-10 10:46 ` [PATCH v2] " Chris Wilson
2017-03-10 11:55 ` [PATCH v3] " Chris Wilson
2 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2017-03-10 9:58 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> We only need to clflush those cachelines that we have validated to be
> read by the GPU. Userspace typically fills the batch length in
> correctly, the exceptions tend to be explicit tests within igt.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 21b1cd917d81..b9ce9a6881ea 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> }
>
> if (ret == 0 && needs_clflush_after)
> - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> + (void *)cmd - shadow_batch_obj->mm.mapping);
(void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
We get away as the wb mapping being zero but for correctness.
-Mika
> i915_gem_object_unpin_map(shadow_batch_obj);
>
> return ret;
> --
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 9:58 ` Mika Kuoppala
@ 2017-03-10 10:04 ` Chris Wilson
2017-03-10 10:19 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 10:04 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > We only need to clflush those cachelines that we have validated to be
> > read by the GPU. Userspace typically fills the batch length in
> > correctly, the exceptions tend to be explicit tests within igt.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > index 21b1cd917d81..b9ce9a6881ea 100644
> > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> > }
> >
> > if (ret == 0 && needs_clflush_after)
> > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> > + (void *)cmd - shadow_batch_obj->mm.mapping);
>
> (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
>
> We get away as the wb mapping being zero but for correctness.
The low bits of mm.mapping cannot change the cacheline and so doesn't
affect the clflush. Same as before.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 10:04 ` Chris Wilson
@ 2017-03-10 10:19 ` Chris Wilson
2017-03-10 10:26 ` Mika Kuoppala
0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 10:19 UTC (permalink / raw)
To: Mika Kuoppala, intel-gfx
On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> >
> > > We only need to clflush those cachelines that we have validated to be
> > > read by the GPU. Userspace typically fills the batch length in
> > > correctly, the exceptions tend to be explicit tests within igt.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > index 21b1cd917d81..b9ce9a6881ea 100644
> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> > > }
> > >
> > > if (ret == 0 && needs_clflush_after)
> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> > > + (void *)cmd - shadow_batch_obj->mm.mapping);
> >
> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> >
> > We get away as the wb mapping being zero but for correctness.
>
> The low bits of mm.mapping cannot change the cacheline and so doesn't
> affect the clflush. Same as before.
Although, not quite the same as before as before we didn't use a fixed
end-point and so it could overshoot by a cacheline, even across a page
boundary.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 10:19 ` Chris Wilson
@ 2017-03-10 10:26 ` Mika Kuoppala
2017-03-10 10:39 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2017-03-10 10:26 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
>> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
>> > Chris Wilson <chris@chris-wilson.co.uk> writes:
>> >
>> > > We only need to clflush those cachelines that we have validated to be
>> > > read by the GPU. Userspace typically fills the batch length in
>> > > correctly, the exceptions tend to be explicit tests within igt.
>> > >
>> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > ---
>> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
>> > > 1 file changed, 2 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> > > index 21b1cd917d81..b9ce9a6881ea 100644
>> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
>> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
>> > > }
>> > >
>> > > if (ret == 0 && needs_clflush_after)
>> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
>> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
>> > > + (void *)cmd - shadow_batch_obj->mm.mapping);
>> >
>> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
>> >
>> > We get away as the wb mapping being zero but for correctness.
>>
>> The low bits of mm.mapping cannot change the cacheline and so doesn't
>> affect the clflush. Same as before.
>
> Although, not quite the same as before as before we didn't use a fixed
> end-point and so it could overshoot by a cacheline, even across a page
> boundary.
And there could be PAGE_MASK worth of lowbits.
-Mika
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 10:26 ` Mika Kuoppala
@ 2017-03-10 10:39 ` Chris Wilson
2017-03-10 10:42 ` Mika Kuoppala
0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 10:39 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> >> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> >
> >> > > We only need to clflush those cachelines that we have validated to be
> >> > > read by the GPU. Userspace typically fills the batch length in
> >> > > correctly, the exceptions tend to be explicit tests within igt.
> >> > >
> >> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > > ---
> >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> >> > > }
> >> > >
> >> > > if (ret == 0 && needs_clflush_after)
> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> >> > > + (void *)cmd - shadow_batch_obj->mm.mapping);
> >> >
> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> >> >
> >> > We get away as the wb mapping being zero but for correctness.
> >>
> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
> >> affect the clflush. Same as before.
> >
> > Although, not quite the same as before as before we didn't use a fixed
> > end-point and so it could overshoot by a cacheline, even across a page
> > boundary.
>
> And there could be PAGE_MASK worth of lowbits.
There can? Wasn't anticipating adding a few thousand memory types, and
then sharing a single cache. :-p
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 10:39 ` Chris Wilson
@ 2017-03-10 10:42 ` Mika Kuoppala
2017-03-10 10:50 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2017-03-10 10:42 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>> > On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
>> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
>> >> > Chris Wilson <chris@chris-wilson.co.uk> writes:
>> >> >
>> >> > > We only need to clflush those cachelines that we have validated to be
>> >> > > read by the GPU. Userspace typically fills the batch length in
>> >> > > correctly, the exceptions tend to be explicit tests within igt.
>> >> > >
>> >> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> >> > > ---
>> >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
>> >> > > 1 file changed, 2 insertions(+), 1 deletion(-)
>> >> > >
>> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
>> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
>> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
>> >> > > }
>> >> > >
>> >> > > if (ret == 0 && needs_clflush_after)
>> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
>> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
>> >> > > + (void *)cmd - shadow_batch_obj->mm.mapping);
>> >> >
>> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
>> >> >
>> >> > We get away as the wb mapping being zero but for correctness.
>> >>
>> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
>> >> affect the clflush. Same as before.
>> >
>> > Although, not quite the same as before as before we didn't use a fixed
>> > end-point and so it could overshoot by a cacheline, even across a page
>> > boundary.
>>
>> And there could be PAGE_MASK worth of lowbits.
>
> There can? Wasn't anticipating adding a few thousand memory types, and
> then sharing a single cache. :-p
#define ptr_mask_bits(ptr) ({
\
unsigned long __v = (unsigned long)(ptr);
\
(typeof(ptr))(__v & PAGE_MASK);
\
})
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 9:42 [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines Chris Wilson
2017-03-10 9:58 ` Mika Kuoppala
@ 2017-03-10 10:46 ` Chris Wilson
2017-03-10 11:19 ` Mika Kuoppala
2017-03-10 11:55 ` [PATCH v3] " Chris Wilson
2 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 10:46 UTC (permalink / raw)
To: intel-gfx
We only need to clflush those cachelines that we have validated to be
read by the GPU. Userspace typically fills the batch length in
correctly, the exceptions tend to be explicit tests within igt.
v2: Use ptr_mask_bits() to make Mika happy
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 21b1cd917d81..787980d52272 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1330,8 +1330,10 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
ret = -EINVAL;
}
- if (ret == 0 && needs_clflush_after)
- drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
+ if (ret == 0 && needs_clflush_after) {
+ void *ptr = ptr_mask_bits(shadow_batch_obj->mm.mapping);
+ drm_clflush_virt_range(ptr, (void *)cmd - ptr);
+ }
i915_gem_object_unpin_map(shadow_batch_obj);
return ret;
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 10:42 ` Mika Kuoppala
@ 2017-03-10 10:50 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 10:50 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Fri, Mar 10, 2017 at 12:42:34PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >>
> >> > On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
> >> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> >> >> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> >> >
> >> >> > > We only need to clflush those cachelines that we have validated to be
> >> >> > > read by the GPU. Userspace typically fills the batch length in
> >> >> > > correctly, the exceptions tend to be explicit tests within igt.
> >> >> > >
> >> >> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> >> > > ---
> >> >> > > drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >> >> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> >> >> > >
> >> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
> >> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> >> >> > > }
> >> >> > >
> >> >> > > if (ret == 0 && needs_clflush_after)
> >> >> > > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> >> >> > > + drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> >> >> > > + (void *)cmd - shadow_batch_obj->mm.mapping);
> >> >> >
> >> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> >> >> >
> >> >> > We get away as the wb mapping being zero but for correctness.
> >> >>
> >> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
> >> >> affect the clflush. Same as before.
> >> >
> >> > Although, not quite the same as before as before we didn't use a fixed
> >> > end-point and so it could overshoot by a cacheline, even across a page
> >> > boundary.
> >>
> >> And there could be PAGE_MASK worth of lowbits.
> >
> > There can? Wasn't anticipating adding a few thousand memory types, and
> > then sharing a single cache. :-p
>
> #define ptr_mask_bits(ptr) ({
> \
> unsigned long __v = (unsigned long)(ptr);
> \
> (typeof(ptr))(__v & PAGE_MASK);
> \
> })
Yes, I didn't name that function very well. The low bits of
obj->mm.mapping currently represent and index into the cache.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 10:46 ` [PATCH v2] " Chris Wilson
@ 2017-03-10 11:19 ` Mika Kuoppala
2017-03-10 11:39 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2017-03-10 11:19 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> We only need to clflush those cachelines that we have validated to be
> read by the GPU. Userspace typically fills the batch length in
> correctly, the exceptions tend to be explicit tests within igt.
>
> v2: Use ptr_mask_bits() to make Mika happy
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 21b1cd917d81..787980d52272 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -1330,8 +1330,10 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> ret = -EINVAL;
> }
>
> - if (ret == 0 && needs_clflush_after)
> - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> + if (ret == 0 && needs_clflush_after) {
> + void *ptr = ptr_mask_bits(shadow_batch_obj->mm.mapping);
> + drm_clflush_virt_range(ptr, (void *)cmd - ptr);
Your ptr is page aligned, and the cmd is pointing to a BATCH_BUFFER_END
right at the next cacheline boundary. Apparently in this case you would
get flush only to the previous cacheline and not the one containg the
BB_END?
-Mika
> + }
> i915_gem_object_unpin_map(shadow_batch_obj);
>
> return ret;
> --
> 2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 11:19 ` Mika Kuoppala
@ 2017-03-10 11:39 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 11:39 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Fri, Mar 10, 2017 at 01:19:01PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > We only need to clflush those cachelines that we have validated to be
> > read by the GPU. Userspace typically fills the batch length in
> > correctly, the exceptions tend to be explicit tests within igt.
> >
> > v2: Use ptr_mask_bits() to make Mika happy
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_cmd_parser.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > index 21b1cd917d81..787980d52272 100644
> > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> > @@ -1330,8 +1330,10 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> > ret = -EINVAL;
> > }
> >
> > - if (ret == 0 && needs_clflush_after)
> > - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> > + if (ret == 0 && needs_clflush_after) {
> > + void *ptr = ptr_mask_bits(shadow_batch_obj->mm.mapping);
> > + drm_clflush_virt_range(ptr, (void *)cmd - ptr);
>
> Your ptr is page aligned, and the cmd is pointing to a BATCH_BUFFER_END
> right at the next cacheline boundary. Apparently in this case you would
> get flush only to the previous cacheline and not the one containg the
> BB_END?
No, I thought cmd was always after. That upsets cmd >= batch_end....
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 9:42 [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines Chris Wilson
2017-03-10 9:58 ` Mika Kuoppala
2017-03-10 10:46 ` [PATCH v2] " Chris Wilson
@ 2017-03-10 11:55 ` Chris Wilson
2017-03-10 12:41 ` Mika Kuoppala
2 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 11:55 UTC (permalink / raw)
To: intel-gfx
We only need to clflush those cachelines that we have validated to be
read by the GPU. Userspace typically fills the batch length in
correctly, the exceptions tend to be explicit tests within igt.
v2: Use ptr_mask_bits() to make Mika happy
v3: cmd is not advanced on MI_BBE, so make sure to include an extra
dword in the clflush.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_cmd_parser.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 21b1cd917d81..7af100f84410 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1279,11 +1279,17 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
* space. Parsing should be faster in some cases this way.
*/
batch_end = cmd + (batch_len / sizeof(*batch_end));
- while (cmd < batch_end) {
+ do {
u32 length;
- if (*cmd == MI_BATCH_BUFFER_END)
+ if (*cmd == MI_BATCH_BUFFER_END) {
+ if (needs_clflush_after) {
+ void *ptr = ptr_mask_bits(shadow_batch_obj->mm.mapping);
+ drm_clflush_virt_range(ptr,
+ (void *)(cmd + 1) - ptr);
+ }
break;
+ }
desc = find_cmd(engine, *cmd, desc, &default_desc);
if (!desc) {
@@ -1323,17 +1329,14 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
}
cmd += length;
- }
-
- if (cmd >= batch_end) {
- DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
- ret = -EINVAL;
- }
+ if (cmd >= batch_end) {
+ DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
+ ret = -EINVAL;
+ break;
+ }
+ } while (1);
- if (ret == 0 && needs_clflush_after)
- drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
i915_gem_object_unpin_map(shadow_batch_obj);
-
return ret;
}
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 11:55 ` [PATCH v3] " Chris Wilson
@ 2017-03-10 12:41 ` Mika Kuoppala
2017-03-10 13:19 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2017-03-10 12:41 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> We only need to clflush those cachelines that we have validated to be
> read by the GPU. Userspace typically fills the batch length in
> correctly, the exceptions tend to be explicit tests within igt.
>
> v2: Use ptr_mask_bits() to make Mika happy
> v3: cmd is not advanced on MI_BBE, so make sure to include an extra
> dword in the clflush.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I am now very happy.
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_cmd_parser.c | 25 ++++++++++++++-----------
> 1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 21b1cd917d81..7af100f84410 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -1279,11 +1279,17 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> * space. Parsing should be faster in some cases this way.
> */
> batch_end = cmd + (batch_len / sizeof(*batch_end));
> - while (cmd < batch_end) {
> + do {
> u32 length;
>
> - if (*cmd == MI_BATCH_BUFFER_END)
> + if (*cmd == MI_BATCH_BUFFER_END) {
> + if (needs_clflush_after) {
> + void *ptr = ptr_mask_bits(shadow_batch_obj->mm.mapping);
> + drm_clflush_virt_range(ptr,
> + (void *)(cmd + 1) - ptr);
> + }
> break;
> + }
>
> desc = find_cmd(engine, *cmd, desc, &default_desc);
> if (!desc) {
> @@ -1323,17 +1329,14 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> }
>
> cmd += length;
> - }
> -
> - if (cmd >= batch_end) {
> - DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
> - ret = -EINVAL;
> - }
> + if (cmd >= batch_end) {
> + DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
> + ret = -EINVAL;
> + break;
> + }
> + } while (1);
>
> - if (ret == 0 && needs_clflush_after)
> - drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> i915_gem_object_unpin_map(shadow_batch_obj);
> -
> return ret;
> }
>
> --
> 2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3] drm/i915/cmdparser: Limit clflush to active cachelines
2017-03-10 12:41 ` Mika Kuoppala
@ 2017-03-10 13:19 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2017-03-10 13:19 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx
On Fri, Mar 10, 2017 at 02:41:16PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > We only need to clflush those cachelines that we have validated to be
> > read by the GPU. Userspace typically fills the batch length in
> > correctly, the exceptions tend to be explicit tests within igt.
> >
> > v2: Use ptr_mask_bits() to make Mika happy
> > v3: cmd is not advanced on MI_BBE, so make sure to include an extra
> > dword in the clflush.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> I am now very happy.
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
And after all that I pushed without adding the r-b. :(
I should go and crawl underneath a rock.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-03-10 13:19 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-10 9:42 [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines Chris Wilson
2017-03-10 9:58 ` Mika Kuoppala
2017-03-10 10:04 ` Chris Wilson
2017-03-10 10:19 ` Chris Wilson
2017-03-10 10:26 ` Mika Kuoppala
2017-03-10 10:39 ` Chris Wilson
2017-03-10 10:42 ` Mika Kuoppala
2017-03-10 10:50 ` Chris Wilson
2017-03-10 10:46 ` [PATCH v2] " Chris Wilson
2017-03-10 11:19 ` Mika Kuoppala
2017-03-10 11:39 ` Chris Wilson
2017-03-10 11:55 ` [PATCH v3] " Chris Wilson
2017-03-10 12:41 ` Mika Kuoppala
2017-03-10 13:19 ` Chris Wilson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.