From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= <ville.syrjala@linux.intel.com>
Subject: Re: [PATCH 3/4] drm/i915: Update rules for writing
 through the LLC with the cpu
Date: Tue, 6 Aug 2013 22:08:38 +0300
Message-ID: <20130806190838.GN5004@intel.com>
References: <1375791425-14978-1-git-send-email-chris@chris-wilson.co.uk>
	<1375791425-14978-3-git-send-email-chris@chris-wilson.co.uk>
	<20130806142425.GH5004@intel.com>
	<20130806161614.GA6075@cantiga.alporthouse.com>
	<20130806180301.GL5004@intel.com>
	<20130806184540.GA8181@cantiga.alporthouse.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Return-path: <intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
	by gabe.freedesktop.org (Postfix) with ESMTP id C1EBFE5D33
	for <intel-gfx@lists.freedesktop.org>;
	Tue,  6 Aug 2013 12:08:41 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <20130806184540.GA8181@cantiga.alporthouse.com>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On Tue, Aug 06, 2013 at 07:45:40PM +0100, Chris Wilson wrote:
> On Tue, Aug 06, 2013 at 09:03:01PM +0300, Ville Syrj=E4l=E4 wrote:
> > On Tue, Aug 06, 2013 at 05:16:14PM +0100, Chris Wilson wrote:
> > > On Tue, Aug 06, 2013 at 05:24:25PM +0300, Ville Syrj=E4l=E4 wrote:
> > > > On Tue, Aug 06, 2013 at 01:17:04PM +0100, Chris Wilson wrote:
> > > > > +static bool cpu_write_needs_clflush(struct drm_i915_gem_object *=
obj)
> > > > > +{
> > > > > +	if (!cpu_cache_is_coherent(obj->base.dev, obj->cache_level))
> > > > > +		return true;
> > > > > +
> > > > > +	return atomic_read(&obj->fb_count) > 0;
> > > > > +}
> > > > =

> > > > I was thinking a bit about this fb_count thing. The other option wo=
uld
> > > > be to track actual scanout activity, which wouldn't be hard to do, =
but
> > > > I guess that could shift a bunch of the clflush cost to page flip t=
ime.
> > > > So maybe we don't want that?
> > > =

> > > Different series, see the frontbuffer write tracking. As for page fli=
p,
> > > that is currently the only time we clflush...
> > >  =

> > > > Also i915_gem_sw_finish_ioctl() still looks at pin_count as an
> > > > indication whether scanout is happening. We should add an fb_count
> > > > check there as well. Maybe we should leave the pin_count check ther=
e,
> > > > so that framebuffers that aren't being scanned out currently can be
> > > > massaged a bit more freely before a clflush needs to be issued?
> > > =

> > > It does do a fb_count check, the pin_count + fb_count is a really str=
ong
> > > indicator of active scanout.
> > =

> > The fb_count checks is done only through the eventual check in
> > i915_gem_clflush_object() which is why I didn't think of it.
> > =

> > On non-LLC platforms it also ends up doing the flush for all
> > non-snooped non-scanout buffers, which would get optimized away if the
> > fb_count check was where the pin_count check is.
> =

> But it does still do the fb_count check before the flush? What am I
> missing here?

I believe the code now ends up doing this (on non-LLC):
 if (pin_count && (cache_level =3D=3D UC || fb_count > 0) && write_domain =
=3D=3D CPU)
   clflush();

So it'll flush also all pinned non-snooped buffers whether they be
scanout or not. We could delay such flushes until the bo gets
moved away from the CPU write domain. But maybe it's just a nonsense
scenario. Why would someone write to a pinned bo unless the pinning is
due to scanout?

> > > > > -	    obj->tiling_mode =3D=3D I915_TILING_NONE &&
> > > > > -	    obj->base.write_domain !=3D I915_GEM_DOMAIN_CPU) {
> > > > > +	if (obj->tiling_mode =3D=3D I915_TILING_NONE &&
> > > > > +	    !cpu_write_needs_clflush(obj)) {
> > > > =

> > > > This would break non-LLC platforms, I think. Before, GTT writes were
> > > > used for non-snooped buffers w/ clean CPU caches (write domain !=3D=
 cpu),
> > > > which makes sense since access via GTT doesn't snoop ever according
> > > > to the docs. Now I think it'll do GTT writes for snooped non-scanout
> > > > buffers.
> > > =

> > > D'oh. What I was aiming for was to always use shmem writes on snooped
> > > bo, irrespective of whether it is GPU hot.
> > > =

> > > s/!cpu_write_needs_clflush/cpu_write_needs_clflush/
> > =

> > So that means a GTT write for anything non-snooped or scanout. If it's
> > snooped scanout (I guess we may not really care about such a weird
> > combination), or non-snooped but in CPU write domain, that would
> > still have issues with possibly dirty data in the CPU caches.
> =

> Hmm, that should be fixed up by i915_gem_gtt_pwrite_fast() as we flush
> all the caches before we start overwriting. I've thoroughly confused
> myself over which paths do the fancy tricks. :(
>  =

> > Might be intersting to try on real hardware what the GTT access
> > snooping behaviour really is, especially for some modern non-LLC
> > system like VLV.
> =

> I remember why I have the !DOMAIN_CPU test now. If we have already paid
> the price to pull it back into the CPU domain, let it rest there until
> it is used again.
> =

> So as it stands, I think we want:
> =

> if (obj->tiling_mode =3D=3D I915_TILING_NONE &&
>     obj->base.write_domain !=3D I915_GEM_DOMAIN_CPU &&
>     cpu_write_needs_clflush(obj))
> 	i915_gem_gtt_pwrite_fast();

Yeah that looks better to me at least.

-- =

Ville Syrj=E4l=E4
Intel OTC