From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= Subject: Re: [PATCH] drm/i915: GFDT support for SNB/IVB Date: Thu, 4 Apr 2013 14:27:12 +0300 Message-ID: <20130404112712.GY4469@intel.com> References: <1362162777-21626-1-git-send-email-ville.syrjala@linux.intel.com> <1362587289-20059-1-git-send-email-chris@chris-wilson.co.uk> <20130404090102.GV2228@phenom.ffwll.local> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTP id 56A6EE5C31 for ; Thu, 4 Apr 2013 04:27:16 -0700 (PDT) Content-Disposition: inline In-Reply-To: <20130404090102.GV2228@phenom.ffwll.local> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Daniel Vetter Cc: intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org On Thu, Apr 04, 2013 at 11:01:02AM +0200, Daniel Vetter wrote: > On Wed, Mar 06, 2013 at 04:28:09PM +0000, Chris Wilson wrote: > > From: Ville Syrj=E4l=E4 > > = > > Currently all scanout buffers must be uncached because the > > display controller doesn't snoop the LLC. SNB introduced another > > method to guarantee coherency for the display controller. It's > > called the GFDT or graphics data type. > > = > > Pages that have the GFDT bit enabled in their PTEs get flushed > > all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is > > issued with the "synchronize GFDT" bit set. > > = > > So rather than making all scanout buffers uncached, set the GFDT > > bit in their PTEs, and modify the ring flush functions to enable > > the "synchronize GFDT" bit. > > = > > On HSW the GFDT bit was removed from the PTE, and it's only present in > > surface state, so we can't really set it from the kernel. Also the docs > > state that the hardware isn't actually guaranteed to respect the GFDT > > bit. So it looks like GFDT isn't all that useful on HSW. > > = > > So far I've tried this very quickly on an IVB machine, and > > it seems to be working as advertised. No idea if it does any > > good though. > > = > > On an i5-2520m (laptop) running gnome-shell at 1366x768: > > padman 140.78 -> 145.98 fps > > openarena 183.72 -> 186.87 fps > > gtkperf ComboBoxEntry 20.27 -> 22.14s > > gtkperf pixbuf 1.12 -> 1.47s > > x11perf -aa10text 13.40 -> 13.20 Mglyphs > > which are well within the throttling noise. > > = > > v2 [ickle]: adapt to comply with existing userspace guarantees > > = > > Signed-off-by: Ville Syrj=E4l=E4 > > Signed-off-by: Chris Wilson > = > Oops, this one here somehow fell a bit through the cracks. > = > Two bikesheds and one real issue: > - s/bool gfdt/unsigned caching_flags/ for the gtt pte punchers. I expect > more fun to come here ;-) > - ring->flush has a pile of arguments, I guess we should coalesce > invalidate/flush and the new internal into a simple flags array. I don't > expect that we'll every switch invalidate/flush back to domain arrays > from the current "it's just a bool, really" usage. > = > Also, the above should be done in separate prep patches imo. > = > The real deal is flushing cpu access, since that probably does not set the > gfdt flag. So cpu reads are fine and don't require any flushing, but cpu > writes must be clflushed I think. In a way gfdt works as if the gpu is > doing write-through caching (safe that we have to manually initiate the > write-through with the gfdt flush). But from the pov of cpu access it's > the same, and hence requires the same amount of clflushing. > = > Hence I'm leaning towards adding a new I915_CACHING_WT cache_level. Ofc > for gfdt we still need to keep track of the gfdt_dirty state, but all the > cpu side flushing business should be much clearer. > = > Stumbled over this while checking whether sw_finish_ioctl would still do > the right thing, and noticed that the clflush is a noop when gfdt is > treated like the current CACHE_LLC. My original patch had a change to clflushing where pinned objects w/ gfdt were also flushed. I didn't really read v2 well enough to figure out why that got dropped. -- = Ville Syrj=E4l=E4 Intel OTC