From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ben Widawsky <ben@bwidawsk.net>
Subject: Re: [PATCH] [RFC] drm/i915: read-read semaphore
	optimization
Date: Mon, 16 Jan 2012 14:20:55 -0800
Message-ID: <4F14A2C7.7020006@bwidawsk.net>
References: <1323748328-10153-1-git-send-email-ben@bwidawsk.net>
	<87fwgoidza.fsf@eliezer.anholt.net> <4EE79B1F.2000707@bwidawsk.net>
	<20120116215052.GO3627@phenom.ffwll.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org>
Received: from cloud01.chad-versace.us (184-106-247-128.static.cloud-ips.com
	[184.106.247.128])
	by gabe.freedesktop.org (Postfix) with ESMTP id DD4009ED55
	for <intel-gfx@lists.freedesktop.org>;
	Mon, 16 Jan 2012 14:21:03 -0800 (PST)
In-Reply-To: <20120116215052.GO3627@phenom.ffwll.local>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
	<mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On 01/16/2012 01:50 PM, Daniel Vetter wrote:
> On Tue, Dec 13, 2011 at 10:36:15AM -0800, Ben Widawsky wrote:
>> On 12/13/2011 09:22 AM, Eric Anholt wrote:
>>> On Mon, 12 Dec 2011 19:52:08 -0800, Ben Widawsky <ben@bwidawsk.net> wrote:
>>>> Since we don't differentiate on the different GPU read domains, it
>>>> should be safe to allow back to back reads to occur without issuing a
>>>> wait (or flush in the non-semaphore case).
>>>>
>>>> This has the unfortunate side effect that we need to keep track of all
>>>> the outstanding buffer reads so that we can synchronize on a write to
>>>> another ring (since we don't know which read finishes first). In other
>>>> words, the code is quite simple for two rings, but gets more tricky for
>>>> more than 2 rings.
>>>>
>>>> Here is a picture of the solution to the above problem
>>>>
>>>> Ring 0            Ring 1             Ring 2
>>>> batch 0           batch 1            batch 2
>>>>   read buffer A     read buffer A      wait batch 0
>>>>                                        wait batch 1
>>>>                                        write buffer A
>>>>
>>>> This code is really untested. I'm hoping for some feedback if this is
>>>> worth cleaning up, and testing more thoroughly.
>>>
>>> You say it's an optimization -- do you have performance numbers?
>>
>> 33% improvement on a hacked version of gem_ring_sync_loop with.
>>
>> It's not really a valid test as it's not coherent, but this is
>> approximately the best case improvement.
>>
>> Oddly, semaphores don't make much difference in this test, which
>> was surprising.
> 
> Our domain tracking is already complicated in unfunny ways. And (at least
> without a use-case showing gains with hard numbers in either perf or power
> usage) I think this patch is the kind of "this looks cool" stuff that
> added a lot to the current problem.
> 
> So before adding more complexity on top I'd like to remove some of the
> superfluous stuff we already have. I.e. all the flushing_list stuff and
> maybe other things ...

Can you be more clear on what exactly you want done before taking a
patch like this? Maybe I can work on it during some down time.
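
For reference, the bookkeeping described in the original patch mail
(remember the outstanding read on every ring, then make a write on
another ring wait for all of them) can be sketched roughly as below.
This is a standalone illustration, not the i915 code: struct buf_state,
struct wait_cmd, buf_mark_read() and buf_prepare_write() are
hypothetical names, NUM_RINGS = 3 just matches the picture above, and
write tracking is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_RINGS 3	/* assumption: three rings, as in the picture */

/* Hypothetical per-buffer state: seqno of the last outstanding read
 * on each ring, 0 meaning "no outstanding read on that ring". */
struct buf_state {
	uint32_t last_read_seqno[NUM_RINGS];
};

/* One wait the writing ring has to perform before it may write. */
struct wait_cmd {
	int ring;	/* ring to wait on */
	uint32_t seqno;	/* seqno that must have passed on that ring */
};

/* Record a read. Back-to-back reads need no wait, so this only
 * updates the tracking. */
static void buf_mark_read(struct buf_state *buf, int ring, uint32_t seqno)
{
	assert(ring >= 0 && ring < NUM_RINGS);
	buf->last_read_seqno[ring] = seqno;
}

/* Collect the waits a write on write_ring needs: one for every other
 * ring that still has an outstanding read. waits[] must have room for
 * NUM_RINGS entries. Returns the number of waits and clears the read
 * tracking, since the write now orders after all of those reads. */
static int buf_prepare_write(struct buf_state *buf, int write_ring,
			     struct wait_cmd *waits)
{
	int n = 0;

	assert(write_ring >= 0 && write_ring < NUM_RINGS);
	for (int ring = 0; ring < NUM_RINGS; ring++) {
		if (ring == write_ring)
			continue;	/* same ring executes in order */
		if (buf->last_read_seqno[ring] == 0)
			continue;	/* nothing outstanding here */
		waits[n].ring = ring;
		waits[n].seqno = buf->last_read_seqno[ring];
		n++;
	}
	memset(buf->last_read_seqno, 0, sizeof(buf->last_read_seqno));
	return n;
}
```

With reads of buffer A recorded on ring 0 and ring 1, a write from
ring 2 gets back two waits, which is the "wait batch 0, wait batch 1"
case from the picture.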

> 
> Cheers, Daniel

~Ben