From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Siluvery, Arun"
Subject: Re: [RFC] drm/i915: Add variable gem object size support to i915
Date: Fri, 23 May 2014 15:54:54 +0100
Message-ID: <537F613E.9070709@linux.intel.com>
References: <1398697289-5607-1-git-send-email-arun.siluvery@linux.intel.com> <87iopbexrr.fsf@eliezer.anholt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTP id BAF576EE2C for ; Fri, 23 May 2014 07:54:57 -0700 (PDT)
In-Reply-To: <87iopbexrr.fsf@eliezer.anholt.net>
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
To: Eric Anholt , intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On 12/05/2014 18:02, Eric Anholt wrote:
> arun.siluvery@linux.intel.com writes:
>
>> From: "Siluvery, Arun"
>>
>> This patch adds support for GEM objects of variable size.
>> A GEM object's size (obj->size) is always constant, and that
>> assumption is tightly coupled throughout the driver; this
>> implementation allows the effective size to be varied through an
>> interface similar to fallocate().
>>
>> A new ioctl() is introduced to mark a range as scratch/usable.
>> Once a range is marked as scratch, its backing store is released
>> and the region is filled with scratch pages. The region can also
>> be unmarked at a later point, in which case new backing pages are
>> allocated. A range can lie anywhere within the object, and an
>> object can have multiple, possibly overlapping ranges that
>> together form one large contiguous range.
>>
>> There is only a single scratch page, and the kernel allows writes
>> to it; userspace needs to keep track of scratch page ranges,
>> since any subsequent write to those pages will overwrite previous
>> content.
>>
>> This feature is useful when the exact size of an object is not
>> known at creation time; in that case we usually create an object
>> larger than required and end up using it only partially. On
>> devices with tight memory constraints it is useful to release
>> that additional, unused space. With this interface the region can
>> simply be marked as scratch, which releases its backing store and
>> reduces the memory pressure on the kernel.
>>
>> Many thanks to Daniel, ChrisW, Tvrtko and Bob for the idea and
>> feedback on this implementation.
>>
>> v2: fix holes in error handling and use consistent data types (Tvrtko)
>> - If page allocation fails, simply return an error; do not try to
>>   invoke the shrinker to free backing store.
>> - Release any new pages we created if an error occurs during page
>>   allocation or the sg_table update.
>> - Use 64-bit data types for start and length values to avoid
>>   truncation.
>
> The idea sounds nice to have for Mesa. We've got this ugly code
> right now for guessing how many levels a miptree is going to be,
> and then we do copies if we find out we were wrong about how many
> the app was going to use. This will let us allocate for a
> maximum-depth miptree and mark the smaller levels as unused until
> an image gets put there.
>
> The problem I see with this plan is if the page-table twiddling
> ends up being too expensive in our BO reallocation path (right
> now, if we make the same guess on every allocation, we'll reuse
> cached BOs with the same size and no mapping cost).
>
> It would be nice to see some performance data from real
> applications, if possible. But then, I don't think I've seen any
> real applications hit the copy path.
>
The way I am planning to test this is to measure the time it takes
to falloc a large object. Could you suggest a better way to test
the performance of this change?

regards
Arun