From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Tvrtko Ursulin <tursulin@ursulin.net>,
	Intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 06/13] drm/i915: Use binary search when looking up forcewake domains
Date: Fri, 30 Sep 2016 12:08:26 +0100
Message-ID: <ea41b87d-bd78-6637-540b-dab65ec6101a@linux.intel.com>
In-Reply-To: <20160929161605.GB9653@nuc-i3427.alporthouse.com>


On 29/09/2016 17:16, Chris Wilson wrote:
> On Thu, Sep 29, 2016 at 04:35:49PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Instead of the existing linear search, now that we have sorted
>> range tables, we can do a binary search on them for some
>> potential minuscule performance gain, but more importantly
>> for elegance and code size. Hopefully the performance gain is
>> sufficient to offset the function calls which were not there
>> before.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_uncore.c | 28 ++++++++++++++++++++--------
>>   1 file changed, 20 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index bee1482a5ece..ae5edaea16f7 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -26,6 +26,7 @@
>>   #include "i915_vgpu.h"
>>   
>>   #include <linux/pm_runtime.h>
>> +#include <linux/bsearch.h>
>>   
>>   #define FORCEWAKE_ACK_TIMEOUT_MS 50
>>   
>> @@ -589,20 +590,31 @@ struct intel_forcewake_range
>>   	enum forcewake_domains domains;
>>   };
>>   
>> +static int fw_range_cmp(const void *key, const void *elt)
>> +{
>> +	struct intel_forcewake_range *entry =
>> +		(struct intel_forcewake_range *)elt;
>> +	u32 offset = (u32)((unsigned long)key);
>> +
>> +	if (offset < entry->start)
>> +		return -1;
>> +	else if (offset > entry->end)
>> +		return 1;
>> +	else
>> +		return 0;
>> +}
>> +
>>   static enum forcewake_domains
>>   find_fw_domain(u32 offset, const struct intel_forcewake_range *ranges,
>>   	       unsigned int num_ranges)
>>   {
>> -	unsigned int i;
>> -	struct intel_forcewake_range *entry =
>> -		(struct intel_forcewake_range *)ranges;
>> +	struct intel_forcewake_range *entry;
>>   
>> -	for (i = 0; i < num_ranges; i++, entry++) {
>> -		if (offset >= entry->start && offset <= entry->end)
>> -			return entry->domains;
>> -	}
>> +	entry = bsearch((void *)(unsigned long)offset, (const void *)ranges,
>> +			num_ranges, sizeof(struct intel_forcewake_range),
>> +			fw_range_cmp);
> How much for bsearch() to be turned into a generator macro?

With default inlining, the generator macro is a small code size win
(128 bytes). It makes find_fw_domain a function with an inlined
comparator (so one fewer function call per search iteration than with
the library bsearch) and inlines is_gen8_shadowed completely.

Forcing find_fw_domain to be fully inline adds approximately 1k.

I am not sure. Do you think it is worth doing some of the above?
Function calls are supposed to be cheap, so perhaps just the variant
with default inlining, but then it means either pushing the core patch
or keeping a local copy of the macro.
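
For illustration, a minimal userspace sketch of what a bsearch generator
macro with an inlined comparator could look like. This is an assumption
about the general shape, not the macro from any actual kernel patch:
BSEARCH_GENERATE, struct fw_range, the table contents and the selftest
are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's forcewake range entry. */
struct fw_range {
	uint32_t start;
	uint32_t end;		/* inclusive */
	unsigned int domains;
};

/*
 * Hypothetical generator macro: expands to a static search function
 * specialised for one key type, element type and comparator, so the
 * comparator can be inlined into the loop instead of being called
 * through a function pointer.
 */
#define BSEARCH_GENERATE(name, key_t, elt_t, cmp) \
static const elt_t *name(key_t key, const elt_t *base, size_t num) \
{ \
	size_t lo = 0, hi = num; \
 \
	while (lo < hi) { \
		size_t mid = lo + (hi - lo) / 2; \
		int ret = cmp(key, &base[mid]); \
 \
		if (ret < 0) \
			hi = mid; \
		else if (ret > 0) \
			lo = mid + 1; \
		else \
			return &base[mid]; \
	} \
	return NULL; \
}

/* Typed comparator: the key is a plain u32, no casts through void *. */
static inline int fw_range_cmp(uint32_t offset, const struct fw_range *entry)
{
	if (offset < entry->start)
		return -1;
	if (offset > entry->end)
		return 1;
	return 0;
}

BSEARCH_GENERATE(fw_range_search, uint32_t, struct fw_range, fw_range_cmp)

/* Made-up table; must be sorted by start and non-overlapping. */
static const struct fw_range example_ranges[] = {
	{ 0x2000, 0x2fff, 1 },
	{ 0x5200, 0x7fff, 1 },
	{ 0xb000, 0xb47f, 2 },
};

static int find_fw_domain(uint32_t offset)
{
	const struct fw_range *entry;

	entry = fw_range_search(offset, example_ranges,
				sizeof(example_ranges) / sizeof(example_ranges[0]));
	return entry ? (int)entry->domains : -1;
}

/* Usage: hits at both range boundaries, and a miss in a gap. */
static void fw_search_selftest(void)
{
	assert(find_fw_domain(0x2000) == 1);
	assert(find_fw_domain(0xb47f) == 2);
	assert(find_fw_domain(0x3000) == -1);
}
```

Whether the generated function itself ends up inlined into its callers
is then left to the compiler, which is the "default inlining" case
above.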

>> -	return -1;
>> +	return entry ? entry->domains : -1;
>>   }
> Looks ok, maybe pass in the default value to return if !entry, saves the
> double check.

It goes away later in the series so it is fine. I just wanted to be 
gradual so it is easy to review.
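
For reference, one possible reading of the suggestion, as a userspace
sketch using the C library bsearch(). The names struct fw_entry,
find_fw_domain_or and default_domains are hypothetical, and unlike the
patch above the key is passed by address rather than cast into the
pointer value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's forcewake range entry. */
struct fw_entry {
	uint32_t start;
	uint32_t end;		/* inclusive */
	unsigned int domains;
};

/* Comparator for bsearch(): key points at the u32 offset. */
static int fw_entry_cmp(const void *key, const void *elt)
{
	const uint32_t offset = *(const uint32_t *)key;
	const struct fw_entry *entry = elt;

	if (offset < entry->start)
		return -1;
	if (offset > entry->end)
		return 1;
	return 0;
}

/*
 * The caller supplies the value to use when no range matches, so it
 * does not need a second check for a "not found" sentinel.
 */
static unsigned int
find_fw_domain_or(uint32_t offset, const struct fw_entry *ranges,
		  size_t num_ranges, unsigned int default_domains)
{
	const struct fw_entry *entry;

	entry = bsearch(&offset, ranges, num_ranges, sizeof(*ranges),
			fw_entry_cmp);
	return entry ? entry->domains : default_domains;
}

/* Made-up table; must be sorted by start and non-overlapping. */
static const struct fw_entry demo_ranges[] = {
	{ 0x2000, 0x2fff, 1 },
	{ 0x5200, 0x7fff, 1 },
	{ 0xb000, 0xb47f, 2 },
};

/* Usage: a hit returns the table value, a miss returns the default. */
static void fw_default_selftest(void)
{
	assert(find_fw_domain_or(0x2abc, demo_ranges, 3, 0) == 1);
	assert(find_fw_domain_or(0x3000, demo_ranges, 3, 0) == 0);
}
```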

Regards,

Tvrtko