public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [RFC] drm/i915: store all mmio bases in intel_engines
Date: Mon, 12 Mar 2018 10:48:42 -0700	[thread overview]
Message-ID: <336af63a-e0e3-d37b-5534-d9590cc9f75c@intel.com> (raw)
In-Reply-To: <650027ea-07aa-ecd6-1212-5c4f5ba01b33@linux.intel.com>



On 12/03/18 04:04, Tvrtko Ursulin wrote:
> 
> On 09/03/2018 19:44, Tvrtko Ursulin wrote:
>>
>> On 09/03/2018 18:47, Daniele Ceraolo Spurio wrote:
>>> On 09/03/18 01:53, Tvrtko Ursulin wrote:
>>>>
>>>> On 08/03/2018 18:46, Daniele Ceraolo Spurio wrote:
>>>>> On 08/03/18 01:31, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 07/03/2018 19:45, Daniele Ceraolo Spurio wrote:
>>>>>>> The mmio bases we're currently storing in the intel_engines array 
>>>>>>> are
>>>>>>> only valid for a subset of gens, so we need to ignore them and use
>>>>>>> different values in some cases. Instead of doing that, we can have a
>>>>>>> table of [starting gen, mmio base] pairs for each engine in
>>>>>>> intel_engines and select the correct one based on the gen we're 
>>>>>>> running
>>>>>>> on in a consistent way.
>>>>>>>
>>>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>>> Signed-off-by: Daniele Ceraolo Spurio 
>>>>>>> <daniele.ceraolospurio@intel.com>
>>>>>>> ---
>>>>>>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 77 
>>>>>>> +++++++++++++++++++++------------
>>>>>>>   drivers/gpu/drm/i915/intel_ringbuffer.c |  1 -
>>>>>>>   2 files changed, 49 insertions(+), 29 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
>>>>>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>>>>>> index 4ba139c27fba..1dd92cac8d67 100644
>>>>>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>>>>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>>>>>> @@ -81,12 +81,16 @@ static const struct engine_class_info 
>>>>>>> intel_engine_classes[] = {
>>>>>>>       },
>>>>>>>   };
>>>>>>> +#define MAX_MMIO_BASES 3
>>>>>>>   struct engine_info {
>>>>>>>       unsigned int hw_id;
>>>>>>>       unsigned int uabi_id;
>>>>>>>       u8 class;
>>>>>>>       u8 instance;
>>>>>>> -    u32 mmio_base;
>>>>>>> +    struct engine_mmio_base {
>>>>>>> +        u32 gen : 8;
>>>>>>> +        u32 base : 24;
>>>>>>> +    } mmio_bases[MAX_MMIO_BASES];
>>>>>>>       unsigned irq_shift;
>>>>>>>   };
>>>>>>
>>>>>> It's not bad, but I am just wondering if it is too complicated 
>>>>>> versus simply duplicating the tables.
>>>>>>
>>>>>> Duplicated tables would certainly be much less code and 
>>>>>> complexity, but what about the size of pure data?
>>>>>>
>>>>>> In this patch sizeof(struct engine_info) grows by MAX_MMIO_BASES * 
>>>>>> sizeof(u32), so 12, to the total of 30 if I am not mistaken. Times 
>>>>>> NUM_ENGINES equals 240 bytes for intel_engines[] array with this 
>>>>>> scheme.
>>>>>>
>>>>>
>>>>> we remove a u32 (the old mmio base) so we only grow 8 per engine, 
>>>>> but the compiler rounds up to a multiple of u32 so 28 per engine, 
>>>>> for a total of 224.
>>>>>
>>>>>> Separate tables per gens would be:
>>>>>>
>>>>>> sizeof(struct engine_info) is 18 bytes.
>>>>>>
>>>>>> For < gen6 we'd need 3 engines * 18 = 54
>>>>>> < gen11 = 5 engines * 18 = 90
>>>>>>  >= gen11 = 8 engines * 18 = 144
>>>>>>
>>>>>> 54 + 90 + 144 = 288 bytes
>>>>>>
>>>>>> So a little bit bigger. Hm. Plus we would need to refactor so 
>>>>>> intel_engines[] is not indexed by engine->id but is contiguous 
>>>>>> array with engine->id in the elements. Code to lookup the compact 
>>>>>> array should be simpler than combined new code from this patch, 
>>>>>> especially if you add the test as Chris suggested. So separate 
>>>>>> engine info arrays would win I think overall - when considering 
>>>>>> size of text + size of data + size of source code.
>>>>>>
>>>>>
>>>>> Not sure. I would exclude the selftest, which is usually not 
>>>>> compiled for released kernels, which leads to:
>>>>
>>>> Yes, but we cannot exclude its source since selftests for separate 
>>>> tables wouldn't be needed.
>>>>
>>>>> add/remove: 0/0 grow/shrink: 3/1 up/down: 100/-7 (93)
>>>>> Function                                     old     new   delta
>>>>> intel_engines                                160     224     +64
>>>>> __func__                                   13891   13910     +19
>>>>> intel_engines_init_mmio                     1247    1264     +17
>>>>> intel_init_bsd_ring_buffer                   142     135      -7
>>>>> Total: Before=1479626, After=1479719, chg +0.01%
>>>>>
>>>>> Total growth is 93, which is less then your estimation for the 
>>>>> growth introduced by replicating the table.
>>>>>
>>>>>> But on the other hand your solution might be more future proof. So 
>>>>>> I don't know. Use the crystal ball to decide? :)
>>>>>>
>>>>>
>>>>> I think we should expect that the total engine count could grow 
>>>>> with future gens. In this case to me adding a new entry to a 
>>>>> unified table seems much cleaner (and uses less data) than adding a 
>>>>> completely new table each time.
>>>>
>>>> Okay I was off in my estimates. But in general I was coming from the 
>>>> angle of thinking that every new mmio base difference in this scheme 
>>>> grows the size for all engines. So if just VCS grows one new base, 
>>>> impact is multiplied by NUM_ENGINES. But maybe separate tables are 
>>>> also not a solution. Perhaps pulling out mmio_base arrays to outside 
>>>> struct engine_info in another set of static tables, so they could be 
>>>> null terminated would be best of both worlds?
>>>>
>>>> struct engine_mmio_base {
>>>>      u32 gen : 8;
>>>>      u32 base : 24;
>>>> };
>>>>
>>>> static const struct engine_mmio_base vcs0_mmio_bases[] = {
>>>>      { .gen = 11, .base = GEN11_BSD_RING_BASE },
>>>>      { .gen = 6, .base = GEN6_BSD_RING_BASE },
>>>>      { .gen = 4, .base = BSD_RING_BASE },
>>>>      { },
>>>> };
>>>>
>>>> And then in intel_engines array, for BSD:
>>>>
>>>>     {
>>>>      ...
>>>>      .mmio_bases = &vcs0_mmio_bases;
>>>>      ...
>>>>     },
>>>>
>>>> This way we limit data growth only to engines which change their 
>>>> mmio bases frequently.
>>>>
>>>> Just an idea.
>>>>
>>>
>>> I gave this a try and the code actually grows:
>>>
>>> add/remove: 8/0 grow/shrink: 2/0 up/down: 120/0 (120)
>>> Function                                     old     new   delta
>>> intel_engines                                224     256     +32
>>> vcs0_mmio_bases                                -      16     +16
>>> vecs1_mmio_bases                               -      12     +12
>>> vecs0_mmio_bases                               -      12     +12
>>> vcs1_mmio_bases                                -      12     +12
>>> vcs3_mmio_bases                                -       8      +8
>>> vcs2_mmio_bases                                -       8      +8
>>> rcs_mmio_bases                                 -       8      +8
>>> intel_engines_init_mmio                     1264    1272      +8
>>> bcs_mmio_bases                                 -       4      +4
>>> Total: Before=1479719, After=1479839, chg +0.01%
>>>
>>> I have no idea what the compiler is doing to grow intel_engines, 
>>> since by replacing and array of 3 u32s with a pointer I would expect 
>>> a shrink of 4 bytes per-engine. Anyway, even without the compiler 
>>> weirdness with
>>
>> It probably has to align the pointer to 8 bytes so that creates a 
>> hole. Moving mmio_base pointer higher up, either to the top or after 
>> two unsigned ints should work. Or packing the struct.
>>
>>> this method we would grow the code size of at least 4 bytes per 
>>> engine because we replace an array of 3 u32 (12B) with a pointer (8B) 
>>> and an array of at 2 or more u32 (8B+). I guess we can reconsider 
>>> if/when one engine reaches more than 3 mmio bases.
>>
>> Yeah it's fine. I was thinking that since you are almost there it 
>> makes sense to future proof it more, since no one will probably 
>> remember it later. But OK.
> 
> One idea to compact more, in addition to avoiding alignment holes, could 
> be to store the engine_mmio_base directly in the engine_mmio_base 
> pointer when it is a single entry, othewrise use a pointer to null 
> terminated array. Actually we could store two without table indirection 
> to be most optimal on 64-bit. Something like:
> 
> struct engine_mmio_base {
>      u32 base : 24; /* Must be LSB, */
>      u32 gen : 8;
> };
> 
> union engine_mmio {
>      struct engine_mmio_base mmio_base[2];
>      struct engine_mmio_base *mmio_base_list;
> };
> 
> #define MMIO_PTR_LIST BIT(0)
> 
>     {
>      ... render engine ...
>      .mmio = { (struct engine_mmio_base){.base = ..., ..gen = ...} },
>      ...
>     },
>     {
>      ...
>      .mmio = { .mmio_base_list = &vcs0_mmio_bases | MMIO_PTR_LIST }

This is giving a compiler error (even with the appropriate casting) for 
value not constant. I've solved by doing:

union engine_mmio {
	const struct engine_mmio_base bases[MAX_BUILTIN_MMIO_BASES];
	const struct engine_mmio_base *bases_list;
	union {
		const u64 is_list : 1;
		const u64 	  : 63;
	};
};

Code size is almost unchanged but still looks like a good compromise. 
Will update the selftest and send patches soon.

Daniele

>      ...
>     },
> 
> This could be the best of both worlds, with up to two bases built-in, 
> and engines with more point to an array.
> 
> 
> Regards,
> 
> Tvrtko
> 
> 
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2018-03-12 17:48 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-07 19:45 [RFC] drm/i915: store all mmio bases in intel_engines Daniele Ceraolo Spurio
2018-03-07 19:56 ` Chris Wilson
2018-03-07 20:36 ` ✓ Fi.CI.BAT: success for " Patchwork
2018-03-07 21:38 ` ✓ Fi.CI.IGT: " Patchwork
2018-03-08  9:31 ` [RFC] " Tvrtko Ursulin
2018-03-08 18:46   ` Daniele Ceraolo Spurio
2018-03-09  9:53     ` Tvrtko Ursulin
2018-03-09 18:47       ` Daniele Ceraolo Spurio
2018-03-09 19:44         ` Tvrtko Ursulin
2018-03-12 11:04           ` Tvrtko Ursulin
2018-03-12 17:48             ` Daniele Ceraolo Spurio [this message]
2018-03-12 21:05               ` Daniele Ceraolo Spurio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=336af63a-e0e3-d37b-5534-d9590cc9f75c@intel.com \
    --to=daniele.ceraolospurio@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=tvrtko.ursulin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox