From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <igt-dev-bounces@lists.freedesktop.org>
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43])
 by gabe.freedesktop.org (Postfix) with ESMTPS id E71E910E55F
 for <igt-dev@lists.freedesktop.org>; Thu,  6 Jan 2022 02:23:22 +0000 (UTC)
Date: Wed, 05 Jan 2022 18:23:21 -0800
Message-ID: <87tueha9gm.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
In-Reply-To: <20220105064945.9640-2-zbigniew.kempczynski@intel.com>
References: <20220105064945.9640-1-zbigniew.kempczynski@intel.com>	<20220105064945.9640-2-zbigniew.kempczynski@intel.com>
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable
Subject: Re: [igt-dev] [PATCH i-g-t 1/3] lib/intel_memory_region: Add start
 offset and alignment detection
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/igt-dev>,
 <mailto:igt-dev-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/igt-dev>
List-Post: <mailto:igt-dev@lists.freedesktop.org>
List-Help: <mailto:igt-dev-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/igt-dev>,
 <mailto:igt-dev-request@lists.freedesktop.org?subject=subscribe>
Errors-To: igt-dev-bounces@lists.freedesktop.org
Sender: "igt-dev" <igt-dev-bounces@lists.freedesktop.org>
To: Zbigniew =?ISO-8859-2?Q?Kempczy=F1ski?= <zbigniew.kempczynski@intel.com>
Cc: igt-dev@lists.freedesktop.org, Petri Latvala <petri.latvala@intel.com>
List-ID: <igt-dev@lists.freedesktop.org>

On Tue, 04 Jan 2022 22:49:43 -0800, Zbigniew Kempczy=F1ski wrote:
>
> With era of new gens we're enforced to use no-reloc (softpin). This
> brings few problems like vm range limitations which were well solved
> by the kernel. This can be handled also in userspace code by adding
> gen related conditionals or by trying to detect the constraints.
>
> Lets try to do this dynamically and detect safe start offset and
> alignment for each memory region we got. This should be universal solution
> regardless hw limitations and bugs. As such detection is not lightweight
> technique add also some caching structures to handle consequtive calls
> about same data.

I have a few more comments and suggestions below. However, since these are
non-blocking this patch is:

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

> +static IGT_LIST_HEAD(cache);
> +static pthread_mutex_t cache_mutex =3D PTHREAD_MUTEX_INITIALIZER;

The caching will help for multiple calls to the code within a single
process, but in CI each process will still need to build up the cache.

> +uint64_t gem_detect_min_start_offset_for_region(int i915, uint32_t regio=
n)
> +{
> +	struct drm_i915_gem_exec_object2 obj;
> +	struct drm_i915_gem_execbuffer2 eb;
> +	uint64_t start_offset =3D 0;
> +	uint64_t bb_size =3D PAGE_SIZE;
> +	uint32_t *batch;
> +	uint16_t devid =3D intel_get_drm_devid(i915);
> +	struct cache_entry *entry, *newentry;
> +
> +	pthread_mutex_lock(&cache_mutex);
> +	entry =3D find_entry_unlocked(MIN_START_OFFSET, devid, region, 0);
> +	if (entry)
> +		goto out;
> +	pthread_mutex_unlock(&cache_mutex);

I think it would be better to add the locking to find_entry_unlocked(). And
also add a list_add_entry() kind of function with the locking. This will
associate the mutex directly with the list and get it out of the
callers. The check if the entry has been previously added would also need
to move to list_add_entry().

Anyway if this is complicated leave as is.

> +	newentry =3D malloc(sizeof(*newentry));
> +	if (!newentry)
> +		return start_offset;

I'd suggest just do 'igt_assert(newentry)' in all these functions to keep
things simple.

> +/**
> + * gem_detect_min_alignment_for_regions:
> + * @i915: drm fd
> + * @region1: first region
> + * @region2: second region
> + *
> + * Returns: minimum alignment which must be used when objects from @regi=
on1 and
> + * @region2 are going to interact.

Here actually it is not obvious why the code below in this function is
needed. Page sizes for different memory regions can be different. From
discussions with people here, it seems what happens is that different page
sizes cannot be included as part of the same page table structure level (in
the multi-level page table structure hierarchy). That is why offsets for
memory regions with different page sizes cannot be adjacent.

Can we add some explanation of this sort here as a hint as to why the code
below is needed?

There is still the question in my mind whether the situation is "dynamic"
when a memory region has multiple page sizes, say 64 K and 2 MiB. In this
case when we run the detection we get one safe alignemnt value which is
cached, but "later" because of memory use, this cached value proves
insufficient (so when we detect say we get 64 K but actually later the 2
MiB value is needed). This could happen because of different code paths
taken in the kernel.

Any case, I think what we have is good enough for now and worth
trying. Let's see if we hit this issue later.

> +	/* Establish bb offset first */
> +	eb.buffers_ptr =3D to_user_pointer(obj);
> +	eb.buffer_count =3D 1;

Looks like this line is not needed, eb.buffer_count is set again below.

> +	eb.flags =3D I915_EXEC_BATCH_FIRST | I915_EXEC_DEFAULT;
> +	igt_assert(__gem_create_in_memory_regions(i915, &obj[0].handle,
> +						  &bb_size, region1) =3D=3D 0);
> +	obj[0].flags =3D EXEC_OBJECT_PINNED;
> +
> +	batch =3D gem_mmap__device_coherent(i915, obj[0].handle, 0, bb_size,
> +					  PROT_WRITE);
> +	*batch =3D MI_BATCH_BUFFER_END;
> +	munmap(batch, bb_size);
> +
> +	obj[0].offset =3D gem_detect_min_start_offset_for_region(i915, region1);
> +
> +	/* Find appropriate alignment of object */
> +	eb.buffer_count =3D ARRAY_SIZE(obj);