From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id E71E910E55F for ; Thu, 6 Jan 2022 02:23:22 +0000 (UTC) Date: Wed, 05 Jan 2022 18:23:21 -0800 Message-ID: <87tueha9gm.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" In-Reply-To: <20220105064945.9640-2-zbigniew.kempczynski@intel.com> References: <20220105064945.9640-1-zbigniew.kempczynski@intel.com> <20220105064945.9640-2-zbigniew.kempczynski@intel.com> MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: quoted-printable Subject: Re: [igt-dev] [PATCH i-g-t 1/3] lib/intel_memory_region: Add start offset and alignment detection List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: igt-dev-bounces@lists.freedesktop.org Sender: "igt-dev" To: Zbigniew =?ISO-8859-2?Q?Kempczy=F1ski?= Cc: igt-dev@lists.freedesktop.org, Petri Latvala List-ID: On Tue, 04 Jan 2022 22:49:43 -0800, Zbigniew Kempczy=F1ski wrote: > > With era of new gens we're enforced to use no-reloc (softpin). This > brings few problems like vm range limitations which were well solved > by the kernel. This can be handled also in userspace code by adding > gen related conditionals or by trying to detect the constraints. > > Lets try to do this dynamically and detect safe start offset and > alignment for each memory region we got. This should be universal solution > regardless hw limitations and bugs. As such detection is not lightweight > technique add also some caching structures to handle consequtive calls > about same data. I have a few more comments and suggestions below. However, since these are non-blocking this patch is: Reviewed-by: Ashutosh Dixit > +static IGT_LIST_HEAD(cache); > +static pthread_mutex_t cache_mutex =3D PTHREAD_MUTEX_INITIALIZER; The caching will help for multiple calls to the code within a single process, but in CI each process will still need to build up the cache. > +uint64_t gem_detect_min_start_offset_for_region(int i915, uint32_t regio= n) > +{ > + struct drm_i915_gem_exec_object2 obj; > + struct drm_i915_gem_execbuffer2 eb; > + uint64_t start_offset =3D 0; > + uint64_t bb_size =3D PAGE_SIZE; > + uint32_t *batch; > + uint16_t devid =3D intel_get_drm_devid(i915); > + struct cache_entry *entry, *newentry; > + > + pthread_mutex_lock(&cache_mutex); > + entry =3D find_entry_unlocked(MIN_START_OFFSET, devid, region, 0); > + if (entry) > + goto out; > + pthread_mutex_unlock(&cache_mutex); I think it would be better to add the locking to find_entry_unlocked(). And also add a list_add_entry() kind of function with the locking. This will associate the mutex directly with the list and get it out of the callers. The check if the entry has been previously added would also need to move to list_add_entry(). Anyway if this is complicated leave as is. > + newentry =3D malloc(sizeof(*newentry)); > + if (!newentry) > + return start_offset; I'd suggest just do 'igt_assert(newentry)' in all these functions to keep things simple. > +/** > + * gem_detect_min_alignment_for_regions: > + * @i915: drm fd > + * @region1: first region > + * @region2: second region > + * > + * Returns: minimum alignment which must be used when objects from @regi= on1 and > + * @region2 are going to interact. Here actually it is not obvious why the code below in this function is needed. Page sizes for different memory regions can be different. From discussions with people here, it seems what happens is that different page sizes cannot be included as part of the same page table structure level (in the multi-level page table structure hierarchy). That is why offsets for memory regions with different page sizes cannot be adjacent. Can we add some explanation of this sort here as a hint as to why the code below is needed? There is still the question in my mind whether the situation is "dynamic" when a memory region has multiple page sizes, say 64 K and 2 MiB. In this case when we run the detection we get one safe alignemnt value which is cached, but "later" because of memory use, this cached value proves insufficient (so when we detect say we get 64 K but actually later the 2 MiB value is needed). This could happen because of different code paths taken in the kernel. Any case, I think what we have is good enough for now and worth trying. Let's see if we hit this issue later. > + /* Establish bb offset first */ > + eb.buffers_ptr =3D to_user_pointer(obj); > + eb.buffer_count =3D 1; Looks like this line is not needed, eb.buffer_count is set again below. > + eb.flags =3D I915_EXEC_BATCH_FIRST | I915_EXEC_DEFAULT; > + igt_assert(__gem_create_in_memory_regions(i915, &obj[0].handle, > + &bb_size, region1) =3D=3D 0); > + obj[0].flags =3D EXEC_OBJECT_PINNED; > + > + batch =3D gem_mmap__device_coherent(i915, obj[0].handle, 0, bb_size, > + PROT_WRITE); > + *batch =3D MI_BATCH_BUFFER_END; > + munmap(batch, bb_size); > + > + obj[0].offset =3D gem_detect_min_start_offset_for_region(i915, region1); > + > + /* Find appropriate alignment of object */ > + eb.buffer_count =3D ARRAY_SIZE(obj);