From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id A35716E06D for ; Wed, 4 Aug 2021 23:26:34 +0000 (UTC) Date: Wed, 04 Aug 2021 16:26:32 -0700 Message-ID: <87im0ku6zb.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" In-Reply-To: <20210726200026.4815-7-zbigniew.kempczynski@intel.com> References: <20210726200026.4815-1-zbigniew.kempczynski@intel.com> <20210726200026.4815-7-zbigniew.kempczynski@intel.com> MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: quoted-printable Subject: Re: [igt-dev] [PATCH i-g-t v3 06/52] lib/intel_batchbuffer: Add allocator support in blitter src copy List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: igt-dev-bounces@lists.freedesktop.org Sender: "igt-dev" To: Zbigniew =?ISO-8859-2?Q?Kempczy=F1ski?= Cc: igt-dev@lists.freedesktop.org List-ID: On Mon, 26 Jul 2021 12:59:40 -0700, Zbigniew Kempczy=F1ski wrote: > > @@ -808,9 +816,21 @@ void igt_blitter_src_copy(int fd, > uint32_t src_pitch, dst_pitch; > uint32_t dst_reloc_offset, src_reloc_offset; > uint32_t gen =3D intel_gen(intel_get_drm_devid(fd)); > + uint64_t batch_offset, src_offset, dst_offset; > const bool has_64b_reloc =3D gen >=3D 8; > int i =3D 0; > > + batch_handle =3D gem_create(fd, 4096); > + if (ahnd) { > + src_offset =3D get_offset(ahnd, src_handle, src_size, 0); > + dst_offset =3D get_offset(ahnd, dst_handle, dst_size, 0); > + batch_offset =3D get_offset(ahnd, batch_handle, 4096, 0); > + } else { > + src_offset =3D 16 << 20; > + dst_offset =3D ALIGN(src_offset + src_size, 1 << 20); > + batch_offset =3D ALIGN(dst_offset + dst_size, 1 << 20); For the !ahnd case, we are providing relocations right? We still need to provide these offsets or they can all be 0? > @@ -882,22 +902,29 @@ void igt_blitter_src_copy(int fd, > > igt_assert(i <=3D ARRAY_SIZE(batch)); > > - batch_handle =3D gem_create(fd, 4096); > gem_write(fd, batch_handle, 0, batch, sizeof(batch)); > > - fill_relocation(&relocs[0], dst_handle, -1, dst_delta, dst_reloc_offset, > + fill_relocation(&relocs[0], dst_handle, dst_offset, > + dst_delta, dst_reloc_offset, > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER); > - fill_relocation(&relocs[1], src_handle, -1, src_delta, src_reloc_offset, > + fill_relocation(&relocs[1], src_handle, src_offset, > + src_delta, src_reloc_offset, > I915_GEM_DOMAIN_RENDER, 0); > > - fill_object(&objs[0], dst_handle, 0, NULL, 0); > - fill_object(&objs[1], src_handle, 0, NULL, 0); > - fill_object(&objs[2], batch_handle, 0, relocs, 2); > + fill_object(&objs[0], dst_handle, dst_offset, NULL, 0); > + fill_object(&objs[1], src_handle, src_offset, NULL, 0); > + fill_object(&objs[2], batch_handle, batch_offset, relocs, 2); > > - objs[0].flags |=3D EXEC_OBJECT_NEEDS_FENCE; > + objs[0].flags |=3D EXEC_OBJECT_NEEDS_FENCE | EXEC_OBJECT_WRITE; > objs[1].flags |=3D EXEC_OBJECT_NEEDS_FENCE; > > - exec_blit(fd, objs, 3, gen, 0); > + if (ahnd) { > + objs[0].flags |=3D EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRE= SS; > + objs[1].flags |=3D EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRE= SS; > + objs[2].flags |=3D EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRE= SS; > + } Should be add an "else" here and pull the fill_relocation() and set the relocation_count to 2 only if we have !ahnd? Maybe ok to leave as is too if the kernel will ignore the reloc stuff when EXEC_OBJECT_PINNED is set. > @@ -584,10 +601,17 @@ static void work(int i915, int dmabuf, const intel_= ctx_t *ctx, unsigned ring) > obj[SCRATCH].handle =3D prime_fd_to_handle(i915, dmabuf); > > obj[BATCH].handle =3D gem_create(i915, size); > + obj[BATCH].offset =3D get_offset(ahnd, obj[BATCH].handle, size, 0); > obj[BATCH].relocs_ptr =3D (uintptr_t)store; > - obj[BATCH].relocation_count =3D ARRAY_SIZE(store); > + obj[BATCH].relocation_count =3D !ahnd ? ARRAY_SIZE(store) : 0; > memset(store, 0, sizeof(store)); > > + if (ahnd) { > + obj[SCRATCH].flags =3D EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE; > + obj[SCRATCH].offset =3D scratch_offset; > + obj[BATCH].flags =3D EXEC_OBJECT_PINNED; > + } Why don't we compute scratch_offset in work() itself (and rather pass it in from the callers)? > @@ -602,8 +626,8 @@ static void work(int i915, int dmabuf, const intel_ct= x_t *ctx, unsigned ring) > store[count].write_domain =3D I915_GEM_DOMAIN_INSTRUCTION; > batch[i] =3D MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0); > if (gen >=3D 8) { > - batch[++i] =3D 0; > - batch[++i] =3D 0; > + batch[++i] =3D scratch_offset + store[count].delta; > + batch[++i] =3D (scratch_offset + store[count].delta) >> 32; > } else if (gen >=3D 4) { > batch[++i] =3D 0; > batch[++i] =3D 0; Should we add the offset's even for previous gen's (gen < 8)? Because I am thinking at present kernel is supporting reloc's for gen < 12 but maybe later kernels will discontinue them completely so we'll need to fix the previous gen's all over again? Maybe too much?