X-Mailing-List: linux-kernel@vger.kernel.org
Date: Sat, 02 Aug 2025 22:58:08 +0200
Subject: Re: [PATCH v5 2/3] implement ww_mutex abstraction for the Rust tree
From: "Benno Lossin"
To: "Daniel Almeida"
Cc: "Onur", "Boqun Feng", "dri-devel"
References: <20250621184454.8354-1-work@onurozkan.dev>
 <20250621184454.8354-3-work@onurozkan.dev>
 <20250707163913.5ffc046d@nimda.home>
 <20250707210613.2fd5bb55@nimda.home>

On Sat Aug 2, 2025 at 4:15 PM CEST, Daniel Almeida wrote:
> On 2 Aug 2025, at 07:42, Benno Lossin wrote:
>> On Fri Aug 1, 2025 at 11:22 PM CEST, Daniel Almeida wrote:
>>> One thing I didn't understand with your approach: is it amenable to
>>> loops? i.e.: are things like drm_exec() implementable?
>>
>> I don't think so, see also my reply here:
>>
>> https://lore.kernel.org/all/DBOPIJHY9NZ7.2CU5XP7UY7ES3@kernel.org
>>
>> The type-based approach with tuples doesn't handle a dynamic number
>> of locks.
>>
>
> This is probably the default use-case by the way.

That's an important detail. In that case, a type state won't be a good
idea. Unless it's also common to have a finite number of locks, in
which case we should have two APIs.

>>> /**
>>>  * drm_exec_until_all_locked - loop until all GEM objects are locked
>>>  * @exec: drm_exec object
>>>  *
>>>  * Core functionality of the drm_exec object. Loops until all GEM
>>>  * objects are locked and no more contention exists. At the beginning
>>>  * of the loop it is guaranteed that no GEM object is locked.
>>>  *
>>>  * Since labels can't be defined local to the loop's body we use a
>>>  * jump pointer to make sure that the retry is only used from within
>>>  * the loop's body.
>>>  */
>>> #define drm_exec_until_all_locked(exec)				\
>>> __PASTE(__drm_exec_, __LINE__):					\
>>> 	for (void *__drm_exec_retry_ptr; ({				\
>>> 		__drm_exec_retry_ptr = &&__PASTE(__drm_exec_, __LINE__);\
>>> 		(void)__drm_exec_retry_ptr;				\
>>> 		drm_exec_cleanup(exec);					\
>>> 	});)
>>
>> My understanding of C preprocessor macros is not good enough to parse
>> or understand this :( What is that `__PASTE` thing?
>
> This macro is very useful, but also cursed :)
>
> This declares a unique label before the loop, so you can jump back to
> it on contention. It is usually used in conjunction with:

Ahh, I missed the `:` at the end of the line. Thanks for explaining!
(also Miguel in the other reply!) If you don't mind I'll ask some more
basic C questions :) And yeah, it's pretty cursed...

> /**
>  * drm_exec_retry_on_contention - restart the loop to grab all locks
>  * @exec: drm_exec object
>  *
>  * Control flow helper to continue when a contention was detected and
>  * we need to clean up and re-start the loop to prepare all GEM
>  * objects.
>  */
> #define drm_exec_retry_on_contention(exec)			\
> 	do {							\
> 		if (unlikely(drm_exec_is_contended(exec)))	\
> 			goto *__drm_exec_retry_ptr;		\
> 	} while (0)

The `do { ... } while (0)` is used because C doesn't have `{ ... }`
scopes? (& because you want to be able to have this be called from an
`if` without braces?)

> The termination is handled by:
>
> /**
>  * drm_exec_cleanup - cleanup when contention is detected
>  * @exec: the drm_exec object to cleanup
>  *
>  * Cleanup the current state and return true if we should stay inside
>  * the retry loop, false if there wasn't any contention detected and
>  * we can keep the objects locked.
>  */
> bool drm_exec_cleanup(struct drm_exec *exec)
> {
> 	if (likely(!exec->contended)) {
> 		ww_acquire_done(&exec->ticket);
> 		return false;
> 	}
>
> 	if (likely(exec->contended == DRM_EXEC_DUMMY)) {
> 		exec->contended = NULL;
> 		ww_acquire_init(&exec->ticket, &reservation_ww_class);
> 		return true;
> 	}
>
> 	drm_exec_unlock_all(exec);
> 	exec->num_objects = 0;
> 	return true;
> }
> EXPORT_SYMBOL(drm_exec_cleanup);
>
> The third clause in the loop is empty.
>
> For example, in amdgpu:
>
> /**
>  * reserve_bo_and_vm - reserve a BO and a VM unconditionally.
>  * @mem: KFD BO structure.
>  * @vm: the VM to reserve.
>  * @ctx: the struct that will be used in unreserve_bo_and_vms().
>  */
> static int reserve_bo_and_vm(struct kgd_mem *mem,
> 			     struct amdgpu_vm *vm,
> 			     struct bo_vm_reservation_context *ctx)
> {
> 	struct amdgpu_bo *bo = mem->bo;
> 	int ret;
>
> 	WARN_ON(!vm);
>
> 	ctx->n_vms = 1;
> 	ctx->sync = &mem->sync;
> 	drm_exec_init(&ctx->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> 	drm_exec_until_all_locked(&ctx->exec) {
> 		ret = amdgpu_vm_lock_pd(vm, &ctx->exec, 2);
> 		drm_exec_retry_on_contention(&ctx->exec);
> 		if (unlikely(ret))
> 			goto error;
>
> 		ret = drm_exec_prepare_obj(&ctx->exec, &bo->tbo.base, 1);
> 		drm_exec_retry_on_contention(&ctx->exec);
> 		if (unlikely(ret))
> 			goto error;
> 	}
> 	// <-- everything is locked at this point.

Which function call locks the mutexes?

> 	return 0;
>
>
> So, something like:
>
> some_unique_label:
> 	for (void *retry_ptr;
> 	     ({ retry_ptr = &&some_unique_label; drm_exec_cleanup(); });

Normally this should be a condition, or rather an expression evaluating
to bool; why is this okay? Or does C just take the value of the last
function call due to the `({})`? Why isn't `({})` used instead of
`do { ... } while (0)` above?

> 	     /* empty */) {
> 		/* user code here, which potentially jumps back to
> 		 * some_unique_label */
> 	}

Thanks for the example & the macro expansion.
What I gather from this is that we'd probably want a closure that
executes the code & reruns it when contention is detected.

>>> In fact, perhaps we can copy drm_exec, basically? i.e.:
>>>
>>> /**
>>>  * struct drm_exec - Execution context
>>>  */
>>> struct drm_exec {
>>> 	/**
>>> 	 * @flags: Flags to control locking behavior
>>> 	 */
>>> 	u32 flags;
>>>
>>> 	/**
>>> 	 * @ticket: WW ticket used for acquiring locks
>>> 	 */
>>> 	struct ww_acquire_ctx ticket;
>>>
>>> 	/**
>>> 	 * @num_objects: number of objects locked
>>> 	 */
>>> 	unsigned int num_objects;
>>>
>>> 	/**
>>> 	 * @max_objects: maximum objects in array
>>> 	 */
>>> 	unsigned int max_objects;
>>>
>>> 	/**
>>> 	 * @objects: array of the locked objects
>>> 	 */
>>> 	struct drm_gem_object **objects;
>>>
>>> 	/**
>>> 	 * @contended: contended GEM object we backed off for
>>> 	 */
>>> 	struct drm_gem_object *contended;
>>>
>>> 	/**
>>> 	 * @prelocked: already locked GEM object due to contention
>>> 	 */
>>> 	struct drm_gem_object *prelocked;
>>> };
>>>
>>> This is GEM-specific, but we could perhaps implement the same idea by
>>> tracking ww_mutexes instead of GEM objects.
>>
>> But this would only work for `Vec<…>`, right?
>
> I'm not sure if I understand your point here.
>
> The list of ww_mutexes that we've managed to currently lock would be
> something that we'd keep track of internally in our context. In what
> way is a KVec an issue?

I saw "array of the locked objects" and thus thought this must only
work for an array of locks. Looking at the type a bit closer, it
actually is an array of pointers, so it does work for arbitrary data
structures storing the locks.

So essentially it would amount to storing `Vec<…>` in Rust IIUC. I was
under the impression that we wanted to avoid that, because it's an
extra allocation. But maybe that's actually what's done on the C side.

> Btw, I can also try to implement a proof of concept, so long as people
> agree that this approach makes sense.
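[Editorial aside: the closure-based shape gestured at above might look
roughly like the sketch below. All names are invented for illustration
and this is not the kernel `kernel` crate; a real abstraction would
hold the `ww_acquire_ctx` and the list of currently held locks, and
would actually drop them on back-off.]

```rust
/// Signals that a lock attempt hit contention and the whole sequence
/// should be retried after backing off (hypothetical marker type).
struct Contended;

/// Stand-in for an acquire context; here it only counts attempts.
struct ExecCtx {
    attempts: u32,
}

impl ExecCtx {
    fn new() -> Self {
        ExecCtx { attempts: 0 }
    }

    /// Run `body` until it completes without contention, in the spirit
    /// of drm_exec_until_all_locked(): on Err(Contended), back off
    /// (a real version would unlock everything here) and rerun.
    fn until_all_locked<F>(&mut self, mut body: F)
    where
        F: FnMut(&mut ExecCtx) -> Result<(), Contended>,
    {
        loop {
            self.attempts += 1;
            match body(self) {
                Ok(()) => break,          // everything locked
                Err(Contended) => continue, // back off and retry
            }
        }
    }
}

fn main() {
    let mut ctx = ExecCtx::new();
    let mut contended_once = true;
    ctx.until_all_locked(|_ctx| {
        // First pass simulates contention, like
        // drm_exec_retry_on_contention() triggering a jump back.
        if contended_once {
            contended_once = false;
            return Err(Contended);
        }
        Ok(())
    });
    println!("locked after {} attempts", ctx.attempts);
}
```

The closure replaces the cursed label/goto machinery: the "jump back to
the unique label" becomes an early `Err` return, and the loop around the
closure call plays the role of the macro's for-loop condition.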
I'm not sure I understand what you are proposing, so I can't give a recommendation yet. --- Cheers, Benno