Date: Fri, 20 Sep 2024 15:28:47 +0000
To: Simona Vetter, Greg KH, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross, rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org
From: Benno Lossin
Subject: Re: [PATCH 1/3] rust: add untrusted data abstraction
References: <20240913112643.542914-1-benno.lossin@proton.me> <20240913112643.542914-2-benno.lossin@proton.me> <26534d80-989d-4b77-9720-84575275890f@proton.me>
X-Mailing-List: rust-for-linux@vger.kernel.org

On 20.09.24 16:29, Simona Vetter wrote:
> On Wed, Sep 18, 2024 at 03:40:54PM +0000, Benno Lossin wrote:
>> On 16.09.24 17:49, Simona Vetter wrote:
>>> On Fri, Sep 13, 2024 at 04:49:29PM +0000, Benno Lossin wrote:
>>>> What I think we should do instead is make our APIs that return untrusted
>>>> data just return `Untrusted` and implement the following method:
>>>>
>>>>     impl Folio {
>>>>         pub fn read(self: &Untrusted<Self>) -> &Untrusted<[u8]>;
>>>>     }
>>>>
>>>> I think that is the best of both worlds: we don't need to do excessive
>>>> type shenanigans for every type carrying potentially untrusted data and
>>>> we get to have methods specific to untrusted data.
>>>>
>>>> However, I think implementing this method will be a bit difficult with
>>>> the `Untrusted`/`Unvalidated` split. Maybe we can have some `pub(crate)`
>>>> methods on `Unvalidated` to perform some mappings?
>>>
>>> The thing is, folios are just a pile of contig pages, and there's nothing
>>> in the rules that they only contain untrusted data. Currently in rust code
>>> we have that's the case, but not in general. So we need that associated
>>> type.
>>>
>>> But I also think Folio here is special, a lot of the other places where I
>>> want this annotation it's the case that the data returned is _always_
>>> untrusted. So we don't need to place associated types all over the
>>> codebase to make this work, it's just that the rfc example you've picked
>>> needs it.
>>
>> I think we should try to make just wrapping stuff in `Untrusted` work. I
>> don't see how the associated types would help you any more than just
>> implementing stuff on `&Untrusted
>
> I guess you could wrap it as Untrusted in each use site when you get the
> data out of the Folio, but that makes the guarantees we get out of these
> annotations much less stringent. Which is why I think for Folio<> (well
> really for Pagecache) we need to go with the associated type or it's a bit
> self-defeating.

Let's just implement both ways and see which one is better.

[...]

>>>>>> +pub trait Validator {
>>>>>> +    /// Type of the input data that is untrusted.
>>>>>> +    type Input: ?Sized;
>>>>>> +    /// Type of the validated data.
>>>>>> +    type Output;
>>>>>
>>>>> So I think the explicit Output makes sense if you have multiple different
>>>>> untrusted input that validate to the same trusted structure, but I'm not
>>>>> sure this makes sense as associated types.
>>>>> Instead I'd go with generics
>>>>> and something like this:
>>>>>
>>>>> pub trait Validator<Input: ?Sized> {
>>>>>     type Err;
>>>>>
>>>>>     fn validate(untrusted: &Untrusted<Input>) -> Result<Self, Self::Err>;
>>>>> }
>>>>>
>>>>> That means you can't implement validate for types from other modules
>>>>> directly but need a newtype (I think at least, not sure). But I think
>>>>> that's actually a good thing, since often that means you're validating
>>>>> some generic state plus whatever your own code needs (like the
>>>>> inode::Params in your example), and both pieces need to
>>>>> be consistent overall and not just individually (otherwise why does
>>>>> that other module not do the parsing for you). And so explicitly treating
>>>>> the validated output as an explicit new type just makes sense to me. Plus
>>>>> with derive(Deref) it's trivial to unbox after validation.
>>>>
>>>> There might be the need to validate the same piece of data with
>>>> different ways and I am not convinced adding a newtype for every single
>>>> case is a good way to achieve it.
>>>> Although it would simplify the `Validator` trait... I will think a bit
>>>> about this.
>>>
>>> Hm, but unless I misunderstand you already need a random type to attach
>>> your current trait too? So not worse if we require that for the
>>> less-common type of multiple ways to validate the same, and simpler for
>>> the common one.
>>
>> Yes, but you wouldn't have to unwrap the return type. For example with
>> your proposal we have:
>>
>>     struct MyINodeParams(INodeParams);
>>
>>     impl Validator<[u8]> for MyINodeParams {
>>         type Err = Error;
>>
>>         fn validate(untrusted: &Untrusted<[u8]>) -> Result<Self, Self::Err> {
>>             /*...*/
>>             Ok(Self(params))
>>         }
>>     }
>>
>>     impl MyINodeParams {
>>         fn into_inner(self) -> INodeParams {
>>             self.0
>>         }
>>     }
>>
>> And then you would do:
>>
>>     let params = untrusted.validate::<MyINodeParams>()?.into_inner();
>>
>> I find the `into_inner()` a bit annoying (one could just use `.0`
>> instead, but I also don't like that).
>> I find specifying the `Output` a bit cleaner.
>
> Hm right. But I guess with your new plan to only support validate, which
> gets the inner passed in explicitly and returns whatever the closure
> returns?

The only thing that changes with my suggestion is the parameter type of
`validate` (and the names):

    struct MyINodeParams(INodeParams);

    impl Validate<[u8]> for MyINodeParams {
        type Err = Error;

        fn validate(untrusted: &[u8]) -> Result<Self, Self::Err> {
            /* ... */
            Ok(Self(params))
        }
    }

    let params = untrusted.validate::<MyINodeParams>()?.into_inner();

And with the `Output` type on the trait we would have:

    struct MyINodeParams;

    impl Validate<[u8]> for MyINodeParams {
        type Err = Error;
        type Output = INodeParams;

        fn validate(untrusted: &[u8]) -> Result<Self::Output, Self::Err> {
            // ...
        }
    }

    let params = untrusted.validate::<MyINodeParams>()?;

I don't think that it's a huge difference, but nonetheless it is
probably useful.

But, I just remembered a probably more important thing: returning
`Result<Self, Self::Err>` will make it possible to use type inference in
places where you *do* want your custom type, so

    struct Foo { /* ... */ }

    impl Validate for Foo { /* ... */ }

    fn use_my_foo(foo: Foo) { /* ... */ }

    use_my_foo(untrusted.validate()?)

Should work (ie the `.validate()` call doesn't need the generic
argument). So I think I will go for no `Output` type in the next
version.

---
Cheers,
Benno
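
For reference, the no-`Output` variant described above can be tried out
in plain userspace Rust. This is only a rough sketch under assumptions:
`Untrusted<T>` here is a hypothetical stand-in (not the wrapper from the
patch), and `Foo`, `TooShort`, `demo` and the length-prefixed format are
made up for illustration. It shows that the `.validate()` call needs no
generic argument when the trait returns `Self`.

```rust
/// Hypothetical stand-in for the `Untrusted` wrapper; NOT the kernel type.
pub struct Untrusted<T: ?Sized>(T);

impl<T> Untrusted<T> {
    /// Wraps a value as untrusted (illustrative constructor).
    pub fn new(value: T) -> Self {
        Self(value)
    }
}

/// Variant without an `Output` associated type: the implementing type
/// is itself the validated result.
pub trait Validate<Input: ?Sized>: Sized {
    type Err;

    fn validate(raw: &Input) -> Result<Self, Self::Err>;
}

impl<T: ?Sized> Untrusted<T> {
    /// Hands the raw data to `V::validate`; `V` can be inferred from
    /// the type the caller expects.
    pub fn validate<V: Validate<T>>(&self) -> Result<V, V::Err> {
        V::validate(&self.0)
    }
}

/// Made-up validated type: one length byte followed by a payload.
#[derive(Debug, PartialEq)]
pub struct Foo {
    payload: Vec<u8>,
}

/// Made-up error: the buffer is shorter than the length byte claims.
#[derive(Debug, PartialEq)]
pub struct TooShort;

impl Validate<[u8]> for Foo {
    type Err = TooShort;

    fn validate(raw: &[u8]) -> Result<Self, Self::Err> {
        // The first byte is the claimed payload length.
        let (&len, rest) = raw.split_first().ok_or(TooShort)?;
        // Reject the input if fewer than `len` bytes follow.
        let payload = rest.get(..usize::from(len)).ok_or(TooShort)?;
        Ok(Foo { payload: payload.to_vec() })
    }
}

pub fn use_my_foo(foo: Foo) -> usize {
    foo.payload.len()
}

pub fn demo(untrusted: &Untrusted<[u8]>) -> Result<usize, TooShort> {
    // No turbofish: `V = Foo` is inferred from `use_my_foo`'s parameter.
    Ok(use_my_foo(untrusted.validate()?))
}
```

With an `Output` associated type, the same call site would have to name
the validator type explicitly, since the compiler cannot go from the
expected output type back to the implementing type.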