Date: Sat, 22 Feb 2025 07:32:10 +0100
From: Willy Tarreau
To: David Laight
Cc: Linus Torvalds, Jan Engelhardt, "H. Peter Anvin", Greg KH, Boqun Feng,
 Miguel Ojeda, Christoph Hellwig, rust-for-linux, David Airlie,
 linux-kernel@vger.kernel.org, ksummit@lists.linux.dev
Subject: Re: C aggregate passing (Rust kernel policy)
Message-ID: <20250222063210.GA11482@1wt.eu>
References: <2025021954-flaccid-pucker-f7d9@gregkh>
 <2nn05osp-9538-11n6-5650-p87s31pnnqn0@vanv.qr>
 <2025022052-ferment-vice-a30b@gregkh>
 <9B01858A-7EBD-4570-AC51-3F66B2B1E868@zytor.com>
 <20250221183437.1e2b5b94@pumpkin>
 <20250221214501.11b76aa8@pumpkin>
In-Reply-To: <20250221214501.11b76aa8@pumpkin>
X-Mailing-List: rust-for-linux@vger.kernel.org

On Fri, Feb 21, 2025 at 09:45:01PM +0000, David Laight wrote:
> On Fri, 21 Feb 2025 11:12:27 -0800
> Linus Torvalds wrote:
>
> > On Fri, 21 Feb 2025 at 10:34, David Laight wrote:
> > >
> > > As Linus said, most modern ABI pass short structures in one or two registers
> > > (or stack slots).
> > > But aggregate returns are always done by passing a hidden pointer argument.
> > >
> > > It is annoying that double-sized integers (u64 on 32bit and u128 on 64bit)
> > > are returned in a register pair - but similar sized structures have to be
> > > returned by value.
> >
> > No, they really don't. At least not on x86 and arm64 with our ABI.
> > Two-register structures get returned in registers too.
> >
> > Try something like this:
> >
> >     struct a {
> >         unsigned long val1, val2;
> >     } function(void)
> >     { return (struct a) { 5, 100 }; }
> >
> > and you'll see both gcc and clang generate
> >
> >     movl $5, %eax
> >     movl $100, %edx
> >     retq
> >
> > (and you'll similar code on other architectures).
>
> Humbug, I'm sure it didn't do that the last time I tried it.

You didn't dream it: most likely the last time you tried it was on a
32-bit arch like i386 or ARM.
GCC doesn't do that there, most likely for historical reasons that couldn't
be changed later. Instead, the caller passes a hidden pointer argument and
the function writes the data through it:

  00000000 :
     0:   8b 44 24 04             mov    0x4(%esp),%eax
     4:   c7 00 05 00 00 00       movl   $0x5,(%eax)
     a:   c7 40 04 64 00 00 00    movl   $0x64,0x4(%eax)
    11:   c2 04 00                ret    $0x4

You can improve it slightly with -mregparm but that's all, and I never
found an option nor attribute to change that:

  00000000 :
     0:   c7 00 05 00 00 00       movl   $0x5,(%eax)
     6:   c7 40 04 64 00 00 00    movl   $0x64,0x4(%eax)
     d:   c3                      ret

ARM does the same on 32 bits:

  00000000 :
     0:   2105            movs    r1, #5
     2:   2264            movs    r2, #100        ; 0x64
     4:   e9c0 1200       strd    r1, r2, [r0]
     8:   4770            bx      lr

I think it's simply that this practice arrived long after these old
architectures had become fairly common, and it was too late to change
their ABI. But x86_64 and aarch64 had the opportunity to benefit from it.
For example, gcc-3.4 on x86_64 already does the right thing:

  0000000000000000 :
     0:   ba 64 00 00 00          mov    $0x64,%edx
     5:   b8 05 00 00 00          mov    $0x5,%eax
     a:   c3                      retq

So does aarch64 since the oldest gcc I have that supports it (linaro 4.7):

  0000000000000000 :
     0:   d28000a0        mov     x0, #0x5                        // #5
     4:   d2800c81        mov     x1, #0x64                       // #100
     8:   d65f03c0        ret

For my use cases I consider that older architectures gain nothing from
this but are not penalized either, while newer ones benefit significantly
from the approach, which is why I'm using it extensively. Quite frankly,
there's no reason to avoid using this for pairs of pointers or
(status,value) pairs or coordinates etc.
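As an illustration of the (status,value) case, a minimal sketch could look
like the following. The names struct result and parse_uint() are
hypothetical, made up for this example and not taken from any existing
code. Under the ABIs discussed above, a non-inlined call is expected to
return both members in a register pair on x86_64 and aarch64, and through
the hidden pointer on i386 and 32-bit ARM:

```c
/* Hypothetical two-word result type: small enough to be returned in a
 * register pair on x86_64 and aarch64.
 */
struct result {
	long status;         /* 0 on success, -1 on error */
	unsigned long value; /* only meaningful when status == 0 */
};

/* Hypothetical helper: parse an unsigned decimal string and return the
 * status and the value together, by value. Sketch only: there is no
 * overflow check.
 */
struct result parse_uint(const char *s)
{
	unsigned long v = 0;

	/* an empty string is not a number */
	if (!*s)
		return (struct result){ .status = -1, .value = 0 };

	for (; *s; s++) {
		/* reject anything that is not a decimal digit */
		if (*s < '0' || *s > '9')
			return (struct result){ .status = -1, .value = 0 };
		v = v * 10 + (unsigned long)(*s - '0');
	}
	return (struct result){ .status = 0, .value = v };
}
```

The caller simply writes "struct result r = parse_uint(str);" and tests
r.status before using r.value, with no output pointer argument and no
errno-like side channel.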
And if you absolutely need to also support 32-bit archs optimally, you can
use a pair of macros to pack your struct into a wider integer (which these
ABIs do return in a register pair) and unpack it on the caller's side:

  struct a {
          unsigned long v1, v2;
  };

  /* pack v1 into the upper 32 bits and v2 into the lower 32 bits of a
   * 64-bit integer (assumes a 32-bit unsigned long)
   */
  #define MKPAIR(x) (((unsigned long long)((x).v1) << 32) | ((x).v2))

  /* unpack it into a struct again; relies on GCC statement expressions */
  #define GETPAIR(x) ({ unsigned long long _x = (x); \
                        (struct a){ .v1 = (_x >> 32), .v2 = (_x) }; })

  unsigned long long fct(void)
  {
          struct a a = { 5, 100 };
          return MKPAIR(a);
  }

  long caller(void)
  {
          struct a a = GETPAIR(fct());
          return a.v1 + a.v2;
  }

  00000000 :
     0:   b8 64 00 00 00          mov    $0x64,%eax
     5:   ba 05 00 00 00          mov    $0x5,%edx
     a:   c3                      ret

  0000000b :
     b:   b8 69 00 00 00          mov    $0x69,%eax
    10:   c3                      ret

But quite frankly, given how little relevance these architectures have
these days, I don't think it's worth the effort.

Hoping this helps,
Willy