public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: David Laight <david.laight.linux@gmail.com>
To: "Thomas Weißschuh" <linux@weissschuh.net>
Cc: Willy Tarreau <w@1wt.eu>, Daniel Palmer <daniel@thingy.jp>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] tools/nolibc: fcntl: Add fallocate()
Date: Sun, 3 May 2026 23:38:52 +0100	[thread overview]
Message-ID: <20260503233852.3f950b63@pumpkin> (raw)
In-Reply-To: <cc72fd15-d002-49ca-b573-611e4c12220d@t-8ch.de>

On Sun, 3 May 2026 18:28:39 +0200
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-05-02 22:26:07+0100, David Laight wrote:
> > On Sat, 2 May 2026 05:00:06 +0200
> > Willy Tarreau <w@1wt.eu> wrote:
> >   
> > > On Fri, May 01, 2026 at 09:18:31AM +0100, David Laight wrote:  
> > > > On Fri,  1 May 2026 01:41:24 +0900
> > > > Daniel Palmer <daniel@thingy.jp> wrote:  
> 
> (...)
...
> > Looking again, the code is trying to copy what the compiler generates
> > when it passes a 64bit quantity on stack or in 2 registers.
> > So f(a, b, c, d) is traditionally 'push d; push c; push b, push a; call f'.
> > With the normal 'stack grows down' this gives (in increasing addresses):
> > 	return address
> > 	a
> > 	b
> > 	c
> > 	d
> > If 'b,c' is replaced by a 64bit value then you want the stack memory
> > to contain the correct representation of a 64bit value.
> > So for LE you need to pass the low part before the high part.  
> 
> Most architectures should use registers and not the stack.

Indeed, but the registers are picked as though they are caching on-stack locations.
So the way the stack would look is relevant.

> 
> > But there is one architecture (parisc) where the stack goes the other way.
> > It is BE (I believe), but would need the LE argument order.
> > 
> > I think all the conditionals could be moved out of the fallocate code.
> > I'm not sure of the exact pre-processor conditionals but something like:
> > 
> > #if (__BITS_PER_LONG == 64) || defined(x86_x32) || defined(mips_n32)
> > #define __NOLIBC_ARG64(x) (x)
> > #else
> > /* The on-stack data has to be in the natural order for a 64bit value. */
> > #if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN_) ^ defined(STACK_GROWS_UP)
> > #define __NOLIBC_ARG64(x) ((int)(x)), ((int)((long long)(x) >> 32))
> > #else
> > #define __NOLIBC_ARG64(x) ((int)((long long)(x) >> 32)), ((int)(x))
> > #endif
> > #endif  
> 
> I highly dislike macros like these which may expand to either a single
> or multiple arguments.

> Also this logic looks much more complicated than what Daniel proposed.

I'm not sure the casts are needed.
The implicit casts from the function calls may mean that it is enough
to use (x) and (x) >> 32.

I really didn't like the expansions Daniel proposed in sys_fallocate() itself.
Neither the fallocate code nor the header file really said what was going on.
You had to read both together and then read between the lines so see why
it was necessary at all. 

> > Then (if I've got it right):
> > static __attribute__((unused))
> > int _sys_fallocate(int fd, int mode, off_t offset, off_t size)
> > {
> > 	return _syscall(__NR_fallocate, fd, mode, __NOLIBC_ARG64(offset),
> > 			__NOLIBC_ARG64(size));
> > }  
> 
> We intentionally use __nolibc_syscallX() inside nolibc proper over
> _syscall()/syscall() to have the arity of the syscall written out.

In this case the number of arguments varies - so that wouldn't work.
But I can see why it would normally be better.

> 
> > There is the other problem that arm32 (and maybe others) requires the 64bit
> > variable be aligned in the stack frame.
> > So f(int a, long long b) is effectively f(int a, int pad, long long b).
> > (This usually causes grief with lseek().)
> > So a pad might be needed for some system calls on some 32bit architectures.
> > This could be done with something that expands to '' or '0,'.  
> 
> I don't see why we have to worry about variable alignment or stack
> direction at all. That's the compilers job.

Except you do because you are trying to match what the compiler would
generate for a 64bit argument by passing two 32bit values.

I'm pretty sure that the normal syscall interface just passes the registers
that contain arguments over from user space to kernel space and then calls
the kernel function (ie without any regard for the types of the parameters).
This is normally fine except for lseek() and mmap() - they may have an
interface that re-orders the arguments.

> We do have both lseek() and
> parisc32 nowadays and both work fine. Also with Daniel's series on top.
> Am I missing something?

godbolt doesn't have a parisc32 compiler - so I couldn't check.
But I think if you compare the code for calls to f1(int, int, int, int),
f2(int, long long, int) and f3(int, int long long) for f1(1,2,3,4),
f2(1, 2ull<<32 | 3, 4) and f3(1, 2, 3ull<<32 | 4) against another BE
system the 64bit constants will overlay different 32bit ones.

	David

> 
> 
> Thomas


  reply	other threads:[~2026-05-03 22:38 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-30 16:41 [PATCH v3 0/2] nolibc: Add fallocate() Daniel Palmer
2026-04-30 16:41 ` [PATCH v3 1/2] tools/nolibc: fcntl: " Daniel Palmer
2026-05-01  8:18   ` David Laight
2026-05-02  3:00     ` Willy Tarreau
2026-05-02 21:26       ` David Laight
2026-05-03 16:28         ` Thomas Weißschuh
2026-05-03 22:38           ` David Laight [this message]
2026-04-30 16:41 ` [PATCH v3 2/2] selftests/nolibc: Add a very basic test for fallocate() Daniel Palmer
2026-05-02  3:04   ` Willy Tarreau
2026-05-02  4:00     ` Daniel Palmer
2026-05-02  4:40       ` Willy Tarreau
2026-05-03 16:20   ` Thomas Weißschuh
2026-05-02  3:05 ` [PATCH v3 0/2] nolibc: Add fallocate() Willy Tarreau
2026-05-03 16:21 ` Thomas Weißschuh
2026-05-04  1:46   ` Daniel Palmer
2026-05-04 15:33     ` Thomas Weißschuh
2026-05-05  2:20       ` Daniel Palmer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260503233852.3f950b63@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=daniel@thingy.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@weissschuh.net \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox