public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
       [not found]         ` <1258471436.2876.34.camel@dhcp231-106.rdu.redhat.com>
@ 2009-11-18  7:04           ` Heiko Carstens
  2009-11-18  7:04             ` Heiko Carstens
                               ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Heiko Carstens @ 2009-11-18  7:04 UTC (permalink / raw)
  To: Eric Paris
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Tue, Nov 17, 2009 at 10:23:56AM -0500, Eric Paris wrote:
> On Tue, 2009-11-17 at 14:55 +0100, Heiko Carstens wrote:
> 
> > Yes, also some places should have used lgfr instead of llgfr for proper sign
> > extension. But please, just drop the s390 bits from your patch.
> > Its easier and less painful for us to do it ourselves instead of reviewing
> > and fixing these things. (No offence intended!).
> 
> dropped and won't show up in -next tomorrow.
> 
> This what I thought it should be and would love to read if you say it's
> right....
> 

[...]

> +sys32_fanotify_mark_wrapper:
> +	lgfr	%r2,%r2			# int
> +	llgfr	%r3,%r3			# unsigned int
> +	lgfr	%r4,%r4			# int
> +	llgtr	%r5,%r5			# char *
> +	sllg	%r6,%r6,32		# get high word of 64bit mask
> +	l	%r6,164(%r15)		# get low word of 64bit mask
> +	jg	sys_fanotify_mark

Oh wait, I have to correct myself:

With

long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
     	 	       int fd, const char  __user *pathname,
                       u64 mask);

we have a 64 bit type as 5th argument. That doesn't work for syscalls
on 32 bit s390.
I just simplify the reason for this: on 32 bit long longs will be passed via
two consecutive registers _unless_ the first register would be r6 (which is
the case here). In that case the whole 64 bits would be passed on the stack.
Our glibc syscall code will always put the contents of the first parameter
stack slot into register r7, so we have six registers for parameter passing
(r2-r7). So with the 64 bit value put into two stack slots we would miss
the second part of the 5th argument.

Please note that other architectures (I think at least arm and powerpc) put
64 bit values into even/odd register pairs and add padding if the first free
available register is an odd one. So any of the following interfaces should
work for all architectures:

long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
     	 	       int fd, const char  __user *pathname,
                       u32 mask_high, u32 mask_low);

long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
                       u64 mask,
     	 	       int fd, const char  __user *pathname);

long sys_fanotify_mark(u64 mask,
     		       int fanotify_fd, unsigned int flags,
     	 	       int fd, const char  __user *pathname);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18  7:04           ` [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S) Heiko Carstens
@ 2009-11-18  7:04             ` Heiko Carstens
  2009-11-18  9:27             ` Russell King
                               ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Heiko Carstens @ 2009-11-18  7:04 UTC (permalink / raw)
  To: Eric Paris
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Tue, Nov 17, 2009 at 10:23:56AM -0500, Eric Paris wrote:
> On Tue, 2009-11-17 at 14:55 +0100, Heiko Carstens wrote:
> 
> > Yes, also some places should have used lgfr instead of llgfr for proper sign
> > extension. But please, just drop the s390 bits from your patch.
> > Its easier and less painful for us to do it ourselves instead of reviewing
> > and fixing these things. (No offence intended!).
> 
> dropped and won't show up in -next tomorrow.
> 
> This what I thought it should be and would love to read if you say it's
> right....
> 

[...]

> +sys32_fanotify_mark_wrapper:
> +	lgfr	%r2,%r2			# int
> +	llgfr	%r3,%r3			# unsigned int
> +	lgfr	%r4,%r4			# int
> +	llgtr	%r5,%r5			# char *
> +	sllg	%r6,%r6,32		# get high word of 64bit mask
> +	l	%r6,164(%r15)		# get low word of 64bit mask
> +	jg	sys_fanotify_mark

Oh wait, I have to correct myself:

With

long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
     	 	       int fd, const char  __user *pathname,
                       u64 mask);

we have a 64 bit type as 5th argument. That doesn't work for syscalls
on 32 bit s390.
I just simplify the reason for this: on 32 bit long longs will be passed via
two consecutive registers _unless_ the first register would be r6 (which is
the case here). In that case the whole 64 bits would be passed on the stack.
Our glibc syscall code will always put the contents of the first parameter
stack slot into register r7, so we have six registers for parameter passing
(r2-r7). So with the 64 bit value put into two stack slots we would miss
the second part of the 5th argument.

Please note that other architectures (I think at least arm and powerpc) put
64 bit values into even/odd register pairs and add padding if the first free
available register is an odd one. So any of the following interfaces should
work for all architectures:

long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
     	 	       int fd, const char  __user *pathname,
                       u32 mask_high, u32 mask_low);

long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
                       u64 mask,
     	 	       int fd, const char  __user *pathname);

long sys_fanotify_mark(u64 mask,
     		       int fanotify_fd, unsigned int flags,
     	 	       int fd, const char  __user *pathname);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18  7:04           ` [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S) Heiko Carstens
  2009-11-18  7:04             ` Heiko Carstens
@ 2009-11-18  9:27             ` Russell King
  2009-11-18 14:49             ` Ralf Baechle
                               ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Russell King @ 2009-11-18  9:27 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Eric Paris, Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, Nov 18, 2009 at 08:04:18AM +0100, Heiko Carstens wrote:
> On Tue, Nov 17, 2009 at 10:23:56AM -0500, Eric Paris wrote:
> With
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname,
>                        u64 mask);
> 
> we have a 64 bit type as 5th argument. That doesn't work for syscalls
> on 32 bit s390.

Note that the above works on ARM, since we end up with the following
register allocation:

r0: fanotify_fd
r1: flags
r2: fd
r3: pathname
r4,r5: mask

since 'mask' is an even,odd pair, and the arguments all fit within r0 - r5
inclusive.

> Please note that other architectures (I think at least arm and powerpc) put
> 64 bit values into even/odd register pairs and add padding if the first free
> available register is an odd one. So any of the following interfaces should
> work for all architectures:

Indeed.

Since we're going around the loop of re-organizing the argument order of
syscalls, the question which needs asking is: what's happening with HPA's
idea of having a set of per-arch rules and generating the interfaces
according to those rules?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18  7:04           ` [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S) Heiko Carstens
  2009-11-18  7:04             ` Heiko Carstens
  2009-11-18  9:27             ` Russell King
@ 2009-11-18 14:49             ` Ralf Baechle
  2009-11-18 14:49               ` Ralf Baechle
  2009-11-18 16:02             ` Eric Paris
  2009-11-18 17:24             ` Eric Paris
  4 siblings, 1 reply; 15+ messages in thread
From: Ralf Baechle @ 2009-11-18 14:49 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Eric Paris, Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	linux-mips, Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, Nov 18, 2009 at 08:04:18AM +0100, Heiko Carstens wrote:

> Please note that other architectures (I think at least arm and powerpc) put
> 64 bit values into even/odd register pairs and add padding if the first free
> available register is an odd one. So any of the following interfaces should
> work for all architectures:
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname,
>                        u32 mask_high, u32 mask_low);
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>                        u64 mask,
>      	 	       int fd, const char  __user *pathname);
> 
> long sys_fanotify_mark(u64 mask,
>      		       int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname);

Correct - but the splitting is unnecessary pain for some platforms like
64-bit userland on 64-bit MIPS where the 64-bit argument would be passed
in a single register so I have preference for the 2nd or 3rd suggestion.

The 1st prototype has the advantage of avoiding the need for a compat
wrapper which otherwise would look like:

long compat_sys_fanotify_mark(int fanotify_fd, unsigned int flags,
                       u32 a2, u32 a3,
     	 	       int fd, const char  __user *pathname);
{
	/* assuming 2nd suggested prototype from above */
	return sys_fanotify_mark(fd, flags,
				 merge64(a2, a3),
				 fd, pathname);
}

I'd prefer to see the compat code carrying the burden.

  Ralf

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 14:49             ` Ralf Baechle
@ 2009-11-18 14:49               ` Ralf Baechle
  0 siblings, 0 replies; 15+ messages in thread
From: Ralf Baechle @ 2009-11-18 14:49 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Eric Paris, Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	linux-mips, Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, Nov 18, 2009 at 08:04:18AM +0100, Heiko Carstens wrote:

> Please note that other architectures (I think at least arm and powerpc) put
> 64 bit values into even/odd register pairs and add padding if the first free
> available register is an odd one. So any of the following interfaces should
> work for all architectures:
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname,
>                        u32 mask_high, u32 mask_low);
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>                        u64 mask,
>      	 	       int fd, const char  __user *pathname);
> 
> long sys_fanotify_mark(u64 mask,
>      		       int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname);

Correct - but the splitting is unnecessary pain for some platforms like
64-bit userland on 64-bit MIPS where the 64-bit argument would be passed
in a single register so I have preference for the 2nd or 3rd suggestion.

The 1st prototype has the advantage of avoiding the need for a compat
wrapper which otherwise would look like:

long compat_sys_fanotify_mark(int fanotify_fd, unsigned int flags,
                       u32 a2, u32 a3,
     	 	       int fd, const char  __user *pathname);
{
	/* assuming 2nd suggested prototype from above */
	return sys_fanotify_mark(fd, flags,
				 merge64(a2, a3),
				 fd, pathname);
}

I'd prefer to see the compat code carrying the burden.

  Ralf

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18  7:04           ` [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S) Heiko Carstens
                               ` (2 preceding siblings ...)
  2009-11-18 14:49             ` Ralf Baechle
@ 2009-11-18 16:02             ` Eric Paris
  2009-11-18 16:02               ` Eric Paris
                                 ` (2 more replies)
  2009-11-18 17:24             ` Eric Paris
  4 siblings, 3 replies; 15+ messages in thread
From: Eric Paris @ 2009-11-18 16:02 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, 2009-11-18 at 08:04 +0100, Heiko Carstens wrote:
> On Tue, Nov 17, 2009 at 10:23:56AM -0500, Eric Paris wrote:
> > On Tue, 2009-11-17 at 14:55 +0100, Heiko Carstens wrote:
> > 
> > > Yes, also some places should have used lgfr instead of llgfr for proper sign
> > > extension. But please, just drop the s390 bits from your patch.
> > > Its easier and less painful for us to do it ourselves instead of reviewing
> > > and fixing these things. (No offence intended!).
> > 
> > dropped and won't show up in -next tomorrow.
> > 
> > This what I thought it should be and would love to read if you say it's
> > right....
> > 
> 
> [...]
> 
> > +sys32_fanotify_mark_wrapper:
> > +	lgfr	%r2,%r2			# int
> > +	llgfr	%r3,%r3			# unsigned int
> > +	lgfr	%r4,%r4			# int
> > +	llgtr	%r5,%r5			# char *
> > +	sllg	%r6,%r6,32		# get high word of 64bit mask
> > +	l	%r6,164(%r15)		# get low word of 64bit mask
> > +	jg	sys_fanotify_mark
> 
> Oh wait, I have to correct myself:
> 
> With
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname,
>                        u64 mask);
> 
> we have a 64 bit type as 5th argument. That doesn't work for syscalls
> on 32 bit s390.
> I just simplify the reason for this: on 32 bit long longs will be passed via
> two consecutive registers _unless_ the first register would be r6 (which is
> the case here). In that case the whole 64 bits would be passed on the stack.
> Our glibc syscall code will always put the contents of the first parameter
> stack slot into register r7, so we have six registers for parameter passing
> (r2-r7). So with the 64 bit value put into two stack slots we would miss
> the second part of the 5th argument.

asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);

sys_fallocate_wrapper:
        lgfr    %r2,%r2                 # int
        lgfr    %r3,%r3                 # int
        sllg    %r4,%r4,32              # get high word of 64bit loff_t
        lr      %r4,%r5                 # get low word of 64bit loff_t
        sllg    %r5,%r6,32              # get high word of 64bit loff_t
        l       %r5,164(%r15)           # get low word of 64bit loff_t
        jg      sys_fallocate

Does this work?  It's basically the same thing, right?  I'm willing to
hear "that's fine you are clueless"   Just saw it and hoping that we
have everything right....

-Eric

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 16:02             ` Eric Paris
@ 2009-11-18 16:02               ` Eric Paris
  2009-11-18 16:22               ` Heiko Carstens
  2009-11-18 17:34               ` Martin Schwidefsky
  2 siblings, 0 replies; 15+ messages in thread
From: Eric Paris @ 2009-11-18 16:02 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, 2009-11-18 at 08:04 +0100, Heiko Carstens wrote:
> On Tue, Nov 17, 2009 at 10:23:56AM -0500, Eric Paris wrote:
> > On Tue, 2009-11-17 at 14:55 +0100, Heiko Carstens wrote:
> > 
> > > Yes, also some places should have used lgfr instead of llgfr for proper sign
> > > extension. But please, just drop the s390 bits from your patch.
> > > Its easier and less painful for us to do it ourselves instead of reviewing
> > > and fixing these things. (No offence intended!).
> > 
> > dropped and won't show up in -next tomorrow.
> > 
> > This what I thought it should be and would love to read if you say it's
> > right....
> > 
> 
> [...]
> 
> > +sys32_fanotify_mark_wrapper:
> > +	lgfr	%r2,%r2			# int
> > +	llgfr	%r3,%r3			# unsigned int
> > +	lgfr	%r4,%r4			# int
> > +	llgtr	%r5,%r5			# char *
> > +	sllg	%r6,%r6,32		# get high word of 64bit mask
> > +	l	%r6,164(%r15)		# get low word of 64bit mask
> > +	jg	sys_fanotify_mark
> 
> Oh wait, I have to correct myself:
> 
> With
> 
> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>      	 	       int fd, const char  __user *pathname,
>                        u64 mask);
> 
> we have a 64 bit type as 5th argument. That doesn't work for syscalls
> on 32 bit s390.
> I just simplify the reason for this: on 32 bit long longs will be passed via
> two consecutive registers _unless_ the first register would be r6 (which is
> the case here). In that case the whole 64 bits would be passed on the stack.
> Our glibc syscall code will always put the contents of the first parameter
> stack slot into register r7, so we have six registers for parameter passing
> (r2-r7). So with the 64 bit value put into two stack slots we would miss
> the second part of the 5th argument.

asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);

sys_fallocate_wrapper:
        lgfr    %r2,%r2                 # int
        lgfr    %r3,%r3                 # int
        sllg    %r4,%r4,32              # get high word of 64bit loff_t
        lr      %r4,%r5                 # get low word of 64bit loff_t
        sllg    %r5,%r6,32              # get high word of 64bit loff_t
        l       %r5,164(%r15)           # get low word of 64bit loff_t
        jg      sys_fallocate

Does this work?  It's basically the same thing, right?  I'm willing to
hear "that's fine you are clueless"   Just saw it and hoping that we
have everything right....

-Eric


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 16:02             ` Eric Paris
  2009-11-18 16:02               ` Eric Paris
@ 2009-11-18 16:22               ` Heiko Carstens
  2009-11-18 16:22                 ` Heiko Carstens
  2009-11-18 17:34               ` Martin Schwidefsky
  2 siblings, 1 reply; 15+ messages in thread
From: Heiko Carstens @ 2009-11-18 16:22 UTC (permalink / raw)
  To: Eric Paris
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, Nov 18, 2009 at 11:02:57AM -0500, Eric Paris wrote:
> 
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
> 
> sys_fallocate_wrapper:
>         lgfr    %r2,%r2                 # int
>         lgfr    %r3,%r3                 # int
>         sllg    %r4,%r4,32              # get high word of 64bit loff_t
>         lr      %r4,%r5                 # get low word of 64bit loff_t
>         sllg    %r5,%r6,32              # get high word of 64bit loff_t
>         l       %r5,164(%r15)           # get low word of 64bit loff_t
>         jg      sys_fallocate
> 
> Does this work?  It's basically the same thing, right?  I'm willing to
> hear "that's fine you are clueless"   Just saw it and hoping that we
> have everything right....

It works, because for 32 bit s390 it's not the interface above but this one:
(see arch/s390/kernel/sys_s390.c):

/*
 * This is a wrapper to call sys_fallocate(). For 31 bit s390 the last
 * 64 bit argument "len" is split into the upper and lower 32 bits. The
 * system call wrapper in the user space loads the value to %r6/%r7.
 * The code in entry.S keeps the values in %r2 - %r6 where they are and
 * stores %r7 to 96(%r15). But the standard C linkage requires that
 * the whole 64 bit value for len is stored on the stack and doesn't
 * use %r6 at all. So s390_fallocate has to convert the arguments from
 *   %r2: fd, %r3: mode, %r4/%r5: offset, %r6/96(%r15)-99(%r15): len
 * to
 *   %r2: fd, %r3: mode, %r4/%r5: offset, 96(%r15)-103(%r15): len
 */
SYSCALL_DEFINE(s390_fallocate)(int fd, int mode, loff_t offset,
			       u32 len_high, u32 len_low)
{
	return sys_fallocate(fd, mode, offset, ((u64)len_high << 32) | len_low);
}
#ifdef CONFIG_HAVE_SYSCALL_WRAPPERS
asmlinkage long SyS_s390_fallocate(long fd, long mode, loff_t offset,
				   long len_high, long len_low)
{
	return SYSC_s390_fallocate((int) fd, (int) mode, offset,
				   (u32) len_high, (u32) len_low);
}
SYSCALL_ALIAS(sys_s390_fallocate, SyS_s390_fallocate);
#endif

We (or better: Martin ;) had to define a special s390 fallocate system call
because the "real" system call interface doesn't work on s390 for the same
reason like your proposed system call.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 16:22               ` Heiko Carstens
@ 2009-11-18 16:22                 ` Heiko Carstens
  0 siblings, 0 replies; 15+ messages in thread
From: Heiko Carstens @ 2009-11-18 16:22 UTC (permalink / raw)
  To: Eric Paris
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, Nov 18, 2009 at 11:02:57AM -0500, Eric Paris wrote:
> 
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
> 
> sys_fallocate_wrapper:
>         lgfr    %r2,%r2                 # int
>         lgfr    %r3,%r3                 # int
>         sllg    %r4,%r4,32              # get high word of 64bit loff_t
>         lr      %r4,%r5                 # get low word of 64bit loff_t
>         sllg    %r5,%r6,32              # get high word of 64bit loff_t
>         l       %r5,164(%r15)           # get low word of 64bit loff_t
>         jg      sys_fallocate
> 
> Does this work?  It's basically the same thing, right?  I'm willing to
> hear "that's fine you are clueless"   Just saw it and hoping that we
> have everything right....

It works, because for 32 bit s390 it's not the interface above but this one:
(see arch/s390/kernel/sys_s390.c):

/*
 * This is a wrapper to call sys_fallocate(). For 31 bit s390 the last
 * 64 bit argument "len" is split into the upper and lower 32 bits. The
 * system call wrapper in the user space loads the value to %r6/%r7.
 * The code in entry.S keeps the values in %r2 - %r6 where they are and
 * stores %r7 to 96(%r15). But the standard C linkage requires that
 * the whole 64 bit value for len is stored on the stack and doesn't
 * use %r6 at all. So s390_fallocate has to convert the arguments from
 *   %r2: fd, %r3: mode, %r4/%r5: offset, %r6/96(%r15)-99(%r15): len
 * to
 *   %r2: fd, %r3: mode, %r4/%r5: offset, 96(%r15)-103(%r15): len
 */
SYSCALL_DEFINE(s390_fallocate)(int fd, int mode, loff_t offset,
			       u32 len_high, u32 len_low)
{
	return sys_fallocate(fd, mode, offset, ((u64)len_high << 32) | len_low);
}
#ifdef CONFIG_HAVE_SYSCALL_WRAPPERS
asmlinkage long SyS_s390_fallocate(long fd, long mode, loff_t offset,
				   long len_high, long len_low)
{
	return SYSC_s390_fallocate((int) fd, (int) mode, offset,
				   (u32) len_high, (u32) len_low);
}
SYSCALL_ALIAS(sys_s390_fallocate, SyS_s390_fallocate);
#endif

We (or better: Martin ;) had to define a special s390 fallocate system call
because the "real" system call interface doesn't work on s390 for the same
reason like your proposed system call.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18  7:04           ` [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S) Heiko Carstens
                               ` (3 preceding siblings ...)
  2009-11-18 16:02             ` Eric Paris
@ 2009-11-18 17:24             ` Eric Paris
  4 siblings, 0 replies; 15+ messages in thread
From: Eric Paris @ 2009-11-18 17:24 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, Martin Schwidefsky, linux-arch

On Wed, 2009-11-18 at 08:04 +0100, Heiko Carstens wrote:


> long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
>                        u64 mask,
>      	 	       int fd, const char  __user *pathname);

I'm willing to go with this interface if that works the best for most
arches.  I'll queue it up for -next tomorrow.

-Eric

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 16:02             ` Eric Paris
  2009-11-18 16:02               ` Eric Paris
  2009-11-18 16:22               ` Heiko Carstens
@ 2009-11-18 17:34               ` Martin Schwidefsky
  2009-11-18 17:34                 ` Martin Schwidefsky
  2009-11-18 18:41                 ` Heiko Carstens
  2 siblings, 2 replies; 15+ messages in thread
From: Martin Schwidefsky @ 2009-11-18 17:34 UTC (permalink / raw)
  To: Eric Paris
  Cc: Heiko Carstens, Sachin Sant, linux-s390, Stephen Rothwell,
	linux-next, Andrew Morton, linux-arch

On Wed, 18 Nov 2009 11:02:57 -0500
Eric Paris <eparis@redhat.com> wrote:

> On Wed, 2009-11-18 at 08:04 +0100, Heiko Carstens wrote:
> > Oh wait, I have to correct myself:
> > 
> > With
> > 
> > long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
> >      	 	       int fd, const char  __user *pathname,
> >                        u64 mask);
> > 
> > we have a 64 bit type as 5th argument. That doesn't work for syscalls
> > on 32 bit s390.
> > I just simplify the reason for this: on 32 bit long longs will be passed via
> > two consecutive registers _unless_ the first register would be r6 (which is
> > the case here). In that case the whole 64 bits would be passed on the stack.
> > Our glibc syscall code will always put the contents of the first parameter
> > stack slot into register r7, so we have six registers for parameter passing
> > (r2-r7). So with the 64 bit value put into two stack slots we would miss
> > the second part of the 5th argument.
> 
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
> 
> sys_fallocate_wrapper:
>         lgfr    %r2,%r2                 # int
>         lgfr    %r3,%r3                 # int
>         sllg    %r4,%r4,32              # get high word of 64bit loff_t
>         lr      %r4,%r5                 # get low word of 64bit loff_t
>         sllg    %r5,%r6,32              # get high word of 64bit loff_t
>         l       %r5,164(%r15)           # get low word of 64bit loff_t
>         jg      sys_fallocate
> 
> Does this work?  It's basically the same thing, right?  I'm willing to
> hear "that's fine you are clueless"   Just saw it and hoping that we
> have everything right....

Ok, we need the full version of the story..
The 32 bit ELF ABI specifies that the 32 bit registers %r2 to %r6 are
used for parameter passing. 64 bit values are passed as registers pairs
with the first register an even numbered register. The effect of that
rule is that parameter registers may be skipped or that the whole 64 bit
value is passed on the stack. Examples:

fn(int a, int b, long long c)
a is passed in %r2, b is passed in %r3, c is passed in %r4/%r5.

fn(int a, long long b, int c)
a is passed in %r2, b is passed in %r4/%r5, c is passed in %r6, %r3 is
skipped.

fn(int a, int b, int c, int d, long long e)
a is passed in %r2, b is passed in %r3, c is passed in %r4, d is passed
in %r5, e is passed on the stack, %r6 is skipped.

The second fact to understand is how the system call arguments are
passed. The original system call ABI used the same calling conventions
as the ELF ABI. That is only registers %r2 to %r6 are used. Now futex
came along with 6 parameters. We did not want to use the user process
stack to pass the parameters because that would require a
copy_from_user which is expensive. Instead we tricked a little bit. The
6th parameter is passed by glue code in glibc in register %r7 (no user
copy). The code in entry.S stores %r7 to the beginning of the pt_regs
structure:

struct pt_regs
{
        unsigned long args[1];
	...
};

The C function that implements a system call with 6 32-bit parameters
expects 5 parameters in registers, the 6th is located on the stack. The
args element of pt_regs "happens" to be at the same offset where the C
function is looking for the first overflow argument (= the 6th
parameter).

Now consider a system call with an overflowing 64 bit parameter. The
glue code in glibc could be hacked in a way that the 64 bit value is
split into %r6 and %r7. But the system call function is just a C
function. It follows the ELF ABI and expects the 64 bit argument on the
stack. It would take two 32 bit overflow registers in pt_regs to make
one 64 bit parameter. With the current code that won't work. We would
need a wrapper function in the kernel to untangle this parameter mess.

The avoid all this all 64 bit parameter have to be placed at positions
where no register is skipped because of the even/odd rule and where it
is not affected by the %r7 trick (= may not be the last parameter).
Easy, no?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 17:34               ` Martin Schwidefsky
@ 2009-11-18 17:34                 ` Martin Schwidefsky
  2009-11-18 18:41                 ` Heiko Carstens
  1 sibling, 0 replies; 15+ messages in thread
From: Martin Schwidefsky @ 2009-11-18 17:34 UTC (permalink / raw)
  To: Eric Paris
  Cc: Heiko Carstens, Sachin Sant, linux-s390, Stephen Rothwell,
	linux-next, Andrew Morton, linux-arch

On Wed, 18 Nov 2009 11:02:57 -0500
Eric Paris <eparis@redhat.com> wrote:

> On Wed, 2009-11-18 at 08:04 +0100, Heiko Carstens wrote:
> > Oh wait, I have to correct myself:
> > 
> > With
> > 
> > long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
> >      	 	       int fd, const char  __user *pathname,
> >                        u64 mask);
> > 
> > we have a 64 bit type as 5th argument. That doesn't work for syscalls
> > on 32 bit s390.
> > I just simplify the reason for this: on 32 bit long longs will be passed via
> > two consecutive registers _unless_ the first register would be r6 (which is
> > the case here). In that case the whole 64 bits would be passed on the stack.
> > Our glibc syscall code will always put the contents of the first parameter
> > stack slot into register r7, so we have six registers for parameter passing
> > (r2-r7). So with the 64 bit value put into two stack slots we would miss
> > the second part of the 5th argument.
> 
> asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
> 
> sys_fallocate_wrapper:
>         lgfr    %r2,%r2                 # int
>         lgfr    %r3,%r3                 # int
>         sllg    %r4,%r4,32              # get high word of 64bit loff_t
>         lr      %r4,%r5                 # get low word of 64bit loff_t
>         sllg    %r5,%r6,32              # get high word of 64bit loff_t
>         l       %r5,164(%r15)           # get low word of 64bit loff_t
>         jg      sys_fallocate
> 
> Does this work?  It's basically the same thing, right?  I'm willing to
> hear "that's fine you are clueless"   Just saw it and hoping that we
> have everything right....

Ok, we need the full version of the story..
The 32 bit ELF ABI specifies that the 32 bit registers %r2 to %r6 are
used for parameter passing. 64 bit values are passed as registers pairs
with the first register an even numbered register. The effect of that
rule is that parameter registers may be skipped or that the whole 64 bit
value is passed on the stack. Examples:

fn(int a, int b, long long c)
a is passed in %r2, b is passed in %r3, c is passed in %r4/%r5.

fn(int a, long long b, int c)
a is passed in %r2, b is passed in %r4/%r5, c is passed in %r6, %r3 is
skipped.

fn(int a, int b, int c, int d, long long e)
a is passed in %r2, b is passed in %r3, c is passed in %r4, d is passed
in %r5, e is passed on the stack, %r6 is skipped.

The second fact to understand is how the system call arguments are
passed. The original system call ABI used the same calling conventions
as the ELF ABI. That is only registers %r2 to %r6 are used. Now futex
came along with 6 parameters. We did not want to use the user process
stack to pass the parameters because that would require a
copy_from_user which is expensive. Instead we tricked a little bit. The
6th parameter is passed by glue code in glibc in register %r7 (no user
copy). The code in entry.S stores %r7 to the beginning of the pt_regs
structure:

struct pt_regs
{
        unsigned long args[1];
	...
};

The C function that implements a system call with 6 32-bit parameters
expects 5 parameters in registers, the 6th is located on the stack. The
args element of pt_regs "happens" to be at the same offset where the C
function is looking for the first overflow argument (= the 6th
parameter).

Now consider a system call with an overflowing 64 bit parameter. The
glue code in glibc could be hacked in a way that the 64 bit value is
split into %r6 and %r7. But the system call function is just a C
function. It follows the ELF ABI and expects the 64 bit argument on the
stack. It would take two 32 bit overflow registers in pt_regs to make
one 64 bit parameter. With the current code that won't work. We would
need a wrapper function in the kernel to untangle this parameter mess.

The avoid all this all 64 bit parameter have to be placed at positions
where no register is skipped because of the even/odd rule and where it
is not affected by the %r7 trick (= may not be the last parameter).
Easy, no?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 17:34               ` Martin Schwidefsky
  2009-11-18 17:34                 ` Martin Schwidefsky
@ 2009-11-18 18:41                 ` Heiko Carstens
  2009-11-18 18:41                   ` Heiko Carstens
  2009-11-19  8:54                   ` Martin Schwidefsky
  1 sibling, 2 replies; 15+ messages in thread
From: Heiko Carstens @ 2009-11-18 18:41 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Eric Paris, Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, linux-arch

On Wed, Nov 18, 2009 at 06:34:00PM +0100, Martin Schwidefsky wrote:
> used for parameter passing. 64 bit values are passed as registers pairs
> with the first register an even numbered register. The effect of that
> rule is that parameter registers may be skipped or that the whole 64 bit
> value is passed on the stack. Examples:

Erm, did I miss something? The 32 bit ABI doesn't say anything about
even/odd register pairs for 64 bit values. Also gcc doesn't generate
such code:

> fn(int a, long long b, int c)
> a is passed in %r2, b is passed in %r4/%r5, c is passed in %r6, %r3 is
> skipped.

extern void fn(int a, long long b, int c);

void bla(void)
{
	fn(0, 1, 2);
}

00000000 <bla>:
   0:   90 ef f0 38             stm     %r14,%r15,56(%r15)
   4:   a7 fa ff a0             ahi     %r15,-96
   8:   a7 28 00 00             lhi     %r2,0
   c:   a7 38 00 00             lhi     %r3,0
  10:   a7 48 00 01             lhi     %r4,1
  14:   a7 58 00 02             lhi     %r5,2
  18:   c0 e5 00 00 00 00       brasl   %r14,18 <bla+0x18>
  1e:   98 ef f0 98             lm      %r14,%r15,152(%r15)
  22:   07 fe                   br      %r14

Not that it really matters, since other archs enforce the rule anyway.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 18:41                 ` Heiko Carstens
@ 2009-11-18 18:41                   ` Heiko Carstens
  2009-11-19  8:54                   ` Martin Schwidefsky
  1 sibling, 0 replies; 15+ messages in thread
From: Heiko Carstens @ 2009-11-18 18:41 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Eric Paris, Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, linux-arch

On Wed, Nov 18, 2009 at 06:34:00PM +0100, Martin Schwidefsky wrote:
> used for parameter passing. 64 bit values are passed as registers pairs
> with the first register an even numbered register. The effect of that
> rule is that parameter registers may be skipped or that the whole 64 bit
> value is passed on the stack. Examples:

Erm, did I miss something? The 32 bit ABI doesn't say anything about
even/odd register pairs for 64 bit values. Also gcc doesn't generate
such code:

> fn(int a, long long b, int c)
> a is passed in %r2, b is passed in %r4/%r5, c is passed in %r6, %r3 is
> skipped.

extern void fn(int a, long long b, int c);

void bla(void)
{
	fn(0, 1, 2);
}

00000000 <bla>:
   0:   90 ef f0 38             stm     %r14,%r15,56(%r15)
   4:   a7 fa ff a0             ahi     %r15,-96
   8:   a7 28 00 00             lhi     %r2,0
   c:   a7 38 00 00             lhi     %r3,0
  10:   a7 48 00 01             lhi     %r4,1
  14:   a7 58 00 02             lhi     %r5,2
  18:   c0 e5 00 00 00 00       brasl   %r14,18 <bla+0x18>
  1e:   98 ef f0 98             lm      %r14,%r15,152(%r15)
  22:   07 fe                   br      %r14

Not that it really matters, since other archs enforce the rule anyway.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S)
  2009-11-18 18:41                 ` Heiko Carstens
  2009-11-18 18:41                   ` Heiko Carstens
@ 2009-11-19  8:54                   ` Martin Schwidefsky
  1 sibling, 0 replies; 15+ messages in thread
From: Martin Schwidefsky @ 2009-11-19  8:54 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Eric Paris, Sachin Sant, linux-s390, Stephen Rothwell, linux-next,
	Andrew Morton, linux-arch

On Wed, 18 Nov 2009 19:41:56 +0100
Heiko Carstens <heiko.carstens@de.ibm.com> wrote:

> On Wed, Nov 18, 2009 at 06:34:00PM +0100, Martin Schwidefsky wrote:
> > used for parameter passing. 64 bit values are passed as registers pairs
> > with the first register an even numbered register. The effect of that
> > rule is that parameter registers may be skipped or that the whole 64 bit
> > value is passed on the stack. Examples:
> 
> Erm, did I miss something? The 32 bit ABI doesn't say anything about
> even/odd register pairs for 64 bit values. Also gcc doesn't generate
> such code:

Odd, I can remember that there has been a rule like that. It is indeed
not in the ABI, seems like we dropped it/never added it. There are quite
a few 32 bit instruction that operate on 64 bit values in even/odd
registers which makes it beneficial to pass 64 bit values in a way that
they already are located in even/odd and not in odd/even registers. 

> > fn(int a, long long b, int c)
> > a is passed in %r2, b is passed in %r4/%r5, c is passed in %r6, %r3 is
> > skipped.
> 
> extern void fn(int a, long long b, int c);
> 
> void bla(void)
> {
> 	fn(0, 1, 2);
> }
> 
> 00000000 <bla>:
>    0:   90 ef f0 38             stm     %r14,%r15,56(%r15)
>    4:   a7 fa ff a0             ahi     %r15,-96
>    8:   a7 28 00 00             lhi     %r2,0
>    c:   a7 38 00 00             lhi     %r3,0
>   10:   a7 48 00 01             lhi     %r4,1
>   14:   a7 58 00 02             lhi     %r5,2
>   18:   c0 e5 00 00 00 00       brasl   %r14,18 <bla+0x18>
>   1e:   98 ef f0 98             lm      %r14,%r15,152(%r15)
>   22:   07 fe                   br      %r14
> 
> Not that it really matters, since other archs enforce the rule anyway.

Perhaps that is what confused me. Anyway, the thing to remember here is
that passing a long long as the last parameter which would end up in
%r6/%r7 requires a system call wrapper for the standard 32 bit system
call for a 32 bit kernel. So please don't do that.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2009-11-19  8:54 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20091117195309.6cc3ead0.sfr@canb.auug.org.au>
     [not found] ` <4B0291BB.3090005@in.ibm.com>
     [not found]   ` <20091117125201.GB5124@osiris.boeblingen.de.ibm.com>
     [not found]     ` <1258465241.2876.29.camel@dhcp231-106.rdu.redhat.com>
     [not found]       ` <20091117135525.GF5124@osiris.boeblingen.de.ibm.com>
     [not found]         ` <1258471436.2876.34.camel@dhcp231-106.rdu.redhat.com>
2009-11-18  7:04           ` [-next Nov 17] s390 build break(arch/s390/kernel/compat_wrapper.S) Heiko Carstens
2009-11-18  7:04             ` Heiko Carstens
2009-11-18  9:27             ` Russell King
2009-11-18 14:49             ` Ralf Baechle
2009-11-18 14:49               ` Ralf Baechle
2009-11-18 16:02             ` Eric Paris
2009-11-18 16:02               ` Eric Paris
2009-11-18 16:22               ` Heiko Carstens
2009-11-18 16:22                 ` Heiko Carstens
2009-11-18 17:34               ` Martin Schwidefsky
2009-11-18 17:34                 ` Martin Schwidefsky
2009-11-18 18:41                 ` Heiko Carstens
2009-11-18 18:41                   ` Heiko Carstens
2009-11-19  8:54                   ` Martin Schwidefsky
2009-11-18 17:24             ` Eric Paris

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox