From: Thomas Gleixner <tglx@linutronix.de>
To: "H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC v2 PATCH 7/7] x86/entry: use int for syscall number; handle all invalid syscall nrs
Date: Wed, 12 May 2021 14:09:03 +0200
Message-ID: <871racf928.ffs@nanos.tec.linutronix.de>
In-Reply-To: <20210510185316.3307264-8-hpa@zytor.com>

On Mon, May 10 2021 at 11:53, H. Peter Anvin wrote:
> Redefine the system call number consistently to be "int". The value -1
> is a non-system call (which can be poked in by ptrace/seccomp to
> indicate that no further processing should be done and that the return
> value should be the current value in regs->ax, default to -ENOSYS; any
> other value which does not correspond to a valid system call
> unconditionally calls sys_ni_syscall() and returns -ENOSYS just like
> any system call that corresponds to a hole in the system call table.

That sentence spans 6 lines, has an unmatched ( inside and is confusing
at best. I know what you want to say, but heck...

> This is the defined semantics of syscall_get_nr(), so that is what all
> the architecture-independent code already expects.  As documented in
> <asm-generic/syscall.h> (which is simply the documentation file for
> <asm/syscall.h>):
>
> /**
>  * syscall_get_nr - find what system call a task is executing
>  * @task:       task of interest, must be blocked
>  * @regs:       task_pt_regs() of @task
>  *
>  * If @task is executing a system call or is at system call
>  * tracing about to attempt one, returns the system call number.
>  * If @task is not executing a system call, i.e. it's blocked
>  * inside the kernel for a fault or signal, returns -1.
>  *
>  * Note this returns int even on 64-bit machines.  Only 32 bits of
>  * system call number can be meaningful.  If the actual arch value
>  * is 64 bits, this truncates to 32 bits so 0xffffffff means -1.
>  *
>  * It's only valid to call this when @task is known to be blocked.
>  */
> int syscall_get_nr(struct task_struct *task, struct pt_regs *regs);

No need to copy this comment. Something like this is sufficient:

The syscall number has to be an 'int' as defined by syscall_get_nr().

Aside from that, the subject says:

      x86/entry: use int for syscall number; handle all invalid syscall nrs

which suggests that something is not handled correctly today. But the
changelog does not say anything about it.

>  
>  #ifdef CONFIG_X86_64
> -__visible noinstr void do_syscall_64(struct pt_regs *regs, unsigned long nr)
> +
> +static __always_inline bool do_syscall_x64(struct pt_regs *regs, int nr)
> +{
> +	unsigned long unr = nr;

What's the point of this cast? To turn -1 into something larger than
NR_syscalls, right? Comments exist for a reason.

Also, why unsigned long? unsigned int is sufficient.

> +	if (likely(unr < NR_syscalls)) {
> +		unr = array_index_nospec(unr, NR_syscalls);
> +		regs->ax = sys_call_table[unr](regs);
> +		return true;
> +	}
> +	return false;
> +}

Something like this:

static __always_inline bool do_syscall_x64(struct pt_regs *regs, unsigned int nr)
{
	/* nr is unsigned so it catches negative values such as -1 */
	if (likely(nr < NR_syscalls)) {
		nr = array_index_nospec(nr, NR_syscalls);
		regs->ax = sys_call_table[nr](regs);
		return true;
	}
	return false;
}

static __always_inline bool do_syscall_x32(struct pt_regs *regs, unsigned int nr)
{
	/*
	 * If nr < __X32_SYSCALL_BIT then the result will be > __X32_SYSCALL_BIT
	 * due to unsigned math.
	 */
	nr -= __X32_SYSCALL_BIT;

	if (IS_ENABLED(CONFIG_X86_X32_ABI) && likely(nr < X32_NR_syscalls)) {
		nr = array_index_nospec(nr, X32_NR_syscalls);
		regs->ax = x32_sys_call_table[nr](regs);
		return true;
	}
	return false;
}
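
The unsigned compare in both helpers can be tried out in user space. The
following is only a sketch: the demo_* names and the table sizes are made
up for illustration and are not the kernel's values; only the value of the
x32 bit (0x40000000) matches the kernel's __X32_SYSCALL_BIT.

```c
#include <stdbool.h>

/* Hypothetical table sizes, chosen only for this illustration. */
#define DEMO_NR_SYSCALLS	450u
#define DEMO_X32_NR_SYSCALLS	550u
/* Same value as the kernel's __X32_SYSCALL_BIT. */
#define DEMO_X32_SYSCALL_BIT	0x40000000u

/* One unsigned compare rejects both too-large and negative numbers. */
bool demo_valid_x64(int nr)
{
	unsigned int unr = nr;	/* -1 becomes UINT_MAX */

	return unr < DEMO_NR_SYSCALLS;
}

/*
 * Subtracting the x32 bit wraps around for any number below it, so
 * those numbers end up far above the table size and are rejected too.
 */
bool demo_valid_x32(int nr)
{
	unsigned int unr = nr;

	unr -= DEMO_X32_SYSCALL_BIT;
	return unr < DEMO_X32_NR_SYSCALLS;
}
```

With these definitions demo_valid_x64(-1) and demo_valid_x32(0) are both
false, while demo_valid_x64(0) and demo_valid_x32(0x40000000) are true, so
no separate "nr < 0" test is needed.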

> index 1d9db15fdc69..85f04ea0e368 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -108,7 +108,7 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
>  
>  	/* IRQs are off. */
>  	movq	%rsp, %rdi
> -	movq	%rax, %rsi
> +	movslq	%eax, %rsi

This is wrong.

  syscall(long number,...);

So the above turns syscall(UINT_MAX + 1UL + N, ...) into syscall(N, ...),
because only the low 32 bits of the number survive the sign extension.
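
The aliasing can be reproduced in user space with a rough model of the two
instructions. This is a sketch only: demo_movq() and demo_movslq() are
hypothetical helpers, not kernel code, and the narrowing conversion to
int32_t relies on the two's complement behaviour that GCC and Clang
provide.

```c
#include <stdint.h>

/* Models "movq %rax, %rsi": the full 64-bit register value is passed on. */
uint64_t demo_movq(uint64_t rax)
{
	return rax;
}

/*
 * Models "movslq %eax, %rsi": keep only the low 32 bits of the register,
 * then sign-extend them to 64 bits. Bits 63:32 of the original value are
 * lost.
 */
int64_t demo_movslq(uint64_t rax)
{
	return (int64_t)(int32_t)(uint32_t)rax;
}
```

A number that is 2^32 above a valid syscall number N stays out of range
with demo_movq(), but demo_movslq() maps it to N, so a call which must fail
with -ENOSYS would be dispatched instead.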

Thanks,

        tglx



