From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Gabriel Krisman Bertazi <krisman@collabora.com>, alx.manpages@gmail.com
Cc: mtk.manpages@gmail.com, linux-man@vger.kernel.org, kernel@collabora.com
Subject: Re: [PATCH v6] prctl.2: Document Syscall User Dispatch
Date: Wed, 30 Dec 2020 11:24:04 +0100
Message-ID: <5da9a8bc-e034-1ab4-3f87-328108c1b27d@gmail.com>
In-Reply-To: <20201228173832.347794-1-krisman@collabora.com>
Hello Gabriel
This is looking much better. Thank you! I have a few more
comments still.
On 12/28/20 6:38 PM, Gabriel Krisman Bertazi wrote:
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>
> ---
> Changes since v5:
> (suggested by Michael Kerrisk)
> - Change () punctuation
> - fix grammar
> - Add information about interception, return and return value
>
> Changes since v4:
> (suggested by Michael Kerrisk)
> - Modify explanation of what dispatch to user space means.
> - Drop references to emulation.
> - Document suggestion about placing libc in allowed-region.
> - Comment about avoiding syscall cost.
> Changes since v3:
> (suggested by Michael Kerrisk)
> - Explain what dispatch to user space means.
> - Document the fact that the memory region is a single consecutive
> range.
> - Explain failure if *arg5 is set to a bad value.
> - fix english typo.
> - Define what 'invalid memory region' means.
>
> Changes since v2:
> (suggested by Alejandro Colomar)
> - selective -> selectively
> - Add missing oxford comma.
>
> Changes since v1:
> (suggested by Alejandro Colomar)
> - Use semantic lines
> - Fix usage of .{B|I}R and .{B|I}
> - Don't format literals
> - Fix preferred spelling of userspace
> - Fix case of word
> ---
> man2/prctl.2 | 159 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 159 insertions(+)
>
> diff --git a/man2/prctl.2 b/man2/prctl.2
> index f25f05fdb593..0a0abfb78055 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -1533,6 +1533,135 @@ For more information, see the kernel source file
> (or
> .I Documentation/arm64/sve.txt
> before Linux 5.3).
> +.TP
> +.\" prctl PR_SET_SYSCALL_USER_DISPATCH
> +.\" commit 1446e1df9eb183fdf81c3f0715402f1d7595d4
> +.BR PR_SET_SYSCALL_USER_DISPATCH " (since Linux 5.11, x86 only)"
> +.IP
> +Configure the Syscall User Dispatch mechanism
> +for the calling thread.
> +This mechanism allows an application
> +to selectively intercept system calls
> +so that they can be handled within the application itself.
> +Interception takes the form of a thread-directed
> +.B SIGSYS
> +signal that is delivered to the thread
> +when it makes a system call.
> +If intercepted,
> +the system call is not executed by the kernel.
> +.IP
> +The current Syscall User Dispatch mode is selected via
> +.IR arg2 ,
> +which can either be set to
> +.B PR_SYS_DISPATCH_ON
> +to enable the feature,
> +or to
> +.B PR_SYS_DISPATCH_OFF
> +to turn it off.
So, I realize now that I'm slightly confused.
The value of arg2 can be either PR_SYS_DISPATCH_ON or
PR_SYS_DISPATCH_OFF. The value of the selector pointed to by
arg5 can likewise be PR_SYS_DISPATCH_ON or PR_SYS_DISPATCH_OFF.
What is the relationship between these two attributes? For example,
what does it mean if arg2 is PR_SYS_DISPATCH_ON and, at the time of
the prctl() call, the selector has the value PR_SYS_DISPATCH_OFF?
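
To make that scenario concrete, here is a minimal sketch of the call
I am asking about. It is not authoritative: the helper and variable
names are made up for illustration, the zero offset/length is just a
placeholder, and the fallback constant values are my assumption of
what the kernel's uapi header defines. It does not answer the
question; it only pins down the combination "arg2 is
PR_SYS_DISPATCH_ON, selector is PR_SYS_DISPATCH_OFF".

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from the kernel uapi header. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#endif
#ifndef PR_SYS_DISPATCH_ON
#define PR_SYS_DISPATCH_OFF 0
#define PR_SYS_DISPATCH_ON  1
#endif

/* Hypothetical selector byte; its address is what arg5 points to. */
volatile char sud_demo_selector = PR_SYS_DISPATCH_OFF;

/*
 * arg2 = PR_SYS_DISPATCH_ON while the selector holds PR_SYS_DISPATCH_OFF.
 * arg3/arg4 (offset/length of the always-allowed region) are left at 0
 * here purely as placeholders.
 * Returns 0 on success, or the errno value if prctl() fails
 * (for example, on a kernel or architecture without the feature).
 */
int sud_enable_with_selector_off(void)
{
    sud_demo_selector = PR_SYS_DISPATCH_OFF;
    if (prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON,
              0UL, 0UL, &sud_demo_selector) == -1)
        return errno;
    return 0;
}

/* Turn the mechanism off again; the remaining arguments are all zero. */
int sud_disable(void)
{
    if (prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_OFF,
              0UL, 0UL, 0UL) == -1)
        return errno;
    return 0;
}
```
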
> +.IP
> +When
> +.I arg2
> +is set to
> +.BR PR_SYS_DISPATCH_ON ,
> +.I arg3
> +and
> +.I arg4
> +respectively identify the
> +.I offset
> +and
> +.I length
> +of a single contiguous memory region in the process map
Better: s/map/address space/ ?
> +from where system calls are always allowed to be executed,
> +regardless of the switch variable
s/variable/variable./
> +(Typically, this area would include the area of memory
> +containing the C library.)
I think just to ease readability (smaller paragraphs), insert
.IP
here.
> +.I arg5
> +points to a char-sized variable
> +that is a fast switch to enable/disable the mechanism
> +without the overhead of doing a system call.
> +The variable pointed by
> +.I arg5
> +can either be set to
> +.B PR_SYS_DISPATCH_ON
> +to enable the mechanism
> +or to
> +.B PR_SYS_DISPATCH_OFF
> +to temporarily disable it.
> +This value is checked by the kernel
> +on every system call entry,
> +and any unexpected value will raise
> +an uncatchable
> +.B SIGSYS
> +at that time,
> +killing the application.
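
As I read the paragraph above, the point of the selector is that
flipping it is an ordinary store to memory, with no system call
involved. A rough sketch of that idea follows; the names are
hypothetical, the selector values are the ones given in this draft,
and the fallback numeric values are my assumption.

```c
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from the kernel uapi header. */
#ifndef PR_SYS_DISPATCH_ON
#define PR_SYS_DISPATCH_OFF 0
#define PR_SYS_DISPATCH_ON  1
#endif

/*
 * Hypothetical per-thread selector. Its address would be passed as arg5
 * when the mechanism is configured for the thread.
 */
static __thread volatile char sud_selector = PR_SYS_DISPATCH_OFF;

/* Enable interception: a plain store, no system call. */
void sud_block(void) { sud_selector = PR_SYS_DISPATCH_ON; }

/* Temporarily disable interception: again just a store. */
void sud_allow(void) { sud_selector = PR_SYS_DISPATCH_OFF; }

/* Read back the current selector value. */
char sud_current(void) { return sud_selector; }

/*
 * Per the text above, any other value in the selector is unexpected.
 * This helper only mirrors that rule in user space.
 */
int sud_value_is_expected(char v)
{
    return v == PR_SYS_DISPATCH_ON || v == PR_SYS_DISPATCH_OFF;
}
```
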
> +.IP
> +When a system call is intercepted,
> +the kernel sends a thread-directed
> +.B SIGSYS
> +signal to the triggering thread.
> +Various fields will be set in the
> +.I siginfo_t
> +structure (see
> +.BR sigaction (2))
> +associated with the signal:
> +.RS
> +.IP * 3
> +.I si_signo
> +will contain
> +.BR SIGSYS .
> +.IP *
> +.IR si_call_addr
> +will show the address of the system call instruction.
> +.IP *
> +.IR si_syscall
> +and
> +.IR si_arch
> +will indicate which system call was attempted.
> +.IP *
> +.I si_code
> +will contain
> +.BR SYS_USER_DISPATCH .
> +.IP *
> +.I si_errno
> +will be set to 0.
> +.RE
> +.IP
> +The program counter will be as though the system call happened
> +(i.e., the program counter will not point to the system call instruction).
> +.IP
> +When the signal handler returns to the kernel,
> +the system call completes immediately
> +and returns to the calling thread,
> +without actually being executed.
> +If necessary
> +(i.e., when emulating the system call on user space.),
> +the signal handler should set the system call return value
> +to a sane value,
> +by modifying the register context stored in the
> +.I ucontext
> +argument of the signal handler.
Just for my own education, do you have any example code somewhere
that demonstrates setting the syscall return value?
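
To show what I have in mind, here is my own rough guess at the shape
of such code on x86-64, where I would expect the return value to live
in RAX. Please treat it as a sketch under that assumption, not as a
reference: the function names are hypothetical, the fallback value for
SYS_USER_DISPATCH is assumed, and the choice of -ENOSYS as the
emulated result is arbitrary.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for REG_RAX in <sys/ucontext.h> */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>

/* Fallback for older headers; value assumed from the kernel uapi header. */
#ifndef SYS_USER_DISPATCH
#define SYS_USER_DISPATCH 2
#endif

/*
 * Store 'value' where the interrupted thread will see the system call
 * return value. Only x86-64 is handled; returns -1 elsewhere.
 */
int sud_set_syscall_return(ucontext_t *uc, long value)
{
#if defined(__x86_64__) && defined(REG_RAX)
    uc->uc_mcontext.gregs[REG_RAX] = value;
    return 0;
#else
    (void) uc; (void) value;
    return -1;
#endif
}

/* Read the value back (used here only for checking the helper). */
long sud_get_syscall_return(const ucontext_t *uc)
{
#if defined(__x86_64__) && defined(REG_RAX)
    return (long) uc->uc_mcontext.gregs[REG_RAX];
#else
    (void) uc;
    return 0;
#endif
}

/* SIGSYS handler: make the intercepted call appear to fail with ENOSYS. */
void sud_sigsys_handler(int sig, siginfo_t *info, void *ctx)
{
    (void) sig;
    if (info->si_code != SYS_USER_DISPATCH)
        return;
    /* Raw system calls report errors as a negative errno value. */
    sud_set_syscall_return((ucontext_t *) ctx, -ENOSYS);
}

/* Install the handler; SA_SIGINFO is needed to receive the ucontext. */
int sud_install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sud_sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}
```

If that is roughly right, a pointer to real code would still be
welcome.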
Thanks,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/