* [RFC PATCH 0/8] signals: Support more than 64 signals
@ 2022-01-03 18:19 Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 6/8] signals: Round up _NSIG_WORDS Walt Drummond
` (3 more replies)
0 siblings, 4 replies; 21+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
To: aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever,
bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert,
gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris,
bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook,
mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini,
peterz, rth, richard, serge, rostedt, tglx, trond.myklebust,
vincent.guittot, x86
Cc: linux-kernel, Walt Drummond, ceph-devel, kvm, linux-alpha,
linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs,
linux-scsi, linux-security-module
This patch set expands the number of signals in Linux beyond the
current cap of 64. It sets a new cap at the somewhat arbitrary limit
of 1024 signals, both because it’s what glibc and musl support and
because many architectures pad sigset_t or ucontext_t in the kernel to
this cap. This limit is not fixed and can be further expanded within
reason.
Despite best efforts, there is some non-zero potential that this could
break user space; I'd appreciate any comments, review and/or pointers
to areas of concern.
Basically, these changes entail:
- Make all system calls that accept sigset_t honor the existing
sigsetsize parameter for values between 8 and 128, and return
sigsetsize bytes to user space.
- Add AT_SIGSET_SZ to the aux vector to signal to user space the
maximum size sigset_t the kernel can accept.
- Remove the sigmask() macro except in compatibility cases, and change
sigaddset()/sigdelset()/etc. to accept a comma-separated list
of signal numbers.
- Change the _NSIG_WORDS calculation to round up when needed on
generic and x86.
- Place the complete sigmask in the real time signal frame (x86_64,
x32 and ia32).
- Various fixes where sigset_t size is assumed.
- Add BSD SIGINFO (and VSTATUS) as a test.
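For illustration only (this sketch is not part of the series): the
_NSIG_WORDS rounding matters as soon as _NSIG stops being a multiple of
the word size. The helper names below are hypothetical; the patches
change the macro itself rather than adding functions.

```c
#include <assert.h>

/* Hypothetical helpers mirroring the old and new _NSIG_WORDS formulas. */

/* Old formula: truncating division drops a partial trailing word. */
static unsigned int nsig_words_truncated(unsigned int nsig, unsigned int bpw)
{
	assert(bpw != 0);
	return nsig / bpw;
}

/* New formula: round up so that every signal number has a bit. */
static unsigned int nsig_words_rounded(unsigned int nsig, unsigned int bpw)
{
	assert(bpw != 0);
	return (nsig + bpw - 1) / bpw;
}
```

With _NSIG = 65 and 64-bit words, the old formula yields one word, which
leaves no bit for signal 65, while the rounded formula yields two.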
The changes that have the most risk of breaking user space are the
ones that put more than 8 bytes of sigset_t in the real time signal
stack frame (Patches 2 & 6), and I should note that an earlier and
incomplete version of patch 2 was NAK’ed by Al in
https://lore.kernel.org/lkml/20201119221132.1515696-1-walt@drummond.us/.
As far as I have been able to determine, this patchset, and
specifically changing the size of sigset_t, does not break user space.
The two uses of sigset_t that pose the most user space risk are 1) as
a member of ucontext_t passed as a parameter to the signal handler and
2) when user space performs manual inspection of the real-time signal
stack frame.
In case (1), user space has definitions of both sigset_t and ucontext_t
that are independent of, and may differ from, the kernel (e.g., sigset_t
in uclibc-ng is 16 bytes, musl is 128 bytes, glibc is 128 bytes on all
architectures except ARC, etc.). User space will interpret the data
on the signal stack through these definitions, and extensions to
sigset_t will be opaque. Other non-C runtimes are similarly
independent from kernel sigset_t and ucontext_t and derive their
definition of sigset_t from libc either directly or indirectly, and do
not manually inspect the signal stack (specifically OpenJDK, Golang,
Python3, Rust and Perl).
The only instances I found of case (2), manually inspecting the signal
stack frame, are in stack unwinders/backtracers (GDB, GCC, libunwind)
and in GDB when recording program execution, and only on the i386,
x86_64, s390 and powerpc architectures. The GDB, GCC and libunwind
unwinders behave consistently with and without this patchset.
GDB's execution recording is somewhat more complicated. It uses
internally defined architecture specific constants to represent the
total size of the signal frame, and will save that entire frame for
later use. I cannot confirm that the values for powerpc and s390 are
correct, but for this purpose it doesn't matter as these architectures
explicitly pad for an expanded uc_sigmask. I can, however, confirm
that the values for i386 and x86_64 are not correct, and that GDB is
recording an incorrect amount of stack data. This doesn’t appear to
be an issue; while I cannot build a test case on x86_64 due to a known
bug[1], a basic test on i386 shows that the stack is correctly being
recorded, and forward and reverse replay seem to work just fine
across signal handlers.
There are other cases to consider if the number of signals and
therefore the size of sigset_t changes:
Impact on struct rt_sigframe member elements
The placement of ucontext_t in struct rt_sigframe has the potential
to move following member elements in ways that could break user
space if user space relied on the offsets of these elements.
However, a review shows that any elements in rt_sigframe after
ucontext_t.uc_sigmask are either (1) unused or only used by the
kernel or (2) fall into the x86_64/i386 floating point state case
above.
Kernel has new signals, user space does not
Any new bits in ucontext.uc_sigmask placed on the signal stack are
opaque to user space (except in cases where user space already has a
larger sigset_t, as in glibc).
There are no changes to the real-time signals system call semantics,
as the kernel will honor the hard-coded sigsetsize value of 8 in
libc and behave as it has before these changes.
Signal numbers larger than 64 cannot be blocked or caught until user
space is updated; however, their default action will work as
expected. This can cause one problem: a parent process that uses
the signal number a child exited with as an index into an array
without bounds checking can crash. I’ve seen exactly one instance
of this, in tcsh, and it is, I think, a bug in tcsh.
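As a hedged sketch of the defensive pattern (not code from tcsh or from
this series), a parent can range-check the termination signal before
using it as an array index. The function name and the caller-supplied
table are hypothetical.

```c
#include <stddef.h>

/*
 * Return the name for termination signal 'sig', or NULL when 'sig'
 * falls outside the table (for example, a signal number above 64
 * reported by a newer kernel). 'names' and 'count' are hypothetical
 * caller-supplied values; index 0 is unused because signal numbers
 * start at 1.
 */
static const char *signal_name_checked(int sig, const char *const *names,
				       size_t count)
{
	if (sig <= 0 || (size_t)sig >= count)
		return NULL;
	return names[sig];
}
```

A caller would typically pass WTERMSIG(status) as 'sig' and print a
generic message when NULL comes back.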
User space has new signals, kernel does not
An attempt by user space to use a signal number not supported by the
kernel in system calls (e.g., sigaction()) or other libc functions (e.g.,
sigaddset()) will result in EINVAL, as expected.
User space needs to know how to set the sigsetsize parameter to the
real time signal system calls and it can use getauxval(AT_SIGSET_SZ)
to determine this. If it returns zero the sigsetsize must be 8,
otherwise the kernel will accept sigsetsize between 8 and the return
value.
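The selection rule above could be sketched as follows. This is an
assumption-laden illustration, not libc code: AT_SIGSET_SZ exists only in
this proposed series, so the sketch takes the result of
getauxval(AT_SIGSET_SZ) as a plain parameter, and the function and macro
names are hypothetical.

```c
#include <stddef.h>

#define LEGACY_SIGSETSIZE 8	/* the size the kernel has always accepted */

/*
 * auxv_max:  value returned by getauxval(AT_SIGSET_SZ), or 0 when the
 *            kernel does not provide the (proposed) aux vector entry.
 * libc_size: sizeof(sigset_t) as user space defines it.
 *
 * Returns the sigsetsize to pass to the real-time signal system calls,
 * or 0 if libc_size is too small to be valid at all.
 */
static size_t choose_sigsetsize(unsigned long auxv_max, size_t libc_size)
{
	if (libc_size < LEGACY_SIGSETSIZE)
		return 0;
	if (auxv_max < LEGACY_SIGSETSIZE)
		return LEGACY_SIGSETSIZE;	/* old kernel: must be exactly 8 */
	return libc_size < auxv_max ? libc_size : (size_t)auxv_max;
}
```

Taking the smaller of the two sizes keeps the kernel from reading or
writing past the end of the user-space sigset_t.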
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=23188
Walt Drummond (8):
signals: Make the real-time signal system calls accept different sized
sigset_t from user space.
signals: Put the full signal mask on the signal stack for x86_64, X32
and ia32 compatibility mode
signals: Use a helper function to test if a signal is a real-time
signal.
signals: Remove sigmask() macro
signals: Better support cases where _NSIG_WORDS is greater than 2
signals: Round up _NSIG_WORDS
signals: Add signal debugging
signals: Support BSD VSTATUS, KERNINFO and SIGINFO
arch/alpha/kernel/signal.c | 4 +-
arch/m68k/include/asm/signal.h | 6 +-
arch/nios2/kernel/signal.c | 2 -
arch/x86/ia32/ia32_signal.c | 5 +-
arch/x86/include/asm/sighandling.h | 34 +++
arch/x86/include/asm/signal.h | 10 +-
arch/x86/include/uapi/asm/signal.h | 4 +-
arch/x86/kernel/signal.c | 11 +-
drivers/scsi/dpti.h | 2 -
drivers/tty/Makefile | 2 +-
drivers/tty/n_tty.c | 21 ++
drivers/tty/tty_io.c | 10 +-
drivers/tty/tty_ioctl.c | 4 +
drivers/tty/tty_status.c | 135 ++++++++++
fs/binfmt_elf.c | 1 +
fs/binfmt_elf_fdpic.c | 1 +
fs/ceph/addr.c | 2 +-
fs/jffs2/background.c | 2 +-
fs/lockd/svc.c | 1 -
fs/proc/array.c | 32 +--
fs/proc/base.c | 48 ++++
fs/signalfd.c | 26 +-
include/asm-generic/termios.h | 4 +-
include/linux/compat.h | 98 ++++++-
include/linux/sched.h | 52 +++-
include/linux/signal.h | 389 ++++++++++++++++++++--------
include/linux/tty.h | 8 +
include/uapi/asm-generic/ioctls.h | 2 +
include/uapi/asm-generic/signal.h | 8 +-
include/uapi/asm-generic/termbits.h | 34 +--
include/uapi/linux/auxvec.h | 1 +
kernel/compat.c | 30 +--
kernel/fork.c | 2 +-
kernel/ptrace.c | 18 +-
kernel/signal.c | 288 ++++++++++----------
kernel/sysctl.c | 41 +++
kernel/time/posix-timers.c | 3 +-
lib/Kconfig.debug | 10 +
security/apparmor/ipc.c | 4 +-
virt/kvm/kvm_main.c | 18 +-
40 files changed, 974 insertions(+), 399 deletions(-)
create mode 100644 drivers/tty/tty_status.c
--
2.30.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [RFC PATCH 6/8] signals: Round up _NSIG_WORDS
2022-01-03 18:19 [RFC PATCH 0/8] signals: Support more than 64 signals Walt Drummond
@ 2022-01-03 18:19 ` Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
` (2 subsequent siblings)
3 siblings, 0 replies; 21+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Arnd Bergmann
Cc: linux-kernel, Walt Drummond, linux-arch

When needed, round _NSIG_WORDS up for generic and x86 architectures.

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 arch/x86/include/asm/signal.h     | 2 +-
 include/uapi/asm-generic/signal.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 9bac7c6e524c..d8e2efe6cd46 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -16,7 +16,7 @@
 # define _NSIG_BPW	64
 #endif
-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
+#define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
 typedef unsigned long old_sigset_t;		/* at least 32 bits */
diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
index f634822906e4..3c4cc9b8378e 100644
--- a/include/uapi/asm-generic/signal.h
+++ b/include/uapi/asm-generic/signal.h
@@ -6,7 +6,7 @@
 #define _NSIG		64
 #define _NSIG_BPW	__BITS_PER_LONG
-#define _NSIG_WORDS	(_NSIG / _NSIG_BPW)
+#define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
 #define SIGHUP		1
 #define SIGINT		2
--
2.30.2

^ permalink raw reply related [flat|nested] 21+ messages in thread
* [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
2022-01-03 18:19 [RFC PATCH 0/8] signals: Support more than 64 signals Walt Drummond
2022-01-03 18:19 ` [RFC PATCH 6/8] signals: Round up _NSIG_WORDS Walt Drummond
@ 2022-01-03 18:19 ` Walt Drummond
2022-01-04 7:27 ` Greg Kroah-Hartman
` (2 more replies)
2022-01-03 18:48 ` [RFC PATCH 0/8] signals: Support more than 64 signals Al Viro
2022-01-04 18:00 ` Eric W. Biederman
3 siblings, 3 replies; 21+ messages in thread
From: Walt Drummond @ 2022-01-03 18:19 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira
Cc: linux-kernel, Walt Drummond, linux-fsdevel, linux-arch

Support TTY VSTATUS character, NOKERNINFO local control bit and the
signal SIGINFO, all as in 4.3BSD.

Signed-off-by: Walt Drummond <walt@drummond.us>
---
 arch/x86/include/asm/signal.h       |   2 +-
 arch/x86/include/uapi/asm/signal.h  |   4 +-
 drivers/tty/Makefile                |   2 +-
 drivers/tty/n_tty.c                 |  21 +++++
 drivers/tty/tty_io.c                |  10 ++-
 drivers/tty/tty_ioctl.c             |   4 +
 drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
 fs/proc/array.c                     |  29 +-----
 include/asm-generic/termios.h       |   4 +-
 include/linux/sched.h               |  52 ++++++++++-
 include/linux/signal.h              |   4 +
 include/linux/tty.h                 |   8 ++
 include/uapi/asm-generic/ioctls.h   |   2 +
 include/uapi/asm-generic/signal.h   |   6 +-
 include/uapi/asm-generic/termbits.h |  34 +++----
 15 files changed, 264 insertions(+), 53 deletions(-)
 create mode 100644 drivers/tty/tty_status.c

diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index d8e2efe6cd46..0a01877c11ab 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -8,7 +8,7 @@
 /* Most things should be clean enough to redefine this at will, if care
    is taken to make libc match. */

-#define _NSIG		64
+#define _NSIG		65

 #ifdef __i386__
 # define _NSIG_BPW	32
diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
index 164a22a72984..60dca62d3dcf 100644
--- a/arch/x86/include/uapi/asm/signal.h
+++ b/arch/x86/include/uapi/asm/signal.h
@@ -60,7 +60,9 @@ typedef unsigned long sigset_t;

 /* These should not be considered constants from userland. */
 #define SIGRTMIN	32
-#define SIGRTMAX	_NSIG
+#define SIGRTMAX	64
+
+#define SIGINFO		65

 #define SA_RESTORER	0x04000000

diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
index a2bd75fbaaa4..d50ba690bb87 100644
--- a/drivers/tty/Makefile
+++ b/drivers/tty/Makefile
@@ -2,7 +2,7 @@
 obj-$(CONFIG_TTY)		+= tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
 				   tty_buffer.o tty_port.o tty_mutex.o \
 				   tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
-				   n_null.o
+				   n_null.o tty_status.o
 obj-$(CONFIG_LEGACY_PTYS)	+= pty.o
 obj-$(CONFIG_UNIX98_PTYS)	+= pty.o
 obj-$(CONFIG_AUDIT)		+= tty_audit.o
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 0ec93f1a61f5..b510e01289fd 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
 			commit_echoes(tty);
 			return;
 		}
+#ifdef VSTATUS
+		if (c == STATUS_CHAR(tty)) {
+			/* Do the status message first and then send
+			 * the signal, otherwise signal delivery can
+			 * change the process state making the status
+			 * message misleading. Also, use __isig() and
+			 * not sig(), as if we flush the tty we can
+			 * lose parts of the message.
+			 */
+
+			if (!L_NOKERNINFO(tty))
+				tty_status(tty);
+# if defined(SIGINFO) && SIGINFO != SIGPWR
+			__isig(SIGINFO, tty);
+# endif
+			return;
+		}
+#endif /* VSTATUS */
 		if (c == '\n') {
 			if (L_ECHO(tty) || L_ECHONL(tty)) {
 				echo_char_raw('\n', ldata);
@@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
 			set_bit(EOF_CHAR(tty), ldata->char_map);
 			set_bit('\n', ldata->char_map);
 			set_bit(EOL_CHAR(tty), ldata->char_map);
+#ifdef VSTATUS
+			set_bit(STATUS_CHAR(tty), ldata->char_map);
+#endif
 			if (L_IEXTEN(tty)) {
 				set_bit(WERASE_CHAR(tty), ldata->char_map);
 				set_bit(LNEXT_CHAR(tty), ldata->char_map);
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 6616d4a0d41d..8e488ecba330 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -120,18 +120,26 @@
 #define TTY_PARANOIA_CHECK 1
 #define CHECK_TTY_COUNT 1

+/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */
+#ifdef NOKERNINFO
+# define __NOKERNINFO NOKERNINFO
+#else
+# define __NOKERNINFO 0
+#endif
+
 struct ktermios tty_std_termios = {	/* for the benefit of tty drivers */
 	.c_iflag = ICRNL | IXON,
 	.c_oflag = OPOST | ONLCR,
 	.c_cflag = B38400 | CS8 | CREAD | HUPCL,
 	.c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK |
-		   ECHOCTL | ECHOKE | IEXTEN,
+		   ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO,
 	.c_cc = INIT_C_CC,
 	.c_ispeed = 38400,
 	.c_ospeed = 38400,
 	/* .c_line = N_TTY, */
 };
 EXPORT_SYMBOL(tty_std_termios);
+#undef __NOKERNINFO

 /* This list gets poked at by procfs and various bits of boot up code. This
  * could do with some rationalisation such as pulling the tty proc function
diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
index 507a25d692bb..b250eabca1ba 100644
--- a/drivers/tty/tty_ioctl.c
+++ b/drivers/tty/tty_ioctl.c
@@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
 		if (get_user(arg, (unsigned int __user *) arg))
 			return -EFAULT;
 		return tty_change_softcar(real_tty, arg);
+#ifdef TIOCSTAT
+	case TIOCSTAT:
+		return tty_status(real_tty);
+#endif
 	default:
 		return -ENOIOCTLCMD;
 	}
diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
new file mode 100644
index 000000000000..a9600f5bd48c
--- /dev/null
+++ b/drivers/tty/tty_status.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-1.0+
+/*
+ * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
+ *
+ */
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/tty.h>
+#include <linux/sched/cputime.h>
+#include <linux/sched/loadavg.h>
+#include <linux/pid.h>
+#include <linux/slab.h>
+#include <linux/math64.h>
+
+#define MSGLEN (160 + TASK_COMM_LEN)
+
+inline unsigned long getRSSk(struct mm_struct *mm)
+{
+	if (mm == NULL)
+		return 0;
+	return get_mm_rss(mm) * PAGE_SIZE / 1024;
+}
+
+inline long nstoms(long l)
+{
+	l /= NSEC_PER_MSEC * 10;
+	if (l < 10)
+		l *= 10;
+	return l;
+}
+
+inline struct task_struct *compare(struct task_struct *new,
+				   struct task_struct *old)
+{
+	unsigned int ostate, nstate;
+
+	if (old == NULL)
+		return new;
+
+	ostate = task_state_index(old);
+	nstate = task_state_index(new);
+
+	if (ostate == nstate) {
+		if (old->start_time > new->start_time)
+			return old;
+		return new;
+	}
+
+	if (ostate < nstate)
+		return old;
+
+	return new;
+}
+
+struct task_struct *pick_process(struct pid *pgrp)
+{
+	struct task_struct *p, *winner = NULL;
+
+	read_lock(&tasklist_lock);
+	do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
+		winner = compare(p, winner);
+	} while_each_pid_task(pgrp, PIDTYPE_PGID, p);
+	read_unlock(&tasklist_lock);
+
+	return winner;
+}
+
+int tty_status(struct tty_struct *tty)
+{
+	char tname[TASK_COMM_LEN];
+	unsigned long loadavg[3];
+	uint64_t pcpu, cputime, wallclock;
+	struct task_struct *p;
+	struct rusage rusage;
+	struct timespec64 utime, stime, rtime;
+	char msg[MSGLEN] = {0};
+	int len = 0;
+
+	if (tty == NULL)
+		return -ENOTTY;
+
+	get_avenrun(loadavg, FIXED_1/200, 0);
+	len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu ",
+			 LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
+
+	if (tty->ctrl.session == NULL) {
+		len += scnprintf((char *)&msg[len], MSGLEN - len,
+				 "not a controlling terminal");
+		goto print;
+	}
+
+	if (tty->ctrl.pgrp == NULL) {
+		len += scnprintf((char *)&msg[len], MSGLEN - len,
+				 "no foreground process group");
+		goto print;
+	}
+
+	p = pick_process(tty->ctrl.pgrp);
+	if (p == NULL) {
+		len += scnprintf((char *)&msg[len], MSGLEN - len,
+				 "empty foreground process group");
+		goto print;
+	}
+
+	get_task_comm(tname, p);
+	getrusage(p, RUSAGE_BOTH, &rusage);
+	wallclock = ktime_get_ns() - p->start_time;
+
+	utime.tv_sec = rusage.ru_utime.tv_sec;
+	utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
+	stime.tv_sec = rusage.ru_stime.tv_sec;
+	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
+	rtime = ns_to_timespec64(wallclock);
+
+	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
+	pcpu = div64_u64(cputime * 100, wallclock);
+
+	len += scnprintf((char *)&msg[len], MSGLEN - len,
+			 /* task, PID, task state */
+			 "cmd: %s %d [%s] "
+			 /* rtime, utime, stime, %cpu, rss */
+			 "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
+			 tname, task_pid_vnr(p), (char *)get_task_state_name(p),
+			 rtime.tv_sec, nstoms(rtime.tv_nsec),
+			 utime.tv_sec, nstoms(utime.tv_nsec),
+			 stime.tv_sec, nstoms(stime.tv_nsec),
+			 pcpu, getRSSk(p->mm));
+
+print:
+	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
+	tty_write_message(tty, msg);
+
+	return 0;
+}
diff --git a/fs/proc/array.c b/fs/proc/array.c
index f37c03077b58..eb14306cdde2 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -62,6 +62,7 @@
 #include <linux/tty.h>
 #include <linux/string.h>
 #include <linux/mman.h>
+#include <linux/sched.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/numa_balancing.h>
 #include <linux/sched/task_stack.h>
@@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 	seq_printf(m, "%.64s", tcomm);
 }
-/*
- * The task state array is a strange "bitmap" of
- * reasons to sleep. Thus "running" is zero, and
- * you can test for combinations of others with
- * simple bit tests.
- */
-static const char * const task_state_array[] = {
-
-	/* states in TASK_REPORT: */
-	"R (running)",		/* 0x00 */
-	"S (sleeping)",		/* 0x01 */
-	"D (disk sleep)",	/* 0x02 */
-	"T (stopped)",		/* 0x04 */
-	"t (tracing stop)",	/* 0x08 */
-	"X (dead)",		/* 0x10 */
-	"Z (zombie)",		/* 0x20 */
-	"P (parked)",		/* 0x40 */
-
-	/* states beyond TASK_REPORT: */
-	"I (idle)",		/* 0x80 */
-};
-
-static inline const char *get_task_state(struct task_struct *tsk)
-{
-	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
-	return task_state_array[task_state_index(tsk)];
-}
-
 static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 				struct pid *pid, struct task_struct *p)
 {
diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h
index b1398d0d4a1d..9b080e1a82d4 100644
--- a/include/asm-generic/termios.h
+++ b/include/asm-generic/termios.h
@@ -10,9 +10,9 @@
 	eof=^D		vtime=\0	vmin=\1		sxtc=\0
 	start=^Q	stop=^S		susp=^Z		eol=\0
 	reprint=^R	discard=^U	werase=^W	lnext=^V
-	eol2=\0
+	eol2=\0		status=^T
 */
-#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
+#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024"
 /*
  * Translate a "termio" structure into a "termios". Ugh.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c1a927ddec64..2171074ec8f5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -70,7 +70,7 @@ struct task_group;
 /*
  * Task state bitmask. NOTE! These bits are also
- * encoded in fs/proc/array.c: get_task_state().
+ * encoded in get_task_state().
  *
  * We have two separate sets of flags: task->state
  * is about runnability, while task->exit_state are
@@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
 	return task_index_to_char(task_state_index(tsk));
 }
+static inline const char *get_task_state_name(struct task_struct *tsk)
+{
+	static const char * const task_state_array[] = {
+
+		/* states in TASK_REPORT: */
+		"running",		/* 0x00 */
+		"sleeping",		/* 0x01 */
+		"disk sleep",		/* 0x02 */
+		"stopped",		/* 0x04 */
+		"tracing stop",		/* 0x08 */
+		"dead",			/* 0x10 */
+		"zombie",		/* 0x20 */
+		"parked",		/* 0x40 */
+
+		/* states beyond TASK_REPORT: */
+		"idle",			/* 0x80 */
+	};
+
+	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
+	return task_state_array[task_state_index(tsk)];
+}
+
+static inline const char *get_task_state(struct task_struct *tsk)
+{
+	/*
+	 * The task state array is a strange "bitmap" of
+	 * reasons to sleep. Thus "running" is zero, and
+	 * you can test for combinations of others with
+	 * simple bit tests.
+	 */
+	static const char * const task_state_array[] = {
+
+		/* states in TASK_REPORT: */
+		"R (running)",		/* 0x00 */
+		"S (sleeping)",		/* 0x01 */
+		"D (disk sleep)",	/* 0x02 */
+		"T (stopped)",		/* 0x04 */
+		"t (tracing stop)",	/* 0x08 */
+		"X (dead)",		/* 0x10 */
+		"Z (zombie)",		/* 0x20 */
+		"P (parked)",		/* 0x40 */
+
+		/* states beyond TASK_REPORT: */
+		"I (idle)",		/* 0x80 */
+	};
+
+	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
+	return task_state_array[task_state_index(tsk)];
+}
+
 /**
  * is_global_init - check if a task structure is init. Since init
  * is free to have sub-threads we need to check tgid.
diff --git a/include/linux/signal.h b/include/linux/signal.h
index b77f9472a37c..76bda1a20578 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
  * | non-POSIX signal   | default action   |
  * +--------------------+------------------+
  * | SIGEMT             | coredump         |
+ * | SIGINFO            | ignore           |
  * +--------------------+------------------+
  *
  * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
@@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
 	return sig == SIGCONT ||
 		sig == SIGCHLD ||
 		sig == SIGWINCH ||
+#if defined(SIGINFO) && SIGINFO != SIGPWR
+		sig == SIGINFO ||
+#endif
 		sig == SIGURG;
 }
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 168e57e40bbb..943d85aa471c 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -49,6 +49,9 @@
 #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
 #define LNEXT_CHAR(tty)	((tty)->termios.c_cc[VLNEXT])
 #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
+#ifdef VSTATUS
+#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
+#endif
 #define _I_FLAG(tty, f)	((tty)->termios.c_iflag & (f))
 #define _O_FLAG(tty, f)	((tty)->termios.c_oflag & (f))
@@ -114,6 +117,9 @@
 #define L_PENDIN(tty)	_L_FLAG((tty), PENDIN)
 #define L_IEXTEN(tty)	_L_FLAG((tty), IEXTEN)
 #define L_EXTPROC(tty)	_L_FLAG((tty), EXTPROC)
+#ifdef NOKERNINFO
+#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
+#endif
 struct device;
 struct signal_struct;
@@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
 extern void tty_unlock_slave(struct tty_struct *tty);
 extern void tty_set_lock_subclass(struct tty_struct *tty);
+extern int tty_status(struct tty_struct *tty);
+
 #endif
diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
index cdc9f4ca8c27..baa2b8d42679 100644
--- a/include/uapi/asm-generic/ioctls.h
+++ b/include/uapi/asm-generic/ioctls.h
@@ -97,6 +97,8 @@
 #define TIOCMIWAIT	0x545C	/* wait for a change on serial input line(s) */
 #define TIOCGICOUNT	0x545D	/* read serial port inline interrupt counts */
+/* Some architectures use 0x545E for FIOQSIZE */
+#define TIOCSTAT	0x545F	/* display process group stats on tty */
 /*
  * Some arches already define FIOQSIZE due to a historical
diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
index 3c4cc9b8378e..0b771eb1db94 100644
--- a/include/uapi/asm-generic/signal.h
+++ b/include/uapi/asm-generic/signal.h
@@ -4,7 +4,7 @@
 #include <linux/types.h>
-#define _NSIG		64
+#define _NSIG		65
 #define _NSIG_BPW	__BITS_PER_LONG
 #define _NSIG_WORDS	((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
@@ -49,9 +49,11 @@
 /* These should not be considered constants from userland. */
 #define SIGRTMIN	32
 #ifndef SIGRTMAX
-#define SIGRTMAX	_NSIG
+#define SIGRTMAX	64
 #endif
+#define SIGINFO		65
+
 #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
 #define MINSIGSTKSZ	2048
 #define SIGSTKSZ	8192
diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
index 2fbaf9ae89dd..cb4e9c6d629f 100644
--- a/include/uapi/asm-generic/termbits.h
+++ b/include/uapi/asm-generic/termbits.h
@@ -58,6 +58,7 @@ struct ktermios {
 #define VWERASE 14
 #define VLNEXT 15
 #define VEOL2 16
+#define VSTATUS 17
 /* c_iflag bits */
 #define IGNBRK	0000001
@@ -164,22 +165,23 @@ struct ktermios {
 #define IBSHIFT	16		/* Shift from CBAUD to CIBAUD */
 /* c_lflag bits */
-#define ISIG	0000001
-#define ICANON	0000002
-#define XCASE	0000004
-#define ECHO	0000010
-#define ECHOE	0000020
-#define ECHOK	0000040
-#define ECHONL	0000100
-#define NOFLSH	0000200
-#define TOSTOP	0000400
-#define ECHOCTL	0001000
-#define ECHOPRT	0002000
-#define ECHOKE	0004000
-#define FLUSHO	0010000
-#define PENDIN	0040000
-#define IEXTEN	0100000
-#define EXTPROC	0200000
+#define ISIG		0000001
+#define ICANON		0000002
+#define XCASE		0000004
+#define ECHO		0000010
+#define ECHOE		0000020
+#define ECHOK		0000040
+#define ECHONL		0000100
+#define NOFLSH		0000200
+#define TOSTOP		0000400
+#define ECHOCTL		0001000
+#define ECHOPRT		0002000
+#define ECHOKE		0004000
+#define FLUSHO		0010000
+#define PENDIN		0040000
+#define IEXTEN		0100000
+#define EXTPROC		0200000
+#define NOKERNINFO	0400000
 /* tcflow() and TCXONC use these */
 #define TCOOFF		0
--
2.30.2

^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
@ 2022-01-04 7:27 ` Greg Kroah-Hartman
2022-01-07 21:48 ` Arseny Maslennikov
2022-01-08 14:38 ` Arseny Maslennikov
2 siblings, 0 replies; 21+ messages in thread
From: Greg Kroah-Hartman @ 2022-01-04 7:27 UTC (permalink / raw)
To: Walt Drummond
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Jiri Slaby, Arnd Bergmann, Peter Zijlstra,
Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
linux-fsdevel, linux-arch

On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> Support TTY VSTATUS character, NOKERNINFO local control bit and the
> signal SIGINFO, all as in 4.3BSD.

I am sorry, but this changelog text does not make any sense to me at
all. It needs to be much more detailed and explain why you are doing
this and what exactly it is doing as I have no idea.

Also, you seem to be adding new user/kernel apis here with no
documentation that I can see, nor any tests. So how is anyone supposed
to use this?

And finally:

> --- /dev/null
> +++ b/drivers/tty/tty_status.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: GPL-1.0+

Please no, you know better than that, and the checkpatch tool should
have warned you.

> +/*
> + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> + *
> + */
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/tty.h>
> +#include <linux/sched/cputime.h>
> +#include <linux/sched/loadavg.h>
> +#include <linux/pid.h>
> +#include <linux/slab.h>
> +#include <linux/math64.h>
> +
> +#define MSGLEN (160 + TASK_COMM_LEN)
> +
> +inline unsigned long getRSSk(struct mm_struct *mm)
> +{
> +	if (mm == NULL)
> +		return 0;
> +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> +}
> +
> +inline long nstoms(long l)
> +{
> +	l /= NSEC_PER_MSEC * 10;
> +	if (l < 10)
> +		l *= 10;
> +	return l;
> +}
> +
> +inline struct task_struct *compare(struct task_struct *new,
> +				   struct task_struct *old)
> +{
> +	unsigned int ostate, nstate;
> +
> +	if (old == NULL)
> +		return new;
> +
> +	ostate = task_state_index(old);
> +	nstate = task_state_index(new);
> +
> +	if (ostate == nstate) {
> +		if (old->start_time > new->start_time)
> +			return old;
> +		return new;
> +	}
> +
> +	if (ostate < nstate)
> +		return old;
> +
> +	return new;
> +}
> +
> +struct task_struct *pick_process(struct pid *pgrp)

Also, always run sparse on your changes, you have loads of new global
functions for no reason.

thanks,

greg k-h

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO 2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond 2022-01-04 7:27 ` Greg Kroah-Hartman @ 2022-01-07 21:48 ` Arseny Maslennikov 2022-01-07 21:52 ` Walt Drummond 2022-01-08 14:38 ` Arseny Maslennikov 2 siblings, 1 reply; 21+ messages in thread From: Arseny Maslennikov @ 2022-01-07 21:48 UTC (permalink / raw) To: Walt Drummond Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel, linux-fsdevel, linux-arch [-- Attachment #1: Type: text/plain, Size: 19564 bytes --] On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote: > Support TTY VSTATUS character, NOKERNINFO local control bit and the > signal SIGINFO, all as in 4.3BSD. > > Signed-off-by: Walt Drummond <walt@drummond.us> > --- > arch/x86/include/asm/signal.h | 2 +- > arch/x86/include/uapi/asm/signal.h | 4 +- > drivers/tty/Makefile | 2 +- > drivers/tty/n_tty.c | 21 +++++ > drivers/tty/tty_io.c | 10 ++- > drivers/tty/tty_ioctl.c | 4 + > drivers/tty/tty_status.c | 135 ++++++++++++++++++++++++++++ > fs/proc/array.c | 29 +----- > include/asm-generic/termios.h | 4 +- > include/linux/sched.h | 52 ++++++++++- > include/linux/signal.h | 4 + > include/linux/tty.h | 8 ++ > include/uapi/asm-generic/ioctls.h | 2 + > include/uapi/asm-generic/signal.h | 6 +- > include/uapi/asm-generic/termbits.h | 34 +++---- > 15 files changed, 264 insertions(+), 53 deletions(-) > create mode 100644 drivers/tty/tty_status.c > > diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h > index d8e2efe6cd46..0a01877c11ab 100644 > --- a/arch/x86/include/asm/signal.h > +++ b/arch/x86/include/asm/signal.h > @@ -8,7 +8,7 @@ > /* Most things should be clean enough to redefine this at will, 
if care > is taken to make libc match. */ > > -#define _NSIG 64 > +#define _NSIG 65 > > #ifdef __i386__ > # define _NSIG_BPW 32 > diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h > index 164a22a72984..60dca62d3dcf 100644 > --- a/arch/x86/include/uapi/asm/signal.h > +++ b/arch/x86/include/uapi/asm/signal.h > @@ -60,7 +60,9 @@ typedef unsigned long sigset_t; > > /* These should not be considered constants from userland. */ > #define SIGRTMIN 32 > -#define SIGRTMAX _NSIG > +#define SIGRTMAX 64 > + > +#define SIGINFO 65 > > #define SA_RESTORER 0x04000000 > > diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile > index a2bd75fbaaa4..d50ba690bb87 100644 > --- a/drivers/tty/Makefile > +++ b/drivers/tty/Makefile > @@ -2,7 +2,7 @@ > obj-$(CONFIG_TTY) += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \ > tty_buffer.o tty_port.o tty_mutex.o \ > tty_ldsem.o tty_baudrate.o tty_jobctrl.o \ > - n_null.o > + n_null.o tty_status.o > obj-$(CONFIG_LEGACY_PTYS) += pty.o > obj-$(CONFIG_UNIX98_PTYS) += pty.o > obj-$(CONFIG_AUDIT) += tty_audit.o > diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c > index 0ec93f1a61f5..b510e01289fd 100644 > --- a/drivers/tty/n_tty.c > +++ b/drivers/tty/n_tty.c > @@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c) > commit_echoes(tty); > return; > } > +#ifdef VSTATUS > + if (c == STATUS_CHAR(tty)) { > + /* Do the status message first and then send > + * the signal, otherwise signal delivery can > + * change the process state making the status > + * message misleading. Also, use __isig() and > + * not sig(), as if we flush the tty we can > + * lose parts of the message. ...As well as the character input in the canonical mode's built-in line editor. 
> + */ > + > + if (!L_NOKERNINFO(tty)) > + tty_status(tty); > +# if defined(SIGINFO) && SIGINFO != SIGPWR > + __isig(SIGINFO, tty); > +# endif > + return; > + } > +#endif /* VSTATUS */ > if (c == '\n') { > if (L_ECHO(tty) || L_ECHONL(tty)) { > echo_char_raw('\n', ldata); > @@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old) > set_bit(EOF_CHAR(tty), ldata->char_map); > set_bit('\n', ldata->char_map); > set_bit(EOL_CHAR(tty), ldata->char_map); > +#ifdef VSTATUS > + set_bit(STATUS_CHAR(tty), ldata->char_map); > +#endif > if (L_IEXTEN(tty)) { > set_bit(WERASE_CHAR(tty), ldata->char_map); > set_bit(LNEXT_CHAR(tty), ldata->char_map); > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c > index 6616d4a0d41d..8e488ecba330 100644 > --- a/drivers/tty/tty_io.c > +++ b/drivers/tty/tty_io.c > @@ -120,18 +120,26 @@ > #define TTY_PARANOIA_CHECK 1 > #define CHECK_TTY_COUNT 1 > > +/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */ > +#ifdef NOKERNINFO > +# define __NOKERNINFO NOKERNINFO > +#else > +# define __NOKERNINFO 0 > +#endif > + > struct ktermios tty_std_termios = { /* for the benefit of tty drivers */ > .c_iflag = ICRNL | IXON, > .c_oflag = OPOST | ONLCR, > .c_cflag = B38400 | CS8 | CREAD | HUPCL, > .c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK | > - ECHOCTL | ECHOKE | IEXTEN, > + ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO, > .c_cc = INIT_C_CC, > .c_ispeed = 38400, > .c_ospeed = 38400, > /* .c_line = N_TTY, */ > }; > EXPORT_SYMBOL(tty_std_termios); > +#undef __NOKERNINFO > > /* This list gets poked at by procfs and various bits of boot up code. 
This
>   * could do with some rationalisation such as pulling the tty proc function
> diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> index 507a25d692bb..b250eabca1ba 100644
> --- a/drivers/tty/tty_ioctl.c
> +++ b/drivers/tty/tty_ioctl.c
> @@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
>  		if (get_user(arg, (unsigned int __user *) arg))
>  			return -EFAULT;
>  		return tty_change_softcar(real_tty, arg);
> +#ifdef TIOCSTAT
> +	case TIOCSTAT:
> +		return tty_status(real_tty);
> +#endif
>  	default:
>  		return -ENOIOCTLCMD;
>  	}
> diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> new file mode 100644

Nitpick: the new functionality is part of n_tty and not the generic tty
subsystem, so "tty_status.c" is a misleading name for the new file,
unlike e.g. "n_tty_status.c". It has no use in the various modem
drivers, for example.
Likewise for the tty_status() function.

> index 000000000000..a9600f5bd48c
> --- /dev/null
> +++ b/drivers/tty/tty_status.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: GPL-1.0+
> +/*
> + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> + *
> + */
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/tty.h>
> +#include <linux/sched/cputime.h>
> +#include <linux/sched/loadavg.h>
> +#include <linux/pid.h>
> +#include <linux/slab.h>
> +#include <linux/math64.h>
> +
> +#define MSGLEN (160 + TASK_COMM_LEN)
> +
> +inline unsigned long getRSSk(struct mm_struct *mm)
> +{
> +	if (mm == NULL)
> +		return 0;
> +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> +}
> +
> +inline long nstoms(long l)
> +{
> +	l /= NSEC_PER_MSEC * 10;
> +	if (l < 10)
> +		l *= 10;
> +	return l;
> +}
> +
> +inline struct task_struct *compare(struct task_struct *new,
> +				   struct task_struct *old)
> +{
> +	unsigned int ostate, nstate;
> +
> +	if (old == NULL)
> +		return new;
> +
> +	ostate = task_state_index(old);
> +	nstate = task_state_index(new);
> +
> +	if (ostate == nstate) {
> +		if
(old->start_time > new->start_time) > + return old; > + return new; > + } > + > + if (ostate < nstate) > + return old; > + > + return new; > +} > + > +struct task_struct *pick_process(struct pid *pgrp) > +{ > + struct task_struct *p, *winner = NULL; > + > + read_lock(&tasklist_lock); > + do_each_pid_task(pgrp, PIDTYPE_PGID, p) { > + winner = compare(p, winner); > + } while_each_pid_task(pgrp, PIDTYPE_PGID, p); > + read_unlock(&tasklist_lock); > + > + return winner; > +} > + > +int tty_status(struct tty_struct *tty) > +{ > + char tname[TASK_COMM_LEN]; > + unsigned long loadavg[3]; > + uint64_t pcpu, cputime, wallclock; > + struct task_struct *p; > + struct rusage rusage; > + struct timespec64 utime, stime, rtime; > + char msg[MSGLEN] = {0}; > + int len = 0; > + > + if (tty == NULL) > + return -ENOTTY; > + > + get_avenrun(loadavg, FIXED_1/200, 0); > + len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu ", > + LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0])); > + > + if (tty->ctrl.session == NULL) { > + len += scnprintf((char *)&msg[len], MSGLEN - len, > + "not a controlling terminal"); > + goto print; > + } > + > + if (tty->ctrl.pgrp == NULL) { > + len += scnprintf((char *)&msg[len], MSGLEN - len, > + "no foreground process group"); > + goto print; > + } > + > + p = pick_process(tty->ctrl.pgrp); > + if (p == NULL) { > + len += scnprintf((char *)&msg[len], MSGLEN - len, > + "empty foreground process group"); > + goto print; > + } > + > + get_task_comm(tname, p); > + getrusage(p, RUSAGE_BOTH, &rusage); > + wallclock = ktime_get_ns() - p->start_time; > + > + utime.tv_sec = rusage.ru_utime.tv_sec; > + utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC; > + stime.tv_sec = rusage.ru_stime.tv_sec; > + stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC; > + rtime = ns_to_timespec64(wallclock); > + > + cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime); > + pcpu = div64_u64(cputime * 100, wallclock); > + > + len += scnprintf((char *)&msg[len], 
MSGLEN - len,
> +			 /* task, PID, task state */
> +			 "cmd: %s %d [%s] "
> +			 /* rtime, utime, stime, %cpu, rss */
> +			 "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> +			 tname, task_pid_vnr(p), (char *)get_task_state_name(p),
> +			 rtime.tv_sec, nstoms(rtime.tv_nsec),
> +			 utime.tv_sec, nstoms(utime.tv_nsec),
> +			 stime.tv_sec, nstoms(stime.tv_nsec),
> +			 pcpu, getRSSk(p->mm));
> +
> +print:
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> +	tty_write_message(tty, msg);

tty_write_message() is quite risky to use; while writing my
implementation a couple of years ago I've found it easy to accidentally
set up deadlocks with this interface, in particular if the function is
called from the tty character receive path.
I hope you're testing the functionality with CONFIG_PROVE_LOCKING enabled.

> +
> +	return 0;
> +}
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index f37c03077b58..eb14306cdde2 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -62,6 +62,7 @@
>  #include <linux/tty.h>
>  #include <linux/string.h>
>  #include <linux/mman.h>
> +#include <linux/sched.h>
>  #include <linux/sched/mm.h>
>  #include <linux/sched/numa_balancing.h>
>  #include <linux/sched/task_stack.h>
> @@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
>  	seq_printf(m, "%.64s", tcomm);
>  }
>
> -/*
> - * The task state array is a strange "bitmap" of
> - * reasons to sleep. Thus "running" is zero, and
> - * you can test for combinations of others with
> - * simple bit tests.
> - */ > -static const char * const task_state_array[] = { > - > - /* states in TASK_REPORT: */ > - "R (running)", /* 0x00 */ > - "S (sleeping)", /* 0x01 */ > - "D (disk sleep)", /* 0x02 */ > - "T (stopped)", /* 0x04 */ > - "t (tracing stop)", /* 0x08 */ > - "X (dead)", /* 0x10 */ > - "Z (zombie)", /* 0x20 */ > - "P (parked)", /* 0x40 */ > - > - /* states beyond TASK_REPORT: */ > - "I (idle)", /* 0x80 */ > -}; > - > -static inline const char *get_task_state(struct task_struct *tsk) > -{ > - BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array)); > - return task_state_array[task_state_index(tsk)]; > -} > - > static inline void task_state(struct seq_file *m, struct pid_namespace *ns, > struct pid *pid, struct task_struct *p) > { > diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h > index b1398d0d4a1d..9b080e1a82d4 100644 > --- a/include/asm-generic/termios.h > +++ b/include/asm-generic/termios.h > @@ -10,9 +10,9 @@ > eof=^D vtime=\0 vmin=\1 sxtc=\0 > start=^Q stop=^S susp=^Z eol=\0 > reprint=^R discard=^U werase=^W lnext=^V > - eol2=\0 > + eol2=\0 status=^T > */ > -#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0" > +#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024" > > /* > * Translate a "termio" structure into a "termios". Ugh. > diff --git a/include/linux/sched.h b/include/linux/sched.h > index c1a927ddec64..2171074ec8f5 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -70,7 +70,7 @@ struct task_group; > > /* > * Task state bitmask. NOTE! These bits are also > - * encoded in fs/proc/array.c: get_task_state(). > + * encoded in get_task_state(). 
> * > * We have two separate sets of flags: task->state > * is about runnability, while task->exit_state are > @@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk) > return task_index_to_char(task_state_index(tsk)); > } > > +static inline const char *get_task_state_name(struct task_struct *tsk) > +{ > + static const char * const task_state_array[] = { > + > + /* states in TASK_REPORT: */ > + "running", /* 0x00 */ > + "sleeping", /* 0x01 */ > + "disk sleep", /* 0x02 */ > + "stopped", /* 0x04 */ > + "tracing stop", /* 0x08 */ > + "dead", /* 0x10 */ > + "zombie", /* 0x20 */ > + "parked", /* 0x40 */ > + > + /* states beyond TASK_REPORT: */ > + "idle", /* 0x80 */ > + }; > + > + BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array)); > + return task_state_array[task_state_index(tsk)]; > +} > + > +static inline const char *get_task_state(struct task_struct *tsk) > +{ > + /* > + * The task state array is a strange "bitmap" of > + * reasons to sleep. Thus "running" is zero, and > + * you can test for combinations of others with > + * simple bit tests. > + */ > + static const char * const task_state_array[] = { > + > + /* states in TASK_REPORT: */ > + "R (running)", /* 0x00 */ > + "S (sleeping)", /* 0x01 */ > + "D (disk sleep)", /* 0x02 */ > + "T (stopped)", /* 0x04 */ > + "t (tracing stop)", /* 0x08 */ > + "X (dead)", /* 0x10 */ > + "Z (zombie)", /* 0x20 */ > + "P (parked)", /* 0x40 */ > + > + /* states beyond TASK_REPORT: */ > + "I (idle)", /* 0x80 */ > + }; > + > + BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array)); > + return task_state_array[task_state_index(tsk)]; > +} > + > /** > * is_global_init - check if a task structure is init. Since init > * is free to have sub-threads we need to check tgid. 
> diff --git a/include/linux/signal.h b/include/linux/signal.h > index b77f9472a37c..76bda1a20578 100644 > --- a/include/linux/signal.h > +++ b/include/linux/signal.h > @@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig); > * | non-POSIX signal | default action | > * +--------------------+------------------+ > * | SIGEMT | coredump | > + * | SIGINFO | ignore | > * +--------------------+------------------+ > * > * (+) For SIGKILL and SIGSTOP the action is "always", not just "default". > @@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig) > return sig == SIGCONT || > sig == SIGCHLD || > sig == SIGWINCH || > +#if defined(SIGINFO) && SIGINFO != SIGPWR > + sig == SIGINFO || > +#endif > sig == SIGURG; > } > > diff --git a/include/linux/tty.h b/include/linux/tty.h > index 168e57e40bbb..943d85aa471c 100644 > --- a/include/linux/tty.h > +++ b/include/linux/tty.h > @@ -49,6 +49,9 @@ > #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE]) > #define LNEXT_CHAR(tty) ((tty)->termios.c_cc[VLNEXT]) > #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2]) > +#ifdef VSTATUS > +#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS]) > +#endif > > #define _I_FLAG(tty, f) ((tty)->termios.c_iflag & (f)) > #define _O_FLAG(tty, f) ((tty)->termios.c_oflag & (f)) > @@ -114,6 +117,9 @@ > #define L_PENDIN(tty) _L_FLAG((tty), PENDIN) > #define L_IEXTEN(tty) _L_FLAG((tty), IEXTEN) > #define L_EXTPROC(tty) _L_FLAG((tty), EXTPROC) > +#ifdef NOKERNINFO > +#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO) > +#endif > > struct device; > struct signal_struct; > @@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty); > extern void tty_unlock_slave(struct tty_struct *tty); > extern void tty_set_lock_subclass(struct tty_struct *tty); > > +extern int tty_status(struct tty_struct *tty); > + > #endif > diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h > index cdc9f4ca8c27..baa2b8d42679 100644 > --- 
a/include/uapi/asm-generic/ioctls.h > +++ b/include/uapi/asm-generic/ioctls.h > @@ -97,6 +97,8 @@ > > #define TIOCMIWAIT 0x545C /* wait for a change on serial input line(s) */ > #define TIOCGICOUNT 0x545D /* read serial port inline interrupt counts */ > +/* Some architectures use 0x545E for FIOQSIZE */ > +#define TIOCSTAT 0x545F /* display process group stats on tty */ > > /* > * Some arches already define FIOQSIZE due to a historical > diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h > index 3c4cc9b8378e..0b771eb1db94 100644 > --- a/include/uapi/asm-generic/signal.h > +++ b/include/uapi/asm-generic/signal.h > @@ -4,7 +4,7 @@ > > #include <linux/types.h> > > -#define _NSIG 64 > +#define _NSIG 65 > #define _NSIG_BPW __BITS_PER_LONG > #define _NSIG_WORDS ((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW) > > @@ -49,9 +49,11 @@ > /* These should not be considered constants from userland. */ > #define SIGRTMIN 32 > #ifndef SIGRTMAX > -#define SIGRTMAX _NSIG > +#define SIGRTMAX 64 > #endif > > +#define SIGINFO 65 > + > #if !defined MINSIGSTKSZ || !defined SIGSTKSZ > #define MINSIGSTKSZ 2048 > #define SIGSTKSZ 8192 > diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h > index 2fbaf9ae89dd..cb4e9c6d629f 100644 > --- a/include/uapi/asm-generic/termbits.h > +++ b/include/uapi/asm-generic/termbits.h > @@ -58,6 +58,7 @@ struct ktermios { > #define VWERASE 14 > #define VLNEXT 15 > #define VEOL2 16 > +#define VSTATUS 17 > > /* c_iflag bits */ > #define IGNBRK 0000001 > @@ -164,22 +165,23 @@ struct ktermios { > #define IBSHIFT 16 /* Shift from CBAUD to CIBAUD */ > > /* c_lflag bits */ > -#define ISIG 0000001 > -#define ICANON 0000002 > -#define XCASE 0000004 > -#define ECHO 0000010 > -#define ECHOE 0000020 > -#define ECHOK 0000040 > -#define ECHONL 0000100 > -#define NOFLSH 0000200 > -#define TOSTOP 0000400 > -#define ECHOCTL 0001000 > -#define ECHOPRT 0002000 > -#define ECHOKE 0004000 > -#define FLUSHO 0010000 > -#define 
PENDIN 0040000
> -#define IEXTEN 0100000
> -#define EXTPROC 0200000
> +#define ISIG 0000001
> +#define ICANON 0000002
> +#define XCASE 0000004
> +#define ECHO 0000010
> +#define ECHOE 0000020
> +#define ECHOK 0000040
> +#define ECHONL 0000100
> +#define NOFLSH 0000200
> +#define TOSTOP 0000400
> +#define ECHOCTL 0001000
> +#define ECHOPRT 0002000
> +#define ECHOKE 0004000
> +#define FLUSHO 0010000
> +#define PENDIN 0040000
> +#define IEXTEN 0100000
> +#define EXTPROC 0200000
> +#define NOKERNINFO 0400000
>
>  /* tcflow() and TCXONC use these */
>  #define TCOOFF 0
> --
> 2.30.2
>
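
The quoted asm-generic/signal.h hunk raises _NSIG from 64 to 65 while keeping the round-up form of _NSIG_WORDS. A minimal user-space sketch of that arithmetic follows; the helper name nsig_words() is an assumption made for illustration and does not appear in the patch.

```c
/*
 * Hypothetical helper (not part of the patch) mirroring the quoted macro
 *   _NSIG_WORDS = (_NSIG + _NSIG_BPW - 1) / _NSIG_BPW
 * It returns how many words of bits_per_word bits are needed to hold
 * one bit per signal, rounding up.
 */
static inline unsigned int nsig_words(unsigned int nsig,
				      unsigned int bits_per_word)
{
	return (nsig + bits_per_word - 1) / bits_per_word;
}
```

With 64-bit words, nsig_words(64, 64) is 1 and nsig_words(65, 64) is 2, so the one extra signal number costs a whole additional word in the signal mask; with 32-bit words (the __i386__ case in the quoted x86 header), nsig_words(65, 32) is 3.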
* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
2022-01-07 21:48 ` Arseny Maslennikov
@ 2022-01-07 21:52 ` Walt Drummond
2022-01-07 22:39 ` Arseny Maslennikov
0 siblings, 1 reply; 21+ messages in thread
From: Walt Drummond @ 2022-01-07 21:52 UTC (permalink / raw)
To: Arseny Maslennikov
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
linux-kernel, linux-fsdevel, linux-arch

On Fri, Jan 7, 2022 at 1:48 PM Arseny Maslennikov <ar@cs.msu.ru> wrote:
>
> On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> > Support TTY VSTATUS character, NOKERNINFO local control bit and the
> > signal SIGINFO, all as in 4.3BSD.
> >
> > Signed-off-by: Walt Drummond <walt@drummond.us>
> > ---
> >  arch/x86/include/asm/signal.h       |   2 +-
> >  arch/x86/include/uapi/asm/signal.h  |   4 +-
> >  drivers/tty/Makefile                |   2 +-
> >  drivers/tty/n_tty.c                 |  21 +++++
> >  drivers/tty/tty_io.c                |  10 ++-
> >  drivers/tty/tty_ioctl.c             |   4 +
> >  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
> >  fs/proc/array.c                     |  29 +-----
> >  include/asm-generic/termios.h       |   4 +-
> >  include/linux/sched.h               |  52 ++++++++++-
> >  include/linux/signal.h              |   4 +
> >  include/linux/tty.h                 |   8 ++
> >  include/uapi/asm-generic/ioctls.h   |   2 +
> >  include/uapi/asm-generic/signal.h   |   6 +-
> >  include/uapi/asm-generic/termbits.h |  34 +++----
> >  15 files changed, 264 insertions(+), 53 deletions(-)
> >  create mode 100644 drivers/tty/tty_status.c
> >
> > diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
> > index d8e2efe6cd46..0a01877c11ab 100644
> > --- a/arch/x86/include/asm/signal.h
> > +++ b/arch/x86/include/asm/signal.h
> > @@ -8,7 +8,7 @@
> >  /* Most things should be clean enough to redefine this at will, if care
> >     is taken to make libc match.
*/
> >
> > -#define _NSIG 64
> > +#define _NSIG 65
> >
> >  #ifdef __i386__
> >  # define _NSIG_BPW 32
> > diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
> > index 164a22a72984..60dca62d3dcf 100644
> > --- a/arch/x86/include/uapi/asm/signal.h
> > +++ b/arch/x86/include/uapi/asm/signal.h
> > @@ -60,7 +60,9 @@ typedef unsigned long sigset_t;
> >
> >  /* These should not be considered constants from userland. */
> >  #define SIGRTMIN 32
> > -#define SIGRTMAX _NSIG
> > +#define SIGRTMAX 64
> > +
> > +#define SIGINFO 65
> >
> >  #define SA_RESTORER 0x04000000
> >
> > diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
> > index a2bd75fbaaa4..d50ba690bb87 100644
> > --- a/drivers/tty/Makefile
> > +++ b/drivers/tty/Makefile
> > @@ -2,7 +2,7 @@
> >  obj-$(CONFIG_TTY) += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
> >  	tty_buffer.o tty_port.o tty_mutex.o \
> >  	tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
> > -	n_null.o
> > +	n_null.o tty_status.o
> >  obj-$(CONFIG_LEGACY_PTYS) += pty.o
> >  obj-$(CONFIG_UNIX98_PTYS) += pty.o
> >  obj-$(CONFIG_AUDIT) += tty_audit.o
> > diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> > index 0ec93f1a61f5..b510e01289fd 100644
> > --- a/drivers/tty/n_tty.c
> > +++ b/drivers/tty/n_tty.c
> > @@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
> >  			commit_echoes(tty);
> >  			return;
> >  		}
> > +#ifdef VSTATUS
> > +		if (c == STATUS_CHAR(tty)) {
> > +			/* Do the status message first and then send
> > +			 * the signal, otherwise signal delivery can
> > +			 * change the process state making the status
> > +			 * message misleading. Also, use __isig() and
> > +			 * not sig(), as if we flush the tty we can
> > +			 * lose parts of the message.
>
> ...As well as the character input in the canonical mode's built-in line
> editor.
>

Yes, good catch. But this is not going to be in the next version of the patch.

> > + */ > > + > > + if (!L_NOKERNINFO(tty)) > > + tty_status(tty); > > +# if defined(SIGINFO) && SIGINFO != SIGPWR > > + __isig(SIGINFO, tty); > > +# endif > > + return; > > + } > > +#endif /* VSTATUS */ > > if (c == '\n') { > > if (L_ECHO(tty) || L_ECHONL(tty)) { > > echo_char_raw('\n', ldata); > > @@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old) > > set_bit(EOF_CHAR(tty), ldata->char_map); > > set_bit('\n', ldata->char_map); > > set_bit(EOL_CHAR(tty), ldata->char_map); > > +#ifdef VSTATUS > > + set_bit(STATUS_CHAR(tty), ldata->char_map); > > +#endif > > if (L_IEXTEN(tty)) { > > set_bit(WERASE_CHAR(tty), ldata->char_map); > > set_bit(LNEXT_CHAR(tty), ldata->char_map); > > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c > > index 6616d4a0d41d..8e488ecba330 100644 > > --- a/drivers/tty/tty_io.c > > +++ b/drivers/tty/tty_io.c > > @@ -120,18 +120,26 @@ > > #define TTY_PARANOIA_CHECK 1 > > #define CHECK_TTY_COUNT 1 > > > > +/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */ > > +#ifdef NOKERNINFO > > +# define __NOKERNINFO NOKERNINFO > > +#else > > +# define __NOKERNINFO 0 > > +#endif > > + > > struct ktermios tty_std_termios = { /* for the benefit of tty drivers */ > > .c_iflag = ICRNL | IXON, > > .c_oflag = OPOST | ONLCR, > > .c_cflag = B38400 | CS8 | CREAD | HUPCL, > > .c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK | > > - ECHOCTL | ECHOKE | IEXTEN, > > + ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO, > > .c_cc = INIT_C_CC, > > .c_ispeed = 38400, > > .c_ospeed = 38400, > > /* .c_line = N_TTY, */ > > }; > > EXPORT_SYMBOL(tty_std_termios); > > +#undef __NOKERNINFO > > > > /* This list gets poked at by procfs and various bits of boot up code. 
This
> >   * could do with some rationalisation such as pulling the tty proc function
> > diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> > index 507a25d692bb..b250eabca1ba 100644
> > --- a/drivers/tty/tty_ioctl.c
> > +++ b/drivers/tty/tty_ioctl.c
> > @@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
> >  		if (get_user(arg, (unsigned int __user *) arg))
> >  			return -EFAULT;
> >  		return tty_change_softcar(real_tty, arg);
> > +#ifdef TIOCSTAT
> > +	case TIOCSTAT:
> > +		return tty_status(real_tty);
> > +#endif
> >  	default:
> >  		return -ENOIOCTLCMD;
> >  	}
> > diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> > new file mode 100644
>
> Nitpick: the new functionality is part of n_tty and not the generic tty
> subsystem, so "tty_status.c" is a misleading name for the new file,
> unlike e.g. "n_tty_status.c". It has no use in the various modem
> drivers, for example.
> Likewise for the tty_status() function.

ACK, will do.

>
> > index 000000000000..a9600f5bd48c
> > --- /dev/null
> > +++ b/drivers/tty/tty_status.c
> > @@ -0,0 +1,135 @@
> > +// SPDX-License-Identifier: GPL-1.0+
> > +/*
> > + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> > + *
> > + */
> > +
> > +#include <linux/sched.h>
> > +#include <linux/mm.h>
> > +#include <linux/tty.h>
> > +#include <linux/sched/cputime.h>
> > +#include <linux/sched/loadavg.h>
> > +#include <linux/pid.h>
> > +#include <linux/slab.h>
> > +#include <linux/math64.h>
> > +
> > +#define MSGLEN (160 + TASK_COMM_LEN)
> > +
> > +inline unsigned long getRSSk(struct mm_struct *mm)
> > +{
> > +	if (mm == NULL)
> > +		return 0;
> > +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> > +}
> > +
> > +inline long nstoms(long l)
> > +{
> > +	l /= NSEC_PER_MSEC * 10;
> > +	if (l < 10)
> > +		l *= 10;
> > +	return l;
> > +}
> > +
> > +inline struct task_struct *compare(struct task_struct *new,
> > +				   struct task_struct *old)
> > +{
> > +	unsigned int ostate, nstate;
> > +
> > +	if
(old == NULL) > > + return new; > > + > > + ostate = task_state_index(old); > > + nstate = task_state_index(new); > > + > > + if (ostate == nstate) { > > + if (old->start_time > new->start_time) > > + return old; > > + return new; > > + } > > + > > + if (ostate < nstate) > > + return old; > > + > > + return new; > > +} > > + > > +struct task_struct *pick_process(struct pid *pgrp) > > +{ > > + struct task_struct *p, *winner = NULL; > > + > > + read_lock(&tasklist_lock); > > + do_each_pid_task(pgrp, PIDTYPE_PGID, p) { > > + winner = compare(p, winner); > > + } while_each_pid_task(pgrp, PIDTYPE_PGID, p); > > + read_unlock(&tasklist_lock); > > + > > + return winner; > > +} > > + > > +int tty_status(struct tty_struct *tty) > > +{ > > + char tname[TASK_COMM_LEN]; > > + unsigned long loadavg[3]; > > + uint64_t pcpu, cputime, wallclock; > > + struct task_struct *p; > > + struct rusage rusage; > > + struct timespec64 utime, stime, rtime; > > + char msg[MSGLEN] = {0}; > > + int len = 0; > > + > > + if (tty == NULL) > > + return -ENOTTY; > > + > > + get_avenrun(loadavg, FIXED_1/200, 0); > > + len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu ", > > + LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0])); > > + > > + if (tty->ctrl.session == NULL) { > > + len += scnprintf((char *)&msg[len], MSGLEN - len, > > + "not a controlling terminal"); > > + goto print; > > + } > > + > > + if (tty->ctrl.pgrp == NULL) { > > + len += scnprintf((char *)&msg[len], MSGLEN - len, > > + "no foreground process group"); > > + goto print; > > + } > > + > > + p = pick_process(tty->ctrl.pgrp); > > + if (p == NULL) { > > + len += scnprintf((char *)&msg[len], MSGLEN - len, > > + "empty foreground process group"); > > + goto print; > > + } > > + > > + get_task_comm(tname, p); > > + getrusage(p, RUSAGE_BOTH, &rusage); > > + wallclock = ktime_get_ns() - p->start_time; > > + > > + utime.tv_sec = rusage.ru_utime.tv_sec; > > + utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC; > > + 
stime.tv_sec = rusage.ru_stime.tv_sec;
> > +	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> > +	rtime = ns_to_timespec64(wallclock);
> > +
> > +	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> > +	pcpu = div64_u64(cputime * 100, wallclock);
> > +
> > +	len += scnprintf((char *)&msg[len], MSGLEN - len,
> > +			 /* task, PID, task state */
> > +			 "cmd: %s %d [%s] "
> > +			 /* rtime, utime, stime, %cpu, rss */
> > +			 "%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> > +			 tname, task_pid_vnr(p), (char *)get_task_state_name(p),
> > +			 rtime.tv_sec, nstoms(rtime.tv_nsec),
> > +			 utime.tv_sec, nstoms(utime.tv_nsec),
> > +			 stime.tv_sec, nstoms(stime.tv_nsec),
> > +			 pcpu, getRSSk(p->mm));
> > +
> > +print:
> > +	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> > +	tty_write_message(tty, msg);
>
> tty_write_message() is quite risky to use; while writing my
> implementation a couple of years ago I've found it easy to accidentally
> set up deadlocks with this interface, in particular if the function is
> called from the tty character receive path.
> I hope you're testing the functionality with CONFIG_PROVE_LOCKING enabled.

I have not, but I will. Is there a different 'put a message on this
tty' API I should be using?

Thanks.

>
> > +
> > +	return 0;
> > +}
> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > index f37c03077b58..eb14306cdde2 100644
> > --- a/fs/proc/array.c
> > +++ b/fs/proc/array.c
> > @@ -62,6 +62,7 @@
> >  #include <linux/tty.h>
> >  #include <linux/string.h>
> >  #include <linux/mman.h>
> > +#include <linux/sched.h>
> >  #include <linux/sched/mm.h>
> >  #include <linux/sched/numa_balancing.h>
> >  #include <linux/sched/task_stack.h>
> > @@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
> >  	seq_printf(m, "%.64s", tcomm);
> >  }
> >
> > -/*
> > - * The task state array is a strange "bitmap" of
> > - * reasons to sleep.
Thus "running" is zero, and > > - * you can test for combinations of others with > > - * simple bit tests. > > - */ > > -static const char * const task_state_array[] = { > > - > > - /* states in TASK_REPORT: */ > > - "R (running)", /* 0x00 */ > > - "S (sleeping)", /* 0x01 */ > > - "D (disk sleep)", /* 0x02 */ > > - "T (stopped)", /* 0x04 */ > > - "t (tracing stop)", /* 0x08 */ > > - "X (dead)", /* 0x10 */ > > - "Z (zombie)", /* 0x20 */ > > - "P (parked)", /* 0x40 */ > > - > > - /* states beyond TASK_REPORT: */ > > - "I (idle)", /* 0x80 */ > > -}; > > - > > -static inline const char *get_task_state(struct task_struct *tsk) > > -{ > > - BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array)); > > - return task_state_array[task_state_index(tsk)]; > > -} > > - > > static inline void task_state(struct seq_file *m, struct pid_namespace *ns, > > struct pid *pid, struct task_struct *p) > > { > > diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h > > index b1398d0d4a1d..9b080e1a82d4 100644 > > --- a/include/asm-generic/termios.h > > +++ b/include/asm-generic/termios.h > > @@ -10,9 +10,9 @@ > > eof=^D vtime=\0 vmin=\1 sxtc=\0 > > start=^Q stop=^S susp=^Z eol=\0 > > reprint=^R discard=^U werase=^W lnext=^V > > - eol2=\0 > > + eol2=\0 status=^T > > */ > > -#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0" > > +#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024" > > > > /* > > * Translate a "termio" structure into a "termios". Ugh. > > diff --git a/include/linux/sched.h b/include/linux/sched.h > > index c1a927ddec64..2171074ec8f5 100644 > > --- a/include/linux/sched.h > > +++ b/include/linux/sched.h > > @@ -70,7 +70,7 @@ struct task_group; > > > > /* > > * Task state bitmask. NOTE! These bits are also > > - * encoded in fs/proc/array.c: get_task_state(). > > + * encoded in get_task_state(). 
> >   *
> >   * We have two separate sets of flags: task->state
> >   * is about runnability, while task->exit_state are
> > @@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
> >  	return task_index_to_char(task_state_index(tsk));
> >  }
> >
> > +static inline const char *get_task_state_name(struct task_struct *tsk)
> > +{
> > +	static const char * const task_state_array[] = {
> > +
> > +		/* states in TASK_REPORT: */
> > +		"running",	/* 0x00 */
> > +		"sleeping",	/* 0x01 */
> > +		"disk sleep",	/* 0x02 */
> > +		"stopped",	/* 0x04 */
> > +		"tracing stop",	/* 0x08 */
> > +		"dead",	/* 0x10 */
> > +		"zombie",	/* 0x20 */
> > +		"parked",	/* 0x40 */
> > +
> > +		/* states beyond TASK_REPORT: */
> > +		"idle",	/* 0x80 */
> > +	};
> > +
> > +	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > +	return task_state_array[task_state_index(tsk)];
> > +}
> > +
> > +static inline const char *get_task_state(struct task_struct *tsk)
> > +{
> > +	/*
> > +	 * The task state array is a strange "bitmap" of
> > +	 * reasons to sleep. Thus "running" is zero, and
> > +	 * you can test for combinations of others with
> > +	 * simple bit tests.
> > +	 */
> > +	static const char * const task_state_array[] = {
> > +
> > +		/* states in TASK_REPORT: */
> > +		"R (running)",	/* 0x00 */
> > +		"S (sleeping)",	/* 0x01 */
> > +		"D (disk sleep)",	/* 0x02 */
> > +		"T (stopped)",	/* 0x04 */
> > +		"t (tracing stop)",	/* 0x08 */
> > +		"X (dead)",	/* 0x10 */
> > +		"Z (zombie)",	/* 0x20 */
> > +		"P (parked)",	/* 0x40 */
> > +
> > +		/* states beyond TASK_REPORT: */
> > +		"I (idle)",	/* 0x80 */
> > +	};
> > +
> > +	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > +	return task_state_array[task_state_index(tsk)];
> > +}
> > +
> >  /**
> >   * is_global_init - check if a task structure is init. Since init
> >   * is free to have sub-threads we need to check tgid.
> > diff --git a/include/linux/signal.h b/include/linux/signal.h
> > index b77f9472a37c..76bda1a20578 100644
> > --- a/include/linux/signal.h
> > +++ b/include/linux/signal.h
> > @@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
> >   * | non-POSIX signal   | default action   |
> >   * +--------------------+------------------+
> >   * | SIGEMT             | coredump         |
> > + * | SIGINFO            | ignore           |
> >   * +--------------------+------------------+
> >   *
> >   * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
> > @@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
> >  	return sig == SIGCONT ||
> >  		sig == SIGCHLD ||
> >  		sig == SIGWINCH ||
> > +#if defined(SIGINFO) && SIGINFO != SIGPWR
> > +		sig == SIGINFO ||
> > +#endif
> >  		sig == SIGURG;
> >  }
> >
> > diff --git a/include/linux/tty.h b/include/linux/tty.h
> > index 168e57e40bbb..943d85aa471c 100644
> > --- a/include/linux/tty.h
> > +++ b/include/linux/tty.h
> > @@ -49,6 +49,9 @@
> >  #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
> >  #define LNEXT_CHAR(tty) ((tty)->termios.c_cc[VLNEXT])
> >  #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
> > +#ifdef VSTATUS
> > +#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
> > +#endif
> >
> >  #define _I_FLAG(tty, f) ((tty)->termios.c_iflag & (f))
> >  #define _O_FLAG(tty, f) ((tty)->termios.c_oflag & (f))
> > @@ -114,6 +117,9 @@
> >  #define L_PENDIN(tty) _L_FLAG((tty), PENDIN)
> >  #define L_IEXTEN(tty) _L_FLAG((tty), IEXTEN)
> >  #define L_EXTPROC(tty) _L_FLAG((tty), EXTPROC)
> > +#ifdef NOKERNINFO
> > +#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
> > +#endif
> >
> >  struct device;
> >  struct signal_struct;
> > @@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
> >  extern void tty_unlock_slave(struct tty_struct *tty);
> >  extern void tty_set_lock_subclass(struct tty_struct *tty);
> >
> > +extern int tty_status(struct tty_struct *tty);
> > +
> >  #endif
> > diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
> > index cdc9f4ca8c27..baa2b8d42679 100644
> > --- a/include/uapi/asm-generic/ioctls.h
> > +++ b/include/uapi/asm-generic/ioctls.h
> > @@ -97,6 +97,8 @@
> >
> >  #define TIOCMIWAIT 0x545C /* wait for a change on serial input line(s) */
> >  #define TIOCGICOUNT 0x545D /* read serial port inline interrupt counts */
> > +/* Some architectures use 0x545E for FIOQSIZE */
> > +#define TIOCSTAT 0x545F /* display process group stats on tty */
> >
> >  /*
> >   * Some arches already define FIOQSIZE due to a historical
> > diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
> > index 3c4cc9b8378e..0b771eb1db94 100644
> > --- a/include/uapi/asm-generic/signal.h
> > +++ b/include/uapi/asm-generic/signal.h
> > @@ -4,7 +4,7 @@
> >
> >  #include <linux/types.h>
> >
> > -#define _NSIG 64
> > +#define _NSIG 65
> >  #define _NSIG_BPW __BITS_PER_LONG
> >  #define _NSIG_WORDS ((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
> >
> > @@ -49,9 +49,11 @@
> >  /* These should not be considered constants from userland. */
> >  #define SIGRTMIN 32
> >  #ifndef SIGRTMAX
> > -#define SIGRTMAX _NSIG
> > +#define SIGRTMAX 64
> >  #endif
> >
> > +#define SIGINFO 65
> > +
> >  #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
> >  #define MINSIGSTKSZ 2048
> >  #define SIGSTKSZ 8192
> > diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
> > index 2fbaf9ae89dd..cb4e9c6d629f 100644
> > --- a/include/uapi/asm-generic/termbits.h
> > +++ b/include/uapi/asm-generic/termbits.h
> > @@ -58,6 +58,7 @@ struct ktermios {
> >  #define VWERASE 14
> >  #define VLNEXT 15
> >  #define VEOL2 16
> > +#define VSTATUS 17
> >
> >  /* c_iflag bits */
> >  #define IGNBRK 0000001
> > @@ -164,22 +165,23 @@ struct ktermios {
> >  #define IBSHIFT 16 /* Shift from CBAUD to CIBAUD */
> >
> >  /* c_lflag bits */
> > -#define ISIG 0000001
> > -#define ICANON 0000002
> > -#define XCASE 0000004
> > -#define ECHO 0000010
> > -#define ECHOE 0000020
> > -#define ECHOK 0000040
> > -#define ECHONL 0000100
> > -#define NOFLSH 0000200
> > -#define TOSTOP 0000400
> > -#define ECHOCTL 0001000
> > -#define ECHOPRT 0002000
> > -#define ECHOKE 0004000
> > -#define FLUSHO 0010000
> > -#define PENDIN 0040000
> > -#define IEXTEN 0100000
> > -#define EXTPROC 0200000
> > +#define ISIG 0000001
> > +#define ICANON 0000002
> > +#define XCASE 0000004
> > +#define ECHO 0000010
> > +#define ECHOE 0000020
> > +#define ECHOK 0000040
> > +#define ECHONL 0000100
> > +#define NOFLSH 0000200
> > +#define TOSTOP 0000400
> > +#define ECHOCTL 0001000
> > +#define ECHOPRT 0002000
> > +#define ECHOKE 0004000
> > +#define FLUSHO 0010000
> > +#define PENDIN 0040000
> > +#define IEXTEN 0100000
> > +#define EXTPROC 0200000
> > +#define NOKERNINFO 0400000
> >
> >  /* tcflow() and TCXONC use these */
> >  #define TCOOFF 0
> > --
> > 2.30.2
> >

^ permalink raw reply [flat|nested] 21+ messages in thread
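
Note on the asm-generic/signal.h hunk quoted above: the _NSIG_WORDS expression is a plain round-up division, which is why raising _NSIG from 64 to 65 costs one extra word in the kernel sigset_t. What follows is only a user-space sketch of that arithmetic; nsig_words() is a hypothetical helper written for illustration and is not part of the patch or of the kernel.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function) mirroring the quoted macro
 *   _NSIG_WORDS = (_NSIG + _NSIG_BPW - 1) / _NSIG_BPW
 * It returns how many words of bits_per_word bits are needed to hold
 * nsig signal bits, rounding up.
 */
unsigned int nsig_words(unsigned int nsig, unsigned int bits_per_word)
{
	assert(bits_per_word != 0);
	return (nsig + bits_per_word - 1) / bits_per_word;
}
```

With 64-bit longs, 64 signals fit in one word and 65 need two; with 32-bit longs the counts are two and three respectively.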
* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
2022-01-07 21:52 ` Walt Drummond
@ 2022-01-07 22:39 ` Arseny Maslennikov
0 siblings, 0 replies; 21+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 22:39 UTC (permalink / raw)
To: Walt Drummond
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
linux-kernel, linux-fsdevel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 26392 bytes --]

On Fri, Jan 07, 2022 at 01:52:23PM -0800, Walt Drummond wrote:
> On Fri, Jan 7, 2022 at 1:48 PM Arseny Maslennikov <ar@cs.msu.ru> wrote:
> >
> > On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> > > Support TTY VSTATUS character, NOKERNINFO local control bit and the
> > > signal SIGINFO, all as in 4.3BSD.
> > >
> > > Signed-off-by: Walt Drummond <walt@drummond.us>
> > > ---
> > >  arch/x86/include/asm/signal.h       |   2 +-
> > >  arch/x86/include/uapi/asm/signal.h  |   4 +-
> > >  drivers/tty/Makefile                |   2 +-
> > >  drivers/tty/n_tty.c                 |  21 +++++
> > >  drivers/tty/tty_io.c                |  10 ++-
> > >  drivers/tty/tty_ioctl.c             |   4 +
> > >  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
> > >  fs/proc/array.c                     |  29 +-----
> > >  include/asm-generic/termios.h       |   4 +-
> > >  include/linux/sched.h               |  52 ++++++++++-
> > >  include/linux/signal.h              |   4 +
> > >  include/linux/tty.h                 |   8 ++
> > >  include/uapi/asm-generic/ioctls.h   |   2 +
> > >  include/uapi/asm-generic/signal.h   |   6 +-
> > >  include/uapi/asm-generic/termbits.h |  34 +++----
> > >  15 files changed, 264 insertions(+), 53 deletions(-)
> > >  create mode 100644 drivers/tty/tty_status.c
> > >
> > > diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
> > > index d8e2efe6cd46..0a01877c11ab 100644
> > > --- a/arch/x86/include/asm/signal.h
> > > +++ b/arch/x86/include/asm/signal.h
> > > @@ -8,7 +8,7 @@
> > >  /* Most things should be clean enough to redefine this at will, if care
> > >     is taken to make libc match. */
> > >
> > > -#define _NSIG 64
> > > +#define _NSIG 65
> > >
> > >  #ifdef __i386__
> > >  # define _NSIG_BPW 32
> > > diff --git a/arch/x86/include/uapi/asm/signal.h b/arch/x86/include/uapi/asm/signal.h
> > > index 164a22a72984..60dca62d3dcf 100644
> > > --- a/arch/x86/include/uapi/asm/signal.h
> > > +++ b/arch/x86/include/uapi/asm/signal.h
> > > @@ -60,7 +60,9 @@ typedef unsigned long sigset_t;
> > >
> > >  /* These should not be considered constants from userland. */
> > >  #define SIGRTMIN 32
> > > -#define SIGRTMAX _NSIG
> > > +#define SIGRTMAX 64
> > > +
> > > +#define SIGINFO 65
> > >
> > >  #define SA_RESTORER 0x04000000
> > >
> > > diff --git a/drivers/tty/Makefile b/drivers/tty/Makefile
> > > index a2bd75fbaaa4..d50ba690bb87 100644
> > > --- a/drivers/tty/Makefile
> > > +++ b/drivers/tty/Makefile
> > > @@ -2,7 +2,7 @@
> > >  obj-$(CONFIG_TTY) += tty_io.o n_tty.o tty_ioctl.o tty_ldisc.o \
> > >  	tty_buffer.o tty_port.o tty_mutex.o \
> > >  	tty_ldsem.o tty_baudrate.o tty_jobctrl.o \
> > > -	n_null.o
> > > +	n_null.o tty_status.o
> > >  obj-$(CONFIG_LEGACY_PTYS) += pty.o
> > >  obj-$(CONFIG_UNIX98_PTYS) += pty.o
> > >  obj-$(CONFIG_AUDIT) += tty_audit.o
> > > diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> > > index 0ec93f1a61f5..b510e01289fd 100644
> > > --- a/drivers/tty/n_tty.c
> > > +++ b/drivers/tty/n_tty.c
> > > @@ -1334,6 +1334,24 @@ static void n_tty_receive_char_special(struct tty_struct *tty, unsigned char c)
> > >  			commit_echoes(tty);
> > >  			return;
> > >  		}
> > > +#ifdef VSTATUS
> > > +		if (c == STATUS_CHAR(tty)) {
> > > +			/* Do the status message first and then send
> > > +			 * the signal, otherwise signal delivery can
> > > +			 * change the process state making the status
> > > +			 * message misleading. Also, use __isig() and
> > > +			 * not sig(), as if we flush the tty we can
> > > +			 * lose parts of the message.
> >
> > ...As well as the character input in the canonical mode's built-in line
> > editor.
> >
>
> Yes, good catch. But this is not going to be in the next version of the patch.
>
> > > +			 */
> > > +
> > > +			if (!L_NOKERNINFO(tty))
> > > +				tty_status(tty);
> > > +# if defined(SIGINFO) && SIGINFO != SIGPWR
> > > +			__isig(SIGINFO, tty);
> > > +# endif
> > > +			return;
> > > +		}
> > > +#endif /* VSTATUS */
> > >  		if (c == '\n') {
> > >  			if (L_ECHO(tty) || L_ECHONL(tty)) {
> > >  				echo_char_raw('\n', ldata);
> > > @@ -1763,6 +1781,9 @@ static void n_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
> > >  			set_bit(EOF_CHAR(tty), ldata->char_map);
> > >  			set_bit('\n', ldata->char_map);
> > >  			set_bit(EOL_CHAR(tty), ldata->char_map);
> > > +#ifdef VSTATUS
> > > +			set_bit(STATUS_CHAR(tty), ldata->char_map);
> > > +#endif
> > >  			if (L_IEXTEN(tty)) {
> > >  				set_bit(WERASE_CHAR(tty), ldata->char_map);
> > >  				set_bit(LNEXT_CHAR(tty), ldata->char_map);
> > > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> > > index 6616d4a0d41d..8e488ecba330 100644
> > > --- a/drivers/tty/tty_io.c
> > > +++ b/drivers/tty/tty_io.c
> > > @@ -120,18 +120,26 @@
> > >  #define TTY_PARANOIA_CHECK 1
> > >  #define CHECK_TTY_COUNT 1
> > >
> > > +/* Less ugly than an ifdef in the middle of the initalizer below, maybe? */
> > > +#ifdef NOKERNINFO
> > > +# define __NOKERNINFO NOKERNINFO
> > > +#else
> > > +# define __NOKERNINFO 0
> > > +#endif
> > > +
> > >  struct ktermios tty_std_termios = { /* for the benefit of tty drivers */
> > >  	.c_iflag = ICRNL | IXON,
> > >  	.c_oflag = OPOST | ONLCR,
> > >  	.c_cflag = B38400 | CS8 | CREAD | HUPCL,
> > >  	.c_lflag = ISIG | ICANON | ECHO | ECHOE | ECHOK |
> > > -		   ECHOCTL | ECHOKE | IEXTEN,
> > > +		   ECHOCTL | ECHOKE | IEXTEN | __NOKERNINFO,
> > >  	.c_cc = INIT_C_CC,
> > >  	.c_ispeed = 38400,
> > >  	.c_ospeed = 38400,
> > >  	/* .c_line = N_TTY, */
> > >  };
> > >  EXPORT_SYMBOL(tty_std_termios);
> > > +#undef __NOKERNINFO
> > >
> > >  /* This list gets poked at by procfs and various bits of boot up code. This
> > >   * could do with some rationalisation such as pulling the tty proc function
> > > diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c
> > > index 507a25d692bb..b250eabca1ba 100644
> > > --- a/drivers/tty/tty_ioctl.c
> > > +++ b/drivers/tty/tty_ioctl.c
> > > @@ -809,6 +809,10 @@ int tty_mode_ioctl(struct tty_struct *tty, struct file *file,
> > >  		if (get_user(arg, (unsigned int __user *) arg))
> > >  			return -EFAULT;
> > >  		return tty_change_softcar(real_tty, arg);
> > > +#ifdef TIOCSTAT
> > > +	case TIOCSTAT:
> > > +		return tty_status(real_tty);
> > > +#endif
> > >  	default:
> > >  		return -ENOIOCTLCMD;
> > >  	}
> > > diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> > > new file mode 100644
> >
> > Nitpick: the new functionality is part of n_tty and not the generic tty
> > subsystem, so "tty_status.c" is a misleading name for the new file,
> > unlike e. g. "n_tty_status.c". It has no use in the various modem
> > drivers, for example.
> > Likewise for the tty_status() function.
>
> ACK, will do.
>
> >
> > > index 000000000000..a9600f5bd48c
> > > --- /dev/null
> > > +++ b/drivers/tty/tty_status.c
> > > @@ -0,0 +1,135 @@
> > > +// SPDX-License-Identifier: GPL-1.0+
> > > +/*
> > > + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> > > + *
> > > + */
> > > +
> > > +#include <linux/sched.h>
> > > +#include <linux/mm.h>
> > > +#include <linux/tty.h>
> > > +#include <linux/sched/cputime.h>
> > > +#include <linux/sched/loadavg.h>
> > > +#include <linux/pid.h>
> > > +#include <linux/slab.h>
> > > +#include <linux/math64.h>
> > > +
> > > +#define MSGLEN (160 + TASK_COMM_LEN)
> > > +
> > > +inline unsigned long getRSSk(struct mm_struct *mm)
> > > +{
> > > +	if (mm == NULL)
> > > +		return 0;
> > > +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> > > +}
> > > +
> > > +inline long nstoms(long l)
> > > +{
> > > +	l /= NSEC_PER_MSEC * 10;
> > > +	if (l < 10)
> > > +		l *= 10;
> > > +	return l;
> > > +}
> > > +
> > > +inline struct task_struct *compare(struct task_struct *new,
> > > +				   struct task_struct *old)
> > > +{
> > > +	unsigned int ostate, nstate;
> > > +
> > > +	if (old == NULL)
> > > +		return new;
> > > +
> > > +	ostate = task_state_index(old);
> > > +	nstate = task_state_index(new);
> > > +
> > > +	if (ostate == nstate) {
> > > +		if (old->start_time > new->start_time)
> > > +			return old;
> > > +		return new;
> > > +	}
> > > +
> > > +	if (ostate < nstate)
> > > +		return old;
> > > +
> > > +	return new;
> > > +}
> > > +
> > > +struct task_struct *pick_process(struct pid *pgrp)
> > > +{
> > > +	struct task_struct *p, *winner = NULL;
> > > +
> > > +	read_lock(&tasklist_lock);
> > > +	do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
> > > +		winner = compare(p, winner);
> > > +	} while_each_pid_task(pgrp, PIDTYPE_PGID, p);
> > > +	read_unlock(&tasklist_lock);
> > > +
> > > +	return winner;
> > > +}
> > > +
> > > +int tty_status(struct tty_struct *tty)
> > > +{
> > > +	char tname[TASK_COMM_LEN];
> > > +	unsigned long loadavg[3];
> > > +	uint64_t pcpu, cputime, wallclock;
> > > +	struct task_struct *p;
> > > +	struct rusage rusage;
> > > +	struct timespec64 utime, stime, rtime;
> > > +	char msg[MSGLEN] = {0};
> > > +	int len = 0;
> > > +
> > > +	if (tty == NULL)
> > > +		return -ENOTTY;
> > > +
> > > +	get_avenrun(loadavg, FIXED_1/200, 0);
> > > +	len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu ",
> > > +			LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
> > > +
> > > +	if (tty->ctrl.session == NULL) {
> > > +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +				"not a controlling terminal");
> > > +		goto print;
> > > +	}
> > > +
> > > +	if (tty->ctrl.pgrp == NULL) {
> > > +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +				"no foreground process group");
> > > +		goto print;
> > > +	}
> > > +
> > > +	p = pick_process(tty->ctrl.pgrp);
> > > +	if (p == NULL) {
> > > +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +				"empty foreground process group");
> > > +		goto print;
> > > +	}
> > > +
> > > +	get_task_comm(tname, p);
> > > +	getrusage(p, RUSAGE_BOTH, &rusage);
> > > +	wallclock = ktime_get_ns() - p->start_time;
> > > +
> > > +	utime.tv_sec = rusage.ru_utime.tv_sec;
> > > +	utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
> > > +	stime.tv_sec = rusage.ru_stime.tv_sec;
> > > +	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> > > +	rtime = ns_to_timespec64(wallclock);
> > > +
> > > +	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> > > +	pcpu = div64_u64(cputime * 100, wallclock);
> > > +
> > > +	len += scnprintf((char *)&msg[len], MSGLEN - len,
> > > +			/* task, PID, task state */
> > > +			"cmd: %s %d [%s] "
> > > +			/* rtime, utime, stime, %cpu, rss */
> > > +			"%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> > > +			tname, task_pid_vnr(p), (char *)get_task_state_name(p),
> > > +			rtime.tv_sec, nstoms(rtime.tv_nsec),
> > > +			utime.tv_sec, nstoms(utime.tv_nsec),
> > > +			stime.tv_sec, nstoms(stime.tv_nsec),
> > > +			pcpu, getRSSk(p->mm));
> > > +
> > > +print:
> > > +	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> > > +	tty_write_message(tty, msg);
> > tty_write_message() is quite risky to use; while writing my
> > implementation a couple of years ago I've found it easy to accidentally
> > set up deadlocks with this interface — in particular if the function is
> > called from the tty character receive path.
> > I hope you're testing the functionality with CONFIG_PROVE_LOCKING enabled.
>
> I have not, but I will.
>
> Is there a different 'put a message on this tty' api I should be using?

There was none at the time; unfortunately, as of v5.15 it looks like
there's still none.

Please see 6/7 of the following series:
https://lore.kernel.org/lkml/20200430064301.1099452-7-ar@cs.msu.ru/

I had to do that, then use it like this from the line discipline in 7/7
(copy-paste from the series with new notes):

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index f72a3fd4b..905cdd985 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -2489,6 +2496,21 @@ static int n_tty_ioctl(struct tty_struct *tty, struct file *file,
 	}
 }

+static void n_tty_status_line(struct tty_struct *tty)
+{
+	/* private data! can't move this to another file. */
+	struct n_tty_data *ldata = tty->disc_data;
+	char *msg, *buf;
+	msg = buf = kzalloc(STATUS_LINE_LEN, GFP_KERNEL);
+	tty_sprint_status_line(tty, buf + 1, STATUS_LINE_LEN - 1);
+	/* The only current caller of this takes output_lock for us. */
+	if (ldata->column != 0)
+		*msg = '\n';
+	else
+		msg++;
+	/* a call to the new function */
+	do_n_tty_write(tty, NULL, msg, strlen(msg));
+	kfree(buf);
+}
+
 static struct tty_ldisc_ops n_tty_ops = {
 	.magic = TTY_LDISC_MAGIC,
 	.name = "n_tty",

The tty_sprint_status_line() is defined in n_tty_status.c, it produces
the line at a buf+len.
Also, unlike in arguments of tty_write_message() which bypasses the
ldisc, '\n' gets translated by the line discipline to '\r\n'
automatically if relevant termios flags are set. Also if the cursor is
not at the first column of the current row, there is an automatic
newline.

> Thanks.
>
> >
> > > +
> > > +	return 0;
> > > +}
> > > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > > index f37c03077b58..eb14306cdde2 100644
> > > --- a/fs/proc/array.c
> > > +++ b/fs/proc/array.c
> > > @@ -62,6 +62,7 @@
> > >  #include <linux/tty.h>
> > >  #include <linux/string.h>
> > >  #include <linux/mman.h>
> > > +#include <linux/sched.h>
> > >  #include <linux/sched/mm.h>
> > >  #include <linux/sched/numa_balancing.h>
> > >  #include <linux/sched/task_stack.h>
> > > @@ -111,34 +112,6 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
> > >  	seq_printf(m, "%.64s", tcomm);
> > >  }
> > >
> > > -/*
> > > - * The task state array is a strange "bitmap" of
> > > - * reasons to sleep. Thus "running" is zero, and
> > > - * you can test for combinations of others with
> > > - * simple bit tests.
> > > - */
> > > -static const char * const task_state_array[] = {
> > > -
> > > -	/* states in TASK_REPORT: */
> > > -	"R (running)",	/* 0x00 */
> > > -	"S (sleeping)",	/* 0x01 */
> > > -	"D (disk sleep)",	/* 0x02 */
> > > -	"T (stopped)",	/* 0x04 */
> > > -	"t (tracing stop)",	/* 0x08 */
> > > -	"X (dead)",	/* 0x10 */
> > > -	"Z (zombie)",	/* 0x20 */
> > > -	"P (parked)",	/* 0x40 */
> > > -
> > > -	/* states beyond TASK_REPORT: */
> > > -	"I (idle)",	/* 0x80 */
> > > -};
> > > -
> > > -static inline const char *get_task_state(struct task_struct *tsk)
> > > -{
> > > -	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > > -	return task_state_array[task_state_index(tsk)];
> > > -}
> > > -
> > >  static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
> > >  				struct pid *pid, struct task_struct *p)
> > >  {
> > > diff --git a/include/asm-generic/termios.h b/include/asm-generic/termios.h
> > > index b1398d0d4a1d..9b080e1a82d4 100644
> > > --- a/include/asm-generic/termios.h
> > > +++ b/include/asm-generic/termios.h
> > > @@ -10,9 +10,9 @@
> > >  	eof=^D vtime=\0 vmin=\1 sxtc=\0
> > >  	start=^Q stop=^S susp=^Z eol=\0
> > >  	reprint=^R discard=^U werase=^W lnext=^V
> > > -	eol2=\0
> > > +	eol2=\0 status=^T
> > >  */
> > > -#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
> > > +#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0\024"
> > >
> > >  /*
> > >   * Translate a "termio" structure into a "termios". Ugh.
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index c1a927ddec64..2171074ec8f5 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -70,7 +70,7 @@ struct task_group;
> > >
> > >  /*
> > >   * Task state bitmask. NOTE! These bits are also
> > > - * encoded in fs/proc/array.c: get_task_state().
> > > + * encoded in get_task_state().
> > >   *
> > >   * We have two separate sets of flags: task->state
> > >   * is about runnability, while task->exit_state are
> > > @@ -1643,6 +1643,56 @@ static inline char task_state_to_char(struct task_struct *tsk)
> > >  	return task_index_to_char(task_state_index(tsk));
> > >  }
> > >
> > > +static inline const char *get_task_state_name(struct task_struct *tsk)
> > > +{
> > > +	static const char * const task_state_array[] = {
> > > +
> > > +		/* states in TASK_REPORT: */
> > > +		"running",	/* 0x00 */
> > > +		"sleeping",	/* 0x01 */
> > > +		"disk sleep",	/* 0x02 */
> > > +		"stopped",	/* 0x04 */
> > > +		"tracing stop",	/* 0x08 */
> > > +		"dead",	/* 0x10 */
> > > +		"zombie",	/* 0x20 */
> > > +		"parked",	/* 0x40 */
> > > +
> > > +		/* states beyond TASK_REPORT: */
> > > +		"idle",	/* 0x80 */
> > > +	};
> > > +
> > > +	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > > +	return task_state_array[task_state_index(tsk)];
> > > +}
> > > +
> > > +static inline const char *get_task_state(struct task_struct *tsk)
> > > +{
> > > +	/*
> > > +	 * The task state array is a strange "bitmap" of
> > > +	 * reasons to sleep. Thus "running" is zero, and
> > > +	 * you can test for combinations of others with
> > > +	 * simple bit tests.
> > > +	 */
> > > +	static const char * const task_state_array[] = {
> > > +
> > > +		/* states in TASK_REPORT: */
> > > +		"R (running)",	/* 0x00 */
> > > +		"S (sleeping)",	/* 0x01 */
> > > +		"D (disk sleep)",	/* 0x02 */
> > > +		"T (stopped)",	/* 0x04 */
> > > +		"t (tracing stop)",	/* 0x08 */
> > > +		"X (dead)",	/* 0x10 */
> > > +		"Z (zombie)",	/* 0x20 */
> > > +		"P (parked)",	/* 0x40 */
> > > +
> > > +		/* states beyond TASK_REPORT: */
> > > +		"I (idle)",	/* 0x80 */
> > > +	};
> > > +
> > > +	BUILD_BUG_ON(1 + ilog2(TASK_REPORT_MAX) != ARRAY_SIZE(task_state_array));
> > > +	return task_state_array[task_state_index(tsk)];
> > > +}
> > > +
> > >  /**
> > >   * is_global_init - check if a task structure is init. Since init
> > >   * is free to have sub-threads we need to check tgid.
> > > diff --git a/include/linux/signal.h b/include/linux/signal.h
> > > index b77f9472a37c..76bda1a20578 100644
> > > --- a/include/linux/signal.h
> > > +++ b/include/linux/signal.h
> > > @@ -541,6 +541,7 @@ extern bool unhandled_signal(struct task_struct *tsk, int sig);
> > >   * | non-POSIX signal   | default action   |
> > >   * +--------------------+------------------+
> > >   * | SIGEMT             | coredump         |
> > > + * | SIGINFO            | ignore           |
> > >   * +--------------------+------------------+
> > >   *
> > >   * (+) For SIGKILL and SIGSTOP the action is "always", not just "default".
> > > @@ -567,6 +568,9 @@ static inline int sig_kernel_ignore(unsigned long sig)
> > >  	return sig == SIGCONT ||
> > >  		sig == SIGCHLD ||
> > >  		sig == SIGWINCH ||
> > > +#if defined(SIGINFO) && SIGINFO != SIGPWR
> > > +		sig == SIGINFO ||
> > > +#endif
> > >  		sig == SIGURG;
> > >  }
> > >
> > > diff --git a/include/linux/tty.h b/include/linux/tty.h
> > > index 168e57e40bbb..943d85aa471c 100644
> > > --- a/include/linux/tty.h
> > > +++ b/include/linux/tty.h
> > > @@ -49,6 +49,9 @@
> > >  #define WERASE_CHAR(tty) ((tty)->termios.c_cc[VWERASE])
> > >  #define LNEXT_CHAR(tty) ((tty)->termios.c_cc[VLNEXT])
> > >  #define EOL2_CHAR(tty) ((tty)->termios.c_cc[VEOL2])
> > > +#ifdef VSTATUS
> > > +#define STATUS_CHAR(tty) ((tty)->termios.c_cc[VSTATUS])
> > > +#endif
> > >
> > >  #define _I_FLAG(tty, f) ((tty)->termios.c_iflag & (f))
> > >  #define _O_FLAG(tty, f) ((tty)->termios.c_oflag & (f))
> > > @@ -114,6 +117,9 @@
> > >  #define L_PENDIN(tty) _L_FLAG((tty), PENDIN)
> > >  #define L_IEXTEN(tty) _L_FLAG((tty), IEXTEN)
> > >  #define L_EXTPROC(tty) _L_FLAG((tty), EXTPROC)
> > > +#ifdef NOKERNINFO
> > > +#define L_NOKERNINFO(tty) _L_FLAG((tty), NOKERNINFO)
> > > +#endif
> > >
> > >  struct device;
> > >  struct signal_struct;
> > > @@ -428,4 +434,6 @@ extern void tty_lock_slave(struct tty_struct *tty);
> > >  extern void tty_unlock_slave(struct tty_struct *tty);
> > >  extern void tty_set_lock_subclass(struct tty_struct *tty);
> > >
> > > +extern int tty_status(struct tty_struct *tty);
> > > +
> > >  #endif
> > > diff --git a/include/uapi/asm-generic/ioctls.h b/include/uapi/asm-generic/ioctls.h
> > > index cdc9f4ca8c27..baa2b8d42679 100644
> > > --- a/include/uapi/asm-generic/ioctls.h
> > > +++ b/include/uapi/asm-generic/ioctls.h
> > > @@ -97,6 +97,8 @@
> > >
> > >  #define TIOCMIWAIT 0x545C /* wait for a change on serial input line(s) */
> > >  #define TIOCGICOUNT 0x545D /* read serial port inline interrupt counts */
> > > +/* Some architectures use 0x545E for FIOQSIZE */
> > > +#define TIOCSTAT 0x545F /* display process group stats on tty */
> > >
> > >  /*
> > >   * Some arches already define FIOQSIZE due to a historical
> > > diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h
> > > index 3c4cc9b8378e..0b771eb1db94 100644
> > > --- a/include/uapi/asm-generic/signal.h
> > > +++ b/include/uapi/asm-generic/signal.h
> > > @@ -4,7 +4,7 @@
> > >
> > >  #include <linux/types.h>
> > >
> > > -#define _NSIG 64
> > > +#define _NSIG 65
> > >  #define _NSIG_BPW __BITS_PER_LONG
> > >  #define _NSIG_WORDS ((_NSIG + _NSIG_BPW - 1) / _NSIG_BPW)
> > >
> > > @@ -49,9 +49,11 @@
> > >  /* These should not be considered constants from userland. */
> > >  #define SIGRTMIN 32
> > >  #ifndef SIGRTMAX
> > > -#define SIGRTMAX _NSIG
> > > +#define SIGRTMAX 64
> > >  #endif
> > >
> > > +#define SIGINFO 65
> > > +
> > >  #if !defined MINSIGSTKSZ || !defined SIGSTKSZ
> > >  #define MINSIGSTKSZ 2048
> > >  #define SIGSTKSZ 8192
> > > diff --git a/include/uapi/asm-generic/termbits.h b/include/uapi/asm-generic/termbits.h
> > > index 2fbaf9ae89dd..cb4e9c6d629f 100644
> > > --- a/include/uapi/asm-generic/termbits.h
> > > +++ b/include/uapi/asm-generic/termbits.h
> > > @@ -58,6 +58,7 @@ struct ktermios {
> > >  #define VWERASE 14
> > >  #define VLNEXT 15
> > >  #define VEOL2 16
> > > +#define VSTATUS 17
> > >
> > >  /* c_iflag bits */
> > >  #define IGNBRK 0000001
> > > @@ -164,22 +165,23 @@ struct ktermios {
> > >  #define IBSHIFT 16 /* Shift from CBAUD to CIBAUD */
> > >
> > >  /* c_lflag bits */
> > > -#define ISIG 0000001
> > > -#define ICANON 0000002
> > > -#define XCASE 0000004
> > > -#define ECHO 0000010
> > > -#define ECHOE 0000020
> > > -#define ECHOK 0000040
> > > -#define ECHONL 0000100
> > > -#define NOFLSH 0000200
> > > -#define TOSTOP 0000400
> > > -#define ECHOCTL 0001000
> > > -#define ECHOPRT 0002000
> > > -#define ECHOKE 0004000
> > > -#define FLUSHO 0010000
> > > -#define PENDIN 0040000
> > > -#define IEXTEN 0100000
> > > -#define EXTPROC 0200000
> > > +#define ISIG 0000001
> > > +#define ICANON 0000002
> > > +#define XCASE 0000004
> > > +#define ECHO 0000010
> > > +#define ECHOE 0000020
> > > +#define ECHOK 0000040
> > > +#define ECHONL 0000100
> > > +#define NOFLSH 0000200
> > > +#define TOSTOP 0000400
> > > +#define ECHOCTL 0001000
> > > +#define ECHOPRT 0002000
> > > +#define ECHOKE 0004000
> > > +#define FLUSHO 0010000
> > > +#define PENDIN 0040000
> > > +#define IEXTEN 0100000
> > > +#define EXTPROC 0200000
> > > +#define NOKERNINFO 0400000
> > >
> > >  /* tcflow() and TCXONC use these */
> > >  #define TCOOFF 0
> > > --
> > > 2.30.2
> > >

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related [flat|nested] 21+ messages in thread
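
Note on the n_tty_status_line() snippet in the reply above: the status text is formatted one byte into the buffer so that a '\n' can be put in front of it when the cursor is not in column 0. The following is only a user-space model of that buffer handling, under stated assumptions: status_line_model() and the STATUS_LINE_LEN value are hypothetical, the status string is passed in rather than produced by tty_sprint_status_line(), and the caller prints or inspects the result instead of calling do_n_tty_write().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STATUS_LINE_LEN 160	/* assumed size, for illustration only */

/*
 * User-space model of the buffer handling in the quoted
 * n_tty_status_line(). The text is formatted at buf + 1 so that one
 * byte stays free in front of it. If the cursor is not in column 0,
 * that byte becomes '\n' and the message starts there; otherwise the
 * message starts at buf + 1.
 *
 * Returns the allocation (to be released with free()) and stores the
 * start of the text to emit in *msg. Returns NULL if allocation fails.
 */
char *status_line_model(unsigned int column, const char *status, char **msg)
{
	char *buf;

	assert(status != NULL && msg != NULL);

	buf = calloc(1, STATUS_LINE_LEN);
	if (!buf)
		return NULL;

	snprintf(buf + 1, STATUS_LINE_LEN - 1, "%s", status);
	if (column != 0) {
		buf[0] = '\n';
		*msg = buf;
	} else {
		*msg = buf + 1;
	}
	return buf;
}
```

Unlike the quoted snippet, this model checks the allocation result before writing into the buffer. The pointer returned is always the start of the allocation, so the caller frees the right address regardless of which branch was taken, which is the same reason the quoted code keeps buf and msg separate.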
* Re: [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO
2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond
2022-01-04 7:27 ` Greg Kroah-Hartman
2022-01-07 21:48 ` Arseny Maslennikov
@ 2022-01-08 14:38 ` Arseny Maslennikov
2 siblings, 0 replies; 21+ messages in thread
From: Arseny Maslennikov @ 2022-01-08 14:38 UTC (permalink / raw)
To: Walt Drummond
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Daniel Bristot de Oliveira,
linux-kernel, linux-fsdevel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 5928 bytes --]

On Mon, Jan 03, 2022 at 10:19:56AM -0800, Walt Drummond wrote:
> Support TTY VSTATUS character, NOKERNINFO local control bit and the
> signal SIGINFO, all as in 4.3BSD.
>
> Signed-off-by: Walt Drummond <walt@drummond.us>
> ---
>  arch/x86/include/asm/signal.h       |   2 +-
>  arch/x86/include/uapi/asm/signal.h  |   4 +-
>  drivers/tty/Makefile                |   2 +-
>  drivers/tty/n_tty.c                 |  21 +++++
>  drivers/tty/tty_io.c                |  10 ++-
>  drivers/tty/tty_ioctl.c             |   4 +
>  drivers/tty/tty_status.c            | 135 ++++++++++++++++++++++++++++
>  fs/proc/array.c                     |  29 +-----
>  include/asm-generic/termios.h       |   4 +-
>  include/linux/sched.h               |  52 ++++++++++-
>  include/linux/signal.h              |   4 +
>  include/linux/tty.h                 |   8 ++
>  include/uapi/asm-generic/ioctls.h   |   2 +
>  include/uapi/asm-generic/signal.h   |   6 +-
>  include/uapi/asm-generic/termbits.h |  34 +++----
>  15 files changed, 264 insertions(+), 53 deletions(-)
>  create mode 100644 drivers/tty/tty_status.c
>
> <...>
>
> diff --git a/drivers/tty/tty_status.c b/drivers/tty/tty_status.c
> new file mode 100644
> index 000000000000..a9600f5bd48c
> --- /dev/null
> +++ b/drivers/tty/tty_status.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: GPL-1.0+
> +/*
> + * tty_status.c --- implements VSTATUS and TIOCSTAT from BSD4.3/4.4
> + *
> + */
> +
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/tty.h>
> +#include <linux/sched/cputime.h>
> +#include <linux/sched/loadavg.h>
> +#include <linux/pid.h>
> +#include <linux/slab.h>
> +#include <linux/math64.h>
> +
> +#define MSGLEN (160 + TASK_COMM_LEN)
> +
> +inline unsigned long getRSSk(struct mm_struct *mm)
> +{
> +	if (mm == NULL)
> +		return 0;
> +	return get_mm_rss(mm) * PAGE_SIZE / 1024;
> +}
> +
> +inline long nstoms(long l)
> +{
> +	l /= NSEC_PER_MSEC * 10;
> +	if (l < 10)
> +		l *= 10;
> +	return l;
> +}
> +
> +inline struct task_struct *compare(struct task_struct *new,
> +				   struct task_struct *old)
> +{
> +	unsigned int ostate, nstate;
> +
> +	if (old == NULL)
> +		return new;
> +
> +	ostate = task_state_index(old);
> +	nstate = task_state_index(new);
> +
> +	if (ostate == nstate) {
> +		if (old->start_time > new->start_time)
> +			return old;
> +		return new;
> +	}
> +
> +	if (ostate < nstate)
> +		return old;
> +
> +	return new;
> +}
> +
> +struct task_struct *pick_process(struct pid *pgrp)
> +{
> +	struct task_struct *p, *winner = NULL;
> +
> +	read_lock(&tasklist_lock);
> +	do_each_pid_task(pgrp, PIDTYPE_PGID, p) {
> +		winner = compare(p, winner);
> +	} while_each_pid_task(pgrp, PIDTYPE_PGID, p);
> +	read_unlock(&tasklist_lock);
> +
> +	return winner;
> +}
> +
> +int tty_status(struct tty_struct *tty)
> +{
> +	char tname[TASK_COMM_LEN];
> +	unsigned long loadavg[3];
> +	uint64_t pcpu, cputime, wallclock;
> +	struct task_struct *p;
> +	struct rusage rusage;
> +	struct timespec64 utime, stime, rtime;
> +	char msg[MSGLEN] = {0};
> +	int len = 0;
> +
> +	if (tty == NULL)
> +		return -ENOTTY;
> +
> +	get_avenrun(loadavg, FIXED_1/200, 0);
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "load: %lu.%02lu ",
> +			LOAD_INT(loadavg[0]), LOAD_FRAC(loadavg[0]));
> +
> +	if (tty->ctrl.session == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				"not a controlling terminal");
> +		goto print;
> +	}
> +
> +	if (tty->ctrl.pgrp == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				"no foreground process group");
> +		goto print;
> +	}
> +
> +	p = pick_process(tty->ctrl.pgrp);
> +	if (p == NULL) {
> +		len += scnprintf((char *)&msg[len], MSGLEN - len,
> +				"empty foreground process group");
> +		goto print;
> +	}
> +
> +	get_task_comm(tname, p);
> +	getrusage(p, RUSAGE_BOTH, &rusage);
> +	wallclock = ktime_get_ns() - p->start_time;
> +
> +	utime.tv_sec = rusage.ru_utime.tv_sec;
> +	utime.tv_nsec = rusage.ru_utime.tv_usec * NSEC_PER_USEC;
> +	stime.tv_sec = rusage.ru_stime.tv_sec;
> +	stime.tv_nsec = rusage.ru_stime.tv_usec * NSEC_PER_USEC;
> +	rtime = ns_to_timespec64(wallclock);
> +
> +	cputime = timespec64_to_ns(&utime) + timespec64_to_ns(&stime);
> +	pcpu = div64_u64(cputime * 100, wallclock);
> +
> +	len += scnprintf((char *)&msg[len], MSGLEN - len,
> +			/* task, PID, task state */
> +			"cmd: %s %d [%s] "
> +			/* rtime, utime, stime, %cpu, rss */
> +			"%llu.%02lur %llu.%02luu %llu.%02lus %llu%% %luk",
> +			tname, task_pid_vnr(p), (char *)get_task_state_name(p),

task_pid_vnr(p) returns the PID of p in the PID namespace of current:

pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
			struct pid_namespace *ns)
{
	pid_t nr = 0;

	rcu_read_lock();
	if (!ns)
		ns = task_active_pid_ns(current);
	nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
	rcu_read_unlock();

	return nr;
}

struct pid_namespace *task_active_pid_ns(struct task_struct *tsk)
{
	return ns_of_pid(task_pid(tsk));
}

static inline pid_t task_pid_vnr(struct task_struct *tsk)
{
	return __task_pid_nr_ns(tsk, PIDTYPE_PID, NULL);
}

At this point current is an arbitrary kernel worker thread, not p.
Most likely we need another helper function in <linux/sched.h>.

> +			rtime.tv_sec, nstoms(rtime.tv_nsec),
> +			utime.tv_sec, nstoms(utime.tv_nsec),
> +			stime.tv_sec, nstoms(stime.tv_nsec),
> +			pcpu, getRSSk(p->mm));
> +
> +print:
> +	len += scnprintf((char *)&msg[len], MSGLEN - len, "\r\n");
> +	tty_write_message(tty, msg);
> +
> +	return 0;
> +}
>
> <...>
>
> --
> 2.30.2
>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply [flat|nested] 21+ messages in thread
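
Note on the namespace concern raised in the message above: the number printed for a task depends on which PID namespace is handed to the lookup, so a lookup that defaults to the namespace of current can print a different number than the one seen from the terminal's session. The following is a toy user-space model, loosely patterned after how the kernel keeps one number per namespace level in struct pid; every name in it (ns_model, upid_model, pid_model, pid_nr_in_ns, MAX_LEVELS) is hypothetical and exists only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 4	/* arbitrary bound for this model */

struct ns_model {
	unsigned int level;	/* nesting depth, 0 is the initial namespace */
};

struct upid_model {
	int nr;			/* PID value as seen from ns */
	const struct ns_model *ns;
};

struct pid_model {
	unsigned int level;	/* deepest namespace this PID lives in */
	struct upid_model numbers[MAX_LEVELS];
};

/*
 * Model of a per-namespace PID lookup: a task has one number for each
 * namespace level it is visible from, and 0 is returned when the task
 * is not visible from ns at all.
 */
int pid_nr_in_ns(const struct pid_model *pid, const struct ns_model *ns)
{
	assert(pid != NULL && ns != NULL);
	assert(pid->level < MAX_LEVELS);

	if (ns->level <= pid->level && pid->numbers[ns->level].ns == ns)
		return pid->numbers[ns->level].nr;
	return 0;
}
```

In this model a task that is PID 7 inside a child namespace and PID 4242 in the initial namespace yields 7, 4242 or 0 depending only on the namespace argument, which is why the choice of namespace for the status line needs to be made explicitly rather than inherited from whatever context happens to be running.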
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-03 18:19 [RFC PATCH 0/8] signals: Support more than 64 signals Walt Drummond 2022-01-03 18:19 ` [RFC PATCH 6/8] signals: Round up _NSIG_WORDS Walt Drummond 2022-01-03 18:19 ` [RFC PATCH 8/8] signals: Support BSD VSTATUS, KERNINFO and SIGINFO Walt Drummond @ 2022-01-03 18:48 ` Al Viro 2022-01-04 1:00 ` Walt Drummond 2022-01-04 18:00 ` Eric W. Biederman 3 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2022-01-03 18:48 UTC (permalink / raw) To: Walt Drummond Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module On Mon, Jan 03, 2022 at 10:19:48AM -0800, Walt Drummond wrote: > This patch set expands the number of signals in Linux beyond the > current cap of 64. It sets a new cap at the somewhat arbitrary limit > of 1024 signals, both because it’s what GLibc and MUSL support and > because many architectures pad sigset_t or ucontext_t in the kernel to > this cap. This limit is not fixed and can be further expanded within > reason. Could you explain the point of the entire exercise? Why do we need more rt signals in the first place? glibc has quite a bit of utterly pointless future-proofing. So "they allow more" is not a good reason - not without a plausible use-case, at least. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-03 18:48 ` [RFC PATCH 0/8] signals: Support more than 64 signals Al Viro @ 2022-01-04 1:00 ` Walt Drummond 2022-01-04 1:16 ` Al Viro 0 siblings, 1 reply; 21+ messages in thread From: Walt Drummond @ 2022-01-04 1:00 UTC (permalink / raw) To: Al Viro Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module I simply wanted SIGINFO and VSTATUS, and that necessitated this. If the limit of 1024 rt signals is an issue, that's an extremely simple change to make. On Mon, Jan 3, 2022 at 10:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Mon, Jan 03, 2022 at 10:19:48AM -0800, Walt Drummond wrote: > > This patch set expands the number of signals in Linux beyond the > > current cap of 64. It sets a new cap at the somewhat arbitrary limit > > of 1024 signals, both because it’s what GLibc and MUSL support and > > because many architectures pad sigset_t or ucontext_t in the kernel to > > this cap. This limit is not fixed and can be further expanded within > > reason. > > Could you explain the point of the entire exercise? Why do we need more > rt signals in the first place? > > glibc has quite a bit of utterly pointless future-proofing. So "they > allow more" is not a good reason - not without a plausible use-case, > at least. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 1:00 ` Walt Drummond @ 2022-01-04 1:16 ` Al Viro 2022-01-04 1:49 ` Al Viro 0 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2022-01-04 1:16 UTC (permalink / raw) To: Walt Drummond Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module On Mon, Jan 03, 2022 at 05:00:58PM -0800, Walt Drummond wrote: > I simply wanted SIGINFO and VSTATUS, and that necessitated this. Elaborate, please. What exactly requires more than 32 rt signals? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 1:16 ` Al Viro @ 2022-01-04 1:49 ` Al Viro 0 siblings, 0 replies; 21+ messages in thread From: Al Viro @ 2022-01-04 1:49 UTC (permalink / raw) To: Walt Drummond Cc: aacraid, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module On Tue, Jan 04, 2022 at 01:16:17AM +0000, Al Viro wrote: > On Mon, Jan 03, 2022 at 05:00:58PM -0800, Walt Drummond wrote: > > I simply wanted SIGINFO and VSTATUS, and that necessitated this. > > Elaborate, please. What exactly requires more than 32 rt signals? More to the point, which system had SIGINFO >= SIGRTMIN? Or signals with numbers greater than SIGRTMAX, for that matter? I really don't get it... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
2022-01-03 18:19 [RFC PATCH 0/8] signals: Support more than 64 signals Walt Drummond
` (2 preceding siblings ...)
2022-01-03 18:48 ` [RFC PATCH 0/8] signals: Support more than 64 signals Al Viro
@ 2022-01-04 18:00 ` Eric W. Biederman
2022-01-04 20:52 ` Theodore Ts'o
3 siblings, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2022-01-04 18:00 UTC (permalink / raw)
To: Walt Drummond
Cc: aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever,
bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert,
gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris,
bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook,
mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini,
peterz, rth, richard, serge, rostedt, tglx, trond.myklebust,
vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha,
linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs,
linux-scsi, linux-security-module

Walt Drummond <walt@drummond.us> writes:

> This patch set expands the number of signals in Linux beyond the
> current cap of 64. It sets a new cap at the somewhat arbitrary limit
> of 1024 signals, both because it’s what GLibc and MUSL support and
> because many architectures pad sigset_t or ucontext_t in the kernel to
> this cap. This limit is not fixed and can be further expanded within
> reason.

Ahhhh!!

Please let's not expand the number of signals supported if there is any
alternative. Signals only really make sense for supporting existing
interfaces. For new applications there is almost always something
better.

In the last discussion of adding SIGINFO
https://lore.kernel.org/lkml/20190625161153.29811-1-ar@cs.msu.ru/ the
approach examined was to fix SIGPWR to be ignored by default and to
define SIGINFO as SIGPWR.

I dug through the previous conversations and there is a little debate
about what makes sense for SIGPWR to do by default. Alan Cox remembered
SIGPWR was sent when the power was restored, so ignoring SIGPWR by
default made sense. Ted Tso pointed out a different scenario where it
was reasonable for SIGPWR to be a terminating signal.

So far no one has actually found any applications that will regress if
SIGPWR becomes ignored by default. Furthermore on linux SIGPWR is only
defined to be sent to init, and init ignores all signals by default so
in practice SIGPWR is ignored by the only process that receives it
currently.

I am persuaded at least enough that I could see adding a patch to
linux-next and then sending it to Linus that could be reverted if
anything broke.

Where I saw the last conversation falter was in making a persuasive
case of why SIGINFO was interesting to add. Given a world of ssh
connections I expect a persuasive case can be made. Especially if there
are a handful of utilities where it is already implemented that just
need to be built with SIGINFO defined.

> - Add BSD SIGINFO (and VSTATUS) as a test.

If your actual point is not to implement SIGINFO and you really have
another use case for expanding sigset_t please make it clear.

Without seeing the persuasive case for more signals I have to say that
adding more signals to the kernel sounds like a bad idea.

Eric

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
2022-01-04 18:00 ` Eric W. Biederman
@ 2022-01-04 20:52 ` Theodore Ts'o
2022-01-04 21:33 ` Walt Drummond
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Theodore Ts'o @ 2022-01-04 20:52 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Walt Drummond, aacraid, viro, anna.schumaker, arnd, bsegall, bp,
chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann,
dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink,
jejb, jmorris, bfields, jlayton, jirislaby, john.johansen,
juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman,
oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx,
trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel,
kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
linux-mtd, linux-nfs, linux-scsi, linux-security-module

On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> I dug through the previous conversations and there is a little debate
> about what makes sense for SIGPWR to do by default. Alan Cox remembered
> SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> default made sense. Ted Tso pointed out a different scenario where it
> was reasonable for SIGPWR to be a terminating signal.
>
> So far no one has actually found any applications that will regress if
> SIGPWR becomes ignored by default. Furthermore on linux SIGPWR is only
> defined to be sent to init, and init ignores all signals by default so
> in practice SIGPWR is ignored by the only process that receives it
> currently.

As it turns out, systemd does *not* ignore SIGPWR. Instead, it will
initiate the sigpwr target. From the systemd.special man page:

    sigpwr.target
        A special target that is started when systemd receives the
        SIGPWR process signal, which is normally sent by the kernel
        or UPS daemons when power fails.

And child processes of systemd are not ignoring SIGPWR. Instead, they
are getting terminated.

<tytso@cwcc>
41% /bin/sleep 50 &
[1] 180671
<tytso@cwcc>
42% kill -PWR 180671
[1]+ Power failure /bin/sleep 50

> Where I saw the last conversation falter was in making a persuasive
> case of why SIGINFO was interesting to add. Given a world of ssh
> connections I expect a persuasive case can be made. Especially if there
> are a handful of utilities where it is already implemented that just
> need to be built with SIGINFO defined.

One thing that's perhaps worth disentangling is the value of
supporting VSTATUS --- which is a control character much like VINTR
(^C) or VQUIT (control backslash) which is set via the c_cc[] array in
the termios structure. Quoting from the termios man page:

    VSTATUS
        (not in POSIX; not supported under Linux; status
        request: 024, DC4, Ctrl-T). Status character (STATUS).
        Display status information at terminal, including state
        of foreground process and amount of CPU time it has
        consumed. Also sends a SIGINFO signal (not supported on
        Linux) to the foreground process group.

The basic idea is that when you type C-t, you can find out information
about the currently running process. This is a feature that
originally comes from TOPS-10's TENEX operating system, and it is
supported today on FreeBSD and Mac OS. For example, it might display
something like this:

    load: 2.39 cmd: ping 5374 running 0.00u 0.00s

The reason why SIGINFO is sent to the foreground process group is that
it gives the process an opportunity to print application-specific
information about the currently running process. For example, maybe
the C compiler could print something like "parsing 2042 of 5000 header
files", or some such. :-)

There are people who wish that Linux supported Control-T / VSTATUS;
for example, just last week, on TUHS, the Unix greybeards list, there
were two such heartfelt wishes for Control-T support from two such
greybeards:

    "It's my biggest annoyance with Linux that it doesn't [support
    control-t]"
    - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html

    "I personally can't stand using Linux, even casually for a very
    short sys-admin task, because of this missing feature"
    - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html

I claim, though, that we could implement VSTATUS without implementing
the SIGINFO part of the feature. Precious few applications *ever*
implemented SIGINFO signal handlers so they could give status
information, and SIGINFO is the hard part, since we don't have any
spare signals left. If we were to repurpose some lesser used signal,
whether it be SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there
might be some userspace program (such as a UPS monitoring program
which wants to trigger power fail handling, or a userspace NFSv4
process that wants to signal that it was unable to recover a file's
file lock after a server reboot), and if we try to take over the
signal assignment, it's possible that we might get surprised.
Furthermore, all of the possibly unused signals that we might try to
reclaim terminate the process by default, and SIGINFO *has* to have a
default signal handling action of Ignore, since otherwise typing
Control-T will end up killing the current foreground application.

Personally, I don't care all that much about VSTATUS support --- I
used it when I was in university, but honestly, I've never missed it.
But if there is someone who wants to try to implement VSTATUS, and
make some Unix greybeards happy, and maybe even switch from FreeBSD to
Linux as a result, go wild. I'm not convinced, though, that adding
the SIGINFO part of the support is worth the effort.

Not only do almost no programs implement SIGINFO support, a lot of CPU
bound programs where this might be actually useful end up running a
large number of processes in parallel. Take the "parsing 2042 of 5000
header files" example I gave above. Consider what would happen if gcc
implemented support for SIGINFO, but the user was running a "make -j
16" and typed Control-T. The result would be chaos!

So if you really miss Control-T, and it's the only thing holding back
a few FreeBSD users from Linux, I don't see the problem with
implementing that part of the feature. Why not just do the easy part
of the feature which is perhaps 5% of the work, and might provide 99%
of the benefit (at least for those people who care).

> Without seeing the persuasive case for more signals I have to say that
> adding more signals to the kernel sounds like a bad idea.

Concur, 100%.

- Ted

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 20:52 ` Theodore Ts'o @ 2022-01-04 21:33 ` Walt Drummond 2022-01-04 22:05 ` Eric W. Biederman 2022-01-07 19:19 ` Arseny Maslennikov 2 siblings, 0 replies; 21+ messages in thread From: Walt Drummond @ 2022-01-04 21:33 UTC (permalink / raw) To: Theodore Ts'o Cc: Eric W. Biederman, aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module Fair enough. I'll abandon the signals part of this and just send out the VSTATUS/Control-T part, after I address some comments from Greg. Thanks. On Tue, Jan 4, 2022 at 12:52 PM Theodore Ts'o <tytso@mit.edu> wrote: > > On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote: > > I dug through the previous conversations and there is a little debate > > about what makes sense for SIGPWR to do by default. Alan Cox remembered > > SIGPWR was sent when the power was restored, so ignoring SIGPWR by > > default made sense. Ted Tso pointed out a different scenario where it > > was reasonable for SIGPWR to be a terminating signal. > > > > So far no one has actually found any applications that will regress if > > SIGPWR becomes ignored by default. Furthermore on linux SIGPWR is only > > defined to be sent to init, and init ignores all signals by default so > > in practice SIGPWR is ignored by the only process that receives it > > currently. > > As it turns out, systemd does *not* ignore SIGPWR. Instead, it will > initiate the sigpwr target. 
From the systemd.special man page: > > sigpwr.target > A special target that is started when systemd receives the > SIGPWR process signal, which is normally sent by the kernel > or UPS daemons when power fails. > > And child processes of systemd are not ignoring SIGPWR. Instead, they > are getting terminated. > > <tytso@cwcc> > 41% /bin/sleep 50 & > [1] 180671 > <tytso@cwcc> > 42% kill -PWR 180671 > [1]+ Power failure /bin/sleep 50 > > > Where I saw the last conversation falter was in making a persuasive > > case of why SIGINFO was interesting to add. Given a world of ssh > > connections I expect a persuasive case can be made. Especially if there > > are a handful of utilities where it is already implemented that just > > need to be built with SIGINFO defined. > > One thing that's perhaps worth disentangling is the value of > supporting VSTATUS --- which is a control character much like VINTR > (^C) or VQUIT (control backslash) which is set via the c_cc[] array in > termios structure. Quoting from the termios man page: > > VSTATUS > (not in POSIX; not supported under Linux; status > request: 024, DC4, Ctrl-T). Status character (STATUS). > Display status information at terminal, including state > of foreground process and amount of CPU time it has > consumed. Also sends a SIGINFO signal (not supported on > Linux) to the foreground process group. > > The basic idea is that when you type C-t, you can find out information > about the currently running process. This is a feature that > originally comes from TOPS-10's TENEX operating system, and it is > supported today on FreeBSD and Mac OS. For example, it might display > something like this: > > load: 2.39 cmd: ping 5374 running 0.00u 0.00s > > The reason why SIGINFO is sent to the foreground process group is that > it gives the process an opportunity print application specific > information about currently running process. 
For example, maybe the C > compiler could print something like "parsing 2042 of 5000 header > files", or some such. :-) > > There are people who wish that Linux supported Control-T / VSTATUS, > for example, just last week, on TUHS, the Unix greybeards list, there > were two such heartfelt wishes for Control-T support from two such > greybeards: > > "It's my biggest annoyance with Linux that it doesn't [support > control-t] > - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html > > "I personally can't stand using Linux, even casually for a very > short sys-admin task, because of this missing feature" > - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html > > I claim, though, that we could implement VSTATUS without implenting > the SIGINFO part of the feature. Previous few applications *ever* > implemented SIGINFO signal handlers so they could give status > information, it's the hard one, since we don't have any spare signals > left. If we were to repurpose some lesser used signal, whether it be > SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be some > userspace program (such as a UPS monitoring program which wants to > trigger power fail handling, or a userspace NFSv4 process that wants > to signal that it was unable to recover a file's file lock after a > server reboot), and if we try to take over the signal assignment, it's > possible that we might get surprised. Furthermore, all of the > possibly unused signals that we might try to reclaim terminate the > process by default, and SIGINFO *has* to have a default signal > handling action of Ignore, since otherwise typing Control-T will end > up killing the current foreground application. > > Personally, I don't care all that much about VSTATUS support --- I > used it when I was in university, but honestly, I've never missed it. 
> But if there is someone who wants to try to implement VSTATUS, and > make some Unix greybeards happy, and maybe even switch from FreeBSD to > Linux as a result, go wild. I'm not convinced, though, that adding > the SIGINFO part of the support is worth the effort. > > Not only do almost no programs implement SIGINFO support, a lot of CPU > bound programs where this might be actually useful, end up running a > large number of processes in parallel. Take the "parsing 2042 of 5000 > header files" example I gave above. Consider what would happen if gcc > implemented support for SIGINFO, but the user was running a "make -j > 16" and typed Control-T. The result would be chaos! > > So if you really miss Control-T, and it's the only thing holding back > a few FreeBSD users from Linux, I don't see the problem with > implementing that part of the feature. Why not just do the easy part > of the feature which is perhaps 5% of the work, and might provide 99% > of the benefit (at least for those people who care). > > > Without seeing the persuasive case for more signals I have to say that > > adding more signals to the kernel sounds like a bad idea. > > Concur, 100%. > > - Ted ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 20:52 ` Theodore Ts'o 2022-01-04 21:33 ` Walt Drummond @ 2022-01-04 22:05 ` Eric W. Biederman 2022-01-04 22:23 ` Theodore Ts'o 2022-01-07 19:19 ` Arseny Maslennikov 2 siblings, 1 reply; 21+ messages in thread From: Eric W. Biederman @ 2022-01-04 22:05 UTC (permalink / raw) To: Theodore Ts'o Cc: Walt Drummond, aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module "Theodore Ts'o" <tytso@mit.edu> writes: > On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote: >> I dug through the previous conversations and there is a little debate >> about what makes sense for SIGPWR to do by default. Alan Cox remembered >> SIGPWR was sent when the power was restored, so ignoring SIGPWR by >> default made sense. Ted Tso pointed out a different scenario where it >> was reasonable for SIGPWR to be a terminating signal. >> >> So far no one has actually found any applications that will regress if >> SIGPWR becomes ignored by default. Furthermore on linux SIGPWR is only >> defined to be sent to init, and init ignores all signals by default so >> in practice SIGPWR is ignored by the only process that receives it >> currently. > > As it turns out, systemd does *not* ignore SIGPWR. Instead, it will > initiate the sigpwr target. From the systemd.special man page: > > sigpwr.target > A special target that is started when systemd receives the > SIGPWR process signal, which is normally sent by the kernel > or UPS daemons when power fails. 
> > And child processes of systemd are not ignoring SIGPWR. Instead, they > are getting terminated. > > <tytso@cwcc> > 41% /bin/sleep 50 & > [1] 180671 > <tytso@cwcc> > 42% kill -PWR 180671 > [1]+ Power failure /bin/sleep 50 That is all as expected, and does not demonstrate a regression would happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT, SIGCHLD, SIGURG do. It does show there is the possibility of problems. The practical question is does anything send SIGPWR to anything besides init, and expect the process to handle SIGPWR or terminate? Possibly easier to implement (if people desire) is to simply send SIGCONT with an si_code that indicates someone pressed the VSTATUS key. We have a per signal 32bit si_code space so that should be comparatively easy. > I claim, though, that we could implement VSTATUS without implenting > the SIGINFO part of the feature. I agree that is the place to start. And if we aren't going to use SIGINFO perhaps we could have an equally good notification method if anyone wants one. Say call an ioctl and get an fd that can be read when a VSTATUS request comes in. SIGINFO vs SIGCONT vs a fd vs something else is something we can sort out when people get interested in modifying userspace. Eric ^ permalink raw reply [flat|nested] 21+ messages in thread
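To illustrate how a receiver could tell one kind of SIGCONT from another by its si_code, here is a small user-space sketch. No si_code value for a VSTATUS request exists today, so the example only distinguishes the existing SI_USER (sent with kill()) from SI_QUEUE (sent with sigqueue()); a VSTATUS-specific code would be a new, hypothetical constant checked in the same place. The function names are made up for this illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t cont_seen;
static volatile sig_atomic_t last_si_code;

/* SA_SIGINFO-style handler: remember why SIGCONT was delivered. */
static void on_cont(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	last_si_code = info->si_code;
	cont_seen = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_cont_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_cont;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO | SA_RESTART;
	return sigaction(SIGCONT, &sa, NULL);
}

/* Send SIGCONT to ourselves via sigqueue(), which sets SI_QUEUE. */
static int send_cont_queued(void)
{
	union sigval value;

	value.sival_int = 0;
	return sigqueue(getpid(), SIGCONT, value);
}
```

After kill(getpid(), SIGCONT) the handler sees SI_USER; after send_cont_queued() it sees SI_QUEUE.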
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 22:05 ` Eric W. Biederman @ 2022-01-04 22:23 ` Theodore Ts'o 2022-01-04 22:31 ` Walt Drummond 0 siblings, 1 reply; 21+ messages in thread From: Theodore Ts'o @ 2022-01-04 22:23 UTC (permalink / raw) To: Eric W. Biederman Cc: Walt Drummond, aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module On Tue, Jan 04, 2022 at 04:05:26PM -0600, Eric W. Biederman wrote: > > That is all as expected, and does not demonstrate a regression would > happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT, > SIGCHLD, SIGURG do. It does show there is the possibility of problems. > > The practical question is does anything send SIGPWR to anything besides > init, and expect the process to handle SIGPWR or terminate? So if I *cared* about SIGINFO, what I'd do is ask the systemd developers and users list if there are any users of the sigpwr.target feature that they know of. And I'd also download all of the open source UPS monitoring applications (and perhaps documentation of closed-source UPS applications, such as for example APC's program) and see if any of them are trying to send the SIGPWR signal. I don't personally think it's worth the effort to do that research, but maybe other people care enough to do the work. > > I claim, though, that we could implement VSTATUS without implenting > > the SIGINFO part of the feature. > > I agree that is the place to start. 
And if we aren't going to use > SIGINFO perhaps we could have an equally good notification method > if anyone wants one. Say call an ioctl and get an fd that can > be read when a VSTATUS request comes in. > > SIGINFO vs SIGCONT vs a fd vs something else is something we can sort > out when people get interested in modifying userspace. Once VSTATUS support lands in the kernel, we can wait and see if there is anyone who shows up wanting the SIGINFO functionality. Certainly we have no shortage of userspace notification interfaces in Linux. :-) - Ted ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 22:23 ` Theodore Ts'o @ 2022-01-04 22:31 ` Walt Drummond 2022-01-07 19:29 ` Arseny Maslennikov 0 siblings, 1 reply; 21+ messages in thread From: Walt Drummond @ 2022-01-04 22:31 UTC (permalink / raw) To: Theodore Ts'o Cc: Eric W. Biederman, aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module The only standard tools that support SIGINFO are sleep, dd and ping, (and kill, for obvious reasons) so it's not like there's a vast hole in the tooling or something, nor is there a large legacy software base just waiting for SIGINFO to appear. So while I very much enjoyed figuring out how to make SIGINFO work ... I'll have the VSTATUS patch out in a little bit. I also think there might be some merit in consolidating the 10 'sigsetsize != sizeof(sigset_t)' checks in a macro and adding comments that wave people off on trying to do what I did. If that would be useful, happy to provide the patch. On Tue, Jan 4, 2022 at 2:23 PM Theodore Ts'o <tytso@mit.edu> wrote: > > On Tue, Jan 04, 2022 at 04:05:26PM -0600, Eric W. Biederman wrote: > > > > That is all as expected, and does not demonstrate a regression would > > happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT, > > SIGCHLD, SIGURG do. It does show there is the possibility of problems. > > > > The practical question is does anything send SIGPWR to anything besides > > init, and expect the process to handle SIGPWR or terminate? 
> > So if I *cared* about SIGINFO, what I'd do is ask the systemd > developers and users list if there are any users of the sigpwr.target > feature that they know of. And I'd also download all of the open > source UPS monitoring applications (and perhaps documentation of > closed-source UPS applications, such as for example APC's program) and > see if any of them are trying to send the SIGPWR signal. > > I don't personally think it's worth the effort to do that research, > but maybe other people care enough to do the work. > > > > I claim, though, that we could implement VSTATUS without implenting > > > the SIGINFO part of the feature. > > > > I agree that is the place to start. And if we aren't going to use > > SIGINFO perhaps we could have an equally good notification method > > if anyone wants one. Say call an ioctl and get an fd that can > > be read when a VSTATUS request comes in. > > > > SIGINFO vs SIGCONT vs a fd vs something else is something we can sort > > out when people get interested in modifying userspace. > > > Once VSTATUS support lands in the kernel, we can wait and see if there > is anyone who shows up wanting the SIGINFO functionality. Certainly > we have no shortage of userspace notification interfaces in Linux. :-) > > - Ted ^ permalink raw reply [flat|nested] 21+ messages in thread
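The 'sigsetsize != sizeof(sigset_t)' check mentioned above is visible from user space through the raw system call. A minimal sketch follows, assuming a Linux kernel whose sigset_t is 8 bytes (64 signals), which is the common case but not universal; some architectures use a larger set. raw_get_sigmask() and KERNEL_SIGSET_SIZE are names invented for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed size in bytes of the kernel's sigset_t (64 signals). */
#define KERNEL_SIGSET_SIZE 8

/*
 * Query the current signal mask with the raw rt_sigprocmask system
 * call, passing sigsetsize explicitly. With a NULL new set the mask
 * is left unchanged. Returns 0 on success, or -1 with errno set.
 */
static long raw_get_sigmask(void *oldset, size_t sigsetsize)
{
	return syscall(SYS_rt_sigprocmask, SIG_BLOCK, NULL, oldset, sigsetsize);
}
```

With a 128-byte buffer, raw_get_sigmask(buf, KERNEL_SIGSET_SIZE) succeeds, while raw_get_sigmask(buf, 128) fails with EINVAL on such a kernel, even though 128 bytes is what the C library's own sigset_t occupies.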
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals 2022-01-04 22:31 ` Walt Drummond @ 2022-01-07 19:29 ` Arseny Maslennikov 2022-05-19 12:27 ` Pavel Machek 0 siblings, 1 reply; 21+ messages in thread From: Arseny Maslennikov @ 2022-01-07 19:29 UTC (permalink / raw) To: Walt Drummond Cc: Theodore Ts'o, Eric W. Biederman, aacraid, viro, anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby, john.johansen, juri.lelli, keescook, mcgrof, martin.petersen, mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge, rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel, ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k, linux-mtd, linux-nfs, linux-scsi, linux-security-module [-- Attachment #1: Type: text/plain, Size: 852 bytes --] On Tue, Jan 04, 2022 at 02:31:44PM -0800, Walt Drummond wrote: > The only standard tools that support SIGINFO are sleep, dd and ping, > (and kill, for obvious reasons) so it's not like there's a vast hole > in the tooling or something, nor is there a large legacy software base > just waiting for SIGINFO to appear. So while I very much enjoyed > figuring out how to make SIGINFO work ... As far as I recall, GNU make on *BSD does support SIGINFO (Not a standard tool, but obviously an established one). The developers of strace have expressed interest in SIGINFO support to print tracer status messages (unfortunately, not on a public list). Computational software can use this instead of stderr progress spam, if run in an interactive fashion on a terminal, as it frequently is. There is a user base, it's just not very vocal on kernel lists. :) [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
2022-01-07 19:29 ` Arseny Maslennikov
@ 2022-05-19 12:27 ` Pavel Machek
0 siblings, 0 replies; 21+ messages in thread
From: Pavel Machek @ 2022-05-19 12:27 UTC (permalink / raw)
To: Arseny Maslennikov
Cc: Walt Drummond, Theodore Ts'o, Eric W. Biederman, aacraid, viro,
anna.schumaker, arnd, bsegall, bp, chuck.lever, bristot, dave.hansen,
dwmw2, dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov,
mingo, yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby,
john.johansen, juri.lelli, keescook, mcgrof, martin.petersen,
mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge,
rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel,
ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
linux-mtd, linux-nfs, linux-scsi, linux-security-module

Hi!

> > The only standard tools that support SIGINFO are sleep, dd and ping,
> > (and kill, for obvious reasons) so it's not like there's a vast hole
> > in the tooling or something, nor is there a large legacy software base
> > just waiting for SIGINFO to appear.  So while I very much enjoyed
> > figuring out how to make SIGINFO work ...
>
> As far as I recall, GNU make on *BSD does support SIGINFO (Not a
> standard tool, but obviously an established one).
>
> The developers of strace have expressed interest in SIGINFO support
> to print tracer status messages (unfortunately, not on a public list).
> Computational software can use this instead of stderr progress spam, if
> run in an interactive fashion on a terminal, as it frequently is. There
> is a user base, it's just not very vocal on kernel lists. :)

And often it would be useful if cp supported this. Yes, this is
a feature I'd like to see.

BR,
Pavel
--
* Re: [RFC PATCH 0/8] signals: Support more than 64 signals
2022-01-04 20:52 ` Theodore Ts'o
2022-01-04 21:33 ` Walt Drummond
2022-01-04 22:05 ` Eric W. Biederman
@ 2022-01-07 19:19 ` Arseny Maslennikov
2 siblings, 0 replies; 21+ messages in thread
From: Arseny Maslennikov @ 2022-01-07 19:19 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Eric W. Biederman, Walt Drummond, aacraid, viro, anna.schumaker,
arnd, bsegall, bp, chuck.lever, bristot, dave.hansen, dwmw2,
dietmar.eggemann, dinguyen, geert, gregkh, hpa, idryomov, mingo,
yzaikin, ink, jejb, jmorris, bfields, jlayton, jirislaby,
john.johansen, juri.lelli, keescook, mcgrof, martin.petersen,
mattst88, mgorman, oleg, pbonzini, peterz, rth, richard, serge,
rostedt, tglx, trond.myklebust, vincent.guittot, x86, linux-kernel,
ceph-devel, kvm, linux-alpha, linux-arch, linux-fsdevel, linux-m68k,
linux-mtd, linux-nfs, linux-scsi, linux-security-module

I generally agree with Ted's suggestion that we could merge the
easy-to-design part (VSTATUS+kerninfo) first and deal with the
SIGINFO part later. The only concern I have here is that the "later"
part might never practically arrive... :)

Still, some notes on the SIGINFO/userspace-status part:

On Tue, Jan 04, 2022 at 03:52:28PM -0500, Theodore Ts'o wrote:
> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> > I dug through the previous conversations and there is a little debate
> > about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> > SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> > default made sense.  Ted Tso pointed out a different scenario where it
> > was reasonable for SIGPWR to be a terminating signal.
> >
> > So far no one has actually found any applications that will regress if
> > SIGPWR becomes ignored by default.

Some folks from linux-api@ claimed otherwise, but unfortunately didn't
elaborate.
> > Furthermore on Linux SIGPWR is only
> > defined to be sent to init, and init ignores all signals by default so
> > in practice SIGPWR is ignored by the only process that receives it
> > currently.
>
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
>
>    sigpwr.target
>        A special target that is started when systemd receives the
>        SIGPWR process signal, which is normally sent by the kernel
>        or UPS daemons when power fails.

Not sure what you had in mind; in case you're suggesting that systemd
has to drop the sigpwr.target semantics, it doesn't, and we don't need
to ask systemd to do so.

To introduce SIGINFO == SIGPWR to the kernel, the only "breaking" change
we have to make is to change the default disposition for SIGPWR, i.e. the
behaviour if the signal is set to SIG_DFL.
If a process (including PID 1) installs its own signal handler for
SIGPWR to do something when PWR is received (or blocks the signal and
handles it via signalfd notifications), then the default disposition
does not matter at all, as Eric notes further in this thread.

From a quick glance at systemd code, pid1's main() function calls
manager_new(), which calls manager_setup_signals(); this function, in
turn, blocks a set of signals, including PWR, and sets up a signalfd(2)
on that set. No changes have to be made in systemd, and there is no need
to remove the sigpwr.target semantics.

The target activation does not send SIGPWR to anyone; it results in
systemd services being started and possibly stopped. The exact
consequences are out of scope for systemd.

There could be another concern: a VSTATUS keypress could result in
SIGINFO == SIGPWR being sent to pid1. In a correct implementation this
will not ever happen, because a sane PID 1 does not have (and never
acquires) a controlling terminal.

> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
>
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50

All the possible surprises with the SIGINFO == SIGPWR approach we might
get stem from here, not from the sigpwr.target.

> > Where I saw the last conversation falter was in making a persuasive
> > case of why SIGINFO was interesting to add.  Given a world of ssh
> > connections I expect a persuasive case can be made.  Especially if there
> > are a handful of utilities where it is already implemented that just
> > need to be built with SIGINFO defined.
>
> One thing that's perhaps worth disentangling is the value of
> supporting VSTATUS --- which is a control character much like VINTR
> (^C) or VQUIT (control backslash) which is set via the c_cc[] array in
> the termios structure.  Quoting from the termios man page:
>
>        VSTATUS
>               (not in POSIX; not supported under Linux; status
>               request: 024, DC4, Ctrl-T).  Status character (STATUS).
>               Display status information at terminal, including state
>               of foreground process and amount of CPU time it has
>               consumed.  Also sends a SIGINFO signal (not supported on
>               Linux) to the foreground process group.
>
> The basic idea is that when you type C-t, you can find out information
> about the currently running process.  This is a feature that
> originally comes from TOPS-10's TENEX operating system, and it is
> supported today on FreeBSD and Mac OS.  For example, it might display
> something like this:
>
> load: 2.39  cmd: ping 5374 running 0.00u 0.00s
>
> The reason why SIGINFO is sent to the foreground process group is that
> it gives the process an opportunity to print application specific
> information about the currently running process.  For example, maybe
> the C compiler could print something like "parsing 2042 of 5000 header
> files", or some such.  :-)
>
> There are people who wish that Linux supported Control-T / VSTATUS,
> for example, just last week, on TUHS, the Unix greybeards list, there
> were two such heartfelt wishes for Control-T support from two such
> greybeards:
>
>     "It's my biggest annoyance with Linux that it doesn't [support
>     control-t]"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html
>
>     "I personally can't stand using Linux, even casually for a very
>     short sys-admin task, because of this missing feature"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html
>
> I claim, though, that we could implement VSTATUS without implementing
> the SIGINFO part of the feature.  Precious few applications *ever*
> implemented SIGINFO signal handlers so they could give status
> information, and it's the hard part, since we don't have any spare
> signals left.  If we were to repurpose some lesser used signal, whether
> it be SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be
> some userspace program that relies on it (such as a UPS monitoring
> program which wants to trigger power fail handling, or a userspace
> NFSv4 process that wants to signal that it was unable to recover a
> file's file lock after a server reboot), and if we try to take over the
> signal assignment, it's possible that we might get surprised.
> Furthermore, all of the possibly unused signals that we might try to
> reclaim terminate the process by default, and SIGINFO *has* to have a
> default signal handling action of Ignore, since otherwise typing
> Control-T will end up killing the current foreground application.
>
> Personally, I don't care all that much about VSTATUS support --- I
> used it when I was in university, but honestly, I've never missed it.
> But if there is someone who wants to try to implement VSTATUS, and
> make some Unix greybeards happy, and maybe even switch from FreeBSD to
> Linux as a result, go wild.  I'm not convinced, though, that adding
> the SIGINFO part of the support is worth the effort.
>
> Not only do almost no programs implement SIGINFO support, a lot of CPU

To be fair, many programs are a lot younger than 4.3BSD, and with the
current ubiquity of Linux without VSTATUS, it's kind of a
chicken-and-egg problem. :)

> bound programs where this might be actually useful, end up running a
> large number of processes in parallel.  Take the "parsing 2042 of 5000
> header files" example I gave above.  Consider what would happen if gcc
> implemented support for SIGINFO, but the user was running a "make -j
> 16" and typed Control-T.  The result would be chaos!
>
> So if you really miss Control-T, and it's the only thing holding back
> a few FreeBSD users from Linux, I don't see the problem with
> implementing that part of the feature.  Why not just do the easy part
> of the feature which is perhaps 5% of the work, and might provide 99%
> of the benefit (at least for those people who care).
>
> > Without seeing the persuasive case for more signals I have to say that
> > adding more signals to the kernel sounds like a bad idea.
>
> Concur, 100%.
>
> - Ted
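The point made in this message, that the default disposition of SIGPWR is
irrelevant to a process that blocks the signal and collects it
synchronously, can be checked with a small sketch. This is not systemd's
code. It assumes Linux (where SIGPWR is defined and terminates by
default), and it uses Python's signal.sigtimedwait() as a stand-in for
reading from a signalfd(2); consume_blocked_signal() is a hypothetical
name.

```python
import signal

# Assumption: Linux, where SIGPWR is defined and its default action is
# to terminate the process.
SIG = signal.SIGPWR


def consume_blocked_signal(sig=SIG, timeout=1.0):
    """Block `sig`, raise it at ourselves, and collect it synchronously.

    Returns the signal number that was dequeued, or None on timeout.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # While blocked, the signal only becomes pending; the default
        # action (terminate, for SIGPWR today) is never taken.
        signal.raise_signal(sig)
        assert sig in signal.sigpending()
        # sigtimedwait() dequeues the pending signal, much as a read on
        # a signalfd would in a C program.
        info = signal.sigtimedwait({sig}, timeout)
        return None if info is None else info.si_signo
    finally:
        # Restore the caller's original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    print("dequeued signal", consume_blocked_signal())
```

The process survives receiving SIGPWR here regardless of whether SIG_DFL
means terminate or ignore, which is why only programs that leave SIGPWR
at its default would notice a change of disposition.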