* [patch v4 00/27] posix-timers: Cure the SIG_IGN mess
@ 2024-09-27 8:48 Thomas Gleixner
2024-09-27 8:48 ` [patch v4 01/27] signal: Confine POSIX_TIMERS properly Thomas Gleixner
` (27 more replies)
0 siblings, 28 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
These are the remaining bits to cure the SIG_IGN mess. The preparatory work
from the previous version 3 has been merged already. Version 3 can be found
here:
https://lore.kernel.org/lkml/20240610163452.591699700@linutronix.de
Last year I reread a 15-year-old comment about the SIG_IGN problem:
"FIXME: What we really want, is to stop this timer completely and restart
it in case the SIG_IGN is removed. This is a non trivial change which
involves sighand locking (sigh !), which we don't want to do late in the
release cycle. ... A more complex fix which solves also another related
inconsistency is already in the pipeline."
The embarrassing part was that I put that comment in back then. So I went
back and rummaged through old notes as I had completely forgotten why our
attempts to fix this back then failed.
It turned out that the comment is about right: sighand locking and life
time issues. So I sat down with the old notes and started to wrap my head
around this again.
The problem to solve:
Posix interval timers are not rearmed automatically by the kernel for
various reasons:
1) To prevent DoS by extremely short intervals.
2) To avoid timer overhead when a signal is pending and has not
yet been delivered.
This is achieved by queueing the signal at timer expiry and rearming the
timer at signal delivery to user space. This puts the rearming basically
under scheduler control and the work happens in context of the task which
asked for the signal.
There is a problem with that vs. SIG_IGN. If a signal has SIG_IGN installed
as handler, the related signals are discarded. So in case of posix interval
timers this means that such a timer is never rearmed even when SIG_IGN is
replaced later with a real handler (including SIG_DFL).
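The rearm-on-delivery scheme and the way SIG_IGN breaks it can be sketched
in a few lines of user space C. This is only a toy model: struct
model_timer, model_expire() and model_deliver() are made-up names for
illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user space model; none of these names exist in the kernel. */
enum disposition { DISP_HANDLER, DISP_IGNORE };

struct model_timer {
	bool armed;              /* timer is queued and will expire */
	bool signal_pending;     /* expiry signal queued, not yet delivered */
	unsigned int deliveries; /* signals which reached the handler */
};

/* Expiry: the timer fires once and does NOT rearm itself. */
void model_expire(struct model_timer *t, enum disposition d)
{
	assert(t);
	if (!t->armed)
		return;
	t->armed = false;
	/* An ignored signal is discarded instead of being queued. */
	if (d != DISP_IGNORE)
		t->signal_pending = true;
}

/* Delivery: the only place where the interval timer is rearmed. */
void model_deliver(struct model_timer *t)
{
	assert(t);
	if (!t->signal_pending)
		return;
	t->signal_pending = false;
	t->deliveries++;
	t->armed = true;
}
```

With DISP_HANDLER an expiry followed by a delivery leaves the timer armed
again. With DISP_IGNORE nothing is ever queued, so model_deliver() has
nothing to do and the timer stays stopped.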
To work around that the kernel self rearms those timers and throttles them
when the interval is smaller than a tick to prevent a DoS.
That just keeps timers ticking, which obviously has effects on power and
just creates work for nothing.
So ideally these timers should be stopped and rearmed when SIG_IGN is
replaced, which aligns with the regular handling of posix timers.
Sounds trivial, but isn't:
1) Lock ordering.
The timer lock cannot be taken with sighand lock held which is
problematic vs. the atomicity of sigaction().
2) Life time rules
The timer and the sigqueue are separate entities which requires a
lookup of the timer ID in the signal rearm code. This can be handled,
but the separate life time rules are not necessarily robust.
3) Finding the relevant timers
Obviously it is possible to walk the posix timer list under sighand
lock and handle it from there. That can be expensive especially in the
case that there are no affected timers as the walk would just end up
doing nothing.
The following series is a new and this time actually working attempt to
solve this. It addresses it by:
1) Embedding the preallocated sigqueue into struct k_itimer, which makes
the life time rules way simpler and just needs a trivial reference
count.
2) Having a separate list in task::signal on which ignored timers are
queued.
This avoids walking a potentially large timer list for nothing on a
SIG_IGN to handler transition.
3) Requeueing the timer's signal in the relevant signal queue so the timer
is rearmed when the signal is actually delivered.
That turned out to be the least complicated way to address the sighand
lock vs. timer lock ordering issue.
With that, timers which have their signal ignored are no longer self
rearmed and the relevant workarounds including throttling for DoS
prevention are removed.
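A rough user space sketch of points 2) and 3), assuming a simple singly
linked list: a timer which expires while its signal is ignored is parked on
a per-process list, and the SIG_IGN to handler transition only walks that
list. All names here (model_process, model_install_handler() and so on) are
hypothetical. The real series uses kernel list primitives and locking which
this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_timer {
	struct model_timer *next_ignored; /* link in the per-process ignored list */
	bool signal_queued;               /* expiry signal sits in the signal queue */
};

struct model_process {
	bool sig_ignored;            /* SIG_IGN is installed for the timer signal */
	struct model_timer *ignored; /* timers whose expiry hit SIG_IGN */
};

/* Timer expiry: park the timer instead of self rearming when ignored. */
void model_expire(struct model_process *p, struct model_timer *t)
{
	assert(p && t);
	if (p->sig_ignored) {
		t->next_ignored = p->ignored;
		p->ignored = t;
	} else {
		t->signal_queued = true;
	}
}

/* SIG_IGN -> handler: requeue the signal of every parked timer. */
unsigned int model_install_handler(struct model_process *p)
{
	unsigned int requeued = 0;

	assert(p);
	p->sig_ignored = false;
	while (p->ignored) {
		struct model_timer *t = p->ignored;

		p->ignored = t->next_ignored;
		t->next_ignored = NULL;
		t->signal_queued = true;
		requeued++;
	}
	return requeued;
}
```

The cost of the transition is proportional to the number of parked timers,
not to the total number of timers of the process. When nothing is parked
the loop body never runs.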
The series is also available from git:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git posixt-v4
Changes vs. V3:
- Rebased to mainline
- Fixed up an intermediate build breakage reported by 0-day
Thanks,
tglx
---
arch/x86/kernel/signal_32.c | 2
arch/x86/kernel/signal_64.c | 2
drivers/power/supply/charger-manager.c | 3
fs/proc/base.c | 4
fs/timerfd.c | 4
include/linux/alarmtimer.h | 10
include/linux/posix-timers.h | 67 +++-
include/linux/sched/signal.h | 4
include/uapi/asm-generic/siginfo.h | 2
init/init_task.c | 5
kernel/fork.c | 1
kernel/signal.c | 476 +++++++++++++++++++--------------
kernel/time/alarmtimer.c | 87 ------
kernel/time/itimer.c | 22 +
kernel/time/posix-cpu-timers.c | 38 +-
kernel/time/posix-timers.c | 227 +++++++--------
kernel/time/posix-timers.h | 8
net/netfilter/xt_IDLETIMER.c | 4
18 files changed, 523 insertions(+), 443 deletions(-)
* [patch v4 01/27] signal: Confine POSIX_TIMERS properly
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 12:21 ` Frederic Weisbecker
2024-09-27 8:48 ` [patch v4 02/27] signal: Prevent user space from setting si_sys_private Thomas Gleixner
` (26 subsequent siblings)
27 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Move the itimer rearming out of the signal code and consolidate all posix
timer related functions in the signal code under one ifdef.
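The consolidation relies on the usual stub pattern: the header provides
either the real function or an empty inline, so the caller needs no ifdef of
its own. A minimal user space sketch, where MODEL_POSIX_TIMERS stands in for
CONFIG_POSIX_TIMERS and every model_* name is invented for illustration:

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_POSIX_TIMERS. */
#define MODEL_POSIX_TIMERS 1
#define MODEL_SIGALRM 14

static unsigned int model_rearm_count;

#if MODEL_POSIX_TIMERS
/* Real implementation, which would live next to the timer code. */
static inline void model_rearm_itimer(void)
{
	model_rearm_count++;
}
#else
/* Empty stub: callers compile unchanged, the call does nothing. */
static inline void model_rearm_itimer(void) { }
#endif

/* The caller carries no #ifdef of its own. */
int model_dequeue(int signr)
{
	assert(signr >= 0);
	if (signr == MODEL_SIGALRM)
		model_rearm_itimer();
	return signr;
}
```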
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 5 +-
kernel/signal.c | 125 +++++++++++++++-----------------------------
kernel/time/itimer.c | 22 +++++++-
kernel/time/posix-timers.c | 15 ++++-
4 files changed, 81 insertions(+), 86 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 453691710839..670bf03a56ef 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -100,6 +100,8 @@ static inline void posix_cputimers_rt_watchdog(struct posix_cputimers *pct,
{
pct->bases[CPUCLOCK_SCHED].nextevt = runtime;
}
+void posixtimer_rearm_itimer(struct task_struct *p);
+void posixtimer_rearm(struct kernel_siginfo *info);
/* Init task static initializer */
#define INIT_CPU_TIMERBASE(b) { \
@@ -122,6 +124,8 @@ struct cpu_timer { };
static inline void posix_cputimers_init(struct posix_cputimers *pct) { }
static inline void posix_cputimers_group_init(struct posix_cputimers *pct,
u64 cpu_limit) { }
+static inline void posixtimer_rearm_itimer(struct task_struct *p) { }
+static inline void posixtimer_rearm(struct kernel_siginfo *info) { }
#endif
#ifdef CONFIG_POSIX_CPU_TIMERS_TASK_WORK
@@ -196,5 +200,4 @@ void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
int update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
-void posixtimer_rearm(struct kernel_siginfo *info);
#endif
diff --git a/kernel/signal.c b/kernel/signal.c
index 6f3a5aa39b09..a83ea99f9389 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -478,42 +478,6 @@ void flush_signals(struct task_struct *t)
}
EXPORT_SYMBOL(flush_signals);
-#ifdef CONFIG_POSIX_TIMERS
-static void __flush_itimer_signals(struct sigpending *pending)
-{
- sigset_t signal, retain;
- struct sigqueue *q, *n;
-
- signal = pending->signal;
- sigemptyset(&retain);
-
- list_for_each_entry_safe(q, n, &pending->list, list) {
- int sig = q->info.si_signo;
-
- if (likely(q->info.si_code != SI_TIMER)) {
- sigaddset(&retain, sig);
- } else {
- sigdelset(&signal, sig);
- list_del_init(&q->list);
- __sigqueue_free(q);
- }
- }
-
- sigorsets(&pending->signal, &signal, &retain);
-}
-
-void flush_itimer_signals(void)
-{
- struct task_struct *tsk = current;
- unsigned long flags;
-
- spin_lock_irqsave(&tsk->sighand->siglock, flags);
- __flush_itimer_signals(&tsk->pending);
- __flush_itimer_signals(&tsk->signal->shared_pending);
- spin_unlock_irqrestore(&tsk->sighand->siglock, flags);
-}
-#endif
-
void ignore_signals(struct task_struct *t)
{
int i;
@@ -636,31 +600,9 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
*type = PIDTYPE_TGID;
signr = __dequeue_signal(&tsk->signal->shared_pending,
mask, info, &resched_timer);
-#ifdef CONFIG_POSIX_TIMERS
- /*
- * itimer signal ?
- *
- * itimers are process shared and we restart periodic
- * itimers in the signal delivery path to prevent DoS
- * attacks in the high resolution timer case. This is
- * compliant with the old way of self-restarting
- * itimers, as the SIGALRM is a legacy signal and only
- * queued once. Changing the restart behaviour to
- * restart the timer in the signal dequeue path is
- * reducing the timer noise on heavy loaded !highres
- * systems too.
- */
- if (unlikely(signr == SIGALRM)) {
- struct hrtimer *tmr = &tsk->signal->real_timer;
-
- if (!hrtimer_is_queued(tmr) &&
- tsk->signal->it_real_incr != 0) {
- hrtimer_forward(tmr, tmr->base->get_time(),
- tsk->signal->it_real_incr);
- hrtimer_restart(tmr);
- }
- }
-#endif
+
+ if (unlikely(signr == SIGALRM))
+ posixtimer_rearm_itimer(tsk);
}
recalc_sigpending();
@@ -682,22 +624,12 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
*/
current->jobctl |= JOBCTL_STOP_DEQUEUED;
}
-#ifdef CONFIG_POSIX_TIMERS
- if (resched_timer) {
- /*
- * Release the siglock to ensure proper locking order
- * of timer locks outside of siglocks. Note, we leave
- * irqs disabled here, since the posix-timers code is
- * about to disable them again anyway.
- */
- spin_unlock(&tsk->sighand->siglock);
- posixtimer_rearm(info);
- spin_lock(&tsk->sighand->siglock);
- /* Don't expose the si_sys_private value to userspace */
- info->si_sys_private = 0;
+ if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
+ if (unlikely(resched_timer))
+ posixtimer_rearm(info);
}
-#endif
+
return signr;
}
EXPORT_SYMBOL_GPL(dequeue_signal);
@@ -1922,15 +1854,43 @@ int kill_pid(struct pid *pid, int sig, int priv)
}
EXPORT_SYMBOL(kill_pid);
+#ifdef CONFIG_POSIX_TIMERS
/*
- * These functions support sending signals using preallocated sigqueue
- * structures. This is needed "because realtime applications cannot
- * afford to lose notifications of asynchronous events, like timer
- * expirations or I/O completions". In the case of POSIX Timers
- * we allocate the sigqueue structure from the timer_create. If this
- * allocation fails we are able to report the failure to the application
- * with an EAGAIN error.
+ * These functions handle POSIX timer signals. POSIX timers use
+ * preallocated sigqueue structs for sending signals.
*/
+static void __flush_itimer_signals(struct sigpending *pending)
+{
+ sigset_t signal, retain;
+ struct sigqueue *q, *n;
+
+ signal = pending->signal;
+ sigemptyset(&retain);
+
+ list_for_each_entry_safe(q, n, &pending->list, list) {
+ int sig = q->info.si_signo;
+
+ if (likely(q->info.si_code != SI_TIMER)) {
+ sigaddset(&retain, sig);
+ } else {
+ sigdelset(&signal, sig);
+ list_del_init(&q->list);
+ __sigqueue_free(q);
+ }
+ }
+
+ sigorsets(&pending->signal, &signal, &retain);
+}
+
+void flush_itimer_signals(void)
+{
+ struct task_struct *tsk = current;
+
+ guard(spinlock_irqsave)(&tsk->sighand->siglock);
+ __flush_itimer_signals(&tsk->pending);
+ __flush_itimer_signals(&tsk->signal->shared_pending);
+}
+
struct sigqueue *sigqueue_alloc(void)
{
return __sigqueue_alloc(-1, current, GFP_KERNEL, 0, SIGQUEUE_PREALLOC);
@@ -2027,6 +1987,7 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
rcu_read_unlock();
return ret;
}
+#endif /* CONFIG_POSIX_TIMERS */
void do_notify_pidfd(struct task_struct *task)
{
diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c
index 00629e658ca1..876d389b2e21 100644
--- a/kernel/time/itimer.c
+++ b/kernel/time/itimer.c
@@ -151,7 +151,27 @@ COMPAT_SYSCALL_DEFINE2(getitimer, int, which,
#endif
/*
- * The timer is automagically restarted, when interval != 0
+ * Invoked from dequeue_signal() when SIGALRM is delivered.
+ *
+ * Restart the ITIMER_REAL timer if it is armed as periodic timer. Doing
+ * this in the signal delivery path instead of self rearming prevents a DoS
+ * with small increments in the high resolution timer case and reduces timer
+ * noise in general.
+ */
+void posixtimer_rearm_itimer(struct task_struct *tsk)
+{
+ struct hrtimer *tmr = &tsk->signal->real_timer;
+
+ if (!hrtimer_is_queued(tmr) && tsk->signal->it_real_incr != 0) {
+ hrtimer_forward(tmr, tmr->base->get_time(),
+ tsk->signal->it_real_incr);
+ hrtimer_restart(tmr);
+ }
+}
+
+/*
+ * Interval timers are restarted in the signal delivery path. See
+ * posixtimer_rearm_itimer().
*/
enum hrtimer_restart it_real_fn(struct hrtimer *timer)
{
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cc830ef93a7..bcd5e56412e7 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -251,7 +251,7 @@ static void common_hrtimer_rearm(struct k_itimer *timr)
/*
* This function is called from the signal delivery code if
- * info->si_sys_private is not zero, which indicates that the timer has to
+ * info::si_sys_private is not zero, which indicates that the timer has to
* be rearmed. Restart the timer and update info::si_overrun.
*/
void posixtimer_rearm(struct kernel_siginfo *info)
@@ -259,9 +259,15 @@ void posixtimer_rearm(struct kernel_siginfo *info)
struct k_itimer *timr;
unsigned long flags;
+ /*
+ * Release siglock to ensure proper locking order versus
+ * timr::it_lock. Keep interrupts disabled.
+ */
+ spin_unlock(&current->sighand->siglock);
+
timr = lock_timer(info->si_tid, &flags);
if (!timr)
- return;
+ goto out;
if (timr->it_interval && timr->it_requeue_pending == info->si_sys_private) {
timr->kclock->timer_rearm(timr);
@@ -275,6 +281,11 @@ void posixtimer_rearm(struct kernel_siginfo *info)
}
unlock_timer(timr, flags);
+out:
+ spin_lock(&current->sighand->siglock);
+
+ /* Don't expose the si_sys_private value to userspace */
+ info->si_sys_private = 0;
}
int posix_timer_queue_signal(struct k_itimer *timr)
* [patch v4 02/27] signal: Prevent user space from setting si_sys_private
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
2024-09-27 8:48 ` [patch v4 01/27] signal: Confine POSIX_TIMERS properly Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 12:37 ` Frederic Weisbecker
2024-09-27 13:40 ` Eric W. Biederman
2024-09-27 8:48 ` [patch v4 03/27] signal: Get rid of resched_timer logic Thomas Gleixner
` (25 subsequent siblings)
27 siblings, 2 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
The si_sys_private member of siginfo is used to handle posix-timer rearming
from the signal delivery path. Prevent user space from setting it as that
creates inconsistent state.
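The idea is plain input sanitizing after the copy from user space. A toy
model, assuming a simplified siginfo with only three fields: struct
model_siginfo and model_post_copy_from_user() are illustrative names, and
MODEL_SI_TIMER mirrors the Linux value of SI_TIMER.

```c
#include <assert.h>

#define MODEL_SI_TIMER (-2) /* value of SI_TIMER on Linux, model constant */

struct model_siginfo {
	int si_signo;
	int si_code;
	int si_sys_private; /* kernel internal rearm indicator */
};

/* Scrub what user space must not control after the siginfo was copied in. */
void model_post_copy_from_user(struct model_siginfo *info)
{
	assert(info);
	if (info->si_code == MODEL_SI_TIMER)
		info->si_sys_private = 0;
}
```

A forged timer siginfo loses its si_sys_private value, while other signal
codes pass through untouched.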
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 8 ++++++++
1 file changed, 8 insertions(+)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index a83ea99f9389..7706cd304785 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3354,6 +3354,14 @@ int copy_siginfo_to_user(siginfo_t __user *to, const kernel_siginfo_t *from)
static int post_copy_siginfo_from_user(kernel_siginfo_t *info,
const siginfo_t __user *from)
{
+ /*
+ * Clear the si_sys_private field for timer signals as that's the
+ * indicator for rearming a posix timer. User space submitted
+ * signals are not allowed to inject that.
+ */
+ if (info->si_code == SI_TIMER)
+ info->si_sys_private = 0;
+
if (unlikely(!known_siginfo_layout(info->si_signo, info->si_code))) {
char __user *expansion = si_expansion(from);
char buf[SI_EXPANSION_SIZE];
* [patch v4 03/27] signal: Get rid of resched_timer logic
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
2024-09-27 8:48 ` [patch v4 01/27] signal: Confine POSIX_TIMERS properly Thomas Gleixner
2024-09-27 8:48 ` [patch v4 02/27] signal: Prevent user space from setting si_sys_private Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 13:08 ` Frederic Weisbecker
2024-09-27 13:53 ` Eric W. Biederman
2024-09-27 8:48 ` [patch v4 04/27] posix-timers: Cure si_sys_private race Thomas Gleixner
` (24 subsequent siblings)
27 siblings, 2 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
There is no reason for handing the *resched pointer argument through
several functions just to check whether the signal is related to a self
rearming posix timer.
SI_TIMER is only used by the posix timer code and cannot be queued from
user space. The only extra check in collect_signal() to verify whether the
queued signal is preallocated is not really useful. Some other places
already check purely the SI_TIMER type.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index 7706cd304785..3d2e087283ab 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -526,8 +526,7 @@ bool unhandled_signal(struct task_struct *tsk, int sig)
return !tsk->ptrace;
}
-static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *info,
- bool *resched_timer)
+static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *info)
{
struct sigqueue *q, *first = NULL;
@@ -549,12 +548,6 @@ static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *i
still_pending:
list_del_init(&first->list);
copy_siginfo(info, &first->info);
-
- *resched_timer =
- (first->flags & SIGQUEUE_PREALLOC) &&
- (info->si_code == SI_TIMER) &&
- (info->si_sys_private);
-
__sigqueue_free(first);
} else {
/*
@@ -571,13 +564,12 @@ static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *i
}
}
-static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
- kernel_siginfo_t *info, bool *resched_timer)
+static int __dequeue_signal(struct sigpending *pending, sigset_t *mask, kernel_siginfo_t *info)
{
int sig = next_signal(pending, mask);
if (sig)
- collect_signal(sig, pending, info, resched_timer);
+ collect_signal(sig, pending, info);
return sig;
}
@@ -589,17 +581,15 @@ static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
{
struct task_struct *tsk = current;
- bool resched_timer = false;
int signr;
lockdep_assert_held(&tsk->sighand->siglock);
*type = PIDTYPE_PID;
- signr = __dequeue_signal(&tsk->pending, mask, info, &resched_timer);
+ signr = __dequeue_signal(&tsk->pending, mask, info);
if (!signr) {
*type = PIDTYPE_TGID;
- signr = __dequeue_signal(&tsk->signal->shared_pending,
- mask, info, &resched_timer);
+ signr = __dequeue_signal(&tsk->signal->shared_pending, mask, info);
if (unlikely(signr == SIGALRM))
posixtimer_rearm_itimer(tsk);
@@ -626,7 +616,7 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
}
if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
- if (unlikely(resched_timer))
+ if (unlikely(info->si_code == SI_TIMER && info->si_sys_private))
posixtimer_rearm(info);
}
@@ -1011,6 +1001,9 @@ static int __send_signal_locked(int sig, struct kernel_siginfo *info,
lockdep_assert_held(&t->sighand->siglock);
+ if (WARN_ON_ONCE(!is_si_special(info) && info->si_code == SI_TIMER))
+ return 0;
+
result = TRACE_SIGNAL_IGNORED;
if (!prepare_signal(sig, t, force))
goto ret;
* [patch v4 04/27] posix-timers: Cure si_sys_private race
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (2 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 03/27] signal: Get rid of resched_timer logic Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 14:02 ` Eric W. Biederman
2024-09-27 8:48 ` [patch v4 05/27] signal: Allow POSIX timer signals to be dropped Thomas Gleixner
` (23 subsequent siblings)
27 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
The si_sys_private member of the siginfo which is embedded in the
preallocated sigqueue is used by the posix timer code to decide whether a
timer must be reprogrammed on signal delivery.
The handling of this is racy as a long standing comment in that code
documents. It is modified with the timer lock held, but without sighand
lock being held. The actual signal delivery code checks for it under
sighand lock without holding the timer lock.
Hand the new value to send_sigqueue() as argument and store it with sighand
lock held.
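The fix boils down to writing the field under the same lock under which the
reader evaluates it. A single threaded sketch of that rule, where a boolean
stands in for sighand lock so that the locking discipline can be checked
with assertions. All model_* names are hypothetical and no real concurrency
is exercised here.

```c
#include <assert.h>
#include <stdbool.h>

struct model_sighand { bool locked; };
struct model_sigqueue { int si_sys_private; bool queued; };

static void model_lock(struct model_sighand *s)
{
	assert(!s->locked);
	s->locked = true;
}

static void model_unlock(struct model_sighand *s)
{
	assert(s->locked);
	s->locked = false;
}

/* Writer: the value is handed in and stored only while the lock is held. */
int model_send_sigqueue(struct model_sighand *s, struct model_sigqueue *q,
			int si_private)
{
	model_lock(s);
	q->si_sys_private = si_private;
	q->queued = true;
	model_unlock(s);
	return 0;
}

/* Reader: the caller must hold the lock, so it never sees a half update. */
bool model_needs_rearm(const struct model_sighand *s,
		       const struct model_sigqueue *q)
{
	assert(s->locked);
	return q->queued && q->si_sys_private != 0;
}
```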
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/sched/signal.h | 2 +-
kernel/signal.c | 10 +++++++++-
kernel/time/posix-timers.c | 15 +--------------
3 files changed, 11 insertions(+), 16 deletions(-)
---
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index c8ed09ac29ac..bd9f569231d9 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -340,7 +340,7 @@ extern int send_sig(int, struct task_struct *, int);
extern int zap_other_threads(struct task_struct *p);
extern struct sigqueue *sigqueue_alloc(void);
extern void sigqueue_free(struct sigqueue *);
-extern int send_sigqueue(struct sigqueue *, struct pid *, enum pid_type);
+extern int send_sigqueue(struct sigqueue *, struct pid *, enum pid_type, int si_private);
extern int do_sigaction(int, struct k_sigaction *, struct k_sigaction *);
static inline void clear_notify_signal(void)
diff --git a/kernel/signal.c b/kernel/signal.c
index 3d2e087283ab..443baadb5ab0 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1915,7 +1915,7 @@ void sigqueue_free(struct sigqueue *q)
__sigqueue_free(q);
}
-int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
+int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int si_private)
{
int sig = q->info.si_signo;
struct sigpending *pending;
@@ -1950,6 +1950,14 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
if (!likely(lock_task_sighand(t, &flags)))
goto ret;
+ /*
+ * Update @q::info::si_sys_private for posix timer signals with
+ * sighand locked to prevent a race against dequeue_signal() which
+ * decides based on si_sys_private whether to invoke
+ * posixtimer_rearm() or not.
+ */
+ q->info.si_sys_private = si_private;
+
ret = 1; /* the signal is ignored */
result = TRACE_SIGNAL_IGNORED;
if (!prepare_signal(sig, t, false))
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index bcd5e56412e7..b6cca1ed2f90 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -299,21 +299,8 @@ int posix_timer_queue_signal(struct k_itimer *timr)
if (timr->it_interval)
si_private = ++timr->it_requeue_pending;
- /*
- * FIXME: if ->sigq is queued we can race with
- * dequeue_signal()->posixtimer_rearm().
- *
- * If dequeue_signal() sees the "right" value of
- * si_sys_private it calls posixtimer_rearm().
- * We re-queue ->sigq and drop ->it_lock().
- * posixtimer_rearm() locks the timer
- * and re-schedules it while ->sigq is pending.
- * Not really bad, but not that we want.
- */
- timr->sigq->info.si_sys_private = si_private;
-
type = !(timr->it_sigev_notify & SIGEV_THREAD_ID) ? PIDTYPE_TGID : PIDTYPE_PID;
- ret = send_sigqueue(timr->sigq, timr->it_pid, type);
+ ret = send_sigqueue(timr->sigq, timr->it_pid, type, si_private);
/* If we failed to send the signal the timer stops. */
return ret > 0;
}
* [patch v4 05/27] signal: Allow POSIX timer signals to be dropped
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (3 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 04/27] posix-timers: Cure si_sys_private race Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 06/27] posix-timers: Drop signal if timer has been deleted or reprogrammed Thomas Gleixner
` (22 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
In case that a timer was reprogrammed or deleted an already pending signal
is obsolete. Right now such signals are kept around and eventually
delivered. While POSIX is blurry about this:
- "The effect of disarming or resetting a timer with pending expiration
notifications is unspecified."
- "The disposition of pending signals for the deleted timer is
unspecified."
it is reasonable in both cases to expect that pending signals are discarded
as they have no meaning anymore.
Prepare the signal code to allow dropping posix timer signals.
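The resulting control flow in the dequeue path is a retry loop: when the
timer code rejects a signal, the next pending one is picked instead. A small
user space sketch under the assumption of an array backed queue.
model_deliver_ok() stands in for posixtimer_deliver_signal() and the stale
flag is a made-up shortcut for its decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_signal {
	int signo;
	bool stale; /* timer was deleted or reprogrammed after queueing */
};

struct model_queue {
	const struct model_signal *sigs;
	size_t len;
	size_t head;
};

/* Stand-in for posixtimer_deliver_signal(): false means "drop it". */
static bool model_deliver_ok(const struct model_signal *s)
{
	return !s->stale;
}

/* Returns the next deliverable signal number, or 0 on an empty queue. */
int model_dequeue(struct model_queue *q)
{
	const struct model_signal *s;

	assert(q);
again:
	if (q->head == q->len)
		return 0;

	s = &q->sigs[q->head++];
	if (!model_deliver_ok(s))
		goto again;
	return s->signo;
}
```

Dropped entries are consumed silently, so the caller only ever sees signals
which are still meaningful.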
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 5 +++--
kernel/signal.c | 7 +++++--
kernel/time/posix-timers.c | 3 ++-
3 files changed, 10 insertions(+), 5 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 670bf03a56ef..4ab49e5c42af 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -100,8 +100,9 @@ static inline void posix_cputimers_rt_watchdog(struct posix_cputimers *pct,
{
pct->bases[CPUCLOCK_SCHED].nextevt = runtime;
}
+
void posixtimer_rearm_itimer(struct task_struct *p);
-void posixtimer_rearm(struct kernel_siginfo *info);
+bool posixtimer_deliver_signal(struct kernel_siginfo *info);
/* Init task static initializer */
#define INIT_CPU_TIMERBASE(b) { \
@@ -125,7 +126,7 @@ static inline void posix_cputimers_init(struct posix_cputimers *pct) { }
static inline void posix_cputimers_group_init(struct posix_cputimers *pct,
u64 cpu_limit) { }
static inline void posixtimer_rearm_itimer(struct task_struct *p) { }
-static inline void posixtimer_rearm(struct kernel_siginfo *info) { }
+static inline bool posixtimer_deliver_signal(struct kernel_siginfo *info) { return false; }
#endif
#ifdef CONFIG_POSIX_CPU_TIMERS_TASK_WORK
diff --git a/kernel/signal.c b/kernel/signal.c
index 443baadb5ab0..c35b6ff52767 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -585,6 +585,7 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
lockdep_assert_held(&tsk->sighand->siglock);
+again:
*type = PIDTYPE_PID;
signr = __dequeue_signal(&tsk->pending, mask, info);
if (!signr) {
@@ -616,8 +617,10 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
}
if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
- if (unlikely(info->si_code == SI_TIMER && info->si_sys_private))
- posixtimer_rearm(info);
+ if (unlikely(info->si_code == SI_TIMER && info->si_sys_private)) {
+ if (!posixtimer_deliver_signal(info))
+ goto again;
+ }
}
return signr;
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index b6cca1ed2f90..d7ed7542f803 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -254,7 +254,7 @@ static void common_hrtimer_rearm(struct k_itimer *timr)
* info::si_sys_private is not zero, which indicates that the timer has to
* be rearmed. Restart the timer and update info::si_overrun.
*/
-void posixtimer_rearm(struct kernel_siginfo *info)
+bool posixtimer_deliver_signal(struct kernel_siginfo *info)
{
struct k_itimer *timr;
unsigned long flags;
@@ -286,6 +286,7 @@ void posixtimer_rearm(struct kernel_siginfo *info)
/* Don't expose the si_sys_private value to userspace */
info->si_sys_private = 0;
+ return true;
}
int posix_timer_queue_signal(struct k_itimer *timr)
* [patch v4 06/27] posix-timers: Drop signal if timer has been deleted or reprogrammed
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (4 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 05/27] signal: Allow POSIX timer signals to be dropped Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 07/27] posix-timers: Rename k_itimer::it_requeue_pending Thomas Gleixner
` (21 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
No point in delivering a signal from the past. POSIX does not specify the
behaviour here:
- "The effect of disarming or resetting a timer with pending expiration
notifications is unspecified."
- "The disposition of pending signals for the deleted timer is unspecified."
In both cases it is reasonable to expect that pending signals are
discarded. Especially in the reprogramming case it does not make sense to
account for previous overruns or to deliver a signal for a timer which has
been disarmed.
Drop the signal as that is consistent and understandable behaviour.
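The decision made at delivery time can be modelled as follows. This sketch
only covers what the diff below shows: a failed timer lookup drops the
signal, a successful one delivers it and rearms interval timers whose
sequence value matches. The table lookup, struct model_timer and
model_deliver_signal() are assumptions made for illustration, not the
kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_timer {
	int id;
	bool deleted;
	long interval;       /* 0 means one-shot */
	int requeue_pending; /* matches si_sys_private of the current signal */
	bool armed;
};

/* Stand-in for lock_timer(): NULL when the timer no longer exists. */
static struct model_timer *model_lookup(struct model_timer *tbl, size_t n, int id)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].id == id && !tbl[i].deleted)
			return &tbl[i];
	}
	return NULL;
}

/* Returns true when the signal should be delivered, false to drop it. */
bool model_deliver_signal(struct model_timer *tbl, size_t n, int id,
			  int sys_private)
{
	struct model_timer *t;
	bool ret = false;

	assert(tbl);
	t = model_lookup(tbl, n, id);
	if (!t)
		goto out;

	/* Rearm only interval timers whose pending signal is this one. */
	if (t->interval && t->requeue_pending == sys_private)
		t->armed = true;
	ret = true;
out:
	return ret;
}
```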
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/time/posix-timers.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
---
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index d7ed7542f803..b5d7e71c10f2 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -250,14 +250,14 @@ static void common_hrtimer_rearm(struct k_itimer *timr)
}
/*
- * This function is called from the signal delivery code if
- * info::si_sys_private is not zero, which indicates that the timer has to
- * be rearmed. Restart the timer and update info::si_overrun.
+ * This function is called from the signal delivery code. It decides
+ * whether the signal should be dropped and rearms interval timers.
*/
bool posixtimer_deliver_signal(struct kernel_siginfo *info)
{
struct k_itimer *timr;
unsigned long flags;
+ bool ret = false;
/*
* Release siglock to ensure proper locking order versus
@@ -279,6 +279,7 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
info->si_overrun = timer_overrun_to_int(timr, info->si_overrun);
}
+ ret = true;
unlock_timer(timr, flags);
out:
@@ -286,7 +287,7 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
/* Don't expose the si_sys_private value to userspace */
info->si_sys_private = 0;
- return true;
+ return ret;
}
int posix_timer_queue_signal(struct k_itimer *timr)
* [patch v4 07/27] posix-timers: Rename k_itimer::it_requeue_pending
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (5 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 06/27] posix-timers: Drop signal if timer has been deleted or reprogrammed Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 08/27] posix-timers: Add proper state tracking Thomas Gleixner
` (20 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Prepare for using this struct member to do a proper reprogramming and
deletion accounting so that stale signals can be dropped.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 5 ++---
kernel/time/alarmtimer.c | 2 +-
kernel/time/posix-cpu-timers.c | 4 ++--
kernel/time/posix-timers.c | 12 ++++++------
4 files changed, 11 insertions(+), 12 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 4ab49e5c42af..253d106fac2c 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -150,8 +150,7 @@ static inline void posix_cputimers_init_work(void) { }
* @it_active: Marker that timer is active
* @it_overrun: The overrun counter for pending signals
* @it_overrun_last: The overrun at the time of the last delivered signal
- * @it_requeue_pending: Indicator that timer waits for being requeued on
- * signal delivery
+ * @it_signal_seq: Sequence count to control signal delivery
* @it_sigev_notify: The notify word of sigevent struct for signal delivery
* @it_interval: The interval for periodic timers
* @it_signal: Pointer to the creators signal struct
@@ -172,7 +171,7 @@ struct k_itimer {
int it_active;
s64 it_overrun;
s64 it_overrun_last;
- int it_requeue_pending;
+ unsigned int it_signal_seq;
int it_sigev_notify;
ktime_t it_interval;
struct signal_struct *it_signal;
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 76bd4fda3472..22d5145dd9a7 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -584,7 +584,7 @@ static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
* small intervals cannot starve the system.
*/
ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true);
- ++ptr->it_requeue_pending;
+ ++ptr->it_signal_seq;
ptr->it_active = 1;
result = ALARMTIMER_RESTART;
}
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 6bcee4704059..993243b5be98 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -608,7 +608,7 @@ static void cpu_timer_fire(struct k_itimer *timer)
* ticking in case the signal is deliverable next time.
*/
posix_cpu_timer_rearm(timer);
- ++timer->it_requeue_pending;
+ ++timer->it_signal_seq;
}
}
@@ -745,7 +745,7 @@ static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *i
* - Timers which expired, but the signal has not yet been
* delivered
*/
- if (iv && ((timer->it_requeue_pending & REQUEUE_PENDING) || sigev_none))
+ if (iv && ((timer->it_signal_seq & REQUEUE_PENDING) || sigev_none))
expires = bump_cpu_timer(timer, now);
else
expires = cpu_timer_getexpires(&timer->it.cpu);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index b5d7e71c10f2..26243d38d27d 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -269,13 +269,13 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
if (!timr)
goto out;
- if (timr->it_interval && timr->it_requeue_pending == info->si_sys_private) {
+ if (timr->it_interval && timr->it_signal_seq == info->si_sys_private) {
timr->kclock->timer_rearm(timr);
timr->it_active = 1;
timr->it_overrun_last = timr->it_overrun;
timr->it_overrun = -1LL;
- ++timr->it_requeue_pending;
+ ++timr->it_signal_seq;
info->si_overrun = timer_overrun_to_int(timr, info->si_overrun);
}
@@ -299,7 +299,7 @@ int posix_timer_queue_signal(struct k_itimer *timr)
timr->it_active = 0;
if (timr->it_interval)
- si_private = ++timr->it_requeue_pending;
+ si_private = ++timr->it_signal_seq;
type = !(timr->it_sigev_notify & SIGEV_THREAD_ID) ? PIDTYPE_TGID : PIDTYPE_PID;
ret = send_sigqueue(timr->sigq, timr->it_pid, type, si_private);
@@ -366,7 +366,7 @@ static enum hrtimer_restart posix_timer_fn(struct hrtimer *timer)
timr->it_overrun += hrtimer_forward(timer, now, timr->it_interval);
ret = HRTIMER_RESTART;
- ++timr->it_requeue_pending;
+ ++timr->it_signal_seq;
timr->it_active = 1;
}
}
@@ -667,7 +667,7 @@ void common_timer_get(struct k_itimer *timr, struct itimerspec64 *cur_setting)
* is a SIGEV_NONE timer move the expiry time forward by intervals,
* so expiry is > now.
*/
- if (iv && (timr->it_requeue_pending & REQUEUE_PENDING || sig_none))
+ if (iv && (timr->it_signal_seq & REQUEUE_PENDING || sig_none))
timr->it_overrun += kc->timer_forward(timr, now);
remaining = kc->timer_remaining(timr, now);
@@ -868,7 +868,7 @@ void posix_timer_set_common(struct k_itimer *timer, struct itimerspec64 *new_set
timer->it_interval = 0;
/* Prevent reloading in case there is a signal pending */
- timer->it_requeue_pending = (timer->it_requeue_pending + 2) & ~REQUEUE_PENDING;
+ timer->it_signal_seq = (timer->it_signal_seq + 2) & ~REQUEUE_PENDING;
/* Reset overrun accounting */
timer->it_overrun_last = 0;
timer->it_overrun = -1LL;
* [patch v4 08/27] posix-timers: Add proper state tracking
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (6 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 07/27] posix-timers: Rename k_itimer::it_requeue_pending Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 09/27] posix-timers: Make signal delivery consistent Thomas Gleixner
` (19 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Right now the state tracking is done by two struct members:
- it_active:
A boolean which tracks armed/disarmed state
- it_signal_seq:
A sequence counter which is used to invalidate settings
and prevent rearming
Replace it_active with it_status and keep proper track of the states in
one place.
This allows reusing it_signal_seq to track reprogramming, disarm and
delete operations in order to drop signals which are related to the state
prior to those operations.
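The transitions can be illustrated with a minimal user-space sketch. The
enum mirrors the one this patch adds to kernel/time/posix-timers.h. Everything
else (struct model_timer and the model_*() helpers) is hypothetical and only
models how a single it_status member replaces the it_active boolean. It is
not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum this patch adds to kernel/time/posix-timers.h. */
enum posix_timer_state {
	POSIX_TIMER_DISARMED,
	POSIX_TIMER_ARMED,
	POSIX_TIMER_REQUEUE_PENDING,
};

/* Hypothetical, stripped-down stand-in for struct k_itimer. */
struct model_timer {
	enum posix_timer_state	it_status;
	unsigned int		it_signal_seq;
	long long		it_interval;	/* 0 means one-shot */
};

/* Models timer_settime() with a non-zero value: the timer is armed. */
static void model_arm(struct model_timer *t, long long interval)
{
	assert(t != NULL);
	t->it_interval = interval;
	t->it_status = POSIX_TIMER_ARMED;
}

/*
 * Models expiry. An interval timer waits for signal delivery before it
 * is rearmed, a one-shot timer is simply done. Returns the value which
 * would be stored in si_sys_private of the queued signal.
 */
static int model_expire(struct model_timer *t)
{
	enum posix_timer_state state = POSIX_TIMER_DISARMED;
	int si_private = 0;

	if (t->it_interval) {
		state = POSIX_TIMER_REQUEUE_PENDING;
		si_private = (int)++t->it_signal_seq;
	}
	t->it_status = state;
	return si_private;
}

/*
 * Models signal delivery. Returns true if the timer was rearmed, which
 * only happens for an interval timer whose queued signal is still current.
 */
static bool model_deliver(struct model_timer *t, int si_private)
{
	if (t->it_interval && t->it_signal_seq == (unsigned int)si_private) {
		t->it_status = POSIX_TIMER_ARMED;
		++t->it_signal_seq;
		return true;
	}
	return false;
}
```

In this model an interval timer cycles ARMED -> REQUEUE_PENDING -> ARMED,
while a one-shot timer ends up DISARMED after its first expiry.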
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 4 ++--
kernel/time/alarmtimer.c | 2 +-
kernel/time/posix-cpu-timers.c | 15 ++++++++-------
kernel/time/posix-timers.c | 22 +++++++++++++---------
kernel/time/posix-timers.h | 6 ++++++
5 files changed, 30 insertions(+), 19 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 253d106fac2c..02afbb4da7f7 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -147,7 +147,7 @@ static inline void posix_cputimers_init_work(void) { }
* @kclock: Pointer to the k_clock struct handling this timer
* @it_clock: The posix timer clock id
* @it_id: The posix timer id for identifying the timer
- * @it_active: Marker that timer is active
+ * @it_status: The status of the timer
* @it_overrun: The overrun counter for pending signals
* @it_overrun_last: The overrun at the time of the last delivered signal
* @it_signal_seq: Sequence count to control signal delivery
@@ -168,7 +168,7 @@ struct k_itimer {
const struct k_clock *kclock;
clockid_t it_clock;
timer_t it_id;
- int it_active;
+ int it_status;
s64 it_overrun;
s64 it_overrun_last;
unsigned int it_signal_seq;
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 22d5145dd9a7..71360f8072d4 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -585,7 +585,7 @@ static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
*/
ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true);
++ptr->it_signal_seq;
- ptr->it_active = 1;
+ ptr->it_status = POSIX_TIMER_ARMED;
result = ALARMTIMER_RESTART;
}
spin_unlock_irqrestore(&ptr->it_lock, flags);
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 993243b5be98..12f828d704b1 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -453,7 +453,6 @@ static void disarm_timer(struct k_itimer *timer, struct task_struct *p)
struct cpu_timer *ctmr = &timer->it.cpu;
struct posix_cputimer_base *base;
- timer->it_active = 0;
if (!cpu_timer_dequeue(ctmr))
return;
@@ -494,11 +493,12 @@ static int posix_cpu_timer_del(struct k_itimer *timer)
*/
WARN_ON_ONCE(ctmr->head || timerqueue_node_queued(&ctmr->node));
} else {
- if (timer->it.cpu.firing)
+ if (timer->it.cpu.firing) {
ret = TIMER_RETRY;
- else
+ } else {
disarm_timer(timer, p);
-
+ timer->it_status = POSIX_TIMER_DISARMED;
+ }
unlock_task_sighand(p, &flags);
}
@@ -560,7 +560,7 @@ static void arm_timer(struct k_itimer *timer, struct task_struct *p)
struct cpu_timer *ctmr = &timer->it.cpu;
u64 newexp = cpu_timer_getexpires(ctmr);
- timer->it_active = 1;
+ timer->it_status = POSIX_TIMER_ARMED;
if (!cpu_timer_enqueue(&base->tqhead, ctmr))
return;
@@ -586,7 +586,8 @@ static void cpu_timer_fire(struct k_itimer *timer)
{
struct cpu_timer *ctmr = &timer->it.cpu;
- timer->it_active = 0;
+ timer->it_status = POSIX_TIMER_DISARMED;
+
if (unlikely(timer->sigq == NULL)) {
/*
* This a special case for clock_nanosleep,
@@ -671,7 +672,7 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
ret = TIMER_RETRY;
} else {
cpu_timer_dequeue(ctmr);
- timer->it_active = 0;
+ timer->it_status = POSIX_TIMER_DISARMED;
}
/*
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 26243d38d27d..6f0dacec25e0 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -272,7 +272,7 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
if (timr->it_interval && timr->it_signal_seq == info->si_sys_private) {
timr->kclock->timer_rearm(timr);
- timr->it_active = 1;
+ timr->it_status = POSIX_TIMER_ARMED;
timr->it_overrun_last = timr->it_overrun;
timr->it_overrun = -1LL;
++timr->it_signal_seq;
@@ -292,14 +292,17 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
int posix_timer_queue_signal(struct k_itimer *timr)
{
+ enum posix_timer_state state = POSIX_TIMER_DISARMED;
int ret, si_private = 0;
enum pid_type type;
lockdep_assert_held(&timr->it_lock);
- timr->it_active = 0;
- if (timr->it_interval)
+ if (timr->it_interval) {
+ state = POSIX_TIMER_REQUEUE_PENDING;
si_private = ++timr->it_signal_seq;
+ }
+ timr->it_status = state;
type = !(timr->it_sigev_notify & SIGEV_THREAD_ID) ? PIDTYPE_TGID : PIDTYPE_PID;
ret = send_sigqueue(timr->sigq, timr->it_pid, type, si_private);
@@ -367,7 +370,7 @@ static enum hrtimer_restart posix_timer_fn(struct hrtimer *timer)
timr->it_overrun += hrtimer_forward(timer, now, timr->it_interval);
ret = HRTIMER_RESTART;
++timr->it_signal_seq;
- timr->it_active = 1;
+ timr->it_status = POSIX_TIMER_ARMED;
}
}
@@ -647,10 +650,10 @@ void common_timer_get(struct k_itimer *timr, struct itimerspec64 *cur_setting)
/* interval timer ? */
if (iv) {
cur_setting->it_interval = ktime_to_timespec64(iv);
- } else if (!timr->it_active) {
+ } else if (timr->it_status == POSIX_TIMER_DISARMED) {
/*
* SIGEV_NONE oneshot timers are never queued and therefore
- * timr->it_active is always false. The check below
+ * timr->it_status is always DISARMED. The check below
* vs. remaining time will handle this case.
*
* For all other timers there is nothing to update here, so
@@ -895,7 +898,7 @@ int common_timer_set(struct k_itimer *timr, int flags,
if (kc->timer_try_to_cancel(timr) < 0)
return TIMER_RETRY;
- timr->it_active = 0;
+ timr->it_status = POSIX_TIMER_DISARMED;
posix_timer_set_common(timr, new_setting);
/* Keep timer disarmed when it_value is zero */
@@ -908,7 +911,8 @@ int common_timer_set(struct k_itimer *timr, int flags,
sigev_none = timr->it_sigev_notify == SIGEV_NONE;
kc->timer_arm(timr, expires, flags & TIMER_ABSTIME, sigev_none);
- timr->it_active = !sigev_none;
+ if (!sigev_none)
+ timr->it_status = POSIX_TIMER_ARMED;
return 0;
}
@@ -1007,7 +1011,7 @@ int common_timer_del(struct k_itimer *timer)
timer->it_interval = 0;
if (kc->timer_try_to_cancel(timer) < 0)
return TIMER_RETRY;
- timer->it_active = 0;
+ timer->it_status = POSIX_TIMER_DISARMED;
return 0;
}
diff --git a/kernel/time/posix-timers.h b/kernel/time/posix-timers.h
index 4784ea65f685..4d09677e584e 100644
--- a/kernel/time/posix-timers.h
+++ b/kernel/time/posix-timers.h
@@ -1,6 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 */
#define TIMER_RETRY 1
+enum posix_timer_state {
+ POSIX_TIMER_DISARMED,
+ POSIX_TIMER_ARMED,
+ POSIX_TIMER_REQUEUE_PENDING,
+};
+
struct k_clock {
int (*clock_getres)(const clockid_t which_clock,
struct timespec64 *tp);
* [patch v4 09/27] posix-timers: Make signal delivery consistent
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (7 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 08/27] posix-timers: Add proper state tracking Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 10/27] posix-timers: Make signal overrun accounting sensible Thomas Gleixner
` (18 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Timers which are reprogrammed, disarmed or deleted can still deliver
signals related to the past. The POSIX spec is blurry about this:
- "The effect of disarming or resetting a timer with pending expiration
notifications is unspecified."
- "The disposition of pending signals for the deleted timer is
unspecified."
In both cases it is reasonable to expect that pending signals are
discarded. Especially in the reprogramming case it does not make sense to
account for previous overruns or to deliver a signal for a timer which has
been disarmed. This makes the behaviour consistent and understandable.
Remove the si_sys_private check from the signal delivery code and invoke
posixtimer_deliver_signal() unconditionally for SI_TIMER signals.
Change that function so it controls the actual signal delivery via the
return value. It now instructs the signal code to drop the signal when:
1) The timer no longer exists in the hash table
2) The timer's it_signal_seq value is not the same as the si_sys_private
value which was set when the signal was queued.
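The two drop conditions can be sketched in user space as follows. This is a
simplified model under stated assumptions, not the kernel code: struct
model_timer, struct model_siginfo and all model_*() functions are
hypothetical, locking and the actual rearming are left out, and a NULL timer
pointer stands in for a failed hash table lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_DISARMED, MODEL_ARMED, MODEL_REQUEUE_PENDING };

/* Hypothetical stand-in for the parts of struct k_itimer used here. */
struct model_timer {
	enum model_state	it_status;
	unsigned int		it_signal_seq;
	long long		it_interval;
};

/* Hypothetical stand-in for the queued siginfo. */
struct model_siginfo {
	int	si_sys_private;
};

/* Expiry: snapshot the current sequence count into the queued signal. */
static void model_queue_signal(struct model_timer *t, struct model_siginfo *info)
{
	enum model_state state = MODEL_DISARMED;

	if (t->it_interval) {
		t->it_signal_seq++;
		state = MODEL_REQUEUE_PENDING;
	}
	t->it_status = state;
	info->si_sys_private = (int)t->it_signal_seq;
}

/* Models timer_settime() and timer_delete(): invalidate queued signals. */
static void model_invalidate(struct model_timer *t)
{
	t->it_signal_seq++;
}

/*
 * Delivery: returns true if the signal should reach user space and
 * false if it has to be dropped. @t is NULL when the timer is gone.
 */
static bool model_deliver_signal(struct model_timer *t, struct model_siginfo *info)
{
	bool ret = false;

	if (!t)
		goto out;

	if (t->it_signal_seq != (unsigned int)info->si_sys_private)
		goto out;

	/* The real code rearms the timer at this point. */
	if (t->it_interval && t->it_status == MODEL_REQUEUE_PENDING)
		t->it_status = MODEL_ARMED;

	ret = true;
out:
	/* Never expose the sequence count to user space. */
	info->si_sys_private = 0;
	return ret;
}
```

A signal queued before model_invalidate() carries a stale sequence count and
is dropped in this model. A signal whose count still matches is delivered
and moves an interval timer back to the armed state.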
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 2 --
kernel/signal.c | 2 +-
kernel/time/posix-cpu-timers.c | 2 +-
kernel/time/posix-timers.c | 25 +++++++++++++++----------
4 files changed, 17 insertions(+), 14 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 02afbb4da7f7..8c6d97412526 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -137,8 +137,6 @@ static inline void clear_posix_cputimers_work(struct task_struct *p) { }
static inline void posix_cputimers_init_work(void) { }
#endif
-#define REQUEUE_PENDING 1
-
/**
* struct k_itimer - POSIX.1b interval timer structure.
* @list: List head for binding the timer to signals->posix_timers
diff --git a/kernel/signal.c b/kernel/signal.c
index c35b6ff52767..a407724f1267 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -617,7 +617,7 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
}
if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
- if (unlikely(info->si_code == SI_TIMER && info->si_sys_private)) {
+ if (unlikely(info->si_code == SI_TIMER)) {
if (!posixtimer_deliver_signal(info))
goto again;
}
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 12f828d704b1..bc2cd32b7a40 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -746,7 +746,7 @@ static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *i
* - Timers which expired, but the signal has not yet been
* delivered
*/
- if (iv && ((timer->it_signal_seq & REQUEUE_PENDING) || sigev_none))
+ if (iv && timer->it_status != POSIX_TIMER_ARMED)
expires = bump_cpu_timer(timer, now);
else
expires = cpu_timer_getexpires(&timer->it.cpu);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 6f0dacec25e0..1231efb7c30f 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -269,7 +269,10 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
if (!timr)
goto out;
- if (timr->it_interval && timr->it_signal_seq == info->si_sys_private) {
+ if (timr->it_signal_seq != info->si_sys_private)
+ goto out_unlock;
+
+ if (timr->it_interval && timr->it_status == POSIX_TIMER_REQUEUE_PENDING) {
timr->kclock->timer_rearm(timr);
timr->it_status = POSIX_TIMER_ARMED;
@@ -281,6 +284,7 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
}
ret = true;
+out_unlock:
unlock_timer(timr, flags);
out:
spin_lock(&current->sighand->siglock);
@@ -293,19 +297,19 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
int posix_timer_queue_signal(struct k_itimer *timr)
{
enum posix_timer_state state = POSIX_TIMER_DISARMED;
- int ret, si_private = 0;
enum pid_type type;
+ int ret;
lockdep_assert_held(&timr->it_lock);
if (timr->it_interval) {
+ timr->it_signal_seq++;
state = POSIX_TIMER_REQUEUE_PENDING;
- si_private = ++timr->it_signal_seq;
}
timr->it_status = state;
type = !(timr->it_sigev_notify & SIGEV_THREAD_ID) ? PIDTYPE_TGID : PIDTYPE_PID;
- ret = send_sigqueue(timr->sigq, timr->it_pid, type, si_private);
+ ret = send_sigqueue(timr->sigq, timr->it_pid, type, timr->it_signal_seq);
/* If we failed to send the signal the timer stops. */
return ret > 0;
}
@@ -670,7 +674,7 @@ void common_timer_get(struct k_itimer *timr, struct itimerspec64 *cur_setting)
* is a SIGEV_NONE timer move the expiry time forward by intervals,
* so expiry is > now.
*/
- if (iv && (timr->it_signal_seq & REQUEUE_PENDING || sig_none))
+ if (iv && timr->it_status != POSIX_TIMER_ARMED)
timr->it_overrun += kc->timer_forward(timr, now);
remaining = kc->timer_remaining(timr, now);
@@ -870,8 +874,6 @@ void posix_timer_set_common(struct k_itimer *timer, struct itimerspec64 *new_set
else
timer->it_interval = 0;
- /* Prevent reloading in case there is a signal pending */
- timer->it_signal_seq = (timer->it_signal_seq + 2) & ~REQUEUE_PENDING;
/* Reset overrun accounting */
timer->it_overrun_last = 0;
timer->it_overrun = -1LL;
@@ -889,8 +891,6 @@ int common_timer_set(struct k_itimer *timr, int flags,
if (old_setting)
common_timer_get(timr, old_setting);
- /* Prevent rearming by clearing the interval */
- timr->it_interval = 0;
/*
* Careful here. On SMP systems the timer expiry function could be
* active and spinning on timr->it_lock.
@@ -940,6 +940,9 @@ static int do_timer_settime(timer_t timer_id, int tmr_flags,
if (old_spec64)
old_spec64->it_interval = ktime_to_timespec64(timr->it_interval);
+ /* Prevent signal delivery and rearming. */
+ timr->it_signal_seq++;
+
kc = timr->kclock;
if (WARN_ON_ONCE(!kc || !kc->timer_set))
error = -EINVAL;
@@ -1008,7 +1011,6 @@ int common_timer_del(struct k_itimer *timer)
{
const struct k_clock *kc = timer->kclock;
- timer->it_interval = 0;
if (kc->timer_try_to_cancel(timer) < 0)
return TIMER_RETRY;
timer->it_status = POSIX_TIMER_DISARMED;
@@ -1036,6 +1038,9 @@ SYSCALL_DEFINE1(timer_delete, timer_t, timer_id)
if (!timer)
return -EINVAL;
+ /* Prevent signal delivery and rearming. */
+ timer->it_signal_seq++;
+
if (unlikely(timer_delete_hook(timer) == TIMER_RETRY)) {
/* Unlocks and relocks the timer if it still exists */
timer = timer_wait_running(timer, &flags);
* [patch v4 10/27] posix-timers: Make signal overrun accounting sensible
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (8 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 09/27] posix-timers: Make signal delivery consistent Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 11/27] posix-cpu-timers: Use dedicated flag for CPU timer nanosleep Thomas Gleixner
` (17 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
The handling of the timer overrun in the signal code is inconsistent as it
takes previous overruns into account. This is just wrong as after the
reprogramming of a timer the overrun count starts over from a clean state,
i.e. 0.
Make the accounting in send_sigqueue() consistent with that.
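The difference between the old and the new accounting can be shown with a
small user-space model. It is only a sketch: struct model_sigqueue and both
model_send_*() functions are hypothetical, the "queued" flag stands in for
!list_empty(&q->list), and the ignored-signal path is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the preallocated sigqueue of a timer. */
struct model_sigqueue {
	bool	queued;		/* models !list_empty(&q->list) */
	int	si_overrun;
	int	si_sys_private;
};

/* Old behaviour: an already queued entry accumulates overruns. */
static void model_send_old(struct model_sigqueue *q, int si_private)
{
	q->si_sys_private = si_private;
	if (q->queued) {
		q->si_overrun++;
		return;
	}
	q->si_overrun = 0;
	q->queued = true;
}

/* New behaviour: every send starts from a clean overrun count. */
static void model_send_new(struct model_sigqueue *q, int si_private)
{
	q->si_sys_private = si_private;
	q->si_overrun = 0;
	if (q->queued)
		return;
	q->queued = true;
}
```

Sending twice without an intermediate dequeue leaves si_overrun at 1 in the
old model and at 0 in the new one, which matches the "starts over from a
clean state" rule for a reprogrammed timer.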
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 34 ++++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index a407724f1267..a99274287902 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1961,6 +1961,34 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int s
*/
q->info.si_sys_private = si_private;
+ /*
+ * Set the overrun count to zero unconditionally. The posix timer
+ * code does not self rearm periodic timers. They are rearmed from
+ * dequeue_signal().
+ *
+ * But there is a situation where @q is already enqueued:
+ *
+ * 1) timer_settime()
+ * arm_timer()
+ * 2) timer_expires()
+ * send_sigqueue(@q)
+ * enqueue(@q)
+ * 3) timer_settime()
+ * arm_timer()
+ * 4) timer_expires()
+ * send_sigqueue(@q) <- Observes @q already queued
+ *
+ * In this case incrementing si_overrun does not make sense because
+ * there is no relationship between timer_settime() #1 and #2.
+ *
+ * The POSIX specification is useful as always: "The effect of
+ * disarming or resetting a timer with pending expiration
+ * notifications is unspecified."
+ *
+ * Just do the sensible thing and reset the overrun.
+ */
+ q->info.si_overrun = 0;
+
ret = 1; /* the signal is ignored */
result = TRACE_SIGNAL_IGNORED;
if (!prepare_signal(sig, t, false))
@@ -1968,15 +1996,9 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int s
ret = 0;
if (unlikely(!list_empty(&q->list))) {
- /*
- * If an SI_TIMER entry is already queue just increment
- * the overrun count.
- */
- q->info.si_overrun++;
result = TRACE_SIGNAL_ALREADY_PENDING;
goto out;
}
- q->info.si_overrun = 0;
signalfd_notify(t, sig);
pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;
* [patch v4 11/27] posix-cpu-timers: Use dedicated flag for CPU timer nanosleep
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (9 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 10/27] posix-timers: Make signal overrun accounting sensible Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 12/27] posix-timers: Add a refcount to struct k_itimer Thomas Gleixner
` (16 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
POSIX CPU timer nanosleep creates a k_itimer on the stack and uses the
sigq pointer to detect the nanosleep case in the expiry function.
Prepare for embedding sigqueue into struct k_itimer by using a dedicated
flag for nanosleep.
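Why the NULL pointer check stops working once the sigqueue is embedded can
be seen in a small sketch. All types and helpers below are hypothetical
user-space stand-ins. Only the name of the nanosleep member is taken from
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholder for struct sigqueue. */
struct model_sigqueue {
	int	unused;
};

struct model_cpu_timer {
	bool	nanosleep;	/* the dedicated flag added by this patch */
};

struct model_timer {
	struct model_sigqueue	*sigq;	/* a pointer today, to be embedded later */
	struct model_cpu_timer	cpu;
};

/* Old check: relies on nanosleep timers never having a sigqueue. */
static bool model_is_nanosleep_old(const struct model_timer *t)
{
	return t->sigq == NULL;
}

/* New check: independent of how the sigqueue is stored. */
static bool model_is_nanosleep_new(const struct model_timer *t)
{
	return t->cpu.nanosleep;
}
```

Both checks agree as long as sigq is a pointer. Once the sigqueue becomes a
struct member there is no pointer left to compare against NULL, so only the
flag based check remains usable.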
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 4 +++-
kernel/time/posix-cpu-timers.c | 3 ++-
2 files changed, 5 insertions(+), 2 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 8c6d97412526..bcd01208d795 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -42,6 +42,7 @@ static inline int clockid_to_fd(const clockid_t clk)
* @pid: Pointer to target task PID
* @elist: List head for the expiry list
* @firing: Timer is currently firing
+ * @nanosleep: Timer is used for nanosleep and is not a regular posix-timer
* @handling: Pointer to the task which handles expiry
*/
struct cpu_timer {
@@ -49,7 +50,8 @@ struct cpu_timer {
struct timerqueue_head *head;
struct pid *pid;
struct list_head elist;
- int firing;
+ bool firing;
+ bool nanosleep;
struct task_struct __rcu *handling;
};
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index bc2cd32b7a40..ea1835cb541a 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -588,7 +588,7 @@ static void cpu_timer_fire(struct k_itimer *timer)
timer->it_status = POSIX_TIMER_DISARMED;
- if (unlikely(timer->sigq == NULL)) {
+ if (unlikely(ctmr->nanosleep)) {
/*
* This a special case for clock_nanosleep,
* not a normal timer from sys_timer_create.
@@ -1479,6 +1479,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
timer.it_overrun = -1;
error = posix_cpu_timer_create(&timer);
timer.it_process = current;
+ timer.it.cpu.nanosleep = true;
if (!error) {
static struct itimerspec64 zero_it;
* [patch v4 12/27] posix-timers: Add a refcount to struct k_itimer
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (10 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 11/27] posix-cpu-timers: Use dedicated flag for CPU timer nanosleep Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 13/27] signal: Split up __sigqueue_alloc() Thomas Gleixner
` (15 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To cure the SIG_IGN handling for posix interval timers, the preallocated
sigqueue needs to be embedded into struct k_itimer to prevent life time
races of all sorts.
To make that work correctly it needs reference counting so that timer
deletion does not free the timer prematurely when there is a signal queued
or delivered concurrently.
Add an rcuref to the posix timer part.
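The lifetime rule can be sketched in user space with a plain C11 atomic
counter instead of rcuref_t. This is an illustration only: struct
model_timer and the model_*() helpers are hypothetical, and model_getref()
is an assumption about how a later user would pin the timer, since this
patch only adds the initialization and the put side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct k_itimer with its reference count. */
struct model_timer {
	atomic_int	refs;
	bool		*freed;	/* lets the caller observe the final free */
};

static struct model_timer *model_alloc(bool *freed)
{
	struct model_timer *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	atomic_init(&t->refs, 1);	/* the creator holds the first reference */
	t->freed = freed;
	*freed = false;
	return t;
}

/* Assumed usage: taken while a signal for this timer is in flight. */
static void model_getref(struct model_timer *t)
{
	atomic_fetch_add(&t->refs, 1);
}

/* Drops one reference. Dropping the last one frees the timer. */
static void model_putref(struct model_timer *t)
{
	if (atomic_fetch_sub(&t->refs, 1) == 1) {
		*t->freed = true;
		free(t);
	}
}
```

With an extra reference held, the put issued by timer deletion does not free
the object. The memory is only released when the last holder drops its
reference.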
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 14 ++++++++++++++
kernel/time/posix-timers.c | 7 ++++---
2 files changed, 18 insertions(+), 3 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index bcd01208d795..9740fd0c2933 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -6,11 +6,13 @@
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/posix-timers_types.h>
+#include <linux/rcuref.h>
#include <linux/spinlock.h>
#include <linux/timerqueue.h>
struct kernel_siginfo;
struct task_struct;
+struct k_itimer;
static inline clockid_t make_process_cpuclock(const unsigned int pid,
const clockid_t clock)
@@ -105,6 +107,7 @@ static inline void posix_cputimers_rt_watchdog(struct posix_cputimers *pct,
void posixtimer_rearm_itimer(struct task_struct *p);
bool posixtimer_deliver_signal(struct kernel_siginfo *info);
+void posixtimer_free_timer(struct k_itimer *timer);
/* Init task static initializer */
#define INIT_CPU_TIMERBASE(b) { \
@@ -129,6 +132,7 @@ static inline void posix_cputimers_group_init(struct posix_cputimers *pct,
u64 cpu_limit) { }
static inline void posixtimer_rearm_itimer(struct task_struct *p) { }
static inline bool posixtimer_deliver_signal(struct kernel_siginfo *info) { return false; }
+static inline void posixtimer_free_timer(struct k_itimer *timer) { }
#endif
#ifdef CONFIG_POSIX_CPU_TIMERS_TASK_WORK
@@ -156,6 +160,7 @@ static inline void posix_cputimers_init_work(void) { }
* @it_signal: Pointer to the creators signal struct
* @it_pid: The pid of the process/task targeted by the signal
* @it_process: The task to wakeup on clock_nanosleep (CPU timers)
+ * @rcuref: Reference count for life time management
* @sigq: Pointer to preallocated sigqueue
* @it: Union representing the various posix timer type
* internals.
@@ -180,6 +185,7 @@ struct k_itimer {
struct task_struct *it_process;
};
struct sigqueue *sigq;
+ rcuref_t rcuref;
union {
struct {
struct hrtimer timer;
@@ -200,4 +206,12 @@ void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
int update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
+#ifdef CONFIG_POSIX_TIMERS
+static inline void posixtimer_putref(struct k_itimer *tmr)
+{
+ if (rcuref_put(&tmr->rcuref))
+ posixtimer_free_timer(tmr);
+}
+#endif /* !CONFIG_POSIX_TIMERS */
+
#endif
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1231efb7c30f..1c2f6090b767 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -417,6 +417,7 @@ static struct k_itimer * alloc_posix_timer(void)
return NULL;
}
clear_siginfo(&tmr->sigq->info);
+ rcuref_init(&tmr->rcuref, 1);
return tmr;
}
@@ -427,7 +428,7 @@ static void k_itimer_rcu_free(struct rcu_head *head)
kmem_cache_free(posix_timers_cache, tmr);
}
-static void posix_timer_free(struct k_itimer *tmr)
+void posixtimer_free_timer(struct k_itimer *tmr)
{
put_pid(tmr->it_pid);
sigqueue_free(tmr->sigq);
@@ -439,7 +440,7 @@ static void posix_timer_unhash_and_free(struct k_itimer *tmr)
spin_lock(&hash_lock);
hlist_del_rcu(&tmr->t_hash);
spin_unlock(&hash_lock);
- posix_timer_free(tmr);
+ posixtimer_putref(tmr);
}
static int common_timer_create(struct k_itimer *new_timer)
@@ -474,7 +475,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
*/
new_timer_id = posix_timer_add(new_timer);
if (new_timer_id < 0) {
- posix_timer_free(new_timer);
+ posixtimer_free_timer(new_timer);
return new_timer_id;
}
* [patch v4 13/27] signal: Split up __sigqueue_alloc()
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (11 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 12/27] posix-timers: Add a refcount to struct k_itimer Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 14/27] signal: Provide posixtimer_sigqueue_init() Thomas Gleixner
` (14 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To cure the SIG_IGN handling for posix interval timers, the preallocated
sigqueue needs to be embedded into struct k_itimer to prevent life time
races of all sorts.
Reorganize __sigqueue_alloc() so the ucounts retrieval and the
initialization can be used independently.
No functional change.
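The shape of the split can be sketched as follows. The model is a user-space
simplification under stated assumptions: struct model_ucounts, struct
model_sigqueue and the model_*() functions are hypothetical, a plain counter
replaces the ucounts and rlimit machinery, and malloc() replaces the slab
allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-user accounting, standing in for ucounts. */
struct model_ucounts {
	long	pending;
	long	limit;
};

struct model_sigqueue {
	unsigned int		flags;
	struct model_ucounts	*ucounts;
};

/* Step 1: charge the user, or fail if the limit would be exceeded. */
static struct model_ucounts *model_get_ucounts(struct model_ucounts *uc,
					       bool override_limit)
{
	uc->pending++;
	if (!override_limit && uc->pending > uc->limit) {
		uc->pending--;
		return NULL;
	}
	return uc;
}

/* Step 2: initialize a sigqueue, no matter where its memory lives. */
static void model_sigqueue_init(struct model_sigqueue *q,
				struct model_ucounts *uc, unsigned int flags)
{
	q->flags = flags;
	q->ucounts = uc;
}

/* The combined allocator becomes a thin wrapper around both steps. */
static struct model_sigqueue *model_sigqueue_alloc(struct model_ucounts *uc,
						   bool override_limit,
						   unsigned int flags)
{
	struct model_sigqueue *q;

	if (!model_get_ucounts(uc, override_limit))
		return NULL;

	q = malloc(sizeof(*q));
	if (!q) {
		uc->pending--;	/* undo the charge on allocation failure */
		return NULL;
	}
	model_sigqueue_init(q, uc, flags);
	return q;
}
```

Because the two steps are separate, a sigqueue which is embedded in another
object can be charged and initialized without going through the allocator.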
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 52 +++++++++++++++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 17 deletions(-)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index a99274287902..87c349a2ddf7 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -396,16 +396,9 @@ void task_join_group_stop(struct task_struct *task)
task_set_jobctl_pending(task, mask | JOBCTL_STOP_PENDING);
}
-/*
- * allocate a new signal queue record
- * - this may be called without locks if and only if t == current, otherwise an
- * appropriate lock must be held to stop the target task from exiting
- */
-static struct sigqueue *
-__sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
- int override_rlimit, const unsigned int sigqueue_flags)
+static struct ucounts *sig_get_ucounts(struct task_struct *t, int sig,
+ int override_rlimit)
{
- struct sigqueue *q = NULL;
struct ucounts *ucounts;
long sigpending;
@@ -424,19 +417,44 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
if (!sigpending)
return NULL;
- if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) {
- q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
- } else {
+ if (unlikely(!override_rlimit && sigpending > task_rlimit(t, RLIMIT_SIGPENDING))) {
+ dec_rlimit_put_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING);
print_dropped_signal(sig);
+ return NULL;
}
- if (unlikely(q == NULL)) {
+ return ucounts;
+}
+
+static void __sigqueue_init(struct sigqueue *q, struct ucounts *ucounts,
+ const unsigned int sigqueue_flags)
+{
+ INIT_LIST_HEAD(&q->list);
+ q->flags = sigqueue_flags;
+ q->ucounts = ucounts;
+}
+
+/*
+ * allocate a new signal queue record
+ * - this may be called without locks if and only if t == current, otherwise an
+ * appropriate lock must be held to stop the target task from exiting
+ */
+static struct sigqueue *__sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
+ int override_rlimit, const unsigned int sigqueue_flags)
+{
+ struct ucounts *ucounts = sig_get_ucounts(t, sig, override_rlimit);
+ struct sigqueue *q;
+
+ if (!ucounts)
+ return NULL;
+
+ q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
+ if (!q) {
dec_rlimit_put_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING);
- } else {
- INIT_LIST_HEAD(&q->list);
- q->flags = sigqueue_flags;
- q->ucounts = ucounts;
+ return NULL;
}
+
+ __sigqueue_init(q, ucounts, sigqueue_flags);
return q;
}
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [patch v4 14/27] signal: Provide posixtimer_sigqueue_init()
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (12 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 13/27] signal: Split up __sigqueue_alloc() Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 15/27] signal: Add sys_private_ptr to siginfo::_sifields:: _timer Thomas Gleixner
` (13 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To cure the SIG_IGN handling for posix interval timers, the preallocated
sigqueue needs to be embedded into struct k_itimer to prevent life time
races of all sorts.
Provide a new function to initialize the embedded sigqueue to prepare for
that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 2 ++
kernel/signal.c | 11 +++++++++++
2 files changed, 13 insertions(+)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 9740fd0c2933..200098d27cc0 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -12,6 +12,7 @@
struct kernel_siginfo;
struct task_struct;
+struct sigqueue;
struct k_itimer;
static inline clockid_t make_process_cpuclock(const unsigned int pid,
@@ -106,6 +107,7 @@ static inline void posix_cputimers_rt_watchdog(struct posix_cputimers *pct,
}
void posixtimer_rearm_itimer(struct task_struct *p);
+bool posixtimer_init_sigqueue(struct sigqueue *q);
bool posixtimer_deliver_signal(struct kernel_siginfo *info);
void posixtimer_free_timer(struct k_itimer *timer);
diff --git a/kernel/signal.c b/kernel/signal.c
index 87c349a2ddf7..a857f6628e77 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1905,6 +1905,17 @@ void flush_itimer_signals(void)
__flush_itimer_signals(&tsk->signal->shared_pending);
}
+bool posixtimer_init_sigqueue(struct sigqueue *q)
+{
+ struct ucounts *ucounts = sig_get_ucounts(current, -1, 0);
+
+ if (!ucounts)
+ return false;
+ clear_siginfo(&q->info);
+ __sigqueue_init(q, ucounts, SIGQUEUE_PREALLOC);
+ return true;
+}
+
struct sigqueue *sigqueue_alloc(void)
{
return __sigqueue_alloc(-1, current, GFP_KERNEL, 0, SIGQUEUE_PREALLOC);
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [patch v4 15/27] signal: Add sys_private_ptr to siginfo::_sifields:: _timer
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (13 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 14/27] signal: Provide posixtimer_sigqueue_init() Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 16/27] posix-timers: Store PID type in the timer Thomas Gleixner
` (12 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
On signal delivery collect_signal() copies the queued siginfo into a caller
provided siginfo struct. The posix timer signal delivery code then uses
siginfo::si_tid to look up the timer in the hash table.
That's required today as the timer and the sigqueue are separate entities
and have different life time rules.
The sigqueue will be embedded into struct k_itimer to address a few issues
in the posix timer signal handling, which means the life time rules are
no longer separate, so the lookup can be spared.
Due to locking rules posixtimer_deliver_signal() cannot be invoked from
collect_signal(). The timer pointer could be handed down from
collect_signal() to dequeue_signal(), but that's just overhead for the
non-posixtimer case.
There is room in the _sifields union for an extra pointer which will be
used later for storing the timer pointer. This field is copied with siginfo
and cleared before the info is delivered to userspace like the existing
si_sys_private field.
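As a rough userspace illustration of the idea: the info block carries a back
pointer to its timer, so no lookup by id is needed, and the internal fields
are scrubbed before the block leaves the kernel. The struct layout and all
demo_* names are assumptions made for this sketch and do not match the real
siginfo layout.

```c
#include <assert.h>
#include <stddef.h>

struct demo_timer { int id; };

/* Hypothetical, simplified analogue of the _timer member of the union. */
struct demo_timer_info {
	int   tid;          /* timer id, visible to user space */
	int   overrun;      /* overrun count, visible to user space */
	int   sys_private;  /* internal: sequence number */
	void *sys_privptr;  /* internal: back pointer to the timer */
};

/* Fill in the info block at timer creation or signal queueing time. */
static void demo_info_init(struct demo_timer_info *info, struct demo_timer *tmr,
			   int seq)
{
	info->tid = tmr->id;
	info->overrun = 0;
	info->sys_private = seq;
	info->sys_privptr = tmr;
}

/* Direct access to the timer, replacing a hash lookup by tid. */
static struct demo_timer *demo_info_timer(const struct demo_timer_info *info)
{
	return info->sys_privptr;
}

/* Clear the internal fields before the info is handed to user space. */
static void demo_info_scrub(struct demo_timer_info *info)
{
	info->sys_private = 0;
	info->sys_privptr = NULL;
}
```

The scrub step matters because a kernel pointer must never leak to user
space, and user space must not be able to inject one either.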
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/kernel/signal_32.c | 2 +-
arch/x86/kernel/signal_64.c | 2 +-
include/uapi/asm-generic/siginfo.h | 2 ++
kernel/signal.c | 6 ++++--
kernel/time/posix-timers.c | 4 +++-
5 files changed, 11 insertions(+), 5 deletions(-)
---
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index ef654530bf5a..34ce839ed4ff 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -456,7 +456,7 @@ CHECK_SI_OFFSET(_timer);
/* compat_siginfo_t doesn't have si_sys_private */
CHECK_SI_SIZE (_timer, 3*sizeof(int));
#else
-CHECK_SI_SIZE (_timer, 4*sizeof(int));
+CHECK_SI_SIZE (_timer, 5*sizeof(int));
#endif
static_assert(offsetof(siginfo32_t, si_tid) == 0x0C);
static_assert(offsetof(siginfo32_t, si_overrun) == 0x10);
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index 8a94053c5444..dd96b7f3f60c 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -462,7 +462,7 @@ static_assert(offsetof(siginfo_t, si_pid) == 0x10);
static_assert(offsetof(siginfo_t, si_uid) == 0x14);
CHECK_SI_OFFSET(_timer);
-CHECK_SI_SIZE (_timer, 6*sizeof(int));
+CHECK_SI_SIZE (_timer, 8*sizeof(int));
static_assert(offsetof(siginfo_t, si_tid) == 0x10);
static_assert(offsetof(siginfo_t, si_overrun) == 0x14);
static_assert(offsetof(siginfo_t, si_value) == 0x18);
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index b7bc545ec3b2..702d7d3ca117 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -47,6 +47,7 @@ union __sifields {
int _overrun; /* overrun count */
sigval_t _sigval; /* same as below */
int _sys_private; /* not to be passed to user */
+ void *_sys_privptr; /* not to be passed to user */
} _timer;
/* POSIX.1b signals */
@@ -146,6 +147,7 @@ typedef struct siginfo {
#define si_tid _sifields._timer._tid
#define si_overrun _sifields._timer._overrun
#define si_sys_private _sifields._timer._sys_private
+#define si_sys_privptr _sifields._timer._sys_privptr
#define si_status _sifields._sigchld._status
#define si_utime _sifields._sigchld._utime
#define si_stime _sifields._sigchld._stime
diff --git a/kernel/signal.c b/kernel/signal.c
index a857f6628e77..0aa01eec5e2d 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3410,12 +3410,14 @@ static int post_copy_siginfo_from_user(kernel_siginfo_t *info,
const siginfo_t __user *from)
{
/*
- * Clear the si_sys_private field for timer signals as that's the
+ * Clear the si_sys_priv* fields for timer signals as that's the
* indicator for rearming a posix timer. User space submitted
* signals are not allowed to inject that.
*/
- if (info->si_code == SI_TIMER)
+ if (info->si_code == SI_TIMER) {
info->si_sys_private = 0;
+ info->si_sys_privptr = NULL;
+ }
if (unlikely(!known_siginfo_layout(info->si_signo, info->si_code))) {
char __user *expansion = si_expansion(from);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1c2f6090b767..9d7e02db4157 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -289,8 +289,9 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
out:
spin_lock(&current->sighand->siglock);
- /* Don't expose the si_sys_private value to userspace */
+ /* Don't expose the si_sys_priv* values to userspace */
info->si_sys_private = 0;
+ info->si_sys_privptr = NULL;
return ret;
}
@@ -505,6 +506,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
new_timer->sigq->info.si_tid = new_timer->it_id;
new_timer->sigq->info.si_code = SI_TIMER;
+ new_timer->sigq->info.si_sys_privptr = new_timer;
if (copy_to_user(created_timer_id, &new_timer_id, sizeof (new_timer_id))) {
error = -EFAULT;
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [patch v4 16/27] posix-timers: Store PID type in the timer
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (14 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 15/27] signal: Add sys_private_ptr to siginfo::_sifields:: _timer Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:48 ` [patch v4 17/27] signal: Refactor send_sigqueue() Thomas Gleixner
` (11 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Store the PID type in the timer at creation time instead of re-evaluating
the signal delivery mode everywhere.
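A minimal userspace sketch of the pattern, assuming hypothetical demo_* names
and a made-up flag value: the delivery mode is derived once from the notify
flags when the timer is set up, and every later user reads the cached result.

```c
#include <assert.h>

/* Hypothetical flag, standing in for SIGEV_THREAD_ID. */
#define DEMO_NOTIFY_THREAD_ID 4

enum demo_pid_type { DEMO_PIDTYPE_PID, DEMO_PIDTYPE_TGID };

struct demo_ptimer {
	int notify;                   /* notification flags as requested */
	enum demo_pid_type pid_type;  /* cached delivery target type */
};

/* Evaluate the delivery mode exactly once, at setup time. */
static void demo_ptimer_setup(struct demo_ptimer *t, int notify)
{
	t->notify = notify;
	if (notify & DEMO_NOTIFY_THREAD_ID)
		t->pid_type = DEMO_PIDTYPE_PID;   /* a specific thread */
	else
		t->pid_type = DEMO_PIDTYPE_TGID;  /* the whole process */
}

/* Later users read the stored value instead of decoding the flags again. */
static enum demo_pid_type demo_ptimer_target(const struct demo_ptimer *t)
{
	return t->pid_type;
}
```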
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 2 ++
kernel/time/posix-timers.c | 9 ++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 200098d27cc0..947176582de9 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -5,6 +5,7 @@
#include <linux/alarmtimer.h>
#include <linux/list.h>
#include <linux/mutex.h>
+#include <linux/pid.h>
#include <linux/posix-timers_types.h>
#include <linux/rcuref.h>
#include <linux/spinlock.h>
@@ -180,6 +181,7 @@ struct k_itimer {
s64 it_overrun_last;
unsigned int it_signal_seq;
int it_sigev_notify;
+ enum pid_type it_pid_type;
ktime_t it_interval;
struct signal_struct *it_signal;
union {
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 9d7e02db4157..bf68d80a0d75 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -298,7 +298,6 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
int posix_timer_queue_signal(struct k_itimer *timr)
{
enum posix_timer_state state = POSIX_TIMER_DISARMED;
- enum pid_type type;
int ret;
lockdep_assert_held(&timr->it_lock);
@@ -309,8 +308,7 @@ int posix_timer_queue_signal(struct k_itimer *timr)
}
timr->it_status = state;
- type = !(timr->it_sigev_notify & SIGEV_THREAD_ID) ? PIDTYPE_TGID : PIDTYPE_PID;
- ret = send_sigqueue(timr->sigq, timr->it_pid, type, timr->it_signal_seq);
+ ret = send_sigqueue(timr->sigq, timr->it_pid, timr->it_pid_type, timr->it_signal_seq);
/* If we failed to send the signal the timer stops. */
return ret > 0;
}
@@ -504,6 +502,11 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
new_timer->it_pid = get_pid(task_tgid(current));
}
+ if (new_timer->it_sigev_notify & SIGEV_THREAD_ID)
+ new_timer->it_pid_type = PIDTYPE_PID;
+ else
+ new_timer->it_pid_type = PIDTYPE_TGID;
+
new_timer->sigq->info.si_tid = new_timer->it_id;
new_timer->sigq->info.si_code = SI_TIMER;
new_timer->sigq->info.si_sys_privptr = new_timer;
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [patch v4 17/27] signal: Refactor send_sigqueue()
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (15 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 16/27] posix-timers: Store PID type in the timer Thomas Gleixner
@ 2024-09-27 8:48 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 18/27] posix-timers: Embed sigqueue in struct k_itimer Thomas Gleixner
` (10 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:48 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To handle posix timers which have their signal ignored via SIG_IGN properly
it is required to requeue an ignored signal for delivery when SIG_IGN is
lifted so the timer gets rearmed.
Split the required code out of send_sigqueue() so it can be reused in
context of sigaction().
While at it rename send_sigqueue() to posixtimer_send_sigqueue() so it's
clear what this is about.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 1 +-
include/linux/sched/signal.h | 1 +-
kernel/signal.c | 73 +++++++++++++++++++++++++--------------------
kernel/time/posix-timers.c | 2 +-
4 files changed, 44 insertions(+), 33 deletions(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 947176582de9..52611ea923b2 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -109,6 +109,7 @@ static inline void posix_cputimers_rt_watchdog(struct posix_cputimers *pct,
void posixtimer_rearm_itimer(struct task_struct *p);
bool posixtimer_init_sigqueue(struct sigqueue *q);
+int posixtimer_send_sigqueue(struct k_itimer *tmr);
bool posixtimer_deliver_signal(struct kernel_siginfo *info);
void posixtimer_free_timer(struct k_itimer *timer);
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index bd9f569231d9..36283c1c55e9 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -340,7 +340,6 @@ extern int send_sig(int, struct task_struct *, int);
extern int zap_other_threads(struct task_struct *p);
extern struct sigqueue *sigqueue_alloc(void);
extern void sigqueue_free(struct sigqueue *);
-extern int send_sigqueue(struct sigqueue *, struct pid *, enum pid_type, int si_private);
extern int do_sigaction(int, struct k_sigaction *, struct k_sigaction *);
static inline void clear_notify_signal(void)
diff --git a/kernel/signal.c b/kernel/signal.c
index 0aa01eec5e2d..01102470e174 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1947,38 +1947,53 @@ void sigqueue_free(struct sigqueue *q)
__sigqueue_free(q);
}
-int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int si_private)
+static void posixtimer_queue_sigqueue(struct sigqueue *q, struct task_struct *t, enum pid_type type)
{
- int sig = q->info.si_signo;
struct sigpending *pending;
+ int sig = q->info.si_signo;
+
+ signalfd_notify(t, sig);
+ pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;
+ list_add_tail(&q->list, &pending->list);
+ sigaddset(&pending->signal, sig);
+ complete_signal(sig, t, type);
+}
+
+/*
+ * This function is used by POSIX timers to deliver a timer signal.
+ * Where type is PIDTYPE_PID (such as for timers with SIGEV_THREAD_ID
+ * set), the signal must be delivered to the specific thread (queues
+ * into t->pending).
+ *
+ * Where type is not PIDTYPE_PID, signals must be delivered to the
+ * process. In this case, prefer to deliver to current if it is in
+ * the same thread group as the target process, which avoids
+ * unnecessarily waking up a potentially idle task.
+ */
+static inline struct task_struct *posixtimer_get_target(struct k_itimer *tmr)
+{
+ struct task_struct *t = pid_task(tmr->it_pid, tmr->it_pid_type);
+
+ if (t && tmr->it_pid_type != PIDTYPE_PID && same_thread_group(t, current))
+ t = current;
+ return t;
+}
+
+int posixtimer_send_sigqueue(struct k_itimer *tmr)
+{
+ struct sigqueue *q = tmr->sigq;
+ int sig = q->info.si_signo;
struct task_struct *t;
unsigned long flags;
int ret, result;
- if (WARN_ON_ONCE(!(q->flags & SIGQUEUE_PREALLOC)))
- return 0;
- if (WARN_ON_ONCE(q->info.si_code != SI_TIMER))
- return 0;
-
ret = -1;
rcu_read_lock();
- /*
- * This function is used by POSIX timers to deliver a timer signal.
- * Where type is PIDTYPE_PID (such as for timers with SIGEV_THREAD_ID
- * set), the signal must be delivered to the specific thread (queues
- * into t->pending).
- *
- * Where type is not PIDTYPE_PID, signals must be delivered to the
- * process. In this case, prefer to deliver to current if it is in
- * the same thread group as the target process, which avoids
- * unnecessarily waking up a potentially idle task.
- */
- t = pid_task(pid, type);
+ t = posixtimer_get_target(tmr);
if (!t)
goto ret;
- if (type != PIDTYPE_PID && same_thread_group(t, current))
- t = current;
+
if (!likely(lock_task_sighand(t, &flags)))
goto ret;
@@ -1988,7 +2003,7 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int s
* decides based on si_sys_private whether to invoke
* posixtimer_rearm() or not.
*/
- q->info.si_sys_private = si_private;
+ q->info.si_sys_private = tmr->it_signal_seq;
/*
* Set the overrun count to zero unconditionally. The posix timer
@@ -2019,24 +2034,20 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int s
q->info.si_overrun = 0;
ret = 1; /* the signal is ignored */
- result = TRACE_SIGNAL_IGNORED;
- if (!prepare_signal(sig, t, false))
+ if (!prepare_signal(sig, t, false)) {
+ result = TRACE_SIGNAL_IGNORED;
goto out;
+ }
ret = 0;
if (unlikely(!list_empty(&q->list))) {
result = TRACE_SIGNAL_ALREADY_PENDING;
goto out;
}
-
- signalfd_notify(t, sig);
- pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;
- list_add_tail(&q->list, &pending->list);
- sigaddset(&pending->signal, sig);
- complete_signal(sig, t, type);
+ posixtimer_queue_sigqueue(q, t, tmr->it_pid_type);
result = TRACE_SIGNAL_DELIVERED;
out:
- trace_signal_generate(sig, &q->info, t, type != PIDTYPE_PID, result);
+ trace_signal_generate(sig, &q->info, t, tmr->it_pid_type != PIDTYPE_PID, result);
unlock_task_sighand(t, &flags);
ret:
rcu_read_unlock();
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index bf68d80a0d75..369c8f1c5e4c 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -308,7 +308,7 @@ int posix_timer_queue_signal(struct k_itimer *timr)
}
timr->it_status = state;
- ret = send_sigqueue(timr->sigq, timr->it_pid, timr->it_pid_type, timr->it_signal_seq);
+ ret = posixtimer_send_sigqueue(timr);
/* If we failed to send the signal the timer stops. */
return ret > 0;
}
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [patch v4 18/27] posix-timers: Embed sigqueue in struct k_itimer
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (16 preceding siblings ...)
2024-09-27 8:48 ` [patch v4 17/27] signal: Refactor send_sigqueue() Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 19/27] signal: Cleanup unused posix-timer leftovers Thomas Gleixner
` (9 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To cure the SIG_IGN handling for posix interval timers, the preallocated
sigqueue needs to be embedded into struct k_itimer to prevent life time
races of all sorts.
Now that the prerequisites are in place, embed the sigqueue into struct
k_itimer and fix up the relevant usage sites.
Aside from preparing for proper SIG_IGN handling, this spares an extra
allocation.
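The life time coupling can be sketched in userspace C as follows. This is a
simplified illustration, not the kernel implementation: a plain integer
replaces rcuref_t, a flag replaces the RCU deferred free, and all demo_* names
are hypothetical. Only the container_of() technique itself mirrors what the
patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() helper. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_sigq { int signo; };

struct demo_ktimer {
	int id;
	int refs;               /* plain counter, the kernel uses rcuref_t */
	bool freed;             /* stands in for the deferred free */
	struct demo_sigq sigq;  /* embedded, not a pointer */
};

static void demo_ktimer_getref(struct demo_ktimer *t)
{
	t->refs++;
}

/* Returns true when the last reference was dropped. */
static bool demo_ktimer_putref(struct demo_ktimer *t)
{
	if (--t->refs == 0) {
		t->freed = true;
		return true;
	}
	return false;
}

/* A queued signal pins the timer it is embedded in. */
static void demo_sigq_getref(struct demo_sigq *q)
{
	demo_ktimer_getref(demo_container_of(q, struct demo_ktimer, sigq));
}

static bool demo_sigq_putref(struct demo_sigq *q)
{
	return demo_ktimer_putref(demo_container_of(q, struct demo_ktimer, sigq));
}
```

In this model, deleting a timer while its signal is queued only drops the
creation reference. The memory stays valid until signal delivery or a flush
drops the reference taken at queueing time.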
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
fs/proc/base.c | 4 +--
include/linux/posix-timers.h | 23 ++++++++++++++++--
kernel/signal.c | 12 +++++++--
kernel/time/posix-timers.c | 59 +++++++++++++++++++++++++++------------------
4 files changed, 69 insertions(+), 29 deletions(-)
---
diff --git a/fs/proc/base.c b/fs/proc/base.c
index dd579332a7f8..f01ea013ff9b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2496,8 +2496,8 @@ static int show_timer(struct seq_file *m, void *v)
seq_printf(m, "ID: %d\n", timer->it_id);
seq_printf(m, "signal: %d/%px\n",
- timer->sigq->info.si_signo,
- timer->sigq->info.si_value.sival_ptr);
+ timer->sigq.info.si_signo,
+ timer->sigq.info.si_value.sival_ptr);
seq_printf(m, "notify: %s/%s.%d\n",
nstr[notify & ~SIGEV_THREAD_ID],
(notify & SIGEV_THREAD_ID) ? "tid" : "pid",
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 52611ea923b2..ddd7ccd9ba77 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -39,6 +39,8 @@ static inline int clockid_to_fd(const clockid_t clk)
#ifdef CONFIG_POSIX_TIMERS
+#include <linux/signal_types.h>
+
/**
* cpu_timer - Posix CPU timer representation for k_itimer
* @node: timerqueue node to queue in the task/sig
@@ -165,7 +167,7 @@ static inline void posix_cputimers_init_work(void) { }
* @it_pid: The pid of the process/task targeted by the signal
* @it_process: The task to wakeup on clock_nanosleep (CPU timers)
* @rcuref: Reference count for life time management
- * @sigq: Pointer to preallocated sigqueue
+ * @sigq: Embedded sigqueue
* @it: Union representing the various posix timer type
* internals.
* @rcu: RCU head for freeing the timer.
@@ -189,7 +191,7 @@ struct k_itimer {
struct pid *it_pid;
struct task_struct *it_process;
};
- struct sigqueue *sigq;
+ struct sigqueue sigq;
rcuref_t rcuref;
union {
struct {
@@ -217,6 +219,23 @@ static inline void posixtimer_putref(struct k_itimer *tmr)
if (rcuref_put(&tmr->rcuref))
posixtimer_free_timer(tmr);
}
+
+static inline void posixtimer_sigqueue_getref(struct sigqueue *q)
+{
+ struct k_itimer *tmr = container_of(q, struct k_itimer, sigq);
+
+ WARN_ON_ONCE(!rcuref_get(&tmr->rcuref));
+}
+
+static inline void posixtimer_sigqueue_putref(struct sigqueue *q)
+{
+ struct k_itimer *tmr = container_of(q, struct k_itimer, sigq);
+
+ posixtimer_putref(tmr);
+}
+#else /* CONFIG_POSIX_TIMERS */
+static inline void posixtimer_sigqueue_getref(struct sigqueue *q) { }
+static inline void posixtimer_sigqueue_putref(struct sigqueue *q) { }
#endif /* !CONFIG_POSIX_TIMERS */
#endif
diff --git a/kernel/signal.c b/kernel/signal.c
index 01102470e174..7a07f86e2ae6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -566,7 +566,12 @@ static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *i
still_pending:
list_del_init(&first->list);
copy_siginfo(info, &first->info);
- __sigqueue_free(first);
+ /*
+ * Do not drop the reference count for posix timer
+ * signals. That's done in posixtimer_deliver_signal().
+ */
+ if (info->si_code != SI_TIMER)
+ __sigqueue_free(first);
} else {
/*
* Ok, it wasn't in the queue. This must be
@@ -1981,7 +1986,7 @@ static inline struct task_struct *posixtimer_get_target(struct k_itimer *tmr)
int posixtimer_send_sigqueue(struct k_itimer *tmr)
{
- struct sigqueue *q = tmr->sigq;
+ struct sigqueue *q = &tmr->sigq;
int sig = q->info.si_signo;
struct task_struct *t;
unsigned long flags;
@@ -2041,9 +2046,12 @@ int posixtimer_send_sigqueue(struct k_itimer *tmr)
ret = 0;
if (unlikely(!list_empty(&q->list))) {
+ /* This holds a reference count already */
result = TRACE_SIGNAL_ALREADY_PENDING;
goto out;
}
+
+ posixtimer_sigqueue_getref(q);
posixtimer_queue_sigqueue(q, t, tmr->it_pid_type);
result = TRACE_SIGNAL_DELIVERED;
out:
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 369c8f1c5e4c..b62e3ccb45ff 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -251,12 +251,13 @@ static void common_hrtimer_rearm(struct k_itimer *timr)
/*
* This function is called from the signal delivery code. It decides
- * whether the signal should be dropped and rearms interval timers.
+ * whether the signal should be dropped and rearms interval timers. The
+ * timer can be unconditionally accessed as there is a reference held on
+ * it.
*/
bool posixtimer_deliver_signal(struct kernel_siginfo *info)
{
- struct k_itimer *timr;
- unsigned long flags;
+ struct k_itimer *timr = info->si_sys_privptr;
bool ret = false;
/*
@@ -264,12 +265,14 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
* timr::it_lock. Keep interrupts disabled.
*/
spin_unlock(&current->sighand->siglock);
+ spin_lock(&timr->it_lock);
- timr = lock_timer(info->si_tid, &flags);
- if (!timr)
- goto out;
-
- if (timr->it_signal_seq != info->si_sys_private)
+ /*
+ * Check if the timer is still alive or whether it got modified
+ * since the signal was queued. In either case, don't rearm and
+ * drop the signal.
+ */
+ if (!timr->it_signal || timr->it_signal_seq != info->si_sys_private)
goto out_unlock;
if (timr->it_interval && timr->it_status == POSIX_TIMER_REQUEUE_PENDING) {
@@ -285,8 +288,10 @@ bool posixtimer_deliver_signal(struct kernel_siginfo *info)
ret = true;
out_unlock:
- unlock_timer(timr, flags);
-out:
+ spin_unlock(&timr->it_lock);
+ /* Drop the reference which was acquired when the signal was queued */
+ posixtimer_putref(timr);
+
spin_lock(&current->sighand->siglock);
/* Don't expose the si_sys_priv* values to userspace */
@@ -405,17 +410,17 @@ static struct pid *good_sigevent(sigevent_t * event)
}
}
-static struct k_itimer * alloc_posix_timer(void)
+static struct k_itimer *alloc_posix_timer(void)
{
struct k_itimer *tmr = kmem_cache_zalloc(posix_timers_cache, GFP_KERNEL);
if (!tmr)
return tmr;
- if (unlikely(!(tmr->sigq = sigqueue_alloc()))) {
+
+ if (unlikely(!posixtimer_init_sigqueue(&tmr->sigq))) {
kmem_cache_free(posix_timers_cache, tmr);
return NULL;
}
- clear_siginfo(&tmr->sigq->info);
rcuref_init(&tmr->rcuref, 1);
return tmr;
}
@@ -430,7 +435,8 @@ static void k_itimer_rcu_free(struct rcu_head *head)
void posixtimer_free_timer(struct k_itimer *tmr)
{
put_pid(tmr->it_pid);
- sigqueue_free(tmr->sigq);
+ if (tmr->sigq.ucounts)
+ dec_rlimit_put_ucounts(tmr->sigq.ucounts, UCOUNT_RLIMIT_SIGPENDING);
call_rcu(&tmr->rcu, k_itimer_rcu_free);
}
@@ -492,13 +498,13 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
goto out;
}
new_timer->it_sigev_notify = event->sigev_notify;
- new_timer->sigq->info.si_signo = event->sigev_signo;
- new_timer->sigq->info.si_value = event->sigev_value;
+ new_timer->sigq.info.si_signo = event->sigev_signo;
+ new_timer->sigq.info.si_value = event->sigev_value;
} else {
new_timer->it_sigev_notify = SIGEV_SIGNAL;
- new_timer->sigq->info.si_signo = SIGALRM;
- memset(&new_timer->sigq->info.si_value, 0, sizeof(sigval_t));
- new_timer->sigq->info.si_value.sival_int = new_timer->it_id;
+ new_timer->sigq.info.si_signo = SIGALRM;
+ memset(&new_timer->sigq.info.si_value, 0, sizeof(sigval_t));
+ new_timer->sigq.info.si_value.sival_int = new_timer->it_id;
new_timer->it_pid = get_pid(task_tgid(current));
}
@@ -507,9 +513,9 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
else
new_timer->it_pid_type = PIDTYPE_TGID;
- new_timer->sigq->info.si_tid = new_timer->it_id;
- new_timer->sigq->info.si_code = SI_TIMER;
- new_timer->sigq->info.si_sys_privptr = new_timer;
+ new_timer->sigq.info.si_tid = new_timer->it_id;
+ new_timer->sigq.info.si_code = SI_TIMER;
+ new_timer->sigq.info.si_sys_privptr = new_timer;
if (copy_to_user(created_timer_id, &new_timer_id, sizeof (new_timer_id))) {
error = -EFAULT;
@@ -593,7 +599,14 @@ static struct k_itimer *__lock_timer(timer_t timer_id, unsigned long *flags)
* 1) Set timr::it_signal to NULL with timr::it_lock held
* 2) Release timr::it_lock
* 3) Remove from the hash under hash_lock
- * 4) Call RCU for removal after the grace period
+ * 4) Put the reference count.
+ *
+ * The reference count might not drop to zero if timr::sigq is
+ * queued. In that case the signal delivery or flush will put the
+ * last reference count.
+ *
+ * When the reference count reaches zero, the timer is scheduled
+ * for RCU removal after the grace period.
*
* Holding rcu_read_lock() accross the lookup ensures that
* the timer cannot be freed.
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [patch v4 19/27] signal: Cleanup unused posix-timer leftovers
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (17 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 18/27] posix-timers: Embed sigqueue in struct k_itimer Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 20/27] signal: Add task argument to flush_sigqueue_mask() Thomas Gleixner
` (8 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Remove the leftovers of sigqueue preallocation as it is no longer used.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/sched/signal.h | 2 --
kernel/signal.c | 43 +++++++------------------------------------
2 files changed, 7 insertions(+), 38 deletions(-)
---
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 36283c1c55e9..02972fd41931 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -338,8 +338,6 @@ extern void force_fatal_sig(int);
extern void force_exit_sig(int);
extern int send_sig(int, struct task_struct *, int);
extern int zap_other_threads(struct task_struct *p);
-extern struct sigqueue *sigqueue_alloc(void);
-extern void sigqueue_free(struct sigqueue *);
extern int do_sigaction(int, struct k_sigaction *, struct k_sigaction *);
static inline void clear_notify_signal(void)
diff --git a/kernel/signal.c b/kernel/signal.c
index 7a07f86e2ae6..48bceca90a91 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -439,8 +439,8 @@ static void __sigqueue_init(struct sigqueue *q, struct ucounts *ucounts,
* - this may be called without locks if and only if t == current, otherwise an
* appropriate lock must be held to stop the target task from exiting
*/
-static struct sigqueue *__sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
- int override_rlimit, const unsigned int sigqueue_flags)
+static struct sigqueue *sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
+ int override_rlimit)
{
struct ucounts *ucounts = sig_get_ucounts(t, sig, override_rlimit);
struct sigqueue *q;
@@ -454,14 +454,16 @@ static struct sigqueue *__sigqueue_alloc(int sig, struct task_struct *t, gfp_t g
return NULL;
}
- __sigqueue_init(q, ucounts, sigqueue_flags);
+ __sigqueue_init(q, ucounts, 0);
return q;
}
static void __sigqueue_free(struct sigqueue *q)
{
- if (q->flags & SIGQUEUE_PREALLOC)
+ if (q->flags & SIGQUEUE_PREALLOC) {
+ posixtimer_sigqueue_putref(q);
return;
+ }
if (q->ucounts) {
dec_rlimit_put_ucounts(q->ucounts, UCOUNT_RLIMIT_SIGPENDING);
q->ucounts = NULL;
@@ -1065,7 +1067,7 @@ static int __send_signal_locked(int sig, struct kernel_siginfo *info,
else
override_rlimit = 0;
- q = __sigqueue_alloc(sig, t, GFP_ATOMIC, override_rlimit, 0);
+ q = sigqueue_alloc(sig, t, GFP_ATOMIC, override_rlimit);
if (q) {
list_add_tail(&q->list, &pending->list);
@@ -1921,37 +1923,6 @@ bool posixtimer_init_sigqueue(struct sigqueue *q)
return true;
}
-struct sigqueue *sigqueue_alloc(void)
-{
- return __sigqueue_alloc(-1, current, GFP_KERNEL, 0, SIGQUEUE_PREALLOC);
-}
-
-void sigqueue_free(struct sigqueue *q)
-{
- spinlock_t *lock = &current->sighand->siglock;
- unsigned long flags;
-
- if (WARN_ON_ONCE(!(q->flags & SIGQUEUE_PREALLOC)))
- return;
- /*
- * We must hold ->siglock while testing q->list
- * to serialize with collect_signal() or with
- * __exit_signal()->flush_sigqueue().
- */
- spin_lock_irqsave(lock, flags);
- q->flags &= ~SIGQUEUE_PREALLOC;
- /*
- * If it is queued it will be freed when dequeued,
- * like the "regular" sigqueue.
- */
- if (!list_empty(&q->list))
- q = NULL;
- spin_unlock_irqrestore(lock, flags);
-
- if (q)
- __sigqueue_free(q);
-}
-
static void posixtimer_queue_sigqueue(struct sigqueue *q, struct task_struct *t, enum pid_type type)
{
struct sigpending *pending;
* [patch v4 20/27] signal: Add task argument to flush_sigqueue_mask()
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (18 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 19/27] signal: Cleanup unused posix-timer leftovers Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 21/27] signal: Provide ignored_posix_timers list Thomas Gleixner
` (7 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To prepare for handling posix timer signals on sigaction(SIG_IGN) properly,
add a task argument to flush_sigqueue_mask() and fix up all call sites.
This argument will be used in a later step to enqueue posix timers on an
ignored list, so their signal can be requeued when SIG_IGN is lifted later
on.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index 48bceca90a91..93c2d681309c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -724,11 +724,10 @@ void signal_wake_up_state(struct task_struct *t, unsigned int state)
/*
* Remove signals in mask from the pending set and queue.
- * Returns 1 if any signals were found.
*
* All callers must be holding the siglock.
*/
-static void flush_sigqueue_mask(sigset_t *mask, struct sigpending *s)
+static void flush_sigqueue_mask(sigset_t *mask, struct sigpending *s, struct task_struct *ptmr_tsk)
{
struct sigqueue *q, *n;
sigset_t m;
@@ -866,18 +865,18 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
* This is a stop signal. Remove SIGCONT from all queues.
*/
siginitset(&flush, sigmask(SIGCONT));
- flush_sigqueue_mask(&flush, &signal->shared_pending);
+ flush_sigqueue_mask(&flush, &signal->shared_pending, NULL);
for_each_thread(p, t)
- flush_sigqueue_mask(&flush, &t->pending);
+ flush_sigqueue_mask(&flush, &t->pending, NULL);
} else if (sig == SIGCONT) {
unsigned int why;
/*
* Remove all stop signals from all queues, wake all threads.
*/
siginitset(&flush, SIG_KERNEL_STOP_MASK);
- flush_sigqueue_mask(&flush, &signal->shared_pending);
+ flush_sigqueue_mask(&flush, &signal->shared_pending, NULL);
for_each_thread(p, t) {
- flush_sigqueue_mask(&flush, &t->pending);
+ flush_sigqueue_mask(&flush, &t->pending, NULL);
task_clear_jobctl_pending(t, JOBCTL_STOP_PENDING);
if (likely(!(t->ptrace & PT_SEIZED))) {
t->jobctl &= ~JOBCTL_STOPPED;
@@ -4169,8 +4168,8 @@ void kernel_sigaction(int sig, __sighandler_t action)
sigemptyset(&mask);
sigaddset(&mask, sig);
- flush_sigqueue_mask(&mask, &current->signal->shared_pending);
- flush_sigqueue_mask(&mask, &current->pending);
+ flush_sigqueue_mask(&mask, &current->signal->shared_pending, NULL);
+ flush_sigqueue_mask(&mask, &current->pending, NULL);
recalc_sigpending();
}
spin_unlock_irq(&current->sighand->siglock);
@@ -4237,9 +4236,9 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
if (sig_handler_ignored(sig_handler(p, sig), sig)) {
sigemptyset(&mask);
sigaddset(&mask, sig);
- flush_sigqueue_mask(&mask, &p->signal->shared_pending);
+ flush_sigqueue_mask(&mask, &p->signal->shared_pending, NULL);
for_each_thread(p, t)
- flush_sigqueue_mask(&mask, &t->pending);
+ flush_sigqueue_mask(&mask, &t->pending, NULL);
}
}
* [patch v4 21/27] signal: Provide ignored_posix_timers list
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (19 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 20/27] signal: Add task argument to flush_sigqueue_mask() Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 22/27] posix-timers: Handle ignored list on delete and exit Thomas Gleixner
` (6 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To prepare for handling posix timer signals on sigaction(SIG_IGN) properly,
add a list to task::signal.
This list will be used to queue posix timers so their signal can be
requeued when SIG_IGN is lifted later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/sched/signal.h | 1 +
init/init_task.c | 5 +++--
kernel/fork.c | 1 +
3 files changed, 5 insertions(+), 2 deletions(-)
---
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 02972fd41931..d5d03d919df8 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -138,6 +138,7 @@ struct signal_struct {
/* POSIX.1b Interval Timers */
unsigned int next_posix_timer_id;
struct hlist_head posix_timers;
+ struct hlist_head ignored_posix_timers;
/* ITIMER_REAL timer for the process */
struct hrtimer real_timer;
diff --git a/init/init_task.c b/init/init_task.c
index 5d0399bc8d2f..65af2ba93f25 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -29,8 +29,9 @@ static struct signal_struct init_signals = {
.cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex),
.exec_update_lock = __RWSEM_INITIALIZER(init_signals.exec_update_lock),
#ifdef CONFIG_POSIX_TIMERS
- .posix_timers = HLIST_HEAD_INIT,
- .cputimer = {
+ .posix_timers = HLIST_HEAD_INIT,
+ .ignored_posix_timers = HLIST_HEAD_INIT,
+ .cputimer = {
.cputime_atomic = INIT_CPUTIME_ATOMIC,
},
#endif
diff --git a/kernel/fork.c b/kernel/fork.c
index c1b343cba560..bef4b51bd474 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1862,6 +1862,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
#ifdef CONFIG_POSIX_TIMERS
INIT_HLIST_HEAD(&sig->posix_timers);
+ INIT_HLIST_HEAD(&sig->ignored_posix_timers);
hrtimer_init(&sig->real_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
sig->real_timer.function = it_real_fn;
#endif
* [patch v4 22/27] posix-timers: Handle ignored list on delete and exit
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (20 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 21/27] signal: Provide ignored_posix_timers list Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 23/27] signal: Handle ignored signals in do_sigaction(action != SIG_IGN) Thomas Gleixner
` (5 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
To handle posix timer signals on sigaction(SIG_IGN) properly, the timers
will be queued on a separate ignored list.
Add the necessary cleanup code for timer_delete() and exit_itimers().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 4 +++-
kernel/time/posix-timers.c | 20 ++++++++++++++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
---
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index ddd7ccd9ba77..efab1ef7a7fe 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -151,7 +151,8 @@ static inline void posix_cputimers_init_work(void) { }
/**
* struct k_itimer - POSIX.1b interval timer structure.
- * @list: List head for binding the timer to signals->posix_timers
+ * @list: List node for binding the timer to tsk::signal::posix_timers
+ * @ignored_list: List node for tracking ignored timers in tsk::signal::ignored_posix_timers
* @t_hash: Entry in the posix timer hash table
* @it_lock: Lock protecting the timer
* @kclock: Pointer to the k_clock struct handling this timer
@@ -174,6 +175,7 @@ static inline void posix_cputimers_init_work(void) { }
*/
struct k_itimer {
struct hlist_node list;
+ struct hlist_node ignored_list;
struct hlist_node t_hash;
spinlock_t it_lock;
const struct k_clock *kclock;
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index b62e3ccb45ff..5a5967a01f53 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1036,6 +1036,18 @@ int common_timer_del(struct k_itimer *timer)
return 0;
}
+/*
+ * If the deleted timer is on the ignored list, remove it and
+ * drop the associated reference.
+ */
+static inline void posix_timer_cleanup_ignored(struct k_itimer *tmr)
+{
+ if (!hlist_unhashed(&tmr->ignored_list)) {
+ hlist_del_init(&tmr->ignored_list);
+ posixtimer_putref(tmr);
+ }
+}
+
static inline int timer_delete_hook(struct k_itimer *timer)
{
const struct k_clock *kc = timer->kclock;
@@ -1068,6 +1080,7 @@ SYSCALL_DEFINE1(timer_delete, timer_t, timer_id)
spin_lock(&current->sighand->siglock);
hlist_del(&timer->list);
+ posix_timer_cleanup_ignored(timer);
spin_unlock(&current->sighand->siglock);
/*
* A concurrent lookup could check timer::it_signal lockless. It
@@ -1119,6 +1132,8 @@ static void itimer_delete(struct k_itimer *timer)
}
hlist_del(&timer->list);
+ posix_timer_cleanup_ignored(timer);
+
/*
* Setting timer::it_signal to NULL is technically not required
* here as nothing can access the timer anymore legitimately via
@@ -1151,6 +1166,11 @@ void exit_itimers(struct task_struct *tsk)
/* The timers are not longer accessible via tsk::signal */
while (!hlist_empty(&timers))
itimer_delete(hlist_entry(timers.first, struct k_itimer, list));
+
+ /* Mop up timers which are on the ignored list */
+ hlist_move_list(&tsk->signal->ignored_posix_timers, &timers);
+ while (!hlist_empty(&timers))
+ posix_timer_cleanup_ignored(hlist_entry(timers.first, struct k_itimer, ignored_list));
}
SYSCALL_DEFINE2(clock_settime, const clockid_t, which_clock,
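As an illustration of the cleanup helper added in this patch, here is a
minimal userspace sketch. It is not kernel code: the hnode/hhead types and
all toy_* names are made up for this example and only mimic the hlist
semantics the patch relies on, namely that a node with a NULL pprev counts
as unhashed and that removal reinitializes the node, so a second cleanup is
a no-op.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's hlist_node / hlist_head */
struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

static int hnode_unhashed(const struct hnode *n)
{
	return n->pprev == NULL;
}

static void hnode_add_head(struct hnode *n, struct hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink the node and reset it so it reads as unhashed afterwards */
static void hnode_del_init(struct hnode *n)
{
	if (hnode_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* Toy timer: only the ignored-list linkage and a plain reference count */
struct toy_timer {
	struct hnode ignored_node;
	int refcount;
};

/*
 * If the timer sits on the ignored list, remove it and drop the
 * reference which was taken when it was put there.
 */
static void toy_cleanup_ignored(struct toy_timer *t)
{
	if (!hnode_unhashed(&t->ignored_node)) {
		hnode_del_init(&t->ignored_node);
		assert(t->refcount > 0);
		t->refcount--;
	}
}
```

In this model, queueing a timer on the ignored list takes one reference,
and toy_cleanup_ignored() drops exactly that reference. Calling it on a
timer which was never queued, or calling it twice, changes nothing.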
* [patch v4 23/27] signal: Handle ignored signals in do_sigaction(action != SIG_IGN)
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (21 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 22/27] posix-timers: Handle ignored list on delete and exit Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 24/27] signal: Queue ignored posixtimers on ignore list Thomas Gleixner
` (4 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
When a real handler (including SIG_DFL) is installed for a signal which
previously had SIG_IGN set, the list of ignored posix timers has to be
checked for timers which are affected by this change.
Add a list walk function which checks for the matching signal number and,
if found, requeues the timer's signal, so the timer is rearmed on signal
delivery.
Rearming the timer right away is not possible because that would require
dropping sighand lock.
No functional change as the counterpart which queues the timers on the
ignored list is still missing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 53 insertions(+), 1 deletion(-)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index 93c2d681309c..855f19f74287 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2031,7 +2031,55 @@ int posixtimer_send_sigqueue(struct k_itimer *tmr)
rcu_read_unlock();
return ret;
}
-#endif /* CONFIG_POSIX_TIMERS */
+
+static void posixtimer_sig_unignore(struct task_struct *tsk, int sig)
+{
+ struct hlist_head *head = &tsk->signal->ignored_posix_timers;
+ struct hlist_node *tmp;
+ struct k_itimer *tmr;
+
+ if (likely(hlist_empty(head)))
+ return;
+
+ /*
+ * Rearming a timer with sighand lock held is not possible due to
+ * lock ordering vs. tmr::it_lock. Just stick the sigqueue back and
+ * let the signal delivery path deal with it whether it needs to be
+ * rearmed or not. This cannot be decided here w/o dropping sighand
+ * lock and creating a loop retry horror show.
+ */
+ hlist_for_each_entry_safe(tmr, tmp , head, ignored_list) {
+ struct task_struct *target;
+
+ /*
+ * tmr::sigq.info.si_signo is immutable, so accessing it
+ * without holding tmr::it_lock is safe.
+ */
+ if (tmr->sigq.info.si_signo != sig)
+ continue;
+
+ hlist_del_init(&tmr->ignored_list);
+
+ /* This should never happen and leaks a reference count */
+ if (WARN_ON_ONCE(!list_empty(&tmr->sigq.list)))
+ continue;
+
+ /*
+ * Get the target for the signal. If target is a thread and
+ * has exited by now, drop the reference count.
+ */
+ rcu_read_lock();
+ target = posixtimer_get_target(tmr);
+ if (target)
+ posixtimer_queue_sigqueue(&tmr->sigq, target, tmr->it_pid_type);
+ else
+ posixtimer_putref(tmr);
+ rcu_read_unlock();
+ }
+}
+#else /* CONFIG_POSIX_TIMERS */
+static inline void posixtimer_sig_unignore(struct task_struct *tsk, int sig) { }
+#endif /* !CONFIG_POSIX_TIMERS */
void do_notify_pidfd(struct task_struct *task)
{
@@ -4219,6 +4267,8 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
sigaction_compat_abi(act, oact);
if (act) {
+ bool was_ignored = k->sa.sa_handler == SIG_IGN;
+
sigdelsetmask(&act->sa.sa_mask,
sigmask(SIGKILL) | sigmask(SIGSTOP));
*k = *act;
@@ -4239,6 +4289,8 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
flush_sigqueue_mask(&mask, &p->signal->shared_pending, NULL);
for_each_thread(p, t)
flush_sigqueue_mask(&mask, &t->pending, NULL);
+ } else if (was_ignored) {
+ posixtimer_sig_unignore(p, sig);
}
}
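The list walk described in this patch can be sketched in userspace as
follows. This is a simplified model under stated assumptions, not the
kernel implementation: toy_timer, its fields and toy_sig_unignore() are
hypothetical names, the ignored list is a plain singly linked list, and
"requeueing" is reduced to setting a flag. The point is the
deletion-safe walk: matching entries are unlinked while iterating,
non-matching entries stay where they are.

```c
#include <assert.h>
#include <stddef.h>

/* Toy timer parked on an ignored list (hypothetical model) */
struct toy_timer {
	int signo;		/* signal number, fixed at creation */
	int queued;		/* 1 once the signal is back on the pending queue */
	struct toy_timer *next;	/* link on the ignored list */
};

/*
 * Walk the ignored list and requeue every timer whose signal number
 * matches @sig. Returns the number of timers requeued.
 */
static int toy_sig_unignore(struct toy_timer **ignored, int sig)
{
	struct toy_timer **pp = ignored;
	int requeued = 0;

	while (*pp) {
		struct toy_timer *t = *pp;

		if (t->signo != sig) {
			pp = &t->next;
			continue;
		}

		/* Unlink first, so the walk continues with the successor */
		*pp = t->next;
		t->next = NULL;

		/* A parked timer must not have its signal queued already */
		assert(!t->queued);
		t->queued = 1;
		requeued++;
	}
	return requeued;
}
```

Note that the sketch only puts the signal back on the queue. Like the
patch, it leaves the decision whether the timer needs rearming to the
signal delivery path.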
* [patch v4 24/27] signal: Queue ignored posixtimers on ignore list
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (22 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 23/27] signal: Handle ignored signals in do_sigaction(action != SIG_IGN) Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 25/27] posix-timers: Cleanup SIG_IGN workaround leftovers Thomas Gleixner
` (3 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Queue posixtimers which have their signal ignored on the ignored list:
1) When the timer fires and the signal has SIG_IGN set
2) When SIG_IGN is installed via sigaction() and a timer signal
is already queued
This completes the SIG_IGN handling. Such timers are no longer self
rearmed, which avoids pointless wakeups.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/signal.c | 44 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 36 insertions(+), 8 deletions(-)
---
diff --git a/kernel/signal.c b/kernel/signal.c
index 855f19f74287..cb29f817b71a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -722,6 +722,16 @@ void signal_wake_up_state(struct task_struct *t, unsigned int state)
kick_process(t);
}
+static inline void posixtimer_sig_ignore(struct task_struct *tsk, struct sigqueue *q);
+
+static void sigqueue_free_ignored(struct task_struct *ptmr_tsk, struct sigqueue *q)
+{
+ if (likely(!ptmr_tsk || q->info.si_code != SI_TIMER))
+ __sigqueue_free(q);
+ else
+ posixtimer_sig_ignore(ptmr_tsk, q);
+}
+
/*
* Remove signals in mask from the pending set and queue.
*
@@ -740,7 +750,7 @@ static void flush_sigqueue_mask(sigset_t *mask, struct sigpending *s, struct tas
list_for_each_entry_safe(q, n, &s->list, list) {
if (sigismember(mask, q->info.si_signo)) {
list_del_init(&q->list);
- __sigqueue_free(q);
+ sigqueue_free_ignored(ptmr_tsk, q);
}
}
}
@@ -1960,9 +1970,8 @@ int posixtimer_send_sigqueue(struct k_itimer *tmr)
int sig = q->info.si_signo;
struct task_struct *t;
unsigned long flags;
- int ret, result;
+ int result;
- ret = -1;
rcu_read_lock();
t = posixtimer_get_target(tmr);
@@ -2008,13 +2017,24 @@ int posixtimer_send_sigqueue(struct k_itimer *tmr)
*/
q->info.si_overrun = 0;
- ret = 1; /* the signal is ignored */
if (!prepare_signal(sig, t, false)) {
result = TRACE_SIGNAL_IGNORED;
+
+ /* Paranoia check. Try to survive. */
+ if (WARN_ON_ONCE(!list_empty(&q->list)))
+ goto out;
+
+ if (hlist_unhashed(&tmr->ignored_list)) {
+ hlist_add_head(&tmr->ignored_list, &t->signal->ignored_posix_timers);
+ posixtimer_sigqueue_getref(q);
+ }
goto out;
}
- ret = 0;
+ /* This should never happen and leaks a reference count */
+ if (WARN_ON_ONCE(!hlist_unhashed(&tmr->ignored_list)))
+ hlist_del_init(&tmr->ignored_list);
+
if (unlikely(!list_empty(&q->list))) {
/* This holds a reference count already */
result = TRACE_SIGNAL_ALREADY_PENDING;
@@ -2029,7 +2049,14 @@ int posixtimer_send_sigqueue(struct k_itimer *tmr)
unlock_task_sighand(t, &flags);
ret:
rcu_read_unlock();
- return ret;
+ return 0;
+}
+
+static inline void posixtimer_sig_ignore(struct task_struct *tsk, struct sigqueue *q)
+{
+ struct k_itimer *tmr = container_of(q, struct k_itimer, sigq);
+
+ hlist_add_head(&tmr->ignored_list, &tsk->signal->ignored_posix_timers);
}
static void posixtimer_sig_unignore(struct task_struct *tsk, int sig)
@@ -2078,6 +2105,7 @@ static void posixtimer_sig_unignore(struct task_struct *tsk, int sig)
}
}
#else /* CONFIG_POSIX_TIMERS */
+static inline void posixtimer_sig_ignore(struct task_struct *tsk, struct sigqueue *q) { }
static inline void posixtimer_sig_unignore(struct task_struct *tsk, int sig) { }
#endif /* !CONFIG_POSIX_TIMERS */
@@ -4286,9 +4314,9 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
if (sig_handler_ignored(sig_handler(p, sig), sig)) {
sigemptyset(&mask);
sigaddset(&mask, sig);
- flush_sigqueue_mask(&mask, &p->signal->shared_pending, NULL);
+ flush_sigqueue_mask(&mask, &p->signal->shared_pending, p);
for_each_thread(p, t)
- flush_sigqueue_mask(&mask, &t->pending, NULL);
+ flush_sigqueue_mask(&mask, &t->pending, p);
} else if (was_ignored) {
posixtimer_sig_unignore(p, sig);
}
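The two cases in which a timer ends up on the ignored list can be modelled
with a small userspace sketch. Everything here is an assumption made for
illustration: toy_proc, toy_timer and the toy_* functions do not exist in
the kernel, lists are reduced to boolean flags, and the reference counting
only mirrors what the diff suggests, namely that the expiry path takes a
new reference when parking a timer, while the sigaction() path moves the
reference held by the already queued signal.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NSIG 64

/* Per-process signal dispositions (hypothetical model) */
struct toy_proc {
	bool ignored[TOY_NSIG + 1];	/* true if SIG_IGN is installed */
};

struct toy_timer {
	int signo;
	int refs;		/* references held by the owner, queue and list */
	bool on_ignored;	/* parked on the ignored list */
	bool pending;		/* signal sits on the pending queue */
};

/* Case 1: the timer fires. Park it if ignored, queue the signal otherwise. */
static void toy_timer_fire(const struct toy_proc *p, struct toy_timer *t)
{
	assert(t->signo > 0 && t->signo <= TOY_NSIG);

	if (p->ignored[t->signo]) {
		if (!t->on_ignored) {
			t->on_ignored = true;
			t->refs++;	/* the ignored list holds a reference */
		}
		return;
	}

	if (!t->pending) {
		t->pending = true;
		t->refs++;		/* the pending queue holds a reference */
	}
}

/* Case 2: SIG_IGN is installed while the timer signal is already queued. */
static void toy_install_ignore(struct toy_proc *p, struct toy_timer *t)
{
	p->ignored[t->signo] = true;
	if (t->pending) {
		t->pending = false;
		t->on_ignored = true;	/* the queue's reference moves along */
	}
}
```

In both cases the timer ends up parked with exactly one extra reference and
is not rearmed, which is what removes the periodic wakeups for ignored
signals.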
* [patch v4 25/27] posix-timers: Cleanup SIG_IGN workaround leftovers
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (23 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 24/27] signal: Queue ignored posixtimers on ignore list Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 26/27] alarmtimers: Remove the throttle mechanism from alarm_forward_now() Thomas Gleixner
` (2 subsequent siblings)
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Now that ignored posix timer signals are requeued and the timers are
rearmed on signal delivery, the workaround to keep such timers alive and
self rearm them is no longer required.
Remove the relevant hacks and the no longer required return values from
the related functions. The alarm timer workarounds will be cleaned up in a
separate step.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/posix-timers.h | 2 -
kernel/signal.c | 3 -
kernel/time/alarmtimer.c | 47 +++++------------------------
kernel/time/posix-cpu-timers.c | 18 ++---------
kernel/time/posix-timers.c | 65 +++--------------------------------------
kernel/time/posix-timers.h | 2 -
6 files changed, 21 insertions(+), 116 deletions(-)
---
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -111,7 +111,7 @@ static inline void posix_cputimers_rt_wa
void posixtimer_rearm_itimer(struct task_struct *p);
bool posixtimer_init_sigqueue(struct sigqueue *q);
-int posixtimer_send_sigqueue(struct k_itimer *tmr);
+void posixtimer_send_sigqueue(struct k_itimer *tmr);
bool posixtimer_deliver_signal(struct kernel_siginfo *info);
void posixtimer_free_timer(struct k_itimer *timer);
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1964,7 +1964,7 @@ static inline struct task_struct *posixt
return t;
}
-int posixtimer_send_sigqueue(struct k_itimer *tmr)
+void posixtimer_send_sigqueue(struct k_itimer *tmr)
{
struct sigqueue *q = &tmr->sigq;
int sig = q->info.si_signo;
@@ -2049,7 +2049,6 @@ int posixtimer_send_sigqueue(struct k_it
unlock_task_sighand(t, &flags);
ret:
rcu_read_unlock();
- return 0;
}
static inline void posixtimer_sig_ignore(struct task_struct *tsk, struct sigqueue *q)
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -197,28 +197,15 @@ static enum hrtimer_restart alarmtimer_f
{
struct alarm *alarm = container_of(timer, struct alarm, timer);
struct alarm_base *base = &alarm_bases[alarm->type];
- unsigned long flags;
- int ret = HRTIMER_NORESTART;
- int restart = ALARMTIMER_NORESTART;
- spin_lock_irqsave(&base->lock, flags);
- alarmtimer_dequeue(base, alarm);
- spin_unlock_irqrestore(&base->lock, flags);
+ scoped_guard (spinlock_irqsave, &base->lock)
+ alarmtimer_dequeue(base, alarm);
if (alarm->function)
- restart = alarm->function(alarm, base->get_ktime());
-
- spin_lock_irqsave(&base->lock, flags);
- if (restart != ALARMTIMER_NORESTART) {
- hrtimer_set_expires(&alarm->timer, alarm->node.expires);
- alarmtimer_enqueue(base, alarm);
- ret = HRTIMER_RESTART;
- }
- spin_unlock_irqrestore(&base->lock, flags);
+ alarm->function(alarm, base->get_ktime());
trace_alarmtimer_fired(alarm, base->get_ktime());
- return ret;
-
+ return HRTIMER_NORESTART;
}
ktime_t alarm_expires_remaining(const struct alarm *alarm)
@@ -567,30 +554,14 @@ static enum alarmtimer_type clock2alarm(
*
* Return: whether the timer is to be restarted
*/
-static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
- ktime_t now)
+static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm, ktime_t now)
{
- struct k_itimer *ptr = container_of(alarm, struct k_itimer,
- it.alarm.alarmtimer);
- enum alarmtimer_restart result = ALARMTIMER_NORESTART;
- unsigned long flags;
-
- spin_lock_irqsave(&ptr->it_lock, flags);
+ struct k_itimer *ptr = container_of(alarm, struct k_itimer, it.alarm.alarmtimer);
- if (posix_timer_queue_signal(ptr) && ptr->it_interval) {
- /*
- * Handle ignored signals and rearm the timer. This will go
- * away once we handle ignored signals proper. Ensure that
- * small intervals cannot starve the system.
- */
- ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true);
- ++ptr->it_signal_seq;
- ptr->it_status = POSIX_TIMER_ARMED;
- result = ALARMTIMER_RESTART;
- }
- spin_unlock_irqrestore(&ptr->it_lock, flags);
+ guard(spinlock_irqsave)(&ptr->it_lock);
+ posix_timer_queue_signal(ptr);
- return result;
+ return ALARMTIMER_NORESTART;
}
/**
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -595,21 +595,11 @@ static void cpu_timer_fire(struct k_itim
*/
wake_up_process(timer->it_process);
cpu_timer_setexpires(ctmr, 0);
- } else if (!timer->it_interval) {
- /*
- * One-shot timer. Clear it as soon as it's fired.
- */
+ } else {
posix_timer_queue_signal(timer);
- cpu_timer_setexpires(ctmr, 0);
- } else if (posix_timer_queue_signal(timer)) {
- /*
- * The signal did not get queued because the signal
- * was ignored, so we won't get any callback to
- * reload the timer. But we need to keep it
- * ticking in case the signal is deliverable next time.
- */
- posix_cpu_timer_rearm(timer);
- ++timer->it_signal_seq;
+ /* Disable oneshot timers */
+ if (!timer->it_interval)
+ cpu_timer_setexpires(ctmr, 0);
}
}
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -300,10 +300,9 @@ bool posixtimer_deliver_signal(struct ke
return ret;
}
-int posix_timer_queue_signal(struct k_itimer *timr)
+void posix_timer_queue_signal(struct k_itimer *timr)
{
enum posix_timer_state state = POSIX_TIMER_DISARMED;
- int ret;
lockdep_assert_held(&timr->it_lock);
@@ -313,9 +312,7 @@ int posix_timer_queue_signal(struct k_it
}
timr->it_status = state;
- ret = posixtimer_send_sigqueue(timr);
- /* If we failed to send the signal the timer stops. */
- return ret > 0;
+ posixtimer_send_sigqueue(timr);
}
/*
@@ -328,62 +325,10 @@ int posix_timer_queue_signal(struct k_it
static enum hrtimer_restart posix_timer_fn(struct hrtimer *timer)
{
struct k_itimer *timr = container_of(timer, struct k_itimer, it.real.timer);
- enum hrtimer_restart ret = HRTIMER_NORESTART;
- unsigned long flags;
- spin_lock_irqsave(&timr->it_lock, flags);
-
- if (posix_timer_queue_signal(timr)) {
- /*
- * The signal was not queued due to SIG_IGN. As a
- * consequence the timer is not going to be rearmed from
- * the signal delivery path. But as a real signal handler
- * can be installed later the timer must be rearmed here.
- */
- if (timr->it_interval != 0) {
- ktime_t now = hrtimer_cb_get_time(timer);
-
- /*
- * FIXME: What we really want, is to stop this
- * timer completely and restart it in case the
- * SIG_IGN is removed. This is a non trivial
- * change to the signal handling code.
- *
- * For now let timers with an interval less than a
- * jiffy expire every jiffy and recheck for a
- * valid signal handler.
- *
- * This avoids interrupt starvation in case of a
- * very small interval, which would expire the
- * timer immediately again.
- *
- * Moving now ahead of time by one jiffy tricks
- * hrtimer_forward() to expire the timer later,
- * while it still maintains the overrun accuracy
- * for the price of a slight inconsistency in the
- * timer_gettime() case. This is at least better
- * than a timer storm.
- *
- * Only required when high resolution timers are
- * enabled as the periodic tick based timers are
- * automatically aligned to the next tick.
- */
- if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS)) {
- ktime_t kj = TICK_NSEC;
-
- if (timr->it_interval < kj)
- now = ktime_add(now, kj);
- }
-
- timr->it_overrun += hrtimer_forward(timer, now, timr->it_interval);
- ret = HRTIMER_RESTART;
- ++timr->it_signal_seq;
- timr->it_status = POSIX_TIMER_ARMED;
- }
- }
-
- unlock_timer(timr, flags);
- return ret;
+ guard(spinlock_irqsave)(&timr->it_lock);
+ posix_timer_queue_signal(timr);
+ return HRTIMER_NORESTART;
}
static struct pid *good_sigevent(sigevent_t * event)
--- a/kernel/time/posix-timers.h
+++ b/kernel/time/posix-timers.h
@@ -42,7 +42,7 @@ extern const struct k_clock clock_proces
extern const struct k_clock clock_thread;
extern const struct k_clock alarm_clock;
-int posix_timer_queue_signal(struct k_itimer *timr);
+void posix_timer_queue_signal(struct k_itimer *timr);
void common_timer_get(struct k_itimer *timr, struct itimerspec64 *cur_setting);
int common_timer_set(struct k_itimer *timr, int flags,
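The diff above replaces open-coded lock/unlock pairs with guard() and
scoped_guard(). For readers unfamiliar with that pattern, here is a minimal
userspace sketch of scope-based unlocking built on the GCC/Clang cleanup
variable attribute. The toy_lock type, the TOY_GUARD macro and the function
names are hypothetical and far simpler than the kernel's helpers. The
sketch only shows why a function can return early without an explicit
unlock.

```c
#include <assert.h>

/* Toy lock which just counts how often it is held */
struct toy_lock {
	int depth;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->depth > 0);
	l->depth--;
}

/* Cleanup handlers receive a pointer to the annotated variable */
static void toy_guard_exit(struct toy_lock **lp)
{
	toy_lock_release(*lp);
}

/* Acquire now, release automatically when the enclosing scope ends */
#define TOY_GUARD(lockp)						\
	struct toy_lock *toy_guard_var					\
		__attribute__((cleanup(toy_guard_exit), unused)) =	\
		(toy_lock_acquire(lockp), (lockp))

/* Shaped like a timer callback: lock, do the work, return */
static int toy_timer_fn(struct toy_lock *l, int *queued)
{
	TOY_GUARD(l);

	assert(l->depth == 1);
	(*queued)++;
	return 0;	/* the lock is dropped here, on scope exit */
}
```

With this pattern a callback which no longer has to decide about restarting
shrinks to "take the lock, queue the signal, return", which matches the
shape of the simplified functions in this patch.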
* [patch v4 26/27] alarmtimers: Remove the throttle mechanism from alarm_forward_now()
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (24 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 25/27] posix-timers: Cleanup SIG_IGN workaround leftovers Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 8:49 ` [patch v4 27/27] alarmtimers: Remove return value from alarm functions Thomas Gleixner
2024-09-27 14:39 ` [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Eric W. Biederman
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Now that ignored posix timer signals are requeued and the timers are
rearmed on signal delivery, the workaround to keep such timers alive and
self rearm them is no longer required.
Remove the unused alarm timer parts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/time/alarmtimer.c | 28 ++--------------------------
1 file changed, 2 insertions(+), 26 deletions(-)
---
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -467,35 +467,11 @@ u64 alarm_forward(struct alarm *alarm, k
}
EXPORT_SYMBOL_GPL(alarm_forward);
-static u64 __alarm_forward_now(struct alarm *alarm, ktime_t interval, bool throttle)
+u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
{
struct alarm_base *base = &alarm_bases[alarm->type];
- ktime_t now = base->get_ktime();
-
- if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && throttle) {
- /*
- * Same issue as with posix_timer_fn(). Timers which are
- * periodic but the signal is ignored can starve the system
- * with a very small interval. The real fix which was
- * promised in the context of posix_timer_fn() never
- * materialized, but someone should really work on it.
- *
- * To prevent DOS fake @now to be 1 jiffy out which keeps
- * the overrun accounting correct but creates an
- * inconsistency vs. timer_gettime(2).
- */
- ktime_t kj = NSEC_PER_SEC / HZ;
- if (interval < kj)
- now = ktime_add(now, kj);
- }
-
- return alarm_forward(alarm, now, interval);
-}
-
-u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
-{
- return __alarm_forward_now(alarm, interval, false);
+ return alarm_forward(alarm, base->get_ktime(), interval);
}
EXPORT_SYMBOL_GPL(alarm_forward_now);
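To make the removed throttle easier to follow, here is a small userspace
sketch of interval forwarding with overrun accounting. It is loosely
modelled on the general idea behind alarm_forward() and is not the kernel
function: toy_forward() is a hypothetical name, time is a plain signed
64-bit counter, and overflow handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Move *expires forward in steps of @interval until it lies after @now.
 * Returns the number of intervals which elapsed (the overrun count),
 * or 0 if the timer has not expired yet.
 */
static uint64_t toy_forward(int64_t *expires, int64_t now, int64_t interval)
{
	int64_t delta = now - *expires;
	uint64_t overrun = 1;

	assert(interval > 0);

	if (delta < 0)
		return 0;	/* not expired yet, nothing to forward */

	if (delta >= interval) {
		/* Skip all whole intervals which were missed in one go */
		int64_t missed = delta / interval;

		*expires += interval * missed;
		overrun = (uint64_t)missed + 1;
	}

	/* Final step, which moves the expiry time past @now */
	*expires += interval;
	return overrun;
}
```

In this model the removed throttle corresponds to passing a faked "now"
which is one jiffy ahead for intervals shorter than a jiffy. The next
expiry then lands at least one jiffy out and the skipped intervals are
still counted as overruns, at the cost of the inconsistency versus
timer_gettime(2) mentioned in the removed comment.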
* [patch v4 27/27] alarmtimers: Remove return value from alarm functions
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (25 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 26/27] alarmtimers: Remove the throttle mechanism from alarm_forward_now() Thomas Gleixner
@ 2024-09-27 8:49 ` Thomas Gleixner
2024-09-27 14:39 ` [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Eric W. Biederman
27 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 8:49 UTC (permalink / raw)
To: LKML
Cc: Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Eric Biederman,
Oleg Nesterov
From: Thomas Gleixner <tglx@linutronix.de>
Now that the SIG_IGN problem is solved in the core code, the alarmtimer
callbacks do not require a return value anymore.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
drivers/power/supply/charger-manager.c | 3 +--
fs/timerfd.c | 4 +---
include/linux/alarmtimer.h | 10 ++--------
kernel/time/alarmtimer.c | 16 +++++-----------
net/netfilter/xt_IDLETIMER.c | 4 +---
5 files changed, 10 insertions(+), 27 deletions(-)
---
diff --git a/drivers/power/supply/charger-manager.c b/drivers/power/supply/charger-manager.c
index 96f0a7fbf105..09ec0ecf1486 100644
--- a/drivers/power/supply/charger-manager.c
+++ b/drivers/power/supply/charger-manager.c
@@ -1412,10 +1412,9 @@ static inline struct charger_desc *cm_get_drv_data(struct platform_device *pdev)
return dev_get_platdata(&pdev->dev);
}
-static enum alarmtimer_restart cm_timer_func(struct alarm *alarm, ktime_t now)
+static void cm_timer_func(struct alarm *alarm, ktime_t now)
{
cm_timer_set = false;
- return ALARMTIMER_NORESTART;
}
static int charger_manager_probe(struct platform_device *pdev)
diff --git a/fs/timerfd.c b/fs/timerfd.c
index 4bf2f8bfec11..67400c9bde07 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -79,13 +79,11 @@ static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr)
return HRTIMER_NORESTART;
}
-static enum alarmtimer_restart timerfd_alarmproc(struct alarm *alarm,
- ktime_t now)
+static void timerfd_alarmproc(struct alarm *alarm, ktime_t now)
{
struct timerfd_ctx *ctx = container_of(alarm, struct timerfd_ctx,
t.alarm);
timerfd_triggered(ctx);
- return ALARMTIMER_NORESTART;
}
/*
diff --git a/include/linux/alarmtimer.h b/include/linux/alarmtimer.h
index 05e758b8b894..3ffa5341dce2 100644
--- a/include/linux/alarmtimer.h
+++ b/include/linux/alarmtimer.h
@@ -20,12 +20,6 @@ enum alarmtimer_type {
ALARM_BOOTTIME_FREEZER,
};
-enum alarmtimer_restart {
- ALARMTIMER_NORESTART,
- ALARMTIMER_RESTART,
-};
-
-
#define ALARMTIMER_STATE_INACTIVE 0x00
#define ALARMTIMER_STATE_ENQUEUED 0x01
@@ -42,14 +36,14 @@ enum alarmtimer_restart {
struct alarm {
struct timerqueue_node node;
struct hrtimer timer;
- enum alarmtimer_restart (*function)(struct alarm *, ktime_t now);
+ void (*function)(struct alarm *, ktime_t now);
enum alarmtimer_type type;
int state;
void *data;
};
void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
- enum alarmtimer_restart (*function)(struct alarm *, ktime_t));
+ void (*function)(struct alarm *, ktime_t));
void alarm_start(struct alarm *alarm, ktime_t start);
void alarm_start_relative(struct alarm *alarm, ktime_t start);
void alarm_restart(struct alarm *alarm);
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 593e7d561fa8..37d2d79daea4 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -321,7 +321,7 @@ static int alarmtimer_resume(struct device *dev)
static void
__alarm_init(struct alarm *alarm, enum alarmtimer_type type,
- enum alarmtimer_restart (*function)(struct alarm *, ktime_t))
+ void (*function)(struct alarm *, ktime_t))
{
timerqueue_init(&alarm->node);
alarm->timer.function = alarmtimer_fired;
@@ -337,7 +337,7 @@ __alarm_init(struct alarm *alarm, enum alarmtimer_type type,
* @function: callback that is run when the alarm fires
*/
void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
- enum alarmtimer_restart (*function)(struct alarm *, ktime_t))
+ void (*function)(struct alarm *, ktime_t))
{
hrtimer_init(&alarm->timer, alarm_bases[type].base_clockid,
HRTIMER_MODE_ABS);
@@ -530,14 +530,12 @@ static enum alarmtimer_type clock2alarm(clockid_t clockid)
*
* Return: whether the timer is to be restarted
*/
-static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm, ktime_t now)
+static void alarm_handle_timer(struct alarm *alarm, ktime_t now)
{
struct k_itimer *ptr = container_of(alarm, struct k_itimer, it.alarm.alarmtimer);
guard(spinlock_irqsave)(&ptr->it_lock);
posix_timer_queue_signal(ptr);
-
- return ALARMTIMER_NORESTART;
}
/**
@@ -698,18 +696,14 @@ static int alarm_timer_create(struct k_itimer *new_timer)
* @now: time at the timer expiration
*
* Wakes up the task that set the alarmtimer
- *
- * Return: ALARMTIMER_NORESTART
*/
-static enum alarmtimer_restart alarmtimer_nsleep_wakeup(struct alarm *alarm,
- ktime_t now)
+static void alarmtimer_nsleep_wakeup(struct alarm *alarm, ktime_t now)
{
struct task_struct *task = alarm->data;
alarm->data = NULL;
if (task)
wake_up_process(task);
- return ALARMTIMER_NORESTART;
}
/**
@@ -761,7 +755,7 @@ static int alarmtimer_do_nsleep(struct alarm *alarm, ktime_t absexp,
static void
alarm_init_on_stack(struct alarm *alarm, enum alarmtimer_type type,
- enum alarmtimer_restart (*function)(struct alarm *, ktime_t))
+ void (*function)(struct alarm *, ktime_t))
{
hrtimer_init_on_stack(&alarm->timer, alarm_bases[type].base_clockid,
HRTIMER_MODE_ABS);
diff --git a/net/netfilter/xt_IDLETIMER.c b/net/netfilter/xt_IDLETIMER.c
index db720efa811d..5514600586a9 100644
--- a/net/netfilter/xt_IDLETIMER.c
+++ b/net/netfilter/xt_IDLETIMER.c
@@ -107,14 +107,12 @@ static void idletimer_tg_expired(struct timer_list *t)
schedule_work(&timer->work);
}
-static enum alarmtimer_restart idletimer_tg_alarmproc(struct alarm *alarm,
- ktime_t now)
+static void idletimer_tg_alarmproc(struct alarm *alarm, ktime_t now)
{
struct idletimer_tg *timer = alarm->data;
pr_debug("alarm %s expired\n", timer->attr.attr.name);
schedule_work(&timer->work);
- return ALARMTIMER_NORESTART;
}
static int idletimer_check_sysfs_name(const char *name, unsigned int size)
* Re: [patch v4 01/27] signal: Confine POSIX_TIMERS properly
2024-09-27 8:48 ` [patch v4 01/27] signal: Confine POSIX_TIMERS properly Thomas Gleixner
@ 2024-09-27 12:21 ` Frederic Weisbecker
0 siblings, 0 replies; 36+ messages in thread
From: Frederic Weisbecker @ 2024-09-27 12:21 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, John Stultz, Peter Zijlstra,
Ingo Molnar, Stephen Boyd, Eric Biederman, Oleg Nesterov
Le Fri, Sep 27, 2024 at 10:48:40AM +0200, Thomas Gleixner a écrit :
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Move the itimer rearming out of the signal code and consolidate all posix
> timer related functions in the signal code under one ifdef.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [patch v4 02/27] signal: Prevent user space from setting si_sys_private
2024-09-27 8:48 ` [patch v4 02/27] signal: Prevent user space from setting si_sys_private Thomas Gleixner
@ 2024-09-27 12:37 ` Frederic Weisbecker
2024-09-27 13:40 ` Eric W. Biederman
1 sibling, 0 replies; 36+ messages in thread
From: Frederic Weisbecker @ 2024-09-27 12:37 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, John Stultz, Peter Zijlstra,
Ingo Molnar, Stephen Boyd, Eric Biederman, Oleg Nesterov
Le Fri, Sep 27, 2024 at 10:48:41AM +0200, Thomas Gleixner a écrit :
> From: Thomas Gleixner <tglx@linutronix.de>
>
> The si_sys_private member of siginfo is used to handle posix-timer rearming
> from the signal delivery path. Prevent user space from setting it as that
> creates inconsistent state.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Funny that this field is exposed to userspace.
Anyway:
Acked-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [patch v4 03/27] signal: Get rid of resched_timer logic
2024-09-27 8:48 ` [patch v4 03/27] signal: Get rid of resched_timer logic Thomas Gleixner
@ 2024-09-27 13:08 ` Frederic Weisbecker
2024-09-27 13:53 ` Eric W. Biederman
1 sibling, 0 replies; 36+ messages in thread
From: Frederic Weisbecker @ 2024-09-27 13:08 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, John Stultz, Peter Zijlstra,
Ingo Molnar, Stephen Boyd, Eric Biederman, Oleg Nesterov
Le Fri, Sep 27, 2024 at 10:48:42AM +0200, Thomas Gleixner a écrit :
> From: Thomas Gleixner <tglx@linutronix.de>
>
> There is no reason for handing the *resched pointer argument through
> several functions just to check whether the signal is related to a self
> rearming posix timer.
>
> SI_TIMER is only used by the posix timer code and cannot be queued from
> user space. The only extra check in collect_signal() to verify whether the
> queued signal is preallocated is not really useful. Some other places
> already check purely the SI_TIMER type.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [patch v4 02/27] signal: Prevent user space from setting si_sys_private
2024-09-27 8:48 ` [patch v4 02/27] signal: Prevent user space from setting si_sys_private Thomas Gleixner
2024-09-27 12:37 ` Frederic Weisbecker
@ 2024-09-27 13:40 ` Eric W. Biederman
1 sibling, 0 replies; 36+ messages in thread
From: Eric W. Biederman @ 2024-09-27 13:40 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Oleg Nesterov
Thomas Gleixner <tglx@linutronix.de> writes:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> The si_sys_private member of siginfo is used to handle posix-timer rearming
> from the signal delivery path. Prevent user space from setting it as that
> creates inconsistent state.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> ---
> kernel/signal.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
> ---
> diff --git a/kernel/signal.c b/kernel/signal.c
> index a83ea99f9389..7706cd304785 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -3354,6 +3354,14 @@ int copy_siginfo_to_user(siginfo_t __user *to, const kernel_siginfo_t *from)
> static int post_copy_siginfo_from_user(kernel_siginfo_t *info,
> const siginfo_t __user *from)
> {
> + /*
> + * Clear the si_sys_private field for timer signals as that's the
> + * indicator for rearming a posix timer. User space submitted
> + * signals are not allowed to inject that.
> + */
> + if (info->si_code == SI_TIMER)
> + info->si_sys_private = 0;
> +
> if (unlikely(!known_siginfo_layout(info->si_signo, info->si_code))) {
> char __user *expansion = si_expansion(from);
> char buf[SI_EXPANSION_SIZE];
Can we do this differently for maintainability? The siginfo union sucks
to deal with.
Can we place this test after the !known_siginfo_layout test?
Can you further make the case say something like:
if ((siginfo_layout(info->si_signo, info->si_code) == SIL_TIMER) &&
(info->si_sys_private != 0)) {
return -EINVAL?
}
Using siginfo_layout is slightly more expensive but it will catch any
future oddness that comes up, and I don't think signal injection is a path
where we need to optimize every last cycle.
Unless we expect userspace to be injecting signals with
info->si_sys_private set to non-zero (and we need to maintain backwards
compatibility) it is probably better to simply error.
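
A minimal userspace sketch of the check suggested above, under stated assumptions: the structs and the layout_of() helper are simplified, hypothetical stand-ins for kernel_siginfo_t and siginfo_layout(), and only the SI_TIMER value is taken from the Linux uapi headers. It is meant to show the shape of the "reject instead of silently clear" variant, not the code that would actually go into kernel/signal.c.

```c
#include <assert.h>
#include <errno.h>

#define SI_TIMER (-2)	/* same value as in the Linux uapi headers */

/* Hypothetical, reduced set of layouts; the kernel has more of them. */
enum siginfo_layout_model {
	SIL_KILL_M,
	SIL_TIMER_M,
	SIL_RT_M,
};

/* Simplified model of the fields that matter for this check. */
struct siginfo_model {
	int si_signo;
	int si_code;
	int si_sys_private;
};

/* Stand-in for siginfo_layout(); covers only the cases used below. */
static enum siginfo_layout_model layout_of(const struct siginfo_model *info)
{
	if (info->si_code == SI_TIMER)
		return SIL_TIMER_M;
	if (info->si_code < 0)
		return SIL_RT_M;
	return SIL_KILL_M;
}

/* Reject user supplied timer siginfo that carries a non-zero private value. */
static int check_injected(const struct siginfo_model *info)
{
	assert(info);
	if (layout_of(info) == SIL_TIMER_M && info->si_sys_private != 0)
		return -EINVAL;
	return 0;
}
```

Keying the test on the layout rather than on si_code alone means that any other code which maps to the timer layout in the future is covered without touching the check again.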
I unfortunately overlooked this corner case when I cleaned up signal
copying.
Eric
* Re: [patch v4 03/27] signal: Get rid of resched_timer logic
2024-09-27 8:48 ` [patch v4 03/27] signal: Get rid of resched_timer logic Thomas Gleixner
2024-09-27 13:08 ` Frederic Weisbecker
@ 2024-09-27 13:53 ` Eric W. Biederman
1 sibling, 0 replies; 36+ messages in thread
From: Eric W. Biederman @ 2024-09-27 13:53 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Oleg Nesterov
Thomas Gleixner <tglx@linutronix.de> writes:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> There is no reason for handing the *resched pointer argument through
> several functions just to check whether the signal is related to a self
> rearming posix timer.
>
> SI_TIMER is only used by the posix timer code and cannot be queued from
> user space.
Huh??? We have rt_sigqueueinfo. You just touched the code that
copies the queued signal from userspace.
> The only extra check in collect_signal() to verify whether the
> queued signal is preallocated is not really useful. Some other places
> already check purely the SI_TIMER type.
The check to see if the signal was preallocated prevents shenanigans
with setting si_sys_private.
That is today you can queue a signal with rt_sigqueueinfo and set
si_sys_private and it will make it to userspace. I don't know how
much we care but that is the case.
Which means that WARN_ON you added in __send_signal_locked can
most definitely be triggered by userspace.
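
The difference between the two checks can be modelled in userspace as follows. This is only a sketch: the structs are reduced, hypothetical models of struct sigqueue and kernel_siginfo_t, the two helper names are made up, and the checks are looked at in isolation, that is without the clearing of si_sys_private that patch 02 adds on the copy-from-user path. The SI_TIMER and SIGQUEUE_PREALLOC values match the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define SI_TIMER		(-2)	/* Linux uapi value */
#define SIGQUEUE_PREALLOC	1	/* kernel flag value */

/* Reduced model of the siginfo fields used by the rearm decision. */
struct siginfo_model {
	int si_code;
	int si_sys_private;
};

/* Reduced model of struct sigqueue. */
struct sigqueue_model {
	int flags;
	struct siginfo_model info;
};

/* Check removed from collect_signal(): needs a preallocated sigqueue. */
static bool resched_old(const struct sigqueue_model *q)
{
	assert(q);
	return (q->flags & SIGQUEUE_PREALLOC) &&
	       q->info.si_code == SI_TIMER &&
	       q->info.si_sys_private;
}

/* Check added to dequeue_signal(): looks at the siginfo only. */
static bool resched_new(const struct siginfo_model *info)
{
	assert(info);
	return info->si_code == SI_TIMER && info->si_sys_private;
}
```

For a sigqueue that was not preallocated, which is what a signal queued from user space gets, the old check is false while the new one is true whenever si_sys_private is non-zero.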
Eric
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> ---
> kernel/signal.c | 25 +++++++++----------------
> 1 file changed, 9 insertions(+), 16 deletions(-)
> ---
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 7706cd304785..3d2e087283ab 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -526,8 +526,7 @@ bool unhandled_signal(struct task_struct *tsk, int sig)
> return !tsk->ptrace;
> }
>
> -static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *info,
> - bool *resched_timer)
> +static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *info)
> {
> struct sigqueue *q, *first = NULL;
>
> @@ -549,12 +548,6 @@ static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *i
> still_pending:
> list_del_init(&first->list);
> copy_siginfo(info, &first->info);
> -
> - *resched_timer =
> - (first->flags & SIGQUEUE_PREALLOC) &&
> - (info->si_code == SI_TIMER) &&
> - (info->si_sys_private);
> -
> __sigqueue_free(first);
> } else {
> /*
> @@ -571,13 +564,12 @@ static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *i
> }
> }
>
> -static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
> - kernel_siginfo_t *info, bool *resched_timer)
> +static int __dequeue_signal(struct sigpending *pending, sigset_t *mask, kernel_siginfo_t *info)
> {
> int sig = next_signal(pending, mask);
>
> if (sig)
> - collect_signal(sig, pending, info, resched_timer);
> + collect_signal(sig, pending, info);
> return sig;
> }
>
> @@ -589,17 +581,15 @@ static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
> int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
> {
> struct task_struct *tsk = current;
> - bool resched_timer = false;
> int signr;
>
> lockdep_assert_held(&tsk->sighand->siglock);
>
> *type = PIDTYPE_PID;
> - signr = __dequeue_signal(&tsk->pending, mask, info, &resched_timer);
> + signr = __dequeue_signal(&tsk->pending, mask, info);
> if (!signr) {
> *type = PIDTYPE_TGID;
> - signr = __dequeue_signal(&tsk->signal->shared_pending,
> - mask, info, &resched_timer);
> + signr = __dequeue_signal(&tsk->signal->shared_pending, mask, info);
>
> if (unlikely(signr == SIGALRM))
> posixtimer_rearm_itimer(tsk);
> @@ -626,7 +616,7 @@ int dequeue_signal(sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type)
> }
>
> if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
> - if (unlikely(resched_timer))
> + if (unlikely(info->si_code == SI_TIMER && info->si_sys_private))
> posixtimer_rearm(info);
> }
>
> @@ -1011,6 +1001,9 @@ static int __send_signal_locked(int sig, struct kernel_siginfo *info,
>
> lockdep_assert_held(&t->sighand->siglock);
>
> + if (WARN_ON_ONCE(!is_si_special(info) && info->si_code == SI_TIMER))
> + return 0;
> +
> result = TRACE_SIGNAL_IGNORED;
> if (!prepare_signal(sig, t, force))
> goto ret;
* Re: [patch v4 04/27] posix-timers: Cure si_sys_private race
2024-09-27 8:48 ` [patch v4 04/27] posix-timers: Cure si_sys_private race Thomas Gleixner
@ 2024-09-27 14:02 ` Eric W. Biederman
0 siblings, 0 replies; 36+ messages in thread
From: Eric W. Biederman @ 2024-09-27 14:02 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Oleg Nesterov
Thomas Gleixner <tglx@linutronix.de> writes:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> The si_sys_private member of the siginfo which is embedded in the
> preallocated sigqueue is used by the posix timer code to decide whether a
> timer must be reprogrammed on signal delivery.
>
> The handling of this is racy as a long standing comment in that code
> documents. It is modified with the timer lock held, but without sighand
> lock being held. The actual signal delivery code checks for it under
> sighand lock without holding the timer lock.
I suspect this falls under the ancient all integers are atomic rule
in practice.
> Hand the new value to send_sigqueue() as argument and store it with sighand
> lock held.
Is there any way we can simply remove the hack of using si_sys_private,
and use a field in the structure that contains the preallocated signal?
I don't have any issues with updating send_sigqueue so that the locking
is consistent. However can we possibly name the argument something like
it_requeue_pending instead of si_private?
Eric
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> ---
> include/linux/sched/signal.h | 2 +-
> kernel/signal.c | 10 +++++++++-
> kernel/time/posix-timers.c | 15 +--------------
> 3 files changed, 11 insertions(+), 16 deletions(-)
> ---
> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
> index c8ed09ac29ac..bd9f569231d9 100644
> --- a/include/linux/sched/signal.h
> +++ b/include/linux/sched/signal.h
> @@ -340,7 +340,7 @@ extern int send_sig(int, struct task_struct *, int);
> extern int zap_other_threads(struct task_struct *p);
> extern struct sigqueue *sigqueue_alloc(void);
> extern void sigqueue_free(struct sigqueue *);
> -extern int send_sigqueue(struct sigqueue *, struct pid *, enum pid_type);
> +extern int send_sigqueue(struct sigqueue *, struct pid *, enum pid_type, int si_private);
> extern int do_sigaction(int, struct k_sigaction *, struct k_sigaction *);
>
> static inline void clear_notify_signal(void)
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 3d2e087283ab..443baadb5ab0 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1915,7 +1915,7 @@ void sigqueue_free(struct sigqueue *q)
> __sigqueue_free(q);
> }
>
> -int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
> +int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type, int si_private)
> {
> int sig = q->info.si_signo;
> struct sigpending *pending;
> @@ -1950,6 +1950,14 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
> if (!likely(lock_task_sighand(t, &flags)))
> goto ret;
>
> + /*
> + * Update @q::info::si_sys_private for posix timer signals with
> + * sighand locked to prevent a race against dequeue_signal() which
> + * decides based on si_sys_private whether to invoke
> + * posixtimer_rearm() or not.
> + */
> + q->info.si_sys_private = si_private;
> +
> ret = 1; /* the signal is ignored */
> result = TRACE_SIGNAL_IGNORED;
> if (!prepare_signal(sig, t, false))
> diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
> index bcd5e56412e7..b6cca1ed2f90 100644
> --- a/kernel/time/posix-timers.c
> +++ b/kernel/time/posix-timers.c
> @@ -299,21 +299,8 @@ int posix_timer_queue_signal(struct k_itimer *timr)
> if (timr->it_interval)
> si_private = ++timr->it_requeue_pending;
>
> - /*
> - * FIXME: if ->sigq is queued we can race with
> - * dequeue_signal()->posixtimer_rearm().
> - *
> - * If dequeue_signal() sees the "right" value of
> - * si_sys_private it calls posixtimer_rearm().
> - * We re-queue ->sigq and drop ->it_lock().
> - * posixtimer_rearm() locks the timer
> - * and re-schedules it while ->sigq is pending.
> - * Not really bad, but not that we want.
> - */
> - timr->sigq->info.si_sys_private = si_private;
> -
> type = !(timr->it_sigev_notify & SIGEV_THREAD_ID) ? PIDTYPE_TGID : PIDTYPE_PID;
> - ret = send_sigqueue(timr->sigq, timr->it_pid, type);
> + ret = send_sigqueue(timr->sigq, timr->it_pid, type, si_private);
> /* If we failed to send the signal the timer stops. */
> return ret > 0;
> }
* Re: [patch v4 00/27] posix-timers: Cure the SIG_IGN mess
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
` (26 preceding siblings ...)
2024-09-27 8:49 ` [patch v4 27/27] alarmtimers: Remove return value from alarm functions Thomas Gleixner
@ 2024-09-27 14:39 ` Eric W. Biederman
2024-09-27 19:24 ` Thomas Gleixner
27 siblings, 1 reply; 36+ messages in thread
From: Eric W. Biederman @ 2024-09-27 14:39 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Oleg Nesterov
Thomas Gleixner <tglx@linutronix.de> writes:
> These are the remaining bits to cure the SIG_IGN mess. The preparatory work
> from the previous version 3 has been merged already. Version 3 can be found
> here:
>
> https://lore.kernel.org/lkml/20240610163452.591699700@linutronix.de
>
> Last year I reread a 15-year-old comment about the SIG_IGN problem:
>
> "FIXME: What we really want, is to stop this timer completely and restart
> it in case the SIG_IGN is removed. This is a non trivial change which
> involves sighand locking (sigh !), which we don't want to do late in the
> release cycle. ... A more complex fix which solves also another related
> inconsistency is already in the pipeline."
>
> The embarrassing part was that I put that comment in back then. So I went
> back and rummaged through old notes as I had completely forgotten why our
> attempts to fix this back then failed.
>
> It turned out that the comment is about right: sighand locking and life
> time issues. So I sat down with the old notes and started to wrap my head
> around this again.
>
> The problem to solve:
>
> Posix interval timers are not rearmed automatically by the kernel for
> various reasons:
>
> 1) To prevent DoS by extremely short intervals.
> 2) To avoid timer overhead when a signal is pending and has not
> yet been delivered.
>
> This is achieved by queueing the signal at timer expiry and rearming the
> timer at signal delivery to user space. This puts the rearming basically
> under scheduler control and the work happens in context of the task which
> asked for the signal.
>
> There is a problem with that vs. SIG_IGN. If a signal has SIG_IGN installed
> as handler, the related signals are discarded. So in case of posix interval
> timers this means that such a timer is never rearmed even when SIG_IGN is
> replaced later with a real handler (including SIG_DFL).
>
> To work around that the kernel self rearms those timers and throttles them
> when the interval is smaller than a tick to prevent a DoS.
>
> That just keeps timers ticking, which obviously has effects on power and
> just creates work for nothing.
>
> So ideally these timers should be stopped and rearmed when SIG_IGN is
> replaced, which aligns with the regular handling of posix timers.
>
> Sounds trivial, but isn't:
>
> 1) Lock ordering.
>
> The timer lock cannot be taken with sighand lock held which is
> problematic vs. the atomicity of sigaction().
>
> 2) Life time rules
>
> The timer and the sigqueue are separate entities which requires a
> lookup of the timer ID in the signal rearm code. This can be handled,
> but the separate life time rules are not necessarily robust.
>
> 3) Finding the relevant timers
>
> Obviously it is possible to walk the posix timer list under sighand
> lock and handle it from there. That can be expensive especially in the
> case that there are no affected timers as the walk would just end up
> doing nothing.
>
> The following series is a new and this time actually working attempt to
> solve this. It addresses it by:
>
> 1) Embedding the preallocated sigqueue into struct k_itimer, which makes
> the life time rules way simpler and just needs a trivial reference
> count.
>
> 2) Having a separate list in task::signal on which ignored timers are
> queued.
>
> This avoids walking a potentially large timer list for nothing on a
> SIG_IGN to handler transition.
>
> 3) Requeueing the timer's signal in the relevant signal queue so the timer
> is rearmed when the signal is actually delivered.
>
> That turned out to be the least complicated way to address the sighand
> lock vs. timer lock ordering issue.
>
> With that timers which have their signal ignored are no longer self
> rearmed and the relevant workarounds including throttling for DoS
> prevention are removed.
>
> The series is also available from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git posixt-v4
>
> Changes vs. V3:
>
> - Rebased to mainline
>
> - Fixed up an intermediate build breakage reported by 0-day
I have stopped looking at this after patch 4.
The current code can and does handle userspace injecting a signal with
si_sys_private set to a non-zero value using rt_sigqueueinfo(2) and
that value will be delivered to userspace.
I think at least the ability to inject such a signal (ignoring
si_sys_private) is very interesting for debuggers and checkpoint restart
applications.
I get the feeling the rest of the patch series depends upon not
supporting userspace injecting signals with si_code == SI_TIMER. That
seems unnecessary.
It seems reasonable to depend upon something like the SIGQUEUE_PREALLOC
in the flags field of struct sigqueue to detect a kernel generated
signal. Rather than adding various hacks to make everything work
with just a struct kernel_siginfo_t. Especially as the timer signals
today are the only signals that are preallocated.
Is there any chance 18/27 posix-timers: Embed sigqueue in struct k_itimer
can be moved up?
That should allow removing the reliance on si_sys_private.
That should prevent the need to add another hack with sys_private_ptr in
struct kernel_siginfo
Perhaps what needs to happen is to update collect_signal to return the
sigqueue entry (if it was preallocated), instead of the resched_timer.
Then the timer code can just use container_of to get the struct
k_itimer?
After that si_sys_private can move into struct k_itimer, and the code
won't need to worry about userspace setting that value, or about needing
to clear that value. As si_sys_private will always be 0 in preallocated
signals.
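
The container_of idea can be sketched in userspace like this. It is an illustration under assumptions, not the proposed kernel change: the struct layouts are reduced models, container_of_model() is a simplified version of the kernel macro without its type checking, timer_of() is a hypothetical helper name, and it_requeue_pending merely marks where the value currently carried in si_sys_private could live.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): no type checking, unlike the kernel macro. */
#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced model of struct sigqueue. */
struct sigqueue_model {
	int flags;
};

/* Reduced model of struct k_itimer with the sigqueue embedded (as in 18/27). */
struct k_itimer_model {
	int it_id;
	int it_requeue_pending;		/* would replace the use of si_sys_private */
	struct sigqueue_model sigq;	/* embedded, shares the timer's life time */
};

/* Recover the timer from the sigqueue entry handed back by the dequeue path. */
static struct k_itimer_model *timer_of(struct sigqueue_model *q)
{
	assert(q);
	return container_of_model(q, struct k_itimer_model, sigq);
}
```

With the sigqueue embedded there is no timer ID lookup on the delivery path, and nothing that user space can place in a siginfo takes part in the rearm decision.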
Eric
* Re: [patch v4 00/27] posix-timers: Cure the SIG_IGN mess
2024-09-27 14:39 ` [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Eric W. Biederman
@ 2024-09-27 19:24 ` Thomas Gleixner
0 siblings, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-09-27 19:24 UTC (permalink / raw)
To: Eric W. Biederman
Cc: LKML, Anna-Maria Behnsen, Frederic Weisbecker, John Stultz,
Peter Zijlstra, Ingo Molnar, Stephen Boyd, Oleg Nesterov
On Fri, Sep 27 2024 at 09:39, Eric W. Biederman wrote:
> I have stopped looking at this after patch 4.
>
> The current code can and does handle userspace injecting a signal with
> si_sys_private set to a non-zero value using rt_sigqueueinfo(2) and
> that value will be delivered to userspace.
>
> I think at least the ability to inject such a signal (ignoring
> si_sys_private) is very interesting for debuggers and checkpoint restart
> applications.
>
> I get the feeling the rest of the patch series depends upon not
> supporting userspace injecting signals with si_code == SI_TIMER. That
> seems unnecessary.
>
> It seems reasonable to depend upon something like the SIGQUEUE_PREALLOC
> in the flags field of struct sigqueue to detect a kernel generated
> signal. Rather than adding various hacks to make everything work
> with just a struct kernel_siginfo_t. Especially as the timer signals
> today are the only signals that are preallocated.
Fair enough.
> Is there any chance 18/27 posix-timers: Embed sigqueue in struct k_itimer
> can be moved up?
>
> That should allow removing the reliance on si_sys_private.
>
> That should prevent the need to add another hack with sys_private_ptr in
> struct kernel_siginfo
>
> Perhaps what needs to happen is to update collect_signal to return the
> sigqueue entry (if it was preallocated), instead of the resched_timer.
> Then the timer code can just use container_of to get the struct
> k_itimer?
>
> After that si_sys_private can move into struct k_itimer, and the code
> won't need to worry about userspace setting that value, or about needing
> to clear that value. As si_sys_private will always be 0 in preallocated
> signals.
Let me try that.
Thanks for taking a look!
tglx
Thread overview: 36+ messages
2024-09-27 8:48 [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Thomas Gleixner
2024-09-27 8:48 ` [patch v4 01/27] signal: Confine POSIX_TIMERS properly Thomas Gleixner
2024-09-27 12:21 ` Frederic Weisbecker
2024-09-27 8:48 ` [patch v4 02/27] signal: Prevent user space from setting si_sys_private Thomas Gleixner
2024-09-27 12:37 ` Frederic Weisbecker
2024-09-27 13:40 ` Eric W. Biederman
2024-09-27 8:48 ` [patch v4 03/27] signal: Get rid of resched_timer logic Thomas Gleixner
2024-09-27 13:08 ` Frederic Weisbecker
2024-09-27 13:53 ` Eric W. Biederman
2024-09-27 8:48 ` [patch v4 04/27] posix-timers: Cure si_sys_private race Thomas Gleixner
2024-09-27 14:02 ` Eric W. Biederman
2024-09-27 8:48 ` [patch v4 05/27] signal: Allow POSIX timer signals to be dropped Thomas Gleixner
2024-09-27 8:48 ` [patch v4 06/27] posix-timers: Drop signal if timer has been deleted or reprogrammed Thomas Gleixner
2024-09-27 8:48 ` [patch v4 07/27] posix-timers: Rename k_itimer::it_requeue_pending Thomas Gleixner
2024-09-27 8:48 ` [patch v4 08/27] posix-timers: Add proper state tracking Thomas Gleixner
2024-09-27 8:48 ` [patch v4 09/27] posix-timers: Make signal delivery consistent Thomas Gleixner
2024-09-27 8:48 ` [patch v4 10/27] posix-timers: Make signal overrun accounting sensible Thomas Gleixner
2024-09-27 8:48 ` [patch v4 11/27] posix-cpu-timers: Use dedicated flag for CPU timer nanosleep Thomas Gleixner
2024-09-27 8:48 ` [patch v4 12/27] posix-timers: Add a refcount to struct k_itimer Thomas Gleixner
2024-09-27 8:48 ` [patch v4 13/27] signal: Split up __sigqueue_alloc() Thomas Gleixner
2024-09-27 8:48 ` [patch v4 14/27] signal: Provide posixtimer_sigqueue_init() Thomas Gleixner
2024-09-27 8:48 ` [patch v4 15/27] signal: Add sys_private_ptr to siginfo::_sifields:: _timer Thomas Gleixner
2024-09-27 8:48 ` [patch v4 16/27] posix-timers: Store PID type in the timer Thomas Gleixner
2024-09-27 8:48 ` [patch v4 17/27] signal: Refactor send_sigqueue() Thomas Gleixner
2024-09-27 8:49 ` [patch v4 18/27] posix-timers: Embed sigqueue in struct k_itimer Thomas Gleixner
2024-09-27 8:49 ` [patch v4 19/27] signal: Cleanup unused posix-timer leftovers Thomas Gleixner
2024-09-27 8:49 ` [patch v4 20/27] signal: Add task argument to flush_sigqueue_mask() Thomas Gleixner
2024-09-27 8:49 ` [patch v4 21/27] signal: Provide ignored_posix_timers list Thomas Gleixner
2024-09-27 8:49 ` [patch v4 22/27] posix-timers: Handle ignored list on delete and exit Thomas Gleixner
2024-09-27 8:49 ` [patch v4 23/27] signal: Handle ignored signals in do_sigaction(action != SIG_IGN) Thomas Gleixner
2024-09-27 8:49 ` [patch v4 24/27] signal: Queue ignored posixtimers on ignore list Thomas Gleixner
2024-09-27 8:49 ` [patch v4 25/27] posix-timers: Cleanup SIG_IGN workaround leftovers Thomas Gleixner
2024-09-27 8:49 ` [patch v4 26/27] alarmtimers: Remove the throttle mechanism from alarm_forward_now() Thomas Gleixner
2024-09-27 8:49 ` [patch v4 27/27] alarmtimers: Remove return value from alarm functions Thomas Gleixner
2024-09-27 14:39 ` [patch v4 00/27] posix-timers: Cure the SIG_IGN mess Eric W. Biederman
2024-09-27 19:24 ` Thomas Gleixner