From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2506CEB64D9 for ; Sun, 2 Jul 2023 19:45:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232693AbjGBTpU (ORCPT ); Sun, 2 Jul 2023 15:45:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231828AbjGBTn4 (ORCPT ); Sun, 2 Jul 2023 15:43:56 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9799810C3; Sun, 2 Jul 2023 12:42:17 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5C2AF60C81; Sun, 2 Jul 2023 19:41:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 06B5AC433C7; Sun, 2 Jul 2023 19:41:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1688326911; bh=y0BGyO9vP/1rvmqOlfeZMRV2JEhWd4TshxYGxfyLG0w=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=j6sTWAKWAh9WK6rzMayqR/RNHkkFyDpwbsQGC4mtcHQ2S166l4YYTeMSmn3QDFwfO qfWSaiIeOiJI8eWMxYVMVFTnoXAKqvkZS3VUMYyv0uTIZF+QVt1qZwch0FY/rg1QTz RaSStGPZ5iI6fQxyvdmQjHnSjawZYyB2SLaFdM8qgM/wxOkjgBuYPpjTF1YxO1/3U2 VeJ3scFXOm/MMDnIlLNrCWeB52dSN3IydMP+otzYth5WS8TVkNiP8fkd4rPI+G4Chy ndJ6cctD7WH708ELzYm4/SyGWfzhImMVqcbcFLOd677RXreIRKP+UuvnUJE2fvmf++ Xaf7R4p1ioLJg== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Thomas Gleixner , syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com, Dmitry Vyukov , Frederic Weisbecker , Sasha Levin , ebiederm@xmission.com Subject: [PATCH AUTOSEL 5.15 07/10] posix-timers: Ensure timer ID search-loop limit is valid Date: Sun, 2 Jul 2023 15:41:36 -0400 Message-Id: <20230702194139.1778398-7-sashal@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230702194139.1778398-1-sashal@kernel.org> References: <20230702194139.1778398-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 5.15.119 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Thomas Gleixner [ Upstream commit 8ce8849dd1e78dadcee0ec9acbd259d239b7069f ] posix_timer_add() tries to allocate a posix timer ID by starting from the cached ID which was stored by the last successful allocation. This is done in a loop searching the ID space for a free slot one by one. The loop has to terminate when the search wrapped around to the starting point. But that's racy vs. establishing the starting point. That is read out lockless, which leads to the following problem: CPU0 CPU1 posix_timer_add() start = sig->posix_timer_id; lock(hash_lock); ... posix_timer_add() if (++sig->posix_timer_id < 0) start = sig->posix_timer_id; sig->posix_timer_id = 0; So CPU1 can observe a negative start value, i.e. -1, and the loop break never happens because the condition can never be true: if (sig->posix_timer_id == start) break; While this is unlikely to ever turn into an endless loop as the ID space is huge (INT_MAX), the racy read of the start value caught the attention of KCSAN and Dmitry unearthed that incorrectness. Rewrite it so that all id operations are under the hash lock. Reported-by: syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com Reported-by: Dmitry Vyukov Signed-off-by: Thomas Gleixner Reviewed-by: Frederic Weisbecker Link: https://lore.kernel.org/r/87bkhzdn6g.ffs@tglx Signed-off-by: Sasha Levin --- include/linux/sched/signal.h | 2 +- kernel/time/posix-timers.c | 31 ++++++++++++++++++------------- 2 files changed, 19 insertions(+), 14 deletions(-) diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h index 5f0e8403e8ceb..9743f7d173a0b 100644 --- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -125,7 +125,7 @@ struct signal_struct { #ifdef CONFIG_POSIX_TIMERS /* POSIX.1b Interval Timers */ - int posix_timer_id; + unsigned int next_posix_timer_id; struct list_head posix_timers; /* ITIMER_REAL timer for the process */ diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c index 808a247205a9a..4431aecb8b12c 100644 --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -140,25 +140,30 @@ static struct k_itimer *posix_timer_by_id(timer_t id) static int posix_timer_add(struct k_itimer *timer) { struct signal_struct *sig = current->signal; - int first_free_id = sig->posix_timer_id; struct hlist_head *head; - int ret = -ENOENT; + unsigned int cnt, id; - do { + /* + * FIXME: Replace this by a per signal struct xarray once there is + * a plan to handle the resulting CRIU regression gracefully. + */ + for (cnt = 0; cnt <= INT_MAX; cnt++) { spin_lock(&hash_lock); - head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)]; - if (!__posix_timers_find(head, sig, sig->posix_timer_id)) { + id = sig->next_posix_timer_id; + + /* Write the next ID back. Clamp it to the positive space */ + sig->next_posix_timer_id = (id + 1) & INT_MAX; + + head = &posix_timers_hashtable[hash(sig, id)]; + if (!__posix_timers_find(head, sig, id)) { hlist_add_head_rcu(&timer->t_hash, head); - ret = sig->posix_timer_id; + spin_unlock(&hash_lock); + return id; } - if (++sig->posix_timer_id < 0) - sig->posix_timer_id = 0; - if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT)) - /* Loop over all possible ids completed */ - ret = -EAGAIN; spin_unlock(&hash_lock); - } while (ret == -ENOENT); - return ret; + } + /* POSIX return code when no timer ID could be allocated */ + return -EAGAIN; } static inline void unlock_timer(struct k_itimer *timr, unsigned long flags) -- 2.39.2