From: Frederic Weisbecker
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, "Paul E. McKenney", Peter Zijlstra, Paul Menzel
Subject: [PATCH 1/4] tick: Detect and fix jiffies update stall
Date: Tue, 8 Mar 2022 00:30:31 +0100
Message-Id: <20220307233034.34550-2-frederic@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220307233034.34550-1-frederic@kernel.org>
References: <20220307233034.34550-1-frederic@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

In some rare cases, the timekeeper CPU may delay its jiffies update
duty for a while. Known causes include:

* The timekeeper is waiting on stop_machine in a MULTI_STOP_DISABLE_IRQ
  or MULTI_STOP_RUN state. Disabled interrupts prevent timekeeping
  updates while waiting for the target CPU to complete its
  stop_machine() callback.

* The timekeeper vcpu has VMEXIT'ed for a long while due to some overload
  on the host.

Detect and fix these situations with emergency timekeeping catchups.

Original-patch-by: Paul E. McKenney
Signed-off-by: Frederic Weisbecker
Cc: Thomas Gleixner
---
 kernel/time/tick-sched.c | 17 +++++++++++++++++
 kernel/time/tick-sched.h |  4 ++++
 2 files changed, 21 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 17a283ce2b20..c89f50a7e690 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -169,6 +169,8 @@ static ktime_t tick_init_jiffy_update(void)
 	return period;
 }
 
+#define MAX_STALLED_JIFFIES 5
+
 static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
 {
 	int cpu = smp_processor_id();
@@ -196,6 +198,21 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
 	if (tick_do_timer_cpu == cpu)
 		tick_do_update_jiffies64(now);
 
+	/*
+	 * If jiffies update stalled for too long (timekeeper in stop_machine()
+	 * or VMEXIT'ed for several msecs), force an update.
+	 */
+	if (ts->last_tick_jiffies != jiffies) {
+		ts->stalled_jiffies = 0;
+		ts->last_tick_jiffies = READ_ONCE(jiffies);
+	} else {
+		if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) {
+			tick_do_update_jiffies64(now);
+			ts->stalled_jiffies = 0;
+			ts->last_tick_jiffies = READ_ONCE(jiffies);
+		}
+	}
+
 	if (ts->inidle)
 		ts->got_idle_tick = 1;
 }
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index d952ae393423..504649513399 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -49,6 +49,8 @@ enum tick_nohz_mode {
  * @timer_expires_base:	Base time clock monotonic for @timer_expires
  * @next_timer:		Expiry time of next expiring timer for debugging purpose only
  * @tick_dep_mask:	Tick dependency mask - is set, if someone needs the tick
+ * @last_tick_jiffies:	Value of jiffies seen on last tick
+ * @stalled_jiffies:	Number of stalled jiffies detected across ticks
  */
 struct tick_sched {
 	struct hrtimer			sched_timer;
@@ -77,6 +79,8 @@ struct tick_sched {
 	u64				next_timer;
 	ktime_t				idle_expires;
 	atomic_t			tick_dep_mask;
+	unsigned long			last_tick_jiffies;
+	unsigned int			stalled_jiffies;
 };
 
 extern struct tick_sched *tick_get_tick_sched(int cpu);
-- 
2.25.1
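
For anyone who wants to poke at the stall-detection counter outside the
kernel, the logic of the tick-sched.c hunk can be modelled in user space.
This is only a rough sketch under stated assumptions, not kernel code:
every sim_* name is hypothetical, sim_jiffies stands in for the global
jiffies, sim_update_jiffies() stands in for tick_do_update_jiffies64(),
and time is counted in whole ticks rather than ktime_t.

```c
#include <stdbool.h>

#define MAX_STALLED_JIFFIES 5

/* Hypothetical stand-in for the two fields the patch adds to struct tick_sched. */
struct sim_tick_sched {
	unsigned long last_tick_jiffies;	/* value of jiffies seen on last tick */
	unsigned int stalled_jiffies;		/* consecutive ticks without progress */
};

/* Hypothetical stand-in for the global jiffies counter. */
static unsigned long sim_jiffies;

/*
 * Hypothetical stand-in for tick_do_update_jiffies64(): catch the counter
 * up to the current time, expressed here directly in ticks.
 */
static void sim_update_jiffies(unsigned long now_ticks)
{
	if (now_ticks > sim_jiffies)
		sim_jiffies = now_ticks;
}

/*
 * Model of the check run on each tick of a non-timekeeper CPU.
 * Returns true when this tick had to force a jiffies update.
 */
static bool sim_tick(struct sim_tick_sched *ts, unsigned long now_ticks)
{
	if (ts->last_tick_jiffies != sim_jiffies) {
		/* jiffies moved since the last tick: no stall, restart the count */
		ts->stalled_jiffies = 0;
		ts->last_tick_jiffies = sim_jiffies;
		return false;
	}

	if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) {
		/* same value seen too many times in a row: emergency catchup */
		sim_update_jiffies(now_ticks);
		ts->stalled_jiffies = 0;
		ts->last_tick_jiffies = sim_jiffies;
		return true;
	}

	return false;
}
```

In this model, one tick records the current value, and the fifth
consecutive tick that still sees that same value forces the update, so a
stalled timekeeper is compensated for after MAX_STALLED_JIFFIES local ticks.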