From: Frederic Weisbecker
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Tony Luck, Peter Zijlstra, Vasily Gorbik,
    Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
    Christian Borntraeger, Fenghua Yu, Heiko Carstens
Subject: [RFC PATCH 3/4] sched/irqtime: Move irqtime entry accounting after irq offset incrementation
Date: Wed, 25 Nov 2020 03:15:41 +0100
Message-Id: <20201125021542.30237-4-frederic@kernel.org>
In-Reply-To: <20201125021542.30237-1-frederic@kernel.org>
References: <20201125021542.30237-1-frederic@kernel.org>

IRQ time entry is currently accounted before HARDIRQ_OFFSET or
SOFTIRQ_OFFSET are incremented. This is convenient for deciding to
which cputime index the accumulated time should be dispatched.

Unfortunately it prevents tick_irq_enter() from being called under
HARDIRQ_OFFSET, because tick_irq_enter() has to run before the IRQ
entry accounting due to the necessary clock catch-up. As a result we
don't benefit from appropriate lockdep coverage on tick_irq_enter().

To prepare for fixing this, move the IRQ entry cputime accounting
after the preempt offset is incremented. This requires the cputime
dispatch code to handle the extra offset.
Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
---
 include/linux/hardirq.h |  4 +--
 include/linux/vtime.h   | 10 +++++---
 kernel/sched/cputime.c  | 56 ++++++++++++++++++++++++++++++-----------
 kernel/softirq.c        |  2 +-
 4 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 754f67ac4326..02499c10fbf7 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -32,9 +32,9 @@ static __always_inline void rcu_irq_enter_check_tick(void)
  */
 #define __irq_enter()					\
 	do {						\
+		preempt_count_add(HARDIRQ_OFFSET);	\
+		lockdep_hardirq_enter();		\
 		account_irq_enter_time(current);	\
-		preempt_count_add(HARDIRQ_OFFSET);	\
-		lockdep_hardirq_enter();		\
 	} while (0)
 
 /*
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index f827b38c3bb7..cad8ff530273 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -96,21 +96,23 @@ static inline void vtime_flush(struct task_struct *tsk) { }
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
-extern void irqtime_account_irq(struct task_struct *tsk);
+extern void irqtime_account_enter(struct task_struct *tsk);
+extern void irqtime_account_exit(struct task_struct *tsk);
 #else
-static inline void irqtime_account_irq(struct task_struct *tsk) { }
+static inline void irqtime_account_enter(struct task_struct *tsk) { }
+static inline void irqtime_account_exit(struct task_struct *tsk) { }
 #endif
 
 static inline void account_irq_enter_time(struct task_struct *tsk)
 {
 	vtime_account_irq_enter(tsk);
-	irqtime_account_irq(tsk);
+	irqtime_account_enter(tsk);
 }
 
 static inline void account_irq_exit_time(struct task_struct *tsk)
 {
 	vtime_account_irq_exit(tsk);
-	irqtime_account_irq(tsk);
+	irqtime_account_exit(tsk);
 }
 
 #endif /* _LINUX_KERNEL_VTIME_H */
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 6fa81cc33fec..82623d97667c 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -43,23 +43,49 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
 	u64_stats_update_end(&irqtime->sync);
 }
 
-/*
- * Called before incrementing preempt_count on {soft,}irq_enter
- * and before decrementing preempt_count on {soft,}irq_exit.
- */
-void irqtime_account_irq(struct task_struct *curr)
+static s64 irqtime_get_delta(struct irqtime *irqtime)
 {
-	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
+	int cpu = smp_processor_id();
 	s64 delta;
-	int cpu;
 
-	if (!sched_clock_irqtime)
-		return;
-
-	cpu = smp_processor_id();
 	delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
 	irqtime->irq_start_time += delta;
+
+	return delta;
+}
+
+/* Called after incrementing preempt_count on {soft,}irq_enter */
+void irqtime_account_enter(struct task_struct *curr)
+{
+	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
+	u64 delta;
+
+	if (!sched_clock_irqtime)
+		return;
+
+	delta = irqtime_get_delta(irqtime);
+	/*
+	 * We do not account for softirq time from ksoftirqd here.
+	 * We want to continue accounting softirq time to ksoftirqd thread
+	 * in that case, so as not to confuse scheduler with a special task
+	 * that do not consume any time, but still wants to run.
+	 */
+	if ((irq_count() == (SOFTIRQ_OFFSET | HARDIRQ_OFFSET)) &&
+	    curr != this_cpu_ksoftirqd())
+		irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
+}
+EXPORT_SYMBOL_GPL(irqtime_account_enter);
+
+/* Called before decrementing preempt_count on {soft,}irq_exit */
+void irqtime_account_exit(struct task_struct *curr)
+{
+	struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
+	u64 delta;
+
+	if (!sched_clock_irqtime)
+		return;
+
+	delta = irqtime_get_delta(irqtime);
 
 	/*
 	 * We do not account for softirq time from ksoftirqd here.
 	 * We want to continue accounting softirq time to ksoftirqd thread
@@ -71,7 +97,7 @@ void irqtime_account_irq(struct task_struct *curr)
 	else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
 		irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ);
 }
-EXPORT_SYMBOL_GPL(irqtime_account_irq);
+EXPORT_SYMBOL_GPL(irqtime_account_exit);
 
 static u64 irqtime_tick_accounted(u64 maxtime)
 {
@@ -428,9 +454,11 @@ void vtime_task_switch(struct task_struct *prev)
  */
 void vtime_account_irq_enter(struct task_struct *tsk)
 {
-	if (hardirq_count()) {
+	WARN_ON_ONCE(in_task());
+
+	if (hardirq_count() > HARDIRQ_OFFSET) {
 		vtime_account_hardirq(tsk);
-	} else if (in_serving_softirq()) {
+	} else if (hardirq_count() && in_serving_softirq()) {
 		vtime_account_softirq(tsk);
 	} else if (is_idle_task(tsk)) {
 		vtime_account_idle(tsk);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 617009ccd82c..24254c41bb7c 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -315,9 +315,9 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 	current->flags &= ~PF_MEMALLOC;
 
 	pending = local_softirq_pending();
-	account_irq_enter_time(current);
 
 	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
+	account_irq_enter_time(current);
 	in_hardirq = lockdep_softirq_start();
 
 restart:
-- 
2.25.1