From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D108C77B75 for ; Mon, 8 May 2023 10:45:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235006AbjEHKpx (ORCPT ); Mon, 8 May 2023 06:45:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48242 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235086AbjEHKp1 (ORCPT ); Mon, 8 May 2023 06:45:27 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78C1926EB0 for ; Mon, 8 May 2023 03:44:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 115F462870 for ; Mon, 8 May 2023 10:44:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 07808C433EF; Mon, 8 May 2023 10:44:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1683542684; bh=nd36LqgPgXbyzTWgYP9hgPLGp15vgwclQBLev1kFv8E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=EiSCqw8hIIRlbdP7z2yUMWXsAMnaPX52EWDpPBYA7DQML4/QkMDNgkyRbTw75XYap iu1HtJioYAoK1sdS42rfM0hNWWdzJIpCPRTm+Y2/4oU+A+3mPW/HWR2HdeHLDPeHUu G99a+NnSbu+D3E5U2Jpq2LwisoQxYJAoynYvH38U= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Schspa Shi , "Peter Zijlstra (Intel)" , "Steven Rostedt (Google)" , Dietmar Eggemann , Valentin Schneider , Dwaine Gonyier , Sasha Levin Subject: [PATCH 6.2 514/663] sched/rt: Fix bad task migration for rt tasks Date: Mon, 8 May 2023 11:45:40 +0200 Message-Id: <20230508094445.431689554@linuxfoundation.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230508094428.384831245@linuxfoundation.org> References: <20230508094428.384831245@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Schspa Shi [ Upstream commit feffe5bb274dd3442080ef0e4053746091878799 ] Commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing") allows find_lock_lowest_rq() to pick a task with migration disabled. The purpose of the commit is to push the current running task on the CPU that has the migrate_disable() task away. However, there is a race which allows a migrate_disable() task to be migrated. Consider: CPU0 CPU1 push_rt_task check is_migration_disabled(next_task) task not running and migration_disabled == 0 find_lock_lowest_rq(next_task, rq); _double_lock_balance(this_rq, busiest); raw_spin_rq_unlock(this_rq); double_rq_lock(this_rq, busiest); <> task become running migrate_disable(); deactivate_task(rq, next_task, 0); set_task_cpu(next_task, lowest_rq->cpu); WARN_ON_ONCE(is_migration_disabled(p)); Fixes: 95158a89dd50 ("sched,rt: Use the full cpumask for balancing") Signed-off-by: Schspa Shi Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Steven Rostedt (Google) Reviewed-by: Dietmar Eggemann Reviewed-by: Valentin Schneider Tested-by: Dwaine Gonyier Signed-off-by: Sasha Levin --- kernel/sched/deadline.c | 1 + kernel/sched/rt.c | 4 ++++ 2 files changed, 5 insertions(+) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 0d97d54276cc8..238bdb36f8848 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2246,6 +2246,7 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq) !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) || task_on_cpu(rq, task) || !dl_task(task) || + is_migration_disabled(task) || !task_on_rq_queued(task))) { double_unlock_balance(rq, later_rq); later_rq = NULL; diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 0a11f44adee57..4f5796dd26a56 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -2000,11 +2000,15 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq) * the mean time, task could have * migrated already or had its affinity changed. * Also make sure that it wasn't scheduled on its rq. + * It is possible the task was scheduled, set + * "migrate_disabled" and then got preempted, so we must + * check the task migration disable flag here too. */ if (unlikely(task_rq(task) != rq || !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) || task_on_cpu(rq, task) || !rt_task(task) || + is_migration_disabled(task) || !task_on_rq_queued(task))) { double_unlock_balance(rq, lowest_rq); -- 2.39.2