Linux ARM-MSM sub-architecture
From: Tengfei Fan <tengfei.fan@oss.qualcomm.com>
To: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	linux-arm-msm@vger.kernel.org
Cc: kernel@oss.qualcomm.com, linux-kernel@vger.kernel.org,
	Tengfei Fan <tengfei.fan@oss.qualcomm.com>
Subject: [PATCH] sched: Recheck the rt task's on rq state after double_lock_balance()
Date: Thu, 09 Oct 2025 00:23:55 -0700
Message-ID: <20251009-recheck_rt_task_enqueue_state-v1-1-5f9c96d3c4fd@oss.qualcomm.com>

Recheck whether next_task is still queued on this_rq after this_rq and
lowest_rq have been locked via double_lock_balance() in push_rt_task().
This is necessary because double_lock_balance() can release
this_rq->lock before acquiring both this_rq->lock and lowest_rq->lock.
While this_rq->lock is dropped, next_task may already have been removed
from this_rq's runqueue, leading to a double dequeue.

The double dequeue issue can occur in the following scenario:
1. Core0 call stack (outermost caller first):
        __wake_up
        autoremove_wake_function
        default_wake_function
        try_to_wake_up
        ttwu_do_activate
        task_woken_rt
        push_rt_task
        move_queued_task_locked
        dequeue_task

2. Execution flow on Core0, Core1 and Core2 (all three are contending
   for Core1's rq->lock):
   - Core1: enqueue next_task on Core1
   - Core0: lock Core1's rq->lock
            next_task = pick_next_pushable_task()
            unlock Core1's rq->lock via double_lock_balance()
   - Core1: lock Core1's rq->lock
            next_task = pick_next_task()
            unlock Core1's rq->lock
   - Core2: lock Core1's rq->lock in migration thread
   - Core1: running next_task
   - Core2: unlock Core1's rq->lock
   - Core1: lock Core1's rq->lock
            switch out and dequeue next_task
            unlock Core1's rq->lock
   - Core0: relock Core1's rq->lock in double_lock_balance()
            next_task has already been dequeued from Core1, so
            dequeueing it again is a double dequeue

Signed-off-by: Tengfei Fan <tengfei.fan@oss.qualcomm.com>
---
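The interleaving described in the commit message can be replayed
deterministically in a small user-space model. This is only a sketch:
model_rq, model_task, model_enqueue(), model_dequeue(), model_push() and
model_replay() are hypothetical names invented for illustration, not
kernel APIs, and locking is represented only by the order of the calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: none of these types or helpers exist in the kernel. */
struct model_rq {
	int nr_queued;		/* number of tasks queued on this runqueue */
};

struct model_task {
	struct model_rq *rq;	/* runqueue the task was last queued on */
	bool on_rq;		/* true while the task is queued */
};

static void model_enqueue(struct model_rq *rq, struct model_task *p)
{
	assert(!p->on_rq);	/* enqueueing a queued task would be a bug */
	p->rq = rq;
	p->on_rq = true;
	rq->nr_queued++;
}

static void model_dequeue(struct model_rq *rq, struct model_task *p)
{
	p->on_rq = false;
	rq->nr_queued--;
}

/*
 * Models the tail of push_rt_task() after the lock on @rq has been
 * dropped and retaken. Returns 1 if the task was moved, 0 if the
 * recheck found that the task is no longer queued on @rq.
 */
static int model_push(struct model_rq *rq, struct model_rq *lowest,
		      struct model_task *p, bool recheck)
{
	if (recheck && (p->rq != rq || !p->on_rq))
		return 0;

	model_dequeue(rq, p);
	model_enqueue(lowest, p);
	return 1;
}

/* Replays the interleaving and returns the final count on the source rq. */
int model_replay(bool recheck)
{
	struct model_rq core1 = { 0 }, lowest = { 0 };
	struct model_task next = { NULL, false };

	model_enqueue(&core1, &next);	/* Core1: enqueue next_task */
	/* Core0: picks next_task, then drops Core1's rq->lock */
	model_dequeue(&core1, &next);	/* Core1: next_task switches out */
	/* Core0: retakes Core1's rq->lock and continues the push */
	(void)model_push(&core1, &lowest, &next, recheck);

	return core1.nr_queued;
}
```

In this model, model_replay(false) leaves the source runqueue with a
count of -1 (the double dequeue), while model_replay(true) leaves it
at 0 because the stale task is skipped.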
 kernel/sched/rt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7936d4333731..b4e44317a5de 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2037,6 +2037,14 @@ static int push_rt_task(struct rq *rq, bool pull)
 		goto retry;
 	}
 
+	/* Within find_lock_lowest_rq(), it's possible to first unlock the
+	 * rq->lock of the runqueue containing next_task, and then re-lock
+	 * it. During this window, the state of next_task might have changed.
+	 */
+	if (unlikely(rq != task_rq(next_task) ||
+		     !task_on_rq_queued(next_task)))
+		goto out;
+
 	move_queued_task_locked(rq, lowest_rq, next_task);
 	resched_curr(lowest_rq);
 	ret = 1;

---
base-commit: 7c3ba4249a3604477ea9c077e10089ba7ddcaa03
change-id: 20251008-recheck_rt_task_enqueue_state-e159aa6a2749

Best regards,
-- 
Tengfei Fan <tengfei.fan@oss.qualcomm.com>



Thread overview: 3+ messages
2025-10-09  7:23 Tengfei Fan [this message]
2025-10-20 12:55 ` [PATCH] sched: Recheck the rt task's on rq state after double_lock_balance() Valentin Schneider
2025-10-25  6:43   ` Tengfei Fan
