From: Benjamin Segall
To: K Prateek Nayak
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Mel Gorman, Valentin Schneider,
    Aaron Lu, Josh Don
Subject: Re: [PATCH 4/5] sched/fair: Move the throttled tasks to a local list in tg_unthrottle_up()
In-Reply-To: <20260528094830.13291-5-kprateek.nayak@amd.com> (K. Prateek Nayak's message of "Thu, 28 May 2026 09:48:29 +0000")
References: <20260528094830.13291-1-kprateek.nayak@amd.com> <20260528094830.13291-5-kprateek.nayak@amd.com>
Date: Thu, 28 May 2026 15:14:59 -0700
User-Agent: Gnus/5.13 (Gnus v5.13)
X-Mailing-List: linux-kernel@vger.kernel.org

K Prateek Nayak writes:

> An update_curr() during the enqueue of a throttled task will start
> throttling the hierarchy from a subsequent commit. This can lead to
> tg_throttle_down() seeing a non-empty throttled_limbo_list for the cfs_rq
> attaching the task from throttled_limbo_list one by one. For example:
>
>         R
>         |
>         A
>        / \
>      *B   C
>           |
>        rq->curr
>
> *B is throttled with tasks on the limbo list. When the tasks are
> unthrottled via tg_unthrottle_up() and the entity of group B is placed
> onto A, update_curr() is called to catch up the vruntime and it may
> throttle group A, causing the subsequent tg_throttle_down() to see the
> pending tasks on B's limbo list.
>
>   tg_unthrottle_up()
>     /* --cfs_rq->throttle_count == 0 */
>     list_for_each_entry_safe(p, cfs_rq->throttled_limbo_list)
>       enqueue_task_fair()
>         enqueue_entity(se /* B->se */)
>           update_curr(cfs_rq /* A->gcfs_rq */)
>             account_cfs_rq_runtime(cfs_rq)
>               throttle_cfs_rq(cfs_rq /* A->gcfs_rq */)
>                 tg_throttle_down()
>                   /* Reaches B->cfs_rq with throttle_count == 0 */
>
>                   !!! !list_empty(&cfs_rq->throttled_limbo_list) !!!
>
> Move the tasks from throttled_limbo_list onto a local list before
> starting the unthrottle to prevent the splat described above. If the
> hierarchy is throttled again in the middle of an unthrottle, put the
> pending tasks back onto the limbo list to prevent running them
> unnecessarily.

And for extra fun, in order to get here we need to have just finished
throttle_cfs_rq_work() and its resched_curr(), but not have managed to
reach schedule() yet when the unthrottle hits. But yeah, that can happen.

Reviewed-by: Benjamin Segall

>
> Signed-off-by: K Prateek Nayak
> ---
>  kernel/sched/fair.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b3b3172702a9..c48eaf2d7919 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6710,6 +6710,7 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
>  	struct rq *rq = data;
>  	struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
>  	struct task_struct *p, *tmp;
> +	LIST_HEAD(throttled_tasks);
>
>  	/*
>  	 * If cfs_rq->curr is set, the cfs_rq might not have caught up
> @@ -6740,13 +6741,31 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
>  		cfs_rq->throttled_clock_self_time += delta;
>  	}
>
> +	/*
> +	 * Move the tasks to a local list since an update_curr() during
> +	 * enqueue_task_fair() can throttle a higher cfs_rq, and it can
> +	 * see the "throttled_limbo_list" being non-empty in
> +	 * tg_throttle_down() if throttle_count turned 0 above.
> +	 */
> +	list_splice_init(&cfs_rq->throttled_limbo_list, &throttled_tasks);
> +
>  	/* Re-enqueue the tasks that have been throttled at this level. */
> -	list_for_each_entry_safe(p, tmp, &cfs_rq->throttled_limbo_list, throttle_node) {
> +	list_for_each_entry_safe(p, tmp, &throttled_tasks, throttle_node) {
> +		/*
> +		 * Back to being throttled! Break out and put the remaining
> +		 * tasks back onto the limbo_list to prevent running them
> +		 * unnecessarily.
> +		 */
> +		if (cfs_rq->throttle_count)
> +			break;
> +
>  		list_del_init(&p->throttle_node);
>  		p->throttled = false;
> -		enqueue_task_fair(rq_of(cfs_rq), p, ENQUEUE_WAKEUP);
> +		enqueue_task_fair(rq, p, ENQUEUE_WAKEUP);
>  	}
>
> +	list_splice(&throttled_tasks, &cfs_rq->throttled_limbo_list);
> +
>  	/* Add cfs_rq with load or one or more already running entities to the list */
>  	if (!cfs_rq_is_decayed(cfs_rq))
>  		list_add_leaf_cfs_rq(cfs_rq);
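
For anyone following along outside the kernel tree, the detach-then-requeue
pattern in the hunk above can be sketched in plain userspace C. This is only
an illustration under assumptions: the list helpers are hand-rolled stand-ins
for the kernel's <linux/list.h>, and struct task, struct group, throttle(),
unthrottle() and enqueue_and_rethrottle() are hypothetical names, not
scheduler code.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Move every node of src to the front of dst and leave src empty. */
static void list_splice_init(struct list_head *src, struct list_head *dst)
{
	if (list_empty(src))
		return;
	src->next->prev = dst;
	src->prev->next = dst->next;
	dst->next->prev = src->prev;
	dst->next = src->next;
	list_init(src);
}

/* Hypothetical, simplified models of a task and a throttled group. */
struct task { struct list_head node; int throttled; };
struct group { int throttle_count; struct list_head limbo; };

typedef void (*enqueue_fn)(struct group *g, struct task *t);

/*
 * Models the check in tg_throttle_down(): a group going from unthrottled
 * to throttled must not have tasks parked on its limbo list.
 */
static void throttle(struct group *g)
{
	if (g->throttle_count == 0)
		assert(list_empty(&g->limbo));
	g->throttle_count++;
}

/* Returns the number of tasks handed to enqueue(). */
static int unthrottle(struct group *g, enqueue_fn enqueue)
{
	struct list_head local, *pos, *tmp;
	int enqueued = 0;

	if (--g->throttle_count)
		return 0;

	/* Detach first, so a nested throttle() finds the limbo list empty. */
	list_init(&local);
	list_splice_init(&g->limbo, &local);

	for (pos = local.next; pos != &local; pos = tmp) {
		struct task *t = (struct task *)pos; /* node is the first member */

		if (g->throttle_count) /* throttled again while enqueueing */
			break;
		tmp = pos->next;
		list_del_init(pos);
		t->throttled = 0;
		enqueue(g, t);
		enqueued++;
	}

	/* Park the tasks that were not enqueued back on the limbo list. */
	list_splice_init(&local, &g->limbo);
	return enqueued;
}

/* Example hook that re-throttles the group on every enqueue. */
static void enqueue_and_rethrottle(struct group *g, struct task *t)
{
	(void)t;
	throttle(g);
}
```

With a hook like enqueue_and_rethrottle(), the loop stops after the first
task and the remaining ones go back onto the limbo list. If the tasks were
walked directly on g->limbo instead of the local list, the nested throttle()
would trip its assertion, which is the userspace analogue of the splat
described in the commit message.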