Date: Tue, 7 Apr 2026 14:22:58 +0200
From: Juri Lelli
To: Peter Zijlstra
Cc: John Stultz, soolaugust@gmail.com, mingo@redhat.com, linux-kernel@vger.kernel.org, zhidao su, Andrea Righi
Subject: Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
Message-ID:
References:
 <20260403081215.3942454-1-soolaugust@gmail.com>
 <20260403134256.GH3558198@noisy.programming.kicks-ass.net>
 <20260403224610.GJ2872@noisy.programming.kicks-ass.net>
 <20260404102244.GB22575@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org
In-Reply-To: <20260404102244.GB22575@noisy.programming.kicks-ass.net>

On 04/04/26 12:22, Peter Zijlstra wrote:
> On Sat, Apr 04, 2026 at 12:46:10AM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:
> >
> > > Using an 8 cpu VM with CONFIG_SCHED_PROXY_EXEC disabled:
> > >
> > > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > > reverted, I see the (expected, maybe) behavior where the starvation
> > > lasts ~1 second, then dl_server allows all the threads to spawn right
> > > away, and then the test runs for 10 seconds.
> > >
> > > See perfetto chart:
> > > https://ui.perfetto.dev/#!/?s=a729fd2dd4b224d6335c5b2e727dc1a1c302c11a
> > > (click the Kernel-threads track and scroll down to see the test
> > > threads named referee/defense/offense/crazy-fan)
> > >
> > > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > > applied, it seems the dl_server boosting the kthreadd spawning is much
> > > more staggered. Again we spin up NR_CPU low priority threads, and
> > > there's ~1 second of starvation, then we spawn one of the mid threads,
> > > and another second delay, then there's a two second delay before we
> > > get the third running, then we get a small burst of 5 threads at once,
> > > then it falls back to 1 second or more per thread as it spawns off the
> > > rest. All in all it takes ~44 seconds just to spawn the threads before
> > > running the test.
> > >
> > > Perfetto chart:
> > > https://ui.perfetto.dev/#!/?s=ab8e487375d0c82ceea478ee4534a7189269c0d4
> > >
> > > With higher cpu counts (64), the test effectively prevents the system
> > > from booting (trips the hung task watchdog).
> > >
> > > I haven't really diagnosed the issue, but it feels a little like the
> > > dl_server is boosting until the fair rq is empty but then giving up
> > > the rest of its time, so if a fair task runs repeatedly but for a very
> > > short period of time, it won't get to run again until the next
> > > dl_server period? Causing this rate-limiting one-task-per-second
> > > effect for thread spawning? I still need to stare at the dl_server
> > > logic some more.
> >
> > I'm getting a sense of deja-vu here. Didn't we cure this once before?
> >
> > I'll go stare at this somewhere next week I suppose -- we have a long
> > weekend here.
>
> Random brain wave...
>
> Since the dl_server is LLF (deferred), it will pretty much always trip
> the dl_entity_overflow() when interrupted, right? Does it make sense to
> use the revised wake-up rule for it, when appropriate?
>
> ---
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index d08b00429323..674de6a48551 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1027,7 +1027,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
>  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>  	    dl_entity_overflow(dl_se, rq_clock(rq))) {
>  
> -		if (unlikely(!dl_is_implicit(dl_se) &&
> +		if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
>  			     !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
>  			     !is_dl_boosted(dl_se))) {
>  			update_dl_revised_wakeup(dl_se, rq);
>

So to keep boosting, by reducing runtime appropriately, until the end of
the current dl-server period. Makes sense to me.

Thanks!
Juri
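
For anyone following along, the arithmetic behind "reducing runtime appropriately" can be sketched in userspace. This is only a simplified model, not the kernel code: the `dl_model` struct and the `model_*` helpers are hypothetical names, the `BW_SHIFT`/`DL_SCALE` values are assumed to match the kernel's, plain unsigned comparisons stand in for `dl_time_before()`, and the `is_dl_boosted()` check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed-point shifts, assumed here to match the kernel's definitions. */
#define BW_SHIFT 20
#define DL_SCALE 10

/* Hypothetical, stripped-down stand-in for struct sched_dl_entity. */
struct dl_model {
	uint64_t dl_runtime;	/* budget per period, ns */
	uint64_t dl_deadline;	/* relative deadline, ns */
	uint64_t runtime;	/* remaining budget, ns */
	uint64_t deadline;	/* absolute deadline, ns */
};

/* Density = dl_runtime / dl_deadline, as a BW_SHIFT fixed-point value. */
static uint64_t model_density(const struct dl_model *se)
{
	return (se->dl_runtime << BW_SHIFT) / se->dl_deadline;
}

/*
 * Overflow test: true when runtime / (deadline - now) exceeds
 * dl_runtime / dl_deadline, written as a cross-multiplication.
 */
static bool model_overflow(const struct dl_model *se, uint64_t now)
{
	uint64_t left = (se->dl_deadline >> DL_SCALE) * (se->runtime >> DL_SCALE);
	uint64_t right = ((se->deadline - now) >> DL_SCALE) *
			 (se->dl_runtime >> DL_SCALE);

	return right < left;
}

/* Revised wakeup: keep the deadline, shrink runtime to density * laxity. */
static void model_revised_wakeup(struct dl_model *se, uint64_t now)
{
	assert(se->deadline >= now);	/* the deadline must still be ahead */

	uint64_t laxity = se->deadline - now;

	se->runtime = (model_density(se) * laxity) >> BW_SHIFT;
}

/*
 * Model of the update with the proposed condition: a deferred entity
 * takes the revised-wakeup path even though its deadline is implicit.
 */
static void model_update(struct dl_model *se, uint64_t now,
			 bool implicit, bool defer)
{
	if (se->deadline < now || model_overflow(se, now)) {
		if ((!implicit || defer) && se->deadline >= now) {
			model_revised_wakeup(se, now);
			return;
		}
		/* Otherwise start a fresh period with a full budget. */
		se->deadline = now + se->dl_deadline;
		se->runtime = se->dl_runtime;
	}
}
```

With illustrative parameters (50 ms of runtime per 1 s period, an assumption for the example), an entity interrupted 10 ms before its deadline with 30 ms of budget left trips the overflow test. With `defer` set, the model keeps the current deadline and trims the budget to roughly 0.5 ms, which is 5% of the remaining 10 ms. With `defer` clear, it replenishes into a new period instead.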