From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 09 May 2026 11:42:33 +0000
From: Andreas Ziegler
To: Christian Loehle
Cc: Peter Zijlstra, Juri Lelli, linux-kernel@vger.kernel.org,
 Dietmar Eggemann, John Stultz
Subject: Re: sched/deadline: Use revised wakeup rule for dl_server
References: <496e4b3329fe258da9618b9f05b18fcf@umbiko.net>
 <97d9e04fd9d222f1a64f1ecfda8b81d7@umbiko.net>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi Christian, Everyone,

On 2026-05-08 14:13, Christian Loehle wrote:
> On 5/8/26 13:06, Andreas Ziegler wrote:
>> Hi Christian,
>>
>> On 2026-05-08 09:20, Christian Loehle wrote:
>>> On 5/8/26 09:09, Andreas Ziegler wrote:
>>>> Linux kernel version: 6.12
>>>>   CONFIG_PREEMPT_RT (w/ PREEMPT_RT patch applied)
>>>> Architecture: aarch64
>>>> Platform: Raspberry Pi 4
>>>>
>>>> Hi everyone,
>>>>
>>>> Commit d66792919d4f (sched/deadline: Use revised wakeup rule for
>>>> dl_server) [1] introduced a marked degradation in scheduling latency
>>>> for real-time tasks in the presence of heavy I/O load.
>>>>
>>>> --- a/kernel/sched/deadline.c
>>>> +++ b/kernel/sched/deadline.c
>>>> @@ -1079,7 +1079,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
>>>>      if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>>>>          dl_entity_overflow(dl_se, rq_clock(rq))) {
>>>>
>>>> -        if (unlikely(!dl_is_implicit(dl_se) &&
>>>> +        if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
>>>>                   !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
>>>>                   !is_dl_boosted(dl_se))) {
>>>>              update_dl_revised_wakeup(dl_se, rq);
>>>>
>>>> This was observed using a modified version of Con Kolivas'
>>>> interactivity benchmark [2]; kernel bisection eventually pointed to
>>>> the above mentioned commit.
>>>>
>>>> Benchmark results before d66792919d4f:
>>>>
>>>> --- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
>>>> Load     Latency +/- SD      median  max [100n]  Desired CPU  Deadlines met [%]
>>>> None       76.6 +/- 8.3654       76         166
>>>> Video      78.5 +/- 3.9433       78         107
>>>> X          76.4 +/- 8.123        75         157
>>>> Burn       72.0 +/- 6.4733       71         127
>>>> Write     255.3 +/- 26.627      252         331
>>>> Read      226.6 +/- 12.38       227         262
>>>> Ring       84.2 +/- 6.6207       83         125
>>>> Compile   225.3 +/- 23.949      222         328
>>>>
>>>>           136.8 +/- 78.462                  331
>>>>
>>>> Benchmark results after d66792919d4f:
>>>>
>>>> --- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
>>>> Load     Latency +/- SD      median  max [100n]  Desired CPU  Deadlines met [%]
>>>> None       68.4 +/- 9.7864       67         169
>>>> Video      74.4 +/- 3.724        74          97
>>>> X          72.0 +/- 6.5681       71         129
>>>> Burn       66.9 +/- 5.9059       66         117
>>>> Write    9576.9 +/- 67639       250      500418         98.1               98.1
>>>> Read      209.3 +/- 11.018      209         267
>>>> Ring       80.5 +/- 8.0993       78         125
>>>> Compile   239.0 +/- 29.447      234         372
>>>>
>>>>          1298.4 +/- 24118                500418
>>>>
>>>> Reverting this commit obviously solves the issue for me. I have no
>>>> idea why this issue appears exclusively with heavy write loads in
>>>> the background.
>>>>
>>>> Is this a scheduler issue, or rather something in the background?
>>>>
>>>
>>> Hi Andreas,
>>> You're using cpufreq schedutil for your tests I'm assuming?
>>> Is there a difference in cpufreq behavior (avg cpufreq or OPP
>>> residencies?)
>>> Does the regression also happen on powersave/performance governor?
>>
>> Actually this is a very stripped-down system. The 'performance'
>> cpufreq governor is the only one compiled in, the processor cores run
>> on a fixed frequency. CONFIG_PM_OPP is not set.
>
> That certainly makes the analysis easier.
> I couldn't reproduce the issue so far on my system but it does seem
> like the dl server would get potentially unbounded running time with
> very frequent starting and stopping of the dlserver (which presumably
> happens because of the writeback) reset the runtime, which then leads
> to your 25s observed latency.
> Peter, how is the revised wakeup rule supposed to behave here?
>
>> [snip]

This seems to be a case of runtime starvation. If I change
sched_rt_runtime_us to a smaller value, the benchmark returns reasonable
latency values.

# echo "980000" > /proc/sys/kernel/sched_rt_runtime_us

I could live with this workaround, since it seems not to impact overall
latency values in a noticeable way.

Kind regards,
Andreas
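
P.S. For anyone who wants to put the sysctl value into perspective, here
is a minimal sketch of the arithmetic. It is only an illustration, not
kernel code: the helper names rt_share() and left_over_us() are made up
for this note, and the period of 1000000 us is the documented default of
sched_rt_period_us, assumed here rather than read from the test system.

```python
def rt_share(runtime_us: int, period_us: int = 1_000_000) -> float:
    """Fraction of each period covered by sched_rt_runtime_us.

    A runtime of -1 means "no limit" and is reported as 1.0.
    The default period is an assumption (documented kernel default).
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us == -1:
        return 1.0
    if not 0 <= runtime_us <= period_us:
        raise ValueError("runtime must be -1 or within [0, period]")
    return runtime_us / period_us


def left_over_us(runtime_us: int, period_us: int = 1_000_000) -> int:
    """Microseconds per period that the runtime setting does not cover."""
    rt_share(runtime_us, period_us)  # reuse the range checks above
    if runtime_us == -1:
        return 0
    return period_us - runtime_us
```

With the value from the workaround, rt_share(980000) gives 0.98 and
left_over_us(980000) gives 20000, i.e. 20 ms out of every second. How
this limit interacts with the dl_server on 6.12 is the open question in
this thread, so these numbers describe the setting only, not the cause.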