From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mail-wm1-f48.google.com (mail-wm1-f48.google.com [209.85.128.48])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02AFF32720D
	for <linux-kernel@vger.kernel.org>; Fri, 29 May 2026 07:53:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.48
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1780041193; cv=none; b=IK8+FWuJL3eDkwie+IN78iPV8trqJJZVGIHbpzOsxNIL5P/4k1HqQE66h3kssHcXZlQxYuT0D7Np6d6nBgL4yAEY/b2T7OFZn7Lif4W5QIt3RDTDr3ManMvcqOittfXCNQLPsTANq0CFMHVxZFOAV7Gequat0r4zsY9+hPxnAFo=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1780041193; c=relaxed/simple;
	bh=MT5dskzzSEVVlXTM6hOvcWX9lezshIvtW5k+lVSTqRo=;
	h=Date:Message-ID:From:To:Cc:Subject:In-Reply-To; b=X+rf8ILjjDYF3iWJ4+h3tXBkYkkdhCBICOHoJYW5OlZfhZ1lPmVWr9kKH6m3CrwqXkzE2NxoXSI7ReNZ+L3TSxBMcsWnN4C55RKEWVdY2O8wfYxWYxymfWke/Y0KbN5ZUE4OjE3Xl92BUgNxNrtNNq2TiRM6WgX6+TdyL0oSROQ=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=glX6p13+; arc=none smtp.client-ip=209.85.128.48
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="glX6p13+"
Received: by mail-wm1-f48.google.com with SMTP id 5b1f17b1804b1-4891e5b9c1fso118146335e9.2
        for <linux-kernel@vger.kernel.org>; Fri, 29 May 2026 00:53:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20251104; t=1780041190; x=1780645990; darn=vger.kernel.org;
        h=in-reply-to:subject:cc:to:from:message-id:date:from:to:cc:subject
         :date:message-id:reply-to;
        bh=Rz1zTzC7OEnF1tJwBOUtd5qB9/IzN6xkroCPmYLH6kI=;
        b=glX6p13+jECyfefhNQhxprNYlxaernZTaXHWkGvhPu0NLZxVG+TXoaGlEIhTkv1IcP
         sOONaeHMObS0KN3cvOk5FPz7bPRARA7EspnbVWln+IJRuyDj+V+S6ciga5fnSVqrR3D/
         /x5uBs4Tj69uowe1sU1rq5bE4T2KKBdS+50hsDpgrQD8ouq0DZ2LBhbaaTfmD5jArgEx
         0+oxLkjshOn8VkYX50LTQPLzaRUs3wEAbzuMM7yweOuXHwsTBMOtT9KI2SUiGAve0srd
         lx0u787OvU71yLyn4pg327L0Nst+n+GgTsu+YQQf4WxXkYZCuG0fU4IARV6gvYLHt37z
         +jrg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20251104; t=1780041190; x=1780645990;
        h=in-reply-to:subject:cc:to:from:message-id:date:x-gm-gg
         :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=Rz1zTzC7OEnF1tJwBOUtd5qB9/IzN6xkroCPmYLH6kI=;
        b=SFpPoOgF2LdNAdUMHb1H7ACnMv7sNPIZzVgjWYuxu69nd5viopO16L2dxg72X2q5If
         7KePZoCkq7HVVugVGI/UlRzXVqHM/E5KHLyJyltYHjjEuSdghGgrM9Q1e8M3yX3o+Xx/
         J+qm3Cf+IXRfM9OH5GptevQhgqmxC7WGuEYdYRGED5/zc0GvyNrzDEfvtg/JTW76EdAH
         +fvi1KbOhSV9+6XrLFbPx3AfWYE36H/KYhsYcyzyQNh7FaDmpX5GcalrKOVecN6GBzNB
         pIXnuTegeGbgs8EPt+RrHWeFYP3/5A9baDpwuMQ9yl0zZD3szWVSd0VH24t4A8qWdS/K
         Ue6w==
X-Gm-Message-State: AOJu0YzG9IA6hMfIOs4Z8amj4z1L2guJkDpZQUjGz2Mu1VZ0WF1i/azz
	YvT1U87YiD2n9eBRms6OCfPPKG9QRoQLj0aFYw5aglcV1zzcPdtSwDy0
X-Gm-Gg: Acq92OFrWDGvfBSe+m1zf24ZGHK0NfNzwhuqZZ9HIRVLeBVM0yqumIaRqoBGUmmWLsJ
	3eZt+zHY7xHKsTEFSXGqqs5fB73ykyMQjfCAnOL453jm1ZYgbUoJgs7ahB5hc8nnUB2hkmwHPxZ
	VyoNjgtNEaPVuyxuWME6gsVZacMcDbWtGnczHfJPUQO5yEnhGh1u3rTtKj6Hr3nVjUpM93+FN7o
	49TpT9USr5NXaAhSFb5+BwaUw/jLeeuv2+3KoeeRiH7KgHE3TPhSv0ugHCm74Z1DfBwkwNMsNHT
	ZIqn7Ili15KHXpQFSwvMU2AcAyimzoJYtDIWwjKjPAdhfeFEOFqeI0nE963tC4anoxc3xUZY87T
	GESWXItY8A7SyqSD8pZhwyyULbHKIA4SoCUCncWY5GPvPDRIB+CFfkY9X8xkwGEX75+X3Wp4XFt
	NG//VjUFhykottzzIulVE3uwW5drZNduE1IaP2Oc8faF7+E/K9qyxm87AV+EmtG3s71u7nrm13P
	4+6Y0x7uAyA0ZT6FWlXJUYPTWHQbGR9YhkeISyg+y+mq5M998yYJHF69sxumrXITNyV88DZ+WZ2
	uj5zou1LmD7X0/LM7FjBA/U7XItgBGlGI3wjJYlTFmQk8o0ZQkTZcfuP
X-Received: by 2002:a05:600c:1d16:b0:48a:581c:ead with SMTP id 5b1f17b1804b1-4909c07d836mr28511055e9.10.1780041190014;
        Fri, 29 May 2026 00:53:10 -0700 (PDT)
Received: from localhost (ip-082-212-034-085.um22.pools.vodafone-ip.de. [82.212.34.85])
        by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-4909c0c39b0sm13174755e9.2.2026.05.29.00.53.09
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 29 May 2026 00:53:09 -0700 (PDT)
Date: Fri, 29 May 2026 09:53:08 +0200
Message-ID: <65ef80bbf22867d1ea59642a3e03e3ef.tomge68@gmail.com>
From: Tom Gebhardt <tomge68@gmail.com>
To: Qais Yousef <qyousef@layalina.io>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [PATCH v2 00/13] sched/fair/schedutil: Better manage system response time
In-Reply-To: <20260529014333.w3tqvkirgo7jij6l@airbuntu>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>

On 05/29/26 02:43, Qais Yousef wrote:
> 7.0+ttwu+vincent is the best, right?

Yes.

> Have you verified your actual workload is seeing benefit? I think when
> I scanned the github bug you references the original report was observing
> a regression in some real setup, not this stressng tests.

I am the reporter. The original issue (#7308) was observed as a drop in
camera frame rate when running two parallel IMX477 streams via libcamera on
RPi5 under kernel 6.12+. The camera pipeline is pipe-IPC-heavy (GStreamer /
libcamera internal queues), so the regression surfaced there first in a real
workload. To isolate the cause I moved to synthetic pipe benchmarks
(stress-ng), which confirmed and quantified the regression cleanly.
A Raspberry Pi developer (popcornmix) also posted inter-process-communication
benchmark results on the issue, independently confirming the regression.
Ranked by throughput: 6.6=2065 > 6.18=1805 > 6.12=1662 > 7.0=1570 Kops/s.
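
For reference, the relative drop against 6.6 implied by those figures can be
worked out as below. This is only a sketch of the arithmetic: the names
KOPS and change_vs_ref are illustrative, and the inputs are just the Kops/s
values quoted above, not new measurements.

```python
# Throughput in Kops/s, copied from the figures quoted above.
KOPS = {"6.6": 2065, "6.18": 1805, "6.12": 1662, "7.0": 1570}


def change_vs_ref(kops, ref="6.6"):
    """Return the percentage change of each entry relative to kops[ref]."""
    base = kops[ref]
    return {kernel: 100.0 * (value - base) / base
            for kernel, value in kops.items()}


if __name__ == "__main__":
    for kernel, pct in change_vs_ref(KOPS).items():
        print(f"{kernel:>5}: {pct:+.1f}% vs 6.6")
```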

The stress-ng pipe stressor is therefore not an artificial worst-case -- it
directly exercises the code path that causes the real-world camera regression.
That said, I agree stress-ng amplifies the effect, and I cannot give you an
exact frame-rate number yet with the ttwu+vincent patches applied.
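
To illustrate the kind of code path involved, here is a minimal pipe
ping-pong sketch. It is not how stress-ng implements its pipe stressor and
it was not used for any of the numbers in this thread; the function name
and round count are made up for illustration. Each blocking read sleeps
until the peer writes, so every round trip goes through a wakeup and a
schedule() on the pipe read path. It assumes Linux (os.fork).

```python
import os
import time


def pipe_ping_pong(rounds=10000):
    """Bounce one byte between parent and child; return round trips/sec."""
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back until done or the parent closes its end.
        os.close(p2c_w)
        os.close(c2p_r)
        try:
            for _ in range(rounds):
                data = os.read(p2c_r, 1)  # blocks until the parent writes
                if not data:
                    break
                os.write(c2p_w, data)
        finally:
            os._exit(0)

    # Parent: close the ends it does not use, then time the round trips.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)  # blocks until the child echoes the byte
    elapsed = time.perf_counter() - start

    os.close(p2c_w)
    os.close(c2p_r)
    os.waitpid(pid, 0)
    return rounds / elapsed


if __name__ == "__main__":
    print(f"{pipe_ping_pong():.0f} round trips/s")
```

Python interpreter overhead dominates here, so the absolute rate is not
comparable with the stress-ng figures; the sketch only shows the
read/write/wakeup pattern.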

> IPC drops 14% on 7.0 stock. Due to stalling you reckon?

Yes. The branch misprediction rate explains most of it. On Cortex-A76 a
branch mispredict costs ~13 cycles. Normalised by instruction count:

  Kernel             branch-miss rate   vs 6.6
  -----------------  -----------------  ------
  6.6.78             0.178%             ref
  7.0.0 stock        0.427%             +140%
  7.0.0+ttwu+vincent 0.271%             +52%

The raw counts I reported yesterday were misleading because the instruction
counts differ between kernels (different amounts of useful work). Apologies
for not normalising upfront. The rate tells a cleaner story: stock EEVDF
causes 2.4× as many mispredictions per instruction as CFS, and ttwu+vincent
brings that down to 1.5× -- a significant improvement but not full recovery.
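
The normalisation and the ratios above can be reproduced with a few lines.
This is a sketch under stated assumptions: the rates are the table values,
the ~13-cycle penalty is the approximate Cortex-A76 figure mentioned above,
and the "penalty cycles per 1000 instructions" number is a crude estimate
that assumes every mispredict costs the full penalty. All names are
illustrative.

```python
MISPREDICT_PENALTY_CYCLES = 13  # approximate figure quoted above (assumption)

# Branch-miss rate in percent of instructions, copied from the table above.
RATES = {
    "6.6.78": 0.178,
    "7.0.0 stock": 0.427,
    "7.0.0+ttwu+vincent": 0.271,
}


def miss_rate_percent(branch_misses, instructions):
    """Normalise a raw branch-miss count by the instruction count."""
    return 100.0 * branch_misses / instructions


def compare(rates, ref="6.6.78"):
    """Ratio and percentage change vs `ref`, plus a rough penalty estimate."""
    base = rates[ref]
    out = {}
    for kernel, rate in rates.items():
        out[kernel] = {
            "ratio": rate / base,
            "delta_percent": 100.0 * (rate / base - 1.0),
            # rate% of 1000 instructions = rate * 10 mispredicts
            "penalty_cycles_per_kinstr": rate * 10 * MISPREDICT_PENALTY_CYCLES,
        }
    return out


if __name__ == "__main__":
    for kernel, row in compare(RATES).items():
        print(f"{kernel:<20} x{row['ratio']:.2f} "
              f"({row['delta_percent']:+.0f}%), "
              f"~{row['penalty_cycles_per_kinstr']:.1f} cycles/kinstr")
```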

> Do you have the full output? It would be interesting to use perf diff.

A proper perf diff with resolved kernel symbols requires running against the
matching kernel. I ran `perf report --no-children -s symbol` on each .data
file while booted on the corresponding kernel. Key findings:

7.0.0 stock (flat, self-overhead):

  12.98%  finish_task_switch.isra.0
          -> __schedule -> schedule
             -> anon_pipe_read   5.72%
             -> anon_pipe_write  1.38%

7.0.0+ttwu+vincent (flat, self-overhead):

  19.62%  finish_task_switch.isra.0
          -> __schedule -> schedule
             -> anon_pipe_read   8.22%
             -> anon_pipe_write  4.34%

The striking difference is in the anon_pipe_write -> schedule() path: 1.38%
on stock vs 4.34% with ttwu+vincent. With the ttwu patches, pipe writers
yield the CPU far more readily after each write, allowing the reader to run
immediately. Stock EEVDF leaves this to the scheduler's own timing, which
results in more latency and lower throughput.

The higher absolute percentage in finish_task_switch for ttwu+vincent is
expected: that kernel completes ~24% more pipe operations in the same wall
time, so correspondingly more context switches complete.

On 6.6 (from the call-graph profile recorded separately), finish_task_switch
is not visible as a top-level hotspot at all -- consistent with CFS handling
this path much more efficiently.

> Maybe there's higher rq lock contention. But this finish_task_switch and
> __raw_spin_unlock_irqrestore are common to see, especially when there's
> high context switch rate.

Agreed -- I cannot rule out rq lock contention without a perf diff against
matched build-IDs. The pattern I see (finish_task_switch dominant, driven
by pipe_read/write) is consistent with high context switch rate rather than
a pathological lock. But your point about a 'hot variable' like rq->clock
is noted -- I cannot confirm or deny that from flat profiles alone.

> I hope perfetto trace will help visualize the pattern that led to this
> higher context switching.

I will work on getting a perfetto trace and expect to send it in a
follow-up.

Tom