From: "Ionut Nechita (Sunlight Linux)"
To: rafael@kernel.org
Cc: ionut_n2001@yahoo.com, daniel.lezcano@linaro.org,
    christian.loehle@arm.com, linux-pm@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH 0/1] cpuidle: menu: Fix high wakeup latency on modern Intel server platforms
Date: Tue, 20 Jan 2026 23:17:24 +0200
Message-ID: <20260120211725.124349-1-sunlightlinux@gmail.com>

From: Ionut Nechita

Hi,

This patch addresses a performance regression in the menu cpuidle
governor affecting modern Intel server platforms (Sapphire Rapids,
Granite Rapids, and newer).

== Problem Description ==

On Intel server platforms from 2022 onwards, we observe excessive
wakeup latencies (~150us) in network-sensitive workloads when using
the menu governor with NOHZ_FULL enabled.

Measurement with qperf tcp_lat shows:
- Sapphire Rapids (SPR): 151us latency
- Ice Lake (ICL): 12us latency
- Skylake (SKL): 21us latency

The 12x latency regression on SPR compared to Ice Lake is unacceptable
for latency-sensitive applications (HPC, real-time, financial trading,
etc.).

== Root Cause ==

The issue stems from menu.c:294-295:

	if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
		predicted_ns = data->next_timer_ns;

When the tick is already stopped and the predicted idle duration is
short (<2ms), the governor switches to using next_timer_ns directly
(often 10ms+).
This causes the selection of very deep package C-states (PC6).

Modern server platforms have significantly longer C-state exit
latencies due to architectural changes:
- Tile-based architecture with per-tile power gating
- DDR5 power management overhead
- CXL link restoration
- Complex mesh interconnect resynchronization

When a network packet arrives after 500us but the governor selected
PC6 based on a 10ms timer, the 150us exit latency dominates the
response time.

On older platforms (Ice Lake, Skylake) with faster C-state transitions
(12-21us), this issue was less noticeable, but SPR's tile architecture
makes it critical.

== Solution ==

Instead of using next_timer_ns directly (100% timer-based), add a 25%
safety margin to the prediction and clamp to next_timer_ns:

	predicted_ns = min(predicted_ns + (predicted_ns >> 2),
			   data->next_timer_ns);

This provides:
- Conservative prediction (avoids too-shallow states)
- Protection against excessively deep states (clamped to timer)
- Platform-agnostic solution (no hardcoded thresholds)
- Minimal overhead (one shift, one add, one min)

The 25% margin (>> 2 = divide by 4) was chosen as a balance between:
- Too small (10%): Insufficient protection on high-latency platforms
- Too large (50%): Overly conservative, may hurt power efficiency

== Results ==

Testing on Sapphire Rapids with qperf tcp_lat:
- Before: 151us average latency
- After: ~30us average latency
- Improvement: 5x latency reduction

Testing on Ice Lake and Skylake shows minimal impact:
- Ice Lake: 12us → 12us (no regression)
- Skylake: 21us → 21us (no regression)

Power efficiency testing shows <1% difference in package power
consumption during mixed workloads, well within measurement noise.
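
For readers who want to check the arithmetic, the following userspace
sketch contrasts the quoted menu.c fallback with the clamped-margin
formula from the Solution section. It is an illustration only, not the
patch itself: the function names, min_u64() and the tick_ns parameter
(standing in for TICK_NSEC) are hypothetical, and it assumes the patch
keeps the existing "tick stopped and prediction below one tick"
condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's min(). */
uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Behaviour quoted from menu.c: with the tick stopped and a prediction
 * shorter than one tick, use the next timer event as the prediction.
 * tick_ns stands in for TICK_NSEC, whose value depends on CONFIG_HZ.
 */
uint64_t predict_current(uint64_t predicted_ns, uint64_t next_timer_ns,
			 uint64_t tick_ns, bool tick_stopped)
{
	if (tick_stopped && predicted_ns < tick_ns)
		predicted_ns = next_timer_ns;
	return predicted_ns;
}

/*
 * Proposed behaviour (assumed to sit under the same condition): add a
 * 25% margin (predicted_ns >> 2) and clamp the result to the timer.
 */
uint64_t predict_patched(uint64_t predicted_ns, uint64_t next_timer_ns,
			 uint64_t tick_ns, bool tick_stopped)
{
	if (tick_stopped && predicted_ns < tick_ns)
		predicted_ns = min_u64(predicted_ns + (predicted_ns >> 2),
				       next_timer_ns);
	return predicted_ns;
}

/* Checks the numbers used in this letter, assuming a 2 ms tick. */
void check_worked_examples(void)
{
	const uint64_t us = 1000, ms = 1000000, tick = 2 * ms;

	/* 500us prediction, 10ms timer: 10ms before, 625us after. */
	assert(predict_current(500 * us, 10 * ms, tick, true) == 10 * ms);
	assert(predict_patched(500 * us, 10 * ms, tick, true) == 625 * us);

	/* 1800us prediction, 2ms timer: clamped to 2ms in both cases. */
	assert(predict_current(1800 * us, 2 * ms, tick, true) == 2 * ms);
	assert(predict_patched(1800 * us, 2 * ms, tick, true) == 2 * ms);
}
```

Calling check_worked_examples() from a small main() is enough to
confirm the figures. Predictions at or above tick_ns, and any
prediction made while the tick is still running, pass through both
functions unchanged.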

== Examples ==

Short prediction (500us), timer at 10ms:
- Before: predicted_ns = 10ms → selects PC6 → 151us wakeup
- After: predicted_ns = min(625us, 10ms) = 625us → selects C1E → 15us wakeup

Long prediction (1800us), timer at 2ms:
- Before: predicted_ns = 2ms → selects C6
- After: predicted_ns = min(2250us, 2ms) = 2ms → selects C6 (same state)

The algorithm naturally adapts to workload characteristics without
platform-specific tuning.

Ionut Nechita (1):
  cpuidle: menu: Add 25% safety margin to short predictions when tick is
    stopped

 drivers/cpuidle/governors/menu.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--
2.52.0