From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0AF40C4360F for ; Fri, 5 Apr 2019 09:19:05 +0000 (UTC) Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 46D7D217D7 for ; Fri, 5 Apr 2019 09:19:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 46D7D217D7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.vnet.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 44bDlV2RvLzDqSD for ; Fri, 5 Apr 2019 20:19:02 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=huntbag@linux.vnet.ibm.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 44bDjN6hPpzDqQt for ; Fri, 5 Apr 2019 20:17:12 +1100 (AEDT) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x359BQHg097961 for ; Fri, 5 Apr 2019 05:17:09 -0400 Received: from e06smtp03.uk.ibm.com (e06smtp03.uk.ibm.com [195.75.94.99]) by mx0b-001b2d01.pphosted.com with ESMTP id 2rp39qb5mg-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Fri, 05 Apr 2019 05:17:08 -0400 Received: from localhost by e06smtp03.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 5 Apr 2019 10:17:07 +0100 Received: from b06cxnps4075.portsmouth.uk.ibm.com (9.149.109.197) by e06smtp03.uk.ibm.com (192.168.101.133) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Fri, 5 Apr 2019 10:17:04 +0100 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x359H3vO56295500 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 5 Apr 2019 09:17:03 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9728652054; Fri, 5 Apr 2019 09:17:03 +0000 (GMT) Received: from boston16h.aus.stglabs.ibm.com (unknown [9.3.23.78]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 56E2E52067; Fri, 5 Apr 2019 09:17:02 +0000 (GMT) From: Abhishek Goel To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-pm@vger.kernel.org Subject: [PATCH v2 0/2] Auto-promotion logic for cpuidle states Date: Fri, 5 Apr 2019 04:16:45 -0500 X-Mailer: git-send-email 2.17.1 X-TM-AS-GCONF: 00 x-cbid: 19040509-0012-0000-0000-0000030B9169 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19040509-0013-0000-0000-00002143A4DA Message-Id: <20190405091647.4169-1-huntbag@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2019-04-05_06:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=816 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1904050068 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Abhishek Goel , daniel.lezcano@linaro.org, rjw@rjwysocki.net, ego@linux.vnet.ibm.com Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" Currently, the cpuidle governors (menu/ladder) determine what idle state a idling CPU should enter into based on heuristics that depend on the idle history on that CPU. Given that no predictive heuristic is perfect, there are cases where the governor predicts a shallow idle state, hoping that the CPU will be busy soon. However, if no new workload is scheduled on that CPU in the near future, the CPU will end up in the shallow state. Motivation ---------- In case of POWER, this is problematic, when the predicted state in the aforementioned scenario is a lite stop state, as such lite states will inhibit SMT folding, thereby depriving the other threads in the core from using the core resources. To address this, such lite states need to be autopromoted. The cpuidle-core can queue timer to correspond with the residency value of the next available state. Thus leading to auto-promotion to a deeper idle state as soon as possible. Experiment ---------- I performed experiments for three scenarios to collect some data. case 1 : Without this patch and without tick retained, i.e. in a upstream kernel, It would spend more than even a second to get out of stop0_lite. case 2 : With tick retained(as suggested) - Generally, we have a sched tick at 4ms(CONF_HZ = 250). Ideally I expected it to take 8 sched tick to get out of stop0_lite. Experimentally, observation was ========================================================= sample min max 99percentile 20 4ms 12ms 4ms ========================================================= *ms = milliseconds It would take atleast one sched tick to get out of stop0_lite. case 2 : With this patch (not stopping tick, but explicitly queuing a timer) ============================================================ sample min max 99percentile ============================================================ 20 144us 192us 144us ============================================================ *us = microseconds In this patch, we queue a timer just before entering into a stop0_lite state. The timer fires at (residency of next available state + exit latency of next available state * 2). Let's say if next state(stop0) is available which has residency of 20us, it should get out in as low as (20+2*2)*8 [Based on the forumla (residency + 2xlatency)*history length] microseconds = 192us. Ideally we would expect 8 iterations, it was observed to get out in 6-7 iterations. Even if let's say stop2 is next available state(stop0 and stop1 both are unavailable), it would take (100+2*10)*8 = 960us to get into stop2. So, We are able to get out of stop0_lite generally in 150us(with this patch) as compared to 4ms(with tick retained). As stated earlier, we do not want to get stuck into stop0_lite as it inhibits SMT folding for other sibling threads, depriving them of core resources. Current patch is using auto-promotion only for stop0_lite, as it gives performance benefit(primary reason) along with lowering down power consumption. We may extend this model for other states in future. Abhishek Goel (2): cpuidle : auto-promotion for cpuidle states cpuidle : Add auto-promotion flag to cpuidle flags arch/powerpc/include/asm/opal-api.h | 1 + drivers/cpuidle/Kconfig | 4 ++ drivers/cpuidle/cpuidle-powernv.c | 13 +++++- drivers/cpuidle/cpuidle.c | 68 ++++++++++++++++++++++++++++- drivers/cpuidle/governors/ladder.c | 3 +- drivers/cpuidle/governors/menu.c | 22 +++++++++- include/linux/cpuidle.h | 10 ++++- 7 files changed, 115 insertions(+), 6 deletions(-) -- 2.17.1