From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751367AbeDEOLu (ORCPT ); Thu, 5 Apr 2018 10:11:50 -0400
Received: from merlin.infradead.org ([205.233.59.134]:57686 "EHLO
	merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751216AbeDEOLs (ORCPT ); Thu, 5 Apr 2018 10:11:48 -0400
Date: Thu, 5 Apr 2018 16:11:27 +0200
From: Peter Zijlstra
To: "Rafael J. Wysocki"
Cc: "Rafael J. Wysocki" , Linux PM , Frederic Weisbecker ,
	Thomas Gleixner , Paul McKenney , Thomas Ilsche , Doug Smythies ,
	Rik van Riel , Aubrey Li , Mike Galbraith , LKML , Len Brown
Subject: Re: [PATCH v9 10/10] cpuidle: menu: Avoid selecting shallow states with stopped tick
Message-ID: <20180405141127.GS4043@hirez.programming.kicks-ass.net>
References: <1736751.LdhZHb50jq@aspire.rjw.lan>
	<6542020.eHGLEK9V0J@aspire.rjw.lan>
	<20180405124757.GQ4082@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.3 (2018-01-21)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 05, 2018 at 03:49:32PM +0200, Rafael J. Wysocki wrote:
> On Thu, Apr 5, 2018 at 2:47 PM, Peter Zijlstra wrote:
> > On Wed, Apr 04, 2018 at 10:50:36AM +0200, Rafael J. Wysocki wrote:
> >> +	if (tick_nohz_tick_stopped()) {
> >> +		/*
> >> +		 * If the tick is already stopped, the cost of possible short
> >> +		 * idle duration misprediction is much higher, because the CPU
> >> +		 * may be stuck in a shallow idle state for a long time as a
> >> +		 * result of it. In that case say we might mispredict and try
> >> +		 * to force the CPU into a state for which we would have stopped
> >> +		 * the tick, unless the tick timer is going to expire really
> >> +		 * soon anyway.
> >
> > Wait what; the tick was stopped, therefore it _cannot_ expire soon.
> >
> > *confused*
> >
> > Did you mean s/tick/a/ ?
>
> Yeah, that should be "a timer".

*phew* ok, that makes a lot more sense ;-)

My only concern with this is that we can now be overly pessimistic. The
predictor might know that statistically it's very likely a device
interrupt will arrive soon, but because the tick is already disabled, we
don't dare trust it, causing possible excessive latencies.

Would an alternative be to make @stop_tick be an enum capable of forcing
the tick back on?

enum tick_action {
	NOHZ_TICK_STOP,
	NOHZ_TICK_RETAIN,
	NOHZ_TICK_START,
};

	enum tick_action tick_action = NOHZ_TICK_STOP;

	state = cpuidle_select(..., &tick_action);

	switch (tick_action) {
	case NOHZ_TICK_STOP:
		tick_nohz_stop_tick();
		break;
	case NOHZ_TICK_RETAIN:
		tick_nohz_retain_tick();
		break;
	case NOHZ_TICK_START:
		tick_nohz_start_tick();
		break;
	}

Or something along those lines?
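
To make the three-way idea a bit more concrete, below is a rough
userspace sketch of how a governor might pick between the actions. It is
not kernel code and not the menu governor's real logic:
pick_tick_action(), apply_tick_action() and TICK_PERIOD_US are names and
values made up for illustration, the tick is modelled as a plain bool,
and the reading of NOHZ_TICK_RETAIN as "leave the tick in whatever state
it is in" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Three-way replacement for the boolean stop_tick, as proposed above. */
enum tick_action {
	NOHZ_TICK_STOP,		/* stop the tick before going idle */
	NOHZ_TICK_RETAIN,	/* assumed: leave the tick state untouched */
	NOHZ_TICK_START,	/* restart a tick that was stopped earlier */
};

/* Hypothetical tick period; a real value would depend on CONFIG_HZ. */
#define TICK_PERIOD_US 4000u

/*
 * Hypothetical policy: trust a short prediction even when the tick is
 * already stopped, but ask for the tick to be restarted so that a
 * misprediction costs at most one tick period in a shallow state.
 */
enum tick_action pick_tick_action(uint32_t predicted_us, bool tick_stopped)
{
	if (predicted_us >= TICK_PERIOD_US)
		return NOHZ_TICK_STOP;

	return tick_stopped ? NOHZ_TICK_START : NOHZ_TICK_RETAIN;
}

/*
 * Toy stand-in for the switch in the idle loop: returns the new
 * "tick stopped" state instead of calling into the tick code.
 */
bool apply_tick_action(enum tick_action action, bool tick_stopped)
{
	switch (action) {
	case NOHZ_TICK_STOP:
		return true;
	case NOHZ_TICK_START:
		return false;
	case NOHZ_TICK_RETAIN:
	default:
		return tick_stopped;
	}
}
```

With this, a short prediction on a CPU whose tick is already stopped
yields NOHZ_TICK_START rather than forcing a deeper state, which is the
case the pessimism concern above is about.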