Subject: Re: [RFC PATCH] arm64: defconfig: Disable fine-grained task level IRQ time accounting
To: Valentin Schneider, Thomas Gleixner
From: Dietmar Eggemann
Date: Wed, 5 Aug 2020 10:50:29 +0200
Message-ID: <02195130-3d9a-a206-d931-fab7dc606061@arm.com>
References: <20200729033934.22349-1-alison.wang@nxp.com> <877dumbtoi.fsf@kurt> <20200729094943.lsmhsqlnl7rlnl6f@skbuf> <87mu3ho48v.fsf@kurt> <20200730082228.r24zgdeiofvwxijm@skbuf> <873654m9zi.fsf@kurt> <87lfiwm2bj.fsf@nanos.tec.linutronix.de> <20200803114112.mrcuupz4ir5uqlp6@skbuf> <87d047n4oh.fsf@nanos.tec.linutronix.de> <875z9zmt4i.fsf@nanos.tec.linutronix.de>
Cc: mw@semihalf.com, paulmck@kernel.org, Anna-Maria Gleixner, catalin.marinas@arm.com, Alison Wang, linux-kernel@vger.kernel.org, leoyang.li@nxp.com, Peter Zijlstra, vladimir.oltean@nxp.com, Kurt Kanzenbach, Vladimir Oltean, will@kernel.org, linux-arm-kernel@lists.infradead.org

On 04/08/2020 01:59, Valentin Schneider wrote:
>
> On 03/08/20 20:22, Thomas Gleixner wrote:
>> Valentin,
>>
>> Valentin Schneider writes:
>>> On 03/08/20 16:13, Thomas Gleixner wrote:
>>>> Vladimir Oltean writes:
>>>>>> 1) When irq accounting is disabled, RT throttling kicks in as
>>>>>> expected.
>>>>>>
>>>>>> 2) With irq accounting the RT throttler does not kick in and the RCU
>>>>>> stall/lockups happen.
>>>>> What is this telling us?
>>>>
>>>> It seems that the fine grained irq time accounting affects the runtime
>>>> accounting in some way which I haven't figured out yet.
>>>>
>>>
>>> With IRQ_TIME_ACCOUNTING, rq_clock_task() will always be incremented by a
>>> lesser-or-equal value than when not having the option; you start with the
>>> same delta_exec but slice some for the IRQ accounting, and leave the rest
>>> for the rq_clock_task() (+paravirt).
>>>
>>> IIUC this means that if you spend e.g. 10% of the time in IRQ and 90% of
>>> the time running the stress-ng RT tasks, despite having RT tasks hogging
>>> the entirety of the "available time" it is still only 90% runtime, which is
>>> below the 95% default and the throttling doesn't happen.
>>
>> totaltime = irqtime + tasktime
>>
>> Ignoring irqtime and pretending that totaltime is what the scheduler
>> can control and deal with is naive at best.
>>
>
> Agreed, however AFAICT rt_time is only incremented by rq_clock_task()
> deltas, which don't include IRQ time with IRQ_TIME_ACCOUNTING=y.
> That would then be directly compared to the sysctl runtime.
>
> Adding some prints in sched_rt_runtime_exceeded() and running this test
> case on my Juno, I get:
> # IRQ_TIME_ACCOUNTING=y
> cpu=2 rt_time=713455220 runtime=950000000 rq->avg_irq.util_avg=265
> (rt_time oscillates between [70.1e7, 75.1e7]; avg_irq between [220, 270])
>
> # IRQ_TIME_ACCOUNTING=n
> cpu=2 rt_time=963035300 runtime=949951811
> (rt_time oscillates between [94.1e7, 96.1e7];
>
> Throttling happens for IRQ_TIME_ACCOUNTING=n and doesn't for
> IRQ_TIME_ACCOUNTING=y - clearly the accounted rt_time isn't high enough for
> that to happen, and it does look like what is missing in rt_time (or what
> should be subtracted from the available runtime) is there in the avg_irq.

I agree that w/ IRQ_TIME_ACCOUNTING=y rt_rq->rt_time isn't high enough in
this testcase.

stress-ng-hrtim-1655 [001] 462.897733: bprint: update_curr_rt: rt_rq->rt_time=416716900 rt_rq->rt_runtime=950000000 rt_b->rt_runtime=950000000

The 5% reservation (1 - sched_rt_runtime_us/sched_rt_period_us) for CFS is
massively eclipsed by irqtime.

It's true that avg_irq tracks 'irq_delta + steal' time, but it is meant to
potentially reduce cpu capacity. It's also cpu and frequency invariant
(your CPU2 is a big CPU so no issue here).

Could an rq_clock(rq)-derived rt_rq signal be used to compare against
rt_runtime? BTW, DL already influences rt_rq->rt_time.

[...]
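
For anyone following along, the arithmetic discussed in this thread can be sketched as a toy model. This is plain Python, not kernel code. The function names (accounted_rt_time_ns, would_throttle) and the constants are made up for illustration. The model assumes the RT tasks consume every non-IRQ nanosecond of a 1 s period, that IRQ time is excluded from rt_time only when IRQ_TIME_ACCOUNTING=y, and that throttling triggers once the accounted rt_time exceeds the runtime, which is how the comparison in sched_rt_runtime_exceeded() is described above.

```python
# Toy model of the RT throttling check discussed in this thread.
# All names here are hypothetical; this is not kernel code.

PERIOD_NS = 1_000_000_000   # assumed default sched_rt_period_us (1 s), in ns
RUNTIME_NS = 950_000_000    # assumed default sched_rt_runtime_us (0.95 s), in ns


def accounted_rt_time_ns(irq_pct: int, irq_time_accounting: bool,
                         period_ns: int = PERIOD_NS) -> int:
    """rt_time charged over one period to RT tasks that hog the CPU.

    irq_pct is the share of the period spent in IRQ context (0..100).
    With IRQ time accounting the IRQ share is sliced off the task clock,
    so it never reaches rt_time; without it, the whole period is charged.
    """
    if not 0 <= irq_pct <= 100:
        raise ValueError("irq_pct must be between 0 and 100")
    if irq_time_accounting:
        return period_ns * (100 - irq_pct) // 100
    return period_ns


def would_throttle(rt_time_ns: int, runtime_ns: int = RUNTIME_NS) -> bool:
    """Throttle only when the accounted rt_time exceeds the runtime budget."""
    return rt_time_ns > runtime_ns


if __name__ == "__main__":
    # Valentin's example: 10% IRQ time, RT tasks hog the rest.
    for accounting in (True, False):
        rt_time = accounted_rt_time_ns(10, accounting)
        print(f"IRQ_TIME_ACCOUNTING={'y' if accounting else 'n'} "
              f"rt_time={rt_time} throttle={would_throttle(rt_time)}")
```

Under these assumptions, any IRQ share of 5% or more keeps the accounted rt_time at or below the 950 ms budget, so the throttler never fires with IRQ_TIME_ACCOUNTING=y. The Juno numbers quoted above fit the same comparison: 713455220 stays below 950000000, while 963035300 exceeds 949951811.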