From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752092AbcF0XVM (ORCPT ); Mon, 27 Jun 2016 19:21:12 -0400
Received: from mail-wm0-f68.google.com ([74.125.82.68]:35164 "EHLO
        mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751888AbcF0XVL (ORCPT );
        Mon, 27 Jun 2016 19:21:11 -0400
Date: Tue, 28 Jun 2016 01:21:08 +0200
From: Frederic Weisbecker
To: riel@redhat.com
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@kernel.org,
        pbonzini@redhat.com, fweisbec@redhat.com, wanpeng.li@hotmail.com,
        efault@gmx.de, tglx@linutronix.de, rkrcmar@redhat.com
Subject: Re: [PATCH 2/5] nohz,cputime: remove VTIME_GEN vtime irq time code
Message-ID: <20160627232105.GA7582@lerouge>
References: <1466648751-7958-1-git-send-email-riel@redhat.com>
        <1466648751-7958-3-git-send-email-riel@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1466648751-7958-3-git-send-email-riel@redhat.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 22, 2016 at 10:25:48PM -0400, riel@redhat.com wrote:
> From: Rik van Riel
>
> The CONFIG_VIRT_CPU_ACCOUNTING_GEN irq time tracking code does not
> appear to currently work right.
>
> On CPUs that are nohz_full, people typically do not assign IRQs.

Right, but they can still fire. At least one tick per second, plus the
pinned timers, etc...

>
> On the housekeeping CPU (when a system is booted up with nohz_full),
> sampling should work ok to determine irq and softirq time use, but
> that only covers the housekeeping CPU itself, not the other
> non-nohz_full CPUs.

Hmm, every non-nohz_full CPU, including CPU 0, accounts the irqtime the
same way: through the tick (and therefore can't account much of it).
So I'm a bit confused by the above statements...

>
> On CPUs that are nohz_idle (the typical way a distro kernel is
> booted), irq time is not accounted at all while the CPU is idle,
> due to the lack of timer ticks.

But as soon as a timer tick fires in idle or afterward, the pending
irqtime is accounted.

That said, I don't see how it explains why we do the below:

>
> Remove the VTIME_GEN vtime irq time code. The next patch will
> allow NO_HZ_FULL kernels to use the IRQ_TIME_ACCOUNTING code.

I don't get the reason why we are doing this. Now arguably the irqtime
accounting is probably not working as well as before since we switched to
the jiffy clock. But I still see some hard irqs accounted when
account_irq_exit() is lucky enough to observe that jiffies changed since
the beginning of the interrupt. So it's not entirely broken.

I agree that we need to switch it to the generic irqtime accounting code,
but breaking the code now to reactivate it in a subsequent patch is prone
to future bisection issues.

Thanks.
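
To make the jiffies-granularity point above concrete, here is a toy
user-space model, not kernel code. TICK_NS, toy_jiffies() and
toy_irq_charge_ns() are made-up names for illustration, and a 1 ms tick
(HZ=1000) is assumed. It only sketches the idea that an interrupt gets
charged when the jiffies counter changes between its entry and its exit.

```c
/* Assumed tick length: 1 ms, i.e. HZ=1000 (an assumption, not from the thread). */
#define TICK_NS 1000000ULL

/* Toy stand-in for the jiffies counter: whole ticks elapsed at time t_ns. */
static unsigned long long toy_jiffies(unsigned long long t_ns)
{
	return t_ns / TICK_NS;
}

/*
 * Time charged to an interrupt running from enter_ns to exit_ns when the
 * only clock is the jiffies counter, sampled once at entry and once at exit.
 * Callers must pass exit_ns >= enter_ns.
 */
static unsigned long long toy_irq_charge_ns(unsigned long long enter_ns,
					    unsigned long long exit_ns)
{
	return (toy_jiffies(exit_ns) - toy_jiffies(enter_ns)) * TICK_NS;
}
```

In this model a 50 us interrupt that stays inside one jiffy, e.g.
toy_irq_charge_ns(200000, 250000), is charged nothing, while the same
50 us interrupt straddling a jiffy boundary, e.g.
toy_irq_charge_ns(980000, 1030000), is charged a full tick. That is the
"lucky enough" case: some irq time shows up, but only coarsely.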