From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754157AbcIAJtO (ORCPT ); Thu, 1 Sep 2016 05:49:14 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:43299 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751243AbcIAJtN (ORCPT ); Thu, 1 Sep 2016 05:49:13 -0400
Date: Thu, 1 Sep 2016 11:49:06 +0200
From: Peter Zijlstra
To: Stanislaw Gruszka
Cc: linux-kernel@vger.kernel.org, Giovanni Gherdovich, Linus Torvalds,
	Mel Gorman, Mike Galbraith, Paolo Bonzini, Rik van Riel,
	Thomas Gleixner, Wanpeng Li, Ingo Molnar
Subject: Re: [PATCH 1/3] sched/cputime: Improve scalability of times()/clock_gettime() on 32 bit cpus
Message-ID: <20160901094906.GP10153@twins.programming.kicks-ass.net>
References: <1472722064-7151-1-git-send-email-sgruszka@redhat.com>
	<1472722064-7151-2-git-send-email-sgruszka@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1472722064-7151-2-git-send-email-sgruszka@redhat.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 01, 2016 at 11:27:42AM +0200, Stanislaw Gruszka wrote:
> My previous commit:
>
>   a1eb1411b4e4 ("sched/cputime: Improve scalability by not accounting thread group tasks pending runtime")
>
> helped to achieve good performance of SYS_times() and
> SYS_clock_gettimes(CLOCK_PROCESS_CPUTIME_ID) on 64 bit architectures.
> However taking task_rq_lock() when reading t->se.sum_exec_runtime on
> 32 bit architectures still make those syscalls slow.
>
> The reason why we take the lock is to make 64bit sum_exec_runtime
> variable consistent. While a inconsistency scenario is very very unlike,
> I assume it still may happen at least on some 32 bit architectures.
>
> To protect the variable I introduced new seqcount lock. Performance
> improvements on machine with 32 cores (32-bit cpus) measured by
> benchmarks described in commit:

No,.. running 32bit kernels on a machine with 32 cores is insane, full
stop.

You're now making rather hot paths slower to benefit a rather slow path,
that too is backwards.

[ also, seqcount is not a lock ].

Really, people should not expect process wide numbers to be fast.
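
For readers following the thread, the sequence-count scheme under discussion can be sketched in userspace C11 roughly as below. This is not the code from the patch and does not use the kernel's seqcount API: the struct `runtime_counter` and the helpers `runtime_write()` and `runtime_read()` are hypothetical names made up for illustration. The 64-bit value is split into two 32-bit halves to mimic a 32-bit CPU where a plain 64-bit load could tear. The sketch assumes writers are already serialized by something else, which is why a sequence count on its own is not a lock.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit runtime counter on a 32-bit CPU. */
struct runtime_counter {
    atomic_uint seq;            /* even: stable, odd: update in progress */
    atomic_uint_least32_t lo;   /* low 32 bits of the value */
    atomic_uint_least32_t hi;   /* high 32 bits of the value */
};

/*
 * Writer side. Assumes only one writer runs at a time (serialized
 * externally); the sequence count itself excludes nobody.
 */
static void runtime_write(struct runtime_counter *c, uint64_t v)
{
    unsigned s = atomic_load_explicit(&c->seq, memory_order_relaxed);

    /* Make the count odd so readers know an update is in flight. */
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&c->lo, (uint32_t)v, memory_order_relaxed);
    atomic_store_explicit(&c->hi, (uint32_t)(v >> 32), memory_order_relaxed);

    /* Make the count even again, publishing both halves. */
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/*
 * Reader side. Takes no lock: reads both halves and retries if the
 * count was odd or changed while reading.
 */
static uint64_t runtime_read(struct runtime_counter *c)
{
    unsigned s1, s2;
    uint32_t lo, hi;

    do {
        s1 = atomic_load_explicit(&c->seq, memory_order_acquire);
        lo = atomic_load_explicit(&c->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&c->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((s1 & 1u) || s1 != s2);

    return ((uint64_t)hi << 32) | lo;
}
```

Note where the cost lands in this sketch: every update pays for two extra stores to the count plus ordering, while only the reader gets cheaper. That is the hot-path versus slow-path trade-off objected to above.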