From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757113AbbCSRVL (ORCPT ); Thu, 19 Mar 2015 13:21:11 -0400
Received: from g1t5425.austin.hp.com ([15.216.225.55]:36924 "EHLO
        g1t5425.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755579AbbCSRVF (ORCPT );
        Thu, 19 Mar 2015 13:21:05 -0400
Message-ID: <1426785661.2370.38.camel@j-VirtualBox>
Subject: Re: [PATCH v2] sched, timer: Use atomics for thread_group_cputimer to improve scalability
From: Jason Low
To: Linus Torvalds
Cc: Peter Zijlstra, Ingo Molnar, "Paul E. McKenney", Andrew Morton,
        Oleg Nesterov, Mike Galbraith, Frederic Weisbecker, Rik van Riel,
        Steven Rostedt, Scott Norton, Aswin Chandramouleeswaran,
        Linux Kernel Mailing List, jason.low2@hp.com
Date: Thu, 19 Mar 2015 10:21:01 -0700
In-Reply-To: <1425332984.5304.66.camel@j-VirtualBox>
References: <1425321731.5304.14.camel@j-VirtualBox>
        <1425332984.5304.66.camel@j-VirtualBox>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.2.3-0ubuntu6
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2015-03-02 at 13:49 -0800, Jason Low wrote:
> On Mon, 2015-03-02 at 11:03 -0800, Linus Torvalds wrote:
> > On Mon, Mar 2, 2015 at 10:42 AM, Jason Low wrote:
> > >
> > > This patch converts the timers to 64 bit atomic variables and use
> > > atomic add to update them without a lock. With this patch, the percent
> > > of total time spent updating thread group cputimer timers was reduced
> > > from 30% down to less than 1%.
> >
> > NAK.
> >
> > Not because I think this is wrong, but because somebody needs to look
> > at the effects on 32-bit architectures too.
>
> Okay, I will run some tests to see how this change affects the
> performance of itimers on 32 bit systems.

Hi Linus,

I tested this patch on a 32 bit ARM system with 4 cores.
Using the generic 64 bit atomics, I did not see any performance change with this patch, and the relevant functions (account_group_*_time(), etc.) don't show up in perf reports. One factor might be that locking/cacheline contention isn't as apparent on smaller systems to begin with. lib/atomic64.c also mentions that "this is expected to used on systems with small numbers of CPUs (<= 4 or so)".
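
For reference, here is a rough userspace sketch of the idea behind the generic 64 bit atomics: each 64 bit counter is guarded by one lock picked from a small hashed table, so an "atomic" add on a 32 bit system is really lock, add, unlock. This is not the code in lib/atomic64.c. The sketch_* names, the table size of 16, the hash, and the C11 spin loop are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed table size; small on purpose, since few CPUs are expected. */
#define SKETCH_NR_LOCKS 16

/* Hypothetical stand-in for a 64 bit atomic on a 32 bit system. */
typedef struct {
	int64_t counter;
} sketch_atomic64_t;

/* Zero means unlocked; static storage starts out zeroed. */
static atomic_int sketch_locks[SKETCH_NR_LOCKS];

/* Pick a lock by hashing the variable's address (illustrative hash). */
static inline atomic_int *sketch_lock_for(const sketch_atomic64_t *v)
{
	uintptr_t addr = (uintptr_t)v;

	addr >>= 6;				/* assume 64 byte cache lines */
	addr ^= (addr >> 8) ^ (addr >> 16);
	return &sketch_locks[addr & (SKETCH_NR_LOCKS - 1)];
}

/* Simple test-and-set spinlock built on C11 atomics. */
static inline void sketch_lock(atomic_int *l)
{
	while (atomic_exchange_explicit(l, 1, memory_order_acquire))
		;	/* spin until the previous value was 0 */
}

static inline void sketch_unlock(atomic_int *l)
{
	atomic_store_explicit(l, 0, memory_order_release);
}

/* The 64 bit add is a plain add done while holding the hashed lock. */
static inline void sketch_atomic64_add(int64_t delta, sketch_atomic64_t *v)
{
	atomic_int *l = sketch_lock_for(v);

	sketch_lock(l);
	v->counter += delta;
	sketch_unlock(l);
}

/* Reads take the same lock so a 64 bit value is never seen half-updated. */
static inline int64_t sketch_atomic64_read(sketch_atomic64_t *v)
{
	atomic_int *l = sketch_lock_for(v);
	int64_t val;

	sketch_lock(l);
	val = v->counter;
	sketch_unlock(l);
	return val;
}
```

With only a handful of CPUs, two updaters rarely hash to the same lock at the same time, which would be consistent with these functions not showing up in the perf reports on the 4 core system.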