Date: Mon, 20 Jun 2016 08:05:55 -0700
From: "Paul E. McKenney"
To: Thomas Gleixner
Cc: LKML, Ingo Molnar, Peter Zijlstra, Eric Dumazet, Frederic Weisbecker,
    Chris Mason, Arjan van de Ven, rt@linutronix.de, Rik van Riel,
    Linus Torvalds, George Spelvin, Len Brown
Subject: Re: [patch V2 00/20] timer: Refactor the timer wheel
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160617121134.417319325@linutronix.de>
In-Reply-To: <20160617121134.417319325@linutronix.de>
Message-Id: <20160620150555.GR3923@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 17, 2016 at 01:26:28PM -0000, Thomas Gleixner wrote:
> This is the second version of the timer wheel rework series.
> The first series can be found here:
>
>   http://lkml.kernel.org/r/20160613070440.950649741@linutronix.de
>
> The series is also available in git:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers

Ran some longer rcutorture tests, and the scripting complained about hangs.
This turned out to be due to the 12.5% uncertainty, so I fixed this by
switching the rcutorture stop-test timer to hrtimers. Things are now working
as well as before, with the exception of SRCU, for which I am getting lots
of grace-period stall complaints. This came as a bit of a surprise. Anyway,
I will be reviewing SRCU for timing dependencies.

							Thanx, Paul

> Changes vs. V1:
>
>   - Addressed the review comments of V1
>
>      - Fixed the fallout in tty/metag (noticed by Arjan)
>      - Renamed the hlist helper (noticed by Paolo/George)
>      - Used the proper mask in get_timer_base() (noticed by Richard)
>      - Fixed the inverse state check in internal_add_timer() (noticed by Richard)
>      - Simplified the macro maze, removed wrapper (noticed by George)
>      - Reordered data retrieval in run_timer() (noticed by George)
>
>   - Removed cascading completely
>
>     We have a hard cutoff of expiry times at the capacity of the last wheel
>     level now. Timers which insist on timeouts longer than that, i.e. ~6 days,
>     will expire at the cutoff, i.e. ~6 days. From our data gathering the
>     largest timeouts are 5 days (networking conntrack), which is well within
>     the capacity.
>
>     To achieve this capacity with HZ=1000 without increasing the storage size
>     by another level, we reduced the granularity of the first wheel level from
>     1ms to 4ms. According to our data, there is no user which relies on that
>     1ms granularity and 99% of those timers are canceled before expiry.
>
>     As a side effect there is the benefit of better batching in the first level
>     which helps networking to avoid rearming timers in the hotpath.
>
> We gathered more data about performance and batching.
> Compared to mainline the following changes have been observed:
>
>    - The bad outliers in mainline when the timer wheel needs to be forwarded
>      after a long idle sleep are completely gone.
>
>    - The total cpu time used for timer softirq processing is significantly
>      reduced. Depending on the HZ setting and workload this ranges from a
>      factor of 2 to 6.
>
>    - The average invocation period of the timer softirq on an idle system
>      increases significantly. Depending on the HZ setting and workload this
>      ranges from a factor of 1.5 to 5. That means that the residency in deep
>      c-states should be improved. Have not yet had time to verify this with
>      the power tools.
>
> Thanks,
>
>	tglx
>
> ---
>  arch/x86/kernel/apic/x2apic_uv_x.c  |    4
>  arch/x86/kernel/cpu/mcheck/mce.c    |    4
>  block/genhd.c                       |    5
>  drivers/cpufreq/powernv-cpufreq.c   |    5
>  drivers/mmc/host/jz4740_mmc.c       |    2
>  drivers/net/ethernet/tile/tilepro.c |    4
>  drivers/power/bq27xxx_battery.c     |    5
>  drivers/tty/metag_da.c              |    4
>  drivers/tty/mips_ejtag_fdc.c        |    4
>  drivers/usb/host/ohci-hcd.c         |    1
>  drivers/usb/host/xhci.c             |    2
>  include/linux/list.h                |   10
>  include/linux/timer.h               |   30
>  kernel/time/tick-internal.h         |    1
>  kernel/time/tick-sched.c            |   46 -
>  kernel/time/timer.c                 | 1099 +++++++++++++++++++++---------------
>  lib/random32.c                      |    1
>  net/ipv4/inet_connection_sock.c     |    7
>  net/ipv4/inet_timewait_sock.c       |    5
>  19 files changed, 725 insertions(+), 514 deletions(-)
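
For reference, the three numbers in this thread (the 12.5% uncertainty, the
4ms first level at HZ=1000, and the ~6 day cutoff) can be tied together with
a little arithmetic. The sketch below is an illustration only: the geometry
it uses (8 levels, 64 buckets per level, each level 8 times coarser than the
one below) is an assumption and is not stated anywhere in the mail above, and
all function and constant names are hypothetical.

```python
# Assumed wheel geometry (NOT stated in the email): 8 levels, 64 buckets
# per level, each level 8x coarser than the one below it.
LEVELS = 8
BUCKETS_PER_LEVEL = 64
LEVEL_SHIFT = 3          # granularity multiplies by 2**3 = 8 per level
BASE_GRANULARITY_MS = 4  # first-level granularity at HZ=1000, per the email


def level_table(levels=LEVELS, buckets=BUCKETS_PER_LEVEL,
                shift=LEVEL_SHIFT, base_ms=BASE_GRANULARITY_MS):
    """Return (level, granularity_ms, range_start_ms, range_end_ms) rows."""
    rows = []
    for level in range(levels):
        gran = base_ms << (shift * level)
        # Level 0 starts at 0; each higher level starts where the
        # previous level's range ends.
        if level == 0:
            start = 0
        else:
            start = buckets * (base_ms << (shift * (level - 1)))
        end = buckets * gran  # exclusive upper bound of this level
        rows.append((level, gran, start, end))
    return rows


def worst_case_uncertainty(rows):
    """Largest granularity/timeout ratio over the levels above the first."""
    return max(gran / start for _, gran, start, _ in rows if start > 0)


def capacity_days(rows):
    """Upper bound of the last level, converted from ms to days."""
    return rows[-1][3] / (1000 * 60 * 60 * 24)


if __name__ == "__main__":
    table = level_table()
    for level, gran, start, end in table:
        print(f"level {level}: granularity {gran} ms, "
              f"range {start}..{end - 1} ms")
    print(f"worst-case uncertainty: {worst_case_uncertainty(table):.1%}")
    print(f"capacity: {capacity_days(table):.2f} days")
```

Under these assumed parameters, a timeout that just crosses into a higher
level sits at 8 times that level's granularity, which gives the 1/8 = 12.5%
worst case, and the last level ends at roughly 6.2 days. A test timer that
cannot tolerate that slack is a candidate for hrtimers, as was done for the
rcutorture stop-test timer.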