From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S935776AbdJRHsi (ORCPT ); Wed, 18 Oct 2017 03:48:38 -0400
Received: from LGEAMRELO13.lge.com ([156.147.23.53]:51343 "EHLO
	lgeamrelo13.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932533AbdJRHsg (ORCPT ); Wed, 18 Oct 2017 03:48:36 -0400
X-Original-SENDERIP: 156.147.1.151
X-Original-MAILFROM: byungchul.park@lge.com
X-Original-SENDERIP: 10.177.222.33
X-Original-MAILFROM: byungchul.park@lge.com
Date: Wed, 18 Oct 2017 16:48:25 +0900
From: Byungchul Park
To: Thomas Gleixner
Cc: Ingo Molnar , johan@kernel.org, arnd@arndb.de,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	peterz@infradead.org, hpa@zytor.com, tony@atomide.com,
	linux-tip-commits@vger.kernel.org, kernel-team@lge.com
Subject: Re: [tip:locking/urgent] locking/lockdep: Disable cross-release features for now
Message-ID: <20171018074825.GC32368@X58A-UD3R>
References: <20171014072659.f2yr6mhm5ha3eou7@gmail.com>
	<20171016020447.GP3323@X58A-UD3R>
	<20171017071202.6x22ho2o5yz74dak@gmail.com>
	<20171017144230.dwrrxnpseo7tv6rp@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 17, 2017 at 05:03:40PM +0200, Thomas Gleixner wrote:
> On Tue, 17 Oct 2017, Ingo Molnar wrote:
> > * Thomas Gleixner wrote:
> > > On Tue, 17 Oct 2017, Ingo Molnar wrote:
> > > > No, please fix performance.
> > >
> > > You know very well that with the cross release stuff we have to take the
> > > performance hit of stack unwinding because we have no idea whether there
> > > will show up a new lock relation later or not. And there is not much you
> > > can do in that respect.
> > >
> > > OTOH, the cross release feature unearthed real deadlocks already so it is a
> > > valuable debug feature and having an explicit config switch which defaults
> > > to N is well worth it.
> >
> > I disagree, because even if that's correct, the choices are not binary. The
> > performance regression was a slowdown of around 7x: lockdep boot overhead on that
> > particular system went from +3 seconds to +21 seconds...
>
> Hmm, I might have missed something, but what I've seen in this thread is:
>
> > > > Boot time (from "Linux version" to login prompt) had in fact doubled
> > > > since 4.13 where it took 17 seconds (with my current config) compared to
> > > > the 35 seconds I now see with 4.14-rc4.
>
> So that's 2x not 7x. On one of my main test machines it's about ~1.4 so I
> did not even really notice until this thread came up. Probably I have no
> expectations on boot time and performance when lockdep is on :)
>
> > As a response to the performance regression I haven't seen _any_ attempt to
> > measure, profile and generally quantify the performance impact, which would at
> > least make it more believable that the overhead cannot be reduced. That really
> > makes me worry about the code on a higher level than just whether it can be
> > enabled by default or not.
>
> I did some quick perf top analysis, not in detail though, and what really
> dominates with that feature is the unwinder, which needs to be
> unconditional due to the nature of the problem.
>
> I have not spent a huge amount of time to think about ways to improve that,
> but I could not come up with anything smart so far.
>
> The only thing I thought about was making the unwind short and only record
> one or two call levels (if at all) instead of following the full call

Yes, I think that's the best option I can do. Thank you very much.

> chain. That makes it less useful for a quick test, but once you hit a splat
> you can enable full depth recording for full analysis.
> In the full analysis case performance is the least of your worries.
>
> > Caring about the performance of debug features very much matters, _especially_
> > when they are expensive.
>
> I'm not disagreeing. I'm just trying to understand why this is marked
> BROKEN where I think it should be marked TOO_EXPENSIVE.
>
> Thanks,
>
> 	tglx
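
The short-unwind idea quoted above (record only one or two call levels by
default, full depth only when analysing a splat) can be illustrated with a
small user-space sketch. This is not lockdep code and says nothing about how
the kernel unwinder would be changed; record_trace(), max_depth and the
outer/middle/inner helpers are hypothetical names made up for this example,
and Python's standard traceback module merely stands in for a stack unwinder.

```python
import traceback


def record_trace(max_depth=None):
    """Return the names of the calling functions, outermost first.

    max_depth=None walks the full call chain (the expensive case).
    A small integer keeps only that many innermost callers.
    Hypothetical helper for illustration only, not a lockdep API.
    """
    # extract_stack(limit=n) keeps the n innermost frames, and the last
    # entry is record_trace itself, so ask for one extra frame and drop it.
    limit = None if max_depth is None else max_depth + 1
    frames = traceback.extract_stack(limit=limit)
    return [frame.name for frame in frames[:-1]]


def inner(depth):
    return record_trace(depth)


def middle(depth):
    return inner(depth)


def outer(depth):
    return middle(depth)


if __name__ == "__main__":
    # Cheap default: only the two innermost callers are recorded.
    print(outer(2))
    # Full-depth recording, as one might enable after hitting a splat.
    print(outer(None))
```

With a depth of 2 only the two nearest callers are kept, which bounds the
cost per recorded event at the price of less context in the first report.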