From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: [PATCH 3/7] xen: rework locking for dump of scheduler info (debug-key r)
Date: Tue, 17 Mar 2015 12:01:40 +0000
Message-ID: <550817A4.2080805@eu.citrix.com>
References: <20150316165642.10279.86684.stgit@Solace.station>
 <20150316170509.10279.79362.stgit@Solace.station>
 <55081607020000780006AAAE@mail.emea.novell.com>
 <55080A71.1070705@eu.citrix.com>
 <55081D29020000780006AB3C@mail.emea.novell.com>
 <550810B2.4070801@eu.citrix.com>
 <55082168020000780006AB85@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <55082168020000780006AB85@mail.emea.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Keir Fraser, Dario Faggioli, Meng Xu, Xen-devel
List-Id: xen-devel@lists.xenproject.org

On 03/17/2015 11:43 AM, Jan Beulich wrote:
>>>> On 17.03.15 at 12:32, wrote:
>> On 03/17/2015 11:25 AM, Jan Beulich wrote:
>>>>>> On 17.03.15 at 12:05, wrote:
>>>> On 03/17/2015 10:54 AM, Jan Beulich wrote:
>>>>> Finally, as said in different contexts earlier, I think unconditionally
>>>>> acquiring locks in dumping routines isn't the best practice. At least
>>>>> in non-debug builds I think these should be try-locks only, skipping
>>>>> the dumping when a lock is busy.
>>>>
>>>> You mean so that we don't block the console if there turns out to be a
>>>> deadlock?
>>>
>>> For example. And also to not unduly get in the way of an otherwise
>>> extremely busy system.
>>
>> I don't understand this last argument. If you're using the debug keys,
>> you want to know about the state of the system. I would much rather my
>> system ran 25% slower for the 5 seconds the debug key was dumping
>> information, and have a complete snapshot of the system, than for it to
>> only run 10% slower and to have half the information missing. The
>> upshot of missing information would likely be that I have to press the
>> debug key 3-4 times in a row, meaning I'd be running 10% slower for 20
>> seconds rather than 25% slower for 5 seconds.
>
> Yes, I understand this, and in many cases this is the perspective to
> take. Yet I've been in the situation where suggesting the use of
> debug keys to learn something about a (partially) live locked system
> would have had the risk of causing further corruption to it, and
> hence a more careful state dumping approach would have been
> desirable.
>
>> All in all, I don't think the performance of the debug keys should be a
>> major concern. The only thing I'd be worried about is making the system
>> as diagnosable as possible if things have already gone pear-shaped
>> (e.g., if there's a deadlock).
>
> It's not their performance that's of concern, but the effect they
> may have on the performance (or even correctness - see how
> many process_pending_softirqs() calls we had to sprinkle around
> over the years) of other code.

So it sounds like maybe we're actually on the same page, but are using
words slightly differently. :-)

It sounds like we agree that the ability to tread carefully on a system
which may be having trouble, in order not to make it worse, is
important. For instance, not wedging the serial console behind a
deadlocked lock, and not further corrupting a system that had gotten
itself wedged in livelock. Those are things I would classify under
"correctness" and/or "diagnosis".

When I say "performance is not a concern", I mean "it does not concern
me that someone's web page loads 25% slower for the five seconds it
takes to dump the information". If delaying other parts of the system
causes the system to get wedged or crash, that's obviously a problem.

 -George
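
For illustration, the try-lock approach suggested in the quoted text (dump
only when the lock can be taken at once, and skip the dump otherwise) might
look roughly like the sketch below. Everything in it is an assumption made
for the example: struct sched_state, state_trylock(), state_unlock() and
dump_sched_state() are hypothetical names, and a C11 atomic_flag stands in
for a hypervisor spinlock. It is not the code from the patch under
discussion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical scheduler state; a stand-in, not one of Xen's structures. */
struct sched_state {
    atomic_flag lock;   /* clear = free, set = held */
    int nr_runnable;    /* example datum protected by the lock */
};

/* Try once to take the lock; never spins or blocks. */
static bool state_trylock(struct sched_state *s)
{
    /* test-and-set returns the previous value: false means we now own it. */
    return !atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire);
}

static void state_unlock(struct sched_state *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/*
 * Dump the state if the lock is free, otherwise skip the dump.
 * Returns true if the state was printed, false if it was skipped.
 */
static bool dump_sched_state(struct sched_state *s, FILE *out)
{
    assert(s != NULL && out != NULL);

    if (!state_trylock(s)) {
        fprintf(out, "scheduler lock busy, skipping dump\n");
        return false;
    }

    fprintf(out, "runnable vcpus: %d\n", s->nr_runnable);
    state_unlock(s);
    return true;
}
```

A dump routine written this way cannot wedge the console behind a
deadlocked lock, at the cost described earlier in the thread: when the lock
is busy the output is incomplete and the key may have to be pressed again.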