From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Fri, 4 Nov 2011 13:21:02 -0400
Subject: [Cluster-devel] GFS2: glock statistics gathering (RFC)
In-Reply-To: <1320425851.2732.82.camel@menhir>
References: <1320419989.2732.60.camel@menhir> <20111104163152.GA15232@redhat.com> <1320425851.2732.82.camel@menhir>
Message-ID: <20111104172102.GC15232@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Fri, Nov 04, 2011 at 04:57:31PM +0000, Steven Whitehouse wrote:
> Hi,
>
> On Fri, 2011-11-04 at 12:31 -0400, David Teigland wrote:
> > On Fri, Nov 04, 2011 at 03:19:49PM +0000, Steven Whitehouse wrote:
> > > The three pairs of mean/variance measure the following
> > > things:
> > >
> > > 1. DLM lock time (non-blocking requests)
> >
> > You don't need to track and save this value, because all results will be
> > one of three values which you can gather once:
> >
> > short: the dir node and master node are local: 0 network round trips
> > medium: one is local, one is remote: 1 network round trip
> > long: both are remote: 2 network round trips
> >
> > Once you've measured values for short/med/long, then you're done.
> > The distribution will depend on the usage pattern.
> >
> The reason for tracking this is to be able to compare it with the
> blocking request value to (I hope) get a rough idea of the difference
> between the two, which may indicate contention on the lock. So this
> is really a "baseline" measurement.
>
> Plus we do need to measure it, since it will vary according to a
> number of things, such as what hardware is in use.

Right, but the baseline shouldn't change once you have it.

> > > 2. To spot performance issues more easily
> >
> > Apart from contention, I'm not sure there are many perf issues that dlm
> > measurements would help with.
> >
> That is the #1 cause of reported performance issues, so top of our list
> to work on. The goal is to make it easier to track down the source of
> these kinds of problems.

I still think that time averages and computations sound like a difficult
and indirect way of measuring contention... but see how it works.

Dave
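
For illustration only, here is a rough sketch of the two ideas discussed in this thread: classifying a non-blocking request as short/medium/long by counting network round trips, and comparing blocking request times against a once-measured baseline. It is written in Python for brevity and is not kernel code. The function names and the numbers in the usage lines are hypothetical, and nothing here is part of the GFS2 or DLM interfaces.

```python
from statistics import mean

# Latency classes from the thread, keyed by number of network round trips.
CLASS_NAMES = {0: "short", 1: "medium", 2: "long"}


def round_trips(dir_is_local: bool, master_is_local: bool) -> int:
    """Count round trips: one for each of the dir node and master node that is remote."""
    return (0 if dir_is_local else 1) + (0 if master_is_local else 1)


def latency_class(dir_is_local: bool, master_is_local: bool) -> str:
    """Map a lock's dir/master placement to short, medium or long."""
    return CLASS_NAMES[round_trips(dir_is_local, master_is_local)]


def contention_ratio(blocking_times, baseline: float) -> float:
    """Mean blocking-request time divided by the baseline for that class.

    A ratio well above 1 may hint at contention; any threshold would be
    a tuning choice and is not suggested by the thread.
    """
    return mean(blocking_times) / baseline


# Hypothetical usage: times are in arbitrary units, invented for the example.
print(latency_class(True, False))
print(contention_ratio([2.0, 4.0], baseline=1.5))
```

The baseline is passed in as a fixed value here to reflect the point that it only needs measuring once per class on given hardware, while the blocking times are what would be sampled continuously.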