From: Dan Magenheimer
Subject: RE: Scheduling anomaly with 4.0.0 (rc6)
Date: Mon, 5 Apr 2010 13:17:34 -0700 (PDT)
To: George Dunlap
Cc: xen-devel@lists.xensource.com

Thanks for the reply!

Well, I'm now seeing something a little more alarming: running an
identical but CPU-overcommitted workload (just normal PV domains, no
tmem or ballooning or anything), what would you expect the variance to
be between successive identical measured runs on identical hardware?
I am seeing total runtimes, both measured by elapsed time and by the
sum of CPU seconds across all domains (including dom0), vary by 6-7%
or more.  That seems unusually high to me, and it makes it very hard
to measure improvements (e.g. by tmem, for an upcoming Xen summit
presentation) or to benchmark anything complex.

> Is it possible that Linux is just favoring one vcpu over the other for
> some reason? Did you try running the same test but with only one VM?

Well, "make -j8" will likely be single-threaded part of the time, but
I wouldn't expect that to make that big a difference between two
identical workloads.  I'm also not sure how I would run the same test
with only one VM, since observing the strangeness requires two VMs
(and even then it has to be caught at random points during execution).

> Another theory would be that most interrupts are delivered to vcpu 0,
> so it may end up in "boost" priority more often.

Hmmm... I'm not sure I get that, but what about _physical_ cpu 0 for
Xen?  If all physical CPUs are not the same, and one VM has an affinity
for vcpu0-on-pcpu0 while the other has an affinity for vcpu1-on-pcpu0,
would that make a difference?  But still, 40% seems very large and
almost certainly a bug, especially given the new observations above.

> -----Original Message-----
> From: George Dunlap [mailto:George.Dunlap@eu.citrix.com]
> Sent: Monday, April 05, 2010 8:44 AM
> To: Dan Magenheimer
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Scheduling anomaly with 4.0.0 (rc6)
>
> Is it possible that Linux is just favoring one vcpu over the other for
> some reason? Did you try running the same test but with only one VM?
>
> Another theory would be that most interrupts are delivered to vcpu 0,
> so it may end up in "boost" priority more often.
>
> I'll re-post the credit2 series shortly; Keir said he'd accept it
> post-4.0.  You could try it with that and see what the performance is
> like.
>
> -George
>
> On Fri, Apr 2, 2010 at 5:48 PM, Dan Magenheimer
> wrote:
> > I've been running some heavy testing on a recent Xen 4.0
> > snapshot and seeing a strange scheduling anomaly that
> > I thought I should report.  I don't know if this is
> > a regression... I suspect not.
> >
> > System is a Core 2 Duo (Conroe).  Load is four 2-VCPU
> > EL5u4 guests, two of which are 64-bit and two of which
> > are 32-bit.  Otherwise they are identical.  All four
> > are running a sequence of three Linux compiles with
> > (make -j8 clean; make -j8).
> > All are started approximately
> > concurrently: I synchronize the start of the test after
> > all domains are launched with an external NFS semaphore
> > file that is checked every 30 seconds.
> >
> > What I am seeing is a rather large discrepancy in the
> > amount of time consumed "underway" by the four domains
> > as reported by xentop and xm list.  I have seen this
> > repeatedly, but the numbers in front of me right now are:
> >
> > 1191s dom0
> > 3182s 64-bit #1
> > 2577s 64-bit #2 <-- 20% less!
> > 4316s 32-bit #1
> > 2667s 32-bit #2 <-- 40% less!
> >
> > Again, these are identical workloads, and the pairs
> > are identical released kernels running from identical
> > "file"-based virtual block devices containing released
> > distros.  Much of my testing had been with tmem and
> > self-ballooning, so I had blamed them for a while,
> > but I have reproduced it multiple times with both
> > of those turned off.
> >
> > At the start and after each kernel compile, I record
> > a timestamp, so I know the same work is being done.
> > Eventually the workload finishes on each domain and
> > intentionally crashes the kernel so measurement is
> > stopped.  At the conclusion, the 64-bit pair have
> > very similar total CPU seconds and the 32-bit pair have
> > very similar total CPU seconds, so eventually (presumably
> > when the #1's are done hogging CPU) the "slower"
> > domains do finish the same amount of work.  As a
> > result, it is hard to tell from just the final
> > results that the four domains are getting scheduled
> > at very different rates.
> >
> > Does this seem like a scheduler problem, or are there
> > other explanations?  Anybody care to try to reproduce it?
> > Unfortunately, I have to use the machine now for other
> > work.
> >
> > P.S. According to xentop, there is almost no network
> > activity, so it is all CPU and VBD.  And the ratio
> > of VBD activity looks to be approximately the same
> > ratio as CPU(sec).
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
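
For concreteness, the per-guest start gate and per-compile timestamps
described in the quoted message could look roughly like the sketch
below.  This is only an illustration: the NFS mount point, flag-file
name, and log path are placeholders, not details taken from the thread,
and it assumes it is run from inside a kernel source tree.

    #!/bin/sh
    # Wait for the shared NFS "semaphore" file, polling every 30 seconds,
    # then run the three (make -j8 clean; make -j8) compile passes,
    # recording a timestamp at the start and after each compile.
    while [ ! -e /mnt/nfs/start-now ]; do
        sleep 30
    done
    date >> /root/timestamps.log
    for i in 1 2 3; do
        make -j8 clean
        make -j8
        date >> /root/timestamps.log
    done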
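
Likewise, a minimal sketch of how the "underway" per-domain CPU time
(the figures xentop and xm list report) could be sampled from dom0, so
the mid-run skew is visible before the final totals converge; the
sampling interval and log file here are arbitrary choices:

    #!/bin/sh
    # Append the "Time(s)" column of xm list to a log once a minute;
    # diffing successive samples shows which domains are getting CPU.
    while true; do
        date >> /var/log/domtime.log
        xm list >> /var/log/domtime.log
        sleep 60
    done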
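
Finally, a sketch of one way the vcpu0-on-pcpu0 affinity question
raised in the reply could be probed with xm vcpu-pin; the domain names
guest-a and guest-b are placeholders, and pcpus 0 and 1 are the two
Conroe cores:

    # Give guest-a the vcpu0-on-pcpu0 pattern and guest-b the opposite,
    # then re-run the workload and see whether the fast/slow split
    # follows the pinning rather than the guest.
    xm vcpu-pin guest-a 0 0
    xm vcpu-pin guest-a 1 1
    xm vcpu-pin guest-b 0 1
    xm vcpu-pin guest-b 1 0
    xm vcpu-list        # confirm the resulting affinities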