From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: Xenalyze?
Date: Fri, 09 Jul 2010 12:04:29 +0100
Message-ID: <4C37023D.8010201@eu.citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Thomas Graves
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

OK, I unified all of the p->current checks, so all of them only issue a
warning and skip that record. Try it now.

On my to-do list is to add a Lamport clock to the runstate change trace
records. Hopefully that will solve the intractable TSC drift problems
once and for all.

 -George

On 08/07/10 23:08, Thomas Graves wrote:
> Thanks for the updates. It ran a lot longer than before but it still
> ended up failing. I'll try truncating the file and a few other things.
> Let me know if you have any other ideas.
>
>
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate blocked.
> Possible tsc skew.
> runstate_change old_runstate blocked, d1402v0 runstate runnable.
> Possible tsc skew.
> runstate_change old_runstate blocked, d1402v0 runstate runnable.
> Possible tsc skew.
> runstate_change old_runstate runnable, d1402v0 runstate running.
> Possible tsc skew.
> Not updating.
> FATAL: p->current null
> ] 20f101(20:f:101) 3 [ 802061ea ffffffff f ]
>
>
> Tom
>
>
> On 7/8/10 11:36 AM, "George Dunlap" wrote:
>
> I had a work-around for the problem in a local patch-queue somewhere.
> I've pushed it (along with a bunch of local stuff I had lying around)
> -- do a pull and let me know if it works better.
>
> -George
>
> On Thu, Jul 8, 2010 at 3:46 PM, Thomas Graves wrote:
> >
> > -bash-3.2$ hg id
> > 503e0902a86a+ tip
> > -bash-3.2$ hg parents
> > changeset: 49:503e0902a86a
> > tag: tip
> > user: George Dunlap
> > date: Tue Jun 22 17:11:51 2010 +0100
> > summary: More xenalyze type fixes
> >
> > I'm using a clone of http://xenbits.xensource.com/ext/xenalyze.hg
> > and then patched with the patch -p1 < back-patches/3.4.diff and
> > make on rhel5.4.
> >
> > Let me know if you
> >
> > Thanks,
> > Tom
> >
> >
> > On 7/8/10 9:24 AM, "George Dunlap" wrote:
> >
> > The file length itself probably isn't that important, but rather the
> > fact that longer trace files increase the opportunity for certain
> > kinds of probabilistic problematic events to occur.
> >
> > The problem here looks like a problem with TSC skew -- xenalyze is
> > having trouble figuring out how to process the records in the right
> > order because of drift in the TSC value across cores (that's what the
> > "Possible tsc skew" messages are about), and ends up breaking an
> > assumption because it's failing (hence the "FATAL: p->current = NULL"
> > message).
> >
> > Can you give me the cs of the tip of your hg tree? I'll take a look
> > and see if I have a local fix.
> >
> > -George
> >
> > On Thu, Jul 8, 2010 at 3:09 PM, Thomas Graves wrote:
> > > Hello,
> > >
> > > I'm new to using xentrace and xenalyze and I am having problems
> > > running xenalyze on a large trace file. It is always giving me a
> > > fatal error. If I run it on like a 30 second trace it seems to
> > > work fine.
> > >
> > > Is this a known issue or am I possibly doing something wrong? Do
> > > you think it would work if I truncate the file or would it be
> > > missing stuff xenalyze expects? If there is no way to truncate it
> > > perhaps I'll see if I can modify it to only show me certain time
> > > frame - I haven't looked at the code yet so I guess I'll have to
> > > see if that is possible.
> > >
> > > I'm using xen3.4.3 with rhel5.4 dom0 running a rhel5.4 vm.
> > >
> > > I'm trying to debug a vm hang at boot which sporadically occurs so
> > > I just have trace running while I do a bunch of creates and deletes
> > > so the trace file gets fairly large. If you have other ideas what
> > > might work better I would be interested in hearing them.
> > >
> > >
> > > -------
> > > -bash-3.2$ ls -la trace.raw
> > > -rw-r--r-- 1 root root 13238044416 Jul 7 23:02 trace.raw
> > > -bash-3.2$ xenalyze/xenalyze --cpu-hz=2.43G --summary trace.raw > out
> > >
> > > -------
> > > ..
> > > ..
> > > ..
> > > ..
> > > runstate_change old_runstate blocked, d1402v0 runstate runnable.
> > > Possible tsc skew.
> > > runstate_change old_runstate blocked, d1402v0 runstate runnable.
> > > Possible tsc skew.
> > > runstate_change old_runstate runnable, d1402v0 runstate running.
> > > Possible tsc skew.
> > > Not updating.
> > > FATAL: p->current null
> > > ] 20f101(20:f:101) 3 [ 802061ea ffffffff f ]
> > > -----
> > >
> > >
> > > Any help is appreciated,
> > > Thanks,
> > > Tom Graves
> > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xensource.com
> > > http://lists.xensource.com/xen-devel
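
The Lamport-clock idea mentioned at the top of this thread can be
illustrated with a small sketch. This is not xenalyze or Xen code: the
Record fields, the Tracer class, the skew values and the consistent()
check are all hypothetical, made up only to show why ordering runstate
records by a logical counter is immune to per-CPU TSC offsets while
ordering them by TSC is not.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical runstate-change record (not the real xentrace layout)."""
    cpu: int
    tsc: int       # per-CPU timestamp, possibly skewed
    lamport: int   # logical clock value stamped on the record
    vcpu: str
    old: str
    new: str


class Tracer:
    """Toy tracer: one Lamport counter per physical CPU and per vcpu."""

    def __init__(self, skew):
        self.skew = skew                         # cpu -> assumed TSC offset
        self.cpu_clock = {cpu: 0 for cpu in skew}
        self.vcpu_clock = {}
        self.records = []

    def runstate_change(self, cpu, true_time, vcpu, old, new):
        # Lamport rule: take the max of the clocks involved, then add one.
        stamp = max(self.cpu_clock[cpu], self.vcpu_clock.get(vcpu, 0)) + 1
        self.cpu_clock[cpu] = stamp
        self.vcpu_clock[vcpu] = stamp
        tsc = true_time + self.skew[cpu]
        self.records.append(Record(cpu, tsc, stamp, vcpu, old, new))


def consistent(records):
    """True if every record's old state equals the previous record's new state."""
    return all(prev.new == cur.old for prev, cur in zip(records, records[1:]))


# Assumption for the demo: cpu1's TSC lags cpu0's by 50 ticks.
tracer = Tracer(skew={0: 0, 1: -50})
tracer.runstate_change(0, 100, "d1402v0", "running", "blocked")
tracer.runstate_change(1, 110, "d1402v0", "blocked", "runnable")
tracer.runstate_change(1, 120, "d1402v0", "runnable", "running")
tracer.runstate_change(0, 130, "d1402v0", "running", "runnable")

by_tsc = sorted(tracer.records, key=lambda r: r.tsc)
by_lamport = sorted(tracer.records, key=lambda r: r.lamport)

print("ordered by tsc consistent:    ", consistent(by_tsc))
print("ordered by lamport consistent:", consistent(by_lamport))
```

In this toy run the TSC ordering places cpu1's records before cpu0's
first record, so the old/new runstates no longer chain together, which
is the kind of mismatch the "Possible tsc skew" warnings report. The
Lamport ordering recovers the sequence in which the changes were
issued, because the counter travels with the vcpu rather than with any
one CPU's clock.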