From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: Xen4.2 S3 regression?
Date: Fri, 21 Sep 2012 19:42:24 +0100
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ben Guthro, Jan Beulich
Cc: Konrad Rzeszutek Wilk, john.baboval@citrix.com, Thomas Goetz, xen-devel
List-Id: xen-devel@lists.xenproject.org

On 21/09/2012 19:20, "Ben Guthro" wrote:

> On Fri, Sep 21, 2012 at 2:47 AM, Jan Beulich wrote:
>>
>> That's because CPU1 is stuck in cpu_init() (in the infinite loop after
>> printing "CPU#1 already initialized!"), as Keir pointed out yesterday.
>>
>
> I've done some more tracing on this, and instrumented cpu_init(), cpu_uninit()
> - and found something I cannot quite explain.
> I was most interested in the cpu_initialized mask, set just above these two
> functions (and only used in those two functions)
>
> I convert cpu_initialized to a string, using cpumask_scnprintf - and print it
> out when it is read, or written in these two functions.
>
> When CPU1 is being torn down, the cpumask bit gets cleared for CPU1, and I am
> able to print this to the console to verify.
> However, when the machine is returning from S3, and going through cpu_init -
> the bit is set again.
>
> Could this be an issue of caches not being flushed?
>
> I see that the last thing done before acpi_enter_sleep_state actually
> writes PM1A_CONTROL / PM1B_CONTROL to enter S3 is a ACPI_FLUSH_CPU_CACHE()
>
> This analysis seems unlikely, at this point...but I'm not sure what to make of
> the data other than a cache issue.
>
> Am I "barking up the wrong tree" here?

Perhaps not. Try dumping it immediately before and after the actual S3 sleep.
Since you probably can't print to the serial line at that point, you could just
take a copy of the bitmap at each point and print both copies shortly after S3
resume. Then if it still looks bad, or the problem magically resolves with the
extra printing, you can suspect cache flush a bit more strongly. However, WBINVD
(which is what ACPI_FLUSH_CPU_CACHE() is) should be enough.

 -- Keir
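
A minimal sketch of the snapshot-and-print-later idea above, written as plain C rather than as a hypervisor patch. Every name here (cpu_initialized_sketch, snapshot_before_s3(), snapshot_after_s3(), report_s3_snapshots()) is hypothetical, and a single unsigned long stands in for the real cpumask; in Xen the copy and the printing would presumably go through the cpumask helpers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the hypervisor's cpu_initialized mask. */
unsigned long cpu_initialized_sketch;

/* Copies taken around the sleep, while the console cannot be used. */
unsigned long mask_before_s3;
unsigned long mask_after_s3;

/* Call as late as possible before actually entering S3. */
void snapshot_before_s3(void)
{
    mask_before_s3 = cpu_initialized_sketch;
}

/* Call as early as possible after waking up. */
void snapshot_after_s3(void)
{
    mask_after_s3 = cpu_initialized_sketch;
}

/*
 * Call once printing works again. Formats both copies into buf and
 * returns how many bits differ between them.
 */
int report_s3_snapshots(char *buf, size_t len)
{
    unsigned long diff = mask_before_s3 ^ mask_after_s3;
    int changed = 0;

    assert(buf != NULL && len > 0);

    /* Count the set bits in the XOR of the two snapshots. */
    for (; diff != 0; diff >>= 1)
        changed += (int)(diff & 1UL);

    snprintf(buf, len,
             "cpu_initialized before S3: 0x%lx, after S3: 0x%lx, changed bits: %d",
             mask_before_s3, mask_after_s3, changed);
    return changed;
}
```

In the real code the two snapshot calls would sit on either side of the acpi_enter_sleep_state() call mentioned above, with the report printed once the console is back. That placement is an assumption, not something checked against the tree.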