From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ben Guthro Subject: Re: [PATCH] Introduce an s3 test Date: Wed, 1 May 2013 07:58:21 -0400 Message-ID: <5181035D.1030407@citrix.com> References: <1366997193-4991-1-git-send-email-benjamin.guthro@citrix.com> <20864.62687.527969.563181@mariner.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20864.62687.527969.563181@mariner.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Jackson Cc: "xen-devel@lists.xen.org" , Ian Campbell , Ben Guthro List-Id: xen-devel@lists.xenproject.org On 05/01/2013 06:56 AM, Ian Jackson wrote: > Ben Guthro writes ("[PATCH] Introduce an s3 test"): >> From: root >> >> This test attempts to have an initial pass at introducing a test to catch regressions in S3. >> It currently just suspends for N seconds, and checks xl dmesg for a partiular message printed >> when S3 is complete. > > Thanks. Most of this looks plausible. I have some comments: > >> +# Put the machine to sleep >> +target_cmd_root($ho, "pm-suspend"); >> + >> +# Give the machine some time to go to sleep. >> +sleep (5 + $timeout); >> + >> +# check log for resume message >> +poll_loop(4*$timeout, 2, 's3-confirm-resumed', >> + target_cmd_output($ho,"xl dmesg | grep 'ACPI S' | tail -1 | " . >> + "grep -n 'Finishing wakeup from S3 state'")); > > Why does this need a poll loop ? Surely after the machine comes out > of suspend it should be up right away ? This is a bit of a "first pass" in a test environment I've never used before. I modeled this after other tests I found in the same dir. If this is inappropriate, then I suspect you are correct. I put it in the loop for the case of networking taking some time to come back online, so if the ssh command failed it would be retried. Additionally, I have found that the RTC wakeup mechanism is not very accurate in its timing. > >> +# TODO: >> +# - Check pcpu state >> +# - Affinity has been restored >> +# - C-states are not lost >> +# - CPU pools are all correct > > We don't do any cpu affinity testing at all right now. Leaving > this as a TODO here is fine. > >> +# - Check timer queues are correct >> +# - vcpu_singleshot_timer on every pcpu > > I'm not sure I follow this. Wouldn't messed up timer queues cause > other trouble in the guest ? Yes, but it has been a common point of failure / problems after S3. I put this here as a placeholder to verify that everything is still as it should be. > >> +# - Check for kernel Oops >> +# - Check for Xen WARN > > These are a good idea but should perhaps be a separate test step. Wouldn't you want a warning/oops that was provoked by S3 to be associated with that test? > > Ian. >