From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bill Burns
Subject: Re: Large system boot problems
Date: Fri, 08 Feb 2008 10:22:10 -0500
Message-ID: <47AC73A2.8030805@redhat.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: Ian Pratt , xen-devel@lists.xensource.com, "Carb, Brian A"
List-Id: xen-devel@lists.xenproject.org

Keir Fraser wrote:
> On 8/2/08 15:10, "Bill Burns" wrote:
>
>> The message from early_time_init (caller of
>> init_pit_and_calibrate_tsc) indicates that the
>> initial detection is ok:
>>
>> (pmtimer case) (XEN) Detected 3400.114 MHz processor.
>> (pit case) (XEN) Detected 3400.165 MHz processor.
>>
>> So I think it's the latter. The init of a large system
>> is staving off the soft irq so that the next calc fails.
>
> Okay, well you could test this by inserting a process_pending_timers() in
> the CPU-booting loop in smpboot.c. If you do timer work after booting each
> CPU, perhaps that makes the problem go away?

I woke up in the middle of the night with that idea a few days
ago and tried it without success. It seemed that calls to
process_pending_timers had no effect until a certain point.
But I need to go and look at that some more and see why...

> But ultimately the calibration code should be robust to long delays before
> it is executed. It shouldn't go haywire. So something is bad there. Do you
> have a dump of the decision made by the calibration code on cpu0 the very
> first time it actually gets invoked? We probably need to trace the hell out
> of that first invocation to work out why it gets things so badly wrong.

I don't have more than what was in the earlier email, where it shows
the large delta in tsc time, which seems to cause the bogus
result.

 Bill

>
> -- Keir